Congressional Committee Webcast Archives Review: I See Progress

One of the continuing themes of the Open House and Open Senate Projects has been investigating what congressional committees make available to the public on their websites. I’ve recently become interested in committee markup meetings, and I was curious to see how often webcasts of markup meetings are available on committee websites.

In a survey of House and Senate standing committee websites this week, I found the following. Note that I counted hearings separately from markup sessions and business meetings.

Hearing Webcasts: The vast majority (32 of 35) of committees make archives of video webcasts of hearings regularly available on their websites, in what appeared to be a very timely way. The videos are of pretty poor quality by today’s standards, but they’re still very useful. The exceptions were Senate Foreign Relations [UPDATE: I just missed the very obscure links. Never mind. All Senate committees have video archives.], House Agriculture, House Appropriations, House Rules, and House Ways and Means, which lacked video archives. House Agriculture makes up for it by providing transcripts… after several months; the other four committees provide no electronic record of hearings.

It appeared that most also have live webcasts of most hearings, but I couldn’t tell just from looking at the websites once.

Hearing Transcripts: Transcripts were surprisingly hard to come by. Senate Armed Services, Senate Rules, Senate Veterans’ Affairs, and House Energy and Commerce seemed to be the only committees that provided transcripts regularly. (That’s 4 of 35 committees.) Considering the importance of transcripts for disability accessibility and machine processing (e.g. search), this is too bad.

Hearing Prepared Testimony & Statements: PDFs and other document formats were used to post prepared statements and testimony — this almost makes up for not having transcripts. Four committees lacked even this, and of those none were among the committees that posted transcripts. House Appropriations and House Rules posted neither video, nor transcripts, nor prepared statements. The other two at least posted videos.

For hearings, by and large there is an electronic record available, and if you can find a record you can find video.

I counted markup sessions and business meetings separately from hearings. Electronic records were far less common for these meetings.

Markups: About half of the committees posted archival videos for these business meetings. Of those that didn’t, one posted transcripts. That leaves 18 out of 35 posting no electronic record of these meetings. The notable committee here is House Judiciary, which posts both transcripts and video of business meetings.

A similar survey for Senate committees was done just about a year ago by someone else on the OHP mail list who might want to remain anonymous on this point (I’m not sure). In comparison to that survey, more Senate committees are posting archival hearing video now, which is great. Fewer than half were regularly posting archival audio/video then, and now the vast majority are posting video. As for markups, just two of 16 Senate committees were posting recordings of markups regularly then, with a few more posting them irregularly, and some posting transcripts. So it is nice to see that Senate committees are moving more of this information online as well.

One note: some committees display a notice at the start of their videos: “The use of duplications of broadcast coverage of the Committee on Transportation is governed by the rules of the House. Use for political or commercial purposes is expressly prohibited.” I hope no one takes that message seriously, and I wonder what legal basis it has. I don’t believe I am subject to the rules of the House.

This topic goes a long way back. See Carl Malamud’s work for more.

Additional notes:

Jim Snider replied to this on the OHP list saying that House Commerce had links to some webcasts which were not actually working, but noted that it was probably a glitch. He also wrote, “The last time I checked several years ago House Commerce Committee transcripts were running at least a year late and sometimes several years late. The public record included in the transcripts also may not include follow-up correspondence on the public record between witnesses and the committee. In 1994 I wrote a master’s thesis on video access to public meetings, and in 1999 an op-ed in the Chicago Tribune, ‘Senate Hypocrisy Over “Hot” Testimony,’ on how Congress inhibits public access to their public meeting video archives.”

Aphid pointed out a Metavid wiki page for congressional video availability. I seem to have duplicated some work, and I’ll need to check if I can update that page with anything new. UPDATE: Aphid also notes there that because many of the video streams are in a proprietary format, it may be illegal under the DMCA to archive these videos. This, along with the restrictions noted in House Rules, is a major point to be addressed in the future.

UPDATE 2:
What can committees do going forward?

* For the sake of archives and use by professional journalists, provide a stream that is high-quality (it probably exists but just isn’t public).

* Similarly, provide the streams in at least one additional format that is not a violation of federal law to copy (again, it’s a problem regardless of whether the committee says “go ahead”).

* Remove any additional assertions (e.g. House Rules) on how congressional video may be used. Either it is public or it is not. It is an affront to free speech if Congress thinks government records, of all things, should be off-limits to any part of public discourse.

* Partner with experts in the public — e.g. Aphid and Carl Malamud — on establishing goals for congressional video.

Update on bulk data from Congress

One of the Open House Project’s recommendations was that Congress share its legislative data with the public in bulk, and I’ve had a long history of posts on the subject. Over at the Free Gov info blog, Bob Tapella, Public Printer at the Government Printing Office, tells us that they are responding to this recommendation. He writes in a comment (presumably it is really him):

We have recently been called upon by Congress in the joint explanatory statement on the H.R. 1105, to work with the Library of Congress, including the Congressional Research Service, and the Law Library of Congress, to discuss access to bulk data. Specifically, the language is as follows:

[JT: omitted — I’ve posted it before here]

To address this request, a Legislative branch task force has been assembled consisting of representatives from the offices of the Secretary of the Senate, the Clerk of the House, the Library of Congress, Congressional Research Service, the Law Library of Congress, and GPO. This task force has already met and is working to develop a position on access to bulk data. We will look to this work and the review by Congress to help guide our work on making bulk data accessible.

Grin.

Bulk data downloads approved in the omnibus spending bill (success!)

Two recommendations of the Open House Project report have been taken up in the FY09 omnibus appropriations bill (H.R. 1105). The first recommendation in our chapter on legislative databases was that the Library of Congress make its bill status database directly available to the public and that the GPO not sell legislative documents to the public. These have been the two issues I’ve had my sights on over the last three years (probably starting here). The second recommendation was about coordinating web standards across Congress. These recommendations are addressed in two paragraphs of the House statement accompanying the bill for Division G – Legislative Branch, which is almost like being law itself.

The two paragraphs were added by Congressman Mike Honda of California, one of our champions of the use of technology to further transparency and civic engagement. John Wonderlich of Sunlight Foundation, Rob Pierson in Honda’s office, and I collaborated on this over a long period of time. Honda got involved in 2007 asking the Library to look into this and then in 2008 getting the paragraphs added to the bill markup.


Comparing stimulus bill text versions side-by-side

One of the concrete benefits of open government data is that third parties can use the data to do something useful that no one in government has the mandate, resources, or insight to do. If you think what I am about to tell you below is cool, and helpful, then you are a supporter of open government data.

On my site GovTrack, you can now find comparisons of the text of H.R. 1, the stimulus bill, at different stages in its legislative life — including the House version (as passed) and the current Senate version (amendment 570).

The main page on GovTrack for HR 1 is: here

Here’s a direct link to the comparison:

Comparisons are possible between any two versions of the bill posted by GPO. Comparisons are available for any bill.
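To give a feel for what a comparison between two text versions involves, here is a minimal sketch using Python’s standard difflib module. It is not the code GovTrack runs; the function name and the sample text are hypothetical placeholders, not actual bill language.

```python
import difflib


def compare_versions(old_text: str, new_text: str) -> list[str]:
    """Return a unified, line-by-line diff between two versions of a text."""
    return list(difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        fromfile="version_a",
        tofile="version_b",
        lineterm="",  # input lines carry no newlines, so emit none either
    ))


# Hypothetical placeholder text, not actual bill language.
version_a = (
    "SEC. 1. SHORT TITLE.\n"
    "This Act may be cited as the Example Act.\n"
    "SEC. 2. FUNDING.\n"
    "There is appropriated $100."
)
version_b = (
    "SEC. 1. SHORT TITLE.\n"
    "This Act may be cited as the Example Act.\n"
    "SEC. 2. FUNDING.\n"
    "There is appropriated $200."
)

diff = compare_versions(version_a, version_b)
for line in diff:
    print(line)
```

Lines starting with a minus sign appear only in the first version, and lines starting with a plus sign appear only in the second. None of this works unless the text of each version is published in a form a program can read.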

If you find this useful, please take a moment to consider that something like this is possible only when Congress takes data openness seriously. When GPO went online and THOMAS was created in the early 90s, they chose good data formats and access policies (mostly). But the work on open government data didn’t end 15 years ago. As “what’s hot” shifts to video and Twitter, the choices made today are going to impact whether or not these sources of data empower us in the future, whether or not we miss exciting opportunities such as having tools like the one above.

(Thanks to John Wonderlich and Peggy Garvin for some side discussion about this before my post. GovTrack wasn’t initially picking up the latest Senate versions because GPO seems to have gone out of its way to accommodate posting the latest versions before they were passed by the Senate, which is great, but caught GovTrack by surprise.)

Navigating legislation (after the fact, of course)

In May, the Congress passed the 2008 Farm Bill, which regulates various food, nutrition, and apparently biofuel issues. Tufts food policy professor Parke Wilde writes on his blog today:

The 629-page text (.pdf) of the 2008 Farm Bill is so complex and unreadable that the U.S. food policy community has been on the edge of our seats waiting for the USDA/ERS side-by-side comparison unveiled today.

The ERS side-by-side tool compares the new Farm Bill with current law, title by title, so we can finally begin to understand what the law really means.

ERS is the USDA’s Economic Research Service. Their side-by-side webpage, which I think was just published this week, shows the provisions of the previous and the current bill side-by-side. (It’s not a comparison of the bill text, but of summaries of the provisions.)

This is interesting on a number of counts. First, the fact that it is the USDA making this comparison suggests that everyone agrees the bill itself is effectively incomprehensible, even to professionals and scholars, on account of its size, and that summarizing it is costly enough that only the government would do it. It took three months to prepare.

Second, if this is what was needed to understand the Farm Bill, was it passed without anyone understanding it?

Third, this comparison was made by and for professionals and scholars, not by tech geeks. Why aren’t we talking to them?

The ERS tool comes complete with a seemingly unintentionally hilarious intro video — overly dramatic with background music fit for the Miss Universe competition. (Wilde likened it to “a documentary by Kenneth Burns or an account of a manned mission to the moon”.)

Legislative Databases recommendation makes it to House Leg Branch Appropriations markup

I’m ecstatic. All right, so this all goes back to late 2006, when a bunch of people were sitting at their computers writing emails about what Congress should do with data. I distinctly remember Dan Newman and me both thinking that the Library of Congress should make its raw legislative database (the one that powers THOMAS) available directly to us to build applications on, rather than leaving me to the screen-scraping I was doing. One thing led to another: the Open House Project; the legislative databases section of the OHP report in May 2007 (which I principally wrote); then, in November of that year, with the support of Rep. Mike Honda, CHA asked the LOC to look into the issue (more); and then in the last month his office submitted text for the House Legislative Branch Appropriations Report, which made it through subcommittee markup of the bill, to give this request a little more teeth (like, ehm, the force of law).

His office also submitted a second paragraph which I’ll get to below.

Rob Pierson in Honda’s office writes on the OHP mail list:

I’ve mentioned on the list some of the steps my boss (Congressman Honda) has been taking, with counsel from many folks on this list, to guide Congressional policies on the path towards effectively leveraging technology to open up access to the public. There are actually quite a few other staffers who also follow this list, and we’ve certainly learned quite a bit from the conversations posted here, so I wanted to throw out a quick note of appreciation to everyone who has been contributing to the discussions.

With guidance from the conversations on this list (and the OHP report), Congressman Honda recently submitted the following sections into the House Legislative Branch Appropriations Report. The following (or possibly very similar versions) were included in the Leg Branch Subcommittee markup of the bill:

*Public Access to Legislative Data (as submitted)*

The Committee believes that the public should have improved access to legislative information through more advanced search capabilities such as those available through the Library of Congress’ Legislative Information System. The Committee also supports enhancing public access to legislative documents, bill status, summary information, and other legislative data, through more direct methods such as bulk data downloads and other means of no-charge digital access to legislative databases. The Committee requests that the Library and Government Printing Office report on the progress towards these goals within 90 days of enactment of this Act.

Note that the GPO has also been stuck in there. For more on that, see this post.

John noted that the second paragraph that Honda’s office submitted was parallel to the final chapter of our report, Coordinating Web Standards. (Hmm, I principally wrote that chapter too….)

*Congressional Technology Coordination (as submitted)*

The Committee recognizes the need for the House of Representatives to develop a strategic and coordinated plan that will prepare for the future technology needs of the institution. A 2006 report commissioned by the Chief Administrative Officer and the Committee on House Administration, entitled /Strategic Technology Road Map for the Ten Year Vision of Technology in the House of Representatives/ provided a suggested structure for an IT evaluation and decision-making process. No later than 90 days after the enactment of this Act, the Committee requests that the Chief Administrative Officer, the Clerk, and the Sergeant at Arms report to the Committee of their efforts to develop House-wide data-sharing standards; implement standard legislative document formats; address the increasing resource challenges of Member offices; and identify disparate systems throughout the institution, which prevent it from taking advantage of economies of scale.

This is of course fantastic news for anyone who supports transparency, which is, well, everyone in their right mind, I think. So thanks to Congressman Honda for taking the initiative on this!

(Other links: last year’s leg branch appropriations blog post, my first or one of my first posts here about structured data)

Eating well on Independence Day

Happy 4th of July. I thought I’d share an interesting website that has nothing to do with government transparency but is about good use of government data. The USDA maintains a big database of nutrition facts about foods. You can download the database and build applications based on it, like a menu planner. This is something I’ve had in the back of my head for a while. Ever since getting into the whole Michael Pollan food mind-set, I’ve wondered whether one can make a healthy diet just by balancing various food groups (as I try to do, with limited success), or whether (contra Pollan’s overall message, though maybe not in the details) it would be useful to start adding up the numbers of various nutrients to see how my meals match up with recommended values. How would I know, for instance, if I’ve managed to exclude an important vitamin from the particular selection of foods that I eat week after week?
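For the curious, the adding-up idea can be sketched in a few lines of Python. Everything here is an assumption for illustration: the food names, nutrient names, and all of the numbers are made up, and none of them come from the USDA database or from any dietary recommendation.

```python
from collections import defaultdict

# Made-up placeholder numbers for illustration only.
# These are NOT values from the USDA database.
FOODS = {
    "food_a": {"vitamin_x": 10.0, "mineral_y": 2.0},
    "food_b": {"vitamin_x": 5.0, "mineral_y": 0.0},
}

# Made-up daily targets, NOT real dietary recommendations.
RECOMMENDED = {"vitamin_x": 30.0, "mineral_y": 4.0}


def daily_totals(servings: dict[str, float]) -> dict[str, float]:
    """Sum each nutrient over a day's menu of {food: number of servings}."""
    totals: dict[str, float] = defaultdict(float)
    for food, count in servings.items():
        for nutrient, amount in FOODS[food].items():
            totals[nutrient] += amount * count
    return dict(totals)


def shortfalls(totals: dict[str, float],
               recommended: dict[str, float]) -> dict[str, float]:
    """Return how far each nutrient falls below its target, if at all."""
    return {
        nutrient: target - totals.get(nutrient, 0.0)
        for nutrient, target in recommended.items()
        if totals.get(nutrient, 0.0) < target
    }


menu = {"food_a": 2, "food_b": 1}
totals = daily_totals(menu)
gaps = shortfalls(totals, RECOMMENDED)
print(totals)
print(gaps)
```

A real version would load the per-food nutrient amounts from the downloaded database instead of a hand-written table.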

The database is great itself, but the cooler website is MyPyramid Menu Planner (mypyramidtracker.gov) (also out of the USDA). You can enter a typical daily roster of what you eat (with a nice sound effect) and it will tell you how it stacks up for a recommended diet for your age (or for me, how to gain weight to a recommended amount for my age). It feels a little over-simplified, but the simplicity keeps me on the site. I find, not surprisingly, that I probably eat about half of the recommended calories and clearly not enough grain or fruit. Well, I knew this in the abstract, but quantifying it helps direct me to fixing the problem.

I’m sure there are other websites that do similar things, but it’s nice to find a case where the government has both published a comprehensive (well structured, well documented) database and has also built a really nice interface for the data. And on a topic that is really very important to daily life, too.

And with that, I think I will take the rest of the weekend off from civics!

Communicating with Congress: Recommendations for Improving the Democratic Dialogue

CMF published an interim report, Communicating with Congress: Recommendations for Improving the Democratic Dialogue. I had one of those “someone got it right” moments reading the report. Following what seemed to be tireless work by Daniel Bennett and Rob Pierson (Rep. Mike Honda’s office) and CMF staff going back a long time, and a conference in October that I really enjoyed, they recommend adding metadata to constituent communication to reliably indicate who the sender is, what the issue is, and what advocacy organization helped the sender send the message.

The recommendation serves to help congressional staff manage incoming communication. It’s a method of triage on the one hand, and a tool to help tally communications by position on the other. Critical as this may be, I find tallying to be incredibly superficial — and it really reveals, I think, that the world of communicating with Congress has become extremely narrow. (But I’ve written on that before.)
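To make the idea concrete, here is a rough sketch of what a message with this kind of metadata, and a tally over it, might look like. This is not the format the CMF report proposes. The class, field names, and sample values are all hypothetical, and the position field in particular is my assumption about what a tally by position would need.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ConstituentMessage:
    """A message plus the three kinds of metadata described above."""
    sender: str                  # who the sender is
    issue: str                   # what the issue is
    organization: Optional[str]  # advocacy group that helped send it, if any
    position: str                # assumed field, e.g. "support" or "oppose"
    body: str


def tally_by_position(messages: list[ConstituentMessage],
                      issue: str) -> Counter:
    """Count messages on one issue, grouped by the sender's stated position."""
    return Counter(m.position for m in messages if m.issue == issue)


# Hypothetical sample messages.
inbox = [
    ConstituentMessage("constituent-1", "issue-a", "Example Org", "support", "..."),
    ConstituentMessage("constituent-2", "issue-a", None, "oppose", "..."),
    ConstituentMessage("constituent-3", "issue-a", "Example Org", "support", "..."),
    ConstituentMessage("constituent-4", "issue-b", None, "support", "..."),
]

tally = tally_by_position(inbox, "issue-a")
print(tally)
```

Note how little of each message a tally like this actually looks at, which is the superficiality I mean.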

Webcontent.gov updates publishing-data recommendations

I was very lucky this week to have stumbled into the middle of an update to a page maintained by the U.S. GSA at webcontent.gov on best practices for executive branch agencies making data available. The site serves as a collection of best practices and uses OMB policies as a starting point. I think it had last been updated in 2005.

The page updated is here.

The updates were a combination of suggestions from Scott Horvath and Jeremy Fee at the USGS, Kol Peterson from EPA, and me. Really big thanks go to Scott and Kol for reaching out to others for input on Monday and getting the feedback to Bev Godwin at GSA, who runs webcontent.gov and who published the changes only a few days later. Scott also notes that additional suggestions could still be considered (his email address is at the bottom of that page).

In making my suggestions, I turned to the Open Government Data Principles and tried to squeeze in as much as I could without overloading the document, and I drew from ideas that came up in the preparation of the Open House Project report. Some of the changes made were:

  • It now provides examples of data as being documents, audio/visual recordings, and databases.
  • It now says to support “the widest practical range of public uses of
    the data”. It had formerly suggested supporting the “intended” use of
    the website by visitors.
  • It notes the benefit of providing data: “New uses of your agency’s
    data may become a valuable public resource that would be out of the
    scope of your own website, such as helping to keep the public informed
    about the work of your agency and supporting civic education and
    participation.”
  • There is a new paragraph that I might be misunderstanding but which
    seems to make a suggestion along the lines of the recent “Invisible
    Hand” paper about the agency’s website getting the data the same way the
    public does: “Providing a uniform method to access raw data can also be
    the first step in internal development, accomplishing both goals at
    once. When a uniform method to access data is available, developers and
    web-services can focus on data presentation.”
  • It notes that the availability of bulk downloads of data is something
    to consider when building data access.
  • It notes some disadvantages of using proprietary formats.
  • It recommends that if a proprietary format is needed, a
    non-proprietary format should be used in addition.
  • It adds a benchmark to test for success: “One benchmark for
    determining whether data is made sufficiently available is whether the
    public has all of the data needed to replicate any searching, sorting,
    and display functionality provided on the agency’s own website.”
  • It notes that consulting the public in the development of data access
    seems to be entailed by OMB policy: “When choosing data formats and
    distribution methods, keep in mind that your agency’s visitors are the
    best judges of their own needs. Agencies must “establish and maintain
    communications with members of the public and with State and local
    governments to ensure your agency creates information dissemination
    products meeting their respective needs” (OMB Policies for Federal
    Public Websites #4A).”

We have a real success story here.

Government Data and the Invisible Hand

The guys over at Princeton’s new Center for Information Technology Policy wrote a really great paper for the Yale Journal of Law & Technology on the role data should have, compared to websites, in government. It articulates a point that I think many of us subconsciously have had in mind:

“The new administration should specify that the federal government’s primary objective as an online publisher is to provide data that is easy for others to reuse, rather than to help citizens use the data in one particular way or another.”

And they suggest an interesting way to push that forward:

“The policy route to realizing this principle is to require that federal government websites retrieve the underlying data using the same infrastructure that they have made available to the public. Such a rule incentivizes government bodies to keep this infrastructure in good working order, and ensures that private parties will have no less an opportunity to use public data than the government itself does. The rule prevents the situation, sadly typical of government websites today, in which governmental interest in presenting data in a particular fashion distracts from, and thereby impedes, the provision of data to users for their own purposes.”

I think this is a worthwhile addition to the opengovdata and publicmarkup.org policy documents — if not as a direct recommendation (because I think it may be too much to ask for in a grand form) then noted as a long-term goal or (in terms of the second paragraph I quoted) as a benchmark, a concrete way to tell whether data is open.
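The rule is easy to picture in code. Below is a minimal sketch, under my own assumptions, of an agency website that builds its pages only from the same data access point the public uses. The function names and records are hypothetical, and a real system would serve the endpoint over the network instead of calling it in-process.

```python
import json
from html import escape
from typing import Optional

# Hypothetical records standing in for an agency's underlying data.
_RECORDS = [
    {"id": "r1", "title": "Example record A", "year": 2007},
    {"id": "r2", "title": "Example record B", "year": 2008},
]


def public_data_endpoint(year: Optional[int] = None) -> str:
    """The single public access point: returns raw records as JSON."""
    rows = [r for r in _RECORDS if year is None or r["year"] == year]
    return json.dumps(rows)


def render_agency_page(year: Optional[int] = None) -> str:
    """The agency's own site: builds its HTML only from the public endpoint."""
    rows = json.loads(public_data_endpoint(year))
    items = "".join(
        f"<li>{escape(r['title'])} ({r['year']})</li>" for r in rows
    )
    return f"<ul>{items}</ul>"


def third_party_view(year: Optional[int] = None) -> list[str]:
    """An outside developer's use of the same data, presented differently."""
    rows = json.loads(public_data_endpoint(year))
    return sorted(r["title"] for r in rows)


print(render_agency_page(2008))
print(third_party_view())
```

Because render_agency_page never touches _RECORDS directly, anything the agency’s page can show, a third party can reproduce from the endpoint alone. That gives a concrete test of whether the data is open.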

The full citation is: Robinson, David, Yu, Harlan, Zeller, William P and Felten, Edward W, “Government Data and the Invisible Hand” (2008). Yale Journal of Law & Technology, Vol. 11, 2008