Public Comment to the House Appropriations Legislative Branch Subcommittee for FY2014

I will be submitting the following public comment to the House Committee on Appropriations Subcommittee on the Legislative Branch regarding Public Access to Legislative Information.


I write to urge the subcommittee to expand funding for legislative transparency.

I am the president of Civic Impulse LLC, which operates the free legislative tracking service GovTrack. Our website has become an authoritative source for legislative information:

  • More citizens turn to GovTrack for information about the status of legislation than to the Library of Congress (LOC)’s THOMAS and websites.
  • Hundreds of House and Senate staff use GovTrack each day.
  • More than 70 congressmen use GovTrack services to display congressional district maps and their voting records on their official websites.

Why is this? GovTrack has become the de facto authoritative source for legislative information because the Congress does not publish enough “bulk legislative data.” In 2004 we stepped in to fill the vacuum created by the lack of information coming from the Congress. It is long past time for the House to correct this problem.

When the Committee released a draft report last year indicating it intended to have legislative branch agencies publish less bulk data, The Washington Post picked up on the story and wrote:

“At Congress’s ’90s-vintage archive site, there’s no way to compare bills side by side. No tool to measure the success rate of a bill’s sponsor. And there’s certainly no way to leave a comment. Congress makes it hard for outside sites to do any of this, either, by refusing to give out bulk data on its bills in a user-friendly form.” (“Congressional data may soon be easier to use online,” The Washington Post, June 8, 2012.)

Soon after, the Speaker and Majority Leader formed the “Bulk Data Task Force.” Since the formation of the task force, new bulk data projects have been completed at the Government Printing Office (GPO) (bulk bill text) and at the House Clerk (committee schedules and documents, and bulk floor action data).

“Bulk data” is a core component of any government information dissemination program. The House Clerk publishes roll call vote results as bulk XML data. In 2009, the Government Printing Office began offering bulk data for bill text, the Federal Register, and other publications. The Office of Law Revision Counsel publishes the United States Code in multiple bulk data formats. Bulk data can be produced at a fraction of the cost of other information dissemination methods, such as colorful websites.

Yet much information about the Congress remains out of public view. There is no public bulk data for the status of legislation (the LOC “BSS” database), amendments, or committee votes. I believe that eventually all official artifacts of the legislative process should be available online, free, in real time, and as structured bulk data. [See Recommendations to the Bulk Data Task Force.]

And, sadly, proposals for cost reduction threaten the public’s access to the law itself. A 2013 congressionally funded report by the National Academy of Public Administration (NAPA) called for the Congress to consider charging the public fees to read the law online at GPO’s website. NAPA’s report is severely out of touch. There is no dispute that it is a moral imperative for Congress to fund programs that provide broad access to the law and other parts of the public record. GovTrack is a demonstration that bulk data creates broad public access and that bulk data is also the most cost-effective way to create access. Since 2004, GovTrack has reached tens of millions of individuals at a cost of less than $1 million.

The Committee can advance broad public access to legislative information by providing adequate funding for:

  • Publishing the LOC legislative status (“BSS”) database as bulk data. [See Recommendations to the Bulk Data Task Force.]
  • Enhancing GPO’s highly successful FDsys system.
  • Creating bulk data program officers at GPO, LOC, and under the House Clerk.
  • Evaluating the cost and impact of legislative transparency by an organization that believes in the public’s right to primary legal documents (i.e., not NAPA).

Thank you for the opportunity to submit comments on legislative branch appropriations for FY 2014.

Joshua Tauberer

President, Civic Impulse LLC

Open Data Day 2013 Hackathon Recap

Last weekend, in perhaps as many as 100 cities around the world, open data enthusiasts held hackathons. Here in DC, we too celebrated February 23 as International Open Data Day. And it was, dare I say, a great success.

Over 150 developers, data scientists, social entrepreneurs, government employees, and other open data enthusiasts participated in our event, first at a kickoff Friday night at Google’s DC headquarters and then at the Saturday session at The World Bank. Participants worked on local DC issues, global open source mapping, world poverty, and open government. Here are some quick links:

Videos: One | Two — Photos: One | Two

Eric’s Recap | Sam’s Recap | Tumblr | Storified Tweets

Press coverage is listed at the end.

Our approach to the hackathon was a little different from many others. Our goals were to strengthen the open data community, to foster connections between people and between projects, and to emphasize problem statements over prototypes and solutions. There was no beer or pizza at our hackathon, no competitions, and no pressure to produce outputs. Participants came motivated and stayed focused without needing to be treated like brogrammers. This created a positive, welcoming, and highly productive environment.

In the morning Eric Mill (Sunlight Foundation/@konklone) ran a several-hours-long tutorial on open data for about 40 participants. Some were new to coding. Others were project managers (inside and outside of government) who wanted to learn more about what open data is all about from the ground up. Eric walked the participants through exploring APIs through the web browser and using command-line tools to process CSV files — a very concrete way to explain the benefits of adding structure to data.
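To give a flavor of why structure matters, here is a minimal sketch of my own. It is not Eric's tutorial material, and it uses Python rather than the command-line tools from the session. The sample rows and the `count_by_column` helper are made up for illustration. The point is that once data has named columns, a question like "how many of each?" takes only a few lines.

```python
import csv
import io
from collections import Counter

# Hypothetical sample data, invented for illustration (not from the event).
SAMPLE_CSV = """species,ward
Red Maple,1
Willow Oak,1
Red Maple,2
American Elm,2
Red Maple,3
"""


def count_by_column(csv_text, column):
    """Count how many rows share each value in the named column."""
    # DictReader uses the header row to give every field a name,
    # so we can ask for a column by name instead of by position.
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[column] for row in reader)


counts = count_by_column(SAMPLE_CSV, "species")
for name, n in counts.most_common():
    print(f"{name}: {n}")
```

The same function answers a different question just by naming another column, for example `count_by_column(SAMPLE_CSV, "ward")`. With unstructured text, each new question would need its own parsing logic.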

Several projects focused on local DC issues: mapping zoning restrictions, graphing public and charter school enrollment (and other education data), mapping trees by species, and building a database of social service providers.

A large team of map hackers worked on mapping Kathmandu in OpenStreetMap to aid disaster response, and with their collaborators around the world mapped over 7,000 building footprints.

Global poverty and international development were the focus of several other projects, from building APIs for international development project performance data to measuring poverty in real time using Twitter.

The open government projects worked on adding semantic information to legislative documents, comparing legislative documents for similarity, extracting legal citations, cataloging our government representatives at the local level, and building “devops” tools for rapid deployment of VMs that might be useful in government or for open data researchers.

And there were other projects that don’t fit into any of those categories, like building Python tools for creating education curricula.

The event was organized by me (Josh Tauberer/GovTrack/@JoshData), Eric Mill (Sunlight Foundation/@konklone), Katherine Townsend (USAID/@DiploKat), Dmitry Kachaev (Presidential Innovation Fellow/Millennium Challenge Corporation/@kachok), Sam Lee (The World Bank/@OpenNotion), and Julia Bezgacheva (@ulkins/The World Bank).

Thanks to The World Bank especially, and to Google, to the participants who helped out with registration in the morning, and to everyone who came!

This was DC’s second open data day. Our first was on Dec. 3, 2011 and was co-hosted by POPVOX (Josh Tauberer) and Wikimedia DC (Katie Filbert). See what we did in the post-event recap. Participants then worked on improving access to U.S. law, scanning federal spending for anomalies following Benford’s Law, understanding farm subsidy grants, building local transit apps, and keeping Congress accountable. Only about half of the participants were programmers, but everyone found a way to be involved.

It was also DC’s second international development data day. The last one was held on December 9, 2012 in the lead-up to the Development DataJam hosted by the White House’s Office of Science & Technology. Those events primarily served as ideation jams to bring together issue area experts and data experts to develop new ideas and partner for new solutions. Experts were sought out to inform the discussions, but anyone with an interest in open data in development was welcomed and participated.

Press coverage

DCist: Hack D.C.: Hackers Put Open Data to Use to Help Improve Local Government

The Atlantic Cities: Is There a Link Between Walkability and Local School Performance?

Greater Greater Washington: How school tiers match up with Walk Score

Greater Greater Education: Community of civic hackers for education takes shape