Political text analysis: The Times counts debate words

The New York Times has an interesting flash application that breaks down the text of yesterday’s Democratic debate (there was a debate? UPDATE: And it was in my own city??) by speaker and shows visually the distribution of who spoke when through the debate. I mention it here because it’s one of those data transformations very much in the same spirit as what I keep pushing here. They took the transcript, made it visual and interactive, and the end result is a vastly different view onto the debate than anyone had before. It uses the same transcript as anyone else, but adds something very new and informative.

One can’t help but notice that the different candidates are not getting the same amount of speaking time. Clinton spoke more than 3.5 times as many words as Biden, and the ratio for speaking time is about the same. For that matter, basically so did the moderator, who held the floor for more time than anyone but Clinton. It’s no wonder that Clinton is considered “the Democrat to beat” considering she’s in our face more.
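For the curious, the word-count half of that transformation is easy to approximate. This is a minimal sketch, not the Times' method: it assumes a transcript where each turn starts with the speaker's name in capitals followed by a colon (a hypothetical format), and the sample lines are made up for illustration.

```python
from collections import Counter

def words_by_speaker(transcript_lines):
    """Count words per speaker, assuming turns look like 'NAME: text'."""
    counts = Counter()
    speaker = None
    for line in transcript_lines:
        line = line.strip()
        if not line:
            continue
        head, sep, rest = line.partition(":")
        if sep and head.isupper():
            # An all-caps label before the colon starts a new turn.
            speaker = head
            text = rest
        else:
            # Anything else continues the current speaker's turn.
            text = line
        if speaker is not None:
            counts[speaker] += len(text.split())
    return counts

# Made-up sample, not the real debate transcript.
sample = [
    "MODERATOR: Welcome to the debate.",
    "CANDIDATE A: Thank you. I have a plan.",
    "It has three parts.",
    "CANDIDATE B: I disagree.",
]
for name, total in words_by_speaker(sample).most_common():
    print(name, total)
```

Speaking time would need timestamps, which a plain transcript like this does not carry.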

If the numbers weren’t so vastly different between the candidates, we’d chalk it up to some random variation that happens from debate to debate. But, from the numbers, the speaking times are clearly planned. It’s so clear that I feel like maybe I missed something. Is it common knowledge that the debates are proportioning time out to the candidates based on their poll numbers (or something equivalent)? It’s not just that the front-runners are getting more time. The statistical correlation is ridiculously high (speaking time versus the FOX News/Opinion Dynamics Poll of Oct. 23-24: r = .96). That is, the debate organizers are basically using this formula to determine how much time each candidate should get:

Speaking Time = 8:26 minutes + 25 seconds * Latest Poll Number (%)

Of course, debate organizers can’t control exactly how long each candidate talks for, but the candidates only deviated from the formula by at most two minutes and twenty seconds (Biden, who spoke less, and Edwards, who spoke more; corrected from Dodd).
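As a rough sketch of the arithmetic, here is the formula in code, along with an ordinary least-squares fit of the kind that would produce such a line from (poll number, speaking time) pairs. The function names are mine, and the poll number in the example is hypothetical, not a figure from the actual poll.

```python
BASE_SECONDS = 8 * 60 + 26   # intercept from the formula: 8:26
SECONDS_PER_POINT = 25       # slope: 25 seconds per poll percentage point

def predicted_speaking_seconds(poll_percent):
    """Speaking time in seconds predicted by the formula above."""
    return BASE_SECONDS + SECONDS_PER_POINT * poll_percent

def deviation_seconds(actual_seconds, poll_percent):
    """Positive when a candidate spoke more than the formula predicts."""
    return actual_seconds - predicted_speaking_seconds(poll_percent)

def format_mmss(total_seconds):
    """Render a number of seconds as M:SS."""
    minutes, seconds = divmod(round(total_seconds), 60)
    return f"{minutes}:{seconds:02d}"

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical candidate polling at 10%.
print(format_mmss(predicted_speaking_seconds(10)))  # prints 12:36
```

Feeding `fit_line` the real poll numbers and speaking times would be the way to check the intercept and slope quoted above.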

So now I’m getting off topic a bit, but in any case: transformations on data can be very revealing!

Steve King introduces a new bill with a bit of Internet-transparency thrown in

Steve King, a Republican from Iowa, has introduced a new bill that has a clause specifically about Internet-based transparency. (We know King from his bill H.R. 170: Sunlight Act of 2007, parts of which I think were integrated into the passed ethics reform bill. One part that wasn’t integrated was a provision to have bills posted online for 48 hours before their consideration.) His new bill is H. Res. 776: Amending the Rules of the House of Representatives to require that rescission bills always be considered under open rules every year, and for other purposes.

This bill, like most of the 12 others he has introduced this year, takes a classical conservative position, here trying to reduce government spending. The real point of the bill is expressed best in one of its findings clauses:

Whereas a rescissions bill, which would cut Federal spending, should be brought to the House floor at the beginning of every fiscal quarter to give Congress the opportunity to cut and cancel unnecessary, wasteful, and bloated government spending to eliminate the deficit;

But the interesting part for us is:

Whereas the process of cutting spending should be open to the public, by posting this spending cutting bill and its amendments on the Internet, so that Americans can exercise their right to contact their Members of Congress and make their views known

It has a variant of the 48-hours language from his other bill applied specifically to rescission bills.

Committee Votes: That’s The Deal

I happened to check on the list of cosponsors to H. Res. 231: Amending the Rules of the House of Representatives to require all committees post record votes on their web sites within 48 hours of such votes — the number is growing. It now has 131 cosponsors, with 27 added in the last two months. That’s some good work on the Hill for whoever has been rounding up support for the bill.

All of the cosponsors are Republican. Does anyone know why that would be? Do Members not bother to seek out support across the aisle, do Members not listen to Dear Colleague letters from across the aisle, or are the Democrats not interested in actual transparency reform?

Committee Votes: What’s the deal?

For a few years now I’ve wanted to look into integrating committee actions into GovTrack. Along with full roll call votes, it would be nice to be able to see how committee members voted in committee on various issues. Finally I took a look at a report PDF from the House Armed Services committee on the defense appropriations bill to see how they include committee votes. The report PDF is, as far as I know, the only way to find out this information besides personally going to a committee office or, maybe, making a phone call. With all congressional data, there’s never an easy way to get it, but some programming magic (screen scraping) is usually enough to extract the info out of wherever it is.

Not so for committee votes. Reminiscent of the type-print-scan-print-mail-scan-print-type financial disclosure methods in the Senate, committee votes were included in this PDF as an image. That is, the vote was typed up, and then probably printed, scanned, and then imported as an image in the final report. Because it is an image, and not text, it is infeasible to extract this information automatically.

I’ll give the committee the benefit of the doubt that this just happens to be the way they’ve always done it, and change is tough.

But come on. This isn’t transparency.

The newest advocacy org.: the Oversight committee

Advocacy organizations are, to some degree, defined by mobilizing a community to take some action. One thing they tend to do is conclude a message with something like “To take action, call your congressperson at [phone number].” Sometimes I find that kind of off-putting because it seems like all they want to accomplish is what in the tech world is called a distributed denial of service attack, where many sources (in this case the constituents), somehow independently organized, try to knock a service (in this case a congressional office) offline. (Ok, maybe that’s a bit too negative.)

This just in from the Committee on Oversight and Government Reform RSS feed:

On October 8, 2007, the American Spectator printed a fictitious story alleging that Congressman Waxman and the House Oversight Committee were investigating conservative and Republican talk show radio programs….

The American Spectator should immediately retract its report and apologize for the confusion its fictitious report has caused. Moreover, anyone concerned about the false reporting should contact the American Spectator at (703)807-2011 to register your views.

Since when was the Oversight committee in the business of mobilizing a group to take action?

Communication: Authentication Part II

The problem of authentication is basically this: how can we off-load the problem onto someone else who’s already doing authentication? In the last post I suggested charging credit cards through some payment service that happens to verify billing addresses too (and, as Oxa pointed out in the comments, it’s fairly disenfranchising, although to be honest I don’t mind, since Internet communication is already disenfranchising). Two more methods to consider are off-loading the verification to the postal service, or to the individuals.

Sending postcards to verify addresses: The recipient has to type in a random code printed on the postcard to verify that he got the postcard, i.e. that he’s at that address. (Oxa mentioned this, and I’ve seen it elsewhere.) I didn’t mention it because I assumed this would be too costly. Actually, that may not be so true. I’m just ballparking, but if the overhead of a credit card purchase is around 10 cents, and it costs 41 cents to mail a postcard, that’s not soooo different. But mailing a postcard has some additional overhead (printing the postcard (automagically), and manually schlepping postcards from a printer tray to an outgoing USPS mailbox). I also found a service that will verify phone number-address pairs, which is actually pretty close to what is needed, at around 40 cents per verification.
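The code-on-a-postcard step could look something like this. It is a minimal sketch under my own assumptions: the class and method names are hypothetical, storage is an in-memory dict, and printing and mailing are left out entirely.

```python
import hmac
import secrets

# Omits look-alike characters (0/O and 1/I/L) to reduce typing errors.
CODE_ALPHABET = "ABCDEFGHJKMNPQRSTUVWXYZ23456789"

def new_postcard_code(length=8):
    """Generate an unguessable code to print on the postcard."""
    return "".join(secrets.choice(CODE_ALPHABET) for _ in range(length))

class PostcardVerifier:
    def __init__(self):
        self._pending = {}  # user id -> (address, code) awaiting confirmation
        self.verified = {}  # user id -> confirmed address

    def send_postcard(self, user_id, address):
        """Record a pending verification and return the code to print."""
        code = new_postcard_code()
        self._pending[user_id] = (address, code)
        return code  # in practice this goes to the printer, not to the user

    def confirm(self, user_id, typed_code):
        """Check the code the user typed in from the postcard."""
        if user_id not in self._pending:
            return False
        address, code = self._pending[user_id]
        normalized = typed_code.strip().upper()
        # Constant-time comparison of the typed code against the stored one.
        if not hmac.compare_digest(normalized.encode("utf-8"), code.encode("utf-8")):
            return False
        del self._pending[user_id]
        self.verified[user_id] = address
        return True
```

A real version would also want codes to expire and a cap on wrong guesses, neither of which is handled here.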

However, even these methods don’t get you all the way, because in fact we need more than address verification. We need verification or at least assurance that the person hasn’t verified before. You could limit the number of verifications per address, but there are some technical problems with that. The credit card method has the advantage that an individual can only verify as many times as the number of credit cards that he has, and that’s usually pretty limited.
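Capping verifications per address might be sketched as below. The names and the cap of two are assumptions for illustration. The address normalization is deliberately naive; matching real-world address variants is much harder, and that may be one of the technical problems in question.

```python
from collections import Counter

class AddressQuota:
    """Allow at most max_per_address successful verifications per address."""

    def __init__(self, max_per_address=2):
        self.max_per_address = max_per_address
        self._counts = Counter()

    @staticmethod
    def normalize(address):
        # Naive normalization: case-fold, drop punctuation, collapse whitespace.
        cleaned = "".join(
            ch for ch in address.casefold() if ch.isalnum() or ch.isspace()
        )
        return " ".join(cleaned.split())

    def try_verify(self, address):
        """Return True and count the verification if the address is under its cap."""
        key = self.normalize(address)
        if self._counts[key] >= self.max_per_address:
            return False
        self._counts[key] += 1
        return True
```

Note that "Apt 3" versus "#3" or "Street" versus "St" would still count as different addresses here.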

There’s another route to consider, but this is a route tried before with no success as far as I’m aware. You can off-load the authentication problem to the users by creating a web of trust. User A does the work of authenticating users B, C, and D, User B authenticates E, F, and G, etc. Then you just have to worry about how much you trust a small number of root users, rather than the whole community. But I don’t know if this has ever been a practical solution to anything.
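To make the web-of-trust idea concrete, here is a minimal sketch that walks outward from the root users and records how many vouching steps away each user is. The data structure (a dict of who authenticated whom) and the optional depth limit are my assumptions, not part of any existing system.

```python
from collections import deque

def trusted_users(vouches, roots, max_depth=None):
    """Breadth-first walk of the vouching graph starting from the root users.

    vouches: dict mapping a user to the users they authenticated.
    Returns a dict mapping each reachable user to their distance from a root.
    """
    depth = {root: 0 for root in roots}
    queue = deque(roots)
    while queue:
        user = queue.popleft()
        if max_depth is not None and depth[user] >= max_depth:
            continue  # do not extend trust beyond the depth limit
        for vouched in vouches.get(user, ()):
            if vouched not in depth:
                depth[vouched] = depth[user] + 1
                queue.append(vouched)
    return depth

# The example from the text, plus an unconnected pair X and Y.
vouches = {"A": ["B", "C", "D"], "B": ["E", "F", "G"], "X": ["Y"]}
print(trusted_users(vouches, roots=["A"]))
```

With A as the only root, X and Y are never reached, so everything hinges on how much you trust A, and on how far down the chain you are willing to let that trust carry.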