6 problems you should know before writing about Facebook’s emotional contagion study

In a Facebook study published this week, Facebook manipulated many of their users’ News Feeds by omitting 10-90% of posts containing either positive or negative content over the course of a week in 2012. They reported that those users wrote fewer positive and negative words (respectively) in their own posts, concluding that Facebook is a medium on which emotions spread, a case of “emotional contagion,” to use their technical term.

Here’s what you need to know:

On average, no emotion actually spread

The number of positive words in their average user’s posts decreased from 6 words to… 6 words.

The first major omission in the study is the lack of individual-level statistics. While they reported aggregate numbers such as having analyzed “over 3 million posts” totaling “122 million words” made by their “N = 689,003” users, and the study’s implication for “hundreds of thousands of emotion expressions,” they omitted any discussion of whether and how individuals were affected in any meaningful way.

From their numbers, the average user wrote 4.5-5 posts totaling 177 words during the experimental week. Only 3.6% of those words — so about 6 words — were “emotional,” and they found that by omitting about half of emotional posts from users’ News Feeds that percentage would go down by 0.1 percentage points or less. A 0.1-point change is about 2/10ths of a word.
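Here is a quick back-of-the-envelope check of that arithmetic. It is only a sketch: the inputs are the aggregate figures quoted above, the variable names are mine, and I am treating the 0.1% as a change in the share of all words that are emotional.

```python
# Back-of-the-envelope check using the aggregate figures quoted from the paper.
total_words = 122_000_000   # "122 million words"
users = 689_003             # "N = 689,003"
emotional_share = 0.036     # 3.6% of words were "emotional"
effect = 0.001              # assumed reading: a change of 0.1 percentage points

words_per_user = total_words / users
emotional_words = words_per_user * emotional_share
change_in_words = words_per_user * effect

print(f"words per user:  {words_per_user:.0f}")
print(f"emotional words: {emotional_words:.1f}")
print(f"change in words: {change_in_words:.2f}")
```

The per-user change works out to a small fraction of a single word over the whole week.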

For most of their users, there was not even close to a measurable effect.

(The study did mention the Cohen’s d statistic of ‘0.02’ which is another way to say that there was an aggregate effect but basically no individual-level effect.)

The study has no test for construct validity (was it about emotions at all?)

An important part of every study is checking that what you’re measuring actually relates to the phenomenon you’re interested in. This is called construct validity. The authors of the Facebook study boasted that they didn’t think of this.

The paper quixotically mentions that “no text was seen by the researchers” in order to comply with Facebook’s agreement with its users about how it will use their data.

They didn’t look at all?

That’s kind of a problem. How do you perform a study on 122 million words and not look at any of them?

Are the posts even original, expressive content? The users might be sharing posts less (sharing is sort of like retweeting) or referring less to the emotional states of friends (“John sounds sad!”). The words in a post may reflect the emotions of someone besides the poster!

To classify words as “positive” or “negative” the study consulted a pre-existing list of positive and negative words that is widely used in this sort of social science research. This comes with some limitations: sarcasm, quotation, or even simple negation cut the legs out from under this approach. I actually think in aggregate these problems tend to go away, but only when you have a large effect size.
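To make the limitation concrete, here is a minimal sketch of word-list counting. The tiny word lists and the function name are hypothetical, made up for illustration. This is not the study's actual tool or lists, just the general technique.

```python
import re

# Hypothetical, tiny word lists for illustration only; the lists used in
# real studies are much larger and are not reproduced here.
POSITIVE = {"happy", "great", "love"}
NEGATIVE = {"sad", "awful", "hate"}

def count_emotion_words(post: str) -> tuple[int, int]:
    """Return (positive, negative) word counts by simple list lookup."""
    words = re.findall(r"[a-z']+", post.lower())
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return pos, neg

examples = [
    "I am not happy today",      # negation: still counted as positive
    "Oh great, another Monday",  # sarcasm: still counted as positive
    "John sounds sad!",          # someone else's emotion: counted as negative
]
for post in examples:
    print(post, "->", count_emotion_words(post))
```

Each example gets scored by the words it contains, not by what the poster actually feels.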

The whole of Facebook’s reported effect on emotion could be due to one of the many limitations of using word lists as a proxy for emotion. They needed to demonstrate it wasn’t.

Methodological concerns

This study is not reproducible. While most research isn’t ever reproduced, that it could be provides a check against the fabrication of results (and sometimes that’s how fabricators are caught). Facebook provides the only access to a network of this size and shape. It is unlikely they would provide access to research that might discredit the study.

The study also uses a strange analysis. Their experimental design was 2 X 9-ish (control or experiment X 10-90% of posts hidden), but they plugged the two variables into their linear regression in two different ways. The first became a binary (“dummy”) variable in the regression, which is right, but the second became a weight on the data points rather than a predictor. That’s an odd choice. Do the results come out differently if the percentage of posts hidden is properly included in the regression model? Did they choose the analysis that gave the results they wanted to see? (This is why I say “about half of emotional posts” above, since the analysis is over a weighted range.)

Informed consent

Finally, there’s the problem of informed consent. It is unethical to run experiments on people without it. The paper addresses legal consent, in the sense that the users agreed to various things as a pre-condition for using Facebook, though being manipulated was probably not one of them (unfortunately, I don’t know what Facebook’s terms of service were in early 2012).

Certainly the consent didn’t reach the level of informed consent, in which participants have a cogent sense of what is at stake. There’s a great discussion of this at Slate by Katy Waldman.

Facebook’s users have a right to be outraged over this.

Keep in mind though that there are different ethical obligations for research versus developing a product. It could be ethical for Facebook to manipulate News Feeds to figure out how to increase engagement while at the same time being unethical for a research journal to publish a paper about it.

Sunsets over DC

Last week I noticed that the sunset aligned unusually well with my cross-street, Newton St NW, and it made me wonder if we have any Manhattanhenge-like events in DC. DC can one-up Manhattan — we’ve got a double-henge, if you’ll let me coin a phrase.

The Double-henge

Here in Columbia Heights we have a unique street pattern. Two roads — Park Rd and Monroe St — come to an apex on 14th St. Both angle north on either side of 14th St, to the east and to the west. On a few days a year — centered on May 15 and July 29 — the roads point east toward sunrise and west toward sunset. Click the links to see on suncalc.net. (The alignment isn’t exact, so the effect spans a few days.)

All the henges

Like Manhattan, DC’s grid lines up with sunrise & sunset. It’s on the equinoxes, so we get a boring double-henge on those days too.

Some of the state avenues are kind of close to the solar azimuths on the solstices, but the peak days are a few days off. In the summer it is on the same days as the Columbia Heights Doublehenge. On those days the avenues parallel to New York Avenue line up with sunrise and the avenues parallel to Pennsylvania Avenue line up with sunset. Around the winter solstice — Nov 5 and Feb 6 — the avenues parallel to Pennsylvania Avenue line up with sunrise and the avenues parallel to New York Avenue line up with sunset.
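If you want to play with the azimuths yourself, here is a rough sketch. It is not the source code behind this post. It assumes a simple cosine model for the sun's declination and the textbook sunrise formula cos(A) = sin(declination) / cos(latitude), and it ignores refraction, terrain, and the size of the sun's disc. The function names and the example street bearings are hypothetical.

```python
import math
from datetime import date

DC_LATITUDE = 38.9  # degrees north, approximate

def solar_declination(day: date) -> float:
    """Approximate solar declination in degrees (simple cosine model)."""
    n = day.timetuple().tm_yday
    return -23.44 * math.cos(2 * math.pi * (n + 10) / 365)

def sunrise_azimuth(day: date, latitude: float = DC_LATITUDE) -> float:
    """Approximate sunrise azimuth in degrees clockwise from north."""
    decl = math.radians(solar_declination(day))
    lat = math.radians(latitude)
    return math.degrees(math.acos(math.sin(decl) / math.cos(lat)))

def sunset_azimuth(day: date, latitude: float = DC_LATITUDE) -> float:
    """Sunset mirrors sunrise about the north-south line."""
    return 360.0 - sunrise_azimuth(day, latitude)

def misalignment(street_bearing: float, azimuth: float) -> float:
    """Smallest angle in degrees between a street bearing and an azimuth."""
    diff = abs(street_bearing - azimuth) % 360
    return min(diff, 360 - diff)

# Hypothetical street bearings (degrees clockwise from north), illustration only.
example_streets = {"Example Ave": 60.0, "Example St": 90.0}
solstice = date(2014, 6, 21)
az = sunrise_azimuth(solstice)
best = min(example_streets, key=lambda s: misalignment(example_streets[s], az))
print(f"{solstice}: sunrise azimuth ~{az:.1f} deg, best match: {best}")
```

With real street bearings in place of the made-up ones, the same `min` over `misalignment` picks the best-aligned street for any date.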

I wondered which DC road best aligns with sunrise and sunset on each day of the year. If you’re driving, these would also be the roads to avoid (h/t @knowtheory). Here’s a table for the next year. The links will show you where exactly it is:

Date Sunrise Street Sunset Street
2014-06-20 Military Rd NW Ridge Rd SE
2014-06-21 Military Rd NW Ridge Rd SE
2014-06-22 Military Rd NW Ridge Rd SE
2014-06-23 Military Rd NW Ridge Rd SE
2014-06-24 Military Rd NW Ridge Rd SE
2014-06-25 Military Rd NW Ridge Rd SE
2014-06-26 Nebraska Ave NW Ridge Rd SE
2014-06-27 Nebraska Ave NW Ridge Rd SE
2014-06-28 Nebraska Ave NW Ridge Rd SE
2014-06-29 Nebraska Ave NW Mount Olivet Rd NE
2014-06-30 Nebraska Ave NW Mount Olivet Rd NE
2014-07-01 Nebraska Ave NW Pennsylvania Ave SE
2014-07-02 Nebraska Ave NW Pennsylvania Ave SE
2014-07-03 Nebraska Ave NW Pennsylvania Ave SE
2014-07-04 Nebraska Ave NW Pennsylvania Ave SE
2014-07-05 Nebraska Ave NW Pennsylvania Ave SE
2014-07-06 Nebraska Ave NW Reno Rd NW
2014-07-07 Nebraska Ave NW Reno Rd NW
2014-07-08 Anacostia Dr SE Thomas Rd SW
2014-07-09 Anacostia Dr SE Thomas Rd SW
2014-07-10 Anacostia Dr SE Thomas Rd SW
2014-07-11 DC Hwy 295 R St SE
2014-07-12 DC Hwy 295 R St SE
2014-07-13 Minnesota Ave SE R St SE
2014-07-14 Minnesota Ave SE R St SE
2014-07-15 DC Hwy 295 R St SE
2014-07-16 US Hwy 1 Macarthur Blvd NW
2014-07-17 US Hwy 1 Macarthur Blvd NW
2014-07-18 US Hwy 1 Macarthur Blvd NW
2014-07-19 US Hwy 1 Florida Ave NE
2014-07-20 US Hwy 1 Neal St NE
2014-07-21 Legation St NW Neal St NE
2014-07-22 Legation St NW Morse St NE
2014-07-23 Legation St NW Morse St NE
2014-07-24 Mississippi Ave SE Pennsylvania Ave SE
2014-07-25 Mississippi Ave SE Pennsylvania Ave SE
2014-07-26 Mississippi Ave SE Pennsylvania Ave SE
2014-07-27 US Hwy 1 Alt Monroe St NW
2014-07-28 US Hwy 1 Alt Monroe St NW
2014-07-29 US Hwy 1 Alt Park Rd NW
2014-07-30 Potomac Ave SE Park Rd NW
2014-07-31 US Hwy 1 Alt Lamont St NW
2014-08-01 US Hwy 1 Alt Pennsylvania Ave NW
2014-08-02 Firth Sterling Ave SE Massachusetts Ave NW
2014-08-03 Myrtle Ave NE Pennsylvania Ave NW
2014-08-04 Arlington Memorial Brg Missouri Ave NW
2014-08-05 Arlington Memorial Brg Missouri Ave NW
2014-08-06 Arlington Memorial Brg Missouri Ave NW
2014-08-07 US Hwy 50 Missouri Ave NW
2014-08-08 US Hwy 50 Canal Rd NW
2014-08-09 US Hwy 50 Canal Rd NW
2014-08-10 US Hwy 1 Virginia Ave SE
2014-08-11 US Hwy 1 Pennsylvania Ave NW
2014-08-12 US Hwy 1 Pennsylvania Ave NW
2014-08-13 US Hwy 1 Pennsylvania Ave NW
2014-08-14 US Hwy 1 Spring Rd NW
2014-08-15 Alabama Ave SE Howard Rd SE
2014-08-16 Savannah St SE March Ln SW
2014-08-17 S Carolina Ave SE Macdill Blvd SW
2014-08-18 S Carolina Ave SE McChord St SW
2014-08-19 S Carolina Ave SE Macdill Blvd SW
2014-08-20 Valley Ave SE Macdill Blvd SW
2014-08-21 Valley Ave SE Military Rd NW
2014-08-22 Alabama Ave SE Kalmia Rd NW
2014-08-23 Riggs Rd NE Kalmia Rd NW
2014-08-24 Mc Guire Ave SE Good Hope Rd SE
2014-08-25 Mc Guire Ave SE Gales St NE
2014-08-26 Whittier St NW Gales St NE
2014-08-27 Whittier St NW Military Rd NW
2014-08-28 Alabama Ave SE Military Rd NW
2014-08-29 Savannah St SE C St SE
2014-08-30 Kenyon St NW Brooks St NE
2014-08-31 Kenyon St NW Brooks St NE
2014-09-01 Princeton Pl NW Blaine St NE
2014-09-02 Princeton Pl NW Atlantic St SE
2014-09-03 Princeton Pl NW Jonquil St NW
2014-09-04 Roxanna Rd NW H St SE
2014-09-05 Perimeter North Rd SW Watson St NW
2014-09-06 W St NW Woodley Rd NW
2014-09-07 W St NW Calvert St NW
2014-09-08 W St NW Independence Ave SW
2014-09-09 Central Ave NE Independence Ave SW
2014-09-10 Chapin St NW Independence Ave SW
2014-09-11 Jackson St NE Independence Ave SW
2014-09-12 Newton St NE Forrester St SW
2014-09-13 Ingraham St NW H St NE
2014-09-14 Webster St NW L St NW
2014-09-15 Emerson St NW Morrison St NW
2014-09-16 Madison Dr NW Kennedy St NW
2014-09-17 McKinley St NW Emerson St NW
2014-09-18 L St NW Ingraham St NW
2014-09-19 US Hwy 50 Newton St NE
2014-09-20 Forrester St SW Newton St NE
2014-09-21 Independence Ave SW Jackson St NE
2014-09-22 Independence Ave SW Central Ave NE
2014-09-23 Independence Ave SW Central Ave NE
2014-09-24 V St NE W St NW
2014-09-25 Watson St NW W St NW
2014-09-26 Watson St NW Perimeter North Rd SW
2014-09-27 H St SE Perimeter North Rd SW
2014-09-28 Jonquil St NW Princeton Pl NW
2014-09-29 Atlantic St SE Princeton Pl NW
2014-09-30 Atlantic St SE Princeton Pl NW
2014-10-01 Brooks St NE Kenyon St NW
2014-10-02 Brooks St NE Kenyon St NW
2014-10-03 C St SE Alabama Ave SE
2014-10-04 Military Rd NW Alabama Ave SE
2014-10-05 Benning Rd NE Whittier St NW
2014-10-06 Gales St NE Mc Guire Ave SE
2014-10-07 Good Hope Rd SE Mc Guire Ave SE
2014-10-08 Gales St NE Mc Guire Ave SE
2014-10-09 Kalmia Rd NW Riggs Rd NE
2014-10-10 Kalmia Rd NW Valley Ave SE
2014-10-11 Macdill Blvd SW Valley Ave SE
2014-10-12 Macdill Blvd SW S Carolina Ave SE
2014-10-13 McChord St SW S Carolina Ave SE
2014-10-14 Blanchard Dr SW S Carolina Ave SE
2014-10-15 March Ln SW Alabama Ave SE
2014-10-16 Howard Rd SE US Hwy 1
2014-10-17 Pennsylvania Ave NW US Hwy 1
2014-10-18 Pennsylvania Ave NW US Hwy 1
2014-10-19 Pennsylvania Ave NW US Hwy 1
2014-10-20 Pennsylvania Ave NW US Hwy 1
2014-10-21 Canal Rd NW US Hwy 50
2014-10-22 Douglas St NE US Hwy 50
2014-10-23 Missouri Ave NW US Hwy 50
2014-10-24 Missouri Ave NW Arlington Memorial Brg
2014-10-25 Pennsylvania Ave NW Arlington Memorial Brg
2014-10-26 Massachusetts Ave NW Firth Sterling Ave SE
2014-10-27 Pennsylvania Ave NW US Hwy 1 Alt
2014-10-28 Lamont St NW US Hwy 1 Alt
2014-10-29 Park Rd NW US Hwy 1 Alt
2014-10-30 Monroe St NW US Hwy 1 Alt
2014-10-31 Monroe St NW US Hwy 1 Alt
2014-11-01 Pennsylvania Ave SE Mississippi Ave SE
2014-11-02 Pennsylvania Ave SE Mississippi Ave SE
2014-11-03 Morse St NE Legation St NW
2014-11-04 Morse St NE Legation St NW
2014-11-05 Oates St NE US Hwy 1
2014-11-06 Florida Ave NE US Hwy 1
2014-11-07 Macarthur Blvd NW US Hwy 1
2014-11-08 R St SE Minnesota Ave SE
2014-11-09 R St SE Minnesota Ave SE
2014-11-10 R St SE Anacostia Dr SE
2014-11-11 Thomas Rd SW Anacostia Dr SE
2014-11-12 Thomas Rd SW Nebraska Ave NW
2014-11-13 Pennsylvania Ave SE Nebraska Ave NW
2014-11-14 Pennsylvania Ave SE Nebraska Ave NW
2014-11-15 Mount Olivet Rd NE Military Rd NW
2014-11-16 Ridge Rd SE Military Rd NW
2014-11-17 Ridge Rd SE Military Rd NW
2014-11-18 Linnean Ave NW Military Rd NW
2014-11-19 Virginia Ave NW Kalorama Rd NW
2014-11-20 Virginia Ave NW Kalorama Rd NW
2014-11-21 Virginia Ave NW Kalorama Rd NW
2014-11-22 Pope St SE DC Hwy 295
2014-11-23 Pope St SE DC Hwy 295
2014-11-24 Aeration Rd SW Cathedral Ave NW
2014-11-25 Aeration Rd SW Cathedral Ave NW
2014-11-26 Aeration Rd SW Cathedral Ave NW
2014-11-27 Aeration Rd SW Westover Ave SW
2014-11-28 Newcomb St SE Condon Ter SE
2014-11-29 Newcomb St SE Mississippi Ave SE
2014-11-30 Mellon St SE Mississippi Ave SE
2014-12-01 Sumner Rd SE Mississippi Ave SE
2014-12-02 Sumner Rd SE US Hwy 1 Alt
2014-12-03 Sumner Rd SE US Hwy 1 Alt
2014-12-04 Howard Rd SE US Hwy 1 Alt
2014-12-05 Howard Rd SE US Hwy 1 Alt
2014-12-06 13th St NE Vista St NE
2014-12-07 Ainger Pl SE Vista St NE
2014-12-08 Ainger Pl SE Vista St NE
2014-12-09 Ainger Pl SE Vista St NE
2014-12-10 Ainger Pl SE Vista St NE
2014-12-11 S Dakota Ave NE US Hwy 1 Alt
2014-12-12 S Dakota Ave NE US Hwy 1 Alt
2014-12-13 S Dakota Ave NE US Hwy 1 Alt
2014-12-14 S Dakota Ave NE US Hwy 1 Alt
2014-12-15 S Dakota Ave NE US Hwy 1 Alt
2014-12-16 S Dakota Ave NE US Hwy 1 Alt
2014-12-17 S Dakota Ave NE US Hwy 1 Alt
2014-12-18 S Dakota Ave NE US Hwy 1 Alt
2014-12-19 Montana Ave NE US Hwy 1 Alt
2014-12-20 Montana Ave NE US Hwy 1 Alt
2014-12-21 Montana Ave NE US Hwy 1 Alt
2014-12-22 Montana Ave NE US Hwy 1 Alt
2014-12-23 Montana Ave NE US Hwy 1 Alt
2014-12-24 S Dakota Ave NE US Hwy 1 Alt
2014-12-25 S Dakota Ave NE US Hwy 1 Alt
2014-12-26 S Dakota Ave NE US Hwy 1 Alt
2014-12-27 S Dakota Ave NE US Hwy 1 Alt
2014-12-28 S Dakota Ave NE US Hwy 1 Alt
2014-12-29 S Dakota Ave NE US Hwy 1 Alt
2014-12-30 S Dakota Ave NE US Hwy 1 Alt
2014-12-31 S Dakota Ave NE US Hwy 1 Alt
2015-01-01 Ainger Pl SE US Hwy 1 Alt
2015-01-02 Ainger Pl SE Vista St NE
2015-01-03 Ainger Pl SE Vista St NE
2015-01-04 Ainger Pl SE Vista St NE
2015-01-05 13th St NE Vista St NE
2015-01-06 Howard Rd SE US Hwy 1 Alt
2015-01-07 Howard Rd SE Lanier Pl NW
2015-01-08 Sumner Rd SE US Hwy 1 Alt
2015-01-09 Sumner Rd SE US Hwy 1 Alt
2015-01-10 Sumner Rd SE Mississippi Ave SE
2015-01-11 Newcomb St SE Mississippi Ave SE
2015-01-12 Newcomb St SE Mississippi Ave SE
2015-01-13 Newcomb St SE Condon Ter SE
2015-01-14 Newcomb St SE Westover Ave SW
2015-01-15 Aeration Rd SW Cathedral Ave NW
2015-01-16 Aeration Rd SW Cathedral Ave NW
2015-01-17 Aeration Rd SW Cathedral Ave NW
2015-01-18 Aeration Rd SW DC Hwy 295
2015-01-19 Pope St SE DC Hwy 295
2015-01-20 Pope St SE Kalorama Rd NW
2015-01-21 Virginia Ave NW Kalorama Rd NW
2015-01-22 Virginia Ave NW Kalorama Rd NW
2015-01-23 Virginia Ave NW Military Rd NW
2015-01-24 Linnean Ave NW Military Rd NW
2015-01-25 Ridge Rd SE Military Rd NW
2015-01-26 Ridge Rd SE Nebraska Ave NW
2015-01-27 Pennsylvania Ave SE Nebraska Ave NW
2015-01-28 Pennsylvania Ave SE Nebraska Ave NW
2015-01-29 Thomas Rd SW Nebraska Ave NW
2015-01-30 Thomas Rd SW Anacostia Dr SE
2015-01-31 R St SE DC Hwy 295
2015-02-01 R St SE Minnesota Ave SE
2015-02-02 R St SE Minnesota Ave SE
2015-02-03 Macarthur Blvd NW US Hwy 1
2015-02-04 Florida Ave NE US Hwy 1
2015-02-05 Neal St NE US Hwy 1
2015-02-06 Morse St NE Legation St NW
2015-02-07 Morse St NE Legation St NW
2015-02-08 Pennsylvania Ave SE Mississippi Ave SE
2015-02-09 Pennsylvania Ave SE Mississippi Ave SE
2015-02-10 S Capitol St SE US Hwy 1 Alt
2015-02-11 Monroe St NW US Hwy 1 Alt
2015-02-12 Park Rd NW US Hwy 1 Alt
2015-02-13 Lamont St NW US Hwy 1 Alt
2015-02-14 Pennsylvania Ave NW Firth Sterling Ave SE
2015-02-15 Massachusetts Ave NW Myrtle Ave NE
2015-02-16 Pennsylvania Ave NW Arlington Memorial Brg
2015-02-17 Missouri Ave NW Arlington Memorial Brg
2015-02-18 Missouri Ave NW US Hwy 50
2015-02-19 Douglas St NE US Hwy 50
2015-02-20 Canal Rd NW US Hwy 50
2015-02-21 Nash St SE US Hwy 1
2015-02-22 Pennsylvania Ave NW US Hwy 1
2015-02-23 Pennsylvania Ave NW US Hwy 1
2015-02-24 Pennsylvania Ave NW US Hwy 1
2015-02-25 Howard Rd SE Alabama Ave SE
2015-02-26 March Ln SW S Carolina Ave SE
2015-02-27 Blanchard Dr SW S Carolina Ave SE
2015-02-28 McChord St SW S Carolina Ave SE
2015-03-01 Macdill Blvd SW Valley Ave SE
2015-03-02 Macdill Blvd SW Valley Ave SE
2015-03-03 Kalmia Rd NW Alabama Ave SE
2015-03-04 Kalmia Rd NW Riggs Rd NE
2015-03-05 Gales St NE Mc Guire Ave SE
2015-03-06 Good Hope Rd SE Mc Guire Ave SE
2015-03-07 Gales St NE Whittier St NW
2015-03-08 Benning Rd NE Alabama Ave SE
2015-03-09 Military Rd NW Alabama Ave SE
2015-03-10 C St SE Kenyon St NW
2015-03-11 Brooks St NE Kenyon St NW
2015-03-12 Brooks St NE Princeton Pl NW
2015-03-13 Atlantic St SE Princeton Pl NW
2015-03-14 Atlantic St SE Princeton Pl NW
2015-03-15 Woodley Rd NW Roxanna Rd NW
2015-03-16 Watson St NW Perimeter North Rd SW
2015-03-17 H St SE W St NW
2015-03-18 Calvert St NW W St NW
2015-03-19 V St NE W St NW
2015-03-20 Independence Ave SW Central Ave NE
2015-03-21 Independence Ave SW Calvert St NW
2015-03-22 Independence Ave SW Newton St NE
2015-03-23 Forrester St SW Newton St NE
2015-03-24 H St NE Ingraham St NW
2015-03-25 L St NW Emerson St NW
2015-03-26 Morrison St NW Kennedy St NW
2015-03-27 Kennedy St NW Morrison St NW
2015-03-28 Emerson St NW L St NW
2015-03-29 Webster St NW H St NE
2015-03-30 Newton St NE Forrester St SW
2015-03-31 Newton St NE Independence Ave SW
2015-04-01 Jackson St NE Independence Ave SW
2015-04-02 Central Ave NE Independence Ave SW
2015-04-03 Central Ave NE V St NE
2015-04-04 W St NW Calvert St NW
2015-04-05 W St NW Woodley Rd NW
2015-04-06 Bryant St NW Watson St NW
2015-04-07 Perimeter North Rd SW H St SE
2015-04-08 Princeton Pl NW Atlantic St SE
2015-04-09 Princeton Pl NW Atlantic St SE
2015-04-10 Princeton Pl NW Blaine St NE
2015-04-11 Irvington St SW Eads St NE
2015-04-12 Kenyon St NW Brooks St NE
2015-04-13 Savannah St SE C St SE
2015-04-14 Alabama Ave SE Military Rd NW
2015-04-15 I-295 Benning Rd NE
2015-04-16 Whittier St NW Gales St NE
2015-04-17 Mc Guire Ave SE Good Hope Rd SE
2015-04-18 Mc Guire Ave SE Gales St NE
2015-04-19 Riggs Rd NE Kalmia Rd NW
2015-04-20 Riggs Rd NE Kalmia Rd NW
2015-04-21 Valley Ave SE Military Rd NW
2015-04-22 Valley Ave SE Macdill Blvd SW
2015-04-23 Alabama Ave SE Macdill Blvd SW
2015-04-24 S Carolina Ave SE McChord St SW
2015-04-25 S Carolina Ave SE Blanchard Dr SW
2015-04-26 S Carolina Ave SE March Ln SW
2015-04-27 Alabama Ave SE Howard Rd SE
2015-04-28 US Hwy 1 Pennsylvania Ave NW
2015-04-29 US Hwy 1 Pennsylvania Ave NW
2015-04-30 US Hwy 1 Pennsylvania Ave NW
2015-05-01 US Hwy 1 Virginia Ave SE
2015-05-02 US Hwy 50 Canal Rd NW
2015-05-03 US Hwy 50 Canal Rd NW
2015-05-04 US Hwy 50 Douglas St NE
2015-05-05 US Hwy 50 Missouri Ave NW
2015-05-06 Arlington Memorial Brg Missouri Ave NW
2015-05-07 Arlington Memorial Brg Missouri Ave NW
2015-05-08 Arlington Memorial Brg Pennsylvania Ave NW
2015-05-09 Myrtle Ave NE Massachusetts Ave NW
2015-05-10 Firth Sterling Ave SE Pennsylvania Ave NW
2015-05-11 US Hwy 1 Alt Pennsylvania Ave NW
2015-05-12 US Hwy 1 Alt Monroe St NW
2015-05-13 Potomac Ave SE Park Rd NW
2015-05-14 US Hwy 1 Alt Monroe St NW
2015-05-15 US Hwy 1 Alt Monroe St NW
2015-05-16 US Hwy 1 Alt Pennsylvania Ave SE
2015-05-17 Mississippi Ave SE Pennsylvania Ave SE
2015-05-18 Mississippi Ave SE Pennsylvania Ave SE
2015-05-19 Mississippi Ave SE Pennsylvania Ave SE
2015-05-20 Legation St NW Morse St NE
2015-05-21 Legation St NW Morse St NE
2015-05-22 Legation St NW Neal St NE
2015-05-23 US Hwy 1 Neal St NE
2015-05-24 US Hwy 1 Florida Ave NE
2015-05-25 US Hwy 1 Macarthur Blvd NW
2015-05-26 US Hwy 1 Macarthur Blvd NW
2015-05-27 US Hwy 1 Macarthur Blvd NW
2015-05-28 Minnesota Ave SE R St SE
2015-05-29 Minnesota Ave SE R St SE
2015-05-30 Minnesota Ave SE R St SE
2015-05-31 DC Hwy 295 R St SE
2015-06-01 Anacostia Dr SE R St SE
2015-06-02 Anacostia Dr SE Thomas Rd SW
2015-06-03 Anacostia Dr SE Thomas Rd SW
2015-06-04 Anacostia Dr SE Thomas Rd SW
2015-06-05 Nebraska Ave NW Thomas Rd SW
2015-06-06 Nebraska Ave NW Reno Rd NW
2015-06-07 Nebraska Ave NW Pennsylvania Ave SE
2015-06-08 Nebraska Ave NW Pennsylvania Ave SE
2015-06-09 Nebraska Ave NW Pennsylvania Ave SE
2015-06-10 Nebraska Ave NW Pennsylvania Ave SE
2015-06-11 Nebraska Ave NW Pennsylvania Ave SE
2015-06-12 Nebraska Ave NW Pennsylvania Ave SE
2015-06-13 Nebraska Ave NW Mount Olivet Rd NE
2015-06-14 Nebraska Ave NW Ridge Rd SE
2015-06-15 Nebraska Ave NW Ridge Rd SE
2015-06-16 Military Rd NW Ridge Rd SE
2015-06-17 Military Rd NW Ridge Rd SE
2015-06-18 Military Rd NW Ridge Rd SE
2015-06-19 Military Rd NW Ridge Rd SE
2015-06-20 Military Rd NW Ridge Rd SE

source code

Did github help fix government? Not so fast.

Last month the Administration posted perhaps the first github pull request to change federal policy. One WIRED writer was quick to call success, writing here, “By opening up the revisions and the discussions behind them, the White House is making its thinking clear.” But no.

It’s easy to be fooled into believing that a new medium also signals new substance.

No substantive policy change in this pull request

The pull request is a proposed change to a federal memorandum on github regarding open data. The change clarifies when agencies should openly license their data. The memorandum originally said that federal agencies should always use open licensing. But as I pointed out when the memorandum was issued a year ago, that’s not legally possible. Most federal data is not subject to copyright in the first place, and works that are in the public domain can’t be licensed.

The proposed update to the memorandum fixes the Administration’s mistake by adding at the top:

“In instances where government data . . . does not fall squarely within the public domain . . .”

clarifying that open licensing should only be used where copyright applies. Mainly that means when the data was produced by a government contractor. There is no substantive change made in this pull request though. It clarifies the only sensible meaning the original memorandum actually had.

Omits discussion of the substantive issues

If this were the only issue in the paragraph being edited, then I too would call it success. But late last year 14 organizations backed a statement supporting the public domain for government data — not open licensing — and several of us who wrote the letter met with the Administration about the issue. The absence of any mention of the substantive issue in that paragraph should be a red flag for thinking the pull request represents open dialog.

The substantive issue is that the policy condones the copyrighting of any government data, much of which might be used to create or enforce government policy. That’s a serious First Amendment concern. It means that even if journalists can get a hold of some data, they might only be able to share it on terms set by a government agency or even a government contractor. As a broad government policy, the notion of copyrighting government data is ridiculous and flies in the face of our country’s traditions and values. (Note: Forget national security, privacy, etc. This could be data about any mundane policy.)

The pull request omits discussion of this issue, as well as other issues that I and others have discussed with the Administration (as I noted in my reply to the pull request).

Where was the dialog?

There was dialog on these issues, but it wasn’t on github. It was in private in-person meetings, as these things usually are. I and others met with Administration staff in private meetings in August 2013, April 2014, and May 2014. Our discussions each time were thoughtful and productive.

There was plenty of good dialog, but it wasn’t online. I first raised the licensing issue on github a year ago in issues #5 and #64, to which the Administration replied only that they would look into it. The issue was picked up again in issue #257, but again there was no participation in the github issue by the Administration. (There is a lot of dialog in that github repository, but it is about data standards and not policy, and most of the participants in those discussions are government employees or contractors (including myself, in those conversations) — which is a good thing, but not the subject of the WIRED article.)

The pull request posted last month represents the end of a year-long process in which discussions were taking place off-line, and proof that even with github most dialog will still continue to take place off-line.

Lest journalists get confused let’s just be clear that there wasn’t any discussion of substance on github. It was elsewhere, off-line, like normal.

Details matter

Now I’m just going to be a jerk and red-line the WIRED article because it got a lot of details wrong:

This White House GitHub Experiment Could Help Fix Government
BY ROBERT MCMILLAN

While many of our nation’s problems are quite clear, the way our government addresses them is too often a black box—opaque and closed to all but insiders and lobbyists.

But the White House has taken a remarkable–if small–step toward bringing greater transparency to the legislative process. (“legislative” refers to the legislative branch of government, i.e. Congress. This is an executive-branch memo and thus not related to the legislative process.) For the first time, it has used the GitHub social coding website as a forum for discussing and ultimately changing government policy. With one GitHub “pull request,” it modified (The document has not yet been modified.) the Project Open Data policy document, which spells out how government agencies are supposed to open up access to their data. This represents the fusion of open source software and government policy that open-government advocates have long predicted (#notalladvocates predict this). And it might be a sign of things to come as others—the city of San Francisco, and the New York state senate, to name a couple—bring collaborative government into the light.

‘We’re taking a well-known page from the open source playbook: that developing policy in an open and iterative way will create a stronger, more effective product.’

Late last week, Haley Van Dyck at the Office of Management and Budget submitted a pull request that suggested small changes to Project Open data that clarify how agencies think about open source and public domain software (The memo does not cover software. It is about data.). Pull requests are a Silicon Valley innovation. They’re typically used by software developers on GitHub to suggest and discuss changes to code. But they’re also a good tool for tracking changes to complex legal documents, even government regulations.

While Van Dyck’s changes weren’t big, it’s important that these issues were raised and addressed in a public forum where anyone can suggest language for the policy document. (Anyone can, but no one did. The pull request was submitted by the Administration to the Administration’s own document. Let’s wait until they accept a pull request submitted by the public to a policy document.) “We’re taking a well-known page from the open source playbook: that developing policy in an open and iterative way will create a stronger, more effective product. The more we can involve the community, the better that product will be,” said Van Dyck—a senior adviser to the U.S. Chief Information Officer—in an email to WIRED.

The White House will wait a few weeks to review comments to the pull requests, but then Van Dyck’s changes become official government policy with the push of a button. This is open source government: The tonic that could cure the back-room deal. (Most government policy-making involves public comments, review periods, and pushing a button to upload the final policy to the Internet. There is absolutely nothing more open-source about this than the usual agency rule-making process.)

By opening up the revisions (there is no policy-making in our government that doesn’t involve posting revisions) and the discussions behind them (as I mentioned, there was no discussion on github), the White House is making its thinking clear, and there’s an added bonus: The changes are easier to read and understand. Compare Van Dyck’s revisions here, to Rep. Lou Barletta’s proposed changes to existing law in his Emergency Unemployment Compensation Extension Act of 2014. In the GitHub document, you can see the old text struck-through in red and the new additions in green. Congressional bills like Barletta’s, on the other hand, read like uncompiled source code, detailing all the changes to be made but giving the reader no idea what the finished product will look like. (That’s not what uncompiled source code looks like. And ‘compiled’ source code certainly looks no better.)

That makes some bills unreadable, as far as the average citizen is concerned. (This isn’t an apples-to-apples comparison. Modifying 200-year-old statutory law is going to be harder for the “average citizen” to read than modifying a memo written last year.) “The thing that is actually voted on is the edits,” says Ben Balter, GitHub’s government evangelist. He has been working with the feds for years, convincing them to use more open-source software and adopt more of an open-source attitude. “The open government community has been talking about doing stuff like this, but it’s never reached fruition because there weren’t enough stakeholders in government.”

That’s begun to change, Balter says. He says he’s spending more time explaining to federal employees how they can use open source tools and methods. Two years ago, he was still convincing them to give open-source a shot. Now he’s watching the White House merge pull requests.

Responding to Dept. of Education’s RFI on APIs

The Department of Education has an RFI due tomorrow on the Use of APIs in Higher Education Data and Student Aid Processes. I submitted the following response:

Overview

The RFI asks how APIs for higher education data and programs can achieve policy goals of the Administration. As an expert on open government data, I am submitting this comment to address when APIs are useful.

Modern methods of information dissemination and service delivery recognize the long-standing role of mediators in facilitating citizen-government transactions. The media, educational institutions, and many others have long played a crucial role in helping citizens make use of information about higher education produced by the government and enroll in government services. The function of electronic standards for information dissemination and service delivery is to make mediation more efficient and therefore able to reach a wider audience. These new methods are a force multiplier for policy objectives.

Do Open Data First

An API is one of the two modern methods of information dissemination and service delivery specifically sought after by the Administration. Besides building APIs, creating open data — also called bulk, raw, and structured data — is also now an Administration goal as outlined in the White House’s Memorandum on Open Data (M-13-13).

It is important to understand when open data or an API is the right technology for a particular database or service.

Open data, when possible, is always both less costly to implement and more powerful than a “read API”. Here is a summary of why:

* Open data is static but APIs are dynamic. That means that APIs require long-term maintenance to ensure that the API remains continuously and indefinitely available. Open data is released once and updated periodically as needed.

* Open data provides a complete database but APIs provide only a small window into the data. That means that while open data can be used to build any application, an API can only be used to build applications that require a small amount of the data at a time.

* A *good* API requires that the agency do everything that good open data requires plus much more, including the creation of RESTful services, building around use cases, and creating “client libraries”.
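
The contrast in the first two points above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not any agency's actual interface: the sample rows in `BULK_CSV`, the `read_api` function, and its `page_size` limit are all hypothetical.

```python
import csv
import io

# Hypothetical bulk file that an agency publishes once and updates periodically.
BULK_CSV = """school,state,enrollment
Alpha College,VA,1200
Beta University,MD,8500
Gamma Institute,VA,430
Delta State,DC,2300
"""

def load_bulk(text):
    """Open data: the consumer receives every row and can compute anything."""
    return [
        {"school": r["school"], "state": r["state"], "enrollment": int(r["enrollment"])}
        for r in csv.DictReader(io.StringIO(text))
    ]

def read_api(rows, state, page=1, page_size=1):
    """Hypothetical read API: one small page of results for one query at a time."""
    matches = [r for r in rows if r["state"] == state]
    start = (page - 1) * page_size
    return matches[start:start + page_size]

rows = load_bulk(BULK_CSV)

# With bulk data, a question the agency never anticipated is one pass over the file.
enrollment_by_state = {}
for r in rows:
    enrollment_by_state[r["state"]] = enrollment_by_state.get(r["state"], 0) + r["enrollment"]

# With the read API, the same answer needs a request per state and per page.
first_va_page = read_api(rows, "VA")
```

The bulk file here has no server behind it at all, which is the maintenance difference the first point describes.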

A “read API” must do everything that open data does, plus much more. Therefore agencies should walk before they run. Build good open data first, validate that it meets the needs of users, learn how to do that well, and only after validation and learning invest in building an API to address additional use cases.

Open data should always be available before a “read API” is considered. The few cases where open data is not possible for information dissemination (when data changes in real time, like in the stock market, or the data is extremely large) are not likely to apply to data about higher education.

As an example, the Census Bureau and the National Weather Service have been providing open data since the mid-1990s. The practices of open data have a 25-year history.

I advise against the implementation of any read APIs for a dataset before open data for that dataset is available and validated with stakeholders.

Not all open data is created equal. Well-designed open data will prove to be most useful for mediators — and thus the consumers. For more information on open data, please see:

* My book, Open Government Data: The Book, at http://opengovdata.io/
* Best practices for open data licensing, at http://theunitedstates.io/licensing/

When to build APIs

That said, the above advice applies only to information dissemination. Read/write APIs are an excellent strategy for enrollment or participation in government services. In a read/write API, unlike a read-only API, the external user is submitting information, such as form values, in a transactional process. A read/write API decouples the customer’s experience from the business logic so that mediators can create new experiences but still comply with the agency’s business logic.

Just as with information dissemination, mediators can be valuable during transactions. Different audiences might respond best to different ways in which the transaction occurs (off-line, on an iPad, in large print, in plain language, or using jargon when targeting domain experts, etc.). Using a read/write API, mediators can create new and tailored methods of performing the same transaction and reach audiences that the agency alone could not reach as well.
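
A minimal sketch of that decoupling, assuming a hypothetical enrollment form: `submit_application`, its field names, and both mediator functions are invented for illustration and do not describe any real Department of Education service. The agency's rules live in one place, and each mediator only changes how the questions are asked.

```python
from dataclasses import dataclass, field

@dataclass
class Result:
    accepted: bool
    errors: dict = field(default_factory=dict)

def submit_application(form):
    """Hypothetical agency endpoint: all business rules are enforced here."""
    errors = {}
    if not form.get("name", "").strip():
        errors["name"] = "Name is required."
    income = form.get("household_income")
    if not isinstance(income, int) or income < 0:
        errors["household_income"] = "Income must be a whole number of dollars, zero or more."
    return Result(accepted=not errors, errors=errors)

def plain_language_kiosk(answers):
    """Mediator 1: asks plain-language questions, then maps them to the form."""
    form = {
        "name": answers["What is your name?"],
        "household_income": answers["How much does your household earn in a year?"],
    }
    return submit_application(form)

def expert_batch_tool(record):
    """Mediator 2: takes a compact (name, income) record from a domain expert."""
    form = {"name": record[0], "household_income": record[1]}
    return submit_application(form)

ok = plain_language_kiosk({
    "What is your name?": "Pat Example",
    "How much does your household earn in a year?": 42000,
})
rejected = expert_batch_tool(("", -5))
```

Because both mediators go through the same function, a rejected submission comes back with the same validation messages no matter which experience the user chose.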

Since transactions are by their nature dynamic, open data would not meet this need.

Not all APIs are created equal. Exceptional APIs lower the barrier to entry and the ongoing costs for mediators. Poorly designed APIs could end up helping no one.

A well-designed API provides granular access, deep filtering, typed values, normalized tables, RESTful interfaces, multiple output formats, useful validation messages, use-case or intent-oriented URLs, documentation, client libraries, versioning, fast results, high uptime, easy on-boarding, interactive documentation, and a developer community hub. The best APIs are used by the agencies themselves inside their own products and services.

For more information on what makes a good API, please see my blog post “What makes a good API?” at http://razor.occams.info/blog/2014/02/10/what-makes-a-good-api/.

About Me

I am the founder of GovTrack.us, a legislative transparency website that has been at the forefront of open government data since 2004, and I am the author of Open Government Data: The Book (opengovdata.io). I formerly consulted for the Department of Health and Human Services on HealthData.gov and currently consult for the Council of the District of Columbia. I can be reached at tauberer@govtrack.us.