Earlier today DC’s mayor issued a Transparency, Open Government and Open Data Directive (readable thanks to Alex Howard here). Much of it was adapted from the White House’s open government memoranda, including those memoranda’s faults.
There are many things to like about the Directive, including the mention of a potential new Chief Data Officer position, the use of open formats, and the goal of promoting reuse. The framing in terms of transparency, participation, and collaboration — lifted from Obama’s 2009 open government memo and adopted in the Mayor’s 2011 memorandum on transparency and open government — is good. (Though not great. The White House never managed to actually execute the collaboration part.)
But much of it is also undercut by a new notion of conditional access to government data that is becoming the norm.
Having their cake and eating it too
What I mean is that while the directive explicitly and clearly states that there will be
no restrictions on copying, publishing, further distributing, modifying or using the data [in DC’s data catalog]
it simultaneously explicitly describes a number of restrictions that there will or may be on use of the data. (It’s clear DC copied language from the White House’s 2013 open data memo (“M-13-13”), which I’ve blogged about before here and here, including their mistakes.)
“No restrictions” is what we want. It is, by community consensus, a core and defining quality of open government data.
If there are capricious rules around the reuse of it, it’s not open government data. Period. Restrictions serve only to create a legal lever by which the government can put pressure on things they don’t like. Imagine if the DC government took legal action against Greater Greater Washington to stop an unflattering story on the basis that GGW didn’t properly cite the DC government for the data used in a story. This is what the future of open data in DC looks like when there are restrictions on reuse.
Okay so specifically:
“Open license” does not mean “no restrictions”
So first it says that the data catalog will accomplish this goal of “no restrictions” by making the data available through an “open license.” The usual meaning of open license does not mean “no restrictions,” however. Most open licenses, including open source licenses and Creative Commons licenses, only grant some privileges but not others. Often privileges come along with new requirements, such as GPL’s virality clause, or the restriction that users must attribute the work to the author. Under the Open Definition, “open” means reusable but potentially subject to certain terms.
In guidance I co-wrote with Eric Mill, Jonathan Gray, and others called Best-Practices Language for Making Data “License-Free”, we addressed what governments should do if they really want to create “no restrictions.” They should use CC0, a copyright waiver. This is really the only way to achieve “no restrictions.”
(This was one of the confusions in M-13-13 as well. It’s clear the directive took the open licensing language from M-13-13.)
“Open license” presumes the work is copyrighted
Facts cannot be copyrighted. To the extent that DC’s data catalog contains facts about the District, about government operations, and so on, the data files in the catalog are likely not subject to copyright protections. (What is and isn’t copyrightable is murky.) Open licensing, as normally understood, presumes the work is copyrighted. If the work isn’t copyrighted, an open license simply doesn’t apply. You can’t license what you don’t own.
(This was another one of the confusions in M-13-13. But unlike the federal government, the DC government probably can copyright things it produces. But probably not data files.)
Data users must agree to a contract first
(This was not a problem in M-13-13. The License-Free Best Practices did address this though.)
Attribution and explanation requirements
The directive also gives us a clue about what else will be in the Terms of Service:
Nothing in this Order shall be deemed to prohibit OCTO or any agency … from adopting or implementing measures necessary or appropriate to . . . (v) require a third party providing the District’s public data (or applications based on public data) to the public to explicitly identify the source and version of the public dataset, and describe any modifications made to the public dataset.
This is an attribution requirement, plus a requirement for data users to explain themselves.
To be sure, and as Alex Howard called me out on on Twitter, these are hypotheticals that the directive leaves open and not something the directive is mandating. But the fact that these are mentioned strongly suggests that OCTO or other agencies want to enforce these sort of terms and will if they can.
And, as you might guess I would say, requirements to attribute the government for data and to explain what you did with data are restrictions on use, which like the others create a lever by which the DC government might put pressure on things it doesn’t like.
(This was also a problem in M-13-13, but in this case it doesn’t appear that the DC directive specifically copied the problem from M-13-13.)
There is a strong American tradition — or at least a core American value — that the government does not get in the way of the dissemination of ideas. We don’t always live up to that ideal, but we strive for it. Access to information about the government that comes with restrictions on what we can say when we use it (e.g. attribution & explanation), a waiver of rights or a commitment to indemnify, etc. are all an anathema to accountability and transparency and respect for the public.
If and when these new terms go up, I will encourage users to FOIA for the same information rather than get it from the DC data catalog.