The Open Data Delusion
Open Data is about the “Data” as much as it is about the “Open.” Some stories from my experience as an Open Data activist and adviser illustrate it.
This article first appeared on 20 May 2016 in the now defunct online magazine Broken Toilets. It’s still available on the Web Archive.
I first met Gail Ramster in 2010 at an event about the release of London-wide Open Data by the Greater London Authority. A researcher on “toilet usability,” she was trying to gather public data to compile a list of toilets accessible to elderly people. Six years later I met Gail again, this time at her office at the Royal College of Art, to discuss her experience. The past 5 years have been for me a whirlwind of Open Data advocacy: first working to increase awareness of Open Data in academia, then as a ministerial adviser in the now defunct Open Data User Group (ODUG), an advisory panel at the UK Government Cabinet Office. ODUG operated from 2012 to 2015 to help the Government prioritize data releases, assign funding, and produce policy recommendations.
In those three years, one of the most curious things I learned was that toilets are among the most requested datasets. A project to allow Local Authorities to release toilet locations was one of the recipients of the Release of Data Fund (see: Local Government Data Incentive Scheme), which we administered.
When I tell Gail that I believe toilet data is representative of the whole Open Data parable, she laughs: “It’s all very fragmented.” In fact, despite a lot of work by and with the Government, in 2016 we still have little assurance about data quality, release frequency, or the reliability of schemas, and we still face conflicting standards and teething issues with licensing.
Photograph: The Helen Hamlyn Centre for Design
Ten years of Open Data have come with plenty of expectations and some successes. The imagination of activists, coders, and ordinary citizens was catalyzed when data scraped from government websites brought survivors to safety after Hurricane Katrina (2005) in the U.S. But now the data revolution seems incomplete. A mixture of inaccurate data, licensing issues, incompatible formats, and unclear update processes has brought the movement from its early hopes to a state of disappointment. From the hype of “Open Data can save lives,” many in the community are now left wondering whether that potential has been overstated. We are still awaiting the coming of the Open Data killer application. Does this application exist at all?
My personal fascination with Open Data dates back to the very early days. Most people think of Open Data as a single and uniform movement, while it is in fact a two-faced phenomenon: on one side, a push for transparency and citizen engagement; on the other, the ability to use data to reimagine public service and policy-making, based on fact and evidence. Working in academia, it has been evident to me that these two aspects are (or should be) intimately linked in a positive feedback loop: sharing data enables more research, which in turn releases more data ready to power even more research. But as I’ve learned in my ODUG years, it is difficult to translate this process into something that works for the civil service, where politics and bureaucracy conspire to grind innovation to a halt.
In its early days in the UK, between 2005 and 2010, Open Data focused on transparency. Prime Minister David Cameron hailed the release of public data as a way to foster an “army of armchair auditors”: ordinary citizens who, empowered by financial expenditure spreadsheets, would scrutinise public sector operations. Despite its transparency ethos, the armchair auditor idea failed to catch on: no guarantees on data accuracy or update frequency, combined with a general lack of interest, sent armchair auditors to early retirement.
Maps Image: Gail Ramster
The British experience is extremely important in its successes and failures. Still topping the tables of every imaginable “Open Data league,” from the OKFN index (2nd place) to the Open Data Barometer (1st place), UK Open Data was part of a worldwide movement towards a more transparent way of running Government. The Open Government Partnership hailed the UK as the best example in this context, and the 2013 OGP Summit, held in London, reinforced this status. The extremely high-profile participation of David Cameron and of Francis Maude, the then Cabinet Office minister responsible for the Open Data and Transparency agenda, told the world that to achieve success with Open Data senior leaders must be involved. But all that glitters ain’t gold: success was often ill-defined as “number of datasets released” without a clear discourse around data quality or utility. While the UK shows success in getting its political leaders involved and releasing more data, it won’t be the ideal standard until the focus is on the ability to use that data to actually do things in practice.
In late 2013, the Government’s rhetoric started to shift: Open Data became an instrument to power services and build businesses. Public toilet data might not sound ground-breaking, yet it is incredibly representative of the difficulties of this phase. For a long time, licensing agreements with suppliers of mapping products prevented local councils from releasing data under an open license. This limited the possible reuses. “I didn’t really look at re-licensing the data. It’s really complicated due to the number of different licenses,” says Gail.
Problems also emerged when combining data from different providers. “The data would come in different formats and not all datasets would have the same information,” says Gail, who has worked with the Local Government Association to produce a standard schema across multiple local authorities. Even then, there is no guarantee on who would update the data and how often. “Local authorities don’t know whether to assign this task to their Open Data people, who don’t know much about toilets, or to their toilets people, who don’t understand much about data issues.” In the strict budgeting of recession Britain, councils struggle to find resources to bridge this skills gap, and certainly cannot afford to attract talent on pay scales that do not compare to the private sector.
In 2013, Gail and her collaborators turned to crowdsourcing to complete the dataset. I ask her if she’s aware of any council getting the data back to update and improve the quality of their own datasets. “Maybe one case. But there is no official route to provide feedback.”
This lack of engagement is one of the reasons for the disappointment within the community. It makes Open Data look like a tick-box exercise for authorities seeking good PR. The public has few routes to help them do data the right way. The lack of paths “back to the system” has clearly limited the possibility of correcting datasets, but it has also caused frustration to the community. The obvious consequence has been a loss of positive energy. And with these difficulties no business could reliably adopt Open Data to build new services.
Earlier on, the Greater London Authority had been a positive example of Open Data engagement. In 2008, under Emer Coleman, then Director of Digital Projects, City Hall started to build its “London Datastore”. Emer invited developers and activists to workshops that traced the course for data to be released. “It was very collaborative. I was their person inside the system,” remembers Emer. “Before we did anything, we said ‘let’s just have a workshop. What is the Minimum Viable Product we need to build? What do we need to prioritise?’”
Coming across as passionate and thoughtful, Emer incarnates the typical public sector disruptor, with a mixture of energy and frustration built up over years of fighting for innovation. She hails getting Transport for London to release real-time data as her biggest success. “It’s been like a blackbox opening up.” A large number of transport planning apps, including the popular CityMapper, have come out of this data.
That energy seems to have disappeared. “I think the rhetoric has outstripped reality. In the early days technologists like Chris Thorpe and Chris Taggart were involved. This helped shape the services we needed to build, and decide what data we needed to release.” After she left City Hall, control went back to policymakers. Open Data shifted to business as usual and the earlier success has not been replicated. When I mention “killer apps,” Emer smiles. “The only way to make progress on Open Data is to get data scientists and technologists into the rooms of power. The nature of technology is not to care about hierarchy, but to seek evidence-based explanations. This dis-intermediates power structures and allows innovation to happen.”
After a long career in the public sector, she went on to work for Transport API, an aggregator of transport Open Data. It is one of the very few examples of successful Open Data businesses turning profits.
Within the UK Central Government, things have been even more difficult. Despite scoring highly on the Open Government Partnership tables, there are few stories that sound truly revolutionary. Environmental data is behind some of these stories. After floods affected large parts of the country in 2013, the Department for Environment, Food and Rural Affairs, through its Environment Agency, went on an unrivalled data release spree, publishing thousands of datasets in just 2 years. The outcomes have been interesting. Shoothill, a data analysis consultancy firm, has launched GaugeMap, a visualization of rivers and tides based on real-time data, allowing people in risk areas to be alerted promptly.
A new shift in rhetoric has begun after the 2015 General Election. Where “Open Data” and “Transparency” had been the key concepts in the previous administration, there is now a “Data Government Programme” focused on rethinking internal processes rather than on engaging the community and general public through transparency.
This is a two-faced phenomenon. On one side, the disappearance of “Openness” as a first-class citizen is a worrying sign which points to a lack of engagement with the data user community; on the other, it suggests that the Government is starting to conceive of data as a central mechanism to run operations.
With the advent of the Government Digital Service, a mostly paper-oriented Government started to adopt a digital-by-default approach to citizen services. This has come together with a move to a well-architected data layer to power these services. Not all of this data is open, but the general view is that Open Data could become one of the several outcomes of this move. The jury is still out.
However, the Government has announced funding of £5M to build an Open Address database, a painful topic for the community: without addresses, it is difficult to deal with many location-based datasets. This could be a positive sign of a new age for data: by using data internally, the public sector improves its understanding of Open Data needs.
The developing world might have different problems than the UK, but Open Data stories are similar. Freshly out of a PhD in Geographical Information Systems from the University of Nottingham, Mark Iliffe has spent the last 5 years mapping slums in Tanzania for sanitation and flood prevention. He collected data about water points and toilets and disseminated the information to residents using platforms like OpenStreetMap. Mark suggests that international development projects can only work when supported by the local community: outsiders cannot see what the community knows. “Sometimes what looks like a small shack is actually a pit latrine serving 20 people, or a guy sitting next to a water tank with a jerry can, who measures and sells water.”
Water Seller Photograph: Mark Iliffe
The projects he runs, funded by the World Bank, involve enfranchising and training community members in data collection, so that this community knowledge can be drawn out and mapped. “Once mapping is complete, scenarios range from ‘what if water points are built here – how many more people will get water?’ to ‘if it floods here, how are schools affected?’”
Mark is adamant that the best way to present Open Data is as a tool for efficiency and improving operations rather than as a transparency mechanism. “Some countries struggle with the notion that access to information is a priority. It cannot be when the provision of public services is facing challenges. The narrative of improving efficiency is more compelling.”
Interestingly, this resonates with the experience of the UK, where transparency data releases have been anything but a success.
“During the recent cholera outbreak in Dar es Salaam, neighborhood officers have started to use the already existing open map data to identify poor drinking points, flagging these for inspection. It was completely unexpected.” This is where hopes for the future of Open Data lie: the ability to use the information to devise preventative actions.
Photograph: Giuseppe Sollazzo
The similarities between the UK and the developing world are striking, but success seems to go in opposite directions. There might be an issue of scale: the transformational power of Open Data can only be seen when enough data is released, and when data can be embedded in everyday operations. In Dar es Salaam the data is being produced at the right scale and this allows it to produce positive outcomes. “There has been an incomplete revolution because we had incomplete data,” says Emer Coleman before I leave her. Although this might sound like a developing country issue, it is also true in places like the UK, where despite the huge effort, Open Data is still not “business as usual” in a reliable way. “We haven’t had a scale of data releases and uses that allows a revolution. Get more data and hire data scientists to use the data and you will see the impact.” She might be right.
Meanwhile, the community is revising its strategy. One key issue of Open Data is that information rights were never clearly attached to it, and the community didn’t quite ask for them, because speeding up access to the data was considered more important. As Tom Steinberg wrote, the Open Data community was too polite. Open Data, especially in the UK, might have come with a license (the Open Government Licence), but this is full of legal exceptions, for example the assumption that “data might be owned by others” who are not the license issuers. Moreover, the political climate no longer allows further duties to be imposed on public authorities; as the government seems intent on weakening Freedom of Information, it sounds unlikely that Open Data rights might come into force at any point. If the UK isn’t able to move in this direction, what will the developing world do?
These legal difficulties should not hide the fact that Open Data is ultimately powerful when it represents a conversation between data experts inside the system and data users who access that system. And to see the system become mature and produce better services, it needs to keep that conversation alive, learn from it, and use its lessons to change.