Free data: utility, risks, opportunities

Some random thoughts after The possibilities of real-time data event at the City Hall.

Free your location: you’re already being photographed
I was not surprised to hear the typical objection (or rant, if you don’t mind) of institutions’ representative when requested to release data: “We must comply with the Data Protection Act!“. Although this is technically true, I’d like to remind these bureaucrats that in the UK being portraited by a photographer in a public place is legal. In other words, if I’m in Piccadilly Circus and someone wants to take a portrait of me, and possibly use it for profit, he is legally allowed to do so without my authorization.
Hence, if we’re talking about releasing Oyster data, I can’t really see bigger problems than those related to photographs: where Oyster data makes it public where you are and, possibly, when, a photograph might give insight to where you are and what you are doing. I think that where+what is intrinsically more dangerous (and misleading, in most cases) than where+when, so what’s the fuss about?

Free our data: you will benefit from it!
Bryan Sivak, Chief Technology Officer of Washington DC (yes, they have a CTO!), has clearly shown it with an impressive talk: freeing public data improves service level and saves public money. This is a powerful concept: if an institution releases data, developers and business will start creating enterprises and applications over it. But more importantly, the institution itself will benefit from better accessibility, data standards, and fresh policies. That’s why the OCTO has released data and facilitated competition by offering money prizes to developers: the government gets expertise and new ways of looking at data in return for technological free speech. It’s something the UK (local) government should seriously consider.

Free your comments: the case for partnerships between companies and users
Jonathan Raper, our Twitter’s @MadProf, is sure that partnerships between companies and users will become more and more popular. Companies, in his view, will let the cloud generate and manage a flow of information about their services and possibly integrate it in their reputation management strategy.
I wouldn’t be too optimistic, though. Albeit it’s true that many longsighted companies have started engaging with the cloud and welcome autonomous, independently run, twitter service updates, most of them will try to dismiss any reference to bad service. There are also issues with data covered by licenses (see the case of FootyTweets).
I don’t know why I keep thinking about trains as an example, but would you really think that, say, Thameslink would welcome the cloud twitting about constant delays on their Luton services? Not to mention the fact that NationalRail forced a developer to stop offering a free iPhone application with train schedules – to start selling their own, non free (yes, charging £4.99 for data you can get from their own mobile web-site for free, with the same ease of use, is indeed a stupid commercial strategy).

Ain’t it beautiful, that thing?
We’ve seen many fascinating visualization of free data, both real-time and not. Some of these require a lot of work to develop. But are they useful? What I wonder is not just if they carry any commercial utility, but if they can actually be useful to people, by improving their life experience. I have no doubt, for example, that itoworld‘s visualization of transport data, and especially those about Congestion Charging, are a great tool to let people understand policies and authorities make better planning. But I’m not sure that MIT SenseLab’s graphs of phone calls during the World Cup Final, despite being beautiful to see, funny to think about, and technically accurate, may bring any improvement to user experience. (Well, this may be the general difference between commercial and academic initiative – but I believe this applies more generally, in the area of data visualization).

Unorthodox uses of locative technologies
MIT Senselab‘s Carlo Ratti used gsm cell association data to approximate people density in streets. This is an interesting use of technology. Nonetheless, unorthodox uses of technologies, especially locative technologies, must be taken carefully. Think about using the same technique to calculate road traffic density: you would have to consider single and multiple occupancy vehicles, where this can have different meanings on city roads and motorways. Using technology in unusual ways is fascinating and potentially useful, but the association of the appropriate technique to the right problem must be carefully gauged.

Risks of not-so-deep research
This is generally true in research, but I would say it’s getting more evident in location-based services research and commercial activities: targeting marginally interesting areas of knowledge and enterprise. Ratti’s words: “One PhD student is currently looking at the correlations between Britons and parties in Barcelona… no results yet“. Of course, this was told as a half-joke. But in many contexts, it’s still a half-truth.