Three things about data...

Posted on Dec 6, 2024

First published on LinkedIn

Quite a few interesting discussions at Think Data yesterday, both on and off stage.

Three main thoughts from me:

  1. the “future of data” is… overrated. There is still a lot of work needed to get the basics right: data quality, seamless data pipelines, and an understanding of the strengths and weaknesses of data. We need to move away from the mentality of blind faith in data as “the solution”. I recently read a fantastic article on how any data pipeline, taken uncritically, introduces dangerous bias at every step: selection bias when gathering data, recency bias when interpreting it, and confirmation bias when drawing insight from it. And bias means ineffective data. We need to apply critical thinking in building solid data foundations, or the future of data will be broken, disappointing, and potentially dangerous.

  2. good data governance is a beautiful thing. I promise: this is not a civil servant’s dystopian dream of ever-increasing red tape :) When most folks criticise governance, what they have experienced is bad, slow governance. Governance, and by that I mean good governance, is there to protect people - the citizens and the operators. But there is no need to make it byzantine and slow: good governance is clear, lean, and builds confidence. One thing I learnt during my years in the NHS AI Lab is that how long it takes to clear governance is correlated with data maturity: the more mature the organisation, the faster it clears, because it is able to understand the problem without being scared by it, without parking a data governance request forever, and without escalating it to the CEO every time because everyone is too afraid. Increasing data maturity lays the foundations of good governance.

  3. we must still do a lot of work on communicating data well. Data is complicated. Engaging with a varied user base with varied data needs means that not everyone will be at the same level of technical understanding. I always give the example that if you say “the UK average house price is…”, 99% of untrained folks will automatically assume that the average is the arithmetic mean (hint: it’s not). Not to mention the difficulty of grappling with uncertainty when the data comes as probabilities, such as weather data. As public servants, we must go the extra mile and engage, engage, engage; transparently, helping the public understand the concept of uncertainty, and questioning the definition of common-sense words.
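
To make the house price example in the third point concrete, here is a minimal Python sketch of how far apart different “averages” can sit on the same data. The prices are made up purely for illustration, and the `summarise` helper is a hypothetical name. The sketch does not claim to reproduce how any official house price figure is calculated; it only compares three common definitions of “average” on a skewed sample.

```python
from statistics import geometric_mean, mean, median

# Hypothetical sale prices in GBP, invented for illustration only.
# One very expensive property skews the distribution, as often
# happens with real house prices.
prices = [150_000, 180_000, 200_000, 220_000, 250_000, 1_500_000]


def summarise(values):
    """Return three different 'averages' of the same numbers."""
    return {
        "arithmetic_mean": mean(values),
        "median": median(values),
        "geometric_mean": geometric_mean(values),
    }


summary = summarise(prices)
for name, value in summary.items():
    print(f"{name}: £{value:,.0f}")
```

On this sample the arithmetic mean is pulled well above what most of the properties sold for, the median sits in the middle of the typical prices, and the geometric mean lands in between. All three are legitimately “the average”, which is why the word needs defining whenever it is communicated to the public.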