Sketching a model of data ownership

  • This was a very brief thought exercise, and in hindsight there are a few other aspects to tackle. Please now consider this just a live doc of notes, thoughts and scribbles, that I’ll add to, amend and maybe straighten out one day :)
  • I hadn’t heard of personal data stores when I wrote this last year, but they’re growing in importance. Irina Bolychevsky, from Redecentralise, wrote this piece on them. Tim Berners-Lee’s recently announced Solid venture is one such project.
  • Rashida Richardson, from the AI Now Institute, has also raised some important issues with individual ownership of data (e.g. where is the boundary for ownership of, say, medical records, for things like hereditary conditions?). I’m not sure I pressed this point sufficiently originally, and it suggests that even if data is collected/stored on a personal basis (see above), the content/intentionality of that data may reference others, complicating matters. This could suggest the need to collectivise ownership, if ownership is itself even suitable…
  • On that last point, Will Davies’ observation that data is really just ‘our impressions’ on someone else’s ‘infrastructure’ captures the ongoing difficulty of assessing how (much) value should flow to the infrastructure providers and the data factory itself. Of course, you could also construe ‘impressions’ as the labour that keeps that industrial complex flowing. ¯\_(ツ)_/¯
  • I’m also sceptical of putting a monetary value on data. Beyond almost certainly protecting incumbents, what other kinds of behaviour/companies/activity might that (dis)incentivise? Indeed, is payment even a necessary condition of promoting public value? Could that be achieved just by making high, non-exploitative governance/usage standards a condition of access?
  • Finally, is control as important as ownership, or even more so? In a few years privacy-app Jumbo will, surely, just be a ‘data control all in one place’ app. The ODI’s recent work on data trusts sits somewhere in between this and Solid. Chuck in a self-sovereign ID and maybe we’re getting there.
  • Anyway, just some more caveats to an already over-caveated piece :)

The challenge

The proposal

One of the responses to this state of affairs is for consumers to ‘own their data’. It pays to consider what this means for three main areas of concern:

  1. Data collection: Any data collected through the devices and services you use would need to be sent directly to a place of your choosing, rather than that service or device’s own servers.
  2. Data storage/management: For reasons of privacy and security, you may decide to run your own server and store a raw data stream. Or, for convenience, you might outsource this to a company that will turn your raw data into meaningful insights for you (based on its ability to aggregate its users’ data).
  3. Data-based insights: Whether you have a company that provides insights at this stage, or you sell your raw data to other services who produce their own insights, the apparent benefit of this model is that you at least have control over the process.
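
To make the three steps above a little more concrete, here is a very rough sketch in Python. Everything in it is hypothetical and for illustration only: the PersonalDataStore class, the "insights-co" service name and the step-count 'insight' are my own assumptions, not a description of Solid or any real product. It only shows the shape of the idea: devices write to a store the user chooses, and a service can read the raw data only while the user's grant is in place.

```python
from dataclasses import dataclass, field


@dataclass
class PersonalDataStore:
    """Hypothetical user-controlled store holding raw events and access grants."""
    owner: str
    events: list = field(default_factory=list)
    grants: set = field(default_factory=set)

    def collect(self, event: dict) -> None:
        # Step 1 (collection): devices send data here, not to the vendor's servers.
        self.events.append(event)

    def grant(self, service: str) -> None:
        # The owner decides which services may read the raw data.
        self.grants.add(service)

    def revoke(self, service: str) -> None:
        self.grants.discard(service)

    def read(self, service: str) -> list:
        # Step 3 (insights): only services with a current grant get the data.
        if service not in self.grants:
            raise PermissionError(f"{service} has no grant from {self.owner}")
        return list(self.events)


def step_count_insight(events: list) -> int:
    """Toy 'insight': total steps across all collected events."""
    return sum(e.get("steps", 0) for e in events)


# Step 2 (storage): the user runs, or chooses, the store.
store = PersonalDataStore(owner="alice")
store.collect({"device": "watch", "steps": 4000})
store.collect({"device": "phone", "steps": 1500})

# The user lets a (hypothetical) insights company read the data, then changes their mind.
store.grant("insights-co")
total = step_count_insight(store.read("insights-co"))
store.revoke("insights-co")
```

After the revoke, a further store.read("insights-co") raises PermissionError. That is the 'control over the process' part of the model, although of course it says nothing about copies the service has already taken.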


  1. Thanks to Lorna Pittaway for her advice.
  2. Facebook’s insights are often far more complex than this, such as their ability to spot potentially suicidal users. (My model is admittedly, and deliberately, extremely simplified. Still, it’s worth noting that any comprehensive account would need to process and communicate (and report?) various nonstandard data points such as this.)
  3. Mozilla is a good example of how a mission-driven organisation could also compete when it comes to recruitment.
  4. Update 03/04: clarified language around data portability/APIs.
  5. Update 28/04/19: added further points.


