Linked Open Data Business Models

Jeni Tennison has blogged about Open Data Business Models (https://www.jenitennison.com/blog/node/172). Like all Jeni’s writings, it’s useful and insightful. But one comment in her blog resonated particularly strongly with me:

“I don’t think anyone is likely to publish open data well if they don’t have some motivation that is a lot nearer to home than ‘helping grow the economy’”.

This is very true. I can imagine a number of ways that you could persuade a data owner to publish a quick snapshot of their data, or possibly even an intermittent collection of snapshots. Publishing ‘live’ data is a different matter. By that I mean quality data, where the published data is updated as soon as the underlying data changes, and where enough information is published alongside it to ensure that the data is meaningful. That involves significant work, and if the benefit of doing this isn’t directly felt by the people doing the work, then such publication is unlikely to be sustainable.

As a consequence, I’m quite sceptical about a number of proposed business models.

One of the models that Jeni mentions is cost avoidance, and this is a good one. If publishing live data turns out to be cheaper than not publishing it, then this is a very good reason for having a sustainable publishing solution. Jeni gives reducing FOI requests as an example of such cost avoidance.

Publishing data as Linked Data gives another opportunity for cost avoidance.

People who collect government data generally don’t do such data collection for its own sake. The data is already published in some form or other, whether to the EU, to Parliament, to another government department or to other people within the same department. Moreover, it’s not uncommon for the same data to be published multiple times, in different formats, to meet the needs of different customers. Anecdotally we’ve heard that some data collected by local councils is reported to central government seven times, in seven different ways.

Publishing as Linked Data enables “Publish once, use many times”. Linked Data is published in context, giving both the value and the “meaning” of the data. Different consumers can extract different slices of the data for different purposes – you no longer need to publish different views of the data for different users. Choosing to publish your data as linked data can significantly reduce your overall data publishing costs, since you maintain one publication instead of several.
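To make “publish once, use many times” a little more concrete, here is a minimal sketch in Python. It is only an illustration: the council identifiers, property names and figures are all invented, and a real linked data publication would use URIs and a proper RDF store rather than a list of tuples. The point is simply that one set of published statements can serve two different consumers without a second publication.

```python
# One published dataset, held as (subject, predicate, value) statements.
# Every identifier and figure here is invented for illustration only.
TRIPLES = [
    ("council/A", "population", 120000),
    ("council/A", "wasteRecycledTonnes", 5400),
    ("council/A", "region", "North"),
    ("council/B", "population", 80000),
    ("council/B", "wasteRecycledTonnes", 2100),
    ("council/B", "region", "South"),
]


def slice_by_predicate(triples, predicate):
    """Return {subject: value} for a single property across all subjects."""
    return {s: v for s, p, v in triples if p == predicate}


def describe(triples, subject):
    """Return {predicate: value} for everything said about one subject."""
    return {p: v for s, p, v in triples if s == subject}


# Consumer 1: someone compiling a recycling report across all councils.
recycling_report = slice_by_predicate(TRIPLES, "wasteRecycledTonnes")

# Consumer 2: someone who wants the full profile of one council.
council_a_profile = describe(TRIPLES, "council/A")
```

Both consumers read from the same published statements; neither needs the publisher to prepare a separate view for them.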

Paradoxically, of course, these benefits are independent of publishing open data – you don’t have to add open publication of data to your existing data publication efforts to gain the benefits of publishing as linked data. But the political pressure of having to publish as open data is in several cases turning an unacceptable publication burden into an intolerable one, and helping to spread the uptake of linked data publication.

It’s still a non-trivial amount of work to set up a linked data publication pipeline, and in economically challenging times it’s hard to find the money for an upfront investment even if this will reduce costs in future years. This is where I think another of Jeni’s business models – sponsorship – can find a useful place. Instead of sponsoring people to publish a snapshot of their data, sponsoring them to set up a linked data publishing program gives them the financial resources to get the work going, and leaves them with a process that is cheaper to run than their current process, which is what is needed for a sustainable solution.

In such a solution, the data that is published as linked open data is precisely the same data that is used internally within the department; publishing open data is not an additional task – it’s the only task.

I used to work for Hewlett Packard. In its early days, HP operated what it called a “next bench” philosophy – engineers were told to work on tools and products that would be useful to “the guy on the next bench”. This was extremely successful and made HP’s offerings the tools of choice for hardware engineers round the world.

I think that knowing that the data you are publishing and keeping up to date is precisely the data that the person at the next desk is using for analysis is potentially the most compelling business case of all.