Getting to know PROV - the W3C Provenance Specifications
Provenance (the origin or source) of information is critical in deciding whether information is to be trusted, how it should be integrated with other diverse information sources, and how to give credit to its originators when reusing it. To promote the widespread publication of provenance information on the Web, the W3C is producing the PROV set of specifications, which provide a basis for the common exchange of provenance information on the Web. This half-day tutorial provides an in-depth dive into these specifications, including hands-on information on how to publish, query, and access provenance information. You will learn how to model your provenance data using the PROV data model and ontology, how to produce provenance information that enables integrity checking and inferences, and how to expose and acquire provenance information using PROV access mechanisms and services.
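To give a flavor of what modeling with the PROV data model looks like, here is a minimal sketch that builds a handful of statements in PROV-N-style notation using only the Python standard library. The `ex` prefix, the example.org namespace, the identifiers `report`, `compile`, and `alice`, and the helper function names are all hypothetical, chosen for illustration; they do not come from the tutorial. The sketch covers only the three core PROV types (entity, activity, agent) and three relations between them.

```python
# Hypothetical namespace and identifiers, used only for illustration.
PREFIX = "ex"
NAMESPACE = "http://example.org/"


def qname(local):
    """Return a prefixed name such as 'ex:report'."""
    return f"{PREFIX}:{local}"


def build_statements():
    """Describe a report that Alice produced through a compile activity."""
    report = qname("report")      # an entity: the thing whose provenance we record
    compile_ = qname("compile")   # an activity: the process that produced it
    alice = qname("alice")        # an agent: who bears responsibility
    return [
        f"entity({report})",
        f"activity({compile_})",
        f"agent({alice})",
        # '-' marks an optional argument (here, the time or plan) left unspecified.
        f"wasGeneratedBy({report}, {compile_}, -)",
        f"wasAssociatedWith({compile_}, {alice}, -)",
        f"wasAttributedTo({report}, {alice})",
    ]


def render_document(statements):
    """Wrap the statements in a document block with a prefix declaration."""
    lines = ["document", f"  prefix {PREFIX} <{NAMESPACE}>"]
    lines += [f"  {s}" for s in statements]
    lines.append("endDocument")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_document(build_statements()))
```

Running the script prints a small document that a reader can compare against the examples in the PROV specifications. A real publisher would more likely use an existing PROV library or an RDF serialization than hand-built strings.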
January 17, 2010 By Scott Brinker
The 8th linked data business model
In response to my post on linked data business models, Leigh Dodds at Talis wrote a terrific piece with his thoughts on the business of linked data. Leigh presents a number of great ideas that I think really carry the conversation forward.
One of his points is that I overlooked an important model, what he calls the “sponsorship model.” Under this model, a government entity or a non-profit organization has a funded mandate to deliver certain data to the public or their targeted constituency. To avoid confusion, though, I’d humbly suggest calling it the subsidized model, because sponsorship is often associated with advertising and branding, which are very different business models.
I’ve organized these by how revenue is generated, from direct money-for-data to indirect branding programs.
Within each of these revenue models, there’s also a secondary dimension of how the data is delivered, whether in raw form for others to leverage in their own applications or embedded into a pre-packaged application provided directly to end-users.
1. Subscription model. Some data will be valuable enough that you can charge people a subscription to access it. This model has been around for a while, but it will gain new life as linked data standards make it easier for people to consume and mash-up data in novel applications.
2. Advertising model. Advertising: the second oldest profession. Data-driven applications will have plenty of opportunity for contextual ads and sponsorships. One interesting twist will be advertisers who pay to include information in raw data feeds, data-layer ads if you will.
3. Authority model. If anyone can publish data on the web, how will you know what data is good? That problem will be an opportunity for third-party “authorities” to validate data — or do official reviews and certifications that are published as data — and charge for participation. Compliance services are related to this.
4. Affiliate model. Affiliate marketing programs generate over $6 billion/year in commissions and are a major source of transactions and leads for merchants such as Amazon.com. Embedding affiliate links in data, so that they are activated when surfaced into end-user applications, is a natural extension of this existing model.
5. Value-Add model. Useful data can be bundled with other services to make the overall solution more valuable. For example, think of the benchmarking data now included with Google Analytics. Access to data can also be offered earlier in the sales funnel, as a lead generation incentive.
6. Traffic model. As with Google Rich Snippets, data can be used to boost the visibility and ranking of sites in major and vertical search engines. This is data-enhanced search engine optimization (SEO++) to increase traffic. Nickname: the “data for nothing and links for free” model (apologies to Mark Knopfler).
7. Branding model. As Josh Jones-Dilworth said, “Data shapes conversations and markets.” Data branding can use data — and the vocabularies that define and structure data — to position and promote a company’s worldview and differentiation strategy.
Of course, there will be hybrid models that combine several of these approaches.
Particularly in the early days, most organizations will benefit from experimenting with linked data for traffic, branding, and a little value-add. The value to them will lie in the learning as much as anything. As the data web matures and they become more experienced, they may embrace more direct revenue models.
But don’t underestimate the importance of data branding. When it comes to establishing industry standard vocabularies and ontologies, there is a definite first-mover advantage.
For the entrepreneurs in this space, however, everything is fair game.
Chenliang Li, Aixin Sun, Jianshu Weng, and Qi He. Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 523-532. New York, NY, USA, ACM, 2013.