
CSV2RDF: User-Driven CSV to RDF Mass Conversion Framework

Proceedings of the ISEM '13, September 04 - 06 2013, Graz, Austria (2013)

Abstract

Governments and public administrations have recently started to publish large amounts of structured data on the Web, mostly in the form of tabular data such as CSV files or Excel sheets. Various tools and projects have been launched to facilitate the lifting of tabular data to semantically structured and linked data. However, none of these tools supported a truly incremental, pay-as-you-go data publication and mapping strategy that enables effort sharing between data owners, community experts and consumers. In this article, we present an approach for enabling the user-driven semantic mapping of large amounts of tabular data. We devise a simple mapping language for tabular data, which is easy to understand even for casual users, but expressive enough to cover the vast majority of potential tabular mapping use cases. We outline a formal approach for mapping tabular data to RDF. Default mappings are automatically created and can be revised by the community using a semantic wiki. The mappings are executed using a sophisticated streaming RDB2RDF conversion. We report on the deployment of our approach at the Pan-European data portal PublicData.eu, where we transformed and enriched almost 10,000 datasets accounting for 7.3 billion triples.
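
The idea of an automatically created default mapping can be illustrated with a minimal sketch. It is not the mapping language or the converter described in the paper; it only assumes a plausible default in which every row becomes a resource, every column header becomes a property, and every non-empty cell becomes a plain literal. The `http://example.org/` namespace, the URI layout, and the names `default_mapping` and `escape_literal` are hypothetical. Triples are yielded one row at a time, in the spirit of a streaming conversion.

```python
import csv
import io
from urllib.parse import quote

# Hypothetical namespace; the paper does not prescribe this URI layout.
BASE = "http://example.org/"


def escape_literal(value):
    """Escape backslashes, quotes and line breaks for an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def default_mapping(csv_text, dataset_id):
    """Yield N-Triples lines for a CSV document, one row at a time.

    Assumed default: the first line holds the column headers, each later
    row becomes one subject, and each non-empty cell becomes one triple.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, None)
    if header is None:  # empty input: nothing to map
        return
    # One property URI per column header.
    props = [
        f"{BASE}property/{quote(name.strip().replace(' ', '_'), safe='')}"
        for name in header
    ]
    for row_number, row in enumerate(reader, start=1):
        subject = f"{BASE}{quote(dataset_id, safe='')}/row/{row_number}"
        for prop, cell in zip(props, row):
            if cell == "":
                continue  # skip empty cells instead of emitting empty literals
            yield f'<{subject}> <{prop}> "{escape_literal(cell)}" .'


if __name__ == "__main__":
    sample = "name,value\nalpha,1\nbeta,\n"
    for triple in default_mapping(sample, "demo"):
        print(triple)
```

In a community-driven setting such as the one the abstract describes, a default like this would only be a starting point: a revised mapping could replace the generated property URIs with terms from an existing vocabulary or assign datatypes to individual columns.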
