iTrails: pay-as-you-go information integration in dataspaces

Proceedings of the 33rd International Conference on Very Large Data Bases, pages 663-674. VLDB Endowment, 2007.

Abstract

Dataspace management has recently been identified as a new agenda for information management [17, 22] and information integration [23]. In sharp contrast to standard information integration architectures, a dataspace management system is a data-coexistence approach: it does not require any investment in semantic integration before querying services on the data are provided. Rather, a dataspace can be gradually enhanced over time by defining relationships among the data. Defining those integration semantics gradually is termed pay-as-you-go information integration [17], as time and effort (pay) are needed over time (go) to provide integration semantics. The benefits are better query results (gain). This paper is the first to explore pay-as-you-go information integration in dataspaces. We provide a technique for declarative pay-as-you-go information integration named iTrails. The core idea of our approach is to declaratively add lightweight 'hints' (trails) to a search engine, thus allowing gradual enrichment of loosely integrated data sources. Our experiments confirm that iTrails can be implemented efficiently, introducing only little overhead during query execution. At the same time, iTrails strongly improves the quality of query results. Furthermore, we present rewriting and pruning techniques that allow us to scale iTrails to tens of thousands of trail definitions with minimal growth in the rewritten query size.
