
Bootstrapping pay-as-you-go data integration systems

A. Das Sarma, X. Dong, and A. Halevy. Proceedings of the 2008 ACM SIGMOD international conference on Management of data, pages 861-874. New York, NY, USA, ACM, 2008.
DOI: 10.1145/1376616.1376702

Abstract

Data integration systems offer a uniform interface to a set of data sources. Despite recent progress, setting up and maintaining a data integration application still requires significant upfront effort of creating a mediated schema and semantic mappings from the data sources to the mediated schema. Many application contexts involving multiple data sources (e.g., the web, personal information management, enterprise intranets) do not require full integration in order to provide useful services, motivating a pay-as-you-go approach to integration. With that approach, a system starts with very few (or inaccurate) semantic mappings and these mappings are improved over time as deemed necessary. This paper describes the first completely self-configuring data integration system. The goal of our work is to investigate how advanced of a starting point we can provide a pay-as-you-go system. Our system is based on the new concept of a probabilistic mediated schema that is automatically created from the data sources. We automatically create probabilistic schema mappings between the sources and the mediated schema. We describe experiments in multiple domains, including 50-800 data sources, and show that our system is able to produce high-quality answers with no human intervention.

Links and resources

BibTeX key: DasSarma:2008:BPD:1376616.1376702
entry type: inproceedings
address: New York, NY, USA
booktitle: Proceedings of the 2008 ACM SIGMOD international conference on Management of data
year: 2008
pages: 861--874
publisher: ACM
series: SIGMOD '08
location: Vancouver, Canada
acmid: 1376702
isbn: 978-1-60558-102-6
numpages: 14
DOI: 10.1145/1376616.1376702
url: http://doi.acm.org/10.1145/1376616.1376702

Cite this publication

@inproceedings{DasSarma:2008:BPD:1376616.1376702,
  abstract = {Data integration systems offer a uniform interface to a set of data sources. Despite recent progress, setting up and maintaining a data integration application still requires significant upfront effort of creating a mediated schema and semantic mappings from the data sources to the mediated schema. Many application contexts involving multiple data sources (e.g., the web, personal information management, enterprise intranets) do not require full integration in order to provide useful services, motivating a pay-as-you-go approach to integration. With that approach, a system starts with very few (or inaccurate) semantic mappings and these mappings are improved over time as deemed necessary. This paper describes the first completely self-configuring data integration system. The goal of our work is to investigate how advanced of a starting point we can provide a pay-as-you-go system. Our system is based on the new concept of a probabilistic mediated schema that is automatically created from the data sources. We automatically create probabilistic schema mappings between the sources and the mediated schema. We describe experiments in multiple domains, including 50-800 data sources, and show that our system is able to produce high-quality answers with no human intervention.},
  acmid = {1376702},
  added-at = {2012-10-26T11:15:27.000+0200},
  address = {New York, NY, USA},
  author = {Das Sarma, Anish and Dong, Xin and Halevy, Alon},
  biburl = {https://www.bibsonomy.org/bibtex/251b3a72887628fcf3a9bec7edac2d752/schmidt2},
  booktitle = {Proceedings of the 2008 ACM SIGMOD international conference on Management of data},
  description = {Bootstrapping pay-as-you-go data integration systems},
  doi = {10.1145/1376616.1376702},
  interhash = {40b40b177db7ba572ea65082e8b51e4e},
  intrahash = {51b3a72887628fcf3a9bec7edac2d752},
  isbn = {978-1-60558-102-6},
  keywords = {bootstrapping dataspaces pay-as-you-go toread},
  location = {Vancouver, Canada},
  numpages = {14},
  pages = {861--874},
  publisher = {ACM},
  series = {SIGMOD '08},
  timestamp = {2012-10-26T11:15:27.000+0200},
  title = {Bootstrapping pay-as-you-go data integration systems},
  url = {http://doi.acm.org/10.1145/1376616.1376702},
  year = 2008
}
