Executing SPARQL Queries over the Web of Linked Data
O. Hartig, C. Bizer, and J. Freytag. The Semantic Web -- ISWC 2009: 8th International Semantic Web Conference, Chantilly, VA, USA, volume 5823 of Lecture Notes in Computer Science, pages 293--309. Springer, Berlin (2009)
DOI: 10.1007/978-3-642-04930-9_19
Abstract
The Web of Linked Data forms a single, globally distributed dataspace. Due to the openness of this dataspace, it is not possible to know in advance all data sources that might be relevant for query answering. This openness poses a new challenge that is not addressed by traditional research on federated query processing. In this paper we present an approach to execute SPARQL queries over the Web of Linked Data. The main idea of our approach is to discover data that might be relevant for answering a query during the query execution itself. This discovery is driven by following RDF links between data sources based on URIs in the query and in partial results. The URIs are resolved over the HTTP protocol into RDF data which is continuously added to the queried dataset. This paper describes concepts and algorithms to implement our approach using an iterator-based pipeline. We introduce a formalization of the pipelining approach and show that classical iterators may cause blocking due to the latency of HTTP requests. To avoid blocking, we propose an extension of the iterator paradigm. The evaluation of our approach shows its strengths as well as the still existing challenges.
%0 Book Section
%1 HartigBizerFreytag09ISWC
%A Hartig, Olaf
%A Bizer, Christian
%A Freytag, Johann-Christoph
%B The Semantic Web -- ISWC 2009: 8th International Semantic Web Conference, Chantilly, VA, USA
%C Berlin
%D 2009
%E Bernstein, Abraham
%E Karger, David R.
%E Heath, Tom
%E Feigenbaum, Lee
%E Maynard, Diana
%E Motta, Enrico
%E Thirunarayan, Krishnaprasad
%I Springer
%K 01624 springer paper ai semantic web rdf retrieval zzz.sw
%P 293--309
%R 10.1007/978-3-642-04930-9_19
%T Executing SPARQL Queries over the Web of Linked Data
%V 5823
%X The Web of Linked Data forms a single, globally distributed dataspace. Due to the openness of this dataspace, it is not possible to know in advance all data sources that might be relevant for query answering. This openness poses a new challenge that is not addressed by traditional research on federated query processing. In this paper we present an approach to execute SPARQL queries over the Web of Linked Data. The main idea of our approach is to discover data that might be relevant for answering a query during the query execution itself. This discovery is driven by following RDF links between data sources based on URIs in the query and in partial results. The URIs are resolved over the HTTP protocol into RDF data which is continuously added to the queried dataset. This paper describes concepts and algorithms to implement our approach using an iterator-based pipeline. We introduce a formalization of the pipelining approach and show that classical iterators may cause blocking due to the latency of HTTP requests. To avoid blocking, we propose an extension of the iterator paradigm. The evaluation of our approach shows its strengths as well as the still existing challenges.
@incollection{HartigBizerFreytag09ISWC,
abstract = {The Web of Linked Data forms a single, globally distributed dataspace. Due to the openness of this dataspace, it is not possible to know in advance all data sources that might be relevant for query answering. This openness poses a new challenge that is not addressed by traditional research on federated query processing. In this paper we present an approach to execute SPARQL queries over the Web of Linked Data. The main idea of our approach is to discover data that might be relevant for answering a query during the query execution itself. This discovery is driven by following RDF links between data sources based on URIs in the query and in partial results. The URIs are resolved over the HTTP protocol into RDF data which is continuously added to the queried dataset. This paper describes concepts and algorithms to implement our approach using an iterator-based pipeline. We introduce a formalization of the pipelining approach and show that classical iterators may cause blocking due to the latency of HTTP requests. To avoid blocking, we propose an extension of the iterator paradigm. The evaluation of our approach shows its strengths as well as the still existing challenges.},
added-at = {2017-03-26T15:07:47.000+0200},
address = {Berlin},
author = {Hartig, Olaf and Bizer, Christian and Freytag, Johann-Christoph},
biburl = {https://www.bibsonomy.org/bibtex/20a60d352088022b356aabc75c8071e7d/flint63},
booktitle = {The Semantic Web -- ISWC 2009: 8th International Semantic Web Conference, Chantilly, VA, USA},
crossref = {ISWC2009},
doi = {10.1007/978-3-642-04930-9_19},
editor = {Bernstein, Abraham and Karger, David R. and Heath, Tom and Feigenbaum, Lee and Maynard, Diana and Motta, Enrico and Thirunarayan, Krishnaprasad},
file = {SpringerLink:2009/HartigBizerFreytag09ISWC.pdf:PDF},
groups = {public},
interhash = {b2e433e89dbb9ea7721911c75c2622ec},
intrahash = {0a60d352088022b356aabc75c8071e7d},
keywords = {01624 springer paper ai semantic web rdf retrieval zzz.sw},
pages = {293--309},
publisher = {Springer},
series = {Lecture Notes in Computer Science},
timestamp = {2017-07-13T17:38:45.000+0200},
title = {Executing {SPARQL} Queries over the Web of Linked Data},
username = {flint63},
volume = 5823,
year = 2009
}