Benchmarking Fulltext Search Performance of RDF Stores
E. Minack, W. Siberski, and W. Nejdl. 6th Annual European Semantic Web Conference (ESWC2009), pages 81-95. (June 2009)
Abstract
More and more applications use the RDF framework as their data model and RDF stores to index and retrieve their data. Many of these applications require both structured queries as well as fulltext search. SPARQL addresses the first requirement in a standardized way, while fulltext search is provided by store-specific implementations. RDF benchmarks enable developers to compare structured query performance of different stores, but for fulltext search on RDF data no such benchmarks and comparisons exist so far. In this paper, we extend the LUBM benchmark with synthetic scalable fulltext data and corresponding queries for fulltext-related query performance evaluation. Based on the extended benchmark, we provide a detailed comparison of fulltext search features and performance of the most widely used RDF stores. Results show interesting RDF store insights for basic fulltext queries (classic IR queries) as well as hybrid queries (structured and fulltext queries). Our results are not only valuable for selecting the right RDF store for specific applications, but also reveal the need for performance improvements for certain kinds of queries.
%0 Conference Paper
%1 benchmarking2009
%A Minack, Enrico
%A Siberski, Wolf
%A Nejdl, Wolfgang
%B 6th Annual European Semantic Web Conference (ESWC2009)
%D 2009
%K Database_Management_System Optimization_(Computer_Science) Performance_Engineering Query SPARQL Scalability Search_Engine Semantic_Web Test Validation Visualization benchmark fulltext lubm rdf
%P 81-95
%T Benchmarking Fulltext Search Performance of RDF Stores
%U http://data.semanticweb.org/conference/eswc/2009/paper/225
%X More and more applications use the RDF framework as their data model and RDF stores to index and retrieve their data. Many of these applications require both structured queries as well as fulltext search. SPARQL addresses the first requirement in a standardized way, while fulltext search is provided by store-specific implementations. RDF benchmarks enable developers to compare structured query performance of different stores, but for fulltext search on RDF data no such benchmarks and comparisons exist so far. In this paper, we extend the LUBM benchmark with synthetic scalable fulltext data and corresponding queries for fulltext-related query performance evaluation. Based on the extended benchmark, we provide a detailed comparison of fulltext search features and performance of the most widely used RDF stores. Results show interesting RDF store insights for basic fulltext queries (classic IR queries) as well as hybrid queries (structured and fulltext queries). Our results are not only valuable for selecting the right RDF store for specific applications, but also reveal the need for performance improvements for certain kinds of queries.
@inproceedings{benchmarking2009,
abstract = {More and more applications use the RDF framework as their data model and RDF stores to index and retrieve their data. Many of these applications require both structured queries as well as fulltext search. SPARQL addresses the first requirement in a standardized way, while fulltext search is provided by store-specific implementations. RDF benchmarks enable developers to compare structured query performance of different stores, but for fulltext search on RDF data no such benchmarks and comparisons exist so far. In this paper, we extend the LUBM benchmark with synthetic scalable fulltext data and corresponding queries for fulltext-related query performance evaluation. Based on the extended benchmark, we provide a detailed comparison of fulltext search features and performance of the most widely used RDF stores. Results show interesting RDF store insights for basic fulltext queries (classic IR queries) as well as hybrid queries (structured and fulltext queries). Our results are not only valuable for selecting the right RDF store for specific applications, but also reveal the need for performance improvements for certain kinds of queries.},
added-at = {2009-05-29T11:44:15.000+0200},
author = {Minack, Enrico and Siberski, Wolf and Nejdl, Wolfgang},
biburl = {https://www.bibsonomy.org/bibtex/26975220dbe39ccf50ad45e2473c06000/eswc2009},
booktitle = {6th Annual European Semantic Web Conference (ESWC2009)},
interhash = {179e5aa9b9cb4fd22feddcc29f277326},
intrahash = {6975220dbe39ccf50ad45e2473c06000},
keywords = {Database_Management_System Optimization_(Computer_Science) Performance_Engineering Query SPARQL Scalability Search_Engine Semantic_Web Test Validation Visualization benchmark fulltext lubm rdf},
month = jun,
pages = {81--95},
timestamp = {2009-05-29T11:44:15.000+0200},
title = {Benchmarking Fulltext Search Performance of RDF Stores},
url = {http://data.semanticweb.org/conference/eswc/2009/paper/225},
year = 2009
}