S. Kunze and S. Auer. 7th IEEE International Conference on Semantic Computing, September 16-18, 2013, Irvine, California, USA (2013)
Abstract
Recently, a large number of dataset repositories, catalogs and portals
are emerging in the science and government realms. Once a large number
of datasets are published on such data portals, the question arises
how to retrieve datasets satisfying a user's information need. In
this article, we present an approach for retrieving datasets according
to user queries. We define dataset retrieval as a specialization
of information retrieval. Instead of retrieving documents that are
relevant to a certain information need, dataset retrieval describes
the process of returning relevant RDF datasets. As with information
retrieval, the term relevance cannot be clearly defined when
using traditional methods like stemming. The inherent usage of RDF
in RDF datasets enables a better way of retrieving relevant ones.
We therefore propose an additional retrieval mechanism, which is
inspired by facet search: dataset filtering. When querying,
the entire set of available datasets is processed by a set of semantic
filters each of which can unambiguously decide whether or not a
given dataset is relevant to the query. The resulting set is then
given back to the requester. We implemented and evaluated our approach
in CKAN, which fuels publicdata.eu and is the most popular data portal
worldwide.
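
The filtering step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' CKAN implementation: the names `Dataset`, `has_statement` and `filter_datasets`, the sample catalog, and the choice to combine filters with a logical AND are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Set, Tuple

# A triple is (subject, predicate, object), written here as plain strings.
Triple = Tuple[str, str, str]


@dataclass
class Dataset:
    """Hypothetical stand-in for an RDF dataset description."""
    name: str
    triples: Set[Triple] = field(default_factory=set)


# A semantic filter gives a yes/no answer for a single dataset.
SemanticFilter = Callable[[Dataset], bool]


def has_statement(predicate: str, obj: str) -> SemanticFilter:
    """Build a filter that accepts datasets containing (?, predicate, obj)."""
    def check(dataset: Dataset) -> bool:
        return any(p == predicate and o == obj for _, p, o in dataset.triples)
    return check


def filter_datasets(datasets: Iterable[Dataset],
                    filters: Iterable[SemanticFilter]) -> List[Dataset]:
    """Keep the datasets that every filter accepts (AND is an assumption)."""
    active = list(filters)
    return [d for d in datasets if all(f(d) for f in active)]


# Invented sample catalog, for illustration only.
catalog = [
    Dataset("berlin-air-quality", {
        ("ex:ds1", "dcterms:spatial", "ex:Berlin"),
        ("ex:ds1", "dcat:theme", "ex:Environment"),
    }),
    Dataset("paris-budget", {
        ("ex:ds2", "dcterms:spatial", "ex:Paris"),
        ("ex:ds2", "dcat:theme", "ex:Finance"),
    }),
    Dataset("berlin-budget", {
        ("ex:ds3", "dcterms:spatial", "ex:Berlin"),
        ("ex:ds3", "dcat:theme", "ex:Finance"),
    }),
]

query_filters = [
    has_statement("dcterms:spatial", "ex:Berlin"),
    has_statement("dcat:theme", "ex:Environment"),
]

result = filter_datasets(catalog, query_filters)
print([d.name for d in result])
```

Because each filter returns a plain boolean, a dataset is either in the result set or not; no ranking is involved, which is what distinguishes this step from score-based text retrieval.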
%0 Conference Paper
%1 Kunze2013
%A Kunze, Sven R.
%A Auer, Sören
%B 7th IEEE International Conference on Semantic Computing, September 16-18, 2013, Irvine, California, USA
%D 2013
%K 2013 auer event_ICSC group_aksw kunze lod2page
%T Dataset Retrieval
%U http://svn.aksw.org/papers/2013/ICSC_Dataset_retrieval/public.pdf
@inproceedings{Kunze2013,
abstract = {Recently, a large number of dataset repositories, catalogs and portals
are emerging in the science and government realms. Once a large number
of datasets are published on such data portals, the question arises
how to retrieve datasets satisfying a user's information need. In
this article, we present an approach for retrieving datasets according
to user queries. We define \emph{dataset retrieval} as a specialization
of information retrieval. Instead of retrieving documents that are
relevant to a certain information need, dataset retrieval describes
the process of returning relevant RDF datasets. As with information
retrieval, the term \emph{relevance} cannot be clearly defined when
using traditional methods like stemming. The inherent usage of RDF
in RDF datasets enables a better way of retrieving relevant ones.
We therefore propose an additional retrieval mechanism, which is
inspired by facet search: \emph{dataset filtering}. When querying,
the entire set of available datasets is processed by a set of \emph{semantic
filters} each of which can unambiguously decide whether or not a
given dataset is relevant to the query. The resulting set is then
given back to the requester. We implemented and evaluated our approach
in CKAN, which fuels publicdata.eu and is the most popular data portal
worldwide.},
added-at = {2017-01-27T23:28:47.000+0100},
author = {Kunze, Sven R. and Auer, S{\"o}ren},
biburl = {https://www.bibsonomy.org/bibtex/2192d7bbe498cd8f8247a633020dcf24e/soeren},
booktitle = {7th IEEE International Conference on Semantic Computing, September 16-18, 2013, Irvine, California, USA},
interhash = {6b82d5ab62abb8bee1cd8123bf036377},
intrahash = {192d7bbe498cd8f8247a633020dcf24e},
keywords = {2013 auer event_ICSC group_aksw kunze lod2page},
owner = {soeren},
timestamp = {2017-01-27T23:30:12.000+0100},
title = {Dataset Retrieval},
url = {http://svn.aksw.org/papers/2013/ICSC_Dataset_retrieval/public.pdf},
year = 2013
}