Bootstrapping the Linked Data Web

Abstract

Most knowledge sources on the Data Web were extracted from structured or semi-structured data. Thus, they cover only a small fraction of the information available on the document-oriented Web. In this paper, we present BOA, an iterative bootstrapping strategy for extracting RDF from unstructured data. The idea behind BOA is to use the Data Web as background knowledge for extracting natural language patterns that represent predicates found on the Data Web. These patterns are then used to extract instance knowledge from natural language text. Finally, this knowledge is fed back into the Data Web, thereby closing the loop. We evaluate our approach on two data sets using DBpedia as background knowledge. Our results show that we can extract several thousand new facts in one iteration with very high accuracy. Moreover, we provide the first repository of natural language representations of predicates found on the Data Web.
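The loop described in the abstract can be illustrated with a toy sketch. This is not the BOA implementation: the function names, the `ex:` identifiers, the sample sentences, and the capitalized-word heuristic for spotting entities are all assumptions made for illustration, and pattern scoring is reduced to a simple support count.

```python
import re
from collections import Counter

# Hypothetical entity matcher: one or more capitalized words.
# A real system would use proper entity recognition and linking.
ENTITY = r"([A-Z]\w*(?: [A-Z]\w*)*)"


def learn_patterns(triples, labels, sentences):
    """Step 1: use known triples as background knowledge.

    For each (subject, predicate, object) triple, look for sentences that
    mention both labels and record the text between them as a pattern
    for that predicate.
    """
    patterns = Counter()
    for subj, pred, obj in triples:
        regex = re.compile(
            re.escape(labels[subj]) + r"\s+(.+?)\s+" + re.escape(labels[obj])
        )
        for sentence in sentences:
            match = regex.search(sentence)
            if match:
                patterns[(pred, match.group(1))] += 1
    return patterns


def extract_facts(patterns, sentences, min_support=1):
    """Step 2: apply the learned patterns to text to propose new triples."""
    facts = set()
    for (pred, phrase), support in patterns.items():
        if support < min_support:
            continue  # skip patterns seen too rarely (assumed threshold)
        regex = re.compile(ENTITY + " " + re.escape(phrase) + " " + ENTITY)
        for sentence in sentences:
            for subj_label, obj_label in regex.findall(sentence):
                facts.add((subj_label, pred, obj_label))
    return facts


# Illustrative data only; identifiers and sentences are made up.
known = [("ex:Berlin", "ex:capitalOf", "ex:Germany")]
labels = {"ex:Berlin": "Berlin", "ex:Germany": "Germany"}
patterns = learn_patterns(known, labels, ["Berlin is the capital of Germany."])
new_facts = extract_facts(patterns, ["Paris is the capital of France."])
```

In a full bootstrapping setup, the triples in `new_facts` would be mapped back to resource identifiers and added to the background knowledge before the next iteration, which is the step the abstract calls closing the loop.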

BibTeX key: Gerber2011
entry type: inproceedings
booktitle: 1st Workshop on Web Scale Knowledge Extraction @ ISWC 2011
year: 2011
owner: gerb
