Automatic Gazetteer Generation from Wikipedia

A. Bosca and L. Dini.
Advanced Language Technologies for Digital Libraries, volume 6699 of Lecture Notes in Computer Science, Springer Berlin Heidelberg (2011)
DOI: 10.1007/978-3-642-23160-5_5

Abstract

The presence of a high-quality Named Entity gazetteer within a cross-language information retrieval (CLIR) system is crucial for providing multilingual access to digital resources, particularly in the domain of Digital Libraries. In this paper we investigate an unsupervised approach for automatically extracting this kind of resource from Wikipedia: it leverages the DBpedia classification of the English articles to induce the same classification onto encyclopedia pages written in other languages. By exploiting the structured information present in Wikipedia, we furthermore aim to enrich our standard gazetteer with translations into other languages as well as with alternative spellings of the entities.
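
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that the class of an English article is carried over to its counterparts through interlanguage links and that alternative spellings come from redirect titles, neither of which the abstract states explicitly. The function name `induce_gazetteer`, its parameters, and the toy data are all hypothetical.

```python
from collections import defaultdict


def induce_gazetteer(english_classes, language_links, redirects):
    """Build a multilingual gazetteer keyed by entity class.

    english_classes: {english_title: class_label}, e.g. taken from DBpedia
                     (assumed input format, hypothetical).
    language_links:  {english_title: {lang_code: foreign_title}}
                     (assumed to come from interlanguage links).
    redirects:       {english_title: [alternative_spelling, ...]}
                     (assumed to come from redirect pages).
    """
    gazetteer = defaultdict(dict)
    for en_title, entity_class in english_classes.items():
        entry = gazetteer[entity_class].setdefault(
            en_title, {"translations": {}, "variants": []}
        )
        # Pages in other languages inherit the class of the English article.
        entry["translations"].update(language_links.get(en_title, {}))
        # Redirect titles are treated as alternative spellings.
        entry["variants"].extend(redirects.get(en_title, []))
    return dict(gazetteer)


# Toy data for illustration only.
example = induce_gazetteer(
    english_classes={"Florence": "Place", "Dante Alighieri": "Person"},
    language_links={"Florence": {"it": "Firenze", "de": "Florenz"}},
    redirects={"Dante Alighieri": ["Dante"]},
)
```

In this sketch an entity without interlanguage links or redirects still receives an entry, with empty translations and variants, so the English gazetteer stays complete even where the multilingual enrichment is missing.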

BibTeX key: bosca2011automatic
entry type: incollection
booktitle: Advanced Language Technologies for Digital Libraries
year: 2011
pages: 61-71
publisher: Springer Berlin Heidelberg
series: Lecture Notes in Computer Science
volume: 6699
isbn: 978-3-642-23159-9
DOI: 10.1007/978-3-642-23160-5_5
url: http://dx.doi.org/10.1007/978-3-642-23160-5_5
