Incollection,

Automatic Gazetteer Generation from Wikipedia

Advanced Language Technologies for Digital Libraries, volume 6699 of Lecture Notes in Computer Science, Springer Berlin Heidelberg, (2011)
DOI: 10.1007/978-3-642-23160-5_5

Abstract

The presence of a high-quality Named Entity gazetteer within a cross-language information retrieval (CLIR) system is crucial for providing multilingual access to digital resources, particularly in the domain of Digital Libraries. In this paper we investigate an unsupervised approach for automatically extracting this kind of resource from Wikipedia: it leverages the DBpedia classification of English articles to induce the same classification onto encyclopedia pages written in other languages. By exploiting the structured information present in Wikipedia, we furthermore aim to enrich our standard gazetteer with translations into other languages as well as with alternative spellings of the entities.
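
The abstract does not spell out the extraction procedure, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that class labels are carried from English articles to other languages through interlanguage links, and that alternative spellings come from redirect titles; both mechanisms, and every name and data value below (`dbpedia_types`, `langlinks`, `redirects`, `build_gazetteer`), are hypothetical illustrations.

```python
from collections import defaultdict

# Hypothetical toy data. A real system would read these from DBpedia
# and Wikipedia dumps; none of the values come from the paper.
dbpedia_types = {            # English article title -> entity class
    "Rome": "Place",
    "Dante Alighieri": "Person",
}
langlinks = {                # English title -> {language code: title}
    "Rome": {"it": "Roma", "de": "Rom"},
    "Dante Alighieri": {"it": "Dante Alighieri"},
}
redirects = {                # (language code, title) -> alternative spellings
    ("en", "Rome"): ["City of Rome"],
    ("it", "Roma"): ["Urbe"],
}


def build_gazetteer(types, links, alt_names):
    """Project English class labels onto linked pages in other languages.

    Returns a mapping: class -> language code -> set of surface forms.
    """
    gazetteer = defaultdict(lambda: defaultdict(set))
    for en_title, ne_class in types.items():
        # The English title plus every linked title inherits the class.
        titles = {"en": en_title, **links.get(en_title, {})}
        for lang, title in titles.items():
            forms = gazetteer[ne_class][lang]
            forms.add(title)
            # Assumed source of alternative spellings: redirect titles.
            forms.update(alt_names.get((lang, title), []))
    return gazetteer


gazetteer = build_gazetteer(dbpedia_types, langlinks, redirects)
```

Keying the redirects by language and title keeps identically spelled titles in different languages from sharing alternative spellings.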
