A novel approach to automatic gazetteer generation using Wikipedia
Z. Zhang and J. Iria. Proceedings of the 2009 Workshop on The People's Web Meets NLP: Collaboratively Constructed Semantic Resources, pages 1--9. Stroudsburg, PA, USA, Association for Computational Linguistics (2009)
Abstract
Gazetteers or entity dictionaries are important knowledge resources for solving a wide range of NLP problems, such as entity extraction. We introduce a novel method to automatically generate gazetteers from seed lists using an external knowledge resource, the Wikipedia. Unlike previous methods, our method exploits the rich content and various structural elements of Wikipedia, and does not rely on language- or domain-specific knowledge. Furthermore, applying the extended gazetteers to an entity extraction task in a scientific domain, we empirically observed a significant improvement in system accuracy when compared with those using seed gazetteers.
%0 Conference Paper
%1 Zhang:2009:NAA:1699765.1699766
%A Zhang, Ziqi
%A Iria, José
%B Proceedings of the 2009 Workshop on The People's Web Meets NLP: Collaboratively Constructed Semantic Resources
%C Stroudsburg, PA, USA
%D 2009
%I Association for Computational Linguistics
%K gazetteer listgrowing setcompletion
%P 1--9
%T A novel approach to automatic gazetteer generation using Wikipedia
%U http://dl.acm.org/citation.cfm?id=1699765.1699766
%X Gazetteers or entity dictionaries are important knowledge resources for solving a wide range of NLP problems, such as entity extraction. We introduce a novel method to automatically generate gazetteers from seed lists using an external knowledge resource, the Wikipedia. Unlike previous methods, our method exploits the rich content and various structural elements of Wikipedia, and does not rely on language- or domain-specific knowledge. Furthermore, applying the extended gazetteers to an entity extraction task in a scientific domain, we empirically observed a significant improvement in system accuracy when compared with those using seed gazetteers.
%@ 978-1-932432-55-8
@inproceedings{Zhang:2009:NAA:1699765.1699766,
abstract = {Gazetteers or entity dictionaries are important knowledge resources for solving a wide range of NLP problems, such as entity extraction. We introduce a novel method to automatically generate gazetteers from seed lists using an external knowledge resource, the Wikipedia. Unlike previous methods, our method exploits the rich content and various structural elements of Wikipedia, and does not rely on language- or domain-specific knowledge. Furthermore, applying the extended gazetteers to an entity extraction task in a scientific domain, we empirically observed a significant improvement in system accuracy when compared with those using seed gazetteers.},
acmid = {1699766},
added-at = {2013-10-24T12:22:09.000+0200},
address = {Stroudsburg, PA, USA},
author = {Zhang, Ziqi and Iria, Jos{\'e}},
biburl = {https://www.bibsonomy.org/bibtex/2592761f9a27b029b16d02012647c4b8f/asmelash},
booktitle = {Proceedings of the 2009 Workshop on The People's Web Meets NLP: Collaboratively Constructed Semantic Resources},
description = {A novel approach to automatic gazetteer generation using Wikipedia},
interhash = {724767b3f1eb3a3439a0f941490d2ab6},
intrahash = {592761f9a27b029b16d02012647c4b8f},
isbn = {978-1-932432-55-8},
keywords = {gazetteer listgrowing setcompletion},
location = {Suntec, Singapore},
numpages = {9},
pages = {1--9},
publisher = {Association for Computational Linguistics},
series = {People's Web '09},
timestamp = {2013-10-24T12:22:09.000+0200},
title = {A novel approach to automatic gazetteer generation using Wikipedia},
url = {http://dl.acm.org/citation.cfm?id=1699765.1699766},
year = {2009}
}