Distinguishing between Instances and Classes in the Wikipedia Taxonomy
C. Zirn, V. Nastase, and M. Strube. Proceedings of the 5th European Semantic Web Conference, Berlin, Heidelberg, Springer Verlag (June 2008)
Abstract
This paper presents an automatic method for differentiating between instances and classes in a large scale taxonomy induced from the Wikipedia category network. The method exploits characteristics of the category names and the structure of the network. The approach we present is the first attempt to make this distinction automatically in a large scale resource. In contrast, WordNet and Cyc rely on manual annotations. The result of the process is evaluated against ResearchCyc. On the subnetwork shared by our taxonomy and ResearchCyc we report 84.52% accuracy.
%0 Conference Paper
%1 zirn2008distinguishing
%A Zirn, Caecilia
%A Nastase, Vivi
%A Strube, Michael
%B Proceedings of the 5th European Semantic Web Conference
%C Berlin, Heidelberg
%D 2008
%E Hauswirth, Manfred
%E Koubarakis, Manolis
%E Bechhofer, Sean
%I Springer Verlag
%K instances taxonomy classes ontologies-and-natural-language
%T Distinguishing between Instances and Classes in the Wikipedia Taxonomy
%U http://data.semanticweb.org/conference/eswc/2008/papers/36
%X This paper presents an automatic method for differentiating between instances and classes in a large scale taxonomy induced from the Wikipedia category network. The method exploits characteristics of the category names and the structure of the network. The approach we present is the first attempt to make this distinction automatically in a large scale resource. In contrast, WordNet and Cyc rely on manual annotations. The result of the process is evaluated against ResearchCyc. On the subnetwork shared by our taxonomy and ResearchCyc we report 84.52% accuracy.
@inproceedings{zirn2008distinguishing,
abstract = {This paper presents an automatic method for differentiating between instances and classes in a large scale taxonomy induced from the Wikipedia category network. The method exploits characteristics of the category names and the structure of the network. The approach we present is the first attempt to make this distinction automatically in a large scale resource. In contrast, WordNet and Cyc rely on manual annotations. The result of the process is evaluated against ResearchCyc. On the subnetwork shared by our taxonomy and ResearchCyc we report 84.52\% accuracy.},
added-at = {2008-05-28T14:49:52.000+0200},
address = {Berlin, Heidelberg},
author = {Zirn, Caecilia and Nastase, Vivi and Strube, Michael},
biburl = {https://www.bibsonomy.org/bibtex/27c52baaf3f2528ccd15e211eb6832186/eswc2008},
booktitle = {Proceedings of the 5th European Semantic Web Conference},
editor = {Hauswirth, Manfred and Koubarakis, Manolis and Bechhofer, Sean},
interhash = {acf1d6b329e85fd4700c2dec3b552343},
intrahash = {7c52baaf3f2528ccd15e211eb6832186},
keywords = {instances taxonomy classes ontologies-and-natural-language},
month = jun,
publisher = {Springer Verlag},
series = {LNCS},
timestamp = {2008-05-28T14:49:52.000+0200},
title = {Distinguishing between Instances and Classes in the Wikipedia Taxonomy},
url = {http://data.semanticweb.org/conference/eswc/2008/papers/36},
year = 2008
}