From Web Directories to Ontologies: Natural Language Processing Challenges
I. Zaihrayeu, L. Sun, F. Giunchiglia, W. Pan, Q. Ju, M. Chi, and X. Huang. Proceedings of the 6th International Semantic Web Conference and 2nd Asian Semantic Web Conference (ISWC/ASWC2007), Busan, South Korea, volume 4825 of LNCS, pages 617--630. Berlin, Heidelberg, Springer Verlag (November 2007)
Abstract
Hierarchical classifications are used pervasively by humans as a means to organize their data and knowledge about the world. One of their main advantages is that the natural language labels used to describe their contents are easily understood by human users. At the same time, this is also one of their main disadvantages, as these same labels are ambiguous and very hard for software agents to reason about. This creates an insuperable hindrance to embedding classifications in the Semantic Web infrastructure. This paper presents an approach to converting classifications into lightweight ontologies, and it makes the following contributions: (i) it identifies the main NLP problems related to the conversion process and shows how they differ from the classical problems of NLP; (ii) it proposes heuristic solutions to these problems, which are especially effective in this domain; and (iii) it evaluates the proposed solutions by testing them on DMoz data.
%0 Conference Paper
%1 Zaihrayeu/2007/Directories
%A Zaihrayeu, Ilya
%A Sun, Lei
%A Giunchiglia, Fausto
%A Pan, Wei
%A Ju, Qi
%A Chi, Mingmin
%A Huang, Xuanjing
%B Proceedings of the 6th International Semantic Web Conference and 2nd Asian Semantic Web Conference (ISWC/ASWC2007), Busan, South Korea
%C Berlin, Heidelberg
%D 2007
%E Aberer, Karl
%E Choi, Key-Sun
%E Noy, Natasha
%E Allemang, Dean
%E Lee, Kyung-Il
%E Nixon, Lyndon J B
%E Golbeck, Jennifer
%E Mika, Peter
%E Maynard, Diana
%E Schreiber, Guus
%E Cudré-Mauroux, Philippe
%I Springer Verlag
%K 2007 challenge directory information_extraction iswc language natural natural_language_processing processing research_15 semantic_web web web_annotation
%P 617--630
%T From Web Directories to Ontologies: Natural Language Processing Challenges
%U http://iswc2007.semanticweb.org/papers/617.pdf
%V 4825
@inproceedings{Zaihrayeu/2007/Directories,
added-at = {2007-11-07T19:13:58.000+0100},
address = {Berlin, Heidelberg},
author = {Zaihrayeu, Ilya and Sun, Lei and Giunchiglia, Fausto and Pan, Wei and Ju, Qi and Chi, Mingmin and Huang, Xuanjing},
biburl = {https://www.bibsonomy.org/bibtex/26648edb72133b15ed33185e54e48e8c1/iswc2007},
booktitle = {Proceedings of the 6th International Semantic Web Conference and 2nd Asian Semantic Web Conference (ISWC/ASWC2007), Busan, South Korea},
crossref = {http://data.semanticweb.org/conference/iswc-aswc/2007/proceedings},
editor = {Aberer, Karl and Choi, Key-Sun and Noy, Natasha and Allemang, Dean and Lee, Kyung-Il and Nixon, Lyndon J B and Golbeck, Jennifer and Mika, Peter and Maynard, Diana and Schreiber, Guus and Cudré-Mauroux, Philippe},
interhash = {536ddd31fe90752d62794587ad07933a},
intrahash = {6648edb72133b15ed33185e54e48e8c1},
keywords = {2007 challenge directory information_extraction iswc language natural natural_language_processing processing research_15 semantic_web web web_annotation},
month = {November},
pages = {617--630},
publisher = {Springer Verlag},
series = {LNCS},
timestamp = {2007-11-07T19:20:54.000+0100},
title = {From Web Directories to Ontologies: Natural Language Processing Challenges},
url = {http://iswc2007.semanticweb.org/papers/617.pdf},
volume = 4825,
year = 2007
}