PORE: Positive-Only Relation Extraction from Wikipedia Text
G. Wang, Y. Yu, and H. Zhu. Proceedings of the 6th International Semantic Web Conference and 2nd Asian Semantic Web Conference (ISWC/ASWC2007), Busan, South Korea, volume 4825 of LNCS, pages 575--588. Berlin, Heidelberg, Springer Verlag (November 2007)
Abstract
Extracting semantic relations is of great importance for the creation of the Semantic Web content. It is of great benefit to semi-automatically extract relations from the free text of Wikipedia using the structured content readily available in it. Pattern matching methods that employ information redundancy cannot work well since there is not much redundancy information in Wikipedia, compared to the Web. Multi-class classification methods are not reasonable since no classification of relation types is available in Wikipedia. In this paper, we propose PORE (Positive-Only Relation Extraction), for relation extraction from Wikipedia text. The core algorithm B-POL extends a state-of-the-art positive-only learning algorithm using bootstrapping, strong negative identification, and transductive inference to work with fewer positive training examples. We conducted experiments on several relations with different amount of training data. The experimental results show that B-POL can work effectively given only a small amount of positive training examples and it significantly outperforms the original positive learning approaches and a multi-class SVM. Furthermore, although PORE is applied in the context of Wikipedia, the core algorithm B-POL is a general approach for Ontology Population and can be adapted to other domains.
%0 Conference Paper
%1 Wang/2007/PORE:
%A Wang, Gang
%A Yu, Yong
%A Zhu, Haiping
%B Proceedings of the 6th International Semantic Web Conference and 2nd Asian Semantic Web Conference (ISWC/ASWC2007), Busan, South Korea
%C Berlin, Heidelberg
%D 2007
%E Aberer, Karl
%E Choi, Key-Sun
%E Noy, Natasha
%E Allemang, Dean
%E Lee, Kyung-Il
%E Nixon, Lyndon J. B.
%E Golbeck, Jennifer
%E Mika, Peter
%E Maynard, Diana
%E Schreiber, Guus
%E Cudré-Mauroux, Philippe
%I Springer Verlag
%K annotation iswc knowledge-extraction nlp semantic-web text-mining wikipedia
%P 575--588
%T PORE: Positive-Only Relation Extraction from Wikipedia Text
%U http://iswc2007.semanticweb.org/papers/575.pdf
%V 4825
%X Extracting semantic relations is of great importance for the creation of the Semantic Web content. It is of great benefit to semi-automatically extract relations from the free text of Wikipedia using the structured content readily available in it. Pattern matching methods that employ information redundancy cannot work well since there is not much redundancy information in Wikipedia, compared to the Web. Multi-class classification methods are not reasonable since no classification of relation types is available in Wikipedia. In this paper, we propose PORE (Positive-Only Relation Extraction), for relation extraction from Wikipedia text. The core algorithm B-POL extends a state-of-the-art positive-only learning algorithm using bootstrapping, strong negative identification, and transductive inference to work with fewer positive training examples. We conducted experiments on several relations with different amount of training data. The experimental results show that B-POL can work effectively given only a small amount of positive training examples and it significantly outperforms the original positive learning approaches and a multi-class SVM. Furthermore, although PORE is applied in the context of Wikipedia, the core algorithm B-POL is a general approach for Ontology Population and can be adapted to other domains.
@inproceedings{Wang/2007/PORE:,
abstract = {Extracting semantic relations is of great importance for the creation of the Semantic Web content. It is of great benefit to semi-automatically extract relations from the free text of Wikipedia using the structured content readily available in it. Pattern matching methods that employ information redundancy cannot work well since there is not much redundancy information in Wikipedia, compared to the Web. Multi-class classification methods are not reasonable since no classification of relation types is available in Wikipedia. In this paper, we propose PORE (Positive-Only Relation Extraction), for relation extraction from Wikipedia text. The core algorithm B-POL extends a state-of-the-art positive-only learning algorithm using bootstrapping, strong negative identification, and transductive inference to work with fewer positive training examples. We conducted experiments on several relations with different amount of training data. The experimental results show that B-POL can work effectively given only a small amount of positive training examples and it significantly outperforms the original positive learning approaches and a multi-class SVM. Furthermore, although PORE is applied in the context of Wikipedia, the core algorithm B-POL is a general approach for Ontology Population and can be adapted to other domains.},
added-at = {2008-02-10T02:19:38.000+0100},
address = {Berlin, Heidelberg},
author = {Wang, Gang and Yu, Yong and Zhu, Haiping},
biburl = {https://www.bibsonomy.org/bibtex/23f50c3301e467186596b2ae8a43efa4c/brightbyte},
booktitle = {Proceedings of the 6th International Semantic Web Conference and 2nd Asian Semantic Web Conference (ISWC/ASWC2007), Busan, South Korea},
citeulike-article-id = {2162726},
editor = {Aberer, Karl and Choi, Key-Sun and Noy, Natasha and Allemang, Dean and Lee, Kyung-Il and Nixon, Lyndon J. B. and Golbeck, Jennifer and Mika, Peter and Maynard, Diana and Schreiber, Guus and Cudr{\'e}-Mauroux, Philippe},
interhash = {bd717d8b88d8cecd1de6a1adab5598bb},
intrahash = {3f50c3301e467186596b2ae8a43efa4c},
keywords = {annotation iswc knowledge-extraction nlp semantic-web text-mining wikipedia},
month = nov,
pages = {575--588},
priority = {2},
publisher = {Springer Verlag},
series = {LNCS},
timestamp = {2009-01-23T09:58:50.000+0100},
title = {PORE: Positive-Only Relation Extraction from Wikipedia Text},
url = {http://iswc2007.semanticweb.org/papers/575.pdf},
volume = 4825,
year = 2007
}