Adaptive information extraction from text by rule induction and generalisation
F. Ciravegna. Proceedings of the 17th international joint conference on Artificial intelligence - Volume 2, pages 1251--1256. San Francisco, CA, USA, Morgan Kaufmann Publishers Inc. (2001)
Abstract
(LP)² is a covering algorithm for adaptive Information Extraction from text (IE). It induces symbolic rules that insert SGML tags into texts by learning from examples found in a user-defined tagged corpus. Training is performed in two steps: initially a set of tagging rules is learned; then additional rules are induced to correct mistakes and imprecision in tagging. Induction is performed by bottom-up generalization of examples in the training corpus. Shallow knowledge about Natural Language Processing (NLP) is used in the generalization process. The algorithm has a considerable success story. From a scientific point of view, experiments report excellent results with respect to the current state of the art on two publicly available corpora. From an application point of view, a successful industrial IE tool has been based on (LP)². Real world applications have been developed and licenses have been released to external companies for building other applications. This paper presents (LP)², experimental results and applications, and discusses the role of shallow NLP in rule induction.
%0 Conference Paper
%1 Ciravegna:2001:AIE:1642194.1642261
%A Ciravegna, Fabio
%B Proceedings of the 17th international joint conference on Artificial intelligence - Volume 2
%C San Francisco, CA, USA
%D 2001
%I Morgan Kaufmann Publishers Inc.
%K extraction induction information learning pattern
%P 1251--1256
%T Adaptive information extraction from text by rule induction and generalisation
%U http://eprints.aktors.org/118/01/IJCAI01.pdf
%X (LP)² is a covering algorithm for adaptive Information Extraction from text (IE). It induces symbolic rules that insert SGML tags into texts by learning from examples found in a user-defined tagged corpus. Training is performed in two steps: initially a set of tagging rules is learned; then additional rules are induced to correct mistakes and imprecision in tagging. Induction is performed by bottom-up generalization of examples in the training corpus. Shallow knowledge about Natural Language Processing (NLP) is used in the generalization process. The algorithm has a considerable success story. From a scientific point of view, experiments report excellent results with respect to the current state of the art on two publicly available corpora. From an application point of view, a successful industrial IE tool has been based on (LP)². Real world applications have been developed and licenses have been released to external companies for building other applications. This paper presents (LP)², experimental results and applications, and discusses the role of shallow NLP in rule induction.
%@ 1-55860-812-5, 978-1-558-60812-2
@inproceedings{Ciravegna:2001:AIE:1642194.1642261,
abstract = {(LP)$^2$ is a covering algorithm for adaptive Information Extraction from text (IE). It induces symbolic rules that insert SGML tags into texts by learning from examples found in a user-defined tagged corpus. Training is performed in two steps: initially a set of tagging rules is learned; then additional rules are induced to correct mistakes and imprecision in tagging. Induction is performed by bottom-up generalization of examples in the training corpus. Shallow knowledge about Natural Language Processing (NLP) is used in the generalization process. The algorithm has a considerable success story. From a scientific point of view, experiments report excellent results with respect to the current state of the art on two publicly available corpora. From an application point of view, a successful industrial IE tool has been based on (LP)$^2$. Real world applications have been developed and licenses have been released to external companies for building other applications. This paper presents (LP)$^2$, experimental results and applications, and discusses the role of shallow NLP in rule induction.},
acmid = {1642261},
added-at = {2012-10-11T17:00:22.000+0200},
address = {San Francisco, CA, USA},
author = {Ciravegna, Fabio},
biburl = {https://www.bibsonomy.org/bibtex/25eb346593c6330ee742947824e75e710/jil},
booktitle = {Proceedings of the 17th international joint conference on Artificial intelligence - Volume 2},
description = {Adaptive information extraction from text by rule induction and generalisation},
interhash = {8e97c7bdb4db3c8144c32849b12b9714},
intrahash = {5eb346593c6330ee742947824e75e710},
isbn = {1-55860-812-5, 978-1-558-60812-2},
keywords = {extraction induction information learning pattern},
location = {Seattle, WA, USA},
numpages = {6},
pages = {1251--1256},
publisher = {Morgan Kaufmann Publishers Inc.},
series = {IJCAI'01},
timestamp = {2013-11-23T20:11:51.000+0100},
title = {Adaptive information extraction from text by rule induction and generalisation},
url = {http://eprints.aktors.org/118/01/IJCAI01.pdf},
year = {2001}
}