Inbook,

Symbolic and Neural Learning of Named-Entity Recognition and Classification Systems in Two Languages

G. Petasis, S. Petridis, G. Paliouras, V. Karkaletsis, S. Perantonis, and C. Spyropoulos.
volume 18 of International Series in Intelligent Technologies, pages 193--210. Springer Berlin / Heidelberg, (January 2002). http://www.springer.com/mathematics/book/978-0-7923-7645-3

Abstract

This paper compares two alternative approaches to the problem of acquiring named-entity recognition and classification systems from training corpora, in two different languages. The process of named-entity recognition and classification is an important subtask in most language engineering applications, in particular information extraction, where different types of named entity are associated with specific roles in events. The manual construction of rules for the recognition of named entities is a tedious and time-consuming task. For this reason, effective methods to acquire such systems automatically from data are very desirable. In this paper we compare two popular learning methods on this task: a decision-tree induction method and a multi-layered feed-forward neural network. Particular emphasis is paid on the selection of the appropriate data representation for each method and the extraction of training examples from unstructured textual data. We compare the performance of the two methods on large corpora of English and Greek texts and present the results. In addition to the good performance of both methods, one very interesting result is the fact that a simple representation of the data, which ignores the order of the words within a named entity, leads to improved results over a more complex approach that preserves word order.

BibTeX key: Petasis:2002:SNL:647292.722672
entry type: inbook
booktitle: Advances in Computational Intelligence and Learning: Methods and Applications
year: 2002
month: January
pages: 193--210
publisher: Springer Berlin / Heidelberg
series: International Series in Intelligent Technologies
volume: 18
isbn: 978-0-7923-7645-3
Document: http://www.ellogon.org/petasis/bibliography/COIL2000/COILBook2001.pdf
note: http://www.springer.com/mathematics/book/978-0-7923-7645-3

Cite this publication

%0 Book Section
%1 Petasis:2002:SNL:647292.722672
%A Petasis, Georgios
%A Petridis, Sergios
%A Paliouras, Georgios
%A Karkaletsis, Vangelis
%A Perantonis, Stavros J.
%A Spyropoulos, Constantine D.
%B Advances in Computational Intelligence and Learning: Methods and Applications
%D 2002
%E Zimmermann, Hans-Jürgen
%E Tselentis, Georgios
%E van Someren, Maarsten
%E Dounias, Georgios
%I Springer Berlin / Heidelberg
%K entity induction, named networks neural recognition, tree
%P 193--210
%T Symbolic and Neural Learning of Named-Entity Recognition and Classification Systems in Two Languages
%U http://www.ellogon.org/petasis/bibliography/COIL2000/COILBook2001.pdf
%V 18
%X This paper compares two alternative approaches to the problem of acquiring named-entity recognition and classification systems from training corpora, in two different languages. The process of named-entity recognition and classification is an important subtask in most language engineering applications, in particular information extraction, where different types of named entity are associated with specific roles in events. The manual construction of rules for the recognition of named entities is a tedious and time-consuming task. For this reason, effective methods to acquire such systems automatically from data are very desirable. In this paper we compare two popular learning methods on this task: a decision-tree induction method and a multi-layered feed-forward neural network. Particular emphasis is paid on the selection of the appropriate data representation for each method and the extraction of training examples from unstructured textual data. We compare the performance of the two methods on large corpora of English and Greek texts and present the results. In addition to the good performance of both methods, one very interesting result is the fact that a simple representation of the data, which ignores the order of the words within a named entity, leads to improved results over a more complex approach that preserves word order.
%@ 978-0-7923-7645-3
