Automatic selection of high quality parses created by a fully unsupervised parser
R. Reichart and A. Rappoport. Proceedings of the Thirteenth Conference on Computational Natural Language Learning, pages 156--164. Stroudsburg, PA, USA, Association for Computational Linguistics, (2009)
Abstract
The average results obtained by unsupervised statistical parsers have greatly improved in the last few years, but on many specific sentences they are of rather low quality. The output of such parsers is becoming valuable for various applications, and it is radically less expensive to create than manually annotated training data. Hence, automatic selection of high quality parses created by unsupervised parsers is an important problem.
In this paper we present PUPA, a POS-based Unsupervised Parse Assessment algorithm. The algorithm assesses the quality of a parse tree using POS sequence statistics collected from a batch of parsed sentences. We evaluate the algorithm by using an unsupervised POS tagger and an unsupervised parser, selecting high quality parsed sentences from English (WSJ) and German (NEGRA) corpora. We show that PUPA outperforms the leading previous parse assessment algorithm for supervised parsers, as well as a strong unsupervised baseline. Consequently, PUPA allows obtaining high quality parses without any human involvement.
Description
Automatic selection of high quality parses created by a fully unsupervised parser
%0 Conference Paper
%1 Reichart:2009:ASH:1596374.1596400
%A Reichart, Roi
%A Rappoport, Ari
%B Proceedings of the Thirteenth Conference on Computational Natural Language Learning
%C Stroudsburg, PA, USA
%D 2009
%I Association for Computational Linguistics
%K algorithm evaluation of parsing part pos pupa speech unsupervised
%P 156--164
%T Automatic selection of high quality parses created by a fully unsupervised parser
%U http://dl.acm.org/citation.cfm?id=1596374.1596400
%X The average results obtained by unsupervised statistical parsers have greatly improved in the last few years, but on many specific sentences they are of rather low quality. The output of such parsers is becoming valuable for various applications, and it is radically less expensive to create than manually annotated training data. Hence, automatic selection of high quality parses created by unsupervised parsers is an important problem. In this paper we present PUPA, a POS-based Unsupervised Parse Assessment algorithm. The algorithm assesses the quality of a parse tree using POS sequence statistics collected from a batch of parsed sentences. We evaluate the algorithm by using an unsupervised POS tagger and an unsupervised parser, selecting high quality parsed sentences from English (WSJ) and German (NEGRA) corpora. We show that PUPA outperforms the leading previous parse assessment algorithm for supervised parsers, as well as a strong unsupervised baseline. Consequently, PUPA allows obtaining high quality parses without any human involvement.
%@ 978-1-932432-29-9
@inproceedings{Reichart:2009:ASH:1596374.1596400,
abstract = {The average results obtained by unsupervised statistical parsers have greatly improved in the last few years, but on many specific sentences they are of rather low quality. The output of such parsers is becoming valuable for various applications, and it is radically less expensive to create than manually annotated training data. Hence, automatic selection of high quality parses created by unsupervised parsers is an important problem. In this paper we present PUPA, a POS-based Unsupervised Parse Assessment algorithm. The algorithm assesses the quality of a parse tree using POS sequence statistics collected from a batch of parsed sentences. We evaluate the algorithm by using an unsupervised POS tagger and an unsupervised parser, selecting high quality parsed sentences from English (WSJ) and German (NEGRA) corpora. We show that PUPA outperforms the leading previous parse assessment algorithm for supervised parsers, as well as a strong unsupervised baseline. Consequently, PUPA allows obtaining high quality parses without any human involvement.},
acmid = {1596400},
added-at = {2012-11-21T18:47:09.000+0100},
address = {Stroudsburg, PA, USA},
author = {Reichart, Roi and Rappoport, Ari},
biburl = {https://www.bibsonomy.org/bibtex/25369b656298ba4bf57b162a105d1156a/jil},
booktitle = {Proceedings of the Thirteenth Conference on Computational Natural Language Learning},
description = {Automatic selection of high quality parses created by a fully unsupervised parser},
interhash = {5362ffb1f031ee6db8d9aeb06f61e4cf},
intrahash = {5369b656298ba4bf57b162a105d1156a},
isbn = {978-1-932432-29-9},
keywords = {algorithm evaluation of parsing part pos pupa speech unsupervised},
location = {Boulder, Colorado},
numpages = {9},
pages = {156--164},
publisher = {Association for Computational Linguistics},
series = {CoNLL '09},
timestamp = {2013-11-23T20:11:51.000+0100},
title = {Automatic selection of high quality parses created by a fully unsupervised parser},
url = {http://dl.acm.org/citation.cfm?id=1596374.1596400},
year = 2009
}