D. McClosky, E. Charniak, and M. Johnson. Reranking and self-training for parser adaptation. In ACL-44: Proceedings of the 21st International Conference on Computational Linguistics and the 44th Annual Meeting of the Association for Computational Linguistics, pages 337--344, Sydney, Australia, 2006. Association for Computational Linguistics.
DOI: 10.3115/1220175.1220218
Abstract
Statistical parsers trained and tested on the Penn Wall Street Journal (WSJ) treebank have shown vast improvements over the last 10 years. Much of this improvement, however, is based upon an ever-increasing number of features to be trained on (typically) the WSJ treebank data. This has led to concern that such parsers may be too finely tuned to this corpus at the expense of portability to other genres. Such worries have merit. The standard "Charniak parser" checks in at a labeled precision-recall f-measure of 89.7% on the Penn WSJ test set, but only 82.9% on the test set from the Brown treebank corpus. This paper should allay these fears. In particular, we show that the reranking parser described in Charniak and Johnson (2005) improves performance of the parser on Brown to 85.2%. Furthermore, use of the self-training techniques described in McClosky et al. (2006) raises this to 87.8% (an error reduction of 28%), again without any use of labeled Brown data. This is remarkable since training the parser and reranker on labeled Brown data achieves only 88.4%.
@inproceedings{1220218,
abstract = {Statistical parsers trained and tested on the Penn Wall Street Journal (WSJ) treebank have shown vast improvements over the last 10 years. Much of this improvement, however, is based upon an ever-increasing number of features to be trained on (typically) the WSJ treebank data. This has led to concern that such parsers may be too finely tuned to this corpus at the expense of portability to other genres. Such worries have merit. The standard "Charniak parser" checks in at a labeled precision-recall f-measure of 89.7% on the Penn WSJ test set, but only 82.9% on the test set from the Brown treebank corpus. This paper should allay these fears. In particular, we show that the reranking parser described in Charniak and Johnson (2005) improves performance of the parser on Brown to 85.2%. Furthermore, use of the self-training techniques described in McClosky et al. (2006) raises this to 87.8% (an error reduction of 28%), again without any use of labeled Brown data. This is remarkable since training the parser and reranker on labeled Brown data achieves only 88.4%.},
address = {Morristown, NJ, USA},
author = {McClosky, David and Charniak, Eugene and Johnson, Mark},
booktitle = {ACL-44: Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics},
doi = {10.3115/1220175.1220218},
keywords = {parser semisupervised},
location = {Sydney, Australia},
pages = {337--344},
publisher = {Association for Computational Linguistics},
title = {Reranking and self-training for parser adaptation},
url = {http://portal.acm.org/citation.cfm?id=1220218},
year = 2006
}