Creating Robust Supervised Classifiers via Web-Scale N-gram Data
S. Bergsma, E. Pitler, and D. Lin. Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics (ACL-10), 2010.
Abstract
In this paper, we systematically assess the value of using web-scale N-gram data in state-of-the-art supervised NLP classifiers. We compare classifiers that include or exclude features for the counts of various N-grams, where the counts are obtained from a web-scale auxiliary corpus. We show that including N-gram count features can advance the state-of-the-art accuracy on standard data sets for adjective ordering, spelling correction, noun compound bracketing, and verb part-of-speech disambiguation. More importantly, when operating on new domains, or when labeled training data is not plentiful, we show that using web-scale N-gram features is essential for achieving robust performance.
@inproceedings{Bergsma:EtAl:10,
  author    = {Bergsma, Shane and Pitler, Emily and Lin, Dekang},
  title     = {Creating Robust Supervised Classifiers via Web-Scale N-gram Data},
  booktitle = {Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics (ACL-10)},
  year      = {2010},
  url       = {http://aclweb.org/anthology-new/P/P10/P10-1089.pdf},
  keywords  = {2010 acl bracketing compounds}
}