Unsupervised Discovery of a Statistical Verb Lexicon
T. Grenager and C. D. Manning. Proceedings of the Conference on Empirical Methods in Natural Language Processing, The Stanford Natural Language Processing Group, (2006)
Abstract
This paper demonstrates how unsupervised techniques can be used to learn models of deep linguistic structure. Determining the semantic roles of a verb’s dependents is an important step in natural language understanding. We present a method for learning models of verb argument patterns directly from unannotated text. The learned models are similar to existing verb lexicons such as VerbNet and PropBank, but additionally include statistics about the linkings used by each verb. The method is based on a structured probabilistic model of the domain, and unsupervised learning is performed with the EM algorithm. The learned models can also be used discriminatively as semantic role labelers, and when evaluated relative to the PropBank annotation, the best learned model reduces 28% of the error between an informed baseline and an oracle upper bound.
@inproceedings{grenager06verblex,
abstract = {This paper demonstrates how unsupervised techniques can be used to learn models of deep linguistic structure. Determining the semantic roles of a verb’s dependents is an important step in natural language understanding. We present a method for learning models of verb argument patterns directly from unannotated text. The learned models are similar to existing verb lexicons such as VerbNet and PropBank, but additionally include statistics about the linkings used by each verb. The method is based on a structured probabilistic model of the domain, and unsupervised learning is performed with the EM algorithm. The learned models can also be used discriminatively as semantic role labelers, and when evaluated relative to the PropBank annotation, the best learned model reduces 28% of the error between an informed baseline and an oracle upper bound.},
author = {Grenager, Trond and Manning, Christopher D.},
booktitle = {Proceedings of the Conference on Empirical Methods in Natural Language Processing},
keywords = {2006 stanford parsetree parser nlp NT2OD},
organization = {The Stanford Natural Language Processing Group},
title = {{Unsupervised Discovery of a Statistical Verb Lexicon}},
url = {http://nlp.stanford.edu/pubs/verblex.pdf},
year = 2006
}