butonic's BibTeX entry:  

Accurate Unlexicalized Parsing

Annual Meeting of the Association for Computational Linguistics, 41: 423-430, 2003.
Authors: Dan Klein and Christopher D. Manning
URL: http://nlp.stanford.edu/~manning/papers/unlexicalized-parsing.pdf
Tags: NT2OD nlp parser stanford
Abstract: We demonstrate that an unlexicalized PCFG can parse much more accurately than previously shown, by making use of simple, linguistically motivated state splits, which break down false independence assumptions latent in a vanilla treebank grammar. Indeed, its performance of 86.36% (LP/LR F1) is better than that of early lexicalized PCFG models, and surprisingly close to the current state-of-the-art. This result has potential uses beyond establishing a strong lower bound on the maximum possible accuracy of unlexicalized models: an unlexicalized PCFG is much more compact, easier to replicate, and easier to interpret than more complex lexical models, and the parsing algorithms are simpler, more widely understood, of lower asymptotic complexity, and easier to optimize.
@inproceedings{klein03accurate,
title = {Accurate Unlexicalized Parsing},
author = {Dan Klein and Christopher D. Manning},
booktitle = {Annual Meeting of the Association for Computational Linguistics},
pages = {423--430},
school = {The Stanford Natural Language Processing Group},
url = {http://nlp.stanford.edu/~manning/papers/unlexicalized-parsing.pdf},
volume = {41},
year = {2003},
abstract = {We demonstrate that an unlexicalized PCFG can parse much more accurately than previously shown, by making use of simple, linguistically motivated state splits, which break down false independence assumptions latent in a vanilla treebank grammar. Indeed, its performance of 86.36\% (LP/LR F1) is better than that of early lexicalized PCFG models, and surprisingly close to the current state-of-the-art. This result has potential uses beyond establishing a strong lower bound on the maximum possible accuracy of unlexicalized models: an unlexicalized PCFG is much more compact, easier to replicate, and easier to interpret than more complex lexical models, and the parsing algorithms are simpler, more widely understood, of lower asymptotic complexity, and easier to optimize.},
keywords = {NT2OD nlp parser stanford}
}