entry of diego_ma:
(0)
This publication has not been reviewed yet.
rating distribution
average user rating
?
The average rating is computed over all reviews. However, some of them may be invisible to you due to the visibility setting chosen by the reviewers.
Generating Annotated Corpora for Reading Comprehension and Question Answering Evaluation.
by:In: Proc. EACL 2003 workshop on NLP for Question Answering
(2003)
, p. 13-19.
Resources (URL, PDF, PS...)
Abstract
Recently, reading comprehension tests for students and adult language learners have received increased attention within the NLP community as a means to develop and evaluate robust question answering NLQA methods. We present our ongoing work on automatically creating richly annotated corpus resources for NLQA and on comparing automatic methods for answering questions against this data set. Starting with the CBC4Kids corpus, we have added XML annotation layers for tokenization, lemmatization, stemming, semantic classes, POS tags and best anking syntactic parses to support future experiments with semantic answer retrieval and inference. Using this resource, we have calculated a baseline for word-overlap based answer retrieval Hirschman et al., 1999 on the CBC4Kids data and found the method performs slightly better than on the REMEDIA corpus. We hope that our richly annotated version of the CBC4Kids corpus will become a standard resource, especially as a controlled environment for evaluating inference-based techniques.


publication