Abstract
Reading comprehension (RC), in contrast to information retrieval, requires
integrating information and reasoning about events, entities, and their
relations across a full document. Question answering is conventionally used to
assess RC ability, in both artificial agents and children learning to read.
However, existing RC datasets and tasks are dominated by questions that can be
solved by selecting answers using superficial information (e.g., local context
similarity or global term frequency); they thus fail to test for the essential
integrative aspect of RC. To encourage progress on deeper comprehension of
language, we present a new dataset and set of tasks in which the reader must
answer questions about stories by reading entire books or movie scripts. These
tasks are designed so that successfully answering their questions requires
understanding the underlying narrative rather than relying on shallow pattern
matching or salience. We show that although humans solve the tasks easily,
standard RC models struggle with them. We provide an analysis
of the dataset and the challenges it presents.