@jaeschke

A Two-stage Sieve Approach for Quote Attribution

, , , and . Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers, page 460--470. Valencia, Spain, Association for Computational Linguistics, (April 2017)

Abstract

We present a deterministic sieve-based system for attributing quotations in literary text and a new dataset: QuoteLi3. Quote attribution, determining who said what in a given text, is important for tasks like creating dialogue systems, and in newer areas like computational literary studies, where it creates opportunities to analyze novels at scale rather than only a few at a time. We release QuoteLi3, which contains more than 6,000 annotations linking quotes to speaker mentions and quotes to speaker entities, and introduce a new algorithm for quote attribution. Our two-stage algorithm first links quotes to mentions, then mentions to entities. Using two stages encapsulates difficult sub-problems and improves system performance. The modular design allows us to tune for overall performance or higher precision, which is useful for many real-world use cases. Our system achieves an average F-score of 87.5 across three novels, outperforming previous systems, and can be tuned for precision of 90.4 at a recall of 65.1.

Description

A Two-stage Sieve Approach for Quote Attribution - ACL Anthology

Links and resources

Tags

community

  • @jaeschke
  • @lea-w
  • @dblp
@jaeschke's tags highlighted