The role of documents vs. queries in extracting class attributes from text

CIKM '07: Proceedings of the sixteenth ACM conference on information and knowledge management, pages 485-494. New York, NY, USA, ACM (2007)
DOI: http://doi.acm.org/10.1145/1321440.1321510

Abstract

Challenging the implicit reliance on document collections, this paper discusses the pros and cons of using query logs, rather than document collections, as self-contained sources of data in textual information extraction. The differences are quantified as part of a large-scale study on extracting prominent attributes or quantifiable properties of classes (e.g., top speed, price and fuel consumption for CarModel) from unstructured text. In a head-to-head qualitative comparison, a lightweight extraction method produces class attributes that are 45% more accurate on average when acquired from query logs rather than from Web documents.
