Finding relevant passages using noun-noun compounds: Coherence vs. proximity

Proceedings of the 23rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 385-387. (2000)

Abstract

Intuitively, words forming phrases describe content more precisely than the same words treated as a sequence of keywords. Yet evidence that phrases are more effective for information retrieval is inconclusive. This paper isolates a neglected class of phrases that is abundant in communication, has an established theoretical foundation, and shows promise for effectively expressing the user's information need: the noun-noun compound (NNC). In an experiment, a variety of meaningful NNCs were used to isolate relevant passages in a large and varied corpus. In a first pass, passages were retrieved based on textual proximity of the words or their semantic peers. A second pass retained only passages containing a syntactically coherent structure equivalent to the original NNC. This second pass showed a dramatic increase in precision. Preliminary results support our intuition about phrases in the special but very productive case of NNCs.
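
The two-pass idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the token window, and the two surface patterns are assumptions made for illustration only. The real second pass relied on syntactic analysis, which is approximated here by simple regular expressions, and semantic peers are left out entirely.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def proximity_pass(passages, modifier, head, window=10):
    """First pass (assumed form): keep passages in which the two nouns
    occur within `window` tokens of each other, in either order."""
    kept = []
    for passage in passages:
        tokens = tokenize(passage)
        mod_pos = [i for i, t in enumerate(tokens) if t == modifier]
        head_pos = [i for i, t in enumerate(tokens) if t == head]
        if any(abs(i - j) <= window for i in mod_pos for j in head_pos):
            kept.append(passage)
    return kept


def coherence_pass(passages, modifier, head):
    """Second pass (crude stand-in for syntactic coherence): keep passages
    in which the nouns form the compound itself or a hypothetical
    paraphrase of it, such as 'head of modifier'."""
    m, h = re.escape(modifier), re.escape(head)
    patterns = [
        rf"\b{m}\s+{h}s?\b",                          # "music festival"
        rf"\b{h}s?\s+(?:of|for)\s+(?:the\s+)?{m}s?\b",  # "festival of music"
    ]
    return [
        p for p in passages
        if any(re.search(pat, p.lower()) for pat in patterns)
    ]


# Hypothetical passages, invented for this sketch.
passages = [
    "The music festival drew large crowds.",
    "A festival of music was held in June.",
    "The festival ended and the band packed up its music.",
    "Stock prices fell sharply.",
]

first = proximity_pass(passages, "music", "festival")
second = coherence_pass(first, "music", "festival")
```

In this toy data the third passage survives the proximity pass because both nouns are close together, but it is dropped by the coherence pass because they do not form a single structure. That is the kind of precision gain the abstract attributes to the second pass.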
