Inproceedings,

Lexical Semantic Relatedness with Random Graph Walks

, and .
EMNLP-CoNLL, pages 581--589. (2007)

Abstract

Many systems for tasks such as question answering, multi-document summarization, and information retrieval need robust numerical measures of lexical relatedness. Standard thesaurus-based measures of word pair similarity are based on only a single path between those words in the thesaurus graph. By contrast, we propose a new model of lexical semantic relatedness that incorporates information from every explicit or implicit path connecting the two words in the entire graph. Our model uses a random walk over nodes and edges derived from WordNet links and corpus statistics. We treat the graph as a Markov chain and compute a word-specific stationary distribution via a generalized PageRank algorithm. Semantic relatedness of a word pair is scored by a novel divergence measure, ZKL, that outperforms existing measures on certain classes of distributions. In our experiments, the resulting relatedness measure is the WordNet-based measure most highly correlated with human similarity judgments by rank ordering at ρ = .90.
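
The pipeline the abstract describes (a Markov chain over a word graph, a word-specific stationary distribution from a PageRank-style walk, and a divergence between two such distributions) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the five-node toy graph and its edge weights, the restart probability of 0.15, and the function names. The abstract does not define ZKL, so the sketch substitutes the Jensen-Shannon divergence as a stand-in; it is not the paper's measure.

```python
import numpy as np


def personalized_pagerank(P, seed, restart=0.15, tol=1e-12, max_iter=1000):
    """Stationary distribution of a walk that follows the row-stochastic
    matrix P and, with probability `restart`, jumps back to node `seed`."""
    n = P.shape[0]
    teleport = np.zeros(n)
    teleport[seed] = 1.0
    v = teleport.copy()
    for _ in range(max_iter):
        v_next = restart * teleport + (1.0 - restart) * (v @ P)
        if np.abs(v_next - v).sum() < tol:
            return v_next
        v = v_next
    return v


def js_divergence(p, q):
    """Jensen-Shannon divergence (natural log); a stand-in, NOT the paper's ZKL."""
    m = 0.5 * (p + q)

    def kl(a, b):
        mask = a > 0  # terms with a == 0 contribute nothing
        return float(np.sum(a[mask] * np.log(a[mask] / b[mask])))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Hypothetical toy graph (not WordNet): symmetric edge weights between
# 0=car, 1=automobile, 2=vehicle, 3=banana, 4=fruit.
A = np.array([
    [0.0, 1.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0, 0.1],
    [0.0, 0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 0.1, 1.0, 0.0],
])
P = A / A.sum(axis=1, keepdims=True)  # normalize rows into transition probabilities

car = personalized_pagerank(P, seed=0)
automobile = personalized_pagerank(P, seed=1)
banana = personalized_pagerank(P, seed=3)

# Lower divergence is read as higher relatedness.
print("car vs automobile:", js_divergence(car, automobile))
print("car vs banana:    ", js_divergence(car, banana))
```

Because each word's walk keeps restarting at that word, its distribution concentrates on the part of the graph reachable through many short paths, so words in the same neighborhood end up with similar distributions and a low divergence.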

Comments and Reviews

  • @thoni
    8 years ago
    Random walks on WordNet. After those random walks, they calculate transition probabilities (i.e. first-order Markov chains). They finally claim a high correlation with human judgment, but it's again only on RG and MC. On WS353, they end up with something around 0.55 (which is not that bad for WordNet, but anyway).