Abstract
Automatically ranking word pairs by semantic relatedness, in a way that mimics
human notions of relatedness, has widespread applications. Two families of
measures exist: those that rely on raw data (distributional measures) and
those that use knowledge-rich ontologies. Although extensive studies
have been performed to compare ontological measures with human judgment, the
distributional measures have primarily been evaluated by indirect means. This
paper is a detailed study of some of the major distributional measures; it
lists their respective merits and limitations. New measures that overcome these
drawbacks and are more in line with human notions of semantic relatedness
are suggested. The paper concludes with an exhaustive comparison
of the distributional and ontology-based measures. Along the way, significant
research problems are identified. Work on these problems may lead to a better
understanding of how semantic relatedness is to be measured.