
New Experiments in Distributional Representations of Synonymy

Proceedings of the Ninth Conference on Computational Natural Language Learning, pages 25-32. Stroudsburg, PA, USA, Association for Computational Linguistics (2005)

Abstract

Recent work on the problem of detecting synonymy through corpus analysis has used the Test of English as a Foreign Language (TOEFL) as a benchmark. However, this test involves as few as 80 questions, prompting questions regarding the statistical significance of reported results. We overcome this limitation by generating a TOEFL-like test using WordNet, containing thousands of questions and composed only of words occurring with sufficient corpus frequency to support sound distributional comparisons. Experiments with this test lead us to a similarity measure which significantly outperforms the best proposed to date. Analysis suggests that a strength of this measure is its relative robustness against polysemy.
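The concern about an 80-question benchmark can be illustrated with a rough calculation. The sketch below is not taken from the paper: it uses the standard normal approximation for a binomial proportion, and the accuracy of 0.80 and the larger test sizes are hypothetical values chosen only for illustration.

```python
import math


def accuracy_margin(accuracy: float, n_questions: int, z: float = 1.96) -> float:
    """Half-width of a normal-approximation confidence interval for an
    accuracy measured on n_questions independent questions.

    z = 1.96 corresponds to a 95% interval.
    """
    return z * math.sqrt(accuracy * (1.0 - accuracy) / n_questions)


# Hypothetical accuracy and test sizes (assumptions, not figures from the paper).
for n in (80, 1000, 10000):
    margin = accuracy_margin(0.80, n)
    print(f"{n:>6} questions: 0.80 +/- {margin:.3f}")
```

Under these assumptions the interval at 80 questions is roughly 9 percentage points on either side, so two systems whose scores differ by a few points cannot be reliably separated, whereas a test with thousands of questions narrows the interval considerably.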

