Attaching Translations to Proper Lexical Senses in DBnary

A. Tchechmedjiev, G. Sérasset, J. Goulian, and D. Schwab. 3rd Workshop on Linked Data in Linguistics: Multilingual Knowledge Resources and Natural Language Processing, page to appear. Iceland, (2014)

Abstract

The DBnary project aims at providing high quality Lexical Linked Data extracted from different Wiktionary language editions. Data from 10 different languages is currently extracted, for a total of over 3.16M translation links that connect lexical entries from the 10 extracted languages to entries in more than one thousand languages. In Wiktionary, glosses are often associated with translations to help users understand which sense they refer to, whether through a textual definition or a target sense number. In this article we aim to extract as much of this information as possible and then to disambiguate the corresponding translations for all available languages. We use an adaptation of various textual and semantic similarity techniques based on partial or fuzzy gloss overlaps to disambiguate the translation relations (to account for the lack of normalization, e.g. lemmatization and PoS tagging). We then extract some of the sense number information present to build a gold standard, so as to evaluate our disambiguation as well as tune and optimize the parameters of the similarity measures. We obtain F-measures of the order of 80% (on par with similar work on English only) across the three languages where we could generate a gold standard (French, Portuguese, Finnish), and show that most of the disambiguation errors are due to inconsistencies in Wiktionary itself that cannot be detected at the generation of DBnary (shifted sense numbers, inconsistent glosses, etc.).
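
To make the idea of a fuzzy gloss overlap more concrete, the following is a minimal sketch, not the authors' implementation. It assumes a simple character-level similarity (Python's `difflib.SequenceMatcher`) in place of the similarity measures tuned in the paper; the function names, the threshold of 0.8, and the example senses are all hypothetical.

```python
import re
from difflib import SequenceMatcher


def tokenize(text):
    """Lowercase the text and split it into word tokens (no lemmatization)."""
    return re.findall(r"\w+", text.lower())


def fuzzy_match(a, b, threshold=0.8):
    """Treat two tokens as matching when their character similarity is high.

    This stands in for the missing normalization (e.g. 'animals' vs 'animal').
    The threshold value is an assumption, not a parameter from the paper.
    """
    return SequenceMatcher(None, a, b).ratio() >= threshold


def gloss_overlap(gloss, definition, threshold=0.8):
    """Fraction of gloss tokens that fuzzily match some definition token."""
    gloss_tokens = tokenize(gloss)
    definition_tokens = tokenize(definition)
    if not gloss_tokens:
        return 0.0
    matched = sum(
        1
        for g in gloss_tokens
        if any(fuzzy_match(g, d, threshold) for d in definition_tokens)
    )
    return matched / len(gloss_tokens)


def attach_translation(gloss, senses, threshold=0.8):
    """Return the id of the sense whose definition best overlaps the gloss.

    `senses` maps a sense id to its definition text. Returns None when no
    sense overlaps the gloss at all.
    """
    scores = {
        sense_id: gloss_overlap(gloss, definition, threshold)
        for sense_id, definition in senses.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


# Hypothetical senses of the English entry "cat".
senses = {
    "1": "A domesticated feline animal kept as a pet",
    "2": "A heavy tracked vehicle used in construction",
}
chosen = attach_translation("feline animals", senses)
```

Here the gloss "feline animals" is attached to sense "1" because "animals" fuzzily matches "animal" even without lemmatization. A real system would also need to handle ties and glosses that consist only of a sense number.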

Links and resources

BibTeX key: tchechmedjiev:hal-00990870
entry type: inproceedings
address: Iceland
booktitle: 3rd Workshop on Linked Data in Linguistics: Multilingual Knowledge Resources and Natural Language Processing
year: 2014
pages: to appear
hal_id: hal-00990870
audience: international
pdf: http://hal.archives-ouvertes.fr/hal-00990870/PDF/dbnary-wsd.pdf
affiliation: Laboratoire d'Informatique de Grenoble - LIG
language: English
url: http://hal.archives-ouvertes.fr/hal-00990870
