Inproceedings,

Emergence of multilingual representations by independent component analysis using parallel corpora

Proceedings of the Ninth Scandinavian Conference on Artificial Intelligence (SCAI 2006), 22, pages 101-105. Finnish Artificial Intelligence Society, (2006)

Abstract

This paper reports the first results on extracting a meaningful representation for words from multilingual parallel corpora. Independent component analysis is used to extract a number of components from statistics calculated for words in contexts. Individual components are meaningful and multilingual, and words are represented with a bag-of-concepts model. The component space created by the extracted components is also multilingual. Words that are related in different languages appear close to each other in the component space, which makes it possible to find translations for words between languages.
