Word Category Maps based on Emergent Features Created by ICA
J. Väyrynen and T. Honkela. Proceedings of the STeP'2004 Cognition + Cybernetics Symposium, pp. 173--185. Finnish Artificial Intelligence Society, (2004)
Abstract
In this paper, we assume that word co-occurrence statistics can be used to extract meaningful features, exhibiting syntactic and semantic behavior, from text data. Independent component analysis (ICA), an unsupervised statistical method, is applied to word usage statistics, calculated from a natural language corpus, to extract a number of features. With a self-organizing map (SOM), we will demonstrate that the extracted vector representation for words can further be applied to other tasks. It is also demonstrated that the ICA-based encoding scheme is a good alternative to random projection (RP), a method commonly used in text analysis.
%0 Conference Paper
%1 Vayrynen-Honkela:2004:STEP2004
%A Väyrynen, Jaakko J.
%A Honkela, Timo
%B Proceedings of the STeP'2004 Cognition + Cybernetics Symposium
%D 2004
%E Hyötyniemi, Heikki
%E Ala-Siuru, Pekka
%E Seppänen, Jouko
%I Finnish Artificial Intelligence Society
%K context emergence ica som word
%P 173--185
%T Word Category Maps based on Emergent Features Created by ICA
%X In this paper, we assume that word co-occurrence statistics can be used to extract meaningful features, exhibiting syntactic and semantic behavior, from text data. Independent component analysis (ICA), an unsupervised statistical method, is applied to word usage statistics, calculated from a natural language corpus, to extract a number of features. With a self-organizing map (SOM), we will demonstrate that the extracted vector representation for words can further be applied to other tasks. It is also demonstrated that the ICA-based encoding scheme is a good alternative to random projection (RP), a method commonly used in text analysis.
@inproceedings{Vayrynen-Honkela:2004:STEP2004,
abstract = {In this paper, we assume that word co-occurrence statistics can be used to extract meaningful features, exhibiting syntactic and semantic behavior, from text data. Independent component analysis (ICA), an unsupervised statistical method, is applied to word usage statistics, calculated from a natural language corpus, to extract a number of features. With a self-organizing map (SOM), we will demonstrate that the extracted vector representation for words can further be applied to other tasks. It is also demonstrated that the ICA-based encoding scheme is a good alternative to random projection (RP), a method commonly used in text analysis.},
added-at = {2008-11-26T15:13:36.000+0100},
author = {V{\"{a}}yrynen, Jaakko J. and Honkela, Timo},
biburl = {https://www.bibsonomy.org/bibtex/2be7ef5183530356d14a03b1ced819b8a/jjv},
booktitle = {Proceedings of the STeP'2004 Cognition + Cybernetics Symposium},
editor = {Hy{\"{o}}tyniemi, Heikki and Ala-Siuru, Pekka and Sepp{\"{a}}nen, Jouko},
interhash = {9191cb2818e285bc7e5b072da3d8c696},
intrahash = {be7ef5183530356d14a03b1ced819b8a},
keywords = {context emergence ica som word},
pages = {173--185},
publisher = {Finnish Artificial Intelligence Society},
series = {Publications of the Finnish Artificial Intelligence Society},
timestamp = {2008-11-26T15:13:36.000+0100},
title = {Word Category Maps based on Emergent Features Created by {ICA}},
year = 2004
}