
Towards explicit semantic features using independent component analysis

Proceedings of the Workshop Semantic Content Acquisition and Representation (SCAR), Stockholm, Sweden, Swedish Institute of Computer Science (2007). SICS Technical Report T2007-06.


Latent semantic analysis (LSA) can be used to create an implicit semantic vector representation for words. Independent component analysis (ICA) can be derived as an extension to LSA that rotates the latent semantic space so that it becomes explicit, that is, the features correspond more closely to those resulting from human cognitive activity. This enables nonlinear filtering of the features, such as thresholding that forces the ICA components of each word to be sparse. We demonstrate this with multiple-choice semantic vocabulary tests generated from a multilingual thesaurus. The experiments are conducted in English, Finnish, and Swedish.
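
The pipeline described in the abstract (LSA projection, ICA rotation, then thresholding) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes a symmetric FastICA update with a tanh nonlinearity as the ICA estimator and a keep-the-m-largest rule as the thresholding step, neither of which the abstract specifies. The function names (`lsa_whitened`, `fastica_rotation`, `keep_top_m`) and the synthetic word-by-context matrix are hypothetical.

```python
import numpy as np

def lsa_whitened(X, k):
    """LSA step: project words (rows of X) onto k latent dimensions via SVD.

    The coordinates are scaled so each latent dimension has unit variance,
    which is the whitened input that ICA expects.
    """
    Xc = X - X.mean(axis=0)
    U, _, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :k] * np.sqrt(X.shape[0])

def _sym_orth(W):
    """Symmetric orthogonalization: the closest orthogonal matrix to W."""
    U, _, Vt = np.linalg.svd(W)
    return U @ Vt

def fastica_rotation(Z, n_iter=200, tol=1e-6, seed=0):
    """ICA step (assumed estimator: symmetric FastICA with tanh).

    Returns an orthogonal matrix W; the rotated features are Z @ W.T.
    """
    n, k = Z.shape
    rng = np.random.default_rng(seed)
    W = _sym_orth(rng.standard_normal((k, k)))
    for _ in range(n_iter):
        Y = Z @ W.T                      # current component estimates
        G = np.tanh(Y)                   # nonlinearity g
        G_prime = 1.0 - G ** 2           # its derivative g'
        W_new = (G.T @ Z) / n - np.diag(G_prime.mean(axis=0)) @ W
        W_new = _sym_orth(W_new)
        # Converged when each new row is parallel to the old one.
        delta = np.max(np.abs(np.abs(np.sum(W_new * W, axis=1)) - 1.0))
        W = W_new
        if delta < tol:
            break
    return W

def keep_top_m(S, m):
    """Thresholding step: zero all but the m largest-magnitude features per word."""
    drop = np.argsort(-np.abs(S), axis=1)[:, m:]
    out = S.copy()
    np.put_along_axis(out, drop, 0.0, axis=1)
    return out

# Synthetic stand-in for a word-by-context matrix (hypothetical data).
rng = np.random.default_rng(42)
X = rng.laplace(size=(500, 5)) @ rng.standard_normal((5, 20))

Z = lsa_whitened(X, k=5)        # implicit (LSA) representation
W = fastica_rotation(Z)         # rotation of the latent space
S = Z @ W.T                     # explicit (ICA) representation
S_sparse = keep_top_m(S, m=2)   # sparse word vectors
print(S_sparse.shape, int((S_sparse != 0).sum(axis=1).max()))
```

Because `W` is orthogonal, `S` spans the same latent space as `Z` and preserves all distances between words. Only the choice of axes changes, which is what makes per-feature operations such as thresholding meaningful after the rotation.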

