In natural language understanding (NLU) tasks, there is a hierarchy of lenses through which we can extract meaning — from words to sentences to paragraphs to documents. At the document level, one of the most useful ways to understand text is by analyzing its topics. The process of learning, recognizing, and extracting these topics across a collection of documents is called topic modeling.
In this post, we will explore topic modeling through four of the most popular techniques in use today: LSA, pLSA, LDA, and the newer, deep learning-based lda2vec.