Topic extraction is a useful unsupervised data-mining technique for discovering the underlying relationships between texts. There are many different approaches, with the most popular probably being LDA, but…
In this three-part blog series, we present a unifying perspective on pre-trained word embeddings within a general matrix factorization framework. The most popular word embedding model, Word2vec, has…
E. Gaussier and C. Goutte. Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 601-602. New York, NY, USA, ACM, (2005)
C. Liu, H. Yang, J. Fan, L. He, and Y. Wang. Proceedings of the 19th International Conference on World Wide Web, pages 681-690. New York, NY, USA, ACM, (2010)