In natural language understanding, there is a hierarchy of lenses through which we can extract meaning: from words to sentences to paragraphs to documents. At the document level, one of the most useful ways to understand text is by analyzing its topics.
I gave an introductory talk on word embeddings in the past, and this write-up is an extended version of the part about the philosophical ideas behind word vectors.
T. Gao, X. Yao, and D. Chen. (2021). arXiv:2104.08821. Comment: Accepted to EMNLP 2021. The code and pre-trained models are available at https://github.com/princeton-nlp/simcse.
S. Wang, J. Tang, C. Aggarwal, and H. Liu. Proceedings of the 25th ACM International on Conference on Information and Knowledge Management, ACM, (October 2016)