
Evaluating Topic Models for Digital Libraries

, , , , and . Proceedings of the 10th Annual Joint Conference on Digital Libraries, pages 215--224. New York, NY, USA, ACM, (2010)
DOI: 10.1145/1816123.1816156

Abstract

Topic models could have a huge impact on improving the ways users find and discover content in digital libraries and search interfaces through their ability to automatically learn and apply subject tags to each and every item in a collection, and their ability to dynamically create virtual collections on the fly. However, much remains to be done to tap this potential, and empirically evaluate the true value of a given topic model to humans. In this work, we sketch out some sub-tasks that we suggest pave the way towards this goal, and present methods for assessing the coherence and interpretability of topics learned by topic models. Our large-scale user study includes over 70 human subjects evaluating and scoring almost 500 topics learned from collections from a wide range of genres and domains. We show how a scoring model, based on pointwise mutual information of word pairs using Wikipedia, Google and MEDLINE as external data sources, performs well at predicting human scores. This automated scoring of topics is an important first step to integrating topic modeling into digital libraries.
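
The PMI-based scoring mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `pmi_topic_score`, the toy corpus, the use of document-level co-occurrence (rather than counts from Wikipedia, Google or MEDLINE), and the choice to skip word pairs that never co-occur are all assumptions made for illustration. The score is the mean over word pairs of log(p(w1, w2) / (p(w1) p(w2))).

```python
import math
from itertools import combinations


def pmi_topic_score(topic_words, documents):
    """Mean pairwise PMI of a topic's words (hypothetical helper).

    Probabilities are estimated as the fraction of reference documents
    containing a word (or both words of a pair). This is an assumed,
    simplified estimator, not the one used in the paper.
    """
    doc_sets = [set(doc) for doc in documents]
    n_docs = len(doc_sets)

    def prob(*words):
        # Fraction of documents that contain all of the given words.
        return sum(all(w in d for w in words) for d in doc_sets) / n_docs

    scores = []
    for w1, w2 in combinations(topic_words, 2):
        p1, p2, p12 = prob(w1), prob(w2), prob(w1, w2)
        if p1 == 0 or p2 == 0 or p12 == 0:
            continue  # assumption: skip pairs with no co-occurrence evidence
        scores.append(math.log(p12 / (p1 * p2)))
    return sum(scores) / len(scores) if scores else 0.0


# Toy reference corpus (invented for illustration only).
docs = [
    ["space", "nasa", "launch"],
    ["space", "nasa", "orbit"],
    ["bank", "money", "loan"],
    ["bank", "money", "space"],
]

coherent = pmi_topic_score(["space", "nasa"], docs)  # words that tend to co-occur
mixed = pmi_topic_score(["space", "bank"], docs)     # words that rarely co-occur
print(coherent > mixed)
```

In this sketch a higher score means the topic's words co-occur more often in the reference corpus than chance would predict, which is the property the abstract reports as tracking human judgments of topic coherence.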

Description

Evaluating topic models for digital libraries
