Topic-conditioned novelty detection

KDD '02: Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining, pages 688-693. New York, NY, USA, ACM, (2002)
DOI: 10.1145/775047.775150

Abstract

Automated detection of the first document reporting each new event in temporally sequenced streams of documents is an open challenge. In this paper we propose a new approach that addresses this problem in two stages: 1) using a supervised learning algorithm to classify the on-line document stream into pre-defined broad topic categories, and 2) performing topic-conditioned novelty detection for documents in each topic. We also focus on exploiting named entities for event-level novelty detection and using feature-based heuristics derived from the topic histories. Evaluating these methods on a set of broadcast news stories, our results show substantial performance gains over the traditional one-level approach to the novelty detection problem.
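
The two-stage idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the keyword lookup in `classify` is a hypothetical stand-in for a trained supervised classifier, the bag-of-words cosine similarity and the `threshold` value of 0.5 are assumptions chosen for illustration, and the named-entity weighting and topic-history heuristics mentioned in the abstract are left out. The point it shows is only that each incoming document is compared against earlier documents in its own topic rather than against the whole stream.

```python
import math
from collections import Counter, defaultdict

# Hypothetical keyword lists standing in for a trained supervised classifier.
TOPIC_KEYWORDS = {
    "disaster": {"earthquake", "flood", "hurricane"},
    "politics": {"election", "senate", "vote"},
}


def vectorize(text):
    """Turn a document into a bag-of-words term-count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(text):
    """Stage 1 (stand-in): pick the topic with the most keyword hits."""
    words = set(text.lower().split())
    best = max(TOPIC_KEYWORDS, key=lambda t: len(words & TOPIC_KEYWORDS[t]))
    return best if words & TOPIC_KEYWORDS[best] else "other"


def detect_novel(stream, threshold=0.5):
    """Stage 2: flag a document as novel if no earlier document in the
    same topic is at least `threshold` similar to it."""
    history = defaultdict(list)  # topic -> vectors of earlier documents
    flags = []
    for text in stream:
        topic = classify(text)
        vec = vectorize(text)
        max_sim = max((cosine(vec, old) for old in history[topic]), default=0.0)
        flags.append(max_sim < threshold)
        history[topic].append(vec)
    return flags


stream = [
    "earthquake hits coastal city",
    "earthquake hits coastal city again",
    "senate vote on budget delayed",
    "flood destroys bridge in valley",
]
print(detect_novel(stream))
```

In this toy stream the second story is the only one that closely repeats an earlier story within its own topic, so it is the only one not flagged as novel. A one-level detector would instead compare every document against the full history regardless of topic.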