Context recognition for hierarchical text classification

Journal of the American Society for Information Science and Technology, 9999 (9999): NA+ (2009)
DOI: 10.1002/asi.21022

Abstract

Information is often organized as a text hierarchy. A hierarchical text-classification system is thus essential for the management, sharing, and dissemination of information. It aims to automatically classify each incoming document into zero, one, or several categories in the text hierarchy. In this paper, we present a technique called CRHTC (context recognition for hierarchical text classification) that performs hierarchical text classification by recognizing the context of discussion (COD) of each category. A category's COD is governed by its ancestor categories, whose contents indicate contextual backgrounds of the category. A document may be classified into a category only if its content matches the category's COD. CRHTC does not require any trials to manually set parameters, and hence is more portable and easier to implement than other methods. It is empirically evaluated under various conditions. The results show that CRHTC achieves both better and more stable performance than several hierarchical and nonhierarchical text-classification methodologies.
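
The gating idea in the abstract, that a document may enter a category only if it fits the context set by that category's ancestors, can be illustrated with a small sketch. This is not the CRHTC algorithm: the abstract does not say how a COD is recognized, so the term-overlap test below is a stand-in assumption, and the names Category, matches_cod, and classify, as well as the example hierarchy, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Category:
    name: str
    terms: frozenset                      # hypothetical: terms typical of this category's content
    parent: "Category | None" = None      # None marks a top-level category


def ancestors(cat):
    """Yield the ancestors of cat, nearest first."""
    node = cat.parent
    while node is not None:
        yield node
        node = node.parent


def matches_cod(doc_terms, cat):
    """Stand-in COD test (an assumption, not the paper's method):
    the document must share at least one term with every ancestor."""
    return all(doc_terms & a.terms for a in ancestors(cat))


def classify(doc_terms, categories):
    """Return the names of all categories whose context and own content the
    document matches; the result may hold zero, one, or several names."""
    return [
        c.name
        for c in categories
        if matches_cod(doc_terms, c) and doc_terms & c.terms
    ]


# Hypothetical two-branch hierarchy.
science = Category("science", frozenset({"research", "experiment"}))
physics = Category("physics", frozenset({"quantum", "particle"}), parent=science)
sports = Category("sports", frozenset({"match", "team"}))
tennis = Category("tennis", frozenset({"racket", "serve"}), parent=sports)
cats = [science, physics, sports, tennis]

# "serve" is a tennis term, but the document lacks any sports context.
doc = {"quantum", "experiment", "serve"}
labels = classify(doc, cats)
```

In this toy setup the document is kept out of "tennis" even though it contains a tennis term, because it matches nothing in the ancestor "sports". That is the kind of disambiguation the abstract attributes to ancestor context; a real system would replace the overlap test with a learned measure.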

Description

djsaab's CiteULike library 20091211
