Comparing Clustering Algorithms and Their Influence on the Evolution of Labeled Clusters

Database and Expert Systems Applications (2007)

Abstract

We study the influence of different clustering algorithms on cluster evolution monitoring in data streams. Capturing and interpreting cluster change delivers indicators of the evolution of the underlying population. For text stream monitoring, the clusters can be summarized into topics, so that cluster monitoring provides insights into the data and the decline of thematic subjects over time. However, such insights should always be taken with a grain of salt: the quality of the clusters has a decisive impact on the observed changes. In the simplest case, cluster change across the stream may be due to the low quality of the original cluster rather than to a drift in the population belonging to this cluster. We present our framework Theme Finder for topic evolution monitoring in streams and compare the influence of the quality of two very different clustering algorithms. After evaluating different clustering algorithms with external and internal quality measures, we use the center-based bisecting k-means algorithm and the density-based DBScan algorithm. Our results show that the influence is relatively high and that the results of one clustering algorithm allow conclusions to be drawn about the evaluation of the other. Our experiments were conducted on a subarchive of the ACM library.

Description

SpringerLink - Book Chapter
