Incollection,

Fundamental Limitations of Spectral Clustering Methods

, and .
Advances in Neural Information Processing Systems 19, MIT Press, Cambridge, MA, (2007)

Abstract

Spectral clustering methods are common graph-based approaches to clustering of data. Spectral clustering algorithms typically start from local information encoded in a weighted graph on the data and cluster according to the global eigenvectors of the corresponding (normalized) similarity matrix. One contribution of this paper is to present fundamental limitations of this general local to global approach. We show that based only on local information, the underlying normalized cut functional is not a suitable measure for the quality of clustering. Further, even with a suitable similarity measure, we show that the first few eigenvectors of such adjacency matrices cannot successfully cluster datasets that contain structures at different scales of size and density. These findings provide theoretical explanations to some empirically observed characteristics of these methods. Based on our theoretical analysis, a second contribution of this paper is a novel diffusion based measure to evaluate the coherence of individual clusters. Our measure can be used in conjunction with any graph-based clustering method (global, top-down, bottom-up), it is scale-free and can determine coherent clusters at all scales. We present simple examples where various spectral clustering algorithms fail, whereas clustering utilizing this coherence measure finds the correct clusters at all scales.

Tags

Users

  • @mhwombat
  • @seandalai

Comments and Reviews