Automatic Extraction of Clusters from Hierarchical Clustering Representations.
J. Sander, X. Qin, Z. Lu, N. Niu, and A. Kovarsky. Proc. of 7th Pacific-Asia Conf. of Advances in Knowledge Discovery and Data Mining, PAKDD 2003, Proceedings, pages 75-87. Springer, 2003.
Abstract
Hierarchical clustering algorithms are typically more effective in detecting the true clustering structure of a data set than partitioning algorithms. However, hierarchical clustering algorithms do not actually create clusters, but compute only a hierarchical representation of the data set. This makes them unsuitable as an automatic pre-processing step for other algorithms that operate on detected clusters. This is true for both dendrograms and reachability plots, which have been proposed as hierarchical clustering representations, and which have different advantages and disadvantages. In this paper we first investigate the relation between dendrograms and reachability plots and introduce methods to convert them into each other showing that they essentially contain the same information. Based on reachability plots, we then introduce a technique that automatically determines the significant clusters in a hierarchical cluster representation. This makes it for the first time possible to use hierarchical clustering as an automatic pre-processing step that requires no user interaction to select clusters from a hierarchical cluster representation.
%0 Conference Paper
%1 sander03extraction
%A Sander, Jörg
%A Qin, Xuejie
%A Lu, Zhiyong
%A Niu, Nan
%A Kovarsky, Alex
%B Proc. of 7th Pacific-Asia Conf. of Advances in Knowledge Discovery and Data Mining, PAKDD 2003, Proceedings
%D 2003
%I Springer
%K cites.procm research.clustering state.toRead
%P 75-87
%T Automatic Extraction of Clusters from Hierarchical Clustering Representations.
%U http://www.springerlink.com/content/he3wv27nyj5ldh3y/
%X Hierarchical clustering algorithms are typically more effective in detecting the true clustering structure of a data set than partitioning algorithms. However, hierarchical clustering algorithms do not actually create clusters, but compute only a hierarchical representation of the data set. This makes them unsuitable as an automatic pre-processing step for other algorithms that operate on detected clusters. This is true for both dendrograms and reachability plots, which have been proposed as hierarchical clustering representations, and which have different advantages and disadvantages. In this paper we first investigate the relation between dendrograms and reachability plots and introduce methods to convert them into each other showing that they essentially contain the same information. Based on reachability plots, we then introduce a technique that automatically determines the significant clusters in a hierarchical cluster representation. This makes it for the first time possible to use hierarchical clustering as an automatic pre-processing step that requires no user interaction to select clusters from a hierarchical cluster representation.
@inproceedings{sander03extraction,
abstract = {Hierarchical clustering algorithms are typically more effective in detecting the true clustering structure of a data set than partitioning algorithms. However, hierarchical clustering algorithms do not actually create clusters, but compute only a hierarchical representation of the data set. This makes them unsuitable as an automatic pre-processing step for other algorithms that operate on detected clusters. This is true for both dendrograms and reachability plots, which have been proposed as hierarchical clustering representations, and which have different advantages and disadvantages. In this paper we first investigate the relation between dendrograms and reachability plots and introduce methods to convert them into each other showing that they essentially contain the same information. Based on reachability plots, we then introduce a technique that automatically determines the significant clusters in a hierarchical cluster representation. This makes it for the first time possible to use hierarchical clustering as an automatic pre-processing step that requires no user interaction to select clusters from a hierarchical cluster representation.},
added-at = {2008-08-22T09:11:51.000+0200},
author = {Sander, J{\"o}rg and Qin, Xuejie and Lu, Zhiyong and Niu, Nan and Kovarsky, Alex},
biburl = {https://www.bibsonomy.org/bibtex/2bfcee29ca5ead9e34a81bc5d2eab89a5/msn},
booktitle = {Proc. of 7th Pacific-Asia Conf. of Advances in Knowledge Discovery and Data Mining, PAKDD 2003, Proceedings},
editor = {Kyu-Young Whang and Jongwoo Jeon and Kyuseok Shim and Jaideep Srivastava},
interhash = {c54d6a717ba5c61f12bbd9e802f62298},
intrahash = {bfcee29ca5ead9e34a81bc5d2eab89a5},
keywords = {cites.procm research.clustering state.toRead},
language = {english},
pages = {75--87},
publisher = {Springer},
series = {Lecture Notes in Computer Science},
timestamp = {2009-06-25T15:59:26.000+0200},
title = {Automatic Extraction of Clusters from Hierarchical Clustering Representations},
url = {http://www.springerlink.com/content/he3wv27nyj5ldh3y/},
volume = {2637},
year = 2003
}