Comparison of combination methods using Spectral Clustering Ensembles
A. Lourenco and A. Fred. 4th International Workshop on Pattern Recognition in Information Systems - PRIS 2004 ICEIS04, (2004)
Abstract
We address the problem of combining multiple data partitions, a set we call a clustering ensemble. We use a recent
clustering approach, known as Spectral Clustering, and the classical K-Means algorithm to produce the partitions that
constitute the clustering ensembles. A comparative evaluation of several combination methods is performed by measuring the
consistency between the combined data partition and (a) ground-truth information, and (b) the clustering ensemble. Two
consistency measures are used: (i) an index based on cluster matching between two partitions; and (ii) an information-theoretic
index exploring the concept of mutual information between data partitions. Results on a variety of synthetic and real data sets
show that, while combined partitions are more robust than individual clusterings, no combination method proves to be a clear
winner. Furthermore, without a priori information, the mutual-information-based measure cannot systematically select the best
combination method for each problem, where optimality is measured against ground-truth information.
%0 Conference Paper
%1 Pris2004
%A Lourenco, Andre
%A Fred, Ana
%B 4th International Workshop on Pattern Recognition in Information Systems - PRIS 2004 ICEIS04
%D 2004
%K Clustering_Combination Clustering_ensembles Spectral_clustering comparative evaluation k-means
%T Comparison of combination methods using Spectral Clustering Ensembles
%X We address the problem of combining multiple data partitions, a set we call a clustering ensemble. We use a recent
clustering approach, known as Spectral Clustering, and the classical K-Means algorithm to produce the partitions that
constitute the clustering ensembles. A comparative evaluation of several combination methods is performed by measuring the
consistency between the combined data partition and (a) ground-truth information, and (b) the clustering ensemble. Two
consistency measures are used: (i) an index based on cluster matching between two partitions; and (ii) an information-theoretic
index exploring the concept of mutual information between data partitions. Results on a variety of synthetic and real data sets
show that, while combined partitions are more robust than individual clusterings, no combination method proves to be a clear
winner. Furthermore, without a priori information, the mutual-information-based measure cannot systematically select the best
combination method for each problem, where optimality is measured against ground-truth information.
@inproceedings{Pris2004,
abstract = {We address the problem of combining multiple data partitions, a set we call a clustering ensemble. We use a recent
clustering approach, known as Spectral Clustering, and the classical K-Means algorithm to produce the partitions that
constitute the clustering ensembles. A comparative evaluation of several combination methods is performed by measuring the
consistency between the combined data partition and (a) ground-truth information, and (b) the clustering ensemble. Two
consistency measures are used: (i) an index based on cluster matching between two partitions; and (ii) an information-theoretic
index exploring the concept of mutual information between data partitions. Results on a variety of synthetic and real data sets
show that, while combined partitions are more robust than individual clusterings, no combination method proves to be a clear
winner. Furthermore, without a priori information, the mutual-information-based measure cannot systematically select the best
combination method for each problem, where optimality is measured against ground-truth information.},
added-at = {2009-10-25T21:17:08.000+0100},
author = {Lourenco, Andre and Fred, Ana},
biburl = {https://www.bibsonomy.org/bibtex/22512fb438fe2bb9355532ac7c0ebf6e0/alourenco},
booktitle = {4th International Workshop on Pattern Recognition in Information Systems - PRIS 2004 ICEIS04},
interhash = {91130aae01e51557dbcdfc26702bceda},
intrahash = {2512fb438fe2bb9355532ac7c0ebf6e0},
keywords = {Clustering_Combination Clustering_ensembles Spectral_clustering comparative evaluation k-means},
timestamp = {2009-10-25T21:17:08.000+0100},
title = {Comparison of combination methods using Spectral Clustering Ensembles},
year = 2004
}