Alternatives to the k-means algorithm that find better clusterings
G. Hamerly, and C. Elkan. CIKM '02: Proceedings of the eleventh international conference on Information and knowledge management, pp. 600--607. New York, NY, USA, ACM, (2002)
DOI: 10.1145/584792.584890
Abstract
We investigate here the behavior of the standard k-means clustering algorithm and several alternatives to it: the k-harmonic means algorithm due to Zhang and colleagues, fuzzy k-means, Gaussian expectation-maximization, and two new variants of k-harmonic means. Our aim is to find which aspects of these algorithms contribute to finding good clusterings, as opposed to converging to a low-quality local optimum. We describe each algorithm in a unified framework that introduces separate cluster membership and data weight functions. We then show that the algorithms do behave very differently from each other on simple low-dimensional synthetic datasets and image segmentation tasks, and that the k-harmonic means method is superior. Having a soft membership function is essential for finding high-quality clusterings, but having a non-constant data weight function is useful also.
Description
Alternatives to the k-means algorithm that find better clusterings
@inproceedings{584890,
abstract = {We investigate here the behavior of the standard k-means clustering algorithm and several alternatives to it: the k-harmonic means algorithm due to Zhang and colleagues, fuzzy k-means, Gaussian expectation-maximization, and two new variants of k-harmonic means. Our aim is to find which aspects of these algorithms contribute to finding good clusterings, as opposed to converging to a low-quality local optimum. We describe each algorithm in a unified framework that introduces separate cluster membership and data weight functions. We then show that the algorithms do behave very differently from each other on simple low-dimensional synthetic datasets and image segmentation tasks, and that the k-harmonic means method is superior. Having a soft membership function is essential for finding high-quality clusterings, but having a non-constant data weight function is useful also.},
added-at = {2010-10-13T07:59:03.000+0200},
address = {New York, NY, USA},
author = {Hamerly, Greg and Elkan, Charles},
biburl = {https://www.bibsonomy.org/bibtex/2f0d375b3eef2c34db4190bb9a2a0f527/cdevries},
booktitle = {CIKM '02: Proceedings of the eleventh international conference on Information and knowledge management},
description = {Alternatives to the k-means algorithm that find better clusterings},
doi = {10.1145/584792.584890},
interhash = {41a8806f8b6d98a660a0f46e4559998d},
intrahash = {f0d375b3eef2c34db4190bb9a2a0f527},
isbn = {1-58113-492-4},
keywords = {algorithm kmeans machinelearning},
location = {McLean, Virginia, USA},
pages = {600--607},
publisher = {ACM},
timestamp = {2010-10-13T07:59:03.000+0200},
title = {Alternatives to the k-means algorithm that find better clusterings},
url = {http://portal.acm.org/citation.cfm?id=584890},
year = 2002
}