Techreport

Survey Of Clustering Data Mining Techniques

P. Berkhin.
(2002)

Abstract

Clustering is a division of data into groups of similar objects. Representing the data by fewer clusters necessarily loses certain fine details, but achieves simplification. It models data by its clusters. Data modeling puts clustering in a historical perspective rooted in mathematics, statistics and numerical analysis. From a machine learning perspective, clusters correspond to hidden patterns, the search for clusters is unsupervised learning, and the resulting system represents a data concept. From a practical perspective, clustering plays an outstanding role in data mining applications, Web analysis, CRM, marketing, medical diagnostics, computational biology, and many others. Clustering is the subject of active research in several fields such as statistics, pattern recognition and machine learning. This survey focuses on clustering in data mining. Data mining adds to clustering the complications of very large datasets with very many attributes of different types. This imposes unique computational requirements on relevant clustering algorithms. A variety of algorithms have recently emerged that meet these requirements and were successfully applied to real-life data mining problems. They are the subject of the survey.

BibTeX key: Berkhin02surveyof
entry type: techreport
year: 2002
url: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.18.3739
