Abstract
Clustering is a division of data into groups of similar objects. Representing the
data by fewer clusters necessarily loses certain fine details, but achieves
simplification. Clustering thus models data by its clusters. Data modeling puts clustering in a
historical perspective rooted in mathematics, statistics, and numerical analysis.
From a machine learning perspective clusters correspond to hidden patterns, the
search for clusters is unsupervised learning, and the resulting system represents a
data concept. From a practical perspective clustering plays an outstanding role in
data mining applications such as scientific data exploration, information retrieval
and text mining, spatial database applications, Web analysis, CRM, marketing,
medical diagnostics, computational biology, and many others.
Clustering is the subject of active research in several fields such as statistics,
pattern recognition, and machine learning. This survey focuses on clustering in
data mining. Data mining adds to clustering the complications of very large
datasets with very many attributes of different types. This imposes unique
computational requirements on relevant clustering algorithms. A variety of
algorithms have recently emerged that meet these requirements and have been
successfully applied to real-life data mining problems. They are the subject of this
survey.
Description
Survey Of Clustering Data Mining Techniques - Berkhin (ResearchIndex)