Inproceedings,

OPTICS: Ordering Points to Identify the Clustering Structure

M. Ankerst, M. Breunig, H. Kriegel, and J. Sander.
International Conference on Management of Data and Symposium on Principles of Database Systems Philadelphia (SIGMOD/PODS 1999), PA, USA - May 31 - June 03, 1999, page 49-60. New York, NY, USA, ACM, (1999)
DOI: 10.1145/304181.304187

Abstract

Cluster analysis is a primary method for database mining. It is either used as a stand-alone tool to get insight into the distribution of a data set, e.g. to focus further analysis and data processing, or as a preprocessing step for other algorithms operating on the detected clusters. Almost all of the well-known clustering algorithms require input parameters which are hard to determine but have a significant influence on the clustering result. Furthermore, for many real-data sets there does not even exist a global parameter setting for which the result of the clustering algorithm describes the intrinsic clustering structure accurately. We introduce a new algorithm for the purpose of cluster analysis which does not produce a clustering of a data set explicitly; but instead creates an augmented ordering of the database representing its density-based clustering structure. This cluster-ordering contains information which is equivalent to the density-based clusterings corresponding to a broad range of parameter settings. It is a versatile basis for both automatic and interactive cluster analysis. We show how to automatically and efficiently extract not only 'traditional' clustering information (e.g. representative points, arbitrary shaped clusters), but also the intrinsic clustering structure. For medium sized data sets, the cluster-ordering can be represented graphically and for very large data sets, we introduce an appropriate visualization technique. Both are suitable for interactive exploration of the intrinsic clustering structure offering additional insights into the distribution and correlation of the data.
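
The "augmented ordering" described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it is a brute-force O(n^2) version, assuming Euclidean distance, and the names optics_ordering, eps and min_pts are chosen here for illustration. Each point receives a reachability value, and points are output in the order in which they become most cheaply reachable from the points already processed. An undefined reachability is represented as infinity.

```python
import math

def optics_ordering(points, eps, min_pts):
    """Return (order, reach): the cluster ordering and reachability per point."""
    n = len(points)
    dist = lambda i, j: math.dist(points[i], points[j])
    reach = [math.inf] * n          # inf stands in for "undefined"
    processed = [False] * n
    order = []

    def neighbors(i):
        # eps-neighborhood by linear scan; includes the point itself
        return [j for j in range(n) if dist(i, j) <= eps]

    def core_distance(i, nbrs):
        # distance to the min_pts-th nearest neighbor, or None if i is not a core point
        if len(nbrs) < min_pts:
            return None
        return sorted(dist(i, j) for j in nbrs)[min_pts - 1]

    for start in range(n):
        if processed[start]:
            continue
        seeds = {start}
        while seeds:
            # take the seed with the smallest reachability (ties broken by index)
            i = min(seeds, key=lambda j: (reach[j], j))
            seeds.remove(i)
            processed[i] = True
            order.append(i)
            nbrs = neighbors(i)
            core = core_distance(i, nbrs)
            if core is None:
                continue
            for j in nbrs:
                if not processed[j]:
                    new_reach = max(core, dist(i, j))
                    if new_reach < reach[j]:
                        reach[j] = new_reach
                        seeds.add(j)
    return order, reach

pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
order, reach = optics_ordering(pts, eps=2.0, min_pts=3)
for i in order:
    print(i, reach[i])
```

Plotting reach in the output order gives a reachability plot: valleys correspond to dense regions, and a jump to a large or undefined value marks the start of a new cluster candidate.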

BibTeX key: ankerst-1999
entry type: inproceedings
address: New York, NY, USA
booktitle: International Conference on Management of Data and Symposium on Principles of Database Systems Philadelphia (SIGMOD/PODS 1999), PA, USA - May 31 - June 03, 1999
year: 1999
pages: 49-60
publisher: ACM
isbn: 1-58113-084-8
DOI: 10.1145/304181.304187

Comments and Reviews

@cscholz 13 years ago
Extension of DBSCAN. Nice to read, a lot of new approaches, very detailed.

Cite this publication

%0 Conference Paper
%1 ankerst-1999
%A Ankerst, Mihael
%A Breunig, Markus M.
%A Kriegel, Hans-Peter
%A Sander, Jörg
%B International Conference on Management of Data and Symposium on Principles of Database Systems Philadelphia (SIGMOD/PODS 1999), PA, USA - May 31 - June 03, 1999
%C New York, NY, USA
%D 1999
%E Delis, Alex
%E Faloutsos, Christos
%E Ghandeharizadeh, Shahram
%I ACM
%K
%P 49-60
%R 10.1145/304181.304187
%T OPTICS: Ordering Points to Identify the Clustering Structure
%X Cluster analysis is a primary method for database mining. It is either used as a stand-alone tool to get insight into the distribution of a data set, e.g. to focus further analysis and data processing, or as a preprocessing step for other algorithms operating on the detected clusters. Almost all of the well-known clustering algorithms require input parameters which are hard to determine but have a significant influence on the clustering result. Furthermore, for many real-data sets there does not even exist a global parameter setting for which the result of the clustering algorithm describes the intrinsic clustering structure accurately. We introduce a new algorithm for the purpose of cluster analysis which does not produce a clustering of a data set explicitly; but instead creates an augmented ordering of the database representing its density-based clustering structure. This cluster-ordering contains information which is equivalent to the density-based clusterings corresponding to a broad range of parameter settings. It is a versatile basis for both automatic and interactive cluster analysis. We show how to automatically and efficiently extract not only 'traditional' clustering information (e.g. representative points, arbitrary shaped clusters), but also the intrinsic clustering structure. For medium sized data sets, the cluster-ordering can be represented graphically and for very large data sets, we introduce an appropriate visualization technique. Both are suitable for interactive exploration of the intrinsic clustering structure offering additional insights into the distribution and correlation of the data.
%@ 1-58113-084-8

@inproceedings{ankerst-1999,
  abstract = {Cluster analysis is a primary method for database mining. It is either used as a stand-alone tool to get insight into the distribution of a data set, e.g. to focus further analysis and data processing, or as a preprocessing step for other algorithms operating on the detected clusters. Almost all of the well-known clustering algorithms require input parameters which are hard to determine but have a significant influence on the clustering result. Furthermore, for many real-data sets there does not even exist a global parameter setting for which the result of the clustering algorithm describes the intrinsic clustering structure accurately. We introduce a new algorithm for the purpose of cluster analysis which does not produce a clustering of a data set explicitly; but instead creates an augmented ordering of the database representing its density-based clustering structure. This cluster-ordering contains information which is equivalent to the density-based clusterings corresponding to a broad range of parameter settings. It is a versatile basis for both automatic and interactive cluster analysis. We show how to automatically and efficiently extract not only 'traditional' clustering information (e.g. representative points, arbitrary shaped clusters), but also the intrinsic clustering structure. For medium sized data sets, the cluster-ordering can be represented graphically and for very large data sets, we introduce an appropriate visualization technique. Both are suitable for interactive exploration of the intrinsic clustering structure offering additional insights into the distribution and correlation of the data.},
  added-at = {2011-12-11T15:28:04.000+0100},
  address = {New York, NY, USA},
  author = {Ankerst, Mihael and Breunig, Markus M. and Kriegel, Hans-Peter and Sander, Jörg},
  biburl = {https://www.bibsonomy.org/bibtex/2a22cc1d26329bb7879db456e6ca1f515/bjoern},
  booktitle = {International Conference on Management of Data and Symposium on Principles of Database Systems Philadelphia (SIGMOD/PODS 1999), PA, USA - May 31 - June 03, 1999},
  doi = {10.1145/304181.304187},
  editor = {Delis, Alex and Faloutsos, Christos and Ghandeharizadeh, Shahram},
  interhash = {7417e17c0e8eec9f1a9f2bc57a476b15},
  intrahash = {a22cc1d26329bb7879db456e6ca1f515},
  isbn = {1-58113-084-8},
  keywords = {},
  pages = {49-60},
  publisher = {ACM},
  timestamp = {2011-12-11T15:28:04.000+0100},
  title = {OPTICS: Ordering Points to Identify the Clustering Structure},
  year = 1999
}
