BibSonomy :: bibtex  ::

marcoalvarez's BibTeX entry:  

Density-based clustering over an evolving data stream with noise

SIAM International Conference on Data Mining, 2006.
Authors: Feng Cao, Martin Ester, Weining Qian, and Aoying Zhou
URL: http://www.siam.org/meetings/sdm06/proceedings/030caof.pdf
Tags: Clustering DataStream
Abstract: Clustering is an important task in mining evolving data streams. Beside the limited memory and one-pass constraints, the nature of evolving data streams implies the following requirements for stream clustering: no assumption on the number of clusters, discovery of clusters with arbitrary shape and ability to handle outliers. While a lot of clustering algorithms for data streams have been proposed, they offer no solution to the combination of these requirements. In this paper, we present DenStream, a new approach for discovering clusters in an evolving data stream. The "dense" micro-cluster (named core-micro-cluster) is introduced to summarize the clusters with arbitrary shape, while the potential core-micro-cluster and outlier micro-cluster structures are proposed to maintain and distinguish the potential clusters and outliers. A novel pruning strategy is designed based on these concepts, which guarantees the precision of the weights of the micro-clusters with limited memory. Our performance study over a number of real and synthetic data sets demonstrates the effectiveness and efficiency of our method.
@inproceedings{Cao2006,
title = {Density-based clustering over an evolving data stream with noise},
author = {Feng Cao and Martin Ester and Weining Qian and Aoying Zhou},
booktitle = {SIAM International Conference on Data Mining},
url = {http://www.siam.org/meetings/sdm06/proceedings/030caof.pdf},
year = {2006},
abstract = {Clustering is an important task in mining evolving data streams. Beside the limited memory and one-pass constraints, the nature of evolving data streams implies the following requirements for stream clustering: no assumption on the number of clusters, discovery of clusters with arbitrary shape and ability to handle outliers. While a lot of clustering algorithms for data streams have been proposed, they offer no solution to the combination of these requirements. In this paper, we present DenStream, a new approach for discovering clusters in an evolving data stream. The ``dense'' micro-cluster (named core-micro-cluster) is introduced to summarize the clusters with arbitrary shape, while the potential core-micro-cluster and outlier micro-cluster structures are proposed to maintain and distinguish the potential clusters and outliers. A novel pruning strategy is designed based on these concepts, which guarantees the precision of the weights of the micro-clusters with limited memory. Our performance study over a number of real and synthetic data sets demonstrates the effectiveness and efficiency of our method.},
keywords = {Clustering DataStream}
}