
A Novel Dencos Model For High Dimensional Data Using Genetic Algorithms

International Journal of Computer Science, Engineering and Information Technology (IJCSEIT), 2 (1): 35-43 (February 2012)
DOI: 10.5121/ijcseit.2012.2104

Abstract

Subspace clustering is an emerging task that aims at detecting clusters embedded in subspaces. Recent approaches fail to reduce their results to the relevant subspace clusters: the results are typically highly redundant, and the approaches do not address a critical problem, "the density divergence problem," because they use a single absolute density value as the threshold for identifying dense regions in all subspaces. Because region densities vary across subspace cardinalities, we note that a more appropriate way to decide whether a region in a subspace is dense is to compare its density with the densities of the other regions in that subspace. Based on this idea, and because previous techniques cannot feasibly be applied to this novel clustering model, we devise an innovative algorithm, referred to as DENCOS (DENsity COnscious Subspace clustering), which adopts a divide-and-conquer scheme to efficiently discover clusters that satisfy different density thresholds in different subspace cardinalities. DENCOS can discover high-quality clusters in all subspaces, and, as validated by our extensive experiments on a retail dataset, its efficiency significantly exceeds that of previous works, demonstrating its practicability for subspace clustering. We extend our work with a clustering technique based on genetic algorithms that is capable of optimizing the number of clusters for tasks with well-formed and well-separated clusters.
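
The density divergence problem described in the abstract can be illustrated with a minimal sketch. It is not the DENCOS algorithm itself. It assumes points in the unit hypercube split into an equal-width grid, and it assumes a threshold of `alpha` times the average unit count for a given subspace cardinality. The names `bins`, `alpha`, `unit_counts`, `relative_threshold`, and `dense_units` are hypothetical, and the threshold formula is an assumption not stated in the abstract.

```python
from collections import Counter


def unit_counts(points, dims, bins):
    """Count points per grid unit in the subspace spanned by `dims`.

    Assumes every coordinate lies in [0, 1) and each dimension is split
    into `bins` equal-width intervals (an assumption for this sketch).
    """
    return Counter(tuple(int(p[d] * bins) for d in dims) for p in points)


def relative_threshold(n_points, bins, k, alpha):
    """Hypothetical threshold for subspaces of cardinality k.

    The average unit count n_points / bins**k shrinks as k grows, so the
    threshold is scaled with it instead of being one absolute value.
    """
    return alpha * n_points / bins ** k


def dense_units(points, dims, bins, alpha):
    """Return the units whose count reaches the cardinality-aware threshold."""
    tau = relative_threshold(len(points), bins, len(dims), alpha)
    counts = unit_counts(points, dims, bins)
    return {unit: c for unit, c in counts.items() if c >= tau}


if __name__ == "__main__":
    # Toy data: 30 points in one corner, 70 spread along the top edge.
    pts = [(0.05, 0.05)] * 30 + [(0.15 + 0.1 * (i % 8), 0.95) for i in range(70)]
    for dims in [(1,), (0, 1)]:
        tau = relative_threshold(len(pts), 10, len(dims), 5.0)
        print(dims, "threshold:", tau, "dense:", dense_units(pts, dims, 10, 5.0))
```

In this toy data, the same 30-point group falls below the threshold in the one-dimensional subspace but exceeds it in the two-dimensional one, which is the effect a single absolute threshold cannot capture.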
