Inproceedings,

An Information-Theoretical Approach to Clustering Categorical Databases using Genetic Algorithms

D. Cristofor and D. Simovici.
In 2nd SIAM ICDM, Workshop on clustering high dimensional data, pages 37--46. (2002)

Abstract

Clustering categorical databases presents special difficulties due to the absence of natural dissimilarities between objects. We present a solution that overcomes these difficulties. It is based on an information-theoretical definition of dissimilarity between partitions of finite sets, applied to the partitions of the set of objects to be clustered that are determined by the categorical attributes, and it uses genetic algorithms to find an acceptable approximate clustering. We tested our method on databases for which the clustering of the rows is known in advance, and we show that it finds the natural clustering of the data with a good classification rate, better than that of the classical k-means algorithm.
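
As a rough illustration of the idea in the abstract, the sketch below treats each categorical attribute as a partition of the rows and scores a candidate clustering by its summed entropy-based distance to those partitions. The abstract does not state the paper's exact dissimilarity, so the Shannon-entropy distance d(P, Q) = H(P|Q) + H(Q|P) used here is an assumption, and the names `partition_by_attribute`, `partition_distance`, and `clustering_cost` are hypothetical. A genetic algorithm would then search for the clustering that minimizes such a cost; that search is not shown.

```python
from collections import Counter
from math import log2


def partition_by_attribute(rows, attr):
    """Label each row by its value of one categorical attribute.

    Rows sharing a label fall in the same block of the partition.
    """
    return [row[attr] for row in rows]


def entropy(labels):
    """Shannon entropy (in bits) of the partition induced by the labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def partition_distance(p, q):
    """Assumed stand-in distance: H(P|Q) + H(Q|P) = 2*H(P,Q) - H(P) - H(Q)."""
    joint = list(zip(p, q))  # blocks of the common refinement of P and Q
    return 2 * entropy(joint) - entropy(p) - entropy(q)


def clustering_cost(rows, attrs, clustering):
    """Sum of distances from a candidate clustering to each attribute partition."""
    return sum(
        partition_distance(clustering, partition_by_attribute(rows, a))
        for a in attrs
    )


# Hypothetical toy data: two attributes that induce the same two-block partition.
rows = [
    {"color": "red", "shape": "round"},
    {"color": "red", "shape": "round"},
    {"color": "blue", "shape": "square"},
    {"color": "blue", "shape": "square"},
]
aligned = [0, 0, 1, 1]  # agrees with both attribute partitions
crossed = [0, 1, 0, 1]  # cuts across both attribute partitions
```

With this toy data, `clustering_cost(rows, ["color", "shape"], aligned)` is lower than the cost of `crossed`, which is the kind of fitness signal a genetic search could use.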

BibTeX key: Cristofor02aninformation-theoretical
entry type: inproceedings
booktitle: In 2nd SIAM ICDM, Workshop on clustering high dimensional data
year: 2002
pages: 37--46
url: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.20.4219
