S. Shah and V. Koltun. (2018). Deep Continuous Clustering. arXiv:1803.01449. The code is available at http://github.com/shahsohil/DCC.
@misc{shah2018continuous,
abstract = {Clustering high-dimensional datasets is hard because interpoint distances
become less informative in high-dimensional spaces. We present a clustering
algorithm that performs nonlinear dimensionality reduction and clustering
jointly. The data is embedded into a lower-dimensional space by a deep
autoencoder. The autoencoder is optimized as part of the clustering process.
The resulting network produces clustered data. The presented approach does not
rely on prior knowledge of the number of ground-truth clusters. Joint nonlinear
dimensionality reduction and clustering are formulated as optimization of a
global continuous objective. We thus avoid discrete reconfigurations of the
objective that characterize prior clustering algorithms. Experiments on
datasets from multiple domains demonstrate that the presented algorithm
outperforms state-of-the-art clustering schemes, including recent methods that
use deep networks.},
added-at = {2018-03-06T10:44:16.000+0100},
author = {Shah, Sohil Atul and Koltun, Vladlen},
biburl = {https://www.bibsonomy.org/bibtex/2689de7ae7ab1b94a18c7d038e42e6c8a/jk_itwm},
description = {Deep Continuous Clustering},
interhash = {3b66fbe10abef9cd5fe3f0b184faa70f},
intrahash = {689de7ae7ab1b94a18c7d038e42e6c8a},
keywords = {clustering to_read unsupervised},
note = {arXiv:1803.01449. The code is available at http://github.com/shahsohil/DCC},
timestamp = {2018-03-06T10:44:16.000+0100},
title = {Deep Continuous Clustering},
url = {http://arxiv.org/abs/1803.01449},
year = 2018
}