Abstract
Many popular representation-learning algorithms use training objectives
defined on the observed data space, which we call pixel-level. This may be
detrimental when only a small fraction of the bits of signal actually matter at
a semantic level. We hypothesize that representations should be learned and
evaluated more directly in terms of their information content and statistical
or structural constraints. To address the first quality, we consider learning
unsupervised representations by maximizing mutual information between part or
all of the input and a high-level feature vector. To address the second, we
control characteristics of the representation by matching to a prior
adversarially. Our method, which we call Deep InfoMax (DIM), can be used to
learn representations with desired characteristics and which empirically
outperform a number of popular unsupervised learning methods on classification
tasks. DIM opens new avenues for unsupervised learning of representations and
is an important step towards flexible formulations of representation learning
objectives catered towards specific end-goals.
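The abstract describes maximizing mutual information between the input (or parts of it) and a feature vector. A common way to make such an objective tractable is a Jensen-Shannon-style lower bound scored by a critic that distinguishes matched (input, feature) pairs from mismatched ones. The sketch below is a minimal, hypothetical illustration of that bound on toy scores; the critic outputs, pair construction, and constants are assumptions, not the paper's implementation.

```python
import numpy as np

def softplus(x):
    # Numerically stable log(1 + exp(x)).
    return np.logaddexp(0.0, x)

def jsd_mi_estimate(pos_scores, neg_scores):
    """Jensen-Shannon-based mutual-information lower bound (up to a constant):
    E[-softplus(-T(x, y))] over matched pairs minus E[softplus(T(x', y))]
    over mismatched pairs, where T is a learned critic (assumed here)."""
    return float(np.mean(-softplus(-np.asarray(pos_scores)))
                 - np.mean(softplus(np.asarray(neg_scores))))

rng = np.random.default_rng(0)
# Hypothetical critic scores: high on matched (input, feature) pairs,
# low when the feature is paired with a different input.
pos = rng.normal(loc=2.0, scale=0.5, size=1024)
neg = rng.normal(loc=-2.0, scale=0.5, size=1024)
print(jsd_mi_estimate(pos, neg))
```

The estimate grows as the critic separates matched from mismatched pairs, which is what gradient ascent on this bound encourages; a critic that cannot tell the pairs apart yields a lower value.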