
The information bottleneck method

N. Tishby, F. Pereira, and W. Bialek.
(2000). arXiv:physics/0004057.

Abstract

We define the relevant information in a signal $x \in X$ as being the information that this signal provides about another signal $y \in Y$. Examples include the information that face images provide about the names of the people portrayed, or the information that speech sounds provide about the words spoken. Understanding the signal $x$ requires more than just predicting $y$, it also requires specifying which features of $X$ play a role in the prediction. We formalize this problem as that of finding a short code for $X$ that preserves the maximum information about $Y$. That is, we squeeze the information that $X$ provides about $Y$ through a `bottleneck' formed by a limited set of codewords $\tilde{X}$. This constrained optimization problem can be seen as a generalization of rate distortion theory in which the distortion measure $d(x,\tilde{x})$ emerges from the joint statistics of $X$ and $Y$. This approach yields an exact set of self consistent equations for the coding rules $X \to \tilde{X}$ and $\tilde{X} \to Y$. Solutions to these equations can be found by a convergent re-estimation method that generalizes the Blahut-Arimoto algorithm. Our variational principle provides a surprisingly rich framework for discussing a variety of problems in signal processing and learning, as will be described in detail elsewhere.
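
The abstract mentions the self consistent equations and the re-estimation method without writing them out. The following is a minimal sketch, not the authors' code, assuming the form in which these updates are commonly stated: $p(t|x) \propto p(t)\exp(-\beta\, D_{KL}[p(y|x)\,\|\,p(y|t)])$, $p(t) = \sum_x p(x)\,p(t|x)$, and $p(y|t) = \sum_x p(y|x)\,p(t|x)\,p(x)/p(t)$, where $t$ stands for a codeword in $\tilde{X}$. The function names, the trade-off parameter `beta`, the iteration count, and the random initialization are all illustrative assumptions.

```python
import math
import random


def mutual_information(p_ab):
    """I(A;B) in nats for a joint distribution given as a nested list."""
    p_a = [sum(row) for row in p_ab]
    p_b = [sum(col) for col in zip(*p_ab)]
    return sum(
        p * math.log(p / (p_a[i] * p_b[j]))
        for i, row in enumerate(p_ab)
        for j, p in enumerate(row)
        if p > 0
    )


def information_bottleneck(p_xy, n_codewords, beta, n_iter=200, seed=0):
    """Alternate the three updates; returns (p(t|x), p(t), p(y|t)).

    Hypothetical helper: p_xy[i][j] is the joint p(x_i, y_j) and every
    marginal p(x_i) is assumed to be strictly positive.
    """
    n_x, n_y = len(p_xy), len(p_xy[0])
    p_x = [sum(row) for row in p_xy]
    p_y_x = [[p / p_x[i] for p in row] for i, row in enumerate(p_xy)]
    tiny = 1e-300  # guards against log(0) and division by zero

    def decoder(q):
        # p(t) = sum_x p(x) p(t|x)
        p_t = [sum(p_x[i] * q[i][t] for i in range(n_x))
               for t in range(n_codewords)]
        # p(y|t) = sum_x p(y|x) p(t|x) p(x) / p(t)
        p_y_t = [
            [sum(p_y_x[i][j] * q[i][t] * p_x[i] for i in range(n_x))
             / max(p_t[t], tiny) for j in range(n_y)]
            for t in range(n_codewords)
        ]
        return p_t, p_y_t

    def encoder(p_t, p_y_t):
        # p(t|x) proportional to p(t) * exp(-beta * KL[p(y|x) || p(y|t)])
        q = []
        for i in range(n_x):
            log_w = []
            for t in range(n_codewords):
                kl = sum(p * math.log(p / max(p_y_t[t][j], tiny))
                         for j, p in enumerate(p_y_x[i]) if p > 0)
                log_w.append(math.log(max(p_t[t], tiny)) - beta * kl)
            top = max(log_w)  # subtract the max before exponentiating
            w = [math.exp(v - top) for v in log_w]
            total = sum(w)
            q.append([v / total for v in w])
        return q

    # Random soft assignment p(t|x) as the starting point.
    rng = random.Random(seed)
    q = []
    for _ in range(n_x):
        w = [rng.random() + 1e-3 for _ in range(n_codewords)]
        total = sum(w)
        q.append([v / total for v in w])

    for _ in range(n_iter):
        p_t, p_y_t = decoder(q)
        q = encoder(p_t, p_y_t)
    p_t, p_y_t = decoder(q)  # make the returned marginals match the final q
    return q, p_t, p_y_t


if __name__ == "__main__":
    # Toy joint distribution: four x values, two y values.
    p_xy = [[0.225, 0.025], [0.2, 0.05], [0.05, 0.2], [0.025, 0.225]]
    q, p_t, p_y_t = information_bottleneck(p_xy, n_codewords=2, beta=20.0)
    p_ty = [[p_t[t] * p for p in p_y_t[t]] for t in range(len(p_t))]
    print("I(X;Y) =", mutual_information(p_xy))
    print("I(T;Y) =", mutual_information(p_ty))
```

In this sketch, a larger `beta` favors preserving information about $Y$, while a smaller `beta` favors a more compressed code. Because the codeword depends on $Y$ only through $X$, the information the code retains about $Y$ cannot exceed what $X$ itself provides.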

BibTeX key: tishby2000information
entry type: misc
year: 2000
url: http://arxiv.org/abs/physics/0004057
note: cite arxiv:physics/0004057
