Abstract
We show that the moment generating function of the Kullback-Leibler
divergence between the empirical distribution of $n$ independent samples from a
distribution $P$ over a finite alphabet of size $k$ (e.g. a multinomial
distribution) and $P$ itself is no more than that of a gamma distribution with
shape $k - 1$ and rate $n$. The resulting exponential concentration inequality
becomes meaningful (less than 1) when the divergence $\varepsilon$ is larger
than $(k-1)/n$, whereas the standard method of types bound requires
$\varepsilon > \frac{1}{n} \log \binom{n+k-1}{k-1} \gtrsim (k-1)/n \cdot
\log(1 + n/(k-1))$, thus saving a factor of order $\log(n/k)$ in the standard
regime of parameters where $n \gg k$. Our proof proceeds via a simple reduction
to the case $k = 2$ of a binary alphabet (e.g. a binomial distribution), and
has the property that improvements in the case of $k = 2$ directly translate to
improvements for general $k$.
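The threshold comparison above can be checked numerically. The sketch below (an illustration, not from the paper; the function names are made up) computes the smallest divergence $\varepsilon$ at which each bound drops below 1: roughly $(k-1)/n$ for the gamma-MGF bound, versus $\frac{1}{n}\log\binom{n+k-1}{k-1}$ for the method of types, whose counting prefactor $\binom{n+k-1}{k-1}$ is the number of types.

```python
# Hypothetical numerical check (not part of the paper): compare the point
# at which each tail bound becomes meaningful (i.e. drops below 1).
import math

def gamma_bound_threshold(n: int, k: int) -> float:
    # A Gamma(shape = k - 1, rate = n) tail bound is nontrivial once
    # epsilon exceeds roughly its mean, (k - 1) / n.
    return (k - 1) / n

def method_of_types_threshold(n: int, k: int) -> float:
    # Method of types: P(KL >= eps) <= C(n+k-1, k-1) * exp(-n * eps),
    # which is below 1 only when eps > (1/n) * log C(n+k-1, k-1).
    return math.log(math.comb(n + k - 1, k - 1)) / n

n, k = 10_000, 10  # the regime n >> k discussed in the abstract
t_gamma = gamma_bound_threshold(n, k)
t_types = method_of_types_threshold(n, k)
print(t_gamma, t_types, t_types / t_gamma)
# The ratio of the two thresholds grows like log(1 + n/(k-1)),
# i.e. on the order of log(n/k) when n >> k.
```

For these parameters the method-of-types threshold is several times larger, and the gap tracks $\log(1 + n/(k-1))$ as claimed.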