Abstract
We consider the problem of estimating Shannon's entropy $H$ from discrete
data, in cases where the number of possible symbols is unknown or even
countably infinite. The Pitman-Yor process, a generalization of the Dirichlet
process, provides a tractable prior distribution over the space of countably
infinite discrete distributions, and has found major applications in Bayesian
non-parametric statistics and machine learning. Here we show that it also
provides a natural family of priors for Bayesian entropy estimation, due to the
fact that moments of the induced posterior distribution over $H$ can be
computed analytically. We derive formulas for the posterior mean (Bayes' least
squares estimate) and variance under Dirichlet and Pitman-Yor process priors.
Moreover, we show that a fixed Dirichlet or Pitman-Yor process prior implies a
narrow prior distribution over $H$, meaning the prior strongly determines the
entropy estimate in the under-sampled regime. We derive a family of continuous
mixing measures such that the resulting mixture of Pitman-Yor processes
produces an approximately flat prior over $H$. We show that the Pitman-Yor
Mixture (PYM) entropy estimator obtained from this prior is consistent for a
large class of distributions. We explore the theoretical properties of this
estimator, and show that it performs well both in simulation and in application
to real data.
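
As a point of reference for what an analytically computable posterior mean of $H$ looks like, the following minimal sketch evaluates it in the simpler finite-alphabet case, assuming a symmetric Dirichlet prior with concentration `alpha` over a known number of symbols. This is not the Dirichlet process or Pitman-Yor formula derived in the paper; it is the standard closed form for a finite Dirichlet posterior, $E[H \mid n] = \psi(A + 1) - \sum_i \frac{a_i}{A}\,\psi(a_i + 1)$ with $a_i = n_i + \alpha$ and $A = \sum_i a_i$, in nats. The function names `digamma` and `dirichlet_posterior_mean_entropy` are hypothetical and chosen only for illustration.

```python
import math


def digamma(x):
    """Digamma function for x > 0, via recurrence plus an asymptotic series."""
    result = 0.0
    # Shift x upward with psi(x) = psi(x + 1) - 1/x until the series is accurate.
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    # psi(x) ~ ln x - 1/(2x) - 1/(12 x^2) + 1/(120 x^4) - 1/(252 x^6)
    result += math.log(x) - 0.5 / x
    result -= inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result


def dirichlet_posterior_mean_entropy(counts, alpha):
    """Posterior mean of Shannon entropy (in nats).

    Assumes a symmetric Dirichlet(alpha) prior over a finite alphabet whose
    size equals len(counts); this is an illustrative simplification, not the
    countably infinite setting treated in the paper.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    posterior = [n + alpha for n in counts]  # posterior Dirichlet parameters
    total = sum(posterior)
    weighted = sum(a / total * digamma(a + 1.0) for a in posterior)
    return digamma(total + 1.0) - weighted


if __name__ == "__main__":
    # With no data and alpha = 1 on two symbols, the prior mean entropy is 1/2 nat.
    print(dirichlet_posterior_mean_entropy([0, 0], 1.0))
    # With many balanced observations, the estimate approaches ln 2.
    print(dirichlet_posterior_mean_entropy([1000, 1000], 1.0))
```

With few observations the output is governed largely by `alpha`, which illustrates on a small scale why a fixed prior can dominate the entropy estimate in the under-sampled regime.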