Abstract
One of the core problems of modern statistics is to approximate
difficult-to-compute probability densities. This problem is especially
important in Bayesian statistics, which frames all inference about unknown
quantities as a calculation involving the posterior density. In this paper, we
review variational inference (VI), a method from machine learning that
approximates probability densities through optimization. VI has been used in
many applications and tends to be faster than classical methods, such as Markov
chain Monte Carlo sampling. The idea behind VI is to first posit a family of
densities and then to find the member of that family that is closest to the
target, where closeness is measured by Kullback-Leibler divergence. We review the
ideas behind mean-field variational inference, discuss the special case of VI
applied to exponential family models, present a full example with a Bayesian
mixture of Gaussians, and derive a variant that uses stochastic optimization to
scale up to massive data. We discuss modern research in VI and highlight
important open problems. VI is powerful, but it is not yet well understood. Our
hope in writing this paper is to catalyze statistical research on this class of
algorithms.
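The core idea above, positing a variational family and optimizing it toward the target, can be sketched in a few lines. The example below is an illustrative assumption, not taken from the paper: it fits a Gaussian variational family to a hypothetical Gaussian target (mean 3, variance 0.5) by ascending Monte Carlo gradients of the ELBO via the reparameterization trick. The target density, learning rate, and sample count are all choices made for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical target: an unnormalized Gaussian posterior with mean 3, variance 0.5.
MU_TRUE, VAR_TRUE = 3.0, 0.5

def dlogp_dz(z):
    # Gradient of the target log density with respect to z.
    return -(z - MU_TRUE) / VAR_TRUE

# Variational family: q(z) = Normal(mu, exp(log_sigma)^2).
mu, log_sigma = 0.0, 0.0
lr, n_samples = 0.05, 64

for step in range(2000):
    sigma = np.exp(log_sigma)
    eps = rng.standard_normal(n_samples)
    z = mu + sigma * eps            # reparameterization trick
    g = dlogp_dz(z)
    # Monte Carlo gradients of the ELBO; the Gaussian entropy term
    # contributes +1 to the log_sigma gradient.
    grad_mu = g.mean()
    grad_log_sigma = (g * sigma * eps).mean() + 1.0
    mu += lr * grad_mu
    log_sigma += lr * grad_log_sigma

# q should approach the target: mu near 3.0, sigma near sqrt(0.5).
print(mu, np.exp(log_sigma))
```

Because the target here is itself Gaussian, the optimum of the KL objective recovers it exactly; for non-Gaussian targets the fitted q would be the closest Gaussian in KL divergence.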