Abstract
Bayesian methods for machine learning have been widely investigated, yielding
principled methods for incorporating prior information into inference
algorithms. In this survey, we provide an in-depth review of the role of
Bayesian methods for the reinforcement learning (RL) paradigm. The major
incentives for incorporating Bayesian reasoning in RL are: 1) it provides an
elegant approach to action-selection (exploration/exploitation) as a function
of the uncertainty in learning; and 2) it provides a machinery to incorporate
prior knowledge into the algorithms. We first discuss models and methods for
Bayesian inference in the simple single-step Bandit model. We then review the
extensive recent literature on Bayesian methods for model-based RL, where prior
information can be expressed on the parameters of the Markov model. We also
present Bayesian methods for model-free RL, where priors are expressed over the
value function or policy class. The objective of the paper is to provide a
comprehensive survey on Bayesian RL algorithms and their theoretical and
empirical properties.
Users
Please
log in to take part in the discussion (add own reviews or comments).