Regret Bounds for Batched Bandits

Abstract

We present simple and efficient algorithms for the batched stochastic multi-armed bandit and batched stochastic linear bandit problems. We prove bounds for their expected regrets that improve over the best-known regret bounds for any number of batches. In particular, our algorithms in both settings achieve the optimal expected regrets by using only a logarithmic number of batches. We also study the batched adversarial multi-armed bandit problem for the first time and find the optimal regret, up to logarithmic factors, of any algorithm with predetermined batch sizes.

BibTeX key: esfandiari2019regret
entry type: article
year: 2019
url: http://arxiv.org/abs/1910.04959
note: cite arxiv:1910.04959

