Minimax Regret for Bandit Convex Optimisation of Ridge Functions
T. Lattimore. (2021)cite arxiv:2106.00444Comment: Correcting an (instructive) error that leads to a weaker result.
Аннотация
We analyse adversarial bandit convex optimisation with an adversary that is
restricted to playing functions of the form $f_t(x) = g_t(x,
þeta\rangle)$ for convex $g_t : R R$ and unknown $þeta
R^d$ that is homogeneous over time. We provide a short
information-theoretic proof that the minimax regret is at most $O(d n
łog(n diam(K)))$ where $n$ is the number of
interactions, $d$ the dimension and $diam(K)$ is the
diameter of the constraint set.
Описание
[2106.00444] Minimax Regret for Bandit Convex Optimisation of Ridge Functions
%0 Journal Article
%1 lattimore2021minimax
%A Lattimore, Tor
%D 2021
%K bandits optimization
%T Minimax Regret for Bandit Convex Optimisation of Ridge Functions
%U http://arxiv.org/abs/2106.00444
%X We analyse adversarial bandit convex optimisation with an adversary that is
restricted to playing functions of the form $f_t(x) = g_t(x,
þeta\rangle)$ for convex $g_t : R R$ and unknown $þeta
R^d$ that is homogeneous over time. We provide a short
information-theoretic proof that the minimax regret is at most $O(d n
łog(n diam(K)))$ where $n$ is the number of
interactions, $d$ the dimension and $diam(K)$ is the
diameter of the constraint set.
@article{lattimore2021minimax,
abstract = {We analyse adversarial bandit convex optimisation with an adversary that is
restricted to playing functions of the form $f_t(x) = g_t(\langle x,
\theta\rangle)$ for convex $g_t : \mathbb R \to \mathbb R$ and unknown $\theta
\in \mathbb R^d$ that is homogeneous over time. We provide a short
information-theoretic proof that the minimax regret is at most $O(d \sqrt{n}
\log(n \operatorname{diam}(\mathcal K)))$ where $n$ is the number of
interactions, $d$ the dimension and $\operatorname{diam}(\mathcal K)$ is the
diameter of the constraint set.},
added-at = {2021-06-08T14:33:52.000+0200},
author = {Lattimore, Tor},
biburl = {https://www.bibsonomy.org/bibtex/2767396c107059655d433d7cce23303f0/kirk86},
description = {[2106.00444] Minimax Regret for Bandit Convex Optimisation of Ridge Functions},
interhash = {95218ed3f8b68d35b3fa442f4c7c074a},
intrahash = {767396c107059655d433d7cce23303f0},
keywords = {bandits optimization},
note = {cite arxiv:2106.00444Comment: Correcting an (instructive) error that leads to a weaker result},
timestamp = {2021-06-08T14:33:52.000+0200},
title = {Minimax Regret for Bandit Convex Optimisation of Ridge Functions},
url = {http://arxiv.org/abs/2106.00444},
year = 2021
}