Abstract
We consider the problem of controlling an unknown linear dynamical system in
the presence of (nonstochastic) adversarial perturbations and adversarial
convex loss functions. In contrast to classical control, an optimal
controller cannot be determined a priori here, because it depends on the
as-yet-unknown perturbations and costs. Instead, we measure
regret against an optimal linear policy in hindsight, and give the first
efficient algorithm that guarantees a sublinear regret bound, scaling as
T^{2/3}, in this setting.