Abstract
Learning-based control algorithms require collection of abundant supervision
for training. Safe exploration algorithms enable this data collection to
proceed safely even when only partial knowledge is available. In this paper, we
present a new episodic framework to design a sub-optimal pool of motion plans
that aid exploration for learning unknown residual dynamics under safety
constraints. We derive an iterative convex optimization algorithm that solves
an information-cost Stochastic Nonlinear Optimal Control problem (Info-SNOC),
subject to chance constraints and approximated dynamics, to compute a safe
trajectory. The optimization objective encodes both performance and
exploration, and safety is incorporated as distributionally robust chance
constraints. The dynamics are predicted from a robust learning model. We prove
the safety of rollouts from our exploration method and a reduction in uncertainty
over epochs, ensuring consistency of our learning method. We validate the
effectiveness of Info-SNOC by designing and implementing a pool of safe
trajectories for a planar robot.