Training Agents using Upside-Down Reinforcement Learning

Abstract

Traditional Reinforcement Learning (RL) algorithms either predict rewards with value functions or maximize them using policy search. We study an alternative: Upside-Down Reinforcement Learning (Upside-Down RL or UDRL), which solves RL problems primarily using supervised learning techniques. Many of its main principles are outlined in a companion report [34]. Here we present the first concrete implementation of UDRL and demonstrate its feasibility on certain episodic learning problems. Experimental results show that its performance can be surprisingly competitive with, and even exceed, that of traditional baseline algorithms developed over decades of research.

BibTeX key: srivastava2019training
entry type: misc
year: 2019
url: http://arxiv.org/abs/1912.02877
note: cite arxiv:1912.02877. Comment: NNAISENSE Technical Report. 17 pages, 6 figures
