Abstract

The application of Reinforcement Learning (RL) algorithms is often hindered by the combinatorial explosion of the state space. Previous works have leveraged abstractions that condense large state spaces to find tractable solutions; however, they assumed that the abstractions are provided by a domain expert. In this work we propose a new approach to automatically construct Abstract Markov Decision Processes (AMDPs) for Potential-Based Reward Shaping, improving the sample efficiency of RL algorithms. Our approach to constructing abstract states is inspired by graph representation learning methods and effectively encodes the topological and reward structure of the ground-level MDP. We perform large-scale quantitative experiments on the Flag Collection domain and show improvements of up to 6.5 times in sample efficiency and up to 3 times in run time over the baseline approach. In addition, our qualitative analyses of the generated AMDP demonstrate the capability of our approach to preserve the topological and reward structure of the ground-level MDP.
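The shaping mechanism referred to in the abstract is standard potential-based reward shaping, where the shaping bonus takes the form F(s, s') = gamma * Phi(s') - Phi(s) (Ng et al., 1999) and the potential Phi of a ground state can, for instance, be read off from its abstract state's value in an AMDP. The following minimal Python sketch illustrates that idea; the names abstract_of, V_abs and shaped_reward are hypothetical placeholders, and this is not the paper's actual implementation.

# Minimal sketch of potential-based reward shaping with a potential derived
# from an abstract MDP. The abstraction map and abstract values are assumed
# (hypothetical) inputs, e.g. produced by solving a small AMDP.
GAMMA = 0.99

# Hypothetical abstraction: ground states 0..5 collapse into two abstract states.
abstract_of = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}

# Hypothetical abstract value function (e.g. obtained by value iteration on the AMDP).
V_abs = {"A": 0.0, "B": 1.0}

def potential(ground_state):
    """Potential of a ground state = value of its abstract state."""
    return V_abs[abstract_of[ground_state]]

def shaped_reward(s, r, s_next):
    """Augment the environment reward with the potential-based shaping term."""
    return r + GAMMA * potential(s_next) - potential(s)

if __name__ == "__main__":
    # Transition from ground state 2 (abstract A) to 3 (abstract B) with
    # environment reward 0 receives a positive shaping bonus:
    print(shaped_reward(2, 0.0, 3))  # 0 + 0.99 * 1.0 - 0.0 = 0.99

Because the shaping term is potential-based, it leaves the optimal policy of the ground-level MDP unchanged while guiding exploration toward higher-valued abstract states.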

Description

Graph Learning based Generation of Abstractions for Reinforcement Learning.
