The application of Reinforcement Learning (RL) algorithms is often
hindered by the combinatorial explosion of the state space. Previous
works have leveraged abstractions that condense large state spaces
to find tractable solutions; however, they assume that the
abstractions are provided by a domain expert. In this work we
propose a new approach to automatically construct Abstract Markov
Decision Processes (AMDPs) for Potential-Based Reward Shaping
to improve the sample efficiency of RL algorithms. Our approach to
constructing abstract states is inspired by graph representation
learning methods and effectively encodes the topological and reward
structure of the ground-level MDP. We perform large-scale
quantitative experiments on the Flag Collection domain and show
improvements of up to 6.5 times in sample efficiency and up to
3 times in run time over the baseline approach. In addition, our
qualitative analyses of the generated AMDPs demonstrate that our
approach preserves the topological and reward structure of the
ground-level MDP.
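To make the shaping scheme concrete: once an AMDP has been solved, its value function can serve as the potential in potential-based reward shaping. The following is a minimal sketch of that idea, not the paper's actual implementation; the names `abstract_of`, `v_abstract`, `potential`, and `shaped_reward` are illustrative placeholders, and the toy abstraction and values are invented for the example.

```python
# Potential-based reward shaping (PBRS) using an abstract value function.
# Phi(s) = V*(phi(s)), where phi maps ground states to abstract states
# and V* is the optimal value function of the (much smaller) AMDP.

GAMMA = 0.99

# Toy abstraction: four ground states collapse into two abstract states,
# whose values would come from solving the AMDP offline.
abstract_of = {0: "A", 1: "A", 2: "B", 3: "B"}   # phi: S -> S_abs
v_abstract = {"A": 0.0, "B": 1.0}                # hypothetical V* of the AMDP

def potential(s):
    """Phi(s): the AMDP value of s's abstract state."""
    return v_abstract[abstract_of[s]]

def shaped_reward(s, r, s_next):
    """r + gamma * Phi(s') - Phi(s); this form preserves optimal policies."""
    return r + GAMMA * potential(s_next) - potential(s)

# A transition crossing from abstract state A to B earns a shaping bonus.
print(shaped_reward(1, 0.0, 2))  # 0 + 0.99 * 1.0 - 0.0 = 0.99
```

Because the shaping term is a potential difference, the agent's optimal policy in the ground MDP is unchanged; only the learning speed is affected, which is what drives the sample-efficiency gains reported above.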
Description
Graph Learning based Generation of Abstractions for Reinforcement Learning.