Technical documentation

Regulating Action Value Estimation in Deep Reinforcement Learning

Adaptive and Learning Agents Workshop at AAMAS 2023 (March 2023)

Abstract

Deep Q-learning is known to suffer from overestimation of action values, caused by the maximization operation used when computing target values. Such overestimation can substantially degrade reward performance. In this work, we introduce a simple method based on DQN, named Deep Value Q-learning (DVQN), which regulates the estimation of action values and effectively tackles both over- and underestimation. We evaluate our method on the Atari-100k benchmark and demonstrate that DVQN consistently outperforms Deep Q-learning, Deep Double Q-learning, and Clipped Deep Double Q-learning in terms of reward performance. Moreover, our experimental results show that DVQN serves as a better backbone network than DQN when combined with an additional representation learning objective.
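
The overestimation mentioned in the abstract can be illustrated with a small simulation. The sketch below is not the DVQN method, which this record does not describe; it only compares a max over a single set of noisy estimates (as in the DQN target) with a Double Q-learning-style estimate that selects an action with one set of estimates and evaluates it with another. The function names, the Gaussian noise model, and the choice of all true action values being zero are assumptions made for illustration.

```python
import random


def noisy_estimates(rng, true_values, noise_std):
    """Return the true action values corrupted by independent Gaussian noise."""
    return [q + rng.gauss(0.0, noise_std) for q in true_values]


def max_target(q_values):
    """DQN-style term: the same estimates select and evaluate the action."""
    return max(q_values)


def double_target(q_select, q_eval):
    """Double Q-learning-style term: select with one estimate, evaluate with another."""
    best_action = max(range(len(q_select)), key=q_select.__getitem__)
    return q_eval[best_action]


def average_bias(n_trials=20000, n_actions=10, noise_std=1.0, seed=0):
    """Average both estimates of max_a Q(s, a) when every true value is 0."""
    rng = random.Random(seed)
    true_values = [0.0] * n_actions  # assumption: the true maximum is exactly 0
    single_sum = 0.0
    double_sum = 0.0
    for _ in range(n_trials):
        q_a = noisy_estimates(rng, true_values, noise_std)
        q_b = noisy_estimates(rng, true_values, noise_std)
        single_sum += max_target(q_a)
        double_sum += double_target(q_a, q_b)
    return single_sum / n_trials, double_sum / n_trials


if __name__ == "__main__":
    single_bias, double_bias = average_bias()
    print(f"max over one estimator: {single_bias:+.3f}")
    print(f"double estimator:       {double_bias:+.3f}")
```

Because the true maximum is zero in this setup, any average above zero is pure estimation bias: the single-estimator max is systematically positive, while the decoupled estimate stays close to zero.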

Tags

Users

  • @yuan.xue
  • @l3s

Comments and reviews