Article

Transient and asymptotic dynamics of reinforcement learning in games

L. Izquierdo, S. Izquierdo, N. Gotts, and J. Polhill.
Games and Economic Behavior, 61 (2): 259 - 276 (2007)
DOI: https://doi.org/10.1016/j.geb.2007.01.005

Abstract

Reinforcement learners tend to repeat actions that led to satisfactory outcomes in the past, and avoid choices that resulted in unsatisfactory experiences. This behavior is one of the most widespread adaptation mechanisms in nature. In this paper we fully characterize the dynamics of one of the best known stochastic models of reinforcement learning [Bush, R., Mosteller, F., 1955. Stochastic Models of Learning. Wiley & Sons, New York] for 2-player 2-strategy games. We also provide some extensions for more general games and for a wider class of learning algorithms. Specifically, it is shown that the transient dynamics of Bush and Mosteller's model can be substantially different from its asymptotic behavior. It is also demonstrated that in general, and in sharp contrast to other reinforcement learning models in the literature, the asymptotic dynamics of Bush and Mosteller's model cannot be approximated using the continuous time limit version of its expected motion.
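
The abstract does not state the update rule, so the following is only a minimal sketch of a Bush-Mosteller style learner for a 2-player 2-strategy game, under assumptions that are not taken from this record: an aspiration-based stimulus normalised to [-1, 1], a linear probability update scaled by a learning rate, and illustrative Prisoner's Dilemma payoffs. The names `PAYOFFS`, `stimulus`, `update_probability` and `step` are hypothetical.

```python
import random

# Assumed illustrative Prisoner's Dilemma payoffs (row player, column player).
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 4),
    ("D", "C"): (4, 0),
    ("D", "D"): (1, 1),
}


def stimulus(payoff, aspiration, max_abs_gap):
    """Normalised stimulus in [-1, 1]: positive if the payoff beats the aspiration."""
    return (payoff - aspiration) / max_abs_gap


def update_probability(p, s, learning_rate):
    """Return the new probability of repeating the action that was just taken."""
    if s >= 0:
        # Satisfactory outcome: move towards 1, proportionally to the room left.
        return p + learning_rate * s * (1.0 - p)
    # Unsatisfactory outcome: move towards 0, proportionally to p itself.
    return p + learning_rate * s * p


def step(p_coop, aspiration, learning_rate, rng):
    """Play one round and return both players' updated cooperation probabilities."""
    actions = tuple("C" if rng.random() < p else "D" for p in p_coop)
    payoffs = PAYOFFS[actions]
    max_gap = max(abs(x - aspiration) for pair in PAYOFFS.values() for x in pair)
    updated = []
    for p, action, payoff in zip(p_coop, actions, payoffs):
        s = stimulus(payoff, aspiration, max_gap)
        # Work with the probability of the action actually chosen.
        p_action = p if action == "C" else 1.0 - p
        p_action = update_probability(p_action, s, learning_rate)
        updated.append(p_action if action == "C" else 1.0 - p_action)
    return tuple(updated)


if __name__ == "__main__":
    rng = random.Random(0)
    probs = (0.5, 0.5)
    for _ in range(1000):
        probs = step(probs, aspiration=2.0, learning_rate=0.5, rng=rng)
    print(probs)
```

Running many independent replications of `step` for a short and a long horizon is one way to explore, informally, the gap between transient and asymptotic behavior that the abstract describes.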

BibTeX key: IZQUIERDO2007259
entry type: article
year: 2007
journal: Games and Economic Behavior
number: 2
pages: 259 - 276
volume: 61
issn: 0899-8256
DOI: https://doi.org/10.1016/j.geb.2007.01.005
url: http://www.sciencedirect.com/science/article/pii/S0899825607000127
