Abstract

Q-learning (Watkins, 1989) is a simple way for agents to learn how to act optimally in controlled Markovian domains. It amounts to an incremental method for dynamic programming which imposes limited computational demands. It works by successively improving its evaluations of the quality of particular actions at particular states.

Links and resources

DOI:
10.1007/BF00992698
URL:
BibTeX key:
Watkins1992
search on:

Comments and Reviews  
(0)

There is no review or comment yet. You can write one!

Tags


Cite this publication