Dyna-Style Planning with Linear Function Approximation and Prioritized Sweeping

R. Sutton, Cs. Szepesvári, A. Geramifard, and M. Bowling. UAI, pages 528-536. (2008)

Abstract

We consider the problem of efficiently learning optimal control policies and value functions over large state spaces in an online setting in which estimates must be available after each interaction with the world. This paper develops an explicitly model-based approach extending the Dyna architecture to linear function approximation. Dyna-style planning proceeds by generating imaginary experience from the world model and then applying model-free reinforcement learning algorithms to the imagined state transitions. Our main results are to prove that linear Dyna-style planning converges to a unique solution independent of the generating distribution, under natural conditions. In the policy evaluation setting, we prove that the limit point is the least-squares (LSTD) solution. An implication of our results is that prioritized-sweeping can be soundly extended to the linear approximation case, backing up to preceding features rather than to preceding states. We introduce two versions of prioritized sweeping with linear Dyna and briefly illustrate their performance empirically on the Mountain Car and Boyan Chain problems.
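The planning loop described in the abstract can be sketched in a few lines. This is only an illustrative sketch for the policy evaluation setting, not the authors' code: it assumes a linear model made of a matrix `F` (predicting next features) and a vector `b` (predicting reward), one-hot features on a small deterministic chain with reward -1 per step, and imagined transitions started from uniformly sampled unit feature vectors. The function names, the chain, and all parameter values are hypothetical choices made for the example, and plain Python lists are used to keep it dependency-free.

```python
import random


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def one_hot(s, n):
    """Feature vector with a single 1.0 at index s."""
    return [1.0 if i == s else 0.0 for i in range(n)]


def td_update(theta, phi, reward, phi_next, alpha, gamma):
    """Apply one TD(0) step to theta in place, for a real or imagined transition."""
    delta = reward + gamma * dot(theta, phi_next) - dot(theta, phi)
    for i, x in enumerate(phi):
        theta[i] += alpha * delta * x


def linear_dyna_evaluate(n=4, episodes=200, planning_steps=5,
                         alpha=0.5, gamma=1.0, seed=0):
    """Policy evaluation on a hypothetical n-state chain (always move right)."""
    rng = random.Random(seed)
    theta = [0.0] * n                   # value-function weights
    F = [[0.0] * n for _ in range(n)]   # model: F applied to phi predicts next features
    b = [0.0] * n                       # model: b . phi predicts the reward
    for _ in range(episodes):
        for s in range(n):
            phi = one_hot(s, n)
            # Leaving the last state terminates; the terminal feature vector is zero.
            phi_next = one_hot(s + 1, n) if s + 1 < n else [0.0] * n
            r = -1.0

            # 1. Model-free learning from the real transition.
            td_update(theta, phi, r, phi_next, alpha, gamma)

            # 2. Gradient step on the linear model for the same transition.
            pred = [dot(row, phi) for row in F]
            r_err = r - dot(b, phi)
            for i in range(n):
                for j in range(n):
                    F[i][j] += alpha * (phi_next[i] - pred[i]) * phi[j]
                b[i] += alpha * r_err * phi[i]

            # 3. Planning: imagine transitions from the model, reuse the same TD rule.
            for _ in range(planning_steps):
                phi_im = one_hot(rng.randrange(n), n)
                phi_im_next = [dot(row, phi_im) for row in F]
                td_update(theta, phi_im, dot(b, phi_im), phi_im_next, alpha, gamma)
    return theta, F, b


if __name__ == "__main__":
    theta, F, b = linear_dyna_evaluate()
    print([round(w, 3) for w in theta])
```

On this toy chain the weights should approach the true state values (minus the number of steps remaining), because the learned model becomes exact and the imagined updates then agree with the real ones. The uniform sampling in step 3 is the simplest possible choice; a prioritized-sweeping variant would instead choose which features to back up from a priority queue.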

Links and resources

BibTeX key: sutton2008a
entry type: inproceedings
booktitle: UAI
year: 2008
pages: 528--536
crossref: UAI2008
ee: http://uai2008.cs.helsinki.fi/UAI_camera_ready/sutton.pdf
date-added: 2010-08-28 17:38:14 -0600
pdf: papers/linearDyna.pdf
bibsource: DBLP, http://dblp.uni-trier.de
date-modified: 2010-11-25 00:53:46 -0700

Cite this publication

@inproceedings{sutton2008a,
  abstract = {We consider the problem of efficiently learning optimal control policies and value functions over large state spaces in an online setting in which estimates must be available after each interaction with the world. This paper develops an explicitly model-based approach extending the Dyna architecture to linear function approximation. Dyna-style planning proceeds by generating imaginary experience from the world model and then applying model-free reinforcement learning algorithms to the imagined state transitions. Our main results are to prove that linear Dyna-style planning converges to a unique solution independent of the generating distribution, under natural conditions. In the policy evaluation setting, we prove that the limit point is the least-squares (LSTD) solution. An implication of our results is that prioritized-sweeping can be soundly extended to the linear approximation case, backing up to preceding features rather than to preceding states. We introduce two versions of prioritized sweeping with linear Dyna and briefly illustrate their performance empirically on the Mountain Car and Boyan Chain problems.},
  added-at = {2020-03-17T03:03:01.000+0100},
  author = {Sutton, R.S. and Szepesv{\'a}ri, {Cs}. and Geramifard, A. and Bowling, M. H.},
  bibsource = {DBLP, http://dblp.uni-trier.de},
  biburl = {https://www.bibsonomy.org/bibtex/269a21b37d44b45b573b6aec72bbca77d/csaba},
  booktitle = {UAI},
  crossref = {UAI2008},
  date-added = {2010-08-28 17:38:14 -0600},
  date-modified = {2010-11-25 00:53:46 -0700},
  ee = {http://uai2008.cs.helsinki.fi/UAI_camera_ready/sutton.pdf},
  interhash = {9ec394f031ee125db1ac2f525878a8a8},
  intrahash = {69a21b37d44b45b573b6aec72bbca77d},
  keywords = {approximation approximation, asymptotic convergence, function learning, planning, reinforcement stochastic theory,},
  pages = {528--536},
  pdf = {papers/linearDyna.pdf},
  timestamp = {2020-03-17T03:03:01.000+0100},
  title = {Dyna-Style Planning with Linear Function Approximation and Prioritized Sweeping},
  year = 2008
}
