Inproceedings

Policy Error Bounds for Model-Based Reinforcement Learning with Factored Linear Models

Bernardo Ávila Pires and Csaba Szepesvári.
COLT, pages 121--151. (2016)

Abstract

In this paper we study a model-based approach to calculating approximately optimal policies in Markovian Decision Processes. In particular, we derive novel bounds on the loss of using a policy derived from a factored linear model, a class of models that generalizes virtually all previous models that come with strong computational guarantees. For the first time in the literature, we derive performance bounds for model-based techniques where the model inaccuracy is measured in weighted norms. Moreover, our bounds show a decreased sensitivity to the discount factor and, unlike similar bounds derived for other approaches, they are insensitive to measure mismatch. As in previous works, our proofs are based on contraction arguments, but with two main differences: we use carefully constructed norms building on Banach lattices, and the contraction property is only assumed for operators acting on "compressed" spaces, thus weakening previous assumptions while strengthening previous results.
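For orientation, here is a minimal sketch (not taken from the paper) of the classical sup-norm, simulation-lemma-style bound that results of this kind are usually compared against; the bounds in the paper replace the sup-norm model error below with weighted-norm error and improve the dependence on the discount factor. Assume a discounted MDP with discount factor $\gamma \in [0,1)$, rewards in $[0, R_{\max}]$, true transition kernel $P$, an approximate model $\hat{P}$, and let $\hat{\pi}$ be an optimal policy of the approximate model. The classical argument then gives

% Classical sup-norm baseline, shown only for comparison;
% the paper's bounds are stated in weighted norms instead.
\[
  \bigl\| V^{*} - V^{\hat{\pi}} \bigr\|_{\infty}
  \;\le\;
  \frac{2\gamma R_{\max}}{(1-\gamma)^{2}}
  \,\max_{s,a} \bigl\| P(\cdot \mid s,a) - \hat{P}(\cdot \mid s,a) \bigr\|_{1}.
\]

The $(1-\gamma)^{-2}$ factor and the uniform (sup-norm) model error are exactly the two quantities the abstract refers to: the paper's bounds measure model inaccuracy in weighted norms and exhibit a decreased sensitivity to $\gamma$.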

Users

  • @csaba