Convergent Reinforcement Learning with Value Function Interpolation

Abstract

We study the convergence of a class of reinforcement learning algorithms combined with value function interpolation methods, using the techniques developed in (Littman and Szepesvari, 1996). As a special case of the general results, we prove, for the first time, the almost sure convergence of Q-learning combined with value function interpolation in uncountable spaces.

BibTeX key: szepesvari2000
entry type: techreport
address: Budapest 1121, Konkoly Th. M. u. 29-33, HUNGARY
year: 2000
institution: Mindmaker Ltd.
number: TR-2001-02
pdf: papers/rlfapp.pdf
date-modified: 2010-09-04 14:48:33 -0600
