Abstract
Dynamic programming, Q-learning, and other solvers for discrete Markov Decision
Processes can be applied to continuous d-dimensional state spaces by
quantizing the state space into an array of boxes. This is often problematic
above two dimensions: a coarse quantization can lead to poor policies, while
a fine quantization is too expensive. Possible solutions are variable-resolution
discretization, or function approximation by neural nets. A third option,
which has been little studied in the reinforcement...
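To make the quantization idea concrete, here is a minimal sketch of uniform box discretization for a d-dimensional state; the function name and parameters are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def quantize(state, low, high, bins_per_dim):
    """Map a continuous d-dimensional state to a flat box index
    on a uniform grid of bins_per_dim boxes per dimension."""
    state = np.asarray(state, dtype=float)
    low = np.asarray(low, dtype=float)
    high = np.asarray(high, dtype=float)
    # Fractional position in [0, 1) along each dimension, clipped to bounds.
    frac = np.clip((state - low) / (high - low), 0.0, 1.0 - 1e-12)
    idx = (frac * bins_per_dim).astype(int)  # per-dimension bin index
    # Flatten the d-dimensional index into a single table entry.
    return int(np.ravel_multi_index(idx, (bins_per_dim,) * len(idx)))

# Example: the center of the unit square on a 10x10 grid.
box = quantize([0.5, 0.5], [0.0, 0.0], [1.0, 1.0], 10)
print(box)
```

Note that the lookup table this indexes into has bins_per_dim ** d entries, which is why fine uniform quantization becomes infeasible above two or three dimensions, motivating the alternatives the abstract lists.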