Abstract
The accuracy-based XCS classifier system has been
shown to solve typical data mining problems at a level competitive
with other machine-learning methods. However, successful applications in multistep
problems, modeled by a Markov decision process, were restricted
to very small problems. Until now, the temporal difference learning
technique in XCS was based on deterministic updates. Because
a prediction in XCS, and in learning classifier systems in general,
is actually generated by a set of rules, gradient-based update
methods are also applicable. The extension of XCS to gradient-based
update methods results in a classifier system that is more robust
and less dependent on parameter settings, and that reliably solves
large and difficult maze problems. Additionally, the extension to gradient methods
highlights the relation of XCS to other function approximation
methods in reinforcement learning.