Applying Policy Gradient Reinforcement Learning to Optimise Robot Behaviours

A. Witsch.
University of Kassel, Master's Thesis (June 2010)

Abstract

In robotics, elementary behaviour patterns often tackle control theoretic problems. Because of incomplete or imprecise models of the control system, the structure and the parameters of a control policy are unknown. These problems can be solved by reinforcement learning algorithms like policy gradient methods. In this thesis, policy gradient learning is used to optimise a controller represented as a z-transformed rational function. This representation facilitates simultaneous optimisation of the control structure and its parameters in time space. The resulting controller can be analysed in terms of control theory to predict the control behaviour for arbitrary scenarios. Because the performance of gradient descent algorithms heavily depends on appropriate starting points, these parameters must be chosen carefully. This work presents a method that allows learning of an initial parameter set with the help of a single demonstrated trajectory. The approach is evaluated on a cartpole simulation for demonstrating the expressiveness of the policy. We also describe how to stabilise the gradient descent by introducing a linearisation term. Furthermore, a real soccer robot scenario demonstrates the ability of the proposed approach to deal with noisy scenarios. This illustrates the flexibility and adaptability of the approach for different problems with only little initial knowledge. A discussion of open questions and concluding remarks finally motivate future work and possible extensions of the proposed approach.
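As a reading aid only, and not taken from the thesis: a controller written as a rational function in z^-1, U(z)/E(z) = (b0 + b1 z^-1 + ...) / (1 + a1 z^-1 + ...), corresponds to a linear difference equation in the time domain, so its coefficients are the parameters a learning method could adjust. The minimal sketch below assumes that standard form; the class name `RationalController` and the coefficient lists `b` and `a` are hypothetical names chosen for illustration.

```python
from collections import deque


class RationalController:
    """Hypothetical sketch of a discrete-time controller
    U(z)/E(z) = (b0 + b1 z^-1 + ...) / (1 + a1 z^-1 + ...).

    Assumption: the denominator is normalised so that a0 = 1.
    """

    def __init__(self, b, a):
        self.b = list(b)  # numerator coefficients b0..bm
        self.a = list(a)  # denominator coefficients a1..an
        # Histories of past errors and past outputs, newest first.
        self.e_hist = deque([0.0] * len(self.b), maxlen=len(self.b))
        self.u_hist = deque([0.0] * len(self.a), maxlen=len(self.a))

    def step(self, error):
        """Evaluate u[k] = sum_i b_i e[k-i] - sum_j a_j u[k-j]."""
        self.e_hist.appendleft(error)
        u = sum(bi * ei for bi, ei in zip(self.b, self.e_hist))
        u -= sum(aj * uj for aj, uj in zip(self.a, self.u_hist))
        self.u_hist.appendleft(u)
        return u


# Usage: b = [0.5], a = [-1.0] gives u[k] = 0.5 e[k] + u[k-1],
# i.e. a pure integrator with gain 0.5.
controller = RationalController(b=[0.5], a=[-1.0])
outputs = [controller.step(1.0) for _ in range(3)]
print(outputs)  # [0.5, 1.0, 1.5]
```

Changing the lengths of `b` and `a` changes the controller order, while changing their values changes its parameters, which is one way to picture structure and parameters sharing a single representation.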

BibTeX key: Witsch2010
entry type: mastersthesis
year: 2010
month: 06
institution: Distributed Systems Research Group
school: University of Kassel
type: Master's Thesis
Document: http://das-lab.vs.eecs.uni-kassel.de/publications/Witsch2010-Master-PolicyGradient.pdf


Cite this publication

@mastersthesis{Witsch2010,
  abstract = {In robotics, elementary behaviour patterns often tackle control theoretic problems. Because of incomplete or imprecise models of the control system, the structure and the parameters of a control policy are unknown. These problems can be solved by reinforcement learning algorithms like policy gradient methods. In this thesis, policy gradient learning is used to optimise a controller represented as a z-transformed rational function. This representation facilitates simultaneous optimisation of the control structure and its parameters in time space. The resulting controller can be analysed in terms of control theory to predict the control behaviour for arbitrary scenarios. Because the performance of gradient descent algorithms heavily depends on appropriate starting points, these parameters must be chosen carefully. This work presents a method that allows learning of an initial parameter set with the help of a single demonstrated trajectory. The approach is evaluated on a cartpole simulation for demonstrating the expressiveness of the policy. We also describe how to stabilise the gradient descent by introducing a linearisation term. Furthermore, a real soccer robot scenario demonstrates the ability of the proposed approach to deal with noisy scenarios. This illustrates the flexibility and adaptability of the approach for different problems with only little initial knowledge. A discussion of open questions and concluding remarks finally motivate future work and possible extensions of the proposed approach.},
  added-at = {2016-04-07T16:24:24.000+0200},
  author = {Witsch, Andreas},
  biburl = {https://www.bibsonomy.org/bibtex/2793a363e273c8134936f23d51883eae3/cn},
  institution = {Distributed Systems Research Group},
  interhash = {c25d49a6c4cbe4c6637466840903bba6},
  intrahash = {793a363e273c8134936f23d51883eae3},
  keywords = {Machine-Learning},
  month = {06},
  school = {University of Kassel},
  timestamp = {2016-04-07T17:32:48.000+0200},
  title = {Applying Policy Gradient Reinforcement Learning to Optimise Robot Behaviours},
  type = {Master's Thesis},
  url = {http://das-lab.vs.eecs.uni-kassel.de/publications/Witsch2010-Master-PolicyGradient.pdf},
  year = 2010
}
