Bias-Optimal Incremental Learning of Control Sequences
for Virtual Robots
J. Schmidhuber, V. Zhumatiy, and M. Gagliolo. Proceedings of the Eighth Conference on Intelligent
Autonomous Systems, IAS-8, pages 658--665. Amsterdam, (2004)
Abstract
Learning and planning control is hard. The search
space of traditional planners consists of sequences of
primitive actions. To exploit reusable subsequences and
other algorithmic regularities, however, we should
instead search the general space of programs that
compute action sequences. Such programs may invoke very
fast thinking actions consuming only nanoseconds (such
as conditional jumps to certain code addresses) as well
as very slow control actions consuming seconds in the
real world (such as
stretch-arm-until-obstacle-sensation). What is an
optimal way of allocating time to tests of such
non-homogeneous programs? What is an optimal way of
reusing experience with previous tasks to learn
solutions to new tasks? One answer is given by the
recent Optimal Ordered Problem Solver OOPS, a
near-bias-optimal incremental extension of Levin's
nonincremental universal search, which we apply to
virtual robotics for the first time: our snake robot
uses OOPS to learn to walk and jump in a partially
observable environment (POMDP) with a huge state/action
space.
%0 Conference Paper
%1 schmidhuber:2004:IAS
%A Schmidhuber, Juergen
%A Zhumatiy, Viktor P.
%A Gagliolo, Matteo
%B Proceedings of the Eighth Conference on Intelligent
Autonomous Systems, IAS-8
%C Amsterdam
%D 2004
%K algorithms, genetic programming
%P 658--665
%T Bias-Optimal Incremental Learning of Control Sequences
for Virtual Robots
%U ftp://ftp.idsia.ch/pub/juergen/snakeias.pdf
%X Learning and planning control is hard. The search
space of traditional planners consists of sequences of
primitive actions. To exploit reusable subsequences and
other algorithmic regularities, however, we should
instead search the general space of programs that
compute action sequences. Such programs may invoke very
fast thinking actions consuming only nanoseconds (such
as conditional jumps to certain code addresses) as well
as very slow control actions consuming seconds in the
real world (such as
stretch-arm-until-obstacle-sensation). What is an
optimal way of allocating time to tests of such
non-homogeneous programs? What is an optimal way of
reusing experience with previous tasks to learn
solutions to new tasks? One answer is given by the
recent Optimal Ordered Problem Solver OOPS, a
near-bias-optimal incremental extension of Levin's
nonincremental universal search, which we apply to
virtual robotics for the first time: our snake robot
uses OOPS to learn to walk and jump in a partially
observable environment (POMDP) with a huge state/action
space.
@inproceedings{schmidhuber:2004:IAS,
abstract = {Learning and planning control is hard. The search
space of traditional planners consists of sequences of
primitive actions. To exploit reusable subsequences and
other algorithmic regularities, however, we should
instead search the general space of programs that
compute action sequences. Such programs may invoke very
fast thinking actions consuming only nanoseconds (such
as conditional jumps to certain code addresses) as well
as very slow control actions consuming seconds in the
real world (such as
stretch-arm-until-obstacle-sensation). What is an
optimal way of allocating time to tests of such
non-homogeneous programs? What is an optimal way of
reusing experience with previous tasks to learn
solutions to new tasks? One answer is given by the
recent Optimal Ordered Problem Solver OOPS, a
near-bias-optimal incremental extension of Levin's
nonincremental universal search, which we apply to
virtual robotics for the first time: our snake robot
uses OOPS to learn to walk and jump in a partially
observable environment (POMDP) with a huge state/action
space.},
added-at = {2008-06-19T17:46:40.000+0200},
address = {Amsterdam},
  author = {Schmidhuber, Juergen and Zhumatiy, Viktor P. and Gagliolo, Matteo},
biburl = {https://www.bibsonomy.org/bibtex/26c34b247a057ecda130cabcad44ad30c/brazovayeye},
  booktitle = {Proceedings of the Eighth Conference on Intelligent
Autonomous Systems, IAS-8},
interhash = {44478e4530d185f60a60bf397093f70a},
intrahash = {6c34b247a057ecda130cabcad44ad30c},
keywords = {algorithms, genetic programming},
pages = {658--665},
timestamp = {2008-06-19T17:51:06.000+0200},
title = {Bias-Optimal Incremental Learning of Control Sequences
for Virtual Robots},
url = {ftp://ftp.idsia.ch/pub/juergen/snakeias.pdf},
year = 2004
}