We introduce AlphaGo Zero, the latest evolution of AlphaGo, the first computer program to defeat a world champion at the ancient Chinese game of Go. Zero is even more powerful and is arguably the strongest Go player in history. Previous versions of AlphaGo initially trained on thousands of human amateur and professional games to learn how to play Go. AlphaGo Zero skips this step and learns to play simply by playing games against itself, starting from completely random play. In doing so, it quickly surpassed human level of play and defeated the previously published champion-defeating version of AlphaGo by 100 games to 0.
These are lectures for course 6.S094: Deep Learning for Self-Driving Cars taught in Winter 2017. Course website: http://cars.mit.edu Contact: deepcars@mit.ed...
I have been working on Reinforcement Learning for the past few months, and all I can say about it is that it is different. A writeup of the common quirks and frustrations of Reinforcement Learning I have…
Humans excel at solving a wide variety of challenging problems, from low-level motor control through to high-level cognitive tasks. Our goal at DeepMind is to create artificial agents that can achieve a similar level of performance and generality. Like a human, our agents learn for themselves to achieve successful strategies that lead to the greatest long-term rewards.
A. Zeng, S. Song, S. Welker, J. Lee, A. Rodriguez, and T. Funkhouser. (2018). arXiv:1803.09956. Comment: Under review at the International Conference on Intelligent Robots and Systems (IROS) 2018. Project webpage: http://vpg.cs.princeton.edu.
S. Albrecht and P. Stone. (2017). arXiv:1709.08071. Comment: 42 pages, submitted for review to Artificial Intelligence Journal. Keywords: multiagent systems, agent modelling, opponent modelling, survey, open problems.