Abstract
On April 13th, 2019, OpenAI Five became the first AI system to defeat the
world champions at an esports game. The game of Dota 2 presents novel
challenges for AI systems such as long time horizons, imperfect information,
and complex, continuous state-action spaces, all challenges which will become
increasingly central to more capable AI systems. OpenAI Five leveraged existing
reinforcement learning techniques, scaled to learn from batches of
approximately 2 million frames every 2 seconds. We developed a distributed
training system and tools for continual training which allowed us to train
OpenAI Five for 10 months. By defeating the Dota 2 world champion (Team OG),
OpenAI Five demonstrates that self-play reinforcement learning can achieve
superhuman performance on a difficult task.