We present an approach to automate the process of discovering optimization
methods, with a focus on deep learning architectures. We train a Recurrent
Neural Network controller to generate a string in a domain-specific language
that describes a mathematical update equation based on a list of primitive
functions, such as the gradient, running average of the gradient, etc. The
controller is trained with Reinforcement Learning to maximize the performance
of a model after a few epochs. On CIFAR-10, our method discovers several update
rules that are better than many commonly used optimizers, such as Adam,
RMSProp, or SGD with and without Momentum on a ConvNet model. We introduce two
new optimizers, named PowerSign and AddSign, which we show transfer well and
improve training on a variety of different tasks and architectures, including
ImageNet classification and Google's neural machine translation system.
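The abstract names the two discovered optimizers, PowerSign and AddSign, without stating their update rules. For concreteness, below is a minimal NumPy sketch of those rules as given in the paper (w <- w - lr * e^(sign(g)*sign(m)) * g for PowerSign, and w <- w - lr * (1 + sign(g)*sign(m)) * g for AddSign, where m is a running average of the gradient). The function names and hyperparameter defaults (lr, beta) are illustrative assumptions, not the authors' reference implementation.

import numpy as np

def powersign_step(w, g, m, lr=0.1, beta=0.9):
    # Illustrative sketch, not the authors' code.
    # m: running average of the gradient (initialize to zeros).
    m = beta * m + (1.0 - beta) * g
    # Scale the gradient up (by e) when its sign agrees with the sign
    # of its running average, and down (by 1/e) when it disagrees.
    update = np.exp(np.sign(g) * np.sign(m)) * g
    return w - lr * update, m

def addsign_step(w, g, m, lr=0.1, beta=0.9):
    # Illustrative sketch, not the authors' code.
    m = beta * m + (1.0 - beta) * g
    # The gradient is doubled on sign agreement and zeroed on disagreement.
    update = (1.0 + np.sign(g) * np.sign(m)) * g
    return w - lr * update, m

Both rules share the same structure: they amplify the step when the current gradient agrees in sign with its running average and dampen it otherwise.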
@misc{bello2017neural,
author = {Bello, Irwan and Zoph, Barret and Vasudevan, Vijay and Le, Quoc V.},
keywords = {2017 arxiv deep-learning optimization reinforcement-learning},
note = {arXiv:1709.07417. Comment: ICML 2017 conference paper},
title = {Neural Optimizer Search with Reinforcement Learning},
url = {http://arxiv.org/abs/1709.07417},
year = 2017
}