copy delete add this publication to your clipboard
community post
history of this post
URL
DOI
BibTeX
EndNote
APA
Chicago
DIN 1505
Harvard
MSOffice XML

Large Batch Training of Convolutional Networks

Y. You, I. Gitman, and B. Ginsburg. (2017)cite arxiv:1708.03888.

Abstract

A common way to speed up training of large convolutional networks is to add computational units. Training is then performed using data-parallel synchronous Stochastic Gradient Descent (SGD) with mini-batch divided between computational units. With an increase in the number of nodes, the batch size grows. But training with large batch size often results in the lower model accuracy. We argue that the current recipe for large batch training (linear learning rate scaling with warm-up) is not general enough and training may diverge. To overcome this optimization difficulties we propose a new training algorithm based on Layer-wise Adaptive Rate Scaling (LARS). Using LARS, we scaled Alexnet up to a batch size of 8K, and Resnet-50 to a batch size of 32K without loss in accuracy.

Description

Large Batch Training of Convolutional Networks

Links and resources

BibTeX key: you2017large
entry type: misc
year: 2017
url: http://arxiv.org/abs/1708.03888
note: cite arxiv:1708.03888

@alrigazzi's tags highlighted

Cite this publication

search on

Meta data

Last update 5 years ago
Created 5 years ago

Comments and Reviews
(0)

There is no review or comment yet. You can write one!

BibSonomy

copy delete add this publication to your clipboard
community post
history of this post
URL
DOI
BibTeX
EndNote
APA
Chicago
DIN 1505
Harvard
MSOffice XML

Large Batch Training of Convolutional Networks

Abstract

Description

Links and resources

Tags

community

Cite this publication

More citation styles

search on

Meta data

Comments and Reviews
(0)

BibSonomy

copydeleteadd this publication to your clipboardcommunity posthistory of this postURLDOIBibTeXEndNoteAPAChicagoDIN 1505HarvardMSOffice XML Large Batch Training of Convolutional Networks

Abstract

Description

Links and resources

Tags

community

Cite this publication

More citation styles

search on

Meta data

Comments and Reviews (0)

copy delete add this publication to your clipboard
community post
history of this post
URL
DOI
BibTeX
EndNote
APA
Chicago
DIN 1505
Harvard
MSOffice XML

Large Batch Training of Convolutional Networks

Comments and Reviews
(0)