The Momentum Stochastic Gradient Descent (MSGD) algorithm has been widely
applied to many nonconvex optimization problems in machine learning; popular
examples include training deep neural networks and dimensionality reduction.
Due to the lack of convexity and the extra momentum term, the optimization
theory of MSGD is still largely unknown. In this paper, we study this
fundamental optimization algorithm through the so-called "strict saddle
problem." By a diffusion-approximation-type analysis, our study shows that the
momentum helps escape from saddle points but hurts convergence within the
neighborhood of optima (in the absence of step size annealing). Our theoretical
discovery partially corroborates the empirical success of MSGD in training deep
neural networks. Moreover, our analysis applies the martingale method and the
"Fixed-State-Chain" method from the stochastic approximation literature, which
are of independent interest.
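For context, below is a minimal sketch of a heavy-ball style MSGD update on a
toy strict-saddle objective. The objective f(x, y) = (x^2 - 1)^2 + y^2, the
momentum parameter, the step size, and the Gaussian gradient noise are all
illustrative assumptions for this sketch, not details taken from the paper.

```python
import numpy as np

def grad(x):
    # Gradient of f(x, y) = (x^2 - 1)^2 + y^2, a toy strict-saddle
    # objective with a saddle point at (0, 0) and minima at (+/-1, 0).
    return np.array([4.0 * x[0] * (x[0] ** 2 - 1.0), 2.0 * x[1]])

def msgd(x0, lr=0.01, momentum=0.9, noise=0.1, steps=5000, seed=0):
    # Heavy-ball style momentum SGD; the gradient noise is a stand-in
    # for the stochasticity of mini-batch gradients.
    rng = np.random.default_rng(seed)
    x, v = np.array(x0, dtype=float), np.zeros(2)
    for _ in range(steps):
        g = grad(x) + noise * rng.standard_normal(2)  # stochastic gradient
        v = momentum * v - lr * g                      # momentum update
        x = x + v
    return x

# Starting near the saddle point (0, 0): the momentum term tends to speed
# escape from the saddle neighborhood, while without step size annealing it
# also enlarges the fluctuations around the minima.
print(msgd([1e-3, 1e-3], momentum=0.9))
print(msgd([1e-3, 1e-3], momentum=0.0))
```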
Description
Toward Deeper Understanding of Nonconvex Stochastic Optimization with
Momentum using Diffusion Approximations
@misc{liu2018toward,
abstract = {The Momentum Stochastic Gradient Descent (MSGD) algorithm has been widely
applied to many nonconvex optimization problems in machine learning; popular
examples include training deep neural networks and dimensionality reduction.
Due to the lack of convexity and the extra momentum term, the optimization
theory of MSGD is still largely unknown. In this paper, we study this
fundamental optimization algorithm through the so-called "strict saddle
problem." By a diffusion-approximation-type analysis, our study shows that the
momentum helps escape from saddle points but hurts convergence within the
neighborhood of optima (in the absence of step size annealing). Our theoretical
discovery partially corroborates the empirical success of MSGD in training deep
neural networks. Moreover, our analysis applies the martingale method and the
"Fixed-State-Chain" method from the stochastic approximation literature, which
are of independent interest.},
added-at = {2018-02-16T09:04:11.000+0100},
author = {Liu, Tianyi and Chen, Zhehui and Zhou, Enlu and Zhao, Tuo},
biburl = {https://www.bibsonomy.org/bibtex/28f7e25c7ba7c1a7d9001d6a0da3e99be/jk_itwm},
description = {Toward Deeper Understanding of Nonconvex Stochastic Optimization with
Momentum using Diffusion Approximations},
interhash = {356d5f9d19553b098624e5db359abbb8},
intrahash = {8f7e25c7ba7c1a7d9001d6a0da3e99be},
keywords = {SGD optimization theory},
note = {cite arxiv:1802.05155},
timestamp = {2018-02-16T09:04:11.000+0100},
title = {Toward Deeper Understanding of Nonconvex Stochastic Optimization with
Momentum using Diffusion Approximations},
url = {http://arxiv.org/abs/1802.05155},
year = 2018
}