Mean-field theory of two-layers neural networks: dimension-free bounds and kernel limit

S. Mei, T. Misiakiewicz, and A. Montanari. (2019). arXiv:1902.06015. Comment: 61 pages.

Abstract

We consider learning two layer neural networks using stochastic gradient descent. The mean-field description of this learning dynamics approximates the evolution of the network weights by an evolution in the space of probability distributions in $R^D$ (where $D$ is the number of parameters associated to each neuron). This evolution can be defined through a partial differential equation or, equivalently, as the gradient flow in the Wasserstein space of probability distributions. Earlier work shows that (under some regularity assumptions), the mean field description is accurate as soon as the number of hidden units is much larger than the dimension $D$. In this paper we establish stronger and more general approximation guarantees. First of all, we show that the number of hidden units only needs to be larger than a quantity dependent on the regularity properties of the data, and independent of the dimensions. Next, we generalize this analysis to the case of unbounded activation functions, which was not covered by earlier bounds. We extend our results to noisy stochastic gradient descent. Finally, we show that kernel ridge regression can be recovered as a special limit of the mean field analysis.

Description

[1902.06015] Mean-field theory of two-layers neural networks: dimension-free bounds and kernel limit

Links and resources

BibTeX key: mei2019meanfield
entry type: article
year: 2019
url: http://arxiv.org/abs/1902.06015
note: arXiv:1902.06015. Comment: 61 pages

Cite this publication

@article{mei2019meanfield,
  abstract = {We consider learning two layer neural networks using stochastic gradient descent. The mean-field description of this learning dynamics approximates the evolution of the network weights by an evolution in the space of probability distributions in $R^D$ (where $D$ is the number of parameters associated to each neuron). This evolution can be defined through a partial differential equation or, equivalently, as the gradient flow in the Wasserstein space of probability distributions. Earlier work shows that (under some regularity assumptions), the mean field description is accurate as soon as the number of hidden units is much larger than the dimension $D$. In this paper we establish stronger and more general approximation guarantees. First of all, we show that the number of hidden units only needs to be larger than a quantity dependent on the regularity properties of the data, and independent of the dimensions. Next, we generalize this analysis to the case of unbounded activation functions, which was not covered by earlier bounds. We extend our results to noisy stochastic gradient descent. Finally, we show that kernel ridge regression can be recovered as a special limit of the mean field analysis.},
  added-at = {2019-09-26T15:18:02.000+0200},
  author = {Mei, Song and Misiakiewicz, Theodor and Montanari, Andrea},
  biburl = {https://www.bibsonomy.org/bibtex/2a5d68ca685aaaaf39d67d0013f8993ab/kirk86},
  description = {[1902.06015] Mean-field theory of two-layers neural networks: dimension-free bounds and kernel limit},
  interhash = {197edaf5b69b663cb6ecaf5c0e51173d},
  intrahash = {a5d68ca685aaaaf39d67d0013f8993ab},
  keywords = {approximate dynamic generalization optimization probability readings},
  note = {arXiv:1902.06015. Comment: 61 pages},
  timestamp = {2019-09-26T15:18:40.000+0200},
  title = {Mean-field theory of two-layers neural networks: dimension-free bounds and kernel limit},
  url = {http://arxiv.org/abs/1902.06015},
  year = 2019
}
