Abstract
Understanding deep neural networks (DNNs) is a key challenge in the theory of
machine learning, with potential applications to the many fields where DNNs
have been successfully used. This article presents a scaling limit for a DNN
being trained by stochastic gradient descent. Our networks have a fixed (but
arbitrary) number $L \geq 2$ of inner layers; $N \gg 1$ neurons per layer; full
connections between layers; and fixed weights (or "random features" that are
not trained) near the input and output. Our results describe the evolution of
the DNN during training in the limit when $N \to +\infty$, which we relate to a
mean-field model of McKean-Vlasov type. Specifically, we show that network
weights are approximated by certain "ideal particles" whose distribution and
dependencies are described by the mean-field model. A key part of the proof is
to show existence and uniqueness for our McKean-Vlasov problem, which does not
seem to be amenable to existing theory. Our paper extends previous work on the
$L=1$ case by Mei, Montanari and Nguyen; Rotskoff and Vanden-Eijnden; and
Sirignano and Spiliopoulos. We also complement recent independent work on $L>1$
by Sirignano and Spiliopoulos (who consider a less natural scaling limit) and
Nguyen (who nonrigorously derives similar results).
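To fix ideas, the following is a minimal sketch of the kind of architecture and training scheme described above: a width-$N$ network with $L$ fully connected inner layers, fixed (untrained) random-feature layers near the input and output, and SGD acting only on the inner weights. The scalar-regression squared loss, the tanh activation, the $1/N$ averaging convention, and all names are illustrative assumptions, not the paper's exact construction.

    # Illustrative sketch only: width-N network, L trained inner layers,
    # fixed random features near input and output, plain SGD.
    import numpy as np

    rng = np.random.default_rng(0)
    d, N, L, lr = 10, 200, 3, 0.1         # input dim, width, inner layers, step size

    A = rng.normal(size=(N, d))           # fixed (untrained) input random features
    b = rng.normal(size=N)                # fixed (untrained) output weights
    W = [rng.normal(size=(N, N)) for _ in range(L)]   # trained inner-layer weights

    def forward(x):
        """Return all hidden activations and the scalar network output."""
        hs = [np.tanh(A @ x)]
        for Wl in W:
            hs.append(np.tanh(Wl @ hs[-1] / N))       # illustrative 1/N scaling
        return hs, b @ hs[-1] / N

    def sgd_step(x, y):
        """One stochastic gradient step on the inner weights for 0.5*(out - y)**2."""
        hs, out = forward(x)
        delta = (out - y) * b / N                     # gradient w.r.t. last hidden layer
        for l in reversed(range(L)):
            delta = delta * (1.0 - hs[l + 1] ** 2)    # backprop through tanh
            grad = np.outer(delta, hs[l]) / N         # gradient w.r.t. W[l]
            delta = W[l].T @ delta / N                # propagate to previous layer
            W[l] -= lr * grad

    for _ in range(1000):                             # toy data: y = sum of inputs
        x = rng.normal(size=d)
        sgd_step(x, x.sum())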