A. Achille and S. Soatto. (2017). arXiv:1706.01350. Comment: Deep learning, neural network, representation, flat minima, information bottleneck, overfitting, generalization, sufficiency, minimality, sensitivity, information complexity, stochastic gradient descent, regularization, total correlation, PAC-Bayes.
S. Chen, E. Dobriban, and J. Lee. (2019). arXiv:1907.10905. Comment: Changed title. Added results on overparametrized 2-layer nets. Added error bars to experiments. Numerous other minor improvements.
G. Dziugaite and D. Roy. (2017). arXiv:1703.11008. Comment: 14 pages, 1 table, 2 figures. Corresponds to the UAI camera-ready version and supplement. Includes additional references and related experiments.
J. Frankle, G. Dziugaite, D. Roy, and M. Carbin. (2019). arXiv:1912.05671. Comment: This submission subsumes arXiv:1903.01611 ("Stabilizing the Lottery Ticket Hypothesis" and "The Lottery Ticket Hypothesis at Scale").
S. Mei and A. Montanari. (2019). arXiv:1908.05355. Comment: Version 3 adds two sections: one provides the precise asymptotics of the training error; the other describes a Gaussian covariate model, which gives the same asymptotic test error as the random features model.
J. Negrea, M. Haghifam, G. Dziugaite, A. Khisti, and D. Roy. (2019). arXiv:1911.02151. Comment: 23 pages, 1 figure. To appear in Advances in Neural Information Processing Systems 32 (NeurIPS 2019).
N. Tishby and N. Zaslavsky. (2015). arXiv:1503.02406. Comment: 5 pages, 2 figures. Invited paper at the 2015 IEEE Information Theory Workshop (ITW 2015).
C. Wei, J. Lee, Q. Liu, and T. Ma. (2018). arXiv:1810.05369. Comment: Version 2: title changed from "On the Margin Theory of Feedforward Neural Networks"; substantial changes from the previous version, including a new lower bound on NTK sample complexity. Version 3: reorganized the NTK lower-bound proof.