S. Amari, R. Karakida, and M. Oizumi. Proceedings of Machine Learning Research, volume 89, pages 694--702. PMLR, (16--18 Apr 2019)
C. Maddison, D. Paulin, Y. Teh, and A. Doucet. (2019). arXiv:1902.02257. Comment: Major revision, including simpler equivalent conditions for dual relative smoothness and applications to exponential penalty functions and p-norm regression.
N. Meinshausen and P. Bühlmann. (2006). arXiv:math/0608017. Comment: Published at http://dx.doi.org/10.1214/009053606000000281 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org).
G. Louppe, J. Hermans, and K. Cranmer. Proceedings of Machine Learning Research, volume 89, pages 1438--1447. PMLR, (16--18 Apr 2019)
P. Zhao, G. Rocha, and B. Yu. (2009). arXiv:0909.0411. Comment: Published at http://dx.doi.org/10.1214/07-AOS584 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org).
A. Slivkins. (2019). arXiv:1904.07272. Comment: The manuscript is complete, but comments are very welcome! To be published with Foundations and Trends in Machine Learning.
N. Tishby and N. Zaslavsky. (2015). arXiv:1503.02406. Comment: 5 pages, 2 figures. Invited paper to ITW 2015; 2015 IEEE Information Theory Workshop (ITW) (IEEE ITW 2015).
A. Achille and S. Soatto. (2017). arXiv:1706.01350. Comment: Deep learning, neural network, representation, flat minima, information bottleneck, overfitting, generalization, sufficiency, minimality, sensitivity, information complexity, stochastic gradient descent, regularization, total correlation, PAC-Bayes.
C. Wei, J. Lee, Q. Liu, and T. Ma. (2018). arXiv:1810.05369. Comment: Version 2: title changed from the original "On the Margin Theory of Feedforward Neural Networks"; substantial changes from the old version of the paper, including a new lower bound on NTK sample complexity. Version 3: reorganized NTK lower bound proof.
D. Soudry, E. Hoffer, M. Nacson, S. Gunasekar, and N. Srebro. (2017). arXiv:1710.10345. Comment: Final JMLR version, with improved discussions over v3. Main improvements in the journal version over the conference version (v2 appeared in ICLR): we proved the measure-zero case for the main theorem (with implications for the rates), and the multi-class case.