D. Soudry, E. Hoffer, M. Nacson, S. Gunasekar, and N. Srebro. (2017). arXiv:1710.10345. Comment: Final JMLR version, with improved discussions over v3. Main improvements in the journal version over the conference version (v2 appeared in ICLR): we proved the measure-zero case for the main theorem (with implications for the rates), and the multi-class case.
J. Behrmann, W. Grathwohl, R. Chen, D. Duvenaud, and J. Jacobsen. Proceedings of the 36th International Conference on Machine Learning, volume 97 of Proceedings of Machine Learning Research, pages 573--582. Long Beach, California, USA, PMLR, (09--15 Jun 2019).