A. Achille and S. Soatto. (2017). arXiv:1706.01350. Comment: Deep learning, neural network, representation, flat minima, information bottleneck, overfitting, generalization, sufficiency, minimality, sensitivity, information complexity, stochastic gradient descent, regularization, total correlation, PAC-Bayes.
E. Ryu, J. Liu, S. Wang, X. Chen, Z. Wang, and W. Yin. (2019). arXiv:1905.05406. Comment: Published in the International Conference on Machine Learning, 2019.