L. Smith. (2017). arXiv:1506.01186. Comment: Presented at WACV 2017; see https://github.com/bckenstler/CLR for instructions to implement CLR in Keras.
D. Kingma and J. Ba. (2014). arXiv:1412.6980. Comment: Published as a conference paper at the 3rd International Conference for Learning Representations, San Diego, 2015.
Y. Lin, S. Han, H. Mao, Y. Wang, and W. Dally. (2017). arXiv:1712.01887. Comment: We find 99.9% of the gradient exchange in distributed SGD is redundant; we reduce the communication bandwidth by two orders of magnitude without losing accuracy.
R. Kidambi, P. Netrapalli, P. Jain, and S. Kakade. (2018). arXiv:1803.05591. Comment: 28 pages, 10 figures. Appears as an oral presentation at International Conference on Learning Representations (ICLR), 2018. Code implementing the ASGD method can be found at https://github.com/rahulkidambi/AccSGD.