M. Alom, A. Moody, N. Maruyama, B. Van Essen, and T. Taha. (2018). arXiv:1802.02615. Comment: 8 pages, 23 figures. Submitted to International Joint Conference on Neural Networks (IJCNN) 2018.
Y. Lin, S. Han, H. Mao, Y. Wang, and W. Dally. (2017). arXiv:1712.01887. Comment: We find 99.9% of the gradient exchange in distributed SGD is redundant; we reduce the communication bandwidth by two orders of magnitude without losing accuracy.
G. Bellec, D. Kappel, W. Maass, and R. Legenstein. (2017). arXiv:1711.05136. Comment: Accepted for publication at ICLR 2018. 10 pages (12 with references, 24 with appendix), 4 figures in the main text.