J. Behrmann, W. Grathwohl, R. Chen, D. Duvenaud, and J. Jacobsen. Proceedings of the 36th International Conference on Machine Learning, volume 97 of Proceedings of Machine Learning Research, pages 573--582. Long Beach, California, USA, PMLR, (09--15 Jun 2019)
Q. Wang, L. Huang, Z. Jiang, K. Knight, H. Ji, M. Bansal, and Y. Luan. (2019). arXiv:1905.07870. Comment: 12 pages. Accepted by ACL 2019. Code and resources will be available at https://github.com/EagleW/PaperRobot.
Y. Lin, S. Han, H. Mao, Y. Wang, and W. Dally. (2017). arXiv:1712.01887. Comment: we find 99.9% of the gradient exchange in distributed SGD is redundant; we reduce the communication bandwidth by two orders of magnitude without losing accuracy.