Y. Lin, S. Han, H. Mao, Y. Wang, and W. Dally. (2017). arXiv:1712.01887. Comment: we find 99.9% of the gradient exchange in distributed SGD is redundant; we reduce the communication bandwidth by two orders of magnitude without losing accuracy.
Q. Wang, L. Huang, Z. Jiang, K. Knight, H. Ji, M. Bansal, and Y. Luan. (2019). arXiv:1905.07870. Comment: 12 pages. Accepted by ACL 2019. Code and resources will be available at https://github.com/EagleW/PaperRobot.