Author of the publication

1-bit LAMB: Communication Efficient Large-Scale Large-Batch Training with LAMB's Convergence Speed.

HIPC, page 272-281. IEEE, (2022)

Other publications of authors with the same name

The Cilkview scalability analyzer. SPAA, page 145-156. ACM, (2010)
GRIP: Multi-Store Capacity-Optimized High-Performance Nearest Neighbor Search for Vector Search Engine. CIKM, page 1673-1682. ACM, (2019)
Scheduling for data center interactive services. Allerton, page 1170-1181. IEEE, (2011)
Scalable and Efficient MoE Training for Multitask Multilingual Models. CoRR, (2021)
ScaLA: Accelerating Adaptation of Pre-Trained Transformer-Based Language Models via Efficient Large-Batch Adversarial Noise. CoRR, (2022)
ZeroQuant-FP: A Leap Forward in LLMs Post-Training W4A8 Quantization Using Floating-Point Formats. CoRR, (2023)
ZeroQuant(4+2): Redefining LLMs Quantization with a New FP6-Centric Strategy for Diverse Generative Tasks. CoRR, (2023)
Better Caching in Search Advertising Systems with Rapid Refresh Predictions. WWW, page 1875-1884. ACM, (2018)
Fast LSTM Inference by Dynamic Decomposition on Cloud Systems. ICDM, page 748-757. IEEE, (2019)
ZeRO-Offload: Democratizing Billion-Scale Model Training. USENIX Annual Technical Conference, page 551-564. USENIX Association, (2021)