HET: Scaling out Huge Embedding Model Training via Cache-enabled Distributed Framework.

Proc. VLDB Endow., 15 (2): 312-320 (2021)

Other publications of authors with the same name

HET: Scaling out Huge Embedding Model Training via Cache-enabled Distributed Framework. Proc. VLDB Endow., 15 (2): 312-320 (2021)
Improving Automatic Parallel Training via Balanced Memory Workload Optimization. CoRR (2023)
HetuMoE: An Efficient Trillion-scale Mixture-of-Expert Distributed Training System. CoRR (2022)
Galvatron: Efficient Transformer Training over Multiple GPUs Using Automatic Parallelism. Proc. VLDB Endow., 16 (3): 470-479 (2022)
TSPLIT: Fine-grained GPU Memory Management for Efficient DNN Training via Tensor Splitting. ICDE, pages 2615-2628. IEEE (2022)
HET-GMP: A Graph-based System Approach to Scaling Large Embedding Model Training. SIGMOD Conference, pages 470-480. ACM (2022)
Angel-PTM: A Scalable and Economical Large-scale Pre-training System in Tencent. Proc. VLDB Endow., 16 (12): 3781-3794 (2023)
Heterogeneity-Aware Distributed Machine Learning Training via Partial Reduce. SIGMOD Conference, pages 2262-2270. ACM (2021)
Hetu: a highly efficient automatic parallel distributed deep learning system. Sci. China Inf. Sci. (January 2023)
OSDP: Optimal Sharded Data Parallel for Distributed Deep Learning. CoRR (2022)