Understanding Scalability and Fine-Grain Parallelism of Synchronous Data Parallel Training.

MLHPC@SC, page 1-8. IEEE, (2019)

Other publications of authors with the same name

On the Benefits of Transparent Compression for Cost-Effective Cloud Data Storage. Trans. Large Scale Data Knowl. Centered Syst., (2011)

Towards efficient on-demand VM provisioning: Study of VM runtime I/O access patterns to shared image content. IM, page 321-329. IEEE, (2015)

Towards Low-Overhead Resilience for Data Parallel Deep Learning. CCGRID, page 336-345. IEEE, (2022)

Scalable Incremental Checkpointing using GPU-Accelerated De-Duplication. ICPP, page 665-674. ACM, (2023)

Towards High Performance Resilience Using Performance Portable Abstractions. Euro-Par, volume 12820 of Lecture Notes in Computer Science, page 451-465. Springer, (2021)

MPIGDB: A Flexible Debugging Infrastructure for MPI Programs. FlexScience@HPDC, page 11-18. ACM, (2023)

Integrating process, control-flow, and data resiliency layers using a hybrid Fenix/Kokkos approach. CLUSTER, page 418-428. IEEE, (2022)

Spark-DIY: A Framework for Interoperable Spark Operations with High Performance Block-Based Data Models. BDCAT, page 1-10. IEEE Computer Society, (2018)

Optimizing Asynchronous Multi-Level Checkpoint/Restart Configurations with Machine Learning. IPDPS Workshops, page 1036-1043. IEEE, (2020)

Improving Performance of Data Dumping with Lossy Compression for Scientific Simulation. CLUSTER, page 1-11. IEEE, (2019)