Author of the publication

Spartan: A Sparsity-Adaptive Framework to Accelerate Deep Neural Network Training on GPUs.

IEEE Trans. Parallel Distributed Syst., 32 (10): 2448-2463 (2021)

Other publications of authors with the same name

Data Transfer Optimizations for Host-CPU and Accelerators in AXI4MLIR. CoRR, (2024)
AXI4MLIR: User-Driven Automatic Host Code Generation for Custom AXI-Based Accelerators. CGO, pages 143-157. IEEE, (2024)
From High-Level Frameworks to custom Silicon with SODA. HCS, pages 1-13. IEEE, (2022)
VCSR: An Efficient GPU Memory-Aware Sparse Format. IEEE Trans. Parallel Distributed Syst., 33 (10): 3977-3989 (2022)
ML-CGRA: An Integrated Compilation Framework to Enable Efficient Machine Learning Acceleration on CGRAs. DAC, pages 1-6. IEEE, (2023)
AXI4MLIR: User-Driven Automatic Host Code Generation for Custom AXI-Based Accelerators. CoRR, (2023)
DRIPS: Dynamic Rebalancing of Pipelined Streaming Applications on CGRAs. HPCA, pages 304-316. IEEE, (2022)
An MLIR-based Compiler Flow for System-Level Design and Hardware Acceleration. ICCAD, pages 6:1-6:9. ACM, (2022)
Performance Evaluation and Improvement of Real-Time Computer Vision Applications for Edge Computing Devices. ICPE (Companion), pages 139-144. ACM, (2021)
Towards Automatic and Agile AI/ML Accelerator Design with End-to-End Synthesis. ASAP, pages 218-225. IEEE, (2021)