Author of the publication

Duplo: Lifting Redundant Memory Accesses of Deep Neural Networks for GPU Tensor Cores.

MICRO, page 725-737. IEEE, (2020)


Other publications of authors with the same name

NOMAD: Enabling Non-blocking OS-managed DRAM Cache via Tag-Data Decoupling. HPCA, page 193-205. IEEE, (2023)

A framework for architecture-level power, area, and thermal simulation and its application to network-on-chip design exploration. SIGMETRICS Perform. Evaluation Rev., 38 (4): 63-68 (2011)

SnakeByte: A TLB Design with Adaptive and Recursive Page Merging in GPUs. HPCA, page 1195-1207. IEEE, (2023)

TRINITY: Coordinated Performance, Energy and Temperature Management in 3D Processor-Memory Stacks. CoRR, (2018)

Energy-Efficient Acceleration of Deep Neural Networks on Realtime-Constrained Embedded Edge Devices. IEEE Access, (2020)

Architectural Reliability: Lifetime Reliability Characterization and Management of Many-Core Processors. IEEE Comput. Archit. Lett., 14 (2): 103-106 (2015)

Amdahl's law for lifetime reliability scaling in heterogeneous multicore processors. HPCA, page 594-605. IEEE Computer Society, (2016)

FineReg: Fine-Grained Register File Management for Augmenting GPU Throughput. MICRO, page 364-376. IEEE Computer Society, (2018)

Instruction-based energy estimation methodology for asymmetric manycore processor simulations. SimuTools, page 166-171. ICST/ACM, (2012)

Enhancements to FPMIPv6 for improved seamless vertical handover between LTE and heterogeneous access networks. IEEE Wirel. Commun., 20 (3): 1-0 (2013)