Author of the publication

Maximizing Performance Through Memory Hierarchy-Driven Data Layout Transformations.

MCHPC@SC, page 1-10. IEEE, (2022)


Other publications of authors with the same name

Immersive Technology in the Public School Classroom: When a Class Meets. iLRN, page 1-8. IEEE, (2021)

A Case Study of Porting HPGMG from CUDA to OpenMP Target Offload. IWOMP, volume 12295 of Lecture Notes in Computer Science, page 37-51. Springer, (2020)

A CAD-based methodology to optimize HLS code via the Roofline model. ICCAD, page 116:1-116:9. IEEE, (2020)

Roofline Model Toolkit: A Practical Tool for Architectural and Program Analysis. PMBS@SC, volume 8966 of Lecture Notes in Computer Science, page 129-148. Springer, (2014)

Performance Analysis of GPU Programming Models Using the Roofline Scaling Trajectories. Bench, volume 12093 of Lecture Notes in Computer Science, page 3-19. Springer, (2019)

s-Step Krylov Subspace Methods as Bottom Solvers for Geometric Multigrid. IPDPS, page 1149-1158. IEEE Computer Society, (2014)

Performance Variability on Xeon Phi. ISC Workshops, volume 10524 of Lecture Notes in Computer Science, page 419-429. Springer, (2017)

Exploiting reuse and vectorization in blocked stencil computations on CPUs and GPUs. SC, page 52:1-52:44. ACM, (2019)

Loop Chaining: A Programming Abstraction for Balancing Locality and Parallelism. IPDPS Workshops, page 375-384. IEEE, (2013)

FTL: Transfer Learning Nonlinear Plasma Dynamic Transitions in Low Dimensional Embeddings via Deep Neural Networks. CoRR, (2024)