Author of the publication

Performance Instrumentation and Compiler Optimizations for MPI/OpenMP Applications.

IWOMP, volume 4315 of Lecture Notes in Computer Science, page 267-278. Springer, (2006)

Other publications of authors with the same name

INSTANT: A Runtime Framework to Orchestrate In-Situ Workflows. Euro-Par, volume 14100 of Lecture Notes in Computer Science, page 199-213. Springer, (2023)

Designing a 3D Parallel Memory-Aware Lattice Boltzmann Algorithm on Manycore Systems. Euro-Par, volume 12820 of Lecture Notes in Computer Science, page 519-535. Springer, (2021)

Designing a Synchronization-reducing Clustering Method on Manycores: Some Issues and Improvements. MLHPC@SC, page 9:1-9:8. ACM, (2017)

An Extended Roofline Model with Communication-Awareness for Distributed-Memory HPC Systems. HPC Asia, page 26-35. ACM, (2019)

Designing a 3D Parallel Memory-Aware Lattice Boltzmann Algorithm on Manycore Systems. CoRR, (2022)

Utilizing GPU Performance Counters to Characterize GPU Kernels via Machine Learning. ICCS (1), volume 12137 of Lecture Notes in Computer Science, page 88-101. Springer, (2020)

L2 Cache Modeling for Scientific Applications on Chip Multi-Processors. ICPP, page 51. IEEE Computer Society, (2007)

Feedback-directed thread scheduling with memory considerations. HPDC, page 97-106. ACM, (2007)

OpenGraphGym-MG: Using Reinforcement Learning to Solve Large Graph Optimization Problems on MultiGPU Systems. CoRR, (2021)

A Scalable Non-blocking Multicast Scheme for Distributed DAG Scheduling. ICCS (1), volume 5544 of Lecture Notes in Computer Science, page 195-204. Springer, (2009)