Author of the publication

Optimizations in a high-performance conjugate gradient benchmark for IA-based multi- and many-core processors.

Int. J. High Perform. Comput. Appl., 30 (1): 11-27 (2016)

Other publications of authors with the same name

Microscaling Data Formats for Deep Learning. With 23 other author(s). CoRR (2023)

Can traditional programming bridge the ninja performance gap for parallel computing applications? Commun. ACM, 58 (5): 77-86 (2015)

Lattice QCD on Intel® Xeon Phi™ Coprocessors. ISC, volume 7905 of Lecture Notes in Computer Science, pages 40-54. Springer (2013)

Designing and dynamically load balancing hybrid LU for multi/many-core. Comput. Sci. Res. Dev., 26 (3-4): 211-220 (2011)

Efficient Shared-Memory Implementation of High-Performance Conjugate Gradient Benchmark and its Application to Unstructured Matrices. SC, pages 945-955. IEEE Computer Society (2014)

Fast Sort on CPUs, GPUs and Intel MIC Architectures. Technical Report, Intel Labs (2010)

Interactive hybrid simulation of large-scale traffic. SIGGRAPH Talks, page 6. ACM (2011)

PhysBAM: physically based simulation. SIGGRAPH Courses, pages 10:1-10:22. ACM (2011)

Exploring Shared-Memory Optimizations for an Unstructured Mesh CFD Application on Modern Parallel Systems. IPDPS, pages 723-732. IEEE Computer Society (2015)

High Performance Non-uniform FFT on Modern X86-based Multi-core Systems. IPDPS, pages 449-460. IEEE Computer Society (2012)