Author of the publication

Efficient sparse matrix-vector multiplication on x86-based many-core processors.

ICS, page 273-282. ACM, (2013)

Other publications of authors with the same name

Designing and dynamically load balancing hybrid LU for multi/many-core. Comput. Sci. Res. Dev., 26 (3-4): 211-220 (2011)

Fast Sort on CPUs, GPUs and Intel MIC Architectures. Technical Report, Intel Labs, (2010)

PhysBAM: physically based simulation. SIGGRAPH Courses, page 10:1-10:22. ACM, (2011)

Interactive hybrid simulation of large-scale traffic. SIGGRAPH Talks, page 6. ACM, (2011)

Can traditional programming bridge the ninja performance gap for parallel computing applications? Commun. ACM, 58 (5): 77-86 (2015)

Microscaling Data Formats for Deep Learning. CoRR, (2023)

Efficient Shared-Memory Implementation of High-Performance Conjugate Gradient Benchmark and its Application to Unstructured Matrices. SC, page 945-955. IEEE Computer Society, (2014)

Lattice QCD on Intel® Xeon Phi™ Coprocessors. ISC, volume 7905 of Lecture Notes in Computer Science, page 40-54. Springer, (2013)

Exploring Shared-Memory Optimizations for an Unstructured Mesh CFD Application on Modern Parallel Systems. IPDPS, page 723-732. IEEE Computer Society, (2015)

High Performance Non-uniform FFT on Modern X86-based Multi-core Systems. IPDPS, page 449-460. IEEE Computer Society, (2012)