Author of the publication

Domain-Specific Optimization of Two Jacobi Smoother Kernels and Their Evaluation in the ECM Performance Model.

Parallel Process. Lett., (2014)

Other publications of authors with the same name

Domain decomposition and locality optimization for large-scale lattice Boltzmann simulations. CoRR, (2011)
Comparison of different Propagation Steps for the Lattice Boltzmann Method. CoRR, (2011)
Data access optimizations for highly threaded multi-core CPUs with multiple memory controllers. CoRR, (2007)
Towards energy efficiency and maximum computational intensity for stencil algorithms using wavefront diamond temporal blocking. CoRR, (2014)
Optimizing ccNUMA locality for task-parallel execution under OpenMP and TBB on multicore-based systems. CoRR, (2011)
Performance and power for highly parallel systems. Concurr. Comput. Pract. Exp., 28 (2): 187-188 (2016)
Hybrid-Parallel Sparse Matrix-Vector Multiplication with Explicit Communication Overlap on Current Multicore-Based Systems. Parallel Process. Lett., 21 (3): 339-358 (2011)
Automatic Throughput and Critical Path Analysis of x86 and ARM Assembly Kernels. PMBS@SC, page 1-6. IEEE, (2019)
Level-Based Blocking for Sparse Matrices: Sparse Matrix-Power-Vector Multiplication. IEEE Trans. Parallel Distributed Syst., 34 (2): 581-597 (February 2023)
Analytical performance estimation during code generation on modern GPUs. J. Parallel Distributed Comput., (March 2023)