Author of the publication

Please choose a person to relate this publication to

To distinguish between persons with the same name, the academic degree and the title of an important publication are displayed. You can also use the button next to the name to display some of the publications already assigned to that person.

 

Other publications of authors with the same name

Analysis and Design Techniques towards High-Performance and Energy-Efficient Dense Linear Solvers on GPUs. IEEE Trans. Parallel Distributed Syst., 29 (12): 2700-2712 (2018)

Matrix multiplication on batches of small matrices in half and half-complex precisions. J. Parallel Distributed Comput. (2020)

Performance optimization of Sparse Matrix-Vector Multiplication for multi-component PDE-based applications using GPUs. Concurr. Comput. Pract. Exp., 28 (12): 3447-3465 (2016)

High Performance Multi-GPU SpMV for Multi-component PDE-Based Applications. Euro-Par, volume 9233 of Lecture Notes in Computer Science, pages 601-612. Springer (2015)

Progressive Optimization of Batched LU Factorization on GPUs. HPEC, pages 1-6. IEEE (2019)

Design, Optimization, and Benchmarking of Dense Linear Algebra Algorithms on AMD GPUs. HPEC, pages 1-7. IEEE (2020)

Portable and Efficient Dense Linear Algebra in the Beginning of the Exascale Era. P3HPC@SC, pages 36-46. IEEE (2022)

Evaluating the Performance of NVIDIA's A100 Ampere GPU for Sparse and Batched Computations. PMBS@SC, pages 26-38. IEEE (2020)

Parallel Programming Models for Dense Linear Algebra on Heterogeneous Systems. Supercomput. Front. Innov., 2 (4): 67-86 (2015)

Fast Cholesky factorization on GPUs for batch and native modes in MAGMA. J. Comput. Sci. (2017)