LU, QR, and Cholesky factorizations: Programming model, performance analysis and optimization techniques for the Intel Knights Landing Xeon Phi.

HPEC, page 1-7. IEEE, (2016)

Other publications of authors with the same name

Fast Cholesky factorization on GPUs for batch and native modes in MAGMA. J. Comput. Sci., (2017)

Batched matrix computations on hardware accelerators based on GPUs. Int. J. High Perform. Comput. Appl., 29 (2): 193-208 (2015)

Abstract: A Novel Hybrid CPU-GPU Generalized Eigensolver for Electronic Structure Calculations Based on Fine Grained Memory Aware Tasks. SC Companion, page 1338-1339. IEEE Computer Society, (2012)

Harnessing GPU tensor cores for fast FP16 arithmetic to speed up mixed-precision iterative refinement solvers. SC, page 47:1-47:11. IEEE / ACM, (2018)

A hybrid Hermitian general eigenvalue solver. CoRR, (2012)

Parallel reduction to condensed forms for symmetric eigenvalue problems using aggregated fine-grained and memory-aware kernels. SC, page 8:1-8:11. ACM, (2011)

Performance Analysis of Parallel FFT on Large Multi-GPU Systems. IPDPS Workshops, page 372-381. IEEE, (2022)

Parallel Programming Models for Dense Linear Algebra on Heterogeneous Systems. Supercomput. Front. Innov., 2 (4): 67-86 (2015)

Model-Driven One-Sided Factorizations on Multicore Accelerated Systems. Supercomput. Front. Innov., 1 (1): 85-115 (2014)

Heterogenous Acceleration for Linear Algebra in Multi-coprocessor Environments. VECPAR, volume 8969 of Lecture Notes in Computer Science, page 31-42. Springer, (2014)