Author of the publication

Fast Batched Matrix Multiplication for Small Sizes Using Half-Precision Arithmetic on GPUs.

A. Abdelfattah, S. Tomov, and J. Dongarra. IPDPS, page 111-122. IEEE, (2019)

Please choose a person to relate this publication to

To distinguish between persons with the same name, the academic degree and the title of an important publication are displayed. You can also use the button next to a name to display some publications already assigned to that person.

Aschraf Abdelfattah

Basem Abdelfattah

Aadel Abdelfattah

Samer Abdelfattah

Nahed Abdelfattah

Other publications of authors with the same name

Performance optimization of Sparse Matrix-Vector Multiplication for multi-component PDE-based applications using GPUs. A. Abdelfattah, H. Ltaief, D. Keyes, and J. Dongarra. Concurr. Comput. Pract. Exp., 28 (12): 3447-3465 (2016)

Matrix multiplication on batches of small matrices in half and half-complex precisions. A. Abdelfattah, S. Tomov, and J. Dongarra. J. Parallel Distributed Comput., (2020)

Analysis and Design Techniques towards High-Performance and Energy-Efficient Dense Linear Solvers on GPUs. A. Abdelfattah, A. Haidar, S. Tomov, and J. Dongarra. IEEE Trans. Parallel Distributed Syst., 29 (12): 2700-2712 (2018)

Evaluating the Performance of NVIDIA's A100 Ampere GPU for Sparse and Batched Computations. H. Anzt, Y. Tsai, A. Abdelfattah, T. Cojean, and J. Dongarra. PMBS@SC, page 26-38. IEEE, (2020)

High Performance Multi-GPU SpMV for Multi-component PDE-Based Applications. A. Abdelfattah, H. Ltaief, and D. Keyes. Euro-Par, volume 9233 of Lecture Notes in Computer Science, page 601-612. Springer, (2015)

Portable and Efficient Dense Linear Algebra in the Beginning of the Exascale Era. M. Gates, A. YarKhan, D. Sukkari, K. Akbudak, S. Cayrols, D. Bielich, A. Abdelfattah, M. Farhan, and J. Dongarra. P3HPC@SC, page 36-46. IEEE, (2022)

Design, Optimization, and Benchmarking of Dense Linear Algebra Algorithms on AMD GPUs. C. Brown, A. Abdelfattah, S. Tomov, and J. Dongarra. HPEC, page 1-7. IEEE, (2020)

Progressive Optimization of Batched LU Factorization on GPUs. A. Abdelfattah, S. Tomov, and J. Dongarra. HPEC, page 1-6. IEEE, (2019)

Parallel Programming Models for Dense Linear Algebra on Heterogeneous Systems. J. Dongarra, M. Abalenkovs, A. Abdelfattah, M. Gates, A. Haidar, J. Kurzak, P. Luszczek, S. Tomov, I. Yamazaki, and A. YarKhan. Supercomput. Front. Innov., 2 (4): 67-86 (2015)

Systematic Approach in Optimizing Numerical Memory-Bound Kernels on GPU. A. Abdelfattah, D. Keyes, and H. Ltaief. Euro-Par Workshops, volume 7640 of Lecture Notes in Computer Science, page 207-216. Springer, (2012)

BibSonomy

Disambiguation of "Abdelfattah, Ahmad"
