Author of the publication

Performance-Portable Autotuning of OpenCL Kernels for Convolutional Layers of Deep Neural Networks.

MLHPC@SC, pages 9-18. IEEE Computer Society (2016)

Other publications of authors with the same name

Mixed-Precision Algorithm for Finding Selected Eigenvalues and Eigenvectors of Symmetric and Hermitian Matrices. ScalAH@SC, pages 43-50. IEEE (2022)

Adaptive block size for dense QR factorization in hybrid CPU-GPU systems via statistical modeling. Parallel Comput., 40 (5-6): 70-85 (2014)

Autotuning Numerical Dense Linear Algebra for Batched Computation With GPU Hardware Accelerators. Proc. IEEE, 106 (11): 2040-2055 (2018)

A survey of numerical linear algebra methods utilizing mixed-precision arithmetic. Int. J. High Perform. Comput. Appl. (2021)

A Survey of Numerical Methods Utilizing Mixed Precision Arithmetic. CoRR (2020)

Massively Parallel Automated Software Tuning. ICPP, pages 92:1-92:10. ACM (2019)

Scalable Data Generation for Evaluating Mixed-Precision Solvers. HPEC, pages 1-6. IEEE (2020)

Tuning Block Size for QR Factorization on CPU-GPU Hybrid Systems. MCSoC, pages 205-211. IEEE Computer Society (2012)

Performance-Portable Autotuning of OpenCL Kernels for Convolutional Layers of Deep Neural Networks. MLHPC@SC, pages 9-18. IEEE Computer Society (2016)