Author of the publication

LIBXSMM: accelerating small matrix multiplications by runtime code generation.

, , , and . SC, page 981-991. IEEE Computer Society, (2016)


Other publications of authors with the same name

Porting of the DBCSR library for Sparse Matrix-Matrix Multiplications to Intel Xeon Phi systems., , , , , and . CoRR, (2017)

Harnessing Deep Learning via a Single Building Block., , , , , , , , and . IPDPS, page 222-233. IEEE, (2020)

Tensor processing primitives: a programming abstraction for efficiency and portability in deep learning workloads., , , , , , , , , and 7 other author(s). SC, page 14. ACM, (2021)

Harnessing Deep Learning and HPC Kernels via High-Level Loop and Tensor Abstractions on CPU Architectures., , , , , , , and . IPDPS, page 950-963. IEEE, (2024)

Multigrid optical flow for deformable medical volume registration., , , , and . SIGGRAPH Talks, ACM, (2010)

Efficiency of High Order Spectral Element Methods on Petascale Architectures., , , , , and . ISC, volume 9697 of Lecture Notes in Computer Science, page 449-466. Springer, (2016)

Distributed Training of Generative Adversarial Networks for Fast Detector Simulation., , , , , , and . ISC Workshops, volume 11203 of Lecture Notes in Computer Science, page 487-503. Springer, (2018)

LIBXSMM: accelerating small matrix multiplications by runtime code generation., , , and . SC, page 981-991. IEEE Computer Society, (2016)

Towards a high-performance AI compiler with upstream MLIR., , , , , , , and . CoRR, (2024)

Reduced Precision Strategies for Deep Learning: A High Energy Physics Generative Adversarial Network Use Case., , , , , , , and . ICPRAM, page 251-258. SCITEPRESS, (2021)