Author of the publication

Porting existing cache-oblivious linear algebra HPC modules to larrabee architecture.

, , and . Conf. Computing Frontiers, page 91-92. ACM, (2010)

Other publications of authors with the same name

Efficient Shared-Memory Implementation of High-Performance Conjugate Gradient Benchmark and its Application to Unstructured Matrices., , , , , , , , and . SC, page 945-955. IEEE Computer Society, (2014)

An efficient vectorization of linked-cell particle simulations., and . Conf. Computing Frontiers, page 241-244. ACM, (2012)

Porting existing cache-oblivious linear algebra HPC modules to larrabee architecture., , and . Conf. Computing Frontiers, page 91-92. ACM, (2010)

Full correlation matrix analysis of fMRI data on Intel® Xeon Phi™ coprocessors, , , , , , , , and . SC'15: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, page 1-12. IEEE, (2015)

FPGA-Based AI Smart NICs for Scalable Distributed AI Training Systems., , , , , and . IEEE Comput. Archit. Lett., 21 (2): 49-52 (2022)

Efficiency of High Order Spectral Element Methods on Petascale Architectures., , , , , and . ISC, volume 9697 of Lecture Notes in Computer Science, page 449-466. Springer, (2016)

Petaflop Seismic Simulations in the Public Cloud., , and . ISC, volume 11501 of Lecture Notes in Computer Science, page 167-185. Springer, (2019)

Harnessing Deep Learning and HPC Kernels via High-Level Loop and Tensor Abstractions on CPU Architectures., , , , , , and . CoRR, (2023)

Microscaling Data Formats for Deep Learning., , , , , , , , , and 23 other author(s). CoRR, (2023)

Optimizing Deep Learning RNN Topologies on Intel Architecture., , , , , , and . Supercomput. Front. Innov., 6 (3): 64-85 (2019)