Author of the publication

Improving Communication Performance and Scalability of Native Applications on Intel Xeon Phi Coprocessor Clusters.

IPDPS, pages 1083-1092. IEEE Computer Society, (2014)

Other publications of authors with the same name

Lattice QCD on Intel® Xeon Phi™ Coprocessors. ISC, volume 7905 of Lecture Notes in Computer Science, pages 40-54. Springer, (2013)

Efficient Shared-Memory Implementation of High-Performance Conjugate Gradient Benchmark and its Application to Unstructured Matrices. SC, pages 945-955. IEEE Computer Society, (2014)

Optimizing Deep Learning RNN Topologies on Intel Architecture. Supercomput. Front. Innov., 6 (3): 64-85 (2019)

Harnessing Deep Learning and HPC Kernels via High-Level Loop and Tensor Abstractions on CPU Architectures. CoRR, (2023)

High Performance Non-uniform FFT on Modern X86-based Multi-core Systems. IPDPS, pages 449-460. IEEE Computer Society, (2012)

Lattice QCD with Domain Decomposition on Intel® Xeon Phi Co-Processors. SC, pages 69-80. IEEE Computer Society, (2014)

Optimizing Wilson-Dirac Operator and Linear Solvers for Intel® KNL. ISC Workshops, volume 9945 of Lecture Notes in Computer Science, pages 415-427. (2016)

Efficient and Generic 1D Dilated Convolution Layer for Deep Learning. CoRR, (2021)

Accelerating Deep Learning based Identification of Chromatin Accessibility from noisy ATAC-seq Data. IPDPS Workshops, pages 176-185. IEEE, (2022)

Optimization of geometric multigrid for emerging multi- and manycore processors. SC, page 96. IEEE/ACM, (2012)