Author of the publication

Improving Communication Performance and Scalability of Native Applications on Intel Xeon Phi Coprocessor Clusters.

, , , , , , , , , , and . IPDPS, page 1083-1092. IEEE Computer Society, (2014)

Other publications of authors with the same name

Dynamic fine-grained sparse memory accesses., , , , and . MEMSYS, page 85-97. ACM, (2018)
Deep Learning Inference in Facebook Data Centers: Characterization, Performance Optimizations and Hardware Implications., , , , , , , , , and 18 other author(s). CoRR, (2018)
A framework for low-communication 1-D FFT., , , and . Sci. Program., 21 (3-4): 181-195 (2013)
Efficient Soft-Error Detection for Low-precision Deep Learning Recommendation Models., , , , , , and . CoRR, (2021)
Unity: Accelerating DNN Training Through Joint Optimization of Algebraic Transformations and Parallelization., , , , , , , , , and 5 other author(s). OSDI, page 267-284. USENIX Association, (2022)
With Shared Microexponents, A Little Shifting Goes a Long Way., , , , , , , , , and 12 other author(s). ISCA, page 83:1-83:13. ACM, (2023)
Enabling Sparse Winograd Convolution by Native Pruning., , and . CoRR, (2017)
Gate scheduling for quantum algorithms., and . CoRR, (2017)
Navigating the maze of graph analytics frameworks using massive graph datasets., , , , , , , , and . SIGMOD Conference, page 979-990. ACM, (2014)
Improving concurrency and asynchrony in multithreaded MPI applications using software offloading., , , , , , , and . SC, page 30:1-30:12. ACM, (2015)