Author of the publication

Improving Communication Performance and Scalability of Native Applications on Intel Xeon Phi Coprocessor Clusters.

IPDPS, page 1083-1092. IEEE Computer Society, (2014)

Other publications of authors with the same name

Out-of-Distribution Detection Using an Ensemble of Self Supervised Leave-Out Classifiers. ECCV (8), volume 11212 of Lecture Notes in Computer Science, page 560-574. Springer, (2018)

Mixed Low-precision Deep Learning Inference using Dynamic Fixed Point. CoRR, (2017)

MADRaS : Multi Agent Driving Simulator. CoRR, (2020)

Mixed Precision Training With 8-bit Floating Point. CoRR, (2019)

High Performance Scalable FPGA Accelerator for Deep Neural Networks. CoRR, (2019)

AUTOSPARSE: Towards Automated Sparse Training of Deep Neural Networks. CoRR, (2023)

Exploring Shared-Memory Optimizations for an Unstructured Mesh CFD Application on Modern Parallel Systems. IPDPS, page 723-732. IEEE Computer Society, (2015)

High Performance Non-uniform FFT on Modern X86-based Multi-core Systems. IPDPS, page 449-460. IEEE Computer Society, (2012)

Automatic Model Parallelism for Deep Neural Networks with Compiler and Hardware Support. CoRR, (2019)

PolyScientist: Automatic Loop Transformations Combined with Microkernels for Optimization of Deep Learning Primitives. CoRR, (2020)