Author of the publication

CUDA-For-Clusters: A System for Efficient Execution of CUDA Kernels on Multi-core Clusters.

, , and . Euro-Par, volume 7484 of Lecture Notes in Computer Science, page 415-426. Springer, (2012)


Other publications of authors with the same name

Data Flow Implementation of Generalized Guarded Commands., and . PARLE (1), volume 505 of Lecture Notes in Computer Science, page 372-389. Springer, (1991)

Author Rebuttal to Rocha et al. "Comments on Minimizing Buffer Requirements under Rate-Optimal Schedule in Regular Dataflow Networks"., and . J. Signal Process. Syst., 81 (1): 135-136 (2015)

Attempting guards in parallel: A data flow approach to execute generalized guarded commands., , and . Int. J. Parallel Program., 21 (4): 225-268 (1992)

A Theory for Co-Scheduling Hardware and Software Pipelines in ASIPs and Embedded Processors., , and . Des. Autom. Embed. Syst., 6 (3): 243-275 (2002)

MicroRefresh: Minimizing Refresh Overhead in DRAM Caches., , and . MEMSYS, page 350-361. ACM, (2016)

Synergistic execution of stream programs on multicores with accelerators., , and . LCTES, page 99-108. ACM, (2009)

Reconciling transactional conflicts with compiler's help., and . CGO, page 53-62. ACM, (2012)

Area and Power Reduction of Embedded DSP Systems using Instruction Compression and Re-configurable Encoding., , and . J. VLSI Signal Process. Syst., 44 (3): 245-267 (2006)

Taming warp divergence., and . CGO, page 50-60. ACM, (2017)

Design and Performance Evaluation of EXMAN: An EXtended MANchester Data Flow Computer., , and . IEEE Trans. Computers, 35 (3): 229-244 (1986)