Author of the publication

A script-based autotuning compiler system to generate high-performance CUDA code.

ACM Trans. Archit. Code Optim., 9 (4): 31:1-31:25 (2013)

Other publications of authors with the same name

Compiler-Controlled Caching in Superword Register Files for Multimedia Extension Architectures. IEEE PACT, page 45-55. IEEE Computer Society, (2002)
Evaluating compiler technology for control-flow optimizations for multimedia extension architectures. Microprocess. Microsystems, 33 (4): 235-243 (2009)
Superword-Level Parallelism in the Presence of Control Flow. CGO, page 165-175. IEEE Computer Society, (2005)
Mapping Irregular Applications to DIVA, a PIM-based Data-Intensive Architecture. SC, page 57. ACM, (1999)
The Combined Effectiveness of Unimodular Transformations, Tiling, and Software Prefetching. IPPS, page 39-45. IEEE Computer Society, (1996)
A tile selection algorithm for data locality and cache interference. International Conference on Supercomputing, page 492-499. ACM, (1999)
Hierarchical parallelization and optimization of high-order stencil computations on multicore clusters. J. Supercomput., 62 (2): 946-966 (2012)
Exploiting Superword-Level Locality in Multimedia Extension Architectures. J. Instruction-Level Parallelism, (2003)
Processing-in-memory technology for knowledge discovery algorithms. DaMoN, page 2. ACM, (2006)