Author of the publication

Advancing Direct Convolution Using Convolution Slicing Optimization and ISA Extensions.

ACM Trans. Archit. Code Optim., 20 (4): 54:1-54:26 (December 2023)

Other publications of authors with the same name

Automated GPU Grid Geometry Selection for OPENMP Kernels. SBAC-PAD, page 442-449. IEEE, (2018)

Flexibility Is Key in Organizing a Global Professional Conference Online: The ICPE 2020 Experience in the COVID-19 Era. CoRR, (2020)

Topic 9: Parallel and Distributed Programming. Euro-Par, volume 5168 of Lecture Notes in Computer Science, page 686-687. Springer, (2008)

Should potential loop optimizations influence inlining decisions? CASCON, page 30-38. IBM, (2003)

10th Workshop on Compiler-Driven Performance. CASCON, page 371-372. IBM / ACM, (2011)

Eliminating Redundant Join-Set Computations in Static Single Assignment. J. Univers. Comput. Sci., 12 (8): 1007-1019 (2006)

Forma: A framework for safe automatic array reshaping. ACM Trans. Program. Lang. Syst., 30 (1): 2 (2007)

Generalized Index-Set Splitting. CC, volume 3443 of Lecture Notes in Computer Science, page 106-120. Springer, (2005)

A Characterization of Shared Data Access Patterns in UPC Programs. LCPC, volume 4382 of Lecture Notes in Computer Science, page 111-125. Springer, (2006)

Using shared-data localization to reduce the cost of inspector-execution in unified-parallel-C programs. Parallel Comput., (2016)