Author of the publication

TDC: Towards Extremely Efficient CNNs on GPUs via Hardware-Aware Tucker Decomposition.

PPoPP, pages 260-273. ACM, (2023)


Other publications of authors with the same name

Communication Efficient Matrix Multiplication on Hypercubes. SPAA, pages 320-329. ACM, (1994)
ATP: Directed Graph Embedding with Asymmetric Transitivity Preservation. CoRR, (2018)
One-to-one mapping of process graphs onto a hypercube. ICS, pages 91-98. ACM, (1989)
Integrating parallel file systems with object-based storage devices. SC, page 27. ACM Press, (2007)
Optimization by neural networks. ICNN, pages 325-332. IEEE, (1988)
Analytical modeling of cache behavior for affine programs. Proc. ACM Program. Lang., 2 (POPL): 32:1-32:26 (2018)
Partitioning Graphs on Message-Passing Machines by Pairwise Mincut. Inf. Sci., 111 (1-4): 223-237 (1998)
Empirical performance model-driven data layout optimization and library call selection for tensor contraction expressions. J. Parallel Distributed Comput., 72 (3): 338-352 (2012)
Efficient synthesis of out-of-core algorithms using a nonlinear optimization solver. J. Parallel Distributed Comput., 66 (5): 659-673 (2006)
Tiling Multidimensional Iteration Spaces for Multicomputers. J. Parallel Distributed Comput., 16 (2): 108-120 (1992)