Author of the publication

Overcoming Data Transfer Bottlenecks in DNN Accelerators via Layer-Conscious Memory Managment.

FPGA, page 120. ACM, (2019)

Other publications of authors with the same name

GCNear: A Hybrid Architecture for Efficient GCN Training with Near-Memory Processing. CoRR, (2021)

ArchExplorer: Microarchitecture Exploration Via Bottleneck Analysis. MICRO, page 268-282. ACM, (2023)

FlexBFS: a parallelism-aware implementation of breadth-first search on GPU. PPoPP, page 279-280. ACM, (2012)

An Intermediate-Centric Dataflow for Transposed Convolution Acceleration on FPGA. ACM Trans. Embed. Comput. Syst., 22 (6): 92:1-92:22 (November 2023)

Overcoming Data Transfer Bottlenecks in DNN Accelerators via Layer-Conscious Memory Managment. FPGA, page 120. ACM, (2019)

Generating Systolic Array Accelerators With Reusable Blocks. IEEE Micro, 40 (4): 85-92 (2020)

PetS: A Unified Framework for Parameter-Efficient Transformers Serving. USENIX Annual Technical Conference, page 489-504. USENIX Association, (2022)

Distributed Control Independence for Composable Multi-processors. ACIS-ICIS, page 124-129. IEEE Computer Society, (2012)

FTDL: An FPGA-tailored Architecture for Deep Learning Systems. FPGA, page 320. ACM, (2020)

Efficient Super-Resolution System With Block-Wise Hybridization and Quantized Winograd on FPGA. IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 42 (11): 3910-3924 (November 2023)