Author of the publication

Making a Better Use of Caches for GCN Accelerators with Feature Slicing and Automatic Tile Morphing.

, , , , , and . IEEE Comput. Archit. Lett., 20 (2): 102-105 (2021)

Other publications of authors with the same name

μLayer: Low Latency On-Device Inference Using Cooperative Single-Layer Acceleration and Processor-Friendly Quantization., , , , and . EuroSys, page 45:1-45:15. ACM, (2019)

DANCE: Differentiable Accelerator/Network Co-Exploration., , , , , and . CoRR, (2020)

Enabling Fine-Grained Spatial Multitasking on Systolic-Array NPUs Using Dataflow Mirroring., , , , , , and . IEEE Trans. Computers, 72 (12): 3383-3398 (December 2023)

Google Workloads for Consumer Devices: Mitigating Data Movement Bottlenecks., , , , , , , , , and 1 other author(s). ASPLOS, page 316-331. ACM, (2018)

Occamy: Memory-efficient GPU Compiler for DNN Inference., , , , , , and . DAC, page 1-6. IEEE, (2023)

GPUpd: a fast and scalable multi-GPU architecture using cooperative projection and distribution., , , , , and . MICRO, page 574-586. ACM, (2017)

Making a Better Use of Caches for GCN Accelerators with Feature Slicing and Automatic Tile Morphing., , , , , and . IEEE Comput. Archit. Lett., 20 (2): 102-105 (2021)

Design and Analysis of a Processing-in-DIMM Join Algorithm: A Case Study with UPMEM DIMMs., , , , , , , and . Proc. ACM Manag. Data, 1 (2): 113:1-113:27 (2023)

It's All In the Teacher: Zero-Shot Quantization Brought Closer to the Teacher., , , , , , and . CVPR, page 8301-8311. IEEE, (2022)

SALoBa: Maximizing Data Locality and Workload Balance for Fast Sequence Alignment on GPUs., , , , , , , and . IPDPS, page 728-738. IEEE, (2022)