Author of the publication

A Low-Cost Floating-Point Dot-Product-Dual-Accumulate Architecture for HPC-Enabled AI.

IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 43 (2): 681-693 (February 2024)


Other publications of authors with the same name

On-Chip Memory System Optimization Design for the FT64 Scientific Stream Accelerator. IEEE Micro, 28 (4): 51-70 (2008)

A Stream System-on-Chip Architecture for High Speed Target Recognition Based on Biologic Vision. Asia-Pacific Computer Systems Architecture Conference, volume 4697 of Lecture Notes in Computer Science, page 256-267. Springer, (2007)

Unified Virtual Memory Support for Deep CNN Accelerator on SoC FPGA. ICA3PP (1), volume 9528 of Lecture Notes in Computer Science, page 64-76. Springer, (2015)

SAT: A Stream Architecture Template for Embedded Applications. CIT, page 1711-1718. IEEE Computer Society, (2010)

Enabling a Uniform OpenCL Device View for Heterogeneous Platforms. IEICE Trans. Inf. Syst., 98-D (4): 812-823 (2015)

Efficient Multiple-Precision and Mixed-Precision Floating-Point Fused Multiply-Accumulate Unit for HPC and AI Applications. ICA3PP, volume 13777 of Lecture Notes in Computer Science, page 642-659. Springer, (2022)

Fully Distributed On-chip Instruction Memory Design for Stream Architecture Based on Field-Divided VLIW Compression. HPCC-ICESS, page 25-32. IEEE Computer Society, (2012)

Tiled Multi-Core Stream Architecture. Trans. High Perform. Embed. Archit. Compil., (2011)

Software Managed Instruction Scratchpad Memory Optimization in Stream Architecture Based on Hot Code Analysis of Kernels. DSD, page 823-830. IEEE Computer Society, (2010)

Poster Abstract: A Template-based Framework for Generating Network Processor in FPGA. INFOCOM Workshops, page 1057-1058. IEEE, (2019)