Improving performance portability for GPU-specific OpenCL kernels on multi-core/many-core CPUs by analysis-based transformations.

Frontiers Inf. Technol. Electron. Eng., 16 (11): 899-916 (2015)


Other publications by persons with the same name

Optimizing OpenCL Implementation of Deep Convolutional Neural Network on FPGA. NPC, volume 10578 of Lecture Notes in Computer Science, pp. 100-111. Springer, (2017)

Multiple-Dimension Scalable Adaptive Stream Architecture. Asia-Pacific Computer Systems Architecture Conference, volume 3189 of Lecture Notes in Computer Science, pp. 199-211. Springer, (2004)

Extending BORPH for shared memory reconfigurable computers. FPL, pp. 563-566. IEEE, (2012)

Poster Abstract: A Template-based Framework for Generating Network Processor in FPGA. INFOCOM Workshops, pp. 1057-1058. IEEE, (2019)

High efficient sedimentary basin simulations on hybrid CPU-GPU clusters. Clust. Comput., 17 (2): 359-369 (2014)

Accelerated Motion Estimation of H.264 on Imagine Stream Processor. ICIAR, volume 3656 of Lecture Notes in Computer Science, pp. 367-374. Springer, (2005)

Accelerating 3D CNN-based Lung Nodule Segmentation on a Multi-FPGA System. FPGA, p. 117. ACM, (2019)

Deep Learning Research and Development Platform: Characterizing and Scheduling with QoS Guarantees on GPU Clusters. IEEE Trans. Parallel Distributed Syst., 31 (1): 34-50 (2020)

TILE-SIM: A Systematic Approach to Systolic Array-based Accelerator Evaluation. ISPASS, pp. 141-143. IEEE, (2022)

MALMM: A multi-array architecture for large-scale matrix multiplication on FPGA. IEICE Electron. Express, 15 (10): 20180286 (2018)