Author of the publication

I-BERT: Integer-only BERT Quantization.

ICML, volume 139 of Proceedings of Machine Learning Research, pages 5506-5518. PMLR, (2021)


Other publications of authors with the same name

Semantic video segmentation using both appearance and geometric information. Intelligent Robots and Computer Vision: Algorithms and Techniques, volume 9406 of SPIE Proceedings, page 94060B. SPIE, (2015)

KAIST interactive bicycle racing simulator: the 2nd version with advanced features. IROS, pages 2961-2966. IEEE, (2002)

Neighbor-Aware Adaptive Retry Limit for IEEE 802.11-Based Mobile Ad Hoc Networks. WCNC, pages 1402-1407. IEEE, (2008)

Full Stack Optimization of Transformer Inference: a Survey. CoRR, (2023)

Integer-Only Zero-Shot Quantization for Efficient Speech Recognition. ICASSP, pages 4288-4292. IEEE, (2022)

Hessian-Aware Pruning and Optimal Neural Implant. WACV, pages 3665-3676. IEEE, (2022)

Applications and Techniques for Fast Machine Learning in Science. CoRR, (2021)

AI and Memory Wall. IEEE Micro, 44 (3): 33-39 (May 2024)

ETS: Efficient Tree Search for Inference-Time Scaling. CoRR, (February 2025)

Squeezed Attention: Accelerating Long Context Length LLM Inference. ACL (1), pages 32631-32652. Association for Computational Linguistics, (2025)