The rising popularity of intelligent mobile devices and the daunting computational cost of deep learning-based models call for efficient and accurate on-device inference schemes. We propose a quantization scheme that allows inference to be carried out using integer-only arithmetic, which can be implemented more efficiently than floating point inference on commonly available integer-only hardware. We also co-design a training procedure to preserve end-to-end model accuracy post quantization. As a result, the proposed quantization scheme improves the tradeoff between accuracy and on-device latency. The improvements are significant even on MobileNets, a model family known for run-time efficiency, and are demonstrated in ImageNet classification and COCO detection on popular CPUs.
Description
Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference - IEEE Conference Publication
%0 Conference Paper
%1 8578384
%A Jacob, B.
%A Kligys, S.
%A Chen, B.
%A Zhu, M.
%A Tang, M.
%A Howard, A.
%A Adam, H.
%A Kalenichenko, D.
%B 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
%D 2018
%K convnets dnn quantization
%P 2704-2713
%R 10.1109/CVPR.2018.00286
%T Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference
%U https://ieeexplore.ieee.org/document/8578384
%X The rising popularity of intelligent mobile devices and the daunting computational cost of deep learning-based models call for efficient and accurate on-device inference schemes. We propose a quantization scheme that allows inference to be carried out using integer-only arithmetic, which can be implemented more efficiently than floating point inference on commonly available integer-only hardware. We also co-design a training procedure to preserve end-to-end model accuracy post quantization. As a result, the proposed quantization scheme improves the tradeoff between accuracy and on-device latency. The improvements are significant even on MobileNets, a model family known for run-time efficiency, and are demonstrated in ImageNet classification and COCO detection on popular CPUs.
@inproceedings{8578384,
  abstract    = {The rising popularity of intelligent mobile devices and the daunting computational cost of deep learning-based models call for efficient and accurate on-device inference schemes. We propose a quantization scheme that allows inference to be carried out using integer-only arithmetic, which can be implemented more efficiently than floating point inference on commonly available integer-only hardware. We also co-design a training procedure to preserve end-to-end model accuracy post quantization. As a result, the proposed quantization scheme improves the tradeoff between accuracy and on-device latency. The improvements are significant even on MobileNets, a model family known for run-time efficiency, and are demonstrated in ImageNet classification and COCO detection on popular CPUs.},
  added-at    = {2020-10-23T21:03:58.000+0200},
  author      = {Jacob, Benoit and Kligys, Skirmantas and Chen, Bo and Zhu, Menglong and Tang, Matthew and Howard, Andrew and Adam, Hartwig and Kalenichenko, Dmitry},
  biburl      = {https://www.bibsonomy.org/bibtex/262a0538e9b21d2a374a33133a4043e89/sohnki},
  booktitle   = {2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  description = {Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference - IEEE Conference Publication},
  doi         = {10.1109/CVPR.2018.00286},
  interhash   = {2c9fd8218b7dacf6236524a3a826e98e},
  intrahash   = {62a0538e9b21d2a374a33133a4043e89},
  issn        = {2575-7075},
  keywords    = {convnets dnn quantization},
  month       = jun,
  pages       = {2704--2713},
  timestamp   = {2020-10-23T21:03:58.000+0200},
  title       = {Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference},
  url         = {https://ieeexplore.ieee.org/document/8578384},
  year        = {2018},
}