Bayesian Bits: Unifying Quantization and Pruning

M. van Baalen, C. Louizos, M. Nagel, R. Amjad, Y. Wang, T. Blankevoort, and M. Welling. (2020). arXiv:2005.07093.

Abstract

We introduce Bayesian Bits, a practical method for joint mixed precision quantization and pruning through gradient based optimization. Bayesian Bits employs a novel decomposition of the quantization operation, which sequentially considers doubling the bit width. At each new bit width, the residual error between the full precision value and the previously rounded value is quantized. We then decide whether or not to add this quantized residual error for a higher effective bit width and lower quantization noise. By starting with a power-of-two bit width, this decomposition will always produce hardware-friendly configurations, and through an additional 0-bit option, serves as a unified view of pruning and quantization. Bayesian Bits then introduces learnable stochastic gates, which collectively control the bit width of the given tensor. As a result, we can obtain low bit solutions by performing approximate inference over the gates, with prior distributions that encourage most of them to be switched off. We further show that, under some assumptions, L0 regularization of the network parameters corresponds to a specific instance of the aforementioned framework. We experimentally validate our proposed method on several benchmark datasets and show that we can learn pruned, mixed precision networks that provide a better trade-off between accuracy and efficiency than their static bit width equivalents.
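The residual decomposition described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function name `residual_quantize`, the fixed range `[alpha, beta]`, the candidate bit widths (2, 4, 8, 16, 32), and the use of hard 0/1 gates in place of the learned stochastic gates are all assumptions made for illustration. The only property it relies on is arithmetic: a uniform grid with b bits over a fixed range has step (beta - alpha) / (2^b - 1), so doubling the bit width divides the step by 2^(b/2) + 1.

```python
import numpy as np

def residual_quantize(x, gates, alpha=-1.0, beta=1.0):
    """Quantize x by repeatedly adding quantized residuals (illustrative sketch).

    gates maps a bit width (2, 4, 8, 16, 32) to 0 or 1. These are hard,
    hand-set gates (an assumption); the paper learns stochastic gates.
    Returns the quantized array and the effective bit width (0 = pruned).
    """
    x = np.clip(np.asarray(x, dtype=np.float64), alpha, beta)
    if not gates.get(2, 0):
        # 0-bit option: the tensor is pruned entirely.
        return np.zeros_like(x), 0
    step = (beta - alpha) / (2**2 - 1)            # 2-bit grid: 4 levels
    x_q = alpha + step * np.round((x - alpha) / step)
    bits = 2
    for b in (4, 8, 16, 32):
        if not gates.get(b, 0):
            break                                  # a closed gate disables all higher widths
        step /= 2 ** (b // 2) + 1                  # (2^b - 1) = (2^(b/2) - 1)(2^(b/2) + 1)
        x_q = x_q + step * np.round((x - x_q) / step)  # add the quantized residual
        bits = b
    return x_q, bits

x = np.linspace(-1.0, 1.0, 101)
for gates in ({2: 0}, {2: 1}, {2: 1, 4: 1}, {2: 1, 4: 1, 8: 1}):
    x_q, bits = residual_quantize(x, gates)
    print(bits, np.max(np.abs(x - x_q)))
```

Because the gates are nested, switching off the gate for one bit width also discards every finer residual, which is why only power-of-two bit widths (or zero) can result in this sketch.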

Description

[2005.07093] Bayesian Bits: Unifying Quantization and Pruning

Links and resources

BibTeX key: vanbaalen2020bayesian
entry type: article
year: 2020
url: http://arxiv.org/abs/2005.07093
note: cite arxiv:2005.07093


Cite this publication

@article{vanbaalen2020bayesian,
  abstract = {We introduce Bayesian Bits, a practical method for joint mixed precision quantization and pruning through gradient based optimization. Bayesian Bits employs a novel decomposition of the quantization operation, which sequentially considers doubling the bit width. At each new bit width, the residual error between the full precision value and the previously rounded value is quantized. We then decide whether or not to add this quantized residual error for a higher effective bit width and lower quantization noise. By starting with a power-of-two bit width, this decomposition will always produce hardware-friendly configurations, and through an additional 0-bit option, serves as a unified view of pruning and quantization. Bayesian Bits then introduces learnable stochastic gates, which collectively control the bit width of the given tensor. As a result, we can obtain low bit solutions by performing approximate inference over the gates, with prior distributions that encourage most of them to be switched off. We further show that, under some assumptions, L0 regularization of the network parameters corresponds to a specific instance of the aforementioned framework. We experimentally validate our proposed method on several benchmark datasets and show that we can learn pruned, mixed precision networks that provide a better trade-off between accuracy and efficiency than their static bit width equivalents.},
  added-at = {2020-05-15T14:12:25.000+0200},
  author = {van Baalen, Mart and Louizos, Christos and Nagel, Markus and Amjad, Rana Ali and Wang, Ying and Blankevoort, Tijmen and Welling, Max},
  biburl = {https://www.bibsonomy.org/bibtex/2f8586833d6a5f12d32d74f3bad493862/kirk86},
  description = {[2005.07093] Bayesian Bits: Unifying Quantization and Pruning},
  interhash = {e24c443640851dfd0fb1679c78975791},
  intrahash = {f8586833d6a5f12d32d74f3bad493862},
  keywords = {bayesian compression optimization},
  note = {cite arxiv:2005.07093},
  timestamp = {2020-05-15T14:12:25.000+0200},
  title = {Bayesian Bits: Unifying Quantization and Pruning},
  url = {http://arxiv.org/abs/2005.07093},
  year = 2020
}

BibSonomy
