Provably Robust Deep Learning via Adversarially Trained Smoothed
Classifiers
H. Salman, G. Yang, J. Li, P. Zhang, H. Zhang, I. Razenshteyn, and S. Bubeck. (2019). cite arxiv:1906.04584. Comment: Spotlight at the 33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada; 9 pages main text; 31 pages total.
Abstract
Recent works have shown the effectiveness of randomized smoothing as a
scalable technique for building neural network-based classifiers that are
provably robust to $\ell_2$-norm adversarial perturbations. In this paper, we
employ adversarial training to improve the performance of randomized smoothing.
We design an adapted attack for smoothed classifiers, and we show how this
attack can be used in an adversarial training setting to boost the provable
robustness of smoothed classifiers. We demonstrate through extensive
experimentation that our method consistently outperforms all existing provably
$\ell_2$-robust classifiers by a significant margin on ImageNet and CIFAR-10,
establishing the state-of-the-art for provable $\ell_2$-defenses. Moreover, we
find that pre-training and semi-supervised learning boost adversarially trained
smoothed classifiers even further. Our code and trained models are available at
http://github.com/Hadisalman/smoothing-adversarial .
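The smoothed classifier described above predicts the class most likely to be returned by the base classifier under Gaussian input noise. As a minimal illustrative sketch (not the authors' implementation; the function and toy classifier below are hypothetical), the Monte Carlo prediction rule g(x) = argmax_c P[f(x + δ) = c], δ ~ N(0, σ²I), can be estimated by majority vote over noisy samples:

```python
import numpy as np

def smoothed_predict(base_classifier, x, sigma=0.25, n_samples=1000,
                     num_classes=2, rng=None):
    """Monte Carlo estimate of the smoothed classifier's prediction:
    g(x) = argmax_c P[f(x + delta) = c], with delta ~ N(0, sigma^2 I)."""
    rng = np.random.default_rng(rng)
    counts = np.zeros(num_classes, dtype=int)
    for _ in range(n_samples):
        noisy = x + rng.normal(0.0, sigma, size=x.shape)
        counts[base_classifier(noisy)] += 1
    # Majority vote over the noisy predictions.
    return int(np.argmax(counts))

# Toy base classifier: thresholds the first coordinate.
f = lambda v: int(v[0] > 0.0)
x = np.array([1.0, -0.5])
print(smoothed_predict(f, x, sigma=0.25, n_samples=500, rng=0))  # prints 1
```

The paper's contribution is to adversarially train the base classifier against an attack adapted to this smoothed prediction rule, which the sketch above does not attempt to reproduce.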
Description
[1906.04584] Provably Robust Deep Learning via Adversarially Trained Smoothed Classifiers
%0 Conference Paper
%1 salman2019provably
%A Salman, Hadi
%A Yang, Greg
%A Li, Jerry
%A Zhang, Pengchuan
%A Zhang, Huan
%A Razenshteyn, Ilya
%A Bubeck, Sebastien
%D 2019
%K adversarial neurips2019 optimization robustness
%T Provably Robust Deep Learning via Adversarially Trained Smoothed
Classifiers
%U http://arxiv.org/abs/1906.04584
%X Recent works have shown the effectiveness of randomized smoothing as a
scalable technique for building neural network-based classifiers that are
provably robust to $\ell_2$-norm adversarial perturbations. In this paper, we
employ adversarial training to improve the performance of randomized smoothing.
We design an adapted attack for smoothed classifiers, and we show how this
attack can be used in an adversarial training setting to boost the provable
robustness of smoothed classifiers. We demonstrate through extensive
experimentation that our method consistently outperforms all existing provably
$\ell_2$-robust classifiers by a significant margin on ImageNet and CIFAR-10,
establishing the state-of-the-art for provable $\ell_2$-defenses. Moreover, we
find that pre-training and semi-supervised learning boost adversarially trained
smoothed classifiers even further. Our code and trained models are available at
http://github.com/Hadisalman/smoothing-adversarial .
@inproceedings{salman2019provably,
abstract = {Recent works have shown the effectiveness of randomized smoothing as a
scalable technique for building neural network-based classifiers that are
provably robust to $\ell_2$-norm adversarial perturbations. In this paper, we
employ adversarial training to improve the performance of randomized smoothing.
We design an adapted attack for smoothed classifiers, and we show how this
attack can be used in an adversarial training setting to boost the provable
robustness of smoothed classifiers. We demonstrate through extensive
experimentation that our method consistently outperforms all existing provably
$\ell_2$-robust classifiers by a significant margin on ImageNet and CIFAR-10,
establishing the state-of-the-art for provable $\ell_2$-defenses. Moreover, we
find that pre-training and semi-supervised learning boost adversarially trained
smoothed classifiers even further. Our code and trained models are available at
http://github.com/Hadisalman/smoothing-adversarial .},
added-at = {2019-12-10T01:06:35.000+0100},
author = {Salman, Hadi and Yang, Greg and Li, Jerry and Zhang, Pengchuan and Zhang, Huan and Razenshteyn, Ilya and Bubeck, Sebastien},
biburl = {https://www.bibsonomy.org/bibtex/23a7902c5c510a31ff2e3b3c0a29f38ef/kirk86},
description = {[1906.04584] Provably Robust Deep Learning via Adversarially Trained Smoothed Classifiers},
interhash = {6fa0ca602587c0575b3dfe9383a91278},
intrahash = {3a7902c5c510a31ff2e3b3c0a29f38ef},
keywords = {adversarial neurips2019 optimization robustness},
note = {cite arxiv:1906.04584. Comment: Spotlight at the 33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada; 9 pages main text; 31 pages total},
timestamp = {2019-12-10T01:06:35.000+0100},
title = {Provably Robust Deep Learning via Adversarially Trained Smoothed
Classifiers},
url = {http://arxiv.org/abs/1906.04584},
year = 2019
}