Abstract
Why are classifiers in high dimension vulnerable to ädversarial"
perturbations? We show that it is likely not due to information theoretic
limitations, but rather it could be due to computational constraints.
First we prove that, for a broad set of classification tasks, the mere
existence of a robust classifier implies that it can be found by a possibly
exponential-time algorithm with relatively few training examples. Then we give
a particular classification task where learning a robust classifier is
computationally intractable. More precisely we construct a binary
classification task in high dimensional space which is (i) information
theoretically easy to learn robustly for large perturbations, (ii) efficiently
learnable (non-robustly) by a simple linear separator, (iii) yet is not
efficiently robustly learnable, even for small perturbations, by any algorithm
in the statistical query (SQ) model. This example gives an exponential
separation between classical learning and robust learning in the statistical
query model. It suggests that adversarial examples may be an unavoidable
byproduct of computational limitations of learning algorithms.
Users
Please
log in to take part in the discussion (add own reviews or comments).