@article{diakonikolas2016statistical,
abstract = {We describe a general technique that yields the first {\em Statistical Query
lower bounds} for a range of fundamental high-dimensional learning problems
involving Gaussian distributions. Our main results are for the problems of (1)
learning Gaussian mixture models (GMMs), and (2) robust (agnostic) learning of
a single unknown Gaussian distribution. For each of these problems, we show a
{\em super-polynomial gap} between the (information-theoretic) sample
complexity and the computational complexity of {\em any} Statistical Query
algorithm for the problem. Our SQ lower bound for Problem (1) is qualitatively
matched by known learning algorithms for GMMs. Our lower bound for Problem (2)
implies that the accuracy of the robust learning algorithm
in~\cite{DiakonikolasKKLMS16} is essentially best possible among all
polynomial-time SQ algorithms.
Our SQ lower bounds are attained via a unified moment-matching technique that
is useful in other contexts and may be of broader interest. Our technique
yields nearly-tight lower bounds for a number of related unsupervised
estimation problems. Specifically, for the problems of (3) robust covariance
estimation in spectral norm, and (4) robust sparse mean estimation, we
establish a quadratic {\em statistical--computational tradeoff} for SQ
algorithms, matching known upper bounds. Finally, our technique can be used to
obtain tight sample complexity lower bounds for high-dimensional {\em testing}
problems. Specifically, for the classical problem of robustly {\em testing} an
unknown mean (known covariance) Gaussian, our technique implies an
information-theoretic sample lower bound that scales {\em linearly} in the
dimension. Our sample lower bound matches the sample complexity of the
corresponding robust {\em learning} problem and separates the sample complexity
of robust testing from standard (non-robust) testing.},
added-at = {2020-02-26T13:44:47.000+0100},
author = {Diakonikolas, Ilias and Kane, Daniel M. and Stewart, Alistair},
biburl = {https://www.bibsonomy.org/bibtex/294df922837dd2af68282db01864cfdac/kirk86},
description = {[1611.03473] Statistical Query Lower Bounds for Robust Estimation of High-dimensional Gaussians and Gaussian Mixtures},
interhash = {0e791d3cedba774674de96b9677131da},
intrahash = {94df922837dd2af68282db01864cfdac},
keywords = {robustness stats},
  note = {arXiv:1611.03473. Comment: Changes from v1: Revised presentation. Added more applications of the technique (SQ lower bounds for robust sparse mean estimation and robust covariance estimation in spectral norm). Sharpened testing lower bound to linear in the dimension (compared to nearly-linear in the first version)},
timestamp = {2020-02-26T13:44:47.000+0100},
title = {Statistical Query Lower Bounds for Robust Estimation of High-dimensional
Gaussians and Gaussian Mixtures},
url = {http://arxiv.org/abs/1611.03473},
year = 2016
}