Abstract
We study the robust proper learning of univariate log-concave
distributions (over continuous and discrete domains). Given a set of samples
drawn from an unknown target distribution, we want to compute a log-concave
hypothesis distribution that is as close as possible to the target, in total
variation distance. In this work, we give the first computationally efficient
algorithm for this learning problem. Our algorithm achieves the
information-theoretically optimal sample size (up to a constant factor), runs
in polynomial time, and is robust to model misspecification with nearly-optimal
error guarantees.
Specifically, we give an algorithm that, on input $n=O(1/\eps^5/2)$ samples
from an unknown distribution $f$, runs in time $O(n^8/5)$, and
outputs a log-concave hypothesis $h$ that (with high probability) satisfies
$\dtv(h, f) = O(øpt)+\eps$, where $øpt$ is the minimum total variation
distance between $f$ and the class of log-concave distributions. Our approach
to the robust proper learning problem is quite flexible and may be applicable
to many other univariate distribution families.
Users
Please
log in to take part in the discussion (add own reviews or comments).