Abstract
We study learning algorithms that are restricted to using a small amount of
information from their input sample. We introduce a class of learning
algorithms we term $d$-bit information learners: algorithms whose output
conveys at most $d$ bits of information about their input. A central theme
in this work is that such algorithms generalize.
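One natural way to make this quantitative (a sketch of the intended formalization; the notation $S$ for the input sample and $A(S)$ for the output hypothesis is introduced here for illustration) is to bound the mutual information between the algorithm's input and output:
\[
  I\bigl(S;\, A(S)\bigr) \le d,
\]
where $S$ is the input sample and $A(S)$ is the (possibly randomized) hypothesis the algorithm outputs.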
We focus on the learning capacity of these algorithms and prove sample
complexity bounds with tight dependencies on the confidence and error
parameters. We also observe connections with well-studied notions such as
sample compression schemes, Occam's razor, PAC-Bayes, and differential privacy.
We discuss an approach for proving upper bounds on the amount of
information that algorithms reveal about their inputs, and we also provide a
lower bound by exhibiting a simple concept class for which every (possibly
randomized) empirical risk minimizer must reveal a large amount of information.
On the other hand, we show that in the distribution-dependent setting every VC
class has empirical risk minimizers that reveal little information.