Article,

Bayesian Adaptive Data Analysis Guarantees from Subgaussianity

S. Elder.
(2016). arXiv:1611.00065.

Abstract

The new field of adaptive data analysis seeks to provide algorithms and provable guarantees for models of machine learning that allow researchers to reuse their data, which normally falls outside of the usual statistical paradigm of static data analysis. In 2014, Dwork, Feldman, Hardt, Pitassi, Reingold and Roth introduced one potential model and proposed several solutions based on differential privacy. In previous work in 2016, we described a problem with this model and instead proposed a Bayesian variant, but also found that the analogous Bayesian methods cannot achieve the same statistical guarantees as in the static case. In this paper, we prove the first positive results for the Bayesian model, showing that with a Dirichlet prior, the posterior mean algorithm indeed matches the statistical guarantees of the static case. The main ingredient is a new theorem showing that the $\mathrm{Beta}(\alpha,\beta)$ distribution is subgaussian with variance proxy $O(1/(\alpha+\beta+1))$, a concentration result also of independent interest. We provide two proofs of this result: a probabilistic proof utilizing a simple condition for the raw moments of a positive random variable and a learning-theoretic proof based on considering the beta distribution as a posterior, both of which have implications to other related problems.
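The subgaussianity claim in the abstract can be explored numerically. The sketch below is an illustration only and is not taken from the paper: it assumes the standard definition that X is subgaussian with variance proxy s^2 when log E[exp(lam * (X - E[X]))] <= lam^2 * s^2 / 2 for all real lam, and it evaluates the Beta moment generating function through its power series. The function names (beta_log_mgf, empirical_constant) and the grid of lam values are hypothetical choices. The abstract does not state the constant hidden in the O(.) notation, so the code only reports the largest ratio it observes on a finite grid, which is evidence and not a proof.

```python
import math


def beta_log_mgf(alpha, beta, lam, terms=400):
    """Return log E[exp(lam * X)] for X ~ Beta(alpha, beta).

    Uses the power series E[exp(lam * X)] = sum_k c_k * lam^k / k!, where
    c_k = prod_{r<k} (alpha + r) / (alpha + beta + r) is the k-th raw moment.
    """
    if lam < 0:
        # 1 - X ~ Beta(beta, alpha), so
        # E[exp(lam * X)] = exp(lam) * E[exp(-lam * (1 - X))].
        # This keeps every series term positive and avoids cancellation.
        return lam + beta_log_mgf(beta, alpha, -lam, terms)
    total, term = 1.0, 1.0
    for k in range(terms):
        term *= (alpha + k) / (alpha + beta + k) * lam / (k + 1)
        total += term
    return math.log(total)


def empirical_constant(alpha, beta, lams):
    """Largest value of (alpha + beta + 1) * 2 * K(lam) / lam^2 on the grid.

    K is the log-MGF of X - E[X]. If Beta(alpha, beta) has variance proxy
    c / (alpha + beta + 1), this ratio cannot exceed c for any lam.
    """
    mean = alpha / (alpha + beta)
    scale = alpha + beta + 1
    best = 0.0
    for lam in lams:
        centered = beta_log_mgf(alpha, beta, lam) - lam * mean
        best = max(best, scale * 2.0 * centered / lam**2)
    return best


# Arbitrary illustrative grid: lam = +/-0.1, +/-0.2, ..., +/-20.
LAMS = [sign * step / 10 for step in range(1, 201) for sign in (1, -1)]

if __name__ == "__main__":
    for a, b in [(1, 1), (2, 5), (0.5, 0.5), (10, 30)]:
        print(a, b, empirical_constant(a, b, LAMS))
```

If the observed ratios stay bounded as alpha and beta vary, that is consistent with a variance proxy of order 1/(alpha + beta + 1); the grid range and the number of series terms are assumptions that would need widening for very large parameters or very large lam.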

BibTeX key: elder2016bayesian
entry type: article
year: 2016
url: http://arxiv.org/abs/1611.00065
note: cite arxiv:1611.00065

Cite this publication

@article{elder2016bayesian,
  abstract = {The new field of adaptive data analysis seeks to provide algorithms and provable guarantees for models of machine learning that allow researchers to reuse their data, which normally falls outside of the usual statistical paradigm of static data analysis. In 2014, Dwork, Feldman, Hardt, Pitassi, Reingold and Roth introduced one potential model and proposed several solutions based on differential privacy. In previous work in 2016, we described a problem with this model and instead proposed a Bayesian variant, but also found that the analogous Bayesian methods cannot achieve the same statistical guarantees as in the static case. In this paper, we prove the first positive results for the Bayesian model, showing that with a Dirichlet prior, the posterior mean algorithm indeed matches the statistical guarantees of the static case. The main ingredient is a new theorem showing that the $\mathrm{Beta}(\alpha,\beta)$ distribution is subgaussian with variance proxy $O(1/(\alpha+\beta+1))$, a concentration result also of independent interest. We provide two proofs of this result: a probabilistic proof utilizing a simple condition for the raw moments of a positive random variable and a learning-theoretic proof based on considering the beta distribution as a posterior, both of which have implications to other related problems.},
  added-at = {2019-09-19T13:35:04.000+0200},
  author = {Elder, Sam},
  biburl = {https://www.bibsonomy.org/bibtex/2f89b7030dc768ef75820f8a7271b0d97/kirk86},
  description = {[1611.00065] Bayesian Adaptive Data Analysis Guarantees from Subgaussianity},
  interhash = {17ea17923077ae09e94d1d1fa70ca400},
  intrahash = {f89b7030dc768ef75820f8a7271b0d97},
  keywords = {bayesian bounds differential-privacy generalization},
  note = {cite arxiv:1611.00065},
  timestamp = {2019-09-19T13:35:04.000+0200},
  title = {Bayesian Adaptive Data Analysis Guarantees from Subgaussianity},
  url = {http://arxiv.org/abs/1611.00065},
  year = 2016
}
