Zusammenfassung
Virtually any model we use in machine learning to make predictions does not
perfectly represent reality. So, most of the learning happens under model
misspecification. In this work, we present a novel analysis of the
generalization performance of Bayesian model averaging under model
misspecification and i.i.d. data using a new family of second-order PAC-Bayes
bounds. This analysis shows, in simple and intuitive terms, that Bayesian model
averaging provides suboptimal generalization performance when the model is
misspecified. In consequence, we provide strong theoretical arguments showing
that Bayesian methods are not optimal for learning predictive models, unless
the model class is perfectly specified. Using novel second-order PAC-Bayes
bounds, we derive a new family of Bayesian-like algorithms, which can be
implemented as variational and ensemble methods. The output of these algorithms
is a new posterior distribution, different from the Bayesian posterior, which
induces a posterior predictive distribution with better generalization
performance. Experiments with Bayesian neural networks illustrate these
findings.
Nutzer