Gaussian Process Behaviour in Wide Deep Neural Networks
A. Matthews, M. Rowland, J. Hron, R. Turner, and Z. Ghahramani (2018). arXiv:1804.11271. Comment: This work substantially extends the work of Matthews et al. (2018) published at the International Conference on Learning Representations (ICLR) 2018.
Abstract
Whilst deep neural networks have shown great empirical success, there is
still much work to be done to understand their theoretical properties. In this
paper, we study the relationship between random, wide, fully connected,
feedforward networks with more than one hidden layer and Gaussian processes
with a recursive kernel definition. We show that, under broad conditions, as we
make the architecture increasingly wide, the implied random function converges
in distribution to a Gaussian process, formalising and extending existing
results by Neal (1996) to deep networks. To evaluate convergence rates
empirically, we use maximum mean discrepancy. We then compare finite Bayesian
deep networks from the literature to Gaussian processes in terms of the key
predictive quantities of interest, finding that in some cases the agreement can
be very close. We discuss the desirability of Gaussian process behaviour and
review non-Gaussian alternative models from the literature.
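The recursive kernel definition referred to above can be made concrete. Below is a minimal sketch (not the authors' code) of the limiting kernel recursion, assuming ReLU activations so that the layer-wise Gaussian expectation has the closed form of Cho & Saul (2009); the variance parameters sigma_w and sigma_b and the depth are illustrative choices, not values from the paper.

```python
import numpy as np

def nngp_kernel(X, sigma_w=1.0, sigma_b=0.1, n_layers=3):
    """Covariance of the GP limit of a wide, fully connected ReLU network.

    X is an (n, d) array of inputs; returns the (n, n) kernel matrix.
    """
    d = X.shape[1]
    # Base case: covariance after the first affine layer, with weight
    # variance sigma_w**2 / d and bias variance sigma_b**2.
    K = sigma_b**2 + sigma_w**2 * (X @ X.T) / d
    for _ in range(n_layers):
        std = np.sqrt(np.diag(K))
        norm = np.outer(std, std)
        cos_theta = np.clip(K / norm, -1.0, 1.0)
        theta = np.arccos(cos_theta)
        # Closed form of E[relu(u) relu(v)] for jointly Gaussian (u, v)
        # with covariance given by the current K (Cho & Saul, 2009).
        ev = norm * (np.sin(theta) + (np.pi - theta) * np.cos(theta)) / (2 * np.pi)
        K = sigma_b**2 + sigma_w**2 * ev
    return K
```

Each pass through the loop corresponds to one hidden layer: the kernel after layer l+1 is sigma_b**2 plus sigma_w**2 times the expected product of post-activations under a GP with the layer-l kernel.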
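The maximum mean discrepancy used to evaluate convergence rates also admits a compact sample-based estimator. The sketch below is a hedged illustration, not the paper's experimental setup: it uses a biased (V-statistic) estimate with an RBF kernel, and the bandwidth is an assumed parameter. One would compare, say, outputs of finite random networks at a fixed set of inputs against draws from the limiting Gaussian process.

```python
import numpy as np

def mmd2_biased(X, Y, bandwidth=1.0):
    """Biased (V-statistic) estimate of squared MMD between samples X and Y.

    X is (m, d) and Y is (n, d), e.g. function values of random finite
    networks versus draws from the limiting GP at d fixed input points.
    """
    def rbf(A, B):
        # Pairwise squared distances, then the Gaussian (RBF) kernel.
        sq = (np.sum(A**2, axis=1)[:, None]
              + np.sum(B**2, axis=1)[None, :]
              - 2 * A @ B.T)
        return np.exp(-sq / (2 * bandwidth**2))

    return rbf(X, X).mean() + rbf(Y, Y).mean() - 2 * rbf(X, Y).mean()
```

A value near zero indicates that the two sample sets are hard to distinguish under the chosen kernel, which is how closeness to GP behaviour can be quantified empirically.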
@conference{matthews2018gaussian,
abstract = {Whilst deep neural networks have shown great empirical success, there is
still much work to be done to understand their theoretical properties. In this
paper, we study the relationship between random, wide, fully connected,
feedforward networks with more than one hidden layer and Gaussian processes
with a recursive kernel definition. We show that, under broad conditions, as we
make the architecture increasingly wide, the implied random function converges
in distribution to a Gaussian process, formalising and extending existing
results by Neal (1996) to deep networks. To evaluate convergence rates
empirically, we use maximum mean discrepancy. We then compare finite Bayesian
deep networks from the literature to Gaussian processes in terms of the key
predictive quantities of interest, finding that in some cases the agreement can
be very close. We discuss the desirability of Gaussian process behaviour and
review non-Gaussian alternative models from the literature.},
author = {Matthews, Alexander G. de G. and Rowland, Mark and Hron, Jiri and Turner, Richard E. and Ghahramani, Zoubin},
keywords = {bayesian deep-learning gaussian-processes kernels readings},
note = {arXiv:1804.11271. Comment: This work substantially extends the work of Matthews et al. (2018) published at the International Conference on Learning Representations (ICLR) 2018},
title = {Gaussian Process Behaviour in Wide Deep Neural Networks},
url = {http://arxiv.org/abs/1804.11271},
year = 2018
}