Abstract
We show that the output of a (residual) convolutional neural network (CNN)
with an appropriate prior over the weights and biases is a Gaussian process
(GP) in the limit of infinitely many convolutional filters, extending similar
results for dense networks. For a CNN, the equivalent kernel can be computed
exactly and, unlike "deep kernels", has very few parameters: only the
hyperparameters of the original CNN. Further, we show that this kernel has two
properties that allow it to be computed efficiently; the cost of evaluating the
kernel for a pair of images is similar to a single forward pass through the
original CNN with only one filter per layer. The kernel equivalent to a
32-layer ResNet obtains 0.84% classification error on MNIST, a new record for
GPs with a comparable number of parameters.
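
The claim that one kernel evaluation costs about as much as a forward pass with a single filter per layer can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes a 1-D, non-residual network with ReLU activations, "valid" convolutions, and a dense readout. The names `cnn_gp_kernel`, `patch_sum`, and `relu_expectation`, and the hyperparameters `var_w` and `var_b` (weight and bias prior variances, with any per-patch scaling folded into `var_w`), are hypothetical choices made for illustration.

```python
import numpy as np

def patch_sum(v, width):
    # Sum each sliding window of `width` positions ("valid" convolution with ones).
    return np.convolve(v, np.ones(width), mode="valid")

def relu_expectation(kxy, kxx, kyy):
    # E[relu(u) * relu(v)] for zero-mean Gaussian (u, v) with
    # Var(u) = kxx, Var(v) = kyy, Cov(u, v) = kxy (arc-cosine kernel, degree 1).
    norm = np.sqrt(kxx * kyy)
    cos_t = np.clip(kxy / norm, -1.0, 1.0)
    theta = np.arccos(cos_t)
    return norm * (np.sin(theta) + (np.pi - theta) * cos_t) / (2.0 * np.pi)

def cnn_gp_kernel(x, y, depth=3, width=3, var_w=2.0, var_b=0.1):
    # x, y: arrays of shape (channels, length). Assumed toy architecture:
    # `depth` conv layers with ReLU in between, then ReLU and a dense readout.
    def affine(v):
        # One "single-filter" layer: patch sum scaled by the prior variances.
        return var_b + var_w * patch_sum(v, width)

    # First layer: per-position inner products over input channels.
    kxy = affine((x * y).sum(axis=0))
    kxx = affine((x * x).sum(axis=0))
    kyy = affine((y * y).sum(axis=0))

    for _ in range(depth - 1):
        vxy = relu_expectation(kxy, kxx, kyy)
        # For identical inputs, E[relu(u)^2] = Var(u) / 2.
        kxy, kxx, kyy = affine(vxy), affine(0.5 * kxx), affine(0.5 * kyy)

    # Dense readout over all remaining positions gives a scalar kernel value.
    return var_b + var_w * relu_expectation(kxy, kxx, kyy).sum()

rng = np.random.default_rng(0)
x = rng.normal(size=(1, 16))
y = rng.normal(size=(1, 16))
print(cnn_gp_kernel(x, y), cnn_gp_kernel(x, x))
```

Each layer tracks only three one-channel maps (the covariance between the two inputs and each input's own variance at matching positions), which is why the cost resembles a one-filter forward pass. The residual connections mentioned in the abstract are omitted from this sketch.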