Abstract
Deep kernel learning combines the non-parametric flexibility of kernel
methods with the inductive biases of deep learning architectures. We propose a
novel deep kernel learning model and stochastic variational inference procedure
which generalizes deep kernel learning approaches to enable classification,
multi-task learning, additive covariance structures, and stochastic gradient
training. Specifically, we apply additive base kernels to subsets of output
features from deep neural architectures, and jointly learn the parameters of
the base kernels and deep network through a Gaussian process marginal
likelihood objective. Within this framework, we derive an efficient form of
stochastic variational inference which leverages local kernel interpolation,
inducing points, and structure exploiting algebra. We show improved performance
over stand alone deep networks, SVMs, and state of the art scalable Gaussian
processes on several classification benchmarks, including an airline delay
dataset containing 6 million training points, CIFAR, and ImageNet.
Description
[1611.00336] Stochastic Variational Deep Kernel Learning
Links and resources
Tags
community