A Scalable and Flexible Framework for Gaussian Processes via Matrix-Vector Multiplication

PhD thesis, Cornell University, 2020.
Abstract

Gaussian processes (GPs) exhibit a classic tension of many machine learning methods: they possess desirable modelling capabilities yet suffer from important practical limitations. In many instances, GPs are able to offer well-calibrated uncertainty estimates, interpretable predictions, and the ability to encode prior knowledge. These properties have made them an indispensable tool for black-box optimization, time series forecasting, and high-risk applications like health care. Despite these benefits, GPs are typically not applied to datasets with more than a few thousand data points. This is in part due to an inference procedure that requires matrix inverses, determinants, and other expensive operations. Moreover, specialty models often require significant implementation efforts. This thesis aims to alleviate these practical concerns through a single simple design decision. Taking inspiration from neural network libraries, we construct GP inference algorithms using only matrix-vector multiplications (MVMs) and other linear operations. This MVM-based approach simultaneously addresses several of these practical concerns: it reduces asymptotic complexity, effectively utilizes GPU hardware, and provides straightforward implementations for many specialty GP models.

The chapters of this thesis each address a different aspect of Gaussian process inference. Chapter 3 introduces an MVM method for training Gaussian process regression models (i.e. optimizing kernel/likelihood hyperparameters). This approach unifies several existing methods into a highly-parallel and stable algorithm. Chapter 4 focuses on making predictions with Gaussian processes. A memory-efficient cache, which can be computed through MVMs, significantly reduces the computation of predictive distributions. Chapter 5 introduces a multi-purpose MVM algorithm that can be used to draw samples from GP posteriors and perform approximate Gaussian process inference. All three of these methods offer speedups ranging from 4× to 40×. Importantly, applying any of these algorithms to specialty models (e.g. multitask GPs and scalable approximations) simply requires a matrix-vector multiplication routine that exploits the covariance structure afforded by the model.

The MVM methods from this thesis form the building blocks of the GPyTorch library, an open-source GP implementation designed for scalability and simple implementations. In the final chapter, we evaluate GPyTorch models on several large-scale regression datasets. Using the proposed MVM methods, we can apply exact Gaussian processes to datasets that are 2 orders of magnitude larger than what has previously been reported: up to 1 million data points.
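The central idea, accessing the kernel matrix only through matrix-vector products, can be illustrated with a conjugate-gradients solve of the linear system (K + σ²I)v = y that appears in the GP marginal log likelihood and predictive mean. The sketch below is illustrative only and is not code from the thesis; it uses SciPy's LinearOperator so that the solver never sees the matrix entries, just an MVM callback. The toy RBF kernel and noise variance are placeholder assumptions.

    import numpy as np
    from scipy.sparse.linalg import LinearOperator, cg

    # Illustrative sketch (not the thesis's code): solve (K + sigma^2 I) v = y
    # using only matrix-vector products with the kernel matrix.
    rng = np.random.default_rng(0)
    X = rng.standard_normal((500, 3))    # toy inputs (placeholder)
    y = rng.standard_normal(500)         # toy targets (placeholder)
    sigma2 = 0.1                         # assumed observation noise variance

    # Dense RBF kernel for illustration; structured kernels (Toeplitz,
    # Kronecker, inducing-point approximations) admit much faster MVMs.
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-0.5 * sq_dists)

    def mvm(v):
        # The only kernel operation the solver ever performs.
        return K @ v + sigma2 * v

    A = LinearOperator((500, 500), matvec=mvm)
    v, info = cg(A, y)   # conjugate gradients: inverse-free, MVM-only
    assert info == 0     # info == 0 indicates convergence

Because conjugate gradients touches K only through mvm, swapping in a specialty model amounts to swapping in a different MVM routine, which is the design point the abstract emphasizes.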
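Since the abstract points to GPyTorch, the following minimal sketch shows the library's standard exact-GP workflow: model definition, hyperparameter training via the exact marginal log likelihood, and prediction. The toy data, iteration count, and learning rate are placeholders; consult the GPyTorch documentation for the authoritative API.

    import torch
    import gpytorch

    # Placeholder data; in practice train_x / train_y come from your dataset.
    train_x = torch.linspace(0, 1, 100)
    train_y = torch.sin(6.28 * train_x) + 0.1 * torch.randn(100)

    class ExactGPModel(gpytorch.models.ExactGP):
        def __init__(self, train_x, train_y, likelihood):
            super().__init__(train_x, train_y, likelihood)
            self.mean_module = gpytorch.means.ConstantMean()
            self.covar_module = gpytorch.kernels.ScaleKernel(
                gpytorch.kernels.RBFKernel()
            )

        def forward(self, x):
            # Downstream inference touches the covariance only through
            # GPyTorch's MVM-based linear-operator routines.
            return gpytorch.distributions.MultivariateNormal(
                self.mean_module(x), self.covar_module(x)
            )

    likelihood = gpytorch.likelihoods.GaussianLikelihood()
    model = ExactGPModel(train_x, train_y, likelihood)

    # Train kernel/likelihood hyperparameters (cf. Chapter 3).
    model.train(); likelihood.train()
    optimizer = torch.optim.Adam(model.parameters(), lr=0.1)
    mll = gpytorch.mlls.ExactMarginalLogLikelihood(likelihood, model)
    for _ in range(50):
        optimizer.zero_grad()
        loss = -mll(model(train_x), train_y)
        loss.backward()
        optimizer.step()

    # Predict; fast_pred_var enables cached, MVM-computed predictive
    # covariances of the kind described in Chapter 4.
    model.eval(); likelihood.eval()
    with torch.no_grad(), gpytorch.settings.fast_pred_var():
        preds = likelihood(model(torch.linspace(0, 1, 51)))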
