Inproceedings,

Rates of Convergence for Sparse Variational Gaussian Process Regression

D. Burt, C. Rasmussen, and M. Van Der Wilk.
Proceedings of the 36th International Conference on Machine Learning, volume 97 of Proceedings of Machine Learning Research, page 862--871. Long Beach, California, USA, PMLR, (09--15 Jun 2019)

Full text

Abstract

Excellent variational approximations to Gaussian process posteriors have been developed which avoid the $Ołeft(N^3\right)$ scaling with dataset size $N$. They reduce the computational cost to $Ołeft(NM^2\right)$, with $MN$ the number of inducing variables, which summarise the process. While the computational cost seems to be linear in $N$, the true complexity of the algorithm depends on how $M$ must increase to ensure a certain quality of approximation. We show that with high probability the KL divergence can be made arbitrarily small by growing $M$ more slowly than $N$. A particular case is that for regression with normally distributed inputs in D-dimensions with the Squared Exponential kernel, $M=O(łog^D N)$ suffices. Our results show that as datasets grow, Gaussian process posteriors can be approximated cheaply, and provide a concrete rule for how to increase $M$ in continual learning scenarios.

BibTeX key: pmlr-v97-burt19a
entry type: inproceedings
address: Long Beach, California, USA
booktitle: Proceedings of the 36th International Conference on Machine Learning
year: 2019
month: 09--15 Jun
pages: 862--871
publisher: PMLR
series: Proceedings of Machine Learning Research
volume: 97
pdf: http://proceedings.mlr.press/v97/burt19a/burt19a.pdf
Document: http://proceedings.mlr.press/v97/burt19a.html

Users

Comments and Reviewsshow / hide

Please log in to take part in the discussion (add own reviews or comments).

Cite this publication

@inproceedings{pmlr-v97-burt19a, abstract = {Excellent variational approximations to Gaussian process posteriors have been developed which avoid the $\mathcal{O}\left(N^3\right)$ scaling with dataset size $N$. They reduce the computational cost to $\mathcal{O}\left(NM^2\right)$, with $M\ll N$ the number of \emph{inducing variables}, which summarise the process. While the computational cost seems to be linear in $N$, the true complexity of the algorithm depends on how $M$ must increase to ensure a certain quality of approximation. We show that with high probability the KL divergence can be made arbitrarily small by growing $M$ more slowly than $N$. A particular case is that for regression with normally distributed inputs in D-dimensions with the Squared Exponential kernel, $M=\mathcal{O}(\log^D N)$ suffices. Our results show that as datasets grow, Gaussian process posteriors can be approximated cheaply, and provide a concrete rule for how to increase $M$ in continual learning scenarios.}, added-at = {2020-07-16T12:13:53.000+0200}, address = {Long Beach, California, USA}, author = {Burt, David and Rasmussen, Carl Edward and Van Der Wilk, Mark}, biburl = {https://www.bibsonomy.org/bibtex/2149185b850836a99ee468b4d53ef0182/kirk86}, booktitle = {Proceedings of the 36th International Conference on Machine Learning}, description = {Rates of Convergence for Sparse Variational Gaussian Process Regression}, editor = {Chaudhuri, Kamalika and Salakhutdinov, Ruslan}, interhash = {31df05d6ccbe4ce1f1d0e699832802fb}, intrahash = {149185b850836a99ee468b4d53ef0182}, keywords = {convergence gaussian-proceses readings sparsity variational}, month = {09--15 Jun}, pages = {862--871}, pdf = {http://proceedings.mlr.press/v97/burt19a/burt19a.pdf}, publisher = {PMLR}, series = {Proceedings of Machine Learning Research}, timestamp = {2020-07-16T12:13:53.000+0200}, title = {Rates of Convergence for Sparse Variational {G}aussian Process Regression}, url = {http://proceedings.mlr.press/v97/burt19a.html}, volume = 97, year = 2019 }

BibSonomy

Rates of Convergence for Sparse Variational Gaussian Process Regression

Abstract

Tags

Users

Comments and Reviewsshow / hide

Cite this publication

More citation styles

search on