Gaussian Processes
• Sometimes it is possible to consider infinitely many features at once, by extending from a sum to
an integral. This requires some regularity assumption about the features' locations, shape, etc.
• The resulting nonparametric model is known as a Gaussian process.
• Inference in GPs is tractable (though at polynomial cost $O(N^3)$ in the number $N$ of datapoints).
• There is no unique kernel. In fact, there are quite a few! E.g.
$k(a, b) = \exp\left(-(a - b)^2\right)$   Gaussian / squared exponential / RBF kernel
$k(a, b) = \min(a - t_0, b - t_0)$   Wiener process
$k(a, b) = \tfrac{1}{3} \min^3(a - t_0, b - t_0) + \tfrac{1}{2} |a - b| \cdot \min^2(a - t_0, b - t_0)$   cubic spline kernel
$k(a, b) = \tfrac{2}{\pi} \sin^{-1}\left( \frac{2 a^\top b}{\sqrt{(1 + 2 a^\top a)(1 + 2 b^\top b)}} \right)$   Neural Network kernel (Williams, 1998)
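
As a rough illustration, the kernels listed above and the cubic cost of inference can be sketched in NumPy. This is a minimal sketch and not the lecture's own code: the function names, the default t0 = 0, the zero prior mean and the Gaussian observation noise `noise_var` are all assumptions made for this example. The O(N^3) cost comes from the Cholesky factorisation of the N x N Gram matrix.

```python
import numpy as np


def rbf_kernel(a, b):
    """k(a, b) = exp(-(a - b)^2) for 1-D input arrays a (N,) and b (M,)."""
    return np.exp(-((a[:, None] - b[None, :]) ** 2))


def wiener_kernel(a, b, t0=0.0):
    """k(a, b) = min(a - t0, b - t0), intended for inputs a, b >= t0."""
    return np.minimum(a[:, None] - t0, b[None, :] - t0)


def cubic_spline_kernel(a, b, t0=0.0):
    """k(a, b) = 1/3 min^3(a - t0, b - t0) + 1/2 |a - b| min^2(a - t0, b - t0)."""
    m = np.minimum(a[:, None] - t0, b[None, :] - t0)
    return m**3 / 3.0 + 0.5 * np.abs(a[:, None] - b[None, :]) * m**2


def nn_kernel(a, b):
    """Neural network kernel for vector inputs a (N, D) and b (M, D)."""
    ab = a @ b.T                      # pairwise inner products a^T b
    aa = np.sum(a * a, axis=1)        # a^T a for each row of a
    bb = np.sum(b * b, axis=1)        # b^T b for each row of b
    denom = np.sqrt((1.0 + 2.0 * aa)[:, None] * (1.0 + 2.0 * bb)[None, :])
    return 2.0 / np.pi * np.arcsin(2.0 * ab / denom)


def gp_posterior(kernel, X, y, X_star, noise_var=1e-2):
    """Posterior mean and covariance at X_star for a zero-mean GP prior.

    Assumes i.i.d. Gaussian observation noise with variance noise_var
    (an assumption of this sketch, not stated on the slide).
    """
    K = kernel(X, X) + noise_var * np.eye(len(X))   # N x N Gram matrix
    L = np.linalg.cholesky(K)                       # the O(N^3) step
    # Generic solves against the factor; a triangular solver would be cheaper.
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    K_star = kernel(X_star, X)
    mean = K_star @ alpha
    V = np.linalg.solve(L, K_star.T)
    cov = kernel(X_star, X_star) - V.T @ V
    return mean, cov


# Usage: condition an RBF-kernel GP on three noisy observations of sin(x).
X_train = np.array([0.0, 1.0, 2.0])
y_train = np.sin(X_train)
X_test = np.linspace(0.0, 2.0, 5)
post_mean, post_cov = gp_posterior(rbf_kernel, X_train, y_train, X_test)
```

Swapping `rbf_kernel` for `wiener_kernel` or `cubic_spline_kernel` changes only the prior; the inference code stays the same, which is one way to read the remark that there is no unique kernel.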
Probabilistic ML — P. Hennig, SS 2021 — Lecture 09: Gaussian Processes— © Philipp Hennig, 2021 CC BY-NC-SA 3.0 20