Chapter 14
Bayesian Learning
supplementary slides to
Machine Learning Fundamentals
© Hui Jiang 2020
published by Cambridge University Press
August 2020
Outline
1 Formulation
2 Conjugate Priors
3 Approximate Inference
4 Gaussian Processes
sequential Bayesian learning: the posterior after observing $x_1$ serves as the prior when the next sample $x_2$ arrives:
$$p(\theta \mid x_1) \propto p(\theta)\, p(x_1 \mid \theta)$$
$$p(\theta \mid x_1, x_2) \propto p(\theta \mid x_1)\, p(x_2 \mid \theta)$$
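As a minimal illustration of this sequential updating rule (not from the book), the sketch below discretizes $\theta$ on a grid and applies posterior $\propto$ prior $\times$ likelihood one observation at a time; the Bernoulli likelihood is only an arbitrary example choice.

import numpy as np

# a minimal sketch of sequential Bayesian updating on a discrete grid of theta;
# the Bernoulli likelihood below is an illustrative assumption, not the book's example
theta = np.linspace(0.001, 0.999, 999)   # grid over the parameter theta
prior = np.ones_like(theta)              # flat prior p(theta)
prior /= prior.sum()

def likelihood(x, theta):
    # Bernoulli likelihood p(x | theta), with x in {0, 1}
    return theta**x * (1.0 - theta)**(1 - x)

posterior = prior.copy()
for x in [1, 0, 1, 1]:                               # observations arrive one at a time
    posterior = posterior * likelihood(x, theta)     # p(theta | x1..xk) from p(theta | x1..x_{k-1})
    posterior /= posterior.sum()                     # renormalize on the grid

print("posterior mean of theta:", (theta * posterior).sum())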
assume a Gaussian likelihood with known variance, $p(x \mid \mu) = N(x \mid \mu, \sigma_0^2)$, and a Gaussian prior $p(\mu) = N(\mu \mid \nu_0, \tau_0^2)$

$$p(\mu \mid x_1) \propto p(\mu)\, p(x_1 \mid \mu) \;\Longrightarrow\; p(\mu \mid x_1) = N(\mu \mid \nu_1, \tau_1^2)$$
with $\nu_1 = \frac{\sigma_0^2}{\tau_0^2 + \sigma_0^2}\,\nu_0 + \frac{\tau_0^2}{\tau_0^2 + \sigma_0^2}\,x_1$ and $\tau_1^2 = \frac{\tau_0^2\,\sigma_0^2}{\tau_0^2 + \sigma_0^2}$

$$p(\mu \mid x_1, \cdots, x_n) = N(\mu \mid \nu_n, \tau_n^2)$$
with $\nu_n = \frac{n\tau_0^2}{n\tau_0^2 + \sigma_0^2}\,\bar{x}_n + \frac{\sigma_0^2}{n\tau_0^2 + \sigma_0^2}\,\nu_0$ and $\tau_n^2 = \frac{\tau_0^2\,\sigma_0^2}{n\tau_0^2 + \sigma_0^2}$

as $n \to \infty$, we have
◦ $\tau_n \to 0$
◦ $\nu_n \to \bar{x}_n$
◦ $\mu_{\mathrm{MAP}} \to \mu_{\mathrm{ML}}$
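A short numerical sketch (not from the book) that evaluates these closed-form updates; the data and prior values are arbitrary placeholders.

import numpy as np

# closed-form posterior of a Gaussian mean with known variance sigma0^2,
# given a Gaussian prior N(nu0, tau0^2); all values are illustrative placeholders
sigma0_sq = 1.0           # known observation variance
nu0, tau0_sq = 0.0, 4.0   # prior mean and variance

x = np.random.default_rng(0).normal(loc=2.0, scale=np.sqrt(sigma0_sq), size=50)
n, xbar = len(x), x.mean()

nu_n = (n * tau0_sq / (n * tau0_sq + sigma0_sq)) * xbar \
     + (sigma0_sq / (n * tau0_sq + sigma0_sq)) * nu0
tau_n_sq = tau0_sq * sigma0_sq / (n * tau0_sq + sigma0_sq)

print("posterior mean and variance:", nu_n, tau_n_sq)  # nu_n -> xbar, tau_n -> 0 as n grows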
Conjugate Priors
the likelihood function of a training set $D = \{x_1, x_2, \cdots, x_N\}$, where $\bar{x}$ and $S$ denote the sample mean and sample covariance matrix of $D$:
$$p(D \mid \mu, \Sigma) = \prod_{i=1}^{N} p(x_i \mid \mu, \Sigma) = \frac{|\Sigma|^{-\frac{N}{2}}}{(2\pi)^{Nd/2}} \exp\Big[-\frac{1}{2}\,\mathrm{tr}\big(N S\,\Sigma^{-1}\big) - \frac{N}{2}\,(\mu - \bar{x})^{\top}\Sigma^{-1}(\mu - \bar{x})\Big]$$

Bayesian learning:
$$p(\mu, \Sigma \mid D) \propto \mathrm{GIW}\big(\mu, \Sigma \mid \boldsymbol{\nu}_0, \Phi_0, \lambda_0, \nu_0\big) \cdot p(D \mid \mu, \Sigma)$$
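A short sketch (illustrative, not the book's code) that computes the sufficient statistics $\bar{x}$ and $S$ and evaluates $\ln p(D \mid \mu, \Sigma)$ in the trace form above; the synthetic data and the test parameters are placeholders.

import numpy as np

# evaluate ln p(D | mu, Sigma) via the sufficient statistics xbar and S
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))            # N x d training set D (placeholder data)
N, d = X.shape

xbar = X.mean(axis=0)                    # sample mean
S = (X - xbar).T @ (X - xbar) / N        # sample covariance (1/N scaling)

def log_likelihood(mu, Sigma):
    Sigma_inv = np.linalg.inv(Sigma)
    _, logdet = np.linalg.slogdet(Sigma)
    quad = (mu - xbar) @ Sigma_inv @ (mu - xbar)
    return (-0.5 * N * d * np.log(2 * np.pi)
            - 0.5 * N * logdet
            - 0.5 * np.trace(N * S @ Sigma_inv)
            - 0.5 * N * quad)

print(log_likelihood(np.zeros(d), np.eye(d)))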
the posterior is again a Gaussian-inverse-Wishart distribution, $p(\mu, \Sigma \mid D) = \mathrm{GIW}\big(\mu, \Sigma \mid \boldsymbol{\nu}_1, \Phi_1, \lambda_1, \nu_1\big)$, with
◦ $\lambda_1 = \lambda_0 + N$ and $\nu_1 = \nu_0 + N$
◦ $\boldsymbol{\nu}_1 = \frac{\lambda_0 \boldsymbol{\nu}_0 + N\bar{x}}{\lambda_0 + N}$
◦ $\Phi_1 = \Phi_0 + N S + \frac{\lambda_0 N}{\lambda_0 + N}\,\big(\bar{x} - \boldsymbol{\nu}_0\big)\big(\bar{x} - \boldsymbol{\nu}_0\big)^{\top}$

MAP estimation: $\big(\mu_{\mathrm{MAP}}, \Sigma_{\mathrm{MAP}}\big) = \arg\max_{\mu, \Sigma}\, p\big(\mu, \Sigma \mid D_N\big)$
$$\mu_{\mathrm{MAP}} = \boldsymbol{\nu}_1 = \frac{\lambda_0 \boldsymbol{\nu}_0 + N\bar{x}}{\lambda_0 + N}$$
$$\Sigma_{\mathrm{MAP}} = \frac{\Phi_1}{\nu_1 + d + 1} = \frac{\Phi_0 + N S + \frac{\lambda_0 N}{\lambda_0 + N}\big(\bar{x} - \boldsymbol{\nu}_0\big)\big(\bar{x} - \boldsymbol{\nu}_0\big)^{\top}}{\nu_0 + N + d + 1}$$
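A minimal sketch (not the book's code) of these hyper-parameter updates and the resulting MAP estimates; the function recomputes $\bar{x}$ and $S$ from the data, and all prior values and variable names are placeholders of mine.

import numpy as np

def giw_posterior_and_map(X, nu0_vec, Phi0, lam0, nu0):
    """Update GIW hyper-parameters on data X (N x d) and return MAP estimates."""
    N, d = X.shape
    xbar = X.mean(axis=0)
    S = (X - xbar).T @ (X - xbar) / N

    lam1 = lam0 + N
    nu1 = nu0 + N
    nu1_vec = (lam0 * nu0_vec + N * xbar) / (lam0 + N)
    diff = (xbar - nu0_vec).reshape(-1, 1)
    Phi1 = Phi0 + N * S + (lam0 * N / (lam0 + N)) * (diff @ diff.T)

    mu_map = nu1_vec
    Sigma_map = Phi1 / (nu1 + d + 1)
    return (nu1_vec, Phi1, lam1, nu1), (mu_map, Sigma_map)

# arbitrary prior hyper-parameters for a 3-dimensional Gaussian
d = 3
X = np.random.default_rng(1).normal(size=(200, d))
post, (mu_map, Sigma_map) = giw_posterior_and_map(
    X, nu0_vec=np.zeros(d), Phi0=np.eye(d), lam0=1.0, nu0=d + 2)
print(mu_map, "\n", Sigma_map)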
Approximate Inference
Laplace’s Method

approximate the posterior $p(\theta \mid D)$ by a Gaussian centred at the mode $\theta_{\mathrm{MAP}}$, with covariance given by the inverse of the negative Hessian of $\ln p(\theta, D)$ at $\theta_{\mathrm{MAP}}$
$$\mathrm{KL}\big(q(\theta)\,\big\|\,p(\theta \mid D)\big) = \ln p(D) - \underbrace{\int_{\theta} q(\theta)\,\ln\frac{p(D, \theta)}{q(\theta)}\,d\theta}_{\mathcal{L}(q)}$$

$$\min_{q}\ \mathrm{KL}\big(q(\theta)\,\big\|\,p(\theta \mid D)\big) \iff \max_{q}\ \mathcal{L}(q)$$

assume $q(\theta) = q_1(\theta_1)\,q_2(\theta_2)\cdots q_I(\theta_I)$ can be factorized over some disjoint subsets $\theta = \theta_1 \cup \theta_2 \cup \cdots \cup \theta_I$

$$\mathcal{L}(q) = \int_{\theta}\prod_{i=1}^{I} q_i(\theta_i)\,\ln p(D, \theta)\,d\theta - \sum_{i=1}^{I}\int_{\theta_i} q_i(\theta_i)\,\ln q_i(\theta_i)\,d\theta_i$$
define a new distribution: $\tilde{p}(\theta_i; D) \propto \exp\big(\mathbb{E}_{j \neq i}[\ln p(D, \theta)]\big)$; holding all other factors $q_j$ ($j \neq i$) fixed, $\mathcal{L}(q)$ is maximized by setting $q_i^*(\theta_i) = \tilde{p}(\theta_i; D)$
choose conjugate priors for all GMM parameters, with
$$p(w_1, \cdots, w_M) = \mathrm{Dir}\big(w_1, \cdots, w_M \mid \alpha_1^{(0)}, \cdots, \alpha_M^{(0)}\big)$$
$$p(\mu_m, \Sigma_m) = \mathrm{GIW}\big(\mu_m, \Sigma_m \mid \boldsymbol{\nu}_m^{(0)}, \Phi_m^{(0)}, \lambda_m^{(0)}, \nu_m^{(0)}\big)$$
introduce a 1-of-M latent variable $z = [z_1\ z_2\ \cdots\ z_M]^{\top}$ for GMMs:
$$p(x, z \mid \theta) = \prod_{m=1}^{M} w_m^{z_m}\, N(x \mid \mu_m, \Sigma_m)^{z_m}$$
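A tiny sketch (illustrative only) evaluating $\ln p(x, z \mid \theta)$ for a one-hot $z$ using scipy's multivariate normal; all parameter values are placeholders.

import numpy as np
from scipy.stats import multivariate_normal

# ln p(x, z | theta) = sum_m z_m [ln w_m + ln N(x | mu_m, Sigma_m)] for one-hot z
w = np.array([0.5, 0.3, 0.2])                       # mixture weights (placeholders)
mus = [np.zeros(2), np.ones(2), -np.ones(2)]        # component means
Sigmas = [np.eye(2)] * 3                            # component covariances

x = np.array([0.4, -0.1])
z = np.array([0, 1, 0])                             # x assigned to component 2

log_p = sum(z[m] * (np.log(w[m]) +
                    multivariate_normal.logpdf(x, mus[m], Sigmas[m]))
            for m in range(len(w)))
print(log_p)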
$$\Longrightarrow\quad \ln q^*(z) = C' + \sum_{m=1}^{M} z_m \underbrace{\bigg(\mathbb{E}[\ln w_m] - \mathbb{E}\Big[\frac{\ln|\Sigma_m|}{2}\Big] - \mathbb{E}\Big[\frac{(x - \mu_m)^{\top}\Sigma_m^{-1}(x - \mu_m)}{2}\Big]\bigg)}_{\ln \rho_m}$$

◦ $q^*(z)$ is a multinomial: $q^*(z) \propto \prod_{m=1}^{M} \rho_m^{z_m} \propto \prod_{m=1}^{M} r_m^{z_m}$, where $r_m = \frac{\rho_m}{\sum_{m=1}^{M}\rho_m}$ for all $m$

$$\ln q^*(w_1, \cdots, w_M) = \mathbb{E}_{z, \mu_m, \Sigma_m}\big[\ln p(\theta) + \ln p(x, z \mid \theta)\big] = \sum_{m=1}^{M}\big(\alpha_m^{(0)} - 1\big)\ln w_m + \sum_{m=1}^{M} r_m \ln w_m + C$$

◦ $q^*(w_1, \cdots, w_M)$ is a Dirichlet distribution:
$$q^*(w_1, \cdots, w_M) = \mathrm{Dir}\big(w_1, \cdots, w_M \mid \alpha_1^{(1)}, \cdots, \alpha_M^{(1)}\big)$$
where $\alpha_m^{(1)} = \alpha_m^{(0)} + r_m$ for all $m = 1, 2, \cdots, M$
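Assuming the unnormalized values $\ln \rho_m$ have already been computed (they require the expectations above), the normalization into $r_m$ and the Dirichlet update take only a few lines; this is a sketch with placeholder numbers, not the book's code.

import numpy as np

# given ln rho_m for each component (placeholder values here), form the
# responsibilities r_m and update the Dirichlet hyper-parameters
log_rho = np.array([-3.2, -1.1, -2.5])        # ln rho_m, m = 1..M (placeholders)
log_rho -= log_rho.max()                      # subtract the max for numerical stability
r = np.exp(log_rho) / np.exp(log_rho).sum()   # r_m = rho_m / sum_m rho_m

alpha0 = np.array([1.0, 1.0, 1.0])            # alpha_m^(0)
alpha1 = alpha0 + r                           # alpha_m^(1) = alpha_m^(0) + r_m
print(r, alpha1)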
$$\ln q^*(\mu_m, \Sigma_m) = \ln p(\mu_m, \Sigma_m) + \mathbb{E}[z_m]\,\ln N(x \mid \mu_m, \Sigma_m) + C'$$

$$q^*(\mu_m, \Sigma_m) = \mathrm{GIW}\big(\mu_m, \Sigma_m \mid \boldsymbol{\nu}_m^{(1)}, \Phi_m^{(1)}, \lambda_m^{(1)}, \nu_m^{(1)}\big)$$
where
$$\lambda_m^{(1)} = \lambda_m^{(0)} + r_m$$
$$\nu_m^{(1)} = \nu_m^{(0)} + r_m$$
$$\boldsymbol{\nu}_m^{(1)} = \frac{\lambda_m^{(0)}\boldsymbol{\nu}_m^{(0)} + r_m x}{\lambda_m^{(0)} + r_m}$$
$$\Phi_m^{(1)} = \Phi_m^{(0)} + \frac{\lambda_m^{(0)} r_m}{\lambda_m^{(0)} + r_m}\big(x - \boldsymbol{\nu}_m^{(0)}\big)\big(x - \boldsymbol{\nu}_m^{(0)}\big)^{\top}$$
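A small sketch of the per-component GIW update driven by the responsibility $r_m$ (for a single observation $x$, as above); names and values are placeholders of mine.

import numpy as np

def giw_component_update(x, r, nu0_vec, Phi0, lam0, nu0):
    """VB update of one component's GIW hyper-parameters for one observation x
    with responsibility r (a sketch of the equations above)."""
    lam1 = lam0 + r
    nu1 = nu0 + r
    nu1_vec = (lam0 * nu0_vec + r * x) / (lam0 + r)
    diff = (x - nu0_vec).reshape(-1, 1)
    Phi1 = Phi0 + (lam0 * r / (lam0 + r)) * (diff @ diff.T)
    return nu1_vec, Phi1, lam1, nu1

# placeholder prior hyper-parameters and observation
d = 2
out = giw_component_update(np.array([0.4, -0.1]), r=0.7,
                           nu0_vec=np.zeros(d), Phi0=np.eye(d),
                           lam0=1.0, nu0=d + 2)
print(out)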
set n = 0
while not converged do
    E-step: collect statistics:
        $\{\alpha_m^{(n)}, \boldsymbol{\nu}_m^{(n)}, \Phi_m^{(n)}, \lambda_m^{(n)}, \nu_m^{(n)}\} + x \;\longrightarrow\; r_m$
    M-step: update all hyperparameters:
        $\{\alpha_m^{(n)}, \boldsymbol{\nu}_m^{(n)}, \Phi_m^{(n)}, \lambda_m^{(n)}, \nu_m^{(n)}\} + r_m + x \;\longrightarrow\; \{\alpha_m^{(n+1)}, \boldsymbol{\nu}_m^{(n+1)}, \Phi_m^{(n+1)}, \lambda_m^{(n+1)}, \nu_m^{(n+1)}\}$
    n = n + 1
end while
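Concretely, "collect statistics" in the E-step means evaluating $\mathbb{E}[\ln w_m]$, $\mathbb{E}[\ln|\Sigma_m|]$ and $\mathbb{E}[(x-\mu_m)^{\top}\Sigma_m^{-1}(x-\mu_m)]$ under the current Dirichlet and GIW factors. The sketch below (not the book's code) uses the standard Dirichlet and Normal-inverse-Wishart expectation identities, which is an assumption on my part about the exact form; the loop then alternates this step with the Dirichlet and GIW updates sketched earlier until the hyper-parameters stop changing.

import numpy as np
from scipy.special import digamma

def e_step_responsibilities(x, alpha, nu_vec, Phi, lam, nu):
    """Compute r_m for one observation x from the current hyper-parameters,
    using standard Dirichlet / Normal-inverse-Wishart expectations (a sketch)."""
    M, d = len(alpha), len(x)
    log_rho = np.empty(M)
    for m in range(M):
        # E[ln w_m] under the Dirichlet factor
        e_ln_w = digamma(alpha[m]) - digamma(alpha.sum())
        # E[ln |Sigma_m^{-1}|] under the inverse-Wishart factor
        _, logdet_Phi = np.linalg.slogdet(Phi[m])
        e_ln_prec = (digamma(0.5 * (nu[m] + 1 - np.arange(1, d + 1))).sum()
                     + d * np.log(2.0) - logdet_Phi)
        # E[(x - mu_m)^T Sigma_m^{-1} (x - mu_m)]
        diff = x - nu_vec[m]
        e_quad = d / lam[m] + nu[m] * diff @ np.linalg.solve(Phi[m], diff)
        log_rho[m] = e_ln_w + 0.5 * e_ln_prec - 0.5 * e_quad
    log_rho -= log_rho.max()                 # numerical stability
    rho = np.exp(log_rho)
    return rho / rho.sum()

# placeholder hyper-parameters for M = 2 components in d = 2 dimensions
alpha = np.array([1.0, 1.0])
nu_vec = [np.zeros(2), np.ones(2)]
Phi = [np.eye(2), np.eye(2)]
lam = np.array([1.0, 1.0])
nu = np.array([4.0, 4.0])
print(e_step_responsibilities(np.array([0.9, 1.1]), alpha, nu_vec, Phi, lam, nu))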
two popular Bayesian non-parametric models:
◦ Gaussian processes
◦ Dirichlet processes
hyper-parameters of the kernel function Φ(·, ·) (see the sketch below):
◦ σ: vertical scale
◦ l: horizontal scale
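A common kernel consistent with this description is the squared-exponential (RBF) kernel Φ(x, x′) = σ² exp(−‖x − x′‖²/(2l²)); the sketch below builds the kernel matrix under this assumed form (the book may use a slightly different parameterization).

import numpy as np

def rbf_kernel(X1, X2, sigma=1.0, l=1.0):
    """Squared-exponential kernel: sigma controls the vertical scale,
    l the horizontal (length) scale. Assumed form, for illustration."""
    sq_dists = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return sigma**2 * np.exp(-0.5 * sq_dists / l**2)

X = np.linspace(0, 5, 6).reshape(-1, 1)
print(rbf_kernel(X, X, sigma=2.0, l=0.5))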
observation model for GP regression (Gaussian noise with variance $\sigma_0^2$): $p(y \mid f, D) = N(y \mid f, \sigma_0^2 I)$
the joint distribution of the training outputs $y$ and the test output $\tilde{y}$ is Gaussian, with covariance matrix
$$C_{N+1} = \begin{bmatrix} C_N & \mathbf{k} \\ \mathbf{k}^{\top} & \kappa^2 \end{bmatrix}$$
where $\kappa^2 = \Phi(\tilde{x}, \tilde{x}) + \sigma_0^2$, $k_i = \Phi(x_i, \tilde{x})$, and $C_N$ is the $N \times N$ covariance matrix of the training outputs, $[C_N]_{ij} = \Phi(x_i, x_j) + \sigma_0^2\,\delta_{ij}$

the predictive distribution:
$$p(\tilde{y} \mid D, y, \tilde{x}) = \frac{p(y, \tilde{y} \mid D, \tilde{x})}{p(y \mid D)} = N\big(\tilde{y} \,\big|\, \mathbf{k}^{\top} C_N^{-1} y,\; \kappa^2 - \mathbf{k}^{\top} C_N^{-1}\mathbf{k}\big)$$

point estimation (MAP or mean):
$$\mathbb{E}\big[\tilde{y} \mid D, y, \tilde{x}\big] = \tilde{y}_{\mathrm{MAP}} = \mathbf{k}^{\top} C_N^{-1} y$$

hyper-parameter learning: the kernel hyper-parameters (e.g. $\sigma$, $l$) and the noise variance $\sigma_0^2$ can be estimated by maximizing the marginal likelihood $p(y \mid D)$
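The predictive mean and variance above translate directly into code. The sketch below (not the book's implementation) assumes the squared-exponential kernel from the earlier sketch as Φ(·, ·) and uses a Cholesky solve instead of an explicit inverse; the data are synthetic placeholders.

import numpy as np

def rbf_kernel(X1, X2, sigma=1.0, l=1.0):
    # assumed squared-exponential kernel Phi(x, x')
    sq = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return sigma**2 * np.exp(-0.5 * sq / l**2)

def gp_predict(X, y, x_new, sigma0=0.1, sigma=1.0, l=1.0):
    """GP regression: predictive mean k^T C_N^{-1} y and
    variance kappa^2 - k^T C_N^{-1} k for a single test input x_new."""
    C_N = rbf_kernel(X, X, sigma, l) + sigma0**2 * np.eye(len(X))
    k = rbf_kernel(X, x_new[None, :], sigma, l).ravel()     # k_i = Phi(x_i, x_new)
    kappa_sq = rbf_kernel(x_new[None, :], x_new[None, :], sigma, l)[0, 0] + sigma0**2
    L = np.linalg.cholesky(C_N)                             # C_N = L L^T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))     # C_N^{-1} y
    v = np.linalg.solve(L, k)
    mean = k @ alpha                                        # k^T C_N^{-1} y
    var = kappa_sq - v @ v                                  # kappa^2 - k^T C_N^{-1} k
    return mean, var

# synthetic 1-d example (placeholder data)
rng = np.random.default_rng(0)
X = rng.uniform(0, 5, size=(20, 1))
y = np.sin(X).ravel() + 0.1 * rng.normal(size=20)
print(gp_predict(X, y, np.array([2.5])))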
given a training set $D = \{x_1, x_2, \cdots, x_N\}$ and the corresponding outputs $y = [y_1\ y_2\ \cdots\ y_N]^{\top}$

non-parametric prior: $p(f \mid D) = N(f \mid 0, \Sigma_D)$

likelihood: $p(y \mid f, D) = \prod_{i=1}^{N} l\big(f(x_i)\big)^{y_i}\big(1 - l(f(x_i))\big)^{1 - y_i}$
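A tiny sketch of this Bernoulli likelihood, assuming l(·) is the logistic sigmoid (a common choice; the book defines l(·) explicitly elsewhere). Given latent values f(xi) and binary labels yi ∈ {0, 1}, the log-likelihood is:

import numpy as np

def log_sigmoid(a):
    # numerically stable ln l(a), with l the logistic sigmoid (assumed form)
    return -np.logaddexp(0.0, -a)

def gp_class_log_likelihood(f, y):
    """ln p(y | f, D) = sum_i [ y_i ln l(f_i) + (1 - y_i) ln(1 - l(f_i)) ]."""
    return np.sum(y * log_sigmoid(f) + (1 - y) * log_sigmoid(-f))

f = np.array([1.5, -0.3, 2.0, -1.2])      # latent function values f(x_i) (placeholders)
y = np.array([1, 0, 1, 0])                # binary labels
print(gp_class_log_likelihood(f, y))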