Matrix Differentiation
CS4243 Computer Vision and Pattern Recognition
Leow Wee Kheng
Department of Computer Science
School of Computing
National University of Singapore
Leow Wee Kheng (CS4243) Matrix Differentation 1 / 12
Matrix Differentiation
This set of slides summarizes some commonly used matrix
differentiation formulae.
In the following,
A and B are n×n matrices
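\[
A = \begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{n1} & \cdots & a_{nn} \end{bmatrix},
\quad
B = \begin{bmatrix} b_{11} & \cdots & b_{1n} \\ \vdots & \ddots & \vdots \\ b_{n1} & \cdots & b_{nn} \end{bmatrix}.
\tag{1}
\]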
x and y are n×1 column matrices (i.e., vectors)
\[
x = \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix},
\quad
y = \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix}.
\tag{2}
\]
(1) E = x⊤ x
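\[
E = x^\top x = \sum_i x_i^2 = x_1^2 + \cdots + x_n^2. \tag{3}
\]
\[
\frac{\partial E}{\partial x_i} = 2 x_i. \tag{4}
\]
So,
\[
\frac{\partial E}{\partial x}
= \begin{bmatrix} \partial E/\partial x_1 \\ \vdots \\ \partial E/\partial x_n \end{bmatrix}
= 2 \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix}
= 2x. \tag{5}
\]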
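Equation (5) can be sanity-checked numerically. The sketch below is only an illustration, not part of the original slides: it uses plain Python lists, and the helper names `numerical_gradient` and `squared_norm` are hypothetical. It compares the analytic gradient 2x with a central-difference estimate.

```python
def numerical_gradient(f, x, h=1e-6):
    """Central-difference estimate of the gradient of f at x (list of floats)."""
    grad = []
    for i in range(len(x)):
        x_plus = x[:i] + [x[i] + h] + x[i + 1:]
        x_minus = x[:i] + [x[i] - h] + x[i + 1:]
        grad.append((f(x_plus) - f(x_minus)) / (2 * h))
    return grad


def squared_norm(x):
    """E = x^T x, as in equation (3)."""
    return sum(xi * xi for xi in x)


x = [1.0, -2.0, 0.5]                     # arbitrary test point
analytic = [2 * xi for xi in x]          # equation (5): dE/dx = 2x
numeric = numerical_gradient(squared_norm, x)
max_error = max(abs(a - b) for a, b in zip(analytic, numeric))
print("largest deviation:", max_error)
```

The step size `h` is an assumed value; because E is quadratic, the central difference is exact up to rounding, so the deviation should be tiny.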
(2) E = x⊤ y
\[
E = x^\top y = x_1 y_1 + \cdots + x_n y_n. \tag{6}
\]
\[
\frac{\partial E}{\partial x_i} = y_i. \tag{7}
\]
So,
\[
\frac{\partial E}{\partial x} = y. \tag{8}
\]
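(3) E = (x⊤ y)²
\[
E = (x^\top y)^2 = (x_1 y_1 + \cdots + x_n y_n)^2. \tag{9}
\]
\[
\frac{\partial E}{\partial x_i} = 2 (x_1 y_1 + \cdots + x_n y_n)\, y_i. \tag{10}
\]
So,
\[
\frac{\partial E}{\partial x}
= \begin{bmatrix} \partial E/\partial x_1 \\ \vdots \\ \partial E/\partial x_n \end{bmatrix}
= 2 (x^\top y)\, y. \tag{11}
\]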
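As a rough check of equation (11), the following sketch (plain Python; the names `dot`, `numerical_gradient`, and `energy` are illustrative assumptions, not from the slides) compares 2(x⊤y)y against a central-difference gradient at one arbitrary point.

```python
def dot(u, v):
    """Inner product u^T v of two equal-length lists."""
    return sum(ui * vi for ui, vi in zip(u, v))


def numerical_gradient(f, x, h=1e-6):
    """Central-difference estimate of the gradient of f at x."""
    grad = []
    for i in range(len(x)):
        x_plus = x[:i] + [x[i] + h] + x[i + 1:]
        x_minus = x[:i] + [x[i] - h] + x[i + 1:]
        grad.append((f(x_plus) - f(x_minus)) / (2 * h))
    return grad


x = [1.0, 2.0, -1.0]                     # arbitrary test vectors
y = [0.5, -1.0, 3.0]


def energy(v):
    """E = (v^T y)^2 with y held fixed, as in equation (9)."""
    return dot(v, y) ** 2


analytic = [2 * dot(x, y) * yi for yi in y]   # equation (11)
numeric = numerical_gradient(energy, x)
max_error = max(abs(a - b) for a, b in zip(analytic, numeric))
print("largest deviation:", max_error)
```

Note that the scalar x⊤y multiplies every component of y, which is why the gradient is always parallel to y.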
(4) Ax
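\[
Ax = \begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{n1} & \cdots & a_{nn} \end{bmatrix}
\begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix}
= \begin{bmatrix} a_{11} x_1 + \cdots + a_{1n} x_n \\ \vdots \\ a_{n1} x_1 + \cdots + a_{nn} x_n \end{bmatrix}.
\tag{12}
\]
Let
\[
s_i = [Ax]_i = a_{i1} x_1 + \cdots + a_{in} x_n = \sum_j a_{ij} x_j. \tag{13}
\]
Then,
\[
\frac{\partial s_i}{\partial x_j} = a_{ij}. \tag{14}
\]
So,
\[
\frac{\partial (Ax)}{\partial x^\top}
= \begin{bmatrix}
\partial s_1/\partial x_1 & \cdots & \partial s_1/\partial x_n \\
\vdots & \ddots & \vdots \\
\partial s_n/\partial x_1 & \cdots & \partial s_n/\partial x_n
\end{bmatrix}
= A. \tag{15}
\]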
And,
\[
\frac{\partial (Ax)^\top}{\partial x} = A^\top. \tag{16}
\]
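Equations (15) and (16) describe the same partial derivatives in two layouts. The sketch below is an illustration under assumed names (`matvec` and `numerical_jacobian` are hypothetical helpers): it estimates ∂s_i/∂x_j by central differences and compares the row-i, column-j layout with A, and its transpose with A⊤.

```python
def matvec(A, x):
    """Matrix-vector product Ax for a list-of-rows matrix."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def numerical_jacobian(f, x, h=1e-6):
    """J[i][j] approximates d f_i / d x_j by central differences."""
    columns = []
    for j in range(len(x)):
        x_plus = x[:j] + [x[j] + h] + x[j + 1:]
        x_minus = x[:j] + [x[j] - h] + x[j + 1:]
        f_plus, f_minus = f(x_plus), f(x_minus)
        columns.append([(p - m) / (2 * h) for p, m in zip(f_plus, f_minus)])
    # columns[j][i] holds d f_i / d x_j; transpose so rows are indexed by i
    return [list(row) for row in zip(*columns)]


A = [[1.0, 2.0], [3.0, 4.0]]             # arbitrary test matrix
x = [0.5, -1.5]

J = numerical_jacobian(lambda v: matvec(A, v), x)   # layout of equation (15)
J_T = [list(row) for row in zip(*J)]                # layout of equation (16)
A_T = [list(row) for row in zip(*A)]

n = len(x)
error_15 = max(abs(J[i][j] - A[i][j]) for i in range(n) for j in range(n))
error_16 = max(abs(J_T[i][j] - A_T[i][j]) for i in range(n) for j in range(n))
print("deviation from A:", error_15, "deviation from A^T:", error_16)
```

Since Ax is linear in x, the Jacobian does not depend on the test point chosen.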
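(5) E = ‖Ax‖²
\[
E = \|Ax\|^2 = (Ax)^\top (Ax) = \sum_k \Bigl( \sum_j a_{kj} x_j \Bigr)^2. \tag{17}
\]
\[
\frac{\partial E}{\partial x_i}
= 2 \sum_k \sum_j a_{ki} a_{kj} x_j
= 2 \sum_j \Bigl[ \sum_k a_{ki} a_{kj} \Bigr] x_j. \tag{18}
\]
That is,
\[
\frac{\partial E}{\partial x_i} = 2 \sum_j \bigl[ A^\top A \bigr]_{ij} x_j. \tag{19}
\]
Thus,
\[
\frac{\partial E}{\partial x} = 2 A^\top A x. \tag{20}
\]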
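A small numerical check of equation (20) may help confirm the index manipulation in (18) and (19). This is a sketch in plain Python; the helper names (`matvec`, `transpose`, `numerical_gradient`, `energy`) are assumptions made for illustration.

```python
def matvec(A, x):
    """Matrix-vector product Ax for a list-of-rows matrix."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def transpose(A):
    """Transpose of a list-of-rows matrix."""
    return [list(row) for row in zip(*A)]


def numerical_gradient(f, x, h=1e-6):
    """Central-difference estimate of the gradient of f at x."""
    grad = []
    for i in range(len(x)):
        x_plus = x[:i] + [x[i] + h] + x[i + 1:]
        x_minus = x[:i] + [x[i] - h] + x[i + 1:]
        grad.append((f(x_plus) - f(x_minus)) / (2 * h))
    return grad


A = [[1.0, 2.0], [3.0, 4.0]]             # arbitrary test matrix
x = [0.5, -1.5]


def energy(v):
    """E = ||Av||^2, as in equation (17)."""
    return sum(s * s for s in matvec(A, v))


# Equation (20): dE/dx = 2 A^T (A x)
analytic = [2 * g for g in matvec(transpose(A), matvec(A, x))]
numeric = numerical_gradient(energy, x)
max_error = max(abs(a - b) for a, b in zip(analytic, numeric))
print("largest deviation:", max_error)
```

Computing A⊤(Ax) as two matrix-vector products avoids forming A⊤A explicitly, which is a design choice of this sketch rather than something the slides prescribe.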
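(6) E = ‖Ax‖²
\[
E = \sum_i s_i^2 \tag{21}
\]
where
\[
s_i = \sum_j a_{ij} x_j. \tag{22}
\]
\[
\frac{\partial E}{\partial a_{ij}} = 2 s_i \frac{\partial s_i}{\partial a_{ij}} = 2 s_i x_j. \tag{23}
\]
Thus,
\[
\frac{\partial E}{\partial A}
= 2 \begin{bmatrix} s_1 \\ \vdots \\ s_n \end{bmatrix}
\begin{bmatrix} x_1 & \cdots & x_n \end{bmatrix}
= 2 A x x^\top. \tag{24}
\]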
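Equation (24) says the gradient with respect to A is an outer product. The sketch below (plain Python, with hypothetical helper names `matvec`, `energy`, and `numerical_gradient_wrt_matrix`) perturbs each entry a_ij in turn and compares the result with 2 s_i x_j from equation (23).

```python
def matvec(A, x):
    """Matrix-vector product Ax for a list-of-rows matrix."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def energy(A, x):
    """E = ||Ax||^2 = sum_i s_i^2, as in equation (21)."""
    return sum(s * s for s in matvec(A, x))


def numerical_gradient_wrt_matrix(f, M, h=1e-6):
    """Central-difference estimate of d f / d m_ij for every entry of M."""
    grad = [[0.0] * len(row) for row in M]
    for i in range(len(M)):
        for j in range(len(M[i])):
            M_plus = [row[:] for row in M]
            M_minus = [row[:] for row in M]
            M_plus[i][j] += h
            M_minus[i][j] -= h
            grad[i][j] = (f(M_plus) - f(M_minus)) / (2 * h)
    return grad


A = [[1.0, 2.0], [3.0, 4.0]]             # arbitrary test data
x = [0.5, -1.5]

s = matvec(A, x)
analytic = [[2 * si * xj for xj in x] for si in s]   # 2 (Ax) x^T, entry (i, j) = 2 s_i x_j
numeric = numerical_gradient_wrt_matrix(lambda M: energy(M, x), A)
max_error = max(abs(analytic[i][j] - numeric[i][j])
                for i in range(len(A)) for j in range(len(x)))
print("largest deviation:", max_error)
```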
(7) E = ‖BAx‖²
Let y = Ax. Then,
\[
E = \|By\|^2. \tag{25}
\]
\[
\frac{\partial E}{\partial B} = 2 B y y^\top = 2 B A x x^\top A^\top. \tag{26}
\]
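Equation (26) reuses result (24) with B in place of A and y = Ax in place of x. The following sketch is one way to confirm this numerically; it uses plain Python, and the helper names and test values are assumptions made for illustration.

```python
def matvec(M, v):
    """Matrix-vector product Mv for a list-of-rows matrix."""
    return [sum(m * vj for m, vj in zip(row, v)) for row in M]


def numerical_gradient_wrt_matrix(f, M, h=1e-6):
    """Central-difference estimate of d f / d m_ij for every entry of M."""
    grad = [[0.0] * len(row) for row in M]
    for i in range(len(M)):
        for j in range(len(M[i])):
            M_plus = [row[:] for row in M]
            M_minus = [row[:] for row in M]
            M_plus[i][j] += h
            M_minus[i][j] -= h
            grad[i][j] = (f(M_plus) - f(M_minus)) / (2 * h)
    return grad


A = [[1.0, 2.0], [3.0, 4.0]]             # arbitrary test data
B = [[1.0, 0.5], [-1.0, 2.0]]
x = [0.5, -1.5]

y = matvec(A, x)                         # y = Ax, held fixed while B varies


def energy(M):
    """E = ||My||^2 as a function of the matrix M standing in for B."""
    return sum(t * t for t in matvec(M, y))


t = matvec(B, y)                         # By = BAx
analytic = [[2 * ti * yj for yj in y] for ti in t]   # 2 (By) y^T, equation (26)
numeric = numerical_gradient_wrt_matrix(energy, B)
max_error = max(abs(analytic[i][j] - numeric[i][j])
                for i in range(len(B)) for j in range(len(y)))
print("largest deviation:", max_error)
```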
(8) E = ‖BAx‖²
Let C = BA. Then,
\[
E = \|Cx\|^2. \tag{27}
\]
\[
\frac{\partial E}{\partial x} = 2 C^\top C x = 2 A^\top B^\top B A x. \tag{28}
\]
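(9) E = ‖BAx‖²
\[
E = \|BAx\|^2 = \sum_k \Bigl[ \sum_l \sum_m b_{kl} a_{lm} x_m \Bigr]^2. \tag{29}
\]
\[
\begin{aligned}
\frac{\partial E}{\partial a_{ij}}
&= 2 \sum_k \Bigl[ \sum_l \sum_m b_{kl} a_{lm} x_m \Bigr] b_{ki} x_j \\
&= 2 \sum_l \sum_m \Bigl[ \sum_k b_{ki} b_{kl} \Bigr] a_{lm} x_m x_j \\
&= 2 \sum_l \sum_m \bigl[ B^\top B \bigr]_{il} a_{lm} x_m x_j \\
&= 2 \bigl[ B^\top B A x \bigr]_i x_j.
\end{aligned}
\tag{30}
\]
So,
\[
\frac{\partial E}{\partial A} = 2 B^\top B A x x^\top. \tag{31}
\]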
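The index juggling in equation (30) is easy to get wrong, so a numerical check of equation (31) can be reassuring. The sketch below is illustrative only: it uses plain Python, arbitrary test values, and hypothetical helper names, and it compares 2[B⊤BAx]_i x_j with a central-difference estimate of ∂E/∂a_ij.

```python
def matvec(M, v):
    """Matrix-vector product Mv for a list-of-rows matrix."""
    return [sum(m * vj for m, vj in zip(row, v)) for row in M]


def transpose(M):
    """Transpose of a list-of-rows matrix."""
    return [list(row) for row in zip(*M)]


def numerical_gradient_wrt_matrix(f, M, h=1e-6):
    """Central-difference estimate of d f / d m_ij for every entry of M."""
    grad = [[0.0] * len(row) for row in M]
    for i in range(len(M)):
        for j in range(len(M[i])):
            M_plus = [row[:] for row in M]
            M_minus = [row[:] for row in M]
            M_plus[i][j] += h
            M_minus[i][j] -= h
            grad[i][j] = (f(M_plus) - f(M_minus)) / (2 * h)
    return grad


A = [[1.0, 2.0], [3.0, 4.0]]             # arbitrary test data
B = [[1.0, 0.5], [-1.0, 2.0]]
x = [0.5, -1.5]


def energy(M):
    """E = ||B M x||^2 as a function of the matrix M standing in for A."""
    return sum(t * t for t in matvec(B, matvec(M, x)))


# u = B^T B A x, so entry (i, j) of the gradient is 2 u_i x_j (equation (30))
u = matvec(transpose(B), matvec(B, matvec(A, x)))
analytic = [[2 * ui * xj for xj in x] for ui in u]   # equation (31)
numeric = numerical_gradient_wrt_matrix(energy, A)
max_error = max(abs(analytic[i][j] - numeric[i][j])
                for i in range(len(A)) for j in range(len(x)))
print("largest deviation:", max_error)
```

Comparing with result (7), the only structural difference is where the outer product sits: here the left factor B⊤B multiplies Ax, whereas in (26) the right factor A⊤ follows x⊤.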