Principal Component Analysis
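1. We want to find the principal components associated to the matrix $A = \begin{bmatrix} 4 & 3 & 9 \\ 2 & 7 & 9 \end{bmatrix}$.
(a) Use $A$ to find a matrix $B$ with mean $\vec{0}$.
Solution. The mean vector for $A$ is $\vec{m} = \begin{bmatrix} 3 \\ 5 \\ 9 \end{bmatrix}$, so
\[ B = A - \begin{bmatrix} 3 & 5 & 9 \\ 3 & 5 & 9 \end{bmatrix} = \begin{bmatrix} 1 & -2 & 0 \\ -1 & 2 & 0 \end{bmatrix}. \]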
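The centering step can be sketched in a few lines of plain Python. This is only an illustration, assuming (as in the solution) that the rows of the matrix are observations and the mean is taken column by column; the helper name `center_columns` is hypothetical.

```python
from fractions import Fraction

def center_columns(rows):
    """Subtract each column's mean from that column (hypothetical helper)."""
    n = len(rows)
    # Column means, kept exact with Fraction.
    means = [Fraction(sum(col), n) for col in zip(*rows)]
    centered = [[x - m for x, m in zip(row, means)] for row in rows]
    return means, centered

A = [[4, 3, 9],
     [2, 7, 9]]
means, B = center_columns(A)
print(means)  # the column means 3, 5, 9 (as Fractions)
print(B)      # the rows (1, -2, 0) and (-1, 2, 0) (as Fractions)
```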
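(b) Find the principal components of $B$.
Solution. To find the principal components of $B$, we need to first find the eigenvalues and then use them to find an orthonormal eigenbasis for
\[ B^T B = \begin{bmatrix} 2 & -4 & 0 \\ -4 & 8 & 0 \\ 0 & 0 & 0 \end{bmatrix}. \]
The characteristic polynomial of $B^T B$ is
\[ \det(B^T B - \lambda I_3) = \det \begin{bmatrix} 2-\lambda & -4 & 0 \\ -4 & 8-\lambda & 0 \\ 0 & 0 & -\lambda \end{bmatrix} = -\lambda[(2-\lambda)(8-\lambda) - 16] = -\lambda(\lambda^2 - 10\lambda) = -\lambda^2(\lambda - 10). \]
So $B^T B$ has eigenvalues 10, 0, 0. The 10-eigenspace is
\[ \ker \begin{bmatrix} -8 & -4 & 0 \\ -4 & -2 & 0 \\ 0 & 0 & -10 \end{bmatrix} = \operatorname{span}\left\{ \begin{bmatrix} -1 \\ 2 \\ 0 \end{bmatrix} \right\}, \]
while the 0-eigenspace is
\[ \ker B^T B = \operatorname{span}\left\{ \begin{bmatrix} 2 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \right\}. \]
The two spanning vectors of the 0-eigenspace are already orthogonal to each other, so normalizing each of these vectors gives an orthonormal eigenbasis consisting of
\[ \vec{v}_1 = \frac{1}{\sqrt{5}} \begin{bmatrix} -1 \\ 2 \\ 0 \end{bmatrix}, \quad \vec{v}_2 = \frac{1}{\sqrt{5}} \begin{bmatrix} 2 \\ 1 \\ 0 \end{bmatrix}, \quad \vec{v}_3 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \]
with associated eigenvalues 10, 0, 0 respectively.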
From the Prep Video, we learned that these are exactly the first, second, and third principal
components of B.
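As a sanity check on this solution, the following sketch rebuilds $B^T B$ from $B$ and confirms numerically that each listed vector is a unit eigenvector with the stated eigenvalue. It uses only the Python standard library; the helper `matvec` is a hypothetical name introduced for the illustration.

```python
import math

def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

B = [[1, -2, 0],
     [-1, 2, 0]]
# Gram matrix B^T B: entry (i, j) is the dot product of columns i and j of B.
cols = list(zip(*B))
BtB = [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]

s = 1 / math.sqrt(5)
eigenpairs = [
    (10, [-s, 2 * s, 0.0]),
    (0, [2 * s, s, 0.0]),
    (0, [0.0, 0.0, 1.0]),
]

for lam, v in eigenpairs:
    Mv = matvec(BtB, v)
    # Check that B^T B v = lambda v and that v has unit length.
    assert all(math.isclose(a, lam * b, abs_tol=1e-12) for a, b in zip(Mv, v))
    assert math.isclose(sum(x * x for x in v), 1.0)
```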
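2. A $4 \times 3$ matrix $A$ with mean $\vec{0}$ has an approximate singular value decomposition
\[ A = \begin{bmatrix} 0.64 & -0.67 & -0.20 & -0.33 \\ 0.43 & 0.72 & -0.18 & -0.51 \\ -0.56 & -0.19 & 0.15 & -0.79 \\ -0.31 & -0.03 & -0.95 & 0.05 \end{bmatrix} \begin{bmatrix} 7.29 & 0 & 0 \\ 0 & 2.77 & 0 \\ 0 & 0 & 1.06 \\ 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} 0.74 & -0.56 & 0.37 \\ -0.23 & 0.31 & 0.92 \\ -0.63 & -0.77 & -0.10 \end{bmatrix}. \]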
(a) Find the eigenvalues of $A^T A$.
Solution. The eigenvalues of $A^T A$ are the squares of the singular values of $A$, so $\lambda = (7.29)^2, (2.77)^2, (1.06)^2$.
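For concreteness, the squares can be evaluated directly. This small sketch only does the arithmetic on the rounded singular values given in the problem, so the results are approximate in the same sense as the decomposition itself.

```python
# Diagonal entries of Sigma, as given (already rounded to two decimals).
singular_values = [7.29, 2.77, 1.06]

# Eigenvalues of A^T A are the squared singular values.
eigenvalues = [round(s ** 2, 4) for s in singular_values]
print(eigenvalues)  # roughly 53.14, 7.67 and 1.12
```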
© Harvard Math 21b
(b) What are the principal components of $A$?
Solution. The principal components of $A$ consist of an orthonormal eigenbasis for $A^T A$. This is exactly how we find $V$ for a singular value decomposition $A = U \Sigma V^T$. So the rows of the third matrix form the principal components of $A$, i.e.,
\[ \begin{bmatrix} 0.74 \\ -0.56 \\ 0.37 \end{bmatrix} \text{ is the first principal component, } \begin{bmatrix} -0.23 \\ 0.31 \\ 0.92 \end{bmatrix} \text{ the second, and } \begin{bmatrix} -0.63 \\ -0.77 \\ -0.10 \end{bmatrix} \text{ the third.} \]
(c) (Recap) What are the ways that we can find the principal components of $A$?
Solution. One method is to find an orthonormal eigenbasis for $A^T A$ directly, which requires us to find the eigenvalues and eigenspaces of $A^T A$. Alternatively, if a singular value decomposition $A = U \Sigma V^T$ is available, we can use the columns of $V$.
3. When using principal component analysis on a matrix $A$, we want to know how much of the variance in our data will be accounted for by taking different numbers of principal components. The total variance of an $n \times m$ matrix $A$ is $\frac{1}{n} \operatorname{tr}(A^T A)$.
Let $A$ be a $12 \times 7$ matrix so that $A^T A$ has eigenvalues $\lambda = 12, 9, 9, 4, 2, 0, 0$ and principal components $\vec{v}_1, \ldots, \vec{v}_7$.
(a) What is the total variance of $A$?
Solution. The trace of $A^T A$ is the sum of the eigenvalues of $A^T A$, i.e., $\operatorname{tr}(A^T A) = 12 + 9 + 9 + 4 + 2 + 0 + 0 = 36$. Then the total variance of $A$ is $\frac{36}{12} = 3$.
(b) What is the variance of $A$ in the direction $\vec{v}_1$? $\vec{v}_2$?
Solution. To find the variance of $A$ in the direction $\vec{v}_1$, we need to compute
\[ \frac{1}{12} \vec{v}_1^T A^T A \vec{v}_1 = \frac{1}{12} \vec{v}_1^T (12 \vec{v}_1) = \vec{v}_1^T \vec{v}_1 = 1. \]
For $\vec{v}_2$, the same computation gives $\frac{\lambda_2}{12}$. That is, the variance in the direction $\vec{v}_2$ is $\frac{9}{12} = \frac{3}{4}$.
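The computations in parts (a) and (b) can be reproduced with exact fractions. This is a minimal sketch that takes the eigenvalues and the row count from the problem statement as given; the variable names are illustrative.

```python
from fractions import Fraction

n = 12                                # number of rows of A
eigenvalues = [12, 9, 9, 4, 2, 0, 0]  # eigenvalues of A^T A, as given

# tr(A^T A) equals the sum of the eigenvalues of A^T A.
total_variance = Fraction(sum(eigenvalues), n)

# Variance along the k-th principal component is lambda_k / n.
directional = [Fraction(lam, n) for lam in eigenvalues]

print(total_variance)                  # 3
print(directional[0], directional[1])  # 1 3/4
```

The directional variances add up to the total variance, which is what makes the proportions in the next part meaningful.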
(c) Find the proportion of the total variance accounted for by each principal component.
Solution. By computing the ratio between the variance accounted for by $\vec{v}_1$ and the total variance of $A$, we have $\frac{1}{3}$. For $\vec{v}_2$ and $\vec{v}_3$ we obtain $\frac{1}{4}$ each. For $\vec{v}_4$, we get $\frac{4}{36} = \frac{1}{9}$; for $\vec{v}_5$, $\frac{2}{36} = \frac{1}{18}$; for $\vec{v}_6$ and $\vec{v}_7$, none.
(d) How many principal components of $A$ do you need to account for 50% of the variance of $A$? 90%?
Solution. We only need $\vec{v}_1$ and $\vec{v}_2$ to account for $\frac{1}{3} + \frac{1}{4} = \frac{7}{12}$ of the variance (which is greater than 50%). To get 90%, we need to include $\vec{v}_3$ and $\vec{v}_4$ as well: $\frac{12 + 9 + 9 + 4}{36} = \frac{34}{36}$, which is greater than 90% (but just including $\vec{v}_3$ wouldn't be enough).
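Parts (c) and (d) amount to a cumulative sum over the sorted eigenvalues. The sketch below is one way to automate that, using exact fractions; the helper `components_needed` is a hypothetical name, and the eigenvalues are assumed to be listed in decreasing order as in the problem.

```python
from fractions import Fraction
from itertools import accumulate

eigenvalues = [12, 9, 9, 4, 2, 0, 0]  # eigenvalues of A^T A, in decreasing order
total = sum(eigenvalues)

# Share of the total variance carried by each principal component.
proportions = [Fraction(lam, total) for lam in eigenvalues]
# Running total of those shares.
cumulative = list(accumulate(proportions))

def components_needed(threshold):
    """Smallest k whose first k components reach the threshold (hypothetical helper)."""
    for k, share in enumerate(cumulative, start=1):
        if share >= threshold:
            return k
    return None

print(components_needed(Fraction(1, 2)))   # 2
print(components_needed(Fraction(9, 10)))  # 4
```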
4. Let $A$ be an $n \times m$ matrix so that $A^T A$ has eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_m$ in decreasing order and $(\vec{v}_1, \ldots, \vec{v}_m)$ are the principal components of $A$. Generalize your findings from #3.
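(a) What is the variance of $A$ in the direction $\vec{v}_k$?
Solution. Since $\vec{v}_k$ is a unit vector, $\vec{v}_k^T \vec{v}_k = 1$ and so the variance of $A$ along $\vec{v}_k$ is
\[ \frac{1}{n} \vec{v}_k^T A^T A \vec{v}_k = \frac{1}{n} \lambda_k \vec{v}_k^T \vec{v}_k = \frac{\lambda_k}{n}. \]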
(b) What proportion of the variance of $A$ is accounted for by $\vec{v}_k$?
Solution. In general, the $k$th principal component of $A$ accounts for $\frac{\lambda_k}{\operatorname{tr}(A^T A)}$ of the total variance of $A$.
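As a consistency check on these two formulas, here is a short derivation that uses only the fact that the trace of $A^T A$ equals the sum of its eigenvalues, and assumes $A \neq 0$ so that the trace is nonzero. The variances along the principal components add up to the total variance, so the proportions add up to 1:

```latex
\[
\frac{1}{n}\operatorname{tr}(A^T A)
  = \frac{1}{n}\sum_{k=1}^{m} \lambda_k
  = \sum_{k=1}^{m} \frac{\lambda_k}{n},
\qquad
\sum_{k=1}^{m} \frac{\lambda_k}{\operatorname{tr}(A^T A)}
  = \frac{\lambda_1 + \cdots + \lambda_m}{\lambda_1 + \cdots + \lambda_m}
  = 1.
\]
```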
5. True or false. In either case, explain your reasoning.
(a) If a 10 × 3 matrix A has singular values 5, 3, 4, then the total variance of A is 1.2.
Solution. This is false. We can use the eigenvalues of $A^T A$ to determine the total variance, but the singular values are the square roots of these eigenvalues. So the total variance of $A$ would be $\frac{1}{10}(25 + 9 + 16) = 5$, not 1.2.
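The arithmetic can be checked in a couple of lines. The sketch also computes the value obtained by summing the singular values without squaring them, which is presumably where the 1.2 in the statement comes from; that reading is an assumption, not something the problem states.

```python
from fractions import Fraction

n = 10                       # number of rows of A
singular_values = [5, 3, 4]  # as given in the statement

# Eigenvalues of A^T A are the squared singular values.
total_variance = Fraction(sum(s * s for s in singular_values), n)
# The tempting mistake: summing the singular values themselves.
naive = Fraction(sum(singular_values), n)

print(total_variance)  # 5
print(naive)           # 6/5
```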
(b) If $\vec{v}_1$ is the first principal component of $A$, then the first column of $A$ accounts for the most
variance of A.
Solution. This is false. The first principal component of A is a vector in the domain of A and
indicates a relationship among the columns that gives the most variance.
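A concrete counterexample, offered only as an illustration, is the centered matrix $B$ from Problem 1: its first column carries less variance than its second, and the first principal component direction, which mixes the two columns, carries more than either. The helper `variance_along` is a hypothetical name; it normalizes the direction so that integer vectors can be passed in.

```python
from fractions import Fraction

BtB = [[2, -4, 0],
       [-4, 8, 0],
       [0, 0, 0]]  # B^T B from Problem 1
n = 2              # B has 2 rows

def variance_along(v):
    """(1/n) u^T B^T B u for the unit vector u = v / |v| (hypothetical helper)."""
    Mv = [sum(m * x for m, x in zip(row, v)) for row in BtB]
    norm_sq = sum(x * x for x in v)
    return Fraction(sum(a * b for a, b in zip(v, Mv)), n * norm_sq)

print(variance_along([1, 0, 0]))   # first column alone: 1
print(variance_along([0, 1, 0]))   # second column alone: 4
print(variance_along([-1, 2, 0]))  # first principal component direction: 5
```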
(c) If $A^T A$ has an eigenvalue of 0, then there is some direction along which the variance of $A$ is equal to 0.
Solution. This is true. Consider a unit eigenvector $\vec{v}$ for $A^T A$ with an eigenvalue of 0. Then the variance of $A$ along $\vec{v}$ is $\frac{1}{n} \vec{v}^T A^T A \vec{v} = 0$.
(d) If $A$ is $n \times n$ and has an eigenvalue of 0, then there is some direction along which the variance of $A$ is equal to 0.
Solution. This is true. Consider a unit eigenvector $\vec{v}$ for $A$ with eigenvalue 0. Then $A^T A \vec{v} = A^T(\vec{0}) = \vec{0}$ and so $\vec{v}$ is an eigenvector for $A^T A$ with eigenvalue 0. Now we can use the previous problem. (This can also be shown directly.)
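One way to carry out the direct argument, written as a short derivation with $\|\cdot\|$ denoting the usual Euclidean length: for a unit vector $\vec{v}$ with $A\vec{v} = \vec{0}$,

```latex
\[
\frac{1}{n}\,\vec{v}^T A^T A \vec{v}
  = \frac{1}{n}\,(A\vec{v})^T (A\vec{v})
  = \frac{1}{n}\,\|A\vec{v}\|^2
  = \frac{1}{n}\,\|\vec{0}\|^2
  = 0.
\]
```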