Principal Component Analysis: C Harvard Math 21b

Principal component analysis (PCA) is used to find the principal components of a matrix. The principal components are the eigenvectors of the covariance matrix and represent the directions of maximum variance in the data. PCA involves computing the eigenvalues and eigenvectors of the covariance matrix to determine the principal components and the proportion of variance explained by each component. The first few principal components typically account for most of the variance in the data and can be used for dimensionality reduction.

Uploaded by

Emily Liu
Copyright
© All Rights Reserved
Principal Component Analysis

 
1. We want to find the principal components associated to the matrix $A = \begin{bmatrix} 4 & 3 & 9 \\ 2 & 7 & 9 \end{bmatrix}$.

(a) Use $A$ to find a matrix $B$ with mean $\vec{0}$.


 
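Solution. The mean vector for $A$ is $\vec{m} = \begin{bmatrix} 3 \\ 5 \\ 9 \end{bmatrix}$, so $B = A - \begin{bmatrix} 3 & 5 & 9 \\ 3 & 5 & 9 \end{bmatrix} = \begin{bmatrix} 1 & -2 & 0 \\ -1 & 2 & 0 \end{bmatrix}$.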
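The centering step can be sketched in NumPy as below. This is only an illustration, assuming (as the worksheet does) that rows are observations and columns are variables; the variable names are chosen here and are not part of the worksheet.

```python
import numpy as np

# Rows are observations, columns are variables (the worksheet's convention).
A = np.array([[4.0, 3.0, 9.0],
              [2.0, 7.0, 9.0]])

m = A.mean(axis=0)   # mean of each column, i.e. the mean vector
B = A - m            # broadcasting subtracts m from every row of A

print(m)  # [3. 5. 9.]
print(B)
```

After this step every column of `B` averages to zero, which is what "mean $\vec{0}$" refers to.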

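(b) Find the principal components of $B$.

Solution. To find the principal components of $B$, we need to first find the eigenvalues of
$$B^T B = \begin{bmatrix} 2 & -4 & 0 \\ -4 & 8 & 0 \\ 0 & 0 & 0 \end{bmatrix}$$
and then use them to find an orthonormal eigenbasis for $B^T B$. The characteristic polynomial of $B^T B$ is
$$\det(B^T B - \lambda I_3) = \det \begin{bmatrix} 2-\lambda & -4 & 0 \\ -4 & 8-\lambda & 0 \\ 0 & 0 & -\lambda \end{bmatrix} = -\lambda[(2-\lambda)(8-\lambda) - 16] = -\lambda(\lambda^2 - 10\lambda) = -\lambda^2(\lambda - 10).$$
So $B^T B$ has eigenvalues $10, 0, 0$. The 10-eigenspace is
$$\ker \begin{bmatrix} -8 & -4 & 0 \\ -4 & -2 & 0 \\ 0 & 0 & -10 \end{bmatrix} = \operatorname{span}\left\{ \begin{bmatrix} -1 \\ 2 \\ 0 \end{bmatrix} \right\},$$
while the 0-eigenspace is
$$\ker B^T B = \operatorname{span}\left\{ \begin{bmatrix} 2 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \right\}.$$
These three vectors are already mutually orthogonal, so normalizing each of them gives an orthonormal eigenbasis consisting of
$$\vec{v}_1 = \frac{1}{\sqrt{5}} \begin{bmatrix} -1 \\ 2 \\ 0 \end{bmatrix}, \quad \vec{v}_2 = \frac{1}{\sqrt{5}} \begin{bmatrix} 2 \\ 1 \\ 0 \end{bmatrix}, \quad \vec{v}_3 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$$
with associated eigenvalues $10, 0, 0$ respectively.

From the Prep Video, we learned that these are exactly the first, second, and third principal components of $B$.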
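As a numerical cross-check of the hand computation, here is a minimal sketch using `numpy.linalg.eigh`, which handles symmetric matrices and returns eigenvalues in ascending order. Note two caveats: eigenvectors are only determined up to sign, and inside the repeated 0-eigenspace any orthonormal basis is valid, so the library's second and third vectors need not match the worksheet's $\vec{v}_2$ and $\vec{v}_3$.

```python
import numpy as np

B = np.array([[1.0, -2.0, 0.0],
              [-1.0, 2.0, 0.0]])

# eigh is meant for symmetric matrices such as B^T B; eigenvalues come back ascending.
eigvals, eigvecs = np.linalg.eigh(B.T @ B)

# Reorder so the largest eigenvalue (the first principal component) comes first.
order = np.argsort(eigvals)[::-1]
eigvals = eigvals[order]
eigvecs = eigvecs[:, order]   # columns are the eigenvectors

v1 = eigvecs[:, 0]
expected_v1 = np.array([-1.0, 2.0, 0.0]) / np.sqrt(5)

# |v1 . expected_v1| equals 1 when the two unit vectors agree up to sign.
overlap = abs(v1 @ expected_v1)
print(eigvals, overlap)
```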

2. A $4 \times 3$ matrix $A$ with mean $\vec{0}$ has an approximate singular value decomposition
$$A = \begin{bmatrix} 0.64 & -0.67 & -0.20 & -0.33 \\ 0.43 & 0.72 & -0.18 & -0.51 \\ -0.56 & -0.19 & 0.15 & -0.79 \\ -0.31 & -0.03 & -0.95 & 0.05 \end{bmatrix} \begin{bmatrix} 7.29 & 0 & 0 \\ 0 & 2.77 & 0 \\ 0 & 0 & 1.06 \\ 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} 0.74 & -0.56 & 0.37 \\ -0.23 & 0.31 & 0.92 \\ -0.63 & -0.77 & -0.10 \end{bmatrix}.$$

(a) Find the eigenvalues of $A^T A$.

Solution. The eigenvalues of $A^T A$ are the squares of the singular values of $A$, so $\lambda = (7.29)^2, (2.77)^2, (1.06)^2$.

© Harvard Math 21b
(b) What are the principal components of $A$?

Solution. The principal components of $A$ consist of an orthonormal eigenbasis for $A^T A$. This is exactly how we find $V$ for a singular value decomposition $A = U \Sigma V^T$. So the rows of the third matrix (which is $V^T$) form the principal components of $A$, i.e., $\begin{bmatrix} 0.74 \\ -0.56 \\ 0.37 \end{bmatrix}$ is the first principal component, $\begin{bmatrix} -0.23 \\ 0.31 \\ 0.92 \end{bmatrix}$ the second, and $\begin{bmatrix} -0.63 \\ -0.77 \\ -0.10 \end{bmatrix}$ the third.

(c) (Recap) What are the ways that we can find the principal components of $A$?

Solution. One method is by finding an orthonormal eigenbasis for $A^T A$ directly, which requires us to find the eigenvalues and eigenspaces for $A^T A$. Alternatively, if we have it, we can use the columns of $V$ in a singular value decomposition $A = U \Sigma V^T$.
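The two routes in the recap can be compared numerically. The sketch below uses a randomly generated, centered $4 \times 3$ matrix (an assumption made purely for illustration; it is not the matrix from problem 2) and checks that the eigenvectors of $A^T A$ agree with the columns of $V$ up to sign, and that the eigenvalues equal the squared singular values.

```python
import numpy as np

rng = np.random.default_rng(0)      # arbitrary seed, for reproducibility only
A = rng.normal(size=(4, 3))         # hypothetical data matrix
A = A - A.mean(axis=0)              # center so that A has mean 0

# Method 1: orthonormal eigenbasis of A^T A (eigh returns ascending order, so reverse).
eigvals, eigvecs = np.linalg.eigh(A.T @ A)
eigvals = eigvals[::-1]
eigvecs = eigvecs[:, ::-1]

# Method 2: columns of V from the singular value decomposition A = U Sigma V^T.
U, s, Vt = np.linalg.svd(A)
V = Vt.T

# Column-wise |dot product|; a value of 1 means the two unit vectors agree up to sign.
overlaps = np.abs(np.sum(eigvecs * V, axis=0))
print(overlaps)
print(eigvals, s ** 2)
```

The sign ambiguity is expected: if $\vec{v}$ is a unit eigenvector, so is $-\vec{v}$.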

3. When using principal component analysis on a matrix $A$, we want to know how much of the variance in our data will be accounted for by taking different numbers of principal components. The total variance of an $n \times m$ matrix $A$ is $\frac{1}{n} \operatorname{tr}(A^T A)$.

Let $A$ be a $12 \times 7$ matrix so that $A^T A$ has eigenvalues $\lambda = 12, 9, 9, 4, 2, 0, 0$ and principal components $\vec{v}_1, \dots, \vec{v}_7$.

(a) What is the total variance of $A$?

Solution. The trace of $A^T A$ is the sum of the eigenvalues of $A^T A$, i.e., $\operatorname{tr}(A^T A) = 12 + 9 + 9 + 4 + 2 + 0 + 0 = 36$. Then the total variance of $A$ is $\frac{36}{12} = 3$.

(b) What is the variance of $A$ in the direction $\vec{v}_1$? $\vec{v}_2$?

Solution. To find the variance of $A$ in the direction $\vec{v}_1$, we need to compute
$$\frac{1}{12} \vec{v}_1^T A^T A \vec{v}_1 = \frac{1}{12} \vec{v}_1^T (12 \vec{v}_1) = \vec{v}_1^T \vec{v}_1 = 1.$$
For $\vec{v}_2$, we notice that we will simply obtain $\frac{\lambda_2}{12}$ from this computation. That is, the variance in the direction $\vec{v}_2$ is $\frac{9}{12} = \frac{3}{4}$.

(c) Find the proportion of the total variance accounted for by each principal component.

Solution. By computing the ratio between the variance accounted for by $\vec{v}_1$ and the total variance of $A$, we have $\frac{1}{3}$. For $\vec{v}_2$ and $\vec{v}_3$ we obtain $\frac{1}{4}$ each. For $\vec{v}_4$, we get $\frac{4}{36} = \frac{1}{9}$; for $\vec{v}_5$, $\frac{2}{36} = \frac{1}{18}$; for $\vec{v}_6$ and $\vec{v}_7$, none.

(d) How many principal components of $A$ do you need to account for 50% of the variance of $A$? 90%?

Solution. We only need $\vec{v}_1$ and $\vec{v}_2$ to account for $\frac{1}{3} + \frac{1}{4} = \frac{7}{12}$ of the variance (which is greater than 50%). To get 90%, we need to include $\vec{v}_3$ and $\vec{v}_4$ as well: $\frac{12 + 9 + 9 + 4}{36} = \frac{34}{36}$, which is greater than 90% (but just including $\vec{v}_3$ wouldn't be enough, since $\frac{30}{36} \approx 83\%$).
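The arithmetic in problem 3 can be reproduced with exact fractions. The following is a small sketch using only the Python standard library; the helper name `components_needed` is made up for this illustration.

```python
from fractions import Fraction
from itertools import accumulate

eigenvalues = [12, 9, 9, 4, 2, 0, 0]   # eigenvalues of A^T A, in decreasing order
n = 12                                  # number of rows of A

total = sum(eigenvalues)                # tr(A^T A) = 36
total_variance = Fraction(total, n)     # (1/n) tr(A^T A) = 3
proportions = [Fraction(lam, total) for lam in eigenvalues]
cumulative = list(accumulate(proportions))   # running share of the variance


def components_needed(threshold):
    """Smallest k such that the first k components reach the given share."""
    for k, share in enumerate(cumulative, start=1):
        if share >= threshold:
            return k
    return len(cumulative)


print(components_needed(Fraction(1, 2)))    # 2
print(components_needed(Fraction(9, 10)))   # 4
```

Using `Fraction` rather than floats keeps values such as $\frac{7}{12}$ and $\frac{34}{36}$ exact, so the threshold comparisons carry no rounding error.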

4. Let $A$ be an $n \times m$ matrix so that $A^T A$ has eigenvalues $\lambda_1, \lambda_2, \dots, \lambda_m$ in decreasing order and $(\vec{v}_1, \dots, \vec{v}_m)$ are the principal components of $A$. Generalize your findings from #3.

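(a) What is the variance of $A$ in the direction $\vec{v}_k$?

Solution. Since $\vec{v}_k$ is a unit vector, $\vec{v}_k^T \vec{v}_k = 1$ and so the variance of $A$ along $\vec{v}_k$ is
$$\frac{1}{n} \vec{v}_k^T A^T A \vec{v}_k = \frac{1}{n} \lambda_k \vec{v}_k^T \vec{v}_k = \frac{\lambda_k}{n}.$$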

(b) What proportion of the variance of $A$ is accounted for by $\vec{v}_k$?

Solution. In general, the $k$th principal component of $A$ accounts for $\frac{\lambda_k}{\operatorname{tr}(A^T A)}$ of the total variance of $A$.
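The general formulas can be checked numerically. This sketch assumes a randomly generated, centered $12 \times 7$ matrix (not the matrix of problem 3, whose entries are not given) and compares the variance along each $\vec{v}_k$, computed directly as $\frac{1}{n} \lVert A \vec{v}_k \rVert^2$, with $\frac{\lambda_k}{n}$.

```python
import numpy as np

rng = np.random.default_rng(1)      # arbitrary seed, for reproducibility only
A = rng.normal(size=(12, 7))        # hypothetical data matrix
A = A - A.mean(axis=0)              # center the columns
n = A.shape[0]

# Principal components: orthonormal eigenvectors of A^T A, largest eigenvalue first.
eigvals, V = np.linalg.eigh(A.T @ A)
eigvals, V = eigvals[::-1], V[:, ::-1]

# Direct computation: (1/n) v_k^T A^T A v_k = (1/n) ||A v_k||^2 for each column v_k.
direct = np.sum((A @ V) ** 2, axis=0) / n
from_eigs = eigvals / n

# Proportion of the total variance explained by each component.
ratios = eigvals / np.trace(A.T @ A)
print(direct)
print(from_eigs)
print(ratios)
```

The proportions sum to 1 because the trace of $A^T A$ equals the sum of its eigenvalues.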

5. True or false. In either case, explain your reasoning.

(a) If a $10 \times 3$ matrix $A$ has singular values 5, 3, 4, then the total variance of $A$ is 1.2.

Solution. This is false. We can use the eigenvalues of $A^T A$ to determine the total variance, but the singular values are the square roots of these eigenvalues. So the total variance of $A$ would be $\frac{1}{10}(25 + 9 + 16) = 5$, not 1.2.
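A short sketch of the two computations, showing where the incorrect value 1.2 comes from (summing the singular values instead of their squares); the variable names are illustrative only.

```python
singular_values = [5, 3, 4]
n = 10   # number of rows of A

# Correct: eigenvalues of A^T A are the squared singular values.
total_variance = sum(s ** 2 for s in singular_values) / n

# Mistaken: summing the singular values themselves gives the claimed 1.2.
naive = sum(singular_values) / n

print(total_variance)  # 5.0
print(naive)           # 1.2
```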

(b) If $\vec{v}_1$ is the first principal component of $A$, then the first column of $A$ accounts for the most variance of $A$.

Solution. This is false. The first principal component of $A$ is a vector in the domain of $A$, not a column of $A$; it indicates a relationship among the columns that gives the most variance.

(c) If $A^T A$ has an eigenvalue of 0, then there is some direction along which the variance of $A$ is equal to 0.

Solution. This is true. Consider a unit eigenvector $\vec{v}$ for $A^T A$ with an eigenvalue of 0. Then the variance of $A$ along $\vec{v}$ is $\frac{1}{n} \vec{v}^T A^T A \vec{v} = 0$.

(d) If $A$ is $n \times n$ and has an eigenvalue of 0, then there is some direction along which the variance of $A$ is equal to 0.

Solution. This is true. Consider a unit eigenvector $\vec{v}$ for $A$ with eigenvalue 0. Then $A^T A \vec{v} = A^T(\vec{0}) = \vec{0}$ and so $\vec{v}$ is an eigenvector for $A^T A$ with eigenvalue 0. Now we can use the previous part. (This can also be shown directly.)
