Computing Eigenvalues and/or Eigenvectors;
Part 2: The Power Method and the QR-algorithm
Tom Lyche
Centre of Mathematics for Applications,
Department of Informatics,
University of Oslo
November 16, 2008
Today
I The power method to find the dominant eigenvector
I The shifted power method to speed up convergence
I The inverse power method
I The Rayleigh quotient iteration
I The QR-algorithm
The Power Method
I Find the eigenvector corresponding to the dominant
(largest in absolute value) eigenvalue.
I With a simple modification we can also find the
corresponding eigenvalue.
Assumptions
I Let A ∈ Cn,n have eigenpairs (λj , vj ), j = 1, . . . , n.
I Given z0 ∈ Cn we assume that
(i) |λ1 | > |λ2 | ≥ |λ3 | ≥ · · · ≥ |λn |,
(ii) A has linearly independent eigenvectors,
(iii) z0 = c1 v1 + · · · + cn vn with c1 ≠ 0.
I The first assumption means that A has a dominant
eigenvalue λ1 of algebraic multiplicity one.
I The second assumption is not necessary. It is included to
simplify the analysis.
I The third assumption says that z0^T v1 ≠ 0, i.e., z0 has a
component in the direction v1.
Powers
I Given A ∈ Cn,n and a vector z0 ∈ Cn, assume that (i), (ii),
(iii) hold.
I Define a sequence {zk } of vectors in Cn by
zk := A^k z0 = A zk−1, k = 1, 2, . . . .
I Assume z0 = c1 v1 + c2 v2 + · · · + cn vn .
I Then
zk = c1 λ1^k v1 + c2 λ2^k v2 + · · · + cn λn^k vn, k = 0, 1, 2, . . . .
I zk/λ1^k = c1 v1 + c2 (λ2/λ1)^k v2 + · · · + cn (λn/λ1)^k vn.
I zk/λ1^k → c1 v1 as k → ∞, since |λj/λ1| < 1 for j ≥ 2.
The Power method
We need to normalize the vectors zk to avoid overflow or underflow.
I Choose a norm on Cn, set x0 = z0/‖z0‖, and generate
unit vectors {xk} for k = 1, 2, . . . as follows:
(i) yk = A xk−1,
(ii) xk = yk/‖yk‖.    (1)
Example
I A = [1 2; 3 4], z0 = [1.0, 1.0]^T, x0 = [0.707, 0.707]^T.
I x1 = [0.39, 0.92], x2 = [0.4175, 0.9087],
x3 = [0.4159, 0.9094], . . .
I converges to an eigenvector of A
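The iterates above can be reproduced with a short script. This is a sketch in Python/NumPy rather than the slides' MATLAB; the function name power_method is ours.

```python
import numpy as np

def power_method(A, z0, K):
    """Return the list of normalized power-method iterates x1, ..., xK."""
    x = z0 / np.linalg.norm(z0)        # x0 = z0/||z0||
    iterates = []
    for _ in range(K):
        y = A @ x                      # (i)  yk = A x(k-1)
        x = y / np.linalg.norm(y)      # (ii) xk = yk/||yk||
        iterates.append(x.copy())
    return iterates

A = np.array([[1.0, 2.0], [3.0, 4.0]])
xs = power_method(A, np.array([1.0, 1.0]), 3)
# xs[0], xs[1], xs[2] match x1, x2, x3 above to the printed digits
```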
Convergence
I The way {xk} converges to an eigenvector will depend on
the sign of λ1. Suppose xk = K v1 for some K ≠ 0.
I Then Axk = λ1 xk and xk+1 = Axk/‖Axk‖ = (λ1/|λ1|) xk.
I Thus xk+1 = −xk if λ1 < 0.
Estimating the eigenvalue
I Suppose we know an approximate eigenvector of
A ∈ Cn,n . How should we estimate the corresponding
eigenvalue?
I If (λ, u) is an exact eigenpair then Au − λu = 0.
I If u is an approximate eigenvector we can minimize the
function ρ : C → R given by
ρ(µ) := ‖Au − µu‖2.
Theorem
ρ is minimized when µ = ν := u^H Au / (u^H u), the Rayleigh
quotient of u.
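The theorem can be checked numerically: for a fixed u, ρ evaluated at the Rayleigh quotient ν is no larger than ρ at nearby trial values of µ. A Python/NumPy sketch with an arbitrary random A and real u (both our choices):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))    # arbitrary test matrix
u = rng.standard_normal(4)         # arbitrary "approximate eigenvector"

nu = (u @ A @ u) / (u @ u)         # Rayleigh quotient of u (real case)
def rho(mu):
    return np.linalg.norm(A @ u - mu * u)

trial = nu + np.linspace(-1.0, 1.0, 201)   # trial values of mu around nu
# rho(nu) <= rho(mu) for every trial mu
```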
Power with Rayleigh
function [l,x,it]=powerit(A,z,K,tol)
% POWERIT Power method with Rayleigh quotient eigenvalue estimate.
% Returns it=K+1 if there is no convergence within K iterations.
af=norm(A,'fro'); x=z/norm(z);
for k=1:K
  y=A*x; l=x'*y;             % l is the Rayleigh quotient of x
  if norm(y-l*x)/af < tol    % relative residual test
    it=k; x=y/norm(y); return
  end
  x=y/norm(y);
end
it=K+1;
The shifted power method
I A variant of the power method is the shifted power
method.
I In this method we choose a number s and apply the
power method to the matrix A − sI.
I The number s is called a shift since it shifts an eigenvalue
λ of A to λ − s of A − sI.
I Sometimes the convergence can be faster if the shift is
chosen intelligently.
Example
I A1 := [1 2; 3 4], A2 := [1.7 −0.4; 0.15 2.2], and A3 := [1 2; −3 4].
I Start with a random vector and tol = 10^−6.
I Get convergence in 7 iterations for A1 , 174 iterations for
A2 and no convergence for A3 .
I A3 has a complex conjugate pair of eigenvalues, so
|λ1| = |λ2| and assumption (i) is not satisfied.
I Rate of convergence depends on r = |λ2 /λ1 |. Faster
convergence for smaller r .
I We have r ≈ 0.07 for A1 and r = 0.95 for A2 .
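The effect of a shift on A2 can be illustrated directly. A2 has eigenvalues 2.0 and 1.9, so r = 0.95; with the shift s = 1.8 (our illustrative choice, not from the slides) the eigenvalues of A2 − sI become 0.2 and 0.1, so r drops to 0.5. A Python/NumPy sketch mirroring the powerit stopping test:

```python
import numpy as np

def power_iterations(A, z0, tol=1e-6, maxit=1000):
    """Number of power-method steps until the relative residual test passes."""
    af = np.linalg.norm(A, 'fro')
    x = z0 / np.linalg.norm(z0)
    for k in range(1, maxit + 1):
        y = A @ x
        l = x @ y                               # Rayleigh quotient of x
        if np.linalg.norm(y - l * x) / af < tol:
            return k
        x = y / np.linalg.norm(y)
    return maxit + 1                            # no convergence

A2 = np.array([[1.7, -0.4], [0.15, 2.2]])
z0 = np.array([1.0, 0.3])                       # arbitrary fixed start vector
unshifted = power_iterations(A2, z0)
shifted = power_iterations(A2 - 1.8 * np.eye(2), z0)
# shifted needs far fewer iterations than unshifted
```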
The inverse power method
I We apply the power method to the inverse matrix
(A − sI)−1 , where s is a shift.
I If A has eigenvalues λ1 , . . . , λn in no particular order then
(A − sI)−1 has eigenvalues
µ1 (s) = (λ1 −s)−1 , µ2 (s) = (λ2 −s)−1 , . . . , µn (s) = (λn −s)−1 .
I Suppose λ1 is a simple eigenvalue of A.
I Then lim_{s→λ1} |µ1(s)| = ∞, while
lim_{s→λ1} µj(s) = (λj − λ1)^−1 < ∞ for j = 2, . . . , n.
I Hence, by choosing s sufficiently close to λ1 the inverse
power method will converge to that eigenvalue.
For the inverse power method (1) is replaced by
(i) (A − sI)yk = xk−1,
(ii) xk = yk/‖yk‖.    (2)
Note that we solve the linear system rather than computing
the inverse matrix. Normally the PLU-factorization of A − sI
is pre-computed in order to speed up the iteration.
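A sketch of (2) in Python/NumPy. For brevity it calls np.linalg.solve each step; in practice one would pre-compute the PLU factorization of A − sI once, as noted above. The shift s = −0.4 targets the eigenvalue (5 − √33)/2 ≈ −0.37 of the 2 × 2 example matrix used earlier.

```python
import numpy as np

def inverse_power(A, s, x0, tol=1e-10, maxit=100):
    """Inverse power method with shift s; returns (eigenvalue, eigenvector)."""
    M = A - s * np.eye(A.shape[0])
    x = x0 / np.linalg.norm(x0)
    lam = s
    for _ in range(maxit):
        y = np.linalg.solve(M, x)          # (i)  (A - sI) yk = x(k-1)
        x = y / np.linalg.norm(y)          # (ii) xk = yk/||yk||
        lam = x @ A @ x                    # Rayleigh quotient estimate
        if np.linalg.norm(A @ x - lam * x) < tol:
            break
    return lam, x

A = np.array([[1.0, 2.0], [3.0, 4.0]])
lam, x = inverse_power(A, -0.4, np.array([1.0, 1.0]))
# lam approaches (5 - sqrt(33))/2
```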
Rayleigh quotient iteration
We can combine inverse power with Rayleigh quotient
calculation.
(i) (A − sk−1 I)yk = xk−1,
(ii) xk = yk/‖yk‖,
(iii) sk = xk^H A xk,
(iv) rk = Axk − sk xk.
I We can avoid the calculation of Axk in (iii) and (iv).
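A sketch of steps (i)–(iv) in Python/NumPy; the starting shift s0 = 0 is an illustrative choice of ours:

```python
import numpy as np

def rqi(A, x0, s0=0.0, tol=1e-12, maxit=10):
    """Rayleigh quotient iteration; the shift is updated every step."""
    n = A.shape[0]
    x = x0 / np.linalg.norm(x0)
    s = s0
    for _ in range(maxit):
        y = np.linalg.solve(A - s * np.eye(n), x)   # (i)
        x = y / np.linalg.norm(y)                   # (ii)
        s = x @ A @ x                               # (iii) new shift
        r = A @ x - s * x                           # (iv)  residual
        if np.linalg.norm(r) < tol:
            break
    return s, x

A = np.array([[1.0, 2.0], [3.0, 4.0]])
s, x = rqi(A, np.array([1.0, 1.0]))
# converges to the eigenvalue (5 - sqrt(33))/2 in a few steps
```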
Example
I A1 := [1 2; 3 4].
I λ = (5 − √33)/2 ≈ −0.37.
I Start with x = [1, 1]^T.

k         1          2          3          4          5
‖rk‖2     1.0e+00    7.7e−02    1.6e−04    8.2e−10    2.0e−20
sk − λ    3.7e−01    −1.2e−02   −2.9e−05   −1.4e−10   −2.2e−16

Table: Quadratic convergence of Rayleigh quotient iteration
Problem with singularity?
I The linear system in (i) becomes closer and closer to
singular as sk converges to the eigenvalue.
I Thus the system becomes more and more ill-conditioned
and we can expect large errors in the computed yk .
I This is indeed true, but we are lucky.
I Most of the error occurs in the direction of the
eigenvector, and this error disappears when we normalize
yk in (ii).
I Miraculously, the normalized eigenvector will be quite
accurate.
Discussion
I Since the shift changes from iteration to iteration, the
computation of yk will require O(n^3) flops for a full matrix.
I For such a matrix it might pay to reduce it to upper
Hessenberg or tridiagonal form before starting the
iteration.
I However, if we have a good approximation to an eigenpair
then only a few iterations are necessary to obtain close to
machine accuracy.
I If Rayleigh quotient iteration converges the convergence
will be quadratic and sometimes even cubic.
The QR-algorithm
I An iterative method to compute all eigenvalues and
eigenvectors of a matrix A ∈ Cn,n .
I The matrix is reduced to triangular form by a sequence of
unitary similarity transformations computed from the
QR-factorization of A.
I Recall that for a square matrix the QR-factorization and
the QR-decomposition are the same.
I If A = QR is a QR-factorization then Q ∈ Cn,n is unitary,
Q^H Q = I, and R ∈ Cn,n is upper triangular.
Basic QR
A1 = A
for k = 1, 2, . . .
Qk Rk = Ak (QR-factorization of Ak ) (3)
Ak+1 = Rk Qk .
end
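A Python/NumPy sketch of (3), applied below to the 2 × 2 matrix of the next example:

```python
import numpy as np

def basic_qr(A, iters):
    """Basic QR iteration: Ak+1 = Rk Qk, where Ak = Qk Rk."""
    Ak = A.copy()
    for _ in range(iters):
        Q, R = np.linalg.qr(Ak)
        Ak = R @ Q
    return Ak

A = np.array([[2.0, 1.0], [1.0, 2.0]])
A9 = basic_qr(A, 8)          # A1 = A, so 8 steps give A9
# the diagonal of A9 approaches the eigenvalues 3 and 1
```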
Example
I A1 = A = [2 1; 1 2].
I A1 = Q1 R1 = (1/√5)[−2 −1; −1 2] ∗ (1/√5)[−5 −4; 0 3].
I A2 = R1 Q1 = (1/5)[−5 −4; 0 3] ∗ [−2 −1; −1 2] = [2.8 −0.6; −0.6 1.2].
I A3 = [2.997 −0.074; −0.074 1.0027], A9 = [3.0000 −0.0001; −0.0001 1.0000].
I A9 is almost diagonal and contains the eigenvalues λ1 = 3
and λ2 = 1 on the diagonal.
Example 2
A1 = A =
[0.9501 0.8913 0.8214 0.9218;
 0.2311 0.7621 0.4447 0.7382;
 0.6068 0.4565 0.6154 0.1763;
 0.4860 0.0185 0.7919 0.4057]
we obtain
A14 =
[2.323     0.047223   −0.39232   −0.65056;
 −2.1e−10  0.13029    0.36125    0.15946;
 −4.1e−10  −0.58622   0.052576   −0.25774;
 1.2e−14   3.3e−05    −1.1e−05   0.22746].
A14 is close to quasi-triangular.
I The 1 × 1 blocks give us two real eigenvalues λ1 ≈ 2.323
and λ4 ≈ 0.2275.
I The middle 2 × 2 block has complex eigenvalues resulting
in λ2 ≈ 0.0914 + 0.4586i and λ3 ≈ 0.0914 − 0.4586i.
I From Gerschgorin’s circle theorem it follows that the
approximations to the real eigenvalues are quite accurate.
I We would also expect the complex eigenvalues to have
small absolute errors.
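As a quick check (a Python/NumPy sketch), the eigenvalues of the middle 2 × 2 block of A14 (rows and columns 2 and 3) reproduce the complex pair quoted above:

```python
import numpy as np

# middle 2x2 block of A14 (rows/columns 2 and 3)
B = np.array([[0.13029, 0.36125],
              [-0.58622, 0.052576]])
eigs = np.linalg.eigvals(B)
# eigenvalues are approximately 0.0914 +/- 0.4586i
```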
Why QR works
I Since Qk^H Ak = Rk we obtain
Ak+1 = Rk Qk = Qk^H Ak Qk = Q̃k^H A Q̃k, where Q̃k = Q1 · · · Qk.    (4)
I Thus Ak+1 is similar to Ak and hence to A.
I It combines both the power method and the Rayleigh
quotient iteration.
I If A ∈ Rn,n has real eigenvalues, then under fairly general
conditions, the sequence {Ak } converges to an upper
triangular matrix, the Schur form.
I If A is real, but with some complex eigenvalues, then the
convergence will be to the quasi-triangular Schur form.
The Shifted QR-algorithms
As in the inverse power method, it is possible to speed up the
convergence by introducing shifts. The explicitly shifted
QR-algorithm works as follows:
A1 = A
for k = 1, 2, . . .
Choose a shift sk
Qk Rk = Ak − sk I (QR-factorization of Ak − sk I) (5)
Ak+1 = Rk Qk + sk I.
end
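A Python/NumPy sketch of (5) with the Rayleigh quotient shift sk = (Ak)nn, run on an arbitrary symmetric tridiagonal test matrix of ours; deflation (remark 6 below) is omitted to keep the sketch short:

```python
import numpy as np

def shifted_qr(A, iters):
    """Explicitly shifted QR with the Rayleigh quotient shift."""
    n = A.shape[0]
    Ak = A.copy()
    for _ in range(iters):
        s = Ak[-1, -1]                         # Rayleigh quotient shift
        Q, R = np.linalg.qr(Ak - s * np.eye(n))
        Ak = R @ Q + s * np.eye(n)
    return Ak

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])               # symmetric tridiagonal
Ak = shifted_qr(A, 60)
# Ak is nearly diagonal; its diagonal holds the eigenvalues of A
```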
Remarks
1. Ak+1 = Qk^H Ak Qk is unitarily similar to Ak.
2. Before applying this algorithm we reduce A to upper
Hessenberg form.
3. If A is upper Hessenberg then all matrices {Ak }k≥1 will
be upper Hessenberg.
4. Why? Because Ak+1 = Rk Ak Rk^−1, and a product of two
upper triangular matrices and an upper Hessenberg
matrix is upper Hessenberg.
5. Givens rotations are used to compute the QR-factorization
of Ak − sk I.
More remarks
6. If ai+1,i = 0 we can split the eigenvalue
problem into two smaller problems. We do this
splitting if some ai+1,i becomes sufficiently small.
7. sk := en^T Ak en, the (n, n) entry of Ak, is called the
Rayleigh quotient shift.
8. The Wilkinson shift is the eigenvalue of the lower
right 2 × 2 block of Ak closest to its (n, n) entry.
9. Convergence is at least quadratic for both kinds
of shifts.
10. In the implicitly shifted QR-algorithm we combine
two shifts at a time. This makes it possible to find
both real and complex eigenvalues of a real matrix
without using complex arithmetic.
More remarks
11. To find the eigenvectors we first find the
eigenvectors of the triangular or quasi-triangular
matrix. We then compute the eigenvectors of the
upper Hessenberg matrix and finally the
eigenvectors of A.
12. Normally we find all eigenvalues in O(n^3) flops.
This is because we need O(n^3) flops to find the
upper Hessenberg form, each iteration requires
O(n^2) flops if Ak is upper Hessenberg and O(n)
flops if Ak is tridiagonal, and we normally need
O(n) iterations.