0% found this document useful (0 votes)

84 views18 pages

Solving Large Linear Systems

This document discusses solving large sparse linear systems that arise from finite element and finite difference methods. It addresses three key questions: how to number nodes/equations, how to use the sparse structure of the matrix, and whether to use direct elimination or iterative methods. For 2D problems, direct elimination methods that exploit the minimum degree ordering of nodes are discussed. This reordering aims to minimize fill-in during factorization. For 3D problems, iterative methods are preferable due to the prohibitively large storage and computation requirements of direct methods. The Kronecker product is introduced as an efficient way to construct the matrices for 2D and 3D problems from the 1D case.

Uploaded by

olivier_b_cazal

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

84 views18 pages

Solving Large Linear Systems

Uploaded by

olivier_b_cazal

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

CHAPTER 3.

BOUNDARY VALUE PROBLEMS

192

3.6

Solving Large Linear Systems

Finite elements and nite dierences produce large linear systems KU = F . The
matrices K are extremely sparse. They have only a small number of nonzero entries in
a typical row. In physical space those nonzeros are clustered tightly togetherthey
come from neighboring nodes and meshpoints. But we cannot number N 2 nodes in a
plane in any way that keeps neighbors close together! So in 2-dimensional problems,
and even more in 3-dimensional problems, we meet three questions right away:
1. How best to number the nodes
2. How to use the sparseness of K (when nonzeros can be widely separated)
3. Whether to choose direct elimination or an iterative method.
That last point will split this section into two partselimination methods in 2D
(where node order is important) and iterative methods in 3D (where preconditioning
is crucial).
To x ideas, we will create the n equations KU = F from Laplaces dierence
equation in an interval, a square, and a cube. With N unknowns in each direction,
K has order n = N or N 2 or N 3 . There are 3 or 5 or 7 nonzeros in a typical row of
the matrix. Second dierences in 1D, 2D, and 3D are shown in Figure 3.17.

Tridiagonal K

4
1

2
N by N

1
1

Block
Tridiagonal
K
N 2 by N 2

N 3 by N 3

1 1

Figure 3.17: 3, 5, 7 point dierence molecules for uxx , uxx uyy , uxx uyy uzz .

Along a typical row of the matrix, the entries add to zero. In two dimensions this
is 4 1 1 1 1 = 0. This zero sum remains true for nite elements (the element
shapes decide the exact numerical entries). It reects the fact that u = 1 solves
Laplaces equation and Ui = 1 has dierences equal to zero. The constant vector
solves KU = 0 except near the boundaries. When a neighbor is a boundary point
where Ui is known, its value moves onto the right side of KU = F . Then that row of
K is not zero sum. Otherwise K would be singular, if K ones(n, 1) = zeros(n, 1).
Using block matrix notation, we can create the 2D matrix K = K2D from the
familiar N by N second dierence matrix K. We number the nodes of the square a

3.6. SOLVING LARGE LINEAR SYSTEMS

193

row at a time (this natural numbering is not necessarily best). Then the 1s for
the neighbor above and the neighbor below are N positions away from the main
diagonal of K2D . The 2D matrix is block tridiagonal with tridiagonal blocks:

2 1
K + 2I
I

1
I
2 1

K + 2I I

K=
=

K
(1)
2D

1 2
I K + 2I
Size N
Time N

Elimination in this order: K2D has size n = N 2

Bandwidth w = N, Space nw = N 3 , Time nw 2 = N 4

The matrix K2D has 4s down the main diagonal. Its bandwidth w = N is the
distance from the diagonal to the nonzeros in I. Many of the spaces in between are
lled during elimination! Then the storage space required for the factors in K = LU
is of order nw = N 3 . The time is proportional to nw 2 = N 4 , when n rows each
contain w nonzeros, and w nonzeros below the pivot require elimination.
Those counts are not impossibly large in many practical 2D problems (and we
show how they can be reduced). The horrifying large counts come for K3D in three
dimensions. Suppose the 3D grid is numbered by square cross-sections in the natural
order 1, . . . , N. Then K3D has blocks of order N 2 from those squares. Each square
is numbered as above to produce blocks coming from K2D and I = I2D :

Size n = N 3
I
K2D + 2I
Bandwidth w
= N 2

I
K2D + 2I I

K3D =

Elimination space nw = N 5

Elimination time nw 2 = N 7
I K2D + 2I
Now the main diagonal contains 6s, and inside rows have six 1s. Next to a point
or edge or corner of the boundary cube, we lose one or two or three of those 1s.
The good way to create K2D from K and I (N by N) is to use the kron(A, B)
command. This Kronecker product replaces each entry aij by the block aij B. To take
second dierences in all rows at the same time, and then all columns, use kron:
K2D = kron(K, I) + kron(I, K) .

(2)

The identity matrix in two dimensions is I2D = kron(I, I). This adjusts to allow
rectangles, with Is of dierent sizes, and in three dimensions to allow boxes. For a
cube we take second dierences inside all planes and also in the z-direction:
K3D = kron(K2D , I) + kron(I2D , K) .
Having set up these special matrices K2D and K3D , we have to say that there are
special ways to work with them. The x, y, z directions are separable. The geometry
(a box) is also separable. See Section 7.2 on Fast Poisson Solvers. Here the matrices
K and K2D and K3D are serving as models of the type of matrices that we meet.

CHAPTER 3. BOUNDARY VALUE PROBLEMS

194

Minimum Degree Algorithm

We now describe (a little roughly) a useful reordering of the nodes and the equations
in K2D U = F. The ordering achieves minimum degree at each stepthe number
of nonzeros below the pivot row is minimized. This is essentially the algorithm
used in MATLABs command U = K\F , when K has been dened as a sparse matrix.
We list some of the functions from the sparfun directory:
speye (sparse identity I)
nnz (number of nonzero entries)
nd
(nd indices of nonzeros)
spy (visualize sparsity pattern)
colamd and symamd (approximate minimum degree permutation of K)
You can test and use the minimum degree algorithms without a careful analysis. The
approximations are faster than the exact minimum degree permutations colmmd and
symmmd. The speed (in two dimensions) and the roundo errors are quite reasonable.
In the Laplace examples, the minimum degree ordering of nodes is irregular compared to a row at a time. The nal bandwidth is probably not decreased. But the
nonzero entries are postponed as long as possible! That is the key.
The dierence is shown in the arrow matrix of Figure 3.18. On the left, minimum
degree (one nonzero o the diagonal) leads to large bandwidth. But there is no ll-in.
Elimination will only change its last row and column. The triangular factors L and
U have all the same zeros as A. The space for storage stays at 3n, and elimination
needs only n divisions and multiplications and subtractions.

Bandwidth

6 and 3

Fill-in

0 and 6

F F

Figure 3.18: Arrow matrix: Minimum degree (no F) against minimum bandwidth.

The second ordering reduces the bandwidth from 6 to 3. But when row 4 is
reached as the pivot row, the entries indicated by F are lled in. That full lower
quarter of A gives
18 n
2 nonzeros to both factors L and U . You see that the whole
prole of the matrix decides the ll-in, not just the bandwidth.
The minimum degree algorithm chooses the (k + 1)st pivot column, after k
columns have been eliminated as usual below the diagonal, by the following rule:
In the remaining matrix of size n k, select the column with the fewest nonzeros.

3.6. SOLVING LARGE LINEAR SYSTEMS

195

The component of U corresponding to that column is renumbered k + 1. So is the

node in the nite dierence grid. Of course elimination in that column will normally
produce new nonzeros in the remaining columns! Some ll-in is unavoidable. So
the algorithm must keep track of the new positions of nonzeros, and also the actual
entries. It is the positions that decide the ordering of unknowns. Then the entries
decide the numbers in L and U .
Example
Figure 3.19 shows a small example of the minimal degree ordering, for
Laplaces 5-point scheme. The node connections produce nonzero entries (indicated by
) in K. The problem has six unknowns. K has two 3 by 3 tridiagonal blocks from
horizontal links, and two 3 by 3 blocks with I from vertical links.
The degree of a node is the number of connections to other nodes. This is the
number of nonzeros in that column of K. The corner nodes 1, 3, 4, 6 all have degree
2. Nodes 2 and 5 have degree 3. A larger region has inside nodes of degree 4, which
will not be eliminated rst. The degrees change as elimination proceeds, because of
ll-in.
The rst elimination step chooses row 1 as pivot row, because node 1 has minimum
degree 2. (We had to break a tie! Any degree 2 node could come rst, leading to dierent
elimination orders.) The pivot is P, the other nonzeros in that row are boxed. When
row 1 operates on rows 2 and 4, it changes six entries below it. In particular, the two
ll-in entries marked by F change to nonzeros. This ll-in of the (2, 4) and (4, 2) entries
corresponds to the dashed line connecting nodes 2 and 4 in the graph.

4
1

Pivots P

Fill-in F

Zeros

from
elimination

3
P

4
F

6
F

Figure 3.19: Minimum degree nodes 1 and 3. The pivots P are in rows 1 and 3; new

edges 24 and 26 in the graph match the matrix entries F lled in by elimination.

CHAPTER 3. BOUNDARY VALUE PROBLEMS

196

Nodes that were connected to the eliminated node are now connected to each other.
Elimination continues on the 5 by 5 matrix (and the graph with 5 nodes). Node 2 still has
degree 3, so it is not eliminated next. If we break the tie by choosing node 3, elimination
using the new pivot P will ll in the (2, 6) and (6, 2) positions. Node 2 becomes linked
to node 6 because they were both linked to the eliminated node 3.
The problem is reduced to 4 by 4, for the unknown U s at the remaining nodes
2, 4, 5, 6. Problem
asks you to take the next stepchoose a minimum degree node
and reduce the system to 3 by 3.

Storing the Nonzero Structure = Sparsity Pattern

A large system KU = F needs a fast and economical storage of the node connections
(which match the positions of nonzeros in K). The connections and nonzeros change
as elimination proceeds. The list of edges and nonzero positions corresponds to the
adjacency matrix of the graph of nodes. The adjacency matrix has 1 or 0 to indicate
nonzero or zero in K.
For each node i, we have a list adj(i) of the nodes connected to i. How to combine
these into one master list NZ for the whole graph and the whole matrix K? A simple
way is to store the lists adj(i) sequentially in NZ (the nonzeros for i = 1 up to i = n).
An index array IND of pointers tells the starting position of the sublist adj(i) within
the master list NZ. It is useful to give IND an (n + 1)st entry to point to the nal
entry in NZ (or to the blank that follows, in Figure 3.20). MATLAB will store one
more array (the same length nnz(K) as NZ) to give the actual nonzero entries.
NZ
IND
Node

2 4

1
1

3
2

3 5

2 6

6
3

8
4

5 |

2 4 6

10
5

3 5

13
6

Figure 3.20: Master list NZ of nonzeros (neighbors in Figure 3.19). Positions in

IND.

The indices i are the original numbering of the nodes. If there is renumbering,
the new ordering can be stored as a permutation PERM. Then PERM(i) = k when
the new number i is assigned to the node with original number k. The text [GL] by
George and Liu is the classic reference for this entire section on ordering of the nodes.

Graph Separators
Here is another good ordering, dierent from minimum degree. Graphs or meshes are
often separated into disjoint pieces by a cut. The cut goes through a small number
of nodes or meshpoints (a separator). It is a good idea to number the nodes in the

3.6. SOLVING LARGE LINEAR SYSTEMS

197

separator last. Elimination is relatively fast for the disjoint pieces P and Q. It only
slows down at the end, for the (smaller) separator S.
The three groups P, Q, S of meshpoints have no direct connections between P and
Q (they are both connected to the separator S). Numbered in that order, the block
arrow stiness matrix and its K = LU factorization look like this:

KP
0
KP S
UP 0 A
LP

KQ KQS
UQ B
K= 0
U=
L = 0 LQ
C
KSP KSQ KS
X Y Z
(3)
The zero blocks in K give zero blocks in L and U. The submatrix KP comes rst
in elimination, to produce LP and UP . Then come the factors LQ UQ of KQ , followed
by the connections through the separator. The major cost is often that last step, the
solution of a fairly dense system of the size of the separator.
4

1
2
3

Arrow matrix
(Figure 3.18)

3
S

Q
4

Separator comes last

(Figure 3.19)

Blocks P, Q
Separator S

Figure 3.21: A graph separator numbered last produces a block arrow matrix K.
Figure 3.21 shows three examples, each with separators. The graph for a perfect
arrow matrix has a one-point separator (very unusual). The 6-node rectangle has a
two-node separator in the middle. Every N by N grid can be cut by an N-point
separator (and N is much smaller than N 2 ). If the meshpoints form a rectangle, the
best cut is down the middle in the shorter direction.
You could say that the numbering of P then Q then S is block minimum degree.
But one cut with one separator will not come close to an optimal numbering. It
is natural to extend the idea to a nested sequence of cuts. P and Q have their
own separators at the next level. This nested dissection continues until it is not
productive to cut further. It is a strategy of divide and conquer.
Figure 3.22 illustrates three levels of nested dissection on a 7 by 7 grid. The rst
cut is down the middle. Then two cuts go across and four cuts go down. Numbering
the separators last within each stage, the matrix K of size 49 has arrows inside arrows
inside arrows. The spy command will display the pattern of nonzeros.
Separators and nested dissection show how numbering strategies are based on the
graph of nodes and edges in the mesh. Those edges correspond to nonzeros in the
matrix K. The nonzeros created by elimination (lled entries in L and U) correspond
to paths in the graph. In practice, there has to be a balance between simplicity and
optimality in the numberingin scientic computing simplicity is a very good thing!

CHAPTER 3. BOUNDARY VALUE PROBLEMS

198

3
0
405

zero

22 to 30

1 to 9

3 18

19 to 21

40 to 42

10 to 18

31 to 39

zero

zero
3 18

7 42

Figure 3.22: Three levels of separators. Still

nonzeros in K, only

in L.

A very reasonable compromise is the backslash command U = K\F that uses a

nearly minimum degree ordering in Sparse MATLAB.

Operation Counts (page K)

Here are the complexity estimates for the 5-point Laplacian with N 2 or N 3 nodes:
Minimum Degree
Space (nonzeros from ll-in)
Time (ops for elimination)

n = N 2 in 2D
X
X

n = N 3 in 3D
X
X

Nested Dissection
Space (nonzeros from ll-in)
Time (ops for elimination)

X
X

In the last century, nested dissection lost outit was slower on almost all applications. Now larger problems are appearing and the asymptotics eventually give nested
dissection an edge. Algorithms for cutting graphs can produce short cuts into nearly
equal pieces. Of course a new idea for ordering could still win.

Iterative versus Direct Methods

This section is a guide to solution methods for problems Ax = b that are too large
and expensive for ordinary elimination. We are thinking of sparse matrices A, when
a multiplication Ax is relatively cheap. If A has at most p nonzeros in every row,
then Ax needs at most pn multiplications. Typical applications are to large nite
dierence equations or nite element problems on unstructured grids. In the special
case of a square grid for Laplaces equation, a Fast Poisson Solver (Section 7.2) is
available.
We turn away from elimination to iterative methods and Krylov subspaces.
Pure iterative methods are easier to analyze, but the Krylov subspace methods are

3.6. SOLVING LARGE LINEAR SYSTEMS

199

more powerful. So the older iterations of Jacobi and Gauss-Seidel and overrelaxation
are less favored in scientic computing, compared to conjugate gradients and GMRES. When the growing Krylov subspaces reach the whole space Rn , these methods
(in exact arithmetic) give the exact solution A1 b. But in reality we stop much earlier, long before n steps are complete. The conjugate gradient method (for positive
denite A, and with a good preconditioner ) has become truly important.
The next ten pages will introduce you to numerical linear algebra. This has
become a central part of scientic computing, with a clear goal: Find a fast stable
algorithm that uses the special properties of the matrices. We meet matrices that
are sparse or symmetric or triangular or orthogonal or tridiagonal or Hessenberg or
Givens or Householder. Those matrices are at the core of so many computational
problems. The algorithm doesnt need details of the entries (which come from the
specic application). By using only their structure, numerical linear algebra oers
major help.
Overall, elimination with good numbering is the rst choice until storage and CPU
time become excessive. This high cost often arises rst in three dimensions. At that
point we turn to iterative methods, which require more expertise. You must choose
the method and the preconditioner. The next pages aim to help the reader at this
frontier of scientic computing.

Pure Iterations
We begin with old-style pure iteration (not obsolete). The letter K will be reserved
for Krylov so we leave behind the notation KU = F . The linear system becomes
Ax = b with a large sparse matrix A, not necessarily symmetric or positive denite:
Linear system Ax = b

Residual rk = b Axk

Preconditioner P A

The preconditioner P attempts to be close to A and at the same time much easier
to work with. A diagonal P is one extreme (not very close). P = A is the other
extreme (too close). Splitting the matrix A gives an equivalent form of Ax = b:
Splitting

P x = (P A)x + b .

(4)

This suggests an iteration, in which every vector xk leads to the next xk+1 :
Iteration

P xk+1 = (P A)xk + b .

(5)

Starting from any x0 , the rst step nds x1 from P x1 = (P A)x0 + b. The iteration
continues to x2 with the same matrix P , so it often helps to know its triangular factors
L and U. Sometimes P itself is triangular, or its factors L and U are approximations
to the triangular factors of A. Two conditions on P make the iteration successful:
1. The new xk+1 must be quickly computable. Equation (5) must be fast to solve.
2. The errors ek = x xk must converge quickly to zero.

CHAPTER 3. BOUNDARY VALUE PROBLEMS

200

Subtract equation (5) from (4) to nd the error equation. It connects ek to ek+1 :
Error

P ek+1 = (P A)ek which means ek+1 = (I P 1A)ek = Mek .

(6)

The right side b disappears in this error equation. Each step multiplies the error
vector by M = I P 1 A. The speed of convergence of xk to x (and of ek to zero)
depends entirely on M . The test for convergence is given by the eigenvalues of M :
Convergence test

Every eigenvalue of M must have |(M )| < 1.

The largest eigenvalue (in absolute value) is the spectral radius (M ) = max |(M )|.
Convergence requires (M ) < 1. The convergence rate is set by the largest eigenvalue. For a large problem, we are happy with (M ) = .9 and even (M ) = .99.
Suppose that the initial error e0 happens to be an eigenvector of M . Then the
next error is e1 = Me0 = e0 . At every step the error is multiplied by , so we must
have || < 1. Normally e0 is a combination of all the eigenvectors. When the iteration
multiplies by M , each eigenvector is multiplied by its own eigenvalue. After k steps
those multipliers are k . We have convergence if all || < 1.
For preconditioner we rst propose two simple choices:
Jacobi iteration
Gauss-Seidel iteration

P = diagonal part of A
P = lower triangular part of A

Typical examples have spectral radius (M ) = 1 cN 1 . This comes closer and

closer to 1 as the mesh is rened and the matrix grows. An improved preconditioner
P can give (M ) = 1 cN 1/2 . Then is smaller and convergence is faster, as in
overrelaxation. But a dierent approach has given more exibility in constructing
a good P , from a quick incomplete LU factorization of the true matrix A:
I ncomplete LU

P = (approximation to L)(approximation to U ) .

The exact A = LU has ll-in, so zero entries in A become nonzero in L and U . The approximate L and U could ignore this ll-in (fairly dangerous). Or P = Lapprox Uapprox
can keep only the ll-in entries F above a xed threshold. The variety of options,
and the fact that the computer can decide automatically which entries to keep, has
made the ILU idea (incomplete LU ) a very popular starting point.
Example
The 1, 2, 1 matrix A = K provides an excellent example. We choose
the preconditioner P = T , the same matrix with T11 = 1 instead of K11 = 2. The LU
factors of T are perfect rst dierences, with diagonals of +1 and 1. (Remember that
all pivots of T equal 1, while the pivots of K are 2/1, 3/2, 4/3, . . .) We can compute the
right side of T 1 Kx = T 1 b with only 2N additions and no multiplications (just back
substitution using L and U ). Idea: This L and U are approximately correct for K.
The matrix P 1A = T 1 K on the left side is triangular. More than that, T is a rank 1
change from K (the 1, 1 entry changes from 2 to 1). It follows that T 1 K and K 1 T

3.6. SOLVING LARGE LINEAR SYSTEMS

201

shows that
are rank 1 changes from the identity matrix I. A calculation in Problem
only the rst column of I is changed, by the linear vector = (N, N 1, . . . , 1):
P 1 A = T 1 K = I + eT
and
K 1 T = I (eT
(7)
1
1 )/(N + 1) .

1
1 0 . . . 0 so eT
b by
Here eT
1 =
1 has rst column . This example nds x = K
1
1
1
a quick exact formula (K T )T b, needing only 2N additions for T and N additions
and multiplications for K 1 T . In practice we wouldnt precondition this K (just solve).
The usual purpose of preconditioning is to speed up convergence for iterative methods,
and that depends on the eigenvalues of P 1 A. Here the eigenvalues of T 1 K are its
diagonal entries N +1, 1, . . . , 1. This example will illustrate a special property of conjugate
gradients, that with only two dierent eigenvalues it reaches the true solution x in two
steps.
The iteration P xk+1 = (P A)xk + b is too simple! It is choosing one particular
vector in a Krylov subspace. With relatively little work we can make a much better
choice of xk . Krylov projections are the state of the art in todays iterative methods.

Krylov Subspaces
Our original equation is Ax = b. The preconditioned equation is P 1 Ax = P 1b.
When we write P 1 , we never intend that an inverse would be explicitly computed
(except in our example). The ordinary iteration is a correction to xk by the vector
P 1 rk :
P xk+1 = (P A)xk + b

P xk+1 = P xk + rk

xk+1 = xk + P 1 rk . (8)

Here rk = b Axk is the residual. It is the error in Ax = b, not the error ek

in x. The symbol P 1 rk represents the change from xk to xk+1 , but that step is
not computed by multiplying P 1 times rk . We might use incomplete LU, or a few
steps of a multigrid iteration, or domain decomposition. Or an entirely new
preconditioner.
In describing Krylov subspaces, I should work with P 1 A. For simplicity I will
only write A. I am assuming that P has been chosen and used, and the preconditioned equation P 1 Ax = P 1 b is given the notation Ax = b. The preconditioner is
now P = I. Our new matrix A is probably better than the original matrix with that
name.
The Krylov subspace Kk (A, b) contains all combinations of b, Ab, . . . , Ak1 b.
These are the vectors that we can compute quickly, multiplying by a sparse A. We
look in this space Kk for the approximation xk to the true solution of Ax = b. Notice
that the pure iteration xk = (I A)xk1 + b does produce a vector in Kk when xk1
is in Kk1 . The Krylov subspace methods make other choices of xk . Here are four
dierent approaches to choosing a good xk in Kk this is the important decision:

CHAPTER 3. BOUNDARY VALUE PROBLEMS

202

1. The residual rk = b Axk is orthogonal to Kk (Conjugate Gradients, . . . )

2. The residual rk has minimum norm for xk in Kk (GMRES, MINRES, . . . )
3. rk is orthogonal to a dierent space like Kk (AT ) (BiConjugate Gradients, . . . )
4. ek has minimum norm (SYMMLQ; for BiCGStab xk is in AT Kk (AT ); . . . )
In every case we hope to compute the new xk quickly and stably from the earlier xs.
If that recursion only involves xk1 and xk2 (short recurrence) it is especially fast.
We will see this happen for conjugate gradients and symmetric positive denite A.
The BiCG method in 3 is a natural extension of short recurrences to unsymmetric
Abut stability and other questions open the door to the whole range of methods.
To compute xk we need a basis for Kk . The best basis q1 , . . . , qk is orthonormal.
Each new qk comes from orthogonalizing t = Aqk1 to the basis vectors q1 , . . . , qk1
that are already chosen. This is the Gram-Schmidt idea (called modied GramSchmidt when we subtract projections of t onto the qs one at a time, for numerical
stability). The iteration to compute the orthonormal qs is known as Arnoldis
method:
1
2
3
4
5
6

q1 = b/b2 ;
for j = 1, . . . , k 1
t = Aqj ;
for i = 1, . . . , j
hij = qiT t;
t = t hij qi ;
end;

hj+1,j = t2 ;
qj+1 = t/hj+1,j ;
end

% Normalize to q1 = 1
% t is in the Krylov space Kj+1(A, b)
% hij qi = projection of t onto qi
% Subtract component of t along qi
% t is now orthogonal to q1 , . . . , qj

% Normalize t to qj+1 = 1

% q1 , . . . , qk are orthonormal in Kk

Put the column vectors q1 , . . . , qk into an n by k matrix Qk . Multiplying rows of

by columns of Qk produces all the inner products qiT qj , which are the 0s and 1s
in the identity matrix. The orthonormal property means that QT
k Qk = Ik .
QT
k

Arnoldi constructs each qj+1 from Aqj by subtracting projections hij qi . If we

express the steps up to j = k 1 in matrix notation, they become AQk1 = Qk Hk,k1:
Arnoldi

AQk1 =

Aq1 Aqk1
n by k 1

h11 h12

h21 h22
=
q1 qk

0 h23
0
0
n by k
k by

h1,k1
h2,k1
.

hk,k1
k1

(9)

That matrix Hk,k1 is upper Hessenberg because it has only one nonzero diagonal
below the main diagonal. We check that the rst column of this matrix equation

3.6. SOLVING LARGE LINEAR SYSTEMS

203

(multiplying by columns!) produces q2 :

Aq1 = h11 q1 + h21 q2

q2 =

Aq1 h11 q1
.
h21

(10)

That subtraction is Step 4 in Arnoldis algorithm. Division by h21 is Step 6.

Unless more of the hij are zero, the cost is increasing at every iteration. We have
k dot products to compute at step 3 and 5, and k vector updates in steps 4 and 6. A
short recurrence means that most of these hij are zero. That happens when A = AT .
The matrix H is tridiagonal when A is symmetric. This fact is the foundation of conjugate gradients. For a matrix proof, multiply equation (9) by QT
k1 . The
T
right side becomes H without its last row, because (Qk1 Qk )Hk,k1 = [ I 0 ] Hk,k1.
The left side QT
k1 AQk1 is always symmetric when A is symmetric. So that H matrix
has to be symmetric, which makes it tridiagonal. There are only three nonzeros in
the rows and columns of H, and Gram-Schmidt to nd qk+1 only involves qk and qk1 :
Arnoldi when A = AT

Aqk = hk+1,k qk+1 + hk,k qk + hk1,k qk1 .

(11)

This is the Lanczos iteration. Each new qk+1 = (Aqk hk,k qk hk1,k qk1 )/hk+1,k
involves one multiplication Aqk , two dot products for new hs, and two vector updates.

The QR Method for Eigenvalues

Allow me an important comment on the eigenvalue problem Ax = x. We have seen
T
that Hk1 = QT
k1 AQk1 is tridiagonal if A = A . When k 1 reaches n and Qn is
1
square, the matrix H = QT
n AQn = Qn AQn has the same eigenvalues as A:
Same

Hy = Q1
n AQn y = y

gives

Ax = x with x = Qn y .

(12)

It is much easier to nd the eigenvalues for a tridiagonal H than the for original A.
The famous QR method for the eigenvalue problem starts with T1 = H, factors
it into T1 = Q1 R1 (this is Gram-Schmidt on the short columns of T1 ), and reverses
order to produce T2 = R1 Q1 . The matrix T2 is again tridiagonal, and its o-diagonal
entries are normally smaller than for T1 . The next step is Gram-Schmidt on T2 ,
orthogonalizing its columns in Q2 by the combinations in the upper triangular R2 :
.
QR Method Factor T2 into Q2 R2 . Reverse order to T3 = R2 Q2 = Q1
2 T2 Q2 (13)
By the reasoning in (12), any Q1 T Q has the same eigenvalues as T . So the matrices
T2 , T3 , . . . all have the same eigenvalues as T1 = H and A. (These square Qk from
Gram-Schmidt are entirely dierent from the rectangular Qk in Arnoldi.) We can
even shift T before Gram-Schmidt, and we should, provided we remember to shift
back:
Shifted QR

Factor Tk sk I = Qk Rk . Reverse to Tk+1 = Rk Qk + bk I .

(14)

When the shift sk is chosen to be the n, n entry of Tk , the last o-diagonal entry of
Tk+1 becomes very small. The n, n entry of Tk+1 moves close to an eigenvalue. Shifted

CHAPTER 3. BOUNDARY VALUE PROBLEMS

204

QR is one of the great algorithms of numerical linear algebra. It solves moderate-size

eigenvalue problems with great eciency. This is the core of MATLABs eig(A).
For a large symmetric matrix, we often stop the Arnoldi-Lanczos iteration at a
tridiagonal Hk with k < n. The full n-step process to reach Hn is too expensive, and
often we dont need all n eigenvalues. So we compute (by the same QR method) the k
eigenvalues of Hk instead of the n eigenvalues of Hn . These computed 1k , 2k , . . . , kk
can provide good approximations to the rst k eigenvalues of A. And we have an
excellent start on the eigenvalue problem for Hk+1, if we decide to take a further step.
This Lanczos method will nd, approximately and iteratively and quickly, the
leading eigenvalues of a large symmetric matrix.

The Conjugate Gradient Method

We return to iterative methods for Ax = b. The Arnoldi algorithm produced orthonormal basis vectors q1 , q2 , . . . for the growing Krylov subspaces K1 , K2 , . . .. Now
we select vectors x1 , x2 , . . . in those subspaces that approach the exact solution to
Ax = b. We concentrate on the conjugate gradient method for symmetric positive
denite A.
The rule for xk in conjugate gradients is that the residual rk = b Axk should
be orthogonal to all vectors in Kk . Since rk will be in Kk+1 , it must be a multiple
of Arnoldis next vector qk+1 ! Each residual is therefore orthogonal to all previous
residuals (which are multiples of the previous qs):
Orthogonal residuals

riT rk = 0

for i < k .

(15)

The dierence between rk and qk+1 is that the qs are normalized, as in q1 = b/b.

Since rk1 is a multiple of qk , the dierence rk rk1 is orthogonal to each subspace
K with i < k. Certainly xi xi1 lies in that Ki . So r is orthogonal to earlier
xs:
for i < k .
(16)
(xi xi1 )T (rk rk1) = 0
i

These dierences x and r are directly connected, because the bs cancel in r:

rk rk1 = (b Axk ) (b Axk1 ) = A(xk xk1 ) .

(17)

Substituting (17) into (16), the updates in the xs are A-orthogonal or conjugate:
Conjugate updates x

(xi xi1 )T A(xk xk1 ) = 0

for i < k .

(18)

Now we have all the requirements. Each conjugate gradient step will nd a new
search direction dk for the update xk xk1 . From xk1 it will move the right
distance k dk to xk . Using (17) it will compute the new rk . The constants k in
the search direction and k in the update will be determined by (15) and (16) for
i = k 1. For symmetric A the orthogonality in (15) and (16) will be automatic for
i < k 1, as in Arnoldi. We have a short recurrence for the new xk and rk .

3.6. SOLVING LARGE LINEAR SYSTEMS

205

Here is one cycle of the algorithm, starting from x0 = 0 and r0 = b and 1 = 0. It

involves only two new dot products and one matrix-vector multiplication Ad:
Conjugate
Gradient
Method

1
2
3
4
5

T
T
k = rk1
rk1 /rk2
rk2
dk = rk1 + k dk1
T
rk1 /dT
k = rk1
k Adk
xk = xk1 + k dk
rk = rk1 k Adk

%
%
%
%
%

Improvement this step

Next search direction
Step length to next xk
Approximate solution
New residual from (17)

The formulas 1 and 3 for k and k are explained briey belowand fully by TrefethenBau ( ) and Shewchuk ( ) and many other good references.

Dierent Viewpoints on Conjugate Gradients

I want to describe the (same!) conjugate gradient method in two dierent ways:
1. It solves a tridiagonal system Hy = f recursively
2. It minimizes the energy 12 xT Ax xT b recursively.
How does Ax = b change to the tridiagonal Hy = f ? That uses Arnoldis orthonormal columns q1 , . . . , qn in Q, with QT Q = I and QT AQ = H:
Ax = b is (QT AQ)(QT x) = QT b which is Hy = f = (b, 0, . . . , 0) .

(19)

Since q1 is b/b, the rst component of f = QT b is q1T b = b and the other components are qiT b = 0. The conjugate gradient method is implicitly computing this
symmetric tridiagonal H and updating the solution y at each step. Here is the third
step:

h11 h12
b
H3 y3 = h21 h22 h23 y3 = 0 .
(20)
h32 h33
0
This is the equation Ax = b projected by Q3 onto the third Krylov subspace K3 .
These hs never appear in conjugate gradients. We dont want to do Arnoldi too!
It is the LDLT factors of H that CG is somehow computingtwo new numbers at
each step. Those give a fast update from yj1 to yj . The corresponding xj = Qj yj
from conjugate gradients approaches the exact solution xn = Qn yn which is x = A1 b.
If we can see conjugate gradients also as an energy minimizing algorithm, we can
extend it to nonlinear problems and use it in optimization. For our linear equation
Ax = b, the energy is E(x) = 12 xT Ax xT b. Minimizing E(x) is the same as solving
Ax = b, when A is positive denite (the main point of Section 1. ). The CG
iteration minimizes E(x) on the growing Krylov subspaces. On the rst
subspace K1 , the line where x is b = d1, this minimization produces the right

CHAPTER 3. BOUNDARY VALUE PROBLEMS

206
value for 1 :

1
E(b) = 2 bT Ab bT b
2

is minimized at

1 =

bT b
.
bT Ab

(21)

That 1 is the constant chosen in step 3 of the rst conjugate gradient cycle.
The gradient of E(x) = 12 xT Ax xT b is exactly Ax b. The steepest descent
direction at x1 is along the negative gradient, which is r1 ! This sounds like the perfect
direction d2 for the next move. But the great diculty with steepest descent is that
this r1 can be too close to the rst direction. Little progress that way. So we add
the right multiple 2 d1 , in order to make d2 = r1 + 2 d1 A-orthogonal to the rst
direction d1 .
Then we move in this conjugate direction d2 to x2 = x1 + 2 d2 . This explains the
name conjugate gradients, rather than the pure gradients of steepest descent. Every
cycle of CG chooses j to minimize E(x) in the new search direction x = xj1 + dj .
The last cycle (if we go that far) gives the overall minimizer xn = x = A1 b.
Example

3
2 1 1
4
1 2 1 1 = 0 .
1
1 1 2
0

Ax = b

From x0 = 0 and 1 = 0 and r0 = d1 = b the rst cycle gives 1 = 12 and x1 = 12 b =

(2, 0, 0). The new residual is r1 = b Ax1 = (0, 2, 2). Then the second cycle yields

2
d2 = 2 ,
2

2 =

8
,
16

3
x2 = 1 = A1 b !
1

2 =

8
,
16

The correct solution is reached in two steps, where normally it will take n = 3 steps. The
reason is that this particular A has only two distinct eigenvalues 4 and 1. In that case
A1 b is a combination of b and Ab, and this best combination x2 is found at cycle 2. The
residual r2 is zero and the cycles stop earlyvery unusual.
Energy minimization leads in [ ] to an estimate of the convergence
rate for the

T
error e = x xj in conjugate gradients, using the A-norm eA = e Ae:

Error estimate

j
max min

x x0 A .
x xj A 2
max + min

(22)

This is the best-known error estimate, although it doesnt account for any clustering of
the eigenvalues of A. It involves only the condition number max /min . Problem
gives the optimal error estimate but it is not so easy to compute. That optimal
estimate needs all the eigenvalues of A, while (22) uses only the extreme eigenvalues
max (A) and min(A)which in practice we can bound above and below.

3.6. SOLVING LARGE LINEAR SYSTEMS

207

Minimum Residual Methods

When A is not symmetric positive denite, conjugate gradient is not guaranteed to
solve Ax = b. Most likely it wont. We will follow van der Vorst [ ] in briey
describing the minimum norm residual approach, leading to MINRES and GMRES.
These methods choose xj in the Krylov subspace Kj so that b Axj is minimal.
First we compute the orthonormal Arnoldi vectors q1 , . . . , qj . They go in the columns
of Qj , so QT
j Qj = I. As in (19) we set xj = Qj y, to express the solution as a
combination of those qs. Then the norm of the residual rj using (9) is
b Axj = b AQj y = b Qj+1 Hj+1,j y .

(23)

T
These vectors are all in the Krylov space Kj+1, where rjT (Qj+1 QT
j +1 rj ) = rj rj . This
T
says that the norm is not changed when we multiply by Qj +1 . Our problem becomes:

Choose y to minimize rj = QT
j +1 b Hj+1,j y = f Hy .

(24)

This is an ordinary least squares problem for the equation Hy = f with only j + 1
equations and j unknowns. The right side f = QT
j+1 b is (r0 , 0, . . . , 0) as in (19).
The matrix H = Hj+1,j is Hessenberg as in (9), with one nonzero diagonal below the
main diagonal. We face a completely typical problem of numerical linear algebra:
Use the special properties of H and f to nd a fast algorithm that computes y. The
two favorite algorithms for this least squares problem are closely related:

MINRES A is symmetric (probably indenite, or we use CG) and H is tridiagonal

GMRES

A is not symmetric and the upper triangular part of H can be full

In both cases we want to clear out that nonzero diagonal below the main diagonal of
H. The natural way to do that, one nonzero entry at a time, is by Givens rotations.
These plane rotations are so useful and simple (the essential part is only 2 by 2) that
we complete this section by explaining them.

Givens Rotations
The direct approach to the least squares solution of Hy = f constructs the normal
equations H T Hy = H T f . That was the central idea in Chapter 1, but you see what
we lose. If H is Hessenberg, with many good zeros, H T H is full. Those zeros in H
should simplify and shorten the computations, so we dont want the normal equations.
The other approach to least squares is by Gram-Schmidt. We factor H into
orthogonal times upper triangular. Since the letter Q is already used, the orthogonal matrix will be called G (after Givens). The upper triangular matrix is G1 H.
The 3 by 2 case shows how a plane rotation G1
21 can clear out the subdiagonal entry

CHAPTER 3. BOUNDARY VALUE PROBLEMS

208
h21 :

h11 h12

cos sin 0
sin cos 0 h21 h22 = 0 .
G1
21 H =
0
0
0
1
0 h32

(25)

That bold zero entry requires h11 sin = h21 cos , which determines . A second
1 1
rotation G1
32 , in the 2-3 plane, will zero out the 3, 2 entry. Then G32 G21 H is a
square upper triangular matrix U above a row of zeros!
The Givens orthogonal matrix is G = G21 G32 but there is no reason to do this multiplication. We use each Gij as it is constructed, to simplify the least squares problem.
Rotations (and all orthogonal matrices) leave the lengths of vectors unchanged:

U
F
1 1
1 1
(26)
y
.
Hy f = G32 G21 Hy G32 G21 f =
e
0
This length is what MINRES and GMRES minimize. The row of zeros below U
means that the last entry e is the errorwe cant reduce it. But we get all the other
entries exactly right by solving the j by j system Uy = F (here j = 2). This gives
the best least squares solution y. Going back to the original problem of minimizing
r = b Axj , the best xj in the Krylov space Kj is Qj y.
For non-symmetric A (GMRES rather than MINRES) we dont have a short
recurrence. The upper triangle in H can be full, and step j becomes expensive and
possibly inaccurate as j increases. So we may change full GMRES to GMRES(m),
which restarts the algorithm every m steps. It is not so easy to choose a good m.

Problem Set 3.6

Create K2D for a 4 by 4 square grid with N 2 = 32 interior mesh points (so
n = 9). Print out its factors K = LU (or its Cholesky factor C = chol(K) for
the symmetrized form K = CT C). How many zeros in these triangular factors?
Also print out inv(K) to see that it is full.

As N increases, what parts of the LU factors of K2D are lled in?

Can you answer the same question for K3D ? In each case we really want an
estimate cN p of the number of nonzeros (the most important number is p).

Use the tic; ...; toc clocking command to compare the solution time for K2D x =
random f in ordinary MATLAB and sparse MATLAB (where K2D is dened as
a sparse matrix). Above what value of N does the sparse routine K\f win?

Compare ordinary vs. sparse solution times in the three-dimensional K3Dx =

random f . At which N does the sparse K\f begin to win?

Incomplete LU

Conjugate gradients

3.6. SOLVING LARGE LINEAR SYSTEMS

209

Draw the next step after Figure 3.19 when the matrix has become 4 by 4 and
the graph has nodes 2456. Which have minimum degree? Is there more
ll-in?

Redraw the right side of Figure 3.19 if row number 2 is chosen as the second
pivot row. Node 2 does not have minimum degree. Indicate new edges in the
5-node graph and new nonzeros F in the matrix.

T
To show that T 1 K = I + eT
1 in (7), with e1 = [ 1 0 . . . 0 ], we can start from
T
1
1
T
K = T + e1 e1 . Then T K = I + (T e1 )e1 and we verify that e1 = T :

1 1
N
1

2 1
N 1
=
0
= e1 .
T =

1
2
1
0

Second dierences of a linear vector are zero. Now multiply T 1 K = I + eT

1
1
times I (eT
)/(N
+
1)
to
establish
the
inverse
matrix
K
T
in
(7).
1
11

Arnoldi expresses each Aqk as hk+1,k qk+1 + hk,k qk + + h1,k q1 . Multiply by qiT
to nd hi,k = qiT Aqk . If A is symmetric you can write this as (Aqi )T qk . Explain
why (Aqi )T qk = 0 for i < k 1 by expanding Aqi into hi+1,i qi+1 + + h1,i q1 .
We have a short recurrence if A = AT (only hk+1,k and hk,k and hk1,k are
nonzero).

3.5 Finite Differences and Fast Poisson Solvers: C 2006 Gilbert Strang
No ratings yet
3.5 Finite Differences and Fast Poisson Solvers: C 2006 Gilbert Strang
7 pages
CH 2 Solutions To Linear Equations
No ratings yet
CH 2 Solutions To Linear Equations
125 pages
Chapter - Two - CT - 1
No ratings yet
Chapter - Two - CT - 1
6 pages
Cheating
No ratings yet
Cheating
9 pages
ln04 Eg501v Fall19
No ratings yet
ln04 Eg501v Fall19
11 pages
On The Application of The Minimum Degree Algorithm To Finite Element Systems
No ratings yet
On The Application of The Minimum Degree Algorithm To Finite Element Systems
23 pages
Lecture6 D2 PDF
No ratings yet
Lecture6 D2 PDF
162 pages
Row Operations
No ratings yet
Row Operations
9 pages
Linear Algebra Solutions
No ratings yet
Linear Algebra Solutions
357 pages
Solution Manual
No ratings yet
Solution Manual
350 pages
Solutions Manual for Linear Algebra
80% (10)
Solutions Manual for Linear Algebra
350 pages
(Peter J. Olver, Chehrzad Shakiban) Instructor's S
No ratings yet
(Peter J. Olver, Chehrzad Shakiban) Instructor's S
357 pages
2 - Diffusion - in - 1D Copy1
No ratings yet
2 - Diffusion - in - 1D Copy1
19 pages
Ifem051 w2
No ratings yet
Ifem051 w2
27 pages
Lect 6-Branch-Bound Hangarian
No ratings yet
Lect 6-Branch-Bound Hangarian
11 pages
Linear Systems - Introduction To Computers & Engineering
No ratings yet
Linear Systems - Introduction To Computers & Engineering
12 pages
Solving Sparse Linear System
No ratings yet
Solving Sparse Linear System
23 pages
Y Can Be Arbitrary As Long As They Satisfy This Relation. 1 25
No ratings yet
Y Can Be Arbitrary As Long As They Satisfy This Relation. 1 25
14 pages
Marked - Kenneth Hardy - Linear Algebra For Engineers and Scientists Using Matlab (2005, Pearson) - Libgen - Li-40-53
No ratings yet
Marked - Kenneth Hardy - Linear Algebra For Engineers and Scientists Using Matlab (2005, Pearson) - Libgen - Li-40-53
14 pages
Numerical Methods Chap3
No ratings yet
Numerical Methods Chap3
30 pages
MIT18 06S10 Pset1 s10 Soln
No ratings yet
MIT18 06S10 Pset1 s10 Soln
9 pages
Linear System: 2011 Intro. To Computation Mathematics LAB Session
No ratings yet
Linear System: 2011 Intro. To Computation Mathematics LAB Session
7 pages
Chap2 5
No ratings yet
Chap2 5
6 pages
Lecture1 Four Special Matrices Pre MMDA Spring25
No ratings yet
Lecture1 Four Special Matrices Pre MMDA Spring25
8 pages
1 Methods For Solving Systems of Linear Equations
No ratings yet
1 Methods For Solving Systems of Linear Equations
21 pages
Gaussian Elimination in Linear Algebra
No ratings yet
Gaussian Elimination in Linear Algebra
4 pages
Householder Transformations in Linear Algebra
No ratings yet
Householder Transformations in Linear Algebra
21 pages
FDMcode
No ratings yet
FDMcode
9 pages
Ol MATHS
No ratings yet
Ol MATHS
25 pages
LU Factorization Explained
No ratings yet
LU Factorization Explained
16 pages
HPC Linear
No ratings yet
HPC Linear
52 pages
Numerical Chapter3
No ratings yet
Numerical Chapter3
51 pages
Instructors' Solutions Manual: Applied Linear Algebra
No ratings yet
Instructors' Solutions Manual: Applied Linear Algebra
205 pages
Gauss Elimination Method
No ratings yet
Gauss Elimination Method
15 pages
Math - Advance Engineering Mathematics
No ratings yet
Math - Advance Engineering Mathematics
5 pages
Matrix Algorithms: Gaussian & Gauss-Jordan
No ratings yet
Matrix Algorithms: Gaussian & Gauss-Jordan
28 pages
Gaussian Elimination Methods
No ratings yet
Gaussian Elimination Methods
29 pages
Mitres 18 010 f24 Slides02
No ratings yet
Mitres 18 010 f24 Slides02
18 pages
Term 1 Chapter 3 - Matrices
100% (2)
Term 1 Chapter 3 - Matrices
4 pages
Seminar Recommendations
No ratings yet
Seminar Recommendations
63 pages
Numerical Methods: System of Linear Equations
No ratings yet
Numerical Methods: System of Linear Equations
63 pages
Lecture Notes - Linear Algebraic Equations
No ratings yet
Lecture Notes - Linear Algebraic Equations
17 pages
Pages From Linear Fall23 Pp21-40
No ratings yet
Pages From Linear Fall23 Pp21-40
20 pages
PS1 Solution 2
No ratings yet
PS1 Solution 2
16 pages
LU Factorization & Pivoting Guide
No ratings yet
LU Factorization & Pivoting Guide
19 pages
Lu Decomposition
No ratings yet
Lu Decomposition
37 pages
Linear Programming
No ratings yet
Linear Programming
10 pages
Mae501 Lec03 Notes 1757395005155
No ratings yet
Mae501 Lec03 Notes 1757395005155
26 pages
Solving Linear & Nonlinear Equations
No ratings yet
Solving Linear & Nonlinear Equations
20 pages
MATH219 Lecture 6
No ratings yet
MATH219 Lecture 6
9 pages
Linear Algebra Lecture Notes
No ratings yet
Linear Algebra Lecture Notes
44 pages
Poisson2d Notes
No ratings yet
Poisson2d Notes
7 pages
Chapter 3
No ratings yet
Chapter 3
29 pages
Mahesh Report For Seminar Nov 2
No ratings yet
Mahesh Report For Seminar Nov 2
19 pages
03 PRECITECH 2.0 7th July 2024
No ratings yet
03 PRECITECH 2.0 7th July 2024
5 pages
Determination of Calcium by Complexometry
No ratings yet
Determination of Calcium by Complexometry
3 pages
Understanding Servomechanisms
100% (1)
Understanding Servomechanisms
4 pages
The Basics of Rebuilding Machine Tools
100% (1)
The Basics of Rebuilding Machine Tools
8 pages
Lab 3 - Linear Motion
No ratings yet
Lab 3 - Linear Motion
8 pages
Propeller Performance Analysis
No ratings yet
Propeller Performance Analysis
10 pages
Lightning Arrestor Leakage Current Report
No ratings yet
Lightning Arrestor Leakage Current Report
15 pages
Magnetic Materials Overview
No ratings yet
Magnetic Materials Overview
4 pages
Machinery Lubrication Specialist Materials
100% (15)
Machinery Lubrication Specialist Materials
219 pages
Failure Mechanisms in Polymer Matrix Composites PDF
No ratings yet
Failure Mechanisms in Polymer Matrix Composites PDF
2 pages
Optimization Word Problems
100% (1)
Optimization Word Problems
4 pages
3 Structural
100% (2)
3 Structural
49 pages
B.Tech - FEM Syllabus
No ratings yet
B.Tech - FEM Syllabus
2 pages
National Institute of Technology Goa: Machine Dynamics and Vibrations ME253
No ratings yet
National Institute of Technology Goa: Machine Dynamics and Vibrations ME253
2 pages
Catalogue EN GREASECOMING 20181015
No ratings yet
Catalogue EN GREASECOMING 20181015
4 pages
Chemical Bonding Notes For Exam
No ratings yet
Chemical Bonding Notes For Exam
4 pages
AD-SDI Spatial Reference System Guide
No ratings yet
AD-SDI Spatial Reference System Guide
41 pages
Heat Transfer: Temp, Aperture, Shielding
No ratings yet
Heat Transfer: Temp, Aperture, Shielding
5 pages
GPR Guide for Geoscientists
No ratings yet
GPR Guide for Geoscientists
9 pages
WITH SOLUTIONS Pumps and Turbines Orifice and Jet FLow
No ratings yet
WITH SOLUTIONS Pumps and Turbines Orifice and Jet FLow
42 pages
Creep and Shrinkage
100% (1)
Creep and Shrinkage
296 pages
Engineering Mathematics-I Question Bank..
No ratings yet
Engineering Mathematics-I Question Bank..
10 pages
SJI - TechnicalDigest12 2020
No ratings yet
SJI - TechnicalDigest12 2020
148 pages
Chapter 4 - Dynamic Analysis
100% (1)
Chapter 4 - Dynamic Analysis
16 pages
High Fedility Simulation
No ratings yet
High Fedility Simulation
11 pages
Pile Cap Design for Engineers
No ratings yet
Pile Cap Design for Engineers
5 pages
Molar Heat Capacities of Choline Chloride-Based Deep Eutectic Solvents and Their Binary Mixtures With Water
100% (1)
Molar Heat Capacities of Choline Chloride-Based Deep Eutectic Solvents and Their Binary Mixtures With Water
25 pages
Kurt Godel Uber Formal Unentscheidbare S
No ratings yet
Kurt Godel Uber Formal Unentscheidbare S
10 pages

Solving Large Linear Systems

Uploaded by

Solving Large Linear Systems

Uploaded by

CHAPTER 3.

BOUNDARY VALUE PROBLEMS

Solving Large Linear Systems

3.6. SOLVING LARGE LINEAR SYSTEMS

Elimination in this order: K2D has size n = N 2

CHAPTER 3. BOUNDARY VALUE PROBLEMS

Minimum Degree Algorithm

3.6. SOLVING LARGE LINEAR SYSTEMS

The component of U corresponding to that column is renumbered k + 1. So is the

CHAPTER 3. BOUNDARY VALUE PROBLEMS

Storing the Nonzero Structure = Sparsity Pattern

Figure 3.20: Master list NZ of nonzeros (neighbors in Figure 3.19). Positions in

3.6. SOLVING LARGE LINEAR SYSTEMS

Separator comes last

CHAPTER 3. BOUNDARY VALUE PROBLEMS

Figure 3.22: Three levels of separators. Still

A very reasonable compromise is the backslash command U = K\F that uses a

Operation Counts (page K)

Iterative versus Direct Methods

3.6. SOLVING LARGE LINEAR SYSTEMS

CHAPTER 3. BOUNDARY VALUE PROBLEMS

P ek+1 = (P A)ek which means ek+1 = (I P 1A)ek = Mek .

Every eigenvalue of M must have |(M )| < 1.

Typical examples have spectral radius (M ) = 1 cN 1 . This comes closer and

3.6. SOLVING LARGE LINEAR SYSTEMS

Here rk = b Axk is the residual. It is the error in Ax = b, not the error ek

CHAPTER 3. BOUNDARY VALUE PROBLEMS

1. The residual rk = b Axk is orthogonal to Kk (Conjugate Gradients, . . . )

Put the column vectors q1 , . . . , qk into an n by k matrix Qk . Multiplying rows of

Arnoldi constructs each qj+1 from Aqj by subtracting projections hij qi . If we

3.6. SOLVING LARGE LINEAR SYSTEMS

(multiplying by columns!) produces q2 :

That subtraction is Step 4 in Arnoldis algorithm. Division by h21 is Step 6.

Aqk = hk+1,k qk+1 + hk,k qk + hk1,k qk1 .

The QR Method for Eigenvalues

Factor Tk sk I = Qk Rk . Reverse to Tk+1 = Rk Qk + bk I .

CHAPTER 3. BOUNDARY VALUE PROBLEMS

QR is one of the great algorithms of numerical linear algebra. It solves moderate-size

The Conjugate Gradient Method

The dierence between rk and qk+1 is that the qs are normalized, as in q1 = b/ b .

These dierences x and r are directly connected, because the bs cancel in r:

(xi xi1 )T A(xk xk1 ) = 0

3.6. SOLVING LARGE LINEAR SYSTEMS

Here is one cycle of the algorithm, starting from x0 = 0 and r0 = b and 1 = 0. It

Improvement this step

Dierent Viewpoints on Conjugate Gradients

CHAPTER 3. BOUNDARY VALUE PROBLEMS

From x0 = 0 and 1 = 0 and r0 = d1 = b the rst cycle gives 1 = 12 and x1 = 12 b =

3.6. SOLVING LARGE LINEAR SYSTEMS

Minimum Residual Methods

MINRES A is symmetric (probably indenite, or we use CG) and H is tridiagonal

A is not symmetric and the upper triangular part of H can be full

CHAPTER 3. BOUNDARY VALUE PROBLEMS

Problem Set 3.6

As N increases, what parts of the LU factors of K2D are lled in?

Compare ordinary vs. sparse solution times in the three-dimensional K3Dx =

3.6. SOLVING LARGE LINEAR SYSTEMS

Second dierences of a linear vector  are zero. Now multiply T 1 K = I + eT

You might also like

The dierence between rk and qk+1 is that the qs are normalized, as in q1 = b/b.

Second dierences of a linear vector are zero. Now multiply T 1 K = I + eT