Medical Image Reconstruction
Term II – 2012
Topic 1: Mathematical Basis
Lecture 1
Professor Yasser Mostafa Kadah
Topic Today
Matrix Computations
Computational complexity of common matrix operations
Examples of matrix decompositions
How to solve a linear system of equations Ax = b on a computer
Vector / Matrix norm definitions
Conditioning of matrices
Least squares problem
Iterative linear system solution methods
Vector calculus (differentiation with respect to a vector)
Matrix Vector Multiplication
Consider an n×m matrix A and an m×1 vector x:
Matrix-vector multiplication b = Ax is given by b_i = Σ_{j=1..m} a_ij x_j, for i = 1, …, n
If b = Ax, then b is a linear combination of the columns of A.
Computer pseudo-code: see the sketch below
Computational complexity: 2nm flops (see the flop count discussion that follows)
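As an illustration, a minimal Octave sketch of the pseudo-code; explicit loops are used only to make the flop count visible, while in practice one would simply write b = A*x:

```octave
% Matrix-vector multiplication b = A*x written with explicit loops,
% so the 2*n*m flop count (one multiply + one add per inner step) is visible.
function b = matvec(A, x)
  [n, m] = size(A);
  b = zeros(n, 1);
  for i = 1:n
    for j = 1:m
      b(i) = b(i) + A(i, j) * x(j);   % 1 multiplication + 1 addition = 2 flops
    end
  end
end
```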
Flop Count and Order of Complexity
Real numbers are normally stored in computers in a
floating-point format.
Arithmetic operations that a computer performs on
these numbers are called floating-point operations (flops)
Example: the inner-loop update b_i = b_i + a_ij x_j
1 Multiplication + 1 Addition = 2 flops
Matrix-vector multiplication: 2nm flops, i.e., O(nm)
For an n×n matrix and (n×1) vector: O(n^2) operations
Doubling problem size quadruples effort to solve
Matrix-Matrix Multiplication
If A is an n×m matrix and X is m×p, we can form the product B = AX, which is n×p, such that b_ij = Σ_{k=1..m} a_ik x_kj
Pseudo-code: see the sketch below
2mnp flops
Square case: O(n^3)
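A corresponding Octave sketch with explicit loops (again, in practice B = A*X is used):

```octave
% Matrix-matrix multiplication B = A*X with explicit loops (A is n-by-m, X is m-by-p).
% The triple loop performs 2*m*n*p flops.
function B = matmat(A, X)
  [n, m] = size(A);
  p = size(X, 2);
  B = zeros(n, p);
  for i = 1:n
    for j = 1:p
      for k = 1:m
        B(i, j) = B(i, j) + A(i, k) * X(k, j);
      end
    end
  end
end
```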
Systems of Linear Equations
Consider a system of n linear equations in n unknowns
Can be expressed as Ax = b, where A is the n×n matrix of coefficients, x the vector of unknowns, and b the right-hand-side vector
Theorem: Let A be a square matrix. The following conditions are equivalent: A is nonsingular (A^-1 exists); det(A) ≠ 0; the columns of A are linearly independent; the rows of A are linearly independent; Ax = 0 has only the trivial solution x = 0; Ax = b has a unique solution for every b
Methods to Solve Linear Equations
Theoretical: compute A^-1 then premultiply by it:
A^-1 A x = A^-1 b  ⇒  x = A^-1 b
Practical: A^-1 is never computed!
Unstable
Computationally very expensive
Numerical accuracy
Gaussian elimination ??
Computational complexity?
Numerical accuracy?
Explore ways to make this solution simpler
Elementary Operations
A linear system of equations Ax = b has the same solution if we:
Add a multiple of one equation to another equation.
Interchange two equations.
Multiply an equation by a nonzero constant.
Explore ways of solving the linear system using these
elementary operations
Gaussian elimination is an example of such a method
Triangular systems of equations
Lower triangular system Gy = b: forward substitution, y_i = (b_i - Σ_{j<i} g_ij y_j) / g_ii
Upper triangular system: backward substitution, solving from the last unknown upward
Efficient computation for such special matrices: about n^2 flops instead of O(n^3) (see the sketch below)
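A possible Octave sketch of forward substitution; backward substitution for an upper triangular system is analogous, looping from the last row upward:

```octave
% Forward substitution: solve G*y = b where G is lower triangular and nonsingular.
% Costs about n^2 flops, much cheaper than solving a general system.
function y = forward_subst(G, b)
  n = length(b);
  y = zeros(n, 1);
  for i = 1:n
    y(i) = (b(i) - G(i, 1:i-1) * y(1:i-1)) / G(i, i);
  end
end
```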
Cholesky Decomposition
Cholesky Decomposition Theorem: Let A be a symmetric positive definite matrix. Then A can be decomposed in exactly one way into a product A = R^T R, where R is upper triangular with positive main-diagonal entries.
Solution Using Cholesky Decomposition
Consider the problem Ax = b
Then, use Cholesky decomposition to write A = R^T R
Then, Ax = b becomes R^T R x = b
Let y = Rx, then solve R^T y = b
A triangular system of equations that is easy to solve (forward substitution)
Then, solve Rx = y
Another triangular system of equations that is easy to solve (backward substitution); see the sketch below
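A short Octave sketch of this procedure, using the built-in chol (which returns the upper triangular factor R with A = R'*R); the small test matrix is an assumed example:

```octave
% Solve A*x = b via Cholesky decomposition A = R'*R (A symmetric positive definite).
A = [4 2; 2 3];            % small SPD test matrix (assumed example)
b = [6; 5];
R = chol(A);               % upper triangular R with A = R'*R
y = R' \ b;                % forward substitution:  R'*y = b
x = R \ y;                 % backward substitution: R*x = y
norm(A*x - b)              % residual should be near machine precision
```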
LU Decomposition
LU Decomposition Theorem: Let A be an n×n matrix whose leading principal submatrices are all nonsingular. Then A can be decomposed in exactly one way into a product A = LU, where L is unit lower triangular and U is upper triangular.
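The same solve pattern as above works with the LU decomposition; a possible Octave sketch using the built-in lu (which also applies row interchanges, returned as a permutation matrix P), on an assumed example matrix:

```octave
% Solve A*x = b via LU decomposition: P*A = L*U, so L*U*x = P*b.
A = [2 1 1; 4 3 3; 8 7 9];   % assumed example matrix
b = [1; 2; 3];
[L, U, P] = lu(A);           % L unit lower triangular, U upper triangular
y = L \ (P*b);               % forward substitution
x = U \ y;                   % backward substitution
norm(A*x - b)
```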
Vector Norm
Measure of distance
Definition: a vector norm is a function ||·|| satisfying ||x|| ≥ 0 with ||x|| = 0 only for x = 0, ||αx|| = |α| ||x||, and the triangle inequality ||x + y|| ≤ ||x|| + ||y||
Example: Euclidean norm, ||x||_2 = (x_1^2 + x_2^2 + … + x_n^2)^(1/2)
General definition of the p-norm: ||x||_p = (Σ_i |x_i|^p)^(1/p), p ≥ 1
Examples (see the Octave calls below):
2-norm: Euclidean distance, ||x||_2 = (Σ_i |x_i|^2)^(1/2)
1-norm: ||x||_1 = Σ_i |x_i| (taxicab norm or Manhattan norm)
∞-norm: ||x||_∞ = max_i |x_i|
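In Octave/Matlab these norms are available through the built-in norm function, for example:

```octave
x = [3; -4];
norm(x, 1)     % 1-norm  (taxicab):    |3| + |-4| = 7
norm(x, 2)     % 2-norm  (Euclidean):  sqrt(9 + 16) = 5
norm(x, Inf)   % inf-norm:             max(|3|, |-4|) = 4
```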
Example: draw the "circles" (unit balls) defined by the following equations: ||x||_1 = 1, ||x||_2 = 1, ||x||_∞ = 1
(Figure: the three unit balls plotted in the x-y plane, with both axes running from -1 to 1.)
Matrix Norm
A matrix norm is a function that assigns to each n×n matrix A a real number ||A|| such that: ||A|| ≥ 0 with ||A|| = 0 only for A = 0; ||αA|| = |α| ||A||; ||A + B|| ≤ ||A|| + ||B||; and ||AB|| ≤ ||A|| ||B||
Example: Frobenius norm (commonly used), ||A||_F = (Σ_i Σ_j |a_ij|^2)^(1/2)
Induced (operator) norm: ||A|| = max over x ≠ 0 of ||Ax|| / ||x||
Special case: induced p-norm or matrix p-norm, ||A||_p = max over x ≠ 0 of ||Ax||_p / ||x||_p
Theoretically important
Expensive to compute
Frobenius norm is NOT the matrix 2-norm (the matrix 2-norm equals the largest singular value, i.e., the square root of the largest eigenvalue of A^T A)
Theorem: for an induced norm, ||Ax|| ≤ ||A|| ||x|| for all x, and ||AB|| ≤ ||A|| ||B||
Examples: ||A||_1 = maximum absolute column sum; ||A||_∞ = maximum absolute row sum
Condition Number
Consider a linear system of equations and its perturbation: Ax = b and A(x + δx) = b + δb
Then, A δx = δb, or δx = A^-1 δb
Hence, ||δx|| ≤ ||A^-1|| ||δb||
Also, ||b|| = ||Ax|| ≤ ||A|| ||x||, so 1/||x|| ≤ ||A|| / ||b||
Combining equations: ||δx|| / ||x|| ≤ ||A|| ||A^-1|| ||δb|| / ||b||
Define the condition number as: κ(A) = ||A|| ||A^-1||
Using the induced matrix norm, κ(A) ≥ 1
Matrices with κ(A) of order 1000 or more are considered ill-conditioned
Numerical errors in solving Ax = b are amplified in the solution by the condition number
Estimation of the condition number: for symmetric matrices, divide the maximum eigenvalue magnitude by the minimum, κ(A) = |λ_max| / |λ_min|; for a general matrix the 2-norm condition number is the ratio of the largest to the smallest singular value, σ_max / σ_min
For singular matrices, κ(A) = ∞
Improvement of the condition number by scaling the equations is possible
Example: see the numerical sketch below
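A small illustration in Octave, using the ill-conditioned Hilbert matrix suggested in the exercises at the end of this lecture:

```octave
% Illustration of ill-conditioning using the Hilbert matrix (as in the exercises).
n = 10;
A = hilb(n);                      % notoriously ill-conditioned
x_true = rand(n, 1);
b = A * x_true;
x = A \ b;                        % solve the system numerically
cond(A)                           % condition number ~ 1e13: very ill-conditioned
norm(x - x_true) / norm(x_true)   % relative error amplified accordingly
```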
Roundoff Errors
Floating-point number representation: a number is stored as a mantissa (significand) and an exponent
Problems occur when adding numbers of very different
scales
If a computation results in a number that is too big to be
represented, an overflow is said to have occurred.
If a number that is nonzero but too small to be
represented is computed, an underflow results.
Machine epsilon: the smallest positive floating-point number ε such that fl(1 + ε) > 1 (homework to compute; see the illustration below)
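A few Octave commands illustrating these limits; the built-in constants realmax, realmin and eps are used here, while computing machine epsilon yourself remains the homework:

```octave
realmax * 2        % overflow:  result is Inf
realmin / 2^60     % underflow: too small to represent, rounds to 0
(1 + eps)   > 1    % true:  1 + eps is representable
(1 + eps/2) > 1    % false: 1 + eps/2 rounds back to 1
```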
Sensitivity Analysis
Using perturbation analysis, one can show how stable the solution is for a particular matrix A and machine precision ε.
The condition number describes the matrix only
Be careful with the choice of single vs. double precision: the time gained may end up causing major errors in the result!
Least-Squares Problem
To find an optimal solution to a linear system of equations Ax = b, where A does not have to be square, it is desired to minimize the 2-norm of the residual ||b - Ax||_2
n > m: overdetermined system (least-squares solution)
n < m: underdetermined system (minimum-norm solution)
Orthogonal Matrices
An orthogonal matrix has its inverse equal to its transpose: Q^-1 = Q^T (equivalently, Q^T Q = I)
Determinant = ±1
Condition number = 1 (ideal)
Orthogonal transformations preserve length
Orthogonal transformations preserve angle
Example: rotators and reflectors
QR Decomposition
Any n×n matrix A can be decomposed as a product QR, where Q is an orthogonal matrix and R is an upper triangular matrix
Solution of Ax = b is again straightforward:
QRx = b
Let Rx = y and solve Qy = b (solution is simply y = Q^T b)
Then solve the triangular system Rx = y as before
Advantage of the QR solution: excellent numerical stability
Overdetermined case (A is n×m with n > m): QR decomposition is still possible, with Q an n×n orthogonal matrix and R an n×m matrix whose top m×m block is upper triangular and whose remaining rows are zero (a least-squares solve using the economy-size factorization is sketched below)
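A possible Octave sketch of a QR-based least-squares solve, using the economy-size factorization qr(A, 0); the random test problem is an assumed example:

```octave
% Least-squares solution of an overdetermined system via QR decomposition.
A = rand(8, 3);              % n > m: overdetermined (assumed random example)
b = rand(8, 1);
[Q, R] = qr(A, 0);           % "economy" QR: Q is 8x3 with orthonormal columns, R is 3x3
x = R \ (Q' * b);            % solve the triangular system R*x = Q'*b
norm(x - A\b)                % agrees with the built-in least-squares solve
```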
Singular Value Decomposition (SVD)
Let A be an n×m nonzero matrix with rank r. Then A can be expressed as a product A = U Σ V^T
Where:
U is an n×n orthogonal matrix
V is an m×m orthogonal matrix
Σ is an n×m diagonal matrix of singular values, in the form Σ = diag(σ_1, …, σ_r, 0, …, 0) with σ_1 ≥ σ_2 ≥ … ≥ σ_r > 0
Solution of Least Squares Using SVD
The condition number can be shown to be equal to κ(A) = σ_max / σ_min = σ_1 / σ_r
In order to improve the condition number, we can solve the equation after replacing the smallest singular values by zero until the condition number is low enough
Regularization of the ill-conditioned problem
“Pseudo-inverse” or “Moore-Penrose generalized inverse”
Highest numerical stability of all methods, but O(n^3) flops (a truncated pseudo-inverse solve is sketched below)
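A possible Octave sketch of a truncated pseudo-inverse solve; the Hilbert test matrix and the truncation threshold are assumed choices:

```octave
% Least-squares solution via SVD, discarding singular values below a tolerance
% (truncated pseudo-inverse) to regularize an ill-conditioned problem.
A = hilb(8);  b = rand(8, 1);           % assumed ill-conditioned example
[U, S, V] = svd(A);
s = diag(S);
tol = 1e-8 * max(s);                    % assumed truncation threshold
r = sum(s > tol);                       % effective rank after truncation
x = V(:, 1:r) * ((U(:, 1:r)' * b) ./ s(1:r));
% Built-in equivalent: x = pinv(A, tol) * b
```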
Computational Complexity
Cholesky's algorithm applied to an n×n matrix performs about n^3/3 flops.
LU decomposition applied to an n×n matrix performs about 2n^3/3 flops.
Gaussian elimination applied to an n×n matrix performs about 2n^3/3 flops.
QR decomposition: 2nm^2 - 2m^3/3 flops
SVD: O(n^3) flops
All are still too high for some problems
Need to find other methods with lower complexity
Iterative Solution Methods
Much less computation: O(n^2) per iteration
Steepest descent based methods
Conjugate gradient based methods
Steepest Descent Methods
Computes the error (residual) r = b - Ax and tries to remove this error by taking a step in its direction (a sketch follows below)
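A minimal Octave sketch of steepest descent, assuming A is symmetric positive definite:

```octave
% Steepest descent for A*x = b with A symmetric positive definite.
% Each iteration moves along the residual r = b - A*x with the exact line-search step.
function x = steepest_descent(A, b, x, niter)
  for k = 1:niter
    r = b - A*x;                 % residual = negative gradient of 0.5*x'*A*x - b'*x
    alpha = (r'*r) / (r'*A*r);   % optimal step length along r
    x = x + alpha * r;
  end
end
```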
Conjugate Gradient (CG) Method
Suppose we want to solve the following system of linear equations: Ax = b, where the n-by-n matrix A is symmetric (i.e., A^T = A), positive definite (i.e., x^T A x > 0 for all non-zero vectors x in R^n), and real.
We say that two non-zero vectors u and v are conjugate (with respect to A), or A-orthogonal, if u^T A v = 0
(not related to the notion of complex conjugate)
Suppose that {p_k} is a sequence of n mutually conjugate directions. Then the p_k form a basis of R^n, so we can expand the solution x* of Ax = b in this basis: x* = α_1 p_1 + α_2 p_2 + … + α_n p_n
http://en.wikipedia.org/wiki/Conjugate_gradient_method
To compute the coefficients: α_k = (p_k^T b) / (p_k^T A p_k)
(since p_i and p_k are mutually conjugate for i ≠ k, the cross terms p_k^T A p_i vanish)
Can compute the exact solution in at most n iterations!
Removes the error in mutually conjugate (A-orthogonal) directions
Better performance compared to steepest descent
Iterative Conjugate Gradient Method
Need to solve systems where n is so large that direct methods take too much time
Careful choice of search directions
Start with the residual r_0 = b - Ax_0 as p_0
Takes only a few iterations to reach reasonable accuracy
Octave/Matlab code is available at http://en.wikipedia.org/wiki/Conjugate_gradient_method (a sketch is also given below)
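A compact Octave sketch of the CG iteration in the standard form (as on the Wikipedia page cited above), assuming A is symmetric positive definite:

```octave
% Conjugate gradient iteration for A*x = b (A symmetric positive definite).
function x = conjgrad(A, b, x, niter)
  r = b - A*x;                   % initial residual
  p = r;                         % first search direction p0 = r0
  for k = 1:niter
    Ap    = A*p;
    alpha = (r'*r) / (p'*Ap);    % step length
    x     = x + alpha*p;
    rnew  = r - alpha*Ap;        % updated residual
    beta  = (rnew'*rnew) / (r'*r);
    p     = rnew + beta*p;       % next A-conjugate direction
    r     = rnew;
  end
end
```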
Vector Calculus
Let x and y be general vectors of orders n and m respectively:
Define the derivative ∂y/∂x as the matrix of partial derivatives ∂y_i/∂x_j
Special cases: when x or y is a scalar, the derivative reduces to an ordinary gradient vector or scalar derivative
Other important derivatives: ∂(b^T x)/∂x = b and ∂(x^T A x)/∂x = (A + A^T) x (= 2Ax when A is symmetric); a numerical check is sketched below
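As a quick sanity check relevant to the last exercise, the gradient identity above can be verified numerically in Octave by finite differences (the step size h is an assumed choice):

```octave
% Numerical check of the identity  d(x'*A*x)/dx = (A + A')*x  by central differences.
n = 4;  A = rand(n);  x = rand(n, 1);  h = 1e-6;
g = zeros(n, 1);
for i = 1:n
  e = zeros(n, 1);  e(i) = h;
  g(i) = ((x+e)'*A*(x+e) - (x-e)'*A*(x-e)) / (2*h);   % central difference
end
norm(g - (A + A')*x)          % small: matches the analytic gradient
```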
Exercise
Write a program to compute Machine Epsilon and report your results.
Look for Octave/Matlab functions that implement the topics discussed in this
lecture and provide a list of them.
Modify the conjugate gradient method described in this lecture to allow using
a general real matrix A that is not symmetric or positive definite.
Implement code for Gaussian elimination, steepest descent and conjugate
gradient methods and compare results (time and accuracy) to SVD based
solution (pseudo-inverse). Use only a few iterations for iterative methods.
Use vector calculus to show that the solution of Ax=b for symmetric positive definite A
minimizes the objective function: f(x) = (1/2) x^T A x - b^T x
In all of the above problems involving a linear system solution, use the Hilbert matrix as your A matrix. Use a random x vector, compute b = Ax, then use A and b to compute an estimate of the x vector and compare it to the original in order to test your system.
You can use available Octave/Matlab functions and write your own code for parts that are not available.