Introduction to Computational Mathematics
Xin-She Yang, World Scientific (2008)
Chapter 1

Mathematical Foundations

1.1 Big-O Notations

In the description of algorithmic complexity, we often have to use order notations, typically in terms of big O and small o. Loosely speaking, for two functions f(x) and g(x), if

lim_{x→x0} f(x)/g(x) = K,    (1.1)

where K is a finite, non-zero limit, we write

f = O(g).    (1.2)
The big O notation means that f is asymptotically equivalent to the order of g(x). If the limit is unity, or K = 1, we say f(x) is of the order of g(x). In this special case, we write

f ∼ g,    (1.3)

which is equivalent to f/g → 1 and g/f → 1 as x → x0. Obviously, x0 can be any value, including 0 and ∞. The notation ∼ does not necessarily mean ≈ in general, though it might give the same results, especially in the case when x → 0. For example, sin x ∼ x and sin x ≈ x if x → 0.
When we say f is order of 100 (or f ∼ 100), this does not mean f ≈ 100,
but it can mean that f could be between about 50 and 150. The small o
notation is often used if the limit tends to 0. That is,

lim_{x→x0} f/g = 0,    (1.4)

or

f = o(g).    (1.5)

If g > 0, f = o(g) is equivalent to f ≪ g. For example, for ∀x ∈ ℜ, we have

e^x ≈ 1 + x + O(x²) ≈ 1 + x + x²/2 + o(x).

Example 1.1: A classic example is Stirling's asymptotic series for factorials

n! ∼ √(2πn) (n/e)^n (1 + 1/(12n) + 1/(288n²) − 139/(51840n³) − ...),
which can demonstrate the fundamental difference between asymptotic series and the standard approximate expansions. For standard power expansions, the error R_k(h^k) → 0, but for an asymptotic series, the error of the truncated series R_k decreases compared with the leading term [here √(2πn)(n/e)^n]. However, R_n does not necessarily tend to zero. In fact,

R2 = (1/(12n)) √(2πn) (n/e)^n,

is still very large as R2 → ∞ if n ≫ 1. For example, for n = 100, we have n! = 9.3326 × 10^157, while the leading approximation is √(2πn)(n/e)^n = 9.3248 × 10^157. The difference between these two values is 7.7740 × 10^154, which is still very large, though three orders smaller than the leading approximation.
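These figures are easy to reproduce numerically. The following is a minimal sketch (the helper names are ours), working in log space to avoid overflow:

    import math

    def stirling_leading(n: int) -> float:
        # log of the leading term sqrt(2*pi*n) * (n/e)^n of Stirling's series
        return 0.5 * math.log(2 * math.pi * n) + n * (math.log(n) - 1)

    def mantissa(log_value: float) -> float:
        # value / 10^157, to compare with the figures quoted above
        return math.exp(log_value - 157 * math.log(10))

    n = 100
    print(mantissa(math.lgamma(n + 1)))   # 9.3326... -> n! = 9.3326e157
    print(mantissa(stirling_leading(n)))  # 9.3248... -> leading term = 9.3248e157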

1.2 Vector and Vector Calculus

Vector analysis is an important part of computational mathematics. Many


quantities such as force, velocity, and deformation in sciences are vectors
which have both a magnitude and a direction. Vectors are a special class of
matrices. In this chapter, we will briefly review the basic concepts in linear
algebra.
A vector u is an ordered set of numbers u = (u1, u2, ..., un), where its components ui (i = 1, ..., n) ∈ ℜ are real numbers. All these vectors form an n-dimensional vector space Vⁿ. A simple example is the position vector p = (x, y, z), where x, y, z are the 3-D Cartesian coordinates.
To add any two vectors u = (u1 , ..., un ) and v = (v1 , ..., vn ), we simply
add their corresponding components,
u + v = (u1 + v1, u2 + v2, ..., un + vn),    (1.6)

and the sum is also a vector. The addition of vectors has commutativity (u + v = v + u) and associativity [(u + v) + w = u + (v + w)]. This is because each of the components is obtained by simple addition, which means it has the same properties.
Fig. 1.1 Addition of vectors: (a) parallelogram a = u + v; (b) vector polygon b = u + v + w.

The zero vector 0 is a special case where all its components are zeros. The multiplication of a vector u with a scalar or constant α ∈ ℜ is carried out by the multiplication of each component,

αu = (αu1, αu2, ..., αun).    (1.7)

Thus, we have

−u = (−u1, −u2, ..., −un).    (1.8)
The dot product or inner product of two vectors x and y is defined as

x · y = x1 y1 + x2 y2 + ... + xn yn = Σ_{i=1}^{n} xi yi,    (1.9)

which is a real number. The length or norm of a vector x is the square root of the dot product of the vector with itself,

|x| = ‖x‖ = √(x · x) = √(Σ_{i=1}^{n} xi²).    (1.10)

When ‖x‖ = 1, it is a unit vector. It is straightforward to check that the dot product has the following properties:

u · v = v · u,  u · (v + w) = u · v + u · w,    (1.11)

and

(αu) · (βv) = (αβ) u · v,    (1.12)

where α, β ∈ ℜ are constants.


The angle θ between two vectors u and v can be calculated using their dot product

u · v = ‖u‖ ‖v‖ cos(θ),  0 ≤ θ ≤ π,    (1.13)

which leads to

cos(θ) = (u · v)/(‖u‖ ‖v‖).    (1.14)

If the dot product of these two vectors is zero or cos(θ) = 0 (i.e., θ = π/2), then we say that these two vectors are orthogonal. Since |cos(θ)| ≤ 1, we get the useful Cauchy-Schwarz inequality:

|u · v| ≤ ‖u‖ ‖v‖.    (1.15)
The dot product of two vectors is a scalar or a number. On the other hand, the cross product or outer product of two vectors is a new vector

u = x × y = | i   j   k  |
            | x1  x2  x3 |
            | y1  y2  y3 |

          = | x2  x3 | i + | x3  x1 | j + | x1  x2 | k
            | y2  y3 |     | y3  y1 |     | y1  y2 |

          = (x2 y3 − x3 y2) i + (x3 y1 − x1 y3) j + (x1 y2 − x2 y1) k.    (1.16)


In fact, the norm ‖x × y‖ is the area of the parallelogram formed by x and y. We have

‖x × y‖ = ‖x‖ ‖y‖ sin θ,    (1.17)
where θ is the angle between the two vectors. In addition, the vector
u = x × y is perpendicular to both x and y, following a right-hand rule.
It is straightforward to check that the cross product has the following
properties:
x × y = −y × x, (x + y) × z = x × z + y × z, (1.18)
and
(αx) × (βy) = (αβ) x × y,  α, β ∈ ℜ.    (1.19)
A very special case is u × u = 0. For unit vectors, we have
i × j = k,  j × k = i,  k × i = j.    (1.20)

Example 1.2: For two 3-D vectors u = (4, 5, −6) and v = (2, −2, 1/2), their dot product is

u · v = 4 × 2 + 5 × (−2) + (−6) × 1/2 = −5.


As their moduli are

‖u‖ = √(4² + 5² + (−6)²) = √77,

‖v‖ = √(2² + (−2)² + (1/2)²) = √33/2,

we can calculate the angle θ between the two vectors. We have

cos θ = (u · v)/(‖u‖ ‖v‖) = −5/(√77 × √33/2) = −10/(11√21),

or

θ = cos⁻¹(−10/(11√21)) ≈ 101.4°.
Their cross product is
w =u×v

= (5 × 1/2 − (−2) × (−6), (−6) × 2 − 4 × 1/2, 4 × (−2) − 5 × 2)

= (−19/2, −14, −18).


Similarly, we have
v × u = (19/2, 14, 18) = −u × v.
The norm of the cross product is

‖w‖ = √((−19/2)² + (−14)² + (−18)²) ≈ 24.70,

while

‖u‖ ‖v‖ sin θ = √77 × (√33/2) × sin(101.4°) ≈ 24.70 = ‖w‖.
It is easy to verify that
u · w = 4 × (−19/2) + 5 × (−14) + (−6) × (−18) = 0,
and
v · w = 2 × (−19/2) + (−2) × (−14) + 1/2 × (−18) = 0.
Indeed, the vector w is perpendicular to both u and v.
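The computations in this example can be reproduced with numpy (a minimal sketch):

    import numpy as np

    u = np.array([4.0, 5.0, -6.0])
    v = np.array([2.0, -2.0, 0.5])

    print(np.dot(u, v))                     # -5.0
    w = np.cross(u, v)
    print(w)                                # [-9.5, -14.0, -18.0]
    cos_theta = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    print(np.degrees(np.arccos(cos_theta))) # ~101.4 degrees
    print(np.dot(u, w), np.dot(v, w))       # both 0.0: w is perpendicular to u and v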
Any vector v in an n-dimensional vector space Vⁿ can be written as a combination of a set of n independent basis vectors or orthogonal spanning vectors e1, e2, ..., en, so that

v = v1 e1 + v2 e2 + ... + vn en = Σ_{i=1}^{n} vi ei,    (1.21)

where the coefficients/scalars v1 , v2 , ..., vn are the components of v relative


to the basis e1 , e2 , ..., en . The most common basis vectors are the orthog-
onal unit vectors. In a three-dimensional case, they are i = (1, 0, 0), j =
(0, 1, 0), k = (0, 0, 1) for x-, y-, z-axis, respectively. Thus, we have
x = x1 i + x2 j + x3 k. (1.22)
The three unit vectors satisfy i · j = j · k = k · i = 0.
Two non-zero vectors u and v are said to be linearly independent if αu + βv = 0 implies that α = β = 0. If α, β are not both zero, then these two vectors are linearly dependent. Two linearly dependent vectors are parallel (u ∥ v) to each other. Similarly, any three linearly dependent vectors u, v, w lie in the same plane.
The differentiation of a vector is carried out component by component, treating each component as the usual differentiation of a scalar. Thus, from a position vector

P(t) = x(t)i + y(t)j + z(t)k,    (1.23)

we can write its velocity as

v = dP/dt = ẋ(t)i + ẏ(t)j + ż(t)k,    (1.24)

and acceleration as

a = d²P/dt² = ẍ(t)i + ÿ(t)j + z̈(t)k,    (1.25)

where ˙ = d/dt. Conversely, the integral of v is

P = ∫_0^t v dt + p0,    (1.26)

where p0 is a vector constant or the initial position at t = 0.
From the basic definition of differentiation, it is easy to check that the
differentiation of vectors has the following properties:
d(αa)/dt = α da/dt,  d(a · b)/dt = (da/dt) · b + a · (db/dt),    (1.27)

and

d(a × b)/dt = (da/dt) × b + a × (db/dt).    (1.28)

Three important operators commonly used in vector analysis, especially in the formulation of mathematical models, are the gradient operator (grad or ∇), the divergence operator (div or ∇·) and the curl operator (curl or ∇×).

Sometimes, it is useful to calculate the directional derivative of φ at a point (x, y, z) in the direction of n:

∂φ/∂n = n · ∇φ = (∂φ/∂x) cos(α) + (∂φ/∂y) cos(β) + (∂φ/∂z) cos(γ),    (1.29)

where n = (cos α, cos β, cos γ) is a unit vector and α, β, γ are the directional angles. Generally speaking, the gradient of any scalar function φ of x, y, z can be written in a similar way,

grad φ = ∇φ = (∂φ/∂x) i + (∂φ/∂y) j + (∂φ/∂z) k.    (1.30)

This is the same as the application of the del operator ∇ to the scalar function φ, where

∇ = (∂/∂x) i + (∂/∂y) j + (∂/∂z) k.    (1.31)
The application of the gradient operator to a scalar field gives a vector field.
As the gradient operator is a linear operator, it is straightforward to
check that it has the following properties:
∇(αψ + βφ) = α∇ψ + β∇φ, ∇(ψφ) = ψ∇φ + φ∇ψ, (1.32)
where α, β are constants and ψ, φ are scalar functions.
For a vector field

u(x, y, z) = u1(x, y, z)i + u2(x, y, z)j + u3(x, y, z)k,    (1.33)

the application of the operator ∇ can lead to either a scalar field or a vector field, depending on how the del operator is applied to the vector field. The divergence of a vector field is the dot product of the del operator ∇ and u,

div u ≡ ∇ · u = ∂u1/∂x + ∂u2/∂y + ∂u3/∂z,    (1.34)

and the curl of u is the cross product of the del operator and the vector field u,

curl u ≡ ∇ × u = | i     j     k    |
                 | ∂/∂x  ∂/∂y  ∂/∂z |
                 | u1    u2    u3   |.    (1.35)
One of the most commonly used operators in engineering and science is the Laplacian operator

∇²φ = ∇ · (∇φ) = ∂²φ/∂x² + ∂²φ/∂y² + ∂²φ/∂z²,    (1.36)

for Laplace’s equation


Δφ ≡ ∇²φ = 0.    (1.37)
Some important theorems are often rewritten in terms of the above
three operators, especially in fluid dynamics and finite element analysis.
For example, Gauss’s theorem connects the integral of divergence with the
related surface integral
∫∫∫_Ω (∇ · Q) dΩ = ∫∫_S Q · n dS.    (1.38)
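These operators are easy to check symbolically, for instance with sympy; the scalar field φ and the vector field u below are arbitrary choices for illustration:

    from sympy import symbols, diff

    x, y, z = symbols('x y z')
    phi = x**2 * y * z                                   # an arbitrary scalar field

    grad_phi = [diff(phi, s) for s in (x, y, z)]         # gradient, Eq. (1.30)
    laplacian = sum(diff(phi, s, 2) for s in (x, y, z))  # Laplacian, Eq. (1.36)

    u1, u2, u3 = x*y, y*z, z*x                           # an arbitrary vector field
    div_u = diff(u1, x) + diff(u2, y) + diff(u3, z)      # divergence, Eq. (1.34)
    curl_u = [diff(u3, y) - diff(u2, z),                 # curl, Eq. (1.35)
              diff(u1, z) - diff(u3, x),
              diff(u2, x) - diff(u1, y)]
    print(grad_phi, laplacian, div_u, curl_u)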

1.3 Matrices and Matrix Decomposition

Matrices are widely used in computational mathematics, especially in the


implementation of many algorithms. A matrix is a table or array of numbers
or functions arranged in rows and columns. The elements or entries of a
matrix A are often denoted as aij . For a matrix A with m rows and n
columns,
 
A ≡ [aij] = [ a11  a12  ...  a1j  ...  a1n
              a21  a22  ...  a2j  ...  a2n
              ...
              am1  am2  ...  amj  ...  amn ],    (1.39)
we say the size of A is m by n, or m × n. A is a square matrix if m = n.
For example,
   
A = [ 1  2  3
      4  5  6 ],   B = [ e^x       sin x
                         −i cos x  e^{iθ} ],    (1.40)

and

u = [ u
      v
      w ],    (1.41)
where A is a 2 × 3 matrix, B is a 2 × 2 square matrix, and u is a 3 × 1 column matrix or column vector.

The sum of two matrices A and B is only possible if they have the same size m × n, and their sum, which is also m × n, is obtained by adding their corresponding entries,

C = A + B,  cij = aij + bij,    (1.42)


where (i = 1, 2, ..., m; j = 1, 2, ..., n). The product of a matrix A with a


scalar α ∈ ℜ is obtained by multiplying each entry by α. The product of
two matrices is only possible if the number of columns of A is the same as
the number of rows of B. That is to say, if A is m × n and B is n × r, then
the product C is m × r,
cij = (AB)ij = Σ_{k=1}^{n} aik bkj.    (1.43)

If A is a square matrix, then we have Aⁿ = AA...A (n factors). The multiplications of matrices are generally not commutative, i.e., AB ≠ BA. However, the multiplication has associativity A(uv) = (Au)v and distributivity A(u + v) = Au + Av.

The transpose A^T of A is obtained by switching the position of rows and columns, and thus A^T will be n × m if A is m × n, with (a^T)ij = aji, (i = 1, 2, ..., m; j = 1, 2, ..., n). Generally,

(A^T)^T = A,  (AB)^T = B^T A^T.    (1.44)
The differentiation and integral of a matrix are carried out over each of its members or elements. For example, for a 2 × 2 matrix,

dA/dt = Ȧ = [ da11/dt  da12/dt
              da21/dt  da22/dt ],    (1.45)

and

∫ A dt = [ ∫ a11 dt  ∫ a12 dt
           ∫ a21 dt  ∫ a22 dt ].    (1.46)
A diagonal matrix A is a square matrix whose every entry off the main diagonal is zero (aij = 0 if i ≠ j). Its diagonal elements or entries may or may not be zero. In general, it can be written as

D = [ d1  0   ...  0
      0   d2  ...  0
      ...
      0   0   ...  dn ].    (1.47)
For example, the matrix

I = [ 1  0  0
      0  1  0
      0  0  1 ]    (1.48)

is a 3 × 3 identity or unit matrix. In general, we have

AI = IA = A. (1.49)

A zero or null matrix 0 is a matrix with all of its elements being zero.
There are three important matrices: lower (upper) triangular matrix,
tridiagonal matrix, and augmented matrix, and they are important in the
solution of linear equations. A tridiagonal matrix often arises naturally
from the finite difference and finite volume discretization of partial differ-
ential equations, and it can in general be written as
 
Q = [ b1  c1  0   0   ...  0   0
      a2  b2  c2  0   ...  0   0
      0   a3  b3  c3  ...  0   0
      ...
      0   0   0   0   ...  an  bn ].    (1.50)
An augmented matrix is formed by two matrices with the same number of rows. For example, the following system of linear equations

a11 u1 + a12 u2 + a13 u3 = b1 ,

a21 u1 + a22 u2 + a23 u3 = b2 ,

a31 u1 + a32 u2 + a33 u3 = b3 , (1.51)

can be written in a compact form in terms of matrices,

[ a11  a12  a13     [ u1       [ b1
  a21  a22  a23   ×   u2   =     b2
  a31  a32  a33 ]     u3 ]       b3 ],    (1.52)
or

Au = b. (1.53)

This can in turn be written as the following augmented form


 
[A|b] = [ a11  a12  a13 | b1
          a21  a22  a23 | b2
          a31  a32  a33 | b3 ].    (1.54)
The augmented form is widely used in Gauss-Jordan elimination and linear
programming.


A lower (upper) triangular matrix is a square matrix with all the ele-
ments above (below) the diagonal entries being zeros. In general, a lower
triangular matrix can be written as
 
L = [ l11  0    ...  0
      l21  l22  ...  0
      ...
      ln1  ln2  ...  lnn ],    (1.55)

while the upper triangular matrix can be written as

U = [ u11  u12  ...  u1n
      0    u22  ...  u2n
      ...
      0    0    ...  unn ].    (1.56)
Any n × n square matrix A = [aij] can be decomposed or factorized as a product of an L and a U, that is,

A = LU,    (1.57)

though such a decomposition is not unique, because we have n² + n unknowns: n(n + 1)/2 coefficients lij and n(n + 1)/2 coefficients uij, but we can only provide n² equations from the coefficients aij. Thus, there are n free parameters. The uniqueness of the decomposition is often achieved by imposing either lii = 1 or uii = 1 for i = 1, 2, ..., n.
Other LU variants include the LDU and LUP decompositions. An LDU
decomposition can be written as
A = LDU , (1.58)
where L and U are lower and upper matrices with all the diagonal entries
being unity, and D is a diagonal matrix. On the other hand, the LUP
decomposition can be expressed as
A = LU P , or A = P LU , (1.59)
where P is a permutation matrix, which is a square matrix that has exactly one entry 1 in each column and each row, with 0's elsewhere. However, most numerical libraries and software use the following LUP decomposition

P A = LU,    (1.60)

which makes it easier to decompose some matrices. However, the requirement for LU decompositions is relatively strict. An invertible matrix A


has an LU decomposition provided that the determinants of all its diagonal


minors or leading submatrices are not zeros.
A simpler way of decomposing a square matrix A for solving a system
of linear equations is to write
A = D + L + U, (1.61)
where D is a diagonal matrix, while L and U are the strictly lower and upper triangular matrices without diagonal elements. This decomposition is much simpler to implement than the LU decomposition because no multiplication is involved.

Example 1.3: The following 3 × 3 matrix


 
A = [ 2   1   5
      4  −4   5
      5   2  −5 ],

can be decomposed as A = LU. That is,

A = [ 1    0    0      [ u11  u12  u13
      l21  1    0    ×   0    u22  u23
      l31  l32  1 ]      0    0    u33 ],

which becomes

A = [ u11      u12                u13
      l21 u11  l21 u12 + u22      l21 u13 + u23
      l31 u11  l31 u12 + l32 u22  l31 u13 + l32 u23 + u33 ]

  = [ 2   1   5
      4  −4   5
      5   2  −5 ].

This leads to u11 = 2, u12 = 1 and u13 = 5. As l21 u11 = 4, we have l21 = 4/u11 = 2. Similarly, l31 = 5/2. From l21 u12 + u22 = −4, we have u22 = −4 − 2 × 1 = −6. From l21 u13 + u23 = 5, we have u23 = 5 − 2 × 5 = −5. Using l31 u12 + l32 u22 = 2, or 5/2 × 1 + l32 × (−6) = 2, we get l32 = 1/12. Finally, l31 u13 + l32 u23 + u33 = −5 gives u33 = −5 − 5/2 × 5 − 1/12 × (−5) = −205/12. Therefore, we now have

[ 2   1   5       [ 1    0     0      [ 2   1    5
  4  −4   5    =    2    1     0    ×   0  −6   −5
  5   2  −5 ]       5/2  1/12  1 ]      0   0   −205/12 ].
The D + L + U decomposition can be written as

A = D + L + U = [ 2   0   0      [ 0  0  0      [ 0  1  5
                  0  −4   0    +   4  0  0    +   0  0  5
                  0   0  −5 ]      5  2  0 ]      0  0  0 ].
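The hand computation above can be automated; the scheme with unit diagonal in L is known as Doolittle's method. A minimal sketch (the function name is ours):

    import numpy as np

    def lu_doolittle(A):
        # Doolittle factorization A = L U with unit diagonal in L, no pivoting
        n = A.shape[0]
        L, U = np.eye(n), np.zeros((n, n))
        for i in range(n):
            for j in range(i, n):                      # row i of U
                U[i, j] = A[i, j] - L[i, :i] @ U[:i, j]
            for j in range(i + 1, n):                  # column i of L
                L[j, i] = (A[j, i] - L[j, :i] @ U[:i, i]) / U[i, i]
        return L, U

    A = np.array([[2.0, 1, 5], [4, -4, 5], [5, 2, -5]])
    L, U = lu_doolittle(A)
    print(L)                       # [[1, 0, 0], [2, 1, 0], [2.5, 0.0833..., 1]]
    print(U)                       # [[2, 1, 5], [0, -6, -5], [0, 0, -17.0833...]]
    print(np.allclose(L @ U, A))   # True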


1.4 Determinant and Inverse

The determinant of a square matrix A is a number or scalar obtained by the following recursive formula, the cofactor or Laplace expansion by column or row. For example, expanding by row k, we have

det(A) = |A| = Σ_{j=1}^{n} (−1)^{k+j} akj Mkj,    (1.62)

where Mij is the determinant of the minor matrix of A obtained by deleting row i and column j. For a simple 2 × 2 matrix, its determinant simply becomes

| a11  a12 |
| a21  a22 | = a11 a22 − a12 a21.    (1.63)
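As a direct (and deliberately naive, O(n!)) illustration of the cofactor expansion, the following sketch expands along the first row; the test matrix is the one used in Example 1.4 below:

    import numpy as np

    def det_laplace(A):
        # determinant via cofactor (Laplace) expansion along row 0; for illustration only
        n = A.shape[0]
        if n == 1:
            return A[0, 0]
        total = 0.0
        for j in range(n):
            minor = np.delete(np.delete(A, 0, axis=0), j, axis=1)  # delete row 0, column j
            total += (-1) ** j * A[0, j] * det_laplace(minor)
        return total

    A = np.array([[4.0, 5, 0], [-2, 2, 5], [2, -3, 1]])
    print(det_laplace(A), np.linalg.det(A))   # both give 128 (up to rounding)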
The determinants of matrices have the following properties:

|αA| = αⁿ|A|,  |A^T| = |A|,  |AB| = |A||B|,    (1.64)

where A and B have the same size (n × n) and α ∈ ℜ.
An n × n square matrix is singular if |A| = 0, and is nonsingular if and only if |A| ≠ 0. The trace of a square matrix tr(A) is defined as the sum of the diagonal elements,

tr(A) = Σ_{i=1}^{n} aii = a11 + a22 + ... + ann.    (1.65)
The rank of an m × n matrix A is the number of linearly independent row (or column) vectors forming the matrix. Generally speaking, the rank of A satisfies

rank(A) ≤ min(m, n).    (1.66)

For an n × n square matrix A, it is nonsingular if rank(A) = n.
The inverse matrix A⁻¹ of a square matrix A is defined as

A⁻¹A = AA⁻¹ = I.    (1.67)

More generally,

A_l⁻¹ A = A A_r⁻¹ = I,    (1.68)

where A_l⁻¹ is the left inverse while A_r⁻¹ is the right inverse. If A_l⁻¹ = A_r⁻¹, we say that the matrix A is invertible and its inverse is simply denoted by A⁻¹. It is worth noting that the unit matrix I has the same size as A.
The inverse of a square matrix exists if and only if A is nonsingular, or det(A) ≠ 0. From the basic definitions, it is straightforward to prove that the inverse of a matrix has the following properties:

(A⁻¹)⁻¹ = A,  (A^T)⁻¹ = (A⁻¹)^T,    (1.69)

and

(AB)⁻¹ = B⁻¹A⁻¹.    (1.70)
The inverse of a lower (upper) triangular matrix is also a lower (upper)
triangular matrix. The inverse of a diagonal matrix
 
D = [ d1  0   ...  0
      0   d2  ...  0
      ...
      0   0   ...  dn ],    (1.71)

can simply be written as

D⁻¹ = [ 1/d1  0     ...  0
        0     1/d2  ...  0
        ...
        0     0     ...  1/dn ],    (1.72)

where di ≠ 0. If any of these elements di is zero, then the diagonal matrix is not invertible as it becomes singular. For a 2 × 2 matrix, its inverse is simply

A = [ a  b
      c  d ],   A⁻¹ = (1/(ad − bc)) [ d   −b
                                      −c   a ].    (1.73)

Example 1.4: For two matrices,


   
A = [ 4   5  0
     −2   2  5
      2  −3  1 ],   B = [ 2   3
                          0  −2
                          5   2 ],

their transpose matrices are

A^T = [ 4  −2   2
        5   2  −3
        0   5   1 ],   B^T = [ 2   0  5
                               3  −2  2 ].

Let D = AB be their product; we have

AB = D = [ D11  D12
           D21  D22
           D31  D32 ].
The first two entries are

D11 = Σ_{j=1}^{3} A1j Bj1 = 4 × 2 + 5 × 0 + 0 × 5 = 8,

and

D12 = Σ_{j=1}^{3} A1j Bj2 = 4 × 3 + 5 × (−2) + 0 × 2 = 2.

Similarly, the other entries are:

D21 = 21,  D22 = 0,  D31 = 9,  D32 = 14.
Therefore, we get

AB = [ 4   5  0      [ 2   3          [ 8    2
      −2   2  5    ×   0  −2    = D =   21   0
       2  −3  1 ]      5   2 ]          9   14 ].

However, the product BA does not exist, though

B^T A^T = [ 8  21   9
            2   0  14 ] = D^T = (AB)^T.

The inverse of A is

A⁻¹ = (1/128) [ 17  −5   25
                12   4  −20
                 2  22   18 ],

and the determinant of A is

det(A) = 128.

It is straightforward to verify that

AA⁻¹ = A⁻¹A = I = [ 1  0  0
                    0  1  0
                    0  0  1 ].
For example, the first entry is obtained by

Σ_{j=1}^{3} A1j (A⁻¹)j1 = 4 × 17/128 + 5 × 12/128 + 0 × 2/128 = 1.

Other entries can be verified similarly. Finally, the trace of A is


tr(A) = A11 + A22 + A33 = 4 + 2 + 1 = 7.
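All of these results can be verified with a few lines of numpy (a minimal sketch):

    import numpy as np

    A = np.array([[4.0, 5, 0], [-2, 2, 5], [2, -3, 1]])
    B = np.array([[2.0, 3], [0, -2], [5, 2]])

    print(A @ B)                   # [[8, 2], [21, 0], [9, 14]]
    print(np.linalg.det(A))        # 128.0 (up to rounding)
    print(np.linalg.inv(A) * 128)  # [[17, -5, 25], [12, 4, -20], [2, 22, 18]]
    print(np.trace(A))             # 7.0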
The algorithmic complexity of most algorithms for obtaining the inverse of a general square matrix is O(n³). That is why most modern algorithms try to avoid the direct inverse of a large matrix. The solution of a large matrix system is instead carried out either by partial inverse via decomposition or by iteration (or a combination of these two methods). If the matrix can be decomposed into triangular matrices either by LU factorization or by direct decomposition, the aim is then to invert a triangular matrix, which is simpler and more efficient.
For a triangular matrix, the inverse can be obtained using algorithms
of O(n2 ) complexity. Similarly, the solution of a linear system with a lower
(upper) triangular matrix A can be obtained by forward (back) substitu-
tions. In general, for a lower triangular matrix
 
A = [ α11  0    ...  0
      α21  α22  ...  0
      ...
      αn1  αn2  ...  αnn ],    (1.74)

the forward substitutions for the system Au = b can be carried out as follows:

u1 = b1/α11,

u2 = (1/α22)(b2 − α21 u1),

...

ui = (1/αii)(bi − Σ_{j=1}^{i−1} αij uj),    (1.75)
where i = 2, ..., n. We see that it takes 1 division to get u1, 3 floating-point operations to get u2, and (2i − 1) to get ui. So the total algorithmic complexity is O(1 + 3 + ... + (2n − 1)) = O(n²). Similar arguments apply to upper triangular systems.
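A sketch of forward substitution following (1.75); the function name and the test system are our own:

    import numpy as np

    def forward_substitution(A, b):
        # solve A u = b for lower triangular A in O(n^2) operations
        n = len(b)
        u = np.zeros(n)
        for i in range(n):
            u[i] = (b[i] - A[i, :i] @ u[:i]) / A[i, i]
        return u

    A = np.array([[2.0, 0, 0], [1, 3, 0], [4, -1, 5]])
    b = np.array([2.0, 5, 8])
    print(forward_substitution(A, b))   # matches np.linalg.solve(A, b)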
The inverse A⁻¹ of a lower triangular matrix can in general be written as

A⁻¹ = [ β11  0    ...  0
        β21  β22  ...  0
        ...
        βn1  βn2  ...  βnn ] = B = [B1 B2 ... Bn],    (1.76)

where Bj is the j-th column vector of B. The inverse must satisfy AA⁻¹ = I, or

A [B1 B2 ... Bn] = I = [e1 e2 ... en],    (1.77)

where ej is the j-th unit vector of size n with the j-th element being 1 and all other elements being zero. That is, ej = (0 0 ... 1 0 ... 0)^T. In order to obtain B, we have to solve n linear systems

A B1 = e1,  A B2 = e2,  ...,  A Bn = en.    (1.78)

As A is a lower triangular matrix, the solution of A Bj = ej can easily be obtained by the direct forward substitutions discussed earlier in this section.
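Following (1.77)-(1.78), the inverse can be assembled column by column; a sketch using scipy's triangular solver (the function name is ours):

    import numpy as np
    from scipy.linalg import solve_triangular

    def lower_triangular_inverse(A):
        # solve A B_j = e_j for each column of B = A^{-1}, as in Eq. (1.78)
        n = A.shape[0]
        I = np.eye(n)
        cols = [solve_triangular(A, I[:, j], lower=True) for j in range(n)]
        return np.column_stack(cols)

    A = np.array([[2.0, 0, 0], [1, 3, 0], [4, -1, 5]])
    print(np.allclose(lower_triangular_inverse(A) @ A, np.eye(3)))   # True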


1.5 Matrix Exponential

Sometimes, we need to calculate exp[A], where A is a square matrix. In


this case, we have to deal with matrix exponentials. The exponential of a
square matrix A is defined as

e^A ≡ Σ_{n=0}^{∞} (1/n!) A^n = I + A + (1/2)A² + ...,    (1.79)

where I is an identity matrix with the same size as A, and A² = AA and so on. This (rather odd) definition in fact provides a method of calculating the matrix exponential. The matrix exponentials are very useful in solving systems of differential equations.
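Definition (1.79) can be turned directly into a (crude) numerical method by truncating the series; a minimal sketch, adequate only when the norm of A is not too large:

    import numpy as np

    def expm_series(A, terms=30):
        # matrix exponential via the truncated power series (1.79)
        E = np.eye(A.shape[0])
        term = np.eye(A.shape[0])
        for n in range(1, terms):
            term = term @ A / n    # builds A^n / n! incrementally
            E = E + term
        return E

    A = np.diag([1.0, 2.0])
    print(expm_series(A))          # approximately diag(e, e^2)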

Example 1.5: For a simple matrix

A = [ t  0
      0  t ],

its exponential is simply

e^A = [ e^t  0
        0    e^t ].

For a more complicated matrix

B = [ t  a
      a  t ],

we have

e^B = [ (e^{t+a} + e^{t−a})/2   (e^{t+a} − e^{t−a})/2
        (e^{t+a} − e^{t−a})/2   (e^{t+a} + e^{t−a})/2 ].

As you can see, it is quite complicated but still straightforward to calculate the matrix exponentials. Fortunately, it can easily be done using a computer. By using the power expansions and the basic definition, we can prove the following useful identities:

e^{tA} ≡ Σ_{n=0}^{∞} (1/n!)(tA)^n = I + tA + (t²/2)A² + ...,    (1.80)

ln(I + A) ≡ Σ_{n=1}^{∞} ((−1)^{n−1}/n) A^n = A − (1/2)A² + (1/3)A³ − ...,    (1.81)

e^A e^B = e^{A+B}  (if AB = BA),    (1.82)

(d/dt) e^{tA} = A e^{tA} = e^{tA} A,    (1.83)

(e^A)⁻¹ = e^{−A},  det(e^A) = e^{tr A}.    (1.84)
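In practice one would rather use a library routine such as scipy.linalg.expm; here it checks the closed form for e^B from Example 1.5 (the values t = 1, a = 2 are arbitrary):

    import numpy as np
    from scipy.linalg import expm

    t, a = 1.0, 2.0
    B = np.array([[t, a], [a, t]])
    closed = 0.5 * np.array([[np.exp(t + a) + np.exp(t - a), np.exp(t + a) - np.exp(t - a)],
                             [np.exp(t + a) - np.exp(t - a), np.exp(t + a) + np.exp(t - a)]])
    print(np.allclose(expm(B), closed))   # True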


1.6 Hermitian and Quadratic Forms

The matrices we have discussed so far are real matrices because all their
elements are real. In general, the entries or elements of a matrix can be
complex numbers, and the matrix becomes a complex matrix. For a matrix
A, its complex conjugate A∗ is obtained by taking the complex conjugate
of each of its elements. The Hermitian conjugate A† is obtained by taking
the transpose of its complex conjugate matrix. That is to say, for
 
A = [ a11  a12  ...
      a21  a22  ...
      ...  ...  ... ],    (1.85)

we have

A* = [ a11*  a12*  ...
       a21*  a22*  ...
       ...   ...   ... ],    (1.86)

and

A† = (A*)^T = (A^T)* = [ a11*  a21*  ...
                         a12*  a22*  ...
                         ...   ...   ... ].    (1.87)

A square matrix A is called orthogonal if and only if A⁻¹ = A^T. If a square matrix A satisfies A† = A, it is called a Hermitian matrix. It is an anti-Hermitian matrix if A† = −A. If the Hermitian conjugate of a square matrix A is equal to the inverse of the matrix (or A† = A⁻¹), it is called a unitary matrix.

Example 1.6: For a complex matrix


 
A = [ 2 + 3iπ   1 + 9i   0
      e^{iπ}    −2i      i sin θ ],

its complex conjugate A* is

A* = [ 2 − 3iπ   1 − 9i   0
       e^{−iπ}   2i       −i sin θ ].

The Hermitian conjugate of A is

A† = [ 2 − 3iπ   e^{−iπ}
       1 − 9i    2i
       0         −i sin θ ] = (A*)^T.


For the rotation matrix

A = [ cos θ    sin θ
      −sin θ   cos θ ],

its inverse and transpose are

A⁻¹ = (1/(cos²θ + sin²θ)) [ cos θ   −sin θ
                            sin θ    cos θ ],

and

A^T = [ cos θ   −sin θ
        sin θ    cos θ ].

Since cos²θ + sin²θ = 1, we have A^T = A⁻¹. Therefore, the original rotation matrix A is orthogonal.
A very useful concept in computational mathematics and computing is the quadratic form. For a real vector q^T = (q1, q2, q3, ..., qn) and a real symmetric square matrix A, a quadratic form ψ(q) is a scalar function defined by
ψ(q) = q^T A q = (q1 q2 ... qn) [ A11  A12  ...  A1n      [ q1
                                  A21  A22  ...  A2n        q2
                                  ...  ...  ...  ...        ...
                                  An1  An2  ...  Ann ]      qn ],    (1.88)

which can be written as

ψ(q) = Σ_{i=1}^{n} Σ_{j=1}^{n} qi Aij qj.    (1.89)

Since ψ is a scalar, it should be independent of the coordinates.


In the case of a square matrix A, ψ might be more easily evaluated in
certain intrinsic coordinates Q1 , Q2 , ..., Qn . An important result concerning
the quadratic form is that it can always be written through appropriate
transformations as
ψ(q) = Σ_{i=1}^{n} λi Qi² = λ1 Q1² + λ2 Q2² + ... + λn Qn²,    (1.90)

where λi are the eigenvalues of the matrix A determined by

det |A − λI| = 0,    (1.91)

and Qi are the intrinsic components along the directions of the eigenvectors in this case.


The natural extension of quadratic forms is the Hermitian form which


is the quadratic form for a complex Hermitian matrix A. Furthermore,
the matrix A can consist of linear operators and functionals in addition to
numbers.
Example 1.7: For a vector q = (q1, q2) and the square matrix

A = [ 2   −5
      −5   2 ],

we have a quadratic form

ψ(q) = q^T A q = 2q1² − 10 q1 q2 + 2q2².

The eigenvalues of the matrix A are determined by

| 2 − λ   −5    |
| −5      2 − λ | = 0,

whose solutions are λ1 = 7 and λ2 = −3 (see the next section for further details). Their corresponding eigenvectors are

v1 = ( −√2/2       v2 = ( √2/2
        √2/2 ),           √2/2 ).

We can see that v1 · v2 = 0, which means that these two eigenvectors are orthogonal. Writing the quadratic form in terms of the intrinsic coordinates, we have

ψ(q) = 7Q1² − 3Q2².

Furthermore, if we assume ψ(q) = 1 as a simple constraint, then the equation 7Q1² − 3Q2² = 1 corresponds to a hyperbola.
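The reduction to intrinsic coordinates can be checked numerically with an eigendecomposition (a sketch; the test vector q is arbitrary):

    import numpy as np

    A = np.array([[2.0, -5], [-5, 2]])
    lam, V = np.linalg.eigh(A)            # symmetric A: real eigenvalues, orthonormal V
    print(lam)                            # [-3, 7]
    q = np.array([1.0, 2.0])
    Q = V.T @ q                           # intrinsic coordinates
    print(q @ A @ q, np.sum(lam * Q**2))  # equal: psi(q) is invariant (here -10.0)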

1.7 Eigenvalues and Eigenvectors

The eigenvalues λ of any n × n square matrix A are determined by

Au = λu,    (1.92)

or

(A − λI)u = 0,    (1.93)

where I is a unit matrix with the same size as A. Any non-trivial solution requires that

det |A − λI| = 0,    (1.94)


or
| a11 − λ   a12       ...  a1n     |
| a21       a22 − λ   ...  a2n     |
| ...                              |
| an1       an2       ...  ann − λ | = 0,    (1.95)

which again can be written as a polynomial

λⁿ + α_{n−1} λ^{n−1} + ... + α0 = (λ − λ1)...(λ − λn) = 0,    (1.96)
where λi are the eigenvalues, which could be complex numbers. In general, the determinant condition leads to a polynomial of order n in λ. For each eigenvalue λ, there is a corresponding eigenvector u whose direction can be uniquely determined. However, the length of the eigenvector is not unique because any non-zero multiple of u will also satisfy equation (1.92), and thus can be considered as an eigenvector. For this reason, it is usually necessary to apply an additional condition by setting the length as unity, and subsequently the eigenvector becomes a unit eigenvector.

Generally speaking, a real n × n matrix A has n eigenvalues λi (i = 1, 2, ..., n); however, these eigenvalues are not necessarily distinct. If the real matrix is symmetric, that is to say A^T = A, then the matrix has n linearly independent eigenvectors, and all the eigenvalues are real numbers.
The eigenvalues λi are related to the trace and determinant of the matrix:

tr(A) = Σ_{i=1}^{n} aii = λ1 + λ2 + ... + λn = Σ_{i=1}^{n} λi,    (1.97)

and

det(A) = |A| = Π_{i=1}^{n} λi.    (1.98)

Example 1.8: The eigenvalues of the square matrix


 
A = [ 4   9
      2  −3 ],
can be obtained by solving
| 4 − λ    9      |
| 2        −3 − λ | = 0.

We have

(4 − λ)(−3 − λ) − 18 = (λ − 6)(λ + 5) = 0.

Thus, the eigenvalues are λ = 6 and λ = −5. Let v = (v1 v2 )T be the


eigenvector; for λ = 6 we have

(A − λI)v = [ −2   9      [ v1
               2  −9 ]  ×   v2 ] = 0,

which means that

−2v1 + 9v2 = 0,  2v1 − 9v2 = 0.

These two equations are virtually the same (not linearly independent), so the solution is

v1 = (9/2) v2.

Any vector parallel to v is also an eigenvector. In order to get a unique eigenvector, we have to impose an extra requirement, that is, the length of the vector is unity. We now have

v1² + v2² = 1,

or

(9v2/2)² + v2² = 1,

which gives v2 = ±2/√85 and v1 = ±9/√85. As these two vectors are in opposite directions, we can choose either of the two directions. So the eigenvector for the eigenvalue λ = 6 is

v = ( 9/√85
      2/√85 ).

Similarly, the corresponding eigenvector for the eigenvalue λ = −5 is v = (−√2/2, √2/2)^T.
Furthermore, the trace and determinant of A are
tr(A) = 4 + (−3) = 1, det(A) = 4 × (−3) − 2 × 9 = −30.
The sum of the eigenvalues is

Σ_{i=1}^{2} λi = 6 + (−5) = 1 = tr(A),

while the product of the eigenvalues is

Π_{i=1}^{2} λi = 6 × (−5) = −30 = det(A).
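The same results follow from numpy's eigensolver (a minimal sketch; the ordering and sign of the returned eigenpairs may vary):

    import numpy as np

    A = np.array([[4.0, 9], [2, -3]])
    lam, V = np.linalg.eig(A)
    print(lam)                               # 6 and -5
    print(lam.sum(), np.trace(A))            # both 1.0
    print(lam.prod(), np.linalg.det(A))      # both -30.0
    print(V[:, lam.argmax()] * np.sqrt(85))  # proportional to (9, 2), up to sign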
For any real square matrix A with the eigenvalues λi = eig(A), the eigenvalues of αA are αλi, where α ≠ 0 ∈ ℜ. This property becomes handy when rescaling the matrices in some iteration formulae so that the rescaled scheme becomes more stable. This is also the major reason why the pivoting and removing/rescaling of exceptionally large elements works.


1.8 Definiteness of Matrices

A square symmetric matrix A is said to be positive definite if all its eigen-


values are strictly positive (λi > 0 where i = 1, 2, ..., n). By multiplying (1.92) by u^T, we have

u^T A u = u^T λu = λ u^T u,    (1.99)

which leads to

λ = (u^T A u)/(u^T u).    (1.100)

This means that

u^T A u > 0,  if λ > 0.    (1.101)

In fact, for any non-zero vector v, the following relationship holds:

v^T A v > 0.    (1.102)

Since v can be any unit vector, all the diagonal elements of A should be strictly positive as well. If all the eigenvalues are non-negative, or λi ≥ 0,
then the matrix is called positive semi-definite. In general, an indefinite
matrix can have both positive and negative eigenvalues.
The inverse of a positive definite matrix is also positive definite. For a
linear system Au = f where f is a known column vector, if A is positive
definite, then the system can be solved more efficiently by matrix decom-
position methods.

Example 1.9: In general, a 2 × 2 symmetric matrix

A = [ α  β
      β  γ ],

is positive definite if

α u1² + 2β u1 u2 + γ u2² > 0,

for all u = (u1, u2)^T ≠ 0. The inverse of A is

A⁻¹ = (1/(αγ − β²)) [ γ   −β
                      −β   α ],

which is also positive definite.


As the eigenvalues of

A = [ 1  2
      2  1 ],

are λ = 3, −1, the matrix is indefinite. For another matrix

B = [ 4   6
      6  20 ],

we can find its eigenvalues using a similar method as discussed earlier, and the eigenvalues are λ = 2, 22. So matrix B is positive definite. The inverse of B,

B⁻¹ = (1/44) [ 20  −6
               −6   4 ],

is also positive definite because B⁻¹ has two eigenvalues: λ = 1/2, 1/22.
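A standard computational test for positive definiteness is to attempt a Cholesky factorization, which succeeds exactly for positive definite matrices (a minimal sketch; the function name is ours):

    import numpy as np

    def is_positive_definite(M):
        # Cholesky factorization exists iff the symmetric matrix M is positive definite
        try:
            np.linalg.cholesky(M)
            return True
        except np.linalg.LinAlgError:
            return False

    A = np.array([[1.0, 2], [2, 1]])    # eigenvalues 3 and -1: indefinite
    B = np.array([[4.0, 6], [6, 20]])   # eigenvalues 2 and 22: positive definite
    print(is_positive_definite(A))      # False
    print(is_positive_definite(B))      # True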
