
Applied Linear Algebra

and Optimization
using MATLAB®
License, Disclaimer of Liability, and Limited Warranty

By purchasing or using this book (the “Work”), you agree that this license grants
permission to use the contents contained herein, but does not give you the right
of ownership to any of the textual content in the book or ownership to any of
the information or products contained in it. This license does not permit
uploading of the Work onto the Internet or on a network (of any kind) without the
written consent of the Publisher. Duplication or dissemination of any text, code,
simulations, images, etc. contained herein is limited to and subject to licensing
terms for the respective products, and permission must be obtained from the
Publisher or the owner of the content, etc., in order to reproduce or network
any portion of the textual material (in any media) that is contained in the Work.

Mercury Learning and Information (“MLI” or “the Publisher”) and anyone
involved in the creation, writing, or production of the accompanying
algorithms, code, or computer programs (“the software”), and any accompanying
Web site or software of the Work, cannot and do not warrant the performance
or results that might be obtained by using the contents of the Work. The
author, developers, and the Publisher have used their best efforts to ensure the
accuracy and functionality of the textual material and/or programs contained
in this package; we, however, make no warranty of any kind, express or implied,
regarding the performance of these contents or programs. The Work is sold “as
is” without warranty (except for defective materials used in manufacturing the
book or due to faulty workmanship).

The author, developers, and the publisher of any accompanying content, and
anyone involved in the composition, production, and manufacturing of this work
will not be liable for damages of any kind arising out of the use of (or the
inability to use) the algorithms, source code, computer programs, or textual material
contained in this publication. This includes, but is not limited to, loss of revenue
or profit, or other incidental, physical, or consequential damages arising out of
the use of this Work.

The sole remedy in the event of a claim of any kind is expressly limited to
replacement of the book, and only at the discretion of the Publisher. The use of
“implied warranty” and certain “exclusions” varies from state to state, and might
not apply to the purchaser of this product.
Applied Linear Algebra
and Optimization
using MATLAB®

Rizwan Butt, PhD

Mercury Learning and Information


Dulles, Virginia
Copyright © 2011 by Mercury Learning and Information.
All rights reserved.

This publication, portions of it, or any accompanying software may not be reproduced
in any way, stored in a retrieval system of any type, or transmitted by any means,
media, electronic display or mechanical display, including, but not limited to,
photocopy, recording, Internet postings, or scanning, without prior permission in
writing from the publisher.

Publisher: David Pallai


Mercury Learning and Information
22841 Quicksilver Drive
Dulles, VA 20166
info@[Link]
[Link]
1-800-758-3756

This book is printed on acid-free paper.

R. Butt, PhD. Applied Linear Algebra and Optimization using MATLAB®

ISBN: 978-1-936420-04-9

The publisher recognizes and respects all marks used by companies, manufacturers,
and developers as a means to distinguish their products. All brand names and
product names mentioned in this book are trademarks or service marks of their
respective companies. Any omission or misuse (of any kind) of service marks or
trademarks, etc. is not an attempt to infringe on the property of others.

Library of Congress Control Number: 2010941258

11 12 13  3 2 1

Our titles are available for adoption, license, or bulk purchase by institutions,
corporations, etc. For additional information, please contact the
Customer Service Dept. at 1-800-758-3756 (toll free).

The sole obligation of Mercury Learning and Information to the purchaser is to


replace the disc, based on defective materials or faulty workmanship, but not based on
the operation or functionality of the product.
Dedicated to
Muhammad Sarwar Khan,
The Greatest Friend in the World
Contents

Preface xv

Acknowledgments xix

1 Matrices and Linear Systems 1


1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1.1 Linear Systems in Matrix Notation . . . . . . . . . 7
1.2 Properties of Matrices and Determinants . . . . . . . . . 10
1.2.1 Introduction to Matrices . . . . . . . . . . . . . . . 10
1.2.2 Some Special Matrix Forms . . . . . . . . . . . . . 15
1.2.3 Solutions of Linear Systems of Equations . . . . . 30
1.2.4 The Determinant of a Matrix . . . . . . . . . . . . . 38
1.2.5 Homogeneous Linear Systems . . . . . . . . . . . 62
1.2.6 Matrix Inversion Method . . . . . . . . . . . . . . . 68
1.2.7 Elementary Matrices . . . . . . . . . . . . . . . . . 71
1.3 Numerical Methods for Linear Systems . . . . . . . . . . 74
1.4 Direct Methods for Linear Systems . . . . . . . . . . . . . 74
1.4.1 Cramer’s Rule . . . . . . . . . . . . . . . . . . . . 75
1.4.2 Gaussian Elimination Method . . . . . . . . . . . . 79
1.4.3 Pivoting Strategies . . . . . . . . . . . . . . . . . . 99
1.4.4 Gauss–Jordan Method . . . . . . . . . . . . . . . . 106
1.4.5 LU Decomposition Method . . . . . . . . . . . . . 111
1.4.6 Tridiagonal Systems of Linear Equations . . . . . . 157
1.5 Conditioning of Linear Systems . . . . . . . . . . . . . . . 161
1.5.1 Norms of Vectors and Matrices . . . . . . . . . . . 162
1.5.2 Errors in Solving Linear Systems . . . . . . . . . . 167

1.6 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . 180


1.6.1 Curve Fitting, Electric Networks, and Traffic Flow . 180
1.6.2 Heat Conduction . . . . . . . . . . . . . . . . . . . 189
1.6.3 Chemical Solutions and
Balancing Chemical Equations . . . . . . . . . . . 192
1.6.4 Manufacturing, Social, and Financial Issues . . . . 195
1.6.5 Allocation of Resources . . . . . . . . . . . . . . . 201
1.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . 202
1.8 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . 204

2 Iterative Methods for Linear Systems 243


2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . 243
2.2 Jacobi Iterative Method . . . . . . . . . . . . . . . . . . . 245
2.3 Gauss–Seidel Iterative Method . . . . . . . . . . . . . . . 252
2.4 Convergence Criteria . . . . . . . . . . . . . . . . . . . . 270
2.5 Eigenvalues and Eigenvectors . . . . . . . . . . . . . . . 280
2.6 Successive Over-Relaxation Method . . . . . . . . . . . . 294
2.7 Conjugate Gradient Method . . . . . . . . . . . . . . . . . 308
2.8 Iterative Refinement . . . . . . . . . . . . . . . . . . . . . 313
2.9 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . 315
2.10 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . 316

3 The Eigenvalue Problems 327


3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . 327
3.2 Linear Algebra and Eigenvalue Problems . . . . . . . . . 348
3.3 Diagonalization of Matrices . . . . . . . . . . . . . . . . . 357
3.4 Basic Properties of Eigenvalue Problems . . . . . . . . . 374
3.5 Some Results of Eigenvalue Problems . . . . . . . . . . 393
3.6 Applications of Eigenvalue Problems . . . . . . . . . . . . 397
3.6.1 System of Differential Equations . . . . . . . . . . 397
3.6.2 Difference Equations . . . . . . . . . . . . . . . . . 405
3.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . 409
3.8 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . 410

4 Numerical Computation of Eigenvalues 417


4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . 417

4.2 Vector Iterative Methods for Eigenvalues . . . . . . . . . . 418


4.2.1 Power Method . . . . . . . . . . . . . . . . . . . . 419
4.2.2 Inverse Power Method . . . . . . . . . . . . . . . . 429
4.2.3 Shifted Inverse Power Method . . . . . . . . . . . 433
4.3 Location of the Eigenvalues . . . . . . . . . . . . . . . . . 438
4.3.1 Gerschgorin Circles Theorem . . . . . . . . . . . . 438
4.3.2 Rayleigh Quotient . . . . . . . . . . . . . . . . . . 440
4.4 Intermediate Eigenvalues . . . . . . . . . . . . . . . . . . 442
4.5 Eigenvalues of Symmetric Matrices . . . . . . . . . . . . 446
4.5.1 Jacobi Method . . . . . . . . . . . . . . . . . . . . 448
4.5.2 Sturm Sequence Iteration . . . . . . . . . . . . . . 455
4.5.3 Givens' Method . . . . . . . . . . . . . . . . . . . . 460
4.5.4 Householder’s Method . . . . . . . . . . . . . . . . 465
4.6 Matrix Decomposition Methods . . . . . . . . . . . . . . . 473
4.6.1 QR Method . . . . . . . . . . . . . . . . . . . . . . 473
4.6.2 LR Method . . . . . . . . . . . . . . . . . . . . . . 479
4.6.3 Upper Hessenberg Form . . . . . . . . . . . . . . 482
4.6.4 Singular Value Decomposition . . . . . . . . . . . 491
4.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . 499
4.8 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . 500

5 Interpolation and Approximation 511


5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . 511
5.2 Polynomial Approximation . . . . . . . . . . . . . . . . . . 513
5.2.1 Lagrange Interpolating Polynomials . . . . . . . . 514
5.2.2 Newton’s General Interpolating Formula . . . . . . 530
5.2.3 Aitken’s Method . . . . . . . . . . . . . . . . . . . 552
5.2.4 Chebyshev Polynomials . . . . . . . . . . . . . . . 557
5.3 Least Squares Approximation . . . . . . . . . . . . . . . . 574
5.3.1 Linear Least Squares . . . . . . . . . . . . . . . . 575
5.3.2 Polynomial Least Squares . . . . . . . . . . . . . . 581
5.3.3 Nonlinear Least Squares . . . . . . . . . . . . . . 585
5.3.4 Least Squares Plane . . . . . . . . . . . . . . . . . 601
5.3.5 Trigonometric Least Squares Polynomial . . . . . 604
5.3.6 Least Squares Solution of an
Overdetermined System . . . . . . . . . . . . . . . 608

5.3.7 Least Squares Solution of an


Underdetermined System . . . . . . . . . . . . . . 613
5.3.8 The Pseudoinverse of a Matrix . . . . . . . . . . . 619
5.3.9 Least Squares with QR Decomposition . . . . . . 622
5.3.10 Least Squares with Singular
Value Decomposition . . . . . . . . . . . . . . . . 628
5.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . 634
5.5 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . 635

6 Linear Programming 653


6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . 653
6.2 General Formulation . . . . . . . . . . . . . . . . . . . . . 655
6.3 Terminology . . . . . . . . . . . . . . . . . . . . . . . . . . 656
6.4 Linear Programming Problems . . . . . . . . . . . . . . . 656
6.4.1 Formulation of Mathematical Model . . . . . . . . 657
6.4.2 Formulation of Mathematical Model . . . . . . . . 659
6.5 Graphical Solution of LP Models . . . . . . . . . . . . . . 660
6.5.1 Reversed Inequality Constraints . . . . . . . . . . 668
6.5.2 Equality Constraints . . . . . . . . . . . . . . . . . 668
6.5.3 Minimum Value of a Function . . . . . . . . . . . . 669
6.5.4 LP Problem in Canonical Form . . . . . . . . . . . 676
6.5.5 LP Problem in Standard Form . . . . . . . . . . . . 677
6.5.6 Some Important Definitions . . . . . . . . . . . . . 682
6.6 The Simplex Method . . . . . . . . . . . . . . . . . . . . . 683
6.6.1 Basic and Nonbasic Variables . . . . . . . . . . . . 683
6.6.2 The Simplex Algorithm . . . . . . . . . . . . . . . . 684
6.6.3 Simplex Method for Minimization Problem . . . . . 690
6.7 Unrestricted in Sign Variables . . . . . . . . . . . . . . . . 693
6.8 Finding a Feasible Basis . . . . . . . . . . . . . . . . . . . 695
6.8.1 By Trial and Error . . . . . . . . . . . . . . . . . . . 695
6.8.2 Use of Artificial Variables . . . . . . . . . . . . . . 696
6.9 Big M Simplex Method . . . . . . . . . . . . . . . . . . . . 697
6.10 Two-Phase Simplex Method . . . . . . . . . . . . . . . . . 701
6.11 Duality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 706
6.11.1 Comparison of Primal and Dual Problems . . . . . 708
6.11.2 Primal-Dual Problems in Standard Form . . . . . . 711

6.12 Sensitivity Analysis in


Linear Programming . . . . . . . . . . . . . . . . . . . . . 717
6.13 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . 721
6.14 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . 722

7 Nonlinear Programming 735


7.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . 735
7.2 Review of Differential Calculus . . . . . . . . . . . . . . . 736
7.2.1 Limits of Functions . . . . . . . . . . . . . . . . . . 736
7.2.2 Continuity of a Function . . . . . . . . . . . . . . . 738
7.2.3 Derivative of a Function . . . . . . . . . . . . . . . 739
7.2.4 Local Extrema of a Function . . . . . . . . . . . . 742
7.2.5 Directional Derivatives and the Gradient Vector . . 752
7.2.6 Hessian Matrix . . . . . . . . . . . . . . . . . . . . 757
7.2.7 Taylor’s Series Expansion . . . . . . . . . . . . . . 762
7.2.8 Quadratic Forms . . . . . . . . . . . . . . . . . . . 768
7.3 Nonlinear Equations and Systems . . . . . . . . . . . . . 774
7.3.1 Bisection Method . . . . . . . . . . . . . . . . . . . 775
7.3.2 Fixed-Point Method . . . . . . . . . . . . . . . . . 781
7.3.3 Newton’s Method . . . . . . . . . . . . . . . . . . . 786
7.3.4 System of Nonlinear Equations . . . . . . . . . . . 789
7.4 Convex and Concave Functions . . . . . . . . . . . . . . 802
7.5 Standard Form of a Nonlinear
Programming Problem . . . . . . . . . . . . . . . . . . . . 818
7.6 One-Dimensional Unconstrained
Optimization . . . . . . . . . . . . . . . . . . . . . . . . . 819
7.6.1 Golden-Section Search . . . . . . . . . . . . . . . 819
7.6.2 Quadratic Interpolation . . . . . . . . . . . . . . . 825
7.6.3 Newton’s Method . . . . . . . . . . . . . . . . . . . 831
7.7 Multidimensional Unconstrained
Optimization . . . . . . . . . . . . . . . . . . . . . . . . . 835
7.7.1 Gradient Methods . . . . . . . . . . . . . . . . . . 840
7.7.2 Newton’s Method . . . . . . . . . . . . . . . . . . . 850
7.8 Constrained Optimization . . . . . . . . . . . . . . . . . . 855
7.8.1 Lagrange Multipliers . . . . . . . . . . . . . . . . . 855
7.8.2 The Kuhn–Tucker Conditions . . . . . . . . . . . . 868

7.8.3 Karush–Kuhn–Tucker Conditions . . . . . . . . . . 870


7.9 Generalized Reduced-Gradient Method . . . . . . . . . . 881
7.10 Separable Programming . . . . . . . . . . . . . . . . . . . 890
7.11 Quadratic Programming . . . . . . . . . . . . . . . . . . . 895
7.12 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . 900
7.13 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . 901

Appendices 917

A Number Representations and Errors 917


A.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . 917
A.2 Number Representations and the Base of Numbers . . . 918
A.2.1 Normalized Floating-Point Representations . . . . 921
A.2.2 Rounding and Chopping . . . . . . . . . . . . . . . 924
A.3 Error . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 925
A.4 Sources of Errors . . . . . . . . . . . . . . . . . . . . . . . 927
A.4.1 Human Errors . . . . . . . . . . . . . . . . . . . . . 927
A.4.2 Truncation Errors . . . . . . . . . . . . . . . . . . . 927
A.4.3 Round-off Errors . . . . . . . . . . . . . . . . . . . 928
A.5 Effect of Round-off Errors in
Arithmetic Operations . . . . . . . . . . . . . . . . . . . . 929
A.5.1 Round-off Errors in Addition and Subtraction . . . 929
A.5.2 Round-off Errors in Multiplication . . . . . . . . . . 931
A.5.3 Round-off Errors in Division . . . . . . . . . . . . . 933
A.5.4 Round-off Errors in Powers and Roots . . . . . . . 935
A.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . 937
A.7 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . 938

B Mathematical Preliminaries 941


B.1 The Vector Space . . . . . . . . . . . . . . . . . . . . . . 941
B.1.1 Vectors in Two Dimensions . . . . . . . . . . . . . 942
B.1.2 Vectors in Three Dimensions . . . . . . . . . . . . 947
B.1.3 Lines and Planes in Space . . . . . . . . . . . . . 964
B.2 Complex Numbers . . . . . . . . . . . . . . . . . . . . . . 976
B.2.1 Geometric Representation of Complex Numbers . 977
B.2.2 Operations on Complex Numbers . . . . . . . . . 978

B.2.3 Polar Forms of Complex Numbers . . . . . . . . . 980


B.2.4 Matrices with Complex Entries . . . . . . . . . . . 983
B.2.5 Solving Systems with Complex Entries . . . . . . . 984
B.2.6 Determinants of Complex Numbers . . . . . . . . 984
B.2.7 Complex Eigenvalues and Eigenvectors . . . . . . 985
B.3 Inner Product Spaces . . . . . . . . . . . . . . . . . . . . 986
B.3.1 Properties of Inner Products . . . . . . . . . . . . 987
B.3.2 Complex Inner Products . . . . . . . . . . . . . . . 990
B.4 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . 992

C Introduction to MATLAB 1007


C.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . 1007
C.2 Some Basic MATLAB Operations . . . . . . . . . . . . . . 1008
C.2.1 MATLAB Numbers and Numeric Formats . . . . . 1010
C.2.2 Arithmetic Operations . . . . . . . . . . . . . . . . 1012
C.2.3 MATLAB Mathematical Functions . . . . . . . . . . 1014
C.2.4 Scalar Variables . . . . . . . . . . . . . . . . . . . 1015
C.2.5 Vectors . . . . . . . . . . . . . . . . . . . . . . . . 1016
C.2.6 Matrices . . . . . . . . . . . . . . . . . . . . . . . . 1020
C.2.7 Creating Special Matrices . . . . . . . . . . . . . . 1024
C.2.8 Matrix Operations . . . . . . . . . . . . . . . . . . 1032
C.2.9 Strings and Printing . . . . . . . . . . . . . . . . . 1035
C.2.10 Solving Linear Systems . . . . . . . . . . . . . . . 1037
C.2.11 Graphing in MATLAB . . . . . . . . . . . . . . . . . 1044
C.3 Programming in MATLAB . . . . . . . . . . . . . . . . . . 1051
C.3.1 Statements for Control Flow . . . . . . . . . . . . . 1051
C.3.2 For Loop . . . . . . . . . . . . . . . . . . . . . . . 1052
C.3.3 While Loop . . . . . . . . . . . . . . . . . . . . . . 1052
C.3.4 Nested for Loops . . . . . . . . . . . . . . . . . . . 1053
C.3.5 Structure . . . . . . . . . . . . . . . . . . . . . . . 1054
C.4 Defining Functions . . . . . . . . . . . . . . . . . . . . . . 1056
C.5 MATLAB Built-in Functions . . . . . . . . . . . . . . . . . 1059
C.6 Symbolic Computation . . . . . . . . . . . . . . . . . . . . 1061
C.6.1 Some Important Symbolic Commands . . . . . . . 1064
C.6.2 Solving Equations Symbolically . . . . . . . . . . . 1069
C.6.3 Calculus . . . . . . . . . . . . . . . . . . . . . . . . 1071

C.6.4 Symbolic Ordinary Differential Equations . . . . . 1077


C.6.5 Linear Algebra . . . . . . . . . . . . . . . . . . . . 1079
C.6.6 Eigenvalues and Eigenvectors . . . . . . . . . . . 1080
C.6.7 Plotting Symbolic Expressions . . . . . . . . . . . 1081
C.7 Symbolic Math Toolbox Functions . . . . . . . . . . . . . 1083
C.8 Index of MATLAB Programs . . . . . . . . . . . . . . . . . 1086
C.9 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . 1089
C.10 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . 1090

D Answers to Selected Exercises 1097


D.0.1 Chapter 1 . . . . . . . . . . . . . . . . . . . . . . . 1097
D.0.2 Chapter 2 . . . . . . . . . . . . . . . . . . . . . . . 1107
D.0.3 Chapter 3 . . . . . . . . . . . . . . . . . . . . . . . 1108
D.0.4 Chapter 4 . . . . . . . . . . . . . . . . . . . . . . . 1111
D.0.5 Chapter 5 . . . . . . . . . . . . . . . . . . . . . . . 1115
D.0.6 Chapter 6 . . . . . . . . . . . . . . . . . . . . . . . 1118
D.0.7 Chapter 7 . . . . . . . . . . . . . . . . . . . . . . . 1120
D.0.8 Appendix A . . . . . . . . . . . . . . . . . . . . . . 1122
D.0.9 Appendix B . . . . . . . . . . . . . . . . . . . . . . 1123
D.0.10 Appendix C . . . . . . . . . . . . . . . . . . . . . . 1126

Bibliography 1129

Index 1145
Preface

This book presents an integrated approach to numerical linear algebra and


optimization theory based on a computer—in this case, using the software
package MATLAB. This book has evolved over many years from lecture
notes on Numerical Linear Algebra and Optimization Theory that accom-
pany both graduate and post-graduate courses in mathematics at the King
Saud University at Riyadh, Saudi Arabia. These courses deal with linear
equations, approximations, eigenvalue problems, and linear and nonlinear
optimization problems. We discuss several numerical methods for solving
both linear systems of equations and optimization problems. It is generally
accepted that linear algebra methods aid in finding the solution of linear
and nonlinear optimization problems.

The main approach used in this book is quite different from currently
available books, which are either too theoretical or too computational. The
approach adopted in this book lies between these two extremes. The
book fully exploits MATLAB’s symbolic, numerical, and graphical capabil-
ities to develop a thorough understanding of linear algebra and optimiza-
tion algorithms.

The book covers two distinct topics: linear algebra and optimization
theory. Linear algebra plays an important role in both applied and
theoretical mathematics, as well as in all of science and engineering,

computer science, probability and statistics, economics, numerical analysis,


and many other disciplines. Nowadays, a proper grounding in both calcu-
lus and linear algebra is an essential prerequisite for a successful career in
science, engineering, and mathematics. Linear algebra can be viewed as the
mathematical apparatus needed to solve potentially huge linear systems,
to understand their underlying structure, and to apply what is learned in
other contexts. The term linear is the key and, in fact, refers not just to
linear algebraic equations, but also to linear differential equations, linear
boundary value problems, linear iterative systems, and so on.

The other focus of this book is on optimization theory. This theory


is the study of the extremal values of a function: its maxima and minima.
The topics in this theory range from conditions for the existence of a
unique extremal value to methods—both analytic and numeric—for find-
ing the extremal values, and for what values of the independent variables
the function attains its extremes. It is a branch of mathematics that en-
compasses many diverse areas of optimization and minimization. The more
modern term is operational research. It includes the calculus of variations,
control theory, convex optimization theory, decision theory, game theory,
linear and nonlinear programming, queuing systems, etc. In this book we
emphasize only linear and nonlinear programming problems.

A wide range of applications appears throughout the book. They have


been chosen and written to give the student a sense of the broad range of
applicability of linear algebra and optimization theory. These applications
include theoretical ones, such as the use of linear algebra in differential
equations, difference equations, and least squares analysis.

When dealing with linear algebra or optimization theory, we often need


a computer. We believe that computers can improve the conceptual
understanding of mathematics, not just enable the completion of complicated
calculations. We have chosen MATLAB as our standard package because
it is a widely used software for working with matrices. The surge of pop-
ularity in MATLAB is related to the increasing popularity of UNIX and
computer graphics. To what extent numerical computations will be pro-
grammed in MATLAB in the future is uncertain. A short introduction to

MATLAB is given in Appendix C, and the programs in the text serve as


further examples.

The topics are discussed in a simplified manner with a number of ex-


amples illustrating the different concepts and applications. Most of the
sections contain a fairly large number of exercises, some of which relate
to real-life problems. Chapter 1 covers the basic concepts of matrices and
determinants and describes the basic computational methods used to solve
nonhomogeneous linear equations. Direct methods, including Cramer’s
rule, the Gaussian elimination method and its variants, the Gauss–Jordan
method, and LU decomposition methods, are discussed. It also covers
the conditioning of linear systems. Many ill-conditioned problems are dis-
cussed. The chapter closes with the many interesting applications of linear
systems. In Chapter 2, we discuss iterative methods, including the Ja-
cobi method, the Gauss–Seidel method, the SOR iterative method, the
conjugate gradient method, and the residual corrector method. Chapter
3 covers the selected methods of computing matrix eigenvalues. The ap-
proach discussed here should help students understand the relationship of
eigenvalues to the roots of characteristic equations. We define eigenvalues
and eigenvectors and study several examples. We discuss the diagonaliza-
tion of matrices and the computation of powers of diagonalizable matrices.
Some interesting applications of the eigenvalues and eigenvectors of a ma-
trix are also discussed at the end of the chapter. In Chapter 4, various
numerical methods are discussed for the eigenvalues of matrices. Among
them are the power iterative methods, the Jacobi method, Givens' method,
the Householder method, the QR iteration method, the LR method, and
the singular value decomposition method. Chapter 5 describes the ap-
proximation of functions. In this chapter we also describe curve fitting of
experimental data based on least squares methods. We discuss linear, non-
linear, plane, and trigonometric function least squares approximations. We
use QR decomposition and singular value decomposition for the solution of
the least squares problem. In Chapter 6, we describe standard linear pro-
gramming formulations. The subject of linear programming, in general,
involves the development of algorithms and methodologies in optimization.
The field, developed by George Dantzig and his associates in 1947, is now
widely used in industry and has its foundation in linear algebra. In keeping
with the intent of this book, this chapter presents the mathematical
formulations of basic linear programming problems. In Chapter 7, we de-
scribe nonlinear programming formulations. We discuss many numerical
methods for solving unconstrained and constrained problems. In the be-
ginning of the chapter some of the basic mathematical concepts useful in
developing optimization theory are presented. For unconstrained optimiza-
tion problems we discuss the golden-section search method and quadratic
interpolation method, which depend on the initial guesses that bracket the
single optimum, and Newton’s method, which is based on the idea from
calculus that the minimum or maximum can be found by solving f'(x) = 0.
For the functions of several variables, we use the steepest descent method
and Newton’s method. For handling nonlinear optimization problems with
constraints, we discuss the generalized reduced-gradient method, Lagrange
multipliers, and KT conditions. At the end of the chapter, we also discuss
quadratic programming problems and the separable programming prob-
lems.

In each chapter, we discuss several examples to guide students step-


by-step through the most complex topics. Since the only real way to learn
mathematics is to use it, there is a list of exercises provided at the end
of each chapter. These exercises range from very easy to quite difficult.
This book is completely self-contained, with all the necessary mathematical
background given in it. Finally, this book provides balanced convergence
of the theory, application, and numerical computation of all the topics dis-
cussed.

Appendix A covers different kinds of errors that are preparatory sub-


jects for numerical computations. To explain the sources of these errors,
there is a brief discussion of Taylor’s series and how numbers are computed
and saved in computers. Appendix B consists of a brief introduction to
vectors in space and a review of complex numbers and how to do linear
algebra with them. It is also devoted to general inner product spaces and
to how different notations and processes generalize. In Appendix C, we dis-
cuss the basic commands for the software package MATLAB. In Appendix
D, we give answers to selected odd-numbered exercises.
Acknowledgments

I wish to express my gratitude to all those colleagues, friends, and as-


sociates of mine, without whose help this work would not have been possible. I am
grateful, especially, to Dr. Saleem, Dr. Zafar Ellahi, Dr. Esia Al-Said, and
Dr. Salah Hasan for reading earlier versions of the manuscript and for pro-
viding encouraging comments. I have written this book as the background
material for an interactive first course in linear algebra and optimization.
The encouragement and positive feedback that I have received during the
design and development of the book have given me the energy required to
complete the project.

I also want to express my heartfelt thanks to a special person who has


been very helpful to me in a great many ways over the course of my career:
Muhammad Sarwar Khan, of King Saud University, Riyadh, Saudi Arabia.

My sincere thanks are also due to the Deanship of the Scientific Re-
search Center, College of Science, King Saud University, Riyadh, KSA,
for financial support and for providing facilities throughout the research
project No. (Math/2008/05/B).

It has taken me five years to write this book and thanks must go to my
long-suffering family for my frequent unsocial behavior over these years. I
am profoundly grateful to my wife Saima, and our children Fatima, Usman,

Fouzan, and Rahmah, for their patience, encouragement, and understand-


ing throughout this project. Special thanks go to my elder daughter,
Fatima, for creating all the figures in this project.

Dr. Rizwan Butt


Department of Mathematics,
College of Science
King Saud University
August, 2010
Chapter 1

Matrices and Linear Systems

1.1 Introduction
When engineering systems are modeled, the mathematical description is
frequently developed in terms of a set of algebraic simultaneous equations.
Sometimes these equations are nonlinear and sometimes linear. In this
chapter, we discuss systems of simultaneous linear equations and describe
the numerical methods for the approximate solutions of such systems. The
solution of a system of simultaneous linear algebraic equations is proba-
bly one of the most important topics in engineering computation. Prob-
lems involving simultaneous linear equations arise in the areas of elasticity,
electric-circuit analysis, heat transfer, vibrations, and so on. Also, the
numerical integration of some types of ordinary and partial differential
equations may be reduced to the solution of such a system of equations. It
has been estimated, for example, that about 75% of all scientific problems
require the solution of a system of linear equations at one stage or another.
It is therefore important to be able to solve linear problems efficiently and
accurately.

Definition 1.1 (Linear Equation)

A linear equation is an equation in which each variable appears only to the
first power. The graph of such an equation in two variables is a straight line. •

A linear equation in two variables x1 and x2 is an equation that can be


written in the form
a1 x1 + a2 x2 = b,
where a1 , a2 , and b are real numbers. Note that this is the equation of a
straight line in the plane. For example, the equations
$$5x_1 + 2x_2 = 2, \qquad \frac{4}{5}x_1 + 2x_2 = 1, \qquad 2x_1 - 4x_2 = \pi$$
are all linear equations in two variables.

A linear equation in n variables x1 , x2 , . . . , xn is an equation that can


be written as
a1 x1 + a2 x2 + · · · + an xn = b,
where a1, a2, . . . , an are real numbers called the coefficients of the unknown
variables x1, x2, . . . , xn, and the real number b, the right-hand side of the
equation, is called the constant term of the equation.

Definition 1.2 (System of Linear Equations)

A system of linear equations (or linear system) is simply a finite set of


linear equations. •

For example,
4x1 − 2x2 = 5
3x1 + 2x2 = 4
is a system of two equations in two variables x1 and x2 , and

2x1 + x2 − 5x3 + 2x4 = 9


4x1 + 3x2 + 2x3 + 4x4 = 3
x1 + 2x2 + 3x3 + 2x4 = 11

is the system of three equations in the four variables x1 , x2 , x3 , and x4 .

In order to write a general system of m linear equations in the n variables
x1, . . . , xn, we have

$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1\\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2\\
&\ \,\vdots\\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m
\end{aligned} \tag{1.1}$$

or, in compact form, the system (1.1) can be written as

$$\sum_{j=1}^{n} a_{ij}x_j = b_i, \qquad i = 1, 2, \ldots, m. \tag{1.2}$$

For such a system we seek all possible ordered sets of numbers c1 , . . . , cn


which satisfy all m equations when they are substituted for the variables
x1 , x2 , . . . , xn . Any such set {c1 , c2 , . . . , cn } is called a solution of the sys-
tem of linear equations (1.1) or (1.2).

There are three possible types of linear systems that arise in engineering
problems, and they are described as follows:
1. If there are more equations than unknown variables (m > n), then
the system is usually called overdetermined. Typically, an overdeter-
mined system has no solution. For example, the following system
4x1 = 8
3x1 + 9x2 = 13
3x2 = 9
has no solution.
2. If there are more unknown variables than the number of the equations
(n > m), then the system is usually called underdetermined. Typi-
cally, an underdetermined system has an infinite number of solutions.
For example, the system
x1 + 5x2 = 45
3x2 + 4x3 = 21

has infinitely many solutions.

3. If there are the same number of equations as unknown variables (m =


n), then the system is usually called a simultaneous system. It has
a unique solution if the system satisfies certain conditions (which we
will discuss below). For example, the system

2x1 + 4x2 + x3 = −11


−x1 + 3x2 − 2x3 = −16
2x1 − 3x2 + 5x3 = 21

has the unique solution x1 = 2, x2 = −4, x3 = 1.


Most engineering problems fall into this category. In this chapter, we
will solve simultaneous linear systems using many numerical methods.
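
As a quick aside (not one of the methods developed in this chapter),
MATLAB's backslash operator solves such a simultaneous system directly,
and can be used to verify the solution quoted above:

>> A = [2 4 1; -1 3 -2; 2 -3 5];   % coefficient matrix of the system above
>> b = [-11; -16; 21];             % right-hand side
>> x = A\b                         % solve Ax = b
x =
     2
    -4
     1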
A simultaneous system of linear equations is said to be linearly independent
if no equation in the system can be expressed as a linear combination of
the others. Under these circumstances a unique solution exists. For
example, the system of linear equations

2x1 + x2 − x3 = 1
x1 − 2x2 + 3x3 = 4
x1 + x2 = 1

is linearly independent and therefore has the unique solution

x1 = 1, x2 = 0, x3 = 1.

However, the system

5x1 + x2 + x3 = 4
3x1 − x2 + x3 = −2
x1 + x2 = 3

does not have a unique solution since the equations are not linearly
independent; the first equation is equal to the second equation plus twice
the third equation.
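
As an illustrative check (an aside, using the two coefficient matrices
above), MATLAB's rank function exposes this dependence: the first matrix
has full rank 3, while the second has rank 2, confirming that its equations
are not linearly independent:

>> A1 = [2 1 -1; 1 -2 3; 1 1 0];   % linearly independent system
>> rank(A1)
ans =
     3
>> A2 = [5 1 1; 3 -1 1; 1 1 0];    % dependent system
>> rank(A2)
ans =
     2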

Theorem 1.1 (Solution of a Linear System)

Every system of linear equations has either no solution, exactly one solu-
tion, or infinitely many solutions. •

For example, in the case of a system of two equations in two variables,


we can have these three possibilities for the solutions of the linear system.
First, the two lines (since the graph of a linear equation is a straight line)
may be parallel and distinct, and in this case, there is no solution to the
system because the two lines do not intersect each other at any point. For
example, consider the system

x1 + x2 = 1
2x1 + 2x2 = 3.

From the graphs (Figure 1.1(a)) of the given two equations we can see
that the lines are parallel, so the given system has no solution. It can be
proved algebraically simply by multiplying the first equation of the system
by 2 to get a system of the form

2x1 + 2x2 = 2
2x1 + 2x2 = 3,

which is not possible.

Second, the two lines may not be parallel, and they may meet at exactly
one point, so in this case the system has exactly one solution. For example,
consider the system
x1 − x2 = −1
3x1 − x2 = 3.
From the graphs (Figure 1.1(b)) of these two equations we can see that
the lines intersect at exactly one point, namely, (2, 3), and so the system
has exactly one solution, x1 = 2, x2 = 3. To show this algebraically, if we
substitute x2 = x1 + 1 in the second equation, we have 3x1 − x1 − 1 = 3,
or x1 = 2, and using this value of x in x2 = x1 + 1 gives x2 = 3.

Finally, the two lines may actually be the same line, and so in this case,
every point on the lines gives a solution to the system and therefore there
are infinitely many solutions. For example, consider the system

x1 + x2 = 1
2x1 + 2x2 = 2.

Figure 1.1: Three possible solutions of simultaneous systems.

Here, both equations have the same line for their graph (Figure 1.1(c)).
So this system has infinitely many solutions because any point on this line
gives a solution to this system, since any solution of the first equation is
also a solution of the second equation. For example, if we set x2 = 1 − x1,
then choosing x1 = 0 gives x2 = 1, choosing x1 = 1 gives x2 = 0, and so on. •
Note that a system of equations with no solution is said to be an incon-
sistent system and if it has at least one solution, it is said to be a consistent
system.

1.1.1 Linear Systems in Matrix Notation


The general simultaneous system of n linear equations with n unknown
variables x1 , x2 , . . . , xn is
$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1\\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2\\
&\ \,\vdots\\
a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n &= b_n.
\end{aligned} \tag{1.3}$$

The system of linear equations (1.3) can be written as the single matrix
equation

$$\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2n}\\ \vdots & \vdots & & \vdots\\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix}
\begin{bmatrix} x_1\\ x_2\\ \vdots\\ x_n \end{bmatrix} =
\begin{bmatrix} b_1\\ b_2\\ \vdots\\ b_n \end{bmatrix}. \tag{1.4}$$

If we compute the product of the two matrices on the left-hand side of
(1.4), we have

$$\begin{bmatrix} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n\\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n\\ \vdots\\ a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n \end{bmatrix} =
\begin{bmatrix} b_1\\ b_2\\ \vdots\\ b_n \end{bmatrix}. \tag{1.5}$$

But two matrices are equal if and only if their corresponding elements
are equal. Hence, the single matrix equation (1.4) is equivalent to the
system of the linear equations (1.3). If we define

$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2n}\\ \vdots & \vdots & & \vdots\\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix}, \qquad
x = \begin{bmatrix} x_1\\ x_2\\ \vdots\\ x_n \end{bmatrix}, \qquad
b = \begin{bmatrix} b_1\\ b_2\\ \vdots\\ b_n \end{bmatrix},$$

the coefficient matrix, the column matrix of unknowns, and the column
matrix of constants, respectively, then the system (1.3) can be written
very compactly as

$$Ax = b, \tag{1.6}$$

which is called the matrix form of the system of linear equations (1.3). The
column matrices x and b are called vectors.

If the right-hand sides of the equal signs of (1.6) are not zero, then the
linear system (1.6) is called a nonhomogeneous system, and we will find
that all the equations must be independent to obtain a unique solution.
If the constants b of (1.6) are added to the coefficient matrix A as a
column of elements in the position shown below,

$$[A|b] = \left[\begin{array}{cccc|c}
a_{11} & a_{12} & \cdots & a_{1n} & b_1\\
a_{21} & a_{22} & \cdots & a_{2n} & b_2\\
\vdots & \vdots & & \vdots & \vdots\\
a_{n1} & a_{n2} & \cdots & a_{nn} & b_n
\end{array}\right], \tag{1.7}$$

then the matrix [A|b] is called the augmented matrix of the system (1.6).
In many instances, it may be convenient to operate on the augmented ma-
trix instead of manipulating the equations. It is customary to put a bar
between the last two columns of the augmented matrix to remind us where
the last column came from. However, the bar is not absolutely necessary.
The coefficient and augmented matrices of a linear system will play key
roles in our methods of solving linear systems.

Using MATLAB commands we can define an augmented matrix as fol-


lows:

>> A = [1 2 3; 4 5 6; 7 8 9];
>> b = [10; 11; 12];
>> Aug = [A b]
Aug =
1 2 3 10
4 5 6 11
7 8 9 12

Also,

>> Aug = [A eye(3)]


Aug =
1 2 3 1 0 0
4 5 6 0 1 0
7 8 9 0 0 1
If all of the constant terms b1, b2, . . . , bn on the right-hand sides of the
equal signs of the linear system (1.6) are zero, then the system is called a
homogeneous system, and it can be written as

$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= 0\\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= 0\\
&\ \,\vdots\\
a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n &= 0.
\end{aligned} \tag{1.8}$$

The system of linear equations (1.8) can be written as the single matrix
equation

$$\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2n}\\ \vdots & \vdots & & \vdots\\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix}
\begin{bmatrix} x_1\\ x_2\\ \vdots\\ x_n \end{bmatrix} =
\begin{bmatrix} 0\\ 0\\ \vdots\\ 0 \end{bmatrix}. \tag{1.9}$$

It can also be written in more compact form as

$$Ax = 0, \tag{1.10}$$

where

$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2n}\\ \vdots & \vdots & & \vdots\\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix}, \qquad
x = \begin{bmatrix} x_1\\ x_2\\ \vdots\\ x_n \end{bmatrix}, \qquad
0 = \begin{bmatrix} 0\\ 0\\ \vdots\\ 0 \end{bmatrix}.$$

It can be seen by inspection of the homogeneous system (1.10) that
one of its solutions is x = 0; such a solution, in which all of the unknowns
are zero, is called the trivial solution or zero solution. For the general
nonhomogeneous linear system there are three possibilities: no solution,
one solution, or infinitely many solutions. For the general homogeneous
system, there are only two possibilities: either the zero solution is the only
solution, or there are infinitely many solutions (called nontrivial solutions).
Of course, it is usually the nontrivial solutions that are of interest in
physical problems. A nontrivial solution to the homogeneous system can
occur only under certain conditions on the coefficient matrix A, which we
will discuss later.
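
As a small illustration (a hypothetical example, not from the text),
MATLAB's null function returns a basis for the solution space of Ax = 0,
so a nonempty result signals that nontrivial solutions exist:

>> A = [1 2; 2 4];   % second equation is twice the first
>> null(A)           % basis for the nontrivial solutions
ans =
   -0.8944
    0.4472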

1.2 Properties of Matrices and Determinants


To discuss the solutions of linear systems, it is necessary to introduce the
basic algebraic properties of matrices that make it possible to describe
linear systems in a concise way and make solving a system of n linear
equations easier.

1.2.1 Introduction to Matrices


A matrix can be described as a rectangular array of elements that can be
represented as follows:

$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2n}\\ \vdots & \vdots & & \vdots\\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}. \tag{1.11}$$

The numbers a11 , a12 , . . . , amn that make up the array are called the ele-
ments of the matrix. The first subscript for the element denotes the row
and the second denotes the column in which the element appears. The
elements of a matrix may take many forms. They could be all numbers
(real or complex), or variables, or functions, or integrals, or derivatives, or
even matrices themselves.
The order or size of a matrix is specified by the number of rows (m)
and columns (n); thus, the matrix A in (1.11) is of order m by n, usually
written as m × n.
A vector can be considered a special case of a matrix having only one
row or one column. A row vector containing n elements is a 1 × n matrix,
called a row matrix, and a column vector of n elements is an n × 1 matrix,
called a column matrix. A matrix of order 1 × 1 is called a scalar.

Definition 1.3 (Matrix Equality)

Two matrices A = (aij) and B = (bij) are equal if they are the same size
and the corresponding elements in A and B are equal, i.e.,

A = B, if and only if aij = bij

for i = 1, 2, . . . , m and j = 1, 2, . . . , n. For example, the matrices

$$A = \begin{bmatrix} 1 & -1 & 2\\ 1 & 3 & 2\\ 2 & 4 & 3 \end{bmatrix} \qquad \text{and} \qquad B = \begin{bmatrix} 1 & -1 & z\\ 1 & 3 & 2\\ x & y & w \end{bmatrix}$$

are equal if and only if x = 2, y = 4, z = 2, and w = 3. •

Definition 1.4 (Addition of Matrices)

Let A = (aij ) and B = (bij ) both be m × n matrices, then the sum A +


B of two matrices of the same size is a new matrix C = (cij ), each of
whose elements is the sum of the two corresponding elements in the original
matrices, i.e.,

cij = aij + bij , for i = 1, 2, . . . , m, and j = 1, 2, . . . , n.

For example, let

$$A = \begin{bmatrix} 1 & 2\\ 3 & 4 \end{bmatrix} \qquad \text{and} \qquad B = \begin{bmatrix} 4 & 1\\ 5 & 2 \end{bmatrix}.$$

Then

$$\begin{bmatrix} 1 & 2\\ 3 & 4 \end{bmatrix} + \begin{bmatrix} 4 & 1\\ 5 & 2 \end{bmatrix} = \begin{bmatrix} 5 & 3\\ 8 & 6 \end{bmatrix} = C.$$

Using MATLAB commands, adding two matrices A and B of the same size
results in the answer C, another matrix of the same size:

>> A = [1 2; 3 4];
>> B = [4 1; 5 2];
>> C = A + B
C=
5 3
8 6

Definition 1.5 (Difference of Matrices)

Let A and B be m × n matrices, and write A + (−1)B as A − B. The
difference of two matrices of the same size is a new matrix C, each of
whose elements is the difference of the two corresponding elements in the
original matrices. For example, let

$$A = \begin{bmatrix} 1 & 2\\ 3 & 4 \end{bmatrix} \qquad \text{and} \qquad B = \begin{bmatrix} 4 & 1\\ 5 & 2 \end{bmatrix}.$$

Then

$$\begin{bmatrix} 1 & 2\\ 3 & 4 \end{bmatrix} - \begin{bmatrix} 4 & 1\\ 5 & 2 \end{bmatrix} = \begin{bmatrix} -3 & 1\\ -2 & 2 \end{bmatrix} = C.$$

Note that (−1)B = −B is obtained by multiplying each entry of matrix B
by (−1), the scalar multiple of matrix B by −1. The matrix −B is called
the negative of the matrix B. •
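
For completeness (a short sketch mirroring the addition example above),
the same matrices can be subtracted in MATLAB with the - operator:

>> A = [1 2; 3 4];
>> B = [4 1; 5 2];
>> C = A - B
C =
    -3     1
    -2     2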

Definition 1.6 (Multiplication of Matrices)

The multiplication of two matrices is defined only when the number of


columns in the first matrix is equal to the number of rows in the second. If
an m × n matrix A is multiplied by an n × p matrix B, then the product
matrix C is an m × p matrix where each term is defined by
$$c_{ij} = \sum_{k=1}^{n} a_{ik}b_{kj}$$

for each i = 1, 2, . . . , m and j = 1, 2, . . . , p. For example, let

$$A = \begin{bmatrix} 1 & 2\\ 3 & 4 \end{bmatrix} \qquad \text{and} \qquad B = \begin{bmatrix} 4 & 1\\ 5 & 2 \end{bmatrix}.$$

Then

$$AB = \begin{bmatrix} 1 & 2\\ 3 & 4 \end{bmatrix}\begin{bmatrix} 4 & 1\\ 5 & 2 \end{bmatrix} = \begin{bmatrix} 4+10 & 1+4\\ 12+20 & 3+8 \end{bmatrix} = \begin{bmatrix} 14 & 5\\ 32 & 11 \end{bmatrix} = C.$$

Note that even if AB is defined, the product BA may not be defined.
Moreover, a simple multiplication of two square matrices of the same size
will show that even if BA is defined, it need not be equal to AB, i.e., they
do not commute. For example, if

$$A = \begin{bmatrix} 1 & 2\\ -1 & 3 \end{bmatrix} \qquad \text{and} \qquad B = \begin{bmatrix} 2 & 1\\ 0 & 1 \end{bmatrix},$$

then

$$AB = \begin{bmatrix} 2 & 3\\ -2 & 2 \end{bmatrix} \qquad \text{while} \qquad BA = \begin{bmatrix} 1 & 7\\ -1 & 3 \end{bmatrix}.$$

Thus, AB ≠ BA. •

Using MATLAB commands, matrix multiplication has the standard


meaning as well. Multiplying two matrices A and B of size m × p and
p × n respectively, results in the answer C, another matrix of size m × n:

>> A = [1 2; -1 3];
>> B = [2 1; 0 1];
>> C = A*B
C=
2 3
−2 2

MATLAB also has component-wise operations for multiplication, divi-


sion, and exponentiation. These three operations are a combination of a
period (.) and one of the operators ∗, /, and ˆ , which perform operations
on a pair of matrices (or vectors) with equal numbers of rows and columns.
For example, consider the two row vectors:

>> u = [1 2 3 4];
>> v = [5 3 0 2];
>> x = u.*v
x =
     5     6     0     8

>> y = u./v
Warning: Divide by zero.
y =
    0.2000    0.6667       Inf    2.0000

These operations apply to matrices as well as vectors:

>> A = [1 2 3; 4 5 6; 7 8 9];
>> B = [9 8 7; 6 5 4; 3 2 1];
>> C = A.*B
C =
     9    16    21
    24    25    24
    21    16     9

Note that A.*B is not the same as A*B.

The array exponentiation operator, .^, raises the individual elements
of a matrix to a power:

>> A = [1 2 3; 4 5 6; 7 8 9];
>> D = A.^2
D =
     1     4     9
    16    25    36
    49    64    81

>> E = A.^(1/2)
E =
    1.0000    1.4142    1.7321
    2.0000    2.2361    2.4495
    2.6458    2.8284    3.0000
The syntax of array operators requires the correct placement of a
typographically small symbol, a period, in what might be a complex formula.
Although MATLAB will catch syntax errors, it is still possible to make
computational mistakes with legal operations. For example, A.^2 and A^2
are both legal, but not at all equivalent.
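
To make the distinction concrete (an illustrative aside), compare the two
operations on the same matrix: A.^2 squares each entry, while A^2 forms
the matrix product A*A:

>> A = [1 2; 3 4];
>> A.^2              % element-wise: each entry squared
ans =
     1     4
     9    16
>> A^2               % matrix power: A*A
ans =
     7    10
    15    22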

In linear algebra, the addition and subtraction of matrices and vec-


tors are element-by-element operations. Thus, there are no special array
operators for addition and subtraction.

1.2.2 Some Special Matrix Forms


There are many special types of matrices encountered frequently in engi-
neering analysis. We discuss some of them in the following.

Definition 1.7 (Square Matrix)

A matrix A which has the same number of rows m and columns n, i.e.,
m = n, defined as

A = (aij ), for i = 1, 2, . . . , n, and j = 1, 2, . . . , n

is called a square matrix. For example, the matrices

$$A = \begin{bmatrix} 1 & 2\\ -1 & 3 \end{bmatrix} \qquad \text{and} \qquad B = \begin{bmatrix} 2 & 1 & 2\\ 1 & 2 & 3\\ 0 & 1 & 5 \end{bmatrix}$$

are square matrices because both have the same number of rows and columns.


Definition 1.8 (Null Matrix)

It is a matrix in which all elements are zero, i.e.,

A = (aij) = 0, for i = 1, 2, . . . , m and j = 1, 2, . . . , n.

It is also called a zero matrix. It may be either rectangular or square. For
example, the matrices

$$A = \begin{bmatrix} 0 & 0 & 0\\ 0 & 0 & 0 \end{bmatrix} \qquad \text{and} \qquad B = \begin{bmatrix} 0 & 0 & 0\\ 0 & 0 & 0\\ 0 & 0 & 0 \end{bmatrix}$$

are zero matrices. •
Definition 1.9 (Identity Matrix)

It is a square matrix in which the main diagonal elements are equal to 1
and all other elements are zero. It is defined as

$$I = (a_{ij}), \qquad a_{ij} = \begin{cases} 0, & \text{if } i \neq j,\\ 1, & \text{if } i = j. \end{cases}$$

An example of a 4 × 4 identity matrix may be written as

$$I_4 = \begin{bmatrix} 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{bmatrix}.$$
The identity matrix (also called a unit matrix) serves somewhat the same
purpose in matrix algebra as does the number one (unity) in scalar algebra.
It is called the identity matrix because multiplication of a matrix by it will
result in the same matrix. For a square matrix A of order n, it can be seen
that
In A = AIn = A.
Similarly, for a rectangular matrix B of order m × n, we have
Im B = BIn = B.
The multiplication of an identity matrix by itself results in the same identity
matrix. •

In MATLAB, identity matrices are created with the eye function, which
can take either one or two input arguments:

>> I = eye(n)
>> I = eye(m, n)
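
As a brief check (an illustrative aside), we can verify the identity
property In A = A In = A for a small matrix:

>> A = [1 2; 3 4];
>> I = eye(2);
>> isequal(I*A, A*I, A)
ans =
     1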

Definition 1.10 (Transpose Matrix)

The transpose of a matrix A is a new matrix formed by interchanging the
rows and columns of the original matrix. If the original matrix A is of
order m × n, then the transpose matrix, A^T, will be of order n × m,
i.e.,

if A = (aij), for i = 1, 2, . . . , m and j = 1, 2, . . . , n,

then

A^T = (aji), for i = 1, 2, . . . , n and j = 1, 2, . . . , m.

The transpose of a matrix A can be found by using the following MATLAB
commands:

>> A = [1 2 3; 4 5 6; 7 8 9];
>> B = A'
B =
     1     4     7
     2     5     8
     3     6     9
Note that

1. (A^T)^T = A,

2. (A1 + A2)^T = A1^T + A2^T,

3. (A1 A2)^T = A2^T A1^T,

4. (αA)^T = αA^T, where α is a scalar.
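
A quick numerical check of the reverse-order rule for transposes (an
illustrative aside with arbitrarily chosen matrices):

>> A1 = [1 2; 3 4];
>> A2 = [0 1; 5 2];
>> isequal((A1*A2)', A2'*A1')
ans =
     1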



Definition 1.11 (Inverse Matrix)

An n × n matrix A has an inverse or is invertible if there exists an n × n


matrix B such that

AB = BA = In .

Then the matrix B is called the inverse of A and is denoted by A^{-1}. For
example, let

$$A = \begin{bmatrix} 2 & 3\\ 2 & 2 \end{bmatrix} \qquad \text{and} \qquad B = \begin{bmatrix} -1 & \frac{3}{2}\\ 1 & -1 \end{bmatrix}.$$

Then we have

AB = BA = I2,

which means that B is an inverse of A. Note that an invertible matrix is
also called a nonsingular matrix. •

To find the inverse of a square matrix A using MATLAB commands, we
proceed as follows:

>> A = [2 -1 0 0; -1 2 -1 0; 0 -1 2 -1; 0 0 -1 2]
>> Ainv = INVMAT(A)
Ainv =
    0.8000    0.6000    0.4000    0.2000
    0.6000    1.2000    0.8000    0.4000
    0.4000    0.8000    1.2000    0.6000
    0.2000    0.4000    0.6000    0.8000

Program 1.1
MATLAB m-file for Finding the Inverse of a Matrix
function [Ainv]=INVMAT(A)
[n,n]=size(A);
m(1:n,1:n)=A; m(1:n,n+1:2*n)=eye(n);   % augment A with the identity
for i=1:n
  m(i,1:2*n)=m(i,1:2*n)/m(i,i);        % normalize the pivot row
  for k=1:n
    if i~=k
      m(k,1:2*n)=m(k,1:2*n)-m(k,i)*m(i,1:2*n);  % eliminate entry (k,i)
    end
  end
end
Ainv=m(1:n,n+1:2*n);                   % right half of m now holds inv(A)

The MATLAB built-in function inv(A) can also be used to calculate the
inverse of a square matrix A, if A is invertible. We can check the computed
inverse by forming the product Ainv*A, which should reproduce the identity:

>> I = Ainv*A;
>> format short e
>> disp(I)
   1.0000e+00  -1.1102e-16            0            0
            0   1.0000e+00            0            0
            0            0   1.0000e+00   2.2204e-16
            0            0            0   1.0000e+00
The values of I(1, 2) and I(3, 4) are very small, but nonzero, due to
round-off errors in the computation of Ainv and I. It is often preferable to
use rational numbers rather than decimal numbers. The function rats(x)
returns a rational approximation to x, or we can use the other MATLAB
command as follows:

>> format rat
If the matrix A is not invertible, then the matrix A is called singular.

There are some well-known properties of the invertible matrix which


are defined as follows.

Theorem 1.2 If the matrix A is invertible, then:

1. It has exactly one inverse. If B and C are inverses of A, then B = C.

2. Its inverse matrix A^{-1} is also invertible and (A^{-1})^{-1} = A.

3. Its product with another invertible matrix is invertible, and the
inverse of the product is the product of the inverses in the reverse order.
If A and B are invertible matrices of the same size, then AB is
invertible and (AB)^{-1} = B^{-1} A^{-1}.

4. Its transpose matrix A^T is invertible and (A^T)^{-1} = (A^{-1})^T.

5. kA for any nonzero scalar k is invertible, i.e., (kA)^{-1} = (1/k)A^{-1}.

6. A^k for any positive integer k is also invertible, i.e., (A^k)^{-1} = (A^{-1})^k.

7. A matrix of size 1 × 1 is invertible when its single entry is nonzero.
If A = (a), then A^{-1} = (1/a).

8. The formula for A^{-1} when n = 2 is

$$A^{-1} = \begin{bmatrix} a_{11} & a_{12}\\ a_{21} & a_{22} \end{bmatrix}^{-1} = \frac{1}{a_{11}a_{22} - a_{12}a_{21}} \begin{bmatrix} a_{22} & -a_{12}\\ -a_{21} & a_{11} \end{bmatrix},$$

provided that a11 a22 − a12 a21 ≠ 0. •
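
As a numerical check of property 8 (an illustrative sketch using the 2 × 2
matrix from the inverse example above), the closed-form formula agrees with
MATLAB's built-in inv:

>> A = [2 3; 2 2];
>> d = A(1,1)*A(2,2) - A(1,2)*A(2,1);            % a11*a22 - a12*a21 = -2
>> Ainv = (1/d)*[A(2,2) -A(1,2); -A(2,1) A(1,1)]
Ainv =
   -1.0000    1.5000
    1.0000   -1.0000
>> inv(A)                                        % same result
ans =
   -1.0000    1.5000
    1.0000   -1.0000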

Definition 1.12 (Diagonal Matrix)

It is a square matrix having all elements equal to zero except those on the
main diagonal, i.e.,

$$A = (a_{ij}), \qquad \begin{cases} a_{ij} = 0, & \text{if } i \neq j,\\ a_{ij} \neq 0, & \text{if } i = j. \end{cases}$$

Note that a diagonal matrix is invertible if all of its diagonal entries are
nonzero. •

The MATLAB function diag is used to either create a diagonal matrix


from a vector or to extract the diagonal entries of a matrix. If the input
argument of the diag function is a vector, MATLAB uses the vector to
create a diagonal matrix:

>> x = [2, 2, 2];


>> A = diag(x)
A=
2 0 0
0 2 0
0 0 2
The matrix A is called a scalar matrix because all the elements on the
main diagonal are equal to the same scalar, 2. Multiplication of a square
matrix by a scalar matrix is commutative, and the product is also a
diagonal matrix.

If the input argument of the diag function is a matrix, the result is a


vector of the diagonal elements:

>> B = [2 -4 1; 6 10 -3; 0 5 8]
>> M = diag(B)
M=
2
10
8

Definition 1.13 (Upper-Triangular Matrix)

It is a square matrix which has zero elements below and to the left of the
main diagonal. The diagonal as well as the above diagonal elements can
take on any value, i.e.,

U = (uij ), where uij = 0, if i > j.

An example of such a matrix is



 
$$U = \begin{bmatrix} 1 & 2 & 3\\ 0 & 4 & 5\\ 0 & 0 & 6 \end{bmatrix}.$$

The upper-triangular matrix is called an upper-unit-triangular matrix if
the diagonal elements are equal to one. This type of matrix is used in
solving linear algebraic equations by LU decomposition with Crout's method.
Also, if the main diagonal elements of the upper-triangular matrix are zero,
then the matrix

$$A = \begin{bmatrix} 0 & a_{12} & a_{13}\\ 0 & 0 & a_{23}\\ 0 & 0 & 0 \end{bmatrix}$$

is called a strictly upper-triangular matrix. This type of matrix will be used
in solving linear systems by iterative methods. •

Using the MATLAB command triu(A) we can create an upper-triangular


matrix from a given matrix A as follows:

>> A = [1 2 3; 4 5 6; 7 8 9];
>> U = triu(A)
U=
1 2 3
0 4 5
0 0 6
We can also create a strictly upper-triangular matrix, i.e., an
upper-triangular matrix with zero diagonal, from a given matrix A by using
the MATLAB built-in function triu(A,1) as follows:

>> A = [1 2 3; 4 5 6; 7 8 9];
>> U = triu(A,1)
U =
     0     2     3
     0     0     5
     0     0     0

Definition 1.14 (Lower-Triangular Matrix)

It is a square matrix which has zero elements above and to the right of the
main diagonal, and the rest of the elements can take on any value, i.e.,

L = (lij ), where lij = 0, if i < j.

An example of such a matrix is

$$L = \begin{bmatrix} 2 & 0 & 0\\ 3 & 1 & 0\\ 4 & 5 & 3 \end{bmatrix}.$$

The lower-triangular matrix is called a lower-unit-triangular matrix if the
diagonal elements are equal to one. This type of matrix is used in solving
linear algebraic equations by LU decomposition with Doolittle's method.
Also, if the main diagonal elements of the lower-triangular matrix are zero,
then the matrix

$$A = \begin{bmatrix} 0 & 0 & 0\\ a_{21} & 0 & 0\\ a_{31} & a_{32} & 0 \end{bmatrix}$$

is called a strictly lower-triangular matrix. We will use this type of matrix
in solving linear systems by using iterative methods. •

In a similar way, we can create a lower-triangular matrix and a strictly


lower-triangular matrix from a given matrix A by using the MATLAB
built-in functions tril(A) and tril(A,-1), respectively.

Note that all the triangular matrices (upper or lower) with nonzero
diagonal entries are invertible.

Definition 1.15 (Symmetric Matrix)

A symmetric matrix is one in which the elements aij of a matrix A in the


ith row and jth column are equal to the elements aji in the jth row and ith
column, which means that

AT = A, i.e., aij = aji , for i ≠ j.



Note that any diagonal matrix, including the identity, is symmetric. A


lower- or upper-triangular matrix is symmetric if and only if it is, in fact,
a diagonal matrix.

One way to generate a symmetric matrix is to multiply a matrix by its


transpose, since AT A is symmetric for any A. To generate a symmetric
matrix using MATLAB commands we do the following:

>> A = [1 : 4; 5 : 8; 9 : 12]
%A is not symmetric
>> B = A' * A
B=
107 122 137 152
122 140 158 176
137 158 179 200
152 176 200 224
>> C = A * A'
C=
30 70 110
70 174 278
110 278 446

Example 1.1 Find all the values of a, b, and c for which the following
matrix is symmetric:
 
4 a+b+c 0
A =  −1 3 b − c .
−a + 2b − 2c 1 b − 2c
Solution. If the given matrix is symmetric, then A = AT , i.e.,
 
4 a+b+c 0
A =  −1 3 b−c 
−a + 2b − 2c 1 b − 2c
 
4 −1 −a + 2b − 2c
= a+b+c 3 1  = AT ,
0 b − c b − 2c

which implies that

0 = −a + 2b − 2c
−1 = a + b + c
1 = b − c.

Solving the above system, we get

a = 2, b = −1, c = −2,

and using these values, we have the given matrix of the form
 
4 −1 0
A=  −1 3 1 .
0 1 3

Theorem 1.3 If A and B are symmetric matrices of the same size, and
if k is any scalar, then:

1. AT is also symmetric;

2. A + B and A − B are symmetric;

3. kA is also symmetric.

Note that the product of symmetric matrices is not symmetric in gen-


eral, but the product is symmetric if and only if the matrices commute.
Also, note that if A is a square matrix, then the matrices A, AAT , and
AT A are either all nonsingular or all singular. •

If for a matrix A, aij = −aji for i ≠ j and the main diagonal
elements are not all zero, then the matrix A is called a skew matrix. If
all the elements on the main diagonal of a skew matrix are zero, then the
matrix is called skew symmetric, i.e.,

      A = −AT , with aij = −aji for i ≠ j, and aii = 0.



Any square matrix may be split into the sum of a symmetric and a skew
symmetric matrix. Thus,
      A = (1/2)(A + AT ) + (1/2)(A − AT ),

where (1/2)(A + AT ) is a symmetric matrix and (1/2)(A − AT ) is a skew symmetric

matrix. The matrices


     
1 2 3 1 2 3 0 2 3
 2 4 5  ,  −2 4 −5  , and  −2 0 5 
3 5 6 −3 5 6 −3 5 0
are examples of symmetric, skew, and skew symmetric matrices, respec-
tively. •
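This splitting is easy to verify numerically; the matrix A below is again an
arbitrary choice of ours:

>> A = [1 2 3; 4 5 6; 7 8 10];
>> S = (A + A')/2   % symmetric part: S' = S
>> K = (A - A')/2   % skew symmetric part: K' = -K, with zero diagonal
>> A - (S + K)      % returns the zero matrix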
Definition 1.16 (Partitioned Matrix)

A matrix A is said to be partitioned if horizontal and vertical lines have


been introduced, subdividing A into submatrices called blocks. Partitioning
allows A to be written as a matrix A whose entries are its blocks. A simple
example of a partitioned matrix may be an augmented matrix, which can
be partitioned in the form
B = [A|b].
It is frequently necessary to deal separately with various groups of el-
ements, or submatrices, within a large matrix. This situation can arise
when the size of a matrix becomes too large for convenient handling, and
it becomes necessary to work with only a portion of the matrix at any one
time. Also, there will be cases in which one part of a matrix will have
a physical significance that is different from the remainder, and it is in-
structive to isolate that portion and identify it by a special symbol. For
example, the following 4 × 5 matrix A has been partitioned into four blocks
of elements, each of which is itself a matrix:
 . 
a11 a12 a13 .. a14 a15
 a21 a22 a23 ... a24 a25 
 
 
A =  a31 a32 a33 ... a34 a35  .
 
..
 
 ··· ··· ··· . ··· ··· 
 
.
a41 a42 a43 .. a44 a45

The partitioning lines must always extend entirely through the matrix
as in the above example. If the submatrices of A are denoted by the symbols
A11 , A12 , A21 , and A22 so that
   
a11 a12 a13 a14 a15
A11 =  a21 a22 a23  , A12 =  a24 a25  ,
a31 a32 a33 a34 a35
 
A21 = a41 a42 a43 , A22 = a44 a45 ,
then the original matrix can be written in the form
 
A11 A12
A= .
A21 A22
A partitioned matrix may be transposed by appropriate transposition
and rearrangement of the submatrices. For example, it can be seen by
inspection that the transpose of the matrix A is
        [ A11T  A21T ]
   AT = [            ] .
        [ A12T  A22T ]

Note that AT has been formed by transposing each submatrix of A and


then interchanging the submatrices on the secondary diagonal.

Partitioned matrices such as the one given above can be added, sub-
tracted, and multiplied provided that the partitioning is performed in an
appropriate manner. For the addition and subtraction of two matrices, it is
necessary that both matrices be partitioned in exactly the same way. Thus,
a partitioned matrix B of order 4 × 5 (compare with matrix A above) will
be conformable for addition with A only if it is partitioned as follows:
 . 
b11 b12 b13 .. b14 b15
 b21 b22 b23 ... b24 b25 
 
 
B =  b31 b32 b33 ... b34 b35  .
 
..
 
 ··· ··· ··· . ··· ··· 
 
.
b41 b42 b43 .. b44 b45

It can be expressed in the form


 
B11 B12
B= ,
B21 B22

in which B11 , B12 , B21 , and B22 represent the corresponding submatrices.
In order to add A and B and obtain a sum C, it is necessary according to
the rules for addition of matrices that the following represent the sum:
   
A11 + B11 A12 + B12 C11 C12
A+B = = = C.
A21 + B21 A22 + B22 C21 C22

Note that like A and B, the sum matrix C will also have the same par-
titions.

The conformability requirement for multiplication of partitioned matri-


ces is somewhat different from that for addition and subtraction. To show
the requirement, consider again the matrix A given previously and assume
that it is to be postmultiplied by a matrix D, which must have five rows but
may have any number of columns. Also assume that D is partitioned into
four submatrices as follows:
 
D11 D12
D= .
D21 D22

Then, when forming the product AD according to the usual rules for
matrix multiplication, the following result is obtained:
  
A11 A12 D11 D12
M = AD =
A21 A22 D21 D22
 
A11 D11 + A12 D21 A11 D12 + A12 D22
=
A21 D11 + A22 D21 A21 D12 + A22 D22
 
M11 M12
= .
M21 M22

Thus, the multiplication of the two partitioned matrices is possible if


the columns of the first partitioned matrix are partitioned in exactly the

same way as the rows of the second partitioned matrix. It does not matter
how the rows of the first partitioned matrix and the columns of the second
partitioned matrix are partitioned. •
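The block product above can be checked numerically in MATLAB. The sketch
below uses matrices and partition sizes of our own choosing: a 4 × 5 matrix A
split after the third row and third column, and a 5 × 2 matrix D split after its
third row (so the column blocks D12 and D22 are not needed):

>> A = [magic(4) ones(4,1)];     % any 4 x 5 matrix
>> D = [ones(5,1) (1:5)'];       % any 5 x 2 matrix
>> A11 = A(1:3,1:3); A12 = A(1:3,4:5);
>> A21 = A(4,1:3);   A22 = A(4,4:5);
>> D11 = D(1:3,:);   D21 = D(4:5,:);
>> M = [A11*D11 + A12*D21; A21*D11 + A22*D21];
>> M - A*D                       % returns the zero matrix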

Definition 1.17 (Band Matrix)

An n × n square matrix A is called a band matrix if there exist positive
integers p and q, with 1 < p and q < n, such that

      aij = 0  whenever  j − i ≥ p  or  i − j ≥ q.

The number p describes the number of diagonals above, including the


main diagonal on which the nonzero entries may lie. The number q de-
scribes the number of diagonals below, including the main diagonal on
which the nonzero entries may lie. The number p + q − 1 is called the
bandwidth of the matrix A, which tells us how many of the diagonals can
contain nonzero entries. For example, the matrix
 
      [ 1  2  3  0 ]
  A = [ 2  3  4  5 ]
      [ 0  5  6  7 ]
      [ 0  0  7  8 ]

is banded with p = 3 and q = 2, and so the bandwidth is equal to 4. An
important special case of the band matrix is the tridiagonal matrix; in this
case, p = q = 2, i.e., all nonzero elements lie either on or directly above
or below the main diagonal. For this type of matrix, Gaussian elimination
is particularly simple. In general, the nonzero elements of a tridiagonal
matrix lie in three bands: the superdiagonal, diagonal, and subdiagonal.
For example, the matrix
 
      [ 1  2                ]
      [ 2  3  1             ]
      [    3  2  1          ]
  A = [       2  4  3       ]
      [          1  2  3    ]
      [             1  6  4 ]
      [                3  4 ]

is a tridiagonal matrix.
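In MATLAB, a tridiagonal matrix such as the one above can be assembled from
its three bands with the diag function; the band entries below are read off the
displayed matrix:

>> d = [1 3 2 4 2 6 4];          % main diagonal
>> u = [2 1 1 3 3 4];            % superdiagonal
>> l = [2 3 2 1 1 3];            % subdiagonal
>> A = diag(d) + diag(u,1) + diag(l,-1)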

A matrix which is predominantly zero is called a sparse matrix. A band


matrix or a tridiagonal matrix is a sparse matrix, but the nonzero elements
of a sparse matrix are not necessarily near the diagonal. •

Definition 1.18 (Permutation Matrix)

A permutation matrix P has entries 0 and 1 only, with exactly one 1 in each
row and each column of P . For example, the following matrices are permutation
matrices:
 
  0 1 0 0
1 0 0  1 0 0 0 
P =  0 0 1 , P =  0 0 1 0 .

0 1 0
0 0 0 1

The product P A has the same rows as A but in a different order (permuted),
while AP is just A with the columns permuted. •
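These facts are easy to check in the MATLAB Command Window with a small
example of our own:

>> P = [0 1 0; 0 0 1; 1 0 0];    % a 3 x 3 permutation matrix
>> A = [1 2 3; 4 5 6; 7 8 9];
>> P*A                            % the rows of A permuted
>> A*P                            % the columns of A permuted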

1.2.3 Solutions of Linear Systems of Equations


Here we shall discuss the familiar technique called the method of elimina-
tion to find the solutions of linear systems. This method starts with the
augmented matrix of the given linear system and obtains a matrix of a
certain form. This new matrix represents a linear system that has exactly
the same solutions as the given original system. In the following, we define
two well-known forms of a matrix.

Definition 1.19 (Row Echelon Form)

An m × n matrix A is said to be in row echelon form if it satisfies the


following properties:

1. Any rows consisting entirely of zeros are at the bottom.

2. The first entry from the left of a nonzero row is 1. This entry is
called the leading one of its row.

3. For each nonzero row, the leading one appears to the right and below
any leading ones in preceding rows.
Note that, in particular, in any column containing a leading one, all
entries below the leading one are zero. For example, the following matrices
are in row echelon form:

  [ 1 2 1 ]   [ 1 0 2 ]   [ 1 2 3 4 ]   [ 0 1 0 1 ]
  [ 0 1 3 ] , [ 0 1 2 ] , [ 0 0 1 2 ] , [ 0 0 1 0 ] .
  [ 0 0 0 ]   [ 0 0 1 ]   [ 0 0 0 0 ]   [ 0 0 0 0 ]
                                        [ 0 0 0 0 ]

Observe that a matrix in row echelon form is actually the augmented


matrix of a linear system (i.e., the last column is the right-hand side of
the system Ax = b), and the system is quite easy to solve by backward
substitution. For example, writing the first above matrix in linear system
form, we have
x1 + 2x2 = 1
x2 = 3.
There is no need to involve the last equation, which is

      0x1 + 0x2 = 0,

since it is satisfied for any choice of x1 and x2 . Thus, by using backward
substitution, we get

x2 = 3 and x1 = (1 − 2x2 ) = (1 − 2(3)) = −5.

So the unique solution of the linear system is [−5, 3]T .

Similarly, the linear system that corresponds to the second above matrix
is
x1 = 2
x2 = 2
0 = 1.
The third equation of this system shows that

0x1 + 0x2 = 1,

which is not possible for any choices of x1 and x2 . Hence, the system has
no solution.

Finally, the linear system that corresponds to the third above matrix is

x1 + 2x2 + 3x3 = 4
x3 = 2
0x1 + 0x2 + 0x3 = 0,

and by backward substitution (without using the third equation of the sys-
tem), we get

x3 = 2, and x1 = 4 − 2x2 − 3x3 = −2 − 2x2 .

By choosing an arbitrary value of x2 , we will get a corresponding value of
x1 , which implies that we have infinitely many solutions for such a linear
system. •

If we add one more property in the above definition of row echelon


form, then we will get another well-known form of a matrix, called reduced
row echelon form, which we define as follows.

Definition 1.20 (Reduced Row Echelon Form)

An m × n matrix A is said to be in reduced row echelon form if it satisfies


the following properties:

1. Any rows consisting entirely of zeros are at the bottom.

2. The first entry from the left of a nonzero row is 1. This entry is
called the leading one of its row.

3. For each nonzero row, the leading one appears to the right and below
any leading ones in preceding rows.

4. If a column contains a leading one, then all other entries in that


column (above and below a leading one) are zeroes.

For example, the following matrices are in reduced row echelon form:

  [ 1 0 1 ]   [ 1 0 0 2 ]   [ 1 4 5 0 ]   [ 1 1 0 0 0 ]
  [ 0 1 2 ] , [ 0 1 0 4 ] , [ 0 0 0 1 ] , [ 0 0 1 0 2 ] ,
  [ 0 0 0 ]   [ 0 0 1 6 ]   [ 0 0 0 0 ]   [ 0 0 0 1 1 ]

and the following matrices are not in reduced row echelon form:

  [ 1 3 0 2 ]   [ 1 3 0 2 ]   [ 1 0 0 3 ]   [ 1 0 2 0 0 ]
  [ 0 0 0 0 ] , [ 0 0 5 4 ] , [ 0 0 1 2 ] , [ 0 1 1 0 2 ] .
  [ 0 0 1 4 ]   [ 0 0 0 1 ]   [ 0 1 0 6 ]   [ 0 2 0 1 1 ]

Note that a useful property of matrices in reduced row echelon form


is that if A is an n × n matrix in reduced row echelon form not equal to
identity matrix In , then A has a row consisting entirely of zeros. •

There are usually many sequences of row operations that can be used to
transform a given matrix to reduced row echelon form—they all, however,
lead to the same reduced row echelon form. In the following, we shall
discuss how to transform a given matrix in reduced row echelon form.

Definition 1.21 (Elementary Row Operations)


It is the procedure that can be used to transform a given matrix into row
echelon or reduced row echelon form. An elementary row operation on an
m × n matrix A is any of the following operations:

1. Interchanging two rows of a matrix A;

2. Multiplying a row of A by a nonzero constant;

3. Adding a multiple of a row of A to another row.

Observe that when a matrix is viewed as the augmented matrix of a


linear system, the elementary row operations are equivalent, respectively,
to interchanging two equations, multiplying an equation by a nonzero con-
stant, and adding a multiple of an equation to another equation. •

Example 1.2 Consider the matrix


 
0 0 1 3
A =  3 2 0 4 .
4 4 8 12

Interchanging rows 1 and 2 gives


 
3 2 0 4
R1 =  0 0 1 3  .
4 4 8 12

Multiplying the third row of A by 1/4, we get


 
0 0 1 3
R2 =  3 2 0 4 .
1 1 2 3

Adding (−2) times row 2 of A to row 3 of A gives


 
0 0 1 3
R3 =  3 2 0 4 .
−2 0 8 4

Observe that in obtaining R3 from A, row 2 did not change. •

Theorem 1.4 Every matrix can be brought to reduced row echelon form
by a series of elementary row operations. •

Example 1.3 Consider the matrix


 
1 −3 0 0 1
A =  2 −6 −1 1 1 .
3 −9 2 −1 5

Using the finite sequence of elementary row operations, we get the matrix
of the form  
1 −3 0 0 1
R1 =  0 0 1 −1 1  ,
0 0 0 1 0

which is in row echelon form. If we continue with the matrix R1 and make
all elements above the leading one equal to zero, we obtain
 
1 −3 0 0 1
R2 =  0 0 1 0 1 ,
0 0 0 1 0

which is the reduced row echelon form of the given matrix A. •

MATLAB has a built-in function rref used to arrive directly at the reduced
row echelon form of a matrix. For example, using the above given matrix, we
do the following:

>> A = [1 − 3 0 0 1; 2 − 6 − 1 1 1; 3 − 9 2 − 1 5];
>> B = rref (A)
B=
1 −3 0 0 1
0 0 1 0 1
0 0 0 1 0

Definition 1.22 (Row Equivalent Matrix)

An m × n matrix A is said to be row equivalent to an m × n matrix B if B


can be obtained by applying a finite sequence of elementary row operations
to the matrix A. •

Example 1.4 Consider the matrix


 
1 3 6 5
A= 2 1 4 3 .
3 −4 3 4

If we add (−1) times row 1 of A to its third row, we get


 
1 3 6 5
R1 =  2 1 4 3 ,
2 −7 −3 −1

so R1 is row equivalent to A.

Interchanging row 2 and row 3 of the matrix R1 gives the matrix of the
form  
1 3 6 5
R2 =  2 −7 −3 −1  ,
2 1 4 3
so R2 is row equivalent to R1 .

Multiplying row 2 of R2 by (−2), we obtain


 
1 3 6 5
R3 =  −4 14 6 2  ,
2 1 4 3
so R3 is row equivalent to R2 .

It then follows that R3 is row equivalent to the given matrix A since


we obtained the matrix R3 by applying three successive elementary row
operations to A. •

Theorem 1.5
1. Every matrix is row equivalent to itself.
2. If a matrix A is row equivalent to a matrix B, then B is row equivalent
to A.
3. If a matrix A is row equivalent to a matrix B and B is row equivalent
to a matrix C, then A is row equivalent to C. •

Theorem 1.6 Every m × n matrix is row equivalent to a unique matrix


in reduced row echelon form. •

Example 1.5 Use elementary row operations on matrices to solve the lin-
ear system
− x2 + x3 = 1
x1 − x2 − x3 = 1
−x1 + 3x3 = −2.

Solution. The process begins with the augmented matrix form


              [  0  −1   1 |  1 ]
              [  1  −1  −1 |  1 ]
              [ −1   0   3 | −2 ].
Interchanging the first and the second rows gives
              [  1  −1  −1 |  1 ]
              [  0  −1   1 |  1 ]
              [ −1   0   3 | −2 ].
Adding (1) times row 1 of the above matrix to its third row, we get
              [ 1  −1  −1 |  1 ]
              [ 0  −1   1 |  1 ]
              [ 0  −1   2 | −1 ].
Now multiplying the second row by −1 gives
              [ 1  −1  −1 |  1 ]
              [ 0   1  −1 | −1 ]
              [ 0  −1   2 | −1 ].
Replace row 1 with the sum of itself and (1) times row 2, and then also
replace row 3 with the sum of itself and (1) times row 2, and we get the
matrix of the form
              [ 1  0  −2 |  0 ]
              [ 0  1  −1 | −1 ]
              [ 0  0   1 | −2 ].
Replace row 1 with the sum of itself and (2) times row 3, and then replace
row 2 with the sum of itself and (1) times the row 3, and we get
              [ 1  0  0 | −4 ]
              [ 0  1  0 | −3 ]
              [ 0  0  1 | −2 ].

Now by writing in equation form and using backward substitution

x1 = −4
x2 = −3
x3 = −2,

and we get the solution [−4, −3, −2]T of the given linear system. •

1.2.4 The Determinant of a Matrix


The determinant is a certain kind of a function that associates a real num-
ber with a square matrix. We will denote the determinant of a square
matrix A by det(A) or |A|.

Definition 1.23 (Determinant of a Matrix)

Let A = (aij ) be an n × n square matrix, then a determinant of A is given


by:

1. det(A) = a11 , if n = 1.

2. det(A) = a11 a22 − a12 a21 , if n = 2. •

For example, if
   
4 2 6 3
A= and B = ,
−3 7 2 5

then

det(A) = (4)(7) − (−3)(2) = 34 and det(B) = (6)(5) − (3)(2) = 24.

Notice that the determinant of a 2 × 2 matrix is given by the difference


of the products of the two diagonals of a matrix. The determinant of a
3 × 3 matrix is defined in terms of the determinants of 2 × 2 matrices, and
the determinant of a 4 × 4 matrix is defined in terms of the determinants
of 3 × 3 matrices, and so on.

The MATLAB built-in function det(A) calculates the determinant of the
square matrix A:

>> A = [2 2; 6 7];
>> B = det(A)
B=
2.0000
Another way to find the determinants of only 2 × 2 and 3 × 3 matrices
can be found easily and quickly using diagonals (or direct evaluation). For
a 2 × 2 matrix, the determinant can be obtained by forming the product of
the entries on the line from left to right and subtracting from this number
the product of the entries on the line from right to left. For a matrix of
size 3 × 3, the diagonals of an array consisting of the matrix with the first
two columns added to the right are used. Then the determinant can be
obtained by forming the sum of the products of the entries on the lines
from left to right, and subtracting from this number the products of the
entries on the lines from right to left, as shown in Figure (1.2).

Thus, for a 2 × 2 matrix

      |A| = a11 a22 − a12 a21 ,

and for a 3 × 3 matrix

      |A| = a11 a22 a33 + a12 a23 a31 + a13 a21 a32 − a13 a22 a31 − a11 a23 a32 − a12 a21 a33 ,

where the first three terms are the diagonal products taken from left to right
and the last three terms are the diagonal products taken from right to left.

For example, the determinant of a 2 × 2 matrix can be computed as

      |A| = | 12  5; −7  6 | = (12)(6) − (5)(−7) = 72 + 35 = 107,

and the determinant of a 3 × 3 matrix can be obtained as

      |A| = | 4  5  6; −3  8  2; 4  9  7 |

          = [(4)(8)(7) + (5)(2)(4) + (6)(−3)(9)] − [(6)(8)(4) + (4)(2)(9) + (5)(−3)(7)]

          = 102 − 159 = −57.
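The diagonal rule can be checked against the MATLAB function det, which
returns the same value for the 3 × 3 matrix above:

>> A = [4 5 6; -3 8 2; 4 9 7];
>> det(A)                         % returns -57, as computed above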



Figure 1.2: Direct evaluation of 2 × 2 and 3 × 3 determinants.

For finding the determinants of the higher-order matrices, we will define


the following concepts called the minor and cofactor of matrices.

Definition 1.24 (Minor of a Matrix)

The minor Mij of the element aij of a matrix A of order n × n is the
determinant of the submatrix of order (n − 1) × (n − 1) obtained from
A by deleting the ith row and jth column (also called the ijth minor of A).
For example, let
 
2 3 −1
A= 5 3 2 ,
4 −2 4

then the minor M11 will be obtained by deleting the first row and the first
column of the given matrix A, i.e.,


      M11 = | 3  2; −2  4 | = (3)(4) − (−2)(2) = 12 + 4 = 16.

Similarly, we can find the other possible minors of the given matrix as

follows:

      M12 = | 5  2; 4  4 |  = 20 − 8  = 12

      M13 = | 5  3; 4 −2 |  = −10 − 12 = −22

      M21 = | 3 −1; −2  4 | = 12 − 2  = 10

      M22 = | 2 −1; 4  4 |  = 8 + 4   = 12

      M23 = | 2  3; 4 −2 |  = −4 − 12 = −16

      M31 = | 3 −1; 3  2 |  = 6 + 3   = 9

      M32 = | 2 −1; 5  2 |  = 4 + 5   = 9

      M33 = | 2  3; 5  3 |  = 6 − 15  = −9,

which are the required minors of the given matrix. •


Definition 1.25 (Cofactor of a Matrix)

The cofactor Aij of all elements aij of a matrix A of order n × n is given


by
Aij = (−1)i+j Mij ,
where Mij is the minor of all elements aij of a matrix A. For example, the
cofactors Aij of all elements aij of the matrix
 
2 3 −1
A= 5 3 2 
4 −2 4

are computed as follows:

      A11 = (−1)^(1+1) M11 =  M11 =  16
      A12 = (−1)^(1+2) M12 = −M12 = −12
      A13 = (−1)^(1+3) M13 =  M13 = −22
      A21 = (−1)^(2+1) M21 = −M21 = −10
      A22 = (−1)^(2+2) M22 =  M22 =  12
      A23 = (−1)^(2+3) M23 = −M23 =  16
      A31 = (−1)^(3+1) M31 =  M31 =   9
      A32 = (−1)^(3+2) M32 = −M32 =  −9
      A33 = (−1)^(3+3) M33 =  M33 =  −9,

which are the required cofactors of the given matrix. •

To get the above results, we use the MATLAB command window as


follows:

>> A = [2 3 -1; 5 3 2; 4 -2 4];
>> CofA = cofactor(A, 1, 1);
>> CofA = cofactor(A, 1, 2);
>> CofA = cofactor(A, 1, 3);
>> CofA = cofactor(A, 2, 1);
>> CofA = cofactor(A, 2, 2);
>> CofA = cofactor(A, 2, 3);
>> CofA = cofactor(A, 3, 1);
>> CofA = cofactor(A, 3, 2);
>> CofA = cofactor(A, 3, 3);

Program 1.2
MATLAB m-file for Finding Minors and Cofactors
of a Matrix
function CofA = cofactor(A,i,j)
% Cofactor of the (i,j) entry of the square matrix A
[m,n] = size(A);
if m ~= n, error('Matrix must be square'), end
A1 = A([1:i-1,i+1:n],[1:j-1,j+1:n]);   % delete row i and column j
Minor = det(A1);                       % the (i,j) minor
CofA = (-1)^(i+j)*Minor;

Definition 1.26 (Cofactor Expansion of a Determinant of


a Matrix)

Let A be a square matrix, then we define the determinant of A as the sum


of the products of the elements of the first row and their cofactors. If A is
a 3 × 3 matrix, then its determinant is defined as

      det(A) = |A| = a11 A11 + a12 A12 + a13 A13 .

Similarly, in general, for an n × n matrix, we define it as

      det(A) = |A| = Σ aij Aij ,   n > 2,                        (1.12)

where the summation is on i for any fixed value of the jth column (1 ≤ j ≤
n), or on j for any fixed value of the ith row (1 ≤ i ≤ n), and Aij is the
cofactor of element aij . •

Example 1.6 Find the minors and cofactors of the matrix A and use them
to evaluate the determinant of the matrix
 
3 1 −4
A= 2 5 6 .
1 4 8

Solution. The minors of A are calculated as follows:



      M11 = | 5  6; 4  8 | = 40 − 24 = 16

      M12 = | 2  6; 1  8 | = 16 − 6  = 10

      M13 = | 2  5; 1  4 | = 8 − 5   = 3.

From these values of the minors, we can calculate the cofactors of the
elements of the given matrix as follows:

      A11 = (−1)^(1+1) M11 =  M11 =  16
      A12 = (−1)^(1+2) M12 = −M12 = −10
      A13 = (−1)^(1+3) M13 =  M13 =   3.

Now by using the cofactor expansion along the first row, we can find
the determinant of the matrix as follows:

det(A) = a11 A11 +a12 A12 +a13 A13 = (3)(16)+(1)(−10)+(−4)(3) = 26.

Note that in Example 1.6, we computed the determinant of the matrix


by using the cofactor expansion along the first row, but it can also be found
along the first column of the matrix.

To get the results of Example 1.6, we use the MATLAB Command


Window as follows:

>> A = [3 1 -4; 2 5 6; 1 4 8];
>> DetA = CofFexp(A);

Program 1.3
MATLAB m-file for Finding the Determinant of a
Matrix by Cofactor Expansion
function DetA = CofFexp(A)
% Determinant of A by cofactor expansion along the first row
[m,n] = size(A);
if m ~= n, error('Matrix must be square'), end
a = A(1,:); c = [];
for i = 1:n
    c1i = cofactor(A,1,i);   % cofactor of the (1,i) entry
    c = [c; c1i];
end
DetA = a*c;

Theorem 1.7 (The Laplace Expansion Theorem)

The determinant of an n × n matrix A = {aij }, when n ≥ 2, can be


computed as

      det(A) = ai1 Ai1 + ai2 Ai2 + · · · + ain Ain ,

which is called the cofactor expansion along the ith row, and also as

      det(A) = a1j A1j + a2j A2j + · · · + anj Anj ,

and is called the cofactor expansion along the jth column. This is called
the Laplace expansion theorem. •
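The theorem can be illustrated numerically with the cofactor function of
Program 1.2: for the matrix of Example 1.6, the expansion along the second
row and the expansion along the third column both return the determinant 26.

>> A = [3 1 -4; 2 5 6; 1 4 8];
>> d2 = A(2,1)*cofactor(A,2,1) + A(2,2)*cofactor(A,2,2) + A(2,3)*cofactor(A,2,3)
>> d3 = A(1,3)*cofactor(A,1,3) + A(2,3)*cofactor(A,2,3) + A(3,3)*cofactor(A,3,3)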

Note that the cofactor and minor of an element aij differ only in sign,
i.e., Aij = ±Mij . A quick way to determine whether to use the + or −
is to use the fact that the sign relating Aij and Mij is in the ith row and
jth column of the checkerboard array
      [ + − + − + ··· ]
      [ − + − + − ··· ]
      [ + − + − + ··· ]
      [ − + − + − ··· ]
      [ :  :  :  :  : ]
For example, A11 = M11 , A21 = −M21 , A12 = −M12 , A22 = M22 , and so on.

Definition 1.27 (Cofactor Matrix)

If A is any n × n matrix and Aij is the cofactor of aij , then the matrix
 
A11 A12 · · · A1n
 A21 A22 · · · A2n 
 
 .. .. .. 
 . . ··· . 
An1 An2 · · · Ann

is called the matrix of cofactors of A. For example, the cofactor matrix of
the matrix  
3 2 −1
A= 1 6 3 
2 −4 0
can be calculated as follows:

A11 = 12, A12 = 6, A13 = −16


A21 = 4, A22 = 2, A23 = 16
A31 = 12, A32 = −10, A33 = 16.

So that the matrix of the form


 
12 6 −16
 4 2 16 
12 −10 16

is the required cofactor matrix of the given matrix. •



Definition 1.28 (Adjoint of a Matrix)

If A is any n × n matrix and Aij is the cofactor of aij of A, then the


transpose of this matrix is called the adjoint of A and is denoted by Adj(A).
For example, the cofactor matrix of the matrix
 
3 2 −1
A= 1 6 3 
2 −4 0
is calculated as  
12 6 −16
 4 2 16  .
12 −10 16
So by taking its transpose, we get the matrix
 T  
12 6 −16 12 4 12
 4 2 16  =  6 2 −10  = Adj(A),
12 −10 16 −16 16 16
which is called the adjoint of the given matrix A. •
Example 1.7 Find the determinant of the following matrix using cofactor
expansion and show that det(A) = 0 when x = 4:
 
x+2 x 2
A= 1 x − 1 3 .
4 x+1 x
Solution. Using the cofactor expansion along the first row, we compute
the determinant of the given matrix as
|A| = a11 C11 + a12 C12 + a13 C13 ,
where

      C11 = (−1)^(1+1) M11 =  M11 =  | x−1  3; x+1  x | = x^2 − 4x − 3

      C12 = (−1)^(1+2) M12 = −M12 = −| 1  3; 4  x |     = −x + 12

      C13 = (−1)^(1+3) M13 =  M13 =  | 1  x−1; 4  x+1 | = −3x + 5.

Thus,

      |A| = (x + 2)[x^2 − 4x − 3] + x[−x + 12] + 2[−3x + 5] = x^3 − 3x^2 − 5x + 4.

Now taking x = 4, we get

      |A| = (4)^3 − 3(4)^2 − 5(4) + 4 = 64 − 48 − 20 + 4 = 0,

which is the required determinant of the matrix at x = 4. •

The following are special properties, which will be helpful in reducing the
amount of work involved in evaluating determinants.

Theorem 1.8 (Properties of the Determinant)

Let A be an n × n matrix:

1. The determinant of a matrix A is zero if any row or column is zero


or equal to a linear combination of other rows and columns.
For example, if  
3 1 0
A =  2 1 0 ,
4 3 0
then det(A) = 0.

2. A determinant of a matrix A is changed in sign if the two rows or


two columns are interchanged. For example, if
 
3 2
A= ,
4 5

then det(A) = 7, but for the matrix


 
4 5
B= ,
3 2

obtained from the matrix A by interchanging its rows, we have det(B) =


−7.
Matrices and Linear Systems 49

3. The determinant of a matrix A is equal to the determinant of its


transpose. For example, if
 
5 3
A= ,
4 4

then det(A) = 8, and for the matrix


 
5 4
B= ,
3 4

obtained from the matrix A by taking its transpose, we have

det(B) = 8 = det(A).

4. If the matrix B is obtained from the matrix A by multiplying every


element in one row or in one column by k, then the determinant of
the matrix B is equal to k times the determinant of A. For example,
if  
6 5
A= ,
3 4
then det(A) = 9, but for the matrix
 
12 10
B= ,
3 4

obtained from the matrix A by multiplying its first row by 2, we have

det(B) = 18 = 2(9) = 2 det(A).

5. If the matrix B is obtained from the matrix A by adding to a row (or


a column) a multiple of another row (or another column) of A, then
the determinant of the matrix B is equal to the determinant of A.
For example, if  
4 3
A= ,
5 4

then det(A) = 1, and for the matrix


 
4 3
B= ,
13 10
obtained from the matrix A by adding to its second row 2 times the
first row, we have
det(B) = 1 = det(A).

6. If two rows or two columns of a matrix A are identical, then the


determinant is zero. For example, if
 
2 3
A= ,
2 3
then det(A) = 0.
7. The determinant of a product of matrices is the product of the deter-
   minants of the matrices. For example, if
   
3 4 5 1 2 3
A =  3 2 1 , B =  4 2 3 ,
2 1 6 1 3 5
then det(A) = −36 and det(B) = −3. Also,
 
24 29 46
AB =  12 13 20  ,
12 24 39
then det(AB) = 108. Thus,
det(A) det(B) = (−36)(−3) = 108 = det(AB).

8. The determinant of a triangular matrix (upper-triangular or lower-


triangular matrix) is equal to the product of all their main diagonal
elements. For example, if
 
3 4 5
A =  0 4 7 ,
0 0 5

then
det(A) = (3)(4)(5) = 60.

9. The determinant of an n × n matrix A times the scalar multiple k is


equal to k n times the determinant of the matrix A, i.e., det(kA) =
k n det(A). For example, if
 
3 4 5
A =  2 3 6 ,
1 0 5

then det(A) = 14, and for the matrix


 
6 8 10
B = 2A =  4 6 12  ,
2 0 10
obtained from the matrix A by multiplying by 2, we have

det(B) = 112 = 8(14) = 23 det(A).

10. The determinant of the kth power of a matrix A is equal to the kth
power of the determinant of the matrix A, i.e., det(Ak ) = (det(A))k .
For example, if  
2 −2 0
A= 2 3 −1  ,
1 0 1
then det(A) = 12, and for the matrix
 
−18 −30 12
B = A3 =  24 −3 −9  ,
3 −12 3
obtained by taking the cubic power of the matrix A, we have

det(B) = 1728 = (12)3 = (det(A))3 .

11. The determinant of a 1 × 1 matrix is equal to the element itself. For
    example, if A = (8), then det(A) = 8.

Example 1.8 Find all the values of α for which det(A) = 0, where
 
α−3 1 0
A= 0 α − 1 1 .
0 2 α

Solution. We find the determinant of the given matrix by using the co-
factor expansion along the first row, so we compute

      |A| = a11 C11 + a12 C12 + a13 C13

          = (α − 3) | α−1  1; 2  α | − (1) | 0  1; 0  α | + 0

          = (α − 3)[(α − 1)(α) − 2] − [0 − 0] + 0

          = (α − 3)[α^2 − α − 2]

          = (α − 3)(α + 1)(α − 2).

Given that det(A) = 0, this implies

      (α − 3)(α + 1)(α − 2) = 0,

which gives

      α = −1,   α = 2,   α = 3,

the required values of α for which det(A) = 0. •

Example 1.9 Find all the values of α such that


 
  3 −1 0
4α α
det = det  0 α −2  .
1 α+1
−1 3 α+1

Solution. Since

      | 4α  α; 1  α+1 | = 4α(α + 1) − α,

which is equivalent to

      | 4α  α; 1  α+1 | = 4α^2 + 3α.

Also,

      | 3  −1  0; 0  α  −2; −1  3  α+1 |
          = 3[α(α + 1) + 6] − (−1)[(0)(α + 1) − 2] + 0[(0)(3) − (−1)(α)],

which can be written as

      | 3  −1  0; 0  α  −2; −1  3  α+1 | = 3α^2 + 3α + 16.

Given that the two determinants are equal, we get

      4α^2 + 3α = 3α^2 + 3α + 16.
Simplifying this quadratic polynomial, we have

      α^2 = 16 or α^2 − 16 = 0,

which gives
α = −4 and α = 4,
the required values of α. •

Example 1.10 Find the determinant of the matrix


 
−5a −5b −5c
A =  2d − g 2e − h 2f − i  ,
2d 2e 2f

if  
a b c
det  d e f  = 4.
g h i

Solution. Using the property of the determinant, we get

      |A| = (−5) | a  b  c; 2d−g  2e−h  2f−i; 2d  2e  2f |.

Subtracting the third row from the second row gives

      |A| = (−5) | a  b  c; −g  −h  −i; 2d  2e  2f |.

Interchanging the last two rows, we get

      |A| = (−5)(−1) | a  b  c; 2d  2e  2f; −g  −h  −i |,

or

      |A| = (−5)(−1)(2)(−1) | a  b  c; d  e  f; g  h  i |.

Since it is given that

      | a  b  c; d  e  f; g  h  i | = 4,

we have

      |A| = (−5)(−1)(2)(−1)(4) = −40,

the required determinant of the given matrix. •

Elimination Method for Evaluating a Determinant

One can easily transform the given determinant into upper-triangular form
by using the following row operations:

1. Add a multiple of one row to another row, and this will not affect
the determinant.

2. Interchange two rows of the determinant, and this will be done by


multiplying the determinant by −1.
After transforming the given determinant into upper-triangular form,
then use the fact that the determinant of a triangular matrix is the product
of its diagonal elements.
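A minimal MATLAB sketch of this elimination strategy (an illustrative function
of our own, not one of the book's numbered programs; it swaps in a nonzero
pivot when necessary and otherwise does no pivoting) is:

function d = DetElim(A)
% Determinant of a square matrix by reduction to upper-triangular form
n = size(A,1); d = 1;
for k = 1:n-1
    if A(k,k) == 0
        r = find(A(k+1:n,k),1) + k;      % first nonzero entry below the pivot
        if isempty(r), d = 0; return, end
        A([k r],:) = A([r k],:);         % row interchange ...
        d = -d;                          % ... multiplies the determinant by -1
    end
    for i = k+1:n
        A(i,:) = A(i,:) - (A(i,k)/A(k,k))*A(k,:);  % leaves the determinant unchanged
    end
end
d = d*prod(diag(A));                     % product of the diagonal elements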
Example 1.11 Find the following determinant:

      | 3  6  9; 6  2  −7; −3  1  −1 |.

Solution. Multiplying row 1 of the determinant by 1/3 gives

      (3) | 1  2  3; 6  2  −7; −3  1  −1 |.

Now to create the zeros below the main diagonal, column by column, we
do as follows:

Replace the second row of the determinant with the sum of itself and (−6)
times the first row of the determinant, and then replace the third row of
the determinant with the sum of itself and (3) times the first row of the
determinant, which gives

      (3) | 1  2  3; 0  −10  −25; 0  7  8 |.

Multiplying row 2 of the determinant by −1/10 gives

      (3)(−10) | 1  2  3; 0  1  5/2; 0  7  8 |.

Replacing the third row of the determinant with the sum of itself and (−7)
times the second row of the determinant, we obtain

      (3)(−10) | 1  2  3; 0  1  5/2; 0  0  −19/2 | = (3)(−10)(1)(1)(−19/2) = 285,

which is the required value of the given determinant. •

Theorem 1.9 If A is an invertible matrix, then:

1. det(A) ≠ 0.

2. det(A−1) = 1/det(A).

3. A−1 = Adj(A)/det(A).

4. (adj(A))−1 = A/det(A) = adj(A−1).

5. det(adj(A)) = det(A)^(n−1). •
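These properties are easy to spot-check numerically; e.g., for the 2 × 2 matrix
used earlier with det(A) = 34:

>> A = [4 2; -3 7];
>> det(A)            % returns 34
>> det(inv(A))       % returns 1/34 = 0.0294..., confirming property 2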

By using Theorem 1.9 we can find the inverse of a matrix by showing


that the determinant of a matrix is not equal to zero and by using the
adjoint and determinant of the given matrix A.

Example 1.12 For what values of α does the following matrix have an
inverse?  
1 0 α
A=  2 2 1 
0 2α 1
Solution. We find the determinant of the given matrix by using cofactor
expansion along the first row as follows:

      |A| = a11 C11 + a12 C12 + a13 C13 ,

which is equal to

      |A| = (1)C11 + (0)C12 + (α)C13 = C11 + αC13 .

Now we compute the values of C11 and C13 as follows:

      C11 = (−1)^(1+1) M11 = M11 = | 2  1; 2α  1 | = 2 − 2α

      C13 = (−1)^(1+3) M13 = M13 = | 2  2; 0  2α | = 4α.

Thus,

      |A| = C11 + αC13 = 2 − 2α + 4α^2.

From Theorem 1.9 we know that the matrix has an inverse if det(A) ≠ 0,
so we require

      |A| = 4α^2 − 2α + 2 ≠ 0,

which implies that

      2α^2 − α + 1 ≠ 0.

The discriminant of this quadratic is (−1)^2 − 4(2)(1) = −7 < 0, so it has no
real roots, and the inequality holds for every real α. Hence, the given matrix
has an inverse for all real values of α. •

Example 1.13 Use the adjoint method to compute the inverse of the fol-
lowing matrix:  
1 2 −1
A =  2 −1 1 .
1 2 2
Also, find the inverse and determinant of the adjoint matrix.

Solution. First, we compute the determinant of the given matrix as fol-


lows:
|A| = a11 C11 + a12 C12 + a13 C13 ,
which gives

      |A| = (1)(−4) + (2)(−3) + (−1)(5) = −15.
Now we compute the nine cofactors as follows:


      C11 = +| −1  1; 2  2 |  = −4,   C12 = −| 2  1; 1  2 |  = −3,   C13 = +| 2  −1; 1  2 | = 5,

      C21 = −| 2  −1; 2  2 | = −6,   C22 = +| 1  −1; 1  2 | = 3,    C23 = −| 1  2; 1  2 |  = 0,

      C31 = +| 2  −1; −1  1 | = 1,   C32 = −| 1  −1; 2  1 | = −3,   C33 = +| 1  2; 2  −1 | = −5.
Thus, the cofactor matrix has the form
 
−4 −3 5
 −6 3 0 ,
1 −3 −5
and the adjoint is the transpose of the cofactor matrix
 T  
−4 −3 5 −4 −6 1
adj(A) =  −6 3 0  =  −3 3 −3  .
1 −3 −5 5 0 −5

To get the adjoint of the matrix of Example 1.13, we use the MATLAB
Command Window as follows:

>> A = [1 2 − 1; 2 − 1 1; 1 2 2];
>> AdjA = Adjoint(A);

Program 1.4
MATLAB m-file for Finding the Adjoint of a Matrix
function AdjA = Adjoint(A)
% Adjoint of the square matrix A (transpose of the cofactor matrix)
[m,n] = size(A);
if m ~= n, error('Matrix must be square'), end
A1 = [];
for i = 1:n
    for j = 1:n
        A1 = [A1; cofactor(A,i,j)];
    end
end
AdjA = reshape(A1,n,n);

Then by using Theorem 1.9 we can have the inverse of the matrix as
follows:

  
      A−1 = Adj(A)/det(A) = −(1/15) [ −4  −6  1; −3  3  −3; 5  0  −5 ]

          = [ 4/15  2/5  −1/15; 1/5  −1/5  1/5; −1/3  0  1/3 ].

Using Theorem 1.9 we can compute the inverse of the adjoint matrix as

      (adj(A))−1 = A/det(A) = [ −1/15  −2/15  1/15; −2/15  1/15  −1/15; −1/15  −2/15  −2/15 ],

and the determinant of the adjoint matrix as

      det(adj(A)) = (det(A))^(3−1) = (−15)^2 = 225.

Now we consider the implementation of finding the inverse of the matrix


 
      A = [ 1  −1   1  2 ]
          [ 1   0   1  3 ]
          [ 0   0   2  4 ]
          [ 1   1  −1  1 ]

by using the adjoint and the determinant of the matrix in the MATLAB
Command Window as:

>> A = [1 -1 1 2; 1 0 1 3; 0 0 2 4; 1 1 -1 1];

The cofactors Aij of elements of the given matrix A can also be found
directly by using the MATLAB Command Window as follows:

>> A11 = (-1)^(1+1)*det(A([2:4],[2:4]));
>> A12 = (-1)^(1+2)*det(A([2:4],[1,3:4]));
>> A13 = (-1)^(1+3)*det(A([2:4],[1:2,4]));
>> A14 = (-1)^(1+4)*det(A([2:4],[1:3]));
>> A21 = (-1)^(2+1)*det(A([1,3:4],[2:4]));
>> A22 = (-1)^(2+2)*det(A([1,3:4],[1,3:4]));
>> A23 = (-1)^(2+3)*det(A([1,3:4],[1:2,4]));
>> A24 = (-1)^(2+4)*det(A([1,3:4],[1:3]));
>> A31 = (-1)^(3+1)*det(A([1:2,4],[2:4]));
>> A32 = (-1)^(3+2)*det(A([1:2,4],[1,3:4]));
>> A33 = (-1)^(3+3)*det(A([1:2,4],[1:2,4]));
>> A34 = (-1)^(3+4)*det(A([1:2,4],[1:3]));
>> A41 = (-1)^(4+1)*det(A([1:3],[2:4]));
>> A42 = (-1)^(4+2)*det(A([1:3],[1,3:4]));
>> A43 = (-1)^(4+3)*det(A([1:3],[1:2,4]));
>> A44 = (-1)^(4+4)*det(A([1:3],[1:3]));

Now form the cofactor matrix B using the Aij s as follows:

>> B =
[A11 A12 A13 A14;
A21 A22 A23 A24;
A31 A32 A33 A34;
A41 A42 A43 A44]

which gives

B=
−2 −4 −4 2
6 6 8 −4
−3 −2 −3 2
−2 −2 −4 2

The adjoint matrix is the transpose of the cofactor matrix:



>> adjA = B'

−2 6 −3 −2
−4 6 −2 −2
−4 8 −3 −4
2 −4 2 2
The determinant of the matrix can be obtained as:

>> det(A)
ans =
2
The inverse of A is the adjoint matrix divided by the determinant of A.

>> invA = (1/det(A)) * adjA;


invA =
−1 3 −1.5 −1
−2 3 −1 −1
−2 4 −1.5 −2
1 −2 1 1
Verify the results by finding A−1 directly using the MATLAB command:

>> inv(A)

Example 1.14 If det(A) = 3 and det(B) = 4, then show that

      det(A^2 B−1 AT B^3) = 432.

Solution. By using the properties of the determinant of the matrix, we
have

      det(A^2 B−1 AT B^3) = det(A^2) det(B−1) det(AT) det(B^3),

which can also be written as

      det(A^2 B−1 AT B^3) = (det(A))^2 (1/det(B)) (det(A)) (det(B))^3.

Now using the given information, we get

      det(A^2 B−1 AT B^3) = (3)^2 (1/4) (3) (4)^3 = 3^3 · 4^2 = 432,

the required solution. •
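Such identities can also be spot-checked numerically in MATLAB with any
matrices having the prescribed determinants; the diagonal matrices below are
our own choices with det(A) = 3 and det(B) = 4:

>> A = [3 0; 0 1]; B = [4 0; 0 1];
>> det(A^2 * inv(B) * A' * B^3)   % returns 432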

1.2.5 Homogeneous Linear Systems


We have seen that every system of linear equations has either no solution,
a unique solution, or infinitely many solutions. However, there is another
type of system that always has at least one solution, i.e., either a unique
solution (called a zero solution or trivial solution) or infinitely many solu-
tions (called nontrivial solutions). Such a system is called a homogeneous
linear system.
Definition 1.29 A system of linear equations is said to be homogeneous
if all the constant terms are zero, i.e.,
Ax = b = 0. (1.13)
For example,
x1 + 2x2 − x3 = 0
2x1 − 3x2 + 3x3 = 0
is a homogeneous linear system. But
x1 + 2x2 − x3 = 0
2x1 − 3x2 + 3x3 = 1
is not a homogeneous linear system.

The general homogeneous system of m linear equations with n unknown


variables x1 , x2 , . . . , xn is

a11 x1 + a12 x2 + · · · + a1n xn = 0


a21 x1 + a22 x2 + · · · + a2n xn = 0
.. .. .. .. .. (1.14)
. . . . .
am1 x1 + am2 x2 + · · · + amn xn = 0.

The system of linear equations (1.14) can be written as the single matrix
equation

      [ a11  a12  · · ·  a1n ] [ x1 ]   [ 0 ]
      [ a21  a22  · · ·  a2n ] [ x2 ]   [ 0 ]
      [  :    :           :  ] [  : ] = [ : ] .                        (1.15)
      [ am1  am2  · · ·  amn ] [ xn ]   [ 0 ]

If we compute the product of the two matrices on the left-hand side of
(1.15), we have

      [ a11 x1 + a12 x2 + · · · + a1n xn ]   [ 0 ]
      [ a21 x1 + a22 x2 + · · · + a2n xn ]   [ 0 ]
      [                :                 ] = [ : ] .                   (1.16)
      [ am1 x1 + am2 x2 + · · · + amn xn ]   [ 0 ]

But the two matrices are equal if and only if their corresponding elements
are equal. Hence, the single matrix equation (1.15) is equivalent to the
system of the linear equations (1.14). If we define

      [ a11  a12  · · ·  a1n ]       [ x1 ]       [ 0 ]
  A = [ a21  a22  · · ·  a2n ] , x = [ x2 ] , b = [ 0 ] ,
      [  :    :           :  ]       [  : ]       [ : ]
      [ am1  am2  · · ·  amn ]       [ xn ]       [ 0 ]

the coefficient matrix, the column matrix of unknowns, and the column
matrix of constants, respectively, then the system (1.14) can be written
very compactly as

      Ax = b,                                                          (1.17)
which is called the matrix form of the homogeneous system. •

Note that a homogeneous linear system has an augmented matrix of


the form

[A|0].

Theorem 1.10 Every homogeneous linear system Ax = 0 has either ex-


actly one solution or infinitely many solutions. •

Example 1.15 Solve the following homogeneous linear system:

x1 + x2 + 2x3 = 0
2x1 + 3x2 + 4x3 = 0
3x1 + 4x2 + 7x3 = 0.

Solution. Consider the augmented matrix form of the given system as
follows:

              [ 1  1  2 | 0 ]
      [A|0] = [ 2  3  4 | 0 ].
              [ 3  4  7 | 0 ]
To convert it into reduced echelon form, we first do the elementary row
operations: row2 – (2)row1 and row3 – (3)row1 gives
      ≡ [ 1  1  2 | 0 ]
        [ 0  1  0 | 0 ]
        [ 0  1  1 | 0 ]

Next, using the elementary row operations: row3 – row2 and row1 – row2,
we get
      ≡ [ 1  0  2 | 0 ]
        [ 0  1  0 | 0 ]
        [ 0  0  1 | 0 ]

Finally, using the elementary row operation: row1 – (2)row3, we obtain


      ≡ [ 1  0  0 | 0 ]
        [ 0  1  0 | 0 ]
        [ 0  0  1 | 0 ]

Thus,

x1 = 0, x2 = 0, x3 = 0

is the only trivial solution of the given system. •

Theorem 1.11 A homogeneous linear system Ax = 0 of m linear equa-


tions with n unknowns, where m < n, has infinitely many solutions. •

Example 1.16 Solve the homogeneous linear system

x1 + 2x2 + x3 = 0
2x1 − 3x2 + 4x3 = 0.

Solution. Consider the augmented matrix form of the given system as

      [A|0] = [ 1   2  1 | 0 ]
              [ 2  −3  4 | 0 ].

To convert it into reduced echelon form, we first do the elementary row


operation row2 – 2row1, and we get
      ∼ [ 1   2  1 | 0 ]
        [ 0  −7  2 | 0 ].

Multiplying row 2 by −1/7 gives

      ∼ [ 1  2   1   | 0 ]
        [ 0  1  −2/7 | 0 ].

Finally, using the elementary row operation row1 – 2row2, we get

      ∼ [ 1  0  11/7 | 0 ]
        [ 0  1  −2/7 | 0 ].

Writing it in the system of equations form, we have

      x1 + 0x2 + (11/7)x3 = 0
      0x1 + x2 − (2/7)x3 = 0,

and from it, we get

      x1 = −(11/7)x3   and   x2 = (2/7)x3 .

Taking x3 = t, for t ∈ R and t ≠ 0, we get the nontrivial solution

      [x1 , x2 , x3 ]T = [−(11/7)t, (2/7)t, t]T .
7 7
Thus, the given system has infinitely many solutions, and this is to be
expected because the given system has three unknowns and only two equa-
tions. •

Example 1.17 For what values of α does the homogeneous linear system

(α − 2)x1 + x2 = 0
x1 + (α − 2)x2 = 0

have nontrivial solutions?

Solution. The augmented matrix form of the given system is


      [A|0] = [ α − 2    1   | 0 ]
              [   1    α − 2 | 0 ].

By interchanging row1 by row2, we get


      ∼ [   1    α − 2 | 0 ]
        [ α − 2    1   | 0 ].

Doing the elementary row operation: row2 – (α − 2) row1 gives


      ∼ [ 1      α − 2     | 0 ]
        [ 0  1 − (α − 2)^2 | 0 ].
Using backward substitution, we obtain

      x1 + (α − 2)x2 = 0
      [1 − (α − 2)^2] x2 = 0.

Notice that if x2 = 0, then x1 = 0, and the given system has only the
trivial solution, so let x2 ≠ 0. This implies that

      1 − (α − 2)^2 = 0
      1 − α^2 + 4α − 4 = 0
      α^2 − 4α + 3 = 0
      (α − 3)(α − 1) = 0,
which gives
α = 1 and α = 3.
Notice that for these values of α, the given set of equations are identical,
i.e.,
(for α = 1)
−x1 + x2 = 0
x1 − x2 = 0,
and (for α = 3)
x1 + x2 = 0
x1 + x2 = 0.
Thus, the given system has nontrivial solutions (infinitely many solu-
tions) for α = 1 and α = 3. •
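In MATLAB, nontrivial solutions of a homogeneous system can be detected by
checking the determinant of the coefficient matrix and computed with the
built-in null function; e.g., for the system above with α = 3:

>> A = [1 1; 1 1];   % coefficient matrix for alpha = 3
>> det(A)            % returns 0, so nontrivial solutions exist
>> null(A)           % returns a basis vector, a multiple of [1; -1]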
The following basic theorems on the solvability of linear systems are
proved in linear algebra.

Theorem 1.12 A homogeneous system of n equations in n unknowns has


a solution other than the trivial solution if and only if the determinant of
the coefficients matrix A vanishes, i.e., matrix A is singular. •

Theorem 1.13 (Necessary and Sufficient Condition for a Unique


Solution)

A nonhomogeneous system of n equations in n unknowns has a unique


solution if and only if the determinant of a coefficients matrix A does not
vanish, i.e., A is nonsingular. •

1.2.6 Matrix Inversion Method


If matrix A is nonsingular, then the linear system (1.6) always has a unique
solution for each b since the inverse matrix A−1 exists, so the solution of
the linear system (1.6) can be formally expressed as

A−1 Ax = A−1 b
Ix = A−1 b

or
x = A−1 b. (1.18)
If A is a square invertible matrix, there exists a sequence of elementary
row operations that carry A to the identity matrix I of the same size, i.e.,
A −→ I. This same sequence of row operations carries I to A−1 , i.e.,
I −→ A−1 . This can also be written as

[A|I] −→ [I|A−1 ].

Example 1.18 Use the matrix inversion method to find the solution of
the following linear system:

x1 + 2x2 = 1
−2x1 + x2 + 2x3 = 1
−x1 + x2 + x3 = 1.

Solution. First, we compute the inverse of the given matrix as


 
1 2 0
A =  −2 1 2 
−1 1 1

by reducing A to the identity matrix I by elementary row operations and


then applying the same sequence of operations to I to produce A−1 . Con-
sider the augmented matrix
              [  1  2  0 | 1  0  0 ]
      [A|I] = [ −2  1  2 | 0  1  0 ]
              [ −1  1  1 | 0  0  1 ].

Multiply the first row by −2 and −1 and then, subtracting the results
from the second and third rows, respectively, we get
      ∼ [ 1  2  0 | 1  0  0 ]
        [ 0  5  2 | 2  1  0 ]
        [ 0  3  1 | 1  0  1 ].

Multiplying the second row by 1/5, we get

      ∼ [ 1  2   0  |  1    0   0 ]
        [ 0  1  2/5 | 2/5  1/5  0 ]
        [ 0  3   1  |  1    0   1 ].

Multiplying the second row by 2 and 3 and then subtracting the results
from the first and third rows, respectively, we get
      ∼ [ 1  0  −4/5 |  1/5  −2/5  0 ]
        [ 0  1   2/5 |  2/5   1/5  0 ]
        [ 0  0  −1/5 | −1/5  −3/5  1 ].

After multiplying the third row by −5, we obtain



      ∼ [ 1  0  −4/5 | 1/5  −2/5   0 ]
        [ 0  1   2/5 | 2/5   1/5   0 ]
        [ 0  0    1  |  1     3   −5 ].

Multiplying the third row by 2/5 and −4/5 and then subtracting the results
from the second and first rows, respectively, we get

      ∼ [ 1  0  0 | 1   2  −4 ]
        [ 0  1  0 | 0  −1   2 ]
        [ 0  0  1 | 1   3  −5 ].
Thus, the inverse of the given matrix is

      A−1 = [ 1   2  −4 ]
            [ 0  −1   2 ]
            [ 1   3  −5 ],

and the unique solution of the system can be computed as

      x = A−1 b = [ 1  2  −4; 0  −1  2; 1  3  −5 ] [ 1; 1; 1 ] = [ −1; 1; −1 ],
i.e.,
x1 = −1, x2 = 1, x3 = −1,
the solution of the given system by the matrix inversion method. •

Thus, when the matrix inverse A−1 of the coefficient matrix A is com-
puted, the solution vector x of the system (1.6) is simply the product of
inverse matrix A−1 and the right-hand side vector b.

Using MATLAB commands, the linear system of equations defined by the
coefficient matrix A and the right-hand side vector b can be solved as
follows:

>> A = [1 2 0; -2 1 2; -1 1 1];
>> b = [1; 1; 1];
>> x = A \ b
x=
-1.0000
1.0000
-1.0000

Theorem 1.14 For an n × n matrix A, the following properties are equiv-


alent:

1. The inverse of matrix A exists, i.e., A is nonsingular.

2. The determinant of matrix A is nonzero.

3. The homogeneous system Ax = 0 has only the trivial solution x = 0.

4. The nonhomogeneous system Ax = b has a unique solution. •

Not all matrices have inverses. Singular matrices don’t have inverses
and thus the corresponding systems of equations do not have unique solu-
tions. The inverse of a matrix can also be computed by using the following
numerical methods for linear systems: Gauss-elimination method, Gauss–
Jordan method, and LU decomposition method. But the best and simplest
method for finding the inverse of a matrix is to perform the Gauss–Jordan
method on the augmented matrix with an identity matrix of the same size.

1.2.7 Elementary Matrices


An n × n matrix E is called an elementary matrix if it can be obtained
from the n × n identity matrix In by a single elementary row operation.
For example, the first elementary matrix E1 is obtained by multiplying the
second row of the identity matrix by 6, i.e.,
   
1 0 0 1 0 0
I =  0 1 0  −→  0 6 0  = E1 .
0 0 1 0 0 1

The second elementary matrix E2 is obtained by multiplying the first


row of the identity matrix by −5 and adding it to the third row, i.e.,
   
1 0 0 1 0 0
I =  0 1 0  −→  0 1 0  = E2 .
0 0 1 −5 0 1
Similarly, the third elementary matrix E3 is obtained by interchanging
the second and third rows of the identity matrix, i.e.,

      I = [ 1 0 0 ]        [ 1 0 0 ]
          [ 0 1 0 ]  −→    [ 0 0 1 ] = E3 .
          [ 0 0 1 ]        [ 0 1 0 ]
Notice that elementary matrices are always square.
Theorem 1.15 To perform an elementary row operation on the m × n
matrix A, multiply A on the left by the corresponding elementary matrix.

Example 1.19 Let
 
1 2 3 −5
A= 3 3 2 1 .
4 1 −2 4
Find an elementary matrix E such that EA is the matrix that results
by adding 5 times the first row of A to the third row.

Solution. The matrix E must be 3 × 3 to conform to the product EA. So,
we get E by adding 5 times the first row of the 3 × 3 identity matrix to its
third row. This gives
 
1 0 0
E =  0 1 0 ,
5 0 1
and the product EA is given as
    
1 0 0 1 2 3 −5 1 2 3 −5
EA =  0 1 0   3 3 2 1 = 3 3 2 1 .
5 0 1 4 1 −2 4 9 11 13 −21


Theorem 1.16 An elementary matrix is invertible, and the inverse is also


an elementary matrix. •
Example 1.20 Express the matrix
 
2 3
A=
1 1
as a product of elementary matrices.

Solution. We reduce A to identity matrix I and write the elementary


matrix at each stage, given
 
2 3
A= .
1 1
By interchanging the first and the second rows, we get
   
1 1 0 1
E1 A = , where E1 = .
2 3 1 0
Multiplying the first row by 2 and subtracting the result from the
second row, we get
   
1 1 1 0
E2 (E1 A) = E2 E1 A = , where E2 = .
0 1 −2 1
Finally, by subtracting the second row from the first row, we get
   
1 0 1 −1
E3 (E2 E1 A) = E3 E2 E1 A = , where E3 = .
0 1 0 1
Hence,

E3 E2 E1 A = I,
and so
A = (E3 E2 E1 )−1 .
This means that

      A = E1−1 E2−1 E3−1 = [ 0  1; 1  0 ] [ 1  0; 2  1 ] [ 1  1; 0  1 ].
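This factorization can be verified in the MATLAB Command Window:

>> E1 = [0 1; 1 0]; E2 = [1 0; -2 1]; E3 = [1 -1; 0 1];
>> E3*E2*E1*[2 3; 1 1]         % returns the 2 x 2 identity matrix
>> inv(E1)*inv(E2)*inv(E3)     % recovers A = [2 3; 1 1]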


Theorem 1.17 A square matrix A is invertible if and only if it is a product


of elementary matrices. •

Theorem 1.18 An n × n matrix A is invertible if and only if:

1. It is row equivalent to identity matrix In .

2. Its reduced row echelon form is identity matrix In .

3. It is expressible as a product of elementary matrices.

4. It has n pivots. •

In the following, we will discuss the direct methods for solving the linear
systems.

1.3 Numerical Methods for Linear Systems


To solve systems of linear equations using numerical methods, there are
two types of methods available. The first type of methods are called direct
methods or elimination methods. The other type of numerical methods are
called iterative methods. In this chapter we will discuss only the first type
of the numerical methods, and the other type of the numerical methods
will be discussed in Chapter 2. The first type of methods find the solution
in a finite number of steps. These methods are guaranteed to succeed and
are recommended for general use. Here, we will consider Cramer’s rule, the
Gaussian elimination method and its variants, the Gauss–Jordan method,
and LU decomposition (by Doolittle’s, Crout’s, and Cholesky methods).

1.4 Direct Methods for Linear Systems


This type of method refers to a procedure for computing a solution from a
form that is mathematically exact. We shall begin with a simple method
called Cramer’s rule with determinants. We shall then continue with the
Gaussian elimination method and its variants and methods involving tri-
angular, symmetric, and tridiagonal matrices.

1.4.1 Cramer’s Rule


This is our first direct method for solving linear systems by the use of
determinants. This method is one of the least efficient for solving a large
number of linear equations. It is, however, very useful for explaining some
problems inherent in the solution of linear equations.

Consider a system of two linear equations


a11 x1 + a12 x2 = b1
a21 x1 + a22 x2 = b2 ,
with the condition that a11 a22 − a12 a21 ≠ 0, i.e., the determinant of the
given matrix must not be equal to zero or the matrix must be nonsingular.
Solving the above system using systematic elimination by multiplying the
first equation of the system with a22 and the second equation by a12 and
subtracting gives

(a11 a22 − a12 a21 )x1 = a22 b1 − a12 b2 ,

and now solving for x1 gives

      x1 = (a22 b1 − a12 b2) / (a11 a22 − a12 a21),

and putting the value of x1 in any equation of the given system, we have
x2 as

      x2 = (a11 b2 − a21 b1) / (a11 a22 − a12 a21).
Then writing it in determinant form, we have

      x1 = |A1| / |A|   and   x2 = |A2| / |A|,

where

      |A1| = | b1  a12; b2  a22 |,   |A2| = | a11  b1; a21  b2 |,   |A| = | a11  a12; a21  a22 |.

In a similar way, one can use Cramer’s rule for a set of n linear equations
as follows:

      xi = |Ai| / |A| ,   i = 1, 2, 3, . . . , n,                    (1.19)
i.e., the solution for any one of the unknown xi in a set of simultaneous
equations is equal to the ratio of two determinants; the determinant in
the denominator is the determinant of the coefficient matrix A, while the
determinant in the numerator is the same determinant with the ith column
replaced by the elements from the right-hand sides of the equation.

Example 1.21 Solve the following system using Cramer’s rule:

5x1 + x3 + 2x4 = 3
x1 + x2 + 3x3 + x4 = 5
x1 + x2 + 2x4 = 1
x1 + x2 + x3 + x4 = −1.

Solution. Writing the given system in matrix form


    
5 0 1 2 x1 3
 1 1 3 1   x2   5 
  = 
 1 1 0 2   x3   1 
1 1 1 1 x4 −1

gives

      A = [ 5  0  1  2 ]        [  3 ]
          [ 1  1  3  1 ]    b = [  5 ]
          [ 1  1  0  2 ]        [  1 ]
          [ 1  1  1  1 ],       [ −1 ].
The determinant of the matrix A can be calculated by using cofactor
expansion as follows:

      |A| = | 5  0  1  2; 1  1  3  1; 1  1  0  2; 1  1  1  1 |

          = a11 c11 + a12 c12 + a13 c13 + a14 c14 = 5(2) + 0(−2) + 1(0) + 2(0) = 10 ≠ 0,

which shows that the given matrix A is nonsingular. Then the matrices
A1 , A2 , A3 , and A4 can be computed as
   
3 0 1 2 5 3 1 2
 5 1 3 1   1 5 3 1 
A1 =  1 1 0 2 
, A2 = 
 1
,
1 0 2 
−1 1 1 1 1 −1 1 1
   
5 0 3 2 5 0 1 3
 1 1 5 1   1 1 3 5 
A3 =  1 1
, A4 =  .
1 2   1 1 0 1 
1 1 −1 1 1 1 1 −1
The determinant of the matrices A1 , A2 , A3 , and A4 can be computed
as follows:

|A1 | = 3(2) + 0(18) + 1(−6) + 2(−10) = 6 + 0 − 6 − 20 = −20


|A2 | = 5(−18) + 3(−2) + 1(6) + 2(10) = −90 − 6 + 6 + 20 = −70
|A3 | = 5(6) + 0(−6) + 3(0) + 2(0) = 30 + 0 + 0 + 0 = 30
|A4 | = 5(10) + 0(−10) + 1(0) + 3(0) = 50 + 0 + 0 + 0 = 50.

Now applying Cramer’s rule, we get

      x1 = |A1| / |A| = −20/10 = −2

      x2 = |A2| / |A| = −70/10 = −7

      x3 = |A3| / |A| = 30/10 = 3

      x4 = |A4| / |A| = 50/10 = 5,
which is the required solution of the given system. •

Thus Cramer’s rule is useful in hand calculations only if the determi-


nants can be evaluated easily, i.e., for n = 3 or n = 4. The solution of a
system of n linear equations by Cramer’s rule will require N = (n + 1)(n^3/3)
multiplications. Therefore, this rule is much less efficient for large values of
n and is almost never used for computational purposes. When the number
of equations is large (n > 4), other methods of solution are more desirable.

Use MATLAB commands to find the solution of the above linear sys-
tem by Cramer’s rule as follows:

>> A = [5 0 1 2; 1 1 3 1; 1 1 0 2; 1 1 1 1];
>> b = [3; 5; 1; −1];
>> A1 = [b A(:, [2 : 4])];
>> x1 = det(A1)/det(A);
>> A2 = [A(:, 1) b A(:, [3 : 4])];
>> x2 = det(A2)/det(A);
>> A3 = [A(:, [1 : 2]) b A(:, 4)];
>> x3 = det(A3)/det(A);
>> A4 = [A(:, [1 : 3]) b];
>> x4 = det(A4)/det(A);

Procedure 1.1 (Cramer’s Rule)


1. Form the coefficient matrix A and the column matrix b.

2. Compute the determinant of A. If det A = 0, then the system does not have a unique solution and Cramer's rule cannot be applied; otherwise, go to the next step.

3. Compute the determinant of the new matrix $A_i$ formed by replacing the ith column of A with the column vector b.

4. Repeat step 3 for i = 1, 2, . . . , n.

5. Solve for the unknown variables $x_i$ using

$$x_i = \frac{\det(A_i)}{\det(A)}, \quad \text{for } i = 1, 2, \ldots, n.$$

The m-file CRule.m and the following MATLAB commands can be used
to generate the solution of Example 1.21 as follows:

>> A = [5 0 1 2; 1 1 3 1; 1 1 0 2; 1 1 1 1];
>> b = [3; 5; 1; −1];
>> sol = CRule(A, b);

Program 1.5
MATLAB m-file for Cramer's Rule for a Linear System
function sol = CRule(A, b)
[m, n] = size(A);
if m ~= n, error('Matrix is not square.'); end
if det(A) == 0, error('Matrix is singular.'); end
for i = 1:n
    B = A; B(:, i) = b;
    sol(i) = det(B) / det(A);
end
sol = sol';

1.4.2 Gaussian Elimination Method


It is one of the most popular and widely used direct methods for solving
linear systems of algebraic equations. No method of solving linear sys-
tems requires fewer operations than the Gaussian procedure. The goal of
the Gaussian elimination method for solving linear systems is to convert
the original system into the equivalent upper-triangular system from which
each unknown is determined by backward substitution.

The Gaussian elimination procedure starts with forward elimination,


in which the first equation in the linear system is used to eliminate the
first variable from the rest of the (n − 1) equations. Then the new second
equation is used to eliminate the second variable from the rest of the (n−2)
equations, and so on. If (n − 1) such elimination is performed, and the re-
sulting system will be the triangular form. Once this forward elimination
is complete, we can determine whether the system is overdetermined or
underdetermined or has a unique solution. If it has a unique solution, then
backward substitution is used to solve the triangular system easily and one
can find the unknown variables involved in the system.

Now we shall describe the method in detail for a system of n linear


equations. Consider the following system of n linear equations:

$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + \cdots + a_{1n}x_n &= b_1 \\ a_{21}x_1 + a_{22}x_2 + a_{23}x_3 + \cdots + a_{2n}x_n &= b_2 \\ a_{31}x_1 + a_{32}x_2 + a_{33}x_3 + \cdots + a_{3n}x_n &= b_3 \qquad (1.20) \\ &\ \ \vdots \\ a_{n1}x_1 + a_{n2}x_2 + a_{n3}x_3 + \cdots + a_{nn}x_n &= b_n. \end{aligned}$$

Forward Elimination

Consider the first equation of the given system (1.20)

a11 x1 + a12 x2 + a13 x3 + · · · + a1n xn = b1 (1.21)

as the first pivotal equation with the first pivot element a11 . Then the first
equation times multiples mi1 = (ai1 /a11 ), i = 2, 3, . . . , n is subtracted from
the ith equation to eliminate the first variable x1 , producing an equivalent
system

$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + \cdots + a_{1n}x_n &= b_1 \\ a_{22}^{(1)}x_2 + a_{23}^{(1)}x_3 + \cdots + a_{2n}^{(1)}x_n &= b_2^{(1)} \\ a_{32}^{(1)}x_2 + a_{33}^{(1)}x_3 + \cdots + a_{3n}^{(1)}x_n &= b_3^{(1)} \qquad (1.22) \\ &\ \ \vdots \\ a_{n2}^{(1)}x_2 + a_{n3}^{(1)}x_3 + \cdots + a_{nn}^{(1)}x_n &= b_n^{(1)}. \end{aligned}$$

Now consider the second equation of the system (1.22), which is

$$a_{22}^{(1)}x_2 + a_{23}^{(1)}x_3 + \cdots + a_{2n}^{(1)}x_n = b_2^{(1)}, \qquad (1.23)$$

the second pivotal equation with the second pivot element $a_{22}^{(1)}$. Then the second equation times multiples $m_{i2} = a_{i2}^{(1)}/a_{22}^{(1)}$, $i = 3, \ldots, n$ is subtracted from the ith equation to eliminate the second variable $x_2$, producing

an equivalent system

$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + \cdots + a_{1n}x_n &= b_1 \\ a_{22}^{(1)}x_2 + a_{23}^{(1)}x_3 + \cdots + a_{2n}^{(1)}x_n &= b_2^{(1)} \\ a_{33}^{(2)}x_3 + \cdots + a_{3n}^{(2)}x_n &= b_3^{(2)} \qquad (1.24) \\ &\ \ \vdots \\ a_{n3}^{(2)}x_3 + \cdots + a_{nn}^{(2)}x_n &= b_n^{(2)}. \end{aligned}$$

Now consider the third equation of the system (1.24), which is

$$a_{33}^{(2)}x_3 + \cdots + a_{3n}^{(2)}x_n = b_3^{(2)}, \qquad (1.25)$$

the third pivotal equation with the third pivot element $a_{33}^{(2)}$. Then the third equation times multiples $m_{i3} = a_{i3}^{(2)}/a_{33}^{(2)}$, $i = 4, \ldots, n$ is subtracted from the ith equation to eliminate the third variable $x_3$. Similarly, after the (n − 1)th step, we have the nth pivotal equation, which contains only the one unknown variable $x_n$, i.e.,

$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + \cdots + a_{1n}x_n &= b_1 \\ a_{22}^{(1)}x_2 + a_{23}^{(1)}x_3 + \cdots + a_{2n}^{(1)}x_n &= b_2^{(1)} \\ a_{33}^{(2)}x_3 + \cdots + a_{3n}^{(2)}x_n &= b_3^{(2)} \qquad (1.26) \\ &\ \ \vdots \\ a_{nn}^{(n-1)}x_n &= b_n^{(n-1)}, \end{aligned}$$
with the nth pivot element $a_{nn}^{(n-1)}$. After getting the upper-triangular
system, which is equivalent to the original system, the forward elimination
is completed.
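The forward elimination phase just described can be sketched in a few lines of MATLAB. This is a minimal sketch, assuming no pivoting is required (every pivot encountered turns out nonzero); the test data used here are those of Example 1.22 below:

% Forward elimination sketch: reduce A (and b) to the triangular form (1.26).
A = [1 2 1; 2 5 3; 1 3 4]; b = [2; 1; 5];   % assumed test system
n = length(b);
for k = 1:n-1                  % kth pivotal equation
    for i = k+1:n
        m = A(i,k)/A(k,k);     % multiplier m_ik
        A(i,k:n) = A(i,k:n) - m*A(k,k:n);
        b(i) = b(i) - m*b(k);
    end
end
A, b                           % A is now upper triangular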

Backward Substitution

After the triangular set of equations has been obtained, the last equation of
system (1.26) yields the value of xn directly. The value is then substituted
into the equation next to the last one of the system (1.26) to obtain a value
of xn−1 , which is, in turn, used along with the value of xn in the second
82 Applied Linear Algebra and Optimization using MATLAB

to the last equation to obtain a value of xn−2 , and so on. A mathematical


formula can be obtained for the backward substitution:
$$\left.\begin{aligned} x_n &= \frac{b_n^{(n-1)}}{a_{nn}^{(n-1)}} \\ x_{n-1} &= \frac{1}{a_{n-1,n-1}^{(n-2)}}\left(b_{n-1}^{(n-2)} - a_{n-1,n}^{(n-2)}\,x_n\right) \\ &\ \ \vdots \\ x_1 &= \frac{1}{a_{11}}\left(b_1 - \sum_{j=2}^{n} a_{1j}x_j\right) \end{aligned}\right\} \qquad (1.27)$$
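Formula (1.27) translates almost line by line into MATLAB. The sketch below solves a triangular system Ux = y that is assumed to have already come from forward elimination (the data are the triangular form obtained in Example 1.22 below):

% Backward substitution per (1.27) on a triangular system U*x = y.
U = [1 2 1; 0 1 1; 0 0 2]; y = [2; -3; 6];  % assumed triangular system
n = length(y);
x = zeros(n,1);
x(n) = y(n)/U(n,n);
for i = n-1:-1:1
    x(i) = (y(i) - U(i,i+1:n)*x(i+1:n))/U(i,i);
end
x                              % returns [11; -6; 3]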

The Gaussian elimination can be carried out by writing only the co-
efficients and the right-hand side terms in a matrix form, the augmented
matrix form. Indeed, this is exactly what a computer program for Gaus-
sian elimination does. Even for hand calculations, the augmented matrix
form is more convenient than writing all sets of equations. The augmented
matrix is formed as follows:
$$\left[\begin{array}{ccccc|c} a_{11} & a_{12} & a_{13} & \cdots & a_{1n} & b_1 \\ a_{21} & a_{22} & a_{23} & \cdots & a_{2n} & b_2 \\ a_{31} & a_{32} & a_{33} & \cdots & a_{3n} & b_3 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ a_{n1} & a_{n2} & a_{n3} & \cdots & a_{nn} & b_n \end{array}\right]. \qquad (1.28)$$

The operations used in the Gaussian elimination method can now be


applied to the augmented matrix. Consequently, system (1.26) is now
written directly as
$$\left[\begin{array}{ccccc|c} a_{11} & a_{12} & a_{13} & \cdots & a_{1n} & b_1 \\ & a_{22}^{(1)} & a_{23}^{(1)} & \cdots & a_{2n}^{(1)} & b_2^{(1)} \\ & & a_{33}^{(2)} & \cdots & a_{3n}^{(2)} & b_3^{(2)} \\ & & & \ddots & \vdots & \vdots \\ & & & & a_{nn}^{(n-1)} & b_n^{(n-1)} \end{array}\right], \qquad (1.29)$$

from which the unknowns are determined as before by using backward


substitution. The number of multiplications and divisions for the Gaussian elimination method for one b vector is approximately

$$N = \frac{n^3}{3} + n^2 - \frac{n}{3}. \qquad (1.30)$$

For example, a system of n = 10 equations requires about N = 1000/3 + 100 − 10/3 = 430 operations, compared with roughly 3667 for Cramer's rule.

Simple Gaussian Elimination Method


First, we will solve the linear system using the simplest variation of the
Gaussian elimination method, called simple Gaussian elimination or Gaus-
sian elimination without pivoting. The basis of this variation is that all
possible diagonal elements (called pivot elements) should be nonzero. If
at any stage an element becomes zero, then interchange that row with
any row below with a nonzero element at that position. After getting the
upper-triangular matrix, we use backward substitution to get the solution
of the given linear system.
Example 1.22 Solve the following linear system using the simple Gaus-
sian elimination method:
x1 + 2x2 + x3 = 2
2x1 + 5x2 + 3x3 = 1
x1 + 3x2 + 4x3 = 5.

Solution. The process begins with the augmented matrix form

$$\left[\begin{array}{ccc|c} 1 & 2 & 1 & 2 \\ 2 & 5 & 3 & 1 \\ 1 & 3 & 4 & 5 \end{array}\right].$$

Since $a_{11} = 1 \neq 0$, we wish to eliminate the elements $a_{21}$ and $a_{31}$ by subtracting from the second and third rows the appropriate multiples of the first row. In this case, the multiples are given as

$$m_{21} = \frac{2}{1} = 2 \quad \text{and} \quad m_{31} = \frac{1}{1} = 1.$$
Hence,
$$\left[\begin{array}{ccc|c} 1 & 2 & 1 & 2 \\ 0 & 1 & 1 & -3 \\ 0 & 1 & 3 & 3 \end{array}\right].$$

Since $a_{22}^{(1)} = 1 \neq 0$, we eliminate the entry in the $a_{32}^{(1)}$ position by subtracting the multiple $m_{32} = \frac{1}{1} = 1$ of the second row from the third row to get

$$\left[\begin{array}{ccc|c} 1 & 2 & 1 & 2 \\ 0 & 1 & 1 & -3 \\ 0 & 0 & 2 & 6 \end{array}\right].$$
Obviously, the original set of equations has been transformed to an
upper-triangular form. Since all the diagonal elements of the obtaining
upper-triangular matrix are nonzero, the coefficient matrix of the given
system is nonsingular, and the given system has a unique solution. Now
expressing the set in algebraic form yields
x1 + 2x2 + x3 = 2
x2 + x3 = −3
2x3 = 6.
Now using backward substitution, we get
2x3 = 6 gives x3 = 3
x2 = −x3 − 3 = −(3) − 3 = −6 gives x2 = −6
x1 = 2 − 2x2 − x3 = 2 − 2(−6) − 3 gives x1 = 11,
which is the required solution of the given system. •
The above results can be obtained using MATLAB commands as fol-
lows:

>> B = [1 2 1 2; 2 5 3 1; 1 3 4 5];
% B = [A|b] = Augmented matrix
>> x = WP(B);
>> disp(x)

Program 1.6
MATLAB m-file for Simple Gaussian Elimination Method
function x = WP(B)
[n, t] = size(B); U = B;
for k = 1:n-1                          % forward elimination
    for i = k:n-1
        m = U(i+1,k)/U(k,k);           % multiplier
        for j = 1:t; U(i+1,j) = U(i+1,j) - m*U(k,j); end
    end
end
i = n; x(i,1) = U(i,t)/U(i,i);         % backward substitution
for i = n-1:-1:1
    s = 0;
    for k = n:-1:i+1; s = s + U(i,k)*x(k,1); end
    x(i,1) = (U(i,t)-s)/U(i,i);
end
end

In the simple description of Gaussian elimination without pivoting just given, we used the kth equation to eliminate the variable $x_k$ from equations $k+1, \ldots, n$ during the kth step of the procedure. This is possible only if, at the beginning of the kth step, the coefficient $a_{kk}^{(k-1)}$ of $x_k$ in equation k is not zero. These coefficients are used as denominators both in the multipliers $m_{ij}$ and in the backward substitution equations; a zero value does not necessarily mean that the linear system is not solvable, but it does mean that the procedure of the solution must be altered.

Example 1.23 Solve the following linear system using the simple Gaus-
sian elimination method:
x2 + x3 = 1
x1 + 2x2 + 2x3 = 1
2x1 + x2 + 2x3 = 3.

Solution. Write the given system in augmented matrix form:

$$\left[\begin{array}{ccc|c} 0 & 1 & 1 & 1 \\ 1 & 2 & 2 & 1 \\ 2 & 1 & 2 & 3 \end{array}\right].$$

To solve this system, the simple Gaussian elimination method will fail
immediately because the element in the first row on the leading diagonal,
the pivot, is zero. Thus, it is impossible to divide that row by the pivot

value. Clearly, this difficulty can be overcome by rearranging the order of


the rows; for example, making the first row the second gives
$$\left[\begin{array}{ccc|c} 1 & 2 & 2 & 1 \\ 0 & 1 & 1 & 1 \\ 2 & 1 & 2 & 3 \end{array}\right].$$

Now we use the usual elimination process. The first elimination step is to eliminate the element $a_{31} = 2$ from the third row by subtracting the multiple $m_{31} = \frac{2}{1} = 2$ of row 1 from row 3, which gives

$$\left[\begin{array}{ccc|c} 1 & 2 & 2 & 1 \\ 0 & 1 & 1 & 1 \\ 0 & -3 & -2 & 1 \end{array}\right].$$

We are finished with the first elimination step, since the element $a_{21}$ is already eliminated from the second row. The second elimination step is to eliminate the element $a_{32}^{(1)} = -3$ from the third row by subtracting the multiple $m_{32} = \frac{-3}{1} = -3$ of row 2 from row 3, which gives

$$\left[\begin{array}{ccc|c} 1 & 2 & 2 & 1 \\ 0 & 1 & 1 & 1 \\ 0 & 0 & 1 & 4 \end{array}\right].$$
Obviously, the original set of equations has been transformed to an upper-
triangular form. Now expressing the set in algebraic form yields
x1 + 2x2 + 2x3 = 1
x2 + x3 = 1
x3 = 4.
Now using backward substitution, we get
x3 = 4

x2 = 1 − x3 = 1 − 4 = −3

x1 = 1 − 2x2 − 2x3 = 1 − 2(−3) − 2(4) = −1,
the solution of the given system. •

Example 1.24 Solve the following linear system using the simple Gaus-
sian elimination method:
x1 + x2 + x3 = 3
2x1 + 2x2 + 3x3 = 7
x1 + 2x2 + 3x3 = 6.

Solution. Write the given system in augmented matrix form:

$$\left[\begin{array}{ccc|c} 1 & 1 & 1 & 3 \\ 2 & 2 & 3 & 7 \\ 1 & 2 & 3 & 6 \end{array}\right].$$

The first elimination step is to eliminate the elements $a_{21} = 2$ and $a_{31} = 1$ from the second and third rows by subtracting the multiples $m_{21} = \frac{2}{1} = 2$ and $m_{31} = \frac{1}{1} = 1$ of row 1 from row 2 and row 3, respectively, which gives

$$\left[\begin{array}{ccc|c} 1 & 1 & 1 & 3 \\ 0 & 0 & 1 & 1 \\ 0 & 1 & 2 & 3 \end{array}\right].$$

We finished the first elimination step. To start the second elimination step, since the second pivot element $a_{22}^{(1)} = 0$, the simple Gaussian elimination cannot continue in its present form. Therefore, we interchange rows 2 and 3 to get

$$\left[\begin{array}{ccc|c} 1 & 1 & 1 & 3 \\ 0 & 1 & 2 & 3 \\ 0 & 0 & 1 & 1 \end{array}\right].$$

We are finished with the second elimination step, since the element $a_{32}^{(1)}$ is
already eliminated from the third row. Obviously, the original set of equa-
tions has been transformed to an upper-triangular form. Now expressing

the set in algebraic form yields


x1 + x2 + x3 = 3
x2 + 2x3 = 3
x3 = 1.
Now using backward substitution, we get
x3 = 1, x2 = 1, x1 = 1,
the solution of the system. •
Example 1.25 Using the simple Gaussian elimination method, find all
values of a and b for which the following linear system is consistent or
inconsistent:
2x1 − x2 + 3x3 = 1
4x1 + 2x2 + 2x3 = 2a
2x1 + x2 + x3 = b.
Solution. Write the given system in augmented matrix form:
$$\left[\begin{array}{ccc|c} 2 & -1 & 3 & 1 \\ 4 & 2 & 2 & 2a \\ 2 & 1 & 1 & b \end{array}\right],$$

in which we wish to eliminate the elements $a_{21}$ and $a_{31}$ by subtracting from the second and third rows the appropriate multiples of the first row. In this case, the multiples are given as

$$m_{21} = \frac{4}{2} = 2 \quad \text{and} \quad m_{31} = \frac{2}{2} = 1.$$

Hence,

$$\left[\begin{array}{ccc|c} 2 & -1 & 3 & 1 \\ 0 & 4 & -4 & 2a-2 \\ 0 & 2 & -2 & b-1 \end{array}\right].$$

We have finished the first elimination step. The second elimination step is to eliminate the element $a_{32}^{(1)} = 2$ by subtracting the multiple $m_{32} = \frac{2}{4} = \frac{1}{2}$ of row 2 from row 3, which gives

$$\left[\begin{array}{ccc|c} 2 & -1 & 3 & 1 \\ 0 & 4 & -4 & 2a-2 \\ 0 & 0 & 0 & b-a \end{array}\right].$$

We have finished with the second column. The third row of the equivalent upper-triangular system reads

$$0x_1 + 0x_2 + 0x_3 = b - a. \qquad (1.31)$$

First, if b − a = 0, then (1.31) places no constraint on the unknowns $x_1$, $x_2$, and $x_3$, and the upper-triangular system represents only two nontrivial equations, namely,

$$\begin{aligned} 2x_1 - x_2 + 3x_3 &= 1 \\ 4x_2 - 4x_3 &= 2a - 2 \end{aligned}$$

in the three unknowns. As a result, one of the unknowns can be chosen arbitrarily, say $x_3 = x_3^*$; then $x_2^*$ and $x_1^*$ can be obtained by using backward substitution:

$$x_2^* = \frac{a-1}{2} + x_3^*; \qquad x_1^* = \frac{1}{2}\left(1 + \frac{a-1}{2} - 2x_3^*\right).$$

Hence,

$$x^* = \left[\ \frac{1}{2}\left(1 + \frac{a-1}{2} - 2x_3^*\right),\ \frac{a-1}{2} + x_3^*,\ x_3^*\ \right]^T$$

is a solution of the given system for any value of $x_3^*$ and any real value of a. Hence, when b = a the given linear system is consistent (infinitely many solutions).

Second, when $b - a \neq 0$, (1.31) puts a restriction on the unknowns $x_1$, $x_2$, and $x_3$ that is impossible to satisfy. In this case the given system cannot have any solutions and, therefore, is inconsistent. •
Example 1.26 Solve the following homogeneous linear system using the
simple Gaussian elimination method:
x1 + x2 − 2x3 = 0
2x1 + 4x2 − 3x3 = 0
3x1 + 7x2 − 5x3 = 0.
Solution. The process begins with the augmented matrix form

$$\left[\begin{array}{ccc|c} 1 & 1 & -2 & 0 \\ 2 & 4 & -3 & 0 \\ 3 & 7 & -5 & 0 \end{array}\right].$$

Using the multiples

$$m_{21} = \frac{2}{1} = 2 \quad \text{and} \quad m_{31} = \frac{3}{1} = 3$$

finishes the first elimination step, and we get

$$\left[\begin{array}{ccc|c} 1 & 1 & -2 & 0 \\ 0 & 2 & 1 & 0 \\ 0 & 4 & 1 & 0 \end{array}\right].$$

Then, subtracting the multiple $m_{32} = \frac{4}{2} = 2$ of the second row from the third row, we get

$$\left[\begin{array}{ccc|c} 1 & 1 & -2 & 0 \\ 0 & 2 & 1 & 0 \\ 0 & 0 & -1 & 0 \end{array}\right].$$
Obviously, the original set of equations has been transformed to an
upper-triangular form. Thus, the system has the unique solution [0, 0, 0]T ,
i.e., the system has only the trivial solution. •

Example 1.27 Find the value of k for which the following homogeneous
linear system has nontrivial solutions by using the simple Gaussian elimi-
nation method:
2x1 − 3x2 + 5x3 = 0
−2x1 + 6x2 − x3 = 0
4x1 − 9x2 + kx3 = 0.
Solution. The process begins with the augmented matrix form

$$\left[\begin{array}{ccc|c} 2 & -3 & 5 & 0 \\ -2 & 6 & -1 & 0 \\ 4 & -9 & k & 0 \end{array}\right],$$

and then using the multiples

$$m_{21} = \frac{-2}{2} = -1 \quad \text{and} \quad m_{31} = \frac{4}{2} = 2,$$

which gives

$$\left[\begin{array}{ccc|c} 2 & -3 & 5 & 0 \\ 0 & 3 & 4 & 0 \\ 0 & -3 & k-10 & 0 \end{array}\right].$$

Also, by using the multiple $m_{32} = \frac{-3}{3} = -1$, we get

$$\left[\begin{array}{ccc|c} 2 & -3 & 5 & 0 \\ 0 & 3 & 4 & 0 \\ 0 & 0 & k-6 & 0 \end{array}\right].$$

From the last row of the above system, we obtain

k − 6 = 0, which gives k = 6.

Also, solving the above underdetermined system

2x1 − 3x2 + 5x3 = 0


3x2 + 4x3 = 0

by taking x3 = 1, we have the nontrivial solutions

x∗ = α[−9/2, −4/3, 1]T , for α 6= 0.

Note that if we put x3 = 0, for example, we obtain the trivial solution


[0, 0, 0]T . •

Theorem 1.19 An upper-triangular matrix A is nonsingular if and only if all of its diagonal elements are nonzero. •

Example 1.28 Use the simple Gaussian elimination method to find all
the values of α which make the following matrix singular:
 
1 −1 α
A= 2 2 1 .
0 α −1.5
Solution. Apply the forward elimination step of the simple Gaussian elimination to the given matrix A and eliminate the element $a_{21} = 2$ by subtracting from the second row the appropriate multiple of the first row. In this case, the multiple is $m_{21} = \frac{2}{1} = 2$, which gives

$$\begin{bmatrix} 1 & -1 & \alpha \\ 0 & 4 & 1-2\alpha \\ 0 & \alpha & -1.5 \end{bmatrix}.$$
We finished the first elimination step. The second elimination step is to eliminate the element $a_{32}^{(1)} = \alpha$ by subtracting the multiple $m_{32} = \frac{\alpha}{4}$ of row 2 from row 3, which gives

$$\begin{bmatrix} 1 & -1 & \alpha \\ 0 & 4 & 1-2\alpha \\ 0 & 0 & -1.5 - \dfrac{\alpha(1-2\alpha)}{4} \end{bmatrix}.$$

To show that the given matrix is singular, we have to set the third diagonal element equal to zero (by Theorem 1.19), i.e.,

$$-1.5 - \frac{\alpha(1-2\alpha)}{4} = 0.$$

After simplifying, we obtain

$$2\alpha^2 - \alpha - 6 = 0.$$

Solving this quadratic equation, we get

$$\alpha = -\frac{3}{2} \quad \text{and} \quad \alpha = 2,$$
which are the possible values of α, which make the given matrix singular.•
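If the Symbolic Math Toolbox is available, the same values of α can be recovered by setting det(A) to zero directly; this is only a cross-check of the result above, not the book's procedure:

% Cross-check of Example 1.28 (assumes the Symbolic Math Toolbox).
syms a
A = [1 -1 a; 2 2 1; 0 a -1.5];
solve(det(A) == 0, a)          % returns -3/2 and 2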

Example 1.29 Use the smallest positive integer value of α to find the
unique solution of the linear system Ax = [1, 6, −4]T by the simple Gaus-
sian elimination method, where
 
1 −1 α
A= 2 2 1 .
0 α −1.5
Solution. We know from Example 1.28 that the given matrix A is singular when $\alpha = -\frac{3}{2}$ and $\alpha = 2$, so to find the unique solution we take the smallest positive integer value $\alpha = 1$ and consider the augmented matrix

$$\left[\begin{array}{ccc|c} 1 & -1 & 1 & 1 \\ 2 & 2 & 1 & 6 \\ 0 & 1 & -1.5 & -4 \end{array}\right].$$

Applying the forward elimination step of the simple Gaussian elimination and eliminating the element $a_{21} = 2$ by subtracting from the second row the appropriate multiple $m_{21} = 2$ of the first row gives

$$\left[\begin{array}{ccc|c} 1 & -1 & 1 & 1 \\ 0 & 4 & -1 & 4 \\ 0 & 1 & -1.5 & -4 \end{array}\right].$$

The second elimination step is to eliminate the element $a_{32}^{(1)} = 1$ by subtracting the multiple $m_{32} = \frac{1}{4}$ of row 2 from row 3, which gives

$$\left[\begin{array}{ccc|c} 1 & -1 & 1 & 1 \\ 0 & 4 & -1 & 4 \\ 0 & 0 & -5/4 & -5 \end{array}\right].$$

Now expressing the set in algebraic form yields

x1 − x2 + x3 = 1
4x2 − x3 = 4
−5/4x3 = −5.

Using backward substitution, we obtain

x3 = 4, x2 = 2, x1 = −1,

the unique solution of the given system. •


Note that the inverse of a nonsingular matrix A can be easily determined by using the simple Gaussian elimination method. Here, we have to consider the augmented matrix as a combination of the given matrix A and the identity matrix I (the same size as A). To find the inverse matrix $B = A^{-1}$, we must solve n linear systems: the jth column of the matrix B is the solution of the linear system whose right-hand side is the jth column of the matrix I.
Example 1.30 Use the simple Gaussian elimination method to find the
inverse of the following matrix:
 
2 −1 3
A =  4 −1 6  .
2 −3 4

Solution. Suppose that the inverse $A^{-1} = B$ of the given matrix exists and let

$$AB = \begin{bmatrix} 2 & -1 & 3 \\ 4 & -1 & 6 \\ 2 & -3 & 4 \end{bmatrix} \begin{bmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \\ b_{31} & b_{32} & b_{33} \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = I.$$

Now to find the elements of the matrix B, we apply simple Gaussian elimination to the augmented matrix:

$$[A|I] = \left[\begin{array}{ccc|ccc} 2 & -1 & 3 & 1 & 0 & 0 \\ 4 & -1 & 6 & 0 & 1 & 0 \\ 2 & -3 & 4 & 0 & 0 & 1 \end{array}\right].$$

Apply the forward elimination step of the simple Gaussian elimination and eliminate the elements $a_{21} = 4$ and $a_{31} = 2$ by subtracting from the second and the third rows the appropriate multiples $m_{21} = \frac{4}{2} = 2$ and $m_{31} = \frac{2}{2} = 1$ of the first row. It gives

$$\left[\begin{array}{ccc|ccc} 2 & -1 & 3 & 1 & 0 & 0 \\ 0 & 1 & 0 & -2 & 1 & 0 \\ 0 & -2 & 1 & -1 & 0 & 1 \end{array}\right].$$

We finished the first elimination step. The second elimination step is to eliminate the element $a_{32}^{(1)} = -2$ by subtracting the multiple $m_{32} = \frac{-2}{1} = -2$ of row 2 from row 3, which gives

$$\left[\begin{array}{ccc|ccc} 2 & -1 & 3 & 1 & 0 & 0 \\ 0 & 1 & 0 & -2 & 1 & 0 \\ 0 & 0 & 1 & -5 & 2 & 1 \end{array}\right].$$

We solve the first system

$$\begin{bmatrix} 2 & -1 & 3 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} b_{11} \\ b_{21} \\ b_{31} \end{bmatrix} = \begin{bmatrix} 1 \\ -2 \\ -5 \end{bmatrix}$$

by using backward substitution, and we get

2b11 − b21 + 3b31 = 1


b21 = −2
b31 = −5,

which gives
b11 = 7, b21 = −2, b31 = −5.
Similarly, the solution of the second linear system
$$\begin{bmatrix} 2 & -1 & 3 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} b_{12} \\ b_{22} \\ b_{32} \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \\ 2 \end{bmatrix}$$

can be obtained as follows:

2b12 − b22 + 3b32 = 0


b22 = 1
b32 = 2,

which gives
b12 = −5/2, b22 = 1, b32 = 2.
Finally, the solution of the third linear system


$$\begin{bmatrix} 2 & -1 & 3 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} b_{13} \\ b_{23} \\ b_{33} \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$$

can be obtained as follows:


2b13 − b23 + 3b33 = 0
b23 = 0
b33 = 1,

and it gives
b13 = −3/2, b23 = 0, b33 = 1.
Hence, the elements of the inverse matrix B are

$$B = A^{-1} = \begin{bmatrix} 7 & -\frac{5}{2} & -\frac{3}{2} \\ -2 & 1 & 0 \\ -5 & 2 & 1 \end{bmatrix},$$

which is the required inverse of the given matrix A. •

Procedure 1.2 (Gaussian Elimination Method)


1. Form the augmented matrix, B = [A|b].

2. Check the first pivot element a11 6= 0, then move to the next step;
otherwise, interchange rows so that a11 6= 0.

3. Multiply row one by the multiplier $m_{i1} = a_{i1}/a_{11}$ and subtract the result from the ith row, for i = 2, 3, . . . , n.

4. Repeat steps 2 and 3 for the remaining pivot elements until the coefficient matrix A becomes an upper-triangular matrix U.

5. Use backward substitution to solve for $x_n$ from the nth equation, $x_n = b_n^{(n-1)}/a_{nn}^{(n-1)}$, and solve for the other (n − 1) unknown variables by using (1.27).

We now introduce the most important numerical quantity associated with


a matrix.

Definition 1.30 (Rank of a Matrix)

The rank of a matrix A is the number of pivots. An m × n matrix will,


in general, have a rank r, where r is an integer and r ≤ min{m, n}. If
r = min{m, n}, then the matrix is said to be full rank. If r < min{m, n},
then the matrix is said to be rank deficient. •

In principle, the rank of a matrix can be determined by using the Gaus-


sian elimination process in which the coefficient matrix A is reduced to
upper-triangular form U . After reducing the matrix to triangular form, we
find that the rank is the number of columns with nonzero values on the
diagonal of U . In practice, especially for large matrices, round-off errors
during the row operation may cause a loss of accuracy in this method of
rank computation.
Theorem 1.20 For a system of n equations with n unknowns written in
the form Ax = b, the solution x of a system exists and is unique for any
b, if and only if rank(A) = n. •
Conversely, if rank(A) < n for an n × n matrix A, then the system of
equations Ax = b may or may not be consistent. Such a system may not
have a solution, or the solution, if it exists, will not be unique.
Example 1.31 Find the rank of the following matrix:
 
1 2 4
A= 1 1 5 .
1 1 6

Solution. Apply the forward elimination step of simple Gaussian elimination to the given matrix A and eliminate the elements below the first pivot (the first diagonal element) to get

$$\begin{bmatrix} 1 & 2 & 4 \\ 0 & -1 & 1 \\ 0 & -1 & 2 \end{bmatrix}.$$
We finished the first elimination step. The second pivot is in the (2, 2) position, and after eliminating the element below it, we find the triangular form to be

$$\begin{bmatrix} 1 & 2 & 4 \\ 0 & -1 & 1 \\ 0 & 0 & 1 \end{bmatrix}.$$

Since the number of pivots is three, the rank of the given matrix is 3. Note that the original matrix is nonsingular, since the rank of the 3 × 3 matrix is 3. •

In MATLAB, the built-in rank function can be used to estimate the


rank of a matrix:

>> A = [1 2 4; 1 1 5; 1 1 6];
>> rank(A)
ans =
3
Note that:

rank(AB) ≤ min(rank(A), rank(B))
rank(A + B) ≤ rank(A) + rank(B)
rank(AAᵀ) = rank(A) = rank(AᵀA)

Although the rank of a matrix is very useful to categorize the behavior of matrices and systems of equations, the rank of a matrix is usually not computed. •

The use of nonzero pivots is sufficient for the theoretical correctness


of simple Gaussian elimination, but more care must be taken if one is to
obtain reliable results. For example, consider the linear system

0.000100x1 + x2 = 1
x1 + x2 = 2,

which has the exact solution x = [1.00010, 0.99990]T . Now we solve this
system by simple Gaussian elimination. The first elimination step is to
eliminate the first variable x1 from the second equation by subtracting
the multiple $m_{21} = 10000$ of the first equation from the second equation. Working in rounded floating-point arithmetic (say, three significant digits), this gives

$$\begin{aligned} 0.000100x_1 + x_2 &= 1 \\ -10000x_2 &= -10000. \end{aligned}$$

Using backward substitution we get the solution x∗ = [0, 1]T . Thus, a


computational disaster has occurred. But if we interchange the equations,
we obtain
x1 + x2 = 2
0.000100x1 + x2 = 1.

Applying Gaussian elimination again, we get the solution x∗ = [1, 1]T .


This solution is as good as one would hope. So, we conclude from this
example that it is not enough just to avoid a zero pivot, one must also
avoid a relatively small one. Here we need some pivoting strategies to help
us overcome the difficulties faced during the process of simple Gaussian
elimination.
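In standard IEEE double precision, the 0.000100 pivot above is actually harmless; the disaster presumes short (roughly three-digit) arithmetic. An extreme pivot reproduces the same behavior in double precision, as the following assumed sketch shows:

% Effect of a very small pivot in double precision (illustrative sketch).
% The 1e-20 pivot plays the role of 0.000100 in three-digit arithmetic.
A = [1e-20 1; 1 1]; b = [1; 2];
m21 = A(2,1)/A(1,1);                   % huge multiplier, 1e20
U = A; U(2,:) = U(2,:) - m21*U(1,:);
c = b; c(2) = c(2) - m21*b(1);
x2 = c(2)/U(2,2);
x1 = (c(1) - U(1,2)*x2)/U(1,1);
[x1; x2]                               % x1 is lost to rounding: [0; 1]
A\b                                    % pivoted solve returns about [1; 1]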

1.4.3 Pivoting Strategies


We know that simple Gaussian elimination is applied to a problem with
no pivotal elements that are zero, but the method does not work if the
first coefficient of the first equation or a diagonal coefficient becomes zero
in the process of the solution, because they are used as denominators in a
forward elimination.
Pivoting is used to change the sequential order of the equations for two
purposes; first to prevent diagonal coefficients from becoming zero, and sec-
ond, to make each diagonal coefficient larger in magnitude than any other
coefficient below it, i.e., to decrease the round-off errors. The equations are
not mathematically affected by changes in sequential order, but changing
the order makes the coefficient become nonzero. Even when all diagonal
coefficients are nonzero, the change of order increases the accuracy of the
computations.
There are two standard pivoting strategies used to handle these diffi-
culties easily. They are explained as follows.
Partial Pivoting
Here, we develop an implementation of Gaussian elimination that utilizes
the pivoting strategy discussed above. In using Gaussian elimination by
partial pivoting (or row pivoting), the basic approach is to use the largest
(in absolute value) element on or below the diagonal in the column of
current interest as the pivotal element for elimination in the rest of that
column.
One immediate effect of this will be to force all the multiples used to be
not greater than 1 in absolute value. This will inhibit the growth of error in
the rest of the elimination phase and in subsequent backward substitution.
At stage k of forward elimination, it is necessary, therefore, to be able to
identify the largest element from |akk |, |ak+1,k |, . . . , |ank |, where these aik s
are the elements in the current partially triangularized coefficient matrix. If
this maximum occurs in row p, then the pth and kth rows of the augmented
matrix are interchanged and the elimination proceeds as usual. In solving
n linear equations, a total of N = n(n+1)/2 coefficients must be examined.
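In MATLAB, the search for the largest subdiagonal element at stage k is a single call to max. The snippet below is a sketch of the row-selection step only; the state of U, k, and n is assumed to come from the surrounding elimination loop (here seeded with the data of Example 1.32):

% Partial-pivot row selection at stage k (sketch).
U = [1 1 1 1; 2 3 4 3; 4 9 16 11]; n = 3; k = 1;  % assumed current state
[~, p] = max(abs(U(k:n,k)));   % largest entry on or below the diagonal
p = p + k - 1;                 % convert to a row index of U
U([k p],:) = U([p k],:)        % interchange rows k and p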
Example 1.32 Solve the following linear system using Gaussian elimina-
tion with partial pivoting:
x1 + x2 + x3 = 1
2x1 + 3x2 + 4x3 = 3
4x1 + 9x2 + 16x3 = 11.
Solution. For the first elimination step, since 4 is the largest absolute coefficient of the first variable $x_1$, the first row and the third row are interchanged, which gives us

$$\begin{aligned} 4x_1 + 9x_2 + 16x_3 &= 11 \\ 2x_1 + 3x_2 + 4x_3 &= 3 \\ x_1 + x_2 + x_3 &= 1. \end{aligned}$$

Eliminate the first variable $x_1$ from the second and third rows by subtracting the multiples $m_{21} = \frac{2}{4} = \frac{1}{2}$ and $m_{31} = \frac{1}{4}$ of row 1 from row 2 and row 3, respectively, which gives

$$\begin{aligned} 4x_1 + 9x_2 + 16x_3 &= 11 \\ -\tfrac{3}{2}x_2 - 4x_3 &= -\tfrac{5}{2} \\ -\tfrac{5}{4}x_2 - 3x_3 &= -\tfrac{7}{4}. \end{aligned}$$

For the second elimination step, $-\frac{3}{2}$ is the largest absolute coefficient of the second variable $x_2$, so eliminate $x_2$ from the third row by subtracting the multiple $m_{32} = \frac{5}{6}$ of row 2 from row 3, which gives

$$\begin{aligned} 4x_1 + 9x_2 + 16x_3 &= 11 \\ -\tfrac{3}{2}x_2 - 4x_3 &= -\tfrac{5}{2} \\ \tfrac{1}{3}x_3 &= \tfrac{1}{3}. \end{aligned}$$

Obviously, the original set of equations has been transformed to an equiva-


lent upper-triangular form. Now using backward substitution, we get

x1 = 1, x2 = −1, x3 = 1,

which is the required solution of the given linear system. •

The following MATLAB commands will give the same results we ob-
tained in Example 1.32 of the Gaussian elimination method with partial
pivoting:

>> B = [1 1 1 1; 2 3 4 3; 4 9 16 11];
>> x = PP(B);
>> disp(x)
Program 1.7
MATLAB m-file for Gaussian Elimination by Partial Pivoting
function x = PP(B)
% B is the augmented matrix [A|b]
[n, t] = size(B); U = B;
for M = 1:n-1
    mx(M) = abs(U(M,M)); r = M;       % search column M for the largest
    for i = M+1:n                     % pivot candidate
        if mx(M) < abs(U(i,M))
            mx(M) = abs(U(i,M)); r = i;
        end
    end
    rw1(1,1:t) = U(r,1:t); rw2(1,1:t) = U(M,1:t);
    U(M,1:t) = rw1; U(r,1:t) = rw2;   % interchange rows M and r
    for k = M+1:n                     % forward elimination
        m = U(k,M)/U(M,M);
        for j = M:t; U(k,j) = U(k,j) - m*U(M,j); end
    end
end
i = n; x(i) = U(i,t)/U(i,i);          % backward substitution
for i = n-1:-1:1
    s = 0;
    for k = n:-1:i+1; s = s + U(i,k)*x(k); end
    x(i) = (U(i,t)-s)/U(i,i);
end
end

Procedure 1.3 (Partial Pivoting)

1. Suppose we are about to work on the ith column of the matrix. Then
we search that portion of the ith column below and including the di-
agonal and find the element that has the largest absolute value. Let
p denote the index of the row that contains this element.

2. Interchange row i and p.

3. Proceed with elimination procedure 1.2.

Total Pivoting
In the case of total pivoting (or complete pivoting), we search for the largest
number (in absolute value) in the entire array instead of just in the first
column, and this number is the pivot. This means that we shall probably
need to interchange the columns as well as rows. When solving a system
of equations using complete pivoting, each row interchange is equivalent to
interchanging two equations, while each column interchange is equivalent
to interchanging the two unknowns.

At the kth step, interchange both the rows and columns of the matrix so that the largest number in the remaining submatrix is used as the pivot, i.e., after the pivoting,

$$|a_{kk}| = \max |a_{ij}|, \quad \text{for } i = k, k+1, \ldots, n, \; j = k, k+1, \ldots, n.$$
There are times when the partial pivoting procedure is inadequate.
When some rows have coefficients that are very large in comparison to
those in other rows, partial pivoting may not give a correct solution.

Therefore, when in doubt, use total pivoting. No amount of pivot-


ing will remove inherent ill-conditioning (we will discuss this later in the
chapter) from a set of equations, but it helps to ensure that no further
ill-conditioning is introduced in the course of computation.
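Locating a total pivot requires a search over the whole trailing submatrix rather than a single column. One way to express the selection step (our sketch, with assumed names, again seeded with the data of Example 1.33) is:

% Total-pivot selection at stage k over the trailing submatrix (sketch).
U = [1 1 1 1; 2 3 4 3; 4 9 16 11]; n = 3; k = 1;  % assumed current state
S = abs(U(k:n,k:n));           % coefficient part of the trailing block
[colmax, rowidx] = max(S);     % column maxima and their row positions
[~, c] = max(colmax);          % column holding the overall maximum
r = rowidx(c);
p = r + k - 1; q = c + k - 1;  % convert to indices of U
U([k p],:) = U([p k],:);       % row interchange
U(:,[k q]) = U(:,[q k])        % column interchange (reorders the unknowns)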
Example 1.33 Solve the following linear system using Gaussian elimina-
tion with total pivoting:
x1 + x2 + x3 = 1
2x1 + 3x2 + 4x3 = 3
4x1 + 9x2 + 16x3 = 11.
Solution. For the first elimination step, since 16 is the largest absolute coefficient in the given system (the coefficient of $x_3$ in the third equation), the first row and the third row are interchanged as well as the first column and third column, and we get

$$\begin{aligned} 16x_3 + 9x_2 + 4x_1 &= 11 \\ 4x_3 + 3x_2 + 2x_1 &= 3 \\ x_3 + x_2 + x_1 &= 1. \end{aligned}$$
Then eliminate the third variable $x_3$ from the second and third rows by subtracting the multiples $m_{21} = \frac{4}{16} = \frac{1}{4}$ and $m_{31} = \frac{1}{16}$ of row 1 from rows 2 and 3, respectively, which gives

$$\begin{aligned} 16x_3 + 9x_2 + 4x_1 &= 11 \\ \tfrac{3}{4}x_2 + x_1 &= \tfrac{1}{4} \\ \tfrac{7}{16}x_2 + \tfrac{3}{4}x_1 &= \tfrac{5}{16}. \end{aligned}$$

For the second elimination step, 1 is the largest absolute coefficient of the first variable $x_1$ in the remaining rows, so the second and third columns are interchanged, giving us

$$\begin{aligned} 16x_3 + 4x_1 + 9x_2 &= 11 \\ x_1 + \tfrac{3}{4}x_2 &= \tfrac{1}{4} \\ \tfrac{3}{4}x_1 + \tfrac{7}{16}x_2 &= \tfrac{5}{16}. \end{aligned}$$

Eliminate the first variable $x_1$ from the third row by subtracting the multiple $m_{32} = \frac{3}{4}$ of row 2 from row 3, which gives

$$\begin{aligned} 16x_3 + 4x_1 + 9x_2 &= 11 \\ x_1 + \tfrac{3}{4}x_2 &= \tfrac{1}{4} \\ -\tfrac{1}{8}x_2 &= \tfrac{1}{8}. \end{aligned}$$

The original set of equations has been transformed to an equivalent upper-


triangular form. Now using backward substitution, we get

x1 = 1, x2 = −1, x3 = 1,

which is the required solution of the given linear system. •

Program 1.8
MATLAB m-file for Gaussian Elimination by Total Pivoting
function X = TP(B)
% B is the augmented matrix [A|b]
[n, m] = size(B); U = B; w = zeros(n,n);
for i = 1:n; N(i) = i; end             % track the order of the unknowns
for M = 1:n-1
    mx(M) = abs(U(M,M)); r = M; c = M;
    for i = M:n                        % search the whole trailing
        for j = M:n                    % submatrix for the largest entry
            if mx(M) < abs(U(i,j))
                mx(M) = abs(U(i,j)); r = i; c = j;
            end
        end
    end
    rw1(1,1:m) = U(r,1:m); rw2(1,1:m) = U(M,1:m);
    U(M,1:m) = rw1; U(r,1:m) = rw2;    % row interchange
    cl1(1:n,1) = U(1:n,c); cl2(1:n,1) = U(1:n,M);
    U(1:n,M) = cl1(1:n,1); U(1:n,c) = cl2(1:n,1);  % column interchange
    p = N(M); N(M) = N(c); N(c) = p; w(M,1:n) = N;
    for k = M+1:n                      % forward elimination
        e = U(k,M)/U(M,M);
        for j = M:m; U(k,j) = U(k,j) - e*U(M,j); end
    end
end
i = n; x(i,1) = U(i,m)/U(i,i);         % backward substitution
for i = n-1:-1:1
    s = 0;
    for k = n:-1:i+1; s = s + U(i,k)*x(k,1); end
    x(i,1) = (U(i,m)-s)/U(i,i);
end
for i = 1:n; X(N(i),1) = x(i,1); end   % undo the column reordering
end

MATLAB can be used to get the same results we obtained in Example 1.33 of the Gaussian elimination method with total pivoting with the following commands:

>> B = [1 1 1 1; 2 3 4 3; 4 9 16 11];
>> x = TP(B);
>> disp(x)

Total pivoting offers little advantage over partial pivoting and is significantly slower, requiring N = n(n+1)(2n+1)/6 elements to be examined in total. It is rarely used in practice, because interchanging columns changes the order of the xs and, consequently, adds significant and usually unjustified complexity to the computer program. So, for getting good results, partial pivoting has been shown to be a very reliable procedure.

1.4.4 Gauss–Jordan Method


This method is a modification of the Gaussian elimination method. The
Gauss–Jordan method is inefficient for practical calculation, but is often
useful for theoretical purposes. The basis of this method is to convert the
given matrix into a diagonal form. The forward elimination of the Gauss–
Jordan method is identical to the Gaussian elimination method. However,
Gauss–Jordan elimination uses backward elimination rather than backward
substitution. In the Gauss–Jordan method the forward elimination and
backward elimination need not be separated. This is possible because a
pivot element can be used to eliminate the coefficients not only below but
also above at the same time. If this approach is taken, the form of the
coefficients matrix becomes diagonal when elimination by the last pivot is
completed. The Gauss–Jordan method simply yields a transformation of
the augmented matrix of the form

[A|b] → [I|c],

where I is the identity matrix and c is the column matrix, which represents
the possible solution of the given linear system.
Example 1.34 Solve the following linear system using the Gauss–Jordan
method:
x1 + 2x2 = 3
−x1 − 2x3 = −5
−3x1 − 5x2 + x3 = −4.
Solution. Write the given system in the augmented matrix form

$$\left[\begin{array}{ccc|c} 1 & 2 & 0 & 3 \\ -1 & 0 & -2 & -5 \\ -3 & -5 & 1 & -4 \end{array}\right].$$

The first elimination step is to eliminate the elements $a_{21} = -1$ and $a_{31} = -3$ by subtracting the multiples $m_{21} = -1$ and $m_{31} = -3$ of row 1 from rows
2 and 3, respectively, which gives

$$\left[\begin{array}{ccc|c} 1 & 2 & 0 & 3 \\ 0 & 2 & -2 & -2 \\ 0 & 1 & 1 & 5 \end{array}\right].$$

The second row is now divided by 2 to give

$$\left[\begin{array}{ccc|c} 1 & 2 & 0 & 3 \\ 0 & 1 & -1 & -1 \\ 0 & 1 & 1 & 5 \end{array}\right].$$

The second elimination step is to eliminate the elements in positions $a_{12} = 2$ and $a_{32}^{(1)} = 1$ by subtracting the multiples $m_{12} = 2$ and $m_{32} = 1$ of row 2 from rows 1 and 3, respectively, which gives

$$\left[\begin{array}{ccc|c} 1 & 0 & 2 & 5 \\ 0 & 1 & -1 & -1 \\ 0 & 0 & 2 & 6 \end{array}\right].$$

The third row is now divided by 2 to give

$$\left[\begin{array}{ccc|c} 1 & 0 & 2 & 5 \\ 0 & 1 & -1 & -1 \\ 0 & 0 & 1 & 3 \end{array}\right].$$

The third elimination step is to eliminate the elements in positions $a_{23}^{(1)} = -1$ and $a_{13} = 2$ by subtracting the multiples $m_{23} = -1$ and $m_{13} = 2$ of row 3 from rows 2 and 1, respectively, which gives

$$\left[\begin{array}{ccc|c} 1 & 0 & 0 & -1 \\ 0 & 1 & 0 & 2 \\ 0 & 0 & 1 & 3 \end{array}\right].$$
Obviously, the original set of equations has been transformed to a diagonal form. Now expressing the set in algebraic form yields

$$x_1 = -1, \quad x_2 = 2, \quad x_3 = 3,$$
which is the required solution of the given system. •
The above results can be obtained using MATLAB commands, as follows:

>> Ab = [1 2 0 3; -1 0 -2 -5; -3 -5 1 -4];   % augmented matrix [A|b]
>> GaussJ(Ab)

Program 1.9
MATLAB m-file for the Gauss–Jordan Method
function sol = GaussJ(Ab)
[m, n] = size(Ab);
for i = 1:m
    Ab(i,:) = Ab(i,:)/Ab(i,i);               % normalize the pivot row
    for j = 1:m
        if j == i; continue; end
        Ab(j,:) = Ab(j,:) - Ab(j,i)*Ab(i,:); % eliminate above and below
    end
end
sol = Ab;

Procedure 1.4 (Gauss–Jordan Method)


1. Form the augmented matrix, [A|b].

2. Reduce the coefficient matrix A to unit upper-triangular form using


the Gaussian procedure.

3. Use the nth row to reduce the nth column to an equivalent identity
matrix column.

4. Repeat step 3 for n–1 through 2 to get the augmented matrix of the
form [I|c].
5. Solve for the unknowns $x_i = c_i$, for i = 1, 2, . . . , n.

The number of multiplications and divisions required for the Gauss–Jordan method is approximately

$$N = \frac{n^3}{2} + n^2 - \frac{n}{2},$$

which is approximately 50% larger than for the Gaussian elimination method. Consequently, the Gaussian elimination method is preferred.
The Gauss–Jordan method is particularly well suited to compute the
inverse of a matrix through the transformation
[A|I] → [I|A−1 ].
Note if the inverse of the matrix can be found, then the solution of the
linear system can be computed easily from the product of matrix A−1 and
column matrix b, i.e.,
x = A−1 b. (1.32)
Example 1.35 Apply the Gauss–Jordan method to find the inverse of the
following matrix:  
10 1 −5
A =  −20 3 20  .
5 3 5
Then solve the system with b = [1, 2, 6]T .

Solution. Consider the following augmented matrix:

$$[A|I] = \left[\begin{array}{ccc|ccc} 10 & 1 & -5 & 1 & 0 & 0 \\ -20 & 3 & 20 & 0 & 1 & 0 \\ 5 & 3 & 5 & 0 & 0 & 1 \end{array}\right].$$
Divide the first row by 10, which gives

$$\left[\begin{array}{ccc|ccc} 1 & 0.1 & -0.5 & 0.1 & 0 & 0 \\ -20 & 3 & 20 & 0 & 1 & 0 \\ 5 & 3 & 5 & 0 & 0 & 1 \end{array}\right].$$

The first elimination step is to eliminate the elements in positions $a_{21} = -20$ and $a_{31} = 5$ by subtracting the multiples $m_{21} = -20$ and $m_{31} = 5$ of row 1 from rows 2 and 3, respectively, which gives

$$\left[\begin{array}{ccc|ccc} 1 & 0.1 & -0.5 & 0.1 & 0 & 0 \\ 0 & 5 & 10 & 2 & 1 & 0 \\ 0 & 2.5 & 7.5 & -0.5 & 0 & 1 \end{array}\right].$$

Divide the second row by 5, which gives

$$\left[\begin{array}{ccc|ccc} 1 & 0.1 & -0.5 & 0.1 & 0 & 0 \\ 0 & 1 & 2 & 0.4 & 0.2 & 0 \\ 0 & 2.5 & 7.5 & -0.5 & 0 & 1 \end{array}\right].$$

The second elimination step is to eliminate the elements in positions $a_{12} = 0.1$ and $a_{32}^{(1)} = 2.5$ by subtracting the multiples $m_{12} = 0.1$ and $m_{32} = 2.5$ of row 2 from rows 1 and 3, respectively, which gives

$$\left[\begin{array}{ccc|ccc} 1 & 0 & -0.7 & 0.06 & -0.02 & 0 \\ 0 & 1 & 2 & 0.4 & 0.2 & 0 \\ 0 & 0 & 2.5 & -1.5 & -0.5 & 1 \end{array}\right].$$

Divide the third row by 2.5, which gives

$$\left[\begin{array}{ccc|ccc} 1 & 0 & -0.7 & 0.06 & -0.02 & 0 \\ 0 & 1 & 2 & 0.4 & 0.2 & 0 \\ 0 & 0 & 1 & -0.6 & -0.2 & 0.4 \end{array}\right].$$

The third elimination step is to eliminate the elements in positions $a_{23}^{(2)} = 2$ and $a_{13}^{(2)} = -0.7$ by subtracting the multiples $m_{23} = 2$ and $m_{13} = -0.7$ of row 3 from rows 2 and 1, respectively, which gives

$$\left[\begin{array}{ccc|ccc} 1 & 0 & 0 & -0.36 & -0.16 & 0.28 \\ 0 & 1 & 0 & 1.6 & 0.6 & -0.8 \\ 0 & 0 & 1 & -0.6 & -0.2 & 0.4 \end{array}\right] = [I|A^{-1}].$$
Obviously, the original augmented matrix [A|I] has been transformed to the augmented matrix of the form [I|A⁻¹]. Hence, the solution of the linear system can be obtained by the matrix multiplication (1.32) as

$$\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} -0.36 & -0.16 & 0.28 \\ 1.6 & 0.6 & -0.8 \\ -0.6 & -0.2 & 0.4 \end{bmatrix} \begin{bmatrix} 1 \\ 2 \\ 6 \end{bmatrix} = \begin{bmatrix} 1 \\ -2 \\ 1.4 \end{bmatrix}.$$

Hence, x∗ = [1, −2, 1.4]T is the solution of the given system. •

The above results can be obtained using MATLAB, as follows:

>> Ab = [10 1 -5 1 0 0; -20 3 20 0 1 0; 5 3 5 0 0 1];   % [A|I]
>> sol = GaussJ(Ab);        % sol = [I|inv(A)]
>> Ainv = sol(:, 4:6);
>> b = [1 2 6]';
>> x = Ainv * b;

1.4.5 LU Decomposition Method


This is another direct method to find the solution of a system of linear equa-
tions. LU decomposition (or the factorization method) is a modification
of the elimination method. Here we decompose or factorize the coefficient
matrix A into the product of two triangular matrices in the form

A = LU, (1.33)

where L is a lower-triangular matrix and U is the upper-triangular matrix.


Both are the same size as the coefficients matrix A. To solve a number
of linear equations sets in which the coefficients matrices are all identical
but the right-hand sides are different, then LU decomposition is more
efficient than the elimination method. Specifying the diagonal elements
of either L or U makes the factoring unique. The procedure based on
unity elements on the diagonal of matrix L is called Doolittle’s method (or
Gauss factorization), while the procedure based on unity elements on the
diagonal of matrix U is called Crout’s method. Another method, called the
Cholesky method, is based on the constraint that the diagonal elements of
L are equal to the diagonal elements of U , i.e., lii = uii , for i = 1, 2, . . . , n.
The general forms of L and U are written as

$$L = \begin{bmatrix} l_{11} & 0 & \cdots & 0 \\ l_{21} & l_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ l_{n1} & l_{n2} & \cdots & l_{nn} \end{bmatrix}, \quad U = \begin{bmatrix} u_{11} & u_{12} & \cdots & u_{1n} \\ 0 & u_{22} & \cdots & u_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & u_{nn} \end{bmatrix}, \qquad (1.34)$$

such that $l_{ij} = 0$ for $i < j$ and $u_{ij} = 0$ for $i > j$.

Consider a linear system


Ax = b (1.35)
and let A be factored into the product of L and U , as shown by (1.34).
Then the linear system (1.35) becomes

LU x = b

or can be written as
Ly = b,
where
y = U x.
The unknown elements of matrix L and matrix U are computed by equating
corresponding elements in matrices A and LU in a systematic way. Once
the matrices L and U have been constructed, the solution of system (1.35)
can be computed in the following two steps:

1. Solve the system Ly = b.

By using forward substitution, we find the components of the unknown vector y from

$$\left.\begin{aligned} y_1 &= b_1, \\ y_i &= b_i - \sum_{j=1}^{i-1} l_{ij}y_j, \quad i = 2, 3, \ldots, n \end{aligned}\right\} \qquad (1.36)$$

(When L does not have a unit diagonal, as in Crout's method below, each $y_i$ is additionally divided by $l_{ii}$.)
2. Solve the system Ux = y.

By using backward substitution, we find the components of the unknown vector x from

$$\left.\begin{aligned} x_n &= \frac{y_n}{u_{nn}}, \\ x_i &= \frac{1}{u_{ii}}\left[y_i - \sum_{j=i+1}^{n} u_{ij}x_j\right], \quad i = n-1, n-2, \ldots, 1 \end{aligned}\right\} \qquad (1.37)$$
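Both triangular solves fit in a few lines of MATLAB. The sketch below implements (1.36) and (1.37) directly; the factors used are the Doolittle factors of the matrix of Example 1.22 (so L has a unit diagonal, as (1.36) assumes):

% Two-stage solve of L*U*x = b via (1.36) and (1.37) (sketch).
L = [1 0 0; 2 1 0; 1 1 1];     % unit lower-triangular factor
U = [1 2 1; 0 1 1; 0 0 2];     % upper-triangular factor
b = [2; 1; 5];
n = length(b);
y = zeros(n,1); x = zeros(n,1);
y(1) = b(1);
for i = 2:n                    % forward substitution (1.36)
    y(i) = b(i) - L(i,1:i-1)*y(1:i-1);
end
x(n) = y(n)/U(n,n);
for i = n-1:-1:1               % backward substitution (1.37)
    x(i) = (y(i) - U(i,i+1:n)*x(i+1:n))/U(i,i);
end
x                              % returns [11; -6; 3]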

Thus, the relationship of the matrices L and U to the original matrix A is


given by the following theorem.
Theorem 1.21 If Gaussian elimination can be performed on the linear
system Ax = b without row interchanges, then the matrix A can be factored
into the product of a lower-triangular matrix L and an upper-triangular
matrix U , i.e.,
A = LU,
where the matrices L and U are the same size as A. •
Let us consider a nonsingular system Ax = b and with the help of the
simple Gauss elimination method we will convert the coefficient matrix A
into the upper-triangular matrix U by using elementary row operations. If
all the pivots are nonzero, then row interchanges are not necessary, and the
decomposition of the matrix A is possible. Consider the following matrix:
 
2 4 2
A= 4 9 7 .
−2 −2 5
To convert it into the upper-triangular matrix U , we first apply the follow-
ing row operations
Row2 − (2)Row1 and Row3 + Row1,
which gives  
2 4 2
 0 1 3 .
0 2 7
Once again, applying the row operation

Row3 − (2)Row2,

we get

$$\begin{bmatrix} 2 & 4 & 2 \\ 0 & 1 & 3 \\ 0 & 0 & 1 \end{bmatrix} = U,$$
which is the required upper-triangular matrix.

Now define the three elementary matrices (each obtained from the identity by adding a multiple of row i to row j) associated with these row operations:

$$E_1 = \begin{bmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad E_2 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix}, \quad E_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -2 & 1 \end{bmatrix}.$$

Then

$$E_3E_2E_1 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -2 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 5 & -2 & 1 \end{bmatrix}$$

and

$$E_3E_2E_1A = \begin{bmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 5 & -2 & 1 \end{bmatrix}\begin{bmatrix} 2 & 4 & 2 \\ 4 & 9 & 7 \\ -2 & -2 & 5 \end{bmatrix} = \begin{bmatrix} 2 & 4 & 2 \\ 0 & 1 & 3 \\ 0 & 0 & 1 \end{bmatrix} = U.$$
So

$$A = E_1^{-1}E_2^{-1}E_3^{-1}U = LU,$$

where

$$E_1^{-1}E_2^{-1}E_3^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -1 & 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 2 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ -1 & 2 & 1 \end{bmatrix} = L.$$
Thus, A = LU is a product of a lower-triangular matrix L and an upper-
triangular matrix U . Naturally, this is called an LU decomposition of A.
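The whole derivation can be verified numerically; the script below is a sketch that rebuilds U and L from the three elementary matrices:

% Verify the elementary-matrix derivation of A = L*U (sketch).
A = [2 4 2; 4 9 7; -2 -2 5];
E1 = [1 0 0; -2 1 0; 0 0 1];   % Row2 <- Row2 - 2*Row1
E2 = [1 0 0; 0 1 0; 1 0 1];    % Row3 <- Row3 + Row1
E3 = [1 0 0; 0 1 0; 0 -2 1];   % Row3 <- Row3 - 2*Row2
U = E3*E2*E1*A                 % upper triangular
L = inv(E1)*inv(E2)*inv(E3)    % unit lower triangular
norm(A - L*U)                  % 0, confirming A = L*U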
Theorem 1.22 Let A be an n × n matrix that has an LU factorization,


i.e.,
A = LU.
If A has rank n (i.e., all pivots are nonzeros), then L and U are uniquely
determined by A. •

Now we will discuss all three possible variations of LU decomposition to


find the solution of the nonsingular linear system in the following.

Doolittle’s Method
In Doolittle’s method (called Gauss factorization), the upper-triangular
matrix U is obtained by forward elimination of the Gaussian elimination
method and the lower-triangular matrix L containing the multiples used in
the Gaussian elimination process as the elements below the diagonal with
unity elements on the main diagonal.

For the matrix A in Example 1.22, we can have the decomposition of


matrix A in the form
    
1 2 1 1 0 0 1 2 1
 2 5 3 = 2 1 0  0 1 1 ,
1 3 4 1 1 1 0 0 2

where the unknown elements of matrix L are the used multiples and the
matrix U is the same as we obtained in the forward elimination process.
Example 1.36 Construct the LU decomposition of the following matrix A by using Gauss factorization (i.e., LU decomposition by Doolittle's method), and find the value(s) of α for which the matrix

$$A = \begin{bmatrix} 1 & -1 & \alpha \\ -1 & 2 & -\alpha \\ \alpha & 1 & 1 \end{bmatrix}$$

is singular. Also, find the unique solution of the linear system Ax = [1, 1, 2]ᵀ by using the smallest positive integer value of α.
Solution. Since we know that

$$A = \begin{bmatrix} 1 & -1 & \alpha \\ -1 & 2 & -\alpha \\ \alpha & 1 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ m_{21} & 1 & 0 \\ m_{31} & m_{32} & 1 \end{bmatrix}\begin{bmatrix} u_{11} & u_{12} & u_{13} \\ 0 & u_{22} & u_{23} \\ 0 & 0 & u_{33} \end{bmatrix} = LU,$$

we use only the forward elimination step of the simple Gaussian elimination method to convert the given matrix A into the upper-triangular matrix U. Since $a_{11} = 1 \neq 0$, we wish to eliminate the elements $a_{21} = -1$ and $a_{31} = \alpha$ by subtracting from the second and third rows the appropriate multiples of the first row. In this case, the multiples are

$$m_{21} = \frac{-1}{1} = -1 \quad \text{and} \quad m_{31} = \frac{\alpha}{1} = \alpha.$$

Hence,

$$\begin{bmatrix} 1 & -1 & \alpha \\ 0 & 1 & 0 \\ 0 & 1+\alpha & 1-\alpha^2 \end{bmatrix}.$$

Since $a_{22}^{(1)} = 1 \neq 0$, we eliminate the entry in the $a_{32}^{(1)} = 1+\alpha$ position by subtracting the multiple $m_{32} = \frac{1+\alpha}{1} = 1+\alpha$ of the second row from the third row to get

$$\begin{bmatrix} 1 & -1 & \alpha \\ 0 & 1 & 0 \\ 0 & 0 & 1-\alpha^2 \end{bmatrix}.$$

Obviously, the original matrix has been transformed to an upper-triangular form. Thus,

$$\begin{bmatrix} 1 & -1 & \alpha \\ -1 & 2 & -\alpha \\ \alpha & 1 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ \alpha & 1+\alpha & 1 \end{bmatrix}\begin{bmatrix} 1 & -1 & \alpha \\ 0 & 1 & 0 \\ 0 & 0 & 1-\alpha^2 \end{bmatrix},$$

which is the required decomposition of A. The matrix is singular if the third diagonal element $1-\alpha^2$ of the upper-triangular factor U equals zero (Theorem 1.19), which gives $\alpha = \pm 1$.
To find the unique solution of the given system, we take α = 2 (the smallest positive integer for which A is nonsingular), which gives

$$\begin{bmatrix} 1 & -1 & 2 \\ -1 & 2 & -2 \\ 2 & 1 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ 2 & 3 & 1 \end{bmatrix}\begin{bmatrix} 1 & -1 & 2 \\ 0 & 1 & 0 \\ 0 & 0 & -3 \end{bmatrix}.$$

Now solve the first system Ly = b for the unknown vector y, i.e.,

$$\begin{bmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ 2 & 3 & 1 \end{bmatrix}\begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ 2 \end{bmatrix}.$$

Performing forward substitution yields

y1 = 1 gives y1 = 1,
−y1 + y2 = 1 gives y2 = 2,
2y1 + 3y2 + y3 = 2 gives y3 = −6.

Then solve the second system Ux = y for the unknown vector x, i.e.,

$$\begin{bmatrix} 1 & -1 & 2 \\ 0 & 1 & 0 \\ 0 & 0 & -3 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ -6 \end{bmatrix}.$$

Performing backward substitution yields

x1 − x2 + 2x3 = 1 gives x1 = −1
x2 = 2 gives x2 = 2
− 3x3 = −6 gives x3 = 2,

which gives $x_1 = -1$, $x_2 = 2$, $x_3 = 2$, the solution of the given system. •
We can write a MATLAB m-file to factor a nonsingular matrix A into a unit lower-triangular matrix L and an upper-triangular matrix U using the lu_gauss function. The following MATLAB commands can be used to reproduce the solution of the linear system of Example 1.34:

>> A = [1 2 0; -1 0 -2; -3 -5 1];
>> B = lu_gauss(A);
>> L = eye(size(B)) + tril(B, -1);
>> U = triu(B);
>> b = [3 -5 -4]';
>> y = L \ b;
>> x = U \ y;

Program 1.10
MATLAB m-file for the LU Decomposition Method
function A = lu_gauss(A)
% LU factorization without pivoting: the multipliers of L are stored
% below the diagonal of A; U is left on and above the diagonal
[n, n] = size(A);
for i = 1:n-1
    pivot = A(i,i);
    for k = i+1:n
        A(k,i) = A(k,i)/pivot;                       % multiplier l_ki
        for j = i+1:n; A(k,j) = A(k,j) - A(k,i)*A(i,j); end
    end
end
end

There is another way to find the values of the unknown elements of the
matrices L and U , which we describe in the following example.

Example 1.37 Construct the LU decomposition of the following matrix


using Doolittle’s method:
 
1 2 4
A =  1 3 3 .
2 2 2
Solution. Since

$$A = LU = \begin{bmatrix} 1 & 0 & 0 \\ l_{21} & 1 & 0 \\ l_{31} & l_{32} & 1 \end{bmatrix}\begin{bmatrix} u_{11} & u_{12} & u_{13} \\ 0 & u_{22} & u_{23} \\ 0 & 0 & u_{33} \end{bmatrix},$$

performing the multiplication on the right-hand side gives

$$\begin{bmatrix} 1 & 2 & 4 \\ 1 & 3 & 3 \\ 2 & 2 & 2 \end{bmatrix} = \begin{bmatrix} u_{11} & u_{12} & u_{13} \\ l_{21}u_{11} & l_{21}u_{12}+u_{22} & l_{21}u_{13}+u_{23} \\ l_{31}u_{11} & l_{31}u_{12}+l_{32}u_{22} & l_{31}u_{13}+l_{32}u_{23}+u_{33} \end{bmatrix}.$$

Then equate elements of the first column to obtain

1 = u11 , u11 = 1
1 = l21 u11 , l21 = 1
2 = l31 u11 , l31 = 2.

Now equate elements of the second column to obtain

2 = u12 , u12 = 2
3 = l21 u12 + u22 , u22 = 3 − 2 = 1
2 = l31 u12 + l32 u22 , l32 = 2 − 4 = −2.

Finally, equate elements of the third column to obtain

4 = u13 , u13 = 4
3 = l21 u13 + u23 , u23 = 3 − 4 = −1
2 = l31 u13 + l32 u23 + u33 , u33 = 2 − 10 = −8.

Thus, we obtain

$$\begin{bmatrix} 1 & 2 & 4 \\ 1 & 3 & 3 \\ 2 & 2 & 2 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 2 & -2 & 1 \end{bmatrix}\begin{bmatrix} 1 & 2 & 4 \\ 0 & 1 & -1 \\ 0 & 0 & -8 \end{bmatrix},$$

the factorization of the given matrix. •


The general formula for getting the elements of L and U corresponding to the coefficient matrix A for a set of n linear equations can be written as

$$\left.\begin{aligned} u_{1j} &= a_{1j}, && i = 1 \\ l_{i1} &= \frac{a_{i1}}{u_{11}} = \frac{a_{i1}}{a_{11}}, && j = 1 \\ u_{ij} &= a_{ij} - \sum_{k=1}^{i-1} l_{ik}u_{kj}, && 2 \le i \le j \\ l_{ij} &= \frac{1}{u_{jj}}\left[a_{ij} - \sum_{k=1}^{j-1} l_{ik}u_{kj}\right], && i > j \ge 2 \end{aligned}\right\} \qquad (1.38)$$
Example 1.38 Solve the following linear system by LU decomposition using Doolittle's method:

$$A = \begin{bmatrix} 1 & 2 & 4 \\ 1 & 3 & 3 \\ 2 & 2 & 2 \end{bmatrix} \quad \text{and} \quad b = \begin{bmatrix} -2 \\ 3 \\ -6 \end{bmatrix}.$$
Solution. The factorization of the coefficient matrix A has already been constructed in Example 1.37 as

$$\begin{bmatrix} 1 & 2 & 4 \\ 1 & 3 & 3 \\ 2 & 2 & 2 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 2 & -2 & 1 \end{bmatrix}\begin{bmatrix} 1 & 2 & 4 \\ 0 & 1 & -1 \\ 0 & 0 & -8 \end{bmatrix}.$$

Then solve the first system Ly = b for the unknown vector y, i.e.,

$$\begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 2 & -2 & 1 \end{bmatrix}\begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix} = \begin{bmatrix} -2 \\ 3 \\ -6 \end{bmatrix}.$$

Performing forward substitution yields

$$\begin{aligned} y_1 &= -2 && \text{gives } y_1 = -2, \\ y_1 + y_2 &= 3 && \text{gives } y_2 = 5, \\ 2y_1 - 2y_2 + y_3 &= -6 && \text{gives } y_3 = 8. \end{aligned}$$
Then solve the second system Ux = y for the unknown vector x, i.e.,

$$\begin{bmatrix} 1 & 2 & 4 \\ 0 & 1 & -1 \\ 0 & 0 & -8 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} -2 \\ 5 \\ 8 \end{bmatrix}.$$

Performing backward substitution yields

$$\begin{aligned} x_1 + 2x_2 + 4x_3 &= -2 && \text{gives } x_1 = -6 \\ x_2 - x_3 &= 5 && \text{gives } x_2 = 4 \\ -8x_3 &= 8 && \text{gives } x_3 = -1, \end{aligned}$$

which gives $x_1 = -6$, $x_2 = 4$, $x_3 = -1$, the solution of the given system. •

We can also write the MATLAB m-file called Doolittle.m to get the
solution of the linear system by LU decomposition by using Doolittle’s
method. In order to reproduce the above results using MATLAB com-
mands, we do the following:

>> A = [1 2 4; 1 3 3; 2 2 2];
>> b = [−2 3 − 6];
>> sol = Doolittle(A, b);
Program 1.11
MATLAB m-file for Doolittle's Method
function sol = Doolittle(A, b)
[n, n] = size(A); u = A; l = zeros(n,n);
for i = 1:n-1                              % eliminate to obtain U
    if abs(u(i,i)) > 0
        for i1 = i+1:n
            m(i1,i) = u(i1,i)/u(i,i);
            for j = 1:n; u(i1,j) = u(i1,j) - m(i1,i)*u(i,j); end
        end
    end
end
for i = 1:n; l(i,1) = A(i,1)/u(1,1); end   % first column of L
for j = 2:n
    for i = 2:n
        s = 0;
        for k = 1:j-1; s = s + l(i,k)*u(k,j); end
        l(i,j) = (A(i,j)-s)/u(j,j);
    end
end
y(1) = b(1)/l(1,1);                        % forward substitution: Ly = b
for k = 2:n
    sum = b(k);
    for i = 1:k-1; sum = sum - l(k,i)*y(i); end
    y(k) = sum/l(k,k);
end
x(n) = y(n)/u(n,n);                        % backward substitution: Ux = y
for k = n-1:-1:1
    sum = y(k);
    for i = k+1:n; sum = sum - u(k,i)*x(i); end
    x(k) = sum/u(k,k);
end
sol = x;
end

Procedure 1.5 (LU Decomposition by Doolittle’s Method)

1. Take the nonsingular matrix A.

2. If possible, decompose the matrix A = LU using (1.38).

3. Solve linear system Ly = b using (1.36).

4. Solve linear system U x = y using (1.37).

The LDV Factorization


There is some asymmetry in LU decomposition because the lower-triangular
matrix has 1s on its diagonal, while the upper-triangular matrix has a
nonunit diagonal. This is easily remedied by factoring the diagonal entries

$$\begin{bmatrix} u_{11} & u_{12} & \cdots & u_{1n} \\ 0 & u_{22} & \cdots & u_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & u_{nn} \end{bmatrix} = \begin{bmatrix} u_{11} & & & \\ & u_{22} & & \\ & & \ddots & \\ & & & u_{nn} \end{bmatrix}\begin{bmatrix} 1 & u_{12}/u_{11} & \cdots & u_{1n}/u_{11} \\ 0 & 1 & \cdots & u_{2n}/u_{22} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix}.$$

Let D denote the diagonal matrix having the same diagonal elements as
the upper-triangular matrix U ; in other words, D contains the pivots on
its diagonal and zeros everywhere else. Let V be the redefining upper-
triangular matrix obtained from the original upper-triangular matrix U by
dividing each row by its pivot, so that V has all 1s on the diagonal. It
is easily seen that U = DV , which allows any LU decomposition to be
written as
A = LDV,
where L and V are lower- and upper-triangular matrices with 1s on both
of their diagonals. This is called the LDV factorization of A.
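Given any LU decomposition, the D and V factors fall out in two lines of MATLAB; the sketch below uses the factors computed by hand in Example 1.39 below:

% Split U into D*V to obtain the LDV factorization (sketch).
L = [1 0 0; 3 1 0; 2 0 1];     % Doolittle factors of the Example 1.39 matrix
U = [1 2 -1; 0 -4 4; 0 0 3];
D = diag(diag(U));             % pivots on the diagonal
V = D \ U;                     % each row of U divided by its pivot
norm(L*D*V - [1 2 -1; 3 2 1; 2 4 1])   % 0, so A = L*D*V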

Example 1.39 Find the LDV factorization of the following matrix:


 
1 2 −1
A= 3 2 1 .
2 4 1

Solution. By using Doolittle's method, the LU decomposition of A can be obtained as

$$A = \begin{bmatrix} 1 & 2 & -1 \\ 3 & 2 & 1 \\ 2 & 4 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 3 & 1 & 0 \\ 2 & 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 2 & -1 \\ 0 & -4 & 4 \\ 0 & 0 & 3 \end{bmatrix} = LU.$$

Then the matrices D and V can be obtained as

$$U = \begin{bmatrix} 1 & 2 & -1 \\ 0 & -4 & 4 \\ 0 & 0 & 3 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & -4 & 0 \\ 0 & 0 & 3 \end{bmatrix}\begin{bmatrix} 1 & 2 & -1 \\ 0 & 1 & -1 \\ 0 & 0 & 1 \end{bmatrix} = DV.$$

Thus, the LDV factorization of the given matrix A is obtained as

$$A = LDV = \begin{bmatrix} 1 & 0 & 0 \\ 3 & 1 & 0 \\ 2 & 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 & 0 \\ 0 & -4 & 0 \\ 0 & 0 & 3 \end{bmatrix}\begin{bmatrix} 1 & 2 & -1 \\ 0 & 1 & -1 \\ 0 & 0 & 1 \end{bmatrix}.$$

If a given matrix A is symmetric, then there is a connection between the lower-triangular matrix L and the upper-triangular matrix U in the LU decomposition. In the first elimination step, the elements in L's first column are obtained by dividing U's first row by the diagonal element. Similarly, during the second elimination step, $l_{32} = u_{23}/u_{22}$. In general, when a symmetric matrix is decomposed without pivoting, $l_{ij}$ is related to $u_{ji}$ through the identity

$$l_{ij} = \frac{u_{ji}}{u_{jj}}.$$

In other words, each column of the matrix L equals the corresponding row of the matrix U divided by the diagonal element. Consequently, the LDV decomposition of a symmetric matrix has the form $LDL^T$ and is uniquely determined: writing A = LDV and taking the transpose, we get

$$A^T = (LDV)^T = V^TD^TL^T = V^TDL^T$$

(the diagonal matrix D is symmetric); since $A^T = A$, the uniqueness of the LDV decomposition implies that

$$L = V^T \quad \text{and} \quad V = L^T.$$

Note that not every symmetric matrix has an LDLT factorization. How-
ever, if A = LDLT , then A must be symmetric because

(A)T = (LDLT )T = (LT )T DT LT = LDLT = A.

Example 1.40 Find the LDLT factorization of the following symmetric


matrix:  
1 3 2
A =  3 4 1 .
2 1 2
Solution. By using Doolittle's method, the LU decomposition of A can be obtained as

$$A = \begin{bmatrix} 1 & 3 & 2 \\ 3 & 4 & 1 \\ 2 & 1 & 2 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 3 & 1 & 0 \\ 2 & 1 & 1 \end{bmatrix}\begin{bmatrix} 1 & 3 & 2 \\ 0 & -5 & -5 \\ 0 & 0 & 3 \end{bmatrix} = LU.$$

Then the matrices D and V can be obtained as

$$U = \begin{bmatrix} 1 & 3 & 2 \\ 0 & -5 & -5 \\ 0 & 0 & 3 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & -5 & 0 \\ 0 & 0 & 3 \end{bmatrix}\begin{bmatrix} 1 & 3 & 2 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix} = DV.$$

Note that

$$V = \begin{bmatrix} 1 & 3 & 2 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix} = L^T, \quad \text{where } L = \begin{bmatrix} 1 & 0 & 0 \\ 3 & 1 & 0 \\ 2 & 1 & 1 \end{bmatrix}.$$

Thus, we obtain

$$A = LDL^T = \begin{bmatrix} 1 & 0 & 0 \\ 3 & 1 & 0 \\ 2 & 1 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 & 0 \\ 0 & -5 & 0 \\ 0 & 0 & 3 \end{bmatrix}\begin{bmatrix} 1 & 3 & 2 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix},$$
the LDLT factorization of the given matrix A. •
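The factors above can be checked numerically in a few lines; a sketch using the matrices of Example 1.40:

% Verify the LDL' factorization of Example 1.40 (sketch).
A = [1 3 2; 3 4 1; 2 1 2];
L = [1 0 0; 3 1 0; 2 1 1];
D = diag([1 -5 3]);
norm(A - L*D*L')               % 0, so A = L*D*L'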

Crout’s Method
Crout’s method, in which matrix U has unity on the main diagonal, is
similar to Doolittle’s method in all other aspects. The L and U matrices
are obtained by expanding the matrix equation A = LU term by term to
determine the elements of the L and U matrices.

Example 1.41 Construct the LU decomposition of the following matrix


using Crout’s method:  
1 2 3
A =  6 5 4 .
2 5 6
Solution. Since

$$A = LU = \begin{bmatrix} l_{11} & 0 & 0 \\ l_{21} & l_{22} & 0 \\ l_{31} & l_{32} & l_{33} \end{bmatrix}\begin{bmatrix} 1 & u_{12} & u_{13} \\ 0 & 1 & u_{23} \\ 0 & 0 & 1 \end{bmatrix},$$

performing the multiplication on the right-hand side gives

$$\begin{bmatrix} 1 & 2 & 3 \\ 6 & 5 & 4 \\ 2 & 5 & 6 \end{bmatrix} = \begin{bmatrix} l_{11} & l_{11}u_{12} & l_{11}u_{13} \\ l_{21} & l_{21}u_{12}+l_{22} & l_{21}u_{13}+l_{22}u_{23} \\ l_{31} & l_{31}u_{12}+l_{32} & l_{31}u_{13}+l_{32}u_{23}+l_{33} \end{bmatrix}.$$

Then equate elements of the first column to obtain

$$1 = l_{11}, \quad 6 = l_{21}, \quad 2 = l_{31}.$$

Next, equate elements of the second column to obtain

$$\begin{aligned} 2 &= l_{11}u_{12}, & u_{12} &= 2 \\ 5 &= l_{21}u_{12} + l_{22}, & l_{22} &= 5 - 12 = -7 \\ 5 &= l_{31}u_{12} + l_{32}, & l_{32} &= 5 - 4 = 1. \end{aligned}$$

Finally, equate elements of the third column to obtain

$$\begin{aligned} 3 &= l_{11}u_{13}, & u_{13} &= 3 \\ 4 &= l_{21}u_{13} + l_{22}u_{23}, & u_{23} &= (4-18)/(-7) = 2 \\ 6 &= l_{31}u_{13} + l_{32}u_{23} + l_{33}, & l_{33} &= 6 - 6 - 2 = -2. \end{aligned}$$

Thus, we get

$$\begin{bmatrix} 1 & 2 & 3 \\ 6 & 5 & 4 \\ 2 & 5 & 6 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 6 & -7 & 0 \\ 2 & 1 & -2 \end{bmatrix}\begin{bmatrix} 1 & 2 & 3 \\ 0 & 1 & 2 \\ 0 & 0 & 1 \end{bmatrix},$$

the factorization of the given matrix. •
Matrices and Linear Systems 127

The general formula for getting elements of L and U corresponding to


the coefficient matrix A for a set of n linear equations can be written as

j−1

X 
lij = aij − lik ukj , i ≥ j, i = 1, 2, . . . , n 



k=1






i−1


1 X 

uij = [aij − lik ukj ], i < j, j = 2, 3, . . . , n

lii . (1.39)
k=1 




lij = ai1 , j=1







aij 

uij = , i=1


a11

Example 1.42 Solve the following linear system by LU decomposition us-


ing Crout’s method:
   
1 2 3 1
A= 6 5 4  and b =  −1  .
2 5 6 5

Solution. The factorization of the coefficient matrix A has already been


constructed in Example (1.41) as
    
1 2 3 1 0 0 1 2 3
 6 5 4  =  6 −7 0  0 1 2 .
2 5 6 2 1 −2 0 0 1

Then solve the first system Ly = b for unknown vector y, i.e.,


    
1 0 0 y1 1
 6 −7 0   y2  =  −1  .
2 1 −2 y3 5
128 Applied Linear Algebra and Optimization using MATLAB

Performing forward substitution yields

y1 = 1 gives y1 = 1
6y1 − 7y2 = −1 gives y2 = 1
2y1 + y2 − 2y3 = 5 gives y3 = −1.

Then solve the second system U x = y for unknown vector x, i.e.,

    
1 2 3 x1 1
 0 1 2   x2  =  1  .
0 0 1 x3 −1

Performing backward substitution yields

x1 + 2x2 + 3x3 = 1 gives x1 = −2


x2 + 2x3 = 1 gives x2 = 3
x3 = −1 gives x3 = −1,

which gives the approximate solution x∗ = [−2, 3, −1]T . •

The above results can be reproduced by using MATLAB commands as


follows:

>> A = [1 2 3; 6 5 4; 2 5 6];
>> b = [1 − 1 5];
>> sol = Crout(A, b);
Matrices and Linear Systems 129

Program 1.12
MATLAB m-file for the Crout’s Method
function sol = Crout(A, b)
[n,n]=size(A); u=zeros(n,n); l=u;
for i=1:n; u(i,i)=1; end
l(1,1)=A(1,1);
for i=2:n
u(1,i)=A(1,i)/l(1,1); l(i,1)=A(i,1); end
for i=2:n; for j=2:n; s=0;
if i <= j; K=i-1;
else; K=j-1; end
for k=1:K; s = s + l(i, k) ∗ u(k, j); end
if j > i; u(i,j)=(A(i,j)-s)/l(i,i); else
l(i,j)=A(i,j)-s; end;end;end
y(1)=b(1)/l(1,1);
for k=2:n; sum=b(k);
for i=1:k-1; sum = sum − l(k, i) ∗ y(i); end
y(k)=sum/l(k,k); end
x(n)=y(n)/u(n,n);
for k=n-1:-1:1; sum=y(k);
for i=k+1:n; sum = sum − u(k, i) ∗ x(i); end
x(k)=sum/u(k,k); end; l; u; y; x;

Procedure 1.6 (LU Decomposition by Crout’s Method)

1. Take the nonsingular matrix A.

2. If possible, decompose the matrix A = LU using (1.39).

3. Solve linear system Ly = b using (1.36).

4. Solve linear system U x = y using (1.37).

Note that the factorization method is also used to invert matrices.


Their usefulness for this purpose is based on the fact that triangular ma-
trices are easily inverted. Once the factorization has been affected, the
130 Applied Linear Algebra and Optimization using MATLAB

inverse of a matrix A is found from the formula


A−1 = (LU )−1 = U −1 L−1 . (1.40)
Then
U A−1 = L−1 , where LL−1 = I.
A practical way of calculating the determinant is to use the forward
elimination process of Gaussian elimination or, alternatively, LU decom-
position. If no pivoting is used, calculation of the determinant using LU
decomposition is very easy, since by one of the properties of the determi-
nant
det(A) = det(LU ) = det(L) det(U ).
So when using LU decomposition by Doolittle’s method,
n
Y
det(A) = det(U ) = uii = (u11 u22 · · · unn ),
i=1

where det(L) = 1 because L is a lower-triangular matrix and all its diagonal


elements are unity. For LU decomposition by Crout’s method,
n
Y
det(A) = det(L) = lii = (l11 l22 · · · lnn ),
i=1

where det(U ) = 1 because U is an upper-triangular matrix and all its di-


agonal elements are unity.

Example 1.43 Find the determinant and inverse of the following matrix
using LU decomposition by Doolittle’s method:
 
1 −2 1
A =  1 −1 1  .
1 1 2
Solution. We know that
    
1 −2 1 1 0 0 u11 u12 u13
A =  1 −1 1  =  m21 1 0   0 u22 u23  = LU.
1 1 2 m31 m32 1 0 0 u33
Matrices and Linear Systems 131

Now we will use only the forward elimination step of the simple Gaussian
elimination method to convert the given matrix A into the upper-triangular
matrix U . Since a11 = 1 6= 0, we wish to eliminate the elements a21 = 1
and a31 = 1 by subtracting from the second and third rows the appropriate
multiples of the first row. In this case, the multiples are given as

m21 = 1, and m31 = 1.

Hence,  
1 −2 1
 0 1 0 .
0 3 1
(1) (1)
Since a22 = 1 6= 0, we eliminate the entry in the a32 = 3 position by
subtracting the multiple m32 = 3 of the second row from the third row to
get  
1 −2 1
 0 1 0 .
0 0 1
Obviously, the original set of equations has been transformed to an upper-
triangular form. Thus,
    
1 −2 1 1 0 0 1 −2 1
 1 −1 1  =  1 1 0  0 1 0 ,
1 1 2 1 3 1 0 0 1

which is the required decomposition of A.


Now we find the determinant of matrix A as

det(A) = det(U ) = u11 u22 u33 = (1)(1)(1) = 1.

To find the inverse of matrix A, first we will compute the inverse of the
lower-triangular matrix L−1 from
  0   
1 0 0 l11 0 0 1 0 0
LL−1 =  1 1 0   l21 0 0
l22 0 = 0 1 0 =I
0 0 0
1 3 1 l31 l32 l33 0 0 1
132 Applied Linear Algebra and Optimization using MATLAB

by using forward substitution.


To solve the first system
  0   
1 0 0 l11 1
0 
 1 1 0   l21 =  0 ,
0
1 3 1 l31 0

by using forward substitution, we get


0 0 0
l11 = 1, l21 = −1, l31 = 2.

Similarly, the solution of the second linear system


    
1 0 0 0 0
 1 0 
1 0   l22 =  1 
0
1 3 1 l32 0

can be obtained
0 0
l22 = 1, l32 = −3.
Finally, the solution of the third linear system
    
1 0 0 0 0
 1 1 0  0  =  0 
0
1 3 1 l33 1
0
gives l33 = 1.
Hence, the elements of the matrix L−1 are
 
1 0 0
L−1 =  −1 1 0 ,
2 −3 1

which is the required inverse of the lower-triangular matrix L.


To find the inverse of the given matrix A, we will solve the system
 0
a11 a012 a013
   
1 −2 1 1 0 0
U A−1 =  0 1 0   a021 a022 a023  =  −1 1 0  = L−1
0 0 0
0 0 1 a31 a32 a33 2 −3 1
Matrices and Linear Systems 133

by using backward substitution.


We solve the first system
  0   
1 −2 1 a11 1
 0 1 0   a021  =  −1 
0 0 1 a031 2

by using backward substitution, and we get

a011 = −3, a021 = −1, a031 = 2.

Similarly, the solution of the second linear system


  0   
1 −2 1 a12 0
 0 1 0   a022  =  1 
0 0 1 a032 −3

can be obtained as follows:

a012 = 5, a022 = 1, a032 = −3.

Finally, the solution of the third linear system


  0   
1 −2 1 a13 0
 0 1 0   a023  =  0 
0 0 1 a033 1

can be obtained as follows:

a013 = −1, a023 = 0, a033 = 1.

Hence, the elements of the inverse matrix A−1 are


 
−3 5 −1
A−1 =  −1 1 0 ,
2 −3 1

which is the required inverse of the given matrix A. •


134 Applied Linear Algebra and Optimization using MATLAB

For LU decomposition we have not used pivoting for the sake of sim-
plicity. However, pivoting is important for the same reason as in Gaussian
elimination. We know that pivoting in Gaussian elimination is equivalent
to interchanging the rows of the coefficients matrix together with the terms
on the right-hand side. This indicates that pivoting may be applied to LU
decomposition as long as the interchanging is applied to the left and right
terms in the same way. When performing pivoting in LU decomposition,
the changes in the order of the rows are recorded. The same reordering is
then applied to the right-hand side terms before starting the solution in
accordance with the forward elimination and backward substitution steps.

Indirect LU Decomposition

It is to be noted that a nonsingular matrix A sometimes cannot be directly


factored as A = LU . For example, the matrix in Example 1.24 is nonsin-
gular, but it cannot be factored into the product LU . Let us assume it has
a LU form and
   
2 2 −4 u11 u12 u13
 2 2 −1  =  l21 u11 l21 u12 + u22 l21 u13 + u23 .
3 2 −3 l31 u11 l31 u12 + l32 u22 l31 u13 + l32 u23 + u33

Then equate elements of the first column to obtain

2 = u11 gives u11 = 2


2 = l21 u11 gives l21 = 1
3 = l31 u11 gives l31 = 23 .

Then equate elements of the second column to obtain

2 = u12 gives u12 = 2


2 = l21 u12 + u22 gives u22 = 0
2 = l31 u12 + l32 u22 gives 0 = −1,

which is not possible because 0 6= −1, a contradiction. Hence, the matrix


A cannot be directly factored into the product of L and U . The indirect
factorization LU of A can be obtained by using the permutation matrix P
Matrices and Linear Systems 135

and replacing the matrix A by P A. For example, using the above matrix
A, we have
    
1 0 0 2 2 −4 2 2 −4
P A =  0 0 1   2 2 −1  =  3 2 −3  .
0 1 0 3 2 −3 2 2 −1

From this multiplication we see that rows 2 and 3 of the original matrix A
are interchanged, and the resulting matrix P A has a LU factorization and
we have
    
1 0 0 2 2 −4 2 2 −4
 1.5 1 0   0 −1 3  =  3 2 −3  .
1 0 1 0 0 3 2 2 −1

The following theorem is an extension of Theorem 1.21, which includes the


case when interchanged rows are required. Thus, LU factorization can be
used to find the solution to any linear system Ax = b with a nonsingular
matrix A.

Theorem 1.23 Let A be a square n × n matrix and assume that Gaussian


elimination can be performed successfully to solve the linear system Ax =
b, but that row interchanges are required. Then there exists a permutation
matrix P = pk , . . . , p2 , p1 (where p1 , p2 , . . . , pk are the elementary matrices
corresponding to the row interchanges used) so that the P A matrix has a
LU factorization, i.e.,
P A = LU, (1.41)
where P A is the matrix obtained from A by doing these interchanges to A.
Note that P = In if no interchanges are used. •

When pivoting is used in LU decomposition, its effects should be taken into


consideration. First, we recognize that LU decomposition with pivoting is
equivalent to performing two separate process:

1. Transform A to A0 by performing all shifting of rows.

2. Then decompose A0 to LU with no pivoting.


136 Applied Linear Algebra and Optimization using MATLAB

The former step can be expressed by

A0 = P A, equivalently A = P −1 A0 ,

where P is called a permutation matrix and represents the pivoting oper-


ation. The second process is

A0 = P A = LU

and so
A = P −1 LU = (P T L)U
since P −1 = P T . The determinant of A may now be written as

det(A) = det(P −1 ) det(L) det(U )

or
det(A) = β det(L) det(U ),
where β = det(P −1 ) equals −1 or +1 depending on whether the number
pivoting is odd or even, respectively. •

One can use the MATLAB built-in lu function to obtain the permuta-
tion matrix P so that the P A matrix has a LU decomposition:

>> A = [0 1 2; −1 4 2; 2 2 1];
>> [L, U, P ] = lu(A);
It will give us the permutation matrix P and the matrices L and U as
follows:  
0 0 1
P =  0 1 0 
1 0 0
and   
1 0 0 2 2 1
P A =  −0.5 1 0   0 5 2.5  = LU.
0 0.2 1 0 0 1.5
So
A = P −1 LU
Matrices and Linear Systems 137

or   
0 0.2 1 2 2 1
A = (P T L)U =  −0.5 1 0   0 5 2.5  .
1 0 0 0 0 1.5

Example 1.44 Consider the following matrix:


 
0 3 2
A =  3 2 5 ,
6 2 4

then:
1. Show that A does not have LU factorization;

2. Use Gauss elimination by partial pivoting and find the permutation


matrix P as well as the LU factors such that P A = LU ;

3. Use the information in P, L, and U to solve the system Ax = [6, 4, 3]T .


Solution. (1) I using simple Gauss elimination, since a11 = 0, from
Theorem 1.21, the LU decomposition of A is not possible.
(2) For applying Gauss elimination by partial pivoting, the interchanges
of the rows between row 1 and row 3 gives
 
6 2 4
 3 2 5 ,
0 3 3

and then using multiple m21 = 36 = 1


2
,
we obtain
 
6 2 4
 0 1 3 .
0 3 3

Now interchanging row 2 and row 3 gives


 
6 2 4
 0 3 3 .
0 1 3
138 Applied Linear Algebra and Optimization using MATLAB

By using multiple m32 = 13 , we get


 
6 2 4
 0 3 2 .
0 0 2

Note that during this elimination process two row interchanges were needed,
which means we got two elementary permutation matrices of the inter-
changes (from Theorem 1.23), which are
   
0 0 1 1 0 0
p1 =  0 1 0  and p2 =  0 0 1  .
1 0 0 0 1 0

Thus, the permutation matrix is


    
1 0 0 0 0 1 0 1 0
P = p2 p 1 =  0 0 1   0 1 0  =  0 0 1  .
0 1 0 1 0 0 1 0 0

If we do these interchanges to the given matrix A, the result is the matrix


P A, i.e.,
    
0 1 0 0 3 2 3 2 5
PA =  0 0 1  3 2 5  =  6 2 4 .
1 0 0 6 2 4 0 3 3

Now apply LU decomposition to the matrix P A, and we will convert it to


the upper-triangular matrix U by using the possible multiples

3
m21 = 2, m31 = 0, m32 = −
2
as follows:
     
3 2 5 3 2 5 3 2 5
 6 2 4  →  6 −2 −6  →  0 −2 −6  = U.
0 3 3 0 3 3 0 0 −6
Matrices and Linear Systems 139

Thus, P A = LU , where
   
1 0 0 3 2 5
L= 2 1 0  and U =  0 −2 −6  .
0 −3/2 1 0 0 −6

(3) Solve the first system Ly = P b = [4, 3, 6]T for unknown vector y, i.e.,
    
1 0 0 y1 4
 2 1 0   y2  =  3  .
0 −3/2 1 y3 6

Performing forward substitution yields

y1 = 4 gives y1 = 4
2y1 + y2 = 3 gives y2 = −5
− 3/2y2 + y3 = 6 gives y3 = −1.5.

Then solve the second system U x = y for the unknown vector x, i.e.,
    
3 2 5 x1 4
 0 −2 −6   x2  =  −5  .
0 0 −6 x3 −1.5

Performing backward substitution yields

3x1 + 2x2 + 5x3 = 4 gives x1 = −0.25


− 2x2 − 6x3 = −5 gives x2 = 1.75
− 6x3 = −1.5 gives x3 = 0.25,

which gives the approximate solution x∗ = [−0.25, 1.75, 0.25]T . •

The major advantage of the LU decomposition methods is the efficiency


when multiple unknown b vectors must be considered. The number of mul-
tiplications and divisions required by the complete Gaussian elimination
3
method is N = ( n3 ) + n2 − ( n3 ). The forward substitution step required
to solve the system Ly = b requires N = n2 − ( n2 ) operations, and the
backward substitution step required to solve the system U x = y requires
N = n2 + ( n2 ) operations. Thus, the total number of multiplications and
140 Applied Linear Algebra and Optimization using MATLAB

divisions required by LU decomposition, after L and U matrices have been


determined, is N = 2n2 , which is much less work than required by the
Gaussian elimination method, especially for large systems. •

In the analysis of many physical systems, sets of linear equations arise


that have coefficient matrices that are both symmetric and positive-definite.
Now we factorize such a matrix A into the product of lower-triangular and
upper-triangular matrices which have these two properties. Before we do
the factorization, we define the following matrix.
Definition 1.31 (Positive-Definite Matrix)

The function
 
a11 a12 · · · a1n  
x 1
 a21 a22 · · · a2n   x2 

xT Ax = x1 x2 · · · xn 
 a31 a32 · · · a3n  
 . 

 .. .. .. ..   .. 
 . . . . 
xn
an1 an2 · · · ann
or n X
n
X
xT Ax = aij xi xj (1.42)
i=1 j=1
can be used to represent any quadratic polynomial in the variables x1 , x2 , . . . , xn
and is called a quadratic form. A matrix is said to be positive-definite if
its quadratic form is positive for all real nonzero vectors x, i.e.,
xT Ax > 0, for every n-dimensional column vector x 6= 0.
Example 1.45 The matrix
 
4 −1 0
A =  −1 4 −1 
0 −1 4
is positive-definite and suppose x is any nonzero three-dimensional column
vector, then
  
4 −1 0 x1
xT Ax = x1 x2 x3  −1

4 −1   x2 
0 −1 4 x3
Matrices and Linear Systems 141

or  
 4x1 − x2
= x1 x2 x3  −x1 + 4x2 − x3  .
− x2 + 4x3
Thus,
xT Ax = 4x21 − 2x1 x2 + 4x22 − 2x2 x3 + 4x23 .
After rearranging the terms, we have

xT Ax = 3x21 + (x1 − x2 )2 + 2x22 + (x2 − x3 )2 + 3x23 .

Hence,
3x21 + (x1 − x2 )2 + 2x22 + (x2 − x3 )2 + 3x23 > 0,
unless x1 = x2 = x3 = 0. •

Symmetric positive-definite matrices occur frequently in equations de-


rived by minimization or energy principles, and their properties can often
be utilized in numerical processes.

Theorem 1.24 If A is a positive-definite matrix, then:


1. A is nonsingular.

2. aii > 0, for each i = 1, 2, . . . , n.

Theorem 1.25 The symmetric matrix A is a positive-definite matrix, if


and only if Gaussian elimination without row interchange can be performed
on the linear system Ax = b, with all pivot elements positive. •

Theorem 1.26 A matrix A is positive-definite if the determinant of the


principal minors of A are positive.

The principal minors of a matrix A are the square submatrices lying in the
upper-left hand corner of A. An n × n matrix A has n of these principal
minors. For example, for the matrix
 
6 2 1
A =  2 3 2 ,
1 1 2
142 Applied Linear Algebra and Optimization using MATLAB

the determinant of its principal minors are

det(6) = 6 > 0,
 
6 2
det = 18 − 4 = 14 > 0,
2 3
 
6 2 1
det  2 3 2  = 19 > 0.
1 1 2

Thus, the matrix A is positive-definite. •

Theorem 1.27 If a symmetric matrix A is diagonally dominant, then it


must be positive-definite. •

For example, for the diagonally dominant matrix


 
4 −1 2
A =  −1 3 0 ,
2 0 5

the determinant of its principal minors are

det(4) = 4 > 0,
 
4 −1
det = 12 − 1 = 11 > 0,
−1 3
 
4 −1 2
det  −1 3 0  = 43 > 0.
2 0 5

Hence, (using Theorem 1.26) matrix A is positive-definite. •

Theorem 1.28 If a matrix A is nonsingular, then AT A is always positive-


definite. •
Matrices and Linear Systems 143

For example, for the matrix


 
1 1 1
A =  1 1 2 ,
1 2 1

we can have
 
3 4 4
AT A =  4 6 5  .
4 5 6

Then the determinant of its principal minors are

det(3) = 3 > 0,
 
3 4
det = 18 − 16 = 2 > 0,
4 6
 
3 4 4
det  4 6 5  = 1 > 0.
4 5 6

Thus, matrix A is positive-definite. •

Cholesky Method

The Cholesky method (or square root method) is of the same form as
Doolittle’s method and Crout’s method except it is limited to equations
involving symmetrical coefficient matrices. In the case of a symmetric
and positive-definite matrix A it is possible to construct an alternative
triangular factorization with a saved number of calculations compared with
previous factorizations. Here, we decompose the matrix A into the product
of LLT , i.e.,
A = LLT , (1.43)
144 Applied Linear Algebra and Optimization using MATLAB

where L is the lower-triangular matrix and LT is its transpose. The ele-


ments of L are computed by equating successive columns in the relation
    
a11 a12 · · · a1n l11 0 ··· 0 l11 l21 · · · ln1
 a21 a22 · · · a2n   l21 l22 · · · 0   0 l22 · · · ln2 
 
= ..  .
  
 .. .. .. .. . .. .. .. . .. ..
.   .. .   ..
   
 . . . . . . . . 
an1 an2 · · · ann ln1 ln2 · · · lnn 0 0 · · · lnn

After constructing the matrices L and LT , the solution of the system Ax =


b can be computed in the following two steps:
1. Solve Ly = b, for y.
(using forward substitution)

2. Solve LT x = y, for x.
(using backward substitution)
In this procedure, it is necessary to take the square root of the elements
on the main diagonal of the coefficient matrix. However, for a positive-
definite matrix the terms on its main diagonal are positive, so no difficulty
will arise when taking the square root of these terms.
Example 1.46 Construct the LU decomposition of the following matrix
using the Cholesky method:
 
1 1 2
A =  1 2 4 .
2 4 9

Solution. Since
  
l11 0 0 l11 l21 l31
A = LLT =  l21 l22 0   0 l22 l32  ,
l31 l32 l33 0 0 l33

performing the multiplication on the right-hand side gives


   2 
1 1 2 l11 l11 l21 l11 l31
 1 2 4  =  l11 l21 2 2
l21 + l22 l21 l31 + l22 l32  .
2 2 2
2 4 9 l11 l31 l31 l21 + l22 l32 l31 + l32 + l33
Matrices and Linear Systems 145

Then equate elements of the first column to obtain

2

1 = l11 gives l11 = 1=1

1 = l11 l21 gives l21 = 1

2 = l11 l31 gives l31 = 2.


Note that l11 could be − 1 and so the matrix L is not (quite) unique.
Now equate elements of the second column to obtain

2 2
2 = l21 + l22 gives l22 = 1
4 = l31 l21 + l32 l22 gives l32 = 2.

Finally, equate elements of the third column to obtain

2 2 2
9 = l31 + l32 + l33 gives l33 = 1.

Thus, we obtain

    
1 1 2 1 0 0 1 1 2
 1 2 4  =  1 1 0  0 1 2 ,
2 4 9 2 2 1 0 0 1

the factorization of the given matrix. •


146 Applied Linear Algebra and Optimization using MATLAB

For a general n × n matrix, the elements of the lower-triangular matrix L


are constructed from
√ 
l11 = a11 



aj1



lj1 = , j>1 

l11 





v ! 

u i−1
X



u
2
lii = aii − lik , 1<i<n 
t 



k=1
. (1.44)
" # 

i−1 
1 X 

lji = aji − ljk lik , j>i>1 


lii k=1








v ! 
n−1
u 

u X 
2

lnn = t ann − lnk 



k=1

The method fails if ljj = 0 and the expression inside the square root is
negative, in which case all of the elements in column j are purely imaginary.
There is, however, a special class of matrices for which these problems don’t
occur.
The Cholesky method provides a convenient method for investigat-
ing the positive-definiteness of symmetric matrices. The formal definition
xT Ax > 0, for all x 6= 0, is not easy to verify in practice. However, it is
relatively straightforward to attempt the construct of a Cholesky decom-
position of a symmetric matrix.

Theorem 1.29 A matrix A is positive-definite, if and only if A can be


factored in the form A = LLT , where L is a lower-triangular matrix with
nonzero diagonal entries. •

Example 1.47 Show that the following matrix is positive-definite by using


the Cholesky method:  
9 3 6
A =  3 10 8  .
6 8 9
Matrices and Linear Systems 147

Solution. Since
  
l11 0 0 l11 l21 l31
A = LLT =  l21 l22 0   0 l22 l32  ,
l31 l32 l33 0 0 l33
performing the multiplication on the right-hand side gives
   2 
9 3 6 l11 l11 l21 l11 l31
 3 10 8  =  l11 l21 l21 2 2
+ l22 l21 l31 + l22 l32  .
2 2 2
6 8 9 l11 l31 l31 l21 + l22 l32 l31 + l32 + l33
Then equate elements of the first column to obtain
2

9 = l11 gives l11 = 9=3

3 = l11 l21 gives l21 = 1

6 = l11 l31 gives l31 = 2.


Now equate elements of the second column to obtain
2 2
10 = l21 + l22 gives l22 = 3
8 = l31 l21 + l32 l22 gives l32 = 2.
Finally, equate elements of the third column to obtain
2 2 2
9 = l31 + l32 + l33 gives l33 = 1.
Thus, the factorization obtained as
    
9 3 6 3 0 0 3 1 2
A =  3 10 8  =  1 3 0   0 3 2  = LLT ,
6 8 9 2 2 1 0 0 1
and it shows that the given matrix is positive-definite. •
If the symmetric coefficient matrix is not positive-definite, then the
terms on the main diagonal can be zero or negative. For example, the
symmetric coefficient matrix
 
1 1 2
A= 1 2 4 
2 4 8
148 Applied Linear Algebra and Optimization using MATLAB

is not positive-definite because the Cholesky decomposition of the matrix


has the form
    
1 1 2 1 0 0 1 1 2
A= 1 2 4 = 1 1 0   0 1 2  = LLT ,
2 4 8 2 2 0 0 0 0

which shows that one of the diagonal elements of L and LT is zero. •


Example 1.48 Solve the following linear system by LU decomposition us-
ing the Cholesky method:
   
1 1 2 2
A=  1 2 4  and b =  1 .
2 4 9 1

Solution. The factorization of the coefficient matrix A has already been


constructed in Example (1.46) as
    
1 1 2 1 0 0 1 1 2
 1 2 4 = 1 1 0  0 1 2 .
2 4 9 2 2 1 0 0 1

Then solve the first system Ly = b for unknown vector y, i.e.,


    
1 0 0 y1 2
 1 1 0   y2  =  1  .
2 2 1 y3 1

Performing forward substitution yields

y1 = 2, y1 = 2
y1 + y2 = 1, y2 = −1
2y1 + 2y2 + y3 = 1, y3 = −1.

Then solve the second system LT x = y for unknown vector x, i.e.,


    
1 1 2 x1 2
 0 1 2   x2  =  −1  .
0 0 1 x3 −1
Matrices and Linear Systems 149

Performing backward substitution yields

x1 + x2 + 2x3 = 2 gives x1 = 3
x2 + 2x3 = −1 gives x2 = 1
x3 = −1 gives x3 = −1,

which gives the approximate solution x∗ = [3, 1, −1]T . •

Now use the following MATLAB commands to obtain the above results:

>> A = [1 1 2; 1 2 4; 2 4 9];
>> b = [2 1 1];
>> sol = Cholesky(A, b);

Procedure 1.7 (LU Decomposition by the Cholesky Method)

1. Take the positive-definite matrix A.

2. If possible, decompose the matrix A = LLT using (1.44).

3. Solve linear system Ly = b using (1.36).

4. Solve linear system LT x = y using (1.37).

Example 1.49 Find the bounds on α for which the Cholesky factorization
of the following matrix with real elements
 
1 2 α
A =  2 8 2α 
α 2α 9

is possible.

Solution. Since
  
l11 0 0 l11 l21 l31
A = LLT =  l21 l22 0   0 l22 l32  ,
l31 l32 l33 0 0 l33
150 Applied Linear Algebra and Optimization using MATLAB

performing the multiplication on the right-hand side gives

   2 
1 2 α l11 l11 l21 l11 l31
2
 2 8 2α  =  l11 l21 l21 2
+ l22 l21 l31 + l22 l32  .
2 2 2
α 2α 9 l11 l31 l31 l21 + l22 l32 l31 + l32 + l33

Then equate elements of the first column to obtain

2

1 = l11 gives l11 = 1=1

2 = l11 l21 gives l21 = 2

α = l11 l31 gives l31 = α.


Note that l11 could be − 1 and so matrix L is not (quite) unique.
Now equate elements of the second column to obtain

2 2
8 = l21 + l22 gives l22 = 2
2α = l31 l21 + l32 l22 gives l32 = 0.

Finally, equate elements of the third column to obtain

2 2 2

9 = l31 + l32 + l33 gives l33 = 9 − α2 ,

which shows that the allowable values of α must satisfy 9 − α2 > 0.


Thus, α is bounded by −3 < α < 3. •
Matrices and Linear Systems 151

Program 1.13
MATLAB m-file for the Cholesky Method
function sol = Cholesky(A, b)
[n,n]=size(A); l=zeros(n,n); u=l;
l(1, 1) = (A(1, 1)) \ 0.5; u(1,1)=l(1,1);
for i=2:n; u(1,i)=A(1,i)/l(1,1);
l(i,1)=A(i,1)/u(1,1); end
for i=2:n; for j=2:n; s=0;
if i <= j; K=i-1; else; K=j-1; end
for k=1:K; s = s + l(i, k) ∗ u(k, j); end
if j > i; u(i,j)=(A(i,j)-s)/l(i,i);
elseif i == j
l(i, j) = (A(i, j) − s) \ 0.5; u(i,j)=l(i,j);
else; l(i,j)=(A(i,j)-s)/u(j,j); end; end; end
y(1)=b(1)/l(1,1);
for k=2:n; sum=b(k);
for i=1:k-1; sum = sum − l(k, i) ∗ y(i); end
y(k)=sum/l(k,k); end
x(n)=y(n)/u(n,n);
for k=n-1:-1:1; sum=y(k);
for i=k+1:n; sum = sum − u(k, i) ∗ x(i); end
x(k)=sum/u(k,k); end; l; u; y; x;

Example 1.50 Find the LU decomposition of the following matrix using


Doolittle’s, Crout’s, and the Cholesky methods:
 
4 −2 4
A =  −2 2 2 .
4 2 29

Solution. By using the simple Gauss elimination method, one can convert
the given matrix into the upper-triangular matrix
 
4 −2 4
U = 0 1 4 
0 0 9
152 Applied Linear Algebra and Optimization using MATLAB

with the help of the possible multiples

m21 = −0.5, m31 = 1, m32 = 4.

Thus, the LU decomposition of A using Doolittle’s method is


  
1 0 0 4 −2 4
A = LU =  −0.5 1 0   0 1 4 .
1 4 1 0 0 9
Rather than computing the next two factorizations directly, we can obtain
them from Doolittle’s factorization above. From Doolittle’s factorization
the LDV factorization of the given matrix A can be obtained as
   
1 0 0 4 0 0 1 −0.5 1
A = LDV =  −0.5 1 0   0 1 0   0 1 4 .
1 4 1 0 0 9 0 0 1

By putting L̂ = LD, i.e.,


    
1 0 0 4 0 0 4 0 0
L̂ = LD =  −0.5 1 0   0 1 0  =  −2 1 0  ,
1 4 1 0 0 9 4 4 9
we can obtain Crout’s factorization as follows:
  
4 0 0 1 −0.5 1
A = LDV = L̂V =  −2 1 0   0 1 4 .
4 4 9 0 0 1
Similarly, the Cholesky factorization is obtained by splitting diagonal ma-
1 1
trix D into the form D 2 D 2 in the LDV factorization and associating one
factor with L and the other with V . Thus,
1 1
A = LDV = (LD 2 )(D 2 V ) = L̂L̂T ,

where
    
1 0 0 2 0 0 2 0 0
L̂ = LD1/2 =  −0.5 1 0   0 1 0  =  −1 1 0 
1 4 1 0 0 3 2 4 3
Matrices and Linear Systems 153

and
    
2 0 0 1 −0.5 1 2 −1 2
L̂T = D1/2 V =  0 1 0   0 1 4 = 0 1 4 .
0 0 3 0 0 1 0 0 3

Thus, we obtain
  
2 0 0 2 −1 2
A = L̂L̂T =  −1 1 0   0 1 4 ,
2 4 3 0 0 3

the Cholesky factorization of the given matrix A. •

The factorization of primary interest is A = LU , where L is a unit lower-


triangular matrix and U is an upper-triangular matric. Henceforth, when
we refer to a LU decomposition, we mean one in which L is a unit lower-
triangular matrix.

Example 1.51 Show that the following matrix cannot be factored as A =


LDLT :  
1 2 1
A=  2 5 3 .
1 3 2
Solution. By using the simple Gauss elimination method we can use the
multipliers
m21 = 2, m31 = 1, m32 = 1,
and we can convert the given matrix into an upper-triangular matrix as
follows:    
1 2 1 1 2 1
 0 1 1  →  0 1 1  = U.
0 1 1 0 0 0
(2)
Since the element a33 = 0, the simple Gaussian elimination cannot con-
tinue in its present form and from Theorem 1.21, the decomposition of A
is not possible. Hence, A cannot be factored as A = LDLT . •
154 Applied Linear Algebra and Optimization using MATLAB

Since we know that not every matrix has a direct LU decomposition,


we define the following matrix which gives the sufficient condition for the
LU decomposition of the matrix. It also helps us with the convergence of
the iterative methods for solving linear systems.
Definition 1.32 (Strictly Diagonally Dominant Matrix)

A square matrix is said to be Strictly Diagonally Dominant (SDD) if the


absolute value of each element on the main diagonal is greater than the
sum of the absolute values of all the other elements in that row. Thus, a
SDD matrix is defined as
n
X
|aii | > |aij |, for i = 1, 2, . . . , n. (1.45)
j=1
j6=i

Example 1.52 The matrix


 
7 3 1
A= 1 6 3 
−2 4 8
is SDD since
|7| > |3| + |1|, i.e., 7 > 4
|6| > |1| + |3|, i.e., 6 > 4
|8| > | − 2| + |4|, i.e., 8 > 6,
but the matrix  
6 −3 4
B= 3 7 3 
5 −4 10
is not SDD since
|6| > | − 3| + |4|, i.e., 6 > 7,

which is not true. •

An SDD matrix occurs naturally in a wide variety of practical applica-


tions, and when solving an SDD system by the Gauss elimination method,
partial pivoting is never required.
Matrices and Linear Systems 155

Theorem 1.30 If a matrix A is strictly diagonally dominant, then:

1. Matrix A is nonsingular.

2. Gaussian elimination without row interchange can be performed on


the linear system Ax = b.

3. Matrix A has LU factorization. •

Example 1.53 Solve the following linear system using the simple Gaus-
sian elimination method and also find the LU decomposition of the matrix
using Doolittle’s method and Crout’s method:

5x1 + x2 + x3 = 7
2x1 + 6x2 + x3 = 9
x1 + 2x2 + 9x3 = 12.

Solution. Start with the augmented matrix form


.
 
5 1 1 .. 7
 2 6 1 ... 9
 
,
 
.
1 2 9 .. 12

and since a11 = 5 6= 0, we can eliminate the elements a21 and a31 by
subtracting from the second and third rows the appropriate multiples of the
first row. In this case the multiples are given,
2 1
m21 = = 0.4 and m31 = = 0.2.
5 5
Hence,
..
 
5 1 1 . 7
 0 5.6 0.6 ... 6.2  .
 
 
..
0 1.8 8.8 . 10.6
(1) (1)
Since a22 = 5.6 6= 0, we eliminate the entry in the a32 position by sub-
1.8
tracting the multiple m32 = 5.6 = 0.32 of the second row from the third row
156 Applied Linear Algebra and Optimization using MATLAB

to get
.
 
5 1 ..
1 7
 0 5.6 0.6 ... 6.2  .
 
 
..
0 0 8.6 . 8.6
Obviously, the original set of equations has been transformed to an upper-
triangular form. All the diagonal elements of the obtaining upper-triangular
matrix are nonzero, which means that the coefficient matrix of the given
system is nonsingular, therefore, the given system has a unique solution.
Now expressing the set in algebraic form yields
5x1 + x2 + x3 = 7
5.6x2 + 0.6x3 = 6.2
8.6x3 = 8.6.
Now use backward substitution to get the solution of the system as
8.6x3 = 8.6 gives x3 = 1
5.6x2 = −0.6x3 + 6.2 = −0.6 + 6.2 = 5.6 gives x2 = 1
5x1 = 7 − x2 − x3 = 7 − 1 − 1 = 5 gives x1 = 1.
We know that when using LU decomposition by Doolittle’s method the un-
known elements of matrix L are the multiples used and the matrix U is the
same as we obtained in the forward elimination process of the simple Gauss
elimination. Thus, the LU decomposition of matrix A can be obtained by
using Doolittle’s method as follows:
   
5 1 1 1 0 0 5 1 1
A =  2 6 1   0.4 1 0   0 5.6 0.6  = LU.
1 2 9 0.5 0.32 1 0 0 8.6
Similarly, the LU decomposition of matrix A can be obtained by using
Crout’s method as
   
5 1 1 5 0 0 1 0.2 0.2
A=  2 6 1   2 5.6 0   0 1 0.1  = LU.
1 2 9 1 1.8 8.6 0 0 1
Thus, the conditions of Theorem 1.30 are satisfied. •
Matrices and Linear Systems 157

1.4.6 Tridiagonal Systems of Linear Equations


The application of numerical methods to the solution of certain engineering
problems may in some cases result in a set of tridiagonal linear algebraic
equations. Heat conduction and fluid flow problems are some of the many
applications that generate such a system.
A tridiagonal system has a coefficients matrix T of which all elements
except those on the main diagonal and the two diagonals just above and
below the main diagonal (usually called superdiagonal and subdiagonal,
respectively) are defined as
 
α1 c1 0 ··· 0
..
 β2 α2 c2 . 0
 


T = ... 
. (1.46)
 0 β3 a3 0 
 . ... ... ...
 ..

cn−1 
0 0 0 βn αn

This type of matrix can be stored more economically, which is the


case for a fully populated matrix. Obviously, one may use any one of
the methods discussed in the previous sections for solving the tridiagonal
system
T x = b, (1.47)
but the linear system involving nonsingular matrices of the form T given
in (1.47) are also most easily solved by the LU decomposition method just
described for the general linear system. The tridiagonal matrix T can be
factored into a lower-bidiagonal factor L and an upper-bidiagonal factor U
having the following forms:
   
1 0 0 ··· 0 u1 c1 0 ··· 0
. ..
l2 1 0 .. 0  0 u2 c2 . 0
   
  
 .   .. 
L=
 0 l3 1 .. ,
0  U =
 0 0 u3 . 0 .

 ... ... ... ...   .. .. .. .. 
 0   . . . . cn−1 
0 0 0 ln 1 0 0 0 0 un
(1.48)
158 Applied Linear Algebra and Optimization using MATLAB

The unknown elements li and ui of matrices L and U , respectively, can be


computed as a special case of Doolittle’s method using the LU decompo-
sition method,

u1 = α1 





βi 
li = , i = 2, 3, . . . , n . (1.49)
ui−1 





u = α −lc ,
i i i i−1 i = 2, 3, . . . , n. 

After finding the values for li and ui , then they are used along with the
elements ci , to solve the tridiagonal system (1.47) by solving the first bidi-
agonal system
Ly = b, (1.50)
for y by using forward substitution,

y 1 = b1
, (1.51)
yi = bi − li yi−1 , i = 2, 3, . . . , n

followed by solving the second bidiagonal system,

U x = y, (1.52)

for x by using backward substitution,



xn = yn /un
. (1.53)
xi = yi − ci xi+1 , i = n − 1, . . . , 1

The entire process for solving the original system (1.47) requires 3n ad-
ditions, 3n multiplications, and 2n divisions. Thus, the total number of
multiplications and divisions is approximately 5n.

Most large tridiagonal systems are strictly diagonally dominant (defined


as follows), so pivoting is not necessary. When solving systems of equations
with a tridiagonal coefficients matrix T , iterative methods can sometimes
be used to one’s advantage. These methods are introduced in Chapter 2.
Matrices and Linear Systems 159

Example 1.54 Solve the following tridiagonal system of equations using


the LU decomposition method:

x1 + x2 = 1
x1 + 2x2 + x3 = 0
x2 + 3x3 + x4 = 1
x3 + 4x4 = 1.

Solution. Construct the factorization of tridiagonal matrix T as follows:


    
1 1 0 0 1 0 0 0 u1 1 0 0
 1 2 1 0   l2 1 0
  0   0 u2 1 0
  

 0 = .
1 3 1   0 l3 1 0   0 0 u3 1 
0 0 1 4 0 0 l4 1 0 0 0 u4

Then the elements of the L and U matrices can be computed by using (1.48)
as follows:
u1 = α1 = 1

β2 1
l2 = = =1
u1 1

u2 = α2 − l2 c1 = 2 − (1)1 = 1

b3 1
l3 = = =1
u2 1

u3 = α3 − l3 c2 = 3 − (1)1 = 2

b4 1
l4 = =
u3 2

1 7
u4 = α4 − l4 c3 = 4 − ( )1 = .
2 2
After finding the elements of the bidiagonal matrices L and U , we solve the
160 Applied Linear Algebra and Optimization using MATLAB

first system Ly = b as follows:

   

1 0 0 0 y1 1
 1 1 0 0   y2   0 
   =  .
 0 1 1 0   y3   1 
0 0 21 1 y4 1

Using forward substitution, we get

[y1 , y2 , y3 , y4 ]T = [1, −1, 2, 0]T .

Now we solve the second system U x = y as follows:

   

1 1 0 0 x1 1
 0 1 1 0   x2   −1 
=
  2 .
  
 0 0 2 1   x3
0 0 0 27 x4 0

Using backward substitution, we get

x∗ = [x1 , x2 , x3 , x4 ]T = [3, −2, 1, 0]T ,

which is the required solution of the given system. •

The above results can be obtained using MATLAB commands. We do


the following:

>> T b = [T |b] = [1 1 0 0 1; 1 2 1 0 0; 0 1 3 1 1; 0 0 1 4 1];


>> T riDLU (T b);
Matrices and Linear Systems 161

Program 1.14
MATLAB m-file for LU Decomposition for a Tridiagonal System
function sol=TRiDLU(Tb)
[m,n]=size(Tb); L=eye(m); U=zeros(m);
U(1,1)=Tb(1,1);
for i=2:m
U (i − 1, i) = T b(i − 1, i);
L(i, i − 1) = T b(i, i − 1)/U (i − 1, i − 1);
U (i, i) − L(i, i − 1) ∗ T b(i − 1, i); end
disp(’The lower-triangular matrix’) L;
disp(’The upper-triangular matrix’) U;
y = inv(L) ∗ T b(:, n); x = inv(U ) ∗ y;

Procedure 1.8 (LU Decomposition by the Tridiagonal Method)

1. Take the tridiagonal matrix T .

2. Decompose the matrix T = LU using (1.49).

3. Solve linear system Ly = b using (1.51).

4. Solve linear system U x = y using (1.53).

1.5 Conditioning of Linear Systems


In solving the linear system numerically we have to see the problem condi-
tioning, algorithm stability, and cost. Earlier we discussed efficient elimi-
nation schemes to solve a linear system, and these schemes are stable when
pivoting is employed. But there are some ill-conditioned systems which are
tough to solve by any method. These types of linear systems are identified
in this chapter.

Here, we will present a parameter, the condition number, which quan-


titatively measures the conditioning of a linear system. The condition
number is greater than and equal to one and as a linear system becomes
162 Applied Linear Algebra and Optimization using MATLAB

more ill-conditioned, the condition number increases. After factoring a ma-


trix, the condition number can be estimated in roughly the same time it
takes to solve a few factored systems (LU )x = b. Hence, after factoring a
matrix, the extra computer time needed to estimate the condition number
is usually insignificant.

1.5.1 Norms of Vectors and Matrices


For solving linear systems, we discuss a method for quantitatively mea-
suring the distance between vectors in Rn , the set of all column vectors
with real components, to determine whether the sequence of vectors that
results from using a direct method converges to a solution of the system.
To define a distance in Rn , we use the notation of the norm of a vector.

Vector Norms

It is sometimes useful to have a scalar measure of the magnitude of a vector.


Such a measure is called a vector norm and for a vector x is written as kxk.

A vector norm on Rn is a function from Rn to R satisfying:

1. kxk > 0, for all x ∈ Rn ;

2. kxk = 0, if and only if x = 0;

3. kαxk = |α|kxk, for all α ∈ R, x ∈ Rn ;

4. kx + yk ≤ kxk + kyk, for all x, y ∈ Rn .

There are three norms in Rn that are most commonly used in applications,
called l1 -norm, l2 -norm, and l∞ -norm, and are defined for the given vectors
Matrices and Linear Systems 163

x = [x1 , x2 , . . . , xn ]T as

n
X
kxk1 = |xi |
i=1

n
!1/2
X
kxk2 = x2i
i=1

kxk∞ = max |xi |.


1≤i≤n

The l1 -norm is called the absolute norm, the l2 -norm is frequently called
the Euclidean norm as it is just the formula for distance in ordinary three-
dimensional Euclidean space extended to dimension n, and finally, the
l∞ -norm is called the maximum norm or occasionally the uniform norm.
All these three norms are also called the natural norms.

Example 1.55 Compute the lp -norms (p = 1, 2, ∞) of the vector x =


[−5, 3, −2]T in R3 .

Solution. These lp -norms (p = 1, 2, ∞) of the given vector are:

kxk1 = |x1 | + |x2 | + |x3 | = | − 5| + |3| + | − 2| = 10,


h i1/2
kxk2 = (x21 + x22 + x23 )1/2 = (−5)2 + (3)2 + (−2)2 ≈ 6.16,

kxk∞ = max{|x1 |, |x2 |, |x3 |} = max{| − 5|, |3|, | − 2|} = 5.

In MATLAB, the built-in norm function computes the lp -norms of vec-


tors. If only one argument is passed to norm, the l2 -norm is returned and
for two arguments, the second one is used to specify the value of p:
164 Applied Linear Algebra and Optimization using MATLAB

>> x = [−5 3 − 2];


>> v = norm(x)
v = 6.16
>> x = [−5 3 − 2];
>> v = norm(x, 2)
v = 6.16
>> x = [−5 3 − 2];
>> v = norm(x, 1)
v = 10
>> x = [−5 3 − 2];
>> v = norm(x, inf )
v=5

The internal MATLAB constant inf is used to select the l∞ -norm.

Matrix Norms
A matrix norm is a measure of how well one matrix approximates another,
or, more accurately, of how well their difference approximates the zero ma-
trix. An iterative procedure for inverting a matrix produces a sequence
of approximate inverses. Since, in practice, such a process must be termi-
nated, it is desirable to have some measure of the error of an approximate
inverse.

So a matrix norm on the set of all n × n matrices is a real-valued


function, k.k, defined on this set, satisfying for all n × n matrices A and B
and all real numbers α as follows:

1. kAk > 0, A 6= 0;

2. kAk = 0, A = 0;

3. kIk = 1, I is the identity matrix;

4. kαAk = |α|kAk, for scalar α ∈ R;

5. kA + Bk ≤ kAk + kBk;
Matrices and Linear Systems 165

6. kABk ≤ kAkkBk;

7. kA − Bk ≥ kAk − kBk .

Several norms for matrices have been defined, and we shall use the following
three natural norms l1 , l2 , and l∞ for a square matrix of order n:
n
!
X
kAk1 = max |aij | = maximum column sum,
j
i=1

kAk2 = max kAxk2 = spectral norm,


kxk2 =1

n
!
X
kAk∞ = max |aij | = row sum norm.
i
j=1

The l1 -norm and l∞ -norm are widely used because they are easy to cal-
culate. The matrix norm kAk2 that corresponds to the l2 -norm is related
to the eigenvalues of the matrix. It sometimes has special utility because
no other norm is smaller than this norm. It, therefore, provides the best
measure of the size of a matrix, but is also the most difficult to compute.
We will discuss this natural norm later in the chapter.

For an m × n matrix, we can paraphrase the Frobenius (or Euclidean)


norm (which is not a natural norm) and define it as

m X
n
!1/2
X
kAkF = |aij |2 .
i=1 j=1

It can be shown that p


kAkF = tr(AT A),
where tr(AT A) is the trace of a matrix AT A, i.e., the sum of the diagonal
entries of AT A. The Frobenius norm of a matrix is a good measure of the
magnitude of a matrix. Note that kAkF 6= kAk2 . For a diagonal matrix,
all norms have the same values.
166 Applied Linear Algebra and Optimization using MATLAB

Example 1.56 Compute the lp -norms (p = 1, ∞, F ) of the following ma-


trix:  
4 2 −1
A= 3 5 −2  .
1 −2 7
Solution. These norms are:
3
X
|ai1 | = |4| + |3| + |1| = 8,
i=1
3
X
|ai2 | = |2| + |5| + | − 2| = 9,
i=1
3
X
|ai3 | = | − 1| + | − 2| + |7| = 10,
i=1

so
kAk1 = max{8, 9, 10} = 10.
Also,
3
X
|a1j | = |4| + |2| + | − 1| = 7,
j=1
3
X
|a2j | = |3| + |5| + | − 2| = 10,
j=1
X3
|a3j | = |1| + | − 2| + |7| = 10,
j=1
so
kAk∞ = max{7, 10, 10} = 10.
Finally, we have

kAkF = (16 + 4 + 1 + 9 + 25 + 4 + 1 + 4 + 49)1/2 ≈ 10.6301,

the Frobenius norm of the given matrix. •

Like the lp -norms of vectors, in MATLAB the built-in norm function


can be used to compute the lp -norms of matrices. The l1 -norm of a matrix
Matrices and Linear Systems 167

can be computed as follows:

>> A = [4 2 − 1; 3 5 − 2; 1 − 2 − 7];
>> B = norm(A, 1)
B=
10
The l∞ -norm of a matrix A is:

>> A = [4 2 − 1; 3 5 − 2; 1 − 2 − 7];
>> B = norm(A, inf )
B=
10
Finally, the Frobenius norm of the matrix A is:

>> A = [4 2 − 1; 3 5 − 2; 1 − 2 − 7];
>> B = norm(A,0 f ro0 )
B=
10.6301

1.5.2 Errors in Solving Linear Systems


Any computed solution of a linear system must, because of round-off and
other errors, be considered an approximate solution. Here, we shall con-
sider the most natural method for determining the accuracy of a solution
of the linear system. One obvious way of estimating the accuracy of the
computed solution x∗ is to compute Ax∗ and to see how close Ax∗ comes
to b. Thus, if x∗ is an approximate solution of the given system Ax = b,
we compute a vector
r = b − Ax∗ , (1.54)
which is called the residual vector and which can be easily calculated. The
quantity
krk kb − Ax∗ k
=
kbk kbk
is called the relative residual. We use MATLAB as follows:
168 Applied Linear Algebra and Optimization using MATLAB

Program 1.15
MATLAB m-file for Finding the Residual Vector
function r=RES(A,b,x0)
[n,n]=size(A);
for i=1:n; R(i) = b(i);
for j=1:n
R(i)=R(i)-A(i,j)*x0(j);end
RES(i)=R(i); end
r=RES’

The smallness of the residual then provides a measure of the goodness


of the approximate solution x∗ . If every component of vector r vanishes,
then x∗ is the exact solution. If x∗ is a good approximation, then we would
expect each component of r to be small, at least in a relative sense. For
example, the linear system
x1 + 2x2 = 3
1.0001x1 + 2x2 = 3.0001

has the approximate solution x∗ = [3, 0]T . To see how good this solution
is, we compute the residual, r = [0, −0.0002]T .

We can conclude from the residual that the approximate solution is


correct to at most three decimal places. Also, the linear system
1.0000x1 + 0.9600x2 + 0.8400x3 + 0.6400x4 = 3.4400
0.9600x1 + 0.9214x2 + 0.4406x3 + 0.2222x4 = 2.5442
0.8400x1 + 0.4406x2 + 1.0000x3 + 0.3444x4 = 2.6250
0.6400x1 + 0.2222x2 + 0.3444x3 + 1.0000x4 = 2.2066

has the exact solution x = [1, 1, 1, 1]T and the approximate solution due to
Gaussian elimination without pivoting is

x∗ = [1.0000322, 0.99996948, 0.99998748, 1.0000113]T ,

and the residual is

r = [0.6 × 10−7 , 0.6 × 10−7 , −0.53 × 10−5 , −0.21 × 10−4 ]T .


Matrices and Linear Systems 169

The approximate solution due to Gaussian elimination with partial pivot-


ing is
x∗ = [0.9999997, 0.99999997, 0.99999996, 1.0000000]T ,
and the residual is

r = [0.3 × 10−7 , 0.3 × 10−7 , 0.6 × 10−7 , 0.1 × 10−8 ]T .

We found that all the elements of the residual for the second case (with
pivoting) are less than 0.6 × 10−7 , whereas for the first case (without piv-
oting) they are as large as 0.2 × 10−4 . Even without knowing the exact
solution, it is clear that the solution obtained in the second case is much
better than the first case. The residual provides a reasonable measure of
the accuracy of a solution in those cases where the error is primarily due
to the accumulation of round-off errors.

Intuitively it would seem reasonable to assume that when krk is small


for a given vector norm, then the error kx − x∗ k would be small as well.
In fact, this is true for some systems. However, there are systems of equa-
tions which do not satisfy this property. Such systems are said to be
ill-conditioned.

These are systems in which small changes in the coefficients of the


system lead to large changes in the solution. For example, consider the
linear system
x1 + x2 = 2
x1 + 1.01x2 = 2.01.
The exact solution is easily verified to be x1 = x2 = 1. On the other hand,
the system
x1 + x2 = 2
1.001x1 + x2 = 2.01
has the solution x1 = 10, x2 = −8. Thus, a change of 1% in the coefficients
has changed the solution by a factor of 10. If in the above given system, we
substitute x1 = 10, x2 = 8, we find that the residuals are r1 = 0, r2 = 0.09,
so this solution looks reasonable, although it is grossly in error. In practical
problems we can expect the coefficients in the system to be subject to small
170 Applied Linear Algebra and Optimization using MATLAB

errors, either because of round-off or because of physical measurement. If


the system is ill-conditioned, the resulting solution may be grossly in error.
Errors of this type, unlike those caused by round-off error accumulation,
cannot be avoided by careful programming.

We have seen that for ill-conditioned systems the residual is not neces-
sarily a good measure of the accuracy of a solution. How then can we tell
when a system is ill-conditioned? In the following we discuss some possible
indicators of ill-conditioned systems.

Definition 1.33 (Condition Number of a Matrix)

The number kAkkA−1 k is called the condition number of a nonsingular


matrix A and is denoted by K(A), i.e.,

cond(A) = K(A) = kAkkA−1 k. (1.55)

Note that the condition number K(A) for A depends on the matrix norm
used and can, for some matrices, vary considerably as the matrix norm is
changed. Since

1 = kIk = kAA−1 k ≤ kAkkA−1 k = K(A),

the condition number is always in the range 1 ≤ K(A) ≤ ∞ regardless of


any natural norm. The lower limit is attained for identity matrices and
K(A) = ∞ if A is singular. So the matrix A is well-behaved (or well-
conditioned) if K(A) is close to 1 and is increasingly ill-conditioned when
K(A) is significantly greater than 1, i.e., K(A) → ∞. •

The condition numbers provide bounds for the sensitivity of the solution
of a set of equations to changes in the coefficient matrix. Unfortunately,
the evaluation of any of the condition numbers of a matrix A is not a trivial
task since it is necessary first to obtain its inverse.

So if the condition number of a matrix is a very large number, then this


is one of the indicators of an ill-conditioned system. Another indicator of
ill-conditioning is when the pivots during the process of elimination suffer
Matrices and Linear Systems 171

a loss of one or more significant figures. Small changes in the right-hand


side terms of the system lead to large changes in the solution and give
another indicator of an ill-conditioned system. Also, when the elements of
the inverse of the coefficient matrix are large compared with the elements
of the coefficients matrix, this also shows an ill-conditioned system.

Example 1.57 Compute the condition number of the following matrix us-
ing the l∞ -norm:  
2 −1 0
A =  2 −4 −1  .
−1 0 2
Solution. The condition number of a matrix is defined as

K(A) = kAk∞ kA−1 k∞ .

First, we calculate the inverse of the given matrix, which is


8 2 1
 
 13 − −
 13 13 

 
3 4 2
A−1 = 
 
− − .
 13 13 13 
 
 
 4 1 6 

13 13 13
Now we calculate the l∞ -norm of both the matrices A and A−1 . Since the
l∞ -norm of a matrix is the maximum of the absolute row sums, we have

kAk∞ = max{|2| + | − 1| + |0|, |2| + | − 4| + | − 1|, | − 1| + |0| + |2|} = 7

and
n 8 −2 −1 3 −4 −2 4 −1 6 o
kA−1 k∞ = max + + , + + , + + ,

13 13 13 13 13 13 13 13 13
which gives
11
kA−1 k∞ = .
13
172 Applied Linear Algebra and Optimization using MATLAB

Therefore,

−1 11
K(A) = kAk∞ kA k∞ = (7) ≈ 5.9231.
13
Depending on the application, we might consider this number to be rea-
sonably small and conclude that the given matrix A is reasonably well-
conditioned. •
To get the above results using MATLAB commands, we do the follow-
ing:

>> A = [2 − 1 0; 2 − 4 − 1; −1 0 2];
>> Ainv = inv(A)
>> K(A) = norm(A, inf ) ∗ norm(Ainv, inf );
K(A) =
5.9231
Some matrices are notoriously ill-conditioned. For example, consider the
4 × 4 Hilbert matrix
 1 1 1 
1
 2 3 4 
 
 
 1 1 1 1 
 
 2 3 4 5 
H= ,
 
 1 1 1 1 
 
 3 4 5 6 
 
 
1 1 1 1
 
4 5 6 7
whose entries are defined by
1
aij = , for i, j = 1, 2, . . . , n.
(i + j − 1)
The inverse of the matrix H can be obtained as
 
16 −120 240 −140
 −120 1200 −2700 1680 
H −1 = 
 240 −2700
.
6480 −4200 
−140 1680 −4200 2800
Matrices and Linear Systems 173

Then the condition number of the Hilbert matrix is

K(H) = kHk∞ kH −1 k∞ = (2.0833)(13620) ≈ 28375,

which is quite large. Note that the condition numbers of Hilbert matri-
ces increase rapidly as the sizes of the matrices increase. Therefore, large
Hilbert matrices are considered to be extremely ill-conditioned.

We might think that if the determinant of a matrix is close to zero, then


the matrix is ill-conditioned. However, this is false. Consider the matrix
 −7 
10 0
A= ,
0 10−7

for which det A = 10−14 ≈ 0. One can easily find the condition number of
the given matrix as

K(A) = kAk∞ kA−1 k∞ = (10−7 )(107 ) = 1.

The matrix A is therefore perfectly conditioned. Thus, a small determi-


nant is necessary but not sufficient for a matrix to be ill-conditioned.

The condition number of a matrix K(A) using the l2 -norm can be com-
puted by the built-in function cond command in MATLAB as follows:

>> A = [1 − 1 2; 3 1 − 1; 2 0 1];
>> K(A) = cond(A);
K(A) =
19.7982

Theorem 1.31 (Error in Linear Systems)

Suppose that x∗ is an approximation to the solution x of the linear system


Ax = b and A is a nonsingular matrix and r is the residual vector for x∗ .
Then for any natural norm, the error is

kx − x∗ k ≤ krkkA−1 k, (1.56)
174 Applied Linear Algebra and Optimization using MATLAB

and the relative error is


kx − x∗ k krk
≤ K(A) , provided x 6= 0, b 6= 0. (1.57)
kxk kbk

Proof. Since r = b − Ax∗ and A is nonsingular, then

Ax − Ax∗ = b − (b − r) = r,

which implies that


A(x − x∗ ) = r (1.58)
or
x − x∗ = A−1 r.
Taking the norm on both side gives

kx − x∗ k = kA−1 rk ≤ kA−1 kkrk.

Moreover, since b = Ax, then

kbk
kbk ≤ kAkkxk, or kxk ≥ .
kAk

Hence,
kx − x∗ k kA−1 kkrk krk
≤ ≤ K(A) .
kxk kbk/kAk kbk
The inequalities (1.56) and (1.57) imply that the quantities kA−1 k and
K(A) can be used to give an indication of the connection between the
residual vector and the accuracy of the approximation. If the quantity
K(A) ≈ 1, the relative error will be fairly close to the relative residual.
But if K(A) >> 1, then the relative error could be many times larger than
the relative residual.

Example 1.58 Consider the following linear system:

x1 + x2 − x3 = 1
x1 + 2x2 − 2x3 = 0
−2x1 + x2 + x3 = −1.
Matrices and Linear Systems 175

(a) Discuss the ill-conditioning of the given linear system.


(b) Let x∗ = [2.01, 1.01, 1.98]T be an approximate solution of the given sys-
tem, then find the residual vector r and its norm krk∞ .
(c) Estimate the relative error using (1.57).
(d) Use the simple Gaussian elimination method to find the approximate
error using (1.58).

Solution. (a) Given the matrix


 
1 1 −1
A =  1 2 −2  ,
−2 1 1
the inverse can be computed as
 
2 −1 0
A−1 =  1.5 −0.5 0.5  .
2.5 −1.5 0.5
Then the l∞ -norms of both matrices are
kAk∞ = 5 and kA−1 k∞ = 4.5.
Using the values of both matrices’ norms, we can find the value of the
condition number of A as
K(A) = kAk∞ k|A−1 k∞ = 22.5 >> 1,
which shows that the matrix is ill-conditioned. Thus, the given system is
ill-conditioned.

>> A = [1 1 − 1; 1 2 − 2; −2 1 1];
>> K(A) = norm(A, inf ) ∗ norm(inv(A), inf );
(b) The residual vector can be calculated as
r = b − Ax∗
    
1 1 1 −1 2.01
=  0  −  1 2 −2   1.01  .
−1 −2 1 1 1.98
176 Applied Linear Algebra and Optimization using MATLAB

After simplifying, we get


 
−0.04
r =  −0.07  ,
0.03

and it gives
krk∞ = 0.07.
>> A = [1 1 − 1; 1 2 − 2; −2 1 1];
>> b = [1 0 − 1]0 ;
>> x0 = [2.01 1.01 1.98]0 ;
>> r = RES(A, b, x0);
>> rnorm = norm(r, inf );
(c) From (1.57), we have

kx − x∗ k krk
≤ K(A) .
kxk kbk

By using parts (a) and (b) and the value kbk∞ = 1, we obtain

kx − x∗ k (0.07)
≤ (22.5) = 1.575.
kxk 1

>> RelErr = (K(A) ∗ rnorm)/norm(b, inf );

(d) Solve the linear system Ae = r, where


   
1 1 −1 −0.04
A =  1 2 −2  and r =  −0.07 
−2 1 1 0.03

and e = x − x∗ . Write the above system in the augmented matrix form


..
 
1 1 −1 . −0.04
 1 2 −2 ... −0.07  .
 
 
..
−2 1 1 . 0.03
Matrices and Linear Systems 177

After applying the forward elimination step of the simple Gauss elimination
method, we obtain
..
 
1 1 −1 . −0.04
 0 1 −1 ... −0.03  .
 
 
..
0 0 2 . 0.04
Now by using backward substitution, we obtain the solution

e∗ = [−0.01, −0.01, 0.02]T ,

which is the required approximation of the exact error. •

>> B = [1 1 − 1 − 0.04; 1 2 − 2 − 0.07; −2 1 1 0.03];


>> W P (B);

Conditioning
Let us consider the conditioning of the linear system

Ax = b. (1.59)

Case 1.1 Suppose that the right-hand side term b is replaced by b + δb,
where δb is an error in b. If x + δx is the solution corresponding to the
right-hand side b + δb, then we have

A(x + δx) = (b + δb), (1.60)

which implies that

Ax + Aδx = b + δb,
Aδx = δb.

Multiplying by A−1 , we get

δx = A−1 δb.

Taking the norm gives

kδxk = kA−1 δbk ≤ kA−1 kkδbk. (1.61)


178 Applied Linear Algebra and Optimization using MATLAB

Thus, the change kδxk in the solution is bounded by kA−1 k times the change
kδbk in the right-hand side.
The conditioning of the linear system is connected with the ratio between
kδxk kδbk
the relative error and the relative change in the right-hand side,
kxk kbk
which gives

kδxk/kxk kA−1 δbk/kxk kAxkkA−1 δbk


= = ≤ kAkkA−1 k,
kδbk/kbk kδbk/kAxk kxkkδbk

which implies that


kδxk kδbk
≤ K(A) . (1.62)
kxk kbk

Thus, the relative change in the solution is bounded by the condition num-
ber of the matrix times the relative change in the right-hand side. When
the product in the right-hand side is small, the relative change in the so-
lution is small.
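The bound (1.62) is easy to explore numerically. In the following MATLAB sketch the perturbation δb is an arbitrary illustrative choice; the computed relative change in x never exceeds the bound:

>> A = [1 1 -1; 1 2 -2; -2 1 1]; b = [1 0 -1]';
>> x = A\b;                                % unperturbed solution
>> db = 1e-4*[1 -1 1]';                    % a hypothetical perturbation of b
>> dx = A\(b + db) - x;                    % resulting change in the solution
>> lhs = norm(dx, inf)/norm(x, inf);       % relative change in x
>> K = norm(A, inf)*norm(inv(A), inf);
>> rhs = K*norm(db, inf)/norm(b, inf);     % right-hand side of (1.62)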

Case 1.2 Suppose that the matrix A is replaced by A + δA, where δA is


the error in A, while the right-hand side term b is similar. If x + δx is the
solution corresponding to the matrix A + δA, then we have

(A + δA)(x + δx) = b, (1.63)

which implies that

Ax + Aδx + δA(x + δx) = b

or
Aδx = −δA(x + δx).
Multiplying by A−1 , we get

δx = −A−1 δA(x + δx).

Taking the norm gives

$$\|\delta x\| = \|{-A^{-1}}\delta A(x + \delta x)\| \le \|A^{-1}\|\,\|\delta A\|\,(\|x\| + \|\delta x\|)$$



or
$$\|\delta x\|\,(1 - \|A^{-1}\|\,\|\delta A\|) \le \|A^{-1}\|\,\|\delta A\|\,\|x\|,$$
which can be written as
$$\frac{\|\delta x\|}{\|x\|} \le \frac{\|A^{-1}\|\,\|\delta A\|}{1 - \|A^{-1}\|\,\|\delta A\|} = \frac{K(A)\,\|\delta A\|/\|A\|}{1 - \|A^{-1}\|\,\|\delta A\|}. \qquad (1.64)$$
If the product $\|A^{-1}\|\,\|\delta A\|$ is much smaller than 1, the denominator in (1.64) is near 1. Consequently, when $\|A^{-1}\|\,\|\delta A\|$ is much smaller than 1, (1.64) implies that the relative change in the solution is bounded by the condition number of the matrix times the relative change in the coefficient matrix.

Case 1.3 Suppose that there is a change in the coefficient matrix A and the
right-hand side term b together, and if x + δx is the solution corresponding
to the coefficient matrix A + δA and the right-hand side b + δb, then we
have
(A + δA)(x + δx) = (b + δb), (1.65)
which implies that
$$Ax + A\,\delta x + \delta A\,x + \delta A\,\delta x = b + \delta b$$
or
$$(A + \delta A)\,\delta x = \delta b - \delta A\,x.$$
Multiplying by $A^{-1}$, we get
$$(I + A^{-1}\delta A)\,\delta x = A^{-1}(\delta b - \delta A\,x)$$
or
$$\delta x = (I + A^{-1}\delta A)^{-1}A^{-1}(\delta b - \delta A\,x). \qquad (1.66)$$
If $A$ is nonsingular and the error $\delta A$ satisfies
$$\|A^{-1}\delta A\| \le \|A^{-1}\|\,\|\delta A\| < 1, \qquad (1.67)$$
then it follows (see Fröberg 1969) that the matrix $(I + A^{-1}\delta A)$ is nonsingular and
$$\|(I + A^{-1}\delta A)^{-1}\| \le \frac{1}{1 - \|A^{-1}\delta A\|} \le \frac{1}{1 - \|A^{-1}\|\,\|\delta A\|}. \qquad (1.68)$$

Taking the norm of (1.66) and using (1.68) gives
$$\|\delta x\| \le \frac{\|A^{-1}\|}{1 - \|A^{-1}\|\,\|\delta A\|}\left[\|\delta b\| + \|x\|\,\|\delta A\|\right]$$
or
$$\frac{\|\delta x\|}{\|x\|} \le \frac{\|A^{-1}\|}{1 - \|A^{-1}\|\,\|\delta A\|}\left[\frac{\|\delta b\|}{\|x\|} + \|\delta A\|\right]. \qquad (1.69)$$
Since we know that
$$\|x\| \ge \frac{\|b\|}{\|A\|}, \qquad (1.70)$$
by using (1.70) in (1.69), we get
$$\frac{\|\delta x\|}{\|x\|} \le \frac{K(A)}{1 - K(A)\dfrac{\|\delta A\|}{\|A\|}}\left[\frac{\|\delta A\|}{\|A\|} + \frac{\|\delta b\|}{\|b\|}\right]. \qquad (1.71)$$
The estimate (1.71) shows that small relative changes in $A$ and $b$ cause small relative changes in the solution $x$ of the linear system (1.59) if the quantity
$$\frac{K(A)}{1 - K(A)\dfrac{\|\delta A\|}{\|A\|}} \qquad (1.72)$$
is not too large. •
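The combined bound (1.71) can be checked the same way; here δA and δb are small perturbations chosen only for illustration:

>> A = [1 1 -1; 1 2 -2; -2 1 1]; b = [1 0 -1]';
>> x = A\b;
>> dA = 1e-3*eye(3); db = 1e-3*[1 -1 1]';    % hypothetical perturbations
>> dx = (A + dA)\(b + db) - x;               % changed solution minus x
>> K = norm(A, inf)*norm(inv(A), inf);
>> relA = norm(dA, inf)/norm(A, inf); relb = norm(db, inf)/norm(b, inf);
>> bound = K/(1 - K*relA)*(relA + relb);     % right-hand side of (1.71)
>> check = norm(dx, inf)/norm(x, inf) <= bound   % returns logical 1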

1.6 Applications
In this section we discuss applications of linear systems, tackling a variety of real-life problems from several areas of science.

1.6.1 Curve Fitting, Electric Networks, and Traffic Flow


Curve Fitting

The following problem occurs in many different branches of science. A set


of data points
(x1 , y1 ), (x2 , y2 ), . . . , (xn , yn )

Figure 1.3: Fitting a graph to data points.

is given and it is necessary to find a polynomial whose graph passes through


the points. The points are often measurements in an experiment. The x-
coordinates are called base points. It can be shown that if the base points
are all distinct, then a unique polynomial of degree n − 1 (or less)

p(x) = a0 + a1 x + · · · + an−2 xn−2 + an−1 xn−1

can be fitted to the points (Figure 1.3).

The coefficients an−1 , an−2 , . . . , a1 , a0 of the appropriate polynomial can


be found by substituting the points into the polynomial equation and then
solving a system of linear equations. It is usual to write the polynomial in
terms of ascending powers of x for the purpose of finding these coefficients.
The columns of the matrix of coefficients of the system of equations then of-
ten follow a pattern. More will be discussed about this in the next chapter.

We now illustrate the procedure by fitting a polynomial of degree 2, a


parabola, to a set of three such data points.

Example 1.59 Determine the equation of the polynomial of degree 2 whose


graph passes through the points (1, 6), (2, 3), and (3, 2).

Solution. Observe that in this example we are given three points and we
want to find a polynomial of degree 2 (one less than the number of data
points). Let the polynomial be
p(x) = a0 + a1 x + a2 x2 .
We are given three points and shall use these three sets of information to
determine the three unknowns a0 , a1 , and a2 . Substituting
x = 1, y = 6; x = 2, y = 3; x = 3, y = 2,
in turn, into the polynomial leads to the following system of three linear
equations in a0 , a1 , and a2 :
a0 + a1 + a2 = 6
a0 + 2a1 + 4a2 = 3
a0 + 3a1 + 9a2 = 2.
Solve this system for a2 , a1 , and a0 using the Gauss elimination method:
$$\left[\begin{array}{ccc|c} 1 & 1 & 1 & 6 \\ 1 & 2 & 4 & 3 \\ 1 & 3 & 9 & 2 \end{array}\right] \approx \left[\begin{array}{ccc|c} 1 & 1 & 1 & 6 \\ 0 & 1 & 3 & -3 \\ 0 & 2 & 8 & -4 \end{array}\right] \approx \left[\begin{array}{ccc|c} 1 & 1 & 1 & 6 \\ 0 & 1 & 3 & -3 \\ 0 & 0 & 2 & 2 \end{array}\right].$$
Now use backward substitution to get the solution of the system (Fig-
ure 1.4),
2a2 = 2 gives a2 = 1

a1 + 3a2 = −3 gives a1 = −6

a0 + a1 + a2 = 6 gives a0 = 11.
Thus,
p(x) = 11 − 6x + x2
is the required polynomial. •
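In MATLAB the coefficients can be recovered either by solving this Vandermonde-type system with the backslash operator or with the built-in polyfit; a small sketch:

>> x = [1 2 3]'; y = [6 3 2]';
>> V = [x.^0 x.^1 x.^2];       % columns follow ascending powers of x
>> a = V\y                     % a = [11 -6 1]', i.e., p(x) = 11 - 6x + x^2
>> p = polyfit(x, y, 2)        % same coefficients, highest power first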

Figure 1.4: Fitting a graph to data points of Example 1.59.

Electrical Network Analysis

Systems of linear equations are used to determine the currents through


various branches of electrical networks. The following two laws, which are
based on experimental verification in the laboratory, lead to the equations.

Theorem 1.32 (Kirchhoff's Laws)

1. Junctions: All the current flowing into a junction must flow out of
it.
2. Paths: The sum of the IR terms (where I denotes current and R
resistance) in any direction around a closed path is equal to the total voltage
in the path in that direction. •

Example 1.60 Consider the electric network in Figure 1.5. Let us deter-
mine the currents through each branch of this network.

Solution. The batteries are 8 volts and 16 volts. The resistances are 1
ohm, 4 ohms, and 2 ohms. The current entering each battery will be the

Figure 1.5: Electrical circuit.

same as that leaving it.

Let the currents in the various branches of the given circuit be I1 , I2 ,


and I3 . Kirchhoff ’s Laws refer to junctions and closed paths. There are
two junctions in these circuits, namely, the points B and D. There are
three closed paths, namely ABDA, CBDC, and ABCDA. Apply the laws to
the junctions and paths.

Junctions

Junction B : I1 + I2 = I3
Junction D : I3 = I1 + I2
These two equations result in a single linear equation

I1 + I2 − I3 = 0.

Paths

Path ABDA: $2I_1 + 1I_3 + 2I_1 = 8$
Path CBDC: $4I_2 + 1I_3 = 16$

It is not necessary to look further at path ABCDA. We now have a system


of three linear equations in three unknowns, I1 , I2 , and I3 . Path ABCDA,
in fact, leads to an equation that is a combination of the last two equations;
there is no new information.

The problem thus reduces to solving the following system of three linear
equations in three variables I1 , I2 , and I3 :
I1 + I2 − I3 = 0
4I1 + I3 = 8
4I2 + I3 = 16.
Solve this system for I1 , I2 , and I3 using the Gauss elimination method:
$$\left[\begin{array}{ccc|c} 1 & 1 & -1 & 0 \\ 4 & 0 & 1 & 8 \\ 0 & 4 & 1 & 16 \end{array}\right] \approx \left[\begin{array}{ccc|c} 1 & 1 & -1 & 0 \\ 0 & -4 & 5 & 8 \\ 0 & 4 & 1 & 16 \end{array}\right] \approx \left[\begin{array}{ccc|c} 1 & 1 & -1 & 0 \\ 0 & -4 & 5 & 8 \\ 0 & 0 & 6 & 24 \end{array}\right].$$
Now use backward substitution to get the solution of the system:
6I3 = 24 gives I3 = 4

−4I2 + 5I3 = 8 gives I2 = 3

I1 + I2 − I3 = 0 gives I1 = 1.
Thus, the currents are I1 = 1, I2 = 3, and I3 = 4. The units are amps.
The solution is unique, as is to be expected in this physical situation. •
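A quick MATLAB check of the circuit solution:

>> A = [1 1 -1; 4 0 1; 0 4 1];   % junction and path equations
>> b = [0 8 16]';
>> I = A\b                       % I = [1 3 4]': I1 = 1, I2 = 3, I3 = 4 amps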

Traffic Flow

Network analysis, as we saw in the previous discussion, plays an important


role in electrical engineering. In recent years, the concepts and tools of
network analysis have been found to be useful in many other fields, such
as information theory and the study of transportation systems. The fol-
lowing analysis of traffic flow through a road network during peak periods
illustrates how systems of linear equations with many solutions can arise
in practice.

Consider the typical road network in Figure 1.6. It represents an area


of downtown Jacksonville, Florida. The streets are all one-way with the
arrows indicating the direction of traffic flow. The flow of traffic in and out
of the network is measured in vehicles per hour (vph). The figures given
here are based on midweek peak traffic hours, 7 A.M. to 9 A.M. and 4
P.M. to 6 P.M. An increase of 2% in the overall flow should be allowed for
during Friday evening traffic flow. Let us construct a mathematical model
that can be used to analyze this network. Let the traffic flows along the

Figure 1.6: Downtown Jacksonville, Florida, USA.

various branches be x1 , . . . , x7 as shown in Figure 1.6.


Theorem 1.33 (Traffic Law)

All traffic entering a junction must leave that junction. •



This conservation of flow constraint (compare it to the first of Kirchhoff’s


Laws for electrical networks) leads to a system of linear equations:
Junction A : Traffic entering = 400 + 200
Traffic leaving = x1 + x5
Thus x1 + x5 = 600
Junction B : Traffic entering = x1 + x6
Traffic leaving = x2 + 100
Thus, x1 + x6 = x2 + 100.
Continuing thus for each junction and writing the resulting equations in
convenient form with variables on the left and constraints on the right, we
get the following system of linear equations:
Junction A : x1 +x5 = 600
Junction B : x1 −x2 +x6 = 100
Junction C : x2 −x7 = 500
Junction D : −x3 +x7 = 200
Junction E : −x3 +x4 +x6 = 800
Junction F : x4 +x5 = 600
The Gauss–Jordan elimination method is used to solve this system of equa-
tions. Observe that the augmented matrix contains many zeros. These
zeros greatly reduce the amount of computation involved. In practice,
networks are much larger than the one we have illustrated here, and the
systems of linear equations that describe them are thus much larger. The
systems are solved on a computer; however, the augmented matrices of all
such systems contain many zeros.

Solve this system for x1 , x2 , . . . , x7 using the Gauss–Jordan elimination


method:
$$\left[\begin{array}{ccccccc|c}
1 & 0 & 0 & 0 & 1 & 0 & 0 & 600 \\
1 & -1 & 0 & 0 & 0 & 1 & 0 & 100 \\
0 & 1 & 0 & 0 & 0 & 0 & -1 & 500 \\
0 & 0 & -1 & 0 & 0 & 0 & 1 & 200 \\
0 & 0 & -1 & 1 & 0 & 1 & 0 & 800 \\
0 & 0 & 0 & 1 & 1 & 0 & 0 & 600
\end{array}\right] \approx \cdots \approx
\left[\begin{array}{ccccccc|c}
1 & 0 & 0 & 0 & 0 & 1 & -1 & 600 \\
0 & 1 & 0 & 0 & 0 & 0 & -1 & 500 \\
0 & 0 & 1 & 0 & 0 & 0 & -1 & -200 \\
0 & 0 & 0 & 1 & 0 & 1 & -1 & 600 \\
0 & 0 & 0 & 0 & 1 & -1 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{array}\right].$$
The system of equations that corresponds to this form is:

x1 + x6 − x7 = 600
x2 − x7 = 500
x3 − x7 = −200
x4 + x6 − x7 = 600
x5 − x6 + x7 = 0.

Expressing each leading variable in terms of the remaining variables, we


get
x1 = −x6 + x7 + 600
x2 = x7 + 500
x3 = x7 − 200
x4 = −x6 + x7 + 600
x5 = x6 − x7 .
As was perhaps to be expected, the system of equations has many solutions—
there are many traffic flows possible. One does have a certain amount of
choice at intersections.

Let us now use this mathematical model to arrive at information. Sup-


pose it becomes necessary to perform road work on the stretch of Adams
Street between Laura and Hogan. It is desirable to have as small a flow of
traffic as possible along this stretch of road. The flows can be controlled
along various branches by means of traffic lights at junctions. What is the
minimum flow possible along Adams that would not lead to traffic conges-
tion? What are the flows along the other branches when this is attained?
Our model will enable us to answer these questions.

Minimizing the flow along Adams corresponds to minimizing x7 . Since


all traffic flows must be greater than or equal to zero, the third equation
implies that the minimum value of x7 is 200, otherwise, x3 could become
negative. (A negative flow would be interpreted as traffic moving in the
opposite direction to the one permitted on a one-way street.) Thus, the
road work must allow for a flow of at least 200 cars per hour on the branch
CD in the peak period.

Let us now examine what the flows in the other branches will be when this minimum flow along Adams is attained. Substituting x7 = 200 gives

x1 = −x6 + 800
x2 = 700
x3 = 0
x4 = −x6 + 800
x5 = x6 − 200.

Since x7 = 200 implies that x3 = 0 and vice-versa, we see that the minimum
flow in branch x7 can be attained by making x3 = 0; i.e., by closing branch
DE to traffic. •
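The row reduction above is easy to reproduce with MATLAB's rref applied to the augmented matrix of the junction equations:

>> Aug = [1  0  0 0 1 0  0 600;
          1 -1  0 0 0 1  0 100;
          0  1  0 0 0 0 -1 500;
          0  0 -1 0 0 0  1 200;
          0  0 -1 1 0 1  0 800;
          0  0  0 1 1 0  0 600];
>> R = rref(Aug)    % reduced form; x6 and x7 appear as free variables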

1.6.2 Heat Conduction


Another typical application of linear systems is in heat-transfer problems
in physics and engineering.

Suppose we have a thin rectangular metal plate whose edges are kept at
fixed temperatures. As an example, let the left edge be 0°C, the right edge 2°C, and the top and bottom edges 1°C (Figure 1.7). We want to know
the temperature inside the plate. There are several ways of approaching
this kind of problem. The simplest approach of interest to us will be
the following type of approximation: we shall overlay our plate with finer
and finer grids, or meshes. The intersections of the mesh lines are called
mesh points. Mesh points are divided into boundary and interior points,
depending on whether they lie on the boundary or the interior of the plate.
We may consider these points as heat elements, such that each influences
its neighboring points. We need the temperature of the interior points,

Figure 1.7: Heat-transfer problem.

given the temperature of the boundary points. It is obvious that the finer
the grid, the better the approximation of the temperature distribution of
the plate. To compute the temperature of the interior points, we use the
following principle.
Theorem 1.34 (Mean Value Property for Heat Conduction)

The temperature at any interior point is the average of the temperatures of


its neighboring points. •
Suppose, for simplicity, we have only four interior points with unknown
temperatures x1 , x2 , x3 , x4 , and 12 boundary points (not named) with the
temperatures indicated in Figure 1.7.
Example 1.61 Compute the unknown temperatures x1 , x2 , x3 , x4 using Fig-
ure 1.7.

Solution. According to the mean value property, we have


$$x_1 = \tfrac{1}{4}(x_2 + x_3 + 1)$$
$$x_2 = \tfrac{1}{4}(x_1 + x_4 + 3)$$
$$x_3 = \tfrac{1}{4}(x_1 + x_4 + 1)$$
$$x_4 = \tfrac{1}{4}(x_2 + x_3 + 3).$$
The problem thus reduces to solving the following system of four linear
equations in four variables x1 , x2 , x3 , and x4 :
4x1 − x2 − x3 = 1
−x1 + 4x2 − x4 = 3
−x1 + 4x3 − x4 = 1
− x2 − x3 + 4x4 = 3.
Solve this system for x1 , x2 , x3 , and x4 using the Gauss elimination method:
$$\left[\begin{array}{cccc|c} 4 & -1 & -1 & 0 & 1 \\ -1 & 4 & 0 & -1 & 3 \\ -1 & 0 & 4 & -1 & 1 \\ 0 & -1 & -1 & 4 & 3 \end{array}\right] \approx \cdots \approx \left[\begin{array}{cccc|c} 4 & -1 & -1 & 0 & 1 \\ 0 & \frac{15}{4} & -\frac{1}{4} & -1 & \frac{13}{4} \\ 0 & 0 & \frac{56}{15} & -\frac{16}{15} & \frac{22}{15} \\ 0 & 0 & 0 & \frac{24}{7} & \frac{30}{7} \end{array}\right].$$
Now use backward substitution to get the solution of the system:
$$\tfrac{24}{7}x_4 = \tfrac{30}{7} \quad \text{gives} \quad x_4 = \tfrac{5}{4}$$
$$\tfrac{56}{15}x_3 - \tfrac{16}{15}x_4 = \tfrac{22}{15} \quad \text{gives} \quad x_3 = \tfrac{3}{4}$$
$$\tfrac{15}{4}x_2 - \tfrac{1}{4}x_3 - x_4 = \tfrac{13}{4} \quad \text{gives} \quad x_2 = \tfrac{5}{4}$$
$$4x_1 - x_2 - x_3 = 1 \quad \text{gives} \quad x_1 = \tfrac{3}{4}.$$
Thus, the temperatures are $x_1 = \tfrac{3}{4}$, $x_2 = \tfrac{5}{4}$, $x_3 = \tfrac{3}{4}$, and $x_4 = \tfrac{5}{4}$. •
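A MATLAB check of the temperatures:

>> A = [4 -1 -1 0; -1 4 0 -1; -1 0 4 -1; 0 -1 -1 4];
>> b = [1 3 1 3]';
>> x = A\b        % x = [0.75 1.25 0.75 1.25]'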

1.6.3 Chemical Solutions and


Balancing Chemical Equations
Example 1.62 (Chemical Solutions) It takes three different ingredi-
ents, A, B, and C, to produce a certain chemical substance. A, B, and
C have to be dissolved in water separately before they interact to form
the chemical. The solution containing A at 2.5g per cubic centimeter
(g/cm3 ) combined with the solution containing B at 4.2g/cm3 , combined
with the solution containing C at 5.6g/cm3 , makes 26.50g of the chemical.
If the proportions for A, B, C in these solutions are changed to 3.4, 4.7, and
2.8g/cm3 , respectively (while the volumes remain the same), then 22.86g of
the chemical is produced. Finally, if the proportions are changed to 3.6, 6.1,
and 3.7g/cm3 , respectively, then 29.12g of the chemical is produced. What
are the volumes in cubic centimeters of the solutions containing A, B, and
C?

Solution. Let x, y, z be the cubic centimeters of the corresponding volumes


of the solutions containing A, B, and C. Then 2.5x is the mass of A in
the first case, 4.2y is the mass of B, and 5.6z is the mass of C. Added
together, the three masses should be 26.50. So 2.5x + 4.2y + 5.6z = 26.50.
The same reasoning applies to the other two cases, and we get the system

2.5x + 4.2y + 5.6z = 26.50


3.4x + 4.7y + 2.8z = 22.86
3.6x + 6.1y + 3.7z = 29.12.

Solve this system for x, y, and z using the Gauss elimination method:
$$\left[\begin{array}{ccc|c} 2.5 & 4.2 & 5.6 & 26.50 \\ 3.4 & 4.7 & 2.8 & 22.86 \\ 3.6 & 6.1 & 3.7 & 29.12 \end{array}\right] \approx \left[\begin{array}{ccc|c} 2.5 & 4.2 & 5.6 & 26.50 \\ 0 & -1.012 & -4.816 & -13.18 \\ 0 & 0.052 & -4.364 & -9.04 \end{array}\right] \approx \left[\begin{array}{ccc|c} 2.5 & 4.2 & 5.6 & 26.50 \\ 0 & -1.012 & -4.816 & -13.18 \\ 0 & 0 & -4.612 & -9.717 \end{array}\right].$$
Now use backward substitution to get the solution of the system:

−4.612z = −9.717 gives z = 2.107


−1.012y − 4.816z = −13.18 gives y = 2.996
2.5x + 4.2y + 5.6z = 26.50 gives x = 0.847.

Hence, the volumes of the solutions containing A, B, and C are, respec-


tively, 0.847cm3 , 2.996cm3 , and 2.107cm3 . •
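The volumes are quickly confirmed in MATLAB:

>> A = [2.5 4.2 5.6; 3.4 4.7 2.8; 3.6 6.1 3.7];
>> b = [26.50 22.86 29.12]';
>> v = A\b        % v is approximately [0.847 2.996 2.107]'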

Balancing Chemical Equations

When a chemical reaction occurs, certain molecules (the reactants) com-


bine to form new molecules (the products). A balanced chemical equation
is an algebraic equation that gives the relative numbers of reactants and
products in the reaction and has the same number of atoms of each type
on the left- and right-hand sides. The equation is usually written with the
reactants on the left, the products on the right, and an arrow in between
to show the direction of the reaction.

For example, for the reaction in which hydrogen gas (H2 ) and oxygen
(O2 ) combine to form water (H2 O), a balanced chemical equation is

2H2 + O2 −→ 2H2 O,

indicating that two molecules of hydrogen combine with one molecule of


oxygen to form two molecules of water. Observe that the equation is bal-
anced, since there are four hydrogen atoms and two oxygen atoms on each
side. Note that there will never be a unique balanced equation for a reac-
tion, since any positive integer multiple of a balanced equation will also be
balanced. For example, 6H2 + 3O2 −→ 6H2 O is also balanced. Therefore,
we usually look for the simplest balanced equation for a given reaction.
Note that the process of balancing chemical equations really involves solv-
ing a homogeneous system of linear equations.

Example 1.63 (Balancing Chemical Equations) The combustion of


ammonia (N H3 ) in oxygen produces nitrogen (N2 ) and water. Find a bal-
anced chemical equation for this reaction.

Solution. Let w, x, y, and z denote the numbers of molecules of ammonia,


oxygen, nitrogen, and water, respectively, then we are seeking an equation
of the form
wN H3 + xO2 −→ yN2 + zH2 O.
Comparing the number of nitrogen, hydrogen, and oxygen atoms in the
reactants and products, we obtain three linear equations:

Nitrogen: w = 2y
Hydrogen: 3w = 2z
Oxygen: 2x = z.

Rewriting these equations in standard form gives us a homogeneous system


of three equations in four variables:

w − 2y = 0
3w − 2z = 0
2x − z = 0.

The augmented matrix form of the system is

$$\left[\begin{array}{cccc|c} 1 & 0 & -2 & 0 & 0 \\ 3 & 0 & 0 & -2 & 0 \\ 0 & 2 & 0 & -1 & 0 \end{array}\right].$$

Solve this system for w, x, y, and z using the Gauss elimination method
with partial pivoting:

$$\left[\begin{array}{cccc|c} 3 & 0 & 0 & -2 & 0 \\ 0 & 0 & -2 & \frac{2}{3} & 0 \\ 0 & 2 & 0 & -1 & 0 \end{array}\right] \approx \left[\begin{array}{cccc|c} 3 & 0 & 0 & -2 & 0 \\ 0 & 2 & 0 & -1 & 0 \\ 0 & 0 & -2 & \frac{2}{3} & 0 \end{array}\right].$$

Now use backward substitution to get the solution of the homogeneous system:
$$-2y + \tfrac{2}{3}z = 0 \quad \text{gives} \quad y = \tfrac{1}{3}z$$
$$2x - z = 0 \quad \text{gives} \quad x = \tfrac{1}{2}z$$
$$3w - 2z = 0 \quad \text{gives} \quad w = \tfrac{2}{3}z.$$
The smallest positive value of z that will produce integer values for all four variables is the least common denominator of the fractions 2/3, 1/2, and 1/3—namely, 6—which gives

w = 4, x = 3, y = 2, z = 6.

Therefore,
4N H3 + 3O2 −→ 2N2 + 6H2 O
is the balanced chemical equation. •
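Balancing amounts to computing the null space of the coefficient matrix of the homogeneous system. One way to do this in MATLAB (the scaling step mirrors the least-common-denominator argument above):

>> A = [1 0 -2 0; 3 0 0 -2; 0 2 0 -1];   % rows: nitrogen, hydrogen, oxygen balance
>> n = null(A, 'r')                      % rational basis: [2/3 1/2 1/3 1]'
>> c = 6*n                               % scale by 6: c = [4 3 2 6]'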

1.6.4 Manufacturing, Social, and Financial Issues


Example 1.64 (Manufacturing) Sun Microsystems manufactures three
types of personal computers: The Cyclone, the Cyclops, and the Cycloid.
It takes 15 hours to assemble the Cyclone, 4 hours to test its hardware, and
5 hours to install its software. The hours required for the Cyclops are 12
hours to assemble, 4.5 hours to test, and 2.5 hours to install. The Cycloid,
being the lower end of the line, requires 10 hours to assemble, 3 hours to
test, and 2.5 hours to install. If the company’s factory can afford 1250
labor hours per month for assembling, 400 hours for testing, and 320 hours
for installation, how many PCs of each kind can be produced in a month?

Solution. Let x, y, z be the number of Cyclones, Cyclops, and Cycloids


produced each month. Then it takes 15x + 12y + 10z hours to assemble the
computers. Hence, 15x + 12y + 10z = 1250. Similarly, we get equations for
testing and installing. The resulting system is
15x + 12y + 10z = 1250
4x + 4.5y + 3z = 400
5x + 2.5y + 2.5z = 320.

Solve this system for x, y, and z using the Gauss elimination method:
 .. 
  15 12 10 . 1250 
..

15 12 10 . 1250  
 ..  
 0 13 1 .
. 200 
 4 4.5 3 . 400  ≈ 
 . 
10 3 3
 
..
 
5 2.5 2.5 . 320
 
3 5 .. 290
 
0 − − . −
2 6 3
 . 
15 12 10 .. 1250
 
 
 13 1 .
.. 200 
≈ 0 .
 
 10 3 3 
 
35 .. 770
 
0 0 − . −
78 39
Now use backward substitution to get the solution of the system:
$$-\tfrac{35}{78}z = -\tfrac{770}{39} \quad \text{gives} \quad z = 44$$
$$\tfrac{13}{10}y + \tfrac{1}{3}z = \tfrac{200}{3} \quad \text{gives} \quad y = 40$$
$$15x + 12y + 10z = 1250 \quad \text{gives} \quad x = 22.$$

Hence, 22 Cyclones, 40 Cyclops, and 44 Cycloids can be manufactured monthly. •
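A MATLAB check of the production numbers:

>> A = [15 12 10; 4 4.5 3; 5 2.5 2.5];
>> b = [1250 400 320]';
>> x = A\b        % x = [22 40 44]'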

Example 1.65 (Weather) The average of the temperature for the cities
of Jeddah, Makkah, and Riyadh was 50°C during a given summer day. The temperature in Makkah was 5°C higher than the average of the temperatures of the other two cities. The temperature in Riyadh was 5°C lower than the
average temperature of the other two cities. What was the temperature in
each of the cities?

Solution. Let x, y, z be the temperatures in Jeddah, Makkah, and Riyadh,


respectively. The average temperature of all three cities is $\frac{x+y+z}{3}$, which is

50°C. On the other hand, the temperature in Makkah exceeds the average temperature of Jeddah and Riyadh, $\frac{x+z}{2}$, by 5°C. So, $y = \frac{x+z}{2} + 5$. Likewise, we have $z = \frac{x+y}{2} - 5$. So, the system becomes

$$\frac{x+y+z}{3} = 50$$
$$y = \frac{x+z}{2} + 5$$
$$z = \frac{x+y}{2} - 5.$$
Rewriting the above system in standard form, we get

x + y + z = 150
−x + 2y − z = 10
−x − y + 2z = −10.

Solve this system for x, y, and z using the Gauss elimination method:
$$\left[\begin{array}{ccc|c} 1 & 1 & 1 & 150 \\ -1 & 2 & -1 & 10 \\ -1 & -1 & 2 & -10 \end{array}\right] \approx \left[\begin{array}{ccc|c} 1 & 1 & 1 & 150 \\ 0 & 3 & 0 & 160 \\ 0 & 0 & 3 & 140 \end{array}\right].$$

Now use backward substitution to get the solution of the system:

3z = 140 gives z = 46.667


3y = 160 gives y = 53.333
x + y + z = 150 gives x = 50.

Thus, the temperature in Jeddah was 50°C and the temperatures in Makkah and Riyadh were approximately 53°C and 47°C, respectively. •

Example 1.66 (Foreign Currency Exchange) An international busi-


ness person needs, on the average, fixed amounts of Pakistani rupees, En-
glish pounds, and Saudi riyals during each of his business trips. He traveled
three times this year. The first time he exchanged a total of $26000 at the

following rates: the dollar was 60 rupees, 0.6 pounds, and 3.75 riyals. The
second time he exchanged a total of $25500 at these rates: the dollar was
65 rupees, 0.56 pounds, and 3.76 riyals. The third time he exchanged again
a total of $25500 at these rates: the dollar was 65 rupees, 0.6 pounds, and
3.75 riyals. How many rupees, pounds, and riyals did he buy each time?

Solution. Let x, y, z be the fixed amounts of rupees, pounds, and riyals he purchases each time. Then the first time he spent $\frac{1}{60}x$ dollars to buy rupees, $\frac{1}{0.6}y$ dollars to buy pounds, and $\frac{1}{3.75}z$ dollars to buy riyals. Hence,
$$\frac{1}{60}x + \frac{1}{0.6}y + \frac{1}{3.75}z = 26000.$$

The same reasoning applies to the other two purchases, and we get the
system

$$\frac{1}{60}x + \frac{5}{3}y + \frac{4}{15}z = 26000$$
$$\frac{1}{65}x + \frac{25}{14}y + \frac{25}{94}z = 25500$$
$$\frac{1}{65}x + \frac{5}{3}y + \frac{4}{15}z = 25500.$$

Solve this system for x, y, and z using the Gauss elimination method:

$$\left[\begin{array}{ccc|c} \frac{1}{60} & \frac{5}{3} & \frac{4}{15} & 26000 \\ \frac{1}{65} & \frac{25}{14} & \frac{25}{94} & 25500 \\ \frac{1}{65} & \frac{5}{3} & \frac{4}{15} & 25500 \end{array}\right] \approx \left[\begin{array}{ccc|c} \frac{1}{60} & \frac{5}{3} & \frac{4}{15} & 26000 \\ 0 & \frac{45}{182} & \frac{121}{6110} & 1500 \\ 0 & \frac{5}{39} & \frac{4}{195} & 1500 \end{array}\right] \approx \left[\begin{array}{ccc|c} \frac{1}{60} & \frac{5}{3} & \frac{4}{15} & 26000 \\ 0 & \frac{45}{182} & \frac{121}{6110} & 1500 \\ 0 & 0 & \frac{13}{1269} & \frac{6500}{9} \end{array}\right].$$
Now use backward substitution to get the solution of the system:
$$\tfrac{13}{1269}z = \tfrac{6500}{9} \quad \text{gives} \quad z = 70500$$
$$\tfrac{45}{182}y + \tfrac{121}{6110}z = 1500 \quad \text{gives} \quad y = 420$$
$$\tfrac{1}{60}x + \tfrac{5}{3}y + \tfrac{4}{15}z = 26000 \quad \text{gives} \quad x = 390000.$$
Therefore, each time he bought 390000 rupees, 420 pounds, and 70500 riyals
for his trips. •
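In MATLAB it is easiest to build the coefficient matrix directly from the exchange rates:

>> A = [1/60 1/0.6 1/3.75; 1/65 1/0.56 1/3.76; 1/65 1/0.6 1/3.75];
>> b = [26000 25500 25500]';
>> x = A\b        % x is approximately [390000 420 70500]'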
Example 1.67 (Inheritance) A father plans to distribute his estate, worth
SR234,000, between his four daughters as follows: 2/3 of the estate is to be
split equally among the daughters. For the rest, each daughter is to receive
SR3,000 for each year that remains until her 21st birthday. Given that
the daughters are all 3 years apart, how much would each receive from her
father’s estate? How old are the daughters now?

Solution. Let x, y, z, and w be the amounts of money that each daughter


will receive from the splitting of 1/3 of the estate, according to age, starting with the oldest one. Then $x + y + z + w = \frac{1}{3}(234{,}000) = 78{,}000$. On the
other hand, w − z = 3(3000), z − y = 3(3000), and y − x = 3(3000). The
problem thus reduces to solving the following system of four linear equations
in four variables x, y, z, and w:
x + y + z + w = 78, 000
− z + w = 9, 000
− y + z = 9, 000
−x + y = 9, 000.

Solve this system for x, y, z, and w using the Gauss elimination method with partial pivoting:

$$\left[\begin{array}{cccc|c} 1 & 1 & 1 & 1 & 78{,}000 \\ 0 & 0 & -1 & 1 & 9{,}000 \\ 0 & -1 & 1 & 0 & 9{,}000 \\ -1 & 1 & 0 & 0 & 9{,}000 \end{array}\right] \approx \left[\begin{array}{cccc|c} 1 & 1 & 1 & 1 & 78{,}000 \\ 0 & 0 & -1 & 1 & 9{,}000 \\ 0 & -1 & 1 & 0 & 9{,}000 \\ 0 & 2 & 1 & 1 & 87{,}000 \end{array}\right]$$
and
$$\left[\begin{array}{cccc|c} 1 & 1 & 1 & 1 & 78{,}000 \\ 0 & 2 & 1 & 1 & 87{,}000 \\ 0 & 0 & \frac{3}{2} & \frac{1}{2} & 52{,}500 \\ 0 & 0 & -1 & 1 & 9{,}000 \end{array}\right] \approx \left[\begin{array}{cccc|c} 1 & 1 & 1 & 1 & 78{,}000 \\ 0 & 2 & 1 & 1 & 87{,}000 \\ 0 & 0 & \frac{3}{2} & \frac{1}{2} & 52{,}500 \\ 0 & 0 & 0 & \frac{4}{3} & 44{,}000 \end{array}\right].$$
3
Now use backward substitution to get the solution of the system:
$$\tfrac{4}{3}w = 44{,}000 \quad \text{gives} \quad w = 33{,}000$$
$$\tfrac{3}{2}z + \tfrac{1}{2}w = 52{,}500 \quad \text{gives} \quad z = 24{,}000$$
$$2y + z + w = 87{,}000 \quad \text{gives} \quad y = 15{,}000$$
$$x + y + z + w = 78{,}000 \quad \text{gives} \quad x = 6{,}000.$$

One-quarter of two-thirds of the estate is worth $\frac{1}{4}\bigl(\frac{2}{3}(234{,}000)\bigr) = \text{SR}39{,}000$.


So, the youngest daughter will receive (33, 000 + 39, 000) = SR72, 000, the
next one (24, 000 + 39, 000) = SR63, 000, the next one (15, 000 + 39, 000) =
SR54, 000, and the first one (6, 000 + 39, 000) = SR45, 000. The oldest
daughter will receive 6, 000 = 2(3, 000), so she is currently 21 − 2 = 19.
The second one is 16, the third one is 13, and the last one is 10 years old.
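A MATLAB check of the four shares:

>> A = [1 1 1 1; 0 0 -1 1; 0 -1 1 0; -1 1 0 0];
>> b = [78000 9000 9000 9000]';
>> s = A\b        % s = [6000 15000 24000 33000]'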


1.6.5 Allocation of Resources


A great many applications of systems of linear equations involve allocating
limited resources subject to a set of constraints.

Example 1.68 A dietitian is to arrange a special diet composed of four


foods A, B, C, and D. The diet is to include 72 units of calcium, 45 units
of iron, 42 units of vitamin A, and 60 units of vitamin B. The following table shows the amount of calcium, iron, vitamin A, and vitamin B (in units) per ounce in foods A, B, C, and D. Find, if possible, the amount of foods A, B, C, and D that can be included in the special diet to conform to the dietitian's recommendations.

Food   Calcium   Iron   Vitamin A   Vitamin B
A        18        6        6           6
B         9        6       12           9
C         9        9        6           9
D        12       12        9          18

Solution. Let x, y, z, and w be the ounces of foods A, B, C, and D,


respectively. Then we have the system of equations

18x + 9y + 9z + 12w = 72
6x + 6y + 9z + 12w = 45
6x + 12y + 6z + 9w = 42
6x + 9y + 9z + 18w = 60.

Solve this system for x, y, z, and w using the Gauss elimination method:

$$\left[\begin{array}{cccc|c} 18 & 9 & 9 & 12 & 72 \\ 6 & 6 & 9 & 12 & 45 \\ 6 & 12 & 6 & 9 & 42 \\ 6 & 9 & 9 & 18 & 60 \end{array}\right] \approx \left[\begin{array}{cccc|c} 18 & 9 & 9 & 12 & 72 \\ 0 & 3 & 6 & 8 & 21 \\ 0 & 9 & 3 & 5 & 18 \\ 0 & 6 & 6 & 14 & 36 \end{array}\right]$$
and
$$\left[\begin{array}{cccc|c} 18 & 9 & 9 & 12 & 72 \\ 0 & 3 & 6 & 8 & 21 \\ 0 & 0 & -15 & -19 & -45 \\ 0 & 0 & -6 & -2 & -6 \end{array}\right] \approx \left[\begin{array}{cccc|c} 18 & 9 & 9 & 12 & 72 \\ 0 & 3 & 6 & 8 & 21 \\ 0 & 0 & -15 & -19 & -45 \\ 0 & 0 & 0 & \frac{28}{5} & 12 \end{array}\right].$$
Now use backward substitution to get the solution of the system:
$$\tfrac{28}{5}w = 12 \quad \text{gives} \quad w = \tfrac{15}{7}$$
$$-15z - 19w = -45 \quad \text{gives} \quad z = \tfrac{2}{7}$$
$$3y + 6z + 8w = 21 \quad \text{gives} \quad y = \tfrac{5}{7}$$
$$18x + 9y + 9z + 12w = 72 \quad \text{gives} \quad x = \tfrac{29}{14}.$$
Thus, the amounts in ounces of foods A, B, C, and D are $x = \tfrac{29}{14}$, $y = \tfrac{5}{7}$, $z = \tfrac{2}{7}$, and $w = \tfrac{15}{7}$, respectively. •
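A MATLAB check of the diet amounts:

>> A = [18 9 9 12; 6 6 9 12; 6 12 6 9; 6 9 9 18];
>> b = [72 45 42 60]';
>> x = A\b        % x is approximately [2.0714 0.7143 0.2857 2.1429]'
>> format rat; x  % displays 29/14, 5/7, 2/7, 15/7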

1.7 Summary
The basic methods for solving systems of linear algebraic equations were
discussed in this chapter. Since these methods use matrices and determi-
nants, the basic properties of matrices and determinants were presented.

Several direct solution methods were also discussed. Among them were
Cramer’s rule, Gaussian elimination and its variants, the Gauss–Jordan
method, and the LU decomposition method. Cramer’s rule is impracti-
cal for solving systems with more than three or four equations. Gaussian
elimination is the best choice for solving linear systems. For systems of
equations having a constant coefficients matrix but many right-hand side

vectors, LU decomposition is the method of choice. The LU decomposi-


tion method has been used for the solution of tridiagonal systems. Direct
methods are generally used when the number of equations is small, or most
of the coefficients of the equations are nonzero, or the system of equations
is not diagonally dominant, or the system of equations is ill-conditioned.
But these methods are generally impractical when a large number of equa-
tions must be solved simultaneously. In this chapter we also discussed
conditioning of linear systems by using a parameter called the condition
number. Many ill-conditioned systems were discussed. The coefficient ma-
trix A of an ill-conditioned system Ax = b has a large condition number.
The numerical solution to a linear system is less reliable when A has a
large condition number than when A has a small condition number. The
numerical solution x∗ of Ax = b is different from the exact solution x be-
cause of round-off errors in all stages of the solution process. The round-off
errors occur in the elimination or factorization of A and during backward
substitution to compute x∗ . The degree to which perturbation in A and
b affect the numerical solution is determined by the value of the condition
number K(A). A large value of K(A) indicates that A is close to being
singular. When K(A) is large, matrix A is said to be ill-conditioned and
small perturbations in A and b cause relatively large differences between x
and x∗ . If K(A) is small, any stable algorithm will return a solution with
small residual r, while if K(A) is large, then the return solution may have
large errors even though the residuals are small. The best way to deal with
ill-conditioning is to avoid it by reformulating the problem.

At the end of the chapter we discussed many applications of linear


systems. Fitting a polynomial of degree (n − 1) to n data points leads
to a system of linear equations that has a unique solution. The analysis
of electric networks and traffic flow give rise to systems that have unique
solutions and many solutions. The model for traffic flow is similar to that
of electric networks, but it has fewer restrictions, leading to more freedom
and thus many solutions in place of a unique solution. Applications to
heat conduction, chemical reactions, balancing equations, manufacturing,
social and financial issues, and allocation of resources were also covered.

1.8 Problems
1. Determine the matrix C given by the following expression

C = 2A − 3B,

if the matrices A and B are


   
2 −1 1 1 1 1
A =  −1 2 3 , B =  0 1 3 .
2 1 2 2 1 4

2. Find the product AB and BA for the matrices of Problem 1.

3. Show that the product AB of the following rectangular matrices is a


singular matrix:
 
$$A = \begin{bmatrix} 6 & -3 \\ 1 & 4 \\ -2 & 1 \end{bmatrix}, \qquad B = \begin{bmatrix} 2 & -1 & -2 \\ 3 & -4 & -1 \end{bmatrix}.$$

4. Let
     
1 2 3 1 1 2 1 0 1
A =  0 −1 2  , B =  −1 1 −1  , C =  0 1 2 .
2 0 2 1 0 2 2 0 1

(a) Compute AB and BA and show that AB 6= BA.


(b) Find (A + B) + C and A + (B + C).
(c) Show that (AB)T = B T AT .

5. Find a value of x and y such that AB T = C T , where


 
1 2 3
A =  4 2 0 , B = [1 x 1], C = [−2 − 2 y].
2 1 3

6. Find the values of a and b such that each of the following matrices is
symmetric:
   
1 3 5 −2 a + b 2
(a) A =  a + 2 5 6  , (b) B =  3 4 2a + b  ,
b+1 6 7 2 5 −3
   
1 4 a−b 1 a − 4b 2
(c) C =  4 2 a + 3b  , (d) D =  2 8 6 .
7 3 4 7 a − 7b 8

7. Which of the following matrices are skew symmetric?

(a)    
1 −5 0 −4
A= , B= ,
5 0 4 0
(b)    
1 9 1 6
C= , D= ,
−9 7 −6 2
(c)    
0 2 −2 3 −3 −3
E =  −2 0 4 , F = 3 3 −3  ,
2 −4 0 3 3 3
(d)    
1 −5 1 2 8 6
G= 5 1 4 , H =  −8 4 2 .
−1 −4 1 −6 −2 5

8. Determine whether each of the following matrices is in row echelon


form, reduced row echelon form, or neither:

(a)    
1 0 0 1 0 8
A =  0 1 0 , B =  0 1 2 ,
0 0 3 0 0 0

(b)    
1 2 3 0 1 2 0 0 1
C =  0 0 0 1 , D =  0 0 1 0 1 ,
0 0 0 1 0 0 0 1 0
(c)
$$E = \begin{bmatrix} 1 & 4 & 5 & 6 \\ 0 & 1 & 7 & 8 \\ 0 & 0 & 1 & 9 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \qquad F = \begin{bmatrix} 1 & 0 & 0 & 0 & 3 \\ 0 & 0 & 1 & 0 & 4 \\ 0 & 0 & 0 & 1 & 5 \end{bmatrix},$$
(d)    
1 0 0 3 0 0 0 0 0
 0 1 0 4   0 0 1 2 4 
G=
 0
, H= .
0 0 5   0 0 0 1 0 
0 0 0 6 0 0 0 0 0
9. Find the row echelon form of each of the following matrices using
elementary row operations, and then solve the linear system:

(a)    
0 1 2 1
A =  2 3 4 , b =  −1  .
1 3 2 2
(b)    
1 2 3 1
A =  0 3 1 , b =  0 .
−1 4 5 −3
(c)    
0 −1 0 1
A= 3 0 1 , b =  3 .
0 1 1 2
(d)   

0 −1 2 4 2
 2 3 5 6   1 
A= , b=
 −1  .

 1 3 −2 4 
1 2 −1 3 2

10. Find the row echelon form of each of the following matrices using
elementary row operations, and then solve the linear system:

(a)
   
1 4 −2 5
A =  2 3 2 , b =  −3  .
6 4 1 4

(b)
   
2 2 7 3
A =  0 3 2 , b =  2 .
3 2 1 5

(c)
   
0 −1 0 1
A= 5 0 2 , b =  1 .
−1 1 4 1

(d)
   
1 1 2 4 11
 1 3 4 5   7 
A=
 1
, b=
 6 .

4 2 4 
2 2 −1 3 4

11. Find the reduced row echelon form of each of the following matrices
using elementary row operations, and then solve the linear system:

(a)
   
1 2 3 4
A =  −1 2 1  , b =  3 .
0 1 2 1

(b)
   
0 1 4 1
A =  2 1 −1  , b =  1 .
1 3 4 −1

(c)    
0 −1 3 2 6
 3 2 5 4   4 
A=
 −1
, b=
 4 .

3 1 2 
2 3 4 1 4
(d)    
1 2 −4 1 1
 −2 0 2 3   −1 
A= , b=
 2 .

 0 1 −1 2 
2 3 0 −1 4

12. Compute the determinant of each of the following matrices using


cofactor expansion along any row or column:
     
$$A = \begin{bmatrix} \cos x & \sin x & 1 \\ 0 & 3\cos x & -3\sin x \\ 0 & 2\sin x & 2\cos x \end{bmatrix}, \quad B = \begin{bmatrix} x & y & z \\ 0 & x^2 & y \\ 0 & y^2 & x \end{bmatrix}, \quad C = \begin{bmatrix} 2x & 0 & z \\ 0 & 2y & -z \\ z & -z & 2z \end{bmatrix}.$$

13. Compute the determinant of each of the following matrices using


cofactor expansion along any row or column:
     
3 7 6 11 −6 4 4 −8 11
A =  0 3 5  , B =  −16 8 6  , C =  10 1 4 .
7 4 3 5 7 12 7 10 8

14. Let    
1 1 1 0
A= , B= ,
0 1 1 1
then show that (AB)−1 = B −1 A−1 .

15. Evaluate the determinant of each of the following matrices using the
Gauss elimination method:
     
3 1 −1 4 1 6 17 46 7
A= 2 0 4  , B =  −3 6 4 , C =  20 49 8  .
1 −5 1 5 0 9 23 52 19

16. Evaluate the determinant of each of the following matrices using the
Gauss elimination method:
   
$$A = \begin{bmatrix} 4 & 2 & 5 & -1 \\ 2 & 5 & 4 & 6 \\ 4 & 5 & 1 & 3 \\ 11 & 7 & 1 & 1 \end{bmatrix}, \qquad B = \begin{bmatrix} 4 & -2 & 5 & -3 \\ 1 & 8 & 12 & 7 \\ 1 & 4 & 3 & 6 \\ 5 & 3 & -3 & 6 \end{bmatrix},$$
$$C = \begin{bmatrix} 13 & 22 & -12 & 8 \\ 15 & 10 & 33 & 4 \\ 9 & -12 & 5 & 7 \\ 15 & 33 & -19 & 26 \end{bmatrix}, \qquad D = \begin{bmatrix} 9 & 11 & 2 & 8 \\ 15 & 1 & 3 & 12 \\ 9 & -12 & 5 & 17 \\ 13 & 17 & 21 & 15 \end{bmatrix}.$$

17. Find all zeros (values of x such that f (x) = 0) of polynomial f (x) =
det(A), where
 
x−1 3 2
A= 3 x 1 .
2 1 x−2

18. Find all zeros (values of x such that f (x) = 0) of polynomial f (x) =
det(A), where
 
x 0 1
A =  2 1 3 .
0 x 2

19. Find all zeros (values of x such that f (x) = 0) of polynomial f (x) =
det(A), where
 
$$A = \begin{bmatrix} x & -8 & 5 & 2 \\ -3 & x & 2 & 1 \\ 3 & 4 & x & 1 \\ 3 & 6 & -5 & 17 \end{bmatrix}.$$

20. (a) The matrix


 
−x 1 0
A= 0 −x 1 
−c0 −c1 −c2

is called the companion matrix of the polynomial $(-1)(c_2x^2 + c_1x + c_0)$.


Show that

$$|A| = \begin{vmatrix} -x & 1 & 0 \\ 0 & -x & 1 \\ -c_0 & -c_1 & -c_2 \end{vmatrix} = (-1)(c_2x^2 + c_1x + c_0).$$

(b) The matrix
$$A = \begin{bmatrix} 1 & 1 & 1 \\ x_1 & x_2 & x_3 \\ x_1^2 & x_2^2 & x_3^2 \end{bmatrix}$$
is called the Vandermonde matrix. It is a square matrix, and it is famously ill-conditioned. Show that

$$|A| = \begin{vmatrix} 1 & 1 & 1 \\ x_1 & x_2 & x_3 \\ x_1^2 & x_2^2 & x_3^2 \end{vmatrix} = (x_1 - x_2)(x_2 - x_3)(x_3 - x_1).$$

(c) A square matrix A is said to be a nilpotent matrix, if Ak = 0


for some positive integer k. Prove that if A is nilpotent, then the
determinant of A is zero.

(d) A square matrix A is said to be an idempotent matrix, if A2 = A.


Prove that if A is idempotent, then either det(A) = 1 or det(A) = 0.

(e) A square matrix A is said to be an involution matrix, if A2 = I.


Give an example of a 3 × 3 matrix that is an involution matrix.
21. Compute the adjoint of each matrix A, and find the inverse of it, if
it exists:
 
$$\text{(a)} \ A = \begin{bmatrix} 1 & 2 \\ -3 & 4 \end{bmatrix}, \qquad \text{(b)} \ A = \begin{bmatrix} 1 & 2 & -1 \\ 2 & 1 & 4 \\ 1 & 5 & -8 \end{bmatrix}, \qquad \text{(c)} \ A = \begin{bmatrix} 1 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 1 \end{bmatrix}.$$

22. Show that A(Adj A) = (Adj A)A = det(A)I3 , if


 
2 1 3
A =  −1 2 0 .
3 −2 1

23. Find the inverse and determinant of the adjoint matrix of each of the
following matrices:
     
4 1 5 3 4 −2 1 2 4
A =  5 6 3 , B =  2 5 4 , C =  1 4 0 .
5 4 4 7 −3 4 3 1 1

24. Find the inverse and determinant of the adjoint matrix of each of the
following matrices:
     
3 2 5 5 3 −2 1 2 3
A =  2 5 4 , B =  3 5 6 , C =  4 5 6 .
5 4 6 −2 6 5 7 8 8

25. Find the inverse of each of the following matrices using the determi-
nant:
 
    0 4 2 −4
0 1 5 2 4 −2  6 1 4 −3 
A =  3 1 2  , B =  −4 7 5 , C =   4
.
3 1 3 
2 3 4 5 −4 4
8 4 −3 2

26. Solve each of the following homogeneous linear systems:

(a)
x1 − 2x2 + x3 = 0
x1 + x2 + 3x3 = 0
2x1 + 3x2 − 5x3 = 0.

(b)
x1 − 5x2 + 3x3 = 0
2x1 + 3x2 + 2x3 = 0
x1 − 2x2 − 4x3 = 0.
(c)
3x1 + 4x2 − 2x3 = 0
2x1 − 5x2 − 4x3 = 0
3x1 − 2x2 + 3x3 = 0.
(d)
x1 + x2 + 3x3 − 2x4 = 0
x1 + 2x2 + 5x3 + x4 = 0
x1 − 3x2 + x3 + 2x4 = 0.

27. Find value(s) of α such that each of the following homogeneous linear
systems has a nontrivial solution:

(a)
2x1 − (1 − 3α)x2 = 0
x1 + αx2 = 0.
(b)
2x1 + 2αx2 − x3 = 0
x1 − 2x2 + x3 = 0
αx1 + 2x2 − 3x3 = 0.
(c)
x1 + 2x2 + 4x3 = 0
3x1 + 7x2 + αx3 = 0
3x1 + 3x2 + 15x3 = 0.
(d)
x1 + x2 + 2x3 − 3x4 = 0
x1 + 2x2 + x3 − 2x4 = 0
3x1 + x2 + αx3 + 3x4 = 0
2x1 + 3x2 + x3 + αx4 = 0.

28. Using the matrices in Problem 15, solve the following systems using
the matrix inversion method:

(a) Ax = [1, 1, −3]T , (b) Bx = [2, 1, 3]T , (c) Cx = [1, 0, 1]T .

29. Solve the following systems using the matrix inversion method:

(a)
x1 + 3x2 − x3 = 4
5x1 − 2x2 − x3 = −2
2x1 + 2x2 + x3 = 9.
(b)
x1 + x2 + 3x3 = 2
5x1 + 3x2 + x3 = 3
2x1 + 3x2 + x3 = −1.
(c)
4x1 + x2 − 3x3 = −1
3x1 + 2x2 − 6x3 = −2
x1 − 5x2 + 3x3 = −3.
(d)
7x1 + 11x2 − 15x3 = 21
3x1 + 22x2 − 18x3 = 12
2x1 − 13x2 + 9x3 = 16.

30. Solve the following systems using the matrix inversion method:

(a)
3x1 − 2x2 − 4x3 = 7
5x1 − 2x2 − 3x3 = 8
7x1 + 4x2 + 2x3 = 9.
(b)
−3x1 + 4x2 + 3x3 = 11
5x1 + 3x2 + x3 = 12
x1 + x2 + 5x3 = 10.

(c)
x1 + 4x2 − 8x3 = 7
2x1 + 7x2 − 5x3 = −5
3x1 − 6x2 + 6x3 = 4.
(d)
17x1 + 18x2 − 19x3 = 10
43x1 + 22x2 − 14x3 = 11
25x1 − 33x2 + 21x3 = 12.

31. Solve the following systems using the matrix inversion method:

(a)
2x1 + 3x2 − 4x3 + 4x4 = 11
x1 + 3x2 − 4x3 + 2x4 = 12
4x1 + 3x2 + 2x3 + 3x4 = 14
3x1 − 4x2 + 5x3 + 6x4 = 15.
(b)
7x1 + 13x2 + 12x3 + 9x4 = 21
3x1 + 23x2 − 5x3 + 2x4 = 10
4x1 − 7x2 + 22x3 + 3x4 = 11
3x1 − 4x2 + 25x3 + 16x4 = 10.
(c)
12x1 + 6x2 + 5x3 − 2x4 = 21
11x1 + 13x2 + 7x3 + 2x4 = 22
14x1 + 9x2 + 2x3 − 6x4 = 23
7x1 − 24x2 − 7x3 + 8x4 = 24.
(d)
15x1 − 26x2 + 15x3 − 11x4 = 17
14x1 + 15x2 + 7x3 + 7x4 = 18
17x1 + 14x2 − 22x3 − 16x4 = 19
21x1 − 12x2 − 7x3 + 8x4 = 20.

32. In each case, factor the matrix as a product of elementary matrices:


     
$$\text{(a)} \begin{bmatrix} 1 & 1 \\ 3 & 1 \end{bmatrix}, \quad \text{(b)} \begin{bmatrix} 3 & 2 \\ 1 & 2 \end{bmatrix}, \quad \text{(c)} \begin{bmatrix} 1 & 1 \\ -2 & 4 \end{bmatrix},$$
$$\text{(d)} \begin{bmatrix} 1 & 0 & 1 \\ 0 & 2 & 1 \\ 2 & 2 & 3 \end{bmatrix}, \quad \text{(e)} \begin{bmatrix} 1 & 1 & 2 \\ 0 & 1 & 2 \\ 1 & 2 & 3 \end{bmatrix}, \quad \text{(f)} \begin{bmatrix} 1 & -3 & 5 \\ -2 & 2 & -4 \\ 4 & 7 & 9 \end{bmatrix}.$$

33. Solve Problem 30 using Cramer’s rule.

34. Solve the following systems using Cramer’s rule:

(a)
3x1 + 4x2 + 5x3 = 1
3x1 + 2x2 + x3 = 2
4x1 + 3x2 + 5x3 = 3.
(b)
x1 − 4x2 + 2x3 = 4
−4x1 + 5x2 + 6x3 = 0
7x1 − 3x2 + 5x3 = 4.
(c)
6x1 + 7x2 + 8x3 = 1
−5x1 + 3x2 + 2x3 = 1
x1 + 2x2 + 3x3 = 1.
(d)
x1 + 3x2 − 4x3 + 5x4 = 2
6x1 − x2 + 6x3 + 3x4 = −3
2x1 + x2 + 3x3 + 2x4 = 4
x1 + 5x2 + 6x3 + 7x4 = 2.

35. Solve the following systems using Cramer’s rule:

(a)
2x1 − 2x2 + 8x3 = 1
5x1 + 6x2 + 5x3 = 2
7x1 + 7x2 + 9x3 = 3.
(b)
3x1 − 3x2 + 12x3 = 14
−4x1 + 5x2 + 16x3 = 18
x1 − 15x2 + 24x3 = 19.

(c)
9x1 − 11x2 + 12x3 = 3
−5x1 + 3x2 + 2x3 = 4
7x1 − 12x2 + 13x3 = 5.
(d)
11x1 + 3x2 − 13x3 + 15x4 = 22
26x1 − 5x2 + 6x3 + 13x4 = 23
22x1 + 6x2 + 13x3 + 12x4 = 24
17x1 − 25x2 + 16x3 + 27x4 = 25.

36. Use the simple Gaussian elimination method to show that the fol-
lowing system does not have a solution:

3x1 + x2 = 1.5
2x1 − x2 − x3 = 2
4x1 + 3x2 + x3 = 0.

37. Solve Problem 34 using the simple Gaussian elimination method.

38. Solve the following systems using the simple Gaussian elimination
method:

(a)
x1 − x2 = −2
−x1 + 2x2 − x3 = 5
4x1 − x2 + 4x3 = 1.
(b)
3x1 + x2 − x3 = 5
5x1 − 3x2 + 2x3 = 7
2x1 − x2 + x3 = 3.
(c)
3x1 + x2 + x3 = 2
2x1 + 2x2 + 4x3 = 3
4x1 + 9x2 + 16x3 = 1.

(d)
2x1 + x2 + x3 − x4 = 9
x1 + 9x2 + 8x3 + 4x4 = 11
−x1 + 3x2 + 5x3 + 2x4 = 10
5x1 + x2 + x4 = 12.

39. Solve the following systems using the simple Gaussian elimination
method:

(a)
2x1 + 5x2 − 4x3 = 3
2x1 + 2x2 − x3 = 1
3x1 + 2x2 − 3x3 = −5.
(b)
2x2 − x3 = 1
3x1 − x2 + 2x3 = 4
x1 + 3x2 − 5x3 = 1.
(c)
x1 + 2x2 = 3
−x1 − 2x3 = −5
−3x1 − 5x2 + x3 = −4.
(d)
3x1 + 2x2 + 4x3 − x4 = 2
x1 + 4x2 + 5x3 + x4 = 1
4x1 + 5x2 + 4x3 + 3x4 = 5
2x1 + 3x2 + 2x3 + 4x4 = 6.

40. For what values of a and b does the following linear system have no
solution or infinitely many solutions:

(a)
2x1 + x2 + x3 = 2
−2x1 + x2 + 3x3 = a
2x1 − x3 = b.

(b)
2x1 + 3x2 − x3 = 1
x1 − x2 + 3x3 = a
3x1 + 7x2 − 5x3 = b.
(c)
2x1 − x2 + 3x3 = 3
3x1 + x2 − 5x3 = a
−5x1 − 5x2 + 21x3 = b.
(d)

2x1 − x2 + 3x3 = 5
4x1 + 2x2 + bx3 = 6
−2x1 + ax2 + 3x3 = 4.

41. Find the value(s) of α so that each of the following linear systems
has a nontrivial solution:

(a)
2x1 + 2x2 + 3x3 = 1
3x1 + αx2 + 5x3 = 3
x1 + 7x2 + 3x3 = 2.
(b)
x1 + 2x2 + x3 = 2
x1 + 3x2 + 6x3 = 5
2x1 + 3x2 + αx3 = 6.
(c)
αx1 + x2 + x3 = 7
x1 + x2 − x3 = 2
x1 + x2 + αx3 = 1.
(d)
2x1 + αx2 + 3x3 = 9
3x1 − 4x2 − 5x3 = 11
4x1 + 5x2 + αx3 = 12.

42. Find the inverse of each of the following matrices by using the simple
Gauss elimination method:
     
3 3 3 5 3 2 1 2 3
A =  0 2 2 , B =  3 2 2 , C =  2 5 2 .
2 4 5 2 6 5 3 4 3

43. Find the inverse of each of the following matrices by using the simple
Gauss elimination method:
     
3 2 3 1 −3 2 5 2 3
A =  4 2 2 , B =  3 2 6 , C =  2 5 5 .
2 4 3 2 −6 5 3 2 4

44. Determine the rank of each of the following matrices:


     
3 1 −1 4 1 6 17 46 7
A=  2 0 4  , B =  −3 6 4  , C =  20 49 8  .
1 −5 1 5 0 9 23 52 9

45. Determine the rank of each matrix:


 
   1  2 3 4
2 −1 0 0.1 0.2 0.3  2 4 6 8 
A =  2 −1 1  , B =  0.4 0.5 0.6  , C = 
 3
.
5 7 9 
1 1 −1 0.7 0.8 0.91
4 6 8 10

46. Let A be an m × n matrix and B be an n × p matrix. Show that the


rank of AB is less than or equal to the rank of A.

47. Solve Problem 38 using Gaussian elimination with partial pivoting.

48. Solve the following linear systems using Gaussian elimination with and without partial pivoting:

(a)
1.001x1 + 1.5x2 = 0
2x1 + 3x2 = 1.

(b)
x1 + 1.001x2 = 2.001
x1 + x2 = 2.
(c)
6.122x1 + 1500.5x2 = 1506.622
2000x1 + 3x2 = 2003.
49. The elements of matrix A, the Hilbert matrix, are defined by
aij = 1/(i + j − 1), for i, j = 1, 2, . . . , n.
Find the solution of the system Ax = b for n = 4 and b = [1, 2, 3, 4]T
using Gaussian elimination by partial pivoting.
50. Solve the following systems using the Gauss–Jordan method:

(a)
x1 + 4x2 + x3 = 1
2x1 + 4x2 + x3 = 9
3x1 + 5x2 − 2x3 = 11.
(b)
x1 + x 2 + x3 = 1
2x1 − x2 + 3x3 = 4
3x1 + 2x2 − 2x3 = −2.
(c)
2x1 + 3x2 + 6x3 + x4 = 2
x1 + x2 − 2x3 + 4x4 = 1
3x1 + 5x2 − 2x3 + 2x4 = 11
2x1 + 2x2 + 2x3 − 3x4 = 2.
51. The following sets of linear equations have a common coefficients ma-
trix but different right-side terms:

(a)
2x1 + 3x2 + 5x3 = 0
3x1 + x2 − 2x3 = −2
x1 + 3x2 + 4x3 = −3.

(b)
2x1 + 3x2 + 5x3 = 1
3x1 + x2 − 2x3 = 2
x1 + 3x2 + 4x3 = 4.
(c)
2x1 + 3x2 + 5x3 = −5
3x1 + x2 − 2x3 = 6
x1 + 3x2 + 4x3 = −1.
The coefficients and the three sets of right-side terms may be com-
bined into an augmented matrix of the form
$$\left[\begin{array}{ccc|ccc} 2 & 3 & 5 & 0 & 1 & -5 \\ 3 & 1 & -2 & -2 & 2 & 6 \\ 1 & 3 & 4 & -3 & 4 & -1 \end{array}\right].$$
If we apply the Gauss–Jordan method to this augmented matrix form
and reduce the first three columns to the unity matrix form, the solu-
tion for the three problems are automatically obtained in the fourth,
fifth, and sixth columns when elimination is completed. Calculate
the solution in this way.
52. Calculate the inverse of each matrix using the Gauss–Jordan method:
 
    5 −2 0 0
3 −9 5 1 4 5  −2 5 −2 0 
(a)  0 5 1  , (b)  2 1 2  , (c)   0 −2
.
5 −2 
−1 6 3 8 1 1
0 0 −2 5
53. Find the inverse of the Hilbert matrix of size 4 × 4 using the Gauss–
Jordan method. Then solve the linear system Ax = [1, 2, 3, 4]T .
54. Find the LU decomposition of each matrix A using Doolittle’s method
and then solve the systems:
(a)    
2 −1 1 4
A=  −3 4 −1 ,  b=  5 .
1 −1 1 6

(b)   
7 6 5 2
A =  5 4 3 , b =  1 .
3 7 6 2
(c)    
2 2 2 0
A =  1 2 1 , b =  −4  .
3 3 4 1
(d)    
2 4 −6 −4
A= 1 5 3 , b =  10  .
1 3 2 5
(e)   
1 −1 0 2
A =  2 −1 1 , b =  4 .
2 −2 −1 3
(f )    
1 5 3 4
A =  2 4 6 , b =  11  .
1 3 2 5
55. Find the LU decomposition of each matrix A using Doolittle’s method,
and then solve the systems:

(a)
  
3 −2 1 1 3
 −3 7 4 −3   2 
A=
 2 −5 3
, b=
 1 .

4 
7 −3 2 4 2
(b)
   
2 −4 5 3 6
 3 5 −4 3   5 
A=
 1
, b=
 2 .

6 2 6 
7 2 5 1 4

(c)
   
2 2 3 −2 10
 10 2 13 11   14 
A=
 2
, b=
 11  .

5 4 6 
1 −4 −2 7 9

(d)
   
5 12 4 −11 44
 21 15 13 23   33 
A=
 31
, b=
 55  .

33 12 22 
−17 15 14 11 22

(e)
   
1 −1 10 8 −2
 12 −17 11 22   7 
A= , b=
 6 .

 22 31 13 −1 
8 24 13 9 5

(f )
 
  41
41 25 23 −18  1 
A =  2 13 −16 12  ,  15  .
b= 
11 13 9 7
13

56. Find the value(s) of α for which each of the following matrices A is singular, using Doolittle's method:

 
1 −1 2
(a) A =  −1 3 −1  .
α −2 3
 
1 5 7
(b) A =  4 4 α .
−2 α 9
 
2 −4 α
(c) A= 2 4 3 .
4 −2 5
 
2 α 1−α
(d) A= 2 5 −2  .
2 5 4
 
1 −1 3
(e) A= 3 2 3 .
4 α−2 7
 
1 5 α
(f ) A= 1 4 α − 2 .
1 −2 8

57. Find the determinant of each of the following matrices using LU


decomposition by Doolittle’s method:
   
2 3 −1 1 −2 2
(a) A =  1 2 1  , (b) A =  2 1 1 ,
2 1 −6 1 0 1
   
2 4 1 2 4 −6
(c) A =  3 3 2 ,
 (d) A =  1 5 3 ,
4 1 4 1 3 2
   
1 −1 0 1 5 3
(e) A =  2 −1 1  , (f ) A =  1 2 3 .
−2 2 1 1 3 2

58. Using the smallest positive integer value of α, find the unique solution of each of the linear systems of Problem 56 using LU decomposition by Doolittle's method:

(a) Ax = [2, 3, 2]T .


(b) Ax = [5, −6, 2]T .
(c) Ax = [11, 13, 10]T .
(d) Ax = [−8, 11, 8]T .
(e) Ax = [32, 23, 12]T .
(f ) Ax = [−11, 43, 22]T .

59. Find the LDV factorization of each of the following matrices:

(a)
   
3 4 3 4 −2 3
A =  2 3 3 , B= 5 2 −3  .
1 3 5 4 3 6
(b)
   
2 5 4 3 2 −6
A =  2 1 6 , B =  2 2 −5  .
3 2 7 3 4 7
(c)
   
1 −5 4 4 7 −6
A= 2 3 −4  , B= 5 5 −5  .
3 2 6 6 −4 9
(d)
 
  2 3 4 5
3 −1 4  3 1 2 4 
A=  2 2 −1 , B=
 3
.
1 1 1 
3 2 2
4 3 1 2

60. Find the LDLT factorization of each of the following matrices:

(a)  
2 3 4
A =  3 5 2 .
4 2 6
(b)  
3 −2 4
A =  −2 2 1 .
4 1 3
(c)  
2 1 −1
A= 1 3 2 .
−1 2 2
(d)  
1 −2 3 4
 −2 3 4 5 
A= .
 3 4 5 −6 
4 5 −6 7

61. Solve Problem 54 by LU decomposition using Crout’s method.

62. Find the determinant of each of the following matrices using LU


decomposition by Crout’s method:
   
2 2 −1 2 −1 1
(a) A =  1 2 1  , (b) A =  1 2 2 ,
2 1 −4 2 0 2
   
4 4 1 2 4 5
(c) A =  5 4 2  , (d) A =  3 5 3 ,
1 4 4 4 3 2
   
1 −1 2 1 5 3
(e) A =  2 −1 1  , (f ) A =  1 2 3 .
−2 2 4 1 3 4

63. Solve the following systems by LU decomposition using the Cholesky


method:

(a)
   
1 −1 1 2
A =  −1 5 −1  , b =  2 .
1 −1 10 2

(b)
   
10 2 1 7
A =  2 10 3  , b =  −4  .
1 3 10 3

(c)
   
4 2 3 1
A =  2 17 1  , b =  2 .
3 1 5 5

(d)
  
3 4 −6 0 4
 4 5 3 1   5 
A=
 −6
, b=
 2 .

3 3 1 
0 1 1 3 3

64. Solve the following systems by LU decomposition using the Cholesky


method:

(a)
   
5 −1 1 5
A =  −3 5 −2  , b =  7 .
2 −1 7 9

(b)
   
6 2 −3 5
A =  3 12 −4  , b =  2 .
6 3 13 4

(c)    
5 2 −5 3
A= 2 4 4 , b =  11  .
−3 −2 7 14
(d)   
1 4 −6 0 12
 2 2 3 3   13 
A=
 −3
, b=
 14  .

6 7 1 
0 2 −3 5 15

65. Solve the following tridiagonal systems using LU decomposition:

(a)    
3 −1 0 1
A =  −1 3 −1  , b =  1 .
0 −1 3 1
(b)   

2 3 0 0 6
 3 2 3 0   7 
A=
 0
, b=
 5 .

3 2 3 
0 0 3 2 3
(c)   
4 −1 0 0 1
 −1 4 −1 0   1 
A=
 0 −1
, b=
 1 .

4 −1 
0 0 −1 4 1
(d)   
2 3 0 0 1
 3 5 4 0   2 
A=
 0
, b=
 3 .

4 6 3 
0 0 3 4 4

66. Solve the following tridiagonal systems using LU decomposition:



(a)    
4 −2 0 5
A =  −2 5 −2  , b =  6 .
0 −2 6 7
(b)    
8 1 0 0 2
 1 8 1 0   2 
A=
 0
, b=
 2 .

1 8 1 
0 0 1 8 2
(c)   
5 −3 0 0 7
 −3 6 −2 0   −5 
A=
 0 −2
, b=
 4 .

7 −5 
0 0 −5 8 2
(d)    
2 −4 0 0 11
 −4 5 7 0   12 
A=
 0
, b=
 13  .

7 6 2 
0 0 2 8 14
67. Find $\|x\|_1$, $\|x\|_2$, and $\|x\|_\infty$ for the following vectors:

(a)
[2, −1, −6, 3]T .
(b)
$[\sin k, \cos k, 3^k]^T$, for a fixed integer k.
68. Find $\|\cdot\|_1$ and $\|\cdot\|_\infty$ for the following matrices:
   
3 1 −1 4 1 6
A= 2 0 4  , B =  −3 6 4  ,
1 −5 1 5 0 9
 
  3 11 −5 2
17 46 7  6 8 −11 6 
C=  20 49 8  , D =   −4 −8
.
10 14 
23 52 9
13 14 −12 9

69. Consider the following matrices:


   
−11 7 −8 6 2 7
A= 5 9 6  , B =  −12 10 8  ,
6 3 7 3 −15 14
 
  2 1 −1 1
5 −6 4  1 3 5 2 
C =  −7 8 5 , D =   −2 −3
.
4 5 
3 −9 12
3 4 −2 4
Find k.k1 and k.k∞ for (a) A3 , (b) A2 + B 2 + C 2 + D2 , (c) BC,
(d) C 2 + D2 .
70. The n × n Hilbert matrix $H^{(n)}$ is defined by
$$H^{(n)}_{ij} = \frac{1}{i+j-1}, \qquad 1 \le i, j \le n.$$
Find the $l_\infty$-norm of the 10 × 10 Hilbert matrix.
71. Compute the condition numbers of the following matrices relative to $\|\cdot\|_\infty$:
$$\text{(a)} \begin{bmatrix} \frac{1}{3} & \frac{1}{2} & \frac{1}{5} \\ \frac{1}{2} & \frac{1}{5} & \frac{1}{3} \\ \frac{1}{5} & \frac{1}{3} & \frac{1}{2} \end{bmatrix}, \quad \text{(b)} \begin{bmatrix} 0.03 & 0.01 & -0.02 \\ 0.15 & 0.51 & -0.11 \\ 1.11 & 2.22 & 3.33 \end{bmatrix}, \quad \text{(c)} \begin{bmatrix} 1.11 & 1.98 & 2.01 \\ 1.01 & 1.05 & 2.05 \\ 0.85 & 0.45 & 1.25 \end{bmatrix}.$$
72. The following linear systems have x as the exact solution and x∗ as an approximate solution. Compute $\|x - x^*\|_\infty$ and $K(A)\dfrac{\|r\|_\infty}{\|A\|_\infty}$, where $r = b - Ax^*$ is the residual vector:

(a)
0.89x1 + 0.53x2 = 0.36
0.47x1 + 0.28x2 = 0.19

x = [1, −1]T
x∗ = [0.702, −0.500]T
(b)
0.986x1 + 0.579x2 = 0.235
0.409x1 + 0.237x2 = 0.107

x = [2, −3]T
x∗ = [2.110, −3.170]T
(c)
1.003x1 + 58.090x2 = 68.12
5.550x1 + 321.8x2 = 377.3

x = [10, 1]T
x∗ = [−10, 1]T

73. Discuss the ill-conditioning (stability) of the linear system

1.01x1 + 0.99x2 = 2
0.99x1 + 1.01x2 = 2.

If $x^* = [2, 0]^T$ is an approximate solution of the system, then find the


residual vector r and estimate the relative error.

74. Show that if B is singular, then
$$\frac{1}{K(A)} \le \frac{\|A - B\|}{\|A\|}.$$

75. Consider the following matrices:


   
0.06 0.01 0.02 0.1 0.2 0.12
A=  0.13 0.05 0.11  , B =  0.1 0.4 0.2  .
1.01 2.02 3.03 0.2 0.05 0.1

Using Problem 74, compute the approximation of the condition number of the matrix A relative to $\|\cdot\|_\infty$.

76. Let A and B be nonsingular n × n matrices. Show that

(a)
K(A) ≥ 1 and K(B) ≥ 1.
(b)
K(AB) ≤ K(A)K(B).

77. The exact solution of the linear system


x1 + x2 = 1
x1 + 1.01x2 = 2

is x = [−99, 100]T . Change the coefficient matrix slightly to


 
1 1
δA = ,
1 0.99
and consider the linear system
x 1 + x2 = 1
x1 + 0.99x2 = 2.
Compute the changed solution δx of the system. Is the matrix A
ill-conditioned?
78. Using Problem 77, compute the relative error and the relative resid-
ual.
79. The exact solution of the linear system
x1 + 3x2 = 4
1.0001x1 + 3x2 = 4.0001

is x = [1, 1]T . Change the right-hand vector b slightly to δb =


[4.0001, 4.0003]T , and consider the linear system
x1 + 3x2 = 4.0001
1.0001x1 + 3x2 = 4.0003.
Compute the changed solution δx of the system. Is the matrix A
ill-conditioned?

80. If $\|A\| < 1$, then show that the matrix $(I - A)$ is nonsingular and
$$\|(I - A)^{-1}\| \le \frac{1}{1 - \|A\|}.$$

81. The exact solution of the linear system


x1 + x2 = 3
x1 + 1.0005x2 = 3.0010

is x = [1, 2]T . Change the coefficient matrix and the right-hand


vector b slightly to
   
1 1 2.99
δA = and δb = ,
1 1.001 3.01
and consider the linear system
x1 + x2 = 2.99
x1 + 1.001x2 = 3.01
Compute the changed solution δx of the system. Is the matrix A
ill-conditioned?
82. Find the condition number of the following matrix:
$$A_n = \begin{bmatrix} 1 & 1 \\ 1 & 1 - \dfrac{1}{n} \end{bmatrix}.$$
Solve the linear system A4 x = [2, 2]T and compute the relative resid-
ual.
83. Determine equations of the polynomials of degree two whose graphs
pass through the given points.

(a) (1, 2), (2, 2), (3, 4).


(b) (1, 14), (2, 22), (3, 32).
(c) (1, 5), (2, 7), (3, 9).
(d) (−1, −1), (0, 1), (1, −3).
(e) (1, 8), (3, 26), (5, 60).

84. Find an equation of the polynomial of degree three whose graph


passes through the points (1, −3), (2, −1), (3, 9), (4, 33).

85. Determine the currents through the various branches of the electrical
network in Figure 1.8:

(a) When battery C is 9 volts.


(b) When battery C is 23 volts.

Figure 1.8: Electrical circuit.

Note how the current through the branch AB is reversed in (b). What
would the voltage of C have to be for no current to pass through AB?

86. Construct a mathematical model that describes the traffic flow in


the road network of Figure 1.9. All streets are one-way streets in the
directions indicated. The units are in vehicles per hour. Give two
distinct possible flows of traffic. What is the minimum possible flow
that can be expected along branch AB?

Figure 1.9: Traffic flow.



87. Figure 1.10 represents the traffic entering and leaving a “roundabout”
road junction. Such junctions are very common in Europe. Construct
a mathematical model that describes the flow of traffic along the vari-
ous branches. What is the minimum flow theoretically possible along
the branch BC? Is this flow ever likely to be realized in practice?

Figure 1.10: Traffic flow.



88. Find the temperatures at x1 , x2 , x3 , and x4 of the triangular metal


plate shown in Figure 1.11, given that the temperature of each inte-
rior point is the average of its four neighboring points.

Figure 1.11: Heat conduction.



89. Find the temperatures at x1 , x2 , and x3 of the triangular metal plate


shown in Figure 1.12, given that the temperature of each interior
point is the average of its four neighboring points.

Figure 1.12: Heat conduction.

90. It takes three different ingredients, A, B, and C, to produce a cer-


tain chemical substance. A, B, and C have to be dissolved in water
separately before they interact to form the chemical. The solution
containing A at 2.2 g/cm³ combined with the solution containing B at 2.5 g/cm³, combined with the solution containing C at 4.6 g/cm³, makes 18.25 g of the chemical. If the proportions for A, B, and C in these solutions are changed to 2.4, 3.5, and 5.8 g/cm³, respectively (while the volumes remain the same), then 21.26 g of the chemical is produced. Finally, if the proportions are changed to 1.7, 2.1, and 3.9 g/cm³, respectively, then 15.32 g of the chemical is produced.
What are the volumes in cubic centimeters of the solutions containing
A, B, and C?

91. Find a balanced chemical equation for each reaction:

(a) FeS₂ + O₂ → Fe₂O₃ + SO₂.

(b) CO₂ + H₂O → C₆H₁₂O₆ + O₂ (This reaction takes place when a green plant converts carbon dioxide and water to glucose and oxygen during photosynthesis.)

(c) C₄H₁₀ + O₂ → CO₂ + H₂O (This reaction occurs when butane, C₄H₁₀, burns in the presence of oxygen to form carbon dioxide and water.)

(d) C₅H₁₁OH + O₂ → H₂O + CO₂ (This reaction represents the combustion of amyl alcohol.)

92. Find a balanced chemical equation for each reaction:

(a) C₇H₆O₂ + O₂ → H₂O + CO₂.

(b) HClO₄ + P₄O₁₀ → H₃PO₄ + Cl₂O₇.

(c) Na₂CO₃ + C + N₂ → NaCN + CO.

(d) C₂H₂Cl₄ + Ca(OH)₂ → C₂HCl₃ + CaCl₂ + H₂O.

93. A manufacturing company produces three products, I, II, and III. It uses three machines, A, B, and C, for 350, 150, and 100 hours, respectively. Making one thousand items of type I requires 30, 10, and 5 hours on machines A, B, and C, respectively. Making one thousand items of type II requires 20, 10, and 10 hours on machines A, B, and C, respectively. Making one thousand items of type III requires 30, 30, and 5 hours on machines A, B, and C, respectively. Find the number of items of each type of product that can be produced if the machines are used at full capacity.

94. The average of the temperatures for the cities of Jeddah, Makkah, and Riyadh was 15°C during a given winter day. The temperature in Makkah was 6°C higher than the average of the temperatures of the other two cities. The temperature in Riyadh was 6°C lower than the average temperature of the other two cities. What was the temperature in each one of the cities?

95. An international business person needs, on the average, fixed amounts


of Japanese yen, French francs, and German marks during each of
his business trips. He traveled three times this year. The first time
he exchanged a total of $2,400 at the following rates: the dollar was
100 yen, 1.5 francs, and 1.2 marks. The second time he exchanged a
total of $2,350 at these rates: the dollar was 100 yen, 1.2 francs, and
1.5 marks. The third time he exchanged a total of $2,390 at these
rates: the dollar was 125 yen, 1.2 francs, and 1.2 marks. How many
yen, francs, and marks did he buy each time?

96. A father plans to distribute his estate, worth SR1,000,000, among his four sons as follows: 3/4 of the estate is to be split equally among the sons. For the rest, each son is to receive SR5,000 for each year that remains until his 25th birthday. Given that the sons are all 4 years apart, how much would each receive from his father's estate?

97. A biologist has placed three strains of bacteria (denoted by I, II, and
III) in a test tube, where they will feed on three different food sources
(A, B, and C). Each day 2300 units of A, 800 units of B, and 1500
units of C are placed in the test tube, and each bacterium consumes
a certain number of units of each food per day, as shown in the given
table. How many bacteria of each strain can coexist in the test tube
and consume all the food?

             Bacteria    Bacteria    Bacteria
Food         Strain I    Strain II   Strain III
Food A          2            2           4
Food B          1            2           0
Food C          1            3           1

98. Al-karim hires three types of laborers, I, II, and III, and pays them
SR20, SR15, and SR10 per hour, respectively. If the total amount
paid is SR20,000 for a total of 300 hours of work, find the possible
number of hours put in by the three categories of workers if the
category III workers must put in the maximum amount of hours.
Chapter 2

Iterative Methods for Linear Systems

2.1 Introduction

The methods discussed in Chapter 1 for the solution of a system of linear equations are direct methods, which require a finite number of arithmetic operations. The elimination methods for solving such systems usually yield sufficiently accurate solutions for approximately 20 to 25 simultaneous equations, where most of the unknowns are present in all of the equations. When the coefficient matrix is sparse (has many zeros), a considerably larger number of equations can be handled by the elimination methods. But these methods are generally impractical when many hundreds or thousands of equations must be solved simultaneously.

There are, however, several methods that can be used to solve large
numbers of simultaneous equations. These methods, called iterative meth-
ods, are methods by which an approximation to the solution of a system

of linear equations may be obtained. The iterative methods are used most
often for large, sparse systems of linear equations and they are efficient in
terms of computer storage and time requirements. Systems of this type
arise frequently in the numerical solutions of boundary value problems
and partial differential equations. Unlike the direct methods, the iterative
methods may not always yield a solution, even if the determinant of the
coefficient matrix is not zero.

The iterative methods to solve the system of linear equations

$$Ax = b \qquad (2.1)$$

start with an initial approximation $x^{(0)}$ to the solution x of the linear system (2.1) and generate a sequence of vectors $\{x^{(k)}\}_{k=0}^{\infty}$ that converges to x. Most of these iterative methods involve a process that converts the system (2.1) into an equivalent system of the form

$$x = Tx + c \qquad (2.2)$$

for some square matrix T and vector c. After the initial vector $x^{(0)}$ is selected, the sequence of approximate solution vectors is generated by computing

$$x^{(k+1)} = Tx^{(k)} + c, \quad \text{for } k = 0, 1, 2, \ldots \qquad (2.3)$$

The sequence is terminated when the error is sufficiently small, i.e.,

$$\|x^{(k+1)} - x^{(k)}\| < \epsilon, \quad \text{for small positive } \epsilon. \qquad (2.4)$$

Among them, the most useful methods are the Jacobi method, the
Gauss–Seidel method, the Successive Over-Relaxation (SOR) method, and
the conjugate gradient method.

Before discussing these methods, it is convenient to introduce notations


for some matrices. The matrix A is written as

A = L + D + U, (2.5)

where L is the strictly lower-triangular part, U is the strictly upper-triangular part, and D is the diagonal part of the coefficient matrix A, i.e.,

$$L = \begin{pmatrix} 0 & 0 & 0 & \cdots & 0 \\ a_{21} & 0 & 0 & \cdots & 0 \\ a_{31} & a_{32} & 0 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & a_{n3} & \cdots & 0 \end{pmatrix}, \qquad U = \begin{pmatrix} 0 & a_{12} & a_{13} & \cdots & a_{1n} \\ 0 & 0 & a_{23} & \cdots & a_{2n} \\ 0 & 0 & 0 & \cdots & a_{3n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 0 \end{pmatrix},$$

and

$$D = \begin{pmatrix} a_{11} & 0 & 0 & \cdots & 0 \\ 0 & a_{22} & 0 & \cdots & 0 \\ 0 & 0 & a_{33} & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & a_{nn} \end{pmatrix}.$$

Then (2.1) can be written as

$$(L + D + U)x = b. \qquad (2.6)$$
Now we discuss our first iterative method to solve the linear system (2.6).

2.2 Jacobi Iterative Method


This is one of the easiest iterative methods to find the approximate solution
of the system of linear equations (2.1). To explain its procedure, consider
a system of three linear equations as follows:
a11 x1 + a12 x2 + a13 x3 = b1
a21 x1 + a22 x2 + a23 x3 = b2
a31 x1 + a32 x2 + a33 x3 = b3 .
The solution process starts by solving for the first variable x1 from the first
equation, the second variable x2 from the second equation and the third
variable x3 from the third equation, which gives
a11 x1 = b1 − a12 x2 − a13 x3
a22 x2 = b2 − a21 x1 − a23 x3
a33 x3 = b3 − a31 x1 − a32 x2

or in matrix form
Dx = b − (L + U )x.
Divide both sides of the above three equations by their diagonal elements,
a11 , a22 , and a33 , respectively, to get

$$x_1 = \frac{1}{a_{11}}\left[b_1 - a_{12}x_2 - a_{13}x_3\right]$$

$$x_2 = \frac{1}{a_{22}}\left[b_2 - a_{21}x_1 - a_{23}x_3\right]$$

$$x_3 = \frac{1}{a_{33}}\left[b_3 - a_{31}x_1 - a_{32}x_2\right],$$

which can be written in the matrix form

$$x = D^{-1}[b - (L + U)x].$$

Let $x^{(0)} = [x_1^{(0)}, x_2^{(0)}, x_3^{(0)}]^T$ be an initial approximation to the exact solution x of the linear system (2.1). Then define the iterative sequence

$$x_1^{(k+1)} = \frac{1}{a_{11}}\left[b_1 - a_{12}x_2^{(k)} - a_{13}x_3^{(k)}\right]$$

$$x_2^{(k+1)} = \frac{1}{a_{22}}\left[b_2 - a_{21}x_1^{(k)} - a_{23}x_3^{(k)}\right] \qquad (2.7)$$

$$x_3^{(k+1)} = \frac{1}{a_{33}}\left[b_3 - a_{31}x_1^{(k)} - a_{32}x_2^{(k)}\right]$$

or, in matrix form,

$$x^{(k+1)} = D^{-1}[b - (L + U)x^{(k)}], \qquad k = 0, 1, 2, \ldots, \qquad (2.8)$$

where k is the number of iterative steps. The form (2.7) is called the Jacobi formula for the system of three equations, and (2.8) is called its matrix form. For a general system of n linear equations, the Jacobi method

is defined by

$$x_i^{(k+1)} = \frac{1}{a_{ii}}\left[b_i - \sum_{j=1}^{i-1} a_{ij}x_j^{(k)} - \sum_{j=i+1}^{n} a_{ij}x_j^{(k)}\right], \qquad (2.9)$$

$$i = 1, 2, \ldots, n, \quad k = 0, 1, 2, \ldots,$$

provided that the diagonal elements $a_{ii} \ne 0$, for each i = 1, 2, . . . , n. If


the diagonal elements equal zero, then reordering of the equations can be
performed so that no element in the diagonal position equals zero. The
matrix form of the Jacobi iterative method (2.9) can be written as

$$x^{(k+1)} = c + T_J x^{(k)}, \qquad k = 0, 1, 2, \ldots \qquad (2.10)$$

or

$$\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}^{(k+1)} = \begin{pmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{pmatrix} + \begin{pmatrix} 0 & -t_{12} & \cdots & -t_{1n} \\ -t_{21} & 0 & \cdots & -t_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ -t_{n1} & -t_{n2} & \cdots & 0 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}^{(k)}, \qquad (2.11)$$

where the Jacobi iteration matrix $T_J$ and vector c are defined as

$$T_J = -D^{-1}(L + U) \quad \text{and} \quad c = D^{-1}b, \qquad (2.12)$$

and their elements are given by

$$t_{ij} = \frac{a_{ij}}{a_{ii}}, \quad i \ne j, \qquad t_{ij} = 0, \quad i = j, \qquad c_i = \frac{b_i}{a_{ii}}, \quad i, j = 1, 2, \ldots, n.$$
The Jacobi iterative method is sometimes called the method of simultaneous iterations, because all values of $x_i$ are iterated simultaneously; that is, all values of $x_i^{(k+1)}$ depend only on the values of $x_i^{(k)}$.

Note that the diagonal elements of the Jacobi iteration matrix $T_J$ are always zero. As usual with iterative methods, an initial approximation $x_i^{(0)}$ must be supplied. If we don't have knowledge of the exact solution, it is conventional to start with $x_i^{(0)} = 0$, for all i. The iterations defined by (2.9) are stopped when

$$\|x^{(k+1)} - x^{(k)}\| < \epsilon, \qquad (2.13)$$

or by using the other possible stopping criterion

$$\frac{\|x^{(k+1)} - x^{(k)}\|}{\|x^{(k+1)}\|} < \epsilon, \qquad (2.14)$$

where $\epsilon$ is a preassigned small positive number. For this purpose, any convenient norm can be used, the most common being the $l_\infty$-norm.

Example 2.1 Solve the following system of equations using the Jacobi iterative method, with $\epsilon = 10^{-5}$ in the $l_\infty$-norm:

15x1 − x2 − 2x3 − 3x4 = 11
−x1 + 15x2 − 2x3 − 3x4 = 22
−x1 − 2x2 + 15x3 − 3x4 = 33
−x1 − 2x2 − 3x3 + 15x4 = 44.

Start with the initial solution $x^{(0)} = [0, 0, 0, 0]^T$.

Solution. The Jacobi method for the given system is

$$x_1^{(k+1)} = \frac{1}{15}\left[11 + x_2^{(k)} + 2x_3^{(k)} + 3x_4^{(k)}\right]$$

$$x_2^{(k+1)} = \frac{1}{15}\left[22 + x_1^{(k)} + 2x_3^{(k)} + 3x_4^{(k)}\right]$$

$$x_3^{(k+1)} = \frac{1}{15}\left[33 + x_1^{(k)} + 2x_2^{(k)} + 3x_4^{(k)}\right]$$

$$x_4^{(k+1)} = \frac{1}{15}\left[44 + x_1^{(k)} + 2x_2^{(k)} + 3x_3^{(k)}\right],$$

Table 2.1: Solution of Example 2.1.


k      x1(k)      x2(k)      x3(k)      x4(k)
0 0.00000 0.00000 0.00000 0.00000
1 0.73333 1.46667 2.20000 2.93333
2 1.71111 2.39556 3.03111 3.61778
3 2.02074 2.70844 3.35704 3.97304
4 2.15611 2.84366 3.49045 4.10058
5 2.20842 2.89592 3.54300 4.15431
6 2.22966 2.91716 3.56421 4.17528
7 2.23810 2.92560 3.57266 4.18377
8 2.24148 2.92898 3.57604 4.18715
9 2.24283 2.93033 3.57739 4.18851
10 2.24338 2.93088 3.57793 4.18905
11 2.24359 2.93109 3.57815 4.18926
12 2.24368 2.93118 3.57824 4.18935
13 2.24371 2.93121 3.57827 4.18938
14 2.24373 2.93123 3.57829 4.18940
15 2.24373 2.93123 3.57829 4.18940

and starting with the initial approximation $x_1^{(0)} = 0, x_2^{(0)} = 0, x_3^{(0)} = 0, x_4^{(0)} = 0$, for k = 0 we obtain

$$x_1^{(1)} = \frac{1}{15}\left[11 + x_2^{(0)} + 2x_3^{(0)} + 3x_4^{(0)}\right] = 0.73333$$

$$x_2^{(1)} = \frac{1}{15}\left[22 + x_1^{(0)} + 2x_3^{(0)} + 3x_4^{(0)}\right] = 1.46667$$

$$x_3^{(1)} = \frac{1}{15}\left[33 + x_1^{(0)} + 2x_2^{(0)} + 3x_4^{(0)}\right] = 2.20000$$

$$x_4^{(1)} = \frac{1}{15}\left[44 + x_1^{(0)} + 2x_2^{(0)} + 3x_3^{(0)}\right] = 2.93333.$$
The first and subsequent iterations are listed in Table 2.1.

Note that the Jacobi method converged, and after 15 iterations we obtained the good approximation [2.24373, 2.93123, 3.57829, 4.18940]^T of the given system, which has the exact solution [2.24374, 2.93124, 3.57830, 4.18941]^T. Ideally, the iterations should stop automatically when we obtain the required accuracy using one of the stopping criteria mentioned in (2.13) or (2.14). •

The above results can be obtained using MATLAB commands, as fol-


lows:

>> Ab = [15 -1 -2 -3 11; -1 15 -2 -3 22; -1 -2 15 -3 33; ...
-1 -2 -3 15 44];
>> x = [0 0 0 0];
>> acc = 1e − 05;
>> JacobiM (Ab, x, acc);

Example 2.2 Solve the following system of equations using the Jacobi iterative method:

−x1 + 15x2 − 2x3 − 3x4 = 22


15x1 − x2 − 2x3 − 3x4 = 11
−x1 − 2x2 + 15x3 − 3x4 = 33
−x1 − 2x2 − 3x3 + 15x4 = 44.

Start with the initial solution x(0) = [0, 0, 0, 0]T .

Solution. Results for this linear system are listed in Table 2.2. Note that
in this case the Jacobi method diverges rapidly. Although the given linear
system is the same as the linear system of Example 2.1, the first and second
equations are interchanged. From this example we conclude that the Jacobi
iterative method is not always convergent.

Table 2.2: Solution of Example 2.2.


k      x1(k)          x2(k)          x3(k)          x4(k)
0 0.000000 0.000000 0.000000 0.00000
1 –2.2000e+001 –1.1000e+001 2.2000e+000 2.9333e+000
2 –2.0020e+002 –3.5420e+002 –1.4667e-001 4.4000e-001
3 –5.3360e+003 –3.0150e+003 –5.8285e+001 –5.7669e+001
4 –4.4958e+004 –7.9762e+004 –7.6707e+002 –7.6646e+002
5 –1.1926e+006 –6.7054e+005 –1.3783e+004 –1.3783e+004
6 –9.9893e+006 –1.7820e+007 –1.7167e+005 –1.7167e+005
7 –2.6645e+008 –1.4898e+008 –3.0763e+006 –3.0763e+006
8 –2.2193e+009 –3.9813e+009 –3.8242e+007 –3.8242e+007
9 –5.9529e+010 –3.3099e+010 –6.8645e+008 –6.8645e+008
10 –4.9305e+011 –8.8950e+011 –8.5190e+009 –8.5190e+009

Program 2.1
MATLAB m-file for the Jacobi Iterative Method
function x=JacobiM(Ab,x,acc)   % Ab = [A b]
[n,t]=size(Ab); b=Ab(1:n,t); R=1; k=1;
d(1,1:n+1)=[0 x];
while R > acc
    for i=1:n
        sum=0;
        for j=1:n
            if j ~= i
                sum = sum + Ab(i,j)*d(k,j+1);   % uses previous iterate only
            end
        end
        x(1,i) = (1/Ab(i,i))*(b(i,1) - sum);
    end
    k = k+1; d(k,1:n+1) = [k-1 x];
    R = max(abs(d(k,2:n+1) - d(k-1,2:n+1)));
    if k > 10 && R > 100
        disp('Jacobi method diverges')
        break
    end
end
x = d;

Procedure 2.1 (Jacobi Method)


1. Check that the coefficient matrix A is strictly diagonally dominant
(for guaranteed convergence).

2. Initialize the first approximation $x^{(0)}$ and the preassigned accuracy $\epsilon$.

3. Compute the constant vector $c = D^{-1}b$, i.e., $c_i = b_i/a_{ii}$, for i = 1, 2, . . . , n.

4. Compute the Jacobi iteration matrix $T_J = -D^{-1}(L + U)$.

5. Solve for the approximate solutions $x^{(k+1)} = T_J x^{(k)} + c$, for k = 0, 1, 2, . . . .

6. Repeat step 5 until $\|x^{(k+1)} - x^{(k)}\| < \epsilon$.

2.3 Gauss–Seidel Iterative Method


This is one of the most popular and widely used iterative methods for
finding the approximate solution of the system of linear equations. This
iterative method is a modification of the Jacobi iterative method and gives
us good accuracy by using the most recently calculated values.

From the Jacobi iterative formula (2.9), it is seen that the new estimates for the solution x are computed entirely from the old estimates; only after all the new estimates have been determined are they used in the right-hand side of the equation to perform the next iteration. The Gauss–Seidel method, by contrast, makes use of each new estimate in the right-hand side as soon as it becomes available. For example, the Gauss–Seidel formula for the system of three equations can be defined as the iterative sequence:

$$x_1^{(k+1)} = \frac{1}{a_{11}}\left[b_1 - a_{12}x_2^{(k)} - a_{13}x_3^{(k)}\right]$$

$$x_2^{(k+1)} = \frac{1}{a_{22}}\left[b_2 - a_{21}x_1^{(k+1)} - a_{23}x_3^{(k)}\right] \qquad (2.15)$$

$$x_3^{(k+1)} = \frac{1}{a_{33}}\left[b_3 - a_{31}x_1^{(k+1)} - a_{32}x_2^{(k+1)}\right].$$

For a general system of n linear equations, the Gauss–Seidel iterative method is defined as

$$x_i^{(k+1)} = \frac{1}{a_{ii}}\left[b_i - \sum_{j=1}^{i-1} a_{ij}x_j^{(k+1)} - \sum_{j=i+1}^{n} a_{ij}x_j^{(k)}\right], \qquad (2.16)$$

$$i = 1, 2, \ldots, n, \quad k = 0, 1, 2, \ldots,$$

and in matrix form it can be represented by

$$x^{(k+1)} = (D + L)^{-1}[b - Ux^{(k)}], \quad \text{for each } k = 0, 1, 2, \ldots \qquad (2.17)$$

For the lower-triangular matrix D + L to be nonsingular, it is necessary and sufficient that the diagonal elements $a_{ii} \ne 0$, for each i = 1, 2, . . . , n. By comparing (2.3) and (2.17), we obtain

$$T_G = -(D + L)^{-1}U \quad \text{and} \quad c = (D + L)^{-1}b, \qquad (2.18)$$

which are called the Gauss–Seidel iteration matrix and vector, respectively.

The Gauss–Seidel iterative method is sometimes called the method of


successive iteration, because the most recent values of all xi are used in
the calculation.

Example 2.3 Solve the following system of equations using the Gauss–Seidel iterative method, with $\epsilon = 10^{-5}$ in the $l_\infty$-norm:

15x1 − x2 − 2x3 − 3x4 = 11
−x1 + 15x2 − 2x3 − 3x4 = 22
−x1 − 2x2 + 15x3 − 3x4 = 33
−x1 − 2x2 − 3x3 + 15x4 = 44.

Start with the initial solution $x^{(0)} = [0, 0, 0, 0]^T$.


Solution. The Gauss–Seidel method for the given system is

$$x_1^{(k+1)} = \frac{1}{15}\left[11 + x_2^{(k)} + 2x_3^{(k)} + 3x_4^{(k)}\right]$$

$$x_2^{(k+1)} = \frac{1}{15}\left[22 + x_1^{(k+1)} + 2x_3^{(k)} + 3x_4^{(k)}\right]$$

$$x_3^{(k+1)} = \frac{1}{15}\left[33 + x_1^{(k+1)} + 2x_2^{(k+1)} + 3x_4^{(k)}\right]$$

$$x_4^{(k+1)} = \frac{1}{15}\left[44 + x_1^{(k+1)} + 2x_2^{(k+1)} + 3x_3^{(k+1)}\right],$$

and starting with the initial approximation $x_1^{(0)} = 0, x_2^{(0)} = 0, x_3^{(0)} = 0, x_4^{(0)} = 0$, for k = 0 we obtain

$$x_1^{(1)} = \frac{1}{15}\left[11 + x_2^{(0)} + 2x_3^{(0)} + 3x_4^{(0)}\right] = 0.73333$$

$$x_2^{(1)} = \frac{1}{15}\left[22 + x_1^{(1)} + 2x_3^{(0)} + 3x_4^{(0)}\right] = 1.51556$$

$$x_3^{(1)} = \frac{1}{15}\left[33 + x_1^{(1)} + 2x_2^{(1)} + 3x_4^{(0)}\right] = 2.45096$$

$$x_4^{(1)} = \frac{1}{15}\left[44 + x_1^{(1)} + 2x_2^{(1)} + 3x_3^{(1)}\right] = 3.67449.$$

The first and subsequent iterations are listed in Table 2.3.

The above results can be obtained using MATLAB commands as fol-


lows:

>> Ab = [15 -1 -2 -3 11; -1 15 -2 -3 22; -1 -2 15 -3 33; ...
-1 -2 -3 15 44];
>> x = [0 0 0 0];
>> acc = 1e − 5;
>> GaussSM (Ab, x, acc);

Table 2.3: Solution of Example 2.3.


k      x1(k)      x2(k)      x3(k)      x4(k)
0 0.00000 0.00000 0.00000 0.00000
1 0.73333 1.51556 2.45096 3.67449
2 1.89606 2.65476 3.41527 4.09676
3 2.18504 2.88706 3.54996 4.17394
4 2.23392 2.92371 3.57354 4.18681
5 2.24208 2.92997 3.57749 4.18897
6 2.24346 2.93102 3.57816 4.18933
7 2.24369 2.93120 3.57827 4.18939
8 2.24373 2.93123 3.57829 4.18940
9 2.24374 2.93123 3.57830 4.18941

Note that the Gauss–Seidel method converged for the given system and re-
quired nine iterations to obtain the approximate solution [2.24374, 2.93123,
3.57830, 4.18941]T , which is equal to the exact solution [2.24374, 2.93124,
3.57830, 4.18941]T up to six significant digits, which is six iterations less
than required by the Jacobi method for the same linear system. •

Example 2.4 Solve the following system of equations using the Gauss–Seidel iterative method, with $\epsilon = 10^{-5}$ in the $l_\infty$-norm:

−x1 + 15x2 − 2x3 − 3x4 = 22
15x1 − x2 − 2x3 − 3x4 = 11
−x1 − 2x2 + 15x3 − 3x4 = 33
−x1 − 2x2 − 3x3 + 15x4 = 44.

Start with the initial solution $x^{(0)} = [0, 0, 0, 0]^T$.

Solution. Results for this linear system are listed in Table 2.4. Note that
in this case the Gauss–Seidel method diverges rapidly. Although the given
linear system is the same as the linear system of the previous Example 2.3,
the first and second equations are interchanged. From this example we
conclude that the Gauss–Seidel iterative method is not always convergent.•

Table 2.4: Solution of Example 2.4.


k      x1(k)          x2(k)          x3(k)          x4(k)
0 0.00000 0.00000 0.00000 0.0000
1 –2.2000e+001 –3.4100e+002 –4.4733e+001 –5.2947e+001
2 –4.8887e+003 –7.3093e+004 –1.0080e+004 –1.2085e+004
3 –1.0400e+006 –1.5544e+007 –2.1442e+006 –2.5707e+006
4 –2.2115e+008 –3.3053e+009 –4.5597e+008 –5.4665e+008
5 –4.7028e+010 –7.0287e+011 –9.6960e+010 –1.1624e+011
6 –1.0000e+013 –1.4946e+014 –2.0618e+013 –2.4719e+013
7 –2.1265e+015 –3.1783e+016 –4.3844e+015 –5.2564e+015
8 –4.5220e+017 –6.7585e+018 –9.3233e+017 –1.1177e+018
9 –9.6160e+019 –1.4372e+021 –1.9826e+020 –2.3769e+020

Program 2.2
MATLAB m-file for the Gauss–Seidel Iterative Method
function x=GaussSM(Ab,x,acc)   % Ab = [A b]
[n,t]=size(Ab); b=Ab(1:n,t); R=1; k=1;
d(1,1:n+1)=[0 x]; k=k+1;
while R > acc
    for i=1:n
        sum=0;
        for j=1:n
            if j <= i-1
                sum = sum + Ab(i,j)*d(k,j+1);     % newly updated values
            elseif j >= i+1
                sum = sum + Ab(i,j)*d(k-1,j+1);   % values from previous pass
            end
        end
        x(1,i) = (1/Ab(i,i))*(b(i,1) - sum);
        d(k,1) = k-1; d(k,i+1) = x(1,i);
    end
    R = max(abs(d(k,2:n+1) - d(k-1,2:n+1)));
    k = k+1;
    if R > 100 && k > 10
        disp('Gauss-Seidel method diverges')
        break
    end
end
x = d;
Procedure 2.2 (Gauss–Seidel Method)

1. Check that the coefficient matrix A is strictly diagonally dominant (for guaranteed convergence).

2. Initialize the first approximation $x^{(0)}$ and the preassigned accuracy $\epsilon$.

3. Compute the constant vector $c = (D + L)^{-1}b$.

4. Compute the Gauss–Seidel iteration matrix $T_G = -(D + L)^{-1}U$.

5. Solve for the approximate solutions $x^{(k+1)} = T_G x^{(k)} + c$, for k = 0, 1, 2, . . . .

6. Repeat step 5 until $\|x^{(k+1)} - x^{(k)}\| < \epsilon$.
From Example 2.1 and Example 2.3, we note that the solution by the
Gauss–Seidel method converges more quickly than the Jacobi method. In
general, we may state that if both the Jacobi method and the Gauss–
Seidel method converge, then the Gauss–Seidel method will con-
verge more quickly. This is generally the case but is not always true.
In fact, there are some linear systems for which the Jacobi method con-
verges but the Gauss–Seidel method does not, and others for which the
Gauss–Seidel method converges but the Jacobi method does not.
Example 2.5 Solve the following system of equations using the Jacobi and Gauss–Seidel iterative methods, using $\epsilon = 10^{-5}$ in the $l_\infty$-norm and taking the initial solution $x^{(0)} = [0, 0, 0, 0]^T$:

7x1 + x2 + x4 = 2
2x1 + 5x2 + x3 + 3x4 = 2
4x2 + 5x3 + 2x4 = 3
x1 + 3x2 + 2x3 + 6x4 = 4.
Solution. First, we solve by the Jacobi method; for the given system, the Jacobi formula is

$$x_1^{(k+1)} = \frac{1}{7}\left[2 - x_2^{(k)} - x_4^{(k)}\right]$$

$$x_2^{(k+1)} = \frac{1}{5}\left[2 - 2x_1^{(k)} - x_3^{(k)} - 3x_4^{(k)}\right]$$

$$x_3^{(k+1)} = \frac{1}{5}\left[3 - 4x_2^{(k)} - 2x_4^{(k)}\right]$$

$$x_4^{(k+1)} = \frac{1}{6}\left[4 - x_1^{(k)} - 3x_2^{(k)} - 2x_3^{(k)}\right],$$

Table 2.5: Solution by the Jacobi method.


k      x1(k)      x2(k)      x3(k)      x4(k)
0 0.00000 0.00000 0.00000 0.00000
1 0.28571 0.40000 0.60000 0.66667
2 –0.43810 –0.23429 0.01333 0.21905
3 0.62259 0.44114 0.69981 0.85238
4 –0.52928 –0.50043 –0.09387 0.10906
5 1.05652 0.56505 0.95672 1.03638
6 –0.75027 –0.83578 –0.26659 –0.11085
7 1.61492 0.81994 1.31296 1.29847
8 –1.18825 –1.28764 –0.57535 –0.45011
9 2.37345 1.26043 1.81015 1.70031
10 –1.93787 –1.93160 –1.08847 –0.96251

and starting with the initial approximation $x_1^{(0)} = 0, x_2^{(0)} = 0, x_3^{(0)} = 0, x_4^{(0)} = 0$, for k = 0 we obtain

$$x_1^{(1)} = \frac{1}{7}\left[2 - x_2^{(0)} - x_4^{(0)}\right] = 0.28571$$

$$x_2^{(1)} = \frac{1}{5}\left[2 - 2x_1^{(0)} - x_3^{(0)} - 3x_4^{(0)}\right] = 0.40000$$

$$x_3^{(1)} = \frac{1}{5}\left[3 - 4x_2^{(0)} - 2x_4^{(0)}\right] = 0.60000$$

$$x_4^{(1)} = \frac{1}{6}\left[4 - x_1^{(0)} - 3x_2^{(0)} - 2x_3^{(0)}\right] = 0.66667.$$
6

The first and subsequent iterations are listed in Table 2.5. Now we solve the same system by the Gauss–Seidel method; for the given system, the Gauss–Seidel formula is

$$x_1^{(k+1)} = \frac{1}{7}\left[2 - x_2^{(k)} - x_4^{(k)}\right]$$

$$x_2^{(k+1)} = \frac{1}{5}\left[2 - 2x_1^{(k+1)} - x_3^{(k)} - 3x_4^{(k)}\right]$$

$$x_3^{(k+1)} = \frac{1}{5}\left[3 - 4x_2^{(k+1)} - 2x_4^{(k)}\right]$$

$$x_4^{(k+1)} = \frac{1}{6}\left[4 - x_1^{(k+1)} - 3x_2^{(k+1)} - 2x_3^{(k+1)}\right],$$
and starting with the initial approximation $x_1^{(0)} = 0, x_2^{(0)} = 0, x_3^{(0)} = 0, x_4^{(0)} = 0$, for k = 0 we obtain

$$x_1^{(1)} = \frac{1}{7}\left[2 - x_2^{(0)} - x_4^{(0)}\right] = 0.28571$$

$$x_2^{(1)} = \frac{1}{5}\left[2 - 2x_1^{(1)} - x_3^{(0)} - 3x_4^{(0)}\right] = 0.28571$$

$$x_3^{(1)} = \frac{1}{5}\left[3 - 4x_2^{(1)} - 2x_4^{(0)}\right] = 0.37143$$

$$x_4^{(1)} = \frac{1}{6}\left[4 - x_1^{(1)} - 3x_2^{(1)} - 2x_3^{(1)}\right] = 0.35238.$$
6

Table 2.6: Solution by the Gauss–Seidel method.


k      x1(k)      x2(k)      x3(k)      x4(k)
0 0.00000 0.00000 0.00000 0.00000
1 0.28571 0.28571 0.37143 0.35238
2 –0.21361 0.19973 0.29927 0.50265
3 –0.09995 0.07854 0.33611 0.53202
4 0.08630 –0.02095 0.40395 0.52811
5 0.24319 –0.09493 0.46470 0.51870
6 0.36080 –0.14848 0.51130 0.51034
7 0.44613 –0.18692 0.54540 0.50397
8 0.50745 –0.21444 0.56996 0.49933
9 0.55136 –0.23413 0.58758 0.49598
10 0.58278 –0.24822 0.60018 0.49358
...    ...        ...        ...        ...
25 0.66118 –0.28335 0.63163 0.48760
26 0.66132 –0.28342 0.63169 0.48759
27 0.66143 –0.28346 0.63174 0.48758
28 0.66150 –0.28350 0.63177 0.48758

The first and subsequent iterations are listed in Table 2.6. Note that the Ja-
cobi method diverged and the Gauss–Seidel method converged after 28 itera-
tions with the approximate solution [0.66150, −0.28350, 0.63177, 0.48758]T
of the given system, which has the exact solution [0.66169, −0.28358, 0.63184,
0.48756]T . •

Example 2.6 Solve the following system of equations using the Jacobi and Gauss–Seidel iterative methods, using $\epsilon = 10^{-5}$ in the $l_\infty$-norm and taking the initial solution $x^{(0)} = [0, 0, 0, 0]^T$:

x1 + 2x2 − 2x3 = 1
x1 + x2 + x3 = 2
2x1 + 2x2 + x3 = 3
x1 + x2 + x3 + x4 = 4.



Solution. First, we solve by the Jacobi method; for the given system, the Jacobi formula is

$$x_1^{(k+1)} = 1 - 2x_2^{(k)} + 2x_3^{(k)}$$

$$x_2^{(k+1)} = 2 - x_1^{(k)} - x_3^{(k)}$$

$$x_3^{(k+1)} = 3 - 2x_1^{(k)} - 2x_2^{(k)}$$

$$x_4^{(k+1)} = 4 - x_1^{(k)} - x_2^{(k)} - x_3^{(k)},$$

and starting with the initial approximation $x_1^{(0)} = 0, x_2^{(0)} = 0, x_3^{(0)} = 0, x_4^{(0)} = 0$, for k = 0 we obtain

$$x_1^{(1)} = 1 - 2x_2^{(0)} + 2x_3^{(0)} = 1.0000$$

$$x_2^{(1)} = 2 - x_1^{(0)} - x_3^{(0)} = 2.0000$$

$$x_3^{(1)} = 3 - 2x_1^{(0)} - 2x_2^{(0)} = 3.0000$$

$$x_4^{(1)} = 4 - x_1^{(0)} - x_2^{(0)} - x_3^{(0)} = 4.0000.$$

The first and subsequent iterations are listed in Table 2.7. Now we solve
the same system by the Gauss–Seidel method and for the given system, the

Table 2.7: Solution by the Jacobi method.


k      x1(k)      x2(k)      x3(k)      x4(k)
0 0.0000 0.0000 0.0000 0.0000
1 1.0000 2.0000 3.0000 4.0000
2 3.0000 –2.0000 –3.0000 –2.0000
3 –1.0000 2.0000 1.0000 6.0000
4 –1.0000 2.0000 1.0000 2.0000
5 –1.0000 2.0000 1.0000 2.0000

Gauss–Seidel formula is

$$x_1^{(k+1)} = 1 - 2x_2^{(k)} + 2x_3^{(k)}$$

$$x_2^{(k+1)} = 2 - x_1^{(k+1)} - x_3^{(k)}$$

$$x_3^{(k+1)} = 3 - 2x_1^{(k+1)} - 2x_2^{(k+1)}$$

$$x_4^{(k+1)} = 4 - x_1^{(k+1)} - x_2^{(k+1)} - x_3^{(k+1)},$$

and starting with the initial approximation $x_1^{(0)} = 0, x_2^{(0)} = 0, x_3^{(0)} = 0, x_4^{(0)} = 0$, for k = 0 we obtain

$$x_1^{(1)} = 1 - 2x_2^{(0)} + 2x_3^{(0)} = 1.0000$$

$$x_2^{(1)} = 2 - x_1^{(1)} - x_3^{(0)} = 1.0000$$

$$x_3^{(1)} = 3 - 2x_1^{(1)} - 2x_2^{(1)} = -1.0000$$

$$x_4^{(1)} = 4 - x_1^{(1)} - x_2^{(1)} - x_3^{(1)} = 3.0000.$$
The first and subsequent iterations are listed in Table 2.8. Note that the

Table 2.8: Solution by the Gauss–Seidel method.


k      x1(k)        x2(k)        x3(k)        x4(k)
0 0.0000 0.0000 0.0000 0.0000
1 1.0000 1.0000 -1.0000 3.0000
2 -3.0000 6.0000 -3.0000 4.0000
3 -17.0000 22.0000 -7.0000 6.0000
4 -57.0000 66.0000 -15.0000 10.0000
5 -417.0000 450.0000 -63.0000 34.0000
7 -1025.0000 1090.0000 -127.0000 66.0000
8 -2433.0000 2562.0000 -255.0000 130.0000
9 -5633.0000 5890.0000 -511.0000 258.0000

Jacobi method converged quickly (only five iterations) but the Gauss–Seidel
method diverged for the given system. •

Example 2.7 Consider the system:

6x1 + 2x2 = 1
x1 + 7x2 − 2x3 = 2
3x1 − 2x2 + 9x3 = −1.

(a) Find the matrix form of the iterative (Jacobi and Gauss–Seidel) methods.
(b) If $x^{(k)} = [x_1^{(k)}, x_2^{(k)}, x_3^{(k)}]^T$, then write the iterative forms of part (a) in component form and find the exact solution of the given system.
(c) Find the formula for the error $e^{(k+1)}$ in the (k + 1)th step.
(d) Find the second approximation of the error, $e^{(2)}$, using part (c) if $x^{(0)} = [0, 0, 0]^T$.

Solution. Since the given matrix A is

$$A = \begin{pmatrix} 6 & 2 & 0 \\ 1 & 7 & -2 \\ 3 & -2 & 9 \end{pmatrix},$$

we have

$$A = L + U + D = \begin{pmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 3 & -2 & 0 \end{pmatrix} + \begin{pmatrix} 0 & 2 & 0 \\ 0 & 0 & -2 \\ 0 & 0 & 0 \end{pmatrix} + \begin{pmatrix} 6 & 0 & 0 \\ 0 & 7 & 0 \\ 0 & 0 & 9 \end{pmatrix}.$$

Jacobi Iterative Method

(a) Since the matrix form of the Jacobi iterative method can be written as

$$x^{(k+1)} = T_J x^{(k)} + c, \qquad k = 0, 1, 2, \ldots,$$

where $T_J = -D^{-1}(L + U)$ and $c = D^{-1}b$, one can easily compute the Jacobi iteration matrix $T_J$ and the vector c as follows:

$$T_J = \begin{pmatrix} 0 & -\frac{1}{3} & 0 \\ -\frac{1}{7} & 0 & \frac{2}{7} \\ -\frac{1}{3} & \frac{2}{9} & 0 \end{pmatrix} \quad \text{and} \quad c = \begin{pmatrix} \frac{1}{6} \\ \frac{2}{7} \\ -\frac{1}{9} \end{pmatrix}.$$

Thus, the matrix form of the Jacobi iterative method is $x^{(k+1)} = T_J x^{(k)} + c$ with the above $T_J$ and c.

(b) Writing the above matrix form in components, we have

$$x_1 = -\frac{1}{3}x_2 + \frac{1}{6}, \qquad x_2 = -\frac{1}{7}x_1 + \frac{2}{7}x_3 + \frac{2}{7}, \qquad x_3 = -\frac{1}{3}x_1 + \frac{2}{9}x_2 - \frac{1}{9}.$$

Solving for x1, x2, and x3, we get

$$x = \left[\frac{1}{12}, \; \frac{1}{4}, \; -\frac{1}{12}\right]^T,$$

which is the exact solution of the given system.

(c) The error in the (k + 1)th step is defined as $e^{(k+1)} = x - x^{(k+1)}$. Since $x = T_J x + c$ while $x^{(k+1)} = T_J x^{(k)} + c$, subtracting gives

$$e^{(k+1)} = T_J(x - x^{(k)}) = \begin{pmatrix} 0 & -\frac{1}{3} & 0 \\ -\frac{1}{7} & 0 & \frac{2}{7} \\ -\frac{1}{3} & \frac{2}{9} & 0 \end{pmatrix} e^{(k)},$$

which is the required error in the (k + 1)th step.
(d) For the first approximation of the error, we need $e^{(0)} = x - x^{(0)}$. Using $x^{(0)} = [0, 0, 0]^T$, we have

$$e^{(0)} = \left[\frac{1}{12}, \; \frac{1}{4}, \; -\frac{1}{12}\right]^T.$$

Thus,

$$e^{(1)} = T_J e^{(0)} = \left[-\frac{1}{12}, \; -\frac{1}{28}, \; \frac{1}{36}\right]^T.$$

Similarly, for the second approximation of the error,

$$e^{(2)} = T_J e^{(1)} = \left[\frac{1}{84}, \; \frac{5}{252}, \; \frac{5}{252}\right]^T,$$

which is the required second approximation of the error.

Gauss–Seidel Iterative Method

(a) Using the Gauss–Seidel method, we first compute the Gauss–Seidel iteration matrix $T_G = -(D + L)^{-1}U$ and the vector $c = (D + L)^{-1}b$:

$$T_G = \begin{pmatrix} 0 & -\frac{1}{3} & 0 \\ 0 & \frac{1}{21} & \frac{2}{7} \\ 0 & \frac{23}{189} & \frac{4}{63} \end{pmatrix} \quad \text{and} \quad c = \begin{pmatrix} \frac{1}{6} \\ \frac{11}{42} \\ -\frac{41}{378} \end{pmatrix}.$$

Thus, the matrix form of the Gauss–Seidel iterative method is

$$x^{(k+1)} = T_G x^{(k)} + c, \qquad k = 0, 1, 2, \ldots$$

(b) Writing this iterative form in components, we get

$$x_1 = -\frac{1}{3}x_2 + \frac{1}{6}, \qquad x_2 = \frac{1}{21}x_2 + \frac{2}{7}x_3 + \frac{11}{42}, \qquad x_3 = \frac{23}{189}x_2 + \frac{4}{63}x_3 - \frac{41}{378}.$$

Solving for x1, x2, and x3, we again obtain the exact solution

$$x = \left[\frac{1}{12}, \; \frac{1}{4}, \; -\frac{1}{12}\right]^T.$$

(c) Exactly as for the Jacobi method, the error in the (k + 1)th step is

$$e^{(k+1)} = T_G e^{(k)}.$$

(d) The first and second approximations of the error are

$$e^{(1)} = T_G e^{(0)} = \left[-\frac{1}{12}, \; -\frac{1}{84}, \; \frac{19}{756}\right]^T \quad \text{and} \quad e^{(2)} = T_G e^{(1)} = \left[\frac{1}{252}, \; \frac{5}{756}, \; \frac{1}{6804}\right]^T,$$

which is the required second approximation of the error. •

2.4 Convergence Criteria

Since we noted that the Jacobi method and the Gauss–Seidel method do not always converge to the solution of the given system of linear equations, we need conditions that make both methods converge. A sufficient condition for the convergence of both methods is given in the following theorem.

Theorem 2.1 (Sufficient Condition for Convergence)

If a matrix A is strictly diagonally dominant, then for any choice of initial approximation $x^{(0)} \in \mathbb{R}^n$, both the Jacobi method and the Gauss–Seidel method give a sequence $\{x^{(k)}\}_{k=0}^{\infty}$ of approximations that converges to the solution of the linear system. •

There is another sufficient condition for the convergence of both iterative methods, which is given in the following theorem.

Theorem 2.2 (Sufficient Condition for Convergence)

For any initial approximation $x^{(0)} \in \mathbb{R}^n$, the sequence $\{x^{(k)}\}_{k=0}^{\infty}$ of approximations defined by

$$x^{(k+1)} = Tx^{(k)} + c, \quad \text{for each } k \ge 0, \; c \ne 0, \qquad (2.19)$$

converges to the unique solution of x = Tx + c if $\|T\| < 1$, for any natural matrix norm, and the following error bounds hold:

$$\|x - x^{(k)}\| \le \|T\|^k \|x^{(0)} - x\|, \qquad \|x - x^{(k)}\| \le \frac{\|T\|^k}{1 - \|T\|}\,\|x^{(1)} - x^{(0)}\|. \qquad (2.20)$$

Note that the smaller the value of kT k, the faster the convergence of
the iterative methods.

Example 2.8 Show that for the nonhomogeneous linear system Ax = b, with the matrix

$$A = \begin{pmatrix} 5 & 0 & -1 \\ -1 & 3 & 0 \\ 0 & -1 & 4 \end{pmatrix},$$

the Gauss–Seidel iterative method converges faster than the Jacobi iterative method.

Solution. Here we will show that the $l_\infty$-norm of the Gauss–Seidel iteration matrix $T_G$ is less than the $l_\infty$-norm of the Jacobi iteration matrix $T_J$, i.e., $\|T_G\|_\infty < \|T_J\|_\infty$.

The Jacobi iteration matrix TJ can be obtained from the given matrix A as

follows:

$$T_J = -D^{-1}(L + U) = -\begin{pmatrix} 5 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 4 \end{pmatrix}^{-1}\begin{pmatrix} 0 & 0 & -1 \\ -1 & 0 & 0 \\ 0 & -1 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 0 & \frac{1}{5} \\ \frac{1}{3} & 0 & 0 \\ 0 & \frac{1}{4} & 0 \end{pmatrix}.$$

Then the $l_\infty$-norm of the matrix $T_J$ is

$$\|T_J\|_\infty = \max\left\{\frac{1}{5}, \frac{1}{3}, \frac{1}{4}\right\} = \frac{1}{3} = 0.3333 < 1.$$

The Gauss–Seidel iteration matrix $T_G$ is defined as

$$T_G = -(D + L)^{-1}U = -\begin{pmatrix} 5 & 0 & 0 \\ -1 & 3 & 0 \\ 0 & -1 & 4 \end{pmatrix}^{-1}\begin{pmatrix} 0 & 0 & -1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix},$$

and it gives

$$T_G = \begin{pmatrix} 0 & 0 & \frac{1}{5} \\ 0 & 0 & \frac{1}{15} \\ 0 & 0 & \frac{1}{60} \end{pmatrix}.$$

Then the $l_\infty$-norm of the matrix $T_G$ is

$$\|T_G\|_\infty = \max\left\{\frac{1}{5}, \frac{1}{15}, \frac{1}{60}\right\} = \frac{1}{5} = 0.2000 < 1,$$

which shows that the Gauss–Seidel method will converge faster than the Jacobi method for the given linear system. •

Note that the condition $\|T\| < 1$ is equivalent to the condition that the matrix A be strictly diagonally dominant.

For the Jacobi method applied to a general matrix A, the norm of the Jacobi iteration matrix is

$$\|T_J\|_\infty = \max_{1 \le i \le n} \sum_{\substack{j=1 \\ j \ne i}}^{n} \left|\frac{a_{ij}}{a_{ii}}\right|.$$

Thus, $\|T_J\|_\infty < 1$ is equivalent to requiring

$$\sum_{\substack{j=1 \\ j \ne i}}^{n} |a_{ij}| < |a_{ii}|, \qquad i = 1, 2, \ldots, n,$$

i.e., the matrix A is strictly diagonally dominant.


Example 2.9 Consider the following linear system of equations:

10x1 + 2x2 + x3 + x4 = 5
x1 + 12x2 + x3 + 2x4 = 9
2x1 + x2 + 13x3 + 3x4 = 1
x1 + 2x2 + x3 + 15x4 = 13.

(a) Show that both iterative methods (Jacobi and Gauss–Seidel) will converge, by using $\|T\|_\infty < 1$.
(b) Find the second approximation $x^{(2)}$ when the initial solution is $x^{(0)} = [0, 0, 0, 0]^T$.
(c) Compute the error bounds for your approximations.
(d) How many iterations are needed to get an accuracy within $10^{-4}$?

Solution. Since the given matrix A is

$$A = \begin{pmatrix} 10 & 2 & 1 & 1 \\ 1 & 12 & 1 & 2 \\ 2 & 1 & 13 & 3 \\ 1 & 2 & 1 & 15 \end{pmatrix},$$

from (2.5) we have

$$L = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 2 & 1 & 0 & 0 \\ 1 & 2 & 1 & 0 \end{pmatrix}, \quad D = \begin{pmatrix} 10 & 0 & 0 & 0 \\ 0 & 12 & 0 & 0 \\ 0 & 0 & 13 & 0 \\ 0 & 0 & 0 & 15 \end{pmatrix}, \quad U = \begin{pmatrix} 0 & 2 & 1 & 1 \\ 0 & 0 & 1 & 2 \\ 0 & 0 & 0 & 3 \\ 0 & 0 & 0 & 0 \end{pmatrix}.$$

Jacobi Iterative Method

(a) The Jacobi iteration matrix is

$$T_J = -D^{-1}(L + U) = \begin{pmatrix} 0 & -\frac{2}{10} & -\frac{1}{10} & -\frac{1}{10} \\ -\frac{1}{12} & 0 & -\frac{1}{12} & -\frac{2}{12} \\ -\frac{2}{13} & -\frac{1}{13} & 0 & -\frac{3}{13} \\ -\frac{1}{15} & -\frac{2}{15} & -\frac{1}{15} & 0 \end{pmatrix},$$

so the $l_\infty$-norm of the matrix $T_J$ is

$$\|T_J\|_\infty = \max\left\{\frac{4}{10}, \frac{4}{12}, \frac{6}{13}, \frac{4}{15}\right\} = \frac{6}{13} = 0.46154 < 1.$$

Thus, the Jacobi method will converge for the given linear system.

(b) The Jacobi method for the given system is

$$x_1^{(k+1)} = \frac{1}{10}\left[5 - 2x_2^{(k)} - x_3^{(k)} - x_4^{(k)}\right]$$

$$x_2^{(k+1)} = \frac{1}{12}\left[9 - x_1^{(k)} - x_3^{(k)} - 2x_4^{(k)}\right]$$

$$x_3^{(k+1)} = \frac{1}{13}\left[1 - 2x_1^{(k)} - x_2^{(k)} - 3x_4^{(k)}\right]$$

$$x_4^{(k+1)} = \frac{1}{15}\left[13 - x_1^{(k)} - 2x_2^{(k)} - x_3^{(k)}\right].$$

Starting with the initial approximation $x^{(0)} = [0, 0, 0, 0]^T$, for k = 0, 1 we obtain the first and second approximations:

$$x^{(1)} = [0.5, 0.75, 0.07692, 0.86667]^T, \qquad x^{(2)} = [0.25564, 0.62970, -0.12436, 0.72821]^T.$$

(c) Using the error bound formula (2.20), we obtain

$$\|x - x^{(2)}\| \le \frac{(6/13)^2}{1 - 6/13}\,\|x^{(1)} - x^{(0)}\|_\infty = \frac{0.21302}{0.53846}(0.86667) = 0.34286.$$

(d) To find the number of iterations, we use formula (2.20):

$$\|x - x^{(k)}\| \le \frac{\|T_J\|^k}{1 - \|T_J\|}\,\|x^{(1)} - x^{(0)}\| \le 10^{-4}.$$

This gives

$$\frac{(6/13)^k}{7/13}(0.86667) \le 10^{-4}, \quad \text{i.e.,} \quad (0.46154)^k \le 6.21 \times 10^{-5}.$$

Taking the natural logarithm of both sides, we obtain

$$k \ln(0.46154) \le \ln(6.21 \times 10^{-5}), \quad \text{i.e.,} \quad k(-0.77327) \le -9.68621,$$

and it gives k ≥ 12.5263 or k = 13,

which is the required number of iterations.

Gauss–Seidel Iterative Method

(a) The Gauss–Seidel iteration matrix is

$$T_G = -(D + L)^{-1}U = \begin{pmatrix} 0 & -\frac{1}{5} & -\frac{1}{10} & -\frac{1}{10} \\ 0 & \frac{1}{60} & -\frac{3}{40} & -\frac{19}{120} \\ 0 & \frac{23}{780} & \frac{11}{520} & -\frac{317}{1560} \\ 0 & \frac{107}{11700} & \frac{119}{7800} & \frac{967}{23400} \end{pmatrix},$$

so the $l_\infty$-norm of the matrix $T_G$ is

$$\|T_G\|_\infty = \max\{0.4, 0.25, 0.2538, 0.0657\} = 0.4 < 1.$$

Thus, the Gauss–Seidel method will converge for the given linear system.

(b) The Gauss–Seidel method for the given system is

$$x_1^{(k+1)} = \frac{1}{10}\left[5 - 2x_2^{(k)} - x_3^{(k)} - x_4^{(k)}\right]$$

$$x_2^{(k+1)} = \frac{1}{12}\left[9 - x_1^{(k+1)} - x_3^{(k)} - 2x_4^{(k)}\right]$$

$$x_3^{(k+1)} = \frac{1}{13}\left[1 - 2x_1^{(k+1)} - x_2^{(k+1)} - 3x_4^{(k)}\right]$$

$$x_4^{(k+1)} = \frac{1}{15}\left[13 - x_1^{(k+1)} - 2x_2^{(k+1)} - x_3^{(k+1)}\right].$$

Starting with the initial approximation $x^{(0)} = [0, 0, 0, 0]^T$, for k = 0, 1 we obtain the first and second approximations:

$$x^{(1)} = [0.5, 0.70833, -0.05449, 0.74252]^T, \qquad x^{(2)} = [0.28953, 0.66854, -0.07616, 0.76330]^T.$$

(c) Using the error bound formula (2.20), we obtain

$$\|x - x^{(2)}\| \le \frac{(0.4)^2}{1 - 0.4}\,\|x^{(1)} - x^{(0)}\|_\infty = \frac{0.16}{0.6}(0.74252) = 0.19801.$$

(d) To find the number of iterations, we use formula (2.20):

$$\|x - x^{(k)}\| \le \frac{\|T_G\|^k}{1 - \|T_G\|}\,\|x^{(1)} - x^{(0)}\| \le 10^{-4},$$

which gives

$$\frac{(0.4)^k}{0.6}(0.74252) \le 10^{-4}, \quad \text{i.e.,} \quad (0.4)^k \le 8.08 \times 10^{-5}.$$

Taking the natural logarithm of both sides, we obtain $k \ln(0.4) \le \ln(8.08 \times 10^{-5})$, i.e., $k(-0.91629) \le -9.4235$, and it gives k ≥ 10.28441 or k = 11,

which is the required number of iterations. •

Theorem 2.3 If A is a symmetric positive-definite matrix with positive diagonal entries, then the Gauss–Seidel method converges to a unique solution of the linear system Ax = b. •

Example 2.10 Solve the following system of linear equations using the Gauss–Seidel iterative method, with $\epsilon = 10^{-5}$ in the $l_\infty$-norm and taking the initial solution $x^{(0)} = [0, 0, 0, 0]^T$:

5x1 − x3 = 1
14x2 − x3 − x4 = 1
−x1 − x2 + 13x3 = 4
−x2 + 9x4 = 3.

Solution. The matrix

$$A = \begin{pmatrix} 5 & 0 & -1 & 0 \\ 0 & 14 & -1 & -1 \\ -1 & -1 & 13 & 0 \\ 0 & -1 & 0 & 9 \end{pmatrix}$$

of the given system is symmetric positive-definite with positive diagonal

entries, and the Gauss–Seidel formula for the system is

$$x_1^{(k+1)} = \frac{1}{5}\left[1 + x_3^{(k)}\right]$$

$$x_2^{(k+1)} = \frac{1}{14}\left[1 + x_3^{(k)} + x_4^{(k)}\right]$$

$$x_3^{(k+1)} = \frac{1}{13}\left[4 + x_1^{(k+1)} + x_2^{(k+1)}\right]$$

$$x_4^{(k+1)} = \frac{1}{9}\left[3 + x_2^{(k+1)}\right].$$

So starting with the initial approximation $x_1^{(0)} = 0, x_2^{(0)} = 0, x_3^{(0)} = 0, x_4^{(0)} = 0$, for k = 0 we get

$$x_1^{(1)} = \frac{1}{5}\left[1 + x_3^{(0)}\right] = 0.200000$$

$$x_2^{(1)} = \frac{1}{14}\left[1 + x_3^{(0)} + x_4^{(0)}\right] = 0.071429$$

$$x_3^{(1)} = \frac{1}{13}\left[4 + x_1^{(1)} + x_2^{(1)}\right] = 0.328571$$

$$x_4^{(1)} = \frac{1}{9}\left[3 + x_2^{(1)}\right] = 0.341270.$$
The first and subsequent iterations are listed in Table 2.9. •

Note that the Gauss–Seidel method converged very fast (only five it-
erations) and the approximate solution of the given system [0.267505,
0.120302, 0.337524, 0.346700]T is equal to the exact solution [0.267505,
0.120302, 0.337524, 0.346700]T up to six decimal places.

Table 2.9: Solution by the Gauss–Seidel method.


k      x1(k)        x2(k)        x3(k)        x4(k)
0 0.000000 0.000000 0.000000 0.000000
1 0.200000 0.071429 0.328571 0.341270
2 0.265714 0.119274 0.337307 0.346586
3 0.267461 0.120278 0.337518 0.346698
4 0.267504 0.120301 0.337524 0.346700
5 0.267505 0.120302 0.337524 0.346700

2.5 Eigenvalues and Eigenvectors


Here, we will briefly discuss the eigenvalues and eigenvectors of an n × n
matrix. We also show how they can be used to describe the solutions of
linear systems.
Definition 2.1 A scalar λ is said to be an eigenvalue of an n × n matrix A if there exists a nonzero vector x, called an eigenvector, such that

$$Ax = \lambda x. \qquad (2.21)$$

The relation (2.21) represents the eigenvalue problem, and we refer to (λ, x) as an eigenpair. •
The equivalent form of (2.21) is
(A − λI)x = 0, (2.22)
where I is an n × n identity matrix. The system of equations (2.22) has
the nontrivial solution x if, and only if, A − λI is singular or, equivalently,
det(A − λI) = |A − λI| = 0. (2.23)
The above relation (2.23) represents a polynomial equation in λ of
degree n which in principle could be used to obtain the eigenvalues of the
matrix A. This equation is called the characteristic equation of A. There
are n roots of (2.23), which we will denote by λ1 , λ2 , . . . , λn . For a given
eigenvalue λi , the corresponding eigenvector xi is not uniquely determined.
If x is an eigenvector, then so is αx, where α is any nonzero scalar.

Example 2.11 Find the eigenvalues and eigenvectors of the following matrix:

$$A = \begin{pmatrix} -6 & 0 & 0 \\ 11 & -3 & 0 \\ -3 & 6 & 7 \end{pmatrix}.$$
Solution. To find the eigenvalues of the given matrix A by using (2.23),
we have
$$\begin{vmatrix} -6-\lambda & 0 & 0 \\ 11 & -3-\lambda & 0 \\ -3 & 6 & 7-\lambda \end{vmatrix} = 0,$$
which gives a characteristic equation of the form
λ3 + 2λ2 − 45λ − 126 = 0.
It factorizes to
(−6 − λ)(−3 − λ)(7 − λ) = 0
and gives us the eigenvalues λ = −6, λ = −3, and λ = 7 of the given
matrix A. Note that the sum of these eigenvalues is −2, and this agrees
with the trace of A. After finding the eigenvalues of the matrix we turn to
the problem of finding eigenvectors. The eigenvectors of A corresponding
to the eigenvalues λ are the nonzero vectors x that satisfy (2.22). Equiv-
alently, the eigenvectors corresponding to λ are the nonzero vectors in the
solution space of (2.22). We call this solution space the eigenspace of A
corresponding to λ.

To find the eigenvectors of the above given matrix A corresponding to


each of these eigenvalues, we substitute each of these three eigenvalues in
(2.22). When λ = −6, we have
    
0 0 0 x1 0
 11 3 0   x2  =  0  ,
−3 6 13 x3 0
which implies that
0x1 + 0x2 + 0x3 = 0
11x1 + 3x2 + 0x3 = 0
−3x1 + 6x2 + 13x3 = 0.

Solving this system, we get x1 = 39, x2 = −143, and x3 = 75. Hence, the eigenvector $x^{(1)}$ corresponding to the first eigenvalue, $\lambda_1 = -6$, is

$$x^{(1)} = \alpha[39, -143, 75]^T, \quad \text{where } \alpha \in \mathbb{R}, \; \alpha \ne 0.$$

When λ = −3, we have


    
−3 0 0 x1 0
 11 0 0   x2  =  0  ,
−3 6 10 x3 0

which implies that

−3x1 + 0x2 + 0x3 = 0


11x1 + 0x2 + 0x3 = 0
−3x1 + 6x2 + 10x3 = 0,

which gives the solution, x1 = 0, x2 = 5, and x3 = −3. Hence, the eigen-


vector x(2) corresponding to the second eigenvalue, λ2 = −3, is

$$x^{(2)} = \alpha[0, 5, -3]^T, \quad \text{where } \alpha \in \mathbb{R}, \; \alpha \ne 0.$$

Finally, when λ = 7, we have


    
−13 0 0 x1 0
 11 −10 0   x2  =  0  ,
−3 6 0 x3 0

which implies that

−13x1 + 0x2 + 0x3 = 0


11x1 − 10x2 + 0x3 = 0
−3x1 + 6x2 + 0x3 = 0,

which gives x1 = x2 = 0, and x3 = 1. Hence,

$$x^{(3)} = \alpha[0, 0, 1]^T, \quad \text{where } \alpha \in \mathbb{R}, \; \alpha \ne 0,$$

is the eigenvector x(3) corresponding to the third eigenvalue, λ3 = 7. •



The MATLAB command eig is the basic eigenvalue and eigenvector


routine. The command

>> D = eig(A);

returns a vector containing all the eigenvalues of the matrix A. If the


eigenvectors are also wanted, the syntax

>> [X, D] = eig(A);

will return a matrix X whose columns are eigenvectors of A corresponding


to the eigenvalues in the diagonal matrix D. To get the results of Exam-
ple 2.11, we use the MATLAB Command Window as follows:

>> A = [−6 0 0; 11 − 3 0; −3 6 7];


>> P = poly(A);
>> P P = poly2sym(P );
>> [X, D] = eig(A);
>> eigenvalues = diag(D);

Definition 2.2 (Spectral Radius of a Matrix)

Let A be an n × n matrix. Then the spectral radius ρ(A) of a matrix A is


defined as
ρ(A) = max |λi |,
1≤i≤n

where λi are the eigenvalues of a matrix A. •

For example, the matrix


 
4 1 −3
A= 0 0 2 
0 0 −3

has the characteristic equation of the form

det(A − λI) = −λ3 + λ2 + 12λ = 0,



which gives the eigenvalues λ = 4, 0, −3 of A. Hence, the spectral radius


of A is
ρ(A) = max{|4|, |0|, | − 3|} = 4.
The spectral radius of a matrix A may be found using MATLAB com-
mands as follows:

>> A = [4 1 − 3; 0 0 2; 0 0 − 3];
>> B = max(eig(A))
B=
4

Example 2.12 For the matrix

$$A = \begin{pmatrix} a & b \\ c & d \end{pmatrix},$$

if the eigenvalues of the Jacobi iteration matrix and the Gauss–Seidel iteration matrix are $\lambda_i$ and $\mu_i$, respectively, then show that $\mu_{\max} = \lambda_{\max}^2$.

Solution. Decompose the given matrix into the following form:

$$A = L + D + U = \begin{pmatrix} 0 & 0 \\ c & 0 \end{pmatrix} + \begin{pmatrix} a & 0 \\ 0 & d \end{pmatrix} + \begin{pmatrix} 0 & b \\ 0 & 0 \end{pmatrix}.$$

First, we define the Jacobi iteration matrix as $T_J = -D^{-1}(L + U)$, and computing the right-hand side, we get

$$T_J = \begin{pmatrix} 0 & -\frac{b}{a} \\ -\frac{c}{d} & 0 \end{pmatrix}.$$

To find the eigenvalues of the matrix $T_J$, we set

$$\det(T_J - \lambda I) = \begin{vmatrix} -\lambda & -\frac{b}{a} \\ -\frac{c}{d} & -\lambda \end{vmatrix} = 0,$$

which gives

$$\lambda_1 = -\sqrt{\frac{cb}{ad}}, \qquad \lambda_2 = \sqrt{\frac{cb}{ad}}, \qquad \text{so} \qquad \lambda_{\max} = \sqrt{\frac{cb}{ad}}.$$

Similarly, we can find the Gauss–Seidel iteration matrix $T_G = -(L + D)^{-1}U$, and computing the right-hand side, we get

$$T_G = \begin{pmatrix} 0 & -\frac{b}{a} \\ 0 & \frac{cb}{ad} \end{pmatrix}.$$

To find the eigenvalues of the matrix $T_G$, we set

$$\det(T_G - \mu I) = \begin{vmatrix} -\mu & -\frac{b}{a} \\ 0 & \frac{cb}{ad} - \mu \end{vmatrix} = 0,$$

which gives

$$\mu_1 = 0, \qquad \mu_2 = \frac{cb}{ad}, \qquad \text{so} \qquad \mu_{\max} = \frac{cb}{ad}.$$

Thus,

$$\mu_{\max} = \left(\sqrt{\frac{cb}{ad}}\right)^2 = \lambda_{\max}^2,$$

which is the required result. •

The necessary and sufficient condition for the convergence of the Jacobi
iterative method and the Gauss–Seidel iterative method is defined in the
following theorem.

Theorem 2.4 (Necessary and Sufficient Condition for Convergence)

For any initial approximation $x^{(0)} \in \mathbb{R}^n$, the sequence $\{x^{(k)}\}_{k=0}^{\infty}$ of approximations defined by

$$x^{(k+1)} = Tx^{(k)} + c, \quad \text{for each } k \ge 0, \; c \ne 0, \qquad (2.24)$$

converges to the unique solution of x = Tx + c if and only if $\rho(T) < 1$.

Note that the condition $\rho(T) < 1$ is satisfied when $\|T\| < 1$, because $\rho(T) \le \|T\|$ for any natural norm. •

No general results exist to help us choose between the Jacobi method and the Gauss–Seidel method for an arbitrary linear system. However, the following theorem covers a special case.

Theorem 2.5 If $a_{ij} \le 0$ for each $i \ne j$ and $a_{ii} > 0$ for each i = 1, 2, . . . , n, then one and only one of the following statements holds:

1. $0 \le \rho(T_G) < \rho(T_J) < 1$.
2. $1 < \rho(T_J) < \rho(T_G)$.
3. $\rho(T_J) = \rho(T_G) = 0$.
4. $\rho(T_J) = \rho(T_G) = 1$. •

Example 2.13 Find the spectral radius of the Jacobi and the Gauss–Seidel iteration matrices for each of the following matrices:

$$(a) \; A = \begin{pmatrix} 2 & 0 & -1 \\ -1 & 3 & 0 \\ 0 & -1 & 4 \end{pmatrix}, \qquad (b) \; A = \begin{pmatrix} 1 & -1 & 1 \\ -2 & 2 & -1 \\ 0 & 1 & 5 \end{pmatrix},$$

$$(c) \; A = \begin{pmatrix} 1 & 0 & 0 \\ -1 & 2 & 0 \\ 0 & -1 & 3 \end{pmatrix}, \qquad (d) \; A = \begin{pmatrix} 1 & 0 & -1 \\ 1 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix}.$$

Solution. (a) The Jacobi iteration matrix $T_J$ for the given matrix A is

$$T_J = \begin{pmatrix} 0 & 0 & \frac{1}{2} \\ \frac{1}{3} & 0 & 0 \\ 0 & \frac{1}{4} & 0 \end{pmatrix},$$

and the characteristic equation of the matrix $T_J$ is

$$\det(T_J - \lambda I) = -\lambda^3 + \frac{1}{24} = 0.$$

Solving this cubic polynomial, the maximum eigenvalue (in absolute value) of $T_J$ is $(1/24)^{1/3}$, i.e.,

$$\rho(T_J) = 0.3467.$$

Also, the Gauss–Seidel iteration matrix $T_G$ for the given matrix A is

$$T_G = \begin{pmatrix} 0 & 0 & \frac{1}{2} \\ 0 & 0 & \frac{1}{6} \\ 0 & 0 & \frac{1}{24} \end{pmatrix},$$

which has the characteristic equation

$$\det(T_G - \lambda I) = -\lambda^3 + \frac{1}{24}\lambda^2 = 0.$$

Solving this cubic polynomial, we obtain the maximum eigenvalue of $T_G$, namely 1/24, i.e.,

$$\rho(T_G) = \frac{1}{24} = 0.0417.$$

(b) The Jacobi iteration matrix $T_J$ for the given matrix A is

$$T_J = \begin{pmatrix} 0 & 1 & -1 \\ 1 & 0 & \frac{1}{2} \\ 0 & -\frac{1}{5} & 0 \end{pmatrix},$$

with the characteristic equation

$$\det(T_J - \lambda I) = -\lambda^3 + \frac{9}{10}\lambda + \frac{1}{5} = 0,$$

and it gives

$$\rho(T_J) = 1.0447.$$

The Gauss–Seidel iteration matrix is

$$T_G = \begin{pmatrix} 0 & 1 & -1 \\ 0 & 1 & -\frac{1}{2} \\ 0 & -\frac{1}{5} & \frac{1}{10} \end{pmatrix},$$

with the characteristic equation

$$\det(T_G - \lambda I) = -\lambda^3 + \frac{11}{10}\lambda^2 = 0,$$

and it gives

$$\rho(T_G) = \frac{11}{10} = 1.1000.$$

Similarly, for the matrices of (c) and (d), we have

$$T_J = \begin{pmatrix} 0 & 0 & 0 \\ \frac{1}{2} & 0 & 0 \\ 0 & \frac{1}{3} & 0 \end{pmatrix}, \quad T_G = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad \text{with} \quad \rho(T_J) = \rho(T_G) = 0.0000,$$

and

$$T_J = \begin{pmatrix} 0 & 0 & 1 \\ -1 & 0 & 0 \\ 0 & -1 & 0 \end{pmatrix}, \quad T_G = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & -1 \\ 0 & 0 & 1 \end{pmatrix}, \quad \text{with} \quad \rho(T_J) = \rho(T_G) = 1.0000,$$

respectively. •

Definition 2.3 (Convergent Matrix)

An n × n matrix A is called a convergent matrix if

$$\lim_{k \to \infty} (A^k)_{ij} = 0, \quad \text{for each } i, j = 1, 2, \ldots, n.$$

Example 2.14 Show that the matrix

$$A = \begin{pmatrix} \frac{1}{3} & 0 \\ \frac{1}{9} & \frac{1}{3} \end{pmatrix}$$

is a convergent matrix.

Solution. By computing the powers of the given matrix, we obtain

$$A^2 = \begin{pmatrix} \frac{1}{9} & 0 \\ \frac{2}{27} & \frac{1}{9} \end{pmatrix}, \quad A^3 = \begin{pmatrix} \frac{1}{27} & 0 \\ \frac{3}{81} & \frac{1}{27} \end{pmatrix}, \quad A^4 = \begin{pmatrix} \frac{1}{81} & 0 \\ \frac{4}{243} & \frac{1}{81} \end{pmatrix}.$$

In general, we have

$$A^k = \begin{pmatrix} \left(\frac{1}{3}\right)^k & 0 \\ \frac{k}{3^{k+1}} & \left(\frac{1}{3}\right)^k \end{pmatrix},$$

and it gives

$$\lim_{k \to \infty} \left(\frac{1}{3}\right)^k = 0 \quad \text{and} \quad \lim_{k \to \infty} \frac{k}{3^{k+1}} = 0.$$

Hence, the given matrix A is convergent. •

Since the above matrix has the eigenvalue 1/3 of multiplicity two, its spectral radius is 1/3. This shows the important relation existing between the spectral radius of a matrix and the convergence of a matrix.
Theorem 2.6 The following statements are equivalent:

1. A is a convergent matrix.
2. $\lim_{n \to \infty} \|A^n\| = 0$, for all natural norms.
3. $\rho(A) < 1$.
4. $\lim_{n \to \infty} A^n x = 0$, for every x. •

Example 2.15 Show that the matrix

$$A = \begin{pmatrix} 1 & 1 & 0 & 1 \\ 1 & 1 & 1 & 0 \\ 0 & 1 & 1 & 1 \\ 1 & 0 & 1 & 1 \end{pmatrix}$$

is not a convergent matrix.

Solution. First, we find the eigenvalues of the given matrix A by computing its characteristic equation:

$$\det(A - \lambda I) = \lambda^4 - 4\lambda^3 + 2\lambda^2 + 4\lambda - 3 = 0,$$

which factorizes to

$$(\lambda + 1)(\lambda - 3)(\lambda - 1)^2 = 0$$

and gives the eigenvalues 3, 1, 1, and −1 of the given matrix A. Hence, the spectral radius of A is

$$\rho(A) = \max\{|3|, |1|, |1|, |-1|\} = 3,$$

which shows that the given matrix is not convergent. •

We will discuss some very important results concerning the eigenvalue


problems. The proofs of all the results are beyond the scope of this text
and will be omitted. However, they are very easily understood and can be
used.

Theorem 2.7 If A is an n × n matrix, then


1. [ρ(AT A)]1/2 = kAk2 , and

2. ρ(A) ≤ kAk, for any natural norm k.k.

Example 2.16 Consider the matrix

$$A = \begin{pmatrix} -2 & 1 & 2 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix},$$

which gives a characteristic equation of the form

$$\det(A - \lambda I) = -\lambda^3 - 2\lambda^2 + \lambda + 2 = 0.$$

Solving this cubic equation, the eigenvalues of A are −2, −1, and 1. Thus, the spectral radius of A is

$$\rho(A) = \max\{|-2|, |-1|, |1|\} = 2.$$

Also,

$$A^T A = \begin{pmatrix} -2 & 1 & 0 \\ 1 & 0 & 1 \\ 2 & 0 & 0 \end{pmatrix}\begin{pmatrix} -2 & 1 & 2 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix} = \begin{pmatrix} 5 & -2 & -4 \\ -2 & 2 & 2 \\ -4 & 2 & 4 \end{pmatrix},$$

and a characteristic equation of $A^T A$ is

$$-\lambda^3 + 11\lambda^2 - 14\lambda + 4 = 0,$$

which gives the eigenvalues 0.4174, 1, and 9.5826. Therefore, the spectral radius of $A^T A$ is 9.5826. Hence,

$$\|A\|_2 = \sqrt{\rho(A^T A)} = \sqrt{9.5826} \approx 3.0956.$$

From this we conclude that

$$\rho(A) = 2 < 3.0956 \approx \|A\|_2.$$

One can also show that

$$\rho(A) = 2 < 5 = \|A\|_\infty \quad \text{and} \quad \rho(A) = 2 < 3 = \|A\|_1,$$

which satisfies Theorem 2.7. •

The spectral norm of a matrix A may be found using MATLAB com-


mands as follows:

>> A = [-2 1 2; 1 0 0; 0 1 0];
>> B = sqrt(max(eig(A' * A)))
B =
3.0956

Theorem 2.8 If A is a symmetric matrix, then

$$\|A\|_2 = \sqrt{\rho(A^T A)} = \rho(A).$$

Example 2.17 Consider the symmetric matrix

$$A = \begin{pmatrix} 3 & 0 & 1 \\ 0 & -3 & 0 \\ 1 & 0 & 3 \end{pmatrix},$$
which has a characteristic equation of the form

−λ3 + 4λ2 + 9λ − 36 = 0.

Solving this cubic equation, we have the eigenvalues 4, –3, and 3 of


the given matrix A. Therefore, the spectral radius of A is 4. Since A is
symmetric,  
10 0 6
AT A = A2 =  0 9 0  .
6 0 10

Since we know that the eigenvalues of A2 are the eigenvalues of A raised


to the power of 2, the eigenvalues of AT A are 16, 9, and 9, and its spectral
radius is ρ(AT A) = ρ(A2 ) = [ρ(A)]2 = 16. Hence,
$$\|A\|_2 = \sqrt{\rho(A^T A)} = \sqrt{16} = 4 = \rho(A),$$

which satisfies Theorem 2.8. •

Theorem 2.9 If A is a nonsingular matrix, then for any eigenvalue λ of A,

$$\frac{1}{\|A^{-1}\|_2} \le |\lambda| \le \|A\|_2.$$

Note that this result is also true for any natural norm. •

Example 2.18 Consider the matrix

$$A = \begin{pmatrix} 2 & 1 \\ 3 & 2 \end{pmatrix},$$

whose inverse matrix is

$$A^{-1} = \begin{pmatrix} 2 & -1 \\ -3 & 2 \end{pmatrix}.$$

First, we find the eigenvalues of the matrix

$$A^T A = \begin{pmatrix} 13 & 8 \\ 8 & 5 \end{pmatrix},$$
which can be obtained by solving the characteristic equation

$$\det(A^T A - \lambda I) = \begin{vmatrix} 13-\lambda & 8 \\ 8 & 5-\lambda \end{vmatrix} = \lambda^2 - 18\lambda + 1 = 0,$$

which gives the eigenvalues 17.94 and 0.06. The spectral radius of $A^T A$ is 17.94. Hence,

$$\|A\|_2 = \sqrt{\rho(A^T A)} = \sqrt{17.94} \approx 4.24.$$
Since $(A^{-1})^T(A^{-1}) = \begin{pmatrix} 13 & -8 \\ -8 & 5 \end{pmatrix}$, its characteristic equation is
$$\det[(A^{-1})^T(A^{-1}) - \lambda I] = \begin{vmatrix} 13-\lambda & -8 \\ -8 & 5-\lambda \end{vmatrix} = \lambda^2 - 18\lambda + 1 = 0,$$
which gives the eigenvalues 17.944 and 0.056 of $(A^{-1})^T(A^{-1})$; its spectral
radius is 17.944. Hence,
$$\|A^{-1}\|_2 = \sqrt{\rho((A^{-1})^T(A^{-1}))} = \sqrt{17.944} \approx 4.236.$$

Note that the eigenvalues of A are 3.732 and 0.268; therefore, its spectral
radius is 3.732. Hence,
$$\frac{1}{4.236} \approx 0.236 < |3.732| < 4.236,$$
which satisfies Theorem 2.9. •
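
The bounds of Theorem 2.9 can also be verified directly in MATLAB; a short check for this example:

>> A = [2 1; 3 2];
>> 1/norm(inv(A), 2)   % lower bound, about 0.236
>> abs(eig(A))         % eigenvalue moduli: about 0.268 and 3.732
>> norm(A, 2)          % upper bound, about 4.236

Both eigenvalue moduli lie between the two bounds, as the theorem asserts.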

2.6 Successive Over-Relaxation Method


We have seen that the Gauss–Seidel method uses updated information
immediately and converges more quickly than the Jacobi method, but in
some large systems of equations the Gauss–Seidel method converges at a
very slow rate. Many techniques have been developed in order to improve
the convergence of the Gauss–Seidel method. Perhaps one of the simplest
and most widely used methods is Successive Over-Relaxation (SOR). A
useful modification to the Gauss–Seidel method is defined by the iterative
scheme:
" i−1 n
#
(k+1) (k) ω X (k+1)
X (k)
xi = (1 − ω)xi + bi − aij xj − aij xj ,(2.25)
aii j=1 j=i+1
i = 1, 2, . . . , n, k = 1, 2, . . .

which can be written as


" i−1 n
#
(k+1) (k) ω X (k+1)
X (k)
xi = xi + bi − aij xj − aij xj . (2.26)
aii j=1 j=i
i = 1, 2, . . . , n, k = 1, 2, . . .

The matrix form of the SOR method can be represented by
$$x^{(k+1)} = (D + \omega L)^{-1}[(1-\omega)D - \omega U]x^{(k)} + \omega(D + \omega L)^{-1}b, \quad (2.27)$$
which is equivalent to
$$x^{(k+1)} = T_\omega x^{(k)} + c, \quad (2.28)$$
where
$$T_\omega = (D + \omega L)^{-1}[(1-\omega)D - \omega U] \quad \text{and} \quad c = \omega(D + \omega L)^{-1}b \quad (2.29)$$
are called the SOR iteration matrix and vector, respectively.

The quantity ω is called the relaxation factor. It can be formally proved


that convergence can be obtained for values of ω in the range 0 < ω < 2.
For ω = 1, the SOR method (2.25) is simply the Gauss–Seidel method. The
methods involving (2.25) are called relaxation methods. For the choices of
0 < ω < 1, the procedures are called under-relaxation methods and can be
used to obtain convergence of some systems that are not convergent by the
Gauss–Seidel method. For choices 1 < ω < 2, the procedures are called
over-relaxation methods, which can be used to accelerate the convergence
for systems that are convergent by the Gauss–Seidel method. The SOR
methods are particularly useful for solving linear systems that occur in the
numerical solutions of certain partial differential equations.
Example 2.19 Find the l∞ -norm of the SOR iteration matrix Tω , when
ω = 1.005, by using the following matrix:
 
$$A = \begin{pmatrix} 5 & -1 \\ -1 & 10 \end{pmatrix}.$$

Solution. Since the SOR iteration matrix is
$$T_\omega = (D + \omega L)^{-1}[(1-\omega)D - \omega U],$$
where
$$L = \begin{pmatrix} 0 & 0 \\ -1 & 0 \end{pmatrix}, \quad U = \begin{pmatrix} 0 & -1 \\ 0 & 0 \end{pmatrix}, \quad D = \begin{pmatrix} 5 & 0 \\ 0 & 10 \end{pmatrix},$$
then
$$T_\omega = \left[\begin{pmatrix} 5 & 0 \\ 0 & 10 \end{pmatrix} + 1.005\begin{pmatrix} 0 & 0 \\ -1 & 0 \end{pmatrix}\right]^{-1}\left[(1-1.005)\begin{pmatrix} 5 & 0 \\ 0 & 10 \end{pmatrix} - 1.005\begin{pmatrix} 0 & -1 \\ 0 & 0 \end{pmatrix}\right],$$
which is equal to
$$T_\omega = \begin{pmatrix} 5 & 0 \\ -1.005 & 10 \end{pmatrix}^{-1}\begin{pmatrix} -0.025 & 1.005 \\ 0 & -0.05 \end{pmatrix}.$$
Thus,
$$T_\omega = \begin{pmatrix} 0.2 & 0 \\ 0.0201 & 0.1 \end{pmatrix}\begin{pmatrix} -0.025 & 1.005 \\ 0 & -0.05 \end{pmatrix}$$
or
$$T_\omega = \begin{pmatrix} -0.005 & 0.201 \\ -0.0005 & 0.0152 \end{pmatrix}.$$
The $l_\infty$-norm of the matrix $T_\omega$ is
$$\|T_\omega\|_\infty = \max\{0.206, 0.0157\} = 0.206.$$
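
The same computation can be reproduced in MATLAB by splitting A into its diagonal, strictly lower, and strictly upper parts; a minimal sketch:

>> A = [5 -1; -1 10]; w = 1.005;
>> D = diag(diag(A)); L = tril(A, -1); U = triu(A, 1);
>> Tw = (D + w*L) \ ((1 - w)*D - w*U);
>> norm(Tw, inf)       % about 0.206, as computed by hand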

Example 2.20 Solve the following system of linear equations, taking an
initial approximation $x^{(0)} = [0, 0, 0, 0]^T$ and with $\epsilon = 10^{-4}$ in the $l_\infty$-norm:
2x1 + 8x2 = 1
5x1 − x2 + x3 = 2
−x1 + x2 + 4x3 + x4 = 12
x2 + x3 + 5x4 = 12.
(a) Using the Gauss–Seidel method.
(b) Using the SOR method with ω = 0.33.

Solution. (a) The Gauss–Seidel method for the given system is
$$\begin{aligned}
x_1^{(k+1)} &= \frac{1}{2}\left[1 - 8x_2^{(k)}\right] \\
x_2^{(k+1)} &= \frac{1}{-1}\left[2 - 5x_1^{(k+1)} - x_3^{(k)}\right] \\
x_3^{(k+1)} &= \frac{1}{4}\left[12 + x_1^{(k+1)} - x_2^{(k+1)} - x_4^{(k)}\right] \\
x_4^{(k+1)} &= \frac{1}{5}\left[12 - x_2^{(k+1)} - x_3^{(k+1)}\right].
\end{aligned}$$

Table 2.10: Solution of Example 2.20 by the Gauss–Seidel method.


k      $x_1^{(k)}$      $x_2^{(k)}$      $x_3^{(k)}$      $x_4^{(k)}$
0 0.00000 0.00000 0.00000 0.00000
1 5.0000e–001 5.0000e–001 3.0000e+000 1.7000e+000
2 –1.5000e+000 –6.5000e+000 3.8250e+000 2.9350e+000
3 2.6500e+001 1.3432e+002 –2.4690e+001 –1.9527e+001
4 –5.3680e+002 –2.7107e+003 5.5135e+002 4.3427e+002
5 1.0843e+004 5.4766e+004 –1.1086e+004 –8.7335e+003
6 –2.1906e+005 –1.1064e+006 2.2402e+005 1.7648e+005
7 4.4256e+006 2.2352e+007 –4.5257e+006 –3.5653e+006
8 –8.9408e+007 –4.5157e+008 9.1431e+007 7.2027e+007
9 1.8063e+009 9.1227e+009 –1.8471e+009 –1.4551e+009

Starting with an initial approximation $x^{(0)} = [0, 0, 0, 0]^T$, and for k = 0,
we obtain
$$\begin{aligned}
x_1^{(1)} &= \frac{1}{2}\left[1 - 8x_2^{(0)}\right] = 0.5 \\
x_2^{(1)} &= \frac{1}{-1}\left[2 - 5x_1^{(1)} - x_3^{(0)}\right] = 0.5 \\
x_3^{(1)} &= \frac{1}{4}\left[12 + x_1^{(1)} - x_2^{(1)} - x_4^{(0)}\right] = 3.0 \\
x_4^{(1)} &= \frac{1}{5}\left[12 - x_2^{(1)} - x_3^{(1)}\right] = 1.7.
\end{aligned}$$

The first and subsequent iterations are listed in Table 2.10.



(b) Now the SOR method for the given system is
$$\begin{aligned}
x_1^{(k+1)} &= (1-\omega)x_1^{(k)} + \frac{\omega}{2}\left[1 - 8x_2^{(k)}\right] \\
x_2^{(k+1)} &= (1-\omega)x_2^{(k)} + \frac{\omega}{-1}\left[2 - 5x_1^{(k+1)} - x_3^{(k)}\right] \\
x_3^{(k+1)} &= (1-\omega)x_3^{(k)} + \frac{\omega}{4}\left[12 + x_1^{(k+1)} - x_2^{(k+1)} - x_4^{(k)}\right] \\
x_4^{(k+1)} &= (1-\omega)x_4^{(k)} + \frac{\omega}{5}\left[12 - x_2^{(k+1)} - x_3^{(k+1)}\right].
\end{aligned}$$
Starting with an initial approximation $x^{(0)} = [0, 0, 0, 0]^T$, $\omega = 0.33$, and
for k = 0, we obtain
$$\begin{aligned}
x_1^{(1)} &= (1-\omega)x_1^{(0)} + \frac{\omega}{2}\left[1 - 8x_2^{(0)}\right] = 0.16500 \\
x_2^{(1)} &= (1-\omega)x_2^{(0)} + \frac{\omega}{-1}\left[2 - 5x_1^{(1)} - x_3^{(0)}\right] = -0.38775 \\
x_3^{(1)} &= (1-\omega)x_3^{(0)} + \frac{\omega}{4}\left[12 + x_1^{(1)} - x_2^{(1)} - x_4^{(0)}\right] = 1.03560 \\
x_4^{(1)} &= (1-\omega)x_4^{(0)} + \frac{\omega}{5}\left[12 - x_2^{(1)} - x_3^{(1)}\right] = 0.74924.
\end{aligned}$$
The first and subsequent iterations are listed in Table 2.11. Note that the
Gauss–Seidel method diverged for the given system, whereas the SOR method
converged, although very slowly. •
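
This contrasting behavior is explained by the spectral radii of the two iteration matrices, which can be checked with a few MATLAB commands (a sketch; the variable names are ours):

>> A = [2 8 0 0; 5 -1 1 0; -1 1 4 1; 0 1 1 5];
>> D = diag(diag(A)); L = tril(A, -1); U = triu(A, 1);
>> rhoGS = max(abs(eig(-(D + L) \ U)))                    % greater than 1
>> w = 0.33;
>> rhoSOR = max(abs(eig((D + w*L) \ ((1 - w)*D - w*U))))  % less than 1

The Gauss–Seidel iteration matrix has spectral radius greater than 1, while the SOR matrix with ω = 0.33 has spectral radius below 1, which is why the SOR iterates converge, although slowly.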

Example 2.21 Solve the following system of linear equations using the
SOR method, with $\epsilon = 0.5 \times 10^{-6}$ in the $l_\infty$-norm:
2x1 + x2 = 4
x1 + 2x2 + x3 = 8
x2 + 2x3 + x4 = 12
x3 + 2x4 = 11.

Start with an initial approximation x(0) = [0, 0, 0, 0]T and take ω = 1.27.

Table 2.11: Solution of Example 2.20 by the SOR Method.


k      $x_1^{(k)}$      $x_2^{(k)}$      $x_3^{(k)}$      $x_4^{(k)}$
0 0.00000 0.00000 0.00000 0.00000
1 0.16500 –0.38775 1.03560 0.74924
2 0.66376 0.51715 1.63414 1.15201
3 –0.26301 –0.20820 1.98531 1.44656
4 0.02493 –0.10321 2.21139 1.62205
5 0.05030 0.08360 2.33506 1.71914
6 –0.19531 –0.15568 2.40939 1.79508
7 –0.05655 –0.06251 2.45669 1.83669
8 –0.09343 –0.04533 2.48049 1.86186
9 –0.14497 –0.11101 2.49552 1.88207
10 –0.09613 –0.06948 2.50453 1.89227
.. .. .. .. ..
. . . . .
21 –0.11932 –0.08436 2.51291 1.91401
22 –0.11939 –0.08427 2.51284 1.91410

Solution. For the given system, the SOR method with $\omega = 1.27$ is
$$\begin{aligned}
x_1^{(k+1)} &= (1-\omega)x_1^{(k)} + \frac{\omega}{2}\left[4 - x_2^{(k)}\right] \\
x_2^{(k+1)} &= (1-\omega)x_2^{(k)} + \frac{\omega}{2}\left[8 - x_1^{(k+1)} - x_3^{(k)}\right] \\
x_3^{(k+1)} &= (1-\omega)x_3^{(k)} + \frac{\omega}{2}\left[12 - x_2^{(k+1)} - x_4^{(k)}\right] \\
x_4^{(k+1)} &= (1-\omega)x_4^{(k)} + \frac{\omega}{2}\left[11 - x_3^{(k+1)}\right].
\end{aligned}$$

Starting with an initial approximation x(0) = [0, 0, 0, 0]T , and for k = 0,



Table 2.12: Solution of Example 2.21 by the SOR method.


k      $x_1^{(k)}$      $x_2^{(k)}$      $x_3^{(k)}$      $x_4^{(k)}$
0 0.000000 0.000000 0.000000 0.000000
1 2.540000 3.467100 5.418392 3.544321
2 –0.34741 0.923809 3.319772 3.919978
3 2.047182 1.422556 3.331152 3.811324
4 1.083938 1.892328 3.098770 3.988224
5 1.045709 1.937328 3.020607 3.990094
6 1.027456 1.986402 3.009361 3.996730
.. .. .. .. ..
. . . . .
15 0.999999 2.000000 3.000000 4.000000
16 1.000000 2.000000 3.000000 4.000000

we obtain
$$\begin{aligned}
x_1^{(1)} &= (1-1.27)x_1^{(0)} + \frac{1.27}{2}\left[4 - x_2^{(0)}\right] = 2.54 \\
x_2^{(1)} &= (1-1.27)x_2^{(0)} + \frac{1.27}{2}\left[8 - x_1^{(1)} - x_3^{(0)}\right] = 3.4671 \\
x_3^{(1)} &= (1-1.27)x_3^{(0)} + \frac{1.27}{2}\left[12 - x_2^{(1)} - x_4^{(0)}\right] = 5.418392 \\
x_4^{(1)} &= (1-1.27)x_4^{(0)} + \frac{1.27}{2}\left[11 - x_3^{(1)}\right] = 3.544321.
\end{aligned}$$
The first and subsequent iterations are listed in Table 2.12.

To get these results using MATLAB commands, we do the following:

>> Ab = [2 1 0 0 4; 1 2 1 0 8; 0 1 2 1 12; 0 0 1 2 11];


>> x = [0 0 0 0];
>> w = 1.27; acc = 0.5e − 6;
>> SORM (Ab, x, w, acc);

Table 2.13: Solution of Example 2.21 by the Gauss–Seidel method.


k      $x_1^{(k)}$      $x_2^{(k)}$      $x_3^{(k)}$      $x_4^{(k)}$
0 0.000000 0.000000 0.000000 0.000000
1 2.000000 3.000000 4.500000 3.250000
2 0.500000 1.500000 3.625000 3.687500
3 1.250000 1.562500 3.375000 3.812500
4 1.218750 1.703125 3.242188 3.878906
5 1.148438 1.804688 3.158203 3.920898
6 1.097656 1.872070 3.103516 3.948242
.. .. .. .. ..
. . . . .
35 1.000000 1.999999 3.000000 4.000000
36 1.000000 2.000000 3.000000 4.000000

We note that the SOR method converged, requiring only 16 iterations
to obtain what is obviously the correct solution for the given system. If
we solve Example 2.21 using the Gauss–Seidel method, we find that this
method also converges, but more slowly: it needs 36 iterations to obtain
the correct solution, as shown in Table 2.13, which is 20 iterations more
than the SOR method requires. Also, if we solve the same example using
the Jacobi method, we find that it needs 73 iterations to get the correct
solution. Comparing the SOR method with the Gauss–Seidel method, a
large reduction in the number of iterations can be achieved, given an
efficient choice of ω.

In practice, ω should be chosen in the range 1 < ω < 2, but the precise
choice of ω is a major problem. Finding the optimum value for ω depends
on the particular problem (size of the system of equations and the nature
of the equations) and often requires careful work. A detailed study for
the optimization of ω can be found in Isaacson and Keller (1966). The
following theorems can be used in certain situations for the convergence of
the SOR method.

Theorem 2.10 If all the diagonal elements of a matrix A are nonzero,
i.e., $a_{ii} \neq 0$, for each $i = 1, 2, \ldots, n$, then
$$\rho(T_\omega) \geq |\omega - 1|.$$
This implies that the SOR method can converge only if $0 < \omega < 2$. •
Theorem 2.11 If A is a positive-definite matrix and $0 < \omega < 2$, then
the SOR method converges for any choice of initial approximation vector
$x^{(0)} \in \mathbb{R}^n$. •
Theorem 2.12 If A is a positive-definite and tridiagonal matrix, then
$$\rho(T_G) = [\rho(T_J)]^2 < 1,$$
and the optimal choice of the relaxation factor $\omega$ for the SOR method is
$$\omega = \frac{2}{1 + \sqrt{1 - [\rho(T_J)]^2}}, \quad (2.30)$$
where $T_G$ and $T_J$ are the Gauss–Seidel and the Jacobi iteration
matrices, respectively. With this choice of relaxation factor $\omega$, the
spectral radius of the SOR iteration matrix $T_\omega$ is
$$\rho(T_\omega) = \omega - 1.$$
Example 2.22 Find the optimal choice of the relaxation factor $\omega$ for
the SOR method applied to the linear system $Ax = b$, where the coefficient
matrix A is given as follows:
$$A = \begin{pmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2 \end{pmatrix}.$$
Solution. Since the given matrix A is positive-definite and tridiagonal, we
can use Theorem 2.12 to find the optimal choice of $\omega$. Using matrix A,
we can find the Jacobi iteration matrix $T_J$ as follows:
$$T_J = \begin{pmatrix} 0 & \frac{1}{2} & 0 \\ \frac{1}{2} & 0 & \frac{1}{2} \\ 0 & \frac{1}{2} & 0 \end{pmatrix}.$$

Now to find the spectral radius of the Jacobi iteration matrix $T_J$, we use
the characteristic equation
$$\det(T_J - \lambda I) = -\lambda^3 + \frac{\lambda}{2} = 0,$$
which gives the eigenvalues of matrix $T_J$ as $\lambda = 0, \pm\frac{1}{\sqrt{2}}$. Thus,
$$\rho(T_J) = \frac{1}{\sqrt{2}} = 0.707107,$$
and the optimal value of $\omega$ is
$$\omega = \frac{2}{1 + \sqrt{1 - (0.707107)^2}} = 1.171573.$$

Also, note that the Gauss–Seidel iteration matrix $T_G$ has the form
$$T_G = \begin{pmatrix} 0 & \frac{1}{2} & 0 \\ 0 & \frac{1}{4} & \frac{1}{2} \\ 0 & \frac{1}{8} & \frac{1}{4} \end{pmatrix},$$
and its characteristic equation is
$$\det(T_G - \lambda I) = -\lambda^3 + \frac{\lambda^2}{2} = 0.$$
Thus,
$$\rho(T_G) = \frac{1}{2} = 0.50000 = [\rho(T_J)]^2,$$
which agrees with Theorem 2.12. •

Note that the optimal value of ω can also be found by using (2.30) if the
eigenvalues of the Jacobi iteration matrix TJ are real and 0 < ρ(TJ ) < 1. •
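
The computation of Example 2.22 can be reproduced in a few MATLAB lines; a minimal sketch using formula (2.30):

>> A = [2 -1 0; -1 2 -1; 0 -1 2];
>> D = diag(diag(A));
>> TJ = -D \ (A - D);              % Jacobi iteration matrix
>> rhoJ = max(abs(eig(TJ)));       % about 0.7071
>> w = 2/(1 + sqrt(1 - rhoJ^2))    % about 1.1716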

Example 2.23 Find the optimal choice of the relaxation factor $\omega$ by using
the matrix
$$A = \begin{pmatrix} 5 & -1 & -1 & -1 \\ -2 & 5 & -1 & 0 \\ -1 & -1 & 5 & -1 \\ -1 & -1 & -1 & 5 \end{pmatrix}.$$
Solution. Using the given matrix A, we can find the Jacobi iteration
matrix $T_J$ as
$$T_J = \begin{pmatrix} 0 & \frac{1}{5} & \frac{1}{5} & \frac{1}{5} \\ \frac{2}{5} & 0 & \frac{1}{5} & 0 \\ \frac{1}{5} & \frac{1}{5} & 0 & \frac{1}{5} \\ \frac{1}{5} & \frac{1}{5} & \frac{1}{5} & 0 \end{pmatrix}.$$
Now to find the spectral radius of the Jacobi iteration matrix $T_J$, we use
the characteristic equation
$$\det(T_J - \lambda I) = 0,$$
and get the following polynomial equation:
$$\lambda^4 - \frac{6}{25}\lambda^2 - \frac{8}{125}\lambda - \frac{3}{625} = \frac{1}{625}(5\lambda - 3)(5\lambda + 1)^3 = 0.$$
Solving the above polynomial equation, we obtain
$$\lambda = \frac{3}{5}, -\frac{1}{5}, -\frac{1}{5}, -\frac{1}{5},$$
which are the eigenvalues of the matrix $T_J$. From this we get
$$\rho(T_J) = \frac{3}{5} = 0.6,$$

the spectral radius of the matrix TJ .


Since the value of $\rho(T_J)$ is less than 1, we can use formula (2.30) and get
$$\omega = \frac{2}{1 + \sqrt{1 - (0.6)^2}} = 1.1111,$$
the optimal value of $\omega$. •
Since the rate of convergence of an iterative method depends on the
spectral radius of the iteration matrix associated with the method, one way
to accelerate convergence is to choose a method whose associated matrix
T has a minimal spectral radius.
Example 2.24 Compare the convergence of the Jacobi, Gauss–Seidel, and
SOR iterative methods for the system of linear equations $Ax = b$, where
the coefficient matrix A is given as
$$A = \begin{pmatrix} 4 & -1 & 0 & 0 \\ -1 & 4 & -1 & 0 \\ 0 & -1 & 4 & -1 \\ 0 & 0 & -1 & 4 \end{pmatrix}.$$
Solution. First, we compute the Jacobi iteration matrix by using
$$T_J = -D^{-1}(L + U).$$
Since
$$D = \begin{pmatrix} 4 & 0 & 0 & 0 \\ 0 & 4 & 0 & 0 \\ 0 & 0 & 4 & 0 \\ 0 & 0 & 0 & 4 \end{pmatrix} \quad \text{and} \quad L + U = \begin{pmatrix} 0 & -1 & 0 & 0 \\ -1 & 0 & -1 & 0 \\ 0 & -1 & 0 & -1 \\ 0 & 0 & -1 & 0 \end{pmatrix},$$
$$T_J = -\begin{pmatrix} \frac{1}{4} & 0 & 0 & 0 \\ 0 & \frac{1}{4} & 0 & 0 \\ 0 & 0 & \frac{1}{4} & 0 \\ 0 & 0 & 0 & \frac{1}{4} \end{pmatrix}\begin{pmatrix} 0 & -1 & 0 & 0 \\ -1 & 0 & -1 & 0 \\ 0 & -1 & 0 & -1 \\ 0 & 0 & -1 & 0 \end{pmatrix} = \begin{pmatrix} 0 & \frac{1}{4} & 0 & 0 \\ \frac{1}{4} & 0 & \frac{1}{4} & 0 \\ 0 & \frac{1}{4} & 0 & \frac{1}{4} \\ 0 & 0 & \frac{1}{4} & 0 \end{pmatrix}.$$

To find the eigenvalues of the Jacobi iteration matrix $T_J$, we evaluate the
determinant
$$\det(T_J - \lambda I) = \begin{vmatrix} -\lambda & \frac{1}{4} & 0 & 0 \\ \frac{1}{4} & -\lambda & \frac{1}{4} & 0 \\ 0 & \frac{1}{4} & -\lambda & \frac{1}{4} \\ 0 & 0 & \frac{1}{4} & -\lambda \end{vmatrix} = 0,$$
which gives the characteristic equation of the form
$$\lambda^4 - 0.1875\lambda^2 + \frac{1}{256} = 0.$$

Solving this fourth-degree polynomial equation, we get the eigenvalues

λ = −0.4045, λ = −0.1545, λ = 0.1545, λ = 0.4045

of the matrix TJ . The spectral radius of the matrix TJ is

ρ(TJ ) = 0.4045 < 1,

which shows that the Jacobi method will converge for the given linear sys-
tem.

Since the given matrix is positive-definite and tridiagonal, by using The-


orem 2.12 we can compute the spectral radius of the Gauss–Seidel iteration
matrix with the help of the spectral radius of the Jacobi iteration matrix,
i.e.,
ρ(TG ) = [ρ(TJ )]2 = (0.4045)2 = 0.1636 < 1,
which shows that the Gauss–Seidel method will also converge, and faster
than the Jacobi method. Also, from Theorem 2.12, we have

ρ(Tω ) = ω − 1.

Now to find the spectral radius of the SOR iteration matrix $T_\omega$, we first
calculate the optimal value of $\omega$ by using
$$\omega = \frac{2}{1 + \sqrt{1 - [\rho(T_J)]^2}}.$$

So using $\rho(T_J) = 0.4045$, we get
$$\omega = \frac{2}{1 + \sqrt{1 - (0.4045)^2}} = 1.045.$$

Using this optimal value of ω, we can compute the spectral radius of the
SOR iteration matrix Tω as follows:

ρ(Tω ) = ω − 1 = 1.045 − 1 = 0.045 < 1.

Thus the SOR method will also converge for the given system, and faster
than the other two methods, because

ρ(Tω ) < ρ(TG ) < ρ(TJ ).

Program 2.3
MATLAB m-file for the SOR Iterative Method
function sol = SORM(Ab, x, w, acc)      % Ab = [A b] is the augmented matrix
[n, t] = size(Ab); b = Ab(1:n, t); R = 1; k = 1;
d(1, 1:n+1) = [0 x];                    % row k of d stores [k-1, x(k-1)]
k = k + 1;
while R > acc
    for i = 1:n
        s = 0;                          % accumulate the two sums in (2.25)
        for j = 1:n
            if j <= i-1                 % updated components x_j^(k+1)
                s = s + Ab(i,j) * d(k, j+1);
            elseif j >= i+1             % old components x_j^(k)
                s = s + Ab(i,j) * d(k-1, j+1);
            end
        end
        x(1,i) = (1-w) * d(k-1,i+1) + (w/Ab(i,i)) * (b(i,1) - s);
        d(k,1) = k - 1; d(k,i+1) = x(1,i);
    end
    R = max(abs(d(k, 2:n+1) - d(k-1, 2:n+1)));   % l-infinity change
    if R > 100 && k > 10, break, end    % stop early if iteration diverges
    k = k + 1;
end
sol = d;                                % return the table of iterates

Procedure 2.3 (SOR Method)

1. Choose $\omega$ in the interval (0, 2) (necessary for convergence).

2. Initialize the first approximation $x^{(0)}$ and the preassigned accuracy $\epsilon$.

3. Compute the constant vector $c = \omega(D + \omega L)^{-1}b$.

4. Compute the SOR iteration matrix $T_\omega = (D + \omega L)^{-1}[(1-\omega)D - \omega U]$.

5. Solve for the approximate solutions $x^{(k+1)} = T_\omega x^{(k)} + c$, $k = 0, 1, \ldots$.

6. Repeat step 5 until $\|x^{(k+1)} - x^{(k)}\| < \epsilon$.
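
Procedure 2.3 can be sketched directly in matrix form; the following function is our own illustrative version (the name sor_matrix and the default initial guess $x^{(0)} = 0$ are assumptions, not part of Program 2.3):

function x = sor_matrix(A, b, w, acc, maxit)
% Sketch of the SOR method in matrix form (Procedure 2.3).
D = diag(diag(A)); L = tril(A, -1); U = triu(A, 1);
Tw = (D + w*L) \ ((1 - w)*D - w*U);   % SOR iteration matrix, see (2.29)
c  = w * ((D + w*L) \ b);             % constant vector, see (2.29)
x  = zeros(size(b));                  % initial approximation x(0) = 0
for k = 1:maxit
    xnew = Tw*x + c;                  % step 5: x(k+1) = Tw*x(k) + c
    if norm(xnew - x, inf) < acc      % step 6: stopping test
        x = xnew; return
    end
    x = xnew;
end

For instance, sor_matrix(A, [4; 8; 12; 11], 1.27, 0.5e-6, 100) applied to the coefficient matrix of Example 2.21 should reproduce the solution $[1, 2, 3, 4]^T$.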

2.7 Conjugate Gradient Method


So far, we have discussed two broad classes of methods for solving linear
systems. The first, known as direct methods (Chapter 1), are based on
some version of Gaussian elimination or LU decomposition. Direct meth-
ods eventually obtain the exact solution but must be carried through to
completion before any useful information is obtained. The second class
contains the iterative methods discussed in the present chapter that lead to
closer and closer approximations to the solution, but almost never reach
the exact value.

Now we discuss a method, called the conjugate gradient method, which


dates back to 1952. It was originally proposed as a direct
method for solving an n × n positive-definite linear system. As a
direct method it is generally inferior to Gaussian elimination with pivoting
since both methods require n major steps to determine a solution, and the
steps of the conjugate gradient method are more computationally expensive
than those in Gaussian elimination. However, the conjugate gradient
method is very useful when employed as an iterative approximation method
for solving large sparse systems.

Actually, this method is rarely used as a primary method for solving
linear systems; rather, its more common applications arise in solving
differential equations and in cases where other iterative methods converge
very slowly. We assume the coefficient matrix A of the linear system
Ax = b is positive-definite, and we work with orthogonality with respect
to the inner product
$$\langle x, y \rangle = x^T y,$$
where x and y are n-dimensional vectors. Since A is symmetric, we also
have, for each x and y,
$$\langle x, Ay \rangle = \langle Ax, y \rangle.$$
The conjugate gradient method is a variational approach: the vector $x^*$
is a solution to the linear system Ax = b if and only if $x^*$ minimizes
$$E(x) = \langle x, Ax \rangle - 2\langle x, b \rangle. \quad (2.31)$$
In addition, for any x and $v \neq 0$, the function $E(x + tv)$ has its minimum
when
$$t = \frac{\langle v, b - Ax \rangle}{\langle v, Av \rangle}.$$
The process is started by specifying an initial estimate $x^{(0)}$ at iteration
zero and by computing the initial residual vector from
$$r^{(0)} = b - Ax^{(0)}.$$
We then obtain improved estimates $x^{(k)}$ from the iterative process
$$x^{(k)} = x^{(k-1)} + t_k v^{(k)}, \quad (2.32)$$
where $v^{(k)}$ is a search direction expressed as a vector and the value of
$$t_k = \frac{\langle v^{(k)}, b - Ax^{(k-1)} \rangle}{\langle v^{(k)}, Av^{(k)} \rangle}$$
is chosen to minimize the value of $E(x^{(k)})$.

In a related method, called the method of steepest descent, $v^{(k)}$ is chosen
as the residual vector
$$v^{(k)} = r^{(k-1)} = b - Ax^{(k-1)}.$$
This method has merit for nonlinear systems and optimization problems,
but it is not used for linear systems because of slow convergence. An
alternative approach uses a set of nonzero direction vectors $\{v^{(1)}, \ldots, v^{(n)}\}$
that satisfy
$$\langle v^{(i)}, Av^{(j)} \rangle = 0, \quad \text{if } i \neq j.$$
This is called an A-orthogonality condition, and the set of vectors
$\{v^{(1)}, \ldots, v^{(n)}\}$ is said to be A-orthogonal.
In the conjugate gradient method, we use $v^{(1)}$ equal to $r^{(0)}$ only at the
beginning of the process. For all later iterations, we choose
$$v^{(k+1)} = r^{(k)} + \frac{\|r^{(k)}\|^2}{\|r^{(k-1)}\|^2} v^{(k)}$$
to be conjugate to all previous direction vectors.

Note that the initial approximation x(0) can be chosen by the user,
with x(0) = 0 as the default. The number of iterations, m ≤ n, can be
chosen by the user in advance; alternatively, one can impose a stopping
criterion based on the size of the residual vector, kr(k) k, or, alternatively,
the distance between successive iterates, kx(k+1) − x(k) k. If the process is
carried on to the bitter end, i.e., m = n, then, in the absence of round-off
errors, the results will be the exact solution to the linear system. More
iterations than n may be required in practical applications because of the
introduction of round-off errors.
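
The iteration just described translates almost line by line into MATLAB; the following minimal sketch (the function name cg_sketch is ours) computes m conjugate gradient steps:

function x = cg_sketch(A, b, x, m)
% Sketch of the conjugate gradient method for positive-definite A.
r = b - A*x;                              % initial residual r(0)
v = r;                                    % first direction v(1) = r(0)
for k = 1:m
    t = (v'*r) / (v'*(A*v));              % step length t_k
    x = x + t*v;                          % updated approximation (2.32)
    rnew = b - A*x;                       % new residual
    v = rnew + ((rnew'*rnew)/(r'*r))*v;   % next conjugate direction
    r = rnew;
end

With exact arithmetic, m = n steps return the exact solution, which the next example illustrates by hand.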
Example 2.25 The linear system
2x1 − x2 = 1
−x1 + 2x2 − x3 = 0
− x2 + x3 = 1

has the exact solution x = [2, 3, 4]T . Solve the system by the conjugate
gradient method.

Solution. Start with an initial approximation x(0) = [0, 0, 0]T and find the
residual vector as

r(0) = b − Ax(0) = b = [1, 0, 1]T .



The first conjugate direction is $v^{(1)} = r^{(0)} = [1, 0, 1]^T$. Since $\|r^{(0)}\|^2 = 2$
and $\langle v^{(1)}, Av^{(1)} \rangle = [v^{(1)}]^T Av^{(1)} = 3$, we use (2.32) to obtain the updated
approximation to the solution
$$x^{(1)} = x^{(0)} + \frac{\|r^{(0)}\|^2}{\langle v^{(1)}, Av^{(1)} \rangle} v^{(1)} = \frac{2}{3}\begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix} = \begin{pmatrix} \frac{2}{3} \\ 0 \\ \frac{2}{3} \end{pmatrix}.$$
Now we compute the next residual vector as
$$r^{(1)} = b - Ax^{(1)} = \left[-\frac{1}{3}, \frac{4}{3}, \frac{1}{3}\right]^T,$$
and the conjugate direction as
$$v^{(2)} = r^{(1)} + \frac{\|r^{(1)}\|^2}{\|r^{(0)}\|^2} v^{(1)} = \begin{pmatrix} -\frac{1}{3} \\ \frac{4}{3} \\ \frac{1}{3} \end{pmatrix} + \frac{2}{2}\begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix} = \begin{pmatrix} \frac{2}{3} \\ \frac{4}{3} \\ \frac{4}{3} \end{pmatrix},$$
which satisfies the conjugacy condition $\langle v^{(1)}, Av^{(2)} \rangle = [v^{(1)}]^T Av^{(2)} = 0$.

Now we get the new approximation as


$$x^{(2)} = x^{(1)} + \frac{\|r^{(1)}\|^2}{\langle v^{(2)}, Av^{(2)} \rangle} v^{(2)} = \begin{pmatrix} \frac{2}{3} \\ 0 \\ \frac{2}{3} \end{pmatrix} + \frac{9}{4}\begin{pmatrix} \frac{2}{3} \\ \frac{4}{3} \\ \frac{4}{3} \end{pmatrix} = \begin{pmatrix} \frac{13}{6} \\ 3 \\ \frac{11}{3} \end{pmatrix}.$$
Since we are dealing with a 3 × 3 system, we will recover the exact solution
with one more iteration of the method. The new residual vector is
$$r^{(2)} = b - Ax^{(2)} = \left[-\frac{1}{3}, -\frac{1}{6}, \frac{1}{3}\right]^T,$$

and the final conjugate direction is
$$v^{(3)} = r^{(2)} + \frac{\|r^{(2)}\|^2}{\|r^{(1)}\|^2} v^{(2)} = \begin{pmatrix} -\frac{1}{3} \\ -\frac{1}{6} \\ \frac{1}{3} \end{pmatrix} + \frac{1}{8}\begin{pmatrix} \frac{2}{3} \\ \frac{4}{3} \\ \frac{4}{3} \end{pmatrix} = \begin{pmatrix} -\frac{1}{4} \\ 0 \\ \frac{1}{2} \end{pmatrix},$$

which, as one can check, is conjugate to both v(1) and v(2) . Thus, the
solution is obtained from
   
$$x^{(3)} = x^{(2)} + \frac{\|r^{(2)}\|^2}{\langle v^{(3)}, Av^{(3)} \rangle} v^{(3)} = \begin{pmatrix} \frac{13}{6} \\ 3 \\ \frac{11}{3} \end{pmatrix} + \frac{2}{3}\begin{pmatrix} -\frac{1}{4} \\ 0 \\ \frac{1}{2} \end{pmatrix} = \begin{pmatrix} 2 \\ 3 \\ 4 \end{pmatrix}.$$
Since we appl