
Prof. I. K. Rana
Mathematics, IIT Bombay
INDEX
S. No Topic Page No.
Week 1
1 Introduction I 1

2 Introduction II 26

3 Introduction III 39

4 Systems of Linear Equations I 46

5 Systems of Linear Equations II 58

6 Systems of Linear Equations III 65


Week 2
7 Reduced Row Echelon Form and Rank I 80

8 Reduced Row Echelon Form and Rank II 96

9 Reduced Row Echelon Form and Rank III 107

10 Solvability of a Linear System, Linear Span, Basis I 112

11 Solvability of a Linear System, Linear Span, Basis II 122

12 Solvability of a Linear System, Linear Span, Basis III 129


Week 3
13 Linear Span, Linear Independence and Basis I 136

14 Linear Span, Linear Independence and Basis II 144

15 Linear Span, Linear Independence and Basis III 148

16 Row Space, Column Space, Rank-Nullity Theorem I 161

17 Row Space, Column Space, Rank-Nullity Theorem II 167

18 Row Space, Column Space, Rank-Nullity Theorem III 173


Week 4
19 Determinants and their Properties I 176

20 Determinants and their Properties II 184

21 Determinants and their Properties III 190


Week 5
22 Linear Transformations I 199

23 Linear Transformations II 209

24 Linear Transformations III 220

25 Orthonormal Basis, Geometry in R^2 I 227

26 Orthonormal Basis, Geometry in R^2 II 234

27 Orthonormal Basis, Geometry in R^2 III 238


Week 6
28 Isometries, Eigenvalues and Eigenvectors I 248

29 Isometries, Eigenvalues and Eigenvectors II 256

30 Isometries, Eigenvalues and Eigenvectors III 262

31 Diagonalization and Real Symmetric Matrices I 270

32 Diagonalization and Real Symmetric Matrices II 279

33 Diagonalization and Real Symmetric Matrices III 284


Week 7
34 Diagonalization and its Applications I 288

35 Diagonalization and its Applications II 295

36 Diagonalization and its Applications III 303


Week 8
37 Abstract Vector Spaces I 310

38 Abstract Vector Spaces II 316

39 Abstract Vector Spaces III 321

40 Inner Product Spaces I 329

41 Inner Product Spaces II 334


Basic Linear Algebra
Prof. Inder K. Rana
Department of Mathematics
Indian Institute of Technology – Bombay

Lecture - 01
Introduction - I

Okay, so our basic course is this course on linear algebra; I call it basic linear algebra.
(Refer Slide Time: 00:32)

For this course, the main text, as referred to in the bulletin, is Kreyszig, Advanced Engineering
Mathematics, 8th edition, Chapters 6 and 7. Those are the two basic chapters that we will roughly
cover. As an additional reference, you can also refer to the book From Geometry to Algebra: A
Course in Linear Algebra, available in our library as well as outside.
(Refer Slide Time: 01:35)

Let us look at why one studies linear algebra.
(Video Start Time: 01:43 - Video End Time: 06:05)


Basically, these are all applications of linear algebra, and if you are going to work in any one
of these fields later on — there are many more actually, I have listed only some of them — you need linear algebra.
That is why linear algebra is put as a basic course for all engineering students as well as mathematics
students, economics students and so on.
(Refer Slide Time: 06:31)

So let us look at some of the other topics. Higher studies in mathematics will certainly lead
to linear algebra; physics needs linear algebra; computer graphics, yes; economics — all modern
economics is mathematical economics; statistics, electrical engineering and mechanical
engineering and many more. So there are many more fields where linear algebra is
useful.

So let us begin. What we are going to do in this course is basically cover some of the basic
concepts that will be required later on by you.
(Refer Slide Time: 07:14)

So basically linear algebra is the study of linearity, which arises in two situations. One is the study of
linear equations in one or more variables, and secondly it is also the study of geometry in an
algebraic setup. That is where linearity comes into the picture. Geometry can be visualized
only in one dimension, two dimensions or three dimensions, but you cannot visualize
geometry in the fourth dimension.

So how do you do geometry in the fourth dimension and higher dimensions? That is done by linear
algebra, okay, by making it abstract. Also, there are two components of linear algebra.
(Refer Slide Time: 08:02)

One is the computational aspect, that is, numerical computations, and the other is
geometric visualization or intuition — how geometry can be translated into algebra. We will be using
both these aspects in our course to study linear algebra, okay, so that you get a feel for the
subject.
(Refer Slide Time: 08:19)

So let me just start by recalling some of the basic facts you might have studied in
school or last year in some of your courses. What is a linear equation in two variables? It is an
equation of the type 𝐴𝑥 + 𝐵𝑦 = 𝐶. Here 𝑥, 𝑦 are variables and 𝐴, 𝐵, 𝐶 are scalars, which are
fixed. The variables are the quantities which can take any values. So 𝐴𝑥 + 𝐵𝑦 = 𝐶 is called a
linear equation in two variables.

And 𝐴 and 𝐵 should not both be 0, because if both are 0 then there is nothing to be done, right, okay.
As a set, you can write this as the set of ordered pairs (𝑥, 𝑦) such that 𝐴𝑥 + 𝐵𝑦 = 𝐶; so, written as
ordered pairs, you can describe it as a set.
(Refer Slide Time: 09:25)

Now, once you write it as a set, geometrically you can think of plotting it, right. You can plot
this. This is what is called the graph of the line; it is a straight line and it consists of all the pairs
(𝑥, 𝑦) which satisfy this equation. So you can call the graph the solution set of the linear
equation, right — those are the values which satisfy it. It is interesting to see what happens
when 𝐴, 𝐵, 𝐶 change.

(Refer Slide Time: 10:09)

For example, this is a line, and you can see the value of 𝐴 = 1, the value of 𝐵 = 1 and the value of
𝐶 = 1.5, and you can change these. If I change the value of 𝐴, right, you see what is happening
to the line: the inclination is changing while something is remaining fixed, and that is the point
where it is cutting the 𝑦 axis. If I change 𝐵 — again, I am changing 𝐵 to various values — again,
right, the inclination changes.

And if I change 𝐶, right, you can see what happens. It is interesting to see — and many times this
is not observed — that if 𝐴 is positive, this is the inclination, and when 𝐴 becomes negative, that is
the inclination, right, and the line is horizontal when 𝐴 = 0. So that is the geometry of the
solution set of a linear equation in two variables.
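
To see the same thing numerically (a minimal sketch in Python, not part of the lecture): when 𝐵 is nonzero the equation 𝐴𝑥 + 𝐵𝑦 = 𝐶 can be rewritten as 𝑦 = (−𝐴/𝐵)𝑥 + 𝐶/𝐵, so the inclination is governed by −𝐴/𝐵 and the point where the line cuts the 𝑦 axis is 𝐶/𝐵.

```python
# Minimal sketch (not from the lecture): for B != 0, Ax + By = C is the line
# y = (-A/B) x + C/B, so the slope is -A/B and the y-intercept is C/B.
def slope_and_intercept(A, B, C):
    if B == 0:
        raise ValueError("B = 0 gives a vertical line x = C/A")
    return -A / B, C / B

# Changing A changes the slope; the y-intercept C/B stays fixed at 1.5.
for A in [1, 2, -1, 0]:
    print(A, slope_and_intercept(A, 1, 1.5))
```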
(Refer Slide Time: 11:22)

Let us go a bit further and look at two equations in two variables. So we have the two
equations 𝐴1 𝑥 + 𝐵1 𝑦 = 𝐶1 and 𝐴2 𝑥 + 𝐵2 𝑦 = 𝐶2 and we want to solve them. For
the first equation the solution set is a line geometrically; for the second one, the solution set is
another straight line. When we say we want to solve these simultaneously, that means we
want to find those points (𝑥, 𝑦) in the plane which satisfy both the
equations.

And that essentially means the solutions are the points where the lines intersect. If both lines are parallel,
okay, but not coincident, they will never intersect, so geometrically no solution is possible.
If the lines do intersect and are not coincident, then by one of the axioms of geometry two such
lines intersect at exactly one point. That means there will be a
unique solution for the system of two linear equations, right.

So if they intersect and are not coincident, there will be exactly one solution; and if the lines are coincident, that
means the two lines coincide, so there are an infinite number of points which are solutions.
So geometrically we get this information: for a system of two
equations in two variables, 3 possibilities arise. The lines are parallel but not coincident,
so they never intersect and there is no solution; or there is a unique solution; or an infinite number of
solutions.
(Refer Slide Time: 13:08)

So these are the geometrical possibilities for linear equations.


(Refer Slide Time: 13:15)

One can try to solve these things geometrically. How do you find the point of intersection
geometrically? That is a question one would like to answer. The idea is: if P is the point
where these two lines intersect, and if you can rotate one of the lines about P so that it becomes horizontal
or vertical, that will give me the 𝑦 coordinate or the 𝑥 coordinate of that point, right. So one
of the lines is kept fixed, there is a point of intersection, and I rotate the other line, okay.

When it becomes horizontal, right, what will the line be? 𝑦 =
𝑐 for some 𝑐, so that 𝑐 is the 𝑦 coordinate of the point of intersection.

(Refer Slide Time: 14:25)

So these are the two lines, the blue one and the red one, which intersect at this point, okay. Now what we
have done is taken a combination of these two lines: the first equation plus some multiple of the second,
where that multiplier is the value which I am going to change, okay, and let us see what happens. So what we are discovering
is that if you take a combination of the two lines, a linear combination of the two equations, then
what happens?

See, the point of intersection remains the same when you take a linear combination, but what is
happening is that the slope of the line is changing, right. When the combined line becomes vertical, that will give
me the value of the 𝑥 coordinate, and when it becomes horizontal, that will give me the value of
the 𝑦 coordinate. So geometrically that is how you
can solve a system of two linear equations in two variables.

So the basic idea is: if I take a linear combination of the two equations, right, the idea should be that
one of the variables should vanish, so that the equation becomes either 𝑥 = 𝑥0 or 𝑦 = 𝑦0; that will give me
one of the coordinates. I can then put this value back into either of the equations and
get back the other coordinate. So geometrically that is how you solve; it says that we should be doing
elimination of variables for these equations.
(Refer Slide Time: 16:11)

So that is what is explained here: if I take a linear combination, the slope will
be different, right. The linear combination will have a different slope because 𝐴 and 𝐵 would
have changed for the new equation, but the point of intersection remains the same. Why does
the point of intersection remain the same? Because (𝑥0, 𝑦0) satisfies the first equation, right.

Then 𝐴1 𝑥0 + 𝐵1 𝑦0 = 𝐶1, and similarly for the other equation, so when you add them the combined
equation is still satisfied, right. So the point of intersection will remain the same; only the slope changes, so
the line rotates.
(Refer Slide Time: 17:01)

So that is the geometric way of solving, but we can also solve it algebraically, okay. Algebraically
the idea is to try to eliminate one of the variables: eliminate
one of the variables, you get the value of one coordinate, and then find out the other. You do this
in school: given the two equations, if you want the value of 𝑥, you want the
coefficient of 𝑦 to become 0, so you combine the equations to make it 0 and solve, okay.

So you can do that algebraically also. The basic idea that we are getting from here is that to
find a solution of two linear equations in two variables, you try to eliminate one variable
by taking a linear combination of the two equations, okay.
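
As a small computational sketch of this elimination idea (the helper function and the sample coefficients are illustrative, not taken from the slides):

```python
# A minimal sketch of elimination for two equations in two variables.
def solve_2x2(A1, B1, C1, A2, B2, C2):
    # Eliminate x: A1*(eq2) - A2*(eq1) has no x term, leaving
    # (A1*B2 - A2*B1) * y = A1*C2 - A2*C1.
    det = A1 * B2 - A2 * B1
    if det == 0:
        raise ValueError("lines are parallel or coincident")
    y = (A1 * C2 - A2 * C1) / det
    # Back-substitute y into whichever equation has a nonzero x coefficient.
    x = (C1 - B1 * y) / A1 if A1 != 0 else (C2 - B2 * y) / A2
    return x, y

print(solve_2x2(1, 1, 3, 1, -1, 1))   # x + y = 3, x - y = 1  ->  (2.0, 1.0)
```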
(Refer Slide Time: 17:51)

So this is called the variable elimination method for two variables. Let us try to go a step
higher and see what happens when we have an equation in 3 variables: 𝐴𝑥 + 𝐵𝑦 + 𝐶𝑧 = 𝐷,
where 𝐴, 𝐵, 𝐶, 𝐷 are constants and 𝑥, 𝑦, 𝑧 are variables. So what is this equation, what does it
geometrically represent? A solution is a triplet (𝑥, 𝑦, 𝑧), right, and that will be a
point in ℝ3, right. So let us see what this represents.
(Refer Slide Time: 18:35)

So let us just try to see.


(Video Start Time: 18:40)
(Refer Slide Time: 18:44)

What does the equation in 3 variables represent?
(Refer Slide Time: 18:48)

The standard form of an equation in two variables 𝑥 and 𝑦 is written as 𝐴𝑥 + 𝐵𝑦 = 𝐶.


(Refer Slide Time: 19:03)

The standard form of an equation in 3 variables 𝑥, 𝑦, 𝑧 is written as 𝐴𝑥 + 𝐵𝑦 + 𝐶𝑧 = 𝐷, where
at least one of the coefficients 𝐴, 𝐵, 𝐶 must be nonzero. Since the equation contains 3 variables,
it must be plotted using a 3-dimensional coordinate system.
(Refer Slide Time: 19:24)

Start by setting 𝐴 and 𝐵 to 0 and 𝐶 to 1; this eliminates all the variables but 𝑧. If we set the
value of 𝐷 as 3, we get the linear equation 𝑧 = 3.
(Refer Slide Time: 19:42)

Since 𝑥 and 𝑦 are free to take any values, the graph of this equation consists of every point in
3-dimensional space whose 𝑧 coordinate is 3 and whose 𝑥 and 𝑦 coordinates are any real numbers.
(Refer Slide Time: 20:06)

The graph of this equation is therefore a horizontal plane. If we set 𝐴 and 𝐶 to 0 and 𝐵 to 1, all
variables but 𝑦 are eliminated from the equation.
(Refer Slide Time: 20:19)

And setting 𝐷 to 3, we get the equation 𝑦 = 3. Now since 𝑥 and 𝑧 are free to take on any
value, the graph of this equation consists of every point whose 𝑦 coordinate is 3.
(Refer Slide Time: 20:52)

The graph of this equation is therefore a vertical plane. If we set 𝐴 = 1, 𝐵 = 0, 𝐶 = 0, 𝐷 = 3,
all variables but 𝑥 are eliminated. The graph consists of every point with 𝑥 coordinate equal to 3.
(Refer Slide Time: 21:24)

The graph of this equation is therefore a plane 3 units in front of the origin. Next, set 𝐶 to 0 and
the coefficients 𝐴 and 𝐵 to 1; this eliminates 𝑧 from the equation but leaves the variables 𝑥 and
𝑦. Now if we set the value of 𝐷 = 3, we get the linear equation 𝑥 + 𝑦 = 3. Let us ignore 𝑧
for the moment and plot this equation in the 𝑥, 𝑦 plane. This gives us a line which intersects the 𝑥
and 𝑦 axes. Every point on this line is a solution to the equation 𝑥 + 𝑦 = 3 where the 𝑧
coordinate of every point is 0.

But there are more solutions than this. All points on any such line for 𝑥 + 𝑦 = 3 are solutions to this
equation regardless of the value of 𝑧.
(Refer Slide Time: 22:32)

There are an infinite number of such lines, one line for every possible value of 𝑧. Since 𝑧
can be any real number, all these lines together create a plane. Each point in this plane
corresponds to a solution of the linear equation 𝑥 + 𝑦 = 3. The graph of a linear equation
in 3 variables, graphed in Cartesian coordinates, will always be a plane, assuming that the
constants 𝐴, 𝐵, 𝐶 are not all 0.

Any plane which corresponds to the solutions extends infinitely in all directions. Depending
upon the values of 𝐴, 𝐵, 𝐶, 𝐷, the plane may lie in any position and orientation.
(Refer Slide Time: 23:19)

(Video End Time: 23:22)


So an equation in 3 variables represents a plane geometrically, and all points in that plane
are solutions of that linear equation in 3 variables.

Basic Linear Algebra
Prof. Inder K. Rana
Department of Mathematics
Indian Institute of Technology - Bombay

Lecture – 02
Introduction II

Now let us look at slightly more complicated things, namely: if we are given more than one equation
in 3 variables, what could the possible solutions be?
(Refer Slide Time: 00:36)

So let us look at this, which again geometrically is very intuitive.


(Video Start Time: 00:40 - Video End Time: 04:06)

For 2 linear equations in 2 variables, there are 3 types of solutions. If the lines intersect at a
single point, the system has a unique solution. If the lines are distinct and parallel, there are no
points of intersection and the system has no solution. And if the lines are identical, the 2 graphs
coincide, intersecting at an infinite number of points; the system therefore has an infinite number
of solutions. Likewise, systems of 3 linear equations in 3 variables can have a single unique
solution, no solution, or an infinite number of solutions, depending on the ways in which the 3
planes are oriented.

Let us consider all possible ways in which 3 planes can intersect or not intersect. We will number
these planes 1, 2 and 3. One possibility is that no 2 planes in the system are parallel and all 3
planes intersect at only 1 point. In this case, the system will have 1 unique solution. The second
possibility is that 2 of the planes are not parallel, and so their points of intersection form a line. If
the third plane intersects the other 2 along the same line, then the common points of intersection
of all 3 planes are represented by that line.

The system will therefore have an infinite number of solutions, corresponding to every point on
the line. The third plane does not necessarily have to be distinct from both of the other planes. If
the third plane is identical to one of the other 2 planes, then the points common to all 3 planes are
still represented by the same line of intersection, and the system will have an infinite number of
solutions.

An infinite number of solutions will also exist if the system consists of 3 identical planes. In this
case, any point which lies on one plane is common to all 3 planes. The system will therefore have
an infinite number of solutions corresponding to every point on the plane. In addition, there are
several orientations of the planes which result in a system with no solution. Any time the system
has 2 distinct parallel planes, there can be no solution.

Since distinct parallel planes have no points in common, regardless of the orientation of the third
plane there can be no point which lies on all 3 planes. The third plane can be distinct and parallel
to the other 2, identical to one of the other 2, or not parallel to either of the other planes, thus
intersecting both of them.

Regardless of the orientation of the third plane, since there are no points common to all 3 planes,
the system has no solution. One additional configuration of the planes which results in a
system with no solution is when the planes are oriented so that their pairwise lines of intersection lie along
3 distinct parallel lines. Once again, since there are no points common to all 3 planes, the system
has no solution.
(Refer Slide Time: 04:07)

(Refer Slide Time: 04:13)

(Refer Slide Time: 04:20)

Right, so let us just summarize what we have done till now. We have looked at a single linear
equation in 2 variables. Then we looked at 2 equations in 2 variables, and then we looked at
a system of 3 equations in 3 variables and their possible solutions. In each of these dimensions, either
the system has no solution, or it has an infinite number of solutions, or a unique solution. How does
one go about analyzing the situation when the number of variables increases?

There is no geometry available. What do we do about it? What we will do is try to convert these
geometric pictures into algebra and then see whether we can extend that algebra to a larger
number of variables.

(Refer Slide Time: 05:22)

So let us analyze this variable elimination method. Let us look at this simple
example and then we will try to formalize it. The example is 𝑥 + 3𝑦 + 5𝑧 = 0, 𝑥 + 2𝑦 + 3𝑧 = 1
and the third equation is 𝑥 − 𝑧 = 1. I have labelled these equations as 𝐸1 , 𝐸2 , 𝐸3 . The idea is
to try to eliminate a variable. Already in the third equation, right, 𝑦 is missing, so let us try to eliminate a
variable from the other 2 equations also, okay.

So that is the idea. So what we do is, we perform these operations, okay. Here, for example, what we
have done is eliminate 𝑥 from the equations. The first equation remains as it is. For the
second equation, the operation is 𝐸3 − 𝐸1 , and then the operation 𝐸2 − 𝐸1 . So we have just
taken linear combinations of the equations, so the solutions will not change. The idea is: whenever
you take a linear combination of any 2 or more of the equations, the solution does not change.

And now from these 2 new equations, I can try to eliminate one of the variables again. So let us do that:
𝐸3 − 3 𝐸2 , so that operation is done.

So I get a system of equations which is equivalent to the earlier one, because I have just taken
linear combinations, done nothing else, right. So the solution set of the last system should be the same as
the solution set of the original system. That is the idea. But in the last system, I get an equation 0 =
−2; that is not possible, that is absurd, right. So that means this system has got no
solution, right. So that is solving it algebraically.
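
A minimal sketch of these same elimination steps, carried out numerically with NumPy (the row operations below mirror the ones just described):

```python
# The equations E1: x+3y+5z=0, E2: x+2y+3z=1, E3: x-z=1 are stored as rows
# of the augmented matrix [A | b]; each row operation is a linear combination.
import numpy as np

M = np.array([[1., 3., 5., 0.],   # E1
              [1., 2., 3., 1.],   # E2
              [1., 0., -1., 1.]]) # E3

M[1] = M[1] - M[0]        # E2 <- E2 - E1   (eliminates x)
M[2] = M[2] - M[0]        # E3 <- E3 - E1   (eliminates x)
M[2] = M[2] - 3 * M[1]    # E3 <- E3 - 3*E2 (eliminates y)

print(M)
# The last row is [0 0 0 -2], i.e. the equation 0 = -2, so the system is inconsistent.
```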
(Refer Slide Time: 07:41)

So this system has no solution.

Now consider this second example. The transformed system is equivalent to the given system, so its solution should
be the same as the solution of the original one. And here I get 𝑥3 = 2 from the last equation. I put the value
of 𝑥3 in the previous equation and get the value of 𝑥2 , and I put these 2 values in the first equation and
get the value of 𝑥1 .

So I eliminate and then substitute. This method is called the elimination and substitution method, or
backward substitution, and it gives me the solution of the system. So this system has got a
unique solution, right. The earlier one had no solution; this one has a unique solution. And let us look
at another one, this one.
(Refer Slide Time: 09:13)

So here there is only 1 equation, 𝑥1 + 𝑥2 + 𝑥3 = 6, which geometrically is a plane, right. How
many points are there in the plane? An infinite number of them. So this system, one equation in 3
variables, will have an infinite number of solutions. How can I write them? One possibility
is to write 𝑥1 = 6 − 𝑥2 − 𝑥3 , right. What does that mean? 𝑥1 can be determined in terms of the
values of 𝑥2 and 𝑥3 , and 𝑥2 and 𝑥3 are free to take any values, right.

So if I put different values for 𝑥2 and different values for 𝑥3 , I get different values for 𝑥1 . Since 𝑥2
and 𝑥3 can take an infinite number of values, 𝑥1 also takes an infinite number of values.
That means the system has got an infinite number of solutions, which we have verified algebraically.
Geometrically it is a plane, so we know it has an infinite number of solutions. So 3 possibilities: I have
given you 3 examples where the system had no solution, the system had exactly 1 solution, and the
system had an infinite number of solutions.
(Refer Slide Time: 10:32)

What do we observe in solving all these? The solution of a system does not change, whether
it is 2 equations in 2 variables or 1 equation, under the following operations. If any 2 equations are interchanged, what will happen? If I
interchange any 2 equations, right, instead of saying this is line 1 and this is line 2 in the plane, I
am saying this is line 2 and this is line 1. I am just renaming them, right. The solution does not
change if I change the order of the equations, right.

Second, an equation is multiplied by a non-zero scalar. If I take a line, right, and multiply its equation by a non-zero scalar
on the left hand side as well as the right hand side, does the equation change? The equation
remains the same, the line remains the same, right. So the solution does not change, right. Why non-zero?
Because when I multiply by 0, all the information is lost: 0 = 0 says nothing, the equation is
gone, right.

So multiplication by a non-zero scalar leaves the equation, and hence the solutions, unchanged. And the third
one is: a scalar multiple of one equation is added to another, which we had already seen, right. If there is a solution
and you add a scalar multiple of one equation to another (the scalar of course not 0), right,
then the solution does not change, right, that we saw. In the 2-variable case, we saw that if you
add a scalar multiple of one linear equation to another, the point of intersection does not change.

Only the inclination of the line changes, right. So these 3 are the basic operations that do not change
the solution of a system of equations, right. That is one observation. And then the other
observation is that the variables really do not play any role in all the computations: 𝑥1 remains
𝑥1, 𝑥2 remains 𝑥2, 𝑥3 remains 𝑥3. It is only the constants in front of them, or on the
right side of the equality, right.

They change when you do something with the equations — take scalar multiples and so on, right. So when we
write, we can forget about these variables. We only have to keep track that this place holds the coefficient
of the first variable, this the coefficient of the second variable, this the
coefficient of the third and so on, right. So the variables do not play any role. So let us try to describe
these observations in an abstract setting.
this in an abstract study without observations.
(Refer Slide Time: 13:17)

So we formally define: an equation of the form 𝑎1 𝑥1 + 𝑎2 𝑥2 + ⋯ + 𝑎𝑛 𝑥𝑛 = 𝑏 is called a
linear equation in 𝑛 variables 𝑥1 , 𝑥2 , … , 𝑥𝑛 . The number of variables is 𝑛, and there is only 1 equation. Why is it
called linear? Because the power of every variable is 1. That is why it is called a linear equation
in 𝑛 variables. With more than 1 equation, you have a system of 𝑚 equations in 𝑛 variables, with
coefficients 𝑎𝑖𝑗 .

In 𝑎𝑖𝑗 , the first index indicates the equation and the second indicates the variable, right. And for
computational purposes, we do not want 𝑥1 , 𝑥2 , … , 𝑥𝑛 coming into the picture, right. We do not
want them.
(Refer Slide Time: 14:46)

Before that, let us just say what a solution is. A vector (𝑠1 , 𝑠2 , … , 𝑠𝑛 ) ∈ ℝ𝑛 is called a
solution if, when I put 𝑥1 = 𝑠1 , 𝑥2 = 𝑠2 , … , 𝑥𝑛 = 𝑠𝑛 in all the equations, the left hand side is equal
to the right hand side, right. That is called a solution. The question is how do we solve such
a system; that is what we want to analyze.
(Refer Slide Time: 15:41)

So let us write that down. The variables are 𝑥1 , 𝑥2 , … , 𝑥𝑛 , then the equality sign, and on the right hand side something appears,
right.
Written this way, all the data about the system is captured. We have fixed the positions of the
variables, so everything is captured.

(Refer Slide Time: 16:25)

So this motivates the following. Basically, to represent a system of 𝑚 linear equations in 𝑛 variables, a rectangular array of numbers becomes
important, right. So we start defining it, because this is going to come up again and again. It motivates
a definition in mathematics, namely a rectangular array of numbers; they could be real or complex.

In fact, they could be more general, but we will be concerned only with entries which are real or
complex, okay. So consider a rectangular array in which there are 𝑚 rows — first row, second row, third row,
up to 𝑚 rows — and each row has got 𝑛 columns, right. This is called a matrix, okay. Why is the
matrix important, you can ask?
matrix is important you can ask?

The important thing is that when we do computations, we are not going to do them by hand. We are
going to put all the computations on a computer, right — all the multiplications, divisions, everything
— and a computer or a machine cannot store variables; it can only store scalars, right. So we can
ask the machine to store the data in this format, right. The machine does not even need to know what is row
1 or row 2 or row 3 or row 𝑚, right. What you do is write all the numbers in a rectangular array,
𝑚 × 𝑛.

How many numbers are there in total? 𝑚 × 𝑛, right. So the computer will store it as 𝑚 × 𝑛 numbers.
The basic idea is that matrices written this way capture the computational aspect of linear equations,
okay. This is called a matrix of order 𝑚 × 𝑛. When 𝑚 = 𝑛, we will call it a square matrix.
(Refer Slide Time: 19:04)

If you are in the 𝑖th row and 𝑗th column, that position is called the 𝑖𝑗th term or entry of the
matrix. We will start using this term: we have a matrix with 𝑚 rows, 𝑛 columns, whose 𝑖𝑗th entry is so
and so, right. So we will start using that language. Sometimes, in short, you write the whole matrix as [𝑎𝑖𝑗 ], if
you are not really interested in the particular numbers.
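
As a brief sketch of this point (not part of the lecture), this is exactly how a numerical library stores a matrix: an 𝑚 × 𝑛 block of numbers, addressed by a row index and a column index.

```python
# Minimal sketch: an m x n matrix stored as a rectangular array of numbers.
import numpy as np

A = np.array([[1, 3, 5],
              [1, 2, 3]])        # a 2 x 3 matrix (m = 2 rows, n = 3 columns)

print(A.shape)                   # (2, 3): the order m x n
print(A.size)                    # 6 = m * n numbers stored in total
print(A[0, 2])                   # the entry in row 1, column 3 (indices start at 0): 5
```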
(Refer Slide Time: 20:11)

So we have defined a new quantity which involves numbers. What are the possibilities? What
kinds of operations can we do? This is very common in mathematics: you define a quantity, you
collect all such quantities and try to do operations on them. For example, you must have
studied functions, right, real-valued functions — say real-valued functions defined
on an interval.

We can add functions, we can multiply functions, we can scalar-multiply a function, and we can
compose functions. So in the class of all functions, you can do these various operations: you can
add, you can subtract, you can scalar-multiply, you can multiply functions, or you can compose
functions. Similarly, we have got a collection of matrices, right. So what do we want to do with them?

First of all, given 2 matrices of the same order, when do we say they are equal? We are defining equality
of matrices. The order is the same, so the natural thing is to say they are equal if each
entry is equal: the 𝑖𝑗th entry of one should be equal to the 𝑖𝑗th entry of the other, right, the
corresponding entries must be equal. That is equality of matrices.

Basic Linear Algebra
Prof. Inder K. Rana
Department of Mathematics
Indian Institute of Technology - Bombay

Lecture – 03
Introduction III

(Refer Slide Time: 00:26)

You can also define addition. What will addition be? Let us look at addition. Given 2 matrices
𝐴 and 𝐵, with 𝐴 = [𝑎𝑖𝑗 ], 𝐵 = [𝑏𝑖𝑗 ] of the same order 𝑚 × 𝑛, the sum has entries 𝑎𝑖𝑗 + 𝑏𝑖𝑗 : the 𝑖𝑗th entry
of the sum is the sum of the 𝑖𝑗th entries of the corresponding matrices. This addition of
matrices has nice properties; for example, 𝐴 + 𝐵 is the same as 𝐵 + 𝐴, a nice property just like for numbers, right.
𝐴 + (𝐵 + 𝐶) is the same as (𝐴 + 𝐵) + 𝐶, the associative property. So the commutative property, the associative
property and so on hold. For example, you can take the matrix 𝑂 with all entries 0. What is 𝐴 + 𝑂?
It is again 𝐴, so 𝑂 is like an identity. So properties similar to those of number addition
appear with this kind of addition, right. We would also like to define some multiplication for which
nice properties come out, okay.
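
A quick numerical check of these addition properties (a sketch with arbitrary entries):

```python
# Minimal sketch: entrywise addition of matrices and its number-like properties.
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
C = np.array([[0, 1], [1, 0]])
O = np.zeros((2, 2), dtype=int)                  # the zero matrix

print(np.array_equal(A + B, B + A))              # commutativity: True
print(np.array_equal(A + (B + C), (A + B) + C))  # associativity: True
print(np.array_equal(A + O, A))                  # O acts like an identity: True
```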
(Refer Slide Time: 01:42)

Let us just look at one simple operation called the transpose of a matrix, okay. What is the
transpose of a matrix? If the matrix is 𝑚 × 𝑛, the transpose just interchanges the rows with the
columns. Whatever was the first row, make it the first column of another matrix. So we define a new
matrix: the first row becomes the first column, and similarly the others are replaced. All the
rows become columns and all the columns naturally become rows, right. So if the matrix was 𝑚 × 𝑛, what
is the new matrix? 𝑛 × 𝑚, right, since rows and columns have been interchanged.

So here are some simple observations. The transpose is a well-defined matrix, right; we are given all
of its entries. If 𝐴 is 𝑚 × 𝑛, obviously 𝐴𝑇 is 𝑛 × 𝑚. And what is the transpose of the transpose? We are
getting back the matrix: you interchange rows with columns, then again interchange rows
of that with columns, and you get back the original matrix. So (𝐴𝑇 )𝑇 = 𝐴, right. These are elementary
properties of matrices.
(Refer Slide Time: 03:22)

So we looked at what is defined as the transpose. We will call a matrix symmetric if 𝐴 =
𝐴𝑇 , that is, if interchanging rows and columns does not change the matrix, right; and we will call it
anti-symmetric, or more commonly skew-symmetric, if 𝐴 = −𝐴𝑇 . Does this put any condition on the
matrix 𝐴? If I say a matrix 𝐴 is symmetric, does it put any condition?

Yes, because the transpose changes rows to columns. So if 𝐴 was 𝑚 × 𝑛, the transpose becomes 𝑛 × 𝑚,
and if the two have to be equal, that means 𝑚 has to be equal to 𝑛; otherwise they cannot be equal, by
the equality of matrices. So the concept of symmetric or skew-symmetric matrices is defined only
for square matrices, okay. So that is one observation.
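
A small sketch checking these transpose facts numerically (the matrices are arbitrary):

```python
# Minimal sketch: transpose, (A^T)^T = A, and the symmetry tests.
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])            # 2 x 3
print(A.T.shape)                     # (3, 2): the transpose is n x m
print(np.array_equal(A.T.T, A))      # (A^T)^T = A: True

S = np.array([[1, 7], [7, 2]])       # square matrices
K = np.array([[0, 3], [-3, 0]])
print(np.array_equal(S, S.T))        # S is symmetric: True
print(np.array_equal(K, -K.T))       # K is skew-symmetric: True
```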
(Refer Slide Time: 04:25)

So here now is matrix multiplication. We have got 2 matrices, 𝐴 = [𝑎𝑖𝑗 ] and another matrix
𝐵 = [𝑏𝑘𝑙 ]. Observe here that 𝐴 is 𝑚 × 𝑛 and 𝐵 is 𝑛 × 𝑝, okay. Then the product of these 2 matrices,
𝐴𝐵, right, is a new matrix; let us call it 𝐶. How do I describe 𝐶? Take a row of the first one and a
column of the second, multiply the corresponding entries and add.

And that naturally puts the condition that the number of columns of the first should be the same as the number
of rows of the other. That is why we have taken 𝐴 to be 𝑚 × 𝑛 and 𝐵 to be 𝑛 × 𝑝: the number of
columns of 𝐴 is the same as the number of rows of 𝐵. So what will the resulting matrix be? Its entries are
the sums 𝑐𝑗𝑙 = 𝑎𝑗1 𝑏1𝑙 + 𝑎𝑗2 𝑏2𝑙 + ⋯ + 𝑎𝑗𝑛 𝑏𝑛𝑙 , where 𝑗 varies from 1 to 𝑚 and 𝑙 varies from 1 to 𝑝, so that gives you
a matrix 𝐶 = 𝐴𝐵 of order 𝑚 × 𝑝.
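
A short sketch of this definition (with arbitrary entries), checking that an 𝑚 × 𝑛 matrix times an 𝑛 × 𝑝 matrix gives an 𝑚 × 𝑝 matrix whose entries are row-by-column sums:

```python
# Minimal sketch: the product of an m x n and an n x p matrix is m x p,
# with entry (j, l) equal to the sum over k of a_{jk} * b_{kl}.
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])        # 2 x 3  (m = 2, n = 3)
B = np.array([[1, 0],
              [0, 1],
              [2, 2]])           # 3 x 2  (n = 3, p = 2)

C = A @ B
print(C.shape)                                               # (2, 2): order m x p
print(C[0, 1] == sum(A[0, k] * B[k, 1] for k in range(3)))   # True
```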

(Refer Slide Time: 07:45)

One nice property is that this product is associative: 𝐴(𝐵𝐶) is the same as (𝐴𝐵)𝐶.
So this way of defining the product again has a nice property, associativity, okay. It is not
commutative: 𝐴𝐵 need not be equal to 𝐵𝐴, right; one can give examples of that. And if you want
to define 𝐴𝐵𝐶, you see the condition that is put: 𝐴 is 𝑚 × 𝑛, 𝐵 is 𝑛 × 𝑝 and 𝐶 is 𝑝 × 𝑞.

(Refer Slide Time: 08:54)

This we will not prove, but you can try to write a proof yourself, right, and you can try to verify
it with some simple examples. The transpose of a product is the same as the product of the transposes
in reverse order: (𝐴𝐵)𝑇 is the same as 𝐵 𝑇 𝐴𝑇 . One reason this should be so is that if you take 𝐴𝐵 and transpose it,
the rows are interchanged with the columns; similarly, in 𝐵 𝑇 𝐴𝑇 the rows of each factor have already been
interchanged with the columns, and the order of the factors is reversed.
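
A quick numerical verification of both claims (a sketch with arbitrary matrices):

```python
# Minimal sketch: (AB)^T = B^T A^T, and AB need not equal BA.
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[0, 1], [1, 1]])

print(np.array_equal((A @ B).T, B.T @ A.T))   # True
print(np.array_equal(A @ B, B @ A))           # False: multiplication is not commutative
```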

(Refer Slide Time: 10:39)

So this is a system, and all the data about the system is captured in these numbers, right. Let us try
to write this slightly more neatly, okay, so that we get our system. This part of the data, the coefficients, I
will call the matrix 𝐴, and this I will call 𝑏, okay. What is the order of 𝐴? It is 𝑚 × 𝑛, right.
What is the order of 𝑏? 𝑚 × 1. And what were the variables? 𝑥1 , 𝑥2 , … , 𝑥𝑛 . Let us write them as
a column vector 𝑋. What is the order of that? 𝑛 × 1, right.
(Refer Slide Time: 12:35)

So the whole equation can be represented as 𝐴𝑋 = 𝑏. This is the matrix representation of
a system of linear equations. We will call it an 𝑚 × 𝑛 system: instead of writing out the
system of 𝑚 linear equations in 𝑛 variables, it can be represented as the matrix equation 𝐴𝑋 =
𝑏, where 𝐴 is the matrix of coefficients coming in front of the variables, 𝑋 is the
unknown vector of variables we want to solve for, and 𝑏 is the vector on the right hand side, right. So
that is the matrix equation. And this really is an equality, because when you multiply, right, the first
entry of 𝐴𝑋 gives the first equation, the second entry the second equation, the third entry the
third equation; the equality of matrices gives you the whole system, right. So the problem is how to solve
this system of linear equations.
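
A minimal sketch of the matrix form 𝐴𝑋 = 𝑏 (the particular system is illustrative, not one taken from the slides):

```python
# Minimal sketch: writing m equations in n variables as one matrix equation A X = b.
import numpy as np

A = np.array([[1., 1., 1.],
              [0., 1., 2.],
              [0., 0., 1.]])     # m x n coefficient matrix
b = np.array([6., 5., 2.])       # right hand side vector

X = np.array([3., 1., 2.])       # candidate values for x1, x2, x3
print(A @ X)                     # each entry is the left hand side of one equation
print(np.allclose(A @ X, b))     # True: this X solves the system
```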

And the Gauss elimination method that we have just seen says you can do those 3 operations,
and they do not change the solution of the system of equations, right. What were our operations? You can
interchange, you can change the position of the equations. What does that mean in this matrix
representation? Changing the order of the equations means interchanging rows; an
interchange of rows should not affect the
solution, right.

And it does not actually affect the solution. What was the second? You can multiply any row by a
non-zero scalar; that does not change the solution, right. And third, you can add one row to another
if you like, right; the solution does not change. So you will get a new system which is equivalent to the
earlier one, and the solution does not change.

And the idea is that the matrix representing the system should end up with fewer nonzero
coefficients, with more 0's appearing, so that you are able to solve the system very easily, right.
So we will see next time this Gauss elimination and substitution method for a general system of
linear equations, right.

Basic Linear Algebra
Prof. Inder K. Rana
Department of Mathematics
Indian Institute of Technology - Bombay

Lecture – 04
Systems of Linear Equations I

Okay, so let us begin with today's lecture. In the previous lecture, we had started looking at why
linear algebra is important to study.
(Refer Slide Time: 00:35)

Then we started looking at how linearity arises in various setups. Basically there are 2 ways linearity
can arise: one is because of linear equations, and the other is studying geometry in an algebraic
setup. We started looking at systems of linear equations: we looked at systems of linear equations
in 2 variables and then in 3 variables.

In both dimension 2 and dimension 3, that is, 2 variables and 3 variables, we analyzed
geometrically how the solutions can be obtained, and that led us to an algebraic method of solving
those equations, namely trying to eliminate variables one at a time and seeing whether the
system has a solution or not. So we will start by looking at a general system of linear equations
and generalize the Gauss elimination method to a system of 𝑚 linear equations in 𝑛 variables.
(Refer Slide Time: 01:43)

(Refer Slide Time: 02:49)

(Refer Slide Time: 03:19)

The solution of the system does not change if we change the order of the equations; that is just
renumbering the equations, interchanging the first with the second, say. The solution does not change.
Second, multiplying an equation by a non-zero scalar and adding it to another: we get a new equation
in place of the old one, but the system still has the same solution — whatever was a solution
earlier still stays a solution. And thirdly, if you multiply an equation by a non-zero scalar, you
are scaling both sides equally, so the solution does not change. But an important
thing here is that it should be a non-zero scalar: multiply by a non-zero scalar. These 3 types of operations
performed on the equations are called elementary row operations. So we observe that the
elementary row operations, when performed on a system of linear equations, do not change the
solution set.
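
As a sketch of how these three operations look on a matrix stored as a NumPy array (the helper names are mine, not standard terminology):

```python
# Minimal sketch of the three elementary row operations on a matrix M.
# Helper names are illustrative; rows are indexed from 0.
import numpy as np

def swap_rows(M, i, j):
    M[[i, j]] = M[[j, i]]                     # interchange row i and row j

def scale_row(M, i, c):
    assert c != 0, "must multiply by a non-zero scalar"
    M[i] = c * M[i]

def add_multiple(M, i, j, c):
    M[i] = M[i] + c * M[j]                    # add c times row j to row i

M = np.array([[1., 2., 0., 1.],
              [2., 5., 1., 0.]])
add_multiple(M, 1, 0, -2.0)                   # eliminate the leading 2 in row 1
print(M)                                      # [[ 1.  2.  0.  1.]  [ 0.  1.  1. -2.]]
```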
(Refer Slide Time: 04:28)

And what is the aim? The aim is that by applying these elementary row operations, an 𝑚 × 𝑛
system, that is 𝑚 linear equations in 𝑛 variables, can be transformed into a system of equations of
the following type, where the first equation has coefficients 𝑐11 , 𝑐12 and so on. In the
next equation, at least one of the variables is eliminated: if 𝑐11 was not 0, then in the
next equation that coefficient is eliminated and you get an equation which starts with 𝑥2 . Possibly the coefficient of 𝑥2
can also be 0, but 𝑥1 is definitely missing from it. So 1 variable has been eliminated. So using those
elementary row operations, the given system can be transformed into a
system of the following type. We write this as: there is a number 𝑟, less than or equal to
the number of variables 𝑛, that is, a number 𝑟 between 1 and
𝑛, such that the first 𝑟 equations possibly have non-zero coefficients, but from the (𝑟 + 1)th equation onwards,
everything is 0 = 0.

So what have we done? We have reduced the number of equations required to find
the solution to at most 𝑟 equations; the remaining 𝑚 − 𝑟 equations are all 0 =
0. So this is the method suggested by Gauss: do those elementary row
operations on the system and transform it into this form. But what is the advantage of this?
The remaining equations are always true, but now look at this 𝑟th equation,
okay.

Suppose it so happens, okay, that in the 𝑟th equation all the coefficients on the left are 0 but 𝑑𝑟
is not equal to 0; suppose this happens, right. You will get 0 equal to a non-zero number, and that
means there is some inconsistency in the system. So the system will not have any solution. Such a system
is called inconsistent.
(Refer Slide Time: 07:21)

So the conclusion is: if for some 𝑟 between 1 and 𝑛 (𝑛 being the number of variables), all the coefficients
𝑐𝑟𝑗 = 0 for all 𝑗, so that the left hand side of that equation is 0 but the right hand side is
not equal to 0, then the system is inconsistent. The second possibility is that the system is consistent,
which means this does not happen. Then there are solutions, and 𝑛 − 𝑟 of the variables get arbitrary
values.

So let us just look at the previous equation. Suppose this is a consistent system. The 𝑟th equation involves
only the variables 𝑥𝑟 , 𝑥𝑟+1 , … , 𝑥𝑛 . We can solve for one of these variables in terms of the other variables
using this equation.

So we can give the remaining variables arbitrary values and find the value of one of the
variables. That is what it means to say that 𝑛 − 𝑟 variables get arbitrary values, okay. And the remaining ones
can be determined in terms of these variables by backward substitution. So let us
summarize this.

The system being inconsistent means no solution, and that happens when something like 0
equal to non-zero appears in one of the equations after the system has been reduced to the specific form using the row
operations. Consistent means there are solutions — at least 1 solution — and you can get up to an infinite
number of solutions by putting various values for some variables and calculating the others in
terms of them. So this is the method suggested by Gauss to solve a system of linear equations.

(Refer Slide Time: 09:27)

To make the method more systematic, we realize that the coefficients are important and not the
variables. So we look at this array of numbers; it becomes important to study arrays of numbers,
and that gave us the concept of a matrix.
(Refer Slide Time: 10:00)

Such a thing is called a matrix. We saw in the previous lectures that
this system of 𝑚 linear equations in 𝑛 variables can be written in terms of matrices. The matrix
consisting of the coefficients of these 𝑚 equations, forgetting the variables and forgetting
the arithmetic symbols, is 𝐴; 𝑋 is the unknown vector of variables. So 𝑋 has 𝑛 rows and 1
column, 𝐴 has 𝑚 rows and 𝑛 columns, and when you multiply, you get an 𝑚 × 1 vector, right.

So in matrix form you get 𝐴𝑋 = 𝑏, the matrix form of the equation. The idea of writing it in
the matrix form is that we are not really concerned with 𝑋; we are only concerned with 𝐴 and 𝑏
when we apply the elementary transformations.

(Refer Slide Time: 11:18)

And the Gauss elimination method says that the coefficient matrix 𝐴 and the vector 𝑏, after
those elementary row operations, get transformed into matrices of this type.

So the original matrix 𝐴 is transformed to 𝐴̃, which is of this form, and the
vector 𝑏 gets transformed to 𝑏̃, because whatever operations you are doing on the left hand side,
you have to do on the right hand side also, right. So when 𝐴 changes, 𝑏 automatically changes
according to those operations, right. This is the new coefficient matrix for the new system,
which is equivalent to the earlier one.

And this has a special form. Look at the bottom: there are some rows which are all equal to 0, and
in the non-zero rows the place where the first possibly non-zero coefficient occurs keeps moving to the right. This is
a special form of a matrix. We will spend some time on this form of the matrix because it is going
to be important for all future calculations.
(Refer Slide Time: 12:43)

To keep track of 𝐴 and the vector 𝑏 together, we form a new matrix: 𝐴 is 𝑚 × 𝑛, 𝑏 is 𝑚 × 1.
Whatever row operations we do on 𝐴 are also going to be done on 𝑏. So this combined matrix is what
we are going to operate on. The idea is to do the elementary row operations and change it to
that form: the 𝐴 part of the matrix should be changed to the special form.
(Refer Slide Time: 13:31)

This matrix is called the augmented matrix because we have added 1 more
column to 𝐴. So it is called the augmented matrix of the system of linear equations. And the Gauss
elimination method is summarized as reducing this matrix to a special form by elementary row
operations. What is that special form? We will discuss that.
(Refer Slide Time: 14:00)

Here is the special form. It is called the row echelon form of a matrix. Take any matrix 𝑀 which is
𝑚 × 𝑛 and non-zero; of course, the 0 matrix has nothing to do with it, it will not give anything, so we
assume that there is at least 1 non-zero entry in the matrix. So 𝑀 is an 𝑚 × 𝑛 non-zero matrix. We say this
matrix is in row echelon form (REF is short for row echelon form) if there is a number 𝑟 between
1 and the minimum of 𝑚 and 𝑛, and there are natural numbers 𝑝1 , 𝑝2 , … , 𝑝𝑟 , with the following
properties, right. The first property says the matrix looks like this: the first 𝑟 rows are possibly non-zero,
and the remaining rows are all 0 rows. That is the characteristic property of this number 𝑟,
okay.
(Refer Slide Time: 15:44)

What does that mean? It means that in the first 𝑟 rows, the leading non-zero entries cannot move back, so to
say. If in some row the first non-zero entry comes at a certain column, then in the next row it has to come in a column
to the right of it, right; the entries before it are all 0. So in the first row the first non-zero entry comes in column 𝑝1 ;
the first non-zero entry of the second row should not come in that column or to the left of it, it
should come only in a column to the right of it. That is the property of
𝑝1 , 𝑝2 , … , 𝑝𝑟 , that is, 𝑝1 < 𝑝2 < ⋯ < 𝑝𝑟 , okay.
(Refer Slide Time: 17:32)

So this is what the matrix will look like.

For a non-zero 𝑚 × 𝑛 matrix, there is such a number 𝑟. Why does it have to be at most the minimum of
𝑚 and 𝑛? Because 𝑟 rows are non-zero, right, so 𝑟 has to be less than or equal to the number of rows 𝑚;
but the leading non-zero entries come in distinct columns, so 𝑟 also has to be less than
or equal to the number of columns 𝑛, right.
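
One way to make the definition concrete (a sketch; the helper name is mine) is a small function that checks the two conditions: zero rows only at the bottom, and the leading non-zero positions strictly increasing:

```python
# Minimal sketch: check whether a matrix is in row echelon form (REF).
import numpy as np

def is_ref(M, tol=1e-12):
    lead = -1                      # column of the previous row's leading entry
    seen_zero_row = False
    for row in np.asarray(M, dtype=float):
        nonzero = np.nonzero(np.abs(row) > tol)[0]
        if nonzero.size == 0:
            seen_zero_row = True   # zero rows must all be at the bottom
            continue
        if seen_zero_row or nonzero[0] <= lead:
            return False
        lead = nonzero[0]
    return True

print(is_ref([[0, 1, 2], [1, 0, 2]]))   # False: the leading entries go backwards
print(is_ref([[1, 0, 2], [0, 1, 2]]))   # True
```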

(Refer Slide Time: 22:33)

Basic Linear Algebra
Prof. Inder K. Rana
Department of Mathematics
Indian Institute of Technology - Bombay

Lecture – 05
Systems of Linear Equations II

(Refer Slide Time: 00:26)

Let us look at the matrix 0 0 0 0. Is it in row echelon form? We are considering only non-zero
matrices, so we are not really bothered about it, okay; you can say this is not in row echelon form,
and it cannot be brought to it because there is no non-zero entry at all. Let us look at this one, 1 2 and 5.
Is that in row echelon form? Yes, it is: there is only 1 row and it is non-zero, so fine. What about
this one? That also is, okay.
(Refer Slide Time: 00:58)

Let us look at some more. For example, look at this one: 0 1 2 is the first row, 1 0 2 is the second
row, right. Here the first non-zero entry in the first row is the 1, coming in the second column, but in the next row it comes
in the first place itself, that is, in the first column. So this is not in
the row echelon form. If I wanted to bring it to row echelon form, I would interchange these 2
rows.

(Refer Slide Time: 04:07)

(Refer Slide Time: 05:56)

So this is a theorem which can be written down, and the proof can be written down.
Actually, when we work out changing a matrix to row echelon form, we will see that the proof
automatically becomes a part of it. Sometimes, when you say something exists, you actually construct
that thing, and that shows that it exists, right. There are some proofs in mathematics which
are purely existential proofs, where you cannot exhibit the object but you can prove it exists, right.

There are other proofs in mathematics where you actually construct the object and thereby show that it exists. The
difference is like my pet quote about saying that god exists, right: some people believe god exists
for some reasons, but you cannot produce god in front of you, right. So there are existential
proofs which purely assert existence, right. You might have seen in calculus a criterion for convergence of
a sequence.

Every Cauchy sequence has a limit — it does not tell you where the limit is, it only says it exists, right. But
the definition of convergence requires you to locate the point which is the limit. That is the difference
between proving existence and actually constructing the object, okay. You will come across these things in many
parts of mathematics.

There is a typo here. Every matrix, right, of any order 𝑚 × 𝑛 — there is no relation between 𝑚 and
𝑛, okay, that is the typo; it should be 𝑚 × 𝑛 — every 𝑚 × 𝑛 matrix can be reduced to a row echelon form. The
word "a" is used to indicate that there can be more than one row echelon form of a matrix. The reduction uses purely
elementary row operations, okay. Row echelon form we will shorten as REF, and row
operations we will just write as row operations. That is the theorem; we will see how the proof goes as
we work through it, okay.
we do something, okay.
(Refer Slide Time: 08:11)

Why does the word echelon come in? If you look in an English dictionary, what is an echelon? Go back
and look it up: what does the word echelon mean, and from where does this word
suddenly drop into mathematics? Echelon means a particular arrangement or formation of
something. For example, when planes fly, they fly in a particular formation; it is called the echelon
formation of the planes.

So one is behind the other, right — each position is slightly behind the previous one. Similarly here, the position where
the first non-zero entry comes keeps moving to the right; that is the formation. So the word echelon basically
means a particular formation of something, okay. I just brought this picture so that
hopefully the word echelon will stay in your mind: it is a formation with each one going behind the other, or a staircase,
if you want to keep that kind of picture in mind, okay.
(Refer Slide Time: 09:15)

So let us look at an example of how one does that. Everything should be obtained only using
elementary row operations; that is the only condition. Those are the only tools given to you, right.
Using those tools, you have to transform the given system into a new system, that is all, okay. So
that is the row echelon form. Let us go a step further.
(Refer Slide Time: 14:01)

Let us look at one more example, okay.


So you can say that the row echelon form means there is an 𝑟 such that the rows below the first 𝑟 rows are 0. In
each non-zero row, the first non-zero entry is the pivot, right, and the pivots come in columns which are in
increasing order. That is all, right. You can understand it that way also, right. So let us pick up one
of them which is not in row echelon form, say.

(Refer Slide Time: 17:52)

Which one? a) is not, right.

(Refer Slide Time: 18:18)

Now here is a proof of that theorem, which is also a process. You see, the point is that it
is a calculation with numbers only; the variables are not important. That means we take the
values of the entries of the matrix, multiply by something, add 1 row to another,
and so on. And with entries like 2/3, 3/2, 4/5 you can do it by hand, or you can use calculators and do it.

But in applications, you will be surprised, the number of equations can run into
thousands. For example, weather prediction: how do they predict the weather? How do they make
a model for the economy? There are a lot of equations coming in, a lot of factors which are affecting the
equations. And more often than not, to solve these equations one makes them linear; that is
called approximation.

So you linearize the equations: you have a model in which a lot of equations are coming
which are not linear, and you linearize it, so you get an approximation, right. Because
when a system is linear, it is easier to solve; we have methods of solving it. So how one
solves a system of linear equations is a very important thing from the applications point of view also.

And the idea is that even if the matrix is of order 1000 × 1000 or 1000 × 2000, we can put it on a
computer and ask the computer or a machine to do the calculations for us and bring it to
row echelon form. But the question is, how does the computer know what to do, right? It only
has numbers. That is why matrices come into the picture: the variables are gone, and only
the entries of the matrix and the column 𝑏1 , 𝑏2 , … , 𝑏𝑚 are important.

And that data we can feed to the computer, right. When you say interchange, we will
see later on how those operations can be done by computers. But we have to give a definite method
of doing it.

There is no unique way of changing a matrix to row
echelon form. So can we give a definite procedure by which we will proceed, so that the computer knows,
we know, everybody knows that this is the way it has been done? That method is also the proof
of the theorem that every matrix can be reduced to row echelon form.

Basic Linear Algebra
Prof. Inder K. Rana
Department of Mathematics
Indian Institute of Technology - Bombay

Lecture – 06
Systems of Linear Equations III

We prove that every matrix has a REF. We are going to do this by induction on the number of rows.
Start with a 1 × 𝑛 matrix: there is only 1 row in the matrix. Can you say it is in row echelon
form? Yes, basically, because if it is non-zero, right, some 0's may come first and then at some
point there will be a non-zero entry, since it is a non-zero matrix. That is fine; there is only 1 row, so nothing to worry about, right.

So a row matrix, a row vector, is in row echelon form if it is non-zero, and by
default it is non-zero, so it is in row echelon form. So the induction starts at 𝑚 = 1. For the induction step,
let us assume the result when the number of rows is some value; when the number of rows becomes 1
more, how do we do it, right? That is the question. So let us look at that.
(Refer Slide Time: 01:31)

So let us say the rows of 𝑀 are 𝑅1 , 𝑅2 , … , 𝑅𝑚 , where 𝑀 is a non-zero matrix. So what does one do?
(Refer Slide Time: 01:41)

Look at the columns of 𝑀. We are given a matrix, right. Look at the first column; possibly
everything is 0 in that column, fine. So what will the machine do? It will scan the first column: no
non-zero entry, go to the second column, and so on, looking for the column where for the first time it spots
a non-zero entry in some place. We do not know where. So let 𝑝1 be the first column that has a non-zero
entry.

First column all 0, second column all 0, and so on; the first column where a non-zero entry appears somewhere is
column number 𝑝1 . That non-zero entry has come at some place in that column, and the row containing it
should be made the first row, because everything to the left of it
is 0. Is it clear what we are doing? Look at column 1. Column 1 is all 0.

Scan column 2: all 0. The first time a column has a non-zero entry is at column 𝑝1 . Now that entry
will come at some place, say at the 𝑖th row. Everything on the left side of it is 0, right.
So what we do is take this row and make it the first row. What happens then? The first non-zero
entry of the first row comes at the place 𝑝1 , and everything to its left is 0, right.

So the first stage is achieved: the first row is non-zero, with its first non-zero entry at the place 𝑝1 , and everything on the left
is 0. At the bottom, we do not know what there is, right. But if there is a non-zero entry coming somewhere
below in that column, we can make it 0 by row operations. So everything below that pivot can be made
0, right. So what have we got? The entry at 𝑝1 is non-zero, everything below it is 0, and everything on the left is
0.

Now start scanning the remaining part of the matrix again. What is the order of the remaining part?
Already 1 row is gone, so it has 1 row less. So by induction, I
have a row echelon form for that, and the proof is over, right. So it is done by scanning the
columns: scan a column, find a non-zero entry, make its row the first row, right. Now go to the next columns.

Spot the next non-zero entry, make its row the second row, and keep making everything below it
0. That is a definite way of telling a machine: proceed this way, work
systematically, and in the end, once everything is scanned, all the columns are
scanned, your matrix will be in row echelon form.

So that is what is written here. It says: look at the columns of 𝑀. Let 𝑝1 be the first column that has a non-zero entry, say at the 𝑖th place. So this is what the 𝑖th row will look like: 0 0 0 and then the first non-zero entry coming here. Now that should be made the first row, right. So change it.
(Refer Slide Time: 05:08)

So change it to the first one. Something is there at the bottom, we do not know; but on the left, everything is 0. There may be non-zero entries here at the bottom, but what can we do? By row operations, with suitable multiplications, I can make these entries 0, right: this entry is non-zero, multiply the row by 1 over that number so it becomes 1, take the negative of the appropriate multiple and add it to the rows below, and I can make this 0, make this 0, this 0, right.

So once a non-zero entry has come at a place, everything below it in that column can be made 0 by elementary row operations. So I will have 0's on the left, a non-zero entry here, and below it all 0's. So what is left? What is left is this part of my matrix, and that has fewer rows, right; the number of rows has reduced, okay. That means, by induction, I can bring that part to row echelon form.

I can go on applying this. I look at this column now; a non-zero entry will possibly come here, and I change it within this part. So I will be looking only at this submatrix now. Look at this column here; if a non-zero entry comes again, I will shift it up, and so on, okay. That is the induction part, okay. So that is how it goes.
(Refer Slide Time: 06:33)

So the second step was: this is non-zero, everything else in that column is made 0, right, which I said. Now look at this remaining part and do the same thing; repeat the process on it. So in computer language, what you will say is a do loop, right. It is a loop: on the part that is already reduced, do nothing, and keep working on the remaining part, and so on.
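The procedure just described is completely mechanical, so it can be handed to a machine as it is. Here is a minimal Python sketch of that column-scanning loop; the function name, the numpy usage and the tolerance are my own choices and not part of the lecture.

import numpy as np

def row_echelon_form(M, tol=1e-12):
    """Reduce a copy of M to a row echelon form by scanning columns:
    find the first column with a non-zero entry below the current row,
    bring that row up, clear everything below the pivot, then repeat
    on the remaining (smaller) part of the matrix."""
    A = M.astype(float).copy()
    m, n = A.shape
    row = 0                                   # next row that needs a pivot
    for col in range(n):
        hits = [i for i in range(row, m) if abs(A[i, col]) > tol]
        if not hits:
            continue                          # column is all zero here; move right
        A[[row, hits[0]]] = A[[hits[0], row]] # interchange the two rows
        for k in range(row + 1, m):           # make everything below the pivot 0
            A[k] -= (A[k, col] / A[row, col]) * A[row]
        row += 1
        if row == m:
            break
    return A

For example, row_echelon_form(np.array([[0., 1., 2.], [2., 1., 0.], [4., 2., 1.]])) returns a matrix whose non-zero rows sit on top with pivots moving to the right, which is exactly the form described above.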
(Refer Slide Time: 06:56)

So this is a definite way of doing things which can be fed to a machine also, right, okay. So how
do we use this row echelon form to solve system of linear equations?
(Refer Slide Time: 07:14)

So we have a given system of linear equations 𝐴𝑋 = 𝑏, right. Then we will form the augmented matrix, because 𝑏 is going to be important. So this is a new matrix which has got 1 more column added to it, and we will reduce it to row echelon form, right.
(Refer Slide Time: 07:39)

So I think let us give the video a break and try to solve this problem now; then we will see the solution. Look at the system whose augmented matrix is given; actually, it is already in row echelon form, so this example is already in that form, right. Normally you would reduce so that the 𝐴 part of the matrix comes into that form; here 𝐴 already is. Let us discuss this example now. So in this matrix [𝐴|𝑏], 𝐴 is already in echelon form. Now we use backward substitution to get the solutions. When a variable is not determined, that is, it does not have a fixed value given by the system, it is a free variable and it can attain any arbitrary value.

(Refer Slide Time: 14:08)

I am just repeating that solution again, right, last equation.
(Refer Slide Time: 14:11)

So that is what the solution.


(Refer Slide Time: 15:13)

So let us do one more illustration so that you understand. So this is a system given, okay. Once
again I already purposefully put in the row echelon form so that we can analyze what is the
backward substitution. Otherwise, you write 𝐴 and 𝑏 and reduce it to row echelon form, you will
get something like this, right. Here it is already done. That [𝐴̃ | 𝑏̃]. So last row does not give you
anything, is 0, forget about it, no information.

So in this system, there is only 1 solution possible, right, which we obtain by back substitution. Here column 1 has 𝑝1 and column 2 has 𝑝2. If the pivots 𝑝1 , 𝑝2 , … , 𝑝𝑟 fill all the 𝑛 columns, then 𝑟 = 𝑛 and each column gets a pivot; that means there will be a unique solution. If each column has a pivot, it is going to be a unique solution, right; that is what is happening here. So these observations we will put together.
(Refer Slide Time: 17:36)

So 𝑟 = 𝑛, the number of variables, right. So 𝑟 = 𝑛 means a unique solution. If 𝑟 < 𝑛, then 𝑛 − 𝑟 variables get independent, arbitrary values, and then infinitely many solutions come, right. If 𝑟 < 𝑛, some columns will not have pivots; those variables are free, and the pivotal variables are determined in terms of them, okay.
(Refer Slide Time: 18:30)

Now let us look at this another one. That means when I reduce the matrix 𝐴, aim is keep 𝐴 and 𝑏
side by side. Aim is to make 𝐴 in the row echelon form. Go on changing 𝑏 accordingly. Do not
bother about it. So it gives you, last row is 0. But here comes a row in 𝐴 which is 0 but on the right
hand side, becomes 3. So what will this equation look like? 0=3, that means there is an absurd
equation.

One of the equations is never going to have a solution. That means the system cannot have a
solution.

So that gives you an inconsistent system. This is the way you will spot an inconsistency. So 3 possibilities arise. One: the number of pivots 𝑟 is strictly less than 𝑛, the number of variables; then 𝑛 − 𝑟 variables will get arbitrary values and the remaining 𝑟 will be determined in terms of these 𝑛 − 𝑟.

Second: each column gets a pivot, 𝑛 = 𝑟; then a unique solution, right. And third: there is a row reading 0 = non-zero, right; in that case, no solution, inconsistent. So 3 possibilities. Again look at the case of 2 variables: 2 lines in the plane. Either they are parallel, so no solution, they do not meet, right; or the 2 lines intersect.

They intersect only at a unique point; or else they are coincident. When the 2 lines are coincident, then in the row echelon form 1 row will be 0 and the other gives a relation between the 2 variables, so 1 variable will give you the value of the other; infinitely many solutions, right. So a perfect similarity comes, okay.
(Refer Slide Time: 21:02)

(Refer Slide Time: 21:27)

So once again, let us summarize this use of the row echelon form for solving a system of linear equations. Look at the system 𝐴𝑋 = 𝑏, an 𝑚 × 𝑛 system, with augmented matrix 𝐴 along with 𝑏; call it 𝑀. Reduce the matrix to row echelon form and look at the non-zero rows: 𝑟 is the number of non-zero rows, and the pivots are coming at the 𝑝𝑖 th places, right.
(Refer Slide Time: 22:14)

Then what? The system is inconsistent if there is a pivot in that column, right, the b column. There
is a non-0. If the pivot comes there, that means all the previous must be 0, right. If in the augmented
matrix, there is a pivot lying in the column where b is coming that means there is a non-0 entry in
that, right. Everything else on the left side must be, what is the definition of a pivot? Everything
on the left is 0. So 0 = non-0 that is inconsistent, system is inconsistent. That is what we saw, right.
(Refer Slide Time: 22:50)

And if 𝑟 < 𝑛, then there are 𝑛 − 𝑟 parameters or variables, whatever you want to call them, which
will determine a general solution, right.

(Refer Slide Time: 23:00)

So what have we done? For a system 𝐴𝑋 = 𝑏, right, we take 𝐴 and 𝑏 and reduce to row echelon form, right. Now reducing a matrix to row echelon form is done by scanning and interchanging or multiplying, right. But how does a computer know? I cannot give a verbal command like "interchange". I have to give it in the form of something computational, you understand what I am saying.

If you want to implement this command that interchange 2 rows. How does the computer
understand that command? It can understand only multiplication, division, addition, arithmetic,
right. So what we are going to do is, interchanging of 2 rows, that is one command. Other is
multiplying a matrix by, row by a non-0 scalar, second. And third is adding, right. Adding one row
to another.

These ones we want to make it executable commands so that a computer can understand, right.
(Refer Slide Time: 26:04)

(Refer Slide Time: 27:32)

(Refer Slide Time: 29:57)

For an elementary row operation to be done on 𝐴, premultiply it with the matrix obtained from the identity matrix by that operation. So what we are saying is: to give this command to a computer, take the identity matrix, okay. Whatever elementary row operation you want to be done on 𝐴, do it on the identity first. You will get a matrix. Store that matrix.

Once you have stored that matrix, multiply the matrix 𝐴 on the left side by that matrix. The
resulting matrix will be doing that operation on 𝐴. So if you want to interchange 2 rows, take
identity matrix, interchange its rows, you will get a matrix. Premultiply 𝐴 with it, the resulting
matrix will be the one where rows of 𝐴 are corresponding rows are interchanged.

If you want to add, do the same thing for identity matrix, add those corresponding rows, identity
matrix, you get a matrix, premultiply 𝐴 with this matrix, the corresponding rows will be added.
And similarly, right, if you want to multiply and add or interchange, same thing you do here and
apply it here. So for all 3 elementary row operations, what are the 3 elementary row operations?

One was interchange of rows. The other was multiplying a row by a scalar. And the third one was multiplying and adding. These 3 operations you can do on the identity matrix and store the results, and whenever you want to do these operations on 𝐴, premultiply 𝐴 with that matrix. So the computer will store those matrices and do those multiplications to carry out those operations, right.

So these matrices are called elementary matrices. What are elementary matrices? These are the
matrices obtained from the corresponding identity matrix by doing elementary row operations on
them of 3 types, right. So there are 3 types of elementary matrices which are used to do elementary
row operations on any given matrix, right. So we will continue study of this elementary matrices
next time.

Basic Linear Algebra
Prof. Inder K. Rana
Department of Mathematics
Indian Institute of Technology - Bombay

Lecture – 07
Reduced Row Echelon Form and Rank of a Matrix I

Okay, so let us begin today's lecture by recalling the main aspects of our row echelon form.
(Refer Slide Time: 00:29)

Because that is one of the important things, let me just recall. Row echelon form of a matrix 𝑀 means that there exists a number r between 1 and the minimum of the number of rows and columns of the matrix (𝑀 is m × n) and there are natural numbers p1, p2, …, pr with the following properties. One, the first r rows of the matrix are non-zero, that means there is at least 1 non-zero entry in each of those rows. And the bottom rows are all 0; the remaining m − r rows are all equal to 0. So that is what r means: the top r rows are non-zero and the remaining bottom ones are 0.
(Refer Slide Time: 01:25)

And the numbers pi indicate the place: in the ith row, at the pith column, the entry is non-zero and it is the first non-zero entry of that row; the property is that these column numbers are increasing, so p1 is strictly to the left of p2 and so on. These entries are called pivots, as we saw.
(Refer Slide Time: 01:51)

So this is what the matrix looks like. A matrix in row echelon form will have some 0's in some columns; the first non-zero entry of the first row comes at the place a1p1, and the next one, for the second row, comes at a2p2 where p2 is strictly bigger than p1, that means it is on the right side of p1, and so on. And the bottom m − r rows are 0, right.
(Refer Slide Time: 02:21)

So we stated a theorem that every non-zero m × n matrix can be reduced to a row echelon form.
(Refer Slide Time: 02:31)

So let me just go through one example again, how it is done, so that it is clear to everybody. We are given this matrix A. The process is: we scan the columns of the matrix, starting with the first column, okay. In the first column, the first non-zero entry does not come in the first row. So the operation should be: interchange R1 with R2, so that the first entry in the first row becomes non-zero.

So let us do that. Once we interchange the second row with the first row, this 2 goes up: 2 1 0 ... becomes the first row and the row starting 0 0 0 goes down. So the first entry in the first row has become non-zero, and the first row is a non-zero row, okay. Now once again, what we do is: in that column, everything else should be made 0.

So this is what the pivot is. Below the pivot, everything else should be made as 0. So you can
multiply this row R1 with 3, that will give you 6 here. Subtract it from the third row, that will make
this column, this entry 6 as 0. So multiply the first row with -3 and add it to the third row, it will
make it 0 and similarly for the next one.
(Refer Slide Time: 03:57)

So for that, the operations are, this is how we write it: R3 goes to R3 − 2R1 and R4 goes to R4 − (3/2)R1. That will make everything below the pivot in that column equal to 0 by elementary row operations. Once the first column is done, we look at the second column. In this column now, the first row is already taken care of, so we should look at the submatrix below it, okay. And in this, here the number 0 is coming, right.

So the first non-0 entry comes at the place -1 which is row number 3. So what we should be doing?
Interchange R3 with R2. So let us do that. So once you do that, you will get -1 up, -2 up and 3 up.
So this will become. So you will get a pivot in row 2 also. On the left, it is 0. That does not change
at all. And now once again, the operation should be make everything below the pivot to be equal
to 0 by row operations. So this already is 0. So multiply this by 1/2 and add, so make that as 0.

(Refer Slide Time: 05:11)

So once you do that, below the pivot -1, everything becomes 0. Now go on doing this. So go to the
next one. So next one again, this row is equal to 0. So you should interchange these 2, right. So
interchange R4 with R3 and so that will give you, so interchange will give you this, okay. So now
bottom row is all 0's.

So next column, we do not have to go to the next column at all now, right. Because we have gotten
the pivots 2 -1 3/2, okay. This will not make it a pivot, right. Even if you look at the last column,
this will not be pivot because the first non-0 entry in that row has already come, okay. So this
operation is over. So that means, okay.
(Refer Slide Time: 06:04)

So last time we started looking at how to make these row operations executable via matrices. Because to put it on the machine, we cannot give an oral or written command "interchange R1 with R3"; we have to give it an executable command. That is done by matrix multiplications. So we are describing now the elementary matrices which are going to do the same operations: interchange, add one row to another, or multiply a row by a scalar.

So to do that, we start with an identity matrix of order m × m. What do you do? On this identity matrix, do the operation that you want to do. You want to interchange the ith row with the jth row? Then in the identity matrix, interchange its ith row with the jth row; that will give you a new matrix, the identity matrix will change, right. The changed matrix is called Eij, okay.

Or if you want to multiply the ith row by scalar alpha, right, then what you do? Take identity
matrix, multiply its ith row by alpha, you will get the matrix E alpha i, right. So this will be in the
ith column, ith row. It was 1 here in the identity, now I multiply it with alpha, right. So this is the
second type of operation, E alpha I, okay.
(Refer Slide Time: 07:41)

Similarly, you can do the next one. That is multiplying a row by a scalar and adding it to another
row, right. So that also can be done. What does this, so this is for the E interchanging of i and j.
(Refer Slide Time: 07:58)

So take the ith row. What does this matrix mean? The lower index i means: take the ith row and to it add alpha times the jth row. So the identity matrix will change, right. That matrix is denoted by Ei+alpha j. So the operation goes: take row i, to it add alpha times row j. So you get 3 types of elementary matrices, okay. These are called elementary matrices. What is the first one? Eij. What does Eij mean?

Take the identity matrix and interchange its ith row with the jth row. So Eij simply means: interchange the ith row with the jth row. E alpha i: multiply the ith row by the scalar alpha, and that alpha should be non-zero, right; we want non-zero. And the third one: take the jth row, multiply it by alpha and add it to the ith row of the identity matrix, right. So this will change the identity matrix into the third type of matrix.

So these 3 are called elementary matrices. Why are they important? Because if in a matrix A I want to interchange its ith row with its jth row, the result is obtained by premultiplying the matrix A by the corresponding elementary matrix. Suppose we have got a matrix A which is m × n and you want to interchange its ith row with the jth row. Take the corresponding m × m matrix Eij and premultiply A with it; the resulting matrix is A with its ith and jth rows interchanged, right. One can write down the proof of this.
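As a small illustration of this, here is a Python sketch that builds the 3 types of elementary matrices from the identity and premultiplies a matrix with them. The 3 × 4 matrix below is hypothetical (the lecture's example is on the slide), and the code counts rows from 0 whereas the lecture counts them from 1.

import numpy as np

def E_swap(m, i, j):
    """From the m x m identity, interchange rows i and j."""
    E = np.eye(m)
    E[[i, j]] = E[[j, i]]
    return E

def E_scale(m, i, alpha):
    """From the identity, multiply row i by the non-zero scalar alpha."""
    E = np.eye(m)
    E[i, i] = alpha
    return E

def E_add(m, i, j, alpha):
    """From the identity, add alpha times row j to row i."""
    E = np.eye(m)
    E[i, j] = alpha
    return E

A = np.array([[0., 1., 2., 3.],
              [2., 1., 0., 1.],
              [3., 0., 6., 2.]])     # hypothetical 3 x 4 matrix

print(E_swap(3, 0, 2) @ A)           # rows 1 and 3 of A interchanged
print(E_add(3, 2, 0, -3.0) @ A)      # row 3 replaced by (row 3) - 3*(row 1)
print(E_scale(3, 1, 5.0) @ A)        # row 2 multiplied by 5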
(Refer Slide Time: 09:44)

So let us just look at examples of what these mean in the 3 × 3 case. What is E12? Interchanging row 1 of the identity matrix with row 2, right. In the identity matrix there is a 1 in the left-most corner, 1 0 0; that has gone to row 2, and row 2 has come up. So this is E12. Similarly E3+c·2: that means, to the third row, add the second row multiplied by c.

So when you multiply the second row by c and add it here, this entry which was 0 earlier now becomes c, right. And E lambda i, what does that mean? E lambda 1 means multiplying the first row by the scalar lambda. So these are illustrations of the 3 elementary matrices, right, okay.
(Refer Slide Time: 10:43)

So the theorem, of which we will not write down a formal proof; one can simply multiply and look at the ijth entry of each matrix, so it is only a matter of writing. We will see how to use it. It says: take the product of Ejk with A, okay. A is a matrix, it is m × n, m rows and n columns. So what should be the order of the matrix Ejk you take? m × m, because you are going to premultiply, right.

A is m*n, so you should be taking the identity matrix m*m, changing, doing the operation there
and premultiply. So take the matrix Ejk which is, right, m*m, multiply it with A. So what will be
your resulting matrix? Eij is m*m. A is m*n. So the product will be m*n matrix. And will be same
as, as if you have interchanged the jth and the kth row of A, right.

So is it clear? Doing a row operation on a matrix is premultiplying by the elementary matrix of appropriate order, the one obtained by doing that operation on the identity matrix. Similarly, what will Ei+alpha j do? It will take the jth row of A, multiply it by alpha and add it to the ith row of A, right. I think this is clearly written. So this is that type, and similarly the lambda one.
(Refer Slide Time: 12:22)

Let us see some examples of this. Let us take the matrix A, okay. This is the matrix and we want
to interchange its first row with its third row. So what I should be doing? This matrix is of order
of 3*4. So I should be looking at it elementary matrix of order 3*3 which is obtained from identity
matrix of order. So A is a matrix, we want to interchange its first row with the third row. Pick up
the corresponding elementary matrix, right.

So what should be the corresponding elementary matrix? It should be 1/3, right. That means in the
identity matrix of order 3*3, 3 is the number of rows, it was interchanged, first with the third one.
So when I interchange. This is what, E13 will look like, the elementary matrix. And now what
should I do? Take this matrix, premultiplying A with this matrix.

So let us do. So E13*A, so that is E13, that is A. Now if you multiply, right. So let us, this is 0,
this is 6 and this is 3. So 3 will come here, right. Matrix multiplication. You can check that. Clear
to everybody? Right? So pickup the corresponding elementary matrix, corresponding to the
elementary row operation that you want to do and premultiply, okay. So that is.
(Refer Slide Time: 13:49)

So the resulting matrix will be transformed accordingly. Now, the theorem said that every matrix can be reduced to row echelon form by row operations, right. In terms of matrices, what will that mean? It will mean that if A is an m × n matrix, then there exist elementary matrices, some number of them, E1, E2, …, EN, all square matrices of order m, such that premultiplying A with them carries out those row operations via matrix multiplication.

The resulting product will be in row echelon form, right. So we have just restated, in terms of elementary matrices, the theorem that every matrix can be reduced to row echelon form by elementary row operations. Now let us observe one thing. Once you have reduced a matrix to row echelon form, what does the row echelon form mean? The first non-zero entries of the various rows come in increasing column order, right.

But it does not say what is that non-0 entry. It says it is non-0. The pivot p1 for the first row is a
non-0 number, right. Let us make it 1. How do I make it 1? I will multiply that row by 1/that entry,
right. So by elementary row operation, the pivot value can be changed from the arbitrary number
whatever it is in the matrix to the number 1 by elementary row operation again. Once the pivot is
1, below the pivot, everything is 0 in the row echelon form.

That is what a pivot means. The only possibility is that above it there could be a non-zero entry sitting, right. But if I do elementary row operations again, I can make those entries 0. Because on the left of the pivot everything is 0, when I do those row operations nothing is going to change on the left of it in the rows above the pivot. Clear to everybody? And with suitable multiplications and adding or subtracting of rows, the elementary row operations, I can make everything above the pivot also equal to 0.

So this is a special form of row echelon form. Once you have gotten the row echelon form, I can
make the pivot value equal to 1 by elementary row operations and every other entry in that column
equal to 0. Below the pivot, everything is 0 already anyway. Above it also, you can make it 0 by
elementary row operations, by multiplying with suitable constant and adding in the rows above,
right.

So that is a special form. So we write this as that the row echelon form of the matrix can be further
changed, transformed by elementary row operations so that each pivot becomes 1 and all the entries
above each pivot also vanishes. Vanish means what? They become 0, okay. One should,
appropriate word should be they are made 0. So in the column where pivot is coming, right, above
the pivot everything is 0, below the pivot everything is 0, right.

So this form of the matrix is called the reduced row echelon form of the matrix. So that means
what? Every matrix by elementary row operations can be reduced to row echelon form as well as
reduced to, can be changed to reduced row echelon form where the pivots are 1 and every other
entry in that pivot column is equal to 0. We will see advantage of this form of the matrix, okay.
The advantage, one advantage is, we said that the row echelon form of matrix need not be unique,
right.

It only says where are the pivots coming and what order they are coming. They are coming like
columns are increasing order. Does it say anything more than that? For example, if a row is
multiplied with something, if a matrix is in row echelon form and you multiply a row with
something, it still remains in the row echelon form. It does not change, right. If you multiply it,
any row by a non-0 scalar, numbers will change but the positions and the ordering will not change
of the pivots, right.

But the reduced row echelon form of a matrix is always unique. Once you have done this,
normalizing, that means every pivot is made as 1 and every other entry is made 0, one can prove
the theorem that the reduced row echelon form of a matrix is unique. There is only 1 possibility,
right. Unique meaning what? Entries are same, whichever way you do it, does not matter,
whichever you do it, okay, right.
(Refer Slide Time: 18:55)

So this is what reduced row echelon form will look like. Just I picture for general one. There may
be some columns which are all 0, does not matter. The first row will have a non-0 entry at a place
p1, the first pivot is 1 and coming at a place p1. There may be something in between, 0 or non-0,
we do not bother about that. But the non-0 entry in the second row is again 1, comes at the column
number p2, right.

So p2 is on the right side of p1 and in that column, everything is 0. Around the left of that also,
everything is 0 because that is a pivot. On the left of the pivot, everything, right, in that column,
every row below that, it should be 0. So this is all 0. In the pivot column, 0 1 0, this is a second
row and goes so on. At the bottom, there will be some rows which are all 0 in the row echelon
form.

So what we are saying is: the pivots in the reduced row echelon form are in increasing order, all the pivots are 1, here is a pivot 1, here is a pivot 1, here is a pivot 1, and everything else in each pivot column is equal to 0. That is the general form of the reduced row echelon form of a matrix. So the theorem we had was that every matrix can be reduced to it.
(Refer Slide Time: 20:24)

So let us look at an example. Remember just now we looked at the matrix is beginning of the
lecture and said this is the echelon form of the matrix. So echelon form, here is the pivot 2, here is
the pivot -1, here is the pivot 3/2, right. There are 3 pivots, p1 column, p2 p3, okay. This is the
entry. These are the pivots. I want to change it to reduced row echelon form. So what should be
the first step?

This pivot should be made as 1. In the first row, the pivot entry is 2, that should be made as 1. So
how do I make it 1? Divide the first row by 2, right. So how the operation should be? Divide the
first row by 2, so that will make it 1. And then how do I make this as 1? Divide it by -1 and this
one, divide it by 3/2. So all the pivots will become 1. Once the pivots have become 1, I can use
that to make this entry 0, above the pivot.

This entry is 1, right. I can make that entry as 0 by doing a row operation in the first one and the
second one. That will not change this entry, right. That will stay as it is. So on the left side, the
entries will not change. So this pivot will remain as 1 by that row operation when I tried to make
this entry, the second entry in the row 1 equal to 0. I will, as such I can do it here itself also. I can
just add this row to this.

That will make it 0, right. So the way of attaining the reduced row echelon form is not unique, but the form itself is unique. You can first make every other entry in the column 0 and then make the pivot 1, or make the pivot 1 first and then make every other entry in that column equal to 0. Once you do that, this is what it will become. I have written those operations here, but you should try to do it yourself, okay.

I might have made an arithmetical mistake, that is a possibility, but the entire idea should be: this should be 1. The first pivot, which was 2 earlier, should be 1 now. Here it was -1, that is 1 again. And 3/2, that has been made 1, okay. But these 2 entries are still not 0, right. So what should I do? Multiply this row by -2 and add it here, then this will become 0. So I will have the reduced row echelon form.

These will be some numbers; we do not bother about them, right, that is a non-pivotal column. In the reduced row echelon form, what matters is the pivotal columns, right: the pivotal columns are in increasing order, that is one thing, and second, each pivot is 1 and every other entry in that pivot column is equal to 0. Everything else is irrelevant to us, right. And that form is unique. Clear to everybody how to reduce a matrix?
(Refer Slide Time: 23:36)

So once you do those operations, this will become 1, and the entries which remain on the side, whether 0 or non-zero, we do not bother about them. So with this I have brought the matrix to reduced row echelon form; this is the reduced row echelon form of the matrix 𝐴, right. So is it clear what we are doing?

You can reduce a matrix to a row echelon form by elementary row operations, and you can go on further refining that row echelon form, changing it to what is called the reduced row echelon form, again by elementary row operations only, right. The pivots are made 1 and every other entry in each pivot column is made equal to 0, right. It is not the order of the operations that matters; it is the property to be finally achieved, right. That is the reduced row echelon form of the matrix.
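As a rough illustration, here is a Python sketch that carries a matrix all the way to the reduced row echelon form by exactly these steps: bring a non-zero entry up to the pivot position, divide so the pivot becomes 1, and clear every other entry in the pivot column, above as well as below. The sample matrix is hypothetical, since the matrix discussed above lives on the slide.

import numpy as np

def rref(M, tol=1e-12):
    """Return the (unique) reduced row echelon form of M."""
    A = M.astype(float).copy()
    m, n = A.shape
    row = 0
    for col in range(n):
        hits = [i for i in range(row, m) if abs(A[i, col]) > tol]
        if not hits:
            continue
        A[[row, hits[0]]] = A[[hits[0], row]]   # bring a non-zero entry up
        A[row] /= A[row, col]                   # make the pivot equal to 1
        for k in range(m):
            if k != row:
                A[k] -= A[k, col] * A[row]      # clear the column above and below
        row += 1
        if row == m:
            break
    return A

A = np.array([[0., 2., 1., 0.],
              [2., 1., 0., 1.],
              [3., 0., 6., 2.]])                # hypothetical example
print(rref(A))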

Basic Linear Algebra
Prof. Inder K. Rana
Department of Mathematics
Indian Institute of Technology- Bombay

Lecture – 08
Reduced Row Echelon Form and Rank II

(Refer Slide Time: 00:35)

How to get row Echelon form, how to get reduced echelon form, right and we said that these row
operations can be achieved by elementary matrices, pick up a suitable order elementary matrix and
pre multiply 𝐴 with that matrix, we will get that operation, okay, right.
(Refer Slide Time: 00:47)

So, we can write another theorem; let us specialise this to a square matrix. Suppose you have got a square matrix, 𝑛 × 𝑛, okay, and you look at its reduced row echelon form. What are the possibilities? The number of columns = number of rows, so one possibility is that the number of pivots 𝑟 = 𝑛, right; that means every column will have a pivot, right. There are 2 possibilities for an 𝑛 × 𝑛 matrix if you look at its reduced row echelon form; so what are they?

One: the number of pivots = number of rows = number of columns; that means each row and each column will have a pivot, right, and every other entry in each pivot column is 0. So what does that matrix look like? That is precisely the identity matrix. So one possibility for a square matrix, when it is transformed to reduced row echelon form, is that the form is the identity matrix; that is possibility 1.

Or, what is the other possibility? The number of pivots is 𝑟 < 𝑛, right; one case was = 𝑛, the other is < 𝑛. What does that mean? It means at least one row at the bottom must be 0: 𝑟 cannot equal 𝑛 in this case, it can be at most 𝑛 − 1, so there are at most 𝑛 − 1 pivots, which means at least one bottom row should be = 0, right. So there are 2 possibilities for a square matrix in reduced row echelon form: either the reduced row echelon form is the identity, or there is at least one row at the bottom which is all 0, right, okay.

The proof is more or less obvious, because once we do those operations it only depends on the number of pivots: if the number of pivots is, say, 𝑝 < 𝑛, then the 𝑛 − 𝑝 rows at the bottom will be all 0, right, and you can see that the portion of the matrix where the pivots are coming will be an identity block, a 𝑝 × 𝑝 identity, okay, right.
(Refer Slide Time: 03:15)

So now, how is that useful? We are going to give you an application of it. Let us start by defining what is called the invertibility of a matrix; you all must be knowing that already. Invertibility is defined only for square matrices; it is something to do with multiplication, like for numbers: every number which is non-zero, say 𝛼, has a multiplicative inverse 1/𝛼; the product gives you 1, right.

So, in matrices, for addition we already have an identity, right: to an 𝑚 × 𝑛 matrix 𝐴 you add the 𝑚 × 𝑛 matrix which is all 0's and that does not change the matrix, right; so that is the additive identity for matrices of order 𝑚 × 𝑛.

But matrix multiplication is not commutative: 𝐴𝐵 need not be equal to 𝐵𝐴. First of all, 𝐴𝐵 may be defined and 𝐵𝐴 may not be defined for rectangular matrices, and even if both are defined they may not be equal, right. So to say that something like an inverse exists, we put the condition that 𝐴 is a square matrix, of any order, and we say 𝐴 is invertible if there is a matrix 𝐵 such that 𝐴𝐵 and 𝐵𝐴 are both defined.

They will be defined if 𝐵 is also of the same order as the matrix 𝐴, right, and both should be equal to the identity matrix: 𝐴𝐵, multiplying 𝐴 on the right side by 𝐵, and 𝐵𝐴, multiplying 𝐴 on the left side by 𝐵; both products should be equal, and equal to the identity matrix. In that case we will say this matrix 𝐴 is invertible, right, and this matrix 𝐵 we call the inverse of 𝐴, 𝐵 = 𝐴−1.

But one can show very easily that if one such matrix exists, it is unique; that means if there exists a matrix 𝐵 with this property, then there is only one, there cannot be some other matrix as well. So if 𝐴 inverse exists, it is unique, and we call it the inverse of the matrix. This I will leave for you to verify, right, that it is unique: that means, if 𝐴𝐵 = 𝐵𝐴 = 𝐼 and also 𝐴𝐶 = 𝐶𝐴 = 𝐼, then 𝐵 = 𝐶.

Next, if 𝐴 and 𝐵 are invertible (these are the properties of invertible matrices), if 𝐴 and 𝐵 are invertible matrices and you are able to multiply them, which means they are of the same order, then 𝐴𝐵 is also invertible: the product of invertible matrices is again invertible, and the inverse of the product is the product of the inverses with the order interchanged. So (𝐴𝐵)−1 is 𝐵−1 𝐴−1; that helps you to compute inverses of products of invertible matrices.

These also you should verify. Now let us look at the elementary matrices, these 3 elementary matrices. What were they? We took an identity matrix, right; for example, we interchanged the 𝑖th row with the 𝑗th row and got an elementary matrix. Is that matrix invertible? Intuitively it should be, because I can reverse that operation, right; and what is that operation?

Multiplying on the left by some matrix, right, so I can invert it: if I have interchanged, I can interchange again, and what will I get back? The identity, right. So elementary matrices correspond to reversible operations: interchanging 𝑖 with 𝑗 is reversed by interchanging again; adding a multiple of the 𝑗th row to the 𝑖th is reversed by subtracting it back, okay; and multiplying a row by a non-zero scalar, which is why we said non-zero.

There are 2 reasons for non-zero: one, if an equation were multiplied by 0, that equation would become 0 and data would be lost; and another reason appears now: if you multiply a row by a non-zero scalar to get that elementary matrix, then you can multiply by 1 over that scalar and get back the identity matrix, right. So all 3 types of elementary matrices are invertible; they are special matrices, they are all invertible, right.

(Refer Slide Time: 09:37)

The theorem says a square matrix 𝐴 is invertible if and only if it is a product of elementary matrices; this is a necessary and sufficient condition. If a matrix can be written as a product of elementary matrices, then obviously, since each elementary matrix is invertible and a product of invertible matrices is invertible, it is invertible.

So one way is obvious: if a matrix is a product of elementary matrices, it is invertible, right, because each elementary matrix is invertible and a product of invertible matrices is invertible, so 𝐴 is invertible. So one way is obvious; let us look at that direction first. Suppose 𝐴 is a product of elementary matrices; this is the product, then what is the inverse?

It is precisely this matrix, you can see that, right. Not only does it tell you that 𝐴 is invertible, it even tells you what the inverse is. So it gives 2 things: if 𝐴 is a product of elementary matrices, it is invertible, and it tells you how to compute the inverse, provided you can write down the matrix as a product of elementary matrices, right. So one way is obvious. Let us look at the other way round; what is the converse? Suppose 𝐴 is invertible; what do we want to prove?

That 𝐴 is a product of elementary matrices. What is the definition of invertibility? It says 𝐴 is invertible if there is a matrix 𝐵 such that 𝐴𝐵 = 𝐵𝐴 = 𝐼, okay. Now, let us take this matrix 𝐴 which is given to us and look at its reduced row echelon form. How do you get that reduced row echelon form? By premultiplying by some elementary matrices.
(Refer Slide Time: 13:50)

(Refer Slide Time: 17:49)

So let us go ahead and look at some more examples. We proved that 𝐴 is invertible if and only if 𝐴 is a product of elementary matrices, right. So let us start with some matrix 𝐴 which is 𝑛 × 𝑛. What do we want to do? We want to check: is 𝐴 invertible, that is one question we want to answer; and second, if yes, how to find its inverse? So, given a matrix 𝐴, we want to check whether it is invertible or not, right.

And whether we can find its inverse at all. So what should be the strategy? 𝐴 is invertible if and only if it is a product of elementary matrices, right, and in that case we saw that its reduced row echelon form should be the identity matrix. So the idea should be: take the matrix 𝐴 and find its reduced row echelon form; if that is the identity, it is invertible. That is one part, right.
(Refer Slide Time: 20:00)

Now, I also want to compute 𝐴−1, right; so how will I compute 𝐴−1? I am just revising what we proved in the theorem: if 𝐴 is the matrix and 𝐴̃ = 𝐸𝑁 𝐸𝑁−1 ⋯ 𝐸1 𝐴 is its reduced row echelon form, and that is the identity, then 𝐴 is invertible, as we already said.

And since this product is the identity, what is 𝐴−1 then? We just saw it: 𝐴−1 is 𝐸𝑁 𝐸𝑁−1 ⋯ 𝐸1, these elementary matrices multiplied together, right.
(Refer Slide Time: 22:03)

So, let us take the matrix 𝐴, I am just writing randomly some example, 1 2 3, -1 0 1, 1 0 2, let us
write this matrix, okay, it has to be a square matrix, if I want to analyse its whether it is invertible
or not, so it is 3 cross 3, this is the matrix which is 3 cross 3, I want to check whether it is invertible
or not and can I find its inverse, right, so what we do is; because the same operations I will like to
do it on identity also.

So we write a new matrix which is 𝐴 along with the identity: we write 1 2 3, −1 0 1, 1 0 2, and what is the identity matrix of the same order? 1 0 0, 0 1 0, 0 0 1, right. So that is this matrix. What is my aim? Look at the part which is 𝐴, okay; on that part I should do row operations and reduce it to reduced row echelon form, but I should keep doing the same operations on the identity part, on the bigger matrix, so that I keep track of what is happening.

(Refer Slide Time: 25:31)

(Refer Slide Time: 27:29)

(Refer Slide Time: 28:16)

(Refer Slide Time: 29:37)

So, implies 𝐴 invertible and 𝐴−1 is as given in the slide.

So, what you do; take the matrix 𝐴 along with it, write identity matrix on the same order, right, so
we will get a bigger matrix, those number of rows will be same but number of columns will be
double right, now go on doing elementary row operations on 𝐴 part of the matrix, idea should be
to reduce it to reduce row echelon form but do the same on the bigger matrix itself right not only
on part 𝐴, on the part of identity also, so that will transform identity matrix into something.

If in the last step the reduced row echelon form of 𝐴 is the identity, you can say that 𝐴 is invertible; if it is not, then 𝐴 is non-invertible, right. And if it is the identity, then the matrix into which the identity part has been transformed is the inverse of the matrix, right. So writing elementary row operations as matrix multiplications gives us the benefit that we can put it on a computer and ask the computer to check whether the matrix is invertible or not.

And to find the inverse for me, right. Everything is matrix multiplication: whatever elementary row operation I am doing can be written as a matrix multiplication, and eventually the identity matrix is changed into something which gives me the inverse of the matrix. So we are checking and finding the inverse both at the same time, right.

Basic Linear Algebra
Prof. Inder K. Rana
Department of Mathematics
Indian Institute of Technology- Bombay

Lecture – 09
Reduced Row Echelon Form and Rank of a Matrix - III

(Refer Slide Time: 00:30)

So, let us look at; so let us a let 𝐴 be 𝑚 × 𝑛, 𝐴̃ its row echelon, let us write; so we will write now
onward REF that is for row echelon form and RREF will be reduced REF, now REF is already
defined, so it is a new command, okay, RREF. So, let us look at 𝐴̃ to be the reduced row echelon
form, so it is characterize; what are the characteristics; what are the characteristics of this?

There is a number 𝑟, okay; the matrix is 𝑚 × 𝑛 and 𝑟 is between 1 and the minimum of 𝑚 and 𝑛, as before, okay; and there are numbers 𝑝1 , 𝑝2 , … , 𝑝𝑟 , as many of them as 𝑟, okay.
(Refer Slide Time: 02:03)

So there exist these such that the matrix looks like rows 𝑅1 , … , 𝑅𝑟 followed by rows of all 0's; the matrix is 𝑚 × 𝑛, so how many zero rows are there? The bottom 𝑚 − 𝑟 rows are all 0. What does row 1 look like? It looks like 0 … 0 1 and then something, right, and similarly the others. So let us write 𝑅𝑖: in the 𝑝𝑖 th column the pivot is 1, right, this being the reduced row echelon form, and in the column 𝐶𝑝𝑖 all entries are 0 except the one in the 𝑖th row, okay; that is, the 𝑗th entry of that column is 0 for every 𝑗 ≠ 𝑖.
(Refer Slide Time: 03:53)

The column looks like this: this is the column 𝐶𝑝𝑖, it is 0 here, then 1, then 0's, and what is that place? It is the 𝑖th entry of the column, in the 𝑖th row. We are describing the non-zero rows, so the 𝑖th row is non-zero, its first non-zero entry is 1, and everything else in that column is equal to 0.
(Refer Slide Time: 04:29)

So, let us give this 𝑟 a name; let us put down a definition: the number 𝑟 is called the rank of the matrix. What is 𝑟? It is the number of non-zero rows: 𝑟 rows are non-zero in the reduced row echelon form of the matrix, right, and that number 𝑟 is called the rank of the matrix. Now let us try to apply this to a system of equations: what does it mean in the new terminology, right?

So, consider an 𝑚 × 𝑛 system of linear equations, that is 𝐴𝑋 = 𝑏; and what is 𝑏? 𝑏 is the vector of constants coming on the right-hand side.
(Refer Slide Time: 06:49)

𝐴𝑋 = 𝑏 that is the matrix notation for a system of linear equations, right and how did we said we
will see whether a solution exist or not, so we said we will look at the matrix [𝐴 | 𝑏] that is
augmented matrix, we said we look at that right and reduce 𝐴 to row echelon form or reduced row
echelon form, okay, so when you do that this will become 𝐴̃ and this will become 𝑏̃, where 𝐴̃ is
the reduced row echelon form of the matrix, right.

So, 𝐴̃ reduced row echelon form of 𝐴, okay and we said the system is inconsistent if what happens;
if there is a row in the part of 𝐴, which is 0 and there is a nonzero constant coming in 𝑏, right, now
in terms of; if I look at the bigger matrix [𝐴 | 𝑏], 𝐴 has already been reduced to row echelon form
or reduced row echelon form and 𝑏 has changed to something, so if I look at the bigger matrix
[𝐴̃| 𝑏̃] compared to [𝐴 | 𝑏], then the rank, the number of nonzero rows in 𝐴 will be strictly less
than the number of nonzero rows in the bigger one.

(Refer Slide Time: 09:26)

Because we have to give it a machine, we cannot say randomly you do compute, whichever you
like, whatever arbitrary value you like, so for a definiteness we say that whenever the number of
solutions is infinite, right that means, there are 𝑛 − 𝑟 variables which are going to get arbitrary
values, r is the rank of the matrix 𝐴, 𝑛 is the number of variables, so 𝑛 − 𝑟 variables are going to
get arbitrary values.

So we give those arbitrary values to the non-pivotal variables, and the pivotal variables are computed in terms of the non-pivotal variables to get all possible solutions. Later on we will see that this also has another use: when we want to describe the solution space, we want to describe the infinitely many solutions in terms of some finite number of them, so we will look at what is called a basis of the solution space, and this rule will be useful there. We will see examples of this next time, okay, right.

Basic Linear Algebra
Prof. Inder K. Rana
Department of Mathematics
Indian Institute of Technology- Bombay

Lecture – 10
Solvability of a Linear System, Linear Span, Basis I

(Refer Slide Time: 00:33)

Let us just recall, what we have done in the previous lecture, we have started looking at what is
called the rank of a matrix, so we defined for the matrix 𝐴 which is 𝑚 × 𝑛, look at its row echelon
form particularly you can also look at the reduced row echelon form and the rank of the matrix is
defined as the number of nonzero rows in that matrix or the number of pivots, both are same.

So we had mentioned that the row echelon form of a matrix may not be unique, but the number 𝑟, the number of non-zero rows, is the same in all row echelon forms, okay; and the first non-zero entry of a row was called a pivot, so the number of pivots is also fixed once you have a row echelon form. So the rank is defined to be the number of non-zero rows in the row echelon form, or in the reduced row echelon form.

(Refer Slide Time: 01:52)

And we stated the theorem about solving a system of equations in terms of this new terminology, rank. It says the following. First of all, the solution set is non-empty, that means the system is consistent, if and only if the rank of 𝐴 is equal to the rank of the augmented matrix; the 2 ranks are the same. That means it should not happen that a row of 𝐴 has no pivot but the corresponding row of the augmented matrix has one, right.

That would give you an inconsistent equation, 0 = something non-zero. So existence holds if and only if the rank of 𝐴 is the same as the rank of the augmented matrix; that means the number of non-zero rows in 𝐴 and the number of non-zero rows in the augmented matrix should be the same, okay, once you have them in row echelon form. Next, when is a unique solution possible? The system is 𝑚 × 𝑛, and 𝑛 is the number of variables.

So, if there are n pivots exactly, right that means each variable will get a unique value, so the
system has unique solution if and only if the rank of 𝐴 = rank of [𝐴 | 𝑏] = 𝑛 that is consistency and
then there is a unique solution and non-unique means, actually there are infinite number of
solutions.

The number of pivots is strictly less than the number of variables, okay; then it says there are an infinite number of solutions, and we said that 𝑛 − 𝑟 (where 𝑟 is the rank), that many variables will get arbitrary values; they are free to take any values, right, and the remaining ones are then determined.

If it is less than 𝑛, there are infinitely many solutions, and all those solutions can be obtained by backward substitution, right: giving the non-pivotal variables arbitrary values and calculating the pivotal variables in terms of those arbitrary values, right.
(Refer Slide Time: 04:29)

So, let us write a; so the I think proof we have already gone through, the existence and; so let us
keep the proof, right.
(Refer Slide Time: 04:34)

(Refer Slide Time: 04:39)

Uniqueness also we have gone through, so I am just summarising now. If the rank of 𝐴 is 𝑟 and the rank of [𝐴 | 𝑏] is 𝑟 + 1 (there is one more column in the augmented matrix), that is, the rank of 𝐴 is 𝑟 but the rank of the augmented matrix is 𝑟 + 1, then there is no solution; it is an inconsistent system, and the set of all solutions is the empty set. If both ranks are 𝑟 and 𝑟 is strictly less than 𝑛, the number of variables, then there are infinitely many solutions, right.

And if both are equal to n, the rank of 𝐴 is same as rank of augmented matrix, which is equal to 𝑛,
then the solution set will be a single point that means there is a unique solution, right, this cannot
happen, 𝑟 cannot be strictly bigger than 𝑛, right, okay the rank of the matrix cannot be bigger than
number of rows or number of columns, so this possibility does not arise, so these are the only 3
possibilities which can arise.
(Refer Slide Time: 05:44)

And as a consequence of this, there is a simple application which we have already discussed: a square matrix 𝐴, 𝑛 × 𝑛, is invertible if and only if 𝐴𝑋 = 0 has only the trivial solution, right; there should not be any nontrivial solution. Is it clear? Because for 𝐴𝑋 = 0, the vector 0 is always a solution, right, and if 𝐴 is invertible that means only one solution is possible, so 0 should be the only one. Is it clear to everybody?

𝐴𝑋 = 0 always has 0 as a solution: if we put 𝑋 = 0, that is a solution, right, and when 𝐴 is invertible only one solution is possible. Invertible also means that the reduced row echelon form is the identity matrix, right; that is also the same as saying the rank = 𝑛; and we also showed that this is the same as saying 𝐴 is a product of elementary matrices, right. 𝐴 is invertible if and only if it is a product of elementary matrices; we showed that.

The reduced row echelon form should be the identity matrix, right; invertibility is the same as saying that rank = 𝑛, the number of non-zero rows must be the same as the number of variables, which equals the number of rows and the number of columns of the square matrix; and the same as being a product of elementary matrices. So there are various ways of saying when a matrix is invertible, right; all are the same, just different ways of saying the same thing, okay.
(Refer Slide Time: 07:57)

So let us look at that. The previous discussion said that the system 𝐴𝑋 = 0 seems to be an important one to consider; such a system is called a homogeneous system. A system of linear equations where the right-hand side vector 𝑏 is 0 is called a homogeneous system, right. We want to relate it to the general one: given any linear system 𝐴𝑋 = 𝑏, if you put 𝑏 = 0, you get a homogeneous system, right.

So, given a system of linear equations 𝐴𝑋 = 𝑏, if you put 𝑏 = 0, that is called the homogeneous system associated with the given system of linear equations, right. What is the advantage of doing that and what is the use of it? The observation is that the solutions of the general system are related to the solutions of the corresponding homogeneous system; in what way, we will describe now.

First of all let us note that the homogeneous system always has a solution, right: for 𝐴𝑋 = 0, the vector 0 is always a solution, okay. So let us consider the set of all solutions of the homogeneous system; let us call it the null space. Given an 𝑚 × 𝑛 matrix 𝐴, look at all possible vectors 𝑋 with 𝐴𝑋 = 0. What will be the order of the vector 𝑋? You have to multiply 𝐴 with 𝑋, and 𝐴 is 𝑚 × 𝑛.

So the vector 𝑋 has to be 𝑛 × 1, right; so you are looking at all vectors 𝑋 ∈ ℝ𝑛 such that 𝐴𝑋 = 0. Look at that set, the collection of all vectors which are killed by 𝐴, you can think of it that way; that is called the null space of the matrix 𝐴, right, and it is the same as the solution space of the homogeneous system 𝐴𝑋 = 0. As a set, we call it the null space, denoted by 𝑁(𝐴); that is the solution space of the homogeneous system 𝐴𝑋 = 0 for a given matrix 𝐴.

So for every matrix we are associating a space with it, a set with it: the null space, which is the solution set of the homogeneous system, okay. What is the advantage of that? The advantage is this: we want to solve the system 𝐴𝑋 = 𝑏, right, and the theorem says that if you find at least one solution of the system 𝐴𝑋 = 𝑏 and add to it all solutions of the homogeneous system, you get all solutions of the system.

So finding all solutions of the non-homogeneous, or general, system means: find at least one solution, find all solutions of the homogeneous case, and add them to it one by one; you get all of them, right. The solution space of 𝐴𝑋 = 𝑏 is the set of all 𝑋0 + 𝑋, where 𝑋 is in the null space, that means 𝐴𝑋 = 0, and 𝑋0 is some particular solution. If I am able to find at least one solution of the system 𝐴𝑋 = 𝑏 and add to it a solution of the homogeneous system, I get a solution of the general system, right.

So, what it says is that the solution of the homogeneous system is important thing to consider using
that we can describe all solutions of the system 𝐴𝑋 = 𝑏 provided we can find the null space, the
solution space of homogeneous system plus one particular solution, right.
(Refer Slide Time: 12:21)

So, we will concentrate on this, so questions are; how to list all solutions of a homogeneous
system? So, the problem; we are reducing the problem to the homogeneous case in a sense and
what is the best economical way of telling what are all the solutions, you can say this is the
solutions, right but can I give a method of generating all the solutions of the homogeneous system,
right what is the most economical way?

So, let us observe something that is 𝑋1 , 𝑋2 are solutions of the homogeneous system, 𝐴𝑋 = 0,
then 𝛼 𝑋1 + 𝛽 𝑋2 also is a solution, if 𝑋1 and 𝑋2 are the solutions of the homogeneous system that
means 𝐴𝑋1 = 0, 𝐴𝑋2 = 0, then the claim is 𝛼 𝑋1 + 𝛽 𝑋2, where 𝛼 and 𝛽 are any scalars, is also
a solution of the homogeneous system because we multiply by 𝐴, then what is a product?

𝐴 multiplied by 𝛼 𝑋1 + 𝛽 𝑋2 is the same as 𝛼 𝐴𝑋1 + 𝛽 𝐴𝑋2; why? By the distributive property of matrix multiplication, right. But 𝐴𝑋1 = 0 and 𝐴𝑋2 = 0, so 𝐴(𝛼 𝑋1 + 𝛽 𝑋2) = 0, and that again is also a solution, right.
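A quick numerical check of this closure property, on a hypothetical matrix and two of its homogeneous solutions.

import numpy as np

A = np.array([[1., 2., 1.],
              [2., 4., 2.]])
X1 = np.array([-2., 1., 0.])          # A @ X1 = 0
X2 = np.array([-1., 0., 1.])          # A @ X2 = 0
alpha, beta = 3.0, -7.0

print(np.allclose(A @ X1, 0), np.allclose(A @ X2, 0))   # True True
print(np.allclose(A @ (alpha * X1 + beta * X2), 0))     # True: the combination stays in the null space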
(Refer Slide Time: 14:05)

We say that a subset 𝑉 of ℝ𝑛 is called a vector space if it has that property: take any 2 elements 𝑣, 𝑤 in 𝑉, take any 2 scalars 𝑎, 𝑏 and form 𝑎𝑣 + 𝑏𝑤, which is called a linear combination; that also should belong to 𝑉, right.

So, if the set 𝑉 has this property, then we say it is a vector space.

So, the null space is a subset of ℝ𝑛 and it is a vector space.


(Refer Slide Time: 18:50)

(Refer Slide Time: 21:24)

So, visualising; so what is 𝐿, so that is {𝛼 𝑋0 | 𝛼 ∈ ℝ}, let us try to draw a picture of it, so what is
the picture; 𝑋0 is the vector, right, so I can draw that vector, okay, let us draw it from here, so this
is a vector, okay, so that is my vector 𝑋0, what is 𝛼 𝑋0, what I am doing? I am taking a fix vector
and only stretching it, right, so when I stretch it, its length will change only, right, so this will give
me, right, the vector could be anywhere here from here to here or here to here or here to depending
on whether it is positive or negative.

So, what is this; this is a line through the origin right in ℝ2 , is that okay, right, so this is a line
through the origin that is a vector space, okay.

Basic Linear Algebra
Prof. Inder K. Rana
Department of Mathematics
Indian Institute of Technology- Bombay

Lecture – 11
Solvability of a Linear System, Linear Span, Basis II

Can you think of some other subspace in ℝ2? Every line through the origin is a subspace; this was one of them. I fixed 𝑋0, and 𝑋0 could be anything, right, so there are many such possibilities: lines passing through the origin in ℝ2 are vector spaces in ℝ2 , right. Any other example? I am just looking at ℝ2 , a simple case you can visualise.
(Refer Slide Time: 01:02)

So, let us look at another example, let us look at the zero vector alone, so that is 𝑉, is that a
vector space? Only one element is there, what all you multiply, it will always be 0 and it will
be staying inside, so is a subspace, so oh sorry is a vector space.
(Refer Slide Time: 01:37)

Let us look at the whole plane as a third example, 𝑉 = ℝ2 ; the whole space ℝ2 is itself a vector
space, right.

(Refer Slide Time: 04:20)

For example, let us look at this; this is an interesting example to consider, so look at a line not
passing through origin, so 𝐿 is the proper subset of ℝ2 , a line not through origin, so question
is; is it a vector space, so that is the question, so is it a vector space, so how do you describe a
line as a set of vectors, so I can write this line 𝐿 as; I want to write set of vectors in ℝ2 , so how
were been writing the vectors?

Normally, points in ℝ2 are written as ordered pairs (𝑥, 𝑦); you can also write them as matrices. If
they have to lie on the line, then 𝑦 = 𝑚𝑥 + 𝑐 for some 𝑚, 𝑐, and if you want the line not to pass
through the origin, then 𝑐 should not be equal to 0, right. So all points in the plane which lie on the
line are given by these 2 constants 𝑚, 𝑐.

So, I want to check, right, so whether if I take 2 points here, whether the combination, right, so
as vectors what you will write; (𝑥1 , 𝑦1 ), other vector would be (𝑥2 , 𝑦2 ) on the line, so let belong
to L and let us take 𝛼, 𝛽 ∈ ℝ, then 𝛼(𝑥1 , 𝑦1 ) + 𝛽(𝑥2 , 𝑦2 ), I want to check whether it belongs to
𝐿 or not, what is that equal to; by our vector addition that is 𝛼 𝑥1 + 𝛽 𝑥2 and here it will be
𝛼 𝑦1 + 𝛽 𝑦2 , right, vector addition.
(Refer Slide Time: 06:59)

The 𝑦 component should be 𝑚𝑥 + 𝑐, right.

(Refer Slide Time: 10:20)

(Refer Slide Time: 11:07)

So, what does that mean? What is the conclusion? The conclusion is that if I take
𝐿 = {(𝑥, 𝑦) | 𝑦 = 𝑚𝑥 + 𝑐}, 𝑐 ≠ 0, then 𝐿 is not a vector space, because if I take 2 elements,
take scalar multiples of them and add, the result is not again in the same set, right. So that means in the
plane every line through the origin is a vector space, and every line which is not through the origin is
not a vector space, right.
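
To see the failure concretely (a quick check, with 𝛼 = 𝛽 = 1): the points (0, 𝑐) and (1, 𝑚 + 𝑐) both lie on 𝐿, but their sum is (1, 𝑚 + 2𝑐), and since 𝑐 ≠ 0 we have 𝑚 ⋅ 1 + 2𝑐 ≠ 𝑚 ⋅ 1 + 𝑐, so the sum does not satisfy 𝑦 = 𝑚𝑥 + 𝑐 and is not on 𝐿.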

So, the question still remains, can you say these are the only subspaces or these are the only
vector spaces in ℝ2 , I cannot answer that question at present but that it is a true statement, so
we will show later, right these are the only vector spaces in ℝ2 either it should be a trivial that
is 0 or it should be the whole space ℝ2 or it should be a line passing through the origin, at
present we have only shown these are vector spaces and no line not through a origin is a vector
space.

But we will show later that these are the only ones, okay. So we have the lines passing through
the origin, with any slope, as vector spaces.
(Refer Slide Time: 12:40)

Now, this is an observation we said if a line does not pass through origin that is not a vector
space in fact that is a property of every vector space that means if 𝑉 is a vector space in ℝ𝑛 ,
then the 0 vector should be inside it, why, why should 0 vector be inside every vector space,
so what is the definition of a vector space? It says 𝛼 𝑣1 + 𝛽 𝑣2 should belong to 𝑉 for every
𝑣1 , 𝑣2 ∈ 𝑉 and 𝛼, 𝛽 ∈ ℝ.

But nothing stops us from taking 𝛼 = 0 and 𝛽 = 0; that means the 0 vector should be in 𝑉. So
it is a necessary condition for a subset 𝑉 of ℝ𝑛 to be a vector space that 0 must
belong to it.

Otherwise, it cannot be a vector space. So that is what happened in the plane: if a line is not
passing through the origin, then the zero vector is not part of it, so it cannot be a vector space,
and that is what we checked just now, right, okay.
(Refer Slide Time: 14:21)

So, let us start with the some finite number of vectors in ℝ𝑛 , 𝑐1 𝑣1 + 𝑐2 𝑣2 + ⋯ + 𝑐𝑘 𝑣𝑘 , so
such a vector will call as a finite linear combination of 𝑣1 , 𝑣2 , … , 𝑣𝑘 , just a name.

Why finite? Because the number of vectors is finite, you can just call it as a linear combination,
okay, but to stress that we are taking only finite number of them; we will get a finite linear
combination of these vectors. So, now comes a general definition, okay, so let us look at a set;
the symbol has not come out 𝑆 as a subset of ℝ𝑛 , look at a subset 𝑆 of ℝ𝑛 , now what I can do?

I can pick up any finite number of vectors from 𝑆 and form a linear combination of those
vectors, right, so I choose any vectors 𝑣1 , 𝑣2 , … , 𝑣𝑘 ∈ 𝑆 and look at scalars 𝑐1 , 𝑐2 , … , 𝑐𝑘 ∈ ℝ
and form a linear combination. The set consisting of all these linear combinations called that
as 𝐿(𝑆).

So, what is 𝐿(𝑆)? 𝐿(𝑆) is a set of all finite linear combinations of vectors taken from 𝑆, right,
is it clear, this 𝑘 is not fixed, you can pick up only one vector, you can pick any 2, any 3, any
4, any finite number of them, form a linear combination by picking vectors finite number of
them picking finite number of scalars making a linear combination. For all such possible linear
combinations, right and call it 𝐿(𝑆).

The theorem is 𝐿(𝑆) is always a vector space as 𝑆 may be any set but 𝐿(𝑆) is always a vector
space, so let us check that why it is always a vector space, it is a simple proof but let us check
it, okay, right.
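
In brief, the check goes like this: take 𝑢 = 𝑐1 𝑣1 + ⋯ + 𝑐𝑘 𝑣𝑘 and 𝑤 = 𝑑1 𝑤1 + ⋯ + 𝑑𝑚 𝑤𝑚 in 𝐿(𝑆), with all the 𝑣𝑖 and 𝑤𝑗 taken from 𝑆, and scalars 𝑎, 𝑏 ∈ ℝ. Then 𝑎𝑢 + 𝑏𝑤 = (𝑎𝑐1 )𝑣1 + ⋯ + (𝑎𝑐𝑘 )𝑣𝑘 + (𝑏𝑑1 )𝑤1 + ⋯ + (𝑏𝑑𝑚 )𝑤𝑚 , which is again a finite linear combination of vectors from 𝑆, hence again in 𝐿(𝑆).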
(Refer Slide Time: 18:04)

(Refer Slide Time: 20:45)

Hence 𝐿(𝑆) is a vector space.

Basic Linear Algebra
Prof. Inder K. Rana
Department of Mathematics
Indian Institute of Technology- Bombay

Lecture – 12
Solvability of a Linear System, Linear Span, Basis III

(Refer Slide Time: 00:37)

So, given a set 𝑆 contained in ℝ𝑛 , 𝐿(𝑆) is a vector space, right. Let us check for an example.

(Refer Slide Time: 03:28)

(Refer Slide Time: 05:13)

(Refer Slide Time: 07:47)

So, 𝐿(𝑆), we want to find out what it is. We see that 𝑣3 = 𝑣1 + 𝑣2 , I do not need 𝑣3 because 𝑣3
itself is a combination of 𝑣1 and 𝑣2 , right.
(Refer Slide Time: 09:10)

So, let us look at 𝛼 𝑣1 + 𝛽 𝑣2 + 𝛾 𝑣3 ∈ 𝐿(𝑆), so what is that equal to, it is equal to (𝛼 + 𝛾)𝑣1 +
(𝛽 + 𝛾)𝑣2 .

So, what does that imply? It implies that the linear span of 𝑆, which was {𝑣1 , 𝑣2 , 𝑣3 }, 3 vectors, is the same
as the linear span of the first 2 only; that means 𝑣3 was redundant in finding the linear
span of the set 𝑆, because one of them was already a linear combination of the others. That means,
given a set of vectors whose linear span you take, probably some of the vectors
can be cut down, removed, right.

You can get a smaller subset of the original set which will span the same thing, and probably you
can go on cutting. Can you remove everything? No, you cannot remove everything; there has to be
something left. So that means: given a set 𝑆 and its linear span, possibly you can have a subset of it
which gives the same span, but there should be a minimal one, a smallest subset which still gives
me that span, right.

And the idea is that what we have done in this example we should try to do every
time, right; so that motivates the definition of what is called a basis of a vector space.
(Refer Slide Time: 11:15)

So, let us define: let 𝑉 contained in ℝ𝑛 be a vector space and 𝐵 a subset
of 𝑉 such that 𝐿(𝐵) = 𝑉. We say 𝐵 is a basis of 𝑉 if there is no proper subset 𝐵0 of 𝐵 such that
𝐿(𝐵0 ) = 𝐿(𝐵) = 𝑉.

From 𝐵, I should not be able to cut down something and make it smaller and still getting the all of
𝐿(𝐵) that means no proper subset of 𝐵 will span the vector space 𝐿(𝐵).

(Refer Slide Time: 15:16)

So, examples, I think best is plane itself, so let us look at 𝑉 = ℝ2 .
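
For instance, to have one concrete case in front of us: take 𝐵 = {(1, 0), (0, 1)}. Every (𝑥, 𝑦) ∈ ℝ2 equals 𝑥(1, 0) + 𝑦(0, 1), so 𝐿(𝐵) = ℝ2 , and neither vector alone spans the plane, for example (1, 0) alone gives only the 𝑥-axis; so no proper subset of 𝐵 spans ℝ2 , and hence 𝐵 is a basis of ℝ2 .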
(Refer Slide Time: 17:29)

(Refer Slide Time: 19:22)

(Refer Slide Time: 21:41)

(Refer Slide Time: 23:15)

The basis for a vector space need not be unique; you can have more than one basis for a vector
space, but the defining property should stay: it should span and it should be the smallest spanning set.
In fact, in ℝ2 , if you take any 2 lines which are intersecting, not parallel and not coincident, then a
nonzero vector along each of them will form a basis. That is something similar to what is done, though
not usually said this way, in books or in school: when you have a coordinate system, you always take the
𝑥-axis and the 𝑦-axis, and you say everything else can be obtained by moving parallel to the 𝑥-axis and
then going parallel to the 𝑦-axis to reach any point, so any vector is a linear combination determined by
its 𝑥 coordinate and 𝑦 coordinate.

But why should I take them to be perpendicular? I can take any 2 such lines; the only change is that I should
be moving parallel to these lines now, right. That is what a coordinate system is; it need not always be
perpendicular, and it still gives a basis. The advantage of the perpendicular one is that, because of the
Pythagoras theorem, computation becomes simpler.
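
For example, a quick illustration not taken from the slides: 𝐵 = {(1, 0), (1, 1)} is also a basis of ℝ2 , since (𝑥, 𝑦) = (𝑥 − 𝑦)(1, 0) + 𝑦(1, 1) for every (𝑥, 𝑦), and neither of the two vectors alone spans the plane.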

But as far as a basis is concerned, any 2 lines which are non-parallel, which is basically the same as saying
that they intersect, will do: any 2 intersecting lines will give you a basis, by taking a vector along one
and a vector along the other, okay. So basis we have defined. What we have done today is: we looked
at a system of linear equations, 𝐴𝑋 = 𝑏, and we looked at the solution set of the homogeneous system.

And said that solution of the general system can be obtained from homogeneous system right, by
taking a particular one and adding it to the solution of the homogeneous one, so solution of
homogeneous becomes important and solutions of the homogeneous system we can call it as the
null space and that has a property of being a; that motivated us to define what is the vector space,
right, so null space has the property, if we take 2 vectors and take linear combination that is again
in the null space, right.

So, we defined what is called a vector space, we looked at various examples, and what we are
looking at is how to generate a vector space as linear combinations of a subset of it; that brings us
to the notion of basis. A basis is a set of vectors, linear combinations of whose elements give you the
vector space, and it is the smallest such set; nothing smaller will do, right. So next lecture, we will look
at how to find a basis of a given vector space, okay.

Basic Linear Algebra
Prof. Inder K. Rana
Department of Mathematics
Indian Institute of Technology – Bombay

Lecture - 13
Linear Span, Linear Independence and Basis - I

So let us begin our today’s lecture. We will just recall what we have done earlier. We have
looked at what is called the linear span of set of vectors. So given a subset 𝑆 of ℝ𝑛 , you define
what is called the linear span 𝐿(𝑆).

(Refer Slide Time: 00:48)

For a vector space 𝑉 in ℝ𝑛 , a set 𝐵 is a basis of 𝑉 if 𝐿(𝐵) = 𝑉 and there is no proper subset 𝐵0
of 𝐵 such that 𝐿(𝐵0 ) = 𝐿(𝐵) = 𝑉.

(Refer Slide Time: 02:15)

So a basis of a vector space need not be unique. That is what we said: given a vector space,
there could be more than one set which forms a basis. So the questions arise: does every vector
space have a basis? How to find a basis of a given vector space? And what is the relation between
any two bases of the vector space? So given a vector space, the first question we would like to answer
is: does it always have a basis?

Second, how to find a basis of a given vector space, and third, what is the relation between any
two bases of the vector space. So let us observe one thing to start with: if a set 𝐵 is a basis of a vector
space, then 0 cannot belong to it. If 0 is there, it is not going to contribute anything, so we will
remove it; a basis is a minimal set.

So if 𝐵 is a set of vectors which is a basis and contains 0, then we can delete 0, because from
any vectors we can take the linear combination with all scalars 0, and that already gives 0. So the linear
span will always contain 0; hence if 0 is part of a basis, the set is not minimal, and we can always
remove 0 to make it minimal, and the definition of basis is a minimal set. So 0 will never
belong to a basis of a vector space, okay.
(Refer Slide Time: 04:00)

So we want to show that every vector space has a basis. So the idea is following. So let us look
at the proof of this.
(Refer Slide Time: 04:13)

So the theorem is: if 𝑉 is a nonzero vector space, then there exists a basis for 𝑉, right. So
let us look at a proof. 𝑉 is nonzero and 𝑉 is contained in ℝ𝑛 . We start with
a set of generators: let 𝐵 contained in 𝑉 be a finite set such that the linear span of 𝐵 is
equal to 𝑉. We have got a set of generators and we want to check whether it is a basis or
not. So what are the possibilities?

So one possibility, case I, is that 𝐵 is minimal; what does 𝐵 minimal mean? There does not
exist any proper subset 𝐵0 such that the linear span of 𝐵0 is equal to 𝑉. Then what will happen? That
means this set itself is a basis, so 𝐵 is a basis. So what is case II?
Case I was that it is minimal, so the second possibility is that it is not minimal. So suppose 𝐵 is
not minimal; what does that mean?

If it is not minimal that means at least there is one vector right which you can remove. So let
𝑣 ∈ 𝐵 such that 𝐿(𝐵 ∖ 𝑣) = 𝐿(𝐵) = 𝑉.

So look at 𝐵 ∖ 𝑣; this vector 𝑣 was not required. Again there are two possibilities: either this 𝐵 ∖ 𝑣 is
a basis, once we have removed that one vector which was not required, so one possibility is that
after removing one vector it becomes a basis; if not, then again there exists
some 𝑢 belonging to 𝐵 ∖ 𝑣 which can be removed, like we removed the earlier vector.

So there is again the possibility that there is a vector which we can remove. If you go on
doing this process, what will happen? The set 𝐵 is a finite set. We removed one vector,
we removed another vector, and so on; you cannot remove everything, so eventually the process stops:
at worst there will be one vector left at the last stage, and that
single vector will have to generate everything, right.
(Refer Slide Time: 08:41)

So after finite number of steps, so let us write after finite number of steps, we will get set 𝐵0
which is contained in 𝐵 such that 𝐿(𝐵0 ) = 𝑉 and 𝐵0 is minimal. That is 𝐵0 is a basis. So what
are we saying? We are saying that we have got a vector space which is generated by some set
of vectors 𝐵 that may not be minimal; I can make it minimal by removing one at a time which
are vectors which are not required.

(Refer Slide Time: 09:57)

(Refer Slide Time: 10:45)

(Refer Slide Time: 11:24)

So let us give some equivalent ways of describing what is the basis. So 𝑉 is a vector space
which is of course nonzero and 𝐵 is a set which is {𝑣1 , 𝑣2 , … , 𝑣𝑘 } contained in 𝑉. So we want
to give different ways of looking at when the set 𝐵 can be a basis. We have already defined
one definition as minimal set of generators. So the first one is 𝐵 is the basis that is it is the
minimal set of generators.

And we are going to show it is equivalent to saying that every element v of the vector space 𝑉
has a unique representation as a linear combination of elements of 𝐵. So let us prove that. Let
us prove that the I and II are equivalent.
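
In outline, the argument runs as follows (a compact sketch). (I ⟹ II): since 𝐿(𝐵) = 𝑉, every 𝑣 ∈ 𝑉 has at least one representation. If some 𝑣 had two different representations, say ∑ 𝑎𝑖 𝑣𝑖 = ∑ 𝑏𝑖 𝑣𝑖 with 𝑎𝑗 ≠ 𝑏𝑗 for some 𝑗, then subtracting lets us express 𝑣𝑗 as a linear combination of the remaining 𝑣𝑖 , so 𝐿(𝐵 ∖ {𝑣𝑗 }) = 𝐿(𝐵) = 𝑉, contradicting minimality. (II ⟹ I): if some proper subset 𝐵0 of 𝐵 had 𝐿(𝐵0 ) = 𝑉, pick 𝑣𝑗 ∈ 𝐵 ∖ 𝐵0 ; then 𝑣𝑗 = 1 ⋅ 𝑣𝑗 is one representation and the expression of 𝑣𝑗 in terms of 𝐵0 is another, different one, contradicting uniqueness.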
(Refer Slide Time: 12:18)

(Refer Slide Time: 14:38)

(Refer Slide Time: 17:45)

(Refer Slide Time: 21:43)

This proves I is equivalent to II, I implies II and II implies I. So that was the statement we
wanted to show here.
(Refer Slide Time: 22:25)

So in the theorem, the first statement is that 𝐵 is a basis, that is, a minimal set of generators, and we have
shown it is equivalent to saying that every element has a representation as a linear combination of
elements of 𝐵 and that representation is unique; there is only one way of doing it.

Basic Linear Algebra
Prof. Inder K. Rana
Department of Mathematics
Indian Institute of Technology – Bombay

Lecture - 14
Linear Span, Linear Independence and Basis - II

(Refer Slide Time: 00:39)

(Refer Slide Time: 01:56)

(Refer Slide Time: 03:45)

(Refer Slide Time: 07:34)

(Refer Slide Time: 08:30)

(Refer Slide Time: 10:33)

You are given a vector space and a finite set 𝐵. We want to know when this
finite set is a basis. One way: it is a basis means it is a minimal set of generators. Second:
every element in the vector space has a unique representation in terms of elements of 𝐵, that
means every element in 𝑉 is a unique linear combination.

And the third says: every element is a linear combination and 0 has a unique representation. So
they are saying that every element having a unique representation is equivalent to saying that 0 has a
unique representation, and in terms of maximality and minimality it says 𝐵 is a maximal subset such
that 0 has a unique representation. So these are 4 equivalent ways of describing what is called a basis. All
will be useful in constructing bases for vector spaces, and we will see how these are useful.

Basic Linear Algebra
Prof. Inder K. Rana
Department of Mathematics
Indian Institute of Technology – Bombay

Lecture - 15
Linear Span, Linear Independence and Basis - III

(Refer Slide Time: 00:26)

So a set of vectors is called linearly independent if a linear combination of them equal to 0
implies that all the scalars are equal to 0, right. And if they are not, what is the opposite of it?
That means there is a linear combination which is 0 but not all the scalars are 0; a set with that
property is called linearly dependent, right. So the opposite of independent we call dependent.
Independent means 0 has a unique representation, and dependent will mean that 0 does not.

That means ∑𝑖 𝛼𝑖 𝑣𝑖 = 0 for not all 𝛼𝑖 = 0, such a relation should be possible. Let us look at
some examples.
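
A small illustration before the worked examples: in ℝ2 , the set {(1, 0), (0, 1)} is linearly independent, since 𝛼(1, 0) + 𝛽(0, 1) = (𝛼, 𝛽) = (0, 0) forces 𝛼 = 𝛽 = 0; while {(1, 2), (2, 4)} is linearly dependent, since 2(1, 2) − 1(2, 4) = (0, 0) with not all scalars zero.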
(Refer Slide Time: 01:57)

(Refer Slide Time: 02:41)

(Refer Slide Time: 04:24)

If the number of components is strictly less than the number of vectors, they are going to be
linearly dependent vectors always. Can you see the reason why it should be? Let us just write
and see whether why that should be true.

(Refer Slide Time: 06:45)

Let 𝑣1 , … , 𝑣𝑘 ∈ ℝ𝑛 such that 𝑛 < 𝑘. Suppose we write 𝑣𝑖 = [𝑎1𝑖 𝑎2𝑖 ⋯ 𝑎𝑛𝑖 ]𝑇 . Then the
linear combination 𝛼1 𝑣1 + ⋯ + 𝛼𝑘 𝑣𝑘 = 0 gives the system 𝐴𝑋 = 0, where 𝐴 = [𝑎𝑖𝑗 ]𝑛×𝑘 and
𝑋 = [𝛼𝑗 ]𝑘×1 . So this vector equation is the same as that system. Now how many equations are
here? 𝑛 equations in 𝑘 variables.

(Refer Slide Time: 08:26)

For this system, the rank should be at most the minimum of 𝑛 and 𝑘, and 𝑛 < 𝑘, so the rank is
less than 𝑘.

So the system of equations is consistent, in fact with an infinite number of solutions, right.

That means if I have got more vectors than the number of components, then that set is always
linearly dependent, and that is what is happening in this case: the vectors have 2 components and the
number of vectors is 3, so the set has to be linearly dependent, and we verified that a nonzero relation
actually exists here. But the general fact is quite useful later on: if I am given a set of vectors, first look
at the number of components and the number of vectors. If the number of vectors is more than the
number of components, the set is automatically linearly dependent.
(Refer Slide Time: 12:25)

So let me recapture all this in this slide. In the first item, 𝐵 is a linearly independent set and its span is
𝑉; the extra clause shown there is redundant and should not be there. Item-wise you should read it as: 𝐵 is a
maximal linearly independent set, or 𝐵 is a minimal spanning set, minimal, not independent again; this
part is wrongly typed on the slide. Is it clear?

Anyway, a wrong thing also helps you to understand what the right thing should be. So the first
is: 𝐵 is linearly independent and spans 𝑉, that is 1. Second, 𝐵 is a maximal linearly independent set, and
the third is that it is a minimal set which generates; you do not have to put linearly independent there,
okay.
(Refer Slide Time: 13:25)

So now, we said a vector space can have more than one basis; we saw that. So
let us construct one more example to show that for a vector space, okay.
(Refer Slide Time: 13:42)

So let us look at so consider 𝑉 = ℝ2 .
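
For instance, the kind of example to keep in mind (the vectors on the slide may differ): 𝐵1 = {(1, 0), (0, 1)} and 𝐵2 = {(1, 1), (1, −1)} are both bases of ℝ2 . For 𝐵2 , every (𝑥, 𝑦) can be written as ((𝑥 + 𝑦)/2)(1, 1) + ((𝑥 − 𝑦)/2)(1, −1), and neither vector of 𝐵2 alone spans the plane.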


(Refer Slide Time: 15:09)

(Refer Slide Time: 16:08)

(Refer Slide Time: 17:18)

(Refer Slide Time: 18:18)

So given a vector space there could be more than one basis of that vector space possible. So
that is what the remark says.
(Refer Slide Time: 20:25)

A vector space can have more than one basis; however, a theorem in our subject says that any
two bases of the vector space will have the same number of elements. The number of elements in
two different bases has to be the same. For instance, in ℝ2 we said we
have 𝐵1 and 𝐵2 , and both had two vectors. One of the vectors alone will not be able to generate
everything; let us see why.

Let us see 𝐵1. If I remove one of them, what do I get? Only (1, 0), and what will it generate? Linear
combinations of the form (𝑥, 0), so only the 𝑥-axis. The other one alone will give only the 𝑦-axis. So I
cannot make it smaller anyway. If I make it bigger, if I add a third element to it, then it becomes
linearly dependent, independence is gone; just now we saw that 3 vectors of 2 components
each will be dependent, so it cannot be that either, right.

Similarly in the other basis 𝐵2 : neither of its vectors alone generates everything.

So we will say the dimension of ℝ2 as a vector space is 2, and you can define it for anything now.
For any vector space 𝑉, a subset of ℝ𝑛 which is a vector space, the number of elements in a basis is
a unique number, and that number is called the dimension of the vector space, right. So first, we showed
every vector space has a basis, right.

And secondly, there is this theorem, which we are not proving, that the number of elements in a basis
is unique; that means any two bases will have the same number of elements, and we are
assuming that. As a consequence of it, to every vector space you can associate a unique number,
the number of elements in any basis of it, which exists, and that is called the dimension of it.

In some sense, that is the size you can think of it the size of right. So let us look at some more
examples. So let us look at this. Consider the system 𝑥 + 𝑦 − 𝑧 = 0.
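
In outline, the computation for this example goes like this: from 𝑥 + 𝑦 − 𝑧 = 0 we get 𝑥 = −𝑦 + 𝑧, with 𝑦 and 𝑧 as the free variables, so every solution is (−𝑦 + 𝑧, 𝑦, 𝑧) = 𝑦(−1, 1, 0) + 𝑧(1, 0, 1). Giving the free variables the special values (𝑦, 𝑧) = (1, 0) and (0, 1) produces the two vectors (−1, 1, 0) and (1, 0, 1), which are linearly independent and span the solution space, so they form a basis and the solution space has dimension 2.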
(Refer Slide Time: 23:40)

(Refer Slide Time: 25:04)

(Refer Slide Time: 26:27)

(Refer Slide Time: 27:47)

(Refer Slide Time: 28:31)

(Refer Slide Time: 29:36)

(Refer Slide Time: 29:57)

(Refer Slide Time: 30:09)

So what we have done is given a system of equations which had infinite number of solution,
there were two free variables, we put them special values and got a basis for the solution space
that means the solution space for this particular homogeneous system is written as a linear
combination of only two vectors right. So that is the advantage of basis. So we will continue
this in the next lecture.

Basic Linear Algebra
Prof. Inder K. Rana
Department of Mathematics
Indian Institute of Technology – Bombay

Lecture - 16
Row Space, Column Space, Rank-Nullity Theorem - I

Okay so welcome to this today’s lecture. We will start recalling what we have done earlier.
(Refer Slide Time: 00:34)

So we defined the notion of a basis of a vector space 𝑉 in ℝ𝑛 and proved the following facts
about the basis. 𝐵 is a linearly independent set and generates 𝑉. That means every vector in 𝑉
is a linear combination of elements of 𝐵 and it is a linearly independent set or equivalently 𝐵
is a maximal linearly independent set. So you can replace this condition that it spans by the
maximal property.

And it is equivalent to saying it is a linearly independent set such that 𝐿(𝐵) of that, that is same
as the earlier one or it is a minimal set of generators that is another property that we proved last
time. We also said will assume this theorem that any two basis of a vector space have the same
number of elements. So that gave rise to the definition of dimension of a vector space. So
dimension is a number of elements in any basis of a vector space.

There can be different bases, but the number of elements in each basis will be the same for a given
vector space, and that common number is what we called the dimension.
(Refer Slide Time: 01:51)

Some useful properties of independent and dependent sets. If a set 𝑆 is linearly dependent and 𝑆 is a
subset of 𝑇, then the bigger set 𝑇 is also linearly dependent. What does dependent mean here? There is a
linear combination of elements of 𝑆 which is 0 but not all coefficients are 0; but elements of 𝑆 are also
elements of 𝑇, so that means there is a linear combination of elements of 𝑇 which is 0 but at least one of
the coefficients is not 0.

So 𝑇 is also dependent, so if inside a given set you find a subset which is linearly dependent,
then the bigger set itself is linearly dependent and equivalent way of saying the same thing
would be if 𝑇 is linearly independent the bigger set is linearly independent, then every subset
also is linearly independent because the property of independence is that if a linear combination
is 0, all the scalars must be 0 right.

If it is true for elements of 𝑇, then obviously it is true for elements of the subset also. Another
way of stating linear dependence, and this is actually equivalent to the definition, is that at least one
element of 𝑆 is a linear combination of the remaining ones. And there is another property that we
proved, namely: suppose you have got 𝑚 vectors in ℝ𝑛 .

You have got 𝑚 vectors in ℝ𝑛 and the number of vectors is bigger than the number of
components then that is always linearly dependent. So we observed that okay. So in ℝ2 , we
have got 3 elements, 3 vectors, they have to be linearly dependent right, they cannot be
independent, so that property we saw.
(Refer Slide Time: 03:58)

Let us look at some vector spaces associated with the matrix. So let us take a matrix 𝐴 and
these are the row vectors 𝑅1 , 𝑅2, … , 𝑅𝑚 , there are m rows and each row is a right vector of
length n okay, so this is the m rows and n columns. Each row vector has got n components
right, so this is m x n matrix. So these vectors 𝑅1 , 𝑅2 , … , 𝑅𝑚 if you put them together in a set
and generate a vector space, so that means a linear span of the row vectors 𝑅1 , 𝑅2 , … , 𝑅𝑚 is
called the row space of the matrix 𝐴.

So given a matrix, look at the row vectors; there are 𝑚 of them, and each vector has 𝑛 components.
So if you generate a subspace from them, that will be a subspace of ℝ𝑛 : the row vectors generate a
subspace, and that is called the row space of the matrix. That vector space has got a dimension, and
that dimension is called the row-rank of the matrix 𝐴.

So what is a row-rank? Now we are defining a new term, earlier we had the notion of rank. So
what was the rank? In the row echelon form, the number of nonzero rows right that was, now
we are defining something new but will show it is actually equivalent to what the earlier
definition is that in terms of vector spaces look at the row space that is the space span by the
row vectors, look at its dimension and that dimension is called the row-rank of the matrix.
(Refer Slide Time: 06:03)

And similarly we will have for the columns, we shall take the matrix and write the column
vectors 𝐶1 , 𝐶2 , … , 𝐶𝑛 , so these are the column vectors right. There are 𝑛 of them, 𝑚 × 𝑛 matrix,
so there are how many columns are there, 𝑛 of them, each column has got 𝑚 entries right. So
this is you can write a matrix in the column form, so this we call the column form of a matrix,
matrix with only columns are written down.

So this is a column vectors, so look at the space span by the column vectors, each column
vector is a vector in ℝ𝑚 right, m components are there, so you will get a subspace of or you
get a vector space which is inside ℝ𝑚 , so that is called the column space. So column space is
a vector space in ℝ𝑚 and the row space is the vector space in ℝ𝑛 okay. So these are the two
subspaces and they play some important role. We will see what are the roles they play.
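
A small illustration, not the slide's example: for the 2 × 3 matrix 𝐴 with rows (1, 2, 3) and (2, 4, 6), the row space is the span of (1, 2, 3) and (2, 4, 6) = 2(1, 2, 3), a line in ℝ3 , so the row-rank is 1; the column space is spanned by (1, 2), (2, 4), (3, 6), all multiples of (1, 2), a line in ℝ2 , so the column-rank is also 1.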

(Refer Slide Time: 08:09)

So here is the first theorem about the row-rank. Suppose 𝐴 is a given matrix and you apply
row operations to it; it will change to something, the rows will change. Now what were the elementary
row operations? One was the interchange of two rows, another was adding one row to
another, and another was multiplying a row by a nonzero scalar. Now if you take linear combinations
of the rows and transform them by elementary row operations, what will you get?
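
The upshot, briefly: each row of the new matrix is a linear combination of the rows of the old matrix, and since the operations are reversible, each old row is also a linear combination of the new rows. So elementary row operations do not change the row space, and hence do not change the row-rank.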

(Refer Slide Time: 13:30)

(Refer Slide Time: 18:52)

Here is a theorem which is based on just these things we have now discussed: the number of pivots is
the same as the rank, so the column rank is the same as the row rank, okay; but we also get a basis,
by picking up the column vectors corresponding to the pivotal columns. So these are the ways of
getting a basis for these spaces, okay.

Basic Linear Algebra
Prof. Inder K. Rana
Department of Mathematics
Indian Institute of Technology – Bombay

Lecture - 17
Row Space, Column Space, Rank-Nullity Theorem - II

So let us look at some examples for that.


(Refer Slide Time: 00:30)

So probably one of them I can throw out, maybe two of them, or remove three
of them, I do not know how many; I want to find out which vectors, put together,
will give me a linearly independent set that generates the same vector space as all of them, that is,
the linear span of all 6 generators. I want a basis for that vector space, right.

So what will be the basis? It should be a linearly independent subset of the generators one and
what is the second thing? It should be linearly independent and generator. Two things, we
should generate everything and it should be linearly independent. So I want a subset of the 6
vectors which is a linearly independent set and all 6 of them can be represented, each one of
them can be represented as a linear combination of those whichever I choose.

So what is the process of doing that? I am going to describe that, okay. So let us look at it:
these are 6 vectors, each having 4 components, so it is a linearly dependent set, obviously. So
what do you want to do? We want to find a maximal linearly independent subset: we find
a part of it which is linearly independent and maximal, say 3 of them such that no 4 of them are
linearly independent, and that will form a basis, right.

By the definition of basis, it should be either a maximal linearly independent set, or a linearly
independent set which generates, or a minimal set of generators; any of these 3 conditions can be
used as the criterion. So let us look at the process.

(Refer Slide Time: 03:55)

(Refer Slide Time: 06:03)

Now the claim is these 3 vectors should be linearly independent right and should span
everything. I want to check that also right. Theorem says it should be so, so let us try to verify
that with this example okay. So I want to verify, it is the linearly independent subset and in fact
spans all the remaining ones, so let us check.
(Refer Slide Time: 06:58)

(Refer Slide Time: 08:50)

So all the vectors are linear combinations of those 3, and those 3 were linearly independent;
that means those 3 form a basis for the linear span of the columns. So that is how you find them
from the columns.

They give me a basis for the column space of the matrix 𝐴 right because you wrote everything
as columns and in the row echelon form whichever pivotal columns are corresponding ones
should give me the basis and we have verified that okay right.
(Refer Slide Time: 11:05)

So here I have put the algorithm. Suppose you are given a set of vectors {𝑣1 , … , 𝑣𝑛 } and you want
to extract a maximal linearly independent subset out of it; that is our aim: given a set of
vectors, I want to find a subset which is linearly independent and as large as possible. So the
algorithm is: consider the 𝑚 × 𝑛 matrix obtained by writing these as the column vectors.

So given the vectors, write them as column vectors. Once you have written the column
vectors, reduce that matrix to the row echelon form or the reduced row echelon form
and look at the pivotal columns 𝑝1 , … , 𝑝𝑟 , where 𝑟 is the rank, so there are 𝑟 pivotal columns; pick up
the original columns 𝑣𝑝1 , … , 𝑣𝑝𝑟 . That collection is a maximal linearly
independent subset, right.

And this we have verified in one example and this can be put on a computer. This can be put
on a computer right. Well matrices can be put on a computer or computation of changing,
reducing to row echelon form is premultiplication by matrices. So a machine can do that job
okay but in the exam you will be the machine doing okay. We will have to do it ourselves right.
So this is one algorithm.
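
Since the steps are completely mechanical, here is a minimal computational sketch of this algorithm, assuming the SymPy library is available; the three vectors below are only an illustration, not data from the slides.

    from sympy import Matrix

    # The given vectors (illustrative choice; note v3 = v1 + v2).
    v1, v2, v3 = [1, 0, 1], [0, 1, 1], [1, 1, 2]

    # Write them as the columns of a matrix.
    A = Matrix([v1, v2, v3]).T

    # rref() returns the reduced row echelon form together with the
    # indices of the pivotal columns.
    R, pivot_columns = A.rref()

    # The original vectors sitting at the pivotal positions form a
    # maximal linearly independent subset of the given vectors.
    basis = [[v1, v2, v3][j] for j in pivot_columns]
    print(pivot_columns)   # (0, 1)
    print(basis)           # [[1, 0, 1], [0, 1, 1]]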

Now another possibility is that you are given a set of vectors, you are not interested in knowing
which one of them are linearly independent but you just want to find a basis for the linear span
of those vectors, we want to find some basis, here what we are doing? Given the set of vectors,
we are finding among the set of vectors as a basis but sometimes you are not interested in
finding basis from the given set, you just want to find a basis.
(Refer Slide Time: 13:11)

So let us look at the algorithm for that.

When we write them as the row vectors, then the nonzero rows in the row echelon form are
linearly independent. In the first one, the advantage is you are getting a subset which is linearly
independent right. In this form, you are only getting some basis; it is not the original ones right
but the span of the original set of vectors you are finding a basis for that, that is all. So
depending on what you want to do, whether you want to find among the given ones a basis or
you just want to find a basis, you write it as a row or the vector as row or column.

But the process will again involve reducing it to row echelon form, that is crucial, without that
you cannot do nothing. So all along till now what we have done is we have tried to reduce
everything to computation to row echelon form or reduced row echelon form right. The key
lies in that; you should be able to compute in that right. So that part I am leaving for you to
check.

That means what? I should be using these 2, I should be able to write all these 3 vectors as
linear combinations. So solving 3 sets of equations right, you have to solve that. So do that for
the practice.

Basic Linear Algebra
Prof. Inder K. Rana
Department of Mathematics
Indian Institute of Technology – Bombay

Lecture - 18
Row Space, Column Space, Rank-Nullity Theorem III

(Refer Slide Time: 00:26)

Let us do one more example to get the hang of everything that we are doing okay.
(Refer Slide Time: 01:36)

So in this matrix how many pivots are there? 2, right, they are in the yellow box; the first
2 rows are the nonzero ones. So what is the rank of this matrix? 2 is the rank, the
number of nonzero rows, okay. Now what is the next thing I want to do? I want to find the
nullity. How do I find the nullity? That means I should describe the null space, that is, all
solutions of 𝐴𝑋 = 0.

How many independent variables are there? Remember solving the equations if 𝑛 is the number
of variables, 𝑟 is the rank right then the number of independent variables which get arbitrary
values is 𝑛 − 𝑟. And what are those variables? They are precisely the non-pivotal variables.
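
In other words, the dimension of the null space, the nullity, equals the number of free variables, 𝑛 − 𝑟; that is, rank(𝐴) + nullity(𝐴) = 𝑛, which is the rank-nullity relation referred to in the title of this lecture.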

(Refer Slide Time: 06:18)

(Refer Slide Time: 09:51)

So that means I want to show that this system is consistent basically, this system is consistent
right. Is it okay, that you know how to do that how to show a system is consistent or not again
going to row echelon form and doing it right.

So I think this is the right stage to recap everything that we have done, because the next topic
will be something called determinants. So let me just recap whatever we have done till
now. We started by looking at what linear algebra is; basically we said linear algebra is the study
of linearity, and this arises in 2 ways: one is systems of linear equations, and the other is doing geometry
in an algebraic setup.

Basic Linear Algebra
Prof. Inder K. Rana
Department of Mathematics
Indian Institute of Technology – Bombay

Lecture - 19
Determinants and their Properties - I

So welcome to today’s lecture. We will just recall what we had done last time and then describe
what we will be doing today. If you recall, last time we looked at how to find the inverse of a matrix,
and we proved a theorem, namely that a square matrix is invertible if and only if its reduced row
echelon form is the identity matrix, and we also gave an algorithm for finding the inverse
in that case.
(Refer Slide Time: 00:59)

So this was the theorem that we proved last time that 𝑛 × 𝑛 matrix is invertible if and only if it
has full rank that is the rank of the matrix is 𝑛 and that is equivalent to saying that the reduced
row echelon form is identity and that process of converting the row, converting the matrix to
reduced row echelon form also gives you a way of finding the inverse. So today we are going
to look at another concept called determinant of a matrix and which will also related to
invertibility of the matrix.

(Refer Slide Time: 02:03)

So we will start by looking at a concept from geometry, namely finding the area of a parallelogram.
Suppose you are given a parallelogram defined by 2 vectors: 𝑂𝐴 is one vector, call it 𝑢, and 𝑂𝐵 is
another vector, call it 𝑣. Any 2 vectors define a parallelogram, by drawing through the tip of each a
line parallel to the other.

So if you recall your vector algebra, the diagonal of this parallelogram gives you the sum of
the 2 vectors, and it can be proved purely by geometry that the area of this parallelogram
is given by 𝑎𝑑 − 𝑏𝑐, if 𝐴 has coordinates (𝑎, 𝑏) and 𝐵 has coordinates (𝑐, 𝑑), that is, 𝑢 = (𝑎, 𝑏) and
𝑣 = (𝑐, 𝑑); purely by geometric considerations, by drawing perpendicular lines
and so on.

And used looking at congruence of triangles and using the concept that the area of the triangle
is (1/2) × 𝑏𝑎𝑠𝑒 × ℎ𝑒𝑖𝑔ℎ𝑡, one can come to the conclusion that this area of this parallelogram
is given by the 𝑎𝑑 − 𝑏𝑐, this is pure geometry.
(Refer Slide Time: 03:40)

If we interchange the order of 𝑢 and 𝑣, you will see that the area comes out as −(𝑎𝑑 − 𝑏𝑐), the negative
of what it was earlier; this also relates to what we call the left handed system or the right handed system
in the plane, whether we go clockwise or anti-clockwise. So one can call 𝑎𝑑 − 𝑏𝑐 the
signed area, because area is normally taken as positive every time. So our starting point is:
given 2 vectors 𝑢 and 𝑣, we have the area of the parallelogram.
(Refer Slide Time: 04:16)

So 𝑃(𝑢, 𝑣) denotes the area of the parallelogram. Now this concept of geometric area has
very nice properties. The first property is that if the 2 vectors 𝑢 and 𝑣 coincide, then there is no
parallelogram and the area is 0; so if the 2 vectors are the same, 𝑃(𝑢, 𝑢) = 0 (there is a typo on
the slide here). The second important property is that if you scale one of the vectors either 𝑢 or 𝑣.

So multiply it by a scalar 𝜆. So you stretch right 𝑢 by 𝜆, then the area also get stretched by the
same factor. So algebraically we can say that the area 𝑃(𝜆𝑢, 𝑣) = 𝑃(𝑢, 𝜆𝑣) = 𝜆𝑃(𝑢, 𝑣). So the
scalar comes out, that is the second property.
(Refer Slide Time: 05:27)

There is a third property namely, if I take 2 vectors 𝑢1 and 𝑢2 and add them I get any new
vector 𝑢1 + 𝑢2 and I can form the parallelogram given by 𝑢1 + 𝑢2 and 𝑣.

Here we are using the fact that the area, geometric concept of area is additive function if you
have 2 non-overlapping regions the areas of the union = the sum of the areas, so that property
is used so but basically important thing is 𝑃(𝑢1 + 𝑢2 , 𝑣) = 𝑃(𝑢1 , 𝑣) + 𝑃(𝑢2 , 𝑣).
(Refer Slide Time: 07:41)

So one says this is the area of span by 2 vectors 𝑢 and 𝑣 is linear in each variable 𝑢 and 𝑣 in
each variable it is linear.
(Refer Slide Time: 07:56)

So this way of interpreting motivates us a definition. So you can think of 𝑢 and 𝑣 as the row
vectors of a matrix so that is 2 × 2 okay. So we generalize this to 𝑛 × 𝑛 matrix. So given a
matrix 𝑛 × 𝑛 square matrix with real entries.
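
In brief, the three defining properties (which we will keep coming back to) are: (1) 𝐷(𝐴) = 0 whenever two rows of 𝐴 are equal; (2) 𝐷 is linear in each row of 𝐴 separately; (3) 𝐷(𝐼𝑛 ) = 1, the determinant of the identity matrix is 1.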

(Refer Slide Time: 10:10)

(Refer Slide Time: 11:12)

So these are the properties which define determinant and everything else can be deduced from
these 3 basic properties that is what we want to show. So sometimes that instead of writing the
letter 𝐷 of matrix 𝐴 one also puts 2 bars around 𝐴 right this symbol also used to indicate you
are looking at the determinant of the matrix 𝐴.

They are perfectly taken from the concept of area of a parallelogram.

(Refer Slide Time: 13:33)

(Refer Slide Time: 16:27)

(Refer Slide Time: 18:12)

(Refer Slide Time: 20:45)

So this gives you the property that if the rows of a matrix are linearly dependent, so that one of them
can be written as a linear combination of the others, then the determinant is 0;
is it clear that only those 2 properties are used?

Basic Linear Algebra
Prof. Inder K. Rana
Department of Mathematics
Indian Institute of Technology – Bombay

Lecture - 20
Determinants and their Properties - II

(Refer Slide Time: 00:26)

(Refer Slide Time: 02:51)

(Refer Slide Time: 04:31)

(Refer Slide Time: 09:05)

(Refer Slide Time: 09:19)

So we know that determinant of each 𝐸𝑘 is not 0? Yes, just now we computed right, either it is
1 or it is -1 or it is some nonzero 𝛼. So determinants of elementary matrices are nonzero.

If 𝑟 interchanges are required, then they contribute a factor (−1)𝑟 , and that says precisely: if the
number of row interchanges required to obtain the row echelon form is 𝑟, then the
determinant changes by a factor (−1)𝑟 compared with the original one, because each interchange
contributes a factor −1. So these facts are useful in computing; we will see how
they are useful. Is the proof of this clear?

It is just saying first step reduced row echelon form is the product of elementary matrices into
the matrix A, apply determinant on both side, use the product property and write down. Okay.
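
In symbols: if 𝑅 = 𝐸𝑘 ⋯ 𝐸1 𝐴 is the reduced row echelon form, then 𝐷(𝑅) = 𝐷(𝐸𝑘 ) ⋯ 𝐷(𝐸1 ) 𝐷(𝐴), and since each 𝐷(𝐸𝑖 ) is nonzero, 𝐷(𝐴) ≠ 0 exactly when 𝐷(𝑅) ≠ 0.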

(Refer Slide Time: 14:06)

(Refer Slide Time: 14:43)

The same proof will work later when the matrix 𝐴 is any invertible matrix, but at present we do not know
that relation, namely that for any 2 matrices det(𝐴𝐵) = det(𝐴) det(𝐵). We do not know that
yet; we only know this property when elementary matrices are involved, so that is what we are using.
This shows you how rigorously we are slowly building up our concepts, our known theorems,
from the 3 simple properties only, okay.

So now we come to the fact that det(𝐴𝐵) = det(𝐴) det⁡(𝐵). Proof is quite simple.
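
The idea, in brief: if 𝐴 is invertible, then 𝐴 is a product of elementary matrices, 𝐴 = 𝐸1 ⋯ 𝐸𝑘 , and applying the elementary-matrix property repeatedly gives det(𝐴𝐵) = det(𝐸1 ) ⋯ det(𝐸𝑘 ) det(𝐵) = det(𝐴) det(𝐵); if 𝐴 is not invertible, then 𝐴𝐵 is also not invertible, so both sides are 0.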

(Refer Slide Time: 17:58)

(Refer Slide Time: 19:22)

(Refer Slide Time: 22:47)

So I can now write equivalent ways of describing invertibility. The first one was: 𝐴 is invertible. We
showed that is equivalent to full rank, and full rank means the reduced row echelon
form is the identity matrix. And now we have just shown it is equivalent to the determinant
being nonzero: we proved 𝐴 is invertible if and only if the determinant is not equal to 0, and in that
case det(𝐴−1 ) = 1/det(𝐴); there is a bracket missing on the slide
here.

Basic Linear Algebra
Prof. Inder K. Rana
Department of Mathematics
Indian Institute of Technology – Bombay

Lecture - 21
Determinants and their Properties - III

See till now we said that let us start with a function 𝐷 on 𝑛 × 𝑛 matrices with those 3 properties,
what are the properties? First property was, if 2 rows are identical then it is equal to 0.
(Refer Slide Time: 00:42)

Second property linearity in each of the row vectors, third determinant of identity is equal to 1
and we deduce all the consequences, but we did not show that there is a function with that
properties those 3 properties, does there exist a function with those 3 properties? We assume
there exist one and deduced what are it is further properties. We did not say there is one right.
Now the stage has come to say that they actually exist and there is only one such function.

So what we have done is: we assumed the existence of a function called the determinant function
with some basic properties and deduced some other properties; we did not say that
such a function exists. In mathematics we not only have to state the properties, we have to show
that there exists a function with those properties, and that there is one
and only one.

Only then can we say that every such function will have those properties. That means for every
matrix 𝐴 there is a unique way of associating with it a determinant such that those 3
properties hold. And that, in fact, is what you have been doing all along when expanding a matrix
by rows or by columns. When you are given an 𝑛 × 𝑛 matrix, you have used the fact that we
can expand it by any row or by any column; what does that mean?

See when you try to expand, let us say 3 × 3. Let us say how do you expand 3 × 3 so that you
understand what you are going to do.
(Refer Slide Time: 02:44)

How do you compute a 2 × 2 determinant by the same method? So it is an inductive process of computing,
and the same process is the definition of the determinant. So let us formulate the definition of
determinant by induction. We want to define the notion of determinant of an 𝑛 × 𝑛 matrix, so
how do you define it? Define it first for 𝑛 = 1: for 𝑛 = 1 the matrix is only 1 scalar,
and that scalar itself is the determinant; with only 1 row and 1 column, all the properties will be
satisfied.

Assume you have defined the notion of determinant for (𝑛 − 1) × (𝑛 − 1) matrices, that means
for every (𝑛 − 1) × (𝑛 − 1) matrix you have the notion of determinant with those 3 properties.
I want to define the notion of determinant for an 𝑛 × 𝑛 matrix, and that is defined by
looking at what we get after removing a row and a column; that is called a
minor, as you have seen.
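
In a formula, the standard inductive definition, expanding along the first row, reads det(𝐴) = ∑𝑗 (−1)^(1+𝑗) 𝑎1𝑗 det(𝑀1𝑗 ), where 𝑀1𝑗 is the (𝑛 − 1) × (𝑛 − 1) minor obtained by deleting row 1 and column 𝑗 of 𝐴; the same can be done along any row or any column.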

(Refer Slide Time: 05:55)

Basically that normalizing factor says that all of them have to agree, okay. So we will not prove it,
but this gives you existence, and one shows that, defined this way, there is only one concept of
determinant; meaning that this definition of determinant, whichever row or column you
take, has those 3 properties. So it is a fairly deep theorem: it has all those 3 properties, and each
choice of row or column will give you the same value; they are all equivalent to each other.

So you can use any one of them as per your convenience. This expansion definition is more useful
in computing the determinant; it is an inductive process, so you can also feed it to a machine to do
the job. But the abstract existence and uniqueness viewpoint is better while proving things, for
example if you try to prove that det 𝐴−1 = 1/ det 𝐴.

(Refer Slide Time: 09:02)

(Refer Slide Time: 09:42)

How do you prove det 𝐴 = det 𝐴𝑇 ? You can say okay let us prove we have expanding by rows
and columns and same we try to expand by columns and rows and do something, but there is
very simple way of doing it, again by using our abstract definition
(Refer Slide Time: 13:26)

(Refer Slide Time: 13:51)

Now here is something for which we will not prove everything, because the proofs are messy to write.
Look at the transpose of the matrix of cofactors of a matrix; that is the definition of what is called the
adjoint of a matrix. So what is the adjoint of a matrix? First of all we looked at minors, and then
we looked at the cofactors; the cofactors give you a matrix, and you take its
transpose.
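
The key identity behind what follows is 𝐴 ⋅ adj(𝐴) = det(𝐴) 𝐼𝑛 , so when det(𝐴) ≠ 0 one gets 𝐴−1 = (1/det(𝐴)) adj(𝐴).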

(Refer Slide Time: 15:41)

So let us just these computations we have been doing so we will not go into all aspects. Let us
look at one example you are given this matrix okay. So how do you compute adjoint of this
matrix? Look at the minors right. So this row, this column okay and this row, this column
expands by any, these are minor, these are so let us not bother too much about the computations
verification of that. Basically you compute the minors matrix, you compute the cofactors
matrix.
(Refer Slide Time: 16:21)

And then you compute the adjoint, the transpose of that, which gives you this; and you multiply 𝐴
with it, and the product is det(𝐴) times the identity, which may be a nonzero multiple of the identity
or the zero matrix. Here it comes out to be 0, that means the matrix is not invertible
and you cannot do anything further. The earlier inverse formula was only for invertible matrices,
because you can divide by the determinant only when it is nonzero.

This is the complicated way of finding the inverse. So anyway this example illustrates the fact
that for a noninvertible matrix this could have come out to be = 0 because we can check the
determinant of that = 0.
(Refer Slide Time: 18:16)

So here is another theorem, which we will not prove; namely, it says that the rank is also related to
determinants. You are given an 𝑛 × 𝑛 matrix, and you can form sub-matrices of lower order.
Either the determinant of the whole matrix is not 0, and what does that say, determinant not 0 means
the matrix is invertible and the rank is full; or the determinant is 0, so the matrix is not invertible, and
then you look at the sub-matrices of order (𝑛 − 1) × (𝑛 − 1), and there are many possible such sub-matrices.

(Refer Slide Time: 19:38)

(Refer Slide Time: 21:23)

(Refer Slide Time: 21:52)

So finding determinant by a machine is not time efficient way of doing it, but this method has
used in some other places so it is kept okay. So with that we finish this lecture on determinants.
We want to start now something new. We will start looking at a dynamics of linear algebra.
The dynamics means when things move right. So we look at what is called linear
transformations in linear algebra.

Linear transformations are map it takes one vector and throws it somewhere else. So we will
do it next time okay. Let us stop here. Thank you.

Basic Linear Algebra
Prof. Inder K. Rana
Department of Mathematics
Indian Institute of Technology – Bombay

Lecture - 22
Linear Transformations -I

Okay, so we will start with our today’s lecture. Till now we have seen various concepts of
linear algebra, more of theoretical, they are motivated by geometrical concepts. Today we are
starting to look at a concept, which is more dynamic in nature and so to motivate that concept
let us look at a small video clip rather.
(Refer Slide Time: 00:54)

Observe the movements of the objects in this clip. So what are the kind of movements that you
saw in this clip. These clouds were moving, this wings of the windmill they were moving, the
sun was moving and this balloon was changing and this flag was changing, let us see it once
again some part of it and observe the various movements.
(Refer Slide Time: 02:08)

So what are the various movements which are happening? The clouds were moving in a certain
direction, the sun was moving along a particular path, the flag was shifting its position, and the
balloon was becoming bigger or smaller. So these are various things which we observe in our
day-to-day life and which have been used in this particular clip. So let us see what those things are.
So movements were shown, and one of them was reflection.
(Refer Slide Time: 02:46)

The flag changing okay. Then scaling.


(Refer Slide Time: 02:52)

Balloon becoming bigger and bigger retaining the shape that is called scaling.
(Refer Slide Time: 03:00)

Then there was rotation. The sun was moving along a circular path.
(Refer Slide Time: 03:07)

How are these movements described mathematically? Basically the idea is to look for maps, let us
call such a map 𝑇, from ℝ3 , that is our world, to ℝ3 , which preserve collinearity. What does
collinearity mean? They take straight lines to straight lines; they do not change the collinearity of 3
points: if 3 points are collinear to start with, then their images also remain
collinear.

Right, so lines go to lines. Second, the origin remains fixed in some of these; in translation the origin
does not remain fixed, but in rotation, reflection, scaling and all these things the origin remains fixed.
And third, proportionality is maintained: the division ratios are maintained, so if points
𝑢, 𝑣, 𝑤 on a straight line were in a particular ratio, then their
images also maintain the same ratio.

So question mathematically comes what are the maps from ℝ3 to ℝ3 which have these 3
properties, which we observe in day to day life and which are useful.
(Refer Slide Time: 04:39)

So one can prove a theorem.

Here it is from ℝ3 to ℝ3 , but when we define this for vector spaces we will see that they need
not be from the same vector space to the same vector space; the domain and the range need not be the same.
We will use this notation: if 𝑉 contained in ℝ𝑛 is a vector space, we have defined the notion
of a vector space, and 𝑈 is inside 𝑉 and is also a vector space, then we will say 𝑈 is a subspace
of 𝑉, right.

A subset also a vector space will call it as a subspace and this dimension of a vector space will
write as dim 𝑉 that will denote the dimension of the vector space, there is notation so that we
do not have to write everything again and again.
(Refer Slide Time: 07:24)

So here is the formal definition of a linear transformation.
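
Spelled out, the definition is the standard one: a map 𝑇 ∶ 𝑉 → 𝑊 between vector spaces is called a linear transformation if 𝑇(𝛼𝑢 + 𝛽𝑣) = 𝛼 𝑇(𝑢) + 𝛽 𝑇(𝑣) for all 𝑢, 𝑣 ∈ 𝑉 and all scalars 𝛼, 𝛽 ∈ ℝ.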

So geometrically a linear transformation, as we said, takes lines to lines, preserves ratios and leaves
the origin fixed. Algebraically, what it essentially says is that there is a linear structure in the domain
and there is a linear structure in the codomain, and the map gives due respect to the corresponding
structures in both; that is the algebraic meaning of it.

And this phenomenon comes in lot of branches in mathematics whether you study group study
here we are studying vector spaces. So whenever there is a set and there is structure and there
is one study maps between those 2 structure and if they preserve those are the important aspects
that is what whole of mathematics is about. You take a set, look some structure on it okay. You
get an object and look maps between these objects.

For example, in your calculus course you had the real line. On the real line, what is the structure?
There is a notion of distance on the real line, so the notion of limits makes sense on the domain;
closeness makes sense. You can take the range to be ℝ or ℝ2 or ℝ3 ; there also there is a notion of
distance, so closeness is there. So you can study, if two points are close, whether their images are
close or not; continuity and such things.

So that happens in most of the branches in mathematics so okay. Let us go ahead with our
concept of linear transformations.
(Refer Slide Time: 11:26)

(Refer Slide Time: 14:25)

So it is not linear, okay, and this causes some problems in computer graphics. We will show
later on that these motions, translation, rotation, reflection and such things, are used in computer
animation. So when you want to do computer animation you have to tell the machine that
at this point this line is to be rotated, like a robot standing there whose arm has to move, right.

So that means: this is my centre about which I want to rotate, and by this much angle I want to rotate;
but how do I give the command to the computer? It does not understand 'rotate'. So all this is done
via matrices, okay. We will see how this command can be given by a matrix and then to a
computer, and then the computer does the job. So that is possible, but the problem comes because this
translation is not linear.

That causes a problem because to give a command to a computer we need to convert a linear
transformation to a matrix and translation is not linear so you cannot convert directly into a
matrix. So what is done is one introduces a dummy variable right and then coverts this into
from ℝ3 one goes to ℝ4 actually and then does it. So you will learn these things if you are
doing computer animation, computer graphics.

You will learn there what are called projective coordinates. So we will not go into that, because the
idea here is only to introduce things, okay.
(Refer Slide Time: 18:15)

So let us look at one particular way of producing linear transformations: examples that come from matrices. Till now we have been treating matrices as collections of rows and columns of numbers which help us express systems of linear equations and do various jobs; here is a dynamic version of that.

So every 𝑚 × 𝑛 matrix gives you a linear map, not from ℝᵐ to ℝⁿ but from ℝⁿ to ℝᵐ. Keep that in mind: 𝐴 is 𝑚 × 𝑛 but 𝑇𝐴 is from ℝⁿ to ℝᵐ. We want to look at some particular cases of this; we will take a special matrix and see what effect it has. So let us look at that.
(Refer Slide Time: 21:56)

(Refer Slide Time: 23:53)

(Refer Slide Time: 25:02)

So we will come across many more slowly; I just wanted to introduce two of them. Rotation has been done, reflection has been done; we will look at more examples a bit later. So now, linear transformations become important from these two examples anyway.

That is what the importance of linear independence was: a basis generates the space and is independent, so everything is a linear combination of it and the representation is unique. So we fix an ordered basis {𝑣1, 𝑣2, … , 𝑣𝑛}; then the claim is that 𝑇 is uniquely determined by its values on these 𝑛 vectors 𝑣1, 𝑣2, … , 𝑣𝑛. That means, for a linear transformation 𝑇 on a vector space 𝑉, if we fix its values on the basis elements, then its value on every vector is fixed.

That means, to determine a linear transformation completely on a vector space of dimension 𝑛, where a basis has 𝑛 elements, you need to know only the values on the basis elements: where do the basis vectors go, that is all, nothing more is required.

Basic Linear Algebra
Prof. Inder K. Rana
Department of Mathematics
Indian Institute of Technology- Bombay

Lecture - 23
Linear Transformations-II

(Refer Slide Time: 00:33)

So let us look at whether {𝑣1, 𝑣2} forms a basis; we are working in ℝ².

The simplest way is to find the determinant: look at the determinant, whether of the rows or the columns, whichever way you like. For a square matrix, when are the rows and columns independent? If they are independent, then the matrix is invertible; it is 2 × 2, and it is invertible exactly when the determinant is not equal to 0.

These are simple computations, so these vectors are independent because the determinant is not 0. So let us fix 𝑇(𝑣1) = (1, 0); we want to check what we have done, so we fix 𝑇(𝑣1) = (1, 0) and 𝑇(𝑣2) = (0, 1), where 𝑣1, 𝑣2 are the basis elements. Now I want to know what 𝑇(𝑣) is for a general vector 𝑣; how do we compute it? For any 𝑣, I should first write 𝑣 as a linear combination of the basis elements.

𝑣 should be written as Σ αᵢ𝑣ᵢ, and then 𝑇 applied to it is known, since each 𝑇(𝑣ᵢ) is known. So let us do that. If you look at (𝑥, 𝑦), it is (𝑥 + 𝑦)/2 times 𝑣1 plus (𝑦 − 𝑥)/2 times 𝑣2; you can check that any vector (𝑥, 𝑦) is such a linear combination. So what is 𝑇 of this? By linearity it will be (𝑥 + 𝑦)/2 · 𝑇(𝑣1) + (𝑦 − 𝑥)/2 · 𝑇(𝑣2), and 𝑇(𝑣1) = (1, 0), 𝑇(𝑣2) = (0, 1), so it is the scalar (𝑥 + 𝑦)/2 multiplied by (1, 0) plus (𝑦 − 𝑥)/2 multiplied by (0, 1).

So 𝑇(𝑥, 𝑦) = ((𝑥 + 𝑦)/2, (𝑦 − 𝑥)/2); that is the linear transformation. So it says: once I fix the values of 𝑇 on the basis vectors, every other vector, because it is represented uniquely as a linear combination of the basis, has its image determined; 𝑇 of it is the same linear combination of the images of the basis vectors, and those values we know, since the images on the basis elements are given.

How do we find the general formula for the transformation? We have to solve a linear system: (𝑥, 𝑦) = α𝑣1 + β𝑣2, so you will get two equations in two variables and you have to solve them. So basically the previous knowledge comes in handy here: you want to write a vector as a linear combination of known vectors. So (𝑥, 𝑦) = α times (1, 1) + β times 𝑣2, and you will get two equations giving α and β in terms of 𝑥 and 𝑦.

So 𝑥 and 𝑦 are fixed, given to you, and you have to find the values of α and β in terms of 𝑥 and 𝑦; that is how you check whether a given vector is a linear combination of some other vectors or not. So this is obtained by solving two equations in the two variables α, β.
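A quick numerical check of this computation, a minimal sketch in Python/NumPy; the basis vectors 𝑣1 = (1, 1) and 𝑣2 = (−1, 1) are the ones consistent with the coefficients (𝑥 + 𝑦)/2 and (𝑦 − 𝑥)/2 read out above, so treat them as an assumption:

    import numpy as np

    v1 = np.array([1.0, 1.0])      # assumed basis vector
    v2 = np.array([-1.0, 1.0])     # assumed basis vector
    Tv1 = np.array([1.0, 0.0])     # prescribed image T(v1)
    Tv2 = np.array([0.0, 1.0])     # prescribed image T(v2)

    def T(x, y):
        # Write (x, y) = alpha*v1 + beta*v2 by solving a 2x2 linear system
        alpha, beta = np.linalg.solve(np.column_stack([v1, v2]), np.array([x, y]))
        # Linearity: T(x, y) = alpha*T(v1) + beta*T(v2)
        return alpha * Tv1 + beta * Tv2

    x, y = 3.0, 5.0
    print(T(x, y))                        # [4. 1.]
    print([(x + y) / 2, (y - x) / 2])     # matches the closed form above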
(Refer Slide Time: 04:52)

Okay, so here is an important theorem which we will prove; it is about subspaces associated with a linear transformation. Suppose 𝑇 is a linear transformation from a vector space 𝑉 to a vector space 𝑊. Look at all the vectors in the domain which go to 0, all 𝑢 belonging to 𝑉 such that 𝑇(𝑢) = 0; collect together all the vectors in the domain which go to the zero vector. That set is called the kernel of the linear transformation.

So what is the kernel? All the vectors which are killed by the linear transformation, which go to 0. So in the domain, collect together all vectors whose images are 0. At present that is a subset, but the claim is that it is actually a subspace. What is the difference between a subset and a subspace? For it to be a subspace, if 𝑢1 and 𝑢2 go to 0, then α𝑢1 + β𝑢2 should also go to 0; then it is a vector space.

So we will prove that this kernel, all the vectors of the space 𝑉 which go to 0, is a subspace. Next, the range, like the range of a map: on the image side, the range is {𝑇(𝑢) : 𝑢 in the domain}, the range of the function as usual. The claim is that if 𝑇 is linear then the range is also a vector space. So, second and third: kernel of 𝑇 and range of 𝑇 are both vector spaces.

Since they are vector spaces, they have dimensions, and the theorem says: dimension of the kernel of 𝑇 + dimension of the range of 𝑇 = dimension of the domain space 𝑉. This formula holds. The dimension of the kernel is actually called the nullity, and the dimension of the range is called the rank of the linear transformation; it is the same formula that we had in the rank–nullity theorem for matrices, and we will see how it is related to that later on.

So it is the same formula, the same theorem, for linear transformations: the dimension of the kernel is the nullity, the dimension of the range is called the rank, and rank + nullity = dimension of the domain space. So let us prove this theorem. The proofs are quite simple and straightforward, so I will write them out so that it is easier to understand, slowly.
(Refer Slide Time: 07:47)

So the first one: 𝑇 is from 𝑉 to 𝑊, and the kernel of 𝑇 is the set of all 𝑢 belonging to 𝑉 such that 𝑇(𝑢) = 0. As such this is a subset of 𝑉; the claim is that the kernel of 𝑇 is a subspace. To prove this, take two elements 𝑢1, 𝑢2 of the kernel and scalars α, β, and take a linear combination; then α𝑢1 + β𝑢2 should belong to the kernel of 𝑇. That is what we have to show, to show it is a vector space.

A linear combination of elements should stay inside. So what is to be shown is that this element also has the property that 𝑇 of it is 0. Let us note: 𝑇(α𝑢1 + β𝑢2), what is that? Linearity plays a part: it is 𝑇(α𝑢1) + 𝑇(β𝑢2), and again by linearity the scalars come out, so that is α𝑇(𝑢1) + β𝑇(𝑢2); I am just using linearity. Now 𝑢1 belongs to the kernel and 𝑢2 belongs to the kernel, so this is α times 0 + β times 0.

So 𝑇(𝑢1) is 0, 𝑇(𝑢2) is 0, and the whole thing is 0; that means the kernel is a subspace. Next let us look at the range: the range is a subspace of what? So what is the range of 𝑇? It is the set of all 𝑤 belonging to 𝑊
(Refer Slide Time: 10:13)

in the image, such that 𝑤 = 𝑇(𝑢) for some 𝑢 belonging to 𝑉: all the points of the codomain which are images of something. Another way of saying this is {𝑇(𝑢) : 𝑢 ∈ 𝑉}; both are the same, they are equivalent. So this is a subset of 𝑊, the range is a subset of 𝑊, and the claim is that the range is a subspace. So what do I have to check? Let 𝑤1, 𝑤2 belong to the range and let α and β be scalars.

To show: α𝑤1 + β𝑤2 belongs to the range. Is it clear how to show that? I should represent this as 𝑇 of something, as the image of something. So note: 𝑤1 and 𝑤2 are in the range, so I should write each as 𝑇 of some vector. The other direction is always true, that if two elements are in 𝑊 then their linear combination is in 𝑊, since 𝑊 is a vector space anyway; what is to be shown concerns the range.

So, correcting the slip above, what is to be shown is: 𝑤1, 𝑤2 belonging to the range should imply that the linear combination is inside the range. I will start with this: 𝑤1 belongs to the range, which means 𝑤1 = 𝑇(𝑢1) for some 𝑢1, and similarly 𝑤2 = 𝑇(𝑢2); if they are in the range, they must be images of something. So what does this give you?

It gives α𝑤1 + β𝑤2 = α𝑇(𝑢1) + β𝑇(𝑢2), and by linearity α can be taken inside and β can be taken inside, so this is the same as 𝑇(α𝑢1 + β𝑢2). But α𝑢1 + β𝑢2 is an element of 𝑉, so α𝑤1 + β𝑤2 is the image of some element, using linearity.

Nothing more than that; we are not doing anything except using what is given to us, linearity, and bringing the scalars inside. So the range is a subspace.
(Refer Slide Time: 13:48)

So let us look at the third property. We want to show that dimension of the kernel of 𝑇 + dimension of the range of 𝑇 = dimension of 𝑉. 𝑇 is from 𝑉 to 𝑊; keep that in mind. 𝑉 is a vector space, so it will have some dimension, say 𝑛, and 𝑊 will have some dimension as well. So let us look at that; we do not have to write everything.

Okay, so now we want to compute the dimension of the kernel. What is the definition of the dimension of a vector space? It is the number of elements in any basis of the vector space. If the dimension is 𝑛, there will be some 𝑛 vectors forming a basis for it; we do not know what it is, but we can find one. So let us start: the kernel of 𝑇 is a subspace, so it is a vector space in itself, so it has a basis.

So let us write a basis for it, say 𝑣1, 𝑣2, …, 𝑣𝑘 with 𝑘 ≤ 𝑛. Is that okay? The kernel is a subspace of 𝑉, so it will have some basis; say 𝑘 elements form the basis, so its dimension is 𝑘, and this 𝑘 will be ≤ 𝑛 because it is a subspace of 𝑉. I will just draw a picture: imagine this is 𝑉 (it does not have to look like a Venn diagram, I am just trying to show it) and here is the kernel of 𝑇, a part of it.

So the kernel has a basis, and the whole space also has a basis. What we are trying to do is enlarge this given basis of the kernel to a full basis of 𝑉: I can go on adding more and more nonzero vectors until I get 𝑛 vectors which are linearly independent, because if the kernel is not the whole of 𝑉 there must be a vector outside, a vector which is not a linear combination of elements of the kernel.

So add that to the basis of the kernel and you get 𝑘 + 1 independent vectors; possibly that itself finishes everything. If not, take one more and add, till you get 𝑛 elements which are linearly independent. So what one says is: enlarge it to a basis of 𝑉. You are given 𝑘 vectors forming a basis of the kernel; how many more will you be adding? 𝑛 − 𝑘, so that you have 𝑛 vectors which form a basis of 𝑉.

So enlarge the basis 𝑣1, 𝑣2, …, 𝑣𝑘 of the kernel; the new ones which are added are 𝑣𝑘+1 up to 𝑣𝑛, and we get a basis of 𝑉. Is that okay? So the whole space has got a basis. Now consider any 𝑤 belonging to the range of 𝑇; that implies 𝑤 = 𝑇(𝑢) for some 𝑢 belonging to 𝑉, because if 𝑤 is in the range it must be the image of something.

Now I have got a basis of 𝑉 and 𝑢 belongs to 𝑉, so this 𝑢 can be written in terms of the basis: there exist scalars α1, α2, …, α𝑛 such that 𝑢 is a linear combination of the basis elements, 𝑢 = α1𝑣1 + α2𝑣2 + ⋯ + α𝑘𝑣𝑘 + α𝑘+1𝑣𝑘+1 + ⋯ + α𝑛𝑣𝑛.

So 𝑢 must be written that way, but I am not interested in 𝑢; I am interested in 𝑤, because I want a basis for the range, so I should go to the image. So what does this imply for 𝑇(𝑢)?
(Refer Slide Time: 19:40)

By linearity, 𝑇(𝑢) = α1𝑇(𝑣1) + α2𝑇(𝑣2) + ⋯ + α𝑘𝑇(𝑣𝑘) + α𝑘+1𝑇(𝑣𝑘+1) + ⋯ + α𝑛𝑇(𝑣𝑛): using linearity again and again, and using the fact that 𝑢 is a linear combination, 𝑇(𝑢) must be the same linear combination of the images. And 𝑇(𝑢) is 𝑤. Now, what are 𝑇(𝑣1), …, 𝑇(𝑣𝑘)? These vectors form a basis of the kernel, so 𝑇(𝑣1) = 0, 𝑇(𝑣2) = 0, up to 𝑇(𝑣𝑘) = 0.

So let us use that fact: those terms are 0, they drop out, they are not required. What is left is 𝑤 = α𝑘+1𝑇(𝑣𝑘+1) + ⋯ + α𝑛𝑇(𝑣𝑛). So what does it mean? Any vector 𝑤 in the range is a linear combination of these ones; that means the range of 𝑇 is the span of the vectors 𝑇(𝑣𝑘+1), …, 𝑇(𝑣𝑛). These vectors span the range; the only thing left to check, to show they form a basis, is that they are linearly independent.

To show they form a basis, what should we do? Prove they are linearly independent. So the claim is that 𝑇(𝑣𝑘+1), …, 𝑇(𝑣𝑛) are also linearly independent. How do we check linear independence? A linear combination equal to 0 should imply that all the coefficients are 0. So suppose β𝑘+1𝑇(𝑣𝑘+1) + ⋯ + β𝑛𝑇(𝑣𝑛) = 0; I am purposely choosing betas now, and they are arbitrary. What is to be shown is that β𝑗 = 0 for every 𝑗.

All these betas should be 0. But what do we know? We only know that the 𝑣ᵢ are independent, because we have the basis of 𝑉. So we have to go back to the domain space and use that fact there. How do I go back? The idea is this: to the left-hand side I can add the terms β1𝑇(𝑣1), β2𝑇(𝑣2), …, β𝑘𝑇(𝑣𝑘) for any choice of β1, …, β𝑘, because 𝑣1, …, 𝑣𝑘 lie in the kernel, so each of these terms is 0.

All of them are 0, so we can write Σ βᵢ𝑇(𝑣ᵢ), 𝑖 = 1 to 𝑛, equals 0, whatever β1, β2, …, β𝑘 are; the remaining coefficients are the given ones. Now use linearity: this means 𝑇(Σ βᵢ𝑣ᵢ, 𝑖 = 1 to 𝑛) = 0, and with that I am going back to the domain.

So that means 𝑇 of something is 0, so that something must be in the kernel, right?
(Refer Slide Time: 24:38)

So it implies that Σ βᵢ𝑣ᵢ, 𝑖 = 1 to 𝑛, belongs to the kernel of 𝑇, and that was for an arbitrary choice of β1, …, β𝑘. Now, what is the property of 𝑣1, …, 𝑣𝑛? They were a basis, and I have got a linear combination lying in the kernel. Sorry, I should not be hurrying through; let me rewrite from here what we have to show.

I will write it again in a simpler way. We are given β𝑘+1𝑇(𝑣𝑘+1) + ⋯ + β𝑛𝑇(𝑣𝑛) = 0, and what I want to show is that all of these are 0: β𝑘+1 = ⋯ = β𝑛 = 0. Let us show this. By linearity, the hypothesis implies 𝑇(β𝑘+1𝑣𝑘+1 + ⋯ + β𝑛𝑣𝑛) = 0, that is, 𝑇(Σ β𝑗𝑣𝑗, 𝑗 = 𝑘 + 1 to 𝑛) = 0.

Now that means this element belongs to the kernel, since 𝑇 of it is 0: Σ β𝑗𝑣𝑗, 𝑗 = 𝑘 + 1 to 𝑛, belongs to the kernel of 𝑇. But I know the kernel already has a basis 𝑣1, 𝑣2, …, 𝑣𝑘, so this element must be a linear combination of 𝑣1, 𝑣2, …, 𝑣𝑘: Σ β𝑗𝑣𝑗 (𝑗 = 𝑘 + 1 to 𝑛) = Σ αᵢ𝑣ᵢ (𝑖 = 1 to 𝑘) for some scalars αᵢ. Now bring everything to one side; what does that imply?
(Refer Slide Time: 28:03)

So it implies −Σ αᵢ𝑣ᵢ (𝑖 = 1 to 𝑘) + Σ β𝑗𝑣𝑗 (𝑗 = 𝑘 + 1 to 𝑛) = 0, everything on one side. Now the left-hand side is a linear combination of 𝑣1, 𝑣2, …, 𝑣𝑘 up to 𝑣𝑛, all of them, but I know that all of them are linearly independent, because they formed a basis of the whole space 𝑉.

So it implies that all the αᵢ and all the β𝑗 are 0, for every 𝑖 and every 𝑗, since 𝑣1, 𝑣2, …, 𝑣𝑛 are linearly independent. In particular the β𝑗 are 0, so 𝑇(𝑣𝑘+1), …, 𝑇(𝑣𝑛) is a basis of the range of 𝑇. So basically the idea is: we start with a basis of the kernel, which sits inside 𝑉, and enlarge it to a basis of the whole of 𝑉. Whatever lies in the kernel has image 0, so those vectors do not contribute anything; the remaining 𝑛 − 𝑘 contribute to the range of 𝑇.

Their images are independent and everything in the range is generated by them. So how many are there? 𝑛 − 𝑘. So the dimension of the range space is 𝑛 − 𝑘, the dimension of the kernel is 𝑘, and the total is 𝑛; that is the rank–nullity theorem. Basically we are looking at a basis of the kernel, enlarging it, and saying that the images of the remaining vectors that were added form a basis for the range. That is all.
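A quick numerical illustration of this rank–nullity relation for the linear map x ↦ Ax given by a matrix, a minimal sketch in Python/SymPy with a hypothetical 3 × 4 matrix:

    from sympy import Matrix

    # A hypothetical 3 x 4 matrix; the map x |-> A x sends R^4 into R^3
    A = Matrix([[1, 2, 0, 1],
                [0, 1, 1, 0],
                [1, 3, 1, 1]])

    rank = A.rank()                  # dimension of the range
    nullity = len(A.nullspace())     # dimension of the kernel
    print(rank, nullity, rank + nullity == A.cols)   # 2 2 True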

Basic Linear Algebra
Prof. Inder K. Rana
Department of Mathematics
Indian Institute of Technology- Bombay

Lecture - 24
Linear Transformations-III

Right, so this is what is called the rank–nullity theorem. If I write a dash after "rank" it can look like a minus sign; the statement is rank + nullity = 𝑛, the rank–nullity theorem.
(Refer Slide Time: 00:43)

So this is the last one.


(Refer Slide Time: 00:49)

Dimension of the kernel + dimension of the range = dimension of 𝑉; that is what is called the rank–nullity theorem. We will see a bit later that this is the same as the rank–nullity (dimension) theorem for matrices. Okay, good; let us go a bit further. The important thing we have done is: for a linear transformation, the kernel is a subspace of the domain, the range is a subspace of the codomain, and the two dimensions added up give you the dimension of the domain. There is a very important consequence of this.
(Refer Slide Time: 01:31)

Suppose there is a linear map; then the following statements are equivalent. See, if you have got a function from one set to another in general set theory, that function need not be one-one, it need not be onto, and there is no relation between a function being one-one and being onto: you can have functions which are one-one but not onto, functions which are onto but not one-one, and functions which are both.

But for a linear transformation from 𝑉 to 𝑉 it is very important and very nice that being one-one is the same as saying the kernel is {0}, and that is the same as saying 𝑇 is onto. So a linear transformation which is one-one must also be onto; it cannot have just one of the two properties. Let us see a proof of that, where 𝑇 is the linear transformation.
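A short way to see the equivalence, as a sketch using the rank–nullity theorem (the slides may phrase the proof differently): since 𝑇 is linear, 𝑇(𝑢1) = 𝑇(𝑢2) exactly when 𝑇(𝑢1 − 𝑢2) = 0, so

    T \text{ one-one} \iff \ker T = \{0\} \iff \dim(\ker T) = 0 \iff \dim(\operatorname{range} T) = \dim V \iff \operatorname{range} T = V \iff T \text{ onto},

where the middle step is rank + nullity = dim 𝑉, and the next step uses the fact that a subspace of 𝑉 whose dimension equals dim 𝑉 must be 𝑉 itself.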
(Refer Slide Time: 02:34)

(Refer Slide Time: 04:29)

(Refer Slide Time: 06:41)

We said that given a matrix you can generate a linear transformation out of it; now, conversely, we want to show that any linear transformation arises that way only, through matrix multiplication; that is the only way of getting a linear transformation. For that, let us fix this notation: 𝑉 is a vector space and you have got an ordered basis {𝑣1, … , 𝑣𝑛}.

How is this matrix obtained? That is the question: how do you get the matrix for a given linear transformation?

So let us look at a proof of that; I will write the proof first and then show you, so it will be a revision.
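The recipe underlying the proof, stated briefly (this is the standard construction; the details are on the slides): if 𝐵 = {𝑣1, …, 𝑣𝑛} is an ordered basis of 𝑉 and 𝐶 = {𝑤1, …, 𝑤𝑚} is an ordered basis of the codomain 𝑊, write each image in terms of 𝐶,

    T(v_j) = \sum_{i=1}^{m} a_{ij}\, w_i, \qquad j = 1, \dots, n,

and the matrix of 𝑇 is [𝑇] = (𝑎ᵢⱼ): its 𝑗-th column consists of the coordinates of 𝑇(𝑣𝑗) with respect to 𝐶.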
(Refer Slide Time: 10:26)

(Refer Slide Time: 13:49)

(Refer Slide Time: 17:21)

(Refer Slide Time:18:19)

So, in the proof the important thing is that 𝑇(𝑣) is the linear combination that we get here.
(Refer Slide Time: 18:54)

(Refer Slide Time: 20:10)

Just look at one example so let us look at this and finish off .

Basic Linear Algebra
Prof. Inder K. Rana
Department of Mathematics
Indian Institute of Technology- Bombay

Lecture - 25
Orthonormal Basis and Geometry in the Euclidean Plane- I

Let us begin today's lecture by recalling.


(Refer Slide Time: 00:29)

What we did last time: given a linear transformation 𝑇 from a vector space 𝑉 to 𝑊, we defined what is called the kernel of that linear transformation, namely all vectors in the domain which go to the 0 vector, whose image is 0. The range of the linear transformation is the set of all images under 𝑇.

And we showed that both of them are subspaces; not only that, they satisfy what is called the rank–nullity relation. The dimension of the kernel, that is, the dimension of the null space (the kernel is a subspace of the domain), plus the dimension of the range of 𝑇 (which is a subspace of the codomain) gives the dimension of 𝑉, the domain space; that is the rank–nullity theorem.
(Refer Slide Time: 01:40)

And as a consequence of that we stated this theorem: if 𝑇 is a linear map from 𝑉 to 𝑉, from the same space to itself, then the following statements are equivalent, namely 𝑇 is one-one (it takes different vectors to different vectors), the kernel is the zero space (no element in the kernel other than 0), and 𝑇 is onto. So, via the rank–nullity theorem, what we are saying is that such a linear transformation 𝑇 is one-one if and only if it is onto.

(Refer Slide Time: 02:49)

(Refer Slide Time: 04:21)

(Refer Slide Time: 06:43)

(Refer Slide Time: 08:35)

(Refer Slide Time: 10:32)

Then, for the sum of these, let us compute and verify; try to see whether it comes out right. This will also give you an idea of how to compute things.
(Refer Slide Time: 11:36)

So that is how you write the matrix of a linear transformation properly, given an ordered basis in the domain and an ordered basis in the codomain. I have computed it for 𝑇 and I have computed it for 𝑆; as an exercise I will leave you to try and write down what 𝑇 + 𝑆 is, and what its matrix is.
(Refer Slide Time:16:41)

We are just verifying, but in the process it is quite clear how, given an ordered basis in the domain and an ordered basis in the codomain, to find the matrix of a linear transformation, and then you can easily verify this also; that is the process required for the verification. Let us go a step further: this holds not only for addition and scalar multiplication, it also holds for composition. As with functions, you can compose one map with another if the range of the first lies in the domain of the other, and the same is possible here.
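Written out, the statements being verified here are (for linear maps 𝑆, 𝑇 from 𝑉 to 𝑉 and a fixed ordered basis 𝐵, writing [𝑇]_𝐵 for the matrix of 𝑇 with respect to 𝐵; a summary of the standard facts):

    [T + S]_B = [T]_B + [S]_B, \qquad [\alpha T]_B = \alpha\,[T]_B, \qquad [S \circ T]_B = [S]_B\,[T]_B .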
(Refer Slide Time: 19:21)

(Refer Slide Time:23:58)

So let us write it as a consequence of the theorem: if 𝑇 is a linear transformation from 𝑉 to 𝑉 and 𝐵 is an ordered basis, then 𝑇 is bijective, that is, one-one and onto, if and only if its matrix is invertible, and the matrix of the inverse transformation is the inverse of the matrix of 𝑇; in the proof the two roles simply interchange with each other.

Basic Linear Algebra
Prof. Inder K. Rana
Department of Mathematics
Indian Institute of Technology- Bombay

Lecture - 26
Orthonormal Basis and Geometry in the Euclidean Plane- II

(Refer Slide Time: 00:28)

Let us try to do one example of this again, to see what is happening and to understand it. So let us do this example, so that you feel comfortable computing things.
(Refer Slide Time: 00:43)

You want to find the matrix of the inverse linear transformation; what will that be? By the theorem, it is the inverse of this matrix. So let us write that.
(Refer Slide Time: 04:58)

(Refer Slide Time: 07:28)

And now, using this, I can find out the expression for the inverse linear transformation 𝑇⁻¹; that also can be found, and we found it out. Is it clear to everybody? Yes. So basically, what we are saying is that there is an association between linear transformations and matrices.

And this association is very nice: the matrix of a sum is the sum of the matrices, the matrix of a scalar multiple is the scalar multiple of the matrix, and the matrix of a composition is the product of the matrices. One can also consider sets whose elements are not numbers but matrices themselves, and many other examples; there is a concept of a general vector space, and almost everything we have been doing works for those vector spaces also. They are important from a computational point of view, so we will discuss them at the end of the course; for now let us just come back to our topic.
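A minimal numerical sketch of this association in Python/NumPy, with hypothetical 2 × 2 matrices standing in for [𝑇] and [𝑆] with respect to some fixed ordered basis:

    import numpy as np

    # Hypothetical matrices of T and S with respect to one fixed ordered basis of R^2
    T = np.array([[1.0, 2.0],
                  [0.0, 1.0]])
    S = np.array([[0.0, -1.0],
                  [1.0,  0.0]])

    print(T + S)              # matrix of the sum T + S
    print(S @ T)              # matrix of the composition S o T
    print(np.linalg.inv(T))   # matrix of the inverse map, since T is invertible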
(Refer Slide Time: 12:26)

(Refer Slide Time: 14:17)

(Refer Slide Time: 17:25)

Each vector is normalised, meaning standardised so that its length is 1, and they are orthogonal to each other; the two conditions are combined in the word "orthonormal". So the standard basis of ℝ² or ℝ³ is an example of an orthonormal basis; we will define it formally as well.

Basic Linear Algebra
Prof. Inder K. Rana
Department of Mathematics
Indian Institute of Technology - Bombay

Lecture – 27
Orthonormal Basis, Geometry in R^2 III

Right, so let us start doing it in ℝⁿ.


(Refer Slide Time: 00:27)

This dot product, since the dot is sometimes not very visible, is also written with angle brackets; this is called the inner product of v and w. So the dot product in a general setting is called an inner product, and the same terminology carries over when you work with abstract spaces also. So we are moving from the specific setting of ℝ², ℝ³ to a general terminology and general definitions.

What was the magnitude in ℝ² or ℝ³ we now call the norm of the vector, or the length of the vector. What is it? ‖v‖ = √⟨v, v⟩, the square root of Σ vᵢ². And we say two vectors in ℝⁿ are mutually orthogonal if the dot product, the inner product, between them is equal to 0.

The same definition is carried over now. Note that if v is perpendicular to w, then w is also perpendicular to v, because this product is commutative: the inner product of v with w is the same as the inner product of w with v. So it does not matter; the relation is symmetric, v perpendicular to w is the same as w perpendicular to v.
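A small computational sketch of these definitions in Python/NumPy (the two vectors are hypothetical examples):

    import numpy as np

    v = np.array([1.0, 2.0, 2.0])
    w = np.array([2.0, -1.0, 0.0])

    inner = np.dot(v, w)             # inner product <v, w>
    norm_v = np.sqrt(np.dot(v, v))   # norm ||v|| = sqrt(<v, v>)

    print(inner)    # 0.0, so v and w are mutually orthogonal
    print(norm_v)   # 3.0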
(Refer Slide Time: 03:16)

Here are some things which I think we will not prove; we will just assume them. This is what is called the Cauchy–Schwarz inequality; you must have proved it in ℝ² and ℝ³, and the same proof actually continues if you go back and look at it. Namely, the absolute value of the dot product of v and w is at most ‖v‖‖w‖; essentially it comes from the formula ⟨v, w⟩ = ‖v‖‖w‖ cos θ together with |cos θ| ≤ 1, and the same relation is valid here.

(Refer Slide Time: 05:27)

So let us now formally define what an orthonormal basis is. Given a set S = {v1, v2, …, vk} of vectors in ℝⁿ, we say that it forms an orthogonal set if vⱼ is perpendicular to vₗ whenever j ≠ l: any two distinct vectors in it are perpendicular to each other. One of the vectors may be 0, we are not saying all are non-zero; the zero vector is perpendicular to everything, you can say it that way.

And "orthonormal" is different from "orthogonal". Orthogonal means perpendicular to each other; orthonormal says not only is the set orthogonal, but something more: the norm of each vector is 1, the magnitude is equal to 1. So it is something more than being orthogonal, and as a consequence, because the norm of each vector is 1, the zero vector cannot appear in an orthonormal set.

In an orthonormal set, 0 cannot be a member, because each magnitude is 1, and if the zero vector were there its magnitude would be 0. So if a set of vectors is orthonormal, then 0 cannot be a member of it; be sure of that. So, definition of an orthonormal basis: a basis v1, v2, …, vk of a vector space V in ℝⁿ is called an orthonormal basis if it forms an orthonormal set.

That means it is a basis, so every vector is a linear combination of it, with the added property that the length of each vector is equal to 1 and any two distinct vectors are perpendicular to each other; then we say it is an orthonormal basis. And I have already indicated the advantage; what is the advantage of an orthonormal basis?
(Refer Slide Time: 07:26)

(Refer Slide Time: 09:39)

If B is an orthonormal basis, then what is the vector x? It is a linear combination of u1, u2, …, uk, because that is a basis, so it should be a linear combination. But the result says exactly what those scalars are: they are precisely the dot products ⟨x, u1⟩, ⟨x, u2⟩, …, ⟨x, uk⟩. So the coordinates, which I mentioned earlier, are immediately known if you are given an orthonormal basis: the unique scalars giving the linear combination are immediately known.

Take the vector and form its dot product with the corresponding basis elements of the orthonormal basis; you get the corresponding coefficients. That is the advantage of an orthonormal basis. So whenever you are given a vector space, one would like to construct an orthonormal basis out of it, because of this advantage.
(Refer Slide Time: 10:39)

So to do that, there is the question, how to do that. And the process is very simple.
(Refer Slide Time: 10:46)

It is very geometric. Let us look at this. Suppose the red one is one vector and this is another vector. What is called the projection of one vector on the other, ordinarily? You drop a perpendicular: it is the shadow. A projection is just the shadow of something; if you are standing on the ground, what is your projection? Your shadow. So if I drop a perpendicular, then the projection is this part, up to here.

Now, from this vector, if I subtract that projection, what will I get? This is one vector, this is another, and subtracting gives this vector, by vector addition (this plus this should equal that, taking the opposite direction into account). So from this vector, if I subtract the projection, I get a vector which is perpendicular to the other one. That is the simple idea. And the plane can be generated by this vector and the original vector.

Or it can be generated by these two perpendicular vectors: one is the original one, the other is obtained from the second by removing the projection, and these two generate the same thing which the earlier pair was generating. That is the basic idea. So what we do: given some vectors, take the first one as it is. For the next one, remove from it its projection on the first; you get a second vector.

Now you have got two vectors which are perpendicular to each other and generate a subspace. Now you are given a third vector. So what will you do? How will you get a third one, perpendicular to these and giving the same span? Project the third one on the plane and subtract that projection; you will get a perpendicular one, and that one, along with the plane, gives you the whole 3-dimensional space. You go on doing that process. I just wanted to show you something.
(Refer Slide Time: 13:06)

(Refer Slide Time: 14:23)

So here is what we do. Given a vector space, we start with a basis of it, because eventually what you want to convert into an orthonormal set is going to be linearly independent anyway, and you know how to construct a basis of a vector space. So start with a basis; if not a basis, at least it should be a spanning set, meaning every vector is a linear combination of these vectors.

(Refer Slide Time: 17:18)

(Refer Slide Time: 20:26)

And once you do the computation yourself you will see it. What is the third one? How do you get the third one now? The process is clear. I have just given you a simple example; you can note it down and try to work out yourself what the orthonormal basis is, and check, when I put up the slides, whether my computation is okay or not; if it is wrong, or there is some typo, let me know. It will also give you practice in doing things. So is it clear to everybody how you get the orthonormal basis?

Given a linearly independent set, let us assume that you are given an independent set which is a basis. If not, you can still do the process: at some stage your w will become 0, so drop it as if it were not there and go to the next one. That will give you an orthogonal set; then normalize each one of them to get your orthonormal basis.
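A minimal sketch of this Gram–Schmidt process in Python/NumPy (the input vectors are a hypothetical example; vectors whose residual is essentially 0 are dropped, as described above):

    import numpy as np

    def gram_schmidt(vectors, tol=1e-12):
        """Return an orthonormal list spanning the same space as `vectors`."""
        basis = []
        for v in vectors:
            w = np.array(v, dtype=float)
            # Subtract the projection of v on each vector already collected
            for u in basis:
                w = w - np.dot(w, u) * u
            # Drop (near-)zero residuals: v was dependent on the earlier ones
            if np.linalg.norm(w) > tol:
                basis.append(w / np.linalg.norm(w))   # normalize
        return basis

    B = gram_schmidt([[1, 1, 0], [1, 0, 1], [0, 1, 1]])
    for u in B:
        print(np.round(u, 4))
    # Any two of the printed vectors have dot product 0 and each has norm 1.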
(Refer Slide Time: 25:22)

Okay, I do not want to say much more; later on this gets related with what are called Fourier coefficients. Those of you who do electrical engineering or something on Fourier series will see that these things come again, they will hit you again somewhere in some other form; there also one has Bessel's inequality and Parseval's identity, and Fourier coefficients will appear.

So this is related with something. Basically, at present, what we are saying is: take any particular vector v0; I want to show this. It will be a linear combination, so let us define v0 to be just the linear combination on the right-hand side, and take the dot product. The dot product, when you expand it using linearity, comes out to be equal to this.

Here, if you drop this term, what remains is a sum of square terms; if I drop one of them we get the "greater than or equal to", that is all, nothing more than that. Basically we are using the fact that these uₗ's are mutually orthogonal; so look at this and expand.

(Refer Slide Time: 27:40)

And equality holds when the set becomes an orthonormal basis. So I think we will continue next time about this identity a bit. Basically the idea is: given a basis of a vector space, how to make it an orthonormal basis; the importance lies in the fact that the coordinates are then immediately known, and the process is the Gram–Schmidt orthonormalization process. We will continue next time. Thank you.

Basic Linear Algebra
Prof. Inder K. Rana
Department of Mathematics
Indian Institute of Technology – Bombay

Lecture- 28
Isometries, Eigenvalues and Eigenvectors-I

Okay so let us begin with today’s lecture. We will start recalling what we had done last time.
We looked at what is called the Orthonormal basis of a vector space.
(Refer Slide Time: 00:41)

An orthonormal basis was defined as a basis consisting of non-zero vectors which are mutually orthogonal, each of norm 1, and which form a basis. And then we proved the result that if a set of vectors is orthogonal and, of course, none of them is 0, then it is linearly independent.

(Refer Slide Time: 01:54)

And then we saw the process of constructing an orthonormal basis: none of the constructed vectors will be zero, and they span the same space as the original ones; once you have got them, you normalize them to get an orthonormal basis. That is the process of constructing an orthonormal basis from a given basis, and we looked at examples of that. So we will continue with the further ideas.
(Refer Slide Time: 04:11)

We also proved what is called Bessel's inequality, which says that if you are given an orthonormal set, not necessarily a basis, then the sum Σ |⟨v, uₗ⟩|² is always ≤ ‖v‖², and equality holds if and only if the set forms an orthonormal basis; that is what is called Parseval's identity. So Bessel's inequality becomes an equality if and only if the given set of orthonormal vectors is also a basis.
(Refer Slide Time: 04:56)

Now recall that we looked at linear transformations as maps from one vector space to another. Since there is a notion of dot product, we would like to single out those linear transformations which not only preserve collinearity but also preserve angles; they do not change the angles.

Such maps are called isometries. A linear transformation from V to V is called an isometry if the inner product of Tv and Tw is the same as the inner product of v and w. So it preserves the inner product, and we have seen that the inner product is related to the notions of angle and distance. So it is not surprising that we have a theorem, namely: T is an isometry if and only if it also preserves lengths. Here we said it preserves the inner product, and the inner product is related basically to angles.

So it says T is an isometry if and only if it also preserves the notion of distance: the magnitude of each vector is kept intact. So an isometry takes lines to lines, preserves angles and preserves magnitudes; they are, in a sense, the perfect transformations of the plane. For example, when you physically move an object, the shape of the object does not change.

That means all straight lines remain straight lines, angles remain the same and distances remain the same; those are basically what are called the rigid motions of ℝ³. Translation, as we saw, is not a linear transformation, but it is still a rigid motion. So this is how isometries arise: algebraically, an isometry is a map from the vector space V to V which preserves the inner product, and hence the angles.

And we are saying that this is equivalent to saying that it preserves magnitudes as well, and this is not very difficult to show: basically, how is the magnitude related to the inner product? It is the square root of ⟨v, v⟩. Using that, one can easily prove it.
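A brief sketch of why the two formulations agree (not the full argument on the slides): if ⟨𝑇𝑣, 𝑇𝑤⟩ = ⟨𝑣, 𝑤⟩ for all 𝑣, 𝑤, then in particular

    \|Tv\|^2 = \langle Tv, Tv\rangle = \langle v, v\rangle = \|v\|^2 .

Conversely, if 𝑇 preserves norms, expand ‖𝑇(𝑣 + 𝑤)‖² = ‖𝑇𝑣‖² + 2⟨𝑇𝑣, 𝑇𝑤⟩ + ‖𝑇𝑤‖² and ‖𝑣 + 𝑤‖² = ‖𝑣‖² + 2⟨𝑣, 𝑤⟩ + ‖𝑤‖² (using linearity of 𝑇); comparing the two gives ⟨𝑇𝑣, 𝑇𝑤⟩ = ⟨𝑣, 𝑤⟩.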
(Refer Slide Time: 07:46)

(Refer Slide Time: 11:46)

(Refer Slide Time: 15:16)

So let us look at some examples which are quite obvious, which we normally use, actually, as rigid motions.

(Refer Slide Time: 17:33)

Physically it should not change angles or distances, so you can try that. What should be called a reflection? How do you define a reflection from ℝ² to ℝ²; what should the formula be, say for reflection in a line? It will depend on what that line is; if the line is at angle θ, the formula will come out in terms of θ.
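For reference, the standard formulas (stated here without derivation): rotation of ℝ² by the angle θ and reflection in the line through the origin making angle θ with the x-axis have matrices

    \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}
    \quad\text{and}\quad
    \begin{pmatrix} \cos 2\theta & \sin 2\theta \\ \sin 2\theta & -\cos 2\theta \end{pmatrix},

respectively; both preserve angles and distances.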
(Refer Slide Time: 19:59)

So, once we have an inner product space (what is an inner product space? a vector space on which an inner product is defined; every subspace of ℝⁿ, once we take the inner product along with it, becomes an inner product space), suppose T is linear and B is an ordered orthonormal basis. Earlier we looked at an ordered basis and the matrix of T with respect to it.

Now a notion of angle is also available, so you can take a basis which is orthonormal. So T is a linear transformation from V to V and we fix an ordered orthonormal basis of V: not just a basis, but one in which any two vectors are perpendicular to each other and each norm is 1. Let us compute the matrix of T with respect to it; the important thing is that T is an isometry if and only if the column vectors of that matrix form an orthonormal set.

Given any linear transformation and an ordered basis you get a matrix, and hence its column vectors. Here, if the starting basis is orthonormal and T is an isometry, one can show that the column vectors are actually orthonormal: they are perpendicular to each other and each has norm 1. That is a special property of the matrix of a linear transformation with respect to an ordered orthonormal basis when T is an isometry. We will not write the proof of that, but it is a very nice property.

And another one: the row vectors also form an orthonormal set. So it says that if T is a linear transformation from the space to itself and you take an ordered orthonormal basis of V and look at the matrix corresponding to it, then T is an isometry if and only if the column vectors are orthonormal, and equivalently the row vectors are also orthonormal.

Square matrices which have the property AᵀA = identity we will call orthogonal matrices. So the matrix of an isometry is an orthogonal matrix, and you can state it as a theorem: T is an isometry if and only if its matrix is an orthogonal matrix. Saying that the row vectors are orthonormal is the same as saying that not only is AᵀA the identity but AAᵀ is also the identity; both are the same.
(Refer Slide Time: 25:42)

So this is an equivalent way of saying that a matrix A is orthogonal: the definition is AᵀA = I, but taking the transpose of this equation gives AAᵀ = I as well. So saying that a matrix is orthogonal, that is AᵀA = I or equivalently AAᵀ = I, is the same as saying that Aᵀ is the inverse of A.

The transpose of the matrix is itself the inverse of the matrix. So if a matrix is orthogonal it is obviously invertible, and its inverse is its transpose. That makes the inverse very easy to compute: you do not have to go to determinants or adjoints or the row echelon form; just take the transpose and you get the inverse. So these are very special matrices.
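A quick numerical illustration in Python/NumPy, using a rotation matrix as the orthogonal matrix (the angle π/6 is a hypothetical choice):

    import numpy as np

    theta = np.pi / 6
    # Rotation by theta: matrix of an isometry of R^2 w.r.t. the standard (orthonormal) basis
    A = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])

    print(np.allclose(A.T @ A, np.eye(2)))      # True: A^T A = I, so A is orthogonal
    print(np.allclose(np.linalg.inv(A), A.T))   # True: the inverse is just the transpose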

And they arise as the matrix representations of isometries, which preserve angles and distances. So that is what we are saying: orthogonal matrices arise as the matrix representations of isometries with respect to orthonormal bases.

Basic Linear Algebra
Prof. Inder K. Rana
Department of Mathematics
Indian Institute of Technology– Bombay

Lecture - 29
Isometries, Eigenvalues and Eigenvectors-II

(Refer Slide Time: 00:27)

Let us start with an example to motivate everything that we want to do. Let us look at a simple 2 × 2 matrix, with entries −5, −7, 2, 4. Let us compute what happens when we multiply A with the vector (1, −1). Why I have chosen (1, −1), and why this special matrix, I will tell you a bit later; for the time being let us just compute this product.

The first row is (−5, −7); multiplying it into (1, −1) we get −5 + 7, that is 2, and similarly the other entry is −2: ordinary matrix multiplication. And (2, −2) I can write as 2 times (1, −1); I can take the 2 out. So this matrix, with respect to this vector (1, −1), has a special property: A applied to this vector is 2 times the same vector. So what is it doing?

So this matrix A is just scaling the vector (1, −1); it is just stretching it, to twice itself, nothing more. It is not changing the direction; the result lies on the same line through (1, −1). So here is the question: given a matrix, how do we find all vectors that are merely scaled by it?

For this matrix we have found one; how I got it I will tell you soon. Next, how do we find the scaling factors of a given matrix? Given a matrix, does it have the property that for some scalars λ there are vectors x such that A applied to x is λ times x? What are those scalars, how do we find them, and how do we find the corresponding vectors that get scaled?

And why is this important; why are we worried about this whole exercise? These are the three questions we would like to answer, so let us look at them one by one. I will keep this example in mind for the time being; we will work with this example only, so that we understand everything. So: for this matrix I have selected one particular vector, (1, −1), which has the property that the matrix scales it.

It scales it by the scalar 2; I know the scalar and I know the vector which is getting scaled. Actually, if I take any other vector along the same line, say (5, −5), that also gets scaled, and by the same factor: it becomes 2 times (5, −5). So along that line every vector is scaled by 2, whatever vector we choose on it. So it is the direction that is important.
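A quick check of this example in Python/NumPy (the entries −5, −7, 2, 4 are the ones read out above, so the eigenvalues 2 and −3 below follow from that reading):

    import numpy as np

    A = np.array([[-5.0, -7.0],
                  [ 2.0,  4.0]])

    x = np.array([1.0, -1.0])
    print(A @ x)        # [ 2. -2.] = 2 * (1, -1): the vector is only scaled

    vals, vecs = np.linalg.eig(A)
    print(vals)         # the two scaling factors (eigenvalues), 2 and -3 in some order
    print(vecs)         # columns are corresponding eigenvectors, up to scaling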
(Refer Slide Time: 04:09)

So let us look at it.

(Refer Slide Time: 08:23)

(Refer Slide Time: 08:52)

The matrix A − λI cannot be invertible, because if it were invertible the only possible solution would be x = 0: for a homogeneous system, the only solution is 0 if and only if the matrix is invertible. So if a non-zero solution x of this homogeneous system is obtained, the matrix cannot be invertible; it has to be singular, and if it is singular its determinant must be 0.
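Worked out for the 2 × 2 matrix of the example (taking the entries −5, −7, 2, 4 as transcribed above), this condition reads

    \det(A - \lambda I) = \det\begin{pmatrix} -5-\lambda & -7 \\ 2 & 4-\lambda \end{pmatrix} = (-5-\lambda)(4-\lambda) + 14 = \lambda^2 + \lambda - 6 = (\lambda - 2)(\lambda + 3) = 0,

so the possible scaling factors are λ = 2 and λ = −3.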

(Refer Slide Time: 13:03)

(Refer Slide Time: 15:04)

(Refer Slide Time: 15:26)

(Refer Slide Time: 16:01)

(Refer Slide Time: 18:49)

(Refer Slide Time: 19:48)

Let us call that matrix P, with (1, −1) as one column and the other eigenvector as the second column; this is an invertible matrix, since its determinant is not 0. So in the equation AP = P times the diagonal matrix, P is invertible, and I can take it to the other side by multiplying by its inverse: I can multiply both sides on the left by the inverse of this matrix P.

Basic Linear Algebra
Prof. Inder K. Rana
Department of Mathematics
Indian Institute of Technology – Bombay

Lecture – 30
Isometries, Eigenvalues, Eigenvectors-III

Now, what is the advantage of doing all these kinds of things? Why am I bothering with such a thing, you may ask; it looks nice mathematically, but why should one bother about it?
(Refer Slide Time: 00:41)

See, suppose A is the matrix that we had taken, with entries −5, −7, 2 and 4, and suppose I want to compute the 100th power of that matrix. What was the matrix we had?
(Refer Slide Time: 01:03)

A was the matrix with entries −5, −7, 2 and 4, and for this matrix we have got P⁻¹AP = D, the diagonal matrix with the two eigenvalues on its diagonal: such a P exists. I want to compute A raised to the power 100. From this equation, multiplying on the left by P and on the right by P⁻¹, I can write A = P D P⁻¹, is that okay?

So A equals this, and then A¹⁰⁰ is the whole thing, P D P⁻¹, raised to the power 100. What is that equal to? It looks surprising; let us compute the square first of all and then we will see what is happening.
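The computation being pointed to, written out (with 𝐷 the diagonal matrix of eigenvalues, which for the matrix as transcribed is diag(2, −3)):

    A^2 = (P D P^{-1})(P D P^{-1}) = P D (P^{-1} P) D P^{-1} = P D^2 P^{-1},

and repeating the same cancellation, A¹⁰⁰ = P D¹⁰⁰ P⁻¹, where D¹⁰⁰ is obtained by simply raising each diagonal entry to the 100th power.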
(Refer Slide Time: 02:19)

And powers of a diagonal matrix are very easy to compute, and besides that I only have to find P and P⁻¹, just two matrices, really one matrix and its inverse. Such a result is what is called diagonalizability of the matrix A. So what we have done is: for our special matrix A we have found an invertible matrix P such that P⁻¹AP is a diagonal matrix.

The diagonal matrix is known and the matrix P is also known. So the question comes: for what kind of matrices can one do that? For this particular one we were able to do it, but we may not always be that lucky, so we have to find a condition which ensures that if a matrix satisfies it then it can be diagonalized; not only can it be diagonalized, meaning I can find a matrix P such that P⁻¹AP is diagonal, but I also know what that diagonal matrix is.

The scalars which are going to appear on the diagonal are nothing but the solutions of det(A − λI) = 0. So it becomes important to solve det(A − λI) = 0; let us write that as a definition.
(Refer Slide Time: 05:05)

So let us write the definition: a scalar λ is called an eigenvalue if there exists a non-zero vector x such that Ax = λx (the slide had a typo there). So if there is a non-zero vector which is scaled by the matrix A with factor λ, then λ is called an eigenvalue, and such a vector is called an eigenvector for that eigenvalue.

And as I just said, for a given eigenvalue, in the previous example the eigenvalue 2, there are many possible eigenvectors: if one vector is an eigenvector, any scalar multiple of it is also an eigenvector for the same eigenvalue. So there can be more than one; in fact we can put them all together in one box and call it the eigen subspace with respect to that eigenvalue. We will come to that slowly.

So that is called an eigenvector. The problem, which one calls the eigenvalue problem, is: given a matrix A, find its eigenvalues and find the corresponding eigenvectors. At least the process is quite clear; we have already done one example of that.

(Refer Slide Time: 07:42)

So essentially, we are not exactly proving it, but giving you a hint; intuitively we are saying that this determinant is a polynomial in λ of degree n. That is why it is called the characteristic polynomial: it actually turns out to be a polynomial, and it is characteristic of that matrix, a property of the matrix; det(A − λI) is a polynomial of degree n.

(Refer Slide Time: 10:05)

So let us look at this: the roots of this polynomial are the eigenvalues; the roots of the characteristic polynomial give the eigenvalues. Now here is a theorem in algebra called the fundamental theorem of algebra. The coefficients here are all going to be real, since we have real matrices, but the roots of a real polynomial need not all be real, they can be complex. So the characteristic polynomial will be a polynomial with real coefficients.

It may not have any real root; all the roots may be complex. So a real eigenvalue need not exist, that is a possibility. For a given matrix, no eigenvalue may exist; then it is a gone case, you can do nothing, you cannot diagonalize that matrix at all, you are stuck at the first step itself. Or it may have eigenvalues: some of the roots may be real, some may be complex; in total there are as many as the degree n of the polynomial.

The fundamental theorem of algebra says there will be n roots for it. Some of them may be real, some may be complex, but the complex roots always occur in conjugate pairs. So if the matrix is of odd order, then at least one real root will exist, and then there is some hope; you can start with that at least. That is coming from algebra, basically. Next we will slowly come to something called the multiplicity of a root.

For a quadratic you know the formula; that gives you all the roots, real or complex. For a cubic one does not, in general; the hope is that you find one root by hook or by crook, divide by the corresponding factor, get a quadratic and try to solve it. From the 4th degree onwards there is no easy route; in general it is very difficult to find the roots of a polynomial. But here, anyway, for this one, just by looking at it you can see that there is a root λ = 2.

(Refer Slide Time: 13:46)

How to find eigenvalues and the eigenvectors corresponding to them: solve the characteristic polynomial, and for each root solve the corresponding homogeneous system.
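That two-step recipe as a minimal sketch in Python/SymPy, applied to the same hypothetical 2 × 2 example as before (SymPy works exactly, so the characteristic polynomial and the null spaces come out in closed form):

    from sympy import Matrix, symbols, solve, eye

    lam = symbols('lambda')
    A = Matrix([[-5, -7],
                [ 2,  4]])

    # Step 1: solve the characteristic equation det(A - lambda*I) = 0
    char_poly = (A - lam * eye(2)).det()
    eigenvalues = solve(char_poly, lam)          # [-3, 2]

    # Step 2: for each root, solve the homogeneous system (A - lambda*I) x = 0
    for ev in eigenvalues:
        vectors = (A - ev * eye(2)).nullspace()
        print(ev, [list(v) for v in vectors])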
(Refer Slide Time: 16:00)

(Refer Slide Time: 16:38)

So here is one example of that; look at this matrix. Such matrices basically arise in probability theory and statistical analysis, because the probabilities are distributed so that the total is 1; each entry is something like the probability of something happening. This is called a stochastic matrix, because it describes a system in terms of probabilities, and there you would like to know what happens if you apply that process again and again.

(Refer Slide Time: 19:26)

(Refer Slide Time: 20:31)

You can also call it the eigen subspace, the subspace corresponding to the eigenvalue λ. Now there is something which will probably bother us next time: even if an eigenvalue exists, you have to find an eigenvector for it, and sometimes one allows even complex eigenvalues, the complex roots which come from the characteristic polynomial.

Basic Linear Algebra
Prof. Inder K. Rana
Department of Mathematics
Indian Institute of Technology – Bombay

Lecture - 31
Diagonalization and Real Symmetric Matrices I

So let us begin today's lecture. We will start by recalling what we have been doing: we have been looking at the definitions of eigenvalues and eigenvectors and at diagonalization of matrices.
(Refer Slide Time: 00:44)

So let us just recall the example we have done.


(Refer Slide Time: 02:10)

So this prompted us to define and ask the question: given a matrix A, when does there exist an invertible matrix P such that 𝑃⁻¹𝐴𝑃 is a diagonal matrix, and how do we find that P? So we made the definition that a matrix A is diagonalizable if there exists an invertible matrix P such that 𝑃⁻¹𝐴𝑃 is a diagonal matrix, and we stated a theorem; actually we also proved it partially, so let us just go through the proof again.
(Refer Slide Time: 02:50)

There are 𝑛 vectors, which are linearly independent and form a basis. So let us just run through
the proof again.
(Refer Slide Time: 03:42)

(Refer Slide Time: 05:06)

So what do we want to show? We want to show that there is a matrix P such that 𝑃⁻¹𝐴𝑃 is diagonal. So let us construct the matrix with these eigenvectors as the column vectors: P is defined with these vectors as its columns, and since they form a linearly independent set, P is invertible; the rank of P is 𝑛, it is an invertible matrix.
(Refer Slide Time: 06:55)

There is nothing in the proof except writing out the matrix multiplication and expressing it appropriately. So this proves the theorem: a matrix A is diagonalizable if and only if there are n eigenvectors of A, corresponding to its eigenvalues, which form a linearly independent set. That is the condition for diagonalization. So the question comes:
(Refer Slide Time: 08:02)

When they are linearly independent, everything is okay. So the problem is: given an n×n matrix, how do you find a linearly independent set of eigenvectors? So the question is whether a given matrix has n eigenvalues and, for each eigenvalue, eigenvectors which together form a linearly independent set or not. If yes, the matrix is diagonalizable; if not, you cannot help it and it is not diagonalizable.

So basically what we are saying is that these n eigenvectors will form a basis, because the matrix is n×n. So saying that a matrix is diagonalizable is equivalent to finding n linearly independent eigenvectors for that matrix, and finding eigenvectors means first you have to find the eigenvalues anyway. So that is the problem we want to take up.
(Refer Slide Time: 08:59)

Then the claim is that this set is linearly independent. So what we are saying is: even though all the eigenvalues need not be distinct, if you collect eigenvectors corresponding to distinct eigenvalues, they will form a linearly independent set. So at least there is some achievement: they may not all be distinct, but for each distinct eigenvalue we find an eigenvector.

So these two expressions must then be equal, after the multiplication; multiplying this by λₗ, we get the next equation.
(Refer Slide Time: 14:11)

274
(Refer Slide Time: 15:58)

This matrix is not diagonalizable, right, because there is only 1 eigenvalue and for that eigenvalue,
lambda=1, I can find only 1 linearly independent eigenvector because the solution space has got
dimension 1, right. So it depends on for a root, if it is repeated, can I find that many linearly
independent eigenvectors for it or not, right. Essentially is boils down to that. So let us keep some
names to these things.
(Refer Slide Time: 20:31)

275
So the number of times it is repeated, that is called the algebraic multiplicity of that eigenvalue.
So as a root of the characteristic polynomial, it will be at least appearing once, because it is a root.
It may appear twice, thrice or so many times, so number of times it appears, that is called algebraic
multiplicity of the eigenvalue.

If you can find as many independent eigenvectors as the algebraic multiplicity, then you are through, right.
If both are equal, then it will become diagonalizable, right. Because for each eigenvalue, we will
have as many eigenvector as is the algebraic multiplicity. So put them together, you will have a
linearly independent basis, linearly independent set of eigenvectors forming a basis. So we can
diagonalize that, okay. So that is what, so this is what normally called the defect, but that is not
really important, okay, what you call that difference, okay.

Algebraic multiplicity will always be at least as big as the geometric multiplicity. Is that clear? Yes. The
number of times the root is repeated algebraically: you may not be able to find that many linearly
independent eigenvectors; you may be able to find only fewer, because they have to be independent.
(Refer Slide Time: 24:29)

276
For each eigenvalue there is an eigenspace, and it has a basis with as many eigenvectors as the
algebraic multiplicity, if the two multiplicities are equal. Put them together, you will get a basis consisting of
eigenvectors for the underlying space Rn, okay. Their sum is equal to the degree of the characteristic
polynomial, which is n, right. Algebraic multiplicity equal to geometric multiplicity, total = n, so for each one of them,
there is a basis; put them together. So there is nothing complicated in this theorem.

Basically, that is how you will analyse whether a matrix is diagonalizable or not. So the main
problem is given a matrix to show it is diagonalizable or not, what we have to do? We have to find,
for the given matrix, what are the eigenvalues. Once you have the eigenvalues, find out
corresponding eigenvectors, right. Find out the dimension of the corresponding eigenspace, which is
the same as the null space of 𝐴 − λ𝐼, right.

If for each eigenvalue that dimension is equal to the algebraic multiplicity,
then we are through, then we can find a basis consisting of eigenvectors, right.
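As a small illustration of this test, here is a sketch in Python (assuming SymPy is available; the matrix is purely illustrative and not the one from the slides). For each eigenvalue it compares the algebraic multiplicity with the dimension of the null space of A − λI, the geometric multiplicity.

```python
from sympy import Matrix

A = Matrix([[2, 1],
            [0, 2]])            # illustrative matrix, not from the lecture

for lam, alg_mult, basis in A.eigenvects():
    geo_mult = len(basis)       # number of independent eigenvectors found
    print(lam, alg_mult, geo_mult)

# Diagonalizable exactly when the two multiplicities agree for every eigenvalue.
print(A.is_diagonalizable())    # False here: eigenvalue 2 has alg. mult. 2, geo. mult. 1
```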
(Refer Slide Time: 26:37)

277
278
Basic Linear Algebra
Prof. Inder K. Rana
Department of Mathematics
Indian Institute of Technology – Bombay

Lecture - 32
Diagonalization and Real Symmetric Matrices - II

(Refer Slide Time: 00:26)

So let us look at one example to completely understand the process. So we are given this matrix
A okay, so you want to find the eigenvalues and a basis consisting of eigenvectors to
diagonalize this matrix A if possible right. So what is the first step? Find the eigenvalues.

So for each eigenvalue we will find a vector which is an eigenvector; put them together, and they will
form a basis consisting of eigenvectors.
And how do you find the null space or nullity? You have to reduce it to the row echelon form,
so you are given here. The row echelon form is given by this okay. So once that is the row
echelon form, what is the rank of the matrix? The rank of the matrix is 2. So what is the nullity?
1 right. So the eigenspace or the null space for this has only got dimension as 1 so how do you
find that?
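A minimal sketch of that computation (Python with SymPy assumed; the matrix and eigenvalue below are illustrative, not the ones on the slide): reduce A − λI to row echelon form, read off the rank, and the nullity, which is the dimension of the eigenspace, is n − rank.

```python
from sympy import Matrix, eye

A = Matrix([[1, 1, 0],
            [0, 1, 0],
            [0, 0, 2]])         # illustrative matrix
lam = 1
M = A - lam * eye(3)

print(M.rref())                 # reduced row echelon form and pivot columns
print(M.rank())                 # rank 2 here, so nullity = 3 - 2 = 1
print(M.nullspace())            # one basis vector for the eigenspace of lambda = 1
```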

(Refer Slide Time: 03:50)

279
If all roots are distinct, you are lucky; you will find eigenvectors for each, write them down as the columns
of the matrix, and you get the matrix P which is invertible and which will diagonalize it. If not, you have
to see whether you are getting as many independent eigenvectors, that is, whether the nullity is the same
as the algebraic multiplicity of that eigenvalue. If that is the case, again you are lucky and you are
through, right. If not, it is not diagonalizable, right, okay.
(Refer Slide Time: 08:50)

(Refer Slide Time: 11:01)

280
(Refer Slide Time: 13:26)

So you have got two linearly independent eigenvectors for the same eigenvalue, so you got 3
eigenvectors which are linearly independent, so matrix will be diagonalizable okay. So let us
compute that.
(Refer Slide Time: 14:16)

281
(Refer Slide Time: 15:42)

So we get the matrix: the first column is the first eigenvector, then the second and the third; put them
together and you get the matrix P. The determinant of this is not 0 because the columns are linearly
independent, which you can also check, so it is invertible and 𝑃−1 𝐴𝑃 should be equal to a diagonal
matrix. There is again a mistake here on the slide: the diagonal should read 3, 3 and then 6, the third eigenvalue, right.
(Refer Slide Time: 16:12)

282
So that is verification of that, actually you are getting a diagonal matrix. Is it clear
diagonalization process? Everything clear to everybody? Yes, okay.
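For completeness, here is a hedged sketch of the same verification in Python (SymPy assumed). The matrix is an illustrative diagonalizable example with a repeated eigenvalue, not the one worked on the slides.

```python
from sympy import Matrix

A = Matrix([[4, 0, 1],
            [2, 3, 2],
            [1, 0, 4]])          # illustrative: eigenvalues 3, 3 and 5

P, D = A.diagonalize()           # columns of P are eigenvectors, D is diagonal
print(P.inv() * A * P)           # equals D, the diagonal matrix of eigenvalues
```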

283
Basic Linear Algebra
Prof. Inder K. Rana
Department of Mathematics
Indian Institute of Technology – Bombay

Lecture - 33
Diagonalization and Real Symmetric Matrices - III

(Refer Slide Time: 00:26)

Now the problem comes when we are not so lucky: in an earlier example we saw that for the
eigenvalue lambda = 1 the nullity was 1, so we could not find enough eigenvectors and that matrix was not
diagonalizable. There is a class of matrices where everything goes very nicely, and they are
called the real symmetric matrices. So the theorem says if A is an 𝑛 × 𝑛 real symmetric matrix,
then A has 𝑛 real eigenvalues.

See, what are eigenvalues? They are the roots of the characteristic polynomial, right. Even if A is a
real matrix, so that the characteristic polynomial is a polynomial with real coefficients, it
may not have as many real roots as its degree; but there will be as many roots as the degree if you count complex ones,
right. The fundamental theorem of algebra says that a polynomial, with real or complex coefficients,
it does not matter, will have as many roots as its degree.

But some of the roots may be complex roots, and we are given a matrix which is real, so we are
interested only in the real roots. So the characteristic polynomial of a real matrix may not have
real roots at all; that is also possible. It may have real roots, right, but not all of the roots may

284
be real, some roots may be complex right but here is a special type of matrices which are called
the real symmetric matrices.

It says: if A is a real symmetric matrix, that means a matrix with real entries, and what is a
symmetric matrix? When do we say a matrix is symmetric? If 𝐴 = 𝐴𝑇 . So if it has that property, then
the theorem says a real symmetric matrix will have n real eigenvalues; that means the
characteristic polynomial will have all its roots real, right. They may or may not be repeated, that
is a different story.

They will all be real, and hence there are n eigenvalues; that means there will be as many
eigenvectors also, each eigenvalue will give you an eigenvector, right. So the important thing is that for a
real symmetric matrix all roots will be real, okay. There will be as many eigenvalues as the
order of the matrix; that is very important, okay. So let us prove it.
(Refer Slide Time: 06:47)

So we are simply using the fact that A is symmetric, right, 𝐴𝑇 = 𝐴; that is the only place we
have used symmetry, otherwise it is just the definition of the dot product, nothing more than
that. So all the eigenvalues are real. So this is a special class of matrices: if A is
real symmetric, then be sure you are going to get as many eigenvalues as the order of the
matrix, and since the matrix is real you will get real eigenvalues, right, okay.

So for a real symmetric matrix eigenvalues will exist, because the fundamental theorem
of algebra says the roots exist, right, and this theorem says all the eigenvalues have to be

285
real, so n x n real symmetric matrix will positively have n real eigenvalues and hence n real
eigenvectors also okay.
(Refer Slide Time: 11:39)

And the corresponding eigenvector, when you find it, will be a real eigenvector, because you
will be solving that homogeneous system of equations with real entries, right, okay.
(Refer Slide Time: 11:45)

There is actually something more which happens, so I think that also is simple, let us prove it
today itself. So if A is real symmetric and there are distinct eigenvalues, we have already shown
that for any matrix if the eigenvalues are distinct then the corresponding eigenvectors are
linearly independent, but now it says something more. If lambda 1, lambda 2, ..., lambda k are
distinct eigenvalues, pick eigenvectors ui corresponding to them.

286
Then, they are not only independent, they form an orthogonal set, they are perpendicular also.
So you get a special property that distinct eigenvalues give you eigenvectors which are linearly
independent for any matrix but if a matrix is real symmetric then the eigenvectors are actually
perpendicular to each other, okay. So again, the proof just uses the definition of the
dot product, right.

See for a real symmetric matrix, eigenvalues will exist, they may be repeated but supposing
you are lucky and you get distinct, all distinct right then the corresponding eigenvectors will
be orthogonal to each other right. You will get as many, so you will get a basis consisting of
eigenvectors which are orthogonal right and if you divide each one of them by their norm you
get an orthonormal basis right.
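A quick numerical check of these two claims (Python with NumPy assumed; the matrix is an illustrative real symmetric example): all the eigenvalues come out real, and the eigenvectors belonging to distinct eigenvalues are orthogonal.

```python
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])      # illustrative real symmetric matrix
assert np.allclose(A, A.T)

vals, vecs = np.linalg.eigh(A)       # eigh is the routine for symmetric matrices
print(vals)                          # three real eigenvalues
print(np.round(vecs.T @ vecs, 10))   # identity: the eigenvectors are orthonormal
```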

So for an ordinary matrix, if the eigenvalues are distinct, you will get an invertible matrix, right, but
if it is a real symmetric matrix with n eigenvalues which are all distinct, you will get an orthogonal
matrix which will diagonalize it. So that is the advantage, right. I think we will do that next time.
The upshot of today's lecture is how to find eigenvalues and how to find eigenvectors.

If you are lucky, distinct eigenvalues you will get a matrix which will diagonalize it and
important that happens because eigenvectors corresponding to distinct eigenvalues are linearly
independent in general. When it is real symmetric, all eigenvalues will exist. There will be n
real eigenvalues and eigenvectors corresponding to distinct eigenvalues will be not only
linearly independent, they will be actually orthogonal to each other right okay. So let us stop
here today.

287
Basic Linear Algebra
Prof. Inder K. Rana
Department of Mathematics
Indian Institute of Technology – Bombay

Lecture - 34
Diagonalization and its Applications - I

Okay so let us begin today’s lecture by recalling the last lecture we had proved the following
theorem.
(Refer Slide Time: 00:39)

So using this because it has already got 𝑛 eigenvalues and so there will be 𝑛 eigenvectors right
and we already know that because they are orthogonal, they are also linearly independent.
(Refer Slide Time: 01:35)

288
So the improvement on the previous theorem, which was about diagonalization of
matrices, says the following: if A is a real symmetric matrix, then there exists an orthogonal
matrix P such that 𝑃−1 𝐴𝑃 is a diagonal matrix, okay, and the diagonal entries are the
eigenvalues of the matrix A. The column vectors of P are the eigenvectors of A.

So it is basically the same process that we use for general diagonalizability. Given a matrix A, we find
the eigenvalues and the corresponding eigenvectors, right. If it is real symmetric, there are going
to be n eigenvalues, right, counting multiplicity, so each eigenspace is going to have a basis
consisting of eigenvectors, right. The only thing is that eigenvectors corresponding to
distinct eigenvalues of a real symmetric matrix will be orthogonal, but inside an eigenspace they may
not be orthogonal.

They will be only linearly independent, so what we can do is we can make using Gram-Schmidt
process, we can make those eigenvectors also mutually orthogonal. So for each eigenvalue will
have the eigenvectors right, there is a basis consisting of eigenvectors because it is
diagonalizable. So we can find orthogonal, orthonormal basis convert that basis to an
orthonormal basis.

And this says that for a real symmetric matrix this will always happen right, so for a real
symmetric matrix there will be n eigenvalues right. There is a basis consisting of eigenvectors
right which form an orthonormal set right. So there is a basis consisting of orthonormal vectors,
so that is what is called the spectral theorem for real symmetric matrices right. So keep in mind
for a general matrix, it may not be diagonalizable at all.

Because it may not have eigenvalues, even if it has eigenvalues the algebraic multiplicities may
not be equal to the geometric multiplicities but for a real symmetric matrix, always there exist
eigenvalues, algebraic multiplicity of each eigenvalue is equal to the geometric multiplicity
right and as a consequence it becomes diagonalizable right. Because it is diagonalizable,
eigenvectors corresponding to distinct eigenvalues are mutually orthogonal.

But inside each eigen-subspace right, the basis exists but it may not be orthogonal. So you can
use Gram-Schmidt process to make it orthogonal right. So that is the basic idea of this spectral
theorem. The advantage is that this matrix P is an orthogonal matrix, so for an orthogonal

289
matrix, what is P inverse? You know orthogonal means 𝑃𝑇 𝑃 = 𝐼, right, so that
means the inverse is the transpose itself.

So there is advantage here that for a real symmetric you do not have to compute P inverse, it is
just the transpose of the matrix P that you get okay.
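A minimal sketch of that advantage (NumPy assumed; the matrix is illustrative): the eigenvector matrix returned for a symmetric matrix is orthogonal, so its transpose already diagonalizes A, with no inversion needed.

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [1.0, 4.0]])           # illustrative real symmetric matrix
vals, Q = np.linalg.eigh(A)          # columns of Q: orthonormal eigenvectors

D = Q.T @ A @ Q                      # Q^{-1} = Q^T, so no inverse is computed
print(np.round(D, 10))               # diagonal matrix with eigenvalues 3 and 5
```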
(Refer Slide Time: 05:09)

So we will look at one example completely illustrating this idea.


(Refer Slide Time: 06:46)

(Refer Slide Time: 08:35)

290
(Refer Slide Time: 10:37)

(Refer Slide Time: 12:24)

291
(Refer Slide Time: 13:47)

So first column is the first eigenvector, second column second eigenvector, third the third
eigenvector, process is same whether it is ordinary diagonalizable matrix or it is real symmetric.
You find as many linearly independent eigenvectors as the order of the matrix. If you are
able to find them, it is diagonalizable; if not, it is not diagonalizable. For a real symmetric matrix,
you will be able to find 3 vectors which are not only linearly independent but mutually orthogonal, an orthonormal
set of vectors which forms a basis, so you get this, right.

292
Unlike the ordinary matrix where it may not have any solution at all right, there may not be
any eigenvalue for an ordinary real matrix but here for a real symmetric you will be able to
factorize right and you will get n eigenvalues, some of them maybe repeated right but each will
have algebraic multiplicity same as geometric multiplicity, for a real symmetric algebraic will
be always equal to geometric multiplicity.

Eigenvectors corresponding to distinct eigenvalues will be mutually orthogonal; inside an
eigenspace, the eigenvectors corresponding to the same eigenvalue, of which you are able to find as many as the
algebraic multiplicity, may not be mutually orthogonal. So use the Gram-Schmidt process so
that you get an orthonormal basis, right. So that is the process. So what are the applications?
(Refer Slide Time: 16:49)

(Refer Slide Time: 21:10)

293
When the argument of a function is a matrix; the matrix exponential is one example, which you will most
probably come across when you look at systems of differential equations and finding their solutions, so eigenvalues
and everything will come back there. So one advantage of diagonalizability is that you can
compute powers and functions of a matrix.
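A hedged sketch of what that means in practice (NumPy assumed; the matrix is illustrative): once A = P D P⁻¹, powers and functions of A are obtained by applying them to the diagonal entries only.

```python
import numpy as np

A = np.array([[5.0, 4.0],
              [1.0, 2.0]])                      # illustrative, eigenvalues 6 and 1
vals, P = np.linalg.eig(A)
P_inv = np.linalg.inv(P)

A_pow_10 = P @ np.diag(vals**10) @ P_inv        # A^10 via the diagonalization
exp_A    = P @ np.diag(np.exp(vals)) @ P_inv    # matrix exponential e^A

print(np.allclose(A_pow_10, np.linalg.matrix_power(A, 10)))   # True
```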

294
Basic Linear Algebra
Prof. Inder K. Rana
Department of Mathematics
Indian Institute of Technology – Bombay

Lecture - 35
Diagonalization and its Applications -II

(Refer Slide Time: 00:26)

If you just shift the origin... so let us do that process, okay. It is easier to sketch and understand
if the curve is given in the canonical form. Okay, so let us see how it is done. Okay.
(Refer Slide Time: 03:06)

295
Every point is shifted by (𝑥0 , 𝑦0 ). So what is the aim of doing this? If I put these values
in this equation, those terms which I want to make 0 should vanish, right; that is the criterion.
By this substitution, I want to choose (𝑥0 , 𝑦0 ) in such a way that when I put those values in the
equation, the terms in X and Y vanish. So let us put that condition: put these values
in the given equation.
(Refer Slide Time: 04:28)

(Refer Slide Time: 05:56)

So this is the first what one does is, by shift of origin you can bring your quadratic curve to
canonical form. Okay.

296
(Refer Slide Time: 08:10)

(Refer Slide Time: 08:55)

Now the advantage of writing this is: look at this matrix. It becomes the real symmetric matrix with rows
A B and B C; it is a real symmetric matrix. So once it is real symmetric, it will be diagonalizable by an orthogonal
matrix.
(Refer Slide Time: 14:16)

297
(Refer Slide Time: 15:51)

So it says: to visualize it, first shift the origin so that the linear terms vanish and the curve comes to
canonical form, and then you rotate; you will get your standard form of a conic, which is easy to
visualize. Okay.
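An illustrative sketch of the rotation step (NumPy assumed; the coefficients are made up for demonstration): for a conic A x² + 2B xy + C y² = constant, diagonalizing the symmetric matrix [[A, B], [B, C]] with an orthogonal matrix removes the cross term.

```python
import numpy as np

A, B, C = 5.0, 2.0, 5.0            # illustrative conic 5x^2 + 4xy + 5y^2 = 1
M = np.array([[A, B],
              [B, C]])

vals, Q = np.linalg.eigh(M)        # Q is orthogonal; its columns are the new axes
print(vals)                        # [3, 7]: in rotated coordinates 3 X^2 + 7 Y^2 = 1
```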
(Refer Slide Time: 17:26)

298
But that becomes important in a subject called differential geometry, right, where you study
what curves are, what surfaces are, how you characterize surfaces and so on: curves and
surfaces. So this is essentially that.
(Refer Slide Time: 18:26)

(Refer Slide Time: 19:44)

299
(Refer Slide Time: 22:05)

(Refer Slide Time: 23:41)

300
(Refer Slide Time: 24:40)

So that is the original one, right; that is the original quadratic curve. It is only after plotting it
that it looks like an ellipse to us now, right. But if you do not plot it, we do not know what kind of an
ellipse it is. So shift the origin; let us shift the origin, and if you shift the origin this will go here.

So this will become the new curve; this is just shifting the origin, a translation only, right. If you want to see
how this translation comes about, one can actually show the curve slowly moving as it translates.
So this is the one; what do I do next? I should rotate it by the orthogonal matrix, so here is the rotation,
you rotate it. So now it looks like a perfectly nice ellipse, aligned with the axes as we would normally
draw it. So that is how the visualization is obtained, okay.

301
So this is the final form, okay; that is what it looks like. You can try to do it; try to draw it and you will
understand everything. I am using a software called GeoGebra which is freely available, free to
download, okay. I have just sat down, plotted all these things and rotated them.

All the tools are available there, how to translate, how to rotate; everything is there. So use that if you
want a change of scenery: play with the software and try to understand the mathematics. Okay. So that is an application of the spectral theorem.
Okay. So that is an application of spectral theorem.

302
Basic Linear Algebra
Prof. Inder K. Rana
Department of Mathematics
Indian Institute of Technology – Bombay

Lecture - 36
Diagonalization and its Applications -III

So how did we; I am revising what we have done. How did we start off the course? We started
looking at, how does linearity arise in real life. And that was Systems of Linear Equations. Trying
to find solutions of Systems of Linear equations and the method given by Gauss that is eliminating
one variable at a time we want to do formalize it. So to formalize that we brought in, what is called
the Matrix representation of a system of linear equations.

We said the method of Gauss is nothing but reducing the matrix A to row echelon form, or a
specialized one called the reduced row echelon form. So what is the row echelon form? The row
echelon form of a matrix is reached by elementary row operations, right; the elementary row operations
are: you can multiply a row by a nonzero scalar, you can add one row to another, or you can
interchange two rows. So these are called elementary row operations.

By doing these operations alone you can bring your matrix to a special form. You are doing
something on the rows, and the row echelon form essentially says: in your matrix, right, there
will possibly be some bottom rows which are going to be 0, right. There are some nonzero rows
and some rows which are identically 0. So suppose there are r rows which are nonzero; they are at
the top and the zero rows are at the bottom, right. This r we give a name: it is called the rank of the
matrix, right.

And so there is a number r between 1 and n such that the top r rows are nonzero and the bottom
rows are 0. Now why is a row nonzero? Because somewhere a nonzero entry appears,
so how do we describe that? Look at the first row and spot its first nonzero entry; it comes
in a column, call it P1. Look at the second row; its first nonzero entry comes in column
number P2. So the condition is that P2 should be bigger than P1.

303
In each subsequent row, the first nonzero entry should come in a column which is to the right of the earlier
one, right. It may not be immediately to the right; it will be somewhere on the right, okay. So that form of
the matrix is called the row echelon form. And you can do further row operations so that in that
column there is one nonzero entry, right, and below it everything is 0. So that entry is called a pivot.

In that column, everything on top also you can make 0 by elementary row operations. So in the pivotal
columns, where the first nonzero entry of a particular row comes, by further row operations you
can make that the only nonzero entry, with everything else 0, and that nonzero entry you can also
divide by the constant and make equal to one, right. So that is called the reduced row echelon
form of the matrix.

Then we looked at applications of this. One: whether a linear system is consistent or inconsistent, in terms of rank.
So take Ax = B, take A and the augmented matrix [A B], right, and reduce to row echelon form. If the rank
of A is the same as the rank of [A B], then the system is consistent; otherwise, if it is bigger by one,
it is not consistent. When it is consistent, that means there are r pivots among the n
variables, right, the pivots coming in columns P1, P2, ..., Pr.

So the values of the pivotal variables are obtained from the non-pivotal variables, by giving the non-
pivotal variables arbitrary values and doing backward substitution. So start at the bottom row, right,
compute the pivotal variable in terms of the non-pivotal ones, give them arbitrary values, substitute
back and back again, and get all your solutions, right. That is how all possible solutions are found.
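A small sketch of that recipe (Python with SymPy assumed; the system is illustrative): compare the ranks of A and the augmented matrix for consistency, then express the general solution with the non-pivotal variable as a free parameter.

```python
from sympy import Matrix, linsolve, symbols

A = Matrix([[1, 2, 1],
            [2, 4, 0]])
b = Matrix([3, 2])
aug = A.row_join(b)

print(A.rank(), aug.rank())          # equal ranks, so the system is consistent
x1, x2, x3 = symbols('x1 x2 x3')
print(linsolve((A, b), x1, x2, x3))  # {(1 - 2*x2, x2, 2)}: x2 is the free variable
```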

And then we also said that solving a system Ax = B is essentially equivalent to solving the
homogeneous part of it, Ax = 0. What is the difference? You find one particular solution of Ax = B
somehow and all solutions of Ax = 0, so the general solution is obtained by taking one particular
solution and adding to it a solution of the homogeneous system, right. That gives you all of them. That also
brought in... okay, we will come to it a little bit later.

But the row echelon form also gives you a method of checking whether a square matrix is invertible
or not, right. So if it is going to be invertible, that means there should be a unique solution of Ax = B.

304
That is equivalent to saying the rank should be full, and that is the same as saying that when you
reduce to reduced row echelon form, it should be the identity. So a method of checking is: take A, which is n x n, put
it alongside the n x n identity matrix, do row operations on A and do the same operations to the identity
also, and find the reduced row echelon form of A.
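A small sketch of that procedure, including reading off the inverse as described next (SymPy assumed; the matrix is illustrative):

```python
from sympy import Matrix, eye

A = Matrix([[2, 1],
            [5, 3]])               # illustrative invertible matrix
aug = A.row_join(eye(2))           # the block matrix [A | I]

R, _ = aug.rref()                  # reduced row echelon form of [A | I]
print(R[:, :2])                    # left block is the identity, so A is invertible
print(R[:, 2:])                    # right block is A^{-1}: [[3, -1], [-5, 2]]
```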

If the left block becomes the identity, the portion next to it, what the identity has changed into, gives you
A inverse, right. That is how you check whether something is invertible and compute the inverse at the
same time; that is another application of the row echelon form. Then we looked at matrices, and we
looked at the row space and the column space, right. They are all subspaces of Rn and Rm, right.

What is a subspace? It is a subset of Rn or Rm which in itself is closed under linear
combinations, right; that is a subspace, okay. For example, in R2 if you take a line passing through
the origin, that is a subspace of R2, right, because if I take any two points and add them we will be
somewhere on the line again, right, and scalar multiples also, so that is a subspace. So: row space, column
space. And then we looked at the dimension of the row space and the dimension of the column
space, and we said both are the same, equal to the rank of the matrix, right.

And what is the dimension of the space? That involved what is called a basis and the number of
elements in a basis. So we said every subspace of Rn or Rm will have a basis. What is
a basis? A subset B is called a basis if, when you take all linear
combinations of its elements, that gives you all elements of the space, and it is
minimal: you should not be able to remove some elements from it.

So it is a minimal set of generators, or equivalently, a maximal linearly independent set. Those
were equivalent ways of describing a basis, and then we stated (we did not prove) the theorem
that any two bases have the same number of elements. A vector subspace can have different
bases; remember, R3 itself has different bases, right, we gave many examples. But the number of
elements in a basis will always be the same, and that is called the dimension of the space, right.

305
So we had the dimension of the column space, the dimension of the row space, and we also had the
dimension of the null space of Ax = 0; that was called the nullity. So we had the Rank-Nullity
theorem: rank + nullity = n, right. That was the rank-nullity theorem we proved. And that had
some applications; namely, okay, we came to linear transformations: the rank-nullity theorem for a
linear transformation, that the dimension of the null space plus the dimension of
the range space equals the dimension of the domain, right. That came as a consequence, okay.

I think more systematically we looked at matrix multiplication as a linear map, x mapped to Ax = y;
that is a linear operation on the domain. So you can call that the linear map induced by a
matrix, and conversely every linear transformation from one vector space to another (of course, the
vector spaces being subspaces of Rn) gave rise to a matrix representation of the linear transformation.
So how was that obtained? Say T is from V to W, V of dimension 2, W of dimension 3, right.

So V of dimension 2 has basis elements v1, v2, and W of dimension 3 has basis elements w1, w2,
w3. Take v1 and take the image of v1 under T; it goes to some vector in W, and that should be a linear
combination of w1, w2, w3. So T(v1) is a linear combination, and the scalars which come in the
linear combination give you the first column. T(v2) is again a linear combination and that gives
you the second column. So take the basis elements in the domain, take the image of each one and write it as
a linear combination; you get the corresponding columns of the matrix representation.
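A hedged sketch of that construction (NumPy assumed; the map T and the bases are an arbitrary illustration using the standard bases of R² and R³):

```python
import numpy as np

def T(v):
    x, y = v
    return np.array([x + y, 2 * x, 3 * y])        # a sample linear map R^2 -> R^3

basis_V = [np.array([1, 0]), np.array([0, 1])]
M = np.column_stack([T(v) for v in basis_V])      # each image becomes a column
print(M)                                          # [[1 1] [2 0] [0 3]]
print(np.allclose(M @ np.array([2, 5]), T(np.array([2, 5]))))  # True
```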

So that is the matrix representation of a linear transformation. And then, as a consequence of the rank-
nullity theorem, we said that T is one-one if and only if T is onto, as maps between spaces of the same dimension. So that is what we proved; that was
linear transformations: matrices of linear transformations, rank, basis, dimension. Then we looked at
generalizing the notion of angle in vector spaces, right, because there is a notion of angle, the dot
product, in Rn. With that we brought in the notion of perpendicularity: two vectors are perpendicular
if their dot product is equal to 0.

So we asked, given a basis, can we generate an orthonormal basis out of it, right. A basis is a linearly
independent set of vectors which generates everything. As for orthonormality, every orthogonal
collection is also linearly independent automatically, right; that is a simple property that we
proved. And then we said that a linearly independent set need not be orthogonal, obviously.

306
So how do you get from linear independence to orthogonality? That is the Gram-Schmidt process. Given a
linearly independent set, one by one, iteratively, you can generate a basis which is an orthonormal
basis, right. Once that is obtained, what is the advantage of an orthonormal basis? It
concerns the coordinates of a vector; see, every vector is a unique linear combination of the basis
elements, right.

Every v is a sum of alpha_i v_i, where the v_i are the basis elements, and these alpha_i are unique, right. That is
what we called the coordinates of the vector. These unique coordinates are immediately known
if an orthonormal basis is given, because they are just dot products. If you take the dot product
of both sides with a particular v_i, on the right side every term v_i . v_j vanishes except v_i . v_i, right.
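A tiny numerical illustration of that point (NumPy assumed; the basis and the vector are made up):

```python
import numpy as np

e1 = np.array([1.0, 1.0]) / np.sqrt(2)     # an orthonormal basis of R^2
e2 = np.array([1.0, -1.0]) / np.sqrt(2)
v = np.array([3.0, 5.0])

alpha1, alpha2 = v @ e1, v @ e2            # coordinates are just dot products
print(np.allclose(alpha1 * e1 + alpha2 * e2, v))   # True
```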

So that means the advantage of an orthonormal basis is that the coordinates of a vector are immediately
computable, right, if your basis is orthonormal. That is why we prefer to have an orthonormal basis
whenever possible, and that is the Gram-Schmidt process, right. And then we looked at a particular case
that given a matrix how can we make it similar to a diagonal matrix? That means finding an
invertible matrix P such that P inverse AP is diagonal. That was diagonalization process for the
problem.

We said this leads to computation of eigenvalues, computation of eigenvectors, and checking
when the eigenvectors are linearly independent and form a basis of the underlying space; if
the matrix is n x n, they should form a basis of Rn, right. Computation of eigenvalues
led to the determinant of A - lambda I, because for (A - lambda I)x = 0 you want a solution, a nonzero
solution. That means A - lambda I should not be invertible, right; only then is it possible.

If it is invertible, the homogeneous system has only the unique solution, the trivial one. If you want a
nonzero solution, then the determinant of A - lambda I should be equal to 0, because it has to be
singular. That means solving the polynomial, finding the roots of the polynomial det(A -
lambda I); that gave you the eigenvalues. How do you find the corresponding eigenvectors? By
solving (A - lambda I) applied to x = 0; you have to find x. So it is finding the solutions of the homogeneous
system whose matrix is A - lambda I.

307
And for that we know what to do: go back to the row echelon form and see what the rank is and what the nullity is.
You want the null space, all possible solutions, right, so find the dimension of the null space,
and that many linearly independent eigenvectors you can obtain. And we proved a theorem that for
distinct eigenvalues the eigenvectors are linearly independent: for a general matrix, the eigenvectors
corresponding to distinct eigenvalues are linearly independent, right.

Now the question is only about the inside: for a particular eigenvalue, if you look at all its eigenvectors, right,
they form a subspace, okay. The question is whether the dimension of that subspace is the same as the
number of times the root is repeated or not. The number of times the root is repeated is called the
algebraic multiplicity; the dimension of the null space of A - lambda I is called the geometric
multiplicity; whenever the two are equal your matrix will be diagonalizable.

Because that says you will be able to find as many linearly independent eigenvectors as the
number of times the eigenvalue is repeated, right. So that was diagonalizability of an arbitrary matrix; it
may or may not happen. The condition is that for each eigenvalue the
algebraic multiplicity should be equal to the geometric multiplicity, right, and there should be n
eigenvalues, and that happens if your matrix is a real symmetric matrix.

For a real symmetric matrix, you are assured that there will be as many eigenvalues as the
order of the matrix, that is n for an n x n matrix. There will be n eigenvalues, and the algebraic multiplicity
of each will be equal to the geometric multiplicity; that is also part of the claim, part of
the theorem. And thirdly, you can find an orthonormal basis of eigenvectors:
for any two distinct eigenvalues, the eigenvectors are perpendicular to each other.

But inside an eigenspace they will only be linearly independent; for each eigenvalue, using Gram-Schmidt you can
orthonormalize them. So that is what we did today, okay. So in these 10 minutes or so I have
revised the whole course. In the next two lectures, what I am going to do is look at
abstract vector spaces: vector spaces other than the subspaces of Rn or Rm themselves. They also come
up in your courses later on, so we will spend some time on that. Okay.

308
So we will stop today. We will not go to them today itself. So, today’s lecture ends here. Okay.

309
Basic Linear Algebra
Prof. Inder K. Rana
Department of Mathematics
Indian Institute of Technology – Bombay

Lecture - 37
Abstract Vector Spaces - I

Right so let us begin with today’s topic. So this is going to be abstraction of linear algebra. So
the concepts that we have studied namely Rn’s and so on, they are going to be at abstract now.
The reason being these concepts are used in other branches of mathematics as well as other
branches in engineering and sciences. So let us begin with structure called vector space.
(Refer Slide Time: 01:06)

So the prototype for a vector space is ℝ𝑛 . So where there are vectors, you can add vectors, you
can scalar multiply vectors with various properties. So a vector space V is a non-empty set and
will denote by F the real numbers or the complex numbers. So the set, on the set V there are
two algebraic operations defined. One is called addition, so for elements a and b, so we look at
V x V.

So these are two operations defined on vector space, one is addition, other is we call it as scalar
multiplication.
(Refer Slide Time: 02:27)

310
(Refer Slide Time: 03:45)

We have said here that the 0 and the -a are unique; those facts can be proved using the properties,
and one can also prove obvious facts such as 0 times a is 0, lambda times 0 is 0, and -1 (the additive
inverse of 1 in the field) times a is the same as -a. So these are some obvious
properties that one can prove from the axioms themselves. So we will not prove them. We will just
assume them because we do not have that much time.

But it is a nice exercise to deduce these properties from the basic axioms of addition and the
distributive properties, right.
(Refer Slide Time: 04:34)

311
So the underlying scalars you can choose whichever you like. So that will give you two
different objects. One will be ℂ𝑛 as a vector space over ℝ; another will be ℂ𝑛 as a vector space
over the complex numbers itself. This also we had studied that look at the null space of a matrix
A.

So that addition becomes commutative, associative and so on. So that is the space of all real
polynomials or you can also have complex polynomials and the coefficients are complex
numbers. So these are examples. This is something that you will come across in your
differential equation.

So this is normally one calls the homogenous differential equation but will study these things
in differential equations or solutions of this type okay. So this is a second degree differential
equation, this is what is called the first degree differential equation. So if you have already
done a course in differential equations, you might have already come across. If not when you
do it, you will come across these things.

So let us go further: what did we do when we had subspaces of ℝ𝑛 ? Given a subset of vectors,
we started looking at linear combinations of them, at generating a subspace with them,
and then at when there is a minimal set of generators, right; we called that a basis and so on.
The same things are possible here in abstract vector spaces also.
(Refer Slide Time: 11:45)

312
But before that, let us just look at the set of all real matrices whose entries are bigger than or equal
to 0. Does this form a vector space under the addition of matrices? We have all real m x
n matrices, okay, such that all the entries are bigger than or equal to 0. Do you think this forms a vector
space? Of course, if we add two of them we will again get a matrix of the same type, but if we multiply by
a scalar you may not get a matrix of the same type, for example when the scalar is -1.

If all entries are positive and you multiply by -1, you get all entries negative, right. So that will
not be an element; the usual scalar multiplication does not work, though
maybe some other scalar multiplication you can define. So that says that for a vector space it is
not only the set which is important; it is also important what operation of addition is defined on it and what
operation of scalar multiplication is defined on it, right.

The study of nonlinear differential equations and non-homogeneous differential equations will be a part
of your course on differential equations, right. So let us define the concept of a subspace. Given a
vector space and given a subset, right, we can think of treating the subset itself as a vector space,
but that is possible only when, given two elements of the subset, when you add them you again
get an element of that subset, right.

(Refer Slide Time: 16:55)

313
In a vector space there is no concept of adding an infinite number of vectors, right;
only a finite number is possible, because you can add again and again only finitely many times. So look
at all linear combinations of elements of S, okay, where n varies, the alpha_i vary and
the v_i vary; all possible linear combinations, and call that the square bracket [S]. One can check the
following easily.

If you take two linear combinations, their sum is again a linear combination, and a scalar
multiple is again a linear combination. So this is a subspace, and this is called the subspace
generated by S right. So given a set S in a vector space what is the subspace generated by it? It
is the set of all possible finite linear combinations of elements of S right. That is called the
subspace generated.

So here comes the definition: if S itself is a finite set such that the space generated by S is V, we say V is a
finitely generated vector space. That is, V is a vector space and you have a finite subset of it,
such that when I take linear combinations of elements of that finite set it gives me everything in
V, right; every vector in V can be obtained as a linear combination of elements of that set S, and
that S is finite, so we say our vector space V is finitely generated, right.

A finite number of elements of V gives everything through linear combinations. If not, then it is a
non-finitely generated vector space. That is where abstract vector spaces start
differing from ℝ𝑛 and the subspaces of ℝ𝑛 . So let us look at some examples of this.
(Refer Slide Time: 20:31)

314
Think it now. How many elements are there in a matrix of order 𝑚 × 𝑛? There are 𝑚𝑛
elements.

So I can identify a matrix of order 𝑚 × 𝑛 with ℝ𝑚𝑛 .

A machine only knows a string of elements that you are giving to store.

So that is how the computer stores the data in it, right. So that is a way of saying we can
find a basis for the space of all 𝑚 × 𝑛 matrices also.
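A tiny illustration of that identification (NumPy assumed; the matrix is arbitrary): an m x n matrix is stored as a string of mn numbers, that is, as a vector in R^{mn}.

```python
import numpy as np

M = np.array([[1, 2, 3],
              [4, 5, 6]])        # a 2 x 3 matrix
v = M.reshape(-1)                # the corresponding vector in R^6
print(v)                         # [1 2 3 4 5 6]
print(v.reshape(2, 3))           # and back again, nothing is lost
```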

315
Basic Linear Algebra
Prof. Inder K. Rana
Department of Mathematics
Indian Institute of Technology - Bombay

Lecture – 38
Abstract Vector Spaces II

Let us look at R3 for example or Rd.


(Refer Slide Time: 00:28)

(Refer Slide Time: 00:50)

What are elements of it? They are polynomials. And what is the polynomial look like?

316
(Refer Slide Time: 03:29)

(Refer Slide Time: 07:56)

So let us write in in fact.

(Refer Slide Time: 09:32)

317
Can you think of some other example of a space, vector space which is not finitely generated? The
idea should be, see if you take a vector space which is finitely generated, I will take a subspace of
it, that also is going to be finitely generated, right. Now if you take a vector space which is infinitely
generated and put it inside a bigger thing, that bigger cannot be finitely generated, right.
(Refer Slide Time: 10:27)

(Refer Slide Time: 13:23)

318
We looked at what is called linear independence and dependence. The same can be defined here: given a
finite set of vectors, you can say they are linearly dependent if some linear combination of them is equal to 0
where not all the scalars are equal to 0, that is, one of them should be non-zero. So there is a linear
combination which is 0 with one of the scalars appearing non-zero.
(Refer Slide Time: 14:30)

Any set of vectors is called linearly independent if, whenever you pick up
any finite number of elements of that set S and take a linear combination of them (linear
combinations are defined only for finitely many vectors), that combination being equal to 0 implies every scalar is 0; then that
set S is called linearly independent. Is it clear? So before going further, let us look at some
examples, okay.

319
(Refer Slide Time: 15:42)

320
Basic Linear Algebra
Prof. Inder K. Rana
Department of Mathematics
Indian Institute of Technology- Bombay

Lecture – 39
Abstract Vector Spaces III

(Refer Slide Time: 00:35)

(Refer Slide Time: 02:47)

Now, we have defined independence, we have defined what is generation, so something which is
independent and generates, we call that as a basis of that vector space. So, what is the basis? It is

321
a linearly independent set, and every vector v in V is a linear combination, of course a finite one only,
right; so S may not be finite, but it is a linearly independent set, okay, and every vector v in V is a linear
combination of some finite number of its elements. The finite number may change from vector to vector, right.

So the vector space is not given concretely, a basis is not given, nothing is given; so how do you
show it, when you do not know what the addition is, what the scalar multiplication is, and so on? It is a general theorem. Now, you
see the abstraction coming in: one proves the theorem that whatever the vector space V may be,
whatever the addition and scalar multiplication, you can always find a basis. The idea,
say in the finitely generated case, is the following.

If this vector space is not just {0}, that means there is some element in it which is other than 0; if we have
only 0, then there is nothing to show, it is the zero-dimensional case. If there
is one nonzero element in it, whatever it may be, start with that; so you have got one set, right,
which is linearly independent but may not generate everything, so you want to enlarge it, add
one more element. But how long can you go on adding?

If it is not finitely generated, probably you will go on doing it; when do you stop, how do you stop?
So the concept of the infinite comes into the picture, right, and it relates to some basic things in set theory
which I cannot go into. So this is a theorem which relates to some basic concepts in set
theory, okay; this theorem, that every vector space has a basis, can be proved, but again there are issues there,
because one uses an axiom of set theory called the well-ordering principle, which some
mathematicians do not accept. It is an axiom.

If you assume that axiom, you can prove this theorem; if you do not assume it, you cannot prove or
disprove this theorem. So there is something new coming to you in mathematics: there are
theorems which depend on which axioms of set theory you start with. This is like a game;
mathematics is like a game being played, like a football game or a cricket game, right;
with a different set of rules, something else may come out.

So there are various formulations of the axioms of set theory in which what is called the well-
ordering principle is not included, okay. 90% of mathematicians

322
assume that axiom and go ahead; with it you can prove this theorem. For others, unless you construct one,
right, a set which generates it, they would not believe it, kind of thing, right. So anyway, we will
assume that every vector space has a basis.

Once that is done, there are 2 possibilities: it is finitely generated, or it is infinitely
generated. If it is finitely generated, then one can prove that any 2 bases will have the
same number of elements. Finitely generated, right, means there is a finite set S
which is linearly independent and which generates, right; so finitely generated implies
that any 2 bases have the same number of elements.

And that is called the dimension of that finitely generated vector space. If it is infinitely generated,
we say it has infinite dimension; it is an infinite dimensional vector space. So among vector spaces
there are finite dimensional ones and there are infinite dimensional ones, right. So let us show, right
here, that R[x] is of the second kind, okay; let us look at examples.
(Refer Slide Time: 09:07)

So, it is better to so say it is infinitely generating, right or it is infinite dimensional whichever way
you want to call it, okay.
(Refer Slide Time: 12:15)

323
(Refer Slide Time: 12:41)

So this will come across when you study solutions of differential equations; this will come
back to you: how do you describe all possible solutions of a differential equation? You will look at
the homogeneous part of it; the homogeneous part will give you a vector space,
and you want to generate that vector space by describing a minimum number of independent
solutions. So this kind of thing will come up there, okay.
(Refer Slide Time: 17:08)

324
(Refer Slide Time: 19:56)

(Refer Slide Time: 27:48)

325
And all this is becoming possible to check because everything is finite dimensional, so we are able
to bring everything down to a system of equations, basically, right. For example, here it becomes 3
equations in the unknown coefficients, right; you can solve them. So that is the advantage of finite
dimensional vector spaces: you can do computations with them, okay.
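An illustrative sketch of that reduction (NumPy assumed; the polynomials are made up): a question such as whether 1 + x, 1 - x and 1 + x² are linearly independent in the space of polynomials of degree at most 2 becomes a rank computation on their coefficient vectors.

```python
import numpy as np

coeffs = np.array([[1, 1, 0],     # 1 + x      (constant, x, x^2)
                   [1, -1, 0],    # 1 - x
                   [1, 0, 1]])    # 1 + x^2

print(np.linalg.matrix_rank(coeffs))   # 3, so the three polynomials are independent
```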
(Refer Slide Time: 27:55)

(Refer Slide Time: 30:48)

326
(Refer Slide Time: 31:53)

What it does in the background is this: there is an algorithm which says replace this by the tangent line
approximation, a linear one, calculate and give me the value. So most algorithms are based
on this; to be slightly better you use a quadratic approximation and so on, but linear is the simplest one, so these
are linear maps. And of course, if the vector spaces are finite dimensional, right, every linear map
will give rise to a matrix with respect to ordered bases, like we have done earlier, the same way, right, so they are easier
to handle.

327
So, we will stop here today. Next time, what we will do is; see how the notion of angle right is
taken over to an abstract vector spaces, perpendicularity, okay.

328
Basic Linear Algebra
Prof. Inder K. Rana
Department of Mathematics
Indian Institute of Technology – Bombay

Lecture - 40
Inner Product Spaces - I

So last lecture we had seen how abstract vector spaces can be defined and now will look at the
notion of perpendicularity and the inner product on the abstract vector spaces.
(Refer Slide Time: 00:45)

So basically the idea is that the properties of the inner product the dot product in ℝ𝑛 are taken
over as definition of an inner product. So given any vector space which is either real or complex
depending on whether reals or complex, for every pair of vectors v and w, we associate a
number which is real in case it is over the reals, is the complex number if it is over the complex
numbers.

So basically the properties which were true for inner product in ℝ𝑛 are taken as definitions in
an inner product space, nothing more than that okay.
(Refer Slide Time: 05:01)

329
We have Cauchy-Schwartz inequality for the ℝ𝑛 .
Once you have the notion of dot product, you can define the notion of perpendicularity right in
inner product space.
(Refer Slide Time: 07:06)

So Pythagoras theorem is true, Parallelogram law is true for abstract inner product spaces once
we have the notion of the perpendicularity right. Same proof so that nothing, no change comes
okay.
(Refer Slide Time: 08:32)

330
(Refer Slide Time: 10:52)

It belongs to vector space because addition of function is associative, commutative right, zero
function is there, so all those properties hold. So it becomes a vector space, so this is addition
and scalar multiplication, make it a vector space.
(Refer Slide Time: 12:56)

331
(Refer Slide Time: 14:47)

See if the function is not continuous then you can give many kinds of examples.
(Refer Slide Time: 17:50)

332
(Refer Slide Time: 18:24)

So what we are saying is that this space 𝐶[𝑎, 𝑏] with ⟨𝑓, 𝑔⟩ = ∫_𝑎^𝑏 𝑓(𝑥)𝑔(𝑥) 𝑑𝑥 is an
inner product space, right, okay. Just for the sake of more understanding: this is a vector space,
right, over the reals, and we have shown that it is infinite dimensional, right. This was an example
of a vector space which is infinite dimensional. So you have got an infinite dimensional vector space
on which a notion of inner product is also defined, right; so keep that much in mind, okay.
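A small numerical sketch of this inner product (NumPy assumed; the interval and the functions are illustrative), approximating the integral on a fine grid:

```python
import numpy as np

def inner(f, g, a=0.0, b=1.0, n=100_001):
    x = np.linspace(a, b, n)
    return np.trapz(f(x) * g(x), x)      # trapezoidal approximation of the integral

f = lambda x: x
g = lambda x: x**2
print(inner(f, g))          # ~ 0.25, since the integral of x^3 on [0, 1] is 1/4
print(inner(f, f) >= 0)     # positivity of <f, f>
```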

333
Basic Linear Algebra
Prof. Inder K. Rana
Department of Mathematics
Indian Institute of Technology – Bombay

Lecture - 41
Inner Product Spaces - II

(Refer Slide Time: 00:26)

So once we have the notion of perpendicularity, we can talk about orthogonal sets. A set of
vectors in an inner product space, we will call it as orthogonal, if any two of them are
perpendicular to each other right. Whatever be the definition of dot product, once there is a dot
product right we can define perpendicularity with respect to that dot product okay.

And we will say it is orthonormal if, in addition, each vector has got length 1, right, just as in the finite dimensional
spaces, okay. The same theorem works here also, with the same proof: if a set of
vectors is orthogonal, then it is also linearly independent. It is the same proof which was there earlier,
no change at all, right. So we will not go through the proof again; the same proof works.
(Refer Slide Time: 01:26)

334
It is just checking, nothing more than that, and this is a very important orthonormal set. If you have already
done something in differential equations or partial differential equations, you will have come across
something called Fourier series, right. So what is a Fourier series there? It is expanding
a function in terms of cos and sin, and these are precisely the
basis functions you can think of.

This is an orthogonal set in terms of which you try to expand every function, right. So this will
come back to you when you study the Fourier series problem, okay; it is important there. The
important thing is that this is an orthonormal set, okay, an infinite orthonormal set: any
two of them are perpendicular to each other. And any collection which is mutually
orthogonal is also linearly independent.
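A quick numerical check of that orthogonality (NumPy assumed; the inner product is the integral over [-π, π], approximated on a grid):

```python
import numpy as np

x = np.linspace(-np.pi, np.pi, 200_001)

def inner(f, g):
    return np.trapz(f * g, x)

print(round(inner(np.sin(x), np.cos(x)), 6))        # ~ 0
print(round(inner(np.sin(x), np.sin(2 * x)), 6))    # ~ 0
print(round(inner(np.sin(x), np.sin(x)), 6))        # ~ pi, so divide by sqrt(pi) to normalize
```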
(Refer Slide Time: 06:18)

335
Now if you recall, we showed that in a finite dimensional vector space, given a basis, you can
make an orthonormal basis out of it, right. What was the process of converting a basis to an
orthonormal basis? It was basically using what is called the Gram-Schmidt process, right:
at every stage remove the projections on the previously constructed vectors. So I am just recalling;
so, given vectors, okay.
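A hedged sketch of the same process in this abstract setting (NumPy assumed; the interval [-1, 1], the starting functions 1, x, x² and the integral inner product are assumptions for illustration):

```python
import numpy as np

t = np.linspace(-1.0, 1.0, 200_001)
inner = lambda f, g: np.trapz(f * g, t)      # <f, g> = integral of f g over [-1, 1]

basis = [np.ones_like(t), t, t**2]
ortho = []
for v in basis:
    w = v.copy()
    for u in ortho:                          # subtract projections on earlier vectors
        w = w - inner(w, u) * u
    ortho.append(w / np.sqrt(inner(w, w)))   # normalize

print(round(inner(ortho[0], ortho[2]), 6))   # ~ 0: the resulting functions are orthonormal
```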
(Refer Slide Time: 08:46)

So what was the advantage of the Gram-Schmidt process? It gave you a way of finding
orthonormal basis for a finite dimensional vector space. Start with a basis, convert it into an
orthonormal basis. So you can say for a finite dimensional vector space right every finite
dimensional inner product space has got an orthonormal basis because there will be some basis
and I can convert that into orthonormal but is it same theorem true for infinite dimension?

336
Can you say that for an infinite dimensional inner product space there is an orthonormal basis?
That is a difficult question to answer, right, and it leads to the study of spaces
called Hilbert spaces. We will not go into that, okay, will not bother about it; those of you
who go on to take some minor courses later on in mathematics, in something like finite element
methods or such things, will find this coming up there.

So this leads to a subject, a topic called functional analysis. These are function spaces, right,
spaces of functions. Questions such as whether a space of functions, as a vector space, has got a
basis, a notion of magnitude and a notion of angle, and whether an orthonormal basis exists, fall under
the study of a subject called functional analysis. So if you happen to take that,
this will be a starting point for it, okay. We will not go into it because it requires a lot
more machinery to understand, okay.
(Refer Slide Time: 10:38)

(Refer Slide Time: 12:43)

337
But these are not orthonormal; you have to normalize them, right. They are
only orthogonal, so you can divide each by its norm and normalize them.
(Refer Slide Time: 13:30)

(Refer Slide Time: 14:36)

338
So what is the definition taken? The definition is you take the properties of a dot product in ℝ𝑛
and transport it as a definition on any vector space right. So that is one, so once it has a notion
of dot product, it gives you the notion of perpendicularity. Once there is a notion of
perpendicularity, you can talk about orthonormalization process, Gram-Schmidt process.

But that all of them, any infinite orthonormal sequence, will give the same space is not true actually; we need
something more, okay. That is the study of those other spaces. Now I will just say
something about what we did in finite dimensional spaces: we looked at what are called maps which are
isometries. Once there is a notion of inner product, you can define the notion of an
isometry, right.

What is an isometry? The corresponding theory holds here too, but one has to do a lot of work, and that again
belongs to the field of functional analysis. We looked at what are called orthogonal
transformations between vector spaces and so on. So again, I am just saying this is the beginning
of another topic called functional analysis, right, which some of you will probably come
across. So we will stop here with the course, okay.

Nothing more in the course, because mostly we have dealt with finite dimensional spaces
only, right, and their properties; so with that we end the course.

339
THIS BOOK IS NOT FOR SALE
NOR COMMERCIAL USE

(044) 2257 5905/08 nptel.ac.in swayam.gov.in
