MTH212
CONTENTS
1.0 Introduction
2.0 Objectives
3.0 Main Content
3.1 Linear Transformations
3.2 Spaces Associated with a Linear Transformation
3.3 The Range Space and the Kernel
3.4 Rank and Nullity
3.5 Some types of Linear Transformations
3.6 Homomorphism Theorems
4.0 Conclusion
5.0 Summary
6.0 Tutor-Marked Assignment
7.0 References/Further Reading
1.0 INTRODUCTION
You have already learnt about a vector space and several concepts related to it. In this
unit we initiate the study of certain mappings between two vector spaces, called linear
transformations. The importance of these mappings can be realized from the fact that,
in the calculus of several variables, every continuously differentiable function can be
replaced, to a first approximation, by a linear one. This fact is a reflection of a general
principle that every problem on the change of some quantity under the action of
several factors can be regarded, to a first approximation, as a linear problem. It often
turns out that this gives an adequate result. Also, in physics it is important to know
how vectors behave under a change of the coordinate system. This requires a study of
linear transformations.
In this unit we study linear transformations and their properties, as well as two spaces
associated with a linear transformation, and their dimensions. Then, we prove the
existence of linear transformations with some specific properties, and discuss the notion
of an isomorphism between two vector spaces, which allows us to say that all finite-
dimensional vector spaces of the same dimension are the "same", in a certain sense.
Finally, we state and prove the Fundamental Theorem of Homomorphism and some of
its corollaries, and apply them to various situations.
2.0 OBJECTIVES
By now you are familiar with the vector spaces R2 and R3. Now consider the mapping
f: R2 → R3 : f(x, y) = (x, y, 0) (see Fig. 1).
[Fig. 1: f transforms ABCD to A`B`C`D`.]
(i) for any (a, b), (c, d) ∈ R2, f((a, b) + (c, d)) = (a + c, b + d, 0) = (a, b, 0) + (c, d, 0) = f((a, b)) + f((c, d)), and
(ii) for any α ∈ R and (a, b) ∈ R2, f((αa, αb)) = (αa, αb, 0) = α(a, b, 0) = αf((a, b)).
So we have a function f between two vector spaces such that (i) and (ii) above
hold true.
(i) says that the sum of two plane vectors is mapped under f to the sum of
their images under f.
(ii) says that a line in the plane R2 is mapped under f to a line in R3.
The properties (i) and (ii) together say that f is linear, a term that we now define.
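Properties (i) and (ii) can also be checked numerically. Here is a quick sketch in Python (the helper name `f` and the sample vectors are our own choices, for illustration only):

```python
import numpy as np

def f(v):
    # The map of Fig. 1: f(x, y) = (x, y, 0), from R^2 into R^3.
    x, y = v
    return np.array([x, y, 0.0])

u, v = np.array([1.0, 2.0]), np.array([-3.0, 0.5])

# Property (i): the image of a sum is the sum of the images.
assert np.allclose(f(u + v), f(u) + f(v))

# Property (ii): the image of a scalar multiple is the scalar multiple of the image.
alpha = 2.5
assert np.allclose(f(alpha * u), alpha * f(u))
```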
Definition: Let U and V be vector spaces over a field F. A linear transformation (or
linear operator) from U to V is a function T: U → V such that
LT1) T(u1 + u2) = T(u1) + T(u2), for u1, u2 ∈ U, and
LT2) T(αu) = αT(u), for α ∈ F and u ∈ U.
The conditions LT1 and LT2 can be combined to give the following equivalent
condition.
LT3) T(α1u1 + α2u2) = α1T(u1) + α2T(u2), for α1, α2 ∈ F and u1, u2 ∈ U.
What we are saying is that [LT1 and LT2] ⇔ LT3. This can easily be shown as
follows:
We will show that LT3 ⇒ LT1 and LT3 ⇒ LT2. Now, LT3 is true ∀ α1, α2 ∈ F.
Therefore, it is certainly true for α1 = 1 = α2, that is, LT1 holds.
Now, to show that LT2 is true, consider T(αu) for any α ∈ F and u ∈ U. We have
T(αu) = T(αu + 0.u) = αT(u) + 0.T(u) = αT(u), thus proving that LT2 holds.
You can try and prove the converse now. That is what the following exercise is all
about!
E E1) Show that the conditions LT1 and LT2 together imply LT3.
Before going further, let us note two properties of any linear transformation T:U V,
which follow from LT1 (or LT2, or LT3).
LT4) T(0) = 0. Let's see why this is true. Since T(0) = T(0 + 0) = T(0) + T(0) (by
LT1), we subtract T(0) from both sides to get T(0) = 0.
LT5) T(−u) = −T(u) ∀ u ∈ U. Why is this so? Well, since 0 = T(0) = T(u − u) = T(u) +
T(−u), we get T(−u) = −T(u).
E E2) Can you show how LT4 and LT5 will follow from LT2?
Check that T is a linear transformation. (It is called the null or zero transformation,
and is denoted by 0.)
Solution: For any α, β ∈ F and u1, u2 ∈ U, we have T(αu1 + βu2) = 0 = α.0 + β.0 =
αT(u1) + βT(u2).
Solution: We will use LT3 to show that projection is a linear operator. For α, β ∈ R
and (x1, …, xn), (y1, …, yn) in Rn, we have
Remark: Consider the function p: R3 → R2 : p(x, y, z) = (x, y). This is a projection from
R3 onto the xy-plane. Similarly, the functions f and g, from R3 to R2, defined by
f(x, y, z) = (x, z) and g(x, y, z) = (y, z), are projections from R3 onto the xz-plane and the
yz-plane, respectively.
In general, any function from Rn to Rm (n > m), which is defined by dropping any (n − m)
coordinates, is a projection map.
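A minimal sketch of such coordinate-dropping projections (the helper name `projection` and the index convention are ours, not from the text):

```python
def projection(v, keep):
    # A projection R^n -> R^m obtained by keeping the listed coordinate
    # positions and dropping the other n - m coordinates.
    return tuple(v[i] for i in keep)

v = (1.0, 2.0, 3.0)
assert projection(v, (0, 1)) == (1.0, 2.0)   # p(x, y, z) = (x, y): xy-plane
assert projection(v, (0, 2)) == (1.0, 3.0)   # f(x, y, z) = (x, z): xz-plane
assert projection(v, (1, 2)) == (2.0, 3.0)   # g(x, y, z) = (y, z): yz-plane
```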
Now let us see another example of a linear transformation that is very geometric in
nature.
[Figure: reflection in the x-axis maps P(2, 1) to Q(2, −1).]
T[α(x1, y1) + β(x2, y2)] = T(αx1 + βx2, αy1 + βy2) = (αx1 + βx2, −αy1 − βy2)
So far we've given examples of linear transformations. Now we give an example of a
very important function which is not linear. This example's importance lies in its
geometric applications.
[Figure: the square ABCD and its image A`B`C`D` under the map.]
E E3) Let T: R2 → R2 be the reflection in the y-axis. Find an expression for T as
in Example 4. Is T a linear operator?
E E4) For a fixed vector (a1, a2, a3) in R3, define the mapping T: R3 → R by
T(x1, x2, x3) = a1x1 + a2x2 + a3x3. Show that T is a linear transformation. Note
that T(x1, x2, x3) is the dot product of (x1, x2, x3) and (a1, a2, a3) (ref. Sec. 2.4).
E E5) Show that the map T: R3 → R3 defined by T(x1, x2, x3) = (x1 + x2 − x3, 2x1 −
x2, x2 + 2x3) is a linear operator.
You came across the real vector space Pn, of all polynomials of degree less than or
equal to n, in Unit 4. The next exercise concerns it.
Show that D: Pn → Pn is a linear transformation. (Observe that Df is nothing but the
derivative of f. D is called the differentiation operator.)
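The differentiation operator can be sketched on coefficient lists; this Python check (our own encoding of polynomials as lists, not part of the text) verifies linearity on a sample:

```python
def D(coeffs):
    # A polynomial a0 + a1 x + ... + an x^n is stored as [a0, a1, ..., an];
    # D returns the coefficient list of its derivative.
    return [k * a for k, a in enumerate(coeffs)][1:] or [0]

# D(2 + 3x + x^2) = 3 + 2x
assert D([2, 3, 1]) == [3, 2]

# Linearity check: D(f + g) = Df + Dg for two sample polynomials.
f, g = [1, 4, 0, 2], [5, -1, 3, 0]
sum_fg = [a + b for a, b in zip(f, g)]
assert D(sum_fg) == [a + b for a, b in zip(D(f), D(g))]
```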
In Unit 3 we introduced you to the concept of a quotient space. We now define a very
useful linear transformation, using this concept.
You have already seen that a linear transformation T: U → V must satisfy T(α1u1 +
α2u2) = α1T(u1) + α2T(u2), for α1, α2 ∈ F and u1, u2 ∈ U. More generally, we can show
that T(α1u1 + … + αnun) = α1T(u1) + … + αnT(un), for α1, …, αn ∈ F and u1, …, un ∈ U.
Let us show this by induction. That is, we assume the above relation for n = m, and
prove it for n = m + 1.
Thus, the result is true for n = m + 1. Hence, by induction, it holds true for all n.
Let us now come to a very important property of any linear transformation T: U → V.
In Unit 4 we mentioned that every vector space has a basis. Thus, U has a basis. We
will now show that T is completely determined by its values on a basis of U. More
precisely, we have
What we have just proved is that once we know the values of T on a basis of U, then
we can find T(u) for any u ∈ U.
Note: Theorem 1 is true even when U is not finite-dimensional. The proof, in this
case, is on the same lines as above.
Let us see how the idea of Theorem 1 helps us to prove the following useful result.
Proof: A basis for R is {1}. Let T(1) = v ∈ V. Then, for any α ∈ R, T(α) = αT(1) = αv.
Once you have read Sec. 5.3 you will realize that this theorem says that T(R) is a
vector space of dimension one, whose basis is {T(1)}.
Now try the following exercise, for which you will need Theorem 1.
E E8) We define a linear operator T: R2 → R2 by T(1, 0) = (0, 1) and T(0, 1) =
(1, 0). What is T(3, 5)? What is T(5, 3)?
Now we shall prove a very useful theorem about linear transformations, which is
linked to Theorem 1.
Theorem 3: Let {e1, …, en} be a basis of U and let v1, …, vn be any n vectors in V.
Then there exists one and only one linear transformation T: U → V such that T(ei) =
vi, i = 1, …, n.
Proof: Let u ∈ U. Then u can be uniquely written as u = α1e1 + … + αnen
(see Unit 4, Theorem 9).
Define T(u) = α1v1 + … + αnvn. Then T defines a mapping from U to V such that T(ei)
= vi ∀ i = 1, …, n. Let us now show that T is linear. Let a, b be scalars and u, u` ∈
U. There exist scalars α1, …, αn, β1, …, βn such that u = α1e1 + … + αnen and u` =
β1e1 + … + βnen.
Therefore, T is a linear transformation with the property that T(ei) = vi ∀ i. Theorem
1 now implies that T is the only linear transformation with the above properties.
Solution: By Theorem 3 we know that ∃ T: R3 → R2 such that T(e1) = (1, 2), T(e2) =
(2, 3), and T(e3) = (3, 4). We want to know what T(x) is, for any x = (x1, x2, x3) ∈ R3.
Now, x = x1e1 + x2e2 + x3e3.
Therefore, T(x1, x2, x3) = (x1 + 2x2 + 3x3, 2x1 + 3x2 + 4x3) is the definition of the
linear transformation T.
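This construction can be sketched numerically: stacking the given images of the standard basis as matrix columns recovers T everywhere (the matrix encoding is our own illustration, not part of the text):

```python
import numpy as np

# Images of the standard basis of R^3 under T, taken from the example:
# T(e1) = (1, 2), T(e2) = (2, 3), T(e3) = (3, 4).  Stacking them as the
# columns of a matrix recovers T on all of R^3.
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 3.0, 4.0]])

def T(x):
    return A @ np.asarray(x, dtype=float)

# This agrees with T(x1, x2, x3) = (x1 + 2x2 + 3x3, 2x1 + 3x2 + 4x3).
assert np.allclose(T([1, 0, 0]), [1, 2])
assert np.allclose(T([1, 1, 1]), [6, 9])
```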
In Unit 1 you found that given any function, there is a set associated with it, namely,
its range. We will now consider two sets which are associated with any linear
transformation, T. These are the range and the kernel of T.
Let U and V be vector spaces over a field F. Let T: U → V be a linear transformation.
We will define the range of T as well as the kernel of T. At first, you will see them as
sets. We will prove that these sets are also vector spaces over F.
Definition: The range of T, denoted by R(T), is the set {T(x) | x ∈ U}. The kernel
(or null space) of T, denoted by Ker T, is the set {x ∈ U | T(x) = 0}. Note that R(T)
⊆ V and Ker T ⊆ U.
Example 8: Let I: V → V be the identity transformation (see Example 1). Find R(I)
and Ker I.
Example 9: Let T: R3 → R be defined by T(x1, x2, x3) = 3x1 + x2 + 2x3. Find R(T) and
Ker T.
Solution: R(T) = {x ∈ R | ∃ x1, x2, x3 ∈ R with 3x1 + x2 + 2x3 = x}. For example, 0 ∈
R(T), since 0 = 3.0 + 0 + 2.0 = T(0, 0, 0).
Also, 1 ∈ R(T), since 1 = 3.(1/3) + 0 + 2.0 = T(1/3, 0, 0), or 1 = 3.0 + 1 + 2.0 = T(0, 1,
0), or 1 = T(0, 0, 1/2), or 1 = T(1/6, 1/2, 0).
Now can you see that R(T) is the whole real line R? This is because, for any α ∈ R, α
= α.1 = α.T(1/3, 0, 0) = T(α/3, 0, 0) ∈ R(T).
For example, (0, 0, 0) ∈ Ker T. But (1, 0, 0) ∉ Ker T. ∴ Ker T ≠ R3. In fact, Ker T is
the plane 3x1 + x2 + 2x3 = 0 in R3.
Example 10: Let T: R3 → R3 be defined by T(x1, x2, x3) = (x1 − x2 + 2x3, 2x1 + x2, −x1
− 2x2 + 2x3). Find R(T) and Ker T.
Solution: To find R(T), we must find conditions on y1, y2, y3 ∈ R so that (y1, y2, y3) ∈
R(T), i.e., we must find some (x1, x2, x3) ∈ R3 so that (y1, y2, y3) = T(x1, x2, x3) =
(x1 − x2 + 2x3, 2x1 + x2, −x1 − 2x2 + 2x3).
This means
x1 − x2 + 2x3 = y1 ……… (1)
2x1 + x2 = y2 ……… (2)
−x1 − 2x2 + 2x3 = y3 ……… (3)
Subtracting 2 times Equation (1) from Equation (2) and adding Equations (1) and (3),
we get
x3 = 0, x2 = (y2 − 2y1)/3 and x1 = y1 + (y2 − 2y1)/3 = (y1 + y2)/3.
Now (x1, x2, x3) ∈ Ker T if and only if the following equations are true:
x1 − x2 + 2x3 = 0
2x1 + x2 = 0
−x1 − 2x2 + 2x3 = 0
Thus, we can give arbitrary values to x2 and calculate x1 and x3 in terms of x2.
Therefore, Ker T = {(−α/2, α, (3/4)α) : α ∈ R}.
In this example, we see that finding R(T) and Ker T amounts to solving a system of
equations. In Unit 9, you will learn a systematic way of solving a system of linear
equations by the use of matrices and determinants.
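That system-solving view can be checked numerically. A sketch, assuming the matrix encoding of T from Example 10 (the variable names are ours):

```python
import numpy as np

# The matrix of T(x1, x2, x3) = (x1 - x2 + 2x3, 2x1 + x2, -x1 - 2x2 + 2x3)
# from Example 10.
A = np.array([[ 1.0, -1.0, 2.0],
              [ 2.0,  1.0, 0.0],
              [-1.0, -2.0, 2.0]])

# Each vector of the claimed kernel (-a/2, a, (3/4)a) is sent to 0 ...
a = 4.0
v = np.array([-a / 2, a, 3 * a / 4])
assert np.allclose(A @ v, 0)

# ... and the kernel is exactly one-dimensional, since the matrix has rank 2.
assert np.linalg.matrix_rank(A) == 2
```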
The following exercises will help you in getting used to R(T) and Ker T.
E E10) Let T be the zero transformation given in Example 2. Find Ker T and
R(T). Does 1 ∈ R(T)?
E E11) Find R(T) and Ker T for each of the following operators.
(Note that the operators in (a) and (b) are projections onto the xy-plane and the z-axis,
respectively).
Now that you are familiar with the sets R(T) and Ker T, we will prove that they are
vector spaces.
Theorem 4: Let U and V be vector spaces over a field F. Let T: U → V be a linear
transformation. Then Ker T is a subspace of U and R(T) is a subspace of V.
Proof: Let x1, x2 ∈ Ker T ⊆ U and α1, α2 ∈ F. Now, by definition, T(x1) = T(x2) = 0.
Therefore, T(α1x1 + α2x2) = α1T(x1) + α2T(x2) = 0, so that α1x1 + α2x2 ∈ Ker T. Thus,
Ker T is a subspace of U.
Let y1, y2 ∈ R(T) ⊆ V, and α1, α2 ∈ F. Then, by definition of R(T), there exist x1, x2 ∈ U
such that T(x1) = y1 and T(x2) = y2. Therefore, α1y1 + α2y2 = α1T(x1) + α2T(x2)
= T(α1x1 + α2x2) ∈ R(T). Thus, R(T) is a subspace of V.
Now that we have proved that R(T) and Ker T are vector spaces, you know, from Unit
4, that they must have a dimension. We will study these dimensions now.
Consider any linear transformation T: U → V, assuming that dim U is finite. Then Ker
T, being a subspace of U, has finite dimension and dim (Ker T) ≤ dim U. Also note
that R(T) = T(U), the image of U under T, a fact you will need to use in solving the
following exercise.
E E12) Let {e1, …, en} be a basis of U. Show that R(T) is generated by {T(e1),
…, T(en)}.
From E12 it is clear that, if dim U = n, then dim R(T) ≤ n. Thus, dim R(T) is finite,
and the following definition is meaningful.
Definition: The rank of T is defined to be the dimension of R(T), the range space of
T. The nullity of T is defined to be the dimension of Ker T, the kernel (or the null
space) of T.
Thus, rank (T) = dim R(T) and nullity (T) = dim Ker T.
We have already seen that rank (T) ≤ dim U and nullity (T) ≤ dim U.
Example 11: Let T: U → V be the zero transformation given in Example 2. What are
the rank and nullity of T?
Solution: In E10 you saw that R(T) = {0} and Ker T = U. Therefore, rank (T) = 0 and
nullity (T) = dim U.
Note that rank (T) + nullity (T) = dim U, in this case.
E E13) If T is the identity operator on V, find rank (T) and nullity (T).
E E14) Let D be the differentiation operator in E6. Give a basis for the range
space of D and for Ker D. What are rank (D) and nullity (D)?
In the above example and exercises you will find that for T: U → V, rank (T) + nullity
(T) = dim U. In fact, this is the most important result about the rank and nullity of a linear
operator. We will now state and prove this result.
Theorem 5: Let U and V be vector spaces over a field F and dim U = n. Let T: U →
V be a linear operator. Then rank (T) + nullity (T) = n.
Proof: Let nullity (T) = m, that is, dim Ker T = m. Let {e1, …, em} be a basis of Ker
T. We know that Ker T is a subspace of U. Thus, by Theorem 11 of Unit 4, we can
extend this basis to obtain a basis {e1, …, em, em+1, …, en} of U. We shall show
that {T(em+1), …, T(en)} is a basis of R(T). Then, our result will follow because
dim R(T) will be n − m = n − nullity (T).
Let us first prove that {T(em+1), …, T(en)} spans, or generates, R(T). Let y ∈ R(T).
Then, by definition of R(T), there exists x ∈ U such that T(x) = y.
Therefore, there exist a1, …, am ∈ F such that am+1em+1 + … + anen = a1e1 + … +
amem, that is, (−a1)e1 + … + (−am)em + am+1em+1 + … + anen = 0.
Since {e1, …, en} is a basis of U, it follows that this set is linearly independent.
Hence, −a1 = 0, …, −am = 0, am+1 = 0, …, an = 0. In particular, am+1 = … = an = 0,
which is what we wanted to prove.
Therefore, dim R(T) = n − m = n − nullity (T), that is, rank (T) + nullity (T) = n. Let us
see how this theorem can be useful.
Example 12: Let L: R3 → R be the map given by L(x, y, z) = x + y + z. What is nullity
(L)?
Solution: In this case it is easier to obtain R(L), rather than Ker L. Since L(1, 0, 0) = 1
≠ 0, R(L) ≠ {0}, and hence dim R(L) ≠ 0. Also, R(L) is a subspace of R. Thus, dim
R(L) ≤ dim R = 1. Therefore, the only possibility is dim R(L) = 1. By
Theorem 5, dim Ker L + dim R(L) = 3, so that nullity (L) = dim Ker L = 2.
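The rank-nullity count in Example 12 can be verified with a two-line computation (the matrix form of L is our own encoding):

```python
import numpy as np

# L(x, y, z) = x + y + z as a 1 x 3 matrix acting on R^3.
A = np.array([[1.0, 1.0, 1.0]])

rank = np.linalg.matrix_rank(A)
nullity = A.shape[1] - rank   # Theorem 5: nullity(L) = dim U - rank(L)

assert rank == 1
assert nullity == 2
```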
E E15) Give the rank and nullity of each of the linear transformations in E11.
E E16) Let U and V be real vector spaces and T: U → V be a linear
transformation, where dim U = 1. Show that R(T) is either a point or a line.
Before ending this section we will prove a result that links the rank (or nullity) of the
composite of two linear operators with the rank (or nullity) of each of them.
Theorem 6: Let V be a vector space over a field F. Let S and T be linear operators
from V to V. Then
Proof: We shall prove (a). Note that (ST)(v) = S(T(v)) for any v ∈ V.
Therefore, R(ST) ⊆ R(S). This implies that rank (ST) ≤ rank (S).
Also, R(ST) = S(R(T)), so that dim R(ST) = dim S(R(T)) ≤ dim R(T) (since dim L(W)
≤ dim W, for any linear operator L on a space W).
Let us recall, from Unit 1, that there can be different types of functions, some of
which are one-one, onto or invertible. We can also define such types of linear
transformations as follows:
a) T is called one-one (or injective) if, for u1, u2 ∈ U with u1 ≠ u2, we have T(u1) ≠
T(u2). If T is injective, we also say T is 1 − 1.
Note that T is 1 − 1 if T(u1) = T(u2) ⇒ u1 = u2.
b) T is called onto (or surjective) if, for each v ∈ V, ∃ u ∈ U such that T(u) = v,
that is, R(T) = V.
Can you think of examples of such functions? The identity operator is both one-one
and onto. Why is this so? Well, I: V → V is an operator such that, if v1, v2 ∈ V with v1
≠ v2, then I(v1) ≠ I(v2). Also, R(I) = V, so that I is onto.
Proof: First assume T is one-one. Let u ∈ Ker T. Then T(u) = 0 = T(0). This
means that u = 0. Thus, Ker T = {0}. Conversely, let Ker T = {0}. Suppose u1, u2 ∈ U
with T(u1) = T(u2). Then T(u1 − u2) = 0 ⇒ u1 − u2 ∈ Ker T ⇒ u1 − u2 = 0 ⇒ u1 = u2.
Therefore, T is 1 − 1.
Suppose now that T is a one-one and onto linear transformation from a vector space
U to a vector space V. Then, from Unit 1 (Theorem 4), we know that T⁻¹ exists. But
is T⁻¹ linear? The answer to this question is 'yes', as is shown in the following
theorem.
Theorem 8: Let U and V be vector spaces over a field F. Let T: U → V be a one-one
and onto linear transformation. Then T⁻¹: V → U is a linear transformation.
Proof: Let y1, y2 ∈ V and α1, α2 ∈ F. Suppose T⁻¹(y1) = x1 and T⁻¹(y2) = x2. Then, by
definition, y1 = T(x1) and y2 = T(x2).
We will now show that T⁻¹ is 1 − 1. For this, suppose y1, y2 ∈ V are such that T⁻¹(y1) =
T⁻¹(y2). Let x1 = T⁻¹(y1) and x2 = T⁻¹(y2).
Then T(x1) = y1 and T(x2) = y2. We know that x1 = x2. Therefore, T(x1) = T(x2),
that is, y1 = y2. Thus, we have shown that T⁻¹(y1) = T⁻¹(y2) ⇒ y1 = y2, proving that
T⁻¹ is 1 − 1. T⁻¹ is also surjective because, for any u ∈ U, v = T(u) ∈ V satisfies T⁻¹(v)
= u.
Theorem 8 says that a one-one and onto linear transformation is invertible, and the
inverse is also a one-one and onto linear transformation.
Definition: Let U and V be vector spaces over a field F, and let T: U → V be a one-
one and onto linear transformation. Then T is called an isomorphism between U and
V. In this case we say that U and V are isomorphic vector spaces. This is denoted
by U ≅ V.
An obvious example of an isomorphism is the identity operator. Can you think of any
other? The following exercise may help.
In all these exercises and examples, have you noticed that if T is an isomorphism
between U and V then T⁻¹ is an isomorphism between V and U?
Using these properties of an isomorphism we can get some useful results, like the
following:
Proof: First we show that the set {T(e1), …, T(en)} spans V. Since T is onto, R(T) =
V. Thus, from E12 you know that {T(e1), …, T(en)} spans V.
Let us now show that {T(e1), …, T(en)} is linearly independent. Suppose there exist
scalars c1, …, cn such that c1T(e1) + … + cnT(en) = 0 ……… (1)
Remark: The argument showing the linear independence of {T(e1), …, T(en)} in the
above theorem can be used to prove that any one-one linear transformation T: U → V
maps any linearly independent subset of U onto a linearly independent subset of V
(see E22).
We now give an important result equating 'isomorphism' with '1 − 1' and with 'onto'
in the finite-dimensional case.
a) T is 1 – 1
b) T is onto.
c) T is an isomorphism.
Proof: To prove the result we will prove that (a) ⇒ (b) ⇒ (c) ⇒ (a). Let dim U = dim
V = n.
Now (a) implies that Ker T = {0} (from Theorem 7). Hence, nullity (T) = 0.
Therefore, by Theorem 5, rank (T) = n, that is, dim R(T) = n = dim V. But R(T) is a
subspace of V. Thus, by the remark following Theorem 12 of Unit 4, we get R(T) = V,
i.e., T is onto, i.e., (b) is true. So (a) ⇒ (b).
Similarly, if (b) holds then rank (T) = n, and hence, nullity (T) = 0. Consequently, Ker
T = {0}, and T is one-one. Hence, T is one-one and onto, i.e., T is an isomorphism.
Therefore, (b) implies (c).
That (a) follows from (c) is immediate from the definition of an isomorphism.
Example 13: (To show that the spaces have to be finite-dimensional): Let V be the
real vector space of all polynomials. Let D: V → V be defined by D(a0 + a1x + … +
arx^r) = a1 + 2a2x + … + rarx^(r−1). Then show that D is onto but not 1 − 1.
Solution: Note that V has infinite dimension, a basis being {1, x, x^2, …}. D is onto
because any element of V is of the form
a0 + a1x + … + anx^n = D(a0x + (a1/2)x^2 + … + (an/(n + 1))x^(n+1)).
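Example 13 can be illustrated on coefficient lists (our own encoding; the helper names `D` and `antiderivative` are not from the text):

```python
def D(coeffs):
    # Derivative on coefficient lists [a0, a1, ..., ar].
    return [k * a for k, a in enumerate(coeffs)][1:] or [0]

def antiderivative(coeffs):
    # One preimage under D: a0 x + (a1/2) x^2 + ... + (ar/(r+1)) x^(r+1).
    return [0.0] + [a / (k + 1) for k, a in enumerate(coeffs)]

# D is onto: every polynomial has a preimage.
p = [3.0, 2.0, 6.0]                       # 3 + 2x + 6x^2
assert D(antiderivative(p)) == p

# D is not 1-1: two distinct polynomials with the same derivative.
assert D([0.0, 1.0]) == D([7.0, 1.0]) == [1.0]
```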
The following exercise shows that the statement of Theorem 10 is false if dim U ≠
dim V.
E E12) Define a linear operator T: R3 → R2 such that T is onto but T is not 1 −
1. Note that dim R3 ≠ dim R2.
Let us use Theorems 9 and 10 to prove our next result.
Theorem 11: Let T: V → V be a linear transformation and let {e1, …, en} be a basis
of V. Then T is one-one and onto if and only if {T(e1), …, T(en)} is linearly
independent.
b) Is it true that every linear transformation maps every linearly independent set of
vectors into a linearly independent set?
d) Show that every linear transformation maps a linearly dependent set of vectors
onto a linearly dependent set of vectors.
E E23) Let T: R3 → R3 be defined by T(x1, x2, x3) = (x1 + x3, x2 + x3, x1 + x2). Is
T invertible? If yes, find a rule for T⁻¹ like the one which defines T.
Theorem 12: Let U and V be finite-dimensional vector spaces over F. Then U and V
are isomorphic if and only if dim U = dim V.
Proof: We have already seen that if U and V are isomorphic then dim U= dim V.
Conversely, suppose dim U = dim V = n. We shall show that U and V are isomorphic.
Let {e1, …, en} be a basis of U and {f1, …, fn} be a basis of V. By Theorem 3, there
exists a linear transformation T: U → V such that T(ei) = fi, i = 1, …, n.
Proof: Since dim Rn = n = dimR V, we get V ≅ Rn. Similarly, if dimC V = n, then V ≅
Cn.
Remark: Let V be a vector space over F and let B = {e1, …, en} be a basis
of V. Each v ∈ V can be uniquely expressed as v = α1e1 + … + αnen. Recall that
α1, …, αn are called the coordinates of v with respect to B (refer to Sec. 4.4.1).
E E24) Let T: U → V be a one-one linear mapping. Show that T is onto if and
only if dim U = dim V. (Of course, you must assume that U and V are finite-
dimensional spaces.)
Linear transformations are also called vector space homomorphisms. There is a basic
theorem which uses the properties of homomorphisms to show the isomorphism of
certain quotient spaces (ref. Unit 3). It is simple to prove, but is very important
because it is always being used to prove more advanced theorems on vector spaces.
(In the Abstract Algebra course we will prove this theorem in the setting of groups and
rings.)
Theorem 13: Let V and W be vector spaces over a field F and T: V → W be a linear
transformation. Then V/Ker T ≅ R(T).
Proof: You know that Ker T is a subspace of V, so that V/Ker T is a well defined
vector space over F. Also R(T) = {T(v) | v ∈ V}. To prove the theorem let us define
θ: V/Ker T → R(T) by θ(v + Ker T) = T(v).
Firstly, we must show that θ is a well defined function, that is, if v + Ker T = v` +
Ker T then θ(v + Ker T) = θ(v` + Ker T), i.e., T(v) = T(v`).
T(V)/(Ker S ∩ T(V)) ≅ ST(V)
Therefore,
dim [T(V)/(Ker S ∩ T(V))] = dim ST(V)
That is, dim T(V) − dim (Ker S ∩ T(V)) = dim ST(V), which is what we had to
show.
E E25) Using Example 14 and the Rank Nullity Theorem, show that nullity (ST)
= nullity (T) + dim (R(T) ∩ Ker S).
Solution: Note that we can consider R as a subspace of R3 for the following reason:
any element α of R is equated with the element (α, 0, 0) of R3. Now, we define a
function f: R3 → R2 : f(α, β, γ) = (β, γ). Then f is a linear transformation and Ker f = {(α,
0, 0) | α ∈ R} ≅ R. Also f is onto, since any element (α, β) of R2 is f(0, α, β). Thus,
by Theorem 13, R3/R ≅ R2.
Note: In general, for any n ≥ m, Rn/Rm ≅ Rn−m. Similarly, Cn/Cm ≅ Cn−m for n ≥ m.
The next result is a corollary to the Fundamental Theorem of Homomorphism. But,
before studying it, read Unit 3 for the definition of the sum of spaces.
Now a + b + B = (a + B) + (b + B) = (a + B) + B, since b ∈ B,
= a + B, since B is the zero element of (A + B)/B.
⇒ a ∈ Ker T.
A/Ker T ≅ R(T)
E E26) Using the corollary above, show that (A ⊕ B)/B ≅ A (⊕ denotes the direct
sum defined in Sec. 3.6).
Proof: This time we shall prove the theorem with you. To start with, let us define a
function T: V/W → V/U : T(v + W) = v + U. Now try E27.
So, is the theorem proved? Yes; apply Theorem 13 to T. We end the unit by
summarizing what we have done in it.
4.0 CONCLUSION
(1) A linear transformation from a vector space U over F to a vector space V over
F is a function T: U → V such that,
These conditions are equivalent to the single condition LT3) T(αu1 + βu2) =
αT(u1) + βT(u2) for α, β ∈ F and u1, u2 ∈ U.
(5) T: U → V is
(6) Let U and V be finite-dimensional vector spaces with the same dimension.
Then T: U → V is 1 − 1 iff T is onto iff T is an isomorphism.
(7) Two finite-dimensional vector spaces U and V are isomorphic if and only if
dim U = dim V.
(8) Let V and W be vector spaces over a field F, and T: V → W be a linear
transformation. Then V/Ker T ≅ R(T).
5.0 SUMMARY
E1) For any a1, a2 ∈ F and u1, u2 ∈ U, we know that a1u1 ∈ U and a2u2 ∈ U.
Therefore, by LT1,
E2) By LT2, T(0.u) = 0.T(u) for any u ∈ U. Thus, T(0) = 0. Similarly, for any u ∈
U, T(−u) = T((−1)u) = (−1)T(u) = −T(u).
E3) T(x, y) = (−x, y) ∀ (x, y) ∈ R2. (See the geometric view in Fig. 4.) T is a linear
operator. This can be proved the same way as we did in Example 4.
[Fig. 4: reflection in the y-axis maps P(1, 2) to Q(−1, 2).]
Then (f + g)(x) = (a0 + b0) + (a1 + b1)x + … + (an + bn)x^n
∴ [D(f + g)](x) = (a1 + b1) + 2(a2 + b2)x + … + n(an + bn)x^(n−1)
= (a1 + 2a2x + … + nanx^(n−1)) + (b1 + 2b2x + … + nbnx^(n−1))
= (Df)(x) + (Dg)(x) = (Df + Dg)(x)
Thus, D(f + g) = Df + Dg, showing that D is a linear map.
Then, for any element x + iy ∈ C (x, y ∈ R), we have T(x + iy) = xT(1) + yT(i) =
x + y. Thus, T is defined by T(x + iy) = x + y ∀ x + iy ∈ C.
E10) T: U → V : T(u) = 0 ∀ u ∈ U.
∴ Ker T = {u ∈ U | T(u) = 0} = U
R(T) = {T(u) | u ∈ U} = {0}. ∴ 1 ∉ R(T).
= {(0, 0, z) | z ∈ R}
∴ Ker T is the z-axis.
E12) Any element of R(T) is of the form T(u), u ∈ U. Since {e1, …, en} generates U,
∃ scalars α1, …, αn such that u = α1e1 + … + αnen.
Then T(u) = α1T(e1) + … + αnT(en), that is, T(u) is in the linear span of {T(e1), …,
T(en)}.
∴ {T(e1), …, T(en)} generates R(T).
E13) T: V → V : T(v) = v. Since R(T) = V and Ker T = {0}, we see that rank (T) =
dim V, nullity (T) = 0.
E14) Any b0 + b1x + … + bn−1x^(n−1) ∈ Pn−1 is D(b0x + (b1/2)x^2 + … + (bn−1/n)x^n) ∈ R(D).
Ker D = {a0 + a1x + … + anx^n | a1 + 2a2x + … + nanx^(n−1) = 0, ai ∈ R ∀ i}
= {a0 + a1x + … + anx^n | a1 = 0, a2 = 0, …, an = 0, ai ∈ R ∀ i}
= {a0 | a0 ∈ R} = R
∴ rank (D) = n and nullity (D) = 1.
If rank (T) = 1, then dim R(T) = 1, that is, R(T) is a vector space over R generated by
a single element, v, say. Then R(T) is the line Rv = {αv | α ∈ R}.
E19) Firstly, note that T is a linear transformation. Secondly, T is 1 − 1 because T(x,
y, z) = T(x`, y`, z`) ⇒ (x, y, z) = (x`, y`, z`).
Thirdly, T is onto because any (x, y, z) ∈ R3 can be written as T(x − y, y, z).
∴ T is an isomorphism. ∴ T⁻¹: R3 → R3 exists and is defined by T⁻¹(x, y, z) = (x − y, y,
z).
b) No. For example, the zero operator maps every linearly independent set to {0},
which is not linearly independent.
c) Let T: U → V be a linear operator, and {u1, …, un} be a linearly dependent set
of vectors in U. We have to show that {T(u1), …, T(un)} is linearly dependent.
Since {u1, …, un} is linearly dependent, ∃ scalars a1, …, an, not all zero, such
that a1u1 + … + anun = 0.
∴ x + y = 0 = y + z = x + z ⇒ x = 0 = y = z
∴ Ker T = {(0, 0, 0)}
∴ T is 1 − 1
∴ by Theorem 10, T is invertible.
(a + b, b + c, a + c) = (x, y, z)
⇒ a + b = x, b + c = y, a + c = z
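The invertibility claimed in E23 can be checked by a matrix computation (a sketch; the matrix encoding of T is our own):

```python
import numpy as np

# The matrix of T(x1, x2, x3) = (x1 + x3, x2 + x3, x1 + x2) from E23.
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 0.0]])

# Full rank confirms Ker T = {0}, so T is invertible (Theorem 10).
assert np.linalg.matrix_rank(A) == 3

# Inverting A gives the rule for T^(-1): T^(-1)(x) = Ainv @ x.
Ainv = np.linalg.inv(A)
x = np.array([2.0, 3.0, 1.0])
assert np.allclose(A @ (Ainv @ x), x)
```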
∴ (A ⊕ B)/B ≅ A.
CONTENTS
1.0 Introduction
2.0 Objectives
3.0 Main Content
3.1 Introduction
3.2 Objectives
3.3 The Vector Space L (U, V)
3.4 The Dual Space
3.5 Composition of Linear Transformations
3.6 Minimal Polynomial
Theorems
4.0 Conclusion
5.0 Summary
6.0 Tutor-Marked Assignment
7.0 References/Further Reading
1.0 INTRODUCTION
In the last unit we introduced you to linear transformations and their properties. We
will now show that the set of all linear transformations from a vector space U to a
vector space V forms a vector space itself, and its dimension is (dim U)(dim V). In
particular, we define and discuss the dual space of a vector space.
In Unit 1 we defined the composition of two functions. Over here we will discuss the
composition of two linear transformations and show that it is again a linear operator.
Note that we use the terms 'linear transformation' and 'linear operator' interchangeably.
2.0 OBJECTIVES
Prove and use the fact that L (U, V) is a vector space of dimension (dim U)
(dim V);
Use dual bases, whenever convenient;
Obtain the composition of two linear operators, whenever possible;
Obtain the minimal polynomial of a linear transformation T: V → V in some
simple cases;
Obtain the inverse of an isomorphism T: V → V if its minimal polynomial is
known.
3.1 Introduction
By now you must be quite familiar with linear operators, as well as vector spaces. In
this section we consider the set of all linear operators from one vector space to another,
and show that it forms a vector space.
Let U, V be vector spaces over a field F. Consider the set of all linear transformations
from U to V. We denote this set by L(U, V).
We will now define addition and scalar multiplication in L(U, V) so that L (U,V)
becomes a vector space.
Suppose S, T ∈ L(U, V) (that is, S and T are linear operators from U to V). We define
(S + T): U → V by
(S + T)(a1u1 + a2u2)
Is αS a linear operator? To answer this take β1, β2 ∈ F and u1, u2 ∈ U. Then,
(αS)(β1u1 + β2u2) = αS(β1u1 + β2u2) = α[β1S(u1) + β2S(u2)]
= β1(αS)(u1) + β2(αS)(u2)
Hence, αS ∈ L(U, V).
E E1 Show that the set L (U, V) is a vector space over F with respect to the
operations of addition and multiplication by scalars defined above. (Hint: The
zero vector in this space is the zero transformation).
E12(e1) = 0, E12(e2) = f1, E12(e3) = 0, …, E12(em) = 0.
In general, there exist Eij ∈ L(U, V), for i = 1, …, n, j = 1, …, m, such that Eij(ej) = fi
and Eij(ek) = 0 for k ≠ j.
To get used to these Eij try the following exercise before continuing the proof.
E E2) Clearly define E2m, E32 and Emn.
We complete the proof by showing that {Eij | i = 1, …, n, j = 1, …, m} is a basis of L(U, V).
Let us first show that this set is linearly independent over F. For this, suppose
Σi,j cijEij = 0 ……… (1)
Applying both sides of (1) to ek, we get
(Σi,j cijEij)(ek) = 0 for k = 1, …, m, that is, c1kf1 + … + cnkfn = 0 for each k.
But {f1, …, fn} is a basis for V. Thus, cik = 0 for all i = 1, …, n and k = 1, …, m.
Now, for each j such that 1 ≤ j ≤ m, T(ej) ∈ V. Since {f1, …, fn} is a basis of V, there
exist scalars c1j, …, cnj such that
T(ej) = c1jf1 + … + cnjfn ……… (2)
We will show that
T = Σi,j cijEij ……… (3)
Now, (Σi,j cijEij)(ek) = c1kf1 + … + cnkfn = T(ek), by (2). This implies (3), since a
linear map is determined by its values on the basis {e1, …, em}.
Thus, we have proved that the set of mn elements {Eij | i = 1, …, n, j = 1, …, m} is a
basis for L(U, V).
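The maps Eij are, in matrix terms, the "matrix units"; a small numeric sketch of the dimension count dim L(U, V) = mn (the sizes m, n and variable names are our own example choices):

```python
import numpy as np

# With dim U = m and dim V = n, each E_ij sends the basis vector e_j to f_i
# and every other e_k to 0; as a matrix it has a single 1 in position (i, j).
m, n = 3, 2

def E(i, j):
    M = np.zeros((n, m))
    M[i, j] = 1.0
    return M

# There are mn such maps ...
units = [E(i, j) for i in range(n) for j in range(m)]
assert len(units) == m * n

# ... and every linear map U -> V (an n x m matrix) is a combination of
# them, so dim L(U, V) = mn.
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
combo = sum(A[i, j] * E(i, j) for i in range(n) for j in range(m))
assert np.allclose(combo, A)
```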
After having looked at L(U, V), we now discuss this vector space for the particular
case when V = F.
The vector space L(U, V), discussed in sec. 2.2, has a particular name when V = F.
Definition: Let U be a vector space over F. Then the space L(U, F) is called the dual
space of U, and is denoted by U*.
In this section we shall study some basic properties of U*. The elements of U* have a
specific name, which we now give.
We know that the space V*, of linear functionals on V, is a vector space. Also, if dim
V = m, then dim V* = m, by Theorem 1. (Remember, dim F = 1.)
Hence, we see that dim V = dim V*. From Theorem 12 of Unit 5, it follows that the
vector spaces V and V* are isomorphic.
We now construct a special basis for V*. Let {e1, …, em} be a basis for V. By Theorem
3 of Unit 5, for each i = 1, …, m, there exists a unique linear functional fi on V such
that
We will prove that the linear functional f1, …., fm, constructed above, form a basis of
V*.
Since dim V = dim V* = m, it is enough to show that the set {f1, …, fm} is linearly
independent. For this we suppose c1, …, cm ∈ F are such that c1f1 + … + cmfm = 0.
Now c1f1 + … + cmfm = 0
⇒ (c1f1 + … + cmfm)(ei) = 0, for each i
⇒ c1f1(ei) + … + cmfm(ei) = 0 ∀ i
⇒ c1δ1i + … + cmδmi = 0 ∀ i ⇒ ci = 0 ∀ i.
Thus, the set {f1, …, fm} is a set of m linearly independent elements of the vector space
V* of dimension m. Thus, from Unit 4 (Theorem 5, Cor. 1), it forms a basis of V*.
Definition: The basis {f1, …, fm} of V* is called the dual basis of the basis {e1, …,
em} of V.
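Concretely, when the basis vectors of R^n are written as the columns of a matrix, the dual basis functionals are the rows of its inverse. A sketch with a basis of our own choosing (for illustration only):

```python
import numpy as np

# A sample basis of R^3, written as the columns of B.
B = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])

# The dual basis functionals f_i are the rows of B^(-1), since then
# f_i(e_j) = delta_ij.
F = np.linalg.inv(B)
assert np.allclose(F @ B, np.eye(3))

# The dual basis reads off coordinates:
# v = f_1(v) e_1 + f_2(v) e_2 + f_3(v) e_3.
v = np.array([2.0, -1.0, 3.0])
coords = F @ v
assert np.allclose(B @ coords, v)
```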
We now come to the result that shows the convenience of using a dual basis.
n
v = fi. (v)ei.
i=1
Proof: Since {f1,…, fn} is a basis of V*, for f ∈ V* there exist scalars c1,…, cn such that

f = Σ_{i=1}^{n} ci fi.

Therefore,

f(ej) = Σ_{i=1}^{n} ci fi(ej) = Σ_{i=1}^{n} ci δij, by definition of the dual basis, = cj.

Thus, f = Σ_{i=1}^{n} f(ei) fi.

Similarly, since {e1,…, en} is a basis of V, any v ∈ V can be written as

v = Σ_{i=1}^{n} ai ei.

Hence,

fj(v) = Σ_{i=1}^{n} ai fj(ei) = Σ_{i=1}^{n} ai δji = aj,

and we obtain

v = Σ_{i=1}^{n} fi(v) ei.
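Theorem 2 is easy to check numerically. The sketch below (plain Python, with an assumed basis of R³ whose dual functionals were worked out by hand) verifies both fi(ej) = δij and the reconstruction formula v = Σ fi(v)ei.

```python
# A small numerical check of Theorem 2, using an assumed basis of R^3.
# Basis: e1 = (1,0,0), e2 = (1,1,0), e3 = (1,1,1).
basis = [(1, 0, 0), (1, 1, 0), (1, 1, 1)]

# Dual functionals, found by solving f_i(e_j) = delta_ij by hand:
duals = [
    lambda v: v[0] - v[1],   # f1
    lambda v: v[1] - v[2],   # f2
    lambda v: v[2],          # f3
]

# Check f_i(e_j) = delta_ij.
for i, f in enumerate(duals):
    for j, e in enumerate(basis):
        assert f(e) == (1 if i == j else 0)

# Check the reconstruction v = sum_i f_i(v) e_i for a sample vector.
v = (4, -2, 7)
recon = [sum(f(v) * e[k] for f, e in zip(duals, basis)) for k in range(3)]
print(recon)  # [4, -2, 7]
```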
Example 3: Consider the basis {e1, e2, e3} of C³ over C, where e1 = (1, 0, -1) and e2 = (1, 1, 0). Find the dual basis of {e1, e2, e3}.

Solution: Any element of C³ is v = (z1, z2, z3), zi ∈ C. Since {e1, e2, e3} is a basis, there exist α1, α2, α3 ∈ C such that v = α1e1 + α2e2 + α3e3. Comparing components,

α1 + α2 + α3 = z1
α2 + α3 = z2
–α1 + α2 = z3.

Solving these equations, α1 = z1 – z2, α2 = z1 – z2 + z3, α3 = –z1 + 2z2 – z3.

Now, by Theorem 2, v = f1(v)e1 + f2(v)e2 + f3(v)e3, where {f1, f2, f3} is the dual basis. Also v = α1e1 + α2e2 + α3e3, so that fi(v) = αi for each i.

Thus, the dual basis of {e1, e2, e3} is {f1, f2, f3}, where
f1(z1, z2, z3) = z1 – z2, f2(z1, z2, z3) = z1 – z2 + z3, f3(z1, z2, z3) = –z1 + 2z2 – z3.
Now let us look at the dual of the dual space. If you like, you may skip this portion and go straight to Sec. 6.4.
Let V be an n-dimensional vector space. We have already seen that V and V* are isomorphic, because dim V = dim V*. The dual of V* is called the second dual of V, and is denoted by V**. We will show that V ≅ V**.

Now any element of V** is a linear transformation from V* to F. Also, for any v ∈ V and f ∈ V*, f(v) ∈ F. So we define a mapping φ: V → V**: v ↦ φv, where (φv)(f) = f(v) for all f ∈ V* and v ∈ V. (Over here we will use φ(v) and φv interchangeably.)
First, note that each φv is indeed an element of V**, that is, φv: V* → F is linear: for c1, c2 ∈ F and f1, f2 ∈ V*,
(φv)(c1f1 + c2f2) = (c1f1 + c2f2)(v) = c1 f1(v) + c2 f2(v) = c1 (φv)(f1) + c2 (φv)(f2).

Furthermore, the map φ: V → V** is linear. This can be seen as follows: for c1, c2 ∈ F, v1, v2 ∈ V and any f ∈ V*,
φ(c1v1 + c2v2)(f) = f(c1v1 + c2v2) = c1 f(v1) + c2 f(v2) = c1 φ(v1)(f) + c2 φ(v2)(f),
so that φ(c1v1 + c2v2) = c1φ(v1) + c2φ(v2).
Now that we have shown that φ is linear, we want to show that it is actually an isomorphism. We will show that φ is 1-1. For this, by Theorem 7 of Unit 5, it suffices to show that φ(v) = 0 implies v = 0. Let {f1,…, fn} be the dual basis of a basis {e1,…, en} of V.

By Theorem 2, we have v = Σ_{i=1}^{n} fi(v) ei.

Now, φ(v) = 0 ⇒ φ(v)(fi) = 0 ∀ i = 1,…, n
⇒ fi(v) = 0 ∀ i = 1,…, n
⇒ v = Σ_{i=1}^{n} fi(v) ei = 0.

Hence, it follows that φ is 1-1. Thus, φ is an isomorphism (Unit 5, Theorem 10).
In particular, given any θ ∈ V**, there exists a unique v ∈ V such that θ(f) = f(v) for all f ∈ V*.

Proof: By Theorem 3, φ is an isomorphism, and hence it is onto and 1-1. Thus, there exists a unique v ∈ V such that φ(v) = θ. This, by definition, implies that θ(f) = φ(v)(f) = f(v) for all f ∈ V*.
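The canonical map into the second dual can be modelled directly in code, representing functionals as functions; the names below (phi, f1, f2) are illustrative only, not notation from the text.

```python
# Sketch of the canonical map phi: V -> V** for V = R^2, with linear
# functionals represented as Python functions.

def phi(v):
    """phi(v) is the element of V** that evaluates a functional at v."""
    return lambda f: f(v)

# Two sample linear functionals on R^2.
f1 = lambda v: 3 * v[0] + v[1]
f2 = lambda v: v[0] - 2 * v[1]

v = (2, 5)
assert phi(v)(f1) == f1(v) == 11
assert phi(v)(f2) == f2(v) == -8

# phi(v) acts linearly on functionals:
# phi(v)(c1*f1 + c2*f2) = c1*phi(v)(f1) + c2*phi(v)(f2).
c1, c2 = 4, -1
combo = lambda w: c1 * f1(w) + c2 * f2(w)
assert phi(v)(combo) == c1 * phi(v)(f1) + c2 * phi(v)(f2)
```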
In the following section we look at the composition of linear operators, and the vector space A(V), where V is a vector space over F.
3.5 Composition of Linear Transformations
Do you remember the definition of the composition of functions, which you studied in Unit 1? Let us now consider the particular case of the composition of two linear transformations. Suppose T: U → V and S: V → W are linear transformations. Their composition is the map SoT: U → W defined by SoT(u) = S(T(u)) ∀ u ∈ U.

U --T--> V --S--> W, and SoT: U → W.

The first question which comes to our mind is whether SoT is linear. The affirmative answer is given by the following result.
Proof: All we need to prove is the linearity of the map SoT. Let α1, α2 ∈ F and u1, u2 ∈ U. Then
SoT(α1u1 + α2u2) = S(T(α1u1 + α2u2)) = S(α1T(u1) + α2T(u2)), since T is linear,
= α1S(T(u1)) + α2S(T(u2)), since S is linear,
= α1 SoT(u1) + α2 SoT(u2).
Thus, SoT is a linear transformation.
E E7) Let I be the identity operator on V. Show that SoI = IoS = S for all S ∈ A(V).

E E8) Prove that So0 = 0oS = 0 for all S ∈ A(V), where 0 is the null operator.
Remark: Let S: V → V be an invertible linear transformation (ref. Sec. 1.4), that is, an isomorphism. Then, by Unit 5, Theorem 8, S⁻¹ ∈ L(V, V) = A(V), and SoS⁻¹ = S⁻¹oS = Iv, where Iv denotes the identity transformation on V. This remark leads us to the following interesting result.
Proof: Let us first assume that S is an isomorphism. Then the remark above tells us that ∃ S⁻¹ ∈ A(V) such that SoS⁻¹ = I = S⁻¹oS. Thus, we have T (= S⁻¹) such that SoT = ToS = I.

Conversely, suppose T exists in A(V) such that SoT = I = ToS. We want to show that S is 1-1 and onto.

We first show that S is 1-1, that is, Ker S = {0}. Now, x ∈ Ker S ⇒ S(x) = 0 ⇒ ToS(x) = T(0) = 0 ⇒ I(x) = 0 ⇒ x = 0. Thus, Ker S = {0}.

Next, we show that S is onto, that is, for any v ∈ V, ∃ u ∈ V such that S(u) = v. Now, for any v ∈ V, v = I(v) = SoT(v) = S(T(v)), so u = T(v) serves our purpose.
Now, let us look at some examples involving the composite of linear operators.

Example 4: Let T: R² → R³ and S: R³ → R² be defined by T(x1, x2) = (x1, x2, x1 + x2) and S(x1, x2, x3) = (x1, x2). Find SoT and ToS.

Solution: First, note that T ∈ L(R², R³) and S ∈ L(R³, R²). ∴ SoT and ToS are both well defined linear operators. Now,
SoT(x1, x2) = S(T(x1, x2)) = S(x1, x2, x1 + x2) = (x1, x2).
Also, ToS(x1, x2, x3) = T(S(x1, x2, x3)) = T(x1, x2) = (x1, x2, x1 + x2).
In this case SoT ∈ A(R²), while ToS ∈ A(R³); clearly, SoT ≠ ToS. Also, note that SoT = I, but ToS ≠ I.
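Example 4 can be checked mechanically; the sketch below just composes the two maps as Python functions.

```python
# T: R^2 -> R^3 and S: R^3 -> R^2 as in Example 4.

def T(x1, x2):
    return (x1, x2, x1 + x2)

def S(x1, x2, x3):
    return (x1, x2)

def SoT(x1, x2):
    return S(*T(x1, x2))

def ToS(x1, x2, x3):
    return T(*S(x1, x2, x3))

# SoT is the identity on R^2 ...
assert SoT(3, 7) == (3, 7)
# ... but ToS is not the identity on R^3:
assert ToS(3, 7, 0) == (3, 7, 10) != (3, 7, 0)
```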
Remark: Even if SoT and ToS both belong to A(V), SoT may not be equal to ToS. We give such an example below.

Example 5: Let S, T ∈ A(R²) be defined by T(x1, x2) = (x1 + x2, x1 – x2) and S(x1, x2) = (0, x2). Show that SoT ≠ ToS.

Solution: You can check that SoT(x1, x2) = (0, x1 – x2) and ToS(x1, x2) = (x2, –x2). Thus, ∃ (x1, x2) ∈ R² such that SoT(x1, x2) ≠ ToS(x1, x2) (for instance, SoT(1, 1) = (0, 0), while ToS(1, 1) = (1, –1)). That is, SoT ≠ ToS.
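Example 5's computation can also be checked by composing the maps as plain functions:

```python
# S, T in A(R^2) as in Example 5.

def T(x1, x2):
    return (x1 + x2, x1 - x2)

def S(x1, x2):
    return (0, x2)

SoT = lambda x1, x2: S(*T(x1, x2))
ToS = lambda x1, x2: T(*S(x1, x2))

assert SoT(1, 1) == (0, 0)
assert ToS(1, 1) == (1, -1)
assert SoT(1, 1) != ToS(1, 1)   # composition of operators need not commute
```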
Note: Before checking whether SoT is a well defined linear operator, you must be sure that both S and T are well defined linear operators.
Now try to solve the following exercise.
E E10) Let T(x1, x2) = (0, x1, x2) and S(x1, x2, x3) = (x1 + x2, x2 + x3). Find SoT and ToS. Is SoT = ToS?

E E11) Let T(x1, x2) = (2x1, x1 + 2x2) for (x1, x2) ∈ R², and S(x1, x2, x3) = (x1 + 2x2, 3x1 – x2, x3) for (x1, x2, x3) ∈ R³. Are SoT and ToS defined? If yes, find them.

E E13) Let S, T ∈ A(V) and S be invertible. Show that rank(ST) = rank(TS) = rank(T).
So far we have discussed the composition of linear transformations. We have seen that if S, T ∈ A(V), then SoT ∈ A(V), where V is a vector space of dimension n. Thus, we have introduced another binary operation (see Sec. 1.5.2) in A(V), namely, the composition of operators, denoted by o. Remember, we already have the binary operations given in Sec. 6.2. In the following theorem we state some simple properties that involve all these operations.
The properties of A(V) stated in Theorems 1 and 6 are very important and will be used implicitly again and again. To get used to A(V) and the operations in it, try the following exercises.
E E14) Consider S, T: R² → R² defined by S(x1, x2) = (x1, –x2) and T(x1, x2) = (x1 + x2, x2 – x1). What are S + T, ST, TS, So(S – T) and (S – T)oS?
E E15) Let S ∈ A(V), dim V = n and rank(S) = r. Let
M = {T ∈ A(V) | ST = 0},
N = {T ∈ A(V) | TS = 0}.
By now you must have got used to handling the elements of A(V). The next section deals with polynomials that are related to these elements.

Recall that a polynomial in one variable x over F is of the form p(x) = a0 + a1x +…+ anx^n, where a0, a1,…, an ∈ F. For T ∈ A(V), we define p(T) = a0I + a1T +…+ anT^n.
Since each of I, T,…, T^n ∈ A(V), we find p(T) ∈ A(V). We say p(T) ∈ F[T]. If q is another polynomial in x over F, then p(T)q(T) = q(T)p(T), that is, p(T) and q(T) commute with each other. This can be seen as follows:

Let q(T) = b0I + b1T +…+ bmT^m. Then both p(T)q(T) and q(T)p(T) are the sum of the terms aibjT^(i+j), since powers of T commute with each other.
E E16) Let p, q ∈ F[x] such that p(T) = 0, q(T) = 0. Show that (p + q)(T) = 0. ((p + q)(x) means p(x) + q(x).)

E E17) Check that (2I + 3S + S³) commutes with (S + 2S⁴), for S ∈ A(Rⁿ).
We now go on to prove that, given any T ∈ A(V), we can find a non-zero polynomial g ∈ F[x] such that g(T) = 0.

Proof: We have already seen that A(V) is a vector space of dimension n². Hence, the set {I, T, T²,…, T^(n²)} of n² + 1 vectors of A(V) must be linearly dependent (ref. Unit 4, Theorem 7). Therefore, there must exist a0, a1,…, a_(n²) ∈ F (not all zero) such that a0I + a1T +…+ a_(n²)T^(n²) = 0. Thus, g(x) = a0 + a1x +…+ a_(n²)x^(n²) is a non-zero polynomial with g(T) = 0.
The following exercise will help you in getting used to polynomials in x and T.
E E18) Give an example of polynomials g(x) and h(x) in R[x] for which g(I) = 0 and h(0) = 0, where I and 0 are the identity and zero transformations in A(R³).
E E19) Let T ∈ A(V). Then we have a map θ: F[x] → A(V) given by θ(p) = p(T). Show that, for a, b ∈ F and p, q ∈ F[x], θ(ap + bq) = aθ(p) + bθ(q) and θ(pq) = θ(p)θ(q).
Proof: Consider the set S = {g ∈ F[x] | g(T) = 0}. This set is non-empty since, by Theorem 7, there exists a non-zero polynomial g, of degree at most n², such that g(T) = 0. Now consider the set D = {deg f | f ∈ S, f ≠ 0}. Then D is a subset of N ∪ {0}, and therefore it must have a minimum element, m. Let h ∈ S such that deg h = m. Then h(T) = 0 and deg h ≤ deg g ∀ non-zero g ∈ S. Dividing h by its leading coefficient, we obtain a monic polynomial p ∈ S of the same degree m.
We now show that p is unique, that is, if q is any monic polynomial of smallest degree such that q(T) = 0, then p = q. But this is easy. Firstly, since deg p ≤ deg g ∀ non-zero g ∈ S, we get deg p ≤ deg q; similarly, deg q ≤ deg p, so deg p = deg q = n, say.

Since p(T) = 0 and q(T) = 0, we get (p – q)(T) = 0. But, both p and q being monic of degree n, p – q = (a0 – b0) +…+ (a_(n-1) – b_(n-1))x^(n-1). Hence, p – q is a polynomial of degree strictly less than the degree of p such that (p – q)(T) = 0. That is, p – q ∈ S with deg(p – q) < deg p. This is a contradiction to the way we chose p, unless p – q = 0, that is, p = q. ∴ p is the unique polynomial satisfying the conditions of Theorem 8.
Definition: For T ∈ A(V), the unique monic polynomial p of smallest degree such that p(T) = 0 is called the minimal polynomial of T.
Example 6: For any vector space V, find the minimal polynomials for I, the identity
transformation, and 0, the zero transformation.
Solution: Let p(x) = x – 1 and q(x) = x. Then p and q are monic, with p(I) = 0 and q(0) = 0. Clearly, no monic polynomial of smaller degree has these properties, since a non-zero constant polynomial c gives cI ≠ 0. Thus x – 1 and x are the required minimal polynomials.
E E20) Define T: R³ → R³: T(x1, x2, x3) = (0, x1, x2). Show that the minimal polynomial of T is x³.

E E21) Define T: Rⁿ → Rⁿ: T(x1,……, xn) = (0, x1,……, x_(n-1)). What is the minimal polynomial of T? (Does E20 help you?)
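E20 can be verified numerically by writing T as a matrix with respect to the standard basis and computing its powers (a sketch, not a substitute for the proof the exercise asks for):

```python
# E20 in coordinates: T(x1, x2, x3) = (0, x1, x2) has T^3 = 0 but T^2 != 0.

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Matrix of T with respect to the standard basis (columns are T(e_j)).
T = [[0, 0, 0],
     [1, 0, 0],
     [0, 1, 0]]

T2 = matmul(T, T)
T3 = matmul(T2, T)

Z = [[0] * 3 for _ in range(3)]
assert T2 != Z      # x^2 does not annihilate T
assert T3 == Z      # x^3 does, so the minimal polynomial is x^3
```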
E E22) Define T: R³ → R³: T(x1, x2, x3) = (3x1, x1 – x2, 2x1 + x2 + x3). Show that (T² – I)(T – 3I) = 0. What is the minimal polynomial of T?
We will now state and prove a criterion by which we can obtain the minimal polynomial of a linear operator T, once we know any polynomial f ∈ F[x] with f(T) = 0. It says that the minimal polynomial must be a factor of any such f.
Theorem 9: Let T A (V) and let p (x) be the minimal polynomial of T. Let f (x) be
any polynomial such that f (T) = 0. Then there exists a polynomial g (x) such that f
(x) = p (x) g (x).
Proof: The division algorithm states that, given f(x) and p(x), there exist polynomials g(x) and h(x) such that f(x) = p(x)g(x) + h(x), where h(x) = 0 or deg h(x) < deg p(x). Now, 0 = f(T) = p(T)g(T) + h(T) = h(T), since p(T) = 0. So, if h(x) ≠ 0, it is a non-zero polynomial, of degree strictly less than that of p, which is annihilated by T. This contradicts the fact that p(x) is the minimal polynomial of T. Hence, h(x) = 0, and we get f(x) = p(x)g(x).
Using this theorem, can you obtain the minimal polynomial of T in E22 more easily? Since (x² – 1)(x – 3) = (x – 1)(x + 1)(x – 3), the minimal polynomial must be a product of some of these factors; now we only need to check whether T – I, T + I or T – 3I (or products of pairs of them) are 0.
Remark: If dim V = n and T ∈ A(V), we have seen that the degree of the minimal polynomial p of T is ≤ n². Later we will study a systematic method of finding the minimal polynomial of T, and some applications of this polynomial. But now we will only illustrate one application of the concept of the minimal polynomial by proving the following theorem.
Theorem 10: Let T A(V). Then T is invertible if and only if the constant term in
the minimal polynomial of T is not zero.
Proof: Let p(x) = a0 + a1x +…+ x^m be the minimal polynomial of T, so that

a0I + a1T +…+ T^m = 0. ……(1)

First, suppose T⁻¹ exists and, if possible, let a0 = 0. Then, multiplying (1) on both sides by T⁻¹, we get a1I + a2T +…+ T^(m-1) = 0. This equation gives us a monic polynomial q(x) = a1 + a2x +…+ x^(m-1) such that q(T) = 0 and deg q < deg p. This contradicts the fact that p is the minimal polynomial of T. Therefore, if T⁻¹ exists, the constant term in the minimal polynomial of T cannot be zero.

Conversely, suppose the constant term in the minimal polynomial of T is not zero, that is, a0 ≠ 0. Then, dividing Equation (1) on both sides by (–a0), we get

S = (–1/a0)(a1I + a2T +…+ T^(m-1)) with ST = I and TS = I.

This shows, by Theorem 5, that T⁻¹ exists and T⁻¹ = S.
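The converse half of Theorem 10 is constructive, so it can be tested on a concrete (assumed) example: A = [[2, 1], [0, 3]] has minimal polynomial x² – 5x + 6, whose constant term is non-zero, and the theorem's formula S = –(1/6)(A – 5I) produces the inverse.

```python
# Sketch of Theorem 10's converse on an assumed 2x2 example.
from fractions import Fraction

A = [[Fraction(2), Fraction(1)], [Fraction(0), Fraction(3)]]
I = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(1)]]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# p(A) = A^2 - 5A + 6I = 0, so p(x) = x^2 - 5x + 6 annihilates A:
A2 = matmul(A, A)
pA = [[A2[i][j] - 5 * A[i][j] + 6 * I[i][j] for j in range(2)] for i in range(2)]
assert pA == [[0, 0], [0, 0]]

# S = (5I - A)/6 is the inverse predicted by the theorem:
S = [[(5 * I[i][j] - A[i][j]) / 6 for j in range(2)] for i in range(2)]
assert matmul(S, A) == I and matmul(A, S) == I
```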
E E23) Let Pn be the space of all polynomials of degree ≤ n. Consider the linear operator D: P2 → P2 given by D(a0 + a1x + a2x²) = a1 + 2a2x. (Note that D is just the differentiation operator.) Show that D⁴ = 0. What is the minimal polynomial of D? Is D invertible?
We will now end the unit by summarizing what we have covered in it.
2.6 Summary
2.7 Solutions/Answers
E1) We have to check that VS1 – VS10 are satisfied by L(U, V). We have already shown that VS1 and VS6 are true.
VS2: For any L, M, N ∈ L(U, V), we have, ∀ u ∈ U,
[(L + M) + N](u) = (L + M)(u) + N(u) = [L(u) + M(u)] + N(u)
= L(u) + [M(u) + N(u)], since addition is associative in V,
= [L + (M + N)](u).
∴ (L + M) + N = L + (M + N).
E3) Both spaces have dimension 2 over R. A basis for L(R², R) is {E11, E12}, where E11(1, 0) = 1, E11(0, 1) = 0, E12(1, 0) = 0, E12(0, 1) = 1. A basis for L(R, R²) is {E11, E21}, where E11(1) = (1, 0), E21(1) = (0, 1).

E4) Let f: R³ → R be any linear functional. Let f(1, 0, 0) = a1, f(0, 1, 0) = a2, f(0, 0, 1) = a3. Then, for any x = (x1, x2, x3), we have x = x1(1, 0, 0) + x2(0, 1, 0) + x3(0, 0, 1).
∴ f(x) = x1 f(1, 0, 0) + x2 f(0, 1, 0) + x3 f(0, 0, 1) = a1x1 + a2x2 + a3x3.
E5) Let the dual basis be {f1, f2, f3}. Then, for any v ∈ P2, v = f1(v).1 + f2(v).x + f3(v).x².
∴ If v = a0 + a1x + a2x², then f1(v) = a0, f2(v) = a1, f3(v) = a2.
That is, f1(a0 + a1x + a2x²) = a0, f2(a0 + a1x + a2x²) = a1, f3(a0 + a1x + a2x²) = a2, for any a0 + a1x + a2x² ∈ P2.
E6) Let {f1,…, fn} be a basis of V*, and let its dual basis be {θ1,…, θn}, θi ∈ V**. Let ei ∈ V be such that φ(ei) = θi (ref. Theorem 3) for i = 1,…, n.
Then {e1,…, en} is a basis of V, since φ⁻¹ is an isomorphism and maps the basis {θ1,…, θn} to {e1,…, en}. Now fi(ej) = φ(ej)(fi) = θj(fi) = δji, by definition of a dual basis.
∴ {f1,…, fn} is the dual of {e1,…, en}.
E11) Since T ∈ A(R²) and S ∈ A(R³), SoT and ToS are not defined.

E12) Both (RoS)oT and Ro(SoT) are in L(U, Z). For any u ∈ U,
[(RoS)oT](u) = (RoS)[T(u)] = R[S(T(u))] = R[(SoT)(u)] = [Ro(SoT)](u).
∴ (RoS)oT = Ro(SoT).
We must also show that no monic polynomial q of smaller degree exists such that q
(T) = 0.
E21) Consider p(x) = xⁿ. Then p(T) = 0 and no non-zero polynomial q of lesser degree exists such that q(T) = 0. This can be checked on the lines of the solution of E20.
E22) Suppose q = a + bx + x² is such that q(T) = 0. Then q(T)(x1, x2, x3) = (0, 0, 0) ∀ (x1, x2, x3) ∈ R³. This means that a + 3b + 9 = 0, (b + 2)x1 + (a – b + 1)x2 = 0, (2b + 9)x1 + bx2 + (a + b + 1)x3 = 0. Eliminating a and b, we find that these equations can be solved provided 5x1 – 2x2 – 4x3 = 0. But they should be true for any (x1, x2, x3) ∈ R³.
∴ The equations can't be solved, and q does not exist. Hence, the minimal polynomial of T is (x² – 1)(x – 3).
E23) D⁴(a0 + a1x + a2x²) = D³(a1 + 2a2x) = D²(2a2) = D(0) = 0 ∀ a0 + a1x + a2x² ∈ P2.
∴ D⁴ = 0.
The minimal polynomial of D can be x, x², x³ or x⁴. Check that D³ = 0, but D² ≠ 0.
∴ The minimal polynomial of D is p(x) = x³. Since p has no non-zero constant term, D is not invertible.
E25) Since the minimal polynomial of S is xⁿ, Sⁿ = 0 and Sⁿ⁻¹ ≠ 0. ∴ ∃ v0 ∈ V such that Sⁿ⁻¹(v0) ≠ 0. Let a1, a2,…, an ∈ F be such that
a1v0 + a2S(v0) +…+ anSⁿ⁻¹(v0) = 0 …………………….. (1)
Then, applying Sⁿ⁻¹ to both sides of this equation, we get a1Sⁿ⁻¹(v0) +…+ anS^(2n-2)(v0) = 0
⇒ a1Sⁿ⁻¹(v0) = 0, since Sⁿ = Sⁿ⁺¹ = …. = S^(2n-2) = 0,
⇒ a1 = 0.
1.0 Introduction
2.0 Objectives
3.0 Main Content
3.1 Vector Space of Matrices
3.1.1 Definition of a Matrix
3.1.2 Matrix of a Linear Transformation
3.1.3 Sum and Multiplication by Scalars
3.1.4 Mmxn (F) is a Vector Space
3.1.5 Dimension of Mmxn (F) over F
3.2 New Matrices From Old
3.2.1 Transpose
3.2.2 Conjugate Transpose
3.3 Some Types of Matrices
3.3.1 Diagonal Matrix
3.3.2 Triangular Matrix
3.4 Matrix Multiplication
3.4.1 Matrix of the Composition of Linear Transformations
3.5 Properties of a Matrix Product
3.6 Invertible Matrices
3.6.1 Inverse of a Matrix
3.6.2 Matrix of Change of Basis
3.7 Solutions/Answers
4.0 Conclusion
5.0 Summary
6.0 Tutor-Marked Assignment
7.0 References/Further Reading
1.0 INTRODUCTION
You have studied linear transformations in Units 1 and 2. We will now study a simple means of representing them, namely, by matrices (‘matrices’ is the plural of ‘matrix’). We will show that, given a linear transformation, we can obtain a matrix associated to it, and vice versa. Then, as you will see, certain properties of a linear transformation can be studied more easily if we study the associated matrix instead. For example, you will see, in Block 3, that it is often easier to obtain the characteristic roots of a matrix than of a linear transformation.

To realize the deep connection between matrices and linear transformations, you should go back to the exact spot in Units 1 and 2 to which frequent references are made.

This unit may take you a little longer to study than previous ones, but don't let that worry you. The material in it is actually very simple.
2.0 OBJECTIVES
The coefficients of the unknowns, x, y, z and t can be arranged in rows and columns to
form a rectangular array as follows:
1  –2   4   1
1   ½   0  11
0   3  –5   0
The numbers appearing in the various positions of a matrix are called the entries (or
elements) of the matrix. Note that the same number may appear at two or more
different positions of a matrix. For example, 1 appears in 3 different positions in the
matrix given above.
In the matrix above, the three horizontal rows of entries have 4 elements each. These
are called the rows of this matrix. The four vertical rows of entries in the matrix,
having 3 elements each, are called its columns. Thus, this matrix has three rows and
four columns. We describe this by saying that this is a matrix of size 3 x 4 (“3 by 4”
or “3 cross 4”), or that this is a 3 x 4 matrix. The rows are counted from top to bottom
and the columns are counted from left to right. Thus, the first row is (1, -2, 4, 1), the
second row is (1, ½, 0, 11), and so on. Similarly,
Let us see what we mean by a matrix of size m x n, where m and n are any two natural
numbers.
Let F be a field.
A rectangular array of mn elements of F, of the form

a11  a12  …  a1n
a21  a22  …  a2n
 :    :        :
am1  am2  …  amn,

is called an m x n matrix over F.
The element at the intersection of the ith row and the jth column is called the (i, j)th element. For example, in the m x n matrix above, the (2, n)th element is a2n, which is at the intersection of the 2nd row and the nth column.
A brief notation for this matrix is [aij]mxn, or simply [aij], if m and n need not be stressed. We also denote matrices by capital letters A, B, C, etc. The set of all m x n matrices over F is denoted by Mmxn(F).
If m = n, then the matrix is called a square matrix. The set of all n x n matrices over F
is denoted by Mn (F).
Example 1: There are 20 male and 5 female students in the B.Sc. (Math. Hons.) I year class in a certain college, 15 male and 10 female students in B.Sc. (Math. Hons.) II year, and 12 male and 10 female students in B.Sc. (Math. Hons.) III year. How does this information give rise to a matrix?

Solution: One of the ways in which we can arrange this information in the form of a matrix is as follows:

Male     20  15  12
Female    5  10  10

This is a 2 x 3 matrix. Another way is

B.Sc. I     5  20
B.Sc. II   10  15
B.Sc. III  10  12,

which is a 3 x 2 matrix.
To get used to matrices and their elements, you can try the following exercises.

E Let A =
1  2  3          2  5  3  2
4  5  0  and B = 5  4  1  5 .
0  0  7          0  3  2  0

Give the
a) (1, 2)th elements of A and B.
b) third row of A.
c) second column of A and the first column of B.
d) fourth row of B.
How did you solve E2? Did the (i, j)th entry of one differ from the (i, j)th entry of the other for some i and j? If not, then they were equal. For example, the two 1 x 1 matrices [2] and [2] are equal. But [2] ≠ [3], since their entries at the (1, 1) position differ.
Definition: Two matrices are said to be equal if
i. They have the same size, that is, they have the same number of rows as well as
the same number of columns, and
ii. Their elements, at all the corresponding positions, are the same.
Example 2: If
1  0    x  y
2  3  = z  3 ,
then what are x, y and z?

Solution: Firstly, both matrices are of the same size, namely 2 x 2. Now, for these matrices to be equal, the (i, j)th elements of both must be equal ∀ i, j. Therefore, we must have x = 1, y = 0, z = 2.
Now that you are familiar with the concept of a matrix, we will link it up with linear
transformations.
We will now obtain a matrix that corresponds to a given linear transformation. You
will see how easy it is to go from matrices to linear transformations, and back. Let U
and V be vector spaces over a field F, of dimensions n and m, respectively. Let B1 = {e1,…, en} and B2 = {f1,…, fm} be ordered bases of U and V, respectively. (By an ordered basis we mean that the order in which the elements of the basis are written is fixed. Thus, an ordered basis {e1, e2} is not equal to an ordered basis {e2, e1}.)

Given a linear transformation T: U → V, we will associate a matrix to it. For this, we consider T(e1),……, T(en), which are all elements of V and hence are linear combinations of f1,…, fm. Thus, there exist mn scalars αij such that

T(ej) = α1j f1 + α2j f2 +…+ αmj fm, for j = 1,…, n.
From these n equations we form an m x n matrix whose first column consists of the coefficients of the first equation, whose second column consists of the coefficients of the second equation, and so on. This matrix,

     α11  α12  …  α1n
A =  α21  α22  …  α2n
      :    :       :
     αm1  αm2  …  αmn,

is called the matrix of T with respect to the bases B1 and B2. Notice that the coordinate vector of T(ej) is the jth column of A.

We use the notation [T]B1,B2 for this matrix. Thus, to obtain [T]B1,B2 we consider T(ej) ∀ ej ∈ B1, and write them as linear combinations of the elements of B2.
Remark: Why do we insist on ordered bases? What happens if we interchange the order of the elements in B1 to {en, e1,…, e(n-1)}? The matrix [T]B1,B2 also changes, the last column becoming the first column now. Similarly, if we change the positions of the fi's in B2, the rows of [T]B1,B2 will get interchanged.
Example: Let T: R³ → R² be defined by T(x1, x2, x3) = (x1, x2). Find [T]B1,B2, where B1 and B2 are the standard bases of R³ and R², respectively.

Solution: Let B1 = {e1, e2, e3}, where e1 = (1, 0, 0), e2 = (0, 1, 0), e3 = (0, 0, 1), and B2 = {f1, f2}, where f1 = (1, 0), f2 = (0, 1). Then
T(e1) = (1, 0) = f1 = 1.f1 + 0.f2
T(e2) = (0, 1) = f2 = 0.f1 + 1.f2
T(e3) = (0, 0) = 0.f1 + 0.f2.

Thus, [T]B1,B2 =
1  0  0
0  1  0
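The recipe above — column j of [T] holds the coordinates of T(ej) — can be sketched in code (the helper name matrix_of is ours, not the text's):

```python
# Build the matrix of a linear map from its action on the standard basis.

def matrix_of(T, dim_in, dim_out):
    cols = []
    for j in range(dim_in):
        e = [0] * dim_in
        e[j] = 1
        cols.append(T(e))                     # T(e_j), a vector in R^dim_out
    # transpose the list of columns into a list of rows
    return [[cols[j][i] for j in range(dim_in)] for i in range(dim_out)]

T = lambda v: [v[0], v[1]]                    # projection R^3 -> R^2
print(matrix_of(T, 3, 2))                     # [[1, 0, 0], [0, 1, 0]]
```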
E E4) Choose two other bases B′1 and B′2 of R³ and R², respectively. (In Unit 4 you came across a lot of bases of both these vector spaces.) For T in the example above, give the matrix [T]B′1,B′2.

What E4 shows us is that the matrix of a transformation depends on the bases that we use for obtaining it. The next two exercises also bring out the same fact.
E E7) Let V be the vector space of polynomials over R of degree ≤ 3, in the variable t. Let D: V → V be the differentiation operator given in Unit 5 (E6, with n = 3). Show that the matrix of D with respect to the basis {1, t, t², t³} is

0  1  0  0
0  0  2  0
0  0  0  3
0  0  0  0
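In coordinates, E7 says that multiplying this 4x4 matrix by a coefficient vector differentiates the polynomial; a quick check:

```python
# With basis {1, t, t^2, t^3}, a polynomial is the coefficient vector
# [a0, a1, a2, a3], and D acts by the matrix from E7.

D = [[0, 1, 0, 0],
     [0, 0, 2, 0],
     [0, 0, 0, 3],
     [0, 0, 0, 0]]

def apply(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

# p(t) = 7 + 5t + 4t^2 + 2t^3, so p'(t) = 5 + 8t + 6t^2.
p = [7, 5, 4, 2]
print(apply(D, p))   # [5, 8, 6, 0]
```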
So far, given a linear transformation, we have obtained a matrix from it. This works the other way also. That is, given a matrix, we can define a linear transformation corresponding to it.
For example, consider the linear operator T: R³ → R³ for which

       1  2  4
[T]B = 2  3  1 , where B is the standard basis of R³.
       3  1  2
Similarly, we can find the linear transformation T: R³ → R² for which

           1  1  0
[T]B1,B2 = 0  1  1 , where B1 and B2 are the standard bases of R³ and R², respectively.
E E9) Find the linear operator T: C → C whose matrix, with respect to the basis {1, i}, is

0  –1
1   0 .

(Note that C, the field of complex numbers, is a vector space over R of dimension 2.)
Now we are in a position to define the sum of matrices and multiplication of a matrix
by a scalar.
In Unit 5 you studied about the sum and scalar multiples of linear transformations. In
the following theorem we will see what happens to the matrices associated with the
linear transformations that are sums or scalar multiples of given linear
transformations.
Theorem 1: Let U and V be vector spaces over F, of dimensions n and m, respectively. Let B1 and B2 be arbitrary bases of U and V, respectively. (Let us abbreviate [T]B1,B2 to [T] during this theorem.) Let S, T ∈ L(U, V) and α ∈ F. Suppose [S] = [aij], [T] = [bij]. Then
a) [S + T] = [aij + bij], and
b) [αS] = [αaij].

Proof: By definition,
S(ej) = Σ_{i=1}^{m} aij fi ∀ j = 1,…, n, and
T(ej) = Σ_{i=1}^{m} bij fi ∀ j = 1,…, n.
Therefore,
(S + T)(ej) = S(ej) + T(ej) = Σ_{i=1}^{m} aij fi + Σ_{i=1}^{m} bij fi = Σ_{i=1}^{m} (aij + bij) fi.
Thus, by definition of the matrix with respect to B1 and B2, we get [S + T] = [aij + bij].
Similarly,
(αS)(ej) = α S(ej) = α Σ_{i=1}^{m} aij fi = Σ_{i=1}^{m} (αaij) fi,
so that [αS] = [αaij].

(Note that two matrices can be added only if they are of the same size.)
Definition: Let A = [aij] and B = [bij] be two m x n matrices. Their sum is the m x n matrix A + B = [aij + bij], that is, the (i, j)th element of A + B is aij + bij.

Example: Add the matrices
1  4  5        0  1  0
0  1  0  and   1  4  5 .

Solution: Firstly, notice that both the matrices are of the same size (otherwise, we can't add them). Their sum is

1+0  4+1  5+0     1  5  5
0+1  1+4  0+5  =  1  5  5 .
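The entrywise definitions of matrix sum and scalar multiple translate directly into code (a minimal sketch):

```python
# Entrywise sum and scalar multiple of matrices.

def add(A, B):
    assert len(A) == len(B) and len(A[0]) == len(B[0]), "sizes must match"
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(c, A):
    return [[c * a for a in row] for row in A]

A = [[1, 4, 5], [0, 1, 0]]
B = [[0, 1, 0], [1, 4, 5]]
print(add(A, B))       # [[1, 5, 5], [1, 5, 5]]
print(scale(2, A))     # [[2, 8, 10], [0, 2, 0]]
```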
b)
1  0        –1   0
0  1  and    0  –1 ?
Now, let us define the scalar multiple of a matrix, again motivated by Theorem 1.

Definition: Let A = [aij] be an m x n matrix over F and let α ∈ F. Then αA is the m x n matrix whose (i, j)th element is α times the (i, j)th element of A, that is, αA = [αaij].
Example 6: What is 2A, where A =
½  ¼  1/3
0  0  0 ?

Solution:
2A =
1  ½  2/3
0  0  0
1 1 0
E E11) Calculate 3 2 and 3 2 + 1.
Remark: The way we have defined the sum and scalar multiple of matrices allows us to write Theorem 1 as follows:
[S + T]B1,B2 = [S]B1,B2 + [T]B1,B2, and [αS]B1,B2 = α[S]B1,B2.
The following exercise will help you in checking if you have understood the contents
of sections 7.2.2 and 7.2.3.
We now want to show that the set of all m x n matrices over F is actually a vector
space over F.
After having defined the sum and scalar multiplication of matrices, we enumerate the
properties of these operations. This will ultimately lead us to prove that the set of all
m x n matrices over F is a vector space over F. Do keep the properties VS1 – VS10
(of Unit 3) in mind. For any A = [aij], B = [bij], C = [cij] ∈ Mmxn(F) and α, β ∈ F, we have

i) Addition is associative: (A + B) + C = A + (B + C), since
(aij + bij) + cij = aij + (bij + cij) ∀ i, j, as the entries are elements of a field.
ii) Additive identity: The matrix of the zero transformation (see Unit 5), with respect to any basis, has 0 as all its entries. It is called the zero matrix. Consider the zero matrix 0, of size m x n. Then, for any A ∈ Mmxn(F),
A + 0 = 0 + A = A.
iii) Additive inverse: Given A ∈ Mmxn(F), we consider the matrix (–1)A. Then
A + (–1)A = (–1)A + A = 0.
This is because the (i, j)th element of (–1)A is –aij, and aij + (–aij) = 0 = (–aij) + aij ∀ i, j.
Thus, (–1)A is the additive inverse of A. We denote (–1)A by –A.
iv) A + B = B + A, since aij + bij = bij + aij ∀ i, j.
v) α(A + B) = αA + αB.
vi) (α + β)A = αA + βA.
vii) (αβ)A = α(βA).
viii) 1.A = A.
E E13) Write out the formal proofs of the properties (v) – (viii) given above.

These properties imply that Mmxn(F) is a vector space over F.
Now that we have shown that Mmxn (F) is a vector space over F, we know it must have
a dimension.
What is the dimension of Mmxn(F) over F? To answer this question we prove the following theorem. But, before you go further, check whether you remember the definition of a vector space isomorphism (Unit 5).
Theorem 2 states that the map L(U, V) → Mmxn(F): T ↦ [T]B1,B2 is an isomorphism of vector spaces. (Given A = [aij] ∈ Mmxn(F), set vj = Σ_{i=1}^{m} aij fi for j = 1,…, n; then the unique T ∈ L(U, V) with T(ej) = vj = Σ_{i=1}^{m} aij fi satisfies [T]B1,B2 = A.)

Theorem 3: dimF Mmxn(F) = mn.

Proof: Theorem 2 tells us that Mmxn(F) is isomorphic to L(U, V). Therefore, dimF Mmxn(F) = dimF L(U, V) (by Theorem 12 of Unit 5) = mn, from Unit 6 (Theorem 1).
Why do you think we chose such a roundabout way for obtaining dim M mxn (F)? We
could as well have tried to obtain mn linearly independent m x n matrices and show
that they generate Mmxn (F). But that would be quite tedious (see E16). Also, we have
done so much work on L(U, V) so why not use that! And, doesn’t the way we have
used seem neat?
E E14) At most, how many matrices can there be in any linearly independent subset of M2x3(F)?
E E15) Are the matrices [1, 0] and [1, -1] linearly independent over R?
E E16) Let Eij be the m x n matrix whose (i, j)th element is 1 and whose other elements are 0. Show that {Eij : 1 ≤ i ≤ m, 1 ≤ j ≤ n} is a basis of Mmxn(F) over F. Conclude that dimF Mmxn(F) = mn.
Now we move on to the next section, where we see some ways of getting new
matrices from given ones.
Given any matrix, we can obtain new matrices from it in different ways. Let us see three of these ways.
3.2.1 Transpose

Suppose A =
1  0  9
2  5  9 .

From this we form the matrix whose first and second columns are the first and second rows of A, respectively. That is, we obtain

     1  2
B =  0  5
     9  9 .

Then B is called the transpose of A, denoted At. Note that A is also the transpose of B, since the rows of B are the columns of A. Here A is a 2 x 3 matrix and B is a 3 x 2 matrix.
Note that, if A = [aij]mxn, then At = [bij]nxm, where bij is the entry at the intersection of the ith row and jth column of At, i.e., the entry at the intersection of the jth row and ith column of A. ∴ bij = aji.
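The rule bij = aji is all that transposition is; a sketch:

```python
# Transpose: the (i, j) entry of A^t is the (j, i) entry of A.

def transpose(A):
    rows, cols = len(A), len(A[0])
    return [[A[j][i] for j in range(rows)] for i in range(cols)]

A = [[1, 0, 9],
     [2, 5, 9]]
print(transpose(A))                    # [[1, 2], [0, 5], [9, 9]]
assert transpose(transpose(A)) == A    # (A^t)^t = A
```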
E E17) Find At, where A =
1  2
2  0 .
a) (A + B)t = At + Bt
b) (αA)t = αAt.
c) (At)t = A
A square matrix A is called symmetric if At = A, and skew-symmetric if At = –A. For example, [1] and
1  2
2  1
are both symmetric matrices.
0   2
–2  0
is an example of a skew-symmetric matrix, since

0   2 t    0  –2       0   2
–2  0   =  2   0  = –  –2  0 .
E E20) Take a 2 x 2 matrix A. Calculate A + At and A – At. Which of these is
symmetric and which is skew-symmetric?
What you have shown in E20 is true for a square matrix of any size: for any A ∈ Mn(F), A + At is symmetric and A – At is skew-symmetric.
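A quick check of this fact on an assumed 2x2 example:

```python
# For any square A, A + A^t is symmetric and A - A^t is skew-symmetric.

def transpose(A):
    n = len(A)
    return [[A[j][i] for j in range(n)] for i in range(n)]

A = [[1, 7], [3, 4]]
S = [[A[i][j] + A[j][i] for j in range(2)] for i in range(2)]   # A + A^t
K = [[A[i][j] - A[j][i] for j in range(2)] for i in range(2)]   # A - A^t

assert S == transpose(S)                                  # symmetric
assert K == [[-x for x in row] for row in transpose(K)]   # skew-symmetric
```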
We now give another way of getting a new matrix from a given matrix over the complex field.
3.2.2 Conjugate
If A is a matrix over C, then the matrix obtained by replacing each entry of A by its
complex conjugate is called the conjugate of A, and is denoted by Ā.
Three properties of conjugates, which are similar to those of the transpose, are
a) the conjugate of A + B is Ā + the conjugate of B,
b) the conjugate of αA is the conjugate of α times Ā, and
c) the conjugate of Ā is A itself.
Example 7: Find the conjugate of
1       i
2+i  –3–2i .

Solution: The conjugate is
1      –i
2–i  –3+2i .
Example 8: What is the conjugate of
1  1
2  3 ?
Solution: Note that this matrix has only real entries. Thus, the complex conjugate of
each entry is itself. This means that the conjugate of this matrix is itself.
E E21) Calculate the conjugate of
i  2
3  i .
Given a matrix A ∈ Mmxn(C), we form a matrix B by taking the conjugate of At. Then B = Āt is called the conjugate transpose of A.
Example 9: Find Āt, where A =
1       i
2+i  –3–2i .

Solution: Firstly, At =
1    2+i
i   –3–2i .

Then, Āt =
1    2–i
–i  –3+2i .
Now, note a peculiar occurrence. If we first calculate Ā and then take its transpose, we get the same matrix, namely, Āt. That is, (Ā)t = Āt. In general, (Ā)t = Āt ∀ A ∈ Mmxn(C).
E E22) Show that if A = Āt, then A is a square matrix.

For example, the matrix
1    1+i
1–i   2   is Hermitian, whereas the matrix

 i    1+i
–1+i   0   is a skew-Hermitian matrix.
Note: If A =
1  2
2  0 , then A = At = Āt (since A is symmetric and its entries are all real).
We will now discuss two important and often-used, types of square matrices.
∴ [T]B1,B2 =
d1  0  …  0
0   d2 …  0
:   :      :
0   0  …  dn .
Let A = [aij] be a square matrix. The entries a11, a22,…, ann are called the diagonal entries of A. This is because they lie along the diagonal, from top left to bottom right, of the matrix. All the other entries of A are called the off-diagonal entries of A.
A square matrix whose off-diagonal entries are zero (i.e., aij = 0 ∀ i ≠ j) is called a diagonal matrix. The diagonal matrix

d1  0  0  …  0
0   d2 0  …  0
:   :  :      :
0   0  0  …  dn

is denoted by diag(d1,…, dn).
Note: The di’s may or may not be zero. What happens if all the d i’s are zero? Well,
we get the n x n zero matrix, which corresponds to the zero operator.
If di = 1 ∀ i = 1,…, n, we get the identity matrix, In (or I, when the size is understood).
E E23) Show that In is the matrix associated to the identity operator from Rn to Rn.
If α ∈ F, the linear operator αI: Rn → Rn: (αI)(v) = αv, for all v ∈ Rn, is called a scalar operator. Its matrix with respect to any basis is αI = diag(α, α,…, α). Such a matrix is called a scalar matrix. It is a diagonal matrix whose diagonal entries are all equal. With this much discussion on diagonal matrices, we move on to describe triangular matrices.
A square matrix A = [aij] with aij = 0 for i > j, that is, a matrix of the form

a11  a12  …  a1n
0    a22  …  a2n
:    :        :
0    0   …  ann,

is called an upper triangular matrix. If, in addition, aij = 0 ∀ i ≥ j, A is said to be strictly upper triangular. For example,

1  3
0  3
is upper triangular, while
0  3
0  0
is strictly upper triangular.

Note that every strictly upper triangular matrix is an upper triangular matrix.
b11 0 0 . . 0
b21 b22 : . . 0
[T]B = : : : : : :
bn1 bn2 bn3 . . bnn.
Such a matrix is called a lower triangular matrix. If bij = 0 for all i ≤ j, then B is said to be a strictly lower triangular matrix.
The matrix

0   0  0  0
2   0  0  0
–1  –1 0  0
1   0  5  0

is strictly lower triangular.
For example, the transpose of the upper triangular matrix

1  2  3
0  5  1
0  0  6

is lower triangular. In fact, for any n x n upper triangular matrix A, its transpose is lower triangular, and vice versa.
We have already discussed scalar multiplication. Now we see how to multiply two
matrices. Again, the motivation for this operation comes from linear transformations.
Let U, V and W be vector spaces over F, with ordered bases
B1 = {e1, e2,……, ep},
B2 = {f1, f2,……, fn},
B3 = {g1, g2,……, gm},
respectively, and let T ∈ L(U, V) and S ∈ L(V, W), with [T]B1,B2 = [bjk] and [S]B2,B3 = [aij].

Then, we know that T(ek) = Σ_{j=1}^{n} bjk fj ∀ k = 1, 2,……, p,
and S(fj) = Σ_{i=1}^{m} aij gi ∀ j = 1, 2,……, n.

Therefore, SoT(ek) = S(T(ek)) = S(Σ_{j=1}^{n} bjk fj) = b1k S(f1) + b2k S(f2) +……+ bnk S(fn)
= b1k Σ_{i=1}^{m} ai1 gi + b2k Σ_{i=1}^{m} ai2 gi +……+ bnk Σ_{i=1}^{m} ain gi
= Σ_{i=1}^{m} (ai1 b1k + ai2 b2k +……+ ain bnk) gi, on collecting the coefficients of gi.

Thus, [ST]B1,B3 = [cik]mxp, where cik = Σ_{j=1}^{n} aij bjk.

This motivates the definition of the product of matrices: for A = [aij]mxn and B = [bjk]nxp, the product AB is the m x p matrix [cik] whose (i, k)th element is

cik = Σ_{j=1}^{n} aij bjk = ai1 b1k + ai2 b2k +……+ ain bnk.
In order to obtain the (i, k)th element of AB, take the ith row of A and the kth column of B; both are n-tuples. Multiply their corresponding elements and add up all these products.
For example, if the 2nd row of A is [1 2 3] and the 3rd column of B is
4
5
6 ,
then the (2, 3) entry of AB is 1x4 + 2x5 + 3x6 = 32.
Note that two matrices A and B can only be multiplied if the number of columns of A
= the number of rows of B. The following illustration may help in explaining what we
do to obtain the product of two matrices.
(Illustration: each row of A, combined with each column of B, produces one entry of the product AB.)
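The row-by-column rule translates directly into code. The following Python sketch (our own illustration, not part of the course text; the name mat_mul is ours) multiplies two matrices stored as lists of rows, computing cik = Σj aij bjk exactly as above:

```python
def mat_mul(A, B):
    """Multiply an m x n matrix A by an n x p matrix B (lists of rows)."""
    m, n, p = len(A), len(B), len(B[0])
    assert all(len(row) == n for row in A), "columns of A must equal rows of B"
    # c[i][k] = a[i][0]*b[0][k] + ... + a[i][n-1]*b[n-1][k]
    return [[sum(A[i][j] * B[j][k] for j in range(n)) for k in range(p)]
            for i in range(m)]

# The (2,3) entry example from the text: 2nd row of A is [1, 2, 3] and the
# 3rd column of B is (4, 5, 6), so the (2,3) entry of AB is 1*4 + 2*5 + 3*6 = 32.
A = [[0, 0, 0], [1, 2, 3]]
B = [[9, 9, 4], [9, 9, 5], [9, 9, 6]]
print(mat_mul(A, B)[1][2])  # -> 32
```

The other entries of A and B here are arbitrary filler; only the highlighted row and column matter for the (2,3) entry.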
Note: This is a very new kind of operation so take your time in trying to understand
it. To get you used to matrix multiplication we consider the product of a row and a
column matrix.
Let A = [a1 a2 … an] be a 1 × n matrix and B =
[b1]
[b2]
[: ]
[bn]
be an n × 1 matrix. Then AB is the 1 × 1 matrix [a1b1 + a2b2 + … + anbn].
Example 10: Let A =
[1 0 0]
[7 0 8]
[0 0 9]
, B =
[2 1]
[3 5]
[4 0]
Then AB is defined, since the number of columns of A equals the number of rows of B. But BA is not defined, since the number of columns of B (= 2) is not the number of rows of A (= 3).
In fact, even if AB and BA are both defined, it is possible that AB ≠ BA. Consider the
following example.

Example 11: Let A =
[1 1]
[1 1]
, B =
[1  0]
[-1 0]
Is AB = BA?

Then AB =
[1×1 + 1×(-1)  1×0 + 1×0]   [0 0]
[1×1 + 1×(-1)  1×0 + 1×0] = [0 0]
while BA =
[1  1 ]
[-1 -1]
so AB ≠ BA.
So, you see, the product of two non-zero matrices can be zero.
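Both phenomena, non-commutativity and a zero product of non-zero matrices, are quick to check numerically. The sketch below (illustrative only, reusing the matrices of Example 11) does so:

```python
def mat_mul(A, B):
    """(i,k) entry of AB is sum over j of A[i][j] * B[j][k]."""
    return [[sum(A[i][j] * B[j][k] for j in range(len(B)))
             for k in range(len(B[0]))] for i in range(len(A))]

A = [[1, 1], [1, 1]]
B = [[1, 0], [-1, 0]]
print(mat_mul(A, B))  # AB is the zero matrix although A != 0 and B != 0
print(mat_mul(B, A))  # BA = [[1, 1], [-1, -1]], so AB != BA
```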
The following exercises will give you some practice in matrix multiplication.
E27) Let A =
[1 1]
[0 1]
, B =
[1 0]
[1 1]
Calculate AB and BA. Is AB = BA?
E28) Let C =
[1 0]
[1 1]
[0 0]
, D =
[0 1 0]
[1 1 0]
Write C + D, CD and DC, if defined. Is CD = DC?
E29) With A, B as in E27, calculate (A + B)² and A² + 2AB + B². Are they
equal? (Here A² means A·A.)
E30) Let A =
[bd   b² ]
[-d²  -bd]
, b, d ∈ F. Find A².
E31) Calculate
[1 0 0] [x]
[0 2 0] [y]
[0 0 3] [z]
and
        [1 0 0]
[x y z] [0 2 0]
        [0 0 3]
E32) Take a 3 × 2 matrix A whose 2nd row consists of zeros only. Multiply it
by any 2 × 4 matrix B. Show that the 2nd row of AB consists of zeros only. (In
fact, for any two matrices A and B such that AB is defined, if the ith row of A
is the zero vector, then the ith row of AB is also the zero vector. Similarly, if
the jth column of B is the zero vector, then the jth column of AB is the zero
vector.)
[S∘T]B1,B3 = [S]B2,B3 [T]B1,B2, where B1, B2, B3 are the bases of U, V, W,
respectively.
Example 12: Let T : R² → R³ be the linear transformation T(x, y) = (2x + y, x + 2y, x + y), and let S : R³ → R² be defined by S(x, y, z) = (-y + 2z, y - z). Obtain the matrices [T]B1,B2, [S]B2,B1 and [S∘T]B1, and verify that [S∘T]B1 = [S]B2,B1 [T]B1,B2, where B1 and B2 are the standard bases of R² and R³, respectively.

Solution: T(1, 0) = (2, 1, 1) and T(0, 1) = (1, 2, 1).

Thus, [T]B1,B2 =
[2 1]
[1 2]
[1 1]

Also, S(1, 0, 0) = (0, 0), S(0, 1, 0) = (-1, 1), S(0, 0, 1) = (2, -1).

Thus, [S]B2,B1 =
[0 -1 2 ]
[0 1  -1]

So, [S]B2,B1 [T]B1,B2 =
[0 -1 2 ] [2 1]   [1 0]
[0 1  -1] [1 2] = [0 1] = I2.
          [1 1]

On the other hand, S∘T(x, y) = S(2x + y, x + 2y, x + y) = (x, y), so [S∘T]B1 = I2, as required.
E33) Let S : R³ → R³ : S(x, y, z) = (0, x, y), and T : R³ → R³ : T(x, y, z) = (x, 0, y). Show that [S∘T]B = [S]B [T]B, where B is the standard basis of R³.
We will now state 5 properties concerning matrix multiplication. (Their proofs could
get a little technical, and we prefer not to give them here).
(i) Associative Law: If A, B, C are m × n, n × p and p × q matrices, respectively,
over F, then (AB)C = A(BC), i.e., matrix multiplication is associative.
(ii) Distributive Law: If A is an m x n matrix and B, C are n x p matrices, then A
(B + C) = AB + AC.
Similarly, if, A and B are m x n matrices, and C is an n x p matrix, then (A +
B) C = AC + BC.
(iii) Multiplicative identity: In Sec. 7.4.1, we defined the identity matrix In. This
acts as the multiplicative identity for matrix multiplication. We have AIn = A,
Im A = A, for every m x n matrix A.
(iv) If α ∈ F, and A, B are m × n and n × p matrices over F, respectively, then α(AB)
= (αA)B = A(αB).
(v) If A, B are m × n and n × p matrices over F, respectively, then (AB)ᵗ = BᵗAᵗ. (This
says that the operation of taking the transpose of a matrix is anti-commutative.)
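The properties above can be spot-checked numerically; of course a few examples are not a proof. Here is a small Python sketch (ours, with arbitrary sample matrices) checking property (i) and property (v):

```python
def mat_mul(A, B):
    """(i,k) entry of AB is sum over j of A[i][j] * B[j][k]."""
    return [[sum(A[i][j] * B[j][k] for j in range(len(B)))
             for k in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    # rows of the transpose are the columns of A
    return [list(col) for col in zip(*A)]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
C = [[2, 0], [1, 1]]

# (i) associativity: (AB)C == A(BC)
assert mat_mul(mat_mul(A, B), C) == mat_mul(A, mat_mul(B, C))
# (v) the transpose reverses the order of multiplication: (AB)^t == B^t A^t
assert transpose(mat_mul(A, B)) == mat_mul(transpose(B), transpose(A))
print("properties (i) and (v) hold for this sample")
```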
These properties can help you in solving the following exercises.
E35) For A =
[2  -1]
[1   0]
[-3  4]
and B =
[1 -2 -5]
[3  4  0]
calculate 2(AB) and (2A)B. Are they equal?

E36) Let A =
[2 -1  0]
[1  0 -3]
[0  0  0]
, B =
[1 -4  0]
[2 -1  3]
[4  0 -2]
Verify that (AB)ᵗ = BᵗAᵗ.
In this section we will first explain what invertible matrices are. Then we will see
what we mean by the matrix of a change of basis. Finally, we will show you that such
a matrix must be invertible.
We have the following theorem involving the matrix of an invertible linear operator.

Let S ∈ L(V, V) be such that [S]B = A. (S exists because of Theorem 2.) Then [T]B
[S]B = [S]B [T]B = I = [I]B. Thus, [TS]B = [ST]B = [I]B.
Example 13: Is A =
[1 1]
[0 1]
invertible?

Solution: Suppose A were invertible. Then there would exist B =
[a b]
[c d]
such that AB = I = BA. Now,

AB = I ⇒
[1 1] [a b]   [1 0]
[0 1] [c d] = [0 1]
⇒
[a+c b+d]   [1 0]
[c   d  ] = [0 1]
⇒ c = 0, d = 1, a = 1, b = -1.

∴ B =
[1 -1]
[0  1]
Now you can also check that BA = I.
Therefore, A is invertible.
We now show that if an inverse of a matrix exists, it must be unique.
Example 14: Let A =
[1 a]
[0 1]
, B =
[1 b]
[0 1]
Find AB. Can we choose b so that B = A⁻¹?

Solution: Now AB =
[1 a] [1 b]   [1 a+b]
[0 1] [0 1] = [0 1  ]

Now, how can we use this to obtain A⁻¹? Well, if AB = I, then a + b = 0. So, if we
take
B =
[1 -a]
[0  1]
we get AB = BA = I. Thus, A⁻¹ =
[1 -a]
[0  1]
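The claim that negating the off-diagonal entry inverts this matrix is easy to verify for several values of a. A short Python sketch (ours, reusing the mat_mul helper from earlier in the unit) does so:

```python
def mat_mul(A, B):
    """(i,k) entry of AB is sum over j of A[i][j] * B[j][k]."""
    return [[sum(A[i][j] * B[j][k] for j in range(len(B)))
             for k in range(len(B[0]))] for i in range(len(A))]

I2 = [[1, 0], [0, 1]]
for a in [2, -3, 5]:
    A = [[1, a], [0, 1]]
    B = [[1, -a], [0, 1]]  # claimed inverse: negate the off-diagonal entry
    assert mat_mul(A, B) == I2 and mat_mul(B, A) == I2
print("B is a two-sided inverse of A for each a tested")
```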
E39) Is the matrix
[1  0]
[2 -1]
invertible? If so, find its inverse.
We will now make a few observations about the matrix inverse, in the form of a
theorem.
(b) If A, B ∈ Mn(F) are invertible, then AB is invertible and (AB)⁻¹ = B⁻¹A⁻¹.
We now relate matrix invertibility with the linear independence of its rows or
columns. When we say that the m rows of A = [aij] ∈ Mm×n(F) are linearly
independent, what do we mean? Let R1, …, Rm be the m row vectors [a11, a12, …, a1n],
[a21, …, a2n], …, [am1, …, amn], respectively. We say that they are linearly independent
if, whenever a1, …, am ∈ F such that a1R1 + … + amRm = 0,
then a1 = 0, …, am = 0.
Similarly, the n columns C1, …, Cn of A are linearly independent if b1C1 + … + bnCn =
0 ⇒ b1 = 0, b2 = 0, …, bn = 0, where b1, …, bn ∈ F.
Theorem 7: Let A ∈ Mn(F). Then the following conditions are equivalent.
(a) A is invertible.
(b) The columns of A are linearly independent.
(c) The rows of A are linearly independent.
Proof: We first prove (a) ⟺ (b), using Theorem 4. Let V be an n-dimensional vector
space over F and B = {e1, …, en} be a basis of V. Let T ∈ L(V, V) be such that [T]B
= A. Then A is invertible iff T is invertible iff T(e1), T(e2), …, T(en) are linearly
independent (see Unit 5, Theorem 9). Now we define the map

θ : V → Mn×1(F) : θ(a1e1 + … + anen) = (a1, …, an)ᵗ.

Let C1, C2, …, Cn be the columns of A. Then θ(T(ei)) = Ci for all i = 1, …, n. Since θ
is an isomorphism, T(e1), …, T(en) are linearly independent iff C1, C2, …, Cn are
linearly independent. Thus, A is invertible iff C1, …, Cn are linearly independent.
Thus, we have proved (a) ⟺ (b).

Now, the equivalence of (a) and (c) follows because A is invertible ⟺ Aᵗ is invertible
⟺ the columns of Aᵗ are linearly independent (as we have just shown) ⟺ the rows of
A are linearly independent (since the columns of Aᵗ are the rows of A).
From the following example you can see how Theorem 7 can be useful.
Example 15: Let A =
[1 0 1]
[0 1 1]
[1 1 1]
∈ M3(R). Is A invertible?

Solution: Let R1, R2, R3 be the rows of A. We will show that they are linearly
independent. Suppose x, y, z ∈ R such that
x(1, 0, 1) + y(0, 1, 1) + z(1, 1, 1) = (0, 0, 0). This gives us the following equations:
x + z = 0
y + z = 0
x + y + z = 0
On solving these we get x = 0, y = 0, z = 0. ∴ R1, R2, R3 are linearly independent, and hence, by Theorem 7, A is invertible.
E41) Check if
[2 0 1]
[0 0 1]
[0 3 0]
∈ M3(Q) is invertible.
We will now see how we associate a matrix to a change of basis. This association will
be made use of very often in the next block.
Let V be an n-dimensional vector space over F. Let B = {e1, e2, …, en} and B' =
{e'1, e'2, …, e'n} be two bases of V. Since e'j ∈ V, for every j, it is a linear
combination of the elements of B. Suppose

e'j = Σ_{i=1}^n aij ei, for all j = 1, …, n.

The n × n matrix A = [aij] is called the matrix of the change of basis from B to B'. It is
denoted by MB'B.

Note that A is the matrix of the transformation T ∈ L(V, V) such that T(ej) = e'j for all j =
1, …, n, with respect to the basis B. Since {e'1, …, e'n} is a basis of V, from
Unit 5 we see that T is 1-1 and onto. Thus T is invertible. So A is invertible.
Thus, the matrix of the change of basis from B to B' is invertible.

Note: a) MBB = In. This is because, in this case, e'j = ej for all j = 1, 2, …, n.
Theorem 8: Let B = {e1, e2, …, en} be a fixed basis of V. The mapping B' ↦
MB'B is a 1-1 and onto correspondence between the set of all bases of V and the set
of invertible n × n matrices over F.
Example 16: In R², B = {e1, e2} is the standard basis. Let B' be the basis obtained by
rotating B through an angle θ in the anti-clockwise direction (see Fig. 1). Then B' =
{e'1, e'2}, where e'1 = (cos θ, sin θ) and e'2 = (-sin θ, cos θ). Find MB'B.

Solution: e'1 = cos θ e1 + sin θ e2 and e'2 = -sin θ e1 + cos θ e2.

Thus, MB'B =
[cos θ  -sin θ]
[sin θ   cos θ]
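Since rotating back by -θ undoes the change of basis, M(θ)·M(-θ) should be the identity. The Python sketch below (ours; the function names are our own) checks this numerically:

```python
import math

def mat_mul(A, B):
    """(i,k) entry of AB is sum over j of A[i][j] * B[j][k]."""
    return [[sum(A[i][j] * B[j][k] for j in range(len(B)))
             for k in range(len(B[0]))] for i in range(len(A))]

def change_of_basis(theta):
    # columns are the coordinates of e'1, e'2 in the standard basis
    return [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta),  math.cos(theta)]]

theta = 0.7
M = change_of_basis(theta)
Minv = change_of_basis(-theta)  # rotating back by -theta inverts the change of basis
P = mat_mul(M, Minv)
assert all(abs(P[i][j] - (1 if i == j else 0)) < 1e-12
           for i in range(2) for j in range(2))
print("M(theta) * M(-theta) is the identity, up to rounding")
```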
E42) Let B be the standard basis of R³ and B' be another basis such that
MB'B =
[0 1 1]
[1 1 0]
[0 0 3]
What are the elements of B'?
What happens if we change the basis more than once? The following theorem tells us
something about the corresponding matrices.

Theorem 9: Let B, B', B'' be three bases of V. Then MB'B MB''B' = MB''B.

Corollary: Let B, B' be two bases of V. Then MB'B MBB' = I = MBB' MB'B, i.e., (MB'B)⁻¹ = MBB'.

Proof: By Theorem 9,
MB'B MBB' = MBB = I, and similarly MBB' MB'B = MB'B' = I.
But, how does the change of basis affect the matrix associated to a given linear
transformation? In sec. 7.2 we remarked that the matrix of a linear transformation
depends upon the pair of bases chosen. The relation between the matrices of a
transformation with respect to two pairs of bases can be described as follows.
Theorem 10: Let T ∈ L(U, V). Let B1 = {e1, …, en} and B2 = {f1, …, fm}
be a pair of bases of U and V, respectively.
Let B'1 = {e'1, …, e'n}, B'2 = {f'1, …, f'm} be another pair of bases of U and
V, respectively. Then

[T]B'1,B'2 = (MB'2B2)⁻¹ [T]B1,B2 MB'1B1.

Now, a corollary to Theorem 10, which will come in handy in the next block.

Corollary: Let T ∈ L(V, V) and B, B' be two bases of V. Then [T]B' = P⁻¹ [T]B P,
where P = MB'B.

Proof: [T]B' = (MB'B)⁻¹ [T]B MB'B = P⁻¹ [T]B P, since (MB'B)⁻¹ = MBB' by the corollary to Theorem 9.

Let us now recapitulate all that we have covered in this unit.
5.0 SUMMARY
3.6 Solutions/Answers
E1) a) You want the elements in the 1st row and the 2nd column. They are 2
and 5, respectively.
b) [0 0 7]
c) The second column of A is
[2]
[5]
[0]
The first column of B is also
[2]
[5]
[0]
[1 2]       [1 0]
[3 4]       [2 0]
[5 6]  and  [3 0]
[7 8]       [4 0]
E4) Suppose B'1 = {(1, 0, 1), (0, 2, -1), (1, 0, 0)} and B'2 = {(0, 1), (1, 0)}.
Then T(1, 0, 1) = (1, 0) = 0·(0, 1) + 1·(1, 0)
T(0, 2, -1) = (0, 2) = 2·(0, 1) + 0·(1, 0)
T(1, 0, 0) = (1, 0) = 0·(0, 1) + 1·(1, 0).

∴ [T]B'1,B'2 =
[0 2 0]
[1 0 1]
E5) B1 = {e1, e2, e3} B2 = {f1, f2} are the standard bases (given in Example 3).
T (e1) = T (1, 0, 0) = (1, 2) = f1 + 2f2
T (e2) = T (0, 1, 0) = (2, 3) = 2f1 + 3f2
T (e3) = T (0, 0, 1) = (2, 4) = 2f1 + 4f2.
∴ [T]B1,B2 =
[1 2 2]
[2 3 4]
∴ [T]B'1,B'2 =
[1 0  3]
[0 1 -2]
c) Both matrices are of the same size, namely, 2 × 2. Their sum is the matrix
[1+(-1)  0+0   ]   [0 0]
[0+0     1+(-1)] = [0 0]
E11) 3(1, 2)ᵗ = (3, 6)ᵗ and 3(0, 1)ᵗ = (0, 3)ᵗ,
and 3((1, 2)ᵗ + (0, 1)ᵗ) = 3(1, 3)ᵗ = (3, 9)ᵗ.
Notice that 3((1, 2)ᵗ + (0, 1)ᵗ) = 3(1, 2)ᵗ + 3(0, 1)ᵗ.
∴ [S]B1,B2 =
[1 0]
[0 0]
[0 1]
a 3 × 2 matrix.

∴ [T]B1,B2 =
[0 0]
[1 0]
[0 1]
a 3 × 2 matrix.

∴ [S + T]B1,B2 = [S]B1,B2 + [T]B1,B2 =
[1 0]   [0 0]   [1 0]
[0 0] + [1 0] = [1 0]
[0 1]   [0 1]   [0 2]

And [αS]B1,B2 = α[S]B1,B2 = α
[1 0]   [α 0]
[0 0] = [0 0]
[0 1]   [0 α]
for any α ∈ R.
E13) We will prove (v) and (vi) here. You can prove (vii) and (viii) in a similar
way.
(vi) Prove it using the fact that (α + β)aij = αaij + βaij.
E14) Since dim M2×3(R) = 6, any linearly independent subset can have 6 elements,
at most.
E16) E11 =
[1 0 … 0]
[0 0 … 0]
[: :   :]
[0 0 … 0]
, E12 =
[0 1 0 … 0]
[0 0 0 … 0]
[: :  :   :]
[0 0 0 … 0]
and so on.

Now any m × n matrix A = [aij] = a11E11 + a12E12 + … + amnEmn. (For example, in the
2 × 2 situation,
[2 3]    [1 0]    [0 1]    [0 0]    [0 0]
[0 1] = 2[0 0] + 3[0 0] + 0[1 0] + 1[0 1].)

Thus, {Eij | i = 1, …, m, j = 1, …, n} generates Mm×n(F). Also, if αij, i = 1, …, m,
j = 1, …, n, are scalars such that α11E11 + α12E12 + … + αmnEmn = 0, we get

[α11 α12 … α1n]   [0 … 0]
[ :   :  …  : ] = [: … :]
[αm1 αm2 … αmn]   [0 … 0]

Therefore, αij = 0 for all i, j.

Hence, the given set is linearly independent. ∴ It is a basis of Mm×n(F). The number
of elements in this basis is mn.
E17) Aᵗ =
[1 2]
[2 0]
In this case Aᵗ = A.
∴ (Aᵗ)ᵗ = A.
E20) Let A =
[a b]
[c d]
be a square matrix over a field F.

Then Aᵗ =
[a c]
[b d]

∴ A + Aᵗ =
[a+a  b+c]   [2a   b+c]
[c+b  d+d] = [b+c  2d ]
which is symmetric, and

A - Aᵗ =
[0    b-c]   [0       b-c]
[c-b  0  ] = [-(b-c)  0  ]
which is skew-symmetric.
E22) The size of Āᵗ is the same as the size of Aᵗ. ∴ A = Āᵗ implies that the sizes of
A and Aᵗ are the same. ∴ A is a square matrix.
∴ [I]B =
[1 0 … 0]
[0 1 … 0]
[: :   :]
[0 0 … 1]
= In.
E24) Since A is upper triangular, all its elements below the diagonal are zero.
Again, since A = Aᵗ, a lower triangular matrix, all the entries of A above the
diagonal are zero. ∴ All the off-diagonal entries of A are zero. ∴ A is a diagonal
matrix.

The converse is not true. For example, the diagonal entries of
[0 1]
[2 0]
are zero, but this matrix is not skew-symmetric.
E26) [1 x 1 + 0 x 2 + 0 x 3] = [1]
E27) AB =
[1×1+1×1  1×0+1×1]   [2 1]
[0×1+1×1  0×0+1×1] = [1 1]

BA =
[1 0] [1 1]   [1 1]
[1 1] [0 1] = [1 2]
So AB ≠ BA.
E28) C + D is not defined, since C and D are not of the same size.
CD =
[0 1 0]
[1 2 0]
[0 0 0]
DC =
[1 1]
[2 1]
CD ≠ DC; they are not even of the same size.
E29) A + B =
[2 1]
[1 2]
∴ (A + B)² =
[2 1] [2 1]   [5 4]
[1 2] [1 2] = [4 5]

Also A² =
[1 1] [1 1]   [1 2]
[0 1] [0 1] = [0 1]

B² =
[1 0] [1 0]   [1 0]
[1 1] [1 1] = [2 1]

2AB = 2
[2 1]   [4 2]
[1 1] = [2 2]

∴ A² + 2AB + B² =
[1 2]   [4 2]   [1 0]   [6 4]
[0 1] + [2 2] + [2 1] = [4 4]

Thus (A + B)² ≠ A² + 2AB + B², since AB ≠ BA.
E31)
[1 0 0] [x]   [x ]
[0 2 0] [y] = [2y]
[0 0 3] [z]   [3z]

        [1 0 0]
[x y z] [0 2 0] = [x 2y 3z]
        [0 0 3]
E32) We take A =
[1 2]
[0 0]
[3 1]
, B =
[1 2 3 0]
[4 5 1 1]
Then AB =
[9 12 5  2]
[0 0  0  0]
[7 11 10 1]
You can see that the 2nd row of AB is zero.
E33) [S]B =
[0 0 0]
[1 0 0]
[0 1 0]
, [T]B =
[1 0 0]
[0 0 0]
[0 1 0]

∴ [S]B [T]B =
[0 0 0]
[1 0 0]
[0 0 0]

Also, [S∘T]B =
[0 0 0]
[1 0 0]
[0 0 0]
= [S]B [T]B.
E35) AB =
[-1 -8 -10]
[1  -2 -5 ]
[9  22 15 ]
∴ 2(AB) =
[-2 -16 -20]
[2  -4  -10]
[18 44  30 ]

On the other hand, (2A)B =
[4  -2]
[2   0] [1 -2 -5]
[-6  8] [3  4  0]
=
[-2 -16 -20]
[2  -4  -10]
[18 44  30 ]
∴ 2(AB) = (2A)B.
E36) AB =
[0   -7 -3]
[-11 -4  6]
[0    0  0]
∴ (AB)ᵗ =
[0  -11 0]
[-7 -4  0]
[-3  6  0]

Also, BᵗAᵗ =
[1  2  4] [2  1  0]
[-4 -1 0] [-1 0  0]
[0  3 -2] [0 -3  0]
=
[0  -11 0]
[-7 -4  0]
[-3  6  0]
= (AB)ᵗ.
AB =
[d1 0  … 0 ] [e1 0  … 0 ]
[0  d2 … 0 ] [0  e2 … 0 ]
[:  :    : ] [:  :    : ]
[0  0  … dn] [0  0  … en]
=
[d1e1 0    … 0   ]
[0    d2e2 … 0   ]
[:    :      :   ]
[0    0    … dnen]
E39) Suppose
[1  0] [a b]   [1 0]   [a b] [1  0]
[2 -1] [c d] = [0 1] = [c d] [2 -1]
This gives us a = 1, b = 0, c = 2, d = -1, which is the same as the given matrix.
This shows that the given matrix is invertible and, in fact,
[1  0]⁻¹   [1  0]
[2 -1]   = [2 -1]
E40) Firstly, θ is a well defined map. Secondly, check that θ(v1 + v2) = θ(v1) +
θ(v2), and θ(αv) = αθ(v) for v, v1, v2 ∈ V and α ∈ F. Thirdly, show that θ(v) =
0 ⇒ v = 0, that is, θ is 1-1. Then, by Unit 5 (Theorem 10), you have shown
that θ is an isomorphism.
E41) We will show that its columns are linearly independent over Q. Now, if x, y, z
∈ Q such that
x(2, 0, 0)ᵗ + y(0, 0, 3)ᵗ + z(1, 1, 0)ᵗ = (0, 0, 0)ᵗ, we get the equations
2x + z = 0
z = 0
3y = 0
On solving them we get x = 0, y = 0, z = 0. ∴ The columns are linearly independent,
and hence the matrix is invertible.
1.0 Introduction
2.0 Objective
3.0 Main Content
3.1 Rank of a Matrix
3.2 Elementary Operations
3.3 Elementary Operations on a Matrix
3.4 Row-reduced Echelon Matrices
3.5 Applications of Row-reduction
3.6 Inverse of a Matrix
3.7 Solving a System of Linear Equations
4.0 Conclusion
5.0 Summary
6.0 Tutor-Marked Assignment
7.0 References/Further Reading
1.0 INTRODUCTION
In Unit 3 we introduced you to a matrix and showed you how a system of linear
equations can give us a matrix. An important reason for which linear algebra arose is
the theory of simultaneous linear equations. A system of simultaneous linear
equations can be translated into a matrix equation, and solved by using matrices.
The study of the rank of a matrix is a natural forerunner to the theory of simultaneous
linear equations. Because, it is in terms of rank that we can find out whether a
simultaneous system of equations has a solution or not. In this unit we start by
studying the rank and inverse of a matrix. Then we discuss row operations on a
matrix and use them for obtaining the rank and inverse of a matrix. Finally, we apply
this knowledge to determine the nature of solutions of a system of linear equations.
The method of solving a system of linear equations that we give here is by “successive
elimination of variables”. It is also called the Gaussian elimination process.
With this unit we finish Block 2. In the next block we will discuss concepts that are
intimately related to matrices.
2.0 OBJECTIVES
Consider any m × n matrix A over a field F. We can associate two vector spaces with
it, in a very natural way. Let us see what they are. Let A = [aij]. A has m rows, say,
R1, R2, …, Rm, where R1 = (a11, a12, …, a1n), R2 = (a21, a22, …, a2n), …, Rm = (am1,
am2, …, amn).

Thus, Ri ∈ Fⁿ for each i, and

    [R1]
A = [R2]
    [: ]
    [Rm]
The subspace of Fn generated by the row vectors R1, …. Rm of A, is called the row
space of A, and is denoted by RS (A).
Example 1: If A =
[1 0 0]
[0 1 0]
does (0, 0, 1) ∈ RS(A)?

Solution: The row space of A is the subspace of R³ generated by (1, 0, 0) and (0, 1, 0).
Therefore, RS(A) = {(a, b, 0) | a, b ∈ R}. Therefore, (0, 0, 1) ∉ RS(A).
The dimension of the row space of A is called the row rank of A, and is denoted by pr(A).
Example 2: If A =
[1 0]
[0 1]
[2 0]
find pr(A).

Solution: The row space of A is the subspace of R² generated by (1, 0), (0, 1) and (2, 0).
But (2, 0) already lies in the vector space generated by (1, 0) and (0, 1), since (2, 0) = 2(1, 0). Therefore, the row space of A is generated by the linearly independent vectors (1, 0) and (0, 1). Thus, pr(A) = 2.
Just as we have defined the row space of A, we can define the column space of A.
Each column of A is an m-tuple, and hence belongs to Fᵐ. We denote the columns of A
by C1, …, Cn. The subspace of Fᵐ generated by {C1, …, Cn} is called the column
space of A and is denoted by CS(A). The dimension of CS(A) is called the column
rank of A, and is denoted by pc(A). Again, since CS(A) is generated by n vectors and
is a subspace of Fᵐ, we get 0 ≤ pc(A) ≤ min(m, n).
E2) Obtain the column rank and row rank of A =
[1 0 1]
[0 2 1]
In E2 you may have noticed that the row and column ranks of A are equal. In fact, in
Theorem 1, we prove that pr(A) = pc(A), for any matrix A. But first, we prove a
lemma.

Lemma 1: Let A and B be matrices such that AB is defined. Then
a) CS(AB) ⊆ CS(A),
b) RS(AB) ⊆ RS(B).
Proof: Let A = [aij]m×n and B = [bjk]n×p. The jth column of AB has entries
cij = Σ_{k=1}^n aik bkj, i = 1, …, m. Thus, the jth column of AB is

C1 b1j + … + Cn bnj,

where C1, …, Cn are the columns of A.
Thus, the columns of AB are linear combinations of the columns of A. Thus, the
columns of AB lie in CS(A). So, CS(AB) ⊆ CS(A).
Hence, pc(AB) ≤ pc(A). The proof of (b) is similar.
Theorem 1: pr(A) = pc(A), for any matrix A.

Proof: Let t = pc(A) and r = pr(A). Now, RS(A) = [{R1, R2, …, Rm}], where R1, R2, …, Rm are the rows of A. Let {e1, e2, …, er} be a basis of RS(A). Then each Ri is a linear combination of e1, …, er, for i = 1, …, m. Let

Ri = Σ_{j=1}^r bij ej, i = 1, 2, …, m, where bij ∈ F for 1 ≤ i ≤ m, 1 ≤ j ≤ r.

That is,

[R1]   [b11 … b1r] [e1]
[: ] = [ :     : ] [: ]
[Rm]   [bm1 … bmr] [er]

So, A = BE, where B = [bij] is an m × r matrix and E is the r × n matrix with rows e1,
e2, …, er. (Remember, ei ∈ Fⁿ, for each i = 1, …, r.)

So, t = pc(A) = pc(BE) ≤ pc(B), by Lemma 1,
≤ min(m, r)
≤ r.

Thus, t ≤ r. Applying the same argument to Aᵗ, whose rows are the columns of A, gives r ≤ t. Hence pr(A) = pc(A).
Definition: The integer pc(A) (= pr(A)) is called the rank of A, and is denoted by p(A).

You will see that Theorem 1 is very helpful if we want to prove any fact about p(A). If
it is easier to deal with the rows of A, we can prove the fact for pr(A). Similarly, if it is
easier to deal with the columns of A, we can prove the fact for pc(A). While proving
Theorem 3 we have used this facility that Theorem 1 gives us.
E4) If A, B are two matrices such that AB is defined, then show that p(AB) ≤
min(p(A), p(B)).

E5) Suppose C ≠ 0 ∈ Mm×1(F), and R ≠ 0 ∈ M1×n(F). Then show that the rank
of the m × n matrix CR is 1. (Hint: use E4.)
Does the term ‘rank’ seem familiar to you? Do you remember studying about the rank
of a linear transformation in Unit 2? We now see if the rank of a linear transformation
is related to the rank of its matrix. The following theorem brings forth the precise
relationship. (Go through Sec. 2.3 before going further.)
Theorem 2: Let U, V be vector spaces over F of dimensions n and m, respectively. Let
B1 be a basis of U and B2 be a basis of V. Let T ∈ L(U, V). Then rank(T) = p([T]B1,B2).

Proof: Let B1 = {e1, …, en} and B2 = {f1, f2, …, fm}. As in the proof of Theorem 7 of Unit
3, θ : V → Mm×1(F) : θ(v) = coordinate vector of v with respect to the basis B2, is an
isomorphism.

Now, R(T) = [{T(e1), T(e2), …, T(en)}]. Let A = [T]B1,B2 have C1, C2, …, Cn as its
columns.
Then CS(A) = [{C1, C2, …, Cn}]. Also, θ(T(ei)) = Ci for all i = 1, …, n.
Thus, θ restricted to R(T) is an isomorphism onto CS(A), so R(T) ≅ CS(A). Hence
rank(T) = dim R(T) = dim CS(A) = p(A).
Corollary 1: Let A be an m × n matrix. Let P, Q be m × m and n × n invertible matrices,
respectively. Then p(PAQ) = p(A).

Proof: Let T ∈ L(U, V) be such that [T]B1,B2 = A. We are given the invertible matrices Q and P⁻¹.
Therefore, by Theorem 8 of Unit 3, there exist bases B'1 and B'2 of U and V, respectively, such
that Q = MB'1B1 and P⁻¹ = MB'2B2.
In other words, we can change the bases suitably so that the matrix of T with respect
to the new bases is PAQ. ∴ p(PAQ) = rank(T) = p(A), by Theorem 2.
E6) Take A =
[1 2  3 ]
[0 -1 -2]
, P =
[0  1]
[-1 0]
, Q =
[3 0 0]
[0 2 0]
[0 0 1]
Obtain PAQ
and show that p(PAQ) = p(A).
Now we state and prove another corollary to Theorem 2. This corollary is useful
because it transforms any matrix into a very simple matrix, a matrix whose entries are
1 and 0 only.
Corollary 2: Let A be an m × n matrix with rank r. Then there exist invertible matrices P and Q
such that PAQ =
[Ir 0]
[0  0]

Proof: Let T ∈ L(U, V) be such that [T]B1,B2 = A. Since p(A) = r, rank(T) = r.
∴ nullity(T) = n - r (Unit 2, Theorem 5). One can choose bases B'1 of U and B'2 of V (starting from a basis {u1, …, un-r} of Ker T) so that the matrix of T becomes

[Ir        0r×(n-r)   ]
[0(m-r)×r  0(m-r)×(n-r)]

where 0s×t denotes the zero matrix of size s × t. (Remember that u1, …, un-r ∈ Ker T.)

Hence, PAQ =
[Ir 0]
[0  0]

where Q = MB'1B1 and P⁻¹ = MB'2B2, by Theorem 10 of Unit 3.

Note:
[Ir 0]
[0  0]
is called the normal form of the matrix A.
For example, suppose p(A) = 1. Then

PAQ =
[1 0 … 0]
[0 0 … 0]
[: :   :]
[0 0 … 0]
since p(A) = 1. This matrix is the product of the column (1, 0, …, 0)ᵗ and the row [1 0 … 0].

∴ A = P⁻¹(PAQ)Q⁻¹ = (P⁻¹(1, 0, …, 0)ᵗ)([1 0 … 0]Q⁻¹) = CR,

where C = P⁻¹(1, 0, …, 0)ᵗ ≠ 0 and R = [1 0 … 0]Q⁻¹ ≠ 0, since P⁻¹ and Q⁻¹ are invertible.
The solution of E7 is a particular case of a general phenomenon: the normal form of
an n × n invertible matrix is In.
Let us now look at some ways of transforming a matrix by playing around with its
rows. The idea is to get more and more entries of the matrix to be zero. This will help
us in solving systems of linear equations.
x+y+z=1
2x + 3z = 0
How can you express this system of equations in matrix form?
One way is

[1 1 1] [x]   [1]
[2 0 3] [y] = [0]
        [z]
In general, a system of m linear equations in n unknowns x1, …, xn,

a11x1 + a12x2 + … + a1nxn = b1
…
am1x1 + am2x2 + … + amnxn = bm,

can be written as AX = B, where A = [aij]m×n, X =
[x1]
[: ]
[xn]
and B =
[b1]
[: ]
[bm]
In this section we will study methods of changing the matrix A to a very simple form
so that we can obtain an immediate solution to the system of linear equations AX = B.
For this purpose, we will always be multiplying A on the left or the right by a suitable
matrix. In effect, we will be applying elementary row or column operations on A.
Let A be an mxn matrix. As usual, we denote its rows by R1, ….,Rm, and columns by
C1, …, Cn. We call the following operations elementary row operations:

1) Interchanging Ri and Rj, for i ≠ j, denoted by Rij.
2) Multiplying Ri by a ∈ F, a ≠ 0, denoted by Ri(a).
3) Adding aRj to Ri, where a ∈ F, denoted by Rij(a).

For example, let A =
[1 2 3]
[0 1 2]
Then R12(A) =
[0 1 2]
[1 2 3]
(interchanging the two rows).

Also R2(3)(A) =
[1   2   3  ]   [1 2 3]
[0×3 1×3 2×3] = [0 3 6]

and R12(2)(A) =
[1+0×2 2+1×2 3+2×2]   [1 4 7]
[0     1     2    ] = [0 1 2]
E8) If A =
[0 1 2]
[0 0 1]
[1 0 0]
[0 1 0]
what is
a) R21(A)  b) R32 ∘ R21(A)  c) R13(-1)(A)?
Just as we defined the row operations, we can define the three column
operations as follows:
1) Interchanging Ci and Cj, for i ≠ j, denoted by Cij.
2) Multiplying Ci by a ∈ F, a ≠ 0, denoted by Ci(a).
3) Adding aCj to Ci, where a ∈ F, denoted by Cij(a).
For example, if A =
[1 3]
[2 4]
then C21(10)(A) =
[1 13]
[2 24]
and C12(10)(A) =
[31 3]
[42 4]
We will now prove a theorem which we will use in Sec. 8.3.2 for obtaining the rank
of a matrix easily.

Theorem 3: The rank of a matrix remains unchanged under elementary row and column operations.
Proof: The way we will prove the statement is to show that the row space remains
unchanged under row operations and the column space remains unchanged under
column operations. This means that the row rank and the column rank remain
unchanged. This immediately shows, by Theorem 1, that the rank of the matrix
remains unchanged.

Now, let us show that the row space remains unaltered. Let R1, …, Rm be the rows of a
matrix A. Then the row space of A is generated by {R1, …, Ri, …, Rj, …, Rm}. On applying
Rij to A, the rows of A remain the same; only their order gets changed. Therefore,
the row space of Rij(A) is the same as the row space of A.

If we apply Ri(a), for a ∈ F, a ≠ 0, then any linear combination a1R1 + … + aiRi + … + amRm
equals a1R1 + … + (ai/a)(aRi) + … + amRm, which is a linear combination of
R1, …, aRi, …, Rm, and conversely.
Thus, [{R1, …, Ri, …, Rm}] = [{R1, …, aRi, …, Rm}]. That is, the row space of A is the same
as the row space of Ri(a)(A). A similar argument shows that the row space is unaltered
if we apply Rij(a).

Hence, the row space of A remains unaltered under any elementary row operation.
We can similarly show that the column space remains unaltered under elementary
column operations.
A matrix obtained by applying a single elementary row or column operation to an identity matrix is called an elementary matrix. For example,

C12(I3) =
[0 1 0]
[1 0 0]
[0 0 1]
is an elementary matrix.

Since there are six types of elementary operations, we get six types of elementary
matrices, but not all of them are different.
E9) Check that R23(I4) = C23(I4), R2(2)(I4) = C2(2)(I4) and R12(3)(I4) = C21(3)(I4).
Consider
A =
[0 1 2]
[3 0 0]
[2 1 0]
If we multiply it on the left by
E12 =
[0 1 0]
[1 0 0]
[0 0 1]
we get
[0 1 0] [0 1 2]   [3 0 0]
[1 0 0] [3 0 0] = [0 1 2] = R12(A)
[0 0 1] [2 1 0]   [2 1 0]
Finally, E13(5)A =
[1 0 5] [0 1 2]   [10 6 2]
[0 1 0] [3 0 0] = [3  0 0] = R13(5)(A)
[0 0 1] [2 1 0]   [2  1 0]

But AE13(5) =
[0 1 2] [1 0 5]   [0 1 2 ]
[3 0 0] [0 1 0] = [3 0 15] = C31(5)(A)
[2 1 0] [0 0 1]   [2 1 10]
What you have just seen are example of a general phenomenon. We will not state this
general result formally. (Its proof is slightly technical, and so, we skip it.)
Theorem 4: For any matrix A,
a) Rij(A) = Eij A
b) Ri(a)(A) = Ei(a) A, for a ≠ 0
c) Rij(a)(A) = Eij(a) A
d) Cij(A) = A Eij
e) Ci(a)(A) = A Ei(a), for a ≠ 0
f) Cij(a)(A) = A Eji(a)
In (f) note the change of indices i and j.
An immediate corollary to this theorem shows that all the elementary matrices are
invertible (see Sec. 7.6).
a) EijEij = I,
b) Ei(a-1)Ei(a) = I, for a ≠ 0.
c) Eij (-a) Eij(a) = I.
Proof: We prove (a) only and leave the rest to you (see E10).
Now, from Theorem 4,
The corollary tells us that the elementary matrices are invertible and that the inverse of an
elementary matrix is an elementary matrix of the same type.

E11) Actually multiply the two 4 × 4 matrices E13(-2) and E13(2) to get I4.
And now we will introduce you to a very nice type of matrix, which any matrix can be
transformed to by applying elementary operations.
In this matrix the three non-zero rows come before the zero rows, and the first non-zero
entry in each non-zero row is 1. Also, below each such 1 there are only zeros. This type of matrix
has a special name, which we now give.
Is
[1 1 2]
[0 1 0]
a row-reduced echelon matrix? Yes. It satisfies all the conditions of the
definition. On the other hand,
[0 0 0]  [2 1 0]  [0 1 0]
[0 0 1]  [0 0 1]  [1 1 1]
are not row-reduced echelon matrices, since they
violate conditions (a), (b) and (c), respectively.

The matrix
[0 1 3 4 9 7 8 0  -1 0 1 ]
[0 0 0 0 1 5 6 10  2 0 0 ]
[0 0 0 0 0 0 0 1   7 0 12]
[0 0 0 0 0 0 0 0   0 1 10]
[0 0 0 0 0 0 0 0   0 0 0 ]
[0 0 0 0 0 0 0 0   0 0 0 ]
is also a row-reduced echelon matrix.
But, why bring in this type of a matrix? Well the following theorem gives us one good
reason.
Theorem 5: The rank of a row-reduced echelon matrix is equal to the number of its
non-zero rows.
Proof: Let R1, R2, …, Rr be the non-zero rows of an m × n row-reduced echelon matrix
E. Then RS(E) is generated by R1, …, Rr. We want to show that R1, …, Rr are linearly
independent. Suppose R1 has its first non-zero entry in column k1, R2 in column k2,
and so on, with k1 < k2 < … < kr. Then, for any r scalars c1, …, cr such that c1R1 + c2R2 + … + crRr
= 0, we immediately get

c1 [0, …, 0, 1, ∗, …, ∗, …, ∗, …, ∗]
+ c2 [0, …, …, 0, 1, ∗, …, ∗, …, ∗]
+ … + cr [0, …, …, …, …, 0, 1, ∗, …]
= [0, …, …, …, …, …, …, …, …, 0],

where ∗ denotes various entries that we aren't bothering to calculate.
This equation gives us the following equations (when we equate the k1th entries, the
k2th entries, …, the krth entries on both sides of the equation):

c1 = 0, c1(∗) + c2 = 0, …, c1(∗) + c2(∗) + … + cr-1(∗) + cr = 0.

On solving these equations we get
c1 = 0 = c2 = … = cr. ∴ R1, …, Rr are linearly independent. ∴ p(E) = r.
Not only is it easy to obtain the rank of an echelon matrix, one can also solve linear
equations of the type AX =B more easily if A is in echelon form.
Matrices II
Now here is some good news!
Every matrix can be transformed to the row echelon form by a series of elementary
row operations. We say that the matrix is reduced to the echelon form. Consider the
following example.
Example 4: Let A =
[0 0 0 0  0  1]
[0 1 2 -1 -1 1]
[0 1 2 0  3  1]
[0 0 0 1  4  1]
[0 2 4 1  10 2]
Reduce A to the row echelon form.

Solution: The first column of A is zero. The second is non-zero. The (1,2)th element is
0; we want 1 at this position. We apply R12 to A and get
A1 =
[0 1 2 -1 -1 1]
[0 0 0 0  0  1]
[0 1 2 0  3  1]
[0 0 0 1  4  1]
[0 2 4 1  10 2]
The (1,2)th entry has become 1. Now we subtract multiples of the first row from the other
rows so that the (2,2)th, (3,2)th, (4,2)th and (5,2)th entries become zero. So we apply R31(-1) and R51(-2) and get

A2 =
[0 1 2 -1 -1 1]
[0 0 0 0  0  1]
[0 0 0 1  4  0]
[0 0 0 1  4  1]
[0 0 0 3  12 0]
Now, beneath the entries of the first row we have zeros in the first 3 columns, and in
the fourth column we find non-zero entries. We want 1 at the (2,4)th position, so we
interchange the 2nd and 3rd rows. We get

A3 =
[0 1 2 -1 -1 1]
[0 0 0 1  4  0]
[0 0 0 0  0  1]
[0 0 0 1  4  1]
[0 0 0 3  12 0]
We now subtract suitable multiples of the 2nd row from the 3rd, 4th and 5th rows so that
the (3,4)th, (4,4)th and (5,4)th entries all become zero. Applying R42(-1) and R52(-3),

A3 ~ A4 =
[0 1 2 -1 -1 1]
[0 0 0 1  4  0]
[0 0 0 0  0  1]
[0 0 0 0  0  1]
[0 0 0 0  0  0]

(A ~ B means that on applying the operation R to A we get the matrix B.)
Now we have zeros below the entries of the 2nd row, except in the
6th column. The (3,6)th element is 1. We subtract suitable multiples
of the 3rd row from the 4th and 5th rows so that the (4,6)th and (5,6)th elements
become zero. Applying R43(-1),

A4 ~ A5 =
[0 1 2 -1 -1 1]
[0 0 0 1  4  0]
[0 0 0 0  0  1]
[0 0 0 0  0  0]
[0 0 0 0  0  0]

And now we have achieved a row echelon matrix. Notice that we applied 7
elementary row operations to A to obtain this matrix.
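The reduction procedure of Example 4 can be sketched in code. The Python helper below (our own illustration, using exact Fraction arithmetic; the name row_echelon is ours) applies the three elementary row operations systematically, and also returns the number of non-zero rows, which equals the rank by Theorem 5:

```python
from fractions import Fraction

def row_echelon(A):
    """Reduce a matrix (list of rows) to row echelon form; return (E, rank)."""
    E = [[Fraction(x) for x in row] for row in A]
    m, n = len(E), len(E[0])
    r = 0  # index of the next pivot row
    for c in range(n):
        # find a row at or below r with a non-zero entry in column c
        pivot = next((i for i in range(r, m) if E[i][c] != 0), None)
        if pivot is None:
            continue
        E[r], E[pivot] = E[pivot], E[r]          # interchange, R_ij
        inv = 1 / E[r][c]
        E[r] = [inv * x for x in E[r]]           # scale the pivot to 1, R_i(a)
        for i in range(r + 1, m):                # clear entries below, R_ij(a)
            factor = E[i][c]
            E[i] = [x - factor * y for x, y in zip(E[i], E[r])]
        r += 1
    return E, r

A = [[0, 0, 0, 0, 0, 1],
     [0, 1, 2, -1, -1, 1],
     [0, 1, 2, 0, 3, 1],
     [0, 0, 0, 1, 4, 1],
     [0, 2, 4, 1, 10, 2]]
E, rank = row_echelon(A)
print(rank)  # -> 3, matching the three non-zero rows found in Example 4
```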
In general, we have the following theorem.

Theorem 6: Every matrix can be reduced to a row-reduced echelon matrix by a finite sequence of elementary row operations.

The proof of this result is just a repetition of the process that you went through in
Example 4. For practice, we give you the following exercise.
E12) Reduce the matrix
[1 2 0]
[0 1 0]
[3 1 0]
to echelon form.
Let us look at some examples to actually see how the echelon form of a matrix
simplifies matters.
By now you must have got used to obtaining row echelon forms. Let us discuss some
ways of applying this reduction.
In this section we shall see how to utilize row-reduction for obtaining the inverse of a
matrix, and for solving a system of linear equations.
Now, how do we use this knowledge for obtaining the inverse of an invertible matrix?
Suppose we have an n × n invertible matrix A. We know that A = IA, where I = In.
Now, we apply a series of elementary row operations E1, …, Es to A so that A gets
transformed to In.
Thus,

I = Es Es-1 … E2 E1 A = Es Es-1 … E2 E1 (IA)
= (Es Es-1 … E2 E1 I) A = BA,

where B = Es … E1 I. Then B is the inverse of A!
Note that we are reducing A to I, and not only to the echelon form.
For example, let us find the inverse of A =
[1 2 3]
[2 3 1]
[3 1 2]
We start with A = IA and go on applying row operations to the left-hand side, keeping track of them on the right:

[1 2 3]   [1 0  0]
[0 1 5] = [2 -1 0] A   (applying R21(-2), R31(-3), R2(-1) and R3(-1))
[0 5 7]   [3 0 -1]

[1 0 -7 ]   [-3 2  0]
[0 1 5  ] = [2  -1 0] A   (applying R12(-2) and R32(-5))
[0 0 -18]   [-7 5 -1]

[1 0 -7]   [-3    2     0   ]
[0 1 5 ] = [2     -1    0   ] A   (applying R3(-1/18))
[0 0 1 ]   [7/18  -5/18 1/18]

[1 0 0]   [-5/18 1/18  7/18 ]
[0 1 0] = [1/18  7/18  -5/18] A   (applying R13(7) and R23(-5))
[0 0 1]   [7/18  -5/18 1/18 ]

∴ A⁻¹ = (1/18)
[-5 1  7]
[1  7 -5]
[7 -5  1]
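This row-reduction of [A | I] to [I | A⁻¹] is mechanical enough to automate. The Python sketch below (ours; it assumes the input matrix is invertible, and uses the 3 × 3 matrix with rows [1 2 3], [2 3 1], [3 1 2] as a test case) carries it out with exact Fraction arithmetic:

```python
from fractions import Fraction

def inverse(A):
    """Invert a square matrix by row-reducing [A | I] to [I | A^-1]."""
    n = len(A)
    # build the augmented matrix [A | I]
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for c in range(n):
        pivot = next(i for i in range(c, n) if M[i][c] != 0)  # assumes A invertible
        M[c], M[pivot] = M[pivot], M[c]
        inv = 1 / M[c][c]
        M[c] = [inv * x for x in M[c]]        # scale pivot row so the pivot is 1
        for i in range(n):
            if i != c and M[i][c] != 0:       # clear column c in every other row
                f = M[i][c]
                M[i] = [x - f * y for x, y in zip(M[i], M[c])]
    return [row[n:] for row in M]             # right half is now A^-1

A = [[1, 2, 3], [2, 3, 1], [3, 1, 2]]
Ainv = inverse(A)
print(Ainv[0])  # -> [Fraction(-5, 18), Fraction(1, 18), Fraction(7, 18)]
```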
Theorem 8: The number of linearly independent solutions of the matrix equation AX = 0, where A is an m × n matrix, is n - r, where r = p(A).

Proof: In Unit 7 you studied that, given the matrix A, we can obtain a linear
transformation T : Fⁿ → Fᵐ such that [T]B,B' = A, where B and B' are bases of Fⁿ and Fᵐ,
respectively.

Now, X = (x1, …, xn)ᵗ is a solution of AX = 0 if and only if it lies in Ker T (since
T(X) = AX).
Thus, the number of linearly independent solutions is dim Ker T = nullity(T) =
n - rank(T) (Unit 5, Theorem 5).
This theorem is very useful for finding out whether a homogeneous system has any
non-trivial solution or not.
Example 7: Consider the homogeneous system
3x - 2y + z = 0
x + y = 0
x - 3z = 0
How many solutions does it have which are linearly independent over R?

Solution: Here our coefficient matrix is A =
[3 -2 1 ]
[1 1  0 ]
[1 0  -3]
You can check that p(A) = 3, for instance by reducing A to echelon form.
Thus, the number of linearly independent solutions is 3 - 3 = 0. This means that this
system of equations has no non-zero solution.
In Example 7 the number of unknowns was equal to the number of equations, that is, n
= m. What happens if n > m?

A system of m homogeneous equations in n unknowns has a non-zero solution if n > m.
Why? Well, if n > m, then the rank r of the coefficient matrix is less than or equal to m,
and hence, less than n. So n - r > 0. Therefore, at least one non-zero solution exists.
E E15) Give a set of linearly independent solutions for the system of equations
x + 2y + 3z = 0
2x + 4y + z = 0
Now consider the general equation AX = B, where A is an m × n matrix. We form the
augmented matrix [A B]. This is an m × (n+1) matrix whose last column is the matrix
B. Here, we also include the case B = 0.

The following result tells us under what conditions the system AX = B has a solution.
Theorem 9: The system of linear equations given by the matrix equation AX = B has a
solution if and only if p(A) = p([A B]).

Proof: Suppose first that AX = B has a solution X = (c1, …, cn)ᵗ. Then c1C1 + c2C2 + … + cnCn = B,
where C1, …, Cn are the columns of A. That is, B is a linear combination of the Ci's. ∴ CS([A B]) = CS(A), so p([A B]) = p(A).

Conversely, suppose p(A) = p([A B]). Then CS([A B]) = CS(A), so B ∈ CS(A), i.e.,
B = a1C1 + … + anCn for some a1, …, an ∈ F.
Then a solution of AX = B is X = (a1, …, an)ᵗ.
Thus, AX = B has a solution if and only if p(A) = p([A B]).
Remark: If A is invertible then the system AX=B has the unique solution X= A -1 B.
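Theorem 9 translates directly into a rank test. Here is a small NumPy sketch (the matrices are made-up illustrations, not taken from the text):

```python
import numpy as np

def is_consistent(A, B):
    """AX = B has a solution iff rank(A) equals the rank of [A B]."""
    aug = np.hstack([A, B])                  # the augmented matrix [A B]
    return np.linalg.matrix_rank(A) == np.linalg.matrix_rank(aug)

A = np.array([[2.0, 3.0],
              [4.0, 6.0]])                   # rank 1: second row is twice the first
B_good = np.array([[1.0], [2.0]])            # lies in the column space of A
B_bad  = np.array([[1.0], [3.0]])            # does not

print(is_consistent(A, B_good), is_consistent(A, B_bad))   # True False
```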
Now, once we know that the system given by AX = B is consistent, how do we find a
solution? We utilize the method of successive (or Gaussian) elimination. This method
is attributed to the famous German mathematician, Carl Friedrich Gauss (1777-1855)
(see Fig. 1). Gauss was called the “prince of mathematicians” by his contemporaries.
He did a great amount of work in pure mathematics, as well as in probability, the theory of errors, geodesy, mechanics, electromagnetism and optics.
To apply the method of Gaussian elimination, we first reduce [A B] to its row echelon form. Then we write out the corresponding equations and solve them, which is simple.
Let us illustrate the method.
Example 8: Solve the following system by using the Gaussian elimination process:
x + 2y + 3z = 1
2x + 4y + z = 2

Now let us look at an example where B = 0, that is, the system is homogeneous. Suppose the row echelon form of [A B] gives the equations
x + y + 5t = 0
y - (7/3)z + (4/3)t = 0,
so that
x = (-14/3)z - (7/3)t
y = (7/3)z - (4/3)t,
which is the solution in terms of z and t. Thus, the solution set of the given system of equations, in terms of the two parameters z = α and t = β, is
{ ((-14/3)α - (7/3)β, (7/3)α - (4/3)β, α, β) : α, β ∈ R }.
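The successive-elimination procedure can be sketched in code. The following is a minimal NumPy implementation for a square, invertible system (the example system is made up for illustration; it is not one of the systems in the text):

```python
import numpy as np

def gauss_solve(A, b):
    """Solve AX = b by forward elimination on the augmented matrix [A b],
    followed by back substitution (with partial pivoting for stability)."""
    M = np.hstack([A.astype(float), b.astype(float).reshape(-1, 1)])
    n = len(b)
    for col in range(n):
        pivot = col + np.argmax(np.abs(M[col:, col]))   # largest pivot in column
        M[[col, pivot]] = M[[pivot, col]]               # swap it into place
        for row in range(col + 1, n):
            M[row] -= (M[row, col] / M[col, col]) * M[col]
    x = np.zeros(n)
    for row in range(n - 1, -1, -1):                    # back substitution
        x[row] = (M[row, -1] - M[row, row + 1:n] @ x[row + 1:]) / M[row, row]
    return x

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 1.0],
              [1.0, 0.0, 1.0]])
b = np.array([1.0, 2.0, 3.0])
print(gauss_solve(A, b))        # agrees with np.linalg.solve(A, b)
```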
E E16) Use the Gaussian method to obtain the solution set of the following system of equations.
4x1 - 3x2 + x3 - 7 = 0
x1 - 2x2 - 2x3 - 3 = 0
3x1 - x2 + 2x3 + 1 = 0
5.0 SUMMARY
We defined the row rank, column rank and rank of a matrix, and showed that they are equal.
We proved that the rank of a linear transformation is equal to the rank of its
matrix.
We defined the six elementary row and column operations.
We have shown you how to reduce a matrix to the row-reduced echelon form.
We have used the echelon form to obtain the inverse of a matrix.
We proved that the number of linearly independent solutions of a homogeneous system of equations given by the matrix equation AX = 0 is n - r, where r = rank of A and n = number of columns of A.
We proved that the system of linear equations given by the matrix equation AX
= B is consistent if and only if p(A) = p([AB]).
We have shown you how to solve a system of linear equations by the process
of successive elimination of variables, that is, the Gaussian method.
Solutions/Answers
The row space of A is the subspace of R3 generated by (1,0,1) and (0,2,1). These vectors are linearly independent and hence form a basis of RS(A). Therefore, pr(A) = 2.
n n n
= k=1 aik bk1 aikbk2 ……. aikbkp
k=1 k=1
= ai1 [b11 b12 ..n1p] + ai1 [b21 b22…b2p] +…. + ain [bn1 bn2…bnp], a
linear combination of the rows of B.Rs.(AB)RS(B) pr (AB) pr(B).
Now, if C = (a1, ..., am)t and R = [b1, ..., bn], then CR is the m x n matrix [ai bj].
Since C ≠ 0, ai ≠ 0 for some i. Similarly, bj ≠ 0 for some j. Therefore ai bj ≠ 0, so CR ≠ 0.
Hence p(CR) ≠ 0. Since every row of CR is a multiple of R, p(CR) = 1.
E6) PAQ = [  0 -2 -2 ]
          [ -3 -4 -3 ]
The rows of PAQ are linearly independent, so p(PAQ) = 2. Also the rows of A are linearly independent, so p(PAQ) = p(A).
E7) Let A = [ 1 0 0 ]
            [ 0 2 0 ]
            [ 0 0 3 ]
Then p(A) = 3, and A's normal form is
[ 1 0 0 ]
[ 0 1 0 ]
[ 0 0 1 ]
a) [ 1 0 0 ]
   [ 0 0 1 ]
   [ 0 1 0 ]

b) R32 ∘ R21(A) = R32 [ 0 1 0 ]   [ 0 1 0 ]
                      [ 1 0 0 ] = [ 0 0 1 ]
                      [ 0 0 1 ]   [ 1 0 0 ]

c) [ 0 + 0 x (-1)   0 + 1 x (-1)   1 + 0 x (-1) ]   [ 0 -1 1 ]
   [      1              0               0      ] = [ 1  0 0 ]
   [      0              1               0      ]   [ 0  1 0 ]
E9) R23(I4) = [ 1 0 0 0 ]
              [ 0 0 1 0 ]
              [ 0 1 0 0 ] = C23(I4)
              [ 0 0 0 1 ]

R2(2)(I4) = [ 1 0 0 0 ]
            [ 0 2 0 0 ]
            [ 0 0 1 0 ] = C2(2)(I4)
            [ 0 0 0 1 ]

R12(3)(I4) = [ 1 3 0 0 ]
             [ 0 1 0 0 ]
             [ 0 0 1 0 ] = C21(3)(I4)
             [ 0 0 0 1 ]
Eij(-a)(Eij(a)) = Rij(-a)(Eij(a)) = Rij(-a)(Rij(a)(I)) = I, proving (c).
E11) E13(-2) E13(2) = [ 1 0 -2 0 ] [ 1 0 2 0 ]   [ 1 0 0 0 ]
                      [ 0 1  0 0 ] [ 0 1 0 0 ]   [ 0 1 0 0 ]
                      [ 0 0  1 0 ] [ 0 0 1 0 ] = [ 0 0 1 0 ] = I4
                      [ 0 0  0 1 ] [ 0 0 0 1 ]   [ 0 0 0 1 ]
E12) [ 1 2 0 ]           [ 1  2 0 ]          [ 1 2 0 ]
     [ 0 1 0 ] ~R31(-3)~ [ 0  1 0 ] ~R32(5)~ [ 0 1 0 ]
     [ 3 1 0 ]           [ 0 -5 0 ]          [ 0 0 0 ]
E13) [ 1 2 0  5 ]                    [ 1  2 0   5 ]
     [ 2 1 7  6 ] ~R21(-2), R31(-4)~ [ 0 -3 7  -4 ]
     [ 4 5 7 10 ]                    [ 0 -3 7 -10 ]

~R2(-1/3)~ [ 1  2  0    5  ] ~R32(3)~ [ 1 2  0    5  ] ~R3(-1/6)~ [ 1 2  0    5  ]
           [ 0  1 -7/3 4/3 ]          [ 0 1 -7/3 4/3 ]            [ 0 1 -7/3 4/3 ]
           [ 0 -3  7  -10  ]          [ 0 0  0   -6  ]            [ 0 0  0    1  ]

∴ p(A) = 3
E14) A = [ 0 1 3 ]   [ 1 0 0 ]
         [ 2 3 5 ] = [ 0 1 0 ] A
         [ 3 5 7 ]   [ 0 0 1 ]

[ 2 3 5 ]   [ 0 1 0 ]
[ 0 1 3 ] = [ 1 0 0 ] A   (applying R12)
[ 3 5 7 ]   [ 0 0 1 ]

Continuing the reduction,

[ 1 0 0 ]   [ -1     2    -1  ]
[ 0 1 0 ] = [ 1/4  -9/4  3/2  ] A   (applying R3(-1/2), R23(-3) and R13(2))
[ 0 0 1 ]   [ 1/4   3/4  -1/2 ]

∴ A is invertible, and
A-1 = [ -1     2    -1  ]
      [ 1/4  -9/4  3/2  ]
      [ 1/4   3/4  -1/2 ]
E15) The given system is equivalent to
[ 1 2 3 ] [ x ]   [ 0 ]
[ 2 4 1 ] [ y ] = [ 0 ]
          [ z ]
Now, the rank of [ 1 2 3; 2 4 1 ] is 2, so the number of linearly independent solutions is 3 - 2 = 1. Hence, any non-zero solution will be a linearly independent solution. Now, the given equations are equivalent to
x + 2y = -3z ..... (1)
2x + 4y = -z ..... (2)
Note that you can get several answers to this exercise. But any solution will be α(-2, 1, 0), for some α ∈ R.
E16) We can solve this system to get the unique solution x1 = -7, x2 = -10, x3 = 5.
Eigenvalues and Eigenvectors
This block consists of three units, in which we first introduce you to the theory of determinants, and give its applications in solving systems of linear equations.
The theory of determinants was originated by Leibniz in 1693 while studying systems
of simultaneous linear equations. The mathematician Jacobi was perhaps the most
prolific contributor to the theory of determinants. In fact, there is a particular kind of
determinant that is named Jacobian, after him. The mathematicians Cramer and
Bezout used determinants extensively for solving systems of linear equations.
In unit 6 we discuss eigenvalues and eigenvectors. Their use first appeared in the
study of quadratic forms. The concepts that you will study in this unit were developed
by Arthur Cayley and others during the 1840s. What you will discover in the unit is
the algebraic eigenvalue problem and methods of finding eigenvalues and linearly
independent eigenvectors.
If you are interested in knowing more about the material covered in this block, you
can refer to the books listed in the course introduction. These books will be available
at your study centre.
Introduction
There are several ways of developing the theory of determinants. In section 5.2 we
approach it in one way. In section 5.3 you will study the properties of determinants
and certain other basic facts about them. We go on to give applications in solving a
system of linear equations (Cramer’s Rule) and obtaining the inverse of a matrix. We
also define the determinant of a linear transformation. We end with discussing a
method of obtaining the rank of a matrix.
Throughout this unit, F will denote a field of characteristic zero, Mn(F) will denote the set of n x n matrices over F, and Vn(F) will denote the space of all n x 1 matrices over F, that is,
Vn(F) = { X = (a1, a2, ..., an)t : ai ∈ F }.
The concept of a determinant must be understood properly because you will be using
it again and again. Do spend more time on section 5.2, if necessary. We also advise
you to revise unit 1-4 before starting this unit.
2.0 OBJECTIVES
There are many ways of introducing and defining the determinant function from Mn(F) to F. In this section we give one of them, the classical approach. This was given by the French mathematician Laplace (1749-1827), and is still very much in use.
We will define the determinant function det: Mn (F) F by induction on n. That is, we
will define it for n = 1,2,3, and then define it for any n, assuming the definition
for n-1.
When n = 1, any A ∈ M1(F) is of the form A = [a], for some a ∈ F. In this case we define det(A) = det([a]) = a.
When n = 2, for A = [ a11 a12; a21 a22 ] we define det(A) = a11 a22 - a12 a21.
For example, det [  0 1 ] = 0 x 3 - 1 x (-2) = 2.
                 [ -2 3 ]
When n = 3, for A = [aij] we define
det(A) = a11 det [ a22 a23 ] - a12 det [ a21 a23 ] + a13 det [ a21 a22 ]
                 [ a32 a33 ]           [ a31 a33 ]           [ a31 a32 ]
That is, det(A) = (-1)^(1+1) a11 (det of the matrix left after deleting the row and column containing a11) + (-1)^(1+2) a12 (det of the matrix left after deleting the row and column containing a12) + (-1)^(1+3) a13 (det of the matrix left after deleting the row and column containing a13).
Note that the power of (-1) attached to a1j is 1 + j, for j = 1, 2, 3.
So, det(A) = a11(a22 a33 - a23 a32) - a12(a21 a33 - a23 a31) + a13(a21 a32 - a22 a31).
In fact, we could have calculated | A | from the second row also as follows:
Example 1: Let
A = [ 1 2 6 ]
    [ 5 4 1 ]   Calculate |A|.
    [ 7 3 2 ]

Solution: Let Aij denote the matrix obtained by deleting the ith row and jth column of A, and let us expand by the first row. Observe that
|A11| = det [ 4 1; 3 2 ] = 4 x 2 - 1 x 3 = 5,
|A12| = det [ 5 1; 7 2 ] = 5 x 2 - 1 x 7 = 3,
|A13| = det [ 5 4; 7 3 ] = 5 x 3 - 4 x 7 = -13.
Thus,
|A| = (-1)^(1+1) x 1 x |A11| + (-1)^(1+2) x 2 x |A12| + (-1)^(1+3) x 6 x |A13| = 5 - 6 - 78 = -79.
E E1) Now obtain |A| of Example 1, by expanding by the second row, and by the third row. Does the value of |A| depend upon the row used for calculating it?
Now, let us see how this definition is extended to define det(A) for any n x n matrix A, n ≠ 1.
When
A = [ a11 a12 ... a1n ]
    [ a21 a22 ... a2n ]
    [  :   :       :  ]
    [ an1 an2 ... ann ]
we define
det(A) = (-1)^(i+1) ai1 det(Ai1) + (-1)^(i+2) ai2 det(Ai2) + ... + (-1)^(i+n) ain det(Ain),
where Aij is the (n - 1) x (n - 1) matrix obtained from A by deleting the ith row and the jth column, and i is a fixed integer with 1 ≤ i ≤ n.
We thus see that det(A) = Σ (from j = 1 to n) (-1)^(i+j) aij det(Aij).
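The inductive definition translates directly into a short recursive function. Below is a sketch in plain Python (expansion along the first row; nothing here is from the original text beyond the definition itself):

```python
def det(A):
    """Determinant by cofactor expansion along the first row,
    following the inductive definition above. A is a list of rows."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in A[1:]]   # delete row 1, column j+1
        total += (-1) ** j * A[0][j] * det(minor)          # (-1)^(1+j) a1j det(A1j)
    return total

print(det([[0, 1], [-2, 3]]))                    # 2
print(det([[1, 2, 6], [5, 4, 1], [7, 3, 2]]))    # -79, as in Example 1 above
```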
The following example will help you to get used to calculating determinants.
Example 2: Let
-3 -2 0 2
A = 2 1 0 -1 Calculate |A |
1 0 1 2
2 1 -3 1
-2 0 2
|A | = 1 1 -1
1 0 1
The first three rows have one zero each. Let us expand along third row. Observe
that a32 = 0. So we don’t need to calculate A32. Now,
-2 0 2 -3 –2 2 -3 –2 0
A31 = 1 0 1 , A33 = 2 1 -1 ,A34 = 2 10
1 -3 1 2 1 1 2 1 -3
We will obtain |A31|, |A33|, and |A34| by expanding along the second, third and second
row, respectively.
-2 0 2
, |A31| = 1 0 -1
1 -3 1
0 2 – 2 2 + (- 1 ) 2+3 .(-1). -2 0
= ( - 1) 2+1 .1. + (- 1)2+2 .0.
-3 1 1 -1 1 -3
(expansion along the second row)
|A33| = -3 -2 2
2 1 -1 = ( -1)3+1 .2. -2 2 + (-1) 2+3 .1. -3 -2
2 1 1 1 -1 2 1
-3 -2
+ 9-1) 3+3 .1.
2 1 (expansion along the third row)
At this point we mention that there are two other methods of obtaining determinants - via permutations and via multilinear forms. We will not be doing these methods here. For purposes of actual calculation of determinants, the method that we have given is normally used. The other methods are used to prove various properties of determinants.
If u1, u2, ..., un ∈ Rn, then the absolute value of det(u1, u2, ..., un) is the magnitude of the volume of the n-dimensional box spanned by u1, u2, ..., un.
In this section we will state some properties of determinants, mostly without proof. We will take examples and check that these properties hold for them.
Now, for any A ∈ Mn(F) we shall denote its columns by C1, C2, ..., Cn. Then we have the following 7 properties, P1 - P7.
Thus, we have
P5: for any α ∈ F and i ≠ j, det(C1, ..., Ci + αCj, ..., Cn) = det(C1, ..., Ci, ..., Cn).
Using P6, and the fact that det(A) can be obtained by expanding along any row, we get
P7: for A ∈ Mn(F), we can obtain det(A) by expanding along any column. That is, for a fixed k,
|A| = (-1)^(1+k) a1k |A1k| + (-1)^(2+k) a2k |A2k| + ... + (-1)^(n+k) ank |Ank|.
An important remark now.
Remark: Using P6, we can immediately say that P1 - P5 are valid when columns are replaced by rows. For example, P4 says that
det(Ri(α)(A)) = α det(A) = det(Ci(α)(A)), ∀ α ∈ F, and P5 says that
det(Rij(α)(A)) = det(A) = det(Cij(α)(A)), ∀ α ∈ F.
Calculate the determinants of
a) [ 1 6 0 ]      b) [  1 2 -1 -3 ]
   [ 2 7 2 ]         [  2 4  5  0 ]
   [ 1 6 0 ]         [  0 2 -1 -2 ]
                     [ -1 0  0  1 ]

Solution: a) Since the first and third rows of A (R1 and R3) coincide, |A| = 0, by P2 and P6.

b) |A| = det [  1 2 -1 -3 ]       [ 1 2 -1 -3 ]
             [  2 4  5  0 ] = det [ 2 4  5  0 ]   (adding R1 to R4)
             [  0 2 -1 -2 ]       [ 0 2 -1 -2 ]
             [ -1 0  0  1 ]       [ 0 2 -1 -2 ]
= 0, since R3 = R4.
E E4) Calculate det [ 1 3 0 ]  and  det [ 2 3  5 ]
                    [ 2 1 2 ]           [ 1 0  1 ]
                    [ 1 3 0 ]           [ 4 6 10 ]
Now we give some examples of determinants that you may come across often.
Example 4: Let
A = [ a b b b ]
    [ b a b b ]   where a, b ∈ R.
    [ b b a b ]
    [ b b b a ]
Calculate |A|.

Solution: Adding C2, C3 and C4 to C1, each entry of the first column becomes a + 3b. Taking this factor out and then subtracting R1 from each of the other rows leaves a triangular matrix, so
|A| = (a + 3b)(a - b)^3.
In Example 4 we have used an important, and easily proved, fact, namely, det(diag(a1, a2, ..., an)) = a1 a2 ... an:

det [ a1  0 ...  0 ]                  [ 1 0 ... 0 ]
    [ 0  a2 ...  0 ] = a1 a2 ... an det [ 0 1 ... 0 ]   (by P4)
    [ :   :      : ]                  [ : :     : ]
    [ 0   0 ... an ]                  [ 0 0 ... 1 ]
= a1 a2 ... an |In| = a1 a2 ... an, since |In| = 1.
det [ 1    1    1    1    ]
    [ x1   x2   x3   x4   ]
    [ x1^2 x2^2 x3^2 x4^2 ] = Π (xj - xi), 1 ≤ i < j ≤ 4.
    [ x1^3 x2^3 x3^3 x4^3 ]

Indeed, the determinant equals

det [ 1    0           0           0          ]
    [ x1   x2-x1       x3-x1       x4-x1      ]   (by subtracting the first column
    [ x1^2 x2^2-x1^2   x3^2-x1^2   x4^2-x1^2  ]   from every other column)
    [ x1^3 x2^3-x1^3   x3^3-x1^3   x4^3-x1^3  ]

= det [ x2-x1                  x3-x1                  x4-x1                 ]
      [ (x2-x1)(x2+x1)         (x3-x1)(x3+x1)         (x4-x1)(x4+x1)        ]
      [ (x2-x1)(x2^2+x1^2+x2x1) (x3-x1)(x3^2+x1^2+x3x1) (x4-x1)(x4^2+x1^2+x4x1) ]
(by expanding along the first row and factorising the entries)

= (x2-x1)(x3-x1)(x4-x1) det [ 1               1               1              ]
                            [ x2+x1           x3+x1           x4+x1          ]
                            [ x2^2+x1^2+x2x1  x3^2+x1^2+x3x1  x4^2+x1^2+x4x1 ]
(by taking out (x2-x1), (x3-x1) and (x4-x1) from columns 1, 2 and 3, respectively)

= (x2-x1)(x3-x1)(x4-x1) det [ 1               0                    0                   ]
                            [ x2+x1           x3-x2                x4-x2               ]
                            [ x2^2+x1^2+x2x1  x3^2-x2^2+(x3-x2)x1  x4^2-x2^2+(x4-x2)x1 ]
(by subtracting the first column from the second and third columns)

= (x2-x1)(x3-x1)(x4-x1)(x3-x2)(x4-x2) det [ 1          1          ]
                                          [ x3+x2+x1   x4+x2+x1   ]

= (x2-x1)(x3-x1)(x4-x1)(x3-x2)(x4-x2)(x4-x3)
= Π (xj - xi), 1 ≤ i < j ≤ 4.
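The Vandermonde identity just derived can be spot-checked numerically. Below is a NumPy sketch (the sample points are chosen arbitrarily; NumPy is not part of the original text):

```python
import numpy as np
from itertools import combinations

x = [2.0, 3.0, 5.0, 7.0]                                  # arbitrary sample points
V = np.array([[xi ** p for xi in x] for p in range(4)])   # rows: 1, x, x^2, x^3

rhs = 1.0
for i, j in combinations(range(4), 2):                    # product over all i < j
    rhs *= x[j] - x[i]

print(np.linalg.det(V), rhs)                              # both equal 240
```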
det [ a11  *  ...  *  ]           [ a22  *  ...  *  ]
    [  0  a22 ...  *  ] = a11 det [  0  a33 ...  *  ]   (expanding along C1)
    [  :   :       :  ]           [  :   :       :  ]
    [  0   0  ... ann ]           [  0   0  ... ann ]
= ... = a11 a22 ... ann, each time expanding along the first column.
In the Calculus course you must have come across df/dt = f'(t), where f is a function of t. The next exercise involves this.

E E6) Let us define the function Δ(t) by
Δ(t) = det [ f(t)  g(t)  ]
           [ f'(t) g'(t) ]
And now, let us study a method for obtaining the inverse of a matrix.
In this section we first obtain the determinant of the product of two matrices and
then define an adjoint of a matrix. Finally, we see the conditions under which a
matrix is invertible, and, when it is invertible, we give its inverse in terms of its
adjoint.
In Unit 7 you studied matrix multiplication. Let us see what happens to the determinant of a product of matrices.

Theorem 1: Let A and B be n x n matrices over F. Then det(AB) = det(A) det(B).

We will not give the proof here, since it is slightly complicated. But let us verify Theorem 1 in some cases.
Let A = [ 1 0 2 ]           [ 2 10 9 ]
        [ 3 1 0 ]  and  B = [ 0  3 8 ]
        [ 0 0 1 ]           [ 0  0 5 ]

Since AB = [ 2 10 19 ]
           [ 6 33 35 ],  |AB| = 5 det [ 2 10 ] = 5(66 - 60) = 30 = |A| |B|,
           [ 0  0  5 ]                [ 6 33 ]
since |A| = 1 and |B| = 2 x 3 x 5 = 30.
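Theorem 1 is easy to verify numerically for the matrices above. A NumPy sketch (not part of the original text):

```python
import numpy as np

A = np.array([[1.0, 0.0, 2.0],
              [3.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
B = np.array([[2.0, 10.0, 9.0],
              [0.0,  3.0, 8.0],
              [0.0,  0.0, 5.0]])

lhs = np.linalg.det(A @ B)                  # det of the product
rhs = np.linalg.det(A) * np.linalg.det(B)   # product of the dets
print(lhs, rhs)                             # both 30 (up to rounding)
```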
E E7) Verify Theorem 1 for
A = [ 1  0 -1 ]           [ -1  0 1 ]
    [ 0  2 -2 ]  and  B = [ -2  2 0 ]
    [ 3 -3  5 ]           [  5 -3 3 ]
Consider det of the square of [ a b c; c a b; b c a ]. We get the required determinant to be

 /      [ a b c ] \ 2
| det   [ c a b ]  |  = (a^3 + b^3 + c^3 - 3abc)^2   (by Theorem 1),
 \      [ b c a ] /

because
det [ a b c ]
    [ c a b ] = a det [ a b ] - b det [ c b ] + c det [ c a ]
    [ b c a ]         [ c a ]         [ b a ]         [ b c ]
= a(a^2 - bc) - b(ca - b^2) + c(c^2 - ab) = a^3 + b^3 + c^3 - 3abc.
On the other hand, det(A + B) ≠ det(A) + det(B), in general. The following exercise gives an example.

E E8) Let A = [ 1 0 ],  B = [ -1  0 ].  Show that det(A + B) ≠ det(A) + det(B).
              [ 0 1 ]       [  0 -1 ]
Here, A = [T]B = [  3 0 1 ]
                 [ -2 1 0 ]
                 [ -1 2 4 ]
So, by definition,
det(T) = det(A) = det [  3 0 1 ]
                      [ -2 1 0 ] = 3 det [ 1 0 ] + 1 det [ -2 1 ] = 12 - 3 = 9.
                      [ -1 2 4 ]         [ 2 4 ]         [ -1 2 ]
E E9) Find the determinant of the zero operator and the identity operator
from
R3 R3.
E E10) Consider the differential operator
D: P2 → P2 : D(a0 + a1x + a2x^2) = a1 + 2a2x.
What is det(D)?
Let us now see what the adjoint of a square matrix is, and how it will help us in
obtaining the inverse of an invertible matrix.
In Section 9.2 we used the notation Aij for the matrix obtained from a square matrix A by deleting its ith row and jth column. Related to this, we define the (i, j)th cofactor of A (or the cofactor of aij) to be (-1)^(i+j) |Aij|. It is denoted by Cij. That is, Cij = (-1)^(i+j) |Aij|.
Theorem 2: For A = [aij] ∈ Mn(F),
a) ai1 Ci1 + ai2 Ci2 + ... + ain Cin = det(A) = a1i C1i + a2i C2i + ... + ani Cni;
b) ai1 Cj1 + ai2 Cj2 + ... + ain Cjn = 0 = a1i C1j + a2i C2j + ... + ani Cnj, if i ≠ j.
We will not be proving this theorem here. We only mention that (a) follows immediately from the definition of det(A), since det(A) = (-1)^(i+1) ai1 |Ai1| + ... + (-1)^(i+n) ain |Ain|.
E E11)Verify (b) of theorem 2 for the matrix in example 9 and i=1, j=2 or 3.
C13 = det [   0   1 ] = -sin θ
          [ sin θ 0 ]
In Unit 7 you came across one method of finding out if a matrix is invertible. The following theorem uses the adjoint to give another way of finding out if a matrix A is invertible. It also gives us A-1, if A is invertible.
Theorem 3: Let A ∈ Mn(F). Then

A (Adj(A)) = [ det(A)   0    ...    0    ]
             [   0    det(A) ...    0    ]
             [   :      :           :    ]
             [   0      0    ... det(A)  ]

                     [ 1 0 ... 0 ]
           = det(A)  [ 0 1 ... 0 ] = det(A) I.
                     [ : :     : ]
                     [ 0 0 ... 1 ]

Hence, if |A| ≠ 0,
A ((1/|A|) Adj(A)) = I = ((1/|A|) Adj(A)) A,
so that
A-1 = (1/|A|) Adj(A).
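The formula A-1 = (1/|A|) Adj(A) can be sketched directly in NumPy (an illustration, not part of the original text; the naive cofactor loop is for clarity, not efficiency, and the test matrix is the one inverted by row reduction in E14 earlier, with |A| = 4):

```python
import numpy as np

def adjugate(A):
    """Adj(A): the transpose of the matrix of cofactors Cij = (-1)^(i+j) |Aij|."""
    n = A.shape[0]
    C = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
            C[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return C.T

A = np.array([[0.0, 1.0, 3.0],
              [2.0, 3.0, 5.0],
              [3.0, 5.0, 7.0]])
A_inv = adjugate(A) / np.linalg.det(A)
print(A_inv[0])                         # first row: [-1, 2, -1]
```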
Example: Let A = [ cos θ  0  -sin θ ]
                 [   0    1    0    ]   Find A-1.
                 [ sin θ  0   cos θ ]

Solution: Expanding along the second row,
|A| = cos^2 θ + sin^2 θ = 1.
Also, from Example 10 we know Adj(A); hence A-1 = (1/|A|) Adj(A) = Adj(A).
Consider a system of n linear equations in n unknowns,
AX = B, where A = [aij], X = (x1, x2, ..., xn)t, B = (b1, b2, ..., bn)t.

In Section 8.4 we discussed the Gaussian elimination method for obtaining a solution of this system. In this section we give a rule, due to the mathematician Cramer, for solving a system of linear equations when the number of equations equals the number of variables.
Theorem 4 (Cramer's Rule): Let the columns of A be C1, C2, ..., Cn. If det(A) ≠ 0, the given system has a unique solution, namely,
x1 = D1/D, ..., xn = Dn/D, where
Di = det(C1, ..., Ci-1, B, Ci+1, ..., Cn)
= determinant of the matrix obtained from A by replacing the ith column by B, and D = det(A).
Proof: Since |A| ≠ 0, the corollary to Theorem 3 says that A-1 exists.
Now, AX = B ⇒ A-1AX = A-1B ⇒ X = (1/D) Adj(A) B, that is,

X = (1/D) [ C11 C21 ... Cn1 ] [ b1 ]
          [ C12 C22 ... Cn2 ] [ b2 ]
          [  :   :       :  ] [  : ]
          [ C1n C2n ... Cnn ] [ bn ]

Now, Di = det(C1, ..., Ci-1, B, Ci+1, ..., Cn). Expanding along the ith column, we get
Di = C1i b1 + C2i b2 + ... + Cni bn.
Thus,
[ x1 ]         [ D1 ]
[ x2 ] = (1/D) [ D2 ],   that is, xi = Di/D for each i.
[  : ]         [  : ]
[ xn ]         [ Dn ]
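Cramer's rule as just proved translates directly into code. A NumPy sketch (not part of the original text; it requires D ≠ 0), applied to the system of the next example:

```python
import numpy as np

def cramer(A, B):
    """Solve AX = B via xi = Di / D, where Di replaces the ith column of A by B."""
    D = np.linalg.det(A)
    x = np.zeros(A.shape[0])
    for i in range(A.shape[0]):
        Ai = A.copy()
        Ai[:, i] = B                   # replace the ith column by B
        x[i] = np.linalg.det(Ai) / D
    return x

A = np.array([[2.0, 3.0, -1.0],
              [1.0, 2.0,  1.0],
              [2.0, 1.0, -6.0]])
B = np.array([2.0, -1.0, 4.0])
print(cramer(A, B))                    # agrees with np.linalg.solve(A, B)
```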
Example: Apply Cramer's rule to solve the system
2x + 3y - z = 2
x + 2y + z = -1
2x + y - 6z = 4

Solution: Here
A = [ 2 3 -1 ]        [ x ]        [  2 ]
    [ 1 2  1 ],  X =  [ y ],  B =  [ -1 ]   Therefore, applying the rule,
    [ 2 1 -6 ]        [ z ]        [  4 ]

x = det[ 2 3 -1; -1 2 1; 4 1 -6 ] / D,
y = det[ 2 2 -1; 1 -1 1; 2 4 -6 ] / D,
z = det[ 2 3 2; 1 2 -1; 2 1 4 ] / D,
where D = det(A).
Substitute these values in the given equations to check that we haven't made a mistake in our calculations.
Now use Cramer's rule to solve the system
x + 2y + 4z = 1
2x + 3y - z = 3
x - 3z = 2
Now let us see what happens if B = 0. Remember, in Unit 8 you saw that AX = 0 has n - r linearly independent solutions, where r = rank(A). The following theorem states this condition in terms of det(A).

Theorem 5: The homogeneous system AX = 0 has a non-trivial solution if and only if det(A) = 0.

Proof: First assume that AX = 0 has a non-trivial solution. Suppose, if possible, that det(A) ≠ 0. Then Cramer's rule says that AX = 0 has only the trivial solution X = 0 (because each Di = 0 in Theorem 4). This is a contradiction to our assumption. Therefore, det(A) = 0.
And now we introduce you to the determinant rank of a matrix, which leads us to
another method of obtaining the rank of a matrix.
In Units 5 and 8 you were introduced to the rank of a linear transformation and the rank of a matrix, respectively. Then we related the two ranks. In this section we will discuss the determinant rank and show that it is the rank of the concerned matrix.
First we give a necessary and sufficient condition for n vectors in Vn(F) to be linearly
dependent.
Theorem 6: Let X1, X2, ..., Xn ∈ Vn(F). Then X1, X2, ..., Xn are linearly dependent over the field F if and only if det(X1, X2, ..., Xn) = 0.

Proof: Let U = (X1, X2, ..., Xn) be the n x n matrix whose column vectors are X1, X2, ..., Xn. Then X1, X2, ..., Xn are linearly dependent over F if and only if there exist scalars a1, a2, ..., an ∈ F, not all zero, such that a1X1 + a2X2 + ... + anXn = 0.
Now,
U (a1, a2, ..., an)t = a1X1 + a2X2 + ... + anXn,
so the Xi are linearly dependent if and only if UX = 0 has a non-trivial solution. But this happens if and only if det(U) = 0, by Theorem 5. Thus, Theorem 6 is proved.
(A submatrix of A is a matrix that can be obtained from A by deleting some rows and columns.)

Now, consider the matrix A = [ 1 2 3 ]
                             [ 0 4 5 ]
                             [ 1 2 3 ]
Since two rows of A are equal, we know that |A| = 0. But consider its 2 x 2 submatrix
A13 = [ 0 4 ]
      [ 1 2 ]
Its determinant is -4 ≠ 0. In this case we say that the determinant rank of A is 2.

Note: The determinant rank r is defined for any m x n matrix, not only for a square matrix. Also r ≤ min(m, n).
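The definition suggests a direct, if brute-force, computation: search for the largest t x t submatrix with non-zero determinant. A NumPy sketch (not part of the original text; the matrix is the example just discussed):

```python
import numpy as np
from itertools import combinations

def determinant_rank(A, tol=1e-9):
    """Largest t for which some t x t submatrix of A has non-zero determinant."""
    m, n = A.shape
    for t in range(min(m, n), 0, -1):          # try the largest size first
        for rows in combinations(range(m), t):
            for cols in combinations(range(n), t):
                if abs(np.linalg.det(A[np.ix_(rows, cols)])) > tol:
                    return t
    return 0

A = np.array([[1.0, 2.0, 3.0],
              [0.0, 4.0, 5.0],
              [1.0, 2.0, 3.0]])   # |A| = 0, but the submatrix A13 is nonsingular
print(determinant_rank(A))        # 2
```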
Example 13: Obtain the determinant rank of A = [ 1 4 ]
                                               [ 2 5 ]
                                               [ 3 6 ]

Obtain the determinant ranks of
a) [ 1 2 0 ]      b) [ 1 2 3 ]
   [ 0 2 1 ]         [ 4 5 6 ]
   [ 1 0 2 ]
And now we come to the reason for introducing the determinant rank – it gives us
another method for obtaining the rank of a matrix.
Theorem 7: The determinant rank of an m x n matrix A is equal to the rank of A.

Also, by definition of p(A), we know that the number of linearly independent rows that A has is p(A). These rows form a p(A) x n matrix B of rank p(A). Thus, B will have p(A) linearly independent columns. Retaining these linearly independent columns of B, we get a p(A) x p(A) submatrix C of B. So C is a submatrix of A whose determinant is non-zero, by Theorem 6, since its columns are linearly independent. Thus, by the definition of the determinant rank r of A, we get
p(A) ≤ r ........(2)
(1) and (2) give us p(A) = r.
Example: Obtain the rank of A = [  2 3 4 ]
                                [  3 1 2 ]
                                [ -1 2 2 ]

Solution: det(A) = 0, but det [ 2 3 ] = -7 ≠ 0.
                              [ 3 1 ]
Thus, by Theorem 7, p(A) = 2.
Remark: This example shows that Theorem 7 can simplify the calculation of the rank of a matrix in some cases. We don't have to reduce a matrix to echelon form each time. But, in (a) of the following exercise, we see a situation where using this method seems to be as tedious as the row-reduction method.

E E20) Obtain the ranks of
a) [ 3 1  2 5 ]      b) [ 2  3 5 1 ]
   [ 1 2 -1 2 ]         [ 1 -1 2 1 ]
   [ 4 3  1 7 ]
E20(a) shows how much time can be taken by using this method. On the other hand, E20(b) shows how little time it takes to obtain p(A) using the determinant rank. Thus, the method to be used for obtaining p(A) varies from case to case.
5.4 Summary
7) The proof and use of Cramer’s rule for solving a system of linear equations.
8) The proof of the fact that the homogeneous system of linear equations AX = 0 has a
non-zero solution if and only if det(A) = 0.
9) The definition of the determinant rank, and the proof of the fact that rank of A =
determinant rank of A.
5.5 Solutions/Answers
|A23| = det [ 1 2 ] = 3 - 14 = -11.
            [ 7 3 ]
|A| = (-5)(-14) + 4(-40) - (-11) = 70 - 160 + 11 = -79.
Thus, |A| = -79, irrespective of the row that we use to obtain it.
E2) a) At = [ 1 5 7 ]
            [ 2 4 3 ]   On expanding by the first row, we get
            [ 6 1 2 ]
|At| = 1 det [ 4 3 ] - 5 det [ 2 3 ] + 7 det [ 2 4 ] = 5 + 70 - 154 = -79 = |A|.
             [ 1 2 ]         [ 6 2 ]         [ 6 1 ]
b) At = [ -3  2 1  2 ]
        [ -2  1 0  1 ]   Since the 3rd row has the maximum number of zeros,
        [  0  0 1 -3 ]   we expand along it:
        [  2 -1 2  1 ]

|At| = 1 det [ -3  2 2 ]           [ -3  2 1 ]
             [ -2  1 1 ] - (-3) det [ -2  1 0 ] = 2 + 3(2) = 8 = |A|.
             [  2 -1 1 ]           [  2 -1 2 ]
E3) The magnitude of the required volume is the modulus of
det [ 1 0 1 ]
    [ 0 1 0 ] = 1.
    [ 0 0 1 ]
E5) det [ a 0 0 ]         [ b 0 ]
        [ α b 0 ] = a det [ γ c ] = abc.
        [ β γ c ]

det [ a d e ]         [ b f ]
    [ 0 b f ] = a det [ 0 c ] = abc.
    [ 0 0 c ]
E6) Δ(t) = det [ f(t)  g(t)  ] = f(t) g'(t) - f'(t) g(t).
               [ f'(t) g'(t) ]
|B| = -|A|.
Also |AB| = det [ -6    3  -2 ]       [ -6    3  -2 ]
                [ -14  10  -6 ] = det [ -14  10  -6 ]   (adding 2R2 to R3)
                [ 28  -21  18 ]       [  0   -1   6 ]

= det [ -6   -2 ] + 6 det [ -6    3 ]   (expanding along R3)
      [ -14  -6 ]         [ -14  10 ]
= 8 - 108 = -100 = |A| |B|.

E8) But A + B = [ 0 0 ], so |A + B| = 0, while det(A) + det(B) = 1 + 1 = 2 ≠ 0.
                [ 0 0 ]
Adj(A) = [ 0 -15  18 ]
         [ 0  10 -12 ]
         [ 0   0   0 ]

Adj(A) = [  4 3 2 ]
         [  6 8 2 ], so
         [ -6 3 2 ]

A-1 = (1/|A|) Adj(A) = (1/10) [  4 3 2 ]
                              [  6 8 2 ]
                              [ -6 3 2 ]
E15) Since A Adj(A) = |A| I = Adj(A) A, and |A| ≠ 0, we find that [Adj(A)]-1 exists, and equals (1/|A|) A.
Here A = [ 1 2  4 ]        [ x ]        [ 1 ]
         [ 2 3 -1 ],  X =  [ y ],  B =  [ 3 ]
         [ 1 0 -3 ]        [ z ]        [ 2 ]

D1 = det [ 1 2 4; 3 3 -1; 2 0 -3 ] = -19
D2 = det [ 1 1 4; 2 3 -1; 1 2 -3 ] = 2
D3 = det [ 1 2 1; 2 3 3; 1 0 2 ] = 1
D = |A| = -11
∴ x = D1/D = 19/11, y = D2/D = -2/11, z = D3/D = -1/11.
A = [ 2  3  1 ]
    [ 1 -1 -1 ]
    [ 4  6  2 ]

E18) det [ 1  0 2 ]
         [ 0 -1 3 ] = -3 + 2 = -1 ≠ 0.  Therefore, the given vectors are linearly independent.
         [ 1  1 0 ]
Also, the determinant of the 3 x 3 submatrix [ 3  2 5 ]
                                             [ 1 -1 2 ] is zero.
                                             [ 4  1 7 ]
In fact, you can check that the determinant of any of the 3 x 3 submatrices is zero.
Now let us look at the 2 x 2 submatrices of A. Since det [ 3 1 ] = 5 ≠ 0,
                                                         [ 1 2 ]
we find that p(A) = 2.

b) Now det [ 2  3 ] = -5 ≠ 0. ∴ p(A) = 2.
           [ 1 -1 ]
UNIT 6
Structure
6.1 Introduction
Objectives
6.2 The Algebraic Eigenvalue Problem
6.3 Obtaining Eigenvalues and Eigenvectors
Characteristic Polynomial
Eigenvalues of a Linear Transformation
6.4 Diagonalisation
6.5 Summary
6.6 Solutions/Answers
6.1 Introduction
In Unit 5 you have studied about the matrix of a linear transformation. You have had
several opportunities, in earlier units, to observe that the matrix of a linear
transformation depends on the choice of the bases of the concerned vector spaces.
The eigenvalue problem involves the evaluation of all the eigenvalues and eigenvectors of a linear transformation or a matrix. The solution of this problem has basic applications in almost all branches of the sciences, technology and the social sciences, besides its fundamental role in various branches of pure and applied mathematics. The emergence of computers and the availability of modern computing facilities have further strengthened this study, since they can handle very large systems of equations.
Consider the linear mapping T: R2 → R2: T(x, y) = (2x, y). Then T(1,0) = (2,0) = 2(1,0), and (1,0) ≠ (0,0). In this situation we say that 2 is an eigenvalue of T, with (1,0) a corresponding eigenvector. But what is an eigenvalue?
The fundamental algebraic eigenvalue problem deals with the determination of all the eigenvalues of a linear transformation. Let us look at some examples of how we can find eigenvalues.
Warning: The zero vector can never be an eigenvector. But 0 ∈ F can be an eigenvalue. For example, 0 is an eigenvalue of the linear operator in E1, a corresponding eigenvector being (0,1).
Wλ = Ker(T - λI), and hence Wλ is a subspace of V (ref. Unit 5, Theorem 4).
Since λ is an eigenvalue of T, it has an eigenvector, which must be non-zero. Thus, Wλ is non-zero.
As we have said in Unit 2 (Theorem 5), the matrix A becomes a linear transformation from Vn(F) to Vn(F), if we define
A: Vn(F) → Vn(F) : A(X) = AX.
Also, you can see that [A]B0 = A, where
B0 = { e1 = (1, 0, ..., 0)t, e2 = (0, 1, ..., 0)t, ..., en = (0, 0, ..., 1)t }
is the standard ordered basis of Vn(F). That is, the matrix of the linear transformation A: Vn(F) → Vn(F), with respect to the standard basis B0, is A itself.
This is why we denote the linear transformation A by A itself.
Looking at matrices as linear transformations in the above manner will help you in the understanding of eigenvalues and eigenvectors for matrices.
Definition: A scalar λ is an eigenvalue of an n x n matrix A over F if there exists X ∈ Vn(F), X ≠ 0, such that AX = λX. Such non-zero vectors X are called eigenvectors of the matrix A corresponding to the eigenvalue λ.
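In practice, eigenvalues and eigenvectors of a numeric matrix are computed with library routines. A NumPy sketch (not part of the original text), using the diagonal matrix of Example 4 below:

```python
import numpy as np

A = np.diag([1.0, 2.0, 3.0])          # the matrix of Example 4
values, vectors = np.linalg.eig(A)    # eigenvalues, eigenvectors (as columns)

print(sorted(values.tolist()))        # [1.0, 2.0, 3.0]
for lam, X in zip(values, vectors.T):
    # each (eigenvalue, eigenvector) pair satisfies A X = lam X
    assert np.allclose(A @ X, lam * X)
```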
Example 4: Let A = [ 1 0 0 ]
                   [ 0 2 0 ]   Obtain an eigenvalue and a corresponding eigenvector of A.
                   [ 0 0 3 ]

Solution: Now A(1, 0, 0)t = (1, 0, 0)t. This shows that 1 is an eigenvalue, and (1, 0, 0)t is an eigenvector corresponding to it.
In fact, A(0, 1, 0)t = 2(0, 1, 0)t and A(0, 0, 1)t = 3(0, 0, 1)t. Thus, 2 and 3 are also eigenvalues of A, with corresponding eigenvectors (0, 1, 0)t and (0, 0, 1)t, respectively.
Example 5: Obtain an eigenvalue and a corresponding eigenvector of
A = [ 0 -1 ] ∈ M2(R).
    [ 1  2 ]

Solution: Suppose λ ∈ R is an eigenvalue of A. Then there exists X = (x, y)t ≠ (0, 0)t such that AX = λX, that is,
[   -y   ]   [ λx ]
[ x + 2y ] = [ λy ]
For the matrix of Example 4, the eigenspace associated with the eigenvalue 1 is
W1 = { (x, y, z)t ∈ V3(R) : A(x, y, z)t = (x, y, z)t } = { (x, y, z)t ∈ V3(R) : 2y = y, 3z = z }
= { (x, 0, 0)t : x ∈ R }, which is the same as { (x, 0, 0) | x ∈ R }.
(λ - a11)x1 - a12x2 - ... - a1nxn = 0
...
-an1x1 - an2x2 - ... + (λ - ann)xn = 0

This homogeneous system of linear equations has a non-trivial solution if and only if the determinant of the coefficient matrix is equal to 0 (by Unit 9, Theorem 5). Thus, λ is an eigenvalue of A if and only if
λ^n + c1 λ^(n-1) + ... + cn = 0,
where the coefficients c1, c2, ..., cn depend on the entries aij of the matrix A.
[ 0 0  2 ]
[ 1 0  1 ]
[ 0 1 -2 ]
Note that λ is an eigenvalue of A iff det(λI - A) = fA(λ) = 0, that is, iff λ is a root of the characteristic polynomial fA(t), defined above. Due to this fact, eigenvalues are also called characteristic roots, and eigenvectors are called characteristic vectors.
For example, the eigenvalues of the matrix in Example 6 are the roots of the polynomial t^2 - 1, namely, 1 and -1.
Thus, the roots of fB (t) and fA(t) coincide. Therefore, the eigenvalues of A and B are
the same.
Let us consider some more examples so that the concepts mentioned in this
section become absolutely clear to you.
Example 7: Find the eigenvalues and eigenvectors of the matrix
A = [ 0 0  2 ]
    [ 1 0  1 ]
    [ 0 1 -2 ]

Solution: The characteristic polynomial of A is fA(t) = t^3 + 2t^2 - t - 2 = (t - 1)(t + 1)(t + 2) (see E5 and E6), so the eigenvalues are 1, -1 and -2. The eigenvectors corresponding to the eigenvalue 1 are given by
[ 0 0  2 ] [ x1 ]     [ x1 ]
[ 1 0  1 ] [ x2 ] = 1 [ x2 ]
[ 0 1 -2 ] [ x3 ]     [ x3 ]
which gives x1 = 2x3 and x2 = 3x3. So the eigenvectors corresponding to 1 are the non-zero multiples of (2, 3, 1)t. The other eigenvalues can be handled similarly.
Example 8: Find the eigenvalues and eigenvectors of
A = [  1  1  0 0 ]
    [ -1 -1  0 0 ]
    [ -2 -2  2 1 ]
    [  1  1 -1 0 ]

Solution: The eigenvalues of A turn out to be 0 and 1. For the eigenvalue 0, AX = 0 gives
x1 + x2 = 0
-x1 - x2 = 0
-2x1 - 2x2 + 2x3 + x4 = 0
x1 + x2 - x3 = 0
The first and last equations give x3 = 0. Then, the third equation gives x4 = 0, and the first equation gives x1 = -x2. So the eigenvectors corresponding to 0 are x2(-1, 1, 0, 0)t, x2 ≠ 0.
For the eigenvalue 1, AX = X gives the equations
x1 + x2 = x1, -x1 - x2 = x2, -2x1 - 2x2 + 2x3 + x4 = x3, x1 + x2 - x3 = x4.
The first two equations give x2 = 0 and x1 = 0. Then the last equation gives x4 = -x3.
Thus, the eigenvectors are
(0, 0, x3, -x3)t = x3 (0, 0, 1, -1)t, x3 ≠ 0, x3 ∈ R.
Example 9: Obtain the eigenvalues and eigenvectors of
A = [ 0 1 0 ]
    [ 1 0 0 ]
    [ 0 0 1 ]

Solution: The characteristic polynomial of A is
fA(t) = det(tI - A) = det [  t -1  0  ]
                          [ -1  t  0  ] = (t + 1)(t - 1)^2,
                          [  0  0 t-1 ]
so the eigenvalues are 1 and -1.
For the eigenvalue 1, AX = X is equivalent to
x2 = x1, x1 = x2, x3 = x3,
so the eigenvectors corresponding to 1 are the non-zero vectors of the form x1(1, 1, 0)t + x3(0, 0, 1)t; the vectors (1, 1, 0)t and (0, 0, 1)t form a basis of the eigenspace W1.
For the eigenvalue -1,
[ 0 1 0 ] [ x1 ]        [ x1 ]
[ 1 0 0 ] [ x2 ] = (-1) [ x2 ]
[ 0 0 1 ] [ x3 ]        [ x3 ]
which gives x2 = -x1, x1 = -x2, x3 = -x3. Hence x3 = 0 and x2 = -x1, so the eigenvectors corresponding to -1 are the non-zero multiples of (1, -1, 0)t.
Thus, W-1 is 1-dimensional, while dimR W1 = 2.
Try the following exercises now.
E E7) Find the eigenvalues and bases for the eigenspaces of the matrix
A = [ 2 1  0 ]
    [ 0 1 -1 ]
    [ 0 2  4 ]
E E8) Find the eigenvalues and eigenvectors of
D = [ a1 0  0  ... 0  ]
    [ 0  a2 0  ... 0  ]
    [ 0  0  a3 ... 0  ],  where ai ≠ aj for i ≠ j.
    [ :  :  :      :  ]
    [ 0  0  ...    an ]
We now turn to the eigenvalues and eigenvectors of linear transformations.
det(T - λI) = 0
⇔ det(λI - T) = 0
⇔ det(λI - A) = 0, where A = [T]B is the matrix of T with respect to a basis B of V.
Note that [λI - T]B = λI - [T]B.
This definition does not depend on the basis B chosen, since similar matrices have the same characteristic polynomial (Theorem 1), and the matrices of the same linear transformation T with respect to two different ordered bases of V are similar.
Just as for matrices, the eigenvalues of T are precisely the roots of the
characteristic polynomial of T.
fT(t) = det [  t 1 ] = t^2 + 1, which has no real roots.
            [ -1 t ]
Hence, the linear transformation T has no real eigenvalues. But it has two complex eigenvalues, i and -i.
Now that we have discussed a method of obtaining the eigenvalues and eigenvectors
of a matrix, let us see how they help in transforming any square matrix into a diagonal
matrix.
6.4 Diagonalisation
In this section we start by proving a theorem that discusses the linear independence of eigenvectors corresponding to distinct eigenvalues.
But λi ≠ λr for i = 1, 2, ..., r-1. Hence (λi - λr) ≠ 0 for i = 1, 2, ..., r-1, and we must have ai = 0 for i = 1, 2, ..., r-1. However, this is not possible, since (1) would then imply that vr = 0, and, being an eigenvector, vr can never be 0. Thus, we reach a contradiction.
Hence, the assumption we started with must be wrong. Thus, {v1, v2, ..., vm} must be linearly independent, and the theorem is proved.
We will use theorem 2 to choose a basis for a vector space V so that the matrix [T] B is
a diagonal matrix.
[T]B = [ λ1  0   0  ...  0 ]
       [ 0   λ2  0  ...  0 ]
       [ 0   0   λ3 ...  : ]
       [ :   :   :       : ]
       [ 0   0   0  ... λn ]

where λ1, λ2, ..., λn are scalars which need not be distinct.
Since basis vectors are always non-zero, v1, v2, ..., vn are non-zero. Thus, we find that v1, v2, ..., vn are eigenvectors of T.
[T]B = diag(α1, α2, ..., αn).
Thus, Theorems 2, 3 and 4 are true for the matrix A regarded as a linear transformation from Vn(F) to Vn(F). Therefore, given an n x n matrix A, we know that it is diagonalisable if it has n distinct eigenvalues.
AP = (λ1X1, λ2X2, ..., λnXn)

   = (X1, X2, ..., Xn) [ λ1  0  ...  0 ]
                       [ 0   λ2 ...  0 ]
                       [ :   :       : ]
                       [ 0   0  ... λn ]

   = P diag(λ1, λ2, ..., λn).

Now, by Theorem 2, the column vectors of P are linearly independent. This means that P is invertible (Unit 9, Theorem 6). Therefore, we can pre-multiply both sides of the matrix equation AP = P diag(λ1, λ2, ..., λn) by P-1, to get P-1AP = diag(λ1, λ2, ..., λn).
Example: Let A = [ 1  2  0 ]
                 [ 2  1 -6 ]   Diagonalise A.
                 [ 2 -2  3 ]

Solution: The characteristic polynomial of A is
det [ t-1 -2   0  ]
    [ -2  t-1  6  ] = (t - 5)(t - 3)(t + 3),
    [ -2   2  t-3 ]
so the eigenvalues are 5, 3 and -3. Now,
A(1, 2, -1)t = 5(1, 2, -1)t,  A(1, 1, 0)t = 3(1, 1, 0)t  and  A(-1, 2, 1)t = -3(-1, 2, 1)t.
Thus, (1, 2, -1)t, (1, 1, 0)t and (-1, 2, 1)t are eigenvectors corresponding to the distinct eigenvalues 5, 3 and -3, respectively. By Theorem 5, the matrix which diagonalises A is given by
P = [  1 1 -1 ]
    [  2 1  2 ]   Check, by actual multiplication, that
    [ -1 0  1 ]
P-1AP = [ 5 0  0 ]
        [ 0 3  0 ], which is in diagonal form.
        [ 0 0 -3 ]
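The diagonalisation in this example can be checked numerically (a NumPy sketch, not part of the original text):

```python
import numpy as np

A = np.array([[1.0,  2.0,  0.0],
              [2.0,  1.0, -6.0],
              [2.0, -2.0,  3.0]])
P = np.array([[ 1.0, 1.0, -1.0],
              [ 2.0, 1.0,  2.0],
              [-1.0, 0.0,  1.0]])     # columns are the eigenvectors found above

D = np.linalg.inv(P) @ A @ P          # should be diag(5, 3, -3)
print(np.round(D))
```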
The following exercise will give you some practice in diagonalising matrices.
E E12) Are the matrices in Examples 7,8 and 9 diagonalisable? If so, diagonalise them.
We end this unit by summarizing what has been done in it.
6.5 Summary
As in the previous unit, in this unit also we have treated linear transformations along with the analogous matrix versions. We have covered the following points here.
6.6 Solutions/Answers
E1) Suppose λ ∈ R is an eigenvalue. Then there exists (x, y) ≠ (0, 0) such that T(x, y) = λ(x, y), i.e., (x, 0) = (λx, λy), i.e., x = λx and λy = 0. These equations are satisfied if λ = 1, y = 0.
∴ 1 is an eigenvalue. A corresponding eigenvector is (1, 0). Note that there are infinitely many eigenvectors corresponding to 1, namely, (x, 0), 0 ≠ x ∈ R.
[ 1 2 ] [ x ]     [ x ]
[ 0 3 ] [ y ] = 3 [ y ]  ⇔  x + 2y = 3x and 3y = 3y.
These equations are satisfied by x = 1, y = 1 and x = 2, y = 2.
∴ 3 is an eigenvalue, and (1, 1)t and (2, 2)t are eigenvectors corresponding to 3.
E4) W3 = { (x, y)t ∈ V2(R) : (x + 2y, 3y)t = (3x, 3y)t }
= { (x, y)t ∈ V2(R) : x = y } = { x(1, 1)t : x ∈ R }.
This is the 1-dimensional real subspace of V2(R) whose basis is {(1, 1)t}.
E5) It is det [  t  0  -2  ]
              [ -1  t  -1  ] = t det [  t -1  ] + (-2) det [ -1  t ]
              [  0 -1  t+2 ]         [ -1 t+2 ]            [  0 -1 ]
= t(t^2 + 2t - 1) - 2 = t^3 + 2t^2 - t - 2.
E6) The eigenvalues are the roots of the polynomial t^3 + 2t^2 - t - 2 = (t - 1)(t + 1)(t + 2). So they are 1, -1 and -2.
E7) fA(t) = |t-2  -1    0 |
            | 0   t-1   1 | = (t - 2)²(t - 3).
            | 0   -2   t-4|

The eigenvectors corresponding to 2 are given by

[2 1  0][x]     [x]
[0 1 -1][y] = 2 [y].  This gives us the equations
[0 2  4][z]     [z]

2x + y = 2x,  y - z = 2y,  2y + 4z = 2z,  i.e.,  x = x, y = 0, z = 0.

∴ W2 = {(x, 0, 0) : x ∈ R}; a basis for W2 is {(1, 0, 0)}.

The eigenvectors corresponding to 3 are given by

[2 1  0][x]     [x]
[0 1 -1][y] = 3 [y].  This gives us the equations
[0 2  4][z]     [z]

2x + y = 3x,  y - z = 3y,  2y + 4z = 3z,  i.e.,  x = x, y = x, z = -2x.

∴ W3 = {(x, x, -2x) : x ∈ R}; a basis for W3 is {(1, 1, -2)}.
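The eigenvalue-eigenvector pairs found in E7 can be sanity-checked numerically. In this sketch, A is the matrix of E7 and v2, v3 are the basis vectors of W2 and W3 found above:

```python
import numpy as np

A = np.array([[2.0, 1.0,  0.0],
              [0.0, 1.0, -1.0],
              [0.0, 2.0,  4.0]])

v2 = np.array([1.0, 0.0, 0.0])   # basis of W2, eigenvalue 2
v3 = np.array([1.0, 1.0, -2.0])  # basis of W3, eigenvalue 3
print(A @ v2, A @ v3)
```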
E8) fD(t) = |t-a1   0   ...    0 |
            |  0  t-a2  ...    0 |
            |  :    :   ...    : | = (t - a1)(t - a2) .... (t - an).
            |  0    0   ...  t-an|

Further,

[a1  0 ...  0][x1]      [x1]
[ 0 a2 ...  0][ 0]      [ 0]
[ :  : ...  :][ :] = a1 [ :],
[ 0  0 ... an][ 0]      [ 0]

so the eigenvectors corresponding to a1 are (x1, 0, ..., 0), 0 ≠ x1 ∈ R. Similarly, the
eigenvectors corresponding to a2 are (0, x2, 0, ..., 0), 0 ≠ x2 ∈ R, and so on.
E9) B = {1, x, x²} is a basis of P2.

            [0 1 0]
Then [D]B = [0 0 2].
            [0 0 0]

∴ the characteristic polynomial of D is

|t  -1   0|
|0   t  -2| = t³.
|0   0   t|
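As a numerical aside (a sketch, not part of the course material): numpy's `np.poly` returns the characteristic-polynomial coefficients of a square matrix, highest degree first, so for [D]B they should be those of t³:

```python
import numpy as np

# Matrix of the differentiation operator D on P2 w.r.t. B = {1, x, x^2}.
DB = np.array([[0.0, 1.0, 0.0],
               [0.0, 0.0, 2.0],
               [0.0, 0.0, 0.0]])

# Coefficients of the characteristic polynomial; for t^3 they are [1, 0, 0, 0].
print(np.poly(DB))
```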
E12) Since the matrix in Example 7 has distinct eigenvalues 1, -1 and -2, it is
diagonalizable. Eigenvectors corresponding to these eigenvalues are

[2]  [-2]      [-1]
[3], [ 1] and  [ 0], respectively.
[1]  [ 1]      [ 1]

              [2 -2 -1]           [0 0  2]     [1  0  0]
∴, if P =     [3  1  0], then P⁻¹ [1 0  1] P = [0 -1  0].
              [1  1  1]           [0 1 -2]     [0  0 -2]

The matrix in Example 8 is not diagonalizable. This is because it only has two
distinct eigenvalues and, corresponding to each, it has only one linearly independent
eigenvector. ∴ we cannot find a basis of V4(F) consisting of eigenvectors. (Now
apply Theorem 3.)

The matrix in Example 9 is diagonalizable even though it only has two distinct
eigenvalues. This is because corresponding to λ1 = -1 there is one linearly
independent eigenvector, but corresponding to λ2 = 1 there exist two linearly
independent eigenvectors. Therefore, we can form a basis of V3(R) consisting of
the eigenvectors

[ 1]  [1]  [0]
[-1], [1], [0].
[ 0]  [0]  [1]

               [ 1 1 0]
The matrix P = [-1 1 0] is invertible, and P⁻¹AP = diag(-1, 1, 1), where A is the
               [ 0 0 1]
matrix of Example 9.
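The first part of E12 can be verified numerically. This sketch assumes the Example 7 matrix is the one whose characteristic polynomial t³ + 2t² - t - 2 was computed in E5 and E6:

```python
import numpy as np

# Matrix with eigenvalues 1, -1, -2 (as found in E6).
A = np.array([[0.0, 0.0,  2.0],
              [1.0, 0.0,  1.0],
              [0.0, 1.0, -2.0]])
# Columns are the eigenvectors from the answer to E12.
P = np.array([[2.0, -2.0, -1.0],
              [3.0,  1.0,  0.0],
              [1.0,  1.0,  1.0]])

# Should be (approximately) diag(1, -1, -2).
D = np.linalg.inv(P) @ A @ P
print(np.round(D))
```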
UNIT 7
Structure
7.1 Introduction
Objectives
7.2 Cayley-Hamilton Theorem
7.3 Minimal Polynomial
7.4 Summary
7.5 Solutions/Answers
7.1 Introduction
This unit is basically a continuation of the previous unit, but the emphasis is on a
different aspect of the problem discussed in the previous unit.
In this unit we first show that every square matrix (or linear transformation
T : V → V) satisfies its characteristic equation, and use this to compute the inverse of
the concerned matrix (or linear transformation), if it exists.
Then we define the minimal polynomial of a square matrix, and discuss the
relationship between the characteristic and minimal polynomials. This leads us to a
simple way of obtaining the minimal polynomial of a matrix (or linear
transformation).
We advise you to study Units 2, 4 and 6 before starting this unit.
Objectives
Let us consider the 3 x 3 matrix

    [ 0 1 2]
A = [-1 2 1].
    [ 0 3 2]

             [t   -1  -2 ]
Then tI - A =[1  t-2  -1 ].
             [0   -3  t-2]

               [t²-4t+1   t+4     2t-3  ]
Adj (tI - A) = [  2-t    t²-2t    t-2   ]
               [  -3      3t    t²-2t+1 ]

    [1 0 0]      [-4  1  2]     [ 1  4 -3]
  = [0 1 0] t² + [-1 -2  1] t + [ 2  0 -2].
    [0 0 1]      [ 0  3 -2]     [-3  0  1]
Now, comparing the constant terms and the coefficients of t, t², ....., tⁿ on both sides,
we get

- ABn = cnI
Bn - ABn-1 = cn-1I
Bn-1 - ABn-2 = cn-2I
.    .    .    .
.    .    .    .
B3 - AB2 = c2I
B2 - AB1 = c1I
B1 = I

Pre-multiplying the first equation by I, the second by A, the third by A², ....., the last
by Aⁿ, and adding all these equations, we get

0 = cnI + cn-1A + cn-2A² + ... + c2Aⁿ⁻² + c1Aⁿ⁻¹ + Aⁿ = f(A).

Thus, f(A) = Aⁿ + c1Aⁿ⁻¹ + c2Aⁿ⁻² + ... + cn-1A + cnI = 0, and the Cayley-Hamilton
theorem is proved.
Note that we could not simply have substituted t = A in f(t) = det(tI - A) to conclude
that f(A) = det(AI - A) = det 0 = 0. That "proof" is false. Why? Well, the left hand
side of such an equation, f(A), is an n x n matrix, while the right hand side is the
scalar 0, being the value of det 0.
Proof: Let dim V = n, and let B = {v1, v2, ....., vn} be a basis of V. In Unit 10 we
have observed that

f(t) = the characteristic polynomial of T
     = the characteristic polynomial of the matrix [T]B.

Let [T]B = A. By Theorem 1, f(A) = 0, that is, [f(T)]B = f(A) = 0. Again, using the
one-one property of [ ]B, this implies that f(T) = 0. Thus, Theorem 2 is true.
        [ 3 2]              |t-3  -2|
Let A = [-1 0]. Then f(t) = | 1    t| = t² - 3t + 2.

∴, we want to verify that A² - 3A + 2I = 0.

          [ 3 2][ 3 2]   [ 7  6]
Now, A² = [-1 0][-1 0] = [-3 -2].

∴ A² - 3A + 2I = [ 7  6] - [ 9  6] + [2 0] = [0 0] = 0.
                 [-3 -2]   [-3  0]   [0 2]   [0 0]
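This verification is easy to replicate numerically; a minimal numpy sketch for the same 2 x 2 matrix:

```python
import numpy as np

A = np.array([[ 3.0, 2.0],
              [-1.0, 0.0]])

# Cayley-Hamilton: A satisfies its characteristic equation t^2 - 3t + 2 = 0.
R = A @ A - 3 * A + 2 * np.eye(2)
print(R)
```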
Proof: By Theorem 1,
f(A) = Aⁿ + c1Aⁿ⁻¹ + ... + cn-1A + cnI = 0
∴ A(Aⁿ⁻¹ + c1Aⁿ⁻² + ... + cn-1I) = - cnI
and (Aⁿ⁻¹ + c1Aⁿ⁻² + ... + cn-1I)A = - cnI.
Since cn ≠ 0, this gives
A[- cn⁻¹(Aⁿ⁻¹ + c1Aⁿ⁻² + ... + cn-1I)] = I
= [- cn⁻¹(Aⁿ⁻¹ + c1Aⁿ⁻² + ... + cn-1I)]A.
Thus, A is invertible, and
A⁻¹ = - cn⁻¹(Aⁿ⁻¹ + c1Aⁿ⁻² + ... + cn-1I).
                     [ 2 1  1]
For example, let A = [-1 2 -1]. Then
                     [-1 1  3]

       |t-2  -1   -1|
f(t) = | 1  t-2    1| = t³ - 7t² + 19t - 19.
       | 1   -1  t-3|

∴ A⁻¹ = (1/19)(A² - 7A + 19I), i.e.,

           [7 -2 -3]
A⁻¹ = 1/19 [4  7  1].
           [1 -3  5]

To make sure that there has been no error in calculation, multiply this matrix by A;
you should get I!
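That final check can be done in a few lines of numpy (A as read off from tI - A above):

```python
import numpy as np

A = np.array([[ 2.0, 1.0,  1.0],
              [-1.0, 2.0, -1.0],
              [-1.0, 1.0,  3.0]])

# f(t) = t^3 - 7t^2 + 19t - 19, so A^(-1) = (A^2 - 7A + 19I)/19.
A_inv = (A @ A - 7 * A + 19 * np.eye(3)) / 19.0
print(np.round(A @ A_inv))  # should be the identity, up to floating point
```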
In this section we will show that the minimal polynomial divides the characteristic
polynomial. Moreover, the roots of the minimal polynomial are the same as those of
the characteristic polynomial. Since it is easy to obtain the characteristic polynomial
of T, these facts will give us a simple way of finding the minimal polynomial of T.
Let us first recall some properties of the minimal polynomial of T that we gave in Unit
6. Let p(t) be the minimal polynomial of T. Then
MP1) p(t) is a monic polynomial with coefficients in F.
MP2) If q(t) is a non-zero polynomial over F such that deg q < deg p, then q(T) ≠ 0.
MP3) If, for some polynomial g(t) over F, g(T) = 0, then p(t) | g(t). That is, there
exists a polynomial h(t) over F such that g(t) = p(t)h(t).
We will now obtain the first link in the relationship between the minimal polynomial
and the characteristic polynomial: the minimal polynomial of a linear transformation
divides its characteristic polynomial.

Proof: Let the characteristic polynomial and the minimal polynomial of T be f(t) and
p(t), respectively. By Theorem 2, f(T) = 0. Therefore, by MP3, p(t) divides f(t), as
desired.
Before going on to show the full relationship between the minimal and characteristic
polynomials, we state (but don’t prove!) two theorems that will be used again and
again, in this course as well as other courses.
And now we come to a very important result that you may have been using often,
without realizing it: every non-constant polynomial with complex coefficients has at
least one root in C.

In other words, this theorem says that any polynomial f(t) = αntⁿ + αn-1tⁿ⁻¹ + ... + α1t + α0
(where α0, ....., αn ∈ C, αn ≠ 0, n ≥ 1) has at least one root in C. An immediate
consequence is that f(t) can be written as a product of linear factors:
f(t) = αn(t - λ1)^m1 ..... (t - λk)^mk, where λ1, ....., λk are the distinct roots of f(t)
and m1 + ..... + mk = n.
For example, the polynomial equation t³ - it² + t - i = 0 has no real roots, but it has
two distinct complex roots, namely, i and -i. And we write t³ - it² + t - i = (t - i)²(t + i).
Here i is repeated twice and -i occurs only once.
We can similarly show that any polynomial f(t) over R can be written as a product of
linear polynomials and quadratic polynomials. For example, the real polynomial
t³ - 1 = (t - 1)(t² + t + 1).
Now we prove the second and final link that relates the minimal and characteristic
polynomials of T : V → V, where V is a vector space over F. Let p(t) be the minimal
polynomial of T. We will show that a scalar λ is a root of p(t) if and only if λ is an
eigenvalue of T. The proof will utilise the following remark.

Conversely, let λ be a root of p(t), i.e., p(λ) = 0. Then, by Theorem 5,
p(t) = (t - λ)q(t), where deg q < deg p, q ≠ 0. By the property MP2, ∃ v ∈ V such that
q(T)v ≠ 0. Let x = q(T)v ≠ 0. Then,

(T - λI)x = (T - λI)q(T)v = p(T)v = 0,

so λ is an eigenvalue of T with eigenvector x.
Caution: Though the roots of the characteristic polynomial and the minimal polynomial
coincide, the two polynomials are not the same, in general.
For example, if the characteristic polynomial of T : R⁴ → R⁴ is (t + 1)²(t - 2)², then
the minimal polynomial could be (t + 1)(t - 2), or (t + 1)²(t - 2), or (t + 1)(t - 2)², or
even (t + 1)²(t - 2)².
Definition: The minimal polynomial of a matrix A over F is the monic polynomial p(t)
such that
i) p(A) = 0, and
ii) if q(t) is a non-zero polynomial over F such that deg q < deg p, then q(A) ≠ 0.
We state two theorems which are analogues of Theorems 4 and 7. Their proofs are also
similar to those of Theorems 4 and 7.
Therefore, the minimal polynomial p(t) is either (t - 1)(t - 2) or (t - 1)(t - 2)².

Since (A - I)(A - 2I)

  [ 4 -6 -6] [ 3 -6 -6]   [0 0 0]
= [-1  3  2] [-1  2  2] = [0 0 0] = 0,
  [ 3 -6 -5] [ 3 -6 -6]   [0 0 0]

we see that p(t) = (t - 1)(t - 2).

Next, consider a matrix A whose characteristic polynomial is

       |t-3   6    6|
f(t) = | -2  t-2   1| = (t - 1)(t - 2)².
       | -2  -2    t|

Here,

                  [2 0 -1]
(A - I)(A - 2I) = [2 0 -1] ≠ 0.
                  [4 0 -2]

Hence, p(t) ≠ (t - 1)(t - 2). Thus, p(t) = (t - 1)(t - 2)².
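This "test the candidate divisors" procedure is mechanical. The sketch below applies it to the matrix of the first example above; its entries are inferred from the displayed A - I and A - 2I, so treat them as an assumption:

```python
import numpy as np

# Matrix inferred from the displayed A - I above; its characteristic
# polynomial is (t - 1)(t - 2)^2.
A = np.array([[ 5.0, -6.0, -6.0],
              [-1.0,  4.0,  2.0],
              [ 3.0, -6.0, -4.0]])
I = np.eye(3)

# If (A - I)(A - 2I) = 0, the minimal polynomial is (t - 1)(t - 2);
# otherwise it must be the full (t - 1)(t - 2)^2.
print((A - I) @ (A - 2 * I))
```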
Now, let T be a linear transformation from V to V, and B be a basis of V. Let A = [T]B.
If g(t) is any polynomial with coefficients in F, then g(T) = 0 if and only if g(A) = 0.
Thus, the minimal polynomial of T is the same as the minimal polynomial of A. So, for
example, if T : R³ → R³ is the linear operator which is represented, with respect to the
standard basis, by the matrix in Example 3, then its minimal polynomial is (t - 1)(t - 2).
Solution: a) Now (t - 1)(t³ + 1) = (t + 1)(t - 1)(t² - t + 1). This has 4 distinct complex
roots, of which only 1 and -1 are real. Since all the roots are distinct, this polynomial
is also the minimal polynomial of T.

b) (t² + 1)² has no real roots. It has 2 repeated complex roots, i and -i. Now, the
minimal polynomial must be a real polynomial that divides the characteristic
polynomial. ∴ it can be (t² + 1) or (t² + 1)².
This example shows you that if the minimal polynomial is a real polynomial, then it
need not be a product of linear polynomials only. Of course, over C it will always be a
product of linear polynomials.
Try the following exercises now.
7.4 Summary
1) The proof of the Cayley-Hamilton theorem, which says that every square matrix (or
linear transformation T : V → V) satisfies its characteristic equation.
2) The use of the Cayley-Hamilton theorem to find the inverse of a matrix.
4) The proof of the fact that the minimal polynomial and the characteristic polynomial of
a linear transformation (or matrix) have the same roots. These roots are precisely the
eigenvalues of the concerned linear transformation (or matrix).
5) A method for obtaining the minimal polynomial of a linear transformation (or matrix).
7.5 Solutions/Answers
E1) a) fA(t) = |t-1   0    0 |
               |-2   t-3   0 | = (t - 1)²(t - 3)
               | 2    2   t-1|
b) Here fA(t) = t³ + t² - t - 4, and

     [0  1  0][ 3 0  1]   [1  1 -1]
A³ = [3  0  1][ 1 1 -1] = [2  3  2].
     [1 -2 -1][-7 3 -1]   [8 -5  4]

Now, A³ + A² - A - 4I

  [1  1 -1]   [ 3 0  1]   [0  1  0]   [4 0 0]
= [2  3  2] + [ 1 1 -1] - [3  0  1] - [0 4 0] = 0.
  [8 -5  4]   [-7 3 -1]   [1 -2 -1]   [0 0 4]
c) fA(t) = |t-1   0   -1|
           | 0   t-3  -1| = t³ - 8t² + 13t.
           |-3   -3  t-4|

     [1 0 1][1 0 1]   [ 4  3  5]
A² = [0 3 1][0 3 1] = [ 3 12  7]
     [3 3 4][3 3 4]   [15 21 22]

     [1 0 1][ 4  3  5]   [19  24  27]
A³ = [0 3 1][ 3 12  7] = [24  57  43]
     [3 3 4][15 21 22]   [81 129 124]

Now, A³ - 8A² + 13A

  [19  24  27]     [ 4  3  5]   [13  0 13]
= [24  57  43] - 8 [ 3 12  7] + [ 0 39 13] = 0.
  [81 129 124]     [15 21 22]   [39 39 52]
Using the Cayley-Hamilton theorem, the inverse of the matrix in a) is
A⁻¹ = (1/3)(A² - 5A + 7I)

        [ 1  0 0]   [  5   0 0]   [7 0 0]        [ 3 0 0]
 = 1/3 ([ 8  9 0] - [ 10  15 0] + [0 7 0]) = 1/3 [-2 1 0],
        [-8 -8 1]   [-10 -10 5]   [0 0 7]        [ 2 2 3]

and the inverse of the matrix in b) is

                              [ 2 1  1]
A⁻¹ = (1/4)(A² + A - I) = 1/4 [ 4 0  0].
                              [-6 1 -3]
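The part b) inverse can likewise be checked numerically (a sketch; A is the matrix from part b) of E1):

```python
import numpy as np

# f(t) = t^3 + t^2 - t - 4, so by the Cayley-Hamilton theorem
# A^(-1) = (A^2 + A - I)/4.
A = np.array([[0.0,  1.0,  0.0],
              [3.0,  0.0,  1.0],
              [1.0, -2.0, -1.0]])
A_inv = (A @ A + A - np.eye(3)) / 4.0
print(np.round(A @ A_inv))  # should be the identity, up to floating point
```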
Next, for

    [1 1 0]
A = [0 1 1],
    [1 1 1]

             |t-1  -1    0 |
then fA(t) = | 0   t-1  -1 | = t³ - 3t² + 2t - 1.
             |-1   -1   t-1|