
NATIONAL OPEN UNIVERSITY OF NIGERIA

SCHOOL OF SCIENCE AND TECHNOLOGY

COURSE CODE: MTH 212

COURSE TITLE: LINEAR ALGEBRA


UNIT 1 LINEAR TRANSFORMATIONS I

CONTENTS

1.0 Introduction
2.0 Objectives
3.0 Main Content
3.1 Linear Transformations
3.2 Spaces Associated with a Linear Transformation
3.3 The Range Space and the Kernel
3.4 Rank and Nullity
3.5 Some types of Linear Transformations
3.6 Homomorphism Theorems
4.0 Conclusion
5.0 Summary
6.0 Tutor-Marked Assignment
7.0 References/Further Reading

1.0 INTRODUCTION

You have already learnt about a vector space and several concepts related to it. In this
unit we initiate the study of certain mappings between two vector spaces, called linear
transformations. The importance of these mappings can be realized from the fact that,
in the calculus of several variables, every continuously differentiable function can be
replaced, to a first approximation, by a linear one. This fact is a reflection of a general
principle that every problem on the change of some quantity under the action of
several factors can be regarded, to a first approximation, as a linear problem. It often
turns out that this gives an adequate result. Also, in physics it is important to know
how vectors behave under a change of the coordinate system. This requires a study of
linear transformations.

In this unit we study linear transformations and their properties, as well as two spaces
associated with a linear transformation, and their dimensions. Then, we prove the
existence of linear transformations with some specific properties, and discuss the notion
of an isomorphism between two vector spaces, which allows us to say that all finite-
dimensional vector spaces of the same dimension are the "same", in a certain sense.

Finally, we state and prove the Fundamental Theorem of Homomorphism and some of
its corollaries, and apply them to various situations.
2.0 OBJECTIVES

After reading this unit, you should be able to:

 Verify the linearity of certain mappings between vector spaces;


 Construct linear transformations with certain specified properties;
 Calculate the rank and nullity of a linear operator;
 Prove and apply the Rank Nullity Theorem;
 Define an isomorphism between two vector spaces;
 Show that two vector spaces are isomorphic if and only if they have the same
dimension;
 Prove and use the Fundamental Theorem of Homomorphism.

3.0 MAIN CONTENT

3.1 Linear Transformations

By now you are familiar with the vector spaces R2 and R3. Now consider the mapping
f: R2 → R3 : f(x, y) = (x, y, 0) (see Fig. 1).

f is a well-defined function. Also notice that

(i) f((a, b) + (c, d)) = f((a + c, b + d)) = (a + c, b + d, 0) = (a, b, 0) + (c, d, 0)
= f((a, b)) + f((c, d)), for (a, b), (c, d) ∈ R2, and

Fig. 1: f transforms ABCD to A′B′C′D′.
(ii) for any α ∈ R and (a, b) ∈ R2, f(α(a, b)) = f((αa, αb)) = (αa, αb, 0) = α(a, b, 0) = αf((a, b)).

So we have a function f between two vector spaces such that (i) and (ii) above hold true.

(i) says that the sum of two plane vectors is mapped under f to the sum of their images under f.
(ii) says that a scalar multiple of a plane vector is mapped under f to the same scalar multiple of its image under f.
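The two properties can also be checked numerically. Here is a minimal sketch, assuming Python with NumPy is available (the function name f and the sample vectors are ours, purely for illustration):

    import numpy as np

    def f(v):
        # f: R^2 -> R^3, f(x, y) = (x, y, 0)
        return np.array([v[0], v[1], 0.0])

    u, w = np.array([1.0, -2.0]), np.array([3.0, 0.5])
    alpha = 2.5

    # property (i): f(u + w) = f(u) + f(w)
    assert np.allclose(f(u + w), f(u) + f(w))
    # property (ii): f(alpha u) = alpha f(u)
    assert np.allclose(f(alpha * u), alpha * f(u))

Of course, such a finite check is only a sanity test, not a proof.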

The properties (i) and (ii) together say that f is linear, a term that we now define.

Definition: Let U and V be vector spaces over a field F. A linear transformation (or linear operator) from U to V is a function T: U → V such that
LT1) T(u1 + u2) = T(u1) + T(u2), for u1, u2 ∈ U, and
LT2) T(αu) = αT(u), for α ∈ F and u ∈ U.

The conditions LT1 and LT2 can be combined to give the following equivalent condition.
LT3) T(α1u1 + α2u2) = α1T(u1) + α2T(u2), for α1, α2 ∈ F and u1, u2 ∈ U.

What we are saying is that [LT1 and LT2] ⇔ LT3. This can be easily shown as follows:

We will show that LT3 ⇒ LT1 and LT3 ⇒ LT2. Now, LT3 is true for all α1, α2 ∈ F. Therefore, it is certainly true for α1 = 1 = α2, that is, LT1 holds.

Now, to show that LT2 is true, consider T(αu) for any α ∈ F and u ∈ U. We have T(αu) = T(αu + 0·u) = αT(u) + 0·T(u) = αT(u), thus proving that LT2 holds.

You can try and prove the converse now. That is what the following exercise is all
about!

E E1) Show that the conditions LT1 and LT2 together imply LT3.

Before going further, let us note two properties of any linear transformation T: U → V, which follow from LT1 (or LT2, or LT3).

LT4) T(0) = 0. Let's see why this is true. Since T(0) = T(0 + 0) = T(0) + T(0) (by LT1), we subtract T(0) from both sides to get T(0) = 0.

LT5) T(-u) = -T(u) ∀ u ∈ U. Why is this so? Well, since 0 = T(0) = T(u - u) = T(u) + T(-u), we get T(-u) = -T(u).

E E2) Can you show how LT4 and LT5 will follow from LT2?

Now let us look at some common linear transformations.


Example 1: Consider the vector space U over a field F, and the function T: U → U defined by T(u) = u for all u ∈ U.

Show that T is a linear transformation. (This transformation is called the identity transformation, and is denoted by IU, or just I, if the underlying vector space is understood.)

Solution: For any α, β ∈ F and u1, u2 ∈ U, we have

T(αu1 + βu2) = αu1 + βu2 = αT(u1) + βT(u2)

Hence, LT3 holds, and T is a linear transformation.

Example 2: Let T: U → V be defined by T(u) = 0 for all u ∈ U.

Check that T is a linear transformation. (It is called the null or zero transformation, and is denoted by 0.)

Solution: For any α, β ∈ F and u1, u2 ∈ U, we have T(αu1 + βu2) = 0 = α·0 + β·0 = αT(u1) + βT(u2).

Therefore, T is a linear transformation.

Example 3: Consider the function pr1: Rn → R, defined by pr1[(x1, ..., xn)] = x1.

Show that this is a linear transformation. (This is called the projection on the first coordinate. Similarly, we can define pri: Rn → R by pri[(x1, ..., xi-1, xi, ..., xn)] = xi, the projection on the i-th coordinate, for i = 2, ..., n. For instance, pr2: R3 → R : pr2(x, y, z) = y.)

Solution: We will use LT3 to show that projection is a linear operator. For α, β ∈ R and (x1, ..., xn), (y1, ..., yn) in Rn, we have

pr1[α(x1, ..., xn) + β(y1, ..., yn)]

= pr1[(αx1 + βy1, αx2 + βy2, ..., αxn + βyn)] = αx1 + βy1

= αpr1[(x1, ..., xn)] + βpr1[(y1, ..., yn)].

Thus pr1 (and similarly pri) is a linear transformation.

Before going to the next example, we make a remark about projections.

Remark: Consider the function p: R3 → R2 : p(x, y, z) = (x, y). This is a projection from R3 onto the xy-plane. Similarly, the functions f and g, from R3 to R2, defined by f(x, y, z) = (x, z) and g(x, y, z) = (y, z), are projections from R3 onto the xz-plane and the yz-plane, respectively.
In general, any function φ: Rn → Rm (n > m), which is defined by dropping any (n - m) coordinates, is a projection map.
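To see what "dropping coordinates" means concretely, here is a small sketch, assuming NumPy (the helper name projection and the index lists are ours):

    import numpy as np

    def projection(v, keep):
        # keep the coordinates whose indices are listed in `keep`, drop the rest
        return np.array([v[i] for i in keep])

    v = np.array([1.0, 2.0, 3.0])
    print(projection(v, [0, 1]))   # p(x, y, z) = (x, y): [1. 2.]
    print(projection(v, [0, 2]))   # f(x, y, z) = (x, z): [1. 3.]
    print(projection(v, [1, 2]))   # g(x, y, z) = (y, z): [2. 3.]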

Now let us see another example of a linear transformation that is very geometric in
nature.

Example 4: Let T: R2 → R2 be defined by T(x, y) = (x, -y) ∀ (x, y) ∈ R2. Show that T is a linear transformation. (This is the reflection in the x-axis that we show in Fig. 2.)

Fig. 2: Q(2, -1) is the reflection of P(2, 1) in the x-axis.

Solution: For α, β ∈ R and (x1, y1), (x2, y2) ∈ R2, we have

T[α(x1, y1) + β(x2, y2)] = T((αx1 + βx2, αy1 + βy2)) = (αx1 + βx2, -αy1 - βy2)

= α(x1, -y1) + β(x2, -y2)

= αT(x1, y1) + βT(x2, y2).

Therefore, T is a linear transformation.

So far, we've given examples of linear transformations. Now we give an example of a very important function which is not linear. This example's importance lies in its geometric applications.

Example 5: Let u0 be a fixed non-zero vector in U. Define T: U → U by T(u) = u + u0 ∀ u ∈ U. Show that T is not a linear transformation. (T is called the translation by u0. See Fig. 3 for a geometrical view.)

Solution: T is not a linear transformation since LT4 does not hold. This is because T(0) = u0 ≠ 0.

Fig. 3: A′B′C′D′ is the translation of ABCD by (1, 1).

Now, try the following Exercises.

E E3) Let T: R2 → R2 be the reflection in the y-axis. Find an expression for T as in Example 4. Is T a linear operator?

E E4) For a fixed vector (a1, a2, a3) in R3, define the mapping T: R3 → R by T(x1, x2, x3) = a1x1 + a2x2 + a3x3. Show that T is a linear transformation. Note that T(x1, x2, x3) is the dot product of (x1, x2, x3) and (a1, a2, a3) (ref. Sec. 2.4).

E E5) Show that the map T: R3 → R3 defined by T(x1, x2, x3) = (x1 + x2 - x3, 2x1 - x2, x2 + 2x3) is a linear operator.

You came across the real vector space Pn, of all polynomials of degree less than or
equal to n, in Unit 4. The next exercise concerns it.

E E6) Let f ∈ Pn be given by

f(x) = α0 + α1x + ... + αnxn, αi ∈ R ∀ i.

We define (Df)(x) = α1 + 2α2x + ... + nαnxn-1.

Show that D: Pn → Pn is a linear transformation. (Observe that Df is nothing but the derivative of f. D is called the differentiation operator.)

In Unit 3 we introduced you to the concept of a quotient space. We now define a very
useful linear transformation, using this concept.

Example 6: Let W be a subspace of a vector space U over a field F. W gives rise to the quotient space U/W. Consider the map T: U → U/W defined by T(u) = u + W. Show that T is a linear transformation.

Solution: For α ∈ F and u1, u2 ∈ U we have

T(αu1 + u2) = (αu1 + u2) + W = (αu1 + W) + (u2 + W)
= α(u1 + W) + (u2 + W)
= αT(u1) + T(u2)

Thus, T is a linear transformation.

Now solve the following exercise, which is about plane vectors.


E E7) Let u1 = (1, -1), u2 = (2, -1), u3 = (4, -3), v1 = (1, 0), v2 = (0, 1) and v3 = (1, 1) be 6 vectors in R2. Can you define a linear transformation T: R2 → R2 such that T(ui) = vi, i = 1, 2, 3?

(Hint: Note that 2u1 + u2 = u3 and v1 + v2 = v3.)

You have already seen that a linear transformation T: U → V must satisfy T(α1u1 + α2u2) = α1T(u1) + α2T(u2), for α1, α2 ∈ F and u1, u2 ∈ U. More generally, we can show that

LT6) T(α1u1 + ... + αnun) = α1T(u1) + ... + αnT(un),

where αi ∈ F and ui ∈ U.

Let us show this by induction, that is, we assume the above relation for n = m, and prove it for m + 1. Now,

T(α1u1 + ... + αmum + αm+1um+1)

= T(u + αm+1um+1), where u = α1u1 + ... + αmum

= T(u) + αm+1T(um+1), since the result holds for n = 2
= T(α1u1 + ... + αmum) + αm+1T(um+1)
= α1T(u1) + ... + αmT(um) + αm+1T(um+1), since we have assumed the result for n = m.

Thus, the result is true for n = m + 1. Hence, by induction, it holds true for all n.

Let us now come to a very important property of any linear transformation T:U  V.
In Unit 4 we mentioned that every vector space has a basis. Thus, U has a basis. We
will now show that T is completely determined by its values on a basis of U. More
precisely, we have

Theorem 1: Let S and T be two linear transformations from U to V, where dim U = n. Let {e1, ..., en} be a basis of U. Suppose S(ei) = T(ei) for i = 1, ..., n. Then S(u) = T(u) for all u ∈ U.

Proof: Let u ∈ U. Since {e1, ..., en} is a basis of U, u can be uniquely written as u = α1e1 + ... + αnen, where the αi are scalars.

Then, S(u) = S(α1e1 + ... + αnen)

= α1S(e1) + ... + αnS(en), by LT6

= α1T(e1) + ... + αnT(en), since S(ei) = T(ei) ∀ i
= T(α1e1 + ... + αnen), by LT6
= T(u).

What we have just proved is that once we know the values of T on a basis of U, then
we can find T(u) for any u  U.

Note: Theorem 1 is true even when U is not finite – dimensional. The proof, in this
case, is on the same lines as above.

Let us see how the idea of Theorem 1 helps us to prove the following useful result.

Theorem 2: Let V be a real vector space and T: R → V be a linear transformation. Then there exists v ∈ V such that T(α) = αv ∀ α ∈ R.

Proof: A basis for R is {1}. Let T(1) = v ∈ V. Then, for any α ∈ R, T(α) = αT(1) = αv.

Once you have read Sec. 3.3 you will realize that this theorem says that T(R) is a vector space of dimension at most one, whose basis is {T(1)} whenever T(1) ≠ 0.

Now try the following exercise, for which you will need Theorem 1.

E E8) We define a linear operator T: R2 → R2 by T(1, 0) = (0, 1) and T(0, 5) = (1, 0). What is T(3, 5)? What is T(5, 3)?

Now we shall prove a very useful theorem about linear transformations, which is
linked to Theorem 1

Theorem 3: Let {e1, ..., en} be a basis of U and let v1, ..., vn be any n vectors in V. Then there exists one and only one linear transformation T: U → V such that T(ei) = vi, i = 1, ..., n.

Proof: Let u ∈ U. Then u can be uniquely written as u = α1e1 + ... + αnen (see Unit 4, Theorem 9).

Define T(u) = α1v1 + ... + αnvn. Then T defines a mapping from U to V such that T(ei) = vi ∀ i = 1, ..., n. Let us now show that T is linear. Let a, b be scalars and u, u′ ∈ U. Then there exist scalars α1, ..., αn, β1, ..., βn such that u = α1e1 + ... + αnen and u′ = β1e1 + ... + βnen.

Then au + bu′ = (aα1 + bβ1)e1 + ... + (aαn + bβn)en.

Hence, T(au + bu′) = (aα1 + bβ1)v1 + ... + (aαn + bβn)vn = a(α1v1 + ... + αnvn) + b(β1v1 + ... + βnvn) = aT(u) + bT(u′).

Therefore, T is a linear transformation with the property that T(ei) = vi ∀ i. Theorem 1 now implies that T is the only linear transformation with the above properties.

Let’s see how Theorem 3 can be used.


Example 7: e1 = (1, 0, 0), e2 = (0, 1, 0) and e3 = (0, 0, 1) form the standard basis of R3. Let (1, 2), (2, 3) and (3, 4) be three vectors in R2. Obtain the linear transformation T: R3 → R2 such that T(e1) = (1, 2), T(e2) = (2, 3) and T(e3) = (3, 4).

Solution: By Theorem 3 we know that there exists a unique T: R3 → R2 such that T(e1) = (1, 2), T(e2) = (2, 3), and T(e3) = (3, 4). We want to know what T(x) is, for any x = (x1, x2, x3) ∈ R3. Now, x = x1e1 + x2e2 + x3e3.

Hence, T(x) = x1T(e1) + x2T(e2) + x3T(e3)

= x1(1, 2) + x2(2, 3) + x3(3, 4)

= (x1 + 2x2 + 3x3, 2x1 + 3x2 + 4x3)

Therefore, T(x1, x2, x3) = (x1 + 2x2 + 3x3, 2x1 + 3x2 + 4x3) is the definition of the linear transformation T.
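Computationally, the construction in Theorem 3 amounts to placing the prescribed images T(e1), T(e2), T(e3) as the columns of a matrix. A sketch of Example 7 along these lines, assuming NumPy:

    import numpy as np

    # columns are T(e1) = (1, 2), T(e2) = (2, 3), T(e3) = (3, 4)
    A = np.array([[1.0, 2.0, 3.0],
                  [2.0, 3.0, 4.0]])

    def T(x):
        # T(x) = x1 T(e1) + x2 T(e2) + x3 T(e3)
        return A @ x

    print(T(np.array([1.0, 0.0, 0.0])))   # [1. 2.] = T(e1)
    print(T(np.array([1.0, 1.0, 1.0])))   # [6. 9.] = (1 + 2 + 3, 2 + 3 + 4)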

E E9) Consider the complex field C. It is a vector space over R.

a) What is its dimension over R? Give a basis of C over R.

b) Let α, β ∈ R. Give the linear transformation which maps the basis elements of C, obtained in (a), onto α and β, respectively.
Let us now look at some vector spaces that are related to a linear operator.

3.2 Spaces Associated with a Linear Transformation

In Unit 1 you found that given any function, there is a set associated with it, namely,
its range. We will now consider two sets which are associated with any linear
transformation, T. These are the range and the kernel of T.

3.3 The Range Space and the Kernel

Let U and V be vector spaces over a field F. Let T: U → V be a linear transformation. We will define the range of T as well as the kernel of T. At first, you will see them as sets. We will prove that these sets are also vector spaces over F.

Definition: The range of T, denoted by R(T), is the set {T(x) | x ∈ U}. The kernel (or null space) of T, denoted by Ker T, is the set {x ∈ U | T(x) = 0}. Note that R(T) ⊆ V and Ker T ⊆ U.

To clarify these concepts consider the following examples.

Example 8: Let I: V → V be the identity transformation (see Example 1). Find R(I) and Ker I.

Solution: R(I) = {I(v) | v ∈ V} = {v | v ∈ V} = V. Also, Ker I = {v ∈ V | I(v) = 0} = {v ∈ V | v = 0} = {0}.

Example 9: Let T: R3 → R be defined by T(x1, x2, x3) = 3x1 + x2 + 2x3. Find R(T) and Ker T.

Solution: R(T) = {x ∈ R | ∃ x1, x2, x3 ∈ R with 3x1 + x2 + 2x3 = x}. For example, 0 ∈ R(T), since 0 = 3·0 + 0 + 2·0 = T(0, 0, 0).

Also, 1 ∈ R(T), since 1 = 3·(1/3) + 0 + 2·0 = T(1/3, 0, 0), or 1 = 3·0 + 1 + 2·0 = T(0, 1, 0), or 1 = T(0, 0, 1/2), or 1 = T(1/6, 1/2, 0).

Now can you see that R(T) is the whole real line R? This is because, for any α ∈ R, α = α·1 = αT(1/3, 0, 0) = T(α/3, 0, 0) ∈ R(T).

Ker T = {(x1, x2, x3) ∈ R3 | 3x1 + x2 + 2x3 = 0}.

For example, (0, 0, 0) ∈ Ker T. But (1, 0, 0) ∉ Ker T. ∴ Ker T ≠ R3. In fact, Ker T is the plane 3x1 + x2 + 2x3 = 0 in R3.

Example 10: Let T: R3 → R3 be defined by T(x1, x2, x3) = (x1 - x2 + 2x3, 2x1 + x2, -x1 - 2x2 + 2x3). Find R(T) and Ker T.

Solution: To find R(T), we must find conditions on y1, y2, y3 ∈ R so that (y1, y2, y3) ∈ R(T), i.e., we must find some (x1, x2, x3) ∈ R3 so that (y1, y2, y3) = T(x1, x2, x3) = (x1 - x2 + 2x3, 2x1 + x2, -x1 - 2x2 + 2x3).

This means
x1 - x2 + 2x3 = y1 .................. (1)
2x1 + x2 = y2 .................. (2)
-x1 - 2x2 + 2x3 = y3 .................. (3)

Subtracting 2 times Equation (1) from Equation (2), and adding Equations (1) and (3), we get

3x2 - 4x3 = y2 - 2y1 .................. (4)

and -3x2 + 4x3 = y1 + y3 .................. (5)

Adding Equations (4) and (5) we get

y2 - 2y1 + y1 + y3 = 0, that is, y2 + y3 = y1.

Thus, (y1, y2, y3) ∈ R(T) ⇒ y2 + y3 = y1.

On the other hand, if y2 + y3 = y1, we can choose

x3 = 0, x2 = (y2 - 2y1)/3 and x1 = y1 + x2 = y1 + (y2 - 2y1)/3 = (y1 + y2)/3.

Then, we see that T(x1, x2, x3) = (y1, y2, y3).

Thus, y2 + y3 = y1 ⇒ (y1, y2, y3) ∈ R(T).
Hence, R(T) = {(y1, y2, y3) ∈ R3 | y2 + y3 = y1}.

Now (x1, x2, x3) ∈ Ker T if and only if the following equations are true:
x1 - x2 + 2x3 = 0
2x1 + x2 = 0
-x1 - 2x2 + 2x3 = 0

Of course x1 = 0, x2 = 0, x3 = 0 is a solution. Are there other solutions? To answer this we proceed as in the first part of this example. We see that 3x2 - 4x3 = 0. Hence, x3 = (3/4)x2.

Also, 2x1 + x2 = 0 ⇒ x1 = -x2/2.

Thus, we can give arbitrary values to x2 and calculate x1 and x3 in terms of x2. Therefore, Ker T = {(-α/2, α, (3/4)α) | α ∈ R}.
In this example, we see that finding R(T) and Ker T amounts to solving a system of equations. In Unit 9, you will learn a systematic way of solving a system of linear equations by the use of matrices and determinants.
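Anticipating those matrix methods, the same computation can be sketched numerically: writing T as a matrix, the rank gives dim R(T), and the singular value decomposition yields a basis of Ker T (NumPy assumed; the variable names are ours):

    import numpy as np

    # matrix of T(x1, x2, x3) = (x1 - x2 + 2x3, 2x1 + x2, -x1 - 2x2 + 2x3)
    A = np.array([[ 1.0, -1.0, 2.0],
                  [ 2.0,  1.0, 0.0],
                  [-1.0, -2.0, 2.0]])

    rank = np.linalg.matrix_rank(A)       # dim R(T)
    print(rank)                           # 2

    # rows of Vt belonging to zero singular values span Ker T
    _, s, Vt = np.linalg.svd(A)
    print(Vt[rank:])                      # a multiple of (-1/2, 1, 3/4)

    # check the description of R(T): y2 + y3 = y1 for y = T(x)
    y = A @ np.array([1.0, 2.0, 3.0])
    print(np.isclose(y[1] + y[2], y[0]))  # True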

The following exercises will help you in getting used to R(T) and Ker T.

E E10) Let T be the zero transformation given in Example 2. Find Ker T and R(T). Does 1 ∈ R(T)?

E E11) Find R(T) and Ker T for each of the following operators.

a) T: R3 → R2 : T(x, y, z) = (x, y)

b) T: R3 → R : T(x, y, z) = z
c) T: R3 → R3 : T(x1, x2, x3) = (x1 + x2 + x3, x1 + x2 + x3, x1 + x2 + x3).

(Note that the operators in (a) and (b) are projections onto the xy-plane and the z-axis, respectively.)

Now that you are familiar with the sets R(T) and Ker T, we will prove that they are vector spaces.

Theorem 4: Let U and V be vector spaces over a field F. Let T: U → V be a linear transformation. Then Ker T is a subspace of U and R(T) is a subspace of V.

Proof: Let x1, x2 ∈ Ker T ⊆ U and α1, α2 ∈ F. Now, by definition, T(x1) = T(x2) = 0.

Therefore, α1T(x1) + α2T(x2) = 0.

But α1T(x1) + α2T(x2) = T(α1x1 + α2x2).

Hence, T(α1x1 + α2x2) = 0.
This means that α1x1 + α2x2 ∈ Ker T.

Thus, by Theorem 4 of Unit 3, Ker T is a subspace of U.

Let y1, y2 ∈ R(T) ⊆ V, and α1, α2 ∈ F. Then, by definition of R(T), there exist x1, x2 ∈ U such that T(x1) = y1 and T(x2) = y2.

So, α1y1 + α2y2 = α1T(x1) + α2T(x2)

= T(α1x1 + α2x2).

Therefore, α1y1 + α2y2 ∈ R(T), which proves that R(T) is a subspace of V.

Now that we have proved that R(T) and Ker T are vector spaces, you know, from Unit
4, that they must have a dimension. We will study these dimensions now.

3.4 Rank and Nullity

Consider any linear transformation T: U → V, assuming that dim U is finite. Then Ker T, being a subspace of U, has finite dimension and dim (Ker T) ≤ dim U. Also note that R(T) = T(U), the image of U under T, a fact you will need to use in solving the following exercise.

E E12) Let {e1, ..., en} be a basis of U. Show that R(T) is generated by {T(e1), ..., T(en)}.

From E12 it is clear that, if dim U = n, then dim R(T) ≤ n. Thus, dim R(T) is finite, and the following definition is meaningful.

Definition: The rank of T is defined to be the dimension of R(T), the range space of
T. The nullity of T is defined to be the dimension of Ker T, the kernel (or the null
space) of T.

Thus, rank (T) = dim R(T) and nullity (T) = dim Ker T.

We have already seen that rank (T) ≤ dim U and nullity (T) ≤ dim U.

Example 11: Let T: U → V be the zero transformation given in Example 2. What are the rank and nullity of T?

Solution: In E10 you saw that R(T) = {0} and Ker T = U. Therefore, rank (T) = 0 and nullity (T) = dim U.
Note that rank (T) + nullity (T) = dim U, in this case.

E E13) If T is the identity operator on V, find rank (T) and nullity (T).

E E14) Let D be the differentiation operator in E6. Give a basis for the range
space of D and for Ker D. What are rank (D) and nullity (D)?

In the above example and exercises you will find that for T: U → V, rank (T) + nullity (T) = dim U. In fact, this is the most important result about the rank and nullity of a linear operator. We will now state and prove this result.

Theorem 5: Let U and V be vector spaces over a field F and dim U = n. Let T: U → V be a linear operator. Then rank (T) + nullity (T) = n.

Proof: Let nullity (T) = m, that is, dim Ker T = m. Let {e1, ..., em} be a basis of Ker T. We know that Ker T is a subspace of U. Thus, by Theorem 11 of Unit 4, we can extend this basis to obtain a basis {e1, ..., em, em+1, ..., en} of U. We shall show that {T(em+1), ..., T(en)} is a basis of R(T). Then our result will follow because dim R(T) will be n - m = n - nullity (T).

Let us first prove that {T(em+1), ..., T(en)} spans, or generates, R(T). Let y ∈ R(T). Then, by definition of R(T), there exists x ∈ U such that T(x) = y.

Let x = c1e1 + ... + cmem + cm+1em+1 + ... + cnen, ci ∈ F ∀ i.

Then
y = T(x) = c1T(e1) + ... + cmT(em) + cm+1T(em+1) + ... + cnT(en)

= cm+1T(em+1) + ... + cnT(en),

because T(e1) = ... = T(em) = 0, since ei ∈ Ker T ∀ i = 1, ..., m. ∴ any y ∈ R(T) is a linear combination of {T(em+1), ..., T(en)}. Hence, R(T) is spanned by {T(em+1), ..., T(en)}. It remains to show that the set {T(em+1), ..., T(en)} is linearly independent. For this, suppose there exist am+1, ..., an ∈ F with am+1T(em+1) + ... + anT(en) = 0.

Then, T(am+1em+1 + ... + anen) = 0.

Hence, am+1em+1 + ... + anen ∈ Ker T, which is generated by {e1, ..., em}.

Therefore, there exist a1, ..., am ∈ F such that am+1em+1 + ... + anen = a1e1 + ... + amem, that is, (-a1)e1 + ... + (-am)em + am+1em+1 + ... + anen = 0.

Since {e1, ..., en} is a basis of U, it follows that this set is linearly independent. Hence, -a1 = 0, ..., -am = 0, am+1 = 0, ..., an = 0. In particular, am+1 = ... = an = 0, which is what we wanted to prove.

Therefore, dim R(T) = n - m = n - nullity (T), that is, rank (T) + nullity (T) = n. Let us see how this theorem can be useful.

Example 12: Let L: R3 → R be the map given by L(x, y, z) = x + y + z. What is nullity (L)?

Solution: In this case it is easier to obtain R(L), rather than Ker L. Since L(1, 0, 0) = 1 ≠ 0, R(L) ≠ {0}, and hence dim R(L) ≠ 0. Also, R(L) is a subspace of R. Thus, dim R(L) ≤ dim R = 1. Therefore, the only possibility is dim R(L) = 1. By Theorem 5, dim Ker L + dim R(L) = 3.

Hence, dim Ker L = 3 - 1 = 2. That is, nullity (L) = 2.
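In matrix terms the same count can be verified directly; a minimal sketch assuming NumPy, where nullity is obtained as dim U minus the rank:

    import numpy as np

    A = np.array([[1.0, 1.0, 1.0]])       # matrix of L(x, y, z) = x + y + z

    rank = np.linalg.matrix_rank(A)
    nullity = A.shape[1] - rank           # rank-nullity: nullity = dim U - rank
    print(rank, nullity)                  # 1 2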

E E15) Give the rank and nullity of each of the linear transformations in E11.

E E16) Let U and V be real vector spaces and T: U → V be a linear transformation, where dim U = 1. Show that R(T) is either a point or a line.

Before ending this section we will prove a result that links the rank (or nullity) of the composite of two linear operators with the rank (or nullity) of each of them.

Theorem 6: Let V be a vector space over a field F. Let S and T be linear operators from V to V. Then

a) rank (ST) ≤ min (rank (S), rank (T))

b) nullity (ST) ≥ max (nullity (S), nullity (T))

Proof: We shall prove (a). Note that (ST)(v) = S(T(v)) for any v ∈ V.

Now, for any y ∈ R(ST), ∃ v ∈ V such that

y = (ST)(v) = S(T(v)) .................. (1)

Now, (1) ⇒ y ∈ R(S).

Therefore, R(ST) ⊆ R(S). This implies that rank (ST) ≤ rank (S).

Again, (1) ⇒ y ∈ S(R(T)), since T(v) ∈ R(T).

∴ R(ST) ⊆ S(R(T)), so that dim R(ST) ≤ dim S(R(T)) ≤ dim R(T) (since, for any subspace W, dim S(W) ≤ dim W).

Therefore, rank (ST) ≤ rank (T).

Thus, rank (ST) ≤ min (rank (S), rank (T)).
The proof of this theorem will be complete, once you solve the following exercise.

E E17) Prove (b) of Theorem 6 using the Rank Nullity Theorem.
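While you work on E17, both inequalities of Theorem 6 are easy to test numerically on random operators; a sketch assuming NumPy (the construction that forces T to be singular is ours):

    import numpy as np

    rng = np.random.default_rng(0)
    n = 5
    S = rng.standard_normal((n, n))
    T = rng.standard_normal((n, n))
    T[:, 0] = T[:, 1]                     # force rank (T) < n

    r = np.linalg.matrix_rank
    rank_S, rank_T, rank_ST = r(S), r(T), r(S @ T)

    assert rank_ST <= min(rank_S, rank_T)                  # part (a)
    assert n - rank_ST >= max(n - rank_S, n - rank_T)      # part (b)
    print(rank_S, rank_T, rank_ST)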


We would now like to discuss some linear operators that have special properties.

3.5 Some types of Linear Transformations

Let us recall, from Unit 1, that there can be different types of functions, some of
which are one-one, onto or invertible. We can also define such types of linear
transformations as follows:

Definition: Let T: U → V be a linear transformation.

a) T is called one-one (or injective) if, for u1, u2 ∈ U with u1 ≠ u2, we have T(u1) ≠ T(u2). If T is injective, we also say T is 1-1.
Note that T is 1-1 if T(u1) = T(u2) ⇒ u1 = u2.

b) T is called onto (or surjective) if, for each v ∈ V, ∃ u ∈ U such that T(u) = v, that is, R(T) = V.

Can you think of examples of such functions? The identity operator is both one-one and onto. Why is this so? Well, I: V → V is an operator such that, if v1, v2 ∈ V with v1 ≠ v2, then I(v1) ≠ I(v2). Also, R(I) = V, so that I is onto.

E E18) Show that the zero operator 0: R → R is not one-one.

An important result that characterizes injectivity is the following:

Theorem 7: T: U → V is one-one if and only if Ker T = {0}.

Proof: First assume T is one-one. Let u ∈ Ker T. Then T(u) = 0 = T(0). This means that u = 0. Thus, Ker T = {0}. Conversely, let Ker T = {0}. Suppose u1, u2 ∈ U with T(u1) = T(u2). Then T(u1 - u2) = 0 ⇒ u1 - u2 ∈ Ker T ⇒ u1 - u2 = 0 ⇒ u1 = u2. Therefore T is 1-1.

Suppose now that T is a one-one and onto linear transformation from a vector space U to a vector space V. Then, from Unit 1 (Theorem 4), we know that T-1 exists. But is T-1 linear? The answer to this question is 'yes', as is shown in the following theorem.

Theorem 8: Let U and V be vector spaces over a field F. Let T: U → V be a one-one and onto linear transformation. Then T-1: V → U is a linear transformation.

In fact, T-1 is also 1-1 and onto.

Proof: Let y1, y2 ∈ V and α1, α2 ∈ F. Suppose T-1(y1) = x1 and T-1(y2) = x2. Then, by definition, y1 = T(x1) and y2 = T(x2).

Now, α1y1 + α2y2 = α1T(x1) + α2T(x2) = T(α1x1 + α2x2).

Hence, T-1(α1y1 + α2y2) = α1x1 + α2x2 = α1T-1(y1) + α2T-1(y2)

(recall that T-1(y) = x ⇔ T(x) = y).

This shows that T-1 is a linear transformation.

We will now show that T-1 is 1-1. For this, suppose y1, y2 ∈ V such that T-1(y1) = T-1(y2). Let x1 = T-1(y1) and x2 = T-1(y2).

Then T(x1) = y1 and T(x2) = y2. We know that x1 = x2. Therefore, T(x1) = T(x2), that is, y1 = y2. Thus, we have shown that T-1(y1) = T-1(y2) ⇒ y1 = y2, proving that T-1 is 1-1. T-1 is also surjective because, for any u ∈ U, v = T(u) ∈ V satisfies T-1(v) = u.

Theorem 8 says that a one-one and onto linear transformation is invertible, and the
inverse is also a one-one and onto linear transformation.

This theorem immediately leads us to the following definition.

Definition: Let U and V be vector spaces over a field F, and let T: U → V be a one-one and onto linear transformation. Then T is called an isomorphism between U and V. In this case we say that U and V are isomorphic vector spaces. This is denoted by U ≅ V.

An obvious example of an isomorphism is the identity operator. Can you think of any other? The following exercises may help.

E E19) Let T: R3 → R3 : T(x, y, z) = (x + y, y, z). Is T an isomorphism? Why? Define T-1, if it exists.

E E20) Let T: R3 → R2 : T(x, y, z) = (x + y, y + z). Is T an isomorphism?

In all these exercises and examples, have you noticed that if T is an isomorphism
between U and V then T-1 is an isomorphism between V and U?

Using these properties of an isomorphism we can get some useful results, like the
following:

Theorem 9: Let T: U → V be an isomorphism. Suppose {e1, ..., en} is a basis of U. Then {T(e1), ..., T(en)} is a basis of V.

Proof: First we show that the set {T(e1), ..., T(en)} spans V. Since T is onto, R(T) = V. Thus, from E12 you know that {T(e1), ..., T(en)} spans V.

Let us now show that {T(e1), ..., T(en)} is linearly independent. Suppose there exist scalars c1, ..., cn such that c1T(e1) + ... + cnT(en) = 0 .......... (1)

We must show that c1 = ... = cn = 0.

Now, (1) implies that

T(c1e1 + ... + cnen) = 0.
Since T is one-one and T(0) = 0, we conclude that c1e1 + ... + cnen = 0.
But {e1, ..., en} is linearly independent. Therefore, c1 = ... = cn = 0.

Thus, we have shown that {T(e1), ..., T(en)} is a basis of V.

Remark: The argument showing the linear independence of {T(e1), ..., T(en)} in the above theorem can be used to prove that any one-one linear transformation T: U → V maps any linearly independent subset of U onto a linearly independent subset of V (see E22).

We now give an important result equating 'isomorphism' with '1-1' and with 'onto' in the finite-dimensional case.

Theorem 10: Let T: U → V be a linear transformation where U, V are of the same finite dimension. Then the following statements are equivalent.

a) T is 1-1.
b) T is onto.
c) T is an isomorphism.

Proof: To prove the result we will prove (a) ⇒ (b) ⇒ (c) ⇒ (a). Let dim U = dim V = n.

Now (a) implies that Ker T = {0} (from Theorem 7). Hence, nullity (T) = 0. Therefore, by Theorem 5, rank (T) = n, that is, dim R(T) = n = dim V. But R(T) is a subspace of V. Thus, by the remark following Theorem 12 of Unit 4, we get R(T) = V, i.e., T is onto, i.e., (b) is true. So (a) ⇒ (b).

Similarly, if (b) holds then rank (T) = n, and hence, nullity (T) = 0. Consequently, Ker T = {0}, and T is one-one. Hence, T is one-one and onto, i.e., T is an isomorphism. Therefore, (b) implies (c).

That (a) follows from (c) is immediate from the definition of an isomorphism.

Hence, our result is proved.

Caution: Theorem 10 is true for finite-dimensional spaces U and V, of the same dimension. It is not true otherwise. Consider the following counter-example.

Example 13: (To show that the spaces have to be finite-dimensional): Let V be the real vector space of all polynomials. Let D: V → V be defined by D(a0 + a1x + ... + arxr) = a1 + 2a2x + ... + rarxr-1. Then show that D is onto but not 1-1.

Solution: Note that V has infinite dimension, a basis being {1, x, x2, ...}. D is onto because any element of V is of the form

a0 + a1x + ... + anxn = D(a0x + (a1/2)x2 + ... + (an/(n + 1))xn+1).

D is not 1-1 because, for example, 1 ≠ 0 but D(1) = D(0) = 0.
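By contrast, if we cut D down to the finite-dimensional space P3, Theorem 10 applies, so 'not 1-1' and 'not onto' must occur together. A sketch of this contrast, assuming NumPy and representing a polynomial by its coefficient vector (a0, a1, a2, a3):

    import numpy as np

    # D on P3: (a0, a1, a2, a3) -> (a1, 2a2, 3a3, 0)
    D = np.array([[0.0, 1.0, 0.0, 0.0],
                  [0.0, 0.0, 2.0, 0.0],
                  [0.0, 0.0, 0.0, 3.0],
                  [0.0, 0.0, 0.0, 0.0]])

    rank = np.linalg.matrix_rank(D)
    print(rank, 4 - rank)   # 3 1: not onto (rank < 4) and not 1-1 (nullity > 0)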

The following exercise shows that the statement of Theorem 10 is false if dim U ≠ dim V.

E E21) Define a linear transformation T: R3 → R2 such that T is onto but T is not 1-1. Note that dim R3 ≠ dim R2.
Let us use Theorems 9 and 10 to prove our next result.

Theorem 11: Let T: V → V be a linear transformation and let {e1, ..., en} be a basis of V. Then T is one-one and onto if and only if {T(e1), ..., T(en)} is linearly independent.

Proof: Suppose T is one-one and onto. Then T is an isomorphism. Hence, by Theorem 9, {T(e1), ..., T(en)} is a basis. Therefore, {T(e1), ..., T(en)} is linearly independent.

Conversely, suppose {T(e1), ..., T(en)} is linearly independent. Since {e1, ..., en} is a basis of V, dim V = n. Therefore, any linearly independent subset of n vectors is a basis of V (by Unit 4, Theorem 5, Cor. 1). Hence, {T(e1), ..., T(en)} is a basis of V. Then, any element v of V is of the form

v = c1T(e1) + ... + cnT(en) = T(c1e1 + ... + cnen), where c1, ..., cn are scalars.

Thus, T is onto, and we can use Theorem 10 to say that T is an isomorphism.
Here are some exercises now.

E E22) a) Let T: U → V be a one-one linear transformation and let {u1, ..., uk} be a linearly independent subset of U. Show that the set {T(u1), ..., T(uk)} is linearly independent.

b) Is it true that every linear transformation maps every linearly independent set of vectors into a linearly independent set?
c) Show that every linear transformation maps a linearly dependent set of vectors onto a linearly dependent set of vectors.
E E23) Let T: R3 → R3 be defined by T(x1, x2, x3) = (x1 + x3, x2 + x3, x1 + x2). Is T invertible? If yes, find a rule for T-1 like the one which defines T.

We have seen, in Theorem 9, that if T: U → V is an isomorphism, then T maps a basis of U onto a basis of V. Therefore, dim U = dim V. In other words, if U and V are isomorphic then dim U = dim V. The natural question arises whether the converse is also true. That is, if dim U = dim V, both being finite, can we say that U and V are isomorphic? The following theorem shows that this is indeed the case.

Theorem 12: Let U and V be finite-dimensional vector spaces over F. Then U and V are isomorphic if and only if dim U = dim V.

Proof: We have already seen that if U and V are isomorphic then dim U = dim V. Conversely, suppose dim U = dim V = n. We shall show that U and V are isomorphic. Let {e1, ..., en} be a basis of U and {f1, ..., fn} be a basis of V. By Theorem 3, there exists a linear transformation T: U → V such that T(ei) = fi, i = 1, ..., n.

We shall show that T is 1-1.

Let u = c1e1 + ... + cnen be such that T(u) = 0.
Then 0 = T(u) = c1T(e1) + ... + cnT(en)
= c1f1 + ... + cnfn.

Since {f1, ..., fn} is a basis of V, we conclude that c1 = c2 = ... = cn = 0. Hence, u = 0.

Thus, Ker T = {0} and, by Theorem 7, T is one-one.

Therefore, by Theorem 10, T is an isomorphism, and U ≅ V. An immediate consequence of this theorem follows:

Corollary: Let V be a real (or complex) vector space of dimension n. Then V is isomorphic to Rn (or Cn), respectively.

Proof: If V is a real vector space, then dimR V = n = dim Rn, so we get V ≅ Rn. Similarly, if dimC V = n, then V ≅ Cn.

Remark: Let V be a vector space over F and let B = {e1, ..., en} be a basis of V. Each v ∈ V can be uniquely expressed as v = α1e1 + ... + αnen. Recall that α1, ..., αn are called the coordinates of v with respect to B (refer to Sec. 4.4.1).

Define θ: V → Fn : θ(v) = (α1, ..., αn). Then θ is an isomorphism from V to Fn.

This is because θ is 1-1, since the coordinates of v with respect to B are uniquely determined. Thus, V ≅ Fn.
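The coordinate isomorphism of this remark can be computed by solving a linear system: writing the basis vectors as the columns of a matrix B, the coordinates of v are the solution of Bx = v. A sketch, assuming NumPy and a basis of R3 chosen by us:

    import numpy as np

    # columns of B form a basis of R^3
    B = np.array([[1.0, 1.0, 0.0],
                  [0.0, 1.0, 1.0],
                  [0.0, 0.0, 1.0]])

    v = np.array([2.0, 3.0, 4.0])
    coords = np.linalg.solve(B, v)        # the coordinates of v w.r.t. B
    print(coords)                         # [ 3. -1.  4.]
    assert np.allclose(B @ coords, v)     # v = sum_i coords[i] e_i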

We end this section with an exercise.

E E24) Let T: U → V be a one-one linear mapping. Show that T is onto if and only if dim U = dim V. (Of course, you must assume that U and V are finite-dimensional spaces.)

Now let us look at isomorphisms between quotient spaces.

3.6 Homomorphism Theorems

Linear transformations are also called vector space homomorphisms. There is a basic theorem which uses the properties of homomorphisms to show the isomorphism of certain quotient spaces (ref. Unit 3). It is simple to prove, but it is very important because it is always being used to prove more advanced theorems on vector spaces. (In the Abstract Algebra course we will prove this theorem in the setting of groups and rings.)

Theorem 13: Let V and W be vector spaces over a field F and T: V → W be a linear transformation. Then V/Ker T ≅ R(T).

Proof: You know that Ker T is a subspace of V, so that V/Ker T is a well-defined vector space over F. Also R(T) = {T(v) | v ∈ V}. To prove the theorem let us define θ: V/Ker T → R(T) by θ(v + Ker T) = T(v).

Firstly, we must show that θ is a well-defined function, that is, if v + Ker T = v′ + Ker T then θ(v + Ker T) = θ(v′ + Ker T), i.e., T(v) = T(v′).

Now, v + Ker T = v′ + Ker T ⇒ (v - v′) ∈ Ker T (see Unit 3, E23)

⇒ T(v - v′) = 0 ⇒ T(v) = T(v′), and hence, θ is well defined.


Next, we check that θ is a linear transformation. For this, let a, b ∈ F and v, v′ ∈ V. Then θ{a(v + Ker T) + b(v′ + Ker T)}
= θ(av + bv′ + Ker T) (ref. Unit 3)
= T(av + bv′)
= aT(v) + bT(v′), since T is linear
= aθ(v + Ker T) + bθ(v′ + Ker T).
Thus, θ is a linear transformation.

We end the proof by showing that θ is an isomorphism. θ is 1-1, because θ(v + Ker T) = 0 ⇒ T(v) = 0 ⇒ v ∈ Ker T ⇒ v + Ker T = Ker T, the zero element of V/Ker T.

Thus, Ker θ = {0}.

θ is onto, because any element of R(T) is T(v) = θ(v + Ker T).

So we have proved that θ is an isomorphism. This proves that V/Ker T ≅ R(T).

Let us consider an immediate useful application of Theorem 13.

Example 14: Let V be a finite-dimensional space and let S and T be linear transformations from V to V. Show that rank (ST) = rank (T) - dim (R(T) ∩ Ker S).

Solution: We have V → V → V, applying T first and then S; ST is the composition of the operators S and T, which you have studied in Unit 1, and will also study in Unit 6. Now, we apply Theorem 13 to the homomorphism θ: T(V) → ST(V) : θ(T(v)) = (ST)(v).

Now, Ker θ = {x ∈ T(V) | S(x) = 0} = Ker S ∩ T(V) = Ker S ∩ R(T). Also R(θ) = ST(V), since any element of ST(V) is (ST)(v) = θ(T(v)). Thus,

T(V)/(Ker S ∩ T(V)) ≅ ST(V).

Therefore,

dim [T(V)/(Ker S ∩ T(V))] = dim ST(V).

That is, dim T(V) - dim (Ker S ∩ T(V)) = dim ST(V), which is what we had to show.
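Before E25 below, the identity of Example 14 can be checked numerically. Here dim (R(T) ∩ Ker S) is computed via dim (A ∩ B) = dim A + dim B - dim (A + B); a sketch assuming NumPy, with the specific matrices chosen by us:

    import numpy as np

    rng = np.random.default_rng(1)
    n = 6
    S = rng.standard_normal((n, n)); S[:, :2] = 0.0   # give S a nontrivial kernel
    T = rng.standard_normal((n, n)); T[:, 0] = T[:, 1]

    r = np.linalg.matrix_rank

    # a basis of Ker S from the SVD of S
    _, _, Vt = np.linalg.svd(S)
    ker_S = Vt[r(S):].T                   # columns span Ker S

    # dim(R(T) n Ker S) = dim R(T) + dim Ker S - dim(R(T) + Ker S)
    dim_sum = r(np.hstack([T, ker_S]))
    dim_int = r(T) + ker_S.shape[1] - dim_sum

    print(r(S @ T), r(T) - dim_int)       # both sides of the identity agree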
E E25) Using Example 14 and the Rank Nullity Theorem, show that nullity (ST) = nullity (T) + dim (R(T) ∩ Ker S).

Now let us see another application of Theorem 13.

Example 15: Show that R3/R ≅ R2.

Solution: Note that we can consider R as a subspace of R3 for the following reason: any element α of R is equated with the element (α, 0, 0) of R3. Now, we define a function f: R3 → R2 : f(α, β, γ) = (β, γ). Then f is a linear transformation and Ker f = {(α, 0, 0) | α ∈ R} ≅ R. Also f is onto, since any element (α, β) of R2 is f(0, α, β). Thus, by Theorem 13, R3/R ≅ R2.

Note: In general, for any n ≥ m, Rn/Rm ≅ Rn-m. Similarly, Cn/Cm ≅ Cn-m for n ≥ m.
The next result is a corollary to the Fundamental Theorem of Homomorphism. But, before studying it, read Unit 3 for the definition of the sum of spaces.

Corollary 1: Let A and B be subspaces of a vector space V. Then (A + B)/B ≅ A/(A ∩ B).

Proof: We define a linear function T: A → (A + B)/B by T(a) = a + B.

T is well defined because a + B is an element of (A + B)/B (since a = a + 0 ∈ A + B).

T is a linear transformation because, for α1, α2 in F and a1, a2 in A, we have

T(α1a1 + α2a2) = α1a1 + α2a2 + B = α1(a1 + B) + α2(a2 + B)
= α1T(a1) + α2T(a2).

Now we will show that T is surjective. Any element of (A + B)/B is of the form a + b + B, where a ∈ A and b ∈ B.

Now a + b + B = (a + B) + (b + B) = (a + B) + B, since b ∈ B,

= a + B, since B is the zero element of (A + B)/B,

= T(a), proving that T is surjective.

∴ R(T) = (A + B)/B.

We will now prove that Ker T = A ∩ B.

If a ∈ Ker T, then a ∈ A and T(a) = 0. This means that a + B = B, the zero element of (A + B)/B. Hence, a ∈ B (by Unit 3, E23). Therefore, a ∈ A ∩ B.

Thus, Ker T ⊆ A ∩ B. On the other hand, a ∈ A ∩ B ⇒ a ∈ A and a ∈ B ⇒ a ∈ A and a + B = B ⇒ a ∈ A and T(a) = 0

⇒ a ∈ Ker T.

This proves that A ∩ B = Ker T.

Now using Theorem 13, we get

A/Ker T ≅ R(T)

That is, A/(A ∩ B) ≅ (A + B)/B.

E E26) Using the corollary above, show that (A ⊕ B)/B ≅ A (⊕ denotes the direct sum defined in Sec. 3.6).

There is yet another interesting corollary to the Fundamental Theorem of Homomorphism.

Corollary 2: Let W be a subspace of a vector space V. Then, for any subspace U of V containing W,

(V/W)/(U/W) ≅ V/U.

Proof: This time we shall prove the theorem with you. To start with, let us define a function T: V/W → V/U : T(v + W) = v + U. Now try E27.

E E27) a) Check that T is well defined.

b) Prove that T is a linear transformation.
c) What are the spaces Ker T and R(T)?

So, is the theorem proved? Yes; apply Theorem 13 to T. We end the unit by summarizing what we have done in it.

4.0 CONCLUSION

In this unit we have covered the following points.

(1) A linear transformation from a vector space U over F to a vector space V over F is a function T: U → V such that

LT1) T(u1 + u2) = T(u1) + T(u2) ∀ u1, u2 ∈ U, and

LT2) T(αu) = αT(u), for α ∈ F and u ∈ U.

These conditions are equivalent to the single condition LT3) T(αu1 + βu2) = αT(u1) + βT(u2) for α, β ∈ F and u1, u2 ∈ U.

(2) Given a linear transformation T: U → V:

i) The kernel of T is the vector space {u ∈ U | T(u) = 0}, denoted by Ker T.
ii) The range of T is the vector space {T(u) | u ∈ U}, denoted by R(T).
iii) The rank of T = dim R(T).
iv) The nullity of T = dim Ker T.
(3) Let U and V be finite-dimensional vector spaces over F and T: U → V be a linear transformation. Then rank (T) + nullity (T) = dim U.
(4) Let T: U → V be a linear transformation. Then
(i) T is one-one if T(u1) = T(u2) ⇒ u1 = u2 ∀ u1, u2 ∈ U.
(ii) T is onto if, for any v ∈ V, ∃ u ∈ U such that T(u) = v.
(iii) T is an isomorphism (or is invertible) if it is one-one and onto, and then U and V are called isomorphic spaces. This is denoted by U ≅ V.

(5) T: U → V is

(i) one-one if and only if Ker T = {0};

(ii) onto if and only if R(T) = V.

(6) Let U and V be finite-dimensional vector spaces with the same dimension. Then T: U → V is 1-1 iff T is onto iff T is an isomorphism.

(7) Two finite-dimensional vector spaces U and V are isomorphic if and only if dim U = dim V.

(8) Let V and W be vector spaces over a field F, and T: V → W be a linear transformation. Then V/Ker T ≅ R(T).

5.0 SUMMARY

E1) For any α1, α2 ∈ F and u1, u2 ∈ U, we know that α1u1 ∈ U and α2u2 ∈ U. Therefore, by LT1,

T(α1u1 + α2u2) = T(α1u1) + T(α2u2)

= α1T(u1) + α2T(u2), by LT2.
Thus, LT3 is true.

E2) By LT2, T(0·u) = 0·T(u) = 0 for any u ∈ U. Thus, T(0) = 0. Similarly, for any u ∈ U, T(-u) = T((-1)u) = (-1)T(u) = -T(u).

E3) T(x, y) = (-x, y) ∀ (x, y) ∈ R2. (See the geometric view in Fig. 4.) T is a linear operator. This can be proved the same way as we did in Example 4.

Fig. 4: Q(-1, 2) is the reflection of P(1, 2) in the y-axis.
E4) T((x1, x2, x3) + (y1, y2, y3)) = T(x1 + y1, x2 + y2, x3 + y3)

= a1(x1 + y1) + a2(x2 + y2) + a3(x3 + y3)

= (a1x1 + a2x2 + a3x3) + (a1y1 + a2y2 + a3y3)

= T(x1, x2, x3) + T(y1, y2, y3)

Also, for any α ∈ R,

T(α(x1, x2, x3)) = a1αx1 + a2αx2 + a3αx3

= α(a1x1 + a2x2 + a3x3) = αT(x1, x2, x3).

Thus, LT1 and LT2 hold for T.

E5) We will check that LT1 and LT2 hold.

T((x1, x2, x3) + (y1, y2, y3)) = T(x1 + y1, x2 + y2, x3 + y3)

= (x1 + y1 + x2 + y2 - x3 - y3, 2x1 + 2y1 - x2 - y2, x2 + y2 + 2x3 + 2y3)

= (x1 + x2 - x3, 2x1 - x2, x2 + 2x3) + (y1 + y2 - y3, 2y1 - y2, y2 + 2y3)
= T(x1, x2, x3) + T(y1, y2, y3), showing that LT1 holds.

Also, for any α ∈ R,

T(α(x1, x2, x3)) = T(αx1, αx2, αx3)

= (αx1 + αx2 - αx3, 2αx1 - αx2, αx2 + 2αx3)

= α(x1 + x2 - x3, 2x1 - x2, x2 + 2x3) = αT(x1, x2, x3), showing that LT2 holds.
E6) We want to show that D(αf + βg) = αD(f) + βD(g), for any α, β ∈ R and f, g ∈ Pn. Now, let f(x) = a0 + a1x + a2x2 + ... + anxn and g(x) = b0 + b1x + ... + bnxn.

Then (αf + βg)(x) = (αa0 + βb0) + (αa1 + βb1)x + ... + (αan + βbn)xn.
∴ [D(αf + βg)](x) = (αa1 + βb1) + 2(αa2 + βb2)x + ... + n(αan + βbn)xn-1
= α(a1 + 2a2x + ... + nanxn-1) + β(b1 + 2b2x + ... + nbnxn-1)
= α(Df)(x) + β(Dg)(x) = (αDf + βDg)(x)
Thus, D(αf + βg) = αDf + βDg, showing that D is a linear map.

E7) No. Because, if T exists, then

T(2u1 + u2) = 2T(u1) + T(u2).

But 2u1 + u2 = u3. ∴ T(2u1 + u2) = T(u3) = v3 = (1, 1).
On the other hand, 2T(u1) + T(u2) = 2v1 + v2 = (2, 0) + (0, 1)
= (2, 1) ≠ v3.
Therefore, LT3 is violated, so no such T exists.

E8) Note that {(1, 0), (0, 5)} is a basis for R2.

Now (3, 5) = 3(1, 0) + (0, 5).

Therefore, T(3, 5) = 3T(1, 0) + T(0, 5) = 3(0, 1) + (1, 0) = (1, 3).
Similarly, (5, 3) = 5(1, 0) + (3/5)(0, 5).
Therefore, T(5, 3) = 5T(1, 0) + (3/5)T(0, 5) = 5(0, 1) + (3/5)(1, 0) = (3/5, 5).
Note that T(5, 3) ≠ T(3, 5).

E9) a) dimR C = 2, a basis being {1, i}, i = √-1.

b) Let T: C → R be such that T(1) = α, T(i) = β.

Then, for any element x + iy ∈ C (x, y ∈ R), we have T(x + iy) = xT(1) + yT(i) = xα + yβ. Thus, T is defined by T(x + iy) = αx + βy ∀ x + iy ∈ C.

E10) T: U → V : T(u) = 0 ∀ u ∈ U.

∴ Ker T = {u ∈ U | T(u) = 0} = U.
R(T) = {T(u) | u ∈ U} = {0}. ∴ 1 ∉ R(T).

E11) a) R(T) = {T(x, y, z) | (x, y, z) ∈ R3} = {(x, y) | (x, y, z) ∈ R3} = R2.

Ker T = {(x, y, z) | T(x, y, z) = 0} = {(x, y, z) | (x, y) = (0, 0)}

= {(0, 0, z) | z ∈ R}.
∴ Ker T is the z-axis.

b) R(T) = {z | (x, y, z) ∈ R3} = R.

Ker T = {(x, y, 0) | x, y ∈ R} = the xy-plane in R3.

c) R(T) = {(x, y, z) ∈ R3 | ∃ x1, x2, x3 ∈ R such that x = x1 + x2 + x3 = y = z}

= {(x, x, x) ∈ R3 | x = x1 + x2 + x3 for some x1, x2, x3 ∈ R}
= {(x, x, x) ∈ R3 | x ∈ R},

because, for any x ∈ R, (x, x, x) = T(x, 0, 0).

∴ R(T) is generated by {(1, 1, 1)}.
Ker T = {(x1, x2, x3) | x1 + x2 + x3 = 0}, which is the plane x1 + x2 + x3 = 0, in R3.

E12) Any element of R(T) is of the form T(u), u ∈ U. Since {e1, ..., en} generates U, ∃ scalars α1, ..., αn such that u = α1e1 + ... + αnen.

Then T(u) = α1T(e1) + ... + αnT(en), that is, T(u) is in the linear span of {T(e1), ..., T(en)}.
∴ {T(e1), ..., T(en)} generates R(T).

E13) T: V → V : T(v) = v. Since R(T) = V and Ker T = {0}, we see that rank (T) = dim V, nullity (T) = 0.

E14) R(D) = {a1 + 2a2x + ... + nanxn-1 | a1, ..., an ∈ R}.

Thus, R(D) ⊆ Pn-1. But any element b0 + b1x + ... + bn-1xn-1 in Pn-1 is

D(b0x + (b1/2)x2 + ... + (bn-1/n)xn) ∈ R(D).

Therefore, R(D) = Pn-1.

∴ a basis for R(D) is {1, x, ..., xn-1}, and rank (D) = n.

Ker D = {a0 + a1x + ... + anxn | a1 + 2a2x + ... + nanxn-1 = 0, ai ∈ R ∀ i}
= {a0 + a1x + ... + anxn | a1 = 0, a2 = 0, ..., an = 0, ai ∈ R ∀ i}
= {a0 | a0 ∈ R} = R.

∴ a basis for Ker D is {1}.

∴ nullity (D) = 1.

E15) a) We have shown that R(T) = R2. ∴ rank (T) = 2.

Therefore, nullity (T) = dim R3 - 2 = 1.

b) rank (T) = 1, nullity (T) = 2.

c) R(T) is generated by {(1, 1, 1)}. ∴ rank (T) = 1.
∴ nullity (T) = 2.

E16) Now rank (T) + nullity (T) = dim U = 1.

Also rank (T) ≥ 0.
∴ the only values rank (T) can take are 0 and 1. If rank (T) = 0, then dim R(T) = 0.
Thus, R(T) = {0}, that is, R(T) is a point.

If rank (T) = 1, then dim R(T) = 1. That is, R(T) is a vector space over R generated by a single element, v, say. Then R(T) is the line Rv = {αv | α ∈ R}.

E17) By Theorem 5, nullity (ST) = dim V - rank (ST). By (a) of Theorem 6, we know that -rank (ST) ≥ -rank (S) and -rank (ST) ≥ -rank (T).
∴ nullity (ST) ≥ dim V - rank (S) and nullity (ST) ≥ dim V - rank (T).
Thus, nullity (ST) ≥ nullity (S) and nullity (ST) ≥ nullity (T). That is, nullity (ST) ≥ max {nullity (S), nullity (T)}.

E18) Since 1 ≠ 2, but 0(1) = 0(2) = 0, we find that 0 is not 1-1.

E19) Firstly note that T is a linear transformation. Secondly, T is 1-1 because T(x, y, z) = T(x′, y′, z′) ⇒ (x, y, z) = (x′, y′, z′).
Thirdly, T is onto because any (x, y, z) ∈ R3 can be written as T(x - y, y, z).
∴ T is an isomorphism. ∴ T-1: R3 → R3 exists and is defined by T-1(x, y, z) = (x - y, y, z).

E20) T is not an isomorphism because T is not 1-1, since (1, -1, 1) ∈ Ker T.

E21) The linear operator in E11(a) suffices.

E22) a) Let α1, ..., αk ∈ F such that α1T(u1) + ... + αkT(uk) = 0

⇒ T(α1u1 + ... + αkuk) = 0 = T(0)
⇒ α1u1 + ... + αkuk = 0, since T is 1-1
⇒ α1 = 0, ..., αk = 0, since {u1, ..., uk} is linearly independent.
∴ {T(u1), ..., T(uk)} is linearly independent.

b) No. For example, the zero operator maps every linearly independent set to {0}, which is not linearly independent.
c) Let T: U → V be a linear operator, and {u1, ..., un} be a linearly dependent set of vectors in U. We have to show that {T(u1), ..., T(un)} is linearly dependent. Since {u1, ..., un} is linearly dependent, ∃ scalars a1, ..., an, not all zero, such that a1u1 + ... + anun = 0.

Then a1T(u1) + ... + anT(un) = T(0) = 0, so that {T(u1), ..., T(un)} is linearly dependent.


E23) T is a linear transformation. Now, if (x, y, z) ∈ Ker T, then T(x, y, z) = (0, 0, 0).

∴ x + z = 0 = y + z = x + y ⇒ x = 0 = y = z
⇒ Ker T = {(0, 0, 0)}

⇒ T is 1-1.
∴ by Theorem 10, T is invertible.

To define T-1: R3 → R3, suppose T-1(x, y, z) = (a, b, c).

Then T(a, b, c) = (x, y, z)

⇒ (a + c, b + c, a + b) = (x, y, z)
⇒ a + c = x, b + c = y, a + b = z

⇒ a = (x - y + z)/2, b = (y + z - x)/2, c = (x + y - z)/2.

∴ T-1(x, y, z) = ((x - y + z)/2, (y + z - x)/2, (x + y - z)/2) for any (x, y, z) ∈ R3.

E24) T: U → V is 1-1. Suppose T is onto. Then T is an isomorphism and dim U = dim V, by Theorem 12. Conversely, suppose dim U = dim V. Then T is onto by Theorem 10.

E25) The Rank Nullity Theorem and Example 14 give

dim V - nullity (ST) = dim V - nullity (T) - dim (R(T) ∩ Ker S)
⇒ nullity (ST) = nullity (T) + dim (R(T) ∩ Ker S).

E26) In the case of the direct sum A ⊕ B, we have A ∩ B = {0}.

∴ (A ⊕ B)/B ≅ A/(A ∩ B) = A/{0} ≅ A.

E27) a) v + W = v′ + W ⇒ v - v′ ∈ W ⊆ U ⇒ v - v′ ∈ U ⇒ v + U = v′ + U

⇒ T(v + W) = T(v′ + W).
∴ T is well defined.

b) For any v + W, v′ + W in V/W and scalars a, b, we have

T(a(v + W) + b(v′ + W)) = T(av + bv′ + W) = av + bv′ + U
= a(v + U) + b(v′ + U) = aT(v + W) + bT(v′ + W).
∴ T is a linear operator.

c) Ker T = {v + W | v + U = U}, since U is the "zero" for V/U,

= {v + W | v ∈ U} = U/W.
R(T) = {v + U | v ∈ V} = V/U.

6.0 TUTOR-MARKED ASSIGNMENT

7.0 REFERENCES/FURTHER READING


UNIT 2 LINEAR TRANSFORMATIONS II

CONTENTS

1.0 Introduction
2.0 Objectives
3.0 Main Content
3.1 Introduction
3.2 Objectives
3.3 The Vector Space L (U, V)
3.4 The Dual Space
3.5 Composition of Linear Transformations
3.6 Minimal Polynomial
4.0 Conclusion
5.0 Summary
6.0 Tutor-Marked Assignment
7.0 References/Further Reading

1.0 INTRODUCTION

In the last unit we introduced you to linear transformations and their properties. We will now show that the set of all linear transformations from a vector space U to a vector space V forms a vector space itself, and that its dimension is (dim U) (dim V). In particular, we define and discuss the dual space of a vector space.

In Unit 1 we defined the composition of two functions. Over here we will discuss the composition of two linear transformations and show that it is again a linear operator. Note that we use the terms 'linear transformation' and 'linear operator' interchangeably.

Finally, we study polynomials, with coefficients from a field F, in a linear operator T: V → V. You will see that every such T satisfies a polynomial equation g(x) = 0; that is, if we substitute T for x in g(x) we get the zero transformation. We will, then, define the minimal polynomial of an operator and discuss some of its properties. These ideas will crop up again in Unit 11.

You must revise Units 1 and 5 before going further.

2.0 OBJECTIVES

After reading this unit, you should be able to:

 Prove and use the fact that L (U, V) is a vector space of dimension (dim U) (dim V);
 Use dual bases, whenever convenient;
 Obtain the composition of two linear operators, whenever possible;
 Obtain the minimal polynomial of a linear transformation T: V → V in some simple cases;
 Obtain the inverse of an isomorphism T: V → V if its minimal polynomial is known.

3.0 MAIN CONTENT

3.1 Introduction

3.3 The Vector Space L (U, V)

By now you must be quite familiar with linear operators, as well as vector spaces. In this section we consider the set of all linear operators from one vector space to another, and show that it forms a vector space.

Let U, V be vector spaces over a field F. Consider the set of all linear transformations from U to V. We denote this set by L(U, V).

We will now define addition and scalar multiplication in L(U, V) so that L(U, V) becomes a vector space.

Suppose S, T ∈ L(U, V) (that is, S and T are linear operators from U to V). We define (S + T): U → V by

(S + T)(u) = S(u) + T(u) ∀ u ∈ U.

Now, for a1, a2 ∈ F and u1, u2 ∈ U, we have

(S + T)(a1u1 + a2u2)

= S(a1u1 + a2u2) + T(a1u1 + a2u2)

= a1S(u1) + a2S(u2) + a1T(u1) + a2T(u2)
= a1(S(u1) + T(u1)) + a2(S(u2) + T(u2))
= a1(S + T)(u1) + a2(S + T)(u2)

Hence, S + T ∈ L(U, V).

Next, suppose S ∈ L(U, V) and α ∈ F. We define αS: U → V as follows:

(αS)(u) = αS(u) ∀ u ∈ U.

Is αS a linear operator? To answer this take β1, β2 ∈ F and u1, u2 ∈ U. Then,

(αS)(β1u1 + β2u2) = αS(β1u1 + β2u2) = α[β1S(u1) + β2S(u2)]
= β1(αS)(u1) + β2(αS)(u2)
Hence, αS ∈ L(U, V).

So we have successfully defined addition and scalar multiplication on L(U, V).

E E1) Show that the set L (U, V) is a vector space over F with respect to the operations of addition and multiplication by scalars defined above. (Hint: The zero vector in this space is the zero transformation.)

Notation: For any vector space V we denote L (V, V) by A(V).

Let U and V be vector spaces over F of dimensions m and n, respectively. We have already observed that L(U, V) is a vector space over F. Therefore, it must have a dimension. We now show that the dimension of L(U, V) is mn.

Theorem 1: Let U, V be vector spaces over a field F of dimensions m and n, respectively. Then L(U, V) is a vector space of dimension mn.

Proof: Let {e1, ..., em} be a basis of U and {f1, ..., fn} be a basis of V. By Theorem 3 of Unit 5, there exists a unique linear transformation E11 ∈ L(U, V) such that

E11(e1) = f1, E11(e2) = 0, ..., E11(em) = 0.

Similarly, there exists E12 ∈ L(U, V) such that

E12(e1) = 0, E12(e2) = f1, E12(e3) = 0, ..., E12(em) = 0.
In general, there exist Eij ∈ L(U, V), for i = 1, ..., n, j = 1, ..., m, such that Eij(ej) = fi and Eij(ek) = 0 for k ≠ j.
To get used to these Eij try the following exercise before continuing the proof.
E E2) Clearly define E2m, E32 and Emn.

Now, let us go on with the proof of Theorem 1.

If u = c1e1 + ... + cmem, where ci ∈ F ∀ i, then Eij(u) = cjfi.

We complete the proof by showing that {Eij | i = 1, ..., n, j = 1, ..., m} is a basis of L(U, V).

Let us first show that the set is linearly independent over F. For this, suppose

Σ(i=1 to n) Σ(j=1 to m) cijEij = 0, .................. (1)

where cij ∈ F. We must show that cij = 0 for all i, j.

(1) implies that

Σ(i=1 to n) Σ(j=1 to m) cijEij(ek) = 0 ∀ k = 1, ..., m.

Thus, by the definition of the Eij's, we get

Σ(i=1 to n) cikfi = 0.

But {f1, ..., fn} is a basis for V. Thus, cik = 0 for all i = 1, ..., n.

But this is true for all k = 1, ..., m.

Hence, we conclude that cij = 0 ∀ i, j. Therefore, the set of Eij's is linearly independent.
Next, we show that the set {Eij | i = 1, ..., n, j = 1, ..., m} spans L(U, V). Suppose T ∈ L(U, V).

Now, for each j such that 1 ≤ j ≤ m, T(ej) ∈ V. Since {f1, ..., fn} is a basis of V, there exist scalars c1j, ..., cnj such that

T(ej) = Σ(i=1 to n) cijfi .................. (2)

We shall prove that

T = Σ(i=1 to n) Σ(j=1 to m) cijEij .................. (3)

By Theorem 1 of Unit 5 it is enough to show that, for each k with 1 ≤ k ≤ m,

T(ek) = Σi Σj cijEij(ek).

Now, Σi Σj cijEij(ek) = Σ(i=1 to n) cikfi = T(ek), by (2). This implies (3).

Thus, we have proved that the set of mn elements {Eij | i = 1, ..., n, j = 1, ..., m} is a basis for L(U, V).

Let us see some ways of using this theorem.

Example 1: Show that L(R2, R) is a plane.

Solution: L(R2, R) is a real vector space of dimension 2 × 1 = 2.

Thus, by Theorem 12 of Unit 5, L(R2, R) ≅ R2, the real plane.
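The identification behind Example 1 is concrete: a linear functional on R2 is completely determined by the pair (T(e1), T(e2)), so each functional corresponds to a point of the plane. A sketch, assuming Python with NumPy (the helper name functional is ours):

    import numpy as np

    def functional(a, b):
        # the functional corresponding to the point (a, b) of the plane
        return lambda v: a * v[0] + b * v[1]

    f = functional(3.0, -1.0)
    print(f(np.array([1.0, 0.0])))        # 3.0 = f(e1)
    print(f(np.array([0.0, 1.0])))        # -1.0 = f(e2)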

Example 2: Let U, V be vector spaces of dimensions m and n, respectively. Suppose W is a subspace of V of dimension p (≤ n). Let

X = {T ∈ L(U, V): T(u) ∈ W for all u ∈ U}.

Is X a subspace of L(U, V)? If yes, find its dimension.

Solution: X = {T ∈ L(U, V) | T(U) ⊆ W} = L(U, W). Thus, X is also a vector space. Since it is a subset of L(U, V), it is a subspace of L(U, V). By Theorem 1, dim X = mp.
E E3) What can be a basis for L(R2, R), and for L(R, R2)? Notice that both these spaces have the same dimension over R.

After having looked at L(U, V), we now discuss this vector space for the particular case when V = F.

3.4 The Dual Space

The vector space L(U, V), discussed in Sec. 3.3, has a particular name when V = F.

Definition: Let U be a vector space over F. Then the space L(U, F) is called the dual space of U, and is denoted by U*.

In this section we shall study some basic properties of U*. The elements of U* have a specific name, which we now give.

Definition: A linear transformation T: U → F is called a linear functional. Thus, a linear functional on U is a function T: U → F such that T(α1u1 + α2u2) = α1T(u1) + α2T(u2), for α1, α2 ∈ F and u1, u2 ∈ U.
For example, the map f: R3 → R : f(x1, x2, x3) = a1x1 + a2x2 + a3x3, where a1, a2, a3 ∈ R are fixed, is a linear functional on R3. You have already seen this in Unit 5 (E4).

We now come to a very important aspect of the dual space.

We know that the space V*, of linear functionals on V, is a vector space. Also, if dim V = m, then dim V* = m, by Theorem 1. (Remember, dim F = 1.)

Hence, we see that dim V = dim V*. From Theorem 12 of Unit 5, it follows that the vector spaces V and V* are isomorphic.
We now construct a special basis for V*. Let {e1, ..., em} be a basis for V. By Theorem 3 of Unit 5, for each i = 1, ..., m, there exists a unique linear functional fi on V such that

fi(ej) = 1, if i = j, and fi(ej) = 0, if i ≠ j;

that is, fi(ej) = δij, where δij is the Kronecker delta function.

We will prove that the linear functionals f1, ..., fm, constructed above, form a basis of V*.

Since dim V = dim V* = m, it is enough to show that the set {f1, ..., fm} is linearly independent. For this we suppose c1, ..., cm ∈ F such that c1f1 + ... + cmfm = 0.

We must show that ci = 0, for all i.

Now Σ(j=1 to m) cjfj = 0

⇒ Σ(j=1 to m) cjfj(ei) = 0, for each i

⇒ Σ(j=1 to m) cj(fj(ei)) = 0 ∀ i

⇒ Σ(j=1 to m) cjδji = 0 ∀ i ⇒ ci = 0 ∀ i.

Thus, the set {f1, ..., fm} is a set of m linearly independent elements of a vector space V* of dimension m. Thus, from Unit 4 (Theorem 5, Cor. 1), it forms a basis of V*.

Definition: The basis {f1,….,fm} of V* is called the dual basis of the basis {e1,….,
em} of V.

We now come to the result that shows the convenience of using a dual basis.

Theorem 2: Let V be a vector space over F of dimension n, {e1, ..., en} be a basis of V
and {f1, ..., fn} be the dual basis of {e1, ..., en}. Then, for each f ∈ V*,

         n
     f = Σ  f(ei) fi,
        i=1

and, for each v ∈ V,

         n
     v = Σ  fi(v) ei.
        i=1

Proof: Since {f1, ..., fn} is a basis of V*, for f ∈ V* there exist scalars c1, ..., cn such
that

         n
     f = Σ  ci fi.
        i=1

Therefore,
             n
     f(ej) = Σ  ci fi (ej)
            i=1

             n
           = Σ  ci δij, by definition of dual basis,
            i=1

           = cj.

This implies that ci = f(ei) ∀ i = 1, ..., n. Therefore, f = Σ f(ei) fi. Similarly, for v ∈ V,
there exist scalars a1, ..., an such that

         n
     v = Σ  ai ei.
        i=1

Hence,
              n
     fj (v) = Σ  ai fj (ei)
             i=1

              n
            = Σ  ai δji
             i=1

            = aj,

and we obtain

         n
     v = Σ  fi(v) ei.
        i=1

Let us see an example of how this theorem works.

Example 3: Consider the basis e1 = (1, 0, -1), e2 = (1, 1, 1), e3 = (1, 1, 0) of C3 over C.
Find the dual basis of {e1, e2, e3}.

Solution: Any element of C3 is v = (z1, z2, z3), zi ∈ C. Since {e1, e2, e3} is a basis,
there exist α1, α2, α3 ∈ C such that

     v = (z1, z2, z3) = α1e1 + α2e2 + α3e3
                      = (α1 + α2 + α3, α2 + α3, -α1 + α2).

Thus,   α1 + α2 + α3 = z1
             α2 + α3 = z2
            -α1 + α2 = z3.

These equations can be solved to get

     α1 = z1 - z2,  α2 = z1 - z2 + z3,  α3 = 2z2 - z1 - z3.

Now, by Theorem 2,

v = f1(v) e1 + f2(v) e2 + f3(v) e3, where {f1, f2, f3} is the dual basis. Also v = α1e1 +
α2e2 + α3e3.

Hence, f1(v) = α1, f2(v) = α2, f3(v) = α3 ∀ v ∈ C3.

Thus, the dual basis of {e1, e2, e3} is {f1, f2, f3}, where f1, f2, f3 are defined as follows:

     f1(z1, z2, z3) = α1 = z1 - z2,
     f2(z1, z2, z3) = α2 = z1 - z2 + z3,
     f3(z1, z2, z3) = α3 = 2z2 - z1 - z3.
E5) What is the dual basis for the basis {1, x, x2} of the space
P2 = {a0 + a1x + a2x2 : ai ∈ R}?

Now let us look at the dual of the dual space. If you like, you may skip this portion
and go straight to Sec. 3.5.

Let V be an n-dimensional vector space. We have already seen that V and V* are
isomorphic because dim V = dim V*. The dual of V* is called the second dual of V
and is denoted by V**. We will show that V ≅ V**.

Now any element of V** is a linear transformation from V* to F. Also, for any v ∈ V
and f ∈ V*, f(v) ∈ F. So we define a mapping φ: V → V**: v ↦ φ(v), where φ(v)(f) =
f(v) for all f ∈ V* and v ∈ V.

Note that, for any v ∈ V, φ(v) is a well defined mapping from V* to F. We have to
check that it is a linear mapping.

Now, for c1, c2 ∈ F and f1, f2 ∈ V*,

     φ(v)(c1f1 + c2f2) = (c1f1 + c2f2)(v)
                       = c1f1(v) + c2f2(v)
                       = c1 φ(v)(f1) + c2 φ(v)(f2).

∴ φ(v) ∈ L(V*, F) = V**, ∀ v ∈ V.

Furthermore, the map φ: V → V** is linear. This can be seen as follows: for c1, c2 ∈ F
and v1, v2 ∈ V,

     φ(c1v1 + c2v2)(f) = f(c1v1 + c2v2)
                       = c1f(v1) + c2f(v2)
                       = c1 φ(v1)(f) + c2 φ(v2)(f)
                       = (c1 φ(v1) + c2 φ(v2))(f).

This is true ∀ f ∈ V*. Thus, φ(c1v1 + c2v2) = c1 φ(v1) + c2 φ(v2).

Now that we have shown that φ is linear, we want to show that it is actually an
isomorphism. We will show that φ is 1-1. For this, by Theorem 7 of Unit 5, it
suffices to show that φ(v) = 0 implies v = 0. Let {f1, ..., fn} be the dual basis of a
basis {e1, ..., en} of V.

                              n
     By Theorem 2, we have v = Σ  fi(v) ei.
                             i=1

Now φ(v) = 0 ⟹ φ(v)(fi) = 0 ∀ i = 1, ..., n
           ⟹ fi(v) = 0 ∀ i = 1, ..., n
           ⟹ v = Σ fi(v) ei = 0.

Hence, it follows that φ is 1-1. Thus, φ is an isomorphism (Unit 5, Theorem 10).

What we have just proved is the following theorem.

Theorem 3: The map φ: V → V**, defined by φ(v)(f) = f(v) ∀ v ∈ V and f ∈ V*,
is an isomorphism.

We now give an important corollary to this theorem.

Corollary: Let θ be a linear functional on V* (i.e., θ ∈ V**).

Then there exists a unique v ∈ V such that θ(f) = f(v) for all f ∈ V*.

Proof: By Theorem 3, since φ is an isomorphism, it is onto and 1-1. Thus, there
exists a unique v ∈ V such that φ(v) = θ. This, by definition, implies that θ(f) =
φ(v)(f) = f(v) for all f ∈ V*.

Using the second dual, try to prove the following exercise.

E6) Show that each basis of V* is the dual of some basis of V.

In the following section we look at the composition of linear operators, and the vector
space A(V), where V is a vector space over F.

3.5 Composition of Linear Transformations

Do you remember the definition of the composition of functions, which you studied in
Unit 1? Let us now consider the particular case of the composition of two linear
transformations. Suppose T: U → V and S: V → W are linear transformations. Then
the composition SoT: U → W is defined by SoT(u) = S(T(u)) ∀ u ∈ U.

This is diagrammatically represented in Fig. 1.

          T         S
     U ------> V ------> W
      \________________/
             SoT

Fig 1: SoT is the composition of S and T.

The first question which comes to our mind is whether SoT is linear. The affirmative
answer is given by the following result.

Theorem 4: Let U, V, W be vector spaces over F. Suppose S ∈ L(V, W) and T ∈ L(U, V). Then SoT ∈ L(U, W).

Proof: All we need to prove is the linearity of the map SoT. Let α1, α2 ∈ F and u1,
u2 ∈ U. Then

     SoT(α1u1 + α2u2) = S(T(α1u1 + α2u2))
                      = S(α1T(u1) + α2T(u2)), since T is linear
                      = α1S(T(u1)) + α2S(T(u2)), since S is linear
                      = α1 SoT(u1) + α2 SoT(u2).

This shows that SoT ∈ L(U, W).

Try the following exercises now.

E7) Let I be the identity operator on V. Show that SoI = IoS = S for all S ∈ A(V).

E8) Prove that So0 = 0oS = 0 for all S ∈ A(V), where 0 is the null operator.

We now make an observation.

Remark: Let S: V → V be an invertible linear transformation (ref. Sec. 1.4), that is,
an isomorphism. Then, by Unit 5, Theorem 8, S-1 ∈ L(V, V) = A(V).

Since S-1oS(v) = v and SoS-1(v) = v for all v ∈ V,

SoS-1 = S-1oS = IV, where IV denotes the identity transformation on V. This remark
leads us to the following interesting result.

Theorem 5: Let V be a vector space over a field F. A linear transformation S ∈ A(V)
is an isomorphism if and only if ∃ T ∈ A(V) such that SoT = I = ToS.

Proof: Let us first assume that S is an isomorphism. Then the remark above tells us
that ∃ S-1 ∈ A(V) such that SoS-1 = I = S-1oS. Thus, we have T (= S-1) such that SoT =
ToS = I.

Conversely, suppose T exists in A(V) such that SoT = I = ToS. We want to show that
S is 1-1 and onto.

We first show that S is 1-1, that is, Ker S = {0}. Now, x ∈ Ker S ⟹ S(x) = 0 ⟹ ToS(x)
= T(0) = 0 ⟹ I(x) = 0 ⟹ x = 0. Thus, Ker S = {0}.

Next, we show that S is onto, that is, for any v ∈ V, ∃ u ∈ V such that S(u) = v. Now,
for any v ∈ V,

v = I(v) = SoT(v) = S(T(v)) = S(u), where u = T(v) ∈ V. Thus, S is onto.

Hence, S is 1-1 and onto, that is, S is an isomorphism.
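A quick numerical illustration of Theorem 5 (this is our sketch, not part of the course text; it assumes numpy, and it represents the operators on R2 by matrices, anticipating Unit 2, where composition becomes the matrix product):

    import numpy as np

    # S rotates the plane through 90 degrees; T through -90 degrees.
    S = np.array([[0.0, -1.0],
                  [1.0,  0.0]])
    T = np.array([[0.0,  1.0],
                  [-1.0, 0.0]])

    I = np.eye(2)
    # SoT = I = ToS, so by Theorem 5, S is an isomorphism and T = S^-1.
    assert np.allclose(S @ T, I) and np.allclose(T @ S, I)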

Use Theorem 5 to solve the following exercise.

E9) Let S(x1, x2) = (x2, -x1) and T(x1, x2) = (-x2, x1). Find SoT and ToS.
Is S (or T) invertible?

Now, let us look at some examples involving the composite of linear operators.

Example 4: Let T: R2 → R3 and S: R3 → R2 be defined by

T(x1, x2) = (x1, x2, x1 + x2) and S(x1, x2, x3) = (x1, x2). Find SoT and ToS.

Solution: First, note that T ∈ L(R2, R3) and S ∈ L(R3, R2). ∴ SoT and ToS are both
well defined linear operators. Now,

SoT(x1, x2) = S(T(x1, x2)) = S(x1, x2, x1 + x2) = (x1, x2).

Hence, SoT = the identity transformation of R2 = IR2.

Now, ToS(x1, x2, x3) = T(S(x1, x2, x3)) = T(x1, x2) = (x1, x2, x1 + x2).

In this case SoT ∈ A(R2), while ToS ∈ A(R3). Clearly, SoT ≠ ToS. Also, note that
SoT = I, but ToS ≠ I.

Remark: Even if SoT and ToS both belong to A(V), SoT may not be equal to ToS.
We give such an example below.

Example 5: Let S, T ∈ A(R2) be defined by T(x1, x2) = (x1 + x2, x1 - x2) and S(x1,
x2) = (0, x2). Show that SoT ≠ ToS.

Solution: You can check that SoT(x1, x2) = (0, x1 - x2) and ToS(x1, x2) = (x2, -x2).
Thus, ∃ (x1, x2) ∈ R2 such that SoT(x1, x2) ≠ ToS(x1, x2) (for instance, SoT(1, 1) ≠
ToS(1, 1)). That is, SoT ≠ ToS.
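Since composition of operators is just composition of functions, the claim of Example 5 is easy to check directly. Here is a minimal sketch of ours (plain Python, no extra libraries assumed):

    def T(x):                      # T(x1, x2) = (x1 + x2, x1 - x2)
        return (x[0] + x[1], x[0] - x[1])

    def S(x):                      # S(x1, x2) = (0, x2)
        return (0, x[1])

    x = (1, 1)
    print(S(T(x)))                 # SoT(1, 1) = (0, 0)
    print(T(S(x)))                 # ToS(1, 1) = (1, -1), so SoT != ToS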

Note: Before checking whether SoT is a well defined linear operator, you must be
sure that both S and T are well defined linear operators.
Now try to solve the following exercises.

E10) Let T(x1, x2) = (0, x1, x2) and S(x1, x2, x3) = (x1 + x2, x2 + x3). Find SoT
and ToS. When is SoT = ToS?

E11) Let T(x1, x2) = (2x1, x1 + 2x2) for (x1, x2) ∈ R2, and S(x1, x2, x3) = (x1 +
2x2, 3x1 - x2, x3) for (x1, x2, x3) ∈ R3. Are SoT and ToS defined? If yes, find
them.

E12) Let U, V, W, Z be vector spaces over F. Suppose T ∈ L(U, V), S ∈ L(V, W)
and R ∈ L(W, Z). Show that (RoS)oT = Ro(SoT).

E13) Let S, T ∈ A(V) and S be invertible. Show that rank (ST) = rank (TS)
= rank (T).
So far we have discussed the composition of linear transformations. We have seen that
if S, T ∈ A(V), then SoT ∈ A(V), where V is a vector space of dimension n. Thus,
we have introduced another binary operation (see Sec. 1.5.2) in A(V), namely, the
composition of operators, denoted by o. Remember, we already have the binary
operations given in Sec. 6.2. In the following theorem we state some simple properties
that involve all these operations.

Theorem 6: Let R, S, T ∈ A(V) and let α ∈ F. Then

(a) Ro(S + T) = RoS + RoT, and (S + T)oR = SoR + ToR.

(b) α(SoT) = (αS)oT = So(αT).

Proof: a) For any v ∈ V,

     Ro(S + T)(v) = R((S + T)(v)) = R(S(v) + T(v))
                  = R(S(v)) + R(T(v))
                  = (RoS)(v) + (RoT)(v)
                  = (RoS + RoT)(v).

Hence, Ro(S + T) = RoS + RoT.

Similarly, we can prove that (S + T)oR = SoR + ToR.

b) For any v ∈ V, α(SoT)(v) = α(S(T(v)))
                            = (αS)(T(v))
                            = ((αS)oT)(v).

Therefore, α(SoT) = (αS)oT.

Similarly, we can show that α(SoT) = So(αT).

Notation: In future we shall be writing ST in place of SoT. Thus, ST(u) = S(T(u))
= (SoT)(u). Also, if T ∈ A(V), we write T0 = I, T1 = T, T2 = ToT and, in general, Tn =
Tn-1oT = ToTn-1.

The properties of A(V), stated in Theorems 1 and 6, are very important and will be used
implicitly again and again. To get used to A(V) and the operations in it, try the
following exercises.

E14) Consider S, T: R2 → R2 defined by S(x1, x2) = (x1, -x2) and T(x1, x2) = (x1
+ x2, x2 - x1). What are S + T, ST, TS, So(S - T) and (S - T)oS?
E15) Let S ∈ A(V), dim V = n and rank (S) = r. Let
M = {T ∈ A(V) : ST = 0},
N = {T ∈ A(V) : TS = 0}.

a) Show that M and N are subspaces of A(V).

b) Show that M = L(V, Ker S). What is dim M?

By now you must have got used to handling the elements of A(V). The next section
deals with polynomials that are related to these elements.

3.6 Minimal Polynomial

Recall that a polynomial in one variable x over F is of the form p(x) = a0 + a1x + ... +
anxn, where a0, a1, ..., an ∈ F.

If an ≠ 0, then p(x) is said to be of degree n. If an = 1, then p(x) is called a monic
polynomial of degree n. For example, x2 + 5x + 6 is a monic polynomial of degree 2.
The set of all polynomials in x with coefficients in F is denoted by F[x].

Definition: For a polynomial p, as above, and an operator T ∈ A(V), we define p(T)
= a0I + a1T + ... + anTn.

Since each of I, T, ..., Tn ∈ A(V), we find p(T) ∈ A(V). We say p(T) ∈ F[T]. If q
is another polynomial in x over F, then p(T) q(T) = q(T) p(T), that is, p(T) and q(T)
commute with each other. This can be seen as follows:

Let q(T) = b0I + b1T + ... + bmTm.

Then p(T) q(T) = (a0I + a1T + ... + anTn)(b0I + b1T + ... + bmTm)
               = a0b0I + (a0b1 + a1b0)T + ... + anbmTn+m
               = (b0I + b1T + ... + bmTm)(a0I + a1T + ... + anTn)
               = q(T) p(T).
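For a concrete feel for p(T), here is a sketch of ours (assuming numpy, with the operator represented by a matrix) that evaluates a polynomial at an operator and confirms that p(T) and q(T) commute:

    import numpy as np

    def poly_of_operator(coeffs, A):
        """Evaluate a0*I + a1*T + ... + an*T^n for the matrix A,
        where coeffs = [a0, a1, ..., an]."""
        result = np.zeros_like(A, dtype=float)
        power = np.eye(A.shape[0])        # T^0 = I
        for a in coeffs:
            result += a * power
            power = power @ A             # next power of A
        return result

    A = np.array([[3.0, 0.0],
                  [1.0, -1.0]])
    p = poly_of_operator([6, 5, 1], A)        # 6I + 5T + T^2
    q = poly_of_operator([0, -1, 0, 2], A)    # -T + 2T^3
    assert np.allclose(p @ q, q @ p)          # p(T) and q(T) commute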

E16) Let p, q ∈ F[x] such that p(T) = 0, q(T) = 0. Show that (p + q)(T) = 0.
((p + q)(x) means p(x) + q(x).)

E17) Check that (2I + 3S + S3) commutes with (S + 2S4), for S ∈ A(Rn).

We now go on to prove that given any T ∈ A(V) we can find a polynomial g ∈ F[x]
such that

g(T) = 0, that is, g(T)(v) = 0 ∀ v ∈ V.

Theorem 7: Let V be a vector space over F of dimension n and T ∈ A(V). Then
there exists a non-zero polynomial g over F such that g(T) = 0 and the degree of g is
at most n2.

Proof: We have already seen that A(V) is a vector space of dimension n2. Hence, the
set {I, T, T2, ..., Tn2} of n2 + 1 vectors of A(V) must be linearly dependent (ref. Unit
4, Theorem 7). Therefore, there must exist a0, a1, ..., an2 ∈ F (not all zero) such that a0I
+ a1T + ... + an2Tn2 = 0.

Let g(x) = a0 + a1x + ... + an2xn2. Then g is a non-zero polynomial of degree at most
n2, such that g(T) = 0.
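Theorem 7 can be seen numerically: flatten the operators I, T, T2, ..., Tn2 into vectors of length n2; since there are n2 + 1 of them, they must be linearly dependent. A sketch of ours, assuming numpy:

    import numpy as np

    A = np.array([[3.0, 0.0],
                  [1.0, -1.0]])          # n = 2, so n^2 = 4
    n = A.shape[0]

    # I, T, T^2, T^3, T^4: five vectors in a 4-dimensional space.
    powers = [np.linalg.matrix_power(A, k).flatten() for k in range(n*n + 1)]
    M = np.stack(powers)
    print(np.linalg.matrix_rank(M))      # at most 4 < 5, hence dependent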

The following exercise will help you in getting used to polynomials in x and T.

E18) Give an example of polynomials g(x) and h(x) in R[x], for which g(I)
= 0 and h(0) = 0, where I and 0 are the identity and zero transformations in
A(R3).

E19) Let T ∈ A(V). Then we have a map φ from F[x] to A(V) given by φ(p)
= p(T). Show that, for a, b ∈ F and p, q ∈ F[x],

a) φ(ap + bq) = aφ(p) + bφ(q).

b) φ(pq) = φ(p) φ(q).
In Theorem 7 we have proved that there exists some g ∈ F[x] with g(T) = 0. But, if
g(T) = 0, then (αg)(T) = 0 for any α ∈ F, and deg (αg) ≤ n2 as well. Thus, there are
infinitely many polynomials that satisfy the conditions in Theorem 7. But if we insist
on some more conditions on the polynomial g, then we end up with one and only one
polynomial which will satisfy these conditions and the conditions in Theorem 7. Let
us see what the conditions are.

Theorem 8: Let T ∈ A(V). Then there exists a unique monic polynomial p of
smallest degree such that p(T) = 0.

Proof: Consider the set S = {g ∈ F[x] : g(T) = 0}. This set is non-empty since, by
Theorem 7, there exists a non-zero polynomial g, of degree at most n2, such that g(T)
= 0. Now consider the set D = {deg f : f ∈ S}. Then D is a subset of N ∪ {0}, and
therefore, it must have a minimum element, m. Let h ∈ S such that deg h = m. Then h(T)
= 0 and deg h ≤ deg g ∀ g ∈ S.

If h = a0 + a1x + ... + amxm, am ≠ 0, then p = (1/am)h is a monic polynomial such that
p(T) = 0. Also deg p = deg h ≤ deg g ∀ g ∈ S. Thus, we have shown that there exists
a monic polynomial p, of least degree, such that p(T) = 0.

We now show that p is unique, that is, if q is any monic polynomial of smallest degree
such that q(T) = 0, then p = q. But this is easy. Firstly, since deg p ≤ deg g ∀ g ∈ S,
deg p ≤ deg q.

Similarly, deg q ≤ deg p. ∴ deg p = deg q.

Now suppose p(x) = a0 + a1x + ... + an-1xn-1 + xn and q(x) = b0 + b1x + ... + bn-1xn-1 +
xn.

Since p(T) = 0 and q(T) = 0, we get (p - q)(T) = 0. But p - q = (a0 - b0) + ... + (an-1
- bn-1)xn-1. Hence, (p - q) is a polynomial of degree strictly less than the degree of p,
such that (p - q)(T) = 0. That is, p - q ∈ S with deg (p - q) < deg p. This is a
contradiction to the way we chose p, unless p - q = 0, that is, p = q. ∴ p is the unique
polynomial satisfying the conditions of Theorem 8.

This theorem immediately leads us to the following definition.

Definition: For T ∈ A(V), the unique monic polynomial p of smallest degree such
that p(T) = 0 is called the minimal polynomial of T.

Note that the minimal polynomial p, of T, is uniquely determined by the following
three properties:

1) p is a monic polynomial over F.

2) p(T) = 0.

3) If g ∈ F[x] with g(T) = 0, then deg p ≤ deg g.

Consider the following example and exercises.

Example 6: For any vector space V, find the minimal polynomials for I, the identity
transformation, and 0, the zero transformation.

Solution: Let p(x) = x - 1 and q(x) = x. Then p and q are monic such that p(I) = 0
and q(0) = 0. Clearly, no non-zero polynomials of smaller degree have the above
properties. Thus, x - 1 and x are the required polynomials.

E20) Define T: R3 → R3: T(x1, x2, x3) = (0, x1, x2). Show that the minimal
polynomial of T is x3.

E21) Define T: Rn → Rn: T(x1, ..., xn) = (0, x1, ..., xn-1). What is the minimal
polynomial of T? (Does E20 help you?)

E22) Let T: R3 → R3 be defined by

T(x1, x2, x3) = (3x1, x1 - x2, 2x1 + x2 + x3). Show that (T2 - I)(T - 3I)
= 0. What is the minimal polynomial of T?

We will now state and prove a criterion by which we can obtain the minimal
polynomial of a linear operator T, once we know any polynomial f ∈ F[x] with f(T) =
0. It says that the minimal polynomial must be a factor of any such f.

Theorem 9: Let T ∈ A(V) and let p(x) be the minimal polynomial of T. Let f(x) be
any polynomial such that f(T) = 0. Then there exists a polynomial g(x) such that f(x)
= p(x) g(x).

Proof: The division algorithm states that given f(x) and p(x), there exist
polynomials g(x) and h(x) such that f(x) = p(x) g(x) + h(x), where h(x) = 0 or deg
h(x) < deg p(x). Now,

0 = f(T) = p(T) g(T) + h(T) = h(T), since p(T) = 0.

Therefore, if h(x) ≠ 0, then h(T) = 0 and deg h(x) < deg p(x).

This contradicts the fact that p(x) is the minimal polynomial of T. Hence, h(x) = 0
and we get f(x) = p(x) g(x).

Using this theorem, can you obtain the minimal polynomial of T in E22 more easily?
Now we only need to check whether T - I, T + I or T - 3I, or a product of two of these
factors, is 0.

Remark: If dim V = n and T ∈ A(V), we have seen that the degree of the minimal
polynomial p of T is ≤ n2. We will study a systematic method of finding the minimal
polynomial of T, and some applications of this polynomial, later. But now we will only
illustrate one application of the concept of the minimal polynomial by proving the
following theorem.

Theorem 10: Let T ∈ A(V). Then T is invertible if and only if the constant term in
the minimal polynomial of T is not zero.

Proof: Let p(x) = a0 + a1x + ... + am-1xm-1 + xm be the minimal polynomial of T. Then

a0I + a1T + ... + am-1Tm-1 + Tm = 0

⟹ T(a1I + ... + am-1Tm-2 + Tm-1) = -a0I .................(1)

Firstly, we will show that if T-1 exists, then a0 ≠ 0. On the contrary, suppose a0 = 0.
Then (1) implies that T(a1I + ... + Tm-1) = 0. Multiplying both sides by T-1 on the left,
we get a1I + ... + Tm-1 = 0.

This equation gives us a monic polynomial q(x) = a1 + ... + xm-1 such that q(T) = 0
and deg q < deg p. This contradicts the fact that p is the minimal polynomial of T.
Therefore, if T-1 exists, the constant term in the minimal polynomial of T cannot be
zero.

Conversely, suppose the constant term in the minimal polynomial of T is not zero, that
is, a0 ≠ 0. Then, dividing Equation (1) on both sides by (-a0), we get

T((-a1/a0)I + ... + (-1/a0)Tm-1) = I.

Let S = (-a1/a0)I + ... + (-1/a0)Tm-1.

Then we have ST = I and TS = I. This shows, by Theorem 5, that T-1 exists and T-1 =
S.
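The proof of Theorem 10 is constructive: it tells us how to write down T-1 from the minimal polynomial. A small sketch of ours (assuming numpy), using an operator whose minimal polynomial is x2 - 1, so that a0 = -1 ≠ 0:

    import numpy as np

    T = np.array([[0.0, 1.0],
                  [1.0, 0.0]])       # T^2 = I, minimal polynomial x^2 - 1

    # With p(x) = a0 + a1*x + x^2, a0 = -1, a1 = 0, the proof gives
    # S = (-a1/a0)*I + (-1/a0)*T, and S = T^-1.
    a0, a1 = -1.0, 0.0
    S = (-a1/a0) * np.eye(2) + (-1.0/a0) * T
    assert np.allclose(S @ T, np.eye(2)) and np.allclose(T @ S, np.eye(2))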

E23) Let Pn be the space of all polynomials of degree ≤ n. Consider the linear
operator D: P2 → P2 given by D(a0 + a1x + a2x2) = a1 + 2a2x. (Note that D is
just the differentiation operator.) Show that D4 = 0. What is the minimal
polynomial of D? Is D invertible?

E24) Consider the reflection transformation given in Unit 5, Example 4. Find
its minimal polynomial. Is T invertible? If so, find its inverse.

E25) Let the minimal polynomial of S ∈ A(V) be xn, n ≥ 1. Show that there exists
v0 ∈ V such that the set {v0, S(v0), ..., Sn-1(v0)} is linearly independent.

We will now end the unit by summarizing what we have covered in it.

2.6 Summary

In this unit we covered the following points.

i. L(U, V), the vector space of all linear transformations from U to V, is of
dimension (dim U) (dim V).
ii. The dual space of a vector space V is L(V, F) = V*, and is isomorphic to V.
iii. If {e1, ..., en} is a basis of V and {f1, ..., fn} is its dual basis, then

          n                                n
      f = Σ f(ei) fi  ∀ f ∈ V*, and    v = Σ fi(v) ei  ∀ v ∈ V.
         i=1                              i=1

iv. Every vector space is isomorphic to its second dual.
v. Suppose S ∈ L(V, W) and T ∈ L(U, V). Then their composition SoT ∈ L(U,
W).
vi. S ∈ A(V) = L(V, V) is an isomorphism if and only if there exists T ∈ A(V)
such that SoT = I = ToS.
vii. For T ∈ A(V) there exists a non-zero polynomial g ∈ F[x], of degree at most
n2, such that g(T) = 0, where dim V = n.
viii. The minimal polynomial of T is the unique monic polynomial p of smallest
degree such that p(T) = 0.
ix. If p is the minimal polynomial of T and f is a polynomial such that f(T) = 0,
then there exists a polynomial g(x) such that f(x) = p(x) g(x).
x. Let T ∈ A(V). Then T-1 exists if and only if the constant term in the minimal
polynomial of T is not zero.

2.7 Solutions/Answers

E1) We have to check that VS1 – VS10 are satisfied by L(U, V). We have already
shown that VS1 and VS6 are true.
VS2: For any L, M, N ∈ L(U, V), we have, ∀ u ∈ U, [(L + M) + N](u)
= (L + M)(u) + N(u) = [L(u) + M(u)] + N(u)
= L(u) + [M(u) + N(u)], since addition is associative in V,
= [L + (M + N)](u).
∴ (L + M) + N = L + (M + N).

VS3: 0: U → V: 0(u) = 0 ∀ u ∈ U is the zero element of L(U, V).

VS4: For any S ∈ L(U, V), (-1)S = -S is the additive inverse of S.

VS5: Since addition is commutative in V, S + T = T + S ∀ S, T in L(U, V).

VS7: ∀ α ∈ F and S, T ∈ L(U, V),

[α(S + T)](u) = α[(S + T)(u)] = α[S(u) + T(u)] = αS(u) + αT(u) = (αS + αT)(u) ∀ u ∈ U.

∴ α(S + T) = αS + αT.

VS8: ∀ α, β ∈ F and S ∈ L(U, V), (α + β)S = αS + βS.

VS9: ∀ α, β ∈ F and S ∈ L(U, V), (αβ)S = α(βS).

VS10: ∀ S ∈ L(U, V), 1.S = S.

E2) E2m(em) = f2 and E2m(ei) = 0 for i ≠ m.

E32(e2) = f3 and E32(ei) = 0 for i ≠ 2.

Emn(ei) = fm, if i = n,
          0, otherwise.

E3) Both spaces have dimension 2 over R. A basis for L(R2, R) is {E11, E12},
where E11(1, 0) = 1, E11(0, 1) = 0, E12(1, 0) = 0, E12(0, 1) = 1. A basis for L(R,
R2) is {E11, E21}, where E11(1) = (1, 0), E21(1) = (0, 1).

E4) Let f: R3 → R be any linear functional. Let f(1, 0, 0) = a1, f(0, 1, 0) = a2, f(0,
0, 1) = a3. Then, for any x = (x1, x2, x3), we have x = x1(1, 0, 0) + x2(0, 1, 0) +
x3(0, 0, 1).
∴ f(x) = x1 f(1, 0, 0) + x2 f(0, 1, 0) + x3 f(0, 0, 1)
       = a1x1 + a2x2 + a3x3.

E5) Let the dual basis be {f1, f2, f3}. Then, for any v ∈ P2, v = f1(v).1 + f2(v).x
+ f3(v).x2.
∴ If v = a0 + a1x + a2x2, then f1(v) = a0, f2(v) = a1, f3(v) = a2.
That is, f1(a0 + a1x + a2x2) = a0, f2(a0 + a1x + a2x2) = a1, f3(a0
+ a1x + a2x2) = a2, for any a0 + a1x + a2x2 ∈ P2.
2 2
E6) Let {f1, ..., fn} be a basis of V*. Let its dual basis be {θ1, ..., θn}, θi ∈ V**. Let
ei ∈ V be such that φ(ei) = θi (ref. Theorem 3) for i = 1, ..., n.
Then {e1, ..., en} is a basis of V, since φ-1 is an isomorphism and maps the basis
{θ1, ..., θn} to {e1, ..., en}. Now fi(ej) = φ(ej)(fi) = θj(fi) = δji, by definition of a dual
basis.
∴ {f1, ..., fn} is the dual of {e1, ..., en}.

E7) For any S ∈ A(V) and for any v ∈ V,

SoI(v) = S(I(v)) = S(v) and IoS(v) = I(S(v)) = S(v).
∴ SoI = S = IoS.

E8) ∀ S ∈ A(V) and v ∈ V,

So0(v) = S(0) = 0, and
0oS(v) = 0(S(v)) = 0.
∴ So0 = 0oS = 0.

E9) S  A (R2), T  A (R2).


SoT (x1, x2) = S (-x2, x1) = (x1, x2)
ToS (x1, x2) = T (x1, - x1) = x1, x2)
V (x1, x2)  R2.
:. SoT = ToS = I, and hence, both S and T are invertible.

E10) T  L (R2, R3), S  L (R3, R2). :. SoT  A (R2), ToS  A (R3).


:. SoT and ToS can never be equal.
Now SoT (x1, x2) = S (0, x1, x2) = (x1, x1 + x2) V (x1, x2)  R2
Also, ToS (x1, x2, x3) = T (x1 + x2, x2 + x3) = (0, x1 + x2, x2 + x3) V (x1, x2, x3)
R .
3

E11) Since T ∈ A(R2) and S ∈ A(R3), SoT and ToS are not defined.

E12) Both (RoS)oT and Ro(SoT) are in L(U, Z). For any u ∈ U,
[(RoS)oT](u) = (RoS)[T(u)] = R[S(T(u))] = R[(SoT)(u)] = [Ro(SoT)](u).
∴ (RoS)oT = Ro(SoT).

E13) By Unit 5, Theorem 6, rank (SoT) ≤ rank (T).

Also, rank (T) = rank (IoT) = rank ((S-1oS)oT)
= rank (S-1o(SoT)) ≤ rank (SoT) (by Unit 5, Theorem 6).
Thus, rank (SoT) ≤ rank (T) ≤ rank (SoT).
∴ rank (SoT) = rank (T).
Similarly, you can show that rank (ToS) = rank (T).

E14) (S + T)(x, y) = (x, -y) + (x + y, y - x) = (2x + y, -x)

ST(x, y) = S(x + y, y - x) = (x + y, x - y)
TS(x, y) = T(x, -y) = (x - y, -(x + y))
[So(S - T)](x, y) = S(-y, x - 2y) = (-y, 2y - x)
[(S - T)oS](x, y) = (S - T)(x, -y) = (x, y) - (x - y, -(x + y)) = (y, 2y + x), ∀ (x,
y) ∈ R2.

E15) a) We first show that if A, B ∈ M and α ∈ F, then αA + B ∈
M. Now, So(αA + B) = So(αA) + SoB, by Theorem 6,
= α(SoA) + SoB, again by Theorem 6,
= α0 + 0, since A, B ∈ M,
= 0.
∴ αA + B ∈ M, and M is a subspace of A(V).
Similarly, you can show that N is a subspace of A(V).

b) For any T ∈ M, ST(v) = 0 ∀ v ∈ V. ∴ T(v) ∈ Ker S ∀ v ∈ V.

∴ R(T), the range of T, is a subspace of Ker S.
∴ T ∈ L(V, Ker S). ∴ M ⊆ L(V, Ker S).
Conversely, any T ∈ L(V, Ker S) is a T ∈ A(V) such that S(T(v)) = 0 ∀ v ∈ V. ∴ ST
= 0. ∴ T ∈ M.
∴ L(V, Ker S) ⊆ M.
∴ We have proved that M = L(V, Ker S).
∴ dim M = (dim V)(nullity S), by Theorem 1,
= n(n - r), by the Rank Nullity Theorem.

E16) (p + q)(T) = p(T) + q(T) = 0 + 0 = 0.

E17) (2I + 3S + S3)(S + 2S4) = (2I + 3S + S3)S + (2I + 3S + S3)(2S4)

= 2S + 3S2 + S4 + 4S4 + 6S5 + 2S7
= 2S + 3S2 + 5S4 + 6S5 + 2S7.
Also, (S + 2S4)(2I + 3S + S3) = 2S + 3S2 + 5S4 + 6S5 + 2S7.
∴ (S + 2S4)(2I + 3S + S3) = (2I + 3S + S3)(S + 2S4).

E18) Consider g(x) = x - 1 ∈ R[x]. Then g(I) = I - 1.I = 0.

Also, if h(x) = x, then h(0) = 0.
Notice that the degrees of g and h are both 1 ≤ (dim R3)2.

E19) Let p = a0 + a1x + ... + anxn, q = b0 + b1x + ... + bmxm.

a) Then ap + bq = aa0 + aa1x + ... + aanxn + bb0 + bb1x + ... +
bbmxm.
∴ φ(ap + bq) = aa0I + aa1T + ... + aanTn + bb0I + bb1T + ... + bbmTm
= ap(T) + bq(T) = aφ(p) + bφ(q).

b) pq = (a0 + a1x + ... + anxn)(b0 + b1x + ... + bmxm)

= a0b0 + (a1b0 + a0b1)x + ... + anbmxn+m.
∴ φ(pq) = a0b0I + (a1b0 + a0b1)T + ... + anbmTn+m
= (a0I + a1T + ... + anTn)(b0I + b1T + ... + bmTm)
= φ(p) φ(q).
E20) T A (R3). Let p (x) = x3. Then p is a monic polynomial. Also, p (T) (x1, x2,
x3) = T3 (x1, x2, x3) = T2 (0, x1, x2) = T (0, 0, x1) = (0, 0, 0) V (x1, x2, x3)  R3.
:. p(T) = 0.

We must also show that no monic polynomial q of smaller degree exists such that q
(T) = 0.

Suppose q = a+ bx + x- and q (T) = 0


Then (aI + bT + T2) (x1, x2, x3) = (0, 0, 0)
 a (x1, x2, x3) + b(0, x1, x2) + (0, 0, x1) = (0, 0, 0)
 ax1 = 0, ax2 +bx1 = 0, ax3 + bx2 + x1 = 0 V (x1, x2, x3)  R3.
 a = 0, b = 0 and x1 = 0. But x1 can be non-zero.
:. q does not exist.
:. p is a minimal polynomial of T.

E21) Consider p(x) = xn. Then p(T) = 0 and no non-zero monic polynomial q of
lesser degree exists such that q(T) = 0. This can be checked on the lines of the solution
of E20.

E22) (T2 - I)(T - 3I)(x1, x2, x3)

= (T2 - I)((3x1, x1 - x2, 2x1 + x2 + x3) - (3x1, 3x2, 3x3))
= (T2 - I)(0, x1 - 4x2, 2x1 + x2 - 2x3)
= T(0, -x1 + 4x2, 3x1 - 3x2 - 2x3) - (0, x1 - 4x2, 2x1 + x2 -
2x3)
= (0, x1 - 4x2, 2x1 + x2 - 2x3) - (0, x1 - 4x2, 2x1 + x2 - 2x3)
= (0, 0, 0) ∀ (x1, x2, x3) ∈ R3.
∴ (T2 - I)(T - 3I) = 0.

Suppose ∃ q = a + bx + x2 such that q(T) = 0. Then q(T)(x1, x2, x3) = (0, 0, 0) ∀ (x1,
x2, x3) ∈ R3. This means that (a + 3b + 9)x1 = 0, (b + 2)x1 + (a - b + 1)x2 = 0, (2b + 9)x1
+ bx2 + (a + b + 1)x3 = 0, for all (x1, x2, x3). So all the coefficients must vanish; but b + 2
= 0 and b = 0 cannot both hold.
∴ The equations can't be solved, and q does not exist. (Clearly no monic polynomial of
degree 1 annihilates T either.) ∴ The minimal polynomial of T
is (x2 - 1)(x - 3).

E23) D4(a0 + a1x + a2x2) = D3(a1 + 2a2x) = D2(2a2) = D(0) = 0 ∀ a0 + a1x + a2x2 ∈ P2.

∴ D4 = 0.

The minimal polynomial of D must divide x4, so it can be x, x2, x3 or x4. Check that D3 = 0, but D2 ≠ 0.
∴ The minimal polynomial of D is p(x) = x3. Since p has zero constant term,

D is not invertible.

E24) T:R2: T(x,y) = (x, - y).


Check that T2-I = 0
:. The minimal polynomial p must divide x2 – I.
:. P(x) can be x – 1, x + 1 or x2 – 1. Since T – 1  0 and T + I  0, we see that p (x0 =
x2 – 1.
By Theorem 10, T is invertible. Now T2 – 1 = 0
:. T (-T) = 1, :. T-1 = -T.

E25) Since the minimal polynomial of S is xn, Sn = 0 and Sn-1 ≠ 0. ∴ ∃ v0 ∈ V such
that
Sn-1(v0) ≠ 0. Let a1, a2, ..., an ∈ F such that
a1v0 + a2S(v0) + ... + anSn-1(v0) = 0 .................. (1)

Then, applying Sn-1 to both sides of this equation, we get a1Sn-1(v0) + ... + anS2n-2(v0) =
0
⟹ a1Sn-1(v0) = 0, since Sn = Sn+1 = ... = S2n-2 = 0
⟹ a1 = 0.

Now (1) reduces to a2S(v0) + ... + anSn-1(v0) = 0.

Applying Sn-2 to both sides we get a2 = 0. In this way we get ai = 0 ∀ i = 1, ..., n.
∴ The set {v0, S(v0), ..., Sn-1(v0)} is linearly independent.
UNIT 2 MATRICES I

1.0 Introduction
2.0 Objectives
3.0 Main Content
3.1 Vector Space of Matrices
3.1.1 Definition of a Matrix
3.1.2 Matrix of a Linear Transformation
3.1.3 Sum and Multiplication by Scalars
3.1.4 Mmxn (F) is a Vector Space
3.1.5 Dimension of Mmxn (F) over F
3.2 New Matrices From Old
3.2.1 Transpose
3.2.2 Conjugate Transpose
3.3 Some Types of Matrices
3.3.1 Diagonal Matrix
3.3.2 Triangular Matrix
3.4 Matrix Multiplication
3.4.1 Matrix of the Composition of Linear Transformations
3.5 Properties of a Matrix Product
3.6 Invertible Matrices
3.6.1 Inverse of a Matrix
3.6.2 Matrix of Change of Basis
3.7 Solutions/Answers
4.0 Conclusion
5.0 Summary
6.0 Tutor-Marked Assignment
7.0 References/Further Reading

1.0 INTRODUCTION

You have studied linear transformations in Units 1 and 2. We will now study a simple
means of representing them, namely, by matrices ('matrices' is the plural form of
'matrix'). We will show that, given a linear transformation, we can obtain a matrix
associated to it, and vice versa. Then, as you will see, certain properties of a linear
transformation can be studied more easily if we study the associated matrix instead.
For example, you will see in Block 3 that it is often easier to obtain the characteristic
roots of a matrix than of a linear transformation.

Matrices were introduced by the English Mathematician, Arthur Cayley, in 1858.


He came upon this notion in connection with linear substitutions. Matrix theory now
occupies an important position in pure as well as applied mathematics. In physics one
comes across such terms as matrix mechanics, scattering matrix, spin matrix,
annihilation and creation matrices. In economics we have the input-output matrix and
the pay off matrix; in statistics we have the transition matrix; and in engineering, the
stress matrix, strain matrix, and many other matrices.
Matrices are intimately connected with linear transformations. In this unit we will
bring out this link. We will first define matrices and derive algebraic operations on
matrices from the corresponding operations on linear transformations. We will also
discuss some special types of matrices. One type, a triangular matrix, will be used
often in Unit 6. You will also study invertible matrices in some detail, and their
connection with change of bases. In Block 2 we will often refer to the material on
change of bases, so do spend some time on Sec. 3.6.

To realize the deep connection between matrices and linear transformations, you
should go back to the exact spot in Units 1 and 2 to which frequent references are
made.

This unit may take you a little longer to study than previous ones, but don't let that
worry you. The material in it is actually very simple.

2.0 OBJECTIVES

At the end of this unit, you should be able to:

• Define and give examples of various types of matrices;
• Obtain a matrix associated to a given linear transformation;
• Define a linear transformation, if you know its associated matrix;
• Evaluate the sum, difference, product and scalar multiples of matrices;
• Obtain the transpose and conjugate of a matrix;
• Determine if a given matrix is invertible;
• Obtain the inverse of a matrix;
• Discuss the effect that a change of basis has on the matrix of a linear
transformation.

3.1 Vector Space of Matrices

Consider the following system of three simultaneous equations in four unknowns:

    x - 2y + 4z + t = 0
    x + ½y + 11t = 0
    3y - 5z = 0

The coefficients of the unknowns x, y, z and t can be arranged in rows and columns to
form a rectangular array as follows:

    1   -2    4    1    (coefficients of the first equation)
    1    ½    0   11    (coefficients of the second equation)
    0    3   -5    0    (coefficients of the third equation)

Such a rectangular array (or arrangement) of numbers is called a matrix. A matrix is
usually enclosed within square brackets [ ] or round brackets ( ):

    [ 1   -2    4    1 ]        ( 1   -2    4    1 )
    [ 1    ½    0   11 ]   or   ( 1    ½    0   11 )
    [ 0    3   -5    0 ]        ( 0    3   -5    0 )

The numbers appearing in the various positions of a matrix are called the entries (or
elements) of the matrix. Note that the same number may appear at two or more
different positions of a matrix. For example, 1 appears in 3 different positions in the
matrix given above.

In the matrix above, the three horizontal rows of entries have 4 elements each. These
are called the rows of this matrix. The four vertical rows of entries in the matrix,
having 3 elements each, are called its columns. Thus, this matrix has three rows and
four columns. We describe this by saying that this is a matrix of size 3 x 4 ("3 by 4"
or "3 cross 4"), or that this is a 3 x 4 matrix. The rows are counted from top to bottom
and the columns are counted from left to right. Thus, the first row is (1, -2, 4, 1), the
second row is (1, ½, 0, 11), and so on. Similarly, the first column is

    1
    1
    0

the second column is

    -2
     ½
     3

and so on.

Note that each row is a 1 x 4 matrix and each column is a 3 x 1 matrix.

We will now define a matrix of any size.

3.1.1 Definition of a Matrix

Let us see what we mean by a matrix of size m x n, where m and n are any two natural
numbers.

Let F be a field. A rectangular array

    a11  a12  ...  a1n
    a21  a22  ...  a2n
    ...  ...  ...  ...
    am1  am2  ...  amn

of mn elements of F arranged in m rows and n columns is called a matrix of size m x
n, or an m x n matrix, over F. You must remember that the mn entries need not be
distinct.

The element at the intersection of the ith row and the jth column is called the (i, j)th
element. For example, in the m x n matrix above, the (2, n)th element is a2n, which
is at the intersection of the 2nd row and the nth column.

A brief notation for this matrix is [aij]mxn, or simply [aij], if m and n need not be
stressed. We also denote matrices by capital letters A, B, C, etc. The set of all m x
n matrices over F is denoted by Mmxn(F).

Thus, [1, 2] ∈ M1x2(R).

If m = n, then the matrix is called a square matrix. The set of all n x n matrices over F
is denoted by Mn(F).

In an m x n matrix, each row is a 1 x n matrix and is also called a row vector.
Similarly, each column is an m x 1 matrix and is also called a column vector.

Let us look at a situation in which a matrix can arise.

Example 1: There are 20 male and 5 female students in the B.Sc. (Math. Hons.) I
year class in a certain college, 15 male and 10 female students in B.Sc. (Math. Hons.)
II year, and 12 male and 10 female students in B.Sc. (Math. Hons.) III year. How does
this information give rise to a matrix?

Solution: One of the ways in which we can arrange this information in the form of a
matrix is as follows:

              B.Sc. I   B.Sc. II   B.Sc. III
    Male        20         15         12
    Female       5         10         10

This is a 2 x 3 matrix.

Another way could be the 3 x 2 matrix

                Female   Male
    B.Sc. I        5      20
    B.Sc. II      10      15
    B.Sc. III     10      12

Either of these matrix representations immediately shows us how many male/female
students there are in any class.

To get used to matrices and their elements, you can try the following exercises.

E1) Let A =  1 2 3      and B =  2 5 3 2
             4 5 0               5 4 1 5
             0 0 7               0 3 2 0

Give the
a) (1, 2)th elements of A and B.
b) Third row of A.
c) Second column of A and the first column of B.
d) Fourth row of B.

E2) Write two different 4 x 2 matrices.

How did you solve E2? Did the (i, j)th entry of one differ from the (i, j)th entry of
the other for some i and j? If not, then they were equal. For example, the two 1 x 1
matrices [2] and [2] are equal. But [2] ≠ [3], since their entries at the (1, 1) position
differ.
Definition: Two matrices are said to be equal if

i. They have the same size, that is, they have the same number of rows as well as
the same number of columns, and
ii. Their elements, at all the corresponding positions, are the same.

The following example will clarify what we mean by equal matrices.

Example 2: If  1 0  =  x y , then what are x, y and z?
               2 3     z 3

Solution: Firstly, both matrices are of the same size, namely, 2 x 2. Now, for these
matrices to be equal the (i, j)th elements of both must be equal ∀ i, j. Therefore, we
must have x = 1, y = 0, z = 2.

E3) Are [1] and  3  equal? Why?
                 4

Now that you are familiar with the concept of a matrix, we will link it up with linear
transformations.

3.1.2 Matrix of a Linear Transformation

We will now obtain a matrix that corresponds to a given linear transformation. You
will see how easy it is to go from matrices to linear transformations, and back. Let U
and V be vector spaces over a field F, of dimensions n and m, respectively. Let

B1 = {e1, ..., en} be an ordered basis of U, and

B2 = {f1, ..., fm} be an ordered basis of V. (By an ordered basis

we mean that the order in which the elements of the basis are written is fixed. Thus,
an ordered basis {e1, e2} is not equal to an ordered basis {e2, e1}.)

Given a linear transformation T: U → V, we will associate a matrix to it. For this, we
consider T(e1), ..., T(en), which are all elements of V and hence, they are linear
combinations of f1, ..., fm. Thus, there exist mn scalars αij such that

    T(e1) = α11f1 + α21f2 + ... + αm1fm
    ..............................................
    T(ej) = α1jf1 + α2jf2 + ... + αmjfm
    ..............................................
    T(en) = α1nf1 + α2nf2 + ... + αmnfm.

From these n equations we form an m x n matrix whose first column consists of the
coefficients of the first equation, whose second column consists of the coefficients of the
second equation, and so on. This matrix,

        α11  α12  ...  α1n
        α21  α22  ...  α2n
    A =  ..   ..  ...   ..
        αm1  αm2  ...  αmn

is called the matrix of T with respect to the bases B1 and B2. Notice that the
coordinate vector of T(ej) is the jth column of A.

We use the notation [T]B1,B2 for this matrix. Thus, to obtain [T]B1,B2 we consider T(ej)
∀ ej ∈ B1, and write them as linear combinations of the elements of B2.

If T  L (V, V), B is a basis of V and we take B1 = B2 = B, then [T]B, B is called the


matrix of T with respect to the basis B, and can also be written as [T]B.

Remark: Why do we insist on order bases? What happens if we interchange the order
of the elements in B= to {en, e1,….., en-1}? The matrix [T]B1, B2 also changes, the last
column becoming the first column now. Similarly, if we change the positions of the
f1’s in B2, the rows of [T]B1, B2 will get interchanged.

Thus, to obtain a unique matrix corresponding to T, we must insist on B 1 and B2 being


ordered bases. Henceforth, while discussing the matrix of a linear mapping, we will
always assume that our bases are ordered bases.

We will now give an example, followed by some exercises.

Example 3: Consider the linear transformation

T: R3 → R2: T(x, y, z) = (x, y). Choose bases B1 and B2 of R3 and R2, respectively. Then
obtain [T]B1,B2.

Solution: Let B1 = {e1, e2, e3}, where e1 = (1, 0, 0), e2 = (0, 1, 0), e3 = (0, 0, 1). Let
B2 = {f1, f2}, where f1 = (1, 0), f2 = (0, 1). Note that B1 and B2 are the standard bases
of R3 and R2, respectively.
T(e1) = (1, 0) = f1 = 1.f1 + 0.f2
T(e2) = (0, 1) = f2 = 0.f1 + 1.f2
T(e3) = (0, 0) = 0.f1 + 0.f2

    Thus, [T]B1,B2 =  1 0 0
                      0 1 0
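The recipe of Example 3 is mechanical enough to code. In this sketch of ours (assuming numpy), the images of the basis vectors of B1, written in coordinates with respect to B2, are stacked as the columns of [T]B1,B2:

    import numpy as np

    def T(v):                          # T(x, y, z) = (x, y)
        return np.array([v[0], v[1]], dtype=float)

    B1 = [np.array([1.0, 0.0, 0.0]),
          np.array([0.0, 1.0, 0.0]),
          np.array([0.0, 0.0, 1.0])]

    # With the standard basis B2 of R^2, coordinates are just the entries,
    # so the columns of the matrix are simply T(e1), T(e2), T(e3).
    M = np.column_stack([T(e) for e in B1])
    print(M)       # [[1. 0. 0.]
                   #  [0. 1. 0.]]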

E4) Choose two other bases B'1 and B'2 of R3 and R2, respectively. (In
Unit 4 you came across a lot of bases of both these vector spaces.) For T in the
example above, give the matrix [T]B'1,B'2.

What E4 shows us is that the matrix of a transformation depends on the bases that we
use for obtaining it. The next two exercises also bring out the same fact.

E5) Write the matrix of the linear transformation T: R3 → R2: T(x, y, z) = (x
+ 2y + 2z, 2x + 3y + 4z) with respect to the standard bases of R3 and R2.

E6) What is the matrix of T, in E5, with respect to the bases

B'1 = {(1, 0, 0), (0, 1, 0), (1, -2, 1)} and
B'2 = {(1, 2), (2, 3)}?
The next exercise is about an operator that you have come across often.

E7) Let V be the vector space of polynomials over R of degree ≤ 3, in the
variable t. Let D: V → V be the differential operator given in Unit 5 (E6, when
n = 3). Show that the matrix of D with respect to the basis {1, t, t2, t3} is

    0 1 0 0
    0 0 2 0
    0 0 0 3
    0 0 0 0

So far, given a linear transformation, we have obtained a matrix from it. This works
the other way also. That is, given a matrix, we can define a linear transformation
corresponding to it.

Example 4: Describe T: R3 → R3 such that

           1 2 4
    [T]B = 2 3 1 , where B is the standard basis of R3.
           3 1 2

Solution: Let B = {e1, e2, e3}. Now, we are given that

T(e1) = 1.e1 + 2.e2 + 3.e3
T(e2) = 2.e1 + 3.e2 + 1.e3
T(e3) = 4.e1 + 1.e2 + 2.e3

You know that any element of R3 is (x, y, z) = xe1 + ye2 + ze3.

Therefore, T(x, y, z) = T(xe1 + ye2 + ze3)

= xT(e1) + yT(e2) + zT(e3), since T is linear,
= x(e1 + 2e2 + 3e3) + y(2e1 + 3e2 + e3) + z(4e1 + e2 + 2e3)
= (x + 2y + 4z)e1 + (2x + 3y + z)e2 + (3x + y + 2z)e3
= (x + 2y + 4z, 2x + 3y + z, 3x + y + 2z).
∴ T: R3 → R3 is defined by T(x, y, z) = (x + 2y + 4z, 2x + 3y + z, 3x + y + 2z).

Try the following exercises now.

E8) Describe T: R3 → R2 such that

    [T]B1,B2 =  1 1 0 , where B1 and B2 are the standard
                0 1 1

bases of R3 and R2, respectively.

E9) Find the linear operator T: C → C whose matrix, with respect to the
basis {1, i}, is

    0 -1
    1  0

(Note that C, the field of complex numbers, is a vector space over R, of dimension 2.)

Now we are in a position to define the sum of matrices and multiplication of a matrix
by a scalar.

3.1.3 Sum and Multiplication by Scalars

In Unit 5 you studied about the sum and scalar multiples of linear transformations. In
the following theorem we will see what happens to the matrices associated with the
linear transformations that are sums or scalar multiples of given linear
transformations.
Theorem 1: Let U and V be vector spaces over F, of dimensions n and m,
respectively. Let B1 and B2 be arbitrary bases of U and V, respectively. (Let us
abbreviate [T]B1,B2 to [T] during this theorem.) Let S, T ∈ L(U, V) and α ∈ F.
Suppose [S] = [aij], [T] = [bij]. Then

[S + T] = [aij + bij], and

[αS] = [αaij].

Proof: Suppose B1 = {e1, e2, ..., en} and B2 = {f1, f2, ..., fm}. Then all the matrices
to be considered here will be of size m x n.

Now, by our hypothesis,

             m
     S(ej) = Σ  aij fi  ∀ j = 1, ..., n, and
            i=1

             m
     T(ej) = Σ  bij fi  ∀ j = 1, ..., n.
            i=1

∴ (S + T)(ej) = S(ej) + T(ej) (by definition of S + T)

                 m           m
               = Σ  aij fi + Σ  bij fi
                i=1         i=1

                 m
               = Σ  (aij + bij) fi.
                i=1

Thus, by definition of the matrix with respect to B1 and B2, we get [S + T] = [aij + bij].

Now, (αS)(ej) = α(S(ej)) (by definition of αS)

                   m
               = α Σ  aij fi
                  i=1

                 m
               = Σ  (αaij) fi.
                i=1

Thus, [αS] = [αaij].


Theorem 1 motivates us to define the sum of two matrices in the following way.

Definition: Let A and B be two m x n matrices,

        a11  a12  ...  a1n            b11  b12  ...  b1n
        a21  a22  ...  a2n            b21  b22  ...  b2n
    A =  ..   ..  ...   ..  ,    B =   ..   ..  ...   ..
        am1  am2  ...  amn            bm1  bm2  ...  bmn

Then the sum of A and B is defined to be the matrix

            a11 + b11   a12 + b12   ...   a1n + b1n
            a21 + b21   a22 + b22   ...   a2n + b2n
    A + B =     ..          ..      ...       ..
            am1 + bm1   am2 + bm2   ...   amn + bmn

In other words, A + B is the m x n matrix whose (i, j)th element is the sum of the (i,
j)th element of A and the (i, j)th element of B. Note that two matrices can be added if
and only if they are of the same size.

Let us see an example of how two matrices are added.

Example 5: What is the sum of  1 4 5  and  0 1 0 ?
                               0 1 0       1 4 5

Solution: Firstly, notice that both the matrices are of the same size (otherwise, we
can't add them). Their sum is

    1+0  4+1  5+0     1 5 5
    0+1  1+4  0+5  =  1 5 5

E10) What is the sum of

a)  [1  2]  and  3 ?
                 1

b)  1 0   and  -1  0 ?
    0 1         0 -1
Now, let us define the scalar multiple of a matrix, again motivated by Theorem 1.

Definition: Let α be a scalar, i.e., α ∈ F, and let A = [aij]mxn. Then we define
the scalar multiple of the matrix A by the scalar α to be the matrix

         αa11  αa12  ...  αa1n
         αa21  αa22  ...  αa2n
    αA =   ..    ..  ...    ..
         αam1  αam2  ...  αamn

In other words, αA is the m x n matrix whose (i, j)th element is α times the (i, j)th
element of A.

Example 6: What is 2A, where A =  ½  ¼  1/3 ?
                                  0  0   0

Solution: We must multiply each entry of A by 2 to get 2A.

Thus,

    2A =  1  ½  2/3
          0  0   0
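These entrywise definitions are exactly how numpy's + and scalar * behave on equal-sized arrays; a short sketch of ours:

    import numpy as np

    A = np.array([[1, 4, 5],
                  [0, 1, 0]])
    B = np.array([[0, 1, 0],
                  [1, 4, 5]])

    print(A + B)       # the matrix [aij + bij], as in Example 5
    print(2 * A)       # the matrix [2*aij], as in Example 6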

E11) Calculate 3  1  and 3  1  +  0 .
                  2         2     1

Remark: The way we have defined the sum and scalar multiple of matrices allows us
to write Theorem 1 as follows:

[S + T]B1,B2 = [S]B1,B2 + [T]B1,B2,

[αS]B1,B2 = α[S]B1,B2.

The following exercise will help you in checking if you have understood the contents
of Sections 3.1.2 and 3.1.3.

E12) Define S: R2 → R3: S(x, y) = (x, 0, y), and let T: R2 → R3 be another
linear transformation. Let
B1 and B2 be the standard bases of R2 and R3, respectively.
Then what are [S]B1,B2, [T]B1,B2, [S + T]B1,B2 and [αS]B1,B2, for any α ∈ R?

We now want to show that the set of all m x n matrices over F is actually a vector
space over F.

3.1.4 Mmxn (F) is a Vector Space

After having defined the sum and scalar multiplication of matrices, we enumerate the
properties of these operations. This will ultimately lead us to prove that the set of all
m x n matrices over F is a vector space over F. Do keep the properties VS1 – VS10
(of Unit 3) in mind. For any A = [aij], B = [bij], C = [cij] ∈ Mmxn(F) and α, β ∈ F, we
have

i) Matrix addition is associative:

(A + B) + C = A + (B + C), since

(aij + bij) + cij = aij + (bij + cij) ∀ i, j, as these are elements of a field.

ii) Additive identity: The matrix of the zero transformation (see Unit 5), with
respect to any basis, has 0 as all its entries. This is called the zero matrix.
Consider the zero matrix 0, of size m x n. Then, for any A ∈ Mmxn(F),

A + 0 = 0 + A = A,

since aij + 0 = 0 + aij = aij ∀ i, j.

Thus, 0 is the additive identity for Mmxn(F).

iii) Additive inverse: Given A ∈ Mmxn(F), we consider the matrix (-1)A. Then
A + (-1)A = (-1)A + A = 0.

This is because the (i, j)th element of (-1)A is -aij, and aij + (-aij) = 0 = (-aij) + aij ∀
i, j.
Thus, (-1)A is the additive inverse of A. We denote (-1)A by -A.

iv) Matrix addition is commutative:

A + B = B + A.
This is true because aij + bij = bij + aij ∀ i, j.

v) α(A + B) = αA + αB.

vi) (α + β)A = αA + βA.

vii) (αβ)A = α(βA).

viii) 1.A = A.

E13) Write out the formal proofs of the properties (v) – (viii) given above.

These eight properties imply that Mmxn (F) is a vector space over F

Now that we have shown that Mmxn (F) is a vector space over F, we know it must have
a dimension.

3.1.5 Dimension of Mmxn(F) over F

What is the dimension of Mmxn(F) over F? To answer this question we prove the
following theorem. But, before you go further, check whether you remember the
definition of a vector space isomorphism (Unit 5).

Theorem 2: Let U and V be vector spaces over F of dimensions n and m,
respectively. Let B1 and B2 be a pair of bases of U and V, respectively. Then the mapping
φ: L(U, V) → Mmxn(F), given by φ(T) = [T]B1,B2, is a vector space isomorphism.

Proof: The fact that φ is a linear transformation follows from Theorem 1. We
proceed to show that the map is also 1-1 and onto. For the rest of the proof we shall
denote [S]B1,B2 by [S] only, and take B1 = {e1, ..., en}, B2 = {f1, f2, ..., fm}.

φ is 1-1: Suppose S, T ∈ L(U, V) are such that φ(S) = φ(T).

Then [S] = [T]. Therefore, S(ej) = T(ej) ∀ ej ∈ B1.

Thus, by Unit 5 (Theorem 1), we have S = T.

φ is onto: If A ∈ Mmxn(F), we want to construct T ∈ L(U, V)

such that φ(T) = A. Suppose A = [aij]. Let v1, ..., vn ∈ V be such that

          m
     vj = Σ  aij fi  for j = 1, ..., n.
         i=1

Then, by Theorem 3 of Unit 5, there exists a linear transformation T ∈ L(U, V)
such that

                  m
     T(ej) = vj = Σ  aij fi.
                 i=1

Thus, by definition, φ(T) = A.

Therefore, φ is a vector space isomorphism.

A corollary to this theorem gives us the dimension of Mmxn(F).

Corollary: The dimension of Mmxn(F) is mn.

Proof: Theorem 2 tells us that Mmxn(F) is isomorphic to L(U, V). Therefore, dimF
Mmxn(F) = dimF L(U, V) (by Theorem 12 of Unit 5) = mn, from Unit 6 (Theorem 1).

Why do you think we chose such a roundabout way of obtaining dim Mmxn(F)? We
could as well have tried to obtain mn linearly independent m x n matrices and shown
that they generate Mmxn(F). But that would be quite tedious (see E16). Also, we have
done so much work on L(U, V), so why not use it! And doesn't the way we have
used seem neat?

Now for some exercises related to Theorem 2.

Now for some exercises related to Theorem 2.

E14) At most, how many matrices can there be in any linearly independent
subset of M2x3(F)?

E15) Are the matrices [1, 0] and [1, -1] linearly independent over R?

E16) Let Eij be the m x n matrix whose (i, j)th element is 1 and whose other
elements are 0. Show that {Eij: 1 ≤ i ≤ m, 1 ≤ j ≤ n} is a basis of Mmxn(F) over F.
Conclude that dimF Mmxn(F) = mn.

Now we move on to the next section, where we see some ways of getting new
matrices from given ones.

3.2 New Matrices From Old

Given any matrix, we can obtain new matrices from it in different ways. Let us see
three of these ways.

3.2.1 Transpose

Suppose A =  1 0 9
             2 5 9

From this we form a matrix whose first and second columns are the first and second
rows of A, respectively. That is, we obtain

        1 2
    B = 0 5
        9 9

Then B is called the transpose of A. Note that A is also the transpose of B, since the
rows of B are the columns of A. Here A is a 2 x 3 matrix and B is a 3 x 2 matrix.

In general, if A = [aij] is an m x n matrix, then the n x m matrix whose ith column is
the ith row of A is called the transpose of A. The transpose of A is denoted by At
(the notation A' is also widely used).

Note that, if A = [aij]mxn, then At = [bij]nxm, where bij is the entry at the intersection of the ith row
and the jth column of At, i.e., the entry at the intersection of the jth row and ith column of A,
i.e., aji. ∴ bij = aji.

E17) Find At, where A =  1 2
                         2 0

We now give a theorem that lists some properties of the transpose.

Theorem 3: Let A, B ∈ Mmxn(F) and α ∈ F. Then

a) (A + B)t = At + Bt
b) (αA)t = αAt
c) (At)t = A

Proof: a) Let A = [aij] and B = [bij]. Then A + B = [aij + bij].

Therefore, (A + B)t = [cij], where

cij = the (j, i)th element of A + B = aji + bji

    = the sum of the (j, i)th elements of A and B
    = the sum of the (i, j)th elements of At and Bt
    = the (i, j)th element of At + Bt.

Thus, (A + B)t = At + Bt.

We leave you to complete the proof of this theorem. In fact, that is what E18 says!

E18) Prove (b) and (c) of Theorem 3.

E19) Show that, if A = At, then A must be a square matrix.

E19 leads us to some definitions.

Definitions: A square matrix A such that At = A is called a symmetric matrix. A
square matrix A such that At = -A is called a skew-symmetric matrix. For example,
the matrix in E17, and

    1 2
    2 1 , are both symmetric matrices.

     0  2
    -2  0  is an example of a skew-symmetric matrix, since its transpose is

    0 -2
    2  0

which is the negative of the original matrix.
E20) Take a 2 x 2 matrix A. Calculate A + At and A - At. Which of these is
symmetric and which is skew-symmetric?

What you have shown in E20 is true for a square matrix of any size, namely, for any A
∈ Mn(F), A + At is symmetric and A - At is skew-symmetric.
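A quick numerical check of this fact (our sketch, assuming numpy, where A.T is the transpose of A):

    import numpy as np

    A = np.array([[1, 2],
                  [5, 3]])

    S = A + A.T                          # symmetric part
    K = A - A.T                          # skew-symmetric part
    assert np.array_equal(S, S.T)        # S^t = S
    assert np.array_equal(K, -K.T)       # K^t = -K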

We now give another way of getting a new matrix from a given matrix, over the
complex field.

3.2.2 Conjugate

If A is a matrix over C, then the matrix obtained by replacing each entry of A by its
complex conjugate is called the conjugate of A, and is denoted by Ā.

Three properties of conjugates, which are similar to those of the transpose, are

a) The conjugate of A + B is Ā + B̄, for A, B ∈ Mmxn(C).

b) The conjugate of αA is ᾱĀ, for α ∈ C and A ∈ Mmxn(C).
c) The conjugate of Ā is A, for A ∈ Mmxn(C).

Let us see an example of obtaining the conjugate of a matrix.

Example 7: Find the conjugate of  1       i
                                  2 + i  -3 - 2i

Solution: By definition, the required matrix will be

    1      -i
    2 - i  -3 + 2i

Example 8: What is the conjugate of  1 1 ?
                                     2 3

Solution: Note that this matrix has only real entries. Thus, the complex conjugate of
each entry is itself. This means that the conjugate of this matrix is itself.

This example leads us to make the following observation.

Remark: Ā = A if and only if A is a real matrix.

Try the following exercise now.

E21) Calculate the conjugate of  i 2
                                 3 i

We combine what we have learnt in the previous two sub-sections now.

3.2.3 Conjugate Transpose

Given a matrix A ∈ Mmxn(C), we form a matrix B by taking the conjugate of At. Then
B = Āt is called the conjugate transpose of A.

Example 9: Find Āt, where A =  1       i
                               2 + i  -3 - 2i

Solution: Firstly, At =  1   2 + i
                         i  -3 - 2i

Then Āt =   1    2 - i
           -i   -3 + 2i

Now, note a peculiar occurrence. If we first calculate Ā and then take its transpose,
we get the same matrix, namely, Āt. That is, (Ā)t = Āt. In general, (Ā)t = Āt ∀ A ∈
Mmxn(C).
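In numpy the conjugate transpose of Example 9 can be checked in one line (a sketch of ours; .conj() takes entrywise complex conjugates and .T transposes):

    import numpy as np

    A = np.array([[1, 1j],
                  [2 + 1j, -3 - 2j]])

    print(A.conj().T)                          # the matrix found in Example 9
    assert np.allclose(A.conj().T, A.T.conj()) # (A bar)^t = (A^t) bar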
E22) Show that A = Āt ⟹ A is a square matrix.

E22 leads us to the following definitions.

Definitions: A square matrix A for which Āt = A is called a Hermitian matrix. A
square matrix A is called a skew-Hermitian matrix if Āt = -A.

For example, the matrix  1      1 + i  is Hermitian, whereas the
                         1 - i  2

matrix   i      1 + i  is a skew-Hermitian matrix.
        -1 + i  0

Note: If A =  1 2 , then A = At = Āt (since the entries are all real).
              2 0

∴ A is symmetric as well as Hermitian. In fact, a real matrix A is Hermitian if and only if
A is symmetric. Similarly, a real matrix A is skew-Hermitian if and only if A is skew-symmetric.

We will now discuss two important and often-used, types of square matrices.

3.3 Some Types of Matrices

In this section we will define a diagonal matrix and a triangular matrix.

3.3.1 Diagonal Matrix

Let U and V be vector spaces over F, of dimension n. Let B1 = {e1, ..., en} and B2 =
{f1, ..., fn} be bases of U and V, respectively. Let d1, ..., dn ∈ F. Consider the
transformation T: U → V: T(a1e1 + ... + anen) = a1d1f1 + ... + andnfn.

Then T(e1) = d1f1, T(e2) = d2f2, ..., T(en) = dnfn.

                 d1  0  ...  0
                 0  d2  ...  0
    ∴[T]B1,B2 =   .   .  ...  .
                 0   0  ...  dn

Such a matrix is called a diagonal matrix. Let us see what this means.

Let A = [aij] be a square matrix. The entries a11, a22, ..., ann are called the diagonal
entries of A. This is because they lie along the diagonal, from top left to bottom right, of the
matrix. All the other entries of A are called the off-diagonal entries of A.

A square matrix whose off-diagonal entries are zero (i.e., aij = 0 ∀ i ≠ j) is called a
diagonal matrix. The diagonal matrix

    d1  0   0  ...  0
    0   d2  0  ...  0
    :   :   :       :
    0   0   0  ...  dn

is denoted by diag (d1, d2, ..., dn).

Note: The di's may or may not be zero. What happens if all the di's are zero? Well,
we get the n x n zero matrix, which corresponds to the zero operator.
If di = 1 ∀ i = 1, ..., n, we get the identity matrix, In (or I, when the size is
understood).

E23) Show that In is the matrix associated to the identity operator from Rn to
Rn.

If α ∈ F, the linear operator αI: Rn → Rn: αI(v) = αv, for all v ∈ Rn, is called a scalar
operator. Its matrix with respect to any basis is αIn = diag (α, α, ..., α). Such
a matrix is called a scalar matrix. It is a diagonal matrix whose diagonal entries are
all equal. With this much discussion on diagonal matrices, we move on to describe
triangular matrices.

3.3.2 Triangular Matrix

Let B = {e1, e2, ..., en} be a basis of a vector space V. Let S ∈ L(V, V) be an
operator such that

S(e1) = a11e1
S(e2) = a12e1 + a22e2
  :        :
S(en) = a1ne1 + a2ne2 + ... + annen.

Then, the matrix of S with respect to B is

    a11  a12  ...  a1n
    0    a22  ...  a2n
    :    :         :
    0    0    ...  ann

Note that aij = 0 ∀ i > j.

A square matrix A such that aij = 0 ∀ i > j is called an upper triangular matrix. If
aij = 0 ∀ i ≥ j, then A is called strictly upper triangular.

For example,  1 3 ,  1 0 ,  1 0  are all upper triangular,
              0 2    0 0    0 1

while  0 3  is strictly upper triangular.
       0 0

Note that every strictly upper triangular matrix is an upper triangular matrix.

Now let T : V  V be an operator such that T(ej) is a linear combination of ej,


ej+1,…..,enVj The matrix of T with respect to B is

b11 0 0 . . 0

b21 b22 : . . 0

[T]B = : : : : : :
bn1 bn2 bn3 . . bnn.

Note that bij = 0 V i < j

Such a matrix is called a Lower Triangular Matrix. If bij = 0 for all I  j, then B is
said to be a strictly Lower Triangular Matrix.

The matrix

0 0 0 0

2 0 0 0

-1 -1 0 0

1 0 5 0

is a strictly lower triangular matrix. Of course, it is also lower triangular!

Remark: If A is an upper triangular 3 x 3 matrix, say

        1 2 3             1 0 0
    A = 0 4 5 , then At = 2 4 0 , a lower triangular matrix.
        0 0 6             3 5 6

In fact, for any n x n upper triangular matrix A, its transpose is lower triangular, and
vice versa.
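numpy has helpers for all the special matrices of this section; the following sketch of ours also re-checks the remark about transposes:

    import numpy as np

    print(np.diag([1, 2, 3]))      # diag(1, 2, 3)
    print(np.eye(3))               # the identity matrix I3

    A = np.array([[1, 2, 3],
                  [0, 4, 5],
                  [0, 0, 6]])      # upper triangular
    assert np.array_equal(A, np.triu(A))        # aij = 0 for i > j
    assert np.array_equal(A.T, np.tril(A.T))    # its transpose is lower triangular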

E24) If an upper triangular matrix A is symmetric, then show that it must be a
diagonal matrix.

E25) Show that the diagonal entries of a skew-symmetric matrix are all zero,
but that the converse is not true.

Let us now see how to define the product of two or more matrices.

3.4 Matrix Multiplication

We have already discussed scalar multiplication. Now we see how to multiply two
matrices. Again, the motivation for this operation comes from linear transformations.

3.4.1 Matrix of the Composition of Linear Transformations

Let U, V and W be vector spaces over F, of dimensions p, n and m, respectively. Let
B1, B2 and B3 be bases of these respective spaces. Let T ∈ L(U, V) and S ∈ L(V, W).
Then ST (= S∘T) ∈ L(U, W) (see Sec. 6.4).

Suppose [T]B1,B2 = B = [bjk]n x p

and [S]B2,B3 = A = [aij]m x n.

We ask: what is the matrix [ST]B1,B3?

To answer this we suppose

B1 = {e1, e2, …, ep}
B2 = {f1, f2, …, fn}
B3 = {g1, g2, …, gm}

                 n
Then, we know that T(ek) = Σ bjk fj ∀ k = 1, 2, …, p,
                j=1
                 m
and S(fj) = Σ aij gi ∀ j = 1, 2, …, n.
                i=1

Therefore, SoT(ek) = S(T(ek)) = S( Σ bjk fj ) = b1k S(f1) + b2k S(f2) + … + bnk S(fn)

      m             m                  m
= b1k Σ ai1 gi + b2k Σ ai2 gi + … + bnk Σ ain gi
     i=1           i=1                i=1

  m
= Σ (ai1 b1k + ai2 b2k + … + ain bnk) gi, on collecting the
 i=1

coefficients of gi.

                                           n
Thus, [ST]B1,B3 = [cik]m x p, where cik = Σ aij bjk.
                                          j=1

We define the matrix [cik] to be the product AB.

So, let us see how we obtain AB from A and B.

Let A = [aij]m x n and B = [bjk]n x p be two matrices over F of sizes m x n and n x p,
respectively. We define AB to be the m x p matrix C whose (i,k)th entry is

      n
cik = Σ aij bjk = ai1 b1k + ai2 b2k + … + ain bnk.
     j=1

In order to obtain the (i,k)th element of AB, take the ith row of A and the kth
column of B; they are both n-tuples. Multiply their corresponding elements and add up all
these products.

For example, if the 2nd row of A = [1 2 3], and the 3rd column of
    4
B = 5 , then the (2,3) entry of AB = 1 x 4 + 2 x 5 + 3 x 6 = 32.
    6

Note that two matrices A and B can only be multiplied if the number of columns of A
= the number of rows of B. The following illustration may help in explaining what we
do to obtain the product of two matrices.

      A (m x n)          B (n x p)               AB (m x p)

a11 a12 .. a1n      b11 .. b1k .. b1p       c11 .. c1k .. c1p
:   :      :        b21 .. b2k .. b2p       :      :      :
ai1 ai2 .. ain   x  :      :      :     =   ci1 .. cik .. cip
:   :      :        :      :      :         :      :      :
am1 am2 .. amn      bn1 .. bnk .. bnp       cm1 .. cmk .. cmp

            n
where cik = Σ aij bjk, i.e., the ith row of A 'times' the kth column of B.
           j=1

Note: This is a very new kind of operation so take your time in trying to understand
it. To get you used to matrix multiplication we consider the product of a row and a
column matrix.

Let A = [a1, a2, …, an] be a 1 x n matrix and let

    b1
B = b2
    :
    bn

be an n x 1 matrix. Then AB is the 1 x 1 matrix

[a1b1 + a2b2 + … + anbn]


1
E E26) What is [1 0 0] 2 ?
3

Now for another example.

Example 10: Let

    1 0 0        2 1
A = 7 0 8 ,  B = 3 5
    0 0 9        4 0

Find AB, if it is defined.

Solution: AB is defined because the number of columns of A = 3 = number of rows
of B.

     1.2 + 0.3 + 0.4   1.1 + 0.5 + 0.0      2  1
AB = 7.2 + 0.3 + 8.4   7.1 + 0.5 + 8.0  =   46 7
     0.2 + 0.3 + 9.4   0.1 + 0.5 + 9.0      36 0
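The (i,k)th-entry rule translates directly into a triple loop. Here is a minimal Python sketch (our own illustration, not part of the original text; the function name matmul is an assumption) that reproduces the product of Example 10:

    def matmul(A, B):
        # AB is defined only when (columns of A) = (rows of B)
        m, n, p = len(A), len(B), len(B[0])
        assert all(len(row) == n for row in A)
        # c[i][k] = sum over j of a[i][j] * b[j][k]
        return [[sum(A[i][j] * B[j][k] for j in range(n)) for k in range(p)]
                for i in range(m)]

    A = [[1, 0, 0], [7, 0, 8], [0, 0, 9]]
    B = [[2, 1], [3, 5], [4, 0]]
    print(matmul(A, B))   # [[2, 1], [46, 7], [36, 0]], as in Example 10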

Notice that BA is not defined because the number of columns of B = 2 ≠ 3 = the
number of rows of A. Thus, if AB is defined then BA may not be defined.

In fact, even if AB and BA are both defined it is possible that AB  BA. Consider the
following example.

Example 11: Let

    1 1 0        0 1
A = 0 1 1 ,  B = 1 1
                 1 1
Is AB = BA?

Solution: AB is a 2 x 2 matrix. BA is a 3 x 3 matrix. So AB and BA are both
defined. But they are of different sizes. Thus, AB ≠ BA.

Another point of difference between multiplication of numbers and matrix
multiplication is that we can have A ≠ 0, B ≠ 0, but AB = 0.

                    1 1         1 0
For example, if A = 1 1 ,  B = -1 0

           1 x 1 + 1 x (-1)   1 x 0 + 1 x 0     0 0
then AB =                                    =
           1 x 1 + 1 x (-1)   1 x 0 + 1 x 0     0 0

So, you see, the product of two non-zero matrices can be zero.

The following exercises will give you some practice in matrix multiplication.

               1 1        1 0
E E27) Let A =      , B =
               0 1        1 1

Write AB and BA, if defined.

               1 1 0        0 1
E E28) Let C =       ,  D = 1 1
               0 1 0        0 0
Write C + D, CD and DC, if defined. Is CD = DC?

E E29) With A, B as in E 27, calculate (A + B)2 and A2 + 2AB + B2. Are they
equal? (Here A2 means A. A.).
                -bd   b
E E30) Let A =           , b, d ∈ F. Find A2.
                -d2b  db

                  1 0 0   x                  1 0 0
E E31) Calculate  0 2 0   y   and  [x y z]   0 2 0
                  0 0 3   z                  0 0 3

E E32) Take a 3 x 2 matrix A whose 2nd row consists of zeros only. Multiply it
by any 2 x 4 matrix B. Show that the 2nd row of AB consists of zeros only. (In
fact, for any two matrices A and B such that AB is defined, if the ith row of A
is the zero vector, then the ith row of AB is also the zero vector. Similarly, if
the jth column of B is the zero vector, then the jth column of AB is the zero
vector.)

We now make an observation.

Remark: If T ∈ L(U, V) and S ∈ L(V, W), then

[ST]B1,B3 = [S]B2,B3 [T]B1,B2, where B1, B2, B3 are the bases of U, V, W,
respectively.

Let us illustrate this remark.

Example 12: Let T : R2 → R3 be a linear transformation such that T(x, y) = (2x + y, x
+ 2y, x + y). Let S : R3 → R2 be defined by S(x, y, z) = (-y + 2z, y - z). Obtain the
matrices [T]B1,B2, [S]B2,B1 and [SoT]B1, and verify that [SoT]B1 = [S]B2,B1
[T]B1,B2, where B1 and B2 are the standard bases of R2 and R3, respectively.

Solution: Let B1 = {e1, e2}, B2 = {f1, f2, f3}.

Then T(e1) = T(1,0) = (2, 1, 1) = 2f1 + f2 + f3
T(e2) = T(0,1) = (1, 2, 1) = f1 + 2f2 + f3.

                 2 1
Thus, [T]B1,B2 = 1 2
                 1 1
Also,

S(f1) = S(1,0,0) = (0,0) = 0.e1 + 0.e2


S(f2) = S(0,1,0) = (-1, 1) = -e1 + e2
S(f3) = S(0, 0, 1) = (2, -1) = 2e1 - e2.

Thus,
             0 -1  2
[S]B2,B1 =
             0  1 -1

                        0 -1  2    2 1     1 0
So, [S]B2,B1 [T]B1,B2 =            1 2  =       = I2.
                        0  1 -1    1 1     0 1

Also, SoT(x, y) = S(2x + y, x + 2y, x + y)

= (-(x + 2y) + 2(x + y), (x + 2y) - (x + y))
= (x, y)
Thus, SoT = I, the identity map.

This means [SoT]B1 = I2.

Hence, [SoT]B1 = [S]B2,B1 [T]B1,B2.

E E33) Let S : R3 → R3 : S(x, y, z) = (0, x, y), and T : R3 → R3 : T(x, y, z) = (x, 0,
y). Show that [SoT]B = [S]B [T]B, where B is the standard basis of R3.

We will now look a little closer at matrix multiplication.

3.4.2 Properties of a Matrix Product

We will now state 5 properties concerning matrix multiplication. (Their proofs could
get a little technical, and we prefer not to give them here.)
(i) Associative Law: If A, B, C are m x n, n x p and p x q matrices, respectively,
over F, then (AB)C = A(BC), i.e., matrix multiplication is associative.
(ii) Distributive Law: If A is an m x n matrix and B, C are n x p matrices, then A
(B + C) = AB + AC.
Similarly, if A and B are m x n matrices, and C is an n x p matrix, then (A +
B)C = AC + BC.
(iii) Multiplicative identity: In Sec. 3.3.1 we defined the identity matrix In. It
acts as the multiplicative identity for matrix multiplication. We have AIn = A,
ImA = A, for every m x n matrix A.
(iv) If α ∈ F, and A, B are m x n and n x p matrices over F, respectively, then α(AB)
= (αA)B = A(αB).
(v) If A, B are m x n and n x p matrices over F, respectively, then (AB)t = Bt At. (This
says that the operation of taking the transpose of a matrix is anti-commutative.)
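For readers who want to experiment, here is a small numpy check of properties (i) and (v) on random integer matrices. This is our own illustration, not part of the original text, and a numerical check is of course not a proof:

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.integers(-5, 5, size=(2, 3))
    B = rng.integers(-5, 5, size=(3, 4))
    C = rng.integers(-5, 5, size=(4, 2))

    print(np.array_equal((A @ B) @ C, A @ (B @ C)))   # associativity: True
    print(np.array_equal((A @ B).T, B.T @ A.T))       # (AB)t = Bt At: True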
These properties can help you in solving the following exercises.

E E34) Show that (A + B)2 = A2 + AB + BA + B2, for any two n x n matrices A


and B.

                2 -1            1 -2 -5
E E35) For A =  1  0  and  B =           ,
               -3  4            3  4  0

Show that 2 (AB) = (2A) B.


               2 -1  0        1 -4  0
E E36) Let A = 1  0 -3 ,  B = 2 -1  3
               0  0  0        4  0 -2

Find (AB)t and Bt At. Are they equal?

E E37) Let A, B be two symmetric n x n matrices over F. Show that AB is


symmetric if and only if AB = BA.

The following exercise is a nice property of the product of diagonal matrices.

E E38) Let A, B be two diagonal n x n matrices over F. Show that AB is also a


diagonal matrix.

Now we shall go on to introduce you to the concept of an invertible matrix.


3.5 Invertible Matrices

In this section we will first explain what invertible matrices are. Then we will see
what we mean by the matrix of a change of basis. Finally, we will show you that such
matrix must be invertible.

3.5.1 Inverse of a Matrix

Just as we defined the operations on matrices by considering them on linear operators


first, we give a definition of invertibility for matrices based on considerations of
invertibility of linear operators.

It may help you to recall what we mean by an invertible linear transformation. A
linear transformation T : U → V is invertible if

(a) T is 1 - 1 and onto, or, equivalently,

(b) there exists a linear transformation S : V → U such that SoT = IU, ToS = IV.

In particular, T ∈ L(V, V) is said to be invertible if ∃ S ∈ L(V, V) such that ST = TS
= I.

We have the following theorem involving the matrix of an invertible linear operator.

Theorem 4: Let V be an n-dimensional vector space over a field F, and B be a basis
of V. Let T ∈ L(V, V). Then T is invertible if and only if there exists A ∈ Mn(F) such
that [T]B A = In = A [T]B.

Proof: Suppose T is invertible. Then ∃ S ∈ L(V, V) such that TS = ST = I. Then, by
Theorem 2, [TS]B = [ST]B = I. That is, [T]B [S]B = [S]B [T]B = I. Take A = [S]B. Then
[T]B A = I = A [T]B.

Conversely, suppose ∃ a matrix A such that [T]B A = A [T]B = I.

Let S ∈ L(V, V) be such that [S]B = A. (S exists because of Theorem 2.) Then [T]B
[S]B = [S]B [T]B = I = [I]B. Thus, [TS]B = [ST]B = [I]B.

So, by Theorem 2, TS = ST = I. That is, T is invertible.

Theorem 4 motivates us to give the following definition.

Definition: A matrix A ∈ Mn(F) is said to be invertible if ∃ B ∈ Mn(F) such that AB =
BA = In.

Remember, only a square matrix can be invertible.


In is an example of an invertible matrix, since In . In = In. On the other hand, the n x n
zero matrix 0 is not invertible, since 0A = 0 ≠ In, for any A.
Note that Theorem 4 says that T is invertible iff [T]B is invertible. We give another
example of an invertible matrix now.

                   1 1
Example 13: Is A =      invertible?
                   0 1

                                                a b
Solution: Suppose A were invertible. Then ∃ B =      such that AB = I = BA. Now,
                                                c d

         1 1   a b     1 0
AB = I ⇒             =
         0 1   c d     0 1

   a+c  b+d     1 0
⇒            =       ⇒ c = 0, d = 1, a = 1, b = -1.
   c    d       0 1

      1 -1
∴ B =      . Now you can also check that BA = I.
      0  1

Therefore, A is invertible.
We now show that if an inverse of a matrix exists, it must be unique.

Theorem 5: Suppose A ∈ Mn(F) is invertible. Then there exists a unique matrix B ∈ Mn(F)
such that AB = BA = I.

Proof: Suppose B, C ∈ Mn(F) are two matrices such that AB = BA = I, and AC =
CA = I. Then B = BI = B(AC) = (BA)C = IC = C.

Because of Theorem 5 we can make the following definition.

Definition: Let A be an invertible matrix. The unique matrix B such that AB = BA =
I is called the inverse of A, and is denoted by A-1.

Let us take an example.


Example 14: Calculate the product AB, where

    1 a        1 b
A =      , B =
    0 1        0 1

Use this to calculate A-1.

              1 a   1 b     1 a+b
Solution: Now AB =        =
              0 1   0 1     0  1

Now, how can we use this to obtain A-1? Well, if AB = I, then a + b = 0. So, if we
take
    1 -a
B =      ,
    0  1

                                1 -a
we get AB = BA = I. Thus, A-1 =
                                0  1
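As a quick numerical sanity check (our own, not the text's), numpy recovers the same inverse when we fix a value for a, say a = 3:

    import numpy as np

    a = 3.0
    A = np.array([[1.0, a], [0.0, 1.0]])
    print(np.linalg.inv(A))       # [[1. -3.] [0. 1.]], i.e., [[1, -a], [0, 1]]
    print(A @ np.linalg.inv(A))   # the identity, up to rounding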

                     1  0
E E39) Is the matrix       invertible? If so, find its inverse.
                     2 -1

We will now make a few observations about the matrix inverse, in the form of a
theorem.

Theorem 6: (a) If A is invertible, then

(i) A-1 is invertible and (A-1)-1 = A,

(ii) At is invertible and (At)-1 = (A-1)t.

(b) If A, B ∈ Mn(F) are invertible, then AB is invertible and (AB)-1 = B-1 A-1.

Proof: (a) By definition,

A A-1 = A-1 A = I ………………………………. (1)

(i) Equation (1) shows that A-1 is invertible and (A-1)-1 = A.


(ii) If we take transposes in Equation (1) and use the property that (AB)t = Bt At, we
get (A-1)t At = At (A-1)t = It = I.
So At is invertible and (At)-1 = (A-1)t.

(b) To prove this we will use the associativity of matrix multiplication.


Now (AB) (B-1 A-1) = [A (BB-1)] A-1 = A A-1 = I.
(B-1 A-1) (AB) = B-1 [(A-1 A) B] = B-1 B = I.

So AB is invertible and (AB)-1 = B-1 A-1.

We now relate matrix invertibility with the linear independence of its rows or
columns. When we say that the m rows of A = [aij] ∈ Mmxn(F) are linearly
independent, what do we mean? Let R1, …, Rm be the m row vectors [a11, a12, …, a1n],
[a21, …, a2n], …, [am1, …, amn], respectively. We say that they are linearly independent
if, whenever a1, …, am ∈ F are such that a1R1 + … + amRm = 0,

then a1 = 0, …, am = 0.

Similarly, the n columns C1, …, Cn of A are linearly independent if b1C1 + … + bnCn =
0 ⇒ b1 = 0, b2 = 0, …, bn = 0, where b1, …, bn ∈ F.

We have the following result.

Theorem 7: Let A ∈ Mn(F). Then the following conditions are equivalent:

(a) A is invertible.
(b) The columns of A are linearly independent.
(c) The rows of A are linearly independent.

Proof: We first prove (a) ⟺ (b), using Theorem 4. Let V be an n-dimensional vector
space over F and B = {e1, …, en} be a basis of V. Let T ∈ L(V, V) be such that [T]B
= A. Then A is invertible iff T is invertible, iff T(e1), T(e2), …, T(en) are linearly
independent (see Unit 5, Theorem 9). Now we define the map

                                           a1
θ : V → Mnx1(F) : θ(a1e1 + … + anen) =     :
                                           an

Before continuing the proof we give an exercise.


E E40) Show that θ is a well-defined isomorphism.

Now let us go on with proving Theorem 7.

Let C1, C2, …, Cn be the columns of A. Then θ(T(ei)) = Ci for all i = 1, …, n. Since
θ is an isomorphism, T(e1), …, T(en) are linearly independent iff C1, C2, …, Cn are
linearly independent. Thus, A is invertible iff C1, …, Cn are linearly independent.
Thus, we have proved (a) ⟺ (b).

Now, the equivalence of (a) and (c) follows because A is invertible ⟺ At is invertible
⟺ the columns of At are linearly independent (as we have just shown) ⟺ the rows of
A are linearly independent (since the columns of At are the rows of A).

So we have shown that (a) ⟺ (c).

Thus, the theorem is proved.

From the following example you can see how Theorem 7 can be useful.

Example 15:

        1 0 1
Let A = 0 1 1  ∈ M3(R).
        1 1 1

Determine whether or not A is invertible.

Solution: let R1, R2, R3 be the rows of A. We will show that they are linearly
independent.

Suppose xR1 + yR2 + zR3 = 0, where x, y, z ∈ R. Then,

x(1,0,1) + y(0,1,1) + z(1,1,1) = (0, 0, 0). This gives us the following equations:

x+z = 0
y+z = 0
x+y+z = 0
On solving these we get x = 0, y = 0 z = 0.

Thus, by Theorem 7, A is invertible.
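A quick way to run this check by machine (our own illustration): by Theorem 7, an n x n matrix is invertible exactly when its rank is n, i.e., when its rows are linearly independent.

    import numpy as np

    A = np.array([[1, 0, 1],
                  [0, 1, 1],
                  [1, 1, 1]])
    print(np.linalg.matrix_rank(A))   # 3, so the rows are linearly independent
                                      # and A is invertible, as in Example 15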

E E41) Check if
                2 0 1
                0 0 1  ∈ M3(Q) is invertible.
                0 3 0

We will now see how we associate a matrix to a change of basis. This association will
be made use of very often in the next block.

3.5.2 Matrix of Change of Basis

Let V be an n-dimensional vector space over F. Let B = {e1, e2, …, en} and B' =
{e'1, e'2, …, e'n} be two bases of V. Since e'j ∈ V, for every j, it is a linear
combination of the elements of B. Suppose

       n
e'j =  Σ aij ei ∀ j = 1, …, n.
      i=1

The n x n matrix A = [aij] is called the matrix of the change of basis from B to B'. It is
denoted by MB'B.

Note that A is the matrix of the transformation T ∈ L(V, V) such that T(ej) = e'j ∀ j =
1, …, n, with respect to the basis B. Since {e'1, …, e'n} is a basis of V, from
Unit 5 we see that T is 1 - 1 and onto. Thus T is invertible. So A is invertible.
Thus, the matrix of the change of basis from B to B' is invertible.

Note: a) MBB = In. This is because, in this case, e'j = ej ∀ j = 1, 2, …, n.

b) MB'B = [I]B',B. This is because

             n
I(e'j) = e'j = Σ aij ei ∀ j = 1, 2, …, n.
            i=1
Now suppose A is any invertible matrix. By Theorem 2, ∃ T ∈ L(V, V) such that
[T]B = A. Since A is invertible, T is invertible. Thus, T is 1 - 1 and onto. Let fi = T(ei)
∀ i = 1, 2, …, n. Then B' = {f1, f2, …, fn} is also a basis of V, and the matrix of
change of basis from B to B' is A.

In the above discussion, we have just proved the following theorem.

Theorem 8: Let B = {e1, e2, …, en} be a fixed basis of V. The mapping B' ↦
MB'B is a 1 - 1 and onto correspondence between the set of all bases of V and the set
of invertible n x n matrices over F.

Let us see an example of how to obtain MB'B.

Example 16: In R2, B = {e1, e2} is the standard basis. Let B' be the basis obtained by
rotating B through an angle θ in the anti-clockwise direction (see Fig. 1). Then B' =
{e'1, e'2}, where e'1 = (cos θ, sin θ), e'2 = (-sin θ, cos θ). Find MB'B.

Solution: e'1 = cos θ (1,0) + sin θ (0,1), and

e'2 = -sin θ (1,0) + cos θ (0,1).

             cos θ  -sin θ
Thus, MB'B =
             sin θ   cos θ
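The sketch below (ours, not the text's) builds this change-of-basis matrix numerically for θ = π/6 and checks that its columns are exactly the coordinates of e'1 and e'2 in the standard basis:

    import numpy as np

    theta = np.pi / 6
    M = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    print(M @ np.array([1.0, 0.0]))          # first column: e'1 = (cos θ, sin θ)
    print(M @ np.array([0.0, 1.0]))          # second column: e'2 = (-sin θ, cos θ)
    print(np.allclose(M.T @ M, np.eye(2)))   # True: rotation matrices are invertible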

Try the following exercise.

E E42) Let B be the standard basis of R3 and B' be another basis such that

        0 1 1
MB'B =  1 1 0
        0 0 3

What are the elements of B'?
What happens if we change the basis more than once? The following theorem tells us
something about the corresponding matrices.

Theorem 9: Let B, B', B'' be three bases of V. Then MB'B MB''B' = MB''B.

Proof: Now, MB'B MB''B' = [I]B',B [I]B'',B'

= [IoI]B'',B = MB''B.

An immediate useful consequence is

Corollary: Let B, B' be two bases of V. Then MB'B MBB' = I = MBB' MB'B.

That is, (MB'B)-1 = MBB'.

Proof: By Theorem 9,
MB'B MBB' = MBB = I.

Similarly, MBB' MB'B = MB'B' = I.

But, how does the change of basis affect the matrix associated to a given linear
transformation? In sec. 7.2 we remarked that the matrix of a linear transformation
depends upon the pair of bases chosen. The relation between the matrices of a
transformation with respect to two pairs of bases can be described as follows.

Theorem 10: Let T ∈ L(U, V). Let B1 = {e1, …, en} and B2 = {f1, …, fm}
be a pair of bases of U and V, respectively.

Let B'1 = {e'1, …, e'n}, B'2 = {f'1, …, f'm} be another pair of bases of U and
V, respectively. Then

[T]B'1,B'2 = (MB'2B2)-1 [T]B1,B2 MB'1B1.

Proof: [T]B'1,B'2 = [IV o T o IU]B'1,B'2 = [IV]B2,B'2 [T]B1,B2 [IU]B'1,B1

(where IU = identity map on U and IV = identity map on V)

= (MB'2B2)-1 [T]B1,B2 MB'1B1, by the corollary to Theorem 9.

Now, a corollary to Theorem 10, which will come in handy in the next block.

Corollary: Let T ∈ L(V, V) and B, B' be two bases of V. Then [T]B' = P-1 [T]B P,
where P = MB'B.

Proof: [T]B' = (MB'B)-1 [T]B MB'B = P-1 [T]B P, by Theorem 10 and the corollary to
Theorem 9. Let us now recapitulate all that we have covered in this unit.
5.0 SUMMARY

We briefly sum up what has been done in this unit.

1) We defined matrices and explained the method of associating matrices with


linear transformations.
2) We showed what we mean by sums of matrices and multiplication of matrices
by scalars.
3) We proved that Mmxn (F) is a vector space of dimension mn over F.
4) We defined the transpose of a matrix, the conjugate of a complex matrix, the
conjugate transpose of a complex matrix, a diagonal matrix, identity matrix,
scalar matrix and lower and upper triangular matrices.
5) We defined the multiplication of matrices and showed its connection with the
composition of linear transformations. Some properties of the matrix product
were also listed and used.
6) The concept of an invertible matrix was explained.
7) We defined the matrix of a change of basis, and discussed the effect of change
of bases on the matrix of a linear transformation.

3.6 Solutions/Answers

E1) a) You want the elements in the 1st row and the 2nd column. They are 2
and 5, respectively.
b) [0 0 7]
2
c) The second column of A is 5
0

2
The first column of B is also 5
0

d) B only has 3 rows. Therefore, there is no 4th row of B.

E2) There are infinitely many answers. We give two:

1 2 1 0
3 4 2 0
5 6 and 3 0
7 8 4 0

E3) No. Because they are of different sizes.

E4) Suppose B'1 = {(1, 0, 1), (0, 2, -1), (1, 0, 0)} and B'2 = {(0, 1), (1, 0)}.
Then T(1, 0, 1) = (1, 0) = 0.(0, 1) + 1.(1, 0)
T(0, 2, -1) = (0, 2) = 2.(0, 1) + 0.(1, 0)
T(1, 0, 0) = (1, 0) = 0.(0, 1) + 1.(1, 0).

                0 2 0
∴ [T]B'1,B'2 =
                1 0 1

E5) B1 = {e1, e2, e3} B2 = {f1, f2} are the standard bases (given in Example 3).
T (e1) = T (1, 0, 0) = (1, 2) = f1 + 2f2
T (e2) = T (0, 1, 0) = (2, 3) = 2f1 + 3f2
T (e3) = T (0, 0, 1) = (2, 4) = 2f1 + 4f2.

1 2 2
:. [T]B1.B2 = 2 3 4

E6) T(1, 0, 0) = (1, 2) = 1.(1, 2) + 0.(2, 3)

T(0, 1, 0) = (2, 3) = 0.(1, 2) + 1.(2, 3)
T(1, -2, 1) = (-1, 0) = 3.(1, 2) - 2.(2, 3)

                1 0  3
∴ [T]B'1,B'2 =
                0 1 -2

E7) Let B = {1, t, t2, t3}. Then

D(1) = 0 = 0.1 + 0.t + 0.t2 + 0.t3
D(t) = 1 = 1.1 + 0.t + 0.t2 + 0.t3
D(t2) = 2t = 0.1 + 2.t + 0.t2 + 0.t3
D(t3) = 3t2 = 0.1 + 0.t + 3.t2 + 0.t3.
Therefore, [D]B is the given matrix.

E8) We know that


T (e1) = f1
T (e2) = f1 + f 2
T (e3) = f2

Therefore, for any (x, y, z)  R3.

T(x, y, z) = T(xe1 + ye2 + ze3) = xT(e1) + yT(e2) + zT(e3)

= xf1 + y(f1 + f2) + zf2 = (x + y)f1 + (y + z)f2
= (x + y, y + z).
That is, T : R3 → R2 : T(x, y, z) = (x + y, y + z).

E9) We are given that

T(1) = 0.1 + 1.i = i
T(i) = (-1).1 + 0.i = -1
∴, for any a + ib ∈ C, we have
T(a + ib) = aT(1) + bT(i) = ai - b.

E10) a) Since [1 2] is of size 1 x 2 and the other matrix is of size 2 x 1,
the sum of these matrices is not defined.

b) Both matrices are of the same size, namely, 2 x 2. Their sum is the matrix

1 + (-1)   0 + 0        0 0
                    =
0 + 0     1 + (-1)      0 0

        1     3        0     0
E11) 3     =     ,  3     =
        2     6        1     3

        1    0        1      3
and 3 (    +   ) = 3      =
        2    1        3      9

               1   0        1       0
Notice that 3(   +   ) = 3     + 3
               2   1        2       1

E12) B1 = {(1, 0), (0, 1)}, B2 = {(1, 0, 0), (0, 1, 0), (0, 0, 1)}.

Now S(1, 0) = (1, 0, 0)
S(0, 1) = (0, 0, 1)

             1 0
∴ [S]B1,B2 = 0 0 , a 3 x 2 matrix.
             0 1

Again, T(1, 0) = (0, 1, 0)

T(0, 1) = (0, 0, 1)

             0 0
∴ [T]B1,B2 = 1 0 , a 3 x 2 matrix.
             0 1
∴ [S + T]B1,B2 = [S]B1,B2 + [T]B1,B2

  1 0   0 0   1 0
= 0 0 + 1 0 = 1 0
  0 1   0 1   0 2

                               1 0     α 0
And [αS]B1,B2 = α[S]B1,B2 = α  0 0  =  0 0 , for any α ∈ R.
                               0 1     0 α

E13) We will prove (v) and (vi) here. You can prove (vii) and (viii) in a similar
way.

v) α(A + B) = α([aij + bij]) = [α(aij + bij)] = [αaij + αbij]

= [αaij] + [αbij] = αA + αB.

vi) Prove it using the fact that (α + β)aij = αaij + βaij.

E14) Since dim M2x3 (R) is 6, any linearly independent subset can have 6 elements,
at most.

E15) Let α, β ∈ R such that α[1, 0] + β[1, -1] = [0, 0].

Then [α + β, -β] = [0, 0]. Thus, α = 0, β = 0.

∴ the matrices are linearly independent.

             1 0 … 0          0 1 … 0
E16)  E11 =  0 0 … 0 ,  E12 = 0 0 … 0 , and so on.
             … … … …          … … … …
             0 0 … 0          0 0 … 0

Now any m x n matrix A = [aij] = a11E11 + a12E12 + … + amnEmn (for example, in the 2
x 2 situation,

2 3     1 0     0 1     0 0     0 0
    = 2     + 3     + 0     + 1      ).
0 1     0 0     0 0     1 0     0 1

Thus, {Eij | i = 1, …, m, j = 1, …, n} generates Mmxn(F). Also, if αij, i = 1, …, m,
j = 1, …, n, are scalars such that α11E11 + α12E12 + … + αmnEmn = 0,

we get   α11 α12 … α1n     0 … 0
         α21 α22 … α2n  =  0 … 0
         :   :       :     :    :
         αm1 αm2 … αmn     0 … 0

Therefore, αij = 0 ∀ i, j.

Hence, the given set is linearly independent. ∴ It is a basis of Mmxn(F). The number
of elements in this basis is mn.

∴ dim Mmxn(F) = mn.

          1 2
E17) At =      . In this case At = A.
          2 0

E18) b) A = [aij]. :. (A)t = [bij], where


bij = (j ,i)th element of A = aji
= a times the (j, i)th element of A
= a times the (i, j)th element of At
= (i, j)th element of At.

:. (A)t = A.

E19) Let A be an m x n matrix. Then At is an n x m matrix.


:., for A = At, their sizes must be the same, that is, m = n.
:. A must be a square matrix.

             a c
E20) Let A =     , be a square matrix over a field F.
             b d

           a b
Then At =
           c d

            a+a  c+b     2a   b+c
∴ A + At =            =            , and
            b+c  d+d     b+c  2d

           0    c-b     0     -(b-c)
A - At =             =
           b-c  0       b-c     0

You can check that (A + At)t = A + At and (A - At)t = -(A - At).

∴ A + At is symmetric and A - At is skew-symmetric.
E21)  -i  2
       3 -i

E22) The size of Āt is the same as the size of At. ∴ A = Āt implies that the sizes of
A and At are the same. ∴ A is a square matrix.

E23) I : Rn → Rn : I(x1, …, xn) = (x1, …, xn).

Then, for any basis B = {e1, …, en} of Rn, I(ei) = ei.

         1 0 … 0
         0 1 … 0
∴ [I]B = :  :    :  = In.
         0 0 … 1

E24) Since A is upper triangular, all its elements below the diagonal are zero.
Again, since A = At, a lower triangular matrix, all the entries of A above the
diagonal are zero:., all the off-diagonal entries of A are zero. :. A is a diagonal
matrix.

E25) Let A be a skew-symmetric matrix. Then A = -At. Therefore,

a11 a12 … a1n     -a11 -a21 … -an1
a21 a22 … a2n  =  -a12 -a22 … -an2
:   :       :      :    :        :
an1 an2 … ann     -a1n -a2n … -ann

∴, for any i = 1, …, n, aii = -aii ⇒ 2aii = 0 ⇒ aii = 0.

                                                               0 1
The converse is not true. For example, the diagonal entries of      are zero,
                                                               2 0
but this matrix is not skew-symmetric.

E26) [1 x 1 + 0 x 2 + 0 x 3] = [1]

          1 x 1 + 1 x 1   1 x 0 + 1 x 1     2 1
E27) AB =                                =
          0 x 1 + 1 x 1   0 x 0 + 1 x 1     1 1

      1 0   1 1     1 1
BA =              =
      1 1   0 1     1 2
E28) C + D is not defined.

CD is a 2 x 2 matrix and DC is a 3 x 3 matrix. ∴ CD ≠ DC.

      1x0 + 1x1 + 0x0   1x1 + 1x1 + 0x0     1 2
CD =                                     =
      0x0 + 1x1 + 0x0   0x1 + 1x1 + 0x0     1 1

      0 1   1 1 0     0x1+1x0  0x1+1x1  0x0+1x0     0 1 0
DC =  1 1          =  1x1+1x0  1x1+1x1  1x0+1x0  =  1 2 0
      0 0   0 1 0     0x1+0x0  0x1+0x1  0x0+0x0     0 0 0

              2 1                 2 1   2 1     5 4
E29) A + B =      . ∴ (A + B)2 =              =
              1 2                 1 2   1 2     4 5

           1 1   1 1     1 2
Also A2 =              =
           0 1   0 1     0 1

      1 0   1 0     1 0
B2 =              =
      1 1   1 1     2 1

         2 1     4 2
2AB = 2        =
         1 1     2 2

                   1 2   4 2   1 0     6 4
∴ A2 + 2AB + B2 =      +     +      =
                   0 1   2 2   2 1     4 4

∴ (A + B)2 ≠ A2 + 2AB + B2.

           -bd   b    -bd   b      b2d2 - b2d2    -b2d + b2d       0 0
E30) A2 =                       =                              =
           -d2b  db   -d2b  db     b2d3 - b2d3    -d2b2 + d2b2     0 0

      1 0 0   x     x
E31)  0 2 0   y  =  2y
      0 0 3   z     3z

          1 0 0
[x y z]   0 2 0   = [x 2y 3z]
          0 0 3

                  1 2        1 2 3 0
E32) We take A =  0 0 ,  B =          . Then
                  3 1        4 5 1 1

      9 12  5 2
AB =  0  0  0 0 . You can see that the 2nd row of AB is zero.
      7 11 10 1

             0 0 0           1 0 0
E33) [S]B =  1 0 0 , [T]B =  0 0 0
             0 1 0           0 1 0

                0 0 0
∴ [S]B [T]B =   1 0 0
                0 0 0

                 0 0 0
Also, [SoT]B =   1 0 0  = [S]B [T]B.
                 0 0 0

E34) (A + B)2 = (A + B)(A + B) = A(A + B) + B(A + B) (by distributivity)

= A2 + AB + BA + B2 (by distributivity).

           -1  -8 -10               -2 -16 -20
E35) AB =   1  -2  -5 . ∴ 2(AB) =    2  -4 -10
            9  22  15               18  44  30

                             4 -2
                                      1 -2 -5
On the other hand, (2A)B =   2  0
                                      3  4  0
                            -6  8

   -2 -16 -20
=   2  -4 -10
   18  44  30

∴ 2(AB) = (2A)B.

            0  -7 -3               0 -11  0
E36) AB = -11  -4  6 . ∴ (AB)t =  -7  -4  0
            0   0  0              -3   6  0

              1  2  4    2  1  0      0 -11  0
Also, BtAt = -4 -1  0   -1  0  0  =  -7  -4  0
              0  3 -2    0 -3  0     -3   6  0

= (AB)t.

E37) First, suppose AB is symmetric. Then AB = (AB)t = Bt At = BA, since A


and B are symmetric.

Conversely, suppose AB = BA. Then

(AB)t = BtAt = BA = AB, so that AB is symmetric.

E38) Let A = diag(d1, …, dn), B = diag(e1, …, en). Then

      d1 0 … 0     e1 0 … 0
AB =  0 d2 … 0     0 e2 … 0
      :  :     :   :  :     :
      0  0 … dn    0  0 … en

   d1e1  0    …  0
=   0    d2e2 …  0
    :    :       :
    0    0    … dnen

= diag(d1e1, d2e2, …, dnen).


                                           a b
E39) Suppose it is invertible. Then ∃ A =      such that
                                           c d

1  0   a b     1 0     a b   1  0
             =      =
2 -1   c d     0 1     c d   2 -1

                                     1  0
Solving these equations gives us A =      , which is the same as the given matrix.
                                     2 -1

This shows that the given matrix is invertible and, in fact, it is its own inverse.

E40) Firstly, θ is a well-defined map. Secondly, check that θ(v1 + v2) = θ(v1) +
θ(v2), and θ(αv) = αθ(v), for v, v1, v2 ∈ V and α ∈ F. Thirdly, show that θ(v) =
0 ⇒ v = 0, that is, θ is 1 - 1. Then, by Unit 5 (Theorem 10), you have shown
that θ is an isomorphism.

E41) We will show that its columns are linearly independent over Q. Now, if x, y, z
∈ Q are such that

   2       0       1     0
x  0  + y  0  + z  1  =  0 , we get the equations
   0       3       0     0

2x + z = 0

z = 0

3y = 0
On solving them we get x = 0, y = 0, z = 0.

∴ the columns of the given matrix are linearly independent, and hence, by Theorem 7,
the matrix is invertible.

E42) Let B = {e1, e2, e3} B’ = {f1, f2, f3}. Then

f1 = 0e1 + 1e2 + 0e3 = e2


f2 = e1 + e 2
f3 = e1 + 3e3
:. B’ = {e2, e1 + e2, e1 + 3e3}.
UNIT 4 MATRICES - II

1.0 Introduction
2.0 Objectives
3.0 Main Content
3.1 Rank of a Matrix
3.2 Elementary Operations
3.3 Elementary Operations on a Matrix
3.4 Row-reduced Echelon Matrices
3.5 Applications of Row-reduction
3.6 Inverse of a Matrix
3.7 Solving a System of Linear Equations
4.0 Conclusion
5.0 Summary
6.0 Tutor-Marked Assignment
7.0 References/Further Reading

1.0 INTRODUCTION

In Unit 3 we introduced you to a matrix and showed you how a system of linear
equations can give us a matrix. An important reason for which linear algebra arose is
the theory of simultaneous linear equations. A system of simultaneous linear
equations can be translated into a matrix equation, and solved by using matrices.

The study of the rank of a matrix is a natural forerunner to the theory of simultaneous
linear equations. Because, it is in terms of rank that we can find out whether a
simultaneous system of equations has a solution or not. In this unit we start by
studying the rank and inverse of a matrix. Then we discuss row operations on a
matrix and use them for obtaining the rank and inverse of a matrix. Finally, we apply
this knowledge to determine the nature of solutions of a system of linear equations.
The method of solving a system of linear equations that we give here is by “successive
elimination of variables”. It is also called the Gaussian elimination process.

With this unit we finish Block 2. In the next block we will discuss concepts that are
intimately related to matrices.

2.0 OBJECTIVES

After reading this unit, you should be able to

 Obtain the rank of a matrix;


 Reduce a matrix to the echelon form;
 Obtain the inverse of a matrix by row-reduction;
 Solve a system of simultaneous linear equations by the method of successive
elimination of variables.
3.0 MAIN CONTENT

3.1 Rank of a Matrix

Consider any m x n matrix A over a field F. We can associate two vector spaces with
it, in a very natural way. Let us see what they are. Let A = [aij]. A has m rows, say,
R1, R2, …, Rm, where R1 = (a11, a12, …, a1n), R2 = (a21, a22, …, a2n), …, Rm = (am1,
am2, …, amn).

                            R1
Thus, Ri ∈ Fn ∀ i, and A =  R2
                            :
                            Rm

The subspace of Fn generated by the row vectors R1, …. Rm of A, is called the row
space of A, and is denoted by RS (A).
                  1 0 0
Example 1: If A =       , does (0, 0, 1) ∈ RS(A)?
                  0 1 0

Solution: The row space of A is the subspace of R3 generated by (1, 0, 0) and (0, 1, 0).
Therefore, RS(A) = {(a, b, 0) | a, b ∈ R}. Therefore, (0, 0, 1) ∉ RS(A).

The dimension of the row space of A is called the row rank of A, and is denoted by
pr(A).

Thus, pr(A) = maximum number of linearly independent rows of A.

In Example 1, pr(A) = 2 = number of rows of A. But consider the next example.

                  1 0
Example 2: If A = 0 1 , find pr(A).
                  2 0

Solution: The row space of A is the subspace of R2 generated by (1,0), (0,1) and (2,0).
But (2,0) already lies in the vector space generated by (1,0) and (0,1), since (2,0) = 2
(1,0). Therefore, the row space of A is generated by the linearly independent vectors (1,0)
and (0,1). Thus, pr(A) = 2.

So, in Example 2, pr(A) < number of rows of A.

In general, for an m x n matrix A, RS(A) is generated by m vectors. Therefore, pr(A) ≤
m. Also, RS(A) is a subspace of Fn and dimF Fn = n. Therefore, pr(A) ≤ n.

Thus, for any m x n matrix A, 0 ≤ pr(A) ≤ min(m, n).


E E1) Show that A = 0 ⟺ pr(A) = 0.

Just as we have defined the row space of A, we can define the column space of A.
Each column of A is an m-tuple, and hence belongs to Fm. We denote the columns of A
by C1, …, Cn. The subspace of Fm generated by {C1, …, Cn} is called the column
space of A and is denoted by CS(A). The dimension of CS(A) is called the column
rank of A, and is denoted by pc(A). Again, since CS(A) is generated by n vectors and
is a subspace of Fm, we get 0 ≤ pc(A) ≤ min(m, n).

                                                   1 0 1
E E2) Obtain the column rank and row rank of A =
                                                   0 2 1

In E2 you may have noticed that the row and column ranks of A are equal. In fact, in
Theorem 1, we prove that pr(A) = pc(A), for any matrix A. But first, we prove a lemma.
lemma.

Lemma 1: Let A, B be two matrices over F such that AB is defined. Then

a) CS(AB) ⊆ CS(A),
b) RS(AB) ⊆ RS(B).

Thus, pc(AB) ≤ pc(A) and pr(AB) ≤ pr(B).


Proof: (a) Suppose A = [aij] is an m x n matrix and B = [bjk] is an n x p matrix. Then,
from Sec. 3.4 of Unit 3, you know that the kth column of C = AB is

        n
 c1k    Σ a1j bjk
       j=1
 c2k       :
  :  =     :        =  b1k C1 + b2k C2 + … + bnk Cn,
        n
 cmk    Σ amj bjk
       j=1

where C1, …, Cn are the columns of A.

Thus, the columns of AB are linear combinations of the columns of A. Thus, the
columns of AB lie in CS(A). So, CS(AB) ⊆ CS(A).
Hence, pc(AB) ≤ pc(A).

b) By a similar argument as above, we get RS(AB) ⊆ RS(B), and so, pr(AB) ≤
pr(B).

E E3) Prove (b) of Lemma 1.

We will now use Lemma 1 for proving the following theorem.

Theorem 1: pc(A) = pr(A), for any matrix A over F.

Proof: Let A ∈ Mmxn(F). Suppose pr(A) = r and pc(A) = t.

Now, RS(A) = [{R1, R2, …, Rm}], where R1, R2, …, Rm are the rows of A. Let {e1, e2,
…, er} be a basis of RS(A). Then each Ri is a linear combination of e1, …, er, for i
= 1, …, m. Let

      r
Ri =  Σ bij ej, i = 1, 2, …, m, where bij ∈ F for 1 ≤ i ≤ m, 1 ≤ j ≤ r.
     j=1

We can write these equations in matrix form as

 R1      b11 … b1r     e1
 :    =   :       :     :
 Rm      bm1 … bmr     er

So, A = BE, where B = [bij] is an m x r matrix and E is the r x n matrix with rows e1,
e2, …, er. (Remember, ei ∈ Fn for each i = 1, …, r.)
So, t = pc(A) = pc(BE) ≤ pc(B), by Lemma 1,
≤ min(m, r)
≤ r.

Thus, t ≤ r.

Just as we got A = BE above, we get A = [f1, …, ft]D, where {f1, …, ft} is a basis of the
column space of A and D is a t x n matrix. Thus, r = pr(A) ≤ pr(D) ≤ t, by Lemma 1.
So we get t ≤ r and r ≤ t. This gives us r = t.

Theorem 1 allows us to make the following definition.

Definition: The integer pc(A) (= pr(A)) is called the rank of A, and is denoted by p(A).

You will see that Theorem 1 is very helpful if we want to prove any fact about p(A). If
it is easier to deal with the rows of A, we can prove the fact for pr(A). Similarly, if it is
easier to deal with the columns of A, we can prove the fact for pc(A). While proving
Theorem 3 we will use this facility that Theorem 1 gives us.
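As an aside (our own illustration, not the text's), Theorem 1 is easy to observe numerically: numpy reports the same rank for A and its transpose, since the row rank of A is the column rank of At.

    import numpy as np

    A = np.array([[1, 0, 1],
                  [0, 2, 1]])
    print(np.linalg.matrix_rank(A))     # 2
    print(np.linalg.matrix_rank(A.T))   # 2 as well: row rank = column rank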

Use Theorem 1 to solve the following exercises.

E E4) If A, B are two matrices such that AB is defined, then show that p(AB) ≤
min(p(A), p(B)).

E E5) Suppose C ≠ 0 ∈ Mmx1(F), and R ≠ 0 ∈ M1xn(F). Then show that the rank
of the m x n matrix CR is 1. (Hint: use E4.)
Does the term ‘rank’ seem familiar to you? Do you remember studying about the rank
of a linear transformation in Unit 2? We now see if the rank of a linear transformation
is related to the rank of its matrix. The following theorem brings forth the precise
relationship. (Go through Sec. 2.3 before going further.)
Theorem 2: Let U, V be vector spaces over F of dimensions n and m, respectively. Let
B1 be a basis of U and B2 be a basis of V. Let T ∈ L(U, V).

Then R(T) ≅ CS([T]B1,B2).

Proof: Let B1 = {e1, …, en} and B2 = {f1, f2, …, fm}. As in the proof of Theorem 7 of Unit
3, θ : V → Mmx1(F) : θ(v) = coordinate vector of v with respect to the basis B2, is an
isomorphism.

Now, R(T) = [{T(e1), T(e2), …, T(en)}]. Let A = [T]B1,B2 have C1, C2, …, Cn as its
columns.
Then CS(A) = [{C1, C2, …, Cn}]. Also, θ(T(ei)) = Ci ∀ i = 1, …, n.

Thus, θ : R(T) → CS(A) is an isomorphism, and R(T) ≅ CS(A).

In particular, dim R(T) = dim CS(A) = p(A).

That is, rank(T) = p(A).

Theorem 2 leads us to the following corollary. It says that pre-multiplying or post-
multiplying a matrix by invertible matrices does not alter its rank.

Corollary 1: Let A be an m x n matrix. Let P, Q be m x m and n x n invertible matrices,
respectively.

Then p(PAQ) = p(A).

Proof: Let T ∈ L(U, V) be such that [T]B1,B2 = A. We are given the invertible matrices
Q and P-1. Therefore, by Theorem 8 of Unit 3, there exist bases B'1 and B'2 of U and
V, respectively, such that Q = MB'1B1 and P-1 = MB'2B2.

Then, by Theorem 10 of Unit 3,

[T]B'1,B'2 = (MB'2B2)-1 [T]B1,B2 MB'1B1 = PAQ.

In other words, we can change the bases suitably so that the matrix of T with respect
to the new bases is PAQ.

So, by Theorem 2, p(PAQ) = rank(T) = p(A). Thus, p(PAQ) = p(A).

                1  2  3        0 1       3 0 0
E E6) Take A =           , P =      , Q = 0 2 0 . Obtain PAQ
                0 -1 -2       -1 0       0 0 1

and show that p(PAQ) = p(A).

Now we state and prove another corollary to Theorem 2. This corollary is useful
because it transforms any matrix into a very simple matrix, a matrix whose entries are
1 and 0 only.

Corollary 2: Let A be an m x n matrix with rank r. Then ∃ invertible matrices P and Q
such that
       Ir 0
PAQ =
       0  0

Proof: Let T ∈ L(U, V) be such that [T]B1,B2 = A. Since p(A) = r, rank(T) = r. ∴
nullity(T) = n - r (Unit 2, Theorem 5).

Let {u1, u2, …, un-r} be a basis of Ker T. We extend this to form the basis

B'1 = {u1, u2, …, un-r, un-r+1, …, un} of U. Then {T(un-r+1), …, T(un)} is a basis of R(T) (see
Unit 5, proof of Theorem 5). Extend this set to form a basis B'2 of V, say B'2 =
{T(un-r+1), …, T(un), v1, …, vm-r}. Let us reorder the elements of B'1 and write it as
B'1 = {un-r+1, …, un, u1, …, un-r}.

                              Ir          0r x (n-r)
Then, by definition, [T]B'1,B'2 =
                              0(m-r) x r  0(m-r) x (n-r)

where 0s x t denotes the zero matrix of size s x t. (Remember that u1, …, un-r ∈ Ker T.)

              Ir 0
Hence, PAQ =
              0  0

where Q = MB'1B1 and P = (MB'2B2)-1, by Theorem 10 of Unit 3.

       Ir 0
Note:        is called the normal form of the matrix A.
       0  0

Consider the following example, which is the converse of E5.

Example 3: If A is an m x n matrix of rank 1, show that ∃ C ≠ 0 in Mmx1(F) and R ≠ 0 in
M1xn(F) such that A = CR.

Solution: By Corollary 2 (above), ∃ P, Q such that

       1 0 … 0
       0 0 … 0
PAQ =  : :     :  , since p(A) = 1,
       0 0 … 0

   1
   0
=  :  [1 0 … 0].
   0

                              1
                              0
∴ A = P-1 (PAQ) Q-1 = ( P-1   :  ) ( [1 0 … 0] Q-1 ) = CR,
                              0

           1
           0
where C = P-1  :  ≠ 0 and R = [1 0 … 0] Q-1 ≠ 0.
           0

E E7) What is the normal form of diag(1, 2, 3)?

The solution of E7 is a particular case of a general phenomenon: the normal form of
an n x n invertible matrix is In.

Let us now look at some ways of transforming a matrix by playing around with its
rows. The idea is to get more and more entries of the matrix to be zero. This will help
us in solving systems of linear equations.

3.2 Elementary Operations

Consider the following set of 2 equations in 3 unknowns x, y and z:

x+y+z=1
2x + 3z = 0
How can you express this system of equations in matrix form?
One way is

 1 1 1    x      1
          y   =
 2 0 3    z      0

In general, if a system of m linear equations in n variables, x1, …, xn, is

a11x1 + a12x2 + … + a1nxn = b1
            :
am1x1 + am2x2 + … + amnxn = bm

where aij, bi ∈ F ∀ i = 1, …, m and j = 1, …, n, then this can be expressed as

AX = B,

                          x1        b1
where A = [aij]m x n, X =  :  , B =  :
                          xn        bm

In this section we will study methods of changing the matrix A to a very simple form
so that we can obtain an immediate solution to the system of linear equations AX = B.
For this purpose, we will always be multiplying A on the left or the right by a suitable
matrix. In effect, we will be applying elementary row or column operations on A.

3.3 Elementary Operations on a Matrix

Let A be an m x n matrix. As usual, we denote its rows by R1, …, Rm, and columns by
C1, …, Cn. We call the following operations elementary row operations:

1) Interchanging Ri and Rj, for i ≠ j.

2) Multiplying Ri by some a ∈ F, a ≠ 0.
3) Adding aRj to Ri, where i ≠ j and a ∈ F.

We denote the operation (1) by Rij, (2) by Ri(a), and (3) by Rij(a).


                    1 2 3
For example, if A =
                    0 1 2

                0 1 2
then R12(A) =          (interchanging the two rows).
                1 2 3

                 1 2 3            1 2 3
Also, R2(3)(A) =                =
                 0x3  1x3  2x3    0 3 6

                 1+0x2  2+1x2  3+2x2     1 4 7
and R12(2)(A) =                       =
                 0      1      2         0 1 2
              0 0 1
E E8) If A =  1 0 0 , what is
              0 1 0

a) R21(A)   b) R32 o R21(A)   c) R13(-1)(A)?

Just as we defined the row operations, we can define the three column
operations as follows:
1) Interchanging Ci and Cj, for i ≠ j, denoted by Cij.
2) Multiplying Ci by a ∈ F, a ≠ 0, denoted by Ci(a).
3) Adding aCj to Ci, where i ≠ j and a ∈ F, denoted by Cij(a).

                    1 3
For example, if A =
                    2 4

                     1 13
then C21(10)(A) =
                     2 24

                    31 3
and C12(10)(A) =
                    42 4

We will now prove a theorem which we will use in Sec. 3.4 for obtaining the rank
of a matrix easily.
Theorem 3: Elementary operations on a matrix do not alter its rank.

Proof: The way we will prove the statement is to show that the row space remains
unchanged under row operations and the column space remains unchanged under
column operations. This means that the row rank and the column rank remain
unchanged. This immediately shows, by Theorem 1, that the rank of the matrix
remains unchanged.

Now, let us show that the row space remains unaltered. Let R1, …, Rm be the rows of a
matrix A. Then the row space of A is generated by {R1, …, Ri, …, Rj, …, Rm}. On applying
Rij to A, the rows of A remain the same. Only their order gets changed. Therefore,
the row space of Rij(A) is the same as the row space of A.
If we apply Ri(a), for a ∈ F, a ≠ 0, then any linear combination a1R1 + … + aiRi + … +
amRm equals a1R1 + … + (ai/a)(aRi) + … + amRm, which is a linear combination of
R1, …, aRi, …, Rm.

Thus, [{R1, …, Ri, …, Rm}] = [{R1, …, aRi, …, Rm}]. That is, the row space of A is the same
as the row space of Ri(a)(A).

If we apply Rij(a), for a ∈ F, then any linear combination

b1R1 + … + biRi + … + bjRj + … + bmRm = b1R1 + … + bi(Ri + aRj) + … + (bj - bia)Rj + … + bmRm.
Thus, [{R1, …, Rm}] = [{R1, …, Ri + aRj, …, Rj, …, Rm}].

Hence, the row space of A remains unaltered under any elementary row operation.

We can similarly show that the column space remains unaltered under elementary
column operations.

Elementary operations lead to the following definition.

Definition: A matrix obtained by subjecting In to an elementary row or column
operation is called an elementary matrix.

                           1 0 0     0 1 0
For example, C12(I3) = C12 0 1 0  =  1 0 0  is an elementary matrix.
                           0 0 1     0 0 1

Since there are six types of elementary operations, we get six types of elementary
matrices, but not all of them are different.

E E9) Check that R23(I4) = C23(I4,)R2(2) (14) = C2(2)(I4) and R12(3)(I4) = C21(3)(I4)

In general, Rij(In) = Cij(In), Ri(a)(In) = Ci(a)(In) for a ≠ 0, and Rij(a)(In) = Cji(a)(In)

for i ≠ j and a ∈ F.

Thus, there are only three types of elementary matrices. We denote

Rij(I) = Cij(I) by Eij,

Ri(a)(I) = Ci(a)(I) (if a ≠ 0) by Ei(a), and

Rij(a)(I) = Cji(a)(I) by Eij(a), for i ≠ j, a ∈ F.
Eij, Ei(a) and Eij(a) are called the elementary matrices corresponding to the pairs Rij
and Cij,
Ri(a) and Ci(a), Rij(a) and Cji(a), respectively.

Caution: Eij(a) corresponds to Cji(a), and not Cij(a).

Now, see what happens to the matrix

     0 1 2
A =  3 0 0   if we multiply it on the left by
     2 1 0

       0 1 0
E12 =  1 0 0 . We get
       0 0 1

0 1 0   0 1 2     3 0 0
1 0 0   3 0 0  =  0 1 2  = R12(A).
0 0 1   2 1 0     2 1 0

Similarly, AE12 = C12(A).

                          1 0 0   0 1 2     0 1 2
Again, consider E3(2)A =  0 1 0   3 0 0  =  3 0 0  = R3(2)(A).
                          0 0 2   2 1 0     4 2 0

Similarly, AE3(2) = C3(2)(A).

                    1 0 5   0 1 2     10 6 2
Finally, E13(5)A =  0 1 0   3 0 0  =   3 0 0  = R13(5)(A).
                    0 0 1   2 1 0      2 1 0

                0 1 2   1 0 5     0 1  2
But, AE13(5) =  3 0 0   0 1 0  =  3 0 15  = C31(5)(A).
                2 1 0   0 0 1     2 1 10

What you have just seen are examples of a general phenomenon. We will now state this
general result formally. (Its proof is slightly technical, and so, we skip it.)
Theorem 4: For any matrix A,

a) Rij(A) = EijA
b) Ri(a)(A) = Ei(a)A, for a ≠ 0
c) Rij(a)(A) = Eij(a)A
d) Cij(A) = AEij
e) Ci(a)(A) = AEi(a), for a ≠ 0
f) Cji(a)(A) = AEij(a)
In (f) note the change of indices i and j.

An immediate corollary to this theorem shows that all the elementary matrices are
invertible (see Sec. 7.6).

Corollary: An elementary matrix is invertible. In fact,

a) EijEij = I,
b) Ei(a-1)Ei(a) = I, for a ≠ 0.
c) Eij (-a) Eij(a) = I.

Proof: We prove (a) only and leave the rest to you (see E10).
Now, from Theorem 4,

EijEij = Rij(Eij) = Rij(Rij(I)) = I, by definition of Rij.

E E10) Prove (b) and (c) of the corollary above.

The corollary tells us that the elementary matrices are invertible and that the inverse of an
elementary matrix is also an elementary matrix of the same type.
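Theorem 4 is also easy to see in action on a machine. Here is a small numpy sketch (our own, not the text's): each elementary matrix is built by applying a row operation to the identity, and left-multiplying by it performs that operation on A.

    import numpy as np

    A = np.array([[0, 1, 2],
                  [3, 0, 0],
                  [2, 1, 0]])

    E12 = np.eye(3)[[1, 0, 2]]            # R12(I): interchange rows 1 and 2
    E3_2 = np.diag([1.0, 1.0, 2.0])       # R3(2)(I): multiply row 3 by 2
    E13_5 = np.eye(3); E13_5[0, 2] = 5    # R13(5)(I): add 5 x (row 3) to row 1

    print(E12 @ A)     # R12(A)
    print(E3_2 @ A)    # R3(2)(A)
    print(E13_5 @ A)   # R13(5)(A)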

E E11) Actually multiply the two 4 x 4 matrices E13(-2) and E13(2) to get I4.
transformed to by applying elementary operations.

3.4 Row-reduced Echelon Matrices

Consider the matrix


1 0 9
0 1 0
0 0 1
0 0 0

In this matrix the three non-zero rows come before the zero row, and the first non-zero
entry in each non-zero row is 1. Also, below this 1 there are only zeros. This type of matrix
has a special name, which we now give.

Definition: An m x n matrix A is called a row-reduced echelon matrix if

a) the non-zero rows come before the zero rows,

b) in each non-zero row, the first non-zero entry is 1, and
c) the first non-zero entry in every non-zero row (after the first row) is to the
right of the first non-zero entry in the preceding row.

    1 1 2
Is         a row-reduced echelon matrix? Yes. It satisfies all the conditions of the
    0 1 0
definition. On the other hand,

0 0 0     2 1 0     0 1 0
      ,         ,
0 0 1     0 0 1     1 1 1

are not row-reduced echelon matrices, since they violate conditions (a), (b) and (c), respectively.
The matrix
0 1 3 4 9 7 8 0 -1 0 1
0 0 0 0 1 5 6 10 2 0 0
0 0 0 0 0 0 0 1 7 0 12
0 0 0 0 0 0 0 0 0 1 10
0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0

is a 6 x 11 row-reduced echelon matrix. The dotted line in it is to indicate the step-like
structure of the non-zero rows.

But, why bring in this type of a matrix? Well the following theorem gives us one good
reason.

Theorem 5: The rank of a row-reduced echelon matrix is equal to the number of its
non-zero rows.
Proof: Let R1, R2, …, Rr be the non-zero rows of an m x n row-reduced echelon matrix
E. Then RS(E) is generated by R1, …, Rr. We want to show that R1, …, Rr are linearly
independent. Suppose R1 has its first non-zero entry in column k1, R2 in column k2,
and so on. Then k1 < k2 < … < kr, and, for any r scalars c1, …, cr such that c1R1 + c2R2
+ … + crRr = 0, we immediately get

  c1 [0, …, 0, 1, ∗, …………, ∗, …………, ∗, ……, ∗]
+ c2 [0, …………, 0, 1, ……, ∗, …………, ∗, ……, ∗]
  :
+ cr [0, ……………………………………, 0, 1, ……, ∗]
=    [0, ………………………………………………………, 0],

where the 1's occur in columns k1, k2, …, kr, and ∗ denotes various entries that we
aren't bothering to calculate.

This equation gives us the following equations (when we equate the k1th entries, the
k2th entries, …, the krth entries on both sides of the equation):

c1 = 0, c1(∗) + c2 = 0, …, c1(∗) + c2(∗) + … + cr-1(∗) + cr = 0.

On solving these equations we get
c1 = 0 = c2 = … = cr. ∴ R1, …, Rr are linearly independent. ∴ p(E) = r.

Not only is it easy to obtain the rank of an echelon matrix, one can also solve linear
equations of the type AX = B more easily if A is in echelon form.
Now here is some good news!

Every matrix can be transformed to the row echelon form by a series of elementary
row operations. We say that the matrix is reduced to the echelon form. Consider the
following example.

Example 4: Let

     0 0 0  0  0 1
     0 1 2 -1 -1 1
A =  0 1 2  0  3 1
     0 0 0  1  4 1
     0 2 4  1 10 2

Reduce A to the row echelon form.

Solution: The first column of A is zero. The second is non-zero. The (1,2)th element is
0. We want 1 at this position. We apply R12 to A and get

      0 1 2 -1 -1 1
      0 0 0  0  0 1
A1 =  0 1 2  0  3 1
      0 0 0  1  4 1
      0 2 4  1 10 2

The (1,2)th entry has become 1. Now we subtract multiples of the first row from the other
rows so that the (2,2)th, (3,2)th, (4,2)th and (5,2)th entries become zero. So we apply
R31(-1) and R51(-2) and get

      0 1 2 -1 -1 1
      0 0 0  0  0 1
A2 =  0 0 0  1  4 0
      0 0 0  1  4 1
      0 0 0  3 12 0

Now, beneath the entries of the first row we have zeros in the first 3 columns, and in
the fourth column we find non-zero entries. We want 1 at the (2,4)th position, so we
interchange the 2nd and 3rd rows. We get

      0 1 2 -1 -1 1
      0 0 0  1  4 0
A3 =  0 0 0  0  0 1
      0 0 0  1  4 1
      0 0 0  3 12 0

We now subtract suitable multiples of the 2nd row from the 4th and 5th rows so that
the (4,4)th and (5,4)th entries become zero.

    R42(-1)         0 1 2 -1 -1 1
    R52(-3)         0 0 0  1  4 0          (A ~R B means that on applying
A3    ~      A4 =   0 0 0  0  0 1          the operation R to A we
                    0 0 0  0  0 1          get the matrix B.)
                    0 0 0  0  0 0

Now we have zeros below the entries of the 2nd row, except for the
6th column. The (3,6)th element is 1. We subtract a suitable multiple
of the 3rd row from the 4th row so that the (4,6)th element
becomes zero.

    R43(-1)    0 1 2 -1 -1 1
               0 0 0  1  4 0
A4    ~        0 0 0  0  0 1
               0 0 0  0  0 0
               0 0 0  0  0 0

And now we have achieved a row echelon matrix. Notice that we applied 7
elementary operations to A to obtain this matrix.
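The reduction procedure of Example 4 is mechanical enough to code. Below is a minimal Python sketch of it (our own illustration; the function name row_echelon is an assumption), using exact fractions and the three row operations:

    from fractions import Fraction

    def row_echelon(A):
        # Work on a copy with exact arithmetic.
        M = [[Fraction(x) for x in row] for row in A]
        rows, cols = len(M), len(M[0])
        pivot_row = 0
        for c in range(cols):
            # Find a row at or below pivot_row with a non-zero entry in column c.
            pivot = next((r for r in range(pivot_row, rows) if M[r][c] != 0), None)
            if pivot is None:
                continue
            M[pivot_row], M[pivot] = M[pivot], M[pivot_row]               # Rij
            M[pivot_row] = [x / M[pivot_row][c] for x in M[pivot_row]]    # Ri(a)
            for r in range(pivot_row + 1, rows):                          # Rij(a)
                factor = M[r][c]
                M[r] = [x - factor * y for x, y in zip(M[r], M[pivot_row])]
            pivot_row += 1
        return M

    A = [[0,0,0,0,0,1], [0,1,2,-1,-1,1], [0,1,2,0,3,1], [0,0,0,1,4,1], [0,2,4,1,10,2]]
    for row in row_echelon(A):
        print(row)   # reproduces the echelon form obtained in Example 4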
In general, we have the following theorem.

Theorem 6: Every matrix can be reduced to a row-reduced echelon matrix by a finite
sequence of elementary row operations.

The proof of this result is just a repetition of the process that you went through in
Example 4. For practice, we give you the following exercise.

                          1 2 0
E E12) Reduce the matrix  0 1 0  to echelon form.
                          3 1 0

Theorem 6 leads us to the following definition.

Definition: If a matrix A is reduced to a row-reduced echelon matrix E by a finite
sequence of elementary row operations, then E is called a row-reduced echelon form
(or, the row echelon form) of A. We now give a useful result that immediately
follows from Theorems 3 and 5.

Theorem 7: Let E be a row-reduced echelon form of A. Then the rank of A = number


of non-zero rows of E.

Proof: We obtain E from A by applying elementary operations. Therefore, by
Theorem 3, p(A) = p(E). Also, p(E) = the number of non-zero rows of E, by Theorem
5.

Thus, we have proved the theorem.

Let us look at some examples to actually see how the echelon form of a matrix
simplifies matters.

Example 5: Find p(A), where

     1 2 3
A =
     1 5 6

by reducing it to its row-reduced echelon form.

           1 2 3  R21(-1)  1 2 3  R2(1/3)  1 2 3
Solution:            ~               ~
           1 5 6           0 3 3           0 1 1

which is the desired row-reduced echelon form. This has 2 non-zero rows. Hence,
p(A) = 2.

E E13) Obtain the row-reduced echelon form of the matrix

     1 2 0  5
A =  2 1 7  6
     4 5 7 10

Hence determine the rank of the matrix.

By now you must have got used to obtaining row echelon forms. Let us discuss some
ways of applying this reduction.

3.5 Applications of Row-reduction

In this section we shall see how to utilize row-reduction for obtaining the inverse of a
matrix, and for solving a system of linear equations.

3.6 Inverse of a Matrix

In Theorem 4 you discovered that applying a row transformation to a matrix A is the
same as multiplying it on the left by a suitable elementary matrix. Thus, applying a
series of row transformations to A is the same as pre-multiplying A by a series of
elementary matrices. This means that after the nth row transformation we obtain the
matrix EnEn-1 … E2E1A, where E1, E2, …, En are elementary matrices.

Now, how do we use this knowledge for obtaining the inverse of an invertible matrix?
Suppose we have an n x n invertible matrix A. We know that A = IA, where I = In.
Now, we apply a series of elementary row operations E1, …, Es to A so that A gets
transformed to In.
Thus,
I = EsEs-1 … E2E1A = EsEs-1 … E2E1(IA)
= (EsEs-1 … E2E1 I)A = BA,
where B = Es … E1I. Then B is the inverse of A!
Note that we are reducing A to I, and not only to the echelon form.

We illustrate this below.

Example 6: Determine if the matrix

     1 2 3
A =  2 3 1
     3 1 2

is invertible. If it is invertible, find its inverse.

Solution: Can we transform A to I? If so, then A will be invertible.

          1 2 3     1 0 0
Now, A =  2 3 1  =  0 1 0  A,  i.e., A = IA.
          3 1 2     0 0 1

To transform A we will be pre-multiplying it by elementary matrices. We will also be
pre-multiplying IA by these matrices. Therefore, as A is transformed to I, the same
transformations are done to I on the right hand side of the matrix equation given
above. Now,

 1  2  3      1 0 0
 0 -1 -5  =  -2 1 0  A   (applying R21(-2) and R31(-3) to A)
 0 -5 -7     -3 0 1

   1 2 3     1  0  0
⇒  0 1 5  =  2 -1  0  A   (applying R2(-1) and R3(-1))
   0 5 7     3  0 -1

   1 0  -7     -3  2  0
⇒  0 1   5  =   2 -1  0  A   (applying R12(-2) and R32(-5))
   0 0 -18     -7  5 -1

   1 0 -7     -3     2     0
⇒  0 1  5  =   2    -1     0     A   (applying R3(-1/18))
   0 0  1     7/18 -5/18  1/18

   1 0 0     -5/18  1/18  7/18
⇒  0 1 0  =   1/18  7/18 -5/18  A   (applying R13(7) and R23(-5))
   0 0 1      7/18 -5/18  1/18

                                                   -5  1  7
Hence, A is invertible and its inverse is B = 1/18  1  7 -5
                                                    7 -5  1
To make sure that we haven't made a careless mistake at any stage, check the answer
by multiplying B with A. Your answer should be I.
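A quick machine check of Example 6 (our own, not the text's): numpy's inverse agrees with the hand-computed (1/18)-matrix.

    import numpy as np

    A = np.array([[1.0, 2.0, 3.0],
                  [2.0, 3.0, 1.0],
                  [3.0, 1.0, 2.0]])
    B = np.array([[-5.0, 1.0, 7.0],
                  [1.0, 7.0, -5.0],
                  [7.0, -5.0, 1.0]]) / 18.0
    print(np.allclose(np.linalg.inv(A), B))   # True
    print(np.allclose(B @ A, np.eye(3)))      # True: BA = I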
                  0 1 3
E E14) Show that  2 3 5  is invertible. Find its inverse.
                  3 5 7

Let us now look at another application of row-reduction.

3.7 Solving a System of Linear Equations

Any system of m linear equations, in n unknowns x1, …, xn, is

a11x1 + … + a1nxn = b1
          :
am1x1 + … + amnxn = bm

where all the aij and bi are scalars.
This can be written in matrix form as

                               x1        b1
AX = B, where A = [aij], X =    :  , B =  :
                               xn        bm

If B = 0, the system is called homogeneous. In this situation we are in a position to say
how many linearly independent solutions the system of equations has.

Theorem 8: The number of linearly independent solutions of the matrix equation AX
= 0 is n - r, where A is an m x n matrix and r = p(A).

Proof: In Unit 3 you studied that, given the matrix A, we can obtain a linear
transformation T : Fn → Fm such that [T]B,B' = A, where B and B' are bases of Fn and Fm,
respectively.

         x1
Now, X =  :   is a solution of AX = 0 if and only if it lies in Ker T (since T(X) = AX).
         xn

Thus, the number of linearly independent solutions is dim Ker T = nullity(T) =
n - rank(T) (Unit 5, Theorem 5).

Also, rank(T) = p(A) (Theorem 2).

Thus, the number of linearly independent solutions is n - p(A).

This theorem is very useful for finding out whether a homogeneous system has any
non-trivial solution or not.

Example 7: Consider the system of 3 equations in 3 unknowns:

3x - 2y + z = 0
x + y = 0
x - 3z = 0

How many solutions does it have which are linearly independent over R?

                                            3 -2  1
Solution: Here our coefficient matrix, A =  1  1  0
                                            1  0 -3

Thus, n = 3. We have to find r. For this, we apply the row-reduction method.

               1 0 -3
We obtain A ~  0 1  3  , which is in echelon form and has rank 3.
               0 0  1
Thus, p(A) = 3.

Thus, the number of linearly independent solutions is 3 - 3 = 0. This means that this
system of equations has no non-zero solution.
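The count n - r is easy to confirm numerically (our own check, not the text's):

    import numpy as np

    A = np.array([[3, -2, 1],
                  [1,  1, 0],
                  [1,  0, -3]])
    n = A.shape[1]
    r = np.linalg.matrix_rank(A)
    print(n - r)   # 0: only the trivial solution, as in Example 7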

In Example 7 the number of unknowns was equal to the number of equations, that is, n
= m. What happens if n > m?
A system of m homogeneous equations in n unknowns has a non-zero solution if n > m.
Why? Well, if n > m, then the rank r of the coefficient matrix is less than or equal to m,
and hence, less than n. So n - r > 0. Therefore, at least one non-zero solution exists.

Note: If a system AX = 0 has one solution, X0, then it has an infinite number of
solutions of the form cX0, c ∈ F. This is because AX0 = 0 ⇒ A(cX0) = 0 ∀ c ∈ F.

E E15) Give a set of linearly independent solutions for the system of equations
x + 2y + 3z = 0
2x + 4y + z = 0
Now consider the general equation AX = B, where A is an m x n matrix. We form the
augmented matrix [AB]. This is an m x (n+1) matrix whose last column is the matrix
B. Here, we also include the case B = 0.

Interchanging equations, multiplying an equation by a non-zero scalar, and adding to
any equation a scalar times some other equation does not alter the set of solutions of
the system of equations. In other words, if we apply elementary row operations on
[AB], then the solution set does not change.

The following result tells us under what conditions the system AX = B has a solution.

Theorem 9: The system of linear equations given by the matrix equation AX = B has a
solution if and only if p(A) = p([AB]).

Proof: AX = B represents the system

a11x1 + a12x2 + … + a1nxn = b1
            :
am1x1 + am2x2 + … + amnxn = bm

This is the same as

a11x1 + a12x2 + … + a1nxn - b1 = 0
            :
am1x1 + am2x2 + … + amnxn - bm = 0

                             X
which is represented by [AB]     = 0. Therefore, any solution of AX = B gives a
                            -1
          X
solution of [AB]     = 0, and vice versa.
          -1

                                                  c1
Now, if the equation AX = B has a solution, say X =  :  , then c1C1 + c2C2 + … + cnCn = B,
                                                  cn
where

C1, …, Cn are the columns of A. That is, B is a linear combination of the Ci's. ∴ CS
([AB]) = CS(A), and so p([AB]) = p(A).

Conversely, if p(A) = p([AB]), then the number of linearly independent columns of A and

[AB] are the same. Therefore, B must be a linear combination of the columns C1, …, Cn
of A.

Let B = a1C1 + … + anCn, ai ∈ F ∀ i.

                                  a1
Then a solution of AX = B is X =   :
                                  an
Thus, AX = B has a solution if and only if p(A) = p([AB]).

Remark: If A is invertible, then the system AX = B has the unique solution X = A-1B.

Now, once we know that the system given by AX = B is consistent, how do we find a
solution? We utilize the method of successive (or Gaussian) elimination. This method
is attributed to the famous German mathematician, Carl Friedrich Gauss (1777-1855)
(see Fig. 1). Gauss was called the “prince of mathematicians” by his contemporaries.
He did a great amount of work in pure mathematics as well as in probability, the theory of
errors, geodesy, mechanics, electromagnetism and optics.

To apply the method of Gaussian elimination, we first reduce [AB] to its row echelon
form E. Then we write out the equations corresponding to E and solve them, which is simple.
Let us illustrate the method.

Example 8: Solve the following system by using the Gaussian elimination process.
x + 2y + 3z = 1
2x + 4y + z = 2

Solution: The given system is the same as

 1 2 3    x      1
          y   =
 2 4 1    z      2

We first reduce the augmented matrix to echelon form:

 1 2 3 1  R21(-2)   1 2  3 1  R2(-1/5)   1 2 3 1
             ~                    ~
 2 4 1 2            0 0 -5 0             0 0 1 0

This gives us the equivalent system
x + 2y + 3z = 1 and z = 0.

These are again equivalent to x = 1 - 2y and z = 0.

We get the solution in terms of a parameter. Put y = α. Then x = 1 - 2α, y = α, z = 0
is a solution, for any scalar α. Thus, the solution set is {(1 - 2α, α, 0) | α ∈ R}.

Now let us look at an example where B = ), that is, the system is homogeneous

Example 9: Obtain the solution set of the simultaneous equations

x + 2y + 5t = 0
2x + y + 7z + 6t = 0
4x + 5y + 7z + 16t = 0

Solution: The matrix of coefficients is

    1 2 0 5
A = 2 1 7 6
    4 5 7 16

The given system is equivalent to AX = 0. A row-reduced echelon form of this matrix is

1 2 0    5
0 1 -7/3 4/3
0 0 0    0

Then the given system is equivalent to

x + 2y + 5t = 0              x = (-14/3)z - (7/3)t
                       ⇒
y - (7/3)z + (4/3)t = 0      y = (7/3)z - (4/3)t

which is the solution in terms of z and t. Thus, the solution set of the given system
of equations, in terms of two parameters α and β, is

{((-14/3)α - (7/3)β, (7/3)α - (4/3)β, α, β) | α, β ∈ R}

This is a two-dimensional vector subspace of R4 with basis

{(-14/3, 7/3, 1, 0), (-7/3, -4/3, 0, 1)}
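Again, a small Python sketch confirms this basis (assuming SymPy; the names are ours):

    from sympy import Matrix

    A = Matrix([[1, 2, 0, 5],
                [2, 1, 7, 6],
                [4, 5, 7, 16]])
    for v in A.nullspace():    # a basis of the solution space of AX = 0
        print(v.T)             # prints the two basis vectors found above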


For practice we give you the following exercise.

E E16) Use the Gaussian method to obtain the solution set of the following system
of equations.
4x1 - 3x2 + x3 - 7 = 0
x1 - 2x2 - 2x3 - 3 = 0
3x1 - x2 + 2x3 + 1 = 0

And now we are near the end of this unit.


4.0 CONCLUSION

5.0 SUMMARY

In this unit we covered the following points.

 We defined the row rank, column rank and rank of a matrix, and showed that they
are equal.
 We proved that the rank of a linear transformation is equal to the rank of its
matrix.
 We defined the six elementary row and column operations.
 We have shown you how to reduce a matrix to the row-reduced echelon form.
 We have used the echelon form to obtain the inverse of a matrix.
 We proved that the number of linearly independent solutions of a homogeneous
system of equations given by the matrix equation AX = 0 is n - r, where r = rank
of A and n = number of columns of A.
 We proved that the system of linear equations given by the matrix equation AX
= B is consistent if and only if p(A) = p([AB]).
 We have shown you how to solve a system of linear equations by the process
of successive elimination of variables, that is, the Gaussian method.

Solutions/Answers

E1) A is the m x n zero matrix ⇒ RS(A) = {0}. ∴ pr(A) = 0.

E2) The column space of A is the subspace of R2 generated by (1,0), (0,2), (1,1).
Now dim CS(A) ≤ dim R2 = 2. Also, (1,0) and (0,2) are linearly independent.
∴ {(1,0), (0,2)} is a basis of CS(A), and pc(A) = 2.

The row space of A is the subspace of R3 generated by (1,0,1) and (0,2,1). These
vectors are linearly independent and hence form a basis of RS(A). ∴ pr(A) = 2.

E3) The ith row of C = AB is

[ci1 ci2 … cip] = [Σ aik bk1   Σ aik bk2   …   Σ aik bkp]   (each sum over k = 1, …, n)

= ai1 [b11 b12 … b1p] + ai2 [b21 b22 … b2p] + … + ain [bn1 bn2 … bnp],

a linear combination of the rows of B. ∴ RS(AB) ⊆ RS(B). ∴ pr(AB) ≤ pr(B).

E4) By Lemma 1, p(AB) ≤ pc(A) = p(A).
Also, p(AB) ≤ pr(B) = p(B).
∴ p(AB) ≤ min (p(A), p(B)).

E5) p(CR) ≤ min (p(C), p(R)).
But p(C) ≤ min (m, 1) = 1. Also C ≠ 0. ∴ p(C) = 1. ∴ p(CR) ≤ 1.

Now, if C = (a1, …, am)t and R = [b1, …, bn], then

      a1b1  a1b2  …  a1bn
CR =  a2b1  a2b2  …  a2bn
       :     :        :
      amb1  amb2  …  ambn

Since C ≠ 0, ai ≠ 0 for some i. Similarly, bj ≠ 0 for some j. ∴ aibj ≠ 0.
∴ CR ≠ 0.
∴ p(CR) ≠ 0. ∴ p(CR) = 1.

E6) PAQ =  0 -2 -2
          -3 -4 -3

The rows of PAQ are linearly independent. ∴ p(PAQ) = 2. Also, the rows of A are
linearly independent, so p(A) = 2. ∴ p(PAQ) = p(A).
E7) Let A = 1 0 0
            0 2 0
            0 0 3

Then p(A) = 3, and A's normal form is

1 0 0
0 1 0  = I3.
0 0 1

E8) a) 1 0 0
       0 0 1
       0 1 0

b) R32 ∘ R21(A) = R32  0 1 0    0 1 0
                       1 0 0  = 0 0 1
                       0 0 1    1 0 0

c) 0 + 0 x (-1)   0 + 1 x (-1)   1 + 0 x (-1)     0 -1 1
   1              0              0             =  1  0 0
   0              1              0                0  1 0

E9) R23(I4) = 1 0 0 0
              0 0 1 0
              0 1 0 0  = C23(I4)
              0 0 0 1

R2(2)(I4) = 1 0 0 0
            0 2 0 0
            0 0 1 0  = C2(2)(I4)
            0 0 0 1

R12(3)(I4) = 1 3 0 0
             0 1 0 0
             0 0 1 0  = C21(3)(I4)
             0 0 0 1

E10) Ei(a-1) Ei(a) = Ri(a-1)(Ei(a)) = Ri(a-1)(Ri(a)(I)) = I.

This proves (b).

Eij(-a) Eij(a) = Rij(-a)(Eij(a)) = Rij(-a)(Rij(a)(I)) = I, proving (c).

E11) E13(-2) E13(2) = 1 0 -2 0   1 0 2 0     1 0 0 0
                      0 1  0 0   0 1 0 0     0 1 0 0
                      0 0  1 0   0 0 1 0  =  0 0 1 0
                      0 0  0 1   0 0 0 1     0 0 0 1

E12) 1 2 0   R31(-3)   1 2 0    R32(5)   1 2 0
     0 1 0     ~       0 1 0      ~      0 1 0
     3 1 0             0 -5 0            0 0 0

E13) 1 2 0 5   R21(-2), R31(-4)   1 2 0  5     R2(-1/3)   1 2 0    5
     2 1 7 6          ~           0 -3 7 -4       ~       0 1 -7/3 4/3
     4 5 7 10                     0 -3 7 -10              0 -3 7   -10

     R32(3)   1 2 0    5      R3(-1/6)   1 2 0    5
       ~      0 1 -7/3 4/3       ~       0 1 -7/3 4/3
              0 0 0    -6                0 0 0    1

∴ p(A) = 3.
E14) A = 0 1 3     1 0 0
         2 3 5  =  0 1 0  A
         3 5 7     0 0 1

⇒  2 3 5    0 1 0
   0 1 3 =  1 0 0  A   (applying R12)
   3 5 7    0 0 1

⇒  1 3/2 5/2     0 1/2  0
   0 1   3    =  1 0    0  A   (applying R1(1/2), R31(-3))
   0 1/2 -1/2    0 -3/2 1

⇒  1 0 -2    -3/2 1/2  0
   0 1 3  =  1    0    0  A   (applying R12(-3/2), R32(-1/2))
   0 0 -2    -1/2 -3/2 1

⇒  1 0 0    -1   2    -1
   0 1 0 =  1/4  -9/4 3/2  A   (applying R3(-1/2), R23(-3) and R13(2))
   0 0 1    1/4  3/4  -1/2

∴ A is invertible, and A-1 = -1   2    -1
                             1/4  -9/4 3/2
                             1/4  3/4  -1/2
E15) The given system is equivalent to

1 2 3   x     0
2 4 1   y  =  0
        z

Now, the rank of  1 2 3
                  2 4 1  is 2. ∴ the number of linearly independent solutions is
3 - 2 = 1, so any non-zero solution will be a linearly independent solution. Now, the
given equations are equivalent to

x + 2y = -3z ….. (1)
2x + 4y = -z …. (2)

(-3) times Equation (2) added to Equation (1) gives -5x - 10y = 0, that is, x = -2y.

Take y = 1, so that x = -2. Then (1) gives z = 0. Thus, a solution is (-2, 1, 0), and
a set of linearly independent solutions is {(-2, 1, 0)}.

Note that you can get several answers to this exercise. But any solution will be
α(-2, 1, 0), for some α ∈ R.

E16) The augmented matrix is

        4 -3 1  7
[A B] = 1 -2 -2 3
        3 -1 2 -1

Its row-reduced echelon form is

1 -2 -2   3
0 1  9/5 -1
0 0  1    5

Thus, the given system of equations is equivalent to

x1 - 2x2 - 2x3 = 3
x2 + (9/5)x3 = -1
x3 = 5.

We can solve this system to get the unique solution x1 = -7, x2 = -10, x3 = 5.
Eigenvalues And Eigenvectors

This section consists of three units, in which we first introduce you to the theory of
determinants, and give its applications in solving systems of linear equations.

The theory of determinants was originated by Leibniz in 1693 while studying systems
of simultaneous linear equations. The mathematician Jacobi was perhaps the most
prolific contributor to the theory of determinants. In fact, there is a particular kind of
determinant that is named Jacobian, after him. The mathematicians Cramer and
Bezout used determinants extensively for solving systems of linear equations.

In Unit 5 we have given a self-contained treatment of determinants, including the


standard properties of determinants. We have also given a formula for obtaining the
inverse of a matrix, and have explained Cramer's rule. The unit ends with a
discussion of the determinant rank.

In unit 6 we discuss eigenvalues and eigenvectors. Their use first appeared in the
study of quadratic forms. The concepts that you will study in this unit were developed
by Arthur Cayley and others during the 1840s. What you will discover in the unit is
the algebraic eigenvalue problem and methods of finding eigenvalues and linearly
independent eigenvectors.

In unit 7 we introduce you to the characteristic polynomial. We give a proof of the


Cayley-Hamilton theorem and give its applications. We also discuss the minimal
polynomial of a matrix and of a linear transformation.

If you are interested in knowing more about the material covered in this block, you
can refer to the books listed in the course introduction. These books will be available
at your study centre.

Notations And Symbols

Mn(F) : set of all n x n matrices over F
Vn(F) : Mnx1(F)
det(A), |A| : determinant of the matrix A
Π ai : the product of the ai's such that i satisfies property P
det(T) : determinant of the linear operator T
Adj(A) : adjoint of the matrix A
Tr(A) : trace of the matrix A
Wλ : eigenspace corresponding to the eigenvalue λ
UNIT 5 DETERMINANTS

Introduction
Objectives
Defining Determinants
Properties of Determinants
Inverse of a Matrix
  Product Formula
  Adjoint of a Matrix
Systems of Linear Equations
The Determinant Rank
Summary
Solutions/Answers

Introduction

In Unit 4 we discussed the successive elimination method for solving a system of
linear equations. In this unit we introduce you to another method, which depends on
the concept of a determinant function. Determinants were used by the German
mathematician Leibniz (1646-1716), and the French mathematician Vandermonde
(1735-1796) gave the first systematic presentation of the theory of determinants.

There are several ways of developing the theory of determinants. In Section 5.2 we
approach it in one way. In Section 5.3 you will study the properties of determinants
and certain other basic facts about them. We go on to give applications in solving a
system of linear equations (Cramer's rule) and obtaining the inverse of a matrix. We
also define the determinant of a linear transformation. We end with discussing a
method of obtaining the rank of a matrix.

Throughout this unit F will denote a field of characteristic zero, Mn(F) will denote
the set of n x n matrices over F, and Vn(F) will denote the space of all n x 1
matrices over F, that is,

Vn(F) = { X = (a1, a2, …, an)t | ai ∈ F }.

The concept of a determinant must be understood properly because you will be using
it again and again. Do spend more time on Section 5.2, if necessary. We also advise
you to revise Units 1-4 before starting this unit.
2.0 OBJECTIVES

After completing this unit, you should be able to

 evaluate the determinant of a square matrix, using various properties of
determinants;
 obtain the adjoint of a square matrix;
 compute the inverse of an invertible matrix, using its adjoint;
 apply Cramer's rule to solve a system of linear equations;
 evaluate the determinant of a linear transformation;
 evaluate the rank of a matrix by using the concept of the determinant rank.

3.0 MAIN CONTENT

3.1 Defining Determinants

There are many ways of introducing and defining the determinant function from
Mn(F) to F. In this section we give one of them, the classical approach. This was
given by the French mathematician Laplace (1749-1827), and is still very much in
use.

We will define the determinant function det : Mn(F) → F by induction on n. That is,
we will define it for n = 1, 2, 3, and then define it for any n, assuming the
definition for n - 1.

When n = 1, for any A ∈ M1(F) we have A = [a], for some a ∈ F. In this case we define
det(A) = det([a]) = a.

For example, det([-5]) = -5.


When n = 2, for any A = a11 a12
                        a21 a22  ∈ M2(F), we define det(A) = a11a22 - a12a21.

For example, det  0 1
                 -2 3  = 0 x 3 - 1 x (-2) = 2.

When n = 3, for any A = a11 a12 a13
                        a21 a22 a23  ∈ M3(F), we define
                        a31 a32 a33

det(A), using the definition for the case n = 2, as follows:

det(A) = a11 det a22 a23  - a12 det a21 a23  + a13 det a21 a22
                 a32 a33            a31 a33            a31 a32

That is, det(A) = (-1)^(1+1) a11 (det of the matrix left after deleting the row and
column containing a11) + (-1)^(1+2) a12 (det of the matrix left after deleting the
row and column containing a12) + (-1)^(1+3) a13 (det of the matrix left after
deleting the row and column containing a13).

Note that the power of (-1) that is attached to a1j is 1 + j, for j = 1, 2, 3.

So, det(A) = a11(a22a33 - a23a32) - a12(a21a33 - a23a31) + a13(a21a32 - a22a31).

In fact, we could have calculated |A| from the second row also, as follows:

|A| = (-1)^(2+1) a21 det a12 a13  + (-1)^(2+2) a22 det a11 a13  + (-1)^(2+3) a23 det a11 a12
                         a32 a33                       a31 a33                       a31 a32

Similarly, expanding by the third row, we get

|A| = (-1)^(3+1) a31 det a12 a13  + (-1)^(3+2) a32 det a11 a13  + (-1)^(3+3) a33 det a11 a12
                         a22 a23                       a21 a23                       a21 a22

All 3 ways of obtaining |A| lead to the same value.

Consider the following example.

Example 1: Let

    1 2 6
A = 5 4 1 . Calculate |A|.
    7 3 2

Solution: We want to obtain

      1 2 6
|A| = 5 4 1
      7 3 2

Let Aij denote the matrix obtained by deleting the ith row and jth column of A.
Let us expand by the first row. Observe that

A11 = 4 1 ,  A12 = 5 1 ,  A13 = 5 4
      3 2          7 2          7 3

Thus,

|A11| = 4 x 2 - 1 x 3 = 5, |A12| = 5 x 2 - 1 x 7 = 3, |A13| = 5 x 3 - 4 x 7 = -13.

Thus,

|A| = (-1)^(1+1) x 1 x |A11| + (-1)^(1+2) x 2 x |A12| + (-1)^(1+3) x 6 x |A13|
    = 5 - 6 - 78 = -79.

E E1) Now obtain |A| of Example 1 by expanding by the second row, and by the
third row. Does the value of |A| depend upon the row used for calculating it?

Now, let us see how this definition is extended to define det(A) for any n x n matrix
A, n > 1.

When A = [aij]nxn, we define det(A) by expanding along the ith row as follows:

det(A) = (-1)^(i+1) ai1 det(Ai1) + (-1)^(i+2) ai2 det(Ai2) + … + (-1)^(i+n) ain det(Ain),

where Aij is the (n - 1) x (n - 1) matrix obtained from A by deleting the ith row and
the jth column, and i is a fixed integer with 1 ≤ i ≤ n.

We, thus, see that det(A) = Σ (-1)^(i+j) aij det(Aij), the sum running over
j = 1, …, n.

So we define the determinant of an n x n matrix A in terms of the determinants of the
(n - 1) x (n - 1) matrices Aij, j = 1, 2, …, n.
Note: While calculating |A|, we prefer to expand along a row that has the maximum
number of zeros. This cuts down the number of terms to be calculated.
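The definition above translates directly into a short program. Here is a Python
sketch (the function name det_expand is ours, and this is only an illustration of the
definition, not an efficient algorithm):

    def det_expand(A):
        # determinant by expansion along the first row, as defined above
        n = len(A)
        if n == 1:
            return A[0][0]
        total = 0
        for j in range(n):
            # minor A1j: delete row 1 and column j + 1
            minor = [row[:j] + row[j+1:] for row in A[1:]]
            total += (-1) ** j * A[0][j] * det_expand(minor)
        return total

    print(det_expand([[1, 2, 6], [5, 4, 1], [7, 3, 2]]))   # prints -79 (Example 1)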

The following example will help you to get used to calculating determinants.

Example 2: Let

    -3 -2 0  2
A =  2  1 0 -1 . Calculate |A|.
     1  0 1  2
     2  1 -3 1

Solution: The first three rows have one zero each. Let us expand along the third
row. Observe that a32 = 0, so we don't need to calculate A32. Now,

      -2 0  2          -3 -2 2          -3 -2 0
A31 =  1 0 -1 ,  A33 =  2  1 -1 , A34 =  2  1 0
       1 -3 1           2  1  1          2  1 -3

We will obtain |A31|, |A33| and |A34| by expanding along the second, third and second
row, respectively.

|A31| = (-1)^(2+1) . 1 . det  0 2  + (-1)^(2+2) . 0 . det -2 2  + (-1)^(2+3) . (-1) . det -2 0
                             -3 1                          1 1                             1 -3

(expansion along the second row)

= (-1) . 6 + 0 + (-1)(-1) . 6 = -6 + 6 = 0.

|A33| = (-1)^(3+1) . 2 . det -2 2  + (-1)^(3+2) . 1 . det -3 2  + (-1)^(3+3) . 1 . det -3 -2
                              1 -1                         2 -1                         2  1

(expansion along the third row)

= 2 . 0 + (-1) . 1 . (-1) + 1 . 1 . 1 = 1 + 1 = 2.

|A34| = (-1)^(2+1) . 2 . det -2 0  + (-1)^(2+2) . 1 . det -3 0  + (-1)^(2+3) . 0 . det -3 -2
                              1 -3                         2 -3                         2  1

(expansion along the second row)

= (-1) . 2 . 6 + 1 . 1 . 9 + 0 = -12 + 9 = -3.

Thus, the required determinant is given by

|A| = a31 |A31| - a32 |A32| + a33 |A33| - a34 |A34| = 1 . 0 - 0 + 1 . 2 - 2 . (-3) = 8.

E E2) Calculate |At|, where A is the matrix in


a) Example 1,
b) Example 2.

At this point we mention that there are two other methods of obtaining determinants -
via permutations and via multilinear forms. We will not be doing these methods
here. For purposes of actual calculation of determinants the method that we have
given is normally used. The other methods are used to prove various properties
of determinants.

So far we have looked at determinants algebraically only. But there is a geometrical
interpretation of determinants also, which we now give.

Determinant as area and volume: Let u = (a1, a2) and v = (b1, b2) be two vectors in
R2. Then, the magnitude of the area of the parallelogram spanned by u and v (see
Fig. 1) is the absolute value of

det(u, v) = det a1 b1
                a2 b2

(Fig. 1: the shaded area is det(u, v).)

In fact, what we have just said is true for any n > 0. Thus, if u1, u2, …, un are n
vectors in Rn, then the absolute value of det(u1, u2, …, un) is the magnitude of the
volume of the n-dimensional box spanned by u1, u2, …, un.

Try this exercise now.

E E3) What is the magnitude of the volume of the box in R3 spanned by i, j and k?

Let us now study some properties of the determinant function.

3.2 Properties of Determinants

In this section we will state some properties of determinants, mostly without proof.
We will take examples and check that these properties hold for them.

Now, for any A Mn (F) we shall denote its columns by C1, C2, ….Cn Then we have
the following 7 properties, P1 – P7.

P1: If Ci is an n x 1 vector over F, then


det (C1……,Ci-1, Ci+ C1+1,…..,Cn)
= det (C1…., Ci, Ci + 1,…., Cn) + det (C1, …., Ci -1, Ci + 1 ,…., Cn).
P2: IF Ci = Cj, for any i≠ j, then det ((C1,C2,…,Cn) = 0.
P3 If Ci and Cj are internachanged (i ≠ j) to form a new matrix B, then det B
= -det (C1, C2……, Cn).
P4: For F.
det (C1..…Ci– 1 Ci+1, …., Cn) =  det (C1, C2,…, Cn).
Thus, det (C1, C2,…, Cn) = n det (C1,…,Cn).
Now, using P1, P2 and P4, we find that for I ≠ j and F,
det (C1..…Ci + Cj,…, Cj …., Cn) = det (C1,…., Ci,…, Cj …., Cn) +  det
(C1,…., Ci,…, Cj …., Cn). = det (C1, C2, …., Cn).

Thus, we have
P5: for any α  F and i ≠ j, det (C1,…., Ci,…, Cj ,Ci+1,…., Cn ).

Another property that we give is

P6: det(A) = det(At) ∀ A ∈ Mn(F). (In E2 you saw that this property was true for
Examples 1 and 2. Its proof uses the permutation approach to determinants.)

Using P6, and the fact that det(A) can be obtained by expanding along any row, we get

P7: For A ∈ Mn(F), we can obtain det(A) by expanding along any column. That is, for a
fixed k,

|A| = (-1)^(1+k) a1k |A1k| + (-1)^(2+k) a2k |A2k| + … + (-1)^(n+k) ank |Ank|.

An important remark now.

Remark: Using P6, we can immediately say that P1 - P5 are valid when columns
are replaced by rows.

Using the notation of Unit 8, P3 says that

det(Rij(A)) = -det(A) = det(Cij(A)).

P4 says that
det(Ri(α)(A)) = α det(A) = det(Ci(α)(A)) ∀ α ∈ F, and P5 says that
det(Rij(α)(A)) = det(A) = det(Cij(α)(A)) ∀ α ∈ F.

We will now illustrate how useful the properties P1 - P7 are.

Example 3: Obtain det(A), where A is

a) 1 6 0      b)  1 2 -1 -3
   2 7 2          2 4  5  0
   1 6 0          0 2 -1 -2
                 -1 0  0  1

Solution: a) Since the first and third rows of A (R1 and R3) coincide, |A| = 0, by P2
and P6.

b)        1 2 -1 -3
          2 4  5  0
   |A| =  0 2 -1 -2
         -1 0  0  1

       1 2 -1 -3
    =  2 4  5  0 , by adding R1 to R4
       0 2 -1 -2
       0 2 -1 -2

    = 0, since R3 = R4.
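You can also spot-check these properties numerically, with a small Python sketch
(assuming the NumPy library; the random matrix chosen below is ours):

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.integers(-5, 5, size=(4, 4)).astype(float)

    B = A[:, [1, 0, 2, 3]]          # interchange C1 and C2 (property P3)
    print(np.isclose(np.linalg.det(B), -np.linalg.det(A)))   # True

    C = A.copy()
    C[:, 0] += 2.5 * C[:, 2]        # add a multiple of C3 to C1 (property P5)
    print(np.isclose(np.linalg.det(C), np.linalg.det(A)))    # True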

Try the following exercise now.

E E4) Calculate  1 3 0        2 3 5
                 2 1 2   and  1 0 1
                 1 3 0        4 6 10

Now we give some examples of determinants that you may come across often.

Example 4: Let

    a b b b
A = b a b b , where a, b ∈ R.
    b b a b
    b b b a

Calculate |A|.

Solution:

      a b b b
|A| = b a b b
      b b a b
      b b b a

      a+3b a+3b a+3b a+3b     (by adding the second, third and fourth rows
   =  b    a    b    b         to the first row, and applying P5)
      b    b    a    b
      b    b    b    a

      a+3b 0   0   0          (by subtracting the first column from
   =  b    a-b 0   0           every other column, and using P5)
      b    0   a-b 0
      b    0   0   a-b

             a-b 0   0
   = (a+3b)  0   a-b 0        (expanding along the first row)
             0   0   a-b

   = (a + 3b)(a - b)^3.

In Example 4 we have used an important, and easily proved, fact, namely,
det(diag(a1, a2, …, an)) = a1 a2 … an, ai ∈ F ∀ i.

This is true because

α1 0  … 0                  1 0 … 0
0  α2 … 0                  0 1 … 0
:  :    :   = α1 α2 … αn   : :   :  , by P4,
0  0  … αn                 0 0 … 1

= α1 α2 … αn |In| = α1 α2 … αn, since |In| = 1.

Example 5: Show that

det  1     1     1     1
     x1    x2    x3    x4     =  Π (xj - xi), 1 ≤ i < j ≤ 4.
     x1^2  x2^2  x3^2  x4^2
     x1^3  x2^3  x3^3  x4^3

(This is known as the Vandermonde determinant of order 4.)

Solution: The given determinant

=  1     0          0          0                 (by subtracting the first column
   x1    x2-x1      x3-x1      x4-x1              from every other column)
   x1^2  x2^2-x1^2  x3^2-x1^2  x4^2-x1^2
   x1^3  x2^3-x1^3  x3^3-x1^3  x4^3-x1^3

Expanding along the first row, and factorising the entries (each entry in column k is
divisible by xk+1 - x1), this equals

                        1               1               1
(x2-x1)(x3-x1)(x4-x1)   x2+x1           x3+x1           x4+x1
                        x2^2+x1^2+x2x1  x3^2+x1^2+x3x1  x4^2+x1^2+x4x1

(by taking out (x2 - x1), (x3 - x1) and (x4 - x1) from columns 1, 2 and 3,
respectively).

Subtracting the first column from the second and third columns, and factorising
again, this equals

                        1               0                  0
(x2-x1)(x3-x1)(x4-x1)   x2+x1           x3-x2              x4-x2
                        x2^2+x1^2+x2x1  (x3-x2)(x3+x2+x1)  (x4-x2)(x4+x2+x1)

= (x2-x1)(x3-x1)(x4-x1)(x3-x2)(x4-x2) det  1          1
                                           x3+x2+x1   x4+x2+x1

(expanding along the first row and taking out (x3 - x2) and (x4 - x2))

= (x2-x1)(x3-x1)(x4-x1)(x3-x2)(x4-x2)(x4-x3)

= Π (xj - xi), 1 ≤ i < j ≤ 4.

Try the following exercise now.

E E5) What are  a 0 0         a d e
                α b 0   and   0 b f ?
                β γ c         0 0 c

The answer of E5 is part of a general phenomenon, namely, the determinant
of an upper or lower triangular matrix is the product of its diagonal elements.

The proof of this is immediate because

a11 *   … *            a22 *   … *
0   a22 … *            0   a33 … *
:   :     :   =  a11   :         :    (expanding along C1)
0   0   … ann          0   0   … ann

= … = a11 a22 … ann, each time expanding along the first column.

In the Calculus course you must have come across df/dt = f'(t), where f is a
function of t. The next exercise involves this.

E E6) Let us define the function φ(t) by

φ(t) = det  f(t)  g(t)
            f'(t) g'(t)

Show that φ'(t) = det  f(t)   g(t)
                       f''(t) g''(t)

And now, let us study a method for obtaining the inverse of a matrix.

5.1 INVERSE OF A MATRIX

In this section we first obtain the determinant of the product of two matrices, and
then define the adjoint of a matrix. Finally, we see the conditions under which a
matrix is invertible, and, when it is invertible, we give its inverse in terms of its
adjoint.

5.1.1 Product Formula

In Unit 7 you studied matrix multiplication. Let us see what happens to the
determinant of a product of matrices.

Theorem 1: Let A and B be n x n matrices over F. Then det(AB) = det(A) det(B).

We will not do the proof here, since it is slightly complicated. But let us verify
Theorem 1 for some cases.

Example 6: Calculate |A|, |B| and |AB|, when

    1 0 2           2 10 9
A = 3 1 0  and  B = 0 3  8 .
    0 0 1           0 0  5

Solution: We want to verify Theorem 1 for our pair of matrices. Now, on expanding by
the third row, we get |A| = 1.

Also, |B| = 30, which can be immediately seen, since B is a triangular matrix.

           2 10 19
Since AB = 6 33 35 ,  |AB| = 5 det  2 10   = 5 x 6 = 30
           0 0  5                   6 33

= |A| |B|.
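The same check, done numerically as a Python sketch (assuming NumPy):

    import numpy as np

    A = np.array([[1, 0, 2], [3, 1, 0], [0, 0, 1]], dtype=float)
    B = np.array([[2, 10, 9], [0, 3, 8], [0, 0, 5]], dtype=float)
    print(np.linalg.det(A @ B))                  # 30.0 (up to rounding)
    print(np.linalg.det(A) * np.linalg.det(B))   # 1.0 x 30.0 = 30.0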

You can verify Theorem 1 for the following situation.

E E7) Show that |AB| = |A| |B|, where

    1 0 -1           -1 0 1
A = 0 2 -2  and  B = -2 2 0
    3 -3 5            5 -3 3

Theorem 1 can be extended to a product of m n x n matrices A1, A2, …, Am. That is,

det(A1 A2 … Am) = det(A1) det(A2) … det(Am).

Now let us look at an example in which Theorem 1 simplifies calculations.

Example 7: For a, b, c ∈ R, calculate

a^2+2bc  c^2+2ab  b^2+2ac
b^2+2ac  a^2+2bc  c^2+2ab
c^2+2ab  b^2+2ac  a^2+2bc

Solution: The solution is very simple. The given matrix is equal to

 a b c   2
 c a b
 b c a

Therefore, we get the required determinant to be

    a b c  2
det c a b    = (a^3 + b^3 + c^3 - 3abc)^2   (by Theorem 1),
    b c a

because

    a b c
det c a b  = a det a b  - b det c b  + c det c a
    b c a          c a          b a          b c

= a(a^2 - bc) - b(ac - b^2) + c(c^2 - ab)

= a^3 + b^3 + c^3 - 3abc.
Now, you know that AB ≠ BA, in general. But det(AB) = det(BA), since both are
equal to the scalar det(A) det(B).

On the other hand, det(A + B) ≠ det(A) + det(B), in general. The following
exercise is an example.

E E8) Let A = 1 0 ,  B = -1  0 . Show that det(A + B) ≠ det(A) + det(B).
              0 1         0 -1

What we have just said is that det is not a linear function.

We now give an immediate corollary to Theorem 1.

Corollary 1: If A ∈ Mn(F) is invertible, then det(A-1) = 1/det(A).

Proof: Let B ∈ Mn(F) be such that AB = I. Then det(AB) = det(A) det(B) = det(I) = 1.
Thus, det(A) ≠ 0 and det(B) = 1/det(A). In particular, det(A-1) = 1/det(A).

(Recall that a matrix B is similar to a matrix A if there exists a non-singular
matrix P such that P-1 AP = B.)

Another corollary to Theorem 1 is

Corollary 2: Similar matrices have the same determinant.

Proof: If B is similar to A, then B = P-1 AP for some invertible matrix P. Thus, by
Theorem 1,
det(B) = det(P-1 AP) = det(P-1) det(A) det(P) = (1/det(P)) det(A) det(P), by Cor. 1,
= det(A).
We use this corollary to introduce you to the determinant of a linear transformation.
At each stage you have seen the very close relationship between linear transformations
and matrices. Here too, you will see this closeness.

Definition: Let T : V → V be a linear transformation on a finite-dimensional non-zero
vector space V. Let A = [T]B be the matrix of T with respect to a given basis B of V.
Then we define the determinant of T by det(T) = det(A).

This definition is independent of the basis of V that is chosen because, if we choose
another basis B' of V, we obtain the matrix A' = [T]B', which is similar to A (see
Unit 7, Cor. to Theorem 10). Thus, det(A') = det(A).

We have the following example and exercises.

Example 8: Find det(T), where we define T : R3 → R3 by

T(x1, x2, x3) = (3x1 + x3, -2x1 + x2, -x1 + 2x2 + 4x3).

Solution: Let B = {(1,0,0), (0,1,0), (0,0,1)} be the standard ordered basis of R3.
Now,
T(1,0,0) = (3,-2,-1) = 3(1,0,0) - 2(0,1,0) - 1(0,0,1)
T(0,1,0) = (0,1,2) = 0(1,0,0) + 1(0,1,0) + 2(0,0,1)
T(0,0,1) = (1,0,4) = 1(1,0,0) + 0(0,1,0) + 4(0,0,1)

              3 0 1
∴ A = [T]B = -2 1 0
             -1 2 4

So, by definition,

                   3 0 1
det(T) = det(A) = -2 1 0
                  -1 2 4

= 3 det 1 0  + 1 det -2 1  = 12 - 3 = 9.
        2 4          -1 2
E E9) Find the determinant of the zero operator and the identity operator from
R3 to R3.

E E10) Consider the differential operator
D : P2 → P2 : D(a0 + a1x + a2x^2) = a1 + 2a2x.
What is det(D)?

Let us now see what the adjoint of a square matrix is, and how it will help us in
obtaining the inverse of an invertible matrix.

5.1.2 Adjoint of a Matrix

In Section 3.1 we used the notation Aij for the matrix obtained from a square matrix A
by deleting its ith row and jth column. Related to this, we define the (i,j)th cofactor
of A (or the cofactor of aij) to be (-1)^(i+j) |Aij|. It is denoted by Cij. That is,
Cij = (-1)^(i+j) |Aij|.

Consider the following example.

Example 9: Obtain the cofactors C12 and C23 of the matrix

    0 2 -1
A = 3 4 1
    2 1 6

Solution: C12 = (-1)^(1+2) |A12| = -det 3 1  = -16.
                                        2 6

C23 = (-1)^(2+3) |A23| = -det 0 2  = 4.
                              2 1
In the following result we give a relationship between the elements of a matrix and
their cofactors.

Theorem 2: Let A = [aij]nxn. Then,

a) ai1 Ci1 + ai2 Ci2 + … + ain Cin = det(A) = a1i C1i + a2i C2i + … + ani Cni,
b) ai1 Cj1 + ai2 Cj2 + … + ain Cjn = 0 = a1i C1j + a2i C2j + … + ani Cnj, if i ≠ j.

We will not be proving this theorem here. We only mention that (a) follows
immediately from the definition of det(A), since
det(A) = (-1)^(i+1) ai1 |Ai1| + … + (-1)^(i+n) ain |Ain|.

E E11) Verify (b) of Theorem 2 for the matrix in Example 9, and i = 1, j = 2 or 3.

Now, we can define the adjoint of a matrix.

Definition: Let A = [aij] be any n x n matrix. Then the adjoint of A is the n x n
matrix, denoted by Adj(A), and defined by

         C11 C12 … C1n  t     C11 C21 … Cn1
Adj(A) = C21 C22 … C2n     =  C12 C22 … Cn2
          :   :      :         :   :      :
         Cn1 Cn2 … Cnn        C1n C2n … Cnn

where Cij denotes the (i,j)th cofactor of A.

Thus, Adj(A) is the n x n matrix which is the transpose of the matrix of
corresponding cofactors of A.
Let us look at an example.

                                      cos θ  0  -sin θ
Example 10: Obtain the adjoint of A = 0      1   0
                                      sin θ  0   cos θ

Solution: C11 = (-1)^(1+1) det 1 0      = cos θ
                               0 cos θ

C12 = (-1)^(1+2) det 0     0      = 0
                     sin θ cos θ

C13 = (-1)^(1+3) det 0     1  = -sin θ
                     sin θ 0

C21 = 0, C22 = cos^2 θ + sin^2 θ = 1, C23 = 0,
C31 = sin θ, C32 = 0, C33 = cos θ.

            cos θ  0  -sin θ  t     cos θ  0  sin θ
∴ Adj(A) =  0      1   0         =  0      1  0
            sin θ  0   cos θ       -sin θ  0  cos θ
Now you can try the following exercise.

E E12) Find Adj(A), where

    2 3 -1
A = 0 0 6
    0 0 5

In Unit 7 you came across one method of finding out if a matrix is invertible.
The following theorem uses the adjoint to give another way of finding out if a
matrix A is invertible. It also gives us A-1, if A is invertible.

Theorem 3: Let A be an n x n matrix over F. Then

A . (Adj(A)) = (Adj(A)) . A = det(A) I.

Proof: Recall matrix multiplication from Unit 7. Now,

               a11 a12 … a1n    C11 C21 … Cn1
A . (Adj(A)) = a21 a22 … a2n    C12 C22 … Cn2
                :   :      :     :   :      :
               an1 an2 … ann    C1n C2n … Cnn

By Theorem 2 we know that ai1 Ci1 + … + ain Cin = det(A), and
ai1 Cj1 + ai2 Cj2 + … + ain Cjn = 0 if i ≠ j. Therefore,

               det(A) 0      … 0
A . (Adj(A)) = 0      det(A) … 0       = det(A) I.
               :      :        :
               0      0      … det(A)

Similarly, (Adj(A)) . A = det(A) I.


An immediate corollary shows us how to calculate the inverse of a matrix, if it exists.

Corollary: Let A be an n x n matrix over F. Then A is invertible if and only if
det(A) ≠ 0. If det(A) ≠ 0, then

A-1 = (1/det(A)) Adj(A).

Proof: If A is invertible, then A-1 exists and A-1 A = I. So, by Theorem 1,
det(A-1) det(A) = det(I) = 1. ∴ det(A) ≠ 0.

Conversely, if det(A) ≠ 0, then Theorem 3 says that

A ((1/|A|) Adj(A)) = I = ((1/|A|) Adj(A)) A.

∴ A-1 = (1/|A|) Adj(A).

We will use this result in the following example.

Example 11: Let

    cos θ  0  -sin θ
A = 0      1   0     . Find A-1.
    sin θ  0   cos θ

Solution:

det(A) = (-1)^(2+2) . 1 . det cos θ  -sin θ    (by expansion along the second row)
                              sin θ   cos θ

= cos^2 θ + sin^2 θ = 1.

Also, from Example 10 we know that

          cos θ  0  sin θ
Adj(A) =  0      1  0
         -sin θ  0  cos θ

Therefore, A-1 = (1/det(A)) Adj(A) = Adj(A).

You should also verify that Adj(A) is A-1 by calculating A . Adj(A) and
Adj(A) . A.
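Here is a small Python sketch of the corollary (assuming NumPy; the helper names
cofactor and adj are ours):

    import numpy as np

    def cofactor(A, i, j):
        # (i,j)th cofactor: signed determinant of the minor
        minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
        return (-1) ** (i + j) * np.linalg.det(minor)

    def adj(A):
        n = A.shape[0]
        C = np.array([[cofactor(A, i, j) for j in range(n)] for i in range(n)])
        return C.T            # the adjoint is the transpose of the cofactor matrix

    theta = 0.7
    A = np.array([[np.cos(theta), 0, -np.sin(theta)],
                  [0, 1, 0],
                  [np.sin(theta), 0, np.cos(theta)]])
    print(np.allclose(adj(A), np.linalg.inv(A)))   # True, since det(A) = 1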
You can use Theorem 3 for solving the following exercises.

E E13) Can you find A-1 for the matrix in E12?

E E14) Find the adjoint and inverse of the matrix A in E7.

E E15) If A-1 exists, does [Adj(A)]-1 exist?


Now we go to the next section, in which we apply our knowledge of determinants to
obtain solutions of systems of linear equations.

5.2 Systems of Linear Equations

Consider the system of n linear equations in n unknowns, given by

a11x1 + a12x2 + … + a1nxn = b1
a21x1 + a22x2 + … + a2nxn = b2
  :       :              :
an1x1 + an2x2 + … + annxn = bn

which is the same as AX = B, where A = [aij], X = (x1, x2, …, xn)t and
B = (b1, b2, …, bn)t.

In Section 8.4 we discussed the Gaussian elimination method for obtaining a solution
of this system. In this section we give a rule, due to the mathematician Cramer, for
solving a system of linear equations when the number of equations equals the
number of variables.

Theorem 4: Let the matrix equation of a system of linear equations be AX = B, where
A = [aij]nxn, X = (x1, …, xn)t and B = (b1, …, bn)t. Let the columns of A be
C1, C2, …, Cn. If det(A) ≠ 0, the given system has a unique solution, namely,

x1 = D1/D, …, xn = Dn/D, where

Di = det(C1, …, Ci-1, B, Ci+1, …, Cn)
   = determinant of the matrix obtained from A by replacing the ith column by B,

and D = det(A).

Proof: Since |A| ≠ 0, the corollary to Theorem 3 says that A-1 exists.

Now, AX = B ⇒ A-1 AX = A-1 B
⇒ X = (1/D) Adj(A) B

             C11 C21 … Cn1    b1
⇒ X = (1/D)  C12 C22 … Cn2    b2
              :   :      :     :
             C1n C2n … Cnn    bn

Thus,

x1           C11b1 + C21b2 + … + Cn1bn
x2  = (1/D)  C12b1 + C22b2 + … + Cn2bn
 :                     :
xn           C1nb1 + C2nb2 + … + Cnnbn

Now, Di = det(C1, …, Ci-1, B, Ci+1, …, Cn). Expanding along the ith column, we get
Di = C1ib1 + C2ib2 + … + Cnibn.

Thus,

x1           D1
x2  = (1/D)  D2
 :            :
xn           Dn

which gives us Cramer's rule, namely,

x1 = D1/D, x2 = D2/D, …, xn = Dn/D.
The following example and exercise may help you to practise using Cramer's rule.

Example 12: Solve the following system using Cramer's rule:

2x + 3y - z = 2
x + 2y + z = -1
2x + y - 6z = 4

Solution: The given system is equivalent to AX = B, where

    2 3 -1       x         2
A = 1 2 1 ,  X = y ,  B = -1 . Therefore, applying the rule,
    2 1 -6       z         4

     2 3 -1        2 2 -1        2 3 2
    -1 2 1         1 -1 1        1 2 -1
     4 1 -6        2 4 -6        2 1 4
x = --------- , y = --------- , z = ---------
     2 3 -1        2 3 -1        2 3 -1
     1 2 1         1 2 1         1 2 1
     2 1 -6        2 1 -6        2 1 -6

After calculating, we get

x = -23, y = 14, z = -6.

Substitute these values in the given equations to check that we haven't made a
mistake in our calculations.
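As a computational cross-check, Cramer's rule is easy to code. A Python sketch
(assuming NumPy):

    import numpy as np

    A = np.array([[2, 3, -1], [1, 2, 1], [2, 1, -6]], dtype=float)
    B = np.array([2, -1, 4], dtype=float)

    D = np.linalg.det(A)
    x = []
    for i in range(3):
        Ai = A.copy()
        Ai[:, i] = B                 # replace the ith column of A by B
        x.append(np.linalg.det(Ai) / D)
    print(np.round(x, 6))            # [-23. 14. -6.]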

E E16) Solve, by Cramer's rule, the following system of equations.

x + 2y + 4z = 1
2x + 3y - z = 3
x - 3z = 2

Now let us see what happens if B = 0. Remember, in Unit 8 you saw that AX = 0 has
n - r linearly independent solutions, where r = rank of A. The following theorem
tells us this condition in terms of det(A).

Theorem 5: The homogeneous system AX = 0 has a non-trivial solution if and only if
det(A) = 0.

Proof: First assume that AX = 0 has a non-trivial solution. Suppose, if possible,
that det(A) ≠ 0. Then Cramer's rule says that AX = 0 has only the trivial solution
X = 0 (because each Di = 0 in Theorem 4). This is a contradiction to our assumption.
Therefore, det(A) = 0.

Conversely, if det(A) = 0, then A is not invertible. ∴ the linear mapping
A : Vn(F) → Vn(F) : A(X) = AX is not invertible. ∴ this mapping is not one-one.
Therefore, Ker A ≠ {0}, that is, AX = 0 for some non-zero X ∈ Vn(F). Thus, AX = 0
has a non-trivial solution.
You can use Theorem 5 to solve the following exercise.

E E17) Does the system

2x + 3y + z = 0
x - y - z = 0
4x + 6y + 2z = 0

have a non-zero solution?

And now we introduce you to the determinant rank of a matrix, which leads us to
another method of obtaining the rank of a matrix.

5.3 The Determinant Rank

In Units 5 and 8 you were introduced to the rank of a linear transformation and the
rank of a matrix, respectively. Then we related the two ranks. In this section we
will discuss the determinant rank and show that it is the rank of the concerned
matrix. First we give a necessary and sufficient condition for n vectors in Vn(F) to
be linearly dependent.

Theorem 6: Let X1, X2, …, Xn ∈ Vn(F). Then X1, X2, …, Xn are linearly dependent over
the field F if and only if det(X1, X2, …, Xn) = 0.

Proof: Let U = (X1, X2, …, Xn) be the n x n matrix whose column vectors are
X1, X2, …, Xn. Then X1, X2, …, Xn are linearly dependent over F if and only if there
exist scalars a1, a2, …, an ∈ F, not all zero, such that a1X1 + a2X2 + … + anXn = 0.

Now,

U (a1, a2, …, an)t = (X1, X2, …, Xn)(a1, a2, …, an)t = a1X1 + a2X2 + … + anXn.

Thus, X1, X2, …, Xn are linearly dependent over F if and only if UX = 0 for some
non-zero X = (a1, a2, …, an)t ∈ Vn(F).

But this happens if and only if det(U) = 0, by Theorem 5. Thus, Theorem 6 is proved.

Theorem 6 is equivalent to the statement: X1, X2, …, Xn ∈ Vn(F) are linearly
independent if and only if det(X1, X2, …, Xn) ≠ 0.

You can use Theorem 6 for solving the following exercise.

E E18) Check if the vectors

1    0    2
0 , -1 ,  3
1    1    0

are linearly independent over R.

(A submatrix of A is a matrix that can be obtained from A by deleting some rows and
columns.)

Now, consider the matrix

    1 2 3
A = 0 4 5
    1 2 3

Since two rows of A are equal, we know that |A| = 0. But consider its 2 x 2 submatrix

A13 = 0 4
      1 2

Its determinant is -4 ≠ 0. In this case we say that the determinant rank of A is 2.

In general, we have the following definition.

Definition: Let A be an m x n matrix. If A ≠ 0, then the determinant rank of A is the
largest positive integer r such that

i) there exists an r x r submatrix of A whose determinant is non-zero, and

ii) for s > r, the determinant of any s x s submatrix of A is 0.

Note: The determinant rank r is defined for any m x n matrix, not only for a square
matrix. Also, r ≤ min (m, n).

                                                1 4
Example 13: Obtain the determinant rank of A =  2 5
                                                3 6

Solution: Since A is a 3 x 2 matrix, the largest possible value of its determinant
rank is 2. Also, the submatrix

1 4
2 5

of A has determinant (-3) ≠ 0. ∴ the determinant rank of A is 2.

Try the following exercise now.

E E19) Calculate the determinant rank of A, where A =

a) 1 2 0      b) 1 2 3
   0 2 1         4 5 6
   1 0 2

And now we come to the reason for introducing the determinant rank: it gives us
another method for obtaining the rank of a matrix.

Theorem 7: The determinant rank of an m x n matrix A is equal to the rank of A.

Proof: Let the determinant rank of A be r. Then there exists an r x r submatrix of A
whose determinant is non-zero. By Theorem 6, its column vectors are linearly
independent. It follows, by the definition of linear independence, that these column
vectors, when extended to the column vectors of A, remain linearly independent.
Thus, A has at least r linearly independent column vectors. Therefore, by definition
of the rank of a matrix,

r ≤ rank (A) = p(A) …… (1)

Also, by definition of p(A), we know that the number of linearly independent rows of
A is p(A). These rows form a p(A) x n matrix B. Thus, B will have p(A) linearly
independent columns. Retaining these linearly independent columns of B, we get a
p(A) x p(A) submatrix C of B. So C is a submatrix of A whose determinant is non-zero,
by Theorem 6, since its columns are linearly independent. Thus, by the definition of
the determinant rank of A, we get

p(A) ≤ r ……..(2)

(1) and (2) give us p(A) = r.

We will use Theorem 7 in the following example.

Example 14: Find the rank of

     2 3 4
A =  3 1 2
    -1 2 2

Solution: det(A) = 0. But

det 2 3  = -7 ≠ 0.
    3 1

Thus, by Theorem 7, p(A) = 2.

Remark: This example shows how Theorem 7 can simplify the calculation of the rank of
a matrix in some cases. We don't have to reduce a matrix to echelon form each time.
But, in (a) of the following exercise, we see a situation where using this method
seems to be as tedious as the row-reduction method.
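For small matrices, the determinant rank can even be computed by brute force over all
square submatrices. A Python sketch (assuming NumPy; the function name det_rank is
ours, and this approach is practical only for small matrices):

    import numpy as np
    from itertools import combinations

    def det_rank(A, tol=1e-9):
        m, n = A.shape
        for r in range(min(m, n), 0, -1):
            for rows in combinations(range(m), r):
                for cols in combinations(range(n), r):
                    if abs(np.linalg.det(A[np.ix_(rows, cols)])) > tol:
                        return r
        return 0

    A = np.array([[2, 3, 4], [3, 1, 2], [-1, 2, 2]], dtype=float)
    print(det_rank(A), np.linalg.matrix_rank(A))   # 2 2, agreeing with Example 14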

E E20) Use Theorem 7 to find the rank of A, where A =

a) 3 1 2 5      b) 2 3 5 1
   1 2 -1 2        1 -1 2 1
   4 3 1 7

E20(a) shows how much time can be taken by using this method. On the other hand,
E20(b) shows how little time it takes to obtain p(A), using the determinant rank.
Thus, the method to be used for obtaining p(A) varies from case to case.

We end this unit by briefly mentioning what we have covered in it.

5.4 Summary

In this unit we have covered the following points.

1) The definition of the determinant of a square matrix.

2) The properties P1-P7, of determinants.


3) The statement and use of the fact that det(AB) = det(A) det(B).

4) The definition of the determinant of a linear transformation from U to V, where
dim U = dim V.

5) The definition of the adjoint of a square matrix.

6) The use of adjoints to obtain the inverse of an invertible matrix.

7) The proof and use of Cramer’s rule for solving a system of linear equations.

8) The proof of the fact that the homogeneous system of linear equations AX = 0 has a
non-zero solution if and only if det(A) = 0.

9) The definition of the determinant rank, and the proof of the fact that rank of A =
determinant rank of A.

5.5 Solutions/Answers

E1) On expanding by the 2nd row, we get

|A| = -5|A21| + 4|A22| - |A23|.

Now, |A21| = det 2 6  = 4 - 18 = -14,
                 3 2

|A22| = det 1 6  = 2 - 42 = -40,
            7 2

|A23| = det 1 2  = 3 - 14 = -11.
            7 3

∴ |A| = (-5)(-14) + 4(-40) - (-11) = 70 - 160 + 11 = -79.

Expanding by the 3rd row, we get

|A| = 7|A31| - 3|A32| + 2|A33| = 7 det 2 6  - 3 det 1 6  + 2 det 1 2
                                       4 1          5 1          5 4

= 7(-22) - 3(-29) + 2(-6) = -79.

Thus, |A| = -79, irrespective of the row that we use to obtain it.

E2) a) At = 1 5 7
            2 4 3
            6 1 2

∴, on expanding by the first row, we get

|At| = 1 det 4 3  - 5 det 2 3  + 7 det 2 4  = 5 + 70 + 7(-22) = -79.
             1 2          6 2          6 1

b) At = -3 2 1 2
        -2 1 0 1
         0 0 1 -3
         2 -1 2 1

Since the 3rd row has the maximum number of zeros, we expand along it. Then

|At| = 1 det -3 2 2   - (-3) det -3 2 1   = 2 + 3(2) = 8.
             -2 1 1              -2 1 0
              2 -1 1              2 -1 2
E3) The magnitude of the required volume is the modulus of

det 1 0 0
    0 1 0  = 1.
    0 0 1

(The box is the unit cube spanned by i, j and k; see Fig. 2.)
E4) The first determinant is zero, using the row equivalent of P2.
The second determinant is zero, using the row equivalent of P5,
since R3 = 2R1.

E5) det a 0 0   = a det b 0  = abc.
        α b 0           γ c
        β γ c

det a d e   = a det b f  = abc.
    0 b f           0 c
    0 0 c
E6) φ(t) = det f(t)  g(t)   = f(t)g'(t) - f'(t)g(t).
               f'(t) g'(t)

∴ φ'(t) = f'(t)g'(t) + f(t)g''(t) - {f''(t)g(t) + f'(t)g'(t)},

since d(fg)/dt = (df/dt)g + f(dg/dt)

= f(t)g''(t) - f''(t)g(t) = det f(t)   g(t)
                                f''(t) g''(t)
E7) Note that B is obtained from A by interchanging C1 and C3.

∴ |B| = -|A|.

Now, expanding along the first row,

|A| = 1 det 2 -2  + (-1)^(1+3) (-1) det 0 2   = 4 + 6 = 10. ∴ |B| = -10.
           -3 5                         3 -3

Also, AB = -6  3  -2
           -14 10 -6
            28 -21 18

         = -6  3  -2
           -14 10 -6    (adding 2R2 to R3)
            0  -1  6

|AB| = -(-1) det -6  -2   + 6 det -6  3     (expanding along R3)
                 -14 -6           -14 10

= 8 - 108 = -100 = |A| |B|.


E8) |A| = 1 = |B|. ∴ |A| + |B| = 2.

But A + B = 0 0 ,  ∴ |A + B| = 0 ≠ 2.
            0 0

E9) Let B be the standard basis of R3. The zero operator is 0 : R3 → R3 : 0(x) = 0
∀ x ∈ R3. Now [0]B = 0. ∴ det(0) = 0.

I : R3 → R3 : I(x) = x ∀ x ∈ R3 is the identity operator on R3. Now [I]B = I3.
∴ det(I) = det(I3) = 1.

E10) The standard basis for P2 is {1, x, x^2}.

Now D(1) = 0, D(x) = 1, D(x^2) = 2x.

         0 1 0
∴ [D]B = 0 0 2
         0 0 0

∴ det(D) = 0.

E11) For the matrix of Example 9,

C21 = -det 2 -1  = -13,  C22 = det 0 -1  = 2,  C23 = -det 0 2  = 4.
           1 6                     2 6                    2 1

∴ a11C21 + a12C22 + a13C23 = 0(-13) + 2(2) + (-1)(4) = 0.

Similarly, check that a11C31 + a12C32 + a13C33 = 0 = a11C13 + a21C23 + a31C33.


E12) C11 = 0, C12 = 0, C13 = 0, C21 = -15, C22 = 10, C23 = 0, C31 = 18, C32 = -12,
C33 = 0.

           0 -15 18
∴ Adj(A) = 0  10 -12
           0  0   0

E13) Since |A| = 0, A-1 does not exist.

E14) From E7 we know that |A| = 10.

Now, C11 = 4, C12 = -6, C13 = -6,
C21 = 3, C22 = 8, C23 = 3,
C31 = 2, C32 = 2, C33 = 2.

            4 3 2
∴ Adj(A) = -6 8 2
           -6 3 2

                                 4 3 2
A-1 = (1/|A|) Adj(A) = (1/10)   -6 8 2
                                -6 3 2

Verify that the matrix we have obtained is right, by multiplying it by A.

E15) Since A . Adj(A) = |A| I = Adj(A) . A, and |A| ≠ 0, we find that [Adj(A)]-1
exists, and is (1/|A|) A.

E16) This is of the form AX = B, where

    1 2 4        x        1
A = 2 3 -1 , X = y ,  B = 3
    1 0 -3       z        2

     1 2 4
D1 = 3 3 -1  = -19
     2 0 -3

     1 1 4
D2 = 2 3 -1  = 2
     1 2 -3

     1 2 1
D3 = 2 3 3  = 1
     1 0 2

D = |A| = -11

∴ x = D1/D = 19/11, y = D2/D = -2/11, z = D3/D = -1/11.

E17) The given system is equivalent to AX = 0, where

    2 3 1
A = 1 -1 -1
    4 6 2

Now, the third row of A is twice the first row of A.

∴, by P2 and P4 of Section 3.2, |A| = 0.

∴, by Theorem 5, the given system has a non-zero solution.

E18) det 1 0 2
         0 -1 3  = -3 + 2 = -1 ≠ 0. ∴ the given vectors are linearly independent.
         1 1 0

E19) a) Since |A| ≠ 0, the determinant rank of A is 3. ∴ p(A) = 3.

b) As in Example 13, the determinant rank of A is 2. ∴ p(A) = 2.
E20) a) The determinant rank of A ≤ 3.

Now, the determinant of the 3 x 3 submatrix  3 1 2
                                             1 2 -1  is zero.
                                             4 3 1

Also, the determinant of the 3 x 3 submatrix  3 2 5
                                              1 -1 2  is zero.
                                              4 1 7

In fact, you can check that the determinant of any of the 3 x 3 submatrices is zero.
Now let us look at the 2 x 2 submatrices of A. Since

det 3 1  = 5 ≠ 0,
    1 2

we find that p(A) = 2.

b) The determinant rank of A ≤ 2.

Now det 2 3   = -5 ≠ 0. ∴ p(A) = 2.
        1 -1
UNIT 6 EIGENVALUES AND EIGENVECTORS

Structure

6.1 Introduction
    Objectives
6.2 The Algebraic Eigenvalue Problem
6.3 Obtaining Eigenvalues and Eigenvectors
    Characteristic Polynomial
    Eigenvalues of Linear Transformations
6.4 Diagonalisation
6.5 Summary
6.6 Solutions/Answers

6.1 Introduction

In Unit 5 you studied the matrix of a linear transformation. You have had several
opportunities, in earlier units, to observe that the matrix of a linear
transformation depends on the choice of the bases of the concerned vector spaces.

Let V be an n-dimensional vector space over F, and let T : V → V be a linear
transformation. In this unit we will consider the problem of finding a suitable basis
B, of the vector space V, such that the n x n matrix [T]B is a diagonal matrix. This
problem can also be seen as: given an n x n matrix A, find a suitable n x n
non-singular matrix P such that P-1 AP is a diagonal matrix (see Unit 7, Cor. to
Theorem 10). It is in this context that the study of eigenvalues and eigenvectors
plays a central role. This will be seen in Section 6.4.

The eigenvalue problem involves the evaluation of all the eigenvalues and
eigenvectors of a linear transformation or a matrix. The solution of this problem has
basic applications in almost all branches of the sciences, technology and the social
sciences, besides its fundamental role in various branches of pure and applied
mathematics. The emergence of computers and the availability of modern computing
facilities have further strengthened this study, since they can handle very large
systems of equations.

In Section 6.2 we define eigenvalues and eigenvectors. We go on to discuss a method


of obtaining them, in Section 6.3. In this section we will also define the characteristic
polynomial, of which you will study more in the next unit.
Objectives

After studying this unit, you should be able to


 Obtain the characteristic polynomial of a linear transformation or a matrix;
 Obtain the eigenvalues, eigenvectors and eigenspaces of a linear transformation or a
matrix;
 Obtain a basis of a vector space V with respect to which the matrix of a linear
transformation T : V V is in diagonal form;
 obtain a non-singular matrix P which diagonalises a given diagonalizable matrix A.

6.2 THE ALGEBRAIC EIGENVALUE PROBLEM

Consider the linear mapping T : R2 → R2 : T(x, y) = (2x, y). Then T(1,0) = (2,0)
= 2(1,0). Thus, T(v) = 2v for v = (1,0) ≠ (0,0). In this situation we say that 2 is
an eigenvalue of T. But what is an eigenvalue?

Definitions: An eigenvalue of a linear transformation T : V → V is a scalar λ ∈ F
such that there exists a non-zero x ∈ V with Tx = λx. Any such non-zero x is called
an eigenvector of T with respect to the eigenvalue λ. (In our example above, (1,0) is
an eigenvector of T with respect to the eigenvalue 2.)

Thus, a vector x ∈ V is an eigenvector of the linear transformation T if

i) x is non-zero, and
ii) Tx = λx for some scalar λ ∈ F.

The fundamental algebraic eigenvalue problem deals with the determination of all the
eigenvalues of a linear transformation. Let us look at some examples of how we can
find eigenvalues.

Example 1: Obtain an eigenvalue and a corresponding eigenvector for the linear
operator T : R3 → R3 : T(x, y, z) = (2x, 2y, 2z).

Solution: Clearly, T(x, y, z) = 2(x, y, z) ∀ (x, y, z) ∈ R3. Thus, 2 is an eigenvalue
of T. Any non-zero element of R3 will be an eigenvector of T corresponding to 2.

Example 2: Obtain the eigenvalues, and some corresponding eigenvectors, of
T : C3 → C3 : T(x, y, z) = (ix, -iy, z).

Solution: Firstly, note that T is a linear operator. Now, if λ ∈ C is an eigenvalue,
then there exists (x, y, z) ≠ (0, 0, 0) such that T(x, y, z) = λ(x, y, z), that is,
(ix, -iy, z) = (λx, λy, λz).

⇒ ix = λx, -iy = λy, z = λz ……(1)

These equations are satisfied if λ = i, y = 0, z = 0.

∴ λ = i is an eigenvalue, with a corresponding eigenvector being (1, 0, 0) (or
(x, 0, 0) for any x ≠ 0).

(1) is also satisfied if λ = -i, x = 0, z = 0, or if λ = 1, x = 0, y = 0. Therefore,
-i and 1 are also eigenvalues, with corresponding eigenvectors (0, y, 0) and
(0, 0, z), respectively, for any y ≠ 0, z ≠ 0.

Do try the following exercise now.

E E1) Let T : R2 R2 be defined by T(x,y) = Obtain an eigenvalue and a


corresponding eigenvector of T.

Warning: The zero vector can never be an eigenvector. But 0 ∈ F can be an eigenvalue.
For example, 0 is an eigenvalue of the linear operator in E1, a corresponding
eigenvector being (0, 1).

Now we define a vector space corresponding to an eigenvalue of T : V → V. Suppose
λ ∈ F is an eigenvalue of the linear transformation T. Define the set

Wλ = {x ∈ V | T(x) = λx}
   = {0} ∪ {eigenvectors of T corresponding to λ}.

So, a vector v ∈ Wλ if and only if v = 0 or v is an eigenvector of T corresponding
to λ.

Now, x ∈ Wλ ⇔ Tx = λIx, I being the identity operator
⇔ (T - λI)x = 0
⇔ x ∈ Ker (T - λI).

∴ Wλ = Ker (T - λI), and hence Wλ is a subspace of V (ref. Unit 5, Theorem 4).
Since λ is an eigenvalue of T, it has an eigenvector, which must be non-zero. Thus,
Wλ is non-zero.

Definition: For an eigenvalue λ of T, the non-zero subspace Wλ is called the
eigenspace of T associated with the eigenvalue λ.

Example 3: Obtain W2 for the linear operator given in Example 1.

Solution: W2 = {(x,y,z) ∈ R3 | T(x,y,z) = 2(x,y,z)}
            = {(x,y,z) ∈ R3 | (2x,2y,2z) = 2(x,y,z)} = R3.

Now, try the following exercise.


E E2) For T in Example 2, obtain the complex vector spaces Wi, W-i and W1.

As with every other concept related to linear transformations, we can define
eigenvalues and eigenvectors for matrices also. Let us do so.

Let A be any n x n matrix over the field F. As we have said in Unit 2 (Theorem 5),
the matrix A becomes a linear transformation from Vn(F) to Vn(F), if we define

A : Vn(F) → Vn(F) : A(X) = AX.

Also, you can see that [A]Bo = A, where

Bo = {e1 = (1,0,…,0)t, e2 = (0,1,0,…,0)t, …, en = (0,…,0,1)t}

is the standard ordered basis of Vn(F). Thus, the matrix of the linear transformation
A, from Vn(F) to Vn(F), with respect to the standard basis Bo, is A itself. This is
why we denote the linear transformation A by A itself.

Looking at matrices as linear transformations in the above manner will help you in
the understanding of eigenvalues and eigenvectors for matrices.

Definition: A scalar λ is an eigenvalue of an n x n matrix A over F if there exists
X ∈ Vn(F), X ≠ 0, such that AX = λX. Such non-zero vectors X are eigenvectors of the
matrix A corresponding to the eigenvalue λ.

Let us look at a few examples.

                   1 0 0
Example 4: Let A = 0 2 0 . Obtain an eigenvalue and a corresponding eigenvector of A.
                   0 0 3

Solution: Now,

  1     1
A 0  =  0
  0     0

This shows that 1 is an eigenvalue, and (1, 0, 0)t is an eigenvector corresponding
to it.

In fact,

  0       0          0       0
A 1  =  2 1   and  A 0  =  3 0
  0       0          1       1

Thus, 2 and 3 are eigenvalues of A, with corresponding eigenvectors (0, 1, 0)t and
(0, 0, 1)t, respectively.
Example 5: Obtain an eigenvalue and a corresponding eigenvector of

    0 -1
A = 1 2   ∈ M2(R).

Solution: Suppose λ ∈ R is an eigenvalue of A. Then there exists

    x     0                                 -y       x
X = y  ≠  0  such that AX = λX, that is,   x+2y  = λ y

So, for what values of λ, x and y are the equations -y = λx and x + 2y = λy
satisfied?

Note that x ≠ 0 and y ≠ 0, because if either is zero then the other will have to be
zero. Now, solving our equations, we get λ = 1. An eigenvector corresponding to it is

 1
-1
Now you can solve an eigenvalue problem yourself!


E E3) Show that 3 is an eigenvalue of  1 2 . Find 2 corresponding eigenvectors.
                                       0 3

Just as we defined an eigenspace associated with a linear transformation, we define
the eigenspace Wλ, corresponding to an eigenvalue λ of an n x n matrix A, as follows:

Wλ = {X ∈ Vn(F) | AX = λX} = {X ∈ Vn(F) | (A - λI)X = 0}.

For example, the eigenspace W1, in the situation of Example 4, is

  x                  x     x         x
{ y  ∈ V3(R) |   A   y  =  y  } =  { y  ∈ V3(R) | x = x, 2y = y, 3z = z }
  z                  z     z         z

which is the same as {(x, 0, 0)t | x ∈ R}.

E E4) Find W3 for the matrix in E3.


The algebraic eigenvalue problem for matrices is to determine all the eigenvalues and
eigenvectors of a given matrix. In fact, the eigenvalues and eigenvectors of an n x n
matrix A are precisely the eigenvalues and eigenvectors of A regarded as a linear
transformation from Vn(F) to Vn(F).

We end this section with the following remark:

A scalar λ is an eigenvalue of the matrix A if and only if (A - λI)X = 0 has a
non-zero solution, i.e., if and only if det(A - λI) = 0.

Similarly, λ is an eigenvalue of the linear transformation T if and only if
det(T - λI) = 0.

So far we have been obtaining eigenvalues by observation, or by some calculations
that may not give us all the eigenvalues of a given matrix or linear transformation.
The remark above suggests where to look for all the eigenvalues. In the next section
we determine eigenvalues and eigenvectors explicitly.

6.3 OBTAINING EIGENVALUES AND EIGENVECTORS

In the previous section we have seen that a scalar λ is an eigenvalue of a matrix A
if and only if det(A - λI) = 0. In this section we shall see how this equation helps
us to solve the eigenvalue problem.

6.3.1 Characteristic Polynomial

Once we know that λ is an eigenvalue of a matrix A, the eigenvectors can easily be
obtained by finding non-zero solutions of the system of equations given by AX = λX.

Now, if A = [aij]nxn and X = (x1, x2, …, xn)t, the equation AX = λX becomes, on
writing it out, the following system of equations:

a11x1 + a12x2 + … + a1nxn = λx1
a21x1 + a22x2 + … + a2nxn = λx2
  :       :              :
an1x1 + an2x2 + … + annxn = λxn

This is equivalent to the following system:

(a11 - λ)x1 + a12x2 + … + a1nxn = 0
a21x1 + (a22 - λ)x2 + … + a2nxn = 0
  :       :              :
an1x1 + an2x2 + … + (ann - λ)xn = 0

This homogeneous system of linear equations has a non-trivial solution if and only
if the determinant of the coefficient matrix is equal to 0 (by Theorem 5 of Unit 5).
Thus, λ is an eigenvalue of A if and only if

              a11-λ  a12   …  a1n
              a21    a22-λ …  a2n
det(A - λI) =  :      :        :      = 0
              an1    an2   …  ann-λ

Now, det(λI - A) = (-1)^n det(A - λI) (multiplying each row by (-1)). Hence,
det(λI - A) = 0 if and only if det(A - λI) = 0.

This leads us to define the concept of the characteristic polynomial.

Definition: Let A = [aij] be any n x n matrix. Then the characteristic polynomial of
the matrix A is defined by

fA(t) = det(tI - A)

        t-a11  -a12  …  -a1n
     =  -a21   t-a22 …  -a2n
          :      :        :
        -an1   -an2  …  t-ann

     = t^n + c1 t^(n-1) + c2 t^(n-2) + … + c(n-1) t + cn,

where the coefficients c1, c2, …, cn depend on the entries aij of the matrix A.

The equation fA(t) = 0 is the characteristic equation of A.

When no confusion arises, we shall simply write f(t) in place of fA(t).
Consider the following example.

Example 6: Obtain the characteristic polynomial of the matrix


1 2
0 -1 .

Solution: The required polynomial is t -1 -2


0 t+1
= (t-1) (t+1) = t2 -1.
Note try this exercise.

E E5) Obtain the characteristic polynomial of the matrix

0 0 2
1 0 1
0 1 -2

Note that  is an eigenvalue of A iff det(I – A) = fA() = 0, that is, iff is a root of
the characteristic polynomial fA(t), defined above. Due to this fact, eigenvalues are
also called characteristic root, and eigenvectors are called chrematistic vectors.

For example, the eigenvalues of the matrix in Example 6 are the roots of the
polynomial t2 – 1, namely, 1 and 9 – 1).

E E6) Find the eigenvalues of the matrix in E5.


Now, the characteristic polynomial f A(t) is a polynomial of degreen. Hence, it can
have n roots at the most. Thus, an n x n matrix has two eigenvalues, at the most. for
example, the matrix in Example 6 has two eigenvalues, 1 and – 1 , and the matrix in
E5 has 3 eigenvalues.

Now we will prove a theorem that will help us in Section 6.4.

Theorem 1: Similar matrices have the same eigenvalues.

Proof: Let an n x n matrix B be similar to an n x n matrix A.

Then, by definition, B = P-1 AP, for some invertible matrix P.

Now, the characteristic polynomial of B,

fB(t) = det(tI - B)
      = det(tI - P-1 AP)
      = det(P-1 (tI - A) P), since P-1 (tI) P = t P-1 P = tI
      = det(P-1) det(tI - A) det(P)    (by the product formula for determinants)
      = det(tI - A) det(P-1) det(P)
      = fA(t) det(P-1 P)
      = fA(t), since det(P-1 P) = det(I) = 1.

Thus, the roots of fB(t) and fA(t) coincide. Therefore, the eigenvalues of A and B
are the same.
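A numeric illustration of Theorem 1, as a Python sketch (assuming NumPy; np.poly
returns the coefficients of the characteristic polynomial det(tI - A), and the matrix
P below is just one invertible choice of ours):

    import numpy as np

    A = np.array([[0, 0, 2], [1, 0, 1], [0, 1, -2]], dtype=float)
    P = np.array([[1, 2, 0], [0, 1, 1], [1, 0, 3]], dtype=float)   # det(P) = 5
    B = np.linalg.inv(P) @ A @ P
    print(np.round(np.poly(A), 6))   # [ 1.  2. -1. -2.]  i.e. t^3 + 2t^2 - t - 2
    print(np.round(np.poly(B), 6))   # the same coefficients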

Let us consider some more examples, so that the concepts mentioned in this section
become absolutely clear to you.

Example 7: Find the eigenvalues and eigenvectors of the matrix

    0 0 2
A = 1 0 1
    0 1 -2

Solution: In solving E6 you found that the eigenvalues of A are λ1 = 1, λ2 = -1,
λ3 = -2. Now we obtain the eigenvectors of A.

The eigenvectors of A with respect to the eigenvalue λ1 = 1 are the non-trivial
solutions of

0 0 2    x1       x1
1 0 1    x2  = 1  x2
0 1 -2   x3       x3

which gives the equations

2x3 = x1            x1 = 2x3
x1 + x3 = x2    ⇒   x2 = x1 + x3 = 3x3
x2 - 2x3 = x3       x3 = x3

Thus, the eigenvectors corresponding to λ1 = 1 are x3(2, 3, 1)t, x3 ≠ 0. In the same
way you can check that the eigenvectors corresponding to λ2 = -1 are x3(-2, 1, 1)t,
x3 ≠ 0, and those corresponding to λ3 = -2 are x1(1, 0, -1)t, x1 ≠ 0.

Example 8: Find the eigenvalues and eigenvectors of the matrix

     1  1 0 0
B = -1 -1 0 0
    -2 -2 2 1
     1  1 -1 0

Solution: The characteristic polynomial of B is t^2 (t - 1)^2, so the eigenvalues
are λ1 = 0 and λ2 = 1.

The eigenvectors corresponding to λ1 = 0 are given by

 1  1  0 0   x1       x1
-1 -1  0 0   x2  = 0  x2
-2 -2  2 1   x3       x3
 1  1 -1 0   x4       x4

which gives
x1 + x2 = 0
-x1 - x2 = 0
-2x1 - 2x2 + 2x3 + x4 = 0
x1 + x2 - x3 = 0

The first and last equations give x3 = 0. Then, the third equation gives x4 = 0.
The first equation gives x1 = -x2. Thus, the eigenvectors are

(-x2, x2, 0, 0)t = x2(-1, 1, 0, 0)t, x2 ≠ 0, x2 ∈ R.

The eigenvectors corresponding to λ2 = 1 are given by

 1  1  0 0   x1       x1
-1 -1  0 0   x2  = 1  x2
-2 -2  2 1   x3       x3
 1  1 -1 0   x4       x4

which gives
x1 + x2 = x1
-x1 - x2 = x2
-2x1 - 2x2 + 2x3 + x4 = x3
x1 + x2 - x3 = x4

The first two equations give x2 = 0 and x1 = 0. Then the last equation gives
x4 = -x3. Thus, the eigenvectors are

(0, 0, x3, -x3)t = x3(0, 0, 1, -1)t, x3 ≠ 0, x3 ∈ R.
-x3 -1
Example 9: Obtain the eigenvalues and eigenvectors of

    0 1 0
A = 1 0 0
    0 0 1

Solution: The characteristic polynomial of A is

                       t -1 0
fA(t) = det(tI - A) = -1  t 0    = (t + 1)(t - 1)^2.
                       0  0 t-1

Therefore, the eigenvalues are λ1 = -1 and λ2 = 1.

The eigenvectors corresponding to λ1 = -1 are given by

0 1 0   x1          x1
1 0 0   x2  = (-1)  x2
0 0 1   x3          x3

which is equivalent to

x2 = -x1
x1 = -x2
x3 = -x3

The last equation gives x3 = 0. Thus, the eigenvectors are

(x1, -x1, 0)t = x1(1, -1, 0)t, x1 ≠ 0, x1 ∈ R.

The eigenvectors corresponding to λ2 = 1 are given by

0 1 0   x1     x1
1 0 0   x2  =  x2
0 0 1   x3     x3

which gives x2 = x1, x1 = x2, x3 = x3.

Thus, the eigenvectors are

(x1, x1, x3)t = x1(1, 1, 0)t + x3(0, 0, 1)t,

where x1, x3 are real numbers, not simultaneously 0.

Note that, corresponding to λ2 = 1, there exist two linearly independent
eigenvectors, (1, 1, 0)t and (0, 0, 1)t, which form a basis of the eigenspace W1.
Thus, W-1 is 1-dimensional, while dim W1 = 2.
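You can check such computations numerically. A Python sketch (assuming NumPy):

    import numpy as np

    A = np.array([[0, 1, 0], [1, 0, 0], [0, 0, 1]], dtype=float)
    vals, vecs = np.linalg.eig(A)
    print(vals)    # the eigenvalues 1, -1, 1 (in some order)
    # each column of vecs is a normalised eigenvector for the matching eigenvalue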
Try the following exercises now.

E E7) Find the eigenvalues and bases for the eigenspaces of the matrix

    2 1 0
A = 0 1 -1
    0 2 4

E E8) Find the eigenvectors of the diagonal matrix

    a1 0  0  … 0
    0  a2 0  … 0
D = 0  0  a3 … 0 ,  where ai ≠ aj for i ≠ j.
    :  :  :    :
    0  0  0  … an
We now turn to the eigenvalues and eigenvectors of linear transformations.

6.3.2 Eigenvalues of Linear Transformations

As in Section 6.2, let T : V → V be a linear transformation on a finite-dimensional
vector space V over the field F. We have seen that

λ ∈ F is an eigenvalue of T
⇔ det(T - λI) = 0
⇔ det(λI - T) = 0
⇔ det(λI - A) = 0, where A = [T]B is the matrix of T with respect to a basis B of V.
(Note that [λI - T]B = λI - [T]B.)

This shows that λ is an eigenvalue of T if and only if λ is an eigenvalue of the
matrix A = [T]B, where B is a basis of V. We define the characteristic polynomial of
the linear transformation T to be the same as the characteristic polynomial of the
matrix A = [T]B, where B is a basis of V.

This definition does not depend on the basis B chosen, since similar matrices have
the same characteristic polynomial (Theorem 1), and the matrices of the same linear
transformation T with respect to two different ordered bases of V are similar.

Just as for matrices, the eigenvalues of T are precisely the roots of the
characteristic polynomial of T.

Example 10: Let T : R2 R2 be the linear transformation which maps e1 = (1,0) to e2


= (0,1) and e2 to – e1. Obtain the eigenvalues of T.
0 -1
Solution: Let A = [T]B =1 0 , where B = {e1,e2}.

The characteristic polynomial of T = the characteristic polynomial of A

T l
= = t2 + 1, which has no real roots.
-1 t

Hence, the linear transformation T has n real eigenvalues. But, it has two complex
eigenvalues, namely i and – i.
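You can see this numerically too. The sketch below (again assuming NumPy) shows that the computed eigenvalues of A are the complex numbers ± i.

import numpy as np

A = np.array([[0, -1],
              [1,  0]])   # the matrix of Example 10

values, vectors = np.linalg.eig(A)
print(values)   # approximately [0.+1.j  0.-1.j], i.e. i and -i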

Try the following exercise now.

E9) Obtain the eigenvalues and eigenvectors of the differential operator
D : P2 → P2 :
D(a0 + a1x + a2x2) = a1 + 2a2x, for a0, a1, a2 ∈ R.


E10) Show that the eigenvalues of a square matrix A coincide with those of At.

E11) Let A be an invertible matrix. If λ is an eigenvalue of A, show that λ ≠ 0 and that
λ-1 is an eigenvalue of A-1.

Now that we have discussed a method of obtaining the eigenvalues and eigenvectors
of a matrix, let us see how they help in transforming any square matrix into a diagonal
matrix.

6.4 Diagonalisation

In this section we start by proving a theorem about the linear independence of
eigenvectors corresponding to distinct eigenvalues.

Theorem 2: Let T : V → V be a linear transformation on a finite-dimensional vector
space V over the field F. Let λ1, λ2, …., λm be the distinct eigenvalues of T, and let
v1, v2, …., vm be eigenvectors of T corresponding to λ1, λ2, …., λm, respectively.
Then v1, v2, …., vm are linearly independent over F.

Proof: We know that

Tvi = λivi, λi ∈ F, 0 ≠ vi ∈ V for i = 1, 2, …, m, and λi ≠ λj for i ≠ j.
Suppose, if possible, that {v1, v2, …., vm} is a linearly dependent set. Now, the single
non-zero vector v1 is linearly independent. We choose r (≤ m) such that
{v1, v2, …., vr-1} is linearly independent and {v1, v2, …., vr-1, vr} is linearly dependent.
Then vr = a1v1 + a2v2 + ….+ ar-1vr-1 ……. (1)
for some a1, a2, ……., ar-1 in F.
Applying T, we get
Tvr = a1Tv1 + … + ar-1Tvr-1. This gives
λrvr = a1λ1v1 + a2λ2v2 + ….. + ar-1λr-1vr-1 ........(2)

Now, we multiply (1) by λr and subtract it from (2), to get

0 = a1(λ1 – λr)v1 + a2(λ2 – λr)v2 + …. + ar-1(λr-1 – λr)vr-1.
Since the set {v1, v2, …., vr-1} is linearly independent, each coefficient in the above
equation must be 0. Thus, we have ai(λi – λr) = 0 for i = 1, 2, ……., r-1.

But λi ≠ λr for i = 1, 2, ……., r-1. Hence (λi – λr) ≠ 0 for i = 1, 2, ….., r-1, and we must
have ai = 0 for i = 1, 2, ……, r-1. However, this is not possible, since (1) would imply
that vr = 0, and, being an eigenvector, vr can never be 0. Thus, we reach a
contradiction.

Hence, the assumption we started with must be wrong. Thus, {v1, v2, …., vm} must be
linearly independent, and the theorem is proved.
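Theorem 2 is easy to test numerically. In the sketch below (again assuming NumPy; the matrix used is the one that reappears in Example 11), the eigenvectors returned for the three distinct eigenvalues form a matrix of full rank, i.e., they are linearly independent.

import numpy as np

A = np.array([[1.0,  2.0,  0.0],
              [2.0,  1.0, -6.0],
              [2.0, -2.0,  3.0]])

values, vectors = np.linalg.eig(A)
print(values)                           # 5, 3 and -3: three distinct eigenvalues
# The eigenvectors are the columns of `vectors`; full rank means independence.
print(np.linalg.matrix_rank(vectors))   # 3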

We will use Theorem 2 to choose a basis for a vector space V so that the matrix [T]B is
a diagonal matrix.

Definition: A linear transformation T : V → V on a finite-dimensional vector space V
is said to be diagonalizable if there exists a basis B = {v1, v2, …., vn} of V such that the
matrix of T with respect to the basis B is diagonal. That is,

        λ1   0   0  …  0
        0   λ2   0  …  0
[T]B =  0    0  λ3  …  .
        .    .   .  …  .
        0    0   0  …  λn  ,

where λ1, λ2, ……, λn are scalars which need not be distinct.

The next theorem tells us under what conditions a linear transformation is
diagonalizable.

Theorem 3: A linear transformation T, on a finite-dimensional vector space V, is
diagonalizable if and only if there exists a basis of V consisting of eigenvectors of T.

Proof: Suppose that T is diagonalizable. By definition, there exists a basis B =
{v1, v2, …., vn} of V, such that

        λ1   0   0  …  0
        0   λ2   0  …  0
[T]B =  0    0  λ3  …  .
        .    .   .  …  .
        0    0   0  …  λn  ,

By definition of [T]B, we must have

Tv1 = λ1v1, Tv2 = λ2v2, ………, Tvn = λnvn.

Since basis vectors are always non-zero, v1, v2, …., vn are non-zero. Thus, we find that
v1, v2, …., vn are eigenvectors of T.

Conversely, let B = {v1, v2, …., vn} be a basis of V consisting of eigenvectors of T.

Then there exist scalars α1, α2, …., αn, not necessarily distinct, such that Tv1 = α1v1,
Tv2 = α2v2, ….., Tvn = αnvn.

But then we have

        α1   0  …  0
        0   α2  …  0
[T]B =  .    .      .   ,  which means that T is diagonalizable.
        .    .      .
        0    0  …  αn

The next theorem combines Theorems 2 and 3.

Theorem 4: Let T : V → V be a linear transformation, where V is an n-dimensional
vector space. Assume that T has n distinct eigenvalues. Then T is diagonalizable.

Proof: Let 12,…… n be the n distinct eigenvalues of T. Then there exist


eigenvectors 12,…… n corresponding to the eigenvalues 12,…… n,
respectively. By theorem 2, the set, v1v2,……v n, is linearly independent and has n
vectors, where n = dim V. Thus, from Unit 5 (corollary to Theorem 5), B =
{v1v2,……v n}is a basis of V consisting of eigenvectors of T. Thus, by theorem 3, T is
diagonalizable.

Just as we have reached the conclusion of Theorem 4 for linear transformations, we
define diagonalisability of a matrix, and reach a similar conclusion for matrices.

Definition: An n x n matrix A is said to be diagonalizable if A is similar to a diagonal
matrix, that is, P-1AP is diagonal for some non-singular n x n matrix P.
Note that the matrix A is diagonalizable if and only if A, regarded as the linear
transformation A : Vn(F) → Vn(F) : A(X) = AX, is diagonalizable.

Thus, Theorems 2, 3 and 4 are true for the matrix A regarded as a linear transformation
from Vn(F) to Vn(F). Therefore, given an n x n matrix A, we know that it is
diagonalizable if it has n distinct eigenvalues.

We now give a practical method of diagonalising a matrix.


Theorem 5: Let A be an n x n matrix having n distinct eigenvalues λ1, λ2, …… λn. Let
X1, X2, ……, Xn ∈ Vn(F) be eigenvectors of A corresponding to λ1, λ2, …… λn,
respectively. Let P = (X1, X2, ……, Xn) be the n x n matrix having X1, X2, ……, Xn as its
column vectors. Then
P-1AP = diag(λ1, λ2, …… λn).

Proof: By actual multiplication, you can see that

AP = A(X1, X2, ……, Xn)
   = (AX1, AX2, ……, AXn)
   = (λ1X1, λ2X2, ……, λnXn)

                         λ1   0  …  0
                         0   λ2  …  0
   = (X1, X2, ……, Xn)    .    .  …  .
                         .    .  …  .
                         0    0  …  λn

   = P diag(λ1, λ2, ……, λn).

Now, by Theorem 2, the column vectors of P are linearly independent. This means
that P is invertible (Unit 9, Theorem 6). Therefore, we can pre-multiply both sides of
the matrix equation AP = P diag(λ1, λ2, ……, λn) by P-1, to get P-1AP = diag(λ1, λ2, ……, λn).

Let us see how this theorem works in practice.

Example 11: Diagonalise the matrix

1 2 0
A= 2 1 -6
2 -2 3

Solution: The characteristic polynomial of A = f(t) =

t-1 -2 0
-2 t-1 6 = (t – 5) (t – 3) (t + 3).
-2 2 t-3

Thus, the eigenvalues of A are λ1 = 5, λ2 = 3, λ3 = -3. Since they are all distinct, A is
diagonalizable (by Theorem 4). You can find the eigenvectors by the method already
explained to you. Right now you can directly verify that

    1       1        1       1           -1         -1
A   2  = 5  2  ,  A  1  = 3  1   and  A   2   = -3   2
   -1      -1        0       0            1          1

Thus,   1        1        -1
        2   ,    1   and   2
       -1        0         1
are eigenvectors corresponding to the distinct eigenvalues 5,3 and -3, respectively. By
Theorem 5, the matrix which diagonalises A is given by

      1  1  -1
P =   2  1   2  .  Check, by actual multiplication, that
     -1  0   1

          5  0   0
P-1AP =   0  3   0   ,  which is in diagonal form.
          0  0  -3
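As a check on Example 11, the following sketch (assuming NumPy, as before) forms P-1AP directly and recovers the diagonal matrix diag(5, 3, -3).

import numpy as np

A = np.array([[1, 2, 0],
              [2, 1, -6],
              [2, -2, 3]])
P = np.array([[1, 1, -1],
              [2, 1, 2],
              [-1, 0, 1]])

D = np.linalg.inv(P) @ A @ P
print(np.round(D, 10))   # diag(5, 3, -3), up to rounding error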

The following exercise will give you some practice in diagonalising matrices.

E12) Are the matrices in Examples 7, 8 and 9 diagonalisable? If so, diagonalise them.
We end this unit by summarizing what has been done in it.

6.5 Summary

As in the previous unit, in this unit also we have treated linear transformations along
with the analogous matrix versions. We have covered the following points here.

1) The definition of eigenvalues, eigenvectors and eigenspaces of linear transformations


and matrices.

2) The definition of the characteristic polynomial and characteristic equation of a linear


transformation (or matrix).

3) A scalar λ is an eigenvalue of a linear transformation T (or matrix A) if and only if it is


a root of the characteristic polynomial of T (or A).

4) A method of obtaining all the eigenvalues and eigenvectors of a linear transformation


(or matrix).

5) Eigenvectors of a linear transformation (or matrix) corresponding to distinct


eigenvalues are linearly independent.

6) A linear transformation T : V → V is diagonalizable if and only if V has a basis


consisting of eigenvectors of T.

7) A linear transformation (or matrix) is diagonalizable if its eigenvalues are distinct.

6.6 Solutions/Answers

E1) Suppose λ ∈ R is an eigenvalue. Then ∃ (x,y) ≠ (0,0) such that T(x,y) = λ(x,y),
i.e., (x,0) = (λx, λy), i.e., x = λx, y = 0. These equations are satisfied if λ = 1, y = 0.
∴ 1 is an eigenvalue. A corresponding eigenvector is (1,0). Note that there are
infinitely many eigenvectors corresponding to 1, namely, (x,0), 0 ≠ x ∈ R.

E2) Wi = {(x,y,z) ∈ C3 | T(x,y,z) = i(x,y,z)}

= {(x,y,z) ∈ C3 | (ix, -iy, z) = (ix, iy, iz)}
= {(x,0,0) | x ∈ C}.
Similarly, you can show that W-i = {(0,x,0) | x ∈ C} and W1 = {(0,0,x) | x ∈ C}.
E3) If (x, y) ≠ (0, 0) is an eigenvector corresponding to the eigenvalue 3, then

1  2     x         x
             = 3        ⇒  x + 2y = 3x and 3y = 3y.
0  3     y         y

These equations are satisfied by x = 1, y = 1 and by x = 2, y = 2.

                             1         2
∴ 3 is an eigenvalue, and       and        are eigenvectors corresponding to 3.
                             1         2
E4) W3 = { (x, y) ∈ V2(R) | (x + 2y, 3y) = (3x, 3y) }
       = { (x, y) ∈ V2(R) | x = y }
       = { (x, x) | x ∈ R }.

This is the 1-dimensional real subspace of V2(R) whose basis is { (1, 1) }.
E5) It is

     t   0   -2
    -1   t   -1    =  t   t   -1    –  2   -1   t
     0  -1   t+2         -1   t+2           0  -1

    = t{t(t + 2) – 1} – 2 = t3 + 2t2 – t – 2.

E6) The eigenvalues are the roots of the polynomial t3 + 2t2 – t – 2 = (t – 1)(t + 1)(t + 2).
∴ they are 1, -1, -2.

              t – 2   -1      0
E7) fA(t) =     0    t – 1    1     = (t – 2)2(t – 3)
                0     -2    t – 4

∴ the eigenvalues are λ1 = 2, λ2 = 3.

The eigenvectors corresponding to λ1 are given by

2  1  0     x         x
0  1 -1     y   = 2   y
0  2  4     z         z
This leads us to the equations.

2x + y = 2x           x = x
y – z = 2y       ⇒    y = 0
2y + 4z = 2z          z = 0

           x                                       1
∴ W2 =     0     x ∈ R   ;  a basis for W2 is      0
           0                                       0
The eigenvectors corresponding to λ2 are given by

2  1  0     x         x
0  1 -1     y   = 3   y     This gives us the equations
0  2  4     z         z

2x + y = 3x           x = x
y – z = 3y       ⇒    y = x
2y + 4z = 3z          z = -2x

           x                                       1
∴ W3 =     x     x ∈ R   ;  a basis for W3 is      1
         -2x                                      -2

              t – a1     0     …     0
                0     t – a2   …     0
E8) fD(t) =     .        .     …     .      = (t – a1)(t – a2) …. (t – an)
                .        .     …     .
                0        0     …   t – an

∴ its eigenvalues are a1, a2, …., an.


The eigenvectors corresponding to a1 are given by

a1   0   …   0      x1           x1
0    a2  …   0      x2           x2
.    .   …   :      :     = a1   :
0    0   …   an     xn           xn

This gives us the equations

a1x1 = a1x1            x1 = x1
a2x2 = a1x2      ⇒     x2 = 0
  :      :              :
anxn = a1xn            xn = 0

(since ai ≠ a1 for i ≠ 1).


                                             x1
                                             0
∴ the eigenvectors corresponding to a1 are   :    ,  x1 ≠ 0, x1 ∈ R.
                                             0

                                                          0
                                                          x2
Similarly, the eigenvectors corresponding to a2 are       0    ,  x2 ≠ 0, x2 ∈ R,
                                                          :
                                                          0

and so on.
E9) B = {1, x, x2} is a basis of P2.

              0  1  0
Then [D]B =   0  0  2
              0  0  0  .

                                              t  -1   0
∴ the characteristic polynomial of D is       0   t  -2    = t3.
                                              0   0   t

∴ the only eigenvalue of D is λ = 0.


The eigenvectors corresponding to λ = 0 are a0 + a1x + a2x2, where
D(a0 + a1x + a2x2) = 0, that is, a1 + 2a2x = 0.

This gives a1 = 0, a2 = 0. ∴ the set of eigenvectors corresponding to λ = 0 is

{a0 | a0 ∈ R, a0 ≠ 0} = R \ {0}.

E10) |tI – A| = |(tI – A)t|, since |At| = |A|
= |tI – At|, since It = I and (B – C)t = Bt – Ct.
∴ the eigenvalues of A are the same as those of At.

E11) Let X be an eigenvector corresponding to λ. Then X ≠ 0 and AX = λX.

⇒ A-1(AX) = A-1(λX)
⇒ (A-1A)X = λ(A-1X)
⇒ X = λ(A-1X)
∴ λ ≠ 0, since X ≠ 0.
Also, X = λ(A-1X) ⇒ A-1X = λ-1X. ∴ λ-1 is an eigenvalue of A-1.

E12) Since the matrix in Example 7 has distinct eigenvalues 1, -1 and -2, it is
diagonalizable. Eigenvectors corresponding to these eigenvalues are

 2        -2        -1
 3   ,     1   and   0   ,  respectively.
 1         1         1

             2  -2  -1              0  0   2          1   0   0
∴ if P =     3   1   0 ,  then P-1  1  0   1   P  =   0  -1   0
             1   1   1              0  1  -2          0   0  -2 .

The matrix in Example 8 is not diagonalizable. This is because it only has two
distinct eigenvalues and, corresponding to each, it has only one linearly independent
eigenvector. ∴ we cannot find a basis of V4(F) consisting of eigenvectors. Now
apply Theorem 3.

The matrix in Example 9 is diagonalizable though it only has two distinct eigenvalues.
This is because corresponding to λ1 = -1 there is one linearly independent eigenvector,
and corresponding to λ2 = 1 there exist two linearly independent eigenvectors.
Therefore, we can form a basis of V3(R) consisting of the eigenvectors

 1        1        0
-1   ,    1   ,    0
 0        0        1 .

                1  1  0
The matrix P = -1  1  0   is invertible, and
                0  0  1

        0  1  0          -1  0  0
P-1     1  0  0    P  =   0  1  0
        0  0  1           0  0  1

UNIT 7

CHARACTERISTIC AND MINIMAL POLYNOMIAL

Structure
7.1 Introduction
Objectives
7.2 Cayley-Hamilton Theorem
7.3 Minimal Polynomial
7.4 Summary
7.5 Solutions/Answers

7.1 Introduction
This unit is basically a continuation of the previous unit, but the emphasis is on a
different aspect of the problem discussed in the previous unit.

Let T : V → V be a linear transformation on an n-dimensional vector space V over the


field F. The two most important polynomials that are associated with T are the
characteristic polynomial of T and the minimal polynomial of T. We defined the
former in the previous unit and the latter earlier in the course.

In this unit we first show that every square matrix (or linear transformation
T : V → V) satisfies its characteristic equation, and use this to compute the inverse of
the concerned matrix (or linear transformation), if it exists.

Then we define the minimal polynomial of a square matrix, and discuss the
relationship between the characteristic and minimal polynomials. This leads us to a
simple way of obtaining the minimal polynomial of a matrix (or linear
transformation).

We advise you to study Units 2, 4 and 6 before starting this unit.

Objectives

After studying this unit, you should be able to


 State and prove the Cayley-Hamilton theorem;
 Find the inverse of an invertible matrix using this theorem;
 Prove that a scalar λ is an eigenvalue if and only if it is a root of the minimal
polynomial;
 Obtain the minimal polynomial of a matrix (or linear transformation) if the
characteristic polynomial is known.

7.2 Cayley-Hamilton Theorem

In this section we present the Cayley-Hamilton theorem, which is related to the


characteristic equation of a matrix. It is named after the British mathematicians
Arthur Cayley (1821-1895) and William Hamilton (1805-1865), who were
responsible for a lot of the work done in the theory of determinants.

0 1 2
Let us consider the 3 x 3 matrix A = -1 2 1
0 3 2
t -1 -2
Then tI – A = 1 t-2 -1
0 -3 t-2

Let ∆ij denote the (i,j)th cofactor of (tI – A).

Then ∆11 = (t – 2)2 – 3 = t2 – 4t + 1, ∆12 = -(t – 2), ∆13 = -3, ∆21 = t + 4, ∆22 = t2 – 2t,
∆23 = 3t, ∆31 = 2t – 3, ∆32 = t – 2, ∆33 = t2 – 2t + 1.

Since Adj(tI – A) is the transpose of the matrix of cofactors,

                 t2 – 4t + 1      t + 4       2t – 3
Adj(tI – A) =     -(t – 2)       t2 – 2t       t – 2
                     -3             3t      t2 – 2t + 1

    1  0  0         -4   1   2         1   4  -3
=   0  1  0  t2 +   -1  -2   1  t +    2   0  -2
    0  0  1          0   3  -2        -3   0   1

This is a polynomial in t of degree 2, with matrix coefficients.

Similarly, if we consider the n x n matrix A = [aij], then Adj(tI – A) is a
polynomial of degree ≤ n – 1, with matrix coefficients. Let

Adj(tI – A) = B1tn-1 + B2tn-2 + … + Bn-1t + Bn   …(1)

where B1, ….., Bn are n x n matrices over F.
Now, the characteristic polynomial of A is given by
f(t) = fA(t) = det(tI – A) = |tI – A|

    t – a11    – a12    . . .    – a1n
    – a21     t – a22   . . .    – a2n
=      .          .     . . .       .       ,  where A = [aij]
       .          .     . . .       .
    – an1      – an2    …..     t – ann

= tn + c1tn-1 + c2tn-2 + …… + cn-1t + cn,   …(2)


where the coefficients c1, c2, ……, cn lie in F. We will use (1) and (2) to prove the Cayley-Hamilton theorem.

Theorem 1 (Cayley-Hamilton): Let f(t) = tn + c1tn-1 + ……. + cn-1t + cn be the
characteristic polynomial of an n x n matrix A. Then
f(A) = An + c1An-1 + c2An-2 + …. + cn-1A + cnI = 0.
(Note that over here 0 denotes the n x n zero matrix, and I = In.)

Proof: We know that
(tI – A) Adj(tI – A) = det(tI – A) I
= f(t) I.

Now, equation (1) above says that

Adj(tI – A) = B1tn-1 + B2tn-2 + ……. + Bn, where Bk is an n x n matrix for k =
1, 2, …., n.
Thus, we have
(tI – A)(B1tn-1 + B2tn-2 + B3tn-3 + ……. + Bn-2t2 + Bn-1t + Bn)
= f(t) I
= Itn + c1Itn-1 + c2Itn-2 + …… + cn-2It2 + cn-1It + cnI, substituting the value of f(t).

Now, comparing the constant terms and the coefficients of t, t2, ….., tn on both sides, we
get

– ABn = cnI
Bn – ABn-1 = cn-1I
Bn-1 – ABn-2 = cn-2I
  .      .      .
  .      .      .
B3 – AB2 = c2I
B2 – AB1 = c1I
B1 = I

Pre-multiplying the first equation by I, the second by A, the third by A2, ….., the last by An,
and adding all these equations, we get
0 = cnI + cn-1A + cn-2A2 + … + c2An-2 + c1An-1 + An = f(A).
Thus, f(A) = An + c1An-1 + c2An-2 + … + cn-1A + cnI = 0, and the Cayley-Hamilton
theorem is proved.

This theorem can also be stated as


“Every square matrix satisfies its characteristic polynomial”.
Remark 1: You may be tempted to give the following 'quick' proof of Theorem
1:
f(t) = det (tI – A)
⇒ f(A) = det(AI – A) = det(A – A) = det(0) = 0

This proof is false. Why? Well, the left hand side of the above equation, f(A), is an
n x n matrix, while the right hand side is the scalar 0, being the value of det(0).

Now, as usual, we give the analogue of Theorem 1 for linear transformations.

Theorem 2 (Cayley-Hamilton): Let T be a linear transformation on a finite-
dimensional vector space V. If f(t) is the characteristic polynomial of T, then
f(T) = 0.

Proof: Let dim V = n, and let B = {v1, v2, ……, vn} be a basis of V. In the previous
unit we observed that
f(t) = the characteristic polynomial of T
= the characteristic polynomial of the matrix [T]B.

Let [T]B = A.

If f(t) = tn + c1tn-1 + c2tn-2 + … + cn-1t + cn, then, by Theorem 1,

f(A) = An + c1An-1 + c2An-2 + ….. + cn-1A + cnI = 0.
Now, we have seen earlier that [ ]B is a vector space isomorphism.
Thus,

[f(T)]B = [Tn + c1Tn-1 + c2Tn-2 + … + cn-1T + cnI]B
        = ([T]B)n + c1([T]B)n-1 + c2([T]B)n-2 + … + cn-1[T]B + cn[I]B
        = An + c1An-1 + c2An-2 + … + cn-1A + cnI
        = f(A) = 0.

Again, using the one-one property of [ ]B, this implies that f(T) = 0.
Thus, Theorem 2 is true.

Let us look at some examples now.

                                                          3  2
Example 1: Verify the Cayley-Hamilton theorem for A =
                                                         -1  0

Solution: The characteristic polynomial of A is

t – 3   -2
             = t2 – 3t + 2.
  1      t

∴ we want to verify that A2 – 3A + 2I = 0.

            3  2     3  2        7  6
Now, A2 =                  =
           -1  0    -1  0       -3 -2

                   7  6       9  6       2  0       0  0
∴ A2 – 3A + 2I =          –           +          =
                  -3 -2      -3  0       0  2       0  0

∴ the Cayley-Hamilton theorem is true in this case.
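A numerical check of Example 1 (a sketch, again assuming NumPy):

import numpy as np

A = np.array([[3, 2],
              [-1, 0]])
# A should satisfy its characteristic polynomial t2 - 3t + 2:
print(A @ A - 3 * A + 2 * np.eye(2))   # the 2 x 2 zero matrix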

E1) Verify the Cayley-Hamilton theorem for A, where A =

         1  0  0           0  1  0            1  0  1
   a)    2  3  0  ,   b)   3  0  1  ,   c)    0  3  1
        -2 -2  1           1 -2 -1            3  3  4
We will now use Theorem 1 to prove a result that gives us a method for obtaining the
inverse of an invertible matrix.

Theorem 3: Let f(t) = tn + c1tn-1 + ….. + cn-1t + cn be the characteristic polynomial of an
n x n matrix A. Then A-1 exists if cn ≠ 0 and, in this case,
A-1 = – (1/cn)(An-1 + c1An-2 + ….. + cn-1I).

Proof: By Theorem 1,
f(A) = An + c1An-1 + … + cn-1A + cnI = 0
⇒ A(An-1 + c1An-2 + … + cn-1I) = – cnI
and (An-1 + c1An-2 + … + cn-1I)A = – cnI
⇒ A[– (1/cn)(An-1 + c1An-2 + … + cn-1I)] = I
  = [– (1/cn)(An-1 + c1An-2 + … + cn-1I)]A.
Thus, A is invertible, and
A-1 = – (1/cn)(An-1 + c1An-2 + … + cn-1I).

Let us see how Theorem 3 works in practice.


2 1 1
Example 2: Is A = -1 2 -1 invertible? If so, find A-1.
-1 1 3
Solution: The characteristic polynomial of A, f(t)

t-2 -1 -1
= 1 t- 2 1 = t3 – 7t2 + 19t – 19.
1 -1 t -3

Since the constant term of f(t) is -19 ≠ 0, A is invertible.


Now, by Theorem 1, f(A) = A3 -7A2 + 19A – 19I = 0
 (1/19) A (A2 – 7A + 19I) = I
Therefore, A-1 = (1/19) (A2 – 7A + 19I)
2 5 4
Now, A2 = -3 2 -6
-6 4 7

               7  -2  -3
A-1 = 1/19     4   7   1
               1  -3   5

To make sure that there has been no error in calculation, multiply this matrix by A.
You should get I!
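The same check can be automated. The sketch below (assuming NumPy) computes A-1 from the Cayley-Hamilton theorem exactly as in Example 2, and then confirms that the product with A is the identity.

import numpy as np

A = np.array([[2, 1, 1],
              [-1, 2, -1],
              [-1, 1, 3]])

# A-1 = (1/19)(A2 - 7A + 19I), from f(t) = t3 - 7t2 + 19t - 19
A_inv = (A @ A - 7 * A + 19 * np.eye(3)) / 19
print(np.round(A @ A_inv, 10))   # the 3 x 3 identity matrix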

Now try the following exercise.

E2) For the matrices in E1, obtain A-1, wherever possible.

Now let us look closely at the minimal polynomial.

7.3 Minimal Polynomial

In Unit 6 we defined the minimal polynomial of a linear transformation T : V → V. We
said that it is the monic polynomial of least degree, with coefficients in F, which is
satisfied by T. But we weren't able to give a method of obtaining the minimal
polynomial of T.

In this section we will show that the minimal polynomial divides the characteristic
polynomial. Moreover, the roots of the minimal polynomial are the same as those of
the characteristic polynomial. Since it is easy to obtain the characteristic polynomial
of T, these facts will give us a simple way of finding the minimal polynomial of T.

Let us first recall some properties of the minimal polynomial of T that we gave in Unit
6. Let p(t) be the minimal polynomial of T. Then
MP1) p(t) is a monic polynomial with coefficients in F.
MP2) If q(t) is a non-zero polynomial over F such that deg q < deg p, then q(T) ≠ 0.
MP3) If, for some polynomial g(t) over F, g(T) = 0, then p(t) | g(t). That is, there
exists a polynomial h(t) over F such that g(t) = p(t)h(t).

We will now obtain the first link in the relationship between the minimal polynomial
and the characteristic polynomial of a linear transformation.

Theorem 4: The minimal polynomial of a linear transformation divides its
characteristic polynomial.

Proof: Let the characteristic polynomial and the minimal polynomial of T be f(t) and
p(t), respectively. By Theorem 2, f(T) = 0. Therefore, by MP3, p(t) divides f(t), as
desired.

Before going on to show the full relationship between the minimal and characteristic
polynomials, we state (but don’t prove!) two theorems that will be used again and
again, in this course as well as other courses.

Theorem 5 (Division algorithm for polynomials): Let f and g be two polynomials
in t with coefficients in a field F such that f ≠ 0. Then
a) there exist polynomials q and r with coefficients in F such that g = fq + r, where r =
0 or deg r < deg f, and
b) if we also have g = fq1 + r1, with r1 = 0 or deg r1 < deg f, then q = q1 and r = r1.

An immediate corollary follows.


Corollary: If g is a polynomial over F with λ as a root, then g(t) = (t – λ)q(t), for some
polynomial q over F.

Proof: By the division algorithm, taking f = (t – λ), we get

g(t) = (t – λ)q(t) + r(t),   …….(1)
with r = 0 or deg r < deg(t – λ) = 1.
If deg r < 1, then r is a constant.
Putting t = λ in (1) gives us
g(λ) = r(λ) = r, since r is a constant. But g(λ) = 0, since λ is a root of g. ∴ r = 0. Thus,
the only possibility is r = 0. Hence, g(t) = (t – λ)q(t).

And now we come to a very important result that you may have been using often, without
realising it.

Theorem 6 (Fundamental Theorem of Algebra): Every non-constant polynomial with
complex coefficients has at least one root in C.

In other words, this theorem says that any polynomial f(t) = αntn + αn-1tn-1 + … + α1t + α0
(where α0, ….., αn ∈ C, αn ≠ 0, n ≥ 1) has at least one root in C.

Remark 2: In Theorem 6, if λ1 ∈ C is a root of f(t) = 0, then by the corollary to Theorem 5,
f(t) = (t – λ1)f1(t). Here deg f1 = n – 1. If f1(t) is not constant, then the equation f1(t) = 0 has a
root λ2 ∈ C, and f1(t) = (t – λ2)f2(t). Consequently, f(t) = (t – λ1)(t – λ2)f2(t). Here deg f2 =
n – 2. Using the fundamental theorem repeatedly, we get
f(t) = αn(t – λ1)(t – λ2) …. (t – λn) for some λ1, λ2, ….., λn in C, which are not necessarily
distinct. (This process has to stop after n steps since deg f = n.) Thus, all the roots of
f(t) belong to C, and these are n in number. They may not all be distinct. Suppose
λ1, λ2, ….., λk are the distinct roots, and they are repeated m1, m2, …., mk times,
respectively. Then m1 + m2 + … + mk = n, and f(t) = αn(t – λ1)m1 (t – λ2)m2 … (t – λk)mk.

For example, the polynomial equation t3 – it2 + t – i = 0 has no real roots, but it has two
distinct complex roots, namely, i and – i. And we write t3 – it2 + t – i = (t – i)2(t + i).
Here i is repeated twice and – i only occurs once.

We can similarly show that any polynomial f(t) over R can be written as a product of
linear polynomials and quadratic polynomials. For example, the real polynomial t3 – 1
= (t – 1)(t2 + t + 1).
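The roots promised by the fundamental theorem can be computed numerically. The sketch below (assuming NumPy; coefficients are listed from the highest power of t down to the constant) recovers the two factorisations above.

import numpy as np

# t3 - it2 + t - i = (t - i)2(t + i): the root i occurs twice, -i once
print(np.roots([1, -1j, 1, -1j]))

# t3 - 1 = (t - 1)(t2 + t + 1): one real root and two complex roots
print(np.roots([1, 0, 0, -1]))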

Now we go on to show the second and final link that relates the minimal and
characteristic polynomials of T : V → V, where V is a vector space over F. Let p(t) be
the minimal polynomial of T. We will show that a scalar λ is an eigenvalue of T if and
only if λ is a root of p(t). The proof will utilise the following remark.

Remark 3: If λ is an eigenvalue of T, then Tx = λx for some x ∈ V, x ≠ 0. But Tx = λx

⇒ T2x = T(Tx) = T(λx) = λ2x. By induction it is easy to see that Tkx = λkx for all k.
Now, if g(t) = antn + an-1tn-1 + ….. + a1t + a0 is a polynomial over F, then g(T) = anTn +
an-1Tn-1 + …… + a1T + a0I.

This means that

g(T)x = anTnx + an-1Tn-1x + … + a1Tx + a0x
      = anλnx + an-1λn-1x + … + a1λx + a0x
      = g(λ)x.
Thus, λ is an eigenvalue of T ⇒ g(λ) is an eigenvalue of g(T).

Now for the theorem.

Theorem 7: Let T be a linear transformation on a finite-dimensional vector space V over the
field F. Then λ ∈ F is an eigenvalue of T if and only if λ is a root of the minimal
polynomial of T.

Proof: Let p be the minimal polynomial of T, and let λ ∈ F. Suppose λ is an eigenvalue
of T. Then Tx = λx for some 0 ≠ x ∈ V. Also, by Remark 3, p(T)x = p(λ)x. But p(T) = 0.
Thus, p(λ)x = 0, and since x ≠ 0, this gives p(λ) = 0, that is, λ is a root of p(t).

Conversely, let λ be a root of p(t). Then p(λ) = 0 and, by the corollary to Theorem 5,
p(t) = (t – λ)q(t), deg q < deg p, q ≠ 0. By the property MP2, ∃ v ∈ V such that q(T)v ≠ 0.
Let x = q(T)v ≠ 0. Then
(T – λI)x = (T – λI)q(T)v = p(T)v = 0

⇒ Tx – λx = 0 ⇒ Tx = λx. Hence, λ is an eigenvalue of T.


so,  is an eigenvalue of T iff  is a root of the minimal polymial of t.

In the previous unit we already observed that λ is an eigenvalue of T if and only if λ is a
root of the characteristic polynomial of T. Hence, we have shown that both the
minimal and characteristic polynomials of T have the same roots, namely, the
eigenvalues of T.

Caution: Though the roots of the characteristic polynomial and the minimal polynomial
coincide, the two polynomials are not the same, in general.

For example, if the characteristic polynomial of T : R4 → R4 is (t + 1)2(t – 2)2, then the
minimal polynomial could be (t + 1)(t – 2), or (t + 1)2(t – 2), or (t + 1)(t – 2)2, or even
(t + 1)2(t – 2)2, depending on which of these polynomials is satisfied by T.

In general, let f(t) = (t – λ1)n1 (t – λ2)n2 … (t – λr)nr be the characteristic polynomial of
a linear transformation T, where deg f = n (i.e., n1 + n2 + … + nr = n) and λ1, ….., λr ∈ C
are distinct. Then the minimal polynomial p(t) is given by
p(t) = (t – λ1)m1 (t – λ2)m2 … (t – λr)mr, where 1 ≤ mi ≤ ni for i = 1, 2, ….., r.
In case T has n distinct eigenvalues, then
f(t) = (t – λ1)(t – λ2) …... (t – λn)
and therefore,
p(t) = (t – λ1)(t – λ2) ……. (t – λn) = f(t).

E3) What can the minimal polynomial of T : R3 → R3 be if its characteristic polynomial
is
a) t3,  b) t(t – 1)(t + 2)?

Analogous to the definition of the minimal polynomial of a linear transformation, we
define the minimal polynomial of a matrix.

Definition: The minimal polynomial of a matrix A over F is the monic polynomial p(t)
such that

i) p(A) = 0, and
ii) if q(t) is a non-zero polynomial over F such that deg q < deg p, then q(A) ≠ 0.

We state two theorems which are analogues of Theorems 4 and 7. Their proofs are also
similar to those of Theorems 4 and 7.

Theorem 8: The minimal polynomial of a matrix divides its characteristic polynomial.


Theorem 9: The roots of the minimal polynomial and characteristic polynomial of a
matrix are the same, and are the eigenvalues of the matrix.

Let us use these theorems now.


5 -6 -6
Example 3: Obtain the minimal polynomial of A = -1 4 2
3 -6 -4
Solution: The characteristic polynomial of A =
t -5 6 6
f(t) = 1 t -4 -2 = (t- 1) (t – 2)2.
-3 6 t +4

Therefore, the minimal polynomial p(t) is either (t – 1)(t – 2) or (t – 1)(t – 2)2.

Since (A – I)(A – 2I)

      4  -6  -6      3  -6  -6       0  0  0
=    -1   3   2     -1   2   2   =   0  0  0   ,
      3  -6  -5      3  -6  -6       0  0  0

the minimal polynomial of A is p(t) = (t – 1)(t – 2).
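This check, too, can be done numerically; a sketch assuming NumPy:

import numpy as np

A = np.array([[5, -6, -6],
              [-1, 4, 2],
              [3, -6, -4]])
I = np.eye(3)

print((A - I) @ (A - 2 * I))   # the 3 x 3 zero matrix, so p(t) = (t - 1)(t - 2)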


Example 4: Find the minimal polynomial of
3 1 -1
A= 2 2 -1
2 2 0

Solution: The characteristic polynomial of A is

       t – 3    -1     1
f(t) =  -2    t – 2    1    = (t – 1)(t – 2)2.
        -2      -2     t

Again, as before, the minimal polynomial p(t) of A is either (t – 1)(t – 2) or
(t – 1)(t – 2)2. But, in this case,

                   2  1  -1     1  1  -1       2  0  -1
(A – I)(A – 2I) =  2  1  -1     2  0  -1   =   2  0  -1   ≠ 0.
                   2  2  -1     2  2  -2       4  0  -2

Hence, p(t) ≠ (t – 1)(t – 2). Thus, p(t) = (t – 1)(t – 2)2.
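The corresponding check for Example 4 (a sketch assuming NumPy) shows that (A – I)(A – 2I) is non-zero while (A – I)(A – 2I)(A – 2I) vanishes:

import numpy as np

A = np.array([[3, 1, -1],
              [2, 2, -1],
              [2, 2, 0]])
I = np.eye(3)

print((A - I) @ (A - 2 * I))                 # non-zero, so (t-1)(t-2) fails
print((A - I) @ (A - 2 * I) @ (A - 2 * I))   # the zero matrix: p(t) = (t-1)(t-2)2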
Now, let T be a linear transformation from V to V, and B be a basis of V. Let A = [T]B.
If g(t) is any polynomial with coefficients in F, then g(T) = 0 if and only if g(A) = 0.
Thus, the minimal polynomial of T is the same as the minimal polynomial of A. So, for
example, if T : R3 → R3 is a linear operator which is represented, with respect to the
standard basis, by the matrix in Example 3, then its minimal polynomial is (t – 1)(t – 2).

Example 5: What can the minimal polynomial of T : R4 → R4 be if the characteristic
polynomial of [T]B is
a) (t – 1)(t3 + 1),  b) (t2 + 1)2?
Here, B is the standard basis of R4.

Solution: a) Now (t – 1)(t3 + 1) = (t + 1)(t – 1)(t2 – t + 1). This has 4 distinct complex roots,
of which only 1 and -1 are real. Since all the roots are distinct, this polynomial is also
the minimal polynomial of T.

b) (t2 + 1)2 has no real roots. It has 2 repeated complex roots, i and – i. Now, the minimal
polynomial must be a real polynomial that divides the characteristic polynomial. ∴ it
can be (t2 + 1) or (t2 + 1)2.

This example shows you that if the minimal polynomial is a real polynomial, then it
need not be a product of linear polynomials only. Of course, over C it will always be a
product of linear polynomials.
Try the following exercises now.

E4) Find the minimal polynomial of


0 1 0 1
1 0 1 0
a) A = 0 1 0 1
1 0 1 0

b) T : R3 → R3 : T(x,y,z) = (x + y, y + z, z + x)


The next exercise involves the concept of the trace of a matrix. If A = [aij] ∈ Mn(F),
then the trace of A, denoted by Tr(A), is – (coefficient of tn-1 in fA(t)).
E5) Let A = [aij] ∈ Mn(F). For the matrix A given in E4, show that
Tr(A) = (sum of its eigenvalues)
      = (sum of its diagonal elements)

We end the unit by recapitulating what we have done in it.

7.4 Summary

In this unit we have covered the following points.

1) The proof of the Cayley-Hamilton theorem, which says that every square matrix (or
linear transformation T : V → V) satisfies its characteristic equation.
2) The use of the Cayley-Hamilton theorem to find the inverse of a matrix.

3) The definition of the minimal polynomial of a matrix.

4) The proof of the fact that the minimal polynomial and the characteristic polynomial of
a linear transformation (or matrix) have the same roots. These roots are precisely the
eigenvalues of the concerned linear transformation (or matrix).

5) A method for obtaining the minimal polynomial of a linear transformation (or matrix).

7.5 Solutions/Answers

               t – 1     0      0
E1) a) fA(t) =  -2     t – 3    0     = (t – 1)2(t – 3)
                 2       2    t – 1

Now, (A – I)2(A – 3I) = 0, i.e., A satisfies fA(t).


             t  -1   0
b) fA(t) =  -3   t  -1    = t3 + t2 – t – 4.
            -1   2  t+1

          0  1  0     0  1  0       3  0  1
Now, A2 = 3  0  1     3  0  1   =   1  1 -1
          1 -2 -1     1 -2 -1      -7  3 -1

      3  0  1     0  1  0       1  1 -1
A3 =  1  1 -1     3  0  1   =   2  3  2
     -7  3 -1     1 -2 -1       8 -5  4

                        1  1 -1      3  0  1      0  1  0      4  0  0
Now, A3 + A2 – A – 4I = 2  3  2  +   1  1 -1  –   3  0  1  –   0  4  0   = 0.
                        8 -5  4     -7  3 -1      1 -2 -1      0  0  4

             t – 1    0    -1
c) fA(t) =     0    t – 3  -1    = t3 – 8t2 + 13t.
              -3     -3   t – 4

          1  0  1     1  0  1       4   3   5
Now, A2 = 0  3  1     0  3  1   =   3  12   7
          3  3  4     3  3  4      15  21  22

        1  0  1      4   3   5      19  24  27
∴ A3 =  0  3  1      3  12   7  =   24  57  43
        3  3  4     15  21  22      81 129 124

                     19  24  27        4   3   5       13   0  13
∴ A3 – 8A2 + 13A =   24  57  43  – 8   3  12   7   +    0  39  13    = 0.
                     81 129 124       15  21  22       39  39  52

∴ A satisfies its characteristic polynomial.

E2) a) The constant term of fA(t) is -3 ≠ 0. ∴ A is invertible.

Now, A3 – 5A2 + 7A – 3I = 0.
∴ A-1 = (1/3)(A2 – 5A + 7I)

              1  0  0       5   0  0      7  0  0
   = (1/3)    8  9  0  –   10  15  0  +   0  7  0
             -8 -8  1     -10 -10  5      0  0  7

              3  0  0
   = (1/3)   -2  1  0
              2  2  3

Pre-multiply by A to check that our calculations are right.


b) A is invertible and A-1 = (1/4)(A2 + A – I)

              2  1  1
   = (1/4)    4  0  0
             -6  1 -3

c) A is not invertible, by Theorem 3.

E3) a) The minimal polynomial can be t, t2 or t3.

b) The minimal polynomial can only be t(t – 1)(t + 2).


                t  -1   0  -1
E4) a) fA(t) = -1   t  -1   0    = t2(t – 2)(t + 2)
                0  -1   t  -1
               -1   0  -1   t

∴ the minimal polynomial can be t(t – 2)(t + 2) or t2(t – 2)(t + 2).

Now A(A – 2I)(A + 2I) = 0. ∴ t(t – 2)(t + 2) is the minimal polynomial of A.

b) The matrix of T with respect to the standard basis is

     1  1  0
A =  0  1  1
     1  0  1

              t – 1   -1     0
Then fA(t) =    0    t – 1  -1    = (t – 1)3 – 1 = t3 – 3t2 + 3t – 2 = (t – 2)(t2 – t + 1).
               -1      0   t – 1

This has 3 distinct roots: 2, (1 + i√3)/2 and (1 – i√3)/2.
∴ the minimal polynomial is the same as fA(t).

E5) Sum of diagonal elements = 0.

Sum of eigenvalues = 0 + 0 – 2 + 2 = 0, and Tr(A) = – (coeff. of t3 in fA(t)) = 0.
∴ Tr(A) = sum of diagonal elements of A
        = sum of eigenvalues of A.
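A numerical check of E5 (a sketch assuming NumPy):

import numpy as np

A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]])

print(np.trace(A))                     # 0, the sum of the diagonal elements
values, _ = np.linalg.eig(A)
print(np.round(np.sum(values), 10))    # 0, the sum of the eigenvalues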
