
Linear algebra and geometry

Jakob Scholbach

February 27, 2024


Contents

0 Preface 5

1 Systems of linear equations 7


1.1 Linear equations . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.2 Systems of linear equations . . . . . . . . . . . . . . . . . . . . . 9
1.3 Elementary operations . . . . . . . . . . . . . . . . . . . . . . . . 13
1.4 Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
1.5 Gaussian elimination . . . . . . . . . . . . . . . . . . . . . . . . . 17
1.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22

2 Vector spaces 29
2.1 R2 , R3 and Rn . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.2 Solution sets of homogeneous linear systems . . . . . . . . . . . . 35
2.3 Intersection of subspaces . . . . . . . . . . . . . . . . . . . . . . . 38
2.4 Further examples of vector spaces . . . . . . . . . . . . . . . . . . 40
2.5 Linear combinations . . . . . . . . . . . . . . . . . . . . . . . . . 45
2.6 Linear independence . . . . . . . . . . . . . . . . . . . . . . . . . 51
2.7 Bases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
2.8 The dimension of a vector space . . . . . . . . . . . . . . . . . . 56
2.9 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63

3 Linear maps 71
3.1 Definition and first examples . . . . . . . . . . . . . . . . . . . . 71
3.2 Multiplication of a matrix with a vector . . . . . . . . . . . . . . 73
3.3 Outlook: current research . . . . . . . . . . . . . . . . . . . . . . 80
3.4 Kernel and image of a linear map . . . . . . . . . . . . . . . . . . 81
3.5 Revisiting linear systems . . . . . . . . . . . . . . . . . . . . . . . 88
3.6 Linear maps defined on basis vectors . . . . . . . . . . . . . . . . 91
3.7 Matrices associated to linear maps . . . . . . . . . . . . . . . . . 93
3.8 Composing linear maps and multiplying matrices . . . . . . . . . 94
3.9 Inverses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
3.10 Transposition of matrices . . . . . . . . . . . . . . . . . . . . . . 114
3.11 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116


4 Determinants 129
4.1 Determinants of 2 ˆ 2-matrices . . . . . . . . . . . . . . . . . . . 129
4.2 Determinants of larger matrices . . . . . . . . . . . . . . . . . . . 131
4.3 Invertibility and determinants . . . . . . . . . . . . . . . . . . . . 136
4.4 Further properties of determinants . . . . . . . . . . . . . . . . . 137
4.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141

5 Eigenvalues and eigenvectors 143


5.1 Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
5.2 The characteristic polynomial . . . . . . . . . . . . . . . . . . . . 145
5.3 Eigenspaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
5.4 Diagonalizing matrices . . . . . . . . . . . . . . . . . . . . . . . . 149
5.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153

6 Euclidean spaces 159


6.1 The scalar product on Rn . . . . . . . . . . . . . . . . . . . . . . 159
6.2 Positive definite matrices . . . . . . . . . . . . . . . . . . . . . . 162
6.3 Euclidean spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
6.4 Orthogonal and symmetric matrices . . . . . . . . . . . . . . . . 173
6.5 Affine subspaces . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
6.6 Distance between two affine subspaces . . . . . . . . . . . . . . . 183
6.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186

A Mathematical notation and terminology 193

B Trigonometric functions 197

C Solutions of selected exercises 201


C.1 Systems of linear equations . . . . . . . . . . . . . . . . . . . . . 201
C.2 Vector spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205
C.3 Linear maps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213
C.4 Determinants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227
C.5 Eigenvalues and eigenvectors . . . . . . . . . . . . . . . . . . . . 228
C.6 Euclidean spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . 234

References 249
Chapter 0

Preface

These are growing notes for a lecture on Linear algebra and geometry, offered in Spring 2023 at the University of Padova to an audience of engineering students.

• The exclamation mark (!) indicates that you should repeat some aspects of a definition etc. in order to make sure you are following the lecture.
• Three dots indicate some content that is not written in the notes, but will be explained in the lecture.

Chapter 1

Systems of linear equations

1.1 Linear equations


Definition 1.1. Let a, b, c ∈ R be fixed real numbers. An equation of the form
  ax + by = c
is called a linear equation in the variables (or unknowns) x and y. More generally, an equation of the form
  a1 x1 + · · · + an xn = b
is called a linear equation in the variables x1, . . . , xn. The real numbers a1, . . . , an are called the coefficients and b is the constant term of the equation. The solution set consists precisely of those collections (more formally, ordered tuples) of numbers r1, r2, up to rn such that substituting the variables x1, . . . , xn by r1, . . . , rn respectively, the equation holds, i.e., such that
  a1 r1 + · · · + an rn = b.
The name “linear” stems from the geometry of the solution sets,
as the following example shows:
Example 1.2. The equation
  4x + 2y = 3    (1.3)
is a linear equation (with coefficients 4 and 2 and constant term 3).
We can solve this equation by subtracting 4x from both sides, which gives
  2y = −4x + 3
and dividing by 2, which gives
  y = −2x + 3/2.    (1.4)
In each of these steps, one equation holds precisely if the next one holds. Thus the solution set of (1.3) is the same as the solution set of the equation (1.4). That solution set is therefore the following set:
  {(x, −2x + 3/2) with x ∈ R}.
Here we use standard set-theoretic notation, cf. §A. Thus, the above means the set of all pairs (x, −2x + 3/2), where x is an arbitrary real number. In particular, since there are infinitely many real numbers x, this is an infinite set.
Graphically, the solution set is the set of points as depicted below:
[Figure: the line 4x + 2y = 3 in the x-y-plane.]
In general, any equation of the form
  ax + by = c
with a ≠ 0 or b ≠ 0 will have a line as a solution set (what happens if a = b = 0?, cf. Exercise 1.6).
Remark 1.5. In the computation above it was critical that we were able to divide 3 by 2, i.e., to have the rational number 3/2 at our disposal. The real numbers R (and also the rational numbers Q) form a so-called field, which among other properties means that one can divide by non-zero numbers. The integers Z = {. . . , −2, −1, 0, 1, 2, . . . } do not form a field. Solving linear systems in the integers is somewhat harder than it is in the rationals or reals. This course will focus on discussing linear algebra over the real numbers.

Remark 1.6. A great number of equations arising in physics, biology, chemistry and of course mathematics itself are linear. Nonlinear equations such as
  x² + 4y³ = 5
  log(x) − 4 sin(x) = 0
are not primarily studied in linear algebra. For such more complicated equations, linear algebra is still useful, however. This is accomplished by replacing such equations by linear approximations. The first idea in that direction is the derivative of a function f, which serves as a best linear approximation of a differentiable function. Such linearization techniques are beyond the scope of this lecture.

1.2 Systems of linear equations


Definition 1.7. A system of linear equations is a collection of linear
equations (involving the same variables). It is also sometimes called
a linear system or even just a system.

The interest in linear systems lies in finding those tuples of num-


bers satisfying all equations at once (as opposed to just one of them,
say). We will start with two equations in two variables.
Example 1.8. The equations

x`y “4 (1.9)
x ´ y “ 1.

form a system of linear equations (in the variables x and y).


We solve this system algebraically by subtracting y in the first
equation, which gives
x “ ´y ` 4,
and substituting this into the second equation, which gives

p´y ` 4q ´ y “ 1,

or
  −2y + 4 = 1
or
  −2y = −3
or finally
  y = 3/2.
Inserting this back above gives
  x = −3/2 + 4 = 5/2.
Note that again each equation holds (for given values of x and y) precisely if the preceding one holds. Thus, the original system has the same solution set as the last two equations (together). This system of equations therefore has a unique solution, namely
  (x = 5/2, y = 3/2).
To say the same using different symbols: the solution set of the system (1.9) is a set consisting of a single element:
  {(5/2, 3/2)}.
It is very useful to also understand this process geometrically, which we do by plotting the two lines that are the solutions of the individual equations:
[Figure: the lines x + y = 4 and x − y = 1, which intersect in the single point (5/2, 3/2).]

The algebraic finding that there is precisely one solution is matched by the fact that two non-parallel lines in the plane (which are the solution sets of the individual equations) intersect in precisely one point.

The above linear system (1.9) had exactly one solution. This
need not always be the case, as the following examples show:
Example 1.10. The system

x`y “4
x`y “1

has no solution. This can be seen algebraically (!) and also geomet-
rically:
[Figure: the distinct parallel lines x + y = 4 and x + y = 1, which do not intersect.]

The system has no solution, which is paralleled by the fact that two
parallel, but distinct lines in the plane do not intersect.

Example 1.11. The system

x`y “4
´2x ´ 2y “ ´8

has infinitely many solutions, namely all pairs of the form

px, y “ 4 ´ xq,

with an arbitrary real number x. Geometrically, this is explained


by taking the “intersection” of the same line twice.
[Figure: the equations x + y = 4 and −2x − 2y = −8 describe one and the same line.]

In other words, even though there are two equations above, they
both have the same solution set. Thus, in some sense one of the
equations is redundant, i.e., the solution set of the entire system
equals the solution set of either of the equations individually.

Summary 1.12. The solution set of an equation of the form

ax ` by “ c

is a line (unless both a and b are zero).


The solution set of a system of equations of the form

ax ` by “ c

dx ` ey “ f
can take three forms:

number of solutions        geometric explanation
exactly one solution       the unique intersection point of two non-parallel lines
no solution                two distinct parallel lines don't intersect
infinitely many solutions  a line intersects itself in infinitely many points

Definition 1.13. A homogeneous linear system is one in which the constant terms in all equations are zero. (I.e., in the notation of (1.25) below, b1 = · · · = bm = 0.)

Remark 1.14. For a homogeneous linear system, there is always at least one solution, namely
  (x1 = 0, . . . , xn = 0).
This solution is called the trivial solution.

1.3 Elementary operations


The combination of geometric intuition with algebraic computations
is very useful. However, the former is of limited use when it comes to
systems with three variables, and hardly useful anymore for systems
involving four or more variables. We will therefore develop notions
and techniques that enable us to handle linear systems more sys-
tematically.
Definition 1.15. We say that two linear systems are equivalent if
they have the same solution sets.
Example 1.16. In (1.9), we considered the system
  x + y = 4
  x − y = 1
and found that it has a unique solution, namely
  (x = 5/2, y = 3/2).
Thus the previous system is equivalent to the system
  x = 5/2
  y = 3/2.
Of course, in comparison to the original system, the latter system
is much easier to understand, since one can simply read off the
solution without any effort. The purpose of elementary operations
is to transform a given system into an equivalent system of which
the solutions can be read off.
Definition 1.17. Given a linear system, the following operations
are called elementary operations:

(1) interchange two equations,


(2) multiply one equation by a non-zero (!) number,
(3) add a multiple of one equation to a different (!) equation.

These operations are called “elementary” since they are so simple


to perform. Their utility comes partly from the following fact:

Theorem 1.18. Consider a linear system. This linear system is


equivalent to (i.e., has the same solutions as) any linear system
obtained by performing any number of elementary operations.

This theorem, which we will prove later on (Corollary 3.75) when we have more tools at our disposal, may sound a little abstract at first sight. It is, however, actually simple to comprehend and, very importantly, extremely useful in practice.

Example 1.19. Consider the system

x ` 2z “ ´1
´2x ´ 3z “ 1
2y “ ´2.

We add twice the first equation to the second (elementary operation


(3)):

x ` 2z “ ´1
z “ ´1
2y “ ´2.

We interchange the second and third equation (elementary operation


(1)):

x ` 2z “ ´1
2y “ ´2
z “ ´1.
We multiply the second equation by 1/2 (in other words, we divide it by 2; elementary operation (2)):
  x + 2z = −1
  y = −1
  z = −1.

We add p´2q times the third equation to the first (elementary op-
eration (3))
x“1
y “ ´1
z “ ´1.

These steps are combinations of elementary operations. According to Theorem 1.18, the original system is equivalent to (i.e., has the same solutions as) the final one. The benefit is, of course, that the solutions of the final system are trivial to comprehend: it has exactly one solution, the triple
  (x = 1, y = −1, z = −1).
Thus, the original system also has exactly that one solution.
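
(Aside, not part of the lecture: such small systems can be double-checked numerically. The following sketch assumes Python with the NumPy library; numpy.linalg.solve applies here because the coefficient matrix of the system of Example 1.19 is invertible.)

import numpy as np

# coefficient matrix and constant vector of the system of Example 1.19,
# with the variables ordered as (x, y, z)
A = np.array([[ 1.0, 0.0,  2.0],
              [-2.0, 0.0, -3.0],
              [ 0.0, 2.0,  0.0]])
b = np.array([-1.0, 1.0, -2.0])

print(np.linalg.solve(A, b))   # prints [ 1. -1. -1.], i.e. x = 1, y = -1, z = -1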

1.4 Matrices
It is time to use some better tools to do the bookkeeping needed to
solve linear systems. Matrices help doing that. Later on (§3), we
will use matrices in a much more profound way.
Definition 1.20. A matrix is a rectangular array of numbers. We speak of an m × n-matrix (or m-by-n matrix) if it has m rows and n columns, respectively. If m = n, we also call it a square matrix. A 1 × n-matrix (i.e., m = 1 and n arbitrary) is called a row vector. Similarly, an m × 1-matrix is called a column vector.
Example 1.21. It is customary to denote matrices by capital letters. For example,
  A = [ 3  4 ]
      [ 0 −7 ]
is a 2 × 2-matrix (or square matrix of size 2).
  B = [ 1 −2 0 ]
      [ 1  0 3 ]
is a 2 × 3-matrix and
  C = [ 1 −2 ]
      [ 0  1 ]
      [ 0 −3 ]
is a 3 × 2-matrix.
The entries of a matrix may also be variables. For example
  [ x1 ]
  [ x2 ]
is a column vector (or a 2 × 1-matrix), whose entries are two variables; [ x1  x2 ] is a row vector (or a 1 × 2-matrix).
Notation 1.22. A matrix whose entries are unspecified numbers is denoted like so:
  A = [ a11 a12 a13 . . . a1n ]
      [ a21 a22 a23 . . . a2n ]
      [  ⋮    ⋮    ⋮         ⋮ ]
      [ am1 am2 am3 . . . amn ]
Thus, the number aij is the entry in the i-th row and the j-th column. A more compressed notation expressing the same is
  A = (aij), i = 1, . . . , m, j = 1, . . . , n,
or even just
  A = (aij).
Definition 1.23. Let
  a11 x1 + a12 x2 + · · · + a1n xn = b1    (1.24)
  a21 x1 + a22 x2 + · · · + a2n xn = b2
    ⋮
  am1 x1 + am2 x2 + · · · + amn xn = bm    (1.25)
be a linear system (consisting of m equations, in the unknowns x1, . . . , xn; the numbers aij are the coefficients, the numbers b1, . . . , bm are the constants).
The matrix associated to this system is the following m × (n + 1)-matrix (the vertical bar is just there to remind ourselves that the last column corresponds to the constants in the equations above; one also speaks of an augmented matrix):
  A = [ a11 a12 . . . a1n | b1 ]
      [ a21 a22 . . . a2n | b2 ]    (1.26)
      [  ⋮    ⋮        ⋮   |  ⋮ ]
      [ am1 am2 . . . amn | bm ]
In other words, the matrix is the rectangular array containing the coefficients and the constants of the individual equations, and suppresses the mention of the variables.
Example 1.27. The matrix associated to the system
  x + y = 4
  x − y = 1
is the 2 × 3-matrix
  [ 1  1 | 4 ]
  [ 1 −1 | 1 ].
Of course, the process of associating a matrix to a linear system can be reversed, since any m × (n + 1)-matrix gives rise to a linear system: the matrix (1.26) gives rise to the linear system (1.25). For example, the 2 × 3-matrix
  [ 1 −2 | 0 ]
  [ 1  0 | 3 ]
gives rise to the linear system
  x − 2y = 0
  x = 3.
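
(Aside, assuming Python with NumPy: the augmented matrix of a system is conveniently stored as a two-dimensional array, from which the coefficient part and the column of constants can be sliced off again.)

import numpy as np

# augmented matrix of the system x + y = 4, x - y = 1 from Example 1.27;
# the last column holds the constant terms
M = np.array([[1.0,  1.0, 4.0],
              [1.0, -1.0, 1.0]])

coefficients = M[:, :-1]   # the 2 x 2 coefficient part
constants    = M[:, -1]    # the constants (4, 1)
print(coefficients)
print(constants)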

1.5 Gaussian elimination


Theorem 1.18 is a useful insight, but it lacks an important feature: it
does not directly instruct us how to simplify any given linear system.

(In Example 1.19, we did end up with a particularly simple linear


system, but we did not have any a priori guarantee for this to hap-
pen. If we had chosen some “stupid” elementary operations instead,
we would not have simplified the system.) Gaussian elimination is
an algorithmic process that does just that: it is a simple procedure
that is guaranteed to yield the simplest possible equivalent form of
any given linear system.
In view of the correspondence between linear systems and matri-
ces, we will first phrase this process in terms of matrices, and then
translate it back to the problem of solving linear systems.
Among the myriad of all possible matrices, the following matrices
are the “nice guys”.
Definition 1.28. A matrix is in row-echelon form (or is called a row-echelon matrix) if the following conditions are satisfied:
(1) If there are any zero rows (i.e., rows consisting only of zeros), then these are at the bottom of the matrix.
(2) In a non-zero row, the first non-zero entry (starting from the left) is a 1. (It is called the leading 1.)
(3) Each leading 1 is to the right of all the leading 1's in the rows above it.
If, in addition to the above conditions, each leading 1 is the only non-zero entry in its column, then the matrix is in reduced row-echelon form.
Perhaps saying more than a thousand words is the following schematic of a row-echelon form, where the asterisks indicate arbitrary numbers. If, in addition, all entries above the leading 1's are zero, then the matrix is a reduced row-echelon matrix.

  [ 0 1 * * * * * ]
  [ 0 0 0 1 * * * ]
  [ 0 0 0 0 1 * * ]
  [ 0 0 0 0 0 0 1 ]
  [ 0 0 0 0 0 0 0 ]
  [ 0 0 0 0 0 0 0 ]

In view of the correspondence between linear systems and ma-


trices, we transport the language of elementary operations (Defini-
tion 1.17) to matrices as follows.

Definition 1.29. The following operations on a given matrix A are


called elementary row operations:
(1) interchange any two rows,
(2) multiply any row by a non-zero (!) number,
(3) add a multiple of any row to a different (!) row.

Method 1.30. (Gaussian algorithm or Gaussian elimination) Every matrix can be brought to reduced row-echelon form by a sequence of elementary row operations. This can be achieved using the following algorithmic process:
(1) If the matrix consists only of zeros, stop: the matrix is in reduced row-echelon form.
(2) Otherwise, find the first column from the left having a non-zero entry. Call this entry a. Interchange rows (elementary operation (1)) so that the row containing a is in the top position.
(3) Multiply the new top row by 1/a (elementary operation (2); note this is possible since a ≠ 0, see also Remark 1.5). Thus the first row has a leading 1.
(4) By adding appropriate multiples of the first row to the remaining rows (elementary operation (3)), ensure that the entries below the leading 1 are all zero.
From this point on, the first row is not touched anymore, and the four steps above are applied to the matrix consisting of the remaining rows.
This produces a (possibly not reduced) row-echelon form. It can finally be brought into reduced row-echelon form by adding appropriate multiples of the rows with leading 1's to the rows above them (elementary operation (3)), beginning at the bottom.
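
(Aside: for readers who like to experiment, here is a minimal Python sketch of Method 1.30. It follows the four steps literally and assumes exact arithmetic is not an issue; it is meant for illustration, not as a robust implementation.)

def rref(matrix):
    # Bring a matrix (given as a list of rows) to reduced row-echelon form.
    A = [list(map(float, row)) for row in matrix]   # work on a copy
    m, n = len(A), len(A[0])
    row = 0
    for col in range(n):
        # step (2): find a row at position >= row with a non-zero entry in this column
        pivot = next((r for r in range(row, m) if A[r][col] != 0), None)
        if pivot is None:
            continue
        A[row], A[pivot] = A[pivot], A[row]          # elementary operation (1)
        a = A[row][col]
        A[row] = [x / a for x in A[row]]             # step (3): create the leading 1 (operation (2))
        for r in range(m):                           # step (4) and the final reduction (operation (3))
            if r != row and A[r][col] != 0:
                factor = A[r][col]
                A[r] = [x - factor * y for x, y in zip(A[r], A[row])]
        row += 1
        if row == m:
            break
    return A

print(rref([[1, 2, 5, 7], [2, 1, 4, 2], [5, 4, 13, 11]]))
# the reduced row-echelon form of Example 1.31: rows (1, 0, 1, -1), (0, 1, 2, 4), (0, 0, 0, 0)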

Example 1.31. We apply the Gaussian algorithm to the matrix
  [ 1 2  5  7 ]
  [ 2 1  4  2 ]
  [ 5 4 13 11 ].
The first three steps don't change the matrix (since the top-left entry is already 1). Step (4): add −2 times the first row to the second row, and add −5 times the first row to the third, which gives
  [ 1  2   5   7 ]
  [ 0 −3  −6 −12 ]
  [ 0 −6 −12 −24 ].
The remaining steps only affect the second and third row. Step (2) picks the second row, with a = −3. It is already in the top position (the first row being discarded for the remainder of the algorithm), so Step (2) does not change the matrix. Step (3) gives the matrix
  [ 1  2   5   7 ]
  [ 0  1   2   4 ]
  [ 0 −6 −12 −24 ].
Step (4) adds 6 times the second row to the third, which gives
  [ 1 2 5 7 ]
  [ 0 1 2 4 ]
  [ 0 0 0 0 ].
At this point also the second row is discarded, which leaves only the last row, which consists of zeros. By Step (1), the algorithm stops at this point.
This matrix is in row-echelon form, but not yet reduced. To reduce it, add −2 times the second row to the first, which gives
  [ 1 0 1 −1 ]
  [ 0 1 2  4 ]
  [ 0 0 0  0 ].

When applied to matrices associated to linear systems, Gaussian elimination becomes very useful for solving linear systems:
Method 1.32. (1) Form the augmented matrix corresponding to the given linear system.
(2) Perform Gaussian elimination on that matrix (Method 1.30), giving a reduced row-echelon matrix.
(3) If a row of the form
  [ 0 0 . . . 0 | 1 ]
occurs, the system has no solutions.

(4) Otherwise, we call the variables corresponding to the columns


not containing a leading 1 free variables. The values of these
variables can be chosen to be arbitrary real numbers. The vari-
ables that correspond to columns that do contain a leading 1
are uniquely specified by these free variables. Their values can
be determined by solving the equations corresponding to the
row-echelon matrix for the leading variables.
Example 1.33. Consider the system
  x + 2y + 5z = 7
  2x + y + 4z = 2
  5x + 4y + 13z = 11.
The augmented matrix associated to this is the one in Example 1.31. The Gaussian algorithm brings it into the reduced row-echelon form
  [ 1 0 1 | −1 ]
  [ 0 1 2 |  4 ]
  [ 0 0 0 |  0 ].
This matrix corresponds to the linear system
  x + z = −1
  y + 2z = 4
  0 = 0.
According to Theorem 1.18, this linear system has the same solution set as the original one.
The leading variables are x and y, so that z is a free variable. Solving the second equation for y then gives
  y = 4 − 2z,
and similarly
  x = −1 − z.
We obtain that the solution set of the original linear system consists of triples of the form
  (x = −1 − z, y = 4 − 2z, z),
in which z is an arbitrary number (and x and y are determined by z as indicated).
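
(Aside, assuming Python with NumPy: the parametric description of the solution set can be tested by picking a few values of the free variable z and checking that the resulting triples satisfy the original equations.)

import numpy as np

A = np.array([[1.0, 2.0,  5.0],
              [2.0, 1.0,  4.0],
              [5.0, 4.0, 13.0]])
b = np.array([7.0, 2.0, 11.0])

for z in [-1.0, 0.0, 2.5]:                           # arbitrary choices of the free variable
    x, y = -1 - z, 4 - 2 * z
    print(np.allclose(A @ np.array([x, y, z]), b))   # True for every choice of z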

1.6 Exercises
Exercise 1.1. Describe all the solutions of the equation
x ` y “ 3.
Draw a picture of that solution set. Is it a homogeneous equation?
Exercise 1.2. Consider the equation
x “ 3.
What is its solution set?
Consider the same equation, but now with two variables x and y
being present (so we could rewrite the equation as x ` 0 ¨ y “ 3 in
order to emphasize the presence of y). What is the solution set this
time?
Exercise 1.3. Consider the system
2x1 ´ x2 ` x3 ` x4 “ 1
5x2 ´ 3x3 ´ 5x4 “ ´3
3x1 ´ 4x2 ` 3x3 ` 4x4 “ 3.
What is the matrix associated to that system? Using Method 1.32,
find all solutions of that system.
Exercise 1.4. Consider the (augmented) matrix
  A := [ 1 0 3 0 0 0 | 1 ]
       [ 0 1 2 4 1 0 | 0 ]
       [ 0 0 0 2 1 0 | 2 ]
       [ 0 0 0 0 0 1 | 3 ]
       [ 0 0 0 0 0 0 | 0 ].
What type of matrix is that? (I.e., what m × n-matrix.) If A = (aij), what are a13 and a31? What is the linear system associated to that matrix? (Hint: one equation reads "· · · = 3". For consistency, call the variables x1, x2, . . . , x6.)
Is the matrix in row-echelon form? Is it in reduced row-echelon form? If not, use the Gaussian algorithm (Method 1.30) in order to transform it into reduced row-echelon form. Name the columns which contain a leading 1 (Hint: there are 4 of them). Which variables are free, which variables are not free? Use Method 1.32 and solve the linear system associated to that augmented matrix.

Exercise 1.5. Using Method 1.32, find all solutions of the following
systems
x`y´z “1
3x ´ y ` 2z “ 5
4x ` z “ 6.
and
x`y´z “1
3x ´ y ` 2z “ 0
x ` y ´ 2z “ 2.
Exercise 1.6. (Solution at p. 201) Let
ax ` by “ c
be a linear equation. For which values of a, b and c does this equa-
tion have no solution? For which values of a, b and c does it have
infinitely many solutions?
Exercise 1.7. Compute the reduced row-echelon form of the ma-
trices associated to the linear systems in 1.9, Example 1.10 and
Example 1.11.
Exercise 1.8. Consider the system
x`y “1
x ´ y “ b,
where b is a real number. What is its solution set? Illustrate the
system geometrically for b “ 0 and for b “ 1.
Exercise 1.9. Consider the system
ax ` by “ 1
x ´ y “ 2.
Here x and y are the variables and a and b are the coefficients.
(1) For which values of a and b does the system above have no
solution?
(2) For which values does it have exactly one solution?

(3) For which values does it have infinitely many solutions?


Explain your findings algebraically and geometrically.
Exercise 1.10. (Solution at p. 201) Find the solutions of the sys-
tem
x1 ` 2x2 ´ x3 “ 0
´2x1 ´ 3x2 ` x3 “ 1
x2 ´ x3 “ 1.
Exercise 1.11. The linear system in the variables x1, x2, x3, x4 associated to the matrix
  [ 2 −1  1 −1 | 1 ]
  [ 0  1 −3  1 | 3 ]
  [ 2  1 −4  1 | 6 ]
  [ 2  0 −2  1 | 2 ]
has only one solution. Find it!
Exercise 1.12. (Solution at p. 202) Find the solutions of the fol-
lowing linear system in the variables x1 , . . . , x4 :
x1 ´ x2 ` x3 “ ´2
x3 ´ x4 “1
x1 ´ x2 ` x4 “ ´3
x1 ´ x2 ` 3x3 ´ 2x4 “ 0.
Exercise 1.13. Solve the following linear system, where h is a pa-
rameter, and x, y are the unknowns:
x ` hy “ 4
3x ` 6y “ 8.
For selected values of h, illustrate the solution set graphically.
Exercise 1.14. (Solution at p. 203) Solve the following linear sys-
tem, where h is a parameter and x, y, z are the unknowns:
p4 ´ hqx ´ 2y ´ z “ 1
´2x ` p1 ´ hqy ` 2z “ 2
´x ` 2y ` p4 ´ hqz “ 1.

Exercise 1.15. For any t ∈ R consider the homogeneous linear system associated to the matrix
  [ 2  0 1 −t | 0 ]
  [ 1 −2 0  3 | 0 ]
  [ 4 −4 t  5 | 0 ].
(1) Solve the system for t = 0.
(2) Solve the system for all t ∈ R.

Exercise 1.16. Solve the system

x1 ´ x3 ` 2x4 “ 0
x2 ` 2x3 ´ 2x4 “ 0
x1 ` x2 ` x3 “ 0.

Exercise 1.17. (Solution at p. 203) Consider the following linear


system (in the unknowns x1 , x2 , x3 ):

x1 ` x2 ` x3 “ 1
x1 ´ x3 “ 0.

Is there any t P R such that p1 ´ t, 2 ` 3t, 4tq is a solution of that


system?

Exercise 1.18. Consider the following linear system (in the un-
knowns x1 , x2 , x3 ):

x1 ´ x2 ` 3x3 “ 0
x1 ´ x2 “ 1.

Show that there is exactly one t P R such that the vector p3 ` t, 2 `


t, 23 ` tq is a solution of that system.

Exercise 1.19. (Solution at p. 204) Do there exist q, t P R such


that the vector

px1 , x2 , x3 q “ p1 ` t, t ` q, ´t ` 2q ` 1q

satisfies
3x1 ` 2x2 ´ x3 “ 5?

Exercise 1.20. (Solution at p. 204) Find a polynomial

ppxq “ a0 ` a1 x ` a2 x2 ` a3 x3

such that pp1q “ 0 and pp2q “ 3. Is there a unique such polynomial?

Exercise 1.21. Find the solutions of the system associated to the matrix
  [ 1 1 2 3 | 1 ]
  [ 2 0 1 2 | 1 ]
  [ 1 3 5 7 | 2 ].

Exercise 1.22. For any α ∈ R find the solutions of the system associated to the matrix
  [ 1 1 2 3 | −1 ]
  [ 2 0 1 2 |  α ]
  [ 1 3 5 7 |  0 ].

Exercise 1.23. Consider the following linear system in the unknowns


x, y, z, which depends on the parameter α P R:

2x ´ y ` z “ 1
pα ` 2qx ´ 2y ` αz “ ´α.

Determine the solution set of this system for each value of α.

Exercise 1.24. The following extended exercise showcases the us-


age of linear algebra in network analysis. An idealized city consists
of the following streets U to Z, with four intersection points A to
D. The streets are all one-way streets:

[Figure: the street network, with one-way streets U, V, W, X, Y and Z connecting the intersection points A, B, C and D.]

At the point labelled A, 500 cars per hour drive into the city, and
at B, 400 cars exit the city, while at C 100 cars exit the city per
hour.
Describe the possible scenarios regarding the numbers of cars
driving through the streets U , V , W , X, Y and Z.
Chapter 2

Vector spaces

2.1 R2 , R3 and Rn
Definition 2.1. For n ě 1, an ordered n-tuple of real numbers is a
collection of n real numbers in a fixed order. For n “ 2, an ordered
2-tuple is usually called an ordered pair , and an ordered 3-tuple is
called an ordered triple. If these numbers are r1 , r2 , . . . , rn , then the
ordered n-tuple consisting of these numbers is denoted
pr1 , r2 , . . . , rn q.
For example, (2, 3) is an (ordered) pair. This pair is different from the (ordered) pair (3, 2). It makes good sense to insist on the ordering, e.g., if a pair consists of the information
  ("weight of a parcel (in kg)", "price (in €)"),
then (3, 10) is of course different from (10, 3). (3/4, √2, −7), (0, 0, 0) are examples of (ordered) 3-tuples. An ordered 1-tuple is simply a single real number.
Definition 2.2. For n ě 1, the set Rn is the set of all ordered
n-tuples of real numbers. Thus (see §A for general mathematical
notation)
Rn “ tpr1 , r2 , . . . , rn q | r1 , r2 , . . . , rn P Ru.
Thus, R1 “ R is just the set of real numbers. Next, R2 is the
set of ordered pairs of real numbers:
R2 “ tpr1 , r2 q | r1 , r2 P Ru.


Of course, here r1 , r2 are just symbols which have no meaning in


themselves, so we can also write

R2 “ tpx1 , x2 q | x1 , x2 P Ru
“ tpx, yq | x, y P Ru.

The elements in Rn are also referred to as vectors. Thus, a vector


is nothing but an ordered n-tuple. The element p0, 0, . . . , 0q P Rn
is called the zero vector . Instead of writing px1 , . . . , xn q we also
abbreviate this as x, so that the expression

x P Rn

means that x is an (ordered) n-tuple consisting of n real numbers


x1 , . . . , xn . The numbers x1 etc. are called the components of the
vector x.
Vectors in R, R2 and R3 can be visualized nicely as points on
the real line, as points in the plane or as points in 3-dimensional
space. It is also common to decorate vectors with an arrow, with
the idea of representing a movement or relocation to that point, or
in physics a force with a certain strength in a certain direction.

[Figure: the points (3, 2), (−3, 0) and (3, −4) in the plane R².]
[Figure: the point (−4, −4, 6) in three-dimensional space R³.]

This visualization also helps explain some of the fundamental


features of Rn .

2.1.1 Addition of vectors


Definition 2.3. Given two vectors (in the same Rn , i.e., having the
same number of components)

x “ px1 , . . . , xn q and y “ py1 , . . . , yn q P Rn ,

their sum is the vector

px1 ` y1 , x2 ` y2 , . . . , xn ` yn q.

Example 2.4. What is the sum of p1, 1q and p´2, 1q? Visualize
that sum graphically!

Remark 2.5. The sum of two vectors is only defined if they belong
to the same Rn : a sum such as p1, 2q ` p3, 4, 5q is undefined, i.e. is
a meaningless expression.

The sum of vectors has the following crucial properties:



Lemma 2.6. For x “ px1 , . . . , xn q, y “ py1 , . . . , yn q and z “ pz1 , . . . , zn q P


Rn the following rules hold:
• x ` y “ y ` x (commutativity of addition)
• x ` 0 “ x (adding the zero vector does not change the vector
in question)
• x ` py ` zq “ px ` yq ` z (associativity of addition)

These identities are easy to prove since they quickly boil down
to similar identities for the sum of real numbers. Here is a visual
intuition for the commutativity of addition, which is also called the
parallelogram law .

[Figure: the parallelogram law: x = (3, 1), y = (2, 4) and x + y = y + x = (5, 5).]

2.1.2 Scalar multiplication of vectors


Definition 2.7. Given a vector x “ px1 , . . . , xn q P Rn and a real
number r P R, the scalar multiplication of x by r is the vector

r ¨ x :“ pr ¨ x1 , . . . , r ¨ xn q.

I.e., every component of x gets multiplied by the number r. Often


one just writes rx instead of r ¨ x.

Geometrically, the scalar multiplication corresponds to stretching


the vector x by the factor r (i.e., if r ą 1 it is stretching, for 0 ă
r ă 1 it compresses the vector, for r ă 0 it additionally flips the
direction of the vector).

Example 2.8. What is 4 · (−1, 3)? What is (−1/4) · (−1, 3)? Visualize the vector (−1, 3) and these results graphically!
Note that, in contrast to the sum of vectors, the scalar multiplication combines two different entities: a real number and a vector.
The scalar multiplication has the following key properties:
Lemma 2.9. For two real numbers r, s ∈ R and two vectors x, y ∈ Rn, the following identities hold:
(1) r(x + y) = rx + ry (distributivity law)
(2) (r + s)x = rx + sx (distributivity law)
(3) (rs)x = r(sx) (scalar multiplication with a product rs of two real numbers can be computed by first multiplying with s and then with r)
(4) 1x = x (scalar multiplication by 1 does not change the vector)
(5) 0x = 0 (scalar multiplication by 0 gives the zero vector)
Again, these identities are easy to check using that the same rules hold if x, y were just real numbers.
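
(Aside, assuming Python with NumPy: the componentwise sum of Definition 2.3 and the scalar multiplication of Definition 2.7 are exactly what NumPy arrays do, which makes it easy to experiment with the rules of Lemma 2.6 and Lemma 2.9.)

import numpy as np

x = np.array([3.0, 1.0])
y = np.array([2.0, 4.0])

print(x + y)                                      # componentwise sum: [5. 5.]
print(4 * np.array([-1.0, 3.0]))                  # scalar multiplication: [-4. 12.]
print(np.allclose(x + y, y + x))                  # commutativity of addition: True
print(np.allclose(2 * (x + y), 2 * x + 2 * y))    # distributivity: True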

2.1.3 Definition of vector spaces


Definition 2.10. A vector space is a set V that is equipped with two functions called the sum and the scalar multiplication:
  + : V × V → V, (v, w) ↦ v + w,
  · : R × V → V, (r, v) ↦ rv (or r · v)
satisfying the following conditions. Below, r, s ∈ R are arbitrary real numbers and u, v, w ∈ V arbitrary elements of V (also referred to as vectors):
(1) v + w = w + v (commutativity of addition),
(2) u + (v + w) = (u + v) + w,
(3) there is a vector 0 ∈ V, called the zero vector, such that 0 + v = v for all v ∈ V,
(4) r(v + w) = rv + rw (distributive law),
(5) (r + s)v = rv + sv,
(6) (rs)v = r(sv),
(7) 1v = v,
(8) 0v = 0 (on the left, 0 denotes the real number zero; on the right, it denotes the zero vector).

Example 2.11. The sets R “ R1 , R2 , and in general Rn are vector


spaces (where the function ` is given by vector addition and ¨ is
scalar multiplication). Indeed, the conditions in Definition 2.10 are
precisely the properties of vector addition and scalar multiplication
noted before in Lemma 2.6 and Lemma 2.9.

Remark 2.12. Recall from §A that the notation appearing in

` : V ˆ V Ñ V, pv, wq ÞÑ v ` w

means that ` is a function that takes as an input two elements in V ,


which here are denoted v and w, and produces as an output another
element in V . That element is denoted v ` w. Likewise

  · : R × V → V, (r, v) ↦ rv (or r · v)
means that · is a function whose input is a pair consisting of a real number, here denoted r, and an element in V, and which produces as an output an element in V that is denoted rv or r · v.
Some authors distinguish notationally between vectors and num-
bers by writing ⃗v for vectors and r for numbers. In these notes, we
usually do not use that convention.

Example 2.13. The following subsets of Rn are not vector spaces.


In each case, draw the set and point out precisely which of the above
condition(s) fails.

• tpx1 , x2 q P R2 with x1 ě 0u,

• tpx1 , x2 q P R2 with x1 ‰ 0u,

• The solution set of the equation

3x1 ` 2x2 “ 3.

• tpx1 , x2 q P R2 with x1 “ 0 or x2 “ 0u.



2.2 Solution sets of homogeneous linear systems

Recall from Definition 1.13 that a homogeneous linear system is one in which the constant terms are all zero, i.e., one of the form
  a11 x1 + a12 x2 + · · · + a1n xn = 0     (2.14)
  a21 x1 + a22 x2 + · · · + a2n xn = 0
    ⋮
  am1 x1 + am2 x2 + · · · + amn xn = 0

In this section, we will see that the solution sets to homogeneous


linear systems are vector spaces, which is an extremely important
class of examples. We begin by looking at homogeneous linear equa-
tions, i.e., a linear system consisting of a single (homogeneous) equa-
tion.

Example 2.15. The homogeneous linear equation
  3x + 4y − 2z = 0
has the solution set
  {(x, y, (3x + 4y)/2) | x, y ∈ R}.
Indeed, a triple (x, y, z) is a solution to the equation above precisely if z = (3x + 4y)/2, and x and y can be arbitrary real numbers. A few concrete elements in this solution set, drawn below, are the points (0, 0, 0), (2, 0, 3), (0, 1, 2). Slightly more generally, triples of the form (0, y, 2y) and (x, 0, (3/2)x), for arbitrary y, resp. x, are elements in the solution set. These lines (which lie in the y-z-plane, resp. in the x-z-plane) are also drawn below. Of course, the solution set contains further elements such as the point (2, 1, 5). The green shape is meant to illustrate further elements of the solution set, but of course it is not bounded by the lines in the illustration; instead it stretches out in all directions.
[Figure: the solution plane of 3x + 4y − 2z = 0 in R³, containing the lines (0, y, 2y) and (x, 0, (3/2)x).]

Example 2.16. What equation (in the three variables x, y and z)


has the following solution set? Again the picture only shows the
solution set partly, it is meant to be extended to the left and below.

[Figure: a plane through the origin in R³ containing the point (1, 3, 0).]

We note that both equations have a solution set which is a plane


passing through the origin, i.e., the point p0, 0, 0q. We will want to
articulate that this plane is a vector space that lies inside the larger
ambient vector space R3 .

Definition 2.17. A subspace (or sub-vector space, or vector sub-


space) V of Rn is a subset that is – in its own right – a vector space.
I.e.,
(1) it contains the zero vector,

(2) for all vectors v, w P V , the sum v ` w is an element of V , and

(3) for all v ∈ V and all real numbers r ∈ R, the scalar multiple r · v is an element of V.
More generally, a subset V of another vector space W is a subspace if V satisfies the three preceding conditions.

We have seen in Example 2.13 a number of subsets of R2 that


fail to be subspaces. In particular, the solution set of the equation
3x1 ` 2x2 “ 3 is not a vector space since the zero vector p0, 0q is
not a solution of this equation. This is not a homogeneous equation
(the constant term is 3, but not 0). The next proposition tells us
that this is the cause of the failure:

Proposition 2.18. Consider a homogeneous linear system in n vari-


ables x1 , . . . , xn , and m equations, as in (2.14). Its solution set is a
subspace of Rn .

Proof. Let us call S the solution set of the system. I.e., an element
x “ px1 , . . . , xn q belongs to S precisely if it is a solution of the linear
system (2.14).
We check the three conditions in Definition 2.17:

• p0, . . . , 0q P S, i.e. the zero vector in Rn is a solution. Indeed,


plugging in zero in all the xi gives 0 “ 0 for all the m equations,
which holds.

• Let v “ pv1 , . . . , vn q and w “ pw1 , . . . , wn q be elements of S. We


need to check that v ` w is also in S. Recall from Definition 2.3
that v ` w “ pv1 ` w1 , . . . , vn ` wn q. The m equations of the
linear system read

  ai1 x1 + ai2 x2 + · · · + ain xn = 0,
where i = 1, . . . , m. Inserting v1 + w1 for x1 etc., we get
  ai1 (v1 + w1) + ai2 (v2 + w2) + · · · + ain (vn + wn)
    = ai1 v1 + ai1 w1 + ai2 v2 + ai2 w2 + · · · + ain vn + ain wn
    = (ai1 v1 + ai2 v2 + · · · + ain vn) + (ai1 w1 + ai2 w2 + · · · + ain wn)
    = 0 + 0
    = 0,
where the two sums in parentheses vanish because v and w are solutions.

This shows that v ` w P S.

• In a similar manner, one shows (do it!) that for any r P R and
v “ pv1 , . . . , vn q P S the scalar multiple rv “ prv1 , . . . , rvn q is
again in S.
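
(Aside, assuming Python with NumPy: the closure properties proved above can also be observed numerically, for instance for the homogeneous equation 3x + 4y − 2z = 0 of Example 2.15.)

import numpy as np

A = np.array([[3.0, 4.0, -2.0]])      # coefficient matrix of 3x + 4y - 2z = 0

v = np.array([2.0, 0.0, 3.0])         # a solution: 3*2 + 4*0 - 2*3 = 0
w = np.array([0.0, 1.0, 2.0])         # another solution

print(np.allclose(A @ (v + w), 0))    # the sum is again a solution: True
print(np.allclose(A @ (7.5 * v), 0))  # and so is every scalar multiple: True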

2.3 Intersection of subspaces

Lemma 2.19. Let V be a vector space and A, B Ă V be two sub-


spaces. Then the intersection

A X B :“ tv P V | v P A and v P Bu

is also a subspace of V . More generally, this holds true for any


number of subspaces, i.e., if A1 , A2 . . . , An Ă V are subspaces, then
so is their joint intersection

A1 X ¨ ¨ ¨ X An “ tv P V | v P A1 , v P A2 , . . . , v P An u.

Proof. We need to make sure that A X B satisfies the conditions in


Definition 2.17. This is easy enough. For example, the zero vector
0 P AXB since 0 P A (since A is a subspace) and also 0 P B (since B
is also a subspace). Here is a visualization for sums: if x, y P A X B,
then x ` y P A X B since it is both contained in A and also in B.

[Figure: two subspaces A and B (drawn as planes) and their intersection A ∩ B; for vectors x, y ∈ A ∩ B the sum x + y again lies in A ∩ B.]

Intersections of subspaces are hugely important to us because of


the following example.
Example 2.20. Consider once again a homogeneous system as in (2.14). Then, of course, each individual equation of that system is in its own right a homogeneous linear equation, for example
  a11 x1 + a12 x2 + · · · + a1n xn = 0.
By the above, the solution set of that equation is a subspace of Rn, which we denote by S1 (⊂ Rn). Likewise this is true for all the other individual equations, so we get subspaces S1, . . . , Sm, one for each equation. The solution set of the whole system is then just
  S1 ∩ S2 ∩ · · · ∩ Sm.
(Indeed, a vector (r1, . . . , rn) ∈ Rn is a solution for the whole system precisely if it is one for the individual equations.)
An important question that we will eventually be able to make
more precise and to answer is this:
Question 2.21. Given two subspaces A, B in some vector space V ,
how “much smaller” can A X B be than A and B?
In the above illustration, we will want to articulate the idea that
the ambient vector space V is “3-dimensional”, A and B are 2-
dimensional (i.e., a plane) and A X B is 1-dimensional (i.e., a line).

Note that this need not be the case: if A “ B is the same plane,
for example, then certainly A X B “ A is also 2-dimensional. This
relates to the discussion about the intersections of lines in R2 in
Summary 1.12: if A, B Ă R2 are “1-dimensional” (i.e., lines), their
intersection may still be a line, namely if A “ B. If the ambient
vector space V is even larger, for example V “ R4 (which has
“dimension 4”), then it is no longer reasonable to write down all
possible constellations of how A, B lie in V .

2.4 Further examples of vector spaces


2.4.1 Polynomials
We introduce a number of further examples of vector spaces. Recall
that a function f : R Ñ R (i.e., cf. §A, a function that takes as
an input a real number x and whose output f pxq is another real
number) is called a polynomial if it is of the form
f pxq “ an xn ` an´1 xn´1 ` ¨ ¨ ¨ ` a1 x ` a0 ,
where an , an´1 , . . . , a0 are real numbers. These numbers are called
the coefficients of f . The degree of f is the largest exponent n
appearing in f (provided that the coefficient an ‰ 0). Recall from
§A that such an expression is abbreviated as
  f(x) = ∑_{i=0}^{n} a_i x^i.

In increasing complexity, a constant function


f pxq “ a
is a polynomial of degree 0 (note a “ a ¨ x0 );
f pxq “ a1 x ` a0
is a linear polynomial (also known as linear function). Its degree is
1 (provided a1 ‰ 0). Next,
f pxq “ a2 x2 ` a1 x ` a0
is called a quadratic polynomial (or quadratic function). Its degree
is 2 (provided a2 ‰ 0; if a2 “ 0 then it is a linear polynomial). These
types of functions are familiar from high-school.
[Figure: the graphs of f(x) = −2x − 1, g(x) = x³ + 2x² + 1 and their sum (f + g)(x) = x³ + 2x² − 2x.]

Definition and Lemma 2.22. The set
  R[x] := {f : R → R | f is a polynomial}
is a vector space, where we define the sum and scalar multiplication as follows: given two polynomials f, g ∈ R[x], their sum is the function f + g : R → R defined by
  (f + g)(x) := f(x) + g(x),
and given a real number r ∈ R, the scalar multiple is the function rf : R → R defined by
  (rf)(x) := r · f(x).
The set
  R[x]≤d := {∑_{i=0}^{d} a_i x^i | a0, . . . , ad ∈ R} (⊂ R[x])
of polynomials of degree at most d is a subspace of R[x].

Proof. We have to check the conditions on a vector space (Definition 2.10). As happens often in other examples, the most notable condition to check is that the sum and the scalar multiple are again elements of the vector space. Here, we need to check that for f, g ∈ R[x] the function f + g defined above is again a polynomial. Fortunately, this is easy: if f(x) = a_n x^n + a_{n−1} x^{n−1} + · · · + a_1 x + a_0 and g(x) = b_n x^n + b_{n−1} x^{n−1} + · · · + b_0, then, by definition,
  (f + g)(x) = a_n x^n + a_{n−1} x^{n−1} + · · · + a_0 + b_n x^n + b_{n−1} x^{n−1} + · · · + b_0
             = (a_n + b_n) x^n + (a_{n−1} + b_{n−1}) x^{n−1} + · · · + (a_0 + b_0)
             = c_n x^n + c_{n−1} x^{n−1} + · · · + c_0,
where c_i := a_i + b_i. Thus, the sum of f and g is another polynomial. Similarly, one verifies that the scalar multiple r · f is a polynomial (check this! what are its coefficients?). With these checks done, one can proceed to check the remaining conditions in Definition 2.10. Checking these is comparatively uninsightful, and will be skipped.
Checking that R[x]≤d is a subspace amounts to asserting that the 0 polynomial f(x) = 0 has degree at most d, and that sums and scalar (!) multiples of polynomials of degree ≤ d again have degree ≤ d. This is clear.
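
(Aside: a polynomial of degree at most d is determined by its coefficients a0, . . . , ad, so an element of R[x]≤d can be stored as a plain Python list of d + 1 numbers; the sum and scalar multiple of the lemma then act componentwise on these lists. A small sketch, using the polynomials from the picture above:)

f = [-1.0, -2.0, 0.0, 0.0]    # f(x) = -2x - 1, coefficients (a0, a1, a2, a3)
g = [ 1.0,  0.0, 2.0, 1.0]    # g(x) = x^3 + 2x^2 + 1

f_plus_g = [a + b for a, b in zip(f, g)]   # [0.0, -2.0, 2.0, 1.0], i.e. x^3 + 2x^2 - 2x
three_f  = [3 * a for a in f]              # the scalar multiple 3*f
print(f_plus_g, three_f)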
Remark 2.23. It is also true that the product of two polynomials
is again a polynomial, but this is not part of what it takes to be a
vector space, so we disregard that property at this point.
Remark 2.24. Instead of just polynomials, one can consider more general functions:
  R[x] ⊂ {f : R → R | f is differentiable}
       ⊂ {f : R → R | f is continuous}
       ⊂ {f : R → R | f is any function}
are increasingly large vector spaces, cf. Exercise 2.3. The (huge!) space of all differentiable functions is a key player in analysis.

2.4.2 Direct sums


Definition and Lemma 2.25. Let V, W be two vector spaces. Their direct sum is the set
  V ⊕ W := {(v, w) | v ∈ V, w ∈ W}.
It is endowed with the addition given by
  (v, w) + (v′, w′) := (v + v′, w + w′)
and scalar multiplication given by
  r · (v, w) := (rv, rw).
These operations turn V ⊕ W into a vector space.
More generally, the same definition works for finitely many¹ vector spaces V1, . . . , Vn, giving rise to the direct sum V1 ⊕ · · · ⊕ Vn.
This is easy to check: revisit the definition of a vector space and
see how checking each of the 8 axioms for V ‘ W reduces to using
the precise same axioms for V and W . In particular, the zero vector
in V ‘ W is the pair p0V , 0W q, where for clarity 0V denotes the zero
vector in V and 0W the one in W .
Example 2.26. We have R² = R ⊕ R and in general
  Rⁿ = R ⊕ · · · ⊕ R   (n summands).

This is clear from the definition of the sum of vectors in Rn (Defi-


nition 2.3) and the scalar multiplication (Definition 2.7).
Note that V Ă V ‘ W , by regarding a vector v P V as the
vector pv, 0W q. Likewise we can regard some w P W as the vector
p0V , wq P V ‘ W . This way, V ‘ W is a vector space that naturally
contains both V and W .
Example 2.27. The direct sum R2 ‘R consists of pairs pv, wq with
v “ px, yq P R2 and w P R. Thus, R2 ‘ R “ tppx, yq, wq | x, y, w P
Ru. We can identify such a pair (consisting of a pair px, yq and a
number w) with a triple px, y, wq. Therefore, R2 ‘R can be identified
with R3 . The sum and scalar multiple on R2 ‘ R as defined in
Definition and Lemma 2.25 then reduce to the usual sum and scalar
multiple in R3 as defined in Definition 2.3 and Definition 2.7.
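
(Aside, assuming Python with NumPy: the identification of R² ⊕ R with R³ described in Example 2.27 amounts to concatenating the two component vectors.)

import numpy as np

v = np.array([1.0, 2.0])          # an element of R^2
w = np.array([5.0])               # an element of R^1
print(np.concatenate([v, w]))     # the corresponding element [1. 2. 5.] of R^3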

2.4.3 Quotient spaces


All the examples of vector spaces that we have encountered so far
were subspaces of an already given vector space, beginning with
some ambient Rn . However, not all vector spaces embed (naturally)
in some Rn . To illustrate this, we consider an example of a so-called
quotient space. Since a full treatment of this would require a few
more basic notions, we only discuss this in a special case:
1 or even infinitely many

Definition 2.28. Consider V = R², the plane, and a line L ⊂ V through the origin. We define a set
  V/L := {all lines that are parallel to L}.
(This is read "V modulo L.") For example, L1 and L2 are elements of that set V/L in the illustration below.
[Figure: a line L through the origin, parallel lines L1 and L2, vectors x1, x1′ ∈ L1 and x2 ∈ L2, and the line L1 + L2 through x1 + x2.]

How do we define the sum and scalar multiplication on that set V/L? Given L1, L2 ∈ V/L, take any x1 ∈ L1 and any x2 ∈ L2. (These are both vectors in V = R².) Form the unique line that passes through x1 + x2 and is parallel to L. Call this line L1 + L2. Similarly, the scalar multiple r · L1 is the line passing through r · x1 and parallel to L. What is remarkable is that this makes sense, i.e., that the resulting lines do not depend on the choices of x1, x2 above. In the illustration, we indicate two choices for x1 (the second one being denoted x1′). The sum x1 + x2 is clearly different from x1′ + x2, but they do lie on the same line (that is parallel to L). This holds since x1′ − x1 lies in L. Thus
  (x1′ + x2) − (x1 + x2) = x1′ − x1
also lies in L, and therefore x1′ + x2 and x1 + x2 lie on the same line that is parallel to L.
With this settled, one can show (without much headache) that V/L is indeed a vector space. (What is the zero vector in V/L?)
A conceptually important insight is that there is no natural way in which this V/L is a subspace of R². E.g., one may assign to an element L1 ∈ V/L, say, the y-coordinate of the intersection of L1 with the y-axis. But this idea is ad hoc and problem-laden (why not take the x-axis instead; and, what is worse, what happens if L is in fact the y-axis...?).

2.5 Linear combinations


In the sequel, V will always denote a vector space, for example
V “ Rn .

Definition 2.29. A linear combination of vectors v1 , . . . , vm P V is


a vector of the form

a1 v1 ` ¨ ¨ ¨ ` am vm ,

where the a1 , . . . , am are arbitrary real numbers.

Example 2.30. If m “ 1 in the above definition, there is only one


vector v :“ v1 . A linear combination of a single vector v is therefore
any vector of the form av, with an arbitrary a P R. In other words
it is an arbitrary scalar multiple of that vector.

Example 2.31. More interesting things start happening for two


vectors and more. As an example, consider v1 “ p1, 0, 0q and v2 “
p0, 1, 0q in the vector space R3 . Then p3, 2, 0q is a linear combination
of these since

p3, 2, 0q “ 3 ¨ p1, 0, 0q ` 2 ¨ p0, 1, 0q.

On the other hand, p0, 0, 1q is not a linear combination of v1 and v2 :


for arbitrary a1 , a2 P R, we compute

a1 v1 ` a2 v2 “ pa1 , 0, 0q ` p0, a2 , 0q “ pa1 , a2 , 0q.

No matter how we choose a1 and a2 , we always have

pa1 , a2 , 0q ‰ p0, 0, 1q,

since the third components of these two vectors are always different.
In fact, the linear combinations of v1 and v2 are precisely the vectors
px, y, zq that satisfy z “ 0.
[Figure: the vectors v1 = (1, 0, 0) and v2 = (0, 1, 0) in the x-y-plane, together with the linear combination ½ v1 + ½ v2.]

Given two vectors v1 , v2 P R3 , we will see later (Theorem 2.66)


that there will always be some vector w P R3 (in fact infinitely
many) that is not a linear combination of v1 and v2 . In the above
example any vector w “ px, y, zq with z ‰ 0 has that property.
Continuing with that example, the x ´ y-plane inside R3 , i.e., V :“
tpx, y, zq | x, y P R, z “ 0u “ tpx, y, 0q | x, y P Ru is a subspace
of R3 : it contains p0, 0, 0q and given any two vectors v, w P V we
have v ` w P V and given r P R, rv P V (!)(convince yourself this
is true!). We can alternatively use Proposition 2.18 to see this is
a subspace: V is the solution space of the equation z “ 0 (in the
three variables x, y, z), which is a homogeneous linear equation. The
following statement asserts that we always obtain a subspace in this
manner.

Lemma 2.32. Let V be a vector space and v1 , . . . , vm P V be any


vectors. The set

Lpv1 , . . . , vm q :“ ta1 v1 ` ¨ ¨ ¨ ` am vm | a1 , . . . , am P Ru

of all linear combinations of v1 , . . . , vm is a subspace of V . It is


called the span (or sometimes also the linear hull ) of these vectors.

Proof. We check the three conditions in Definition 2.17. Let us


abbreviate L :“ Lpv1 , . . . , vm q.
(1) The zero vector 0 P L since 0¨v1 `. . . 0¨vm “ 0¨pv1 `¨ ¨ ¨`vm q “ 0,
using properties (4) and (8) in the definitions of a vector space.
(2) Given two vectors w, u ∈ L, we check w + u ∈ L. Since w ∈ L, there are some real numbers a1, . . . , am such that w = a1 v1 + · · · + am vm = ∑_{i=1}^{m} ai vi. Likewise there are real numbers b1, . . . , bm with u = ∑_{i=1}^{m} bi vi. This implies
  w + u = ∑_{i=1}^{m} ai vi + ∑_{i=1}^{m} bi vi = ∑_{i=1}^{m} (ai + bi) vi ∈ L.

(3) Given w P L and r P R, one checks similarly that rw P L ((!),


verify that!).

Example 2.33. In Example 2.31, we have

Lpp1, 0, 0q, p0, 1, 0qq “ tpx, y, 0q | x, y, P Ru.

Exercise 2.10 and Exercise 2.11 discuss linear combinations in


the vector space Rrxsď3 . The span is closely related to another
construction that produces new vector spaces out of given ones:
Definition 2.34. Let V be a vector space and A, B Ă V be two
subspaces. The sum of A and B is defined as

A ` B :“ tv ` w | v P A, w P Bu.

I.e., it consists of all possible ways to sum an element in A and


an element in B. More generally, given subspaces A1 , . . . , An of V ,
their sum is defined as

A1 ` ¨ ¨ ¨ ` An :“ tv1 ` ¨ ¨ ¨ ` vn |v1 P A1 , . . . , vn P An u.

Lemma 2.35. The sum A ` B is then again a subspace of V .

The proof of this is very similar to the one of Lemma 2.32 and
will be omitted.
Remark 2.36. Given some vectors v1 , . . . , vn P V , we have

Lpv1 , . . . , vn q “ Lpv1 q ` ¨ ¨ ¨ ` Lpvn q.

Indeed, both sets are precisely the vectors of the form a1 v1 ` ¨ ¨ ¨ `


an vn for arbitrary ai P R.

Remark 2.37. The sum is completely different from the union A Y


B of the two subspaces. We have seen in Example 2.13 that the
union is (in general) not even a subspace (just a subset). We have
A Y B Ă A ` B,
but these two subsets are distinct (unless A or B only consists of the
zero vector). To see this inclusion, note that A Ă A ` B. Indeed,
the zero vector 0 P B (since it is a subspace!), and for any v P A,
we have v “ v ` 0 P A ` B. Similarly, B Ă A ` B, and therefore
A Y B Ă A ` B.
Remark 2.38. The sum A ` B is different from the direct sum
A ‘ B. This is already clear from the definition: while the sum
makes use of the ambient vector space V , the direct sum A ‘ B is
insensitive to A and B both lying in V . Also, it does not “see” to
what extent A and B may overlap.
In a spirit similar to Question 2.21 we can ask the following ques-
tion:
Question 2.39. Given two subspaces A, B Ă V of some larger vec-
tor space, how much “bigger” than A and B is the sum A ` B?
It turns out that Question 2.21 and Question 2.39 are closely related. Loosely speaking, one can say that A + B "gets bigger" in the same way as A ∩ B "gets smaller". To give a precise meaning to this one needs the concept of the dimension of a vector space. Understanding the dimension of a vector space requires combining two preliminary notions, that of a generating system and that of linear independence below (Definition 2.46).
Definition 2.40. A collection v1 , . . . , vn of vectors is a generating
system if
Lpv1 , . . . , vn q “ V
or, equivalently, if every vector w P V is an appropriate linear com-
bination of these vectors. We also say that these vectors span V if
this is the case.
Example 2.41. The vectors e1 :“ p1, 0, . . . , 0q, e2 “ p0, 1, 0, . . . , 0q
up to en “ p0, . . . , 0, 1q are a generating system. Indeed, each vector
x “ px1 , . . . , xn q P Rn is a linear combination of these, namely
x “ x1 ¨ e1 ` ¨ ¨ ¨ ` xn ¨ en .

Example 2.42. We have observed in Example 2.31 that the vectors


e1 “ p1, 0, 0q and e2 “ p0, 1, 0q in R3 are not a generating system
since they only span the subspace
Lpe1 , e2 q “ tpx, y, 0q | x, y P Ru
which is not the entire R3 (e.g., p0, 0, 1q is missing).
The following example shows that three arbitrary vectors in R3
need not form a generating set.
Example 2.43. Consider the vectors v1 :“ e1 “ p1, 0, 0q, v2 “
p0, 1, 1q and v3 “ p2, 1, 1q. These three vectors do not form a gener-
ating set of R3 . In order to show this and to also understand which
vectors are precisely in the span Lpv1 , v2 , v3 q, we consider the follow-
ing equation, where w “ px, y, zq P R3 is a vector and a1 , a2 , a3 P R:
w “ a1 v1 ` a2 v2 ` a3 v3 .
Those vectors w that can be written in such a form are in the span,
those where no such equation holds are not in the span! This is
an equation between two vectors in R3 , i.e., ordered triples. Two
such triples are the same precisely if their three components are the
same. This leads to the following linear system:
a1 ¨ 1 ` a2 ¨ 0 ` a3 ¨ 2 “ x,
a1 ¨ 0 ` a2 ¨ 1 ` a3 ¨ 1 “ y,
a1 ¨ 0 ` a2 ¨ 1 ` a3 ¨ 1 “ z.
In this system a1 , a2 , a3 are the variables, and x, y, z are parameters
(on which the solutions of the system will depend). We form the
matrix associated to this linear system, which is
¨ ˛
1 0 2 x
˝ 0 1 1 y ‚.
0 1 1 z
We apply Gaussian elimination to that matrix, i.e., subtract the
second row from the third:
¨ ˛
1 0 2 x
˝ 0 1 1 y ‚.
0 0 0 z´y
We now distinguish two cases:

• z ´ y “ 0 (i.e., y “ z): in this case the matrix is already in


reduced row echelon form (Definition 1.28). The system has
solutions: the variable a3 is a free variable, so its value can
be chosen arbitrarily. Then a1 and a2 are uniquely determined
by a3 by the equations
a1 ` 2a3 “ x,
a2 ` a3 “ y,

which gives a1 “ x ´ 2a3 and a2 “ y ´ a3 . Therefore, for


arbitrary x, y P R, the vectors
w “ px, y, yq P Lpv1 , v2 , v3 q
are in the span. They can be expressed as linear combinations
w “ px ´ 2aqv1 ` py ´ aqv2 ` av3 ,
for an arbitrary a P R (this was the a3 before).
• z ´ y ‰ 0 (i.e., y ‰ z). In this case, we can divide the last
equation by z´y, which gives the following reduced row-echelon
matrix ¨ ˛
1 0 2 x
˝ 0 1 1 y ‚.
0 0 0 1
According to Method 1.32, the system has no solution in this
case. Thus, vectors of the form
w “ px, y, zq with y ‰ z
are not in the span: w R Lpv1 , v2 , v3 q.
The following method gives a criterion to check whether a given
set of vectors generates Rn . We will prove this statement later
(Theorem 3.78).
Method 2.44. Let v1 , . . . , vm P Rn be some vectors. Form the
matrix
       ¨ v1 ˛
       ˚ v2 ‹
   A “ ˚ .. ‹
       ˝ vm ‚

(i.e., the i-th row of A is precisely the vector vi , so that


A “ pvij q if vi “ pvi1 , . . . , vin q.) Bring this matrix into reduced
row-echelon form by Gaussian elimination (Method 1.30). Call this
resulting matrix B. If B contains n leading ones, then v1 , . . . , vm
span Rn . Otherwise, they don’t span Rn .
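If one wants to carry out Method 2.44 by computer, the row reduction can be delegated to a computer algebra system. Here is a small sketch in Python (an illustration only; the function name spans_Rn is ours), using the SymPy library, whose rref routine returns the reduced row-echelon form together with the positions of the leading ones:

    from sympy import Matrix

    def spans_Rn(vectors, n):
        # Form the matrix whose i-th row is the vector v_i and row-reduce it.
        A = Matrix(vectors)
        rref_matrix, pivots = A.rref()   # pivots = columns containing the leading ones
        return len(pivots) == n          # n leading ones <=> the vectors span R^n

    # The vectors of Example 2.43 do not span R^3 ...
    print(spans_Rn([[1, 0, 0], [0, 1, 1], [2, 1, 1]], 3))   # False
    # ... while the standard basis vectors do.
    print(spans_Rn([[1, 0, 0], [0, 1, 0], [0, 0, 1]], 3))   # True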

Corollary 2.45. Fewer than n vectors can never span Rn (since in


any event B can at most contain m leading ones).

2.6 Linear independence


Let v1 , . . . , vm P V be m vectors in some vector space. Then we have

0 ¨ v1 ` ¨ ¨ ¨ ` 0 ¨ vm “ 0 ¨ pv1 ` ¨ ¨ ¨ ` vm q “ 0.

This follows from the distributive law and the scalar multiplication
of any vector with 0, cf. (4) and (8) in Definition 2.10. So, there
is always a “trivial” way to obtain the zero vector from v1 , . . . , vm .
We can ask if there are other ways of achieving the zero vector.
Definition 2.46. We say v1 , . . . , vm are linearly dependent if there
is a non-zero linear combination of these that gives the zero vector.
I.e., if there are a1 , . . . , am P R of which at least one is non-zero,
such that
a1 v1 ` ¨ ¨ ¨ ` am vm “ 0. (2.47)
If this is not the case, then we say the vectors are linearly inde-
pendent.

Thus, they are linearly independent if the trivial linear combination
(where all ai “ 0) is the only way to obtain the zero vector as a linear
combination of v1 , . . . , vm .
Example 2.48. The vectors e1 “ p1, 0, 0q, e2 “ p0, 1, 0q and e3 “
p0, 0, 1q P R3 are linearly independent. To see this, suppose some
linear combination equals the zero vector: if

a1 e1 ` a2 e2 ` a3 e3 “ p0, 0, 0q

then we compute the left hand side as

pa1 , 0, 0q ` p0, a2 , 0q ` p0, 0, a3 q “ pa1 , a2 , a3 q,



so the above equation forces a1 “ a2 “ a3 “ 0. This shows that the


vectors are linearly independent.
More generally, the same argument shows that
e1 “ p1, 0, . . . , 0q, e2 “ p0, 1, 0, . . . 0q, . . . , en “ p0, . . . , 0, 1q P Rn
are linearly independent.
Example 2.49. We revisit the vectors v1 :“ e1 “ p1, 0, 0q, v2 “
p0, 1, 1q and v3 “ p2, 1, 1q P R3 of Example 2.43. These vectors are
not linearly independent. Indeed, we observe that v3 “ 2v1 ` v2 , so
that
2v1 ` v2 ´ v3 “ 0.
Example 2.50. The polynomials 1 ` x, 3x ` x2 , 2 ` x ´ x2 are
linearly independent vectors in Rrxsď2 . To see this, suppose that a
linear combination of them equals the zero vector (i.e., the constant
polynomial 0):
0 “ a1 p1 ` xq ` a2 p3x ` x2 q ` a3 p2 ` x ´ x2 q
“ a1 ` a3 ` pa1 ` 3a2 ` a3 qx ` pa2 ´ a3 qx2 .
Since this must hold for all x P R, this forces the following homoge-
neous linear system:
0 “ a1 ` a3
0 “ a1 ` 3a2 ` a3
0 “ a2 ´ a3 .
Solving this system (do it (!)) one sees that this only has the trivial
solution a1 “ a2 “ a3 “ 0. Thus, the polynomials are linearly
independent.
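The omitted computation amounts to row-reducing the coefficient matrix of this homogeneous system. For readers who want to check it by computer, here is a short sketch in Python using the SymPy library (an illustration only):

    from sympy import Matrix

    # Coefficients of the system 0 = a1 + a3, 0 = a1 + 3 a2 + a3, 0 = a2 - a3.
    A = Matrix([[1, 0, 1],
                [1, 3, 1],
                [0, 1, -1]])
    print(A.rank())        # 3, so the only solution is a1 = a2 = a3 = 0
    print(A.nullspace())   # [] : the solution space of the homogeneous system is trivial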
The following statement says in some sense that a family of vec-
tors is linearly independent if there is no redundancy among them.
Lemma 2.51. Let v1 , . . . , vm P V be some vectors. They are lin-
early dependent exactly if (at least) one of these vectors can be
expressed as a linear combination of the others, i.e., some
vi “ a1 v1 ` ¨ ¨ ¨ ` ai´1 vi´1 ` ai`1 vi`1 ` ¨ ¨ ¨ ` am vm (2.52)
for an appropriate i and appropriate coefficients a1 etc.

Proof. If (2.52) holds, then

a1 v1 ` ¨ ¨ ¨ ` ai´1 vi´1 ` p´1qvi ` ai`1 vi`1 ` ¨ ¨ ¨ ` am vm “ 0,

so they are linearly dependent.


Conversely, if (2.47) holds, then pick some i such that ai ‰ 0 (by
assumption this is possible). Then one can subtract ai vi and divide
by ´ai (which is nonzero, crucially!), giving
vi “ p´a1 {ai q v1 ` ¨ ¨ ¨ ` p´ai´1 {ai q vi´1 ` p´ai`1 {ai q vi`1 ` ¨ ¨ ¨ ` p´am {ai q vm .
This is an equation of the form (2.52).

The following method decides whether a given set of vectors is


linearly independent in Rn . A proof is conveniently done using later
results, such as Lemma 3.74.

Method 2.53. Let v1 , . . . , vm P Rn be some vectors. Form the


matrix
       ¨ v1 ˛
       ˚ v2 ‹
   A “ ˚ .. ‹
       ˝ vm ‚
(i.e., the i-th row of A is precisely the vector vi , so that
A “ pvij q if vi “ pvi1 , . . . , vin q.) Bring this matrix into reduced
row-echelon form by Gaussian elimination (Method 1.30). Call this re-
sulting matrix B. If B contains m leading ones, then v1 , . . . , vm are
linearly independent. Otherwise, they are linearly dependent.
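Method 2.53 can be automated in the same way; the following sketch in Python (again only an illustration, with a function name of our choosing) compares the number of leading ones with the number m of given vectors:

    from sympy import Matrix

    def linearly_independent(vectors):
        A = Matrix(vectors)                  # the i-th row of A is the vector v_i
        rref_matrix, pivots = A.rref()       # pivots = columns of the leading ones
        return len(pivots) == len(vectors)   # m leading ones <=> linearly independent

    print(linearly_independent([[1, 0, 0], [0, 1, 1], [2, 1, 1]]))   # False, cf. Example 2.49
    print(linearly_independent([[1, 0, 0], [0, 1, 0], [0, 0, 1]]))   # True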

Corollary 2.54. More than n vectors can never be linearly inde-


pendent in Rn (i.e., for m ą n, any vectors v1 , . . . , vm will be linearly
dependent, since the matrix B can contain at most n leading ones,
being in reduced row-echelon form).

Remark 2.55. This method is very similar to Method 2.44, except


that there we asked B to contain n leading ones: this guarantees
that v1 , . . . , vm span Rn . Having as many leading ones as there are
vectors, i.e., m leading ones, instead guarantees that the vectors are
linearly independent.

Example 2.56. We revisit the vectors v1 :“ e1 “ p1, 0, 0q, v2 “


p0, 1, 1q and v3 “ p2, 1, 1q P R3 of Example 2.43. The matrix having
these vectors as rows is
¨ ˛
1 0 0
˝ 0 1 1 ‚.
2 1 1

We bring it into reduced row echelon form like so:


¨ ˛ ¨ ˛ ¨ ˛
1 0 0 1 0 0 1 0 0
˝ 0 1 1 ‚⇝ ˝ 0 1 1 ‚⇝ ˝ 0 1 1 ‚.
2 1 1 0 1 1 0 0 0

This reduced row-echelon matrix has only 2 leading ones, so the vec-
tors are not linearly independent, i.e., they are linearly dependent.

The importance of linearly independent vectors comes from the


following result:

Proposition 2.57. Let v1 , . . . , vm be linearly independent vectors


in a vector space V . If some vector v is expressed in two (ostensibly
different) ways as a linear combination of these vectors, then the two
presentations must be the same. I.e., if

v “ a1 v1 ` ¨ ¨ ¨ ` am vm and
v “ b1 v1 ` ¨ ¨ ¨ ` bm vm

for appropriate real numbers a1 , . . . , am , b1 , . . . , bm , then necessarily


we have
a1 “ b1 , a2 “ b2 , . . . , am “ bm .

Proof. Subtracting these two equations from one another (and us-
ing the commutativity of addition, and the law of distributivity,
cf. Definition 2.10), we obtain

0“v´v
“ pa1 ´ b1 qv1 ` ¨ ¨ ¨ ` pam ´ bm qvm .

Since the vectors are linearly independent, this implies a1 ´ b1 “ 0


etc., so that a1 “ b1 etc.

2.7 Bases
Definition 2.58. A collection of vectors in a vector space
v1 , . . . , vm P V
is called a basis if they span V and if they are linearly independent.
Example 2.59. The vectors
e1 “ p1, 0, . . . , 0q, e2 “ p0, 1, 0, . . . 0q, . . . , en “ p0, . . . , 0, 1q P Rn
are a basis, called the standard basis. Indeed, we have observed in
Example 2.41 and Example 2.48 that they span Rn and that they
are linearly independent.
We try and modify this basis a little bit and see what happens.
If we omit one of the vectors and only consider, say
e2 “ p0, 1, 0, . . . 0q, . . . , en “ p0, . . . , 0, 1q P Rn
these do not form a basis: while they are still linearly independent,
they do not span Rn .
On the other hand, we now consider
e1 , . . . , en , v,
for an arbitrary vector v P Rn . These do not form a basis: while they
span Rn (even without the v), they are not linearly independent.
Indeed, since e1 , . . . , en span Rn , this means that
v “ a1 e1 ` ¨ ¨ ¨ ` an en
for appropriate a1 , . . . , an P R. According to Lemma 2.51, this
means that e1 , . . . , en , v are linearly dependent.
Example 2.60. The vectors
v1 “ p0, 2, 1q, v2 “ p1, 0, 2q, v3 “ p´1, 1, 1q
form a basis of R3 . To see this, we apply Method 2.53 and Method 2.44:
¨  0 2 1 ˛   ¨  1 0 2 ˛   ¨ 1 0 2 ˛   ¨ 1 0  2  ˛
˝  1 0 2 ‚ ⇝ ˝  0 2 1 ‚ ⇝ ˝ 0 2 1 ‚ ⇝ ˝ 0 1 1{2 ‚ .
  ´1 1 1      ´1 1 1        0 1 3       0 0  1
This matrix has three leading ones, so the vectors are linearly inde-
pendent and span R3 , so they form a basis.
Note that this is a different basis than e1 , e2 , e3 considered above.

The following result, which is simply a combination of the defini-


tion of generating systems and Proposition 2.57, is often described
by saying that a basis gives rise to a coordinate system in a vector
space.
Proposition 2.61. Let v1 , . . . , vm be a basis of a vector space V .
Then each vector v P V can be written in a unique way as a linear
combination
v “ a1 v1 ` ¨ ¨ ¨ ` am vm .
For some other vector w “ b1 v1 ` ¨ ¨ ¨ ` bm vm , we have
v ` w “ pa1 ` b1 qv1 ` ¨ ¨ ¨ ` pam ` bm qvm .

2.8 The dimension of a vector space


We are all used to referring to the space surrounding us as “3-
dimensional”, and refer to a plane as “2-dimensional”. In this sec-
tion, which is crucial to linear algebra and, by extension to all ap-
plications of linear algebra in physics, engineering and mathematics
itself, we make this statement precise.
Theorem 2.62. Let V be a vector space with a basis v1 , . . . , vn .
Then any other basis of V also consists of n vectors.
In other words, the number of vectors in a basis does not depend
on the basis. (Recall from Example 2.60 that the vectors that form
a basis may very well be different.)
Definition 2.63. We say that a vector space V has dimension n if
there is a basis of V with n elements.
Example 2.64. The standard basis of Rn consists of n elements
(Example 2.59), so that
dim Rn “ n.
The space of polynomials of degree at most d has a basis 1, x, x2 , . . . , xd .
These are d ` 1 polynomials so that
dim Rrxsďd “ d ` 1.
If V has a basis v1 , . . . , vn (so that dim V “ n) and another vector
space W has a basis w1 , . . . , wm (and dim W “ mq, then a basis of

the direct sum V ‘W is given by pv1 , 0q, . . . , pvn , 0q, p0, w1 q, . . . , p0, wm q.
These are n ` m vectors, so that

dimpV ‘ W q “ dim V ` dim W.

Remark 2.65. It can be shown that every vector space has a basis.
In this course, we only consider vector spaces with a basis consisting
of finitely many vectors, as in Definition 2.58. We call such vector
spaces finite-dimensional .
An example of a vector space not having a finite basis (i.e., an
infinite-dimensional vector space) is Rrxs (for which a basis is given
by the polynomials 1, x, x2 , x3 , . . . ).

The following theorem addresses the question how linearly inde-


pendent sets can be extended to a basis.
Theorem 2.66. Suppose that some vector space V is spanned by
m vectors b1 , . . . , bm (Definition 2.40).
(1) Then a basis of V can be obtained by removing certain vectors
among the b1 , . . . , bm . In particular, this says that V has a basis
and that
dim V ď m
(and so, in particular that V is finite-dimensional.)
(2) Every linearly independent set of vectors can be enlarged to a
basis by adding appropriate vectors from any given basis of V .
(I.e., if v1 , . . . , vn are linearly independent, and w1 , . . . , wm is
any basis of V , then the v1 , . . . , vn together with certain vectors
among the w1 , . . . , wm form a basis.) In particular, if v1 , . . . , vn
are linearly independent, then

dim V ě n.

(3) If W Ă V is a subspace, then W is also finite-dimensional and


dim W ď dim V . We have dim W “ dim V precisely if W “ V .
(4) For a subspace W Ă V , any basis of W can be extended to a
basis of V .

Proof. This is proved in any linear algebra textbook, e.g., [Nic95,


Theorem 6.4.1] or [Bot21, §1.3].

Example 2.67. In V “ R3 , consider the four vectors v1 “ p1, 1, ´1q,


v2 “ p2, 0, 1q, v3 “ p´1, 1, ´2q, v4 “ p1, 2, 1q. We apply Method 2.44
and Method 2.53 by forming the associated matrix and bringing it
into row echelon form:
¨ v1 ˛   ¨  1 1 ´1 ˛   ¨ 1  1 ´1 ˛   ¨ 1 1   ´1 ˛   ¨ 1 1   ´1 ˛
˚ v2 ‹   ˚  2 0  1 ‹   ˚ 0 ´2  3 ‹   ˚ 0 1 ´3{2 ‹   ˚ 0 1 ´3{2 ‹
˚ v3 ‹ “ ˚ ´1 1 ´2 ‹ ⇝ ˚ 0  2 ´3 ‹ ⇝ ˚ 0 0    0 ‹ ⇝ ˚ 0 0    0 ‹
˝ v4 ‚   ˝  1 2  1 ‚   ˝ 0  1  2 ‚   ˝ 0 0  7{2 ‚   ˝ 0 0    1 ‚
(First step: add certain multiples of the first row to the others,
second step: multiply the second row by ´1{2 and add multiples of it
to the third and last row, third step: divide the last row by 7{2.) We can
swap the last two rows and obtain a row echelon matrix. This
matrix has three leading ones, so that the four vectors generate R3
but are not linearly independent. (We also know dim V “ 3, so these
four vectors can not be linearly independent by Theorem 2.66(2).)
According to Theorem 2.66(1), we can obtain a basis by removing
certain vectors among these. Notice that one may not (in general)
remove an arbitrary one of the four vectors. In this example,
• the first three vectors v1 , v2 , v3 do not form a basis,
• however v1 , v2 , v4 do form a basis.
Indeed, this holds since in the above computation we can either remove the
last row, which brings us to
¨ v1 ˛   ¨ 1 1  ´1  ˛
˝ v2 ‚ ⇝ ˝ 0 1 ´3{2 ‚ .
  v3       0 0   0
This tells us that these three vectors are (still) not linearly inde-
pendent (and don’t span R3 ). By contrast, removing the third row,
gives
¨ v1 ˛   ¨ 1 1  ´1  ˛
˝ v2 ‚ ⇝ ˝ 0 1 ´3{2 ‚
  v4       0 0   1
which has three leading ones, so these three vectors form a basis of
R3 .
Corollary 2.68. Let V be a vector space with dim V “ n. Let n
vectors be given: v1 , . . . , vn . These vectors are linearly independent
if and only if they span V .

Proof. This follows from the theorem above. For example, suppose
they span V . If they are not linearly independent, then some vi
lies in the span of the remaining vectors. Thus V is the span of all
vectors but vi so that n ´ 1 ě dim V by Theorem 2.66(1). This is a
contradiction to our assumption.
The converse implication is proved similarly.
Example 2.69. Let a P R be a fixed real number. Consider the
vector space Rrxsďd . The polynomials
v0 pxq “ px ´ aq0 “ 1, v1 pxq “ px ´ aq, . . . , vd pxq “ px ´ aqd
are linearly independent. To see this, suppose
0 “ a0 v0 ` a1 v1 ` ¨ ¨ ¨ ` ad vd .
Note that vd has degree d, all the remaining ones have degree ď d´1.
Thus, looking at the coefficient for xd , we see ad “ 0. Continuing
this, we note that
0 “ a0 v0 ` a1 v1 ` ¨ ¨ ¨ ` ad´1 vd´1
forces ad´1 “ 0 (by looking at the coefficient of xd´1 ). Repeating
this argument, one sees that a0 “ ¨ ¨ ¨ “ ad “ 0.
We know dim Rrxsďd “ d ` 1 (Example 2.64). Thus, by Corol-
lary 2.68, these polynomials v0 , . . . , vd form a basis. According to
Proposition 2.61, any polynomial f pxq of degree ď d therefore can
be uniquely written as
f pxq “ a0 ` a1 px ´ aq ` ¨ ¨ ¨ ` ad px ´ aqd .
Colloquially, every polynomial can be expressed as a sum of powers
of x´a. (By definition of a polynomial, it can certainly be expressed
as a sum of powers of x ´ 0 “ x.)

2.8.1 Dimensions of sums and intersections


In this section, we give an answer to Question 2.21 and Ques-
tion 2.39. Colloquially, the possible failure of A ` B being “as large
as possible” (i.e., having the maximum possible dimension, namely
dim A ` dim B) is closely related to the possible failure of A X B
being “as small as possible.” Before stating that, we note another
consequence of Theorem 2.66.

Corollary 2.70. Suppose A, B Ă V are two subspaces with dim A “


m and dim B “ n. Then
dimpA ` Bq ď dim A ` dim B.
(Here, at the left + denotes the sum of the two subspaces (Defini-
tion 2.34), while at the right it is the sum of the two dimensions.)
Proof. If v1 , . . . , vm is a basis of A and w1 , . . . , wn is a basis of B,
then they in particular span A, resp. B. Thus, A ` B is spanned by
v1 , . . . , vm , w1 , . . . , wn .
These are m`n vectors. According to Theorem 2.66(1), this implies
dimpA ` Bq ď m ` n.

Theorem 2.71. Suppose A, B Ă V are two subspaces of a vector


space. Then
dimpA X Bq ` dimpA ` Bq “ dim A ` dim B.
This is a special case of a more general theorem, the so-called
rank-nullity theorem (Theorem 3.26). We illustrate it with the example
of subspaces in V “ R2 . If A Ă V is a subspace, then exactly one
of the following three cases occurs:
• dim A “ 0. This means that A just consists of the zero vector:
A “ t0u.
• dim A “ 1. This means that there is a basis of A consisting of
a single vector v P A. Since v is linearly independent, we have
v ‰ 0 (otherwise 1 ¨ v “ 0 is a non-trivial linear combination
giving the zero vector). Since v spans A, this means A “
tav | a P Ru. Thus, A is the line spanned by the (non-zero)
vector v.
• dim A “ 2. In this case we necessarily have A “ R2 by Theo-
rem 2.66(3).
Of course, for another subspace B the same three cases apply. If
A “ t0u, then A X B “ t0u and A ` B “ B, so in this case the
dimension formula (3.34) just reads
dimpt0uq ` dim B “ dimpt0uq ` dim B,

which does not give anything interesting. Similarly, if A “ R2 , then


A X B “ B and A ` B “ R2 , so the dimension formula reads

dim B ` dim R2 “ dim R2 ` dim B.

Again, this is tautological. The interesting case is therefore when


dim A “ 1 and, by symmetry, dim B “ 1. Thus both A and B are
lines, passing through the origin, in R2 . We distinguish two cases:
• A “ B. In this case A X B “ A, A ` B “ A, so the formula
reads
1 ` 1 “ 1 ` 1,
which is true.
• A ‰ B. In this case A X B “ t0u, since the lines are distinct
and therefore only intersect at the origin. Then the formula
says
0 ` dimpA ` Bq “ 1 ` 1 “ 2.
Thus dimpA ` Bq “ 2, which means that A ` B “ R2 , again
using Theorem 2.66(3).
Here is a picture of the two cases:

[Figure: two panels, “A “ B” and “A ‰ B”, each showing the lines A and B in the px, yq-plane.]

Definition 2.72. Let A, B Ă V be two subspaces. We say “the


sum A ` B is a direct sum” if dim A ` dim B “ dimpA ` Bq.

In other words, dimpA ` Bq needs to be as large as possible. In


the example of two lines, i.e., dim A “ dim B “ 1, the sum is direct
precisely if A ` B “ R2 .
Example 2.73. In V “ R3 , consider subspaces A, B Ă R3 with
dim A “ 1 and dim B “ 2. Thus, geometrically, A is a line passing
through the origin and B is a plane passing through the origin. We
have
0 Ă A X B Ă A.
This means that
0 ď dimpA X Bq ď dim A “ 1.
We distinguish two cases:
• A Ă B. Equivalently, A X B “ A or, yet equivalently,
dimpA X Bq “ 1.

• A Ć B. In this case A X B Ĺ A. Since A X B is a subspace of


strictly smaller dimension, this implies A X B “ t0u. Thus,
dimpA X Bq “ 0.

To summarize, a line A and a plane B (both passing through the


origin) in R3 intersect either in a point or in a line.
Example 2.74. Consider V “ R3 and two subspaces A, B Ă R3 of
dimension 2. Then the formula reads
dimpA X Bq “ 2 ` 2 ´ dimpA ` Bq.
We have in any event A, B Ă A ` B Ă R3 , which implies
2 ď dimpA ` Bq ď 3.
We consider two cases:
• A “ B. In this case A X B “ A and A ` B “ A, which both
have dimension 2.
• A ‰ B. In this case A X B Ĺ A, so A X B has dimension ă 2.
This means that dimpA ` Bq “ 3, and therefore
dimpA X Bq “ 1.

We summarize this as follows: two planes A, B passing through


the origin in R3 intersect either in a plane (this happens precisely
if A “ B), or they interesect in a line (this happens precisely if
A ‰ B).
If the ambient vector space has dimension ě 4, and dim A, dim B ě
2, then the possible dimensions of dimpA X Bq and dimpA ` Bq are
more varied, so we refrain from making a similar list.
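In concrete examples the quantities in this formula are easy to compute: dim A, dim B and dimpA ` Bq are ranks of matrices whose rows are spanning vectors, and dimpA X Bq can then be read off from the formula. A small sketch in Python (the two subspaces of R3 are chosen for illustration only):

    import numpy as np

    # A = the xy-plane, B = the yz-plane in R^3, each given by two spanning vectors (rows).
    A_gens = np.array([[1, 0, 0], [0, 1, 0]])
    B_gens = np.array([[0, 1, 0], [0, 0, 1]])

    dim_A = np.linalg.matrix_rank(A_gens)                         # 2
    dim_B = np.linalg.matrix_rank(B_gens)                         # 2
    dim_sum = np.linalg.matrix_rank(np.vstack([A_gens, B_gens]))  # dim(A + B) = 3
    dim_intersection = dim_A + dim_B - dim_sum                    # = 1, by Theorem 2.71
    print(dim_A, dim_B, dim_sum, dim_intersection)                # 2 2 3 1

(Indeed, the two planes intersect in the y-axis, which is 1-dimensional.)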

2.9 Exercises
Exercise 2.1. Let V “ tpx, y, zq | x, y, z P Ru. (Thus, V “ R3 .)
We use the regular addition of vectors. However, in contrast to
the regular scalar multiplication (Definition 2.7), we now use the
following. Decide in each case whether this turns V into a vector
space:
• r ¨ px, y, zq “ prx, y, rzq,
• r ¨ px, y, zq “ p0, 0, 0q,
• r ¨ px, y, zq “ p2rx, 2ry, 2rzq.
Exercise 2.2. Let V Ă R2 be a subspace. Which of the following
statements are correct?
(1) V contains at least one element.
(2) V contains at least two elements.
(3) V contains the zero vector p0, 0q.
(4) If v, w P V then also v ´ w P V .
Exercise 2.3. Using basic properties of differentiable functions from
your calculus class, show that the space
tf : R Ñ R | f is differentiable u
is a vector space (with the sum and scalar multiple defined as in
(2.22) and (2.22)).
Hint: structure your thinking as in Definition and Lemma 2.22.
Exercise 2.4. Give an example of two subspaces V, W Ă R2 such
that their union
V Y W “ tx “ px1 , x2 q P R2 | x P V or x P W u

is not a subspace.
Hint: Example 2.13.
Also give an example of two subspaces V, W Ă R2 , where the
union V Y W is a subspace.
Hint: be very lazy and minimalistic. What is the smallest sub-
space you can come up with?

Exercise 2.5. Determine in each case whether w P R4 lies in the


span of v1 and v2 . If so, name at least one linear combination of v1
and v2 that equals w; otherwise explain why there is no such linear
combination.
(1) w “ p2, ´1, 0, 1q, v1 “ p1, 0, 0, 1q, v2 “ p0, 1, 0, 1q
(2) w “ p1, 2, 15, 11q, v1 “ p2, ´1, 0, 2q, v2 “ p1, ´1, ´3, 1q
(3) w “ p2, 5, 8, 3q, v1 “ p2, ´1, 0, 5q, v2 “ p´1, 2, 2, ´3q

Exercise 2.6. Determine whether the following vectors span R4 :


(1) p1, 1, 1, 1q, p0, 1, 1, 1q, p0, 0, 1, 1q, p0, 0, 0, 1q
(2) p1, 3, ´5, 0q, p´2, 1, 0, 0q, p0, 2, 1, ´1q, p1, ´4, 5, 0q

Exercise 2.7. Determine whether the following vectors are linearly


independent:
(1) v1 “ p1, ´1, 0q, v2 “ p3, 2, ´1q, v3 “ p3, 5, ´2q in V “ R3 ,
(2) v1 “ p1, 1, 1q, v2 “ p1, ´1, 1q, v3 “ p0, 0, 1q in V “ R3 ,
(3) p1, ´1, 1, ´1q, p2, 0, 1, 0q, p0, ´2, 1, ´2q in R4 ,
(4) p1, 1, 0, 0q, p1, 0, 1, 0q, p0, 0, 1, 1q and p0, 1, 0, 1q in R4 .

Exercise 2.8. Name three vectors v1 , v2 , v3 P R2 such that:


• v1 , v2 are linearly independent,
• v1 , v3 are linearly independent, and
• v2 , v3 are linearly independent, but
• v1 , v2 , v3 are not linearly independent.

Exercise 2.9. Consider the vector space V “ Rrxsď3 of polynomi-


als of degree at most 3. Decide which of the following subsets of V
is a subspace:
(1) tf | f P V, f p2q “ 1u,

(2) tx ¨ f | f P Rrxsď2 u,
(3) tx ¨ f ` p1 ´ xqg | f, g P Rrxsď2 u,
(4) tf | f P Rrxsď3 , f p0q “ 0u.

Exercise 2.10. Express the following polynomials as linear combi-


nations of x ` 1, x ´ 1 and x2 ´ 1 (in Rrxsď2 q: x2 ` 4x ´ 2, x,
40 ´ x2 .

Exercise 2.11. Is the following sentence correct? “In Rrxsď3 , the


polynomial f pxq “ p1{4qx3 ` 3x ` 1 is a linear combination of the
polynomials x2 , x and 1 since f pxq “ px{4q ¨ x2 ` 3 ¨ x ` 1.”

Exercise 2.12. Express each of the three standard basis vectors


e1 , e2 , e3 as a linear combination of the basis vectors in Example 2.60.
Exercise 2.13. (Solution at p. 205) In the vector space of 2 ˆ 2-matrices,
consider
      ˆ     ˙           ˆ     ˙
        1 1               3 2
  A “           and  B “         .
        2 2               3 5
Is
      ˆ      ˙
        ´1 0
  C “
         2 4
a linear combination of A and B?

Exercise 2.14. In the vector space Mat2ˆ3 of 2 ˆ 3-matrices, we


consider the set
"ˆ ˙ *
x1 x2 x3
T “ | x1 ` x4 ` x6 “ 0, x1 ` x4 ` x3 ` x5 “ 0 .
x4 x5 x6

(1) Decide whether T is a subspace of Mat2ˆ3 .


(2) Find all the vectors (i.e., matrices) in T .
(3) Find some vectors such that T “ Lpv1 , v2 , v3 , v4 q.

Exercise 2.15. (Solution at p. 206) In R4 consider the subset

S “ tpx, y, z, tq | x ` y ` z ` t “ 0u.

(1) Decide whether S is a subspace of R4 .


(2) Find all the vectors in S.
(3) Find some vectors such that S “ Lpv1 , v2 , v3 q.

Exercise 2.16. (Solution at p. 206) Consider the following two sub-


spaces of R4 :

S “ Lpp1, ´1, 0, 1q, p2, 1, ´2, 0q, p0, 0, 1, 1qq

and T , which is the solution set of the system

2x1 ´ x2 ´ 3x4 “ 0
2x1 ` x3 ` x4 “ 0.

Determine S X T .

Exercise 2.17. Consider the following two subspaces of R4 :

W “ Lpp1, 0, 1, 0q, p2, 0, 0, 0q, p0, ´3, ´1, ´1qq

and T given by the solution set of the system

x 1 ´ x2 “ 0
x1 ` x2 ` x3 “ 0.

Determine T X W .

Exercise 2.18. Show that


• R2 “ Lpp1, 1q, p2, ´1qq,
• R2 “ Lpp0, ´2q, p1, 1qq.

Exercise 2.19. Is p1, 5, 0q P R3 a linear combination of v1 “ p1, 1, 0q,


v2 “ p2, 0, 1q and v3 “ p0, 3, ´1q?
(I.e., are there a1 , a2 , a3 P R such that a1 v1 ` a2 v2 ` a3 v3 “
p1, 5, 0q?)

Exercise 2.20. Express the following polynomials in the form f pxq “
a0 ` a1 px ´ 1q ` a2 px ´ 1q2 ` a3 px ´ 1q3 ` a4 px ´ 1q4 :
(1) f pxq “ x4 ,
(2) f pxq “ x3 ,
(3) f pxq “ x3 ´ 3x2 ` 4x ` 2.

Exercise 2.21. Let a, b P R be two distinct numbers. Show that


the polynomials x ´ a and x ´ b are a basis of Rrxsď1 .

Exercise 2.22. In R4 “ tpx, y, z, tq | x, y, z, t P Ru consider the


subspace W1 Ă R4 given by the solutions of the system
y ` t “ 0,
y ` z “ 0.
Also consider the subspace W2 “ Lpp0, 1, ´1, 0qq.
Determine a basis and the dimension of W1 . Describe W1 X W2 .
Exercise 2.23. Let k P R be an arbitrary real number. Consider
the subspace
Wk :“ Lpp1, 0, ´1, 0q, p1, 1, 0, 1q, p1, 2, k, 1qq Ă R4 .
(1) For all k P R, find a basis of Wk and determine dim Wk .
(2) For which k P R is p´1, 1, 1, 1q P Wk ?
Exercise 2.24. Recall that the dimension of the space Mat2ˆ3 of
2 ˆ 3-matrices is 6.
(1) Consider the set W of all matrices
        ˆ           ˙
          a  a`b  b
          0   0   b
    with a, b P R. Confirm that W is a subspace of Mat2ˆ3 . Determine dim W .
(2) Let V be the set of all matrices
        ˆ          ˙
          c  0  ´c
          0  0  ´c
    with c P R. Determine (i.e., determine a basis and the dimension of) V X W .
Exercise 2.25. (Solution at p. 207) Consider the following sub-
spaces of R3 :
W1 :“ Lpp1, 0, 1q, p2, 1, 0qq
W2 :“ Lpp´1, 1, 1q, p0, 3, 0qq.
(1) Determine (i.e., determine a basis and the dimension of) W1 X
W2 .
(2) Determine W1 ` W2 .
Exercise 2.26. Consider the following subspaces of R4 :
W1 :“ Lpp1, 1, 1, 2q, p2, 0, 3, 5qq
W2 :“ Lpp1, 1, 0, 1q, p0, 2, ´2, ´2qq.
As in the previous exercise, determine W1 X W2 and W1 ` W2 .

Exercise 2.27. (Solution at p. 209) Consider the subspace


W “ Lpv1 , v2 , v3 q,   where v1 “ p1, 0, 1, 0q, v2 “ p2, 0, 1, 1q, v3 “ p0, 0, 1, 3q.

(1) Find a basis of W and determine dim W .


(2) Find a vector v P R4 such that
W Ĺ Lpv1 , v2 , v3 , vq.
What is dim Lpv1 , v2 , v3 , vq?
Exercise 2.28. (Solution at p. 210) Consider the vectors in R4 ,
where t P R:
u1 “ p1, 0, ´1, 2q
u2 “ p1, 0, 0, 1q
u3 “ p2, 0, ´1, 3q
u4 “ p4, t, ´2, 6q.
(1) Let Ut “ Lpu1 , u2 , u3 , u4 q be the subspace spanned by these vec-
tors (where the last vector depends on t P R). Find the values
of t such that
dim Ut “ 2.
(2) Consider t “ 1 from now on. Verify dim U1 “ 3 and find a basis
of U1 .
(3) Let W Ă R4 be the subspace given by the equations
x1 ` x2 ` x3 “ 0
x1 ´ 3x4 “ 0.
Determine dim W and dim U1 X W .
Exercise 2.29. Consider the subspace Ut Ă R4 spanned by the
four vectors
v1 “ p1, 0, 0, 1q
v2 “ p´1, 1, 2, 3q
v3 “ p0, 1, 2, 4q
v4 “ pt, 2, 4, 8q.
Here, t P R is an arbitrary real number.

(1) Find the values of t, such that dim Ut “ 2.


(2) Consider from now on t “ 1. Determine dim U1 .
(3) Let W Ă R4 be the subspace given by the equations
x1 ´ x 2 “ 0
x2 ´ x3 “ 0.
Determine a basis and the dimension of W and of W X U1 .
Chapter 3

Linear maps

Mathematical objects gain a lot of richness when they can be related


to each other. In linear algebra, the objects of interest are vector
spaces, and the way they relate to each other is by means of linear
maps. The word “map” is being used as a synonym to the word
“function”.

3.1 Definition and first examples


Definition 3.1. Let V, W be two vector spaces. A function f : V Ñ
W is called linear (or a linear map, or a linear transformation) if it
satisfies the following conditions:
f pv ` v 1 q “ f pvq ` f pv 1 q for all v, v 1 P V and (3.2)
f pavq “ af pvq for all a P R, v P V. (3.3)
The vector space V is called the domain of f , W is called the
codomain of f .
Remark 3.4. These two conditions can be squeezed into one con-
dition, by requiring that
f pav ` a1 v 1 q “ af pvq ` a1 f pv 1 q,
for all a, a1 P R and all v, v 1 P V . This can be paraphrased by saying
that f preserves linear combinations.
Using that 0 ¨ v “ 0V (the zero vector in V ), the above condition
implies that
f p0V q “ f p0 ¨ vq “ 0 ¨ f pvq “ 0W .


Thus, for a linear map, the zero vector of V is mapped to the zero
vector in W .

Example 3.5. The map f : R2 Ñ R2 , f px, yq :“ px, ´yq (i.e.,


reflection at the x-axis) is linear. This can be proven very simply
algebraically: for (3.2): if v “ px, yq and v 1 “ px1 , y 1 q P R2 , then

f pv`v 1 q “ f ppx`x1 , y`y 1 qq “ px`x1 , ´y´y 1 q “ px, ´yq`px1 , ´y 1 q “ f pvq`f pv 1 q.

Checking (3.3) is similarly simple. The linearity of the map can also
be visualized geometrically:

[Figure: the vectors v “ px, yq, v 1 and v ` v 1 , together with their images f pvq “ px, ´yq, f pv 1 q and f pv ` v 1 q “ f pvq ` f pv 1 q under the reflection along the x-axis.]

We will soon regard the preceding example as a special case of


the multiplication
ˆ ˙of a vector with a matrix, namely in this case the
1 0
matrix , cf. §3.2.
0 ´1

Example 3.6. The map

D : Rrxs Ñ Rrxs, Dpf q :“ f 1 ,

i.e., the derivative of f , is linear. This is true because we have the


formulae (proven in calculus)

pf ` gq1 pxq “ f 1 pxq ` g 1 pxq, paf q1 pxq “ af 1 pxq.

Alternatively, one may use that the derivative of a polynomial


f pxq “ řdn“0 an xn is given by f 1 pxq “ řdn“1 nan xn´1 . Then, for

g “ řdn“0 bn xn , we check (3.2), say:
˜ ¸1
ÿd
pf ` gq1 pxq “ pan ` bn qxn
n“0
d
ÿ
“ npan ` bn qxn´1
n“1
ÿd d
ÿ
“ nan xn´1 ` nbn xn´1
n“1 n“1
“ f 1 pxq ` g 1 pxq.

Here are a few slightly more abstract examples of linear maps, in


which V is an arbitrary vector space.
Example 3.7. • The identity map id :“ idV : V Ñ V which is
given by idpvq :“ v is linear.
• For some other vector spaces W , the zero map 0 : V Ñ W is
the map sending every vector v to 0W . It is linear.
• For any real number a P R, the map given by scalar multiplica-
tion V Ñ V , v ÞÑ a¨v is linear. This follows from the conditions
(4) and (6) in the definition of a vector space (Definition 2.10).

Non-Example 3.8. • The map f : R Ñ R, f pxq :“ x2 is not


linear. Indeed, f px ` yq “ px ` yq2 “ x2 ` 2xy ` y 2 ‰ x2 ` y 2 “
f pxq ` f pyq. Also f paxq “ a2 x2 ‰ ax2 “ af pxq.
• The map f : R Ñ R, f pxq :“ x ` 1 is not linear since again

f px ` yq “ x ` y ` 1 ‰ px ` 1q ` py ` 1q “ f pxq ` f pyq.

Thus, (3.2) is violated. Also (3.3) is violated: f paxq “ ax`1 ‰


apx ` 1q “ af pxq.

3.2 Multiplication of a matrix with a vector


In this section, we define the multiplication of a matrix with a vector
and show how this gives rise to a linear map. This is an extremely
important way to construct linear maps.

Definition 3.9. Let A “ paij q1ďiďm,1ďjďn (cf. Notation 1.22) be an


¨ ˛
v1
m ˆ n-matrix and v “ ˝ ... ‚ be an n ˆ 1-matrix, i.e., a column vector
vn
with n rows. The product of A with v is the m ˆ 1-vector
¨ ˛
a11 v1 ` a12 v2 ` ¨ ¨ ¨ ` a1n vn
Av :“ ˝ .. ‚.
.
am1 v1 ` am2 v2 ` ¨ ¨ ¨ ` amn vn
Thus, the i-th entry of the (column) vector Av is computed by
traversing the i-th row of A and multiplying each entry of that row
with the corresponding entry of v, and adding up the resulting products.
Example 3.10. Here is a concrete example:
¨ 1 3 ´2 ˛ ¨  1 ˛   ¨ 1 ¨ 1 ` 3 ¨ 2 ` p´2q ¨ p´1q ˛   ¨ 9 ˛
˝ 0 1  0 ‚ ˝  2 ‚ “ ˝ 0 ¨ 1 ` 1 ¨ 2 ` 0 ¨ p´1q    ‚ “ ˝ 2 ‚ .
  1 0 ´1     ´1       1 ¨ 1 ` 0 ¨ 2 ` p´1q ¨ p´1q       2
It makes perfectly good sense to consider matrices whose entries
are variables. Compute:
¨ 1 3 ´2 ˛ ¨ x ˛   ¨ 1 ¨ x ` 3 ¨ y ` p´2q ¨ z ˛   ¨ x ` 3y ´ 2z ˛
˝ 0 1  0 ‚ ˝ y ‚ “ ˝ 0 ¨ x ` 1 ¨ y ` 0 ¨ z    ‚ “ ˝      y      ‚ .
  1 0 ´1     z       1 ¨ x ` 0 ¨ y ` p´1q ¨ z          x ´ z

Thus, the equation (of column vectors consisting of 3 rows)


¨ ˛¨ ˛ ¨ ˛
1 3 ´2 x 3
˝ 0 1 0 ‚˝ y ‚ “ ˝ 4 ‚
1 0 ´1 z ´2
is a very convenient way to write down the linear system
x ` 3y ´ 2z “ 3
y“4
x ´ z “ ´2.
This shows that the product of matrices with column vectors is
very useful in encoding linear systems. We record this observation
in due generality:

Observation 3.11. Let


¨ ˛
a11 ... a1n
A “ ˝ ... ..
.
.. ‚
.
am1 ... amn
be an m ˆ n-matrix and
¨ ˛
x1
x “ ˝ ... ‚
xn
be a column vector with n rows and
¨ ˛
b1
b “ ˝ ... ‚
bm
be a column vector with m rows. Then the equation
Ax “ b
is equivalent to the linear system (in the unknowns x1 , . . . , xn , con-
sisting of m equations)
a11 x1 ` ¨ ¨ ¨ ` a1n xn “ b1
..
.
am1 x1 ` ¨ ¨ ¨ ` amn xn “ bm .
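As an aside, the defining formula for Av can be spelled out directly as code. A small sketch in Python (an illustration only), checked against Example 3.10:

    def mat_vec(A, v):
        # (Av)_i = a_i1 * v_1 + ... + a_in * v_n
        m, n = len(A), len(v)
        return [sum(A[i][j] * v[j] for j in range(n)) for i in range(m)]

    A = [[1, 3, -2],
         [0, 1,  0],
         [1, 0, -1]]
    v = [1, 2, -1]
    print(mat_vec(A, v))   # [9, 2, 2]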

3.2.1 The case of 2 ˆ 2-matrices


The process of multiplying a matrix with a column vector is also ge-
ometrically very important. We now investigate this in more detail
in the case where
ˆ ˙
a11 a12
A“ P Mat2ˆ2 .
a21 a22
ˆ ˙
v1
For a column vector v “ the product is, according to Defi-
v2
nition 3.9, ˆ ˙
a11 v1 ` a12 v2
Av “ . (3.12)
a21 v1 ` a22 v2

In keeping with traditional notation from geometry, we will instead
write the vector v as px, yq (a column vector), in which case
ˆ ˙
a11 x ` a12 y
Av “ .
a21 x ` a22 y

It is useful to organize this situation into a function, namely the


function that sends the vector v to the vector Av. We obtain a
function

f : R2 Ñ R2 , v ÞÑ Av (read “v maps to Av”.)

Of course, since Av depends on the entries of A, so does this function


f.

Reflections
ˆ ˙
1 0
Example 3.13. We consider A “ . According to the
0 ´1
above we have ˆ ˙
x
Av “ .
´y
We plot a few points v and the corresponding Av:

[Figure: the point v “ p3, ´4q and its image Av “ p3, 4q, together with a point w and its image Aw, under the reflection along the x-axis.]

Thus, geometrically, Av is the point v reflected along the x-axis.



Rescalings
Example 3.14. The matrix
      ˆ        ˙
        1{2  0
  A “
         0   1
describes the map that compresses everything in the x-direction by the
factor 1{2, and leaves the y-direction untouched.

Example 3.15. If r, s are two real numbers,


ˆ ˙
r 0
A“
0 s

rescales the x-direction by a factor r (so it shrinks for r ă 1 and


enlarges for r ą 1q and rescales the y-direction by a factor s.
For
      ˆ        ˙
        1{2  0
  A “          ,
         0   2
this looks as follows:

[Figure: the point v “ p3, ´2q and its image Av “ p3{2, ´4q, together with a point w and its image Aw, under this rescaling.]

Shearing

Example 3.16. For a fixed real number r, the matrix


ˆ ˙
1 r
A“
0 1
ˆ ˙
x ` ry
sends v to Av “ . Thus it is a shearing operation. In
y ˆ ˙
1 2
the following picture A “ .
0 1
[Figure: the point v “ p3, ´4q and its image Av “ p´5, ´4q, together with a point w and its image Aw, under this shearing.]

Rotations

We now consider rotations.ˆ ˙ ˆ ˙


0 ´1 ´y
Example 3.17. For A “ , the vector Av “ .
1 0 x
Geometrically, the function v ÞÑ Av is a counterclockwise rotation
by 90˝ .
For the matrix A with rows p´1, 0q and p0, ´1q, the vector Av “ p´x, ´yq,
so the function v ÞÑ Av describes a counterclockwise rotation by 180˝ (or,
what is the same, a clockwise rotation by 180˝ ).
For more general rotations, we use basic properties of the trigno-
metric functions, e.g., as recalled in §B.
Example 3.18. In general, for any r P R the matrix
ˆ ˙
cos r ´ sin r
A“
sin r cos r
is such that the function
ˆ ˙
cos rx ´ sin ry
v ÞÑ Av “
sin rx ` cos ry

is a (counter-clockwise) rotation by r. For this reason, A is called a


rotation matrix .
ˆ ˙
0 ´1
In the following illustration, A “ .
1 0

[Figure: the point v “ p3, ´4q and its image Av “ p4, 3q, together with a point w and its image Aw, under the rotation by 90˝ .]
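As a quick numerical aside (a sketch only, using the Python library NumPy; angles are measured in radians), one can check the effect of a rotation matrix on a concrete vector:

    import numpy as np

    def rotation_matrix(r):
        return np.array([[np.cos(r), -np.sin(r)],
                         [np.sin(r),  np.cos(r)]])

    A = rotation_matrix(np.pi / 2)           # counterclockwise rotation by 90 degrees
    print(np.round(A @ np.array([3, -4])))   # [4. 3.], as in the illustration above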
We regard a column vector v with entries v1 , . . . , vn as an element of Rn .
(Thus, instead of using the notation pv1 , . . . , vn q for an ordered tuple, as in
Definition 2.1, we now write the n numbers underneath each other.) Fix an
m ˆ n-matrix A. Then the product Av, which is a column vector
with m entries, is an element in Rm . We now regard this matrix A
as fixed, and consider the vector v as a variable. In other words, we
consider the function (or map)

Rn Ñ Rm , v ÞÑ Av.

Matrix multiplication has the following basic, but crucial prop-


erty.

Proposition 3.19. For any m ˆ n-matrix A, the above map is lin-


ear.

Proof. We prove this in the case m “ n “ 2 using (3.12). (The case


of general m and n is just notationally more involved, but otherwise
the same.) Let v “ pv1 , v2 q and v 1 “ pv11 , v21 q, written as columns. Then

Av ` Av 1 “ pa11 v1 ` a12 v2 , a21 v1 ` a22 v2 q ` pa11 v11 ` a12 v21 , a21 v11 ` a22 v21 q
          “ pa11 pv1 ` v11 q ` a12 pv2 ` v21 q, a21 pv1 ` v11 q ` a22 pv2 ` v21 qq
          “ Apv1 ` v11 , v2 ` v21 q
          “ Apv ` v 1 q.
Likewise, one checks (3.3), i.e., that for a P R,
ˆ ˙
av1
Apavq “ A
av2
ˆ ˙
a11 av1 ` a12 av2

a21 av1 ` a22 av2
ˆ ˙
a11 v1 ` a12 v2
“a
a21 v1 ` a22 v2
“ aAv
“ apAvq.

3.3 Outlook: current research


Since matrix multiplication is such a key asset, it is of great interest
to perform this process as efficiently as possible. Given two 2 ˆ 2-
matrices A and B, the computation of AB by just following the
definition takes 8 multiplications, namely
aie bej
for each of the indices i, j, e being either 1 or 2. In the 1960’s an algo-
rithm (https://en.wikipedia.org/wiki/Strassen_algorithm) was
found that only requires 7 multiplications. By applying that algo-
rithm iteratively for larger matrices, this gives a decidedly better al-
gorithm. Current research is using methods of artificial intelligence
to try and come up with similar methods for 3 ˆ 3- and other matri-
ces. Check out this interesting lay-accessible article on recent trends:
https://www.quantamagazine.org/ai-reveals-new-possibilities-in-matrix-mul
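To make the count of multiplications concrete, here is a sketch in Python of the classical Strassen formulas for 2 ˆ 2-matrices (the variable names are ours). Each Mi uses exactly one multiplication, and the four entries of the product are then assembled using only additions and subtractions:

    def strassen_2x2(A, B):
        (a11, a12), (a21, a22) = A
        (b11, b12), (b21, b22) = B
        # Seven multiplications instead of eight:
        M1 = (a11 + a22) * (b11 + b22)
        M2 = (a21 + a22) * b11
        M3 = a11 * (b12 - b22)
        M4 = a22 * (b21 - b11)
        M5 = (a11 + a12) * b22
        M6 = (a21 - a11) * (b11 + b12)
        M7 = (a12 - a22) * (b21 + b22)
        return [[M1 + M4 - M5 + M7, M3 + M5],
                [M2 + M4,           M1 - M2 + M3 + M6]]

    print(strassen_2x2([[1, 2], [3, 4]], [[-1, 0], [6, -2]]))   # [[11, -4], [21, -8]]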

3.4 Kernel and image of a linear map


The kernel and the image of a linear map are an important measure
of how, roughly speaking, interesting this map is. E.g., the zero map
R2 Ñ R2 , px, yq ÞÑ p0, 0q is certainly very boring in the sense
that it only produces the zero vector in R2 . By contrast, say, a
rotation (by a fixed angle r) in R2 is more interesting, since any
point in R2 can be obtained from another point by rotating by that
angle r.
In order to introduce kernel and image, we need the following
general notions related to maps between sets.
Definition 3.20. Let f : X Ñ Y be a function between two sets.
• The preimage of some element y P Y is
f ´1 pyq :“ tx P X | f pxq “ yu pĂ Xq.

• The image of f is defined as


im pf q :“ f pXq :“ tf pxq | x P Xu pĂ Y q.

• f is called injective (or one-to-one) if for each y, the preimage


f ´1 pyq contains at most one element.
• f is called surjective (or onto) if for each y, f ´1 pyq contains at
least one element. Equivalently, f is surjective if im pf q “ Y .
• f is called bijective if it is both injective and surjective. In other
words, if for each y P Y , f ´1 pyq contains exactly one element.
Example 3.21. While in the applications below, we will often con-
sider X and Y to be vector spaces, Definition 3.20 applies to maps
between arbitrary sets. For example, consider a group of n people
tP1 , . . . , Pn u. Consider the function
m : tP1 , . . . , Pn u Ñ t1, 2, . . . , 12u
that assigns to each person their month of birth. This function
is surjective if for each month, one of the persons is born in that
month. It is injective, if in each month only one birthday party is
happening. It is bijective if both conditions are true, i.e., in every
month there is exactly one birthday party (for one of the persons).

In the example above, the map m can only be bijective if n “ 12,


i.e., if the size of the two sets is the same. For linear maps (between
vector spaces) we want to articulate a similar idea, but simply saying
that the size of the vector spaces are the same is insufficient, since
R, R2 , R3 etc. all have infinitely many elements. Rather, we will see
in Corollary 3.28 that the dimension of a vector space is the correct
notion of size.
Definition 3.22. Let f : V Ñ W be a linear map. The kernel of
f is defined as
kerpf q :“ f ´1 p0W q
“ tv P V | f pvq “ 0W u.
Note that kerpf q Ă V and im pf q Ă W . In fact, these are not
just arbitrary subsets:
Proposition 3.23. For a linear map f : V Ñ W , ker f is a subspace
of V , while im f is a subspace of W .
Proof. We only check the conditions in Definition 2.17 for the kernel.
(The case of the image is similar.)
• 0V P ker f : this means that f p0V q “ 0W , which holds by Re-
mark 3.4.
• For v, v 1 P ker f we check v ` v 1 P ker f : this means f pv ` v 1 q “
0W . Indeed, using that f is linear we have
f pv ` v 1 q “ f pvq ` f pv 1 q “ 0W ` 0W “ 0W .

• For v P ker f and a P R, we check av P ker f : as before, using


the linearity of f , we have
f pavq “ af pvq “ a ¨ 0W “ 0W .

Example 3.24. We consider the matrix


ˆ ˙
1 2 0
A“
2 4 0
and the associated linear map
f : R3 Ñ R2 ,   v “ px, y, zq ÞÑ Av “ px ` 2y, 2x ` 4yq.

The kernel of f consists of vectors v such that


x ` 2y “ 0
2x ` 4y “ 0.
This tells us that the kernel of f , or equivalently the solutions of
this system (in the unknowns x, y and z!), is
$¨ ˛ ,
& ´2y .
ker f “ ˝ y ‚ P R3 | y, z P R .
% z -
¨ ˛ ¨ ˛
´2 0
A basis of ker f is given by the two vectors ˝ 1 ‚ and ˝ 0 ‚.
0 1
The image of f consists of all vectors of the form
ˆ ˙ ˆ ˙
v1 x ` 2y
v“ “ ,
v2 2x ` 4y

with arbitrary x, y P R. (Also, the z is arbitrary, but it does not


show up in f .) This means that v2 “ 2v1 , and v1 is an arbitrary real
number. Thus
"ˆ ˙ *
v1
im f “ | v1 P R Ă R2 .
2v1
ˆ ˙
1
A basis of im f is thus given by the vector .
2
Our goal below is to develop an algorithmic method that determines
mine bases of ker f , im f . For now, just observe that in the example
above
dimpker f q ` dimpim f q “ 2 ` 1 “ 3 “ dim R3 .
This is an example of the rank-nullity theorem (Theorem 3.26) be-
low.
Injectivity of linear maps can be measured in terms of the kernel:
Lemma 3.25. Let f : V Ñ W be a linear map. Then the following
are equivalent (i.e., one condition holds if and only if the other
holds):

(1) f is injective,
(2) ker f “ t0V u.
Proof. Suppose f is injective, we prove ker f “ t0u. Since f p0q “ 0
by linearity (Remark 3.4), we have 0 P ker f . If v P ker f , then
f pvq “ 0W , so both v and 0V are in the preimage of 0W . By the
injectivity of f , this forces v “ 0.
Conversely, suppose ker f “ 0. Suppose two vectors v, v 1 P V are
in the preimage of some w P W , i.e., f pvq “ f pv 1 q “ w. Then, by
linearity of f
f pv ´ v 1 q “ f pv ` p´1qv 1 q “ f pvq ` p´1qf pv 1 q “ f pvq ´ f pv 1 q “ 0.
Thus, v ´ v 1 P ker f , which means by assumption that v ´ v 1 “ 0.
That is: v “ v 1 . Therefore f is injective.
Theorem 3.26. (Rank-nullity theorem) Let f : V Ñ W be a map
between (finite-dimensional) vector spaces. Then
dimpker f q ` dimpim f q “ dim V.
The rank of f is defined to be
rk f :“ dimpim f q,
while the nullity of f is defined to be dimpker f q.
A proof of this theorem appears in any linear algebra textbook,
e.g. [Nic95, Theorem 7.2.4]. As a remark on the proof, we note
that one can prove the following fact, which is very useful in its own
right.
Theorem 3.27. Let f : V Ñ W be a linear map. Let
v1 , . . . , vr , vr`1 , . . . vn
be a basis of V such that
v1 , . . . , vr
is a basis of ker f . Then f pvr`1 q, . . . , f pvn q is a basis of im f .
The following facts are immediate consequences of the rank-
nullity theorem.

Corollary 3.28. Let f : V Ñ W be a linear map between finite-


dimensional vector spaces.
(1) If f is injective then dim V ď dim W (since then ker f “ t0u,
i.e., dim ker f “ 0).
(2) If f is surjective then dim V ě dim W (since them im f “ W ,
so dim im f “ dim W ).
(3) If f is bijective then dim V “ dim W .
(4) The preceding three statements can in general not be reversed:
if, say, dim V ď dim W , f need not be injective. For example
the zero map V Ñ W , v ÞÑ 0 is never injective if V ‰ t0u.
(5) Suppose in addition that dim V “ dim W . In this case f is injec-
tive precisely if f is surjective. (If f is injective, then dim im f “
dim V “ dim W , so that im f “ W by Theorem 2.66(3). Sim-
ilarly, if f is surjective, then dim im f “ dim W “ dim V , so
dim ker f “ 0, so that ker f “ t0u.)

An important case of this theorem is where f : Rn Ñ Rm is


the linear map given by multiplication with a fixed m ˆ n-matrix
A. We call the rank of A, resp. the nullity the rank, resp. nullity
of that linear map. The rank is denoted by rk A. These are two
highly important numbers associated to a matrix, so we want to
have a device for computing them. This is based on the following
computation: recall from Example 2.59 the standard basis vectors

e1 “ p1, 0, . . . , 0q, e2 “ p0, 1, 0, . . . 0q, . . . , en “ p0, . . . , 0, 1q P Rn .


¨ ˛
1
˚ 0 ‹
We will in the sequel write them as column vectors, so e1 “ ˚ ˝ ... ‚

0
etc. Then we have
¨ ˛ ¨ ˛
a11 ¨ 0 ` ¨ ¨ ¨ ` a1i ¨ 1 ` ¨ ¨ ¨ ` a1n ¨ 0 a1i
f pei q “ Aei “ ˝ .. ‚ “ ˝ ... ‚.
.
am1 ¨ 0 ` ¨ ¨ ¨ ` ami ¨ 1 ` ¨ ¨ ¨ ` amn ¨ 0 ami
(3.29)
In other words, the product Aei is precisely the i-th column of the
matrix A!

Since any vector v P Rn is a linear combination of the ei , we


have, for appropriate b1 , . . . , bn P R
n
ÿ n
ÿ
f pvq “ f p bi ei q “ bi f pei q.
i“1 i“1

Thus, f pvq is a linear combination of the columns of A. This proves


the following statement:
Proposition 3.30. Let A be an m ˆ n-matrix and f : Rn Ñ Rm
the linear map given by multiplication with A. We write
A “ pc1 c2 . . . cn q,
i.e., the ci pP Rm q is the i-th column of A. Then
im f “ Lpc1 , . . . , cn q.
This subspace of Rm is also called the column space of A.
Definition 3.31. The row space of A is the subspace of Rn spanned
by the rows of the matrix A.
We can compute the rank of A, i.e., the dim im f , as follows:
Proposition 3.32. Let A be an mˆn-matrix. Suppose B is a (pos-
sibly non-reduced) row-echelon matrix obtained from A by means
of elementary row operations (Definition 1.29).
(1) Then the non-zero rows of B form a basis of the row space of A.
(2) If the leading ones of B lie in the columns j1 , . . . , jr , then these
columns of A form a basis of the column space of A.
(3) The rank of A equals the dimension of the column space. It also
equals the dimension of the row space of A.
Proof. The first two statements can be proven by showing that the
row and column space of A do not change when one performs an
elementary row operation to A. We skip this part of the proof (e.g.,
see [Nic95, Lemma 5.4.1] for a proof).
The last statement follows from the first two: by definition,
rk A “ dim im f equals, by Proposition 3.30, the dimension of the
column space. By the second statement, this is equal to the number
of leading ones in B. Since B is a row-echelon matrix, this is also the
number of non-zero rows, i.e., by the first statement, the dimension
of the row space.

Example 3.33. Consider the matrix


¨ ˛
1 2 2 ´1
A“˝ 3 6 5 0 ‚
1 2 1 2
and the linear map
f : R4 Ñ R3 , v ÞÑ Av.
The row space is the subspace of R4 spanned by the vectors p1, 2, 2, ´1q
etc., while the column space is the subspace of R3 spanned by the
vectors p1, 3, 1q (written as columns) etc. We compute a basis of these two spaces as
follows:
¨ ˛ ¨ ˛ ¨ ˛
1 2 2 ´1 1 2 2 ´1 1 2 2 ´1
A ⇝ ˝ 0 0 ´1 3 ‚ ⇝ ˝ 0 0 ´1 3 ‚ ⇝ ˝ 0 0 1 ´3 ‚.
0 0 ´1 3 0 0 0 0 0 0 0 0
Thus, the vectors p1, 2, 2, ´1q and p0, 0, 1, ´3q form a basis of the
row space. In particular, its dimension is two. The column space is
spanned by the first and third column of A, i.e.,
¨ ˛ ¨ ˛
1 2
im f “ Lp 3 ,
˝ ‚ ˝ 5 ‚q.
1 1
Thus
dim im f “ rk f “ rk A “ 2.
According to the rank-nullity theorem (Theorem 3.26),
dim ker f “ dim R4 ´ dim im f “ 4 ´ 2 “ 2,
(i.e., the nullity of f or of A is 2). In order to determine a basis
of ker f , we denote the coordinates in R4 by x1 , . . . , x4 . Then, ac-
cording to Gaussian elimination, the variables x2 and x4 are free
variables, and x3 “ 3x4 from the second row above, and then
x1 “ ´2x2 ´ 2x3 ` x4 “ ´2x2 ´ 5x4 . Thus a basis of ker f is
given by the vectors
¨ ˛ ¨ ˛
´2 ´5
˚ 1 ‹ ˚ 0 ‹
˝ 0 ‚, ˝ 3 ‚.
˚ ‹ ˚ ‹
0 1

This reconfirms that dim ker f “ 2.
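These computations can be reproduced with a computer algebra system. The following sketch in Python (an aside, using the SymPy library) computes the rank, a basis of the kernel and a basis of the column space of the matrix A of Example 3.33:

    from sympy import Matrix

    A = Matrix([[1, 2, 2, -1],
                [3, 6, 5,  0],
                [1, 2, 1,  2]])
    print(A.rank())         # 2 = rk A = dim im f
    print(A.nullspace())    # a basis of ker f, e.g. (-2, 1, 0, 0) and (-5, 0, 3, 1)
    print(A.columnspace())  # a basis of im f, namely the first and third column of A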


Here is another consequence of the rank-nullity theorem.
Theorem 3.34. (stated above in Theorem 2.71) Suppose A, B Ă V
are two subspaces of a vector space. Then
dimpA X Bq ` dimpA ` Bq “ dim A ` dim B.
Proof. The map
f : A ‘ B Ñ V, pa, bq ÞÑ a ´ b
is linear. Since for every b P B also b1 :“ ´b is contained in B, the
image of this map is A ` B. The kernel of f consists of those vectors
pa, bq P A ‘ B such that a ´ b “ 0, i.e., a “ b. This means that
a P A X B. Therefore, the rank nullity theorem and Example 2.64
tell us
dimpAXBq`dimpA`Bq “ dim ker f `dim im f “ dimpA‘Bq “ dim A`dim B.

3.5 Revisiting linear systems


In this section, we apply our findings from above to the problem of
solving a linear system
a11 x1 ` ¨ ¨ ¨ ` a1n xn “ b1
..
.
am1 x1 ` ¨ ¨ ¨ ` amn xn “ bm
Throughout, let A “ paij q be the m ˆ n-matrix formed by the coef-
ficients of that linear system. Recall that the vector
¨ ˛
b1
b :“ ˝ ... ‚
bm
is called the vector of constants. We will also consider the linear
map (Proposition 3.19)
f : Rn Ñ Rm , v ÞÑ Av.

Theorem 3.35. (1) Suppose momentarily that b1 “ ¨ ¨ ¨ “ bm “ 0,


so the above system is homogeneous. In this case the solution
set equals ker f , which in particular is a subspace of Rn .
(2) For arbitrary b, the system above has (at least) one solution precisely if
the vector b lies in the image of f . (Note that the vector b lies in Rm ,
and im f Ă Rm .) If r “ pr1 , . . . , rn q is any such solution, then
the solution set consists precisely of the vectors of the form

r ` ker f :“ tr ` v, where v P ker f u.

Proof. Recall from Observation 3.11 that

f ´1 pbq “ tr P Rn | Ar “ bu

consists precisely of the solutions of the system above.


Therefore, the first statement is clear: ker f “ f ´1 p0q are the
solutions of the homogeneous system. Also, the (non-homogeneous)
system has a solution precisely if f ´1 pbq is non-empty, i.e., if b P
im f . For the last statement: we show both implications:
• if s “ ps1 , . . . , sn q is a solution, then we get

f ps ´ rq “ f psq ´ f prq

since f is linear. Since r is some solution of the system, we have


f prq “ b, and also f psq “ b. This implies v :“ s ´ r P ker f ,
i.e., s “ r ` v.
• Conversely, consider a vector of the form r ` v, with v P ker f .
Then
f pr ` vq “ f prq ` f pvq “ b ` 0 “ b.
Thus r ` v is also a solution of the system.

Remark 3.36. The solution set r ` ker f of a non-homogeneous


system is never a subspace: indeed any subspace contains the zero
vector, but if the zero vector were a solution we would get

b “ A0 “ 0,

contradicting that b ‰ 0.

Instead, the solution set of the system with a non-zero vector b, i.e.,
f ´1 pbq is a translation of ker f , as is illustrated in the following picture:

[Figure: the line ker f “ f ´1 p0q through the origin, and the parallel line f ´1 pbq obtained from it by a translation.]

Example 3.37. Consider the linear system (in the unknowns x, y, z)


x ` 3y ` 5z “ 7
3x ` 9y ` 10z “ 11
2x ` 9y ` 12z “ 10.
The pertinent 3 ˆ 3-matrix built out of the coefficients is
¨ ˛
1 3 5
A“ ˝ 3 9 10 ‚.
2 9 12
¨ ˛
x
As above, we write f : R3 Ñ R3 , v “ ˝ y ‚ ÞÑ Av for the associ-
z
ated linear map.
We compute its rank by bringing it into row-echelon form:
¨ ˛ ¨ ˛ ¨ ˛ ¨ ˛
1 3 5 1 3 5 1 3 5 1 3 5
swap
A ⇝ ˝ 0 0 ´5 ‚ ⇝ ˝ 0 0 1 ‚ ⇝ ˝ 0 0 1 ‚ ⇝ ˝ 0 1 0 ‚.
0 3 2 0 3 2 0 1 0 0 0 1
This matrix has 3 leading ones, hence its rank is 3. Thus, f is
surjective. By the rank-nullity theorem we have
dim ker f “ dim R3 ´ dim im f “ 3 ´ 3 “ 0.
Therefore, f is injective (Lemma 3.25). (Alternatively, we may use
Corollary 3.28(5) directly to see f is injective.) Thus, f is bijec-
tive. This means that for any vector of constants, such as the above
¨ ˛
7
˝ 11 ‚, there is precisely one solution of the linear system. This
10
solution can be determined via Method 1.32, but we will omit this
computation here because we will later develop a more comprehen-
sive method, namely by using the inverse A´1 , to obtain these so-
lutions.
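As an aside, a quick numerical computation (a sketch using the Python library NumPy, not the method via the inverse developed later) confirms that there is exactly one solution:

    import numpy as np

    A = np.array([[1, 3,  5],
                  [3, 9, 10],
                  [2, 9, 12]], dtype=float)
    b = np.array([7, 11, 10], dtype=float)
    print(np.linalg.solve(A, b))   # approximately [5. -2.667 2.], i.e. x = 5, y = -8/3, z = 2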

3.6 Linear maps defined on basis vectors


An arbitrary map
f :V ÑW
encodes a lot of information: one needs to specify f pvq for every
v P V . For linear maps, this simplifies drastically:
Proposition 3.38. Let v1 , . . . , vn be a basis of a vector space V .
Let W be another vector space and w1 , . . . , wn arbitrary vectors
(they need not be linearly independent, or span W etc.) Then there
is a unique linear map f : V Ñ W such that

f pvi q “ wi . (3.39)

Proof. Recall Proposition 2.61: given a basis v1 , . . . , vn of a vec-


tor space, any vector v P V can be uniquely expressed as a linear
combination
n
ÿ
v “ b1 v1 ` ¨ ¨ ¨ ` bn vn “ bi vi ,
i“1

i.e., we can express v in such a form and the real numbers bi are
uniquely determined by v. Moreover, we can think of these numbers
b1 , . . . , bn as the coordinates of v (with respect to our coordinate
system given by the basis). Namely, given another vector v 1 “
ř n 1
i“1 bi vi and some a P R, we have

n
ÿ
v ` v1 “ pbi ` b1i qvi
i“1
ÿn
av “ pabi qvi .
i“1

Now, given v P V , we define


n
ÿ
f pvq :“ bi wi . (3.40)
i“1

In particular, for v “ vi , this satisfies f pvi q “ wi . The map f is


linear; this follows from the preceding discussion.
Conversely, if a linear map f satisfies f pvi q “ wi , for v as above,
it necessarily satisfies
n
ÿ n
ÿ n
ÿ
f pvq “ f p bi vi q “ bi f pvi q “ bi wi .
i“1 i“1 i“1

So, the map defined in (3.40) is the only linear map satisfying
(3.39).
Example 3.41. We consider V “ R3 , with the basis
v1 “ e1 “ p1, 0, 0q, v2 “ e2 “ p0, 1, 0q, v3 “ p0, 1, ´1q.
(Note that e1 , e2 are part of the standard basis of R3 .) According
to Proposition 3.38, there is a unique linear map f : R3 Ñ R3 such
that
f pv1 q “ p2, ´1, 0q, f pv2 q “ p1, ´1, 1q, f pv3 q “ p0, 2, 2q.
We determine f pe3 q, where e3 “ p0, 0, 1q is the third standard basis
vector. We have
e3 “ v2 ´ v3 .
Thus
f pe3 q “ f pv2 ´v3 q “ f pv2 q´f pv3 q “ p1, ´1, 1q´p0, 2, 2q “ p1, ´3, ´1q.
Thus, with respect to the standard basis e1 , e2 , e3 (which is distinct
from the one above!), the matrix of f is given by
¨ ˛
2 1 1
A “ ˝ ´1 ´1 ´3 ‚.
0 1 ´1
That is, f agrees with the map
f : R3 Ñ R3 , v ÞÑ Av.
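As an aside, such a matrix can also be found by computer: writing V for the matrix whose columns are the basis vectors vi and F for the matrix whose columns are the prescribed values f pvi q, the desired matrix A satisfies AV “ F . A sketch in Python (an illustration only; it implicitly uses that V is invertible, a notion taken up in §3.9):

    import numpy as np

    V = np.array([[1, 0,  0],     # columns: v1, v2, v3
                  [0, 1,  1],
                  [0, 0, -1]], dtype=float)
    F = np.array([[ 2,  1, 0],    # columns: f(v1), f(v2), f(v3)
                  [-1, -1, 2],
                  [ 0,  1, 2]], dtype=float)
    # Solve A V = F for A; transposing gives V^T A^T = F^T.
    A = np.linalg.solve(V.T, F.T).T
    print(A)   # rows (2, 1, 1), (-1, -1, -3), (0, 1, -1), as computed above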

3.7 Matrices associated to linear maps


In Proposition 3.19, we associated a linear map Rn Ñ Rm to an
m ˆ n-matrix. In this section, we will reverse this process: we will
begin with a linear map and associate to it a matrix.
Proposition 3.42. Let V, W be two vector spaces with bases v1 , . . . , vn
and w1 , . . . , wm , respectively. Let finally f : V Ñ W be a linear map.
Then there is a unique m ˆ n-matrix A “ paij q, called the matrix
associated to the linear map f with respect to the given bases, such
that
f pvi q “ a1i w1 ` a2i w2 ` ¨ ¨ ¨ ` ami wm ,
i.e., the i-th column of A consists of the coordinates of f pvi q with
respect to the basis w1 , . . . , wm . For a general vector v “ b1 v1 ` ¨ ¨ ¨ ` bn vn , we have
f pvq “ pa11 b1 ` ¨ ¨ ¨ ` a1n bn qw1 ` ¨ ¨ ¨ ` pam1 b1 ` ¨ ¨ ¨ ` amn bn qwm .

Proof. We apply the above fact (Proposition 2.61) to f pvi q P W (and


the basis w1 , . . . , wm ), and see immediately that a unique expression
of f pvi q as claimed exists.
We now compute f pvq:

f pvq “ f pb1 v1 ` ¨ ¨ ¨ ` bn vn q
     “ b1 f pv1 q ` ¨ ¨ ¨ ` bn f pvn q          (since f is linear)
     “ b1 pa11 w1 ` ¨ ¨ ¨ ` am1 wm q ` ¨ ¨ ¨ ` bn pa1n w1 ` ¨ ¨ ¨ ` amn wm q
     “ pa11 b1 ` ¨ ¨ ¨ ` a1n bn qw1 ` ¨ ¨ ¨ ` pam1 b1 ` ¨ ¨ ¨ ` amn bn qwm .

Example 3.43. We continue Example 3.41. The vectors w1 “


f pv1 q “ p2, ´1, 0q, w2 “ f pv2 q “ p1, ´1, 1q and w3 “ f pv3 q “
p0, 2, 2q form a basis of R3 , as one sees by computing the rank of
¨ ˛
2 ´1 0
˝ 1 ´1 1 ‚,
0 2 2

which is three. We can therefore apply Proposition 3.42 to the bases


v1 , v2 , v3 and w1 , w2 , w3 . The matrix is then
¨ ˛
1 0 0
˝ 0 1 0 ‚!
0 0 1

To see this, note for example that the second column says

f pv2 q “ 0w1 ` 1w2 ` 0w3 ,

which is true.
If, by contrast, we consider the standard basis e1 , e2 , e3 of V “ R3
(and still w1 , w2 , w3 in W “ R3 ), then the matrix reads
¨ ˛
1 0 0
˝ 0 1 1 ‚.
0 0 ´1

For example, the third column of this matrix expresses the identity

f pe3 q “ a13 w1 ` a23 w2 ` a33 w3 “ w2 ´ w3 ,

which we computed above.


This in particular shows that the matrix A depends (not only on
f but also on) the choice of the bases of V and W !
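As a sanity check of the convention, the entries aij can be computed by solving linear systems: the i-th column of the associated matrix solves W a “ f pvi q, where W has the basis vectors w1 , . . . , wm of the codomain as columns. A minimal sketch (assuming NumPy) that reproduces the second matrix of Example 3.43:

```python
import numpy as np

# f with respect to the standard bases (the matrix A from Example 3.41).
A_std = np.array([[2, 1, 1],
                  [-1, -1, -3],
                  [0, 1, -1]], dtype=float)

# Basis of the codomain: w1 = f(v1), w2 = f(v2), w3 = f(v3), as columns.
W = np.array([[2, 1, 0],
              [-1, -1, 2],
              [0, 1, 2]], dtype=float)

# The i-th column of the matrix of f w.r.t. the bases (e1, e2, e3) and
# (w1, w2, w3) consists of the coordinates of f(e_i) in the basis w1, w2, w3,
# i.e. it solves W a = f(e_i).  Solving for all three columns at once:
M = np.linalg.solve(W, A_std)
print(np.round(M))   # expected: [[1, 0, 0], [0, 1, 1], [0, 0, -1]]
```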

3.8 Composing linear maps and multiplying matrices
The following lemma, while simple to prove, is of fundamental im-
portance:
Definition and Lemma 3.44. Let f : U Ñ V and g : V Ñ W be
two linear maps between three vector spaces U , V and W . Then
the composition of g and f is the map defined as

g ˝ f : U Ñ W, u ÞÑ gpf puqq.

This map is again linear.



Proof. We check the two conditions in Definition 3.1: for u, u1 P U


and a P R, we have, using the linearity of f and g:
pg ˝ f qpu ` u1 q “ gpf pu ` u1 qq
“ gpf puq ` f pu1 qq
“ gpf puqq ` gpf pu1 qq
“ pg ˝ f qpuq ` pg ˝ f qpu1 q
pg ˝ f qpauq “ gpf pauqq
“ gpaf puqq
“ agpf puqq
“ apg ˝ f qpuq.

Example 3.45. The maps f : R2 Ñ R, px, yq ÞÑ x and g : R Ñ


R3 , x ÞÑ px, 0, xq are both linear. The composition g ˝ f is the map
g ˝ f, px, yq ÞÑ gpf px, yqq “ gpxq “ px, 0, xq.
We may also consider h : R Ñ R2 , x ÞÑ px, xq. Then the com-
posite
h ˝ f, px, yq ÞÑ hpf px, yqq “ hpxq “ px, xq.
The other composite is also defined, it is
f ˝ h : R Ñ R, x ÞÑ f phpxqq “ f px, xq “ x.
(By comparison, the composition f ˝ g is not defined, since g takes
values in R3 , but f is defined on R2 .)
We now relate this composition of abstract maps to something
more concrete, the product of matrices.
Definition 3.46. If A “ paij q is a m ˆ n-matrix and B “ pbij q is
an n ˆ k-matrix, then the product AB (also sometimes denoted by
A ¨ B) is the m ˆ k-matrix whose entry in the i-th řrow and j-th
column is the following (see §A for the sum notation ):
n
ÿ
aie bej “ ai1 b1j ` ai2 b2j ` ¨ ¨ ¨ ` ain bnj .
e“1

In other “words” n
ÿ
AB :“ p aie bej q.
e“1

I.e., one picks the i-th row of A and the j-th column of B; one
traverses these and multiplies the corresponding entries together
one by one and finally adds up these products.
Example 3.47.

\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}
\begin{pmatrix} -1 & 0 \\ 6 & -2 \end{pmatrix}
=
\begin{pmatrix} 1 \cdot (-1) + 2 \cdot 6 & 1 \cdot 0 + 2 \cdot (-2) \\ 3 \cdot (-1) + 4 \cdot 6 & 3 \cdot 0 + 4 \cdot (-2) \end{pmatrix}
=
\begin{pmatrix} 11 & -4 \\ 21 & -8 \end{pmatrix},

\begin{pmatrix} 1 & -1 & 2 \\ 1 & 3 & -2 \end{pmatrix}
\begin{pmatrix} 0 & 1 \\ 1 & 2 \\ 2 & 3 \end{pmatrix}
=
\begin{pmatrix} 3 & 5 \\ -1 & 1 \end{pmatrix},

\begin{pmatrix} 0 & 1 \\ 1 & 2 \\ 2 & 3 \end{pmatrix}
\begin{pmatrix} 1 & -1 & 2 \\ 1 & 3 & -2 \end{pmatrix}
=
\begin{pmatrix} 1 & 3 & -2 \\ 3 & 5 & -2 \\ 5 & 7 & -2 \end{pmatrix},

\begin{pmatrix} 1 & x \\ 0 & 1 \end{pmatrix}
\begin{pmatrix} 1 & y \\ 0 & 1 \end{pmatrix}
=
\begin{pmatrix} 1 & x + y \\ 0 & 1 \end{pmatrix}.

Note that the second product is a 2 ˆ 2-matrix while the product of
the same matrices in the other order is a 3 ˆ 3-matrix!
The product AB is only defined if the number of columns of A
is the same as the number of rows of B. For example,
ˆ ˙ˆ ˙
0 1 1 3 4
2 2 3 5 6
is not defined, i.e., it is a meaningless expression.
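The defining formula translates directly into a short program. The sketch below (assuming NumPy; the helper name matmul is mine) computes the first product of Example 3.47 entry by entry and compares it with the built-in product.

```python
import numpy as np

def matmul(A, B):
    """Product of an m x n and an n x k matrix via the defining formula."""
    m, n = A.shape
    n2, k = B.shape
    assert n == n2, "number of columns of A must equal number of rows of B"
    C = np.zeros((m, k))
    for i in range(m):
        for j in range(k):
            # entry (i, j): traverse row i of A and column j of B
            C[i, j] = sum(A[i, e] * B[e, j] for e in range(n))
    return C

A = np.array([[1, 2], [3, 4]], dtype=float)
B = np.array([[-1, 0], [6, -2]], dtype=float)
print(matmul(A, B))                        # expected: [[11, -4], [21, -8]]
print(np.allclose(matmul(A, B), A @ B))    # True
```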
Remark 3.48. In the case when B is a column vector with n en-
tries, we can regard it as an n ˆ 1-matrix. In this case the product
AB defined in Definition 3.46 is an m ˆ 1-matrix, which agrees with
the column vector AB as defined in Definition 3.9, so the product
considered now is a generalization of that previous construction. In
general, if B is an n ˆ k-matrix, we can write it as
B “ pb1 b2 . . . bn q,

where the b1 , . . . , bn are the columns of B. Then

AB “ pAb1 Ab2 . . . Abn q.

In Proposition 3.19, we associated to an m ˆ n-matrix A a linear


map
f : Rn Ñ Rm , v ÞÑ Av.
Let us also be given an n ˆ l-matrix B, to which we can assign the
linear map
g : Rl Ñ Rn , u ÞÑ Bu.
Proposition 3.49. In the above situation, the composition f ˝ g :
Rl Ñ Rm is the map given by multiplication by the matrix AB, i.e.,
the linear map
u ÞÑ pABqu.

Proof. Let us write C “ AB for the product of A and B. It is an


m ˆ l-matrix. If we write C “ pcij q, we have
n
ÿ
cij “ air brj . (3.50)
r“1

We have to compare two linear maps, Rl Ñ Rm , namely f ˝ g
and u ÞÑ Cu “ pABqu. According to Proposition 3.38, it suffices to
show that these two maps give the same values when we evaluate
them on some basis of Rl , for which we take the standard basis
e1 , . . . , el . As was noted in (3.29), the product Cei is precisely the
i-th column of C. That is,
¨ ˛
c1i m m ÿ n
Cei “ ˝ ... ‚ “ c1i e1 ` ¨ ¨ ¨ ` cmi em “
ÿ ÿ
csi es “ asr bri es .
cmi s“1 s“1 r“1

Similarly,
m
ÿ
f pei q “ Aei “ asi es
s“1

and
n
ÿ
gpei q “ Bei “ bri er .
r“1

Here, as usual, e1 , . . . denotes the standard basis vectors of Rn , Rm


and Rl . We now compute
pf ˝ gqpei q “ f pgpei qq
ÿn
“ f p bri er q
r“1
n
ÿ
“ bri f per q pf is linearq
r“1
ÿn m
ÿ
“ bri asr es
r“1 s“1
ÿn ÿ m
“ bri asr es
r“1 s“1
ÿm ÿ n
“ asr bri es
s“1 r“1
ÿm
“ csi es . by (3.50).
s“1

With similar arguments, one proves the following:


Proposition 3.51. Let f : U Ñ V and g : V Ñ W be two linear
maps, and let u1 , . . . , ul , v1 , . . . , vm and w1 , . . . , wn be bases of these
vector spaces. Finally, let A be the matrix of f with respect to these
bases (of U and V ) and B the matrix of g with respect to these bases
(of V and W ). Then BA is the matrix of g ˝ f with respect to the
bases (of U and W ).

3.8.1 Properties of matrix multiplication


Dependence on the order of factors

A key property of matrix multiplication is that the product of two


matrices depends on the order of the factors.
Warning 3.52. For two n ˆ n-matrices A and B, their product
depends on the order of the two matrices. In other words, in general
AB ‰ BA!
Mark these words! It is a common misconception among linear
algebra-learners to think that AB would (always) be equal to BA.

Example 3.53. Examples are not hard to come by:


\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}
=
\begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix},
\qquad
\begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}
=
\begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix},

so that

\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}
\neq
\begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} !
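A two-line numerical check of this example (NumPy assumed):

```python
import numpy as np

A = np.array([[1, 1], [0, 1]])
B = np.array([[1, 0], [1, 1]])
print(A @ B)   # [[2, 1], [1, 1]]
print(B @ A)   # [[1, 1], [1, 2]]  -- different from A @ B
```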

Remark 3.54. The phenomenon AB ‰ BA may be best under-


stood in the light of composition of (linear) maps: if f : Rn Ñ Rn
and g : Rn Ñ Rn is another linear map, then in general we have

g ˝ f ‰ f ˝ g.

To take a concrete example, consider the linear map f : R2 Ñ R2


given by reflecting along the x-axis, and g : R2 Ñ R2 the linear
map given by rotating counter-clockwise (around the origin) by 90˝ .

[Figure: a vector w together with f pwq (its reflection along the x-axis), gpwq (its rotation by 90˝ ), and the two composites gpf pwqq and f pgpwqq, which are different points.]

Let us conclude this discussion by noting that this issue is not


specific to linear algebra, but is a common phenomenon in daily
life: there is (often) no reason to expect that doing (the same) two
actions in different order give the same result:

• You first do sports, then take a shower.


• You first take a shower, then do sports.
In the first scenario you may feel refreshed, in the second one a little
sweaty...

Further properties of matrix multiplication

Definition 3.55. The identity matrix is the square matrix


¨ ˛
1 0 ... 0
˚ 0 1 ... 0 ‹
id “ ˚
˝ ... ... .. ‹ .
. ‚
0 ... 0 1
I.e., it is a square matrix whose entries on the “north-west – south-
east” diagonal (which is called the main diagonal ) are all 1, and the
remaining entries are zero. If it is important to specify the size, one
also writes idn .
Example 3.56. If n “ 1, then id1 is just the 1 ˆ 1-matrix whose
only entry is 1. For n “ 2, id2 “ \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.
The first two identities in the next lemma assert that the identity
matrix takes the role of the number 1 when it comes to multiplying
matrices.
Lemma 3.57. Matrix multiplication satisfies the following identi-
ties, where A, B and C are matrices (of a size such that the products
and sums below are defined), and r P R:
idA “ A
Aid “ A
ApB ` Cq “ AB ` AC (distributivity)
pA ` BqC “ AC ` BC
pABqC “ ApBCq (associativity)
rpABq “ prAqB “ AprBq (matrix vs. scalar multiplication)
Proof. These identities follow from similar identities for the multi-
plication and addition of real numbers.

To illustrate the principle, we consider the first distributivity law


above. Let A “ paij q be an m ˆ n-matrix and B, C two n ˆ k-
matrices, B “ pbij q and C “ pcij q. Then B ` C “ pbij ` cij q so
that
n
ÿ
ApB ` Cq “ p aie pbej ` cej qq
e“1
ÿn
!
“p aie bej ` aie cej q
e“1
ÿn n
ÿ
“p aie bej q ` p aie cej q
e“1 e“1
“ AB ` AC.
At the equality marked ! we have used the distributivity law for real
numbers, i.e., the identity epf ` gq “ ef ` eg for any e, f, g P R.

Multiplication with elementary matrices

We recast the elementary row operations of matrices (Definition 1.29)


in terms of multiplication with appropriate matrices. Below, we use
the (standard) convention that an “invisible” entry in a matrix is
zero, e.g. id2 “ \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} will be written as \begin{pmatrix} 1 & \\ & 1 \end{pmatrix} etc.
(1) Let A1 be the matrix obtained by interchanging the i-th and the
j-th row. Then
¨ ˛
1
˚ ... ‹
˚ ‹
˚ ‹
˚ 0 1 ‹
1
A “˚
˚
˚ . ..

‹ A.

˚
˚ 1 0 ‹

˚ . .

˝ . ‚
1
loooooooooooooooooooomoooooooooooooooooooon
p1q
Ei,j

(The first matrix is the m ˆ m-matrix obtained from idm by


exchanging the i-th and the j-th row.)

(2) Let A1 be the matrix obtained by multiplying the i-th row with
a real number r. Then
¨ ˛
1
˚ ... ‹
˚ ‹
1
˚ ‹
˚ ‹
1
A “˚ r ‹ A.
˚ ‹
˚
˚ 1 ‹

˚
˝ . ..


1
looooooooooooooooooomooooooooooooooooooon
p2q
Ei,r

(The first matrix is the m ˆ m-matrix obtained from idm by


replacing the pi, iq-entry by r.)
(3) Let A1 be the matrix obtained by adding r times the j-th row
to the i-th row. Then
¨ ˛
1
˚ ... ‹
˚ ‹
˚ ‹
˚ 1 ‹
1
A “˚
˚
˚ . ..

‹ A.

˚
˚ r 1 ‹

˚
˝ ... ‹

1
loooooooooooooooooooomoooooooooooooooooooon
p3q
Ei,j,r

(The first matrix is the m ˆ m-matrix obtained from idm by


replacing the pi, jq-entry by r.)
Definition 3.59. The matrices E^{(1)}_{i,j} , E^{(2)}_{i,r} and E^{(3)}_{i,j,r} (for any appropriate
i, j and any r P R, where r ‰ 0 in E^{(2)}_{i,r} ) appearing in the
statement above are called elementary matrices.

Proof. This is more cumbersome to write down precisely than to
convince oneself of by unwinding the definition. We check the third
statement. If B “ pbij q is the above matrix as stated, we have that
bss “ 1 for every s, that bij “ r, and that all other entries are zero. Let us write

C “ BA, C “ pcij q. Then, by definition,


m
ÿ
cst “ bse aet .
e“1

We compute this sum:


• if s ‰ i, then the only bse that is non-zero is bss “ 1, so that

cst “ bss ast “ ast .

• For s “ i, the only coefficients bse that are non-zero are bss “ 1
and bsj “ r. Thus, the sum above consists of two terms, and
therefore
cst “ bss ast ` bsj ajt “ ast ` rajt .

Thus C equals the matrix A1 described in the statement above.
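The three types of elementary matrices are easy to build and test numerically. In the sketch below (NumPy assumed; the helper names are mine), each one is obtained from the identity matrix exactly as described above, and left multiplication is checked against the corresponding row operation.

```python
import numpy as np

def E_swap(m, i, j):      # E^(1)_{i,j}: id_m with rows i and j interchanged
    E = np.eye(m); E[[i, j]] = E[[j, i]]; return E

def E_scale(m, i, r):     # E^(2)_{i,r}: id_m with the (i, i)-entry replaced by r (r != 0)
    E = np.eye(m); E[i, i] = r; return E

def E_add(m, i, j, r):    # E^(3)_{i,j,r}: id_m with the (i, j)-entry replaced by r
    E = np.eye(m); E[i, j] = r; return E

A = np.array([[1., 2., 3.],
              [4., 5., 6.]])

A1 = A.copy()
A1[0] += 2 * A1[1]        # the row operation "add 2 times row 1 to row 0", done by hand
print(np.allclose(E_add(2, 0, 1, 2) @ A, A1))   # True (indices here are 0-based)
```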

3.9 Inverses
Given a linear map f : V Ñ W it is a natural question whether the
process of applying f can be undone. For example, if f encodes a
counter-clockwise rotation in the plane by 60˝ , it can be undone by
rotating clockwise by 60˝ . On the other hand, the linear map

R2 Ñ R2 , px, yq ÞÑ px, 0q

cannot be undone, since there is no way of recovering px, yq only


from x.
Definition and Lemma 3.60. Let f : V Ñ W be a linear map.
Then the following statements are equivalent (i.e., one holds pre-
cisely if the other holds):
(1) f is bijective (Definition 3.20),
(2) There is a linear map g : W Ñ V such that

g ˝ f “ idV and f ˝ g “ idW .

(By definition of the composition (see also §A) this means gpf pvqq “
v for all v P V and f ˝ g “ idW (i.e., f pgpwqq “ w for all w P W .)

If this is the case, we call f an isomorphism. In this event, the


following statements hold:

• Such a map g is unique. It is also called the inverse of f and


is denoted by f ´1 : W Ñ V .

• dim V “ dim W .

Proof. We only prove the direction (1) ñ (2). By assumption f is


bijective, i.e., for every w P W the preimage f ´1 pwq consists of precisely one element,
say f ´1 pwq “ tvu. (That is, only for that vector do we have that
f pvq “ w.) We define a map g : W Ñ V by gpwq :“ v.
To compute gpf pvqq we observe that f ´1 pf pvqq “ tvu, since v is
the only element of V that is mapped to f pvq. Thus gpf pvqq “ v.
To compute f pgpwqq, say that f ´1 pwq “ tvu. This means in
particular that f pvq “ w. Then gpwq “ v and therefore f pgpwqq “
w.
We show that g is linear. Let w, w1 P W be given. Let v, v 1 P V be
the unique elements such that f pvq “ w, f pv 1 q “ w1 . By definition,
this means gpwq “ v, gpw1 q “ v 1 . Then w ` w1 “ f pv ` v 1 q, since f
is linear. Thus gpw ` w1 q “ v ` v 1 “ gpwq ` gpw1 q. In a similar way,
one shows gpawq “ agpwq for a P R.
If g 1 : W Ñ V is another map with f pg 1 pwqq “ w, as above, then

f pg 1 pwqq “ f pgpwqq.

Since f is injective, this implies g 1 pwq “ gpwq. This shows that g is


unique.
The last statement holds by Corollary 3.28(3).

3.9.1 Definition and unicity of inverses

Definition 3.61. Let A be an n ˆ n-matrix. Another n ˆ n-matrix


B is called an inverse of A if

AB “ id and BA “ id.

If such a matrix B exists, A is called invertible.


Example 3.62. A = \begin{pmatrix} 0 & 1 \\ 1 & 1 \end{pmatrix} is invertible, since B = \begin{pmatrix} -1 & 1 \\ 1 & 0 \end{pmatrix} is an inverse of A:

AB = \begin{pmatrix} 0 & 1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} -1 & 1 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix},
\qquad
BA = \begin{pmatrix} -1 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.

Not every matrix has an inverse. An 1 ˆ 1-matrix A, which is


just a single real number a, is invertible precisely if a ‰ 0. In this
case the 1 ˆ 1-matrix with entry 1{a is an inverse. For larger matrices,
it is not enough to be different from zero in order to be invertible,
as the following example shows.
Example 3.63. The matrix
ˆ ˙
1 2
A“
2 4
is not invertible. We prove this by taking an arbitrary 2 ˆ 2-matrix
B = \begin{pmatrix} a & b \\ c & d \end{pmatrix} and computing
ˆ ˙
a ` 2c b ` 2d
AB “
2a ` 4c 2b ` 4d
ˆ ˙
1 0
Thus the condition AB “ id “ amounts to four equations:
0 1
a ` 2c “ 1
b ` 2d “ 0
2a ` 4c “ 0
2b ` 4d “ 1.
Indeed, multiplying the first equation by 2 gives 2a ` 4c “ 2, and
inserting the third equation gives a contradiction:
0 “ 2a ` 4c “ 2.
Hence there is no such matrix B, so that A is not invertible. We
can observe that both the two rows of A are linearly dependent,

and also that the two columns of A are linearly dependent. We will
later prove that either of these two conditions is equivalent to A
not being invertible (Corollary 3.86).
Example 3.64. We revisit the reflection, rescaling, rotation and
shearing matrices (Example 3.13 onwards) and compute their in-
verses:

• Reflection at the x-axis: A = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, with inverse A^{-1} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} (the reflection undoes itself).

• Reflection at the y-axis: A = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}, with inverse A^{-1} = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}.

• Rescaling: A = \begin{pmatrix} r & 0 \\ 0 & s \end{pmatrix}, with inverse A^{-1} = \begin{pmatrix} r^{-1} & 0 \\ 0 & s^{-1} \end{pmatrix} if r, s \neq 0; if r = 0 or s = 0, the matrix is not invertible.

• Rotation by the angle r: A = \begin{pmatrix} \cos r & -\sin r \\ \sin r & \cos r \end{pmatrix}, with inverse A^{-1} = \begin{pmatrix} \cos(-r) & -\sin(-r) \\ \sin(-r) & \cos(-r) \end{pmatrix} = \begin{pmatrix} \cos r & \sin r \\ -\sin r & \cos r \end{pmatrix}, i.e., the rotation by -r.

• Shearing: A = \begin{pmatrix} 1 & r \\ 0 & 1 \end{pmatrix}, with inverse A^{-1} = \begin{pmatrix} 1 & -r \\ 0 & 1 \end{pmatrix} (for any r P R).
Lemma 3.65. Let A be an invertible matrix. Then there is pre-
cisely one inverse matrix, i.e., if B and C are two inverses (which
means AB “ BA “ id and AC “ CA “ id), then B “ C. One
therefore speaks of the inverse (as opposed to an inverse), and writes
A´1 for the inverse.
Proof. Using the associativity of matrix multiplication (marked !),
we get the following chain of equalities
!
B “ Bid “ BpACq “ pBAqC “ idC “ C.
Thus B “ C as claimed.

3.9.2 Linear systems associated to invertible matrices


Inverses of matrices are useful to solve linear systems. This is the
content of the following theorem:
Theorem 3.66. Let A be an invertible n ˆ n-matrix. We consider
the linear system
Ax “ b,

¨ ˛
x1
where x “ ˝ ... ‚ is a vector consisting of n unknowns and b “
xn
¨ ˛
b1
˝ ... ‚ is a vector. This linear system has a unique solution, which
bn
is given by
x “ A´1 b,
i.e., the product of the inverse of A with the vector b.

Remark 3.67. By Observation 3.11, if A “ paij q, then the equation


Ax “ b is a shorthand for the linear system

a11 x1 ` ¨ ¨ ¨ ` a1n xn “ b1
..
.
am1 x1 ` ¨ ¨ ¨ ` amn xn “ bm .

Proof. We first check that A´1 b is indeed a solution to the equation


Ax “ b:
!
ApA´1 bq “ pAA´1 qb “ idn b “ b.
At the equation marked ! we have used the associativity of matrix
multiplication (where the third matrix is b, which is just a column
vector, i.e., an n ˆ 1-matrix).
We now check that A´1 b is the only solution. Suppose then that
some vector y is a solution to the system, i.e., Ay “ b. We will show
y “ A´1 b by proving

z :“ A´1 b ´ y “ 0.

Again using the properties of matrix multiplication (Lemma 3.57),


we have Az “ ApA´1 b ´ yq “ AA´1 b ´ Ay “ b ´ b “ 0. Multiplying
this with the matrix A´1 , we obtain our claim:

z “ A´1 Az “ A´1 0 “ 0.
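The theorem can be tried out numerically. The sketch below (assuming NumPy) solves a system whose coefficient matrix is the invertible matrix from Example 3.62.

```python
import numpy as np

A = np.array([[0., 1.],
              [1., 1.]])            # the invertible matrix from Example 3.62
b = np.array([2., 5.])

x = np.linalg.inv(A) @ b            # x = A^{-1} b
print(x)                            # [3., 2.]
print(np.allclose(A @ x, b))        # True: x indeed solves Ax = b

# In numerical practice one prefers np.linalg.solve(A, b), which solves the
# system directly without forming the inverse explicitly.
```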

3.9.3 Structural properties of taking the inverse


Below, it is necessary to have a few properties of the operation
“take the inverse of an (invertible) matrix” at our disposal. In the
following theorem, all matrices are square matrices of the same size.
In parentheses, we illustrate what each statement means for 1 ˆ 1-matrices,
as a means to remember them. Recall that a 1 ˆ 1-matrix paq is
invertible if and only if a ‰ 0, and its inverse paq´1 is the matrix
pa´1 q. (Here, as usual, the reciprocal a´1 of a non-zero real number
a is defined as a´1 :“ 1{a.)
Theorem 3.68. The following holds:
(1) id is invertible, with id´1 “ id. (For 1 ˆ 1-matrices: 1{1 “ 1.)

(2) If A is invertible, then A´1 is also invertible:

pA´1 q´1 “ A. (3.69)

(For 1 ˆ 1-matrices: 1{p1{aq “ a.)

(3) If A and B are invertible, then AB is invertible:

pABq´1 “ B ´1 A´1 . (3.70)

(For 1 ˆ 1-matrices: 1{pabq “ p1{bqp1{aq.)

(4) If A1 , . . . , Ak are invertible, then their product A1 A2 . . . Ak is also invertible:

pA1 . . . Ak q´1 “ Ak ´1 . . . A1 ´1 . (3.71)

(For 1 ˆ 1-matrices: 1{pa1 ¨ ¨ ¨ ak q “ p1{ak q ¨ ¨ ¨ p1{a1 q.)

Proof. We prove (3.70) to illustrate the technique. We compute


pABqpB ´1 A´1 q “ ApBB ´1 qA´1 by associativity
´1 ´1
“ AA since B is inverse of B
´1
“ id since A is inverse of A.
Using the same arguments, one checks pB ´1 A´1 qpABq “ id. Thus,
by Definition 3.61, B ´1 A´1 is the inverse of AB.
Remark 3.72. Among the above formulas, (3.70) is the most note-
worthy one: note that the order of A and B has been changed! (Re-
call from Warning 3.52 that the order of multiplication is important.

Only for 1 ˆ 1-matrices, i.e., real numbers, the order of multiplication
is irrelevant, so that 1{pabq “ p1{bqp1{aq “ p1{aqp1{bq etc.)

3.9.4 Invertible matrices and elementary operations


Lemma 3.73. Any elementary matrix (Definition 3.59) is invert-
ible:
(E^{(1)}_{i,j})^{-1} = E^{(1)}_{i,j} ,
(E^{(2)}_{i,r})^{-1} = E^{(2)}_{i, r^{-1}} ,
(E^{(3)}_{i,j,r})^{-1} = E^{(3)}_{i,j,-r} .

Proof. To illustrate this, we check this for the last one, where for
simplicity of notation we just treat the case of 2 ˆ 2-matrices. I.e.,
we prove
ˆ ˙´1 ˆ ˙
1 r 1 ´r
“ .
0 1 0 1
To do this, we compute the product
\begin{pmatrix} 1 & r \\ 0 & 1 \end{pmatrix}
\begin{pmatrix} 1 & -r \\ 0 & 1 \end{pmatrix}
=
\begin{pmatrix} 1 & 1 \cdot (-r) + r \cdot 1 \\ 0 & 1 \end{pmatrix}
=
\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.

Similarly (or, actually, symmetrically)

\begin{pmatrix} 1 & -r \\ 0 & 1 \end{pmatrix}
\begin{pmatrix} 1 & r \\ 0 & 1 \end{pmatrix}
=
\begin{pmatrix} 1 & 1 \cdot r + (-r) \cdot 1 \\ 0 & 1 \end{pmatrix}
=
\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.

Lemma 3.74. If A is an m ˆ n-matrix and B is obtained from A


by means of elementary row operations:

A ⇝ B,

then
B “ UA
for an invertible m ˆ m-matrix U . (In particular, if A ⇝ id, then
id “ U A.)

Proof. If A ⇝ B in a single step, this is a combination of Lemma 3.73


and Proposition 3.58: in this case U is the elementary, and in par-
ticular invertible, matrix corresponding to the elementary operation
that has been performed.
In general, say A “: A0 ⇝ A1 ⇝ A2 ⇝ ¨ ¨ ¨ ⇝ An “ B, then
A1 “ U1 A, A2 “ U2 A1 etc., so that
An “ Un An´1 “ Un Un´1 An´2 “ ¨ ¨ ¨ “ U n Un´1 . . . U1 A,
looooooomooooooon
“:U

where we have used the associativity of matrix multiplication. Being


the product of elementary, and in particular invertible matrices, U
is then also invertible (Theorem 3.68).
We finally prove Theorem 1.18. You can check that there is no
vicious circle!
Corollary 3.75. Let
Ax “ b (3.76)
be a linear system. Apply any sequence of elementary row opera-
tions to A and to b, getting a matrix A1 and a vector b1 . Then the
system
A 1 x “ b1 (3.77)
is equivalent to (3.76), i.e., the solution sets of the two systems are
the same.
Proof. By Lemma 3.74, there is an invertible matrix U such that
A1 “ U A and b1 “ U b. If Ax “ b, then also
A1 x “ pU Aqx “ U pAxq “ U b “ b1 .
Conversely, if A1 x “ b1 , then (crucially using that U is invertible)
Ax “ U ´1 U Ax “ U ´1 A1 x “ U ´1 b1 “ U ´1 U b “ b.

3.9.5 Invertibility criteria


We can now establish a criterion that determines whether a given
matrix A is invertible (and that computes the inverse in case it is).
This can then be used in practice to apply Theorem 3.66.
Recall that three statements “X”, “Y”, “Z” are equivalent if any
of them implies the others. For example the statements (where r is
a real number)

• r`1ě1
• rě0
• r ´ 4 ě ´4
are equivalent. By contrast, the three statements
• r`1ě1
• rě0
• r2 ě 0
are not equivalent, since the third does not imply, say, the second:
for r “ ´1, the third statement holds, but the second does not. A
convenient way to show that three statements are equivalent is to
show “X” ñ “Y”, then “Y” ñ “Z”, and then “Z” ñ “X”. Of course,
this also works similarly for more than three statements.
Theorem 3.78. The following conditions on a square matrix A P
Matnˆn are equivalent:
(1) A is invertible.
(2) For any b P Rn (regarded as a column vector with n rows), the
equation Ax “ b (for x P Rn being a column vector consisting
of n unknowns x1 , . . . , xn ) has exactly one solution.
(3) For any b P Rn , the equation Ax “ b has at most one solution.
(4) The system Ax “ 0 (0 being the zero vector consisting of n
zeros) has only the trivial solution x “ 0 (cf. Remark 1.14).
(5) Using the Gaussian algorithm (Method 1.30), A can be trans-
formed to the identity matrix idn .
(6) A is a product of (appropriate) elementary matrices.
(7) There is a matrix B P Matnˆn such that AB “ id.
If these conditions are satisfied, the inverse of A can be computed
as follows: write the identity n ˆ n-matrix to the right of A (this
gives a n ˆ p2nq-matrix):
¨ ˛
a11 . . . a1n 1 . . . 0
B :“ pA | idn q “ ˝ ... ... ..
.
.. . .
.
.
. .. ‚.
an1 . . . ann 0 . . . 1

(The bar in the middle is just there for visual purposes, it has no
deeper meaning.) Apply Gaussian elimination in order to bring the
matrix B to reduced row echelon form, which according to the above
gives a matrix of the form
pidn | Eq.
Then E “ A´1 , i.e., E is the inverse of A.
Proof. (1) ñ (2): This is just the content of Theorem 3.66.
The implications (2) ñ (3) and (3) ñ (4) are clear.
(4) ñ (5): we can bring A into reduced row-echelon form, say,
A ⇝ R. We need to show that R “ id. If this is not the case, then R
contains a zero row (since R is a square matrix). Method 1.32 then
tells us that the system Rx “ 0 has (at least) one free parameter,
and therefore the system has not only the zero vector as a solution.
The original system Ax “ 0, which by Corollary 3.75 has the same
solutions as Rx “ 0, then also has a non-trivial solution. This
contradicts the assumption (4). Hence R “ id.
(5) ñ (6): by Lemma 3.74, we have U A “ id for U being a
product of elementary matrices, say U “ U1 . . . Un . Then, using
(3.71), we have
A “ U ´1 pU Aq “ U ´1 id “ U ´1 “ Un´1 . . . U1´1 ,
and, since the inverse of an elementary matrix is again elementary (Lemma 3.73), this is also a product of elementary matrices.
(6) ñ (7): if A “ U1 . . . Un for some elementary matrices, then
AUn´1 . . . U1´1 “ U1 . . . Un Un´1 . . . U1´1 “ id.
(7) ñ (1): suppose B is such that AB “ id. We observe that
then the only vector x P Rn such that Bx “ 0 is the zero vector:
x “ idn x “ ABx “ A0 “ 0.
Applying the implication (4) ñ (7) (which was already proved) to
B, we obtain a matrix C such that BC “ id. Therefore
A “ Aidn “ ApBCq “ pABqC “ idC “ C.
This means that BA “ id.
This finishes the proof that all the given statements are equiva-
lent. The statement about the computation of A´1 holds since the
row operations that bring A ⇝ id also bring the augmented matrix
pA | idq to pU A | U idq “ pid | U q; and U “ A´1 because U A “ id and A is invertible.
¨˛
1 0 ´1
Example 3.79. We apply this to A “ ˝ 3 1 ´3 ‚:
1 2 ´2
¨ ˛
1 0 ´1 1 0 0
B “ ˝ 3 1 ´3 0 1 0 ‚.
1 2 ´2 0 0 1
We subtract the first row 3 times from the second row and once from the third, which
gives ¨ ˛
1 0 ´1 1 0 0
˝ 0 1 0 ´3 1 0 ‚.
0 2 ´1 ´1 0 1
We subtract 2 times the second row from the third:
¨ ˛
1 0 ´1 1 0 0
˝ 0 1 0 ´3 1 0 ‚.
0 0 ´1 5 ´2 1
We bring the matrix into row echelon form by multiplying the last
row with ´1, which yields
¨ ˛
1 0 ´1 1 0 0
˝ 0 1 0 ´3 1 0 ‚.
0 0 1 ´5 2 ´1
Finally, to bring it into reduced row-echelon form, we add the third
row to the first, which gives
¨ ˛
1 0 0 ´4 2 ´1
˝ 0 1 0 ´3 1 0 ‚.
0 0 1 ´5 2 ´1
Thus, according to Theorem 3.78, A is indeed invertible, and its
inverse is ¨ ˛
´4 2 ´1
A´1 “ ˝ ´3 1 0 ‚.
´5 2 ´1
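The procedure of Theorem 3.78 (reduce pA | idn q to pidn | A´1 q) is straightforward to implement. A minimal sketch, assuming NumPy and adding a pivot search but no further refinements, applied to the matrix of Example 3.79:

```python
import numpy as np

def invert(A):
    """Gauss-Jordan inversion: reduce (A | id) to (id | A^{-1})."""
    n = A.shape[0]
    M = np.hstack([A.astype(float), np.eye(n)])    # the augmented matrix (A | id)
    for i in range(n):
        # find a row with a nonzero entry in column i and swap it up
        p = i + np.argmax(np.abs(M[i:, i]))
        if np.isclose(M[p, i], 0):
            raise ValueError("matrix is not invertible")
        M[[i, p]] = M[[p, i]]
        M[i] /= M[i, i]                            # make the pivot equal to 1
        for j in range(n):                         # clear the rest of column i
            if j != i:
                M[j] -= M[j, i] * M[i]
    return M[:, n:]                                # the right half is A^{-1}

A = np.array([[1, 0, -1],
              [3, 1, -3],
              [1, 2, -2]])
print(invert(A))   # expected: [[-4, 2, -1], [-3, 1, 0], [-5, 2, -1]]
```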
Corollary 3.80. If A is a square matrix such that for some other
square matrix B we have AB “ id, then we also have BA “ id.
Proof. We use the theorem to see that A is invertible. Then
B “ id B “ pA´1 AqB “ A´1 pABq “ A´1 id “ A´1 ,
and hence BA “ A´1 A “ id.

3.10 Transposition of matrices


Definition 3.81. If A is an m ˆ n-matrix, then the transpose (de-
noted AT ) is the nˆm-matrix obtained by A by reflecting the entries
along the main diagonal. More formally, if A “ paij q, then

AT :“ paji q.
¨ ˛
1 2
Example 3.82. For A “ ˝ 3 4 ‚,
5 6
ˆ ˙
T 1 3 5
A “ .
2 4 6

The transpose of a column


ˆ ˙vector is a row vector and vice versa.
x
For example, for v “ ,
y
` ˘
vT “ x y .

We have the following basic computation rules involving the trans-


pose.
Lemma 3.83. Let A be an m ˆ n-matrix and r P R a real number.
(1) pAT qT “ A, i.e., the transpose of the transpose equals the orig-
inal matrix,
(2) For another matrix B of the same size as A, pA`BqT “ AT `B T .
(3) For an n ˆ k-matrix B, the transpose of the matrix product AB
is the products of the transposes in the opposite order :

pABqT “ B T AT . (3.84)

(4) If a square matrix A is invertible, then AT is also invertible with


inverse
pAT q´1 “ pA´1 qT . (3.85)

Proof. The first two rules are quite immediate to check (and hardly
surprising). The first one can also be seen by noting that doing twice
the reflection of the entries along the main diagonal gives back the
original matrix.

The equation (3.84) also follows directly from the definitions:
let A “ paij q, B “ pbij q, and let us write C “ AB “ pcij q. Then
cij “ \sum_{e=1}^{n} aie bej . Thus the pi, jq-entry of C T is cji “ \sum_{e=1}^{n} aje bei ,
which is exactly the pi, jq-entry of B T AT .
For (3.85), we compute

AT pA´1 qT “ pA´1 AqT by (3.84)


“ idT
“ id.

Similarly,

pA´1 qT AT “ pAA´1 qT by (3.84)


“ idT
“ id.

Thus the product of AT and pA´1 qT (in the two possible orders)
equals id, so they are inverse to each other.
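A quick numerical check of (3.84) and (3.85) (NumPy assumed):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))   # a random matrix like this is invertible with probability 1
B = rng.standard_normal((3, 4))

print(np.allclose((A @ B).T, B.T @ A.T))                    # (AB)^T = B^T A^T
print(np.allclose(np.linalg.inv(A.T), np.linalg.inv(A).T))  # (A^T)^{-1} = (A^{-1})^T
```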

The usage of transposes helps us prove another set of equivalent


characterizations:
Corollary 3.86. Let A P Matnˆn be a square matrix. Then the
following are equivalent:
(1) A is invertible.
(2) The n columns of A are linearly independent.
(3) The rank of A is n.
(4) The n rows of A are linearly independent.

Proof. (1) ô (2): According to Theorem 3.78, A is invertible pre-


cisely if the only solution to the
¨ system
˛ Ax “ 0 is the zero vector
x1
x “ 0. Recalling that for x “ .. ‚ we have
˝ .
xn

Ax “ x1 c1 ` ¨ ¨ ¨ ` xn cn ,

where A “ pc1 . . . cn q are the columns of A, we see that the above


condition is equivalent to the columns being linearly independent.

(2) ô (3): The rank is, by definition, the dimension of the column
space, i.e., the subspace of Rn generated by the columns c1 , . . . , cn .
In order to show that these vectors span Rn , let b P Rn . By the
invertibility of A, we know that the system Ax “ b has a (unique)
solution x. Therefore b “ Ax “ x1 c1 ` ¨ ¨ ¨ ` xn cn , so b lies in the span of
the columns c1 , . . . , cn .
(1) ô (4): A is invertible if and only if the transpose AT is
invertible. Now use that the rows of AT are the columns of A, and
apply the (already proved) equivalence (1) ô (2).

3.11 Exercises
Exercise 3.1. Determine which 2 ˆ 2-matrix A is such that the
function
f : R2 Ñ R2 , v ÞÑ Av
are the following:
• f pvq is the point v reflected along the y-axis,
• f pvq is the same point as v,
• f pvq is the origin p0, 0q,
• f pvq is the point v reflected along the line tpx, xq |x P Ru (i.e.,
the “southwest-northeast diagonal”),
• f pvq is the point v rotated counterclockwise, resp. clockwise by
60˝ ?
ˆ ˙
y
Exercise 3.2. Determine the matrix A such that Av “ .
x
Describe the behaviour of the function v ÞÑ Av geometrically.

Exercise 3.3. Write down the matrix A such that the function f :
R4 Ñ R3 , v ÞÑ Av satisfies

f pp1, 0, 0, 0qq “ p1, 2, 3q, f pp0, 1, 0, 0qq “ p0, 0, 7q,

f pp0, 0, 1, 0qq “ p0, 0, 0q, f pp0, 0, 0, 1qq “ p13, 0, ´1q.


Determine ker f and im f (i.e., determine a basis and their dimen-
sion).

Exercise 3.4. Compute the rank of


¨ ˛
0 1 2 1
˚ 1 1 1 0 ‹
A“˚ ˝ 0 ´1
‹.
1 1 ‚
1 1 4 2

Exercise 3.5. Consider the linear map f : R3 Ñ R3 described


in Example 3.41. Determine the matrix of f with respect to the
standard basis e1 , e2 , e3 (both in the “source” R3 , and also in the
“target” R3 ).
Exercise 3.6. For λ P R consider the subspace of R3 defined as
Wλ “ Lpp1, 1 ` λ, ´1q, p2, λ ´ 2, λ ` 2qq.
For each λ P R, find a basis and the dimension of Wλ .
Exercise 3.7. Determine the rank of
¨ ˛
α 0 0
˝ 0 α 1`α ‚
α 1 2
for each α P R.
Exercise 3.8. For any λ P R solve the system (in the unknowns
x1 , x2 , x3 )
λx1 “ 0
λx2 ` p1 ` λqx3 “ 1
λx1 ` x2 ` 2x3 “ 3.
Exercise 3.9. (Solution at p. 213) Consider the linear map
¨ ˛ ¨ ˛
ˆ ˙ 1 2 ˆ ˙ x1 ` 2x2
x1 x1
f : R2 Ñ R3 , ÞÑ ˝ 0 1 ‚ “˝ x2 ‚.
x2 x2
3 5 3x1 ` 5x2

(1) Determine ker f .


¨ ˛
1
(2) Does the vector ˝ 0 ‚ lie in the image of F ?
3

Exercise 3.10. (Solution at p. 214) Consider the linear map


f : R4 Ñ R3
¨ ˛ ¨ ˛
x1 ¨ ˛ x1 ¨ ˛
˚ x2 ‹ 2 ´1 1 1 ˚ x2 ‹ 2x1 ´ x2 ` x3 ` x4
˝ x3 ‚ Þ
˚ ‹Ñ ˝ 0 5 ´3 ´5 ‚˚ ˝ x3
‹“˝
‚ 5x2 ´ 3x3 ´ 5x4 ‚.
3 ´4 3 4 3x1 ´ 4x2 ` 3x3 ` 4x4
x4 x4
(1) Determine ker f .
¨ ˛
1
(2) Determine f ´1 p˝ ´3 ‚q, i.e., find all the vectors v P R4 such
´3
¨ ˛
1
that f pvq “ ˝ ´3 ‚. Is this subset of R4 a subspace?
´3
Exercise 3.11. (Solution at p. 214) Consider the matrix
¨ ˛
1 3 ´1 2
At “ ˝ 1 5 1 1 ‚.
2 4 t 5
Here t P R is an arbitrary real number.
(1) Determine the rank of At .
¨ ˛
1
(2) Set t “ ´4, u “ ˝ α ‚. Find α P R such that
0
¨ ˛
x1
˚ x2 ‹
A´4 ˚˝ x3 ‚ “ u

x4
has solutions. ¨ ˛
1
(3) Set again t “ ´4, v “ ˝ 4 ‚. Determine the solutions of the
´1
linear system ¨ ˛
x1
˚ x2 ‹
A´4 ˚
˝ x3 ‚ “ v.

x4

(4) Is there any t P R such that the homogeneous system


¨ ˛
x1 ¨ ˛
˚ x2 ‹ 0
At ˝
˚ ‹ “ ˝ 0 ‚
x3 ‚
0
x4
¨ ˛ ¨ ˛
x1 0
˚ x2 ‹ ˚ 0 ‹
has only the trivial solution ˚
˝ x3 ‚ “ ˝ 0 ‚?
‹ ˚ ‹
x4 0

Exercise 3.12. Let f : V Ñ W be a linear map. For a subspace


U Ă V we define the image of U to be
f pU q :“ tf pvq | v P U u.
(For example, for U “ V , this gives back the image of f as defined
in Definition 3.20).
(1) Arguing as in Proposition 3.23, prove that f pU q is a subspace
of W .
(2) Prove that dim f pU q ď dim U .
Exercise 3.13. Let f : V Ñ W be a linear map. For a subspace
U Ă W , we define the preimage of U to be
f ´1 pU q :“ tv P V | f pvq P U u.
(For example, if U “ t0W u is the subspace consisting only of the
zero vector of W , this gives back the kernel: ker f “ f ´1 pt0W uq.)
Arguing as in Proposition 3.23, prove that f ´1 pU q is a subspace
of V .
Exercise 3.14. (Solution at p. 215) Consider the linear map f :
R4 Ñ R3 given by multiplication with the matrix
¨ ˛
2 ´1 ´ 52 1
˚ ´1 0 1 ´ 12 ‹
A“˚ 1 ‚.

˝ 1 1 ´ 21 2
0 2 1 0

Determine ker f , im f and ker f X im f .



Exercise 3.15. Consider the linear map

f : R3 Ñ R2

given by
f px, y, zq “ p2x ´ z, x ` y ` zq.
(1) Determine the matrix of f with respect to the standard basis in
R3 and the standard basis R2 .
(2) Determine ker f and im f .
(3) Determine the preimage f ´1 pp0, 1qq. Write down the linear sys-
tem whose solution set is this preimage. Is it a subspace of R3 ?
(4) Show that the vectors v1 “ p0, 1, 2q, v2 “ p0, ´1, 1q and v3 “
p1, 1, 1q are a basis of R3 . Determine the matrix of f with respect
to this basis of R3 and the standard basis in the codomain R2 .

Exercise 3.16. (Solution at p. 217) Consider the linear map f :


R3 Ñ R3 whose matrix with respect to the standard basis (of both
the domain and the codomain R3 ) is
¨ ˛
2 ´1 0
˝ 1 0 2 ‚.
0 2 ´1

(1) Let v1 “ p1, 1, 0q. Compute v2 “ f pv1 q and v3 “ f pv2 q. Show


that v1 , v2 , v3 form a basis of R3 .
(2) Consider v4 “ f pv3 q and determine a1 , a2 , a3 such that

v4 “ a1 v1 ` a2 v2 ` a3 v3 .

(3) Determine the matrix of f with respect to the basis v1 , v2 , v3


(both of the domain and of the codomain R3 ).

Exercise 3.17. (Solution at p. 218) Consider the linear map

f: R4 Ñ R3
¨ ˛
x ¨ ˛
˚y ‹ ´x ` z
˚ ‹ ÞÑ ˝ ´y ` t ‚
˝z ‚
x´y
t

• Write the matrix associated to f with respect to the standard


basis of the domain R4 and the standard basis of the codomain
R3 .
• Determine ker f and im f .
Exercise 3.18. (Solution at p. 219) Consider the following func-
tions:
f1 : R2 Ñ R3
¨ ˛
ˆ ˙ x2 ` x1
x1
ÞÑ ˝ 3x1 ‚
x2
2x2

f2 : R3 Ñ R2
¨ ˛
x1 ˆ ˙
˝x2 ‚ ÞÑ x 3 ` x 1
3x2 ` 4x1 ` 1
x3

f3 : R3 Ñ R2
¨ ˛
x1 ˆ ˙
˝x2 ‚ ÞÑ x23
x2 ` x1
x3

f4 : R3 Ñ R3
¨ ˛ ¨ ˛
x1 x1
˝x2 ‚ Ñ ˝x2 ‚
x3 x3

f5 : R3 Ñ R3
¨ ˛ ¨ ˛
x1 x1 ` x2
˝x2 ‚ ÞÑ ˝x2 ` x3 ‚
x3 x3 ` x1

(1) Verify which ones are linear.


(2) Determine all the images.

(3) Determine if the described linear functions are injective or/and


surjective.
Exercise 3.19. Consider the vectors in R4
v1 “ p1, 2, ´3, ´1q
v2 “ p3, 4, 4, 1q
v3 “ p1, 0, 10, 3q.
(1) Are v1 , v2 , v3 linearly independent? What is the dimension of
the subspace U of R4 that these vectors span?
(2) Find a basis of R4 that contains at least 2 of these three vectors.
(3) Define a map f : R4 Ñ R2 that satisfies f pU q “ R2 . Can you
define a map g : R4 Ñ R3 that satisfies gpU q “ R3 ?
Example 3.87. Decide whether
¨ ˛
1 3 ´1
A“˝ 2 1 5 ‚
1 ´7 13
is invertible. If A is invertible, determine its inverse.
Exercise 3.20. Decide whether AB “ BA holds for the following
matrices:ˆ ˙ ˆ ˙
1 2 0 1
(1) A “ ,B“
2 1 ´1 1
ˆ ˙ ˆ ˙
3 0 ´1 0
(2) A “ ,B“
0 4 0 2
ˆ ˙ ˆ ˙
1 x y 1
(3) A “ ,B“ for two fixed real numbers x and
0 1 0 y
y
ˆ ˙ ˆ ˙
3 0 b11 b21
(4) A “ ,B“ an arbitrary 2 ˆ 2-matrix
0 3 b12 b22
(5) A an arbitrary matrix, B “ AA (the product of A with itself,
this is also denoted A2 )
Exercise 3.21. Determine in each case whether there is a matrix
A satisfying the following condition. If so, is there a unique such
matrix or can there be several matrices satisfying the condition?
Describe your findings geometrically.
• Ap1, 0q “ p3, 4q and Ap0, 1q “ p4, 5q,

• Ap1, 0q “ p3, 4q and Ap2, 0q “ p6, 8q,

• Ap1, 0q “ p3, 4q and Ap2, 0q “ p3, 4q.
Exercise 3.22. Find two 2 ˆ 2-matrices A, B such that
AB “ 0
(the zero matrix), but A ‰ 0 and B ‰ 0. (Hint: start with A “
B “ 0, and then change very few entries.)
Exercise 3.23. A square matrix A is called symmetric if
A “ AT .
ˆ ˙
1 s
(1) Determine s, t P R such that the matrix is symmet-
´2 t
ric.
(2) Let A be any square matrix. Prove that A ` AT is always sym-
metric. (Hint: Use Lemma 3.83).
Exercise 3.24. The trace of a square matrix A “ paij q is defined
to be the sum of the entries on the main diagonal:
trpAq :“ a11 ` a22 ` ¨ ¨ ¨ ` ann .
Prove the following statements (if you get stuck with the notation,
assume first that A “ \begin{pmatrix} a & b \\ c & d \end{pmatrix} is a 2 ˆ 2-matrix, so that trpAq “ a ` d):
(1) trpAq “ trpAT q,
(2) trpABq “ trpBAq (for another square matrix B of the same
size). This is noteworthy since AB ‰ BA in general!
(3) trpA ` Bq “ trpAq ` trpBq (for another square matrix B of the
same size), trprAq “ rtrpAq (for r P R).
(4) (optional, slightly more challenging) Prove there is no matrix B
such that AB ´ BA “ id.
ˆ ˙
a b
Exercise 3.25. Let A “ be an arbitrary 2 ˆ 2-matrix
c d ˆ ˙
ra rb
and r P R. Recall that the scalar multiple rA “ . Find
rc rd
a 2 ˆ 2-matrix R such that the matrix product RA equals the scalar
multiple:
RA “ rA.

Exercise 3.26. (1) Let


¨ ˛
0 a b
A“˝ 0 0 c ‚
0 0 0

be a so-called upper triangular matrix (of size 3 ˆ 3). Compute


A2 and prove that A3 “ 0.
(2) Make a (sensible) similar statement for nˆn-matrices (cf. Propo-
sition 4.17 for the definition of upper triangular matrices in gen-
eral).

Exercise 3.27. (Solution at p. 220) Consider the identity map

id : R2 Ñ R2 , px, yq ÞÑ px, yq.

Consider the standard basis e1 “ p1, 0q and e2 “ p0, 1q of the do-


main, and the basis comprised of v1 “ p1, ´3q and v2 “ p2, 1q on the
codomain.

• Compute the base change matrix of id with respect to these


bases.
• Use it to compute the coordinates of p2, ´5q in terms of the
basis v1 , v2 .

Exercise 3.28. (Solution at p. 221) Consider the identity map

id : R3 Ñ R3

and the basis v1 “ p1, 0, ´1q, v2 “ p2, 1, 1q, v3 “ p´1, ´1, 7q on the
domain and the standard basis on the codomain. Compute the base
change matrix with respect to these bases.

Exercise 3.29. Find the base change matrix from the standard ba-
sis e1 , e2 , e3 in R3 to the basis v1 “ p1, 1, 2q, v2 “ p1, 1, 3q, v3 “
p7, ´1, 0q.

Exercise 3.30. (Solution at p. 221) Consider the linear map

f : R2 Ñ R3 , v ÞÑ Av,
¨ ˛
2 1
where A “ ˝ 0 1 ‚. Compute the matrix B of f with respect
´3 1
to the basis v “ tv1 “ p1, ´1q, v2 “ p3, ´1qu in R2 , and the basis
t “ tt1 “ p1, 0, 1q, t2 “ p2, 1, 1q, t3 “ p´1, ´1, ´1qu in R3 .
Hint: We may consider the following diagram:
id f id
R2v ÝÑ R2e ÝÑ R3e ÝÑ R3t .
H A K

Here the subscripts at R2 indicate which basis we consider. The


matrices H and K are the base change matrices from the basis v to
the standard basis e, resp. from the standard basis e to the basis t.
Then
B “ K ¨ A ¨ H.

Exercise 3.31. Consider the linear map

f : R3 Ñ R3

which in the standard basis (on both the domain and the codomain)
is given by ¨ ˛
2 0 0
A “ ˝ 1 2 1 ‚.
´1 0 1
Compute the matrix of f with respect to the basis

v1 “ p0, ´1, 1q, v2 “ p0, 1, 0q, v3 “ p´1, 0, 1q.

(on the domain and the codomain).

Exercise 3.32. (Solution at p. 222) Consider the linear map

f : R2 Ñ R2 ,
ˆ ˙
6 ´1
which is given by the matrix A “ with respect to the
2 3
standard basis in the domain and the codomain.
Find its matrix with respect to the basis v1 “ p1, 1q, v2 “ p1, 2q
both in the domain and the codomain.
¨ ˛
0 0 1
Exercise 3.33. (Solution at p. 222) Let A “ ˝ 0 1 0 ‚. De-
1 0 0
¨ ˛
x1
termine the vectors x “ ˝ x2 ‚ such that
x3
Ax “ x.
Exercise
¨ 3.34.
˛ (Solution at p. 223) Find, if possible the vectors
x1
x “ ˝ x2 ‚ such that
x3
¨ ˛
1 0 0
˝ 0 4 2 ‚x “ 5x.
0 2 1
ˆ ˙
3 0
Exercise 3.35. (Solution at p. 224) Consider the matrix A “
8 ´1
2 2
which represents f : R Ñ R with respect to the standard basis.
Find the matrix of f with respect to the basis
v “ tv1 “ p2, 1q, v2 “ p0, 1qu.
Exercise 3.36. For A and f as in Exercise 3.35, consider now the basis
v “ tv1 “ p1, 2q, v2 “ p0, 1qu.
Compute the matrix of f with respect to that basis.
Exercise 3.37. (Solution at p. 224) Let f : R3 Ñ R3 be the map
whose matrix with respect to the standard basis is
¨ ˛
´1 1 0
˝ 0 2 0 ‚.
1 ´1 ´2

Compute the matrix with respect to the basis


v “ tv1 “ p0, 0, 1q, v2 “ p2, 6, ´1q, v3 “ p1, 0, 1qu.
Exercise 3.38. (Solution at p. 224) Let f : R3 Ñ R3 be the linear
map whose matrix with respect to the standard basis is
¨ ˛
1 ´1 0
A “ ˝ 1 2 1 ‚.
2 1 3

(1) Find a basis of the solution space L of the linear system


¨ ˛ ¨ ˛
x1 x1
A ˝ x2 ‚ “ 3 ˝ x2 ‚.
x3 x3

(As a forecast to terminology introduced later, this solution


space is the so-called eigenspace of A for the eigenvalue 3, cf. Def-
inition and Lemma 5.11.)
(2) Complete the basis of L (which is a subspace of R3 ) to a basis
of R3 , and compute the matrix of f with respect to this basis.
Exercise 3.39. (Solution at p. 225) Consider the linear map f :
R3 Ñ R3 whose kernel is Lpp1, 0, 1qq and such that
f p0, 3, ´1q “ p0, 3, ´1q, f p0, 0, 1q “ p0, 0, 2q.
Compute its matrix with respect to the standard basis.
Exercise 3.40. Consider the linear map f : R3 Ñ R3 such that
f p1, 1, 1q “ 3 ¨ p1, 1, 1q, f p2, 0, 1q “ p´4, 0, ´2q, f p0, 1, 3q “ p0, 2, 6q.
(1) Show that the vectors
v “ tp1, 1, 1q, p2, 0, 1q, p0, 1, 3qu
form a basis of R3 . (Note that for each of these three vectors,
one has f pvi q “ λi v, with λ1 “ 3 etc. Therefore, the basis is an
example of a so-called eigenbasis, cf. Definition 5.17.)
(2) Compute the matrix of f with respect to that basis.
(3) Compute the matrix of f with respect to the standard basis.

The following two exercises are concerned with linear systems
of the form
Ax “ λx,
where A is a certain square matrix, x is a vector and λ P R a real
number. We will study these systems systematically in §5.
Exercise 3.41. (Solution at p. 227) Find the solutions of the linear
system ¨ ˛¨ ˛ ¨ ˛
1 0 0 x1 x1
˝ 0 4 2 ‚˝ x2 ‚ “ 5 ˝ x2 ‚.
0 2 1 x3 x3

Exercise 3.42. Solve the system


¨ ˛¨ ˛ ¨ ˛
1 1 0 0 a a
˚ 0 1 0 0 ‹ ˚ b ‹ ˚ b ‹
˚ ‹˚ ‹“1¨˚ ‹.
˝ 0 0 1 0 ‚˝ c ‚ ˝ c ‚
0 0 0 1 d d
Chapter 4

Determinants

The determinant of a square matrix A is a number that encodes


important information about A. For example, A is invertible (Def-
inition 3.61) if and only if the determinant of A is nonzero (Theo-
rem 4.13).

4.1 Determinants of 2 ˆ 2-matrices


We begin with the definition of determinants of 2 ˆ 2-matrices.
Definition 4.1. Let ˆ ˙
a b
A“
c d
be a 2 ˆ 2-matrix. The determinant of A is defined as
det A :“ ad ´ bc.
ˆ ˙
4 7
Example 4.2. • det “ 4 ¨ p´1q ´ 7 ¨ 2 “ ´18.
2 ´1
ˆ ˙
1 r
• det “ 1 ¨ 1 ´ r ¨ 0 “ 1. In particular, det id2 “ 1.
0 1
ˆ ˙
a b
• Consider a matrix A “ whose second column is a
ra rb
multiple of the first (so that the columns are linearly depen-
dent). Then
ˆ ˙
a b
det “ arb ´ bra “ 0.
ra rb


According to Corollary 3.86, A is not invertible. This is an


example of the fact alluded to above (cf. Theorem 4.13).

Determinants carry the following geometric meaning. Recall that


the absolute value of a real number r is defined as
"
r rě0
|r| :“
´r r ă 0.

For example, |4| “ 4 and | ´ 5| “ 5.

Lemma 4.3. Let


ˆ ˙
1 x x1
A “ pv v q “
y y1

be a 2 ˆ 2-matrix, where v and v 1 P R2 are the two columns of A.


Then
| detpAq| “ areapv, v 1 q,

where the right hand side denotes the area of the parallelogram
spanned by the two vectors v and v 1 .

Proof. We illustrate this geometrically in the case where all entries


of A are positive and the vectors v and v 1 lie as depicted, i.e., the
angle from v to v 1 goes, informally speaking, counterclockwise. The
area of the black rectangle is px ` x1 qpy ` y 1 q. The area of each of the two
triangles whose long side is v (resp. parallel to it) is xy{2, so the area
triangles (parallel to) v 1 is x1 y 1 . Finally, the area of the rectangle at
the bottom right, resp. top left corner of the large rectangle is x1 y.
Therefore, the area of the parallelogram is

px ` x1 qpy ` y 1 q ´ xy ´ x1 y 1 ´ 2x1 y “ xy ` x1 y ` xy 1 ` x1 y 1 ´ xy ´ x1 y 1 ´ 2x1 y


“ xy 1 ´ x1 y
“ det A.

[Figure: the parallelogram spanned by v “ px, yq and v 1 “ px1 , y 1 q, inscribed in the rectangle with side lengths x ` x1 and y ` y 1 .]

Lemma 4.3 does not give any information about the sign of the
determinant. Regarding that, we observe the following:
ˆ ˙
x1 x2
Lemma 4.4. Let A “ and let
y1 y2
ˆ ˙
1 x2 x1
A :“
y2 y1
ˆ ˙
2 y1 y2
A :“
x1 x2
be the matrices obtained from A by swapping the two columns,
resp. the two rows. Then
det A2 “ det A1 “ ´ det A.
In other words, swapping two rows or two columns will change the
sign of the determinant.
Proof. This is directly clear from the definition. For example,
det A1 “ x2 y1 ´ y2 x1 “ ´px1 y2 ´ x2 y1 q “ ´ det A.
Thus, the determinant (as opposed to only its absolute value)
records the area of the parallelogram spanned by the vectors and
also the orientation.

4.2 Determinants of larger matrices


There are various (equivalent) approaches to defining determinants
of larger matrices. The following one is satisfactory from both a
conceptual and a practical standpoint.

Theorem 4.5. There is a unique function, called the determinant,

det : Matnˆn Ñ R

with the following properties (throughout A P Matnˆn q:


(1) detpidn q “ 1,
(2) If A1 results from A by interchanging two rows, then

detpA1 q “ ´ detpAq. (4.6)


¨ ˛
v1
(3) Let us write a matrix as ˝ ... ‚, i.e., vi P Rn is the i-th row of
vn
the matrix. Then for any w P Rn and any r P R:
¨ ˛ ¨ ˛ ¨ ˛
v1 v1 v1
.. ˚ .. ˚ .. ‹
. ˚ . ˚ . ‹
˚ ‹ ‹
˚ ‹ ‹
detp˚ rvi ` w ‹q “ r detp˚ vi ‹q ` detp˚ w ‹q.
˚ ‹ ˚ ‹ ˚ ‹
˚ .. ‹ ˚ .
˝ ..
‹ ˚ . ‹
˝ .. ‚
˝ . ‚ ‚
vn vn vn

Remark 4.7. The above operations are somewhat like elementary


operations (Definition 1.29): if we take w “ 0 above, then the for-
mula says that multiplying any one row by r (which may be zero,
unlike in Definition 1.29), then the determinant also gets multiplied
by r. In particular, if A has a zero row, then

det A “ 0. (4.8)

Remark 4.9. We also have

det A “ 0

whenever two rows of A are equal: indeed, the matrix A1 obtained


by interchanging these rows is equal to A, i.e., A1 “ A, so that
det A “ det A1 . However, according to (4.6), we also have det A1 “
´ det A. Taking this together, we have

det A “ ´ det A

and this is only possible if det A “ 0.



Remark 4.10. The preceding remark also implies that for i ‰ j


and r P R

¨ ˛ ¨ ˛ ¨ ˛
v1 v1 v1
.. ˚ .. ‹ .. ‹
. ˚ . ‹ . ‹
˚ ‹ ˚
˚ ‹ ˚
det ˚ vi ` rvj ‹ “ det ˚ vi ‹ ` r det ˚ vj ‹ where vj is in the i-th row!
˚ ‹ ˚ ‹ ˚ ‹
˚ .. ‹
˝ ... ‚
˚ ‹ ˚ .. ‹
˝ . ‚ ˝ . ‚
vn vn vn
..
¨ ˛ ¨ ˛
v1 .
˚ .. ‹ ˚
v

˚ . ‹ ˚ .j
˚ ‹

˚ ..
“ det ˚ vi ‹ ` r ˚
˚ ‹ ‹
˚ . ‹ ‹
˝ .. ‚ ˚ v
˝ j


vn ..
.
¨ ˛
v1
˚ .. ‹
˚ . ‹
“ det ˚ vi ‹ by the above remark.
˚ ‹
˚ . ‹
˝ .. ‚
vn

In other words, adding an arbitrary multiple of some row to another


row does not affect the determinant.

In order to get a feeling for this theorem, let us apply it to a


concrete matrix, say

¨ ˛
´2 1 8
A“ ˝ 1 3 5 ‚.
0 2 4

Taking the theorem for granted, we will compute det A by stepwise


applying the above rules and keeping track of how the determinant
changes.

¨ ˛ ¨ ˛ ¨ ˛ ¨ ˛
´2 1 8 0 7 18 0 1 6 0 1 6
˝ 1 3 5 ‚⇝ ˝ 1 3 5 ‚⇝ ˝ 1 3 5 ‚⇝ ˝ 1 3 5 ‚
0 2 4
loooooooomoooooooon 0 2 4
loooooooomoooooooon 0 2 4
looooooomooooooon 0 0 ´8
loooooooomoooooooon
“A “A1 “A2 “A3
¨ ˛ ¨ ˛ ¨ ˛
0 1 6 0 1 6 0 1 0
⇝ ˝ 1 3 5 ‚⇝ ˝ 1 3 5 ‚⇝ ˝ 1 3 0 ‚
0 0 1
looooooomooooooon 0 0 1
looooooomooooooon 0 0 1
looooooomooooooon
“A4 “A5 “A6
¨ ˛ ¨ ˛
0 1 0 1 0 0
⇝ ˝ 1 0 0 ‚⇝ ˝ 0 1 0 ‚
0 0 1
looooooomooooooon 0 0 1
looooooomooooooon
“A7 “A8 “id3

From A to A1 to A2 to A3 , we have added appropriate multiples of


some row to another one, so that
det A “ det A1 “ det A2 “ det A3 .
We obtain A4 from A3 by multiplying the last row with ´1{8, so that
det A4 “ ´p1{8q det A3 . From A4 to A5 to A6 to A7 , we again added
appropriate multiples to some other rows, so that
det A4 “ det A5 “ det A6 “ det A7 .
Finally, A8 is obtained from A7 by swapping the first two rows, so
that
1 “ det A8 “ ´ det A7 .
Taking this all together we see that
det A “ det A3 “ ´8 det A4 “ ´8 det A7 “ `8 det A8 “ 8.
This shows that the above abstract description of the determinant
can be used to compute determinants in practice.
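Keeping track of row swaps and scalings during elimination is also how determinants are computed in practice. A minimal sketch (NumPy assumed, no care taken about rounding errors), applied to the matrix A from the computation above:

```python
import numpy as np

def det_by_elimination(A):
    """Determinant via Gaussian elimination, tracking swaps and pivots."""
    M = A.astype(float).copy()
    n = M.shape[0]
    det = 1.0
    for i in range(n):
        p = i + np.argmax(np.abs(M[i:, i]))       # choose a pivot row
        if np.isclose(M[p, i], 0):
            return 0.0                            # no pivot: the determinant vanishes
        if p != i:
            M[[i, p]] = M[[p, i]]
            det = -det                            # a row swap changes the sign
        det *= M[i, i]                            # scaling the row by 1/pivot divides det by pivot
        M[i] /= M[i, i]
        M[i+1:] -= np.outer(M[i+1:, i], M[i])     # clear column i below the pivot (det unchanged)
    return det

A = np.array([[-2, 1, 8],
              [1, 3, 5],
              [0, 2, 4]])
print(det_by_elimination(A), np.linalg.det(A))    # both give 8 (up to rounding)
```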
Proof. (of Theorem 4.5) We only sketch the proof idea: one basically
proceeds, for a general square matrix, similarly to the computation
above: one uses Gaussian elimination, i.e., elementary row opera-
tions to bring a given square matrix A into reduced row-echelon
form, say A ⇝ A1 . The properties in Theorem 4.5 then imply how

to compute det A in terms of det A1 . If the resulting matrix A1 has


a zero row, then det A1 “ 0. If it has no zero row, then A1 “ id, and
det A1 “ 1.

4.2.1 Small matrices


For practical purposes, it is useful to have an explicit formula at
hand for small matrices:
(1) For a 1 ˆ 1-matrix A, i.e., A “ paq, we have
det A “ a.

(2) The determinant of 2 ˆ 2-matrices defined in Definition 4.1 sat-


isfies the properties listed in Theorem 4.5.
ˆ ˙
1 0
• det id2 “ det “ 1 ¨ 1 ´ 0 ¨ 0 “ 1.
0 1
• Swapping two rows yields a sign change in the determinant
(Lemma 4.4).

ˆ ˙
a b
det “ apd ` rf q ´ bpc ` req
c ` re d ` rf
“ ad ´ bc ` rpaf ´ beq
ˆ ˙ ˆ ˙
a b a b
“ det ` r det .
c d e f

Thus, the definition of det for general matrices agrees with the
one in Definition 4.1.
(3) For a 3 ˆ 3-matrix one can show that the determinant is given
by the so-called Sarrus’ rule:
¨ ˛
a b c
det ˝ d e f ‚ “ aei ` bf g ` cdh ´ ceg ´ bdi ´ af h. (4.11)
g h i
A way to remember this formula is to write
¨ ˛
a b c a b
˝ d e f d e ‚
g h i g h

and take products of entries along the top-left-to-bottom-right


diagonals with a positive sign, and the top-right-to-bottom-left
diagonals with a negative sign:

` ` `
a b c a b

d e f d e

g h i g h
´ ´ ´

One can prove by direct computation, that the function defined


in (4.11) satisfies the conditions in Theorem 4.5.
(4) Sarrus’ rule does not apply to larger matrices. Instead, for ma-
trices of size 4 ˆ 4, one can prove that det A is the sum of 24
expressions, each of which is a product of 4 entries of A.

4.3 Invertibility and determinants


We can use the properties of the determinant in Theorem 4.5 (and
its proof) to obtain a useful criterion to decide when a matrix is
invertible. Determinants can also be used to compute the inverse of
an invertible matrix, however this is only of theoretical significance
due to the complexity of the ensuing (iterative) algorithm.
Definition 4.12. Let A P Matnˆn . For 1 ď i, j ď n, denote by
Aij the pn ´ 1q ˆ pn ´ 1q-matrix that is obtained from A by deleting the i-th row and the j-th
column. The number
cij :“ p´1qi`j det Aij
is called the pi, jq-cofactor of A.
The adjugate of A is the n ˆ n-matrix defined as
adjpAq “ pcij pAqqT “ pcji pAqq.
Theorem 4.13. An n ˆ n-matrix A is invertible if and only if
det A ‰ 0.

If this is the case, then the inverse can be computed as


A´1 “ p1{ det Aq ¨ adjpAq.
Proof. We revisit the proof of Theorem 4.5: say A ⇝ A1 , a reduced
row echelon matrix, by means of elementary operations (Gaussian
elimination). In this process, one does not multiply any row by
zero, so that det A “ 0 if and only if det A1 “ 0. We also know that
A1 “ U A, where U is an invertible matrix (namely, a product of
elementary matrices). Moreover, A1 is invertible if and only if A is
invertible (since U is invertible). We therefore have
A is invertible ô A1 is invertible ô det A1 ‰ 0 ô det A ‰ 0
and it remains to show the middle equivalence.
The matrix A1 is in reduced row echelon form. Thus, either A1 “
id or A1 contains a zero row. In the first event, A1 is invertible,
and det A1 “ 1 ‰ 0. In the second event, A1 is not invertible (by
Corollary 3.86) and det A1 “ 0 as was noted around (4.8).
We skip the proof of the adjugate formula, cf. [Nic95, Theo-
rem 3.2.4].
ˆ ˙
a b
Example 4.14. For a 2 ˆ 2-matrix A “ , the adjugate
c d
matrix is ˆ ˙
d ´b
.
´c a
Therefore, the inverse can be computed as
ˆ ˙
´1 1 d ´b
A “ .
ad ´ bc ´c a
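A quick check of this 2 ˆ 2 formula on the matrix from Example 4.2 (NumPy assumed):

```python
import numpy as np

A = np.array([[4., 7.],
              [2., -1.]])
det = A[0, 0] * A[1, 1] - A[0, 1] * A[1, 0]         # ad - bc = -18
adj = np.array([[A[1, 1], -A[0, 1]],
                [-A[1, 0], A[0, 0]]])               # adjugate of a 2 x 2 matrix
print(np.allclose(adj / det, np.linalg.inv(A)))     # True
```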

4.4 Further properties of determinants


Proposition 4.15. (product formula) For two n ˆ n-matrices A, B,
we have the following formula:
detpABq “ det A ¨ det B.
I.e., the determinant of a product (of square matrices of the same
size) is the product of the two individual determinants.

In particular, this shows

detpABq “ detpBAq,

even though AB ‰ BA!


Proof. We don’t include a full proof, but only observe that one
checks this by direct computation when A is an elementary matrix.
This implies the formula if A is invertible, since then A is a product
of elementary matrices. If A is not invertible, then one shows that
AB is also not invertible (for any B), and therefore both det A “ 0
and detpABq “ 0, so the formula holds in this case too. See, e.g.,
[Nic95, Theorem 3.2.1] for a proof.

Remark 4.16. The determinant is therefore multiplicative, but it


is not additive: one has

detpA ` Bq ‰ det A ` det B,

e.g.
ˆ ˙ ˆ ˙ ˆ ˙
2 0 1 0 1 0
det “ 4 ‰ 1 ` 1 “ det ` det .
0 2 0 1 0 1

Proposition 4.17. Let A be an upper triangular matrix or a lower


triangular matrix , i.e., of the form
¨ ˛
a11 ˚ . . . ˚
... ... .. ‹
˚ 0 . ‹
˚
A“˚ . . . ‹,
˝ .. .. .. ˚ ‚
0 ... 0 ann
resp.
¨ ˛
a11 0 ... 0
.. .. ..
˚
˚ . . .

A“˚ ‹.
˚ ‹
.. .. .. ..
˝ . . . . ‚
˚ ... ˚ ann
Here * stands for an arbitrary entry. Then

det A “ a11 ¨ ¨ ¨ ¨ ¨ ann .



Proof. If one of the entries on the main diagonal, i.e., a11 , . . . , ann
is zero, then the columns of A are linearly dependent, so that A is
not invertible and det A “ 0, in agreement with the formula. If instead all aii ‰ 0, we can divide
the i-th row by aii for each i; by Theorem 4.5(3) this divides the determinant by a11 ¨ ¨ ¨ ann and
produces a matrix A1 whose diagonal entries are all 1. Adding appropriate multiples of the rows to the rows
above (resp. below in the case of a lower triangular matrix), which
does not affect the determinant, gives A1 ⇝ id, so that det A1 “ 1 and
therefore det A “ a11 ¨ ¨ ¨ ann ¨ det A1 “ a11 ¨ ¨ ¨ ann .

Proposition 4.18. For A P Matnˆn , we have

det A “ detpAT q,

i.e., the determinant does not change when passing from A to its
transpose (Definition 3.81).

Proof. For small matrices (of size at most 3 ˆ 3), this can be proved
directly from the formulae in §4.2.1.
In general, one may argue like this: if A is not invertible, then
AT is not invertible either (by Lemma 3.83). In this case, both
sides of the equation are zero. If A is invertible, it is a product of
elementary matrices: A “ U1 . . . Un . We then have AT “ UnT . . . U1T .
By the product formula (Proposition 4.15), we may therefore assume
A is an elementary matrix. In this case, one checks the claim by
inspection:

¨ ˛
1
..
.
˚ ‹
˚ ‹
˚ ‹
˚ 0 1 ‹
• for A “ ˚
˚ .. ‹
‹, we have AT “ A, so
˚ . ‹
˚
˚ 1 0 ‹

˚ .. ‹
˝ . ‚
1
the claim clearly holds.
¨ ˛
1
˚ ... ‹
˚ ‹
1
˚ ‹
˚ ‹
• Likewise, for A “ ˚ r ‹, we have A “
˚ ‹
1
˚ ‹
˚ ‹
˚ .. ‹
˝ . ‚
1
AT , so again the claim holds obviously.
¨ ˛
1
˚ ... ‹
˚ ‹
˚ ‹
˚ 1 ‹
• The matrix ˚
˚
˚ ... ‹
‹ is a lower trian-

˚
˚ r 1 ‹

˚
˝ ... ‹

1
gular matrix, and its transpose an upper triangular matrix.
Both have determinant 1 according to Proposition 4.17.

Remark 4.19. We introduced the determinant using row opera-


tions. The preceding result implies that one can replace the word
“row” in all of the above by the word “column”. Applying that,
say, to Remark 4.9 we obtain that det A “ 0 if A has two identical
columns.

Proposition 4.20. Let A “ paij q P Matnˆn . Then, for any i, one


can compute the determinant using “cofactor expansion” along the
i-th row. That is, the following identity holds, where cij are the
cofactors of A (Definition 4.12):

det A “ ai1 ci1 ` ¨ ¨ ¨ ` ain cin .

Similarly, one can compute it using cofactor expansion along the


j-th column, for any j:

det A “ a1j c1j ` ¨ ¨ ¨ ` anj cnj .

For a proof of this, see, e.g. [Nic95, §3.6].



Example 4.21. For example, we expand the determinant along the


second row:
¨ ˛
2 3 7
det ˝ ´4 0 6 ‚ “ a21 c21 ` a22 c22 ` a23 c23
1 5 0
ˆ ˙ ˆ ˙
1`2 3 7 2`2 2 7
“ p´4qp´1q det ` 0p´1q det
5 0 1 0
ˆ ˙
2 3
` 6p´1q2`3 det
1 5
“ 4 ¨ p´35q ` 0 ´ 6 ¨ 7
“ ´182.

The choice of the second row (as opposed to the others) is arbitrary,
and the result is the same. However, the presence of the a22 “ 0
simplifies the computation.
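Cofactor expansion also yields a simple, although very inefficient, recursive algorithm for the determinant. A sketch (NumPy used only for array handling):

```python
import numpy as np

def det_cofactor(A):
    """Determinant by cofactor expansion along the first row."""
    n = A.shape[0]
    if n == 1:
        return A[0, 0]
    total = 0.0
    for j in range(n):
        # A_{1,j+1}: the matrix A with its first row and (j+1)-st column removed
        minor = np.delete(np.delete(A, 0, axis=0), j, axis=1)
        total += (-1) ** j * A[0, j] * det_cofactor(minor)
    return total

A = np.array([[2., 3., 7.],
              [-4., 0., 6.],
              [1., 5., 0.]])
print(det_cofactor(A))   # expected: -182, as in Example 4.21
```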

4.5 Exercises
Exercise 4.1. For which values of a, b P R is the following matrix
invertible? In this event, what is its inverse?
¨ ˛
a b 3
A “ ˝ 2 1 ´1 ‚.
1 ´1 4

Exercise 4.2. Let A be a square matrix such that A2 “ id (the


identity matrix). Prove that detpAq “ ˘1.

Exercise 4.3. Compute the determinant of a rotation matrix (cf. Ex-


ample 3.18), ˆ ˙
cos r ´ sin r
A“ .
sin r cos r
Exercise 4.4. Compute the determinant of
¨ ˛
2 0 1 4
˚ ´1 3 0 2 ‹
˚ ‹.
˝ 1 0 2 ´3 ‚
0 ´2 5 1

Exercise
¨ 4.5.˛(Solution at p. 227) Compute the determinant of
3 0 0
˝ 1 4 0 ‚ in three ways:
2 ´3 5
• by using Theorem 4.5,
• by using Sarrus’s rule, (4.11),
• by using Proposition 4.17.
Exercise 4.6. (Solution at p. 228) Compute the determinants of
¨ ˛ ¨ ˛
1 5 8 1 5 8
˝ 40 ´9 1 ‚ and ˝ 40 ´9 1 ‚.
0 0 0 1 5 8

Exercise 4.7. Compute the determinants of


¨ ˛
3 26 ´9 3 ¨ ˛
˚ 0 3 1 2 3
1 28 ‹
˚
˝ 0 0
‹ and ˝ 4 5 6 ‚.
2 71 ‚
7 8 9
0 0 0 3

Exercise 4.8. Compute the inverses of


¨ ˛
ˆ ˙ 5 2 ´1
10 9
and ˝ 0 0 1 ‚.
11 10
6 2 3
Chapter 5

Eigenvalues and
eigenvectors

Eigenvalues and eigenvectors are an extremely useful concept of lin-


ear algebra. Coupled with certain numerical considerations, which
are (only slightly!) beyond the scope of this course, eigenvalues have
been used, for example, in the early PageRank algorithm employed
by Google.
The overall idea of eigenvalues and eigenvectors is this: square
matrices of the form
A = \begin{pmatrix} a_{11} & & 0 \\ & \ddots & \\ 0 & & a_{nn} \end{pmatrix}

(i.e., the only non-zero entries are on the main diagonal; such ma-
trices are called diagonal matrices) are particularly simple to com-
prehend and to use in computations. For example, products of diagonal matrices
can be computed easily:
\begin{pmatrix} a_{11} & & 0 \\ & \ddots & \\ 0 & & a_{nn} \end{pmatrix} \begin{pmatrix} b_{11} & & 0 \\ & \ddots & \\ 0 & & b_{nn} \end{pmatrix} = \begin{pmatrix} a_{11}b_{11} & & 0 \\ & \ddots & \\ 0 & & a_{nn}b_{nn} \end{pmatrix},

i.e., the (diagonal) entries can just be multiplied one by one, some-
thing that clearly goes wrong for products of general matrices. Eigen-
values and eigenvectors can, in certain cases, be used to reduce com-
putations for general matrices to those for diagonal matrices.


5.1 Definitions
Definition 5.1. Let A be a square matrix. A real number λ is
called an eigenvalue of A if there exists a nonzero vector v P Rn
that satisfies the equation
Av “ λv.
Such a vector v is called an eigenvector for the eigenvalue λ. In
other words, multiplying the matrix A by the vector v results in a
scaled version of v, where the scaling factor is the eigenvalue λ.
Likewise, for a linear map f : V Ñ V , λ is an eigenvalue if there
is v P V, v ‰ 0 such that
f pvq “ λv. (5.2)
We consider some of the linear maps f : R2 Ñ R2 of §3.2.1.
Example 5.3. If f is the reflection along, say, the x-axis, i.e., f px, yq “
px, ´yq, then (5.2) reads
px, ´yq “ pλx, λyq,
i.e., x “ λx and ´y “ λy. The first equation holds if x “ 0 and λ
arbitrary or if λ “ 1 and x arbitrary. Similarly, the second forces
y “ 0 or λ “ ´1. Since, by definition, we have px, yq ‰ p0, 0q, we
cannot have both x “ 0 and y “ 0. If x ‰ 0, then λ “ 1, which
forces y “ 0. If x “ 0, then y ‰ 0 and therefore λ “ ´1. Thus,
there are two eigenvalues and eigenvectors are as follows:
• λ “ 1, with eigenvectors px, 0q for arbitrary x P R,
• λ “ ´1, with eigenvectors p0, yq for arbitrary y P R.

Before going on, we make the following observation, for any nˆn-
matrix A. The equation
Av “ λv
can be rewritten as
0 “ Av ´ λv “ Av ´ pλidqv “ pA ´ λidqv.
Here we have used standard properties of matrix multiplication
(Lemma 3.57). We seek a non-zero vector v satisfying this con-
dition. Such a vector exists if and only if A ´ λid is not invertible.
Example 5.4. We consider the shearing matrix A = \begin{pmatrix} 1 & r \\ 0 & 1 \end{pmatrix}, where we assume r ‰ 0 (otherwise A “ id). We determine for which λ the matrix
A - \lambda\, id = \begin{pmatrix} 1-\lambda & r \\ 0 & 1-\lambda \end{pmatrix}
fails to be invertible.
It suffices to compute the determinant (Theorem 4.13):
\det\begin{pmatrix} 1-\lambda & r \\ 0 & 1-\lambda \end{pmatrix} = (1-\lambda)^2 - r \cdot 0 = (\lambda - 1)^2 .
This is zero precisely if λ “ 1, i.e., λ “ 1 is the only eigenvalue of
A. Eigenvectors for this eigenvalue are those vectors such that
(A - 1 \cdot id)\begin{pmatrix} x \\ y \end{pmatrix} = 0, \quad\text{i.e.,}\quad \begin{pmatrix} 0 & r \\ 0 & 0 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} ry \\ 0 \end{pmatrix} = 0.
Since we have assumed r ‰ 0, this implies y “ 0, and x is arbitrary. Thus, the eigenvectors for λ “ 1 are of the form \begin{pmatrix} x \\ 0 \end{pmatrix} for x P R.

5.2 The characteristic polynomial


Definition and Lemma 5.5. For A P Matnˆn , the function
χptq “ detpA ´ t ¨ idn q
is a polynomial of degree n. It is called the characteristic polynomial
of the matrix A. A real number λ is an eigenvalue of A if and only
if
χpλq “ 0.
For example, for A = \begin{pmatrix} 1 & r \\ 0 & 1 \end{pmatrix}, χ(t) = (t-1)^2 .
Example 5.6. We compute the eigenvalues of A = \begin{pmatrix} 2 & -1 \\ 4 & -3 \end{pmatrix} using the characteristic polynomial:
\chi_A(t) = \det\begin{pmatrix} 2-t & -1 \\ 4 & -3-t \end{pmatrix} = (2-t)(-3-t) + 4 = t^2 + t - 2.

The equation χA ptq “ 0 solves as
t_{1/2} = -\frac{1}{2} \pm \sqrt{\frac{1}{4} + 2} = -\frac{1}{2} \pm \frac{3}{2},
i.e., the eigenvalues are t1 “ ´2, t2 “ 1.
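These eigenvalues can be double-checked numerically; a small sketch (not part of the notes, assuming NumPy):

import numpy as np

# illustrative sketch (not from the notes); assumes NumPy
A = np.array([[2.0, -1.0], [4.0, -3.0]])
print(np.linalg.eigvals(A))        # approximately [ 1., -2.]
print(np.roots([1.0, 1.0, -2.0]))  # roots of t^2 + t - 2, i.e. -2 and 1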
Example 5.7. We consider the rotation matrix A = \begin{pmatrix} \cos r & -\sin r \\ \sin r & \cos r \end{pmatrix}. We have:
• if r “ . . . , ´4π, ´2π, 0, 2π, . . . (i.e., a rotation by a multiple
of 360˝ , i.e., no rotation at all), then A “ id2 , and the only
eigenvalue is λ “ 1, and any vector px, yq P R2 is an eigenvector.
• if r “ . . . , ´3π, ´π, π, 3π, . . . (i.e., a rotation by 180˝ (plus an
irrelevant multiple of 360˝ )), the only eigenvalue is λ “ ´1,
and again any vector in R2 is an eigenvector,
• in all other cases, i.e., if the rotation is not by 0˝ or by 180˝ ,
there are no eigenvalues (and therefore no eigenvectors).
These statements are clear geometrically: for a rotation other than
the special cases, for any vector v P R2 we have that the rotated
vector Av lies on a different line, so that it cannot be an eigen-
vector. To confirm this algebraically, we compute its characteristic
polynomial:
\chi(t) = \det(A - t\, id) = \det\begin{pmatrix} \cos r - t & -\sin r \\ \sin r & \cos r - t \end{pmatrix} = (\cos r - t)^2 + (\sin r)^2 = (\cos r)^2 - 2t\cos r + t^2 + (\sin r)^2 = 1 + t^2 - 2t\cos r.
We solve this for t using the usual formula:
t_{1/2} = \cos r \pm \sqrt{(\cos r)^2 - 1}.
We have ´1 ď cos r ď 1, and cos r “ ˘1 if and only if r is a multiple
of π (cf. §B). Thus the term inside the square root is zero in that
case; in all other cases it is strictly negative, so that the equation
χptq “ 0 has no (real) solution.

The non-existence of eigenvalues can be salvaged by working with


complex numbers, instead of real numbers.

Theorem 5.8. (Fundamental theorem of algebra) For every non-


constant polynomial

pptq “ an tn ` ¨ ¨ ¨ ` a0 ,

where the coefficients a0 , . . . , an are complex numbers (for example,


they can be real numbers), there exists a complex number z P C
such that
ppzq “ 0.

This theorem is famous for the number of entirely different proofs it
admits. A completely elementary proof fitting on about two pages is
given in [Oli11].

Example 5.9. Consider the rotation matrix


A = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}

describing a (counter-clockwise) rotation by 90˝ . Its characteristic


polynomial
χA ptq “ 1 ` t2
does not have a real zero, i.e., for all real numbers r, χA prq ‰ 0.
However, the complex number i, the imaginary unit, which satisfies
i2 “ ´1 is a complex zero. In addition, ´i, which also satisfies
p´iq2 “ ´1 is another complex zero.

It is therefore helpful to consider the concepts of linear algebra


not only for real matrices, but for complex matrices. All of the con-
cepts and theorems that we have encountered in this course continue
to hold for complex matrices, complex vector spaces etc.

Corollary 5.10. Any complex square matrix A P Matnˆn has at


least one (complex) eigenvalue. In particular, any real square matrix
has at least one complex eigenvalue (but it may not have a real
eigenvalue).

5.3 Eigenspaces
In the above examples, the set of all eigenvectors for a given eigen-
value has a particularly nice shape. This is a general phenomenon:
Definition and Lemma 5.11. Let A P Matnˆn be a square ma-
trix and λ P R a fixed real number. The set

Eλ :“ tv P Rn | Av “ λvu

is a subspace of Rn . It is called the eigenspace of A with respect to


λ.

Proof. The equation Av “ λv is equivalent to pA ´ λidqv “ 0,


i.e., we have Eλ “ kerpA ´ λidq. This is a subspace of Rn by
Proposition 3.23.

Remark 5.12. If λ above is not an eigenvalue, then Eλ “ t0u, i.e.,


the zero vector is the only one satisfying Av “ λv.
If λ is an eigenvalue, then Eλ consists of all the eigenvectors for
the eigenvalue λ, together with the zero vector (which by definition
is not an eigenvector).

Example 5.13. We compute the eigenspaces of the matrix A = \begin{pmatrix} 0 & -1 \\ -2 & 0 \end{pmatrix}. Its characteristic polynomial is
\chi_A(t) = \det\begin{pmatrix} -t & -1 \\ -2 & -t \end{pmatrix} = t^2 - 2.
Its zeros, i.e., the eigenvalues of A, are \lambda_{1/2} = \pm\sqrt{2}. The eigenspace for \sqrt{2} is the solution space of the homogeneous system
\underbrace{\begin{pmatrix} -\sqrt{2} & -1 \\ -2 & -\sqrt{2} \end{pmatrix}}_{= A - \sqrt{2}\cdot id =: B} \begin{pmatrix} x \\ y \end{pmatrix} = 0.

We solve this by reducing the matrix B to row-echelon form
\begin{pmatrix} -\sqrt{2} & -1 \\ -2 & -\sqrt{2} \end{pmatrix} ⇝ \begin{pmatrix} 1 & \frac{1}{\sqrt{2}} \\ -2 & -\sqrt{2} \end{pmatrix} ⇝ \begin{pmatrix} 1 & \frac{\sqrt{2}}{2} \\ 0 & 0 \end{pmatrix}.
Thus, y is a free variable and x = -\frac{\sqrt{2}}{2} y. Thus E_{\sqrt{2}} has dimension 1, and a basis vector is (-\frac{\sqrt{2}}{2}, 1). Similarly, one computes the eigenspace for \lambda = -\sqrt{2}:
B = A + \sqrt{2} \cdot id = \begin{pmatrix} \sqrt{2} & -1 \\ -2 & \sqrt{2} \end{pmatrix} ⇝ \begin{pmatrix} 1 & -\frac{1}{\sqrt{2}} \\ -2 & \sqrt{2} \end{pmatrix} ⇝ \begin{pmatrix} 1 & -\frac{\sqrt{2}}{2} \\ 0 & 0 \end{pmatrix},
so the eigenspace E_{-\sqrt{2}} is again one-dimensional, and a basis vector is (\frac{\sqrt{2}}{2}, 1). Here is a plot showing the two eigenspaces: the map v ÞÑ Av will stretch the vectors in E_{\sqrt{2}} by a factor of \sqrt{2}, while those on the eigenspace E_{-\sqrt{2}} will be flipped and stretched by a factor of \sqrt{2}:
[Plot: the two eigenspaces E_{\sqrt{2}} and E_{-\sqrt{2}}, two lines through the origin in the (x, y)-plane.]
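A quick numerical check of this example (a sketch, not part of the notes, assuming NumPy; the eigenvectors returned by eig are normalized to length 1, so they are scalar multiples of the basis vectors found above):

import numpy as np

# illustrative sketch (not from the notes); assumes NumPy
A = np.array([[0.0, -1.0], [-2.0, 0.0]])
eigenvalues, eigenvectors = np.linalg.eig(A)
print(eigenvalues)           # approximately [ 1.414, -1.414 ], i.e. +-sqrt(2)
# the k-th column of `eigenvectors` is an eigenvector for eigenvalues[k]
print(eigenvectors[:, 0])
print(eigenvectors[:, 1])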

5.4 Diagonalizing matrices


As was mentioned above, diagonal matrices are particularly easy to
compute with. This raises the question if (and how) it is possible
to “bring” a given matrix A into such an easy form.
Definition 5.14. A square matrix A is called diagonalizable if there
is an invertible matrix P P Matnˆn such that
P^{-1} A P = D,
where D = \begin{pmatrix} d_{11} & & 0 \\ & \ddots & \\ 0 & & d_{nn} \end{pmatrix} is a diagonal matrix.

An example showing the relevance of this notion is this: in the


context of differential equations, one needs to compute the exponen-
tial of a square matrix A, which is defined as

\exp A = id + A + \frac{A^2}{2} + \frac{A^3}{6} + \frac{A^4}{24} + \cdots.
Here A3 “ A ¨ A ¨ A etc. Instead of computing all these powers
of A one after another, one can use the above definition: if A is
diagonalizable, i.e., P^{-1}AP = D, then A = (PP^{-1})A(PP^{-1}) =
P(P^{-1}AP)P^{-1} = PDP^{-1}. Then,
A^2 = PDP^{-1} \cdot PDP^{-1} = PD^2P^{-1}, \qquad A^3 = PD^3P^{-1}

etc. Computing the powers of D, as opposed to those of A is easy:


one just needs to raise the diagonal entries to the corresponding
power.
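To illustrate the point (this sketch is not part of the notes and assumes NumPy; np.linalg.eig is used to produce an eigenbasis), one can compare exp A computed via diagonalization with a truncated power series:

import numpy as np

# illustrative sketch (not from the notes); assumes NumPy
A = np.array([[2.0, 1.0], [1.0, 2.0]])      # a diagonalizable (symmetric) matrix
eigenvalues, P = np.linalg.eig(A)            # columns of P are eigenvectors, so P^{-1} A P = D
expA = P @ np.diag(np.exp(eigenvalues)) @ np.linalg.inv(P)

# compare with the truncated series id + A + A^2/2 + A^3/6 + ...
series = np.eye(2)
term = np.eye(2)
for k in range(1, 20):
    term = term @ A / k
    series = series + term

print(np.round(expA, 6))
print(np.round(series, 6))   # essentially the same matrix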

Method 5.15. In order to diagonalize a square matrix A P Matnˆn ,


i.e., to determine whether P above exists, and to compute D, one
proceeds as follows:

• Compute the eigenvalues of A, for example by finding the zeros


of the characteristic polynomial. Denote them by λ1 , . . . , λk .
Denote the corresponding eigenspaces by Eλ1 , . . . , Eλk .

• The matrix A is diagonalizable precisely if


\sum_{i=1}^{k} \dim E_{\lambda_i} = n,

i.e., if the dimensions of the eigenspaces sum up to the size of


the matrix A.

• In this event, one may choose P to be the n ˆ n-matrix whose


columns are the basis vectors of all the eigenspaces (for the
various eigenvalues λ1 , . . . , λk ). The matrix D is the diagonal
matrix whose diagonal entries are

\underbrace{\lambda_1, \ldots, \lambda_1}_{\dim E_{\lambda_1}\text{ times}}, \; \ldots, \; \underbrace{\lambda_k, \ldots, \lambda_k}_{\dim E_{\lambda_k}\text{ times}}.
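For numerical experiments one can mimic Method 5.15 with NumPy (a sketch, not part of the notes; np.linalg.eig already returns a basis of eigenvectors whenever the matrix is diagonalizable):

import numpy as np

# illustrative sketch (not from the notes); assumes NumPy
A = np.array([[0.0, -1.0], [-2.0, 0.0]])
eigenvalues, P = np.linalg.eig(A)   # columns of P: eigenvectors for the entries of `eigenvalues`
D = np.diag(eigenvalues)

# If A is diagonalizable, P is invertible and P^{-1} A P = D:
print(np.allclose(np.linalg.inv(P) @ A @ P, D))   # True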

One can show that above, one always has k ď n. One does this
by proving that the sum of the subspaces Eλ1 , . . . , Eλk is a direct
sum, so that

n = \dim \mathbb{R}^n \ge \dim(E_{\lambda_1} + \cdots + E_{\lambda_k})
  = \dim(E_{\lambda_1} \oplus \cdots \oplus E_{\lambda_k})
  = \dim(E_{\lambda_1}) + \cdots + \dim(E_{\lambda_k})
  \ge \underbrace{1 + \cdots + 1}_{k\text{ summands}}
  = k.

This implies the following:


Corollary 5.16. If an n ˆ n-matrix has n distinct eigenvalues, then
it is diagonalizable.

Definition 5.17. Let A P Matnˆn be given. A basis v1 , . . . , vn of


Rn is called an eigenbasis for A if each vi is an eigenvector (for a
certain eigenvalue) of A.

Lemma 5.18. For A P Matnˆn the following two statements are


equivalent:
(1) A is diagonalizable.
(2) A admits an eigenbasis, i.e., there is an eigenbasis (of Rn ) for
A.

One proves this by observing that if P ´1 AP is a diagonal matrix,
then P is a base-change matrix between the standard basis and an
eigenbasis.
Example 5.19. The matrix A = \begin{pmatrix} 0 & -1 \\ -2 & 0 \end{pmatrix} in Example 5.13
has two distinct eigenvalues, and is therefore diagonalizable (Corol-
lary 5.16). An eigenbasis for A is
v_1 = \left(-\tfrac{\sqrt{2}}{2}, 1\right), \qquad v_2 = \left(\tfrac{\sqrt{2}}{2}, 1\right), \quad\text{cf. Example 5.13.}

Example 5.20. We consider the shearing matrix


A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}.

Its characteristic polynomial is χA ptq “ p1 ´ tq2 , whose only zero


is t “ 1. Thus, A has this eigenvalue only: λ “ 1. We compute
the eigenspace: consider the matrix B :“ A ´ λ id = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}.
Writing, as usual, v = \begin{pmatrix} x \\ y \end{pmatrix} P R2 , the space of solutions of the
homogeneous system
Bv = \begin{pmatrix} y \\ 0 \end{pmatrix} = 0

is our eigenspace, namely

E1 “ tv P R2 |Bv “ 0u “ tpx, 0q | x P Ru.

This space is 1-dimensional, and has a basis consisting of the (single)


vector p1, 0q. Thus, A is not diagonalizable.

Example 5.21. We continue the discussion of the rotation matrix A = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}. Its (complex) eigenvalues are λ1 “ i, λ2 “ ´i.
According to Corollary 5.16, A is diagonalizable. We compute the
eigenspaces, where we regard A as a complex matrix:

Ei “ tv P C2 | pA ´ i ¨ idqv “ 0u.
If v = \begin{pmatrix} z_1 \\ z_2 \end{pmatrix}, then
(A - i \cdot id)v = \begin{pmatrix} -i & -1 \\ 1 & -i \end{pmatrix}\begin{pmatrix} z_1 \\ z_2 \end{pmatrix} = \begin{pmatrix} -iz_1 - z_2 \\ z_1 - iz_2 \end{pmatrix} \overset{!}{=} \begin{pmatrix} 0 \\ 0 \end{pmatrix}.

This means z1 “ iz2 from the second equation; the first is then also
satisfied since ´iz1 ´ z2 “ ´ipiz2 q ´ z2 “ z2 ´ z2 “ 0. Thus

Ei “ tpiz, zq | z P Cu,

i.e., as a complex vector space, Ei is 1-dimensional and a basis of it


is the vector pi, 1q.
Similarly,

E´i “ tv P C2 | pA ` i ¨ idqv “ 0u.



Computing this leads to the linear system
(A + i \cdot id)v = \begin{pmatrix} i & -1 \\ 1 & i \end{pmatrix}\begin{pmatrix} z_1 \\ z_2 \end{pmatrix} = \begin{pmatrix} iz_1 - z_2 \\ z_1 + iz_2 \end{pmatrix} \overset{!}{=} \begin{pmatrix} 0 \\ 0 \end{pmatrix}.
This gives z1 “ ´iz2 , so that E´i “ tp´iz, zq | z P Cu, and a basis
of it is the (single) vector p´i, 1q. Thus, the matrix P above is
P = \begin{pmatrix} i & -i \\ 1 & 1 \end{pmatrix}.
Example 5.22. For A = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, λ “ 0 is an eigenvalue (and \begin{pmatrix} 0 \\ 1 \end{pmatrix} a 0-eigenvector) and λ “ 1 is another eigenvalue (and \begin{pmatrix} 1 \\ 0 \end{pmatrix} a 1-eigenvector).

5.5 Exercises
¨ ˛
2 1 1
Exercise 5.1. Let A “ ˝ 0 1 0 ‚. Following Method 5.15,
1 ´1 2
decide whether A is diagonalizable.
Help: you will find that the eigenvalues of A are among the
numbers 0, 1, 2, 3. You will be able to choose basis vectors of the
eigenspaces all of whose coordinates are ´1, 0, 1.
ˆ ˙
a b
Exercise 5.2. Let A “ . Show that:
c d
• χA ptq “ t2 ´ trpAqt ` det A “ t2 ´ pa ` dqt ` pad ´ bcq. Here
trpAq is the trace of A, cf. Exercise 3.24.
• The eigenvalues of A are
\lambda_{1/2} = \frac{a+d}{2} \pm \sqrt{\frac{(a-d)^2 + 4bc}{4}}.
Exercise 5.3. For each of the following matrices, compute χA ptq,
the eigenvalues of A, the eigenspaces for these eigenvalues. Also
decide whether A is diagonalizable and compute an eigenbasis if
one exists.
ˆ ˙
3 5
(1) A “
1 ´1
¨ ˛
0 1 0
(2) A “ ˝ 3 0 1 ‚
2 0 0
¨ ˛
1 0 0
3 ‚
(3) A “ ˝ 0 0 2
0 0 1
¨ ˛
0 1 0
(4) A “ ˝ 0 0 1 ‚
0 0 0
¨ ˛
1 0 0
Exercise 5.4. Consider the matrix A “ ˝ 1 1 2 ‚. Compute
1 0 1
its characteristic polynomial, its eigenvalues and its eigenspaces. Is
A diagonalizable? If so, find a basis of R3 such that the associated
matrix is a diagonal matrix, as in Definition 5.14.

Exercise 5.5. (Solution at p. 228) Let

f : R3 Ñ R3

be the linear map such that f p1, 0, 1q “ p2, 0, 2q, ker f “ Lpp1, 1, 1qq
and f p2, 0, ´3q “ p´2, 0, 3q. Compute the matrix of f with respect
to the standard basis.

Exercise 5.6. For which a P R is the matrix


¨ ˛
a 0 0
Aa “ ˝ a ´ 2 1 1 ‚
0 1 1

diagonalizable?

Exercise 5.7. (Solution at p. 229) For a parameter a P R, let


¨ ˛
4 0 4
Aa “ ˝ a 2 a ‚.
´2 0 ´2

(1) Compute the characteristic polynomial and the eigenvalues of


Aa , for all a P R.
(2) Compute the values of a for which Aa is diagonalizable. For
these a, find an invertible matrix P such that P ´1 Aa P is a
diagonal matrix.
Exercise 5.8. Consider the matrices
¨ ˛ ¨ ˛
1 1 1 2 1 0
A“ ˝ 0 2 0 ‚ and B “ ˝ 0 2 0 ‚.
1 ´1 1 0 0 0

(1) Compute the eigenvalues and eigenvectors of A and show A is


diagonalizable.
(2) Show that the characteristic polynomials of A and B are the
same. Compute the eigenvalues and eigenspaces of B. Explain
why A and B do not represent the same linear map with respect
to different bases!
Exercise 5.9. For a parameter t P R, consider the matrix
¨ ˛
´1 2 t
At “ ˝ 2 0 ´2 ‚.
t ´2 ´1

(1) For which values of t does At have 0 as an eigenvalue?


(2) Compute the eigenvalues and eigenspaces of At for those values
of t obtained in the previous part.
Exercise 5.10. Consider the vector space Mat2ˆ2 of 2 ˆ 2-matrices. Consider A = \begin{pmatrix} -4 & 8 \\ 1 & -2 \end{pmatrix} and the linear map
F : Mat2ˆ2 Ñ Mat2ˆ2 , X ÞÑ AX.
(1) Compute the 4 ˆ 4-matrix of F with respect to the standard
basis of Mat2ˆ2 , i.e., the matrices
ˆ ˙ ˆ ˙ ˆ ˙ ˆ ˙
1 0 0 1 0 0 0 0
E11 “ , E12 “ , E21 “ , E22 “ .
0 0 0 0 1 0 0 1

(2) Compute a basis of ker F and im F .



(3) Compute the eigenvalues and eigenspaces of F .


Remark 5.23. The linearity of F is a consequence of Lemma 3.57.
It is also very similar to Proposition 3.19.

Exercise 5.11. (Solution at p. 230) Consider the two matrices


¨ ˛ ¨ ˛
1 0 2 1 0 0
A “ ˝ 0 3 0 ‚ and B “ ˝ 0 3 2 ‚.
0 0 3 0 0 3

Do they represent the same linear map f : R3 Ñ R3 (with respect


to different bases)?
¨ ˛
1 1 1
Exercise 5.12. (Solution at p. 231) Let A “ ˝ 0 2 1 ‚. De-
0 0 2
termine the eigenvalues of A and the corresponding eigenspaces. Is
A diagonalizable? Is A2 similar to A? I.e., does A2 represent the
same linear map R3 Ñ R3 as A?

Exercise 5.13. (Solution at p. 231) Consider the vectors v1 “ p1, 0, 1q,


v2 “ p1, 1, 1q and v3 “ p1, 1, 2q.
(1) Explain why there is a unique linear map f : R3 Ñ R3 such
that f pv1 q “ p0, 0, 0q, f pv2 q “ p1, 0, 3q and v3 is an eigenvector
of eigenvalue 4.
(2) Compute the matrix A of f with respect to the basis v1 , v2 , v3
(both on the domain and on the codomain).
(3) Compute the matrix B of f with respect to the standard basis
(both for the domain and the codomain).
(4) For t P R, consider the vector vt “ p2, t, 5q. For which values of
t is vt P im f ?

Exercise 5.14. (Solution at p. 233) Consider the matrix


¨ ˛
0 2 t
A “ ˝´3 ´5 6‚
´2 ´2 5

(1) Determine the value of t for which A is not invertible.



(2) We now put t “ 2 for the remainder of this exercise. Determine


the value of a for which the vector v “ p2, 0, aq is an eigenvector
of A. What is the corresponding eigenvalue?
(3) Determine all the eigenvalues of A and decide whether A is di-
agonalizable.
(4) Decide whether A is similar to the matrix A2 (justify your re-
sponse).
Chapter 6

Euclidean spaces

The definition of a (real) vector space encodes the existence (and


good properties) of the addition of vectors and the scalar multipli-
cation of vectors. The vector space Rn has, however, another im-
portant piece of structure, namely the distance between two points,
and the property of vectors being orthogonal to each other.

6.1 The scalar product on Rn


Definition 6.1. The scalar product of v, w P Rn is defined as
xv, wy :“ v T idw “ v T w “ v1 w1 ` ¨ ¨ ¨ ` vn wn .
(This is not to be confused with the scalar multiple of a vector,
which is again a vector!)
Example 6.2. The scalar product can be positive, zero, or negative:
• ⟨(1, 2), (−2, 2)⟩ = 1 · (−2) + 2 · 2 = 2
• ⟨(1, 2), (−2, 1)⟩ = 1 · (−2) + 2 · 1 = 0
• ⟨(1, 2), (−2, 0)⟩ = 1 · (−2) + 2 · 0 = −2
However, for any v P Rn , we have
\langle v, v \rangle = \sum_{i=1}^{n} v_i^2 \ge 0 \qquad (6.3)


i.e., a scalar product of a vector with itself is always non-negative.


This implies that
||v|| := \sqrt{\langle v, v \rangle} = \sqrt{v_1^2 + \cdots + v_n^2}

is a well-defined (real) number. It is called the norm of the vector


v.
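In coordinates these quantities are easy to evaluate; a small sketch (not part of the notes, assuming NumPy):

import numpy as np

# illustrative sketch (not from the notes); assumes NumPy
v = np.array([1.0, 2.0])
w = np.array([-2.0, 2.0])
print(np.dot(v, w))            # scalar product <v, w> = 2
print(np.sqrt(np.dot(v, v)))   # norm ||v|| = sqrt(5)
print(np.linalg.norm(v))       # the same, using the built-in norm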

Lemma 6.4. The norm ||v|| is the length of the line segment from
the origin to v.
For v, w P R2 , there holds

||v ´ w||2 “ ||v||2 ` ||w||2 ´ 2||v||||w|| cos r,

where r is the angle between the vector v and w.

Proof. The formula for the norm follows from repeatedly applying
the Pythagorean theorem. Illustrating this for n “ 3, we see that
the line segment (shown dotted below)a from the origin O “ p0, 0, 0q
to the point pv1 , v2 , 0q has length v12 ` v22 . Therefore the length of
the segment from O to v is
\sqrt{\left(\sqrt{v_1^2 + v_2^2}\right)^2 + v_3^2} = \sqrt{v_1^2 + v_2^2 + v_3^2}.
[Figure: the vector v = (v1, v2, v3) and its projection (v1, v2, 0) onto the horizontal coordinate plane.]
The formula for the norm of v ´ w is a reformulation of the law of
cosines.

w
v´w
||w|| v
r
||v||

Given a square matrix A P Matnˆn , we have considered so far


the linear map
Rn Ñ Rn , v ÞÑ A ¨ v.
In addition to that, there is another fundamental map that one can
associate to a matrix:
x´, ´yA : Rn ˆ Rn Ñ R, pv, wq ÞÑ xv, wyA :“ v T ¨ A ¨ w.
Here we regard v and w as column vectors, i.e., as n ˆ 1-matrices.
Therefore, for v = \begin{pmatrix} v_1 \\ \vdots \\ v_n \end{pmatrix}, the transpose v^T = (v_1 \; \ldots \; v_n) is a row vector
(with n entries). Therefore v^T A is a 1 ˆ n-matrix, so that v^T A w
is a 1 ˆ 1-matrix, i.e., just a real number. We call this number the
scalar product of v and w with respect to the given matrix A.
Lemma 6.5. The scalar product has the following fundamental
properties:
• If we fix w P Rn , then the maps
x´, wy : Rn Ñ R, v ÞÑ xv, wy
xw, ´y : Rn Ñ R, v ÞÑ xw, vy
are linear (cf. Definition 3.1; e.g., for the first this means con-
cretely that
xrv ` r1 v 1 , wy “ rxv, wy ` r1 xv 1 , wy,
for r, r1 P R, v, v 1 P Rn . We refer to this by saying that x´, ´y :
Rn ˆ Rn Ñ R is a bilinear form (or as the bilinearity of the
scalar product).
• We have
xv, wy “ xw, vy.
This property is called symmetry.

Proof. By Proposition 3.19, the map w ÞÑ v T w “ xv, wy is linear.


The proof of the linearity in the first argument is similar, or it follows
from symmetry.
The identity xv, wy “ xw, vy is directly clear from the definition.
One may also prove it using (3.84):
pv T wqT “ wT pv T qT “ wT v.
Noting that any 1ˆ1-matrix (such as v T w) is equal to its transpose,
the left hand side equals xv, wy, while the right equals xw, vy.
Using the bilinearity of x´, ´y, we can compute the following
expression
||v ´ w||2 “ xv ´ w, v ´ wy
“ xv, v ´ wy ´ xw, v ´ wy
“ xv, vy ´ xv, wy ´ xw, vy ` xw, wy
“ ||v||2 ` ||w||2 ´ 2xv, wy.
Comparing this with the cosine law above we see
xv, wy “ ||v||||w|| cos r.
The factor cos r is equal to 0 precisely if r “ ´ π2 , π2 (i.e., 90˝ or
´90˝ ). In other words,
xv, wy “ 0
if the angle between the vectors v and w is ˘90˝ . This motivates
the following definition.
Definition 6.6. Two vectors v, w P Rn are said to be orthogonal if
\langle v, w \rangle = \sum_{i=1}^{n} v_i w_i = 0.

6.2 Positive definite matrices


Definition and Lemma 6.7. If A is a symmetric n ˆ n-matrix
(i.e., A “ AT ), then the map
x´, ´yA : Rn ˆ Rn Ñ R, xv, wyA :“ v T Aw
is bilinear and symmetric, i.e., Lemma 6.5 holds verbatim for x´, ´yA
instead of the standard scalar product (which corresponds to the
case A “ idn ).
Example 6.8. Suppose A = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}. Then Aw = \begin{pmatrix} w_1 \\ w_2 \\ w_3 \\ -w_4 \end{pmatrix}, so that
\langle v, w \rangle_A = v^T A w = (v_1 \; v_2 \; v_3 \; v_4) \begin{pmatrix} w_1 \\ w_2 \\ w_3 \\ -w_4 \end{pmatrix} = v_1 w_1 + v_2 w_2 + v_3 w_3 - v_4 w_4 .
This example is not an anomaly, but the basis of so-called Minkowski
space, which is fundamental in special relativity: it is R^{3+1} with
3 space coordinates and 1 time coordinate.
The standard basis vectors e_1 = (1, 0, 0, 0)^T, . . . , e_4 = (0, 0, 0, 1)^T are
orthogonal to each other, but
xe4 , e4 yA “ ´1,
whereas xek , ek yA “ `1 for the other three basis vectors. In that
sense, the scalar product (with respect to A) is able to distinguish
between the last and the other three directions.
Definition 6.9. A symmetric matrix A is called positive definite if
xv, vyA ą 0
for all v P Rn , v ‰ 0. In this case we can define the norm (of v with
respect to the matrix A) as
||v||_A := \sqrt{\langle v, v \rangle_A}.

It is negative definite if instead xv, vyA ă 0 for all v ‰ 0. The


matrix A is called indefinite if there exist v, w P Rn with xv, vyA ą 0
and xw, wyA ă 0.

Example 6.10. As we have seen in (6.3), idn is positive definite.


The matrix in Example 6.8 is indefinite.
It is suggestive to blame the ´1 in the last entry for the indefi-
niteness of the matrix in Example 6.8. The following result gives a
way to ensure positive definiteness for general matrices. To state it,
we introduce a bit of terminology:
Definition 6.11. For a square matrix A, the principal submatrix
(of size r) is the matrix
Aprq “ paij q1ďi,jďr .
I.e., it is the matrix consisting of the first r rows and columns of A.
Proposition 6.12. Let A P Matnˆn be a symmetric square matrix.
The following are equivalent:
(1) the bilinear form x´, ´yA is positive definite, i.e., xv, vyA ą 0 for
all v P Rn , v ‰ 0,
(2) A is positive definite,
(3) For all 1 ď r ď n, detpAprq q ą 0.
In particular, any positive definite matrix A has det A ą 0. There-
fore such a matrix is invertible (Theorem 4.13).
A proof of this criterion requires methods from §6.3.
Example 6.13. The matrix A = \begin{pmatrix} 1 & 2 & 4 \\ 2 & 5 & 8 \\ 4 & 8 & 20 \end{pmatrix} is positive definite,
since A^{(1)} = 1 is positive, \det A^{(2)} = \det\begin{pmatrix} 1 & 2 \\ 2 & 5 \end{pmatrix} = 1 > 0 and
\det A^{(3)} = \det A = 4 > 0.
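The criterion of Proposition 6.12 is easy to check by machine; a sketch (not part of the notes, assuming NumPy) for the symmetric matrix of Example 6.13:

import numpy as np

# illustrative sketch (not from the notes); assumes NumPy
A = np.array([[1.0, 2.0,  4.0],
              [2.0, 5.0,  8.0],
              [4.0, 8.0, 20.0]])

# determinants of the principal submatrices A^(1), A^(2), A^(3)
minors = [np.linalg.det(A[:r, :r]) for r in range(1, 4)]
print(minors)                              # approximately [1.0, 1.0, 4.0], all positive
print(np.all(np.linalg.eigvalsh(A) > 0))   # equivalently: all eigenvalues of the symmetric A are positive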
Example 6.14. The definiteness of matrices has applications in
analysis: for a (twice differentiable) function f : R2 Ñ R, such as
f px, yq “ x2 ` y 2 ,
one considers the so-called Hesse matrix, which is given by
\begin{pmatrix} \frac{\partial^2 f}{\partial x \partial x} & \frac{\partial^2 f}{\partial x \partial y} \\ \frac{\partial^2 f}{\partial y \partial x} & \frac{\partial^2 f}{\partial y \partial y} \end{pmatrix}.

For the above function it is
\begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix},
which is positive definite. By contrast, for gpx, yq “ x2 ´ y 2 , it is
\begin{pmatrix} 2 & 0 \\ 0 & -2 \end{pmatrix}, which is indefinite. One proves in analysis that the
positive definiteness of the Hesse matrix implies that there is a local
minimum at a given point px, yq, provided that \frac{\partial f}{\partial x} = \frac{\partial f}{\partial y} = 0 at this
point. Thus, f has a local minimum at the point p0, 0q, but g does
not.

6.3 Euclidean spaces


Definition 6.15. A Euclidean vector space is a vector space V to-
gether with a map
x´, ´y : V ˆ V Ñ R
that is
• bilinear (i.e., xv, ´y and x´, vy : V Ñ R are linear for each
v P V ),
• symmetric (i.e., xv, wy “ xw, vy), and
• positive definite (xv, vy ą 0 for each v ‰ 0).
One also refers to the map x´, ´y as the scalar product on V . We
say v, w are orthogonal if xv, wy “ 0. We will indicate this by writing

vKw.

We call
||v|| := \sqrt{\langle v, v \rangle} \; (\in \mathbb{R}_{\ge 0})
the norm of the vector v. For v, w P V , the distance between v and
w is defined as
dpv, wq :“ ||v ´ w||.

Example 6.16. (1) Rn with the above scalar product is an Eu-


clidean vector space. More generally, for a symmetric, positive
definite matrix A, Rn together x´, ´yA is an Euclidean space.

In other words, the above turns the fundamental properties of


Rn , together with the standard scalar product (or, more gen-
erally Rn with the scalar product x´, ´yA given by a positive
definite symmetric matrix A) into an abstract definition, simi-
larly to the way that a vector space is an abstraction of the key
properties of Rn .
(2) If V , together with some given scalar product x´, ´y is a Eu-
clidean space, then so is any subspace of V . In particular, any
subspace of Rn with the standard scalar product is again an Eu-
clidean space. For example, any plane inside R3 is an Euclidean
space.
(3) One can use elementary properties of the integral to show that
the vector space C “ Cpr´1, 1sq of continuous functions f :
r´1, 1s Ñ R with
\langle f, g \rangle := \int_{-1}^{1} f(x)g(x)\,dx

is an (infinite-dimensional) Euclidean space, which is of funda-


mental importance in analysis.
(4) As in Example 6.8, consider again V “ Rn , but

xv, wy :“ v1 w1 ` ¨ ¨ ¨ ` vn´1 wn´1 ´ vn wn .

This is bilinear and symmetric, but not positive definite, and


therefore not a scalar product.

Proposition 6.17. Let pV, x´, ´yq be an Euclidean space. For each
v, w P V , there holds:
(1) ||v|| ě 0,
(2) ||v|| “ 0 if and only if v “ 0,
(3) ||rv|| “ |r|||v|| for r P R,

Proof. The first and third statements are immediate. The second holds
since x´, ´y is (by definition) positive definite.

The scalar product yields a crucial additional feature that general


vector spaces do not possess. This is based on the following idea.
Throughout, let pV, x´, ´yq be an Euclidean vector space.

Lemma 6.18. Let e P V be a vector of norm 1, i.e., ||e|| “ 1. Let


v P V be any vector. Then the vector
ṽ :“ v ´ xv, ey ¨ e
is orthogonal to e and we have the equation
v “ ṽ ` xv, ey ¨ e (6.19)
expressing v as a sum of a scalar multiple of e and a vector that is
orthogonal to e.
Proof. The orthogonality of ṽ and xv, ey ¨ e is a computation using
the bilinearity of x´, ´y:
\langle \tilde v, \langle v, e\rangle \cdot e \rangle = \langle v - \langle v, e\rangle \cdot e, \langle v, e\rangle \cdot e \rangle
= \langle v, e\rangle \, \langle v - \langle v, e\rangle \cdot e, e \rangle
= \langle v, e\rangle \left( \langle v, e\rangle - \langle \langle v, e\rangle \cdot e, e \rangle \right)
= \langle v, e\rangle \left( \langle v, e\rangle - \langle v, e\rangle \underbrace{\langle e, e\rangle}_{=1} \right)
= 0.
The equation (6.19) is obvious from the definition of ṽ.
We now extend the observation of Lemma 6.18 to more than a
single vector. To do so, we introduce some terminology.
Definition and Lemma 6.20. The orthogonal complement of a
subset M Ă V is defined as
M K :“ tv P V | xv, my “ 0 for all m P M u.
This is a subspace of V .
For a subspace W , one has
W X W K “ t0u, (6.21).
The last assertion can be rephrased by saying that the zero vec-
tor is the only element in W that is orthogonal to all vectors in
W . Colloquially, this means that if W gets larger, then W K gets
smaller. This idea is made more precise (in terms of dimensions) in
Corollary 6.29 below. The last assertion is proved using the positive-
definiteness of x´, ´y (specifically, Proposition 6.17(2)).

[Plot: a one-dimensional subspace W in the (x, y)-plane together with its orthogonal complement W K , two perpendicular lines through the origin.]
Example 6.22. Consider the subspace W = L\left(\begin{pmatrix} 1 \\ 2 \\ 4 \end{pmatrix}, \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix}\right) Ă R3 (with its standard scalar product). We compute W K . A vector x = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} P R3 will be orthogonal to W if and only if it is orthogonal to v_1 = \begin{pmatrix} 1 \\ 2 \\ 4 \end{pmatrix} and v_2 = \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix}. This follows from the linearity of xx, ´y. We make the conditions xKv1 and xKv2 explicit:
x K v_1 : \quad x_1 + 2x_2 + 4x_3 = 0
x K v_2 : \quad x_1 + x_2 = 0.
We solve this homogeneous system:
\begin{pmatrix} 1 & 2 & 4 \\ 1 & 1 & 0 \end{pmatrix} ⇝ \begin{pmatrix} 1 & 2 & 4 \\ 0 & -1 & -4 \end{pmatrix} ⇝ \begin{pmatrix} 1 & 2 & 4 \\ 0 & 1 & 4 \end{pmatrix},
which shows that x_3 is a free variable, and that the solution space of the system, i.e., W K , is the subspace
W K = L\left(\begin{pmatrix} 4 \\ -4 \\ 1 \end{pmatrix}\right).
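Numerically, W K can be obtained as the null space of the matrix whose rows are the spanning vectors of W ; a sketch using the singular value decomposition (not part of the notes, assuming NumPy):

import numpy as np

# illustrative sketch (not from the notes); assumes NumPy
M = np.array([[1.0, 2.0, 4.0],    # rows: the spanning vectors v1, v2 of W
              [1.0, 1.0, 0.0]])

# The right-singular vectors belonging to zero singular values span the
# null space of M, i.e. the orthogonal complement of the row space W.
_, _, Vt = np.linalg.svd(M)
n = Vt[-1]
print(n)        # proportional to (4, -4, 1)
print(M @ n)    # approximately [0, 0]: n is orthogonal to both rows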

Definition 6.23. A family v1 , . . . , vn of vectors is called an or-


thonormal system if
• ||vi || “ 1 (i.e., xvi , vi y “ 1) for all i,
• vi Kvj (i.e., xvi , vj y “ 0) for all i ‰ j.

If the vectors additionally form a basis of V , then we speak of an


orthonormal basis.
For example, the standard basis in Rn is an orthonormal basis
(with respect to the standard scalar product).
Theorem 6.24. Let u1 , . . . , un be an orthonormal system (in an
Euclidean space). Let U “ Lpu1 , . . . , un q Ă V be the subspace
spanned by these vectors. Then there is a unique linear map, called
the orthogonal projection
p:V ÑU
such that
(1) ppuq “ u for all u P U ,
(2) ppvq ´ v P U K for all v P V .
In particular, every vector v P V can be written as
v “ loppvq
omoon ` vlooomooon
´ ppvq,
PU PU K

i.e., a sum of a vector in U and another one in its orthogonal com-


plement U K . This is the unique representation of v in such a form.
The map p is given by
p(v) = \sum_{k=1}^{n} \langle v, u_k \rangle u_k. \qquad (6.25)

Proof. The map p defined in (6.25) is linear, since x´, uk y is linear. It


satisfies the two conditions. One checks this using that the uk form
an orthonormal system, very similarly to the proof of Lemma 6.18.
If q : V Ñ U is another map with these two properties, we have
\langle p(v) - q(v), u \rangle = \left\langle \underbrace{(p(v) - v) - (q(v) - v)}_{\in U^{\perp}}, u \right\rangle = 0

for all v P V , u P U . Since qpvq, ppvq P U , we have ppvq ´ qpvq P U .


Thus, the vector ppvq´qpvq is zero, by (6.21). This shows the unicity
of p.
The final claim holds since v = \underbrace{p(v)}_{\in U} + \underbrace{v - p(v)}_{\in U^{\perp}} is such a representation.
If v = u' + u'' with u' P U and u'' P U K is another such representation,
then p(v) − u' = u'' − (v − p(v)) lies both in U (left hand side) and
in U K (right hand side). However, again applying Proposition 6.17(2)
(cf. (6.21)), we have U X U K “ t0u, so u' = p(v) and u'' = v − p(v).
Corollary 6.26. Suppose u1 , . . . , un form an orthonormal system
(of a Euclidean vector space pV, x´, ´yq) such that V is spanned by
these vectors. Then
• the following formula holds for any v P V :
v = \sum_{i=1}^{n} \langle v, u_i \rangle u_i. \qquad (6.27)

• The vectors are necessarily linearly independent, i.e., they form


an orthonormal basis.
Proof. We apply Theorem 6.24 to these vectors. By the assumption
U “ V , so that by (1), ppvq “ v for all v P V . The first claim then holds by
(6.25).
If 0 = \sum_{k=1}^{n} a_k u_k is a linear combination, we apply x´, ul y, for
any 1 ď l ď n:
0 = \langle 0, u_l \rangle = \left\langle \sum_{k=1}^{n} a_k u_k, u_l \right\rangle = \sum_{k=1}^{n} a_k \langle u_k, u_l \rangle.

In this sum, all terms except the one with k “ l are zero, since
uk Kul for k ‰ l. We also have xul , ul y “ 1, which shows that al “ 0,
and therefore the linear independence of the given vectors.
Example 6.28. The standard basis e1 , . . . , en of Rn is an orthonormal
basis. For v = (v_1, \ldots, v_n), we have xei , vy “ vi and the representation
in (6.27) is the usual expansion of v:
v “ v1 e1 ` ¨ ¨ ¨ ` vn en .
In general, the identity (6.27) is a convenient way to compute the
coordinates of a given vector in terms of an (orthonormal) basis.
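A small illustration of (6.27) (a sketch, not part of the notes, assuming NumPy; the orthonormal basis below is a rotated standard basis chosen only for the example):

import numpy as np

# illustrative sketch (not from the notes); assumes NumPy
u1 = np.array([np.cos(0.3), np.sin(0.3)])
u2 = np.array([-np.sin(0.3), np.cos(0.3)])   # u1, u2: an orthonormal basis of R^2

v = np.array([2.0, -1.0])
coords = [np.dot(v, u1), np.dot(v, u2)]      # the coordinates <v, u_i>
reconstructed = coords[0] * u1 + coords[1] * u2
print(np.allclose(reconstructed, v))         # True: v = <v,u1> u1 + <v,u2> u2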

Using these results, one can quickly prove:


Corollary 6.29. If U Ă V is a subspace of a finite-dimensional
Euclidean space then

dim U K “ dim V ´ dim U.

The presence of a positive definite (symmetric) matrix yields the


following algorithmic device that constructs a particularly conve-
nient set of vectors.
Proposition 6.30. (Gram–Schmidt orthogonalization) Let v1 , . . . , vr
be any set of linearly independent vectors (in an Euclidean space).
Then the vectors w1 , . . . , wr defined inductively as follows are an
orthonormal system: They are constructed as follows
w_1 := \frac{1}{||v_1||} v_1 \quad (normalization)
w_2' := v_2 - \langle v_2, w_1 \rangle w_1 \quad (orthogonalization w.r.t. L(w_1))
w_2 := \frac{1}{||w_2'||} w_2' \quad (normalization)
\vdots
w_r' := v_r - \sum_{k=1}^{r-1} \langle v_r, w_k \rangle \cdot w_k \quad (orthogonalization w.r.t. L(w_1, \ldots, w_{r-1}))
w_r := \frac{1}{||w_r'||} w_r' \quad (normalization)
We have
Lpv1 , . . . , vr q “ Lpw1 , . . . , wr q.
In particular, if the vi form a basis, then so do the wi , i.e., they
then form an orthonormal basis. Yet more in particular, this shows
that any finite-dimensional Euclidean space admits an orthonormal
basis.

Proof. In each step, the vector w_r' is constructed in such a way that
it is orthogonal to the preceding vectors w_1, \ldots, w_{r-1}, cf. (6.25).
The division by the norm of w_r' ensures that ||w_r|| “ 1. This is
possible because w_r' ‰ 0 (by the linear independence of the v_i), so
that ||w_r'|| ą 0 since x´, ´y is positive definite.

Example 6.31. We consider A “ id2 , i.e., the standard scalar product on R2 , and v_1 = \begin{pmatrix} 1 \\ 1 \end{pmatrix} and v_2 = \begin{pmatrix} 2 \\ 1 \end{pmatrix}. (One checks this is a basis of R2 !) Then
w_1 = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix}
w_2' = v_2 - \langle v_2, w_1 \rangle w_1 = \begin{pmatrix} 2 \\ 1 \end{pmatrix} - \frac{3}{\sqrt{2}} \cdot \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix} = \frac{1}{2} \begin{pmatrix} 1 \\ -1 \end{pmatrix}
w_2 = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ -1 \end{pmatrix}.
Here is an illustration of the method in this example. The blue line
depicts the vectors of the form v_2 + a w_1 for a P R. The vector w_2' is
the vector on that line that is orthogonal to w_1 :
[Figure: the vectors v_1, v_2, w_1, w_2' and w_2 in the plane.]
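The procedure of Proposition 6.30 can be written down directly in code. The following sketch is not part of the notes, the helper name is ours, and it assumes NumPy and the standard scalar product; it reproduces w_1 and w_2 of Example 6.31:

import numpy as np

def gram_schmidt(vectors):
    # illustrative sketch (not from the notes); assumes NumPy
    # vectors: linearly independent 1d arrays; returns an orthonormal system
    result = []
    for v in vectors:
        w = np.array(v, dtype=float)
        for u in result:
            w = w - np.dot(v, u) * u           # orthogonalize against previous vectors
        result.append(w / np.linalg.norm(w))   # normalize
    return result

w1, w2 = gram_schmidt([np.array([1.0, 1.0]), np.array([2.0, 1.0])])
print(w1)   # approximately (1, 1)/sqrt(2)
print(w2)   # approximately (1, -1)/sqrt(2)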

Corollary 6.32. Let U Ă V be a subspace of a finite-dimensional


Euclidean space V . Then there are two unique linear maps, called
the orthogonal projection onto U , resp. onto U K ,
pU :V Ñ U
pU K :V Ñ U K
such that every vector v P V can be written as
v “ pU pvq ` pU K pvq. (6.33)

Proof. By Proposition 6.30, U has an orthonormal basis, so we can apply The-


orem 6.24, which gives us the orthogonal projection pU : V Ñ U .
If we define pU K pvq :“ v ´ pU pvq, (6.33) holds by design, moreover,
pU K pvq P U K again by Theorem 6.24. The unicity of a decomposition
as in (6.33) is again part of Theorem 6.24.

6.4 Orthogonal and symmetric matrices


Definition 6.34. A real square matrix A P Matnˆn is called or-
thogonal if
AAT “ id.
This is equivalent to saying that A is invertible and A´1 “ AT .
The following lemma explains the name “orthogonal”.
Lemma 6.35. For a square matrix A P Matnˆn , the following are
equivalent:
(1) A is orthogonal,
(2) the n rows are an orthonormal basis of Rn ,
(3) the n columns are an orthonormal basis of Rn .
Proof. If ei is the i-th standard basis vector, we know that Aei is
the i-th column of A. We compute
xAei , Aej y “ pAei qT pAej q “ eTi AT Aej .
The vector AT Aej is the j-th column of AT A, and the number
eTi AT Aej is the i-th entry of that vector. Thus, saying that the
above expression equals 1 for i “ j and 0 otherwise is equivalent to
requiring AT A “ id.
Theorem 6.36. The following conditions are equivalent for an n ˆ
n-matrix A:
(1) A is orthogonally diagonalizable, i.e., there is an orthogonal ma-
trix P such that P ´1 AP is a diagonal matrix,
(2) A has an orthonormal eigenbasis,
(3) A is symmetric.
If these equivalent conditions hold, then the columns of P form an
orthonormal eigenbasis and vice versa. (Note that P ´1 “ P T can
be computed without computing, properly speaking, the inverse of
P .)

For a proof of this, see, e.g. [Nic95, Theorem 8.2.2]. The vectors
of an orthonormal eigenbasis are also called the principal axes of A.
The theorem is sometimes called the principal axes theorem. We
only point out that the difficult direction is to show that (3) ñ (1).
One does this by proving that a symmetric real matrix has only real
eigenvalues (as opposed to complex). For 2ˆ2-matrices, one can see
this by direct computation (see also Exercise 5.2): the characteristic
polynomial of a symmetric 2 ˆ 2-matrix A = \begin{pmatrix} a & b \\ b & d \end{pmatrix} is
b d

χA ptq “ detpA ´ tidq “ pa ´ tqpd ´ tq ´ b2 “ t2 ` p´a ´ dqt ` ad ´ b2 .

The zeroes of this polynomial are given by
\lambda_{1/2} = \frac{a+d}{2} \pm \sqrt{\frac{(a+d)^2}{4} - ad + b^2}
            = \frac{a+d}{2} \pm \sqrt{\frac{a^2+d^2}{4} - \frac{ad}{2} + b^2}
            = \frac{a+d}{2} \pm \sqrt{\frac{(a-d)^2}{4} + b^2}.
The expression in the square root is always non-negative, so that
λ1{2 are real numbers.
As an example of a non-symmetric matrix with imaginary eigenvalues, we have seen in Example 5.21 that the matrix A = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} has the eigenvalues λ1{2 “ ˘i.
Example 6.37. The matrix A = \begin{pmatrix} 5 & -4 & 2 \\ -4 & 5 & 2 \\ 2 & 2 & -1 \end{pmatrix} is symmet-
ric. We compute an orthonormal eigenbasis by first computing the
eigenvalues:
χA ptq “ ´t3 ` 9t2 ` 9t ´ 81.
The eigenvalues and an eigenvector for them are as follows:
• λ1 “ 9, v1 “ p´1, 1, 0q,
• λ2 “ 3, v2 “ p1, 1, 1q,
• λ3 “ ´3, v3 “ p´1, ´1, 2q.

These three vectors are orthogonal; this is seen by direct computa-


tion. Alternatively, since the eigenvalues are all distinct, they are
automatically orthogonal (Exercise 6.12). They are however not
normal, dividing by their norm gives an orthonormal eigenbasis:
˜ ¸
1
ˆ
´1
˙
1 1 1
ˆ
1
˙
? 1 ,? 1 ,? 1 .
2 0 3 1 6 ´2
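Numerically, an orthonormal eigenbasis of a symmetric matrix is produced by np.linalg.eigh; a sketch (not part of the notes, assuming NumPy) for the matrix of Example 6.37:

import numpy as np

# illustrative sketch (not from the notes); assumes NumPy
A = np.array([[ 5.0, -4.0,  2.0],
              [-4.0,  5.0,  2.0],
              [ 2.0,  2.0, -1.0]])

eigenvalues, Q = np.linalg.eigh(A)        # columns of Q: an orthonormal eigenbasis
print(eigenvalues)                         # approximately [-3., 3., 9.]
print(np.allclose(Q.T @ Q, np.eye(3)))     # True: Q is orthogonal
print(np.allclose(Q.T @ A @ Q, np.diag(eigenvalues)))   # True: A is orthogonally diagonalized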

6.5 Affine subspaces


Definition 6.38. Let V be a vector space. An affine subspace of V
is a subset of the form
v0 ` W :“ tv0 ` w |w P W u
for an appropriate vector v0 P V and a subspace W Ă V .
In other words, an affine subspace is obtained by translating a
subspace (i.e., a sub-vector space) by a certain vector. For example,
any line or a plane in R3 that is not necessarily passing through the
origin is an affine subspace.
Lemma 6.39. Let X “ v ` W Ă Rn be an affine subspace. If
X “ v 1 ` W 1 for any vector v 1 P Rn and a subspace W 1 Ă Rn , then
the following holds:
• W “ W 1 and
• v ´ v1 P W .
In other words, the sub-vector space W is uniquely determined by
X.
Proof. If v ` W “ v 1 ` W 1 , then v ´ v 1 P W . This implies
W' = -v' + (v' + W') = -v' + X = \underbrace{(-v' + v)}_{\in W} + W = W.

Here we have used that for a subspace A Ă Rn (such as A “ W ),


and an element a P A, we have a ` A “ A.
We can therefore define the dimension of an affine subspace as
dim X “ dim W , if X “ v ` W as above.

Definition 6.40. For two affine subspaces X, X 1 Ă Rn we say that


two points x P X, x1 P X 1 realize the minimal distance of X and X 1
if
dpx, x1 q ď dpy, y 1 q
for any two points y P X, y 1 P X 1 . In this event, we also write
dpX, X 1 q :“ dpx, x1 q for that minimal distance.
Proposition 6.41. Let X “ v0 ` W be an affine subspace of a
Euclidean space V . There is a unique vector v P V characterized by
the following equivalent properties:
(1) v is an element of X X W K ,
(2) v realizes the minimal distance of the origin to X.
This vector v is given by
v “ pW K pv0 q “ v0 ´ pW pv0 q,
i.e., the projection of v0 onto the orthogonal complement W K .
Proof. We first prove that X X W K contains v as defined above.
Indeed, by Theorem 6.24, we can write
v0 “ w ` v
with uniquely determined w “ pW pv0 q P W and v “ pW K pvq P W K .
This means that v “ v0 ´ w P X X W K .
We now prove that X X W K consists only of that vector v. If
another vector v 1 P W K X X, then v 1 “ v0 ` w̃ for w̃ P W , so that
v ´ v 1 “ w ´ w̃ P W X W K “ t0u, so that v “ v 1 .
We prove that this vector v realizes the minimal distance to the
origin. To this end, let x P X be any vector. We need to prove
||x|| ě ||v||. Set w :“ x ´ v; then w P W and x “ v ` w. We can then compute
||x|| = ||v + w||
     = \sqrt{\langle v + w, v + w \rangle}
     = \sqrt{\langle v, v \rangle + 2 \underbrace{\langle v, w \rangle}_{=0} + \langle w, w \rangle} \quad\text{(by bilinearity and symmetry)}
     = \sqrt{||v||^2 + ||w||^2}
     \ge ||v||.
Here is a picture of the proof idea:

w “ pW pv0 q
v0 “ w ` v

v “ pW K pvx0 q
z

W
v0 ` W

We finally show that a vector x P X with minimal distance to


the origin agrees with v:

||x||^2 = ||\underbrace{(x - v)}_{=: w \in W} + v||^2
       = \langle w + v, w + v \rangle
       = ||w||^2 + 2 \underbrace{\langle w, v \rangle}_{=0} + ||v||^2 \quad\text{(bilinearity)}
       = ||w||^2 + ||v||^2 \quad\text{(since } v \in W^{\perp}\text{)}.
Since ||x|| “ ||v||, this implies ||w||2 “ 0, i.e., w “ 0, i.e., x “ v.

Definition 6.42. A hyperplane in Rn is an affine subspace H of


dimension n ´ 1, i.e., an affine subspace of the form

H “ v0 ` W

where W is a subspace with dim W “ n ´ 1.

For example, a line is a hyperplane in R2 , and a plane is a hy-


perplane in R3 .
Proposition 6.43. Let a = \begin{pmatrix} a_1 \\ \vdots \\ a_n \end{pmatrix} P Rn be a non-zero (column)
vector, and let b P R. Then the subset
H :“ tx P Rn |xx, ay “ bu
is a hyperplane. Its distance to the origin is given by
d(0, H) = \frac{|b|}{||a||}.
Proof. We show that H is a hyperplane. Indeed, the equation
xx, ay “ b, which can be rewritten as
a1 x 1 ` ¨ ¨ ¨ ` an x n “ b
` ˘
is a (non-homogeneous) linear system and the matrix a1 . . . an
has rank 1, since the vector is nonzero. Therefore H has dimension
n ´ 1.
Let W :“ tx P Rn |xx, ay “ 0u be the associated subspace. Then
H “ v ` W for some v P Rn , according to Theorem 3.35. Thus,
a P W K . If we set λ := \frac{b}{||a||^2}, we have λa P H:
\left\langle \frac{b}{||a||^2} a, a \right\rangle = \frac{b}{||a||^2} \langle a, a \rangle = b.
Therefore λa P H X W K . Thus, by Proposition 6.41, λa is the closest
vector (in H) to the origin, and we have
d(0, H) = ||\lambda a|| = \frac{|b|}{||a||}.
Above we saw that an equation of the form
xx, ay “ b
for fixed a ‰ 0 and b P R determines a hyperplane. Here is a
converse to this statement.
Proposition 6.44. (Hesse normal form of a hyperplane) Let H “
v0 ` W Ă Rn be a hyperplane, and let d “ dp0, Hq be its distance
to the origin. Then there is a unique vector a P Rn such that
(1) ||a|| “ 1,
6.5. AFFINE SUBSPACES 179

(2) a P W K , i.e., a is orthogonal to the underlying subspace W ,
(3) H “ tx P Rn | xx, ay “ du.
This vector can be computed as
v
a“ ,
||v||
where v is the unique element in H X W K or (equivalently) the point
in H that is closest to the origin.
The equation xx, ay “ d (which is a linear equation in the un-
knowns x1 , . . . , xn ) is called the Cartesian equation of the hyper-
plane.
Example 6.45. We continue the example in Example 6.22:
W = L\left(\begin{pmatrix} 1 \\ 2 \\ 4 \end{pmatrix}, \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix}\right), \qquad W^{\perp} = L\left(\begin{pmatrix} 4 \\ -4 \\ 1 \end{pmatrix}\right),
and consider the hyperplane H = \begin{pmatrix} 11 \\ 11 \\ 11 \end{pmatrix} + W. We compute H X W K , which by Proposition 6.41 requires finding w P W and v P W K such that
v_0 = \begin{pmatrix} 11 \\ 11 \\ 11 \end{pmatrix} = w + v = a \begin{pmatrix} 1 \\ 2 \\ 4 \end{pmatrix} + b \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix} + c \begin{pmatrix} 4 \\ -4 \\ 1 \end{pmatrix}.
We compute the inverse of A = \begin{pmatrix} 1 & 1 & 4 \\ 2 & 1 & -4 \\ 4 & 0 & 1 \end{pmatrix} using Theorem 3.78
(alternatively, one can also use the adjugate matrix, as in Theorem 4.13). The result is
A^{-1} = \frac{1}{33} \begin{pmatrix} -1 & 1 & 8 \\ 18 & 15 & -12 \\ 4 & -4 & 1 \end{pmatrix}.
According to Theorem 3.66, the above system therefore has a unique
solution, given by
A^{-1} v_0 = \frac{1}{3} \begin{pmatrix} 8 \\ 21 \\ 1 \end{pmatrix}.

1
Thus, c “ 3
above, so that
ˆ ˙
1 4
v“ ´4 .
3 1

According to Proposition 6.41, this is the closest vector in H to the
origin, and the distance of H to the origin is given by
d = ||v|| = \sqrt{\frac{33}{9}} = \sqrt{\frac{11}{3}}.
In addition,
a = \frac{1}{\sqrt{33}} \begin{pmatrix} 4 \\ -4 \\ 1 \end{pmatrix}.
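The linear algebra in this example is easy to check numerically; a sketch (not part of the notes, assuming NumPy):

import numpy as np

# illustrative sketch (not from the notes); assumes NumPy
A = np.array([[1.0, 1.0,  4.0],
              [2.0, 1.0, -4.0],
              [4.0, 0.0,  1.0]])   # columns: v1, v2 spanning W and the generator of W-perp
v0 = np.array([11.0, 11.0, 11.0])

abc = np.linalg.solve(A, v0)
print(abc)                          # approximately [2.667, 7.0, 0.333], i.e. (8/3, 7, 1/3)

v = abc[2] * np.array([4.0, -4.0, 1.0])
print(v)                            # the closest point of H to the origin, (4/3, -4/3, 1/3)
print(np.linalg.norm(v))            # the distance sqrt(11/3), approximately 1.915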

6.5.1 Lower dimensional affine subspaces


This representation of hyperplanes can also be used to understand
the geometry of subspaces of smaller dimension. For simplicity, we
discuss this in the special case of lines in R3 . A line L Ă R3 can be
described in two ways:
(1) L can be described as an affine subspace L “ v ` Lpwq for
appropriate vectors v, w P R3 . I.e., the points in L are of the
form v ` λw for λ P R. This can be spelled out for each of the
three components:
xk “ vk ` λwk for k “ 1, 2, 3. (6.46)
This system is referred to as the system of vector equations.
(2) L can also be described by a system of two equations
a1 x 1 ` a2 x 2 ` a3 x 3 “ b
a11 x1 ` a12 x2 ` a13 x3 “ b1 .
This system is referred to as the system of cartesian equations
of L. If we write x = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} etc., it can be rewritten more
compactly as
xx, ay “ b
xx, a1 y “ b1 .
Each of these two equations describes a hyperplane in R3 , i.e.,
a plane, and the line is the intersection of these planes.

One can pass from (1) to (2) by eliminating λ in (6.46). Conversely,


in order to present L as an affine subspace, i.e., in the form
L “ v ` Lpwq,
we solve the above linear system.
Example 6.47. The following equations
x`y´1“0
3x ` y ´ 2z ´ 1 “ 0

determine a line L Ă R3 . We compute a representation L “ v ` W


by solving the system:
\begin{pmatrix} 1 & 1 & 0 & 1 \\ 3 & 1 & -2 & 1 \end{pmatrix} ⇝ \begin{pmatrix} 1 & 1 & 0 & 1 \\ 0 & -2 & -2 & -2 \end{pmatrix},
so the solutions are y “ 1 ´ z, x “ 1 ´ y “ z, i.e.,
L = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} + L\left(\begin{pmatrix} 1 \\ -1 \\ 1 \end{pmatrix}\right).

Example 6.48. The line


L “ p1, 0, 1q ` Lp2, 1, ´1q,
is described by the vector equations
x1 “ 1 ` 2λ
x2 “ λ
x3 “ 1 ´ λ.
The cartesian equations can be determined by observing that λ “
x2 , so that the other two equations read
x1 “ 1 ` 2x2
x3 “ 1 ´ x2

which can be rewritten as


x1 ´ 2x2 “ 1
x2 ` x3 “ 1
182 CHAPTER 6. EUCLIDEAN SPACES

or, yet equivalently,
\left\langle x, \begin{pmatrix} 1 \\ -2 \\ 0 \end{pmatrix} \right\rangle = 1
\left\langle x, \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix} \right\rangle = 1.

The planes P containing the given line L, i.e. such that L Ă P


can be characterized by the equation
xx, λa ` λ1 a1 y “ λb ` λ1 b1 ,
where λ, λ1 P R are arbitrary such that λa ` λ1 a1 ‰ 0. Indeed, this
equation does describe a (hyper)plane, and if x P L, then it satisfies
this latter equation.
Example 6.49. The line defined by the equations
L : x_2 = 0, \; x_3 = 1
can be written as \left\langle x, \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} \right\rangle = 0 and \left\langle x, \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} \right\rangle = 1. (It
can also be written as \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} + L\left(\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}\right).) Thus, the planes P
containing L are all of the form
\left\langle x, \begin{pmatrix} 0 \\ \lambda \\ \lambda' \end{pmatrix} \right\rangle = \lambda',
for arbitrary λ, λ1 P R. Note that the vectors \begin{pmatrix} 0 \\ \lambda \\ \lambda' \end{pmatrix} are precisely
the vectors orthogonal to the vector \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}.

Given another line L1 “ v 1 ` Lpw1 q, this conveniently allows one to
determine the planes P containing the given line L that are parallel
to L1 . The line L1 is parallel to P exactly if
w1 K λa ` λ1 a1
for appropriate λ, λ1 P R.
Example 6.50. Continuing the example above, let
L1 : z “ 2, \; x “ y.
It is given by L1 = \begin{pmatrix} 0 \\ 0 \\ 2 \end{pmatrix} + L\left(\begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix}\right). We solve the equation
w1 = \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix} \perp \begin{pmatrix} 0 \\ \lambda \\ \lambda' \end{pmatrix};
it gives λ “ 0, and λ1 ‰ 0 is arbitrary. Thus, for any λ1 , the plane
defined by the equation
\left\langle x, \begin{pmatrix} 0 \\ 0 \\ \lambda' \end{pmatrix} \right\rangle = \lambda'
is parallel to L1 and contains L. This gives the equation
\left\langle x, \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} \right\rangle = 1
or, more concretely, x3 “ 1.

6.6 Distance between two affine subspaces


Theorem 6.51. Let X “ v ` W , X 1 “ v 1 ` W 1 be two affine sub-
spaces. Let us write d :“ v ´ v 1 and Z :“ W ` W 1 (Definition 2.34).
Let
m :“ pZ K pdq “ d ´ pZ pdq
be the orthogonal projection of d onto Z K (Corollary 6.32).
For two points x P X and x1 P X 1 the following are equivalent:
(1) x ´ x1 “ m.
(2) dpx, x1 q “ ||m||.
(3) x and x1 realize the minimal distance of X and X 1 , i.e., dpx, x1 q “
dpX, X 1 q.
(4) The vector x ´ x1 is orthogonal to W and to W 1 (i.e., x ´ x1 is
orthogonal to any w P W , w1 P W 1 ).
In particular, X intersects X 1 if and only if d P Z.

Proof. Here is a picture of the geometric ideas in the proof. For


simplicity of the picture, we choose v 1 “ 0, so that X 1 “ W 1 and
d “ v.

Z “ W ` W1
X “v`W
x
x1
pZ pvq v

X“W

ZK
m “ pZ K pvq

X1 “ W 1

(1) ñ (2) is obvious since dpx, x1 q “ ||x ´ x1 ||.


We next prove the equivalence (3) ô (2). We write a point x P X
as x “ v ` w with an arbitrary vector w P W . Likewise, x1 “ v 1 ` w1 .
We then have

dpx, x1 q “ ||x ´ x1 || “ ||v ´ v 1 ` w ´ w1 || “ ||d ` w ´ w1 ||.

The vector w ´ w1 is an arbitrary vector in the sum Z “ W ` W 1


(notice that for any w1 P W 1 , also ´w1 P W 1 ).
Therefore, we are seeking the point z P Z “ W ` W 1 such that
||d ` z|| is minimal. This is just the distance of the affine subspace
d ` Z to the origin. According to Proposition 6.41, this distance is
given by ||m|| “ ||pZ K pdq|| “ ||d ´ pZ pdq||, and m is the unique vector

in Z realizing that minimal distance. This shows the equivalence of


(3) and (2).
(3) ñ (4): let x P X and x1 P X 1 be two points realizing that
minimal distance: dpx, x1 q “ ||m||. In particular, this means that
x1 P X 1 is the point realizing the minimal distance to x. Again by
Proposition 6.41, x1 ´ x is therefore orthogonal to W 1 . Switching
the role of X and X 1 we obtain similarly that x ´ x1 is orthogonal
to W .
(4) ñ (1): Our assumption means that
x ´ x1 P W K X W 1K “ pW ` W 1 qK “ Z K .
To see the latter equality note that some vector is orthogonal to W `
W 1 precisely if it is orthogonal to W and to W 1 , by the bilinearity
of x´, ´y. We use this remark as follows: from
x ´ x1 “ v ` w ´ v 1 ´ w 1
we get
d “ v ´ v 1 “ lo
xo´mox
1
w1 ´ w .
on ` loomoon
PZ K PZ

By the unicity of the representation of d as a sum of a vector in Z K


and one in Z, this means that x ´ x1 “ pZ K pdq “ m.
Example 6.52. We consider the two lines in R3
X “ p2, ´1, 3q ` Lp1, 1, ´2q “ v ` W
X 1 “ p´3, 0, 0q ` Lp0, 2, 4q “ v 1 ` W 1 .
The general vectors of X and X 1 are of the following form, for a, b P
R.
x “ p2, ´1, 3q ` ap1, 1, ´2q “ p2 ` a, ´1 ` a, 3 ´ 2aq
1
x “ p´3, 0, 0q ` bp0, 2, 4q “ p´3, 2b, 4bq
1
x´x “ p5 ` a, ´1 ` a ´ 2b, 3 ´ 2a ´ 4bq
We compute the minimal distance of X and X 1 by considering the
condition x ´ x1 Kp1, 1, ´2q and x ´ x1 Kp0, 2, 4q. This gives the fol-
lowing homogeneous linear system
0 “ p5 ` aq ` p´1 ` a ´ 2bq ´ 2p3 ´ 2a ´ 4bq “ ´2 ` 6a ` 6b
0 “ 2p´1 ` a ´ 2bq ` 4p3 ´ 2a ´ 4bq “ 10 ´ 6a ´ 20b.

This can be solved to b = \frac{4}{7} and a = -\frac{5}{21}. The points x and x1 and
their distance are then readily computed.
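For completeness, here is how one might carry out that computation numerically (a sketch, not part of the notes, assuming NumPy): the two orthogonality conditions form a 2 ˆ 2 linear system in a and b.

import numpy as np

# illustrative sketch (not from the notes); assumes NumPy
v,  w  = np.array([2.0, -1.0, 3.0]), np.array([1.0, 1.0, -2.0])   # X  = v  + a*w
v2, w2 = np.array([-3.0, 0.0, 0.0]), np.array([0.0, 2.0,  4.0])   # X' = v2 + b*w2

# x - x' = (v - v2) + a*w - b*w2 must be orthogonal to w and to w2:
M = np.array([[np.dot(w, w),  -np.dot(w2, w)],
              [np.dot(w, w2), -np.dot(w2, w2)]])
rhs = -np.array([np.dot(v - v2, w), np.dot(v - v2, w2)])
a, b = np.linalg.solve(M, rhs)
print(a, b)                        # approximately -5/21 and 4/7

x, x2 = v + a * w, v2 + b * w2
print(np.linalg.norm(x - x2))      # the minimal distance between the two lines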

Definition 6.53. Let X, X 1 be two affine subspaces. Let W, W 1 Ă


Rn be the associated sub-vector spaces, as per Lemma 6.39.
• We say X intersects X 1 if X X X 1 ‰ H.
• We say X is parallel to X 1 if W Ă W 1 or if W 1 Ă W .
• We say that X is skew to X 1 if W XW 1 “ t0u and if XXX 1 “ H.

Example 6.54. We examine the relative position of the lines

X “ p1, ´3, 5q ` Lp1, ´1, 2q “ tp1 ` s, ´3 ´ s, 5 ` 2sq | s P Ru
X 1 “ p4, ´3, 6q ` Lp´1, 1, 2q “ tp4 ´ t, ´3 ` t, 6 ` 2tq | t P Ru.

The two subspaces W and W 1 are spanned by p1, ´1, 2q and p´1, 1, 2q,
respectively. These two vectors are linearly independent, so that the
lines are not parallel. We determine whether they have an intersec-
tion point by solving the system

p1 ` s, ´3 ´ s, 5 ` 2sq “ p4 ´ t, ´3 ` t, 6 ` 2tq.

Considering the first two equations gives s ` t “ 3 and s ` t “ 0,


which has no solution. Thus X X X 1 “ H, which means that the
lines are skew.

6.7 Exercises
Exercise 6.1. Let V “ Pď2 “ tat2 ` bt ` c | a, b, c P Ru be the
vector space of (real) polynomials of degree ď 2. We consider the
scalar product in Example 6.16(3), i.e.,
\langle p, q \rangle = \int_{-1}^{1} p(x)q(x)\,dx.

• Let e1 “ 1, e2 “ t and e3 “ t2 . (These vectors form a basis of


Pď2 .) Compute xei , ej y for 1 ď i, j ď 3.
• Apply the Gram–Schmidt orthogonalization procedure to this
basis.
6.7. EXERCISES 187

Exercise 6.2. (Solution at p. 234) Consider the subspace U Ă R3


given by the solutions of the homogeneous linear system
x ´ y ` 3z “ 0.
(1) Find a basis of U .
(2) Compute a basis of U K . What is dim U K ?
(3) Consider t “ p0, 1, 5q. Find its orthogonal projection onto U
(recall from Corollary 6.32 that t “ tU ` tK with uniquely deter-
mined vectors tU P U and tK P U K . The orthogonal projection
of t onto U is then the vector tU .)
Exercise 6.3. Consider the subspace W Ă R4 given by the equa-
tions
x´t“0
y`z´t“0

(where x, y, z, t are the coordinates of R4 ).


(1) Compute a basis of W and of W K .
(2) Compute the orthogonal projection of t “ p1, 5, 1, 6q onto W .
Exercise 6.4. (Solution at p. 235) Consider the subspace U Ă R3
given by the equations
x“0
x`y`z “0

(where x, y, z are the coordinates of R3 ).


(1) Compute a basis of U and of U K .
(2) Compute the orthogonal projection of t “ p5, 1, 3q onto U .
Exercise 6.5. (Solution at p. 236) Compute the orthogonal com-
plement of T “ Lpp1, 0, ´3qq.
Exercise 6.6. (Solution at p. 236) Is there a subspace U Ă R3 such
that
(1) the orthogonal projection of t “ p1, 1, 0q onto U is given by
p1, 5, 6q?

(2) the orthogonal projection of t “ p2, 0, 1q onto U is given by


p1, 1, 1q?
ˆ ˙ ˆ ˙
1 1
Exercise 6.7. (Solution at p. 237) Let L “ 3 ` Lp 1 q.
5 4
Compute the closest point of L to the origin, and its distance to the
origin.

Exercise 6.8. (Solution at p. 237) Consider the two lines L : x “


1 ` t, y “ t, z “ 2 ` t, t P R and L1 : x ´ 3 “ y ´ 1 “ z ´ 3. Are they
parallel? Compute the distance between L and L1 .

Exercise 6.9. (Solution at p. 238) Are the lines


L : x = y - 1 = -z \qquad\text{and}\qquad L' : x - 2 = -y = \frac{z}{2}
identical, parallel, or skew? Compute their distance.

Exercise 6.10. (Solution at p. 238) Let P be the plane given by


the equation
4x ` 5y ` 10z ´ 20 “ 0.
Let L be the line given by the equations x “ 0, y “ 5 ´ z.
(1) Sketch P and L.
(2) Compute the orthogonal complement of the underlying vector
space W of P .
(3) Compute the point of P that is closest to the origin and its
distance to the origin.
(4) Are P and L parallel?

Exercise 6.11. (Solution at p. 240) Which of the following matrices


is orthogonally diagonalizable? If so, find an orthonormal eigenbasis
of R2 .
(1) A = \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix}
(2) A = \begin{pmatrix} 1 & 2 \\ -2 & 1 \end{pmatrix}
(3) A = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}.

Exercise 6.12. (Solution at p. 241) Let A be a symmetric matrix


and λ ‰ µ two distinct eigenvalues of A, with eigenvectors v and
w, respectively. Then vKw, i.e., eigenvectors of distinct eigenvalues
are orthogonal.
Exercise 6.13. (Solution at p. 241) Let P Ă R4 by the hyperplane
given by
2x1 ` x3 ´ x4 “ 4,
where px1 , . . . , x4 q are the coordinates of R4 . For a parameter t P R,
let Lt be the line
Lt “ p1, 0, 0, ´2tq ` Lpt, 1, 0, ´1q.
• For which t P R is Lt parallel to P ?
• Let now t = -\frac{1}{2} and consider the line L = L_{-1/2}. Determine the
pair(s) of points pp, lq with p P P and l P L such that their
distance is minimal.
Exercise 6.14. Let L Ă R3 be the line defined be the system
x`z “1
y ` z “ ´2.
Let L1 be the line in R3 passing through the points p0, 0, 1q and
p0, 1, 1q.
• Present L as L “ v ` W for a subspace W Ă R3 . Do the same
for L1 .
• Are L and L1 a) identical, b) parallel or c) skew?
• Compute the Cartesian equation (i.e., in the form ax ` by ` cz “
d, for appropriate values of a, . . . , d) of the plane P Ă R3 that
contains L and is parallel to L1 .
• Let l “ p2, ´1, ´1q P L. Compute a point l1 P L1 such that the
line passing through l and l1 is parallel to the plane given by
the equation x ` z ´ 1 “ 0.
Exercise 6.15. Let V “ Pď2 be the vector space of polynomials of
degree ď 2. We write elements of V as pptq “ a ` bt ` ct2 , where
a, b, c P R. Define
\langle p, q \rangle := \int_{-1}^{1} p(t)q(t)\,dt.

• Confirm that x´, ´y is a scalar product on V .

• Compute an orthonormal basis of V .

• Consider the map


f : V \to V, \qquad f(p) := t \, \frac{\partial p}{\partial t}
(i.e., it maps a polynomial p to the product of the indetermi-
nate t with the derivative of p with respect to the variable t).
Confirm that this map is linear. Compute the matrix of f with
respect to the standard basis e1 “ 1, e2 “ t and e3 “ t2 . Is this
basis an eigenbasis for f ? Compute dim ker f and dim im f .

• Does the map f have an orthonormal eigenbasis?

Exercise 6.16. Consider the subspace U Ă R4 given by the solu-


tions of the equation

x1 ´ x2 ` x3 ` 2x4 “ 0.

(As usual x1 , . . . , x4 are the coordinates of R4 .)


(1) Find a basis of U . What is dim U ?
(2) Compute an orthonormal basis of U .
(3) Compute the orthogonal projection of v “ p2, 3, 0, 0q and of
w “ p2, 5, 3, 0q onto U .
(4) Compute U K .

Exercise 6.17. (Solution at p. 243) In the Euclidean space R4 ,


endowed with the standard scalar product, let U be the subspace
spanned by the vectors u1 “ p1, 2, 0, ´1q, u2 “ p0, ´4, 3, 4q.
(1) Compute an orthogonal basis of U .
(2) Compute a basis of U K .
(3) Compute the orthogonal projection of v “ p0, 5, 3, 4q onto U .
(4) Let w “ p2, ´1, 0, 2q. Decide whether there is a subspace L Ă R4
such that the orthogonal projection of w onto L is the vector
ℓ “ p1, 1, 2, 0q.

Exercise 6.18. (Solution at p. 244) Consider the following two lines


in R3 , where x, y, z are the coordinates:
" "
x`y´1“0 x ´ 2y ´ 1 “ 0
L: M:
2x ´ z ´ 1 “ 0 y´z`2“0

(1) Determine whether L and M are the same line, parallel, or skew.
(2) Compute the cartesian equation of the plane that contains the
line M and that is parallel to L. (Recall that a cartesian equa-
tion is of the form xx, ay “ d for an appropriate vector a and an
appropriate d P R.)
(3) Given the point l “ p0, 1, ´1q P L compute a point m P M such
that the line passing through l and m is parallel to the plane
defined by the equation 3x ´ z “ 0.
(4) Consider the family of planes πα : z “ α, for some parameter α P
R. Let rα “ L X πα and sα “ M X πα . Let mα be the midpoint
of the segment with endpoints rα and sα . Verify that the points
mα are all lying on the same line. Moreover, determine the
parametric equation of that line.

Exercise 6.19. (Solution at p. 245) Consider the points p “ p3, 1, 0q,


q “ p0, 1, 3q and r “ p´3, 0, ´3q P R3 . Let L be the line passing
through p and q. Determine the parametric equation of L, i.e., ex-
press L in the form L “ v ` W , for an appropriate vector v P R3
and a subspace W Ă R3 .
Does r lie on L?
Exercise 6.20. (Solution at p. 246) Consider the line L “ p3, 1, 0q`
Lp1, 0, ´1q. Is there a plane containing L and the line M given by
the system x ` z “ 2, x ´ 2y “ 2 (with x, y, z being the coordinates
of R3 )?
Exercise 6.21. (Solution at p. 246) Consider the line L “ p3, 1, 0q`
Lp1, 0, ´1q. Let p “ p´1, ´1, ´1q. Describe all the points q P R3
such that the line M passing through p and q intersects L orthogo-
nally (i.e., intersects it, and does so orthogonally).
Appendix A

Mathematical notation and terminology
Sets
t . . . u ("a set"): The elements of the set are written inside the braces. Example: t1, 2, 3u denotes the set consisting of the numbers 1, 2 and 3.
t . . . | . . . u ("the set of all . . . satisfying the condition . . . "): This denotes the set consisting of all objects satisfying a certain condition. Example: t all vegetables V | I eat V regularly u consists of all the vegetables that I eat regularly.
P ("is an element of"): If M is a set, the expression x P M means that x is a member of M . Example: ♢ P t♢, ♡, ♠, ♣u.
R ("is not an element of"): If M is a set, the expression x R M means that x is not a member of M . Example: ♢ R t♡, ♠, ♣u.
f : X Ñ Y ("f from X to Y "): A function f from a set X to another set Y . Example: f : tMonday, . . . , Sundayu Ñ ttrue, falseu is some function that assigns to any weekday either true or false. For example, f could indicate whether I go to school that day.
Ñ ("to"): The regular arrow is the symbol for a function.
ÞÑ ("maps to"): x ÞÑ y indicates that a particular element x P X is sent to (or "mapped to") the element y P Y . Example: Sunday ÞÑ false.
X ˆ Y ("the product of X and Y "): The product consists of all pairs px, yq with x P X and y P Y . Example: t0, 1u ˆ t0, 1u “ tp0, 0q, p0, 1q, p1, 0q, p1, 1qu.

Logic
ñ ("implies"): If A and B are two (mathematical) statements, then "A ñ B" means that if A holds then B also holds. Example: x ě 1 ñ x2 ě 1.
ô ("is equivalent to"): If A and B are two mathematical statements, then "A ô B" is an abbreviation for A ñ B and (at the same time) B ñ A. Example: x ě 0 ô x ` 1 ě 1.
:“ ("is defined to be"): x :“ 2 means that we define the variable x to take the value 2.

Numbers and arithmetic


Z: The set of all integers. Example: ´34, ´1, 0, 1, 2, 18, ¨ ¨ ¨ P Z, but 3{4 R Z.
Q: The set of all rational numbers. Example: ´3{16, ´3.3, ´1, 0, 2.4, 3{4 P Q, but ?3 R Q.
R: The set of all real numbers. Example: 0, 1, ´1, 1{2, ?3, π, e P R.
řne“1 ae ("sum"): This is an abbreviation for the sum of the ae , where e runs from 1 to n. (Here ae can be any expression depending on e.) It can also be written as a1 ` a2 ` ¨ ¨ ¨ ` an . Example: ř3e“1 e2 “ 1 ` 4 ` 9 “ 14.
Appendix B

Trigonometric functions

Angles can be measured in degrees or in radians. These are converted as follows:

angle (in degrees)      radian (no unit)
180˝                    π
90˝                     π{2
α                       α ¨ π{180
r ¨ 180{π               r

Geometrically, given an angle α (between 0 and 360˝ as in the picture below), the radian is the length of the yellow circle segment as shown:


[Figure: the unit circle with an angle α; the marked arc has length r, and the point on the circle has coordinates pcospαq, sinpαqq.]
A rotation by a positive number is counter-clockwise; conversely


negative numbers correspond to a clockwise rotation. For example,
a rotation by π2 is a counter-clockwise rotation by 90˝ . A rotation
by ´ π4 is a clockwise rotation by 45˝ .

Given any radian r, the ray that has an angle r between itself
and the positive x-axis meets the circle with radius 1 and mid-point
p0, 0q in exactly one point p. The trigonometric functions sin and
cos are defined to be the coordinates of that point:

p “ pcosprq, sinprqq.

For example, we have the following values:

r        0     π{6 (30˝ )   π{4 (45˝ )   π{3 (60˝ )   π{2 (90˝ )   . . .
sinprq   0     1{2          ?2{2         ?3{2         1            . . .
cosprq   1     ?3{2         ?2{2         1{2          0            . . .

[Figure: the graphs of sinpxq and cospxq.]
Appendix C

Solutions of selected exercises
C.1 Systems of linear equations

Solution of Exercise 1.6: If a ‰ 0 or b ‰ 0, then the equation


ax ` by “ c has infinitely many solutions. Indeed, if, say a ‰ 0, we
can subtract by and divide by a, which gives x “ pc ´ byq{a. Thus, for
any y P R, the pair px, yq “ ppc ´ byq{a, yq is a solution. A similar analysis
works if b ‰ 0. It remains to consider the case in which a “ 0 and
b “ 0. In this case the solution set of the equation depends on c:

• If c “ 0, then any pair px, yq is a solution. Indeed: 0x ` 0y “ 0


holds true then. Thus, if a “ b “ c “ 0, there are infinitely
many solutions.

• If c ‰ 0, the equation 0x ` 0y “ c has no solution, since the


left hand side is always 0, while the right hand side is nonzero.
So, in the case a “ b “ 0 but c ‰ 0, there is no solution.

Solution of Exercise 1.10: The matrix associated to the system


is ¨ ˛
1 2 ´1 0
˝ ´2 ´3 1 1 ‚.
0 1 ´1 1


The solution set is


tp2 ´ t, t ´ 1, tq | t P Ru.

Solution of Exercise 1.12: We apply Method 1.32. The matrix


associated to the system is
¨ ˛
1 ´1 1 0 ´2
˚ 0 0 1 ´1 1 ‹
˝ 1 ´1 0 1 ´3 ‚.
˚ ‹
1 ´1 3 ´2 0

We compute the reduced row-echelon form of that matrix using


Gaussian elimination (Method 1.30): we subtract the first row from
the third, which gives
¨ ˛
1 ´1 1 0 ´2
˚ 0 0 1 ´1 1 ‹
˝ 0 0 ´1 1 ´1 ‚.
˚ ‹
1 ´1 3 ´2 0

We then subtract the first row from the fourth:


¨ ˛
1 ´1 1 0 ´2
˚ 0 0 1 ´1 1 ‹
˝ 0 0 ´1 1 ´1 ‚.
˚ ‹
0 0 2 ´2 2

We add the second line to the third:


¨ ˛
1 ´1 1 0 ´2
˚ 0 0 1 ´1 1 ‹
˚ ‹.
˝ 0 0 0 0 0 ‚
0 0 2 ´2 2

We then add p´2q times the second line to the fourth (equivalently,
subtract 2 times the second line from the fourth):
¨ ˛
1 ´1 1 0 ´2
˚ 0 0 1 ´1 1 ‹
˚ ‹.
˝ 0 0 0 0 0 ‚
0 0 0 0 0

This matrix is in row-echelon form, with the leading 1’s being un-
derlined above. We finally bring it into reduced row-echelon form
by subtracting the second from the first line, which gives
¨ ˛
1 ´1 0 1 ´3
˚ 0 0 1 ´1 1 ‹
˚ ‹.
˝ 0 0 0 0 0 ‚
0 0 0 0 0
The matrix has no entry of the form 0 . . . 0 1, so the system does
have a solution. The first column of the matrix corresponds to the
variable x1 etc., so that the free variables are x2 and x4 . We let
x2 “ α, x4 “ β, where α and β are arbitrary real numbers. The
non-free variables x1 and x3 are uniquely determined by α and β.
To compute them, we use the equations obtained by the matrix
x3 ´ β “ 1
x1 ´ α ` β “ ´3
which we solve as x3 “ 1 ` β and x1 “ α ´ β ´ 3. Thus, the solution
set is
tpα ´ β ´ 3, α, 1 ` β, βq | α, β P Ru.
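As a machine cross-check of the row reduction above (a supplementary snippet, not part of the original text), sympy's rref reproduces the reduced row-echelon form and the pivot columns:

    from sympy import Matrix

    # Augmented matrix of the system in Exercise 1.12, as written above.
    A = Matrix([
        [1, -1, 1,  0, -2],
        [0,  0, 1, -1,  1],
        [1, -1, 0,  1, -3],
        [1, -1, 3, -2,  0],
    ])
    R, pivots = A.rref()
    print(R)       # the reduced row-echelon form computed above
    print(pivots)  # (0, 2): the columns of x1 and x3 carry the leading 1's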

Solution of Exercise 1.14: Hint: we will apply Gaussian elim-


ination, but it simplifies the calculations to do a certain change of
rows first. (Why is that allowed?)

Solution of Exercise 1.17: Suppose x1 “ 1 ´ t, x2 “ 2 ` 3t and


x3 “ 4t. We have to determine whether there is some t P R such
that for these choices of x1 , x2 , x3 , we have a solution of the given
system, i.e., whether
x1 ` x2 ` x3 “ p1 ´ tq ` p2 ` 3tq ` 4t “1
x1 ´ x3 “ 1 ´ t ´ 4t “ 0.
Simplifying these equations gives the system
6t ` 3 “ 0
1 ´ 5t “ 0.
This system has no solutions, so there is no t P R such that the
vector p1 ´ t, 2 ` 3t, 4tq is a solution to the original system.

Solution of Exercise 1.19: We substitute x1 “ 1 ` t, x2 “ t ` q


and x3 “ ´t ` 2q ` 1 into the given equation and get the equation

3p1 ` tq ` 2pt ` qq ´ p´t ` 2q ` 1q “ 5.

This simplifies to
4t ` 2 “ 5
which has the solution t “ 3{4. Since the variable q does not appear
in that equation it is a free variable. Thus, for all q P R, the vector
px1 “ 1 ` 3{4, x2 “ 3{4 ` q, x3 “ 1{4 ` 2qq “ p7{4, 3{4 ` q, 1{4 ` 2qq
satisfies the requested conditions. Note that these are infinitely
many solutions.

Solution of Exercise 1.20: We have to find a0 , . . . , a3 , so these


are the unknowns. The conditions amount to the linear (!) system

pp1q “ a0 ` a1 ` a2 ` a3 “0
pp2q “ a0 ` a1 ¨ 2 ` a2 ¨ 22 ` a3 ¨ 23 “ 3.

This can be rewritten as

a0 ` a1 ` a2 ` a3 “ 0
a0 ` 2a1 ` 4a2 ` 8a3 “ 3.

Using Gaussian elimination to solve this: the associated matrix is


ˆ ˙
1 1 1 1 0
.
1 2 4 8 3

Subtracting the first from the second row gives


ˆ ˙
1 1 1 1 0
.
0 1 3 7 3

Subtracting the second from the first yields a reduced row echelon
matrix: ˆ ˙
1 0 ´2 ´6 ´3
.
0 1 3 7 3

The variables a0 and a1 correspond to the leading 1’s, the variables


a2 and a3 are therefore free variables. Thus, there are infinitely many
solutions. One solution, for a2 “ a3 “ 0 is

a0 “ ´3, a1 “ 3,

so that
ppxq “ ´3 ` 3x
is a solution to the problem. Another solution would be a2 “ a3 “ 1,
which gives a1 “ ´7 and a0 “ 5, i.e.,

ppxq “ 5 ´ 7x ` x2 ` x3 .
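A quick sanity check of both polynomials found above (a supplementary snippet, not part of the original solution): evaluating them at 1 and 2 should give 0 and 3, respectively.

    def p1(x):
        return -3 + 3 * x                 # first solution found above

    def p2(x):
        return 5 - 7 * x + x**2 + x**3    # second solution found above

    for p in (p1, p2):
        print(p(1), p(2))                 # expect 0 and 3 for both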

C.2 Vector spaces

Solution of Exercise 2.13: A linear combination of A and B is


of the form
ˆ ˙ ˆ ˙
1 1 3 2
αA ` βB “ α `β
2 2 3 5

with α, β P R. Computing the left hand side, we need to find α and


β such that
ˆ ˙ ˆ ˙ ˆ ˙ ˆ ˙
α α 3β 2β α ` 3β α ` 2β ´1 0
` “ “ .
2α 2α 3β 5β 2α ` 3β 2α ` 5β 2 4

Comparing the entries of the matrix, this gives the linear system

α ` 3β “ ´1
α ` 2β “0
2α ` 3β “2
2α ` 5β “ 4.

The second gives α “ ´2β, inserting into the first gives ´2β ` 3β “
´1, which means β “ ´1. However, inserting into the third equation
gives ´4β ` 3β “ 2, so that β “ ´2, contradicting the previous
equation. Thus, there is no solution, so C is not a linear combination
of A and B.

Solution of Exercise 2.15: The system x ` y ` z ` t “ 0


corresponds to the matrix

p1 1 1 1q.

This matrix is already in reduced row echelon form: the leading one
is for the variable x, the variables y, z, t are free variables. Thus,

S “ tp´α ´ β ´ γ, α, β, γq | α, β, γ P Ru.

We have

S “ tp´α, α, 0, 0q ` p´β, 0, β, 0q ` p´γ, 0, 0, γq | α, β, γ P Ru


“ tαp´1, 1, 0, 0q ` βp´1, 0, 1, 0q ` γp´1, 0, 0, 1q | α, β, γ P Ru
“ Lpp´1, 1, 0, 0q, p´1, 0, 1, 0q, p´1, 0, 0, 1qq.

Solution of Exercise 2.16: By definition, S consists of all the


linear combinations of the three given vectors. These can be written
as

ap1, ´1, 0, 1q`bp2, 1, ´2, 0q`cp0, 0, 1, 1q “ pa`2b, ´a`b, ´2b`c, a`cq

for arbitrary a, b, c P R. The intersection is given by vectors as


above satisfying the linear system determining T , i.e.,

x1 “ a ` 2b
x2 “ ´a ` b
x3 “ ´2b ` c
x4 “a`c

such that

2pa ` 2bq ´ p´a ` bq ´ 3pa ` cq “ 0


2pa ` 2bq ` p´2b ` cq ` pa ` cq “ 0.

Simplifying these equations gives

3b ´ 3c “ 0
3a ` 2b ` 2c “ 0.

Thus b “ c and 3a ` 4c “ 0, i.e., a “ ´ 43 c, and c is a free vari-


able.
ˆ (Alternatively,
˙ the above system is associated to the matrix
0 3 ´3
, which can be brought into reduced row echelon
3 2 2
form.) Thus,

4
S X T “ t´ cp1, ´1, 0, 1q ` cp2, 1, ´2, 0q ` cp0, 0, 1, 1q | c P Ru
ˆ3 ˙
4 4 4
“ tc p´ , , 0, ´ q ` p2, 1, ´2, 0q ` p0, 0, 1, 1q | c P Ru
3 3 3
2 7 1
“ tcp , , ´1, ´ q | c P Ru
3 3 3
2 7 1
“ Lpp , , ´1, ´ qq.
3 3 3

Solution of Exercise 2.25: We have to find a vector v P W1 that


is also contained in W2 . This means that

v “ ap1, 0, 1q ` bp2, 1, 0q “ pa ` 2b, b, aq (C.1)

for some a, b P R and at the same time

v “ αp´1, ´1, 1q ` βp0, 3, 0q “ p´α, ´α ` 3β, αq

for some α, β P R. Comparing the two vectors gives the following


linear system, where a, b, α, β are the unknowns:

a ` 2b “ ´α
b “ ´α ` 3β
a “ α.

We solve this system: the last equation gives a “ α and, from the
first equation, b “ ´α. The second equation implies β “ 0. There
is no condition on α; thus α “ r for an arbitrary real number r P R.
Instead of solving the above system by hand, we may also use
Gaussian elimination to solve this linear system. The matrix is the

following (where the columns are for a, b, α, β, in that order):


¨ ˛ ¨ ˛ ¨ ˛
1 2 1 0 1 2 1 0 1 2 1 0
˝ 0 1 1 ´3 ‚ ⇝ ˝ 0 1 1 ´3 ‚ ⇝ ˝ 0 1 1 ´3 ‚
1 0 ´1 0 0 ´2 ´2 0 0 0 0 ´6
¨ ˛
1 2 1 0
⇝ ˝ 0 1 1 ´3 ‚.
0 0 0 1

The three leading ones are for the variables a, b, β, and α is a free
variable, so let α “ r, where r P R is an arbitrary real number. This
gives again β “ 0, b ` r ´ 3β “ 0, so that b “ ´r and a “ r.
Thus the intersection W1 X W2 consists of the vectors

v “ αp1, 0, 1q`p´αqp2, 1, 0q “ αp´1, ´1, 1q`0p0, 3, 0q “ p´α, ´α, αq.

Thus,
W1 X W2 “ Lpp´1, ´1, 1qq,
so a basis of W1 X W2 consists of (the single vector) p´1, ´1, 1q, and
in particular
dim W1 X W2 “ 1.
We now consider W1 ` W2 . According to Definition 2.34,

W1 ` W2 “ tw1 ` w2 | w1 P W1 , w2 P W2 u,

i.e., of arbitrary sums whose two summands are in W1 , respectively


W2 .
As was noted in the proof of Corollary 2.70, if V1 “ Lpv1 , . . . , vn q
and V2 “ Lpw1 , . . . , wm q are two subspaces of a vector space V , then
the sum
V1 ` V2 “ Lpv1 , . . . , vn , w1 , . . . , wm q.
For the subspaces W1 , W2 above, this means that we determine
the span
Lpp1, 0, 1q, p2,
loomoon 1, 0q, looooomooooon
loomoon p´1, ´1, 1q, p0, 3, 0qq.
loomoon
v1 v2 w1 w2

By Definition 2.58(1), we obtain a basis of W1 ` W2 by (possibly)


removing several of these four vectors. To determine which ones

these are, we apply Method 2.53 and Method 2.44. The matrix
built out of the four vectors is
¨ ˛ ¨ ˛ ¨ ˛ ¨ ˛
v1 1 0 1 1 0 1 1 0 1
˚ v2 ‹ ˚ 2 1 0 ‹ ˚ 0 1 ´2
‹ ˚ ‹ ˚ 0 1 ´2 ‹
‹.
˝ w1 ‚ “ ˝ ´1 ´1 1 ‚ ⇝ ˝ 0 ´1 2
˚ ‹ ˚ ‹⇝˚
‚ ˝ 0 0 0 ‚
w2 0 3 0 0 3 0 0 0 6

Note that in this process we only added multiples of some rows to


another row, but did not interchange any rows. Since we have the
zero vector (underlined) in the third row, the vector w1 is a linear
combination of v1 and v2 . The vectors v1 , v2 , w2 are however linearly
independent. Thus, they form a basis of W1 ` W2 . In particular,
dimpW1 ` W2 q “ 3.
An alternative way to determine at least the dimension of W1 `W2
is to use Theorem 2.71:

dimpW1 ` W2 q “ dim W1 ` dim W2 ´ dimpW1 X W2 q.

Using again Method 2.53, one can check that v1 , v2 is a basis of W1 ,


so that dim W1 “ 2 and similarly that w1 , w2 form a basis of W2 ,
so that dim W2 “ 2. Thus, using the first part of the exercise, we
confirm dimpW1 ` W2 q “ 3.
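As a supplementary numerical check (not in the original text), the dimensions above can be recomputed from matrix ranks; the vectors below are the spanning vectors of W1 and W2 used in the solution:

    from sympy import Matrix

    v1, v2 = [1, 0, 1], [2, 1, 0]       # spanning vectors of W1
    w1, w2 = [-1, -1, 1], [0, 3, 0]     # spanning vectors of W2

    dim_W1  = Matrix([v1, v2]).rank()             # 2
    dim_W2  = Matrix([w1, w2]).rank()             # 2
    dim_sum = Matrix([v1, v2, w1, w2]).rank()     # 3 = dim(W1 + W2)
    # dimension formula: dim(W1 X W2) = dim W1 + dim W2 - dim(W1 + W2)
    print(dim_W1, dim_W2, dim_sum, dim_W1 + dim_W2 - dim_sum)   # 2 2 3 1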

Solution of Exercise 2.27: We will show that v1 , v2 , v3 are lin-


early independent (in R4 and therefore also in the subspace W ) and
therefore form a basis of W . We use Method 2.53:
¨ ˛ ¨ ˛ ¨ ˛ ¨ ˛
v1 1 0 1 0 1 0 1 0 1 0 1 0
˝ v2 ‚ “ ˝ 2 0 1 1 ‚ ⇝ ˝ 0 0 ´1 1 ‚ ⇝ ˝ 0 0 1 ´1 ‚
v3 0 0 1 3 0 0 1 3 0 0 0 4
¨ ˛
1 0 1 0
⇝ ˝ 0 0 1 ´1 ‚
0 0 0 1

This matrix has three leading ones, so the vectors are linearly inde-
pendent as claimed.
We “guess” v “ p1, 2, 3, 4q and check that these vectors v1 , v2 , v3 , v
are linearly independent. By Lemma 2.51, this will then imply that
v is not a linear combination of the other vectors, so that W Ĺ

Lpv1 , v2 , v3 , vq. We use Method 2.53:

¨ ˛ ¨ ˛ ¨ ˛
1 0 1 0 1 0 1 0 1 0 1 0
˚ 2 0 1 1 ‹
‹ ˚ 0 0 ´1 1 ‹ ˚ 0 0 ´1 1 ‹
˚
˝ 0 ⇝˚ ‹⇝˚ ‹
0 1 3 ‚ ˝ 0 0 1 3 ‚ ˝ 0 0 1 3 ‚
1 2 3 4 0 2 2 4 0 1 1 2
¨ ˛
1 0 1 0
˚ 0 0 0 4 ‹
⇝˚
˝ 0

0 1 3 ‚
0 1 1 2

After dividing the second row by 4, we can interchange rows and


get a row echelon matrix with four leading ones (underlined). Thus,
v1 , v2 , v3 , v are linearly independent. Therefore, they form in fact a
basis of R4 , and we know by Definition 2.58(3) that therefore

W Ĺ R4 “ Lpv1 , v2 , v3 , vq.

Remark C.2. A more systematic way of solving the second part


of the exercise, without guessing, is to use Definition 2.58: we can
take the standard basis of R4 , and for (at least) one of the four
standard basis vectors e1 , e2 , e3 , e4 we will have that this standard
basis vector together with v1 , v2 , v3 form a basis of R4 . We can then
use Method 2.53 to see that, for example, v1 , v2 , v3 , e1 are linearly
independent and therefore form a basis of R4 , so that in particular
W Ĺ Lpv1 , v2 , v3 , e1 q.

Solution of Exercise 2.28: We bring the matrix formed by these



vectors in row-echelon form:


¨ ˛ ¨ ˛
1 0 ´1 2 1 0 ´1 2
˚ 1 0 0 1 ‹ ˚ 0 0 1 ´1 ‹
˚ ‹⇝˚ ‹
˝ 2 0 ´1 3 ‚ ˝ 0 0 1 ´1 ‚
4 t ´2 6 0 t 2 ´2
¨ ˛
1 0 ´1 2
˚ 0 t 2 ´2 ‹
⇝˚ ‹
˝ 0 0 1 ´1 ‚
0 0 1 ´1
¨ ˛
1 0 ´1 2
˚ 0 t 2 ´2 ‹
‹.
⇝˚
˝ 0 0 1 ´1 ‚
0 0 0 0
If t ‰ 0, we can divide by t, which gives a matrix with three leading
ones. Thus, the space Ut which is spanned by these vectors has
dimension 3 in this case. If t “ 0, we continue simplifying the
matrix into row echelon form:
¨ ˛ ¨ ˛
1 0 ´1 2 1 0 ´1 2
˚ 0 t 2 ´2 ‹ ˚ 0 0 2 ´2 ‹
˝ 0 0 1 ´1 ‚ “ ˝ 0 0 1 ´1 ‚
˚ ‹ ˚ ‹
0 0 0 0 0 0 0 0
¨ ˛
1 0 ´1 2
˚ 0 0 1 ´1 ‹
⇝˚ ‹.
˝ 0 0 0 0 ‚
0 0 0 0
This has two leading ones, thus dim Ut “ 2 in this case.
We now consider t “ 1. The subspace U :“ U1 then has a basis
consisting of the non-zero rows of the matrix above, i.e., it has a
basis consisting of the vectors
p1, 0, ´1, 2q, p0, 1, 2, ´2q, p0, 0, 1, ´1q.
In order to determine a basis of W , we form the matrix associated
to these homogeneous equations, which is
ˆ ˙ ˆ ˙
1 1 1 0 1 1 1 0
⇝ .
1 0 0 ´3 0 ´1 ´1 ´3

This has two columns not having a leading one, namely the last two.
These are the free variables, say x3 “ a, x4 “ b for a, b P R. To
determine a basis of W , we therefore have to consider the system

x1 ` x2 ` a “ 0
x2 ` a ` 3b “ 0

This gives x2 “ ´a ´ 3b, and x1 ´ 3b “ 0 so that x1 “ 3b. Thus, a


basis of W is given by the two vectors

p0, ´1, 1, 0q and p3, ´3, 0, 1q.

In order to determine U X W , consider a generic vector of U , i.e.,


one of the form

v “ ap1, 0, ´1, 2q ` bp0, 1, 2, ´2q ` cp0, 0, 1, ´1q


“ pa, b, ´a ` 2b ` c, 2a ´ 2b ´ cq.

We require it to satisfy the equations describing W :

a ` b ` p´a ` 2b ` cq “ 0
a ´ 3p2a ´ 2b ´ cq “ 0.

Simplifying these expressions gives the system

3b ` c “ 0
´5a ` 6b ` 3c “ 0.

Therefore c “ ´3b, plugging this into the second equation gives,


after simplifying, ´5a ´ 3b “ 0 or a “ ´ 53 b. Thus, our vector v P U
belongs to W precisely if it can be written as
´p3{5qbp1, 0, ´1, 2q ` bp0, 1, 2, ´2q ` p´3bqp0, 0, 1, ´1q
“ p´p3{5qb, b, p3{5qb ` 2b ´ 3b, ´p6{5qb ´ 2b ` 3bq
“ bp´3{5, 1, ´2{5, ´1{5q,
where b P R is arbitrary. Thus, a basis of U X W is given by the vector
p´3{5, 1, ´2{5, ´1{5q.
In particular, dim U X W “ 1.
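As a supplementary check (not part of the original solution), the dimension count dim Ut can be reproduced with sympy; the rows are the four vectors read off from the matrix at the start of the solution:

    from sympy import Matrix, symbols

    t = symbols('t')
    M = Matrix([[1, 0, -1, 2],
                [1, 0,  0, 1],
                [2, 0, -1, 3],
                [4, t, -2, 6]])
    print(M.subs(t, 0).rank())   # 2, so dim U_0 = 2
    print(M.subs(t, 1).rank())   # 3, so dim U_1 = 3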

C.3 Linear maps


ˆ ˙
x1
Solution of Exercise 3.9: Recall that ker f “ t P
¨ ˛ x2
ˆ ˙ 0
x1
R2 | f p q “ ˝ 0 ‚u. Thus a vector is in the kernel pre-
x2
0
cisely if it is a solution of the homogeneous system
x1 ` 2x2 “ 0
x2 “ 0
3x1 ` 5x2 “ 0.
Solving
ˆ ˙ this system gives x2 “ 0, then x1 “ 0. Thus, ker f “
0
t u.
0
For the second
¨ question,
˛ recall that im
¨ f consists
˛ precisely of
a a ˆ ˙
x 1
those vectors ˝ b ‚ that are of the form ˝ b ‚ “ f p q for
x2
c c
some x1 , x2 P R. Thus, the question amounts to this: do there exist
x1 , x2 P R with
¨ ˛ ¨ ˛
ˆ ˙ x1 ` 2x2 1
x1
fp q“ ˝ x2 ‚ “ ˝ 0 ‚?
x2
3x1 ` 5x2 3
Again, this leads to a linear system:
x1 ` 2x2 “ 1
x2 “ 0
3x1 ` 5x2 “ 3.
The first two equations give¨x2 “˛0, x1 “ 1. This also satisfies the
1
last equation, so the vector ˝ 0 ‚ is indeed in the image, because
3
¨ ˛
1 ˆ ˙
˝ 0 ‚ “ f p 1 q.
0
3

Solution of Exercise 3.10: To determine the kernel of f , one


has to solve the homogeneous system

2x1 ´ x2 ` x3 ` x4 “ 0
5x2 ´ 3x3 ´ 5x4 “ 0
3x1 ´ 4x2 ` 3x3 ` 4x4 “ 0.

For the second task, one has to solve the non-homogeneous sys-
tem

2x1 ´ x2 ` x3 ` x4 “ 1
5x2 ´ 3x3 ´ 5x4 “ ´3
3x1 ´ 4x2 ` 3x3 ` 4x4 “ 3.
¨ ˛
0
˚ 0 ‹
This solution set is not a subspace, since the zero vector ˚ ˝ 0 ‚ is

0
not a solution for the system: the left hand side of all three equations
is 0, while the right ones are not.

Solution of Exercise 3.11: We compute the rank using Gaussian


elimination:
¨ ˛ ¨ ˛
1 3 ´1 2 1 3 ´1 2
At “ ˝ 1 5 1 1 ‚ ⇝ ˝ 0 2 2 ´1 ‚
2 4 t 5 0 ´2 t ` 2 1
¨ ˛
1 3 ´1 2
⇝˝ 0 2 2 ´1 ‚
0 0 t`4 0
¨ ˛
1 3 ´1 2
⇝˝ 0 1 1 ´ 21 ‚
0 0 t`4 0

If t ‰ ´4, then we can further divide the last row by t ` 4p‰ 0q, and
the rank is then 3. For t “ ´4, the rank is 2.

For the second task, the system we are considering here is

x1 ` 3x2 ´ x3 ` 2x4 “ 1
x1 ` 5x2 ` x3 ` x4 “ α
2x1 ` 4x2 ´ 4x3 ` 5x4 “ 0.

For the last task: the rank of At is at most 3. Thus, dim im f ď 3,


and therefore

dim ker “ dim R4 ´ dim im f ě 4 ´ 3 “ 1.

This means that, for all t, the kernel of f is not just consisting of
the zero vector, hence the answer to the question is no.
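A small numerical check of the rank computation (supplementary, not from the original text); the matrix At is the one written down above:

    import numpy as np

    def A(t):
        return np.array([[1, 3, -1, 2],
                         [1, 5,  1, 1],
                         [2, 4,  t, 5]], dtype=float)

    for t in (-4.0, 0.0, 1.0):
        print(t, np.linalg.matrix_rank(A(t)))   # rank 2 for t = -4, rank 3 otherwise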

Solution of Exercise 3.14: To determine a basis of ker f and of


im f , we bring A into row echelon form:
¨ ˛ ¨ ˛
2 ´1 ´ 25 1 0 ´3 ´ 32 0
˚ ´1 0 1 ´ 21 ‹ ˚ 0 1 1
2
0 ‹
A“˚ 1 ‚⇝ ˝
‹ ˚ ‹
˝ 1 1 ´
1
2 2
1 1 ´ 12 1 ‚
2
0 2 1 0 0 2 1 0
¨ ˛
1 1 ´ 12 2 1
˚ 0 1 1
2
0 ‹
⇝˚
˝ 0 ´3 ´ 3

2
0 ‚
0 2 1 0
¨ ˛
1 1 ´ 2 12
1
˚ 0 1 1 0 ‹
2
˝ 0 0 0 0 ‚ “: B.
⇝˚ ‹
0 0 0 0

According to Proposition 3.32, rk A “ dim im f equals the number


of columns (of B) with a leading 1, i.e., dim im f “ 2. A basis of
im f is given by the columns of A corresponding to these columns
of B, i.e., the first two columns. Thus, a basis of im f is given by
the two vectors p2, ´1, 1, 0q and p´1, 0, 1, 2q. To determine a basis
of ker f , the third and fourth columns correspond to a free variable,
i.e., we can choose x3 “ a, x4 “ b with a, b P R arbitrary. We obtain

the equations
1 1
x1 ` x2 ´ a ` b “ 0
2 2
1
x2 ` a “ 0.
2
This gives x2 “ ´ 12 a and x1 “ a ´ 21 b. Therefore,
1 1
ker f “ tpa ´ b, ´ a, a, b | a, b P Ru
2 2
1 1
“ tap1, ´ , 1, 0q ` bp´ , 0, 0, 1q | a, b P Ru
2 2
1 1
“ Lpp1, ´ , 1, 0q, p´ , 0, 0, 1qq.
2 2
These two vectors form a basis of ker f .
In order to determine im f X ker f , we need to consider elements
of
im f “ tap2, ´1, 1, 0q ` bp´1, 0, 1, 2q | a, b P Ru
that also belong to the kernel, i.e., the vector p2a ´ b, ´a, a ` b, 2bq
must lie in ker f . This means that
¨ ˛ ¨ ˛
2a ´ b 0
˚ ´a ‹ ˚ 0 ‹
A˚ ˝ a ` b ‚ “ ˝ 0 ‚.
‹ ˚ ‹
2b 0

Again we use the row-echelon form of A, computed above. So this


system is equivalent to
¨ ˛ ¨ ˛¨ ˛ ¨ ˛
2a ´ b 1 1 ´ 12 1
2
2a ´ b 0
˚ ´a ‹ ˚ 0 1 1 ˚ ´a ‹ ˚ 0
2
0 ‹ ‹
B˚˝ a ` b ‚“ ˝ 0 0
‹ ˚ ‹˚ ‹“˚ ‹.
0 0 ‚˝ a ` b ‚ ˝ 0 ‚
2b 0 0 0 0 2b 0

This gives the two equations


1 1
p2a ´ bq ` p´aq ´ pa ` bq ` p2bq “ 0
2 2
1
´a ` pa ` bq “ 0.
2

This simplifies to
a b
´ “0
2 2
a b
´ ` “ 0.
2 2
This is equivalent to the condition a “ b. Therefore,
ker f X im f “ tap2, ´1, 1, 0q ` ap´1, 0, 1, 2q | a P Ru
“ tap1, ´1, 2, 2q | a P Ru
“ Lpp1, ´1, 2, 2qq.
That is, the vector p1, ´1, 2, 2q is a basis of ker f X im f .

Solution of Exercise 3.16: We have v1 “ p1, 1, 0q, and compute


¨ ˛¨ ˛ ¨ ˛
2 ´1 0 1 1
v2 “ ˝ 1 0 2 ‚˝ 1 ‚ “ ˝ 1 ‚,
0 2 ´1 0 2
¨ ˛¨ ˛ ¨ ˛
2 ´1 0 1 1
v3 “ ˝ 1 0 2 ‚˝ 1 ‚ “ ˝ 5 ‚.
0 2 ´1 2 0
In order to confirm that they form a basis, we apply Method 2.44
and Method 2.53 by forming the associated matrix and bringing it
into row echelon form:
¨ ˛ ¨ ˛
1 1 0 1 1 0
˝ 1 1 2 ‚⇝ ˝ 0 1 2 ‚
1 5 0 ´4 0 0
¨ ˛
1 1 0
⇝˝ 0 1 2 ‚
0 4 0
¨ ˛
1 1 0
⇝˝ 0 1 2 ‚
0 0 ´8
¨ ˛
1 1 0
⇝ ˝ 0 1 2 ‚.
0 0 1

This matrix has three leading ones, so that the vectors do form a
basis. ¨ ˛¨ ˛ ¨ ˛
2 ´1 0 1 ´3
We compute v4 “ f pv3 q “ ˝ 1 0 2 ‚ ˝ 5 ‚“ ˝ 1 ‚.
0 2 ´1 0 10
The equation v4 “ a1 v1 ` a2 v2 ` a3 v3 is the linear system
a1 ` a2 ` a3 “ ´3
a1 ` a2 ` 5a3 “ 1
2a2 “ 10.
We solve this: the last equation gives a2 “ 5, which leads to
a1 ` 5 ` a3 “ ´3
a1 ` 5 ` 5a3 “ 1.
Therefore
a1 ` a3 “ ´8
a1 ` 5a3 “ ´4.
This can be solved to a3 “ 1 and a1 “ ´9. Thus, pa1 , a2 , a3 q “
p´9, 5, 1q are the coordinates of v4 in the basis v1 , v2 , v3 .
We now determine the matrix of f with respect to the basis
v1 , v2 , v3 (both in the domain and the codomain of f ). We therefore
write each f pvi q as a linear combination of these three vectors:
f pv1 q “ v2 “ 0v1 ` 1v2 ` 0v3
f pv2 q “ v3 “ 0v1 ` 0v2 ` 1v3
f pv3 q “ v4 “ p´9qv1 ` 5v2 ` 1v3 .
According to Proposition 3.42, the matrix of f with respect to
v1 , v2 , v3 is
¨ ˛
0 0 ´9
˝ 1 0 5 ‚.
0 1 1
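As a supplementary numerical check of the coordinates of v4 (not part of the original solution), one can solve the corresponding linear system directly; the matrix A and the vector v1 are those used in the solution above:

    import numpy as np

    A  = np.array([[2, -1, 0],
                   [1,  0, 2],
                   [0,  2, -1]], dtype=float)
    v1 = np.array([1, 1, 0], dtype=float)
    v2 = A @ v1
    v3 = A @ v2
    v4 = A @ v3

    # coordinates of v4 in the basis v1, v2, v3
    B = np.column_stack([v1, v2, v3])
    print(np.linalg.solve(B, v4))   # expect [-9, 5, 1]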

Solution of Exercise 3.17: The matrix for f is


¨ ˛
` ˘ ´1 0 1 0
A “ a1 a2 a3 a4 “ ˝ 0 ´1 0 1‚
1 ´1 0 0

To compute the image of f , we bring this matrix into row-echelon


form:
¨ ˛
´1 0 1 0
A⇝˝ 0 ´1 0 1 ‚
0 ´1 1 0
¨ ˛
´1 0 1 0
⇝˝ 0 ´1 0 1 ‚
0 0 1 0

After multiplying the first two rows with ´1, we get a matrix with
three leading ones. Therefore dim im f “ 3, which implies that
im f “ R3 . This tells us that dim ker f “ 1, so ker f is generated
¨ by
˛
1
˚ 1 ‹
any non-zero vector in it. An example of such a vector is ˚ ˝ ´1 ‚,

´1
which therefore constitutes a basis of ker f .

Solution of Exercise 3.18:


(1) f2 and f3 are not linear (f pa ` bq ‰ f paq ` f pbq for most a and
b). The remaining ones are linear.
(2)
¨ ˛ ¨ ˛
1 1
im pf1 q “ Lp˝0‚, ˝3‚q
2 0
im pf2 q “ R2
ˆ ˙
x
im pf3 q “ t 1 P R2 |x1 ě 0u
x2
im pf4 q “ R3
im pf5 q “ R3

For the linear maps f1 , f4 , f5 , this can be checked by forming


the matrix of these maps with respect to the standard bases and
bringing it into row echelon form.
(3) According to the rank-nullity theorem, we see that f1 is injective,
f4 is bijective (i.e., surjective and injective) and f5 is bijective.

Solution of Exercise 3.27: We write any vector v “ pa, bq “


ap1, 0q ` bp0, 1q “ ae1 ` be2 in terms of v1 , v2 :
v “ αv1 ` βv2 “ αp1, ´3q ` βp2, 1q.
As an example, if v “ p0, 6q, then p0, 6q “ αp1, ´3q ` βp2, 1q gives
the linear system
0 “ α ` 2β
6 “ ´3α ` β.
6
This can be solved as β “ 7
and α “ ´ 12
7
. Thus, we can write
12 6
, qpv ,v q ,
v “ p0, 6qpe1 ,e2 q “ p´
7 7 1 2
where the subscripts indicate that the coordinates are with respect
to the standard basis, resp. to the basis v1 , v2 .
We now determine the base change matrix. We have to write
the matrix of the identity map in terms of the standard basis in the
domain, and the basis v1 , v2 in the codomain:
e1 “ p1, 0q ÞÑidpe1 q “ p1, 0q “ α1 v1 ` α2 v2
e2 “ p0, 1q ÞÑidpe2 q “ p0, 1q “ β1 v1 ` β2 v2 .
The base change matrix is then the matrix
ˆ ˙
α1 β1
.
α2 β2
We can compute α1 etc. similarly as above:
1 “ α1 ` 2α2
0 “ ´3α1 ` α2
We solve this as α1 “ 71 , α2 “ 37 . As for β1 , β2 , the relevant system
is
0 “ β1 ` 2β2
1 “ ´3β1 ` β2
whose solution is β1 “ ´ 72 , β2 “ 17 . Therefore, the base change
matrix is ˆ 1 ˙
7
´ 27
H“ 3 1 .
7 7

We compute the coordinates of p2, ´5q in terms of the basis v1 , v2 :


ˆ ˙ ˆ 1 ˙ˆ ˙ ˆ 12 ˙
2 7
´ 72 2 7
H “ 3 1 “ 1 .
´5 7 7
´5 7

Thus, the coordinates of p2, ´5q with respect to the basis v1 , v2 is


p 12
7 7
, 1 q.

Solution of Exercise 3.28: We have, for example,

idpv1 q “ v1 “ p1, 0, ´1q “ 1 ¨ e1 ` 0 ¨ e2 ` p´1q ¨ e3 .

Likewise for v2 and v3 . Therefore, the base change matrix is


¨ ˛
1 2 ´1
˝ 0 1 ´1 ‚.
´1 1 7

Solution of Exercise 3.30: We follow the given hint, and first


compute H. We have

v1 Ñ
Þ v1 “ p1, ´1q “ e1 ´ e2 ,
v2 ÑÞ v2 “ p3, ´1q “ 3e1 ´ e2 .
ˆ ˙
1 3
Thus H “ .
´1 ´1
We now compute the base change matrix K from the standard
basis to the basis v “ tv1 , v2 , v3 u:

e1 “ p1, 0, 0q ÞÑ p1, 0, 0q “ α1 v1 ` α2 v2 ` α3 v3 .

Plugging in the values of v1 , v2 , v3 , this gives the linear system

1 “ α1 ` 2α2 ´ α3
0 “ α2 ´ α3
0 “ α1 ` α2 ´ α3 .

This has the solution α1 “ 0, α2 “ 1, α3 “ 1. Similarly, if instead


of e1 “ p1, 0, 0q, we consider e2 “ p0, 1, 0q, the constants in the
above system change accordingly to 0, resp. 1, resp. 0 in the three
equations above. The solution is then α1 “ 1, α2 “ ´2, α3 “ ´3.

Similarly, for e3 , we obtain the solution α1 “ 1, α2 “ ´1, α3 “ ´1.


Hence ¨ ˛
0 1 1
K “ ˝ 1 ´2 ´1 ‚.
1 ´3 ´1
We compute
¨ ˛ ¨ ˛
0 1 1 2 1 ˆ ˙
1 3
KAH “ ˝ 1 ´2 ´1 ‚¨ ˝ 0 1 ¨‚
´1 ´1
1 ´3 ´1 ´3 1
¨ ˛ ¨ ˛
0 1 1 1 5
“ ˝ 1 ´2 ´1 ‚¨ ˝ ´1 ´1 ‚
1 ´3 ´1 ´4 ´10
¨ ˛
´5 ´11
“˝ 7 17 ‚.
8 18

Solution of Exercise 3.32: We follow the solution of Exer-


cise 3.30, see above:
id f id
R2v ÝÑ R2e ÝÑ R2e ÝÑ R2v .
H A K
ˆ ˙ ˆ ˙
1 1 2
We have H “ . One computes K “ ´11 and
1 2 ´1
then
ˆ ˙ˆ ˙ˆ ˙
2 ´1 6 ´1 1 1
KAH “
´1 1 2 3 1 2
ˆ ˙ˆ ˙
2 ´1 5 4

´1 1 5 8
ˆ ˙
5 0
“ .
0 4

Solution of Exercise 3.33: For the given matrix A, the system


Ax “ x is written out like so:
x3 “ x1
x2 “ x2
x1 “ x3 .

The second equation holds for all x, and the first is equivalent to
the third. Therefore, the system is equivalent to the one consisting
of the single equation
x1 ´ x3 “ 0.
This corresponds to the system

Bx “ x,

where B “ p1 0 ´1q is the corresponding 1 ˆ 3-matrix. This matrix


has rank 1, so that the solution space is two-dimensional, given by

Lpp1, 0, 1q, p0, 1, 0qq.
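As a supplementary check (not from the original text), the matrix A reconstructed from the written-out system Ax “ x is the permutation matrix below, and the eigenvalue 1 indeed has multiplicity 2, matching the two-dimensional solution space:

    import numpy as np

    # Matrix reconstructed from the equations x3 = x1, x2 = x2, x1 = x3.
    A = np.array([[0, 0, 1],
                  [0, 1, 0],
                  [1, 0, 0]], dtype=float)
    w, V = np.linalg.eig(A)
    print(np.sum(np.isclose(w, 1.0)))   # 2: the eigenvalue 1 has multiplicity 2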

Solution of Exercise 3.34: We write out the given system


¨ ˛¨ ˛ ¨ ˛
1 0 0 x1 x1
˝ 0 4 2 ‚˝ x2 ‚ “ 5 ˝ x2 ‚
0 2 1 x3 x3
as
»¨ ˛ ¨ ˛fi ¨ ˛ ¨ ˛
1 0 0 5 0 0 x1 0
–˝ 0 4 2 ‚´ ˝ 0 5 0 ‚fl ˝ x2 ‚ “ ˝ 0 ‚.
0 2 1 0 0 5 x3 0

This simplifies to
¨ ˛¨ ˛ ¨ ˛
´4 0 0 x1 0
˝ 0 ´1 2 ‚˝ x2 ‚ “ ˝ 0 ‚,
0 2 ´4
looooooooooomooooooooooon x3 0
“:B

which can be solved by bringing B into row-echelon form:


¨ ˛
1 0 0
B ⇝ ˝ 0 1 ´2 ‚.
0 0 0

Thus x1 “ 0, x2 “ 2x3 and x3 is a free variable. Thus the solution


space of the given system is

Lpp0, 2, 1qq.

Solution of Exercise 3.35: We proceed the same way as for


Exercise 3.30 (see its solution above):
A
id id
R2v ÝÑ R2e f ÝÑ R2e ÝÑ R2v .
H ´1
H

Here H is the base change matrix from v to the standard basis


e “ te1 “ p1, 0q, e2 “ p0, 1qu. The base change matrix
ˆ from˙e to v
2 0
is then H ´1 . From the given vectors we have H “ . We
1 1
compute the inverse H ´1 using Theorem 3.78:
ˆ ˙ ˆ ˙ ˆ ˙
2 0 1 0 1 1 0 1 1 1 0 1
⇝ ⇝
1 1 0 1 2 0 1 0 0 ´2 1 ´2
ˆ ˙ ˆ 1
˙
1 1 0 1 1 0 2
0
⇝ ⇝
0 1 ´ 12 1 0 1 ´ 12 1
“ pid | H ´1 q.
ˆ 1 ˙
´1 2
0
Thus, H “ . Therefore the requested matrix of f is
´ 12 1
ˆ 1 ˙ˆ ˙ˆ ˙ ˆ 1 ˙ˆ ˙
´1 2
0 3 0 2 0 2
0 6 0
H AH “ “ 1
´ 12 1 8 ´1 1 1 ´ 21 1 15 ´
ˆ ˙
3 0
“ .
12 ´1

Solution of Exercise 3.37: This can be done as for Exercise 3.35


above. The final solution is
¨ ˛
´2 0 0
˝ 0 2 0 ‚.
0 0 ´1

Solution of Exercise 3.38: One computes this solution space as


Lpp´1, 2, 3qq “ Lpp1, ´2, ´3qq.
According to Theorem 2.66(2), this vector l1 “ p1, ´2, ´3q can
be completed to a basis of R3 by picking any basis v1 , v2 , v3 of R3 .
Then it is possible to find two of these three vectors which together

with l1 will form a basis of R3 . We pick the standard basis, v1 “ e1 ,


v2 “ e2 and v3 “ e3 . We check that l1 , e2 , e3 form a basis. Indeed,
the matrix whose rows are these vectors,
¨ ˛
1 ´2 ´3
˝ 0 1 0 ‚,
0 0 1

is a row-echelon matrix with three leading ones, so its rank is 3.


We compute the matrix of f with respect to this basis v “
tl1 , e2 , e3 u:
id f id
R3v ÝÑ R3e ÝÑ R3e ÝÑ R3v .
H A ´1H

We have ¨ ˛
1 0 0
H“ ˝ ´2 1 0 ‚.
´3 0 1
We compute the inverse using Theorem 3.78:
¨ ˛ ¨ ˛
1 0 0 1 0 0 1 0 0 1 0 0
˝ ´2 1 0 0 1 0 ‚ ⇝ ˝ 0 1 0 2 1 0 ‚ “ pid | H ´1 q.
´3 0 1 0 0 1 0 0 1 3 0 1

Hence the matrix for f with respect to the basis v is


¨ ˛
3 ´1 0
H ´1 AH “ ˝ 0 0 1 ‚.
0 ´2 3

Remark C.3. The choice of the two vectors e2 and e3 , in addition


to l1 above, is arbitrary. To begin with, one may choose a different
basis (other than the standard basis) to complete l1 to a basis. Even
if one takes the standard basis, for this particular value of l1 , any
two of the three vectors e1 , e2 , e3 together with l1 would form a basis.
The resulting base change matrix H will then be different, and also
the result H ´1 AH will be different.

Solution of Exercise 3.39: The vectors

v “ tv1 “ p1, 0, 1q, v2 “ p0, 3, ´1q, v3 “ p0, 0, 1qu


¨ ˛
1 0 1
3
form a basis of R since the corresponding matrix ˝ 0 3 ´1 ‚
0 0 1
has rank 3. Hence we can compute the matrix of f with respect to
that basis as follows:

f pv1 q “ 0 “ 0v1 ` 0v2 ` 0v3


f pv2 q “ v2 “ 0v1 ` 1v2 ` 0v3
f pv3 q “ p0, 0, 2q “ 2v3 “ 0v1 ` 0v2 ` 2v3 .

Therefore the matrix of f with respect to that basis is


¨ ˛
0 0 0
A“ ˝ 0 1 0 ‚.
0 0 2

As before, we compute that matrix of f with respect to the standard


basis using the method above:
id f id
R3e ÝÑ R3v ÝÑ R3v ÝÑ R3e .
K A ´1 K

The base change matrix K is easily read off:


¨ ˛
1 0 0
K “ ˝ 0 3 0 ‚.
1 ´1 1

We compute the inverse K ´1 :


¨ ˛ ¨ ˛
1 0 0 1 0 0 1 0 0 1 0 0
˝ 0 3 0 0 1 0 ‚⇝ ˝ 0 3 0 0 1 0 ‚
1 ´1 1 0 0 1 0 ´1 1 ´1 0 1
¨ ˛
1 0 0 1 0 0
⇝ ˝ 0 1 0 0 13 0 ‚
0 ´1 1 ´1 0 1
¨ ˛
1 0 0 1 0 0
⇝ ˝ 0 1 0 0 13 0 ‚
0 0 1 ´1 13 1
“ pid | K ´1 q.

Therefore the requested matrix is the product


¨ ˛¨ ˛¨ ˛
1 0 0 0 0 0 1 0 0
1
KAK ´1 “ ˝ 0 3 0 ‚˝ 0 1 0 ‚˝ 0 3
0 ‚
1
1 ´1 1 0 0 2 ´1 3
1
¨ ˛¨ ˛
1 0 0 0 0 0
“ ˝ 0 3 0 ‚ ˝ 0 31 0 ‚
1 ´1 1 ´2 23 2
¨ ˛
0 0 0
“ ˝ 0 1 0 ‚.
´2 31 2

Solution of Exercise 3.41: It is convenient to observe


¨ ˛ ¨ ˛ ¨ ˛
x1 5 0 0 x1
5 ˝ x2 ‚ “ ˝ 0 5 0 ‚5 ˝ x2 ‚.
x3 0 0 5 x3
Thus the given system can be rewritten as
¨ ˛¨ ˛ ¨ ˛¨ ˛
1 0 0 x1 5 0 0 x1
˝ 0 4 2 ‚˝ x2 ‚ “ ˝ 0 5 0 ‚˝ x2 ‚.
0 2 1 x3 0 0 5 x3
By Lemma 3.57, this is the same as the system
»¨ ˛ ¨ ˛fi ¨ ˛ ¨ ˛
1 0 0 5 0 0 x1 0
–˝ 0 4 2 ‚´ ˝ 0 5 0 ‚fl ˝ x2 ‚ “ ˝ 0 ‚.
0 2 1 0 0 5 x3 0
¨ ˛
´4 0 0
The left hand matrix equals ˝ 0 ´1 2 ‚. From here, one can
0 2 ´4
use the standard method to solve the linear system. (As a forecast
to §4, one can note that the determinant of the latter matrix is 0,
so that the matrix is not invertible. Hence the system above has
non-zero solutions.)

C.4 Determinants

Solution of Exercise 4.5: According to Proposition 4.17, the


determinant equals 3 ¨ 4 ¨ 5 “ 60.
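The matrix of the exercise is not restated here, but the principle used above (the determinant of a triangular matrix is the product of its diagonal entries) can be illustrated with a hypothetical upper-triangular matrix with diagonal 3, 4, 5; the off-diagonal entries below are made up and do not affect the result:

    import numpy as np

    # Hypothetical upper-triangular matrix with diagonal 3, 4, 5.
    T = np.array([[3., 7., -2.],
                  [0., 4.,  1.],
                  [0., 0.,  5.]])
    print(np.linalg.det(T))   # 60.0 (up to rounding)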

Solution of Exercise 4.6: Both matrices are non-invertible,


since the rows are not linearly independent. Thus they both have
determinant 0.

C.5 Eigenvalues and eigenvectors

Solution of Exercise 5.5: The condition ker f “ Lpp1, 1, 1qq


implies that f p1, 1, 1q “ p0, 0, 0q, which we can also rewrite as
f pp1, 1, 1qq “ 0 ¨ p1, 1, 1q.
Thus, this vector is an eigenvector for f , with eigenvalue 0. We
therefore have three eigenvectors as follows:
v1 “ p1, 0, 1q ÞÑ 2p1, 0, 1q
v2 “ p2, 0, ´3q ÞÑ ´1p2, 0, ´3q
v3 “ p1, 1, 1q ÞÑ 0p1, 1, 1q.
We check that these three vectors form a basis of R3 (note that this
is therefore an example of an eigenbasis). To this end, we compute
the rank of
¨ ˛ ¨ ˛
1 0 1 1 0 1
˝ 2 0 ´3 ‚ ⇝ ˝ 0 0 ´5 ‚.
1 1 1 0 1 0
This implies that the matrix has rank three, and therefore the three
vectors form a basis of R3 . The matrix of f with respect to the
basis v “ tv1 , v2 , v3 u is
¨ ˛
2 0 0
˝ 0 ´1 0 ‚.
0 0 0
In order to compute the matrix of f with respect to the standard
basis e “ te1 , e2 , e3 u, we use the usual diagram:
id f id
R3e Ñ R3v ˜ ÝÑ ¸ R3v Ñ R3e .
K 2 0 0 K´1
0 ´1 0
0 0 0

It turns out that K ´1 is easier to compute than K. It is given


by expressing the vi in their coordinates in the standard basis vec-
tors, e.g. v1 ¨ÞÑ idpv1 q “ p1,
˛ 0, 1q “ 1e1 ` 0e2 ` 1e3 . This im-
1 2 1
plies K ´1 “ ˝ 0 0 1 ‚. We can use this to compute K “
1 ´3 1
pK q q, cf.(3.69). This inverse (of¨K ´1 ) can be computed
´1 ´1
˛ using
3 ´5 2
Theorem 3.78, which gives K “ 15 ˝ 1 0 ´1 ‚. Then, one
0 5 0
computes the product
¨ ˛ ¨ ˛
2 0 0 6 12 11
1
K ´1 ˝ 0 ´1 0 ‚K “ ˝ 2 4 2 ‚.
0 0 0 5 0 0 ´5

This is the matrix of f with respect to the standard basis.


This is a typical example of the situation that one basis of R3
may be more adapted to describing a linear map than another one.
An eigenbasis, such as v1 , v2 , v3 gives a particularly simple matrix.
¨ ˛
4´t 0 4
Solution of Exercise 5.7: We have detpAa ´tid3 q “ det ˝ a 2´t a ‚.
´2 0 ´2 ´ t
We compute the determinant by developing the first row, which gives

detpAa ´ tid3 q “ p4 ´ tqrp2 ´ tqp´2 ´ tqs ` 4r2p2 ´ tqs


“ p4 ´ tqp2 ´ tqp´2 ´ tq ` 8p2 ´ tq
“ pt ´ 2qrp4 ´ tqp2 ` tq ´ 8s
“ pt ´ 2qr8 ` 2t ´ t2 ´ 8s
“ pt ´ 2qp´t2 ` 2tq
“ ´pt ´ 2q2 t.

The roots of this polynomial, i.e., the eigenvalues are 2 and 0 (re-
gardless of the value of a). The exponent of t ´ 2 in the above
polynomial is 2, the one for t is 1. This implies that

1 ď dim E1 ď 1 for all t P R


1 ď dim E2 ď 2 for all t P R.

According to Method 5.15, Aa will be diagonalizable precisely if


dim E2 “ 2. We compute E2 by bringing Aa ´ 2id into reduced row
echelon form:
¨ ˛ ¨ ˛ ¨ ˛
4´2 0 4 2 0 4 2 0 4
˝ a 2´2 a ‚“ ˝ a 0 a ‚⇝ ˝ a 0 a ‚
´2 0 ´2 ´ 2 ´2 0 ´4 0 0 0
¨ ˛ ¨ ˛
1 0 2 1 0 2
⇝ ˝ a 0 a ‚ ⇝ ˝ 0 0 a ´ 2a ‚
0 0 0 0 0 0
¨ ˛
1 0 2
“ ˝ 0 0 ´a ‚.
0 0 0
This matrix has rank 1, or equivalently dim E2 “ 2, if and only if
a “ 0. Thus, the matrix Aa is diagonalizable precisely if a “ 0.
The second part¨of the exercise ˛then has only to be done for a “ 0,
4 0 4
i.e. A :“ A0 “ ˝ 0 2 0 ‚. This can be dealt with as in the
´2 0 ´2
previous exercises.
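As a supplementary numerical check (not part of the original solution), the eigenvalues of A0 can be recomputed; they should be 0 and 2 (the latter with multiplicity 2), as found above:

    import numpy as np

    A0 = np.array([[ 4., 0.,  4.],
                   [ 0., 2.,  0.],
                   [-2., 0., -2.]])
    w, V = np.linalg.eig(A0)
    print(np.round(np.sort(w.real), 6))   # [0. 2. 2.]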

Solution of Exercise 5.11: If A and B represent the same map,


then A “ P BP ´1 for some invertible matrix P . This implies that
det A “ det P det B det P ´1 “ det B. In short, A and B have to
have the same determinant. This is true: det A “ det B “ 9.
In addition A and B have to have the same characteristic poly-
nomial:
χA ptq “ detpA ´ tidq
¨ ˛
1´t 0 2
“ det ˝ 0 3´t 0 ‚
0 0 3´t
“ p1 ´ tqp3 ´ tq2
“ detpB ´ tidq “ χB ptq.
Again, this is true.
Finally, the dimensions of the eigenspaces of the eigenvalues (1
and 3) need to be equal. For A, the eigenspace E1,A for the eigen-
value λ “ 1 has dim E1,A “ 2 (as one computes!). For B instead,

dim E1,B “ 1. Therefore A and B do not represent the same linear


map.

Solution of Exercise 5.12: The matrix A has eigenvalues 1


and 2. The eigenspace E1 “ Lp1, 0, 0q and E2 “ Lp1, 1, 0q. Their
dimensions sum up to 2, which is strictly less than 3, so that A is
not diagonalizable.
The matrix A2 therefore has 22 “ 4 as an eigenvalue. Since
similar matrices have the same eigenvalues, A is not similar to A2 .

Solution of Exercise 5.13: We first check that v1 , v2 , v3 are a


basis of R3 . Indeed, one can compute
¨ ˛
1 0 1
det ˝ 1 1 1 ‚ “ 1 ‰ 0,
1 1 2
so the rank is 3 and the vectors form a basis.
The condition that v3 be an eigenvector for the eigenvalue 4
means f pv3 q “ 4v3 “ p4, 4, 8q. According to Proposition 3.38, there
is a unique linear map f whose value on v1 , v2 , v3 is prescribed.
We now compute A. We have to express f pvi q as a linear combi-
nation in terms of v1 , v2 , v3 :
f pv1 q “ p0, 0, 0q “ 0v1 ` 0v2 ` 0v3
f pv2 q “ p1, 0, 3q “ av1 ` bv2 ` cv3
f pv3 q “ p4, 4, 8q “ 0v1 ` 0v2 ` 4v3 .
We compute a, b, c above by solving the system
p1, 0, 3q “ pa ` b ` c, b ` c, a ` b ` 2cq.
Thus b “ ´c, 1 “ a and then c “ 2. Thus
¨ ˛
0 1 0
A “ ˝ 0 ´2 0 ‚.
0 2 4
In order to compute B we could use the base change matrix, but it
is also possible to compute B directly. We will express the standard
basis vectors p1, 0, 0q as a linear combination of the v1 , v2 , v3 :
av1 ` bv2 ` cv3 “ ap1, 0, 1q ` bp1, 1, 1q ` cp1, 1, 2q
“ pa ` b ` c, b ` c, a ` b ` 2cq.

Thus, the equation p1, 0, 0q “ av1 ` bv2 ` cv3 amounts to the linear
system
1“a`b`c
0“b`c
0 “ a ` b ` 2c
One solves this: a “ 1, b “ 1, c “ ´1. Similarly, one solves the
linear system p0, 1, 0q “ av1 ` bv2 ` cv3 . Its solution is a “ ´1,
b “ 1, c “ 0. Finally, for p0, 0, 1q “ av1 ` bv2 ` cv3 one gets the
solution a “ 0, b “ ´1, c “ 1.
Thus, since f is linear (!), we have
f p1, 0, 0q “ f pv1 ` v2 ´ v3 q
“ f pv1 q ` f pv2 q ´ f pv3 q
“ p0, 0, 0q ` p1, 0, 3q ´ 4p1, 1, 2q
“ p´3, ´4, ´5q.
Likewise
f p0, 1, 0q “ f p´v1 ` v2 q “ ´f pv1 q ` f pv2 q “ p1, 0, 3q
f p0, 0, 1q “ f p´v2 ` v3 q “ ´f pv2 q ` f pv3 q “ p3, 4, 5q

Therefore (writing f p1, 0, 0q as the first column etc.), we get


¨ ˛
´3 1 3
B“ ˝ ´4 0 4 ‚.
´5 3 5
The vector vt belongs to the image precisely if is a linear combina-
tion of the vectors f pv1 q “ 0, f pv2 q “ p1, 0, 3q and f pv3 q “ p4, 4, 8q.
This translates into the linear system
a ` 3b “ 2
4b “ t
3a ` 5b “ 5.
One solves the first and third equation to a “ 45 , b “ 14 . Therefore,
the system has a solution precisely
ˆ ˙ if t “ˆ1. (Alternatively,
˙ one may
x 2
also solve the linear system B y “ t .)
z 5

Solution of Exercise 5.14: The matrix A is invertible precisely


if det A ‰ 0. We compute the determinant, for example using
Sarrus' rule, as det A “ 0 ´ 24 ` 6t ´ 10t ´ 0 ` 30 “ 6 ´ 4t. Thus,
the condition det A “ 6 ´ 4t “ 0 amounts to t “ 3{2. The matrix A
is therefore not invertible precisely if t “ 3{2.
¨ ˛
0 2 2
We compute the eigenvalues of A “ ˝ ´3 ´5 6 ‚ by com-
´2 ´2 5
puting its characteristic polynomial. It is given by
χA ptq “ tp5 ` tqp5 ´ tq ´ 24 ` 12 ` 4p´5 ´ tq ´ 12t ` 6p5 ´ tq
“ ´t3 ` 3t ´ 2.
One zero of this polynomial is t “ 1. Dividing the above polynomial
by t ´ 1 gives ´t2 ´ t ` 2, which has zeroes 1 and ´2, respectively.
Thus
χA ptq “ ´pt ´ 1q2 pt ` 2q.
The eigenvalues of A are therefore λ “ 1 and λ “ ´2.
We compute the eigenspaces by bringing A´λid into row echelon
form
¨ ˛ ¨ ˛
2 2 2 1 1 1
A ´ p´2qid “ ˝ ´3 ´3 6 ‚ ⇝ ˝ 1 1 ´2 ‚
´2 ´2 3 2 2 ´3
¨ ˛ ¨ ˛
1 1 1 1 1 0
⇝ ˝ 0 0 1 ‚ ⇝ ˝ 0 1 0 ‚.
0 0 1 0 0 0
This matrix has rank 2, and its kernel is thus 1-dimensional. It is
spanned by p1, ´1, 0q. Similarly
¨ ˛ ¨ ˛
´1 2 2 1 ´2 ´2
A ´ id “ ˝ ´3 ´3 6 ‚ ⇝ ˝ 0 1 0 ‚
´2 ´2 4 0 1 0
¨ ˛
1 0 ´2
⇝ ˝ 0 1 0 ‚.
0 0 0
This again has rank 2, so that the eigenspace E1 is again 1-dimensional.
It is spanned by p2, 0, 1q.

Since v “ p2, 0, aq was requested to be an eigenvector, it will be in


one of the two eigenspaces. One sees it must lie in E1 , and p2, 0, aq
lies in E1 precisely if a “ 1. Thus v “ p2, 0, 1q, and its eigenvalue is
1.
The matrix A is not diagonalizable, since dim E2 ` dim E1 “ 2 ă
3.
The matrix A is not similar to A2 since similar matrices have the
same determinant. Above we computed det A “ 6 ´ 4t, so for t “ 2
we have det A “ ´2, so that det A2 “ p´2q2 “ 4 ‰ det A.
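A supplementary numerical check (not in the original text) of the eigenvalues and the determinant computed above, for the matrix with t “ 2:

    import numpy as np

    A = np.array([[ 0.,  2., 2.],
                  [-3., -5., 6.],
                  [-2., -2., 5.]])
    w, V = np.linalg.eig(A)
    print(np.round(np.sort(w.real), 4))        # approximately [-2, 1, 1]
    print(round(float(np.linalg.det(A)), 4))   # -2.0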

C.6 Euclidean spaces

Solution of Exercise 6.2: The given equation describing U can


be rewritten as
ˆ ˙ Bˆ ˙ ˆ ˙F
` ˘ x 1 x
1 ´1 3 y “ ´1 , y “ 0,
z 3 z

which gives, with free parameters y “ a and z “ b, x “ a ´ 3b. Thus


U “ tpa ´ 3b, a, bq | a, b P Ru “ Lpp1, 1, 0q, p´3, 0, 1qq.
As for the orthogonal complement U K , we know that dim U K “
3 ´ dim U “ 1 (Corollary 6.29). In order to compute U K we need
to find the vectors that are orthogonal to U . The above equation
ˆ ˙
1
tells us that U is the orthogonal complement of the vector ´1 .
3
ˆ ˙
1
Therefore U K “ Lp ´1 q.
3
In order to compute the projection of t onto U , we apply the
Gram–Schmidt orthogonalization method. The vector v1 “ p1, 1, 0q
has norm ||v1 || “ 2, so that w1 “ ?12 p1, 1, 0q. Then
„ ˆ ? ˙ȷ
1 1{ 2
w2 :“ v2 ´ xv2 , w1 yw1 “ p´3, 0, 1q ´ p´3, 0, 1q 1{?2 w1
0

3 1
“ p´3, 0, 1q ´ ? ? p1, 1, 0q
2 2
3 3
“ p´3, 0, 1q ´ p , , 0q
2 2
9 3
“ p´ , ´ , 1q.
2 2
b b
81 9 37
We finally normalize this vector: ||w21 || “ 4
` 4
` 1 “ 2
.
Therefore c
2 9 3
w2 “ p´ , ´ , 1q.
37 2 2
According to Theorem 6.24, the orthogonal projection is given by
„ ˆ ? ˙ȷ « c ff
1{?2 2 9 3
xt, w1 yw1 ` xt, w2 yw2 “ p0, 1, 5q 1{ 2 w1 ` p0, 1, 5q p´ , ´ , 1q w2
0 37 2 2
c
1 2 7
“ ? w1 ` w2 .
2 37 2
An additional observation is the following: since t “ tU ` tK
is the unique decomposition, we can compute tU “ t ´ tK . By the
positive definiteness of x´, ´y we have pU K qK “ U , so we can also
apply Gram–Schmidt to U K . This is somewhat simpler since U K
has dimension
? 1. Applying Gram–Schmidt to v “ p1, ´1, 3q, we
1
have ||v|| “ 11, so that w “ ?11 p1, ´1, 3q is an orthonormal basis
of U K . Then, again by Theorem 6.24
tK “ xt, wyw
   “ xp0, 1, 5q, p1{?11qp1, ´1, 3qy w
   “ p14{?11q ¨ p1{?11q ¨ p1, ´1, 3q
   “ p14{11, ´14{11, 42{11q.
Therefore
tU “ t ´ tK “ p0, 1, 5q ´ p14{11, ´14{11, 42{11q “ p´14{11, 25{11, 13{11q.
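As a supplementary check of this decomposition (not part of the original solution; it assumes, as above, t “ p0, 1, 5q and that U is the orthogonal complement of p1, ´1, 3q):

    import numpy as np

    t = np.array([0., 1., 5.])
    n = np.array([1., -1., 3.])          # normal vector of U

    t_perp = (t @ n) / (n @ n) * n       # component in the orthogonal complement
    t_U = t - t_perp                     # orthogonal projection onto U
    print(t_perp)   # [ 14/11, -14/11, 42/11 ]
    print(t_U)      # [-14/11,  25/11, 13/11 ]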
11 11 11

Solution of Exercise 6.4: U is given by the solutions of the


homogeneous linear system
ˆ ˙ˆ ˙ ˆ ˙
1 0 0 x 0
y “ .
1 1 1 z 0

Theˆleft hand matrix


˙ can be brought to reduced row echelon form
1 0 0
as , so that z “ a is a free parameter and x “ 0,
0 1 1

y “ ´a. This shows that U “ tp0, ´a, aq | a P Ru and p0, ´1, 1q is


a basis vector of U . The orthogonal complement consists of vectors
orthogonal to p0, ´1, 1q. As in the solution of Exercise 6.2 above, U
has
ˆ been
˙ defined
ˆ ˙as the orthogonal complement of the two vectors
1 1
0 and 1 . These vectors therefore constitute a basis of U K .
0 1
We compute the orthogonal projection of t “ p5, 1, 3q onto U and
U K . This can be done using Gram–Schmidt orthogonalization as
above, but also by solving the linear system
ˆ ˙ ˆ ˙ ˆ ˙ ˆ ˙
5 0 1 1
1 “ a ´1 ` b 0 ` c 1 .
3 1 0 1

ˆ ˙
a
This is quickly solved as b “ 132, so that the projection of t
c
ˆ ˙ ˆ ˙
0 0
onto U is a ´1 “ ´1 .
1 1

Solution of Exercise 6.5: The orthogonal complement of T


is given by vectors px, y, zq such that x ´ 3z “ 0. I.e., with free
parameters y “ a, z “ b, x “ 3b. Thus

T K “ Lpp3, 0, 1q, p0, 1, 0qq.

Solution of Exercise 6.6:


We want to find U such that U ‘ U K “ R3 and t “ p1, 5, 6q ` tK ,
with p1, 5, 6q P U and tK P U K . This in particular means that
p1, 5, 6qKtK . Solving the equation

p1, 1, 0q “ p1, 5, 6q ` tK

gives tK “ p1, 1, 0q ´ p1, 5, 6q “ p0, ´4, ´6q. But

xp1, 5, 6q, p0, ´4, ´6qy “ ´20 ` 36 “ 16 ‰ 0,

so these two vectors are not orthogonal. Therefore there is no such


subspace U .
We solve the second part similarly: we have t´p1, 1, 1q “ p1, ´1, 0q,
so the Ansatz is tK “ p1, ´1, 0q. We compute xp1, 1, 1q, p1, ´1, 0qy “
0, so these vectors are orthogonal. We now compute U : p1, 1, 1q P U ,
so that we can take U “ Lp1, 1, 1q.

Solution of Exercise 6.7: According to Proposition 6.41, the


unique point v P R3 that is lying on L and as close as possible to
the origin is given by
v “ v0 ´ pL pv0 q,
ˆ ˙
1
where v0 “ 3. We will use Theorem 6.24 in order to compute
5
ˆ ˙
1
this. The underlying subspace W of L is spanned by v1 “ 1 .
4
Renormalizing this vector to norm one, gives w1 “ ||vv11 || “ ?v18
1
. This
vector w1 is therefore an orthonormal basis of W . We then have
ˆ ˙
24 1
pL pv0 q “ xv0 , w1 yw1 “ 1
18 4
and therefore
ˆ ˙ ˆ ˙ ˆ ˙
1 24 1 ´1{3
v“ 3 ´ 1 “ 5{3 .
5 18 4 ´1{3

This vector has norm


a ?
||v|| “ 27{9 “ 3.
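A supplementary numerical check (not in the original text) of the closest point and its distance, using the point v0 “ p1, 3, 5q and the direction p1, 1, 4q from above:

    import numpy as np

    v0 = np.array([1., 3., 5.])   # a point on L
    d  = np.array([1., 1., 4.])   # direction vector of L

    v = v0 - (v0 @ d) / (d @ d) * d   # closest point of L to the origin
    print(v)                          # [-1/3, 5/3, -1/3]
    print(np.linalg.norm(v))          # sqrt(3)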

Solution ofˆ Exercise


˙ 6.8: The subspace W underlying L is
1
spanned by 1 , while the equations for L1 are equivalent to x “ z
1
and
ˆ y˙ “ z ` 2. Therefore, this line has the underlying subspace
ˆ ˙
1 1
1 as well. Therefore, the lines are parallel. Let w “ 1
1 1
be the direction vector for both lines. Below,
ˆ we˙will consider its
1
renormalization to norm 1, which is w1 “ ?13 1 .
1
ˆ ˙ ˆ ˙
1 0
Let v0 “ 0 and v01 “ ´2 . Then the distance vector
2 0
ˆ ˙
1
d “ v0 ´ v01 “ 2 is not orthogonal to L (xd, wy ‰ 0). We
2

compute the orthogonal projection of d onto W K by computing


ˆ ˙ ˆ ˙ ˆ ˙
1 1 1 1 ´2
d ´ xd, w1 yw1 “ 2 ´ 1 xd, wy “ 1 .
2 3 1 3 1
This vector has norm ?p2{3q, which is therefore the distance of L
and L1 .

Solution of Exercise
ˆ ˙ 6.9: The twoˆlines ˙
have underlying vector
1 1
spaces W “ Lp 1 q and W 1 “ Lp ´1 q respectively. These
1 1{2
two one-dimensional subspaces are not contained in each other: the
two vectors are linearly
ˆ independent.
˙ ˆ ˙ ˆ ˙
0 2 ´2
Let d “ v0 ´ v01 “ 1 ´ 0 “ 1 . We compute the
0 0 0
orthogonal complement of
ˆ ˙ ˆ ˙ ˆ ˙ ˆ ˙
1 1 p 1 p
Z “ W ` W “ Lp 1 , 1 1{2q “ Lp 1 , 2 1q.
1 ´1 1 ´2

It is given by the equations

x`y`z “0
2x ´ 2y ` z “ 0
ˆ ˙
1
which can be solved as Lpp1, 1, ´2qq. This vector has norm ?6, and
is renormalized to norm 1 as wK “ p1{?6qp1, 1, ´2q. We compute the
orthogonal projection of d onto Z K as
ˆ ˙ ˆ ˙ ˆ ˙
´2 ´1 1 1 ´11
pZ K pdq “ d ´ xd, wK ywK “ 1 ´ 1 “ 7 .
0 6 ´2 6 ´2

b
This vector has norm 29 6
.
In particular, this distance is positive. This, together with the
above observation that W ‰ W 1 means that the lines are skew.
(If one is only required to show that the lines are skew, without
computing their distance, one can also solve the linear system given
by the intersection of L and L1 ; one finds out that this system has
no solution, so the lines are skew.)

Solution of Exercise 6.10: A useful way to sketch lines and


planes is by considering some points on them where several coordi-
nates are zero. In the case of P , three such points are p5, 0, 0q, p0, 4, 0q, p0, 0, 2q,

while for L two such points are p0, 5, 0q, p0, 0, 5q. This leads to the
following sketch:
[Figure: sketch of the plane P and the line L in the px, y, zq coordinate system.]
The equation can be rewritten as

Bˆ ˙ ˆ ˙F
4 x
5 , y “ 20.
10 z

Thus, the underlying


ˆ ˙ vector space W is the orthogonal complement
4
of the vector 5 .
10
ˆ ˙
0
1
The line L has as its underlying vector space W “ Lp 1 q.
ˆ´1 ˙
0
If P and L are parallel, then W 1 Ă W . This means that 1
´1
ˆ ˙
4
is orthogonal to 5 . Their scalar product is 5 ´ 10 “ ´5 ‰ 0,
10
so that these vectors are not orthogonal and therefore L and P are
not parallel.

We compute the distance of P to the origin using Proposition 6.43:


20 20
dp0, P q “ ˆ ˙ “? .
4 141
|| 5 ||
10

We compute the closest point by determining the (unique) intersec-


K
tion point
ˆ P˙ X Wˆ . We ˙ are thus seeking the real number r such
4 4r
that r 5 “ 5r lies on the plane P , i.e., such that
10 10r
Bˆ ˙ ˆ ˙F
4 4r
5 , 5r “ 20.
10 10r

20
The left hand side equals rp16 ` 25 ` 100q “ 141r, so that r “ 141 ,
and therefore the point in P that is as close as possible to the origin
is ˆ ˙
20 4
5 .
141 10

Solution of Exercise 6.11: We apply Theorem 6.36, according


to which a matrix is orthogonally
ˆ diagonalizable
˙ if and only if it is
1 2
symmetric. This excludes A “ . The other two matrices
´2 1
are symmetric
ˆ and
˙ therefore orthogonally diagonalizable. The ma-
0 0
trix A “ is already a diagonal matrix, so for P “ id the
0 0
matrix P AP ´1 is diagonal. The standard basis vectors e1 , e2 are an
orthonormalˆeigenbasis.
˙
1 2
For A “ , we compute the eigenvalues as was indicated
2 1
after Theorem 6.36.
c
a`d pa ´ dq
λ1{2 “ ˘ ` b2
2 4
“ 1 ˘ 2.
Thus the eigenvalues areˆ´1 and˙3. We compute the eigenspaces.
2 2
The matrix A´p´1qid “ has kernel given by w1 :“ e1 ´e2 .
ˆ 2 2 ˙
´2 2
The matrix A´3id “ has kernel given by w2 :“ e1 `e2 .
2 ´2

These vectors are orthogonal:


A´ ¯ ´ ¯E
1 1
xe1 ´ e2 , e1 ` e2 y “ ´1
, 1
“ 0.

However, they are not normal, so an orthonormal eigenbasis for A


is given by
w1 {||w1 || “ p1{?2qp1, ´1q and w2 {||w2 || “ p1{?2qp1, 1q.
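As a supplementary check (not part of the original solution), numpy's eigh, which is designed for symmetric matrices, returns the eigenvalues ´1 and 3 together with an orthogonal matrix of eigenvectors:

    import numpy as np

    A = np.array([[1., 2.],
                  [2., 1.]])
    w, Q = np.linalg.eigh(A)            # eigh is tailored to symmetric matrices
    print(w)                            # [-1.  3.]
    print(np.round(Q.T @ A @ Q, 6))     # diagonal matrix with entries -1 and 3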

Solution of Exercise 6.12: If Av “ λv, Aw “ λw, we compute


λxv, wy “ xλv, wy
“ xAv, wy
“ pAvqT w
“ v T AT w
“ v T Aw since A “ AT
“ xv, Awy
“ xv, µwy
“ µxv, wy.
Since λ ‰ µ, this forces xv, wy “ 0.

Solution of Exercise 6.13: By definition, the hyperplane P is


given by the equation
@ D
p2, 0, 1, ´1qT , px1 , . . . , x4 qT “ 4.
In other words, the underlying sub-vector space W Ă R4 of that
hyperplane is the subspace
@ D
p2, 0, 1, ´1qT , px1 , . . . , x4 qT “ 0,

or, what is the same, the orthogonal complement of p2, 0, 1, ´1qT .


The underlying subspace Wt1 of the line Lt is spanned by pt, 1, 0, ´1q.
By Definition 6.53, P is parallel to Lt if W Ă Wt1 (which is impos-
sible for dimension reasons) or if Wt1 Ă W . The latter is equivalent
to pt, 1, 0, ´1q P W or, yet equivalently,
@ D
pt, 1, 0, ´1qT , p2, 0, 1, ´1qT “ 0.

This equates to 2t ` 1 “ 0 or t “ ´ 12 .
We now compute the pair(s) of points realizing the minimal dis-
tance between P and L :“ L´ 1 . We will use that these are exactly
2
the points pp, lq such that p ´ lKW and p ´ lKW 1 .
The point l P L “ L´ 1 is of the form
2

1
l “ p1, 0, 0, 1q ` rp´ , 1, 0, ´1q
2
r
“ p1 ´ s, 2s, 0, 1 ´ 2sq with s “ .
2
On the other hand a point p “ px1 , . . . , x4 q P P satisfies

2x1 ` x3 ´ x4 “ 4,

so we can take x1 “ a, x2 “ b and x3 “ c as free parameters and


get x4 “ 2a ` c ´ 4. That is p “ pa, b, c, 2a ` c ´ 4q. Therefore

p ´ l “ pa ` s ´ 1, b ´ 2s, c, 2a ` c ` 2s ´ 5q.

Note that a, b, c, s are the unknowns. We now determine the


values of these unknowns. The orthogonal complement of W , W K “
ppp2, 0, 1, ´1qT qK qK “ Lp2, 0, 1, ´1q. Here we use that for a subspace
U Ă Rn , we have
pU K qK “ U.
(Indeed, U Ă pU K qK : for u P U and u1 P U K we have xu, u1 y “ 0,
so that u P pU K qK . Both vector subspaces of Rn have the same
dimension, and therefore they agree.)
This means that we are looking for a, b, c, s P R such that the
following conditions are satisfied
• p´l is a multiple of p2, 0, 1, ´1q, say λp2, 0, 1, ´1q “ p2λ, 0, λ, ´λq.
This translates into a linear system

a ` s ´ 1 “ 2λ
b ´ 2s “ 0
c“λ
2a ` c ` 2s ´ 5 “ ´λ.

• We now spell out the condition that p ´ l is orthogonal to



p´ 12 , 1, 0, ´1qT or equivalently, to p´1, 2, 0, ´2q:


@ D
0 “ p ´ l, p´1, 2, 0, ´2qT
“ ´pa ` s ´ 1q ` 2pb ´ 2sq ´ 2p2a ` c ` 2s ´ 5q
“ ´5a ` 2b ´ 2c ´ 9s ` 11.

We obtain a linear system in the 5 unknowns a, b, c, s, λ:

a ` s ´ 2λ “ 1
b ´ 2s “ 0
c´λ“0
2a ` c ` 2s ` λ “ 5
´5a ` 2b ´ 2c ´ 9s “ ´11.

One solves this using Gaussian eliminiation (or, using an appro-


priate computer algebra system, cf. https://www.wolframalpha.
com/input?i=Solve%28%28-5a%2B2b-2c-9s%2B11%3D0%2Ca%2Bs-1%
3D2l%2Cb-2s%3D0%2Cc%3Dl%2C2a%2Bc%2B2s-5%3D-l%29%29) and
obtains as solutions the vectors

1 1
pa, b “ 4 ´ 2a, c “ , λ “ , s “ 2 ´ aq.
2 2

Thus

1 7
p “ pa, 4 ´ 2a, , 2a ´ q
2 2
l “ p1 ´ p2 ´ aq, 2p2 ´ aq, 0, 1 ´ 2p2 ´ aqq “ pa ´ 1, 4 ´ 2a, 0, ´3 ` 2aq.

This implies that for any point l P L, there is a unique point


p P P such that pp, lq realizes the minimal distance between P
and L.

Solution of Exercise 6.17: We compute in fact directly an


orthonormal basis, which in particular is then orthogonal. (The
property of also being normal is convenient further below.) We

apply the Gram–Schmidt procedure:


1
w1 “ ? p1, 2, 0, ´1q,
6
B F
1 1 1
w2 “ p0, ´4, 3, 4q ´ p0, ´4, 3, 4q, ? p, 1, 2, 0, ´1q ? p1, 2, 0, ´1q
6 6
“ p0, ´4, 3, 4q ` 2p1, 2, 0, ´1q
“ p2, 0, 3, 2q,
1
w2 “ ? p2, 0, 3, 2q.
17
The vectors w1 and w2 then constitute an orthonormal basis of U .
We compute the orthogonal complement of U by solving the equa-
tions (for some v P R4 )
xv, p1, 2, 0, ´1qy “ 0
xv, p0, ´4, 3, 4y “ 0.
We consider the matrix of the resulting linear system, and bring it
into row echelon form:
ˆ ˙ ˆ ˙ ˆ ˙
1 2 0 ´1 1 2 0 ´1 1 0 23 1
⇝ ⇝ .
0 ´4 3 4 0 1 ´ 34 1 0 1 ´ 43 1
Thus, if v “ px1 , . . . , x4 q, then x3 and x4 are free variables, so that a
basis of U K is given by the two vectors p´ 23 , 34 , 1, 0q and p´1, 1, 0, 1q.
We compute the orthogonal projection
pU pvq “ xv, w1 yw1 ` xv, w2 yw2
“ p3, 2, 3, 1q.

If we write w “ l ` lK with lK P LK , then lK “ w ´ l “


p2, ´1, 0, 2q ´ p1, 1, 2, 0q “ p1, ´2, ´2, 2q. This vector would need
to be orthogonal to p1, 1, 2, 0q, which however it is not (since their
scalar product is ´5 ‰ 0). Therefore, such a subspace L does not
exist.
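As a supplementary numerical check (not in the original text) of the Gram–Schmidt computation and the projection pU pvq “ p3, 2, 3, 1q found above:

    import numpy as np

    u1 = np.array([1.,  2., 0., -1.])
    u2 = np.array([0., -4., 3.,  4.])
    v  = np.array([0.,  5., 3.,  4.])

    # Gram-Schmidt on u1, u2, then the orthogonal projection of v onto U.
    w1 = u1 / np.linalg.norm(u1)
    w2 = u2 - (u2 @ w1) * w1
    w2 = w2 / np.linalg.norm(w2)
    proj = (v @ w1) * w1 + (v @ w2) * w2
    print(proj)   # [3. 2. 3. 1.]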

Solution of Exercise 6.18: We first compute L in the form
L = v0 + W. We choose v0 = (0, 1, -1). In addition, the point
(1, 2, 1) also lies in L. Therefore W is spanned by (1, 2, 1) - (0, 1, -1) = (1, 1, 2).

Similarly, we can compute M = (1, -2, 0) + L((2, 1, 1)). The
two vectors (1, 1, 2) and (2, 1, 1) are linearly independent, so that L
and M are neither parallel nor identical. We compute the intersection
L ∩ M. We have the equations 1 - y = x = 2y + 1, so that y = 0.
We also have 2x - 1 = z = y + 2, so that z = 2 and x = 3/2, but then
we get a contradiction to 1 - y = x. Thus L ∩ M = ∅. Therefore,
L is skew to M.

We compute the requested plane P by observing that its underlying
vector space W is spanned by (1, -1, 2) and (2, 1, 1). Moreover,
the point (1, 0, 2) lies in P. Thus P = (1, 0, 2) + L((1, -1, 2), (2, 1, 1)).
The orthogonal complement of W is spanned by (1, -1, -1), as one
sees by solving the linear system a - b + 2c = 0, 2a + b + c = 0. Thus,
P is given by an equation of the form

⟨v, (1, -1, -1)⟩ = d.

To compute the number d, we use that (1, 0, 2) ∈ P, so that
d = ⟨(1, 0, 2), (1, -1, -1)⟩ = -1.
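As a sanity check (ours, not part of the notes), the skewness of L and M and a normal vector of the plane can be verified with SymPy; the parametrizations below are the ones derived above.

# Sketch: L and M have no common point, and a normal vector of the plane.
import sympy as sp

t, s = sp.symbols('t s')
L_point = sp.Matrix([0, 1, -1]) + t * sp.Matrix([1, 1, 2])
M_point = sp.Matrix([1, -2, 0]) + s * sp.Matrix([2, 1, 1])

# No (t, s) makes the points coincide, so L and M do not intersect.
print(sp.solve(list(L_point - M_point), [t, s]))   # []

# A normal vector of the plane spanned by (1, -1, 2) and (2, 1, 1):
# the cross product solves a - b + 2c = 0, 2a + b + c = 0 up to scaling.
n = sp.Matrix([1, -1, 2]).cross(sp.Matrix([2, 1, 1]))
print(n.T)                                         # [[-3, 3, 3]] = -3*(1, -1, -1)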
A general point on M is of the form (1 + 2r, r, 2 + r). With
l = (0, 1, -1), the vector v := m - l = (1 + 2r, r - 1, 3 + r). The
line spanned by this vector is parallel to the plane defined by the
equation 3x - z = 0 precisely if ⟨v, (3, 0, -1)⟩ = 0. This leads to the
equation

3 + 6r - 3 - r = 0,

i.e., r = 0, so that m = (1, 0, 2).

We compute the coordinates of rα = (x, y, z) as z = α, x = (1 + α)/2,
y = (1 - α)/2. Similarly, sα = (2α - 3, α - 2, α). The
midpoints mα = (rα + sα)/2 = ((5α - 5)/4, (α - 3)/4, α) are precisely
the points on the line

(-5/4, -3/4, 0) + L((5/4, 1/4, 1)).
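The midpoint computation is easy to reproduce symbolically; the following SymPy sketch (ours, not part of the notes) confirms the coordinates of mα as well as the base point and direction vector of the resulting line.

# Sketch: the midpoints m_alpha sweep out an affine line.
import sympy as sp

alpha = sp.symbols('alpha')
r_alpha = sp.Matrix([(1 + alpha) / 2, (1 - alpha) / 2, alpha])
s_alpha = sp.Matrix([2 * alpha - 3, alpha - 2, alpha])

m_alpha = ((r_alpha + s_alpha) / 2).applyfunc(sp.expand)
print(m_alpha.T)                      # [5*alpha/4 - 5/4, alpha/4 - 3/4, alpha]

base = m_alpha.subs(alpha, 0)         # the point for alpha = 0
direction = sp.diff(m_alpha, alpha)   # constant direction vector
print(base.T, direction.T)            # [-5/4, -3/4, 0] and [5/4, 1/4, 1]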

Solution of Exercise 6.19: We have L = p + L(q - p), i.e.,
L = (3, 1, 0) + L((-3, 0, 3)). (Other solutions are possible as well,
e.g., L = q + L((-3, 0, 3)).) We determine whether r lies on L by
considering the linear system

3 - 3t = -3
1 = 0
3t = -3.

The second equation is a contradiction, therefore there is no t ∈ R
satisfying this linear system. Thus r ∉ L.

Solution of Exercise 6.20: The line M, given to us by the system
x + z = 2, x - 2y = 2, is also described as M = (2, 0, 0) + L((2, 1, -2)).
Its underlying vector space is therefore spanned by (2, 1, -2), while
the underlying vector space of L is spanned by (1, 0, -1). These
two vectors are linearly independent. We compute the intersection
of L and M by taking a general point l = (3 + t, 1, -t) ∈ L and
m = (2 + 2s, s, -2s) ∈ M, with t, s ∈ R. The three coordinates of
the equation l = m read:

3 + t = 2 + 2s
1 = s
-t = -2s.

The second and third equations give s = 1, t = 2. The first then
gives 5 = 4, which is a contradiction. Thus, there is no point lying
on both L and M, i.e., L ∩ M = ∅. Thus L and M are skew, so no
plane contains both L and M.
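The inconsistency of this system can also be read off from the ranks of its matrices (Rouché–Capelli); the following NumPy sketch (ours, not part of the notes) rewrites the three equations in the unknowns (t, s) and compares the ranks.

# Sketch: the system l = m in the unknowns (t, s) is inconsistent.
import numpy as np

# t - 2s = -1,  -s = -1,  -t + 2s = 0  (the three coordinate equations)
A = np.array([[ 1, -2],
              [ 0, -1],
              [-1,  2]], dtype=float)
b = np.array([-1, -1, 0], dtype=float)

rank_A  = np.linalg.matrix_rank(A)
rank_Ab = np.linalg.matrix_rank(np.column_stack([A, b]))
print(rank_A, rank_Ab)   # 2 3: the ranks differ, so there is no solution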

Solution of Exercise 6.21: Writing q = (x, y, z) for x, y, z ∈ R,
the line M is then given by M = p + L(q - p) = (-1, -1, -1) + L((x + 1, y + 1, z + 1)).

The line M will be orthogonal to L precisely if

(x + 1, y + 1, z + 1) ⊥ (1, 0, -1),

i.e., if x + 1 - z - 1 = 0, i.e., if x = z.

As in the solution of Exercise 6.20 above, the line M (with x = z)
intersects L precisely if the linear system

3 + t = -1 + s(x + 1)
1 = -1 + s(y + 1)
-t = -1 + s(x + 1)

has a solution (s, t) ∈ R^2. To get a clearer view of this, we write
down the augmented matrix for this linear system, keeping in mind
that x, y are parameters in this linear system, and s, t are the
unknowns. Note this is a linear system in two unknowns, but three
equations.
[ -x-1    1 | -4 ]      [ -x-1    1 | -4 ]
[ -y-1    0 | -2 ]  ⇝  [ -y-1    0 | -2 ]
[ -x-1   -1 | -1 ]      [ -2x-2   0 | -5 ]

and then, assuming y ≠ -1,

[ -x-1    1 | -4       ]      [ 0   1 | -4 + 2(x+1)/(y+1)  ]
[   1     0 | 2/(y+1)  ]  ⇝  [ 1   0 | 2/(y+1)            ]
[ -2x-2   0 | -5       ]      [ 0   0 | -5 + 2(2x+2)/(y+1) ].

If y = -1, then we have no solution in the above system, due to the
row (0 0 | -2) in this case. Otherwise, for y ≠ -1, we can perform the
elementary row operations as indicated above. This system has no
solution if the term -5 + 2(2x+2)/(y+1) ≠ 0. If, on the contrary, the
bottom right entry is zero, then the system does have a unique solution
(since we then have two leading ones in the matrix). This bottom
right entry is zero precisely if 5(y + 1) = 4x + 4, or, equivalently, if
y = (4x - 1)/5.

We therefore find that the line M through p and q = (x, y, z)
intersects L orthogonally precisely if z = x, y = (4x - 1)/5 and y ≠ -1.
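To illustrate the criterion, here is a symbolic check in SymPy (ours, not part of the notes): substituting z = x and y = (4x - 1)/5 into the system above, one finds a solution (s, t) depending on x, as predicted (for y ≠ -1, i.e. x ≠ -1).

# Sketch: for q = (x, (4x - 1)/5, x), the system in (s, t) is solvable.
import sympy as sp

x, s, t = sp.symbols('x s t')
y = (4 * x - 1) / 5          # the condition derived above
z = x                        # the orthogonality condition

eqs = [
    sp.Eq(3 + t, -1 + s * (x + 1)),
    sp.Eq(1, -1 + s * (y + 1)),
    sp.Eq(-t, -1 + s * (z + 1)),
]
print(sp.solve(eqs, [s, t]))   # expected: {s: 5/(2*(x + 1)), t: -3/2}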
Index

absolute value, 130
adjugate, 136
affine subspace, 175
associativity
  of addition, 32

basis, 55
  standard, 55
bilinear form, 161
bilinearity, 161

Cartesian equation, 179
cartesian equations, 180
characteristic polynomial, 145
codomain, 71
coefficient, 7, 40
cofactor, 136
cofactor expansion, 140
column space, 86
commutativity
  of addition, 32, 33
component, 30
composition, 94, 194
constant, 88
constant term, 7
coordinate system, 56

degree, 40
  of a polynomial, 41
derivative
  of a polynomial, 72
determinant, 132
  of a 2 × 2-matrix, 129
diagonalizable, 149
dimension, 56
direct sum, 42
distance, 165
distributive law, 33
distributivity law, 33
domain, 71

eigenbasis, 127, 151
eigenspace, 127, 148
eigenvalue, 127, 144
eigenvector, 144
elementary matrices, 102
elementary operations, 13
elementary row operations, 19
equation
  linear, 7
  nonlinear, 9
equivalent, 110
Euclidean vector space, 165
exponential, 150

field, 8
function
  bijective, 81
  constant, 40
  injective, 81
  linear, 40
  quadratic, 40
  surjective, 81
Fundamental theorem of algebra, 147

Gaussian algorithm, 19
Gaussian elimination, 18, 19
generating system, 48
Gram–Schmidt
  orthogonalization, 171

Hesse matrix, 164
Hesse normal form, 178
homogeneous linear system, 35
hyperplane, 177

identity matrix, 100
image, 81, 119
imaginary unit, 147
intersection, 194
  of subspaces, 38
intersects, 186
inverse, 91, 104
invertible, 104
isomorphism, 104

kernel, 82

law of cosines, 160
linear
  approximation, 9
  combination, 45
  equation, 7
  hull, 46
  independence, 51
  map, 71
  system, 9
  transformation, 71
linear combination, 45
linear hull, 46
linear map
  inverse, 104
linear system
  equivalent, 13
  homogeneous, 12
linearly dependent, 51
linearly independent, 51

main diagonal, 100
map
  identity, 73
  zero, 73
matrix, 15
  associated to a linear map, 93
  associated to a linear system, 17
  augmented, 17
  diagonal, 143
  indefinite, 163
  lower triangular, 138
  orthogonal, 173
  positive definite, 163
  rotation, 79, 141
  row-echelon, 18
  shearing, 151
  similar, 156
  square, 15
  upper triangular, 124, 138
Minkowski space, 163

negative definite, 163
norm, 160, 163
nullity
  of a linear map, 84
  of a matrix, 85

one-to-one, 81
onto, 81
ordered
  pair, 29
  triple, 29
  tuple, 29
origin, 36
orthogonal, 162, 165
orthogonal complement, 167
orthogonal projection, 169, 172
orthogonally diagonalizable, 173
orthonormal basis, 169
orthonormal system, 168

parallel, 186
parallelogram law, 32
polynomial, 40
  linear, 40
  quadratic, 40
preimage, 81, 119
principal axes, 174
principal axes theorem, 174
principal submatrix, 164
product, 95, 194
  of polynomials, 42
product formula, 137
proper subset, 194
Pythagorean theorem, 160

rank
  of a linear map, 84
  of a matrix, 85
rank-nullity theorem, 60, 84
realize the minimal distance, 176
reciprocal, 108
reflection, 72
row space, 86
row-echelon form, 18
  reduced, 18

Sarrus' rule, 135
scalar multiplication, 32, 33
scalar product, 159, 161
skew, 186
solution set, 7
span, 46, 48
standard basis, 55
subset, 194
subspace, 37
sum
  of polynomials, 41
  of subspaces, 47
  of vectors in R^n, 31
  of vectors in a vector space, 33
symmetric, 123
symmetry, 161
system, 9
system of linear equations, 9

trace, 123, 153
transpose, 114
trigonometric functions, 198
trivial solution, 13

union, 63, 194
unknown, 7

variable, 7
  free, 21
vector, 30, 33
  column, 15
  row, 15
  zero, 30, 33
vector equations, 180
vector space, 33
  finite-dimensional, 57
  infinite-dimensional, 57

zero vector, 30
