Linear Algebra - YEAR2 - (Waldron, Cherney, and Denton)
The LibreTexts mission is to unite students, faculty and scholars in a cooperative effort to develop an easy-to-use online platform
for the construction, customization, and dissemination of OER content to reduce the burdens of unreasonable textbook costs to our
students and society. The LibreTexts project is a multi-institutional collaborative venture to develop the next generation of open-
access texts to improve postsecondary education at all levels of higher learning by developing an Open Access Resource
environment. The project currently consists of 14 independently operating and interconnected libraries that are constantly being
optimized by students, faculty, and outside experts to supplant conventional paper-based books. These free textbook alternatives are
organized within a central environment that is both vertically (from advanced to basic level) and horizontally (across different fields)
integrated.
The LibreTexts libraries are Powered by NICE CXOne and are supported by the Department of Education Open Textbook Pilot
Project, the UC Davis Office of the Provost, the UC Davis Library, the California State University Affordable Learning Solutions
Program, and Merlot. This material is based upon work supported by the National Science Foundation under Grant No. 1246120,
1525057, and 1413739.
Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not
necessarily reflect the views of the National Science Foundation nor the US Department of Education.
Have questions or comments? For information about adoptions or adaptations contact [email protected]. More information on our
activities can be found via Facebook (https://facebook.com/Libretexts), Twitter (https://twitter.com/libretexts), or our blog
(http://Blog.Libretexts.org).
This text was compiled on 02/01/2024
TABLE OF CONTENTS
Licensing
5: Vector Spaces
5.1: Examples of Vector Spaces
5.2: Other Fields
5.3: Review Problems
6: Linear Transformations
6.1: The Consequence of Linearity
6.2: Linear Functions on Hyperplanes
6.3: Linear Differential Operators
6.4: Bases (Take 1)
6.5: Review Problems
7: Matrices
7.1: Linear Transformations and Matrices
7.2: Review Problems
7.3: Properties of Matrices
7.4: Review Problems
7.5: Inverse Matrix
7.6: Review Problems
7.7: LU Redux
7.8: Review Problems
8: Determinants
8.1: The Determinant Formula
8.2: Elementary Matrices and Determinants
8.3: Review Problems
8.4: Properties of the Determinant
8.5: Review Problems
13: Diagonalization
13.1: Diagonalization
13.2: Change of Basis
13.3: Changing to a Basis of Eigenvectors
13.4: Review Problems
14.3: Relating Orthonormal Bases
14.4: Gram-Schmidt and Orthogonal Complements
14.5: QR Decomposition
14.6: Orthogonal Complements
14.7: Review Problems
Index
Glossary
Detailed Licensing
Licensing
A detailed breakdown of this resource's licensing can be found in Back Matter/Detailed Licensing.
CHAPTER OVERVIEW
1.1: Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
Thumbnail: In three-dimensional Euclidean space, these three planes represent solutions of linear equations, and their
intersection represents the set of common solutions: in this case, a unique point. The blue line is the common solution to two of
these equations. (CC BY-SA 3.0; Alksentrs via Wikipedia)
This page titled 1: What is Linear Algebra? is shared under a not declared license and was authored, remixed, and/or curated by David Cherney,
Tom Denton, & Andrew Waldron.
1.1: What Are Vectors?
Here are some examples of things that can be added:
c. Polynomials: If $p(x) = 1 + x - 2x^2 + 3x^3$ and $q(x) = x + 3x^2 - 3x^3 + x^4$, then their sum $p(x) + q(x)$ is the new polynomial $1 + 2x + x^2 + x^4$.
d. Power series: If $f(x) = 1 + x + \frac{1}{2!}x^2 + \frac{1}{3!}x^3 + \cdots$ and $g(x) = 1 - x + \frac{1}{2!}x^2 - \frac{1}{3!}x^3 + \cdots$, then
\[ f(x) + g(x) = 2\left(1 + \frac{1}{2!}x^2 + \frac{1}{4!}x^4 + \cdots\right) \]
is also a power series.
e. Functions: If $f(x) = e^x$ and $g(x) = e^{-x}$, then their sum $f(x) + g(x)$ is the new function $2\cosh x$.
Stacks of numbers are not the only things that are vectors, as examples C, D, and E show. Because they "can be added'', you should
now start thinking of all the above objects as vectors! In Chapter 5 we will give the precise rules that vector addition must obey. In
the above examples, however, notice that the vector addition rule stems from the rules for adding numbers.
When adding the same vector over and over, for example
\[ x + x,\quad x + x + x,\quad x + x + x + x,\ \ldots\,, \qquad (1.1.1)\]
we will write
\[ 2x,\quad 3x,\quad 4x,\ \ldots\,. \qquad (1.1.2)\]
Defining $4x = x + x + x + x$ is fine for integer multiples, but does not help us make sense of $\frac{1}{3}x$. For the different types of vectors above, you can probably guess how to multiply a vector by a scalar, for example:
\[ \frac{1}{3}\begin{pmatrix}1\\1\\0\end{pmatrix} = \begin{pmatrix}\tfrac{1}{3}\\[2pt]\tfrac{1}{3}\\[2pt]0\end{pmatrix}. \qquad (1.1.4)\]
In any given problem that you are planning to describe using vectors, you need to decide on a way to add and scalar multiply
vectors.
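The following short Python sketch (not part of the original text; the helper functions are invented for illustration) shows the same two operations, addition and scalar multiplication, working on two of the vector types above.

# A minimal sketch of "addition" and "scalar multiplication" for 3-vectors and
# for polynomials (stored as lists of coefficients), using plain Python.
def add_vectors(u, v):
    return [ui + vi for ui, vi in zip(u, v)]

def scale_vector(c, u):
    return [c * ui for ui in u]

def add_polynomials(p, q):
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))   # pad with zero coefficients
    q = q + [0] * (n - len(q))
    return [pi + qi for pi, qi in zip(p, q)]

# p(x) = 1 + x - 2x^2 + 3x^3 and q(x) = x + 3x^2 - 3x^3 + x^4
p = [1, 1, -2, 3]
q = [0, 1, 3, -3, 1]
print(add_vectors([1, 1, 0], [0, 2, 5]))   # [1, 3, 5]
print(scale_vector(1/3, [1, 1, 0]))        # [0.333..., 0.333..., 0.0]
print(add_polynomials(p, q))               # [1, 2, 1, 0, 1], i.e. 1 + 2x + x^2 + x^4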
In summary: vectors are things that can be added to each other and multiplied by scalars.
1.1.1: Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 1.1: What Are Vectors? is shared under a not declared license and was authored, remixed, and/or curated by David Cherney, Tom
Denton, & Andrew Waldron.
1.2: What Are Linear Functions?
In calculus classes, the main subject of investigation was functions and their rates of change. In linear algebra, functions will again be the focus of your attention, but now functions of a very special type. In calculus, you probably encountered functions f(x), but were perhaps encouraged to think of this as a machine "f", whose input is some real number x. For each input x this machine outputs a single real number f(x).
In linear algebra, the functions we study will take vectors, of some type, as both inputs and outputs. We just saw that vectors are objects that can be added or scalar multiplied---a very general notion---so the functions we are going to study will look novel at first. So that things don't get too abstract, here are some questions that can be rephrased in terms of functions of vectors.
For a question about ordinary numbers, the machine needed is very simple:
\[ x \longmapsto 10x\,. \]
This is just like a function f(x) from calculus that takes in a number x and spits out the number f(x) = 10x. For a question whose unknown is a 3-vector, we need something more sophisticated:
\[ \begin{pmatrix}x\\y\\z\end{pmatrix} \longmapsto \begin{pmatrix}z\\-z\\y-x\end{pmatrix}. \]
The inputs and outputs are both 3-vectors. You are probably getting the gist by now, but here is the machine needed for a question whose unknown is a polynomial:
\[ p \longmapsto \begin{pmatrix}\displaystyle\int_{-1}^{1} p(y)\,dy\\[8pt] \displaystyle\int_{-1}^{1} y\,p(y)\,dy\end{pmatrix}. \]
The "blob" on the left represents all the vectors that you are allowed to input into the function L, and the blob on the right denotes the corresponding outputs. Hopefully you noticed that there are two vectors apparently not shown on the blob of outputs:
\[ L(u)+L(v) \quad\text{and}\quad cL(u)\,. \]
You might already be able to guess the values we would like these to take. If not, here is the answer; it is the key equation of the whole class, from which everything else follows:
1.2.1: 1. Additivity:
L(u + v) = L(u) + L(v) . (1.2.1)
1.2.2: 2. Homogeneity:
L(cu) = cL(u) . (1.2.2)
Most functions of vectors do not obey this requirement; linear algebra is the study of those that do. Notice that the additivity
requirement says that the function L respects vector addition: it does not matter if you first add u and v
and then input their sum into L, or first input u and v into L separately and then add the outputs . The same holds for
scalar multiplication--try writing out the scalar multiplication version of the italicized sentence. When a function of vectors obeys
the additivity and homogeneity properties we say that it is linear (this is the "linear'' of linear algebra). Together, additivity and
homogeneity are called linearity . Other, equivalent, names for linear functions are:
Linear Function = Linear Transformation = Linear Operator
The questions in cases (a) - (d) of our example can all be restated as a single equation:
Lv = w (1.2.3)
where v is an unknown and w a known vector, and L is a linear transformation. To check that this is true, one needs to know the
rules for adding vectors (both inputs and outputs) and then check linearity of L. Solving the equation Lv = w often amounts to
solving systems of linear equations, the skill you will learn in Chapter 2.
A great example is the derivative operator:
For any two functions f (x), g(x) and any number c , in calculus you probably learnt that the derivative operator satisfies
1. \( \dfrac{d}{dx}(cf) = c\dfrac{d}{dx}f \),
2. \( \dfrac{d}{dx}(f + g) = \dfrac{d}{dx}f + \dfrac{d}{dx}g \).
If we view functions as vectors with addition given by addition of functions and scalar multiplication just multiplication of
functions by a constant, then these familiar properties of derivatives are just the linearity property of linear maps.
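As a rough illustration (not from the text), the two derivative rules above can be checked numerically with a finite-difference approximation of d/dx; the function names below are invented for the example.

# A numerical check that differentiation respects scalar multiples and sums,
# using a centered finite-difference approximation of d/dx.
import math

def derivative(func, x, h=1e-6):
    """Approximate d/dx of func at x."""
    return (func(x + h) - func(x - h)) / (2 * h)

f, g = math.sin, math.exp
c, x = 3.0, 0.7

lhs = derivative(lambda t: c * f(t) + g(t), x)   # d/dx (c f + g)
rhs = c * derivative(f, x) + derivative(g, x)    # c f' + g'
print(lhs, rhs)   # the two numbers agree up to round-off error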
Before introducing matrices, notice that for linear maps L we will often write simply Lu instead of L(u). This is because the
linearity property of a linear transformation L means that L(u) can be thought of as multiplying the vector u by the linear operator
L. For example, the linearity of L implies that if u, v are vectors and c, d are numbers, then
\[ L(cu + dv) = cLu + dLv\,, \]
which feels a lot like the regular rules of algebra for numbers. Notice though, that "uL" makes no sense here.
Remark A sum of multiples of vectors cu + dv is called a linear combination of u and v .
1.2.3: Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 1.2: What Are Linear Functions? is shared under a not declared license and was authored, remixed, and/or curated by David
Cherney, Tom Denton, & Andrew Waldron.
1.3: What is a Matrix?
Matrices are linear functions of a certain kind. One way to learn about them is by studying systems of linear equations.
Example 1.3.1:
A room contains x bags and y boxes of fruit. Each bag contains 2 apples and 4 bananas and each box contains 6 apples and 8 bananas. There are 20 apples and 28 bananas in the room. Find x and y.
The values are the numbers x and y that simultaneously make both of the following equations true:
2 x + 6 y = 20
4 x + 8 y = 28 .
Here we have an example of a System of Linear Equations. It's a collection of equations in which variables are multiplied by constants and summed, and no variables are multiplied together: there are no powers of variables greater than one (like $x^2$ or $b^5$), no non-integer or negative powers of variables (like $y^{-1/2}$ or $a^{\pi}$), and no places where variables are multiplied together (like $ab$ or $xy$).
Information about the fruity contents of the room can be stored two ways:
i. In terms of the number of apples and bananas.
ii. In terms of the number of bags and boxes.
Intuitively, knowing the information in one form allows you to figure out the information in the other form.
Going from (ii) to (i) is easy:
If you knew there were 3 bags and 2 boxes it would be easy to calculate the number of apples and bananas, and doing so would have the feel
of multiplication (containers times fruit per container). In the example above we are required to go the other direction, from (i) to (ii). This
feels like the opposite of multiplication, i.e. , division. Matrix notation will make clear what we are "dividing'' by.
The goal of Chapter 2 is to efficiently solve systems of linear equations. Partly, this is just a matter of finding a better notation, but one that
hints at a deeper underlying mathematical structure. For that, we need rules for adding and scalar multiplying 2-vectors:
\[ c\begin{pmatrix}x\\y\end{pmatrix} := \begin{pmatrix}cx\\cy\end{pmatrix} \quad\text{and}\quad \begin{pmatrix}x\\y\end{pmatrix}+\begin{pmatrix}x'\\y'\end{pmatrix} := \begin{pmatrix}x+x'\\y+y'\end{pmatrix}. \qquad (1.3.1)\]
Writing our fruity equations as an equality between 2-vectors and then using these rules we have:
\[ \left.\begin{array}{r} 2x+6y=20\\ 4x+8y=28 \end{array}\right\} \;\Leftrightarrow\; \begin{pmatrix}2x+6y\\4x+8y\end{pmatrix}=\begin{pmatrix}20\\28\end{pmatrix} \;\Leftrightarrow\; x\begin{pmatrix}2\\4\end{pmatrix}+y\begin{pmatrix}6\\8\end{pmatrix}=\begin{pmatrix}20\\28\end{pmatrix}. \qquad (1.3.2)\]
Now we introduce an operator which takes in 2-vectors and gives out 2-vectors. We denote it by an array of numbers called a matrix.
The function $\begin{pmatrix}2&6\\4&8\end{pmatrix}$ is defined by
\[ \begin{pmatrix}2&6\\4&8\end{pmatrix}\begin{pmatrix}x\\y\end{pmatrix} := x\begin{pmatrix}2\\4\end{pmatrix}+y\begin{pmatrix}6\\8\end{pmatrix}. \]
A matrix with more columns acts on a longer vector in the same way; for example,
\[ \begin{pmatrix}1&0&3&4\\5&0&3&4\\-1&6&2&5\end{pmatrix}\begin{pmatrix}x\\y\\z\\w\end{pmatrix} := x\begin{pmatrix}1\\5\\-1\end{pmatrix}+y\begin{pmatrix}0\\0\\6\end{pmatrix}+z\begin{pmatrix}3\\3\\2\end{pmatrix}+w\begin{pmatrix}4\\4\\5\end{pmatrix}. \qquad (1.3.3)\]
Viewed as a machine that inputs and outputs 2-vectors, our $2\times 2$ matrix does the following:
\[ \begin{pmatrix}x\\y\end{pmatrix} \longmapsto \begin{pmatrix}2x+6y\\4x+8y\end{pmatrix}. \]
What vector $\begin{pmatrix}x\\y\end{pmatrix}$ satisfies
\[ \begin{pmatrix}2&6\\4&8\end{pmatrix}\begin{pmatrix}x\\y\end{pmatrix} = \begin{pmatrix}20\\28\end{pmatrix}\,? \]
Solution
This is of the same Lv = w form as our opening examples. The matrix encodes fruit per container. The equation is roughly fruit per
container times number of containers. To solve for fruit we want to "divide" by the matrix.
Another way to think about the above example is to remember the rule for multiplying a matrix times a vector. If you have forgotten this,
you can actually guess a good rule by making sure the matrix equation is the same as the system of linear equations. This would require that
\[ \begin{pmatrix}2&6\\4&8\end{pmatrix}\begin{pmatrix}x\\y\end{pmatrix} := \begin{pmatrix}2x+6y\\4x+8y\end{pmatrix}. \qquad (1.3.4)\]
Indeed this is an example of the general rule that you have probably seen before
\[ \begin{pmatrix}p&q\\r&s\end{pmatrix}\begin{pmatrix}x\\y\end{pmatrix} := \begin{pmatrix}px+qy\\rx+sy\end{pmatrix} = x\begin{pmatrix}p\\r\end{pmatrix}+y\begin{pmatrix}q\\s\end{pmatrix}. \qquad (1.3.7)\]
Notice that the second way of writing the output on the right hand side of this equation is very useful because it tells us what all possible outputs of a matrix times a vector look like: they are just sums of the columns of the matrix multiplied by scalars. The set of all possible outputs of a matrix times a vector is called the column space (it is also the image of the linear function defined by the matrix).
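Here is a minimal sketch (not from the original text) of this rule, computing a matrix times a vector as a weighted sum of the matrix's columns; the function name is invented for illustration.

# Matrix times vector as "sum of columns times scalars".
def matrix_vector(M, v):
    """M is a list of rows; v is a list with len(v) equal to the number of columns."""
    rows, cols = len(M), len(M[0])
    result = [0] * rows
    for j in range(cols):              # walk through the columns of M
        for i in range(rows):
            result[i] += v[j] * M[i][j]   # add v_j times column j
    return result

M = [[2, 6],
     [4, 8]]
print(matrix_vector(M, [1, 3]))   # [2*1 + 6*3, 4*1 + 8*3] = [20, 28]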
A matrix is an example of a Linear Function , because it takes one vector and turns it into another in a "linear'' way. Of course, we can have
much larger matrices if our system has more variables.
Matrices are linear functions. The statement of this for the matrix in our fruity example looks like
1. \( \begin{pmatrix}2&6\\4&8\end{pmatrix}\left[c\begin{pmatrix}x\\y\end{pmatrix}\right] = c\begin{pmatrix}2&6\\4&8\end{pmatrix}\begin{pmatrix}x\\y\end{pmatrix} \) and
2. \( \begin{pmatrix}2&6\\4&8\end{pmatrix}\left[\begin{pmatrix}x\\y\end{pmatrix}+\begin{pmatrix}x'\\y'\end{pmatrix}\right] = \begin{pmatrix}2&6\\4&8\end{pmatrix}\begin{pmatrix}x\\y\end{pmatrix}+\begin{pmatrix}2&6\\4&8\end{pmatrix}\begin{pmatrix}x'\\y'\end{pmatrix}. \)
These equalities can already be verified using only the rules we introduced so far.
Example 1.3.4:
Verify that $\begin{pmatrix}2&6\\4&8\end{pmatrix}$ is a linear operator.
Homogeneity:
\[ c\left[\begin{pmatrix}2&6\\4&8\end{pmatrix}\begin{pmatrix}a\\b\end{pmatrix}\right] = c\left[a\begin{pmatrix}2\\4\end{pmatrix}+b\begin{pmatrix}6\\8\end{pmatrix}\right] = c\left[\begin{pmatrix}2a\\4a\end{pmatrix}+\begin{pmatrix}6b\\8b\end{pmatrix}\right] = c\begin{pmatrix}2a+6b\\4a+8b\end{pmatrix} = \begin{pmatrix}2ac+6bc\\4ac+8bc\end{pmatrix}. \qquad (1.3.9)\]
On the other hand,
\[ \begin{pmatrix}2&6\\4&8\end{pmatrix}\left[c\begin{pmatrix}a\\b\end{pmatrix}\right] = \begin{pmatrix}2&6\\4&8\end{pmatrix}\begin{pmatrix}ca\\cb\end{pmatrix} = ca\begin{pmatrix}2\\4\end{pmatrix}+cb\begin{pmatrix}6\\8\end{pmatrix} = \begin{pmatrix}2ac+6bc\\4ac+8bc\end{pmatrix}, \]
so the two sides agree.
1.3.0.0.1: Additivity:
\[ \begin{pmatrix}2&6\\4&8\end{pmatrix}\left[\begin{pmatrix}a\\b\end{pmatrix}+\begin{pmatrix}c\\d\end{pmatrix}\right] = \begin{pmatrix}2&6\\4&8\end{pmatrix}\begin{pmatrix}a+c\\b+d\end{pmatrix} = (a+c)\begin{pmatrix}2\\4\end{pmatrix}+(b+d)\begin{pmatrix}6\\8\end{pmatrix} = \begin{pmatrix}2(a+c)\\4(a+c)\end{pmatrix}+\begin{pmatrix}6(b+d)\\8(b+d)\end{pmatrix} = \begin{pmatrix}2a+2c+6b+6d\\4a+4c+8b+8d\end{pmatrix}. \qquad (1.3.10)\]
On the other hand,
\[ \begin{pmatrix}2&6\\4&8\end{pmatrix}\begin{pmatrix}a\\b\end{pmatrix}+\begin{pmatrix}2&6\\4&8\end{pmatrix}\begin{pmatrix}c\\d\end{pmatrix} = \begin{pmatrix}2a+6b\\4a+8b\end{pmatrix}+\begin{pmatrix}2c+6d\\4c+8d\end{pmatrix} = \begin{pmatrix}2a+2c+6b+6d\\4a+4c+8b+8d\end{pmatrix}, \]
so the two sides agree.
We have come full circle; matrices are just examples of the kinds of linear operators that appear in algebra problems like those in section 1.2.
Any equation of the form M v = w with M a matrix, and v, w n -vectors is called a matrix equation. Chapter 2 is about efficiently solving
systems of linear equations, or equivalently matrix equations.
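For readers who want to experiment before Chapter 2, here is a quick numerical sketch (not part of the text, and it assumes the numpy library) that solves the fruity matrix equation M v = w directly.

# Solving M v = w numerically; Chapter 2 develops how to do this by hand.
import numpy as np

M = np.array([[2.0, 6.0],
              [4.0, 8.0]])
w = np.array([20.0, 28.0])

v = np.linalg.solve(M, w)   # the vector of (bags, boxes)
print(v)                    # [1. 3.]  -> x = 1 bag, y = 3 boxes
print(M @ v)                # [20. 28.] -> reproduces the fruit counts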
1.3.1: Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 1.3: What is a Matrix? is shared under a not declared license and was authored, remixed, and/or curated by David Cherney, Tom Denton, &
Andrew Waldron.
1.4: Review Problems
1. Problems A, B, and C of example 2 can all be written as Lv = w where
L : V ⟶ W , (1.4.1)
(read this as L maps the set of vectors V to the set of vectors W ). For each case write down the sets V and W where the vectors v
and w come from.
2. Torque is a measure of "rotational force''. It is a vector whose direction is the (preferred) axis of rotation. Upon applying a force
F on an object at point r the torque τ is the cross product r × F = τ .
Let us find the force $F$ (a vector) that one must apply to a wrench lying along the vector
\[ r = \begin{pmatrix}1\\1\\0\end{pmatrix}\ \text{ft}, \]
to produce a torque
\[ \begin{pmatrix}0\\0\\1\end{pmatrix}\ \text{ft lb:} \]
a. Find a solution by writing out this equation with $F = \begin{pmatrix}a\\b\\c\end{pmatrix}$. (Hint: Guess and check that a solution with $a = 0$ exists).
b. Add $\begin{pmatrix}1\\1\\0\end{pmatrix}$ to your solution and check that the result is a solution.
c. Give a physics explanation why there can be two solutions, and argue that there are, in fact, infinitely many solutions.
d. Set up a system of three linear equations with the three components of F as the variables which describes this situation. What
happens if you try to solve these equations by substitution?
3. The function P (t) gives gas prices (in units of dollars per gallon) as a function of t the year, and g(t) is the gas consumption
rate measured in gallons per year by an average driver as a function of their age. Assuming a lifetime is 100 years, what function
gives the total amount spent on gas during the lifetime of an individual born in an arbitrary year t ? Is the operator that maps g to
this function linear?
4. The differential equation (DE)
\[ \frac{d}{dt}f = 2f \qquad (1.4.2)\]
says that the rate of change of $f$ is proportional to $f$. It describes exponential growth because
\[ f(t) = f(0)e^{2t} \qquad (1.4.3)\]
satisfies the DE for any number $f(0)$. The number 2 in the DE is called the constant of proportionality. A similar DE is
\[ \frac{d}{dt}f = \frac{2}{t}\,f\,. \qquad (1.4.4)\]
5. Find a linear operator relating Pablo's representation to the "everyday" representation in terms of the number of apples and number
of oranges. Write your answer as a matrix. Hint: Let λ represent the amount of sugar in each apple.
6. Matrix Multiplication: Let M and N be matrices
\[ M = \begin{pmatrix}a&b\\c&d\end{pmatrix} \quad\text{and}\quad N = \begin{pmatrix}e&f\\g&h\end{pmatrix}. \qquad (1.4.5)\]
c. Try to answer the following common question, "Is there any sense in which these rules for matrix multiplication are
unavoidable, or are they just a notation that could be replaced by some other notation?"
d. Generalize your multiplication rule to 3 × 3 matrices.
7. Diagonal matrices: A matrix $M$ can be thought of as an array of numbers $m^i_j$, known as matrix entries, or matrix components. Compute $m^1_1$, $m^1_2$, $m^2_1$ and $m^2_2$.
The matrix entries $m^i_i$ whose row and column numbers are the same are called the diagonal of $M$. Matrix entries $m^i_j$ with $i \neq j$ are called off-diagonal. How many diagonal entries does an $n\times n$ matrix have? How many off-diagonal entries does an $n\times n$ matrix have?
If all the off-diagonal entries of a matrix vanish, we say that the matrix is diagonal. Let
\[ D = \begin{pmatrix}\lambda & 0\\ 0 & \mu\end{pmatrix} \quad\text{and}\quad D' = \begin{pmatrix}\lambda' & 0\\ 0 & \mu'\end{pmatrix}. \qquad (1.4.8)\]
Are these matrices diagonal and why? Use the rule you found in problem 6 to compute the matrix products $DD'$ and $D'D$. What do you observe? Do you think the same property holds for arbitrary matrices?
(In fact the same is true for $\{1, 2, 3\} = \{2, 3, 1\}$, etc., although we could make this an ordered set using $3 > 2 > 1$.)
i. Invent a function with domain $\{*, \star, \#\}$ and codomain $\mathbb{R}$. (Remember that the domain of a function is the set of all its allowed inputs and the codomain (or target space) is the set where the outputs can live. A function is specified by assigning exactly one codomain element to each element of the domain.)
ii. Choose an ordering on {∗, ⋆, #}, and then use it to write your function from part (i) as a triple of numbers.
iii. Choose a new ordering on {∗, ⋆, #} and then write your function from part(i) as a triple of numbers.
iv. Your answers for parts (ii) and (iii) are different yet represent the same function -- explain!
1.4.1: Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 1.4: Review Problems is shared under a not declared license and was authored, remixed, and/or curated by David Cherney, Tom
Denton, & Andrew Waldron.
CHAPTER OVERVIEW
Thumbnail: 3 planes intersect at a point. (CC BY-SA 4.0; Fred the Oyster).
2.1: Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 2: Systems of Linear Equations is shared under a not declared license and was authored, remixed, and/or curated by David
Cherney, Tom Denton, & Andrew Waldron.
2.1: Gaussian Elimination
Systems of linear equations can be written as matrix equations. Now you will learn an efficient algorithm for (maximally)
simplifying a system of linear equations (or a matrix equation) -- Gaussian elimination.
The system of equations
\[ \left.\begin{array}{r} x+y=27\\ 2x-y=0 \end{array}\right\} \qquad (2.1.1)\]
can be written as an augmented matrix
\[ \left(\begin{array}{rr|r} 1&1&27\\ 2&-1&0 \end{array}\right) \qquad (2.1.2)\]
or as a matrix equation
\[ \begin{pmatrix}1&1\\2&-1\end{pmatrix}\begin{pmatrix}x\\y\end{pmatrix} = \begin{pmatrix}27\\0\end{pmatrix}, \qquad (2.1.3)\]
although all three of the above equations denote the same thing.
Another interesting rewriting is
\[ x\begin{pmatrix}1\\2\end{pmatrix}+y\begin{pmatrix}1\\-1\end{pmatrix} = \begin{pmatrix}27\\0\end{pmatrix}. \qquad (2.1.4)\]
This tells us that we are trying to find which combination of the vectors $\begin{pmatrix}1\\2\end{pmatrix}$ and $\begin{pmatrix}1\\-1\end{pmatrix}$ adds up to $\begin{pmatrix}27\\0\end{pmatrix}$; the answer is "clearly"
\[ 9\begin{pmatrix}1\\2\end{pmatrix}+18\begin{pmatrix}1\\-1\end{pmatrix}. \]
Here is a larger example. The system of equations
\[ \begin{array}{r} 1x+3y+2z+0w=9\\ 6x+2y+0z-2w=0\\ -1x+0y+1z+1w=3 \end{array}\]
corresponds to the augmented matrix
\[ \left(\begin{array}{rrrr|r} 1&3&2&0&9\\ 6&2&0&-2&0\\ -1&0&1&1&3 \end{array}\right). \qquad (2.1.5)\]
Again, we are trying to find which combination of the columns of the matrix adds up to the vector on the right hand side.
For the general case of $r$ linear equations in $k$ unknowns, the number of equations is the number of rows $r$ in the augmented matrix, and the number of columns $k$ in the matrix left of the vertical line is the number of unknowns:
\[ \left(\begin{array}{cccc|c} a^1_1 & a^1_2 & \cdots & a^1_k & b^1\\ a^2_1 & a^2_2 & \cdots & a^2_k & b^2\\ \vdots & \vdots & & \vdots & \vdots\\ a^r_1 & a^r_2 & \cdots & a^r_k & b^r \end{array}\right). \qquad (2.1.7)\]
Entries left of the divide carry two indices; subscripts denote column number and superscripts row number. We emphasize, the
superscripts here do not denote exponents.
We now have three ways of writing the same question. Let's put them side by side as we solve the system by strategically adding
and subtracting equations. We will not tell you the motivation for this particular series of steps yet, but let you develop some
intuition first.
Example 2.1.1: How matrix equations and augmented matrices change in elimination
\[ \left.\begin{array}{r} x+y=27\\ 2x-y=0 \end{array}\right\} \;\Leftrightarrow\; \begin{pmatrix}1&1\\2&-1\end{pmatrix}\begin{pmatrix}x\\y\end{pmatrix}=\begin{pmatrix}27\\0\end{pmatrix} \;\Leftrightarrow\; \left(\begin{array}{rr|r}1&1&27\\2&-1&0\end{array}\right). \qquad (2.1.8)\]
With the first equation replaced by the sum of the two equations this becomes
\[ \left.\begin{array}{r} 3x+0=27\\ 2x-y=0 \end{array}\right\} \;\Leftrightarrow\; \begin{pmatrix}3&0\\2&-1\end{pmatrix}\begin{pmatrix}x\\y\end{pmatrix}=\begin{pmatrix}27\\0\end{pmatrix} \;\Leftrightarrow\; \left(\begin{array}{rr|r}3&0&27\\2&-1&0\end{array}\right). \qquad (2.1.9)\]
Let the new first equation be the old first equation divided by 3:
\[ \left.\begin{array}{r} x+0=9\\ 2x-y=0 \end{array}\right\} \;\Leftrightarrow\; \begin{pmatrix}1&0\\2&-1\end{pmatrix}\begin{pmatrix}x\\y\end{pmatrix}=\begin{pmatrix}9\\0\end{pmatrix} \;\Leftrightarrow\; \left(\begin{array}{rr|r}1&0&9\\2&-1&0\end{array}\right). \qquad (2.1.10)\]
Replace the second equation by the second equation minus two times the first equation:
\[ \left.\begin{array}{r} x+0=9\\ 0-y=-18 \end{array}\right\} \;\Leftrightarrow\; \begin{pmatrix}1&0\\0&-1\end{pmatrix}\begin{pmatrix}x\\y\end{pmatrix}=\begin{pmatrix}9\\-18\end{pmatrix} \;\Leftrightarrow\; \left(\begin{array}{rr|r}1&0&9\\0&-1&-18\end{array}\right). \qquad (2.1.11)\]
Let the new second equation be the old second equation divided by $-1$:
\[ \left.\begin{array}{r} x+0=9\\ 0+y=18 \end{array}\right\} \;\Leftrightarrow\; \begin{pmatrix}1&0\\0&1\end{pmatrix}\begin{pmatrix}x\\y\end{pmatrix}=\begin{pmatrix}9\\18\end{pmatrix} \;\Leftrightarrow\; \left(\begin{array}{rr|r}1&0&9\\0&1&18\end{array}\right). \qquad (2.1.12)\]
Did you see what the strategy was? To eliminate y from the first equation and then eliminate x from the second. The result was
the solution to the system.
Here is the big idea: Everywhere in the instructions above we can replace the word "equation" with the word "row" and interpret them as telling us what to do with the augmented matrix instead of the system of equations. Performed systematically, the result is the Gaussian elimination algorithm.
\[ \left(\begin{array}{rr|r}1&1&27\\2&-1&0\end{array}\right) \sim \left(\begin{array}{rr|r}1&0&9\\2&-1&0\end{array}\right) \sim \left(\begin{array}{rr|r}1&0&9\\0&1&18\end{array}\right) \qquad (2.1.13)\]
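The following Python sketch (not from the text) implements the row-reduction idea just described; rref is an invented helper name, and the tolerance used to decide whether an entry is zero is an arbitrary choice.

# A bare-bones Gaussian elimination to reduced row echelon form.
def rref(aug):
    """Row reduce an augmented matrix given as a list of rows (lists of numbers)."""
    M = [row[:] for row in aug]        # work on a copy
    rows, cols = len(M), len(M[0])
    pivot_row = 0
    for col in range(cols):
        # find a row at or below pivot_row with a nonzero entry in this column
        swap = next((r for r in range(pivot_row, rows) if abs(M[r][col]) > 1e-12), None)
        if swap is None:
            continue                   # no pivot in this column
        M[pivot_row], M[swap] = M[swap], M[pivot_row]          # row swap
        pivot = M[pivot_row][col]
        M[pivot_row] = [x / pivot for x in M[pivot_row]]       # scale pivot row to 1
        for r in range(rows):                                  # eliminate above and below
            if r != pivot_row and abs(M[r][col]) > 1e-12:
                factor = M[r][col]
                M[r] = [a - factor * b for a, b in zip(M[r], M[pivot_row])]
        pivot_row += 1
        if pivot_row == rows:
            break
    return M

print(rref([[1, 1, 27], [2, -1, 0]]))   # [[1.0, 0.0, 9.0], [0.0, 1.0, 18.0]]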
Example 2.1.2: Using Gaussian elimination
\[ \left.\begin{array}{r} x+y=5\\ x+2y=8 \end{array}\right\} \;\Leftrightarrow\; \left(\begin{array}{rr|r}1&1&5\\1&2&8\end{array}\right) \sim \left(\begin{array}{rr|r}1&1&5\\0&1&3\end{array}\right) \sim \left(\begin{array}{rr|r}1&0&2\\0&1&3\end{array}\right) \;\Leftrightarrow\; \left\{\begin{array}{r} x+0=2\\ 0+y=3 \end{array}\right. \qquad (2.1.14)\]
Note that in going from the first to second augmented matrix, we used the top left 1 to make the bottom left entry zero. For this
reason we call the top left entry a pivot. Similarly, to get from the second to third augmented matrix, the bottom right entry
(before the divide) was used to make the top right one vanish; so the bottom right entry is also called a pivot.
This name pivot is used to indicate the matrix entry used to "zero out'' the other entries in its column.
In general, we would like to reach an augmented matrix of the form
\[ \left(\begin{array}{rr|r} 1&0&a\\ 0&1&b \end{array}\right), \]
whose left block is called the Identity Matrix, since this would give the simple statement of a solution $x = a,\, y = b$. The same goes for larger systems of equations for which the identity matrix $I$ has 1's along its diagonal and all off-diagonal entries vanish:
\[ I = \begin{pmatrix} 1 & 0 & \cdots & 0\\ 0 & 1 & & 0\\ \vdots & & \ddots & \vdots\\ 0 & 0 & \cdots & 1 \end{pmatrix}. \qquad (2.1.16)\]
For many systems, it is not possible to reach the identity in the augmented matrix via Gaussian elimination. In any case, a certain
version of the matrix that has the maximum number of components eliminated is said to be the Row Reduced Echelon Form
(RREF).
\[ \left.\begin{array}{r} x+y=2\\ 2x+2y=4 \end{array}\right\} \;\Leftrightarrow\; \left(\begin{array}{rr|r}1&1&2\\2&2&4\end{array}\right) \sim \left(\begin{array}{rr|r}1&1&2\\0&0&0\end{array}\right) \;\Leftrightarrow\; \left\{\begin{array}{r} x+y=2\\ 0+0=0 \end{array}\right. \qquad (2.1.17)\]
This example demonstrates that if one equation is a multiple of the other, the identity matrix cannot be reached. This is because the first step in elimination will make the second row a row of zeros. Notice that solutions still exist: $x = 1,\, y = 1$ is a solution. The last augmented matrix here is in RREF.
\[ \left.\begin{array}{r} x+y=2\\ 2x+2y=5 \end{array}\right\} \;\Leftrightarrow\; \left(\begin{array}{rr|r}1&1&2\\2&2&5\end{array}\right) \sim \left(\begin{array}{rr|r}1&1&2\\0&0&1\end{array}\right) \;\Leftrightarrow\; \left\{\begin{array}{r} x+y=2\\ 0+0=1 \end{array}\right. \qquad (2.1.18)\]
This system of equations has a solution only if there exist two numbers $x$ and $y$ such that $0 + 0 = 1$. That is a tricky way of saying there are no solutions. The last form of the augmented matrix here is in RREF.
When the first entry of the first row vanishes, you might be tempted to start with
\[ \left.\begin{array}{r} 0x+y=-2\\ x+y=7 \end{array}\right\} \;\Leftrightarrow\; \left(\begin{array}{rr|r}0&1&-2\\1&1&7\end{array}\right) \sim \cdots, \qquad (2.1.20)\]
and then give up because the upper left slot cannot function as a pivot, since the 0 that lives there cannot be used to eliminate the entries below it. Of course, the right thing to do is to change the order of the equations before starting:
\[ \left.\begin{array}{r} x+y=7\\ 0x+y=-2 \end{array}\right\} \;\Leftrightarrow\; \left(\begin{array}{rr|r}1&1&7\\0&1&-2\end{array}\right) \sim \left(\begin{array}{rr|r}1&0&9\\0&1&-2\end{array}\right) \;\Leftrightarrow\; \left\{\begin{array}{r} x+0=9\\ 0+y=-2 \end{array}\right. \qquad (2.1.21)\]
The third augmented matrix above is the RREF of the first and second. That is to say, you can swap rows on your way to
RREF.
For larger systems of equations, these three kinds of problems are the obstruction to obtaining the identity matrix, and hence to a simple statement of a solution in the form $x = a,\, y = b,\, \ldots$. What can we do to maximally simplify a system of equations in general? We need to perform operations that simplify our system without changing its solutions. Because exchanging the order of equations, multiplying one equation by a non-zero constant, or adding equations does not change the system's solutions, we are led to three operations:
Row Swap:Exchange any two rows.
Scalar Multiplication: Multiply any row by a non-zero constant.
Row Sum: Add a multiple of one row to another row.
These are called Elementary Row Operations, or EROs for short, and are studied in detail in Section 2.3. Suppose now we have a
general augmented matrix for which the first entry in the first row does not vanish. Then, using just the three EROs, we could then
perform the following algorithm:
1. Make the leftmost nonzero entry in the top row 1 by multiplication.
2. Then use that 1 as a pivot to eliminate everything below it.
3. Then go to the next row and make the leftmost nonzero entry 1.
4. Use that 1 as a pivot to eliminate everything below and above it!
5. Go to the next row and make the leftmost nonzero entry 1... etc.
In the case that the first entry of the first row is zero, we may first interchange the first row with another row whose first entry is
non-vanishing and then perform the above algorithm. If the entire first column vanishes, we may still apply the algorithm on the
remaining columns.
This algorithm is known as Gaussian elimination; its endpoint is an augmented matrix of the form
\[ \left(\begin{array}{cccccccc|c} 1 & * & 0 & * & 0 & \cdots & 0 & * & b^1\\ 0 & 0 & 1 & * & 0 & \cdots & 0 & * & b^2\\ 0 & 0 & 0 & 0 & 1 & \cdots & 0 & * & b^3\\ \vdots & & & & & & \vdots & & \vdots\\ 0 & 0 & 0 & 0 & 0 & \cdots & 1 & * & b^k\\ 0 & 0 & 0 & 0 & 0 & \cdots & 0 & 0 & b^{k+1}\\ \vdots & & & & & & & \vdots & \vdots\\ 0 & 0 & 0 & 0 & 0 & \cdots & 0 & 0 & b^r \end{array}\right) \qquad (2.1.25)\]
This is called Reduced Row Echelon Form (RREF). The asterisks denote the possibility of arbitrary numbers (e.g., the second 1 in the top line of the example $x+y=2,\ 2x+2y=4$ above).
The following properties define RREF:
In every row the left most non-zero entry is 1 (and is called a pivot ).
The pivot of any given row is always to the right of the pivot of the row above it.
The pivot is the only non-zero entry in its column.
Here are some examples. The matrix
\[ \left(\begin{array}{cccc} 1&0&7&0\\ 0&1&3&0\\ 0&0&0&1\\ 0&0&0&0 \end{array}\right) \qquad (2.1.26)\]
is in RREF, while the matrix
\[ \left(\begin{array}{cccc} 1&0&3&0\\ 0&0&2&0\\ 0&1&0&1\\ 0&0&0&1 \end{array}\right) \qquad (2.1.27)\]
is not: the leading entry of its second row is not 1, and the pivot of its third row lies to the left of the pivot of the row above it.
The reason we need the asterisks in the general form of RREF is that not every column need have a pivot, as demonstrated in the examples above. Here is an example where multiple columns have no pivot:
\[ \left.\begin{array}{r} x+y+z+0w=2\\ 2x+2y+2z+1w=4 \end{array}\right\} \;\Leftrightarrow\; \left(\begin{array}{rrrr|r} 1&1&1&0&2\\ 2&2&2&1&4 \end{array}\right) \sim \left(\begin{array}{rrrr|r} 1&1&1&0&2\\ 0&0&0&1&0 \end{array}\right) \;\Leftrightarrow\; \left\{\begin{array}{r} x+y+z=2\\ w=0 \end{array}\right. \qquad (2.1.28)\]
Note that there was no hope of reaching the identity matrix, because of the shape of the augmented matrix we started with.
It is important that you are able to convert RREF back into a set of equations. The first thing you might notice is that if any of the numbers $b^{k+1}, \ldots, b^r$ are non-zero then the system of equations is inconsistent and has no solutions. Our next task is to extract all of the information about solutions that the RREF contains. The maximally simplified equations have two virtues: as many coefficients of the variables as possible are unity, and it is easier to read off solutions from the maximally simplified equations than from the original equations, even when there are infinitely many solutions.
Example 2.1.9: Standard approach from a system of equations to the solution set
\[ \left.\begin{array}{r} x+y+5w=1\\ y+2w=6\\ z+4w=8 \end{array}\right\} \;\Leftrightarrow\; \left(\begin{array}{rrrr|r} 1&1&0&5&1\\ 0&1&0&2&6\\ 0&0&1&4&8 \end{array}\right) \sim \left(\begin{array}{rrrr|r} 1&0&0&3&-5\\ 0&1&0&2&6\\ 0&0&1&4&8 \end{array}\right) \;\Leftrightarrow\; \left\{\begin{array}{r} x+3w=-5\\ y+2w=6\\ z+4w=8 \end{array}\right. \qquad (2.1.29)\]
\[ \Leftrightarrow\; \left\{\begin{array}{l} x=-5-3w\\ y=6-2w\\ z=8-4w\\ w=w \end{array}\right. \;\Leftrightarrow\; \begin{pmatrix}x\\y\\z\\w\end{pmatrix} = \begin{pmatrix}-5\\6\\8\\0\end{pmatrix} + w\begin{pmatrix}-3\\-2\\-4\\1\end{pmatrix}. \qquad (2.1.30)\]
The solution set is
\[ \left\{ \begin{pmatrix}-5\\6\\8\\0\end{pmatrix} + \alpha\begin{pmatrix}-3\\-2\\-4\\1\end{pmatrix} \;:\; \alpha\in\mathbb{R} \right\}. \qquad (2.1.31)\]
Here is a verbal description of the preceding example of the standard approach. We say that x, y, and z are pivot variables because
they appeared with a pivot coefficient in RREF. Since w never appears with a pivot coefficient, it is not a pivot variable. In the
second line we put all the pivot variables on one side and all the non-pivot variables on the other side and added the trivial equation
w = w to obtain a system that allowed us to easily read off solutions.
The last example demonstrated the standard approach for solving a system of linear equations in its entirety:
Write the augmented matrix.
Perform EROs to reach RREF.
Express the non-pivot variables in terms of the pivot variables.
There are always exactly enough non-pivot variables to index your solutions. In any approach, the variables which are not
expressed in terms of the other variables are called free variables. The standard approach is to use the non-pivot variables as free
variables.
Non-standard approach: solve for $w$ in terms of $z$ and substitute into the other equations. You now have an expression for each component in terms of $z$. But why pick $z$ instead of $y$ or $x$? (or $x + y$?) The standard approach not only feels natural, but is canonical, meaning that everyone will get the same RREF and hence choose the same variables to be free. However, it is important to remember that so long as their set of solutions is the same, any two choices of free variables are fine. (You might think of this as the difference between using Google Maps or Mapquest; although their maps may look different, the place ⟨home⟩ they are describing is the same!)
When you see an RREF augmented matrix with two columns that have no pivot, you know there will be two free variables.
\[ \left(\begin{array}{rrrr|r} 1&0&7&0&4\\ 0&1&3&4&1\\ 0&0&0&0&0\\ 0&0&0&0&0 \end{array}\right) \;\Leftrightarrow\; \left\{\begin{array}{r} x+7z=4\\ y+3z+4w=1 \end{array}\right\} \;\Leftrightarrow\; \left\{\begin{array}{l} x=4-7z\\ y=1-3z-4w\\ z=z\\ w=w \end{array}\right. \qquad (2.1.32)\]
\[ \Leftrightarrow\; \begin{pmatrix}x\\y\\z\\w\end{pmatrix} = \begin{pmatrix}4\\1\\0\\0\end{pmatrix} + z\begin{pmatrix}-7\\-3\\1\\0\end{pmatrix} + w\begin{pmatrix}0\\-4\\0\\1\end{pmatrix}, \qquad (2.1.33)\]
so the solution set is
\[ \left\{ \begin{pmatrix}4\\1\\0\\0\end{pmatrix} + z\begin{pmatrix}-7\\-3\\1\\0\end{pmatrix} + w\begin{pmatrix}0\\-4\\0\\1\end{pmatrix} \;:\; z, w\in\mathbb{R} \right\}. \qquad (2.1.34)\]
There are infinitely many solutions; one for each pair of numbers z, w. You can imagine having three, four, or fifty-six non-
pivot columns and the same number of free variables indexing your solutions set. In general a solution set to a system of
equations with n free variables will be of the form
{ P + μ1 H1 + μ2 H2 + ⋯ + μn Hn : μ1 , ⋯ , μn ∈ R } (2.1.35)
The parts of these solutions play special roles in the associated matrix equation. This will come up again and again long after
this discussion of basic calculation methods, so the general language of Linear Algebra will be used to give names to these
parts now.
A homogeneous solution to a linear equation Lx = v , with L and v known is a vector H such that LH = 0 where 0 is the zero
vector.
If you have a particular solution P to a linear equation and add a sum of multiples of homogeneous solutions to it you
obtain another particular solution.
Check now that the parts of the solutions with free variables as coefficients from the previous examples are homogeneous solutions, and that by adding a homogeneous solution to a particular solution one obtains a solution to the matrix equation. This will come up over and over again. As an example without matrices, consider the differential equation $\frac{d^2f}{dx^2} = 3$. A particular solution is $\frac{3}{2}x^2$, while $x$ and $1$ are homogeneous solutions. The solution set is $\left\{\frac{3}{2}x^2 + ax + b \;:\; a, b \in \mathbb{R}\right\}$. You can imagine similar differential equations with more homogeneous solutions.
2.1.5: Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 2.1: Gaussian Elimination is shared under a not declared license and was authored, remixed, and/or curated by David Cherney,
Tom Denton, & Andrew Waldron.
2.2: Review Problems
1. State whether the following augmented matrices are in RREF and compute their solution sets.
⎛ 1 0 0 0 3 1 ⎞
⎜ 0 1 0 0 1 2 ⎟
⎜ ⎟, (2.2.1)
⎜ ⎟
⎜ 0 0 1 0 1 3 ⎟
⎝ 0 0 0 1 2 0 ⎠
⎛ 1 1 0 1 0 1 0 ⎞
⎜ 0 0 1 2 0 2 0 ⎟
⎜ ⎟, (2.2.2)
⎜ ⎟
⎜ 0 0 0 0 1 3 0 ⎟
⎝ 0 0 0 0 0 0 0 ⎠
⎛ 1 1 0 1 0 1 0 1 ⎞
⎜ 0 0 1 2 0 2 0 −1 ⎟
⎜ ⎟
⎜ ⎟. (2.2.3)
0 0 0 0 1 3 0 1
⎜ ⎟
⎜ ⎟
⎜ 0 0 0 0 0 2 0 −2 ⎟
⎝ 0 0 0 0 0 0 1 1 ⎠
2. Solve the following linear system:
2 x1 + 5 x2 − 8 x3 + 2 x4 + 2 x5 = 0
6 x1 + 2 x2 − 10 x3 + 6 x4 + 8 x5 = 6
3 x1 + 6 x2 + 2 x3 + 3 x4 + 5 x5 = 6 (2.2.4)
3 x1 + 1 x2 − 5 x3 + 3 x4 + 4 x5 = 3
6 x1 + 7 x2 − 3 x3 + 6 x4 + 9 x5 = 9
Be sure to set your work out carefully with equivalence signs ∼ between each step, labeled by the row operations you performed.
3. Check that the following two matrices are row-equivalent:
1 4 7 10
( ) (2.2.5)
2 9 6 0
and
0 −1 8 20
( ). (2.2.6)
4 18 12 0
Now remove the third column from each matrix, and show that the resulting two matrices (shown below) are row-equivalent:
1 4 10
( ) (2.2.7)
2 9 0
and
0 −1 20
( ). (2.2.8)
4 18 0
Now remove the fourth column from each of the original two matrices, and show that the resulting two matrices, viewed as
augmented matrices (shown below) are row-equivalent:
1 4 7
( ) (2.2.9)
2 9 6
and
0 −1 8
( ). (2.2.10)
4 18 12
4. Show that the system of equations corresponding to the augmented matrix
\[ \left(\begin{array}{rr|r} 1&4&10\\ 3&13&9\\ 4&17&20 \end{array}\right) \qquad (2.2.11)\]
has no solutions. If you remove one of the rows of this matrix, does the new matrix have any solutions? In general, can row
equivalence be affected by removing rows? Explain why or why not.
5. Explain why the linear system has no solutions:
⎛ 1 0 3 1 ⎞
⎜ 0 1 2 4 ⎟ (2.2.12)
⎜ ⎟
⎝ 0 0 0 6 ⎠
For which values of k does the system below have a solution?
x − 3y = 6
x + 3z = −3 (2.2.13)
2x + ky + (3 − k)z = 1
6. Show that the RREF of a matrix is unique. (Hint: Consider what happens if the same augmented matrix had two different
RREFs. Try to see what happens if you removed columns from these two RREF augmented matrices.)
7. Another method for solving linear systems is to use row operations to bring the augmented matrix to Row Echelon Form (REF
as opposed to RREF). In REF, the pivots are not necessarily set to one, and we only require that all entries left of the pivots are
zero, not necessarily entries above a pivot. Provide a counterexample to show that row echelon form is not unique.
Once a system is in row echelon form, it can be solved by "back substitution." Write the following row echelon matrix as a system
of equations, then solve the system using back-substitution.
⎛ 2 3 1 6 ⎞
⎜ 0 1 1 2 ⎟ (2.2.14)
⎜ ⎟
⎝ 0 0 3 3 ⎠
8. Show that this pair of augmented matrices are row equivalent, assuming ad − bc ≠ 0 :
\[ \left(\begin{array}{rr|r} a&b&e\\ c&d&f \end{array}\right) \sim \left(\begin{array}{rr|r} 1&0&\dfrac{de-bf}{ad-bc}\\[8pt] 0&1&\dfrac{af-ce}{ad-bc} \end{array}\right) \qquad (2.2.15)\]
9. Consider the augmented matrix
\[ \left(\begin{array}{rr|r} 2&-1&3\\ -6&3&1 \end{array}\right). \qquad (2.2.16)\]
Give a geometric reason why the associated system of equations has no solution. (Hint, plot the three vectors given by the columns
of this augmented matrix in the plane.) Given a general augmented matrix
\[ \left(\begin{array}{rr|r} a&b&e\\ c&d&f \end{array}\right), \qquad (2.2.17)\]
can you find a condition on the numbers a, b, c and d that corresponds to the geometric condition you found?
2.2.2 https://math.libretexts.org/@go/page/1760
10. A relation ∼ on a set of objects U is an equivalence relation if the following three properties are satisfied:
∙ Reflexive: For any x ∈ U, we have x ∼ x.
∙ Symmetric: For any x, y ∈ U, if x ∼ y then y ∼ x.
∙ Transitive: For any x, y, z ∈ U, if x ∼ y and y ∼ z then x ∼ z.
Show that row equivalence of augmented matrices is an equivalence relation.
2.2.1: Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 2.2: Review Problems is shared under a not declared license and was authored, remixed, and/or curated by David Cherney, Tom
Denton, & Andrew Waldron.
2.3: Elementary Row Operations
Elementary row operations (EROs) are systems of linear equations relating the old and new rows in Gaussian elimination.
With the row operations $R_1' = 0R_1 + R_2 + 0R_3$, $R_2' = R_1 + 0R_2 + 0R_3$, $R_3' = 0R_1 + 0R_2 + R_3$,
\[ \left(\begin{array}{rrr|r} 0&1&1&7\\ 2&0&0&4\\ 0&0&1&4 \end{array}\right) \sim \left(\begin{array}{rrr|r} 2&0&0&4\\ 0&1&1&7\\ 0&0&1&4 \end{array}\right). \qquad (2.3.1)\]
With $R_1' = \tfrac12 R_1 + 0R_2 + 0R_3$, $R_2' = 0R_1 + R_2 + 0R_3$, $R_3' = 0R_1 + 0R_2 + R_3$,
\[ \sim \left(\begin{array}{rrr|r} 1&0&0&2\\ 0&1&1&7\\ 0&0&1&4 \end{array}\right). \qquad (2.3.2)\]
With $R_1' = R_1 + 0R_2 + 0R_3$, $R_2' = 0R_1 + R_2 - R_3$, $R_3' = 0R_1 + 0R_2 + R_3$,
\[ \sim \left(\begin{array}{rrr|r} 1&0&0&2\\ 0&1&0&3\\ 0&0&1&4 \end{array}\right). \qquad (2.3.3)\]
The same operations can be achieved by matrix multiplication:
\[ \begin{pmatrix}0&1&0\\1&0&0\\0&0&1\end{pmatrix} \left(\begin{array}{rrr|r} 0&1&1&7\\ 2&0&0&4\\ 0&0&1&4 \end{array}\right) = \left(\begin{array}{rrr|r} 2&0&0&4\\ 0&1&1&7\\ 0&0&1&4 \end{array}\right) \qquad (2.3.4)\]
\[ \begin{pmatrix}\frac12&0&0\\0&1&0\\0&0&1\end{pmatrix} \left(\begin{array}{rrr|r} 2&0&0&4\\ 0&1&1&7\\ 0&0&1&4 \end{array}\right) = \left(\begin{array}{rrr|r} 1&0&0&2\\ 0&1&1&7\\ 0&0&1&4 \end{array}\right) \qquad (2.3.5)\]
\[ \begin{pmatrix}1&0&0\\0&1&-1\\0&0&1\end{pmatrix} \left(\begin{array}{rrr|r} 1&0&0&2\\ 0&1&1&7\\ 0&0&1&4 \end{array}\right) = \left(\begin{array}{rrr|r} 1&0&0&2\\ 0&1&0&3\\ 0&0&1&4 \end{array}\right) \qquad (2.3.6)\]
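A brief numerical sketch (not part of the text, assuming numpy is available) confirming that left-multiplying by these three matrices performs the three row operations above:

# Each elementary row operation is left-multiplication by an elementary matrix.
import numpy as np

A = np.array([[0., 1., 1., 7.],
              [2., 0., 0., 4.],
              [0., 0., 1., 4.]])

E_swap  = np.array([[0., 1., 0.], [1., 0., 0.], [0., 0., 1.]])   # swap rows 1 and 2
E_scale = np.array([[0.5, 0., 0.], [0., 1., 0.], [0., 0., 1.]])  # halve the first row
E_sum   = np.array([[1., 0., 0.], [0., 1., -1.], [0., 0., 1.]])  # row 2 minus row 3

print(E_sum @ E_scale @ E_swap @ A)
# [[1. 0. 0. 2.]
#  [0. 1. 0. 3.]
#  [0. 0. 1. 4.]]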
Example 2.3.3:
\[ \begin{aligned} 6x &= 12\\ \Leftrightarrow\quad 3^{-1}\cdot 6x &= 3^{-1}\cdot 12 \qquad (2.3.8)\\ \Leftrightarrow\quad 2x &= 4 \qquad (2.3.9)\\ \Leftrightarrow\quad 2^{-1}\cdot 2x &= 2^{-1}\cdot 4 \qquad (2.3.10)\\ \Leftrightarrow\quad 1x &= 2 \end{aligned}\]
Example 2.3.4:
\[ \begin{pmatrix}0&1&1\\2&0&0\\0&0&1\end{pmatrix}\begin{pmatrix}x\\y\\z\end{pmatrix} = \begin{pmatrix}7\\4\\4\end{pmatrix} \qquad (2.3.11)\]
\[ \Leftrightarrow\; \begin{pmatrix}0&1&0\\1&0&0\\0&0&1\end{pmatrix}\begin{pmatrix}0&1&1\\2&0&0\\0&0&1\end{pmatrix}\begin{pmatrix}x\\y\\z\end{pmatrix} = \begin{pmatrix}0&1&0\\1&0&0\\0&0&1\end{pmatrix}\begin{pmatrix}7\\4\\4\end{pmatrix} \qquad (2.3.12)\]
\[ \Leftrightarrow\; \begin{pmatrix}2&0&0\\0&1&1\\0&0&1\end{pmatrix}\begin{pmatrix}x\\y\\z\end{pmatrix} = \begin{pmatrix}4\\7\\4\end{pmatrix} \qquad (2.3.13)\]
\[ \Leftrightarrow\; \begin{pmatrix}\frac12&0&0\\0&1&0\\0&0&1\end{pmatrix}\begin{pmatrix}2&0&0\\0&1&1\\0&0&1\end{pmatrix}\begin{pmatrix}x\\y\\z\end{pmatrix} = \begin{pmatrix}\frac12&0&0\\0&1&0\\0&0&1\end{pmatrix}\begin{pmatrix}4\\7\\4\end{pmatrix} \qquad (2.3.14)\]
\[ \Leftrightarrow\; \begin{pmatrix}1&0&0\\0&1&1\\0&0&1\end{pmatrix}\begin{pmatrix}x\\y\\z\end{pmatrix} = \begin{pmatrix}2\\7\\4\end{pmatrix} \qquad (2.3.15)\]
\[ \Leftrightarrow\; \begin{pmatrix}1&0&0\\0&1&-1\\0&0&1\end{pmatrix}\begin{pmatrix}1&0&0\\0&1&1\\0&0&1\end{pmatrix}\begin{pmatrix}x\\y\\z\end{pmatrix} = \begin{pmatrix}1&0&0\\0&1&-1\\0&0&1\end{pmatrix}\begin{pmatrix}2\\7\\4\end{pmatrix} \qquad (2.3.16)\]
\[ \Leftrightarrow\; \begin{pmatrix}1&0&0\\0&1&0\\0&0&1\end{pmatrix}\begin{pmatrix}x\\y\\z\end{pmatrix} = \begin{pmatrix}2\\3\\4\end{pmatrix} \qquad (2.3.17)\]
This is another way of thinking about what is happening in the process of elimination which feels more like elementary algebra in
the sense that you ``do something to both sides of an equation" until you have a solution.
We can compose multiple EROs to get a single matrix that undoes our matrix. To do this, augment by the identity matrix (not just a single column) and then perform Gaussian elimination. There is no need to write the EROs as systems of equations or as matrices while doing this.
\[ \left(\begin{array}{rrr|rrr} 0&1&1&1&0&0\\ 2&0&0&0&1&0\\ 0&0&1&0&0&1 \end{array}\right) \sim \left(\begin{array}{rrr|rrr} 2&0&0&0&1&0\\ 0&1&1&1&0&0\\ 0&0&1&0&0&1 \end{array}\right) \sim \left(\begin{array}{rrr|rrr} 1&0&0&0&\frac12&0\\ 0&1&1&1&0&0\\ 0&0&1&0&0&1 \end{array}\right) \sim \left(\begin{array}{rrr|rrr} 1&0&0&0&\frac12&0\\ 0&1&0&1&0&-1\\ 0&0&1&0&0&1 \end{array}\right) \]
As we changed the left slot from the matrix M to the identity matrix, the right slot changed from the identity matrix to the
matrix which undoes M .
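As a sketch (not from the text, assuming numpy is available), the same augment-by-the-identity idea can be carried out with a short Gauss-Jordan loop:

# Finding the matrix that "undoes" M by row reducing the block [M | I].
import numpy as np

M = np.array([[0., 1., 1.],
              [2., 0., 0.],
              [0., 0., 1.]])
aug = np.hstack([M, np.eye(3)])     # the augmented block [M | I]

n = len(M)
for col in range(n):
    pivot = next(r for r in range(col, n) if abs(aug[r, col]) > 1e-12)
    aug[[col, pivot]] = aug[[pivot, col]]      # row swap
    aug[col] /= aug[col, col]                  # scale pivot to 1
    for r in range(n):
        if r != col:
            aug[r] -= aug[r, col] * aug[col]   # eliminate above and below

M_inv = aug[:, n:]
print(M_inv)            # [[0.  0.5 0. ] [1.  0. -1.] [0.  0.  1.]]
print(M_inv @ M)        # the identity matrix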
Example 2.3.6: Checking that one matrix undoes another
\[ \begin{pmatrix}0&\frac12&0\\1&0&-1\\0&0&1\end{pmatrix}\begin{pmatrix}0&1&1\\2&0&0\\0&0&1\end{pmatrix} = \begin{pmatrix}1&0&0\\0&1&0\\0&0&1\end{pmatrix}. \]
If the matrices are composed in the opposite order, the result is the same.
\[ \begin{pmatrix}0&1&1\\2&0&0\\0&0&1\end{pmatrix}\begin{pmatrix}0&\frac12&0\\1&0&-1\\0&0&1\end{pmatrix} = \begin{pmatrix}1&0&0\\0&1&0\\0&0&1\end{pmatrix}. \]
In abstract generality, let M be some matrix and, as always, let I stand for the identity matrix. Imagine the process of performing
elementary row operations to bring M to the identity matrix.
Ellipses stand for additional EROs. The result is a product of matrices that form a matrix which undoes $M$:
\[ \cdots E_2 E_1 M = I. \qquad (2.3.19)\]
This is only true if the RREF of $M$ is the identity matrix. In that case, we say $M$ is invertible.
Much use is made of the fact that invertible matrices can be undone with EROs. To begin with, since each elementary row operation has an inverse,
\[ M = E_1^{-1} E_2^{-1} \cdots\,. \qquad (2.3.20)\]
Thus, if $M$ is invertible then $M$ can be expressed as the product of EROs. (The same is true for its inverse.) This has the feel of the fundamental theorem of arithmetic (integers can be expressed as the product of primes) or the fundamental theorem of algebra (polynomials can be expressed as the product of first order polynomials); EROs are the building blocks of invertible matrices.
The row swap matrix that swaps the 2nd and 4th row is the identity matrix with the 2nd and 4th row swapped:
1 0 0 0 0
⎛ ⎞
⎜0 0 0 1 0⎟
⎜ ⎟
⎜0 0 1 0 0⎟ (2.3.23)
⎜ ⎟
⎜ ⎟
⎜0 1 0 0 0⎟
⎝ ⎠
0 0 0 0 1
The scalar multiplication matrix that replaces the 3rd row with 7 times the 3rd row is the identity matrix with 7 in the 3rd row
instead of 1:
1 0 0 0
⎛ ⎞
⎜0 1 0 0⎟
⎜ ⎟ (2.3.24)
⎜0 0 7 0⎟
⎝ ⎠
0 0 0 1
The row sum matrix that replaces the 4th row with the 4th row plus 9 times the 2nd row is the identity matrix with a 9 in the
4th row, 2nd column:
1 0 0 0 0 0 0
⎛ ⎞
⎜ 0 1 0 0 0 0 0⎟
⎜ ⎟
⎜ 0 0 1 0 0 0 0⎟
⎜ ⎟
⎜ ⎟
⎜ 0 9 0 1 0 0 0⎟ (2.3.25)
⎜ ⎟
⎜ ⎟
⎜ 0 0 0 0 1 0 0⎟
⎜ ⎟
⎜ 0 0 0 0 0 1 0⎟
⎝ ⎠
0 0 0 0 0 0 1
We can write an explicit factorization of a matrix into EROs by keeping track of the EROs used in getting to RREF.
Note that in the previous example one of each of the kinds of EROs is used, in the order just given. Elimination looked like
\[ M = \begin{pmatrix}0&1&1\\2&0&0\\0&0&1\end{pmatrix} \;\overset{E_1}{\sim}\; \begin{pmatrix}2&0&0\\0&1&1\\0&0&1\end{pmatrix} \;\overset{E_2}{\sim}\; \begin{pmatrix}1&0&0\\0&1&1\\0&0&1\end{pmatrix} \;\overset{E_3}{\sim}\; \begin{pmatrix}1&0&0\\0&1&0\\0&0&1\end{pmatrix} = I \]
where
\[ E_1 = \begin{pmatrix}0&1&0\\1&0&0\\0&0&1\end{pmatrix},\quad E_2 = \begin{pmatrix}\frac12&0&0\\0&1&0\\0&0&1\end{pmatrix},\quad E_3 = \begin{pmatrix}1&0&0\\0&1&-1\\0&0&1\end{pmatrix}. \]
The inverses of the ERO matrices (corresponding to the description of the reverse row manipulations) are
\[ E_1^{-1} = \begin{pmatrix}0&1&0\\1&0&0\\0&0&1\end{pmatrix},\quad E_2^{-1} = \begin{pmatrix}2&0&0\\0&1&0\\0&0&1\end{pmatrix},\quad E_3^{-1} = \begin{pmatrix}1&0&0\\0&1&1\\0&0&1\end{pmatrix}. \]
Multiplying these gives
\[ E_1^{-1}E_2^{-1}E_3^{-1} = \begin{pmatrix}0&1&0\\1&0&0\\0&0&1\end{pmatrix}\begin{pmatrix}2&0&0\\0&1&0\\0&0&1\end{pmatrix}\begin{pmatrix}1&0&0\\0&1&1\\0&0&1\end{pmatrix} = \begin{pmatrix}0&1&0\\1&0&0\\0&0&1\end{pmatrix}\begin{pmatrix}2&0&0\\0&1&1\\0&0&1\end{pmatrix} = \begin{pmatrix}0&1&1\\2&0&0\\0&0&1\end{pmatrix} = M. \]
Example 2.3.9: LU factorization
\[ M = \begin{pmatrix}2&0&-3&1\\0&1&2&2\\-4&0&9&2\\0&-1&1&-1\end{pmatrix} \;\overset{E_1}{\sim}\; \begin{pmatrix}2&0&-3&1\\0&1&2&2\\0&0&3&4\\0&-1&1&-1\end{pmatrix} \;\overset{E_2}{\sim}\; \begin{pmatrix}2&0&-3&1\\0&1&2&2\\0&0&3&4\\0&0&3&1\end{pmatrix} \;\overset{E_3}{\sim}\; \begin{pmatrix}2&0&-3&1\\0&1&2&2\\0&0&3&4\\0&0&0&-3\end{pmatrix} := U \]
where
\[ E_1 = \begin{pmatrix}1&0&0&0\\0&1&0&0\\2&0&1&0\\0&0&0&1\end{pmatrix},\quad E_2 = \begin{pmatrix}1&0&0&0\\0&1&0&0\\0&0&1&0\\0&1&0&1\end{pmatrix},\quad E_3 = \begin{pmatrix}1&0&0&0\\0&1&0&0\\0&0&1&0\\0&0&-1&1\end{pmatrix} \]
and
\[ E_1^{-1} = \begin{pmatrix}1&0&0&0\\0&1&0&0\\-2&0&1&0\\0&0&0&1\end{pmatrix},\quad E_2^{-1} = \begin{pmatrix}1&0&0&0\\0&1&0&0\\0&0&1&0\\0&-1&0&1\end{pmatrix},\quad E_3^{-1} = \begin{pmatrix}1&0&0&0\\0&1&0&0\\0&0&1&0\\0&0&1&1\end{pmatrix}. \]
Applying the inverses of the EROs to both sides of the equality $U = E_3E_2E_1M$ gives $M = E_1^{-1}E_2^{-1}E_3^{-1}U$, or
\[ \begin{pmatrix}2&0&-3&1\\0&1&2&2\\-4&0&9&2\\0&-1&1&-1\end{pmatrix} = \begin{pmatrix}1&0&0&0\\0&1&0&0\\-2&0&1&0\\0&0&0&1\end{pmatrix}\begin{pmatrix}1&0&0&0\\0&1&0&0\\0&0&1&0\\0&-1&0&1\end{pmatrix}\begin{pmatrix}1&0&0&0\\0&1&0&0\\0&0&1&0\\0&0&1&1\end{pmatrix}\begin{pmatrix}2&0&-3&1\\0&1&2&2\\0&0&3&4\\0&0&0&-3\end{pmatrix} \]
\[ = \begin{pmatrix}1&0&0&0\\0&1&0&0\\-2&0&1&0\\0&0&0&1\end{pmatrix}\begin{pmatrix}1&0&0&0\\0&1&0&0\\0&0&1&0\\0&-1&1&1\end{pmatrix}\begin{pmatrix}2&0&-3&1\\0&1&2&2\\0&0&3&4\\0&0&0&-3\end{pmatrix} = \begin{pmatrix}1&0&0&0\\0&1&0&0\\-2&0&1&0\\0&-1&1&1\end{pmatrix}\begin{pmatrix}2&0&-3&1\\0&1&2&2\\0&0&3&4\\0&0&0&-3\end{pmatrix}. \]
What if we stop at a different point in elimination? We could multiply rows so that the entries in the diagonal are 1 next. Note that
the EROs that do this are diagonal. This gives a slightly different factorization.
\[ M = \begin{pmatrix}2&0&-3&1\\0&1&2&2\\-4&0&9&2\\0&-1&1&-1\end{pmatrix} \;\overset{E_3E_2E_1}{\sim}\; \begin{pmatrix}2&0&-3&1\\0&1&2&2\\0&0&3&4\\0&0&0&-3\end{pmatrix} \;\overset{E_4}{\sim}\; \begin{pmatrix}1&0&-\frac32&\frac12\\0&1&2&2\\0&0&3&4\\0&0&0&-3\end{pmatrix} \;\overset{E_5}{\sim}\; \begin{pmatrix}1&0&-\frac32&\frac12\\0&1&2&2\\0&0&1&\frac43\\0&0&0&-3\end{pmatrix} \;\overset{E_6}{\sim}\; \begin{pmatrix}1&0&-\frac32&\frac12\\0&1&2&2\\0&0&1&\frac43\\0&0&0&1\end{pmatrix} =: U \]
where
\[ E_4 = \begin{pmatrix}\frac12&0&0&0\\0&1&0&0\\0&0&1&0\\0&0&0&1\end{pmatrix},\quad E_5 = \begin{pmatrix}1&0&0&0\\0&1&0&0\\0&0&\frac13&0\\0&0&0&1\end{pmatrix},\quad E_6 = \begin{pmatrix}1&0&0&0\\0&1&0&0\\0&0&1&0\\0&0&0&-\frac13\end{pmatrix} \]
and
\[ E_4^{-1} = \begin{pmatrix}2&0&0&0\\0&1&0&0\\0&0&1&0\\0&0&0&1\end{pmatrix},\quad E_5^{-1} = \begin{pmatrix}1&0&0&0\\0&1&0&0\\0&0&3&0\\0&0&0&1\end{pmatrix},\quad E_6^{-1} = \begin{pmatrix}1&0&0&0\\0&1&0&0\\0&0&1&0\\0&0&0&-3\end{pmatrix}. \]
We calculated the product of the first three factors in the previous example; it was named L there, and we will reuse that name
here. The product of the next three factors is diagonal and we will name it D. The last factor we named U (the name means
something different in this example than the last example.) The LDU factorization of our matrix is
\[ \begin{pmatrix}2&0&-3&1\\0&1&2&2\\-4&0&9&2\\0&-1&1&-1\end{pmatrix} = \begin{pmatrix}1&0&0&0\\0&1&0&0\\-2&0&1&0\\0&-1&1&1\end{pmatrix}\begin{pmatrix}2&0&0&0\\0&1&0&0\\0&0&3&0\\0&0&0&-3\end{pmatrix}\begin{pmatrix}1&0&-\frac32&\frac12\\0&1&2&2\\0&0&1&\frac43\\0&0&0&1\end{pmatrix}. \]
The LDU factorization of a matrix is a factorization into blocks of EROs of various types: $L$ is the product of the inverses of EROs which eliminate below the diagonal by row addition, $D$ the product of inverses of EROs which set the diagonal elements to 1 by row multiplication, and $U$ is the product of inverses of EROs which eliminate above the diagonal by row addition. You may notice that one of the three kinds of row operation is missing from this story. Row exchange may very well be necessary to obtain RREF. Indeed, so far in this chapter we have been working under the tacit assumption that $M$ can be brought to the identity by just row multiplication and row addition. If row exchange is necessary, the resulting factorization is $LDPU$ where $P$ is the product of inverses of EROs that perform row exchange.
For example, swapping the first two rows of the previous matrix gives
\[ M = \begin{pmatrix}0&1&2&2\\2&0&-3&1\\-4&0&9&2\\0&-1&1&-1\end{pmatrix} \;\overset{E_7}{\sim}\; \begin{pmatrix}2&0&-3&1\\0&1&2&2\\-4&0&9&2\\0&-1&1&-1\end{pmatrix} \;\overset{E_6E_5E_4E_3E_2E_1}{\sim}\; U \]
where $E_7$ is the row swap matrix
\[ E_7 = \begin{pmatrix}0&1&0&0\\1&0&0&0\\0&0&1&0\\0&0&0&1\end{pmatrix} = E_7^{-1}, \]
so that $M = E_7^{-1}(E_1^{-1}E_2^{-1}E_3^{-1})(E_4^{-1}E_5^{-1}E_6^{-1})U$. Moving the row swap past the lower triangular and diagonal factors (which reshuffles their entries but keeps them lower triangular and diagonal, respectively) gives an $LDPU$ factorization:
\[ \begin{pmatrix}0&1&2&2\\2&0&-3&1\\-4&0&9&2\\0&-1&1&-1\end{pmatrix} = \begin{pmatrix}1&0&0&0\\0&1&0&0\\0&-2&1&0\\-1&0&1&1\end{pmatrix}\begin{pmatrix}1&0&0&0\\0&2&0&0\\0&0&3&0\\0&0&0&-3\end{pmatrix}\begin{pmatrix}0&1&0&0\\1&0&0&0\\0&0&1&0\\0&0&0&1\end{pmatrix}\begin{pmatrix}1&0&-\frac32&\frac12\\0&1&2&2\\0&0&1&\frac43\\0&0&0&1\end{pmatrix}. \]
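For experimentation, here is a rough sketch (not part of the text, assuming numpy is available) of LU factorization by recording the row-addition EROs; it assumes no row swaps are needed, and it reproduces the $L$ and $U$ of the LU factorization example above.

# LU factorization by recording the factors used to eliminate below each pivot.
import numpy as np

def lu_no_pivot(M):
    """Return L (unit lower triangular) and U (upper triangular) with M = L U."""
    U = M.astype(float).copy()
    n = len(M)
    L = np.eye(n)
    for col in range(n):
        for row in range(col + 1, n):
            factor = U[row, col] / U[col, col]
            U[row] -= factor * U[col]     # the ERO eliminating below the pivot
            L[row, col] = factor          # its inverse contributes this entry to L
    return L, U

M = np.array([[2, 0, -3, 1],
              [0, 1, 2, 2],
              [-4, 0, 9, 2],
              [0, -1, 1, -1]])
L, U = lu_no_pivot(M)
print(L)            # matches the L found in the LU example
print(U)            # matches that example's U (before rescaling the diagonal)
print(L @ U - M)    # all zeros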
2.3.4.1: Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 2.3: Elementary Row Operations is shared under a not declared license and was authored, remixed, and/or curated by David
Cherney, Tom Denton, & Andrew Waldron.
2.4: Review Problems
1. While performing Gaussian elimination on these augmented matrices write the full system of equations describing the new rows
in terms of the old rows above each equivalence symbol as in Example 20.
\[ \left(\begin{array}{rr|r} 2&2&10\\ 1&2&8 \end{array}\right), \qquad \left(\begin{array}{rrr|r} 1&1&0&5\\ 1&1&-1&11\\ -1&1&1&-5 \end{array}\right) \qquad (2.4.1)\]
2. Solve the vector equation by applying ERO matrices to each side of the equation to perform elimination. Show each matrix
explicitly as in Example 23.
\[ \begin{pmatrix}3&6&2\\5&9&4\\2&4&2\end{pmatrix}\begin{pmatrix}x\\y\\z\end{pmatrix} = \begin{pmatrix}-3\\1\\0\end{pmatrix} \]
3. Solve this vector equation by finding the inverse of the matrix through $(M\,|\,I) \sim (I\,|\,M^{-1})$ and then applying $M^{-1}$ to both sides of the equation.
\[ \begin{pmatrix}2&1&1\\1&1&1\\1&1&2\end{pmatrix}\begin{pmatrix}x\\y\\z\end{pmatrix} = \begin{pmatrix}9\\6\\7\end{pmatrix} \]
4. Follow the method of Examples 28 and 29 to find the LU and LDU factorization of
3 3 6
⎛ ⎞
⎜3 5 2⎟
⎝ ⎠
6 2 5
5. Multiple matrix equations with the same matrix can be solved simultaneously.
a) Solve both systems by performing elimination on just one augmented matrix.
\[ \begin{pmatrix}2&-1&-1\\-1&1&1\\1&-1&0\end{pmatrix}\begin{pmatrix}x\\y\\z\end{pmatrix} = \begin{pmatrix}0\\1\\0\end{pmatrix}, \qquad \begin{pmatrix}2&-1&-1\\-1&1&1\\1&-1&0\end{pmatrix}\begin{pmatrix}a\\b\\c\end{pmatrix} = \begin{pmatrix}2\\1\\1\end{pmatrix} \]
factorization; fractions and large numbers will probably be involved. To invent simple problems it is better to start with a simple
answer:
a. Start with any augmented matrix in RREF. Perform EROs to make most of the components non-zero. Write the result on a
separate piece of paper and give it to your friend. Ask that friend to find RREF of the augmented matrix you gave them. Make
sure they get the same augmented matrix you started with.
b. Create an upper triangular matrix U and a lower triangular matrix L with only 1s on the diagonal. Give the result to a friend to
factor into LU form.
c. Do the same with an LDU factorization.
2.4.1: Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 2.4: Review Problems is shared under a not declared license and was authored, remixed, and/or curated by David Cherney, Tom
Denton, & Andrew Waldron.
2.5: Solution Sets for Systems of Linear Equations
Algebra problems can have multiple solutions. For example $x(x-1) = 0$ has two solutions: 0 and 1. By contrast, equations of the form $Ax = b$ with $A$ a linear operator have the following property.
If $A$ is a linear operator and $b$ is known, then $Ax = b$ has either
[1.] One solution
[2.] No solutions
[3.] Infinitely many solutions
This is already familiar for $1\times 1$ matrices, i.e., equations $ax = b$ for numbers $a$ and $b$: if $a \neq 0$ there is exactly one solution, while if $a = 0$ there are no solutions when $b \neq 0$ and infinitely many when $b = 0$. In the first case the linear operator is invertible; in the other two cases it is not. In the first case the solution set is a point on the number line, and in the third case the solution set is the whole number line.
Lets examine similar situations with larger matrices.
[1.] $\begin{pmatrix}6&0\\0&2\end{pmatrix}\begin{pmatrix}x\\y\end{pmatrix} = \begin{pmatrix}12\\6\end{pmatrix}$, one solution: $\begin{pmatrix}2\\3\end{pmatrix}$
[2a.] $\begin{pmatrix}1&3\\0&0\end{pmatrix}\begin{pmatrix}x\\y\end{pmatrix} = \begin{pmatrix}4\\1\end{pmatrix}$, no solutions
[2bi.] $\begin{pmatrix}1&3\\0&0\end{pmatrix}\begin{pmatrix}x\\y\end{pmatrix} = \begin{pmatrix}4\\0\end{pmatrix}$, one solution for each number $y$: $\begin{pmatrix}4-3y\\y\end{pmatrix}$
[2bii.] $\begin{pmatrix}0&0\\0&0\end{pmatrix}\begin{pmatrix}x\\y\end{pmatrix} = \begin{pmatrix}0\\0\end{pmatrix}$, one solution for each pair of numbers $x, y$: $\begin{pmatrix}x\\y\end{pmatrix}$
Again, in the first case the linear operator is invertible while in the other cases it is not. When the operator is not invertible the
solution set can be empty, a line in the plane or the plane itself.
For a system of equations with r equations and k variables, one can have a number of different outcomes. For example, consider
the case of r equations in three variables. Each of these equations is the equation of a plane in three-dimensional space. To find
solutions to the system of equations, we look for the common intersection of the planes (if an intersection exists). Here we have
five different possibilities:
[1.] Unique Solution. The planes have a unique point of intersection.
[2a.] No solutions. Some of the equations are contradictory, so no solutions exist.
[2bi.] Line. The planes intersect in a common line; any point on that line then gives a solution to the system of equations.
[2bii.] Plane. Perhaps you only had one equation to begin with, or else all of the equations coincide geometrically. In this case,
you have a plane of solutions, with two free parameters.
[2biii.] All of $\mathbb{R}^3$. If you start with no information, then any point in $\mathbb{R}^3$ is a solution. There are three free parameters.
In general, for systems of equations with $k$ unknowns, there are $k + 2$ possible outcomes, corresponding to the possible numbers (i.e. $0, 1, 2, \ldots, k$) of free parameters in the solution set plus the possibility of no solutions. These types of "solution sets" are "hyperplanes", generalizations of planes that behave like planes in $\mathbb{R}^3$ in many ways.
2.5.2: Particular Solution + Homogeneous solutions
In the standard approach, variables corresponding to columns that do not contain a pivot (after going to reduced row echelon form)
are free. We called them non-pivot variables. They index elements of the solutions set by acting as coefficients of vectors.
\[ \begin{pmatrix}1&0&1&-1\\0&1&-1&1\\0&0&0&0\end{pmatrix}\begin{pmatrix}x_1\\x_2\\x_3\\x_4\end{pmatrix} = \begin{pmatrix}1\\1\\0\end{pmatrix} \;\Leftrightarrow\; \left\{\begin{array}{r} 1x_1+0x_2+1x_3-1x_4=1\\ 0x_1+1x_2-1x_3+1x_4=1\\ 0x_1+0x_2+0x_3+0x_4=0 \end{array}\right. \]
Following the standard approach,
\[ \left.\begin{array}{l} x_1=1-x_3+x_4\\ x_2=1+x_3-x_4\\ x_3=x_3\\ x_4=x_4 \end{array}\right\} \;\Leftrightarrow\; \begin{pmatrix}x_1\\x_2\\x_3\\x_4\end{pmatrix} = \begin{pmatrix}1\\1\\0\\0\end{pmatrix} + x_3\begin{pmatrix}-1\\1\\1\\0\end{pmatrix} + x_4\begin{pmatrix}1\\-1\\0\\1\end{pmatrix}, \]
so the solution set is
\[ \left\{ \begin{pmatrix}x_1\\x_2\\x_3\\x_4\end{pmatrix} = \begin{pmatrix}1\\1\\0\\0\end{pmatrix} + \mu_1\begin{pmatrix}-1\\1\\1\\0\end{pmatrix} + \mu_2\begin{pmatrix}1\\-1\\0\\1\end{pmatrix} \;:\; \mu_1, \mu_2 \in \mathbb{R} \right\}, \]
where the coefficients $\mu_1$ and $\mu_2$ multiply vectors built from the non-pivot columns. Another way to write the solution set is
\[ S = \{ X_0 + \mu_1 Y_1 + \mu_2 Y_2 \;:\; \mu_1, \mu_2 \in \mathbb{R} \} \]
where
\[ X_0 = \begin{pmatrix}1\\1\\0\\0\end{pmatrix},\quad Y_1 = \begin{pmatrix}-1\\1\\1\\0\end{pmatrix},\quad Y_2 = \begin{pmatrix}1\\-1\\0\\1\end{pmatrix}. \qquad (2.5.1)\]
Here $X_0$ is called a particular solution while $Y_1$ and $Y_2$ are called homogeneous solutions. Since $\mu_1$ and $\mu_2$ are arbitrary,
\[ M(X_0 + \mu_1 Y_1 + \mu_2 Y_2) = MX_0 + \mu_1 MY_1 + \mu_2 MY_2 = V \qquad (2.5.2)\]
requires
\[ MX_0 = V, \qquad (2.5.3)\]
\[ MY_1 = 0, \qquad (2.5.4)\]
\[ MY_2 = 0. \qquad (2.5.5)\]
Here $Y_1$ and $Y_2$ are examples of what are called homogeneous solutions to the system. They do not solve the original equation $MX = V$, but instead its associated homogeneous equation $MY = 0$.
One of the fundamental lessons of linear algebra: the solution set to Ax = b with A a linear operator consists of a particular
solution plus homogeneous solutions.
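A quick numerical check (not part of the text, assuming numpy is available) of this lesson, using the particular and homogeneous solutions found above:

# Every vector X0 + m1*Y1 + m2*Y2 solves M X = V, because M Y1 = M Y2 = 0.
import numpy as np

M  = np.array([[1., 0., 1., -1.],
               [0., 1., -1., 1.],
               [0., 0., 0., 0.]])
V  = np.array([1., 1., 0.])
X0 = np.array([1., 1., 0., 0.])
Y1 = np.array([-1., 1., 1., 0.])
Y2 = np.array([1., -1., 0., 1.])

print(M @ X0)            # [1. 1. 0.]  -> the particular solution works
print(M @ Y1, M @ Y2)    # zero vectors -> homogeneous solutions
for m1, m2 in [(0, 0), (2, -1), (10, 3)]:
    print(M @ (X0 + m1 * Y1 + m2 * Y2))   # always [1. 1. 0.]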
Example 2.5.1:
Consider the matrix equation of the previous example. It has solution set
\[ S = \left\{ \begin{pmatrix}x_1\\x_2\\x_3\\x_4\end{pmatrix} = \begin{pmatrix}1\\1\\0\\0\end{pmatrix} + \mu_1\begin{pmatrix}-1\\1\\1\\0\end{pmatrix} + \mu_2\begin{pmatrix}1\\-1\\0\\1\end{pmatrix} \right\}. \qquad (2.5.7)\]
Then $MX_0 = V$ says that $\begin{pmatrix}x_1\\x_2\\x_3\\x_4\end{pmatrix} = \begin{pmatrix}1\\1\\0\\0\end{pmatrix}$ solves the original matrix equation, which is certainly true, but this is not the only solution.
$MY_1 = 0$ says that $\begin{pmatrix}x_1\\x_2\\x_3\\x_4\end{pmatrix} = \begin{pmatrix}-1\\1\\1\\0\end{pmatrix}$ solves the homogeneous equation.
$MY_2 = 0$ says that $\begin{pmatrix}x_1\\x_2\\x_3\\x_4\end{pmatrix} = \begin{pmatrix}1\\-1\\0\\1\end{pmatrix}$ solves the homogeneous equation.
Notice how adding any multiple of a homogeneous solution to the particular solution yields another particular solution
2.5.3.1: Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 2.5: Solution Sets for Systems of Linear Equations is shared under a not declared license and was authored, remixed, and/or
curated by David Cherney, Tom Denton, & Andrew Waldron.
2.6: Review Problems
1. Write down examples of augmented matrices corresponding to each of the five types of solution sets for systems of equations
with three unknowns.
2. Invent a simple linear system that has multiple solutions. Use the standard approach for solving linear systems and a non-standard approach to obtain different descriptions of the solution set. Is the solution set different with different approaches?
3. Let
\[ M = \begin{pmatrix} a^1_1 & a^1_2 & \cdots & a^1_k\\ a^2_1 & a^2_2 & \cdots & a^2_k\\ \vdots & \vdots & & \vdots\\ a^r_1 & a^r_2 & \cdots & a^r_k \end{pmatrix} \quad\text{and}\quad X = \begin{pmatrix} x^1\\ x^2\\ \vdots\\ x^k \end{pmatrix}. \qquad (2.6.1)\]
Note: $x^2$ does not denote the square of $x$. Instead $x^1, x^2, x^3$, etc., denote different variables; the superscript is an index. Although confusing at first, this notation was invented by Albert Einstein who noticed that quantities like
\[ a^2_1 x^1 + a^2_2 x^2 + \cdots + a^2_k x^k =: \sum_{j=1}^{k} a^2_j x^j \]
can be written unambiguously as $a^2_j x^j$. This is called Einstein summation notation. The most important thing to remember is that the index $j$ is a dummy variable, so that $a^2_j x^j \equiv a^2_i x^i$; this is called "relabeling dummy indices". When dealing with products of sums, you must remember to introduce a new dummy for each term; i.e., $a_i x^i b_i y^i = \sum_i a_i x^i b_i y^i$ does not equal $a_i x^i b_j y^j = \sum_i a_i x^i \sum_j b_j y^j$.
Use Einstein summation notation to propose a rule for M X so that M X = 0 is equivalent to the linear system
1 1 1 2 1 k
a x +a x +⋯ +a x =0
1 2 k
2 1 2 2 2 k
a x +a x +⋯ +a x =0 (2.6.2)
1 2 k
r 1 r 2 r k
a x +a x +⋯ +a x =0
1 2 k
Show that your rule for multiplying a matrix by a vector obeys the linearity property.
4. The standard basis vector e is a column vector with a one in the ith row, and zeroes everywhere else. Using the rule for
i
multiplying a matrix times a vector in problem 3, find a simple rule for multiplying M e , where M is the general matrix defined
i
there.
5. If A is a non-linear operator, can the solutions to Ax = b still be written as “general solution=particular solution + homogeneous
solutions”? Provide examples.
6. Find a system of equations whose solution set is the walls of a 1 × 1 × 1 cube. (Hint: You may need to restrict the ranges of the
variables; could your equations be linear?)
2.6.1: Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 2.6: Review Problems is shared under a not declared license and was authored, remixed, and/or curated by David Cherney, Tom
Denton, & Andrew Waldron.
2.6.1 https://math.libretexts.org/@go/page/1788
CHAPTER OVERVIEW
3.1: Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 3: The Simplex Method is shared under a not declared license and was authored, remixed, and/or curated by David Cherney, Tom
Denton, & Andrew Waldron.
1
3.1: Pablo's Problem
Let us begin with an example. Consider again Pablo the nutritionist of problem 5, chapter 1. The Conundrum City school board has
employed Pablo to design their school lunch program. Unfortunately for Pablo, their requirements are rather tricky:
The Conundrum City school board is heavily influenced by the local fruit grower’s association. They have stipulated that
children eat at least 7 oranges and 5 apples per week. Parents and teachers have agreed that eating at least 15 pieces of fruit per
week is a good thing, but school janitors argue that too much fruit makes a terrible mess, so that children should eat no more
than 25 pieces of fruit per week
.
Finally Pablo knows that oranges have twice as much sugar as apples and that apples have 5 grams of sugar each. Too much
sugar is unhealthy, so Pablo wants to keep the children's sugar intake as low as possible. How many oranges and apples should
Pablo suggest that the school board put on the menu?
This is a rather gnarly word problem. Our first step is to restate it as mathematics, stripping away all the extraneous
information:
Let x be the number of apples and y be the number of oranges. These must obey
x ≥5 and y ≥7, (3.1.1)
to fulfill the school board's politically motivated wishes. The teacher's and parent's fruit requirement means that
x + y ≥ 15 , (3.1.2)
Now let
s = 5x + 10y . (3.1.4)
This linear function of (x, y) represents the grams of sugar in x apples and y oranges. The problem is asking us to minimize s
subject to the four linear inequalities listed above.
3.1.1: Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
3.1.1 https://math.libretexts.org/@go/page/1812
This page titled 3.1: Pablo's Problem is shared under a not declared license and was authored, remixed, and/or curated by David Cherney, Tom
Denton, & Andrew Waldron.
3.1.2 https://math.libretexts.org/@go/page/1812
3.2: Graphical Solutions
Before giving a more general algorithm for handling this problem and problems like it, we note that when the number of variables
is small (preferably 2), a graphical technique can be used.
Inequalities, such as the four given in Pablo's problem, are often called constraints , and values of the variables that satisfy these
constraints comprise the so-called feasible region . Since there are only two variables, this is easy to plot:
y ≥7
15 ≤ x + y ≤ 25 .
You might be able to see the solution to Pablo's problem already. Oranges are very sugary, so they should be kept low, thus
y = 7 . Also, the less fruit the better, so the answer had better lie on the line x + y = 15 . Hence, the answer must be at the
vertex (8, 7). Actually this is a general feature of linear programming problems, the optimal answer must lie at a vertex of the
feasible region. Rather than prove this, lets look at a plot of the linear function s(x, y) = 5x + 10y .
3.2.1 https://math.libretexts.org/@go/page/1813
The plot of a linear function of two variables is a plane through the origin. Restricting the variables to the feasible region gives
some lamina in 3-space. Since the function we want to optimize is linear (and assumedly non-zero), if we pick a point in the
middle of this lamina, we can always increase/decrease the function by moving out to an edge and, in turn, along that edge to a
corner. Applying this to the above picture, we see that Pablo's best option is 110 grams of sugar a week, in the form of 8 apples
and 7 oranges.
It is worthwhile to contrast the optimization problem for a linear function with the non-linear case you may have seen in calculus
courses:
Here we have plotted the curve f (x) = d in the case where the function f is linear and non-linear. To optimize f in the interval
[a, b], for the linear case we just need to compute and compare the values f (a) and f (b). In contrast, for non-linear functions it is
df
necessary to also compute the derivative dx
to study whether there are extrema inside the interval.
3.2.1: Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 3.2: Graphical Solutions is shared under a not declared license and was authored, remixed, and/or curated by David Cherney, Tom
Denton, & Andrew Waldron.
3.2.2 https://math.libretexts.org/@go/page/1813
3.3: Dantzig's Algorithm
In simple situations a graphical method might suffice, but in many applications there may be thousands or even millions of
variables and constraints. Clearly an algorithm that can be implemented on a computer is needed. The simplex algorithm (usually
attributed to George Dantzig) provides exactly that. It begins with a standard problem:
Problem 3.3.1:
x1
⎛ ⎞
M x = v, x := ⎜
⎜
⎟,
⎟ (3.3.1)
⋮
⎝ ⎠
xn
This is solved by arranging the information in an augmented matrix and then applying EROs. To see how this works lets try an
example.
Example 3.3.2:
c2 := x + 2y + 3z + 2w = 6,
where x ≥ 0 , y ≥ 0 , z ≥ 0 and w ≥ 0 .
Solution
The key observation is this: Suppose we are trying to maximize f (x , … , x ) subject to a constraint c(x , … , x ) = k for
1 n 1 n
some constant k (c and k would be the entries of M x and v , respectively, in the above). Then we can also try to maximize
f (x1 , … , xn ) + αc(x1 , … , xn ) (3.3.2)
because this is only a constant shift f → f + αk . Choosing α carefully can lead to a simple form for the function we are
extremizing.
Since we are interested in the optimum value of f , we treat it as an additional variable and add one further equation
−3x + 3y + z − 4w + f = 0 . (3.3.3)
⎛ 1 1 1 1 0 5 ⎞ ⎧ c1 = 5
⎪
⎜ 1 2 3 2 0 6 ⎟ ⇔ ⎨ c2 = 6 . (3.3.4)
⎜ ⎟
⎩
⎪
⎝ −3 3 1 −4 1 0 ⎠ f = 3x − 3y − z + 4w
Keep in mind that the first four columns correspond to the positive variables (x, y, z, w) and that the last row has the
information of the function f .
Now the system is written as an augmented matrix where the last row encodes the objective function and the other rows the
constraints. Clearly we can perform row operations on the constraint rows since this will not change the solutions to the
constraints. Moreover, we can add any amount of the constraint rows to the last row, since this just amounts to adding a
constant to the function we want to extremize.
3.3.1 https://math.libretexts.org/@go/page/1820
Example 3.3.4: Performing EROs
We scan the last row, and notice the (most negative) coefficient −4. Na\"ively you might think that this is good because this
multiplies the positive variable w and only helps the objective function f = 4w + ⋯ . However, what this actually means is
that the variable w will large but determined by the constraints. Therefore we want to remove it from the objective function.
We can zero out this entry by performing a row operation. For that, either of first two rows could be used. To decide which, we
remember that the we still have to solve solve the constraints for variables that are positive. Hence we should try to keep the
first two entries in the last column positive. Hence we choose the row which will add the smallest constant to f when we zero
out the −4: Look at the last column (where the values of the constraints are stored). We see that adding four times the first row
to the last row would zero out the −4 entry but add 20 to f , while adding two times the second row to the last row would also
zero out the −4 but only add 12 to f . (You can follow this by watching what happens to the last entry in the last row.) So we
perform the latter row operation and obtain the following:
⎛ 1 1 1 1 0 5 ⎞ c1 = 5
⎜ 1 2 3 2 0 6 ⎟ c2 = 6 (3.3.5)
⎜ ⎟
⎝ −1 7 7 0 1 12 ⎠ f + 2c2 = 12 + x − 7y − 7z .
We do not want to undo any of our good work when we perform further row operations, so now we use the second row to zero
out all other entries in the fourth column. This is achieved by subtracting half the second row from the first:
1 1 1
⎛ 0 − 0 0 2 ⎞ c1 − c2 = 2
2 2 2
⎜ ⎟ (3.3.6)
⎜ 1 2 3 2 0 6 ⎟ c2 = 6
⎝ −1 7 7 0 1 12 ⎠ f + 2c2 = 12 + x − 7y − 7z .
Precisely because we chose the second row to perform our row operations, all entries in the last column remain positive. This
allows us to continue the algorithm.
We now repeat the above procedure: There is a −1 in the first column of the last row. We want to zero it out while adding as
little to
f as possible. This is achieved by adding twice the first row to the last row:
$$
\begin{array}{c|c}
\left(
\begin{array}{rrrrr|r}
\frac12&0&-\frac12&0&0&2\\
1&2&3&2&0&6\\\hline
0&7&6&0&\ 1&\ 16
\end{array}
\right)
\quad\quad &\quad
1
c1 − c2 = 2
2
c2 = 6 (3.3.7)
1
f + 2 c2 + 2(c1 − c2 ) = 16 − 7y − 6z .
2
\end{array}
\]
The Dantzig algorithm terminates if all the coefficients in the last row (save perhaps for the last entry which encodes the value
of the objective) are positive. To see why we are done, lets write out what our row operations have done in terms of the
function f and the constraints (c , c ). First we have
1 2
1
f + 2 c2 + 2(c1 − c2 ) = 16 − 7y − 6z (3.3.8)
2
3.3.2 https://math.libretexts.org/@go/page/1820
with both y and z positive. Hence to maximize f we should choose y = 0 = z . In which case we obtain our optimum value
f = 16 . (3.3.9)
––––––––
Finally, we check that the constraints can be solved with y = 0 = z and positive (x, w). Indeed, they can by taking x = 2 = w
3.3.1: Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 3.3: Dantzig's Algorithm is shared under a not declared license and was authored, remixed, and/or curated by David Cherney,
Tom Denton, & Andrew Waldron.
3.3.3 https://math.libretexts.org/@go/page/1820
3.4: Pablo Meets Dantzig
Oftentimes, it takes a few tricks to bring a given problem into the standard form of example 38. In Pablo's case, this goes as
follows.
Example 3.4.1:
x1 = x − 5 , x2 = y − 7 . (3.4.1)
These are called slack variables because they take up the "slack'' required to convert inequality to equality. This pair of
equations can now be written as M x = v ,
x1
⎛ ⎞
1 1 −1 0 ⎜ x2 ⎟ 3
( )⎜ ⎟ =( ) . (3.4.3)
1 1 0 1 ⎜x ⎟ 13
3
⎝ ⎠
x4
Finally, Pablo wants to minimize sugar , but the standard problem maximizes f . Thus the so-called
s = 5x + 10y
⎛ 1 1 −1 0 0 3 ⎞
⎜ 1 1 0 1 0 13 ⎟ . (3.4.4)
⎜ ⎟
⎝ 5 10 0 0 1 0 ⎠
Here it seems that the simplex algorithm already terminates because the last row only has positive coefficients, so that setting
x =0 =x
1 2would be optimal. However, this does not solve the constraints (for positive values of the slack variables x and 3
x ). Thus one more (very dirty) trick is needed. We add two more, positive, (so-called) artificial variables x and x to the
4 5 6
c1 → c1 − x5 , c2 → c2 − x6 . (3.4.5)
The idea being that for large positive α , the modified objective function
f − α x5 − α x6 (3.4.6)
is only maximal when the artificial variables vanish so the underlying problem is unchanged. Lets take α = 10 (our solution
will not depend on this choice) so that our augmented matrix reads
⎛ 1 1 −1 0 1 0 0 3 ⎞
⎜ 1 1 0 1 0 1 0 13 ⎟
⎜ ⎟
⎝ ⎠
5 10 0 0 10 10 1 0
⎛ 1 1 −1 0 1 0 0 3 ⎞
′
R3 =R3 −10R1 −10R2
∼ ⎜ 1 1 0 1 0 1 0 13 ⎟ .
⎜ ⎟
⎝ −15 −10 10 −10 0 0 1 −160 ⎠
Here we performed one row operation to zero out the coefficients of the artificial variables. Now we are ready to run the
simplex algorithm exactly as in section 3.3. The first row operation uses the 1 in the top of the first column to zero out the most
negative entry in the last row:
3.4.1 https://math.libretexts.org/@go/page/1827
⎛ 1 1 −1 0 1 0 0 3 ⎞
⎜ 1 1 0 1 0 1 0 13 ⎟
⎜ ⎟
⎝ 0 5 −5 −10 15 0 1 −115 ⎠
⎛ 1 1 1 0 1 0 0 3 ⎞
′
R2 =R2 −R1
∼ ⎜ 0 0 1 1 −1 1 0 10 ⎟
⎜ ⎟
⎝ 0 5 −5 −10 15 0 1 −115 ⎠
⎛ 1 1 1 0 1 0 0 3 ⎞
′
R3 =R3 +10R2
∼ ⎜ 0 0 1 1 −1 1 0 10 ⎟ .
⎜ ⎟
⎝ 0 5 5 0 5 10 1 −15 ⎠
Now the variables (x , x , x , x ) have zero coefficients so must be set to zero to maximize f . The optimum value is
2 3 5 6
f = −15 so s = −f − 95 = 110 exactly as before. Finally, to solve the constraints x = 3 and x = 10 so that x = 8 and
1 4
Clearly, performed by hand, the simplex algorithm was slow and complex for Pablo's problem. However, the key point is that it is
an algorithm that can be fed to a computer. For problems with many variables, this method is much faster than simply checking all
vertices as we did in section 3.2.
3.4.1: Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 3.4: Pablo Meets Dantzig is shared under a not declared license and was authored, remixed, and/or curated by David Cherney,
Tom Denton, & Andrew Waldron.
3.4.2 https://math.libretexts.org/@go/page/1827
3.5: Review Problems
1. Maximize f (x, y) = 2x + 3y subject to the constraints
x ≥0, y ≥0, x + 2y ≤ 2 , 2x + y ≤ 2 , (3.5.1)
by
a) sketching the region in the xy-plane defined by the constraints and then checking the values of f at its corners; and,
b) the simplex algorithm (Hint: introduce slack variables).
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 3.5: Review Problems is shared under a not declared license and was authored, remixed, and/or curated by David Cherney, Tom
Denton, & Andrew Waldron.
3.5.1 https://math.libretexts.org/@go/page/1842
CHAPTER OVERVIEW
a =⎜ ⎟. (4.1)
⎜ ⋮ ⎟
⎝ n ⎠
a
2
Do not be confused by our use of a superscript to label components of a vector. Here a denotes the second component of the vector a, rather than the number a squared!
7 7
⎛ ⎞ ⎛ ⎞
⎜4⎟ ⎜2⎟
⎜ ⎟ ≠ ⎜ ⎟. (4.2)
⎜2⎟ ⎜4⎟
⎝ ⎠ ⎝ ⎠
5 5
1
⎧ a ∣ ⎫
⎪
⎪⎛ ⎞ ⎪
⎪
∣
n
R := ⎨⎜ ⎟∣ a1 , … , an ∈ R ⎬ . (4.3)
⎜ ⋮ ⎟
⎪
⎩
⎪ ∣ ⎪
⎭
⎪
⎝ n ⎠
a ∣
Thumbnail: The volume of this parallelepiped is the absolute value of the determinant of the 3-by-3 matrix formed by the vectors r , r , and r . (CC BY-SA 3.0; Claudio Rocchini)
1 2 3
4.1: Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 4: Vectors in Space, n-Vectors is shared under a not declared license and was authored, remixed, and/or curated by David Cherney, Tom Denton, & Andrew Waldron.
1
4.1: Addition and Scalar Multiplication in Rⁿ
A simple but important property of n -vectors is that we can add n -vectors and multiply n -vectors by a scalar:
Definition
Given two n -vectors a and b whose components are given by
1 1
a b
⎛ ⎞ ⎛ ⎞
a =⎜ ⎟ and b = ⎜ ⎟ (4.1.1)
⎜ ⋮ ⎟ ⎜ ⋮ ⎟
⎝ n ⎠ ⎝ n ⎠
a b
their sum is
1 1
a +b
⎛ ⎞
a + b := ⎜ ⎟ . (4.1.2)
⎜ ⋮ ⎟
⎝ n n ⎠
a +b
⎜ ⎟ (4.1.3)
⎜ ⋮ ⎟
⎝ n ⎠
λa
\, .\]
$$a+b=
5
⎛ ⎞
⎜5⎟
⎜ ⎟ (4.1.4)
⎜5⎟
⎝ ⎠
5
and 3a - 2b=
−5
⎛ ⎞
⎜ 0 ⎟
⎜ ⎟ (4.1.5)
⎜ 5 ⎟
⎝ ⎠
10
\, .\]
A special vector is the zero vector. All of its components are zero:
$$0=
0
⎛ ⎞
⎜ ⎟ (4.1.6)
⎜ ⋮ ⎟
⎝ ⎠
0
\, .\]
In Euclidean geometry---the study of R with lengths and angles defined as in section 4.3---n -vectors are used to label points P
n
and the zero vector labels the origin O. In this sense, the zero vector is the only one with zero magnitude, and the only one which
points in no particular direction.
4.1.1 https://math.libretexts.org/@go/page/1843
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 4.1: Addition and Scalar Multiplication in Rⁿ is shared under a not declared license and was authored, remixed, and/or curated by
David Cherney, Tom Denton, & Andrew Waldron.
4.1.2 https://math.libretexts.org/@go/page/1843
4.2: Hyperplanes
Vectors in R can be hard to visualize. However, familiar objects like lines and planes still make sense: The line
n
L along the
direction defined by a vector v and through a point P labeled by a vector u can be written as
L = {u + tv|t ∈ R} . (4.2.1)
Sometimes, since we know that a point P corresponds to a vector, we will be lazy and just write L = {P + tv|t ∈ R} .
⎧⎛ 1 ⎞
⎪ ⎛
1
⎞
∣ ⎫
⎪
⎪ ∣ ⎪
⎪ ⎪
⎜2⎟ ⎜0 ⎟∣
⎨⎜ ⎟ + t⎜ ⎟ t ∈ R⎬ describes a line in R parallel to the x -axis.
4
1
⎜3⎟ ⎜0 ⎟∣
⎪ ⎪
⎪ ∣ ⎪
⎩⎝
⎪ ⎠ ⎝ ⎠ ⎭
⎪
4 0 ∣
unless both vectors are in the same line, in which case, one of the vectors is a scalar multiple of the other. The sum of u and v
corresponds to laying the two vectors head-to-tail and drawing the connecting vector. If u and v determine a plane, then their sum
lies in the plane determined by u and v .
4.2.1 https://math.libretexts.org/@go/page/1850
\[\left\{
3
⎛ ⎞
⎜ 1⎟
⎜ ⎟
⎜ 4⎟
⎜ ⎟ (4.2.3)
⎜ ⎟
⎜ 1⎟
⎜ ⎟
⎜ 5⎟
⎝ ⎠
9
+s
1
⎛ ⎞
⎜ 0⎟
⎜ ⎟
⎜ 0⎟
⎜ ⎟ (4.2.4)
⎜ ⎟
⎜ 0⎟
⎜ ⎟
⎜ 0⎟
⎝ ⎠
0
+t
0
⎛ ⎞
⎜ 1⎟
⎜ ⎟
⎜ 0⎟
⎜ ⎟ (4.2.5)
⎜ ⎟
⎜ 0⎟
⎜ ⎟
⎜ 0⎟
⎝ ⎠
0
\middle\arrowvert s, t \in \mathbb{R} \right\}$$ describes a plane in 6-dimensional space parallel to the xy-plane.
We can generalize the notion of a plane with the following recursive definition. (That is, infinitely many things are defined in the
following line.)
Definition
A set of k vectors v , … , v in R with k ≤ n determines a k -dimensional hyperplane, unless any of the vectors v lives in
1 k
n
i
the same hyperplane determined by the other vectors. If the vectors do determine a k -dimensional hyperplane, then any point
in the hyperplane can be written as:
k
{P + ∑ λi vi | λi ∈ R} (4.2.6)
i=1
When the dimension k is not specified, one usually assumes that k = n − 1 for a hyperplane inside R . n
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 4.2: Hyperplanes is shared under a not declared license and was authored, remixed, and/or curated by David Cherney, Tom
Denton, & Andrew Waldron.
4.2.2 https://math.libretexts.org/@go/page/1850
4.3: Directions and Magnitudes
Consider the Euclidean length of a vector:
−−
n
−−−−−
−−−−−−−−−−−−−−−−−−−
1 2 2 2 n 2 i 2
∥v∥ := √ (v ) + (v ) + ⋯ (v ) = √∑(v ) . (4.3.1)
i=1
Using the Law of Cosines, we can then figure out the angle between two vectors. Given two vectors v and u that span a plane in
R , we can then connect the ends of v and u with the vector v − u .
n
1 2 n 2
−((u ) + ⋯ + (u ) )
1 2 n 2
−((v ) + ⋯ + (v ) )
1 1 n n
= −2 u v − ⋯ − 2u v
Thus,
1 1 n n
∥u∥ ∥v∥ cos θ = u v +⋯ +u v . (4.3.3)
Note that in the above discussion, we have assumed (correctly) that Euclidean lengths in R
n
give the usual notion of lengths of
vectors for any plane in R . This now motivates the definition of the dot product.
n
Definition
1 1
u v
⎛ ⎞ ⎛ ⎞
1 1 n n
u ⋅ v := u v +⋯ +u v . (4.3.4)
When the dot product between two vectors vanishes, we say that they are perpendicular or orthogonal . Notice
that the zero vector is orthogonal to every vector.
4.3.1 https://math.libretexts.org/@go/page/1857
u ⋅ v = v⋅ u , (4.3.7)
2. Distributive so
u ⋅ (v + w) = u ⋅ v + u ⋅ w , (4.3.8)
and
(cu + dw) ⋅ v = c u ⋅ v + d w ⋅ v . (4.3.10)
4. Positive Definite :
u⋅u ≥0, (4.3.11)
You should carefully check for yourself exactly which properties of an inner product were used to write down the
above inequality!
2
2
⟨u, v⟩
0 ≤ ⟨u, u⟩ − . (4.3.13)
⟨v, v⟩
Now it is easy to rearrange this inequality to reach the Cauchy--Schwarz one above.
Proof
2
∥u + v∥ = (u + v) ⋅ (u + v)
= u ⋅ u + 2u ⋅ v + v ⋅ v
2 2
= ∥u ∥ + ∥v∥ + 2 ∥u∥ ∥v∥ cos θ
2
= (∥u∥ + ∥v∥) + 2 ∥u∥ ∥v∥(cos θ − 1)
2
≤ (∥u∥ + ∥v∥)
4.3.2 https://math.libretexts.org/@go/page/1857
Then the square of the left-hand side of the triangle inequality is ≤ the right-hand side, and both sides are positive, so the result
is true.
The triangle inequality is also "self-evident'' examining a sketch of u, v and u + v :
−− −−
Notice also that a ⋅ b = 1.4 + 2.3 + 3.2 + 4.1 = 20 < √30. √30 = 30 = ∥a∥ ∥b∥ in accordance with the Cauchy-
-Schwarz inequality.
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 4.3: Directions and Magnitudes is shared under a not declared license and was authored, remixed, and/or curated by David
Cherney, Tom Denton, & Andrew Waldron.
4.3.3 https://math.libretexts.org/@go/page/1857
4.4: Vectors, Lists and Functions- R S RS
Suppose you are going shopping. You might jot down something like this on a piece of paper:
We could represent this information mathematically as a set, $$S=\{\rm apple, orange, onion, milk, carrot\}\, .\]
There is no information of ordering here and no information about how many carrots you will buy. This set by itself is not a vector;
how would we add such sets to one another?
If you were a more careful shopper your list might look like this:
What you have really done here is assign a number to each element of the set S . In other words, the second list is a function
f : S ⟶ R. (4.4.1)
Given two lists like the second one above, we could easily add them -- if you plan to buy 5 apples and I am buying 3 apples,
together we will buy 8 apples! In fact, the second list is really a 5-vector in disguise.
In general it is helpful to think of an n -vector as a function whose domain is the set {1, … , n}. This is equivalent to thinking of an
n -vector as an ordered list of n numbers. These two ideas give us two equivalent notions for the set of all n -vectors:
1
⎧⎛ a ∣ ⎫
⎪
⎪ ⎞ ⎪
⎪
∣
n 1 n {1,⋯,n}
R := ⎨⎜
⎜
⎟∣ a , … a ∈ R ⎬ = {a : {1, … , n} → R} := R
⎟ (4.4.2)
⋮
⎪
⎩
⎪ ∣ ⎪
⎭
⎪
⎝ n ⎠
a ∣
4.4.1 https://math.libretexts.org/@go/page/1864
The notation R {1,⋯,n}
is used to denote functions from {1, … , n} to R. Similarly, for any set S the notation R denotes the set of S
functions from S to R:
S
R := {f : S → R} . (4.4.3)
When S is an ordered set like {1, … , n}, it is natural to write the components in order. When the elements of S do not have a
natural ordering, doing so might cause confusion.
Example 4.4.1:
Consider the set S = {∗, ⋆, #} from chapter 1 review problem 9. A particular element of R
S
is the function a explicitly
defined by
⋆ # ∗
a = 3, a = 5, a = −2. (4.4.4)
because the elements of S do not have an ordering, since as sets {∗, ⋆, #} = {∗, ⋆, #}.
In this important way, R seems different from R . What is more evident are the similarities; since we can add two functions, we
S 3
Addition in R S
If a = 3, a = 5, a = −2 and b = −2, b = 4, b = 13
⋆ # ∗ ⋆ # ∗
Also, since we can multiply functions by numbers, there is a notion of scalar multiplication on R : S
Scalar Multiplication in R S
If a = 3, a = 5, a = −2 ,
⋆ # ∗
We visualize R and R in terms of axes. We have a more abstract picture of R , R and R for larger n while R seems even
2 3 4 5 n S
more abstract. However, when thought of as a simple "shopping list'', you can see that vectors in R in fact, can describe everyday
S
objects. In chapter 5 we introduce the general definition of a vector space that unifies all these different notions of a vector.
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 4.4: Vectors, Lists and Functions- R is shared under a not declared license and was authored, remixed, and/or curated by David
S
4.4.2 https://math.libretexts.org/@go/page/1864
4.5: Review Problems
1. When he was young, Captain Conundrum mowed lawns on weekends to help pay his college tuition bills. He charged his
customers according to the size of their lawns at a rate of 5 cent per square foot and meticulously kept a record of the areas of their
lawns in an ordered list:
A = (200, 300, 50, 50, 100, 100, 200, 500, 1000, 100) . (4.5.1)
He also listed the number of times he mowed each lawn in a given year, for the year 1988 that ordered list was
(3) Find the angle between the diagonal of the unit cube in R and one of the coordinate axes.
3
(n) Find the angle between the diagonal of the unit (hyper)-cube in R and one of the coordinate axes.
n
(∞ What is the limit as n → ∞ of the angle between the diagonal of the unit (hyper)-cube in R and one of the coordinate axes?
n
||MX||
b) Compute ||X||
for arbitrary values of X and θ .
c) Explain your result for (b) and describe the action of M geometrically.
4. (Lorentzian Strangeness). For this problem, consider R with the Lorentzian inner product defined in example 46 of section 4.3.
n
x
⎛ ⎞
n⋅⎜ y ⎟ = n⋅p (4.5.3)
⎝ ⎠
z
where the vector p labels a given point on the plane and n is a vector normal to the plane. Let N and P be vectors in R 101
and
4.5.1 https://math.libretexts.org/@go/page/1871
1
x
⎛ ⎞
2
⎜ x ⎟
X =⎜ ⎟. (4.5.4)
⎜ ⎟
⎜ ⋮ ⎟
⎝ 101 ⎠
x
⎜1⎟ ⎜ 2 ⎟
⎜ ⎟ ⎜ ⎟
⎜ ⎟ ⎜ ⎟
u = ⎜ 1 ⎟ and v = ⎜ 3 ⎟ (4.5.5)
⎜ ⎟ ⎜ ⎟
⎜ ⎟ ⎜ ⎟
⎜ ⋮ ⎟ ⎜ ⋮ ⎟
⎝ ⎠ ⎝ ⎠
1 101
Find the projection of v onto u and the projection of u onto v . (Hint: Remember that two vectors u and v define a plane, so first
work out how to project one vector onto another in a plane. The picture from Section 14.4 could help.)
8. If the solution set to the equation A(x) = b is the set of vectors whose tips lie on the paraboloid z = x 2
+y
2
, then what can you
say about the function A ?
9. Find a system of equations whose solution set is
⎧ 1 −1 0 ∣ ⎫
⎪⎛ ⎞ ⎛ ⎞ ⎛ ⎞ ⎪
⎪
⎪ ∣ ⎪
⎪
⎜1⎟ ⎜ −1 ⎟ ⎜ 0 ⎟∣
⎨⎜ ⎟ + c1 ⎜ ⎟ + c2 ⎜ ⎟ c1 , c2 ∈ R⎬ . (4.5.6)
⎜2⎟ ⎜ 0 ⎟ ⎜ −1 ⎟∣
⎪
⎪ ⎪
⎪
⎩⎝
⎪ ∣ ⎭
⎪
⎠ ⎝ ⎠ ⎝ ⎠
0 1 −3 ∣
Give a general procedure for going from a parametric description of a hyperplane to a system of equations with that hyperplane as a
solution set.
10. If A is a linear operator and both x =v and x = cv (for any real number c ) are solutions to Ax = b , then what can you say
about b ?
4.5.1: Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 4.5: Review Problems is shared under a not declared license and was authored, remixed, and/or curated by David Cherney, Tom
Denton, & Andrew Waldron.
4.5.2 https://math.libretexts.org/@go/page/1871
CHAPTER OVERVIEW
5: Vector Spaces
The two key properties of vectors are that they can be added together and multiplied by scalars, so we make the following
definition.
Definition
A vector space (V , +, . , R) is a set V with two operations + and ⋅ satisfying the following properties for all u, v ∈ V and
c, d ∈ R:
Remark
Rather than writing (V , +, . , R), we will often say "let V be a vector space over R''. If it is obvious that the numbers used are
real numbers, then "let V be a vector space'' suffices. Also, don't confuse the scalar product with the dot product. The scalar
product is a function that takes as inputs a number and a vector and returns a vector as its output. This can be written:
⋅: R × V → V . (5.1)
Similarly
+ : V ×V → V . (5.2)
On the other hand, the dot product takes two vectors and returns a number. Succinctly: ⋅: V × V → R . Once the properties of a
vector space have been verified, we'll just write scalar multiplication with juxtaposition cv = c ⋅ v , though, to avoid confusing
the notation.
5.1: Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 5: Vector Spaces is shared under a not declared license and was authored, remixed, and/or curated by David Cherney, Tom
Denton, & Andrew Waldron.
1
5.1: Examples of Vector Spaces
One can find many interesting vector spaces, such as the following:
Example 5.1.1:
N
R = {f ∣ f : N → R} (5.1.1)
Here the vector space is the set of functions that take in a natural number n and return a real number. The addition is just
addition of functions: (f + f )(n) = f (n) + f (n) . Scalar multiplication is just as simple: c ⋅ f (n) = cf (n) .
1 2 1 2
We can think of these functions as infinitely large ordered lists of numbers: f (1) = 1 = 1 is the first component,
3
f (2) = 2 = 8 is the second, and so on. Then for example the function f (n) = n would look like this:
3 3
1
⎛ ⎞
⎜ 8 ⎟
⎜ ⎟
⎜ 27 ⎟
⎜ ⎟
f =⎜ ⎟. (5.1.2)
⎜ ⎟
⎜ ⋮ ⎟
⎜ ⎟
3
⎜n ⎟
⎜ ⎟
⎝ ⎠
⋮
Thinking this way, R is the space of all infinite sequences. Because we can not write a list infinitely long (without infinite
N
time and ink), one can not define an element of this space explicitly; definitions that are implicit, as above, or algebraic as in
f (n) = n
3
(for all n ∈ N ) suffice.
Let's check some axioms.
(+i) (Additive Closure) (f + f )(n) = f (n) + f (n) is indeed a function N → R , since the sum of two real numbers is
1 2 1 2
a real number.
(+iv) (Zero) We need to propose a zero vector. The constant zero function g(n) = 0 works because then
f (n) + g(n) = f (n) + 0 = f (n) .
The other axioms should also be checked. This can be done using properties of the real numbers.
as is scalar multiplication
\[c\cdot f(x)=cf(x)\, .$$
To check that R is a vector space use the properties of addition of functions and scalar multiplication of functions as in the
R
previous example.
We can not write out an explicit definition for one of these functions either, there are not only infinitely many components, but
even infinitely many components between any two components! You are familiar with algebraic definitions like
2
f (x) = e
x −x+5
. However, most vectors in this vector space can not be defined algebraically. For example, the nowhere
continuous function
1, x ∈ Q
f (x) = { (5.1.5)
0, x ∉ Q
5.1.1 https://math.libretexts.org/@go/page/1878
Example 5.1.3
R
{∗,⋆,#}
= {f : {∗, ⋆, #} → R} . Again, the properties of addition and scalar multiplication of functions show that this is a
vector space.
You can probably figure out how to show that R is vector space for any set
S
S . This might lead you to guess that all vector
spaces are of the form R for some set S . The following is a counterexample.
S
Another very important example of a vector space is the space of all differentiable functions:
d
∣
{f : R → R f exists} . (5.1.6)
∣
dx
From calculus, we know that the sum of any two differentiable functions is differentiable, since the derivative distributes over
addition. A scalar multiple of a function is also differentiable, since the derivative commutes with scalar multiplication (
f ). The zero function is just the function such that 0(x) = 0 for every x. The rest of the vector space properties
d d
(cf ) = c
dx dx
Similarly, the set of functions with at least k derivatives is always a vector space, as is the space of functions with infinitely many
derivatives. None of these examples can be written as RS for some set S . Despite our emphasis on such examples, it is also not
true that all vector spaces consist of functions. Examples are somewhat esoteric, so we omit them.
Another important class of examples is vector spaces that live inside R but are not themselves R .
n n
Let
1 1 1
⎛ ⎞
⎧ −1 −1 ∣ ⎫
⎪ ⎛ ⎞ ⎛ ⎞ ⎪
∣
⎨c1 ⎜ 1 ⎟ + c2 ⎜ 0 ⎟ c1 , c2 ∈ R⎬ . (5.1.8)
∣
⎩
⎪ ⎝ ⎠ ⎝ ⎠ ⎭
⎪
0 1 ∣
1
⎛ ⎞
This set is not equal to R since it does not contain, for example, ⎜ 0 ⎟.
3
⎝ ⎠
0
−1 −1 −1 −1 −1 −1
⎡ ⎛ ⎞ ⎛ ⎞⎤ ⎡ ⎛ ⎞ ⎛ ⎞⎤ ⎛ ⎞ ⎛ ⎞
⎣ ⎝ ⎠ ⎝ ⎠⎦ ⎝ ⎠ ⎝ ⎠
0 1 0 1
This example is called a subspace because it gives a vector space inside another vector space. See chapter 9 for details. Indeed,
because it is determined by the linear map given by the matrix M , it is called ker M , or in words, the kernel of M , for this see
chapter 16.
5.1.2 https://math.libretexts.org/@go/page/1878
Similarly, the solution set to any homogeneous linear equation is a vector space: Additive and multiplicative closure follow from
the following statement, made using linearity of matrix multiplication:
If M x1 = 0 and M x2 = 0 then M (c1 x1 + c2 x2 ) = c1 M x1 + c2 M x2 = 0 + 0 = 0. (5.1.11)
A powerful result, called the subspace theorem (see chapter 9) guarantees, based on the closure properties alone, that homogeneous
solution sets are vector spaces.
More generally, if V is any vector space, then any hyperplane through the origin of V is a vector space.
Example 5.1.7 :
Consider the functions f (x) = e and g(x) = e in R . By taking combinations of these two vectors we can form the plane
x 2x R
and e .
1
2
2x
A hyperplane which does not contain the origin cannot be a vector space because it fails condition (+iv).
It is also possible to build new vector spaces from old ones using the product of sets. Remember that if V and W are sets, then
their product is the new set
or in words, all ordered pairs of elements from V and W . In fact V ×W is a vector space if V and W are. We have actually been
using this fact already:
Example 5.1.8 :
The real numbers R form a vector space (over R). The new vector space
R × R = {(x, y)|x ∈ R, y ∈ R} (5.1.13)
5.1.1: Non-Examples
The solution set to a linear non-homogeneous equation is not a vector space because it does not contain the zero vector and
therefore fails (iv).
Example \(\PageIndex{9}\):
The solution set to
1 1 x 1
( )( ) =( ) (5.1.15)
0 0 y 0
1 −1 0
is {( ) +c ( )
∣
∣
c ∈ R} . The vector ( ) is not in this set.
0 1 0
Do notice that once just one of the vector space rules is broken, the example is not a vector space. Most sets of n -vectors are not
vector spaces.
Example 5.1.11 :
a 1 1 −2
P := {( )
∣
∣
a, b ≥ 0} is not a vector space because the set fails (⋅i) since ( ) ∈ P but −2 ( ) =( ) ∉ P .
b 1 1 −2
Sets of functions other than those of the form R should be carefully checked for compliance with the definition of a vector space.
S
5.1.3 https://math.libretexts.org/@go/page/1878
Example 5.1.12 :
does not form a vector space because it does not satisfy (+i). The functions f (x) = x 2
+1 and g(x) = −5 are in the set, but
their sum (f + g)(x) = x − 4 = (x + 2)(x − 2) is not since (f + g)(2) = 0 .
2
This page titled 5.1: Examples of Vector Spaces is shared under a not declared license and was authored, remixed, and/or curated by David
Cherney, Tom Denton, & Andrew Waldron.
5.1.4 https://math.libretexts.org/@go/page/1878
5.2: Other Fields
Above, we defined vector spaces over the real numbers. One can actually define vector spaces over any field. This is referred to as
choosing a different base field. A field is a collection of "numbers'' satisfying properties which are listed in appendix B. An
example of a field is the complex numbers,
2
C = {x + iy ∣ i = −1, x, y ∈ R} . (5.2.1)
Example 5.2.1 :
In quantum physics, vector spaces over C describe all possible states a physical system of particles can have.
For example,
λ
V = {( ) ∣ λ, μ ∈ C} (5.2.2)
μ
1 0
is the set of possible states for an electron's spin. The vectors ( ) and ( ) describe, respectively, an electron with spin "up''
0 1
i
and "down'' along a given direction. Other vectors, like ( ) are permissible, since the base field is the complex numbers.
−i
Such states represent a mixture of spin up and spin down for the given direction (a rather counterintuitive yet experimentally
verifiable concept), but a given spin in some other direction.
Complex numbers are very useful because of a special property that they enjoy: every polynomial over the complex numbers
factors into a product of linear polynomials. For example, the polynomial
2
x +1 (5.2.3)
doesn't factor over real numbers, but over complex numbers it factors into
(x + i)(x − i) . (5.2.4)
x =i and x = −i . This property ends has far-reaching consequences: often in mathematics problems that are very difficult using
only real numbers, become relatively simple when working over the complex numbers. This phenomenon occurs when
diagonalizing matrices, see chapter 13.
The rational numbers Q are also a field. This field is important in computer algebra: a real number given by an infinite string of
numbers after the decimal point can't be stored by a computer. So instead rational approximations are used. Since the rationals are a
field, the mathematics of vector spaces still apply to this special case.
Another very useful field is bits
B2 = Z2 = {0, 1} , (5.2.6)
+ 0 1
0 0 1 (5.2.7)
1 1 0
\qquad
5.2.1 https://math.libretexts.org/@go/page/1885
× 0 1
0 0 0 (5.2.8)
1 0 1
\]
These rules can be summarized by the relation 2 = 0 . For bits, it follows that −1 = 1 !
The theory of fields is typically covered in a class on abstract algebra or Galois theory.
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 5.2: Other Fields is shared under a not declared license and was authored, remixed, and/or curated by David Cherney, Tom
Denton, & Andrew Waldron.
5.2.2 https://math.libretexts.org/@go/page/1885
5.3: Review Problems
x ∣
1. Check that {( ) ∣ x, y ∈ R} = R
2
(with the usual addition and scalar multiplication) satisfies all of the parts in the definition
y ∣
of a vector space.
2.
a) Check that the complex numbers C = {x + iy ∣ i = −1, x, y ∈ R} , satisfy all of the parts in the definition of a vector space
2
over C. Make sure you state carefully what your rules for vector addition and scalar multiplication are.
b) What would happen if you used R as the base field (try comparing to problem 1).
3.
a) Consider the set of convergent sequences, with the same addition and scalar multiplication that we defined for the space of
sequences:
N
V = {f ∣ f : N → R, lim f ∈ R} ⊂ R . (5.3.1)
n→∞
Propose as many rules for addition and scalar multiplication as you can that satisfy some of the vector space conditions while
breaking some others.
5. Consider the set of 2 × 4 matrices:
a b c d
V = {( ) ∣ a, b, c, d, e, f , g, h ∈ C} (5.3.3)
e f g h
Propose definitions for addition and scalar multiplication in V . Identify the zero vector in V , and check that every matrix in V has
an additive inverse.
6. Let P R
3
be the set of polynomials with real coefficients of degree three or less.
a) Propose a definition of addition and scalar multiplication to make P R
3
a vector space.
b) Identify the zero vector, and find the additive inverse for the vector −3 − 2x + x . 2
c) Show that P R
3
is not a vector space over C. Propose a small change to the definition of P R
3
to make it a vector space over C.
7. Let V = {x ∈ R|x > 0} =: R+ . For x, y ∈ V and λ ∈ R , define
λ
x ⊕ y = xy , λ ⊗x = x . (5.3.4)
integers. For what set S is the set of 2 × 2 matrices the same as the set R ? Generalize to other size matrices.
S
5.3.1 https://math.libretexts.org/@go/page/1921
1, k =∗
0, k =⋆ (5.3.5)
0, k =#
\right.
,~
e_{\star} (k)= \left\{\!\!
0, k =∗
1, k =⋆ (5.3.6)
0, k =#
\right.
,~
e_{\#} (k)= \left\{\!\!
0, k =∗
0, k =⋆ (5.3.7)
1, k =#
\right. \]
10. Let V be a vector space and S any set. Show that the set of all functions mapping V → S , i.e. V
S
, is a vector space.
Hint: first decide upon a rule for adding functions whose outputs are vectors.
5.3.1: Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 5.3: Review Problems is shared under a not declared license and was authored, remixed, and/or curated by David Cherney, Tom
Denton, & Andrew Waldron.
5.3.2 https://math.libretexts.org/@go/page/1921
CHAPTER OVERVIEW
6: Linear Transformations
Definition
A function L: V → W is linear if V and W are vector spaces and for all u, v ∈ V and r, s ∈ R we have
L(ru + sv) = rL(u) + sL(v). (6.1)
Remark
We will often refer to linear functions by names like "linear map'', "linear operator'' or "linear transformation''. In some
contexts you will also see the name "homomorphism''. The definition above coincides with the two part description in chapter
1; the case r = 1, s = 1 describes additivity, while s = 0 describes homogeneity. We are now ready to learn the powerful
consequences of linearity.
6.1: Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
Thumbnail: A linear combination of one basis set of vectors (purple) obtains new vectors (red). If they are linearly independent,
these form a new basis set. The linear combinations relating the first set to the other extend to a linear transformation, called the
change of basis. (CC0; Maschen via Wikipedia)
This page titled 6: Linear Transformations is shared under a not declared license and was authored, remixed, and/or curated by David Cherney,
Tom Denton, & Andrew Waldron.
1
6.1: The Consequence of Linearity
Now that we have a sufficiently general notion of vector space it is time to talk about why linear operators are so special. Think
about what is required to fully specify a real function of one variable. One output must be specified for each input. That is an
infinite amount of information.
By contrast, even though a linear function can have infinitely many elements in its domain, it is specified by a very small amount
of information.
Example 6.1.1 :
because by homogeneity
5 1 1 5 25
L( ) = L [5 ( )] = 5L ( ) =5( ) =( ). (6.1.3)
0 0 0 3 15
because by additivity
1 1 0 1 0 5 2 7
L( ) = L [( ) +( )] = L ( ) +L( ) =( ) +( ) =( ). (6.1.6)
1 0 1 0 1 3 2 5
x 1 0
( ) =x( ) +y ( ) , (6.1.7)
y 0 1
we know how L acts on every vector from R by linearity based on just two pieces of information;
2
x 1 0 1 0 5 2 5x + 2y
L( ) = L [x ( ) +y ( )] = xL ( ) + yL ( ) =x( ) +y ( ) =( ). (6.1.8)
y 0 1 0 1 3 2 3x + 2y
Thus, the value of L at infinitely many inputs is completely specified by its value at just two inputs. (We can see now that L
5 2
( ) (6.1.9)
3 2
This is the reason that linear functions are so nice; they are secretly very simple functions by virtue of two characteristics:
1. They act on vector spaces.
6.1.1 https://math.libretexts.org/@go/page/1892
2. They act additively and homogeneously.
A linear transformation with domain R is completely specified by the way it acts on the three vectors
3
1 0 0
⎛ ⎞ ⎛ ⎞ ⎛ ⎞
Similarly, a linear transformation with domain R is completely specified by its action on the n different n -vectors that have
n
exactly one non-zero component, and its matrix form can be read off this information. However, not all linear functions have such
nice domains.
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 6.1: The Consequence of Linearity is shared under a not declared license and was authored, remixed, and/or curated by David
Cherney, Tom Denton, & Andrew Waldron.
6.1.2 https://math.libretexts.org/@go/page/1892
6.2: Linear Functions on Hyperplanes
It is not always so easy to write a linear operator as a matrix. Generally, this will amount to solving a linear systems problem.
Examining a linear function whose domain is a hyperplane is instructive.
Example 63
Let
⎧ 1 ∣
0 ⎫
⎪ ⎛ ⎞ ⎛ ⎞ ⎪
∣
V = ⎨c1 ⎜ 1 ⎟ + c2 ⎜ 1 ⎟ c1 , c2 ∈ R⎬ (6.2.1)
∣
⎩
⎪ ⎝ ⎠ ⎝ ⎠ ⎭
⎪
0 1 ∣
and consider L : V → R
3
defined by
1 0 0 0
⎛ ⎞ ⎛ ⎞ ⎛ ⎞ ⎛ ⎞
L⎜1⎟ = ⎜1⎟, L⎜1⎟ = ⎜1⎟. (6.2.2)
⎝ ⎠ ⎝ ⎠ ⎝ ⎠ ⎝ ⎠
0 0 1 0
⎣ ⎝ ⎠ ⎝ ⎠⎦ ⎝ ⎠
0 1 0
The domain of L is a plane and its range is the line through the origin in the x2 direction. It is clear how to check that L is
linear.
It is not clear how to formulate L as a matrix;
since
c1 0 0 0 c1 0
⎛ ⎞ ⎛ ⎞⎛ ⎞ ⎛ ⎞
or since
c1 0 0 0 c1 0
⎛ ⎞ ⎛ ⎞⎛ ⎞ ⎛ ⎞
you might suspect that L is equivalent to one of these 3 × 3 matrices. It is not. All 3 × 3 matrices have R as their domain, 3
and the domain of L is smaller than that. When we do realize this L as a matrix it will be as a 3 × 2 matrix. We can tell
because the domain of L is 2 dimensional and the codomain is 3 dimensional
6.2.1: Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 6.2: Linear Functions on Hyperplanes is shared under a not declared license and was authored, remixed, and/or curated by David
Cherney, Tom Denton, & Andrew Waldron.
6.2.1 https://math.libretexts.org/@go/page/1899
6.3: Linear Differential Operators
Your calculus class became much easier when you stopped using the limit definition of the derivative, learned the power rule, and
started using linearity of the derivative operator.
Example 64
Let V be the vector space of polynomials of degree 2 or less with standard addition and scalar multiplication.
2
V = { a0 ⋅ 1 + a1 x + a2 x | a0 , a1 , a2 ∈ R}
Let : V → V be the derivative operator. The following three equations, along with linearity of the derivative operator, allow
d
dx
In particular
d d d d
2 2
(a0 ⋅ 1 + a1 x + a2 x ) = a0 ⋅ 1 + a1 x + a2 x = 0 + a1 + 2 a2 .
dx dx dx dx
Thus, the derivative acting any of the infinitely many second order polynomials is determined by its action for just three inputs.
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 6.3: Linear Differential Operators is shared under a not declared license and was authored, remixed, and/or curated by David
Cherney, Tom Denton, & Andrew Waldron.
6.3.1 https://math.libretexts.org/@go/page/1906
6.4: Bases (Take 1)
The central idea of linear algebra is to exploit the hidden simplicity of linear functions. It ends up there is a lot of freedom in how
to do this. That freedom is what makes linear algebra powerful.
1 0
You saw that a linear operator acting on R is completely specified by how it acts on the pair of vectors (
2
) and ( . In fact,
)
0 1
1 1
any linear operator acting on R is also completely specified by how it acts on the pair of vectors (
2
) and ( .
)
1 −1
1 2
L( ) = ( ) (6.4.1)
1 4
and $$L
1
( ) (6.4.2)
−1
=
6
( ) (6.4.3)
8
.\]
x 1 1
This is because any vector ( ) in 2
R is a sum of multiples of ( ) and ( ) which can be calculated via a linear
y 1 −1
x 1 1
( ) = a( ) +b ( ) (6.4.4)
y 1 −1
1 1 a x
⇔ ( )( ) =( ) (6.4.5)
1 −1 b y
x+y
1 1 x ⎛ 1 0 ⎞
2
⇔ ( ) ∼ (6.4.6)
x−y
1 −1 y ⎝ 0 1 ⎠
2
x+y
a =
2
⇔ { (6.4.7)
x−y
b = .
2
Thus
x x +y 1 x −y 1
( ) = ( )+ ( ) . (6.4.8)
y 2 1 2 −1
We can then calculate how L acts on any vector by first expressing the vector as a sum of multiples and then applying linearity;
6.4.1 https://math.libretexts.org/@go/page/1913
x x +y 1 x −y 1
L( ) = L[ ( )+ ( )]
y 2 1 2 −1
x +y 1 x −y 1
= L( )+ L( )
2 1 2 −1
x +y 2 x −y 6
= ( )+ ( )
2 4 2 8
x +y 3(x − y)
= ( ) +( )
2(x + y) 4(x − y)
4x − 2y
= ( )
6x − y
It should not surprise you to learn there are infinitely many pairs of vectors from R with the property that any vector can be
2
expressed as a linear combination of them; any pair that when used as columns of a matrix gives an invertible matrix works. Such a
pair is called a {\it basis}\index{basis} for R .
2
Similarly, there are infinitely many triples of vectors with the property that any vector from R can be expressed as a linear
3
combination of them: these are the triples that used as columns of a matrix give an invertible matrix. Such a triple is called a basis
for R .
3
In a similar spirit, there are infinitely many pairs of vectors with the property that every vector in
⎧ 1 ∣0 ⎫
⎪ ⎛ ⎞ ⎛ ⎞ ⎪
∣
V = ⎨c1 ⎜ 1 ⎟ + c2 ⎜ 1 ⎟ c1 , c2 ∈ R⎬ (6.4.9)
⎩ ∣ ⎭
⎪ ⎝ ⎠ ⎝ ⎠ ⎪
0 1 ∣
⎧ 1 0 ∣ ⎫ ⎧ 1 1 ∣ ⎫
⎪ ⎛ ⎞ ⎛ ⎞ ⎪ ⎪ ⎛ ⎞ ⎛ ⎞ ⎪
∣ ∣
V = ⎨c1 ⎜ 1 ⎟ + c2 ⎜ 2 ⎟ c1 , c2 ∈ R⎬ = ⎨c1 ⎜ 1 ⎟ + c2 ⎜ 3 ⎟ c1 , c2 ∈ R⎬ (6.4.10)
∣ ∣
⎩
⎪ ⎭
⎪ ⎩
⎪ ⎭
⎪
⎝ ⎠ ⎝ ⎠ ⎝ ⎠ ⎝ ⎠
0 2 ∣ 0 2 ∣
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
6.4.2 https://math.libretexts.org/@go/page/1913
This page titled 6.4: Bases (Take 1) is shared under a not declared license and was authored, remixed, and/or curated by David Cherney, Tom
Denton, & Andrew Waldron.
6.4.3 https://math.libretexts.org/@go/page/1913
6.5: Review Problems
1. Show that the pair of conditions:
L(u + v) = L(u) + L(v)
(1) { (6.5.1)
L(cv) = cL(v)
(valid for all vectors u, v and any scalar c ) is equivalent to the single condition:
b) If Q(x 2
) =x
3
and Q(2x 2
) =x
4
is it possible that Q is a linear function from polynomials to polynomials?
4. If f is a linear function such that
1 2
f ( ) = 0, and f ( ) =1, (6.5.3)
2 3
x
then what is f ( )?
y
a) Find L(1 + t + 2t 2
) .
b) Find L(a + bt + ct 2
) .
c) Find all values a, b, c such that L(a + bt + ct 2
) = 1 + 3t + 2 t
3
.
x
6. Show that the operator I that maps f to the function If defined by I f (x) := ∫
0
f (t)dt is a linear operator on the space of
continuous functions.
7. Let . Recall that we can express z = x + iy where x, y ∈ R, and we can form the complex conjugateof \(z by taking
z ∈ C
z = x − iy . The function c: R → R which sends (x, y) ↦ (x, −y) agrees with complex conjugation.
¯
¯¯ 2 2
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 6.5: Review Problems is shared under a not declared license and was authored, remixed, and/or curated by David Cherney, Tom
Denton, & Andrew Waldron.
6.5.1 https://math.libretexts.org/@go/page/1928
CHAPTER OVERVIEW
7: Matrices
Matrices are a powerful tool for calculations involving linear transformations. It is important to understand how to find the matrix
of a linear transformation and properties of matrices.
7.1: Linear Transformations and Matrices
7.2: Review Problems
7.3: Properties of Matrices
7.4: Review Problems
7.5: Inverse Matrix
7.6: Review Problems
7.7: LU Redux
7.8: Review Problems
7.1: Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 7: Matrices is shared under a not declared license and was authored, remixed, and/or curated by David Cherney, Tom Denton, &
Andrew Waldron.
1
7.1: Linear Transformations and Matrices
Ordered, finite-dimensional, bases for vector spaces allows us to express linear operators as matrices.
Example 7.1.1:
Let
a b ∣
V = {( ) ∣a, b, c, d ∈ R} (7.1.1)
c d ∣
be the vector space of 2 × 2 real matrices, with addition and scalar multiplication defined componentwise. One choice of basis
is the ordered set (or list) of matrices
1 0 0 1 0 0 0 0
1 1 2 2
B = (( ),( ),( ),( )) =: (e , e , e , e ) . (7.1.2)
1 2 1 2
0 0 0 0 1 0 0 1
Given a particular vector and a basis, your job is to write that vector as a sum of multiples of basis elements. Here and arbitrary
vector v ∈ V is just a matrix, so we write
a b a 0 0 b 0 0 0 0
v = ( ) = ( ) +( ) +( ) +( )
c d 0 0 0 0 c 0 0 d
1 0 0 1 0 0 0 0
= a( ) +b ( ) +c ( ) +d( )
0 0 0 0 1 0 0 1
1 1 2 2
= ae +b e +c e +d e .
1 2 1 2
1 1 2 2 1 ⎜ b ⎟
1 ⎜ b ⎟
2 2
v = ae + be + ce + de =: (e , e , e , e ) ⎜ ⎟ =: ⎜ ⎟ (7.1.3)
1 2 1 2 1 2 1 2
⎜ c ⎟ ⎜ c ⎟
⎝ ⎠ ⎝ ⎠
d d B
a
⎛ ⎞
⎜ b ⎟
The column vector ⎜ ⎟ encodes the vector v but is NOT equal to it! (After all, v is a matrix so could not equal a column
⎜ c ⎟
⎝ ⎠
d
vector.) Both notations on the right hand side of the above equation really stand for the vector obtained by multiplying the
coefficients stored in the column vector by the corresponding basis element and then summing over them.
Next, lets consider a tautological example showing how to label column vectors in terms of column vectors:
The vectors
1 0
e1 = ( ), e2 = ( ) (7.1.4)
0 1
7.1.1 https://math.libretexts.org/@go/page/1935
0, if k = 1
e2 (k) = { (7.1.6)
1, if k = 2
It is natural to assign these the order: e is first and e is second. An arbitrary vector v of R can be written as
1 2
2
x
v=( ) = x e1 + y e2 . (7.1.7)
y
To emphasize that we are using the standard basis we define the list (or ordered set)
E = (e1 , e2 ) , (7.1.8)
and write
x x
( ) := (e1 , e2 ) ( ) := x e1 + y e2 = v. (7.1.9)
y E
y
Again, the first notation of a column vector with a subscript E refers to the vector obtained by multiplying each basis vector
by the corresponding scalar listed in the column and then summing these, i.e. x e + y e . The second notation denotes exactly
1 2
the same thing but we first list the basis elements and then the column vector; a useful trick because this can be read in the
same way as matrix multiplication of a row vector times a column vector--except that the entries of the row vector are
themselves vectors!
You should already try to write down the standard basis vectors for R for other values of n and express an arbitrary vector in
n
R in terms of them.
n
The last example probably seems pedantic because column vectors are already just ordered lists of numbers and the basis notation
has simply allowed us to "re-express'' these as lists of numbers. Of course, this objection does not apply to more complicated vector
spaces like our first matrix example. Moreover, as we saw earlier, there are infinitely many other pairs of vectors in R that form a 2
basis.
1 1
b =( ) , β =( ). (7.1.10)
1 −1
1, if k = 1
β(k) = { (7.1.12)
−1, if k = 2
Notice something important: there is no reason to say that β comes before b or vice versa. That is, there is no a priori reason
to give these basis elements one order or the other. However, it will be necessary to give the basis elements an order if we want
to use them to encode other vectors. We choose one arbitrarily; let
B = (b, β) (7.1.13)
be the ordered basis. Note that for an unordered set we use the {} parentheses while for lists or ordered sets we use ().
As before we define
x x
( ) := (b, β) ( ) := xb + yβ . (7.1.14)
y B
y
7.1.2 https://math.libretexts.org/@go/page/1935
You might think that the numbers x and y denote exactly the same vector as in the previous example. However, they do not.
Inserting the actual vectors that b and β represent we have
1 1 x +y
xb + yβ = x ( ) +y ( ) =( ) . (7.1.15)
1 −1 x −y
Only in the standard basis E does the column vector of v agree with the column vector that v actually is!
Based on the above example, you might think that our aim would be to find the "standard basis'' for any problem. In fact, this is far
from the truth. Notice, for example that the vector
1
v=( ) = e1 + e2 = b (7.1.17)
1
1
v=( ) , (7.1.18)
1 E
1
v=( ) , (7.1.19)
0 B
which is actually a simpler column vector! The fact that there are many bases for any given vector space allows us to choose a basis
in which our computation is easiest. In any case, the standard basis only makes sense for R . Suppose your vector space was the n
⎧ 1 0∣ ⎫
⎪ ⎛ ⎞ ⎛ ⎞ ⎪
∣
V = ⎨c1 ⎜ 1 ⎟ + c2 ⎜ 1 ⎟ c1 , c2 ∈ R⎬ (7.1.20)
∣
⎩
⎪ ⎝ ⎠ ⎝ ⎠ ⎭
⎪
0 1 ∣
0 1 y
⎛ ⎞ ⎛ ⎞ ⎛ ⎞
x
( ) := x b2 + y b2 = x ⎜ 1 ⎟ + y ⎜ 1 ⎟ = ⎜ x + y ⎟ . (7.1.23)
y B
′
⎝ ⎠ ⎝ ⎠ ⎝ ⎠
1 0 x E
7.1.3 https://math.libretexts.org/@go/page/1935
Finding the column vector of a given vector in a given basis usually amounts to a linear systems problem:
Let
z u ∣
V = {( ) ∣z, u, v ∈ C} (7.1.24)
v −z ∣
be the vector space of trace-free complex-valued matrices (over C) with basis B = (σ x, σy , σz ) , where
0 1 0 −i 1 0
σx = ( ) , σy = ( ) , σz = ( ) . (7.1.25)
1 0 i 0 0 −1
These three matrices are the famous Pauli matrices , they are used to describe electrons in quantum theory.
Let
−2 + i 1 +i
v=( ) . (7.1.26)
3 −i −2 − i
This gives three equations, i.e. a linear systems problem, for the α 's
x y
⎧α
⎪
− iα = 1 +i
x y
⎨α + iα = 3 −i (7.1.28)
⎩
⎪ z
α = −2 + i
with solution
x y z
α =2, α = 2 − 2i , α = −2 + i . (7.1.29)
Hence
2
⎛ ⎞
v = ⎜ 2 − 2i ⎟ . (7.1.30)
⎝ ⎠
−2 + i
B
⎝ n ⎠
α
1 2 n i
v = α b1 + α b2 + ⋯ + α bn = ∑ α bi . (7.1.32)
i=1
The numbers (α 1 2 n
,α ,…,α ) are called the components of the vector v . Two useful shorthand notations for this are
7.1.4 https://math.libretexts.org/@go/page/1935
1 1
α α
⎛ ⎞ ⎛ ⎞
2 2
⎜ α ⎟ ⎜ α ⎟
v=⎜ ⎟ = (b1 , b2 , … , bn ) ⎜ ⎟ . (7.1.33)
⎜ ⎟ ⎜ ⎟
⎜ ⋮ ⎟ ⎜ ⋮ ⎟
⎝ n ⎠ ⎝ n ⎠
α B
α
1 j
L(bi ) = m β +⋯ +m β +⋯ (7.1.34)
i 1 i j
Remark
To calculate the matrix of a linear transformation you must compute what the linear transformation does to every input basis
vector and then write the answers in terms of the output basis vectors:
1 1 1
m m m
⎛ 1 ⎞ ⎛ 2 ⎞ ⎛ i ⎞
2 2 2
⎜ m2 ⎟ ⎜ m2 ⎟ ⎜ mi ⎟
⎜ ⎟ ⎜ ⎟ ⎜ ⎟
⎜ ⎟ ⎜ ⎟ ⎜ ⎟
⎜ ⎟ ⎜ ⎟ ⎜ ⎟
= ((β1 , β2 , … , βj , …) ⋮ , (β1 , β2 , … , βj , …) ⋮ , ⋯ , (β1 , β2 , … , βj , …) ⋮ ,⋯)
⎜ ⎟ ⎜ ⎟ ⎜ ⎟
⎜ j ⎟ ⎜ j ⎟ ⎜ j ⎟
⎜m ⎟ ⎜m ⎟ ⎜m ⎟
⎜ 1 ⎟ ⎜ 2 ⎟ ⎜ i ⎟
⎝ ⎠ ⎝ ⎠ ⎝ ⎠
⋮ ⋮ ⋮
1 1 1
m m ⋯ m ⋯
⎛ 1 2 i ⎞
2 2 2
⎜m m ⋯ m
i
⋯⎟
1 2
⎜ ⎟
⎜ ⎟
= (β1 , β2 , … , βj , …) ⎜ ⋮ ⋮ ⋮ ⎟
⎜ ⎟
⎜ j j j ⎟
⎜m m ⋯ m ⋯⎟
⎜ 1 2 i ⎟
⎝ ⎠
⋮ ⋮ ⋮
Example 7.1.6 :
Consider L : V → R
3
defined by
1 0 0 0
⎛ ⎞ ⎛ ⎞ ⎛ ⎞ ⎛ ⎞
L⎜1⎟ = ⎜1⎟ , L⎜1⎟ = ⎜1⎟ . (7.1.35)
⎝ ⎠ ⎝ ⎠ ⎝ ⎠ ⎝ ⎠
0 0 1 0
We had trouble expressing this linear operator as a matrix. Lets take input basis
1 0
⎛⎛ ⎞ ⎛ ⎞⎞
B = ⎜⎜ 1 ⎟ , ⎜ 1 ⎟⎟ =: (b1 , b2 ) , (7.1.37)
⎝⎝ ⎠ ⎝ ⎠⎠
0 1
7.1.5 https://math.libretexts.org/@go/page/1935
and output basis
1 0 0
⎛⎛ ⎞ ⎛ ⎞ ⎛ ⎞⎞
E = ⎜⎜ 0 ⎟ , ⎜ 1 ⎟ , ⎜ 0 ⎟⎟ . (7.1.38)
⎝⎝ ⎠ ⎝ ⎠ ⎝ ⎠⎠
0 0 1
Then
Lb1 = 0. e1 + 1. e2 + 0. e3 = Lb2 , (7.1.39)
or
0 0 0 0
⎛ ⎞ ⎛ ⎞ ⎛ ⎞
(Lb1 , Lb2 ) = ((e1 , e2 , e3 ) ⎜ 1 ⎟ , (e1 , e2 , e3 ) ⎜ 1 ⎟ ) = (e1 , e2 , e3 ) ⎜ 1 1⎟ . (7.1.40)
⎝ ⎠ ⎝ ⎠ ⎝ ⎠
0 0 0 0
The matrix on the right is the matrix of L in these bases. More succinctly we could write
0
⎛ ⎞
x
L( ) = (x + y)⎜ 1 ⎟ (7.1.41)
y B ⎝ ⎠
0 E
⎜1 1⎟ (7.1.42)
⎝ ⎠
0 0
.
Hence
0 0
⎛⎛ ⎞ ⎞
x x
L( ) = ⎜⎜ 1 1⎟( )⎟ ; (7.1.43)
y y
B ⎝⎝ ⎠ ⎠
0 0 E
given input and output bases, the linear operator is now encoded by a matrix.
Linear operators become matrices when given ordered input and output bases. (7.1.44)
Example 7.1.7 :
Lets compute a matrix for the derivative operator acting on the vector space of polynomials of degree 2 or less:
2
V = { a0 1 + a1 x + a2 x | a0 , a1 , a2 ∈ R} . (7.1.45)
⎝ ⎠
c B
and
7.1.6 https://math.libretexts.org/@go/page/1935
a b
⎛ ⎞ ⎛ ⎞
d
2
⎜ b ⎟ = b ⋅ 1 + 2cx + 0 x = ⎜ 2c ⎟ (7.1.47)
dx
⎝ ⎠ ⎝ ⎠
c B
0 B
Notice this last equation makes no sense without explaining which bases we are using!
7.1.2.1: Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 7.1: Linear Transformations and Matrices is shared under a not declared license and was authored, remixed, and/or curated by
David Cherney, Tom Denton, & Andrew Waldron.
7.1.7 https://math.libretexts.org/@go/page/1935
7.2: Review Problems
1. A door factory can buy supplies in two kinds of packages, f and g . The package f contains 3 slabs of wood, 4 fasteners, and 6
c) Let L be the manufacturing process; it takes in supply packages and gives out two products (doors, and door frames) and it is
linear in supplies. If Lf is 1 door and 2 frames and Lg is 3 doors and 1 frame, find a matrix for L.
2. You are designing a simple keyboard synthesizer with two keys. If you push the first key with intensity a then the speaker moves
in time as a sin(t) . If you push the second key with intensity b then the speaker moves in time as b sin(2t). If the keys are pressed
simultaneously,
a) Describe the set of all sounds that come out of your synthesizer. (Hint: Sounds can be "added".)
3
b) Graph the function ( ) ∈ R
{1,2}
.
5
3
c) Let B = (sin(t), sin(2t)) . Explain why ( ) is not in R {1,2}
but is still a function.
5 B
3
d) Graph the function ( ) .
5 B
3.
a) Find the matrix for d
dx
acting on the vector space V of polynomials of degree 2 or less in the ordered basis B ′ 2
= (x , x, 1)
b) Use the matrix from part (a) to rewrite the differential equation d
dx
p(x) = x as a matrix equation. Find all solutions of the matrix
equation. Translate them into elements of V .
c) Find the matrix for d
dx
acting on the vector space V in the ordered basis (x 2
+ x, x
2
− x, 1) .
d) Use the matrix from part (c) to rewrite the differential equation d
dx
p(x) = x as a matrix equation. Find all solutions of the matrix
equation. Translate them into elements of V .
e) Compare and contrast your results from parts (b) and (d).
4. Find the "matrix'' for acting on the vector space of all power series in the ordered basis (1, x, x , x , . . . ). Use this matrix to
dx
d 2 3
find all power series solutions to the differential equation f (x) = x . Hint: your "matrix'' may not have finite size.
d
dx
dx
2
acting on {c 1 cos(x) + c2 sin(x)| c1 , c2 ∈ R} in the ordered basis (cos(x), sin(x)).
cosh(x) =
e +e
2
, sinh(x) =
e −e
2
.)
7. Let B = (1, x, x 2
) be an ordered basis for
2
V = { a0 + a1 x + a2 x | a0 , a1 , a2 ∈ R} , (7.2.1)
and let B′ 3 2
= (x , x , x, 1) be an ordered basis for
2 3
W = { a0 + a1 x + a2 x + a3 x | a0 , a1 , a2 , a3 ∈ R} , (7.2.2)
7.2.1 https://math.libretexts.org/@go/page/1942
relative to these bases.
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 7.2: Review Problems is shared under a not declared license and was authored, remixed, and/or curated by David Cherney, Tom
Denton, & Andrew Waldron.
7.2.2 https://math.libretexts.org/@go/page/1942
7.3: Properties of Matrices
The objects of study in linear algebra are linear operators. We have seen that linear operators can be represented as matrices
through choices of ordered bases, and that matrices provide a means of efficient computation. We now begin an in depth study of
matrices.
⎝ r r r ⎠
m m ⋯ m
1 2 k
The numbers m are called entries. The superscript indexes the row of the matrix and the subscript indexes the column of the
i
j
An r × 1 matrix v = (v r
1
r
) = (v ) is called a column vector, written
1
v
⎛ ⎞
2
⎜v ⎟
v=⎜ ⎟ . (7.3.2)
⎜ ⎟
⎜ ⋮ ⎟
⎝ r ⎠
v
A 1 × k matrix v = (v 1
k
) = (vk ) is called a row vector, written
v = ( v1 v2 ⋯ vk ) . (7.3.3)
The transpose of a column vector is the corresponding row vector and vice versa:
Example 7.3.1 :
Let
1
⎛ ⎞
v=⎜2⎟ . (7.3.4)
⎝ ⎠
3
Then
T
v =(1 2 3) , (7.3.5)
and (v T T
) =v .
In computer graphics, you may have encountered image files with a .gif extension. These files are actually just matrices: at the
start of the file the size of the matrix is given, after which each number is a matrix entry indicating the color of a particular
pixel in the image.
This matrix then has its rows shuffled a bit: by listing, say, every eighth row, a web browser downloading the file can start
displaying an incomplete version of the picture before the download is complete.
Finally, a compression algorithm is applied to the matrix to reduce the file size.
7.3.1 https://math.libretexts.org/@go/page/1949
Example 7.3.3 :
Graphs occur in many applications, ranging from telephone networks to airline routes. In the subject of graph theory, a graph is
just a collection of vertices and some edges connecting vertices. A matrix can be used to indicate how many edges attach one
vertex to another.
For example, the graph pictured above would have the following matrix, where m indicates the number of edges between the
i
j
1 2 1 1
⎛ ⎞
⎜2 0 1 0⎟
M =⎜ ⎟ (7.3.6)
⎜1 1 0 1⎟
⎝ ⎠
1 0 1 3
j
This is an example of a symmetric matrix , since m i
j
=m
i
.
The set of all r × k matrices
r i i
M := {(m )| m ∈ R; i = 1, … , r; j = 1 … k} , (7.3.7)
k j j
is itself a vector space with addition and scalar multiplication defined as follows:
i i i i
M + N = (m ) + (n ) = (m +n ) (7.3.8)
j j j j
i i
rM = r(m ) = (rm ) (7.3.9)
j j
In other words, addition just adds corresponding entries in two matrices, and scalar multiplication multiplies every entry.
Notice that M = R is just the vector space of column vectors.
n
1
n
Recall that we can multiply an r × k matrix by a k × 1 column vector to produce a r × 1 column vector using the rule
k
i j
M V = (∑ m v ) . (7.3.10)
j
j=1
This suggests the rule for multiplying an r × k matrix M by a k × s matrix~N : our k × s matrix N consists of s column
vectors side-by-side, each of dimension k × 1. We can multiply our r × k matrix M by each of these s column vectors using
the rule we already know, obtaining s column vectors each of dimension r × 1. If we place these s column vectors side-by-
side, we obtain an r × s matrix M N .
That is, let
1 1 1
n n ⋯ ns
⎛ 1 2 ⎞
2 2 2
⎜n n ⋯ ns ⎟
1 2
⎜ ⎟
N =⎜ ⎟ (7.3.11)
⎜ ⎟
⎜ ⋮ ⋮ ⋮ ⎟
⎝ k k k ⎠
n n ⋯ ns
1 2
7.3.2 https://math.libretexts.org/@go/page/1949
1 1 1
n n ns
⎛ 1 ⎞ ⎛ 2 ⎞ ⎛ ⎞
2 2 2
⎜n ⎟ ⎜n ⎟ ⎜ ns ⎟
⎜ 1 ⎟ ⎜ 2 ⎟
N1 =⎜ ⎟ , N2 = ⎜ ⎟ , … , Ns = ⎜ ⎟. (7.3.12)
⎜ ⎟ ⎜ ⎟ ⎜ ⎟
⎜ ⋮ ⎟ ⎜ ⋮ ⎟ ⎜ ⋮ ⎟
⎝ k ⎠ ⎝ k ⎠ ⎝ k ⎠
n n ns
1 2
Then
| | | | | |
⎛ ⎞ ⎛ ⎞
M N = M ⎜ N1 N2 ⋯ Ns ⎟ = ⎜ M N1 M N2 ⋯ M Ns ⎟ (7.3.13)
⎝ ⎠ ⎝ ⎠
| | | | | |
i i p
ℓ = ∑ mp n . (7.3.14)
j j
p=1
Notice that in order for the multiplication make sense, the columns and rows must match. For an r × k matrix M and an s × m
matrix N , then to make the product M N we must have k = s . Likewise, for the product N M , it is required that m = r . A
common shorthand for keeping track of the sizes of the matrices involved in a given product is:
(r × k) × (k × m) = (r × m) (7.3.15)
Example 7.3.4:
1 1⋅2 1⋅3 2 3
⎛ ⎞ ⎛ ⎞ ⎛ ⎞
The entries of M N are made from the dot products of the rows of M with the columns of N . (7.3.17)
Example 7.3.5:
Let
T
1 3 u
⎛ ⎞ ⎛ ⎞
2 3 1
T
M =⎜3 5 ⎟ =: ⎜ v ⎟ and N = ( ) =: ( a b c) (7.3.18)
0 1 0
⎝ ⎠ ⎝ T ⎠
2 6 w
where
1 3 2 2 3 1
u =( ) , v=( ) , w =( ) , a =( ) , b =( ) , c =( ) . (7.3.19)
3 5 6 0 1 0
Then
$$
MN=\left(\!
u⋅a u⋅b u⋅c
v⋅ a v⋅ b v⋅ c (7.3.20)
7.3.3 https://math.libretexts.org/@go/page/1949
\!\right)
=
2 6 1
⎛ ⎞
⎜6 14 3⎟ (7.3.21)
⎝ ⎠
4 12 2
\, .
\]
Theorem: orthogonal
Let M be a matrix and x a column vector. If
Mx = 0 (7.3.22)
Remark
Remember that the set of all vectors that can be obtained by adding up scalar multiples of the columns of a matrix is called its
column space. Similarly the row space is the set of all row vectors obtained by adding up multiples of the rows of a matrix.
The above theorem says that if M x = 0 , then the vector x is orthogonal to every vector in the row space of M .
i j
MV = ∑ m v , (7.3.23)
j
j=1
which is the same rule used when we multiply an r × k matrix by a k × 1 vector to produce an r × 1 vector.
Likewise, we can use a matrix N i
= (n )
j
to define a linear transformation of a vector space of matrices. For example
N
s r
L: M ⟶ M , (7.3.24)
k k
i i i j
L(M ) = (l ) where l = ∑n m . (7.3.25)
k k j k
j=1
This is the same as the rule we use to multiply matrices. In other words, L(M ) = N M is a linear transformation.
Any r × r matrix is called a square matrix. A square matrix that is zero for all non-diagonal entries is called a diagonal matrix.
An example of a square diagonal matrix is
$$
2 0 0
⎛ ⎞
⎜0 3 0⎟ (7.3.26)
⎝ ⎠
0 0 0
\, .\]
The r × r diagonal matrix with all diagonal entries equal to 1 is called the identity matrix , I , or just I . An identity matrix looks
r
like
7.3.4 https://math.libretexts.org/@go/page/1949
1 0 0 ⋯ 0
⎛ ⎞
⎜0 1 0 ⋯ 0⎟
⎜ ⎟
⎜0 0 1 ⋯ 0⎟
I =⎜ ⎟. (7.3.27)
⎜ ⎟
⎜ ⎟
⎜ ⋮ ⋮ ⋮ ⋱ ⋮ ⎟
⎝ ⎠
0 0 0 ⋯ 1
Definition
The transpose of an r × k matrix M i
= (m )
j
is the k × r matrix with entries
T i
M = (m
^ ) (7.3.29)
j
i j
with m
^
j
=m
i
.
A matrix M is symmetric if M =M
T
.
Example 7.3.6 :
\[
2 5 6
( ) (7.3.30)
1 3 4
^{T} =
2 1
⎛ ⎞
⎜5 3⎟ (7.3.31)
⎝ ⎠
6 4
\, ,
and (7.3.32)
2 5 6
( ) (7.3.33)
1 3 4
2 5 6
( ) (7.3.34)
1 3 4
^{T} =
65 43
( ) (7.3.35)
43 26
\, ,$$
is symmetric.
7.3.1: Observations
1. Only square matrices can be symmetric.
2. The transpose of a column vector is a row vector, and vice-versa.
7.3.5 https://math.libretexts.org/@go/page/1949
3. Taking the transpose of a matrix twice does nothing. \emph{i.e.,} $(M^T)^T=M$.
Example 7.3.7 :
i.e. , the order of bracketing does not matter. The same property holds for matrix multiplication, let us show why.
Suppose M = (m )
i
j
,N = (n )
j
k
and R = (r )
k
l
are, respectively, m ×n , n×r and r×t matrices. Then from the rule for
matrix multiplication we have
n r
i j j k
M N = ( ∑ m n ) and N R = ( ∑ n r ) . (7.3.37)
j k k l
j=1 k=1
So first we compute
r n r n r n
i j k i j k i j k
(M N )R = ( ∑ [ ∑ m n ]r ) = ( ∑ ∑ [m n ]r ) = ( ∑ ∑ m n r ) . (7.3.38)
j k l j k l j k l
In the first step we just wrote out the definition for matrix multiplication, in the second step we moved summation symbol
outside the bracket (this is just the distributive property x(y + z) = xy + xz for numbers) and in the last step we used the
associativity property for real numbers to remove the square brackets. Exactly the same reasoning shows that
n r r n r n
i j k i j k i j k
M (N R) = ( ∑ m [ ∑ n r ]) = ( ∑ ∑ m [n r ]) = ( ∑ ∑ m n r ) . (7.3.39)
j k l j k l j k l
This is the same as above so we are done. As a fun remark, note that Einstein would simply have written
j j j
(M N )R = (m n )r = m n r = m (n r ) = M (N R) .
i k i k i k
j k l j k l j k l
Sometimes matrices do not share the properties of regular numbers. In particular, for generic n × n square matrices M and N ,
1 1 1 0 2 1
( )( ) =( ) (7.3.40)
0 1 1 1 1 1
7.3.6 https://math.libretexts.org/@go/page/1949
1 0 1 1 1 1
( )( ) =( ) . (7.3.41)
1 1 0 1 1 2
Since n × n matrices are linear transformations R → R , we can see that the order of successive linear transformations matters.
n n
Here is an example of matrices acting on objects in three dimensions that also shows matrices not commuting.
Example 7.3.8 :
rotates vectors in the plane by an angle θ . We can generalize this, using block matrices, to three dimensions. In fact the
following matrices built from a 2 × 2 rotation matrix, a 1 × 1 identity matrix and zeroes everywhere else
cos θ sin θ 0 1 0 0
⎛ ⎞ ⎛ ⎞
M = ⎜ − sin θ cos θ 0⎟ and N =⎜0 cos θ sin θ ⎟ , (7.3.43)
⎝ ⎠ ⎝ ⎠
0 0 1 0 − sin θ cos θ
perform rotations by an angle θ in the xy and yz planes, respectively. Because, they rotate single vectors, you can also use
them to rotate objects built from a collection of vectors like pretty colored blocks! Here is a picture of M and then N acting on
such a block, compared with the case of N followed by M . The special case of θ = 90 is shown. ∘
⎛ 1 2 3 1 ⎞
⎜ 4 5 6 0 ⎟ A B
M =⎜ ⎟ =( ) (7.3.44)
⎜ ⎟
⎜ 7 8 9 1 ⎟ C D
⎝ 0 1 2 0 ⎠
1 2 3 1
⎛ ⎞ ⎛ ⎞
Here A = ⎜ 4 5 6⎟ , B = ⎜0⎟, C =(0 1 2) , D = (0) .
⎝ ⎠ ⎝ ⎠
7 8 9 1
7.3.7 https://math.libretexts.org/@go/page/1949
B A
1. The blocks of a block matrix must fit together to form a rectangle. So ( ) makes sense, but
D C
C B
( ) does not.
D A
2. There are many ways to cut up an n × n matrix into blocks. Often context or the entries of the matrix will suggest a useful way
to divide the matrix into blocks. For example, if there are large blocks of zeros in a matrix, or blocks that look like an identity
matrix, it can be useful to partition the matrix accordingly.
3. Matrix operations on block matrices can be carried out by treating the blocks as matrix entries. In the example above,
2
A B A B
M = ( )( )
C D C D
2
A + BC AB + BD
= ( )
2
C A + DC CB + D
4
⎛ ⎞
AB + BD = ⎜ 10 ⎟
⎝ ⎠
16
18
⎛ ⎞
C A + DC = ⎜ 21 ⎟
⎝ ⎠
24
2
CB + D = (2)
⎛ 30 37 44 4 ⎞
⎜ 66 81 96 10 ⎟
⎜ ⎟ (7.3.45)
⎜ ⎟
⎜ 102 127 152 16 ⎟
⎝ 4 10 16 2 ⎠
This is exactly M . 2
0
M =I , (7.3.46)
Example 7.3.9 :
Let f (x) = x − 2x 2 3
+ 3x
and
7.3.8 https://math.libretexts.org/@go/page/1949
1 t
M =( ) . (7.3.47)
0 1
Then:
2
1 2t 3
1 3t
M =( ) , M =( ) , … (7.3.48)
0 1 0 1
Hence:
1 t 1 2t 1 3t
f (M ) = ( ) −2 ( ) +3 ( )
0 1 0 1 0 1
2 6t
= ( )
0 2
There are additional techniques to determine the convergence of Taylor Series of matrices, based on the fact that the convergence
problem is simple for diagonal matrices. It also turns out that the matrix exponential
1 1
2 3
exp(M ) = I + M + M + M +⋯ , (7.3.51)
2 3!
always converges.
7.3.5: Trace
A large matrix contains a great deal of information, some of which often reflects the fact that you have not set up your problem
efficiently. For example, a clever choice of basis can often make the matrix of a linear transformation very simple. Therefore,
finding ways to extract the essential information of a matrix is useful. Here we need to assume that n < ∞ otherwise there are
subtleties with convergence that we'd have to address.
Definition: Trace
i
trM = ∑ m . (7.3.52)
i
i=1
While matrix multiplication does not commute, the trace of a product of matrices does not depend on the order of multiplication:
i l
tr(M N ) = tr(∑ M N )
l j
i l
= ∑∑M N
l i
i l
l i
= ∑∑N M
i l
l i
l i
= tr(∑ N M )
i l
= tr(N M ).
7.3.9 https://math.libretexts.org/@go/page/1949
Theorem
Example :
7.3.11
1 1 1 0
M =( ),N =( ). (7.3.54)
0 1 1 1
so
2 1 1 1
MN = ( ) ≠ NM = ( ). (7.3.55)
1 1 1 2
This is true because the trace only uses the diagonal entries, which are fixed by the transpose. For example:
\[\textit{tr}
1 1
( ) (7.3.57)
2 3
= 4 = \textit{tr}
1 2
( ) (7.3.58)
1 3
= \textit{tr}
1 2
( ) (7.3.59)
1 3
^{T}\, .
$$
Finally, trace is a linear transformation from matrices to the real numbers. This is easy to check.
7.3.5.1: Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 7.3: Properties of Matrices is shared under a not declared license and was authored, remixed, and/or curated by David Cherney,
Tom Denton, & Andrew Waldron.
7.3.10 https://math.libretexts.org/@go/page/1949
7.4: Review Problems
1. Compute the following matrix products
1
⎛ ⎞
4 1
1 2 1 ⎛ −2 3
−
3 ⎞ ⎜2⎟
⎛ ⎞
⎜ ⎟
2⎟⎜ ⎟ , 5)⎜3⎟ ,
5 2
⎜4 5 ⎜ 2 − ⎟ (1 2 3 4 (7.4.1)
3 3
⎜ ⎟
⎝ ⎠ ⎜ ⎟
7 8 2 ⎝ ⎠ ⎜4⎟
−1 2 −1
⎝ ⎠
5
1
⎛ ⎞
4 1
⎜2⎟ 1 2 1 ⎛ −2 3
−
3 ⎞ 1 2 1
⎛ ⎞ ⎛ ⎞
⎜ ⎟
⎜3⎟(1 2 3 4 5) , ⎜4 5 2⎟⎜
5 2 ⎟⎜4 5 2⎟ , (7.4.2)
⎜ ⎟ ⎜ 2 − ⎟
3 3
⎜ ⎟ ⎝ ⎠ ⎝ ⎠
⎜4⎟ 7 8 2 ⎝ ⎠ 7 8 2
−1 2 −1
⎝ ⎠
5
2 1 2 1 2 1 2 1 2 1
⎛ ⎞⎛ ⎞
2 1 1 x ⎜0 2 1 2 1⎟⎜0 1 2 1 2⎟
⎛ ⎞⎛ ⎞
⎜ ⎟⎜ ⎟
(x y z)⎜1 2 1⎟⎜ y ⎟ , ⎜0 1 2 1 2⎟ ⎜ 1⎟ , (7.4.3)
⎜ ⎟⎜0 2 1 2 ⎟
⎝ ⎠⎝ ⎠ ⎜ ⎟⎜ ⎟
1 1 2 z ⎜0 2 1 2 1⎟⎜0 1 2 1 2⎟
⎝ ⎠⎝ ⎠
0 0 0 0 2 0 0 0 0 1
$$
4 1
⎛ −2 3
−
3 ⎞
⎜ 5 2 ⎟ (7.4.4)
⎜ 2 − ⎟
3 3
⎝ ⎠
−1 2 −1
2 2
⎛ 4 3
−
3 ⎞
⎜ 6 5 2 ⎟
− (7.4.5)
⎜ 3 3 ⎟
16 10
⎝ 12 − ⎠
3 3
1 2 1
⎛ ⎞
⎜4 5 2⎟ (7.4.6)
⎝ ⎠
7 8 2
\, .
\]
2. Let's prove the theorem (M N ) T
=N
T
M
T
.
Note: the following is a common technique for proving matrix identities.
a) Let M i
= (m )
j
and let N = (n )
i
j
. Write out a few of the entries of each matrix in the form given at the beginning of section 7.3.
b) Multiply out M N and write out a few of its entries in the same form as in part (a). In terms of the entries of M and the entries of
N , what is the entry in row i and column j of M N ?
7.4.1 https://math.libretexts.org/@go/page/1956
c) Take the transpose (M N ) and write out a few of its entries in the same form as in part (a). In terms of the entries of M and the
T
f) Show that the answers you got in parts (c) and (e) are the same.
3.
1 2 0
a) Let A = ( ) . Find AA and A T T
A and their traces.
3 −1 4
4. Let x = ⎜
⎜ ⋮
⎟
⎟ and y = ⎜
⎜ ⋮
⎟
⎟ be column vectors. Show that the dot product x ⋅ y = x T
I y .
⎝ ⎠ ⎝ ⎠
xn yn
N
5. Above, we showed that left multiplication by an r×s matrix N was a linear transformation M
k
s
⟶ M
k
r
. Show that right
R
multiplication by a k × m matrix R is a linear transformation M s
k
⟶ Mm
s
. In other words, show that right matrix multiplication
obeys linearity.
6. Let the V be a vector space where B = (v 1, v2 ) is an ordered basis. Suppose
linear
L : V −−⟶ V (7.4.7)
and
L(v1 ) = v1 + v2 , L(v2 ) = 2 v1 + v2 . (7.4.8)
Compute the matrix of L in the basis B and then compute the trace of this matrix. Suppose that ad − bc ≠ 0 and consider now the
new basis
′
B = (av1 + b v2 , c v1 + dv2 ) . (7.4.9)
Compute the matrix of L in the basis B . Compute the trace of this matrix. What do you find? What do you conclude about the
′
trace of a matrix? Does it make sense to talk about the ``trace of a linear transformation''?
7. Explain what happens to a matrix when:
λ 0
a) A = ( )
0 λ
1 λ
b) A = ( )
0 1
0 λ
c) A = ( )
0 0
7.4.2 https://math.libretexts.org/@go/page/1956
1 0 0 0 0 0 0 1
⎛ ⎞
⎜ 0 1 0 0 0 0 1 0⎟
⎜ ⎟
⎜ 0 0 1 0 0 1 0 0⎟
⎜ ⎟
⎜ ⎟
⎜ 0 0 0 1 1 0 0 0⎟
9. Let M =⎜ ⎟ . Divide M into named blocks, with one block the 4 ×4 identity matrix, and then
⎜ 0 0 0 0 2 1 0 0⎟
⎜ ⎟
⎜ ⎟
⎜ 0 0 0 0 0 2 0 0⎟
⎜ ⎟
⎜ 0 0 0 0 0 0 3 1⎟
⎝ ⎠
0 0 0 0 0 0 0 3
10. A matrix A is called anti-symmetric (or skew-symmetric matrix) if A = −A . Show that for every n × n matrix M , we can
T
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 7.4: Review Problems is shared under a not declared license and was authored, remixed, and/or curated by David Cherney, Tom
Denton, & Andrew Waldron.
7.4.3 https://math.libretexts.org/@go/page/1956
7.5: Inverse Matrix
Definition
A square matrix M is invertible (or nonsingular ) if there exists a matrix M −1
such that
−1 −1
M M = I = MM . (7.5.1)
Remark
Let M and N be the matrices:
a b d −b
M =( ), N =( ) (7.5.2)
c d −c a
d −b
Then M −1
=
1
ad−bc
( ) , so long as ad − bc ≠ 0 .
−c a
7.5.1 https://math.libretexts.org/@go/page/1963
2. Notice that B −1
A
−1
AB = B
−1
I B = I = ABB
−1
A
−1
.
Then:
−1 −1 −1
(AB) =B A (7.5.5)
Thus, much like the transpose, taking the inverse of a product reverses the order of the product.
3. Finally, recall that (AB) T
=B
T
A
T
. Since I T
=I , then (A −1
A)
T
=A
T −1
(A )
T
=I . Similarly, (AA−1 T
)
−1
= (A
T
)
T
A =I .
Then:
−1 T T −1
(A ) = (A ) (7.5.6)
\begin{amatrix}{rr}
M&V
\end{amatrix}
\sim
\begin{amatrix}{rr}
I & M^{-1}V
\end{amatrix}
M X = V1 , M X = V2 (7.5.7)
we can consider augmented matrices with many columns on the right and then apply Gaussian row reduction to the left side of the
matrix. Once the identity matrix is on the left side of the augmented matrix, then the solution of each of the individual linear
systems is on the right.
−1 −1
( M V1 V2 ) ∼ ( I M V1 M V2 ) (7.5.8)
To compute M , we would like M , rather than M V to appear on the right side of our augmented matrix. This is achieved
−1 −1 −1
by solving the collection of systems M X = e , where e is the column vector of zeroes with a 1 in the k th entry. i.e. the n × n
k k
identity matrix can be viewed as a bunch of column vectors I = (e e ⋯ e ) . So, putting the e 's together into an identity
n 1 2 n k
matrix, we get:
\begin{amatrix}{1}
M&I
\end{amatrix}
\sim
\begin{amatrix}{1}
I & M^{-1}I
\end{amatrix}
=\begin{amatrix}{1}
I & M^{-1}
\end{amatrix}
Example 7.5.1 :
−1
−1 2 −3
⎛ ⎞
Find ⎜ 2 1 0 ⎟ .
⎝ ⎠
4 −2 5
We start by writing the augmented matrix, then apply row reduction to the left side.
7.5.2 https://math.libretexts.org/@go/page/1963
⎛ −1 2 −3 1 0 0 ⎞ ⎛ 1 −2 3 1 0 0 ⎞
⎜ 2 1 0 0 1 0 ⎟ ∼ ⎜ 0 5 −6 2 1 0 ⎟
⎜ ⎟ ⎜ ⎟
⎝ ⎠ ⎝ ⎠
4 −2 5 0 0 1 0 6 −7 4 0 1
3 1 2
⎛ 1 0 − 0 ⎞
5 4 5
⎜ 6 2 1 ⎟
∼ ⎜ 0 1 − 0 ⎟
⎜ 5 5 5 ⎟
1 4 6
⎝ 0 0 − 1 ⎠
5 5 5
⎛ 1 0 0 −5 4 −3 ⎞
∼ ⎜
⎜ 0 1 0 10 −7 6 ⎟
⎟
⎝ 0 0 1 8 −6 5 ⎠
At this point, we know M assuming we didn't goof up. However, row reduction is a lengthy and arithmetically involved
−1
process, so we should check our answer, by confirming that M M = I (or if you prefer M M = I ): −1 −1
−1 2 −3 −5 4 −3 1 0 0
⎛ ⎞⎛ ⎞ ⎛ ⎞
−1
MM =⎜ 2 1 0 ⎟ ⎜ 10 −7 6 ⎟ =⎜0 1 0⎟ (7.5.9)
⎝ ⎠⎝ ⎠ ⎝ ⎠
4 −2 5 8 −6 5 0 0 1
The product of the two matrices is indeed the identity matrix, so we're done.
Example :
7.5.12
2x +y = 2 (7.5.10)
4x −2y +5z = 0
1
⎛ ⎞
The associated matrix equation is M X = ⎜ 2 ⎟ , where M is the same as in the previous section. Then:
⎝ ⎠
0
−1
x −1 2 −3 1 −5 4 −3 1 3
⎛ ⎞ ⎛ ⎞ ⎛ ⎞ ⎛ ⎞⎛ ⎞ ⎛ ⎞
⎜ y ⎟ =⎜ 2 1 0 ⎟ ⎜ 2 ⎟ = ⎜ 10 −7 6 ⎟ ⎜ 2 ⎟ = ⎜ −4 ⎟ (7.5.11)
⎝ ⎠ ⎝ ⎠ ⎝ ⎠ ⎝ ⎠⎝ ⎠ ⎝ ⎠
z 4 −2 5 0 8 −6 5 0 −4
x 3
⎛ ⎞ ⎛ ⎞
Then ⎜ y ⎟ = ⎜ −4 ⎟ . In summary, when M −1
exists, then $$MX=V \Leftrightarrow X=M^{-1}V\, .\]
⎝ ⎠ ⎝ ⎠
z −4
Proof
7.5.3 https://math.libretexts.org/@go/page/1963
First, suppose that M −1
exists. Then M X = 0 ⇒ X = M −1
0 =0 . Thus, if M is invertible, then MX = 0 has no non-zero
solutions.
On the other hand, M X = 0 always has the solution X = 0 . If no other solutions exist, then M can be put into reduced row
echelon form with every variable a pivot. In this case, M can be computed using the process in the previous section.
−1
011011000110100101101110011001010110000101110010 (7.5.13)
A bit is the basic unit of information, keeping track of a single one or zero. Computers can add and multiply individual bits very
quickly.
In chapter 5, section 5.2 it is explained how to formulate vector spaces over fields other than real numbers. In particular, for the
vectors space make sense with numbers Z = {0, 1} with addition and multiplication given by:
2
+ 0 1 × 0 1
0 0 1 0 0 0 (7.5.14)
1 1 0 1 0 1
Notice that −1 = 1 , since 1 + 1 = 0 . Therefore, we can apply all of the linear algebra we have learned thus far to matrices with Z 2
Example 7.5.3 :
1 0 1
⎛ ⎞
⎝ ⎠
1 1 1
−1
1 0 1 0 1 1
⎛ ⎞ ⎛ ⎞
⎜0 1 1⎟ =⎜1 0 1⎟ (7.5.15)
⎝ ⎠ ⎝ ⎠
1 1 1 1 1 1
1 0 1 0 1 1 1 0 0
⎛ ⎞⎛ ⎞ ⎛ ⎞
⎝ ⎠⎝ ⎠ ⎝ ⎠
1 1 1 1 1 1 0 0 1
7.5.4 https://math.libretexts.org/@go/page/1963
Application: Cryptography
A very simple way to hide information is to use a substitution cipher, in which the alphabet is permuted and each letter in a
message is systematically exchanged for another. For example, the ROT-13 cypher just exchanges a letter with the letter
thirteen places before or after it in the alphabet. For example, HELLO becomes URYYB. Applying the algorithm again
decodes the message, turning URYYB back into HELLO. Substitution ciphers are easy to break, but the basic idea can be
extended to create cryptographic systems that are practically uncrackable. For example, a one-time pad is a system that uses a
different substitution for each letter in the message. So long as a particular set of substitutions is not used on more than one
message, the one-time pad is unbreakable.
English characters are often stored in computers in the ASCII format. In ASCII, a single character is represented by a string of
eight bits, which we can consider as a vector in Z (which is like vectors in R , where the entries are zeros and ones). One
8
2
8
way to create a substitution cipher, then, is to choose an 8 × 8 invertible bit matrix M , and multiply each letter of the message
by M . Then to decode the message, each string of eight characters would be multiplied by M . −1
To make the message a bit tougher to decode, one could consider pairs (or longer sequences) of letters as a single vector in Z 16
2
(or a higher-dimensional space), and then use an appropriately-sized invertible matrix. For more on cryptography, see "The
Code Book,'' by Simon Singh (1999, Doubleday).
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 7.5: Inverse Matrix is shared under a not declared license and was authored, remixed, and/or curated by David Cherney, Tom
Denton, & Andrew Waldron.
7.5.5 https://math.libretexts.org/@go/page/1963
7.6: Review Problems
1. Find formulas for the inverses of the following matrices, when they are not singular:
1 a b
⎛ ⎞
a) ⎜ 0 1 c ⎟
⎝ ⎠
0 0 1
a b c
⎛ ⎞
b) ⎜ 0 d e ⎟
⎝ ⎠
0 0 f
4. Left and Right Inverses: So far we have only talked about inverses of square matrices. This problem will explore the notion of
a left and right inverse for a matrix that is not square. Let
0 1 1
A =( ) (7.6.1)
1 1 0
a) Compute:
(i) AA , T
−1
(ii) (AA T
) ,
−1
(iii) B := A T
(AA
T
)
b) Show that the matrix B above is a right inverse for A , i.e. , verify that
AB = I . (7.6.2)
T
Hint: you may assume that A A has an inverse.
e) Test your proposal for a left inverse for the simple example
1
A =( ) , (7.6.4)
2
7.6.1 https://math.libretexts.org/@go/page/1970
f) True or false: Left and right inverses are unique. If false give a counterexample.
5. Show that if the range (remember that the range of a function is the set of all its possible outputs) of a 3 × 3 matrix M (viewed
as a function R → R ) is a plane then one of the columns is a sum of multiples of the other columns. Show that this relationship is
3 3
preserved under EROs. Show, further, that the solutions to M x = 0 describe this relationship between the columns.
6. If M and N are square matrices of the same size such that M −1
exists and N −1
does not exist, does (M N ) −1
exist?
7. If M is a square matrix which is not invertible, is exp M invertible?
8. Elementary Column Operations (ECOs) can be defined in the same 3 types as EROs. Describe the 3 kinds of ECOs. Show that if
maximal elimination using ECOs is performed on a square matrix and a column of zeros is obtained then that matrix is not
invertible.
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 7.6: Review Problems is shared under a not declared license and was authored, remixed, and/or curated by David Cherney, Tom
Denton, & Andrew Waldron.
7.6.2 https://math.libretexts.org/@go/page/1970
7.7: LU Redux
Certain matrices are easier to work with than others. In this section, we will see how to write any square matrix M as the product
of two simpler matrices. We will write
M = LU , (7.7.1)
where:
1. L is lower triangular . This means that all entries above the main diagonal are zero. In notation, i
L = (l )
j
with i
l
j
=0 for all
j>i.
1
l 0 0 ⋯
⎛ 1 ⎞
2 2
⎜l l 0 ⋯ ⎟
1 2
⎜ ⎟
L =⎜ 3 3 3 ⎟ (7.7.2)
⎜ l1 l
2
l
3
⋯ ⎟
⎜ ⎟
⎝ ⎠
⋮ ⋮ ⋮ ⋱
2. U is upper triangular . This means that all entries below the main diagonal are zero. In notation, U = (u )
i
j
with u i
j
=0 for all
j<i.
1 1 1
u u u ⋯
⎛ 1 2 3 ⎞
2 2
⎜ 0 u u ⋯ ⎟
2 3
⎜ ⎟
U =⎜ 3 ⎟ (7.7.3)
⎜ 0 0 u
3
⋯ ⎟
⎜ ⎟
⎝ ⎠
⋮ ⋮ ⋮ ⋱
M = LU is called an LU decomposition of M .
This is a useful trick for computational reasons; it is much easier to compute the inverse of an upper or lower triangular matrix than
general matrices. Since inverses are useful for solving linear systems, this makes solving any linear system associated to the matrix
much faster as well. The determinant---a very important quantity associated with any square matrix---is very easy to compute for
triangular matrices.
Example 7.7.1:
Linear systems associated to upper triangular matrices are very easy to solve by back substitution.
$$\left(
a b 1
(7.7.4)
0 c e
⎛ 1 0 0 d ⎞
For lower triangular matrices, back substitution gives a quick solution; for upper triangular matrices, forward substitution
gives the solution.
7.7.1 https://math.libretexts.org/@go/page/1977
u
⎛ ⎞
Step 1: Set W = ⎜ v ⎟ = UX .
⎝ ⎠
w
Step 2: Solve the system LW = V . This should be simple by forward substitution since L is lower triangular. Suppose the
solution to LW = V is W . 0
Step 3: Now solve the system U X = W . This should be easy by backward substitution, since U is upper triangular. The
0
Example 7.7.2 :
2x + 12y + z = 19 (7.7.7)
4x + 15y + 3z = 0
⎝ ⎠ ⎝ ⎠⎝ ⎠
4 15 3 2 3 1 0 0 1
u
⎛ ⎞
Step 1: Set W = ⎜ v ⎟ = UX .
⎝ ⎠
w
3 0 0 u 3
⎛ ⎞⎛ ⎞ ⎛ ⎞
⎜1 6 0 ⎟ ⎜ v ⎟ = ⎜ 19 ⎟ (7.7.9)
⎝ ⎠⎝ ⎠ ⎝ ⎠
2 3 1 w 0
1
⎛ ⎞
W0 = ⎜ 3 ⎟ (7.7.10)
⎝ ⎠
−11
2 6 1 x 1
⎛ ⎞⎛ ⎞ ⎛ ⎞
⎜0 1 0⎟⎜ y ⎟ = ⎜ 3 ⎟ (7.7.11)
⎝ ⎠⎝ ⎠ ⎝ ⎠
0 0 1 z −11
7.7.2 https://math.libretexts.org/@go/page/1977
7.7.2: Finding an LU Decomposition.
In Section 2.3, Gaussian elimination was used to find LU matrix decompositions. These ideas are presented here again as review.
For any given matrix, there are actually many different LU decompositions. However, there is a unique LU decomposition in
which the L matrix has ones on the diagonal. In that case L is called a lower unit triangular matrix .
To find the LU decomposition, we'll create two sequences of matrices L , L , … and U , U , … such that at each step,
1 2 1 2
L U = M . Each of the L will be lower triangular, but only the last U will be upper triangular. The main trick for this calculation
i i i i
Consider
1 0 a b c ⋯
E =( ) , M =( ) . (7.7.12)
λ 1 d e f ⋯
Lets compute EM
a b c ⋯
EM = ( ) . (7.7.13)
d + λa e + λb f + λc ⋯
$$
obeys (check this yourself...)
a b c ⋯
−1 −1
E E = 1 . $$H ence\(M = E EM \)or, writingthisout$$ ( ) (7.7.15)
d e f ⋯
1 0 a b c ⋯
=( )( ) .
−λ 1 d + λa e + λb f + λc ⋯
Here the matrix on the left is lower triangular, while the matrix on the right has had a row operation performed on it.
We would like to use the first row of M to zero out the first entry of every row below it. For our running example,
6 18 3
⎛ ⎞
M =⎜2 12 1⎟ , (7.7.16)
⎝ ⎠
4 15 3
so we would like to perform the row operations $$R_{2}\to R_{2} -\frac{1}{3}R_{1} \mbox{ and } R_{3}\to R_{3}-\frac{2}
{3}R_{1}\, .\]
If we perform these row operations on M to produce
$$U_1=
7.7.3 https://math.libretexts.org/@go/page/1977
6 18 3
⎛ ⎞
⎜0 6 0⎟ (7.7.17)
⎝ ⎠
0 3 1
\, ,\]
we need to multiply this on the left by a lower triangular matrix L so that the product L U = M still.
1 1 1
The above example shows how to do this: Set L to be the lower triangular matrix whose first column is filled with minus the
1
\, .\]
By construction L 1 U1 =M , but you should compute this yourself as a double check.
Now repeat the process by zeroing the second column of U1 below the diagonal using the second row of U1 using the row
operation R → R − R to produce
3 3
1
2
2
6 18 3
⎛ ⎞
U2 = ⎜ 0 6 0⎟ . (7.7.19)
⎝ ⎠
0 0 1
The matrix that undoes this row operation is obtained in the same way we found L above and is:
1
\[
1 0 0
⎛ ⎞
⎜0 1 0⎟ (7.7.20)
1
⎝0 0⎠
2
\, .
L_{2}=
1 0 0
⎛ ⎞
1
⎜ 1 0⎟ (7.7.22)
⎜ 3 ⎟
⎝ 2 ⎠
0 1
3
1 0 0
⎛ ⎞
⎜0 1 0⎟ (7.7.23)
⎝ 1 ⎠
0 0
2
=
1 0 0
⎛ ⎞
1
⎜ 1 0⎟ (7.7.24)
⎜ 3 ⎟
⎝ 2 1
1⎠
3 2
\, .
$$
Notice that it is lower triangular because
The product of lower triangular matrices is always lower triangular! (7.7.25)
7.7.4 https://math.libretexts.org/@go/page/1977
Moreover it is obtained by recording minus the constants used for all our row operations in the appropriate columns (this
always works this way). Moreover, U is upper triangular and M = L U , we are done! Putting this all together we have
2 2 2
1 0 0
6 18 3 ⎛ ⎞ 6 18 3
⎛ ⎞ ⎛ ⎞
1
M =⎜2 12 1⎟ =⎜
⎜ 3
1 0⎟⎜0
⎟ 6 0⎟ . (7.7.26)
⎝ ⎠ 2 1 ⎝ ⎠
4 15 3 ⎝ 1⎠ 0 0 1
3 2
If the matrix you're working with has more than three rows, just continue this process by zeroing out the next column below
the diagonal, and repeat until there's nothing left to do.
The fractions in the L matrix are admittedly ugly. For two matrices LU , we can multiply one entire column of L by a constant
λ and divide the corresponding row of U by the same constant without changing the product of the two matrices. Then:
For matrices that are not square, LU decomposition still makes sense. Given an m × n matrix M , for example we could write
M = LU with L a square lower unit triangular matrix, and U a rectangular matrix. Then L will be an m × m matrix, and U will
be an m × n matrix (of the same shape as M ). From here, the process is exactly the same as for a square matrix. We create a
sequence of matrices L and U that is eventually the LU decomposition. Again, we start with L = I and U = M .
i i 0 0
Example 7.7.4 :
Since M is a 2 ×3 matrix, our decomposition will consist of a 2 ×2 matrix and a 2 ×3 matrix. Then we start with
1 0
L0 = I2 = ( ) .
0 1
The next step is to zero-out the first column of M below the diagonal. There is only one row to cancel, then, and it can be
removed by subtracting 2 times the first row of M to the second row of M . Then:
1 0 −2 1 3
L1 = ( ), U1 = ( ) (7.7.27)
2 1 0 2 −5
Since U is upper triangular, we're done. With a larger matrix, we would just continue the process.
1
X Y
M =( ) (7.7.28)
Z W
Then:
$$M=\left(
I 0
(7.7.29)
−1
ZX I
\right)
\left(
X 0
(7.7.30)
−1
0 W − ZX Y
\right)\left(
7.7.5 https://math.libretexts.org/@go/page/1977
−1
I X Y
(7.7.31)
0 I
\right)\, .\]
This can be checked explicitly simply by block-multiplying these three matrices.
Example :
7.7.15
By multiplying the diagonal matrix by the upper triangular matrix, we get the standard LU decomposition of the matrix.
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 7.7: LU Redux is shared under a not declared license and was authored, remixed, and/or curated by David Cherney, Tom Denton,
& Andrew Waldron.
7.7.6 https://math.libretexts.org/@go/page/1977
7.8: Review Problems
1. Consider the linear system:
\[
1 1
x =v
2 1 2 2
l x +x =v
1
(7.8.1)
⋮ ⋮
n 1 n 2 n n
l x +l x +⋯ +x =v
1 2
$$
i.Find x . 1
ii. Find x .
2
iii. Find x .
3
k. Try to find a formula for x . Don't worry about simplifying your answer.
k
X Y
2. Let M =( ) be a square n × n block matrix with W invertible.
Z W
3. Show that if M is a square matrix which is not invertible then either the matrix matrix U or the matrix L in the LU-
decomposition M = LU has a zero on it's diagonal.
4. Describe what upper and lower triangular matrices do to the unit hypercube in their domain.
5. In chapter 3 we saw that since, in general, row exchange matrices are necessary to achieve upper triangular form, LDP U
factorization is the complete decomposition of an invertible matrix into EROs of various kinds. Suggest a procedure for using
LDP U decompositions to solve linear systems that generalizes the procedure above.
6. Is there a reason to prefer $LU$ decomposition to U L decomposition, or is the order just a convention?
7. If M is invertible then what are the LU , LDU , and LDP U decompositions of M −1
in terms of the decompositions for M ?
8. Argue that if M is symmetric then L = U T
in the LDU decomposition of M .
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 7.8: Review Problems is shared under a not declared license and was authored, remixed, and/or curated by David Cherney, Tom
Denton, & Andrew Waldron.
7.8.1 https://math.libretexts.org/@go/page/1984
CHAPTER OVERVIEW
8: Determinants
Given a square matrix, is there an easy way to know when it is invertible? Answering this fundamental question is the goal of this
chapter.
8.1: The Determinant Formula
8.2: Elementary Matrices and Determinants
8.3: Review Problems
8.4: Properties of the Determinant
8.5: Review Problems
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 8: Determinants is shared under a not declared license and was authored, remixed, and/or curated by David Cherney, Tom Denton,
& Andrew Waldron.
1
8.1: The Determinant Formula
The determinant extracts a single number from a matrix that determines whether its invertibility. Lets see how this works for small
matrices first.
Simple Examples
For small cases, we already know when a matrix is invertible. If M is a 1 × 1 matrix, then M = (m) ⇒ M
−1
= (1/m) . Then M
is invertible if and only if m ≠ 0 .
1 1 2 1
m m m −m
For M a 2 × 2 matrix, chapter 7 section 7.5 shows that if M =(
1
2
2
2
) , then M −1
= 1 2
1
1 2
(
2
2 1
2
) . Thus
m1 m2 −m2 m1
m m −m m
1 2 1 1
Example 8.1.1 :
For a 3 × 3 matrix,
1 1 1
m m m
⎛ 1 2 3 ⎞
2 2 2
M =⎜m m m ⎟ , (8.1.3)
⎜ 1 2 3 ⎟
⎝ 3 3 3 ⎠
m m m
1 2 3
Notice that in the subscripts, each ordering of the numbers 1, 2, and 3 occurs exactly once. Each of these is a permutation of
the set {1, 2, 3}.
Permutations
Consider n objects labeled 1 through n and shuffle them. Each possible shuffle is called a permutation.
For example, here is an example of a permutation of 1--5:
1 2 3 4 5
σ =[ ] (8.1.5)
4 2 5 1 3
We can consider a permutation σ as an invertible function from the set of numbers [n] := {1, 2, … , n} to [n] , so can write
σ(3) = 5 in the above example. In general we can write
8.1.1 https://math.libretexts.org/@go/page/1991
1 2 3 4 5
[ ] , (8.1.6)
σ(1) σ(2) σ(3) σ(4) σ(5)
but since the top line of any permutation is always the same, we can omit it and just write:
and so our example becomes simply σ = [4 2 5 1 3] . The mathematics of permutations is extensive; there are a few key properties
of permutations that we'll need:
1. There are n! permutations of n distinct objects, since there are n choices for the first object, n − 1 choices for the second once
the first has been chosen, and so on.
2. Every permutation can be built up by successively swapping pairs of objects. For example, to build up the permutation
[3 1 2 ] from the trivial permutation [ 1 2 3 ], you can first swap 2 and 3 , and then swap 1 and 3 .
3. For any given permutation σ, there is some number of swaps it takes to build up the permutation. (It's simplest to use the
minimum number of swaps, but you don't have to: it turns out that any way of building up the permutation from swaps will
have have the same parity of swaps, either even or odd.) If this number happens to be even, then σ is called an
even permutation if this number is odd, then σ is an odd permutation. In fact, n! is even for all n ≥ 2 , and exactly half of
the permutations are even and the other half are odd. It's worth noting that the trivial permutation (which sends i → i for every
i ) is an even permutation, since it uses zero swaps.
Definition
The sign function is a function sgn that sends permutations to the set {−1, 1}, defined by:
1 if σ is even;
sgn(σ) = { (8.1.8)
−1 if σ is odd.
Definition
For an n × n matrix M , the determinant of M (sometimes written |M |) is given by:
1 2 n
det M = ∑ sgn(σ)m m ⋯m (8.1.9)
σ(1) σ(2) σ(n)
σ
The sum is over all permutations of n . Each summand is a product of a single entry from each row, but with the column
numbers shuffled by the permutation σ. The last statement about the summands yields a nice property of the determinant:
Theorem
If M i
= (m )
j
has a row consisting entirely of zeros, then m i
σ(i)
=0 for every σ and some i. Moreover det M =0 .
Because there are many permutations of n, writing the determinant this way for a general matrix gives a very
long sum. For n = 4 , there are 24 = 4! permutations, and for n = 5 , there are already 120 = 5! permutations.
1 1 1 1
m m m m
⎛ 1 2 3 4 ⎞
2 2 2 2
⎜m m m m ⎟
1 2 3 4
For a 4 × 4 matrix, M =
⎜
⎜ 3 3 3 3
⎟
⎟
, then det M is:
⎜ m1 m
2
m
3
m
4
⎟
⎝ 4 4 4 4 ⎠
m m m m
1 2 3 4
1 2 3 4 1 2 3 4 1 2 3 4
det M = m m m m −m m m m −m m m m
1 2 3 4 1 3 2 4 1 2 4 3
1 2 3 4 1 2 3 4 1 2 3 4
− m m m m +m m m m +m m m m
2 1 3 4 1 3 4 2 1 4 2 3
1 2 3 4 1 2 3 4
+ m m m m +m m m m ± 16 more terms.
2 3 1 4 2 1 4 3
8.1.2 https://math.libretexts.org/@go/page/1991
Luckily, it is very easy to compute the determinants of certain matrices. For example, if M is diagonal, then M
j
i
=0 whenever
i ≠ j . Then all summands of the determinant involving off-diagonal entries vanish, so:
1 2 n 1 2 n
det M = ∑ sgn(σ)m m ⋯m =m m ⋯ mn . (8.1.10)
σ(1) σ(2) σ(n) 1 2
Thus: The~ determinant ~of~ a~ diagonal ~matrix~ is~ the~ product ~of ~its~ diagonal~ entries.
Since the identity matrix is diagonal with all diagonal entries equal to one, we have:
det I = 1. (8.1.11)
We would like to use the determinant to decide whether a matrix is invertible. Previously, we computed the inverse of a matrix by
applying row operations. Therefore we ask what happens to the determinant when row operations are applied to a matrix.
Lets swap rows i and j of a matrix M and then compute its determinant. For the permutation σ, let ^
σ be the permutation
obtained by swapping positions i and j . Clearly
^ = −σ .
σ (8.1.12)
Let M be the matrix M with rows i and j swapped. Then (assuming i < j ):
′
′ 1 j i n
det M = ∑ sgn(σ) m ⋯m ⋯m ⋯m
σ(1) σ(i) σ(j) σ(n)
σ
1 i j n
= ∑ sgn(σ) m ⋯m ⋯m ⋯m
σ(1) σ(j) σ(i) σ(n)
σ
1 i j n
= ∑(−sgn(σ
^)) m ⋯m ⋯m ⋯m
^(1)
σ ^(i)
σ ^(j)
σ ^(n)
σ
σ
1 i j n
= − ∑ sgn(σ
^) m ⋯m ⋯m ⋯m
σ
^(1) σ
^(i) σ
^(j) σ
^(n)
^
σ
= − det M . (8.1.13)
The step replacing ∑ by ∑ often causes confusion; it hold since we sum over all permutations (see review problem 3).
σ ^
σ
Thus we see that swapping rows changes the sign of the determinant. i.e. , $$M' = - \det M\, .\]
Applying this result to M =I (the identity matrix) yields
i
det E = −1 , (8.1.14)
j
where the matrix E is the identity matrix with rows i and j swapped. It is a row swap elementary matrix. This implies another
i
j
nice property of the determinant. If two rows of the matrix are identical, then swapping the rows changes the sign of the
matrix, but leaves the matrix unchanged. Then we see the following:
Theorem
If M has two identical rows, then det M =0 .
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
8.1.3 https://math.libretexts.org/@go/page/1991
This page titled 8.1: The Determinant Formula is shared under a not declared license and was authored, remixed, and/or curated by David
Cherney, Tom Denton, & Andrew Waldron.
8.1.4 https://math.libretexts.org/@go/page/1991
8.2: Elementary Matrices and Determinants
In chapter 2 we found the elementary matrices that perform the Gaussian row operations. In other words, for any matrix M , and a
matrix M equal to M after a row operation, multiplying by an elementary matrix E gave M = EM . We now examine what the
′ ′
Row Swap
Our first elementary matrix multiplies a matrix M by swapping rows i and j . Explicitly: let R through R denote the rows of M ,
1 n
and let M be the matrix M with rows i and j swapped. Then M and M can be regarded as a block matrices (where the blocks
′ ′
are rows):
⎛ ⋮ ⎞ ⎛ ⋮ ⎞
⎜ i ⎟ ⎜ j ⎟
⎜R ⎟ ⎜R ⎟
⎜ ⎟ ⎜ ⎟
′
M =⎜ ⎟ and M =⎜ ⎟. (8.2.1)
⎜ ⋮ ⎟ ⎜ ⋮ ⎟
⎜ ⎟ ⎜ ⎟
⎜ Rj ⎟ ⎜ Ri ⎟
⎜ ⎟ ⎜ ⎟
⎝ ⎠ ⎝ ⎠
⋮ ⋮
The matrix
1
⎛ ⎞
⎜ ⎟
⎜ ⋱ ⎟
⎜ ⎟
⎜ 0 1 ⎟
⎜ ⎟
⎜ ⎟
⎜ ⎟ =: E i (8.2.3)
⎜ ⋱ ⎟ j
⎜ ⎟
⎜ ⎟
⎜ 1 0 ⎟
⎜ ⎟
⎜ ⎟
⎜ ⋱ ⎟
⎝ ⎠
1
is just the identity matrix with rows i and j swapped. The matrix E is an elementary matrix and
i
j
′ i
M =E M . (8.2.4)
j
Because det I =1 and swapping a pair of rows changes the sign of the determinant, we have found that
i
det E = −1 . (8.2.5)
j
Now we know that swapping a pair of rows flips the sign of the determinant so det M
′
= −detM . But det E
j
i
= −1 and
M = E M so
′ i
j
i i
det E M = det E det M . (8.2.6)
j j
8.2.1 https://math.libretexts.org/@go/page/1998
Scalar Multiply
The next row operation is multiplying a row by a scalar. Consider
1
R
⎛ ⎞
M =⎜ ⎟ , (8.2.7)
⎜ ⋮ ⎟
⎝ n ⎠
R
1
⎛ ⎞
⎜ ⎟
⎜ ⋱ ⎟
⎜ ⎟
i
R (λ) = ⎜ λ ⎟ . (8.2.8)
⎜ ⎟
⎜ ⎟
⎜ ⎟
⎜ ⋱ ⎟
⎝ ⎠
1
Then:
1
R
⎛ ⎞
⎜ ⎟
⎜ ⋮ ⎟
⎜ ⎟
′ i
= R (λ)M = ⎜ λR ⎟ ,
i
M (8.2.9)
⎜ ⎟
⎜ ⎟
⎜ ⎟
⎜ ⋮ ⎟
⎝ n ⎠
R
1 i n
= λ ∑ sgn(σ)m ⋯m ⋯m
σ(1) σ(i) σ(n)
σ
= λ det M
Thus, multiplying a row by λ multiplies the determinant by λ . i.e., $$\det R^{i}(\lambda) M = \lambda \det M\, .\]
Since R i
(λ) is just the identity matrix with a single row multiplied by λ , then by the above rule, the determinant of i
R (λ) is λ .
Thus:
8.2.2 https://math.libretexts.org/@go/page/1998
1
⎛ ⎞
⎜ ⎟
⎜ ⋱ ⎟
⎜ ⎟
i
det R (λ) = det ⎜ λ ⎟ =λ, (8.2.10)
⎜ ⎟
⎜ ⎟
⎜ ⎟
⎜ ⋱ ⎟
⎝ ⎠
1
Row Addition
The final row operation is adding μR to R . This is done with the elementary matrix S
j i i
j
(μ) , which is an identity matrix but with
an additional μ in the i, j position:
1
⎛ ⎞
⎜ ⎟
⎜ ⋱ ⎟
⎜ ⎟
⎜ 1 μ ⎟
⎜ ⎟
⎜ ⎟
i
S (μ) = ⎜ ⎟ . (8.2.11)
j ⎜ ⋱ ⎟
⎜ ⎟
⎜ ⎟
⎜ 1 ⎟
⎜ ⎟
⎜ ⎟
⎜ ⋱ ⎟
⎝ ⎠
1
Then multiplying M by S j
i
(μ) performs a row addition:
1
⎛ ⎞
⎛ ⎞ ⎛ ⎞
⎜ ⎟
⎜ ⋱ ⎟⎜ ⋮ ⎟ ⎜
⋮
⎟
⎜ ⎟
⎜ ⎟⎜ ⎟
⎜ ⎟ ⎜ ⎟
⎜ 1 μ ⎟ i i j
⎜ ⎟⎜R ⎟ ⎜ R + μR ⎟
⎜ ⎟ ⎜ ⎟
⎜ ⎟
⎜ ⎟ ⎜ ⎟
⎜ ⎟⎜ ⎟ =⎜ ⎟ . (8.2.12)
⎜ ⋱ ⎟ ⋮ ⋮
⎜ ⎟ ⎜ ⎟
⎜ ⎟
⎜ ⎟⎜ j ⎟ ⎜ j ⎟
1 ⎜R ⎟ ⎜ R ⎟
⎜ ⎟
⎜ ⎟ ⎜ ⎟
⎜ ⎟⎜ ⎟ ⎜ ⎟
⎜ ⎟
⎜ ⋮ ⎟ ⎜ ⋮ ⎟
⎜ ⋱ ⎟
⎝ ⎠ ⎝ ⎠
⎝ ⎠
1
replaced by R . Then
j
′ 1 i j n
det M = ∑ sgn(σ)m ⋯ (m + μm )⋯ m
σ(1) σ(i) σ(j) σ(n)
1 i n
= ∑ sgn(σ)m ⋯m ⋯m
σ(1) σ(i) σ(n)
σ
1 j j n
+ ∑ sgn(σ)m ⋯ μm ⋯m ⋯m
σ(1) σ(j) σ(j) σ(n)
σ
′′
= det M + μ det M
8.2.3 https://math.libretexts.org/@go/page/1998
′
det M = det M , (8.2.13)
Notice that if M is the identity matrix, then we have $$\det S^{i}_{j}(\mu) = \det (S^{i}_{j}(\mu)I) = \det I = 1\, .\]
Determinant of Products
In summary, the elementary matrices for each of the row operations obey
i i
E = I with rows i,j swapped; det E = −1
j j
i i
R (λ) = I with λ in position i,i; det R (λ) = λ (8.2.15)
i i
S (μ) = I with \mu in position i,j; det S (μ) = 1
j j
Theorem
If E is any of the elementary matrices E i
j
i
j
i
, R (λ), S (μ) , then det(EM ) = det E det M .
We have seen that any matrix M can be put into reduced row echelon form via a sequence of row operations, and we have seen
that any row operation can be achieved via left matrix multiplication by an elementary matrix. Suppose that RREF(M ) is the
reduced row echelon form of M . Then
8.2.4 https://math.libretexts.org/@go/page/1998
RREF (M ) = E1 E2 ⋯ Ek M , (8.2.16)
where each E is an elementary matrix. We know how to compute determinants of elementary matrices and products thereof,
i
so we ask:
What is the determinant of a square matrix in reduced row echelon form? (8.2.17)
Since each Ei has non-zero determinant, then det RREF (M ) = 0 if and only if det M = 0 . This establishes an important
theorem:
Theorem
For any square matrix M, \det M\neq 0 if and only if M is invertible.
Since we know the determinants of the elementary matrices, we can immediately obtain the following:
Corollary
Any elementary matrix E i
j
i i
, R (λ), S (μ)
j
is invertible, except for R i
. In fact, the inverse of an elementary matrix is another
(0)
elementary matrix.
To obtain one last important result, suppose that M and N are square n × n matrices, with reduced row echelon forms such that,
for elementary matrices E and F ,
i i
M = E1 E2 ⋯ Ek RREF (M ) , (8.2.19)
and
N = F1 F2 ⋯ Fl RREF (N ) . (8.2.20)
= det(E1 E2 ⋯ Ek I F1 F2 ⋯ Fl RREF (N ))
= det(M ) det(N )
Otherwise, M is not invertible, and det M = 0 = det RREF (M ) . Then there exists a row of zeros in RREF (M ) , so
n
R (λ)RREF (M ) = RREF (M ) for any \lambda . Then:
8.2.5 https://math.libretexts.org/@go/page/1998
det(M N ) = det(E1 E2 ⋯ Ek RREF (M )N )
= det(E1 E2 ⋯ Ek RREF (M )N )
= λ det(M N )
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 8.2: Elementary Matrices and Determinants is shared under a not declared license and was authored, remixed, and/or curated by
David Cherney, Tom Denton, & Andrew Waldron.
8.2.6 https://math.libretexts.org/@go/page/1998
8.3: Review Problems
1 1 1
m m m
⎛ 1 2 3 ⎞
1. Let M =⎜
2
⎜ m1 m
2
2
m
2
3
⎟
⎟ . Use row operations to put M into row echelon form . For simplicity, assume that
⎝ 3 3 3 ⎠
m m m
1 2 3
m
1
1
≠0 ≠m m
1
1
2
2
−m m
2
1
1
2
.
Prove that M is non-singular if and only if:
1 2 3 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3
m m m −m m m +m m m −m m m +m m m −m m m ≠0 (8.3.1)
1 2 3 1 3 2 2 3 1 2 1 3 3 1 2 3 2 1
2.
0 1 a b
a) What does the matrix E 1
2
=( ) do to M =( ) under left multiplication? What about right multiplication?
1 0 d c
b) Find elementary matrices R (λ) and R (λ) that respectively multiply rows 1 and 2 of M by λ but otherwise leave M the same
1 2
3. Let M be a matrix and S M the same matrix with rows i and j switched. Explain every line of the series of equations proving
i
j
4. Let M be the matrix obtained from M by swapping two columns i and j . Show that det M
′ ′
= − det M .
5. The scalar triple product of three vectors u, v, w from R is u ⋅ (v × w) . Show that this product is the same as the determinant of
3
the matrix whose columns are u, v, w (in that order). What happens to the scalar triple product when the factors are permuted?
6. Show that if M is a 3 × 3 matrix whose third row is a sum of multiples of the other rows (R 3 = aR2 + b R1 ) then det M =0 .
Show that the same is true if one of the columns is a sum of multiples of the others.
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 8.3: Review Problems is shared under a not declared license and was authored, remixed, and/or curated by David Cherney, Tom
Denton, & Andrew Waldron.
8.3.1 https://math.libretexts.org/@go/page/2005
8.4: Properties of the Determinant
We now know that the determinant of a matrix is non-zero if and only if that matrix is invertible. We also know that the
determinant is a multiplicative function, in the sense that det(M N ) = det M det N . Now we will devise some methods for
calculating the determinant.
Recall that:
1 2 n
det M = ∑ sgn(σ)m m ⋯m . (8.4.1)
σ(1) σ(2) σ(n)
σ
A minor of an n × n matrix M is the determinant of any square matrix obtained from M by deleting one row and one column. In
particular, any entry m of a square matrix M is associated to a minor obtained by deleting the ith row and j th column of M .
i
j
1 1 2 n
= m ∑ sgn(/
σ )m ⋯m
1 σ 1 (2)
/ σ 1 (n)
/
1
σ
/
1 2 2 3 n
+ m ∑ sgn(/
σ )m m ⋯m
2 /2 (1)
σ /2 (3)
σ /2 (n)
σ
2
σ
/
1 3 2 3 4 n
+ m ∑ sgn(/
σ )m m m ⋯m
3 /3 (1)
σ /3 (2)
σ /3 (4)
σ /3 (n)
σ
3
σ
/
+ ⋯
Here the symbols / σ refers to the permutation σ with the input k removed. The summand on the j 'th line of the above formula
k
looks like the determinant of the minor obtained by removing the first and j 'th column of M . However we still need to replace sum
of /σ by a sum over permutations of column numbers of the matrix entries of this minor. This costs a minus sign whenever j − 1
j
is odd. In other words, to expand by minors we pick an entry m of the first row, then add (−1)
1
j
times the determinant of the j−1
matrix with row $i$ and column j deleted. An example will probably help:
1 2 3
⎛ ⎞
M =⎜4 5 6⎟ (8.4.2)
⎝ ⎠
7 8 9
= 0
Here, M −1
does not exist because det M = 0.
Sometimes the entries of a matrix allow us to simplify the calculation of the determinant. Take
1 2 3
⎛ ⎞
N =⎜4 0 0⎟ .
⎝ ⎠
7 8 9
Notice that the second row has many zeros; then we can switch the first and second rows of N before expanding in minors to
get:
8.4.1 https://math.libretexts.org/@go/page/2012
1 2 3 4 0 0
⎛ ⎞ ⎛ ⎞
det ⎜ 4 0 0 ⎟ = − det ⎜ 1 2 3⎟
⎝ ⎠ ⎝ ⎠
7 8 9 7 8 9
2 3
= −4 det ( )
8 9
= 24
Since we know how the determinant of a matrix changes when you perform row operations, it is often very beneficial to perform
row
operations before computing the determinant by brute force.
Example 8.4.1 :
1 2 3 1 2 3 1 2 3
⎛ ⎞ ⎛ ⎞ ⎛ ⎞
det ⎜ 4 5 6 ⎟ = det ⎜ 3 3 3 ⎟ = det ⎜ 3 3 3⎟ =0.
⎝ ⎠ ⎝ ⎠ ⎝ ⎠
7 8 9 6 6 6 0 0 0
Try to determine which row operations we made at each step of this computation.
You might suspect that determinants have similar properties with respect to columns as what applies to rows:
Theorem
For any square matrix M , we have:
$$\det M^{T} = \det M\, .\]
Proof
By definition,
1 2 n
det M = ∑ sgn(σ)m m ⋯m . (8.4.3)
σ(1) σ(2) σ(n)
σ
For any permutation σ, there is a unique inverse permutation σ that undoes σ. If σ sends i → j , then σ sends
−1 −1
j→ i . In
the two-line notation for a permutation, this corresponds to just flipping the permutation over. For example, if
1 2 3
σ =[ ]
2 3 1
−1
2 3 1 1 2 3
σ =[ ] =[ ] . (8.4.4)
1 2 3 3 1 2
Since any permutation can be built up by transpositions, one can also find the inverse of a permutation σ by undoing each of
the transpositions used to build up σ; this shows that one can use the same number of transpositions to build σ and σ . In −1
−1 −1 −1
σ (1) σ (2) σ (n)
= ∑ sgn(σ)m m ⋯ mn
1 2
σ
−1 −1 −1
−1 σ (1) σ (2) σ (n)
= ∑ sgn(σ )m m ⋯ mn
1 2
T
= det M .
8.4.2 https://math.libretexts.org/@go/page/2012
The second-to-last equality is due to the existence of a unique inverse permutation: summing over permutations is the same as
summing over all inverses of permutations. The final equality is by the definition of the transpose.
Example 8.4.2 :
Because of this theorem, we see that expansion by minors also works over columns. Let
1 2 3
⎛ ⎞
M =⎜0 5 6⎟ . (8.4.5)
⎝ ⎠
0 8 9
Then
T
5 8
det M = det M = 1 det ( ) = −3 . (8.4.6)
6 9
\end{example}
Theorem
−1
1
det M = (8.4.8)
det M
Adjoint of a Matrix
Recall that for a 2 × 2 matrix
d −b a b a b
( )( ) = det ( ) I . (8.4.9)
−c a c d c d
8.4.3 https://math.libretexts.org/@go/page/2012
Or in a more careful notation: if
\[M=
1 1
m m
1 2
( ) (8.4.10)
2 2
m m
1 2
\, ,$$
then
2 1
1 m −m
−1 2 2
M = ( ) , (8.4.11)
1 2 1 2 2 1
m m −m m −m m
1 2 2 1 1 1
so long as det M =m m
1
1
2
2
−m m
1
2
2
1
≠0 . The matrix
2 1
m −m
2 2
( )
2 1
−m m
1 1
that appears above is a special matrix, called the adjoint of M . Let's define the adjoint for an n × n matrix.
The cofactor of M corresponding to the entry m of M is the product of the minor associated to m and (−1)
i
j
i
j
i+j
. This is written
cofactor(m ) . i
j
Definition
For M = (m )
i
j
a square matrix, The adjoint matrix adjM is given by:
i T
adjM = (cof actor(m )) (8.4.12)
j
Example 8.4.3 :
T
2 0 1 0 1 2
⎛ ⎞
det ( ) − det ( ) det ( )
⎜ 1 1 0 1 0 1 ⎟
3 −1 −1 ⎜ ⎟
⎛ ⎞ ⎜ ⎟
−1 −1 3 −1 3 −1
adj ⎜ 1 ⎜ ) ⎟
2 0 ⎟ = ⎜ − det ( ) det ( ) − det (
⎟
(8.4.13)
⎝ ⎠ ⎜ 1 1 0 1 0 1 ⎟
0 1 1 ⎜ ⎟
⎜ −1 −1 3 −1 3 −1 ⎟
det ( ) − det ( ) det ( )
⎝ ⎠
2 0 1 0 1 2
Let's multiply M adjM . For any matrix N , the i, j entry of M N is given by taking the dot product of the ith row of M and the
j th column of N . Notice that the dot product of the i th row of M and the i th column of adjM is just the expansion by minors of
det M in the i th row. Further, notice that the dot product of the i th row of M and the j th column of adjM with j ≠ i is the same
as expanding M by minors, but with the j th row replaced by the ith row. Since the determinant of any matrix with a row repeated
is zero, then these dot products are zero as well.
We know that the i, j entry of the product of two matrices is the dot product of the ith row of the first by the j th column of the
second. Then:
8.4.4 https://math.libretexts.org/@go/page/2012
Theorem
For M a square matrix with det M ≠0 (equivalently, if M is invertible), then
−1
1
M = adjM (8.4.15)
det M
3 −1 −1 2 0 2
⎛ ⎞ ⎛ ⎞
adj ⎜ 1 2 0 ⎟ = ⎜ −1 3 −1 ⎟ . (8.4.16)
⎝ ⎠ ⎝ ⎠
0 1 1 1 −3 7
Now, multiply:
3 −1 −1 2 0 2 6 0 0
⎛ ⎞⎛ ⎞ ⎛ ⎞
⎜1 2 0 ⎟ ⎜ −1 3 −1 ⎟ = ⎜ 0 6 0⎟
⎝ ⎠⎝ ⎠ ⎝ ⎠
0 1 1 1 −3 7 0 0 6
−1
3 −1 −1 2 0 2
⎛ ⎞ ⎛ ⎞
1
⇒ ⎜1 2 0 ⎟ = ⎜ −1 3 −1 ⎟
6
⎝ ⎠ ⎝ ⎠
0 1 1 1 −3 7
This process for finding the inverse matrix is sometimes called Cramer's Rule.
V olume = ∣
∣ det ( u v w)∣
∣ (8.4.17)
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
8.4.5 https://math.libretexts.org/@go/page/2012
This page titled 8.4: Properties of the Determinant is shared under a not declared license and was authored, remixed, and/or curated by David
Cherney, Tom Denton, & Andrew Waldron.
8.4.6 https://math.libretexts.org/@go/page/2012
8.5: Review Problems
1. Find the determinant via expanding by minors.
2 1 3 7
⎛ ⎞
⎜6 1 4 4⎟
⎜ ⎟ (8.5.1)
⎜2 1 8 0⎟
⎝ ⎠
1 0 2 0
3. Let σ
−1
denote the inverse permutation of σ. Suppose the function f : 1, 2, 3, 4 → R . Write out explicitly the following two
sums:
−1
∑ f (σ(s)) and ∑ f (σ (s)). (8.5.2)
σ σ
What do you observe? Now write a brief explanation why the following equality holds
−1
∑ F (σ) = ∑ F (σ ). (8.5.3)
σ σ
where the domain of the function F is the set of all permutations of n objects.
4. Suppose M = LU is an LU decomposition. Explain how you would efficiently compute det M in this case. How does this
decomposition allow you to easily see if M is invertible?
5. In computer science, the complexity of an algorithm is (roughly) computed by counting the number of times a given operation is
performed. Suppose adding or subtracting any two numbers takes a seconds, and multiplying two numbers takes m seconds. Then,
for example, computing 2·6 -- 5 would take a + m seconds.
(a) How many additions and multiplications does it take to compute the determinant of a general 2 → 2 matrix?
(b) Write a formula for the number of additions and multiplications it takes to compute the determinant of a general n → n matrix
using the definition of the determinant as a sum over permutations. Assume that finding and multiplying by the sign of a
permutation is free.
(c) How many additions and multiplications does it take to compute the determinant of a general 3 → 3 matrix using expansion by
minors? Assuming m = 2a , is this faster than computing the determinant from the definition?
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 8.5: Review Problems is shared under a not declared license and was authored, remixed, and/or curated by David Cherney, Tom
Denton, & Andrew Waldron.
8.5.1 https://math.libretexts.org/@go/page/2019
CHAPTER OVERVIEW
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 9: Subspaces and Spanning Sets is shared under a not declared license and was authored, remixed, and/or curated by David
Cherney, Tom Denton, & Andrew Waldron.
1
9.1: Subspaces
Definition: subspace
We say that a subset U of a vector space V is a subspace of V if U is a vector space under the inherited addition and scalar
multiplication operations of V .
Example 9.1.1 :
ax + by + cz = 0. (9.1.1)
x
⎛ ⎞
This equation can be expressed as the homogeneous system (a b c)⎜ y ⎟ = 0 , or MX = 0 with M the matrix
⎝ ⎠
z
So P is closed under addition and scalar multiplication. Additionally, P contains the origin (which can be derived from the
above by setting μ = ν = 0 ). All other vector space requirements hold for P because they hold for all vectors in R . 3
Proof
One direction of this proof is easy: if U is a subspace, then it is a vector space, and so by the additive closure and
multiplicative closure properties of vector spaces, it has to be true that μu + ν u ∈ U for all u , u in U and all constants
1 2 1 2
constants μ, ν .
9.1.1 https://math.libretexts.org/@go/page/2028
The other direction is almost as easy: we need to show that if μu + ν u ∈ U for all u , u in U and all constants μ, ν , then
1 2 1 2
U is a vector space. That is, we need to show that the ten properties of vector spaces are satisfied. We know that the additive
closure and multiplicative closure properties are satisfied. All of the other eight properties is true in U because it is true in V .
□
Note that the requirements of the subspace theorem are often referred to as "closure''.
We can use this theorem to check if a set is a vector space. That is, if we have some set U of vectors that come from some bigger
vector space V , to check if U itself forms a smaller vector space we need check only two things:
1. If we add any two vectors in U , do we end up with a vector in U ?
2. If we multiply any vector in U by any constant, do we end up with a vector in U ?
If the answer to both of these questions is yes, then U is a vector space. If not, U is not a vector space.
Contributors
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 9.1: Subspaces is shared under a not declared license and was authored, remixed, and/or curated by David Cherney, Tom Denton,
& Andrew Waldron.
9.1.2 https://math.libretexts.org/@go/page/2028
9.2: Building Subspaces
Consider the set
⎧ 1 0 ⎫
⎪⎛ ⎞ ⎛ ⎞⎪
3
U = ⎨⎜ 0 ⎟ , ⎜ 1 ⎟⎬ ⊂ R . (9.2.1)
⎩
⎪⎝ ⎠ ⎝ ⎭
⎠⎪
0 0
Because U consists of only two vectors, it clear that U is not a vector space, since any constant multiple of these vectors should
also be in U . For example, the 0-vector is not in U , nor is U closed under vector addition.
But we know that any two vectors define a plane:
In this case, the vectors in U define the xy-plane in R . We can view the xy-plane as the set of all vectors that arise as a linear
3
combination of the two vectors in U . We call this set of all linear combinations the span of U :
⎧ 1 0 ∣ ⎫
⎪ ⎛ ⎞ ⎛ ⎞ ⎪
∣
span(U ) = ⎨x ⎜ 0 ⎟ + y ⎜ 1 ⎟ x, y ∈ R⎬ . (9.2.2)
⎩ ∣ ⎭
⎪ ⎝ ⎠ ⎝ ⎠ ⎪
0 0 ∣
⎜ y ⎟ = x ⎜ 0 ⎟ + y ⎜ 1 ⎟ ∈ span(U ). (9.2.3)
⎝ ⎠ ⎝ ⎠ ⎝ ⎠
0 0 0
Definition: Span
Let V be a vector space and S = {s 1, s2 , …} ⊂ V a subset of V . Then the span of S is the set:
1 2 N i
span(S) := { r s1 + r s2 + ⋯ + r sN | r ∈ R, N ∈ N}. (9.2.4)
That is, the span of S is the set of all finite linear combinations (usually our vector spaces are defined over R, but in general
we can have vector spaces defined over different base fields such as C or Z . The coefficients r should come from whatever
2
i
our base field is (usually R).} of elements of S . Any finite sum of the form "a constant times s plus a constant times s plus
1 2
It is important that we only allow finite linear combinations. In Equation 9.2.4 N , must be a finite number. It can be any finite
number, but it must be finite.
Example 9.2.1 :
0
⎛ ⎞
Let V =R
3
and X ⊂ V be the x-axis. Let P =⎜1⎟ , and set
⎝ ⎠
0
9.2.1 https://math.libretexts.org/@go/page/2029
S = X ∪ {P } . (9.2.5)
2 2 2 0 −12
⎛ ⎞ ⎛ ⎞ ⎛ ⎞ ⎛ ⎞ ⎛ ⎞
The vector ⎜3⎟ is in span(S), because ⎜3⎟ = ⎜0⎟+3⎜1⎟. Similarly, the vector ⎜ 17.5 ⎟ is in span(S),
⎝ ⎠ ⎝ ⎠ ⎝ ⎠ ⎝ ⎠ ⎝ ⎠
0 0 0 0 0
−12 −12 0
⎛ ⎞ ⎛ ⎞ ⎛ ⎞
because ⎜ 17.5 ⎟ = ⎜ 0 ⎟ + 17.5 ⎜ 1 ⎟ .
⎝ ⎠ ⎝ ⎠ ⎝ ⎠
0 0 0
⎜ 0 ⎟+y ⎜ 1 ⎟ = ⎜ y ⎟ (9.2.6)
⎝ ⎠ ⎝ ⎠ ⎝ ⎠
0 0 0
is in span(S) . On the other hand, any vector in span(S) must have a zero in the z -coordinate. (Why?)
So span(S) is the xy-plane, which is a vector space. (Try drawing a picture to verify this!)
Lemma
For any subset S ⊂ V , span(S) is a subspace of V .
Proof
We need to show that span(S) is a vector space.
It suffices to show that span(S) is closed under linear combinations. Let u, v ∈ span(S) and λ, μ be constants. By the
definition of span(S) , there are constants c and d (some of which could be zero) such that:
i i
1 2
u = c s1 + c s2 + ⋯
1 2
v = d s1 + d s2 + ⋯
1 2 1 2
⇒ λu + μv = λ(c s1 + c s2 + ⋯) + μ(d s1 + d s2 + ⋯)
1 1 2 2
= (λ c + μd )s1 + (λ c + μd )s2 + ⋯
This last sum is a linear combination of elements of S , and is thus in span(S) . Then span(S) is closed under linear
combinations, and is thus a subspace of V .
Note that this proof, like many proofs, consisted of little more than just writing out the definitions.
Example 9.2.2 :
⎧⎛ 1 ⎞
⎪ ⎛
1
⎞ ⎛
a ⎫
⎞⎪
3
span ⎨⎜ 0 ⎟ , ⎜ 2 ⎟ , ⎜ 1 ⎟⎬ = R ? (9.2.7)
⎩
⎪⎝ ⎭
⎠ ⎝ ⎠ ⎝ ⎠⎪
a −3 0
x
⎛ ⎞
Given an arbitrary vector ⎜ y ⎟ in R , we need to find constants r
3 1 2
,r ,r
3
such that
⎝ ⎠
z
1 1 a x
⎛ ⎞ ⎛ ⎞ ⎛ ⎞ ⎛ ⎞
1 2 3
r ⎜ 0 ⎟+r ⎜ 2 ⎟+r ⎜1⎟ = ⎜ y ⎟. (9.2.8)
⎝ ⎠ ⎝ ⎠ ⎝ ⎠ ⎝ ⎠
a −3 0 z
9.2.2 https://math.libretexts.org/@go/page/2029
1
1 1 a r x
⎛ ⎞⎛ ⎞ ⎛ ⎞
2
⎜0 2 1⎟⎜r ⎟ = ⎜ y ⎟. (9.2.9)
⎝ ⎠⎝ 3 ⎠ ⎝ ⎠
a −3 0 r z
If the matrix
1 1 a
⎛ ⎞
1
x r
⎛ ⎞ ⎛ ⎞
−1 2
M ⎜ y ⎟ =⎜r ⎟ (9.2.10)
⎝ ⎠ ⎝ 3 ⎠
z r
x
⎛ ⎞
⎝ ⎠
z
2
.
Some other very important ways of building subspaces are given in the following examples.
Hence, thanks to the subspace theorem, the set of all vectors in U that are mapped to the zero vector is a subspace of V .
It is called the kernel of L:
kerL := {u ∈ U |L(u) = 0} ⊂ U . (9.2.14)
Hence, calling once again on the subspace theorem, the set of all vectors in V that are obtained as outputs of the map L is a
subspace.
It is called the image of L:
imL := {L(u) u ∈ U } ⊂ V . (9.2.17)
9.2.3 https://math.libretexts.org/@go/page/2029
Example 9.2.5 : An eigenspace of a linear map
L(αu + βv) = αL(u) + βL(v) = αL(u) + βL(v) = αλu + βλv = λ(αu + βv) . (9.2.19)
Hence, again by subspace theorem, the set of all vectors in V that obey the eigenvector equation L(v) = λv is a subspace of
V . It is called an eigenspace
For most scalars λ , the only solution to L(v) = λv will be v = 0 , which yields the trivial subspace {0}.
When there are nontrivial solutions to L(v) = λv , the number λ is called an eigenvalue, and carries essential information
about the map L.
Kernels, images and eigenspaces are discussed in great depth in chapters 16 and 12.
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 9.2: Building Subspaces is shared under a not declared license and was authored, remixed, and/or curated by David Cherney, Tom
Denton, & Andrew Waldron.
9.2.4 https://math.libretexts.org/@go/page/2029
9.3: Review Problems
1. (Subspace Theorem) Suppose that V is a vector space and that U ⊂V is a subset of V . Check all the vector space requirements
to show that
μu1 + ν u2 ∈ U for all u1 , u2 ∈ U , μ, ν ∈ R (9.3.1)
4. Let L : R3
→ R
3
where
L(x, y, z) = (x + 2y + z, 2x + y + z, 0) . (9.3.3)
Find kerL, imL and eigenspaces R −1 , R . Your answers should be subsets of R . Express them using the span notation.
3
3
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 9.3: Review Problems is shared under a not declared license and was authored, remixed, and/or curated by David Cherney, Tom
Denton, & Andrew Waldron.
9.3.1 https://math.libretexts.org/@go/page/2030
CHAPTER OVERVIEW
If no two of u, v and w are parallel, then P = span{u, v, w}. But any two vectors determines a plane, so we should be able to
span the plane using only two of the vectors u, v, w. Then we could choose two of the vectors in {u, v, w} whose span is P , and
express the other as a linear combination of those two. Suppose u and v span P . Then there exist constants d , d (not both zero) 1 2
such that w = d u + d v . Since w can be expressed in terms of u and v we say that it is not independent. More generally, the
1 2
relationship
1 2 3 i i
c u +c v+c w = 0 c ∈ R, some c ≠0 (10.1)
Definition (Independent)
We say that the vectors v , v , … , v are linearly dependent if there exist constants (usually our vector spaces are defined
1 2 n
over R, but in general we can have vector spaces defined over different base fields such as C or Z . The coefficients c should 2
i
come from whatever our base field is (usually R).} c , c , … , c not all zero such that
1 2 n
1 2 n
c v1 + c v2 + ⋯ + c vn = 0. (10.2)
Remark
The zero vector 0 can never be on a list of independent vectors because α 0
V V = 0V for any scalar α .
Example 10.1 :
4 −3 5 −1
⎛ ⎞ ⎛ ⎞ ⎛ ⎞ ⎛ ⎞
v1 = ⎜ −1 ⎟ , v2 = ⎜ 7 ⎟, v3 = ⎜ 12 ⎟ , v4 = ⎜ 1 ⎟. (10.3)
⎝ ⎠ ⎝ ⎠ ⎝ ⎠ ⎝ ⎠
3 4 17 0
1
10.3: From Dependent Independent
10.4: Review Problems
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 10: Linear Independence is shared under a not declared license and was authored, remixed, and/or curated by David Cherney,
Tom Denton, & Andrew Waldron.
2
10.1: Showing Linear Dependence
In the above example we were given the linear combination 3 v1 + 2 v2 − v3 + v4 seemingly by magic. The next example shows
how to find such a linear combination, if it exists.
Example \(\PageIndex{1}\):
Consider the following vectors in R : 3
0 1 1
⎛ ⎞ ⎛ ⎞ ⎛ ⎞
v1 = ⎜ 0 ⎟ , v2 = ⎜ 2 ⎟ , v3 = ⎜ 2 ⎟ . (10.1.1)
⎝ ⎠ ⎝ ⎠ ⎝ ⎠
1 1 3
1
c
⎛ ⎞
2
( v1 v2 v3 ) ⎜ c ⎟ = 0. (10.1.3)
⎝ 3 ⎠
c
This system has solutions if and only if the matrix M = ( v1 v2 v3 ) is singular, so we should find the determinant of M :
0 1 1
⎛ ⎞
1 1
det M = det ⎜ 0 2 2 ⎟ = det ( ) = 0. (10.1.4)
⎝ ⎠ 2 2
1 1 3
Therefore nontrivial solutions exist. At this point we know that the vectors are linearly dependent. If we need to, we can find
coefficients that demonstrate linear dependence by solving the system of equations:
0 1 1 0 1 1 3 0 1 0 2 0
⎛ ⎞ ⎛ ⎞ ⎛ ⎞
⎜0 2 2 0 ⎟ ∼ ⎜0 1 1 0 ⎟ ∼ ⎜0 1 1 0⎟. (10.1.5)
⎝ ⎠ ⎝ ⎠ ⎝ ⎠
1 1 3 0 0 0 0 0 0 0 0 0
10.1.1 https://math.libretexts.org/@go/page/2036
This is a vanishing linear combination of the vectors {v 1, … , vn } with not all coefficients equal to zero, so {v 1, … , vn } is a
linearly dependent set.
(ii. ) Now, we show that linear dependence implies that there exists k for which vk is a linear combination of the vectors
{ v1 , … , vk−1 }.
The assumption says that
1 2 n
c v1 + c v2 + ⋯ + c vn = 0. (10.1.8)
Take k to be the largest number for which c is not equal to zero. So:
k
1 2 k−1 k
c v1 + c v2 + ⋯ + c vk−1 + c vk = 0. (10.1.9)
zero vector.)
As such, we can rearrange the equation:
1 2 k−1 k
c v1 + c v2 + ⋯ + c vk−1 = −c vk
1 2 k−1
c c c
⇒ − v1 − v2 − ⋯ − vk−1 = vk .
k k k
c c c
Therefore we have expressed v as a linear combination of the previous vectors, and we are done.
k
Example 10.1.2 :
Consider the vector space P 2 (t) of polynomials of degree less than or equal to 2. Set:
v1 = 1 + t
2
v2 = 1 + t
2
v3 = t + t
2
v4 = 2 + t + t
2
v5 = 1 + t + t .
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 10.1: Showing Linear Dependence is shared under a not declared license and was authored, remixed, and/or curated by David
Cherney, Tom Denton, & Andrew Waldron.
10.1.2 https://math.libretexts.org/@go/page/2036
10.2: Showing Linear Independence
We have seen two different ways to show a set of vectors is linearly dependent: we can either find a linear combination of the
vectors which is equal to zero, or we can express one of the vectors as a linear combination of the other vectors. On the other hand,
to check that a set of vectors is linearly independent, we must check that every linear combination of our vectors with non-
vanishing coefficients gives something other than the zero vector. Equivalently, to show that the set v , v , … , v is linearly 1 2 n
independent, we must show that the equation c v + c v + ⋯ + c v = 0 has no solutions other than c = c = ⋯ = c = 0.
1 1 2 2 n n 1 2 n
Example 10.2.1 :
0 2 1
⎛ ⎞ ⎛ ⎞ ⎛ ⎞
v1 = ⎜ 0 ⎟ , v2 = ⎜ 2 ⎟ , v3 = ⎜ 4 ⎟ . (10.2.1)
⎝ ⎠ ⎝ ⎠ ⎝ ⎠
2 1 3
⎝ 3 ⎠
c
This system has solutions if and only if the matrix M = ( v1 v2 v3 ) is singular, so we should find the determinant of M :
0 2 1
⎛ ⎞
2 1
det M = det ⎜ 0 2 4 ⎟ = 2 det ( ) = 12. (10.2.4)
⎝ ⎠ 2 4
2 1 3
Since the matrix M has non-zero determinant, the only solution to the system of equations
1
c
⎛ ⎞
2
( v1 v2 v3 ) ⎜ c ⎟ =0 (10.2.5)
⎝ 3 ⎠
c
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 10.2: Showing Linear Independence is shared under a not declared license and was authored, remixed, and/or curated by David
Cherney, Tom Denton, & Andrew Waldron.
10.2.1 https://math.libretexts.org/@go/page/2037
10.3: From Dependent Independent
Now suppose vectors v 1, … , vn are linearly dependent,
1 2 n
c v1 + c v2 + ⋯ + c vn = 0 (10.3.1)
with c1
≠0 . Then:
Then x is in span{v 2, … , vn } .
When we write a vector space as the span of a list of vectors, we would like that list to be as short as possible (this idea is explored
further in chapter 11). This can be achieved by iterating the above procedure.
In the above example, we found that v = v + v . In this case, any expression for a vector as a linear
4 1 2
combination involving v can be turned into a combination without v by making the substitution v = v + v .
4 4 4 1 2
Then:
2 2 2 2
S = span{1 + t, 1 + t , t + t , 2 + t + t , 1 + t + t }
2 2 2
= span{1 + t, 1 + t , t + t , 1 + t + t }.
2
1
2
2 1
2
2 2
= v5 is also extraneous, since
it can be expressed as a linear combination of the remaining three vectors, v , v , v . Therefore 1 2 3
2 2
S = span{1 + t, 1 + t , t + t }. (10.3.3)
In fact, you can check that there are no (non-zero) solutions to the linear system
1 2 2 3 2
c (1 + t) + c (1 + t ) + c (t + t ) = 0. (10.3.4)
Therefore the remaining vectors {1 + t, 1 + t , t + t } are linearly independent, and span the vector space S . Then these
2 2
vectors are a minimal spanning set, in the sense that no more vectors can be removed since the vectors are linearly
independent.
Such a set is called a basis for S .
Example 10.3.1 :
Let Z be the space of 3 × 1 bit-valued matrices (i.e., column vectors). Is the following subset linearly independent?
3
2
⎧ 1 1 0 ⎫
⎪⎛ ⎞ ⎛ ⎞ ⎛ ⎞⎪
⎨⎜ 1 ⎟ , ⎜ 0 ⎟ , ⎜ 1 ⎟⎬ (10.3.5)
⎩
⎪⎝ ⎠ ⎝ ⎠ ⎝ ⎭
⎠⎪
0 1 1
If the set is linearly dependent, then we can find non-zero solutions to the system:
10.3.1 https://math.libretexts.org/@go/page/2038
1 1 0
⎛ ⎞ ⎛ ⎞ ⎛ ⎞
1 2 3
c ⎜ 1 ⎟+c ⎜ 0 ⎟+c ⎜ 1 ⎟ = 0, (10.3.6)
⎝ ⎠ ⎝ ⎠ ⎝ ⎠
0 1 1
Solutions exist if and only if the determinant of the matrix is non-zero. But:
1 1 0
⎛ ⎞
0 1 1 1
det ⎜ 1 0 1 ⎟ = 1 det ( ) − 1 det ( ) = −1 − 1 = 1 + 1 = 0 (10.3.8)
⎝ ⎠ 1 1 0 1
0 1 1
Therefore non-trivial solutions exist, and the set is not linearly independent.
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 10.3: From Dependent Independent is shared under a not declared license and was authored, remixed, and/or curated by David
Cherney, Tom Denton, & Andrew Waldron.
10.3.2 https://math.libretexts.org/@go/page/2038
10.4: Review Problems
1. Let B be the space of n × 1 bit-valued matrices (i.e. , column vectors) over the field Z .
n
2
Remember that this means that the coefficients in any linear combination can be only 0 or 1, with rules for adding and multiplying
coefficients given here.
a) How many different vectors are there in B ? n
b) Find a collection S of vectors that span B and are linearly independent. In other words, find a basis of B .
3 3
c) Write each other vector in B as a linear combination of the vectors in the set S that you chose.
3
2. Let e be the vector in R with a 1 in the ith position and 0's in every other position. Let v be an arbitrary vector in R .
i
n n
i=1
(v ⋅ ei )ei .
c) The span{e 1, … , en } is the same as what vector space?
3. Consider the ordered set of vectors from R 3
$$
\left(
1
⎛ ⎞
⎜2⎟ (10.4.1)
⎝ ⎠
3
,
2
⎛ ⎞
⎜4⎟ (10.4.2)
⎝ ⎠
6
,
1
⎛ ⎞
⎜0⎟ (10.4.3)
⎝ ⎠
1
,
1
⎛ ⎞
⎜4⎟ (10.4.4)
⎝ ⎠
5
\right)
\]
a) Determine if the set is linearly independent by using the vectors as the columns of a matrix M and finding RREF(M ).
b) If possible, write each vector as a linear combination of the preceding ones.
c) Remove the vectors which can be expressed as linear combinations of the preceding vectors to form a linearly independent
ordered set. (Every vector in your set set should be from the given set.)
4. Gaussian elimination is a useful tool figure out whether a set of vectors spans a vector space and if they are linearly independent.
Consider a matrix M made from an ordered set of column vectors (v , v , … , v ) ⊂ R and the three cases listed below:
1 2 m
n
10.4.1 https://math.libretexts.org/@go/page/2039
First give an explicit example for each case, state whether the column vectors you use are linearly independent or spanning in each
case. Then, in general, determine whether (v , v , … , v ) are linearly independent and/or spanning R in each of the three cases.
1 2 m
n
If they are linearly dependent, does RREF(M ) tell you which vectors could be removed to yield an independent set of vectors?
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 10.4: Review Problems is shared under a not declared license and was authored, remixed, and/or curated by David Cherney, Tom
Denton, & Andrew Waldron.
10.4.2 https://math.libretexts.org/@go/page/2039
CHAPTER OVERVIEW
Definitions
Let V be a vector space.
Then a set S is a basis for V if S is linearly independent and V = spanS .
If S is a basis of V and S has only finitely many elements, then we say that V is finite-dimensional .
The number of vectors in S is the dimension of V .
Suppose V is a finite-dimensional vector space, and S and T are two different bases for V . One might worry that S and T have
a different number of vectors; then we would have to talk about the dimension of V in terms of the basis S or in terms of the basis
T . Luckily this isn't what happens. Later in this chapter, we will show that S and T must have the same number of vectors. This
means that the dimension of a vector space is basis-independent. In fact, dimension is a very important characteristic of a vector
space.
Example 11.1 :
Pn (t) (polynomials in t of degree n or less) has a basis {1, t, … , t }, since every vector in this space is a sum
n
0 1 n n i
a 1 +a t +⋯ +a t , a ∈ R, (11.1)
Theorem
Let S = {v , … , v } be a basis for a vector space
1 n V . Then every vector w ∈ V can be written uniquely as a linear
combination of vectors in the basis S :
1 n
w = c v1 + ⋯ + c vn . (11.2)
Proof
Since S is a basis for V , then spanS = V , and so there exist constants c such that w = c i 1 n
v1 + ⋯ + c vn .
Suppose there exists a second set of constants d such that i
1 n
w = d v1 + ⋯ + d vn . (11.3)
Then:
0V = w − w
1 n 1 n
= c v1 + ⋯ + c vn − d v1 − ⋯ − d vn
1 1 n n
= (c − d )v1 + ⋯ + (c −d )vn .
If we have more than one i for which c ≠ d , we can use this last equation to write one of the vectors in S as a linear
i i
combination of other vectors in S , which contradicts the assumption that S is linearly independent. Then for every i, c = d . i i
1
Remark
This theorem is the one that makes bases so useful--they allow us to convert abstract vectors into column vectors. By ordering the
set S we obtain B = (v , … , v ) and can write
1 n
1 1
c c
⎛ ⎞ ⎛ ⎞
w = (v1 , … , vn ) ⎜ ⎟ =⎜ ⎟ . (11.4)
⎜ ⋮ ⎟ ⎜ ⋮ ⎟
⎝ n ⎠ ⎝ n ⎠
c c B
Remember that in general it makes no sense to drop the subscript B on the column vector on the right--most vector spaces are not
made from columns of numbers!
Next, we would like to establish a method for determining whether a collection of vectors forms a basis for R . But first, we need n
to show that any two bases for a finite-dimensional vector space has the same number of vectors.
Lemma
If S = { v1 , … , vn } is a basis for a vector space V and T = { w1 , … , wm } is a linearly independent set of vectors in V , then
m ≤n .
The idea of the proof is to start with the set S and replace vectors in S one at a time with vectors from T , such that after each
replacement we still have a basis for V .
Proof
Since S spans V , then the set {w , v , … , v } is linearly dependent. Then we can write w as a linear combination of the v ;
1 1 n 1 i
using that equation, we can express one of the v in terms of w and the remaining v with j ≠ i . Then we can discard one of
i 1 j
the v from this set to obtain a linearly independent set that still spans V . Now we need to prove that S is a basis; we must
i 1
The set S = {w , v , … , v , v , … , v } is linearly independent: By the previous theorem, there was a unique way to
1 1 1 i−1 i+1 n
express w in terms of the set S . Now, to obtain a contradiction, suppose there is some k and constants c such that
1
i
0 1 i−1 i+1 n
vk = c w1 + c v1 + ⋯ + c vi−1 + c vi+1 + ⋯ + c vn . (11.5)
Then replacing w with its expression in terms of the collection S gives a way to express the vector v as a linear combination
1 k
of the vectors in S , which contradicts the linear independence of S . On the other hand, we cannot express w as a linear 1
combination of the vectors in {v |j ≠ i} , since the expression of w in terms of S was unique, and had a non-zero coefficient
j 1
for the vector v . Then no vector in S can be expressed as a combination of other vectors in S , which demonstrates that S is
i 1 1 1
linearly independent.
The set S spans V : For any u ∈ V , we can express u as a linear combination of vectors in S . But we can express v as a
1 i
linear combination of vectors in the collection S ; rewriting v as such allows us to express u as a linear combination of the
1 i
2
Otherwise, we have m > n , and the set S = {w , … , w } is a basis for V . But we still have some vector w
n 1 n in T that is
n+1
Corollary
For a finite-dimensional vector space V , any two bases for V have the same number of vectors.
Proof
Let S and T be two bases for V . Then both are linearly independent sets that span V . Suppose S has n vectors and T has m
vectors. Then by the previous lemma, we have that m ≤ n . But (exchanging the roles of S and T in application of the lemma) we
also see that n ≤ m . Then m = n , as desired.
11.1: Bases in Rⁿ
11.2: Matrix of a Linear Transformation (Redux)
11.3: Review Problems
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
Thumbnail: A linear combination of one basis set of vectors (purple) obtains new vectors (red). If they are linearly independent,
these form a new basis set. The linear combinations relating the first set to the other extend to a linear transformation, called the
change of basis. (CC0; Maschen via Wikipedia)
This page titled 11: Basis and Dimension is shared under a not declared license and was authored, remixed, and/or curated by David Cherney,
Tom Denton, & Andrew Waldron.
3
11.1: Bases in Rⁿ
In review question 2, chapter 10 you checked that
⎧ 1 0 0 ⎫
⎪
⎪⎛ ⎞ ⎛ ⎞ ⎛ ⎞⎪
⎪
⎪ ⎪
⎪ ⎪
⎜0⎟ ⎜1⎟ ⎜0⎟
n
R = span ⎨⎜ ⎟ , ⎜ ⎟ ,…, ⎜ ⎟⎬, (11.1.1)
⎜ ⎟ ⎜ ⎟ ⎜ ⎟
⎪⎜ ⋮ ⎟ ⎜ ⋮ ⎟ ⎜ ⋮ ⎟⎪
⎪ ⎪
⎪
⎩ ⎪
⎭
⎪⎝ ⎠ ⎝ ⎠ ⎝ ⎠⎪
0 0 1
and that this set of vectors is linearly independent. (If you didn't do that problem, check this before reading any further!) So this set
of vectors is a basis for R , and dim R = n . This basis is often called the standard or canonical basis for R . The vector with a
n n n
one in the ith position and zeros everywhere else is written e . (You could also view it as the function {1, 2, … , n} → R where
i
e (j) = 1 if i = j and 0 if i ≠ j .) It points in the direction of the i coordinate axis, and has unit length. In multivariable calculus
th
i
Note that it is often convenient to order basis elements, so rather than writing a set of vectors, we would write a list. This is called
an ordered basis. For example, the canonical ordered basis for R is (e , e , … , e ). The possibility to reorder basis vectors is not
n
1 2 n
are bases for R . Rescaling any vector in one of these sets is already enough to show that R has infinitely many bases. But even if
2 2
we require that all of the basis vectors have unit length, it turns out that there are still infinitely many bases for R (see review 2
question 3).
To see whether a collection of vectors S = {v , … , v } is a basis for R , we have to check that they are linearly independent and
1 m
n
that they span R . From the previous discussion, we also know that m must equal n , so lets assume S has n vectors. If S is
n
Let M be a matrix whose columns are the vectors v and i X the column vector with entries i
x . Then the above equation is
equivalent to requiring that there is a unique solution to
MX = 0 . (11.1.4)
To see if S spans R , we take an arbitrary vector w and solve the linear system
n
1 n
w = x v1 + ⋯ + x vn (11.1.5)
11.1.1 https://math.libretexts.org/@go/page/2071
in the unknowns x . For this, we need to find a unique solution for the linear system M X = w .
i
is the unique solution we desire. Then we see that S is a basis for V if and only if det M ≠0 .
Theorem
Let S = {v , … , v } be a collection of vectors in R . Let M be the matrix whose columns are the vectors in S . Then S is a
1 m
n
11.1.2: Remark
Also observe that S is a basis if and only if RREF(M ) = I .
Example 11.1.1 :
Let
1 0 1 1
S = {( ),( )} and T = {( ),( )} . (11.1.8)
0 1 1 −1
1 0
Then set M S =( ) . Since det M S =1 ≠0 , then S is a basis for R .\\
2
0 1
1 1
Likewise, set M T =( ) . Since det M T = −2 ≠ 0 , then T is a basis for R .
2
1 −1
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 11.1: Bases in Rⁿ is shared under a not declared license and was authored, remixed, and/or curated by David Cherney, Tom
Denton, & Andrew Waldron.
11.1.2 https://math.libretexts.org/@go/page/2071
11.2: Matrix of a Linear Transformation (Redux)
Not only do bases allow us to describe arbitrary vectors as column vectors, they also permit linear transformations to be expressed
as matrices. This is a very powerful tool for computations, which is covered in chapter 7 and reviewed again here.
Suppose we have a linear transformation L: V → W and ordered input and output bases E = (e , … , e ) and F = (f , … , f ) 1 n 1 m
for V and W respectively (of course, these need not be the standard basis--in all likelihood V is not R ). Since for each e , L(e ) n
j j
1
M
1 ⎛ j ⎞
m
⎛ j ⎞
m ⎜ M2 ⎟
⎜ j ⎟
1 m ⎜ ⎟ i
L(ej ) = f1 m + ⋯ + fm m = (f1 , … , fm ) ⎜ ⎟ . = ∑ fi Mj = ( f1 f2 ⋯ fm )⎜ ⎟ . (11.2.1)
j j
⎜ ⋮ ⎟ ⎜ ⎟
⎜ ⎟
m
i=1
⎜ ⋮ ⎟
⎝m ⎠
j m
⎝M ⎠
j
The number m is the ith component of L(e ) in the basis F , while the f are vectors (note that if α is a scalar, and v a vector,
i
j j i
αv = vα , we have used the latter---rather uncommon---notation in the above formula). The numbers m naturally form a matrix
i
j
Then
1 2 n
L(v) = L(v e1 + v e2 + ⋯ + v en ) (11.2.3)
m
1 2 n j
= v L(e1 ) + v L(e2 ) + ⋯ + v L(en ) = ∑ L(ej )v (11.2.4)
j=1
m n m
1 m j i j
= ∑(f1 m + ⋯ + fm m )v = ∑ fi [ ∑ M v ] (11.2.5)
j j j
1 1 1
m m ⋯ mn v
1
⎛ 1 2 ⎞
⎛ ⎞
2 2
⎜ m m ⎟ 2
v
⎜ 1 2
⎟⎜ ⎟
= (f1 f2 ⋯ fm ) ⎜ ⎜ ⎟ $$I nthecolumnvector (11.2.6)
⎟⎜ ⎟
⎜ ⎟
⎜ ⋮ ⋱ ⋮ ⎟⎜ ⋮ ⎟
⎝ n ⎠
⎝ m m ⎠ v
m ⋯ mn
1
1 1 1 1
v m … mn
⎛ ⎞ ⎛⎛ 1 ⎞⎛ v ⎞⎞
⎝ n ⎠ ⎝⎝ m m⎠ ⎝ n ⎠⎠
v m … mn v
E 1 F
The array of numbers M = (m ) is called the matrix of L in the input and output bases E and F for V and W , respectively. This
i
j
matrix will change if we change either of the bases. Also observe that the columns of M are computed by examining L acting on
each basis vector in V expanded in the basis vectors of W .
Example 11.2.1 :
Let L: P1 (t) ↦ P1 (t) , such that L(a + bt) = (a + b)t . Since V = P1 (t) = W , let's choose the same ordered basis
B = (1 − t, 1 + t) for V and W .
11.2.1 https://math.libretexts.org/@go/page/2075
0
L(1 − t) = (1 − 1)t = 0 = (1 − t) ⋅ 0 + (1 + t) ⋅ 0 = ( 1 − t, 1 + t ) ( )
0
−1
L(1 + t) = (1 + 1)t = 2t = (1 − t) ⋅ −1 + (1 + t) ⋅ 1 = ( 1 − t, 1 + t ) ( )
1
a 0 −1 a
⇒ L( ) = (( )( ))
b B
0 1 b B
When the vector space is R and the standard basis is used, the problem of finding the matrix of a linear transformation will seem
n
almost trivial. It is worthwhile working through it once in the above language though:
Example 11.2.2 :
Any vector in R can be written as a linear combination of the standard (ordered) basis (e
n
1, … en ) . The vector e has a one
i
⎝ ⎠ ⎝ ⎠ ⎝ ⎠
0 0 1
For any matrix M , observe that M e is equal to the $i$th column of M . Then if the ith column of M equals L(e ) for every i, then
i i
M v = L(v) for every v ∈ R . Then the matrix representing L in the standard basis is just the matrix whose i th column is L(e ).
n
i
For example, if
1 1 0 2 0 3
⎛ ⎞ ⎛ ⎞ ⎛ ⎞ ⎛ ⎞ ⎛ ⎞ ⎛ ⎞
L⎜0⎟ = ⎜4⎟ , L⎜1⎟ = ⎜5⎟ , L⎜0⎟ = ⎜6⎟ , (11.2.8)
⎝ ⎠ ⎝ ⎠ ⎝ ⎠ ⎝ ⎠ ⎝ ⎠ ⎝ ⎠
0 7 0 8 1 9
1 2 3
⎛ ⎞
⎜4 5 6⎟ . (11.2.9)
⎝ ⎠
7 8 9
x x + 2y + 3z
⎛ ⎞ ⎛ ⎞
L ⎜ y ⎟ = ⎜ 4x + 5y + 6z ⎟ . (11.2.10)
⎝ ⎠ ⎝ ⎠
z 7x + 8y + 9z
x 1 2 3 x
⎛ ⎞ ⎛ ⎞⎛ ⎞
L⎜ y ⎟ = ⎜4 5 6⎟⎜ y ⎟ , (11.2.11)
⎝ ⎠ ⎝ ⎠⎝ ⎠
z 7 8 9 z
11.2.2 https://math.libretexts.org/@go/page/2075
x 1 0 0
⎛ ⎞ ⎡ ⎛ ⎞ ⎛ ⎞ ⎛ ⎞⎤
L ⎜ y ⎟ = L ⎢x ⎜ 0 ⎟ + y ⎜ 0 ⎟ + z ⎜ 0 ⎟⎥
⎝ ⎠ ⎣ ⎝ ⎠ ⎝ ⎠ ⎝ ⎠⎦
z 0 1 1
1 2 3 1 2 3 x
⎛ ⎞ ⎛ ⎞ ⎛ ⎞ ⎛ ⎞⎛ ⎞
= x ⎜ 4 ⎟+y ⎜ 5 ⎟+z⎜ 6 ⎟ = ⎜ 4 5 6⎟⎜ y ⎟ .
⎝ ⎠ ⎝ ⎠ ⎝ ⎠ ⎝ ⎠⎝ ⎠
7 8 9 7 8 9 z
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 11.2: Matrix of a Linear Transformation (Redux) is shared under a not declared license and was authored, remixed, and/or curated
by David Cherney, Tom Denton, & Andrew Waldron.
11.2.3 https://math.libretexts.org/@go/page/2075
11.3: Review Problems
1.
a) Draw the collection of all unit vectors in R . 2
1
b) Let S x = {( ) , x} , where x is a unit vector in R . For which x is S a basis of R ?
2
x
2
c) Generalize to R . n
2. Let B be the vector space of column vectors with bit entries 0, 1. Write down every basis for B and B . How many bases are
n 1 2
there for B ? B ? Can you make a conjecture for the number of bases for B ?
3 4 n
(Hint: You can build up a basis for B by choosing one vector at a time, such that the vector you choose is not in the span of the
n
previous vectors you've chosen. How many vectors are in the span of any one vector? Any two vectors? How many vectors are in
the span of any k vectors, for k ≤ n ?)
3. Suppose that V is an n -dimensional vector space.
a) Show that any n linearly independent vectors in V form a basis.
(Hint: Let {w 1, … , wm } be a collection of n linearly independent vectors in V , and let {v 1, … , vn } be a basis for V .)
b) Show that any set of n vectors in V which span V forms a basis for V .
(Hint: Suppose that you have a set of n vectors which span V but do not form a basis. What must be true about them? How could
you get a basis from this set?)
4. Let S = {v , … , v } be a subset of a vector space V . Show that if every vector w in V can be expressed uniquely as a linear
1 n
combination of vectors in S , then S is a basis of V . In other words: suppose that for every vector w in V , there is exactly one set of
constants c , … , c so that c v + ⋯ + c v = w . Show that this means that the set S is linearly independent and spans V .
1 n 1
1
n
n
5. Vectors are objects that you can add together; show that the set of all linear transformations mapping R → R is itself a vector 3
space. Find a basis for this vector space. Do you think your proof could be modified to work for linear transformations R → R ? n
For R → R ? For R ?
N m R
3 3
Hint: Represent R as column vectors, and argue that a linear transformation T : R → R is just a row vector.
i
Hint: Describe it in terms of the matrices F which have a 1 in the i-th row and the j-th column and 0 everywhere else.
j
i r
Note that { F ∣ 1 ≤ i ≤ r, 1 ≤ j ≤ k} is a basis for M .
j k
7. Give the matrix of the linear transformation L with respect to the input and output bases B and B listed below: ′
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 11.3: Review Problems is shared under a not declared license and was authored, remixed, and/or curated by David Cherney, Tom
Denton, & Andrew Waldron.
11.3.1 https://math.libretexts.org/@go/page/2076
CHAPTER OVERVIEW
whose displacement at point x is given by a function y(x, t). The space of all displacement functions for the string can be
modelled by a vector space V . At this point, only the zero vector---the function y(x, t) = 0 drawn in grey---is the only special
vector.
The wave equation
2 2
∂ y ∂ y
= , (12.1)
∂t2 ∂x2
is a good model for the string's behavior in time and space. Hence we now have a linear transformation
2 2
∂ ∂
( − ) : V → V . (12.2)
2 2
∂t ∂x
Thumbnail: Mona Lisa with shear, eigenvector, and grid. Imaged used with permission (Public domain; TreyGreer62).
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 12: Eigenvalues and Eigenvectors is shared under a not declared license and was authored, remixed, and/or curated by David
Cherney, Tom Denton, & Andrew Waldron.
1
12.1: Invariant Directions
Have a look at the linear transformation L depicted below:
It was picked at random by choosing a pair of vectors L(e ) and L(e ) as the outputs of L acting on the canonical basis vectors.
1 2
Notice how the unit square with a corner at the origin is mapped to a parallelogram. The second line of the picture shows these
superimposed on one another. Now look at the second picture on that line. There, two vectors f and f have been carefully chosen
1 2
such that if the inputs into L are in the parallelogram spanned by f and f , the outputs also form a parallelogram with edges lying
1 2
along the same two directions. Clearly this is a very special situation that should correspond to interesting properties of L.
Now lets try an explicit example to see if we can achieve the last picture:
Example 12.1.1 :
1 0
Recall that a vector is a direction and a magnitude; L applied to ( ) or ( ) changes both the direction and the magnitude
0 1
3 −4 ⋅ 3 + 3 ⋅ 5 3
L( ) =( ) =( ) . (12.1.3)
5 −10 ⋅ 3 + 7 ⋅ 5 5
3
Then L fixes the direction (and actually also the magnitude) of the vector v 1 =( ) .
5
Now, notice that any vector with the same direction as v can be written as 1 cv1 for some constant c . Then
L(c v ) = cL(v ) = c v , so L fixes every vector pointing in the same direction as v .
1 1 1 1
12.1.1 https://math.libretexts.org/@go/page/2077
1
so L fixes the direction of the vector v2 = ( ) but stretches v2 by a factor of 2. Now notice that for any constant c ,
2
L(c v2 ) = cL(v2 ) = 2c v2 . Then L stretches every vector pointing in the same direction as v by a factor of 2. 2
In short, given a linear transformation L it is sometimes possible to find a vector v ≠ 0 and constant λ ≠ 0 such that Lv = λv.
We call the direction of the vector v an invariant direction . In fact, any vector pointing in the same direction also satisfies this
equation because L(cv) = cL(v) = λcv . More generally, any non-zero vector v that solves
Lv = λv (12.1.5)
is called an eigenvector of L, and λ (which now need not be zero) is an eigenvalue. Since the direction is all we really care about
here, then any other vector cv (so long as c ≠ 0 ) is an equally good choice of eigenvector. Notice that the relation "u and v point in
the same direction'' is an equivalence relation.
In our example of the linear transformation L with matrix
−4 3
( ) , (12.1.6)
−10 7
we have seen that L enjoys the property of having two invariant directions, represented by eigenvectors v and v with eigenvalues1 2
1 and 2 , respectively.
It would be very convenient if we could write any vector w as a linear combination of v and v . Suppose w = rv 1 2 1 + sv2 for some
constants r and s . Then:
Now L just multiplies the number r by 1 and the number s by 2. If we could write this as a matrix, it would look like:
1 0 s
( )( ) (12.1.8)
0 2 t
x a b x ax + by
L( ) =( )( ) =( ) . (12.1.9)
y c d y cx + dy
Here, s and t give the coordinates of w in terms of the vectors v and v . In the previous example, we multiplied the vector by the
1 2
matrix L and came up with a complicated expression. In these coordinates, we see that L has a very simple diagonal matrix, whose
diagonal entries are exactly the eigenvalues of L.
This process is called diagonalization. It makes complicated linear systems much easier to analyze.
Now that we've seen what eigenvalues and eigenvectors are, there are a number of questions that need to be answered.
1. How do we find eigenvectors and their eigenvalues?
2. How many eigenvalues and (independent) eigenvectors does a given linear transformation have?
3. When can a linear transformation be diagonalized?
12.1.2 https://math.libretexts.org/@go/page/2077
We'll start by trying to find the eigenvectors for a linear transformation.
Example 12.1.2 :
Let L: R 2
→ R
2
such that L(x, y) = (2x + 2y, 16x + 6y). First, we find the matrix of L:
x L 2 2 x
( ) ⟼ ( )( ). (12.1.10)
y 16 6 y
x
We want to find an invariant direction v = ( ) such that
y
Lv = λv (12.1.11)
2 2 x λ 0 x
⇔ ( )( ) = ( )( )
16 6 y 0 λ y
2 −λ 2 x 0
⇔ ( )( ) = ( ) .
16 6 −λ y 0
2 −λ 2
This is a homogeneous system, so it only has solutions when the matrix ( ) is singular. In other words,
16 6 −λ
2 −λ 2
det ( ) = 0
16 6 −λ
⇔ (2 − λ)(6 − λ) − 32 = 0
2
⇔ λ − 8λ − 20 = 0
⇔ (λ − 10)(λ + 2) = 0
is called the characteristic polynomial of M , and its roots are the eigenvalues of M .
In this case, we see that L has two eigenvalues, λ1 = 10 and λ 2 = −2 . To find the eigenvectors, we need to deal with these
2 −λ 2 x 0
two cases separately. To do so, we solve the linear system ( )( ) =( ) with the particular eigenvalue λ
16 6 −λ y 0
−8 2 x 0
( )( ) =( ). (12.1.13)
16 −4 y 0
x
Both equations say that y = 4x, so any vector ( ) will do. Since we only need the direction of the eigenvector, we can pick
4x
1
a value for x. Setting x = 1 is convenient, and gives the eigenvector v 1 =( ) .
4
4 2 x 0
( )( ) =( ). (12.1.14)
16 8 y 0
12.1.3 https://math.libretexts.org/@go/page/2077
Here again both equations agree, because we chose λ to make the system singular. We see that y = −2x works, so we can
1
choose v 2 =( ) .
−2
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 12.1: Invariant Directions is shared under a not declared license and was authored, remixed, and/or curated by David Cherney,
Tom Denton, & Andrew Waldron.
12.1.4 https://math.libretexts.org/@go/page/2077
12.2: The Eigenvalue-Eigenvector Equation
In Section 12, we developed the idea of eigenvalues and eigenvectors in the case of linear transformations R
2
→ R
2
. In this
section, we will develop the idea more generally.
Lv = λv. (12.2.1)
The left hand side of this equation is a polynomial in the variable λ called the characteristic polynomial PM (λ) of M . For an
n × n matrix, the characteristic polynomial has degree n . Then
n n−1
PM (λ) = λ + c1 λ + ⋯ + cn . (12.2.5)
The eigenvalues λ of M are exactly the roots of P (λ). These eigenvalues could be real or complex or zero, and they need not all
i M
be different. The number of times that any given root λ appears in the collection of eigenvalues is called its multiplicity .
i
Example 12.2.1 :
x
⎛ ⎞
⎜ y ⎟ (12.2.7)
⎝ ⎠
z
12.2.1 https://math.libretexts.org/@go/page/2078
2x + y − z
⎛ ⎞
⎜ x + 2y − z ⎟ (12.2.8)
⎝ ⎠
−x − y + 2z
\, .\]
In the standard basis the matrix M representing L has columns Le for each i, so: i
x 2 1 −1 x
⎛ ⎞ ⎛ ⎞⎛ ⎞
L
⎜ y ⎟ ↦ ⎜ 1 2 −1 ⎟ ⎜ y ⎟ . (12.2.9)
⎝ ⎠ ⎝ ⎠⎝ ⎠
z −1 −1 2 z
λ −2 −1 1
⎛ ⎞
PM (λ) = det ⎜ −1 λ −2 1 ⎟
⎝ ⎠
1 1 λ −2
2
= (λ − 2)[(λ − 2 ) − 1] + [−(λ − 2) − 1] + [−(λ − 2) − 1]
2
= (λ − 1 ) (λ − 4) .
−2 1 −1 0 1 −2 −1 0 1 0 1 0
⎛ ⎞ ⎛ ⎞ ⎛ ⎞
⎜ 1 −2 −1 0 ⎟ ∼ ⎜0 −3 −3 0 ⎟ ∼ ⎜0 1 1 0⎟ (12.2.10)
⎝ ⎠ ⎝ ⎠ ⎝ ⎠
−1 −1 −2 0 0 −3 −3 0 0 0 0 0
So we see that z = z =: t , y = −t , and x = −t gives a formula for eigenvectors in terms of the free parameter t . Any such
−1
⎛ ⎞
eigenvector is of the form t ⎜ −1 ⎟ ; thus L leaves a line through the origin invariant.
⎝ ⎠
1
1 1 −1 0 1 1 −1 0
⎛ ⎞ ⎛ ⎞
⎜ 1 1 −1 0 ⎟ ∼ ⎜0 0 0 0⎟ (12.2.11)
⎝ ⎠ ⎝ ⎠
−1 −1 1 0 0 0 0 0
Then the solution set has two free parameters, s and t , such that z = z =: t , y = y =: s , and x = −s + t . Thus L leaves
invariant the set:
⎧ −1 ∣ 1 ⎫
⎪ ⎛ ⎞ ⎞ ⎛ ⎪
∣
⎨s ⎜ 1 ⎟ + t ⎜ 0 ⎟ s, t ∈ R⎬ . (12.2.12)
⎩ ∣ ⎭
⎪ ⎝ ⎠ ⎝ ⎠ ⎪
0 1 ∣
−1
⎛ ⎞
This set is a plane through the origin. So the multiplicity two eigenvalue has two independent eigenvectors, ⎜ 1 ⎟ and
⎝ ⎠
0
1
⎛ ⎞
12.2.2 https://math.libretexts.org/@go/page/2078
Example 12.2.1 :
Let V be the vector space of smooth (i.e.inf initely dif f erentiable) functions f : R → R . Then the derivative is a linear
operator d
dx
: V → V . What are the eigenvectors of the derivative? In this case, we don't have a matrix to work with, so we
have to make do. A function f is an eigenvector of if there exists some number λ such that
d
dx
f = λf . An obvious
d
dx
dx
λx λx
The operator d
dx
has an eigenvector e λx
for every λ ∈ R .
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 12.2: The Eigenvalue-Eigenvector Equation is shared under a not declared license and was authored, remixed, and/or curated by
David Cherney, Tom Denton, & Andrew Waldron.
12.2.3 https://math.libretexts.org/@go/page/2078
12.3: Eigenspaces
fIn the previous example, we found two eigenvectors
−1 1
⎛ ⎞ ⎛ ⎞
⎜ 1 ⎟ and ⎜ 0 ⎟ (12.3.1)
⎝ ⎠ ⎝ ⎠
0 1
−1 1 0
⎛ ⎞ ⎛ ⎞ ⎛ ⎞
⎜ 1 ⎟+⎜ 0 ⎟ = ⎜ 1 ⎟ (12.3.2)
⎝ ⎠ ⎝ ⎠ ⎝ ⎠
0 1 1
−1 1
⎛ ⎞ ⎛ ⎞
r⎜ 1 ⎟+s⎜ 0 ⎟ (12.3.3)
⎝ ⎠ ⎝ ⎠
0 1
of these two eigenvectors will be another eigenvector with the same eigenvalue.
More generally, let { v , v , …} be eigenvectors of some linear transformation L with the same eigenvalue
1 2 λ . A
linear combination of the v can be written c v + c v + ⋯ for some constants {c , c , …}. Then:
i
1
1
2
2
1 2
1 2 1 2
L(c v1 + c v2 + ⋯) = c Lv1 + c Lv2 + ⋯ by linearity of L
1 2
= c λ v1 + c λ v2 + ⋯ since Lvi = λ vi
1 2
= λ(c v1 + c v2 + ⋯).
So every linear combination of the v is an eigenvector of L with the same eigenvalue λ . In simple terms, any sum of eigenvectors
i
All other vector space properties are inherited from the fact that V itself is a vector space. In other words, the subspace theorem,
9.1.1 chapter 9, ensures that V := {v ∈ V |Lv = 0} is a subspace of V .
λ
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 12.3: Eigenspaces is shared under a not declared license and was authored, remixed, and/or curated by David Cherney, Tom
Denton, & Andrew Waldron.
12.3.1 https://math.libretexts.org/@go/page/2079
12.4: Review Problems
1. Try to find more solutions to the vibrating string problem ∂ 2
y/∂ t
2
=∂
2 2
y/∂ x using the ansatz
\[
y(x,t)=\sin(\omega t) f(x)\, .
$$
What equation must f (x) obey? Can you write this as an eigenvector equation? Suppose that the string has length L and
f (0) = f (L) = 0 . Can you find any solutions for f (x)?
2 1
2. Let M =( ) . Find all eigenvalues of M . Does M have two linearly independent} eigenvectors? Is there a basis in which
0 2
x x cos θ + y sin θ
L( ) =( ) . (12.4.1)
y −x sin θ + y cos θ
1 0
a) Write the matrix of L in the basis ( ),( ) .
0 1
−
−−
e) Are there complex eigenvalues for L, assuming that i = √−1 exists?
4. Let L be the linear transformation L: R 3
→ R
3
given by
x x +y
⎛ ⎞ ⎛ ⎞
L⎜ y ⎟ = ⎜ x +z ⎟. (12.4.3)
⎝ ⎠ ⎝ ⎠
z y +z
Let e be the vector with a one in the ith position and zeros in all other positions.
i
1 1 1
m m m
⎛ 1 2 3 ⎞
b) Given a matrix M =⎜
2
⎜ m1 m
2
2
m
2
3
⎟
⎟ , what can you say about M e for each i? i
⎝ 3 3 3 ⎠
m m m
1 2 3
c) Find a 3 × 3 matrix M representing L. Choose three nonzero vectors pointing in different directions and show that M v = Lv
How about for A where n ∈ N ? Suppose that A is invertible. Show that v is also an eigenvector for A .
n −1
12.4.1 https://math.libretexts.org/@go/page/2080
a b
M =( ) . (12.4.4)
c d
Now, since we can evaluate polynomials on square matrices, we can plug M into its characteristic polynomial and find the
matrixP M(M ). What do you find from this computation?
Does something similar hold for 3 × 3 matrices? (Try assuming that the matrix of $M$ is diagonal to answer this.)
9. Discrete dynamical system. Let M be the matrix given by
3 2
M =( ). (12.4.5)
2 3
x(0)
Given any vector v(0) = ( ), we can create an infinite sequence of vectors v(1), v(2), v(3), and so on using the rule:
y(0)
(This is known as a {\it discrete dynamical system} whose {\it initial condition} is v(0). )
a) Find all eigenvectors and eigenvalues of M .
b) Find all vectors v(0) such that
v(0) = v(1) = v(2) = v(3) = ⋯ (12.4.7)
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 12.4: Review Problems is shared under a not declared license and was authored, remixed, and/or curated by David Cherney, Tom
Denton, & Andrew Waldron.
12.4.2 https://math.libretexts.org/@go/page/2080
CHAPTER OVERVIEW
13: Diagonalization
Given a linear transformation, it is highly desirable to write its matrix with respect to a basis of eigenvectors. Diagonalization is the
process of finding a corresponding diagonal matrix for a diagonalizable matrix or linear map. A square matrix that is not
diagonalizable is called defective.
13.1: Diagonalization
13.2: Change of Basis
13.3: Changing to a Basis of Eigenvectors
13.4: Review Problems
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 13: Diagonalization is shared under a not declared license and was authored, remixed, and/or curated by David Cherney, Tom
Denton, & Andrew Waldron.
1
13.1: Diagonalization
Suppose we are lucky, and we have L: V → V , and the ordered basis B = (v1 , … , vn ) is a set of linearly independent
eigenvectors for L, with eigenvalues λ , … , λ . Then:
1 n
L(v1 ) = λ1 v1
L(v2 ) = λ2 v2
L(vn ) = λn vn
⎝ n ⎠ ⎝⎝ ⎠⎝ n ⎠⎠
x B λn x
B
Theorem 13.1.1 :
Given an ordered basis B for a vector space V and a linear transformation L: V → V , then the matrix for L in the basis B is
diagonal if and only if B consists of eigenvectors for L.
Typically, however, we do not begin a problem with a basis of eigenvectors, but rather have to compute these. Hence we need to
know how to change from one basis to another.
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 13.1: Diagonalization is shared under a not declared license and was authored, remixed, and/or curated by David Cherney, Tom
Denton, & Andrew Waldron.
13.1.1 https://math.libretexts.org/@go/page/2081
13.2: Change of Basis
Suppose we have two ordered bases S = (v , … , v ) and S = (v , … , v ) for a vector space
1 n
′ ′
1
′
n V . (Here v and v are
i
′
i
vectors , not components of
vectors in a basis!) Then we may write each v uniquely as a linear combination of the v :
′
i j
′ i
v = ∑ vi p , (13.2.1)
j j
or in matrix notation
1 1 1
p p ⋯ pn
⎛ 1 2 ⎞
2 2
⎜ p p ⎟
′ ′ ′ ⎜ 1 2
⎟
( v , v , ⋯ , vn ) = ( v1 , v2 , ⋯ , vn )⎜ ⎟ (13.2.2)
1 2
⎜ ⎟
⎜ ⋮ ⋮ ⎟
⎝ n n ⎠
p ⋯ pn
1
Here, the p are constants, which we can regard as entries of a square matrix P
i
j
= (p )
i
j
. The matrix P must have an inverse, since we can also write
each v uniquely as a linear combination of the v :
i
′
j
k
vj = ∑ vk q . (13.2.3)
j
k i
But ∑ q p is the k, j entry of the product matrix QP . Since the expression for v in the basis S is v itself, then
i
k
i
i
j j j QP maps each vj to itself. As a
result, each v is an eigenvector for QP with eigenvalue 1, so QP is the identity, i.e.
j
−1
P Q = QP = I ↔ Q = P . (13.2.5)
The matrix P is called a change of basis matrix. There is a quick and dirty trick to obtain it: Look at the formula above relating the new basis vectors
v , v , … v to the old ones v , v , … , v . In particular focus on v for which
′ ′ ′ ′
1 2 n 1 2 n 1
1
p
⎛ 1 ⎞
2
⎜ p ⎟
′ ⎜ 1 ⎟
v = ( v1 , v2 , ⋯ , vn )⎜ ⎟ . (13.2.6)
1
⎜ ⎟
⎜ ⋮ ⎟
⎝ n ⎠
p
1
This says that the first column of the change of basis matrix P is really just the components of the vector v in the basis v ′
1 1, v2 , … , vn , so:
The columns of the change of basis matrix are the components of the new basis vectors in terms of the old basis vectors. (13.2.7)
Example 13.2.1 :
Suppose S ′ ′
= (v , v )
1
′
2
is an ordered basis for a vector space V and that with respect to some other ordered basis S = (v 1, v2 ) for V
1 1
⎛ √2 ⎞ ⎛ √3 ⎞
′ ′
v = and v = . (13.2.8)
1 1 2 1
⎝ ⎠ ⎝− ⎠
√2 S √3 S
This means
1 1
⎛ √2 ⎞ v1 + v2 ⎛ √3 ⎞ v1 − v2
′ ′
v = ( v1 , v2 ) = and v = ( v1 , v2 ) = . (13.2.9)
1 1 – 2 1 –
⎝ ⎠ √2 ⎝− ⎠ √3
√2 √3
The change of basis matrix has as its columns just the components of v and v ; ′
1
′
2
$$
P=
1 1
⎛ √2 √3
⎞
(13.2.10)
1 1
⎝ − ⎠
√2 √3
\, .
\]
13.2.1 https://math.libretexts.org/@go/page/2082
Changing basis changes the matrix of a linear
transformation. However, as a map between vector spaces,
the linear transformation is the same no matter which basis we use . Linear transformations are the actual objects of study of this book, not
matrices; matrices are merely a convenient way of doing computations.
Lets now calculate how the matrix of a linear transformation changes when changing basis. To wit, let L: V ⟶ W with matrix i
M = (m )
j
in the
ordered input and output bases S = (v , … , v ) and T = (w , … , w ) so
1 n 1 m
k
L(vi ) = ∑ wk m . (13.2.11)
i
′k
Now, suppose S ′ ′
= (v , … , vn )
1
′
and T ′ ′
= (w , … , wm )
1
′
are new ordered input and out bases with matrix M ′
= (m i
) . Then
′ ′k
L(v ) = ∑ wk m . (13.2.12)
i i
j
Let P = (p ) be the change of basis matrix from input basis S to the basis S and Q = (q
i
j
′
k
) be the change of basis matrix from output basis T to the
basis T . Then:
′
′ i i k i
L(v ) = L ( ∑ vi p ) = ∑ L(vi )p = ∑ ∑ wk m p . (13.2.13)
j j j i j
i i i k
Meanwhile, we have:
′k j k
′
L(v ) = ∑ vk m = ∑ ∑ vj q m . (13.2.14)
i i k i
k k j
Since the expression for a vector in a basis is unique, then we see that the entries of M P are the same as the entries of QM . In other words, we see ′
that M P = Q M ′
or M =Q
′
MP .
−1
Example 13.2.2 :
From this information we can immediately read off the matrix M of L in the bases S = (1, t, t 2
) and T = (e1 , e2 ) , the standard basis for R ,2
because
2
(L(1), L(t), L(t )) = (e1 + 2 e2 , 2 e1 + e2 , 3 e1 + 3 e2 )
1 2 3 1 2 3
= (e1 , e2 ) ( ) ⇒ M = ( ) .
2 1 3 2 1 3
′ 2 2 ′
1 2
′ ′
S = (1 + t, t + t , 1 + t ) , T = (( ),( )) =: (w , w ) . (13.2.16)
1 2
2 1
2 2
1 2 2 3 1 3
(L(1 + t)L(t + t ), L(1 + t )) = (( ) +( ),( ) +( ),( ) +( ))
2 1 1 3 2 3
= (w1 + w2 , w1 + 2 w2 , 2 w2 + w1 )
1 1 2 ′
1 1 2
= (w1 , w2 ) ( ) ⇒ M =( ) .
1 2 1 1 2 1
Alternatively we could calculate the change of basis matrices P and Q by noting that
1 0 1 1 0 1
⎛ ⎞ ⎛ ⎞
2 2 2
(1 + t, t + t , 1 + t ) = (1, t, t ) ⎜ 1 1 0⎟ ⇒ P =⎜1 1 0⎟ (13.2.17)
⎝ ⎠ ⎝ ⎠
0 1 1 0 1 1
and
1 2 1 2
(w1 , w2 ) = (e1 + 2 e2 , 2 e1 + e2 ) = (e1 , e1 ) ( ) ⇒ Q =( ) . (13.2.18)
2 1 2 1
13.2.2 https://math.libretexts.org/@go/page/2082
Hence

$$M' = Q^{-1}MP = -\frac{1}{3}\begin{pmatrix} 1 & -2 \\ -2 & 1 \end{pmatrix}\begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix}\begin{pmatrix} 1 & 0 & 1 \\ 1 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 1 & 2 \\ 1 & 2 & 1 \end{pmatrix}. \qquad (13.2.19)$$

Notice that the change of basis matrices P and Q are both square and invertible. Also, since we really wanted Q^{-1}, it is more efficient to write (e_1, e_2) in terms of (w'_1, w'_2), which yields Q^{-1} directly. Alternatively, one can check that MP = QM'.
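The relation M' = Q^{-1}MP is easy to check numerically. Below is a minimal NumPy sketch (ours, not part of the original text) that reproduces the matrices of this example.

```python
import numpy as np

# Matrix of L in the original bases S = (1, t, t^2) and T = (e1, e2).
M = np.array([[1., 2., 3.],
              [2., 1., 3.]])

# Change of basis matrices: columns are the new basis vectors
# written in terms of the old bases.
P = np.array([[1., 0., 1.],
              [1., 1., 0.],
              [0., 1., 1.]])      # input basis S -> S'
Q = np.array([[1., 2.],
              [2., 1.]])          # output basis T -> T'

M_prime = np.linalg.inv(Q) @ M @ P
print(M_prime)                          # [[1. 1. 2.] [1. 2. 1.]]
print(np.allclose(M @ P, Q @ M_prime))  # True: M P = Q M'
```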
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 13.2: Change of Basis is shared under a not declared license and was authored, remixed, and/or curated by David Cherney, Tom Denton, & Andrew
Waldron.
13.3: Changing to a Basis of Eigenvectors
If we are changing to a basis of eigenvectors, then there are various simplifications:
1. Since L : V → V , most likely you already know the matrix M of L using the same input basis as output basis
S = (u1 , … , un ) (say).
2. In the new basis of eigenvectors S' = (v_1, …, v_n), the matrix D of L is diagonal because Lv_i = λ_i v_i and so

$$(L(v_1), L(v_2), \ldots, L(v_n)) = (v_1, v_2, \ldots, v_n)\begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{pmatrix}. \qquad (13.3.1)$$
Definition
A matrix M is diagonalizable if there exists an invertible matrix P and a diagonal matrix D such that

$$D = P^{-1}MP. \qquad (13.3.2)$$

If for two matrices N and M there exists an invertible matrix P such that M = P^{-1}NP, then we say that M and N are similar. Then the above discussion shows that diagonalizable matrices are similar to diagonal matrices.
Corollary
A square matrix M is diagonalizable if and only if there exists a basis of eigenvectors for M . Moreover, these eigenvectors are the
columns of the change of basis matrix P which diagonalizes M .
Example 13.3.1:
Let's try to diagonalize the matrix

$$M = \begin{pmatrix} -14 & -28 & -44 \\ -7 & -14 & -23 \\ 9 & 18 & 29 \end{pmatrix}.$$

So the eigenvalues of M are −1, 0, and 2, and associated eigenvectors turn out to be

$$v_1 = \begin{pmatrix} -8 \\ -1 \\ 3 \end{pmatrix}, \quad v_2 = \begin{pmatrix} -2 \\ 1 \\ 0 \end{pmatrix}, \quad v_3 = \begin{pmatrix} -1 \\ -1 \\ 1 \end{pmatrix}.$$

In order for M to be diagonalizable, we need the vectors v_1, v_2, v_3 to be linearly independent. Notice that the matrix

$$P = (v_1\; v_2\; v_3) = \begin{pmatrix} -8 & -2 & -1 \\ -1 & 1 & -1 \\ 3 & 0 & 1 \end{pmatrix} \qquad (13.3.8)$$

is invertible because its determinant is −1. Therefore, the eigenvectors of M form a basis of R³, and so M is diagonalizable. Moreover, because the columns of P are the components of eigenvectors,

$$MP = (Mv_1\; Mv_2\; Mv_3) = (-1.v_1\; 0.v_2\; 2.v_3) = (v_1\; v_2\; v_3)\begin{pmatrix} -1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 2 \end{pmatrix}. \qquad (13.3.9)$$

Hence

$$P^{-1}MP = \begin{pmatrix} -1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 2 \end{pmatrix},$$

so the change of basis matrix P of eigenvectors diagonalizes M.
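As a quick sanity check, the diagonalization can be verified numerically. A minimal NumPy sketch (ours), using the matrix and eigenvectors of this example:

```python
import numpy as np

M = np.array([[-14., -28., -44.],
              [ -7., -14., -23.],
              [  9.,  18.,  29.]])

# Columns of P are the eigenvectors found above.
P = np.array([[-8., -2., -1.],
              [-1.,  1., -1.],
              [ 3.,  0.,  1.]])

D = np.linalg.inv(P) @ M @ P
print(np.round(D, 6))          # diag(-1, 0, 2)

# np.linalg.eigvals finds the same eigenvalues (possibly in another order).
print(np.linalg.eigvals(M))
```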
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 13.3: Changing to a Basis of Eigenvectors is shared under a not declared license and was authored, remixed, and/or curated by
David Cherney, Tom Denton, & Andrew Waldron.
13.4: Review Problems
1. Let P_n(t) be the vector space of polynomials of degree n or less, and d/dt: P_n(t) → P_n(t) be the derivative operator. Find the matrix of d/dt in the ordered bases E = (1, t, …, t^n) for P_n(t) and F = (1, t, …, t^n) for P_n(t). Determine if this derivative operator is diagonalizable. (Recall from chapter 6 that the derivative operator is linear.)

2. When writing a matrix for a linear transformation, we have seen that the choice of basis matters. In fact, even the order of the basis matters!
a. Write all possible reorderings of the standard basis (e_1, e_2, e_3) for R³.
b. Write each change of basis matrix between the standard basis and each of its reorderings. Make as many observations as you can about these matrices: what are their entries? Do you notice anything about how many of each type of entry appears in each row and column? What are their determinants? (Note: These matrices are known as permutation matrices.)
c. Given that L: R³ → R³ is linear and

$$L\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 2y - z \\ 3x \\ 2z + x + y \end{pmatrix}, \qquad (13.4.1)$$

write the matrix M for L in the standard basis, and in two reorderings of the standard basis. How are these matrices related?

3. Let

$$X = \{\heartsuit, \clubsuit, \spadesuit\}, \qquad Y = \{*, \star\}. \qquad (13.4.2)$$

Write down two different ordered bases, S, S' and T, T' respectively, for each of the vector spaces R^X and R^Y. Find the change of basis matrices P and Q that map these bases to one another. Now consider the map

$$\ell: Y \to X, \qquad (13.4.3)$$

where ℓ(∗) = ♡ and ℓ(⋆) = ♠. Show that ℓ can be used to define a linear transformation L: R^X → R^Y. Compute the matrices M and M' of L in the bases S, T and then S', T'. Use your change of basis matrices P and Q to check that M' = Q^{-1}MP.
4. Recall that tr(MN) = tr(NM). Use this fact to show that the trace of a square matrix M does not depend on the basis you used to compute M.

5. When is the 2 × 2 matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ diagonalizable? Include examples in your answer.

b) Can the matrix $\begin{pmatrix} \lambda & 1 & 0 \\ 0 & \lambda & 1 \\ 0 & 0 & \lambda \end{pmatrix}$ be diagonalized? Either diagonalize it or explain why this is impossible.

c) Can the n × n matrix

$$\begin{pmatrix} \lambda & 1 & 0 & \cdots & 0 & 0 \\ 0 & \lambda & 1 & \cdots & 0 & 0 \\ 0 & 0 & \lambda & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & \lambda & 1 \\ 0 & 0 & 0 & \cdots & 0 & \lambda \end{pmatrix}$$

be diagonalized? Either diagonalize it or explain why this is impossible.
Note: It turns out that every matrix is similar to a block matrix whose diagonal blocks look like diagonal matrices or the ones above and whose off-diagonal blocks are all zero. This is called the Jordan form of the matrix, and a (maximal) block that looks like

$$\begin{pmatrix} \lambda & 1 & 0 & \cdots & 0 \\ 0 & \lambda & 1 & & 0 \\ \vdots & & \ddots & \ddots & \\ & & & \lambda & 1 \\ 0 & 0 & 0 & & \lambda \end{pmatrix} \qquad (13.4.4)$$

is called a Jordan n-cell or a Jordan block, where n is the size of the block.
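One way to see numerically why such a block resists diagonalization is to compare its algebraic and geometric multiplicities. A small NumPy sketch (ours), assuming a 3 × 3 block with λ = 2 for concreteness:

```python
import numpy as np

lam = 2.0
J = np.array([[lam, 1., 0.],
              [0., lam, 1.],
              [0., 0., lam]])

# Algebraic multiplicity: lam is a triple root of the characteristic polynomial.
print(np.linalg.eigvals(J))                   # [2. 2. 2.]

# Geometric multiplicity: dimension of the null space of (J - lam I).
rank = np.linalg.matrix_rank(J - lam * np.eye(3))
print(3 - rank)                               # 1 -> only one independent eigenvector
```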
8. Let A and B be commuting matrices (i.e. , AB = BA ) and suppose that A has an eigenvector v with eigenvalue λ . Show that
Bv is also an eigenvector of A with eigenvalue λ . Additionally suppose that A is diagonalizable with distinct eigenvalues. What is
the dimension of each eigenspace of A ? Show that v is also an eigenvector of B . Explain why this shows that A and B can be
simultaneously diagonalized (i.e. there is an ordered basis in which both their matrices are diagonal.)
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 13.4: Review Problems is shared under a not declared license and was authored, remixed, and/or curated by David Cherney, Tom
Denton, & Andrew Waldron.
CHAPTER OVERVIEW
14: Orthonormal Bases and Complements
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 14: Orthonormal Bases and Complements is shared under a not declared license and was authored, remixed, and/or curated by
David Cherney, Tom Denton, & Andrew Waldron.
14.1: Properties of the Standard Basis
The standard notion of the length of a vector x = (x_1, x_2, …, x_n) ∈ R^n is

$$||x|| = \sqrt{x \cdot x} = \sqrt{(x_1)^2 + (x_2)^2 + \cdots + (x_n)^2}. \qquad (14.1.1)$$

The standard basis

$$e_1 = \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix},\; e_2 = \begin{pmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{pmatrix},\; \ldots,\; e_n = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{pmatrix} \qquad (14.1.2)$$

has many useful properties with respect to the dot product and lengths:

1. The standard basis vectors are unit vectors: ||e_i|| = 1 for each i.

2. The standard basis vectors are orthogonal (in other words, at right angles or perpendicular):

$$e_i \cdot e_j = e_i^T e_j = 0 \quad\text{when } i \neq j. \qquad (14.1.4)$$

This is summarized by

$$e_i^T e_j = \delta_{ij} = \begin{cases} 1 & i = j \\ 0 & i \neq j \end{cases}, \qquad (14.1.5)$$

where δ_ij is the Kronecker delta. Notice that the Kronecker delta gives the entries of the identity matrix.
The standard basis vectors can also be combined into square matrices using the outer product:

$$\Pi_1 = e_1 e_1^T = \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix}(1\; 0\; \cdots\; 0) = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 0 & \cdots & 0 \\ \vdots & & & \vdots \\ 0 & 0 & \cdots & 0 \end{pmatrix}$$
$$\vdots$$
$$\Pi_n = e_n e_n^T = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{pmatrix}(0\; 0\; \cdots\; 1) = \begin{pmatrix} 0 & 0 & \cdots & 0 \\ 0 & 0 & \cdots & 0 \\ \vdots & & & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix}.$$

In short, Π_i is the diagonal square matrix with a 1 in the ith diagonal position and zeros everywhere else.

Notice that

$$\Pi_i \Pi_j = e_i e_i^T e_j e_j^T = e_i \delta_{ij} e_j^T. \qquad (14.1.11)$$

Then:

$$\Pi_i \Pi_j = \begin{cases} \Pi_i & i = j \\ 0 & i \neq j. \end{cases} \qquad (14.1.12)$$
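The matrices Π_i are easy to build and test with outer products. A short NumPy sketch (ours) illustrating the relations above:

```python
import numpy as np

n = 4
e = np.eye(n)                                  # columns are the standard basis vectors

# Outer products Pi_i = e_i e_i^T.
Pi = [np.outer(e[:, i], e[:, i]) for i in range(n)]

print(Pi[0])                                   # 1 in the top-left entry, zeros elsewhere
print(np.allclose(Pi[1] @ Pi[1], Pi[1]))       # Pi_i Pi_i = Pi_i
print(np.allclose(Pi[1] @ Pi[2], 0))           # Pi_i Pi_j = 0 for i != j
print(np.allclose(sum(Pi), np.eye(n)))         # the Pi_i sum to the identity
```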
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 14.1: Properties of the Standard Basis is shared under a not declared license and was authored, remixed, and/or curated by David
Cherney, Tom Denton, & Andrew Waldron.
14.2: Orthogonal and Orthonormal Bases
There are many other bases that behave in the same way as the standard basis. As such, we will study:

1. Orthogonal bases {v_1, …, v_n}:

$$v_i \cdot v_j = 0 \text{ if } i \neq j. \qquad (14.2.1)$$

2. Orthonormal bases {u_1, …, u_n}:

$$u_i \cdot u_j = \delta_{ij}. \qquad (14.2.2)$$

Suppose T = {u_1, …, u_n} is an orthonormal basis for R^n. Since T is a basis, any vector v can be written as a linear combination v = c^1 u_1 + ⋯ + c^n u_n of the vectors in T. Since T is orthonormal, there is a very easy way to find the coefficients of this linear combination. By taking the dot product of v with any of the vectors in T, we get:

$$v \cdot u_i = c^1\, u_1 \cdot u_i + \cdots + c^i\, u_i \cdot u_i + \cdots + c^n\, u_n \cdot u_i = c^1 \cdot 0 + \cdots + c^i \cdot 1 + \cdots + c^n \cdot 0 = c^i,$$
$$\Rightarrow\; c^i = v \cdot u_i \;\Rightarrow\; v = (v \cdot u_1)u_1 + \cdots + (v \cdot u_n)u_n = \sum_i (v \cdot u_i)\, u_i.$$

Theorem
For an orthonormal basis {u_1, …, u_n}, any vector v can be expressed as

$$v = \sum_i (v \cdot u_i)\, u_i.$$
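This recipe is simple to try in code: the coefficients of v in an orthonormal basis are just dot products. A minimal NumPy sketch (ours), with an assumed orthonormal basis of R³:

```python
import numpy as np

u = [np.array([1., 1., 0.]) / np.sqrt(2),
     np.array([1., -1., 0.]) / np.sqrt(2),
     np.array([0., 0., 1.])]                 # an orthonormal basis of R^3

v = np.array([3., 1., 2.])

c = [v @ ui for ui in u]                     # coefficients c_i = v . u_i
reconstructed = sum(ci * ui for ci, ui in zip(c, u))
print(np.allclose(reconstructed, v))         # True: v = sum (v . u_i) u_i
```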
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 14.2: Orthogonal and Orthonormal Bases is shared under a not declared license and was authored, remixed, and/or curated by
David Cherney, Tom Denton, & Andrew Waldron.
14.3: Relating Orthonormal Bases
Suppose T = {u_1, …, u_n} and R = {w_1, …, w_n} are two orthonormal bases for R^n. Then, expanding each w_i in the basis T,

$$w_i = (u_1 \cdot w_i)u_1 + \cdots + (u_n \cdot w_i)u_n \;\Rightarrow\; w_i = \sum_j u_j (u_j \cdot w_i),$$

so the change of basis matrix from T to R has entries P^j_i = u_j · w_i.

We would like to calculate the product P^T P. For that, we first develop a dirty trick for products of dot products:

$$(u \cdot v)(w \cdot z) = (u^T v)(w^T z) = u^T (v w^T) z. \qquad (14.3.2)$$

The object v w^T is the square matrix made from the outer product of v and w! Now we are ready to compute the components of the matrix product P^T P:

$$\sum_i (u_j \cdot w_i)(w_i \cdot u_k) = \sum_i (u_j^T w_i)(w_i^T u_k) = u_j^T \Big[\sum_i (w_i w_i^T)\Big] u_k \stackrel{(*)}{=} u_j^T I_n u_k = u_j^T u_k = \delta_{jk}.$$

The equality (∗) is explained below. Assuming (∗) holds, we have shown that P^T P = I_n, which implies that

$$P^T = P^{-1}. \qquad (14.3.3)$$

The equality (∗) uses the fact that ∑_i w_i w_i^T = I_n. To see this, examine how this matrix acts on an arbitrary vector v; writing v = ∑_j c^j w_j,

$$\Big(\sum_i w_i w_i^T\Big)v = \Big(\sum_i w_i w_i^T\Big)\Big(\sum_j c^j w_j\Big) = \sum_j c^j \sum_i w_i w_i^T w_j = \sum_j c^j \sum_i w_i \delta_{ij} = \sum_j c^j w_j = v,$$

since all terms with i ≠ j vanish. Because this holds for every v, it follows that ∑_i w_i w_i^T = I_n.

Definition: Orthogonality
A matrix P is orthogonal if P^{-1} = P^T.

Then to summarize,

Theorem: Orthonormality
A change of basis matrix P relating two orthonormal bases is an orthogonal matrix, i.e., P^{-1} = P^T.
Example 14.3.1:
Consider R³ with the orthonormal basis

$$S = \left\{ u_1 = \begin{pmatrix} \frac{2}{\sqrt{6}} \\ \frac{1}{\sqrt{6}} \\ \frac{-1}{\sqrt{6}} \end{pmatrix},\; u_2 = \begin{pmatrix} 0 \\ \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \end{pmatrix},\; u_3 = \begin{pmatrix} \frac{1}{\sqrt{3}} \\ \frac{-1}{\sqrt{3}} \\ \frac{1}{\sqrt{3}} \end{pmatrix} \right\}. \qquad (14.3.4)$$

Let E be the standard basis {e_1, e_2, e_3}. Since we are changing from the standard basis to a new basis, the columns of the change of basis matrix are exactly the new basis vectors written in terms of the standard basis. Then the change of basis matrix from E to S is given by:

$$P = (P^j_i) = (e_j \cdot u_i) = \begin{pmatrix} e_1 \cdot u_1 & e_1 \cdot u_2 & e_1 \cdot u_3 \\ e_2 \cdot u_1 & e_2 \cdot u_2 & e_2 \cdot u_3 \\ e_3 \cdot u_1 & e_3 \cdot u_2 & e_3 \cdot u_3 \end{pmatrix} = (u_1\; u_2\; u_3) = \begin{pmatrix} \frac{2}{\sqrt{6}} & 0 & \frac{1}{\sqrt{3}} \\ \frac{1}{\sqrt{6}} & \frac{1}{\sqrt{2}} & \frac{-1}{\sqrt{3}} \\ \frac{-1}{\sqrt{6}} & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{3}} \end{pmatrix}.$$

From our theorem, we observe that

$$P^{-1} = P^T = \begin{pmatrix} u_1^T \\ u_2^T \\ u_3^T \end{pmatrix} = \begin{pmatrix} \frac{2}{\sqrt{6}} & \frac{1}{\sqrt{6}} & \frac{-1}{\sqrt{6}} \\ 0 & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{3}} & \frac{-1}{\sqrt{3}} & \frac{1}{\sqrt{3}} \end{pmatrix}.$$

We can check that

$$P^T P = \begin{pmatrix} u_1^T \\ u_2^T \\ u_3^T \end{pmatrix}(u_1\; u_2\; u_3) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$

Above we are using orthonormality of the u_i and the fact that matrix multiplication amounts to taking dot products between rows and columns. It is also very important to realize that the columns of an orthogonal matrix are made from an orthonormal set of vectors.
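The orthogonality of P can be confirmed numerically. A short NumPy sketch (ours) using the basis of this example:

```python
import numpy as np

s = np.sqrt
# Columns are the orthonormal vectors u1, u2, u3 of Example 14.3.1.
P = np.array([[ 2/s(6), 0.,      1/s(3)],
              [ 1/s(6), 1/s(2), -1/s(3)],
              [-1/s(6), 1/s(2),  1/s(3)]])

print(np.allclose(P.T @ P, np.eye(3)))       # True: P^T P = I
print(np.allclose(np.linalg.inv(P), P.T))    # True: P^{-1} = P^T
```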
Suppose D is a diagonal matrix and we use an orthogonal matrix P to change to a new basis, so that the matrix in the new basis is M = PDP^T. Taking the transpose,

$$M^T = (PDP^T)^T = (P^T)^T D^T P^T = PDP^T = M.$$

The matrix M = PDP^T is symmetric!
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 14.3: Relating Orthonormal Bases is shared under a not declared license and was authored, remixed, and/or curated by David
Cherney, Tom Denton, & Andrew Waldron.
14.4: Gram-Schmidt and Orthogonal Complements
Given a vector v and some other vector u not in span{v}, we can construct a new vector:

$$v^\perp := v - \frac{u \cdot v}{u \cdot u}\, u.$$

This new vector v^⊥ is orthogonal to u because

$$u \cdot v^\perp = u \cdot v - \frac{u \cdot v}{u \cdot u}\, u \cdot u = 0. \qquad (14.4.1)$$

Hence, {u, v^⊥} is an orthogonal basis for span{u, v}. When v is not parallel to u, v^⊥ ≠ 0, and normalizing these vectors we obtain {u/|u|, v^⊥/|v^⊥|}, an orthonormal basis for the vector space span{u, v}.

Sometimes we write v = v^⊥ + v^∥ where:

$$v^\perp = v - \frac{u \cdot v}{u \cdot u}\, u, \qquad v^\parallel = \frac{u \cdot v}{u \cdot u}\, u.$$

This is called an orthogonal decomposition because we have decomposed v into a sum of orthogonal vectors. This decomposition depends on u; if we change the direction of u we change v^⊥ and v^∥.

If u, v are linearly independent vectors in R³, then the set {u, v^⊥, u × v^⊥} would be an orthogonal basis for R³. This set could then be normalized by dividing each vector by its length to obtain an orthonormal basis.

However, it often occurs that we are interested in vector spaces with dimension greater than 3, and must resort to craftier means than cross products to obtain an orthogonal basis.

Given a third vector w, we should first check that w does not lie in the span of u and v, i.e., check that u, v and w are linearly independent. If it does not, we then can define:

$$w^\perp = w - \frac{u \cdot w}{u \cdot u}\, u - \frac{v^\perp \cdot w}{v^\perp \cdot v^\perp}\, v^\perp. \qquad (14.4.2)$$
We can check that u · w^⊥ and v^⊥ · w^⊥ are both zero:

$$u \cdot w^\perp = u \cdot \Big(w - \frac{u \cdot w}{u \cdot u}\, u - \frac{v^\perp \cdot w}{v^\perp \cdot v^\perp}\, v^\perp\Big) = u \cdot w - \frac{u \cdot w}{u \cdot u}\, u \cdot u - \frac{v^\perp \cdot w}{v^\perp \cdot v^\perp}\, u \cdot v^\perp = u \cdot w - u \cdot w - \frac{v^\perp \cdot w}{v^\perp \cdot v^\perp}\, u \cdot v^\perp = 0$$

since u is orthogonal to v^⊥, and

$$v^\perp \cdot w^\perp = v^\perp \cdot \Big(w - \frac{u \cdot w}{u \cdot u}\, u - \frac{v^\perp \cdot w}{v^\perp \cdot v^\perp}\, v^\perp\Big) = v^\perp \cdot w - \frac{u \cdot w}{u \cdot u}\, v^\perp \cdot u - \frac{v^\perp \cdot w}{v^\perp \cdot v^\perp}\, v^\perp \cdot v^\perp = v^\perp \cdot w - \frac{u \cdot w}{u \cdot u}\, v^\perp \cdot u - v^\perp \cdot w = 0$$

because u is orthogonal to v^⊥. Since w^⊥ is orthogonal to both u and v^⊥, we have that {u, v^⊥, w^⊥} is an orthogonal basis for span{u, v, w}.

In fact, given a sequence of linearly independent vectors v_1, v_2, v_3, …, we can repeat this construction to produce an orthogonal set of vectors spanning the same space:

$$v_1^\perp := v_1$$
$$v_2^\perp := v_2 - \frac{v_1^\perp \cdot v_2}{v_1^\perp \cdot v_1^\perp}\, v_1^\perp$$
$$v_3^\perp := v_3 - \frac{v_1^\perp \cdot v_3}{v_1^\perp \cdot v_1^\perp}\, v_1^\perp - \frac{v_2^\perp \cdot v_3}{v_2^\perp \cdot v_2^\perp}\, v_2^\perp$$
$$\vdots$$
$$v_i^\perp := v_i - \sum_{j<i} \frac{v_j^\perp \cdot v_i}{v_j^\perp \cdot v_j^\perp}\, v_j^\perp = v_i - \frac{v_1^\perp \cdot v_i}{v_1^\perp \cdot v_1^\perp}\, v_1^\perp - \frac{v_2^\perp \cdot v_i}{v_2^\perp \cdot v_2^\perp}\, v_2^\perp - \cdots - \frac{v_{i-1}^\perp \cdot v_i}{v_{i-1}^\perp \cdot v_{i-1}^\perp}\, v_{i-1}^\perp.$$

Notice that each v_i^⊥ here depends on v_j^⊥ for every j < i. This allows us to inductively/algorithmically build up a linearly independent, orthogonal set of vectors {v_1^⊥, v_2^⊥, …} such that span{v_1^⊥, v_2^⊥, …} = span{v_1, v_2, …}; that is, an orthogonal basis for the latter vector space. This algorithm is called the Gram-Schmidt orthogonalization procedure. Gram worked at a Danish insurance company over one hundred years ago; Schmidt was a student of Hilbert (the famous German mathematician).
Example 14.4.1:
We obtain an orthogonal basis for R³ by applying Gram-Schmidt to the linearly independent vectors

$$v_1 = \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix}, \quad v_2 = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}, \quad v_3 = \begin{pmatrix} 3 \\ 1 \\ 1 \end{pmatrix}.$$

First, v_1^⊥ = v_1. Then:

$$v_2^\perp = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} - \frac{2}{2}\begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}$$

$$v_3^\perp = \begin{pmatrix} 3 \\ 1 \\ 1 \end{pmatrix} - \frac{4}{2}\begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix} - \frac{1}{1}\begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \\ -1 \\ 0 \end{pmatrix}.$$
Then the set

$$\left\{\begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}, \begin{pmatrix} 1 \\ -1 \\ 0 \end{pmatrix}\right\} \qquad (14.4.3)$$

is an orthogonal basis for R³. To obtain an orthonormal basis, as always we simply divide each of these vectors by its length, yielding:

$$\left\{\begin{pmatrix} \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}, \begin{pmatrix} \frac{1}{\sqrt{2}} \\ \frac{-1}{\sqrt{2}} \\ 0 \end{pmatrix}\right\}. \qquad (14.4.4)$$
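The Gram-Schmidt procedure translates directly into a few lines of code. The sketch below (ours, following the formulas above) reproduces the orthogonal basis of this example:

```python
import numpy as np

def gram_schmidt(vectors):
    """Return an orthogonal basis for the span of the given independent vectors."""
    orthogonal = []
    for v in vectors:
        w = v.astype(float)
        for u in orthogonal:
            w = w - (u @ v) / (u @ u) * u    # subtract the projection onto u
        orthogonal.append(w)
    return orthogonal

v1, v2, v3 = np.array([1., 1., 0.]), np.array([1., 1., 1.]), np.array([3., 1., 1.])
for w in gram_schmidt([v1, v2, v3]):
    print(w)          # (1, 1, 0), (0, 0, 1), (1, -1, 0), as in the example
```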
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 14.4: Gram-Schmidt and Orthogonal Complements is shared under a not declared license and was authored, remixed, and/or
curated by David Cherney, Tom Denton, & Andrew Waldron.
14.5: QR Decomposition
In chapter 7, section 7.7 we learned how to solve linear systems by decomposing a matrix M into a product of lower and upper triangular matrices

$$M = LU. \qquad (14.5.1)$$

The Gram-Schmidt procedure suggests another matrix decomposition,

$$M = QR, \qquad (14.5.2)$$

where Q is an orthogonal matrix and R is an upper triangular matrix. So-called QR-decompositions are useful for solving linear systems, eigenvalue problems and least squares approximations. You can easily get the idea behind the QR decomposition by working through a simple example.

Example 14.5.1:
Find the QR decomposition of

$$M = \begin{pmatrix} 2 & -1 & 1 \\ 1 & 3 & -2 \\ 0 & 1 & -2 \end{pmatrix}.$$

What we will do is to think of the columns of M as three 3-vectors and use Gram-Schmidt to build an orthonormal basis from these that will become the columns of the orthogonal matrix Q. We will use the matrix R to record the steps of the Gram-Schmidt procedure in such a way that the product QR equals M.

To begin with we write

$$M = \begin{pmatrix} 2 & -\frac{7}{5} & 1 \\ 1 & \frac{14}{5} & -2 \\ 0 & 1 & -2 \end{pmatrix}\begin{pmatrix} 1 & \frac{1}{5} & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}. \qquad (14.5.4)$$

In the first matrix the first two columns are orthogonal because we simply replaced the second column of M by the vector that the Gram-Schmidt procedure produces from the first two columns of M, namely

$$\begin{pmatrix} -\frac{7}{5} \\ \frac{14}{5} \\ 1 \end{pmatrix} = \begin{pmatrix} -1 \\ 3 \\ 1 \end{pmatrix} - \frac{1}{5}\begin{pmatrix} 2 \\ 1 \\ 0 \end{pmatrix}. \qquad (14.5.5)$$

The matrix on the right is almost the identity matrix, save the +1/5 in the second entry of the first row, whose effect upon multiplying the two matrices precisely undoes what we did to the second column of the first matrix. For the third column of M we use Gram-Schmidt to deduce the third orthogonal vector

$$\begin{pmatrix} -\frac{1}{6} \\ \frac{1}{3} \\ -\frac{7}{6} \end{pmatrix} = \begin{pmatrix} 1 \\ -2 \\ -2 \end{pmatrix} - 0\begin{pmatrix} 2 \\ 1 \\ 0 \end{pmatrix} - \frac{-9}{\frac{54}{5}}\begin{pmatrix} -\frac{7}{5} \\ \frac{14}{5} \\ 1 \end{pmatrix}, \qquad (14.5.6)$$

and, recording this step in the same way,

$$M = \begin{pmatrix} 2 & -\frac{7}{5} & -\frac{1}{6} \\ 1 & \frac{14}{5} & \frac{1}{3} \\ 0 & 1 & -\frac{7}{6} \end{pmatrix}\begin{pmatrix} 1 & \frac{1}{5} & 0 \\ 0 & 1 & -\frac{5}{6} \\ 0 & 0 & 1 \end{pmatrix}. \qquad (14.5.7)$$
This is not quite the answer because the first matrix is now made of mutually orthogonal column vectors, but a bona fide
orthogonal matrix is comprised of orthonormal vectors. To achieve that we divide each column of the first matrix by its
length and multiply the corresponding row of the second matrix by the same amount:
$$M = \begin{pmatrix} \frac{2\sqrt{5}}{5} & -\frac{7\sqrt{30}}{90} & -\frac{\sqrt{6}}{18} \\ \frac{\sqrt{5}}{5} & \frac{7\sqrt{30}}{45} & \frac{\sqrt{6}}{9} \\ 0 & \frac{\sqrt{30}}{18} & -\frac{7\sqrt{6}}{18} \end{pmatrix}\begin{pmatrix} \sqrt{5} & \frac{\sqrt{5}}{5} & 0 \\ 0 & \frac{3\sqrt{30}}{5} & -\frac{\sqrt{30}}{2} \\ 0 & 0 & \frac{\sqrt{6}}{2} \end{pmatrix} = QR. \qquad (14.5.8)$$
A nice check of this result is to verify that entry (i, j) of the matrix R equals the dot product of the i-th column of Q with the j-th column of M. (Some people memorize this fact and use it as a recipe for computing QR decompositions.)
A good test of your own understanding is to work out why this is true!
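Numerically, a QR decomposition and the check described above can be carried out with NumPy; note that np.linalg.qr may choose different signs for the columns of Q (and rows of R) than the hand computation. A minimal sketch (ours):

```python
import numpy as np

M = np.array([[2., -1.,  1.],
              [1.,  3., -2.],
              [0.,  1., -2.]])

Q, R = np.linalg.qr(M)
print(np.allclose(Q @ R, M))             # True
print(np.allclose(Q.T @ Q, np.eye(3)))   # the columns of Q are orthonormal

# The check above: entry (i, j) of R is (i-th column of Q) . (j-th column of M).
print(np.allclose(Q.T @ M, R))
```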
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 14.5: QR Decomposition is shared under a not declared license and was authored, remixed, and/or curated by David Cherney,
Tom Denton, & Andrew Waldron.
14.6: Orthogonal Complements
Let U and V be subspaces of a vector space W . In review exercise 6 you are asked to show that U ∩ V is a subspace of W , and
that U ∪ V is not a subspace. However, span(U ∪ V ) is certainly a subspace, since the span of any subset of a vector space is a
subspace. Notice that all elements of span(U ∪ V ) take the form u + v with u ∈ U and v ∈ V . We call the subspace
U + V := span(U ∪ V ) = {u + v|u ∈ U , v ∈ V } the sum of U and V . Here, we are not adding vectors, but vector spaces to
produce a new vector space!
Definition
If U and V are subspaces of W satisfying

$$U \cap V = \{0_W\}, \qquad (14.6.1)$$

then the sum U + V is called the direct sum of U and V, written U ⊕ V.

Remark: When U ∩ V = {0_W}, U + V = U ⊕ V.
The direct sum has a very nice property:
Theorem
If w ∈ U ⊕ V then the expression w = u + v is unique. That is, there is only one way to write w as the sum of a vector in U
and a vector in V .
Proof
Suppose that u + v = u' + v', with u, u' ∈ U and v, v' ∈ V. Then we could express 0 = (u − u') + (v − v'), so (u − u') = −(v − v'). Since U and V are subspaces, we have (u − u') ∈ U and −(v − v') ∈ V. But since these elements are equal, we also have (u − u') ∈ V. Hence (u − u') ∈ U ∩ V = {0_W}, so u = u', and then v = v' as well; the expression for w is unique. □
Given a subspace U in W, how can we write W as the direct sum of U and something? There is not a unique answer to this question, as can be seen from this picture of subspaces in W = R³:
Definition
Given a subspace U of a vector space W, define

$$U^\perp = \{w \in W \mid w \cdot u = 0 \text{ for all } u \in U\}.$$

Remark: The set U^⊥ (pronounced "U-perp") is the set of all vectors in W orthogonal to every vector in U. This is also often called the orthogonal complement of U.
Example 14.6.1:
Theorem
Let U be a subspace of a finite-dimensional vector space W. Then the set U^⊥ is a subspace of W, and W = U ⊕ U^⊥.

Proof
To see that U^⊥ is a subspace, we check closure under linear combinations. Suppose v, w ∈ U^⊥, so that u·v = 0 and u·w = 0 for every u ∈ U. Hence

$$u \cdot (\alpha v + \beta w) = \alpha\, u \cdot v + \beta\, u \cdot w = 0 \quad (\forall u \in U),$$

and so αv + βw ∈ U^⊥.

Next, U ∩ U^⊥ = {0_W}, because the only vector orthogonal to itself is the zero vector:

$$u \cdot u = 0 \Leftrightarrow u = 0. \qquad (14.6.5)$$

Finally, given any w ∈ W, choose an orthonormal basis (e_1, …, e_n) for U and set

$$u = (w \cdot e_1)e_1 + \cdots + (w \cdot e_n)e_n \in U, \qquad u^\perp = w - u.$$

Then u^⊥ is orthogonal to each e_i, so u^⊥ ∈ U^⊥, and w = u + u^⊥. Hence every vector of W is the sum of a vector in U and a vector in U^⊥, so W = U ⊕ U^⊥. □
Example 14.6.1:
Consider any line L through the origin in R⁴. Then L is a subspace, and L^⊥ is a 3-dimensional subspace orthogonal to L. For example, let L be the line spanned by the vector (1, 1, 1, 1) ∈ R⁴. Then L^⊥ is given by

$$L^\perp = \{(x, y, z, w) \mid x, y, z, w \in \mathbb{R} \text{ and } (x, y, z, w) \cdot (1, 1, 1, 1) = 0\}$$
$$= \{(x, y, z, w) \mid x, y, z, w \in \mathbb{R} \text{ and } x + y + z + w = 0\}.$$
It is easy to check that the vectors

$$v_1 = \begin{pmatrix} 1 \\ -1 \\ 0 \\ 0 \end{pmatrix}, \quad v_2 = \begin{pmatrix} 1 \\ 0 \\ -1 \\ 0 \end{pmatrix}, \quad v_3 = \begin{pmatrix} 1 \\ 0 \\ 0 \\ -1 \end{pmatrix}$$

form a basis for L^⊥. We use Gram-Schmidt to find an orthogonal basis for L^⊥:
First, we set v_1^⊥ = v_1. Then

$$v_2^\perp = \begin{pmatrix} 1 \\ 0 \\ -1 \\ 0 \end{pmatrix} - \frac{1}{2}\begin{pmatrix} 1 \\ -1 \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} \frac{1}{2} \\ \frac{1}{2} \\ -1 \\ 0 \end{pmatrix},$$

$$v_3^\perp = \begin{pmatrix} 1 \\ 0 \\ 0 \\ -1 \end{pmatrix} - \frac{1}{2}\begin{pmatrix} 1 \\ -1 \\ 0 \\ 0 \end{pmatrix} - \frac{1/2}{3/2}\begin{pmatrix} \frac{1}{2} \\ \frac{1}{2} \\ -1 \\ 0 \end{pmatrix} = \begin{pmatrix} \frac{1}{3} \\ \frac{1}{3} \\ \frac{1}{3} \\ -1 \end{pmatrix}.$$

So the set

$$\left\{(1, -1, 0, 0),\; \Big(\tfrac{1}{2}, \tfrac{1}{2}, -1, 0\Big),\; \Big(\tfrac{1}{3}, \tfrac{1}{3}, \tfrac{1}{3}, -1\Big)\right\} \qquad (14.6.9)$$

is an orthogonal basis for L^⊥. We find an orthonormal basis for L^⊥ by dividing each basis vector by its length:

$$\left\{\Big(\tfrac{1}{\sqrt{2}}, -\tfrac{1}{\sqrt{2}}, 0, 0\Big),\; \Big(\tfrac{1}{\sqrt{6}}, \tfrac{1}{\sqrt{6}}, -\tfrac{2}{\sqrt{6}}, 0\Big),\; \Big(\tfrac{\sqrt{3}}{6}, \tfrac{\sqrt{3}}{6}, \tfrac{\sqrt{3}}{6}, -\tfrac{\sqrt{3}}{2}\Big)\right\}. \qquad (14.6.10)$$
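An orthonormal basis for the orthogonal complement can also be obtained numerically; one convenient (assumed) route is to read a basis of the null space of the row vector (1, 1, 1, 1) off its SVD. A minimal NumPy sketch (ours):

```python
import numpy as np

# L is the line spanned by (1, 1, 1, 1); L-perp is the null space of [1 1 1 1].
A = np.array([[1., 1., 1., 1.]])
_, _, Vt = np.linalg.svd(A)
basis = Vt[1:]     # the last three rows span the null space and are orthonormal

print(np.allclose(basis @ np.ones(4), 0))       # each vector is orthogonal to (1,1,1,1)
print(np.allclose(basis @ basis.T, np.eye(3)))  # and the basis is orthonormal
```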
Moreover, we have

$$\mathbb{R}^4 = L \oplus L^\perp = \{(c, c, c, c) \mid c \in \mathbb{R}\} \oplus \{(x, y, z, w) \mid x, y, z, w \in \mathbb{R},\; x + y + z + w = 0\}. \qquad (14.6.11)$$

Notice that for any subspace U, the subspace (U^⊥)^⊥ is just U again. As such, ⊥ is an involution on the set of subspaces of a vector space. (An involution is any mathematical operation which performed twice does nothing.)
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 14.6: Orthogonal Complements is shared under a not declared license and was authored, remixed, and/or curated by David
Cherney, Tom Denton, & Andrew Waldron.
14.7: Review Problems
1. Let D = $\begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix}$.

b) Suppose P = $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is invertible. Show that D is similar to

$$M = \frac{1}{ad - bc}\begin{pmatrix} \lambda_1 ad - \lambda_2 bc & -(\lambda_1 - \lambda_2)ab \\ (\lambda_1 - \lambda_2)cd & -\lambda_1 bc + \lambda_2 ad \end{pmatrix}. \qquad (14.7.1)$$

c) Suppose the vectors (a, b) and (c, d) are orthogonal. What can you say about M in this case? (Hint: think about what M^T is equal to.)

2. Suppose S = {v_1, …, v_n} is an orthogonal (not orthonormal) basis for R^n. Then we can write any vector v as v = ∑_i c^i v_i for some constants c^i. Find a formula for the constants c^i in terms of v and the vectors in S.

3. Let u and v be linearly independent vectors in R³, and let P = span{u, v} be the plane spanned by u and v.
(a) Is the vector v^⊥ := v − (u·v)/(u·u) u in the plane P?
(b) What is the (cosine of the) angle between v^⊥ and u?
(c) How can you find a third vector perpendicular to both u and v^⊥?
(e) Test your abstract formulæ starting with u = (1, 2, 0) and v = (0, 1, 1).

4. Find an orthonormal basis for R⁴ which includes (1, 1, 1, 1) using the following procedure:
(a) Pick a vector perpendicular to the vector

$$v_1 = \begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \end{pmatrix} \qquad (14.7.2)$$

from the solution set of the matrix equation

$$v_1^T x = 0. \qquad (14.7.3)$$

Pick the vector v_2 obtained from the standard Gaussian elimination procedure which is the coefficient of x_2.
(b) Pick a vector perpendicular to both v_1 and v_2 from the solution set of the matrix equation

$$\begin{pmatrix} v_1^T \\ v_2^T \end{pmatrix} x = 0. \qquad (14.7.4)$$

Pick the vector v_3 obtained from the standard Gaussian elimination procedure with x_3 as the coefficient.
(c) Pick a vector perpendicular to v_1, v_2, and v_3 from the solution set of the matrix equation

$$\begin{pmatrix} v_1^T \\ v_2^T \\ v_3^T \end{pmatrix} x = 0. \qquad (14.7.5)$$

Pick the vector v_4 obtained from the standard Gaussian elimination procedure with x_4 as the coefficient.
5. Use the inner product

$$f \cdot g := \int_0^1 f(x)g(x)\,dx \qquad (14.7.6)$$

on the vector space V = span{1, x, x², x³} to perform the Gram-Schmidt procedure on the set of vectors {1, x, x², x³}.

6. Use the inner product on the vector space V = span{sin(x), sin(2x), sin(3x)} to perform the Gram-Schmidt procedure on the set of vectors {sin(x), sin(2x), sin(3x)}.
What do you suspect about the vector space span{sin(nx) | n ∈ N}?
What do you suspect about the vector space span{sin(ax) | a ∈ R}?
7.
a. Show that if Q is an orthogonal n × n matrix, then

$$u \cdot v = (Qu) \cdot (Qv) \qquad (14.7.7)$$

for any u, v ∈ R^n.
d. How does Q change this matrix? How do the eigenvectors and eigenvalues change?

8. Carefully write out the Gram-Schmidt procedure for the set of vectors

$$\left\{\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}, \begin{pmatrix} 1 \\ -1 \\ 1 \end{pmatrix}, \begin{pmatrix} 1 \\ 1 \\ -1 \end{pmatrix}\right\}. \qquad (14.7.8)$$

Are you free to rescale the second vector obtained in the procedure to a vector with integer components?

9.
a) Suppose u and v are linearly independent. Show that u and v^⊥ are also linearly independent. Explain why {u, v^⊥} is a basis for span{u, v}.
⎜ −1 2 0⎟ (14.7.9)
⎝ ⎠
−1 −2 2
.\]
11. Given any three vectors u, v, w, when do v^⊥ or w^⊥ of the Gram-Schmidt procedure vanish?
13. Let S_n denote the vector space of symmetric n × n matrices and A_n the vector space of anti-symmetric n × n matrices, both viewed as subspaces of the vector space M_n of all n × n matrices. What is dim M_n, dim S_n, and dim A_n? Show that M_n = S_n + A_n. Is A_n = S_n^⊥? Is M_n = S_n ⊕ A_n?
14. The vector space V = span{sin(t), sin(2t), sin(3t)} has an inner product:

$$f \cdot g := \int_0^{2\pi} f(t)g(t)\,dt. \qquad (14.7.10)$$

Find the orthogonal complement to U = span{sin(t) + sin(2t)} in V. Express sin(t) − sin(2t) as the sum of vectors from U and U^⊥.
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 14.7: Review Problems is shared under a not declared license and was authored, remixed, and/or curated by David Cherney, Tom
Denton, & Andrew Waldron.
CHAPTER OVERVIEW
15: Diagonalizing Symmetric Matrices
Symmetric matrices arise naturally in many applications. For example, a table of distances between cities is symmetric, because the distance from one city to another does not depend on the direction of travel:

               Davis   Seattle   San Francisco
Davis             0      2000          80
Seattle        2000         0        2010          (15.1)
San Francisco    80      2010           0

This information is encoded in the symmetric matrix

$$M = \begin{pmatrix} 0 & 2000 & 80 \\ 2000 & 0 & 2010 \\ 80 & 2010 & 0 \end{pmatrix} = M^T. \qquad (15.2)$$
One nice property of symmetric matrices is that they always have real eigenvalues. Review exercise 1 guides you through the
general proof, but here's an example for 2 × 2 matrices:
Example 15.1:
For a general symmetric 2 × 2 matrix $\begin{pmatrix} a & b \\ b & d \end{pmatrix}$, the characteristic polynomial is

$$\lambda^2 - (a + d)\lambda - b^2 + ad,$$

so

$$\lambda = \frac{a + d}{2} \pm \sqrt{b^2 + \left(\frac{a - d}{2}\right)^2}.$$

The quantity under the square root is never negative, so the eigenvalues are real.

Now suppose M = M^T is a symmetric matrix with two distinct eigenvalues λ ≠ μ and corresponding eigenvectors x and y, so that Mx = λx and My = μy. Computing x^T M y in two ways,

$$x^T M y = x^T(\mu y) = \mu\, x \cdot y,$$

while, using the symmetry of M,

$$x^T M y = (Mx)^T y = (\lambda x)^T y = \lambda\, x \cdot y.$$

Subtracting, (λ − μ) x·y = 0. Since μ and λ were assumed to be distinct eigenvalues, λ − μ is non-zero, and so x·y = 0. We have proved the following theorem.
Theorem
Eigenvectors of a symmetric matrix with distinct eigenvalues are orthogonal.
Example 15.2:
The matrix M = $\begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$ has eigenvalues determined by

$$\det(M - \lambda I) = (2 - \lambda)^2 - 1 = 0. \qquad (15.5)$$

So the eigenvalues of M are 3 and 1, and the associated eigenvectors turn out to be $\begin{pmatrix} 1 \\ 1 \end{pmatrix}$ and $\begin{pmatrix} 1 \\ -1 \end{pmatrix}$. It is easily seen that

$$\begin{pmatrix} 1 \\ 1 \end{pmatrix} \cdot \begin{pmatrix} 1 \\ -1 \end{pmatrix} = 0; \qquad (15.6)$$

these eigenvectors are indeed orthogonal.
In chapter 14 we saw that the matrix P built from any orthonormal basis (v_1, …, v_n) for R^n as its columns,

$$P = (v_1 \cdots v_n), \qquad (15.7)$$

is an orthogonal matrix: P^{-1} = P^T. Moreover, given any (unit) vector x_1, one can always find vectors x_2, …, x_n such that (x_1, …, x_n) is an orthonormal basis. (Such a basis can be obtained using the Gram-Schmidt procedure.)

Now suppose M is a symmetric n × n matrix and λ_1 is an eigenvalue with eigenvector x_1 (this is always the case because every matrix has at least one eigenvalue; see review problem 3). Let the square matrix of column vectors P be the following:

$$P = (x_1\; x_2\; \cdots\; x_n), \qquad (15.9)$$

where x_1 through x_n are orthonormal, and x_1 is an eigenvector for M, but the others are not necessarily eigenvectors for M. Then

$$MP = (\lambda_1 x_1\; Mx_2\; \cdots\; Mx_n). \qquad (15.10)$$
But P is an orthogonal matrix, so

$$P^{-1} = P^T = \begin{pmatrix} x_1^T \\ \vdots \\ x_n^T \end{pmatrix}$$

$$\Rightarrow\; P^T M P = \begin{pmatrix} x_1^T \lambda_1 x_1 & * & \cdots & * \\ x_2^T \lambda_1 x_1 & * & \cdots & * \\ \vdots & \vdots & & \vdots \\ x_n^T \lambda_1 x_1 & * & \cdots & * \end{pmatrix} = \begin{pmatrix} \lambda_1 & * & \cdots & * \\ 0 & * & \cdots & * \\ \vdots & * & & \vdots \\ 0 & * & \cdots & * \end{pmatrix} = \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & & & \\ \vdots & & \hat{M} & \\ 0 & & & \end{pmatrix}.$$

The last equality holds because P^T M P is symmetric, so its first row must match its first column. The matrix M̂ is an (n−1) × (n−1) symmetric matrix, so the same argument applies to it in turn; repeating the procedure shows that M is similar to
a diagonal matrix with eigenvalues of M on the diagonal. Again, we have proved a theorem:
Theorem
Every symmetric matrix is similar to a diagonal matrix of its eigenvalues. In other words,

$$M = M^T \;\Leftrightarrow\; M = PDP^T, \qquad (15.11)$$

where P is an orthogonal matrix and D is a diagonal matrix whose entries are the eigenvalues of M.
To diagonalize a real symmetric matrix, begin by building an orthogonal matrix from an orthonormal basis of eigenvectors:
Example 15.3:
The symmetric matrix M = $\begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$, discussed above, has eigenvalues 3 and 1 with eigenvectors $\begin{pmatrix} 1 \\ 1 \end{pmatrix}$ and $\begin{pmatrix} 1 \\ -1 \end{pmatrix}$ respectively. After normalizing these eigenvectors, we build the orthogonal matrix:

$$P = \begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & \frac{-1}{\sqrt{2}} \end{pmatrix}. \qquad (15.13)$$

Notice that P^T P = I. Then:

$$MP = \begin{pmatrix} \frac{3}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ \frac{3}{\sqrt{2}} & \frac{-1}{\sqrt{2}} \end{pmatrix} = \begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & \frac{-1}{\sqrt{2}} \end{pmatrix}\begin{pmatrix} 3 & 0 \\ 0 & 1 \end{pmatrix}. \qquad (15.14)$$

In short, MP = PD, so D = P^T M P is the diagonal matrix of eigenvalues of M.
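Numerically, symmetric matrices are diagonalized with np.linalg.eigh, which returns real eigenvalues and an orthogonal matrix of eigenvectors. A minimal sketch (ours) for the matrix of this example:

```python
import numpy as np

M = np.array([[2., 1.],
              [1., 2.]])

evals, P = np.linalg.eigh(M)      # specialized routine for symmetric matrices
D = np.diag(evals)

print(evals)                              # [1. 3.]
print(np.allclose(P.T @ P, np.eye(2)))    # P is orthogonal
print(np.allclose(P @ D @ P.T, M))        # M = P D P^T
```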
15.1: Review Problems
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 15: Diagonalizing Symmetric Matrices is shared under a not declared license and was authored, remixed, and/or curated by David
Cherney, Tom Denton, & Andrew Waldron.
15.1: Review Problems
1. (On Reality of Eigenvalues)

a) Suppose z = x + iy where x, y ∈ R, i = √−1, and z̄ = x − iy. Compute zz̄ and z̄z in terms of x and y. What kind of numbers are zz̄ and z̄z? (The complex number z̄ is called the complex conjugate of z.)

b) Suppose that λ = x + iy is a complex number with x, y ∈ R, and that λ̄ = λ. Does this determine the value of x or y? What kind of number must λ be?

c) Let x = $\begin{pmatrix} z_1 \\ \vdots \\ z_n \end{pmatrix}$ ∈ C^n. Let x^† = (z̄_1 ⋯ z̄_n) ∈ C^n (a 1 × n complex matrix or a row vector). Compute x^† x. Using the result of part 1a, what can you say about the number x^† x? (E.g., is it real, imaginary, positive, negative, etc.)

d) Suppose M = M^T is an n × n symmetric matrix with real entries. Let λ be an eigenvalue of M with eigenvector x, so Mx = λx. Compute:

$$\frac{x^\dagger M x}{x^\dagger x}. \qquad (15.1.1)$$

$$x_1 = \begin{pmatrix} a \\ b \\ c \end{pmatrix}, \qquad (15.1.2)$$

$$\alpha_0 v + \alpha_1 Lv + \alpha_2 L^2 v + \cdots + \alpha_n L^n v = 0. \qquad (15.1.3)$$

c) Let m be the largest integer such that α_m ≠ 0 and

$$p(z) = \alpha_0 + \alpha_1 z + \alpha_2 z^2 + \cdots + \alpha_m z^m.$$

Explain why the polynomial p(z) can be written as

$$p(z) = \alpha_m (z - \lambda_1)(z - \lambda_2)\cdots(z - \lambda_m). \qquad (15.1.4)$$
4. (Dimensions of Eigenspaces)
a) Let

$$A = \begin{pmatrix} 4 & 0 & 0 \\ 0 & 2 & -2 \\ 0 & -2 & 2 \end{pmatrix}. \qquad (15.1.6)$$

Find all eigenvalues of A.
b) Find a basis for each eigenspace of A. What is the sum of the dimensions of the eigenspaces of A?
c) Based on your answer to the previous part, guess a formula for the sum of the dimensions of the eigenspaces of a real n × n symmetric matrix. Explain why your formula must work for any real n × n symmetric matrix.

5. If M is not square then it can not be symmetric. However, MM^T and M^T M are symmetric, and therefore diagonalizable.
a) Is it the case that all of the eigenvalues of MM^T must also be eigenvalues of M^T M?
b) Given an eigenvector of MM^T, how can you obtain an eigenvector of M^T M?
c) Let

$$M = \begin{pmatrix} 1 & 2 \\ 3 & 3 \\ 2 & 1 \end{pmatrix}. \qquad (15.1.7)$$

Compute an orthonormal basis of eigenvectors for both MM^T and M^T M. If any of the eigenvalues for these two matrices agree, choose an order for them and use it to help order your orthonormal bases. Finally, change the input and output bases for the matrix M to these ordered orthonormal bases. Comment on what you find. (Hint: The result is called the Singular Value Decomposition Theorem.)
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 15.1: Review Problems is shared under a not declared license and was authored, remixed, and/or curated by David Cherney, Tom
Denton, & Andrew Waldron.
CHAPTER OVERVIEW
16: Kernel, Range, Nullity, Rank
Given a linear transformation

$$L: V \to W, \qquad (16.1)$$

we would like to know whether it has an inverse, i.e., whether there is a linear transformation

$$M: W \to V \qquad (16.2)$$

such that for any vector v ∈ V we have

$$MLv = v, \qquad (16.3)$$

and for any vector w ∈ W

$$LMw = w. \qquad (16.4)$$
A linear transformation is just a special kind of function from one vector space to another. So before we discuss which linear
transformations have inverses, let us first discuss inverses of arbitrary functions. When we later specialize to linear transformations,
we'll also find some nice ways of creating subspaces.
Let f : S → T be a function from a set S to a set T . Recall that S is called the domain of f , T is called the codomain or target
of f , and the set
ran(f ) = im(f ) = f (S) = {f (s)|s ∈ S} ⊂ T , (16.5)
is called the range or image of f . The image of f is the set of elements of T to which the function f maps, i. e. , the things in T
which you can get to by starting in S and applying f . We can also talk about the pre-image of any subset U ⊂ T :
$$f^{-1}(U) = \{s \in S \mid f(s) \in U\} \subset S. \qquad (16.6)$$

For the function f: S → T, S is the domain, T is the target, f(S) is the image/range and f^{-1}(U) is the pre-image of U ⊂ T.
The function f is one-to-one if different elements in S always map to different elements in T . That is, f is one-to-one if for any
elements x ≠ y ∈ S, we have that f (x) ≠ f (y) :
One-to-one functions are also called injective functions. Notice that injectivity is a condition on the pre-images of f .
The function f is onto if every element of T is mapped to by some element of S. That is, f is onto if for any t ∈ T, there exists some s ∈ S such that f(s) = t. Onto functions are also called surjective functions. Notice that surjectivity is a condition on the image of f. A function that is both injective and surjective is called bijective.
Theorem
A function f: S → T has an inverse function g: T → S if and only if it is bijective.
Proof
This is an "if and only if" statement so the proof has two parts:
1. (Existence of an inverse ⇒ bijective.)
a) Suppose that f has an inverse function g. We need to show f is bijective, which we break down into injective and surjective: The function f is injective: Suppose that we have s, s' ∈ S such that f(s) = f(s'). We must have that g(f(s)) = s for any s ∈ S, so in particular g(f(s)) = s and g(f(s')) = s'. But since f(s) = f(s'), we have g(f(s)) = g(f(s')), so s = s'. Therefore, f is injective.
b) The function f is surjective: Let t be any element of T. We must have that f(g(t)) = t. Thus, g(t) is an element of S that maps to t, so f is surjective.
2. (Bijective ⇒ existence of an inverse.)
Suppose that f is bijective. Hence f is surjective, so every element t ∈ T has at least one pre-image. Being bijective, f is also injective, so every t has no more than one pre-image. Therefore, to construct an inverse function g, we simply define g(t) to be the unique pre-image f^{-1}(t) of t.
Now let us specialize to functions f that are linear maps between two vector spaces. Everything we said above for arbitrary functions is exactly the same for linear functions. However, the structure of vector spaces lets us say much more about one-to-one and onto functions whose domains are vector spaces than we can say about functions on general sets. For example, we know that a linear function always sends 0_V to 0_W, i.e.,

$$f(0_V) = 0_W. \qquad (16.7)$$

In review exercise 3, you will show that a linear transformation is one-to-one if and only if 0_V is the only vector that is sent to 0_W:
In contrast to arbitrary functions between sets, by looking at just one (very special) vector, we can figure out whether f is one-to-one!

Let L: V → W be a linear transformation. Suppose L is not injective. Then we can find v_1 ≠ v_2 such that Lv_1 = Lv_2. So v_1 − v_2 ≠ 0, but

$$L(v_1 - v_2) = 0. \qquad (16.8)$$

Definition
Let L: V → W be a linear transformation. The set of all vectors v ∈ V with Lv = 0_W is called the kernel of L:

$$\ker L = \{v \in V \mid Lv = 0_W\}. \qquad (16.9)$$

Theorem
A linear transformation L is injective if and only if

$$\ker L = \{0_V\}.$$

Proof
The proof of this theorem is review exercise 2.
Notice that if L has matrix M in some basis, then finding the kernel of L is equivalent to solving the homogeneous system
M X = 0. (16.10)
Example 16.1:
Let L(x, y) = (x + y, x + 2y, y). Is L one-to-one? To find out, solve the homogeneous system L(x, y) = 0, whose augmented matrix row reduces as

$$\begin{pmatrix} 1 & 1 & 0 \\ 1 & 2 & 0 \\ 0 & 1 & 0 \end{pmatrix} \sim \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}.$$

Then all solutions of MX = 0 are of the form x = y = 0. In other words, ker L = {0}, and so L is injective.
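The same kernel computation can be phrased numerically: the dimension of the kernel is the number of columns minus the rank. A minimal NumPy sketch (ours) for the matrix of this example:

```python
import numpy as np

# Matrix of L(x, y) = (x + y, x + 2y, y) in the standard bases.
M = np.array([[1., 1.],
              [1., 2.],
              [0., 1.]])

rank = np.linalg.matrix_rank(M)
print(rank)                    # 2
print(M.shape[1] - rank)       # 0 -> ker L = {0}, so L is injective
```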
Theorem
Let L: V → W be a linear transformation. Then ker L is a subspace of V.

Proof
Notice that if L(v) = 0 and L(u) = 0, then for any constants c, d, L(cu + dv) = 0. Then by the subspace theorem, the kernel of L is a subspace of V.

Example 16.2:
Let L: R³ → R be the linear transformation defined by L(x, y, z) = x + y + z. Then ker L consists of all vectors (x, y, z) ∈ R³ such that x + y + z = 0. Therefore, the set

$$V = \{(x, y, z) \in \mathbb{R}^3 \mid x + y + z = 0\} \qquad (16.13)$$

is a subspace of R³.
When L : V → V , the above theorem has an interpretation in terms of the eigenspaces of L : Suppose L has a zero eigenvalue.
Then the associated eigenspace consists of all vectors v such that Lv = 0v = 0 ; in other words, the 0 -eigenspace of L is exactly
the kernel of L .
In the example where L(x, y) = (x + y, x + 2y, y), the map L is clearly not surjective, since L maps R² to a plane through the origin in R³. But any plane through the origin is a subspace. In general, notice that if w = L(v) and w' = L(v'), then for any constants c, d, linearity of L ensures that cw + dw' = L(cv + dv') is also in the image of L.
Now the subspace theorem strikes again, and we have the following theorem:
Theorem
Let L: V → W . Then the image L(V ) is a subspace of W .
Example 16.3:
Let L(x, y) = (x + y, x + 2y, y). The image of L is a plane through the origin and thus a subspace of R³. Indeed the matrix of L in the standard bases is

$$\begin{pmatrix} 1 & 1 \\ 1 & 2 \\ 0 & 1 \end{pmatrix}. \qquad (16.15)$$

The columns of this matrix encode the possible outputs of the function L because

$$L(x, y) = \begin{pmatrix} 1 & 1 \\ 1 & 2 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = x\begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix} + y\begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix}. \qquad (16.16)$$

Thus

$$L(\mathbb{R}^2) = \text{span}\left\{\begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix}\right\}. \qquad (16.17)$$

Hence, when bases and a linear transformation are given, people often refer to its image as the column space of the corresponding matrix.
To find a basis of the image of L, we can start with a basis S = {v_1, …, v_n} for V. Then the most general input for L is of the form α¹v_1 + ⋯ + αⁿv_n. In turn, its most general output looks like

$$L(\alpha^1 v_1 + \cdots + \alpha^n v_n) = \alpha^1 Lv_1 + \cdots + \alpha^n Lv_n \in \text{span}\{Lv_1, \ldots, Lv_n\}. \qquad (16.18)$$

Thus

$$L(V) = \text{span}\, L(S) = \text{span}\{Lv_1, \ldots, Lv_n\}.$$

However, the set {Lv_1, …, Lv_n} may not be linearly independent; we must solve

$$c^1 Lv_1 + \cdots + c^n Lv_n = 0 \qquad (16.20)$$

to determine whether it is. By finding relations amongst the elements of L(S) = {Lv_1, …, Lv_n}, we can discard vectors until a basis is arrived at. The size of this basis is the dimension of the image of L, which is known as the rank of L.

Definition
The rank of a linear transformation L is the dimension of its image, written rank L. The nullity of a linear transformation is the dimension of its kernel, written nul L.

Theorem (Dimension Formula)
Let L: V → W be a linear transformation, with V a finite-dimensional vector space. Then:

$$\dim V = \dim \ker L + \dim L(V) = \text{nul}\, L + \text{rank}\, L.$$
Proof
Pick a basis for V:

$$\{v_1, \ldots, v_p, u_1, \ldots, u_q\}, \qquad (16.23)$$

where v_1, …, v_p is also a basis for ker L. This can always be done, for example, by finding a basis for the kernel of L and then extending to a basis for V. Then p = nul L and p + q = dim V. Then we need to show that q = rank L. To accomplish this, we show that {L(u_1), …, L(u_q)} is a basis for L(V).

To see that {L(u_1), …, L(u_q)} spans L(V), consider any vector w in L(V). Then we can find constants c^i, d^j such that:

$$w = L(c^1 v_1 + \cdots + c^p v_p + d^1 u_1 + \cdots + d^q u_q) = c^1 L(v_1) + \cdots + c^p L(v_p) + d^1 L(u_1) + \cdots + d^q L(u_q) = d^1 L(u_1) + \cdots + d^q L(u_q),$$

since L(v_i) = 0. Hence L(V) = span{L(u_1), …, L(u_q)}.

Now we show that {L(u_1), …, L(u_q)} is linearly independent. We argue by contradiction: Suppose there exist constants d^j (not all zero) such that

$$0 = d^1 L(u_1) + \cdots + d^q L(u_q) = L(d^1 u_1 + \cdots + d^q u_q).$$

But since the u^j are linearly independent, then d^1 u_1 + ⋯ + d^q u_q ≠ 0, and so d^1 u_1 + ⋯ + d^q u_q is in the kernel of L. But then d^1 u_1 + ⋯ + d^q u_q must be in the span of {v_1, …, v_p}, since this was a basis for the kernel. This contradicts the assumption that {v_1, …, v_p, u_1, …, u_q} was a basis for V, so we are done. □
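The dimension formula is easy to spot-check numerically for matrices, since rank and nullity can both be read off the rank. A minimal NumPy sketch (ours), with an arbitrarily chosen example matrix:

```python
import numpy as np

# This matrix has a one-dimensional kernel (its second row is twice the first).
M = np.array([[1., 2., 3.],
              [2., 4., 6.],
              [1., 0., 1.]])

n = M.shape[1]                       # dimension of the domain
rank = np.linalg.matrix_rank(M)
nullity = n - rank

print(rank, nullity, rank + nullity == n)   # 2 1 True
```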
16.1: Summary
16.2: Review Problems
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 16: Kernel, Range, Nullity, Rank is shared under a not declared license and was authored, remixed, and/or curated by David
Cherney, Tom Denton, & Andrew Waldron.
16.1: Summary
We have seen that a linear transformation has an inverse if and only if it is bijective (i.e. , one-to-one and onto). We also know that
linear transformations can be represented by matrices, and we have seen many ways to tell whether a matrix is invertible. Here is a
list of them:
Theorem (Invertibility)
Let M be an n × n matrix, and let
n n
L: R → R (16.1.1)
be the linear transformation defined by L(v) = M v . Then the following statements are equivalent:
Proof
Many of these equivalences were proved earlier in other chapters. Some were left as review questions or sample final
questions. The rest are left as exercises for the reader.
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 16.1: Summary is shared under a not declared license and was authored, remixed, and/or curated by David Cherney, Tom Denton,
& Andrew Waldron.
16.2: Review Problems
1. Consider an arbitrary matrix M: R^m → R^n.
b) Argue that Mx = 0 if and only if x is perpendicular to all of the linear combinations of the columns of M^T.
d) Argue further that R^m = ker M ⊕ ran M^T.
e) Argue analogously that R^n = ker M^T ⊕ ran M.
The equations in the last two parts describe how a linear transformation M: R^m → R^n determines orthogonal decompositions of both its domain and target. This result sometimes goes by the humble name The Fundamental Theorem of Linear Algebra.
2. Let L: V → W be a linear transformation. Show that ker L = {0_V} if and only if L is one-to-one:
a) (Trivial kernel ⇒ injective.) Suppose that ker L = {0_V}. Show that L is one-to-one. Think about methods of proof: does a direct proof or a proof by contradiction seem more appropriate here?
4. Suppose L: R⁴ → R³ is a linear transformation whose matrix M in the standard basis is row equivalent to the following matrix:

$$\begin{pmatrix} 1 & 0 & 0 & -1 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 1 \end{pmatrix} = \text{RREF}(M) \sim M. \qquad (16.2.2)$$

a) Explain why the first three columns of the original matrix M form a basis for L(R⁴).
b) Find and describe an algorithm (i.e., a general procedure) for computing a basis for L(R^n) when L: R^n → R^m.
c) Use your algorithm to find a basis for L(R⁴) when L: R⁴ → R³ is the linear transformation whose matrix M in the standard basis is

$$\begin{pmatrix} 2 & 1 & 1 & 4 \\ 0 & 1 & 0 & 5 \\ 4 & 1 & 1 & 6 \end{pmatrix}. \qquad (16.2.3)$$
5. Claim: If {v_1, …, v_n} is a basis for ker L, where L: V → W, then it is always possible to extend this set to a basis for V.
Choose some simple yet non-trivial linear transformations with non-trivial kernels and verify the above claim for those transformations.

6. Let P_n(x) be the space of polynomials in x of degree less than or equal to n, and consider the derivative operator

$$\frac{d}{dx}: P_n(x) \to P_n(x). \qquad (16.2.4)$$

Find the dimension of the kernel and image of this operator. What happens if the target space is changed to P_{n−1}(x) or P_{n+1}(x)?
Now consider P_2(x, y), the space of polynomials of degree two or less in x and y. (Recall how degree is counted; xy is degree two, y is degree one and x²y is degree three, for example.) Let

$$L := \frac{\partial}{\partial x} + \frac{\partial}{\partial y}: P_2(x, y) \to P_2(x, y). \qquad (16.2.5)$$

(For example, L(xy) = ∂/∂x (xy) + ∂/∂y (xy) = y + x.) Find a basis for the kernel of L. Verify the dimension formula in this case.

7. Let's demonstrate some ways the dimension formula can break down if a vector space is infinite dimensional:
a) Let R[x] be the vector space of all polynomials in the variable x with real coefficients. Let D = d/dx be the usual derivative operator. Show that the range of D is R[x]. What is ker D?
Hint: Use the basis {x^n | n ∈ N}.

… linearly independent if and only if the kernel of M is trivial, namely the set ker M = {v ∈ B³ | Mv = 0} contains only the zero vector.
[ii.] Give some method for choosing a random bit vector v in B³. Suppose S is a collection of 2 linearly independent bit vectors in B³. How can we tell whether S ∪ {v} is linearly independent? Do you think it is likely or unlikely that S ∪ {v} is linearly independent?
… a random matrix M in such a way as to make each characteristic polynomial equally likely.) What is the probability that the columns of M form a basis for B³? (Hint: what is the relationship between the kernel of M and its eigenvalues?)
[Note:] We could ask the same question for real vectors: If I choose a real vector at random, what is the probability that it lies in the span of some other vectors? In fact, once we write down a reasonable way of choosing a random real vector, if I choose a real vector in R^n at random, the probability that it lies in the span of n − 1 other real vectors is zero!
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 16.2: Review Problems is shared under a not declared license and was authored, remixed, and/or curated by David Cherney, Tom
Denton, & Andrew Waldron.
CHAPTER OVERVIEW
17: Least Squares and Singular Values
Consider the linear system L(x) = v, where L: U → W, and v ∈ W is given. As we have seen, this system may have no solutions, a unique solution, or a space of solutions. But if v is not in the range of L, in pictures:
there will never be any solutions for L(x) = v . However, for many applications we do not need an exact solution of the system;
instead, we try to find the best approximation possible.
"My work always tried to unite the Truth with the Beautiful, but when I had to choose one
or the other, I usually chose the Beautiful.''--Hermann Weyl.
If the vector space W has a notion of lengths of vectors, we can try to find x that minimizes ||L(x) − v|| :
This method has many applications, such as when trying to fit a (perhaps linear) function to a "noisy'' set of observations. For
example, suppose we measured the position of a bicycle on a racetrack once every five seconds. Our observations won't be exact,
but so long as the observations are right on average, we can figure out a best-possible linear function of position of the bicycle in
terms of time.
Suppose M is the matrix for L in some bases for U and W , and v and x are given by column vectors V and X in these bases.
Then we need to approximate
MX − V ≈ 0 . (17.1)
Note that if dim U = n and dim W = m then M can be represented by an m × n matrix and x and v as vectors in R^n and R^m, respectively. Thus, we can write W = L(U) ⊕ L(U)^⊥. Then we can uniquely write v = v^∥ + v^⊥, with v^∥ ∈ L(U) and v^⊥ ∈ L(U)^⊥.

Thus we should solve L(u) = v^∥. In components, v^⊥ is just V − MX, and is the part we will eventually wish to minimize.

In terms of M, recall that L(U) is spanned by the columns of M. (In the standard basis, the columns of M are Me_1, …, Me_n.) Then v^⊥ must be perpendicular to the columns of M, i.e., M^T(V − MX) = 0, or

$$M^T M X = M^T V. \qquad (17.2)$$

Solutions of M^T M X = M^T V for X are called least squares solutions to MX = V. Notice that any solution X to MX = V is a least squares solution. However, the converse is often false. In fact, the equation MX = V may have no solutions at all, but still have least squares solutions to M^T M X = M^T V.
Observe that since M is an m × n matrix, then M^T is an n × m matrix. Then M^T M is an n × n matrix, and is symmetric, since (M^T M)^T = M^T M. Then, for any vector X, we can evaluate X^T M^T M X to obtain a number. This is a very nice number, though! It is just the length |MX|² = (MX)^T(MX) = X^T M^T M X.

Now suppose that ker L = {0}, so that the only solution to MX = 0 is X = 0. (This need not mean that M is invertible, because M is an m × n matrix, so not necessarily square.) However the square matrix M^T M is invertible. To see this, suppose there was a vector X such that M^T M X = 0. Then it would follow that X^T M^T M X = |MX|² = 0. In other words the vector MX would have zero length, so could only be the zero vector. But we are assuming that ker L = {0} so MX = 0 implies X = 0. Thus the kernel of M^T M is {0} so this matrix is invertible. So, in this case, the least squares solution (the X that solves M^T M X = M^T V) is unique, and is equal to

$$X = (M^T M)^{-1} M^T V. \qquad (17.3)$$

Example 17.1:
Captain Conundrum falls off of the leaning tower of Pisa and makes three (rather shaky) measurements of his velocity at three different times.

t (s)   v (m/s)
1        11
2        19            (17.4)
3        31

Having taken some calculus, he believes that his data are best approximated by a straight line

$$v = at + b. \qquad (17.5)$$

Then he should find a and b to best fit the data:

$$11 = a \cdot 1 + b, \qquad 19 = a \cdot 2 + b, \qquad 31 = a \cdot 3 + b.$$

As a system of linear equations, this reads

$$\begin{pmatrix} 1 & 1 \\ 2 & 1 \\ 3 & 1 \end{pmatrix}\begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} 11 \\ 19 \\ 31 \end{pmatrix},$$

which very likely has no exact solution, so instead we solve M^T M X = M^T V, i.e. $\begin{pmatrix} 14 & 6 \\ 6 & 3 \end{pmatrix}\begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} 142 \\ 61 \end{pmatrix}$. Row reducing the augmented matrix,

$$\left(\begin{array}{cc|c} 14 & 6 & 142 \\ 6 & 3 & 61 \end{array}\right) \sim \left(\begin{array}{cc|c} 1 & 0 & 10 \\ 0 & 1 & \frac{1}{3} \end{array}\right), \qquad (17.8)$$

so the least squares solution is a = 10, b = 1/3, and the best-fit line is v = 10t + 1/3.

Notice that this equation implies that Captain Conundrum accelerates towards Italian soil at 10 m/s² (which is an excellent approximation to reality) and that he started at a downward velocity of 1/3 m/s (perhaps somebody gave him a shove...)!
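The same fit can be reproduced numerically, either by solving the normal equations M^T M X = M^T V directly or with NumPy's least squares routine. A minimal sketch (ours) for this example:

```python
import numpy as np

t = np.array([1., 2., 3.])
v = np.array([11., 19., 31.])

# Columns of M multiply the unknowns (a, b) in v = a t + b.
M = np.column_stack([t, np.ones_like(t)])

# Solve the normal equations M^T M X = M^T V directly ...
X = np.linalg.solve(M.T @ M, M.T @ v)
print(X)                                      # [10.  0.33333333]

# ... or use the built-in least squares routine.
print(np.linalg.lstsq(M, v, rcond=None)[0])
```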
Topic hierarchy
17.1: Singular Value Decomposition
17.2: Review Problems
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 17: Least Squares and Singular Values is shared under a not declared license and was authored, remixed, and/or curated by David
Cherney, Tom Denton, & Andrew Waldron.
17.1: Singular Value Decomposition
Suppose we have a linear transformation

$$L: V \to W. \qquad (17.1.1)$$

It is unlikely that dim V := n = m =: dim W, so the m × n matrix M of L in bases for V and W will not be square. Therefore there is no eigenvalue problem we can use to uncover a preferred basis. However, if the vector spaces V and W both have inner products, there does exist an analog of the eigenvalue problem, namely the singular values of L.

Before giving the details of the powerful technique known as the singular value decomposition, we note that it is an excellent example of what Eugene Wigner called the "Unreasonable Effectiveness of Mathematics'':
Before giving the details of the powerful technique known as the singular value decomposition, we note that it is an excellent example of what
Eugene Wigner called the "Unreasonable Effectiveness of Mathematics'':
There is a story about two friends who were classmates in high school, talking about their jobs. One of them became a statistician and was
working on population trends. He showed a reprint to his former classmate. The reprint started, as usual with the Gaussian distribution and
the statistician explained to his former classmate the meaning of the symbols for the actual population and so on. His classmate was a bit
incredulous and was not quite sure whether the statistician was pulling his leg. ``How can you know that?'' was his query. "And what is
this symbol here?'' "Oh,'' said the statistician, this is "π.'' "And what is that?'' "The ratio of the circumference of the circle to its diameter.''
"Well, now you are pushing your joke too far,'' said the classmate, "surely the population has nothing to do with the circumference of the
circle.''
-- Eugene Wigner, Commun. Pure and Appl. Math. {\bf XIII}, 1 (1960).
Whenever we mathematically model a system, any "canonical quantities'' (those on which we can all agree and do not depend on any choices we make for calculating them) will correspond to important features of the system. For example, the eigenvalues of the eigenvector equation you found in review question 1, chapter 12, encode the notes and harmonics that a guitar string can play! Singular values appear in many linear algebra applications, especially those involving very large data sets such as statistics and signal processing.

Let us focus on the m × n matrix M of a linear transformation L: V → W written in orthonormal bases for the input and output of L (notice, the existence of these orthonormal bases is predicated on having inner products for V and W). Even though the matrix M is not square, both the matrices M^T M and M M^T are square and symmetric! In terms of linear transformations, M^T is the matrix of a linear transformation L*: W → V.

Next, let us make a simplifying assumption, namely ker L = {0}. This is not necessary, but will make some of our computations simpler. Now suppose we have found an orthonormal basis (u_1, …, u_n) for V composed of eigenvectors for L*L:

$$L^* L u_i = \lambda_i u_i. \qquad (17.1.2)$$

Hence, multiplying by L,

$$L L^* L u_i = \lambda_i L u_i, \qquad (17.1.3)$$

i.e., Lu_i is an eigenvector of LL*. The vectors (Lu_1, …, Lu_n) are linearly independent, because ker L = {0} (this is where we use our simplifying assumption, but you can try and extend our analysis to the case where it no longer holds). Let's compute the angles between, and lengths of, these vectors.

For that we express the vectors u_i in the bases used to compute the matrix M of L. Denoting these column vectors by U_i we then compute

$$(MU_i) \cdot (MU_j) = U_i^T M^T M U_j = \lambda_j U_i^T U_j = \lambda_j U_i \cdot U_j = \lambda_j \delta_{ij}. \qquad (17.1.4)$$

Hence the vectors Lu_i are orthogonal with lengths |Lu_i| = √λ_i, so the vectors

$$\left(\frac{Lu_1}{\sqrt{\lambda_1}}, \ldots, \frac{Lu_n}{\sqrt{\lambda_n}}\right) \qquad (17.1.5)$$

are orthonormal and linearly independent. However, since ker L = {0} we have dim L(V) = dim V and in turn dim V ≤ dim W, so n ≤ m. This means that although the above set of n vectors in W are orthonormal and linearly independent, they cannot be a basis for W. However, they are a subset of the eigenvectors of LL*. Hence an orthonormal basis of eigenvectors of LL* looks like

$$O' = \left(\frac{Lu_1}{\sqrt{\lambda_1}}, \ldots, \frac{Lu_n}{\sqrt{\lambda_n}}, v_{n+1}, \ldots, v_m\right) =: (v_1, \ldots, v_m). \qquad (17.1.6)$$
Now let's compute the matrix of L with respect to the orthonormal basis O = (u_1, …, u_n) for V and the orthonormal basis O' = (v_1, …, v_m) for W. As usual, our starting point is the computation of L acting on the input basis vectors:

$$(Lu_1, \ldots, Lu_n) = (\sqrt{\lambda_1}\, v_1, \ldots, \sqrt{\lambda_n}\, v_n) = (v_1, \ldots, v_m)\begin{pmatrix} \sqrt{\lambda_1} & 0 & \cdots & 0 \\ 0 & \sqrt{\lambda_2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sqrt{\lambda_n} \\ 0 & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 0 \end{pmatrix}.$$

The result is very close to diagonalization; the numbers √λ_i along the leading diagonal are called the singular values of L.
Example 17.1.1:
Let

$$M = \begin{pmatrix} \frac{1}{2} & \frac{1}{2} \\ -1 & 1 \\ -\frac{1}{2} & -\frac{1}{2} \end{pmatrix}. \qquad (17.1.7)$$

Then

$$M^T M = \begin{pmatrix} \frac{3}{2} & -\frac{1}{2} \\ -\frac{1}{2} & \frac{3}{2} \end{pmatrix},$$

which has eigenvalues and eigenvectors

$$\lambda = 1,\; u_1 := \begin{pmatrix} \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \end{pmatrix}; \qquad \lambda = 2,\; u_2 := \begin{pmatrix} \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} \end{pmatrix},$$

so our orthonormal input basis is

$$O = \left(\begin{pmatrix} \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \end{pmatrix}, \begin{pmatrix} \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} \end{pmatrix}\right).$$

These are called the right singular vectors of M. The vectors

$$Mu_1 = \begin{pmatrix} \frac{1}{\sqrt{2}} \\ 0 \\ -\frac{1}{\sqrt{2}} \end{pmatrix} \quad\text{and}\quad Mu_2 = \begin{pmatrix} 0 \\ -\sqrt{2} \\ 0 \end{pmatrix}$$

are eigenvectors of

$$MM^T = \begin{pmatrix} \frac{1}{2} & 0 & -\frac{1}{2} \\ 0 & 2 & 0 \\ -\frac{1}{2} & 0 & \frac{1}{2} \end{pmatrix}$$

with eigenvalues 1 and 2, respectively. The remaining eigenvector of MM^T, with eigenvalue 0, is

$$v_3 = \begin{pmatrix} \frac{1}{\sqrt{2}} \\ 0 \\ \frac{1}{\sqrt{2}} \end{pmatrix}.$$

Dividing Mu_1 and Mu_2 by their lengths and appending v_3 gives the orthonormal output basis

$$O' = \left(\begin{pmatrix} \frac{1}{\sqrt{2}} \\ 0 \\ -\frac{1}{\sqrt{2}} \end{pmatrix}, \begin{pmatrix} 0 \\ -1 \\ 0 \end{pmatrix}, \begin{pmatrix} \frac{1}{\sqrt{2}} \\ 0 \\ \frac{1}{\sqrt{2}} \end{pmatrix}\right).$$

The new matrix M' of the linear transformation given by M with respect to the bases O and O' is

$$M' = \begin{pmatrix} 1 & 0 \\ 0 & \sqrt{2} \\ 0 & 0 \end{pmatrix}, \qquad (17.1.27)$$

so the singular values are 1, √2.
Finally note that arranging the column vectors of O and O' into change of basis matrices

$$P = \begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \end{pmatrix}, \qquad Q = \begin{pmatrix} \frac{1}{\sqrt{2}} & 0 & \frac{1}{\sqrt{2}} \\ 0 & -1 & 0 \\ -\frac{1}{\sqrt{2}} & 0 & \frac{1}{\sqrt{2}} \end{pmatrix}, \qquad (17.1.28)$$

we have, as usual,

$$M' = Q^{-1}MP.$$
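Numerically, np.linalg.svd returns the singular values directly; the signs of the singular vectors it chooses may differ from the hand computation above. A minimal sketch (ours) for this example:

```python
import numpy as np

M = np.array([[ 0.5,  0.5],
              [-1. ,  1. ],
              [-0.5, -0.5]])

U, s, Vt = np.linalg.svd(M)
print(s)                                 # [1.41421356 1.        ], i.e. sqrt(2) and 1

# Reassemble M from its singular value decomposition.
Sigma = np.zeros_like(M)
Sigma[:2, :2] = np.diag(s)
print(np.allclose(U @ Sigma @ Vt, M))    # True
```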
Singular vectors and values have a very nice geometric interpretation: they provide orthonormal bases for the domain and range of L and give the factors by which L stretches the orthonormal input basis vectors. This is depicted below for the example we just computed:
This page titled 17.1: Singular Value Decomposition is shared under a not declared license and was authored, remixed, and/or curated by David Cherney, Tom
Denton, & Andrew Waldron.
17.2: Review Problems
1. Let L: U → V be a linear transformation. Suppose v ∈ L(U) and you have found a vector u_ps that obeys L(u_ps) = v. Explain why you need to compute ker L to describe the solution set of the linear system L(u) = v.

2. Suppose that M is an m × n matrix with trivial kernel. Show that for any vectors u and v in R^n:
a) u^T M^T M v = v^T M^T M u.
b) v^T M^T M v ≥ 0. In case you are concerned (you don't need to be) and for future reference, the notation v ≥ 0 means each component v^i ≥ 0.
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 17.2: Review Problems is shared under a not declared license and was authored, remixed, and/or curated by David Cherney, Tom
Denton, & Andrew Waldron.
CHAPTER OVERVIEW
18: Symbols, Fields, Sample Exams, Online Resources, Movie Scripts
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 18: Symbols, Fields, Sample Exams, Online Resources, Movie Scripts is shared under a not declared license and was authored,
remixed, and/or curated by David Cherney, Tom Denton, & Andrew Waldron.
18.1: List of Symbols
ϵ "Is an element of".
∼ "Is equivalent to", see equivalence relations. Also "is row equivalent to" for matrices.
P
F
nThe vector space of polynomials of degree at most n with coeficients in the field F.
M The vector space of r × k matrices.
r
k
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 18.1: List of Symbols is shared under a not declared license and was authored, remixed, and/or curated by David Cherney, Tom
Denton, & Andrew Waldron.
18.2: Fields
Definition
A field F is a set with two operations + and ⋅, such that for all a, b, c ∈ F the following axioms are satisfied:
A1. Addition is associative (a + b) + c = a + (b + c).
A2. There exists an additive identity 0.
A3. Addition is commutative a + b = b + a.
A4. There exists an additive inverse −a.
M1. Multiplication is associative (a ⋅ b) ⋅ c = a ⋅ (b ⋅ c).
M2. There exists a multiplicative identity 1.
M3. Multiplication is commutative a ⋅ b = b ⋅ a.
M4. There exists a multiplicative inverse a⁻¹ if a ≠ 0.
D. The distributive property holds: a ⋅ (b + c) = a ⋅ b + a ⋅ c.
Note
Roughly, all of the above mean that you have notions of +, −, ×, ÷ just as for regular real numbers.
Fields are a very beautiful structure; some examples are rational numbers Q, real numbers R, and complex numbers C. These examples are infinite; however, this does not necessarily have to be the case. The smallest example of a field has just two elements, Z₂ = {0, 1}, or bits. The rules for addition and multiplication are the usual ones save that 1 + 1 = 0.
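As a quick illustration, here is a minimal Python sketch of arithmetic in Z₂ (the helper names add and mul are ours, purely for illustration):

# Arithmetic in the two-element field Z_2 = {0, 1}: add and multiply mod 2,
# so in particular 1 + 1 = 0.
def add(a, b):
    return (a + b) % 2

def mul(a, b):
    return (a * b) % 2

bits = (0, 1)
print("addition table:", {(a, b): add(a, b) for a in bits for b in bits})
print("multiplication table:", {(a, b): mul(a, b) for a in bits for b in bits})

# Every nonzero element has a multiplicative inverse: here 1 * 1 = 1.
assert mul(1, 1) == 1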
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 18.2: Fields is shared under a not declared license and was authored, remixed, and/or curated by David Cherney, Tom Denton, &
Andrew Waldron.
18.2.1 https://math.libretexts.org/@go/page/2103
18.3: Online Resources
Here are some internet places to get linear algebra help:
1. Strang's MIT Linear Algebra Course. Videos of lectures and more:
http://ocw.mit.edu/courses/mathematics/18-06-linear-algebra-spring-2010/
2. Beezer's online Linear Algebra Course:
linear.ups.edu/version3.html
3. The Khan Academy has thousands of free videos on a multitude of topics including linear algebra:
www.khanacademy.org/
4. The Linear Algebra toolkit:
http://www.math.odu.edu/~bogacki/lat/
5. Carter, Tapia and Papakonstantinou's online linear algebra resource:
http://ceee.rice.edu/Books/LA/index.html
6. S.O.S. Mathematics Matrix Algebra primer:
www.sosmath.com/matrix/matrix.html
7. The Numerical Methods Guy on Youtube. Lots of worked examples:
http://www.youtube.com/user/numericalmethodsguy
8. Interactive Mathematics. Lots of useful math lessons on many topics:
http://www.intmath.com/
9. Stat Trek. A quick matrix tutorial for statistics students:
http://stattrek.com/matrix-algebra/matrix.aspx
10. Wolfram’s Mathworld. An online mathematics encyclopædia:
http://mathworld.wolfram.com/
11. Paul Dawkin's online math notes:
http://tutorial.math.lamar.edu/
12. Math Doctor Bob:
http://www.youtube.com/user/MathDoctorBob?feature=watch
13. Some pictures of how to rotate objects with matrices:
http://people.cornellcollege.edu/dsherman/visualize-matrix.html
14. xkcd. Geek jokes:
http://xkcd.com/184/
15. See the bridge actually fall down:
http://anothermathgeek.hubpages.com/hub/What-the-Heck-are-Eigenvalues-and-Eigenvectors
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 18.3: Online Resources is shared under a not declared license and was authored, remixed, and/or curated by David Cherney, Tom
Denton, & Andrew Waldron.
18.3.1 https://math.libretexts.org/@go/page/2104
18.4: Sample First Midterm
Here are some worked problems typical for what you might expect on a first midterm examination.
1. Solve the following linear system. Write the solution set in vector form. Check your solution. Write one particular solution and
one homogeneous solution, if they exist. What does the solution set look like geometrically?
x + 3y = 4
x − 2y + z = 1
2x + y + z = 5
2.
Consider the system
x − z + 2w = −1
x + y + z − w = 2
  − y − 2z + 3w = −3
5x + 2y − z + 4w = 1
a) Write the augmented matrix of this system.
b) Use row operations to bring the augmented matrix to reduced row echelon form.
c) Write the solution set of the system in vector form.
d) Give a particular solution X and homogeneous solutions Y, and express the system as a matrix equation.
e) Check separately that X and each Y solve the matrix systems you claimed they solved in part (d).
3. Use row operations to invert the matrix
\[
\begin{pmatrix}1 & 2 & 3 & 4\\ 2 & 4 & 7 & 11\\ 3 & 7 & 14 & 25\\ 4 & 11 & 25 & 50\end{pmatrix}\, .
\]
4. Let \(M=\begin{pmatrix}2 & 1\\ 3 & -1\end{pmatrix}\). Calculate \(M^{T}M^{-1}\). Is M symmetric? What is the trace of the transpose of \(f(M)\), where \(f(x) = x^{2} - 1\)?
5. In this problem M is the matrix
\[
M = \begin{pmatrix}\cos\theta & \sin\theta\\ -\sin\theta & \cos\theta\end{pmatrix}
\]
and X is the vector \(X = \begin{pmatrix}x\\ y\end{pmatrix}\).
Calculate all possible dot products between the vectors X and MX. Compute the lengths of X and MX. What is the angle between the vectors MX and X? Draw a picture of these vectors in the plane. For what values of θ do you expect equality in the triangle and Cauchy–Schwarz inequalities?
6. Let M be the matrix
1 0 0 1 0 0
⎛ ⎞
⎜0 1 0 0 1 0⎟
⎜ ⎟
⎜0 0 1 0 0 1⎟
⎜ ⎟ (18.4.6)
⎜ ⎟
⎜0 0 0 1 0 0⎟
⎜ ⎟
⎜0 0 0 0 1 0⎟
⎝ ⎠
0 0 0 0 0 1
Find a formula for \(M^{k}\) for any positive integer power k. Try some simple examples like k = 2, 3 if confused.
7.
Determinants: The determinant det M of a \(2\times 2\) matrix \(M=\begin{pmatrix}a & b\\ c & d\end{pmatrix}\) is defined by
\[
\det M = ad - bc\, .
\]
a) For which values of a, b, c, d does M have an inverse?
b) Write down all 2 × 2 bit matrices with determinant 1.
c) Write down all 2 × 2 bit matrices with determinant 0.
d) Prove or disprove: \(\textit{Square matrices with the same determinant are always row equivalent.}\)
8. What does it mean for a function to be linear? Check that integration is a linear function from V to V , where
V = {f : R → R ∣ f is integrable} is a vector space over R with usual addition and scalar multiplication.
9. What are the four main things we need to define for a vector space? Which of the following is a vector space over R? For those
that are not vector spaces, modify one part of the definition to make it into a vector space.
a) V = {2 × 2 matrices with entries in R}, usual matrix addition, and \(k \cdot \begin{pmatrix}a & b\\ c & d\end{pmatrix} = \begin{pmatrix}ka & b\\ kc & d\end{pmatrix}\) for k ∈ R.
b) V = {polynomials with complex coefficients of degree ≤ 3}, with usual addition and scalar multiplication of polynomials.
c) V = {vectors in R³ with at least one entry containing a 1}, with usual addition and scalar multiplication.
10.
Subspaces: If V is a vector space, we say that U is a subspace of V when the set U is also a vector space, using the vector
addition and scalar multiplication rules of the vector space V . (Remember that U ⊂ V says that "U is a subset of V '', i.e. , all
elements of U are also elements of V . The symbol ∀ means "for all'' and ∈ means "is an element of''.)
Explain why additive closure (u + w ∈ U ∀ u, w ∈ U) and multiplicative closure (r ⋅ u ∈ U ∀ r ∈ R, u ∈ U) ensure that (i) the zero vector 0 ∈ U and (ii) every u ∈ U has an additive inverse.
In fact it suffices to check closure under addition and scalar multiplication to verify that U is a vector space. Check whether the
following choices of U are vector spaces:
a) \(U = \left\{\begin{pmatrix}x\\ y\\ 0\end{pmatrix} : x, y \in \mathbb{R}\right\}\)
b) \(U = \left\{\begin{pmatrix}1\\ 0\\ z\end{pmatrix} : z \in \mathbb{R}\right\}\)
Solutions
1.
As an additional exercise, write out the row operations above the \(\sim\) signs below:
\[
\left(\begin{array}{rrr|r} 1 & 3 & 0 & 4\\ 1 & -2 & 1 & 1\\ 2 & 1 & 1 & 5\end{array}\right)
\sim
\left(\begin{array}{rrr|r} 1 & 3 & 0 & 4\\ 0 & -5 & 1 & -3\\ 0 & -5 & 1 & -3\end{array}\right)
\sim
\left(\begin{array}{rrr|r} 1 & 0 & \frac{3}{5} & \frac{11}{5}\\ 0 & 1 & -\frac{1}{5} & \frac{3}{5}\\ 0 & 0 & 0 & 0\end{array}\right)
\]
Solution set
\[
\left\{ \begin{pmatrix}x\\ y\\ z\end{pmatrix} = \begin{pmatrix}\frac{11}{5}\\ \frac{3}{5}\\ 0\end{pmatrix} + \mu\begin{pmatrix}-\frac{3}{5}\\ \frac{1}{5}\\ 1\end{pmatrix} : \mu\in\mathbb{R}\right\}
\]
A particular solution is \(\begin{pmatrix}\frac{11}{5}\\ \frac{3}{5}\\ 0\end{pmatrix}\) and a homogeneous solution is \(\begin{pmatrix}-\frac{3}{5}\\ \frac{1}{5}\\ 1\end{pmatrix}\). As a check,
\[
\begin{pmatrix}1 & 3 & 0\\ 1 & -2 & 1\\ 2 & 1 & 1\end{pmatrix}\begin{pmatrix}\frac{11}{5}\\ \frac{3}{5}\\ 0\end{pmatrix} = \begin{pmatrix}4\\ 1\\ 5\end{pmatrix}
\mbox{ and }
\begin{pmatrix}1 & 3 & 0\\ 1 & -2 & 1\\ 2 & 1 & 1\end{pmatrix}\begin{pmatrix}-\frac{3}{5}\\ \frac{1}{5}\\ 1\end{pmatrix} = \begin{pmatrix}0\\ 0\\ 0\end{pmatrix}\, .
\]
Since there is one free parameter, the solution set is geometrically a line in \(\mathbb{R}^{3}\).
2.
a) Again, write out the row operations as an additional exercise.
\[
\left(\begin{array}{rrrr|r} 1 & 0 & -1 & 2 & -1\\ 1 & 1 & 1 & -1 & 2\\ 0 & -1 & -2 & 3 & -3\\ 5 & 2 & -1 & 4 & 1 \end{array}\right)
\]
b)
\[
\sim
\left(\begin{array}{rrrr|r} 1 & 0 & -1 & 2 & -1\\ 0 & 1 & 2 & -3 & 3\\ 0 & -1 & -2 & 3 & -3\\ 0 & 2 & 4 & -6 & 6 \end{array}\right)
\sim
\left(\begin{array}{rrrr|r} 1 & 0 & -1 & 2 & -1\\ 0 & 1 & 2 & -3 & 3\\ 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 \end{array}\right)
\]
c)
Solution set
\[
\left\{ X = \begin{pmatrix}-1\\ 3\\ 0\\ 0\end{pmatrix} + \mu_{1}\begin{pmatrix}1\\ -2\\ 1\\ 0\end{pmatrix} + \mu_{2}\begin{pmatrix}-2\\ 3\\ 0\\ 1\end{pmatrix} : \mu_{1},\mu_{2}\in\mathbb{R}\right\}\, .
\]
d) The vector \(X_{0}=\begin{pmatrix}-1\\ 3\\ 0\\ 0\end{pmatrix}\) is a particular solution and the vectors \(Y_{1}=\begin{pmatrix}1\\ -2\\ 1\\ 0\end{pmatrix}\) and \(Y_{2}=\begin{pmatrix}-2\\ 3\\ 0\\ 1\end{pmatrix}\) are homogeneous solutions. Calling
\(M=\begin{pmatrix}1 & 0 & -1 & 2\\ 1 & 1 & 1 & -1\\ 0 & -1 & -2 & 3\\ 5 & 2 & -1 & 4\end{pmatrix}\) and \(V=\begin{pmatrix}-1\\ 2\\ -3\\ 1\end{pmatrix}\), they obey
\[
MX=V\, ,\qquad MY_{1}=0=MY_{2}\, .
\]
e) This amounts to performing explicitly the matrix manipulations \(MX-V\), \(MY_{1}\), \(MY_{2}\) and checking they all return the zero vector.
3.
As usual, be sure to write out the row operations above the ∼'s so your work can be easily checked.
⎛ 1 2 3 4 1 0 0 0 ⎞
⎜ 2 4 7 11 0 1 0 0 ⎟
⎜ ⎟ (18.4.22)
⎜ ⎟
⎜ 3 7 14 25 0 0 1 0 ⎟
⎝ 4 11 25 50 0 0 0 1 ⎠
⎛ 1 2 3 4 1 0 0 0 ⎞
⎜ 0 0 1 3 −2 1 0 0 ⎟
∼⎜ ⎟ (18.4.23)
⎜ ⎟
⎜ 0 1 5 13 −3 0 1 0 ⎟
⎝ 0 3 13 34 −4 0 0 1 ⎠
18.4.5 https://math.libretexts.org/@go/page/2105
⎛ 1 0 −7 −22 7 0 −2 0 ⎞
⎜ 0 1 5 13 −3 0 1 0 ⎟
∼⎜ ⎟ (18.4.24)
⎜ ⎟
⎜ 0 0 1 3 −2 1 0 0 ⎟
⎝ 0 0 −2 −5 5 0 −3 1 ⎠
⎛ 1 0 0 −1 −7 7 −2 0 ⎞
⎜ 0 1 0 −2 7 −5 1 0 ⎟
∼⎜ ⎟ (18.4.25)
⎜ ⎟
⎜ 0 0 1 3 −2 1 0 0 ⎟
⎝ 0 0 0 1 1 2 −3 1 ⎠
⎛ 1 0 0 0 −6 9 −5 1 ⎞
⎜ 0 1 0 0 9 −1 −5 2 ⎟
∼⎜ ⎟ . (18.4.26)
⎜ ⎟
⎜ 0 0 1 0 −5 −5 9 −3 ⎟
⎝ 0 0 0 1 1 2 −3 1 ⎠
Check
$$
1 2 3 4
⎛ ⎞
⎜2 4 7 11 ⎟
⎜ ⎟ (18.4.27)
⎜3 7 14 25 ⎟
⎝ ⎠
4 11 25 50
−6 9 −5 1
⎛ ⎞
⎜ 9 −1 −5 ⎟2
⎜ ⎟ (18.4.28)
⎜ −5 −5 9 −3 ⎟
⎝ ⎠
1 2 −3 1
=
1 0 0 0
⎛ ⎞
⎜0 1 0 0⎟
⎜ ⎟ (18.4.29)
⎜0 0 1 0⎟
⎝ ⎠
0 0 0 1
\, .
\]
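For a quick numerical cross-check of the inverse just found, here is a short numpy sketch (numpy is an assumption; any computer algebra system would do):

import numpy as np

# Check that the matrix found by row reduction really is the inverse.
M = np.array([[1,  2,  3,  4],
              [2,  4,  7, 11],
              [3,  7, 14, 25],
              [4, 11, 25, 50]])
M_inv = np.array([[-6,  9, -5,  1],
                  [ 9, -1, -5,  2],
                  [-5, -5,  9, -3],
                  [ 1,  2, -3,  1]])

print(M @ M_inv)                              # the 4x4 identity matrix
print(np.allclose(np.linalg.inv(M), M_inv))   # True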
4.
\[
M^{T}M^{-1} = \begin{pmatrix}2 & 3\\ 1 & -1\end{pmatrix}\begin{pmatrix}\frac{1}{5} & \frac{1}{5}\\ \frac{3}{5} & -\frac{2}{5}\end{pmatrix} = \begin{pmatrix}\frac{11}{5} & -\frac{4}{5}\\ -\frac{2}{5} & \frac{3}{5}\end{pmatrix}\, .
\]
Since \(M^{T}M^{-1}\neq I\), it follows that \(M^{T}\neq M\), so M is not symmetric. Finally
\[
{\rm tr}\,f(M)^{T} = {\rm tr}\,f(M) = {\rm tr}(M^{2}-I) = {\rm tr}\left[\begin{pmatrix}2 & 1\\ 3 & -1\end{pmatrix}\begin{pmatrix}2 & 1\\ 3 & -1\end{pmatrix}\right] - {\rm tr}\,I
= (2\cdot 2 + 1\cdot 3) + (3\cdot 1 + (-1)\cdot(-1)) - 2 = 9\, .
\]
5. First
\[
X\cdot(MX) = X^{T}MX = \begin{pmatrix}x & y\end{pmatrix}\begin{pmatrix}\cos\theta & \sin\theta\\ -\sin\theta & \cos\theta\end{pmatrix}\begin{pmatrix}x\\ y\end{pmatrix}
= \begin{pmatrix}x & y\end{pmatrix}\begin{pmatrix}x\cos\theta + y\sin\theta\\ -x\sin\theta + y\cos\theta\end{pmatrix} = (x^{2}+y^{2})\cos\theta\, .
\]
Now \(||X|| = \sqrt{X\cdot X} = \sqrt{x^{2}+y^{2}}\) and \((MX)\cdot(MX) = X^{T}M^{T}MX\). But
\[
M^{T}M = \begin{pmatrix}\cos\theta & -\sin\theta\\ \sin\theta & \cos\theta\end{pmatrix}\begin{pmatrix}\cos\theta & \sin\theta\\ -\sin\theta & \cos\theta\end{pmatrix}
= \begin{pmatrix}\cos^{2}\theta + \sin^{2}\theta & 0\\ 0 & \cos^{2}\theta + \sin^{2}\theta\end{pmatrix} = I\, .
\]
Hence \(||MX|| = ||X|| = \sqrt{x^{2}+y^{2}}\). Thus the cosine of the angle between X and MX is given by
\[
\frac{X\cdot(MX)}{||X||\,||MX||} = \frac{(x^{2}+y^{2})\cos\theta}{\sqrt{x^{2}+y^{2}}\,\sqrt{x^{2}+y^{2}}} = \cos\theta\, .
\]
In other words, the angle is θ OR −θ. You should draw two pictures, one where the angle between X and MX is θ, the other where it is −θ.
For Cauchy–Schwarz, \(\frac{|X\cdot(MX)|}{||X||\,||MX||} = |\cos\theta| = 1\) when θ = 0, π. For the triangle equality, MX = X achieves ||X + MX|| = ||X|| + ||MX||, which requires θ = 0.
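A quick numerical illustration of this solution, using numpy (an assumption) and arbitrary sample values for x, y and θ:

import numpy as np

# A rotation matrix preserves lengths, and the angle between X and MX has cosine cos(theta).
theta = 0.7
M = np.array([[ np.cos(theta), np.sin(theta)],
              [-np.sin(theta), np.cos(theta)]])
X = np.array([3.0, -2.0])
MX = M @ X

print(np.isclose(np.linalg.norm(MX), np.linalg.norm(X)))         # True
cos_angle = X @ MX / (np.linalg.norm(X) * np.linalg.norm(MX))
print(np.isclose(cos_angle, np.cos(theta)))                       # True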
6. This is a block matrix problem. Notice that the matrix M is really just \(M=\begin{pmatrix}I & I\\ 0 & I\end{pmatrix}\), where I and 0 are the 3 × 3 identity and zero matrices, respectively. Then
\[
M^{2}=\begin{pmatrix}I & I\\ 0 & I\end{pmatrix}\begin{pmatrix}I & I\\ 0 & I\end{pmatrix}=\begin{pmatrix}I & 2I\\ 0 & I\end{pmatrix}
\]
and
\[
M^{3}=\begin{pmatrix}I & I\\ 0 & I\end{pmatrix}\begin{pmatrix}I & 2I\\ 0 & I\end{pmatrix}=\begin{pmatrix}I & 3I\\ 0 & I\end{pmatrix}\, ,
\]
so \(M^{k}=\begin{pmatrix}I & kI\\ 0 & I\end{pmatrix}\), or explicitly
$$
M^{k}=
18.4.7 https://math.libretexts.org/@go/page/2105
1 0 0 k 0 0
⎛ ⎞
⎜ 0 1 0 0 k 0⎟
⎜ ⎟
⎜ 0 0 1 0 0 k⎟
⎜ ⎟ (18.4.39)
⎜ ⎟
⎜ 0 0 0 1 0 0⎟
⎜ ⎟
⎜ 0 0 0 0 1 0⎟
⎝ ⎠
0 0 0 0 0 1
\, .
\]
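The block pattern is easy to confirm numerically; here is a short numpy sketch (numpy is an assumption, and the variable names are ours):

import numpy as np

# Check that M^k has k*I in the upper-right block.
I3 = np.eye(3, dtype=int)
Z3 = np.zeros((3, 3), dtype=int)
M = np.block([[I3, I3], [Z3, I3]])

k = 5
Mk = np.linalg.matrix_power(M, k)
expected = np.block([[I3, k * I3], [Z3, I3]])
print(np.array_equal(Mk, expected))   # True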
7.
a) Whenever detM = ad − bc ≠ 0 .
b) Unit determinant bit matrices:
$$
1 0
( ) (18.4.40)
0 1
,
1 1
( ) (18.4.41)
0 1
\, ,
1 0
( ) (18.4.42)
1 1
0 1
( ) (18.4.43)
1 0
\, ,
1 1
( ) (18.4.44)
1 0
,
0 1
( ) (18.4.45)
1 1
\,.
\]
c) Bit matrices with vanishing determinant:
0 0 1 0 0 1 0 0 0 0
( ),( ) ,( ) ,( ),( ) , (18.4.46)
0 0 0 0 0 0 1 0 0 1
1 1 0 0 1 0 0 1 1 1
( ),( ) ,( ),( ) ,( ) . (18.4.47)
0 0 1 1 1 0 0 1 1 1
As a check, count that the total number of 2 × 2 bit matrices is \(2^{(\text{number of entries})} = 2^{4} = 16\).
d) To disprove this statement, we just need to find a single counterexample. All the unit determinant examples above are actually
row equivalent to the identity matrix, so focus on the bit matrices with vanishing determinant. Then notice (for example), that
\[
\begin{pmatrix}1 & 1\\ 0 & 0\end{pmatrix} \not\sim \begin{pmatrix}0 & 0\\ 0 & 0\end{pmatrix}\, .
\]
18.4.8 https://math.libretexts.org/@go/page/2105
So we have found a pair of matrices that are not row equivalent but do have the same determinant. It follows that the statement is
false.
8. We can call a function f : V ⟶ W linear if the sets V and W are vector spaces and f obeys
\[
f(\alpha u + \beta v) = \alpha f(u) + \beta f(v)
\]
for all u, v ∈ V and all scalars α, β. Integration maps V to V and is linear because, for integrable functions f, g and real numbers α, β, \(\int(\alpha f + \beta g) = \alpha\int f + \beta\int g\), by the linearity of sums and limits of Riemann sums.
9. The four main ingredients are (i) a set V of vectors, (ii) a number field K (usually K = R) , (iii) a rule for adding vectors
(vector addition) and (iv) a way to multiply vectors by a number to produce a new vector (scalar multiplication). There are, of
course, ten rules that these four ingredients must obey.
a) This is not a vector space. Notice that distributivity of scalar multiplication requires 2u = (1 + 1)u = u + u for any vector u, but
\[
2\cdot\begin{pmatrix}a & b\\ c & d\end{pmatrix}=\begin{pmatrix}2a & b\\ 2c & d\end{pmatrix}\, ,
\]
which does not agree with
\[
\begin{pmatrix}a & b\\ c & d\end{pmatrix}+\begin{pmatrix}a & b\\ c & d\end{pmatrix}=\begin{pmatrix}2a & 2b\\ 2c & 2d\end{pmatrix}\, .
\]
The simplest repair is to use the usual scalar multiplication rule
\[
k\cdot\begin{pmatrix}a & b\\ c & d\end{pmatrix}=\begin{pmatrix}ka & kb\\ kc & kd\end{pmatrix}\, .
\]
b) This is a vector space. Although the question does not ask you to, it is a useful exercise to verify that all ten vector space rules are satisfied.
c) This is not a vector space for many reasons. An easy one is that (1, −1, 0) and (−1, 1, 0) are both in the space, but their sum
(0, 0, 0) is not (i.e. , additive closure fails). The easiest way to repair this would be to drop the requirement that there be at least one
entry equaling 1.
10.
(i) Thanks to multiplicative closure, if u ∈ U, so is (−1) ⋅ u. But (−1) ⋅ u + u = (−1) ⋅ u + 1 ⋅ u = (−1 + 1) ⋅ u = 0 ⋅ u = 0 (at each step in this chain of equalities we have used the fact that V is a vector space and therefore can use its vector space rules). In particular, this means that the zero vector of V is in U and is its zero vector also. (ii) Also, in V, for each u there is an element −u such that u + (−u) = 0. But (−u) = (−1) ⋅ u, so by multiplicative closure (−u) must also be in U; thus every u ∈ U has an additive inverse.
x z
⎛ ⎞ ⎛ ⎞
a) This is a vector space. First we check additive closure: let ⎜ y ⎟ and ⎜w⎟ be arbitrary vectors in U . But since
⎝ ⎠ ⎝ ⎠
0 0
18.4.9 https://math.libretexts.org/@go/page/2105
x z x +z
⎛ ⎞ ⎛ ⎞ ⎛ ⎞
⎜ y ⎟+⎜ w ⎟ = ⎜ y +w ⎟ , so is their sum (because vectors in U are those whose third component vanishes). Multiplicative
⎝ ⎠ ⎝ ⎠ ⎝ ⎠
0 0 0
x αx
⎛ ⎞ ⎛ ⎞
closure is similar: for any α ∈ R , α ⎜ y ⎟ = ⎜ αy ⎟ , which also has no third component, so is in U .
⎝ ⎠ ⎝ ⎠
0 0
1 2
⎛ ⎞ ⎛ ⎞
b) This is not a vector space for various reasons. A simple one is that u = ⎜ 0 ⎟ is in U but the vector u + u = ⎜ 0 ⎟ is not in
⎝ ⎠ ⎝ ⎠
z 2z
U (it has a 2 in the first component, but vectors in U always have a 1 there).
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 18.4: Sample First Midterm is shared under a not declared license and was authored, remixed, and/or curated by David Cherney,
Tom Denton, & Andrew Waldron.
18.4.10 https://math.libretexts.org/@go/page/2105
18.5: Sample Second Midterm
Here are some worked problems typical for what you might expect on a second midterm examination.
1.
Find an LU decomposition for the matrix
1 1 −1 2
⎛ ⎞
⎜ 1 3 2 2 ⎟
⎜ ⎟ (18.5.1)
⎜ −1 −3 −4 6 ⎟
⎝ ⎠
0 4 7 −2
⎧ x + y − z + 2w =7
⎪
⎪
⎪
x + 3y + 2z + 2w =6
⎨ (18.5.2)
⎪ −x − 3y − 4z + 6w = 12
⎪
⎩
⎪
4y + 7z − 2w = −7
2.
Let
1 1 1
⎛ ⎞
A =⎜2 2 3⎟ . (18.5.3)
⎝ ⎠
4 5 6
Compute \(\det A\). Find all solutions to (i) \(AX = 0\) and (ii) \(AX = \begin{pmatrix}1\\ 2\\ 3\end{pmatrix}\) for the vector \(X \in \mathbb{R}^{3}\). Find, but do not solve, the characteristic polynomial of A.
3.
Let M be any 2 × 2 matrix. Show
\[
\det M = -\frac{1}{2}{\rm tr}\,M^{2} + \frac{1}{2}({\rm tr}\,M)^{2}\, .
\]
4.
The permanent: Let \(M = (M^{i}_{j})\) be an n × n matrix. An operation producing a single number from M, similar to the determinant, is the permanent
\[
{\rm perm}\, M = \sum_{\sigma} M^{1}_{\sigma(1)} M^{2}_{\sigma(2)} \cdots M^{n}_{\sigma(n)}\, ,
\]
where the sum is over all permutations σ of {1, 2, …, n}. For example
\[
{\rm perm}\begin{pmatrix}a & b\\ c & d\end{pmatrix} = ad + bc\, .
\]
Calculate
1 2 3
⎛ ⎞
perm ⎜ 4 5 6⎟ . (18.5.7)
⎝ ⎠
7 8 9
18.5.1 https://math.libretexts.org/@go/page/2107
What do you think would happen to the permanent of an n × n matrix M if (include a brief explanation with each answer):
a. You multiplied M by a number λ .
b. You multiplied a row of M by a number λ .
c. You took the transpose of M .
d. You swapped two rows of M .
5.
Let X be an n × 1 matrix subject to
\[
X^{T}X = (1)\, ,
\]
and define
\[
H = I - 2XX^{T}\, ,
\]
where I is the n × n identity matrix. Show that \(H = H^{T}\) and that \(HH^{T} = I\).
6. Suppose λ is an eigenvalue of the matrix M with associated eigenvector v. Is v an eigenvector of \(M^{k}\) (where k is any positive integer)? If so, what is the associated eigenvalue? Now suppose the matrix N is nilpotent, i.e. \(N^{k}=0\) for some k. Show that 0 is the only possible eigenvalue of N.
7. Let \(M=\begin{pmatrix}3 & -5\\ 1 & -3\end{pmatrix}\). Compute \(M^{12}\).
8. The characteristic polynomial of a 2 × 2 matrix \(M=\begin{pmatrix}a & b\\ c & d\end{pmatrix}\) is \(P_{M}(\lambda)=\det(\lambda I - M)\). Compute the matrix \(P_{M}(M)\). What do you observe? Now suppose the n × n matrix A is "similar'' to a diagonal matrix D, in other words
\[
A = P^{-1}DP
\]
for some invertible matrix P and D is a matrix with values \(\lambda_{1}, \lambda_{2}, \ldots, \lambda_{n}\) along its diagonal. Show that the two matrix polynomials \(P_{A}(A)\) and \(P_{A}(D)\) are similar (i.e. \(P_{A}(A) = P^{-1}P_{A}(D)P\)). Finally, compute \(P_{A}(D)\); what can you say about \(P_{A}(A)\)?
9.
Define what it means for a set U to be a subspace of a vector space V . Now let U and W be non-trivial subspaces of V . Are the
following also subspaces? (Remember that ∪ means "union'' and ∩ means "intersection''.)
a. \(U \cup W\)
b. \(U \cap W\)
18.5.2 https://math.libretexts.org/@go/page/2107
10.
Define what it means for a set of vectors { v1 , v2 , … , vn } to (i) be linearly independent, (ii) span a vector space V and (iii) be a
basis for a vector space V .
Consider the following vectors in R 3
−1 4 10
⎛ ⎞ ⎛ ⎞ ⎛ ⎞
u = ⎜ −4 ⎟ , v=⎜5⎟ , w =⎜ 7 ⎟ . (18.5.13)
⎝ ⎠ ⎝ ⎠ ⎝ ⎠
3 0 h +3
Solutions
1.
1 1 −1 2 1 0 0 0 1 1 −1 2
⎛ ⎞ ⎛ ⎞⎛ ⎞
⎜ 1 3 2 2 ⎟ ⎜ 1 1 0 0⎟⎜0 2 3 0 ⎟
⎜ ⎟ =⎜ ⎟⎜ ⎟ (18.5.14)
⎜ −1 −3 −4 6 ⎟ ⎜ −1 0 1 0⎟⎜0 −2 −5 8 ⎟
⎝ ⎠ ⎝ ⎠⎝ ⎠
0 4 7 −2 0 0 0 1 0 4 7 −2
1 0 0 0 1 1 −1 2
⎛ ⎞⎛ ⎞
⎜ 1 1 0 0⎟⎜0 2 3 0 ⎟
=⎜ ⎟⎜ ⎟ (18.5.15)
⎜ −1 −1 1 0⎟⎜0 0 −2 8 ⎟
⎝ ⎠⎝ ⎠
0 2 0 1 0 0 1 −2
1 0 0 0
⎛ ⎞⎛1 1 −1 2
⎞
⎜ 1 1 0 0⎟
⎜0 2 3 0⎟
=⎜ ⎟⎜ ⎟ . (18.5.16)
⎜ ⎟
⎜ −1 −1 1 0⎟⎜0 0 −2 8⎟
⎝ 1 ⎝ ⎠
0 2 − 1⎠ 0 0 0 2
2
⎛ 1 0 0 0 7 ⎞ ⎛ 1 0 0 0 7 ⎞
⎜ 1 1 0 0 6 ⎟ ⎜ 0 1 0 0 −1 ⎟
⎜ ⎟ ∼⎜ ⎟ (18.5.17)
⎜ −1 −1 1 0 12 ⎟ ⎜ 0 0 1 0 18 ⎟
⎜ ⎟ ⎜ ⎟
1 1
⎝ 0 2 − 1 −7 ⎠ ⎝ 0 2 − 1 −7 ⎠
2 2
⎛ 1 0 0 0 7 ⎞
⎜ 0 1 0 0 −1 ⎟
∼⎜ ⎟ , (18.5.18)
⎜ ⎟
⎜ 0 0 1 0 18 ⎟
⎝ 0 0 0 1 4 ⎠
from which we can read off W . Now we compute X by solving U X = W with the augmented matrix
⎛ 1 1 −1 2 7 ⎞ ⎛ 1 1 −1 2 7 ⎞
⎜ 0 2 3 0 −1 ⎟ ⎜ 0 2 3 0 −1 ⎟
⎜ ⎟ ∼⎜ ⎟ (18.5.19)
⎜ ⎟ ⎜ ⎟
⎜ 0 0 −2 8 18 ⎟ ⎜ 0 0 −2 0 2 ⎟
⎝ 0 0 0 2 4 ⎠ ⎝ 0 0 0 1 2 ⎠
18.5.3 https://math.libretexts.org/@go/page/2107
⎛ 1 1 −1 2 7 ⎞ ⎛ 1 0 0 0 1 ⎞
⎜ 0 2 0 0 2 ⎟ ⎜ 0 1 0 0 1 ⎟
∼⎜ ⎟ ∼⎜ ⎟ (18.5.20)
⎜ ⎟ ⎜ ⎟
⎜ 0 0 1 0 −1 ⎟ ⎜ 0 0 1 0 −1 ⎟
⎝ 0 0 0 1 2 ⎠ ⎝ 0 0 0 1 2 ⎠
So x = 1 , y = 1 , z = −1 and w = 2 .
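For a numerical cross-check, here is a short sketch using numpy and scipy (both are assumptions; note that scipy's LU routine pivots, so its permutation matrix need not be the identity even though the hand computation above used no row swaps):

import numpy as np
from scipy.linalg import lu

# The coefficient matrix and right-hand side of problem 1.
M = np.array([[ 1,  1, -1,  2],
              [ 1,  3,  2,  2],
              [-1, -3, -4,  6],
              [ 0,  4,  7, -2]], dtype=float)
b = np.array([7, 6, 12, -7], dtype=float)

P, L, U = lu(M)
print(np.allclose(P @ L @ U, M))                            # True
print(np.allclose(np.linalg.solve(M, b), [1, 1, -1, 2]))    # True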
2.
detA = 1.(2.6 − 3.5) − 1.(2.6 − 3.4) + 1.(2.5 − 2.4) = −1 . (18.5.21)
(i) Since detA ≠ 0 , the homogeneous system AX = 0 only has the solution X = 0 .
(ii) It is efficient to compute the adjoint
T
−3 0 2 −3 −1 1
⎛ ⎞ ⎛ ⎞
adj A = ⎜ −1 2 −1 ⎟ =⎜ 0 2 −1 ⎟ (18.5.22)
⎝ ⎠ ⎝ ⎠
1 −1 0 2 −1 0
Hence
3 1 −1
⎛ ⎞
−1
A =⎜ 0 −2 1 ⎟ . (18.5.23)
⎝ ⎠
−2 1 0
Thus
3 1 −1 1 2
⎛ ⎞⎛ ⎞ ⎛ ⎞
X =⎜ 0 −2 1 ⎟ ⎜ 2 ⎟ = ⎜ −1 ⎟ . (18.5.24)
⎝ ⎠⎝ ⎠ ⎝ ⎠
−2 1 0 3 0
Finally,
1 −λ 1 1
⎛ ⎞
= −[(1 − λ)[(2 − λ)(6 − λ) − 15] − [2.(6 − λ) − 12] + [10 − 4.(2 − λ)]] (18.5.26)
3 2
=λ − 9λ −λ +1 . (18.5.27)
3.
a b
Call M =( ) . Then detM = ad − bc , yet
c d
2
1 2
1 2
1 a + bc ∗ 1 2
− trM + (trM ) =− tr ( )− (a + d) (18.5.28)
2
2 2 2 ∗ bc + d 2
1 1
2 2 2 2
=− (a + 2bc + d ) + (a + 2ad + d ) = ad − bc , (18.5.29)
2 2
18.5.4 https://math.libretexts.org/@go/page/2107
4.
1 2 3
⎛ ⎞
perm ⎜ 4 5 6 ⎟ = 1.(5.9 + 6.8) + 2.(4.9 + 6.7) + 3.(4.8 + 5.7) = 450 . (18.5.30)
⎝ ⎠
7 8 9
a) Multiplying M by λ replaces every \(M^{i}_{\sigma(j)}\) in the formula for the permanent by \(\lambda M^{i}_{\sigma(j)}\), and therefore produces an overall factor \(\lambda^{n}\).
b) Multiplying the \(i^{\rm th}\) row by λ replaces \(M^{i}_{\sigma(j)}\) in the formula for the permanent by \(\lambda M^{i}_{\sigma(j)}\). Therefore the permanent is multiplied by an overall factor λ.
c) The permanent of a matrix transposed equals the permanent of the original matrix, because in the formula for the permanent this amounts to summing over permutations of rows rather than columns. But we could then sort the product \(M^{\sigma(1)}_{1}M^{\sigma(2)}_{2}\cdots M^{\sigma(n)}_{n}\) back into its original order using the inverse permutation \(\sigma^{-1}\). But summing over permutations is equivalent to summing over inverse permutations, so the permanent is unchanged.
d) Swapping two rows also leaves the permanent unchanged; the argument is as in the previous part, and uses the fact that summing over all permutations σ or over all permutations \(\tilde\sigma\) obtained by swapping a pair in σ are equivalent operations.
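A brute-force implementation makes these properties easy to experiment with; the following Python sketch (the function name perm is ours) sums over all permutations exactly as in the definition:

import numpy as np
from itertools import permutations

def perm(M):
    # Sum over all permutations sigma of the products M[0][sigma(0)] ... M[n-1][sigma(n-1)].
    n = len(M)
    return sum(
        np.prod([M[i][sigma[i]] for i in range(n)])
        for sigma in permutations(range(n))
    )

M = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(perm(M))                        # 450, as computed above
print(perm(np.array(M).T.tolist()))   # 450: transposing does not change the permanent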
5. Firstly, let's call (1) = 1 (the 1 × 1 identity matrix). Then we calculate
T T T T T T T T T T
H = (I − 2X X ) =I − 2(X X ) = I − 2(X ) X = I − 2X X =H , (18.5.31)
T T T T T
= I − 4X X + 4X(X X)X = I − 4X X + 4X. 1. X =I . (18.5.33)
and similarly
k k−1 k
M v = λM v = … = λ v. (18.5.35)
Now let us assume v is an eigenvector of the nilpotent matrix N with eigenvalue λ . Then from above
k k
N v=λ v (18.5.36)
Hence λ k
v=0 and v (being an eigenvector) cannot vanish. Thus λ k
=0 and in turn λ = 0 .
7. Let us think about the eigenvalue problem M v = λv . This has solutions when
18.5.5 https://math.libretexts.org/@go/page/2107
3 −λ −5
2
0 = det ( ) =λ − 4 ⇒ λ = ±2 . (18.5.38)
1 −3 − λ
The associated eigenvalues solve the homogeneous systems (in augmented matrix form)
1 −5 0 1 −5 0 5 −5 0 1 −1 0
( ) ∼( ) and ( ) ∼( ) , (18.5.39)
1 −5 0 0 0 0 1 −1 0 0 0 0
5 1
respectively, so are v2 = ( ) and v−2 = ( ) . Hence M
12
v2 = 2
12
v2 and M
12
v−2 = (−2 )
12
v−2 . Now,
1 1
x x−y 5 x−5y 1
( ) =
4
( )−
4
( ) (this was obtained by solving the linear system av 2 + b v−2 = for a and b ).
y 1 1
Thus
x x −y x − 5y
M ( ) = M v2 − M v−2 (18.5.40)
y 4 4
12
x −y x − 5y 12
x
=2 ( v2 − v−2 ) = 2 ( ) . (18.5.41)
4 4 y
Thus
12
4096 0
M =( ) . (18.5.42)
0 4096
If you understand the above explanation, then you have a good understanding of diagonalization. A quicker route is simply to observe that \(M^{2} = \begin{pmatrix}4 & 0\\ 0 & 4\end{pmatrix}\).
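A two-line numpy check of this answer (numpy is an assumption):

import numpy as np

M = np.array([[3, -5],
              [1, -3]])
print(np.linalg.matrix_power(M, 12))   # [[4096, 0], [0, 4096]]
print(np.linalg.matrix_power(M, 2))    # [[4, 0], [0, 4]], the quick route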
8.
2
a−λ b
PM (λ) = (−1 ) det ( ) = (λ − a)(λ − d) − bc . (18.5.43)
c d−λ
Thus
a b a 0 a b d 0 bc 0
= (( ) −( )) (( ) −( )) − ( ) (18.5.45)
c d 0 a c d 0 d 0 bc
0 b a−d b bc 0
=( )( ) −( ) =0. (18.5.46)
c d−a c 0 0 bc
Observe that any 2 × 2 matrix is a zero of its own characteristic polynomial (in fact this holds for square matrices of any size ).
Now if A =P
−1
DP then A
2
=P
−1
DP P
−1
DP = P
−1
D P
2
. Similarly A
k
=P
−1 k
D P . So for any matrix polynomial we
have
n n−1
A + c1 A + ⋯ cn−1 A + cn I
−1 n −1 n−1 −1 −1
= P D P + c1 P D P + ⋯ cn−1 P DP + cn P P (18.5.47)
−1 n n−1
= P (D + c1 D + ⋯ cn−1 D + cn I )P .
18.5.6 https://math.libretexts.org/@go/page/2107
Thus we may conclude P A (A) =P
−1
PA (D)P .
Now suppose
λ1 0 ⋯ 0
⎛ ⎞
⎜ 0 λ2 0 ⎟
⎜ ⎟
D =⎜ ⎟ . Then
⎜ ⎟
⎜ ⋮ ⋱ ⋮ ⎟
⎝ ⎠
0 ⋯ λn
−1 −1
PA (λ) = det(λI − A) = det(λ P IP − P DP ) = detP . det(λI − D). detP (18.5.48)
λ − λ1 0 ⋯ 0
⎛ ⎞
⎜ 0 λ − λ2 0 ⎟
⎜ ⎟
= det(λI − D) = det ⎜ ⎟ (18.5.49)
⎜ ⎟
⎜ ⋮ ⋱ ⋮ ⎟
⎝ ⎠
0 0 ⋯ λ − λn
= (λ − λ1 )(λ − λ2 ) … (λ − λn ) . (18.5.50)
0 0 ⋯ 0 λ1 0 ⋯ 0 λ1 0 ⋯ 0
⎛ ⎞⎛ ⎞ ⎛ ⎞
⎜0 λ2 0 ⎟⎜ 0 0 0 ⎟ ⎜ 0 λ2 0⎟
⎜ ⎟⎜ ⎟ ⎜ ⎟
=⎜ ⎟⎜ ⎟… ⎜ ⎟ =0. (18.5.52)
⎜ ⎟⎜ ⎟ ⎜ ⎟
⎜ ⋮ ⋱ ⋮ ⎟⎜ ⋮ ⋱ ⋮ ⎟ ⎜ ⋮ ⋱ ⋮ ⎟
⎝ ⎠⎝ ⎠ ⎝ ⎠
0 0 ⋯ λn 0 0 ⋯ λn 0 0 ⋯ 0
We conclude the P M (M ) = 0 .
9. A subset of a vector space is called a subspace if it itself is a vector space, using the rules for vector addition and scalar
multiplication inherited from the original vector space.
αu + βw ∈ U ∩ W . So closure holds in U ∩ W and this set is a subspace by the subspace theorem. Here, a good picture to draw
such that c v + c v + ⋯ + c v = 0 . Alternatively, we can require that there is no non-trivial solution for scalars c ,
1
1
2
2
n
n
1
(ii) We say that these vectors span a vector space V if the set span
1 2
{ v1 , v2 , … vn } = { c v1 + c v2 + ⋯ + c vn : c , c , … c
n 1 2 n
∈ R} = V .
(iii) We call {v 1, v2 , … vn } a basis for V if {v 1, v2 , … vn } are linearly independent and span{v 1, v2 , … vn } = V .
18.5.7 https://math.libretexts.org/@go/page/2107
x
⎛ ⎞
For u, v, w to be a basis for 3
R , we firstly need (the spanning requirement) that any vector ⎜ y ⎟ can be written as a linear
⎝ ⎠
z
combination of u, v and w
−1 4 10 x
⎛ ⎞ ⎛ ⎞ ⎛ ⎞ ⎛ ⎞
1 2 3
c ⎜ −4 ⎟ + c ⎜ 5 ⎟+c ⎜ 7 ⎟ =⎜ y ⎟ . (18.5.53)
⎝ ⎠ ⎝ ⎠ ⎝ ⎠ ⎝ ⎠
3 0 h +3 z
The linear independence requirement implies that when x =y =z =0 , the only solution to the above system is
c =c =c =0 .
1 2 3
Both requirements mean that the matrix on the left hand side must be invertible, so we examine its determinant
−1 4 10
⎛ ⎞
det ⎜ −4 5 7 ⎟ = −4.(−4.(h + 3) − 7.3) + 5.(−1.(h + 3) − 10.3) (18.5.55)
⎝ ⎠
3 0 h +3
= 11(h − 3) . (18.5.56)
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 18.5: Sample Second Midterm is shared under a not declared license and was authored, remixed, and/or curated by David
Cherney, Tom Denton, & Andrew Waldron.
18.5.8 https://math.libretexts.org/@go/page/2107
18.6: Sample Final Exam
Here are some worked problems typical for what you might expect on a final examination.
1. Define the following terms:
1. An orthogonal matrix.
2. A basis for a vector space.
3. The span of a set of vectors.
4. The dimension of a vector space.
5. An eigenvector.
6. A subspace of a vector space.
7. The kernel of a linear transformation.
8. The nullity of a linear transformation.
9. The image of a linear transformation.
10. The rank of a linear transformation.
11. The characteristic polynomial of a square matrix.
12. An equivalence relation.
13. A homogeneous solution to a linear system of equations.
14. A particular solution to a linear system of equations.
15. The general solution to a linear system of equations.
16. The direct sum of a pair of subspaces of a vector space.
17. The orthogonal complement to a subspace of a vector space.
2. Kirchhoff's laws: Electrical circuits are easy to analyze using systems of equations. The change in voltage (measured in Volts) around any loop due to batteries and resistors (given by the product of the current measured in Amps and resistance measured in Ohms) equals zero. Also, the sum of currents entering any junction vanishes. Consider the circuit
http://mathwiki.ucdavis.edu/@api/deki/files/1356/circuit.pdf
Find all possible equations for the unknowns I , J and V and then solve for I , J and V . Give your answers with correct units.
3.
Suppose M is the matrix of a linear transformation
L : U → V (18.6.1)
and
m ≠n. (18.6.3)
Also assume
$$
\mbox{ker} L = \{0_U\}\, .
\]
1. How many rows does M have?
2. How many columns does M have?
3. Are the columns of M linearly independent?
4. What size matrix is M M ?
T
6. Is M M invertible?
T
7. Is M M symmetric?
T
8. Is M M diagonalizable?
T
4. Consider the system of equations
x +  y +  z +  w = 1
x + 2y + 2z + 2w = 1
x + 2y + 3z + 3w = 1
Express this system as a matrix equation M X = V and then find the solution set by computing an LU decomposition for the matrix M (be sure to use back and forward substitution).
5.
Compute the following determinants
1 2 3 4
⎛ ⎞
1 2 3
⎛ ⎞
1 2 ⎜ 5 6 7 8 ⎟
det ( ) , det ⎜ 4 5 6 ⎟ , det ⎜ ⎟, (18.6.5)
3 4 ⎜ 9 10 11 12 ⎟
⎝ ⎠
7 8 9 ⎝ ⎠
13 14 15 16
1 2 3 4 5
⎛ ⎞
⎜ 6 7 8 9 10 ⎟
⎜ ⎟
det ⎜ 11 12 13 14
⎟
15 ⎟ . (18.6.6)
⎜
⎜ ⎟
⎜ 16 17 18 19 20 ⎟
⎝ ⎠
21 22 23 24 25
18.6.1 https://math.libretexts.org/@go/page/2110
1 2 3 ⋯ n
⎛ ⎞
Make sure to jot down a few brief notes explaining any clever tricks you use.
6.
For which values of a does
⎧ 1 1 a
⎪⎛ ⎞ ⎛ ⎞ ⎛ ⎞⎫
⎪
3
U = span ⎨⎜ 0 ⎟ , ⎜ 2 ⎟ , ⎜ 1 ⎟⎬ = R ? (18.6.8)
⎩⎝
⎪ ⎠ ⎝ ⎠ ⎝ ⎭
⎠⎪
1 −3 0
inside R .
3
7.
Vandermonde determinant: Calculate the following determinants
2 3
1 x x x
2 ⎛ ⎞
1 x x
⎛ ⎞ 2 3
1 x ⎜1 y y y ⎟
2
det ( ) , det ⎜ 1 y y ⎟ , det ⎜ ⎟ . (18.6.9)
⎜ 2 3 ⎟
1 y ⎜1 z z z ⎟
⎝ 2 ⎠
1 z z
⎝ 2 3 ⎠
1 w w w
⎝ 2 n−1 ⎠
1 xn (xn ) ⋯ (xn )
\, .
\]
8.
⎧ 1 3 1 0 0 ⎫
⎪⎛ ⎞ ⎛ ⎞ ⎛ ⎞ ⎛ ⎞ ⎛ ⎞⎪
1. Do the vectors ⎨⎜ 2 ⎟ , ⎜ 2 ⎟ , ⎜ 0 ⎟ , ⎜ 1 ⎟ , ⎜ 0 ⎟⎬ form a basis for R ? Be sure to justify your answer. 3
⎩⎝
⎪ ⎠ ⎝ ⎠ ⎝ ⎠ ⎝ ⎠ ⎝ ⎭
⎠⎪
3 1 0 0 1
1 4
⎛ ⎞ ⎛ ⎞
2⎟ 3⎟
2. Find a basis for R that includes the vectors ⎜
4
⎜ ⎟ and ⎜
⎜ ⎟ .
⎜3⎟ ⎜2⎟
⎝ ⎠ ⎝ ⎠
4 1
3. Explain in words how to generalize your computation in the second part to obtain a basis for R that includes a given pair of (linearly independent) vectors u and v .
n
9.
Elite NASA engineers determine that if a satellite is placed in orbit starting at a point O, it will return exactly to that same point after one orbit of the earth.
Unfortunately, if there is a small mistake in the original location of the
satellite, which the engineers label by a vector X in R³ with origin at O, after one orbit the satellite will instead return to some other point Y ∈ R³. The engineer's computations show that Y is related to X by a matrix
\[
Y = \begin{pmatrix} 0 & \frac{1}{2} & 1\\ \frac{1}{2} & \frac{1}{2} & \frac{1}{2}\\ 1 & \frac{1}{2} & 0 \end{pmatrix} X\, .
\]
1. Find all eigenvalues of the above matrix.
2. Determine all possible eigenvectors associated with each eigenvalue.
Let us assume that the rule found by the engineers applies to all subsequent orbits. Discuss case by case, what will happen to the satellite if the initial mistake in its location is in a direction given by
an eigenvector.
10.
In this problem the scalars in the vector spaces are bits (0, 1 with 1 + 1 = 0 ). The space B is the vector space of bit-valued, k -component column vectors.
k
2. Your answer to part (a) should be a list of vectors v , v , … v . What number did you find for n ?
1 2 n
5. Suppose L : B → B = {0, 1} is a linear transformation. Explain why specifying L(v ), L(v ), … , L(v
3
1 2 n) completely determines L.
6. Use the notation of part (e) to list {all} linear transformations
3
L : B → B. (18.6.12)
18.6.2 https://math.libretexts.org/@go/page/2110
How many different linear transformations did you find? Compare your answer to part 3.
7. Suppose L : B → B and L : B → B are linear transformations, and α and β are bits. Define a new map (α L
1
3
2
3
1
3
+ β L2 ) : B → B by
(α L1 + β L2 )(v) = α L1 (v) + β L2 (v). (18.6.13)
11.
A team of distinguished, post-doctoral engineers analyzes the design for a bridge across the English channel. They notice that the force on the center of the bridge when it is displaced by an amount
x
⎛ ⎞
X =⎜ y ⎟ is given by
⎝ ⎠
z
−x − y
⎛ ⎞
F = ⎜ −x − 2y − z ⎟ . (18.6.14)
⎝ ⎠
−y − z
Moreover, having read Newton's Principiæ, they know that force is proportional to acceleration so that
2
d X
F = . (18.6.15)
2
dt
Since the engineers are worried the bridge might start swaying in the heavy channel winds, they search for an oscillatory solution to this equation of the form
$$
X=\cos(\omega t) \
a
⎛ ⎞
⎜ b ⎟ (18.6.16)
⎝ ⎠
c
\, .
\]
1. By plugging their proposed solution in the above equations the engineers find an eigenvalue problem
a a
⎛ ⎞ ⎛ ⎞
2
M ⎜ b ⎟ = −ω ⎜ b ⎟ . (18.6.17)
⎝ ⎠ ⎝ ⎠
c c
Here M is a 3 × 3 matrix. Which 3 × 3 matrix M did the engineers find? Justify your answer.
2. Find the eigenvalues and eigenvectors of the matrix M .
3. The number |ω| is often called a characteristic frequency. What characteristic frequencies do you find for the proposed bridge?
4. Find an orthogonal matrix P such that M P = P D where D is a diagonal matrix. Be sure to also state your result for D.
5. Is there a direction in which displacing the bridge yields no force? If so give a vector in that direction. Briefly evaluate the quality of this bridge design.
12. Conic Sections: The equation for the most general conic section is given by
2 2
ax + 2bxy + dy + 2cx + 2ey + f = 0 . (18.6.18)
x
relating an unknown column vector X = ( ) , its transpose X , a 2 × 2 matrix M , a constant column vector C and the constant f .
T
2. Does your matrix M obey any special properties? Find its eigenvalues. You may call your answers λ and μ for the rest of the problem to save writing.
For the rest of this problem we will focus on central conics for which the matrix M is invertible.
3. Your equation in part (a) above should be be quadratic in X. Recall that if m ≠ 0 , the quadratic equation mx 2
+ 2cx + f = 0 can be rewritten by completing the square
2
c 2 c
m(x + ) = −f . (18.6.20)
m m
Being very careful that you are now dealing with matrices, use the same trick to rewrite your answer to part (a) in the form
T
Y M Y = g. (18.6.21)
Make sure you give formulas for the new unknown column vector Y and constant g in terms of X, M , C and f . You need not multiply out any of the matrix expressions you find. If all has gone
well, you have found a way to shift coordinates for the original conic equationto a new coordinate system with its origin at the center of symmetry. Our next aim is to rotate the coordinate axes to
produce a readily recognizable equation.
4. Why is the angle between vectors V and W is not changed when you replace them by P V and P W for P any orthogonal matrix?
5. Explain how to choose an orthogonal matrix P such that M P = P D where D is a diagonal matrix.
6. For the choice of P above, define our final unknown vector Z by Y = P Z . Find an expression for Y M Y in terms of Z and the eigenvalues of M .
T
z
7. Call Z = ( ) . What equation do z and w obey? (Hint, write your answer using λ, μ and g.)
w
8. Central conics are circles, ellipses, hyperbolae or a pair of straight lines. Give examples of values of (λ, μ, g) which produce each of these cases.
13. Let L: V → W be a linear transformation between finite-dimensional vector spaces V and W , and let M be a matrix for L (with respect to some basis for V and some basis for W ). We know that
L has an inverse if and only if it is bijective, and we know a lot of ways to tell whether M has an inverse. In fact, L has an inverse if and only if M has an inverse:
18.6.3 https://math.libretexts.org/@go/page/2110
1. Show that 0 is not an eigenvalue of M .
2. Show that L is injective.
3. Show that L is surjective.
14. Captain Conundrum gives Queen Quandary a pair of newborn doves, male and female for her birthday. After one year, this pair of doves breed and produce a pair of dove eggs. One year later
these eggs hatch yielding a new pair of doves while the original pair of doves breed again and an additional pair of eggs are laid. Captain Conundrum is very happy because now he will never need to
buy the Queen a present ever again!
Let us say that in year zero, the Queen has no doves. In year one she has one pair of doves, in year two she has two pairs of doves etc...
Call F the number of pairs of doves in years n . For example, F = 0 , F = 1 and F = 1 . Assume no doves die and that the same breeding pattern continues well into the future. Then
n 0 1 2 F3 = 2
because the eggs laid by the first pair of doves in year two hatch. Notice also that in year three, two pairs of eggs are laid (by the first and second pair of doves). Thus F = 3 . 4
1. Compute F and F .
5 6
Fn
3. Let us introduce a column vector X n =( ) . Compute X and X . Verify that these vectors obey the relationship
1 2
Fn−1
1 1
X2 = M X1 where M = ( ) . (18.6.23)
1 0
4. Show that X = MX .
n+1 n
8. The number
–
1 + √5
φ = (18.6.24)
2
15.
Use Gram--Schmidt to find an orthonormal basis for
$$
\mbox{span}
\left\{
1
⎛ ⎞
⎜1⎟
⎜ ⎟ (18.6.25)
⎜1⎟
⎝ ⎠
1
\, ,\,
1
⎛ ⎞
⎜0⎟
⎜ ⎟ (18.6.26)
⎜1⎟
⎝ ⎠
1
\, ,\,
0
⎛ ⎞
⎜0⎟
⎜ ⎟ (18.6.27)
⎜1⎟
⎝ ⎠
2
\right\}\, .
\]
⊥
16. Let M be the matrix of a linear transformation L : V → W in given bases for V and W . Fill in the blanks below with one of the following six vector spaces: V , W , kerL , (kerL) , ,
imL
⊥
(imL) .
1. The columns of M span –answer in the basis given for answer .
––––––– ––––––––
2. The rows of M span –
answer in the basis given for answer . Suppose
––––––– ––––––––
1 2 1 3
⎛ ⎞
⎜2 1 −1 2 ⎟
M =⎜ ⎟ (18.6.28)
⎜1 0 0 −1 ⎟
⎝ ⎠
4 1 −1 0
3. Find bases for kerL and imL. Use the dimension formula to check your result.
17.
Captain Conundrum collects the following data set
y x
5 −2
2 −1 (18.6.29)
0 1
3 2
18.6.4 https://math.libretexts.org/@go/page/2110
$$
y=ax^{2}+bx+c\, .
\]
1. Write down a system of four linear equations for the unknown coefficients a , b and c .
2. Write the augmented matrix for this system of equations.
3. Find the reduced row echelon form for this augmented matrix.
4. Are there any solutions to this system?
5. Find the least squares solution to the system.
6. What value does Captain Conundrum predict for y when x = 2 ?
18. Suppose you have collected the following data for an experiment
x y
x1 y1
(18.6.30)
x2 y2
x3 y3
Be sure to give explicit expressions for the matrix M and column vector V .
3. For a generic data set, would you expect your system of equations to have a solution? Briefly explain your answer.
4. Calculate M M and (M M ) (for the latter computation, state the condition required for the inverse to exist).
T T −1
Solutions
1. You can find the definitions for all these terms by consulting the index of this book.
2. Both junctions give the same equation for the currents
I + J + 13 = 0 . (18.6.32)
There are three voltage loops (one on the left, one on the right and one going around the outside of the circuit). Respectively, they give the equations
60 − I − 80 − 3I = 0 (18.6.33)
80 + 2J − V + 3J = 0 (18.6.34)
60 − I + 2J − V + 3J − 3I = 0 . (18.6.35)
The above equations are easily solved (either using an augmented matrix and row reducing, or by substitution). The result is I = −5 Amps, J = −8 Amps, V = 40 Volts.
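The same answer can be obtained by handing the equations to a linear solver; the sketch below (numpy assumed) uses the junction equation and the two single-loop equations, rearranged into matrix form by us:

import numpy as np

# Rows:  I + J = -13;   -4I = 20  (from 60 - I - 80 - 3I = 0);   5J - V = -80.
A = np.array([[ 1, 1,  0],
              [-4, 0,  0],
              [ 0, 5, -1]])
b = np.array([-13, 20, -80])
print(np.linalg.solve(A, b))   # [-5. -8. 40.]  ->  I = -5 Amps, J = -8 Amps, V = 40 Volts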
3.
1. m.
2. n .
3. Yes.
4. n × n .
5. m × m .
6. Yes. This relies on kerM = 0 because if M M had a non-trivial kernel, then there would be a non-zero solution X to M
T T
MX = 0 . But then by multiplying on the left by X we see that
T
||M X|| = 0 . This in turn implies M X = 0 which contradicts the triviality of the kernel of M .
T
7. Yes because (M M ) = M (M ) = M M .
T T T T T
Then
1 1 1 1 1 0 0
⎛ ⎞ ⎛ ⎞⎛1 1 1 1
⎞
M =⎜1 2 2 2⎟ =⎜
1 1 0⎟⎜0 1 1 1⎟ (18.6.37)
⎝ ⎠ ⎝ ⎠⎝ ⎠
1 2 3 3 1 0 1 0 1 2 2
1 0 0 1 1 1 1
⎛ ⎞⎛ ⎞
=⎜1 1 0⎟⎜0 1 1 1 ⎟ = LU (18.6.38)
⎝ ⎠⎝ ⎠
1 1 1 0 0 1 1
18.6.5 https://math.libretexts.org/@go/page/2110
a
⎛ ⎞
So now M X = V becomes LW =V where W = UX = ⎜ b ⎟ (say). Thus we solve LW =V by forward substitution
⎝ ⎠
c
a = 1, a + b = 1, a + b + c = 1 ⇒ a = 1, b = 0, c = 0 . (18.6.39)
⎧⎛ x ⎞
⎪ ⎛
1
⎞ ⎫
⎪
⎪ ⎪
⎪ ⎪
⎜ y ⎟ ⎜ 0 ⎟
The solution set is ⎨⎜ ⎟ =⎜ ⎟ : μ ∈ R⎬
⎜ z ⎟ ⎜ −μ ⎟
⎪ ⎪
⎪
⎩⎝
⎪ ⎪
⎭
⎪
⎠ ⎝ ⎠
y μ
5. First
1 2
det ( ) = −2 . (18.6.42)
3 4
All the other determinants vanish because the first three rows of each matrix are not independent. Indeed, 2R 2 − R1 = R3 in each case, so we can make row operations to get a row of zeros and thus
a zero determinant.
x
⎛ ⎞
6. If U spans R , then we must be able to express any vector X = ⎜ y ⎟
3
∈ R
3
⎝ ⎠
z
as
1
1 1 a 1 1 a c
⎛ ⎞ ⎛ ⎞ ⎛ ⎞ ⎛ ⎞⎛ ⎞
1 2 3 2
X =c ⎜ 0 ⎟+c ⎜ 2 ⎟+c ⎜1⎟ =⎜0 2 1⎟⎜c ⎟ , (18.6.43)
⎝ ⎠ ⎝ ⎠ ⎝ ⎠ ⎝ ⎠⎝ 3 ⎠
1 −3 0 1 −3 0 c
for some coefficients $c , c and c . This is a linear system. We could solve for c , c and c using an augmented matrix and row operations. However, since we know that dim R
1 2 3 1 2 3 3
=3 , if U spans
R , it will also be a basis. Then the solution for c , c and c would be unique. Hence, the 3 × 3 matrix above must be invertible, so we examine its determinant
3 1 2 3
1 1 a
⎛ ⎞
det ⎜ 0 2 1 ⎟ = 1.(2.0 − 1.(−3)) + 1.(1.1 − a.2) = 4 − 2a . (18.6.44)
⎝ ⎠
1 −3 0
Thus U spans R whenever a ≠ 2 . When a = 2 we can write the third vector in U in terms of the preceding ones as
3
2 1 1
⎛ ⎞ ⎛ ⎞ ⎛ ⎞
3 1
⎜1⎟ = ⎜0⎟+ ⎜ 2 ⎟. (18.6.45)
2 2
⎝ ⎠ ⎝ ⎠ ⎝ ⎠
0 1 −3
1 2
⎛ ⎞ ⎛ ⎞
You can obtain this result, or an equivalent one by studying the above linear system with X =0 , i.e., the associated homogeneous system. The two vectors ⎜ 2 ⎟ and ⎜1⎟ are clearly linearly
⎝ ⎠ ⎝ ⎠
−3 0
independent, so this is the least number of vectors spanning U for this value of a . Also we see that dimU =2 in this case. Your picture should be a plane in R
3
though the origin containing the
1 2
⎛ ⎞ ⎛ ⎞
vectors ⎜ 2 ⎟ and ⎜ 1 ⎟.
⎝ ⎠ ⎝ ⎠
−3 0
7.
$$
\det
1 x
( ) (18.6.46)
1 y
=y-x\, ,\ \
\]
2 2
1 x x 1 x x
⎛ ⎞ ⎛ ⎞
2 2 2 2 2 2 2
det ⎜ 1 y y ⎟ = det ⎜ 0 y −x y −x ⎟ $$$$ = (y − x)(z − x ) − (y − x )(z − x) (18.6.47)
⎝ 2 ⎠ ⎝ 2 2 ⎠
1 z z 0 z−x z −x
= (y − x)(z − x)(z − y) .
\[
\det
2 3
1 x x x
⎛ ⎞
2 3
⎜1 y y y ⎟
⎜ ⎟ (18.6.48)
⎜ 2 3 ⎟
⎜1 z z z ⎟
⎝ 2 3 ⎠
1 w w w
=
\det
18.6.6 https://math.libretexts.org/@go/page/2110
2 3
1 x x x
⎛ ⎞
2 2 3 3
⎜0 y −x y −x y −x ⎟
⎜ ⎟ (18.6.49)
⎜ 2 2 3 3 ⎟
⎜0 z−x z −x z −x ⎟
⎝ 2 2 3 3 ⎠
0 w −x w −x w −x
(18.6.50)
=
\det
1 0 0 0
⎛ ⎞
2
⎜0 y −x y(y − x) y (y − x) ⎟
⎜ ⎟ (18.6.51)
⎜ 2 ⎟
⎜0 z−x z(z − x) z (z − x) ⎟
⎝ 2 ⎠
0 w −x w(w − x) w (w − x)
(18.6.52)
=
(y-x)(z-x)(w-x)\det
1 0 0 0
⎛ ⎞
2
⎜0 1 y y ⎟
⎜ ⎟ (18.6.53)
2
⎜0 1 z z ⎟
⎝ 2 ⎠
0 1 w w
(18.6.54)
\[
=(y-x)(z-x)(w-x)\det\begin{pmatrix}1 & y & y^{2}\\ 1 & z & z^{2}\\ 1 & w & w^{2}\end{pmatrix}
=(y-x)(z-x)(w-x)(z-y)(w-y)(w-z)\, .
\]
From the 4 × 4 case above, you can see all the tricks required for a general Vandermonde matrix: row and column operations reduce it to a smaller Vandermonde determinant once the common factors \((x_{i}-x_{1})\) are pulled out of each row. Pulling out those factors does change the determinant, so we write them outside the remaining determinant,
\det
2 n−1
1 x1 (x1 ) ⋯ (x1 )
⎛ ⎞
2 n−1
⎜1 x2 (x2 ) ⋯ (x2 ) ⎟
⎜ ⎟
⎜ 2 n−1 ⎟
⎜1 x3 (x3 ) ⋯ (x3 ) ⎟ (18.6.58)
⎜ ⎟
⎜ ⎟
⎜ ⎟
⎜ ⋮ ⋮ ⋮ ⋱ ⋮ ⎟
⎝ 2 n−1 ⎠
1 xn (xn ) ⋯ (xn )
=\prod_{i>j}(x_{i}-x_{j})\, .
$$
(Here ∏ stands for a multiple product, just like Σ stands for a multiple sum.)
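The product formula is easy to test numerically; in this Python sketch (numpy is an assumption and the sample points are arbitrary) both sides agree:

import numpy as np
from itertools import combinations

# Vandermonde determinant versus the product formula prod_{i>j}(x_i - x_j).
x = np.array([2.0, 3.0, 5.0, 7.0])
V = np.vander(x, increasing=True)   # rows are (1, x_k, x_k^2, x_k^3)
lhs = np.linalg.det(V)
rhs = np.prod([x[j] - x[i] for i, j in combinations(range(len(x)), 2)])
print(lhs, rhs)                     # both approximately 240.0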
8.
1. No, a basis for R must have exactly three vectors.
3
2. We first extend the original vectors by the standard basis for R and then try to eliminate two of them by considering
4
1 4 1 0 0 0
⎛ ⎞ ⎛ ⎞ ⎛ ⎞ ⎛ ⎞ ⎛ ⎞ ⎛ ⎞
⎝ ⎠ ⎝ ⎠ ⎝ ⎠ ⎝ ⎠ ⎝ ⎠ ⎝ ⎠
4 1 0 0 0 1
So we study
1 4 1 0 0 0 1 4 1 0 0 0
⎛ ⎞ ⎛ ⎞
⎜2 3 0 1 0 0⎟ ⎜0 −5 −2 1 0 0⎟
⎜ ⎟ ∼⎜ ⎟ (18.6.60)
⎜3 2 0 0 1 0⎟ ⎜0 −10 −3 0 1 0⎟
⎝ ⎠ ⎝ ⎠
4 1 0 0 0 1 0 −15 −4 0 0 1
18.6.7 https://math.libretexts.org/@go/page/2110
3 3
⎛1 0 − −4 0 0⎞ ⎛1 0 0 2
5
0 ⎞
5
⎜0 19 2
⎜0 1
2 1
0 0⎟ ⎜ 1 0 − − 0 ⎟
⎟
⎜ 5 5 ⎟ 5 5
∼ ∼⎜ ⎟ (18.6.61)
⎜ ⎟ ⎜ ⎟
⎜0 0 1 10 1 0⎟ ⎜0 0 1 10 1 0 ⎟
⎝ ⎠ ⎝ 5 1 ⎠
0 0 2 15 0 1 0 0 0 − −10
2 2
From here we can keep row reducing to achieve RREF, but we can already see that the non-pivot variables will be ε and η . Hence we can eject the last two vectors and obtain as our basis
⎧ 1 4 1 0
⎪⎛ ⎞ ⎛ ⎞ ⎛ ⎞ ⎛ ⎞⎫
⎪
⎪ ⎪
⎪ ⎪
⎜2⎟ ⎜ 3⎟ ⎜0⎟ ⎜1⎟
⎨⎜ ⎟,⎜ ⎟,⎜ ⎟,⎜ ⎟⎬ . (18.6.62)
⎜3⎟ ⎜ 2 ⎟ ⎜ 0 ⎟ ⎜ 0 ⎟⎪
⎪
⎪
⎩⎝
⎪ ⎪
⎠ ⎝ ⎠ ⎝ ⎠ ⎝ ⎠⎪
⎭
4 1 0 0
1. ⎛ λ −
1
2
−1 ⎞
1 1 1
1 1 1 λ 1 1
det ⎜ − λ− − ⎟ = λ((λ − )λ − )+ (− − )−(− + λ) (18.6.63)
⎜ 2 2 2 ⎟
2 4 2 2 2 4
⎝ −1 1 ⎠
− λ
2
3
1 2
3 3
=λ − λ − λ = λ(λ + 1)(λ − ). (18.6.64)
2 2 2
s
⎛ ⎞
So we find the eigenvector ⎜ −2s ⎟ where s ≠ 0 is arbitrary. For λ = −1
⎝ ⎠
s
1
⎛ 1 1 0 ⎞
2 ⎛ 1 0 1 0 ⎞
⎜ 1 3 1 ⎟
⎜ 0 ⎟ ∼⎜
⎜ 0 1 0 0 ⎟ .
⎟
(18.6.66)
2 2 2
⎜ ⎟
1 ⎝ 0 0 0 0 ⎠
⎝ 1 1 0 ⎠
2
−s
⎛ ⎞
So we find the eigenvector ⎜ 0 ⎟ where s ≠ 0 is arbitrary.Finally, for λ = 3
2
⎝ ⎠
s
3 1 1 3
⎛ − 1 0 ⎞ ⎛ 1 − 0 ⎞
2 2 2 2 ⎛ 1 0 −1 0 ⎞
⎜ 1 1 ⎟ ⎜ 5 5 ⎟
⎜ −1 0 ⎟ ∼⎜ 0 − 0 ⎟ ∼⎜
⎜ 0 1 −1 0 ⎟ .
⎟
(18.6.67)
2 2 4 4
⎜ ⎟ ⎜ ⎟
1 3 5 5 ⎝ 0 0 0 0 ⎠
⎝ 1 − 0 ⎠ ⎝ 0 − 0 ⎠
2 2 4 4
s 1
⎛ ⎞ ⎛ ⎞
So we find the eigenvector ⎜ s ⎟ where s ≠ 0 is arbitrary. If the mistake X is in the direction of the eigenvector ⎜ −2 ⎟ , then Y =0 . I.e. , the satellite returns to the origin O . For all subsequent
⎝ ⎠ ⎝ ⎠
s 1
−1
⎛ ⎞
orbits it will again return to the origin. NASA would be very pleased in this case.If the mistake X is in the direction ⎜ 0 ⎟ , then Y = −X . Hence the satellite will move to the point opposite to
⎝ ⎠
1
X . After next orbit will move back to X. It will continue this wobbling motion indefinitely. Since this is a stable situation, again, the elite engineers will pat themselves on the back. Finally, if the
1
⎛ ⎞
mistake X is in the direction ⎜ 1 ⎟, the satellite will move to a point Y =
3
2
X which is further away from the origin. The same will happen for all subsequent orbits, with the satellite moving a
⎝ ⎠
1
factor 3/2 further away from O each orbit (in reality, after several orbits, the approximations used by the engineers in their calculations probably fail and a new computation will be needed). In
this case, the satellite will be lost in outer space and the engineers will likely lose their jobs!
10.
⎧⎛ 1 ⎞
⎪ ⎛
0
⎞ ⎛
0
⎞⎫
⎪
1. A basis for B is ⎨⎜ 0 ⎟ , ⎜ 1 ⎟ , ⎜ 0 ⎟⎬
3
⎩⎝
⎪ ⎠ ⎝ ⎠ ⎝ ⎭
⎠⎪
0 0 1
2. 3.
3. 2 = 8 .
3
4. dimB = 3 .
3
1
b
⎛ ⎞
5. Because the vectors {v 1, v2 , v3 } are a basis any element v ∈ B can be written uniquely as v = b
3 1
v1 + b v2 + b v3
2 3
for some triplet of bits ⎜ b 2
⎟ . Hence, to compute L(v) we use linearity of L
⎝ 3 ⎠
b
1 2 3 1 2 3
L(v) = L(b v1 + b v2 + b v3 ) = b L(v1 ) + b L(v2 ) + b L(v3 ) (18.6.68)
1
b
⎛ ⎞
2
= ( L(v1 ) L(v2 ) L(v3 ) ) ⎜ b ⎟ . (18.6.69)
⎝ 3 ⎠
b
6. From the notation of the previous part, we see that we can list linear transformations L : B 3
→ B by writing out all possible bit-valued row vectors
18.6.8 https://math.libretexts.org/@go/page/2110
(0 0 0),
(1 0 0),
(0 1 0),
(0 0 1),
(1 1 0),
(1 0 1),
(0 1 1),
(1 1 1).
There are 2 = 8 different linear transformations L : B → B , exactly the same as the number of elements in B .
3 3 3
7. Yes, essentially just because L and L are linear transformations. In detail for any bits (a, b) and vectors (u, v) in B it is easy to check the linearity property for (α L
1 2
3
1 + β L2 )
Here the first line used the definition of (α L + β L ) , the second line depended on the linearity of L and L , the third line was just algebra and the fourth used the definition of (α L
1 2 1 2 1 + β L2 )
again.
8. Yes. The easiest way to see this is the identification above of these maps with bit-valued column vectors. In that notation, a basis is
Since this (spanning) set has three (linearly independent) elements, the vector space of linear maps B 3
→ B has dimension 3. This is an example of a general notion called the dual vector space.
11.
a a
2 ⎛ ⎞ ⎛ ⎞
2 d cos(ωt)
1. d X
2
= 2
⎜ b ⎟ = −ω
2
cos(ωt) ⎜ b ⎟ . Hence
dt dt
⎝ ⎠ ⎝ ⎠
c c
−a − b −1 −1 0 a
⎛ ⎞ ⎛ ⎞⎛ ⎞
F = cos(ωt) ⎜ −a − 2b − c ⎟ = cos(ωt) ⎜ −1 −2 −1 ⎟ ⎜ b ⎟
⎝ ⎠ ⎝ ⎠⎝ ⎠
−b − c 0 −1 −1 c
a
⎛ ⎞
2
= −ω cos(ωt) ⎜ b ⎟ ,
⎝ ⎠
c
so
−1 −1 0
⎛ ⎞
M = ⎜ −1 −2 −1 ⎟ . (18.6.75)
⎝ ⎠
0 −1 −1
2. ⎛
λ +1 1 0
⎞
= (λ + 1)((λ + 2)(λ + 1) − 2)
2
= (λ + 1)(λ + 3λ) = λ(λ + 1)(λ + 3)
1
⎛ ⎞
so ⎜ −1 ⎟ is an eigenvector. For λ = −1
⎝ ⎠
1
0 −1 0 1 0 1
⎛ ⎞ ⎛ ⎞
M (−1). I = ⎜ −1 −1 −1 ⎟ ∼ ⎜ 0 1 0⎟ , (18.6.77)
⎝ ⎠ ⎝ ⎠
0 −1 0 0 0 0
−1
⎛ ⎞
so ⎜ 0 ⎟ is an eigenvector. For λ = −3
⎝ ⎠
1
2 −1 0 1 −1 1 1 0 −1
⎛ ⎞ ⎛ ⎞ ⎛ ⎞
M − (−3). I = ⎜ −1 1 −1 ⎟ ∼ ⎜ 0 1 −2 ⎟ ∼ ⎜ 0 1 −2 ⎟ , (18.6.78)
⎝ ⎠ ⎝ ⎠ ⎝ ⎠
0 −1 2 0 −1 2 0 0 0
1
⎛ ⎞
so ⎜ 2 ⎟ is an eigenvector.
⎝ ⎠
1
–
3. The characteristic frequencies are 0, 1, √3.
4. The orthogonal change of basis matrix
1 1 1
−
⎛ √3 √2 √6 ⎞
⎜ 1 2 ⎟
P =⎜− 0 ⎟ (18.6.79)
√3 √6
⎜ ⎟
1 1 1
⎝ ⎠
√3 √2 √6
It obeys M P = PD where
18.6.9 https://math.libretexts.org/@go/page/2110
0 0 0
⎛ ⎞
D =⎜0 −1 0 ⎟ . (18.6.80)
⎝ ⎠
0 0 −3
1
⎛ ⎞
5. Yes, the direction given by the eigenvector ⎜ −1 ⎟ because its eigenvalue is zero. This is probably a bad design for a bridge because it can be displaced in this direction with no force!
⎝ ⎠
1
12.
a b c
1. If we call M =( ) , then X T
M X = ax
2
+ 2bxy + dy
2
. Similarly putting C =( ) yields X T
C +C
T
X = 2X ⋅ C = 2cx + 2ey . Thus
b d e
2 2
0 = ax + 2bxy + dy + 2cx + 2ey + f (18.6.81)
a b x c x
=(x y)( )( ) +(x y)( ) +(c e)( ) +f. (18.6.82)
b d y e y
2. Yes, the matrix M is symmetric, so it will have a basis of eigenvectors and is similar to a diagonal matrix of real eigenvalues. To find the eigenvalues notice that
a−λ b 2 2
det ( ) = (a − λ)(d − λ) − b
2
= (λ −
a+d
2
) −b
2
−(
a−d
2
) . So the eigenvalues are
b d−λ
−−−−−−−−−−− −−−−−−−−−−−
a+d 2
a−d 2 a+d 2
a−d 2
λ = + √b +( ) and μ = − √b + ( ) . (18.6.83)
2 2 2 2
so that
T T −1 −1 T
(X +C M )M (X + M C) = C MC − f . (18.6.85)
Hence Y = X + M C and g = C M C − f .
−1 T
So replacing V → PV and W → PW will always give a factor \(P^{T}P\) inside all the products, but \(P^{T}P = I\) for orthogonal matrices. Hence none of the dot products in the above formula change, and therefore neither does the angle between V and W.
8. When λ = μ and g/λ = R², we get the equation for a circle of radius R in the (z, w)-plane. When λ, μ and g are positive, we have the equation for an ellipse. Vanishing g along with λ and μ of opposite signs gives a pair of straight lines. When g is non-vanishing, but λ and μ have opposite signs, the result is a pair of hyperbolæ. These shapes all come from cutting a cone with a plane,
and are therefore called conic sections.
13. We show that L is bijective if and only if M is invertible.
a) We suppose that L is bijective.
1. Since L is injective, its kernel consists of the zero vector alone. Hence
nullL = dim ker L = 0. (18.6.88)
Thereby
2. Since dim V = dim W , the matrix M is square so we can talk about its eigenvalues. Since L is injective, its kernel is the zero vector alone. That is, the only solution to LX = 0 is X = 0 . But V
LX is the same as M X , so the only solution to M X = 0 is X = 0 . So M does not have zero as an eigenvalue. V
2. Since LX = 0 has no non-zero solutions, the kernel of L is the zero vector alone. So L is injective.
3. Since M is invertible, we must have that dim V = dim W . By the Dimension Formula, we have
dim V = nullL + rankL (18.6.92)
Since L(V ) is a subspace of W with the same dimension as W , it must be equal to W . To see why, pick a basis B of (L(V)\). Each element of B is a vector in W , so the elements of B form a
linearly independent set in W . Therefore B is a basis of W , since the size of B is equal to dim W . So L(V ) = spanB = W . So L is surjective.
14.
1. F = F + F = 2 + 3 = 5 .
4 2 3
2. The number of pairs of doves in any given year equals the number of the previous years plus those that hatch and there are as many of them as pairs of doves in the year before the previous year.
18.6.10 https://math.libretexts.org/@go/page/2110
F1 1 F2 1
3. X 1 =( ) =( ) and X 2 =( ) =( ) .
F0 0 F1 1
1 1 1 1
M X1 = ( )( ) =( ) = X2 . (18.6.94)
1 0 0 1
4. We just need to use the recursion relationship of part (b) in the top slot of X n+1 :
Fn+1 Fn + Fn−1 1 1 Fn
Xn+1 = ( ) =( ) =( )( ) = M Xn . (18.6.95)
Fn Fn 1 0 Fn−1
1 −λ 1 1 2 5
det ( ) = λ(λ − 1) − 1 = (λ − ) − , (18.6.96)
1 −λ 2 4
1±√5
1±√5 1+√5 1+√5 1+√5 1−√5 1−√5 1−√5
so the eigenvalues are 2
. Hence the eigenvectors are ( 2 ) , respectively (notice that 2
+1 =
2
.
2
and 2
+1 =
2
.
2
). Thus M = P DP
−1
with
1
1+√5
1+√5 1−√5
⎛ 0 ⎞
2
D = and P = ( 2 2 ) . (18.6.97)
1−√5
⎝ ⎠ 1 1
0
2
6. M = (P DP ) = P DP P DP … P DP = P D
n −1 n −1 −1 −1 n
P
−1
.
7. Just use the matrix recursion relation of part 4 repeatedly:
2 n
Xn+1 = M Xn = M Xn−1 = ⋯ = M X1 . (18.6.98)
1+√5 1−√5
8. The eigenvalues are φ = 2
and 1 − φ = 2
.
9. Fn+1 n n −1
Xn+1 = ( ) =M Xn = P D P X1 (18.6.99)
Fn
1 1
n ⋆ n
φ 0 ⎛ √5
⎞ 1 φ 0 ⎛ √5
⎞
= P( ) ( ) =P ( ) (18.6.100)
1 n 1
0 1 −φ ⎝− ⋆⎠ 0 0 (1 − φ) ⎝− ⎠
√5 √5
n
φ
1+√5 1−√5 ⎛ ⎞ ⋆
√5
=( 2 2 )⎜ ⎟ =( ). (18.6.101)
n
n
n φ −(1−φ)
(1−φ)
1 1 ⎝− ⎠ √5
√5
Hence
\[
F_{n} = \frac{\varphi^{n} - (1-\varphi)^{n}}{\sqrt{5}}\, .
\]
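Both descriptions of F_n, the matrix power and the closed formula, can be compared directly; here is a short Python sketch (the helper names are ours):

import numpy as np

M = np.array([[1, 1],
              [1, 0]])
phi = (1 + 5 ** 0.5) / 2

def fib_matrix(n):
    # M^n = [[F_{n+1}, F_n], [F_n, F_{n-1}]], so read off the (0, 1) entry.
    return int(np.linalg.matrix_power(M, n)[0, 1])

def fib_formula(n):
    return round((phi ** n - (1 - phi) ** n) / 5 ** 0.5)

print([fib_matrix(n) for n in range(10)])                          # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
print(all(fib_matrix(n) == fib_formula(n) for n in range(10)))     # True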
15.
\[
v^{\perp} = v - \frac{u\cdot v}{u\cdot u}\,u = v - \frac{3}{4}\,u = \begin{pmatrix}\frac{1}{4}\\ -\frac{3}{4}\\ \frac{1}{4}\\ \frac{1}{4}\end{pmatrix}\, ,
\]
and
−1
⎛ ⎞
3
⊥
u⋅w v ⋅w 3 4 ⎜ 0 ⎟
⊥ ⊥ ⊥
w =w− u− v =w− u− v =⎜ ⎟ (18.6.104)
⊥ ⊥ 3 ⎜ ⎟
u⋅u v ⋅v 4 0
4
⎝ ⎠
1
⎝ 1 ⎠
2
,
√3
⎛ ⎞
6
⎜ √3
⎟
⎜− ⎟
⎜ 2 ⎟
⎜ ⎟ (18.6.106)
⎜ √ 3 ⎟
⎜ ⎟
6
⎜ ⎟
√3
⎝ ⎠
6
,
√2
⎛− ⎞
2
⎜ ⎟
⎜ 0 ⎟
⎜ ⎟ (18.6.107)
⎜ 0 ⎟
⎜ ⎟
√2
⎝ ⎠
2
18.6.11 https://math.libretexts.org/@go/page/2110
\right\}\, .
\]
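The same computation can be automated; the following Python sketch runs the Gram–Schmidt procedure on the three vectors of the problem and reproduces the orthonormal basis above (numpy is an assumption):

import numpy as np

vectors = [np.array([1., 1., 1., 1.]),
           np.array([1., 0., 1., 1.]),
           np.array([0., 0., 1., 2.])]

basis = []
for v in vectors:
    for e in basis:
        v = v - (e @ v) * e          # subtract the projection onto each earlier direction
    basis.append(v / np.linalg.norm(v))

for e in basis:
    print(np.round(e, 4))
# The Gram matrix of the result is the identity: all dot products are 0 or 1.
print(np.round([basis[i] @ basis[j] for i in range(3) for j in range(3)], 10))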
16.
1. The columns of M span imL in the basis given for W .
2. The rows of M span (kerL) in the basis given for V .
⊥
⎜2 1 −1 2 ⎟ ⎜0 −3 −3 −4 ⎟
M =⎜ ⎟ ∼⎜ ⎟ (18.6.108)
⎜1 0 0 −1 ⎟ ⎜0 −2 −1 −4 ⎟
⎝ ⎠ ⎝ ⎠
4 1 −1 0 0 −7 −5 −12
1
1 0 −1 1 0 0 −1
⎛ 3 ⎞
⎛ ⎞
4 8
⎜0 1 1 ⎟
⎜ ⎟ ⎜0 1 0 ⎟
3 3
∼⎜ ⎜ ⎟ .
⎟ ∼⎜ ⎟
(18.6.109)
⎜0 4 ⎟ 4
0 1 − ⎜0 0 1 − ⎟
⎜ 3 ⎟ 3
⎝ 8 ⎠ ⎝ ⎠
0 0 2 − 0 0 0 0
3
Hence
8 4
kerL = span{ v1 − v2 + v3 + v4 } (18.6.110)
3 3
and
imL = span{ v1 + 2 v2 + v3 + 4 v4 , 2 v1 + v2 + v4 , v1 − v2 − v4 } . (18.6.111)
17.
1. $$\left\{
5 = 4a − 2c + c
2 = a−b +c
(18.6.113)
0 = a+b +c
3 = 4a + 2b + c .
\right.\]
2. (Also 3, 4 and 5)
⎛ 4 −2 1 5 ⎞ ⎛ 1 1 1 0 ⎞ ⎛ 1 0 1 −1 ⎞
⎜ 1 −1 1 2 ⎟ ⎜ 0 −6 −3 5 ⎟ ⎜ 0 1 0 1 ⎟
⎜ ⎟ ∼⎜ ⎟ ∼⎜ ⎟ (18.6.114)
⎜ ⎟ ⎜ ⎟ ⎜ ⎟
⎜ 1 1 1 0 ⎟ ⎜ 0 −2 0 2 ⎟ ⎜ 0 0 −3 11 ⎟
⎝ 4 2 1 3 ⎠ ⎝ 0 −2 −3 3 ⎠ ⎝ 0 0 −3 3 ⎠
3
is impossible.
6. Let
4 −2 1 5
⎛ ⎞ ⎛ ⎞
⎜1 −1 1⎟ ⎜2⎟
M =⎜ ⎟ and V = ⎜ ⎟ . (18.6.115)
⎜1 1 1⎟ ⎜0⎟
⎝ ⎠ ⎝ ⎠
4 2 1 3
Then
34 0 10 34
⎛ ⎞ ⎛ ⎞
T T
M M =⎜ 0 10 0 ⎟ and M V = ⎜ −6 ⎟ . (18.6.116)
⎝ ⎠ ⎝ ⎠
10 0 4 10
So
5
and c = 0 .
7. The Captain predicts y(2) = 1.2 2
−
3
5
.2 + 0 =
14
5
.
18. We show that L is bijective if and only if M is invertible.
1. We suppose that L is bijective. Since L is injective, its kernel consists of the zero vector alone. So
So
dim V = \rankL = dim W . (18.6.120)
2. Since dim V = dim W , the matrix M is square so we can talk about its eigenvalues. Since L is injective, its kernel is the zero vector alone. That is, the only solution to LX = 0 is X = 0 . But V
LX is the same as M X , so the only solution to M X = 0 is X = 0 . So M does not have zero as an eigenvalue.
V
18.6.12 https://math.libretexts.org/@go/page/2110
4. Now we suppose that M is an invertible matrix. Since M is invertible, the system M X = 0 has no non-zero solutions. But LX is the same as M X , so the only solution to LX = 0 is X = 0 . So V
5. Since LX = 0 has no non-zero solutions, the kernel of L is the zero vector alone. So L is injective.
6. Since M is invertible, we must have that dim V = dim W . By the Dimension Formula, we have
dim V = nullL + \rankL (18.6.121)
Since L(V ) is a subspace of W with the same dimension as W , it must be equal to W . To see why, pick a basis B of L(V ). Each element of B is a vector in W , so the elements of B form a
linearly independent set in W . Therefore B is a basis of W , since the size of B is equal to dim W . So L(V ) = \spaB = W . So L is surjective.
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 18.6: Sample Final Exam is shared under a not declared license and was authored, remixed, and/or curated by David Cherney, Tom Denton, & Andrew Waldron.
18.6.13 https://math.libretexts.org/@go/page/2110
18.7: Movie Scripts 1-2
G.1: What is Linear Algebra
Hint for Review Problem 5
Looking at the problem statement we find some important information, first that oranges always have twice as much sugar as
apples, and second that the information about the barrel is recorded as (s, f ), where s = units of sugar in the barrel and f =
number of pieces of fruit in the barrel.
We are asked to find a linear transformation relating this new representation to the one in the lecture, where in the lecture x = the
number of apples and y = the number of oranges. This means we must create a system of equations relating the variable x and y to
the variables s and f in matrix form. Your answer should be the matrix that transforms one set of variables into the other.
Hint: Let λ represent the amount of sugar in each apple.
1. To find the first equation relate f to the variables x and y .
2. To find the second equation, use the hint to figure out how much sugar is in x apples, and y oranges in terms of λ . Then write
an equation for s using x, y and λ .
1 1 27
( ) , (18.7.1)
2 −1 0
x + y = 27
2x − y = 0 ?
Well the augmented matrix is just a new notation for the matrix equation
1 1 x 27
( )( ) =( )
2 −1 y 0
x +y 27
( ) =( ) ,
2x − y 0
18.7.1 https://math.libretexts.org/@go/page/2192
1 1 27 1 0 9
( ) and ( ) (18.7.2)
2 −1 0 0 1 18
to be equivalent:
They are certainly not equal, because they don't match in each component, but since these augmented matrices represent a system,
we might want to introduce a new kind of equivalence relation.
Well we could look at the system of linear equations this represents
x + y = 27
2x − y = 0
and notice that the solution is x = 9 and y = 18. The other augmented matrix represents the system
x +0 ⋅ y = 9
0⋅x + y = 18
This clearly has the same solution. The first and second system are related in the sense that their solutions are the same. Notice that
it is really nice to have the augmented matrix in the second form, because the matrix multiplication can be done in your head.
Symmetric: If the first person, Bob (say), has the same hair color as a second person, Betty (say), then Betty has the same hair color as Bob, so this holds too.
18.7.2 https://math.libretexts.org/@go/page/2192
Transitive: If Bob has the same hair color as Betty (say) and Betty has the same color as Brenda (say), then it follows that Bob
and Brenda have the same hair color, so the transitive property holds too and we are done.
\(\left(\begin{array}{rrr|r} 1 & 0 & 3 & 2 \\ 0 & 1 & 0 & 1 \end{array}\right)\)   (18.7.3)

This represents the system

1 ⋅ x_1 + 3x_3 = 2
1 ⋅ x_2 = 1

Notice that when the system is written this way the copy of the 2 × 2 identity matrix \(\begin{pmatrix}1&0\\0&1\end{pmatrix}\) makes it easy to write a solution in terms of the variables x_1 and x_2. We will call x_1 and x_2 the pivot variables. The third column \(\begin{pmatrix}3\\0\end{pmatrix}\) does not look like part of an identity matrix, and there is no 3 × 3 identity in the augmented matrix. Notice there are more variables than equations and that this means we will have to write the solutions for the system in terms of the variable x_3. We'll call x_3 the free variable.
Let x3 =μ . (We could also just add a "dummy'' equation x3 = x3 .) Then we can rewrite the first equation in our system
x1 + 3 x3 = 2
x1 + 3μ = 2
x1 = 2 − 3μ.
Then since the second equation doesn't depend on μ we can keep the equation
x2 = 1, (18.7.4)
and for a third equation we can write
x3 = μ (18.7.5)
so that, in vector form, the solution is

\(\begin{pmatrix}x_1\\x_2\\x_3\end{pmatrix} = \begin{pmatrix}2-3\mu\\1\\\mu\end{pmatrix} = \begin{pmatrix}2\\1\\0\end{pmatrix} + \mu\begin{pmatrix}-3\\0\\1\end{pmatrix}.\)
Any value of μ will give a solution of the system, and any system can be written in this form for some value of μ . Since there are
multiple solutions, we can also express them as a set:
\(\left\{\begin{pmatrix}x_1\\x_2\\x_3\end{pmatrix} = \begin{pmatrix}2\\1\\0\end{pmatrix} + \mu\begin{pmatrix}-3\\0\\1\end{pmatrix} \;\middle|\; \mu \in \mathbb{R}\right\}.\)   (18.7.6)
Suppose we are given the two augmented matrices

\(\left(\begin{array}{rrrr|r} 2&5&2&0&2\\1&1&1&0&1\\1&4&1&0&1 \end{array}\right) \quad\text{and}\quad \left(\begin{array}{rr|r} 5&2&9\\0&5&10\\0&3&6 \end{array}\right)\)   (18.7.7)
and we want to find the solution to those systems. We will do so by doing Gaussian elimination.
For the first matrix we have
\(\left(\begin{array}{rrrr|r} 2&5&2&0&2\\1&1&1&0&1\\1&4&1&0&1 \end{array}\right) \overset{R_1\leftrightarrow R_2}{\sim} \left(\begin{array}{rrrr|r} 1&1&1&0&1\\2&5&2&0&2\\1&4&1&0&1 \end{array}\right) \overset{R_2-2R_1;\,R_3-R_1}{\sim} \left(\begin{array}{rrrr|r} 1&1&1&0&1\\0&3&0&0&0\\0&3&0&0&0 \end{array}\right)\)

\(\overset{\frac{1}{3}R_2}{\sim} \left(\begin{array}{rrrr|r} 1&1&1&0&1\\0&1&0&0&0\\0&3&0&0&0 \end{array}\right) \overset{R_1-R_2;\,R_3-3R_2}{\sim} \left(\begin{array}{rrrr|r} 1&0&1&0&1\\0&1&0&0&0\\0&0&0&0&0 \end{array}\right)\)
1. We begin by interchanging the first two rows in order to get a 1 in the upper-left hand corner and avoiding dealing with
fractions.
2. Next we subtract row 1 from row 3 and twice from row 2 to get zeros in the left-most column.
3. Then we scale row 2 to have a 1 in the eventual pivot.
4. Finally we subtract row 2 from row 1 and three times from row 3 to get it into Reduced Row Echelon Form.
Therefore we can write x = 1 − λ , y = 0 , z = λ and w = μ , or in vector form
\(\begin{pmatrix}x\\y\\z\\w\end{pmatrix} = \begin{pmatrix}1\\0\\0\\0\end{pmatrix} + \lambda\begin{pmatrix}-1\\0\\1\\0\end{pmatrix} + \mu\begin{pmatrix}0\\0\\0\\1\end{pmatrix}.\)   (18.7.8)
For the second matrix we have

\(\left(\begin{array}{rr|r} 5&2&9\\0&5&10\\0&3&6 \end{array}\right) \overset{\frac{1}{5}R_2}{\sim} \left(\begin{array}{rr|r} 5&2&9\\0&1&2\\0&3&6 \end{array}\right) \overset{R_3-3R_2}{\sim} \left(\begin{array}{rr|r} 5&2&9\\0&1&2\\0&0&0 \end{array}\right) \overset{R_1-2R_2}{\sim} \left(\begin{array}{rr|r} 5&0&5\\0&1&2\\0&0&0 \end{array}\right) \overset{\frac{1}{5}R_1}{\sim} \left(\begin{array}{rr|r} 1&0&1\\0&1&2\\0&0&0 \end{array}\right)\)
We scale the second and third rows appropriately in order to avoid fractions, then subtract the corresponding rows as before.
Finally scale the first row and hence we have x = 1 and y = 2 as a unique solution.
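If you want to double-check row reductions like these by machine, here is a minimal sketch assuming Python with the sympy library (which is not part of this text; any computer algebra system with an rref routine works the same way):

# Sketch: verify the two row reductions above.
from sympy import Matrix

A = Matrix([[2, 5, 2, 0, 2],
            [1, 1, 1, 0, 1],
            [1, 4, 1, 0, 1]])
print(A.rref()[0])   # rows (1 0 1 0 | 1), (0 1 0 0 | 0), (0 0 0 0 | 0)

B = Matrix([[5, 2, 9],
            [0, 5, 10],
            [0, 3, 6]])
print(B.rref()[0])   # rows (1 0 | 1), (0 1 | 2), (0 0 | 0)

The printed reduced forms should match the hand computations above.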
Planes
Here we want to describe the mathematics of planes in space. The video is summarised by the following picture:
A plane is often called R^2 because it is spanned by two coordinates, and space is called R^3 and has three coordinates, usually called (x, y, z). The equation for a plane is
ax + by + cz = d . (18.7.9)
Let's simplify this by calling V = (x, y, z) the vector of unknowns and N = (a, b, c). Using the dot product in R^3 we have
N ⋅V =d. (18.7.10)
Remember that when vectors are perpendicular their dot products vanish, i.e. U ⋅ V = 0 ⇔ U ⊥ V.
This means that if a vector V_0 solves our equation N ⋅ V_0 = d, then so too does V_0 + C whenever C is perpendicular to N. This is because

N ⋅ (V_0 + C) = N ⋅ V_0 + N ⋅ C = d + 0 = d.   (18.7.11)

But C is ANY vector perpendicular to N, so all the possibilities for C span a plane whose normal vector is N. Hence we have shown that the solutions to the equation ax + by + cz = d form a plane with normal vector N = (a, b, c).
For a single equation, the solution is a plane. This is explained in this video. The picture looks like this:
For two equations, we must look at two planes. These usually intersect along a line, so the solution set will also (usually) be a
line:
For three equations, most often their intersection will be a single point so the solution will then be unique:
Of course stuff can go wrong. Two different looking equations could determine the same plane, or worse equations could be
inconsistent. If the equations are inconsistent, there will be no solutions at all. For example, if you had four equations
determining four parallel planes the solution set would be empty. This looks like this:
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 18.7: Movie Scripts 1-2 is shared under a not declared license and was authored, remixed, and/or curated by David Cherney, Tom
Denton, & Andrew Waldron.
18.8: Movie Scripts 3-4
G.3: Vectors in Space n-Vectors
Review of Parametric Notation
The equation for a plane in three variables x, y and z looks like
ax + by + cz = d (18.8.1)
In fact, an equation such as

x + 2y + 5z = 3   (18.8.2)

is a system of linear equations (with just one equation) whose solutions form a plane with normal vector (1, 2, 5). As an augmented matrix the system is simply

\(\left(\begin{array}{rrr|r} 1 & 2 & 5 & 3 \end{array}\right).\)   (18.8.3)

This is actually RREF! So we can let x be our pivot variable and y, z be represented by free parameters λ_1 and λ_2:

y = λ_1, z = λ_2.   (18.8.4)

Then

x = 3 − 2λ_1 − 5λ_2
y = λ_1   (18.8.5)
z = λ_2
or in vector notation
\(\begin{pmatrix}x\\y\\z\end{pmatrix} = \begin{pmatrix}3\\0\\0\end{pmatrix} + \lambda_1\begin{pmatrix}-2\\1\\0\end{pmatrix} + \lambda_2\begin{pmatrix}-5\\0\\1\end{pmatrix}.\)   (18.8.6)
This describes a plane parametric equation. Planes are "two-dimensional'' because they are described by two free variables. Here's a
picture of the resulting plane:
This video talks about the weird notion of a "length-squared'' for a vector v = (x, t) given by ||v||^2 = x^2 − t^2, used in Einstein's theory of relativity. The idea is to plot the story of your life on a plane with coordinates (x, t). The coordinate x encodes where an event happened (for real life situations, we must replace x → (x, y, z) ∈ R^3). The coordinate t says when events happened.
Each point on the worldline corresponds to a place and time of an event in your life. The slope of the worldline has to do with your speed. Or to be precise, the inverse slope is your velocity. Einstein realized that the maximum speed possible was that of light, often called c. In the diagram above c = 1 and corresponds to the lines x = ±t ⇒ x^2 − t^2 = 0. This should get you started.
G.4 Vector Spaces
To prove that R^2 is a vector space, we check each and every axiom in the definition. It's worth doing this once, so here we go. Before we start, remember that for R^2 we define vector addition and scalar multiplication component-wise.
1. Additive closure: We need to make sure that when we add \(\begin{pmatrix}x_1\\x_2\end{pmatrix}\) and \(\begin{pmatrix}y_1\\y_2\end{pmatrix}\) we do not get something outside the original vector space R^2. This just relies on the underlying structure of the real numbers, whose sums are again real numbers, so, using our component-wise rule,

\(\begin{pmatrix}x_1\\x_2\end{pmatrix} + \begin{pmatrix}y_1\\y_2\end{pmatrix} := \begin{pmatrix}x_1+y_1\\x_2+y_2\end{pmatrix} \in \mathbb{R}^2.\)   (18.8.7)
2. Additive commutativity: We want to check that when we add any two vectors we can do so in either order, i.e.

\(\begin{pmatrix}x_1\\x_2\end{pmatrix} + \begin{pmatrix}y_1\\y_2\end{pmatrix} \overset{?}{=} \begin{pmatrix}y_1\\y_2\end{pmatrix} + \begin{pmatrix}x_1\\x_2\end{pmatrix}.\)   (18.8.8)

This again relies on the underlying real numbers, which for any x, y ∈ R obey

x + y = y + x.   (18.8.9)

This fact holds in each component, so

\(\begin{pmatrix}x_1\\x_2\end{pmatrix} + \begin{pmatrix}y_1\\y_2\end{pmatrix} = \begin{pmatrix}x_1+y_1\\x_2+y_2\end{pmatrix} = \begin{pmatrix}y_1+x_1\\y_2+x_2\end{pmatrix} = \begin{pmatrix}y_1\\y_2\end{pmatrix} + \begin{pmatrix}x_1\\x_2\end{pmatrix}.\)   (18.8.10)
3. Additive associativity: We want to check that it does not matter how we bracket a sum of three vectors,

\(\left(\begin{pmatrix}x_1\\x_2\end{pmatrix} + \begin{pmatrix}y_1\\y_2\end{pmatrix}\right) + \begin{pmatrix}z_1\\z_2\end{pmatrix} \overset{?}{=} \begin{pmatrix}x_1\\x_2\end{pmatrix} + \left(\begin{pmatrix}y_1\\y_2\end{pmatrix} + \begin{pmatrix}z_1\\z_2\end{pmatrix}\right).\)   (18.8.11)

Again this relies on the underlying associativity of real numbers:

(x + y) + z = x + (y + z).   (18.8.12)

The required computation is

\(\left(\begin{pmatrix}x_1\\x_2\end{pmatrix} + \begin{pmatrix}y_1\\y_2\end{pmatrix}\right) + \begin{pmatrix}z_1\\z_2\end{pmatrix} = \begin{pmatrix}x_1+y_1\\x_2+y_2\end{pmatrix} + \begin{pmatrix}z_1\\z_2\end{pmatrix} = \begin{pmatrix}(x_1+y_1)+z_1\\(x_2+y_2)+z_2\end{pmatrix}\)   (18.8.13)

\(= \begin{pmatrix}x_1+(y_1+z_1)\\x_2+(y_2+z_2)\end{pmatrix} = \begin{pmatrix}x_1\\x_2\end{pmatrix} + \begin{pmatrix}y_1+z_1\\y_2+z_2\end{pmatrix} = \begin{pmatrix}x_1\\x_2\end{pmatrix} + \left(\begin{pmatrix}y_1\\y_2\end{pmatrix} + \begin{pmatrix}z_1\\z_2\end{pmatrix}\right).\)   (18.8.14)
4. Zero: There needs to exist a vector \(\vec 0\) that works the way we would expect zero to behave, i.e.

\(\begin{pmatrix}x_1\\y_1\end{pmatrix} + \vec 0 = \begin{pmatrix}x_1\\y_1\end{pmatrix}.\)   (18.8.15)

The vector that does this job is the zero vector

\(\vec 0 := \begin{pmatrix}0\\0\end{pmatrix}.\)   (18.8.16)

You can easily check that when this vector is added to any vector, the result is unchanged.
5. Additive inverse: We need to check that when we have \(\begin{pmatrix}x_1\\x_2\end{pmatrix}\), there is another vector that can be added to it so the sum is \(\vec 0\). (Note that it is important to first figure out what \(\vec 0\) is here!) The answer for the additive inverse of \(\begin{pmatrix}x_1\\x_2\end{pmatrix}\) is \(\begin{pmatrix}-x_1\\-x_2\end{pmatrix}\) because

\(\begin{pmatrix}x_1\\x_2\end{pmatrix} + \begin{pmatrix}-x_1\\-x_2\end{pmatrix} = \begin{pmatrix}x_1-x_1\\x_2-x_2\end{pmatrix} = \begin{pmatrix}0\\0\end{pmatrix} = \vec 0.\)   (18.8.17)
We are half-way done; now we need to consider the rules for scalar multiplication. Notice that we multiply vectors by scalars (i.e. numbers) but do NOT multiply vectors by vectors.
1. Multiplicative closure: Again, we are checking that an operation does not produce vectors outside the vector space. For a scalar a ∈ R, we require that \(a\begin{pmatrix}x_1\\x_2\end{pmatrix}\) lies in R^2. First we compute using our component-wise rule for scalars times vectors:

\(a\begin{pmatrix}x_1\\x_2\end{pmatrix} = \begin{pmatrix}ax_1\\ax_2\end{pmatrix}.\)   (18.8.18)

Since products of real numbers ax_1 and ax_2 are again real numbers we see this is indeed inside R^2.
2. Scalar distributivity: For two scalars a, b and one vector, we need to check that

\((a+b)\begin{pmatrix}x_1\\x_2\end{pmatrix} \overset{?}{=} a\begin{pmatrix}x_1\\x_2\end{pmatrix} + b\begin{pmatrix}x_1\\x_2\end{pmatrix}.\)   (18.8.19)

Once again this is a simple LHS=RHS proof using properties of the real numbers. Starting on the left we have

\((a+b)\begin{pmatrix}x_1\\x_2\end{pmatrix} = \begin{pmatrix}(a+b)x_1\\(a+b)x_2\end{pmatrix} = \begin{pmatrix}ax_1+bx_1\\ax_2+bx_2\end{pmatrix}\)   (18.8.20)

\(= \begin{pmatrix}ax_1\\ax_2\end{pmatrix} + \begin{pmatrix}bx_1\\bx_2\end{pmatrix} = a\begin{pmatrix}x_1\\x_2\end{pmatrix} + b\begin{pmatrix}x_1\\x_2\end{pmatrix},\)   (18.8.21)

as required.
3. Additive distributivity: This time we need to check the equation

\(a\left(\begin{pmatrix}x_1\\x_2\end{pmatrix} + \begin{pmatrix}y_1\\y_2\end{pmatrix}\right) \overset{?}{=} a\begin{pmatrix}x_1\\x_2\end{pmatrix} + a\begin{pmatrix}y_1\\y_2\end{pmatrix},\)   (18.8.22)

i.e., one scalar but two different vectors. The method is by now becoming familiar:

\(a\left(\begin{pmatrix}x_1\\x_2\end{pmatrix} + \begin{pmatrix}y_1\\y_2\end{pmatrix}\right) = a\begin{pmatrix}x_1+y_1\\x_2+y_2\end{pmatrix} = \begin{pmatrix}a(x_1+y_1)\\a(x_2+y_2)\end{pmatrix}\)   (18.8.23)

\(= \begin{pmatrix}ax_1+ay_1\\ax_2+ay_2\end{pmatrix} = \begin{pmatrix}ax_1\\ax_2\end{pmatrix} + \begin{pmatrix}ay_1\\ay_2\end{pmatrix} = a\begin{pmatrix}x_1\\x_2\end{pmatrix} + a\begin{pmatrix}y_1\\y_2\end{pmatrix},\)   (18.8.24)

again as required.
4. Multiplicative associativity: Just as for addition, this is the requirement that the order of bracketing does not matter. We need to establish whether

\((a\cdot b)\cdot\begin{pmatrix}x_1\\x_2\end{pmatrix} \overset{?}{=} a\cdot\left(b\cdot\begin{pmatrix}x_1\\x_2\end{pmatrix}\right).\)   (18.8.25)

This clearly holds for real numbers, a ⋅ (b ⋅ x) = (a ⋅ b) ⋅ x. The computation is

\((a\cdot b)\cdot\begin{pmatrix}x_1\\x_2\end{pmatrix} = \begin{pmatrix}(a\cdot b)\cdot x_1\\(a\cdot b)\cdot x_2\end{pmatrix} = \begin{pmatrix}a\cdot(b\cdot x_1)\\a\cdot(b\cdot x_2)\end{pmatrix} = a\cdot\left(b\cdot\begin{pmatrix}x_1\\x_2\end{pmatrix}\right),\)   (18.8.26)

which is what we want.
5. Unity: We need a special scalar that acts the way we would expect the number 1 to behave, i.e.

\(1\cdot\begin{pmatrix}x_1\\x_2\end{pmatrix} = \begin{pmatrix}x_1\\x_2\end{pmatrix}.\)   (18.8.27)

There is an obvious choice for this special scalar---just the real number 1 itself. Indeed, to be pedantic let's calculate

\(1\cdot\begin{pmatrix}x_1\\x_2\end{pmatrix} = \begin{pmatrix}1\cdot x_1\\1\cdot x_2\end{pmatrix} = \begin{pmatrix}x_1\\x_2\end{pmatrix}.\)   (18.8.28)
Now we are done---we have really proven that R^2 is a vector space, so let's write a little square □ to celebrate.
The different ways of hitting a hockey puck can all be considered as vectors. You can think about adding vectors by having two
players hitting the puck at the same time. This picture shows vectors N and J corresponding to the ways Nicole Darwitz and Jenny
Potter hit a hockey puck, plus the vector obtained when they hit the puck together.
You can also model the new vector 2J obtained by scalar multiplication by 2 by thinking about Jenny hitting the puck twice
(or a world with two Jenny Potters....). Now ask yourself questions like whether the multiplicative distributive law
2J + 2N = 2(J + N ) (18.8.30)
Let's worry about the last part of the problem. The problem can be solved by considering a non-zero simple polynomial, such as a degree 0 polynomial, and multiplying by i ∈ C. That is to say, we take a vector p ∈ P_3^R and then consider i ⋅ p. This will violate one of the vector space rules about scalars, and you should take from this that the scalar field matters.
As a second hint, consider Q (the field of rational numbers). This is not a vector space over R since √2 ⋅ 1 = √2 ∉ Q, so it is not closed under scalar multiplication, but it is clearly a vector space over Q.
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 18.8: Movie Scripts 3-4 is shared under a not declared license and was authored, remixed, and/or curated by David Cherney, Tom
Denton, & Andrew Waldron.
18.9: Movie Scripts 5-6
G.5: Linear Transformations
Hint for Review Question 5
The first thing we see in the problem is a definition of this new space P_n. Elements of P_n are polynomials that look like

a_0 + a_1 t + a_2 t^2 + … + a_n t^n.   (18.9.1)

The problem concerns a linear transformation L from P_2 to P_3, so the input will have degree two and look like

a_0 + a_1 t + a_2 t^2,   (18.9.2)

and the output will have degree three and look like

b_0 + b_1 t + b_2 t^2 + b_3 t^3.   (18.9.3)
We also know that L is a linear transformation, so what does that mean in this case? Well, by linearity we know that we can
separate out the sum, and pull out the constants so we get
L(a_0 + a_1 t + a_2 t^2) = a_0 L(1) + a_1 L(t) + a_2 L(t^2).   (18.9.4)
Just this should be really helpful for the first two parts of the problem. The third part of the problem is asking us to think about this
as a linear algebra problem, so lets think about how we could write this in the vector notation we use in the class. We could write
\(a_0 + a_1 t + a_2 t^2 \quad\text{as}\quad \begin{pmatrix}a_0\\a_1\\a_2\end{pmatrix}.\)   (18.9.5)
And think for a second about how you add polynomials, you match up terms of the same degree and add the constants component-
wise. So it makes some sense to think about polynomials this way, since vector addition is also component-wise.
We could also write the output
\(b_0 + b_1 t + b_2 t^2 + b_3 t^3 \quad\text{as}\quad \begin{pmatrix}b_0\\b_1\\b_2\\b_3\end{pmatrix}.\)   (18.9.6)
Then lets look at the information given in the problem and think about it in terms of column vectors
1. L(1) = 4, but we can think of the input as 1 = 1 + 0t + 0t^2 and the output as 4 = 4 + 0t + 0t^2 + 0t^3 and write this as

\(L\begin{pmatrix}1\\0\\0\end{pmatrix} = \begin{pmatrix}4\\0\\0\\0\end{pmatrix}.\)

2. L(t) = t^3. This can be written as

\(L\begin{pmatrix}0\\1\\0\end{pmatrix} = \begin{pmatrix}0\\0\\0\\1\end{pmatrix}.\)
3. L(t^2) = t − 1. It might be a little trickier to figure out how to write t − 1, but if we write the polynomial out with the terms in order and with zeroes next to the terms that do not appear, we can see that

\(t - 1 = -1 + t + 0t^2 + 0t^3 \quad\text{corresponds to}\quad \begin{pmatrix}-1\\1\\0\\0\end{pmatrix}.\)   (18.9.7)

So this can be written as

\(L\begin{pmatrix}0\\0\\1\end{pmatrix} = \begin{pmatrix}-1\\1\\0\\0\end{pmatrix}.\)
Now to think about how you would write the linear transformation L as a matrix, first think about what the dimensions of the
matrix would be. Then look at the first two parts of this problem to help you figure out what the entries should be.
G.6: Matrices
Adjacency Matrix Example
Lets think about a graph as a mini-facebook. In this tiny facebook there are only four people, Alice, Bob, Carl, and David.
Suppose we have the following relationships
1. Alice and Bob are friends.
2. Alice and Carl are friends.
3. Carl and Bob are friends.
4. David and Bob are friends.
Now draw a picture where each person is a dot, and then draw a line between the dots of people who are friends. This is an
example of a graph if you think of the people as nodes, and the friendships as edges.
Now lets make a 4 × 4 matrix, which is an adjacency matrix for the graph. Make a column and a row for each of the four people. It
will look a lot like a table. When two people are friends put a 1 in the row of one and the column of the other. For example Alice
and Carl are friends so we can label the table below.
      A   B   C   D
  A           1
  B                       (18.9.8)
  C   1
  D
We can continue to label the entries for each friendship. Here lets assume that people are friends with themselves, so the diagonal
will be all ones.
A B C D
A 1 1 1 0
B 1 1 1 1 (18.9.9)
C 1 1 1 0
D 0 1 0 1
1 1 1 0
⎛ ⎞
⎜1 1 1 1⎟
⎜ ⎟ (18.9.10)
⎜1 1 1 0⎟
⎝ ⎠
0 1 0 1
Notice that this table is symmetric across the diagonal, the same way a multiplication table would be symmetric. This is because on
facebook friendship is symmetric in the sense that you can't be friends with someone if they aren't friends with you too. This is an
example of a symmetric matrix.
You could think about what you would have to do differently to draw a graph for something like twitter where you don't have to
follow everyone who follows you. The adjacency matrix might not be symmetric then.
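As a quick sketch (assuming Python with numpy, which is not part of this text), you can build the friendship matrix above and test its symmetry directly:

# Sketch: the adjacency matrix for Alice, Bob, Carl, David, with friends-of-self on the diagonal.
import numpy as np

#             A  B  C  D
A = np.array([[1, 1, 1, 0],   # Alice
              [1, 1, 1, 1],   # Bob
              [1, 1, 1, 0],   # Carl
              [0, 1, 0, 1]])  # David

print(np.array_equal(A, A.T))  # True: friendship here is a symmetric relation

For a Twitter-style "follows" graph the same check would typically print False.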
Do Matrices Commute?
This video shows you a funny property of matrices. Some matrix properties look just like those for numbers. For example numbers
obey
a(bc) = (ab)c (18.9.11)
and so do matrices:
A(BC ) = (AB)C . (18.9.12)
This says the order of bracketing does not matter and is called associativity. Now we ask ourselves whether the basic property of
numbers
ab = ba,   (18.9.13)

also holds for matrices, i.e., whether AB = BA.   (18.9.14)

For this, firstly note that we need to work with square matrices for both orderings to even make sense. Let's take a simple 2 × 2 example; let

\(A = \begin{pmatrix}1&a\\0&1\end{pmatrix}, \quad B = \begin{pmatrix}1&b\\0&1\end{pmatrix}, \quad C = \begin{pmatrix}1&0\\a&1\end{pmatrix}.\)   (18.9.15)

You can check that A and B do commute: AB = BA. However,

\(AC = \begin{pmatrix}1+a^2&a\\a&1\end{pmatrix} \quad\text{while}\quad CA = \begin{pmatrix}1&a\\a&1+a^2\end{pmatrix},\)

so

AC ≠ CA   (18.9.18)
and this pair of matrices does not commute. Generally, matrices usually do not commute, and the problem of finding those that do
is a very interesting one.
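A minimal numerical sketch of this non-commutativity (numpy assumed, and a = 2 chosen just for illustration):

# Sketch: check that A and C from the example do not commute.
import numpy as np

a = 2.0
A = np.array([[1.0, a], [0.0, 1.0]])
C = np.array([[1.0, 0.0], [a, 1.0]])

print(A @ C)                      # [[5. 2.] [2. 1.]]
print(C @ A)                      # [[1. 2.] [2. 5.]]
print(np.allclose(A @ C, C @ A))  # False: AC != CA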
Matrix Exponential Example
This video shows you how to compute
\(\exp\begin{pmatrix}0&\theta\\-\theta&0\end{pmatrix}.\)   (18.9.19)

For this we need to remember that the matrix exponential is defined by its power series

\(\exp M := I + M + \frac{1}{2!}M^2 + \frac{1}{3!}M^3 + \cdots.\)   (18.9.20)

Now let's write

\(\begin{pmatrix}0&\theta\\-\theta&0\end{pmatrix} = i\theta, \quad\text{where}\quad i := \begin{pmatrix}0&1\\-1&0\end{pmatrix},\)   (18.9.21)

and note that i² = −I (check this with a quick matrix multiplication!).
Using these facts we compute by organizing terms according to whether they have an i or not:

\(\exp i\theta = I + \frac{1}{2!}\theta^2(-I) + \frac{1}{4!}\theta^4(+I) + \cdots \;+\; i\theta + \frac{1}{3!}\theta^3(-i) + \frac{1}{5!}\theta^5 i + \cdots\)

\(= I\left(1 - \frac{1}{2!}\theta^2 + \frac{1}{4!}\theta^4 + \cdots\right) + i\left(\theta - \frac{1}{3!}\theta^3 + \frac{1}{5!}\theta^5 + \cdots\right)\)

\(= I\cos\theta + i\sin\theta = \begin{pmatrix}\cos\theta&\sin\theta\\-\sin\theta&\cos\theta\end{pmatrix}.\)

Here we used the familiar Taylor series for the cosine and sine functions. A fun thing to think about is how the above matrix acts on vectors in the plane.
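To see the rotation-matrix answer emerge numerically, here is a small sketch (scipy assumed, which this text does not use):

# Sketch: exp of the antisymmetric matrix equals a rotation by theta.
import numpy as np
from scipy.linalg import expm

theta = 0.7
M = np.array([[0.0, theta], [-theta, 0.0]])
R = np.array([[np.cos(theta), np.sin(theta)],
              [-np.sin(theta), np.cos(theta)]])

print(np.allclose(expm(M), R))  # True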
Proof Explanation
In this video we will talk through the steps required to prove
In order to show that these are unique, we will suppose that they are not, and show that this causes a contradiction. So suppose there exists a second set of constants d^i such that

w = d^1 v_1 + ⋯ + d^n v_n.   (18.9.26)

Then try multiplying the above row vector times y and compare.
f_R(α ⋅ M) = (αM)R = α(MR) = α f_R(M) for any scalar α. Now all that needs to be proved is that
fR (M + N ) = (M + N )R = M R + N R = fR (M ) + fR (N ), (18.9.28)
for all M, N ∈ M_n and v ∈ V, where ⋅ denotes multiplication in M_n (i.e. standard matrix multiplication) and ∘ denotes the action of the matrix as a linear map on a vector (i.e. M(v)). There is a corresponding notion of a right action where

v ∘ (M ⋅ N) = (v ∘ M) ∘ N,   (18.9.30)

where we treat v ∘ M as M(v) as before, and note the order in which the matrices are applied. People will often omit the left or right because they are essentially the same, and just say that M acts on V.
\(e^x = \sum_{n=0}^{\infty}\frac{x^n}{n!}.\)   (18.9.31)
This means we are going to need an idea of what A^n looks like for any n. Let's look at the example of one of the matrices in the problem. Let

\(A = \begin{pmatrix}1&\lambda\\0&1\end{pmatrix}.\)   (18.9.33)

Then

\(A^0 = \begin{pmatrix}1&0\\0&1\end{pmatrix}, \quad A^1 = \begin{pmatrix}1&\lambda\\0&1\end{pmatrix}, \quad A^2 = A\cdot A = \begin{pmatrix}1&2\lambda\\0&1\end{pmatrix}, \quad A^3 = A^2\cdot A = \begin{pmatrix}1&3\lambda\\0&1\end{pmatrix}.\)
Then we can think about the first few terms of the sequence

\(e^A = \sum_{n=0}^{\infty}\frac{A^n}{n!} = A^0 + A + \frac{1}{2!}A^2 + \frac{1}{3!}A^3 + \ldots.\)   (18.9.35)

Looking at the entries when we add this, we get that the upper left-most entry looks like this:

\(1 + 1 + \frac{1}{2} + \frac{1}{3!} + \ldots = \sum_{n=0}^{\infty}\frac{1}{n!} = e.\)   (18.9.36)
Continue this process with each of the entries using what you know about Taylor series expansions to find the sum of each entry.
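If you want to check your entry-by-entry sums, here is a hedged sketch (numpy and scipy assumed, λ = 0.5 chosen only for illustration) that sums the series numerically and compares with a library matrix exponential:

# Sketch: sum the series for e^A term by term and compare with scipy's expm.
import numpy as np
from scipy.linalg import expm
from math import factorial

lam = 0.5
A = np.array([[1.0, lam], [0.0, 1.0]])

S = np.zeros((2, 2))
for n in range(20):                      # 20 terms is plenty here
    S += np.linalg.matrix_power(A, n) / factorial(n)

print(S)
print(np.allclose(S, expm(A)))           # True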
your field. Lastly don't forget to address what happens if X does not exist.
Here the two vector spaces in play are

Z_2^3   (18.9.38)

and

Z_2^2.   (18.9.39)

These have 8 and 4 vectors, respectively, that can be depicted as corners of a cube or square.
Since we have bits, we can work out what L does to every vector; this is listed below:

(0, 0, 0) ↦ (0, 0)
(0, 0, 1) ↦ (1, 0)
(1, 1, 0) ↦ (1, 0)
(1, 0, 0) ↦ (0, 1)
(0, 1, 1) ↦ (0, 1)      (18.9.42)
(0, 1, 0) ↦ (1, 1)
(1, 0, 1) ↦ (1, 1)
(1, 1, 1) ↦ (1, 1)
Now lets think about left and right inverses. A left inverse B to the matrix A
would obey
BA = I (18.9.43)
and since BA would have to be the 3 × 3 identity matrix, B would have to be 3 × 2. It would have to undo the action of A and return vectors in Z_2^3 to where they started from. But above, we see that different vectors in Z_2^3 are mapped to the same vector in Z_2^2 by the linear transformation A, so no matrix B can undo this, and A has no left inverse. However a right inverse C obeying
AC = I (18.9.44)
can exist. It would be 3 × 2. Its job is to take a vector in Z_2^2 back to one in Z_2^3 in a way that gets undone by the action of A. This can be done, though not uniquely.
Using an LU Decomposition
Lets go through how to use a LU decomposition to speed up solving a system of equations. Suppose you want to solve for x in the
equation M x = b
\(\begin{pmatrix}1&0&-5\\3&-1&-14\\1&0&-3\end{pmatrix}x = \begin{pmatrix}6\\19\\4\end{pmatrix},\)   (18.9.45)

where you are given the decomposition of M into the product of L and U, which are lower and upper triangular matrices respectively:

\(M = \begin{pmatrix}1&0&-5\\3&-1&-14\\1&0&-3\end{pmatrix} = \begin{pmatrix}1&0&0\\3&1&0\\1&0&2\end{pmatrix}\begin{pmatrix}1&0&-5\\0&-1&1\\0&0&1\end{pmatrix} = LU.\)   (18.9.46)
First you should solve L(U x) = b for U x . The augmented matrix you would use looks like this
\(\left(\begin{array}{rrr|r} 1&0&0&6\\3&1&0&19\\1&0&2&4 \end{array}\right)\)   (18.9.47)

This is an easy augmented matrix to solve because it is lower triangular. If you were to write out the three equations using variables, you would find that the first equation has already been solved, and is ready to be plugged into the second equation. This forward substitution makes solving the system much faster. Try it and in a few steps you should be able to get

\(\left(\begin{array}{rrr|r} 1&0&0&6\\0&1&0&1\\0&0&1&-1 \end{array}\right)\)   (18.9.48)

This tells us that \(Ux = \begin{pmatrix}6\\1\\-1\end{pmatrix}\). Now the second part of the problem is to solve for x. The augmented matrix you get is

\(\left(\begin{array}{rrr|r} 1&0&-5&6\\0&-1&1&1\\0&0&1&-1 \end{array}\right)\)   (18.9.49)

and a few steps of back substitution turn it into

\(\left(\begin{array}{rrr|r} 1&0&0&1\\0&1&0&-2\\0&0&1&-1 \end{array}\right),\)   (18.9.50)

which gives us the answer \(x = \begin{pmatrix}1\\-2\\-1\end{pmatrix}\).
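Here is a small sketch (numpy assumed) of the same two-stage solve; np.linalg.solve is standing in for the hand forward and back substitutions:

# Sketch: solve M x = b using the given L and U factors in two triangular solves.
import numpy as np

L = np.array([[1.0, 0.0, 0.0],
              [3.0, 1.0, 0.0],
              [1.0, 0.0, 2.0]])
U = np.array([[1.0, 0.0, -5.0],
              [0.0, -1.0, 1.0],
              [0.0, 0.0, 1.0]])
b = np.array([6.0, 19.0, 4.0])

w = np.linalg.solve(L, b)   # solves L w = b (the forward-substitution stage)
x = np.linalg.solve(U, w)   # solves U x = w (the back-substitution stage)

print(w)   # [ 6.  1. -1.]
print(x)   # [ 1. -2. -1.]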
Another LU Decomposition Example
Here we repeat the procedure for the matrix

\(M = \begin{pmatrix}1&7&2\\-3&-21&4\\1&6&3\end{pmatrix},\)

following the procedure outlined in Section 7.7.2. So initially we have L_1 = I_3 and U_1 = M, and hence

\(L_2 = \begin{pmatrix}1&0&0\\-3&1&0\\1&0&1\end{pmatrix}, \qquad U_2 = \begin{pmatrix}1&7&2\\0&0&10\\0&-1&1\end{pmatrix}.\)
However we now have a problem, since the second pivot of U_2 is zero and 0 ⋅ c = 0 for any value of c (we are working over a field). We can quickly remedy this by swapping the second and third rows of U_2 to get U'_2, and note that we just interchange the corresponding rows (in all columns left of and including the column we added values to) in L_2 to get L'_2. Yet this gives us a small problem, as L'_2 U'_2 ≠ M; in fact it gives us the similar matrix M' with the second and third rows swapped. In our original problem MX = V, we also need to make the corresponding swap on our vector V to get a V', since all of this amounts to changing the order of our two equations; note that this clearly does not change the solution. Back to our example, we have

\(L'_2 = \begin{pmatrix}1&0&0\\1&1&0\\-3&0&1\end{pmatrix}, \qquad U'_2 = \begin{pmatrix}1&7&2\\0&-1&1\\0&0&10\end{pmatrix},\)
and note that U'_2 is upper triangular. Finally, you can easily check that

\(L'_2 U'_2 = \begin{pmatrix}1&7&2\\1&6&3\\-3&-21&4\end{pmatrix} = M'.\)   (18.9.52)
Block LDU Explanation
Suppose a square matrix M is cut into four blocks with the following shapes:

matrix   shape
X        r × r
Y        r × t      (18.9.53)
Z        t × r
W        t × t

so that

\(M = \begin{pmatrix}X&Y\\Z&W\end{pmatrix}.\)   (18.9.54)
Matrix multiplication works for blocks just as for matrix entries:

\(M^2 = \begin{pmatrix}X&Y\\Z&W\end{pmatrix}\begin{pmatrix}X&Y\\Z&W\end{pmatrix} = \begin{pmatrix}X^2+YZ & XY+YW \\ ZX+WZ & ZY+W^2\end{pmatrix}.\)   (18.9.55)
Now lets specialize to the case where the square matrix X has an inverse. Then we can multiply out the following triple product of
a lower triangular, a block diagonal and an upper triangular matrix:
\(\begin{pmatrix}I&0\\ZX^{-1}&I\end{pmatrix}\begin{pmatrix}X&0\\0&W-ZX^{-1}Y\end{pmatrix}\begin{pmatrix}I&X^{-1}Y\\0&I\end{pmatrix}\)   (18.9.56)

\(= \begin{pmatrix}X&0\\Z&W-ZX^{-1}Y\end{pmatrix}\begin{pmatrix}I&X^{-1}Y\\0&I\end{pmatrix}\)   (18.9.57)

\(= \begin{pmatrix}X&Y\\Z&ZX^{-1}Y+W-ZX^{-1}Y\end{pmatrix}\)   (18.9.58)

\(= \begin{pmatrix}X&Y\\Z&W\end{pmatrix} = M.\)   (18.9.59)
This shows that the LDU decomposition given in Section 7.7 is correct.
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 18.9: Movie Scripts 5-6 is shared under a not declared license and was authored, remixed, and/or curated by David Cherney, Tom
Denton, & Andrew Waldron.
18.10: Movie Scripts 7-8
G.7 Determinants
Permutation Example
Lets try to get the hang of permutations. A permutation is a function which scrambles things. Suppose we had
We could write this permutation in two steps by saying that first we swap 3 and 4, and then we swap 1 and 3. The order here is
important.
This is an even permutation, since the number of swaps we used is two (an even number).
Elementary Matrices
This video will explain some of the ideas behind elementary matrices. First think back to linear systems, for example n equations
in n unknowns:
\(\begin{cases} a^1_1 x^1 + a^1_2 x^2 + \cdots + a^1_n x^n = v^1 \\ a^2_1 x^1 + a^2_2 x^2 + \cdots + a^2_n x^n = v^2 \\ \qquad\vdots \\ a^n_1 x^1 + a^n_2 x^2 + \cdots + a^n_n x^n = v^n. \end{cases}\)   (18.10.3)
We know it is helpful to store the above information with matrices and vectors
\(M := \begin{pmatrix} a^1_1 & a^1_2 & \cdots & a^1_n \\ a^2_1 & a^2_2 & \cdots & a^2_n \\ \vdots & \vdots & & \vdots \\ a^n_1 & a^n_2 & \cdots & a^n_n \end{pmatrix}, \quad X := \begin{pmatrix}x^1\\x^2\\\vdots\\x^n\end{pmatrix}, \quad V := \begin{pmatrix}v^1\\v^2\\\vdots\\v^n\end{pmatrix}.\)   (18.10.4)
Here we will focus on the case where M is square, because we are interested in its inverse M⁻¹ (if it exists) and its determinant (whose job it will be to determine the existence of M⁻¹).
We know several ways of handling this linear system:
1. As an augmented matrix

\(\left(\begin{array}{c|c} M & V \end{array}\right).\)   (18.10.5)

Here our plan would be to perform row operations until the system looks like

\(\left(\begin{array}{c|c} I & M^{-1}V \end{array}\right).\)   (18.10.6)
2. As a matrix equation

MX = V,   (18.10.7)

which we solve by multiplying both sides by the inverse (if it exists), so that X = M⁻¹V.
3. As a linear transformation

L : R^n ⟶ R^n   (18.10.9)

via

R^n ∋ X ⟼ MX ∈ R^n.   (18.10.10)
The main idea is that the row operations changed the augmented matrices, but we also know how to change a matrix M by multiplying it by some other matrix E, so that M → EM. In particular, can we find "elementary matrices'' that perform row operations?
Once we find these elementary matrices, it is very important to ask how they affect the determinant, but you can think about that for yourself right now.
Let's tabulate our names for the matrices that perform the various row operations:

Row operation              Elementary matrix
R_i ↔ R_j                  E^i_j
R_i → λ R_i                R^i(λ)            (18.10.11)
R_i → R_i + λ R_j          S^i_j(λ)
To finish off the video, here is how all these elementary matrices work for a 2 × 2 example. Lets take
\(M = \begin{pmatrix}a&b\\c&d\end{pmatrix}.\)   (18.10.12)
A good thing to think about is what happens to det M = ad − bc under the operations below.
1. Row swap:

\(E^1_2 = \begin{pmatrix}0&1\\1&0\end{pmatrix}, \qquad E^1_2 M = \begin{pmatrix}0&1\\1&0\end{pmatrix}\begin{pmatrix}a&b\\c&d\end{pmatrix} = \begin{pmatrix}c&d\\a&b\end{pmatrix}.\)   (18.10.13)

2. Scalar multiplying:

\(R^1(\lambda) = \begin{pmatrix}\lambda&0\\0&1\end{pmatrix}, \qquad R^1(\lambda)M = \begin{pmatrix}\lambda&0\\0&1\end{pmatrix}\begin{pmatrix}a&b\\c&d\end{pmatrix} = \begin{pmatrix}\lambda a&\lambda b\\c&d\end{pmatrix}.\)   (18.10.14)

3. Row sum:

\(S^1_2(\lambda) = \begin{pmatrix}1&\lambda\\0&1\end{pmatrix}, \qquad S^1_2(\lambda)M = \begin{pmatrix}1&\lambda\\0&1\end{pmatrix}\begin{pmatrix}a&b\\c&d\end{pmatrix} = \begin{pmatrix}a+\lambda c&b+\lambda d\\c&d\end{pmatrix}.\)   (18.10.15)
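A quick numerical sketch (numpy assumed; the entries of M and λ below are just illustrative choices) of how each elementary matrix acts on M and what happens to det M:

# Sketch: apply each 2x2 elementary matrix to M and watch the determinant.
import numpy as np

a, b, c, d, lam = 1.0, 2.0, 3.0, 4.0, 5.0
M = np.array([[a, b], [c, d]])

E12 = np.array([[0.0, 1.0], [1.0, 0.0]])   # row swap
R1  = np.array([[lam, 0.0], [0.0, 1.0]])   # scale row 1 by lambda
S12 = np.array([[1.0, lam], [0.0, 1.0]])   # add lambda * row 2 to row 1

print(np.linalg.det(M))         # ad - bc = -2
print(np.linalg.det(E12 @ M))   # sign flips: +2
print(np.linalg.det(R1 @ M))    # scaled by lambda: -10
print(np.linalg.det(S12 @ M))   # unchanged: -2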
Elementary Determinants
This video will show you how to calculate determinants of elementary matrices. First remember that the job of an elementary row
matrix is to perform row operations, so that if E is an elementary row matrix and M some given matrix,
EM   (18.10.16)

is the matrix M with the corresponding row operation performed on it. Recall how row operations affect the determinant:
1. Row swap E^i_j: flips the sign of the determinant.
2. Scalar multiplication R^i(λ): multiplies the determinant by λ.
3. Row addition S^i_j(λ): adding some amount of one row to another does not change the determinant.
The corresponding elementary matrices are obtained by performing exactly these operations on the identity:
\(E^i_j = \begin{pmatrix} 1 & & & & & & \\ & \ddots & & & & & \\ & & 0 & \cdots & 1 & & \\ & & \vdots & \ddots & \vdots & & \\ & & 1 & \cdots & 0 & & \\ & & & & & \ddots & \\ & & & & & & 1 \end{pmatrix},\)   (18.10.17)

\(R^i(\lambda) = \begin{pmatrix} 1 & & & & \\ & \ddots & & & \\ & & \lambda & & \\ & & & \ddots & \\ & & & & 1 \end{pmatrix},\)   (18.10.18)

\(S^i_j(\lambda) = \begin{pmatrix} 1 & & & & & \\ & \ddots & & & & \\ & & 1 & \cdots & \lambda & \\ & & & \ddots & \vdots & \\ & & & & 1 & \\ & & & & & \ddots \end{pmatrix}.\)   (18.10.19)
So to calculate their determinants, we just have to apply the above list of what happens to the determinant of a matrix under row
operations to the determinant of the identity. This yields
\(\det E^i_j = -1, \qquad \det R^i(\lambda) = \lambda, \qquad \det S^i_j(\lambda) = 1.\)
0 =0
and det(M ) = 0 . When we compute the determinant, this row of all zeros gets multiplied in every term. If instead we were given
redundant equations
x + y = 1
2x + 2y = 2

then the row-reduced matrix for this system again has a row of all zeros. Somehow the determinant is able to detect that there is only one independent equation here. Even if we had a contradictory set of equations such as

x + y = 1
2x + 2y = 0,

where it is not possible for both of these equations to be true, the matrix M is still the same, and still has determinant zero.
Lets look at a three by three example, where the third equation is the sum of the first two equations.
x +y +z = 1
y +z = 1
x + 2y + 2z = 2
1 1 1
⎡ ⎤
M =⎢0 1 1⎥ (18.10.21)
⎣ ⎦
1 2 2
If we were trying to find the inverse to this matrix using elementary matrices
\(\left(\begin{array}{rrr|rrr} 1&1&1&1&0&0\\0&1&1&0&1&0\\1&2&2&0&0&1 \end{array}\right) \sim \left(\begin{array}{rrr|rrr} 1&1&1&1&0&0\\0&1&1&0&1&0\\0&0&0&-1&-1&1 \end{array}\right)\)   (18.10.22)

we would be stuck here. The last row of all zeros cannot be converted into the bottom row of a 3 × 3 identity matrix, so this matrix has no inverse, and the row of all zeros ensures that the determinant will be zero. It can be difficult to see when one of the rows of a matrix is a linear combination of the others, and what makes the determinant a useful tool is that with this reasonably simple computation we can find out whether the matrix is invertible, and whether the system will have a solution consisting of a single point or column vector.
Alternative Proof
Here we will prove more directly that the determinant of a product of matrices is the product of their determinants. First we reference that for a matrix M with rows r_i, if M′ is the matrix with rows r′_j = r_j + λ r_i for j ≠ i and r′_i = r_i, then det(M) = det(M′). Essentially we obtain M′ from M by multiplying by the elementary row sum matrices S^i_j(λ). Hence we can create an upper-triangular matrix U such that det(M) = det(U) by first using the first row to set m^1_i ↦ 0 for all i > 1, then iteratively (increasing k by 1 each time) for fixed k using the k-th row to set m^k_i ↦ 0 for all i > k.
Now note that for two upper-triangular matrices U = (u^j_i) and U′ = (u′^j_i), by matrix multiplication X = UU′ = (x^j_i) is upper-triangular with x^i_i = u^i_i u′^i_i. Also, since every other permutation would pick up a below-diagonal entry (which is 0), we have det(U) = ∏_i u^i_i. Let A and A′ have corresponding upper-triangular matrices U and U′ respectively (i.e. det(A) = det(U)); then

\(\det(AA') = \det(UU') = \prod_i u^i_i u'^i_i = \Big(\prod_i u^i_i\Big)\Big(\prod_i u'^i_i\Big) = \det(U)\det(U') = \det(A)\det(A').\)
This formula might be easier to remember if you think about this picture.
Now we can look at three by three matrices and see a few ways to compute the determinant. We have a similar pattern for 3 ×3
matrices.
Consider the example
\(\det\begin{pmatrix}1&2&3\\3&1&2\\0&0&1\end{pmatrix} = \big((1\cdot1\cdot1)+(2\cdot2\cdot0)+(3\cdot3\cdot0)\big) - \big((3\cdot1\cdot0)+(1\cdot2\cdot0)+(2\cdot3\cdot1)\big) = -5\)   (18.10.24)
We can draw a picture with similar diagonals to find the terms that will be positive and the terms that will be negative.
Another way to compute the determinant of a matrix is to use this recursive formula. Here I take the coefficients of the first row
and multiply them by the determinant of the minors and the cofactor. Then we can use the formula for a two by two determinant to
compute the determinant of the minors
\(\det\begin{pmatrix}1&2&3\\3&1&2\\0&0&1\end{pmatrix} = 1\begin{vmatrix}1&2\\0&1\end{vmatrix} - 2\begin{vmatrix}3&2\\0&1\end{vmatrix} + 3\begin{vmatrix}3&1\\0&0\end{vmatrix}\)   (18.10.25)
Decide which way you prefer and get good at taking determinants, you'll need to compute them in a lot of problems.
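Either way, the arithmetic is easy to double-check by machine; a minimal sketch, assuming numpy is available:

# Sketch: check the 3x3 determinant computed above.
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [3.0, 1.0, 2.0],
              [0.0, 0.0, 1.0]])

print(np.linalg.det(A))   # approximately -5.0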
\(\det(A) = a^1_1 a^2_2 a^3_3 + a^1_2 a^2_3 a^3_1 + a^1_3 a^2_1 a^3_2 - a^1_1 a^2_3 a^3_2 - a^1_2 a^2_1 a^3_3 - a^1_3 a^2_2 a^3_1,\)   (18.10.26)

and so the complexity is 5a + 12m. Now note that in general, the complexity c_n of the expansion-by-minors formula for an arbitrary n × n matrix grows recursively in terms of c_{n−1}, since

\(\det(A) = \sum_{i=1}^{n}(-1)^{i}a^1_i\,\operatorname{cofactor}(a^1_i)\)

and cofactor(a^1_i) is an (n − 1) × (n − 1) determinant. This is one way to prove part (c).
\(\sum_i l^j_i x^i = v^j,\)   (18.10.28)

where l^j_i is the coefficient of the variable x^i in the equation l_j. However, this is also stating that V is in the span of the vectors {L_i}, where L_i = (l^j_i)_j. For example, consider the set of equations
2x + 3y − z = 5
−x + 3y + z = 1
x + y − 2z = 3
which corresponds to the matrix equation

\(\begin{pmatrix}2&3&-1\\-1&3&1\\1&1&-2\end{pmatrix}\begin{pmatrix}x\\y\\z\end{pmatrix} = \begin{pmatrix}5\\1\\3\end{pmatrix}.\)   (18.10.29)

This is the same as asking whether the vector \(\begin{pmatrix}5\\1\\3\end{pmatrix}\) lies in the span of

\(\left\{\begin{pmatrix}2\\-1\\1\end{pmatrix}, \begin{pmatrix}3\\3\\1\end{pmatrix}, \begin{pmatrix}-1\\1\\-2\end{pmatrix}\right\}.\)   (18.10.31)
Here we have taken the subspace W to be a plane through the origin and U to be a line through the origin. The hint now is to think
about what happens when you add a vector u ∈ U to a vector w ∈ W . Does this live in the union U ∪ W ?
For the second part, we take a more theoretical approach. Let's suppose that v ∈ U ∩ W and v′ ∈ U ∩ W. This implies

v ∈ U and v′ ∈ U.   (18.10.32)
So, since U is a subspace and all subspaces are vector spaces, we know that the linear combination
′
αv + β v ∈ U . (18.10.33)
Now repeat the same logic for W and you will be nearly done.
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 18.10: Movie Scripts 7-8 is shared under a not declared license and was authored, remixed, and/or curated by David Cherney,
Tom Denton, & Andrew Waldron.
18.11: Movie Scripts 9-10
G.9 Linear Independence
Worked Example
This video gives some more details behind the example for the following four vectors in R^3. Consider the following vectors in R^3:

\(v_1 = \begin{pmatrix}4\\-1\\3\end{pmatrix}, \quad v_2 = \begin{pmatrix}-3\\7\\4\end{pmatrix}, \quad v_3 = \begin{pmatrix}5\\12\\17\end{pmatrix}, \quad v_4 = \begin{pmatrix}-1\\1\\0\end{pmatrix}.\)   (18.11.1)
The example asks whether they are linearly independent, and the answer is immediate: NO, four vectors can never be linearly independent in R^3. This vector space is simply not big enough for that, but you need to understand the notion of the dimension of a vector space to see why. So we think the vectors v_1, v_2, v_3 and v_4 are linearly dependent, which means we need to show that there is a solution to

α_1 v_1 + α_2 v_2 + α_3 v_3 + α_4 v_4 = 0   (18.11.2)

with not all of the α's equal to zero.
To find this solution we need to set up a linear system. Writing out the above linear combination gives
4α_1 − 3α_2 + 5α_3 − α_4 = 0,
−α_1 + 7α_2 + 12α_3 + α_4 = 0,   (18.11.3)
3α_1 + 4α_2 + 17α_3 + 0α_4 = 0.
This can be easily handled using an augmented matrix whose columns are just the vectors we started with
\(\left(\begin{array}{rrrr|r} 4&-3&5&-1&0\\-1&7&12&1&0\\3&4&17&0&0 \end{array}\right).\)   (18.11.4)
Since there are only zeros on the right hand column, we can drop it. Now we perform row operations to achieve RREF
\(\begin{pmatrix}4&-3&5&-1\\-1&7&12&1\\3&4&17&0\end{pmatrix} \sim \begin{pmatrix}1&0&\frac{71}{25}&-\frac{4}{25}\\0&1&\frac{53}{25}&\frac{3}{25}\\0&0&0&0\end{pmatrix}.\)   (18.11.5)
This says that α_3 and α_4 are not pivot variables, so they are arbitrary; we set them to μ and ν, respectively.
Thus
\(\alpha_1 = -\frac{71}{25}\mu + \frac{4}{25}\nu, \quad \alpha_2 = -\frac{53}{25}\mu - \frac{3}{25}\nu, \quad \alpha_3 = \mu, \quad \alpha_4 = \nu.\)   (18.11.6)
In fact this is not just one relation, but infinitely many, for any choice of μ, ν . The relationship quoted in the notes is just one of
those choices.
Finally, since the vectors v_1, v_2, v_3 and v_4 are linearly dependent, we can try to eliminate some of them. The pattern here is to keep the vectors that correspond to columns with pivots. For example, setting μ = −1 (say) and ν = 0 in the above allows us to solve for v_3, while μ = 0 and ν = −1 (say) gives v_4; explicitly we get

\(v_3 = \frac{71}{25}v_1 + \frac{53}{25}v_2, \qquad v_4 = -\frac{4}{25}v_1 + \frac{3}{25}v_2.\)   (18.11.8)

This eliminates v_3 and v_4 and leaves a pair of linearly independent vectors v_1 and v_2.
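A sketch that reproduces the RREF and checks the two dependency relations exactly (sympy assumed; exact rationals avoid rounding noise):

# Sketch: rank/RREF check and exact verification of the relations above.
from sympy import Matrix, Rational

v1 = Matrix([4, -1, 3])
v2 = Matrix([-3, 7, 4])
v3 = Matrix([5, 12, 17])
v4 = Matrix([-1, 1, 0])

A = Matrix.hstack(v1, v2, v3, v4)
print(A.rref()[0])   # pivots in the first two columns, last two columns free

print(v3 == Rational(71, 25) * v1 + Rational(53, 25) * v2)   # True
print(v4 == -Rational(4, 25) * v1 + Rational(3, 25) * v2)    # True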
Worked Proof
Here we will work through a quick version of the proof of Theorem 10.1.1. Let {v_i} denote a set of linearly dependent vectors, so ∑_i c^i v_i = 0 where there exists some c^k ≠ 0. Now without loss of generality we order our vectors such that c^1 ≠ 0, which we can do by simply relabeling the vectors. Then

\(c^1 v_1 = -\sum_{i=2}^{n} c^i v_i\)

and so

\(v_1 = -\sum_{i=2}^{n}\frac{c^i}{c^1}v_i.\)
when you have a vector v⃗ ∈ B^n and you want to multiply it by a scalar, your only choices are 1 and 0. This is kind of neat because it means that the possibilities are finite, so we can look at an entire vector space.
Now let's think about B^3: there is a choice you have to make for each coordinate, you can either put a 1 or a 0, so there are three places where you have to make a decision between two things. This means that you have 2^3 = 8 possibilities for vectors in B^3.
When you want to think about finding a set S that will span B^3 and is linearly independent, you want to think about how many vectors you need. You will need to have enough so that you can make every vector in B^3 using linear combinations of elements in S, but you don't want too many, so that none of them are linear combinations of each other. I suggest trying something really simple, perhaps something that looks like the columns of the identity matrix.
For part (c) you have to show that you can write every one of the elements as a linear combination of the elements in S; this will check to make sure S actually spans B^3.
For part (d), if you have two vectors that you think will span the space, you can prove that they do by repeating what you did in part (c): check that every vector can be written using only copies of these two vectors. If you don't think it will work you should show why, perhaps using an argument that counts the number of possible vectors in the span of two vectors.
In order to show that these are unique, we will suppose that they are not, and show that this causes a contradiction. So suppose
there exists a second set of constants d^i such that

w = d^1 v_1 + ⋯ + d^n v_n.   (18.11.10)
Worked Example
In this video we will work through an example of how to extend a set of linearly independent vectors to a basis. For fun, we will
take the vector space
V = {(x, y, z, w) | x, y, z, w ∈ Z_5}.   (18.11.11)

This is like four dimensional space R^4 except that the numbers can only be {0, 1, 2, 3, 4}. This is like bits, but now the rule is

0 = 5.   (18.11.12)

Thus, for example, 1/4 = 4 because 4 × 4 = 16 = 1 + 3 × 5 = 1. Don't get too caught up on this aspect; it's just a choice of base field.
Find a basis for V that includes the vectors \(\begin{pmatrix}1\\2\\3\\4\end{pmatrix}\) and \(\begin{pmatrix}0\\3\\2\\1\end{pmatrix}\).   (18.11.13)
The way to proceed is to add a known (and preferably simple) basis to the vectors given, thus we consider
\(v_1 = \begin{pmatrix}1\\2\\3\\4\end{pmatrix}, \; v_2 = \begin{pmatrix}0\\3\\2\\1\end{pmatrix}, \; e_1 = \begin{pmatrix}1\\0\\0\\0\end{pmatrix}, \; e_2 = \begin{pmatrix}0\\1\\0\\0\end{pmatrix}, \; e_3 = \begin{pmatrix}0\\0\\1\\0\end{pmatrix}, \; e_4 = \begin{pmatrix}0\\0\\0\\1\end{pmatrix}.\)   (18.11.14)
The last four vectors are clearly a basis (make sure you understand this....) and are called the canonical basis. We want to keep v_1 and v_2 but find a way to turf out two of the vectors in the canonical basis, leaving us a basis of four vectors. To do that, we have to study the possible linear dependencies

\(\alpha^1 v_1 + \alpha^2 v_2 + \alpha^3 e_1 + \alpha^4 e_2 + \alpha^5 e_3 + \alpha^6 e_4 = 0.\)   (18.11.15)

We want to find solutions for the α's which allow us to determine two of the e's. For that we use an augmented matrix
⎛ 1 0 1 0 0 0 0 ⎞
⎜ 2 3 0 1 0 0 0 ⎟
⎜ ⎟ . (18.11.16)
⎜ ⎟
⎜ 3 2 0 0 1 0 0 ⎟
⎝ ⎠
4 1 0 0 0 1 0
Next comes a bunch of row operations. Note that we have dropped the last column of zeros since it has no information--you can fill
in the row operations used above the ∼'s as an exercise:
1 0 1 0 0 0 1 0 1 0 0 0
⎛ ⎞ ⎛ ⎞
⎜2 3 0 1 0 0⎟ ⎜0 3 3 1 0 0⎟
⎜ ⎟ ∼⎜ ⎟ (18.11.17)
⎜3 2 0 0 1 0⎟ ⎜0 2 2 0 1 0⎟
⎝ ⎠ ⎝ ⎠
4 1 0 0 0 1 0 1 1 0 0 1
1 0 1 0 0 0 1 0 1 0 0 0
⎛ ⎞ ⎛ ⎞
⎜0 1 1 2 0 0⎟ ⎜0 1 1 2 0 0⎟
∼⎜ ⎟ ∼⎜ ⎟ (18.11.18)
⎜0 2 2 0 1 0⎟ ⎜0 0 0 1 1 0⎟
⎝ ⎠ ⎝ ⎠
0 1 1 0 0 1 0 0 0 3 0 1
1 0 1 0 0 0 1 0 1 0 0 0
⎛ ⎞ ⎛ ⎞
⎜0 1 1 0 3 0⎟ ⎜0 1 1 0 3 0⎟
∼⎜ ⎟ ∼⎜ ⎟ (18.11.19)
⎜0 0 0 1 1 0⎟ ⎜0 0 0 1 1 0⎟
⎝ ⎠ ⎝ ⎠
0 0 0 0 2 1 0 0 0 0 1 3
1 0 1 0 0 0
⎛– ⎞
⎜0 1 1 0 0 1⎟
∼⎜ – ⎟ (18.11.20)
⎜0 0 0 1 0 2⎟
–
⎝ ⎠
0 0 0 0 1 3
–
The pivots are underlined. The columns corresponding to non-pivot variables are the ones that can be eliminated--their coefficients
(the α 's) will be arbitrary, so set them all to zero save for the one next to the vector you are solving for which can be taken to be
unity. Thus that vector can certainly be expressed in terms of previous ones. Hence, altogether, our basis is
\(\left\{\begin{pmatrix}1\\2\\3\\4\end{pmatrix}, \begin{pmatrix}0\\3\\2\\1\end{pmatrix}, \begin{pmatrix}0\\1\\0\\0\end{pmatrix}, \begin{pmatrix}0\\0\\1\\0\end{pmatrix}\right\}.\)   (18.11.21)
Finally, as a check, note that e 1 = v1 + v2 which explains why we had to throw it away.
\(B^2 = \left\{\begin{pmatrix}0\\0\end{pmatrix}, \begin{pmatrix}1\\0\end{pmatrix}, \begin{pmatrix}0\\1\end{pmatrix}, \begin{pmatrix}1\\1\end{pmatrix}\right\},\)   (18.11.22)
and so choosing any two non-zero vectors will form a basis. Now in general we note that we can build up a basis e by arbitrarily i
(independently) choosing the first i − 1 entries, then setting the i-th entry to 1 and all higher entries to 0.
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 18.11: Movie Scripts 9-10 is shared under a not declared license and was authored, remixed, and/or curated by David Cherney,
Tom Denton, & Andrew Waldron.
18.12: Movie Scripts 11-12
G.11 Eigenvalues and Eigenvectors
2 ×2 Example
Here is an example of how to find the eigenvalues and eigenvectors of a 2 × 2 matrix.
4 2
M =( ). (18.12.1)
1 3
Remember that an eigenvector v with eigenvalue λ for M will be a vector such that M v = λv i.e. M (v) − λI (v) = 0⃗ . When we
are talking about a nonzero v then this means that det(M − λI ) = 0 . We will start by finding the eigenvalues that make this
statement true. First we compute
4 2 λ 0 4 −λ 2
det(M − λI ) = det (( ) −( )) = det ( ) (18.12.2)
1 3 0 λ 1 3 −λ
so det(M − λI) = (4 − λ)(3 − λ) − 2 ⋅ 1. We set this equal to zero to find values of λ that make this true:

(4 − λ)(3 − λ) − 2 ⋅ 1 = 10 − 7λ + λ² = (2 − λ)(5 − λ) = 0.   (18.12.3)
This means that λ = 2 and λ = 5 are solutions. Now if we want to find the eigenvectors that correspond to these values we look at
vectors v such that
4 −λ 2
⃗
( )v=0. (18.12.4)
1 3 −λ
For λ = 5:

\(\begin{pmatrix}4-5&2\\1&3-5\end{pmatrix}\begin{pmatrix}x\\y\end{pmatrix} = \begin{pmatrix}-1&2\\1&-2\end{pmatrix}\begin{pmatrix}x\\y\end{pmatrix} = \vec 0.\)   (18.12.5)

This gives us the equalities −x + 2y = 0 and x − 2y = 0, which both give the line y = x/2. Any point on this line, so for example \(\begin{pmatrix}2\\1\end{pmatrix}\), is an eigenvector with eigenvalue λ = 5.
For λ = 2 the same computation with \(\begin{pmatrix}2&2\\1&1\end{pmatrix}\) gives 2x + 2y = 0 and x + y = 0, so any point on the line y = −x is an eigenvector with eigenvalue 2. This solution could be written neatly as
\(\lambda_1 = 5, \; v_1 = \begin{pmatrix}2\\1\end{pmatrix} \quad\text{and}\quad \lambda_2 = 2, \; v_2 = \begin{pmatrix}1\\-1\end{pmatrix}.\)   (18.12.7)
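A numerical cross-check of this example (numpy assumed):

# Sketch: confirm the eigenvalues and eigenvectors found above.
import numpy as np

M = np.array([[4.0, 2.0], [1.0, 3.0]])
print(np.linalg.eigvals(M))          # [5. 2.] (possibly in another order)

v1 = np.array([2.0, 1.0])            # claimed eigenvector for lambda = 5
v2 = np.array([1.0, -1.0])           # claimed eigenvector for lambda = 2
print(np.allclose(M @ v1, 5 * v1))   # True
print(np.allclose(M @ v2, 2 * v2))   # True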
and we note that we can just read off the eigenvector e_1 with eigenvalue λ. However the characteristic polynomial of J_2 is P_{J_2}(μ) = (μ − λ)², so the only possible eigenvalue is λ, but we claim it does not have a second independent eigenvector v. To see this, we require that

λ v^1 + v^2 = λ v^1
λ v^2 = λ v^2,

which forces v^2 = 0, so every eigenvector is proportional to e_1. The same story holds for the n × n Jordan cell

\(J_n = \begin{pmatrix} \lambda & 1 & 0 & \cdots & 0 \\ 0 & \lambda & 1 & \ddots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & \cdots & 0 & \lambda & 1 \\ 0 & \cdots & 0 & 0 & \lambda \end{pmatrix}\)   (18.12.9)
and we immediately see that we must have V = e1 . Next for λ = 2 , we need to solve (M − 2I 3 )v =0 or
1 2
v +v =0
2 3
v +v =0
0 = 0,
Eigenvalues
Eigenvalues and eigenvectors are extremely important. In this video we review the theory of eigenvalues. Consider a linear
transformation
L : V ⟶ V (18.12.11)
where dim V =n <∞ . Since V is finite dimensional, we can represent L by a square matrix M by choosing a basis for V .
So the eigenvalue equation
Lv = λv (18.12.12)
becomes
M v = λv (18.12.13)
where v is a column vector and M is an n × n matrix (both expressed in whatever basis we chose for V ). The scalar λ is called an
eigenvalue of M and the job of this video is to show you how to find all the eigenvalues of M .
The first step is to put all terms on the left hand side of the equation, this gives
(M − λI )v = 0 . (18.12.14)
Notice how we used the identity matrix I in order to get a matrix times v equaling zero. Now here comes a VERY important fact
Nu = 0 and u ≠ 0 ⟺ det N = 0.   (18.12.15)
I.e., a square matrix can have an eigenvector with vanishing eigenvalue if and only if its determinant vanishes! Hence
det(M − λI ) = 0. (18.12.16)
The quantity on the left (up to a possible minus sign) equals the so-called characteristic polynomial
PM (λ) := det(λI − M ) . (18.12.17)
which is clearly a polynomial of order 2 in λ . For the n×n case, the order n term comes from the product of diagonal matrix
elements also.
There is an amazing fact about polynomials called the fundamental theorem of algebra: they can always be factored over the complex numbers. This means that a degree n polynomial has n complex roots (counted with multiplicity). The word "can" does not mean that explicit formulas for the roots are known (in fact explicit formulas can only be given for degree four or less). The necessity for complex numbers is easily seen from a polynomial like

z² + 1.   (18.12.19)

This has no real roots, but over the complex numbers
we have
2
z + 1 = (z − i)(z + i) . (18.12.21)
Returning to our characteristic polynomial, we call on the fundamental theorem of algebra to write
PM (λ) = (λ − λ1 )(λ − λ2 ) ⋯ (λ − λn ) . (18.12.22)
The roots λ , λ ,...,λ are the eigenvalues of M (or its underlying linear transformation L).
1 2 n
Eigenspaces
Consider the linear map
−4 6 6
⎛ ⎞
L =⎜ 0 2 0⎟. (18.12.23)
⎝ ⎠
−3 3 5
In a basis of its eigenvectors this linear map diagonalizes as

\(L = Q\begin{pmatrix}-1&0&0\\0&2&0\\0&0&2\end{pmatrix}Q^{-1},\)   (18.12.24)

where

\(Q = \begin{pmatrix}2&1&1\\0&0&1\\1&1&0\end{pmatrix}.\)   (18.12.25)

The two eigenvectors

\(v^{(2)}_1 = \begin{pmatrix}1\\0\\1\end{pmatrix}, \qquad v^{(2)}_2 = \begin{pmatrix}1\\1\\0\end{pmatrix}\)   (18.12.26)
span the eigenspace E^{(2)} of the eigenvalue 2, and for an explicit example, if we take

\(v = 2v^{(2)}_1 - v^{(2)}_2 = \begin{pmatrix}1\\-1\\2\end{pmatrix}\)   (18.12.27)

we have

\(Lv = \begin{pmatrix}2\\-2\\4\end{pmatrix} = 2v,\)   (18.12.28)
so v ∈ E^{(2)}. In general, we note that linearly independent vectors v^{(λ)}_i with the same eigenvalue λ span an eigenspace, since for any v = ∑_i c^i v^{(λ)}_i we have

\(Lv = \sum_i c^i Lv^{(\lambda)}_i = \sum_i c^i \lambda v^{(\lambda)}_i = \lambda\sum_i c^i v^{(\lambda)}_i = \lambda v.\)   (18.12.29)
\(v(1) = \begin{pmatrix}x(1)\\y(1)\end{pmatrix} = M\begin{pmatrix}x(0)\\y(0)\end{pmatrix}.\)   (18.12.30)

To find the eigenvalues of M we solve

\(\det\begin{pmatrix}3-\lambda&2\\2&3-\lambda\end{pmatrix} = 0\)   (18.12.33)

for λ.
By computing the determinant and solving for λ we can find the eigenvalues λ =1 and 5, and the corresponding eigenvectors.
You should do the computations to find these for yourself.
When we think about the question in part (b) which asks to find a vector v(0) such that v(0) = v(1) = v(2) … , we must look for a
vector that satisfies v = M v . What eigenvalue does this correspond to? If you found a v(0) with this property would cv(0) for a
scalar c also work? Remember that eigenvectors have to be nonzero, so what if c = 0 ?
For part (c) if we tried an eigenvector would we have restrictions on what the eigenvalue should be? Think about what it means to
be pointed in the same direction.
G.12 Diagonalization
Non Diagonalizable Example
First recall that the derivative operator is linear and that we can write it as the matrix
0 1 0 0 ⋯
⎛ ⎞
d ⎜0 0 2 0 ⋯ ⎟
⎜ ⎟
=⎜ ⎟. (18.12.34)
dx ⎜0 0 0 3 ⋯ ⎟
⎜ ⎟
⎝ ⎠
⋮ ⋮ ⋮ ⋮ ⋱
We note that this transforms into an infinite Jordan cell with eigenvalue 0 or
18.12.4 https://math.libretexts.org/@go/page/2200
0 1 0 0 ⋯
⎛ ⎞
⎜0 0 1 0 ⋯ ⎟
⎜ ⎟
⎜ ⎟ (18.12.35)
⎜0 0 0 1 ⋯ ⎟
⎜ ⎟
⎝ ⎠
⋮ ⋮ ⋮ ⋮ ⋱
which is the matrix in the basis {x^n / n!} (where for n = 0, we just have 1). Therefore we note that 1 (the constant polynomial) is the only eigenvector with eigenvalue 0 for polynomials, since they have finite degree, and so the derivative is not diagonalizable. Note that we are ignoring infinite cases for simplicity, but if you want to consider infinite terms such as convergent series, or all formal power series where there are no conditions on convergence, there are many eigenvectors. Can you find some? This is an example of how things can change in infinite dimensional spaces.
For a more finite example, consider the space P^C_3 of complex polynomials of degree at most 3, and recall that the derivative D can be written as

\(D = \begin{pmatrix}0&1&0&0\\0&0&2&0\\0&0&0&3\\0&0&0&0\end{pmatrix}.\)   (18.12.36)

You can easily check that the only eigenvector is 1 with eigenvalue 0, since D always lowers the degree of a polynomial by 1 each time it is applied. Note that this is a nilpotent matrix since D⁴ = 0, but the only nilpotent matrix that is "diagonalizable'' is the 0 matrix.
Calling the basis vectors \(\vec e_1 := (1, 0)\) and \(\vec e_2 := (0, 1)\), this representation would label what's in the barrel by a vector

\(\vec x := x\vec e_1 + y\vec e_2 = \begin{pmatrix}\vec e_1 & \vec e_2\end{pmatrix}\begin{pmatrix}x\\y\end{pmatrix}.\)   (18.12.37)
Since this is the method ordinary people would use, we will call this the “engineer's'' method!
But this is not the approach nutritionists would use. They would note the amount of sugar and total number of fruit (s, f ):
WARNING: To make sense of what comes next you need to allow for the possibility of a negative amount of fruit or sugar. This would be just like a bank, where if money is owed to somebody else, we can use a minus sign.
The vector x⃗ says what is in the barrel and does not depend on which mathematical description is employed. The way nutritionists label x⃗ is in terms of a pair of basis vectors \(\vec f_1\) and \(\vec f_2\):

\(\vec x = s\vec f_1 + f\vec f_2 = \begin{pmatrix}\vec f_1 & \vec f_2\end{pmatrix}\begin{pmatrix}s\\f\end{pmatrix}.\)   (18.12.38)
The vector x⃗ labels generally the contents of the barrel. The vector \(\vec e_1\) corresponds to one apple and no oranges. The vector \(\vec e_2\) is one orange and no apples. The vector \(\vec f_1\) means one unit of sugar and zero total fruit (to achieve this you could lend out some apples and keep a few oranges). Finally the vector \(\vec f_2\) represents a total of one piece of fruit and no sugar.
You might remember that the amount of sugar in an apple is called λ, while oranges have twice as much sugar as apples. Thus

s = λ(x + 2y)
f = x + y.   (18.12.39)

Essentially, this is already our change of basis formula, but let's play around and put it in our notation. First we can write this as a matrix

\(\begin{pmatrix}s\\f\end{pmatrix} = \begin{pmatrix}\lambda&2\lambda\\1&1\end{pmatrix}\begin{pmatrix}x\\y\end{pmatrix}.\)   (18.12.40)
We can easily invert this to get

\(\begin{pmatrix}x\\y\end{pmatrix} = \begin{pmatrix}-\frac{1}{\lambda}&2\\\frac{1}{\lambda}&-1\end{pmatrix}\begin{pmatrix}s\\f\end{pmatrix}.\)   (18.12.41)
Comparing to the nutritionist's formula for the same object x⃗ we learn that

\(\vec f_1 = -\frac{1}{\lambda}(\vec e_1 - \vec e_2) \quad\text{and}\quad \vec f_2 = 2\vec e_1 - \vec e_2.\)   (18.12.43)

Rearranging these equations we find the change of basis matrix P from the engineer's basis to the nutritionist's basis:

\(\begin{pmatrix}\vec f_1 & \vec f_2\end{pmatrix} = \begin{pmatrix}\vec e_1 & \vec e_2\end{pmatrix}\begin{pmatrix}-\frac{1}{\lambda}&2\\\frac{1}{\lambda}&-1\end{pmatrix} =: \begin{pmatrix}\vec e_1 & \vec e_2\end{pmatrix}P.\)   (18.12.44)
We can also go the other direction, changing from the nutritionist's basis to the engineer's basis:

\(\begin{pmatrix}\vec e_1 & \vec e_2\end{pmatrix} = \begin{pmatrix}\vec f_1 & \vec f_2\end{pmatrix}\begin{pmatrix}\lambda&2\lambda\\1&1\end{pmatrix} =: \begin{pmatrix}\vec f_1 & \vec f_2\end{pmatrix}Q.\)   (18.12.45)
In the engineer's basis, the system of equations we need to solve is MX = V, where

\(M := \begin{pmatrix}1&1\\2&-1\end{pmatrix}, \quad X := \begin{pmatrix}x\\y\end{pmatrix}, \quad V := \begin{pmatrix}27\\0\end{pmatrix}.\)   (18.12.49)

Note that

\(\vec x = \begin{pmatrix}\vec e_1 & \vec e_2\end{pmatrix}X,\)   (18.12.50)

and similarly the right hand side of the system packages into the vector \(\begin{pmatrix}\vec e_1 & \vec e_2\end{pmatrix}V\).   (18.12.51)
Now the matrix M is the matrix of some linear transformation L in the basis of the engineers.
Let's convert it to the basis of the nutritionists:

\(L\vec x = L\begin{pmatrix}\vec f_1 & \vec f_2\end{pmatrix}\begin{pmatrix}s\\f\end{pmatrix} = L\begin{pmatrix}\vec e_1 & \vec e_2\end{pmatrix}P\begin{pmatrix}s\\f\end{pmatrix} = \begin{pmatrix}\vec e_1 & \vec e_2\end{pmatrix}MP\begin{pmatrix}s\\f\end{pmatrix}.\)   (18.12.52)

Note here that the linear transformation L acts on vectors -- these are the objects we have written with a → sign on top of them. It does not act on columns of numbers!
We can easily compute MP and find

\(MP = \begin{pmatrix}1&1\\2&-1\end{pmatrix}\begin{pmatrix}-\frac{1}{\lambda}&2\\\frac{1}{\lambda}&-1\end{pmatrix} = \begin{pmatrix}0&1\\-\frac{3}{\lambda}&5\end{pmatrix}.\)   (18.12.53)

Note that P⁻¹MP is the matrix of L in the nutritionist's basis, but we don't need this quantity right now.
Thus the last task is to solve the system; let's solve for sugar and fruit. We need to solve

\(MP\begin{pmatrix}s\\f\end{pmatrix} = \begin{pmatrix}0&1\\-\frac{3}{\lambda}&5\end{pmatrix}\begin{pmatrix}s\\f\end{pmatrix} = \begin{pmatrix}27\\0\end{pmatrix}.\)   (18.12.54)

This is solved immediately by forward substitution (the nutritionist's basis is nice since it directly gives f):

f = 27 and s = 45λ.   (18.12.55)
2 ×2 Example
Lets diagonalize the matrix M from a previous example Eigenvalues and Eigenvectors: 2 × 2 Example
\(M = \begin{pmatrix}4&2\\1&3\end{pmatrix}.\)   (18.12.56)

From that example we know the eigenvalues λ_1 = 5 and λ_2 = 2, with eigenvectors \(v_1 = \begin{pmatrix}2\\1\end{pmatrix}\) and \(v_2 = \begin{pmatrix}1\\-1\end{pmatrix}\), so we take \(P = \begin{pmatrix}2&1\\1&-1\end{pmatrix}\). So we get:

\(D = P^{-1}MP = \frac{1}{3}\begin{pmatrix}1&1\\1&-2\end{pmatrix}\begin{pmatrix}4&2\\1&3\end{pmatrix}\begin{pmatrix}2&1\\1&-1\end{pmatrix} = \begin{pmatrix}5&0\\0&2\end{pmatrix}.\)   (18.12.60)
But this doesn't really give any intuition into why this happens. Let's look at what happens when we apply this matrix D = P⁻¹MP to a vector \(v = \begin{pmatrix}x\\y\end{pmatrix}\). Notice that applying P translates v into x v_1 + y v_2:

\(P^{-1}MP\begin{pmatrix}x\\y\end{pmatrix} = P^{-1}M\begin{pmatrix}2x+y\\x-y\end{pmatrix} = P^{-1}M\left[\begin{pmatrix}2x\\x\end{pmatrix} + \begin{pmatrix}y\\-y\end{pmatrix}\right]\)   (18.12.61)

\(= P^{-1}\left[(x)M\begin{pmatrix}2\\1\end{pmatrix} + (y)M\begin{pmatrix}1\\-1\end{pmatrix}\right] = P^{-1}\left[(x)Mv_1 + (y)Mv_2\right].\)   (18.12.62)

Since v_1 and v_2 are eigenvectors, this simplifies:

\(P^{-1}\left[(x)Mv_1 + (y)Mv_2\right] = P^{-1}\left[(x\lambda_1)v_1 + (y\lambda_2)v_2\right] = (5x)P^{-1}v_1 + (2y)P^{-1}v_2 = (5x)\begin{pmatrix}1\\0\end{pmatrix} + (2y)\begin{pmatrix}0\\1\end{pmatrix} = \begin{pmatrix}5x\\2y\end{pmatrix}.\)

Notice that multiplying by P⁻¹ converts v_1 and v_2 back into \(\begin{pmatrix}1\\0\end{pmatrix}\) and \(\begin{pmatrix}0\\1\end{pmatrix}\) respectively. This shows us why D = P⁻¹MP should be the diagonal matrix

\(D = \begin{pmatrix}\lambda_1&0\\0&\lambda_2\end{pmatrix} = \begin{pmatrix}5&0\\0&2\end{pmatrix}.\)   (18.12.63)
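A quick machine check of this diagonalization (numpy assumed):

# Sketch: verify that P^{-1} M P is the diagonal matrix of eigenvalues.
import numpy as np

M = np.array([[4.0, 2.0], [1.0, 3.0]])
P = np.array([[2.0, 1.0], [1.0, -1.0]])

D = np.linalg.inv(P) @ M @ P
print(np.round(D, 10))   # [[5. 0.] [0. 2.]]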
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 18.12: Movie Scripts 11-12 is shared under a not declared license and was authored, remixed, and/or curated by David Cherney,
Tom Denton, & Andrew Waldron.
18.13: Movie Scripts 13-14
G.13 Orthonormal Bases and Complements
All Orthonormal Bases for R^2
We wish to find all orthonormal bases for the space R^2, and they are {e_1^θ, e_2^θ} up to reordering, where

\(e_1^{\theta} = \begin{pmatrix}\cos\theta\\\sin\theta\end{pmatrix}, \qquad e_2^{\theta} = \begin{pmatrix}-\sin\theta\\\cos\theta\end{pmatrix},\)   (18.13.1)

for some θ ∈ [0, 2π). Now first we need to show that for a fixed θ the pair is orthogonal:

\(e_1^{\theta}\cdot e_2^{\theta} = -\sin\theta\cos\theta + \cos\theta\sin\theta = 0.\)   (18.13.2)
Also we have

\(\|e_1^{\theta}\|^2 = \|e_2^{\theta}\|^2 = \sin^2\theta + \cos^2\theta = 1,\)   (18.13.3)

and hence {e_1^θ, e_2^θ} is an orthonormal basis. To show that every orthonormal basis of R^2 is {e_1^θ, e_2^θ} for some θ, consider an orthonormal basis {b_1, b_2} and note that b_1 forms an angle ϕ with the vector e_1^0 (which is e_1). Thus b_1 = e_1^ϕ, and if b_2 = e_2^ϕ we are done; otherwise b_2 = −e_2^ϕ and it is the reflected version. However we can do the same thing except starting with b_2 and get b_2 = e_1^ψ and b_1 = e_2^ψ, since we have just interchanged the two basis vectors, which corresponds to a reflection which picks up a minus sign.
A 4 × 4 Gram-Schmidt Example
Let's do an example of how to "Gram-Schmidt" some vectors in R^4. Given the following vectors

\(v_1 = \begin{pmatrix}0\\1\\0\\0\end{pmatrix}, \quad v_2 = \begin{pmatrix}0\\1\\1\\0\end{pmatrix}, \quad v_3 = \begin{pmatrix}3\\1\\1\\0\end{pmatrix}, \quad v_4 = \begin{pmatrix}1\\1\\1\\2\end{pmatrix},\)   (18.13.4)
we start with v_1:

\(v_1^{\perp} = v_1 = \begin{pmatrix}0\\1\\0\\0\end{pmatrix}.\)   (18.13.5)
Then

\(v_2^{\perp} = v_2 - \frac{(v_1^{\perp}\cdot v_2)}{\|v_1^{\perp}\|^2}v_1^{\perp} = \begin{pmatrix}0\\1\\1\\0\end{pmatrix} - \frac{1}{1}\begin{pmatrix}0\\1\\0\\0\end{pmatrix} = \begin{pmatrix}0\\0\\1\\0\end{pmatrix}.\)

Next,

\(v_3^{\perp} = v_3 - \frac{(v_1^{\perp}\cdot v_3)}{\|v_1^{\perp}\|^2}v_1^{\perp} - \frac{(v_2^{\perp}\cdot v_3)}{\|v_2^{\perp}\|^2}v_2^{\perp} = \begin{pmatrix}3\\1\\1\\0\end{pmatrix} - \begin{pmatrix}0\\1\\0\\0\end{pmatrix} - \begin{pmatrix}0\\0\\1\\0\end{pmatrix} = \begin{pmatrix}3\\0\\0\\0\end{pmatrix}.\)
This last step requires subtracting off the term of the form u⋅v
u⋅u
u for each of the previously defined basis vectors.
\(v_4^{\perp} = v_4 - \frac{(v_1^{\perp}\cdot v_4)}{\|v_1^{\perp}\|^2}v_1^{\perp} - \frac{(v_2^{\perp}\cdot v_4)}{\|v_2^{\perp}\|^2}v_2^{\perp} - \frac{(v_3^{\perp}\cdot v_4)}{\|v_3^{\perp}\|^2}v_3^{\perp}\)

\(= \begin{pmatrix}1\\1\\1\\2\end{pmatrix} - \begin{pmatrix}0\\1\\0\\0\end{pmatrix} - \begin{pmatrix}0\\0\\1\\0\end{pmatrix} - \frac{3}{9}\begin{pmatrix}3\\0\\0\\0\end{pmatrix} = \begin{pmatrix}0\\0\\0\\2\end{pmatrix}.\)
Now v_1^⊥, v_2^⊥, v_3^⊥, and v_4^⊥ are an orthogonal basis. Notice that even with very, very nice looking vectors we end up having to do quite a bit of arithmetic. This is a good reason to use programs like matlab to check your work.
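In that spirit, here is a minimal Gram-Schmidt sketch (Python with numpy assumed; the input vectors are the ones listed above):

# Sketch: redo the Gram-Schmidt computation above by machine.
import numpy as np

vs = [np.array([0., 1., 0., 0.]),
      np.array([0., 1., 1., 0.]),
      np.array([3., 1., 1., 0.]),
      np.array([1., 1., 1., 2.])]

ortho = []
for v in vs:
    w = v.copy()
    for u in ortho:                        # subtract the component of v along each earlier vector
        w = w - (u @ v) / (u @ u) * u
    ortho.append(w)

for w in ortho:
    print(w)   # (0,1,0,0), (0,0,1,0), (3,0,0,0), (0,0,0,2)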
\(M = \begin{pmatrix} | & | & | \\ m_1 & m_2 & m_3 \\ | & | & | \end{pmatrix} = \begin{pmatrix} 1 & 1 & -1 \\ 0 & 1 & 2 \\ -1 & 1 & 1 \end{pmatrix}.\)   (18.13.6)
First we normalize m_1 to get \(m_1' = \frac{m_1}{\|m_1\|}\), where \(\|m_1\| = r^1_1 = \sqrt 2\), which gives the decomposition

\(Q_1 = \begin{pmatrix} \frac{1}{\sqrt 2} & 1 & -1 \\ 0 & 1 & 2 \\ -\frac{1}{\sqrt 2} & 1 & 1 \end{pmatrix}, \qquad R_1 = \begin{pmatrix} \sqrt 2 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.\)   (18.13.7)
Next we find

\(t_2 = m_2 - (m_1'\cdot m_2)\,m_1' = m_2 - r^1_2\,m_1' = m_2 - 0\,m_1',\)   (18.13.8)

noting that

\(m_1'\cdot m_1' = \|m_1'\|^2 = 1\)   (18.13.9)

and \(\|t_2\| = r^2_2 = \sqrt 3\), and so we get \(m_2' = \frac{t_2}{\|t_2\|}\) with the decomposition

\(Q_2 = \begin{pmatrix} \frac{1}{\sqrt 2} & \frac{1}{\sqrt 3} & -1 \\ 0 & \frac{1}{\sqrt 3} & 2 \\ -\frac{1}{\sqrt 2} & \frac{1}{\sqrt 3} & 1 \end{pmatrix}, \qquad R_2 = \begin{pmatrix} \sqrt 2 & 0 & 0 \\ 0 & \sqrt 3 & 0 \\ 0 & 0 & 1 \end{pmatrix}.\)   (18.13.10)
Finally we calculate

\(t_3 = m_3 - (m_1'\cdot m_3)\,m_1' - (m_2'\cdot m_3)\,m_2' = m_3 - r^1_3\,m_1' - r^2_3\,m_2' = m_3 + \sqrt 2\,m_1' - \frac{2}{\sqrt 3}\,m_2',\)

again noting \(m_2'\cdot m_2' = \|m_2'\|^2 = 1\), and let \(m_3' = \frac{t_3}{\|t_3\|}\) where \(\|t_3\| = r^3_3 = 2\sqrt{\frac{2}{3}}\). Thus we get our final M = QR decomposition as

\(Q = \begin{pmatrix} \frac{1}{\sqrt 2} & \frac{1}{\sqrt 3} & -\frac{1}{\sqrt 6} \\ 0 & \frac{1}{\sqrt 3} & \sqrt{\frac{2}{3}} \\ -\frac{1}{\sqrt 2} & \frac{1}{\sqrt 3} & -\frac{1}{\sqrt 6} \end{pmatrix}, \qquad R = \begin{pmatrix} \sqrt 2 & 0 & -\sqrt 2 \\ 0 & \sqrt 3 & \frac{2}{\sqrt 3} \\ 0 & 0 & 2\sqrt{\frac{2}{3}} \end{pmatrix}.\)   (18.13.11)
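A quick library cross-check of this decomposition (numpy assumed; note that a library routine may pick different signs for the columns of Q, so compare the product QR rather than Q and R individually):

# Sketch: check the worked QR example above against numpy's built-in QR.
import numpy as np

M = np.array([[1., 1., -1.],
              [0., 1., 2.],
              [-1., 1., 1.]])

Q, R = np.linalg.qr(M)
print(np.allclose(Q @ R, M))             # True
print(np.allclose(Q.T @ Q, np.eye(3)))   # True: Q is orthogonal
print(R)                                 # upper triangular, possibly with flipped signs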
Overview
This video considers solution sets for linear systems with three unknowns. These are often called (x, y, z) and label points in R^3. Let us review the possibilities:
1. If there are no equations at all, then any (x, y, z) is a solution, so the solution set is all of R^3. The picture looks like this: (Fig1)
2. For a single equation, the solution is a plane. The picture looks like this: (Fig2)
3. For two equations, we must look at two planes. These usually intersect along a line, so the solution set will also (usually) be a
line: (Fig3)
4. For three equations, most often their intersection will be a single point so the solution will then be unique: (Fig4)
5. Of course stuff can go wrong. Two different looking equations could determine the same plane, or worse equations could be
inconsistent. If the equations are inconsistent, there will be no solutions at all. For example, if you had four equations
determining four parallel planes the solution set would be empty. This looks like this: (Fig5)
Fig1: R 3
18.13.3 https://math.libretexts.org/@go/page/2201
Fig2: Plane in R 3
However, the basis is not orthonormal, so we know nothing about the lengths of the basis vectors (save that they cannot vanish).
To complete the hint, let's use the dot product to compute a formula for \(c^{1}\) in terms of the basis vectors and \(v\). Consider
$$v_{1}\cdot v = c^{1}\, v_{1}\cdot v_{1} + c^{2}\, v_{1}\cdot v_{2} + \cdots + c^{n}\, v_{1}\cdot v_{n} = c^{1}\, v_{1}\cdot v_{1}\,,$$
since the basis vectors are orthogonal to one another. Solving for \(c^{1}\) (remembering that \(v_{1}\cdot v_{1}\neq 0\)) gives
$$c^{1}=\frac{v_{1}\cdot v}{v_{1}\cdot v_{1}}. \qquad (18.13.12)$$
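The analogous formula holds for every coefficient \(c^{i}\). As a concrete illustration, here is a small Python sketch (the basis below is made up for the example) that recovers all the coefficients of a vector in an orthogonal, but not orthonormal, basis.

import numpy as np

basis = [np.array([1., 1., 0.]),
         np.array([1., -1., 0.]),
         np.array([0., 0., 2.])]       # orthogonal, but not unit length
v = np.array([3., 5., -4.])

coeffs = [(b @ v) / (b @ b) for b in basis]        # c^i = (v_i . v) / (v_i . v_i)
print(coeffs)                                      # [4.0, -1.0, -2.0]
print(sum(c * b for c, b in zip(coeffs, basis)))   # reassembles v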
1. The expression \(\frac{u\cdot v}{u\cdot u}\) appearing in this formula is a scalar, so \(v^{\perp}\) is a linear combination of \(v\) and \(u\). Do you think it is in the span?
2. What is the angle between \(v^{\perp}\) and \(u\)? This part will make more sense if you think back to the dot product formulas you probably first saw in multivariable calculus: since \(u\cdot v=\|u\|\,\|v\|\cos(\theta)\) and \(\cos(\frac{\pi}{2})=0\), perpendicular vectors give \(u\cdot v=0\). Now try to compute the dot product of \(u\) and \(v^{\perp}\) to find \(\|u\|\,\|v^{\perp}\|\cos(\theta)\):
$$\begin{aligned}
u\cdot v^{\perp} &= u\cdot\Bigl(v-\frac{u\cdot v}{u\cdot u}\,u\Bigr)\\
&= u\cdot v - u\cdot\Bigl(\frac{u\cdot v}{u\cdot u}\Bigr)u\\
&= u\cdot v - \Bigl(\frac{u\cdot v}{u\cdot u}\Bigr)\,u\cdot u
\end{aligned}$$
Now you finish simplifying and see if you can figure out what \(\theta\) has to be.
3. Given your solution to the above, how can you find a third vector perpendicular to both \(u\) and \(v^{\perp}\)? Remember what other things you learned in multivariable calculus? This might be a good time to remind yourself what the cross product does.
4. Construct an orthonormal basis for \(\mathbb{R}^{3}\) from \(u\) and \(v^{\perp}\). If you did part (c) you can probably find 3 orthogonal vectors to make an orthogonal basis. All you need to do to turn this into an orthonormal basis is make these into unit vectors.
5. Test your abstract formulae starting with
$$u=(1\ \ 2\ \ 0)\quad\text{and}\quad v=(0\ \ 1\ \ 1). \qquad (18.13.14)$$
Try it out, and if you get stuck try drawing a sketch of the vectors you have; a numerical check follows below.
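Here is that numerical check in Python (an added sketch, not part of the original hint), using the vectors u = (1, 2, 0) and v = (0, 1, 1) from part 5.

import numpy as np

u = np.array([1., 2., 0.])
v = np.array([0., 1., 1.])

v_perp = v - (u @ v) / (u @ u) * u       # the part of v perpendicular to u
print(u @ v_perp)                        # 0.0 (up to rounding), so the angle is pi/2
w = np.cross(u, v_perp)                  # a third vector perpendicular to both
B = np.array([x / np.linalg.norm(x) for x in (u, v_perp, w)])   # rows are an orthonormal basis
print(np.round(B @ B.T, 6))              # identity matrix confirms orthonormality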
Write the columns of \(M\) as a set of 3 vectors
$$v_{1}=\begin{pmatrix}1\\-1\\-1\end{pmatrix},\quad v_{2}=\begin{pmatrix}0\\2\\-2\end{pmatrix},\quad v_{3}=\begin{pmatrix}2\\0\\2\end{pmatrix}. \qquad (18.13.16)$$
Then you need to remember that we are searching for a decomposition
$$M=QR$$
where \(Q\) is an orthogonal matrix. Thus the upper triangular matrix \(R=Q^{T}M\) and \(Q^{T}Q=I\). Moreover, orthogonal matrices perform rotations. To see this, compare the inner product \(u\cdot v=u^{T}v\) of vectors \(u\) and \(v\) with that of \(Qu\) and \(Qv\):
$$(Qu)\cdot(Qv)=(Qu)^{T}(Qv)=u^{T}Q^{T}Qv=u^{T}v=u\cdot v. \qquad (18.13.17)$$
Since the dot product doesn't change, we learn that \(Q\) does not change angles or lengths of vectors. Now, here's an interesting procedure: rotate \(v_{1}\), \(v_{2}\) and \(v_{3}\) such that \(v_{1}\) is along the \(x\)-axis and \(v_{2}\) is in the \(xy\)-plane. Then if you put these rotated vectors in a matrix you get something of the form
$$\begin{pmatrix} a & b & c \\ 0 & d & e \\ 0 & 0 & f \end{pmatrix}, \qquad (18.13.18)$$
which is exactly the upper triangular shape we want for \(R\). The column
$$\begin{pmatrix} a \\ 0 \\ 0 \end{pmatrix} \qquad (18.13.19)$$
is the rotated \(v_{1}\) so must have length \(\|v_{1}\|=\sqrt{3}\). Thus \(a=\sqrt{3}\). The rotated \(v_{2}\) is
$$\begin{pmatrix} b \\ d \\ 0 \end{pmatrix} \qquad (18.13.20)$$
and must have length \(\|v_{2}\|=2\sqrt{2}\). Also the dot product between
$$\begin{pmatrix} a \\ 0 \\ 0 \end{pmatrix}\quad\text{and}\quad\begin{pmatrix} b \\ d \\ 0 \end{pmatrix} \qquad (18.13.21)$$
is \(ab\) and must equal \(v_{1}\cdot v_{2}=0\). (That \(v_{1}\) and \(v_{2}\) were orthogonal is just a coincidence here….) Thus \(b=0\). So now we know most of the matrix \(R\):
$$R=\begin{pmatrix} \sqrt{3} & 0 & c \\ 0 & 2\sqrt{2} & e \\ 0 & 0 & f \end{pmatrix}. \qquad (18.13.22)$$
You can work out the last column using the same ideas. Thus it only remains to compute \(Q\) from
$$Q=MR^{-1}. \qquad (18.13.23)$$
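If the columns above are assumed, a quick numpy cross-check (added here, not in the original) confirms the lengths sqrt(3) and 2*sqrt(2) and that Q comes out orthogonal; numpy's routine may differ from the hand construction by signs.

import numpy as np

M = np.array([[ 1.,  0., 2.],
              [-1.,  2., 0.],
              [-1., -2., 2.]])

Q, R = np.linalg.qr(M)
print(np.allclose(Q.T @ Q, np.eye(3)))   # True: Q is orthogonal
print(np.round(R, 4))                    # upper triangular; |R[0,0]| = sqrt(3), R[0,1] = 0, |R[1,1]| = 2*sqrt(2)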
If we want to diagonalize this matrix, we should be happy to see that it is symmetric, since this means we will have real eigenvalues, which means factoring won't be too hard. As an added bonus, if we have three distinct eigenvalues the eigenvectors we find will automatically be orthogonal, which means that the inverse of the matrix \(P\) will be easy to compute. We can start by finding the eigenvalues of this matrix:
$$\det\begin{pmatrix} 1-\lambda & 2 & 0 \\ 2 & 1-\lambda & 0 \\ 0 & 0 & 5-\lambda \end{pmatrix}
=(1-\lambda)\begin{vmatrix} 1-\lambda & 0 \\ 0 & 5-\lambda \end{vmatrix}
-(2)\begin{vmatrix} 2 & 0 \\ 0 & 5-\lambda \end{vmatrix}
+0\begin{vmatrix} 2 & 1-\lambda \\ 0 & 0 \end{vmatrix}$$
$$=\bigl((1-\lambda)^{2}-4\bigr)(5-\lambda)
=(\lambda^{2}-2\lambda-3)(5-\lambda)
=(\lambda+1)(\lambda-3)(5-\lambda),$$
so the eigenvalues are \(\lambda=-1,\ 3,\ 5\). For \(\lambda_{1}=-1\),
$$(M+I)\begin{pmatrix}x\\y\\z\end{pmatrix}=\begin{pmatrix} 2 & 2 & 0 \\ 2 & 2 & 0 \\ 0 & 0 & 6 \end{pmatrix}\begin{pmatrix}x\\y\\z\end{pmatrix}=\begin{pmatrix}0\\0\\0\end{pmatrix} \qquad (18.13.25)$$
implies that \(2x+2y=0\) and \(6z=0\), which means any multiple of \(v_{1}=\begin{pmatrix}1\\-1\\0\end{pmatrix}\) is an eigenvector with eigenvalue \(\lambda_{1}=-1\). Similarly,
$$(M-3I)\begin{pmatrix}x\\y\\z\end{pmatrix}=\begin{pmatrix} -2 & 2 & 0 \\ 2 & -2 & 0 \\ 0 & 0 & 2 \end{pmatrix}\begin{pmatrix}x\\y\\z\end{pmatrix}=\begin{pmatrix}0\\0\\0\end{pmatrix}, \qquad (18.13.26)$$
and we can find that \(v_{2}=\begin{pmatrix}1\\1\\0\end{pmatrix}\) satisfies \(-2x+2y=0\), \(2x-2y=0\) and \(2z=0\). Finally,
$$(M-5I)\begin{pmatrix}x\\y\\z\end{pmatrix}=\begin{pmatrix} -4 & 2 & 0 \\ 2 & -4 & 0 \\ 0 & 0 & 0 \end{pmatrix}\begin{pmatrix}x\\y\\z\end{pmatrix}=\begin{pmatrix}0\\0\\0\end{pmatrix}.$$
Now we want \(v_{3}\) to satisfy \(-4x+2y=0\) and \(2x-4y=0\), which imply \(x=y=0\), but since there are no restrictions on the \(z\) coordinate we can take \(v_{3}=\begin{pmatrix}0\\0\\1\end{pmatrix}\).
Notice that the eigenvectors form an orthogonal basis. We can create an orthonormal basis by rescaling to make them unit vectors. This will help us because if \(P=\begin{pmatrix}v_{1} & v_{2} & v_{3}\end{pmatrix}\) is created from orthonormal vectors then \(P^{-1}=P^{T}\), which means computing \(P^{-1}\) is easy. So we get
$$P=\begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & 0 \\ -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & 0 \\ 0 & 0 & 1 \end{pmatrix}
\quad\text{and}\quad
P^{-1}=\begin{pmatrix} \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} & 0 \\ \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & 0 \\ 0 & 0 & 1 \end{pmatrix}. \qquad (18.13.29)$$
So when we compute \(D=P^{-1}MP\) we'll get
$$\begin{pmatrix} \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} & 0 \\ \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 2 & 0 \\ 2 & 1 & 0 \\ 0 & 0 & 5 \end{pmatrix}
\begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & 0 \\ -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & 0 \\ 0 & 0 & 1 \end{pmatrix}
=\begin{pmatrix} -1 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 5 \end{pmatrix}. \qquad (18.13.30)$$
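A short Python check (added here as a sketch) reproduces this diagonalization; numpy's eigh is appropriate because M is symmetric and returns orthonormal eigenvectors.

import numpy as np

M = np.array([[1., 2., 0.],
              [2., 1., 0.],
              [0., 0., 5.]])

evals, P = np.linalg.eigh(M)       # symmetric matrix, so the eigenvalues are real
print(evals)                       # [-1.  3.  5.]
print(np.round(P.T @ M @ P, 6))    # diagonal matrix of eigenvalues, since P^{-1} = P^T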
\(\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}\). Can you describe \(z\bar{z}\) in terms of \(\|z\|\)? For part (b), think about what values \(a\in\mathbb{R}\) can take if \(a=-a\). For part (c), just notice that every row vector is the (unique) transpose of a column vector, and also think about why \((AA^{T})^{T}=AA^{T}\) for any matrix \(A\). Additionally you should see that \(\bar{x}^{T}=x^{\dagger}\) and mention this. Finally, for part (h), show that
$$\frac{x^{\dagger}Mx}{x^{\dagger}x}=\overline{\left(\frac{x^{\dagger}Mx}{x^{\dagger}x}\right)^{T}}. \qquad (18.13.31)$$
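For intuition on part (h), a tiny numerical experiment helps; the matrix below is a made-up Hermitian example, not one from the text.

import numpy as np

M = np.array([[2., 1 - 1j],
              [1 + 1j, 3.]])        # Hermitian: equal to its own conjugate transpose
x = np.array([1 + 2j, -1j])         # an arbitrary complex vector

num = np.conj(x) @ M @ x            # x^dagger M x
den = np.conj(x) @ x                # x^dagger x
print(num / den)                    # the imaginary part vanishes (up to rounding)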
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 18.13: Movie Scripts 13-14 is shared under a not declared license and was authored, remixed, and/or curated by David Cherney,
Tom Denton, & Andrew Waldron.
18.14: Movie Scripts 15-16
G.15 Kernel, Range, Nullity, Rank
Invertibility Conditions
Here I am going to discuss some of the conditions on the invertibility of a matrix stated in Theorem 16.3.1. Condition 1 states that \(X=M^{-1}V\) uniquely, which is clearly equivalent to 4. Similarly, every square matrix \(M\) uniquely corresponds to a linear transformation \(L:\mathbb{R}^{n}\to\mathbb{R}^{n}\), so condition 3 is equivalent to condition 1.
Condition 6 implies 4 because the adjoint constructs the inverse, but the converse is not so obvious. For the converse (4 implying 6), we refer back to the proofs in Chapters 18 and 19. Note that if \(\det M=0\), there exists an eigenvalue of \(M\) equal to 0, which implies \(M\) is not invertible. Thus condition 8 is equivalent to conditions 4, 5, 9, and 10.
The map \(M\) is injective if and only if it has a trivial null space, and eigenvectors with eigenvalue 0 form a basis for the null space. Hence conditions 8 and 14 are equivalent, and 14, 15, and 16 are equivalent by the Dimension Formula (also known as the Rank-Nullity Theorem).
Now conditions 11, 12, and 13 are all equivalent by the definition of a basis. Finally, if a matrix \(M\) is not row-equivalent to the identity matrix, then \(\det M=0\), so conditions 2 and 8 are equivalent.
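As a quick illustration of how several of these conditions travel together, here is a small sympy check on an arbitrary 2 × 2 example (this snippet is an added sketch, not part of the original script).

import sympy as sp

M = sp.Matrix([[2, 1],
               [1, 1]])

print(M.det())        # 1, nonzero determinant
print(M.rank())       # 2, full rank
print(M.nullspace())  # [], so the kernel contains only the zero vector
print(M.eigenvals())  # {3/2 - sqrt(5)/2: 1, 3/2 + sqrt(5)/2: 1}, no zero eigenvalue
print(M.inv())        # the inverse exists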
Hints for Review Problem 3
Let's work through this problem. Let \(L:V\to W\) be a linear transformation. Show that \(\ker L=\{0_{V}\}\) if and only if \(L\) is one-to-one:
1. First, suppose that \(\ker L=\{0_{V}\}\). Show that \(L\) is one-to-one. Remember what one-to-one means: whenever \(L(x)=L(y)\) we can be certain that \(x=y\). While this might seem like a weird thing to require, this statement really means that each vector in the range is the image of a unique vector in the domain. We want the one-to-one property, but we also don't want to forget some of the more basic properties of linear transformations, namely that they are linear, which means \(L(ax+by)=aL(x)+bL(y)\) for scalars \(a\) and \(b\). What if we rephrase the one-to-one property to say that \(L(x)-L(y)=0\) implies \(x-y=0\)? Can we connect that to the statement that \(\ker L=\{0_{V}\}\)? Remember that if \(L(x-y)=0\) then \(x-y\in\ker L\).
2. Now, suppose that \(L\) is one-to-one. Show that \(\ker L=\{0_{V}\}\). That is, show that \(0_{V}\) is in \(\ker L\), and then show that there are no other vectors in \(\ker L\). What would happen if we had a nonzero kernel? If we had some vector \(v\) with \(L(v)=0\) and \(v\neq 0\), we could try to show that this would contradict the given fact that \(L\) is one-to-one. If we found \(x\) and \(y\) with \(L(x)=L(y)\), then we know \(x=y\). But if \(L(v)=0\) then \(L(x)+L(v)=L(y)\). Does this cause a problem?
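To see the contrapositive in action numerically, here is a small sympy sketch (an added example with a made-up matrix): a map with a nontrivial kernel cannot be one-to-one.

import sympy as sp

L = sp.Matrix([[1, 2],
               [2, 4]])              # dependent columns, so the kernel is nontrivial
print(L.nullspace())                 # [Matrix([[-2], [1]])]

x = sp.Matrix([1, 1])
v = sp.Matrix([-2, 1])               # a nonzero kernel vector
print(L * x == L * (x + v))          # True: two different inputs share one output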
Contributor
David Cherney, Tom Denton, and Andrew Waldron (UC Davis)
This page titled 18.14: Movie Scripts 15-16 is shared under a not declared license and was authored, remixed, and/or curated by David Cherney,
Tom Denton, & Andrew Waldron.
Index

A
augmented matrix - 2.1: Gaussian Elimination; 3.3: Dantzig's Algorithm

B
Basis Notation - 7.1: Linear Transformations and Matrices
bit matrix - 7.5: Inverse Matrix
Block LDU Decomposition - 7.7: LU Redux

C
Change of Basis - 13.2: Change of Basis; 14.3: Relating Orthonormal Bases
characteristic equation - 12.2: The Eigenvalue-Eigenvector Equation
characteristic polynomial - 12.2: The Eigenvalue-Eigenvector Equation
column vector - 7.3: Properties of Matrices

D
Dantzig's Algorithm - 3.3: Dantzig's Algorithm
diagonalizable transformation - 13.1: Diagonalization
diagonalization - 13.1: Diagonalization
Direct Sum - 14.6: Orthogonal Complements

E
Eigenspaces - 12.3: Eigenspaces
eigenvalue equation - 12.2: The Eigenvalue-Eigenvector Equation

F
fields - 18.2: Fields

G
Gaussian Elimination - 2.1: Gaussian Elimination

I
image - 9.2: Building Subspaces
inverse of a matrix - 7.5: Inverse Matrix

K
kernel - 9.2: Building Subspaces; 16: Kernel, Range, Nullity, Rank
Kronecker delta - 14.1: Properties of the Standard Basis

L
least squares regression line - 17: Least Squares and Singular Values
linear function - 1.2: What Are Linear Functions?
lower triangular matrix - 7.7: LU Redux
LU Decomposition - 7.7: LU Redux

M
matrix - 1.3: What is a Matrix?
matrix diagonal - 7.3: Properties of Matrices
multiplicity - 12.2: The Eigenvalue-Eigenvector Equation

O
orthogonal - 14.1: Properties of the Standard Basis; 14.3: Relating Orthonormal Bases
orthogonal complement - 14.6: Orthogonal Complements
orthogonal decomposition - 14.4: Gram-Schmidt and Orthogonal Complements
outer product - 14.1: Properties of the Standard Basis

P
Pauli Matrices - 7.1: Linear Transformations and Matrices
pivot - 2.1: Gaussian Elimination

Q
QR Decomposition - 14.5: QR Decomposition

R
reduced row echelon form - 2.1: Gaussian Elimination
Row Operations - 2.1: Gaussian Elimination; 2.3: Elementary Row Operations
row vector - 7.3: Properties of Matrices

S
simplex - 3: The Simplex Method
simplex algorithm - 3.3: Dantzig's Algorithm
singular matrix - 7.5: Inverse Matrix
singular value decomposition - 17.1: Singular Value Decomposition
singular values - 17: Least Squares and Singular Values
span - 9.2: Building Subspaces
standard basis - 14.1: Properties of the Standard Basis
subspace - 9.1: Subspaces
symmetric matrix - 15: Diagonalizing Symmetric Matrices
system of linear equations - 2: Systems of Linear Equations

T
trace - 7.3: Properties of Matrices
transpose - 7.3: Properties of Matrices

U
upper triangular matrix - 7.7: LU Redux

V
Vector Addition - 1.1: What Are Vectors?
vector spaces - 5: Vector Spaces; 5.1: Examples of Vector Spaces
Detailed Licensing
1: Overview
Title: Linear Algebra (Waldron, Cherney, and Denton)
Webpages: 114
All licenses found:
Undeclared: 100% (114 pages)
2: By Page
Linear Algebra (Waldron, Cherney, and Denton) - Undeclared
Front Matter - Undeclared
TitlePage - Undeclared
InfoPage - Undeclared
Table of Contents - Undeclared
Licensing - Undeclared
1: What is Linear Algebra? - Undeclared
1.1: What Are Vectors? - Undeclared
1.2: What Are Linear Functions? - Undeclared
1.3: What is a Matrix? - Undeclared
1.4: Review Problems - Undeclared
2: Systems of Linear Equations - Undeclared
2.1: Gaussian Elimination - Undeclared
2.2: Review Problems - Undeclared
2.3: Elementary Row Operations - Undeclared
2.4: Review Problems - Undeclared
2.5: Solution Sets for Systems of Linear Equations - Undeclared
2.6: Review Problems - Undeclared
3: The Simplex Method - Undeclared
3.1: Pablo's Problem - Undeclared
3.2: Graphical Solutions - Undeclared
3.3: Dantzig's Algorithm - Undeclared
3.4: Pablo Meets Dantzig - Undeclared
3.5: Review Problems - Undeclared
4: Vectors in Space, n-Vectors - Undeclared
6: Linear Transformations - Undeclared
6.1: The Consequence of Linearity - Undeclared
6.2: Linear Functions on Hyperplanes - Undeclared
6.3: Linear Differential Operators - Undeclared
6.4: Bases (Take 1) - Undeclared
6.5: Review Problems - Undeclared
7: Matrices - Undeclared
7.1: Linear Transformations and Matrices - Undeclared
7.2: Review Problems - Undeclared
7.3: Properties of Matrices - Undeclared
7.4: Review Problems - Undeclared
7.5: Inverse Matrix - Undeclared
7.6: Review Problems - Undeclared
7.7: LU Redux - Undeclared
7.8: Review Problems - Undeclared
8: Determinants - Undeclared
8.1: The Determinant Formula - Undeclared
8.2: Elementary Matrices and Determinants - Undeclared
8.3: Review Problems - Undeclared
8.4: Properties of the Determinant - Undeclared
8.5: Review Problems - Undeclared
9: Subspaces and Spanning Sets - Undeclared
9.1: Subspaces - Undeclared
9.2: Building Subspaces - Undeclared
9.3: Review Problems - Undeclared
12.1: Invariant Directions - Undeclared
12.2: The Eigenvalue-Eigenvector Equation - Undeclared
12.3: Eigenspaces - Undeclared
12.4: Review Problems - Undeclared
13: Diagonalization - Undeclared
13.1: Diagonalization - Undeclared
13.2: Change of Basis - Undeclared
13.3: Changing to a Basis of Eigenvectors - Undeclared
13.4: Review Problems - Undeclared
14: Orthonormal Bases and Complements - Undeclared
14.1: Properties of the Standard Basis - Undeclared
14.2: Orthogonal and Orthonormal Bases - Undeclared
14.3: Relating Orthonormal Bases - Undeclared
14.4: Gram-Schmidt and Orthogonal Complements - Undeclared
14.5: QR Decomposition - Undeclared
14.6: Orthogonal Complements - Undeclared
14.7: Review Problems - Undeclared
15: Diagonalizing Symmetric Matrices - Undeclared
15.1: Review Problems - Undeclared
16: Kernel, Range, Nullity, Rank - Undeclared
16.1: Summary - Undeclared
16.2: Review Problems - Undeclared
17: Least Squares and Singular Values - Undeclared
17.1: Singular Value Decomposition - Undeclared
17.2: Review Problems - Undeclared
18: Symbols, Fields, Sample Exams, Online Resources, Movie Scripts - Undeclared
18.1: List of Symbols - Undeclared
18.2: Fields - Undeclared
18.3: Online Resources - Undeclared
18.4: Sample First Midterm - Undeclared
18.5: Sample Second Midterm - Undeclared
18.6: Sample Final Exam - Undeclared
18.7: Movie Scripts 1-2 - Undeclared
18.8: Movie Scripts 3-4 - Undeclared
18.9: Movie Scripts 5-6 - Undeclared
18.10: Movie Scripts 7-8 - Undeclared
18.11: Movie Scripts 9-10 - Undeclared
18.12: Movie Scripts 11-12 - Undeclared
18.13: Movie Scripts 13-14 - Undeclared
18.14: Movie Scripts 15-16 - Undeclared
Back Matter - Undeclared
Index - Undeclared
Index - Undeclared
Glossary - Undeclared
Detailed Licensing - Undeclared