Applications of Matrices

Linear Systems

An n by n linear system of equations is a system of n linear equations in n variables.

a11x1 + a12x2 + ... + a1nxn = b1


a21x1 + a22x2 + ... + a2nxn = b2
... ... ... ...
an1x1 + an2x2 + ... + annxn = bn

Example

Solve

2x1 + 3x2 = 9
x1 - 2x2 = 1

Solution

To solve this system we apply, one at a time, the following three operations:

1. Switch two equations.

2. Multiply an equation by a nonzero constant.

3. Replace an equation by that equation plus a multiple of another equation.

We have

Switching the two equations


x1 - 2x2 = 1
2x1 + 3x2 = 9

Replace the 2nd equation with the 2nd equation + (-2)(1st equation)

x1 - 2x2 = 1
7x2 = 7

Multiply the second equation by 1/7

x1 - 2x2 = 1
x2 = 1

Replace the 1st equation with the 1st equation + (2)(2nd equation)

x1 = 3
x2 = 1

Matrices

An m by n matrix is an array of numbers with m rows and n columns.

Example

The matrix below is a 2 by 3 matrix.

A square matrix is an n by n matrix, that is, a matrix whose number of rows is equal to the
number of columns. The ijth entry is the number in the ith row and jth column. For example, in
the matrix above the (1,2) entry is

a12 = 4

Note: A vector such as <2,4,6> can be viewed as a 1 by 3 matrix.

A square matrix is called a diagonal matrix if

aij = 0 for i ≠ j

The matrix below is a diagonal matrix


If all the diagonal entries of a diagonal matrix are equal, then the matrix is called a scalar
matrix. The example below is a scalar matrix.

Addition, Subtraction, and Scalar Multiplication

Just as with vectors we can add and subtract matrices and multiply a matrix by a scalar. To add
or subtract matrices the dimensions of the two matrices must be the same.

Definition
Let A and B be m by n matrices and k be a scalar. Then

(A + B)ij = Aij + Bij
(A - B)ij = Aij - Bij
(kA)ij = kAij

Example

Let

then

Two matrices are called equal if they have the same dimensions and all of their corresponding
entries are equal.

If A is an m by n matrix, then the transpose of A, AT, is the n by m matrix with the rows and
columns switched.

(AT)ij = Aji

In the above example

Example of the Theoretical Exercise

Prove that

(AT)T = A

Solution

We have

((AT)T )ij = (AT)ji = Aij

Since the ijth entries are equal for each ij, the matrices are equal.

Matrix Multiplication

The Dot Product for Vectors in Rn

Let

a = [a1 a2 ... an]

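be a vector in Rn (considered as a 1 by n matrix) and let

b = [b1 b2 ... bn]T

be a vector in Rn (considered as an n by 1 matrix). Then the dot product of a and b is defined by

a . b = a1b1 + a2b2 + ... + anbn = Σ aibi

where the sum runs over i from 1 to n.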

Example

Find the dot product of

a = [2 1 0 6 -1] and b = [5 2 -3 0 1]T

Solution

We have

a . b = (2)(5) + (1)(2) + (0)(-3) + (6)(0) + (-1)(1) = 11
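
The computation above can be sketched in a few lines of Python. This is only an illustration; the function name dot is our own choice, and the entries of b are read off the products in the solution.

```python
def dot(a, b):
    """Dot product of two equal-length sequences of numbers."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))

a = [2, 1, 0, 6, -1]
b = [5, 2, -3, 0, 1]  # entries taken from the products in the worked solution
print(dot(a, b))
```

The length check mirrors the requirement that both vectors live in the same Rn.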

Matrix Multiplication

There are many ways of thinking about a matrix. One way is as a collection of row vectors and
another way is as a collection of column vectors. Consider the m by p matrix A (considered as a
matrix of row vectors) and the p by n matrix B (considered as a matrix of column vectors). The
matrices are shown below.

We define the matrix product by

(AB)ij = vi . wj

where vi is the ith row of A and wj is the jth column of B.

Remark: If the number of columns of A is not equal to the number of rows of B, then the
product AB is not defined.

Remark: It is not true in general that AB and BA are the same matrix even if they are both
defined.
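
As a rough sketch of the definition and the two remarks, the following Python builds each entry of the product as a row-times-column dot product. The function name matmul and the two small matrices are hypothetical choices made only for illustration.

```python
def matmul(A, B):
    """Product of an m x p matrix A and a p x n matrix B, given as lists of rows."""
    if len(A[0]) != len(B):
        raise ValueError("columns of A must equal rows of B")
    columns_of_B = list(zip(*B))  # the column vectors w_j
    return [[sum(x * y for x, y in zip(row, col)) for col in columns_of_B]
            for row in A]

# Hypothetical matrices, chosen only to show that AB and BA can differ
A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
AB = matmul(A, B)
BA = matmul(B, A)
print(AB)
print(BA)
```

Here multiplying by B on the right swaps the columns of A, while multiplying on the left swaps its rows, so the two products are different matrices.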

We can also define

Example

Let

Then the matrix product is

Linear Systems

Any m by n linear system can be written in the form

Ax = b

where A is the coefficient matrix,

xT = (x1 x2 ... xn)

and b is the m by 1 matrix of numbers to the right of the equality. For example, the
linear system

2x + 3y + z = 0
3x - 4y - z = 6
x + 2y + 3z = 2

can be written as

Often, we write the matrix equation in augmented form as shown below

Properties of Matrix Operations

Properties of Addition

The basic properties of addition for real numbers also hold true for matrices.

Let A, B and C be m x n matrices

1. A + B = B + A commutative

2. A + (B + C) = (A + B) + C associative

3. There is a unique m x n matrix O with

A+O = A additive identity


4. For any m x n matrix A there is an m x n matrix B (called -A) with

A+B = O additive inverse

The proofs are all similar. We will prove the first property.

Proof of Property 1

We have

(A + B)ij = Aij + Bij definition of addition of matrices

= Bij + Aij commutative property of addition for real numbers

= (B + A)ij definition of addition of matrices

Notice that the zero matrix is different for different m and n. For example

Properties of Matrix Multiplication

Unlike matrix addition, the properties of multiplication of real numbers do not all
generalize to matrices. Matrices rarely commute even if AB and BA are both
defined. There often is no multiplicative inverse of a matrix, even if the matrix is a
square matrix. There are a few properties of multiplication of real numbers that
generalize to matrices. We state them now.

Let A, B and C be matrices of dimensions such that the following are defined. Then

1. A(BC) = (AB)C associative

2. A(B + C) = AB + AC distributive

3. (A + B)C = AC + BC distributive
4. There are unique matrices Im and In with

Im A = A In = A multiplicative identity

We will often omit the subscript and write I for the identity matrix. The identity matrix is a
square scalar matrix with 1's along the diagonal. For example

We will prove the second property and leave the rest for you.

Proof of Property 2

Again we show that the general element of the left hand side is the same as the right hand side.
We have

(A(B + C))ij = Σ Aik(B + C)kj definition of matrix multiplication

= Σ Aik(Bkj + Ckj) definition of matrix addition

= Σ (AikBkj + AikCkj) distributive property of the real numbers

= Σ AikBkj + Σ AikCkj commutative and associative properties of addition of real numbers

= (AB)ij + (AC)ij definition of matrix multiplication

where each sum is taken over k from 1 to p, the number of columns of A.

Example

We will demonstrate property 1 with


We have

so that

We have

so that

Properties of Scalar Multiplication

Since we can multiply a matrix by a scalar, we can investigate the properties that this
multiplication has. All of the properties of multiplication of real numbers generalize.
In particular, we have

Let r and s be real numbers and A and B be matrices. Then

1. r(sA) = (rs)A

2. (r + s)A = rA + sA

3. r(A + B) = rA + rB

4. A(rB) = r(AB) = (rA)B


We will prove property 3 and leave the rest for you. We have

(r(A + B))ij = (r)(A + B)ij definition of scalar multiplication

= (r)(Aij + Bij) definition of addition of matrices

= rAij + rBij distributive property of the real numbers

= (rA)ij + (rB)ij definition of scalar multiplication

= (rA + rB)ij definition of addition of matrices

Properties of the Transpose of a Matrix

Recall that the transpose of a matrix is the operation of switching rows and columns.
We state the following properties. We proved the first property in the last section.

Let r be a real number and A and B be matrices. Then

1. (AT)T = A

2. (A + B)T = AT + BT

3. (AB)T = BTAT

4. (rA)T = rAT

Solving Linear Systems of Equations


Reduced Row Echelon Form

When solving linear systems, we first transform the system into an augmented
matrix. At that point our goal is to transform the matrix into an "easier" matrix whose
corresponding linear system has the same solution set. We now define what it means
for a matrix to be "easier".

Definition

An m x n matrix is in reduced row echelon form if it satisfies the following properties:

1. All zero rows, if any, are at the bottom of the matrix


2. The first nonzero entry of each row is a one. This one is called the leading one
or the corner.

3. Each corner is to the right and below the preceding corner.

4. The columns containing a leading one have zeros in all other entries.

If only 1, 2, and 3 are satisfied, then the matrix is in row echelon form.

Example

Of the following three matrices,

A and B are in rref, while C is not.

The main purpose of putting a matrix in rref is that this form makes the solution of the linear
system easy to identify. For example A corresponds to the system

x1 = 4, x2 = 2, x3 = x3

or in parametric form we get the line

x1 = 4, x2 = 2, x3 = t

B corresponds to the system

x1 + 3x3 = 5, x2 - x3 = 0, x3 = x3

This also gives us a line. In parametric form it is

x1 = 5 - 3t, x2 = t, x3 = t

Row Operations and RREF


We saw a while back that the three row operations do not affect the solution space of a
system of linear equations. We restate the row operations here for convenience:

Three Elementary Row Operations

1. Switch any two rows.

2. Multiply a row by a nonzero constant.

3. Replace a row by the sum of that row and a multiple of another row.

Two matrices are called row equivalent if one can be transformed into the other using
a sequence of row operations. Since row operations do not affect the solution space,
any two row equivalent matrices have the same solution space.

Theorem

Every m x n matrix is row equivalent to a unique matrix in rref.

Instead of proving this theorem, we will explain how to take a matrix and transform it
into an rref matrix using only the elementary row operations. We use the
following procedure:

1. Switch rows (if necessary) to ensure that the top left entry is nonzero. If the
first column is all zero go to the next one.

2. Make this top left entry a 1 by dividing the row by this entry.

3. Use this 1 and the third row operation to zero out the entries below and above
(there aren't any above for the first corner).

4. Repeat steps 1 through 3 for the columns to the right one at a time.
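
The four steps above can be sketched in Python as follows. This is a minimal illustration rather than a production routine; the function name rref is our own, exact fractions are used to avoid rounding, and the test matrix is the augmented matrix of the 2 by 2 system solved in the first example.

```python
from fractions import Fraction

def rref(M):
    """Return the reduced row echelon form of M (a list of rows)."""
    A = [[Fraction(x) for x in row] for row in M]
    rows, cols = len(A), len(A[0])
    r = 0  # row where the next corner will be placed
    for c in range(cols):
        # Step 1: find a row at or below r with a nonzero entry in column c
        pivot = next((i for i in range(r, rows) if A[i][c] != 0), None)
        if pivot is None:
            continue  # nothing usable in this column; go to the next one
        A[r], A[pivot] = A[pivot], A[r]
        # Step 2: divide the row by its leading entry so the corner is 1
        lead = A[r][c]
        A[r] = [x / lead for x in A[r]]
        # Step 3: use the corner to zero out the entries above and below it
        for i in range(rows):
            if i != r and A[i][c] != 0:
                factor = A[i][c]
                A[i] = [x - factor * y for x, y in zip(A[i], A[r])]
        # Step 4: move on to the next row and the columns to the right
        r += 1
        if r == rows:
            break
    return A

# Augmented matrix of 2x1 + 3x2 = 9, x1 - 2x2 = 1
R = rref([[2, 3, 9], [1, -2, 1]])
print([[str(x) for x in row] for row in R])
```

The last column of the result holds the solution x1 = 3, x2 = 1 found earlier by hand.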

Example

Use the elementary row operations to put the following in rref.


Solution

We follow the procedures:

Homogeneous Systems

A homogeneous system of linear equations is a linear system of equations where the
right hand sides of all the equations are zero. That is, it is of the form

a11x1 + a12x2 + ... + a1nxn = 0


a21x1 + a22x2 + ... + a2nxn = 0
... ... ...
am1x1 + am2x2 + ... + amnxn = 0

Notice that in matrix form this can be written as

Ax = 0

where A is the m x n matrix with entries aij, x is the n x 1 matrix with
entries xi and 0 is the m x 1 zero matrix. The augmented matrix's last column is the
zero column. Since the zero column is unaffected by elementary row operations, it is
usually left out when the computations are performed. The solution (0,0, ... , 0) is
called the trivial solution. Any other solution is called a nontrivial solution.

Theorem

Let

Ax = 0

be a homogeneous system with m x n matrix A. If m < n, then the system always has
a nontrivial solution.

This theorem just states that if there are more variables than equations, then there is a
nonzero solution.

Proof

Let B be the rref matrix that is row equivalent to A. Since B has at most m corners and
m < n, B has a column that does not contain a corner. This gives us a parameter in the
solution which we can set to 1, giving us a nontrivial solution. Since B has the same
solution set as A, A has this same nontrivial solution.

Inverse of a Matrix
Definition and Examples

Recall that functions f and g are inverses if

f(g(x)) = g(f(x)) = x

We will see later that matrices can be considered as functions from Rn to Rm and that
matrix multiplication is composition of these functions. With this knowledge, we
have the following:

Let A and B be n x n matrices. Then A and B are inverses of each other if

AB = BA = In

Example

Consider the matrices


We can check that when we multiply A and B in either order we get the identity matrix. (Check
this.)

Not all square matrices have inverses. If a matrix has an inverse, we call
it nonsingular or invertible. Otherwise it is called singular. We will see in the next section how
to determine if a matrix is singular or nonsingular.

Properties of Inverses

Below are four properties of inverses.

1. If A is nonsingular, then so is A-1 and

(A-1) -1 = A

2. If A and B are nonsingular matrices, then AB is nonsingular and

(AB) -1 = B -1 A -1

3. If A is nonsingular then

(AT) -1 = (A -1)T

4. If A and B are n x n matrices with

AB = In

then A and B are inverses of each other.

Notice that the fourth property implies that if AB = I then BA = I.

The proofs of the first three properties are elementary, while the proof of the fourth is
too advanced for this discussion. We will prove the second.

Proof that (AB) -1 = B -1 A -1


By property 4, we only need to show that

(AB)(B -1 A -1) = I

We have

(AB)(B -1 A -1) = A(BB -1)A -1 associative property

= AIA-1 definition of inverse

= AA-1 definition of the identity matrix

= I definition of inverse

Finding the Inverse

Now that we understand what an inverse is, we would like to find a way to calculate
the inverse of a nonsingular matrix. We use the definitions of the inverse and matrix
multiplication. Let A be a nonsingular n x n matrix and B be its inverse. Then

AB = I

Recall that we find the jth column of the product by multiplying A by the jth column
of B. Now for some notation. Let ej be the n x 1 matrix that is the jth column of the
identity matrix and xj be the jth column of B. Then

Axj = ej

We can write this in augmented form

[A|ej]

Instead of solving these augmented problems one at a time using row operations, we
can solve them simultaneously. We row reduce

[A | I]

Example
Find the inverse of the matrix

Solution

The inverse matrix is just the right hand side of the final augmented matrix

This example demonstrates that if A is row equivalent to the identity matrix then A is
nonsingular.

Linear Systems and Inverses

We can use the inverse of a matrix to solve linear systems. Suppose that

Ax = b

Then, just as we divide by a coefficient to isolate x in an ordinary equation, we can multiply
both sides on the left by A-1 to isolate x.

A-1Ax = A-1b

Ix = A-1b

x = A-1b

Example

Solve

x + 4z = 2
x + y + 6z = 3
-3x - 10z = 4

Solution

We put this system in matrix form

Ax = b

with

The solution is

x = A-1 b

We have already computed the inverse. We arrive at

The solution is

x = -18, y = -9, z = 5
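
A minimal sketch of this computation in Python is given below. It assumes the coefficient matrix and right hand side read off the system in this example, inverts A by row reducing [A | I], and then forms A-1 b. The function name inverse is our own and exact fractions are used so no rounding occurs.

```python
from fractions import Fraction

def inverse(A):
    """Invert a square matrix by row reducing [A | I] to [I | inverse of A]."""
    n = len(A)
    # Build the augmented matrix [A | I] with exact rational entries
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for c in range(n):
        pivot = next((i for i in range(c, n) if M[i][c] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        M[c], M[pivot] = M[pivot], M[c]
        lead = M[c][c]
        M[c] = [x / lead for x in M[c]]
        for i in range(n):
            if i != c:
                factor = M[i][c]
                M[i] = [x - factor * y for x, y in zip(M[i], M[c])]
    # The right hand block of the reduced matrix is the inverse
    return [row[n:] for row in M]

# Coefficients of x + 4z = 2, x + y + 6z = 3, -3x - 10z = 4
A = [[1, 0, 4], [1, 1, 6], [-3, 0, -10]]
b = [2, 3, 4]
A_inv = inverse(A)
x = [sum(a * bi for a, bi in zip(row, b)) for row in A_inv]
print([str(v) for v in x])
```

If A were singular, no nonzero pivot would be found in some column and the sketch raises an error instead of returning a matrix.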
Notice that if b is the zero vector, then

Ax = 0

can be solved by

x = A-1 0 = 0

This demonstrates a theorem

Theorem of Nonsingular Equivalences

For an n x n matrix A, the following are equivalent (TFAE):

1. A is nonsingular

2. Ax = 0 has only the trivial solution

3. A is row equivalent to I

4. The linear system Ax = b has a unique solution for every n x 1 matrix b


Code Theory

Code theory has become increasingly important as computers have become
ingrained in our lives. This discussion will focus on error detection and
correction rather than on encryption. When a message is sent over the net, it is
encoded as a binary string of numbers. It is likely that a few of these numbers
will become corrupted (a 1 becoming a 0 or a 0 becoming a 1). We will be
looking at ways of detecting such an error and, if possible, correcting it. First a
few definitions.

A message is a sequence of numbers where each of the numbers is
either 0 or 1. We can encode a word by writing each of its letters as a binary
string. A sentence is a sequence of words, so any set of words can be
represented by a vector. Typically we will send a message from one computer
to another; however, because of noise, the message received will not always be
the same as the message sent. In order to deal with this issue, we send a
transmission that includes redundant bits so that we can detect when something
has gone awry. The transmission matrix will be an m by n matrix with m > n.

Example

Let A be the transmission matrix

We have the following table:


Message Transmission
(0,0) (0,0,0)
(1,0) (1,0,1)
(0,1) (0,1,1)
(1,1) (1,1,0)

If an entry in the transmission differs from the entry that was supposed to be
transmitted, then we call this an error. Notice that if the receiver receives a
transmission other than the four possible transmissions, then an error is
detected. For example a transmission of (1,1,1) is detected as an error. The
receiver will detect an error whenever there is exactly one error in the
transmission. However if there are two errors, the error will not be detected.
This is called a single-error detecting code. Notice that the third component
of the transmission is 0 if the sum of the entries of the message is even and 1 if
the sum is odd. The sum of the entries is called the weight of the message. The
transmission in the example is called a parity check code and will detect an odd
number of errors, but will not detect an even number of errors.

Example

Consider the matrix

This matrix takes two letter words and produces six letter words. The table for this is


Message Transmission
(0,0) (0,0,0,0,0,0)
(1,0) (1,0,1,0,1,0)
(0,1) (0,1,0,1,0,1)
(1,1) (1,1,1,1,1,1)

Notice that this transmission just repeats the message three times. It will detect one or
two errors, hence is called a double-error detecting code. If we assume that there will
never be more than one error, then we can correct an error. For example, if the received
transmission is (1,1,0,1,0,1), then the only possible correct transmission that gives one
error is (0,1,0,1,0,1), hence the original message was (0,1).
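
The two codes described in this section can be sketched in Python as follows. This is an illustration under our own assumptions: the function names are hypothetical, messages are tuples of 0s and 1s, and decoding the repetition code is done by picking the closest valid transmission, which is one reasonable way to correct a single error.

```python
from itertools import product

def parity_encode(message):
    """Append one bit so that the transmission has even weight."""
    return tuple(message) + (sum(message) % 2,)

def parity_error_detected(received):
    """An odd weight means an odd number of entries were flipped."""
    return sum(received) % 2 == 1

def repeat_encode(message, copies=3):
    """Repeat the whole message the given number of times."""
    return tuple(message) * copies

def repeat_decode(received, length=2, copies=3):
    """Return the message whose transmission differs from `received` in the fewest entries."""
    def distance(message):
        sent = repeat_encode(message, copies)
        return sum(s != r for s, r in zip(sent, received))
    return min(product((0, 1), repeat=length), key=distance)

print(parity_encode((1, 0)))
print(parity_error_detected((1, 1, 1)))
print(repeat_decode((1, 0, 1, 1, 1, 0)))  # one flipped entry in (1,0,1,0,1,0)
```

The last call uses a hypothetical received transmission with a single corrupted entry; the closest valid transmission is (1,0,1,0,1,0), so the message (1,0) is recovered.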

Graph Theory

We now delve into another application of matrices, graph theory, which shows its
face in communications, sociology, business, transportation sciences, and many other
fields. We begin by stating the basic definitions.

A directed graph, also called a digraph, is a set of points P1, P2, ... ,
Pn called vertices and ordered pairs of points PiPj called edges.

A digraph can have a vertex that does not belong to any edge or two points that are
connected by edges in both directions.

Example

The digraph with points P, Q, R, S, and edges PQ, QP, PR, and QR is shown below.

Let G be a digraph with n vertices. Then the adjacency matrix of G, written A(G), is
the n x n matrix with A(G)ij equal to 1 if PiPj is an edge and zero otherwise.

Example

The adjacency matrix of the graph in the example above is

Example

Five students, Aurellio, Brian, Cindy, Dave, and Edward have formed a study group. Aurellio
has Cindy and Dave's phone number, Brian has Edward's number, Cindy has Dave and Edward's
number, Dave has Aurellio and Brian's number and Edward has everyone's number.

The graph and the adjacency matrix are shown below.

Access

One of the questions that we may ask is, if Aurellio wants to get a hold of Brian, how can he do
this? More specifically, we can ask how many ways there are of Aurellio getting Brian's number
using no more than two edges.

We say that Pi has r stage access to Pj if there is a way to get to Pj from Pi via r edges.

The following theorem helps us to answer the question at hand.


Theorem

Let A(G) be the adjacency matrix of a digraph G then the number of r stage accesses
that Pi has to Pj is given by

[A(G)^r]ij

Thus the total number of ways that Pi has access to Pj in r or fewer stages is
the ijth element of the matrix

A(G) + A(G)^2 + ... + A(G)^r

Now back to our problem. We have

To find out how many ways there are of Aurellio getting Brian's number using no more than
two edges we just take the (1,2) entry of this matrix. This is 1, so there is exactly one way of
Aurellio getting Brian's number using no more than two edges.
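
The count can be checked with a short Python sketch. The adjacency matrix below is our reconstruction from the description of who has whose phone number (row i has a 1 in column j when student i has student j's number), with the students in alphabetical order; the function names are our own.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    cols = list(zip(*B))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in A]

def access_within(A, r):
    """Return A + A^2 + ... + A^r for a square matrix A."""
    n = len(A)
    total = [[0] * n for _ in range(n)]
    power = [[int(i == j) for j in range(n)] for i in range(n)]  # identity matrix
    for _ in range(r):
        power = matmul(power, A)
        total = [[t + p for t, p in zip(trow, prow)]
                 for trow, prow in zip(total, power)]
    return total

# Rows and columns: Aurellio, Brian, Cindy, Dave, Edward (assumed ordering)
G = [
    [0, 0, 1, 1, 0],  # Aurellio has Cindy's and Dave's numbers
    [0, 0, 0, 0, 1],  # Brian has Edward's number
    [0, 0, 0, 1, 1],  # Cindy has Dave's and Edward's numbers
    [1, 1, 0, 0, 0],  # Dave has Aurellio's and Brian's numbers
    [1, 1, 1, 1, 0],  # Edward has everyone's number
]
ways = access_within(G, 2)[0][1]
print(ways)
```

The single two-edge route goes through Dave, who is the only person whose number Aurellio has and who also has Brian's number.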

Cliques

Sociologists speak of cliques, meaning subgroups of a larger group whose members associate
with each other but do not associate with others. In the language of graph theory we
have the following.

Definition

A clique in a digraph is a subset S of the vertices satisfying the following three
properties

1. S contains three or more vertices.

2. If Pi and Pj are in S, then Pi has access to Pj and Pj has access to Pi.


3. There is no larger subset containing S that satisfies properties 1 and 2.

Example

In the digraph below, B, C, D, and E form a clique.

In the above example, it is relatively easy to spot any cliques. For larger digraphs, it
is much more difficult to spot a clique just by looking. Fortunately, there is a way of
doing this using matrices.

Theorem

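Let A(G) be the adjacency matrix of a digraph and let S be the matrix whose entries
are defined by

sij = 1 if Pi has access to Pj and Pj has access to Pi, and sij = 0 otherwise.

Then a vertex Pi belongs to a clique if and only if the ith diagonal entry of S^3 is nonzero.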

Example

Consider the digraph with adjacency matrix


Then

We see that the positive diagonal elements correspond to P1, P2, P3, and P6. So these four vertices
are in a clique.
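
A small Python sketch of this test follows. It assumes that S has a 1 in position ij exactly when both PiPj and PjPi are edges, and it uses a hypothetical four-vertex digraph of our own rather than the one in the example.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    cols = list(zip(*B))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in A]

def clique_members(A):
    """Indices of vertices whose diagonal entry in S^3 is nonzero."""
    n = len(A)
    # Assumed definition: s_ij = 1 only when edges run in both directions
    S = [[A[i][j] * A[j][i] for j in range(n)] for i in range(n)]
    S3 = matmul(matmul(S, S), S)
    return [i for i in range(n) if S3[i][i] != 0]

# Hypothetical digraph: vertices 0, 1, 2 all point to each other,
# while vertex 3 points to vertex 0 with no edge coming back
A = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
print(clique_members(A))
```

A nonzero diagonal entry of S^3 means there is a closed walk of three mutual edges through that vertex, which is why vertex 3 is excluded here.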

Strongly Connected Graphs

If G is a graph then we are often concerned with ensuring that there is a path from any
vertex to any other vertex. If the vertices represent street corners and the edges
represent roads, then a civil engineer wants to ensure that a car can get from any street
corner to any other corner. This idea also applies to networked systems such as
telephone lines.

A graph is called strongly connected if for every two distinct vertices Pi and Pj there is
a path leading from Pi to Pj and from Pj to Pi. Otherwise it is not strongly
connected.

If there is a path that leads from Pi to Pj then at some stage r, Pi has access to Pj, so
that the ijth entry of A(G)^r will be positive. Moreover, since there are n vertices, the
shortest path between two distinct vertices will contain fewer than n edges. This leads us
to the following theorem.

Theorem
Let G be a digraph with n vertices and A(G) be its adjacency matrix. Then G is strongly
connected if and only if

A(G) + A(G)^2 + ... + A(G)^(n-1)

has no zero entries off the main diagonal.

Example

Consider the digraph below

The adjacency matrix is

Now use a calculator to find


We see that this matrix has no zero entries, so the graph is strongly connected.
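
Instead of a calculator, the check can be sketched in Python. The function names and the two small digraphs are hypothetical. The sketch inspects only the off-diagonal entries of the sum, because the definition of strongly connected concerns pairs of distinct vertices.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    cols = list(zip(*B))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in A]

def is_strongly_connected(A):
    """Check the off-diagonal entries of A + A^2 + ... + A^(n-1)."""
    n = len(A)
    total = [[0] * n for _ in range(n)]
    power = [[int(i == j) for j in range(n)] for i in range(n)]  # identity matrix
    for _ in range(n - 1):
        power = matmul(power, A)
        total = [[t + p for t, p in zip(trow, prow)]
                 for trow, prow in zip(total, power)]
    return all(total[i][j] != 0
               for i in range(n) for j in range(n) if i != j)

# Hypothetical digraphs: a directed cycle 0 -> 1 -> 2 -> 3 -> 0 and a chain 0 -> 1 -> 2
cycle = [[0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0]]
chain = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
print(is_strongly_connected(cycle))
print(is_strongly_connected(chain))
```

In the cycle every vertex reaches every other within three edges, while in the chain nothing leads back to vertex 0.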

Electrical Circuits
This discussion will focus on using matrices to answer questions related to electrical
circuits. We will provide the basic ideas from physics and see how matrices are
useful in this subject. An electrical circuit consists of several components, some of
which are

• Batteries

• Resistors

• Wires

A battery provides an electrical potential difference (measured in volts) that drives current
through the circuit, a resistor converts electrical energy into other forms of energy such as
light, producing a voltage drop, and a wire allows the current to flow through it without
changing the voltage. The example below shows an electrical circuit diagram.

In this circuit diagram there are three batteries and four resistors. As you will notice
the diagram contains plenty of additional notes. The batteries' electrical potential
difference is measured in volts. The battery to the left is a 60 volt battery, the middle
battery is an 80 volt battery, and the voltage of the battery to the right has yet to be
determined. The resistors are measured in ohms. We use the Greek letter Ω as the
unit indicator. The currents I1 and I2 are measured in amperes. In the diagram above
both currents have yet to be determined.

You will also notice the letters a, b, c, d, e, and f labeled on the circuit diagram.
These mark important points in the diagram. Points b and e represent nodes.
A node is a point where three or more wires connect. Points a, c, d, and f are not
nodes; however, as we shall see next, it is convenient to label them.

A voltage loop is a closed connection within a circuit. That is a piece of the circuit
beginning at a point and ending at the same point. We will be interested in loops that
do not contain any sub loops. There are two such loops in the above diagram:

a ---> b ---> e ---> f ---> a

and

b ---> c ---> d ---> e ---> b

A change in voltage occurs when a current passes through a battery. For example, on
the 60 volt battery, a current flowing upwards (from "-" to "+") will gain 60 volts
while a current flowing downwards (from "+" to "-") will lose 60 volts.

The voltage across a resistor is related to the current I through it and its resistance R by

V = IR

The sign is positive if the measurement is taken against the current flow and the sign
is negative if the measurement is taken in the direction of the current flow.

Kirchhoff came up with two laws for electrical circuits that will help us find the
unknown quantities.

Kirchhoff's Voltage Law: Around any voltage loop, the total electrical potential
difference is zero.

Kirchhoff's Current Law: At any current node, the flow of all currents into the node
equals the flow of all currents out of the node.
The first law is often called conservation of energy and the second law is often called
conservation of charge.

We now have all the ingredients necessary to solve our problem. We will use
Kirchhoff's two laws and the help of matrices to find the unknown currents and
voltage. First we give a direction to the two currents. Later we may change the
directions. Let I1 flow from e to f to a to b and let I2 flow from b to c to d to e. We
begin with the loops. Consider the loop

a ---> b ---> e ---> f ---> a

From a to b, the current I1 passes through a resistor of 3 ohms, hence the potential
difference is

-3I1

From b to e, the current passes through the 80 V battery in the direction of a voltage drop,
giving a potential difference of

-80

From e to f the current passes through a resistor of 1 ohm, hence the potential
difference is

-I1

From f to a, the current passes through the 60 V battery in the direction of a voltage gain,
giving a potential difference of

60

Kirchhoff's voltage law tells us that

-3I1 - 80 - I1 + 60 = 0

or

-4I1 = 20

Using Kirchhoff's voltage law on the loop

b ---> c ---> d ---> e ---> b

gives

-2I2 - E - 3I2 + 80 = 0

or

5I2 + E = 80

Now we use Kirchhoff's current law. For the node b, the sum of the currents going in
is

I1 + 13

and the current going out is

I2

So that

I1 + 13 = I2

or

I1 - I2 = -13

The node e gives

I2 = I1 + 13

Notice that this gives us no new information. We have the three equations

-4I1 = 20
5I2 + E = 80
I1 - I2 = -13

This can be written in matrix form as

or with the augmented matrix


Now we can rref the matrix to get

We have that

I1 = -5, I2 = 8, E = 40

Typically we want to present currents as positive, so we change the orientation of I1 to go
from b to a to f to e and say that I1 = 5.
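
The three circuit equations can also be solved with a short Python sketch. This is one possible way to carry out the row reduction; the function name solve is our own, the unknowns are ordered I1, I2, E, and exact fractions are used.

```python
from fractions import Fraction

def solve(augmented):
    """Solve a square linear system given as an augmented matrix [A | b]."""
    M = [[Fraction(x) for x in row] for row in augmented]
    n = len(M)
    for c in range(n):
        pivot = next((i for i in range(c, n) if M[i][c] != 0), None)
        if pivot is None:
            raise ValueError("no unique solution")
        M[c], M[pivot] = M[pivot], M[c]
        lead = M[c][c]
        M[c] = [x / lead for x in M[c]]
        for i in range(n):
            if i != c:
                factor = M[i][c]
                M[i] = [x - factor * y for x, y in zip(M[i], M[c])]
    return [row[-1] for row in M]

# Columns are I1, I2, E; the last column is the right hand side
system = [
    [-4, 0, 0, 20],   # -4 I1 = 20
    [0, 5, 1, 80],    # 5 I2 + E = 80
    [1, -1, 0, -13],  # I1 - I2 = -13
]
I1, I2, E = solve(system)
print(I1, I2, E)
```

The negative value of I1 is what signals that the direction originally assigned to that current should be reversed.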

Wavelets
Introduction

Modern computers are constantly sending images and videos that need to be compressed in order
to be put into a smaller package for high speed transmission. When the receiver receives the
package, the image must be decompressed. Ideally, we would want a system that compresses
and decompresses without losing any data. This is usually impossible. Instead, computers strive
to accomplish compression and decompression with a minimal loss of data.

For example, if the data is a picture, we want the transformed picture to look about the same as
the original. Digital pictures are just a sequence of numbers so we are interested in taking a large
sequence of numbers, compressing the large sequence into a smaller sequence of numbers and
then decompressing into a new large sequence of numbers that is approximately the same as the
original list. The new sequence that approximates the original sequence is called a wavelet.

Averaging

Given two numbers, if we want to compress the numbers into one number, the most
logical number to send is the average of the two numbers. We can think of this
process as starting with the vector

v = [a,b]T

and sending it to the vector

w = Av

where A is the matrix

[0.5,0.5]

This transformation has the advantage that the information sent is half the size of the
original yet depends on both numbers. Of course we can never get back to the
original two numbers from just the average.

This method works for larger data sets. For example if

v = [a,b,c,d]T

we take the average of a and b and the average of c and d. If we think of this as
compressing an image, we are merging adjacent pixels together. The matrix A that
takes pairwise averages is given by

Average-Difference Representation

For most pictures, there are areas where the color does not change much at all. When
this is the case, we compute the average and the distance from the average. If the
distance from the average is close to zero, we round to zero and omit the distance.
For most pictures, this allows us to compress a picture to a file that is much smaller in
size than sending information about every pixel.
For a sequence of two numbers, v = [a,b], we take the average and the distance from
the first entry to the average. The matrix that corresponds to this transformation is

For example if

v = [3,7]

then

AvT = [5,-2]T

Notice that 5 is the average of 3 and 7 and -2 is the first (3) minus the average (5). Also notice
that A has the inverse

This allows us to get the original data back from the transformed data. Notice that

A-1 [5,-2]T = [3,7]T

For a sequence of four numbers [a,b,c,d], we use a two step process to make a
transformation. For the first step, the first outcome is the average of a and b, the
second is the average of c and d, the third is the distance from a to the average
of a and b, and the fourth is the distance from c to the average of c and d. The matrix
for this transformation is

The second step is to transform these intermediate numbers so that the first number is the
average of the first two (the final average), the second is the distance from the intermediate first
number to the total average, and the last two numbers remain the same. The matrix that
accomplishes this is

Composing these two transformations is the same as multiplying these matrices. If we start with
a 4 x 1 vector v, then we obtain the new vector using

u = A2 A1 v

Example

Suppose that we want to transform the vector v = [2,6,9,3]. We have

A1vT = [4,6,-2,3]T

and

(A2 A1)vT = A2(A1vT) = A2[4,6,-2,3]T = [5,-1,-2,3]T
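
The two steps can be reproduced with a short Python sketch. The matrices A1 and A2 below are our reconstruction from the verbal description of each step (averages first, then distances from the first entry of each pair to its average), so treat them as an assumption.

```python
def matvec(A, v):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

# Step 1: pairwise averages, then distance from the first entry of each pair to its average
A1 = [
    [0.5, 0.5, 0, 0],
    [0, 0, 0.5, 0.5],
    [0.5, -0.5, 0, 0],
    [0, 0, 0.5, -0.5],
]
# Step 2: average of the two averages, distance to it, last two entries unchanged
A2 = [
    [0.5, 0.5, 0, 0],
    [0.5, -0.5, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]

v = [2, 6, 9, 3]
step1 = matvec(A1, v)
u = matvec(A2, step1)
print(step1)
print(u)
```

Note that a - (a + b)/2 equals 0.5a - 0.5b, which is why the difference rows contain 0.5 and -0.5.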

Threshold Values

We are now ready to demonstrate a general way of compressing and decompressing
data. We will show by example how to accomplish this for a vector of length eight.

We denote the following:

Then the 8 x 8 matrix that computes pairwise averages and then distances is given by

We let the second matrix be the matrix that computes the averages of the first two and of the
last two pairwise averages (that is, the averages of the first four and of the last four original
entries), and then the distances from the first pairwise average to the first of these new
averages and from the third pairwise average to the second. Finally it leaves the final four
entries the same. The matrix that accomplishes this is

The third step is to let the first entry be the final average and the second entry be the distance
between the average of the first four and the final average. The matrix should leave the final six
entries the same. This matrix is

To transform a vector v we take the product of the three matrices.

wT = A3A2A1vT

Before we send the information wT we compress the data. The first entry is the final average and
we do not touch this number. The last seven numbers are the detail coefficients. One way of
compressing the data is to treat all small detail coefficients as zero. Think of this as saying that
if nearby data points are close to their averages, then replace the data with the average.
The threshold number is the value such that if a detail coefficient is below this value (in
absolute value) then it is replaced by zero.

To recover a vector close to the original, we apply the inverse. We use the fact that

(A3 A2 A1)^-1 = A1^-1 A2^-1 A3^-1

Thus

v^T ≈ A1^-1 A2^-1 A3^-1 w^T

Example

Consider the vector

v = [23, 54, 55, 70, 89, 91, 93, 100]

Use the threshold number 5 to compress the data. Then decompress the data.

Solution

We find that

A3 A2 A1 v^T = [71.875, -21.375, -12, -3.25, -15.5, -7.5, -1, -3.5]^T

Next replace the detail coefficients whose absolute value is below 5 with zero to get the
compressed data

w = [71.875, -21.375, -12, 0, -15.5, -7.5, 0, 0]

Notice that this contains only 5 nonzero numbers, fewer than the original 8. Now
decompress to get

A1^-1 A2^-1 A3^-1 w^T = [23, 54, 55, 70, 93.25, 93.25, 93.25, 93.25]^T

Although the numbers are not exactly the same as the original, they are close.
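
The whole compress and decompress cycle can be sketched in a few lines. This is an illustration only: instead of forming the 8 x 8 matrices, it applies the averaging and distance steps directly, which has the same effect as multiplying by A3 A2 A1 and by A1^-1 A2^-1 A3^-1. The function names forward, threshold, and inverse are ours, and the sketch assumes the length of the vector is a power of two.

```python
def forward(v):
    """Repeated pairwise averaging and distances (the effect of A3 A2 A1)."""
    w = list(v)
    n = len(w)
    while n > 1:
        half = n // 2
        avgs = [(w[2 * i] + w[2 * i + 1]) / 2 for i in range(half)]
        # Detail coefficient: first entry of the pair minus the pair's average.
        details = [w[2 * i] - avgs[i] for i in range(half)]
        w[:n] = avgs + details
        n = half
    return w

def threshold(w, eps):
    """Keep the final average; zero every detail coefficient with |d| < eps."""
    return [w[0]] + [0 if abs(d) < eps else d for d in w[1:]]

def inverse(w):
    """Undo the steps in reverse order (the effect of A1^-1 A2^-1 A3^-1)."""
    v = list(w)
    n = 1
    while n < len(v):
        avgs, details = v[:n], v[n:2 * n]
        pairs = []
        for a, d in zip(avgs, details):
            pairs += [a + d, a - d]   # first entry = avg + d, second = avg - d
        v[:2 * n] = pairs
        n *= 2
    return v

v = [23, 54, 55, 70, 89, 91, 93, 100]
w = forward(v)
compressed = threshold(w, 5)
restored = inverse(compressed)
print(w)
print(compressed)
print(restored)
```

With no thresholding, inverse(forward(v)) returns the original vector exactly, which is one way to check that the two functions really are inverses of each other.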

Determinants
Permutations

Before we can get to the definition of the determinant of a matrix, we first need to
understand permutations.

Let

S = {1,2,...,n}

then a permutation is a 1-1 function from S to S.


We can think of a permutation on n elements as a reordering of the elements.

Example

(2,1,3) is a permutation on 3 elements. We have

f(1) = 2,   f(2) = 1,   f(3) = 3

There are exactly 6 permutations on 3 elements. They are

(1,2,3) (1,3,2) (2,1,3) (2,3,1) (3,1,2) (3,2,1)

The identity permutation is the permutation that keeps the elements in numerical
order.

For example

e3 = (1,2,3)

We define a transposition of two elements as the permutation that switches those two
elements and leaves the others fixed.

For example (2,1,3) is a transposition that switches 1 and 2. We can compose two
permutations since they are functions. Given a permutation, how many transpositions
does it take in order to get to the identity permutation? It turns out that this is not a
well-defined question, since there are many ways of getting back to the identity.
However, given any permutation, the parity of the number of transpositions needed to get
back to the identity is independent of how it is done.

A permutation is called even if it takes an even number of transpositions to get back
to the identity and odd if it takes an odd number.

Example

(3,4,2,1) is an odd permutation since we can get back to the identity via

(3,4,2,1) --> (1,4,2,3) --> (1,2,4,3) --> (1,2,3,4)
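
The parity of a permutation can be found mechanically by counting the switches needed to sort it. The sketch below is one possible way to do this (the function name parity is ours). It may use a different sequence of transpositions than the one shown above, but as noted, the parity does not depend on the sequence chosen.

```python
def parity(perm):
    """Return 'even' or 'odd' for a permutation of 1..n given as a tuple."""
    p = list(perm)
    swaps = 0
    for i in range(len(p)):
        # Send the value at position i to its home slot until
        # position i holds the value i + 1.
        while p[i] != i + 1:
            j = p[i] - 1
            p[i], p[j] = p[j], p[i]
            swaps += 1
    return "even" if swaps % 2 == 0 else "odd"

print(parity((3, 4, 2, 1)))  # odd
print(parity((1, 2, 3)))     # even
```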


Definition of the Determinant

We are now ready to define the determinant of the matrix. The definition will be
quite difficult to understand as it is written. We strongly encourage you to read
through the examples and try some of your own.

Definition

Let A be an n x n matrix. Then the determinant of A is the number given by

det A = Σ (±) a1j1 a2j2 ... anjn

where the sum is taken over all possible permutations (j1, j2, ..., jn) on n elements and the
sign is positive if the permutation is even and negative if the permutation is odd.

Example

Find det A where

A = [  2   0   4 ]
    [  1   3   5 ]
    [ 10  -1   0 ]

Solution

We use all the permutations on 3 elements. The permutation (1,2,3) is even and corresponds
with the product

a11a22a33 = (2)(3)(0) = 0

The permutation (1,3,2) is odd and corresponds with the product

-a11a23a32 = -(2)(5)(-1) = 10

Notice that the i subscripts are 1, 2, and 3 and the j subscripts are 1, 3, 2.

The permutation (2,1,3) is odd and corresponds with the product


-a12a21a33 = -(0)(1)(0) = 0

The permutation (2,3,1) is even and corresponds with the product

a12a23a31 = (0)(5)(10) = 0

The permutation (3,1,2) is even and corresponds with the product

a13a21a32 = (4)(1)(-1) = -4

The permutation (3,2,1) is odd and corresponds with the product

-a13a22a31 = -(4)(3)(10) = -120

Finally, we add these all up to get

0 + 10 + 0 + 0 - 4 - 120 = -114
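
The same sum can be computed directly from the definition. The following is a small sketch, assuming the entries of A are the ones read off from the products in the example (a11 = 2, a12 = 0, a13 = 4, and so on). The sign is found by counting inversions, which gives the same parity as counting transpositions. Indices are 0-based in the code, and the function names sign and det are ours.

```python
from itertools import permutations

def sign(perm):
    """+1 for an even permutation of 0..n-1, -1 for an odd one."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1

def det(A):
    """Determinant as a signed sum over all permutations of the columns."""
    n = len(A)
    total = 0
    for perm in permutations(range(n)):
        term = sign(perm)
        for row, col in enumerate(perm):
            term *= A[row][col]   # one factor from each row and each column
        total += term
    return total

# Entries taken from the products listed in the example above.
A = [[2, 0, 4],
     [1, 3, 5],
     [10, -1, 0]]
print(det(A))  # -114
```

This approach visits all n! permutations, so it is only practical for small matrices.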

Example

Find det A where

Solution

There are 24 permutations; however, we only need those corresponding to products that are
nonzero. Since the last row has only one nonzero entry (third column), the last number in the
permutation must be a 3. Also, the first column has only one nonzero entry (second row), hence
the 1 must be the second number in the permutation. For the first number in the permutation,
we can only have a 4, since the first and second entries of the first row are zeros and 3 is
already taken up by the fourth number. That leaves only 2 for the third number, since 1, 3,
and 4 are already taken. Hence the only permutation that leaves us with a nonzero product is

(4,1,2,3)

This permutation is odd and corresponds with the product

-(3)(1)(5)(-2) = 30

Since all other permutations lead to a zero product, the determinant is 30.

Properties of the Determinant

The payoff of this definition of the determinant is that we can quickly prove many
properties of the determinant. The list is long and we will prove only a few of them.

Theorem

If A has a row of zeros, then

det A = 0

Proof

Each term of det A contains exactly one factor from each row, hence each term contains a
factor from the row of zeros and is therefore zero. The sum of zeros is zero.

Theorem

If A has two identical rows, then

det A = 0

Proof

We will prove this for the case where the first two rows are identical. The general case
is similar. The parity of (s1,s2,...,sn) is the opposite of the parity of (s2,s1,...,sn), and
since the first two rows are identical the two corresponding products are equal. Hence each
product appears twice, once with a positive coefficient and once with a negative coefficient.
The terms cancel each other out and add up to zero.

The next three theorems explore what happens to the determinant after row operations
have occurred.
Theorem

Let A be a square matrix and B be the matrix after a row of A has been multiplied by
a constant c. Then

det B = c det A

Proof

Assume that the first row has been multiplied by c. The general case is similar.
The terms in each are identical except that the first factor for A is a1j1 and the first
factor for B is ca1j1. Since each term for the B determinant has an extra factor of c,
the c can be factored out.

Theorem

Let A be a square matrix and B be the matrix after two rows of A have been
switched. Then

det A = - det B

Proof

This theorem comes directly from the definition, since switching two elements of a
permutation changes the parity of the permutation. Hence switching two rows
changes the sign of each term.

Theorem

Let A be a square matrix and B be the matrix after a row of A has been replaced by
that row plus a multiple of another row. Then

det A = det B

We will not prove this one here.
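
Although this is not a proof, the three row-operation theorems are easy to check on a particular matrix. The sketch below uses the 3 x 3 matrix from the earlier determinant example and an arbitrary constant c = 7; both choices, and the helper det, are ours for illustration.

```python
from itertools import permutations

def det(A):
    """Determinant from the permutation definition (fine for small n)."""
    n = len(A)
    total = 0
    for perm in permutations(range(n)):
        inversions = sum(
            perm[i] > perm[j] for i in range(n) for j in range(i + 1, n)
        )
        term = -1 if inversions % 2 else 1
        for row, col in enumerate(perm):
            term *= A[row][col]
        total += term
    return total

A = [[2, 0, 4], [1, 3, 5], [10, -1, 0]]
c = 7

# Row 1 multiplied by c: the determinant is multiplied by c.
scaled = [[c * x for x in A[0]], A[1], A[2]]
# Rows 1 and 2 switched: the determinant changes sign.
swapped = [A[1], A[0], A[2]]
# Row 2 replaced by row 2 plus c times row 1: the determinant is unchanged.
added = [A[0], [x + c * y for x, y in zip(A[1], A[0])], A[2]]

print(det(A), det(scaled), det(swapped), det(added))
```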


Theorem

Let A be a triangular matrix. Then det A is equal to the product of the diagonal entries.

Proof

A term of the determinant of A can only be nonzero when each of its factors is
nonzero. If A is lower triangular, then the only entry in the first row that can be nonzero
is in the first column. For the second row, we have already used the first column,
hence the only entry that can be nonzero is in the second column. Continuing this way we
obtain the product of the diagonal entries. This permutation is the identity (1,2,...,n),
which is even, so the product carries a positive sign. The upper triangular case is similar,
starting from the last row.

The next two theorems we will state without proof.

Theorem

det A = det A^T

Theorem

det (AB) = (det A)(det B)
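
The triangular, transpose, and product theorems can be spot-checked numerically. The following is a sketch under our own choice of matrices: A is the 3 x 3 matrix from the earlier example, B is an arbitrary second matrix, and L is an arbitrary lower triangular matrix. The helper names are ours.

```python
from itertools import permutations

def det(A):
    """Determinant from the permutation definition (fine for small n)."""
    n = len(A)
    total = 0
    for perm in permutations(range(n)):
        inversions = sum(
            perm[i] > perm[j] for i in range(n) for j in range(i + 1, n)
        )
        term = -1 if inversions % 2 else 1
        for row, col in enumerate(perm):
            term *= A[row][col]
        total += term
    return total

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

A = [[2, 0, 4], [1, 3, 5], [10, -1, 0]]
B = [[1, 2, 0], [0, 1, 3], [4, 0, 1]]     # arbitrary second matrix
L = [[2, 0, 0], [7, 3, 0], [1, -4, 5]]    # arbitrary lower triangular matrix

print(det(L) == 2 * 3 * 5)                    # product of the diagonal
print(det(A) == det(transpose(A)))            # det A = det A^T
print(det(matmul(A, B)) == det(A) * det(B))   # det(AB) = (det A)(det B)
```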

Theorem

If A is nonsingular, then

det(A^-1) = 1 / det A

Proof

We have

A A^-1 = I

hence

det(A A^-1) = det I

But

det(A A^-1) = (det A)(det A^-1) and det I = 1

since the identity is triangular. Hence

(det A)(det A^-1) = 1

Divide and the result follows.

Cofactors
Cofactors and Determinants

We begin with a definition.

Definition

Let A be an n x n matrix and let Mij be the (n - 1) x (n - 1) matrix obtained by deleting
the ith row and jth column. Then det Mij is called the minor of aij.
The cofactor Aij of aij is defined by

Aij = (-1)^(i+j) det Mij

Example

Let

then
so the minor of a32 is the determinant of this 2 x 2 matrix. Since the matrix is triangular, the
determinant is the product of the diagonals or

(2)(4) = 8

The cofactor is

A32 = (-1)^(3+2)(8) = -8

One of the main applications of cofactors is finding the determinant. The following theorem,
which we will not prove, shows us how to use cofactors to find a determinant.

Theorem

Let A be an n x n matrix and 1 ≤ i ≤ n. Then

det A = Σj aijAij = ai1Ai1 + ai2Ai2 + ... + ainAin

This theorem has little meaning without an example.

Example

Use cofactors to find det A for

Solution

We can use any row that we want. Let's pick the second row. We have
= 0 + (3)(0 - 5) + (8 - 0) = -7

Remark: Since det A = det A^T, we can expand about a column if we desire.
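
Cofactor expansion translates naturally into a recursive computation. The sketch below is one way to write it. The function names are ours, indices are 0-based (which leaves the sign (-1)^(i+j) unchanged), and the test matrix is our own choice, reusing the 3 x 3 matrix from the permutation example so the answer can be compared.

```python
def minor(A, i, j):
    """The matrix Mij: delete row i and column j (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Determinant by cofactor expansion about the first row."""
    if len(A) == 1:
        return A[0][0]
    return sum(A[0][j] * cofactor(A, 0, j) for j in range(len(A)))

def cofactor(A, i, j):
    return (-1) ** (i + j) * det(minor(A, i, j))

def expand_row(A, i):
    """Cofactor expansion about row i."""
    return sum(A[i][j] * cofactor(A, i, j) for j in range(len(A)))

def expand_col(A, j):
    """Cofactor expansion about column j."""
    return sum(A[i][j] * cofactor(A, i, j) for i in range(len(A)))

A = [[2, 0, 4], [1, 3, 5], [10, -1, 0]]
print([expand_row(A, i) for i in range(3)])  # every row gives the same value
print([expand_col(A, j) for j in range(3)])  # and so does every column
```

In practice one picks the row or column with the most zeros, since each zero entry saves the work of computing one cofactor.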

Example

Find the determinant of

Solution

We can choose any row or column to expand. The third column has only one nonzero entry, so
we select this column. We have

Now let's expand about the third row. We get

Cofactors and Inverses


Just as cofactors can be used to find the determinant of a matrix, they can be used to
find the inverse of a matrix. We begin with a theorem that will be useful for proving
the inverse formula.

Theorem

Let A be an n x n matrix. Then

ai1Ak1 + ai2Ak2 + ... + ainAkn = 0 for i ≠ k

a1jA1k + a2jA2k + ... + anjAnk = 0 for j ≠ k

Proof

We will prove the first statement for i = 1 and k = 2. The general case and the second
statement can be proven in a similar way. We want to show that

a11A21 + a12A22 + ... + a1nA2n = 0

Consider the matrix B that is the same as A except that the second row of B is the
same as the first row. Since B has two identical rows, the determinant of B is zero.
Now find det B by expanding about the second row. The cofactors B2j are computed by
deleting the second row, so B2j = A2j, while b2j = a1j. Hence this expansion is

0 = det B = b21B21 + b22B22 + ... + b2nB2n = a11A21 + a12A22 + ... + a1nA2n

and the theorem is proven.

Definition

Let A be an n x n matrix. Then the adjoint of A (adj A) is the matrix such that

(adj A)ij = Aji


Notice the switch of subscripts. This means that the adjoint is the transpose of the
matrix that consists of cofactors.

Example

Find adj A for

Solution

We have

So that

Now for the main theorem

Theorem

If A is an n x n matrix then
A(adj A) = (adj A)A = (det A) In

Proof

The proof follows immediately from the formula for the determinant and the previous
theorem. We have

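[A(adj A)]ij = Σk aik(adj A)kj = Σk aikAjk = (det A) δij

where δij is the Kronecker delta function, evaluating to 1 for i = j and 0 otherwise. For i = j
the sum is the cofactor expansion of det A about row i, and for i ≠ j it is zero by the
previous theorem. Hence the theorem is proven.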
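
The identity A(adj A) = (adj A)A = (det A) In can be verified on a concrete matrix. The following sketch builds the adjoint from cofactors. The function names and the choice of matrix are ours, and the code assumes n is at least 2.

```python
def minor(A, i, j):
    """Delete row i and column j (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Determinant by cofactor expansion about the first row."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j))
               for j in range(len(A)))

def cofactor(A, i, j):
    return (-1) ** (i + j) * det(minor(A, i, j))

def adj(A):
    """Adjoint: entry (i, j) is the cofactor Aji (note the switched indices)."""
    n = len(A)
    return [[cofactor(A, j, i) for j in range(n)] for i in range(n)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

A = [[2, 0, 4], [1, 3, 5], [10, -1, 0]]
d = det(A)
d_times_identity = [[d if i == j else 0 for j in range(3)] for i in range(3)]

print(adj(A))
print(matmul(A, adj(A)) == d_times_identity)
print(matmul(adj(A), A) == d_times_identity)
```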

The main application of this theorem is the following corollary that easily follows from the
theorem.

Corollary

If A is a nonsingular matrix then

A^-1 = (1 / det A) adj A

Example

We found that the matrix

has adjoint
We can find that

det A = [A(adj A)]11 = 27

Hence

This gives us a way to find inverses and a way to determine if a matrix is nonsingular.

For a 2 x 2 matrix the adjoint of A is easy to find. If

A = [ a  b ]
    [ c  d ]

then

adj A = [  d  -b ]
        [ -c   a ]

Using the inverse formula, we get

A^-1 = 1/(ad - bc) [  d  -b ]
                   [ -c   a ]

Theorem

A matrix is nonsingular if and only if the determinant is nonzero.

Proof

If A is nonsingular, then

1 = det(I) = det(A A^-1) = (det A)(det A^-1)

so the determinant of A is nonzero.

If the determinant is nonzero, then the corollary shows us how to find the inverse so the matrix is
nonsingular.
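
Both directions of this theorem show up when the inverse formula is turned into a computation: a zero determinant stops the calculation, and a nonzero one produces the inverse. The sketch below is an illustration only. It assumes integer entries and n at least 2, uses exact fractions to avoid rounding, and the function names are ours.

```python
from fractions import Fraction

def minor(A, i, j):
    """Delete row i and column j (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Determinant by cofactor expansion about the first row."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j))
               for j in range(len(A)))

def inverse(A):
    """A^-1 = (1 / det A) adj A, or an error if A is singular."""
    d = det(A)
    if d == 0:
        raise ValueError("matrix is singular: det A = 0")
    n = len(A)
    # Entry (i, j) of the inverse is the cofactor Aji divided by det A.
    return [[Fraction((-1) ** (i + j) * det(minor(A, j, i)), d)
             for j in range(n)]
            for i in range(n)]

A = [[3, 1], [4, 2]]          # det A = 2
print(inverse(A))

try:
    inverse([[1, 2], [2, 4]])  # second row is twice the first
except ValueError as err:
    print(err)
```

For larger matrices, row reduction is usually far cheaper than computing all n^2 cofactors, so this formula is mainly of theoretical interest.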
This gives us an addition to our list of nonsingular equivalences.

Theorem

Let A be an n x n matrix. The following are equivalent:

1. A is nonsingular.
2. Ax = 0 has only the trivial solution.
3. A is row equivalent to the identity.
4. Ax = b has a unique solution for all b.
5. The determinant of A is nonzero.

We end this discussion with the statement of Cramer's Rule, a formula that gives us
the solution of systems of equations.

Cramer's Rule

Let

Ax = b

be a linear system of equations with n x n matrix A. If det A is nonzero, then the unique
solution is given by

xi = det(Ai) / det(A)

where Ai is the matrix obtained from A by replacing the ith column of A by b.

Example

Use Cramer's rule to find z if


 x - 3y + 2z = 3
2x +  y +  z = 1
 x +  y +  z = 3

Solution

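We write this in matrix form

[ 1  -3  2 ] [ x ]   [ 3 ]
[ 2   1  1 ] [ y ] = [ 1 ]
[ 1   1  1 ] [ z ]   [ 3 ]

We have

det A = 1(1 - 1) - (-3)(2 - 1) + 2(2 - 1) = 5

Since we want to find z, we need det A3, where A3 is A with its third column replaced by b:

det A3 = det [ 1  -3  3 ]
             [ 2   1  1 ]  = 1(3 - 1) - (-3)(6 - 1) + 3(2 - 1) = 20
             [ 1   1  3 ]

We find z by dividing

z = 20/5 = 4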
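
Cramer's Rule is short enough to write out in full. The sketch below solves the system from the example for all three unknowns. The function names are ours, the code assumes integer entries, and exact fractions are used so that no rounding occurs.

```python
from fractions import Fraction

def minor(A, i, j):
    """Delete row i and column j (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Determinant by cofactor expansion about the first row."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j))
               for j in range(len(A)))

def cramer(A, b):
    """Solve Ax = b using xi = det(Ai) / det(A)."""
    d = det(A)
    if d == 0:
        raise ValueError("Cramer's Rule needs det A to be nonzero")
    solution = []
    for i in range(len(A)):
        # Ai is A with column i replaced by b.
        Ai = [row[:i] + [b[r]] + row[i + 1:] for r, row in enumerate(A)]
        solution.append(Fraction(det(Ai), d))
    return solution

A = [[1, -3, 2],
     [2, 1, 1],
     [1, 1, 1]]
b = [3, 1, 3]
x, y, z = cramer(A, b)
print(x, y, z)  # -2 1 4
```

Substituting x = -2, y = 1, z = 4 back into the three equations confirms the solution.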

Code Theory
Code theory has become increasingly important as computers have become ingrained
in our lives. This discussion will focus on error detection and correction rather than
on encryption. When a message is sent over the net, it is encoded as a binary string.
It is likely that a few of the bits will become corrupted (a 1 becoming a 0 or a 0
becoming a 1). We will be looking at ways of detecting such an error and, if possible,
correcting it. First a few definitions.

A message is a sequence of numbers where each of the numbers is either 0 or 1. We
can encode a word by converting each of its letters to a binary string. A sentence is a
sequence of words, so any set of words can be represented by a vector. Typically we
will send a message from one computer to another; however, because of noise, the
message received will not always be the same as the message sent. In order to deal
with this issue, we send a transmission that includes redundant bits so that we can
detect when something has gone awry. The transmission matrix will be an m by n matrix
with m > n, so that a message of length n becomes a transmission of length m.

Example

Let A be the transmission matrix

A = [ 1  0 ]
    [ 0  1 ]
    [ 1  1 ]

with all arithmetic done modulo 2. We have the following table:

Message    Transmission
(0,0)      (0,0,0)
(1,0)      (1,0,1)
(0,1)      (0,1,1)
(1,1)      (1,1,0)

If an entry in the transmission differs from the entry that was supposed to be
transmitted, then we call this an error. Notice that if the receiver receives a
transmission other than the four possible transmissions, then an error is detected. For
example a transmission of (1,1,1) is detected as an error. The receiver will detect an
error whenever there is exactly one error in the transmission. However, if there are
two errors, the error will not be detected. This is called a single-error detecting
code. Notice that the third component of the transmission is 0 if the sum of the entries
of the message is even and 1 if the sum is odd. The sum of the entries is called
the weight of the message. The transmission in the example is called a parity check
code and will detect an odd number of errors, but will not detect an even number of
errors.
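
The parity check code is simple to simulate. The sketch below follows the rule stated above: the extra bit is 0 when the weight of the message is even and 1 when it is odd. The function names encode and has_error are ours.

```python
def encode(message):
    """Append a parity bit so that every valid transmission has even weight."""
    return tuple(message) + (sum(message) % 2,)

def has_error(received):
    """An odd weight means an odd number of bits were flipped."""
    return sum(received) % 2 == 1

for message in [(0, 0), (1, 0), (0, 1), (1, 1)]:
    print(message, encode(message))

print(has_error((1, 1, 1)))   # True: one bit of (1,1,0) was flipped
print(has_error((0, 1, 1)))   # False: a valid transmission, or two errors
```

The last line shows the limitation: if (1,0,1) is sent and its first two bits are both flipped, the receiver sees (0,1,1), which looks like a valid transmission.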

Example
Consider the matrix

A = [ 1  0 ]
    [ 0  1 ]
    [ 1  0 ]
    [ 0  1 ]
    [ 1  0 ]
    [ 0  1 ]

This matrix takes two-letter words and produces six-letter words. The table for this is

Message    Transmission
(0,0)      (0,0,0,0,0,0)
(1,0)      (1,0,1,0,1,0)
(0,1)      (0,1,0,1,0,1)
(1,1)      (1,1,1,1,1,1)

Notice that this transmission just repeats the message three times. It will detect one or two
errors, hence is called a double-error detecting code. If we assume that there will never be more
than one error, then we can correct an error. For example, if the received transmission
is (1,1,0,1,0,1), then the only possible correct transmission that gives one error is (0,1,0,1,0,1),
hence the original message was (0,1).
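
Correcting a single error amounts to finding the valid transmission that differs from the received one in the fewest places. The sketch below does this by brute force over all possible messages. The function names are ours, and the received word used here is our own illustration, not one taken from the text.

```python
from itertools import product

def encode(message):
    """Repeat the message three times."""
    return tuple(message) * 3

def distance(u, v):
    """Number of positions in which u and v differ."""
    return sum(a != b for a, b in zip(u, v))

def decode(received, length=2):
    """Return the message whose transmission is closest to what was received."""
    return min(product((0, 1), repeat=length),
               key=lambda m: distance(encode(m), received))

sent = encode((1, 0))            # (1, 0, 1, 0, 1, 0)
received = (1, 0, 1, 1, 1, 0)    # the fourth bit was flipped
print(distance(sent, received))  # 1
print(decode(received))          # (1, 0)
```

Any two valid transmissions differ in at least three places, which is why one or two errors can always be detected and a single error can always be corrected.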
