Notes on Diffy Qs

Differential Equations for Engineers

by Jiří Lebl

August 6, 2025
(version 6.9)

Typeset in LaTeX.

Edition 6 (version 6.9)

Copyright ©2008–2025 Jiří Lebl

This work is dual licensed under the Creative Commons Attribution-Noncommercial-Share Alike 4.0
International License and the Creative Commons Attribution-Share Alike 4.0 International License.
To view a copy of these licenses, visit https://creativecommons.org/licenses/by-nc-sa/4.0/
or https://creativecommons.org/licenses/by-sa/4.0/ or send a letter to Creative Commons
PO Box 1866, Mountain View, CA 94042, USA.

You can use, print, duplicate, and share this book as much as you want. You can base your own
notes on it and reuse parts if you keep the license the same. You can assume the license is either
CC-BY-NC-SA or CC-BY-SA, whichever is compatible with what you wish to do. Your derivative
work must use at least one of the licenses. Derivative works must be prominently marked as such.

During the writing of this book, the author was in part supported by NSF grants DMS-0900885 and
DMS-1362337.

The major version / edition number is raised only if there have been substantial changes. Edition
number started at 5, that is, version 5.0, as it was not kept track of before.

See https://www.jirka.org/diffyqs/ for more information (including contact information).

The LaTeX source for the book is available for possible modification and customization at GitHub:
https://github.com/jirilebl/diffyqs
Contents

Introduction
0.1 Notes about these notes
0.2 Introduction to differential equations
0.3 Classification of differential equations

1 First-order equations
1.1 Integrals as solutions
1.2 Slope fields
1.3 Separable equations
1.4 Linear equations and the integrating factor
1.5 Substitution
1.6 Autonomous equations
1.7 Numerical methods: Euler’s method
1.8 Exact equations
1.9 First-order linear PDEs

2 Higher-order linear ODEs
2.1 Second-order linear ODEs
2.2 Constant-coefficient second-order linear ODEs
2.3 Higher-order linear ODEs
2.4 Mechanical vibrations
2.5 Nonhomogeneous equations
2.6 Forced oscillations and resonance

3 Systems of ODEs
3.1 Introduction to systems of ODEs
3.2 Matrices and linear systems
3.3 Linear systems of ODEs
3.4 Eigenvalue method
3.5 Two-dimensional systems and their vector fields
3.6 Second-order systems and applications
3.7 Multiple eigenvalues
3.8 Matrix exponentials
3.9 Nonhomogeneous systems

4 Fourier series and PDEs
4.1 Boundary value problems
4.2 The trigonometric series
4.3 More on the Fourier series
4.4 Sine and cosine series
4.5 Applications of Fourier series
4.6 PDEs, separation of variables, and the heat equation
4.7 One-dimensional wave equation
4.8 D’Alembert solution of the wave equation
4.9 Steady state temperature and the Laplacian
4.10 Dirichlet problem in the circle and the Poisson kernel

5 More on eigenvalue problems
5.1 Sturm–Liouville problems
5.2 Higher-order eigenvalue problems
5.3 Steady periodic solutions

6 The Laplace transform
6.1 The Laplace transform
6.2 Transforms of derivatives and ODEs
6.3 Convolution
6.4 Dirac delta and impulse response
6.5 Solving PDEs with the Laplace transform

7 Power-series methods
7.1 Power series
7.2 Series solutions of linear second-order ODEs
7.3 Singular points and the method of Frobenius

8 Nonlinear systems
8.1 Linearization, critical points, and equilibria
8.2 Stability and classification of isolated critical points
8.3 Applications of nonlinear systems
8.4 Limit cycles
8.5 Chaos

A Linear algebra
A.1 Vectors, mappings, and matrices
A.2 Matrix algebra
A.3 Elimination
A.4 Subspaces, dimension, and the kernel
A.5 Inner product and projections
A.6 Determinant

B Table of Laplace Transforms

Further Reading

Solutions to Selected Exercises

Index

Introduction

0.1 Notes about these notes


Note: A section for the instructor.
This book originated from my class notes for Math 286 at the University of Illinois at
Urbana-Champaign (UIUC) in Fall 2008 and Spring 2009. It is a first course on differential
equations for engineers. Using this book, I also taught Math 285 at UIUC, Math 20D
at University of California, San Diego (UCSD), and Math 2233 and 4233 at Oklahoma
State University (OSU). Normally these courses are taught with Edwards and Penney,
Differential Equations and Boundary Value Problems: Computing and Modeling [EP], or Boyce
and DiPrima’s Elementary Differential Equations and Boundary Value Problems [BD], and this
book aims to be more or less a drop-in replacement. Other books I used as sources of
information and inspiration are E.L. Ince’s classic (and inexpensive) Ordinary Differential
Equations [I], Stanley Farlow’s Differential Equations and Their Applications [F], now available
from Dover, Berg and McGregor’s Elementary Partial Differential Equations [BM], and William
Trench’s free book Elementary Differential Equations with Boundary Value Problems [T]. See
the Further Reading chapter at the end of the book.

0.1.1 Organization
The organization of this book requires, to some degree, that the chapters be covered in
order. Later chapters can be dropped. The dependence of the material covered is roughly:

[Dependency diagram. Rows, top to bottom: Introduction; Appendix A and Chapter 1; Chapter 2;
Chapter 3 and Chapter 7; Chapter 8 and Chapter 4; Chapter 5 and Chapter 6.]

There are a few references in chapters 4 and 5 to chapter 3 (some linear algebra), but
these references are not essential and can be skimmed over, so chapter 3 can safely be
dropped, while still covering chapters 4 and 5. Chapter 6 does not depend on chapter 4
except that the PDEs section 6.5 makes a few references to chapter 4, although it could, in
theory, be covered separately. The more in-depth appendix A on linear algebra can replace
the short review § 3.2 for a course that combines linear algebra and ODEs.

0.1.2 Typical types of courses


Several typical courses can be run with the book. There are the two original courses at
UIUC; both cover ODEs as well as some PDEs. The first is a course at 4 hours a week for a
semester (Math 286 at UIUC):
Introduction (0.2), chapter 1 (1.1–1.7), chapter 2, chapter 3, chapter 4 (4.1–4.9), chapter 5 (or
6 or 7 or 8).
The second is a course at 3 hours a week (Math 285 at UIUC):
Introduction (0.2), chapter 1 (1.1–1.7), chapter 2, chapter 4 (4.1–4.9), (and maybe chapter 5,
6, or 7).
A semester-long course at 3 hours a week that does not cover either systems or PDEs
will cover, beyond the introduction, chapter 1, chapter 2, chapter 6, and chapter 7 (with
sections skipped as above). On the other hand, a typical course that covers systems will
probably need to skip Laplace and power series and cover chapter 1, chapter 2, chapter 3,
and chapter 8.
If sections need to be skipped in the beginning, a good core of the sections on single
ODEs is: 0.2, 1.1–1.4, 1.6, 2.1, 2.2, 2.4–2.6.
The complete book can be covered at a reasonably fast pace at approximately 76
lectures (without appendix A) or 86 lectures (with appendix A replacing § 3.2). This is
not accounting for exams, review, or time spent in a computer lab. A two-quarter or a
two-semester course can be easily run with the material. For example (with some sections
perhaps strategically skipped):
Semester 1: Introduction, chapter 1, chapter 2, chapter 6, chapter 7.
Semester 2: Chapter 3, chapter 8, chapter 4, chapter 5.
A combined course on ODEs with linear algebra can run as:
Introduction, chapter 1 (1.1–1.7), chapter 2, appendix A, chapter 3 (w/o § 3.2), (possibly
chapter 8).
The chapter on the Laplace transform (chapter 6), the chapter on Sturm–Liouville
(chapter 5), the chapter on power series (chapter 7), and the chapter on nonlinear systems
(chapter 8), are more or less interchangeable and can be treated as “topics”. If chapter 8
is covered, it may be best to place it right after chapter 3, and chapter 5 is best covered
right after chapter 4. If time is short, the first two sections of chapter 7 make a reasonable
self-contained unit.

0.1.3 Computer resources


The book’s website https://www.jirka.org/diffyqs/ contains the following resources:

1. Interactive SAGE demos.

2. Online WeBWorK homeworks (using either your own WeBWorK installation or
Edfinity) for most sections, customized for this book.

3. The PDFs of the figures used in this book.

4. YouTube videos and corresponding slides for some sections.

I used IODE (https://publish.illinois.edu/iode-diffeq/) in the UIUC courses.
IODE is a free software package that works with Matlab (proprietary) or Octave (free
software). The graphs in the book were made with the Genius software (see
https://www.jirka.org/genius.html).
The LaTeX source of the book is also available for possible modification and customization
at GitHub (https://github.com/jirilebl/diffyqs).

0.1.4 Acknowledgments
Firstly, I would like to acknowledge Rick Laugesen. I used his handwritten class notes
the first time I taught Math 286. My organization of this book through chapter 5, and
the choice of material covered, is heavily influenced by his notes. Many examples and
computations are taken from his notes. I am also heavily indebted to Rick for all the advice
he has given me, not just on teaching Math 286. For spotting errors and other suggestions,
I would also like to acknowledge (in no particular order): John P. D’Angelo, Sean Raleigh,
Jessica Robinson, Michael Angelini, Leonardo Gomes, Jeff Winegar, Ian Simon, Thomas
Wicklund, Eliot Brenner, Sean Robinson, Jannett Susberry, Dana Al-Quadi, Cesar Alvarez,
Cem Bagdatlioglu, Nathan Wong, Alison Shive, Shawn White, Wing Yip Ho, Joanne Shin,
Gladys Cruz, Jonathan Gomez, Janelle Louie, Navid Froutan, Grace Victorine, Paul Pearson,
Jared Teague, Ziad Adwan, Martin Weilandt, Sönmez Şahutoğlu, Pete Peterson, Thomas
Gresham, Prentiss Hyde, Jai Welch, Simon Tse, Andrew Browning, James Choi, Dusty
Grundmeier, John Marriott, Jim Kruidenier, Barry Conrad, Wesley Snider, Colton Koop,
Sarah Morse, Erik Boczko, Asif Shakeel, Chris Peterson, Nicholas Hu, Paul Seeburger,
Jonathan McCormick, David Leep, William Meisel, Shishir Agrawal, Tom Wan, Andres
Valloud, Martin Irungu, Justin Corvino, Tai-Peng Tsai, Santiago Mendoza Reyes, Glen
Pugh, Michael Tran, Heber Farnsworth, Tamás Zsoldos, Mark Mills, George Ballinger,
and probably others I have forgotten. Finally, I would like to acknowledge NSF grants
DMS-0900885 and DMS-1362337.

0.2 Introduction to differential equations


Note: more than 1 lecture, §1.1 in [EP], chapter 1 in [BD]

0.2.1 Differential equations


The laws of physics are generally written down as differential equations. Therefore,
all of science and engineering uses differential equations to some degree. Differential
equations are essential to understanding almost anything you will study in your science
and engineering classes. You can think of mathematics as the language of science, and
differential equations are one of the most important parts of this language as far as science
and engineering are concerned. As an analogy, suppose all your classes from now on were
given in Swahili. It would be important to first learn Swahili, or you would have a very
tough time getting a good grade in your classes.
You have seen many differential equations already without perhaps knowing about it.
And you even solved simple differential equations when you took calculus. Let us see an
example you may not have seen:
\[ \frac{dx}{dt} + x = 2 \cos t . \tag{1} \]
Here 𝑥 is the dependent variable and 𝑡 is the independent variable. Equation (1) is a basic
example of a differential equation. It is an example of a first-order differential equation, since
it involves only the first derivative of the dependent variable. This equation arises from
Newton’s law of cooling where the ambient temperature oscillates with time.

0.2.2 Solutions of differential equations


Solving the differential equation (1) means finding 𝑥 in terms of 𝑡. That is, we want to find
a function of 𝑡, which we call 𝑥, such that when we plug 𝑥, 𝑡, and $\frac{dx}{dt}$ into (1), the equation
holds; that is, the left-hand side equals the right-hand side. It is the same idea as it would
be for a normal (algebraic) equation of just 𝑥 and 𝑡. We claim that
\[ x = x(t) = \cos t + \sin t \]
is a solution. How do we check? We simply plug 𝑥 into equation (1)! First we need to
compute $\frac{dx}{dt}$. We find that $\frac{dx}{dt} = -\sin t + \cos t$. Now let us compute the left-hand side of (1).
\[ \frac{dx}{dt} + x = \underbrace{(-\sin t + \cos t)}_{\frac{dx}{dt}} + \underbrace{(\cos t + \sin t)}_{x} = 2 \cos t . \]

Yay! We got precisely the right-hand side. But there is more! We claim $x = \cos t + \sin t + e^{-t}$
is also a solution. Let us try,
\[ \frac{dx}{dt} = -\sin t + \cos t - e^{-t} . \]
We plug into the left-hand side of (1)
\[ \frac{dx}{dt} + x = \underbrace{(-\sin t + \cos t - e^{-t})}_{\frac{dx}{dt}} + \underbrace{(\cos t + \sin t + e^{-t})}_{x} = 2 \cos t . \]

And it works yet again!


So there can be many different solutions. For this equation, all solutions can be written
in the form
\[ x = \cos t + \sin t + C e^{-t} , \]
for some constant 𝐶. Different constants 𝐶 will give different solutions, so there are really
infinitely many possible solutions. See Figure 1 for the graph of a few of these solutions.
We will see how we find these solutions a few lectures from now.
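
The hand check above can also be repeated numerically for several values of the constant. The following is a minimal sketch in Python, not part of the original notes: the function names, the step size `h`, the grid of 𝑡 values, and the sample constants are all illustrative assumptions, and the derivative is only approximated by a central difference, so the residual is expected to be tiny rather than exactly zero.

```python
import math

def x(t, C):
    """Candidate solution x(t) = cos t + sin t + C e^{-t}."""
    return math.cos(t) + math.sin(t) + C * math.exp(-t)

def residual(t, C, h=1e-6):
    """Left-hand side minus right-hand side of dx/dt + x = 2 cos t."""
    dxdt = (x(t + h, C) - x(t - h, C)) / (2.0 * h)  # central difference
    return dxdt + x(t, C) - 2.0 * math.cos(t)

# Largest residual on a grid of t in [0, 5] for a few arbitrary constants C.
worst_residuals = {}
for C in (-2.0, 0.0, 1.0, 3.5):
    worst_residuals[C] = max(abs(residual(0.1 * i, C)) for i in range(51))
    print(f"C = {C:5.1f}: max |residual| = {worst_residuals[C]:.2e}")
```

A small residual for every sampled 𝐶 is consistent with the claim, but it is only a sanity check on finitely many points; the algebraic verification is what actually shows the formula is a solution.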

Solving differential equations can be quite hard. There is no general method that solves
every differential equation. We will generally focus on how to get exact formulas for
solutions of certain differential equations, but we will also spend a little bit of time on
getting approximate solutions. And we will spend some time on understanding the equations
without solving them.

Figure 1: Few solutions of $\frac{dx}{dt} + x = 2 \cos t$.

Most of this book is dedicated to ordinary differential equations or ODEs, that is, equations
with only one independent variable, where derivatives are only with respect to this one
variable. If there are several independent variables, we get partial differential equations
or PDEs.
Even for ODEs, which are very well understood, it is not a simple question of turning
a crank to get solutions. When you can find exact solutions, they are usually preferable
to approximate solutions. It is important to understand how such solutions are found.
Although in real applications you will leave much of the actual calculations to computers,
you need to understand what they are doing. It is often necessary to simplify or transform
your equations into something that a computer can understand and solve. You may even
need to make certain assumptions and changes in your model to achieve this.
To be a successful engineer or scientist, you will be required to solve problems in your
job that you have never seen before. It is important to learn problem solving techniques, so
that you may apply those techniques to new problems. A common mistake is to expect to
learn some prescription for solving all the problems you will encounter in your later career.
This course is no exception.

0.2.3 Differential equations in practice


So how do we use differential equations in science and engineering? First, we have some
real-world problem we wish to understand. We make some simplifying assumptions and create
a mathematical model. That is, we translate the real-world situation into a set of differential
equations. Then we apply mathematics to get some sort of a mathematical solution. There is
still something left to do. We have to interpret the results. We have to figure out what the
mathematical solution says about the real-world problem we started with.

[Diagram: Real-world problem → (abstract) → Mathematical model → (solve) → Mathematical
solution → (interpret) → Real-world problem.]
Learning how to formulate the mathematical model and how to interpret the results is
what your physics and engineering classes do. In this course, we will focus mostly on the
mathematical analysis. Sometimes we will work with simple real-world examples so that
we have some intuition and motivation about what we are doing.
Let us look at an example of this process. One of the most basic differential equations
is the standard exponential growth model. Let 𝑃 denote the population of some bacteria
on a Petri dish. We assume that there is enough food and enough space. Then the rate
of growth of bacteria is proportional to the population—a larger population grows more
quickly. Let 𝑡 denote time (say in seconds) and 𝑃 the population. Our model is
\[ \frac{dP}{dt} = kP , \]
for some positive constant 𝑘 > 0.
Example 0.2.1: Suppose there are 100 bacteria at time 0 and 200 bacteria 10 seconds later.
How many bacteria will there be 1 minute from time 0 (in 60 seconds)?
First we need to solve the equation. We claim that a solution is given by
\[ P(t) = C e^{kt} , \]
where 𝐶 is a constant. Let us try:
\[ \frac{dP}{dt} = C k e^{kt} = kP . \]
And it really is a solution.
OK, now what? We do not know 𝐶, and we do not know 𝑘. But we know something.
We know 𝑃(0) = 100, and we know 𝑃(10) = 200. We plug these conditions in and see
what happens:
\[ 100 = P(0) = C e^{k0} = C , \]
\[ 200 = P(10) = 100 \, e^{k10} . \]
Therefore, $2 = e^{10k}$ or $k = \frac{\ln 2}{10} \approx 0.069$. So
\[ P(t) = 100 \, e^{(\ln 2) t / 10} \approx 100 \, e^{0.069 t} . \]
At one minute, 𝑡 = 60, the population is 𝑃(60) = 6400. See Figure 2.

Figure 2: Bacteria growth in the first 60 seconds.

Let us interpret the results. Does our solution mean that there must be exactly 6400
bacteria on the plate at 60s? No! We made assumptions that might not be true exactly, just
approximately. If our assumptions are reasonable, then there will be approximately 6400
bacteria. Also, in real life 𝑃 is a discrete quantity, not a real number. However, our model
has no problem saying that for example at 61 seconds, 𝑃(61) ≈ 6859.35.
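
The arithmetic in this example can be reproduced in a few lines. The sketch below is an illustration in Python and is not from the original notes; the names `P0`, `P10`, `k`, and `P` are assumptions chosen to mirror the symbols in the text.

```python
import math

# Data from Example 0.2.1: P(0) = 100 and P(10) = 200 (time in seconds).
P0 = 100.0
P10 = 200.0

# From 200 = 100 e^{10 k} we get k = ln(2) / 10.
k = math.log(P10 / P0) / 10.0

def P(t):
    """Population predicted by the model P(t) = P0 e^{k t}."""
    return P0 * math.exp(k * t)

print(f"k     = {k:.4f}")      # about 0.0693
print(f"P(60) = {P(60):.2f}")  # about 6400
print(f"P(61) = {P(61):.2f}")  # about 6859.35
```

The non-integer value at 61 seconds is the point made in the text: the model treats the population as a continuous quantity.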
Normally, the 𝑘 in $\frac{dP}{dt} = kP$ is given, and we want to solve the equation for different
initial conditions. What does that mean? Take 𝑘 = 1 for simplicity: We want to solve the
equation $\frac{dP}{dt} = P$ subject to 𝑃(0) = 1000 (the initial condition). Then the solution is (exercise)
\[ P(t) = 1000 \, e^{t} . \]
We call $P(t) = C e^{t}$ the general solution, as every solution of the equation can be written
in this form for some constant 𝐶. We need an initial condition to find out what 𝐶 is, in
order to find the particular solution we are looking for. Generally, when we say “particular
solution,” we just mean some solution.
In real life, parameters such as 𝑘 often must first be computed or estimated.
The example above shows how finding an analytic solution to the differential equation is
useful in finding these parameters.

0.2.4 Four fundamental equations


A few equations appear often and it is useful to just memorize what their solutions are. Let
us call them the four fundamental equations. Their solutions are reasonably easy to guess
by recalling properties of exponentials, sines, and cosines. They are also simple to check,
which is something that you should always do. No need to wonder if you remembered the
solution correctly.
The first such equation is
\[ \frac{dy}{dx} = k y , \]
for some constant 𝑘 > 0. Here 𝑦 is the dependent and 𝑥 the independent variable. The
general solution for this equation is
\[ y(x) = C e^{kx} . \]
We saw above that this function is a solution, although we used different variable names.
Next,
\[ \frac{dy}{dx} = -k y , \]
for some constant 𝑘 > 0. The general solution for this equation is
\[ y(x) = C e^{-kx} . \]

Exercise 0.2.1: Check that the 𝑦 given is really a solution to the equation above.
Next, take the second-order differential equation
\[ \frac{d^2 y}{dx^2} = -k^2 y , \]
for some constant 𝑘 > 0. The general solution for this equation is
\[ y(x) = C_1 \cos(kx) + C_2 \sin(kx) . \]
Since the equation is a second-order differential equation, we have two constants in our
general solution.
Exercise 0.2.2: Check that the 𝑦 given is really a solution to the equation above.
Finally, consider the second-order differential equation
\[ \frac{d^2 y}{dx^2} = k^2 y , \]
for some constant 𝑘 > 0. The general solution for this equation is
\[ y(x) = C_1 e^{kx} + C_2 e^{-kx} , \]
or
\[ y(x) = D_1 \cosh(kx) + D_2 \sinh(kx) . \]
For those who do not know, cosh and sinh are defined by
\[ \cosh x = \frac{e^{x} + e^{-x}}{2} , \qquad \sinh x = \frac{e^{x} - e^{-x}}{2} . \]
They are called the hyperbolic cosine and hyperbolic sine. These functions are sometimes
easier to work with than exponentials. They have some nice familiar properties such as
cosh 0 = 1, sinh 0 = 0, and $\frac{d}{dx} \cosh x = \sinh x$ (no, that is not a typo) and $\frac{d}{dx} \sinh x = \cosh x$.
Exercise 0.2.3: Check that both forms of the 𝑦 given are really solutions to the equation above.
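
Before doing the exercise by hand, it can help to see numerically that the two forms describe the same family of functions. The sketch below is an illustration in Python and not part of the original notes. It assumes the relation $D_1 = C_1 + C_2$, $D_2 = C_1 - C_2$, which follows from the definitions of cosh and sinh; the function names, the sample values of 𝑘, 𝐶1, 𝐶2, and the finite-difference step are arbitrary choices. It is a sanity check on a few points, not a proof.

```python
import math

def y_exp(x, k, C1, C2):
    """Exponential form: C1 e^{kx} + C2 e^{-kx}."""
    return C1 * math.exp(k * x) + C2 * math.exp(-k * x)

def y_hyp(x, k, D1, D2):
    """Hyperbolic form: D1 cosh(kx) + D2 sinh(kx)."""
    return D1 * math.cosh(k * x) + D2 * math.sinh(k * x)

def second_derivative(f, x, h=1e-4):
    """Central second-difference approximation of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)

k, C1, C2 = 1.5, 2.0, -0.5      # arbitrary sample values
D1, D2 = C1 + C2, C1 - C2       # matching constants for the hyperbolic form

max_gap = 0.0    # largest difference between the two forms
max_resid = 0.0  # largest value of |y'' - k^2 y| for the hyperbolic form
for i in range(5):
    x = 0.25 * i
    f = lambda s: y_hyp(s, k, D1, D2)
    max_gap = max(max_gap, abs(y_exp(x, k, C1, C2) - f(x)))
    max_resid = max(max_resid, abs(second_derivative(f, x) - k * k * f(x)))

print(f"max difference between forms: {max_gap:.2e}")
print(f"max |y'' - k^2 y|:            {max_resid:.2e}")
```

Both printed numbers should be close to zero, limited only by rounding and by the finite-difference approximation.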
Example 0.2.2: In equations of higher order, you get more constants you must solve for to
get a particular solution and hence you need more initial conditions. The equation $\frac{d^2 y}{dx^2} = 0$
has the general solution $y = C_1 x + C_2$; simply integrate twice and don’t forget about the
constant of integration. Consider the initial conditions 𝑦(0) = 2 and 𝑦′(0) = 3. We plug in
our general solution and solve for the constants:
\[ 2 = y(0) = C_1 \cdot 0 + C_2 = C_2 , \qquad 3 = y'(0) = C_1 . \]
In other words, 𝑦 = 3𝑥 + 2 is the particular solution we seek.
An interesting note about cosh: The graph of cosh is the exact shape of a hanging chain.
This shape is called a catenary. Contrary to popular belief, this is not a parabola. If you
invert the graph of cosh, it is also the ideal arch for supporting its weight. For example,
the Gateway Arch in Saint Louis is an inverted graph of cosh—if it were just a parabola it
might fall. The formula used in the design is, per the National Park Service:
\[ y = -127.7 \ \text{ft} \cdot \cosh \left( \frac{x}{127.7 \ \text{ft}} \right) + 757.7 \ \text{ft} . \]
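
To get a feel for the formula, one can evaluate it at the center line and find where the curve meets the ground level 𝑦 = 0. The following is a small sketch in Python, not part of the original notes; the function name `arch_height_ft` and the choice to solve for 𝑦 = 0 with the inverse hyperbolic cosine are illustrative assumptions, and the numbers only describe the formula as quoted above.

```python
import math

A_FT = 127.7   # scale constant from the quoted formula, in feet
B_FT = 757.7   # vertical offset from the quoted formula, in feet

def arch_height_ft(x_ft):
    """Height y (ft) at horizontal position x (ft): y = -A cosh(x / A) + B."""
    return -A_FT * math.cosh(x_ft / A_FT) + B_FT

# Height on the center line x = 0, where cosh(0) = 1.
top = arch_height_ft(0.0)

# The curve reaches y = 0 where cosh(x / A) = B / A, that is x = A * acosh(B / A).
half_width = A_FT * math.acosh(B_FT / A_FT)

print(f"height at x = 0:      {top:.1f} ft")
print(f"y = 0 is reached at x = +/- {half_width:.1f} ft")
```

By the formula, the height at the center is 757.7 − 127.7 = 630 ft, and the curve is symmetric about 𝑥 = 0 because cosh is an even function.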

0.2.5 Exercises
Exercise 0.2.4: Show that $x = e^{4t}$ is a solution to $x''' - 12x'' + 48x' - 64x = 0$.

Exercise 0.2.5: Show that $x = e^{t}$ is not a solution to $x''' - 12x'' + 48x' - 64x = 0$.

Exercise 0.2.6: Is $y = \sin t$ a solution to $\left( \frac{dy}{dt} \right)^2 = 1 - y^2$? Justify.

Exercise 0.2.7: Let $y'' + 2y' - 8y = 0$. Now try a solution of the form $y = e^{rx}$ for some (unknown)
constant $r$. Is this a solution for some $r$? If so, find all such $r$.

Exercise 0.2.8: Verify that $x = C e^{-2t}$ is a solution to $x' = -2x$. Find $C$ to solve for the initial
condition $x(0) = 100$.

Exercise 0.2.9: Verify that $x = C_1 e^{-t} + C_2 e^{2t}$ is a solution to $x'' - x' - 2x = 0$. Find $C_1$ and $C_2$
to solve for the initial conditions $x(0) = 10$ and $x'(0) = 0$.

Exercise 0.2.10: Find a solution to $(x')^2 + x^2 = 4$ using your knowledge of derivatives of functions
that you know from basic calculus.

Exercise 0.2.11: Solve:
a) $\frac{dA}{dt} = -10A$, $A(0) = 5$
b) $\frac{dH}{dx} = 3H$, $H(0) = 1$
c) $\frac{d^2 y}{dx^2} = 4y$, $y(0) = 0$, $y'(0) = 1$
d) $\frac{d^2 x}{dy^2} = -9x$, $x(0) = 1$, $x'(0) = 0$

Exercise 0.2.12: Is there a solution to $y' = y$, such that $y(0) = y(1)$?

Exercise 0.2.13: The population of city X was 100 thousand 20 years ago, and the population of
city X was 120 thousand 10 years ago. Assuming constant growth, you can use the exponential
population model (like for the bacteria). What do you estimate the population is now?

Exercise 0.2.14: Suppose that a football coach gets a salary of one million dollars now, and a raise
of 10% every year (so exponential model, like population of bacteria). Let $s$ be the salary in millions
of dollars, and let $t$ be the time in years.
a) What are $s(0)$ and $s(1)$?
b) Approximately how many years will it take for the salary to be 10 million?
c) Approximately how many years will it take for the salary to be 20 million?
d) Approximately how many years will it take for the salary to be 30 million?

Note: Exercises with numbers 101 and higher have solutions in the back of the book.

Exercise 0.2.101: Show that $x = e^{-2t}$ is a solution to $x'' + 4x' + 4x = 0$.

Exercise 0.2.102: Is $y = x^2$ a solution to $x^2 y'' - 2y = 0$? Justify.

Exercise 0.2.103: Let $x y'' - y' = 0$. Try a solution of the form $y = x^{r}$. Is this a solution for some $r$?
If so, find all such $r$.

Exercise 0.2.104: Verify that $x = C_1 e^{t} + C_2$ is a solution to $x'' - x' = 0$. Find $C_1$ and $C_2$ so that $x$
satisfies $x(0) = 10$ and $x'(0) = 100$.

Exercise 0.2.105: Solve $\frac{d\varphi}{ds} = 8\varphi$ and $\varphi(0) = -9$.

Exercise 0.2.106: Solve:
a) $\frac{dx}{dt} = -4x$, $x(0) = 9$
b) $\frac{d^2 x}{dt^2} = -4x$, $x(0) = 1$, $x'(0) = 2$
c) $\frac{dp}{dq} = 3p$, $p(0) = 4$
d) $\frac{d^2 T}{dx^2} = 4T$, $T(0) = 0$, $T'(0) = 6$

0.3 Classification of differential equations


Note: less than 1 lecture or left as reading, §1.3 in [BD]
There are many types of differential equations, and we classify them into different
categories based on their properties. Let us quickly go over the most basic classification.
We already saw the distinction between ordinary and partial differential equations:
• Ordinary differential equations (ODE) are equations where the derivatives are taken
with respect to only one variable. That is, there is only one independent variable.

• Partial differential equations (PDE) are equations that depend on partial derivatives of
several variables. That is, there are several independent variables.
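Let us see some examples of ordinary differential equations:
\[ \frac{dy}{dt} = k y , \qquad \text{(Exponential growth)} \]
\[ \frac{dy}{dt} = k(A - y) , \qquad \text{(Newton’s law of cooling)} \]
\[ m \frac{d^2 x}{dt^2} + c \frac{dx}{dt} + kx = f(t) . \qquad \text{(Mechanical vibrations)} \]
And of partial differential equations:
\[ \frac{\partial y}{\partial t} + c \frac{\partial y}{\partial x} = 0 , \qquad \text{(Transport equation)} \]
\[ \frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2} , \qquad \text{(Heat equation)} \]
\[ \frac{\partial^2 u}{\partial t^2} = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} . \qquad \text{(Wave equation in 2 dimensions)} \]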
If there are several equations working together, we have a so-called system of differential
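equations. For example,
\[ y' = x , \qquad x' = y \]
is a system of ordinary differential equations. Maxwell’s equations for electromagnetics,
\[ \nabla \cdot \vec{D} = \rho , \qquad \nabla \cdot \vec{B} = 0 , \]
\[ \nabla \times \vec{E} = - \frac{\partial \vec{B}}{\partial t} , \qquad \nabla \times \vec{H} = \vec{J} + \frac{\partial \vec{D}}{\partial t} , \]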
are a system of partial differential equations. The divergence operator ∇· and the curl
operator ∇× can be written out in partial derivatives of the functions involved in the 𝑥, 𝑦,
and 𝑧 variables.
The next bit of information is the order of the equation (or system). The order is the
order of the largest derivative that appears. If the highest derivative that appears is the first
18 INTRODUCTION

derivative, the equation is of first order. If the highest derivative that appears is the second
derivative, then the equation is of second order. For example, Newton’s law of cooling
above is a first-order equation, while the mechanical vibrations equation is a second-order
equation. The equation governing transversal vibrations in a beam,

4𝜕
4𝑦𝜕2 𝑦
𝑎 + = 0,
𝜕𝑥 4 𝜕𝑡 2
is a fourth-order partial differential equation. It is of fourth order as at least one derivative
is the fourth derivative. It does not matter that the derivative in 𝑡 is only of second order.
In the first chapter, we will start attacking first-order ordinary differential equations,
that is, equations of the form $\frac{dy}{dx} = f(x, y)$. In general, lower-order equations are easier to
work with and have simpler behavior, which is why we start with them.
We also distinguish how the dependent variables appear in the equation (or system). In
particular, an equation is linear if the dependent variable (or variables) and their derivatives
appear linearly, that is, only as first powers, they are not multiplied together, and no other
functions of the dependent variables appear. The equation is a sum of terms, where each
term is a function of the independent variables or a function of the independent variables
multiplied by a dependent variable or one of its derivatives. An ordinary differential
equation is linear if it can be put into the form

\[ a_n(x) \frac{d^n y}{dx^n} + a_{n-1}(x) \frac{d^{n-1} y}{dx^{n-1}} + \cdots + a_1(x) \frac{dy}{dx} + a_0(x) y = b(x). \tag{2} \]
The functions 𝑎 0 , 𝑎 1 , . . . , 𝑎 𝑛 are called the coefficients. The equation is allowed to depend
arbitrarily on the independent variable. So

\[ e^x \frac{d^2 y}{dx^2} + \sin(x) \frac{dy}{dx} + x^2 y = \frac{1}{x} \tag{3} \]
is a linear equation as 𝑦 and its derivatives only appear linearly.
All the equations and systems given as examples above are linear. Linearity may not be
immediately obvious for Maxwell’s equations unless you write out the divergence and curl
in terms of partial derivatives. If an equation is not linear, we say it is nonlinear. Let us see
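some nonlinear equations. For example, Burgers' equation,

\[ \frac{\partial y}{\partial t} + y \frac{\partial y}{\partial x} = \nu \frac{\partial^2 y}{\partial x^2}, \]
is a nonlinear second-order partial differential equation. It is nonlinear because $y$ and $\frac{\partial y}{\partial x}$
are multiplied together. The equation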

\[ \frac{dx}{dt} = x^2 \tag{4} \]
is a nonlinear first-order differential equation as there is a second power of the dependent
variable 𝑥. Another nonlinear ODE is the pendulum equation
𝜃′′ + sin(𝜃) = 0, (5)
which is nonlinear as the dependent variable 𝜃 appears inside a sin function. Nonlinear
equations are notoriously difficult to solve and their solutions may behave in strange and
unexpected ways. Perhaps you have heard of chaos theory and the butterflies in the
Amazon causing hurricanes in the Atlantic, all due to nonlinear equations. So sometimes
we study related linear equations, such as 𝜃′′ + 𝜃 = 0 for the pendulum, instead.
A linear equation is further called homogeneous if all terms depend on the dependent
variable. That is, if no term is a function of the independent variables alone. Otherwise,
the equation is called nonhomogeneous or inhomogeneous. For example, the exponential
growth equation, the wave equation, or the transport equation above are homogeneous.
The mechanical vibrations equation above is nonhomogeneous as long as 𝑓 (𝑡) is not the
zero function. Similarly, if the ambient temperature 𝐴 is nonzero, Newton’s law of cooling
is nonhomogeneous. A homogeneous linear ODE can be put into the form
\[ a_n(x) \frac{d^n y}{dx^n} + a_{n-1}(x) \frac{d^{n-1} y}{dx^{n-1}} + \cdots + a_1(x) \frac{dy}{dx} + a_0(x) y = 0. \]
Compare with (2) and notice there is no function 𝑏(𝑥).
If the coefficients of a linear equation are actually constant functions, then the equation
is said to have constant coefficients. The coefficients are the functions multiplying the
dependent variable(s) or one of its derivatives, not the function 𝑏(𝑥) standing alone. A
constant-coefficient nonhomogeneous ODE is an equation of the form
\[ a_n \frac{d^n y}{dx^n} + a_{n-1} \frac{d^{n-1} y}{dx^{n-1}} + \cdots + a_1 \frac{dy}{dx} + a_0 y = b(x), \]
where 𝑎0 , 𝑎1 , . . . , 𝑎 𝑛 are all constants, but 𝑏 may depend on the independent variable
𝑥. The mechanical vibrations equation above is a constant-coefficient nonhomogeneous
second-order ODE. The same nomenclature applies to PDEs, so the transport equation,
heat equation and wave equation are all examples of constant-coefficient linear PDEs. A
linear equation whose coefficients are not constants is sometimes called a variable-coefficient
equation.
Finally, an equation (or system) is called autonomous if the equation does not depend on
the independent variable. For autonomous ordinary differential equations, the independent
variable is then thought of as time. An autonomous equation means an equation that
does not change with time. For example, Newton’s law of cooling is autonomous, so are
the equations (4) and (5). On the other hand, mechanical vibrations (as long as 𝑓 (𝑡) is
nonconstant) or (3) are not autonomous. A general first-order autonomous ODE would
have the form
\[ \frac{dx}{dt} = f(x). \]
0.3.1 Exercises
Exercise 0.3.1: Classify the following equations. Are they ODEs or PDEs? Is it an equation
or a system? What is the order? Is it linear or nonlinear, and if it is linear, is it homogeneous,
constant-coefficient? If it is an ODE, is it autonomous?

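a) $\sin(t) \frac{d^2 x}{dt^2} + \cos(t) x = t^2$
b) $\frac{\partial u}{\partial x} + 3 \frac{\partial u}{\partial y} = xy$
c) $y'' + 3y + 5x = 0$, $x'' + x - y = 0$
d) $\frac{\partial^2 u}{\partial t^2} + u \frac{\partial^2 u}{\partial s^2} = 0$
e) $x'' + t x^2 = t$
f) $\frac{d^4 x}{dt^4} = 0$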
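Exercise 0.3.2: If $\vec{u} = (u_1, u_2, u_3)$ is a vector, we have the divergence
\[ \nabla \cdot \vec{u} = \frac{\partial u_1}{\partial x} + \frac{\partial u_2}{\partial y} + \frac{\partial u_3}{\partial z} \]
and curl
\[ \nabla \times \vec{u} = \left( \frac{\partial u_3}{\partial y} - \frac{\partial u_2}{\partial z}, \; \frac{\partial u_1}{\partial z} - \frac{\partial u_3}{\partial x}, \; \frac{\partial u_2}{\partial x} - \frac{\partial u_1}{\partial y} \right). \]
Notice that curl of a vector is still a vector. Write
out Maxwell’s equations in terms of partial derivatives and classify the system.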

Exercise 0.3.3: Suppose 𝐹 is a linear function, that is, 𝐹(𝑥, 𝑦) = 𝑎𝑥 + 𝑏𝑦 for constants 𝑎 and 𝑏.
What is the classification of equations of the form 𝐹(𝑦 ′ , 𝑦) = 0?

Exercise 0.3.4: Write down an explicit example of a third-order, linear, variable-coefficient (i.e.
not constant-coefficient), nonautonomous, nonhomogeneous system of two ODEs such that every
derivative that could appear, does appear.

Exercise 0.3.101: Classify the following equations. Are they ODEs or PDEs? Is it an equation
or a system? What is the order? Is it linear or nonlinear, and if it is linear, is it homogeneous,
constant-coefficient? If it is an ODE, is it autonomous?

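a) $\frac{\partial^2 v}{\partial x^2} + 3 \frac{\partial^2 v}{\partial y^2} = \sin(x)$
b) $\frac{dx}{dt} + \cos(t) x = t^2 + t + 1$
c) $\frac{d^7 F}{dx^7} = 3F(x)$
d) $y'' + 8y' = 1$
e) $x'' + t y x' = 0$, $y'' + t x y = 0$
f) $\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial s^2} + u^2$
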
Exercise 0.3.102: Write down the general zeroth-order linear ordinary differential equation. Write
down the general solution.

Exercise 0.3.103: For which $k$ is $\frac{dx}{dt} + x^k = t^{k+2}$ linear? Hint: There are two answers.

Chapter 1

First-order equations

1.1 Integrals as solutions


Note: 1 lecture (or less), §1.2 in [EP], covered in §1.2 and §2.1 in [BD]
A first-order ODE is an equation of the form
\[ \frac{dy}{dx} = f(x, y), \]
or just
𝑦 ′ = 𝑓 (𝑥, 𝑦).
In general, there is no simple formula or procedure one can follow to find solutions. In the
next few lectures, we will look at cases where solutions are not difficult to obtain. In this
section, we consider the case when 𝑓 is a function of 𝑥 alone, that is, the equation is

𝑦 ′ = 𝑓 (𝑥). (1.1)

We can integrate (antidifferentiate) both sides of the equation with respect to 𝑥:


\[ \int y'(x) \, dx = \int f(x) \, dx + C, \]

that is,
\[ y(x) = \int f(x) \, dx + C. \]

This 𝑦(𝑥) is actually the general solution. So to solve (1.1), we find some antiderivative of
𝑓 (𝑥) and then we add an arbitrary constant to get the general solution.
Example 1.1.1: Find the general solution of 𝑦 ′ = 3𝑥 2 .
Elementary calculus tells us that the general solution must be 𝑦 = 𝑥 3 + 𝐶. Let us check
by differentiating: 𝑦 ′ = 3𝑥 2 . We got precisely our equation back.
Now is a good time to discuss a point about calculus notation and terminology.
Calculus textbooks muddy the waters by talking about the integral as primarily the
so-called indefinite integral. The indefinite integral is really the antiderivative (in fact the
whole one-parameter family of antiderivatives). There really exists only one integral and
that is the definite integral. The only reason for the indefinite integral notation is that
we can always write an antiderivative as a definite integral. That is, by the fundamental
theorem of calculus, we can always write $\int f(x) \, dx + C$ as
\[ \int_{x_0}^{x} f(t) \, dt + C. \]

Hence the terminology to integrate when we may really mean to antidifferentiate. Integration
is just one way to compute the antiderivative (and it is a way that always works, see the
following examples). Integration is defined as the area under the graph; it only happens to
also compute antiderivatives. For the sake of consistency, we will keep using the indefinite
integral notation when we want an antiderivative, and you should always think of the
definite integral as a way to write it.
Normally, we also have an initial condition such as 𝑦(𝑥 0 ) = 𝑦0 for some two numbers 𝑥0
and 𝑦0 (𝑥0 is often 0, but not always). We can then write the solution as a definite integral
in a nice way. Suppose our problem is 𝑦 ′ = 𝑓 (𝑥), 𝑦(𝑥0 ) = 𝑦0 . Then the solution is
\[ y(x) = \int_{x_0}^{x} f(t) \, dt + y_0. \tag{1.2} \]

Let us check! We compute 𝑦 ′ = 𝑓 (𝑥) via the fundamental theorem of calculus, and by Jupiter,
𝑦 is a solution. Is it the one satisfying the initial condition? Well, $y(x_0) = \int_{x_0}^{x_0} f(t) \, dt + y_0 = y_0$.
It is!
Do note that the definite integral and the indefinite integral (antidifferentiation) are
completely different beasts. The definite integral always evaluates to a number. Therefore,
(1.2) is a formula we can plug into the calculator or a computer, and it will be happy to
calculate specific values for us. We will easily be able to plot the solution and work with it
just like with any other function. It is not so crucial to always find a closed form for the
antiderivative.
Example 1.1.2: Solve
\[ y' = e^{-x^2}, \qquad y(0) = 1. \]

By the preceding discussion, the solution must be


\[ y(x) = \int_{0}^{x} e^{-t^2} \, dt + 1. \]

Here is a good way to make fun of your friends taking second-semester calculus. Tell them
to find the closed form solution. Ha ha ha (bad math joke). It is not possible (in closed
form). There is absolutely nothing wrong with writing the solution as a definite integral.
This particular integral is in fact very important in statistics.
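
As a rough illustration of how a definite-integral solution can be evaluated on a computer, the sketch below approximates $y(x) = 1 + \int_0^x e^{-t^2} \, dt$ with composite Simpson's rule. The helper names `simpson` and `y` and the choice of Simpson's rule are assumptions made for this example, not part of the text. The cross-check uses the standard identity $\int_0^x e^{-t^2} \, dt = \frac{\sqrt{\pi}}{2} \operatorname{erf}(x)$.

```python
import math

def simpson(f, a, b, n=200):
    """Approximate the integral of f over [a, b] with composite Simpson's rule."""
    if n % 2:
        n += 1  # Simpson's rule needs an even number of subintervals
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        # Interior points alternate between weight 4 (odd i) and weight 2 (even i).
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def y(x):
    """Solution of y' = exp(-x^2), y(0) = 1, written as a definite integral."""
    return 1.0 + simpson(lambda t: math.exp(-t * t), 0.0, x)

if __name__ == "__main__":
    for x in (0.5, 1.0, 2.0):
        # Compare against the error function available in the standard library.
        reference = 1.0 + math.sqrt(math.pi) / 2 * math.erf(x)
        print(x, y(x), reference)
```

The two printed columns should agree to many digits, which is the sense in which the definite integral is a perfectly usable formula for the solution.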
Using this method, we can also solve equations of the form

𝑦 ′ = 𝑓 (𝑦).

We write the equation in Leibniz notation:

\[ \frac{dy}{dx} = f(y). \]
Now we use the inverse function theorem from calculus to switch the roles of 𝑥 and 𝑦 to
obtain
\[ \frac{dx}{dy} = \frac{1}{f(y)}. \]
What we are doing seems like algebra with 𝑑𝑥 and 𝑑𝑦. It is tempting to just do algebra with
𝑑𝑥 and 𝑑𝑦 as if they were numbers. And in this case it does work. Be careful, however, as
this sort of hand-waving calculation can lead to trouble, especially when more than one
independent variable is involved. At this point, we can simply integrate with respect to 𝑦,

\[ x(y) = \int \frac{1}{f(y)} \, dy + C. \]

Finally, we try to solve for 𝑦.


Example 1.1.3: Previously, we guessed 𝑦 ′ = 𝑘 𝑦 (for some 𝑘 > 0) has the solution 𝑦 = 𝐶𝑒 𝑘𝑥 .
We could have found the solution by integrating. First we note that 𝑦 = 0 is a solution.
Henceforth, we assume 𝑦 ≠ 0. We write

\[ \frac{dx}{dy} = \frac{1}{ky}. \]

We integrate in 𝑦 to obtain
\[ x(y) = x = \frac{1}{k} \ln |y| + D, \]
where 𝐷 is an arbitrary constant. Now we solve for 𝑦 (actually for |𝑦|).

|𝑦| = 𝑒 𝑘𝑥−𝑘𝐷 = 𝑒 −𝑘𝐷 𝑒 𝑘𝑥 .

If we replace 𝑒 −𝑘𝐷 with an arbitrary constant 𝐶, we can get rid of the absolute value bars
(which we can do as 𝐷 was arbitrary). In this way, we also incorporate the solution 𝑦 = 0.
We get the same general solution as we guessed before, 𝑦 = 𝐶𝑒 𝑘𝑥 .
Example 1.1.4: Find the general solution of 𝑦 ′ = 𝑦 2 .
First we note that 𝑦 = 0 is a solution. We can now assume that 𝑦 ≠ 0. Write

\[ \frac{dx}{dy} = \frac{1}{y^2}. \]
We integrate to get
\[ x = \frac{-1}{y} + C. \]
We solve for $y = \frac{1}{C - x}$. So the general solution is
\[ y = \frac{1}{C - x} \qquad \text{together with} \quad y = 0. \]
Note the singularities of the solution. If, for example, 𝐶 = 1, then the solution “blows up”
as we approach 𝑥 = 1. See Figure 1.1. Generally, it is hard to tell from just looking at the
equation itself how the solution is going to behave. The equation 𝑦 ′ = 𝑦 2 is very nice and
defined everywhere, but the solution is only defined on the interval (−∞, 𝐶) or (𝐶, ∞).
Usually when this happens, we only consider the solution on one of these intervals and not
both. For example, if we impose an initial condition 𝑦(0) = 1, then the solution is $y = \frac{1}{1 - x}$,
and we would consider this solution only for 𝑥 on the interval (−∞, 1). In the figure, it is
the left side of the graph.

Figure 1.1: Plot of $y = \frac{1}{1 - x}$.

Classical problems leading to differential equations solvable by integration are problems


dealing with velocity, acceleration, and distance. You have surely seen these problems
before in your calculus class.
Example 1.1.5: Suppose a car drives at a speed of 𝑒 𝑡/2 meters per second, where 𝑡 is time
in seconds. How far did the car get in 2 seconds (starting at 𝑡 = 0)? How far in 10 seconds?
Let 𝑥 denote the distance the car traveled. The equation is

𝑥 ′ = 𝑒 𝑡/2 .

We just integrate this equation to get that

𝑥(𝑡) = 2𝑒 𝑡/2 + 𝐶.
We still need to figure out 𝐶. We know that when 𝑡 = 0, then 𝑥 = 0. That is, 𝑥(0) = 0. So

0 = 𝑥(0) = 2𝑒 0/2 + 𝐶 = 2 + 𝐶.

Thus 𝐶 = −2 and
𝑥(𝑡) = 2𝑒 𝑡/2 − 2.
Now we just plug in to get where the car is at 2 and at 10 seconds. We obtain

\[ x(2) = 2e^{2/2} - 2 \approx 3.44 \text{ meters}, \qquad x(10) = 2e^{10/2} - 2 \approx 294.83 \text{ meters}. \]
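
The arithmetic in this example can be checked with a few lines of Python. This is only a sketch; the function name `position` is chosen here for illustration.

```python
import math

def position(t):
    """Distance traveled x(t) = 2 e^{t/2} - 2, from x' = e^{t/2} with x(0) = 0."""
    return 2.0 * math.exp(t / 2.0) - 2.0

print(round(position(2), 2))    # distance in meters after 2 seconds
print(round(position(10), 2))   # distance in meters after 10 seconds
```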

Example 1.1.6: Suppose that the car accelerates at 𝑡 2 m/s2 . At time 𝑡 = 0 the car is at the 1
meter mark and is traveling at 10 m/s. Where is the car at time 𝑡 = 10?
Well, this is actually a second-order problem. If 𝑥 is the distance traveled, then 𝑥 ′ is the
velocity, and 𝑥 ′′ is the acceleration. The equation with initial conditions is

𝑥 ′′ = 𝑡 2 , 𝑥(0) = 1, 𝑥 ′(0) = 10.

We can add a new dependent variable 𝑣 and declare that 𝑥 ′ = 𝑣. Then we solve

𝑣′ = 𝑡2 , 𝑣(0) = 10.

Once we find 𝑣, we solve 𝑥 ′ = 𝑣, 𝑥(0) = 1 to find 𝑥. We leave the integration to the reader.

Exercise 1.1.1: Solve for 𝑣, then solve for 𝑥. Find 𝑥(10) to answer the question.

1.1.1 Exercises
Exercise 1.1.2: Solve $\frac{dy}{dx} = x^2 + x$ for $y(1) = 3$.

Exercise 1.1.3: Solve $\frac{dy}{dx} = \sin(5x)$ for $y(0) = 2$.

Exercise 1.1.4: Solve $\frac{dy}{dx} = \frac{1}{x^2 - 1}$ for $y(0) = 0$.

Exercise 1.1.5: Solve 𝑦 ′ = 𝑦 3 for 𝑦(0) = 1.

Exercise 1.1.6 (little harder): Solve 𝑦 ′ = (𝑦 − 1)(𝑦 + 1) for 𝑦(0) = 3.


Exercise 1.1.7: Solve $\frac{dy}{dx} = \frac{1}{y + 1}$ for $y(0) = 0$.

Exercise 1.1.8 (harder): Solve 𝑦 ′′ = sin 𝑥 for 𝑦(0) = 0, 𝑦 ′(0) = 2.

Exercise 1.1.9: A spaceship is traveling at the speed 2𝑡 2 + 1 km/s (𝑡 is time in seconds). It is pointing
directly away from earth and at time 𝑡 = 0 it is 1000 kilometers from earth. How far from earth is it
at one minute from time 𝑡 = 0?

Exercise 1.1.10: Solve $\frac{dx}{dt} = \sin(t^2) + t$, $x(0) = 20$. It is OK to leave your answer as a definite
integral.
Exercise 1.1.11: A dropped ball accelerates downwards at a constant rate 9.8 meters per second
squared. Set up the differential equation for the height above ground ℎ in meters. Then supposing
ℎ(0) = 100 meters, how long does it take for the ball to hit the ground?

Exercise 1.1.12: Find the general solution of 𝑦 ′ = 𝑒 𝑥 , and then 𝑦 ′ = 𝑒 𝑦 .


Exercise 1.1.101: Solve $\frac{dy}{dx} = e^x + x$ and $y(0) = 10$.

Exercise 1.1.102: Solve $x' = \frac{1}{x^2}$, $x(1) = 1$.

Exercise 1.1.103: Solve $x' = \frac{1}{\cos(x)}$, $x(0) = \frac{\pi}{4}$.

Exercise 1.1.104: Sid is in a car traveling at speed 10𝑡 + 70 miles per hour away from Las Vegas,
where 𝑡 is in hours. At 𝑡 = 0, Sid is 10 miles away from Vegas. How far from Vegas is Sid 2 hours
later?

Exercise 1.1.105: Solve 𝑦 ′ = 𝑦 𝑛 , 𝑦(0) = 1, where 𝑛 is a positive integer. Hint: You have to consider
different cases.

Exercise 1.1.106: The rate of change of the volume of a snowball that is melting is proportional
to the surface area of the snowball. Suppose the snowball is perfectly spherical. The volume (in
centimeters cubed) of a ball of radius 𝑟 centimeters is (4/3)𝜋𝑟 3 . The surface area is 4𝜋𝑟 2 . Set up the
differential equation for how the radius 𝑟 is changing. Then, suppose that at time 𝑡 = 0 minutes,
the radius is 10 centimeters. After 5 minutes, the radius is 8 centimeters. At what time 𝑡 will the
snowball be completely melted?

Exercise 1.1.107: Find the general solution to 𝑦 ′′′′ = 0. How many distinct constants do you need?
1.2 Slope fields


Note: 1 lecture, §1.3 in [EP], §1.1 in [BD]
As we said, the general first-order equation we are studying looks like

𝑦 ′ = 𝑓 (𝑥, 𝑦).

Frequently, we cannot simply solve these kinds of equations explicitly. It would be nice if
we could at least figure out the shape and behavior of the solutions, or find approximate
solutions.

1.2.1 Slope fields


The equation 𝑦 ′ = 𝑓 (𝑥, 𝑦) gives you a slope at each point in the (𝑥, 𝑦)-plane. And this is
the slope a solution 𝑦(𝑥) would have at 𝑥 if its value was 𝑦. In other words, 𝑓 (𝑥, 𝑦) is the
slope of a solution whose graph runs through the point (𝑥, 𝑦). At a point (𝑥, 𝑦), we draw
a short line with the slope 𝑓 (𝑥, 𝑦). For example, if 𝑓 (𝑥, 𝑦) = 𝑥𝑦, then at point (2, 1.5) we
draw a short line of slope 𝑥𝑦 = 2 × 1.5 = 3. If 𝑦(𝑥) is a solution and 𝑦(2) = 1.5, then the
equation mandates that 𝑦 ′(2) = 3. See Figure 1.2.

Figure 1.2: The slope 𝑦 ′ = 𝑥𝑦 at (2, 1.5).

To get an idea of how solutions behave, we draw such lines at lots of points in the plane,
not just the point (2, 1.5). We would ideally want to see the slope at every point, but that is
just not possible. Usually we pick a grid of points fine enough so that it shows the behavior,
but not too fine so that we can still recognize the individual lines. We call this picture
the slope field of the equation. See Figure 1.3 on the following page for the slope field of
the equation 𝑦 ′ = 𝑥𝑦. In practice, one does not do this by hand; a computer can do the
drawing.
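
One possible way to let a computer do the drawing is sketched below. It only computes the endpoints of the short line segments for 𝑦 ′ = 𝑥𝑦 on a grid; the function name `slope_field`, the grid spacing, and the segment length are assumptions made for this illustration.

```python
import math

def slope_field(f, xs, ys, length=0.3):
    """Return segments ((x1, y1), (x2, y2)) centered at each grid point with slope f(x, y)."""
    segments = []
    for x in xs:
        for y in ys:
            m = f(x, y)
            # Offsets along the direction (1, m), scaled so every segment has the same length.
            dx = 0.5 * length / math.sqrt(1.0 + m * m)
            dy = m * dx
            segments.append(((x - dx, y - dy), (x + dx, y + dy)))
    return segments

grid = [-3 + 0.5 * i for i in range(13)]   # -3, -2.5, ..., 3
segments = slope_field(lambda x, y: x * y, grid, grid)
print(len(segments), "segments; the first one is", segments[0])
```

Each pair of endpoints can then be handed to whatever plotting tool is available to draw one short line.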
Suppose we are given a specific initial condition 𝑦(𝑥 0 ) = 𝑦0 . A solution, that is, the
graph of the solution, would be a curve that follows the slopes we drew. For a few sample
solutions, see Figure 1.4. It is easy to roughly sketch (or at least imagine) possible solutions
in the slope field, just from looking at the slope field itself. You simply sketch a line that
roughly fits the little line segments and goes through your initial condition.

Figure 1.3: Slope field of 𝑦 ′ = 𝑥𝑦.

Figure 1.4: Slope field of 𝑦 ′ = 𝑥𝑦 with a graph of solutions satisfying 𝑦(0) = 0.2, 𝑦(0) = 0, and
𝑦(0) = −0.2.

By looking at the slope field we get a lot of information about the behavior of solutions
without having to solve the equation. For example, in Figure 1.4 we see what the solutions
do when the initial conditions are 𝑦(0) > 0, 𝑦(0) = 0 and 𝑦(0) < 0. A small change in the
initial condition causes quite different behavior. We see this behavior just from the slope
field and imagining what solutions ought to do.
We see a different behavior for the equation 𝑦 ′ = −𝑦. The slope field and a few solutions
are shown in Figure 1.5. If we think of moving from left to right (perhaps 𝑥 is
time and time is usually increasing), then we see that no matter what 𝑦(0) is, all solutions
tend to zero as 𝑥 tends to infinity. Again that behavior is clear from simply looking at the
slope field itself.
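
The decay toward zero can also be seen numerically by following the slopes in small steps. The sketch below uses Euler's method, which is an assumption of this illustration rather than something the text has introduced at this point; the function name `euler`, the step count, and the sample initial values are likewise chosen only for the example.

```python
def euler(f, x0, y0, x_end, n):
    """Approximate y(x_end) for y' = f(x, y), y(x0) = y0 using n Euler steps."""
    h = (x_end - x0) / n
    x, y = x0, y0
    for _ in range(n):
        y += h * f(x, y)   # move along the slope at the current point
        x += h
    return y

for y0 in (-2.0, 0.5, 3.0):
    # Whatever the starting value, the approximation at x = 10 is close to zero.
    print(y0, euler(lambda x, y: -y, 0.0, y0, 10.0, 1000))
```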

1.2.2 Existence and uniqueness


We wish to ask two fundamental questions about the problem

𝑦 ′ = 𝑓 (𝑥, 𝑦), 𝑦(𝑥 0 ) = 𝑦0 .

(i) Does a solution exist?

(ii) Is the solution unique (if it exists)?


Figure 1.5: Slope field of 𝑦 ′ = −𝑦 with a graph of a few solutions.

What do you think is the answer? The answer to both questions seems to be yes, does it
not? Well, it really is yes most of the time. But there are cases when the answer to either
question can be no.
Since the equations we encounter in applications come from real life situations, it seems
logical that a solution always exists. It also has to be unique if we believe our universe is
deterministic. If the solution does not exist, or if it is not unique, we have probably not
devised the correct model. Hence, it is good to know when things go wrong and why.
Example 1.2.1: Attempt to solve:

\[ y' = \frac{1}{x}, \qquad y(0) = 0. \]

Integrate to find the general solution 𝑦 = ln |𝑥| + 𝐶. The solution does not exist at 𝑥 = 0.
See Figure 1.6 on the following page. You may say one can see the division by zero a mile
away, but the equation may have been written as the seemingly harmless 𝑥𝑦 ′ = 1.
Example 1.2.2: Solve:
\[ y' = 2\sqrt{|y|}, \qquad y(0) = 0. \]

See Figure 1.7 on the next page. Note that 𝑦 = 0 is a solution. But another solution is
the function
\[ y(x) = \begin{cases} x^2 & \text{if } x \geq 0, \\ -x^2 & \text{if } x < 0. \end{cases} \]
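
A quick numerical sanity check that both functions solve $y' = 2\sqrt{|y|}$ with $y(0) = 0$ is sketched below. It compares a central-difference estimate of the derivative with the right-hand side; the function names and the step size `h` are assumptions made for this illustration.

```python
import math

def f(y):
    """Right-hand side of y' = 2 sqrt(|y|)."""
    return 2.0 * math.sqrt(abs(y))

def zero_solution(x):
    return 0.0

def piecewise_solution(x):
    return x * x if x >= 0 else -x * x

def residual(sol, x, h=1e-6):
    """Central-difference estimate of y'(x) minus f(y(x)); near zero if sol solves the ODE."""
    derivative = (sol(x + h) - sol(x - h)) / (2 * h)
    return derivative - f(sol(x))

for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
    print(x, residual(zero_solution, x), residual(piecewise_solution, x))
```

Both residual columns stay at the level of the finite-difference error, and both functions vanish at 𝑥 = 0, so the initial value problem really has more than one solution.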

It is hard to tell by staring at the slope field that the solution is not unique. Is there any
hope? Of course there is. We have the following theorem, known as Picard’s theorem*.
* Named after the French mathematician Charles Émile Picard (1856–1941).
Figure 1.6: Slope field of $y' = 1/x$.

Figure 1.7: Slope field of $y' = 2\sqrt{|y|}$ with two solutions satisfying $y(0) = 0$.

Theorem 1.2.1 (Picard’s theorem on existence and uniqueness). If 𝑓 (𝑥, 𝑦) is continuous (as a
function of two variables) and $\frac{\partial f}{\partial y}$ exists and is continuous near some (𝑥0 , 𝑦0 ), then a solution to

𝑦 ′ = 𝑓 (𝑥, 𝑦), 𝑦(𝑥 0 ) = 𝑦0 ,

exists (at least for 𝑥 in some small interval) and is unique.


Note that the problems $y' = 1/x$, $y(0) = 0$ and $y' = 2\sqrt{|y|}$, $y(0) = 0$ do not satisfy the
hypothesis of the theorem. Even if we can use the theorem, we ought to be careful about
this existence business. It is quite possible that the solution only exists for a short while.
Example 1.2.3: For some constant 𝐴, solve:

𝑦′ = 𝑦2 , 𝑦(0) = 𝐴.

We know how to solve this equation. First assume that 𝐴 ≠ 0, so 𝑦 is not equal to zero at
least for some 𝑥 near 0. So $x' = 1/y^2$, so $x = -1/y + C$, so $y = \frac{1}{C - x}$. If 𝑦(0) = 𝐴, then $C = 1/A$ so

\[ y = \frac{1}{1/A - x}. \]

If 𝐴 = 0, then 𝑦 = 0 is a solution.
For example, when 𝐴 = 1 the solution “blows up” at 𝑥 = 1. Hence, the solution does not
exist for all 𝑥 even if the equation itself is nice everywhere. The equation 𝑦 ′ = 𝑦 2 certainly
looks nice.
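
To watch the blow-up happen numerically, the sketch below integrates $y' = y^2$, $y(0) = A$ with $A = 1$ using the classical fourth-order Runge-Kutta method and compares the result with the exact solution $y = \frac{1}{1/A - x}$. The choice of method, the function name `rk4`, and the step count are assumptions of this illustration and do not come from the text.

```python
def rk4(f, x0, y0, x_end, n):
    """Approximate y(x_end) for y' = f(x, y), y(x0) = y0 using n Runge-Kutta (RK4) steps."""
    h = (x_end - x0) / n
    x, y = x0, y0
    for _ in range(n):
        k1 = f(x, y)
        k2 = f(x + h / 2, y + h * k1 / 2)
        k3 = f(x + h / 2, y + h * k2 / 2)
        k4 = f(x + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
    return y

A = 1.0
for x_end in (0.5, 0.9, 0.99):
    exact = 1.0 / (1.0 / A - x_end)
    # The values grow without bound as x_end approaches 1/A = 1.
    print(x_end, rk4(lambda x, y: y * y, 0.0, A, x_end, 10000), exact)
```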
For most of this course, we will be interested in equations where existence and
uniqueness hold, and in fact hold “globally” unlike for the equation 𝑦 ′ = 𝑦 2 .
1.2.3 Exercises
Exercise 1.2.1: Sketch slope field for 𝑦 ′ = 𝑒 𝑥−𝑦 . How do the solutions behave as 𝑥 grows? Can you
guess a particular solution by looking at the slope field?

Exercise 1.2.2: Sketch slope field for 𝑦 ′ = 𝑥 2 .

Exercise 1.2.3: Sketch slope field for 𝑦 ′ = 𝑦 2 .


Exercise 1.2.4: Is it possible to solve the equation $y' = \frac{xy}{\cos x}$ for $y(0) = 1$? Justify.

Exercise 1.2.5: Is it possible to solve the equation $y' = y\sqrt{|x|}$ for $y(0) = 0$? Is the solution unique?
Justify.

Exercise 1.2.6: Match equations 𝑦 ′ = 1 − 𝑥, 𝑦 ′ = 𝑥 − 2𝑦, 𝑦 ′ = 𝑥(1 − 𝑦) to slope fields. Justify.

a) b) c)

Exercise 1.2.7 (challenging): Take 𝑦 ′ = 𝑓 (𝑥, 𝑦), 𝑦(0) = 0, where 𝑓 (𝑥, 𝑦) > 1 for all 𝑥 and 𝑦.
If the solution exists for all 𝑥, can you say what happens to 𝑦(𝑥) as 𝑥 goes to positive infinity?
Explain.

Exercise 1.2.8 (challenging): Take (𝑦 − 𝑥)𝑦 ′ = 0, 𝑦(0) = 0.

a) Find two distinct solutions.


b) Explain why this does not violate Picard’s theorem.

Exercise 1.2.9: Suppose 𝑦 ′ = 𝑓 (𝑥, 𝑦). What will the slope field look like, explain and sketch an
example, if you know the following about 𝑓 (𝑥, 𝑦):

a) 𝑓 does not depend on 𝑦. b) 𝑓 does not depend on 𝑥.


c) 𝑓 (𝑡, 𝑡) = 0 for any number 𝑡. d) 𝑓 (𝑥, 0) = 0 and 𝑓 (𝑥, 1) = 1 for all 𝑥.

Exercise 1.2.10: Find a solution to 𝑦 ′ = |𝑦|, 𝑦(0) = 0. Does Picard’s theorem apply?

Exercise 1.2.11: Take an equation 𝑦 ′ = (𝑦 − 2𝑥)𝑔(𝑥, 𝑦) + 2 for some function 𝑔(𝑥, 𝑦). Can you
solve the problem for the initial condition 𝑦(0) = 0, and if so what is the solution?
Exercise 1.2.12 (challenging): Suppose 𝑦 ′ = 𝑓 (𝑥, 𝑦) is such that 𝑓 (𝑥, 1) = 0 for every 𝑥, 𝑓 is
continuous and $\frac{\partial f}{\partial y}$ exists and is continuous for every 𝑥 and 𝑦.

a) Guess a solution given the initial condition 𝑦(0) = 1.


b) Can graphs of two solutions of the equation for different initial conditions ever intersect?
c) Given 𝑦(0) = 0, what can you say about the solution? In particular, can 𝑦(𝑥) > 1 for any 𝑥?
Can 𝑦(𝑥) = 1 for any 𝑥? Why or why not?

Exercise 1.2.101: Sketch the slope field of 𝑦 ′ = 𝑦 3 . Can you visually find the solution that satisfies
𝑦(0) = 0?

Exercise 1.2.102: Is it possible to solve 𝑦 ′ = 𝑥𝑦 for 𝑦(0) = 0? Is the solution unique?


Exercise 1.2.103: Is it possible to solve $y' = \frac{x}{x^2 - 1}$ for $y(1) = 0$?

Exercise 1.2.104: Match equations 𝑦 ′ = sin 𝑥, 𝑦 ′ = cos 𝑦, 𝑦 ′ = 𝑦 cos(𝑥) to slope fields. Justify.

a) b) c)

Exercise 1.2.105 (tricky): Suppose


\[ f(y) = \begin{cases} 0 & \text{if } y > 0, \\ 1 & \text{if } y \leq 0. \end{cases} \]

Does 𝑦 ′ = 𝑓 (𝑦), 𝑦(0) = 0 have a continuously differentiable solution? Does Picard apply? Why, or
why not?

Exercise 1.2.106: Consider an equation of the form 𝑦 ′ = 𝑓 (𝑥) for some continuous function 𝑓 , and
an initial condition 𝑦(𝑥0 ) = 𝑦0 . Does a solution exist for all 𝑥? Why or why not?
1.3 Separable equations


Note: 1 lecture, §1.4 in [EP], §2.2 in [BD]

When a differential equation is of the form 𝑦 ′ = 𝑓 (𝑥), we integrate: $y = \int f(x) \, dx + C$.
Unfortunately, simply integrating no longer works for the general form of the equation
𝑦 ′ = 𝑓 (𝑥, 𝑦). Integrating both sides yields the rather unhelpful expression

\[ y = \int f(x, y) \, dx + C. \]

Notice the dependence on 𝑦 in the integral.

1.3.1 Separable equations


We say a differential equation is separable if we can write it as

𝑦 ′ = 𝑓 (𝑥)𝑔(𝑦),

for some functions 𝑓 (𝑥) and 𝑔(𝑦). Let us write the equation in the Leibniz notation

\[ \frac{dy}{dx} = f(x) g(y). \]
Then we rewrite the equation as
\[ \frac{dy}{g(y)} = f(x) \, dx. \]
Both sides look like something we can integrate. We obtain

\[ \int \frac{dy}{g(y)} = \int f(x) \, dx + C. \]

If we can find closed form expressions for these two integrals, we can, perhaps, solve for 𝑦.
Example 1.3.1: Take the equation
𝑦 ′ = 𝑥𝑦.
Note that 𝑦 = 0 is a solution. We will remember that fact and assume 𝑦 ≠ 0 from now on,
so that we can divide by 𝑦. Write the equation as $\frac{dy}{dx} = xy$ or $\frac{dy}{y} = x \, dx$. Then

\[ \int \frac{dy}{y} = \int x \, dx + C. \]

We compute the antiderivatives to get

\[ \ln |y| = \frac{x^2}{2} + C, \]
or
\[ |y| = e^{\frac{x^2}{2} + C} = e^{\frac{x^2}{2}} e^{C} = D e^{\frac{x^2}{2}}, \]
where 𝐷 > 0 is some constant. Because 𝑦 = 0 is also a solution and because of the absolute
value, we can write:
\[ y = D e^{\frac{x^2}{2}}, \]
for any number 𝐷 (including zero or negative).
We check:
\[ y' = D x e^{\frac{x^2}{2}} = x \left( D e^{\frac{x^2}{2}} \right) = xy. \]

Yay!
You may be worried that we integrated in two different variables. We seemingly did a
different operation to each side. Perhaps we should be a little bit more careful and work
through this method more rigorously. Consider

\[ \frac{dy}{dx} = f(x) g(y). \]
We rewrite the equation as follows. Note that 𝑦 = 𝑦(𝑥) is a function of 𝑥 and so is $\frac{dy}{dx}$!

\[ \frac{1}{g(y)} \frac{dy}{dx} = f(x). \]

We integrate both sides with respect to 𝑥:

\[ \int \frac{1}{g(y)} \frac{dy}{dx} \, dx = \int f(x) \, dx + C. \]

We use the change of variables formula (substitution) on the left-hand side:

\[ \int \frac{1}{g(y)} \, dy = \int f(x) \, dx + C. \]

And we are done.

1.3.2 Implicit solutions


We sometimes get stuck even if we can do the integration. Consider the separable equation
\[ y' = \frac{xy}{y^2 + 1}. \]

We separate variables,
\[ \frac{y^2 + 1}{y} \, dy = \left( y + \frac{1}{y} \right) dy = x \, dx. \]
We integrate to get
\[ \frac{y^2}{2} + \ln |y| = \frac{x^2}{2} + C, \]
or perhaps the less intimidating expression (where 𝐷 = 2𝐶)

𝑦 2 + 2 ln |𝑦| = 𝑥 2 + 𝐷.

It is not easy to find the solution explicitly—it is hard to solve for 𝑦. We, therefore, leave the
solution in this form and call it an implicit solution. It is still easy to check that an implicit
solution satisfies the differential equation. In this case, we differentiate with respect to 𝑥,
and remember that 𝑦 is a function of 𝑥, to get
 
\[ y' \left( 2y + \frac{2}{y} \right) = 2x. \]

Multiply both sides by 𝑦 and divide by 2(𝑦 2 + 1) and you will get exactly the differential
equation. We leave this computation to the reader.
If you have an implicit solution, and you want to compute values for 𝑦, you might
have to be tricky. You might get multiple solutions 𝑦 for each 𝑥, so you have to pick one.
Sometimes you can graph 𝑥 as a function of 𝑦, and then turn your paper to see a graph.
Sometimes you have to do more.
Computers are also good at some of these tricks. More advanced mathematical software
usually has some way of plotting solutions to implicit equations. For example, for 𝐷 = 0, if
you plot all the points (𝑥, 𝑦) that are solutions to 𝑦 2 + 2 ln |𝑦| = 𝑥 2 , you find the two curves
in Figure 1.8 on the following page. This is not quite a graph of a function. For each 𝑥 there
are two choices of 𝑦. To find a function, you have to pick one of these two curves. You
pick the one that satisfies your initial condition if you have one. For instance, the top curve
satisfies the condition 𝑦(1) = 1. So for each 𝐷, we really got two solutions. As you can see,
computing values from an implicit solution can be somewhat tricky. But sometimes, an
implicit solution is the best we can do.
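
As a sketch of how values can be computed from this implicit solution, the code below solves $y^2 + 2 \ln y = x^2 + D$ for the upper curve ($y > 0$) by bisection. The function name `implicit_y`, the bracketing interval, and the tolerance are assumptions made for this illustration; the bracket is adequate for moderate values of $x$ and $D$ such as those plotted. Bisection works here because the left-hand side is strictly increasing in $y$ for $y > 0$, so there is exactly one positive root.

```python
import math

def implicit_y(x, D=0.0, lo=1e-12, hi=1e6, tol=1e-12):
    """Solve y^2 + 2 ln(y) = x^2 + D for the positive branch y > 0 by bisection."""
    target = x * x + D

    def g(y):
        return y * y + 2.0 * math.log(y) - target

    # g is negative near y = 0 and positive for large y, so the root lies in [lo, hi].
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

for x in (0.0, 1.0, 2.0):
    print(x, implicit_y(x))
```

Because the equation involves only $y^2$ and $|y|$, the lower curve is the mirror image of the upper one: its value at $x$ is the negative of `implicit_y(x)`.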
The equation above also has the solution 𝑦 = 0. So the general solution is

𝑦 2 + 2 ln |𝑦| = 𝑥 2 + 𝐷, and 𝑦 = 0.

Sometimes these extra solutions that came up due to division by zero such as 𝑦 = 0 are
called singular solutions.

1.3.3 Examples of separable equations

Example 1.3.2: Solve 𝑥 2 𝑦 ′ = 1 − 𝑥 2 + 𝑦 2 − 𝑥 2 𝑦 2 , 𝑦(1) = 0.


Factor the right-hand side

𝑥 2 𝑦 ′ = (1 − 𝑥 2 )(1 + 𝑦 2 ).
Figure 1.8: The implicit solution $y^2 + 2 \ln |y| = x^2$ to $y' = \frac{xy}{y^2 + 1}$.

Separate variables, integrate, and solve for 𝑦:

\[ \frac{y'}{1 + y^2} = \frac{1 - x^2}{x^2}, \]
\[ \frac{y'}{1 + y^2} = \frac{1}{x^2} - 1, \]
\[ \arctan(y) = \frac{-1}{x} - x + C, \]
\[ y = \tan\left( \frac{-1}{x} - x + C \right). \]

Solve for the initial condition, 0 = tan(−2 + 𝐶) to get 𝐶 = 2 (or 𝐶 = 2 + 𝜋, or 𝐶 = 2 + 2𝜋,
etc.). The particular solution we seek is, therefore,
\[ y = \tan\left( \frac{-1}{x} - x + 2 \right). \]

Example 1.3.3: Bob made a cup of coffee, and Bob likes to drink coffee only once it reaches
60 degrees Celsius and will not burn him. Initially at time 𝑡 = 0 minutes, Bob measured the
temperature and the coffee was 89 degrees Celsius. One minute later, Bob measured the
coffee again and it had 85 degrees. The temperature of the room (the ambient temperature)
is 22 degrees. When should Bob start drinking?
Let 𝑇 be the temperature of the coffee in degrees Celsius, and let 𝐴 be the ambient
(room) temperature, also in degrees Celsius. Newton’s law of cooling states that the rate at
which the temperature of the coffee is changing is proportional to the difference between
the ambient temperature and the temperature of the coffee. That is,

\[ \frac{dT}{dt} = k(A - T), \]
for some positive constant 𝑘. For our setup 𝐴 = 22, 𝑇(0) = 89, 𝑇(1) = 85. We separate
variables and integrate (let 𝐶 and 𝐷 denote arbitrary constants):

\[ \frac{1}{T - A} \frac{dT}{dt} = -k, \]
\[ \ln(T - A) = -kt + C, \qquad \text{(note that } T - A > 0 \text{)} \]
\[ T - A = D e^{-kt}, \]
\[ T = A + D e^{-kt}. \]

That is, 𝑇 = 22 + 𝐷 𝑒 −𝑘𝑡 . We plug in the first condition: 89 = 𝑇(0) = 22 + 𝐷, and hence
𝐷 = 67. So 𝑇 = 22 + 67 𝑒 −𝑘𝑡 . The second condition says 85 = 𝑇(1) = 22 + 67 𝑒 −𝑘 . Solving
for 𝑘, we get $k = -\ln \frac{85 - 22}{67} \approx 0.0616$. We solve for the time 𝑡 that gives a temperature of 60
degrees. Namely, we solve
\[ 60 = 22 + 67 e^{-0.0616 t} \]
to get $t = -\frac{\ln \frac{60 - 22}{67}}{0.0616} \approx 9.21$ minutes. So Bob can begin to drink the coffee at just over 9
minutes from the time Bob made it. That is probably about the amount of time it took us
to calculate how long it would take. See Figure 1.9.
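
The computation in this example can be reproduced in a few lines. The sketch below follows the same steps (find 𝐷 from 𝑇(0), then 𝑘 from 𝑇(1), then solve for the time at which 𝑇 = 60); the variable and function names are chosen here only for illustration.

```python
import math

A = 22.0        # ambient temperature (degrees Celsius)
T0 = 89.0       # measured temperature at t = 0 minutes
T1 = 85.0       # measured temperature at t = 1 minute
target = 60.0   # temperature at which Bob starts drinking

D = T0 - A                                  # from T(0) = A + D
k = -math.log((T1 - A) / D)                 # from T(1) = A + D e^{-k}
t_drink = -math.log((target - A) / D) / k   # solve target = A + D e^{-k t}

def temperature(t):
    """Coffee temperature T(t) = A + D e^{-k t}."""
    return A + D * math.exp(-k * t)

print(round(k, 4), round(t_drink, 2))
```

Using the unrounded value of 𝑘 rather than 0.0616 changes the answer only slightly; it still rounds to the value of about 9.21 minutes found above.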

Figure 1.9: Graphs of the coffee temperature function 𝑇(𝑡). On the left, horizontal lines are drawn at
temperatures 60, 85, and 89. Vertical lines are drawn at 𝑡 = 1 and 𝑡 = 9.21. Notice that the temperature
of the coffee hits 85 at 𝑡 = 1, and 60 at 𝑡 ≈ 9.21. On the right, the graph is over a longer period of time,
with a horizontal line at the ambient temperature 22.

Example 1.3.4: Find the general solution to $y' = \frac{-xy^2}{3}$ (including any singular solutions).
First, note that $y = 0$ is a solution (a singular solution). Next, assume $y \neq 0$ and separate:
\[
\frac{-3}{y^2}\, y' = x,
\]

\[
\begin{aligned}
\frac{3}{y} &= \frac{x^2}{2} + C, \\
y &= \frac{3}{x^2/2 + C} = \frac{6}{x^2 + 2C} .
\end{aligned}
\]

So the general solution is

\[
y = \frac{6}{x^2 + 2C} \qquad \text{and} \qquad y = 0 .
\]

1.3.4 Exercises
Exercise 1.3.1: Solve $y' = \frac{x}{y}$.

Exercise 1.3.2: Solve $y' = x^2 y$.

Exercise 1.3.3: Solve $\frac{dx}{dt} = (x^2 - 1)\,t$, for $x(0) = 0$.

Exercise 1.3.4: Solve $\frac{dx}{dt} = x \sin(t)$, for $x(0) = 1$.

Exercise 1.3.5: Solve $\frac{dy}{dx} = xy + x + y + 1$. Hint: Factor the right-hand side.

Exercise 1.3.6: Solve $xy' = y + 2x^2 y$, where $y(1) = 1$.

Exercise 1.3.7: Solve $\frac{dy}{dx} = \frac{y^2 + 1}{x^2 + 1}$, for $y(0) = 1$.

Exercise 1.3.8: Find an implicit solution for $\frac{dy}{dx} = \frac{x^2 + 1}{y^2 + 1}$, for $y(0) = 1$.

Exercise 1.3.9: Find an explicit solution for $y' = x e^{-y}$, $y(0) = 1$.

Exercise 1.3.10: Find an explicit solution for $xy' = e^{-y}$, for $y(1) = 1$.

Exercise 1.3.11: Find an explicit solution for $y' = y e^{-x^2}$, $y(0) = 1$. It is alright to leave a definite
integral in your answer.

Exercise 1.3.12: Suppose a cup of coffee is at 100 degrees Celsius at time 𝑡 = 0, it is at 70 degrees
at 𝑡 = 10 minutes, and it is at 50 degrees at 𝑡 = 20 minutes. Compute the ambient temperature.

Exercise 1.3.101: Solve $y' = 2xy$.

Exercise 1.3.102: Solve $x' = 3xt^2 - 3t^2$, $x(0) = 2$.

Exercise 1.3.103: Find an implicit solution for $x' = \frac{1}{3x^2 + 1}$, $x(0) = 1$.

Exercise 1.3.104: Find an explicit solution to $xy' = y^2$, $y(1) = 1$.

Exercise 1.3.105: Find an implicit solution to $y' = \frac{\sin(x)}{\cos(y)}$.

Exercise 1.3.106: Take Example 1.3.3 with the same numbers: 89 degrees at 𝑡 = 0, 85 degrees at
𝑡 = 1, and ambient temperature of 22 degrees. Suppose these temperatures were measured with
precision of ±0.5 degrees. Given this imprecision, the time it takes the coffee to cool to (exactly) 60
degrees is also only known in a certain range. Find this range. Hint: Think about what kind of error
makes the cooling time longer and what shorter.

Exercise 1.3.107: A population $x$ of rabbits on an island is modeled by $x' = x - \frac{1}{1000}\, x^2$, where
the independent variable is time in months. At time $t = 0$, there are 40 rabbits on the island.

a) Find the solution to the equation with the initial condition.

b) How many rabbits are on the island in 1 month, 5 months, 10 months, 15 months (round to
the nearest integer)?

1.4 Linear equations and the integrating factor


Note: 1 lecture, §1.5 in [EP], §2.1 in [BD]
One of the most important types of equations we will learn to solve are the so-called
linear equations. In fact, the majority of the course is about linear equations. In this section
we focus on the first-order linear equation. A first-order equation is linear if we can put it
into the form:
𝑦 ′ + 𝑝(𝑥)𝑦 = 𝑓 (𝑥). (1.3)
The word “linear” means linear in 𝑦 and 𝑦 ′; no higher powers or functions of 𝑦 or 𝑦 ′ appear.
The dependence on 𝑥 can be more complicated.
Solutions of linear equations have nice properties. For example, the solution exists
wherever 𝑝(𝑥) and 𝑓 (𝑥) are defined, and has essentially the same regularity (read: it is just
as nice). Most importantly for us right now, there is a method for solving linear first-order
equations.
The trick is to rewrite the left-hand side of (1.3) as a derivative of a product of 𝑦 with
another function. To this end, we wish to find a function 𝑟(𝑥) such that
\[
r(x)y' + r(x)p(x)y = \frac{d}{dx}\Bigl[ r(x)y \Bigr] .
\]
This is the left-hand side of (1.3) multiplied by $r(x)$. If we multiply (1.3) by $r(x)$, we obtain
\[
\frac{d}{dx}\Bigl[ r(x)y \Bigr] = r(x) f(x) .
\]
We can now integrate both sides, which we can do as the right-hand side does not depend
on 𝑦 and the left-hand side is written as a derivative of a function. After the integration,
we solve for 𝑦 by dividing by 𝑟(𝑥). The function 𝑟(𝑥) is called the integrating factor and the
method is called the integrating factor method.
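The condition that $r(x)$ must satisfy can be read off from the product rule. The short derivation below is a sketch added to spell out that step; it uses only the symbols already introduced, $r(x)$, $p(x)$, and $y$.

```latex
\[
\frac{d}{dx}\Bigl[ r(x)y \Bigr] = r(x)y' + r'(x)y
\qquad \text{(product rule)} .
\]
% Comparing with the desired left-hand side r(x)y' + r(x)p(x)y,
% the two expressions agree for every y exactly when
\[
r'(x) = r(x)\,p(x) .
\]
```

Any nonzero function with $r' = rp$ will do; we only need one such function, not the most general one.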
We are looking for a function 𝑟(𝑥), such that if we differentiate it, we get the same
function back multiplied by 𝑝(𝑥). That seems like a job for the exponential function! Let

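\[
r(x) = e^{\int p(x)\,dx} .
\]

We compute:
\[
\begin{aligned}
y' + p(x)y &= f(x), \\
e^{\int p(x)\,dx} y' + e^{\int p(x)\,dx} p(x) y &= e^{\int p(x)\,dx} f(x), \\
\frac{d}{dx}\Bigl[ e^{\int p(x)\,dx} y \Bigr] &= e^{\int p(x)\,dx} f(x), \\
e^{\int p(x)\,dx} y &= \int e^{\int p(x)\,dx} f(x)\,dx + C, \\
y &= e^{-\int p(x)\,dx} \left( \int e^{\int p(x)\,dx} f(x)\,dx + C \right) .
\end{aligned}
\]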

Of course, to get a closed form formula for 𝑦, we need to be able to find a closed form
formula for the integrals appearing above.
Example 1.4.1: Solve
\[
y' + 2xy = e^{x-x^2}, \qquad y(0) = -1 .
\]
First note that $p(x) = 2x$ and $f(x) = e^{x-x^2}$. The integrating factor is
$r(x) = e^{\int p(x)\,dx} = e^{x^2}$. We multiply both sides of the equation by $r(x)$ to get
\[
\begin{aligned}
e^{x^2} y' + 2x e^{x^2} y &= e^{x^2} e^{x-x^2}, \\
\frac{d}{dx}\Bigl[ e^{x^2} y \Bigr] &= e^{x} .
\end{aligned}
\]
We integrate
\[
\begin{aligned}
e^{x^2} y &= e^{x} + C, \\
y &= e^{x-x^2} + C e^{-x^2} .
\end{aligned}
\]
Next, we solve for the initial condition $-1 = y(0) = 1 + C$, so $C = -2$. The solution is
\[
y = e^{x-x^2} - 2 e^{-x^2} .
\]
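A quick numerical sanity check of this answer is sketched below. It is not part of the method: the helper name `residual` and the step `h` are illustrative choices, and the derivative is approximated by a central difference rather than computed exactly.

```python
import math

def y(x):
    """Candidate solution y = e^(x - x^2) - 2 e^(-x^2) from Example 1.4.1."""
    return math.exp(x - x**2) - 2.0 * math.exp(-x**2)

def residual(x, h=1e-6):
    """y' + 2xy - e^(x - x^2), with y' replaced by a central difference."""
    dy = (y(x + h) - y(x - h)) / (2.0 * h)
    return dy + 2.0 * x * y(x) - math.exp(x - x**2)

# The initial condition y(0) = -1 should hold exactly.
print(y(0.0))
# The residual should be close to zero on a grid of sample points in [-2, 2].
print(max(abs(residual(i / 10.0)) for i in range(-20, 21)))
```

A small residual does not prove the formula, but a large one would reveal an algebra slip.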


Note that we do not care which antiderivative we take when computing $e^{\int p(x)\,dx}$. You
can always add a constant of integration, but those constants will not matter in the end.

Exercise 1.4.1: Try it! Add a constant of integration to the integral in the integrating factor and
show that the solution you get in the end is the same as what we got above.

Advice: Do not try to remember the formula for 𝑦 itself; that is way too hard. It is easier
to remember the process and repeat it.
Since we cannot always evaluate the integrals in closed form, it is useful to know how
to write the solution in definite integral form. A definite integral is something that you can
plug into a computer or a calculator. Suppose we are given

𝑦 ′ + 𝑝(𝑥)𝑦 = 𝑓 (𝑥), 𝑦(𝑥 0 ) = 𝑦0 .

Look at the solution and write the integrals as definite integrals.

\[
y(x) = e^{-\int_{x_0}^{x} p(s)\,ds} \left( \int_{x_0}^{x} e^{\int_{x_0}^{t} p(s)\,ds} f(t)\,dt + y_0 \right) . \tag{1.4}
\]

You should be careful to properly use dummy variables here. If you now plug such a
formula into a computer or a calculator, it will be happy to give you numerical answers.
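As an illustration of plugging formula (1.4) into a computer, here is a minimal sketch that uses only the Python standard library. The names `integrate` and `solve_linear_ivp` are hypothetical, and the composite Simpson rule with a fixed number of subintervals is just one simple choice of quadrature; any reliable numerical integrator would serve.

```python
import math

def integrate(g, a, b, n=200):
    """Composite Simpson's rule for the integral of g over [a, b]; n must be even."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * g(a + i * h)
    return total * h / 3.0

def solve_linear_ivp(p, f, x0, y0, x):
    """Evaluate formula (1.4) at x for y' + p(x) y = f(x), y(x0) = y0."""
    def P(t):
        # Inner integral of p(s) ds from x0 to t (s is the dummy variable).
        return integrate(p, x0, t)
    # Outer integral of e^{P(t)} f(t) dt from x0 to x (t is the dummy variable).
    outer = integrate(lambda t: math.exp(P(t)) * f(t), x0, x)
    return math.exp(-P(x)) * (outer + y0)

# Check against Example 1.4.1, whose exact solution is y = e^(x - x^2) - 2 e^(-x^2).
approx = solve_linear_ivp(lambda x: 2.0 * x, lambda x: math.exp(x - x * x), 0.0, -1.0, 1.0)
exact = 1.0 - 2.0 / math.e
print(approx, exact)
```

The nested call mirrors the nested integrals in (1.4), which is why the two dummy variables $s$ and $t$ must be kept separate.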

Exercise 1.4.2: Check that 𝑦(𝑥 0 ) = 𝑦0 in formula (1.4).



Exercise 1.4.3: Write the solution of the following problem as a definite integral, but try to simplify
as far as you can. You will not be able to find the solution in closed form.

\[
y' + y = e^{x^2 - x}, \qquad y(0) = 10 .
\]

Remark 1.4.1: Before we move on, we should note some interesting properties of linear
equations. First, for the linear initial value problem 𝑦 ′ + 𝑝(𝑥)𝑦 = 𝑓 (𝑥), 𝑦(𝑥 0 ) = 𝑦0 , there
is an explicit formula (1.4) for the solution. Second, it follows from the formula (1.4) that
if 𝑝(𝑥) and 𝑓 (𝑥) are continuous on some interval (𝑎, 𝑏), then the solution 𝑦(𝑥) exists and
is differentiable on (𝑎, 𝑏). Compare with the simple nonlinear example we have seen
previously, 𝑦 ′ = 𝑦 2 , and compare to Theorem 1.2.1.
Example 1.4.2: We get to a common simple application of linear equations. Real-life
applications of this type of problem include computing the concentration of chemicals in
bodies of water (rivers and lakes).
A 100 liter tank contains 10 kilograms of salt dissolved in 60 liters of water. Solution of
water and salt (brine) with concentration of 0.1 kilograms per liter is flowing in at the rate
of 5 liters a minute. The solution in the tank is well stirred and flows out at a rate of 3
liters a minute. How much salt is in the tank when the tank is full?
[Diagram: a tank holding 60 L of water with 10 kg of salt; inflow of 5 L/min at 0.1 kg/L; outflow of 3 L/min.]
To solve this problem, we need to find the differential equation for this setup. Let 𝑥
denote the kilograms of salt in the tank, let 𝑡 denote the time in minutes. For a small
change Δ𝑡 in time, the change in 𝑥 (denoted Δ𝑥) is approximately

Δ𝑥 ≈ (rate in × concentration in)Δ𝑡 − (rate out × concentration out)Δ𝑡.

Dividing through by Δ𝑡 and taking the limit Δ𝑡 → 0, we see that

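\[
\frac{dx}{dt} = (\text{rate in} \times \text{concentration in}) - (\text{rate out} \times \text{concentration out}) .
\]
In our example,
\[
\begin{aligned}
\text{rate in} &= 5, \\
\text{concentration in} &= 0.1, \\
\text{rate out} &= 3, \\
\text{volume} &= \text{initial volume} + (\text{rate in} - \text{rate out})\,t = 60 + (5 - 3)t, \\
\text{concentration out} &= \frac{x}{\text{volume}} = \frac{x}{60 + (5 - 3)t} .
\end{aligned}
\]
Our differential equation is
\[
\frac{dx}{dt} = (5 \times 0.1) - \left( 3\, \frac{x}{60 + 2t} \right) .
\]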

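In the form (1.3), it is
\[
\frac{dx}{dt} + \frac{3}{60 + 2t}\, x = 0.5 .
\]
Let us solve. The integrating factor is
\[
r(t) = \exp\left( \int \frac{3}{60 + 2t}\,dt \right) = \exp\left( \frac{3}{2} \ln(60 + 2t) \right) = (60 + 2t)^{3/2} .
\]
We multiply both sides of the equation by $r(t)$ to get
\[
\begin{aligned}
(60 + 2t)^{3/2} \frac{dx}{dt} + (60 + 2t)^{3/2} \frac{3}{60 + 2t}\, x &= 0.5\,(60 + 2t)^{3/2}, \\
\frac{d}{dt}\Bigl[ (60 + 2t)^{3/2} x \Bigr] &= 0.5\,(60 + 2t)^{3/2}, \\
(60 + 2t)^{3/2} x &= \int 0.5\,(60 + 2t)^{3/2}\,dt + C, \\
x &= (60 + 2t)^{-3/2} \int \frac{(60 + 2t)^{3/2}}{2}\,dt + C (60 + 2t)^{-3/2}, \\
x &= (60 + 2t)^{-3/2}\, \frac{1}{10} (60 + 2t)^{5/2} + C (60 + 2t)^{-3/2}, \\
x &= \frac{60 + 2t}{10} + C (60 + 2t)^{-3/2} .
\end{aligned}
\]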
To find $C$, note that at $t = 0$, we have $x = 10$. That is,
\[
10 = x(0) = \frac{60}{10} + C (60)^{-3/2} = 6 + C (60)^{-3/2},
\]
or
\[
C = 4 \left( 60^{3/2} \right) \approx 1859.03 .
\]
We know 5 liters per minute are flowing in and 3 liters per minute are flowing out, so
the volume is increasing by 2 liters a minute. So the tank is full when $60 + 2t = 100$, or
when $t = 20$. We are interested in the value of $x$ when the tank is full, that is, we want
to compute $x(20)$:
\[
\begin{aligned}
x(20) &= \frac{60 + 40}{10} + C (60 + 40)^{-3/2} \\
&\approx 10 + 1859.03\,(100)^{-3/2} \approx 11.86 .
\end{aligned}
\]
There are 11.86 kg of salt in the tank when it is full. See Figure 1.10 for the graph of $x$ over $t$.
The concentration when the tank is full is approximately 11.86/100 = 0.1186 kg/liter, and
we started with 1/6 or approximately 0.1667 kg/liter.

Figure 1.10: Graph of the solution $x$ kilograms of salt in the tank at time $t$.
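The numbers in this example can be reproduced with a short script. This is a sketch under stated assumptions: `salt` simply evaluates the closed-form solution found above, while `salt_euler` is an independent cross-check that steps the differential equation forward with Euler's method (the method of Section 1.7) using an arbitrarily chosen number of steps.

```python
def salt(t, C=4.0 * 60.0**1.5):
    """Closed-form solution x(t) = (60 + 2t)/10 + C (60 + 2t)^(-3/2)."""
    v = 60.0 + 2.0 * t          # volume of liquid in the tank at time t
    return v / 10.0 + C * v**-1.5

def salt_euler(t_end, steps=20000):
    """Cross-check: Euler's method on dx/dt = 0.5 - 3x/(60 + 2t), x(0) = 10."""
    h = t_end / steps
    t, x = 0.0, 10.0
    for _ in range(steps):
        x += h * (0.5 - 3.0 * x / (60.0 + 2.0 * t))
        t += h
    return x

print(round(salt(20.0), 2))        # kilograms of salt when the tank is full
print(round(salt_euler(20.0), 2))  # numerical cross-check of the same quantity
```

Agreement between the two values to a couple of decimal places is a reasonable check that both the differential equation and the constant $C$ were set up correctly.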

1.4.1 Exercises
In the exercises, feel free to leave your answer as a definite integral if a closed form solution
cannot be found. If you can find a closed form solution, you should give that.

Exercise 1.4.4: Solve 𝑦 ′ + 𝑥𝑦 = 𝑥.

Exercise 1.4.5: Solve $y' + 6y = e^x$.

Exercise 1.4.6: Solve $y' + 3x^2 y = \sin(x)\, e^{-x^3}$, with $y(0) = 1$.

Exercise 1.4.7: Solve $y' + \cos(x) y = \cos(x)$.

Exercise 1.4.8: Solve $\frac{1}{x^2+1}\, y' + xy = 3$, with $y(0) = 0$.

Exercise 1.4.9: Suppose there are two lakes located on a stream. Clean water flows into the first
lake, then the water from the first lake flows into the second lake, and then water from the second
lake flows further downstream. The in and out flow from each lake is 500 liters per hour. The first
lake contains 100 thousand liters of water and the second lake contains 200 thousand liters of water.
A truck with 500 kg of toxic substance crashes into the first lake. Assume that the water is being
continually mixed perfectly by the stream.

a) Find the concentration of toxic substance as a function of time in both lakes.


b) When will the concentration in the first lake be below 0.001 kg per liter?
c) When will the concentration in the second lake be maximal?

Exercise 1.4.10: Newton’s law of cooling states that $\frac{dx}{dt} = -k(x - A)$ where $x$ is the temperature, $t$
is time, $A$ is the ambient temperature, and $k > 0$ is a constant. Suppose that $A = A_0 \cos(\omega t)$ for
some constants $A_0$ and $\omega$. That is, the ambient temperature oscillates (for example night and day
temperatures).

a) Find the general solution.


b) In the long term, will the initial conditions make much of a difference? Why or why not?

Exercise 1.4.11: Initially 5 grams of salt are dissolved in 20 liters of water. Brine with concentration
of salt 2 grams of salt per liter is added at a rate of 3 liters a minute. The tank is mixed well and is
drained at 3 liters a minute. How long does the process have to continue until there are 20 grams of
salt in the tank?

Exercise 1.4.12: Initially a tank contains 10 liters of pure water. Brine of unknown (but constant)
concentration of salt is flowing in at 1 liter per minute. The water is mixed well and drained at 1
liter per minute. In 20 minutes there are 15 grams of salt in the tank. What is the concentration of
salt in the incoming brine?

Exercise 1.4.101: Solve $y' + 3x^2 y = x^2$.

Exercise 1.4.102: Solve 𝑦 ′ + 2 sin(2𝑥)𝑦 = 2 sin(2𝑥), 𝑦(𝜋/2) = 3.



Exercise 1.4.103: Suppose a water tank is being pumped out at 3 L/min. The water tank starts at
10 L of clean water. Water with toxic substance is flowing into the tank at 2 L/min, with concentration
20𝑡 g/L at time 𝑡. When the tank is half empty, how many grams of toxic substance are in the tank
(assuming perfect mixing)?

Exercise 1.4.104: There are bacteria on a plate and a toxic substance is being added that slows down
the rate of growth of the bacteria. That is, suppose that $\frac{dP}{dt} = (2 - 0.1\,t)P$. If $P(0) = 1000$, find the
population at $t = 5$.

Exercise 1.4.105: A cylindrical water tank has water flowing in at 𝐼 cubic meters per second. Let
𝐴 be the area of the cross section of the tank in square meters. Suppose water is flowing out from
the bottom of the tank at a rate proportional to the height of the water level. Set up the differential
equation for ℎ, the height of the water, introducing and naming constants that you need. You should
also give the units for your constants.

1.5 Substitution
Note: 1 lecture, can safely be skipped, §1.6 in [EP], not in [BD]
Just as when solving integrals, one method to try is to change variables to end up with
a simpler equation to solve.

1.5.1 Substitution
The equation
\[
y' = (x - y + 1)^2
\]
is neither separable nor linear. What can we do? How about trying to change variables, so
that in the new variables the equation is simpler? We use another variable $v$, which we
treat as a function of $x$. We try
\[
v = x - y + 1 .
\]
We need to figure out $y'$ in terms of $v'$, $v$ and $x$. We differentiate (in $x$) to obtain $v' = 1 - y'$.
So $y' = 1 - v'$. We plug this into the equation to get
\[
1 - v' = v^2 .
\]
In other words, $v' = 1 - v^2$. Such an equation we know how to solve by separating variables:
\[
\frac{1}{1 - v^2}\,dv = dx .
\]
So
\[
\frac{1}{2} \ln \left| \frac{v+1}{v-1} \right| = x + C, \quad \text{or} \quad \left| \frac{v+1}{v-1} \right| = e^{2x+2C}, \quad \text{or} \quad \frac{v+1}{v-1} = D e^{2x},
\]
for some constant $D$. Note that $v = 1$ and $v = -1$ are also solutions.
Now we need to “unsubstitute” to obtain
\[
\frac{x - y + 2}{x - y} = D e^{2x},
\]
and also the two solutions $x - y + 1 = 1$ or $y = x$, and $x - y + 1 = -1$ or $y = x + 2$. We solve
the first equation for $y$:
\[
\begin{aligned}
x - y + 2 &= (x - y) D e^{2x}, \\
x - y + 2 &= D x e^{2x} - y D e^{2x}, \\
-y + y D e^{2x} &= D x e^{2x} - x - 2, \\
y \left( -1 + D e^{2x} \right) &= D x e^{2x} - x - 2, \\
y &= \frac{D x e^{2x} - x - 2}{D e^{2x} - 1} .
\end{aligned}
\]
Note that $D = 0$ gives $y = x + 2$, but no value of $D$ gives the solution $y = x$.
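One way to gain confidence in a solution found by substitution is to plug it back into the original equation numerically. The sketch below does this for $y' = (x - y + 1)^2$; the helper name `residual`, the step size, and the sample values of $D$ are illustrative assumptions, and the derivative is approximated by a central difference.

```python
import math

def y(x, D):
    """Family of solutions y = (D x e^(2x) - x - 2) / (D e^(2x) - 1)."""
    e = D * math.exp(2.0 * x)
    return (x * e - x - 2.0) / (e - 1.0)

def residual(x, D, h=1e-6):
    """y' - (x - y + 1)^2 with y' approximated by a central difference."""
    dy = (y(x + h, D) - y(x - h, D)) / (2.0 * h)
    return dy - (x - y(x, D) + 1.0) ** 2

# D = 0 is the solution y = x + 2; D = -1 keeps the denominator away from zero.
for D in (0.0, -1.0):
    worst = max(abs(residual(i / 10.0, D)) for i in range(-10, 11))
    print(D, worst)
```

For positive $D$ the denominator $D e^{2x} - 1$ vanishes at one value of $x$, so sample points near that value would have to be avoided.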

Substitution in differential equations is applied in much the same way that it is applied
in calculus. You guess. Several different substitutions might work. There are some general
patterns to look for. We summarize a few of these in a table.

When you see        Try substituting

$y y'$              $v = y^2$
$y^2 y'$            $v = y^3$
$(\cos y) y'$       $v = \sin y$
$(\sin y) y'$       $v = \cos y$
$e^y y'$            $v = e^y$

Usually, you try to substitute in the “most complicated” part of the equation, with the
hopes of simplifying it. The table above is just a rule of thumb. You might have to modify
your guesses. If a substitution does not work (it does not make the equation any simpler),
try a different one.

1.5.2 Bernoulli equations


There are some forms of equations where there is a general rule for substitution that always
works. One such example is the so-called Bernoulli equation*:
\[
y' + p(x)y = q(x)y^n .
\]
This equation looks a lot like a linear equation except for the $y^n$. If $n = 0$ or $n = 1$, then the
equation is linear and we can solve it. Otherwise, the substitution $v = y^{1-n}$ transforms the
Bernoulli equation into a linear equation. Note that $n$ need not be an integer.
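To see why the substitution works in general, the following short derivation may help. It is a sketch added here for clarity and assumes $y \neq 0$ so that we may multiply by $y^{-n}$.

```latex
% Differentiate v = y^{1-n} with the chain rule:
\[
v' = (1-n)\, y^{-n}\, y' .
\]
% Multiply the Bernoulli equation y' + p(x) y = q(x) y^n by (1-n) y^{-n}:
\[
(1-n)\, y^{-n} y' + (1-n)\, p(x)\, y^{1-n} = (1-n)\, q(x) .
\]
% Replace (1-n) y^{-n} y' by v' and y^{1-n} by v:
\[
v' + (1-n)\, p(x)\, v = (1-n)\, q(x) .
\]
```

The last line is a linear equation in $v$ of the form (1.3), with $(1-n)p(x)$ in place of $p(x)$ and $(1-n)q(x)$ in place of $f(x)$.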
* There are several things called Bernoulli equations, this is just one of them. The Bernoullis were a
prominent Swiss family of mathematicians. These particular equations are named for Jacob Bernoulli
(1654–1705).

Example 1.5.1: Solve
\[
xy' + y(x+1) + xy^5 = 0, \qquad y(1) = 1 .
\]
The equation is a Bernoulli equation, $p(x) = (x+1)/x$ and $q(x) = -1$. We substitute
\[
v = y^{1-5} = y^{-4}, \qquad v' = -4 y^{-5} y' .
\]
In other words, $(-1/4)\, y^5 v' = y'$. So
\[
\begin{aligned}
xy' + y(x+1) + xy^5 &= 0, \\
\frac{-x y^5}{4}\, v' + y(x+1) + xy^5 &= 0, \\
\frac{-x}{4}\, v' + y^{-4}(x+1) + x &= 0, \\
\frac{-x}{4}\, v' + v(x+1) + x &= 0,
\end{aligned}
\]
and finally
\[
v' - \frac{4(x+1)}{x}\, v = 4 .
\]
The equation is now linear. We can use the integrating factor method. In particular, we
use formula (1.4). We assume that $x > 0$ so $|x| = x$. This assumption is OK, as our initial
condition is at $x = 1 > 0$. Let us compute the integrating factor. Here $p(s)$ from formula
(1.4) is $\frac{-4(s+1)}{s}$.
\[
\begin{aligned}
e^{\int_1^x p(s)\,ds} &= \exp\left( \int_1^x \frac{-4(s+1)}{s}\,ds \right) = e^{-4x - 4\ln(x) + 4} = e^{-4x+4} x^{-4} = \frac{e^{-4x+4}}{x^4}, \\
e^{-\int_1^x p(s)\,ds} &= e^{4x + 4\ln(x) - 4} = e^{4x-4} x^4 .
\end{aligned}
\]
We now plug in to (1.4)
\[
\begin{aligned}
v(x) &= e^{-\int_1^x p(s)\,ds} \left( \int_1^x e^{\int_1^t p(s)\,ds}\, 4\,dt + 1 \right) \\
&= e^{4x-4} x^4 \left( \int_1^x 4\, \frac{e^{-4t+4}}{t^4}\,dt + 1 \right) .
\end{aligned}
\]
The integral in this expression is not possible to find in closed form. As we said before, it is
perfectly fine to have a definite integral in our solution. Now “unsubstitute”
\[
\begin{aligned}
y^{-4} &= e^{4x-4} x^4 \left( 4 \int_1^x \frac{e^{-4t+4}}{t^4}\,dt + 1 \right), \\
y &= \frac{e^{-x+1}}{x \left( 4 \int_1^x \frac{e^{-4t+4}}{t^4}\,dt + 1 \right)^{1/4}} .
\end{aligned}
\]

1.5.3 Homogeneous equations


Another type of equations we can solve by substitution are the so-called homogeneous
equations. Suppose that we can write the differential equation as
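\[
y' = F\left( \frac{y}{x} \right) .
\]
Here we try the substitutions
\[
v = \frac{y}{x} \qquad \text{and therefore} \qquad y' = v + x v' .
\]
We note that the equation is transformed into
\[
v + x v' = F(v) \qquad \text{or} \qquad x v' = F(v) - v \qquad \text{or} \qquad \frac{v'}{F(v) - v} = \frac{1}{x} .
\]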

Hence an implicit solution is
\[
\int \frac{1}{F(v) - v}\,dv = \ln |x| + C .
\]
Clearly this solution does not work when $x = 0$ (we would, after all, divide by zero in $y/x$).
So we will either assume $x > 0$ or $x < 0$ depending on the initial condition.
Example 1.5.2: Solve
\[
x^2 y' = y^2 + xy, \qquad y(1) = 1 .
\]
We put the equation into the form $y' = (y/x)^2 + y/x$, that is, $F(v) = v^2 + v$. As the initial
condition is for a positive $x$ value, we will assume $x > 0$. We substitute $v = y/x$ to get the
separable equation
\[
x v' = v^2 + v - v = v^2,
\]
which has a solution
\[
\begin{aligned}
\int \frac{1}{v^2}\,dv &= \ln |x| + C, \\
\frac{-1}{v} &= \ln x + C, \\
v &= \frac{-1}{\ln x + C} .
\end{aligned}
\]
We unsubstitute
\[
\frac{y}{x} = \frac{-1}{\ln x + C}, \qquad \text{or} \qquad y = \frac{-x}{\ln x + C} .
\]
We want $y(1) = 1$, so
\[
1 = y(1) = \frac{-1}{\ln 1 + C} = \frac{-1}{C} .
\]
Thus $C = -1$ and the solution we are looking for is
\[
y = \frac{-x}{\ln x - 1} .
\]

1.5.4 Exercises
Hint: Answers need not always be in closed form.

Exercise 1.5.1: Solve $y' + y(x^2 - 1) + xy^6 = 0$, with $y(1) = 1$.

Exercise 1.5.2: Solve $2yy' + 1 = y^2 + x$, with $y(0) = 1$.

Exercise 1.5.3: Solve $y' + xy = y^4$, with $y(0) = 1$.

Exercise 1.5.4: Solve $yy' + x = \sqrt{x^2 + y^2}$.

Exercise 1.5.5: Solve $y' = (x + y - 1)^2$.



Exercise 1.5.6: Solve $y' = \frac{x^2 - y^2}{xy}$, with $y(1) = 2$.

Exercise 1.5.101: Solve $xy' + y + y^2 = 0$, $y(1) = 2$.

Exercise 1.5.102: Solve $xy' + y + x = 0$, $y(1) = 1$.

Exercise 1.5.103: Solve $y^2 y' = y^3 - 3x$, $y(0) = 2$.

Exercise 1.5.104: Solve $2yy' = e^{y^2 - x^2} + 2x$.

1.6 Autonomous equations


Note: 1 lecture, §2.2 in [EP], §2.5 in [BD]
Consider problems of the form
\[
\frac{dx}{dt} = f(x),
\]
where the derivative of solutions depends only on 𝑥 (the dependent variable). Such
equations are called autonomous equations. If we think of 𝑡 as time, the naming comes from
the fact that the equation is independent of time.
We return to the cooling coffee problem (Example 1.3.3). Newton’s law of cooling says
\[
\frac{dx}{dt} = k(A - x),
\]
where 𝑥 is the temperature, 𝑡 is time, 𝑘 is some positive constant, and 𝐴 is the ambient
temperature. See Figure 1.11 for an example with 𝑘 = 0.3 and 𝐴 = 5.
Note the solution 𝑥 = 𝐴 (in the figure 𝑥 = 5). We call these constant solutions the
equilibrium solutions. The points on the 𝑥-axis where 𝑓 (𝑥) = 0 are called critical points. The
point 𝑥 = 𝐴 is a critical point. In fact, each critical point corresponds to an equilibrium
solution. Note also, by looking at the graph, that the solution 𝑥 = 𝐴 is “stable” in that
small perturbations in 𝑥 do not lead to substantially different solutions as 𝑡 grows. If we
change the initial condition a little bit, then as 𝑡 → ∞, we still get 𝑥(𝑡) → 𝐴. We call such
a critical point stable. In this simple example, all solutions in fact go to 𝐴 as 𝑡 → ∞. If a
critical point is not stable, we say it is unstable.

Figure 1.11: The slope field and some solutions of $x' = 0.3\,(5 - x)$.

Figure 1.12: The slope field and some solutions of $x' = 0.1\,x\,(5 - x)$.

Consider now the logistic equation


\[
\frac{dx}{dt} = kx(M - x),
\]

for some positive 𝑘 and 𝑀. This equation is commonly used to model population if we
know the limiting population 𝑀, that is, the maximum sustainable population. The logistic
equation leads to less catastrophic predictions on world population than 𝑥 ′ = 𝑘𝑥. In the
real world, there is no such thing as negative population, but we will still consider negative
𝑥 for the purposes of the math.
See Figure 1.12 for an example, $x' = 0.1\,x(5 - x)$. There are two
critical points, $x = 0$ and $x = 5$. The critical point at $x = 5$ is stable, while the critical point
at $x = 0$ is unstable. It is not necessary to find the exact solutions to understand their long
term behavior, that is, behavior as time goes to infinity. From the slope field of
$x' = 0.1\,x(5 - x)$, we see that
\[
\lim_{t \to \infty} x(t) =
\begin{cases}
5 & \text{if } x(0) > 0, \\
0 & \text{if } x(0) = 0, \\
\text{DNE or } -\infty & \text{if } x(0) < 0 .
\end{cases}
\]
Here DNE means “does not exist.” From just looking at the slope field, we cannot quite
decide what happens if $x(0) < 0$. It could be that the solution does not exist for $t$ all the
way to $\infty$. Think of the equation $x' = x^2$; we have seen that solutions only exist for some
finite period of time. The same can happen here. In our example equation above it turns out
that the solution does not exist for all time, but to see that we would have to solve the
equation. In any case, the solution does go to $-\infty$, but it may get there rather quickly.
If we are interested only in the long term behavior of the solution, we would be doing
unnecessary work if we solved the equation exactly. We could draw the slope field, but it
is easier to just look at the phase diagram or phase portrait, which is a simple way to visualize
the behavior of autonomous equations. In this case there is one dependent variable, the 𝑥.
We draw the 𝑥-axis, we mark all the critical points, and then we draw arrows in between.
Since 𝑥 is the dependent variable, we draw the axis vertically, as it appears in the slope
field diagrams above. If 𝑓 (𝑥) > 0, we draw an up arrow. If 𝑓 (𝑥) < 0, we draw a down
arrow. To figure this out, we could just plug in some 𝑥 between the critical points: 𝑓 (𝑥)
will have the same sign at all 𝑥 between two critical points as long as 𝑓 (𝑥) is continuous. For
example, 𝑓 (6) = −0.6 < 0, so 𝑓 (𝑥) < 0 for 𝑥 > 5, and the arrow above 𝑥 = 5 is a down arrow.
Next, 𝑓 (1) = 0.4 > 0, so 𝑓 (𝑥) > 0 whenever 0 < 𝑥 < 5, and the arrow points up. Finally,
𝑓 (−1) = −0.6 < 0, so 𝑓 (𝑥) < 0 when 𝑥 < 0, and the arrow points down.
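The sign test just described is easy to automate. The sketch below is one possible way to do it: the function name `classify` and the offset `eps` are illustrative, and it assumes that $f$ is continuous and that `eps` is smaller than the distance between neighboring critical points.

```python
def classify(f, critical_points, eps=1e-3):
    """Label each critical point using the sign of f just below and just above it."""
    labels = {}
    for c in critical_points:
        below, above = f(c - eps), f(c + eps)
        if below > 0 > above:
            labels[c] = "stable"      # up arrow below c, down arrow above c
        else:
            labels[c] = "unstable"    # at least one arrow points away from c
    return labels

# The logistic example x' = 0.1 x (5 - x) with critical points 0 and 5.
f = lambda x: 0.1 * x * (5.0 - x)
print(classify(f, [0.0, 5.0]))
```

The code only inspects the arrows next to each critical point, which is exactly the information the phase diagram records.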

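[Phase diagram: a vertical 𝑥-axis with the critical points 𝑥 = 5 and 𝑥 = 0 marked; the arrow points down above 𝑥 = 5, up between 𝑥 = 0 and 𝑥 = 5, and down below 𝑥 = 0.]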

Armed with the phase diagram, it is easy to sketch the solutions approximately: As
time 𝑡 moves from left to right, the graph of a solution goes up if the arrow is up, and it
goes down if the arrow is down.
Exercise 1.6.1: Try sketching a few solutions simply from looking at the phase diagram. Check with
the preceding graphs if you are getting the same type of curves.
Once we draw the phase diagram, we classify critical points as stable or unstable*.
Since any mathematical model we cook up will only be an approximation to the real world,
unstable points are generally bad news.

[Diagram: the arrows near an unstable critical point and near a stable critical point.]

We remark that you can figure out the arrows by plotting the graph 𝑦 = 𝑓 (𝑥). However,
note that in such a plot 𝑥, the dependent variable of the differential equation, is on the horizontal axis.
Let us think about the logistic equation with harvesting. Suppose an alien race really
likes to eat humans. They keep a planet with humans and harvest the humans at a rate
of ℎ million humans per year. Suppose 𝑥 is the number of humans in millions on the
planet and 𝑡 is time in years. Let 𝑀 be the limiting population when no harvesting is done.
The number 𝑘 > 0 is a constant depending on how fast humans multiply. Our equation
becomes
\[
\frac{dx}{dt} = kx(M - x) - h .
\]
We expand the right-hand side and set it to zero.
\[
kx(M - x) - h = -kx^2 + kMx - h = 0 .
\]
Solving for the critical points, let us call them $A$ and $B$, we get
\[
A = \frac{kM + \sqrt{(kM)^2 - 4hk}}{2k}, \qquad B = \frac{kM - \sqrt{(kM)^2 - 4hk}}{2k} .
\]
Exercise 1.6.2: Sketch a phase diagram for different possibilities. Note that these possibilities are
𝐴 > 𝐵, or 𝐴 = 𝐵, or 𝐴 and 𝐵 both complex (i.e. no real solutions). Hint: Fix some simple 𝑘 and 𝑀
and then vary ℎ.
For example, let 𝑀 = 8 and 𝑘 = 0.1. When ℎ = 1, then 𝐴 and 𝐵 are distinct and positive.
See Figure 1.13 for the slope field. As long as the population starts above
𝐵, which is approximately 1.55 million, then the population will not die out; it will tend
towards 𝐴 ≈ 6.45 million. If ever a catastrophe happens and the population drops below
𝐵, humans will die out, and the fast food restaurant serving them will go out of business.
* Unstable points with one of the arrows pointing towards the critical point are sometimes called semistable.

Figure 1.13: The slope field and some solutions of $x' = 0.1\,x\,(8 - x) - 1$.

Figure 1.14: The slope field and some solutions of $x' = 0.1\,x\,(8 - x) - 1.6$.

When ℎ = 1.6, then 𝐴 = 𝐵 = 4. There is only one critical point and it is unstable. When
the population starts above 4 million, it will tend towards 4 million. However, if it ever
drops below 4 million, perhaps a worse than normal hurricane season one year, then
humans will die out on the planet. This scenario is not one that we (as the human fast food
proprietor) want to be in. A small perturbation of the equilibrium state and we are out of
business. There is no room for error. See Figure 1.14.
Finally, if we are harvesting at 2 million humans per year, there are no critical points.
The population will always plummet towards zero, no matter how well stocked the planet
starts. See Figure 1.15.
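The three harvesting scenarios can be reproduced from the quadratic formula for the critical points. The sketch below uses a hypothetical helper `critical_points`; the tolerance `tol` is an assumption introduced so that floating-point round-off does not misclassify the borderline case $h = 1.6$, where the discriminant is zero.

```python
import math

def critical_points(k, M, h, tol=1e-12):
    """Real roots of -k x^2 + k M x - h = 0, listed as [A, B] with A >= B."""
    disc = (k * M) ** 2 - 4.0 * h * k
    if abs(disc) < tol:
        return [M / 2.0]            # A = B = kM / (2k) = M / 2
    if disc < 0:
        return []                   # A and B are complex: no critical points
    root = math.sqrt(disc)
    return [(k * M + root) / (2.0 * k), (k * M - root) / (2.0 * k)]

# M = 8 and k = 0.1 with the three harvesting rates discussed in the text.
for h in (1.0, 1.6, 2.0):
    print(h, [round(x, 2) for x in critical_points(0.1, 8.0, h)])
```

Two roots correspond to Figure 1.13, a single repeated root to Figure 1.14, and no real roots to Figure 1.15.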


Figure 1.15: The slope field and some solutions of 𝑥 ′ = 0.1 𝑥 (8 − 𝑥) − 2.



1.6.1 Exercises
Exercise 1.6.3: Consider $x' = x^2$.

a) Draw the phase diagram, find the critical points, and mark them stable or unstable.
b) Sketch typical solutions of the equation.
c) Find $\lim_{t \to \infty} x(t)$ for the solution with the initial condition $x(0) = -1$.

Exercise 1.6.4: Consider $x' = \sin x$.

a) Draw the phase diagram for $-4\pi \leq x \leq 4\pi$. On this interval mark the critical points stable
or unstable.
b) Sketch typical solutions of the equation.
c) Find $\lim_{t \to \infty} x(t)$ for the solution with the initial condition $x(0) = 1$.

Exercise 1.6.5: Suppose $f(x)$ is positive for $0 < x < 1$, it is zero when $x = 0$ and $x = 1$, and it is
negative for all other $x$.

a) Draw the phase diagram for $x' = f(x)$, find the critical points, and mark them stable or
unstable.
b) Sketch typical solutions of the equation.
c) Find $\lim_{t \to \infty} x(t)$ for the solution with the initial condition $x(0) = 0.5$.

Exercise 1.6.6: Start with the logistic equation $\frac{dx}{dt} = kx(M - x)$. Suppose we modify our harvesting.
That is, we will only harvest an amount proportional to the current population. In other words, we
harvest $hx$ per unit of time for some $h > 0$ (similar to the earlier example with $h$ replaced with $hx$).

a) Construct the differential equation.


b) Show that if 𝑘𝑀 > ℎ, then the equation is still logistic.
c) What happens when 𝑘𝑀 < ℎ?

Exercise 1.6.7: A disease is spreading through the country. Let $x$ be the number of people infected.
Let the constant $S$ be the number of people susceptible to infection. The infection rate $\frac{dx}{dt}$ is
proportional to the product of already infected people, $x$, and the number of susceptible but uninfected
people, $S - x$.

a) Write down the differential equation.

b) Supposing $x(0) > 0$, that is, some people are infected at time $t = 0$, what is $\lim_{t \to \infty} x(t)$?

c) Does the solution to part b) agree with your intuition? Why or why not?

Exercise 1.6.101: Let $x' = (x - 1)(x - 2)x^2$.

a) Sketch the phase diagram and find critical points.
b) Classify the critical points.
c) If $x(0) = 0.5$, then find $\lim_{t \to \infty} x(t)$.

Exercise 1.6.102: Let $x' = e^{-x}$.

a) Find and classify all critical points.
b) Find $\lim_{t \to \infty} x(t)$ given any initial condition.

Exercise 1.6.103: Assume that a population of fish in a lake satisfies $\frac{dx}{dt} = kx(M - x)$. Now
suppose that fish are continually added at $A$ fish per unit of time.

a) Find the differential equation for $x$.
b) What is the new limiting population?

Exercise 1.6.104: Suppose $\frac{dx}{dt} = (x - \alpha)(x - \beta)$ for two numbers $\alpha < \beta$.

a) Find the critical points, and classify them.

For b), c), d), find $\lim_{t \to \infty} x(t)$ based on the phase diagram.

b) $x(0) < \alpha$, c) $\alpha < x(0) < \beta$, d) $\beta < x(0)$.



1.7 Numerical methods: Euler’s method


Note: 1 lecture, can safely be skipped, §2.4 in [EP], §8.1 in [BD]
Unless 𝑓 (𝑥, 𝑦) is of a special form, it is generally very hard if not impossible to get a
nice formula for the solution of the problem

𝑦 ′ = 𝑓 (𝑥, 𝑦), 𝑦(𝑥 0 ) = 𝑦0 .

If the equation can be solved in closed form, we should do that. But what if we have
an equation that cannot be solved in closed form? What if we want to find the value
of the solution at some particular 𝑥? Or perhaps we want to produce a graph of the
solution to inspect the behavior. In this section we will learn about the basics of numerical
approximation of solutions.
The simplest method for approximating a solution is Euler’s method*. It works as follows:
Take $x_0$ and $y_0$ and compute the slope $k = f(x_0, y_0)$. The slope is the change in $y$ per unit
change in $x$. Follow the line for an interval of length $h$ on the $x$-axis. Hence if $y = y_0$ at
$x_0$, then we say that $y_1$, the approximate value of $y$ at $x_1 = x_0 + h$, is $y_1 = y_0 + hk$. Rinse,
repeat! Let $k = f(x_1, y_1)$, and then compute $x_2 = x_1 + h$, and $y_2 = y_1 + hk$. Now compute
$x_3$ and $y_3$ using $x_2$ and $y_2$, etc. Consider the equation $y' = y^2/3$, $y(0) = 1$, and $h = 1$. Then
$x_0 = 0$ and $y_0 = 1$. We compute
\[
\begin{aligned}
x_1 &= x_0 + h = 0 + 1 = 1, & y_1 &= y_0 + h f(x_0, y_0) = 1 + 1 \cdot \tfrac{1}{3} = \tfrac{4}{3} \approx 1.333, \\
x_2 &= x_1 + h = 1 + 1 = 2, & y_2 &= y_1 + h f(x_1, y_1) = \tfrac{4}{3} + 1 \cdot \frac{(4/3)^2}{3} = \tfrac{52}{27} \approx 1.926 .
\end{aligned}
\]
We then draw an approximate graph of the solution by connecting the points $(x_0, y_0)$,
$(x_1, y_1)$, $(x_2, y_2)$, \ldots. See Figure 1.16 for the first two steps of the
method.
More abstractly, for any $i = 0, 1, 2, 3, \ldots$, we compute
\[
x_{i+1} = x_i + h, \qquad y_{i+1} = y_i + h f(x_i, y_i) .
\]
The line segments we get are an approximate graph of the solution. Generally it is not
exactly the solution. See Figure 1.17 for the plot of the real solution and
the approximation.
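The recurrence above translates almost line for line into code. The following is a minimal sketch; the function name `euler` and its argument order are choices made here for illustration.

```python
def euler(f, x0, y0, h, n):
    """Return the points (x_i, y_i), i = 0, ..., n, produced by Euler's method."""
    points = [(x0, y0)]
    x, y = x0, y0
    for _ in range(n):
        y = y + h * f(x, y)   # y_{i+1} = y_i + h f(x_i, y_i), using the old x_i
        x = x + h             # x_{i+1} = x_i + h
        points.append((x, y))
    return points

# The worked example: y' = y^2 / 3, y(0) = 1, two steps of size h = 1.
f = lambda x, y: y**2 / 3.0
for x, y in euler(f, 0.0, 1.0, 1.0, 2):
    print(x, round(y, 3))
```

Note that $y$ is updated before $x$, so that the slope is evaluated at the old point $(x_i, y_i)$ as the formula requires.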
We continue with the equation $y' = y^2/3$, $y(0) = 1$. Let us try to approximate $y(2)$ using
Euler’s method. In Figures 1.16 and 1.17 we have graphically approximated $y(2)$ with step
size 1. With step size 1, we have $y(2) \approx 1.926$. The real answer is 3. We are approximately
1.074 off. Let us halve the step size. Computing $y_4$ with $h = 0.5$, we find that $y(2) \approx 2.209$,
so an error of about 0.791. Table 1.1 gives the values computed for various
parameters.
* Named after the Swiss mathematician Leonhard Paul Euler (1707–1783). The correct pronunciation of
the name sounds more like “oiler.”

Figure 1.16: First two steps of Euler’s method with $h = 1$ for the equation $y' = \frac{y^2}{3}$ with initial conditions
$y(0) = 1$.

Figure 1.17: Two steps of Euler’s method (step size 1) and the exact solution for the equation $y' = \frac{y^2}{3}$ with
initial conditions $y(0) = 1$.

Exercise 1.7.1: Solve this equation exactly and show that 𝑦(2) = 3.

The difference between the actual solution and the approximate solution is called the
error. We usually talk about the size of the error and we do not care much about its sign.

\[
\text{Error} = \bigl| \text{Actual } y - \text{Approximate } y \bigr| .
\]

The point is, we do not know the real solution. If we knew the error exactly, we would
know the actual solution . . . so what is the point of doing the approximation?
Note that except for the first few times, each time we halve the ℎ, the error approximately
halves. Halving of the error is a general feature of Euler’s method as it is a first-order method.

h            Approximate y(2)    Error      Error / Previous error
1            1.92593             1.07407
0.5          2.20861             0.79139    0.73681
0.25         2.47250             0.52751    0.66656
0.125        2.68034             0.31966    0.60599
0.0625       2.82040             0.17960    0.56184
0.03125      2.90412             0.09588    0.53385
0.015625     2.95035             0.04965    0.51779
0.0078125    2.97472             0.02528    0.50913

Table 1.1: Euler’s method approximation of $y(2)$ for $y' = y^2/3$, $y(0) = 1$.
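A table of this kind is easy to regenerate. The following sketch is our own illustration (the helper name `euler_final` is hypothetical); it recomputes the approximations of $y(2)$, the errors against the exact value 3 quoted in the text, and the ratio of each error to the previous one.

```python
def euler_final(f, x0, y0, x_end, n_steps):
    """Approximate y(x_end) with n_steps Euler steps of equal size."""
    h = (x_end - x0) / n_steps
    x, y = x0, y0
    for _ in range(n_steps):
        y += h * f(x, y)
        x += h
    return y


def f(x, y):
    # Right-hand side of y' = y^2 / 3
    return y**2 / 3


exact = 3.0  # exact value of y(2), as stated in the text
rows = []
previous_error = None
for k in range(8):
    n = 2 * 2**k              # h = 1, 0.5, ..., 0.0078125 on the interval [0, 2]
    h = 2.0 / n
    approx = euler_final(f, 0.0, 1.0, 2.0, n)
    error = abs(exact - approx)
    ratio = None if previous_error is None else error / previous_error
    rows.append((h, approx, error, ratio))
    previous_error = error

for h, approx, error, ratio in rows:
    ratio_text = "" if ratio is None else f"{ratio:.5f}"
    print(f"{h:<10g} {approx:.5f}  {error:.5f}  {ratio_text}")
```

The last column drifting toward one half is the first-order behavior discussed in the text.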

A simple improvement of the Euler method, see the exercises, produces a second-order
method. A second-order method reduces the error to approximately one quarter every time
we halve the interval. The order being “second” means the squaring in $1/4 = 1/2 \times 1/2 = (1/2)^2$.
To get the error to be within 0.1 of the answer, we had to do 64 steps. To get it to within
0.01, we would have to halve another three or four times, meaning doing 512 to 1024 steps.
The improved Euler method from the exercises should quarter the error every time we
halve the interval, so we would have to do (approximately) half as many “halvings” to get
the same error. This reduction can be a big deal. With 10 halvings (starting at ℎ = 1) we
have 1024 steps, whereas with 5 halvings we only have to do 32 steps, assuming that the
error was comparable to start with. A computer may not care about this difference for a
problem this simple, but suppose each step would take a second to compute (the function
may be substantially more difficult to compute than $y^2/3$). Then the difference is 32 seconds
versus about 17 minutes. We are not being altogether fair; a second-order method would
probably double the time to do each step. Even so, it is 1 minute versus 17 minutes. Next,
suppose that we have to repeat such a calculation for different parameters a thousand
times. You get the idea.
In practice, we do not know how large the error is! How do we know what is the
right step size? Well, essentially, we keep halving the interval, and if we are lucky, we can
estimate the error from a few of these calculations and the assumption that the error goes
down by a factor of one half each time (if we are using standard Euler).

Exercise 1.7.2: In the table above, suppose you do not know the error. Take the approximate values
of the function in the last two lines, assume that the error goes down by a factor of 2. Can you
estimate the error in the last line from this? Does it (approximately) agree with the table? Now do
it for the first two rows. Does this agree with the table?

Let us talk a little bit more about the example $y' = y^2/3$, $y(0) = 1$. Suppose that instead
of $y(2)$ we wish to find $y(3)$. Table 1.2 lists the results of this effort for
successive halvings of $h$. What is going on here? Well, you should solve the equation
exactly and you will notice that the solution does not exist at 𝑥 = 3. In fact, the solution
goes to infinity when you approach 𝑥 = 3.

ℎ Approximate 𝑦(3)
1 3.16232
0.5 4.54329
0.25 6.86079
0.125 10.80321
0.0625 17.59893
0.03125 29.46004
0.015625 50.40121
0.0078125 87.75769

Table 1.2: Attempts to use Euler’s method to approximate $y(3)$ for $y' = y^2/3$, $y(0) = 1$.

Another case where things go bad is if the solution oscillates wildly near some point.
The solution may exist at all points, but even a much better numerical method than
Euler would need an insanely small step size to approximate the solution with reasonable
precision. And computers might not be able to easily handle such a small step size.
In real applications we would not use a simple method such as Euler’s. The simplest
method that would probably be used in a real application is the standard Runge–Kutta
method (see exercises). That is a fourth-order method, meaning that if we halve the interval,
the error generally goes down by a factor of 16 (it is fourth-order as 1/16 = 1/2 × 1/2 × 1/2 × 1/2).
Choosing the right method to use and the right step size can be very tricky. There are
several competing factors to consider.
• Computational time: Each step takes computer time. Even if the function 𝑓 is simple
to compute, we do it many times over. Large step size means faster computation, but
perhaps not the right precision.

• Roundoff errors: Computers only compute with a certain number of significant
digits. Errors introduced by rounding numbers off during our computations become
noticeable when the step size becomes too small relative to the quantities we are
working with. So reducing step size may in fact make errors worse. There is a certain
optimum step size such that the precision increases as we approach it, but then starts
getting worse as we make our step size smaller still. The trouble is that this optimum
may be hard to find.

• Stability: Certain equations may be numerically unstable. What may happen is that
the numbers never seem to stabilize no matter how many times we halve the interval.
We may need a ridiculously small interval size, which may not be practical due to
roundoff errors or computational time considerations. Such problems are sometimes
called stiff. In the worst case, the numerical computations might be giving us bogus
numbers that look like a correct answer. Just because the numbers seem to have
stabilized after successive halving, does not mean that we must have the right answer.
We have seen just the beginnings of the challenges that appear in real applications.
Numerical approximation of solutions to differential equations is an active research area
for engineers and mathematicians. For example, the general purpose method used for the
ODE solver in Matlab and Octave (as of this writing) is a method that appeared in the
literature only in the 1980s.

1.7.1 Exercises
Exercise 1.7.3: Consider $\frac{dx}{dt} = (2t - x)^2$, $x(0) = 2$. Use Euler’s method with step size $h = 0.5$ to
approximate $x(1)$.
Exercise 1.7.4: Consider $\frac{dx}{dt} = t - x$, $x(0) = 1$.
a) Use Euler’s method with step sizes ℎ = 1, 1/2, 1/4, 1/8 to approximate 𝑥(1).
b) Solve the equation exactly.
c) Describe what happens to the errors for each ℎ you used. That is, find the factor by which the
error changed each time you halved the interval.
Exercise 1.7.5: Approximate the value of 𝑒 by looking at the initial value problem 𝑦 ′ = 𝑦 with
𝑦(0) = 1 and approximating 𝑦(1) using Euler’s method with a step size of 0.2.
Exercise 1.7.6: Example of numerical instability: Take 𝑦 ′ = −5𝑦, 𝑦(0) = 1. We know that the
solution should decay to zero as 𝑥 grows. Using Euler’s method, start with ℎ = 1 and compute
𝑦1 , 𝑦2 , 𝑦3 , 𝑦4 to try to approximate 𝑦(4). What happened? Now halve the interval. Keep halving
the interval and approximating 𝑦(4) until the numbers you are getting start to stabilize (that is,
until they start going towards zero). Note: You might want to use a calculator.
The simplest method used in practice is the Runge–Kutta method. Consider $\frac{dy}{dx} = f(x, y)$,
$y(x_0) = y_0$, and a step size $h$. Everything is the same as in Euler’s method, except the
computation of $y_{i+1}$ and $x_{i+1}$. That is, in each step we compute slopes $k_1$, $k_2$, $k_3$, and $k_4$,
and then we compute the next $x_{i+1}$ and $y_{i+1}$:
\[
\begin{aligned}
k_1 &= f(x_i, y_i), \\
k_2 &= f\bigl(x_i + h/2,\; y_i + k_1 (h/2)\bigr), & x_{i+1} &= x_i + h, \\
k_3 &= f\bigl(x_i + h/2,\; y_i + k_2 (h/2)\bigr), & y_{i+1} &= y_i + \frac{k_1 + 2k_2 + 2k_3 + k_4}{6}\, h, \\
k_4 &= f(x_i + h,\; y_i + k_3 h).
\end{aligned}
\]
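As a rough sketch, the four slopes translate directly into code. The function names `rk4_step` and `rk4` are our own, and the test problem $y' = y$, $y(0) = 1$ (so that $y(1) = e$) is chosen here only for illustration.

```python
import math


def rk4_step(f, x, y, h):
    """Advance one step of the standard fourth-order Runge-Kutta method."""
    k1 = f(x, y)
    k2 = f(x + h / 2, y + k1 * (h / 2))
    k3 = f(x + h / 2, y + k2 * (h / 2))
    k4 = f(x + h, y + k3 * h)
    return x + h, y + (k1 + 2 * k2 + 2 * k3 + k4) / 6 * h


def rk4(f, x0, y0, h, n_steps):
    """Apply n_steps Runge-Kutta steps and return the final y."""
    x, y = x0, y0
    for _ in range(n_steps):
        x, y = rk4_step(f, x, y, h)
    return y


# Illustrative test problem: y' = y, y(0) = 1, whose exact value at x = 1 is e.
approx_e = rk4(lambda x, y: y, 0.0, 1.0, 0.1, 10)
print(approx_e, abs(approx_e - math.e))
```

Each step costs four evaluations of $f$ instead of one, which is the price paid for the higher order.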

Exercise 1.7.7: Consider $\frac{dy}{dx} = yx^2$, $y(0) = 1$.
a) Approximate 𝑦(1) using Runge–Kutta (see above) with step sizes ℎ = 1 and ℎ = 1/2.
b) Approximate 𝑦(1) using Euler’s method with ℎ = 1 and ℎ = 1/2.
c) Solve exactly, find the exact value of 𝑦(1), and compare with the approximations.
Exercise 1.7.101: Let 𝑥 ′ = sin(𝑥𝑡), and 𝑥(0) = 1. Approximate 𝑥(1) using Euler’s method with
step sizes 1, 0.5, 0.25. Use a calculator and compute up to 4 decimal digits.
Exercise 1.7.102: Let 𝑥 ′ = 2𝑡, and 𝑥(0) = 0.
a) Approximate 𝑥(4) using Euler’s method with step sizes 4, 2, and 1.
b) Solve exactly, and compute the errors.
c) Compute the factor by which the errors changed.
Exercise 1.7.103: Let $x' = xe^{xt+1}$, and $x(0) = 0$.
a) Approximate 𝑥(4) using Euler’s method with step sizes 4, 2, and 1.
b) Guess an exact solution based on part a) and compute the errors.
There is a simple way to improve Euler’s method to make it a second-order method
by doing just one extra step. Consider $\frac{dy}{dx} = f(x, y)$, $y(x_0) = y_0$, and a step size $h$. What
we do is to pretend we compute the next step as in Euler, that is, we start with (𝑥 𝑖 , 𝑦 𝑖 ),
we compute a slope 𝑘1 = 𝑓 (𝑥 𝑖 , 𝑦 𝑖 ), and then look at the point (𝑥 𝑖 + ℎ, 𝑦 𝑖 + 𝑘 1 ℎ). Instead of
letting our new point be (𝑥 𝑖 + ℎ, 𝑦 𝑖 + 𝑘 1 ℎ), we compute the slope at that point, call it 𝑘 2 ,
and then take the average of 𝑘1 and 𝑘 2 , hoping that the average is going to be closer to the
actual slope on the interval from $x_i$ to $x_i + h$. And we are correct: if we halve the step, the
error should go down by a factor of $2^2 = 4$. To summarize, the setup is the same as for
regular Euler, except the computation of 𝑦 𝑖+1 and 𝑥 𝑖+1 . At each step we compute the new
slopes 𝑘1 and 𝑘2 and then the next 𝑦 𝑖+1 and 𝑥 𝑖+1 :

\[
\begin{aligned}
k_1 &= f(x_i, y_i), & x_{i+1} &= x_i + h, \\
k_2 &= f(x_i + h,\; y_i + k_1 h), & y_{i+1} &= y_i + \frac{k_1 + k_2}{2}\, h.
\end{aligned}
\]
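The following sketch compares regular Euler with the improved version on a simple test problem. The helper names are our own, and the test equation $y' = y$, $y(0) = 1$ is an assumption made for illustration; the point is only to watch how the two errors shrink when the step is halved.

```python
import math


def euler_step(f, x, y, h):
    return x + h, y + h * f(x, y)


def improved_euler_step(f, x, y, h):
    k1 = f(x, y)                  # slope at the current point
    k2 = f(x + h, y + k1 * h)     # slope at the point Euler would move to
    return x + h, y + (k1 + k2) / 2 * h


def integrate(step, f, x0, y0, h, n_steps):
    x, y = x0, y0
    for _ in range(n_steps):
        x, y = step(f, x, y, h)
    return y


def f(x, y):
    # Illustrative test problem: y' = y, y(0) = 1, exact y(1) = e
    return y


errors = {}
for n in (10, 20):
    h = 1.0 / n
    errors[n] = (
        abs(integrate(euler_step, f, 0.0, 1.0, h, n) - math.e),
        abs(integrate(improved_euler_step, f, 0.0, 1.0, h, n) - math.e),
    )
    print(n, errors[n])

print("Euler error ratio:   ", errors[10][0] / errors[20][0])
print("Improved error ratio:", errors[10][1] / errors[20][1])
```

For this test problem the first ratio comes out near 2 and the second near 4, in line with first-order and second-order behavior.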
Exercise 1.7.104: Consider $\frac{dy}{dx} = x + y$, $y(0) = 1$.
a) Approximate 𝑦(1) using the improved Euler’s method (see above) with step sizes ℎ = 1/4 and
ℎ = 1/8.
b) Approximate 𝑦(1) using Euler’s method with ℎ = 1/4 and ℎ = 1/8.
c) Solve exactly, find the exact value of 𝑦(1).
d) Compute the errors, and the factors by which the errors changed.

1.8 Exact equations


Note: 1–2 lectures, can safely be skipped, §1.6 in [EP], §2.6 in [BD]
A type of equation that comes up quite often in physics and engineering is an exact
equation. Suppose 𝐹(𝑥, 𝑦) is a function of two variables, which we call the potential function.
The naming should suggest potential energy, or electric potential. Exact equations and
potential functions appear when there is a conservation law at play, such as conservation
of energy. Let us make up a simple example. Consider

\[
F(x, y) = x^2 + y^2 .
\]

We are interested in the lines of constant energy, that is, lines where the energy is conserved:
the curves where $F(x, y) = C$, for some constant $C$. In our example, the curves
$x^2 + y^2 = C$ are circles. See Figure 1.18.

Figure 1.18: Solutions to $F(x, y) = x^2 + y^2 = C$ for various $C$.

We take the total derivative of $F$:

\[
dF = \frac{\partial F}{\partial x}\, dx + \frac{\partial F}{\partial y}\, dy.
\]

For convenience, we will use the notation $F_x = \frac{\partial F}{\partial x}$ and $F_y = \frac{\partial F}{\partial y}$. In our example,

\[
dF = 2x \, dx + 2y \, dy.
\]

We apply the total derivative to 𝐹(𝑥, 𝑦) = 𝐶,
to find the differential equation 𝑑𝐹 = 0. The
differential equation we obtain in such a way has the form

\[
M \, dx + N \, dy = 0, \qquad \text{or} \qquad M + N \frac{dy}{dx} = 0.
\]
An equation of this form is called exact if it was obtained as 𝑑𝐹 = 0 for some potential
function 𝐹. In our simple example, we obtain the equation

\[
2x \, dx + 2y \, dy = 0, \qquad \text{or} \qquad 2x + 2y \frac{dy}{dx} = 0.
\]
Since we obtained this equation by differentiating $x^2 + y^2 = C$, the equation is exact. We
often wish to solve for $y$ in terms of $x$. In our example,

\[
y = \pm\sqrt{C - x^2}.
\]

In terms of multivariable calculus, at each point $(x, y)$ in the plane, $\vec{v} = (M, N)$ is a
vector, that is, a direction and a magnitude. As $M$ and $N$ are functions of $(x, y)$, we have a
vector field. A vector field $\vec{v}$ that comes from an exact equation is a so-called conservative
vector field, that is, a vector field that comes with a potential function $F(x, y)$, such that
\[
\vec{v} = \left( \frac{\partial F}{\partial x}, \frac{\partial F}{\partial y} \right).
\]
Let $\gamma$ be a path in the plane starting at $(x_1, y_1)$ and ending at $(x_2, y_2)$. If we think of $\vec{v}$ as
force, then the work required to move along $\gamma$ is the path integral
\[
\int_\gamma \vec{v}(\vec{r}) \cdot d\vec{r} = \int_\gamma M \, dx + N \, dy = F(x_2, y_2) - F(x_1, y_1).
\]

In other words, the work done only depends on endpoints, that is, where we start and
where we end. For example, suppose 𝐹 is gravitational potential. The derivative of 𝐹 given
by $\vec{v}$ is the gravitational force. What we are saying is that the work required to move a
heavy box from the ground floor to the roof, only depends on the change in potential
energy. That is, the work done is the same no matter what path we took; if we took the
stairs or the elevator. Although if we took the elevator, the elevator is doing the work
for us. The curves 𝐹(𝑥, 𝑦) = 𝐶 are those where no work need be done, such as the heavy
box sliding along without accelerating or braking on a perfectly flat roof, on a cart with
incredibly well oiled wheels.
An exact equation is a conservative vector field, and the implicit solution of this equation
is the potential function.

1.8.1 Solving exact equations


Now you, the reader, should ask: Where did we solve a differential equation? Well, in
applications we generally know $M$ and $N$, but we do not know $F$. That is, we may have
just started with $2x + 2y \frac{dy}{dx} = 0$, or perhaps even
\[
x + y \frac{dy}{dx} = 0.
\]
It is up to us to find some potential $F$ that works. Many different $F$ will work; adding
a constant to $F$ does not change the equation. Once we have a potential function $F$, the
equation $F\bigl(x, y(x)\bigr) = C$ gives an implicit solution of the ODE.
Example 1.8.1: Let us find the general solution to $2x + 2y \frac{dy}{dx} = 0$. Forget we know $F$.
If we know that this is an exact equation, we start looking for a potential function 𝐹.
We have 𝑀 = 2𝑥 and 𝑁 = 2𝑦. If 𝐹 exists, it must be such that 𝐹𝑥 (𝑥, 𝑦) = 2𝑥. Integrate in
the 𝑥 variable to find
\[
F(x, y) = x^2 + A(y), \tag{1.5}
\]
for some function 𝐴(𝑦). The function 𝐴 is the “constant of integration,” though it is only
constant as far as 𝑥 is concerned, and may still depend on 𝑦. Now differentiate (1.5) in 𝑦
and set it equal to 𝑁, which is what 𝐹 𝑦 is supposed to be:
\[
2y = F_y(x, y) = A'(y).
\]
Integrating, we find $A(y) = y^2$. We could add a constant of integration if we wanted to,
but there is no need. We found $F(x, y) = x^2 + y^2$. Next for a constant $C$, we solve

\[
F\bigl(x, y(x)\bigr) = C
\]

for $y$ in terms of $x$. In this case, we obtain $y = \pm\sqrt{C - x^2}$ as we did before.

Exercise 1.8.1: Why did we not need to add a constant of integration when integrating 𝐴′(𝑦) = 2𝑦?
Add a constant of integration, say 3, and see what 𝐹 you get. What is the difference from what we
got above, and why does it not matter?

The procedure, once we know that the equation is exact, is:

(i) Integrate 𝐹𝑥 = 𝑀 in 𝑥 resulting in 𝐹(𝑥, 𝑦) = something + 𝐴(𝑦).

(ii) Differentiate this 𝐹 in 𝑦, and set that equal to 𝑁, so that we may find 𝐴(𝑦) by
integration.

The procedure can also be done by first integrating in 𝑦 and then differentiating in 𝑥. Pretty
easy huh? Let’s try this again.
Example 1.8.2: Consider now $2x + y + xy \frac{dy}{dx} = 0$.
OK, so 𝑀 = 2𝑥 + 𝑦 and 𝑁 = 𝑥𝑦. We try to proceed as before. Suppose 𝐹 exists. Then
𝐹𝑥 (𝑥, 𝑦) = 2𝑥 + 𝑦. We integrate:

\[
F(x, y) = x^2 + xy + A(y)
\]

for some function $A(y)$. Differentiate in $y$ and set equal to $N$:

\[
N = xy = F_y(x, y) = x + A'(y).
\]

But there is no way to satisfy this requirement! The function 𝑥𝑦 cannot be written as 𝑥
plus a function of 𝑦. The equation is not exact; no potential function 𝐹 exists.
Is there an easier way to check for the existence of 𝐹, other than failing in trying to find
it? Turns out there is. Suppose 𝑀 = 𝐹𝑥 and 𝑁 = 𝐹 𝑦 . As long as the second derivatives are
continuous,
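\[
\frac{\partial M}{\partial y} = \frac{\partial^2 F}{\partial y \partial x} = \frac{\partial^2 F}{\partial x \partial y} = \frac{\partial N}{\partial x} .
\]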
Let us state it as a theorem. Usually this is called the Poincaré Lemma‗ .

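Theorem 1.8.1 (Poincaré). If $M$ and $N$ are continuously differentiable functions of $(x, y)$, and
$\frac{\partial M}{\partial y} = \frac{\partial N}{\partial x}$, then near any point there is a function $F(x, y)$ such that $M = \frac{\partial F}{\partial x}$ and $N = \frac{\partial F}{\partial y}$.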

‗ Named for the French polymath Jules Henri Poincaré (1854–1912).



The theorem doesn’t give us a global 𝐹 defined everywhere in the plane. In general, we
can only find the potential locally, near some initial point. By this time, we have come to
expect this from differential equations.
Let us return to Example 1.8.2, where 𝑀 = 2𝑥 + 𝑦 and 𝑁 = 𝑥𝑦. Notice 𝑀 𝑦 = 1 and
𝑁𝑥 = 𝑦, which are clearly not equal. The equation is not exact.
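The test $M_y = N_x$ can also be spot-checked numerically. The sketch below is our own illustration (the helper names `partial` and `looks_exact` are hypothetical) and uses central differences at a few sample points. Passing such a check at finitely many points is only evidence of exactness, not a proof, while a single failing point does show that the equation is not exact.

```python
def partial(g, x, y, wrt, h=1e-5):
    """Central-difference approximation of a partial derivative of g(x, y)."""
    if wrt == "x":
        return (g(x + h, y) - g(x - h, y)) / (2 * h)
    return (g(x, y + h) - g(x, y - h)) / (2 * h)


def looks_exact(M, N, sample_points, tol=1e-6):
    """Check M_y = N_x numerically at every given sample point."""
    return all(
        abs(partial(M, x, y, "y") - partial(N, x, y, "x")) < tol
        for x, y in sample_points
    )


samples = [(0.5, 1.0), (-1.0, 2.0), (2.0, -0.5)]

# Example 1.8.2: M = 2x + y, N = xy
example_182 = looks_exact(lambda x, y: 2 * x + y, lambda x, y: x * y, samples)
# Example 1.8.1: M = 2x, N = 2y
example_181 = looks_exact(lambda x, y: 2 * x, lambda x, y: 2 * y, samples)

print(example_182, example_181)
```

Note that for Example 1.8.2 the two derivatives happen to agree at points with $y = 1$, which is why several sample points are used.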
Example 1.8.3: Solve
\[
\frac{dy}{dx} = \frac{-2x - y}{x - 1}, \qquad y(0) = 1.
\]
We write the equation as
\[
(2x + y) + (x - 1) \frac{dy}{dx} = 0,
\]
so 𝑀 = 2𝑥 + 𝑦 and 𝑁 = 𝑥 − 1. Then

𝑀 𝑦 = 1 = 𝑁𝑥 .

The equation is exact. Integrating 𝑀 in 𝑥, we find

\[
F(x, y) = x^2 + xy + A(y).
\]

Differentiating in 𝑦 and setting to 𝑁, we find

𝑥 − 1 = 𝑥 + 𝐴′(𝑦).

So $A'(y) = -1$, and $A(y) = -y$ will work. We obtain $F(x, y) = x^2 + xy - y$, so the implicit
solution is $x^2 + xy - y = C$. First we find $C$. As $y(0) = 1$, we have $F(0, 1) = C$. Therefore,
$0^2 + 0 \times 1 - 1 = C$, so $C = -1$. Now we solve $x^2 + xy - y = -1$ for $y$ to get

\[
y = \frac{-x^2 - 1}{x - 1} .
\]
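It is worth sanity-checking such an answer. The sketch below is our own addition; it verifies numerically that the explicit solution satisfies the initial condition, that the potential $F$ stays equal to $-1$ along it, and that it satisfies the original equation up to finite-difference error. The sample points and step size are arbitrary choices.

```python
def y_explicit(x):
    # Explicit solution from Example 1.8.3
    return (-x**2 - 1) / (x - 1)


def F(x, y):
    # Potential function found in Example 1.8.3
    return x**2 + x * y - y


def rhs(x, y):
    # Right-hand side of dy/dx = (-2x - y) / (x - 1)
    return (-2 * x - y) / (x - 1)


def derivative(g, x, h=1e-6):
    """Central-difference approximation of g'(x)."""
    return (g(x + h) - g(x - h)) / (2 * h)


xs = [-2.0, -1.0, 0.0, 0.5, 0.9]   # stay on the side x < 1 that contains x = 0
potential_values = [F(x, y_explicit(x)) for x in xs]
residuals = [abs(derivative(y_explicit, x) - rhs(x, y_explicit(x))) for x in xs]

print(y_explicit(0.0))
print(potential_values)
print(max(residuals))
```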
Example 1.8.4: Solve
\[
\frac{-y}{x^2 + y^2} \, dx + \frac{x}{x^2 + y^2} \, dy = 0, \qquad y(1) = 2.
\]
We leave to the reader to check that 𝑀 𝑦 = 𝑁𝑥 .
This vector field (𝑀, 𝑁) is not conservative if considered as a vector field of the entire
plane minus the origin. The problem is that if the curve 𝛾 is a circle around the origin, say
starting at (1, 0) and ending at (1, 0) going counterclockwise, then if 𝐹 existed we would
expect
\[
0 = F(1, 0) - F(1, 0) = \int_\gamma F_x \, dx + F_y \, dy = \int_\gamma \frac{-y}{x^2 + y^2} \, dx + \frac{x}{x^2 + y^2} \, dy = 2\pi .
\]

That is nonsense! We leave the computation of the path integral to the interested reader, or
you can consult your multivariable calculus textbook. So there is no potential function 𝐹
defined everywhere outside the origin (0, 0).
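For readers who would rather see the number than compute the integral by hand, here is a small numerical sketch of our own. It parametrizes the unit circle as $x = \cos t$, $y = \sin t$ for $0 \le t \le 2\pi$ and applies the midpoint rule; the number of subintervals is an arbitrary choice.

```python
import math


def M(x, y):
    return -y / (x**2 + y**2)


def N(x, y):
    return x / (x**2 + y**2)


def circle_integral(n=1000):
    """Midpoint-rule approximation of the integral of M dx + N dy
    once counterclockwise around the unit circle."""
    total = 0.0
    dt = 2 * math.pi / n
    for i in range(n):
        t = (i + 0.5) * dt
        x, y = math.cos(t), math.sin(t)
        dx_dt, dy_dt = -math.sin(t), math.cos(t)
        total += (M(x, y) * dx_dt + N(x, y) * dy_dt) * dt
    return total


result = circle_integral()
print(result, 2 * math.pi)
```

On the unit circle the integrand reduces to $\sin^2 t + \cos^2 t = 1$, so the sum is just the length of the parameter interval.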

If we think back to the theorem, it does not guarantee such a function anyway. It only
guarantees a potential function locally, that is, only in some region near the initial point.
As 𝑦(1) = 2, we start at the point (1, 2). Considering 𝑥 > 0 and integrating 𝑀 in 𝑥 or 𝑁
in 𝑦, we find
\[
F(x, y) = \arctan(y/x) .
\]

The implicit solution is $\arctan(y/x) = C$. Solving, $y = \tan(C)\, x$. That is, the solution is
a straight line. Solving $y(1) = 2$ gives us that $\tan(C) = 2$, and so $y = 2x$ is the desired
solution. See Figure 1.19, and note that the solution only exists for $x > 0$.

Figure 1.19: Solution to $\frac{-y}{x^2+y^2} \, dx + \frac{x}{x^2+y^2} \, dy = 0$, $y(1) = 2$, with initial point marked.

Example 1.8.5: Solve
\[
x^2 + y^2 + 2y(x + 1) \frac{dy}{dx} = 0.
\]
The reader should check that this equation is exact. Let $M = x^2 + y^2$ and $N = 2y(x + 1)$.
We follow the procedure for exact equations
\[
F(x, y) = \frac{1}{3} x^3 + xy^2 + A(y),
\]
and
\[
2y(x + 1) = 2xy + A'(y).
\]
Therefore $A'(y) = 2y$ or $A(y) = y^2$ and $F(x, y) = \frac{1}{3} x^3 + xy^2 + y^2$. We try to solve $F(x, y) = C$.
We easily solve for $y^2$ and then just take the square root:
\[
y^2 = \frac{C - (1/3)x^3}{x + 1}, \qquad \text{so} \qquad y = \pm\sqrt{\frac{C - (1/3)x^3}{x + 1}} .
\]
When $x = -1$, the term in front of $\frac{dy}{dx}$ is zero, and our explicit solution is not valid. The given
equation has no solution (for $y(x)$) near $x = -1$, but the equation $(x^2 + y^2) \, dx + 2y(x + 1) \, dy = 0$
does have a solution $x = -1$. In fact, one could solve for $x$ in terms of $y$ for any initial
condition. The solution is messy, so we leave it as $\frac{1}{3} x^3 + xy^2 + y^2 = C$.

1.8.2 Integrating factors


Sometimes an equation 𝑀 𝑑𝑥 +𝑁 𝑑𝑦 = 0 is not exact, but it can be made exact by multiplying
with a function 𝑢(𝑥, 𝑦). That is, perhaps for some nonzero function 𝑢(𝑥, 𝑦),

𝑢(𝑥, 𝑦)𝑀(𝑥, 𝑦) 𝑑𝑥 + 𝑢(𝑥, 𝑦)𝑁(𝑥, 𝑦) 𝑑𝑦 = 0

is exact. Any solution to this new equation is also a solution to 𝑀 𝑑𝑥 + 𝑁 𝑑𝑦 = 0.


In fact, a linear equation

\[
\frac{dy}{dx} + p(x) y = f(x), \qquad \text{or} \qquad \bigl( p(x) y - f(x) \bigr) \, dx + dy = 0
\]

is always such an equation. Let $r(x) = e^{\int p(x) \, dx}$ be the integrating factor for a linear
equation. Multiply the equation by $r(x)$ and write it in the form of $M + N \frac{dy}{dx} = 0$.

\[
r(x) p(x) y - r(x) f(x) + r(x) \frac{dy}{dx} = 0.
\]
Then 𝑀 = 𝑟(𝑥)𝑝(𝑥)𝑦 − 𝑟(𝑥) 𝑓 (𝑥), so 𝑀 𝑦 = 𝑟(𝑥)𝑝(𝑥), while 𝑁 = 𝑟(𝑥), so 𝑁𝑥 = 𝑟 ′(𝑥) = 𝑟(𝑥)𝑝(𝑥).
In other words, we have an exact equation. Integrating factors for linear equations are just
a special case of integrating factors for exact equations.
But how do we find the integrating factor 𝑢? Well, given an equation

𝑀 𝑑𝑥 + 𝑁 𝑑𝑦 = 0,

$u$ should be a function such that

\[
\frac{\partial}{\partial y} \bigl[ uM \bigr] = u_y M + u M_y = \frac{\partial}{\partial x} \bigl[ uN \bigr] = u_x N + u N_x .
\]

Therefore,
(𝑀 𝑦 − 𝑁𝑥 )𝑢 = 𝑢𝑥 𝑁 − 𝑢 𝑦 𝑀.
At first it may seem we replaced one differential equation by another. Even worse, the new
equation is a PDE. True, but all hope is not lost.
A strategy that often works is to look for a 𝑢 that is a function of 𝑥 alone, or a function
of 𝑦 alone. After all, that is what worked for linear equations. If 𝑢 is a function of 𝑥 alone,
that is, 𝑢(𝑥), then we write 𝑢 ′(𝑥) instead of 𝑢𝑥 , and 𝑢 𝑦 is just zero. Then

\[
\frac{M_y - N_x}{N} u = u' .
\]
In particular, $\frac{M_y - N_x}{N}$ ought to be a function of $x$ alone (not depend on $y$). If so, then we
have a linear equation
\[
u' - \frac{M_y - N_x}{N} u = 0.
\]
Letting $P(x) = \frac{M_y - N_x}{N}$, we solve the linear equation to find $u(x) = Ce^{\int P(x) \, dx}$. The constant
in the solution is not relevant, we need any nonzero solution, so we take $C = 1$. Then
$u(x) = e^{\int P(x) \, dx}$ is the integrating factor making the equation exact.
Similarly, we could try a function of the form $u(y)$. Then

\[
\frac{M_y - N_x}{M} u = -u' .
\]
In particular, $\frac{M_y - N_x}{M}$ ought to be a function of $y$ alone. If so, we have a linear equation

\[
u' + \frac{M_y - N_x}{M} u = 0.
\]
Letting $Q(y) = \frac{M_y - N_x}{M}$, we find $u(y) = Ce^{-\int Q(y) \, dy}$. We take $C = 1$. So $u(y) = e^{-\int Q(y) \, dy}$ is
the integrating factor.
Example 1.8.6: Solve
\[
\frac{x^2 + y^2}{x + 1} + 2y \frac{dy}{dx} = 0.
\]
Let $M = \frac{x^2 + y^2}{x + 1}$ and $N = 2y$. Compute

\[
M_y - N_x = \frac{2y}{x + 1} - 0 = \frac{2y}{x + 1} .
\]
As this is not zero, the equation is not exact. We notice

\[
P(x) = \frac{M_y - N_x}{N} = \frac{2y}{x + 1} \, \frac{1}{2y} = \frac{1}{x + 1}
\]

is a function of $x$ alone. We compute the integrating factor

\[
e^{\int P(x) \, dx} = e^{\ln(x + 1)} = x + 1.
\]

We multiply our given equation by $(x + 1)$ to obtain

\[
x^2 + y^2 + 2y(x + 1) \frac{dy}{dx} = 0,
\]
which is an exact equation that we solved in Example 1.8.5. The solution was
\[
y = \pm\sqrt{\frac{C - (1/3)x^3}{x + 1}} .
\]
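As a quick numerical sanity check (our own sketch, with hypothetical helper names and arbitrarily chosen sample points), we can compare $M_y - N_x$ before and after multiplying by the integrating factor $x + 1$, using central differences.

```python
def d_dx(g, x, y, h=1e-5):
    return (g(x + h, y) - g(x - h, y)) / (2 * h)


def d_dy(g, x, y, h=1e-5):
    return (g(x, y + h) - g(x, y - h)) / (2 * h)


def M(x, y):
    return (x**2 + y**2) / (x + 1)


def N(x, y):
    return 2 * y


def u(x):
    return x + 1   # integrating factor found in Example 1.8.6


def exactness_gap(A, B, x, y):
    """Return A_y - B_x, which is zero for an exact equation A dx + B dy = 0."""
    return d_dy(A, x, y) - d_dx(B, x, y)


samples = [(0.5, 1.0), (2.0, -1.5), (3.0, 0.25)]
gaps_before = [exactness_gap(M, N, x, y) for x, y in samples]
gaps_after = [
    exactness_gap(lambda a, b: u(a) * M(a, b), lambda a, b: u(a) * N(a, b), x, y)
    for x, y in samples
]

print(gaps_before)   # clearly nonzero: the original equation is not exact
print(gaps_after)    # zero up to rounding: the multiplied equation is exact
```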
Example 1.8.7: Solve
\[
y^2 + (xy + 1) \frac{dy}{dx} = 0.
\]
First compute
\[
M_y - N_x = 2y - y = y.
\]
As this is not zero, the equation is not exact. We observe
\[
Q(y) = \frac{M_y - N_x}{M} = \frac{y}{y^2} = \frac{1}{y}
\]

is a function of $y$ alone. We compute the integrating factor

\[
e^{-\int Q(y) \, dy} = e^{-\ln y} = \frac{1}{y} .
\]

Therefore, we look at the exact equation

\[
y + \frac{xy + 1}{y} \frac{dy}{dx} = 0.
\]

The reader should double check that this equation is exact. We follow the procedure for
exact equations
\[
F(x, y) = xy + A(y),
\]
and
\[
\frac{xy + 1}{y} = x + \frac{1}{y} = x + A'(y). \tag{1.6}
\]
Consequently, 𝐴′(𝑦) = 1/𝑦 or 𝐴(𝑦) = ln |𝑦|. Thus 𝐹(𝑥, 𝑦) = 𝑥𝑦 + ln |𝑦|. It is not possible
to solve 𝐹(𝑥, 𝑦) = 𝐶 for 𝑦 in terms of elementary functions, so let us be content with the
implicit solution:
𝑥𝑦 + ln |𝑦| = 𝐶.
We are looking for the general solution and we divided by 𝑦 above. We should check what
happens when 𝑦 = 0, as the equation itself makes perfect sense in that case. We plug in
𝑦 = 0 to find the equation is satisfied. So 𝑦 = 0 is also a solution.

1.8.3 Exercises
Exercise 1.8.2: Solve the following exact equations, implicit general solutions will suffice:
a) $(2xy + x^2) \, dx + (x^2 + y^2 + 1) \, dy = 0$    b) $x^5 + y^5 \frac{dy}{dx} = 0$
c) $e^x + y^3 + 3xy^2 \frac{dy}{dx} = 0$    d) $(x + y) \cos(x) + \sin(x) + \sin(x) y' = 0$

Exercise 1.8.3: Find the integrating factor for the following equations making them into exact
equations:
a) $e^{xy} \, dx + \frac{y}{x} e^{xy} \, dy = 0$    b) $\frac{e^x + y^3}{y^2} \, dx + 3x \, dy = 0$
c) $4(y^2 + x) \, dx + \frac{2x + 2y^2}{y} \, dy = 0$    d) $2 \sin(y) \, dx + x \cos(y) \, dy = 0$

Exercise 1.8.4: Suppose you have an equation of the form: $f(x) + g(y) \frac{dy}{dx} = 0$.

a) Show it is exact.
b) Find the form of the potential function in terms of 𝑓 and 𝑔.

Exercise 1.8.5: Suppose that we have the equation 𝑓 (𝑥) 𝑑𝑥 − 𝑑𝑦 = 0.

a) Is this equation exact?


b) Find the general solution using a definite integral.
Exercise 1.8.6: Find the potential function $F(x, y)$ of the exact equation $\frac{1 + xy}{x} \, dx + \bigl( 1/y + x \bigr) \, dy = 0$
in two different ways.

a) Integrate 𝑀 in terms of 𝑥 and then differentiate in 𝑦 and set to 𝑁.


b) Integrate 𝑁 in terms of 𝑦 and then differentiate in 𝑥 and set to 𝑀.

Exercise 1.8.7: A function 𝑢(𝑥, 𝑦) is said to be a harmonic function if 𝑢𝑥𝑥 + 𝑢 𝑦 𝑦 = 0.

a) Show if $u$ is harmonic, $-u_y \, dx + u_x \, dy = 0$ is an exact equation. So there exists (at least
locally) the so-called harmonic conjugate function $v(x, y)$ such that $v_x = -u_y$ and $v_y = u_x$.

Verify that the following 𝑢 are harmonic and find the corresponding harmonic conjugates 𝑣:

b) $u = 2xy$    c) $u = e^x \cos y$    d) $u = x^3 - 3xy^2$

Exercise 1.8.101: Solve the following exact equations, implicit general solutions will suffice:

a) $\cos(x) + ye^{xy} + xe^{xy} y' = 0$    b) $(2x + y) \, dx + (x - 4y) \, dy = 0$
c) $e^x + e^y \frac{dy}{dx} = 0$    d) $(3x^2 + 3y) \, dx + (3y^2 + 3x) \, dy = 0$

Exercise 1.8.102: Find the integrating factor for the following equations making them into exact
equations:

a) $\frac{1}{y} \, dx + 3y \, dy = 0$    b) $dx - e^{-x-y} \, dy = 0$
c) $\left( \frac{\cos(x)}{y^2} + \frac{1}{y} \right) dx + \frac{x}{y^2} \, dy = 0$    d) $\left( 2y + \frac{y^2}{x} \right) dx + (2y + x) \, dy = 0$

Exercise 1.8.103:

a) Show that every separable equation 𝑦 ′ = 𝑓 (𝑥)𝑔(𝑦) can be written as an exact equation, and
verify that it is indeed exact.
b) Rewrite 𝑦 ′ = 𝑥𝑦 as an exact equation, solve it, and verify that the solution is the same as it
was in Example 1.3.1.

1.9 First-order linear PDEs


Note: 1 lecture, can safely be skipped
We have only considered ODEs so far, so let us solve a linear first-order PDE. Consider

𝑎(𝑥, 𝑡) 𝑢𝑥 + 𝑏(𝑥, 𝑡) 𝑢𝑡 + 𝑐(𝑥, 𝑡) 𝑢 = 𝑔(𝑥, 𝑡), 𝑢(𝑥, 0) = 𝑓 (𝑥), −∞ < 𝑥 < ∞, 𝑡 > 0,

where $u(x, t)$ is a function of $x$ and $t$. We again use the notation $u_x = \frac{\partial u}{\partial x}$ and $u_t = \frac{\partial u}{\partial t}$ for
convenience. The initial condition $u(x, 0) = f(x)$ is now a function of $x$ rather than just a
number. In these problems, it is useful to think of 𝑥 as position and 𝑡 as time. The equation
describes the evolution of a function of 𝑥 as time goes on. Below, the coefficients 𝑎, 𝑏, 𝑐,
and the function 𝑔 are mostly going to be constant or zero. The method we describe works
with nonconstant coefficients, although the computations may get difficult quickly.
The method we use is the method of characteristics. The idea is to find lines along
which the equation is an ODE that we then solve. We will see this technique again for
second-order PDEs when we encounter the wave equation in § 4.8.
Example 1.9.1: Consider

𝛼𝑢𝑥 + 𝑢𝑡 = 0, 𝑢(𝑥, 0) = 𝑓 (𝑥),

where 𝛼 is a constant. This particular equation, 𝛼𝑢𝑥 + 𝑢𝑡 = 0, is called the transport equation.
The data will propagate along curves called characteristics. The idea is to change
to the so-called characteristic coordinates, which we will call (𝜉, 𝑠). If we change to these
coordinates, the equation simplifies. The change of variables for this equation is

𝜉 = 𝑥 − 𝛼𝑡, 𝑠 = 𝑡.

Let us see what the equation becomes. Remember the chain rule in several variables.

\[
\begin{aligned}
u_x &= u_\xi \xi_x + u_s s_x = u_\xi , \\
u_t &= u_\xi \xi_t + u_s s_t = -\alpha u_\xi + u_s .
\end{aligned}
\]

The equation in the coordinates $\xi$ and $s$ becomes

\[
\alpha \underbrace{(u_\xi)}_{u_x} + \underbrace{(-\alpha u_\xi + u_s)}_{u_t} = 0,
\]

or in other words
\[
u_s = 0.
\]
Treating $\xi$ as simply a parameter, we have obtained the ODE $\frac{du}{ds} = 0$. That is trivial to solve.
The solution is a function that does not depend on $s$ (but it does depend on $\xi$). That is,
there is some function $A$ such that

\[
u = A(\xi) = A(x - \alpha t).
\]


The initial condition says that:

\[
f(x) = u(x, 0) = A(x - \alpha \cdot 0) = A(x),
\]
so $A = f$. In other words,
\[
u(x, t) = f(x - \alpha t).
\]
Everything is simply moving to the right at speed $\alpha$ as $t$ increases. The curves given by the
equation
\[
\xi = \text{constant}
\]
are called the characteristic curves. See Figure 1.20. In this case, the solution does not change
along the characteristic as $\frac{du}{ds} = 0$.
Figure 1.20: Characteristic curves (the lines $\xi = 0$, $\xi = 1$, $\xi = 2$ in the $(x, t)$-plane).

In the $(x, t)$ coordinates, the characteristic curves satisfy $t = \frac{1}{\alpha}(x - \xi)$, and are in fact lines.
The slope of characteristic lines is $\frac{1}{\alpha}$, and for each different $\xi$, we get a different characteristic line.
We see why $\alpha u_x + u_t = 0$ is called the transport equation: Everything travels at some constant
speed. This behavior is called convection. An example application is material being moved by
a river where the material does not diffuse and is simply carried along. In this setup, $x$ is the
position along the river, $t$ is the time, and $u(x, t)$ the concentration of the material at
position $x$ and time $t$. See Figure 1.21 for an example.

Figure 1.21: Example of “transport” in $u_x + u_t = 0$ (that is, $\alpha = 1$) where the initial condition $f(x)$ is a
peak at the origin. On the left is a graph of the initial condition $u(x, 0)$. On the right is a graph of the
function $u(x, 1)$, that is, at time $t = 1$. Notice it is the same graph shifted one unit to the right.
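The shifting behavior can be checked numerically. In the sketch below, which is our own addition, the initial profile $f(x) = e^{-x^2}$ is an assumed stand-in for the peak in the figure (the text does not give its formula), and the partial derivatives are approximated by central differences.

```python
import math

ALPHA = 1.0   # transport speed, the value used in the figure


def f(x):
    # Assumed initial profile with a peak at the origin (hypothetical choice)
    return math.exp(-x**2)


def u(x, t):
    # Solution of alpha * u_x + u_t = 0 with u(x, 0) = f(x)
    return f(x - ALPHA * t)


def residual(x, t, h=1e-5):
    """Finite-difference value of alpha * u_x + u_t, which should be near zero."""
    u_x = (u(x + h, t) - u(x - h, t)) / (2 * h)
    u_t = (u(x, t + h) - u(x, t - h)) / (2 * h)
    return ALPHA * u_x + u_t


max_residual = max(
    abs(residual(x, t)) for x in (-1.0, 0.0, 0.5, 2.0) for t in (0.0, 0.5, 1.0)
)
print(max_residual)
print(u(1.0, 1.0), f(0.0))   # the peak has moved from x = 0 to x = 1 at time t = 1
```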

We use a similar idea in the more general case:

\[
a u_x + b u_t + c u = g, \qquad u(x, 0) = f(x).
\]

We change coordinates to the characteristic coordinates, which we call $(\xi, s)$. These are
coordinates where $a u_x + b u_t$ becomes differentiation in the $s$ variable.
Along the characteristic curves (where $\xi$ is constant), we get a new ODE in the $s$ variable.
In the transport equation, we got the simple $\frac{du}{ds} = 0$. In general, we get the linear equation

\[
\frac{du}{ds} + c u = g. \tag{1.7}
\]
We think of everything as a function of 𝜉 and 𝑠, although we are thinking of 𝜉 as a parameter
rather than an independent variable. So the equation is an ODE. It is a linear ODE that we
can solve using the integrating factor.
To find the characteristics, think of a curve given parametrically $\bigl( x(s), t(s) \bigr)$. We try to
have the curve satisfy
\[
\frac{dx}{ds} = a, \qquad \frac{dt}{ds} = b.
\]
Why? Because when we think of $x$ and $t$ as functions of $s$, we find, using the chain rule,

\[
\frac{du}{ds} + cu = \underbrace{\left( u_x \frac{dx}{ds} + u_t \frac{dt}{ds} \right)}_{\frac{du}{ds}} + cu = a u_x + b u_t + cu = g.
\]

So we get the ODE (1.7), which then describes the value of the solution 𝑢 of the PDE along
this characteristic curve. It is convenient to make sure that 𝑠 = 0 corresponds to 𝑡 = 0, that
is, 𝑡(0) = 0. It will also be convenient for 𝑥(0) = 𝜉. See Figure 1.22.

Figure 1.22: General characteristic curve $\bigl( x(s), t(s) \bigr)$, along which $\xi$ is constant, starting at $s = 0$ from the point $x = \xi$ on the $x$-axis.

Example 1.9.2: Consider

\[
u_x + u_t + u = x, \qquad u(x, 0) = e^{-x^2} .
\]

We find the characteristics, that is, the curves given by

\[
\frac{dx}{ds} = 1, \qquad \frac{dt}{ds} = 1.
\]
So
\[
x = s + c_1, \qquad t = s + c_2,
\]
for some $c_1$ and $c_2$. At $s = 0$, we want $x = \xi$ and $t = 0$. So we let $c_1 = \xi$ and $c_2 = 0$:

\[
x = s + \xi, \qquad t = s.
\]
The ODE is $\frac{du}{ds} + u = x$, and $x = s + \xi$. So, the ODE to solve along the characteristic is

\[
\frac{du}{ds} + u = s + \xi .
\]
The general solution of this equation, treating $\xi$ as a parameter, is $u = Ce^{-s} + s + \xi - 1$, for
some $C$, which can depend on $\xi$. At $s = 0$, our initial condition is that $u$ is $e^{-\xi^2}$, since at
$s = 0$, we have $x = \xi$. Given this initial condition, we find $C = e^{-\xi^2} - \xi + 1$. So,
\[
\begin{aligned}
u &= \bigl( e^{-\xi^2} - \xi + 1 \bigr) e^{-s} + s + \xi - 1 \\
  &= e^{-\xi^2 - s} + (1 - \xi) e^{-s} + s + \xi - 1.
\end{aligned}
\]

Substitute $\xi = x - t$ and $s = t$ to find $u$ in terms of $x$ and $t$:

\[
\begin{aligned}
u &= e^{-\xi^2 - s} + (1 - \xi) e^{-s} + s + \xi - 1 \\
  &= e^{-(x - t)^2 - t} + (1 - x + t) e^{-t} + x - 1.
\end{aligned}
\]

See Figure 1.23 on the next page for a plot of 𝑢(𝑥, 𝑡) as a function of two variables.
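
It is good practice to check such a formula. The following is a minimal numerical sketch (not part of the method itself) that estimates 𝑢𝑥 and 𝑢𝑡 for the solution of Example 1.9.2 by central differences and evaluates the residual 𝑢𝑥 + 𝑢𝑡 + 𝑢 − 𝑥, which should be close to zero. The function names and the step size h are our own choices for illustration.

```python
import math

def u(x, t):
    """Closed-form solution found in Example 1.9.2."""
    return math.exp(-(x - t) ** 2 - t) + (1 - x + t) * math.exp(-t) + x - 1

def pde_residual(x, t, h=1e-5):
    """Central-difference estimate of u_x + u_t + u - x (should be near 0)."""
    u_x = (u(x + h, t) - u(x - h, t)) / (2 * h)
    u_t = (u(x, t + h) - u(x, t - h)) / (2 * h)
    return u_x + u_t + u(x, t) - x

# Sample a few points in the (x, t)-plane.
for x, t in [(0.0, 0.5), (1.0, 1.0), (-2.0, 2.5)]:
    print(f"x={x:5.1f} t={t:4.1f} residual={pde_residual(x, t):.2e}")

# The initial condition u(x, 0) = exp(-x^2) can be checked directly.
print(u(0.7, 0.0), math.exp(-0.7 ** 2))
```

A small residual at sample points does not prove the formula, but a large one would immediately reveal an algebra mistake.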
When the coefficients are not constants, the characteristic curves are not going to be
straight lines anymore.
Example 1.9.3: Consider the following variable-coefficient equation:

𝑥𝑢𝑥 + 𝑢𝑡 + 2𝑢 = 0, 𝑢(𝑥, 0) = cos(𝑥).

We find the characteristics, that is, the curves given by

𝑑𝑥/𝑑𝑠 = 𝑥, 𝑑𝑡/𝑑𝑠 = 1.

So
𝑥 = 𝑐1 𝑒^𝑠, 𝑡 = 𝑠 + 𝑐2.
At 𝑠 = 0, we wish to get 𝑥 = 𝜉 and 𝑡 = 0 as before. So

𝑥 = 𝜉𝑒^𝑠, 𝑡 = 𝑠.

OK, the ODE we need to solve is

𝑑𝑢/𝑑𝑠 + 2𝑢 = 0.

Figure 1.23: Plot of the solution 𝑢(𝑥, 𝑡) to 𝑢𝑥 + 𝑢𝑡 + 𝑢 = 𝑥, 𝑢(𝑥, 0) = 𝑒^(−𝑥²), shown over −3 ≤ 𝑥 ≤ 3 and 0 ≤ 𝑡 ≤ 3.

This is for a fixed 𝜉. We find 𝑢 = 𝐶𝑒^(−2𝑠). At 𝑠 = 0, we want 𝑢 to be cos(𝜉), so that is our
initial condition for the ODE, and hence 𝐶 = cos(𝜉). Moreover, 𝜉 = 𝑥𝑒^(−𝑡) and 𝑠 = 𝑡. Consequently,

𝑢 = 𝑒^(−2𝑠) cos(𝜉) = 𝑒^(−2𝑡) cos(𝑥𝑒^(−𝑡)).
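
The same computation can be carried out numerically when no closed form is available. As a rough sketch, assuming a classical fourth-order Runge–Kutta step (our choice of integrator, not something the method requires), we follow one characteristic of Example 1.9.3 by integrating 𝑑𝑥/𝑑𝑠 = 𝑥, 𝑑𝑡/𝑑𝑠 = 1, 𝑑𝑢/𝑑𝑠 = −2𝑢 from the point (𝜉, 0) with 𝑢 = cos(𝜉), and compare with the formula above. All function names are hypothetical.

```python
import math

def rk4_step(f, state, ds):
    """One classical Runge-Kutta step for the system state' = f(state)."""
    k1 = f(state)
    k2 = f([y + 0.5 * ds * k for y, k in zip(state, k1)])
    k3 = f([y + 0.5 * ds * k for y, k in zip(state, k2)])
    k4 = f([y + ds * k for y, k in zip(state, k3)])
    return [y + ds * (a + 2 * b + 2 * c + d) / 6
            for y, a, b, c, d in zip(state, k1, k2, k3, k4)]

def rhs(state):
    """Characteristic system for x u_x + u_t + 2u = 0."""
    x, t, u = state
    return [x, 1.0, -2.0 * u]

def follow_characteristic(xi, s_end, steps=1000):
    """Start at (x, t, u) = (xi, 0, cos(xi)) and integrate up to s = s_end."""
    state = [xi, 0.0, math.cos(xi)]
    ds = s_end / steps
    for _ in range(steps):
        state = rk4_step(rhs, state, ds)
    return state

x, t, u_num = follow_characteristic(xi=0.5, s_end=1.0)
u_exact = math.exp(-2 * t) * math.cos(x * math.exp(-t))
print(x, t, u_num, u_exact)
```

Repeating this for many values of 𝜉 traces out the solution surface one characteristic at a time.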
We make a few closing remarks. One thing to keep in mind is that we would get into
trouble if the coefficient in front of 𝑢𝑡 , that is, the 𝑏, is ever zero. Let us consider a quick
example of what can go wrong:
𝑢𝑥 + 𝑢 = 0, 𝑢(𝑥, 0) = sin(𝑥).
This problem has no solution. If we had a solution, it would imply that 𝑢𝑥 (𝑥, 0) = cos(𝑥),
but 𝑢𝑥 (𝑥, 0) + 𝑢(𝑥, 0) = cos(𝑥) + sin(𝑥) ≠ 0. The problem is that the characteristic curve is
now the line 𝑡 = 0, and the solution is already provided on that line!
As 𝑏 ought to then be nonzero, it is convenient to ensure that 𝑏 is positive by multiplying
the equation by −1 if necessary, so that a positive 𝑠 means a positive 𝑡.
Another remark is that if 𝑎 or 𝑏 in the equation are not constants, the computations can
quickly get out of hand, as the expressions for the characteristic coordinates become messy
and then solving the ODE becomes even messier. In the examples above, 𝑏 was always 1,
meaning we got 𝑠 = 𝑡 in the characteristic coordinates. If 𝑏 is not constant, your expression
for 𝑠 will be more complicated.
Finding the characteristic coordinates is really a system of ODEs in general if 𝑎 depends
on 𝑡 or if 𝑏 depends on 𝑥. In that case, we would need techniques of systems of ODEs to
solve, see chapter 3 or chapter 8. In general, if 𝑎 and 𝑏 are not linear functions or constants,
finding closed form expressions for the characteristic coordinates may be impossible.
Finally, the method of characteristics applies to nonlinear first-order PDEs as well. In
the nonlinear case, the characteristics depend not only on the differential equation, but
also on the initial data. This leads to not only more difficult computations, but also the
formation of singularities where the solution breaks down at a certain point in time. An
example application where first-order nonlinear PDEs come up is traffic flow theory, and
you have probably experienced the formation of singularities: traffic jams. But we digress.

1.9.1 Exercises
Exercise 1.9.1: Solve

a) 9𝑢𝑥 + 𝑢𝑡 = 0, 𝑢(𝑥, 0) = sin(𝑥), b) −8𝑢𝑥 + 𝑢𝑡 = 0, 𝑢(𝑥, 0) = sin(𝑥),


c) 𝜋𝑢𝑥 + 𝑢𝑡 = 0, 𝑢(𝑥, 0) = sin(𝑥), d) 𝜋𝑢𝑥 + 𝑢𝑡 + 𝑢 = 0, 𝑢(𝑥, 0) = sin(𝑥).

Exercise 1.9.2: Solve 3𝑢𝑥 + 𝑢𝑡 = 1, 𝑢(𝑥, 0) = 𝑥 2 .

Exercise 1.9.3: Solve 3𝑢𝑥 + 𝑢𝑡 = 𝑥, 𝑢(𝑥, 0) = 𝑒 𝑥 .

Exercise 1.9.4: Solve 𝑢𝑥 + 𝑢𝑡 + 𝑥𝑢 = 0, 𝑢(𝑥, 0) = cos(𝑥).

Exercise 1.9.5:

a) Find the characteristic coordinates for the following equations:


1) 𝑢𝑥 + 𝑢𝑡 + 𝑢 = 1, 𝑢(𝑥, 0) = cos(𝑥), 2) 2𝑢𝑥 + 2𝑢𝑡 + 2𝑢 = 2, 𝑢(𝑥, 0) = cos(𝑥).
b) Solve the two equations using the coordinates.
c) Explain why you got the same solution, although the characteristic coordinates you found
were different.

Exercise 1.9.6: Solve 𝑥 2 𝑢𝑥 + (1 + 𝑥 2 )𝑢𝑡 + 𝑒 𝑥 𝑢 = 0, 𝑢(𝑥, 0) = 0. Hint: Think a little out of the box.

Exercise 1.9.101: Solve

a) −5𝑢𝑥 + 𝑢𝑡 = 0, 𝑢(𝑥, 0) = 1/(1 + 𝑥²), b) 2𝑢𝑥 + 𝑢𝑡 = 0, 𝑢(𝑥, 0) = cos(𝑥).

Exercise 1.9.102: Solve 𝑢𝑥 + 𝑢𝑡 + 𝑡𝑢 = 0, 𝑢(𝑥, 0) = cos(𝑥).

Exercise 1.9.103: Solve 𝑢𝑥 + 𝑢𝑡 = 5, 𝑢(𝑥, 0) = 𝑥.


Chapter 2

Higher-order linear ODEs

2.1 Second-order linear ODEs


Note: 1 lecture, reduction of order optional, first part of §3.1 in [EP], parts of §3.1 and §3.2 in [BD]
Consider the general second-order linear differential equation

𝐴(𝑥)𝑦 ′′ + 𝐵(𝑥)𝑦 ′ + 𝐶(𝑥)𝑦 = 𝐹(𝑥).

We usually divide through by 𝐴(𝑥) to get

𝑦 ′′ + 𝑝(𝑥)𝑦 ′ + 𝑞(𝑥)𝑦 = 𝑓 (𝑥), (2.1)

where 𝑝(𝑥) = 𝐵(𝑥)/𝐴(𝑥), 𝑞(𝑥) = 𝐶(𝑥)/𝐴(𝑥), and 𝑓 (𝑥) = 𝐹(𝑥)/𝐴(𝑥). The word linear means that the
equation contains no powers or functions of 𝑦, 𝑦 ′, and 𝑦 ′′.
In the special case when 𝑓 (𝑥) = 0, we have a so-called homogeneous equation

𝑦 ′′ + 𝑝(𝑥)𝑦 ′ + 𝑞(𝑥)𝑦 = 0. (2.2)

We have already seen some second-order linear homogeneous equations.

𝑦 ′′ + 𝑘 2 𝑦 = 0 Two solutions are: 𝑦1 = cos(𝑘𝑥), 𝑦2 = sin(𝑘𝑥).


𝑦 ′′ − 𝑘 2 𝑦 = 0 Two solutions are: 𝑦1 = 𝑒 𝑘𝑥 , 𝑦2 = 𝑒 −𝑘𝑥 .

If we know two solutions of a linear homogeneous equation, we know many more of them.

Theorem 2.1.1 (Superposition). Suppose 𝑦1 and 𝑦2 are two solutions of the homogeneous equation
(2.2). Then
𝑦(𝑥) = 𝐶1 𝑦1 (𝑥) + 𝐶2 𝑦2 (𝑥),
also solves (2.2) for arbitrary constants 𝐶1 and 𝐶2 .

That is, we can add solutions together and multiply them by constants to obtain new
and different solutions. We call the expression 𝐶1 𝑦1 + 𝐶2 𝑦2 a linear combination of 𝑦1 and
𝑦2 . Let us prove this theorem; the proof is very enlightening and illustrates how linear
equations work.
Proof: Let 𝑦 = 𝐶1 𝑦1 + 𝐶2 𝑦2 . Then
𝑦 ′′ + 𝑝𝑦 ′ + 𝑞𝑦 = (𝐶1 𝑦1 + 𝐶2 𝑦2 )′′ + 𝑝(𝐶1 𝑦1 + 𝐶2 𝑦2 )′ + 𝑞(𝐶1 𝑦1 + 𝐶2 𝑦2 )
= 𝐶1 𝑦1′′ + 𝐶2 𝑦2′′ + 𝐶1 𝑝𝑦1′ + 𝐶2 𝑝𝑦2′ + 𝐶1 𝑞𝑦1 + 𝐶2 𝑞𝑦2
= 𝐶1 (𝑦1′′ + 𝑝𝑦1′ + 𝑞𝑦1 ) + 𝐶2 (𝑦2′′ + 𝑝𝑦2′ + 𝑞𝑦2 )
= 𝐶1 · 0 + 𝐶2 · 0 = 0. □

The proof becomes even simpler to state if we use the operator notation. An operator is
an object that eats functions and spits out functions (kind of like what a function is, but a
function eats numbers and spits out numbers). Define the operator 𝐿 by
𝐿𝑦 = 𝑦 ′′ + 𝑝𝑦 ′ + 𝑞𝑦.
The differential equation now becomes 𝐿𝑦 = 0. The operator (and the equation) 𝐿 being
linear means that 𝐿(𝐶1 𝑦1 + 𝐶2 𝑦2 ) = 𝐶1 𝐿𝑦1 + 𝐶2 𝐿𝑦2 . It is almost as if we were “multiplying”
by 𝐿. The proof above becomes
𝐿𝑦 = 𝐿(𝐶1 𝑦1 + 𝐶2 𝑦2 ) = 𝐶1 𝐿𝑦1 + 𝐶2 𝐿𝑦2 = 𝐶1 · 0 + 𝐶2 · 0 = 0.

Two different solutions to the second equation 𝑦′′ − 𝑘²𝑦 = 0 are 𝑦1 = cosh(𝑘𝑥) and
𝑦2 = sinh(𝑘𝑥). Recalling the definition of sinh and cosh, we note that these are solutions by
superposition as they are linear combinations of the two exponential solutions:
cosh(𝑘𝑥) = (𝑒^(𝑘𝑥) + 𝑒^(−𝑘𝑥))/2 = (1/2)𝑒^(𝑘𝑥) + (1/2)𝑒^(−𝑘𝑥) and
sinh(𝑘𝑥) = (𝑒^(𝑘𝑥) − 𝑒^(−𝑘𝑥))/2 = (1/2)𝑒^(𝑘𝑥) − (1/2)𝑒^(−𝑘𝑥).
The functions sinh and cosh are sometimes more convenient to use than the exponential.
Let us review some of their properties:
cosh 0 = 1, sinh 0 = 0,
𝑑/𝑑𝑥 [cosh 𝑥] = sinh 𝑥, 𝑑/𝑑𝑥 [sinh 𝑥] = cosh 𝑥,
cosh² 𝑥 − sinh² 𝑥 = 1.
Exercise 2.1.1: Derive these properties using the definitions of sinh and cosh in terms of exponen-
tials.
Linear equations have nice and simple answers to the existence and uniqueness question.
Theorem 2.1.2 (Existence and uniqueness). Suppose 𝑝, 𝑞, 𝑓 are continuous functions on some
interval 𝐼, 𝑎 is a number in 𝐼, and 𝑏0 , 𝑏1 are constants. Then the equation
𝑦 ′′ + 𝑝(𝑥)𝑦 ′ + 𝑞(𝑥)𝑦 = 𝑓 (𝑥),
has exactly one solution 𝑦(𝑥) defined on the interval 𝐼 satisfying the initial conditions
𝑦(𝑎) = 𝑏0 , 𝑦 ′(𝑎) = 𝑏1 .

For example, the equation 𝑦′′ + 𝑘²𝑦 = 0 with 𝑦(0) = 𝑏0 and 𝑦′(0) = 𝑏1 has the solution

𝑦(𝑥) = 𝑏0 cos(𝑘𝑥) + (𝑏1/𝑘) sin(𝑘𝑥).

The equation 𝑦′′ − 𝑘²𝑦 = 0 with 𝑦(0) = 𝑏0 and 𝑦′(0) = 𝑏1 has the solution

𝑦(𝑥) = 𝑏0 cosh(𝑘𝑥) + (𝑏1/𝑘) sinh(𝑘𝑥).
Using cosh and sinh in this solution allows us to solve for the initial conditions in a cleaner
way than if we had used the exponentials.
The initial conditions for a second-order ODE consist of two equations. Common sense
tells us that if we have two arbitrary constants and two equations, then we should be able
to solve for the constants and find a solution to the differential equation satisfying the
initial conditions.
Question: Suppose we find two different solutions 𝑦1 and 𝑦2 to the homogeneous
equation (2.2). Can every solution be written (using superposition) in the form 𝑦 =
𝐶 1 𝑦1 + 𝐶 2 𝑦2 ?
The answer is affirmative, provided that 𝑦1 and 𝑦2 are different enough in the following
sense. We say 𝑦1 and 𝑦2 are linearly independent if one is not a constant multiple of the other.

Theorem 2.1.3. Let 𝑝, 𝑞 be continuous functions. Let 𝑦1 and 𝑦2 be two linearly independent
solutions to the homogeneous equation (2.2). Then every other solution is of the form

𝑦 = 𝐶 1 𝑦1 + 𝐶 2 𝑦2 .

That is, 𝑦 = 𝐶1 𝑦1 + 𝐶2 𝑦2 is the general solution.

For example, we found the solutions 𝑦1 = sin 𝑥 and 𝑦2 = cos 𝑥 for the equation 𝑦 ′′ + 𝑦 = 0.
It is not hard to see that sine and cosine are not constant multiples of each other. Indeed,
if sin 𝑥 = 𝐴 cos 𝑥 for some constant 𝐴, plugging in 𝑥 = 0 would imply 𝐴 = 0. But then
sin 𝑥 = 0 for all 𝑥, which is preposterous. So 𝑦1 and 𝑦2 are linearly independent. Hence,

𝑦 = 𝐶1 cos 𝑥 + 𝐶2 sin 𝑥

is the general solution to 𝑦 ′′ + 𝑦 = 0.


For two functions, checking linear independence is rather simple. Let us see another
example. Consider 𝑦′′ − 2𝑥^(−2)𝑦 = 0. Then 𝑦1 = 𝑥² and 𝑦2 = 1/𝑥 are solutions. To see that
they are linearly independent, suppose one is a multiple of the other, 𝑦1 = 𝐴𝑦2. We just
have to show that 𝐴 cannot be a constant. In this case we have 𝐴 = 𝑦1/𝑦2 = 𝑥³, which is
most decidedly not a constant. So 𝑦 = 𝐶1 𝑥² + 𝐶2 (1/𝑥) is the general solution.
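
The ratio test for two functions is easy to try numerically. The sketch below (with hypothetical helper names and sample points of our own choosing) evaluates 𝑦1/𝑦2 at a few points and reports how much the ratio varies. A ratio that differs between two points shows the functions are not constant multiples of each other; a ratio that looks constant at a few sample points only suggests dependence and does not prove it.

```python
import math

def ratio_spread(y1, y2, points):
    """Return max - min of y1(x)/y2(x) over the sample points."""
    ratios = [y1(x) / y2(x) for x in points]
    return max(ratios) - min(ratios)

points = [0.5, 1.0, 2.0, 3.0]

# y1 = x^2 and y2 = 1/x from the text: the ratio is x^3, far from constant.
spread_independent = ratio_spread(lambda x: x ** 2, lambda x: 1 / x, points)

# A made-up dependent pair: 3 e^(2x) is a constant multiple of e^(2x).
spread_dependent = ratio_spread(lambda x: 3 * math.exp(2 * x),
                                lambda x: math.exp(2 * x), points)

print(spread_independent)  # large: the ratio changes from point to point
print(spread_dependent)    # essentially zero, up to rounding
```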
If you have one nonzero solution to a second-order linear homogeneous equation, then
you can find another one. This is the reduction of order method. The idea is that if we
somehow found 𝑦1 as a solution of 𝑦′′ + 𝑝(𝑥)𝑦′ + 𝑞(𝑥)𝑦 = 0, then we try a second solution
of the form 𝑦2(𝑥) = 𝑦1(𝑥)𝑣(𝑥). We just need to find 𝑣. We plug 𝑦2 into the equation, using
𝑦2′ = 𝑦1′𝑣 + 𝑦1𝑣′ and 𝑦2′′ = 𝑦1′′𝑣 + 2𝑦1′𝑣′ + 𝑦1𝑣′′:

0 = 𝑦2′′ + 𝑝(𝑥)𝑦2′ + 𝑞(𝑥)𝑦2
  = (𝑦1′′𝑣 + 2𝑦1′𝑣′ + 𝑦1𝑣′′) + 𝑝(𝑥)(𝑦1′𝑣 + 𝑦1𝑣′) + 𝑞(𝑥)𝑦1𝑣
  = 𝑦1𝑣′′ + (2𝑦1′ + 𝑝(𝑥)𝑦1)𝑣′ + (𝑦1′′ + 𝑝(𝑥)𝑦1′ + 𝑞(𝑥)𝑦1)𝑣.

The last term vanishes because 𝑦1 solves the equation, so 𝑦1′′ + 𝑝(𝑥)𝑦1′ + 𝑞(𝑥)𝑦1 = 0.

In other words, 𝑦1𝑣′′ + (2𝑦1′ + 𝑝(𝑥)𝑦1)𝑣′ = 0. Using 𝑤 = 𝑣′, we have the first-order linear
equation 𝑦1𝑤′ + (2𝑦1′ + 𝑝(𝑥)𝑦1)𝑤 = 0. After solving this equation for 𝑤 (integrating factor),
we find 𝑣 by antidifferentiating 𝑤. We then form 𝑦2 by computing 𝑦1𝑣. For example,
suppose we somehow know 𝑦1 = 𝑥 is a solution to 𝑦′′ + 𝑥^(−1)𝑦′ − 𝑥^(−2)𝑦 = 0. The equation
for 𝑤 is then 𝑥𝑤′ + 3𝑤 = 0. We find a solution, 𝑤 = 𝐶𝑥^(−3), and we find an antiderivative
𝑣 = −𝐶/(2𝑥²). Hence 𝑦2 = 𝑦1𝑣 = −𝐶/(2𝑥). Any nonzero 𝐶 works and so 𝐶 = −2 makes 𝑦2 = 1/𝑥. Thus, the
general solution is 𝑦 = 𝐶1 𝑥 + 𝐶2 (1/𝑥).
Since we have a formula for the solution to the first-order linear equation, we can write
a formula for 𝑦2:

𝑦2(𝑥) = 𝑦1(𝑥) ∫ [𝑒^(−∫ 𝑝(𝑥) 𝑑𝑥) / (𝑦1(𝑥))²] 𝑑𝑥.
However, it is much easier to remember that we just need to try 𝑦2 (𝑥) = 𝑦1 (𝑥)𝑣(𝑥) and find
𝑣(𝑥) as we did above. The technique works for higher-order equations too: You get to
reduce the order by one for each solution you find. So it is better to remember how to do it
rather than a specific formula.
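
To see the reduction of order formula in action, here is a rough numerical sketch for the example above, 𝑦1 = 𝑥 and 𝑝(𝑥) = 1/𝑥. We assume a composite Simpson rule for the integrals and take both antiderivatives to start at 𝑥0 = 1; these are our choices for illustration. A different starting point only adds a multiple of 𝑦1 to 𝑦2 or rescales it. With these choices the exact answer is 𝑦2 = 𝑥/2 − 1/(2𝑥), a combination of 𝑥 and 1/𝑥, as expected.

```python
import math

def simpson(f, a, b, n=200):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def p(x):
    return 1.0 / x

def y1(x):
    return x

def y2(x, x0=1.0):
    """y1(x) times the integral from x0 to x of exp(-int_{x0}^{t} p) / y1(t)^2."""
    def integrand(t):
        return math.exp(-simpson(p, x0, t)) / y1(t) ** 2
    return y1(x) * simpson(integrand, x0, x)

# Compare with the exact value x/2 - 1/(2x) at x = 2.
print(y2(2.0), 2.0 / 2 - 1 / (2 * 2.0))
```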
We will study the solution of nonhomogeneous equations in § 2.5. We will first focus
on finding general solutions to homogeneous equations.

2.1.1 Exercises
Exercise 2.1.2: Show that 𝑦 = 𝑒 𝑥 and 𝑦 = 𝑒 2𝑥 are linearly independent.

Exercise 2.1.3: Take 𝑦 ′′ + 5𝑦 = 10𝑥 + 5. Find (guess!) a solution.

Exercise 2.1.4: Prove the superposition principle for nonhomogeneous equations. Suppose that 𝑦1
is a solution to 𝐿𝑦1 = 𝑓 (𝑥) and 𝑦2 is a solution to 𝐿𝑦2 = 𝑔(𝑥) (same linear operator 𝐿). Show that
𝑦 = 𝑦1 + 𝑦2 solves 𝐿𝑦 = 𝑓 (𝑥) + 𝑔(𝑥).

Exercise 2.1.5: For the equation 𝑥 2 𝑦 ′′ − 𝑥𝑦 ′ = 0, find two solutions, show that they are linearly
independent and find the general solution. Hint: Try 𝑦 = 𝑥 𝑟 .

Equations of the form 𝑎𝑥 2 𝑦 ′′ + 𝑏𝑥𝑦 ′ + 𝑐𝑦 = 0 are called Euler’s equations or Cauchy–Euler


equations. They are solved by trying 𝑦 = 𝑥 𝑟 and solving for 𝑟 (assume 𝑥 ≥ 0 for simplicity).

Exercise 2.1.6: Suppose that (𝑏 − 𝑎)2 − 4𝑎𝑐 > 0.

a) Find a formula for the general solution of Euler’s equation (see above) 𝑎𝑥 2 𝑦 ′′ + 𝑏𝑥𝑦 ′ + 𝑐𝑦 = 0.
Hint: Try 𝑦 = 𝑥 𝑟 and find a formula for 𝑟.
b) What happens when (𝑏 − 𝑎)2 − 4𝑎𝑐 = 0 or (𝑏 − 𝑎)2 − 4𝑎𝑐 < 0?

We will revisit the case when (𝑏 − 𝑎)2 − 4𝑎𝑐 < 0 later.

Exercise 2.1.7: Same equation as in Exercise 2.1.6. Suppose (𝑏 − 𝑎)2 − 4𝑎𝑐 = 0. Find a formula
for the general solution of 𝑎𝑥 2 𝑦 ′′ + 𝑏𝑥𝑦 ′ + 𝑐𝑦 = 0. Hint: Try 𝑦 = 𝑥 𝑟 ln 𝑥 for the second solution.

Exercise 2.1.8 (reduction of order): Suppose 𝑦1 is a solution to 𝑦′′ + 𝑝(𝑥)𝑦′ + 𝑞(𝑥)𝑦 = 0. By
directly plugging into the equation, show that

𝑦2(𝑥) = 𝑦1(𝑥) ∫ [𝑒^(−∫ 𝑝(𝑥) 𝑑𝑥) / (𝑦1(𝑥))²] 𝑑𝑥

is also a solution.

Exercise 2.1.9 (Chebyshev’s equation of order 1): Take (1 − 𝑥 2 )𝑦 ′′ − 𝑥𝑦 ′ + 𝑦 = 0.

a) Show that 𝑦 = 𝑥 is a solution.


b) Use reduction of order to find a second linearly independent solution.
c) Write down the general solution.

Exercise 2.1.10 (Hermite’s equation of order 2): Take 𝑦 ′′ − 2𝑥𝑦 ′ + 4𝑦 = 0.

a) Show that 𝑦 = 1 − 2𝑥 2 is a solution.


b) Use reduction of order to find a second linearly independent solution. (It’s OK to leave a
definite integral in the formula.)
c) Write down the general solution.

Exercise 2.1.101: Are sin(𝑥) and 𝑒 𝑥 linearly independent? Justify.

Exercise 2.1.102: Are 𝑒 𝑥 and 𝑒 𝑥+2 linearly independent? Justify.

Exercise 2.1.103: Guess a solution to 𝑦 ′′ + 𝑦 ′ + 𝑦 = 5.

Exercise 2.1.104: Find the general solution to 𝑥𝑦 ′′ + 𝑦 ′ = 0. Hint: It is a first-order ODE in 𝑦 ′.

Exercise 2.1.105: Write down an equation (guess) for which we have the solutions 𝑒 𝑥 and 𝑒 2𝑥 .
Hint: Try an equation of the form 𝑦 ′′ + 𝐴𝑦 ′ + 𝐵𝑦 = 0 for constants 𝐴 and 𝐵, plug in both 𝑒 𝑥 and
𝑒 2𝑥 and solve for 𝐴 and 𝐵.

2.2 Constant-coefficient second-order linear ODEs


Note: more than 1 lecture, second part of §3.1 in [EP], §3.1 in [BD]

2.2.1 Solving constant-coefficient equations


Consider the problem

𝑦 ′′ − 6𝑦 ′ + 8𝑦 = 0, 𝑦(0) = −2, 𝑦 ′(0) = 6.

This is a second-order linear homogeneous equation with constant coefficients. Constant


coefficients means that the functions in front of 𝑦 ′′, 𝑦 ′, and 𝑦 are constants; they do not
depend on 𝑥.
To guess a solution, think of a function that stays essentially the same when we
differentiate it, so that we can take the function and its derivatives, add some multiples of
these together, and end up with zero. Yes, we are talking about the exponential.
Let us try‗ a solution of the form 𝑦 = 𝑒 𝑟𝑥 . Then 𝑦 ′ = 𝑟𝑒 𝑟𝑥 and 𝑦 ′′ = 𝑟 2 𝑒 𝑟𝑥 . Plug in to get

𝑦′′ − 6𝑦′ + 8𝑦 = 0,
𝑟²𝑒^(𝑟𝑥) − 6𝑟𝑒^(𝑟𝑥) + 8𝑒^(𝑟𝑥) = 0 (substituting 𝑦′′ = 𝑟²𝑒^(𝑟𝑥), 𝑦′ = 𝑟𝑒^(𝑟𝑥), 𝑦 = 𝑒^(𝑟𝑥)),
𝑟² − 6𝑟 + 8 = 0 (divide through by 𝑒^(𝑟𝑥)),
(𝑟 − 2)(𝑟 − 4) = 0.

Hence, if 𝑟 = 2 or 𝑟 = 4, then 𝑒 𝑟𝑥 is a solution. So let 𝑦1 = 𝑒 2𝑥 and 𝑦2 = 𝑒 4𝑥 .

Exercise 2.2.1: Check that 𝑦1 and 𝑦2 are solutions.

The functions 𝑒 2𝑥 and 𝑒 4𝑥 are linearly independent. If they were not linearly independent,
we could write 𝑒 4𝑥 = 𝐶𝑒 2𝑥 for some constant 𝐶, implying that 𝑒 2𝑥 = 𝐶 for all 𝑥, which is
clearly not possible. Hence, we can write the general solution as

𝑦 = 𝐶1 𝑒 2𝑥 + 𝐶2 𝑒 4𝑥 .

We need to solve for 𝐶1 and 𝐶2 . To apply the initial conditions, we first find 𝑦 ′ =
2𝐶1 𝑒 2𝑥 + 4𝐶2 𝑒 4𝑥 . We plug 𝑥 = 0 into 𝑦 and 𝑦 ′ and solve.

−2 = 𝑦(0) = 𝐶1 + 𝐶2 ,
6 = 𝑦 ′(0) = 2𝐶1 + 4𝐶2 .
‗ Making an educated guess with some parameters to solve for is such a central technique in differential
equations that people sometimes use a fancy name for such a guess: ansatz, German for “initial placement of
a tool at a work piece.” Yes, the Germans have a word for that.

Either apply some matrix algebra, or just solve these by high school math. For example,
divide the second equation by 2 to obtain 3 = 𝐶1 + 2𝐶2 , and subtract the two equations to
get 5 = 𝐶2 . Then 𝐶1 = −7 as −2 = 𝐶1 + 5. Hence, the solution we are looking for is
𝑦 = −7𝑒 2𝑥 + 5𝑒 4𝑥 .
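
The step of solving for 𝐶1 and 𝐶2 is a small linear system, and it can be automated. The sketch below assumes the general solution 𝑦 = 𝐶1 𝑒^(𝑟1 𝑥) + 𝐶2 𝑒^(𝑟2 𝑥) with distinct real roots, so that the conditions at 𝑥 = 0 read 𝐶1 + 𝐶2 = 𝑦(0) and 𝑟1 𝐶1 + 𝑟2 𝐶2 = 𝑦′(0). It solves the system by Cramer's rule with exact fractions; the helper name solve_2x2 is our own.

```python
from fractions import Fraction

def solve_2x2(a11, a12, a21, a22, b1, b2):
    """Solve a11*c1 + a12*c2 = b1, a21*c1 + a22*c2 = b2 by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("the system has no unique solution")
    c1 = Fraction(b1 * a22 - a12 * b2, det)
    c2 = Fraction(a11 * b2 - b1 * a21, det)
    return c1, c2

# Roots of r^2 - 6r + 8 = 0 and the initial conditions y(0) = -2, y'(0) = 6.
r1, r2 = 2, 4
C1, C2 = solve_2x2(1, 1, r1, r2, -2, 6)
print(C1, C2)  # -7 5
```

The determinant is 𝑟2 − 𝑟1, so the system always has a unique solution when the roots are distinct.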

We generalize this example into a method. Suppose that we have an equation


𝑎𝑦 ′′ + 𝑏𝑦 ′ + 𝑐𝑦 = 0, (2.3)
where 𝑎, 𝑏, 𝑐 are constants. Try the solution 𝑦 = 𝑒 𝑟𝑥 to obtain
𝑎𝑟 2 𝑒 𝑟𝑥 + 𝑏𝑟𝑒 𝑟𝑥 + 𝑐𝑒 𝑟𝑥 = 0.
Divide by 𝑒 𝑟𝑥 to obtain the so-called characteristic equation of the ODE:
𝑎𝑟 2 + 𝑏𝑟 + 𝑐 = 0.
Solve for the 𝑟 by using the quadratic formula:

𝑟1, 𝑟2 = (−𝑏 ± √(𝑏² − 4𝑎𝑐)) / (2𝑎).
Suppose that 𝑏 2 − 4𝑎𝑐 ≥ 0 for now so that 𝑟1 and 𝑟2 are real. So 𝑒 𝑟1 𝑥 and 𝑒 𝑟2 𝑥 are solutions.
There is still a difficulty if 𝑟1 = 𝑟2 , but it is not hard to overcome.
Theorem 2.2.1. Suppose that 𝑟1 and 𝑟2 are the roots of the characteristic equation.
(i) If 𝑟1 and 𝑟2 are distinct and real (when 𝑏 2 − 4𝑎𝑐 > 0), then (2.3) has the general solution
𝑦 = 𝐶 1 𝑒 𝑟1 𝑥 + 𝐶 2 𝑒 𝑟2 𝑥 .

(ii) If 𝑟1 = 𝑟2 (happens when 𝑏 2 − 4𝑎𝑐 = 0), then (2.3) has the general solution
𝑦 = (𝐶1 + 𝐶2 𝑥) 𝑒 𝑟1 𝑥 .

Example 2.2.1: Solve


𝑦 ′′ − 𝑘 2 𝑦 = 0.
The characteristic equation is 𝑟 2 − 𝑘 2 = 0 or (𝑟 − 𝑘)(𝑟 + 𝑘) = 0. Consequently, 𝑒 −𝑘𝑥 and 𝑒 𝑘𝑥
are the two linearly independent solutions, and the general solution is
𝑦 = 𝐶1 𝑒 𝑘𝑥 + 𝐶2 𝑒 −𝑘𝑥 .
Since cosh 𝑠 = (𝑒^𝑠 + 𝑒^(−𝑠))/2 and sinh 𝑠 = (𝑒^𝑠 − 𝑒^(−𝑠))/2, we can also write the general solution as
𝑦 = 𝐷1 cosh(𝑘𝑥) + 𝐷2 sinh(𝑘𝑥).
Example 2.2.2: Find the general solution of
𝑦 ′′ − 8𝑦 ′ + 16𝑦 = 0.
The characteristic equation is 𝑟 2 − 8𝑟 + 16 = (𝑟 − 4)2 = 0. The equation has a double root
𝑟1 = 𝑟2 = 4. The general solution is, therefore,
𝑦 = (𝐶1 + 𝐶2 𝑥) 𝑒 4𝑥 = 𝐶1 𝑒 4𝑥 + 𝐶2 𝑥𝑒 4𝑥 .

Exercise 2.2.2: Check that 𝑒 4𝑥 and 𝑥𝑒 4𝑥 are linearly independent.


It is good to check your work. That 𝑒 4𝑥 solves the equation is clear. Let us check that
𝑥𝑒 4𝑥 solves the equation. Compute 𝑦 ′ = 𝑒 4𝑥 + 4𝑥𝑒 4𝑥 and 𝑦 ′′ = 8𝑒 4𝑥 + 16𝑥𝑒 4𝑥 . Plug in,
𝑦 ′′ − 8𝑦 ′ + 16𝑦 = 8𝑒 4𝑥 + 16𝑥𝑒 4𝑥 − 8(𝑒 4𝑥 + 4𝑥𝑒 4𝑥 ) + 16𝑥𝑒 4𝑥 = 0.
In some sense, a doubled root rarely happens. If coefficients are picked randomly, a
doubled root is unlikely. There are, however, some real-world problems where a doubled
root does happen naturally (e.g. critically damped mass-spring system, as we will see).
Let us give a short argument for why the solution 𝑥𝑒^(𝑟𝑥) works for a doubled root. This
case is a limiting case of two distinct but very close roots. Note that (𝑒^(𝑟2 𝑥) − 𝑒^(𝑟1 𝑥))/(𝑟2 − 𝑟1)
is a solution when the roots are distinct. When we take the limit as 𝑟1 goes to 𝑟2, we are really taking the
derivative of 𝑒^(𝑟𝑥) using 𝑟 as the variable. Therefore, the limit is 𝑥𝑒^(𝑟𝑥), and hence this is a
solution in the doubled root case. We remark that in some numerical computations, two
very close roots may lead to numerical instability while a doubled root will not.
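
The limiting argument can be watched numerically. In this small sketch (function names and sample values are ours), we fix 𝑥 and one root 𝑟, move the second root closer and closer, and compare the difference quotient with 𝑥𝑒^(𝑟𝑥).

```python
import math

def difference_quotient_solution(x, r1, r2):
    """The solution (e^(r2 x) - e^(r1 x)) / (r2 - r1) for distinct roots."""
    return (math.exp(r2 * x) - math.exp(r1 * x)) / (r2 - r1)

def doubled_root_solution(x, r):
    """The limiting solution x e^(r x) for a doubled root r."""
    return x * math.exp(r * x)

x, r = 0.5, 4.0
for gap in (1e-1, 1e-3, 1e-5):
    approx = difference_quotient_solution(x, r, r + gap)
    print(f"gap={gap:.0e}  quotient={approx:.8f}  "
          f"limit={doubled_root_solution(x, r):.8f}")
```

Making the gap extremely small eventually hurts accuracy, because two nearly equal numbers are subtracted; this is one face of the numerical instability mentioned above.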

2.2.2 Complex numbers and Euler’s formula


A polynomial may have complex roots. The equation 𝑟 2 + 1 = 0 has no real roots, but it
does have two complex roots. Here we review some properties of complex numbers.
Complex numbers may seem a strange concept, especially because of the terminology.
There is nothing imaginary or really complicated about complex numbers. A complex
number is simply a pair of real numbers, (𝑎, 𝑏). Think of a complex number as a point in the
plane. We add complex numbers in the straightforward way: (𝑎, 𝑏) + (𝑐, 𝑑) = (𝑎 + 𝑐, 𝑏 + 𝑑).
We define multiplication by
(𝑎, 𝑏) × (𝑐, 𝑑) = (𝑎𝑐 − 𝑏𝑑, 𝑎𝑑 + 𝑏𝑐).
It turns out that with this multiplication rule, all the standard properties of arithmetic hold.
Further, and most importantly (0, 1) × (0, 1) = (−1, 0).
Generally we write (𝑎, 𝑏) as 𝑎 + 𝑖𝑏, and we treat 𝑖 as if it were an unknown. When 𝑏 is
zero, then (𝑎, 0) is just the number 𝑎. We do arithmetic with complex numbers just as we
would with polynomials. The property we just mentioned becomes 𝑖 2 = −1. So whenever
we see 𝑖 2 , we replace it by −1. For example,
(2 + 3𝑖)(4𝑖) − 5𝑖 = (2 × 4)𝑖 + (3 × 4)𝑖 2 − 5𝑖 = 8𝑖 + 12(−1) − 5𝑖 = −12 + 3𝑖.
The numbers 𝑖 and −𝑖 are the two roots of 𝑟 2 + 1 = 0. Some engineers use the letter 𝑗
instead of 𝑖 for the square root of −1. We use the mathematicians’ convention and use 𝑖.
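
The pair definition is easy to experiment with. The sketch below implements the multiplication rule on pairs directly (multiply_pairs is our own helper name) and then repeats the example computation with Python's built-in complex numbers. Note that Python follows the engineers' convention and writes the imaginary unit as j.

```python
def multiply_pairs(p, q):
    """Multiply complex numbers given as pairs: (a, b) x (c, d)."""
    a, b = p
    c, d = q
    return (a * c - b * d, a * d + b * c)

# (0, 1) x (0, 1) = (-1, 0), that is, i^2 = -1.
print(multiply_pairs((0, 1), (0, 1)))  # (-1, 0)

# The worked example (2 + 3i)(4i) - 5i with built-in complex arithmetic.
z = (2 + 3j) * 4j - 5j
print(z)  # (-12+3j)
```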
Exercise 2.2.3: Make sure you understand (that you can justify) the following identities:

a) 𝑖² = −1, 𝑖³ = −𝑖, 𝑖⁴ = 1,
b) 1/𝑖 = −𝑖,
c) (3 − 7𝑖)(−2 − 9𝑖) = · · · = −69 − 13𝑖,
d) (3 − 2𝑖)(3 + 2𝑖) = 3² − (2𝑖)² = 3² + 2² = 13,
e) 1/(3 − 2𝑖) = (1/(3 − 2𝑖)) · ((3 + 2𝑖)/(3 + 2𝑖)) = (3 + 2𝑖)/13 = 3/13 + (2/13)𝑖.

We also define the exponential 𝑒 𝑎+𝑖𝑏 of a complex number. We do this by writing down
the Taylor series and plugging in the complex number. Because most properties of the
exponential can be proved by looking at the Taylor series, these properties still hold for the
complex exponential. For example, the very important property: 𝑒 𝑥+𝑦 = 𝑒 𝑥 𝑒 𝑦 . This means
that 𝑒 𝑎+𝑖𝑏 = 𝑒 𝑎 𝑒 𝑖𝑏 . Hence, if we can compute 𝑒 𝑖𝑏 , we can compute 𝑒 𝑎+𝑖𝑏 . For 𝑒 𝑖𝑏 , we use the
so-called Euler’s formula.
Theorem 2.2.2 (Euler’s formula).

𝑒 𝑖𝜃 = cos 𝜃 + 𝑖 sin 𝜃 and 𝑒 −𝑖𝜃 = cos 𝜃 − 𝑖 sin 𝜃.

In other words, 𝑒^(𝑎+𝑖𝑏) = 𝑒^𝑎 (cos(𝑏) + 𝑖 sin(𝑏)) = 𝑒^𝑎 cos(𝑏) + 𝑖𝑒^𝑎 sin(𝑏).
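
Euler's formula can be sanity-checked with a few lines of code. The sketch below compares the complex exponential from Python's standard cmath module with 𝑒^𝑎 cos(𝑏) + 𝑖𝑒^𝑎 sin(𝑏) for a few sample values of 𝑎 and 𝑏 that we picked arbitrarily; euler_form is a hypothetical helper name.

```python
import cmath
import math

def euler_form(a, b):
    """The right-hand side e^a cos(b) + i e^a sin(b)."""
    return complex(math.exp(a) * math.cos(b), math.exp(a) * math.sin(b))

for a, b in [(0.0, math.pi), (1.0, 2.0), (-0.5, 10.0)]:
    lhs = cmath.exp(complex(a, b))
    rhs = euler_form(a, b)
    print(f"a={a:5.2f} b={b:6.3f} |lhs - rhs| = {abs(lhs - rhs):.1e}")
```

The first sample is the famous special case 𝑒^(𝑖𝜋) = −1, up to rounding.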




Exercise 2.2.4: Using Euler’s formula, check the identities:

cos 𝜃 = (𝑒^(𝑖𝜃) + 𝑒^(−𝑖𝜃))/2 and sin 𝜃 = (𝑒^(𝑖𝜃) − 𝑒^(−𝑖𝜃))/(2𝑖).

Exercise 2.2.5: Double angle identities: Start with 𝑒^(𝑖(2𝜃)) = (𝑒^(𝑖𝜃))². Use Euler on each side and
deduce:

cos(2𝜃) = cos² 𝜃 − sin² 𝜃 and sin(2𝜃) = 2 sin 𝜃 cos 𝜃.
For a complex number 𝑎 + 𝑖𝑏, we call 𝑎 the real part and 𝑏 the imaginary part of the
number. Often the following notation is used:

Re(𝑎 + 𝑖𝑏) = 𝑎 and Im(𝑎 + 𝑖𝑏) = 𝑏.

2.2.3 Complex roots


Suppose the differential equation 𝑎𝑦′′ + 𝑏𝑦′ + 𝑐𝑦 = 0 has the characteristic equation
𝑎𝑟² + 𝑏𝑟 + 𝑐 = 0 that has complex roots. By the quadratic formula, the roots are (−𝑏 ± √(𝑏² − 4𝑎𝑐))/(2𝑎).
These roots are complex if 𝑏² − 4𝑎𝑐 < 0. In this case, we write the roots as

𝑟1, 𝑟2 = −𝑏/(2𝑎) ± 𝑖 √(4𝑎𝑐 − 𝑏²)/(2𝑎).
As you can see, we get a pair of roots of the form 𝛼 ± 𝑖𝛽. We could still write the solution as

𝑦 = 𝐶1 𝑒 (𝛼+𝑖𝛽)𝑥 + 𝐶2 𝑒 (𝛼−𝑖𝛽)𝑥 .

However, the exponential is now complex-valued. We need to allow 𝐶1 and 𝐶2 to be


complex numbers to obtain a real-valued solution (which is what we are after). While
there is nothing particularly wrong with this approach, it can make calculations harder
and it is generally preferred to find two real-valued solutions.
Euler’s formula comes to the rescue. Let

𝑦1 = 𝑒 (𝛼+𝑖𝛽)𝑥 and 𝑦2 = 𝑒 (𝛼−𝑖𝛽)𝑥 .



Then
𝑦1 = 𝑒 𝛼𝑥 cos(𝛽𝑥) + 𝑖𝑒 𝛼𝑥 sin(𝛽𝑥),
𝑦2 = 𝑒 𝛼𝑥 cos(𝛽𝑥) − 𝑖𝑒 𝛼𝑥 sin(𝛽𝑥).
Linear combinations of solutions are also solutions. Hence,
𝑦3 = (𝑦1 + 𝑦2)/2 = 𝑒^(𝛼𝑥) cos(𝛽𝑥),
𝑦4 = (𝑦1 − 𝑦2)/(2𝑖) = 𝑒^(𝛼𝑥) sin(𝛽𝑥),
are also solutions. It is not hard to see that 𝑦3 and 𝑦4 are linearly independent (not multiples
of each other). So the general solution can be written in terms of 𝑦3 and 𝑦4 . And as they
are real-valued, no complex numbers need to be used for the arbitrary constants in the
general solution. We summarize what we found as a theorem.

Theorem 2.2.3. Take the equation

𝑎𝑦 ′′ + 𝑏𝑦 ′ + 𝑐𝑦 = 0.

If the characteristic equation has the roots 𝛼 ± 𝑖𝛽 (when 𝑏 2 − 4𝑎𝑐 < 0), then the general solution is

𝑦 = 𝐶1 𝑒 𝛼𝑥 cos(𝛽𝑥) + 𝐶2 𝑒 𝛼𝑥 sin(𝛽𝑥).
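
The three cases (distinct real roots, a doubled root, complex roots) can be collected into one small routine. This is only a sketch under our own naming: classify_solution returns a label together with the numbers that appear in the general solution. The test for a doubled root compares the discriminant with zero exactly, which is reliable for integer coefficients but fragile for floating-point input.

```python
import math

def classify_solution(a, b, c):
    """Describe the general solution of a y'' + b y' + c y = 0 (a nonzero).

    Returns ("distinct real", r1, r2), ("repeated", r) or
    ("complex", alpha, beta), where the complex roots are alpha +/- i beta.
    """
    if a == 0:
        raise ValueError("not a second-order equation")
    disc = b * b - 4 * a * c
    if disc > 0:
        root = math.sqrt(disc)
        return ("distinct real", (-b + root) / (2 * a), (-b - root) / (2 * a))
    if disc == 0:
        return ("repeated", -b / (2 * a))
    return ("complex", -b / (2 * a), math.sqrt(-disc) / (2 * abs(a)))

print(classify_solution(1, -6, 8))    # y = C1 e^(4x) + C2 e^(2x)
print(classify_solution(1, -8, 16))   # y = (C1 + C2 x) e^(4x)
print(classify_solution(1, -6, 13))   # y = C1 e^(3x) cos(2x) + C2 e^(3x) sin(2x)
```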

Example 2.2.3: Find the general solution of 𝑦 ′′ + 𝑘 2 𝑦 = 0, for a constant 𝑘 > 0.


The characteristic equation is 𝑟 2 + 𝑘 2 = 0. Therefore, the roots are 𝑟 = ±𝑖𝑘, and by the
theorem, we have the general solution

𝑦 = 𝐶1 cos(𝑘𝑥) + 𝐶2 sin(𝑘𝑥).

Example 2.2.4: Find the solution of 𝑦 ′′ − 6𝑦 ′ + 13𝑦 = 0, 𝑦(0) = 0, 𝑦 ′(0) = 10.


The characteristic equation is 𝑟 2 − 6𝑟 + 13 = 0. Completing the square, we get
(𝑟 − 3)2 + 22 = 0 and hence the roots are 𝑟 = 3 ± 2𝑖. Per the theorem, the general solution is

𝑦 = 𝐶1 𝑒 3𝑥 cos(2𝑥) + 𝐶2 𝑒 3𝑥 sin(2𝑥).

To find the solution satisfying the initial conditions, we first plug in zero to get

0 = 𝑦(0) = 𝐶1 𝑒 0 cos 0 + 𝐶2 𝑒 0 sin 0 = 𝐶1 .

Hence, 𝐶1 = 0 and 𝑦 = 𝐶2 𝑒 3𝑥 sin(2𝑥). We differentiate,

𝑦 ′ = 3𝐶2 𝑒 3𝑥 sin(2𝑥) + 2𝐶2 𝑒 3𝑥 cos(2𝑥).

We again plug in the initial condition and obtain 10 = 𝑦 ′(0) = 2𝐶2 , or 𝐶2 = 5. The solution
we are seeking is
𝑦 = 5𝑒 3𝑥 sin(2𝑥).
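
As a check of Example 2.2.4, note that 5𝑒^(3𝑥) sin(2𝑥) is the imaginary part of 5𝑒^((3+2𝑖)𝑥), and differentiating the complex exponential just multiplies it by 𝑟 = 3 + 2𝑖. The sketch below uses this observation (our own shortcut, with hypothetical function names) to evaluate 𝑦, 𝑦′, and 𝑦′′ and to confirm the equation and the initial conditions numerically.

```python
import cmath

r = 3 + 2j  # a root of r^2 - 6r + 13 = 0

def y_and_derivatives(x):
    """Return (y, y', y'') for y = 5 e^(3x) sin(2x) = Im(5 e^(r x))."""
    z = 5 * cmath.exp(r * x)
    return z.imag, (r * z).imag, (r * r * z).imag

def residual(x):
    """Evaluate y'' - 6 y' + 13 y, which should vanish up to rounding."""
    y, dy, d2y = y_and_derivatives(x)
    return d2y - 6 * dy + 13 * y

print(y_and_derivatives(0.0)[:2])  # initial values y(0) and y'(0)
print(residual(1.0))
```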

2.2.4 Exercises
Exercise 2.2.6: Find the general solution of 2𝑦 ′′ + 2𝑦 ′ − 4𝑦 = 0.

Exercise 2.2.7: Find the general solution of 𝑦 ′′ + 9𝑦 ′ − 10𝑦 = 0.

Exercise 2.2.8: Solve 𝑦 ′′ − 8𝑦 ′ + 16𝑦 = 0 for 𝑦(0) = 2, 𝑦 ′(0) = 0.

Exercise 2.2.9: Solve 𝑦 ′′ + 9𝑦 ′ = 0 for 𝑦(0) = 1, 𝑦 ′(0) = 1.

Exercise 2.2.10: Find the general solution of 2𝑦 ′′ + 50𝑦 = 0.

Exercise 2.2.11: Find the general solution of 𝑦 ′′ + 6𝑦 ′ + 13𝑦 = 0.

Exercise 2.2.12: Find the general solution of 𝑦 ′′ = 0 using the methods of this section.

Exercise 2.2.13: The method of this section applies to equations of other orders than two. We will
see higher orders later. Solve the first-order equation 2𝑦 ′ + 3𝑦 = 0 using the methods of this section.

Exercise 2.2.14: Let us revisit the Cauchy–Euler equations of Exercise 2.1.6. Suppose
now that (𝑏 − 𝑎)2 − 4𝑎𝑐 < 0. Find a formula for the general solution of 𝑎𝑥 2 𝑦 ′′ + 𝑏𝑥𝑦 ′ + 𝑐𝑦 = 0.
Hint: Note that 𝑥 𝑟 = 𝑒 𝑟 ln 𝑥 .

Exercise 2.2.15: Find the solution to 𝑦 ′′ − (2𝛼)𝑦 ′ + 𝛼2 𝑦 = 0, 𝑦(0) = 𝑎, 𝑦 ′(0) = 𝑏, where 𝛼, 𝑎, and
𝑏 are real numbers.

Exercise 2.2.16: Construct an equation such that 𝑦 = 𝐶1 𝑒 −2𝑥 cos(3𝑥) + 𝐶2 𝑒 −2𝑥 sin(3𝑥) is the
general solution.

Exercise 2.2.101: Find the general solution to 𝑦 ′′ + 4𝑦 ′ + 2𝑦 = 0.

Exercise 2.2.102: Find the general solution to 𝑦 ′′ − 6𝑦 ′ + 9𝑦 = 0.

Exercise 2.2.103: Find the solution to 2𝑦 ′′ + 𝑦 ′ + 𝑦 = 0, 𝑦(0) = 1, 𝑦 ′(0) = −2.

Exercise 2.2.104: Find the solution to 2𝑦 ′′ + 𝑦 ′ − 3𝑦 = 0, 𝑦(0) = 𝑎, 𝑦 ′(0) = 𝑏.

Exercise 2.2.105: Find the solution to 𝑧 ′′(𝑡) = −2𝑧 ′(𝑡) − 2𝑧(𝑡), 𝑧(0) = 2, 𝑧 ′(0) = −2.

Exercise 2.2.106: Find the solution to 𝑦 ′′ − (𝛼 + 𝛽)𝑦 ′ + 𝛼𝛽𝑦 = 0, 𝑦(0) = 𝑎, 𝑦 ′(0) = 𝑏, where 𝛼, 𝛽,
𝑎, and 𝑏 are real numbers, and 𝛼 ≠ 𝛽.

Exercise 2.2.107: Construct an equation such that 𝑦 = 𝐶1 𝑒 3𝑥 + 𝐶2 𝑒 −2𝑥 is the general solution.

2.3 Higher-order linear ODEs


Note: somewhat more than 1 lecture, §3.2 and §3.3 in [EP], §4.1 and §4.2 in [BD]
We briefly study higher-order equations. Equations appearing in applications tend to
be second-order. Higher-order equations do appear from time to time, but generally the
world around us is “second-order.”
The basic results about linear ODEs of higher order are essentially the same as for second-
order equations, with 2 replaced by 𝑛. The important concept of linear independence is
somewhat more complicated when more than two functions are involved. For higher-order
constant-coefficient ODEs, the methods developed are also somewhat harder to apply, but
we will not dwell on these complications. It is also possible to use the methods for systems
of linear equations from chapter 3 to solve higher-order constant-coefficient equations.
Consider a general homogeneous linear equation

𝑦 (𝑛) + 𝑝 𝑛−1 (𝑥)𝑦 (𝑛−1) + · · · + 𝑝 1 (𝑥)𝑦 ′ + 𝑝 0 (𝑥)𝑦 = 0. (2.4)

Theorem 2.3.1 (Superposition). If 𝑦1 , 𝑦2 , . . . , 𝑦𝑛 are solutions of the homogeneous equation


(2.4), then
𝑦(𝑥) = 𝐶1 𝑦1 (𝑥) + 𝐶2 𝑦2 (𝑥) + · · · + 𝐶 𝑛 𝑦𝑛 (𝑥)
also solves (2.4) for arbitrary constants 𝐶1 , 𝐶2 , . . . , 𝐶 𝑛 .
That is, a linear combination of solutions to (2.4) is a solution to (2.4). There is also the
existence and uniqueness theorem for linear equations, including nonhomogeneous ones.
Theorem 2.3.2 (Existence and uniqueness). Suppose 𝑝 0 , 𝑝1 , . . . , 𝑝 𝑛−1 , and 𝑓 are continuous
functions on some interval 𝐼, 𝑎 is a number in 𝐼, and 𝑏 0 , 𝑏1 , . . . , 𝑏 𝑛−1 are constants. Then the
equation
𝑦 (𝑛) + 𝑝 𝑛−1 (𝑥)𝑦 (𝑛−1) + · · · + 𝑝 1 (𝑥)𝑦 ′ + 𝑝 0 (𝑥)𝑦 = 𝑓 (𝑥)
has exactly one solution 𝑦(𝑥) defined on the same interval 𝐼 satisfying the initial conditions

𝑦(𝑎) = 𝑏0 , 𝑦 ′(𝑎) = 𝑏1 , ..., 𝑦 (𝑛−1) (𝑎) = 𝑏 𝑛−1 .

2.3.1 Linear independence


When we had two functions 𝑦1 and 𝑦2 , we said they were linearly independent if one was
not a multiple of the other. Same idea holds for 𝑛 functions, although in this case it is
easier to state as follows. The functions 𝑦1 , 𝑦2 , . . . , 𝑦𝑛 are linearly independent if the equation

𝑐 1 𝑦1 + 𝑐 2 𝑦2 + · · · + 𝑐 𝑛 𝑦 𝑛 = 0

has only the trivial solution 𝑐1 = 𝑐 2 = · · · = 𝑐 𝑛 = 0, where the equation must hold for all 𝑥.
If we can solve the equation with some constants 𝑐1 , 𝑐2 , . . . , 𝑐 𝑛 , where for example 𝑐1 ≠ 0,
then we can solve for 𝑦1 as a linear combination of the others. If the functions are not
linearly independent, they are linearly dependent.

Example 2.3.1: Verify that 𝑒 𝑥 , 𝑒 2𝑥 , 𝑒 3𝑥 are linearly independent.


Let us give several ways to show this fact. Many textbooks (including [EP] and [F])
introduce Wronskians, but it is difficult to see why they work and they are not really
necessary here.
Consider
𝑐1 𝑒 𝑥 + 𝑐 2 𝑒 2𝑥 + 𝑐 3 𝑒 3𝑥 = 0.
We use rules of exponentials and write 𝑧 = 𝑒 𝑥 . Hence 𝑧 2 = 𝑒 2𝑥 and 𝑧 3 = 𝑒 3𝑥 . Then we have

𝑐 1 𝑧 + 𝑐 2 𝑧 2 + 𝑐 3 𝑧 3 = 0.

The left-hand side is a third degree polynomial in 𝑧. It is either identically zero, or it has
at most 3 zeros. As the equation holds for all 𝑥, it holds for every 𝑧 = 𝑒^𝑥 > 0, that is, for
infinitely many values of 𝑧. Therefore, it is identically zero, 𝑐1 = 𝑐2 = 𝑐3 = 0, and the functions are
linearly independent.
Let us try another way. As before we write

𝑐1 𝑒 𝑥 + 𝑐 2 𝑒 2𝑥 + 𝑐 3 𝑒 3𝑥 = 0.

This equation has to hold for all 𝑥. We divide through by 𝑒 3𝑥 to get

𝑐1 𝑒 −2𝑥 + 𝑐 2 𝑒 −𝑥 + 𝑐 3 = 0.

As the equation is true for all 𝑥, let 𝑥 → ∞. After taking the limit we see that 𝑐 3 = 0. Hence
our equation becomes
𝑐 1 𝑒 𝑥 + 𝑐 2 𝑒 2𝑥 = 0.
Rinse, repeat!
How about yet another way. We again write

𝑐1 𝑒 𝑥 + 𝑐 2 𝑒 2𝑥 + 𝑐 3 𝑒 3𝑥 = 0.

We can evaluate the equation and its derivatives at different values of 𝑥 to obtain equations
for 𝑐1 , 𝑐 2 , and 𝑐 3 . Let us first divide by 𝑒 𝑥 for simplicity.

𝑐 1 + 𝑐 2 𝑒 𝑥 + 𝑐 3 𝑒 2𝑥 = 0.

We set 𝑥 = 0 to get the equation 𝑐1 + 𝑐 2 + 𝑐 3 = 0. Now differentiate both sides

𝑐 2 𝑒 𝑥 + 2𝑐 3 𝑒 2𝑥 = 0.

We set 𝑥 = 0 to get 𝑐 2 + 2𝑐 3 = 0. We divide by 𝑒 𝑥 again and differentiate to get 2𝑐 3 𝑒 𝑥 = 0.


It is clear that 𝑐3 is zero. Then 𝑐2 must be zero as 𝑐 2 = −2𝑐 3 , and 𝑐 1 must be zero because
𝑐1 + 𝑐 2 + 𝑐 3 = 0.
There is no one best way to do it. All of these methods are perfectly valid. The important
thing is to understand why the functions are linearly independent.
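
Here is one more way, suited to a computer, sketched under our own choice of sample points. If 𝑐1 𝑦1 + 𝑐2 𝑦2 + 𝑐3 𝑦3 = 0 holds for all 𝑥, it holds in particular at three chosen points, which gives three linear equations for 𝑐1, 𝑐2, 𝑐3. If the determinant of that 3 × 3 system is nonzero, only the trivial solution exists and the functions are linearly independent. A zero determinant at one choice of points does not by itself prove dependence. The helper names are hypothetical.

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def sample_matrix(funcs, points):
    """Row k holds the values of all functions at the k-th sample point."""
    return [[f(x) for f in funcs] for x in points]

points = [0.0, 1.0, 2.0]

exponentials = [lambda x: math.exp(x),
                lambda x: math.exp(2 * x),
                lambda x: math.exp(3 * x)]
det_exponentials = det3(sample_matrix(exponentials, points))

# e^x, e^(-x), cosh x are linearly dependent, so this determinant vanishes.
with_cosh = [lambda x: math.exp(x), lambda x: math.exp(-x), math.cosh]
det_with_cosh = det3(sample_matrix(with_cosh, points))

print(det_exponentials)  # clearly nonzero
print(det_with_cosh)     # zero up to rounding
```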

Example 2.3.2: On the other hand, the functions 𝑒^𝑥, 𝑒^(−𝑥), and cosh 𝑥 are linearly dependent.
Simply apply the definition of the hyperbolic cosine:

cosh 𝑥 = (𝑒^𝑥 + 𝑒^(−𝑥))/2 or 2 cosh 𝑥 − 𝑒^𝑥 − 𝑒^(−𝑥) = 0.
Once we have enough linearly independent solutions, we have the general solution to
the homogeneous equation, just as we did for second-order equations.
Theorem 2.3.3. If 𝑦1 , 𝑦2 , . . . , 𝑦𝑛 are linearly independent solutions of the homogeneous equation
(2.4), then the general solution to (2.4) can be written as
𝑦(𝑥) = 𝐶1 𝑦1 (𝑥) + 𝐶2 𝑦2 (𝑥) + · · · + 𝐶 𝑛 𝑦𝑛 (𝑥).

2.3.2 Constant-coefficient higher-order ODEs


When we have a higher-order constant-coefficient homogeneous linear equation, the song
and dance is exactly the same as it was for second order. We just need to find more
solutions. If the equation is 𝑛 th -order, we need to find 𝑛 linearly independent solutions. It
is best seen by example.
Example 2.3.3: Find the general solution to
𝑦 ′′′ − 3𝑦 ′′ − 𝑦 ′ + 3𝑦 = 0. (2.5)
Try: 𝑦 = 𝑒^(𝑟𝑥). We plug in 𝑦′′′ = 𝑟³𝑒^(𝑟𝑥), 𝑦′′ = 𝑟²𝑒^(𝑟𝑥), 𝑦′ = 𝑟𝑒^(𝑟𝑥), and 𝑦 = 𝑒^(𝑟𝑥) to get

𝑟³𝑒^(𝑟𝑥) − 3𝑟²𝑒^(𝑟𝑥) − 𝑟𝑒^(𝑟𝑥) + 3𝑒^(𝑟𝑥) = 0.

We divide through by 𝑒^(𝑟𝑥). Then

𝑟³ − 3𝑟² − 𝑟 + 3 = 0.
The trick now is to find the roots. There is a formula for the roots of degree 3 and 4
polynomials, but it is very complicated. There is no general formula in terms of radicals
for polynomials of degree 5 or higher.
That does not mean that the roots do not exist. There are always 𝑛 roots for an 𝑛th degree
polynomial. They may be repeated and they may be complex. Computers are pretty good
at finding roots approximately for reasonable size polynomials.
A good place to start is to plot the polynomial and check where it is zero. We can also
simply try plugging in. We just start plugging in numbers 𝑟 = −2, −1, 0, 1, 2, . . . and see if
we get a hit (we can also try complex numbers). Even if we do not get a hit, we may get
an indication of where the root is. For example, we plug 𝑟 = −2 into our polynomial and
get −15; we plug in 𝑟 = 0 and get 3. That means there is a root between 𝑟 = −2 and 𝑟 = 0,
because the sign changed. If we find one root, say 𝑟1 , then we know (𝑟 − 𝑟1 ) is a factor of
our polynomial. Polynomial long division can then be used.
A good strategy is to begin with $r = 0$, $1$, or $-1$. These are easy to compute. Our
polynomial has two such roots, $r_1 = -1$ and $r_2 = 1$. There should be 3 roots and the last
root is reasonably easy to find. The constant term in a monic* polynomial such as this is the
product of the negations of all the roots because $r^3 - 3r^2 - r + 3 = (r - r_1)(r - r_2)(r - r_3)$. So
\[ 3 = (-r_1)(-r_2)(-r_3) = (1)(-1)(-r_3) = r_3. \]
*The word monic means that the coefficient of the top degree $r^d$, in our case $r^3$, is 1.
You should check that 𝑟3 = 3 really is a root. Hence 𝑒 −𝑥 , 𝑒 𝑥 and 𝑒 3𝑥 are solutions to (2.5).
They are linearly independent as can easily be checked, and there are 3 of them, which
happens to be exactly the number we need. So the general solution is
𝑦 = 𝐶1 𝑒 −𝑥 + 𝐶2 𝑒 𝑥 + 𝐶3 𝑒 3𝑥 .
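The root hunt described above (look for a sign change, pin the root down, then divide out the factor) can be sketched in a few lines of Python. This is only an illustrative sketch, not a robust root finder: the helper names `horner`, `bisect_root`, and `deflate` are made up for this example, and the last step assumes the two remaining roots are real.

```python
import math


def horner(coeffs, r):
    """Evaluate a polynomial (highest-degree coefficient first) at r."""
    value = 0
    for a in coeffs:
        value = value * r + a
    return value


def bisect_root(coeffs, lo, hi, tol=1e-12):
    """Find a root in [lo, hi], assuming the polynomial changes sign there."""
    f_lo = horner(coeffs, lo)
    if f_lo == 0:
        return lo
    if horner(coeffs, hi) == 0:
        return hi
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = horner(coeffs, mid)
        if f_mid == 0:
            return mid
        if (f_lo < 0) != (f_mid < 0):
            hi = mid                  # sign change is in the left half
        else:
            lo, f_lo = mid, f_mid     # sign change is in the right half
    return (lo + hi) / 2


def deflate(coeffs, root):
    """Divide by (r - root) using synthetic division; the remainder is dropped."""
    quotient = [coeffs[0]]
    for a in coeffs[1:-1]:
        quotient.append(a + root * quotient[-1])
    return quotient


# Characteristic polynomial r^3 - 3r^2 - r + 3 of equation (2.5).
p = [1, -3, -1, 3]

print(horner(p, -2), horner(p, 0))   # -15 and 3: a sign change on [-2, 0]
r1 = bisect_root(p, -2, 0)           # the root between -2 and 0
q = deflate(p, r1)                   # quotient r^2 - 4r + 3

# The remaining quadratic is handled by the quadratic formula.
a, b, c = q
disc = math.sqrt(b * b - 4 * a * c)
r2, r3 = (-b - disc) / (2 * a), (-b + disc) / (2 * a)
print(r1, r2, r3)
```

Dropping the remainder in `deflate` is only justified when the supplied value really is (very nearly) a root, so in practice it is worth checking `horner(p, r1)` first.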
Suppose we were given some initial conditions 𝑦(0) = 1, 𝑦 ′(0) = 2, and 𝑦 ′′(0) = 3. Then
1 = 𝑦(0) = 𝐶1 + 𝐶2 + 𝐶3 ,
2 = 𝑦 ′(0) = −𝐶1 + 𝐶2 + 3𝐶3 ,
3 = 𝑦 ′′(0) = 𝐶1 + 𝐶2 + 9𝐶3 .
It is possible to find the solution by high school algebra, but it would be a pain. The
sensible way to solve a system of equations such as this is to use matrix algebra, see § 3.2
or appendix A. For now we note that the solution is 𝐶1 = −1/4, 𝐶2 = 1, and 𝐶3 = 1/4. The
specific solution to the ODE is
\[ y = \frac{-1}{4} e^{-x} + e^{x} + \frac{1}{4} e^{3x}. \]
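For readers who want to see the three equations for $C_1$, $C_2$, $C_3$ solved mechanically, here is a minimal sketch using exact rational arithmetic from the Python standard library. The function `solve_linear_system` is a hypothetical helper written for this illustration (plain Gauss-Jordan elimination), and it assumes the system has a unique solution.

```python
from fractions import Fraction


def solve_linear_system(matrix, rhs):
    """Solve a square linear system exactly by Gauss-Jordan elimination."""
    n = len(rhs)
    # Augmented matrix with exact rational entries.
    aug = [[Fraction(x) for x in row] + [Fraction(b)]
           for row, b in zip(matrix, rhs)]
    for col in range(n):
        # Find a row with a nonzero pivot and swap it into place.
        pivot = next(i for i in range(col, n) if aug[i][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Scale the pivot row so that the pivot becomes 1.
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        # Eliminate this column from every other row.
        for i in range(n):
            if i != col and aug[i][col] != 0:
                factor = aug[i][col]
                aug[i] = [x - factor * y for x, y in zip(aug[i], aug[col])]
    return [row[-1] for row in aug]


# Rows are y(0), y'(0), y''(0) written in terms of C1, C2, C3.
A = [[1, 1, 1],
     [-1, 1, 3],
     [1, 1, 9]]
b = [1, 2, 3]

C1, C2, C3 = solve_linear_system(A, b)
print(C1, C2, C3)  # -1/4 1 1/4
```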
Next, suppose that we have real roots, but they are repeated. Let us say we have a root
𝑟 repeated 𝑘 times. In the spirit of the second-order solution, and for the same reasons, we
have the solutions
𝑒 𝑟𝑥 , 𝑥𝑒 𝑟𝑥 , 𝑥 2 𝑒 𝑟𝑥 , . . . , 𝑥 𝑘−1 𝑒 𝑟𝑥 .
We take a linear combination of these solutions to find the general solution.
Example 2.3.4: Solve
𝑦 (4) − 3𝑦 ′′′ + 3𝑦 ′′ − 𝑦 ′ = 0.
We note that the characteristic equation is
𝑟 4 − 3𝑟 3 + 3𝑟 2 − 𝑟 = 0.
By inspection, we note that 𝑟 4 − 3𝑟 3 + 3𝑟 2 − 𝑟 = 𝑟(𝑟 − 1)3 . Hence the roots given with
multiplicity are $r = 0, 1, 1, 1$. Thus the general solution is
\[ y = \underbrace{(C_1 + C_2 x + C_3 x^2)\, e^{x}}_{\text{terms coming from } r=1} + \underbrace{C_4}_{\text{from } r=0}. \]

The case of complex roots is similar to second-order equations. Since the coefficients are
real, complex roots always come in conjugate pairs 𝑟 = 𝛼 ± 𝑖𝛽. Suppose we have two such
complex roots, each repeated 𝑘 times.
The corresponding solution is
(𝐶0 + 𝐶1 𝑥 + · · · + 𝐶 𝑘−1 𝑥 𝑘−1 ) 𝑒 𝛼𝑥 cos(𝛽𝑥) + (𝐷0 + 𝐷1 𝑥 + · · · + 𝐷 𝑘−1 𝑥 𝑘−1 ) 𝑒 𝛼𝑥 sin(𝛽𝑥),
where 𝐶0 , . . . , 𝐶 𝑘−1 , 𝐷0 , . . . , 𝐷 𝑘−1 are arbitrary constants.
Example 2.3.5: Solve


𝑦 (4) − 4𝑦 ′′′ + 8𝑦 ′′ − 8𝑦 ′ + 4𝑦 = 0.
The characteristic equation is
\[ r^4 - 4r^3 + 8r^2 - 8r + 4 = 0, \]
\[ (r^2 - 2r + 2)^2 = 0, \]
\[ \bigl( (r - 1)^2 + 1 \bigr)^2 = 0. \]
Hence the roots are 1 ± 𝑖, both with multiplicity 2. Hence the general solution to the ODE is
𝑦 = (𝐶1 + 𝐶2 𝑥) 𝑒 𝑥 cos 𝑥 + (𝐶3 + 𝐶4 𝑥) 𝑒 𝑥 sin 𝑥.
The way we solved the characteristic equation above is really by guessing or by inspection.
It is not so easy in general. We could also have asked a computer or an advanced calculator
for the roots.
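A guessed factorization like the one above is easy to double-check by machine. The following sketch, using only the Python standard library, multiplies the factor back out and confirms that $1 \pm i$ are roots of multiplicity exactly 2: the polynomial and its first derivative vanish there, while the second derivative does not. The helper names are invented for this illustration.

```python
def horner(coeffs, r):
    """Evaluate a polynomial (highest-degree coefficient first) at r."""
    value = 0
    for a in coeffs:
        value = value * r + a
    return value


def derivative(coeffs):
    """Coefficients of the derivative polynomial."""
    n = len(coeffs) - 1
    return [a * (n - i) for i, a in enumerate(coeffs[:-1])]


def poly_mul(p, q):
    """Multiply two polynomials given by coefficient lists."""
    result = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            result[i + j] += a * b
    return result


char_poly = [1, -4, 8, -8, 4]     # r^4 - 4r^3 + 8r^2 - 8r + 4
factor = [1, -2, 2]               # r^2 - 2r + 2

print(poly_mul(factor, factor) == char_poly)   # True

d1 = derivative(char_poly)
d2 = derivative(d1)
for root in (1 + 1j, 1 - 1j):
    # p and p' are zero at a double root, p'' is not.
    print(root, horner(char_poly, root), horner(d1, root), horner(d2, root))
```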

2.3.3 Exercises
Exercise 2.3.1: Find the general solution for 𝑦 ′′′ − 𝑦 ′′ + 𝑦 ′ − 𝑦 = 0.
Exercise 2.3.2: Find the general solution for 𝑦 (4) − 5𝑦 ′′′ + 6𝑦 ′′ = 0.
Exercise 2.3.3: Find the general solution for 𝑦 ′′′ + 2𝑦 ′′ + 2𝑦 ′ = 0.
Exercise 2.3.4: Suppose the characteristic equation for an ODE is (𝑟 − 1)2 (𝑟 − 2)2 = 0.
a) Find such a differential equation.
b) Find its general solution.
Exercise 2.3.5: Suppose that a fourth-order equation has a solution 𝑦 = 2𝑒 4𝑥 𝑥 cos 𝑥.
a) Find such an equation.
b) Find the initial conditions that the given solution satisfies.
Exercise 2.3.6: Find the general solution for the equation of Exercise 2.3.5.
Exercise 2.3.7: Let 𝑓 (𝑥) = 𝑒 𝑥 − cos 𝑥, 𝑔(𝑥) = 𝑒 𝑥 + cos 𝑥, and ℎ(𝑥) = cos 𝑥. Are 𝑓 (𝑥), 𝑔(𝑥), and
ℎ(𝑥) linearly independent? If so, show it, if not, find a linear combination that works.
Exercise 2.3.8: Let 𝑓 (𝑥) = 0, 𝑔(𝑥) = cos 𝑥, and ℎ(𝑥) = sin 𝑥. Are 𝑓 (𝑥), 𝑔(𝑥), and ℎ(𝑥) linearly
independent? If so, show it, if not, find a linear combination that works.
Exercise 2.3.9: Are 𝑥, 𝑥 2 , and 𝑥 4 linearly independent? If so, show it, if not, find a linear
combination that gives 0.
Exercise 2.3.10: Are 𝑒 𝑥 , 𝑥𝑒 𝑥 , and 𝑥 2 𝑒 𝑥 linearly independent? If so, show it, if not, find a linear
combination that gives 0.
Exercise 2.3.11: Find an equation such that 𝑦 = 𝑥𝑒 −2𝑥 sin(3𝑥) is a solution.
Exercise 2.3.101: Find the general solution of 𝑦 (5) − 𝑦 (4) = 0.
Exercise 2.3.102: Suppose that the characteristic equation of a third-order differential equation has
roots ±2𝑖 and 3.

a) What is the characteristic equation?


b) Find the corresponding differential equation.
c) Find the general solution.

Exercise 2.3.103: Solve 1001𝑦 ′′′ + 3.2𝑦 ′′ + 𝜋𝑦 ′ − 4𝑦 = 0, 𝑦(0) = 0, 𝑦 ′(0) = 0, 𝑦 ′′(0) = 0.

Exercise 2.3.104: Are 𝑒 𝑥 , 𝑒 𝑥+1 , 𝑒 2𝑥 , sin(𝑥) linearly independent? If so, show it, if not find a linear
combination that gives 0.

Exercise 2.3.105: Are sin(𝑥), 𝑥, 𝑥 sin(𝑥) linearly independent? If so, show it, if not find a linear
combination that gives 0.

Exercise 2.3.106: Find an equation such that 𝑦 = cos(𝑥), 𝑦 = sin(𝑥), 𝑦 = 𝑒 𝑥 are solutions.
2.4 Mechanical vibrations


Note: 2 lectures, §3.4 in [EP], §3.7 in [BD]
Let us look at some applications of linear second-order constant-coefficient equations.

2.4.1 Some examples


Our first example is a mass on a spring. Suppose we have
a mass 𝑚 > 0 (in kilograms) connected by a spring with
spring constant 𝑘 > 0 (in newtons per meter) to a fixed wall.
There may be some external force 𝐹(𝑡) (in newtons) acting on
the mass. Here, 𝑡 is time in seconds. Finally, there is some
friction measured by 𝑐 ≥ 0 (in newton-seconds per meter) as the mass slides along the
floor (or perhaps a damper is connected).
[Figure: a mass 𝑚 attached to a wall by a spring with constant 𝑘, with damping 𝑐, external force 𝐹(𝑡), and displacement 𝑥.]
Let 𝑥 be the displacement of the mass in meters (𝑥 = 0 is the rest position), with 𝑥
growing to the right (away from the wall). The force exerted by the spring is proportional to
the compression of the spring by Hooke’s law. Therefore, it is 𝑘𝑥 in the negative direction.
Similarly, the force exerted by friction is proportional to the velocity of the mass. Newton’s
second law says that force equals mass times acceleration. Hence, 𝑚𝑥 ′′ = 𝐹(𝑡) − 𝑐𝑥 ′ − 𝑘𝑥 or

𝑚𝑥 ′′ + 𝑐𝑥 ′ + 𝑘𝑥 = 𝐹(𝑡).

The equation is a linear second-order constant-coefficient ODE. We say the motion is

(i) forced if 𝐹 ≢ 0 (if 𝐹 is not identically zero),

(ii) unforced or free if 𝐹 ≡ 0 (if 𝐹 is identically zero),

(iii) damped if 𝑐 > 0, and

(iv) undamped if 𝑐 = 0.

This system appears in lots of applications even if it does not at first seem like it. Many
real-world scenarios can be simplified to a mass on a spring. For example, a bungee
jump setup is essentially a mass and spring system (you are the mass). It would be good
if someone did the math before you jump off the bridge, right? Let us give two other
examples.
Here is an example for electrical engineers. Consider the pictured
RLC circuit. There is a resistor with a resistance of 𝑅 ohms, an inductor
with an inductance of 𝐿 henries, and a capacitor with a capacitance
of 𝐶 farads. There is also an electric source (such as a battery) with
voltage of 𝐸(𝑡) volts at time 𝑡 (measured in seconds). Let 𝑄(𝑡) be the
charge (in coulombs) on the capacitor and 𝐼(𝑡) be the current (in amperes) in the circuit.
[Figure: an RLC circuit with source 𝐸, resistor 𝑅, inductor 𝐿, and capacitor 𝐶.]
The relation between the two is 𝑄 ′ = 𝐼. By elementary principles, we find 𝐿𝐼 ′ + 𝑅𝐼 + 𝑄/𝐶 = 𝐸.


We differentiate to get
\[ L I''(t) + R I'(t) + \frac{1}{C} I(t) = E'(t). \]
This is a nonhomogeneous second-order constant-coefficient linear equation. As 𝐿, 𝑅, and
𝐶 are all positive, this circuit behaves just like the mass and spring system. Position of
the mass is replaced by current. Mass is replaced by inductance, damping is replaced by
resistance, and the spring constant is replaced by one over the capacitance. The change in
voltage becomes the forcing function—for constant voltage this is an unforced motion.
Our next example behaves like a mass and spring system
only approximately. Suppose a mass $m$ hangs on a pendulum
of length $L$. We seek an equation for the angle $\theta(t)$ (in radians).
Let $g$ be the acceleration due to gravity. Elementary physics mandates that
the equation is
\[ \theta'' + \frac{g}{L} \sin \theta = 0. \]
Let us derive this equation using Newton’s second law: force
equals mass times acceleration. The acceleration is $L\theta''$ and
mass is $m$. So $mL\theta''$ has to be equal to the tangential component of the force given by
the gravity, which is $mg \sin \theta$ in the opposite direction. So $mL\theta'' = -mg \sin \theta$. The $m$
curiously cancels from the equation.
[Figure: a pendulum of length $L$ with mass $m$ at angle $\theta$; the weight $mg$ has tangential component $mg \sin \theta$.]
Now we make our approximation. For small 𝜃 we have that approximately sin 𝜃 ≈ 𝜃.
This can be seen by looking at the graph. In Figure 2.1, we can see that for approximately
−0.5 < 𝜃 < 0.5 (in radians) the graphs of sin 𝜃 and 𝜃 are almost the same.
Therefore, when the swings are small, $\theta$ is small, and we can model the behavior by
the simpler linear equation
\[ \theta'' + \frac{g}{L} \theta = 0. \]
The errors from this approximation build up. So after a long time, the state of the
real-world system might be substantially different from our solution. Also, we will
see that in a mass-spring system, the amplitude is independent of the period. This is
not true for a pendulum. Nevertheless, for reasonably short periods of time and small
swings (that is, only small angles $\theta$), the approximation is reasonably good.

[Figure 2.1: The graphs of $\sin \theta$ and $\theta$ (in radians).]
In real-world problems it is often necessary to make these types of simplifications.
We must understand both the mathematics and the physics of the situation to see if the
simplification is valid in the context of the questions we are trying to answer.
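To get a feel for how good the approximation $\sin \theta \approx \theta$ is, one can simply tabulate the relative error. The short sketch below does this with the Python standard library; the function name `relative_error` and the sample angles are chosen only for illustration.

```python
import math


def relative_error(theta):
    """Relative error of approximating sin(theta) by theta (theta != 0)."""
    return abs(theta - math.sin(theta)) / abs(math.sin(theta))


for theta in (0.1, 0.25, 0.5, 1.0):
    print(f"theta = {theta:4.2f} rad   sin(theta) = {math.sin(theta):.4f}   "
          f"relative error = {relative_error(theta):.2%}")
```

The error grows roughly like $\theta^2/6$, which is why the linear model is reasonable for swings of up to about half a radian and degrades noticeably beyond that.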
2.4.2 Free undamped motion


In this section we only consider free or unforced motion, as we do not know yet how to
solve nonhomogeneous equations. We start with undamped motion where 𝑐 = 0. The
equation is
𝑚𝑥 ′′ + 𝑘𝑥 = 0.
We divide by $m$ and let $\omega_0 = \sqrt{k/m}$ to rewrite the equation as

\[ x'' + \omega_0^2 x = 0. \]

The general solution to this equation is

𝑥(𝑡) = 𝐴 cos(𝜔0 𝑡) + 𝐵 sin(𝜔0 𝑡).

By a trigonometric identity

𝐴 cos(𝜔0 𝑡) + 𝐵 sin(𝜔0 𝑡) = 𝐶 cos(𝜔0 𝑡 − 𝛾),

for two different constants $C$ and $\gamma$. One finds that $A = C \cos \gamma$ and $B = C \sin \gamma$, and
therefore it is not hard to compute that $C = \sqrt{A^2 + B^2}$ and $\tan \gamma = B/A$. Therefore, we let $C$
and $\gamma$ be our arbitrary constants and write $x(t) = C \cos(\omega_0 t - \gamma)$.

Exercise 2.4.1: Justify the identity $A \cos(\omega_0 t) + B \sin(\omega_0 t) = C \cos(\omega_0 t - \gamma)$ and verify the
equations $C = \sqrt{A^2 + B^2}$ and $\tan \gamma = \frac{B}{A}$. Hint: Start with $\cos(\alpha - \beta) = \cos(\alpha) \cos(\beta) +
\sin(\alpha) \sin(\beta)$ and multiply by $C$. Then what should $\alpha$ and $\beta$ be?

While it is easier to use the first form with 𝐴 and 𝐵 to solve for the initial conditions,
the second form is more natural. The constants 𝐶 and 𝛾 have a nice physical interpretation.
Write the solution as
𝑥(𝑡) = 𝐶 cos(𝜔0 𝑡 − 𝛾).
This is a pure-frequency oscillation (a sine wave). The amplitude is 𝐶, 𝜔0 is the (angular)
frequency, and 𝛾 is the so-called phase shift. The phase shift just shifts the graph left or right.
We call 𝜔0 the natural (angular) frequency. This entire setup is called simple harmonic motion.
Let us pause to explain the word angular before the word frequency. The units of $\omega_0$
are radians per unit time, not cycles per unit time as is the usual measure of frequency.
Because one cycle is $2\pi$ radians, the usual frequency is given by $\frac{\omega_0}{2\pi}$. It is simply a matter of
where we put the constant $2\pi$, and that is a matter of taste.
The period of the motion is one over the frequency (in cycles per unit time) and hence it
is $\frac{2\pi}{\omega_0}$. That is the amount of time it takes to complete one full cycle.
Example 2.4.1: Suppose that 𝑚 = 2 kg and 𝑘 = 8 N/m. The whole mass and spring setup
is sitting on a truck that was traveling at 1 m/s. The truck crashes and hence stops. The
mass was held in place 0.5 meters forward from the rest position. During the crash
the mass gets loose. That is, the mass is now moving forward at 1 m/s, while the other
end of the spring is held in place. The mass therefore starts oscillating. What is the
frequency of the resulting oscillation? What is the amplitude? The units are the mks units
(meters-kilograms-seconds).
The setup means that the mass was at half a meter in the positive direction during the
crash and relative to the wall the spring is mounted to, the mass was moving forward (in
the positive direction) at 1 m/s. This gives the initial conditions. So the equation with initial
conditions is
2𝑥 ′′ + 8𝑥 = 0, 𝑥(0) = 0.5, 𝑥 ′(0) = 1.
We compute $\omega_0 = \sqrt{k/m} = \sqrt{4} = 2$. Hence the angular frequency is 2. The usual frequency
in Hertz (cycles per second) is $2/(2\pi) = 1/\pi \approx 0.318$.
The general solution is
𝑥(𝑡) = 𝐴 cos(2𝑡) + 𝐵 sin(2𝑡).
Letting $x(0) = 0.5$ means $A = 0.5$. Then $x'(t) = -2(0.5) \sin(2t) + 2B \cos(2t)$. Letting $x'(0) = 1$
we get $B = 0.5$. Therefore, the amplitude is $C = \sqrt{A^2 + B^2} = \sqrt{0.25 + 0.25} = \sqrt{0.5} \approx 0.707$.
The solution is
𝑥(𝑡) = 0.5 cos(2𝑡) + 0.5 sin(2𝑡).
A plot of 𝑥(𝑡) is shown in Figure 2.2.
In general, for free undamped motion, a solution of the form
\[ x(t) = A \cos(\omega_0 t) + B \sin(\omega_0 t), \]
corresponds to the initial conditions $x(0) = A$ and $x'(0) = \omega_0 B$. It is easy to find $A$ and $B$
from the initial conditions. The amplitude and the phase shift are computed from $A$
and $B$. In the example, we found the amplitude $C$. How about the phase shift $\gamma$? We
know $\tan \gamma = B/A = 1$. The arctangent of 1 is $\pi/4$ or approximately 0.785. We must
check that $\pi/4$ gives the correct quadrant, that is, that the angle from the positive $x$-axis is in the
same quadrant as the point $(A, B)$; if not, we add $\pi$. That is because $A = C \cos \gamma$ and
$B = C \sin \gamma$. Both $A$ and $B$ are positive, so $\gamma$ must be in the first quadrant. As $\pi/4$ radians
is in the first quadrant, $\gamma = \pi/4$.

[Figure 2.2: Simple undamped oscillation.]
Note: Many calculators and computer software have not only the atan function for
arctangent, but also what is sometimes called atan2. This function takes two arguments, 𝐵
and 𝐴, and returns a 𝛾 in the correct quadrant for you.
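As a quick illustration of the remark about atan2, the computation for Example 2.4.1 can be sketched in Python, whose standard library provides `math.atan2` and `math.hypot`. The variable names below are chosen for this illustration only.

```python
import math

m, k = 2.0, 8.0          # mass (kg) and spring constant (N/m)
x0, v0 = 0.5, 1.0        # initial position (m) and velocity (m/s)

omega0 = math.sqrt(k / m)            # natural angular frequency (rad/s)
frequency = omega0 / (2 * math.pi)   # usual frequency in Hz
period = 2 * math.pi / omega0        # seconds per full cycle

A = x0                # from x(0) = A
B = v0 / omega0       # from x'(0) = omega0 * B

C = math.hypot(A, B)        # amplitude sqrt(A^2 + B^2)
gamma = math.atan2(B, A)    # phase shift, returned in the correct quadrant


def x(t):
    """Position in the amplitude-phase form C cos(omega0 t - gamma)."""
    return C * math.cos(omega0 * t - gamma)


print(omega0, frequency, period)
print(A, B, C, gamma)
```

Note the argument order: `atan2` takes the sine-like quantity $B$ first and the cosine-like quantity $A$ second, so no separate quadrant check is needed.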

2.4.3 Free damped motion


Let us now focus on damped motion. We rewrite the equation
𝑚𝑥 ′′ + 𝑐𝑥 ′ + 𝑘𝑥 = 0,
as
\[ x'' + 2p x' + \omega_0^2 x = 0, \]
where
\[ \omega_0 = \sqrt{\frac{k}{m}}, \qquad p = \frac{c}{2m}. \]
The characteristic equation is
\[ r^2 + 2pr + \omega_0^2 = 0. \]
Using the quadratic formula, we get that the roots are
\[ r = -p \pm \sqrt{p^2 - \omega_0^2}. \]

The form of the solution depends on whether we get complex or real roots. We get real
roots if and only if the following number is nonnegative:
\[ p^2 - \omega_0^2 = \left( \frac{c}{2m} \right)^2 - \frac{k}{m} = \frac{c^2 - 4km}{4m^2}. \]
The sign of 𝑝 2 − 𝜔02 is the same as the sign of 𝑐 2 − 4𝑘𝑚. Thus we get real roots if and only if
𝑐 2 − 4𝑘𝑚 is nonnegative, or in other words if 𝑐 2 ≥ 4𝑘𝑚.
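The sign test on $c^2 - 4km$ translates directly into code. The sketch below labels a system with the names used in the subsections that follow (overdamped, critically damped, underdamped) and computes the characteristic roots with `cmath` so that complex roots are handled too. The function names are hypothetical, and the exact comparison with zero is only meaningful for inputs where $c^2 - 4km$ can be computed without rounding error.

```python
import cmath


def classify_damping(m, c, k):
    """Classify m x'' + c x' + k x = 0 by the sign of c^2 - 4km."""
    disc = c * c - 4 * k * m
    if disc > 0:
        return "overdamped"
    if disc == 0:
        return "critically damped"
    return "underdamped"


def characteristic_roots(m, c, k):
    """Roots r = -p +/- sqrt(p^2 - omega0^2), possibly complex."""
    p = c / (2 * m)
    omega0_squared = k / m
    s = cmath.sqrt(p * p - omega0_squared)
    return -p + s, -p - s


for c in (5.0, 4.0, 1.0):
    print(c, classify_damping(1.0, c, 4.0), characteristic_roots(1.0, c, 4.0))
```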

Overdamping
When $c^2 - 4km > 0$, the system is overdamped. In this case, there are two distinct real roots
$r_1$ and $r_2$. As $\sqrt{p^2 - \omega_0^2}$ is always less than $p$, the expression for the roots
$-p \pm \sqrt{p^2 - \omega_0^2}$ is negative either way. So both roots are negative.
The solution is
\[ x(t) = C_1 e^{r_1 t} + C_2 e^{r_2 t}. \]
Since $r_1$, $r_2$ are negative, $x(t) \to 0$ as $t \to \infty$. Thus the mass will tend towards the rest
position as time goes to infinity. For a few sample plots for different initial conditions,
see Figure 2.3.

[Figure 2.3: Overdamped motion for several different initial conditions.]
No oscillation happens. In fact, the graph crosses the $t$-axis at most once. To see why,
we try to solve $0 = C_1 e^{r_1 t} + C_2 e^{r_2 t}$. Therefore, $C_1 e^{r_1 t} = -C_2 e^{r_2 t}$ and using laws of exponents
we obtain
\[ \frac{-C_1}{C_2} = e^{(r_2 - r_1)t}. \]
This equation has at most one solution 𝑡 ≥ 0. For some initial conditions the graph never
crosses the 𝑡-axis, as is evident from the sample graphs.
Example 2.4.2: Suppose the mass is released from rest. That is, $x(0) = x_0$ and $x'(0) = 0$.
Then
\[ x(t) = \frac{x_0}{r_1 - r_2} \left( r_1 e^{r_2 t} - r_2 e^{r_1 t} \right). \]
It is not hard to see that this satisfies the initial conditions.
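The check can also be done numerically. The sketch below builds the released-from-rest solution for a sample overdamped system and evaluates the position and velocity at $t = 0$. The function name `released_from_rest` and the sample values $m = 1$, $c = 5$, $k = 4$ (which give $r_1 = -1$ and $r_2 = -4$) are assumptions made purely for illustration.

```python
import math


def released_from_rest(m, c, k, x0):
    """Return x(t) and x'(t) for an overdamped system with x(0)=x0, x'(0)=0."""
    disc = c * c - 4 * k * m
    if disc <= 0:
        raise ValueError("not overdamped: need c^2 > 4km")
    r1 = (-c + math.sqrt(disc)) / (2 * m)
    r2 = (-c - math.sqrt(disc)) / (2 * m)
    scale = x0 / (r1 - r2)

    def x(t):
        return scale * (r1 * math.exp(r2 * t) - r2 * math.exp(r1 * t))

    def v(t):
        # Derivative of x(t): both terms pick up the factor r1 * r2.
        return scale * r1 * r2 * (math.exp(r2 * t) - math.exp(r1 * t))

    return x, v


x, v = released_from_rest(m=1.0, c=5.0, k=4.0, x0=1.0)
print(x(0.0), v(0.0))            # initial position and velocity
print([round(x(t), 4) for t in (0.5, 1.0, 2.0, 4.0)])
```

For this sample system the position stays positive and decays toward zero, consistent with the claim that an overdamped mass released from rest never crosses the rest position.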

Critical damping
When 𝑐 2 − 4𝑘𝑚 = 0, the system is critically damped. In this case, there is one root of
multiplicity 2 and this root is −𝑝. Our solution is

𝑥(𝑡) = 𝐶1 𝑒 −𝑝𝑡 + 𝐶2 𝑡𝑒 −𝑝𝑡 .

The behavior of a critically damped system is very similar to an overdamped system. After
all, a critically damped system is, in some sense, a limit of overdamped systems. Since
these equations are really only an approximation to the real world, in reality, we are never
critically damped; it is a place we can only reach in theory. We are always a little bit
underdamped or a little bit overdamped. It is better not to dwell on critical damping.

Underdamping
When $c^2 - 4km < 0$, the system is underdamped. In this case, the roots are complex.
\[ r = -p \pm \sqrt{p^2 - \omega_0^2} = -p \pm \sqrt{-1}\,\sqrt{\omega_0^2 - p^2} = -p \pm i\omega_1, \]
where $\omega_1 = \sqrt{\omega_0^2 - p^2}$. Our solution is
\[ x(t) = e^{-pt} \bigl( A \cos(\omega_1 t) + B \sin(\omega_1 t) \bigr), \]
or
\[ x(t) = C e^{-pt} \cos(\omega_1 t - \gamma). \]

[Figure 2.4: Underdamped motion with the envelope curves shown.]
An example plot is given in Figure 2.4. Note that we still have that 𝑥(𝑡) → 0 as 𝑡 → ∞.
The figure also shows the envelope curves 𝐶𝑒 −𝑝𝑡 and −𝐶𝑒 −𝑝𝑡 . The solution is the oscillating
line between the two envelope curves. The envelope curves give the maximum amplitude
of the oscillation at any given point in time. For example, if you are bungee jumping, you
are really interested in computing the envelope curve so as not to hit the concrete with
your head.
The phase shift 𝛾 shifts the oscillation left or right, but within the envelope curves (the
envelope curves do not change if 𝛾 changes).
Notice that the angular pseudo-frequency‗ 𝜔1 becomes smaller when the damping 𝑐 (and
hence 𝑝) becomes larger. This makes sense. First, when we change the damping just a little
bit, we do not expect the behavior of the solution to change dramatically. Second, if we
keep making 𝑐 larger, then at some point the solution should start looking like the solution
for critical damping or overdamping, where no oscillation happens. As 𝑐 gets larger and 𝑐 2
approaches 4𝑘𝑚, we find that 𝜔1 approaches 0.
On the other hand, when 𝑐 gets smaller, 𝜔1 approaches 𝜔0 (𝜔1 is always smaller than
𝜔0 ), and the solution looks more and more like the steady periodic motion of the undamped
case. The envelope curves become flatter and flatter as 𝑐 (and hence 𝑝) goes to 0.
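The underdamped formulas are easy to experiment with. The following sketch computes $p$, $\omega_1$, the constants $A$, $B$, $C$, $\gamma$ from initial conditions, and returns both the solution and its upper envelope. The function name `underdamped_solution` and the sample parameter values are assumptions made only for this illustration.

```python
import math


def underdamped_solution(m, c, k, x0, v0):
    """Return (x, envelope, omega1) for m x'' + c x' + k x = 0 with c^2 < 4km."""
    p = c / (2 * m)
    omega0 = math.sqrt(k / m)
    if p >= omega0:
        raise ValueError("not underdamped: need c^2 < 4km")
    omega1 = math.sqrt(omega0**2 - p**2)   # angular pseudo-frequency
    A = x0                                 # from x(0) = A
    B = (v0 + p * x0) / omega1             # from x'(0) = -p*A + omega1*B
    C = math.hypot(A, B)
    gamma = math.atan2(B, A)

    def x(t):
        return C * math.exp(-p * t) * math.cos(omega1 * t - gamma)

    def envelope(t):
        return C * math.exp(-p * t)

    return x, envelope, omega1


x, envelope, omega1 = underdamped_solution(m=1.0, c=0.4, k=4.0, x0=1.0, v0=0.0)
print(omega1)                       # slightly less than omega0 = 2
for t in (0.0, 2.0, 5.0, 10.0):
    print(t, round(x(t), 4), round(envelope(t), 4))
```

Increasing `c` toward $\sqrt{4km}$ in this sketch drives `omega1` toward 0, while decreasing it toward 0 brings `omega1` up to $\omega_0$ and flattens the envelope.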

2.4.4 Exercises
Exercise 2.4.2: Consider a mass and spring system with a mass 𝑚 = 2, spring constant 𝑘 = 3, and
damping constant 𝑐 = 1.

a) Set up and find the general solution of the system.


b) Is the system underdamped, overdamped, or critically damped?
c) If the system is not critically damped, find a 𝑐 that makes the system critically damped.

Exercise 2.4.3: Do Exercise 2.4.2 for 𝑚 = 3, 𝑘 = 12, and 𝑐 = 12.

Exercise 2.4.4: Using the mks units (meters-kilograms-seconds), suppose you have a spring with
spring constant 4 N/m. You want to use it to weigh items. Assume no friction. You place the mass
on the spring and put it in motion.

a) You count and find that the frequency is 0.8 Hz (cycles per second). What is the mass?
b) Find a formula for the mass 𝑚 given the frequency 𝜔 in Hz.

Exercise 2.4.5: Suppose we add possible friction to Exercise 2.4.4. Further, suppose you do not
know the spring constant, but you have two reference weights 1 kg and 2 kg to calibrate your setup.
You put each in motion on your spring and measure the frequency. For the 1 kg weight you measured
1.1 Hz, for the 2 kg weight you measured 0.8 Hz.

a) Find 𝑘 (spring constant) and 𝑐 (damping constant).


b) Find a formula for the mass in terms of the frequency in Hz. Note that there may be more
than one possible mass for a given frequency.
c) For an unknown object you measured 0.2 Hz, what is the mass of the object? Suppose that
you know that the mass of the unknown object is more than a kilogram.
‗We do not call 𝜔1 a frequency since the solution is not really a periodic function.
Exercise 2.4.6: Suppose you wish to measure the friction a mass of 0.1 kg experiences as it slides
along a floor (you wish to find 𝑐). You have a spring with spring constant 𝑘 = 5 N/m. You take the
spring, you attach it to the mass and fix it to a wall. Then you pull on the spring and let the mass
go. You find that the mass oscillates with frequency 1 Hz. What is the friction?

Exercise 2.4.101: A mass of 2 kilograms is on a spring with spring constant 𝑘 newtons per meter
with no damping. Suppose the system is at rest and at time 𝑡 = 0 the mass is kicked and starts
traveling at 2 meters per second. How large does 𝑘 have to be so that the mass does not go further
than 3 meters from the rest position?

Exercise 2.4.102: Suppose we have an RLC circuit with a resistor of 100 milliohms (0.1 ohms),
inductor of inductance of 50 millihenries (0.05 henries), and a capacitor of 5 farads, with constant
voltage.

a) Set up the ODE for the current 𝐼.


b) Find the general solution.
c) Solve for 𝐼(0) = 10 and 𝐼 ′(0) = 0.

Exercise 2.4.103: A 5000 kg railcar hits a bumper (a spring) at 1 m/s, and the spring compresses by
0.1 m. Assume no damping.

a) Find 𝑘.
b) How far does the spring compress when a 10000 kg railcar hits the spring at the same speed?
c) If the spring would break if it compresses further than 0.3 m, what is the maximum mass of a
railcar that can hit it at 1 m/s?
d) What is the maximum mass of a railcar that can hit the spring without it breaking at 2 m/s?

Exercise 2.4.104: A mass of 𝑚 kg is on a spring with 𝑘 = 3 N/m and 𝑐 = 2 Ns/m. Find the mass
𝑚0 for which there is critical damping. If 𝑚 < 𝑚0 , does the system oscillate or not, that is, is it
underdamped or overdamped?
2.5 Nonhomogeneous equations


Note: 2 lectures, §3.5 in [EP], §3.5 and §3.6 in [BD]

2.5.1 Solving nonhomogeneous equations


We have solved linear constant-coefficient homogeneous equations. What about nonhomo-
geneous linear ODEs? For example, the equations for forced mechanical vibrations. That
is, consider an equation such as

𝑦 ′′ + 5𝑦 ′ + 6𝑦 = 2𝑥 + 1. (2.6)

We will write 𝐿𝑦 = 2𝑥 + 1 when the exact form of the operator is not important. We
solve (2.6) in the following manner. First, we find the general solution 𝑦 𝑐 to the associated
homogeneous equation
𝑦 ′′ + 5𝑦 ′ + 6𝑦 = 0. (2.7)
We call 𝑦 𝑐 the complementary solution. Next, we find a single particular solution 𝑦 𝑝 to (2.6) in
some way. Then
𝑦 = 𝑦𝑐 + 𝑦𝑝
is the general solution to (2.6). We have 𝐿𝑦 𝑐 = 0 and 𝐿𝑦 𝑝 = 2𝑥 + 1. As 𝐿 is a linear operator,
we verify that 𝑦 is a solution: 𝐿𝑦 = 𝐿(𝑦 𝑐 + 𝑦 𝑝 ) = 𝐿𝑦 𝑐 + 𝐿𝑦 𝑝 = 0 + (2𝑥 + 1). Let us see why
we obtain the general solution.
Let 𝑦 𝑝 and 𝑦˜ 𝑝 be two different particular solutions to (2.6). Write the difference as
𝑤 = 𝑦 𝑝 − 𝑦˜ 𝑝 . Then plug 𝑤 into the left-hand side of the equation to get

𝑤 ′′ + 5𝑤 ′ + 6𝑤 = (𝑦 𝑝′′ + 5𝑦 𝑝′ + 6𝑦 𝑝 ) − ( 𝑦˜ 𝑝′′ + 5 𝑦˜ 𝑝′ + 6 𝑦˜ 𝑝 ) = (2𝑥 + 1) − (2𝑥 + 1) = 0.

Using the operator notation, the calculation becomes simpler. As 𝐿 is a linear operator, we
write
𝐿𝑤 = 𝐿(𝑦 𝑝 − 𝑦˜ 𝑝 ) = 𝐿𝑦 𝑝 − 𝐿 𝑦˜ 𝑝 = (2𝑥 + 1) − (2𝑥 + 1) = 0.
So 𝑤 = 𝑦 𝑝 − 𝑦˜ 𝑝 is a solution to (2.7), that is, 𝐿𝑤 = 0. Any two solutions of (2.6) differ by a
solution to the homogeneous equation (2.7). The solution 𝑦 = 𝑦 𝑐 + 𝑦 𝑝 includes all solutions
to (2.6), since 𝑦 𝑐 is the general solution to the associated homogeneous equation.
Theorem 2.5.1. Let 𝐿𝑦 = 𝑓 (𝑥) be a linear ODE (not necessarily constant-coefficient). Let 𝑦 𝑐 be
the complementary solution (the general solution to the associated homogeneous equation 𝐿𝑦 = 0)
and let 𝑦 𝑝 be any particular solution to 𝐿𝑦 = 𝑓 (𝑥). Then the general solution to 𝐿𝑦 = 𝑓 (𝑥) is

𝑦 = 𝑦𝑐 + 𝑦𝑝 .

The moral of the story is that we can find the particular solution in any old way. If we
find a different particular solution (by a different method, or simply by guessing), then we
still get the same general solution. The formula may look different, and the constants we
have to choose to satisfy the initial conditions may be different, but it is the same solution.
2.5.2 Undetermined coefficients


The trick is to somehow, in a smart way, guess one particular solution to (2.6). Note that
2𝑥 + 1 is a polynomial, and the left-hand side of the equation will be a polynomial if we let
𝑦 be a polynomial of the same degree. Let us try

𝑦 𝑝 = 𝐴𝑥 + 𝐵.

We plug 𝑦 𝑝 into the left-hand side to obtain

𝑦 𝑝′′ + 5𝑦 𝑝′ + 6𝑦 𝑝 = (𝐴𝑥 + 𝐵)′′ + 5(𝐴𝑥 + 𝐵)′ + 6(𝐴𝑥 + 𝐵)


= 0 + 5𝐴 + 6𝐴𝑥 + 6𝐵 = 6𝐴𝑥 + (5𝐴 + 6𝐵).

So $6Ax + (5A + 6B) = 2x + 1$. Matching coefficients gives $6A = 2$ and $5A + 6B = 1$. Therefore, $A = 1/3$ and
$B = -1/9$. That means $y_p = \frac{1}{3} x - \frac{1}{9} = \frac{3x-1}{9}$.
Solving the complementary problem (exercise!), we get

𝑦 𝑐 = 𝐶1 𝑒 −2𝑥 + 𝐶2 𝑒 −3𝑥 .

Hence the general solution to (2.6) is
\[ y = C_1 e^{-2x} + C_2 e^{-3x} + \frac{3x - 1}{9}. \]
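The coefficient matching for a linear right-hand side can be written as a tiny routine. The sketch below handles $y'' + by' + cy = \alpha x + \beta$ with the guess $y_p = Ax + B$, using exact fractions. The function name and parameter names are hypothetical, and the routine assumes integer inputs with $c \neq 0$.

```python
from fractions import Fraction


def linear_particular_solution(b, c, alpha, beta):
    """Coefficients (A, B) of y_p = A*x + B for y'' + b*y' + c*y = alpha*x + beta.

    Plugging y_p in gives c*A*x + (b*A + c*B), so we match
    c*A = alpha and b*A + c*B = beta.  Assumes c != 0.
    """
    A = Fraction(alpha, c)
    B = (beta - b * A) / c
    return A, B


# Equation (2.6): y'' + 5y' + 6y = 2x + 1.
A, B = linear_particular_solution(b=5, c=6, alpha=2, beta=1)
print(A, B)  # 1/3 -1/9
```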
Now suppose we are further given some initial conditions. For example, $y(0) = 0$ and
$y'(0) = 1/3$. First find $y' = -2C_1 e^{-2x} - 3C_2 e^{-3x} + 1/3$. Then
\[ 0 = y(0) = C_1 + C_2 - \frac{1}{9}, \qquad \frac{1}{3} = y'(0) = -2C_1 - 3C_2 + \frac{1}{3}. \]
We solve to get $C_1 = 1/3$ and $C_2 = -2/9$. The particular solution we want is
\[ y = \frac{1}{3} e^{-2x} - \frac{2}{9} e^{-3x} + \frac{3x - 1}{9} = \frac{3e^{-2x} - 2e^{-3x} + 3x - 1}{9}. \]
Exercise 2.5.1: Check that 𝑦 really solves the equation (2.6) and the given initial conditions.

Note: A common mistake is to solve for constants using the initial conditions with 𝑦 𝑐
and only add the particular solution 𝑦 𝑝 after that. That will not work. You need to first
compute 𝑦 = 𝑦 𝑐 + 𝑦 𝑝 and only then solve for the constants using the initial conditions.
Another important remark is that you should not forget the lower degree terms, even if
they do not appear on the right-hand side. If the equation is 𝐿𝑦 = 𝑥 3 + 1, you must try
𝑦 𝑝 = 𝐴𝑥 3 + 𝐵𝑥 2 + 𝐶𝑥 + 𝐷, even though there is no 𝑥 2 nor 𝑥 on the right-hand side of the
equation. It is a general polynomial of degree 3 that must be tried.
A right-hand side consisting of exponentials, sines, and cosines can be handled similarly.
For example,
𝑦 ′′ + 2𝑦 ′ + 2𝑦 = cos(2𝑥).
Let us find some 𝑦 𝑝 . We start by guessing the solution includes some multiple of cos(2𝑥).
We may have to also add a multiple of sin(2𝑥) to our guess since derivatives of cosine are
sines. We try
𝑦 𝑝 = 𝐴 cos(2𝑥) + 𝐵 sin(2𝑥).
We plug $y_p$ into the equation and we get
\[ \underbrace{-4A \cos(2x) - 4B \sin(2x)}_{y_p''} + 2 \underbrace{\bigl( -2A \sin(2x) + 2B \cos(2x) \bigr)}_{y_p'} + 2 \underbrace{\bigl( A \cos(2x) + B \sin(2x) \bigr)}_{y_p} = \cos(2x), \]

or
(−4𝐴 + 4𝐵 + 2𝐴) cos(2𝑥) + (−4𝐵 − 4𝐴 + 2𝐵) sin(2𝑥) = cos(2𝑥).
The left-hand side must equal the right-hand side. Namely, $-4A + 4B + 2A = 1$ and
$-4B - 4A + 2B = 0$. So $-2A + 4B = 1$ and $2A + B = 0$ and hence $A = -1/10$ and $B = 1/5$. So
\[ y_p = A \cos(2x) + B \sin(2x) = \frac{-\cos(2x) + 2 \sin(2x)}{10}. \]
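The two equations for $A$ and $B$ form a small linear system, and the resulting $y_p$ can be sanity-checked by plugging it back in. The sketch below solves the system by Cramer's rule with exact fractions and then evaluates the residual $y_p'' + 2y_p' + 2y_p - \cos(2x)$ at a few sample points. The helper names are invented for this illustration.

```python
import math
from fractions import Fraction

# Coefficient equations:  -2A + 4B = 1  (cosine terms),  -4A - 2B = 0  (sine terms).
a11, a12, b1 = Fraction(-2), Fraction(4), Fraction(1)
a21, a22, b2 = Fraction(-4), Fraction(-2), Fraction(0)

det = a11 * a22 - a12 * a21
A = (b1 * a22 - a12 * b2) / det      # Cramer's rule
B = (a11 * b2 - b1 * a21) / det
print(A, B)  # -1/10 1/5

Af, Bf = float(A), float(B)


def yp(x):
    return Af * math.cos(2 * x) + Bf * math.sin(2 * x)


def dyp(x):
    return -2 * Af * math.sin(2 * x) + 2 * Bf * math.cos(2 * x)


def d2yp(x):
    return -4 * Af * math.cos(2 * x) - 4 * Bf * math.sin(2 * x)


def residual(x):
    """Left-hand side minus right-hand side of y'' + 2y' + 2y = cos(2x)."""
    return d2yp(x) + 2 * dyp(x) + 2 * yp(x) - math.cos(2 * x)


print(max(abs(residual(0.37 * i)) for i in range(20)))   # tiny, rounding only
```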
Similarly, if the right-hand side contains exponentials, we try exponentials. If

𝐿𝑦 = 𝑒 3𝑥 ,

we try 𝑦 𝑝 = 𝐴𝑒 3𝑥 as our guess and try to solve for 𝐴.


When the right-hand side is a product of sines, cosines, exponentials, and polynomials,
we can use the product rule for differentiation to come up with a guess. We need to guess
a form for 𝑦 𝑝 such that 𝐿𝑦 𝑝 is of the same form, and has all the terms needed for the
right-hand side. For example,

𝐿𝑦 = (1 + 3𝑥 2 ) 𝑒 −𝑥 cos(𝜋𝑥).

For this equation, we guess

𝑦 𝑝 = (𝐴 + 𝐵𝑥 + 𝐶𝑥 2 ) 𝑒 −𝑥 cos(𝜋𝑥) + (𝐷 + 𝐸𝑥 + 𝐹𝑥 2 ) 𝑒 −𝑥 sin(𝜋𝑥).

We plug in and then hopefully get equations that we can solve for 𝐴, 𝐵, 𝐶, 𝐷, 𝐸, and 𝐹. As
you can see this can make for a very long and tedious calculation very quickly. C’est la vie!
There is one hiccup in all this. It could be that our guess actually solves the associated
homogeneous equation. That is, consider

𝑦 ′′ − 9𝑦 = 𝑒 3𝑥 .

We would love to guess 𝑦 𝑝 = 𝐴𝑒 3𝑥 , but if we plug this into the left-hand side of the equation,
we get
𝑦 𝑝′′ − 9𝑦 𝑝 = 9𝐴𝑒 3𝑥 − 9𝐴𝑒 3𝑥 = 0 ≠ 𝑒 3𝑥 .
There is no way to choose 𝐴 to make the left-hand side be 𝑒 3𝑥 . The trick in this case is to
multiply our guess by 𝑥 to get rid of duplication with the complementary solution. That is,
first we find 𝑦 𝑐 (solution to 𝐿𝑦 = 0)

𝑦 𝑐 = 𝐶1 𝑒 −3𝑥 + 𝐶2 𝑒 3𝑥 ,

and we note that the 𝑒 3𝑥 term is a duplicate with our desired guess. We modify our guess
to 𝑦 𝑝 = 𝐴𝑥𝑒 3𝑥 so that there is no duplication anymore. Let us try: 𝑦 𝑝′ = 𝐴𝑒 3𝑥 + 3𝐴𝑥𝑒 3𝑥 and
𝑦 𝑝′′ = 6𝐴𝑒 3𝑥 + 9𝐴𝑥𝑒 3𝑥 , so

𝑦 𝑝′′ − 9𝑦 𝑝 = 6𝐴𝑒 3𝑥 + 9𝐴𝑥𝑒 3𝑥 − 9𝐴𝑥𝑒 3𝑥 = 6𝐴𝑒 3𝑥 .

Thus $6Ae^{3x}$ is supposed to equal $e^{3x}$. Hence, $6A = 1$ and so $A = 1/6$. We can now write the
general solution as
\[ y = y_c + y_p = C_1 e^{-3x} + C_2 e^{3x} + \frac{1}{6} x e^{3x}. \]
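To see the duplication issue concretely, one can apply the operator $Ly = y'' - 9y$ to both the naive guess and the modified guess and compare with the right-hand side $e^{3x}$. The sketch below does so numerically, with the derivatives entered by hand. The function `apply_L` and the lambda names are made up for this illustration.

```python
import math


def apply_L(y, d2y, x):
    """Evaluate the left-hand side y'' - 9y at x."""
    return d2y(x) - 9 * y(x)


A = 1 / 6

# Naive guess y = A e^{3x}: it solves the homogeneous equation, so L y = 0.
naive_y = lambda x: A * math.exp(3 * x)
naive_d2y = lambda x: 9 * A * math.exp(3 * x)

# Modified guess y = A x e^{3x}, with y'' = (6A + 9A x) e^{3x}.
mod_y = lambda x: A * x * math.exp(3 * x)
mod_d2y = lambda x: (6 * A + 9 * A * x) * math.exp(3 * x)

for x in (0.0, 0.5, 1.0):
    print(x, apply_L(naive_y, naive_d2y, x), apply_L(mod_y, mod_d2y, x), math.exp(3 * x))
```

The second column stays at zero (up to rounding) no matter what $A$ is, while the third column matches $e^{3x}$.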
It is possible that multiplying by 𝑥 does not get rid of all duplication. For example,

𝑦 ′′ − 6𝑦 ′ + 9𝑦 = 𝑒 3𝑥 .

The complementary solution is 𝑦 𝑐 = 𝐶1 𝑒 3𝑥 + 𝐶2 𝑥𝑒 3𝑥 . Guessing 𝑦 𝑝 = 𝐴𝑥𝑒 3𝑥 does not get us


anywhere. We want to guess 𝑦 𝑝 = 𝐴𝑥 2 𝑒 3𝑥 . Basically, we multiply our guess by 𝑥 until all
duplication is gone. But no more! Multiplying too many times will not work.
Finally, what if the right-hand side has several terms, such as

𝐿𝑦 = 𝑒 2𝑥 + cos 𝑥.

In this case we find 𝑢 that solves 𝐿𝑢 = 𝑒 2𝑥 and 𝑣 that solves 𝐿𝑣 = cos 𝑥 (that is, do each
term separately). If we set 𝑦 = 𝑢 + 𝑣, then we find our desired solution 𝐿𝑦 = 𝑒 2𝑥 + cos 𝑥.
This is because 𝐿 is linear; we have 𝐿𝑦 = 𝐿(𝑢 + 𝑣) = 𝐿𝑢 + 𝐿𝑣 = 𝑒 2𝑥 + cos 𝑥.

2.5.3 Variation of parameters


The method of undetermined coefficients works for many basic problems that crop up.
But it does not work all the time. It only works when the right-hand side of the equation
𝐿𝑦 = 𝑓 (𝑥) has finitely many linearly independent derivatives, so that we can write a guess
that consists of them all. Some equations are a bit tougher. Consider

𝑦 ′′ + 𝑦 = tan 𝑥.

Each new derivative of $\tan x$ looks completely different and cannot be written as a linear
combination of the previous derivatives. If we start differentiating $\tan x$, we get:
\[ \sec^2 x, \quad 2 \sec^2 x \tan x, \quad 4 \sec^2 x \tan^2 x + 2 \sec^4 x, \]
\[ 8 \sec^2 x \tan^3 x + 16 \sec^4 x \tan x, \quad 16 \sec^2 x \tan^4 x + 88 \sec^4 x \tan^2 x + 16 \sec^6 x, \quad \ldots \]
This equation calls for a different method. We present the method of variation of
parameters, which handles any equation of the form 𝐿𝑦 = 𝑓 (𝑥), provided we can solve
certain integrals. For simplicity, we restrict ourselves to second-order constant-coefficient
equations, but the method works for higher-order equations just as well (the computations
become more tedious). The method also works for equations with nonconstant coefficients,
provided we can solve the associated homogeneous equation.
The details below will work for any equation of the form 𝑦 ′′ + 𝑝(𝑥)𝑦 ′ + 𝑞(𝑥)𝑦 = 𝑓 (𝑥),
but perhaps it is best to explain the method with a specific example. Consider the equation

𝐿𝑦 = 𝑦 ′′ + 𝑦 = tan 𝑥.

First we find the complementary solution (solution to 𝐿𝑦 𝑐 = 0). We get 𝑦 𝑐 = 𝐶1 𝑦1 + 𝐶2 𝑦2 ,


where 𝑦1 = cos 𝑥 and 𝑦2 = sin 𝑥. To find a particular solution to the nonhomogeneous
equation, the trick to this method is to try

𝑦 𝑝 = 𝑦 = 𝑢 1 𝑦 1 + 𝑢2 𝑦 2 ,

where 𝑢1 and 𝑢2 are functions and not constants. We are trying to satisfy 𝐿𝑦 = tan 𝑥. That
gives us one condition on the functions 𝑢1 and 𝑢2 . Compute (note the product rule!)

𝑦 ′ = (𝑢1′ 𝑦1 + 𝑢2′ 𝑦2 ) + (𝑢1 𝑦1′ + 𝑢2 𝑦2′ ).

We can still impose one more condition at our discretion to simplify computations (we
have two unknown functions, so we should be allowed two conditions). We require that
𝑢1′ 𝑦1 + 𝑢2′ 𝑦2 = 0 (the first term above). This makes computing the second derivative easier:

𝑦 ′ = 𝑢1 𝑦1′ + 𝑢2 𝑦2′ ,
𝑦 ′′ = (𝑢1′ 𝑦1′ + 𝑢2′ 𝑦2′ ) + (𝑢1 𝑦1′′ + 𝑢2 𝑦2′′).

Since 𝑦1 and 𝑦2 are solutions to 𝑦 ′′ + 𝑦 = 0, we find 𝑦1′′ = −𝑦1 and 𝑦2′′ = −𝑦2 . (If the equation
were the more general 𝑦 ′′ + 𝑝(𝑥)𝑦 ′ + 𝑞(𝑥)𝑦 = 0, we would have 𝑦 𝑖′′ = −𝑝(𝑥)𝑦 𝑖′ − 𝑞(𝑥)𝑦 𝑖 .) So

𝑦 ′′ = (𝑢1′ 𝑦1′ + 𝑢2′ 𝑦2′ ) − (𝑢1 𝑦1 + 𝑢2 𝑦2 ).

We have 𝑢1 𝑦1 + 𝑢2 𝑦2 = 𝑦 and so

𝑦 ′′ = (𝑢1′ 𝑦1′ + 𝑢2′ 𝑦2′ ) − 𝑦,

and hence
𝑦 ′′ + 𝑦 = 𝐿𝑦 = 𝑢1′ 𝑦1′ + 𝑢2′ 𝑦2′ .
For 𝑦 to satisfy 𝐿𝑦 = 𝑓 (𝑥), we must have 𝑓 (𝑥) = 𝑢1′ 𝑦1′ + 𝑢2′ 𝑦2′ .
What we need to solve are the two equations (conditions) we imposed on 𝑢1 and 𝑢2 :

𝑢1′ 𝑦1 + 𝑢2′ 𝑦2 = 0,
𝑢1′ 𝑦1′ + 𝑢2′ 𝑦2′ = 𝑓 (𝑥).

We get these same equations for any 𝐿𝑦 = 𝑓 (𝑥), where 𝐿𝑦 = 𝑦 ′′ + 𝑝(𝑥)𝑦 ′ + 𝑞(𝑥)𝑦. We solve
for 𝑢1′ and 𝑢2′ in terms of 𝑓 (𝑥), 𝑦1 , and 𝑦2 . There is a general formula for the solution we
could just plug into, but instead of memorizing that, it is easier to simply solve it as we do
below. In our case, 𝑦1 = cos 𝑥, 𝑦2 = sin 𝑥, and 𝑓 (𝑥) = tan 𝑥, so the two equations are

𝑢1′ cos 𝑥 + 𝑢2′ sin 𝑥 = 0,


−𝑢1′ sin 𝑥 + 𝑢2′ cos 𝑥 = tan 𝑥.

Multiply the first equation by sin 𝑥 and the second by cos 𝑥:

\[
u_1' \cos x \sin x + u_2' \sin^2 x = 0,
\]
\[
-u_1' \sin x \cos x + u_2' \cos^2 x = \tan x \cos x = \sin x.
\]

Add the two equations to eliminate 𝑢1′ , solve for 𝑢2′ , and then solve for 𝑢1′ :

\[
u_2' \bigl( \sin^2 x + \cos^2 x \bigr) = \sin x,
\]
\[
u_2' = \sin x,
\]
\[
u_1' = \frac{-\sin^2 x}{\cos x} = \cos x - \sec x.
\]
We integrate 𝑢1′ and 𝑢2′ to get 𝑢1 and 𝑢2 .
\[
u_1 = \int u_1' \, dx = \int (\cos x - \sec x) \, dx = \sin x - \ln \lvert \sec x + \tan x \rvert,
\]
\[
u_2 = \int u_2' \, dx = \int \sin x \, dx = -\cos x.
\]

We are looking for a particular solution, so we forget about the constants of integration. So
our particular solution is

𝑦 𝑝 = 𝑢1 𝑦1 + 𝑢2 𝑦2 = cos 𝑥 sin 𝑥 − cos 𝑥 ln|sec 𝑥 + tan 𝑥| − cos 𝑥 sin 𝑥


= − cos 𝑥 ln|sec 𝑥 + tan 𝑥|.

The general solution to 𝑦 ′′ + 𝑦 = tan 𝑥 is, therefore,

𝑦 = 𝐶1 cos 𝑥 + 𝐶2 sin 𝑥 − cos 𝑥 ln|sec 𝑥 + tan 𝑥|.
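As a sanity check, the particular solution can be tested numerically. The following is a minimal sketch that is not part of the original text: it approximates the second derivative of 𝑦𝑝 with a central difference (the step size `h` and the sample points are arbitrary choices) and evaluates the residual 𝑦𝑝′′ + 𝑦𝑝 − tan 𝑥, which should be close to zero wherever cos 𝑥 ≠ 0.

```python
import math


def y_p(x):
    """Particular solution from the text: -cos(x) * ln|sec(x) + tan(x)|."""
    return -math.cos(x) * math.log(abs(1.0 / math.cos(x) + math.tan(x)))


def residual(x, h=1e-4):
    """Central-difference estimate of y_p'' + y_p - tan(x) at x."""
    second = (y_p(x + h) - 2.0 * y_p(x) + y_p(x - h)) / h**2
    return second + y_p(x) - math.tan(x)


# Sample points are arbitrary; they only need to avoid cos(x) = 0.
for x in (-1.0, 0.3, 1.2):
    print(f"x = {x:5.2f}, residual = {residual(x):.2e}")
```

The residuals are not exactly zero because the finite difference is only an approximation of the derivative; they should merely be small.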

So the general idea for any 𝑦 ′′ + 𝑝𝑦 ′ + 𝑞𝑦 = 𝑓 (𝑥) is to first find solutions 𝑦1 , 𝑦2 to


𝑦 ′′ + 𝑝 𝑦 ′ + 𝑞𝑦 = 0. Then solve the two boxed equations for 𝑢1′ and 𝑢2′ , that is, solve
𝑢1′ 𝑦1 + 𝑢2′ 𝑦2 = 0 and 𝑢1′ 𝑦1′ + 𝑢2′ 𝑦2′ = 𝑓 (𝑥). Integrate 𝑢1′ and 𝑢2′ to find 𝑢1 and 𝑢2 , and plug
those into 𝑦 = 𝑢1 𝑦1 + 𝑢2 𝑦2 to find the particular solution. We remark that if 𝑦 ′′ has some
coefficient that is not 1, that is, if the equation is 𝑎𝑦 ′′ + 𝑏𝑦 ′ + 𝑐𝑦 = 𝑓 (𝑥), you must first divide
the equation (and hence 𝑓 ) by 𝑎.
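The general formula for 𝑢1′ and 𝑢2′ mentioned above can be obtained in several ways. The sketch below assumes one standard route, Cramer's rule applied to the two equations 𝑢1′𝑦1 + 𝑢2′𝑦2 = 0 and 𝑢1′𝑦1′ + 𝑢2′𝑦2′ = 𝑓(𝑥). The function names are hypothetical. It evaluates 𝑢1′ and 𝑢2′ at a point and compares them with the values cos 𝑥 − sec 𝑥 and sin 𝑥 found for 𝑦′′ + 𝑦 = tan 𝑥.

```python
import math


def u_primes(y1, dy1, y2, dy2, f, x):
    """Solve u1' y1 + u2' y2 = 0 and u1' y1' + u2' y2' = f(x) by Cramer's rule."""
    w = y1(x) * dy2(x) - y2(x) * dy1(x)  # determinant of the 2x2 system
    if w == 0.0:
        raise ValueError("y1 and y2 are not linearly independent at x")
    return -y2(x) * f(x) / w, y1(x) * f(x) / w


def neg_sin(x):
    """Derivative of cos(x)."""
    return -math.sin(x)


x = 0.7  # arbitrary sample point with cos(x) != 0
u1p, u2p = u_primes(math.cos, neg_sin, math.sin, math.cos, math.tan, x)
print(u1p, math.cos(x) - 1.0 / math.cos(x))  # the two values should agree
print(u2p, math.sin(x))                      # the two values should agree
```

Remember that 𝑓 here is the right-hand side after dividing by the coefficient of 𝑦′′, as remarked above.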

2.5.4 Exercises
Exercise 2.5.2: Find a particular solution of 𝑦 ′′ − 𝑦 ′ − 6𝑦 = 𝑒 2𝑥 .
Exercise 2.5.3: Find a particular solution of 𝑦 ′′ − 4𝑦 ′ + 4𝑦 = 𝑒 2𝑥 .
Exercise 2.5.4: Solve the initial value problem 𝑦 ′′ + 9𝑦 = cos(3𝑥) + sin(3𝑥) for 𝑦(0) = 2, 𝑦 ′(0) = 1.
Exercise 2.5.5: Set up the form of the particular solution but do not solve for the coefficients for
𝑦 (4) − 2𝑦 ′′′ + 𝑦 ′′ = 𝑒 𝑥 .
Exercise 2.5.6: Set up the form of the particular solution but do not solve for the coefficients for
𝑦 (4) − 2𝑦 ′′′ + 𝑦 ′′ = 𝑒 𝑥 + 𝑥 + sin 𝑥.
Exercise 2.5.7:
a) Using variation of parameters, find a particular solution of 𝑦 ′′ − 2𝑦 ′ + 𝑦 = 𝑒 𝑥 .
b) Find a particular solution using undetermined coefficients.
c) Are the two solutions you found the same? See also Exercise 2.5.10.
Exercise 2.5.8: Find a particular solution of 𝑦 ′′ − 2𝑦 ′ + 𝑦 = sin(𝑥 2 ). It is OK to leave the answer
as a definite integral.
Exercise 2.5.9: For an arbitrary constant 𝑐 find a particular solution to 𝑦 ′′ − 𝑦 = 𝑒 𝑐𝑥 . Hint: Make
sure to handle every possible real 𝑐.
Exercise 2.5.10:
a) Using variation of parameters, find a particular solution of 𝑦 ′′ − 𝑦 = 𝑒 𝑥 .
b) Find a particular solution using undetermined coefficients.
c) Are the two solutions you found the same? What is going on?
Exercise 2.5.11: Find a polynomial 𝑃(𝑥), so that 𝑦 = 2𝑥 2 + 3𝑥 + 4 solves 𝑦 ′′ + 5𝑦 ′ + 𝑦 = 𝑃(𝑥).
Exercise 2.5.101: Find a particular solution to 𝑦 ′′ − 𝑦 ′ + 𝑦 = 2 sin(3𝑥).
Exercise 2.5.102:
a) Find a particular solution to 𝑦 ′′ + 2𝑦 = 𝑒 𝑥 + 𝑥 3 .
b) Find the general solution.
Exercise 2.5.103: Solve 𝑦 ′′ + 2𝑦 ′ + 𝑦 = 𝑥 2 , 𝑦(0) = 1, 𝑦 ′(0) = 2.
Exercise 2.5.104: Use variation of parameters to find a particular solution of $y'' - y = \frac{1}{e^{x} + e^{-x}}$.
Exercise 2.5.105: For an arbitrary constant 𝑐 find the general solution to 𝑦 ′′ − 2𝑦 = sin(𝑥 + 𝑐).
Exercise 2.5.106: Undetermined coefficients can sometimes be used to guess a particular solution
to other equations than those with constant coefficients. Find a polynomial 𝑦(𝑥) that solves
𝑦 ′ + 𝑥 𝑦 = 𝑥 3 + 2𝑥 2 + 5𝑥 + 2.
Note: Not every right-hand side will allow a polynomial solution, for example, 𝑦 ′ + 𝑥𝑦 = 1 does
not, but a technique based on undetermined coefficients does work, see chapter 7.

2.6 Forced oscillations and resonance


Note: 2 lectures, §3.6 in [EP], §3.8 in [BD]
Let us return to the example of a mass on a spring. We examine the case of forced oscillations, which we did not yet handle. That is, we consider the equation
\[
m x'' + c x' + k x = F(t),
\]
for some nonzero $F(t)$. The setup is again: $x$ is position, $m$ is mass, $c$ is friction, $k$ is the spring constant, and $F(t)$ is an external force acting on the mass.

[Diagram: a mass $m$ on a spring with constant $k$, with damping $c$, displacement $x$, and external force $F(t)$.]
We are interested in periodic forcing, such as noncentered rotating parts, or perhaps
loud sounds, or other sources of periodic force. Using Fourier series, see chapter 4, we note
that we can understand all periodic functions by considering 𝐹(𝑡) = 𝐹0 cos(𝜔𝑡) (or sine
instead of cosine, the calculations are essentially the same), so we focus on this simple case.

2.6.1 Undamped forced motion and resonance


First, let us consider undamped (𝑐 = 0) motion. We have the equation
𝑚𝑥 ′′ + 𝑘𝑥 = 𝐹0 cos(𝜔𝑡).
This equation has the complementary solution (solution to the associated homogeneous
equation)
𝑥 𝑐 = 𝐶1 cos(𝜔0 𝑡) + 𝐶2 sin(𝜔0 𝑡),
where $\omega_0 = \sqrt{k/m}$ is the natural frequency (angular). It is the frequency at which the system
“wants to oscillate” without external interference.
Suppose that 𝜔0 ≠ 𝜔. We solve using the method of undetermined coefficients. We try
the solution 𝑥 𝑝 = 𝐴 cos(𝜔𝑡) and solve for 𝐴. We do not need a sine in our trial solution as
after plugging in we only have cosines. If you include a sine, it is fine; you will find that its
coefficient is zero (I could not find a second rhyme). We plug into the equation and solve
for 𝐴 to find
\[
x_p = \frac{F_0}{m(\omega_0^2 - \omega^2)} \cos(\omega t).
\]

We leave it as an exercise to do the algebra required.


The general solution is

\[
x = C_1 \cos(\omega_0 t) + C_2 \sin(\omega_0 t) + \frac{F_0}{m(\omega_0^2 - \omega^2)} \cos(\omega t).
\]

Written another way


\[
x = C \cos(\omega_0 t - \gamma) + \frac{F_0}{m(\omega_0^2 - \omega^2)} \cos(\omega t).
\]
The solution is a superposition of two (shifted) cosine waves at different frequencies.

Example 2.6.1: Take


0.5𝑥 ′′ + 8𝑥 = 10 cos(𝜋𝑡), 𝑥(0) = 0, 𝑥 ′(0) = 0.
Let us compute. First we read off the parameters: $\omega = \pi$, $\omega_0 = \sqrt{8/0.5} = 4$, $F_0 = 10$, $m = 0.5$. The general solution is
\[
x = C_1 \cos(4t) + C_2 \sin(4t) + \frac{20}{16 - \pi^2} \cos(\pi t).
\]
Solve for $C_1$ and $C_2$ using the initial conditions: $C_1 = \frac{-20}{16 - \pi^2}$ and $C_2 = 0$. Hence
\[
x = \frac{20}{16 - \pi^2} \bigl( \cos(\pi t) - \cos(4t) \bigr).
\]
Do notice the “beating” behavior in Figure 2.5. To see it, use the trigonometric identity
\[
2 \sin\left( \frac{A - B}{2} \right) \sin\left( \frac{A + B}{2} \right) = \cos B - \cos A
\]
to get
\[
x = \frac{20}{16 - \pi^2} \left( 2 \sin\left( \frac{4 - \pi}{2} t \right) \sin\left( \frac{4 + \pi}{2} t \right) \right).
\]

Figure 2.5: Graph of $\frac{20}{16 - \pi^2} \bigl( \cos(\pi t) - \cos(4t) \bigr)$.

The function 𝑥 is a high-frequency wave modulated by a low-frequency wave.
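The two forms of the solution in Example 2.6.1 can be compared numerically. The sketch below is an illustration only, with hypothetical function names: it evaluates the sum form and the product form at a few arbitrary times, and it plugs the solution into 0.5𝑥′′ + 8𝑥 = 10 cos(𝜋𝑡) using a second derivative computed by hand.

```python
import math

K = 20.0 / (16.0 - math.pi**2)  # coefficient from Example 2.6.1


def x_sum(t):
    """Solution written as a difference of two cosines."""
    return K * (math.cos(math.pi * t) - math.cos(4.0 * t))


def x_beats(t):
    """Same solution written as a product of a slow and a fast sine."""
    slow = math.sin((4.0 - math.pi) / 2.0 * t)
    fast = math.sin((4.0 + math.pi) / 2.0 * t)
    return K * 2.0 * slow * fast


def ode_residual(t):
    """0.5 x'' + 8 x - 10 cos(pi t), using the exact second derivative of x_sum."""
    x_dd = K * (-math.pi**2 * math.cos(math.pi * t) + 16.0 * math.cos(4.0 * t))
    return 0.5 * x_dd + 8.0 * x_sum(t) - 10.0 * math.cos(math.pi * t)


for t in (0.0, 1.0, 2.5, 7.3):
    print(t, x_sum(t), x_beats(t), ode_residual(t))

# The slow factor sin((4 - pi) t / 2) is the modulating wave; its period is 4 pi / (4 - pi).
print("period of the slow factor:", 4.0 * math.pi / (4.0 - math.pi))
```

The slow factor has a period of roughly 14.6, which is why the beats are visible on a time scale of about 20.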
Now suppose 𝜔0 = 𝜔. We notice that cos(𝜔𝑡) solves the associated homogeneous
equation. Hence, we cannot try the solution 𝐴 cos(𝜔𝑡) with the method of undetermined
coefficients. Therefore, we try 𝑥 𝑝 = 𝐴𝑡 cos(𝜔𝑡) + 𝐵𝑡 sin(𝜔𝑡). This time we do need the sine
term, since the second derivative of 𝑡 cos(𝜔𝑡) contains sines. We write the equation
\[
x'' + \omega^2 x = \frac{F_0}{m} \cos(\omega t).
\]
Plugging $x_p$ into the left-hand side, we get
\[
2B\omega \cos(\omega t) - 2A\omega \sin(\omega t) = \frac{F_0}{m} \cos(\omega t).
\]
Hence $A = 0$ and $B = \frac{F_0}{2m\omega}$. Our particular solution is $\frac{F_0}{2m\omega} \, t \sin(\omega t)$ and the general solution is
\[
x = C_1 \cos(\omega t) + C_2 \sin(\omega t) + \frac{F_0}{2m\omega} \, t \sin(\omega t).
\]
The important term is the last one (the particular solution we found). This term grows without bound as $t \to \infty$. In fact it oscillates between $\frac{F_0 t}{2m\omega}$ and $\frac{-F_0 t}{2m\omega}$. The first two terms only oscillate between $\pm\sqrt{C_1^2 + C_2^2}$, which becomes smaller and smaller in proportion to the oscillations of the last term as $t$ gets larger. In Figure 2.6, we see the graph with $C_1 = C_2 = 0$, $F_0 = 2$, $m = 1$, $\omega = \pi$.
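To see the linear growth concretely, here is a small sketch using the parameter values quoted for the graph (𝐹0 = 2, 𝑚 = 1, 𝜔 = 𝜋). The function names are hypothetical. It checks the resonant particular solution against the equation with a hand-computed second derivative, and prints its value at times where sin(𝜋𝑡) = 1, where it touches the growing envelope.

```python
import math

F0, m, omega = 2.0, 1.0, math.pi  # values quoted for the graph


def x_p(t):
    """Resonant particular solution F0 / (2 m omega) * t * sin(omega t)."""
    return F0 / (2.0 * m * omega) * t * math.sin(omega * t)


def ode_residual(t):
    """x'' + omega^2 x - (F0 / m) cos(omega t), with x'' computed by hand."""
    a = F0 / (2.0 * m * omega)
    x_dd = a * (2.0 * omega * math.cos(omega * t) - omega**2 * t * math.sin(omega * t))
    return x_dd + omega**2 * x_p(t) - F0 / m * math.cos(omega * t)


# sin(pi t) = 1 at t = 0.5, 2.5, 4.5, ...; there x_p equals the envelope t / pi.
for n in range(0, 10, 2):
    t = n + 0.5
    print(f"t = {t:4.1f}, x_p = {x_p(t):.4f}, envelope = {t / math.pi:.4f}")
```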
By forcing the system at just the right frequency, we produce very wild oscillations. This kind of behavior is called resonance or perhaps pure resonance. Sometimes resonance is desired. For example, remember when as a kid you could start swinging by just moving back and forth on the swing seat in the “correct frequency”? You were trying to achieve resonance. The force of each one of your moves was small, but after a while it produced large swings.

Figure 2.6: Graph of $\frac{1}{\pi} t \sin(\pi t)$.

On the other hand, resonance can be destructive. In an earthquake, some buildings collapse while others may be relatively undamaged. This is due to different buildings having different resonance frequencies. So figuring out the resonance frequency can be very important.
A common (but wrong) example of the destructive force of resonance is the Tacoma
Narrows bridge failure. It turns out there was a different phenomenon at play‗ .

2.6.2 Damped forced motion and practical resonance


In real life, things are not as simple as they were above. There is, of course, some damping.
Our equation becomes
𝑚𝑥 ′′ + 𝑐𝑥 ′ + 𝑘𝑥 = 𝐹0 cos(𝜔𝑡), (2.8)
for some 𝑐 > 0. We solved the homogeneous problem before. We let
\[
p = \frac{c}{2m}, \qquad \omega_0 = \sqrt{\frac{k}{m}}.
\]
We replace equation (2.8) with
\[
x'' + 2p x' + \omega_0^2 x = \frac{F_0}{m} \cos(\omega t).
\]
The roots of the characteristic equation of the associated homogeneous problem are $r_1, r_2 = -p \pm \sqrt{p^2 - \omega_0^2}$. The form of the general solution of the associated homogeneous equation depends on the sign of $p^2 - \omega_0^2$, or equivalently on the sign of $c^2 - 4km$, as before:
\[
x_c =
\begin{cases}
C_1 e^{r_1 t} + C_2 e^{r_2 t} & \text{if } c^2 > 4km, \\
C_1 e^{-pt} + C_2 t e^{-pt} & \text{if } c^2 = 4km, \\
e^{-pt} \bigl( C_1 \cos(\omega_1 t) + C_2 \sin(\omega_1 t) \bigr) & \text{if } c^2 < 4km,
\end{cases}
\]
where $\omega_1 = \sqrt{\omega_0^2 - p^2}$. In any case, we see that $x_c(t) \to 0$ as $t \to \infty$.
‗ K. Billah and R. Scanlan, Resonance, Tacoma Narrows Bridge Failure, and Undergraduate Physics Textbooks, American Journal of Physics, 59(2), 1991, 118–124, http://www.ketchum.org/billah/Billah-Scanlan.pdf

Let us find a particular solution. There can be no conflicts when trying to solve for the
undetermined coefficients by trying 𝑥 𝑝 = 𝐴 cos(𝜔𝑡) + 𝐵 sin(𝜔𝑡). Let us plug in and solve
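for $A$ and $B$. We get (the tedious details are left to the reader)
\[
\bigl( (\omega_0^2 - \omega^2) B - 2\omega p A \bigr) \sin(\omega t) + \bigl( (\omega_0^2 - \omega^2) A + 2\omega p B \bigr) \cos(\omega t) = \frac{F_0}{m} \cos(\omega t).
\]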
We solve for 𝐴 and 𝐵:
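\[
A = \frac{(\omega_0^2 - \omega^2) F_0}{m (2\omega p)^2 + m (\omega_0^2 - \omega^2)^2},
\]
\[
B = \frac{2\omega p F_0}{m (2\omega p)^2 + m (\omega_0^2 - \omega^2)^2}.
\]
We also compute $C = \sqrt{A^2 + B^2}$ to be
\[
C = \frac{F_0}{m \sqrt{(2\omega p)^2 + (\omega_0^2 - \omega^2)^2}}.
\]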

Thus our particular solution is


\[
x_p = \frac{(\omega_0^2 - \omega^2) F_0}{m (2\omega p)^2 + m (\omega_0^2 - \omega^2)^2} \cos(\omega t) + \frac{2\omega p F_0}{m (2\omega p)^2 + m (\omega_0^2 - \omega^2)^2} \sin(\omega t).
\]
In the alternative notation, we have amplitude 𝐶 and phase shift 𝛾 where (if 𝜔 ≠ 𝜔0 )
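\[
\tan \gamma = \frac{B}{A} = \frac{2\omega p}{\omega_0^2 - \omega^2}.
\]
Hence,
\[
x_p = \frac{F_0}{m \sqrt{(2\omega p)^2 + (\omega_0^2 - \omega^2)^2}} \cos(\omega t - \gamma).
\]
If $\omega = \omega_0$, then $A = 0$, $B = C = \frac{F_0}{2m\omega p}$, and $\gamma = \pi/2$.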
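The formulas for 𝐴, 𝐵, 𝐶, and 𝛾 are easy to mistype, so a short numerical check may help. The following sketch uses a hypothetical helper name, and the parameter values are those quoted in the caption of Figure 2.7. Using `atan2` to pick 𝛾 is an implementation choice of this sketch, not something stated in the text.

```python
import math


def steady_periodic(m, c, k, F0, omega):
    """Return (A, B, C, gamma) with A cos(wt) + B sin(wt) = C cos(wt - gamma)."""
    p = c / (2.0 * m)
    omega0_sq = k / m
    denom = m * ((2.0 * omega * p) ** 2 + (omega0_sq - omega**2) ** 2)
    A = (omega0_sq - omega**2) * F0 / denom
    B = 2.0 * omega * p * F0 / denom
    C = math.hypot(A, B)
    # atan2 returns the angle with tan(gamma) = B / A in the correct quadrant,
    # and it also covers the case omega = omega0, where A = 0.
    gamma = math.atan2(B, A)
    return A, B, C, gamma


# Parameters quoted for Figure 2.7: k = 1, m = 1, F0 = 1, c = 0.7, omega = 1.1.
A, B, C, gamma = steady_periodic(1.0, 0.7, 1.0, 1.0, 1.1)
print(A, B, C, gamma)

# Both forms of x_sp should agree at any time t.
t = 0.9
print(A * math.cos(1.1 * t) + B * math.sin(1.1 * t), C * math.cos(1.1 * t - gamma))
```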
For reasons we will explain in a moment, we call 𝑥 𝑐 the transient solution and denote
it by 𝑥 𝑡𝑟 . We call the 𝑥 𝑝 from above the steady periodic solution and denote it by 𝑥 𝑠𝑝 . The
general solution is
𝑥 = 𝑥 𝑐 + 𝑥 𝑝 = 𝑥 𝑡𝑟 + 𝑥 𝑠𝑝 .
The transient solution 𝑥 𝑡𝑟 = 𝑥 𝑐 goes to zero as 𝑡 → ∞, as all the terms involve an
exponential with a negative exponent. So for large 𝑡, the effect of 𝑥 𝑡𝑟 is negligible and we
see essentially only 𝑥 𝑠𝑝 . Hence the name transient. Notice that 𝑥 𝑠𝑝 involves no arbitrary
constants, and the initial conditions only affect 𝑥 𝑡𝑟 . Thus, the effect of the initial conditions
is negligible after some period of time. We might as well focus on the steady periodic
solution and ignore the transient solution. See Figure 2.7 on the next page for a graph
given several different initial conditions.
The speed at which $x_{tr}$ goes to zero depends on $p$ (and hence $c$). The bigger $p$ is (the bigger $c$ is), the “faster” $x_{tr}$ becomes negligible. So the smaller the damping, the longer the “transient region.” This is consistent with the observation that when $c = 0$, the initial conditions affect the behavior for all time (i.e. an infinite “transient region”).

Figure 2.7: Solutions with different initial conditions for parameters $k = 1$, $m = 1$, $F_0 = 1$, $c = 0.7$, and $\omega = 1.1$.

Let us describe what we mean by resonance when damping is present. Since there were no conflicts when solving with undetermined coefficients, there is no term that goes to infinity. We look instead at the maximum value of the amplitude of the steady periodic solution. Let $C$ be the amplitude of $x_{sp}$. If we plot $C$ as a function of $\omega$ (with all other parameters fixed), we can find its maximum. We call the $\omega$ that achieves this maximum the practical resonance frequency. We call the maximal amplitude $C(\omega)$ the practical resonance amplitude. Thus when damping is present we talk of practical resonance rather than pure resonance. A sample plot for three different values of $c$ is given in Figure 2.8. As you can see, the practical resonance amplitude grows as damping gets smaller, and practical resonance can disappear altogether when damping is large.

Figure 2.8: Graph of $C(\omega)$ showing practical resonance with parameters $k = 1$, $m = 1$, $F_0 = 1$. The top line is with $c = 0.4$, the middle line with $c = 0.8$, and the bottom line with $c = 1.6$.

To find the maximum of 𝐶(𝜔), we find the derivative 𝐶 ′(𝜔):


\[
C'(\omega) = \frac{-2\omega (2p^2 + \omega^2 - \omega_0^2) F_0}{m \bigl( (2\omega p)^2 + (\omega_0^2 - \omega^2)^2 \bigr)^{3/2}}.
\]

This is zero either when 𝜔 = 0 or when 2𝑝 2 + 𝜔2 − 𝜔02 = 0. In other words, 𝐶 ′(𝜔) = 0 when

\[
\omega = \sqrt{\omega_0^2 - 2p^2} \qquad \text{or} \qquad \omega = 0.
\]
If $\omega_0^2 - 2p^2$ is positive, then $\sqrt{\omega_0^2 - 2p^2}$ is the practical resonance frequency (the point where
𝐶(𝜔) is maximal). This conclusion follows by the first derivative test, for example, as then
𝐶 ′(𝜔) > 0 for small 𝜔 in this case. If on the other hand 𝜔02 − 2𝑝 2 is not positive, then 𝐶(𝜔)
achieves its maximum at 𝜔 = 0, and there is no practical resonance since we assume 𝜔 > 0
in our system. In this case, the amplitude gets larger as the forcing frequency gets smaller.
If practical resonance occurs, the frequency is smaller than 𝜔0 . As the damping 𝑐 (and
hence 𝑝) becomes smaller, the practical resonance frequency goes to 𝜔0 . So when damping
is very small, 𝜔0 is a good estimate of the practical resonance frequency. This behavior
agrees with the observation that when 𝑐 = 0, then 𝜔0 is the resonance frequency.
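The criterion above translates directly into a short computation. The sketch below uses hypothetical function names: it returns the practical resonance frequency and amplitude when 𝜔0² − 2𝑝² is positive and reports that there is none otherwise. It is run on the three damping values quoted for Figure 2.8.

```python
import math


def amplitude(m, c, k, F0, omega):
    """Amplitude C(omega) of the steady periodic solution."""
    p = c / (2.0 * m)
    return F0 / (m * math.sqrt((2.0 * omega * p) ** 2 + (k / m - omega**2) ** 2))


def practical_resonance(m, c, k, F0):
    """Return (frequency, amplitude) at practical resonance, or None if there is none."""
    p = c / (2.0 * m)
    radicand = k / m - 2.0 * p**2  # omega0^2 - 2 p^2
    if radicand <= 0.0:
        return None
    omega = math.sqrt(radicand)
    return omega, amplitude(m, c, k, F0, omega)


# Parameters quoted for Figure 2.8: k = 1, m = 1, F0 = 1 and three damping values.
for c in (0.4, 0.8, 1.6):
    print(c, practical_resonance(1.0, c, 1.0, 1.0))
```

For the largest damping value the function returns `None`, matching the bottom curve of the figure, which has no peak for 𝜔 > 0.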
Another interesting observation to make is that when 𝜔 → ∞, then 𝐶 → 0. This means
that if the forcing frequency gets too high it does not manage to get the mass moving in
the mass-spring system. This is quite reasonable intuitively. If we wiggle back and forth
really fast while sitting on a swing, we will not get it moving at all, no matter how forceful.
Fast vibrations just cancel each other out before the mass has any chance of responding by
moving one way or the other.
The behavior is more complicated if the forcing function is not an exact cosine wave,
but for example a square wave. A general periodic function will be the sum (superposition)
of many cosine waves of different frequencies. The reader is encouraged to come back to
this section once we have learned about the Fourier series.

2.6.3 Exercises
Exercise 2.6.1: Derive a formula for 𝑥 𝑠𝑝 if the equation is 𝑚𝑥 ′′ + 𝑐𝑥 ′ + 𝑘𝑥 = 𝐹0 sin(𝜔𝑡). Assume
𝑐 > 0.

Exercise 2.6.2: Derive a formula for 𝑥 𝑠𝑝 if the equation is 𝑚𝑥 ′′ + 𝑐𝑥 ′ + 𝑘𝑥 = 𝐹0 cos(𝜔𝑡) + 𝐹1 cos(3𝜔𝑡). Assume 𝑐 > 0.

Exercise 2.6.3: Take 𝑚𝑥 ′′ + 𝑐𝑥 ′ + 𝑘𝑥 = 𝐹0 cos(𝜔𝑡). Fix 𝑚 > 0, 𝑘 > 0, and 𝐹0 > 0. Consider
the function 𝐶(𝜔). For what values of 𝑐 (solve in terms of 𝑚, 𝑘, and 𝐹0 ) will there be no practical
resonance (that is, for what values of 𝑐 is there no maximum of 𝐶(𝜔) for 𝜔 > 0)?

Exercise 2.6.4: Take 𝑚𝑥 ′′ + 𝑐𝑥 ′ + 𝑘𝑥 = 𝐹0 cos(𝜔𝑡). Fix 𝑐 > 0, 𝑘 > 0, and 𝐹0 > 0. Consider the
function 𝐶(𝜔). For what values of 𝑚 (solve in terms of 𝑐, 𝑘, and 𝐹0 ) will there be no practical
resonance (that is, for what values of 𝑚 is there no maximum of 𝐶(𝜔) for 𝜔 > 0)?

Exercise 2.6.5: A water tower in an earthquake acts as a mass-spring system. Assume that the
container on top is full and the water does not move around. The container then acts as the mass
and the support acts as the spring, where the induced vibrations are horizontal. The container with
water has a mass of 𝑚 = 10,000 kg. It takes a force of 1000 newtons to displace the container 1
meter. For simplicity, assume no friction. When the earthquake hits, the water tower is at rest (it is
not moving). The earthquake induces an external force $F(t) = mA\omega^2 \cos(\omega t)$.

a) What is the natural frequency of the water tower?


b) If 𝜔 is not the natural frequency, find a formula for the maximal amplitude of the resulting
oscillations of the water container (the maximal deviation from the rest position). The motion
will be a high-frequency wave modulated by a low-frequency wave, so simply find the constant
in front of the sines.
c) Suppose 𝐴 = 1 and an earthquake with frequency 0.5 cycles per second comes. What is the
amplitude of the oscillations? Suppose that if the water tower moves more than 1.5 meters
from the rest position, the tower collapses. Will the tower collapse?

Exercise 2.6.101: A mass of 4 kg on a spring with 𝑘 = 4 N/m and a damping constant 𝑐 = 1 Ns/m.
Suppose that 𝐹0 = 2 N. Using the forcing function 𝐹0 cos(𝜔𝑡), find the 𝜔 that causes practical
resonance and find the practical resonance amplitude.

Exercise 2.6.102: Derive a formula for 𝑥 𝑠𝑝 for 𝑚𝑥 ′′ + 𝑐𝑥 ′ + 𝑘𝑥 = 𝐹0 cos(𝜔𝑡) + 𝐴, where 𝐴 is some constant. Assume 𝑐 > 0.

Exercise 2.6.103: Suppose there is no damping in a mass and spring system with 𝑚 = 5, 𝑘 = 20,
and 𝐹0 = 5. Suppose 𝜔 is chosen to be precisely the resonance frequency.

a) Find 𝜔.
b) Find the amplitude of the oscillations at time 𝑡 = 100, given the system is at rest at 𝑡 = 0.
Chapter 3

Systems of ODEs

3.1 Introduction to systems of ODEs


Note: 1 to 1.5 lectures, §4.1 in [EP], §7.1 in [BD]

3.1.1 Systems
Often we do not have just one dependent variable and one equation. We will see that we
may end up with a system of several equations and several dependent variables even if we
start with a single equation.
Given several dependent variables, suppose 𝑦1 , 𝑦2 , . . . , 𝑦𝑛 , we can have a differential
equation involving all of them and their derivatives with respect to one independent
variable 𝑥. For example, 𝑦1′′ = 𝑓 (𝑦1′ , 𝑦2′ , 𝑦1 , 𝑦2 , 𝑥). Usually, when we have two dependent
variables, we have two equations such as

𝑦1′′ = 𝑓1 (𝑦1′ , 𝑦2′ , 𝑦1 , 𝑦2 , 𝑥),


𝑦2′′ = 𝑓2 (𝑦1′ , 𝑦2′ , 𝑦1 , 𝑦2 , 𝑥),

for some known functions 𝑓1 and 𝑓2 . We call the above a system of differential equations. More
precisely, the above is a second-order system of ODEs as second-order derivatives appear.
The system
𝑥 1′ = 𝑔1 (𝑥1 , 𝑥2 , 𝑥3 , 𝑡),
𝑥 2′ = 𝑔2 (𝑥1 , 𝑥2 , 𝑥3 , 𝑡),
𝑥 3′ = 𝑔3 (𝑥1 , 𝑥2 , 𝑥3 , 𝑡),

is a first-order system, where 𝑥1 , 𝑥2 , 𝑥3 are the dependent variables, and 𝑡 is the independent
variable.
The terminology for systems is essentially the same as for single equations. For the

system above, a solution is a set of three functions 𝑥1 (𝑡), 𝑥2 (𝑡), 𝑥3 (𝑡), such that
\[
\begin{aligned}
x_1'(t) &= g_1\bigl( x_1(t), x_2(t), x_3(t), t \bigr), \\
x_2'(t) &= g_2\bigl( x_1(t), x_2(t), x_3(t), t \bigr), \\
x_3'(t) &= g_3\bigl( x_1(t), x_2(t), x_3(t), t \bigr).
\end{aligned}
\]


We may also have an initial condition. As for single equations, we specify 𝑥 1 , 𝑥2 , and 𝑥3
for some fixed 𝑡. For example, 𝑥1 (0) = 𝑎1 , 𝑥2 (0) = 𝑎2 , 𝑥3 (0) = 𝑎3 , where 𝑎 1 , 𝑎 2 , and 𝑎 3 are
some constants. For a second-order system, we must also specify the first derivatives at
the initial point. If we find a solution with arbitrary constants in it, where solving for the
constants gives a solution for any initial condition, we call this solution the general solution.
Best to look at a simple example.
Example 3.1.1: Sometimes a system is easy to solve by solving for one variable and then
for the second variable. Take the first-order system
𝑦1′ = 𝑦1 ,
𝑦2′ = 𝑦1 − 𝑦2 ,
with 𝑦1 , 𝑦2 as the dependent variables and 𝑥 as the independent variable. Consider initial
conditions 𝑦1 (0) = 1, 𝑦2 (0) = 2.
We note that 𝑦1 = 𝐶1 𝑒 𝑥 is the general solution of the first equation. We then plug this 𝑦1
into the second equation and get the equation 𝑦2′ = 𝐶1 𝑒 𝑥 − 𝑦2 , which is a linear first-order
equation that is easily solved for 𝑦2 . By the integrating factor method, we get
\[
e^{x} y_2 = \frac{C_1}{2} e^{2x} + C_2,
\]
or $y_2 = \frac{C_1}{2} e^{x} + C_2 e^{-x}$. The general solution to the system is, therefore,
\[
y_1 = C_1 e^{x}, \qquad y_2 = \frac{C_1}{2} e^{x} + C_2 e^{-x}.
\]
We solve for 𝐶1 and 𝐶2 given the initial conditions. We substitute 𝑥 = 0 to find 1 = 𝑦1 (0) = 𝐶1
and 2 = 𝑦2 (0) = 𝐶1/2 + 𝐶2 , or in other words, 𝐶1 = 1 and 𝐶2 = 3/2. Thus the solution is
𝑦1 = 𝑒 𝑥 , and 𝑦2 = (1/2)𝑒 𝑥 + (3/2)𝑒 −𝑥 .
Generally, we will not be so lucky to be able to solve for each variable separately as in
the example—we will need to solve for all variables at once. While we cannot always solve
one variable at a time, we will try to salvage as much as possible from this technique. In a
certain sense, to solve systems, we will still (try to) solve a bunch of single equations and
put their solutions together. Let us not worry yet about how to solve systems.
We will mostly consider linear systems. Example 3.1.1 is a linear first-order system. It is
linear as none of the dependent variables or their derivatives appear in nonlinear functions
or with powers higher than one (𝑦1 , 𝑦2 , 𝑦1′ , 𝑦2′ , constants, and functions of 𝑥 can appear,
but not 𝑦1 𝑦2 or (𝑦2′ )2 or 𝑦13 ). Another, more complicated, example of a linear (second-order)
system is
𝑦1′′ = 𝑒 𝑥 𝑦1′ + 𝑥 2 𝑦1 + 5𝑦2 + sin(𝑥),
𝑦2′′ = 𝑥𝑦1′ − 𝑦2′ + 2𝑦1 + cos(𝑥).

3.1.2 Applications
We consider some simple applications of systems and how to set up the equations.
Example 3.1.2: First, we consider salt and brine tanks, but this time water flows from one
to the other and back. Imagine we have two tanks, each containing volume 𝑉 liters of salt
brine. The amount of salt in the first tank is 𝑥1 grams, and the amount of salt in the second
tank is 𝑥2 grams. Let 𝑡 denote time—the independent variable. The liquid in each tank is
being constantly and perfectly mixed and flows or is pumped at a rate of 𝑟 liters per second
out of each tank into the other. See Figure 3.1.

[Diagram: two tanks, each of volume 𝑉, holding 𝑥1 and 𝑥2 grams of salt, with flow at rate 𝑟 in each direction.]

Figure 3.1: A closed system of two brine tanks.

The rate of change of 𝑥1 , that is, 𝑥1′ , is the rate of salt entering minus the rate leaving.
The density of the salt in tank 2 is 𝑥2/𝑉 , so the rate of salt entering tank 1 is 𝑥2/𝑉 times 𝑟. The
density of the salt in tank 1 is 𝑥1/𝑉 , so the rate of salt leaving tank 1 is 𝑥1/𝑉 times 𝑟. In other
words,
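\[
x_1' = \frac{x_2}{V} r - \frac{x_1}{V} r = \frac{r}{V} x_2 - \frac{r}{V} x_1 = \frac{r}{V} (x_2 - x_1).
\]
Similarly, to find the rate $x_2'$, the roles of $x_1$ and $x_2$ are reversed. All in all, the system of ODEs for this problem is
\[
\begin{aligned}
x_1' &= \frac{r}{V} (x_2 - x_1), \\
x_2' &= \frac{r}{V} (x_1 - x_2).
\end{aligned}
\]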
In this system, we cannot solve for 𝑥1 or 𝑥2 separately. We must solve for both 𝑥1 and 𝑥2 at
once, which is intuitively clear—the amount of salt in one tank affects the amount in the
other. We cannot know 𝑥1 before we know 𝑥 2 , and vice versa.
We do not yet know how to find all the solutions, but intuitively we can at least find
some solutions. Suppose we know that initially the tanks have the same amount of salt.
That is, we have an initial condition such as 𝑥 1 (0) = 𝑥2 (0) = 𝐶. Then clearly the amount
of salt entering and leaving each tank is the same, so the amounts are not changing. In
other words, $x_1 = C$ and $x_2 = C$ (the constant functions) is a solution: $x_1' = x_2' = 0$, and $\frac{r}{V}(x_2 - x_1) = \frac{r}{V}(x_1 - x_2) = 0$, so the equations are satisfied.

Let us think about the setup a little bit more without solving it. Suppose the initial
conditions are 𝑥 1 (0) = 𝐴 and 𝑥2 (0) = 𝐵, for two different constants 𝐴 and 𝐵. Since no salt is
coming in or out of this closed system, the total amount of salt is constant. That is, 𝑥1 + 𝑥 2
is constant, and so it equals 𝐴 + 𝐵. Intuitively, if 𝐴 > 𝐵, more salt will flow out of tank one
than into it. After a long time, we then expect the amount of salt in each tank to equalize.
In other words, the solutions of both $x_1$ and $x_2$ should tend towards $\frac{A+B}{2}$ as $t$ goes to $\infty$.
Once you know how to solve systems, you can check that this really is so.
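Even before we know how to solve the system exactly, the intuition can be tested with a crude numerical experiment. The sketch below applies Euler's method to the two equations. All numerical values (the initial amounts of salt, the flow rate, the volume, the step size, and the final time) are hypothetical choices for illustration.

```python
def simulate_tanks(a, b, r, volume, t_end, h=0.01):
    """Euler's method for x1' = (r/V)(x2 - x1), x2' = (r/V)(x1 - x2)."""
    x1, x2 = a, b
    for _ in range(round(t_end / h)):
        flow = r / volume * (x2 - x1)  # net rate of salt into tank 1
        x1, x2 = x1 + h * flow, x2 - h * flow
    return x1, x2


# Hypothetical values: 50 g and 10 g of salt, r = 1 L/s, V = 10 L.
x1, x2 = simulate_tanks(50.0, 10.0, r=1.0, volume=10.0, t_end=100.0)
print(x1, x2, x1 + x2)
```

With these values both amounts should end up close to (50 + 10)/2 = 30 grams, and the total should stay at 60 grams up to rounding, since whatever leaves one tank enters the other.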
Example 3.1.3: Let us look at a second-order example. We return to the mass and spring
setup, but this time we consider two masses.
Consider one spring with constant 𝑘 and two masses 𝑚1
and 𝑚2 . Think of the masses as carts on a straight track with
no friction. Let 𝑥1 be the displacement of the first cart and
𝑥2 be the displacement of the second cart. That is, we put
the two carts somewhere with no tension on the spring, we
mark the position of the first and second cart, and we call those the zero positions. Then
𝑥1 measures how far the first cart is from its zero position, and 𝑥2 measures how far the
second cart is from its zero position. The force exerted by the spring on the first cart is
𝑘(𝑥2 − 𝑥 1 ) as 𝑥2 − 𝑥 1 is how far the spring is stretched (or compressed) from the rest position.
The force exerted on the second cart is the opposite, thus the same thing with a negative
sign. Newton’s second law states that force equals mass times acceleration, that is,

𝑚1 𝑥1′′ = 𝑘(𝑥2 − 𝑥 1 ),
𝑚2 𝑥2′′ = −𝑘(𝑥2 − 𝑥 1 ).

Again, we cannot solve for the 𝑥1 or 𝑥2 variables one at a time. Where the first cart goes
depends on exactly where the second cart goes and vice versa.

3.1.3 Changing to first-order systems


To some degree, we need only be able to solve first-order systems. Consider an 𝑛 th -order
differential equation
𝑦 (𝑛) = 𝐹(𝑦 (𝑛−1) , . . . , 𝑦 ′ , 𝑦, 𝑥).
We define new variables 𝑢1 , 𝑢2 , . . . , 𝑢𝑛 and write the system

\[
\begin{aligned}
u_1' &= u_2, \\
u_2' &= u_3, \\
&\;\;\vdots \\
u_{n-1}' &= u_n, \\
u_n' &= F(u_n, u_{n-1}, \ldots, u_2, u_1, x).
\end{aligned}
\]

We solve this system for 𝑢1 , 𝑢2 , . . . , 𝑢𝑛 . Once we have solved for the 𝑢, we can discard 𝑢2
through 𝑢𝑛 and let 𝑦 = 𝑢1 . This 𝑦 solves the original equation.

Example 3.1.4: Take $x''' = 2x'' + 8x' + x + t$. Letting $u_1 = x$, $u_2 = x'$, $u_3 = x''$, we find the system:
\[
u_1' = u_2, \qquad u_2' = u_3, \qquad u_3' = 2u_3 + 8u_2 + u_1 + t.
\]
Note why 𝑥 = 𝑢1 solves the original equation: The first two equations of the system give
that 𝑢3′ = 𝑢2′′ = 𝑢1′′′ = 𝑥 ′′′. If we plug 𝑢3′ = 𝑥 ′′′, 𝑢1 = 𝑥, 𝑢2 = 𝑥 ′, and 𝑢3 = 𝑥 ′′ into the third
equation of the system, we recover the third-order equation we started with.
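The change of variables is mechanical enough to automate. The following sketch uses the hypothetical helper name `to_first_order`: it takes the function 𝐹 of an 𝑛th-order equation and returns the right-hand side of the equivalent first-order system. To test it on Example 3.1.4, we use the solution 𝑥 = 8 − 𝑡, which was worked out by hand for this sketch and is not taken from the text.

```python
def to_first_order(F):
    """Build rhs(x, u) for y^(n) = F(y^(n-1), ..., y', y, x), u = [y, ..., y^(n-1)]."""
    def rhs(x, u):
        # u1' = u2, ..., u_{n-1}' = u_n, and u_n' = F(u_n, ..., u_1, x).
        return list(u[1:]) + [F(*reversed(u), x)]
    return rhs


# Example 3.1.4: x''' = 2x'' + 8x' + x + t, with t as the independent variable.
rhs = to_first_order(lambda x2, x1, x0, t: 2.0 * x2 + 8.0 * x1 + x0 + t)

# For x = 8 - t we have u = [8 - t, -1, 0], whose derivative is [-1, 0, 0].
t = 1.5
print(rhs(t, [8.0 - t, -1.0, 0.0]))
```

The printed list is the derivative of the vector (𝑢1, 𝑢2, 𝑢3), which is the form most numerical solvers for first-order systems expect.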
The same idea works for a system of higher-order differential equations. A system of 𝑘
differential equations in 𝑘 unknowns, all of order 𝑛, can be transformed into a first-order
system of 𝑛 × 𝑘 equations and 𝑛 × 𝑘 unknowns.
Example 3.1.5: Consider the system from the example with carts, Example 3.1.3:
𝑚1 𝑥 1′′ = 𝑘(𝑥2 − 𝑥 1 ), 𝑚2 𝑥2′′ = −𝑘(𝑥2 − 𝑥 1 ).
Let 𝑢1 = 𝑥 1 , 𝑢2 = 𝑥 1′ , 𝑢3 = 𝑥 2 , 𝑢4 = 𝑥 2′ . The second-order system becomes the first-order
system
𝑢1′ = 𝑢2 , 𝑚1 𝑢2′ = 𝑘(𝑢3 − 𝑢1 ), 𝑢3′ = 𝑢4 , 𝑚2 𝑢4′ = −𝑘(𝑢3 − 𝑢1 ).
Example 3.1.6: The idea works in reverse as well. Suppose we want to solve the system
𝑥 ′ = 2𝑦 − 𝑥, 𝑦 ′ = 𝑥,
for the initial conditions 𝑥(0) = 1, 𝑦(0) = 0. Let the independent variable be 𝑡.
If we differentiate the second equation, we get 𝑦 ′′ = 𝑥 ′. We know what 𝑥 ′ is in terms of
𝑥 and 𝑦, and we know that 𝑥 = 𝑦 ′. So,
𝑦 ′′ = 𝑥 ′ = 2𝑦 − 𝑥 = 2𝑦 − 𝑦 ′ .
We now have the equation 𝑦 ′′ + 𝑦 ′ − 2𝑦 = 0. We know how to solve this equation and we
find that 𝑦 = 𝐶1 𝑒 −2𝑡 + 𝐶2 𝑒 𝑡 . Once we have 𝑦, we use the equation 𝑦 ′ = 𝑥 to get 𝑥.
𝑥 = 𝑦 ′ = −2𝐶1 𝑒 −2𝑡 + 𝐶2 𝑒 𝑡 .
We solve for the initial conditions 1 = 𝑥(0) = −2𝐶1 + 𝐶2 and 0 = 𝑦(0) = 𝐶1 + 𝐶2 . Hence,
𝐶1 = −𝐶2 and 1 = 3𝐶2 . So 𝐶1 = −1/3 and 𝐶2 = 1/3. Our solution is
\[
x = \frac{2e^{-2t} + e^{t}}{3}, \qquad y = \frac{-e^{-2t} + e^{t}}{3}.
\]
Exercise 3.1.1: Plug in and check that this really is the solution.
It is useful to go back and forth between systems and higher-order equations for other
reasons. For example, software for solving ODEs numerically (approximation) is generally
for first-order systems. To use it, you take whatever ODE you want to solve and convert
it to a first-order system. It is not very hard to adapt computer code for the Euler or
Runge–Kutta method for first-order equations to handle first-order systems. We simply
treat the dependent variable not as a number but as a vector. In many mathematical
computer languages there is almost no distinction in syntax.
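To illustrate the remark about treating the dependent variable as a vector, here is a minimal sketch of the classical fourth-order Runge–Kutta method in which the state is a list of numbers. The function names and the number of steps are arbitrary choices. It is applied to the system of Example 3.1.6 and compared with the exact solution found there.

```python
import math


def rk4_system(rhs, t0, u0, t_end, steps):
    """Classical fourth-order Runge-Kutta for u' = rhs(t, u), with u a list of floats."""
    h = (t_end - t0) / steps
    t, u = t0, list(u0)
    for _ in range(steps):
        k1 = rhs(t, u)
        k2 = rhs(t + h / 2, [ui + h / 2 * ki for ui, ki in zip(u, k1)])
        k3 = rhs(t + h / 2, [ui + h / 2 * ki for ui, ki in zip(u, k2)])
        k4 = rhs(t + h, [ui + h * ki for ui, ki in zip(u, k3)])
        u = [ui + h / 6 * (a + 2 * b + 2 * c + d)
             for ui, a, b, c, d in zip(u, k1, k2, k3, k4)]
        t += h
    return u


def example_rhs(t, u):
    """Right-hand side of x' = 2y - x, y' = x from Example 3.1.6."""
    x, y = u
    return [2.0 * y - x, x]


x_num, y_num = rk4_system(example_rhs, 0.0, [1.0, 0.0], 2.0, steps=200)
x_exact = (2.0 * math.exp(-4.0) + math.exp(2.0)) / 3.0
y_exact = (-math.exp(-4.0) + math.exp(2.0)) / 3.0
print(x_num, x_exact)
print(y_num, y_exact)
```

The only difference from a scalar implementation is that every update is applied componentwise, which is what the list comprehensions do.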

3.1.4 Autonomous systems and vector fields


A system where the equations do not depend on the independent variable is called an
autonomous system. For example, the system 𝑥 ′ = 2𝑦 − 𝑥, 𝑦 ′ = 𝑥 is autonomous as the
independent variable, say 𝑡, does not appear in the equations.
For autonomous systems, we can draw the so-called direction field or vector field, a plot
similar to a slope field, but instead of giving a slope at each point, we give a direction (and
a magnitude). The previous example, 𝑥 ′ = 2𝑦 − 𝑥, 𝑦 ′ = 𝑥, says that at the point (𝑥, 𝑦) the
direction in which we should travel to satisfy the equations should be the direction of the
vector (2𝑦 − 𝑥, 𝑥) with the speed equal to the magnitude of this vector. So we draw the
vector (2𝑦 − 𝑥, 𝑥) at the point (𝑥, 𝑦) and we do this for many points on the 𝑥𝑦-plane. For
example, at the point (1, 2), we draw the vector (2(2) − 1, 1) = (3, 1), a vector pointing to
the right and a little bit up, while at the point (2, 1) we draw the vector (2(1) − 2, 2) = (0, 2),
a vector that points straight up. When drawing the vectors, we will scale down their
size to fit many of them on the same direction field. If we drew the arrows at the actual
size, the diagram would be a jumbled mess once we draw more than a couple of arrows.
So we scale them all so that not even the longest one interferes with the others. We are
mostly interested in their direction and relative size. See Figure 3.2 on the facing page. The
diagrams we drew in § 1.6 for autonomous equations in one dimension are similar, but
note how much more complicated things become when we allow just one extra dimension.
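The construction of the direction field can be sketched in a few lines. In the code below, the grid, the arrow length, and the function names are assumptions made for illustration. The sketch computes the vector at each grid point and rescales all vectors by a common factor, so that relative sizes are preserved and the longest arrow has a fixed length.

```python
import math


def field(x, y):
    """Vector attached to the point (x, y) for x' = 2y - x, y' = x."""
    return (2 * y - x, x)


def scaled_field(x, y, longest, arrow_length=0.2):
    """Shrink the vector so the longest arrow in the plot has length arrow_length."""
    vx, vy = field(x, y)
    return (vx * arrow_length / longest, vy * arrow_length / longest)


# Integer grid on [-1, 3] x [-1, 3]; the grid spacing is an arbitrary choice.
grid = [(x, y) for x in range(-1, 4) for y in range(-1, 4)]
longest = max(math.hypot(*field(x, y)) for x, y in grid)

print(field(1, 2), field(2, 1))  # the two vectors computed in the text
print(scaled_field(1, 2, longest))
```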
We can draw a path of the solution in the plane. Suppose the solution is given by
𝑥 = 𝑓 (𝑡), 𝑦 = 𝑔(𝑡). We pick an interval of 𝑡 (say 0 ≤ 𝑡 ≤ 2 for our example) and plot all
the points (𝑓 (𝑡), 𝑔(𝑡)) for 𝑡 in the selected range. The resulting picture is called the phase
portrait (or phase plane portrait). The particular curve obtained is called the trajectory or
solution curve. See an example plot in Figure 3.3 on the next page. In the figure the solution
starts at (1, 0) and travels along the vector field for a distance of 2 units of 𝑡. We solved this
system precisely, so we compute 𝑥(2) and 𝑦(2) to find 𝑥(2) ≈ 2.475 and 𝑦(2) ≈ 2.457. This
point corresponds to the top right end of the plotted solution curve in the figure.
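The endpoint quoted above can be checked numerically. The sketch below does not use the exact solution; it swaps in the classical fourth-order Runge-Kutta method, with a step count chosen here arbitrarily, and follows the trajectory from (1, 0) for 0 ≤ 𝑡 ≤ 2. The function names are illustrative.

```python
def field(x, y):
    """Right-hand side of x' = 2y - x, y' = x."""
    return (2 * y - x, x)

def rk4_trajectory(x0, y0, t_end, steps):
    """Follow the vector field from (x0, y0) using classical RK4 steps."""
    h = t_end / steps
    x, y = x0, y0
    path = [(x, y)]
    for _ in range(steps):
        k1x, k1y = field(x, y)
        k2x, k2y = field(x + h / 2 * k1x, y + h / 2 * k1y)
        k3x, k3y = field(x + h / 2 * k2x, y + h / 2 * k2y)
        k4x, k4y = field(x + h * k3x, y + h * k3y)
        x += h / 6 * (k1x + 2 * k2x + 2 * k3x + k4x)
        y += h / 6 * (k1y + 2 * k2y + 2 * k3y + k4y)
        path.append((x, y))
    return path

path = rk4_trajectory(1.0, 0.0, t_end=2.0, steps=200)
x2, y2 = path[-1]
print(round(x2, 3), round(y2, 3))  # 2.475 2.457
```

Plotting the points in `path` would give the trajectory; the last point is the top right end of the curve.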
We can draw phase portraits and trajectories in the 𝑥𝑦-plane even if the system is not
autonomous. In this case, however, we cannot draw the direction field, since the field
changes as 𝑡 changes. For each 𝑡 we would get a different direction field.

3.1.5 Picard’s theorem


Before going further, we mention that Picard’s theorem on existence and uniqueness also
holds for systems of ODEs. Let us restate this theorem in this setting. A general first-order
system is of the form
\[
\begin{aligned}
x_1' &= F_1(x_1, x_2, \ldots, x_n, t), \\
x_2' &= F_2(x_1, x_2, \ldots, x_n, t), \\
&\;\;\vdots \\
x_n' &= F_n(x_1, x_2, \ldots, x_n, t).
\end{aligned}
\tag{3.1}
\]

Figure 3.2: The direction field for 𝑥′ = 2𝑦 − 𝑥, 𝑦′ = 𝑥.

Figure 3.3: The direction field for 𝑥′ = 2𝑦 − 𝑥, 𝑦′ = 𝑥 with the trajectory of the solution starting at (1, 0) for 0 ≤ 𝑡 ≤ 2.

Theorem 3.1.1 (Picard’s theorem on existence and uniqueness for systems). If for every
$j = 1, 2, \ldots, n$ and every $k = 1, 2, \ldots, n$ each $F_j$ is continuous and the derivative
$\frac{\partial F_j}{\partial x_k}$ exists and is continuous near some $(x_1^0, x_2^0, \ldots, x_n^0, t^0)$, then a solution
to (3.1) subject to the initial condition $x_1(t^0) = x_1^0$, $x_2(t^0) = x_2^0$, $\ldots$, $x_n(t^0) = x_n^0$
exists (at least for $t$ in some small interval) and is unique.

That is, a unique solution exists for any initial condition given that the system is
reasonable (each 𝐹 𝑗 and its partial derivatives in the 𝑥 variables are continuous). As for
single equations, we may not have a solution for all time 𝑡, but it is guaranteed at least for
some short period of time.
As we can change any 𝑛th-order ODE into a first-order system, this theorem also
provides the existence and uniqueness of solutions for higher-order equations.

3.1.6 Exercises
Exercise 3.1.2: Find the general solution of 𝑥1′ = 𝑥2 − 𝑥 1 + 𝑡, 𝑥 2′ = 𝑥2 .

Exercise 3.1.3: Find the general solution of 𝑥1′ = 3𝑥1 − 𝑥 2 + 𝑒 𝑡 , 𝑥2′ = 𝑥1 .

Exercise 3.1.4: Write 𝑎𝑦 ′′ + 𝑏𝑦 ′ + 𝑐𝑦 = 𝑓 (𝑥) as a first-order system of ODEs.

Exercise 3.1.5: Write 𝑥′′ + 𝑦²𝑦′ − 𝑥³ = sin(𝑡), 𝑦′′ + (𝑥′ + 𝑦′)² − 𝑥 = 0 as a first-order system of ODEs.

Exercise 3.1.6: Suppose two masses on carts on a frictionless surface are at displacements 𝑥1 and 𝑥2
as in Example 3.1.3 on page 122. Suppose that a rocket applies force 𝐹 in the positive direction on
cart 𝑥1 . Set up the system of equations.

Exercise 3.1.7: Suppose the tanks are as in Example 3.1.2 on page 121, starting both at volume 𝑉,
but now the rate of flow from tank 1 to tank 2 is 𝑟1 , and the rate of flow from tank 2 to tank 1 is 𝑟2 .
Notice that the volumes are now not constant. Set up the system of equations.

Exercise 3.1.101: Find the general solution to 𝑦1′ = 3𝑦1 , 𝑦2′ = 𝑦1 + 𝑦2 , 𝑦3′ = 𝑦1 + 𝑦3 .

Exercise 3.1.102: Solve 𝑦 ′ = 2𝑥, 𝑥 ′ = 𝑥 + 𝑦, 𝑥(0) = 1, 𝑦(0) = 3.

Exercise 3.1.103: Write 𝑥 ′′′ = 𝑥 + 𝑡 as a first-order system.

Exercise 3.1.104: Write 𝑦1′′ + 𝑦1 + 𝑦2 = 𝑡, 𝑦2′′ + 𝑦1 − 𝑦2 = 𝑡² as a first-order system.

Exercise 3.1.105: Suppose two masses on carts on a frictionless surface are at displacements 𝑥1 and
𝑥2 as in Example 3.1.3 on page 122. Suppose the initial displacement is 𝑥1 (0) = 𝑥2 (0) = 0, and the initial
velocity is 𝑥1′ (0) = 𝑥2′ (0) = 𝑎 for some number 𝑎. Use your intuition to solve the system, and explain
your reasoning.

Exercise 3.1.106: Suppose the tanks are as in Example 3.1.2 on page 121 except that clean water
flows in at a rate of 𝑠 liters per second into tank 1, and brine flows out of tank 2 and into the sewer
also at a rate of 𝑠 liters per second. The rate of flow from tank 1 into tank 2 is still 𝑟, but the rate of
flow from tank 2 back into tank 1 is 𝑟 − 𝑠 (assume 𝑟 > 𝑠).

a) Draw the picture.


b) Set up the system of equations.
c) Intuitively, what happens as 𝑡 goes to infinity? Explain.

3.2 Matrices and linear systems


Note: 1.5 lectures, first part of §5.1 in [EP], §7.2 and §7.3 in [BD], see also appendix A

3.2.1 Matrices and vectors


Before we start talking about linear systems of ODEs, we need to talk about matrices, so let
us review these briefly. A matrix is an 𝑚 × 𝑛 array of numbers (𝑚 rows and 𝑛 columns).
For example, we denote a 3 × 5 matrix as follows
\[
A = \begin{bmatrix}
a_{11} & a_{12} & a_{13} & a_{14} & a_{15} \\
a_{21} & a_{22} & a_{23} & a_{24} & a_{25} \\
a_{31} & a_{32} & a_{33} & a_{34} & a_{35}
\end{bmatrix} .
\]
The numbers 𝑎 𝑖𝑗 are called elements or entries. We say a matrix is square if it has 𝑚 = 𝑛, that
is, it has the same number of rows and columns.
In the setting of matrices, by a vector, we usually mean a column vector, that is, an 𝑚 × 1
matrix. If we mean a row vector, we will explicitly say so (a row vector is a 1 × 𝑛 matrix).
We usually denote matrices by upper case letters and vectors by lower case letters with an
arrow such as $\vec{x}$ or $\vec{b}$. We write $\vec{0}$ for a vector of all zeros.
We define some operations on matrices. We want 1 × 1 matrices to really act like
numbers, so our operations have to be compatible with this viewpoint.
First, we can multiply a matrix by a scalar (a number). We simply multiply each entry
in the matrix by the scalar. For example,
\[
2 \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}
= \begin{bmatrix} 2 & 4 & 6 \\ 8 & 10 & 12 \end{bmatrix} .
\]

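Matrix addition is also easy. We add matrices element by element. For example,
\[
\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}
+ \begin{bmatrix} 1 & 1 & -1 \\ 0 & 2 & 4 \end{bmatrix}
= \begin{bmatrix} 2 & 3 & 2 \\ 4 & 7 & 10 \end{bmatrix} .
\]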

If the sizes do not match, then addition is not defined.
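As a minimal sketch of these two operations, matrices can be stored as lists of rows. The helper names `scale` and `add` are chosen here for illustration only.

```python
def scale(c, A):
    """Multiply every entry of the matrix A by the scalar c."""
    return [[c * a for a in row] for row in A]

def add(A, B):
    """Add two matrices entry by entry; the sizes must match."""
    if len(A) != len(B) or any(len(r) != len(s) for r, s in zip(A, B)):
        raise ValueError("matrix sizes do not match, addition is not defined")
    return [[a + b for a, b in zip(r, s)] for r, s in zip(A, B)]

A = [[1, 2, 3], [4, 5, 6]]
B = [[1, 1, -1], [0, 2, 4]]
print(scale(2, A))  # [[2, 4, 6], [8, 10, 12]]
print(add(A, B))    # [[2, 3, 2], [4, 7, 10]]
```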


If we denote by 0 the matrix with all zero entries, by 𝑐, 𝑑 scalars, and by 𝐴, 𝐵, 𝐶 matrices,
we have the following familiar rules:

𝐴 + 0 = 𝐴 = 0 + 𝐴,
𝐴 + 𝐵 = 𝐵 + 𝐴,
(𝐴 + 𝐵) + 𝐶 = 𝐴 + (𝐵 + 𝐶),
𝑐(𝐴 + 𝐵) = 𝑐𝐴 + 𝑐𝐵,
(𝑐 + 𝑑)𝐴 = 𝑐𝐴 + 𝑑𝐴.

Another useful operation for matrices is the so-called transpose. This operation just
swaps rows and columns of a matrix. The transpose of $A$ is denoted by $A^T$. Example:
\[
\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}^T
= \begin{bmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{bmatrix} .
\]

3.2.2 Matrix multiplication


Matrix multiplication is a bit more complicated. First we define the so-called dot product (or
inner product) of two vectors. Usually this will be a row vector multiplied with a column
vector of the same size. For the dot product, we multiply each pair of entries from the first
and the second vector, and we sum these products. The result is a single number. For
example,
\[
\begin{bmatrix} a_1 & a_2 & a_3 \end{bmatrix} \cdot
\begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix}
= a_1 b_1 + a_2 b_2 + a_3 b_3 .
\]
Similarly for larger (or smaller) vectors.
Armed with the dot product, we define the product of matrices. We denote by $\operatorname{row}_i(A)$
the $i^{\text{th}}$ row of $A$ and by $\operatorname{column}_j(A)$ the $j^{\text{th}}$ column of $A$. For an $m \times n$ matrix $A$ and an
$n \times p$ matrix $B$, we can define the product $AB$. We let $AB$ be an $m \times p$ matrix whose $ij^{\text{th}}$
entry is the dot product
\[
\operatorname{row}_i(A) \cdot \operatorname{column}_j(B) .
\]

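Do note how the sizes match up: $m \times n$ multiplied by $n \times p$ is $m \times p$. Example:
\[
\begin{aligned}
\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}
\begin{bmatrix} 1 & 0 & -1 \\ 1 & 1 & 1 \\ 1 & 0 & 0 \end{bmatrix}
&= \begin{bmatrix}
1 \cdot 1 + 2 \cdot 1 + 3 \cdot 1 & 1 \cdot 0 + 2 \cdot 1 + 3 \cdot 0 & 1 \cdot (-1) + 2 \cdot 1 + 3 \cdot 0 \\
4 \cdot 1 + 5 \cdot 1 + 6 \cdot 1 & 4 \cdot 0 + 5 \cdot 1 + 6 \cdot 0 & 4 \cdot (-1) + 5 \cdot 1 + 6 \cdot 0
\end{bmatrix} \\
&= \begin{bmatrix} 6 & 2 & 1 \\ 15 & 5 & 1 \end{bmatrix}
\end{aligned}
\]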
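The row-times-column rule translates directly into code. The following is a small sketch with matrices stored as lists of rows; the name `matmul` is an illustrative choice.

```python
def matmul(A, B):
    """Product of an m x n matrix A and an n x p matrix B as an m x p matrix."""
    n = len(B)
    if any(len(row) != n for row in A):
        raise ValueError("number of columns of A must equal number of rows of B")
    p = len(B[0])
    # Entry (i, j) is the dot product of row i of A with column j of B.
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
            for i in range(len(A))]

A = [[1, 2, 3], [4, 5, 6]]
B = [[1, 0, -1], [1, 1, 1], [1, 0, 0]]
print(matmul(A, B))  # [[6, 2, 1], [15, 5, 1]]
```

Note that `matmul(B, A)` raises an error here, since a 3 × 3 matrix cannot be multiplied by a 2 × 3 matrix.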

For multiplication, we want an analogue of a 1. This analogue is the so-called identity
matrix. The identity matrix is a square matrix with 1s on the diagonal and zeros everywhere
else. It is usually denoted by $I$. For each size, we have a different identity matrix. So
sometimes we may denote the size as a subscript. For example, $I_3$ would be the 3 × 3
identity matrix
\[
I = I_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} .
\]

We have the following rules for matrix multiplication. Suppose that 𝐴, 𝐵, 𝐶 are matrices
of the correct sizes so that the following make sense. Let 𝛼 denote a scalar (number). Then,
𝐴(𝐵𝐶) = (𝐴𝐵)𝐶,
𝐴(𝐵 + 𝐶) = 𝐴𝐵 + 𝐴𝐶,
(𝐵 + 𝐶)𝐴 = 𝐵𝐴 + 𝐶𝐴,
𝛼(𝐴𝐵) = (𝛼𝐴)𝐵 = 𝐴(𝛼𝐵),
𝐼𝐴 = 𝐴 = 𝐴𝐼.
A few warnings are in order.
(i) $AB \neq BA$ in general (it may be true by fluke sometimes). That is, matrices do not
commute. For example, take $A = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix}$.

(ii) $AB = AC$ does not necessarily imply $B = C$, even if $A$ is not 0.

(iii) $AB = 0$ does not necessarily mean that $A = 0$ or $B = 0$. Try, for example, $A = B = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}$.
For the last two items to hold, we would need to “divide” by a matrix. This is where
the matrix inverse comes in. Suppose that 𝐴 and 𝐵 are 𝑛 × 𝑛 matrices such that
𝐴𝐵 = 𝐼 = 𝐵𝐴.
Then we call $B$ the inverse of $A$ and we denote $B$ by $A^{-1}$. If the inverse of $A$ exists, then we
call $A$ invertible. If $A$ is not invertible, we sometimes say $A$ is singular.
If $A$ is invertible, then $AB = AC$ does imply that $B = C$ (in particular, the inverse of
$A$ is unique). We just multiply both sides by $A^{-1}$ (on the left) to get $A^{-1}AB = A^{-1}AC$ or
$IB = IC$ or $B = C$. It is also not hard to see that $(A^{-1})^{-1} = A$.

3.2.3 The determinant


For square matrices we define a useful quantity called the determinant. We define the
determinant of a 1 × 1 matrix as the value of its only entry. For a 2 × 2 matrix we define
\[
\det \left( \begin{bmatrix} a & b \\ c & d \end{bmatrix} \right) \overset{\text{def}}{=} ad - bc .
\]
Before trying to define the determinant for larger matrices, let us note the meaning of
the determinant. Consider an 𝑛 × 𝑛 matrix as a mapping of the 𝑛-dimensional euclidean
space $\mathbb{R}^n$ to itself, where $\vec{x}$ gets sent to $A\vec{x}$. In particular, a 2 × 2 matrix 𝐴 is a mapping
of the plane to itself. The determinant of 𝐴 is the factor by which the area of objects
changes. If we take the unit square (square of side 1) in the plane, then 𝐴 takes the square
to a parallelogram of area |det(𝐴)|. The sign of det(𝐴) denotes changing of orientation
(negative if the axes get flipped). For example, let
\[
A = \begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix} .
\]

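Then $\det(A) = 1 + 1 = 2$. Let us see where the (unit) square with vertices $(0, 0)$, $(1, 0)$, $(0, 1)$,
and $(1, 1)$ gets sent. Clearly $(0, 0)$ gets sent to $(0, 0)$.
\[
\begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ -1 \end{bmatrix} , \qquad
\begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix} , \qquad
\begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 2 \\ 0 \end{bmatrix} .
\]
The image of the square is another square with vertices $(0, 0)$, $(1, -1)$, $(1, 1)$, and $(2, 0)$. The
image square has a side of length $\sqrt{2}$ and is therefore of area 2.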
If you think back to high school geometry, you may have seen a formula for computing
the area of a parallelogram with vertices $(0, 0)$, $(a, c)$, $(b, d)$ and $(a + b, c + d)$. And it is
precisely
\[
\left| \det \left( \begin{bmatrix} a & b \\ c & d \end{bmatrix} \right) \right| .
\]
The vertical lines above mean absolute value. The matrix $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$ carries the unit square to
the given parallelogram.
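The area interpretation can be checked on the example matrix. The sketch below maps the corners of the unit square and measures the image with the shoelace formula, a standard polygon-area formula that is brought in here only for the check; the helper names are illustrative.

```python
def apply(A, p):
    """Image of the point p = (x, y) under the 2 x 2 matrix A."""
    (a, b), (c, d) = A
    x, y = p
    return (a * x + b * y, c * x + d * y)

def polygon_area(pts):
    """Shoelace formula; pts must be listed in order around the boundary."""
    s = 0
    for (x1, y1), (x2, y2) in zip(pts, pts[1:] + pts[:1]):
        s += x1 * y2 - x2 * y1
    return abs(s) / 2

A = [[1, 1], [-1, 1]]
square = [(0, 0), (1, 0), (1, 1), (0, 1)]   # corners in order around the square
image = [apply(A, p) for p in square]
print(image)                # [(0, 0), (1, -1), (2, 0), (1, 1)]
print(polygon_area(image))  # 2.0
```

The area of the image agrees with the absolute value of the determinant, which is 2 for this matrix.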
Let us look at the determinant for larger matrices. We define $A_{ij}$ as the matrix $A$ with
the $i^{\text{th}}$ row and the $j^{\text{th}}$ column deleted. To compute the determinant of a matrix, pick one
row, say the $i^{\text{th}}$ row and compute:
\[
\det(A) = \sum_{j=1}^{n} (-1)^{i+j} a_{ij} \det(A_{ij}) .
\]

For the first row, we get
\[
\det(A) = a_{11} \det(A_{11}) - a_{12} \det(A_{12}) + a_{13} \det(A_{13}) - \cdots
\begin{cases}
+ a_{1n} \det(A_{1n}) & \text{if $n$ is odd,} \\
- a_{1n} \det(A_{1n}) & \text{if $n$ is even.}
\end{cases}
\]

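We alternately add and subtract the determinants of the submatrices $A_{ij}$ multiplied by
$a_{ij}$ for a fixed $i$ and all $j$. For a 3 × 3 matrix, picking the first row, we get $\det(A) =
a_{11} \det(A_{11}) - a_{12} \det(A_{12}) + a_{13} \det(A_{13})$. For example,
\[
\begin{aligned}
\det \left( \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix} \right)
&= 1 \cdot \det \left( \begin{bmatrix} 5 & 6 \\ 8 & 9 \end{bmatrix} \right)
 - 2 \cdot \det \left( \begin{bmatrix} 4 & 6 \\ 7 & 9 \end{bmatrix} \right)
 + 3 \cdot \det \left( \begin{bmatrix} 4 & 5 \\ 7 & 8 \end{bmatrix} \right) \\
&= 1(5 \cdot 9 - 6 \cdot 8) - 2(4 \cdot 9 - 6 \cdot 7) + 3(4 \cdot 8 - 5 \cdot 7) = 0 .
\end{aligned}
\]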

The numbers $(-1)^{i+j} \det(A_{ij})$ are called cofactors of the matrix and this way of computing
the determinant is called the cofactor expansion. No matter which row you pick, you always
get the same number. It is also possible to compute the determinant by expanding along
columns (picking a column instead of a row above). It is true that $\det(A) = \det(A^T)$.
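Cofactor expansion along the first row is naturally recursive. The sketch below is a direct, unoptimized transcription; the names `minor` and `det` are illustrative, and the approach is far too slow for large matrices.

```python
def minor(A, i, j):
    """The matrix A with row i and column j deleted (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Determinant by cofactor expansion along the first row."""
    n = len(A)
    if n == 1:
        return A[0][0]
    # With 0-based j, the sign (-1)**j matches (-1)^(1 + j) for 1-based j.
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(n))

def transpose(A):
    return [list(col) for col in zip(*A)]

M = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(det(M))                      # 0
print(det([[1, 1], [-1, 1]]))      # 2
print(det(M) == det(transpose(M)))  # True
```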
A common notation for the determinant is a pair of vertical lines:
\[
\begin{vmatrix} a & b \\ c & d \end{vmatrix}
= \det \left( \begin{bmatrix} a & b \\ c & d \end{bmatrix} \right) .
\]

I personally find this notation confusing as vertical lines usually mean a positive quantity,
while determinants can be negative. Also think about how to write the absolute value of a
determinant. I will not use this notation in this book.
Think of the determinant as telling you the scaling of a mapping. If 𝐵 doubles the sizes
of geometric objects and 𝐴 triples them, then 𝐴𝐵 (which applies 𝐵 to an object and then 𝐴)
should make size go up by a factor of 6 = 3 · 2. This is true in general:
det(𝐴𝐵) = det(𝐴) det(𝐵).
This property is one of the most useful, and it can be employed to actually compute
determinants. A particularly interesting consequence is to note what it means for existence
of inverses. Take 𝐴 and 𝐵 to be inverses of each other, that is, 𝐴𝐵 = 𝐼. Then
det(𝐴) det(𝐵) = det(𝐴𝐵) = det(𝐼) = 1.
Neither det(𝐴) nor det(𝐵) can be zero. Conversely, if det(𝐴) is not zero, then 𝐴 is invertible.
We state this observation as a theorem as it is very important in the context of this course.
Theorem 3.2.1. An $n \times n$ matrix $A$ is invertible if and only if $\det(A) \neq 0$.
In fact, $\det(A^{-1}) \det(A) = 1$ says that $\det(A^{-1}) = \frac{1}{\det(A)}$. So we even know what the
determinant of $A^{-1}$ is before we know how to compute $A^{-1}$.

There is a simple formula for the inverse of a 2 × 2 matrix
\[
\begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1}
= \frac{1}{ad - bc} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix} .
\]
Notice the determinant of the matrix $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$ in the denominator of the fraction. The formula
only works if the determinant is nonzero, otherwise we are dividing by zero.
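A short sketch of the 2 × 2 formula follows, using exact fractions from the standard library so that no rounding occurs. The function name `inverse_2x2` and the choice to raise an error for a zero determinant are assumptions of this illustration.

```python
from fractions import Fraction

def inverse_2x2(M):
    """Inverse of [[a, b], [c, d]] via the formula (1/(ad - bc)) [[d, -b], [-c, a]]."""
    (a, b), (c, d) = M
    determinant = a * d - b * c
    if determinant == 0:
        raise ValueError("determinant is zero, the matrix is not invertible")
    f = Fraction(1) / determinant
    return [[d * f, -b * f], [-c * f, a * f]]

A = [[1, 1], [-1, 1]]       # determinant 2, as computed earlier
A_inv = inverse_2x2(A)
print(A_inv == [[Fraction(1, 2), Fraction(-1, 2)],
                [Fraction(1, 2), Fraction(1, 2)]])  # True
```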

3.2.4 Solving linear systems


One application of matrices we will need is to solve systems of linear equations. This is
best shown by example. Suppose that we have the following system of linear equations
\[
\begin{aligned}
2x_1 + 2x_2 + 2x_3 &= 2, \\
x_1 + x_2 + 3x_3 &= 5, \\
x_1 + 4x_2 + x_3 &= 10.
\end{aligned}
\]
Without changing the solution, we could swap equations in this system, we could
multiply any of the equations by a nonzero number, and we could add a multiple of one
equation to another equation. It turns out these operations always suffice to find a solution.
It is easier to write the system as a matrix equation. The system above can be written as
\[
\begin{bmatrix} 2 & 2 & 2 \\ 1 & 1 & 3 \\ 1 & 4 & 1 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}
= \begin{bmatrix} 2 \\ 5 \\ 10 \end{bmatrix} .
\]

To solve the system, we put the coefficient matrix (the matrix on the left-hand side of the
equation) together with the vector on the right-hand side and get the so-called augmented
matrix:
\[
\left[ \begin{array}{ccc|c} 2 & 2 & 2 & 2 \\ 1 & 1 & 3 & 5 \\ 1 & 4 & 1 & 10 \end{array} \right] .
\]
We apply a sequence of the following three elementary row operations.
(i) Swap two rows.

(ii) Multiply a row by a nonzero number.

(iii) Add a multiple of one row to another row.


We keep doing these operations until we get into a state where it is easy to read off the
answer, or until we get into a contradiction indicating no solution, for example if we come
up with an equation such as 0 = 1.
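Let us work through the example. First multiply the first row by 1/2 to obtain
\[
\left[ \begin{array}{ccc|c} 1 & 1 & 1 & 1 \\ 1 & 1 & 3 & 5 \\ 1 & 4 & 1 & 10 \end{array} \right] .
\]
Now subtract the first row from the second and third row:
\[
\left[ \begin{array}{ccc|c} 1 & 1 & 1 & 1 \\ 0 & 0 & 2 & 4 \\ 0 & 3 & 0 & 9 \end{array} \right]
\]
Multiply the last row by 1/3 and the second row by 1/2:
\[
\left[ \begin{array}{ccc|c} 1 & 1 & 1 & 1 \\ 0 & 0 & 1 & 2 \\ 0 & 1 & 0 & 3 \end{array} \right]
\]
Swap rows 2 and 3:
\[
\left[ \begin{array}{ccc|c} 1 & 1 & 1 & 1 \\ 0 & 1 & 0 & 3 \\ 0 & 0 & 1 & 2 \end{array} \right]
\]
Subtract the last row from the first, then subtract the second row from the first:
\[
\left[ \begin{array}{ccc|c} 1 & 0 & 0 & -4 \\ 0 & 1 & 0 & 3 \\ 0 & 0 & 1 & 2 \end{array} \right]
\]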
Finally, we think about what equations this augmented matrix represents: 𝑥 1 = −4, 𝑥 2 = 3,
and 𝑥 3 = 2. We try this solution in the original system and, voilà, it works!
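The same elimination can be automated. The sketch below reduces an augmented matrix using only the three elementary row operations, with exact fractions; it applies them in a fixed column-by-column order, which differs from the order chosen by hand above but arrives at the same final matrix. The name `rref` is an illustrative choice.

```python
from fractions import Fraction

def rref(M):
    """Reduced row echelon form of an augmented matrix (last column = right-hand side)."""
    M = [[Fraction(x) for x in row] for row in M]
    rows, cols = len(M), len(M[0])
    r = 0                                   # row where the next leading entry goes
    for c in range(cols - 1):
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue                        # no leading entry in this column
        M[r], M[pivot] = M[pivot], M[r]     # (i) swap two rows
        lead = M[r][c]
        M[r] = [x / lead for x in M[r]]     # (ii) multiply a row by a nonzero number
        for i in range(rows):
            if i != r and M[i][c] != 0:     # (iii) add a multiple of one row to another
                factor = M[i][c]
                M[i] = [x - factor * y for x, y in zip(M[i], M[r])]
        r += 1
        if r == rows:
            break
    return M

augmented = [[2, 2, 2, 2], [1, 1, 3, 5], [1, 4, 1, 10]]
reduced = rref(augmented)
print([int(row[-1]) for row in reduced])  # [-4, 3, 2]
```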

Exercise 3.2.1: Check that the solution above really solves the given equations.
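We write this equation in matrix notation as
\[
A \vec{x} = \vec{b} ,
\]
where $A$ is the matrix $\begin{bmatrix} 2 & 2 & 2 \\ 1 & 1 & 3 \\ 1 & 4 & 1 \end{bmatrix}$ and $\vec{b}$ is the vector $\begin{bmatrix} 2 \\ 5 \\ 10 \end{bmatrix}$. The solution can also be computed
via the inverse,
\[
\vec{x} = A^{-1} A \vec{x} = A^{-1} \vec{b} .
\]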

It is possible that the solution is not unique, or that no solution exists. It is easy to tell if
a solution does not exist. If during the row reduction you come up with a row where all the
entries except the last one are zero (the last entry in a row corresponds to the right-hand
side of the equation), then the system is inconsistent and has no solution. For example, for
a system of 3 equations and 3 unknowns, if you find a row such as [ 0 0 0 | 1 ] in the
augmented matrix, you know the system is inconsistent. That row corresponds to 0 = 1.
You generally try to use row operations until the following conditions are satisfied. The
first (from the left) nonzero entry in each row is called the leading entry.
(i) The leading entry in any row is strictly to the right of the leading entry of the row
above.

(ii) Any zero rows are below all the nonzero rows.

(iii) All leading entries are 1.

(iv) All the entries above and below a leading entry are zero.
Such a matrix is said to be in reduced row echelon form. The variables corresponding to
columns with no leading entries are said to be free variables. Free variables mean that we can
pick those variables to be anything we want and then solve for the rest of the unknowns.
Example 3.2.1: The following augmented matrix is in reduced row echelon form.
\[
\left[ \begin{array}{ccc|c} 1 & 2 & 0 & 3 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 \end{array} \right]
\]
Suppose the variables are $x_1$, $x_2$, and $x_3$. Then $x_2$ is the free variable, $x_1 = 3 - 2x_2$, and
$x_3 = 1$.
On the other hand, if during the row reduction process you come up with the matrix
\[
\left[ \begin{array}{ccc|c} 1 & 2 & 13 & 3 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 3 \end{array} \right] ,
\]
there is no need to go further. The last row corresponds to the equation $0x_1 + 0x_2 + 0x_3 = 3$,
which is preposterous. Hence, no solution exists.

3.2.5 Computing the inverse


If the matrix $A$ is square and there exists a unique solution $\vec{x}$ to $A \vec{x} = \vec{b}$ for any $\vec{b}$ (there are
no free variables), then $A$ is invertible. Multiplying both sides by $A^{-1}$, you can see that
$\vec{x} = A^{-1} \vec{b}$. So it is useful to compute the inverse if you want to solve the equation for many
different right-hand sides $\vec{b}$.
We have a formula for the 2 × 2 inverse, but it is also not hard to compute inverses of
larger matrices. While we will not have too much occasion to compute inverses of matrices
larger than 2 × 2 by hand, let us touch on how to do it. Finding the inverse of $A$ is
actually just solving a bunch of linear equations. If we can solve $A \vec{x}_k = \vec{e}_k$ where $\vec{e}_k$ is the
vector with all zeros except a 1 at the $k^{\text{th}}$ position, then the inverse is the matrix with the
columns $\vec{x}_k$ for $k = 1, 2, \ldots, n$ (exercise: why?). Therefore, to find the inverse we write a
larger $n \times 2n$ augmented matrix $[\, A \mid I \,]$, where $I$ is the identity matrix. We then perform
row reduction. The reduced row echelon form of $[\, A \mid I \,]$ will be of the form $[\, I \mid A^{-1} \,]$ if
and only if $A$ is invertible. We then just read off the inverse $A^{-1}$.
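The procedure can be sketched as follows, again with exact fractions. The function name `inverse` is an illustrative choice, and the test matrix is the coefficient matrix of the linear system solved in § 3.2.4, so the product with the right-hand side should reproduce that solution.

```python
from fractions import Fraction

def inverse(A):
    """Row reduce [A | I] to [I | A^{-1}] and return the right half."""
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for c in range(n):
        pivot = next((i for i in range(c, n) if M[i][c] != 0), None)
        if pivot is None:
            raise ValueError("matrix is not invertible")
        M[c], M[pivot] = M[pivot], M[c]          # swap rows
        lead = M[c][c]
        M[c] = [x / lead for x in M[c]]          # make the leading entry 1
        for i in range(n):
            if i != c:                           # clear the rest of column c
                factor = M[i][c]
                M[i] = [x - factor * y for x, y in zip(M[i], M[c])]
    return [row[n:] for row in M]

A = [[2, 2, 2], [1, 1, 3], [1, 4, 1]]
b = [2, 5, 10]
A_inv = inverse(A)
x = [sum(a * bi for a, bi in zip(row, b)) for row in A_inv]
print([int(v) for v in x])  # [-4, 3, 2]
```

Once `A_inv` is known, solving for a different right-hand side only costs one matrix-vector product.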

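3.2.6 Exercises
Exercise 3.2.2: Solve $\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \vec{x} = \begin{bmatrix} 5 \\ 6 \end{bmatrix}$ by using matrix inverse.

Exercise 3.2.3: Compute determinant of $\begin{bmatrix} 9 & -2 & -6 \\ -8 & 3 & 6 \\ 10 & -2 & -6 \end{bmatrix}$.

Exercise 3.2.4: Compute determinant of $\begin{bmatrix} 1 & 2 & 3 & 1 \\ 4 & 0 & 5 & 0 \\ 6 & 0 & 7 & 0 \\ 8 & 0 & 10 & 1 \end{bmatrix}$. Hint: Expand along the proper row or column
to make the calculations simpler.

Exercise 3.2.5: Compute inverse of $\begin{bmatrix} 1 & 2 & 3 \\ 1 & 1 & 1 \\ 0 & 1 & 0 \end{bmatrix}$.

Exercise 3.2.6: For which $h$ is $\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & h \end{bmatrix}$ not invertible? Is there only one such $h$? Are there several?
Infinitely many?

Exercise 3.2.7: For which $h$ is $\begin{bmatrix} h & 1 & 1 \\ 0 & h & 0 \\ 1 & 1 & h \end{bmatrix}$ not invertible? Find all such $h$.

Exercise 3.2.8: Solve $\begin{bmatrix} 9 & -2 & -6 \\ -8 & 3 & 6 \\ 10 & -2 & -6 \end{bmatrix} \vec{x} = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}$.

Exercise 3.2.9: Solve $\begin{bmatrix} 5 & 3 & 7 \\ 8 & 4 & 4 \\ 6 & 3 & 3 \end{bmatrix} \vec{x} = \begin{bmatrix} 2 \\ 0 \\ 0 \end{bmatrix}$.

Exercise 3.2.10: Solve $\begin{bmatrix} 3 & 2 & 3 & 0 \\ 3 & 3 & 3 & 3 \\ 0 & 2 & 4 & 2 \\ 2 & 3 & 4 & 3 \end{bmatrix} \vec{x} = \begin{bmatrix} 2 \\ 0 \\ 4 \\ 1 \end{bmatrix}$.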

Exercise 3.2.11: Find 3 nonzero 2 × 2 matrices 𝐴, 𝐵, and 𝐶 such that 𝐴𝐵 = 𝐴𝐶 but 𝐵 ≠ 𝐶.


Exercise 3.2.101: Compute determinant of $\begin{bmatrix} 1 & 1 & 1 \\ 2 & 3 & -5 \\ 1 & -1 & 0 \end{bmatrix}$.

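Exercise 3.2.102: Find $t$ such that $\begin{bmatrix} 1 & t \\ -1 & 2 \end{bmatrix}$ is not invertible.

Exercise 3.2.103: Solve $\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \vec{x} = \begin{bmatrix} 10 \\ 20 \end{bmatrix}$.

Exercise 3.2.104: Suppose $a$, $b$, $c$ are nonzero numbers. Let $M = \begin{bmatrix} a & 0 \\ 0 & b \end{bmatrix}$, $N = \begin{bmatrix} a & 0 & 0 \\ 0 & b & 0 \\ 0 & 0 & c \end{bmatrix}$.

a) Compute $M^{-1}$. b) Compute $N^{-1}$.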

3.3 Linear systems of ODEs


Note: less than 1 lecture, second part of §5.1 in [EP], §7.4 in [BD]
First let us talk about matrix- or vector-valued functions. Such a function is just a matrix
or a vector whose entries depend on some variable. If $t$ is the independent variable, we
write a vector-valued function $\vec{x}(t)$ as
\[
\vec{x}(t) = \begin{bmatrix} x_1(t) \\ x_2(t) \\ \vdots \\ x_n(t) \end{bmatrix} .
\]
Similarly a matrix-valued function $A(t)$ is
\[
A(t) = \begin{bmatrix}
a_{11}(t) & a_{12}(t) & \cdots & a_{1n}(t) \\
a_{21}(t) & a_{22}(t) & \cdots & a_{2n}(t) \\
\vdots & \vdots & \ddots & \vdots \\
a_{n1}(t) & a_{n2}(t) & \cdots & a_{nn}(t)
\end{bmatrix} .
\]

The derivative $A'(t)$ or $\frac{dA}{dt}$ is just the matrix-valued function whose $ij^{\text{th}}$ entry is $a_{ij}'(t)$.

Rules of differentiation of matrix-valued functions are similar to rules for normal
functions. Let $A(t)$ and $B(t)$ be matrix-valued functions. Let $c$ be a scalar and let $C$ be a
constant matrix. Then
\[
\begin{aligned}
\bigl( A(t) + B(t) \bigr)' &= A'(t) + B'(t), \\
\bigl( A(t) B(t) \bigr)' &= A'(t) B(t) + A(t) B'(t), \\
\bigl( c A(t) \bigr)' &= c A'(t), \\
\bigl( C A(t) \bigr)' &= C A'(t), \\
\bigl( A(t) \, C \bigr)' &= A'(t) \, C .
\end{aligned}
\]

Note the order of the multiplication in the last two expressions.


A first-order linear system of ODEs is a system that can be written as the vector equation
\[
\vec{x}'(t) = P(t) \vec{x}(t) + \vec{f}(t) ,
\]
where $P(t)$ is a matrix-valued function, and $\vec{x}(t)$ and $\vec{f}(t)$ are vector-valued functions. We
will often suppress the dependence on $t$ and only write $\vec{x}' = P \vec{x} + \vec{f}$. A solution of the
system is a vector-valued function $\vec{x}$ satisfying the vector equation.
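For example, the equations
\[
\begin{aligned}
x_1' &= 2t x_1 + e^t x_2 + t^2 , \\
x_2' &= \frac{x_1}{t} - x_2 + e^t ,
\end{aligned}
\]
can be written as
\[
\vec{x}' = \begin{bmatrix} 2t & e^t \\ 1/t & -1 \end{bmatrix} \vec{x} + \begin{bmatrix} t^2 \\ e^t \end{bmatrix} .
\]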
We will mostly concentrate on equations that are not just linear, but are in fact constant-
coefficient equations. That is, the matrix 𝑃 will be constant; it will not depend on 𝑡.
When $\vec{f} = \vec{0}$ (the zero vector), then we say the system is homogeneous. For homogeneous
linear systems, we have the principle of superposition, just like for single homogeneous
equations.
Theorem 3.3.1 (Superposition). Let $\vec{x}' = P \vec{x}$ be a linear homogeneous system of ODEs. Suppose
that $\vec{x}_1, \vec{x}_2, \ldots, \vec{x}_n$ are $n$ solutions of the equation and $c_1, c_2, \ldots, c_n$ are any constants, then
\[
\vec{x} = c_1 \vec{x}_1 + c_2 \vec{x}_2 + \cdots + c_n \vec{x}_n \tag{3.2}
\]
is also a solution. Furthermore, if this is a system of $n$ equations ($P$ is $n \times n$), and $\vec{x}_1, \vec{x}_2, \ldots, \vec{x}_n$
are linearly independent, then every solution $\vec{x}$ can be written as (3.2).
Linear independence for vector-valued functions is the same idea as for normal functions.
The vector-valued functions $\vec{x}_1, \vec{x}_2, \ldots, \vec{x}_n$ are linearly independent when
\[
c_1 \vec{x}_1 + c_2 \vec{x}_2 + \cdots + c_n \vec{x}_n = \vec{0}
\]
has only the solution $c_1 = c_2 = \cdots = c_n = 0$, where the equation must hold for all $t$.
Example 3.3.1: $\vec{x}_1 = \begin{bmatrix} t^2 \\ t \end{bmatrix}$, $\vec{x}_2 = \begin{bmatrix} 0 \\ 1 + t \end{bmatrix}$, $\vec{x}_3 = \begin{bmatrix} -t^2 \\ 1 \end{bmatrix}$ are linearly dependent because $\vec{x}_1 + \vec{x}_3 =
\vec{x}_2$, and this holds for all $t$. So $c_1 = 1$, $c_2 = -1$, and $c_3 = 1$ above will work.
On the other hand if we change the example just slightly $\vec{x}_1 = \begin{bmatrix} t^2 \\ t \end{bmatrix}$, $\vec{x}_2 = \begin{bmatrix} 0 \\ t \end{bmatrix}$, $\vec{x}_3 = \begin{bmatrix} -t^2 \\ 1 \end{bmatrix}$,
then the functions are linearly independent. First write $c_1 \vec{x}_1 + c_2 \vec{x}_2 + c_3 \vec{x}_3 = \vec{0}$ and note
that it has to hold for all $t$. We get that
\[
c_1 \vec{x}_1 + c_2 \vec{x}_2 + c_3 \vec{x}_3
= \begin{bmatrix} c_1 t^2 - c_3 t^2 \\ c_1 t + c_2 t + c_3 \end{bmatrix}
= \begin{bmatrix} 0 \\ 0 \end{bmatrix} .
\]
In other words $c_1 t^2 - c_3 t^2 = 0$ and $c_1 t + c_2 t + c_3 = 0$. If we set $t = 0$, then the second
equation becomes $c_3 = 0$. But then the first equation becomes $c_1 t^2 = 0$ for all $t$ and so
$c_1 = 0$. Thus the second equation is just $c_2 t = 0$, which means $c_2 = 0$. So $c_1 = c_2 = c_3 = 0$ is
the only solution and $\vec{x}_1$, $\vec{x}_2$, and $\vec{x}_3$ are linearly independent.
The linear combination $c_1 \vec{x}_1 + c_2 \vec{x}_2 + \cdots + c_n \vec{x}_n$ could always be written as
\[
X(t) \, \vec{c} ,
\]
where $X(t)$ is the matrix with columns $\vec{x}_1, \vec{x}_2, \ldots, \vec{x}_n$, and $\vec{c}$ is the column vector with entries
$c_1, c_2, \ldots, c_n$. Assuming that $\vec{x}_1, \vec{x}_2, \ldots, \vec{x}_n$ are linearly independent, the matrix-valued
function $X(t)$ is called a fundamental matrix, or a fundamental matrix solution.
To solve nonhomogeneous first-order linear systems, we use the same technique as we
applied to solve single linear nonhomogeneous equations.

Theorem 3.3.2. Let $\vec{x}' = P \vec{x} + \vec{f}$ be a linear system of ODEs. Suppose $\vec{x}_p$ is one particular solution.
Then every solution can be written as
\[
\vec{x} = \vec{x}_c + \vec{x}_p ,
\]
where $\vec{x}_c$ is a solution to the associated homogeneous equation ($\vec{x}' = P \vec{x}$).


The procedure for systems is the same as for single equations. We find a particular
solution to the nonhomogeneous equation, then we find the general solution to the
associated homogeneous equation, and finally we add the two together.
Alright, suppose you have found the general solution of $\vec{x}' = P \vec{x} + \vec{f}$. Next suppose
you are given an initial condition of the form
\[
\vec{x}(t_0) = \vec{b}
\]
for some fixed $t_0$ and a constant vector $\vec{b}$. Let $X(t)$ be a fundamental matrix solution of
the associated homogeneous equation (i.e. columns of $X(t)$ are solutions). The general
solution can be written as
\[
\vec{x}(t) = X(t) \, \vec{c} + \vec{x}_p(t) .
\]
We are seeking a vector $\vec{c}$ such that
\[
\vec{b} = \vec{x}(t_0) = X(t_0) \, \vec{c} + \vec{x}_p(t_0) .
\]
In other words, we are solving for $\vec{c}$ the nonhomogeneous system of linear equations
\[
X(t_0) \, \vec{c} = \vec{b} - \vec{x}_p(t_0) .
\]

Example 3.3.2: In Example 3.1.1 on page 120, we solved the system
\[
\begin{aligned}
x_1' &= x_1 , \\
x_2' &= x_1 - x_2 ,
\end{aligned}
\]
with initial conditions $x_1(0) = 1$, $x_2(0) = 2$. Let us consider this problem in the language of
this section. The system is homogeneous, so $\vec{f}(t) = \vec{0}$. We write the system and the initial
conditions as
\[
\vec{x}' = \begin{bmatrix} 1 & 0 \\ 1 & -1 \end{bmatrix} \vec{x} , \qquad \vec{x}(0) = \begin{bmatrix} 1 \\ 2 \end{bmatrix} .
\]
Ignoring the initial condition for a moment, the general solution is $x_1 = c_1 e^t$ and
$x_2 = \frac{c_1}{2} e^t + c_2 e^{-t}$. Letting $c_1 = 1$ and $c_2 = 0$, we obtain the solution $\vec{x}(t) = \begin{bmatrix} e^t \\ (1/2) e^t \end{bmatrix}$. Letting
$c_1 = 0$ and $c_2 = 1$, we obtain $\vec{x}(t) = \begin{bmatrix} 0 \\ e^{-t} \end{bmatrix}$. These two solutions are linearly independent,
as can be seen by setting $t = 0$, and noting that the resulting constant vectors are linearly
independent. In matrix notation, a fundamental matrix solution is, therefore,
\[
X(t) = \begin{bmatrix} e^t & 0 \\ \frac{1}{2} e^t & e^{-t} \end{bmatrix} .
\]

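To solve the initial value problem, we solve for $\vec{c}$ in the equation
\[
X(0) \, \vec{c} = \vec{b} ,
\]
or in other words,
\[
\begin{bmatrix} 1 & 0 \\ \frac{1}{2} & 1 \end{bmatrix} \vec{c} = \begin{bmatrix} 1 \\ 2 \end{bmatrix} .
\]
A single elementary row operation shows $\vec{c} = \begin{bmatrix} 1 \\ 3/2 \end{bmatrix}$. Our solution is
\[
\vec{x}(t) = X(t) \, \vec{c}
= \begin{bmatrix} e^t & 0 \\ \frac{1}{2} e^t & e^{-t} \end{bmatrix} \begin{bmatrix} 1 \\ \frac{3}{2} \end{bmatrix}
= \begin{bmatrix} e^t \\ \frac{1}{2} e^t + \frac{3}{2} e^{-t} \end{bmatrix} .
\]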

This new solution agrees with our previous solution from § 3.1.
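As a sanity check on this example, the sketch below confirms numerically that the function obtained from the fundamental matrix satisfies both the initial condition and the system. The central-difference derivative, its step size, and the sample times are assumptions made for the check.

```python
import math

def x_sol(t):
    """The solution X(t) c with c = (1, 3/2): (e^t, e^t/2 + 3e^{-t}/2)."""
    return (math.exp(t), 0.5 * math.exp(t) + 1.5 * math.exp(-t))

def rhs(x1, x2):
    """Right-hand side of x1' = x1, x2' = x1 - x2."""
    return (x1, x1 - x2)

def derivative(f, t, h=1e-6):
    """Central-difference approximation of f'(t), component by component."""
    a, b = f(t + h), f(t - h)
    return tuple((p - q) / (2 * h) for p, q in zip(a, b))

print(x_sol(0.0))  # (1.0, 2.0)

# Largest mismatch between the numerical derivative and P x at a few times.
residual = max(
    abs(d - r)
    for t in (0.0, 0.5, 1.0)
    for d, r in zip(derivative(x_sol, t), rhs(*x_sol(t)))
)
print(residual < 1e-6)  # True
```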

3.3.1 Exercises
Exercise 3.3.1: Write the system $x_1' = 2x_1 - 3t x_2 + \sin t$, $x_2' = e^t x_1 + 3x_2 + \cos t$ in the form
$\vec{x}' = P(t) \vec{x} + \vec{f}(t)$.

Exercise 3.3.2:
a) Verify that the system $\vec{x}' = \begin{bmatrix} 1 & 3 \\ 3 & 1 \end{bmatrix} \vec{x}$ has the two solutions $\begin{bmatrix} 1 \\ 1 \end{bmatrix} e^{4t}$ and $\begin{bmatrix} 1 \\ -1 \end{bmatrix} e^{-2t}$.
b) Write down the general solution.
c) Write down the general solution in the form $x_1 = ?$, $x_2 = ?$ (i.e. write down a formula for each
element of the solution).

Exercise 3.3.3: Verify that $\begin{bmatrix} 1 \\ 1 \end{bmatrix} e^t$ and $\begin{bmatrix} 1 \\ -1 \end{bmatrix} e^t$ are linearly independent. Hint: Just plug in $t = 0$.

Exercise 3.3.4: Verify that $\begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} e^t$ and $\begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix} e^t$ and $\begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix} e^{2t}$ are linearly independent. Hint: You
must be a bit more tricky than in the previous exercise.

Exercise 3.3.5: Verify that $\begin{bmatrix} t \\ t^2 \end{bmatrix}$ and $\begin{bmatrix} t^3 \\ t^4 \end{bmatrix}$ are linearly independent.

Exercise 3.3.6: Take the system $x_1' + x_2' = x_1$, $x_1' - x_2' = x_2$.

a) Write it in the form $A \vec{x}' = B \vec{x}$ for matrices $A$ and $B$.
b) Compute $A^{-1}$ and use that to write the system in the form $\vec{x}' = P \vec{x}$.

Exercise 3.3.101: Are $\begin{bmatrix} e^{2t} \\ e^t \end{bmatrix}$ and $\begin{bmatrix} e^t \\ e^{2t} \end{bmatrix}$ linearly independent? Justify.

Exercise 3.3.102: Are $\begin{bmatrix} \cosh(t) \\ 1 \end{bmatrix}$, $\begin{bmatrix} e^t \\ 1 \end{bmatrix}$, and $\begin{bmatrix} e^{-t} \\ 1 \end{bmatrix}$ linearly independent? Justify.

Exercise 3.3.103: Write $x' = 3x - y + e^t$, $y' = tx$ in matrix notation.

Exercise 3.3.104:
a) Write $x_1' = 2t x_2$, $x_2' = 2t x_2$ in matrix notation.
b) Solve and write the solution in matrix notation.

3.4 Eigenvalue method


Note: 2 lectures, §5.2 in [EP], part of §7.3, §7.5, and §7.6 in [BD]
In this section, we will learn how to solve linear homogeneous constant-coefficient
systems of ODEs by the eigenvalue method. Suppose we have such a system

𝑥®′ = 𝑃 𝑥®,

where $P$ is a constant square matrix. We wish to adapt the method for the single constant-coefficient
equation by trying the function $e^{\lambda t}$. However, $\vec{x}$ is a vector. So we try $\vec{x} = \vec{v} e^{\lambda t}$,
where $\vec{v}$ is an arbitrary constant vector. We plug this $\vec{x}$ into the equation to get
\[
\underbrace{\lambda \vec{v} e^{\lambda t}}_{\vec{x}'} = \underbrace{P \vec{v} e^{\lambda t}}_{P \vec{x}} .
\]
We divide by $e^{\lambda t}$ and notice that we are looking for a scalar $\lambda$ and a vector $\vec{v}$ that satisfy the
equation
\[
\lambda \vec{v} = P \vec{v} .
\]
To solve this equation, we need a little bit more linear algebra, which we now review.

3.4.1 Eigenvalues and eigenvectors of a matrix


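Let $A$ be a constant square matrix. Suppose $\lambda$ is a scalar for which there is a nonzero vector
$\vec{v}$ such that
\[
A \vec{v} = \lambda \vec{v} .
\]
We call such a $\lambda$ an eigenvalue of $A$, and we call $\vec{v}$ a corresponding eigenvector.
Example 3.4.1: The matrix $\begin{bmatrix} 2 & 1 \\ 0 & 1 \end{bmatrix}$ has an eigenvalue $\lambda = 2$ with a corresponding eigenvector
$\begin{bmatrix} 1 \\ 0 \end{bmatrix}$, because
\[
\begin{bmatrix} 2 & 1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix}
= \begin{bmatrix} 2 \\ 0 \end{bmatrix} = 2 \begin{bmatrix} 1 \\ 0 \end{bmatrix} .
\]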
Let us see how to compute eigenvalues for any matrix. Rewrite the equation for an
eigenvalue as
\[
(A - \lambda I) \vec{v} = \vec{0} .
\]
This equation has a nonzero solution $\vec{v}$ if and only if $A - \lambda I$ is not invertible. Were
$A - \lambda I$ invertible, we could write $(A - \lambda I)^{-1} (A - \lambda I) \vec{v} = (A - \lambda I)^{-1} \vec{0}$, which implies $\vec{v} = \vec{0}$.
Therefore, $A$ has the eigenvalue $\lambda$ if and only if $\lambda$ solves the equation
\[
\det(A - \lambda I) = 0 .
\]
Consequently, we can find an eigenvalue of $A$ without finding a corresponding eigenvector
at the same time. An eigenvector will have to be found later, once $\lambda$ is known.
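Example 3.4.2: Find all eigenvalues of $\begin{bmatrix} 2 & 1 & 1 \\ 1 & 2 & 0 \\ 0 & 0 & 2 \end{bmatrix}$.
We write
\[
\begin{aligned}
\det \left( \begin{bmatrix} 2 & 1 & 1 \\ 1 & 2 & 0 \\ 0 & 0 & 2 \end{bmatrix} - \lambda \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \right)
&= \det \left( \begin{bmatrix} 2 - \lambda & 1 & 1 \\ 1 & 2 - \lambda & 0 \\ 0 & 0 & 2 - \lambda \end{bmatrix} \right) \\
&= (2 - \lambda) \bigl( (2 - \lambda)^2 - 1 \bigr) = -(\lambda - 1)(\lambda - 2)(\lambda - 3) .
\end{aligned}
\]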


Setting this to zero, we find that the eigenvalues are 𝜆 = 1, 𝜆 = 2, and 𝜆 = 3.
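The three roots can be double-checked by evaluating det(𝐴 − 𝜆𝐼) directly. The sketch below scans a small range of integers for roots, which only works here because the eigenvalues of this particular matrix happen to be integers; it is a check on the example, not a general eigenvalue algorithm. The helper names are illustrative.

```python
def det3(M):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def char_poly(A, lam):
    """Value of det(A - lam I) for a 3 x 3 matrix A."""
    n = len(A)
    shifted = [[A[r][c] - (lam if r == c else 0) for c in range(n)] for r in range(n)]
    return det3(shifted)

A = [[2, 1, 1], [1, 2, 0], [0, 0, 2]]
roots = [lam for lam in range(-5, 6) if char_poly(A, lam) == 0]
print(roots)  # [1, 2, 3]
```

Since the polynomial has degree 3, it has at most three roots, so no eigenvalue was missed by the scan.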


For an $n \times n$ matrix, the polynomial we get by computing $\det(A - \lambda I)$ is of degree $n$, and
hence in general, we have $n$ eigenvalues. Some may be repeated, some may be complex.
To find an eigenvector corresponding to an eigenvalue $\lambda$, we write
\[
(A - \lambda I) \vec{v} = \vec{0} ,
\]
and solve for a nontrivial (nonzero) vector $\vec{v}$. If $\lambda$ is an eigenvalue, there will be at least one
free variable, and so for each distinct eigenvalue $\lambda$, we can always find an eigenvector.
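Example 3.4.3: Find an eigenvector of $\begin{bmatrix} 2 & 1 & 1 \\ 1 & 2 & 0 \\ 0 & 0 & 2 \end{bmatrix}$ corresponding to the eigenvalue $\lambda = 3$.
We write
\[
(A - \lambda I) \vec{v}
= \left( \begin{bmatrix} 2 & 1 & 1 \\ 1 & 2 & 0 \\ 0 & 0 & 2 \end{bmatrix} - 3 \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \right) \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix}
= \begin{bmatrix} -1 & 1 & 1 \\ 1 & -1 & 0 \\ 0 & 0 & -1 \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix} = \vec{0} .
\]
To solve this system of linear equations, we write down the augmented matrix
\[
\left[ \begin{array}{ccc|c} -1 & 1 & 1 & 0 \\ 1 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \end{array} \right] ,
\]
and we perform row operations (exercise: which ones?) until we get:
\[
\left[ \begin{array}{ccc|c} 1 & -1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{array} \right] .
\]
The entries of $\vec{v}$ have to satisfy the equations $v_1 - v_2 = 0$, $v_3 = 0$, and $v_2$ is a free variable.
We pick $v_2$ to be arbitrary (but nonzero), we let $v_1 = v_2$, and of course $v_3 = 0$. For example,
if we pick $v_2 = 1$, then $\vec{v} = \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}$. Let us verify that $\vec{v}$ really is an eigenvector corresponding
to $\lambda = 3$:
\[
\begin{bmatrix} 2 & 1 & 1 \\ 1 & 2 & 0 \\ 0 & 0 & 2 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}
= \begin{bmatrix} 3 \\ 3 \\ 0 \end{bmatrix} = 3 \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} .
\]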
Yay! It worked.
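The verification at the end of the example amounts to one matrix-vector product. A small sketch, with illustrative helper names:

```python
def matvec(A, v):
    """Product of the matrix A (list of rows) with the vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def is_eigenpair(A, lam, v):
    """True if v is nonzero and A v equals lam v."""
    if all(x == 0 for x in v):
        return False            # the zero vector is never an eigenvector
    return matvec(A, v) == [lam * x for x in v]

A = [[2, 1, 1], [1, 2, 0], [0, 0, 2]]
print(matvec(A, [1, 1, 0]))           # [3, 3, 0]
print(is_eigenpair(A, 3, [1, 1, 0]))  # True
print(is_eigenpair(A, 3, [1, 0, 0]))  # False
```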

Exercise 3.4.1 (easy): Are eigenvectors unique? Can you find a different eigenvector for 𝜆 = 3 in
the example above? How are the two eigenvectors related?
Exercise 3.4.2: When the matrix is 2 × 2 you do not need to do row operations when computing an
eigenvector, you can read it off from $A - \lambda I$ (if you have computed the eigenvalues correctly). Can
you see why? Explain. Try it for the matrix $\begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}$.

3.4.2 The eigenvalue method with distinct real eigenvalues


OK, back to homogeneous constant-coefficient systems of ODEs. We have the system
𝑥®′ = 𝑃 𝑥®.
We find the eigenvalues 𝜆1 , 𝜆2 , . . . , 𝜆𝑛 of the matrix 𝑃, and corresponding eigenvectors 𝑣®1 ,
𝑣®2 , . . . , 𝑣® 𝑛 . The functions 𝑣®1 𝑒 𝜆1 𝑡 , 𝑣®2 𝑒 𝜆2 𝑡 , . . . , 𝑣® 𝑛 𝑒 𝜆𝑛 𝑡 are solutions of the system of equations
and hence 𝑥® = 𝑐1 𝑣®1 𝑒 𝜆1 𝑡 + 𝑐 2 𝑣®2 𝑒 𝜆2 𝑡 + · · · + 𝑐 𝑛 𝑣® 𝑛 𝑒 𝜆𝑛 𝑡 is a solution.
Theorem 3.4.1. Take 𝑥®′ = 𝑃 𝑥®. If 𝑃 is an 𝑛 × 𝑛 constant matrix that has 𝑛 distinct real eigenvalues
𝜆1 , 𝜆2 , . . . , 𝜆𝑛 , then there exist 𝑛 linearly independent corresponding eigenvectors 𝑣®1 , 𝑣®2 , . . . , 𝑣® 𝑛 ,
and the general solution to 𝑥®′ = 𝑃 𝑥® can be written as

𝑥® = 𝑐1 𝑣®1 𝑒 𝜆1 𝑡 + 𝑐 2 𝑣®2 𝑒 𝜆2 𝑡 + · · · + 𝑐 𝑛 𝑣® 𝑛 𝑒 𝜆𝑛 𝑡 .

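The corresponding fundamental matrix solution is
\[
X(t) = \begin{bmatrix} \vec{v}_1 e^{\lambda_1 t} & \vec{v}_2 e^{\lambda_2 t} & \cdots & \vec{v}_n e^{\lambda_n t} \end{bmatrix}.
\]
That is, $X(t)$ is the matrix whose $j^{\text{th}}$ column is $\vec{v}_j e^{\lambda_j t}$.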
Example 3.4.4: Consider the system
\[
\vec{x}' = \begin{bmatrix} 2 & 1 & 1 \\ 1 & 2 & 0 \\ 0 & 0 & 2 \end{bmatrix} \vec{x}.
\]
Find the general solution.
Earlier, we found the eigenvalues are 1, 2, 3. We found the eigenvector $\begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}$ for the
eigenvalue 3. Similarly we find the eigenvector $\begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix}$ for the eigenvalue 1 and $\begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix}$ for the
eigenvalue 2 (exercise: check). The general solution is
\[
\vec{x} = c_1 \begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix} e^{t}
+ c_2 \begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix} e^{2t}
+ c_3 \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} e^{3t}
= \begin{bmatrix} c_1 e^{t} + c_3 e^{3t} \\ -c_1 e^{t} + c_2 e^{2t} + c_3 e^{3t} \\ -c_2 e^{2t} \end{bmatrix}.
\]

In terms of a fundamental matrix solution,
\[
\vec{x} = X(t)\,\vec{c} =
\begin{bmatrix} e^{t} & 0 & e^{3t} \\ -e^{t} & e^{2t} & e^{3t} \\ 0 & -e^{2t} & 0 \end{bmatrix}
\begin{bmatrix} c_1 \\ c_2 \\ c_3 \end{bmatrix}.
\]

Exercise 3.4.3: Check that this 𝑥® really solves the system.
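A numerical spot check can complement the pencil-and-paper verification (it does not replace it). The sketch below, in standard-library Python, approximates the derivative of the general solution of Example 3.4.4 by a central difference and compares it with $P\vec{x}$. The function names, the sample constants, and the step size `h` are assumptions made for this illustration.

```python
import math

P = [[2, 1, 1],
     [1, 2, 0],
     [0, 0, 2]]

def x(t, c1, c2, c3):
    """General solution of Example 3.4.4 for given constants c1, c2, c3."""
    e1, e2, e3 = math.exp(t), math.exp(2 * t), math.exp(3 * t)
    return [c1 * e1 + c3 * e3,
            -c1 * e1 + c2 * e2 + c3 * e3,
            -c2 * e2]

def residual(t, c, h=1e-5):
    """Largest entry of |x'(t) - P x(t)|, with x' from a central difference."""
    xp, xm, xt = x(t + h, *c), x(t - h, *c), x(t, *c)
    deriv = [(p - m) / (2 * h) for p, m in zip(xp, xm)]
    Px = [sum(a * b for a, b in zip(row, xt)) for row in P]
    return max(abs(d - q) for d, q in zip(deriv, Px))

# A small value (limited by the finite-difference error) is consistent
# with x being a solution of x' = P x.
print(residual(0.3, (1.0, -2.0, 0.5)))
```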

Note: If we write a single homogeneous linear constant-coefficient 𝑛 th -order equation


as a first-order system (as we did in § 3.1), then the eigenvalue equation

det(𝑃 − 𝜆𝐼) = 0

is essentially the same as the characteristic equation we got in § 2.2 and § 2.3.

3.4.3 Complex eigenvalues


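A matrix may very well have complex eigenvalues even if all the entries are real. Take, for
example,
\[
\vec{x}' = \begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix} \vec{x}.
\]
Let us compute the eigenvalues of the matrix $P = \begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix}$.
\[
\det(P - \lambda I) = \det \begin{bmatrix} 1-\lambda & 1 \\ -1 & 1-\lambda \end{bmatrix}
= (1-\lambda)^2 + 1 = \lambda^2 - 2\lambda + 2 = 0.
\]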

Thus $\lambda = 1 \pm i$. Corresponding eigenvectors are also complex. Start with $\lambda = 1 - i$.
\[
\bigl( P - (1-i) I \bigr) \vec{v} = \vec{0},
\qquad
\begin{bmatrix} i & 1 \\ -1 & i \end{bmatrix} \vec{v} = \vec{0}.
\]
The equations $i v_1 + v_2 = 0$ and $-v_1 + i v_2 = 0$ are multiples of each other. So we only need
to consider one of them. After picking $v_2 = 1$, for example, we have an eigenvector $\vec{v} = \begin{bmatrix} i \\ 1 \end{bmatrix}$.
In similar fashion, we find that $\begin{bmatrix} -i \\ 1 \end{bmatrix}$ is an eigenvector corresponding to the eigenvalue $1 + i$.
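We could write the solution as
\[
\vec{x} = c_1 \begin{bmatrix} i \\ 1 \end{bmatrix} e^{(1-i)t} + c_2 \begin{bmatrix} -i \\ 1 \end{bmatrix} e^{(1+i)t}
= \begin{bmatrix} c_1 i e^{(1-i)t} - c_2 i e^{(1+i)t} \\ c_1 e^{(1-i)t} + c_2 e^{(1+i)t} \end{bmatrix}.
\]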

We would then need to look for complex values 𝑐1 and 𝑐2 to solve any initial conditions. It
is perhaps not completely clear that we get a real solution. After solving for 𝑐 1 and 𝑐 2 , we
could use Euler’s formula and do the whole song and dance we did before, but we will not.
We will apply the formula in a smarter way first to find independent real solutions.
We claim that we did not have to look for a second eigenvector (nor for the second
eigenvalue). All complex eigenvalues come in pairs (because the matrix 𝑃 is real).
First a small detour. The real part of a complex number $z$ can be computed as $\frac{z + \bar{z}}{2}$,
where the bar above $z$ means $\overline{a + ib} = a - ib$. The bar of $z$ is called the complex conjugate of
$z$. If $a$ is a real number, then $\bar{a} = a$. Similarly we bar whole vectors or matrices by taking
the complex conjugate of every entry. Suppose a matrix $P$ is real. Then $\overline{P} = P$, and so
$\overline{P \vec{x}} = \overline{P}\,\overline{\vec{x}} = P \overline{\vec{x}}$. Also the complex conjugate of 0 is still 0, therefore,
\[
\vec{0} = \overline{\vec{0}} = \overline{(P - \lambda I)\vec{v}} = (P - \bar{\lambda} I)\overline{\vec{v}}.
\]
In other words, if $\lambda = a + ib$ is an eigenvalue, then so is $\bar{\lambda} = a - ib$. And if $\vec{v}$ is an
eigenvector corresponding to the eigenvalue $\lambda$, then $\overline{\vec{v}}$ is an eigenvector corresponding to
the eigenvalue $\bar{\lambda}$.
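Suppose $a + ib$ is a complex eigenvalue of $P$, and $\vec{v}$ is a corresponding eigenvector. Then
\[
\vec{x}_1 = \vec{v} e^{(a+ib)t}
\]
is a solution (complex-valued) of $\vec{x}' = P \vec{x}$. Euler's formula shows that $\overline{e^{a+ib}} = e^{a-ib}$, and so
\[
\vec{x}_2 = \overline{\vec{x}_1} = \overline{\vec{v}} e^{(a-ib)t}
\]
is also a solution. As $\vec{x}_1$ and $\vec{x}_2$ are solutions, the function
\[
\vec{x}_3 = \operatorname{Re} \vec{x}_1 = \operatorname{Re} \vec{v} e^{(a+ib)t}
= \frac{\vec{x}_1 + \overline{\vec{x}_1}}{2} = \frac{\vec{x}_1 + \vec{x}_2}{2}
= \frac{1}{2} \vec{x}_1 + \frac{1}{2} \vec{x}_2
\]
is also a solution. And $\vec{x}_3$ is real-valued! Similarly as $\operatorname{Im} z = \frac{z - \bar{z}}{2i}$ is the imaginary part, we
find that
\[
\vec{x}_4 = \operatorname{Im} \vec{x}_1 = \frac{\vec{x}_1 - \overline{\vec{x}_1}}{2i} = \frac{\vec{x}_1 - \vec{x}_2}{2i}
\]
is also a real-valued solution. It turns out that $\vec{x}_3$ and $\vec{x}_4$ are linearly independent. We will
use Euler's formula to separate out the real and imaginary part.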
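Returning to our problem,
\[
\vec{x}_1 = \begin{bmatrix} i \\ 1 \end{bmatrix} e^{(1-i)t}
= \begin{bmatrix} i \\ 1 \end{bmatrix} \bigl( e^t \cos t - i e^t \sin t \bigr)
= \begin{bmatrix} i e^t \cos t + e^t \sin t \\ e^t \cos t - i e^t \sin t \end{bmatrix}
= \begin{bmatrix} e^t \sin t \\ e^t \cos t \end{bmatrix} + i \begin{bmatrix} e^t \cos t \\ -e^t \sin t \end{bmatrix}.
\]
Then
\[
\operatorname{Re} \vec{x}_1 = \begin{bmatrix} e^t \sin t \\ e^t \cos t \end{bmatrix}
\qquad \text{and} \qquad
\operatorname{Im} \vec{x}_1 = \begin{bmatrix} e^t \cos t \\ -e^t \sin t \end{bmatrix}
\]
are the two real-valued linearly independent solutions we seek.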
Exercise 3.4.4: Check that these really are solutions.
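The general solution is
\[
\vec{x} = c_1 \begin{bmatrix} e^t \sin t \\ e^t \cos t \end{bmatrix} + c_2 \begin{bmatrix} e^t \cos t \\ -e^t \sin t \end{bmatrix}
= \begin{bmatrix} c_1 e^t \sin t + c_2 e^t \cos t \\ c_1 e^t \cos t - c_2 e^t \sin t \end{bmatrix}.
\]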
This solution is real-valued for real 𝑐 1 and 𝑐2 . At this point, we would solve for any initial
conditions we may have to find 𝑐1 and 𝑐2 .
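The splitting into real and imaginary parts can also be cross-checked numerically. The following is a small sketch using Python's standard `cmath` module; it evaluates the complex solution $\vec{v} e^{(1-i)t}$ with $\vec{v} = (i, 1)$ at one sample time and compares its real and imaginary parts with the closed forms above. The function names and the sample time are choices made for this illustration.

```python
import cmath
import math

def x1(t):
    """Complex-valued solution v e^{(1-i)t} with eigenvector v = (i, 1)."""
    e = cmath.exp((1 - 1j) * t)
    return [1j * e, e]

def real_solutions(t):
    """Real and imaginary parts of x1(t), returned as two real vectors."""
    z = x1(t)
    return [w.real for w in z], [w.imag for w in z]

t = 0.7
re, im = real_solutions(t)
expected_re = [math.exp(t) * math.sin(t), math.exp(t) * math.cos(t)]
expected_im = [math.exp(t) * math.cos(t), -math.exp(t) * math.sin(t)]
print(re, expected_re)   # the two lists should agree up to rounding
print(im, expected_im)
```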
We summarize the discussion as a theorem.

Theorem 3.4.2. Let $P$ be a real-valued constant matrix. If $P$ has a complex eigenvalue $a + ib$ and
a corresponding eigenvector $\vec{v}$, then $P$ also has a complex eigenvalue $a - ib$ with a corresponding
eigenvector $\overline{\vec{v}}$. Furthermore, $\vec{x}' = P \vec{x}$ has two linearly independent real-valued solutions
\[
\vec{x}_1 = \operatorname{Re} \vec{v} e^{(a+ib)t} \qquad \text{and} \qquad \vec{x}_2 = \operatorname{Im} \vec{v} e^{(a+ib)t}.
\]

For each pair of complex eigenvalues $a + ib$ and $a - ib$, we get two real-valued linearly
independent solutions. We then go on to the next eigenvalue, which is either a real
eigenvalue or another complex eigenvalue pair. If we have $n$ distinct eigenvalues (real
or complex), then we end up with $n$ linearly independent solutions. If we have only two
equations ($n = 2$) as in the example above, then once we find two solutions we are
finished, and our general solution is
\[
\vec{x} = c_1 \vec{x}_1 + c_2 \vec{x}_2
= c_1 \bigl( \operatorname{Re} \vec{v} e^{(a+ib)t} \bigr) + c_2 \bigl( \operatorname{Im} \vec{v} e^{(a+ib)t} \bigr).
\]

We can now find a real-valued general solution to any homogeneous system where the
matrix has distinct eigenvalues. When we have repeated eigenvalues, matters get a bit
more complicated and we will look at that situation in § 3.7.

3.4.4 Exercises

Exercise 3.4.5 (easy): Let $A$ be a $3 \times 3$ matrix with an eigenvalue of 3 and a corresponding
eigenvector $\vec{v} = \begin{bmatrix} 1 \\ -1 \\ 3 \end{bmatrix}$. Find $A\vec{v}$.

Exercise 3.4.6:

a) Find the general solution of 𝑥 1′ = 2𝑥1 , 𝑥2′ = 3𝑥2 using the eigenvalue method (first write the
system in the form 𝑥®′ = 𝐴 𝑥®).
b) Solve the system by solving each equation separately and verify you get the same general
solution.

Exercise 3.4.7: Find the general solution of 𝑥 1′ = 3𝑥1 + 𝑥2 , 𝑥2′ = 2𝑥1 + 4𝑥2 using the eigenvalue
method.

Exercise 3.4.8: Find the general solution of 𝑥 1′ = 𝑥1 − 2𝑥2 , 𝑥2′ = 2𝑥1 + 𝑥2 using the eigenvalue
method. Do not use complex exponentials in your solution.

Exercise 3.4.9:

a) Compute eigenvalues and eigenvectors of $A = \begin{bmatrix} 9 & -2 & -6 \\ -8 & 3 & 6 \\ 10 & -2 & -6 \end{bmatrix}$.

b) Find the general solution of $\vec{x}' = A \vec{x}$.


Exercise 3.4.10: Compute eigenvalues and eigenvectors of $\begin{bmatrix} -2 & -1 & -1 \\ 3 & 2 & 1 \\ -3 & -1 & 0 \end{bmatrix}$.

Exercise 3.4.11: Let $a, b, c, d, e, f$ be numbers. Find the eigenvalues of $\begin{bmatrix} a & b & c \\ 0 & d & e \\ 0 & 0 & f \end{bmatrix}$.

Exercise 3.4.101:

a) Compute eigenvalues and eigenvectors of $A = \begin{bmatrix} 1 & 0 & 3 \\ -1 & 0 & 1 \\ 2 & 0 & 2 \end{bmatrix}$.
b) Solve the system $\vec{x}' = A \vec{x}$.

Exercise 3.4.102:

a) Compute eigenvalues and eigenvectors of $A = \begin{bmatrix} 1 & 1 \\ -1 & 0 \end{bmatrix}$.
b) Solve the system $\vec{x}' = A \vec{x}$.

Exercise 3.4.103: Solve 𝑥 1′ = 𝑥2 , 𝑥2′ = 𝑥1 using the eigenvalue method.

Exercise 3.4.104: Solve 𝑥 1′ = 𝑥2 , 𝑥2′ = −𝑥1 using the eigenvalue method.



3.5 Two-dimensional systems and their vector fields


Note: 1 lecture, part of §6.2 in [EP], parts of §7.5 and §7.6 in [BD]
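We take a moment to discuss constant-coefficient linear homogeneous systems in the
plane. Much intuition can be obtained by studying this simple case. We use coordinates
$(x, y)$ for the plane as usual, and suppose $P = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$ is a $2 \times 2$ matrix. Consider the system
\[
\begin{bmatrix} x \\ y \end{bmatrix}' = P \begin{bmatrix} x \\ y \end{bmatrix}
\qquad \text{or} \qquad
\begin{bmatrix} x \\ y \end{bmatrix}' = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}.
\tag{3.3}
\]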

The system is autonomous (compare this section to § 1.6), and so we draw a vector field (see
the end of § 3.1). We will be able to visually understand this vector field and the solutions
of the ODE in terms of the eigenvalues and eigenvectors of the matrix 𝑃. For this section,
we assume that 𝑃 has two distinct eigenvalues and two corresponding eigenvectors. We
will also assume that 𝑃 is nonsingular, that is, neither eigenvalue is zero.
Case 1. Suppose that the eigenvalues of $P$ are real and positive. We find two corresponding
eigenvectors and plot them in the plane. For example, take the matrix $\begin{bmatrix} 1 & 1 \\ 0 & 2 \end{bmatrix}$. The
eigenvalues are 1 and 2 and corresponding eigenvectors are $\begin{bmatrix} 1 \\ 0 \end{bmatrix}$ and $\begin{bmatrix} 1 \\ 1 \end{bmatrix}$. See Figure 3.4.

Figure 3.4: Eigenvectors of $P$.

Let $(x, y)$ be a point on the line determined by an eigenvector $\vec{v}$ for an eigenvalue
$\lambda$. That is, $\begin{bmatrix} x \\ y \end{bmatrix} = \alpha \vec{v}$ for some scalar $\alpha$. Then
\[
\begin{bmatrix} x \\ y \end{bmatrix}' = P \begin{bmatrix} x \\ y \end{bmatrix} = P(\alpha \vec{v}) = \alpha (P \vec{v}) = \alpha \lambda \vec{v}.
\]
The derivative is a multiple of $\vec{v}$ and hence points along the line determined by $\vec{v}$. As
$\lambda > 0$, the derivative points in the direction of $\vec{v}$ when $\alpha$ is positive and in the opposite
direction when $\alpha$ is negative. We draw the lines determined by the eigenvectors and
arrows on the lines to indicate the directions. See Figure 3.5.
We fill in the rest of the arrows for the vector field, and we draw a few solutions. See
Figure 3.6. The picture looks like a source with arrows coming out from
the origin. Hence we call this type of picture a source or sometimes an unstable node.
Case 2. Suppose both eigenvalues are negative. For example, take the negation of the
matrix from case 1, $\begin{bmatrix} -1 & -1 \\ 0 & -2 \end{bmatrix}$. The eigenvalues are $-1$ and $-2$ and corresponding eigenvectors
are the same, $\begin{bmatrix} 1 \\ 0 \end{bmatrix}$ and $\begin{bmatrix} 1 \\ 1 \end{bmatrix}$. The calculation and the picture are almost the same. The
difference is that the eigenvalues are negative and arrows are reversed. We get the picture
in Figure 3.7. We call this type of picture a sink or a stable node.
Figure 3.5: Eigenvectors of $P$ with directions.

Figure 3.6: Example source vector field with eigenvectors and solutions.

Figure 3.7: Example sink vector field with eigenvectors and solutions.

Figure 3.8: Example saddle vector field with eigenvectors and solutions.

Case 3. Suppose one eigenvalue is positive and one is negative. For example, consider
$P = \begin{bmatrix} 1 & 1 \\ 0 & -2 \end{bmatrix}$. The eigenvalues are 1 and $-2$ and corresponding eigenvectors are $\begin{bmatrix} 1 \\ 0 \end{bmatrix}$ and
$\begin{bmatrix} 1 \\ -3 \end{bmatrix}$. We reverse the arrows on the line corresponding to the negative eigenvalue and we
obtain the picture in Figure 3.8. We call this picture a saddle point.
For the next three cases, we will assume the eigenvalues are complex. In this case the
eigenvectors are also complex and we cannot just plot them in the plane.
Case 4. Suppose the eigenvalues are purely imaginary, that is, $\pm ib$. For example, let
$P = \begin{bmatrix} 0 & 1 \\ -4 & 0 \end{bmatrix}$. The eigenvalues are $\pm 2i$ and corresponding eigenvectors are $\begin{bmatrix} 1 \\ 2i \end{bmatrix}$ and $\begin{bmatrix} 1 \\ -2i \end{bmatrix}$.
Consider the eigenvalue $2i$ and its eigenvector $\begin{bmatrix} 1 \\ 2i \end{bmatrix}$. The real and imaginary parts of $\vec{v} e^{2it}$
are
\[
\operatorname{Re} \begin{bmatrix} 1 \\ 2i \end{bmatrix} e^{2it} = \begin{bmatrix} \cos(2t) \\ -2 \sin(2t) \end{bmatrix},
\qquad
\operatorname{Im} \begin{bmatrix} 1 \\ 2i \end{bmatrix} e^{2it} = \begin{bmatrix} \sin(2t) \\ 2 \cos(2t) \end{bmatrix}.
\]
We can take any linear combination of them to get other solutions; which one we take
depends on the initial conditions. Now note that the real part is a parametric equation for
an ellipse. The same is true of the imaginary part, and in fact of any linear combination of the two.
This is what happens in general when the eigenvalues are purely imaginary. So when the
eigenvalues are purely imaginary, we get ellipses for the solutions. This type of picture is
sometimes called a center. See Figure 3.9.

Figure 3.9: Example center vector field.

Figure 3.10: Example spiral source vector field.

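Case 5. Now suppose the complex eigenvalues have a positive real part. That is, suppose
the eigenvalues are $a \pm ib$ for some $a > 0$. For example, let $P = \begin{bmatrix} 1 & 1 \\ -4 & 1 \end{bmatrix}$. The eigenvalues
turn out to be $1 \pm 2i$ and eigenvectors are $\begin{bmatrix} 1 \\ 2i \end{bmatrix}$ and $\begin{bmatrix} 1 \\ -2i \end{bmatrix}$. We take $1 + 2i$ and its eigenvector
$\begin{bmatrix} 1 \\ 2i \end{bmatrix}$ and find the real and imaginary parts of $\vec{v} e^{(1+2i)t}$ are
\[
\operatorname{Re} \begin{bmatrix} 1 \\ 2i \end{bmatrix} e^{(1+2i)t} = e^t \begin{bmatrix} \cos(2t) \\ -2 \sin(2t) \end{bmatrix},
\qquad
\operatorname{Im} \begin{bmatrix} 1 \\ 2i \end{bmatrix} e^{(1+2i)t} = e^t \begin{bmatrix} \sin(2t) \\ 2 \cos(2t) \end{bmatrix}.
\]
Note the $e^t$ in front of the solutions. The solutions grow in magnitude while spinning
around the origin. Hence we get a spiral source. See Figure 3.10.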
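Case 6. Finally suppose the complex eigenvalues have a negative real part. That is,
suppose the eigenvalues are $-a \pm ib$ for some $a > 0$. For example, let $P = \begin{bmatrix} -1 & -1 \\ 4 & -1 \end{bmatrix}$. The
eigenvalues turn out to be $-1 \pm 2i$ and eigenvectors are $\begin{bmatrix} 1 \\ -2i \end{bmatrix}$ and $\begin{bmatrix} 1 \\ 2i \end{bmatrix}$. We take $-1 - 2i$
and its eigenvector $\begin{bmatrix} 1 \\ 2i \end{bmatrix}$ and find the real and imaginary parts of $\vec{v} e^{(-1-2i)t}$ are
\[
\operatorname{Re} \begin{bmatrix} 1 \\ 2i \end{bmatrix} e^{(-1-2i)t} = e^{-t} \begin{bmatrix} \cos(2t) \\ 2 \sin(2t) \end{bmatrix},
\qquad
\operatorname{Im} \begin{bmatrix} 1 \\ 2i \end{bmatrix} e^{(-1-2i)t} = e^{-t} \begin{bmatrix} -\sin(2t) \\ 2 \cos(2t) \end{bmatrix}.
\]
Note the $e^{-t}$ in front of the solutions. The solutions shrink in magnitude while spinning
around the origin. Hence we get a spiral sink. See Figure 3.11.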

Figure 3.11: Example spiral sink vector field.

We summarize the behavior of linear homogeneous two-dimensional systems given
by a nonsingular matrix in Table 3.1. Systems where one of the eigenvalues is zero (the
matrix is singular) come up in practice from time to time (see Example 3.1.2 on page 121),
and the pictures are somewhat different (simpler in a way). See the exercises. When the
eigenvalues are real and repeated, the analysis is a little bit more difficult and we have
not yet covered how to solve such systems, but the idea is roughly the same: a repeated
positive eigenvalue means a source and a repeated negative eigenvalue means a sink.

Eigenvalues                         Behavior
real and both positive              source / unstable node
real and both negative              sink / stable node
real and opposite signs             saddle
purely imaginary                    center point / ellipses
complex with positive real part     spiral source
complex with negative real part     spiral sink

Table 3.1: Summary of behavior of linear homogeneous two-dimensional systems.
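Table 3.1 can be turned into a small classifier. The sketch below is one possible way to do it in Python, under the following assumption that is not spelled out in the text: for a $2 \times 2$ matrix the eigenvalues are the roots of $\lambda^2 - T\lambda + D = 0$, where $T = a + d$ is the trace and $D = ad - bc$ is the determinant, so their type can be read off from $T$, $D$, and the discriminant $T^2 - 4D$. The function name `classify` is made up for this illustration, and the exact comparisons with zero are only reliable for exactly representable (for example integer) entries.

```python
def classify(a, b, c, d):
    """Classify x' = P x for P = [[a, b], [c, d]] following Table 3.1.

    Works from the characteristic polynomial lambda^2 - T lambda + D,
    with T the trace and D the determinant of P.
    """
    T = a + d
    D = a * d - b * c
    disc = T * T - 4 * D              # discriminant of the characteristic polynomial
    if D == 0:
        return "singular matrix (zero eigenvalue), not in the table"
    if disc == 0:
        return "repeated eigenvalue, not in the table"
    if disc > 0:                      # two distinct real eigenvalues, product D
        if D < 0:
            return "saddle"
        return "source / unstable node" if T > 0 else "sink / stable node"
    # disc < 0: complex conjugate pair with real part T / 2
    if T == 0:
        return "center point / ellipses"
    return "spiral source" if T > 0 else "spiral sink"

# The six example matrices from Cases 1 through 6.
for entries in [(1, 1, 0, 2), (-1, -1, 0, -2), (1, 1, 0, -2),
                (0, 1, -4, 0), (1, 1, -4, 1), (-1, -1, 4, -1)]:
    print(entries, "->", classify(*entries))
```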



3.5.1 Exercises
Exercise 3.5.1: Take the equation 𝑚𝑥 ′′ + 𝑐𝑥 ′ + 𝑘𝑥 = 0, with 𝑚 > 0, 𝑐 ≥ 0, 𝑘 > 0 for the
mass-spring system.

a) Convert this to a system of first-order equations.
b) Classify which behavior you get for which $m$, $c$, $k$.
c) Explain from physical intuition why you do not get all the different kinds of behavior here.

Exercise 3.5.2: What happens in the case when $P = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$? In this case the eigenvalue is repeated
and there is only one independent eigenvector. Draw and describe the vector field.

Exercise 3.5.3: What happens in the case when $P = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}$? Draw the vector field. Does this look
like any of the pictures we have drawn?

Exercise 3.5.4: Which behaviors are possible if $P$ is diagonal, that is, $P = \begin{bmatrix} a & 0 \\ 0 & b \end{bmatrix}$? You can assume
that $a$ and $b$ are not zero.

Exercise 3.5.5: Take the system from Example 3.1.2 on page 121, $x_1' = \frac{r}{V}(x_2 - x_1)$, $x_2' = \frac{r}{V}(x_1 - x_2)$.
As we said, one of the eigenvalues is zero. What is the other eigenvalue, what does the picture look
like, and what happens when $t$ goes to infinity?

Exercise 3.5.101: Describe the behavior of the following systems without solving:

a) 𝑥 ′ = 𝑥 + 𝑦, 𝑦 ′ = 𝑥 − 𝑦. b) 𝑥 1′ = 𝑥1 + 𝑥 2 , 𝑥2′ = 2𝑥2 .
c) 𝑥1′ = −2𝑥2 , 𝑥 2′ = 2𝑥1 . d) 𝑥 ′ = 𝑥 + 3𝑦, 𝑦 ′ = −2𝑥 − 4𝑦.
e) 𝑥 ′ = 𝑥 − 4𝑦, 𝑦 ′ = −4𝑥 + 𝑦.

Exercise 3.5.102: Suppose that $\vec{x}' = A \vec{x}$ where $A$ is a 2 by 2 matrix with eigenvalues $2 \pm i$.
Describe the behavior.

Exercise 3.5.103: Take $\begin{bmatrix} x \\ y \end{bmatrix}' = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}$. Draw the vector field and describe the behavior. Is it
one of the behaviors that we have seen before?



3.6 Second-order systems and applications


Note: more than 2 lectures, §5.4 in [EP], not in [BD]

3.6.1 Undamped mass-spring systems


While we did say that we can usually only study first-order systems, it is sometimes more
convenient to analyze the system in the way it arises naturally. For example, consider 3
masses connected by springs between two walls. We could pick any higher number, and
the math would be essentially the same, but for simplicity, we pick 3 right now. We also
assume no friction, that is, the system is undamped. The masses are 𝑚1 , 𝑚2 , and 𝑚3 and
the spring constants are 𝑘 1 , 𝑘 2 , 𝑘 3 , and 𝑘 4 . Let 𝑥1 be the displacement from rest position of
the first mass, and 𝑥2 and 𝑥3 the displacement of the second and third mass. We make, as
usual, positive values go right (as 𝑥1 grows, the first mass is moving right). See Figure 3.12.

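Figure 3.12: System of masses and springs. (From left to right: wall, spring $k_1$, mass $m_1$, spring $k_2$, mass $m_2$, spring $k_3$, mass $m_3$, spring $k_4$, wall.)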

This simple system turns up in unexpected places. For example, our world really
consists of many small particles of matter interacting together. When we try the system
above with many more masses, we obtain a good approximation to how an elastic material
behaves. By somehow taking a limit of the number of masses going to infinity, we obtain
the continuous one-dimensional wave equation (that we study in § 4.7). But we digress.
Let us set up the equations for the three mass system. By Hooke’s law, the force acting
on the mass equals the spring compression times the spring constant. By Newton’s second
law, force is mass times acceleration. So if we sum the forces acting on each mass, put the
right sign in front of each term, depending on the direction in which it is acting, and set
this equal to mass times the acceleration, we end up with the desired system of equations.

\[
\begin{aligned}
m_1 x_1'' &= -k_1 x_1 + k_2 (x_2 - x_1) & &= -(k_1 + k_2) x_1 + k_2 x_2 , \\
m_2 x_2'' &= -k_2 (x_2 - x_1) + k_3 (x_3 - x_2) & &= k_2 x_1 - (k_2 + k_3) x_2 + k_3 x_3 , \\
m_3 x_3'' &= -k_3 (x_3 - x_2) - k_4 x_3 & &= k_3 x_2 - (k_3 + k_4) x_3 .
\end{aligned}
\]

We define the matrices
\[
M = \begin{bmatrix} m_1 & 0 & 0 \\ 0 & m_2 & 0 \\ 0 & 0 & m_3 \end{bmatrix}
\qquad \text{and} \qquad
K = \begin{bmatrix} -(k_1 + k_2) & k_2 & 0 \\ k_2 & -(k_2 + k_3) & k_3 \\ 0 & k_3 & -(k_3 + k_4) \end{bmatrix}.
\]


We write the equation simply as


𝑀 𝑥®′′ = 𝐾 𝑥®.
At this point we could introduce 3 new variables and write out a system of 6 first-order
equations. We claim this simple setup is easier to handle as a second-order system. We call
𝑥® the displacement vector, 𝑀 the mass matrix, and 𝐾 the stiffness matrix.

Exercise 3.6.1: Repeat this setup for 4 masses (find the matrices 𝑀 and 𝐾). Do it for 5 masses.
Can you find a prescription to do it for 𝑛 masses?

As with a single equation, we want to “divide by $M$.” This means computing the
inverse of $M$. The masses are all nonzero and $M$ is a diagonal matrix, so computing the
inverse is easy:
\[
M^{-1} = \begin{bmatrix} \frac{1}{m_1} & 0 & 0 \\ 0 & \frac{1}{m_2} & 0 \\ 0 & 0 & \frac{1}{m_3} \end{bmatrix}.
\]
This fact follows readily from how we multiply diagonal matrices. As an exercise, you should
verify that $M M^{-1} = M^{-1} M = I$.
Let 𝐴 = 𝑀 −1 𝐾. We look at the system 𝑥®′′ = 𝑀 −1 𝐾 𝑥®, or

𝑥®′′ = 𝐴 𝑥®.

Many real world systems can be modeled by this equation. For simplicity, we will only talk
about the given masses-and-springs problem. We try a solution of the form

𝑥® = 𝑣® 𝑒 𝛼𝑡 .

We compute that for this guess, $\vec{x}'' = \alpha^2 \vec{v} e^{\alpha t}$. We plug our guess into the equation and get
\[
\alpha^2 \vec{v} e^{\alpha t} = A \vec{v} e^{\alpha t}.
\]
We divide by $e^{\alpha t}$ to arrive at $\alpha^2 \vec{v} = A \vec{v}$. Hence if $\alpha^2$ is an eigenvalue of $A$ and $\vec{v}$ is a
corresponding eigenvector, we have found a solution.
In our example, and in other common applications, $A$ has only real negative eigenvalues
(and possibly a zero eigenvalue). So we study only this case. When an eigenvalue $\lambda$ is
negative, it means that $\alpha^2 = \lambda$ is negative. Hence there is some real number $\omega$ such that
$-\omega^2 = \lambda$. Then $\alpha = \pm i\omega$. For $\alpha = i\omega$, the solution we guessed is $\vec{x} = \vec{v} e^{i\omega t}$ or
\[
\vec{x} = \vec{v} \bigl( \cos(\omega t) + i \sin(\omega t) \bigr).
\]
By taking the real and imaginary parts (note that $\vec{v}$ is real), we find that $\vec{v} \cos(\omega t)$ and
$\vec{v} \sin(\omega t)$ are linearly independent solutions.
If an eigenvalue is zero, it turns out that both 𝑣® and 𝑣® 𝑡 are solutions, where 𝑣® is an
eigenvector corresponding to the eigenvalue 0.

Exercise 3.6.2: Show that if 𝐴 has a zero eigenvalue and 𝑣® is a corresponding eigenvector, then
𝑥® = 𝑣® (𝑎 + 𝑏𝑡) is a solution of 𝑥®′′ = 𝐴 𝑥® for arbitrary constants 𝑎 and 𝑏.
Theorem 3.6.1. Let $A$ be a real $n \times n$ matrix with $n$ distinct real negative (or zero) eigenvalues we
denote by $-\omega_1^2 > -\omega_2^2 > \cdots > -\omega_n^2$, and corresponding eigenvectors by $\vec{v}_1, \vec{v}_2, \ldots, \vec{v}_n$. If $A$ is
invertible (that is, if $\omega_1 > 0$), then
\[
\vec{x}(t) = \sum_{i=1}^{n} \vec{v}_i \bigl( a_i \cos(\omega_i t) + b_i \sin(\omega_i t) \bigr)
\]
is the general solution of
\[
\vec{x}'' = A \vec{x},
\]
for some arbitrary constants $a_i$ and $b_i$. If $A$ has a zero eigenvalue, that is, $\omega_1 = 0$, and all other
eigenvalues are distinct and negative, then the general solution can be written as
\[
\vec{x}(t) = \vec{v}_1 (a_1 + b_1 t) + \sum_{i=2}^{n} \vec{v}_i \bigl( a_i \cos(\omega_i t) + b_i \sin(\omega_i t) \bigr).
\]

We use this solution and the setup from the introduction of this section even when
some of the masses and springs are missing. For example, when there are only 2 masses
and only 2 springs, simply take only the equations for the two masses and set all the spring
constants for the springs that are missing to zero.

3.6.2 Examples

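Example 3.6.1: Consider the setup in Figure 3.13, with $m_1 = 2$ kg, $m_2 = 1$ kg, $k_1 = 4$ N/m,
and $k_2 = 2$ N/m.

Figure 3.13: System of masses and springs. (From left to right: wall, spring $k_1$, mass $m_1$, spring $k_2$, mass $m_2$.)

The equations we write down are
\[
\begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix} \vec{x}'' = \begin{bmatrix} -(4+2) & 2 \\ 2 & -2 \end{bmatrix} \vec{x},
\]
or
\[
\vec{x}'' = \begin{bmatrix} -3 & 1 \\ 2 & -2 \end{bmatrix} \vec{x}.
\]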

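We find the eigenvalues of $A$ to be $\lambda = -1, -4$ (exercise). We find corresponding
eigenvectors to be $\begin{bmatrix} 1 \\ 2 \end{bmatrix}$ and $\begin{bmatrix} 1 \\ -1 \end{bmatrix}$ respectively (exercise). We check the theorem and note
that $\omega_1 = 1$ and $\omega_2 = 2$. Hence the general solution is
\[
\vec{x} = \begin{bmatrix} 1 \\ 2 \end{bmatrix} \bigl( a_1 \cos(t) + b_1 \sin(t) \bigr)
+ \begin{bmatrix} 1 \\ -1 \end{bmatrix} \bigl( a_2 \cos(2t) + b_2 \sin(2t) \bigr).
\]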
The two terms in the solution represent the two so-called natural or normal modes of
oscillation. And the two (angular) frequencies are the natural frequencies. The first natural
frequency is 1, and second natural frequency is 2. The two modes are plotted in Figure 3.14.

Figure 3.14: The two modes of the mass-spring system. In the left plot the masses are moving in unison,
and in the right plot the masses are moving in opposite directions.

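Let us write the solution as
\[
\vec{x} = \begin{bmatrix} 1 \\ 2 \end{bmatrix} c_1 \cos(t - \alpha_1) + \begin{bmatrix} 1 \\ -1 \end{bmatrix} c_2 \cos(2t - \alpha_2).
\]
The first term,
\[
\begin{bmatrix} 1 \\ 2 \end{bmatrix} c_1 \cos(t - \alpha_1) = \begin{bmatrix} c_1 \cos(t - \alpha_1) \\ 2 c_1 \cos(t - \alpha_1) \end{bmatrix},
\]
corresponds to the mode where the masses move synchronously in the same direction.
The second term,
\[
\begin{bmatrix} 1 \\ -1 \end{bmatrix} c_2 \cos(2t - \alpha_2) = \begin{bmatrix} c_2 \cos(2t - \alpha_2) \\ -c_2 \cos(2t - \alpha_2) \end{bmatrix},
\]
corresponds to the mode where the masses move synchronously but in opposite directions.
The general solution is a combination of the two modes. That is, the initial conditions
determine the amplitude and phase shift of each mode. As an example, suppose we have
initial conditions
\[
\vec{x}(0) = \begin{bmatrix} 1 \\ -1 \end{bmatrix}, \qquad \vec{x}'(0) = \begin{bmatrix} 0 \\ 6 \end{bmatrix}.
\]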

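We use the $a_j$, $b_j$ constants to solve for initial conditions. First
\[
\begin{bmatrix} 1 \\ -1 \end{bmatrix} = \vec{x}(0) = \begin{bmatrix} 1 \\ 2 \end{bmatrix} a_1 + \begin{bmatrix} 1 \\ -1 \end{bmatrix} a_2 = \begin{bmatrix} a_1 + a_2 \\ 2a_1 - a_2 \end{bmatrix}.
\]
We solve (exercise) to find $a_1 = 0$, $a_2 = 1$. To find $b_1$ and $b_2$, we differentiate first:
\[
\vec{x}' = \begin{bmatrix} 1 \\ 2 \end{bmatrix} \bigl( -a_1 \sin(t) + b_1 \cos(t) \bigr)
+ \begin{bmatrix} 1 \\ -1 \end{bmatrix} \bigl( -2a_2 \sin(2t) + 2b_2 \cos(2t) \bigr).
\]
Now we solve:
\[
\begin{bmatrix} 0 \\ 6 \end{bmatrix} = \vec{x}'(0) = \begin{bmatrix} 1 \\ 2 \end{bmatrix} b_1 + \begin{bmatrix} 1 \\ -1 \end{bmatrix} 2b_2 = \begin{bmatrix} b_1 + 2b_2 \\ 2b_1 - 2b_2 \end{bmatrix}.
\]
Again solve (exercise) to find $b_1 = 2$, $b_2 = -1$. So our solution is
\[
\vec{x} = \begin{bmatrix} 1 \\ 2 \end{bmatrix} 2 \sin(t) + \begin{bmatrix} 1 \\ -1 \end{bmatrix} \bigl( \cos(2t) - \sin(2t) \bigr)
= \begin{bmatrix} 2 \sin(t) + \cos(2t) - \sin(2t) \\ 4 \sin(t) - \cos(2t) + \sin(2t) \end{bmatrix}.
\]
The graphs of the two displacements, $x_1$ and $x_2$, of the two carts are in Figure 3.15.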
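As a sanity check on the arithmetic, the solution of Example 3.6.1 can be tested numerically against both the initial conditions and the equation $\vec{x}'' = A\vec{x}$. The following is a rough sketch in standard-library Python; the function names and the finite-difference step sizes are assumptions chosen for this illustration, and the derivatives are only approximated.

```python
import math

A = [[-3, 1],
     [2, -2]]

def x(t):
    """Solution of Example 3.6.1 for x(0) = (1, -1), x'(0) = (0, 6)."""
    s, c2, s2 = math.sin(t), math.cos(2 * t), math.sin(2 * t)
    return [2 * s + c2 - s2, 4 * s - c2 + s2]

def second_derivative(f, t, h=1e-4):
    """Central-difference approximation of f''(t), entry by entry."""
    fp, f0, fm = f(t + h), f(t), f(t - h)
    return [(p - 2 * z + m) / h**2 for p, z, m in zip(fp, f0, fm)]

def residual(t):
    """Largest entry of |x''(t) - A x(t)| using the approximate x''."""
    xt = x(t)
    Ax = [sum(a * b for a, b in zip(row, xt)) for row in A]
    return max(abs(d - q) for d, q in zip(second_derivative(x, t), Ax))

h = 1e-6
velocity0 = [(p - m) / (2 * h) for p, m in zip(x(h), x(-h))]
print(x(0.0))         # initial position
print(velocity0)      # approximate initial velocity
print(residual(1.3))  # should be small
```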

Figure 3.15: Superposition of the two modes given the initial conditions.

Example 3.6.2: We have two toy rail cars. Car 1 of mass 2 kg is traveling at 3 m/s towards
the second rail car of mass 1 kg. There is a bumper on the second rail car that engages at
the moment the cars hit (it connects the two cars) and does not let go. The bumper acts
like a spring of spring constant $k = 2$ N/m. The second car is 10 meters from a wall. See
Figure 3.16.
We want to ask several questions. At what time after the cars link does impact with the
wall happen? What is the speed of car 2 when it hits the wall?
Figure 3.16: The crash of two rail cars. (Car 1 of mass $m_1$ and car 2 of mass $m_2$ are joined by a bumper of spring constant $k$; car 2 is 10 meters from the wall.)

OK, let us first set the system up. Let $t = 0$ be the time when the two cars link up. Let $x_1$
be the displacement of the first car from the position at $t = 0$, and let $x_2$ be the displacement
of the second car from its original location. Then the time when $x_2(t) = 10$ is exactly the
time when impact with the wall occurs. For this $t$, $x_2'(t)$ is the speed at impact. This system
acts just like the system of the previous example but without $k_1$. Hence the equation is
\[
\begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix} \vec{x}'' = \begin{bmatrix} -2 & 2 \\ 2 & -2 \end{bmatrix} \vec{x},
\]
or
\[
\vec{x}'' = \begin{bmatrix} -1 & 1 \\ 2 & -2 \end{bmatrix} \vec{x}.
\]
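We compute the eigenvalues of $A$. It is not hard to see that the eigenvalues are 0 and
$-3$ (exercise). Furthermore, eigenvectors are $\begin{bmatrix} 1 \\ 1 \end{bmatrix}$ and $\begin{bmatrix} 1 \\ -2 \end{bmatrix}$ respectively (exercise). Then
$\omega_1 = 0$, $\omega_2 = \sqrt{3}$, and by the second part of the theorem the general solution is
\[
\begin{aligned}
\vec{x} &= \begin{bmatrix} 1 \\ 1 \end{bmatrix} (a_1 + b_1 t) + \begin{bmatrix} 1 \\ -2 \end{bmatrix} \bigl( a_2 \cos(\sqrt{3}\, t) + b_2 \sin(\sqrt{3}\, t) \bigr) \\
&= \begin{bmatrix} a_1 + b_1 t + a_2 \cos(\sqrt{3}\, t) + b_2 \sin(\sqrt{3}\, t) \\ a_1 + b_1 t - 2a_2 \cos(\sqrt{3}\, t) - 2b_2 \sin(\sqrt{3}\, t) \end{bmatrix}.
\end{aligned}
\]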

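We now apply the initial conditions. First the cars start at position 0 so $x_1(0) = 0$ and
$x_2(0) = 0$. The first car is traveling at 3 m/s, so $x_1'(0) = 3$ and the second car starts at rest, so
$x_2'(0) = 0$. The first condition says
\[
\vec{0} = \vec{x}(0) = \begin{bmatrix} a_1 + a_2 \\ a_1 - 2a_2 \end{bmatrix}.
\]
It is not hard to see that $a_1 = a_2 = 0$. We set $a_1 = 0$ and $a_2 = 0$ in $\vec{x}(t)$ and differentiate to get
\[
\vec{x}'(t) = \begin{bmatrix} b_1 + \sqrt{3}\, b_2 \cos(\sqrt{3}\, t) \\ b_1 - 2\sqrt{3}\, b_2 \cos(\sqrt{3}\, t) \end{bmatrix}.
\]
So
\[
\begin{bmatrix} 3 \\ 0 \end{bmatrix} = \vec{x}'(0) = \begin{bmatrix} b_1 + \sqrt{3}\, b_2 \\ b_1 - 2\sqrt{3}\, b_2 \end{bmatrix}.
\]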

Solving these two equations, we find $b_1 = 2$ and $b_2 = \frac{1}{\sqrt{3}}$. Hence the position of our cars is
(until the impact with the wall)
\[
\vec{x} = \begin{bmatrix} 2t + \frac{1}{\sqrt{3}} \sin(\sqrt{3}\, t) \\ 2t - \frac{2}{\sqrt{3}} \sin(\sqrt{3}\, t) \end{bmatrix}.
\]
Note how the presence of the zero eigenvalue resulted in a term containing $t$. This means
that the cars will be traveling in the positive direction as time grows, which is what we
expect.
What we are really interested in is the second expression, the one for $x_2$. We have
$x_2(t) = 2t - \frac{2}{\sqrt{3}} \sin(\sqrt{3}\, t)$. See Figure 3.17 for the plot of $x_2$ versus time.
Just from the graph we can see that time of impact will be a little more than 5 seconds
from time zero. To find it, we have to solve the equation $10 = x_2(t) = 2t - \frac{2}{\sqrt{3}} \sin(\sqrt{3}\, t)$. Using
a computer (or even a graphing calculator) we find that $t_{\text{impact}} \approx 5.22$ seconds.
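One way to carry out the "using a computer" step is bisection. The sketch below is written in standard-library Python and is only one of many possible approaches; the function names, the search interval $[0, 10]$, and the tolerance are assumptions made for this illustration. Bisection is safe here because $x_2'(t) = 2 - 2\cos(\sqrt{3}\, t) \geq 0$, so $x_2$ never decreases. The same formula for $x_2'$ gives the speed at the computed time.

```python
import math

SQRT3 = math.sqrt(3.0)

def x2(t):
    """Position of car 2 after linkup (ignoring the wall)."""
    return 2 * t - (2 / SQRT3) * math.sin(SQRT3 * t)

def x2_speed(t):
    """Derivative of x2 with respect to t."""
    return 2 - 2 * math.cos(SQRT3 * t)

def impact_time(distance=10.0, lo=0.0, hi=10.0, tol=1e-10):
    """Solve x2(t) = distance by bisection on [lo, hi].

    Assumes x2(lo) <= distance <= x2(hi) and relies on x2 being nondecreasing.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if x2(mid) < distance:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

t_impact = impact_time()
print(round(t_impact, 2), round(x2_speed(t_impact), 2))
```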
The speed of the second car is $x_2' = 2 - 2\cos(\sqrt{3}\, t)$. At the time of impact (5.22
seconds from $t = 0$) we get $x_2'(t_{\text{impact}}) \approx 3.85$. The maximum speed is the maximum of
$2 - 2\cos(\sqrt{3}\, t)$, which is 4. We are traveling at almost the maximum speed when we hit
the wall.

Figure 3.17: Position of the second car in time (ignoring the wall).

Suppose that Bob is a tiny person sitting on car 2. Bob has a Martini in his hand and
would like not to spill it. Let us suppose Bob would not spill his Martini when the
first car links up with car 2, but if car 2 hits the wall at any speed greater than zero, Bob
will spill his drink. Suppose Bob can move car 2 a few meters towards or away from
the wall (he cannot go all the way to the wall, nor can he get out of the way of the first car).
Is there a “safe” distance for him to be at? A distance such that the impact with the wall is
at zero speed?
The answer is yes. On Figure 3.17, note the “plateau” between $t = 3$ and $t = 4$. There is
a point where the speed is zero. To find it we solve $x_2'(t) = 0$. This is when $\cos(\sqrt{3}\, t) = 1$,
or in other words when $t = \frac{2\pi}{\sqrt{3}}, \frac{4\pi}{\sqrt{3}}, \ldots$ and so on. We plug in the first value to obtain
$x_2\bigl( \frac{2\pi}{\sqrt{3}} \bigr) = \frac{4\pi}{\sqrt{3}} \approx 7.26$. So a “safe” distance is about 7 and a quarter meters from the wall.
Alternatively Bob could move away from the wall towards the incoming car, where
another safe distance is $x_2\bigl( \frac{4\pi}{\sqrt{3}} \bigr) = \frac{8\pi}{\sqrt{3}} \approx 14.51$ and so on. We can use all the different $t$ such
that $x_2'(t) = 0$. Of course $t = 0$ is also a solution, corresponding to $x_2 = 0$, but that means
standing right at the wall.

3.6.3 Forced oscillations


Finally we move to forced oscillations. Suppose that now our system is

𝑥®′′ = 𝐴 𝑥® + 𝐹® cos(𝜔𝑡). (3.4)

That is, we are adding periodic forcing to the system in the direction of the vector $\vec{F}$.
As before, this system just requires us to find one particular solution $\vec{x}_p$, add it to the
general solution of the associated homogeneous system $\vec{x}_c$, and we will have the general
solution to (3.4). Suppose that $\omega$ is not one of the natural frequencies of $\vec{x}'' = A \vec{x}$.
Then we can guess
\[
\vec{x}_p = \vec{c} \cos(\omega t),
\]
where $\vec{c}$ is an unknown constant vector. Note that we do not need to use sine since there
are only second derivatives. We solve for $\vec{c}$ to find $\vec{x}_p$. This is really just the method of
undetermined coefficients for systems. Let us differentiate $\vec{x}_p$ twice to get
\[
\vec{x}_p'' = -\omega^2 \vec{c} \cos(\omega t).
\]

Plug $\vec{x}_p$ and $\vec{x}_p''$ into equation (3.4):
\[
\overbrace{-\omega^2 \vec{c} \cos(\omega t)}^{\vec{x}_p''} = \overbrace{A \vec{c} \cos(\omega t)}^{A \vec{x}_p} + \vec{F} \cos(\omega t).
\]
We cancel out the cosine and rearrange the equation to obtain
\[
(A + \omega^2 I)\vec{c} = -\vec{F}.
\]
So
\[
\vec{c} = (A + \omega^2 I)^{-1} (-\vec{F}).
\]
Of course this is possible only if $(A + \omega^2 I) = \bigl( A - (-\omega^2) I \bigr)$ is invertible. That matrix is
invertible if and only if $-\omega^2$ is not an eigenvalue of $A$. That is true if and only if $\omega$ is not a
natural frequency of the system.
We simplified things a little bit. If we wish to have the forcing term to be in the units of
force, say Newtons, then we must write
\[
M \vec{x}'' = K \vec{x} + \vec{G} \cos(\omega t).
\]
If we then write things in terms of $A = M^{-1} K$, we have
\[
\vec{x}'' = M^{-1} K \vec{x} + M^{-1} \vec{G} \cos(\omega t)
\qquad \text{or} \qquad
\vec{x}'' = A \vec{x} + \vec{F} \cos(\omega t),
\]
where $\vec{F} = M^{-1} \vec{G}$.
Example 3.6.3: Let us take the example in Figure 3.13 on page 154 with the same parameters
as before: 𝑚1 = 2, 𝑚2 = 1, 𝑘 1 = 4, and 𝑘 2 = 2. Now suppose that there is a force 2 cos(3𝑡)
acting on the second cart.

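The equation is
\[
\begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix} \vec{x}'' = \begin{bmatrix} -(4+2) & 2 \\ 2 & -2 \end{bmatrix} \vec{x} + \begin{bmatrix} 0 \\ 2 \end{bmatrix} \cos(3t)
\qquad \text{or} \qquad
\vec{x}'' = \begin{bmatrix} -3 & 1 \\ 2 & -2 \end{bmatrix} \vec{x} + \begin{bmatrix} 0 \\ 2 \end{bmatrix} \cos(3t).
\]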

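We solved the associated homogeneous equation before and found the complementary
solution to be
\[
\vec{x}_c = \begin{bmatrix} 1 \\ 2 \end{bmatrix} \bigl( a_1 \cos(t) + b_1 \sin(t) \bigr)
+ \begin{bmatrix} 1 \\ -1 \end{bmatrix} \bigl( a_2 \cos(2t) + b_2 \sin(2t) \bigr).
\]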

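The natural frequencies are 1 and 2. As 3 is not a natural frequency, we try $\vec{c} \cos(3t)$.
We invert $(A + 3^2 I)$:
\[
\left( \begin{bmatrix} -3 & 1 \\ 2 & -2 \end{bmatrix} + 3^2 I \right)^{-1}
= \begin{bmatrix} 6 & 1 \\ 2 & 7 \end{bmatrix}^{-1}
= \begin{bmatrix} \frac{7}{40} & \frac{-1}{40} \\ \frac{-1}{20} & \frac{3}{20} \end{bmatrix}.
\]
Hence,
\[
\vec{c} = (A + \omega^2 I)^{-1} (-\vec{F})
= \begin{bmatrix} \frac{7}{40} & \frac{-1}{40} \\ \frac{-1}{20} & \frac{3}{20} \end{bmatrix} \begin{bmatrix} 0 \\ -2 \end{bmatrix}
= \begin{bmatrix} \frac{1}{20} \\ \frac{-3}{10} \end{bmatrix}.
\]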
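The computation of $\vec{c}$ can be reproduced in exact arithmetic. The following sketch uses Python's standard `fractions` module and Cramer's rule for the $2 \times 2$ case only; the helper names `solve_2x2` and `particular_amplitude` are made up for this illustration and do not come from the text.

```python
from fractions import Fraction

def solve_2x2(M, rhs):
    """Solve M c = rhs for a 2x2 matrix M by Cramer's rule, in exact arithmetic."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is singular")
    return [Fraction(rhs[0] * d - b * rhs[1]) / det,
            Fraction(a * rhs[1] - c * rhs[0]) / det]

def particular_amplitude(A, F, omega):
    """Return c with (A + omega^2 I) c = -F.

    Then c cos(omega t) is a particular solution of x'' = A x + F cos(omega t),
    provided omega is not a natural frequency (otherwise the matrix is singular).
    """
    w2 = omega * omega
    shifted = [[A[0][0] + w2, A[0][1]],
               [A[1][0], A[1][1] + w2]]
    return solve_2x2(shifted, [-F[0], -F[1]])

c = particular_amplitude([[-3, 1], [2, -2]], [0, 2], 3)
print(c)  # the two entries of the amplitude vector, as exact fractions
```

Calling `particular_amplitude` with `omega` equal to a natural frequency (1 or 2 in this example) raises the `ValueError`, which mirrors the invertibility condition discussed above.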
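Combining with the general solution of the associated homogeneous problem, we get
that the general solution to $\vec{x}'' = A \vec{x} + \vec{F} \cos(\omega t)$ is
\[
\vec{x} = \vec{x}_c + \vec{x}_p
= \begin{bmatrix} 1 \\ 2 \end{bmatrix} \bigl( a_1 \cos(t) + b_1 \sin(t) \bigr)
+ \begin{bmatrix} 1 \\ -1 \end{bmatrix} \bigl( a_2 \cos(2t) + b_2 \sin(2t) \bigr)
+ \begin{bmatrix} \frac{1}{20} \\ \frac{-3}{10} \end{bmatrix} \cos(3t).
\]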
We would then solve for the constants 𝑎1 , 𝑎 2 , 𝑏1 , and 𝑏 2 using any given initial conditions.
Note that given force 𝑓®, we write the equation as 𝑀 𝑥®′′ = 𝐾 𝑥® + 𝑓® to get the units right.
Then we write 𝑥®′′ = 𝑀 −1 𝐾 𝑥® + 𝑀 −1 𝑓®. The term 𝑔® = 𝑀 −1 𝑓® in 𝑥®′′ = 𝐴 𝑥® + 𝑔® is in units of force
per unit mass.
If $\omega$ is a natural frequency of the system, resonance may occur, because we will have to
try a particular solution of the form
\[
\vec{x}_p = \vec{c}\, t \sin(\omega t) + \vec{d} \cos(\omega t).
\]
That is assuming that the eigenvalues of the coefficient matrix are distinct. Next, note that
the amplitude of this solution grows without bound as $t$ grows.

3.6.4 Exercises
Exercise 3.6.3: Find a particular solution to
\[
  \vec{x}'' = \begin{bmatrix} -3 & 1 \\ 2 & -2 \end{bmatrix} \vec{x} + \begin{bmatrix} 0 \\ 2 \end{bmatrix} \cos(2t).
\]

Exercise 3.6.4 (challenging): Let us take the example in Figure 3.13 on page 154 with the same
parameters as before: 𝑚1 = 2, 𝑘1 = 4, and 𝑘2 = 2, except for 𝑚2 , which is unknown. Suppose
that there is a force cos(5𝑡) acting on the first mass. Find an 𝑚2 such that there exists a particular
solution where the first mass does not move.
Note: This idea is called dynamic damping. In practice there will be a small amount of
damping and so any transient solution will disappear and after long enough time, the first mass will
always come to a stop.

Exercise 3.6.5: Take Example 3.6.2 on page 156, but suppose that at the time of impact, car 2 is
moving to the left at a speed of 3 m/s.

a) Find the behavior of the system after linkup.


b) Will the second car hit the wall, or will it be moving away from the wall as time goes on?
c) At what speed would the first car have to be traveling for the system to essentially stay in
place after linkup?

Exercise 3.6.6: Let us take the example in Figure 3.13 on page 154 with parameters 𝑚1 = 𝑚2 = 1,
𝑘1 = 𝑘 2 = 1. Does there exist a set of initial conditions for which the first cart moves but the second
cart does not? If so, find those conditions. If not, argue why not.
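Exercise 3.6.101: Find the general solution to
\[
  \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{bmatrix} \vec{x}'' =
  \begin{bmatrix} -3 & 0 & 0 \\ 2 & -4 & 0 \\ 0 & 6 & -3 \end{bmatrix} \vec{x} +
  \begin{bmatrix} \cos(2t) \\ 0 \\ 0 \end{bmatrix}.
\]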

Exercise 3.6.102: Suppose there are three carts of equal mass 𝑚 and connected by two springs of
constant 𝑘 (and no connections to walls). Set up the system and find its general solution.

Exercise 3.6.103: Suppose a cart of mass 2 kg is attached by a spring of constant 𝑘 = 1 to a cart of
mass 3 kg, which is attached to the wall by a spring also of constant 𝑘 = 1. Suppose that the initial
position of the first cart is 1 meter in the positive direction from the rest position, and the second
mass starts at the rest position. The masses are not moving and are let go. Find the position of the
second mass as a function of time.

3.7 Multiple eigenvalues


Note: 1 or 1.5 lectures, §5.5 in [EP], §7.8 in [BD]
It may happen that a matrix 𝐴 has some “repeated” eigenvalues. That is, the character-
istic equation det(𝐴 − 𝜆𝐼) = 0 may have repeated roots. This is actually unlikely to happen
for a random matrix. If we take a small perturbation of 𝐴 (we change the entries of 𝐴
slightly), we get a matrix with distinct eigenvalues. As any system we want to solve in
practice is an approximation to reality anyway, it is not absolutely indispensable to know
how to solve these corner cases. On the other hand, these cases do come up in applications
from time to time. Furthermore, if we have distinct but very close eigenvalues, the behavior
is similar to that of repeated eigenvalues, and so understanding that case will give us
insight into what is going on.

3.7.1 Geometric multiplicity


Take the diagonal matrix
\[
  A = \begin{bmatrix} 3 & 0 \\ 0 & 3 \end{bmatrix}.
\]
$A$ has an eigenvalue 3 of multiplicity 2. We call the multiplicity of the eigenvalue in the
characteristic equation the algebraic multiplicity. In this case, there also exist 2 linearly
independent eigenvectors, $\begin{bmatrix} 1 \\ 0 \end{bmatrix}$ and $\begin{bmatrix} 0 \\ 1 \end{bmatrix}$, corresponding to the eigenvalue 3. This means
that the so-called geometric multiplicity of this eigenvalue is also 2.
In all the theorems where we required a matrix to have $n$ distinct eigenvalues, we only
really needed to have $n$ linearly independent eigenvectors. For example, $\vec{x}' = A \vec{x}$ has the
general solution
\[
  \vec{x} = c_1 \begin{bmatrix} 1 \\ 0 \end{bmatrix} e^{3t} + c_2 \begin{bmatrix} 0 \\ 1 \end{bmatrix} e^{3t}.
\]
We restate the theorem about real eigenvalues. In the theorem, we will repeat eigenvalues
according to (algebraic) multiplicity. So for the matrix 𝐴 above, we would say that it has
eigenvalues 3 and 3.

Theorem 3.7.1. Suppose the 𝑛 × 𝑛 matrix 𝑃 has 𝑛 real eigenvalues (not necessarily distinct), 𝜆1 ,
𝜆2 , . . . , 𝜆𝑛 , and there are 𝑛 linearly independent corresponding eigenvectors 𝑣®1 , 𝑣®2 , . . . , 𝑣® 𝑛 . Then
the general solution to 𝑥®′ = 𝑃 𝑥® can be written as

\[
  \vec{x} = c_1 \vec{v}_1 e^{\lambda_1 t} + c_2 \vec{v}_2 e^{\lambda_2 t} + \cdots + c_n \vec{v}_n e^{\lambda_n t}.
\]

The geometric multiplicity of an eigenvalue of algebraic multiplicity 𝑛 is equal to the
number of corresponding linearly independent eigenvectors. The geometric multiplicity is
always less than or equal to the algebraic multiplicity. The theorem handles the case when
these two multiplicities are equal for all eigenvalues. If for an eigenvalue the geometric
multiplicity is equal to the algebraic multiplicity, then we say the eigenvalue is complete.

In other words, the hypothesis of the theorem could be stated as saying that if all the
eigenvalues of 𝑃 are complete, then there are 𝑛 linearly independent eigenvectors and thus
we have the given general solution.
If the geometric multiplicity of an eigenvalue is 2 or greater, then the set of linearly
independent eigenvectors is not unique up to multiples as it was before. For example, for
the diagonal matrix $A = \begin{bmatrix} 3 & 0 \\ 0 & 3 \end{bmatrix}$ we could also pick eigenvectors $\begin{bmatrix} 1 \\ 1 \end{bmatrix}$ and $\begin{bmatrix} 1 \\ -1 \end{bmatrix}$, or in fact any
pair of two linearly independent vectors. The number of linearly independent eigenvectors
corresponding to $\lambda$ is the number of free variables we obtain when solving $A \vec{v} = \lambda \vec{v}$. We
pick specific values for those free variables to obtain eigenvectors. If you pick different
values, you may get different eigenvectors.

3.7.2 Defective eigenvalues


If an 𝑛 × 𝑛 matrix has fewer than 𝑛 linearly independent eigenvectors, it is said to be deficient.
Then there is at least one eigenvalue with an algebraic multiplicity that is higher than its
geometric multiplicity. We call this eigenvalue defective and the difference between the two
multiplicities we call the defect.
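Example 3.7.1: The matrix
\[
  A = \begin{bmatrix} 3 & 1 \\ 0 & 3 \end{bmatrix}
\]
has an eigenvalue 3 of algebraic multiplicity 2. As
$A - \lambda I = \begin{bmatrix} 3 & 1 \\ 0 & 3 \end{bmatrix} - 3 \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}$, to
compute the eigenvectors, we must solve
\[
  \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = \vec{0}.
\]
So $v_2 = 0$. All eigenvectors are of the form $\begin{bmatrix} v_1 \\ 0 \end{bmatrix}$. Any two such vectors are linearly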
dependent, and hence the geometric multiplicity of the eigenvalue is 1. Therefore, the
defect is 1, and we can no longer apply the eigenvalue method directly to a system of ODEs
with such a coefficient matrix.
To solve such an ODE, we need a new idea. Roughly, the key observation we will use is
that if $\lambda$ is an eigenvalue of $A$ of algebraic multiplicity $m$, then it is possible to find certain
$m$ linearly independent vectors solving $(A - \lambda I)^k \vec{v} = \vec{0}$ for various powers $k$. We will call
these generalized eigenvectors.
We continue with the example equation $\vec{x}' = A \vec{x}$ where $A = \begin{bmatrix} 3 & 1 \\ 0 & 3 \end{bmatrix}$. We found an
eigenvalue $\lambda = 3$ of (algebraic) multiplicity 2 and defect 1. We found one eigenvector
$\vec{v} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$. We have one solution
\[
  \vec{x}_1 = \vec{v} e^{3t} = \begin{bmatrix} 1 \\ 0 \end{bmatrix} e^{3t}.
\]

We are now stuck: we get no other solutions from standard eigenvectors. But we need two
linearly independent solutions to find the general solution of the equation.
In the spirit of repeated roots of the characteristic equation for a single equation, we try
another solution of the form
\[
  \vec{x}_2 = (\vec{v}_2 + \vec{v}_1 t) e^{3t}.
\]
We differentiate to get
\[
  \vec{x}_2' = \vec{v}_1 e^{3t} + 3(\vec{v}_2 + \vec{v}_1 t) e^{3t} = (3\vec{v}_2 + \vec{v}_1) e^{3t} + 3\vec{v}_1 t e^{3t}.
\]
As we are assuming that $\vec{x}_2$ is a solution, $\vec{x}_2'$ must equal $A \vec{x}_2$. So we compute $A \vec{x}_2$:
\[
  A \vec{x}_2 = A(\vec{v}_2 + \vec{v}_1 t) e^{3t} = A\vec{v}_2 e^{3t} + A\vec{v}_1 t e^{3t}.
\]
By looking at the coefficients of $e^{3t}$ and $t e^{3t}$, we see $3\vec{v}_2 + \vec{v}_1 = A\vec{v}_2$ and $3\vec{v}_1 = A\vec{v}_1$. This
means that
\[
  (A - 3I)\vec{v}_2 = \vec{v}_1, \qquad \text{and} \qquad (A - 3I)\vec{v}_1 = \vec{0}.
\]
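Therefore, $\vec{x}_2$ is a solution if these two equations are satisfied. The second equation is
satisfied if $\vec{v}_1$ is an eigenvector, and we found the eigenvector above, so let $\vec{v}_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$. So, if
we can find a $\vec{v}_2$ that solves $(A - 3I)\vec{v}_2 = \vec{v}_1$, then we are done. This is just a bunch of linear
equations to solve and we are by now very good at that. Let us solve $(A - 3I)\vec{v}_2 = \vec{v}_1$. Write
\[
  \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}.
\]
By inspection, we see that letting $a = 0$ ($a$ could be anything in fact) and $b = 1$ does the job.
Hence we can take $\vec{v}_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$. Our general solution to $\vec{x}' = A \vec{x}$ is
\[
  \vec{x} = c_1 \begin{bmatrix} 1 \\ 0 \end{bmatrix} e^{3t}
  + c_2 \left( \begin{bmatrix} 0 \\ 1 \end{bmatrix} + \begin{bmatrix} 1 \\ 0 \end{bmatrix} t \right) e^{3t}
  = \begin{bmatrix} c_1 e^{3t} + c_2 t e^{3t} \\ c_2 e^{3t} \end{bmatrix}.
\]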

Let us check that we really do have the solution. First $x_1' = c_1 3e^{3t} + c_2 e^{3t} + 3c_2 t e^{3t} = 3x_1 + x_2$.
Good. Now $x_2' = 3c_2 e^{3t} = 3x_2$. Good.
In the example, if we plug $(A - 3I)\vec{v}_2 = \vec{v}_1$ into $(A - 3I)\vec{v}_1 = \vec{0}$ we find
\[
  (A - 3I)(A - 3I)\vec{v}_2 = \vec{0}, \qquad \text{or} \qquad (A - 3I)^2 \vec{v}_2 = \vec{0}.
\]
If $(A - 3I)\vec{w} \neq \vec{0}$ and $(A - 3I)^2 \vec{w} = \vec{0}$, then $(A - 3I)\vec{w}$ is an eigenvector, a multiple of $\vec{v}_1$. In this
$2 \times 2$ case, $(A - 3I)^2$ is just the zero matrix (exercise). So any vector $\vec{w}$ solves $(A - 3I)^2 \vec{w} = \vec{0}$
and we just need a $\vec{w}$ such that $(A - 3I)\vec{w} \neq \vec{0}$. Then we could use $\vec{w}$ for $\vec{v}_2$ and $(A - 3I)\vec{w}$
for $\vec{v}_1$.
Note this example system 𝑥®′ = 𝐴 𝑥® has a simpler solution since 𝐴 is a so-called upper
triangular matrix, that is, every entry below the diagonal is zero. In particular, the equation
for 𝑥2 does not depend on 𝑥1 . Mind you, not every defective matrix is triangular.

Exercise 3.7.1: Solve $\vec{x}' = \begin{bmatrix} 3 & 1 \\ 0 & 3 \end{bmatrix} \vec{x}$ by first solving for $x_2$ and then for $x_1$ independently. Check that
you got the same solution as we did above.


Let us describe the general algorithm. Suppose that $\lambda$ is an eigenvalue of multiplicity 2,
defect 1. First find an eigenvector $\vec{v}_1$ of $\lambda$. That is, $\vec{v}_1$ solves $(A - \lambda I)\vec{v}_1 = \vec{0}$. Then, find a
vector $\vec{v}_2$ such that
\[
  (A - \lambda I)\vec{v}_2 = \vec{v}_1.
\]
This gives us two linearly independent solutions
\[
  \vec{x}_1 = \vec{v}_1 e^{\lambda t}, \qquad
  \vec{x}_2 = \bigl( \vec{v}_2 + \vec{v}_1 t \bigr) e^{\lambda t}.
\]


Example 3.7.2: Consider the system
\[
  \vec{x}' = \begin{bmatrix} 2 & -5 & 0 \\ 0 & 2 & 0 \\ -1 & 4 & 1 \end{bmatrix} \vec{x}.
\]
Compute the eigenvalues,
\[
  0 = \det(A - \lambda I)
  = \det \left( \begin{bmatrix} 2 - \lambda & -5 & 0 \\ 0 & 2 - \lambda & 0 \\ -1 & 4 & 1 - \lambda \end{bmatrix} \right)
  = (2 - \lambda)^2 (1 - \lambda).
\]
The eigenvalues are 1 and 2, where 2 has multiplicity 2. We leave it to the reader to find
that $\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$ is an eigenvector for the eigenvalue $\lambda = 1$.
We focus on $\lambda = 2$. We compute eigenvectors:
\[
  \vec{0} = (A - 2I)\vec{v} =
  \begin{bmatrix} 0 & -5 & 0 \\ 0 & 0 & 0 \\ -1 & 4 & -1 \end{bmatrix}
  \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix}.
\]
The first equation says that $v_2 = 0$, so the last equation is $-v_1 - v_3 = 0$. Let $v_3$ be the free
variable to find that $v_1 = -v_3$. Perhaps let $v_3 = -1$ to find an eigenvector $\begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix}$. Problem is
that setting $v_3$ to anything else just gets multiples of this vector and so we have a defect
of 1. Let $\vec{v}_1$ be the eigenvector and we look for a generalized eigenvector $\vec{v}_2$:
\[
  (A - 2I)\vec{v}_2 = \vec{v}_1,
\]
or
\[
  \begin{bmatrix} 0 & -5 & 0 \\ 0 & 0 & 0 \\ -1 & 4 & -1 \end{bmatrix}
  \begin{bmatrix} a \\ b \\ c \end{bmatrix} =
  \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix},
\]
where we used $a$, $b$, $c$ as components of $\vec{v}_2$ for simplicity. The first equation says $-5b = 1$
so $b = -1/5$. The second equation says nothing. The last equation is $-a + 4b - c = -1$, or
$a + 4/5 + c = 1$, or $a + c = 1/5$. We let $c$ be the free variable, and we choose $c = 0$. We find
\[
  \vec{v}_2 = \begin{bmatrix} 1/5 \\ -1/5 \\ 0 \end{bmatrix}.
\]
The general solution is therefore,
\[
  \vec{x} = c_1 \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} e^{t}
  + c_2 \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix} e^{2t}
  + c_3 \left( \begin{bmatrix} 1/5 \\ -1/5 \\ 0 \end{bmatrix} + \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix} t \right) e^{2t}.
\]
   
This machinery can also be generalized to higher multiplicities and higher defects. We
will not go over this method in detail, but we sketch the idea. Suppose that $A$ has an
eigenvalue $\lambda$ of multiplicity $m$. We find vectors such that
\[
  (A - \lambda I)^k \vec{v} = \vec{0}, \qquad \text{but} \qquad (A - \lambda I)^{k-1} \vec{v} \neq \vec{0}.
\]
Such vectors are called generalized eigenvectors (then $\vec{v}_1 = (A - \lambda I)^{k-1} \vec{v}$ is an eigenvector).
For the eigenvector $\vec{v}_1$ there is a chain of generalized eigenvectors $\vec{v}_2$ through $\vec{v}_k$ such that:
\[
  \begin{aligned}
    (A - \lambda I)\vec{v}_1 &= \vec{0}, \\
    (A - \lambda I)\vec{v}_2 &= \vec{v}_1, \\
    &\;\;\vdots \\
    (A - \lambda I)\vec{v}_k &= \vec{v}_{k-1}.
  \end{aligned}
\]

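Really once you find the $\vec{v}_k$ such that $(A - \lambda I)^k \vec{v}_k = \vec{0}$ but $(A - \lambda I)^{k-1} \vec{v}_k \neq \vec{0}$, you find the
entire chain since you can compute the rest, $\vec{v}_{k-1} = (A - \lambda I)\vec{v}_k$, $\vec{v}_{k-2} = (A - \lambda I)\vec{v}_{k-1}$, etc. We
form the linearly independent solutions
\[
  \begin{aligned}
    \vec{x}_1 &= \vec{v}_1 e^{\lambda t}, \\
    \vec{x}_2 &= (\vec{v}_2 + \vec{v}_1 t) e^{\lambda t}, \\
    &\;\;\vdots \\
    \vec{x}_k &= \left( \vec{v}_k + \vec{v}_{k-1} t + \vec{v}_{k-2} \frac{t^2}{2} + \cdots
      + \vec{v}_2 \frac{t^{k-2}}{(k-2)!} + \vec{v}_1 \frac{t^{k-1}}{(k-1)!} \right) e^{\lambda t}.
  \end{aligned}
\]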

Recall that $k! = 1 \cdot 2 \cdot 3 \cdots (k-1) \cdot k$ is the factorial. If you have an eigenvalue of geometric
multiplicity $\ell$, you will have to find $\ell$ such chains (some of them might be short: just the
single eigenvector equation). We go until we form $m$ linearly independent solutions where
$m$ is the algebraic multiplicity. We do not quite know which specific eigenvectors go with
which chain, so start by finding $\vec{v}_k$ first for the longest possible chain and go from there.
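For example, if $\lambda$ is an eigenvalue of $A$ of algebraic multiplicity 3 and defect 2, then
solve
\[
  (A - \lambda I)\vec{v}_1 = \vec{0}, \qquad (A - \lambda I)\vec{v}_2 = \vec{v}_1, \qquad (A - \lambda I)\vec{v}_3 = \vec{v}_2.
\]
That is, find $\vec{v}_3$ such that $(A - \lambda I)^3 \vec{v}_3 = \vec{0}$, but $(A - \lambda I)^2 \vec{v}_3 \neq \vec{0}$. Then you are done as
$\vec{v}_2 = (A - \lambda I)\vec{v}_3$ and $\vec{v}_1 = (A - \lambda I)\vec{v}_2$. The 3 linearly independent solutions are
\[
  \vec{x}_1 = \vec{v}_1 e^{\lambda t}, \qquad
  \vec{x}_2 = (\vec{v}_2 + \vec{v}_1 t) e^{\lambda t}, \qquad
  \vec{x}_3 = \left( \vec{v}_3 + \vec{v}_2 t + \vec{v}_1 \frac{t^2}{2} \right) e^{\lambda t}.
\]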

If, on the other hand, $A$ has an eigenvalue $\lambda$ of algebraic multiplicity 3 and defect 1,
then solve
\[
  (A - \lambda I)\vec{v}_1 = \vec{0}, \qquad (A - \lambda I)\vec{v}_2 = \vec{0}, \qquad (A - \lambda I)\vec{v}_3 = \vec{v}_2.
\]
Here $\vec{v}_1$ and $\vec{v}_2$ are actual honest eigenvectors, and $\vec{v}_3$ is a generalized eigenvector. So
there are two chains. To solve, first find a $\vec{v}_3$ such that $(A - \lambda I)^2 \vec{v}_3 = \vec{0}$, but $(A - \lambda I)\vec{v}_3 \neq \vec{0}$.
Then $\vec{v}_2 = (A - \lambda I)\vec{v}_3$ is going to be an eigenvector. Then solve for an eigenvector $\vec{v}_1$ that is
linearly independent from $\vec{v}_2$. You get 3 linearly independent solutions
\[
  \vec{x}_1 = \vec{v}_1 e^{\lambda t}, \qquad
  \vec{x}_2 = \vec{v}_2 e^{\lambda t}, \qquad
  \vec{x}_3 = (\vec{v}_3 + \vec{v}_2 t) e^{\lambda t}.
\]
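
The chain construction can be sketched in a few lines of Python. Everything named below (`build_chain`, `chain_solution`, `residual`, and the test matrix) is an assumption made for illustration only; the test matrix is a hypothetical $3 \times 3$ example with eigenvalue 2 of algebraic multiplicity 3 and defect 2. The code starts from the top vector of the chain, computes the rest by repeated multiplication by $A - \lambda I$, and then checks numerically, with a central difference, that each $\vec{x}_j$ satisfies $\vec{x}' = A \vec{x}$.

```python
import math

def matvec(M, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def build_chain(A, lam, v_top, length):
    """Return [v1, ..., vk] with v_{j-1} = (A - lam I) v_j, where vk = v_top."""
    n = len(A)
    N = [[A[i][j] - (lam if i == j else 0) for j in range(n)] for i in range(n)]
    chain = [list(v_top)]
    for _ in range(length - 1):
        chain.append(matvec(N, chain[-1]))
    chain.reverse()  # chain[0] is now v1, which must be a nonzero eigenvector
    if not any(chain[0]) or max(abs(c) for c in matvec(N, chain[0])) > 1e-12:
        raise ValueError("v_top does not generate a chain of this length")
    return chain

def chain_solution(chain, lam, j, t):
    """Evaluate x_j(t) = (v_j + v_{j-1} t + ... + v_1 t^(j-1)/(j-1)!) e^(lam t)."""
    n = len(chain[0])
    total = [0.0] * n
    for i in range(j):
        coeff = t ** i / math.factorial(i)
        total = [total[r] + coeff * chain[j - 1 - i][r] for r in range(n)]
    return [math.exp(lam * t) * c for c in total]

def residual(A, chain, lam, j, t, h=1e-5):
    """Largest entry of |x_j'(t) - A x_j(t)|, with x_j' from a central difference."""
    plus = chain_solution(chain, lam, j, t + h)
    minus = chain_solution(chain, lam, j, t - h)
    deriv = [(p - m) / (2 * h) for p, m in zip(plus, minus)]
    Ax = matvec(A, chain_solution(chain, lam, j, t))
    return max(abs(d - a) for d, a in zip(deriv, Ax))

# Hypothetical test matrix: eigenvalue 2, algebraic multiplicity 3, defect 2.
A = [[2.0, 1.0, 0.0],
     [0.0, 2.0, 1.0],
     [0.0, 0.0, 2.0]]
chain = build_chain(A, 2.0, [0.0, 0.0, 1.0], 3)
residuals = [residual(A, chain, 2.0, j, 0.7) for j in (1, 2, 3)]
print(chain)
print(residuals)  # all three should be tiny (finite-difference error only)
```

The finite-difference check is only approximate; for an exact check one would compare the coefficients of $e^{\lambda t}$, $t e^{\lambda t}$, and so on, which reduces to the chain equations themselves.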

3.7.3 Exercises
Exercise 3.7.2: Let $A = \begin{bmatrix} 5 & -3 \\ 3 & -1 \end{bmatrix}$. Find the general solution of $\vec{x}' = A \vec{x}$.

Exercise 3.7.3: Let $A = \begin{bmatrix} 5 & -4 & 4 \\ 0 & 3 & 0 \\ -2 & 4 & -1 \end{bmatrix}$.

a) What are the eigenvalues?


b) What is/are the defect(s) of the eigenvalue(s)?
c) Find the general solution of 𝑥®′ = 𝐴 𝑥®.
Exercise 3.7.4: Let $A = \begin{bmatrix} 2 & 1 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \end{bmatrix}$.

a) What are the eigenvalues?


b) What is/are the defect(s) of the eigenvalue(s)?
c) Find the general solution of 𝑥®′ = 𝐴 𝑥® in two different ways and verify you get the same answer.
Exercise 3.7.5: Let $A = \begin{bmatrix} 0 & 1 & 2 \\ -1 & -2 & -2 \\ -4 & 4 & 7 \end{bmatrix}$.

a) What are the eigenvalues?


b) What is/are the defect(s) of the eigenvalue(s)?
c) Find the general solution of 𝑥®′ = 𝐴 𝑥®.

Exercise 3.7.6: Let $A = \begin{bmatrix} 0 & 4 & -2 \\ -1 & -4 & 1 \\ 0 & 0 & -2 \end{bmatrix}$.

a) What are the eigenvalues?


b) What is/are the defect(s) of the eigenvalue(s)?
c) Find the general solution of 𝑥®′ = 𝐴 𝑥®.
Exercise 3.7.7: Let $A = \begin{bmatrix} 2 & 1 & -1 \\ -1 & 0 & 2 \\ -1 & -2 & 4 \end{bmatrix}$.

a) What are the eigenvalues?


b) What is/are the defect(s) of the eigenvalue(s)?
c) Find the general solution of 𝑥®′ = 𝐴 𝑥®.

Exercise 3.7.8: Suppose that 𝐴 is a 2 × 2 matrix with a repeated eigenvalue 𝜆. Suppose that there
are two linearly independent eigenvectors. Show that 𝐴 = 𝜆𝐼.
Exercise 3.7.101: Let $A = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}$.

a) What are the eigenvalues?


b) What is/are the defect(s) of the eigenvalue(s)?
c) Find the general solution of 𝑥® ′ = 𝐴 𝑥®.
Exercise 3.7.102: Let $A = \begin{bmatrix} 1 & 3 & 3 \\ 1 & 1 & 0 \\ -1 & 1 & 2 \end{bmatrix}$.

a) What are the eigenvalues?


b) What is/are the defect(s) of the eigenvalue(s)?
c) Find the general solution of 𝑥® ′ = 𝐴 𝑥®.
Exercise 3.7.103: Let $A = \begin{bmatrix} 2 & 0 & 0 \\ -1 & -1 & 9 \\ 0 & -1 & 5 \end{bmatrix}$.

a) What are the eigenvalues?


b) What is/are the defect(s) of the eigenvalue(s)?
c) Find the general solution of 𝑥® ′ = 𝐴 𝑥®.

Exercise 3.7.104: Let $A = \begin{bmatrix} a & a \\ b & c \end{bmatrix}$, where $a$, $b$, and $c$ are unknowns. Suppose that 5 is a doubled
eigenvalue of defect 1, and suppose that $\begin{bmatrix} 1 \\ 0 \end{bmatrix}$ is a corresponding eigenvector. Find $A$ and show that
there is only one such matrix $A$.

3.8 Matrix exponentials


Note: 2 lectures, may need to refer to § 7.1, §5.6 in [EP], §7.7 in [BD]

3.8.1 Definition
There is another way of finding a fundamental matrix solution of a system. Consider the
constant-coefficient equation
𝑥®′ = 𝑃 𝑥®.
If this were just one equation (when $P$ is a number or a $1 \times 1$ matrix), then the solution
would be
\[
  \vec{x} = e^{Pt}.
\]
That does not (yet) make sense if 𝑃 is a larger matrix, but essentially the same computation
that led to the above works for matrices when we define 𝑒 𝑃𝑡 properly. First we write the
Taylor series for $e^{at}$ for some number $a$:
\[
  e^{at} = 1 + at + \frac{(at)^2}{2} + \frac{(at)^3}{6} + \frac{(at)^4}{24} + \cdots
  = \sum_{k=0}^{\infty} \frac{(at)^k}{k!}.
\]

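Recall $k! = 1 \cdot 2 \cdot 3 \cdots k$ is the factorial, and $0! = 1$. We differentiate this series term by term
\[
  \frac{d}{dt} \left( e^{at} \right)
  = 0 + a + a^2 t + \frac{a^3 t^2}{2} + \frac{a^4 t^3}{6} + \cdots
  = a \left( 1 + at + \frac{(at)^2}{2} + \frac{(at)^3}{6} + \cdots \right)
  = a e^{at}.
\]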
Maybe we can try the same trick with matrices. For an $n \times n$ matrix $A$ we define the matrix
exponential as
\[
  e^{A} \overset{\text{def}}{=} I + A + \frac{1}{2} A^2 + \frac{1}{6} A^3 + \cdots + \frac{1}{k!} A^k + \cdots
\]
We will not worry about convergence; the series really does always converge. We usually
write $Pt$ as $tP$ by convention when $P$ is a matrix. With this small change and by the exact
same calculation as above, we have that
\[
  \frac{d}{dt} \left( e^{tP} \right) = P e^{tP}.
\]
Now 𝑃 and hence 𝑒 𝑡𝑃 is an 𝑛 × 𝑛 matrix. What we are looking for is a vector. In the 1 × 1
case we would at this point multiply by an arbitrary constant to get the general solution. In
the matrix case we multiply by a column vector 𝑐®.
Theorem 3.8.1. Let 𝑃 be an 𝑛 × 𝑛 matrix. Then the general solution to 𝑥®′ = 𝑃 𝑥® is

𝑥® = 𝑒 𝑡𝑃 𝑐®,

where 𝑐® is an arbitrary constant vector. In fact, 𝑥®(0) = 𝑐®.
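
The series definition translates directly into code. The sketch below is only an illustration of the definition (partial sums of the Taylor series), not a recommended numerical method; the function names and the test system are assumptions made for this example. The test system $x_1' = x_2$, $x_2' = -x_1$ is used because its matrix exponential is known in closed form: it is the rotation matrix with entries $\cos t$ and $\pm\sin t$.

```python
import math

def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp_series(A, terms=30):
    """Partial sum I + A + A^2/2! + ... + A^(terms-1)/(terms-1)! of the series for e^A."""
    n = len(A)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]  # holds A^k / k!, starting with k = 0
    for k in range(1, terms):
        term = [[entry / k for entry in row] for row in mat_mul(term, A)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def solve_homogeneous(P, c, t):
    """Evaluate x(t) = e^{tP} c, the solution of x' = P x with x(0) = c."""
    tP = [[t * entry for entry in row] for row in P]
    E = mat_exp_series(tP)
    return [sum(E[i][j] * c[j] for j in range(len(c))) for i in range(len(c))]

# Hypothetical test system: x1' = x2, x2' = -x1, with x(0) = (1, 0).
P = [[0.0, 1.0], [-1.0, 0.0]]
x = solve_homogeneous(P, [1.0, 0.0], 1.0)
print(x)  # should be close to [cos(1), -sin(1)]
```

Thirty terms are plenty here because the entries of $tP$ are small; for matrices with large entries a plain partial sum converges slowly and loses accuracy.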



Check:
\[
  \frac{d}{dt} \vec{x} = \frac{d}{dt} \left( e^{tP} \vec{c} \right) = P e^{tP} \vec{c} = P \vec{x}.
\]
Hence $e^{tP}$ is a fundamental matrix solution of the homogeneous system. If we can
compute the matrix exponential, we have another method of solving constant-coefficient
homogeneous systems. It also makes it easy to solve for initial conditions. To solve $\vec{x}' = A \vec{x}$,
$\vec{x}(0) = \vec{b}$, we take the solution
\[
  \vec{x} = e^{tA} \vec{b}.
\]
This equation follows because $e^{0A} = I$, so $\vec{x}(0) = e^{0A} \vec{b} = \vec{b}$.
We mention a drawback of matrix exponentials. In general $e^{A+B} \neq e^A e^B$. The trouble is
that matrices do not commute, that is, in general $AB \neq BA$. If you try to rewrite
$e^A e^B$ as $e^{A+B}$ using the Taylor series, you will see why the lack of commutativity becomes a
problem. It is true that if $AB = BA$, that is, if $A$ and $B$ commute, then $e^{A+B} = e^A e^B$. We will
find this fact useful. We restate this as a theorem to make a point.
Theorem 3.8.2. If $AB = BA$, then $e^{A+B} = e^A e^B$. Otherwise, $e^{A+B} \neq e^A e^B$ in general.

3.8.2 Simple cases


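In some instances it may work to just plug into the series definition. Suppose the matrix is
diagonal. For example, $D = \begin{bmatrix} a & 0 \\ 0 & b \end{bmatrix}$. Then
\[
  D^k = \begin{bmatrix} a^k & 0 \\ 0 & b^k \end{bmatrix},
\]
and
\[
  \begin{aligned}
    e^D &= I + D + \frac{1}{2} D^2 + \frac{1}{6} D^3 + \cdots \\
    &= \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
     + \begin{bmatrix} a & 0 \\ 0 & b \end{bmatrix}
     + \frac{1}{2} \begin{bmatrix} a^2 & 0 \\ 0 & b^2 \end{bmatrix}
     + \frac{1}{6} \begin{bmatrix} a^3 & 0 \\ 0 & b^3 \end{bmatrix} + \cdots
     = \begin{bmatrix} e^a & 0 \\ 0 & e^b \end{bmatrix}.
  \end{aligned}
\]
In particular,
\[
  e^{I} = \begin{bmatrix} e & 0 \\ 0 & e \end{bmatrix}
  \qquad \text{and} \qquad
  e^{aI} = \begin{bmatrix} e^a & 0 \\ 0 & e^a \end{bmatrix}.
\]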
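This makes exponentials of certain other matrices easy to compute. For example, the
matrix $A = \begin{bmatrix} 5 & 4 \\ -1 & 1 \end{bmatrix}$ can be written as $3I + B$ where $B = \begin{bmatrix} 2 & 4 \\ -1 & -2 \end{bmatrix}$. Notice that $B^2 = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}$. So
$B^k = 0$ for all $k \geq 2$. Therefore, $e^B = I + B$. Suppose we actually want to compute $e^{tA}$. The
matrices $3tI$ and $tB$ commute (exercise: check this) and $e^{tB} = I + tB$, since $(tB)^2 = t^2 B^2 = 0$.
We write
\[
  \begin{aligned}
    e^{tA} = e^{3tI + tB} = e^{3tI} e^{tB}
    &= \begin{bmatrix} e^{3t} & 0 \\ 0 & e^{3t} \end{bmatrix} (I + tB) \\
    &= \begin{bmatrix} e^{3t} & 0 \\ 0 & e^{3t} \end{bmatrix}
       \begin{bmatrix} 1 + 2t & 4t \\ -t & 1 - 2t \end{bmatrix}
     = \begin{bmatrix} (1 + 2t) e^{3t} & 4t e^{3t} \\ -t e^{3t} & (1 - 2t) e^{3t} \end{bmatrix}.
  \end{aligned}
\]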

We found a fundamental matrix solution for the system 𝑥®′ = 𝐴 𝑥®. Note that this matrix has
a repeated eigenvalue with a defect; there is only one eigenvector for the eigenvalue 3. So
we found a perhaps easier way to handle this case. In fact, if a matrix 𝐴 is 2 × 2 and has an
eigenvalue 𝜆 of multiplicity 2, then either 𝐴 = 𝜆𝐼, or 𝐴 = 𝜆𝐼 + 𝐵 where 𝐵2 = 0. This is a
good exercise.

Exercise 3.8.1: Suppose that $A$ is $2 \times 2$ and $\lambda$ is the only eigenvalue. Show that $(A - \lambda I)^2 = 0$,
and therefore that we can write $A = \lambda I + B$, where $B^2 = 0$ (and possibly $B = 0$). Hint: First write
down what it means for the eigenvalue to be of multiplicity 2. You will get an equation for the
entries. Now compute the square of $B$.

Matrices 𝐵 such that 𝐵 𝑘 = 0 for some 𝑘 are called nilpotent. Computation of the matrix
exponential for nilpotent matrices is easy by just writing down the first 𝑘 terms of the
Taylor series.
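
For a nilpotent matrix the Taylor series is a finite sum, so it can be evaluated exactly term by term. The sketch below (function names are made up for this illustration, and it assumes $B$ has integer entries so that the test for a zero power is exact) computes $e^{tB}$ this way and then recovers $e^{tA}$ for the matrix $A = 3I + B$ used above as $e^{3t} e^{tB}$.

```python
import math

def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def exp_t_nilpotent(B, t):
    """e^{tB} for a nilpotent integer matrix B: the finite sum of t^k B^k / k!."""
    n = len(B)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    power = [[1 if i == j else 0 for j in range(n)] for i in range(n)]  # B^k, exact
    for k in range(1, n + 1):
        power = mat_mul(power, B)
        if not any(any(row) for row in power):
            return result  # B^k = 0, so every later term vanishes too
        coeff = t ** k / math.factorial(k)
        result = [[result[i][j] + coeff * power[i][j] for j in range(n)]
                  for i in range(n)]
    # An n x n nilpotent matrix always satisfies B^n = 0.
    raise ValueError("B^n is nonzero, so B is not nilpotent")

def exp_tA(t):
    """e^{tA} for A = 3I + B with B = [[2, 4], [-1, -2]], using e^{3tI} e^{tB}."""
    B = [[2, 4], [-1, -2]]
    scale = math.exp(3 * t)
    return [[scale * entry for entry in row] for row in exp_t_nilpotent(B, t)]

t = 0.3
computed = exp_tA(t)
# Closed form from the text, for comparison.
expected = [[(1 + 2 * t) * math.exp(3 * t), 4 * t * math.exp(3 * t)],
            [-t * math.exp(3 * t), (1 - 2 * t) * math.exp(3 * t)]]
print(computed)
print(expected)
```

Splitting off the multiple of the identity works only because $3tI$ commutes with $tB$; for a general sum of matrices the product of exponentials is not the exponential of the sum.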

3.8.3 General matrices


In general, the exponential is not as easy to compute as above. We usually cannot write a
matrix as a sum of commuting matrices where the exponential is simple for each one. But
fear not, it is still not too difficult provided we can find enough eigenvectors. First we need
the following interesting result about matrix exponentials. For two square matrices $A$ and
$B$, with $B$ invertible, we have
\[
  e^{BAB^{-1}} = B e^{A} B^{-1}.
\]
This can be seen by writing down the Taylor series. First
\[
  (BAB^{-1})^2 = BAB^{-1} BAB^{-1} = BAIAB^{-1} = BA^2 B^{-1}.
\]
And by the same reasoning $(BAB^{-1})^k = BA^k B^{-1}$. Now write the Taylor series for $e^{BAB^{-1}}$:
\[
  \begin{aligned}
    e^{BAB^{-1}} &= I + BAB^{-1} + \frac{1}{2} (BAB^{-1})^2 + \frac{1}{6} (BAB^{-1})^3 + \cdots \\
    &= BB^{-1} + BAB^{-1} + \frac{1}{2} BA^2 B^{-1} + \frac{1}{6} BA^3 B^{-1} + \cdots \\
    &= B \left( I + A + \frac{1}{2} A^2 + \frac{1}{6} A^3 + \cdots \right) B^{-1} \\
    &= B e^{A} B^{-1}.
  \end{aligned}
\]

Given a square matrix $A$, we can usually write $A = EDE^{-1}$, where $D$ is diagonal and
$E$ invertible. This procedure is called diagonalization. If we can do that, the computation
of the exponential becomes easy as $e^D$ is just taking the exponential of the entries on the
diagonal. Adding $t$ into the mix, we can then compute the exponential
\[
  e^{tA} = E e^{tD} E^{-1}.
\]

To diagonalize $A$ we need $n$ linearly independent eigenvectors of $A$. Otherwise, this
method of computing the exponential does not work and we need to be trickier, but
we will not get into such details. Let $E$ be the matrix with the eigenvectors as columns.
Let $\lambda_1, \lambda_2, \ldots, \lambda_n$ be the eigenvalues and let $\vec{v}_1, \vec{v}_2, \ldots, \vec{v}_n$ be the eigenvectors, then
$E = [\, \vec{v}_1 \ \vec{v}_2 \ \cdots \ \vec{v}_n \,]$. Make a diagonal matrix $D$ with the eigenvalues on the diagonal:
\[
  D = \begin{bmatrix}
    \lambda_1 & 0 & \cdots & 0 \\
    0 & \lambda_2 & \cdots & 0 \\
    \vdots & \vdots & \ddots & \vdots \\
    0 & 0 & \cdots & \lambda_n
  \end{bmatrix}.
\]
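We compute
\[
  \begin{aligned}
    AE &= A [\, \vec{v}_1 \ \vec{v}_2 \ \cdots \ \vec{v}_n \,] \\
    &= [\, A\vec{v}_1 \ A\vec{v}_2 \ \cdots \ A\vec{v}_n \,] \\
    &= [\, \lambda_1 \vec{v}_1 \ \lambda_2 \vec{v}_2 \ \cdots \ \lambda_n \vec{v}_n \,] \\
    &= [\, \vec{v}_1 \ \vec{v}_2 \ \cdots \ \vec{v}_n \,] D \\
    &= ED.
  \end{aligned}
\]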

The columns of $E$ are linearly independent as these are linearly independent eigenvectors
of $A$. Hence $E$ is invertible. Since $AE = ED$, we multiply on the right by $E^{-1}$ and we get
\[
  A = EDE^{-1}.
\]
This means that $e^A = E e^D E^{-1}$. Multiplying the matrix by $t$, we obtain
\[
  e^{tA} = E e^{tD} E^{-1} = E
  \begin{bmatrix}
    e^{\lambda_1 t} & 0 & \cdots & 0 \\
    0 & e^{\lambda_2 t} & \cdots & 0 \\
    \vdots & \vdots & \ddots & \vdots \\
    0 & 0 & \cdots & e^{\lambda_n t}
  \end{bmatrix} E^{-1}.
  \tag{3.5}
\]

The formula (3.5), therefore, gives the formula for computing a fundamental matrix solution
𝑒 𝑡𝐴 for the system 𝑥®′ = 𝐴 𝑥®, in the case where we have 𝑛 linearly independent eigenvectors.
This computation still works when the eigenvalues and eigenvectors are complex,
though then you have to compute with complex numbers. It is clear from the definition
that if 𝐴 is real, then 𝑒 𝑡𝐴 is real. So you will only need complex numbers in the computation
and not for the result. You may need to apply Euler’s formula to simplify the result. If
simplified properly, the final matrix will not have any complex numbers in it.
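Example 3.8.1: Compute a fundamental matrix solution using the matrix exponential for
the system
\[
  \begin{bmatrix} x \\ y \end{bmatrix}' = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}.
\]
Then compute the particular solution for the initial conditions $x(0) = 4$ and $y(0) = 2$.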

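Let $A$ be the coefficient matrix $\begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}$. We first compute (exercise) that the eigenvalues
are 3 and $-1$ and corresponding eigenvectors are $\begin{bmatrix} 1 \\ 1 \end{bmatrix}$ and $\begin{bmatrix} 1 \\ -1 \end{bmatrix}$. Hence the diagonalization
of $A$ is
\[
  \underbrace{\begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}}_{A} =
  \underbrace{\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}}_{E}
  \underbrace{\begin{bmatrix} 3 & 0 \\ 0 & -1 \end{bmatrix}}_{D}
  \underbrace{\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}^{-1}}_{E^{-1}}.
\]
We write
\[
  \begin{aligned}
    e^{tA} = E e^{tD} E^{-1}
    &= \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}
       \begin{bmatrix} e^{3t} & 0 \\ 0 & e^{-t} \end{bmatrix}
       \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}^{-1} \\
    &= \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}
       \begin{bmatrix} e^{3t} & 0 \\ 0 & e^{-t} \end{bmatrix}
       \frac{-1}{2} \begin{bmatrix} -1 & -1 \\ -1 & 1 \end{bmatrix} \\
    &= \frac{-1}{2} \begin{bmatrix} e^{3t} & e^{-t} \\ e^{3t} & -e^{-t} \end{bmatrix}
       \begin{bmatrix} -1 & -1 \\ -1 & 1 \end{bmatrix} \\
    &= \frac{-1}{2} \begin{bmatrix} -e^{3t} - e^{-t} & -e^{3t} + e^{-t} \\ -e^{3t} + e^{-t} & -e^{3t} - e^{-t} \end{bmatrix}
     = \begin{bmatrix} \frac{e^{3t} + e^{-t}}{2} & \frac{e^{3t} - e^{-t}}{2} \\[2pt] \frac{e^{3t} - e^{-t}}{2} & \frac{e^{3t} + e^{-t}}{2} \end{bmatrix}.
  \end{aligned}
\]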

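The initial conditions are $x(0) = 4$ and $y(0) = 2$. Hence, by the property that $e^{0A} = I$, the
particular solution we are looking for is $e^{tA} \vec{b}$, where $\vec{b} = \begin{bmatrix} 4 \\ 2 \end{bmatrix}$. That is,
\[
  \begin{bmatrix} x \\ y \end{bmatrix}
  = \begin{bmatrix} \frac{e^{3t} + e^{-t}}{2} & \frac{e^{3t} - e^{-t}}{2} \\[2pt] \frac{e^{3t} - e^{-t}}{2} & \frac{e^{3t} + e^{-t}}{2} \end{bmatrix}
    \begin{bmatrix} 4 \\ 2 \end{bmatrix}
  = \begin{bmatrix} 2e^{3t} + 2e^{-t} + e^{3t} - e^{-t} \\ 2e^{3t} - 2e^{-t} + e^{3t} + e^{-t} \end{bmatrix}
  = \begin{bmatrix} 3e^{3t} + e^{-t} \\ 3e^{3t} - e^{-t} \end{bmatrix}.
\]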
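
The diagonalization recipe of Example 3.8.1 can be mirrored step by step in code. This is a minimal sketch for the $2 \times 2$ case only; the function names are assumptions for this illustration, and the eigenvalues and eigenvectors are entered by hand rather than computed.

```python
import math

def mat_mul(A, B):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse_2x2(M):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[d / det, -b / det], [-c / det, a / det]]

def exp_tA_diagonalized(E, eigenvalues, t):
    """Compute e^{tA} = E e^{tD} E^{-1}, where the columns of E are eigenvectors."""
    exp_tD = [[math.exp(eigenvalues[i] * t) if i == j else 0.0 for j in range(2)]
              for i in range(2)]
    return mat_mul(mat_mul(E, exp_tD), inverse_2x2(E))

# Example 3.8.1: eigenvalues 3 and -1 with eigenvectors (1, 1) and (1, -1).
E = [[1.0, 1.0], [1.0, -1.0]]
eigenvalues = [3.0, -1.0]
t = 0.5
M = exp_tA_diagonalized(E, eigenvalues, t)

# Solve the initial value problem x(0) = 4, y(0) = 2 as e^{tA} b.
b = [4.0, 2.0]
x = M[0][0] * b[0] + M[0][1] * b[1]
y = M[1][0] * b[0] + M[1][1] * b[1]
print(x, 3 * math.exp(3 * t) + math.exp(-t))  # the two numbers should agree
print(y, 3 * math.exp(3 * t) - math.exp(-t))  # likewise
```

At $t = 0$ the function returns the identity matrix, which is the defining property used to pick out the initial condition.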

3.8.4 Fundamental matrix solutions


If you can compute a fundamental matrix solution in a different way, you can use this to
find the matrix exponential 𝑒 𝑡𝐴 . A fundamental matrix solution of a system of ODEs is not
unique. The exponential is the fundamental matrix solution with the property that for
𝑡 = 0 we get the identity matrix. So we must find the right fundamental matrix solution.
Let $X$ be any fundamental matrix solution to $\vec{x}' = A \vec{x}$. We claim
\[
  e^{tA} = X(t) \, [X(0)]^{-1}.
\]
If we plug $t = 0$ into $X(t) \, [X(0)]^{-1}$, we get the identity as needed. We can multiply a
fundamental matrix solution on the right by any constant invertible matrix and we still
get a fundamental matrix solution. All we are doing is changing what the arbitrary
constants in the general solution $\vec{x}(t) = X(t) \vec{c}$ are.
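
To see the formula $e^{tA} = X(t) \, [X(0)]^{-1}$ in action, the sketch below takes a deliberately "wrong" fundamental matrix solution for $A = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}$, with columns scaled by the arbitrary factors 2 and $-5$ (an assumption made purely for illustration), and checks that the normalization still recovers the closed form of $e^{tA}$ found in Example 3.8.1.

```python
import math

def mat_mul(A, B):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse_2x2(M):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def X(t):
    """A fundamental matrix solution: columns 2*(1,1)e^{3t} and -5*(1,-1)e^{-t}."""
    return [[2 * math.exp(3 * t), -5 * math.exp(-t)],
            [2 * math.exp(3 * t), 5 * math.exp(-t)]]

def exp_tA(t):
    """e^{tA} obtained by normalizing X so that it equals the identity at t = 0."""
    return mat_mul(X(t), inverse_2x2(X(0.0)))

def closed_form(t):
    """The matrix exponential computed in Example 3.8.1."""
    p = (math.exp(3 * t) + math.exp(-t)) / 2
    m = (math.exp(3 * t) - math.exp(-t)) / 2
    return [[p, m], [m, p]]

t = 0.4
print(exp_tA(t))
print(closed_form(t))  # should match the line above up to rounding
```

Any other invertible scaling or mixing of the columns of $X$ would give the same result, since the constant factor cancels against $[X(0)]^{-1}$.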

3.8.5 Approximations
If you think about it, the computation of any fundamental matrix solution 𝑋 using the
eigenvalue method is just as difficult as the computation of 𝑒 𝑡𝐴 . So perhaps we did not
gain much by this new tool. However, the Taylor series expansion actually gives us a way
to approximate solutions, which the eigenvalue method did not.
The simplest thing we can do is to just compute the series up to a certain number of
terms. There are better ways to approximate the exponential‗ . In many cases, however,
few terms of the Taylor series give a reasonable approximation for the exponential and
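may suffice for the application. For example, let us compute the first 4 terms of the series
for the matrix $A = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}$.
\[
  \begin{aligned}
    e^{tA} \approx I + tA + \frac{t^2}{2} A^2 + \frac{t^3}{6} A^3
    &= I + t \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}
       + t^2 \begin{bmatrix} \frac{5}{2} & 2 \\ 2 & \frac{5}{2} \end{bmatrix}
       + t^3 \begin{bmatrix} \frac{13}{6} & \frac{7}{3} \\ \frac{7}{3} & \frac{13}{6} \end{bmatrix} \\
    &= \begin{bmatrix}
         1 + t + \frac{5}{2} t^2 + \frac{13}{6} t^3 & 2t + 2t^2 + \frac{7}{3} t^3 \\[2pt]
         2t + 2t^2 + \frac{7}{3} t^3 & 1 + t + \frac{5}{2} t^2 + \frac{13}{6} t^3
       \end{bmatrix}.
  \end{aligned}
\]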

Just like the scalar version of the Taylor series approximation, the approximation will be
better for small 𝑡 and worse for larger 𝑡. For larger 𝑡, we will have to compute more terms.
Let us see how we stack up against the real solution with $t = 0.1$. The approximation is
(rounded to 8 decimal places)
\[
  e^{0.1 A} \approx I + 0.1 A + \frac{0.1^2}{2} A^2 + \frac{0.1^3}{6} A^3
  = \begin{bmatrix} 1.12716667 & 0.22233333 \\ 0.22233333 & 1.12716667 \end{bmatrix}.
\]
Plugging $t = 0.1$ into the real solution (rounded to 8 decimal places), we get
\[
  e^{0.1 A} = \begin{bmatrix} 1.12734811 & 0.22251069 \\ 0.22251069 & 1.12734811 \end{bmatrix}.
\]
Not bad at all! Although if we take the same approximation for $t = 1$, we get
\[
  I + A + \frac{1}{2} A^2 + \frac{1}{6} A^3
  = \begin{bmatrix} 6.66666667 & 6.33333333 \\ 6.33333333 & 6.66666667 \end{bmatrix},
\]
while the real value is (again rounded to 8 decimal places)
\[
  e^{A} = \begin{bmatrix} 10.22670818 & 9.85882874 \\ 9.85882874 & 10.22670818 \end{bmatrix}.
\]

So the approximation is not very good once we get up to 𝑡 = 1. To get a good approximation
at 𝑡 = 1 (say up to 2 decimal places), we would need to go up to the 11th power (exercise).
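
The numbers above are easy to reproduce. The following sketch (function names are assumptions for this illustration) computes the Taylor partial sum up to a chosen power for $A = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}$ and compares it with the closed form of $e^{tA}$ from Example 3.8.1.

```python
import math

def mat_mul(A, B):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def taylor_exp(A, t, max_power):
    """Partial sum I + tA + (tA)^2/2! + ... + (tA)^max_power / max_power!."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]  # holds (tA)^k / k!
    tA = [[t * entry for entry in row] for row in A]
    for k in range(1, max_power + 1):
        term = [[entry / k for entry in row] for row in mat_mul(term, tA)]
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result

def exact_exp(t):
    """Closed form of e^{tA} for A = [[1, 2], [2, 1]] (eigenvalues 3 and -1)."""
    p = (math.exp(3 * t) + math.exp(-t)) / 2
    m = (math.exp(3 * t) - math.exp(-t)) / 2
    return [[p, m], [m, p]]

A = [[1.0, 2.0], [2.0, 1.0]]
for t in (0.1, 1.0):
    approx = taylor_exp(A, t, 3)   # first 4 terms of the series
    exact = exact_exp(t)
    print(t, [round(v, 8) for v in approx[0]], [round(v, 8) for v in exact[0]])
```

Raising `max_power` shows how many terms are needed for a given accuracy at larger $t$, which is one way to approach the exercise mentioned above.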
‗ C.Moler and C.F. Van Loan, Nineteen Dubious Ways to Compute the Exponential of a Matrix, Twenty-Five
Years Later, SIAM Review 45 (1), 2003, 3–49

3.8.6 Exercises
Exercise 3.8.2: Using the matrix exponential, find a fundamental matrix solution for the system
𝑥 ′ = 3𝑥 + 𝑦, 𝑦 ′ = 𝑥 + 3𝑦.

Exercise 3.8.3: Find $e^{tA}$ for the matrix $A = \begin{bmatrix} 2 & 3 \\ 0 & 2 \end{bmatrix}$.

Exercise 3.8.4: Find a fundamental matrix solution for the system $x_1' = 7x_1 + 4x_2 + 12x_3$,
$x_2' = x_1 + 2x_2 + x_3$, $x_3' = -3x_1 - 2x_2 - 5x_3$. Then find the solution that satisfies $\vec{x}(0) = \begin{bmatrix} 0 \\ 1 \\ -2 \end{bmatrix}$.

Exercise 3.8.5: Compute the matrix exponential $e^A$ for $A = \begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix}$.

Exercise 3.8.6 (challenging): Suppose $AB = BA$. Show that under this assumption, $e^{A+B} = e^A e^B$.

Exercise 3.8.7: Use Exercise 3.8.6 to show that $(e^A)^{-1} = e^{-A}$. In particular, $e^A$ is invertible even if
$A$ is not.

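Exercise 3.8.8: Let $A$ be a $2 \times 2$ matrix with eigenvalues $-1$, $1$, and corresponding eigenvectors
$\begin{bmatrix} 1 \\ 1 \end{bmatrix}$, $\begin{bmatrix} 0 \\ 1 \end{bmatrix}$.

a) Find a matrix $A$ with these properties.
b) Find a fundamental matrix solution to $\vec{x}' = A \vec{x}$.
c) Solve the system with initial conditions $\vec{x}(0) = \begin{bmatrix} 2 \\ 3 \end{bmatrix}$.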

Exercise 3.8.9: Suppose that 𝐴 is an 𝑛 × 𝑛 matrix with a repeated eigenvalue 𝜆 of multiplicity 𝑛
with 𝑛 linearly independent eigenvectors. Show that the matrix is diagonal, in fact, 𝐴 = 𝜆𝐼. Hint:
Use diagonalization and the fact that the identity matrix commutes with every other matrix.

Exercise 3.8.10: Let $A = \begin{bmatrix} -1 & -1 \\ 1 & -3 \end{bmatrix}$.

a) Find $e^{tA}$. b) Solve $\vec{x}' = A \vec{x}$, $\vec{x}(0) = \begin{bmatrix} 1 \\ -2 \end{bmatrix}$.

Exercise 3.8.11: Let $A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$. Approximate $e^{tA}$ by expanding the power series up to the third
order.

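Exercise 3.8.12: For any positive integer $n$, find a formula (or a recipe) for $A^n$ for the following
matrices:
a) $\begin{bmatrix} 3 & 0 \\ 0 & 9 \end{bmatrix}$
b) $\begin{bmatrix} 5 & 2 \\ 4 & 7 \end{bmatrix}$
c) $\begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}$
d) $\begin{bmatrix} 2 & 1 \\ 0 & 2 \end{bmatrix}$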

Exercise 3.8.101: Compute $e^{tA}$ where $A = \begin{bmatrix} 1 & -2 \\ -2 & 1 \end{bmatrix}$.

Exercise 3.8.102: Compute $e^{tA}$ where $A = \begin{bmatrix} 1 & -3 & 2 \\ -2 & 1 & 2 \\ -1 & -3 & 4 \end{bmatrix}$.

Exercise 3.8.103:

a) Compute $e^{tA}$ where $A = \begin{bmatrix} 3 & -1 \\ 1 & 1 \end{bmatrix}$. b) Solve $\vec{x}' = A \vec{x}$ for $\vec{x}(0) = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$.

Exercise 3.8.104: Compute the first 3 terms (up to the second degree) of the Taylor expansion of $e^{tA}$
where $A = \begin{bmatrix} 2 & 3 \\ 2 & 2 \end{bmatrix}$. Write it as a single matrix. Then use it to approximate $e^{0.1A}$.

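Exercise 3.8.105: For any positive integer $n$, find a formula (or a recipe) for $A^n$ for the following
matrices:
a) $\begin{bmatrix} 7 & 4 \\ -5 & -2 \end{bmatrix}$
b) $\begin{bmatrix} -3 & 4 \\ -6 & 7 \end{bmatrix}$
c) $\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$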

3.9 Nonhomogeneous systems


Note: 3 lectures (may have to skip a little), somewhat different from §5.7 in [EP], §7.9 in [BD]

3.9.1 First-order constant-coefficient


Integrating factor
We first focus on the nonhomogeneous first-order equation

𝑥®′(𝑡) = 𝐴 𝑥®(𝑡) + 𝑓®(𝑡),

where 𝐴 is a constant matrix. The first method we look at is the integrating factor method.
For simplicity, we rewrite the equation as

𝑥®′(𝑡) + 𝑃 𝑥®(𝑡) = 𝑓®(𝑡),

where 𝑃 = −𝐴. We multiply both sides of the equation by 𝑒 𝑡𝑃 (being mindful that we are
dealing with matrices that may not commute) to obtain

𝑒 𝑡𝑃 𝑥®′(𝑡) + 𝑒 𝑡𝑃 𝑃 𝑥®(𝑡) = 𝑒 𝑡𝑃 𝑓®(𝑡).

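We notice that $P e^{tP} = e^{tP} P$. This fact follows by writing down the series definition of $e^{tP}$:
\[
  \begin{aligned}
    P e^{tP} &= P \left( I + tP + \frac{1}{2} (tP)^2 + \cdots \right)
    = P + tP^2 + \frac{1}{2} t^2 P^3 + \cdots \\
    &= \left( I + tP + \frac{1}{2} (tP)^2 + \cdots \right) P = e^{tP} P.
  \end{aligned}
\]
So $\frac{d}{dt} \left( e^{tP} \right) = P e^{tP} = e^{tP} P$. The product rule says
\[
  \frac{d}{dt} \left( e^{tP} \vec{x}(t) \right) = e^{tP} \vec{x}'(t) + e^{tP} P \vec{x}(t),
\]
and so
\[
  \frac{d}{dt} \left( e^{tP} \vec{x}(t) \right) = e^{tP} \vec{f}(t).
\]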
We can now integrate. That is, we integrate each component of the vector separately
\[
  e^{tP} \vec{x}(t) = \int e^{tP} \vec{f}(t) \, dt + \vec{c}.
\]
Recall from Exercise 3.8.7 that $(e^{tP})^{-1} = e^{-tP}$. Therefore, we obtain
\[
  \vec{x}(t) = e^{-tP} \int e^{tP} \vec{f}(t) \, dt + e^{-tP} \vec{c}.
\]

Perhaps it is better understood as a definite integral. In this case it will be easy to also
solve for the initial conditions. Consider the equation with initial conditions
\[
  \vec{x}'(t) + P \vec{x}(t) = \vec{f}(t), \qquad \vec{x}(0) = \vec{b}.
\]
The solution can then be written as
\[
  \vec{x}(t) = e^{-tP} \int_0^t e^{sP} \vec{f}(s) \, ds + e^{-tP} \vec{b}.
  \tag{3.6}
\]
Again, the integration means that each component of the vector $e^{sP} \vec{f}(s)$ is integrated
separately. It is not hard to see that (3.6) really does satisfy the initial condition $\vec{x}(0) = \vec{b}$.
\[
  \vec{x}(0) = e^{-0P} \int_0^0 e^{sP} \vec{f}(s) \, ds + e^{-0P} \vec{b} = I \vec{b} = \vec{b}.
\]

Example 3.9.1: Suppose that we have the system

𝑥1′ + 5𝑥 1 − 3𝑥 2 = 𝑒 𝑡 ,
𝑥2′ + 3𝑥 1 − 𝑥 2 = 0,

with initial conditions 𝑥1 (0) = 1, 𝑥 2 (0) = 0.


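Let us write the system as
\[
  \vec{x}' + \begin{bmatrix} 5 & -3 \\ 3 & -1 \end{bmatrix} \vec{x} = \begin{bmatrix} e^t \\ 0 \end{bmatrix},
  \qquad \vec{x}(0) = \begin{bmatrix} 1 \\ 0 \end{bmatrix}.
\]
The matrix $P = \begin{bmatrix} 5 & -3 \\ 3 & -1 \end{bmatrix}$ has a doubled eigenvalue 2 with defect 1, and we leave it as an
exercise to double check we computed $e^{tP}$ correctly. Once we have $e^{tP}$, we find $e^{-tP}$, simply
by negating $t$.
\[
  e^{tP} = \begin{bmatrix} (1 + 3t) e^{2t} & -3t e^{2t} \\ 3t e^{2t} & (1 - 3t) e^{2t} \end{bmatrix},
  \qquad
  e^{-tP} = \begin{bmatrix} (1 - 3t) e^{-2t} & 3t e^{-2t} \\ -3t e^{-2t} & (1 + 3t) e^{-2t} \end{bmatrix}.
\]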

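Instead of computing the whole formula at once, let us do it in stages. First
\[
  \begin{aligned}
    \int_0^t e^{sP} \vec{f}(s) \, ds
    &= \int_0^t \begin{bmatrix} (1 + 3s) e^{2s} & -3s e^{2s} \\ 3s e^{2s} & (1 - 3s) e^{2s} \end{bmatrix}
       \begin{bmatrix} e^s \\ 0 \end{bmatrix} ds \\
    &= \int_0^t \begin{bmatrix} (1 + 3s) e^{3s} \\ 3s e^{3s} \end{bmatrix} ds \\
    &= \begin{bmatrix} \int_0^t (1 + 3s) e^{3s} \, ds \\[2pt] \int_0^t 3s e^{3s} \, ds \end{bmatrix} \\
    &= \begin{bmatrix} t e^{3t} \\[2pt] \frac{(3t - 1) e^{3t} + 1}{3} \end{bmatrix}
    \qquad \text{(used integration by parts).}
  \end{aligned}
\]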
3.9. NONHOMOGENEOUS SYSTEMS 179

Then
∫ 𝑡
𝑥®(𝑡) = 𝑒 −𝑡𝑃
𝑒 𝑠𝑃 𝑓®(𝑠) 𝑑𝑠 + 𝑒 −𝑡𝑃 𝑏®
0
(1 − 3𝑡) 𝑒 −2𝑡 3𝑡𝑒 −2𝑡 𝑡𝑒 3𝑡 (1 − 3𝑡) 𝑒 −2𝑡 3𝑡𝑒 −2𝑡
     
1
= (3𝑡−1) 𝑒 3𝑡 +1 +
−3𝑡𝑒 −2𝑡 (1 + 3𝑡) 𝑒 −2𝑡 3
−3𝑡𝑒 −2𝑡 (1 + 3𝑡) 𝑒 −2𝑡 0
𝑡𝑒 −2𝑡 3𝑡) 𝑒 −2𝑡
   
(1 −
= 𝑡  −2𝑡 +
− 𝑒3 + 3 +𝑡 𝑒
1 −3𝑡𝑒 −2𝑡
(1 − 2𝑡) 𝑒 −2𝑡
 
= 𝑡 .
− 𝑒3 + 31 − 2𝑡 𝑒 −2𝑡


Phew!
Let us check that this really works.
\[
x_1' + 5x_1 - 3x_2 = (4te^{-2t} - 4e^{-2t}) + 5(1-2t)\, e^{-2t} + e^t - (1-6t)\, e^{-2t} = e^t .
\]

Similarly (exercise) 𝑥 2′ + 3𝑥 1 − 𝑥 2 = 0. The initial conditions are also satisfied (exercise).
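
The two checks left as exercises can also be done numerically. The following is a minimal sketch in Python (not part of the original text) that plugs the solution of Example 3.9.1 into both equations, approximating the derivatives by central differences. The helper names `derivative` and `residuals`, the step size, and the sample points are our own choices for illustration.

```python
import math

# Solution of Example 3.9.1 as computed in the text.
def x1(t):
    return (1 - 2 * t) * math.exp(-2 * t)

def x2(t):
    return -math.exp(t) / 3 + (1 / 3 - 2 * t) * math.exp(-2 * t)

def derivative(fn, t, h=1e-5):
    # Central difference approximation of fn'(t); h is an assumed step size.
    return (fn(t + h) - fn(t - h)) / (2 * h)

def residuals(t):
    # Left-hand side minus right-hand side for each equation of the system.
    r1 = derivative(x1, t) + 5 * x1(t) - 3 * x2(t) - math.exp(t)
    r2 = derivative(x2, t) + 3 * x1(t) - x2(t)
    return r1, r2

for t in (0.0, 0.5, 1.0, 2.0):
    r1, r2 = residuals(t)
    print(f"t = {t}: residuals {r1:.1e}, {r2:.1e}")
print("x(0) =", (x1(0.0), x2(0.0)))
```

The residuals should be zero up to the error of the finite difference, and the printed initial value should agree with $x_1(0) = 1$, $x_2(0) = 0$. A numerical check of this kind does not replace the exercise, but it catches arithmetic slips quickly.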


For systems, the integrating factor method only works if $P$ does not depend on $t$, that
is, $P$ is constant. The problem is that in general
\[
\frac{d}{dt} \left[ e^{\int P(t) \, dt} \right] \neq P(t) \, e^{\int P(t) \, dt} ,
\]
because matrix multiplication is not commutative.
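
To see the failure concretely, here is a small numerical sketch in Python (our own illustration, not from the text). It compares $\frac{d}{dt} e^{Q(t)}$ with $P(t)\, e^{Q(t)}$, where $Q(t) = \int P(t)\,dt$, for one constant matrix and for the hypothetical variable matrix $P(t) = \begin{bmatrix} 0 & 1 \\ t & 0 \end{bmatrix}$. The matrix exponential is approximated by a truncated power series, and the derivative by a central difference; both are assumptions made to keep the example self-contained.

```python
def matmul(A, B):
    # Product of two 2x2 matrices stored as nested lists.
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def expm(M, terms=30):
    # Truncated series I + M + M^2/2! + ... (adequate for small matrices).
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, terms):
        term = [[v / n for v in row] for row in matmul(term, M)]
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result

def mismatch(P, Q, t, h=1e-5):
    # Largest entry of d/dt exp(Q(t)) - P(t) exp(Q(t)), derivative by central difference.
    plus, minus = expm(Q(t + h)), expm(Q(t - h))
    lhs = [[(plus[i][j] - minus[i][j]) / (2 * h) for j in range(2)] for i in range(2)]
    rhs = matmul(P(t), expm(Q(t)))
    return max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))

# Constant P: Q(t) = tP commutes with P.
P_const = lambda t: [[0.0, 1.0], [1.0, 0.0]]
Q_const = lambda t: [[0.0, t], [t, 0.0]]

# Variable P (hypothetical example): Q(t) is an antiderivative of P(t), but PQ != QP.
P_var = lambda t: [[0.0, 1.0], [t, 0.0]]
Q_var = lambda t: [[0.0, t], [t * t / 2, 0.0]]

print("constant P mismatch:", mismatch(P_const, Q_const, 1.0))
print("variable P mismatch:", mismatch(P_var, Q_var, 1.0))
```

For the constant matrix the mismatch is at the level of numerical error, while for the variable matrix it is clearly nonzero, which is exactly the obstruction described above.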

Eigenvector decomposition
For the next method, note that eigenvectors of a matrix give the directions in which the
matrix acts like a scalar. If we solve the system along these directions, the computations
are simpler as we treat the matrix as a scalar. We then put those solutions together to get
the general solution for the system.
Take the equation
𝑥®′(𝑡) = 𝐴 𝑥®(𝑡) + 𝑓®(𝑡). (3.7)
Assume 𝐴 has 𝑛 linearly independent eigenvectors 𝑣®1 , 𝑣®2 , . . . , 𝑣® 𝑛 . Write

𝑥®(𝑡) = 𝑣®1 𝜉1 (𝑡) + 𝑣®2 𝜉2 (𝑡) + · · · + 𝑣® 𝑛 𝜉𝑛 (𝑡). (3.8)

That is, we wish to write our solution as a linear combination of eigenvectors of 𝐴. If we


solve for the scalar functions 𝜉1 through 𝜉𝑛 , we have our solution 𝑥®. Let us decompose 𝑓® in
terms of the eigenvectors as well. We wish to write

𝑓®(𝑡) = 𝑣®1 𝑔1 (𝑡) + 𝑣®2 𝑔2 (𝑡) + · · · + 𝑣® 𝑛 𝑔𝑛 (𝑡). (3.9)

That is, we wish to find 𝑔1 through 𝑔𝑛 that satisfy (3.9). Since all the eigenvectors are
independent, the matrix 𝐸 = [ 𝑣®1 𝑣®2 · · · 𝑣® 𝑛 ] is invertible. Write the equation (3.9) as

𝑓® = 𝐸 𝑔® , where the components of 𝑔® are the functions 𝑔1 through 𝑔𝑛 . Then 𝑔® = 𝐸−1 𝑓®.
Hence it is always possible to find 𝑔® when there are 𝑛 linearly independent eigenvectors.
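We plug (3.8) into (3.7), and note that $A \vec{v}_k = \lambda_k \vec{v}_k$:
\[
\begin{aligned}
\overbrace{\vec{v}_1 \xi_1' + \vec{v}_2 \xi_2' + \cdots + \vec{v}_n \xi_n'}^{\vec{x}'}
&= \overbrace{A \left( \vec{v}_1 \xi_1 + \vec{v}_2 \xi_2 + \cdots + \vec{v}_n \xi_n \right)}^{A \vec{x}}
+ \overbrace{\vec{v}_1 g_1 + \vec{v}_2 g_2 + \cdots + \vec{v}_n g_n}^{\vec{f}} \\
&= A \vec{v}_1 \xi_1 + A \vec{v}_2 \xi_2 + \cdots + A \vec{v}_n \xi_n
+ \vec{v}_1 g_1 + \vec{v}_2 g_2 + \cdots + \vec{v}_n g_n \\
&= \vec{v}_1 \lambda_1 \xi_1 + \vec{v}_2 \lambda_2 \xi_2 + \cdots + \vec{v}_n \lambda_n \xi_n
+ \vec{v}_1 g_1 + \vec{v}_2 g_2 + \cdots + \vec{v}_n g_n \\
&= \vec{v}_1 (\lambda_1 \xi_1 + g_1) + \vec{v}_2 (\lambda_2 \xi_2 + g_2) + \cdots + \vec{v}_n (\lambda_n \xi_n + g_n).
\end{aligned}
\]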

If we identify the coefficients of the vectors 𝑣®1 through 𝑣® 𝑛 , we get the equations

𝜉1′ = 𝜆1 𝜉1 + 𝑔1 ,
𝜉2′ = 𝜆2 𝜉2 + 𝑔2 ,
..
.
𝜉𝑛′ = 𝜆𝑛 𝜉𝑛 + 𝑔𝑛 .

Each one of these equations is independent of the others. They are all linear first-order
equations and can easily be solved by the standard integrating factor method for single
equations. That is, for the 𝑘 th equation we write

𝜉′𝑘 (𝑡) − 𝜆 𝑘 𝜉 𝑘 (𝑡) = 𝑔 𝑘 (𝑡).

We use the integrating factor $e^{-\lambda_k t}$ to find that
\[
\frac{d}{dt} \Bigl[ \xi_k(t) \, e^{-\lambda_k t} \Bigr] = e^{-\lambda_k t} g_k(t).
\]
We integrate and solve for $\xi_k$ to get
\[
\xi_k(t) = e^{\lambda_k t} \int e^{-\lambda_k t} g_k(t) \, dt + C_k e^{\lambda_k t} .
\]

If we are looking for just any particular solution, we can set 𝐶 𝑘 to be zero. If we leave these
constants in, we get the general solution. Write 𝑥®(𝑡) = 𝑣®1 𝜉1 (𝑡) + 𝑣®2 𝜉2 (𝑡) + · · · + 𝑣® 𝑛 𝜉𝑛 (𝑡), and
we are done.
As always, it is perhaps better to write these integrals as definite integrals. Suppose that
we have an initial condition $\vec{x}(0) = \vec{b}$. Take $\vec{a} = E^{-1} \vec{b}$ to find $\vec{b} = \vec{v}_1 a_1 + \vec{v}_2 a_2 + \cdots + \vec{v}_n a_n$,
just like before. Then if we write
\[
\xi_k(t) = e^{\lambda_k t} \int_0^t e^{-\lambda_k s} g_k(s) \, ds + a_k e^{\lambda_k t} ,
\]
we get the particular solution $\vec{x}(t) = \vec{v}_1 \xi_1(t) + \vec{v}_2 \xi_2(t) + \cdots + \vec{v}_n \xi_n(t)$ satisfying $\vec{x}(0) = \vec{b}$,
because $\xi_k(0) = a_k$.
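We remark that the technique we just outlined is the eigenvalue method applied to
nonhomogeneous systems. If a system is homogeneous, that is, if $\vec{f} = \vec{0}$, then the equations
we get are $\xi_k' = \lambda_k \xi_k$, and so $\xi_k = C_k e^{\lambda_k t}$ are the solutions and that is precisely what we
got in § 3.4.

Example 3.9.2: Let $A = \begin{bmatrix} 1 & 3 \\ 3 & 1 \end{bmatrix}$. Solve $\vec{x}' = A \vec{x} + \vec{f}$ where $\vec{f}(t) = \begin{bmatrix} 2e^t \\ 2t \end{bmatrix}$ for $\vec{x}(0) = \begin{bmatrix} 3/16 \\ -5/16 \end{bmatrix}$.
The eigenvalues of $A$ are $-2$ and $4$ and corresponding eigenvectors are $\begin{bmatrix} 1 \\ -1 \end{bmatrix}$ and $\begin{bmatrix} 1 \\ 1 \end{bmatrix}$
respectively. This calculation is left as an exercise. We write down the matrix $E$ of the
eigenvectors and compute its inverse (using the inverse formula for $2 \times 2$ matrices)
\[
E = \begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix},
\qquad
E^{-1} = \frac{1}{2} \begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix}.
\]
We are looking for a solution of the form $\vec{x} = \begin{bmatrix} 1 \\ -1 \end{bmatrix} \xi_1 + \begin{bmatrix} 1 \\ 1 \end{bmatrix} \xi_2$. We first need to write $\vec{f}$
in terms of the eigenvectors. That is, we wish to write $\vec{f} = \begin{bmatrix} 2e^t \\ 2t \end{bmatrix} = \begin{bmatrix} 1 \\ -1 \end{bmatrix} g_1 + \begin{bmatrix} 1 \\ 1 \end{bmatrix} g_2$. Thus
\[
\begin{bmatrix} g_1 \\ g_2 \end{bmatrix}
= E^{-1} \begin{bmatrix} 2e^t \\ 2t \end{bmatrix}
= \frac{1}{2} \begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} 2e^t \\ 2t \end{bmatrix}
= \begin{bmatrix} e^t - t \\ e^t + t \end{bmatrix}.
\]
So $g_1 = e^t - t$ and $g_2 = e^t + t$.
We further need to write $\vec{x}(0)$ in terms of the eigenvectors. That is, we wish to write
$\vec{x}(0) = \begin{bmatrix} 3/16 \\ -5/16 \end{bmatrix} = \begin{bmatrix} 1 \\ -1 \end{bmatrix} a_1 + \begin{bmatrix} 1 \\ 1 \end{bmatrix} a_2$. Hence
\[
\begin{bmatrix} a_1 \\ a_2 \end{bmatrix}
= E^{-1} \begin{bmatrix} 3/16 \\ -5/16 \end{bmatrix}
= \begin{bmatrix} 1/4 \\ -1/16 \end{bmatrix}.
\]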

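So $a_1 = 1/4$ and $a_2 = -1/16$. We plug our $\vec{x}$ into the equation and get
\[
\begin{aligned}
\overbrace{\begin{bmatrix} 1 \\ -1 \end{bmatrix} \xi_1' + \begin{bmatrix} 1 \\ 1 \end{bmatrix} \xi_2'}^{\vec{x}'}
&= \overbrace{A \begin{bmatrix} 1 \\ -1 \end{bmatrix} \xi_1 + A \begin{bmatrix} 1 \\ 1 \end{bmatrix} \xi_2}^{A \vec{x}}
+ \overbrace{\begin{bmatrix} 1 \\ -1 \end{bmatrix} g_1 + \begin{bmatrix} 1 \\ 1 \end{bmatrix} g_2}^{\vec{f}} \\
&= \begin{bmatrix} 1 \\ -1 \end{bmatrix} (-2 \xi_1) + \begin{bmatrix} 1 \\ 1 \end{bmatrix} 4 \xi_2
+ \begin{bmatrix} 1 \\ -1 \end{bmatrix} (e^t - t) + \begin{bmatrix} 1 \\ 1 \end{bmatrix} (e^t + t).
\end{aligned}
\]
We get the two equations
\[
\begin{aligned}
\xi_1' &= -2 \xi_1 + e^t - t, & \text{where } \xi_1(0) &= a_1 = \frac{1}{4}, \\
\xi_2' &= 4 \xi_2 + e^t + t, & \text{where } \xi_2(0) &= a_2 = \frac{-1}{16}.
\end{aligned}
\]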

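We solve with the integrating factor method. Computation of the integral is left as an
exercise to the student. You will need integration by parts.
\[
\xi_1 = e^{-2t} \int e^{2t} (e^t - t) \, dt + C_1 e^{-2t}
= \frac{e^t}{3} - \frac{t}{2} + \frac{1}{4} + C_1 e^{-2t} ,
\]
where $C_1$ is the constant of integration. As $\xi_1(0) = 1/4$, then $1/4 = 1/3 + 1/4 + C_1$ and $C_1 = -1/3$.
Similarly,
\[
\xi_2 = e^{4t} \int e^{-4t} (e^t + t) \, dt + C_2 e^{4t}
= -\frac{e^t}{3} - \frac{t}{4} - \frac{1}{16} + C_2 e^{4t} .
\]
As $\xi_2(0) = -1/16$, we have $-1/16 = -1/3 - 1/16 + C_2$ and hence $C_2 = 1/3$. The solution is
\[
\vec{x}(t)
= \begin{bmatrix} 1 \\ -1 \end{bmatrix}
\underbrace{\left( \frac{e^t - e^{-2t}}{3} + \frac{1 - 2t}{4} \right)}_{\xi_1}
+ \begin{bmatrix} 1 \\ 1 \end{bmatrix}
\underbrace{\left( \frac{e^{4t} - e^t}{3} - \frac{4t + 1}{16} \right)}_{\xi_2}
= \begin{bmatrix}
\frac{e^{4t} - e^{-2t}}{3} + \frac{3 - 12t}{16} \\[4pt]
\frac{e^{-2t} + e^{4t} - 2e^t}{3} + \frac{4t - 5}{16}
\end{bmatrix}.
\]
That is, $x_1 = \frac{e^{4t} - e^{-2t}}{3} + \frac{3 - 12t}{16}$ and $x_2 = \frac{e^{-2t} + e^{4t} - 2e^t}{3} + \frac{4t - 5}{16}$.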

Exercise 3.9.1: Check that 𝑥1 and 𝑥2 solve the problem. Check both that they satisfy the differential
equation and that they satisfy the initial conditions.
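
As a numerical companion to the exercise, the following Python sketch retraces the eigenvector decomposition of Example 3.9.2: it decomposes $\vec{f}$ and $\vec{x}(0)$ using $E^{-1}$, checks that $\xi_1$ and $\xi_2$ satisfy their scalar equations, and then checks the recombined $\vec{x}$ against the original system. The function names, the central-difference derivative, and the sample points are assumptions made for this illustration only.

```python
import math

def apply_E_inv(v):
    # Multiply by E^{-1} = (1/2) [[1, -1], [1, 1]] from Example 3.9.2.
    return ((v[0] - v[1]) / 2, (v[0] + v[1]) / 2)

def f(t):
    return (2 * math.exp(t), 2 * t)

# Scalar solutions found in the text (with C1 = -1/3 and C2 = 1/3).
def xi1(t):
    return math.exp(t) / 3 - t / 2 + 1 / 4 - math.exp(-2 * t) / 3

def xi2(t):
    return -math.exp(t) / 3 - t / 4 - 1 / 16 + math.exp(4 * t) / 3

def x(t):
    # x = [1, -1] xi1 + [1, 1] xi2
    return (xi1(t) + xi2(t), -xi1(t) + xi2(t))

def derivative(fn, t, h=1e-5):
    # Central difference; h is an assumed step size.
    return (fn(t + h) - fn(t - h)) / (2 * h)

def scalar_residuals(t):
    # Residuals of xi1' = -2 xi1 + g1 and xi2' = 4 xi2 + g2.
    g1, g2 = apply_E_inv(f(t))
    return (derivative(xi1, t) - (-2 * xi1(t) + g1),
            derivative(xi2, t) - (4 * xi2(t) + g2))

def system_residuals(t):
    # Residuals of x' = A x + f with A = [[1, 3], [3, 1]].
    x1, x2 = x(t)
    dx1 = derivative(lambda s: x(s)[0], t)
    dx2 = derivative(lambda s: x(s)[1], t)
    f1, f2 = f(t)
    return (dx1 - (x1 + 3 * x2 + f1), dx2 - (3 * x1 + x2 + f2))

a1, a2 = apply_E_inv((3 / 16, -5 / 16))
print("a =", (a1, a2), " x(0) =", x(0.0))
for t in (0.0, 0.5, 1.0):
    print(t, scalar_residuals(t), system_residuals(t))
```

All residuals should vanish up to the finite-difference error, and `x(0.0)` should reproduce the initial condition.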

Undetermined coefficients
The method of undetermined coefficients also works for systems. The only difference is that
we use unknown vectors rather than just numbers. The same caveats apply to undetermined
coefficients for systems as for single equations. This method does not always work.
Furthermore, if the right-hand side is complicated, we have to solve for lots of variables.
Each element of an unknown vector is an unknown number. In a system of 3 equations
with, say, 4 unknown vectors (this would not be uncommon), we already have 12 unknown
numbers to solve for. The method can turn into a lot of tedious work if done by hand. As
the method is essentially the same as for single equations, let us just do an example.

Example 3.9.3: Let $A = \begin{bmatrix} -1 & 0 \\ -2 & 1 \end{bmatrix}$. Find a particular solution of $\vec{x}' = A \vec{x} + \vec{f}$ where $\vec{f}(t) = \begin{bmatrix} e^t \\ t \end{bmatrix}$.

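One can solve this system in an easier way (can you see how?), but for the purposes of the
example, we use the eigenvalue method and undetermined coefficients. The eigenvalues
of $A$ are $-1$ and $1$. Corresponding eigenvectors are $\begin{bmatrix} 1 \\ 1 \end{bmatrix}$ and $\begin{bmatrix} 0 \\ 1 \end{bmatrix}$ respectively. Hence, our
complementary solution is
\[
\vec{x}_c = \alpha_1 \begin{bmatrix} 1 \\ 1 \end{bmatrix} e^{-t} + \alpha_2 \begin{bmatrix} 0 \\ 1 \end{bmatrix} e^{t} ,
\]
for some arbitrary constants $\alpha_1$ and $\alpha_2$.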
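We would next want to guess a particular solution of the form
\[
\vec{x} = \vec{a} e^t + \vec{b} t + \vec{c}.
\]
However, something of the form $\vec{a} e^t$ appears in the complementary solution. Because we
do not yet know if the vector $\vec{a}$ is a multiple of $\begin{bmatrix} 0 \\ 1 \end{bmatrix}$, we do not know if a conflict arises. It is
possible that there is no conflict, but to be safe we also try $\vec{b} t e^t$. Here we find the crux of
the difference between a single equation and systems. We try both terms $\vec{a} e^t$ and $\vec{b} t e^t$ in the
solution, not just the term $\vec{b} t e^t$. Therefore, we try
\[
\vec{x} = \vec{a} e^t + \vec{b} t e^t + \vec{c} t + \vec{d}.
\]
We have 8 unknowns: We write $\vec{a} = \begin{bmatrix} a_1 \\ a_2 \end{bmatrix}$, $\vec{b} = \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}$, $\vec{c} = \begin{bmatrix} c_1 \\ c_2 \end{bmatrix}$, and $\vec{d} = \begin{bmatrix} d_1 \\ d_2 \end{bmatrix}$. We plug $\vec{x}$ into
the equation. First let us compute $\vec{x}'$.
\[
\vec{x}' = \bigl( \vec{a} + \vec{b} \bigr) e^t + \vec{b} t e^t + \vec{c}
= \begin{bmatrix} a_1 + b_1 \\ a_2 + b_2 \end{bmatrix} e^t
+ \begin{bmatrix} b_1 \\ b_2 \end{bmatrix} t e^t
+ \begin{bmatrix} c_1 \\ c_2 \end{bmatrix}.
\]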

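Now $\vec{x}'$ must equal $A \vec{x} + \vec{f}$, which is
\[
\begin{aligned}
A \vec{x} + \vec{f}
&= A \vec{a} e^t + A \vec{b} t e^t + A \vec{c} t + A \vec{d} + \vec{f} \\
&= \begin{bmatrix} -a_1 \\ -2a_1 + a_2 \end{bmatrix} e^t
+ \begin{bmatrix} -b_1 \\ -2b_1 + b_2 \end{bmatrix} t e^t
+ \begin{bmatrix} -c_1 \\ -2c_1 + c_2 \end{bmatrix} t
+ \begin{bmatrix} -d_1 \\ -2d_1 + d_2 \end{bmatrix}
+ \begin{bmatrix} 1 \\ 0 \end{bmatrix} e^t
+ \begin{bmatrix} 0 \\ 1 \end{bmatrix} t \\
&= \begin{bmatrix} -a_1 + 1 \\ -2a_1 + a_2 \end{bmatrix} e^t
+ \begin{bmatrix} -b_1 \\ -2b_1 + b_2 \end{bmatrix} t e^t
+ \begin{bmatrix} -c_1 \\ -2c_1 + c_2 + 1 \end{bmatrix} t
+ \begin{bmatrix} -d_1 \\ -2d_1 + d_2 \end{bmatrix}.
\end{aligned}
\]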

We identify the coefficients of $e^t$, $te^t$, $t$ and any constant vectors in $\vec{x}'$ and in $A \vec{x} + \vec{f}$ to find
the equations:
\[
\begin{aligned}
a_1 + b_1 &= -a_1 + 1, & 0 &= -c_1, \\
a_2 + b_2 &= -2a_1 + a_2, & 0 &= -2c_1 + c_2 + 1, \\
b_1 &= -b_1, & c_1 &= -d_1, \\
b_2 &= -2b_1 + b_2, & c_2 &= -2d_1 + d_2 .
\end{aligned}
\]

We could write the 8 × 9 augmented matrix and do row reduction, but it is easier to solve
these in an ad hoc manner. Immediately, we see 𝑏 1 = 0, 𝑐1 = 0, 𝑑1 = 0. Plugging these back
in, we get 𝑐2 = −1 and 𝑑2 = −1. The remaining equations that tell us something are

𝑎1 = −𝑎1 + 1,
𝑎 2 + 𝑏 2 = −2𝑎1 + 𝑎 2 .

So $a_1 = 1/2$ and $b_2 = -1$. Finally, $a_2$ can be arbitrary and still satisfy the equations. We are
looking for just a single solution, so presumably the simplest one is when $a_2 = 0$. Therefore,
\[
\vec{x} = \vec{a} e^t + \vec{b} t e^t + \vec{c} t + \vec{d}
= \begin{bmatrix} 1/2 \\ 0 \end{bmatrix} e^t
+ \begin{bmatrix} 0 \\ -1 \end{bmatrix} t e^t
+ \begin{bmatrix} 0 \\ -1 \end{bmatrix} t
+ \begin{bmatrix} 0 \\ -1 \end{bmatrix}
= \begin{bmatrix} \frac{1}{2} e^t \\ -t e^t - t - 1 \end{bmatrix}.
\]
That is, $x_1 = \frac{1}{2} e^t$, $x_2 = -t e^t - t - 1$. We add this particular solution to the complementary
solution to get the general solution of the problem:
\[
x_1 = \frac{1}{2} e^t + \alpha_1 e^{-t}
\qquad \text{and} \qquad
x_2 = -t e^t - t - 1 + \alpha_1 e^{-t} + \alpha_2 e^{t} .
\]
Notice that both $\vec{a} e^t$ and $\vec{b} t e^t$ really were needed.

Exercise 3.9.2: Check that 𝑥1 and 𝑥2 solve the problem. Then set 𝑎 2 = 1 and check this solution as
well. What is the difference between the two solutions (one with 𝑎2 = 0 and one with 𝑎2 = 1)?
As you can see, other than the handling of conflicts, undetermined coefficients works
exactly the same as it did for single equations. However, the computations can get out of
hand pretty quickly for systems. The equation we considered was pretty simple.
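
Since the bookkeeping is error-prone, it can be reassuring to test a candidate particular solution mechanically. The Python sketch below (an illustration of ours, not from the text) evaluates the residual $\vec{x}' - (A\vec{x} + \vec{f})$ for the particular solution of Example 3.9.3, using hand-differentiated formulas. The parameter `a2` stands for the free coefficient $a_2$; the function names are hypothetical.

```python
import math

def x_p(t, a2=0.0):
    # Particular solution from Example 3.9.3; a2 is the free coefficient.
    return (math.exp(t) / 2,
            a2 * math.exp(t) - t * math.exp(t) - t - 1)

def x_p_prime(t, a2=0.0):
    # Derivative of x_p, computed by hand with the product rule.
    return (math.exp(t) / 2,
            a2 * math.exp(t) - math.exp(t) - t * math.exp(t) - 1)

def residual(t, a2=0.0):
    # x' - (A x + f) with A = [[-1, 0], [-2, 1]] and f = [e^t, t].
    x1, x2 = x_p(t, a2)
    dx1, dx2 = x_p_prime(t, a2)
    return (dx1 - (-x1 + math.exp(t)),
            dx2 - (-2 * x1 + x2 + t))

for a2 in (0.0, 1.0):
    for t in (-1.0, 0.0, 0.5, 2.0):
        print(f"a2 = {a2}, t = {t}: residual = {residual(t, a2)}")
```

The residual is zero (up to rounding) for both choices of `a2`, consistent with the observation that $a_2$ can be arbitrary.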

3.9.2 First-order variable-coefficient


Variation of parameters
Just as for a single equation, there is the method of variation of parameters. For constant-
coefficient systems, it is essentially the same thing as the integrating factor method we
discussed earlier. However, this method works for any linear system, even if it is not
constant-coefficient, provided we somehow solve the associated homogeneous problem.
Consider the equation
𝑥®′ = 𝐴(𝑡) 𝑥® + 𝑓®(𝑡). (3.10)
Suppose we somehow solved the associated homogeneous equation 𝑥®′ = 𝐴(𝑡) 𝑥® and found
a fundamental matrix solution 𝑋(𝑡). The general solution to the associated homogeneous
equation is $X(t) \vec{c}$ for a constant vector $\vec{c}$. As in variation of parameters for a single equation,
we try the solution to the nonhomogeneous equation of the form
\[
\vec{x}_p = X(t) \, \vec{u}(t),
\]
where $\vec{u}(t)$ is a vector-valued function instead of a constant. We substitute $\vec{x}_p$ into (3.10):
\[
\underbrace{X'(t) \, \vec{u}(t) + X(t) \, \vec{u}'(t)}_{\vec{x}_p'(t)}
= \underbrace{A(t) \, X(t) \, \vec{u}(t)}_{A(t) \vec{x}_p(t)} + \vec{f}(t).
\]
As $X(t)$ is a fundamental matrix solution to the homogeneous problem, that is, $X'(t) =
A(t) X(t)$, we find
\[
X'(t) \, \vec{u}(t) + X(t) \, \vec{u}'(t) = X'(t) \, \vec{u}(t) + \vec{f}(t),
\]
and the term $X'(t) \, \vec{u}(t)$ cancels from both sides.
Hence, $X(t) \, \vec{u}'(t) = \vec{f}(t)$. We compute $[X(t)]^{-1}$, and then $\vec{u}'(t) = [X(t)]^{-1} \vec{f}(t)$. We integrate
to obtain $\vec{u}$, and we have the particular solution $\vec{x}_p = X(t) \, \vec{u}(t)$. Hence, we have the formula
\[
\vec{x}_p = X(t) \int [X(t)]^{-1} \vec{f}(t) \, dt .
\]
If $A$ is a constant matrix and $X(t) = e^{tA}$, then $[X(t)]^{-1} = e^{-tA}$. We obtain a solution
$\vec{x}_p = e^{tA} \int e^{-tA} \vec{f}(t) \, dt$, which is precisely what we got using the integrating factor method.
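Example 3.9.4: Find a particular solution to
\[
\vec{x}' = \frac{1}{t^2 + 1} \begin{bmatrix} t & -1 \\ 1 & t \end{bmatrix} \vec{x}
+ \begin{bmatrix} t \\ 1 \end{bmatrix} (t^2 + 1). \tag{3.11}
\]
Here, $A = \frac{1}{t^2 + 1} \begin{bmatrix} t & -1 \\ 1 & t \end{bmatrix}$ is most definitely not constant. Perhaps by a lucky guess, we find
that $X = \begin{bmatrix} 1 & -t \\ t & 1 \end{bmatrix}$ solves $X'(t) = A(t) X(t)$. Once we know the complementary solution, we
can find a solution to (3.11). First, we compute
\[
[X(t)]^{-1} = \frac{1}{t^2 + 1} \begin{bmatrix} 1 & t \\ -t & 1 \end{bmatrix}.
\]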

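Next, we know a particular solution to (3.11) is
\[
\begin{aligned}
\vec{x}_p &= X(t) \int [X(t)]^{-1} \vec{f}(t) \, dt \\
&= \begin{bmatrix} 1 & -t \\ t & 1 \end{bmatrix}
\int \frac{1}{t^2 + 1} \begin{bmatrix} 1 & t \\ -t & 1 \end{bmatrix}
\begin{bmatrix} t \\ 1 \end{bmatrix} (t^2 + 1) \, dt \\
&= \begin{bmatrix} 1 & -t \\ t & 1 \end{bmatrix}
\int \begin{bmatrix} 2t \\ -t^2 + 1 \end{bmatrix} dt \\
&= \begin{bmatrix} 1 & -t \\ t & 1 \end{bmatrix}
\begin{bmatrix} t^2 \\ -\frac{1}{3} t^3 + t \end{bmatrix} \\
&= \begin{bmatrix} \frac{1}{3} t^4 \\ \frac{2}{3} t^3 + t \end{bmatrix}.
\end{aligned}
\]
Adding the complementary solution, we find the general solution to (3.11):
\[
\vec{x} = \begin{bmatrix} 1 & -t \\ t & 1 \end{bmatrix} \begin{bmatrix} c_1 \\ c_2 \end{bmatrix}
+ \begin{bmatrix} \frac{1}{3} t^4 \\ \frac{2}{3} t^3 + t \end{bmatrix}
= \begin{bmatrix} c_1 - c_2 t + \frac{1}{3} t^4 \\ c_2 + (c_1 + 1)\, t + \frac{2}{3} t^3 \end{bmatrix}.
\]

Exercise 3.9.3: Check that $x_1 = \frac{1}{3} t^4$ and $x_2 = \frac{2}{3} t^3 + t$ really solve (3.11).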

In the variation of parameters, as in the integrating factor method, we can obtain the
general solution by adding in constants of integration. Doing so would add 𝑋(𝑡)®𝑐 for a
vector 𝑐® of arbitrary constants. But that is precisely the complementary solution.
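
The formula $\vec{x}_p = X(t) \int [X(t)]^{-1} \vec{f}(t)\,dt$ can also be evaluated numerically when the integral is awkward. The sketch below, in Python, is our own illustration for Example 3.9.4: it replaces the indefinite integral by a definite integral from $0$ to $t$ (an assumption, which fixes one particular choice of constants) and approximates it with Simpson's rule. The helper names and the number of subintervals are arbitrary choices.

```python
def X(t):
    # Fundamental matrix solution from Example 3.9.4.
    return [[1.0, -t], [t, 1.0]]

def X_inv(t):
    d = t * t + 1.0
    return [[1.0 / d, t / d], [-t / d, 1.0 / d]]

def f(t):
    return [t * (t * t + 1.0), t * t + 1.0]

def matvec(M, v):
    return [M[0][0] * v[0] + M[0][1] * v[1], M[1][0] * v[0] + M[1][1] * v[1]]

def integrand(s):
    # X(s)^{-1} f(s), integrated componentwise.
    return matvec(X_inv(s), f(s))

def simpson_vec(g, a, b, n=200):
    # Composite Simpson's rule for a vector-valued function; n must be even.
    h = (b - a) / n
    total = [g(a)[i] + g(b)[i] for i in range(2)]
    for j in range(1, n):
        w = 4 if j % 2 else 2
        gj = g(a + j * h)
        total = [total[i] + w * gj[i] for i in range(2)]
    return [h * v / 3 for v in total]

def x_p_numeric(t):
    # Variation of parameters with the integral taken from 0 to t.
    return matvec(X(t), simpson_vec(integrand, 0.0, t))

def x_p_text(t):
    # Closed form found in the example.
    return [t ** 4 / 3, 2 * t ** 3 / 3 + t]

for t in (0.5, 1.0, 2.0):
    print(t, x_p_numeric(t), x_p_text(t))
```

In this example the antiderivative chosen in the text vanishes at $t = 0$, so the definite integral reproduces the same particular solution; in general the two would differ by a solution of the homogeneous equation.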

3.9.3 Second-order constant-coefficients


Undetermined coefficients
We have already seen a simple example of the method of undetermined coefficients for
second-order systems in § 3.6. This method is essentially the same as undetermined

coefficients for first-order systems. There are some simplifications that we can make, as we
did in § 3.6. Consider the equation
\[
\vec{x}'' = A \vec{x} + \vec{F}(t),
\]
where $A$ is a constant matrix. If $\vec{F}(t)$ is of the form $\vec{F}_0 \cos(\omega t)$, then as two derivatives of
cosine give a multiple of cosine again, we do not need to introduce sines, and we try a solution of the form
\[
\vec{x}_p = \vec{c} \cos(\omega t).
\]
If $\vec{F}$ is a sum of cosines, recall the superposition principle. If $\vec{F}(t) = \vec{F}_0 \cos(\omega_0 t) +
\vec{F}_1 \cos(\omega_1 t)$, then we would try $\vec{a} \cos(\omega_0 t)$ for the problem $\vec{x}'' = A \vec{x} + \vec{F}_0 \cos(\omega_0 t)$, and we
would try $\vec{b} \cos(\omega_1 t)$ for the problem $\vec{x}'' = A \vec{x} + \vec{F}_1 \cos(\omega_1 t)$. Then we sum the solutions.
If there is duplication with the complementary solution, or the equation is of the form
$\vec{x}'' = A \vec{x}' + B \vec{x} + \vec{F}(t)$, then we need to do the same thing as we do for first-order systems.
You can never go wrong by putting in more terms than needed into your guess. Those
extra coefficients will turn out to be zero. But it is useful to save some time and effort.

Eigenvector decomposition

If we have the system


𝑥®′′ = 𝐴 𝑥® + 𝑓®(𝑡),

we can do eigenvector decomposition, just like for first-order systems.


Let 𝜆1 , 𝜆2 , . . . , 𝜆𝑛 be the eigenvalues and 𝑣®1 , 𝑣®2 , . . . , 𝑣® 𝑛 be eigenvectors. Again form the
matrix 𝐸 = [ 𝑣®1 𝑣®2 · · · 𝑣® 𝑛 ]. Write

𝑥®(𝑡) = 𝑣®1 𝜉1 (𝑡) + 𝑣®2 𝜉2 (𝑡) + · · · + 𝑣® 𝑛 𝜉𝑛 (𝑡).

Decompose 𝑓® in terms of the eigenvectors

𝑓®(𝑡) = 𝑣®1 𝑔1 (𝑡) + 𝑣®2 𝑔2 (𝑡) + · · · + 𝑣® 𝑛 𝑔𝑛 (𝑡),

where, again, 𝑔® = 𝐸−1 𝑓®.


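We plug in and as before we obtain
\[
\begin{aligned}
\overbrace{\vec{v}_1 \xi_1'' + \vec{v}_2 \xi_2'' + \cdots + \vec{v}_n \xi_n''}^{\vec{x}''}
&= \overbrace{A \left( \vec{v}_1 \xi_1 + \vec{v}_2 \xi_2 + \cdots + \vec{v}_n \xi_n \right)}^{A \vec{x}}
+ \overbrace{\vec{v}_1 g_1 + \vec{v}_2 g_2 + \cdots + \vec{v}_n g_n}^{\vec{f}} \\
&= A \vec{v}_1 \xi_1 + A \vec{v}_2 \xi_2 + \cdots + A \vec{v}_n \xi_n
+ \vec{v}_1 g_1 + \vec{v}_2 g_2 + \cdots + \vec{v}_n g_n \\
&= \vec{v}_1 \lambda_1 \xi_1 + \vec{v}_2 \lambda_2 \xi_2 + \cdots + \vec{v}_n \lambda_n \xi_n
+ \vec{v}_1 g_1 + \vec{v}_2 g_2 + \cdots + \vec{v}_n g_n \\
&= \vec{v}_1 (\lambda_1 \xi_1 + g_1) + \vec{v}_2 (\lambda_2 \xi_2 + g_2) + \cdots + \vec{v}_n (\lambda_n \xi_n + g_n).
\end{aligned}
\]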

We identify the coefficients of the eigenvectors to get the equations

𝜉1′′ = 𝜆1 𝜉1 + 𝑔1 ,
𝜉2′′ = 𝜆2 𝜉2 + 𝑔2 ,
..
.
𝜉𝑛′′ = 𝜆𝑛 𝜉𝑛 + 𝑔𝑛 .

Each one of these equations is independent of the others. We solve each equation
using the methods of chapter 2. We write 𝑥®(𝑡) = 𝑣®1 𝜉1 (𝑡) + 𝑣®2 𝜉2 (𝑡) + · · · + 𝑣® 𝑛 𝜉𝑛 (𝑡) to
find a particular solution. If we find the general solutions for 𝜉1 through 𝜉𝑛 , then
𝑥®(𝑡) = 𝑣®1 𝜉1 (𝑡) + 𝑣®2 𝜉2 (𝑡) + · · · + 𝑣® 𝑛 𝜉𝑛 (𝑡) is the general solution as well.

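Example 3.9.5: Let us do the example from § 3.6 using this method. The equation is
\[
\vec{x}'' = \begin{bmatrix} -3 & 1 \\ 2 & -2 \end{bmatrix} \vec{x} + \begin{bmatrix} 0 \\ 2 \end{bmatrix} \cos(3t).
\]
The eigenvalues are $-1$ and $-4$, with eigenvectors $\begin{bmatrix} 1 \\ 2 \end{bmatrix}$ and $\begin{bmatrix} 1 \\ -1 \end{bmatrix}$. Therefore $E = \begin{bmatrix} 1 & 1 \\ 2 & -1 \end{bmatrix}$ and
$E^{-1} = \frac{1}{3} \begin{bmatrix} 1 & 1 \\ 2 & -1 \end{bmatrix}$. Therefore,
\[
\begin{bmatrix} g_1 \\ g_2 \end{bmatrix}
= E^{-1} \vec{f}(t)
= \frac{1}{3} \begin{bmatrix} 1 & 1 \\ 2 & -1 \end{bmatrix} \begin{bmatrix} 0 \\ 2 \cos(3t) \end{bmatrix}
= \begin{bmatrix} \frac{2}{3} \cos(3t) \\ \frac{-2}{3} \cos(3t) \end{bmatrix}.
\]
So after the whole song and dance of plugging in, the equations we get are
\[
\xi_1'' = -\xi_1 + \frac{2}{3} \cos(3t),
\qquad
\xi_2'' = -4 \xi_2 - \frac{2}{3} \cos(3t).
\]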

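For each equation we use the method of undetermined coefficients. We try $C_1 \cos(3t)$ for
the first equation and $C_2 \cos(3t)$ for the second equation. We plug in to get
\[
\begin{aligned}
-9 C_1 \cos(3t) &= -C_1 \cos(3t) + \frac{2}{3} \cos(3t), \\
-9 C_2 \cos(3t) &= -4 C_2 \cos(3t) - \frac{2}{3} \cos(3t).
\end{aligned}
\]
We solve each of these equations separately. We get $-9C_1 = -C_1 + 2/3$ and $-9C_2 = -4C_2 - 2/3$.
And hence $C_1 = -1/12$ and $C_2 = 2/15$. So our particular solution is
\[
\vec{x} = \begin{bmatrix} 1 \\ 2 \end{bmatrix} \left( \frac{-1}{12} \cos(3t) \right)
+ \begin{bmatrix} 1 \\ -1 \end{bmatrix} \left( \frac{2}{15} \cos(3t) \right)
= \begin{bmatrix} 1/20 \\ -3/10 \end{bmatrix} \cos(3t).
\]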

This solution matches what we got previously in § 3.6.
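
The arithmetic of Example 3.9.5 can be reproduced exactly with rational numbers. The Python sketch below is an illustration of ours using the standard `fractions` module: it decomposes the forcing amplitude with $E^{-1}$, solves $-\omega^2 C_k = \lambda_k C_k + g_k$ for each $C_k$, and recombines the result along the eigenvectors. The variable names are hypothetical.

```python
from fractions import Fraction as Fr

# Data of Example 3.9.5: x'' = A x + F0 cos(omega t).
eigenvalues = [Fr(-1), Fr(-4)]
eigenvectors = [(Fr(1), Fr(2)), (Fr(1), Fr(-1))]
E_inv = [[Fr(1, 3), Fr(1, 3)], [Fr(2, 3), Fr(-1, 3)]]
F0 = (Fr(0), Fr(2))
omega = 3

# Amplitudes of g = E^{-1} f, that is, g_k(t) = g[k] cos(omega t).
g = [E_inv[k][0] * F0[0] + E_inv[k][1] * F0[1] for k in range(2)]

# Trying xi_k = C_k cos(omega t) in xi_k'' = lambda_k xi_k + g_k gives
# -omega^2 C_k = lambda_k C_k + g[k].
C = [g[k] / (-omega ** 2 - eigenvalues[k]) for k in range(2)]

# Recombine: amplitude of x_p = v_1 C_1 + v_2 C_2.
amplitude = tuple(sum(eigenvectors[k][i] * C[k] for k in range(2)) for i in range(2))

print("g =", g)
print("C =", C)
print("x_p amplitude =", amplitude)
```

The division by $-\omega^2 - \lambda_k$ is only possible when $-\omega^2$ is not an eigenvalue of $A$; otherwise the guess $C_k \cos(\omega t)$ duplicates the complementary solution and must be modified.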



3.9.4 Exercises
Exercise 3.9.4: Find a particular solution to 𝑥 ′ = 𝑥 + 2𝑦 + 2𝑡, 𝑦 ′ = 3𝑥 + 2𝑦 − 4

a) using integrating factor method, b) using eigenvector decomposition,


c) using undetermined coefficients.

Exercise 3.9.5: Find the general solution to 𝑥 ′ = 4𝑥 + 𝑦 − 1, 𝑦 ′ = 𝑥 + 4𝑦 − 𝑒 𝑡

a) using integrating factor method, b) using eigenvector decomposition,


c) using undetermined coefficients.

Exercise 3.9.6: Find the general solution to 𝑥1′′ = −6𝑥1 + 3𝑥2 + cos(𝑡), 𝑥 2′′ = 2𝑥1 − 7𝑥2 + 3 cos(𝑡)

a) using eigenvector decomposition, b) using undetermined coefficients.

Exercise 3.9.7: Find the general solution to 𝑥 1′′ = −6𝑥1 + 3𝑥 2 + cos(2𝑡), 𝑥 2′′ = 2𝑥1 − 7𝑥 2 + 3 cos(2𝑡)

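a) using eigenvector decomposition, b) using undetermined coefficients.

Exercise 3.9.8: Take the equation $\vec{x}' = \begin{bmatrix} \frac{1}{t} & -1 \\ 1 & \frac{1}{t} \end{bmatrix} \vec{x} + \begin{bmatrix} t^2 \\ -t \end{bmatrix}$.

a) Check that $\vec{x}_c = c_1 \begin{bmatrix} t \sin t \\ -t \cos t \end{bmatrix} + c_2 \begin{bmatrix} t \cos t \\ t \sin t \end{bmatrix}$ is the complementary solution.
b) Use variation of parameters to find a particular solution.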

Exercise 3.9.101: Find a particular solution to 𝑥 ′ = 5𝑥 + 4𝑦 + 𝑡, 𝑦 ′ = 𝑥 + 8𝑦 − 𝑡

a) using integrating factor method, b) using eigenvector decomposition,


c) using undetermined coefficients.

Exercise 3.9.102: Find a particular solution to 𝑥 ′ = 𝑦 + 𝑒 𝑡 , 𝑦 ′ = 𝑥 + 𝑒 𝑡

a) using integrating factor method, b) using eigenvector decomposition,


c) using undetermined coefficients.

Exercise 3.9.103: Solve 𝑥1′ = 𝑥 2 + 𝑡, 𝑥2′ = 𝑥 1 + 𝑡 with initial conditions 𝑥 1 (0) = 1, 𝑥 2 (0) = 2
using eigenvector decomposition.

Exercise 3.9.104: Solve 𝑥1′′ = −3𝑥 1 + 𝑥2 + 𝑡, 𝑥2′′ = 9𝑥 1 + 5𝑥 2 + cos(𝑡) with initial conditions
𝑥1 (0) = 0, 𝑥2 (0) = 0, 𝑥1′ (0) = 0, 𝑥 2′ (0) = 0 using eigenvector decomposition.
Chapter 4

Fourier series and PDEs

4.1 Boundary value problems


Note: 2 lectures, similar to §3.8 in [EP], §10.1 and §11.1 in [BD]

4.1.1 Boundary value problems


Before we tackle the Fourier series, we study the so-called boundary value problems (or
endpoint problems). Consider

𝑥 ′′ + 𝜆𝑥 = 0, 𝑥(𝑎) = 0, 𝑥(𝑏) = 0,

for some constant 𝜆, where 𝑥(𝑡) is defined for 𝑡 in the interval [𝑎, 𝑏]. Previously we specified
the value of the solution and its derivative at a single point. Now we specify the value of
the solution at two different points. As 𝑥 = 0 is a solution, existence of solutions is not a
problem. Uniqueness of solutions is another issue. The general solution to 𝑥 ′′ + 𝜆𝑥 = 0 has
two arbitrary constants‗ . It is, therefore, natural (but wrong) to believe that requiring two
conditions guarantees a unique solution.
Example 4.1.1: Take 𝜆 = 1, 𝑎 = 0, 𝑏 = 𝜋. That is,

𝑥 ′′ + 𝑥 = 0, 𝑥(0) = 0, 𝑥(𝜋) = 0.

Then 𝑥 = sin 𝑡 is another solution (besides 𝑥 = 0) satisfying both boundary conditions.


There are more. Write down the general solution of the differential equation, which is
𝑥 = 𝐴 cos 𝑡 + 𝐵 sin 𝑡. The condition 𝑥(0) = 0 forces 𝐴 = 0. Letting 𝑥(𝜋) = 0 does not give us
any more information as 𝑥 = 𝐵 sin 𝑡 already satisfies both boundary conditions. Hence,
there are infinitely many solutions of the form 𝑥 = 𝐵 sin 𝑡, where 𝐵 is an arbitrary constant.
Example 4.1.2: On the other hand, consider $\lambda = 2$. That is,
\[
x'' + 2x = 0, \quad x(0) = 0, \quad x(\pi) = 0.
\]
Then the general solution is $x = A \cos\bigl(\sqrt{2}\, t\bigr) + B \sin\bigl(\sqrt{2}\, t\bigr)$. Letting $x(0) = 0$ still forces
$A = 0$. We apply the second condition to find $0 = x(\pi) = B \sin\bigl(\sqrt{2}\, \pi\bigr)$. As $\sin\bigl(\sqrt{2}\, \pi\bigr) \neq 0$ we
obtain $B = 0$. Therefore $x = 0$ is the unique solution to this problem.

‗ See subsection 0.2.4 on page 13 or Example 2.2.1 on page 85 and Example 2.2.3 on page 88.
What is going on? We will be interested in finding which constants 𝜆 allow a nonzero
solution, and we will be interested in finding those solutions. This problem is an analogue
of finding eigenvalues and eigenvectors of matrices.

4.1.2 Eigenvalue problems


For basic Fourier series theory, we will need the following three eigenvalue problems:

𝑥 ′′ + 𝜆𝑥 = 0, 𝑥(𝑎) = 0, 𝑥(𝑏) = 0, (4.1)

𝑥 ′′ + 𝜆𝑥 = 0, 𝑥 ′(𝑎) = 0, 𝑥 ′(𝑏) = 0, (4.2)


and
𝑥 ′′ + 𝜆𝑥 = 0, 𝑥(𝑎) = 𝑥(𝑏), 𝑥 ′(𝑎) = 𝑥 ′(𝑏). (4.3)
A number 𝜆 is called an eigenvalue of (4.1) (resp. (4.2) or (4.3)) if and only if there exists a
nonzero (not identically zero) solution to (4.1) (resp. (4.2) or (4.3)) given that specific 𝜆. A
nonzero solution is called a corresponding eigenfunction. We will consider more general
equations and boundary conditions, but we will postpone this until chapter 5.
Note the similarity to eigenvalues and eigenvectors of matrices. The similarity is not
just coincidental. If we think of the equations as differential operators, then we are doing
the same exact thing. Think of a function $x(t)$ as a vector with infinitely many components
(one for each $t$). Let $L = -\frac{d^2}{dt^2}$ be the linear operator. Then the eigenvalue/eigenfunction
pair should be $\lambda$ and nonzero $x$ such that $Lx = \lambda x$. In other words, we are looking for
nonzero functions 𝑥 satisfying certain endpoint conditions that solve (𝐿 − 𝜆)𝑥 = 0. A lot of
the formalism from linear algebra still applies here, though we will not pursue this line of
reasoning too far.
Example 4.1.3: Find the eigenvalues and eigenfunctions of

𝑥 ′′ + 𝜆𝑥 = 0, 𝑥(0) = 0, 𝑥(𝜋) = 0.

We have to handle the cases $\lambda > 0$, $\lambda = 0$, and $\lambda < 0$ separately. First suppose that $\lambda > 0$.
Then the general solution to $x'' + \lambda x = 0$ is
\[
x = A \cos\bigl(\sqrt{\lambda}\, t\bigr) + B \sin\bigl(\sqrt{\lambda}\, t\bigr).
\]
The condition $x(0) = 0$ implies immediately $A = 0$. Next
\[
0 = x(\pi) = B \sin\bigl(\sqrt{\lambda}\, \pi\bigr).
\]
If $B$ is zero, then $x$ is not a nonzero solution. So to get a nonzero solution, we must
have that $\sin\bigl(\sqrt{\lambda}\, \pi\bigr) = 0$. Hence, $\sqrt{\lambda}\, \pi$ must be an integer multiple of $\pi$. In other words,
$\sqrt{\lambda} = k$ for a positive integer $k$. Hence, the positive eigenvalues are $k^2$ for all integers
$k \geq 1$. Corresponding eigenfunctions can be taken as $x = \sin(kt)$. Just like for eigenvectors,
constant multiples of an eigenfunction are also eigenfunctions, so we only need to pick one.
Now suppose that 𝜆 = 0. In this case the equation is 𝑥 ′′ = 0, and its general solution
is 𝑥 = 𝐴𝑡 + 𝐵. The condition 𝑥(0) = 0 implies that 𝐵 = 0, and 𝑥(𝜋) = 0 implies that 𝐴 = 0.
This means that 𝜆 = 0 is not an eigenvalue.
Finally, suppose that $\lambda < 0$. In this case we have the general solution‗
\[
x = A \cosh\bigl(\sqrt{-\lambda}\, t\bigr) + B \sinh\bigl(\sqrt{-\lambda}\, t\bigr).
\]
Letting $x(0) = 0$ implies that $A = 0$ (recall $\cosh 0 = 1$ and $\sinh 0 = 0$). So our solution must
be $x = B \sinh\bigl(\sqrt{-\lambda}\, t\bigr)$ and satisfy $x(\pi) = 0$. This is only possible if $B$ is zero. Why? Because
$\sinh \xi$ is only zero when $\xi = 0$. You should plot sinh to see this fact. We can also see this
from the definition of sinh. We get $0 = \sinh \xi = \frac{e^{\xi} - e^{-\xi}}{2}$. Hence $e^{\xi} = e^{-\xi}$, which implies
$\xi = -\xi$ and that is only true if $\xi = 0$. So there are no negative eigenvalues.


In summary, the eigenvalues and corresponding eigenfunctions are

𝜆𝑘 = 𝑘2 with an eigenfunction 𝑥 𝑘 = sin(𝑘𝑡) for all integers 𝑘 ≥ 1.
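
One way to see these eigenvalues numerically is a shooting-style check, which is a different route from the case analysis above. For $\lambda > 0$, the solution of $x'' + \lambda x = 0$ with $x(0) = 0$ and $x'(0) = 1$ is $x(t) = \sin(\sqrt{\lambda}\, t)/\sqrt{\lambda}$, and $\lambda$ is an eigenvalue exactly when $x(\pi) = 0$. The Python sketch below (our own illustration; the function names, brackets and tolerance are assumptions) locates the zeros of $x(\pi)$ as a function of $\lambda$ by bisection.

```python
import math

def endpoint(lam):
    # Value at t = pi of the solution with x(0) = 0, x'(0) = 1 (valid for lam > 0).
    r = math.sqrt(lam)
    return math.sin(r * math.pi) / r

def bisect(fn, lo, hi, tol=1e-12):
    # Plain bisection; assumes fn changes sign between lo and hi.
    flo = fn(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        fmid = fn(mid)
        if (flo > 0) == (fmid > 0):
            lo, flo = mid, fmid
        else:
            hi = mid
    return (lo + hi) / 2

# Search near k^2 for k = 1, 2, 3; the brackets are chosen by hand.
roots = [bisect(endpoint, k * k - 0.5, k * k + 0.5) for k in (1, 2, 3)]
print("approximate eigenvalues:", roots)
print("endpoint value at lambda = 2:", endpoint(2.0))
```

The computed roots should agree with $1$, $4$, $9$ to within the tolerance, while for $\lambda = 2$ the endpoint value is clearly nonzero, in line with Example 4.1.2.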

Example 4.1.4: Compute the eigenvalues and eigenfunctions of

𝑥 ′′ + 𝜆𝑥 = 0, 𝑥 ′(0) = 0, 𝑥 ′(𝜋) = 0.

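Again we have to handle the cases $\lambda > 0$, $\lambda = 0$, $\lambda < 0$ separately. First suppose that
$\lambda > 0$. The general solution to $x'' + \lambda x = 0$ is $x = A \cos\bigl(\sqrt{\lambda}\, t\bigr) + B \sin\bigl(\sqrt{\lambda}\, t\bigr)$. So
\[
x' = -A \sqrt{\lambda}\, \sin\bigl(\sqrt{\lambda}\, t\bigr) + B \sqrt{\lambda}\, \cos\bigl(\sqrt{\lambda}\, t\bigr).
\]
The condition $x'(0) = 0$ implies immediately $B = 0$. Next
\[
0 = x'(\pi) = -A \sqrt{\lambda}\, \sin\bigl(\sqrt{\lambda}\, \pi\bigr).
\]
Again $A$ cannot be zero if $\lambda$ is to be an eigenvalue, and $\sin\bigl(\sqrt{\lambda}\, \pi\bigr)$ is only zero if $\sqrt{\lambda} = k$
for a positive integer $k$. Hence, the positive eigenvalues are again $k^2$ for all integers $k \geq 1$.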
And the corresponding eigenfunctions can be taken as 𝑥 = cos(𝑘𝑡).
Now suppose that 𝜆 = 0. In this case, the equation is 𝑥 ′′ = 0 and the general solution is
𝑥 = 𝐴𝑡 + 𝐵 so 𝑥 ′ = 𝐴. The condition 𝑥 ′(0) = 0 implies that 𝐴 = 0. The condition 𝑥 ′(𝜋) = 0
also implies 𝐴 = 0. Hence 𝐵 could be anything (let us take it to be 1). So 𝜆 = 0 is an
eigenvalue and $x = 1$ is a corresponding eigenfunction.
Finally, let $\lambda < 0$. In this case, the general solution is $x = A \cosh\bigl(\sqrt{-\lambda}\, t\bigr) + B \sinh\bigl(\sqrt{-\lambda}\, t\bigr)$
and
\[
x' = A \sqrt{-\lambda}\, \sinh\bigl(\sqrt{-\lambda}\, t\bigr) + B \sqrt{-\lambda}\, \cosh\bigl(\sqrt{-\lambda}\, t\bigr).
\]

‗ Recall that $\cosh s = \frac{1}{2}(e^{s} + e^{-s})$ and $\sinh s = \frac{1}{2}(e^{s} - e^{-s})$. As an exercise try the computation with the
general solution written as $x = A e^{\sqrt{-\lambda}\, t} + B e^{-\sqrt{-\lambda}\, t}$ (for different $A$ and $B$ of course).


We have already seen (with roles of 𝐴 and 𝐵 switched) that for this expression to be zero at
𝑡 = 0 and 𝑡 = 𝜋, we must have 𝐴 = 𝐵 = 0. Hence, there are no negative eigenvalues.
In summary, the eigenvalues and corresponding eigenfunctions are

𝜆𝑘 = 𝑘2 with an eigenfunction 𝑥 𝑘 = cos(𝑘𝑡) for all integers 𝑘 ≥ 1,

and there is another eigenvalue

𝜆0 = 0 with an eigenfunction 𝑥0 = 1.

The following problem is the one that leads to the general Fourier series.
Example 4.1.5: Compute the eigenvalues and eigenfunctions of

𝑥 ′′ + 𝜆𝑥 = 0, 𝑥(−𝜋) = 𝑥(𝜋), 𝑥 ′(−𝜋) = 𝑥 ′(𝜋).

We have not specified the values or the derivatives at the endpoints, but rather that they
are the same at the beginning and at the end of the interval.
We skip 𝜆 < 0. The computations are the same as before, and again we find that there
are no negative eigenvalues.
For 𝜆 = 0, the general solution is 𝑥 = 𝐴𝑡 + 𝐵. The condition 𝑥(−𝜋) = 𝑥(𝜋) implies
that 𝐴 = 0 (𝐴𝜋 + 𝐵 = −𝐴𝜋 + 𝐵 implies 𝐴 = 0). The second condition 𝑥 ′(−𝜋) = 𝑥 ′(𝜋) says
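nothing about $B$ and hence $\lambda = 0$ is an eigenvalue with a corresponding eigenfunction
$x = 1$.
For $\lambda > 0$ we get that $x = A \cos\bigl(\sqrt{\lambda}\, t\bigr) + B \sin\bigl(\sqrt{\lambda}\, t\bigr)$. Now
\[
\underbrace{A \cos\bigl(-\sqrt{\lambda}\, \pi\bigr) + B \sin\bigl(-\sqrt{\lambda}\, \pi\bigr)}_{x(-\pi)}
= \underbrace{A \cos\bigl(\sqrt{\lambda}\, \pi\bigr) + B \sin\bigl(\sqrt{\lambda}\, \pi\bigr)}_{x(\pi)} .
\]
We remember that $\cos(-\theta) = \cos(\theta)$ and $\sin(-\theta) = -\sin(\theta)$. Therefore,
\[
A \cos\bigl(\sqrt{\lambda}\, \pi\bigr) - B \sin\bigl(\sqrt{\lambda}\, \pi\bigr)
= A \cos\bigl(\sqrt{\lambda}\, \pi\bigr) + B \sin\bigl(\sqrt{\lambda}\, \pi\bigr).
\]
Hence either $B = 0$ or $\sin\bigl(\sqrt{\lambda}\, \pi\bigr) = 0$. Similarly (exercise) if we differentiate $x$ and plug in
the second condition we find that $A = 0$ or $\sin\bigl(\sqrt{\lambda}\, \pi\bigr) = 0$. Therefore, unless we want $A$
and $B$ to both be zero (which we do not) we must have $\sin\bigl(\sqrt{\lambda}\, \pi\bigr) = 0$. Hence, $\sqrt{\lambda}$ is an
integer and the eigenvalues are yet again $\lambda = k^2$ for an integer $k \geq 1$. In this case, however,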
𝑥 = 𝐴 cos(𝑘𝑡) + 𝐵 sin(𝑘𝑡) is an eigenfunction for any 𝐴 and any 𝐵. So we have two linearly
independent eigenfunctions sin(𝑘𝑡) and cos(𝑘𝑡). Remember that for a matrix, we can also
have two eigenvectors corresponding to a single eigenvalue if the eigenvalue is repeated.
In summary, the eigenvalues and corresponding eigenfunctions are

𝜆𝑘 = 𝑘2 with eigenfunctions cos(𝑘𝑡) and sin(𝑘𝑡) for all integers 𝑘 ≥ 1,


𝜆0 = 0 with an eigenfunction 𝑥 0 = 1.

4.1.3 Orthogonality of eigenfunctions


Something that will be very useful in the next section is the orthogonality property of the
eigenfunctions. This is an analogue of the following fact about eigenvectors of a matrix. A
matrix is called symmetric if 𝐴 = 𝐴𝑇 (it is equal to its transpose). Eigenvectors for two distinct
eigenvalues of a symmetric matrix are orthogonal. The differential operators we are dealing
with act much like a symmetric matrix. We, therefore, get the following theorem.

Theorem 4.1.1. Suppose that 𝑥1 (𝑡) and 𝑥2 (𝑡) are two eigenfunctions of the problem (4.1), (4.2), or
(4.3) for two different eigenvalues $\lambda_1$ and $\lambda_2$. Then they are orthogonal in the sense that
\[
\int_a^b x_1(t) \, x_2(t) \, dt = 0.
\]

The terminology comes from the fact that the integral is a type of inner product. We will
expand on this in the next section. The theorem has a very short, elegant, and illuminating
proof so we give it here. First, we have the following two equations.

𝑥1′′ + 𝜆1 𝑥1 = 0 and 𝑥2′′ + 𝜆2 𝑥2 = 0.

Multiply the first by 𝑥2 and the second by 𝑥1 and subtract to get

(𝜆1 − 𝜆2 )𝑥1 𝑥2 = 𝑥2′′ 𝑥1 − 𝑥 2 𝑥1′′ .

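Integrate both sides of the equation:
\[
\begin{aligned}
(\lambda_1 - \lambda_2) \int_a^b x_1(t) \, x_2(t) \, dt
&= \int_a^b \Bigl( x_2''(t) \, x_1(t) - x_2(t) \, x_1''(t) \Bigr) \, dt \\
&= \int_a^b \frac{d}{dt} \Bigl( x_2'(t) \, x_1(t) - x_2(t) \, x_1'(t) \Bigr) \, dt \\
&= \Bigl[ x_2'(t) \, x_1(t) - x_2(t) \, x_1'(t) \Bigr]_{t=a}^{b} = 0.
\end{aligned}
\]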

The last equality holds because of the boundary conditions. For example, if we consider
(4.1) we have 𝑥1 (𝑎) = 𝑥 1 (𝑏) = 𝑥 2 (𝑎) = 𝑥 2 (𝑏) = 0 and so 𝑥2′ 𝑥1 − 𝑥 2 𝑥1′ is zero at both 𝑎 and 𝑏.
As 𝜆1 ≠ 𝜆2 , the theorem follows.

Exercise 4.1.1 (easy): Finish the proof of the theorem (check the last equality in the proof) for the
cases (4.2) and (4.3).

The function sin(𝑛𝑡) is an eigenfunction for the problem 𝑥 ′′ + 𝜆𝑥 = 0, 𝑥(0) = 0, 𝑥(𝜋) = 0.


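Hence for positive integers $n$ and $m$, we have the integrals
\[
\int_0^{\pi} \sin(mt) \sin(nt) \, dt = 0, \quad \text{when } m \neq n.
\]
Similarly, still assuming that $m$ and $n$ are positive integers,
\[
\int_0^{\pi} \cos(mt) \cos(nt) \, dt = 0, \quad \text{when } m \neq n,
\qquad \text{and} \qquad
\int_0^{\pi} \cos(nt) \, dt = 0.
\]
Finally, we also get
\[
\int_{-\pi}^{\pi} \sin(mt) \sin(nt) \, dt = 0, \quad \text{when } m \neq n,
\qquad \text{and} \qquad
\int_{-\pi}^{\pi} \sin(nt) \, dt = 0,
\]
\[
\int_{-\pi}^{\pi} \cos(mt) \cos(nt) \, dt = 0, \quad \text{when } m \neq n,
\qquad \text{and} \qquad
\int_{-\pi}^{\pi} \cos(nt) \, dt = 0,
\]
and
\[
\int_{-\pi}^{\pi} \cos(mt) \sin(nt) \, dt = 0 \quad \text{(even if } m = n \text{)}.
\]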
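
These orthogonality relations are easy to test numerically. The following Python sketch (an illustration of ours, not from the text) approximates the integrals with the composite Simpson's rule; the helper names `simpson` and `inner` and the number of subintervals are arbitrary choices.

```python
import math

def simpson(fn, a, b, n=2000):
    # Composite Simpson's rule on [a, b]; n must be even.
    h = (b - a) / n
    total = fn(a) + fn(b)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * fn(a + j * h)
    return total * h / 3

def inner(u, v, a, b):
    # The integral of u(t) v(t) over [a, b], playing the role of an inner product.
    return simpson(lambda t: u(t) * v(t), a, b)

sin2 = lambda t: math.sin(2 * t)
sin3 = lambda t: math.sin(3 * t)
cos2 = lambda t: math.cos(2 * t)
cos3 = lambda t: math.cos(3 * t)

print("sin(2t), sin(3t) on [0, pi]:  ", inner(sin2, sin3, 0.0, math.pi))
print("cos(2t), cos(3t) on [0, pi]:  ", inner(cos2, cos3, 0.0, math.pi))
print("cos(3t), sin(3t) on [-pi, pi]:", inner(cos3, sin3, -math.pi, math.pi))
print("sin(2t), sin(2t) on [0, pi]:  ", inner(sin2, sin2, 0.0, math.pi))
```

The first three values should be zero up to the quadrature error. The last one is not zero, since a nonzero function is never orthogonal to itself; its value is $\pi/2$.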

4.1.4 Fredholm alternative


We now touch on a very useful theorem in the theory of differential equations. The theorem
holds in a more general setting than we are going to state it, but for our purposes the
following statement is sufficient. We will give a slightly more general version in chapter 5.
Theorem 4.1.2 (Fredholm alternative‗). Exactly one of the following statements holds. Either
\[
x'' + \lambda x = 0, \quad x(a) = 0, \quad x(b) = 0 \tag{4.4}
\]
has a nonzero solution, or
\[
x'' + \lambda x = f(t), \quad x(a) = 0, \quad x(b) = 0 \tag{4.5}
\]
has a unique solution for every function $f$ continuous on $[a, b]$.
The theorem is also true for the other types of boundary conditions we considered. The
theorem means that if 𝜆 is not an eigenvalue, the nonhomogeneous equation (4.5) has a
unique solution for every right-hand side. On the other hand if 𝜆 is an eigenvalue, then
(4.5) need not have a solution for every 𝑓 , and furthermore, even if it happens to have a
solution, the solution is not unique.
We also want to reinforce the idea here that linear differential operators have much
in common with matrices. So it is no surprise that there is a finite-dimensional version
of Fredholm alternative for matrices as well. Let 𝐴 be an 𝑛 × 𝑛 matrix. The Fredholm
alternative then states that either $(A - \lambda I) \vec{x} = \vec{0}$ has a nontrivial solution, or $(A - \lambda I) \vec{x} = \vec{b}$
has a unique solution for every $\vec{b}$.
A lot of intuition from linear algebra can be applied to linear differential operators, but
one must be careful of course. For example, one difference we have already seen is that in
general a differential operator will have infinitely many eigenvalues, while a matrix has
only finitely many.
‗ Named after the Swedish mathematician Erik Ivar Fredholm (1866–1927).

4.1.5 Application
Let us consider a physical application of an endpoint problem. Suppose we have a tightly
stretched, quickly spinning elastic string or rope of uniform linear density $\rho$, for example in
kg/m. Let us put this problem into the $xy$-plane, where both $x$ and $y$ are in meters. The $x$-axis
represents the position on the string. The string rotates at angular velocity $\omega$, in radians/s.
Imagine that the whole $xy$-plane rotates at angular velocity $\omega$. This way, the string stays in
this $xy$-plane and $y$ measures its deflection from the equilibrium position, $y = 0$, on the
$x$-axis. Hence the graph of $y$ gives the shape of the string. We consider an ideal string with
no volume, just a mathematical curve. We suppose the tension on the string is a constant $T$
in newtons. Assuming that the deflection is small, we can use Newton’s second law (let us
skip the derivation) to get the equation

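$$T y'' + \rho \omega^2 y = 0.$$

To check the units, notice that the units of $y''$ are m/m$^2$ = 1/m, as the derivative is in terms of $x$.
So $T y''$ has units N/m = kg/s$^2$, which matches the units of $\rho \omega^2 y$, namely
(kg/m)(1/s$^2$)(m) = kg/s$^2$.
Let $L$ be the length of the string (in meters), and suppose the string is fixed at the beginning and
end points. Hence, $y(0) = 0$ and $y(L) = 0$. See Figure 4.1.

Figure 4.1: Whirling string.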

We rewrite the equation as $y'' + \frac{\rho \omega^2}{T} y = 0$. The setup is similar to Example 4.1.3
on page 190, except for the interval length being $L$ instead of $\pi$. We are looking for
eigenvalues of $y'' + \lambda y = 0$, $y(0) = 0$, $y(L) = 0$ where $\lambda = \frac{\rho \omega^2}{T}$. As before there are
no nonpositive eigenvalues. With $\lambda > 0$, the general solution to the equation is
$y = A \cos\bigl(\sqrt{\lambda}\, x\bigr) + B \sin\bigl(\sqrt{\lambda}\, x\bigr)$. The condition $y(0) = 0$ implies that $A = 0$ as before. The
condition $y(L) = 0$ implies that $\sin\bigl(\sqrt{\lambda}\, L\bigr) = 0$ and hence $\sqrt{\lambda}\, L = k\pi$ for some integer $k > 0$,
so
$$\frac{\rho \omega^2}{T} = \lambda = \frac{k^2 \pi^2}{L^2} .$$
What does this say about the shape of the string? It says that for all parameters $\rho$, $\omega$, $T$
not satisfying the equation above, the string is in the equilibrium position, $y = 0$. When
$\frac{\rho \omega^2}{T} = \frac{k^2 \pi^2}{L^2}$, then the string will “pop out” some distance $B$. We cannot compute $B$ with the
information we have.

Let us assume that $\rho$ and $T$ are fixed and we are changing $\omega$. For most values of $\omega$, the
string is in the equilibrium state. When the angular velocity $\omega$ hits a value
$\omega = \frac{k\pi}{L} \sqrt{\frac{T}{\rho}}$, then
the string pops out and has the shape of a sine wave crossing the $x$-axis $k - 1$ times between
the end points. For example, at 𝑘 = 1, the string does not cross the 𝑥-axis and the shape
looks like in Figure 4.1 on the previous page. On the other hand, when 𝑘 = 3 the string
crosses the 𝑥-axis 2 times, see Figure 4.2. When 𝜔 changes again, the string returns to the
equilibrium position. The higher the angular velocity, the more times it crosses the 𝑥-axis
when it is popped out.

Figure 4.2: Whirling string at the third eigenvalue (𝑘 = 3).

For another example, if you have a spinning jump rope (then 𝑘 = 1 as it is completely
“popped out”) and you pull on the ends to increase the tension, then the angular velocity
must also increase for the rope to stay “popped out”.
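
As a small numerical illustration of the relation $\omega = \frac{k\pi}{L}\sqrt{T/\rho}$ derived above, the
following Python sketch tabulates the first few critical angular velocities. The function name
and the sample values of length, density, and tension are hypothetical and chosen only for
illustration.

```python
import math


def critical_angular_velocity(k, length, density, tension):
    """Angular velocity (rad/s) at which the k-th mode of the string pops out.

    Uses omega = (k * pi / L) * sqrt(T / rho) with L in m, rho in kg/m, T in N.
    """
    return k * math.pi / length * math.sqrt(tension / density)


# Hypothetical string: 1 m long, 1 kg/m, under 4 N of tension.
for k in (1, 2, 3):
    print(k, critical_angular_velocity(k, 1.0, 1.0, 4.0))
```

Raising the tension in this sketch raises every critical value, which matches the jump rope
observation: a tighter rope must spin faster to stay popped out.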

4.1.6 Exercises
Hint for the following exercises: Note that if $\lambda > 0$, then $\cos\bigl(\sqrt{\lambda}\,(t - a)\bigr)$ and $\sin\bigl(\sqrt{\lambda}\,(t - a)\bigr)$
are also solutions of the homogeneous equation.

Exercise 4.1.2: Compute all eigenvalues and eigenfunctions of 𝑥 ′′ + 𝜆𝑥 = 0, 𝑥(𝑎) = 0, 𝑥(𝑏) = 0
(assume 𝑎 < 𝑏).

Exercise 4.1.3: Compute all eigenvalues and eigenfunctions of 𝑥 ′′ + 𝜆𝑥 = 0, 𝑥 ′(𝑎) = 0, 𝑥 ′(𝑏) = 0
(assume 𝑎 < 𝑏).

Exercise 4.1.4: Compute all eigenvalues and eigenfunctions of 𝑥 ′′ + 𝜆𝑥 = 0, 𝑥 ′(𝑎) = 0, 𝑥(𝑏) = 0
(assume 𝑎 < 𝑏).

Exercise 4.1.5: Compute all eigenvalues and eigenfunctions of 𝑥 ′′ + 𝜆𝑥 = 0, 𝑥(𝑎) = 𝑥(𝑏),
𝑥 ′(𝑎) = 𝑥 ′(𝑏) (assume 𝑎 < 𝑏).

Exercise 4.1.6: We skipped the case of 𝜆 < 0 for the boundary value problem 𝑥 ′′ + 𝜆𝑥 = 0,
𝑥(−𝜋) = 𝑥(𝜋), 𝑥 ′(−𝜋) = 𝑥 ′(𝜋). Finish the calculation and show that there are no negative
eigenvalues.

Exercise 4.1.101: Consider a spinning string of length 2 and linear density 0.1 and tension 3. Find
the smallest angular velocity at which the string pops out.

Exercise 4.1.102: Suppose 𝑥 ′′ + 𝜆𝑥 = 0 and 𝑥(0) = 1, 𝑥(1) = 1. Find all 𝜆 for which there is more
than one solution. Also find the corresponding solutions (only for the eigenvalues).

Exercise 4.1.103: Suppose 𝑥 ′′ + 𝑥 = 0 and 𝑥(0) = 0, 𝑥 ′(𝜋) = 1. Find all the solution(s) if any
exist.

Exercise 4.1.104: Consider 𝑥 ′ + 𝜆𝑥 = 0 and 𝑥(0) = 0, 𝑥(1) = 0. Why does it not have any
eigenvalues? Why does every first-order equation with two endpoint conditions such as above have
no eigenvalues?

Exercise 4.1.105 (challenging): Suppose $x''' + \lambda x = 0$ and $x(0) = 0$, $x'(0) = 0$, $x(1) = 0$.
Suppose that $\lambda > 0$. Find an equation that all such eigenvalues must satisfy. Hint: Note that
$-\sqrt[3]{\lambda}$ is a root of $r^3 + \lambda = 0$.

4.2 The trigonometric series


Note: 2 lectures, §9.1 in [EP], §10.2 in [BD]

4.2.1 Periodic functions and motivation


As motivation for studying Fourier series, consider the problem

$$x'' + \omega_0^2 x = f(t), \tag{4.6}$$

for some periodic function $f(t)$. In § 2.6, we found the general solution to

$$x'' + \omega_0^2 x = F_0 \cos(\omega t). \tag{4.7}$$

One way to solve (4.6) is to decompose 𝑓 (𝑡) as a sum of cosines (and sines) and then solve
many problems of the form (4.7). We then use the principle of superposition to add up all
of these solutions and obtain a solution to (4.6).
Before we proceed, let us talk a little bit more in detail about periodic functions. A
function is said to be periodic with period 𝑃 if 𝑓 (𝑡) = 𝑓 (𝑡 + 𝑃) for all 𝑡. For brevity we say
𝑓 (𝑡) is 𝑃-periodic. Note that a 𝑃-periodic function is also 2𝑃-periodic, 3𝑃-periodic and
so on. For example, cos(𝑡) and sin(𝑡) are 2𝜋-periodic. So are cos(𝑘𝑡) and sin(𝑘𝑡) for all
integers 𝑘. The constant functions are an extreme example. They are periodic for any
period (exercise).
Normally we start with a function 𝑓 (𝑡) defined on some interval [−𝐿, 𝐿], and we want to
extend 𝑓 (𝑡) periodically to make it a 2𝐿-periodic function. We do this extension by defining
a new function 𝐹(𝑡) such that for 𝑡 in [−𝐿, 𝐿], 𝐹(𝑡) = 𝑓 (𝑡). For 𝑡 in [𝐿, 3𝐿], we define
𝐹(𝑡) = 𝑓 (𝑡 − 2𝐿), for 𝑡 in [−3𝐿, −𝐿], 𝐹(𝑡) = 𝑓 (𝑡 + 2𝐿), and so on. To make that work we
needed 𝑓 (−𝐿) = 𝑓 (𝐿). We could have also started with 𝑓 defined only on the half-open
interval (−𝐿, 𝐿] and then define 𝑓 (−𝐿) = 𝑓 (𝐿).
Example 4.2.1: Define $f(t) = 1 - t^2$ on $[-1, 1]$. Extend $f(t)$ periodically to a 2-periodic
function. For $1 \leq t \leq 3$, we get $f(t) = 1 - (t - 2)^2$. For $-3 \leq t \leq -1$, we get $f(t) = 1 - (t + 2)^2$.
For $3 \leq t \leq 5$, we get $f(t) = 1 - (t - 4)^2$. And so on. See Figure 4.3.
You should be careful to distinguish between 𝑓 (𝑡) and its extension. A common mistake
is to assume that a formula for 𝑓 (𝑡) holds for its extension. It can be confusing when the
formula for 𝑓 (𝑡) is periodic, but with perhaps a different period.
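
The distinction between a formula and its periodic extension can be sketched in a few lines
of Python. The helper name `periodic_extension` is our own, and we assume $f$ is given on
$(-L, L]$ as described above.

```python
def periodic_extension(f, L):
    """Return the 2L-periodic extension F of a function f given on (-L, L]."""
    def F(t):
        # Shift t by a multiple of 2L so that the result lies in (-L, L].
        s = L - ((L - t) % (2 * L))
        return f(s)
    return F


def f(t):
    return 1 - t ** 2


F = periodic_extension(f, 1.0)
print(F(2.5), f(2.5))  # prints 0.75 -5.25
```

The two printed values differ because the formula $1 - t^2$ is only valid on $[-1, 1]$. Outside
that interval the extension uses a shifted copy of the formula.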
Exercise 4.2.1: Define 𝑓 (𝑡) = cos 𝑡 on [−𝜋/2, 𝜋/2]. Take the 𝜋-periodic extension and sketch its
graph. How does it compare to the graph of cos 𝑡?

4.2.2 Inner product and eigenvector decomposition


Suppose $A$ is a symmetric matrix, that is, $A^T = A$. As we remarked before, eigenvectors of $A$
are then orthogonal. Here the word orthogonal means that if $\vec{v}$ and $\vec{w}$ are two eigenvectors
of $A$ for distinct eigenvalues, then $\langle \vec{v}, \vec{w} \rangle = 0$. In this case, the inner product $\langle \vec{v}, \vec{w} \rangle$ is the
dot product, which can be computed as $\vec{v}^T \vec{w}$.

Figure 4.3: Periodic extension of the function $1 - t^2$.

To decompose a vector $\vec{v}$ in terms of mutually orthogonal vectors $\vec{w}_1$ and $\vec{w}_2$, we write

$$\vec{v} = a_1 \vec{w}_1 + a_2 \vec{w}_2 .$$

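Let us find the formula for $a_1$ and $a_2$. We compute,

$$\langle \vec{v}, \vec{w}_1 \rangle
= \langle a_1 \vec{w}_1 + a_2 \vec{w}_2 , \vec{w}_1 \rangle
= a_1 \langle \vec{w}_1 , \vec{w}_1 \rangle + a_2 \underbrace{\langle \vec{w}_2 , \vec{w}_1 \rangle}_{=0}
= a_1 \langle \vec{w}_1 , \vec{w}_1 \rangle .$$

Therefore,
$$a_1 = \frac{\langle \vec{v}, \vec{w}_1 \rangle}{\langle \vec{w}_1, \vec{w}_1 \rangle} .$$
Similarly,
$$a_2 = \frac{\langle \vec{v}, \vec{w}_2 \rangle}{\langle \vec{w}_2, \vec{w}_2 \rangle} .$$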
You probably remember this formula from vector calculus.
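Example 4.2.2: Write $\vec{v} = \begin{bmatrix} 2 \\ 3 \end{bmatrix}$ as a linear combination of
$\vec{w}_1 = \begin{bmatrix} 1 \\ -1 \end{bmatrix}$ and $\vec{w}_2 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$.
Note that $\vec{w}_1$ and $\vec{w}_2$ are orthogonal as $\langle \vec{w}_1, \vec{w}_2 \rangle = 1(1) + (-1)1 = 0$. Then

$$\begin{aligned}
a_1 &= \frac{\langle \vec{v}, \vec{w}_1 \rangle}{\langle \vec{w}_1, \vec{w}_1 \rangle} = \frac{2(1) + 3(-1)}{1(1) + (-1)(-1)} = \frac{-1}{2} , \\
a_2 &= \frac{\langle \vec{v}, \vec{w}_2 \rangle}{\langle \vec{w}_2, \vec{w}_2 \rangle} = \frac{2 + 3}{1 + 1} = \frac{5}{2} .
\end{aligned}$$
Hence,
$$\begin{bmatrix} 2 \\ 3 \end{bmatrix} = \frac{-1}{2} \begin{bmatrix} 1 \\ -1 \end{bmatrix} + \frac{5}{2} \begin{bmatrix} 1 \\ 1 \end{bmatrix} .$$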
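
The same projection formula is easy to check by machine. The following is a minimal Python
sketch. The helper names `dot` and `decompose` are our own, and the code assumes the basis
vectors passed in are mutually orthogonal.

```python
def dot(u, v):
    """Dot product of two vectors given as lists of numbers."""
    return sum(a * b for a, b in zip(u, v))


def decompose(v, basis):
    """Coefficients a_i = <v, w_i> / <w_i, w_i> for mutually orthogonal w_i."""
    return [dot(v, w) / dot(w, w) for w in basis]


v = [2, 3]
w1 = [1, -1]
w2 = [1, 1]
a1, a2 = decompose(v, [w1, w2])
print(a1, a2)  # prints -0.5 2.5
```

If the vectors in `basis` were not orthogonal, the cross terms would not vanish and these
coefficients would no longer reproduce $\vec{v}$.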

4.2.3 The trigonometric series


Instead of decomposing a vector in terms of eigenvectors of a matrix, we decompose
a function in terms of eigenfunctions of a certain eigenvalue problem. The eigenvalue
problem we use for the Fourier series is

𝑥 ′′ + 𝜆𝑥 = 0, 𝑥(−𝜋) = 𝑥(𝜋), 𝑥 ′(−𝜋) = 𝑥 ′(𝜋).

We computed that eigenfunctions are 1, cos(𝑘𝑡), sin(𝑘𝑡). That is, we want to find a
representation of a 2𝜋-periodic function 𝑓 (𝑡) as

$$\begin{aligned}
f(t) &= \frac{a_0}{2} + \sum_{n=1}^\infty a_n \cos(nt) + b_n \sin(nt) \\
&= \frac{a_0}{2} + a_1 \cos(t) + b_1 \sin(t) + a_2 \cos(2t) + b_2 \sin(2t) + \cdots
\end{aligned}$$

This series is called the Fourier series‗ or the trigonometric series for $f(t)$. The term
$a_n \cos(nt) + b_n \sin(nt)$ is sometimes called the $n^{\text{th}}$ harmonic. We write the coefficient of the
eigenfunction 1 as $\frac{a_0}{2}$ for convenience. We could also think of $1 = \cos(0t)$, so that we only
need to look at $\cos(kt)$ and $\sin(kt)$.
As for matrices, we want to find a projection of $f(t)$ onto the subspaces given by the
eigenfunctions. So we want to define an inner product of functions. For example, to find $a_n$,
we want to compute $\langle f(t), \cos(nt) \rangle$. We define the inner product as
$$\langle f(t), g(t) \rangle \overset{\text{def}}{=} \int_{-\pi}^{\pi} f(t)\, g(t)\, dt .$$

With this inner product, we saw in the previous section that the eigenfunctions cos(𝑘𝑡)
(including the constant eigenfunction), and sin(𝑘𝑡) are orthogonal, that is,

⟨ cos(𝑚𝑡) , cos(𝑛𝑡) ⟩ = 0 for 𝑚 ≠ 𝑛,


⟨ sin(𝑚𝑡) , sin(𝑛𝑡) ⟩ = 0 for 𝑚 ≠ 𝑛,
⟨ sin(𝑚𝑡) , cos(𝑛𝑡) ⟩ = 0 for all 𝑚 and 𝑛.

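For $n = 1, 2, 3, \ldots$, we have
$$\begin{aligned}
\langle \cos(nt), \cos(nt) \rangle &= \int_{-\pi}^{\pi} \cos(nt) \cos(nt)\, dt = \pi , \\
\langle \sin(nt), \sin(nt) \rangle &= \int_{-\pi}^{\pi} \sin(nt) \sin(nt)\, dt = \pi ,
\end{aligned}$$

by elementary calculus. For the constant, we get

$$\langle 1, 1 \rangle = \int_{-\pi}^{\pi} 1 \cdot 1\, dt = 2\pi .$$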
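
These inner products can also be checked numerically. Below is a minimal Python sketch that
approximates the integral with the midpoint rule. The names `inner`, `cos_k`, and `sin_k` and
the number of steps are our own choices, and the quadrature is an approximation rather than
an exact evaluation.

```python
import math


def inner(f, g, n_steps=20000):
    """Approximate <f, g>, the integral of f(t) g(t) over [-pi, pi] (midpoint rule)."""
    h = 2 * math.pi / n_steps
    total = 0.0
    for i in range(n_steps):
        t = -math.pi + (i + 0.5) * h
        total += f(t) * g(t)
    return total * h


def cos_k(k):
    """Return the function t -> cos(k t)."""
    return lambda t: math.cos(k * t)


def sin_k(k):
    """Return the function t -> sin(k t)."""
    return lambda t: math.sin(k * t)


print(inner(cos_k(2), cos_k(3)))  # distinct eigenfunctions, close to 0
print(inner(sin_k(3), sin_k(3)))  # close to pi
print(inner(cos_k(0), cos_k(0)))  # the constant 1 with itself, close to 2*pi
```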
‗ Named after the French mathematician Jean Baptiste Joseph Fourier (1768–1830).

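The coefficients are given by

$$\begin{aligned}
a_n &= \frac{\langle f(t), \cos(nt) \rangle}{\langle \cos(nt), \cos(nt) \rangle} = \frac{1}{\pi} \int_{-\pi}^{\pi} f(t) \cos(nt)\, dt , \\
b_n &= \frac{\langle f(t), \sin(nt) \rangle}{\langle \sin(nt), \sin(nt) \rangle} = \frac{1}{\pi} \int_{-\pi}^{\pi} f(t) \sin(nt)\, dt .
\end{aligned}$$

Compare these expressions with the finite-dimensional example. For $a_0$, we get a similar
formula
$$a_0 = 2 \frac{\langle f(t), 1 \rangle}{\langle 1, 1 \rangle} = \frac{1}{\pi} \int_{-\pi}^{\pi} f(t)\, dt .$$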
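Let us check the formulas via the orthogonality properties. Suppose for a moment that

$$f(t) = \frac{a_0}{2} + \sum_{n=1}^\infty a_n \cos(nt) + b_n \sin(nt) .$$

Then for $m \geq 1$, we have

$$\begin{aligned}
\langle f(t), \cos(mt) \rangle
&= \Bigl\langle \frac{a_0}{2} + \sum_{n=1}^\infty a_n \cos(nt) + b_n \sin(nt) , \cos(mt) \Bigr\rangle \\
&= \frac{a_0}{2} \langle 1, \cos(mt) \rangle + \sum_{n=1}^\infty a_n \langle \cos(nt), \cos(mt) \rangle + b_n \langle \sin(nt), \cos(mt) \rangle \\
&= a_m \langle \cos(mt), \cos(mt) \rangle .
\end{aligned}$$
Hence, $a_m = \frac{\langle f(t), \cos(mt) \rangle}{\langle \cos(mt), \cos(mt) \rangle}$.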
Exercise 4.2.2: Carry out the calculation for 𝑎0 and 𝑏 𝑚 .
Example 4.2.3: Take the function
𝑓 (𝑡) = 𝑡
for 𝑡 in (−𝜋, 𝜋]. Extend 𝑓 (𝑡) periodically and write it as a Fourier series. This function is
called the sawtooth, and it finds many applications, for example, in electronic music.
The plot of the extended periodic function is given in Figure 4.4 on the following page.
Let us compute the coefficients. We start with $a_0$,
$$a_0 = \frac{1}{\pi} \int_{-\pi}^{\pi} t\, dt = 0 .$$

We will often use the result from calculus that says that the integral of an odd function
over a symmetric interval is zero. Recall that an odd function is a function $\varphi(t)$ such that
$\varphi(-t) = -\varphi(t)$. For example the functions $t$, $\sin t$, or (importantly for us) $t \cos(nt)$ are all
odd functions. Thus
$$a_n = \frac{1}{\pi} \int_{-\pi}^{\pi} t \cos(nt)\, dt = 0 .$$
Figure 4.4: The graph of the sawtooth function.

We move to $b_n$. Another useful fact from calculus is that the integral of an even function
over a symmetric interval is twice the integral of the same function over half the interval.
An even function is a function $\varphi(t)$ such that $\varphi(-t) = \varphi(t)$. For example, $t \sin(nt)$ is even.
$$\begin{aligned}
b_n &= \frac{1}{\pi} \int_{-\pi}^{\pi} t \sin(nt)\, dt \\
&= \frac{2}{\pi} \int_{0}^{\pi} t \sin(nt)\, dt \\
&= \frac{2}{\pi} \left( \left[ \frac{-t \cos(nt)}{n} \right]_{t=0}^{\pi} + \frac{1}{n} \int_{0}^{\pi} \cos(nt)\, dt \right) \\
&= \frac{2}{\pi} \left( \frac{-\pi \cos(n\pi)}{n} + 0 \right) \\
&= \frac{-2 \cos(n\pi)}{n} = \frac{2\, (-1)^{n+1}}{n} .
\end{aligned}$$
We have used that
$$\cos(n\pi) = (-1)^n = \begin{cases} 1 & \text{if $n$ even,} \\ -1 & \text{if $n$ odd.} \end{cases}$$
The series, therefore, is
$$\sum_{n=1}^\infty \frac{2\, (-1)^{n+1}}{n} \sin(nt) .$$
More explicitly, the first 3 harmonics of the series for $f(t)$ are
$$2 \sin(t) - \sin(2t) + \frac{2}{3} \sin(3t) + \cdots$$
The plot of these first three terms of the series, along with a plot of the first 20 terms is
given in Figure 4.5.
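
The closed form $b_n = 2(-1)^{n+1}/n$ can be compared against a direct numerical integration.
The following Python sketch is one way to do so. The function name `fourier_coefficients`,
the midpoint rule, and the step count are our own assumptions, so the printed numbers are
approximations.

```python
import math


def fourier_coefficients(f, n, n_steps=20000):
    """Approximate (a_n, b_n) for f given on (-pi, pi], using the midpoint rule."""
    h = 2 * math.pi / n_steps
    a = 0.0
    b = 0.0
    for i in range(n_steps):
        t = -math.pi + (i + 0.5) * h
        a += f(t) * math.cos(n * t)
        b += f(t) * math.sin(n * t)
    # The factor 1/pi comes from <cos(nt), cos(nt)> = <sin(nt), sin(nt)> = pi.
    return a * h / math.pi, b * h / math.pi


def sawtooth(t):
    """The sawtooth f(t) = t on (-pi, pi]."""
    return t


for n in range(1, 4):
    a_n, b_n = fourier_coefficients(sawtooth, n)
    exact = 2 * (-1) ** (n + 1) / n
    print(n, a_n, b_n, exact)
```

Only one period is integrated, so the function passed in needs to be correct on $(-\pi, \pi]$
and nowhere else.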
Figure 4.5: First 3 (left graph) and 20 (right graph) harmonics of the sawtooth function.

Example 4.2.4: Take the function
$$f(t) = \begin{cases} 0 & \text{if } -\pi < t \leq 0, \\ \pi & \text{if } 0 < t \leq \pi. \end{cases}$$
Extend 𝑓 (𝑡) periodically and write it as a Fourier series. This function or its variants appear
often in applications and the function is called the square wave. It is a signal generated by
simply periodically flipping a switch on or off.

Figure 4.6: The graph of the square wave function.

The plot of the extended periodic function is given in Figure 4.6. Now we compute the
coefficients. We start with $a_0$
$$a_0 = \frac{1}{\pi} \int_{-\pi}^{\pi} f(t)\, dt = \frac{1}{\pi} \int_{0}^{\pi} \pi\, dt = \pi .$$

Next,
$$a_n = \frac{1}{\pi} \int_{-\pi}^{\pi} f(t) \cos(nt)\, dt = \frac{1}{\pi} \int_{0}^{\pi} \pi \cos(nt)\, dt = 0 .$$

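And finally,
$$\begin{aligned}
b_n &= \frac{1}{\pi} \int_{-\pi}^{\pi} f(t) \sin(nt)\, dt \\
&= \frac{1}{\pi} \int_{0}^{\pi} \pi \sin(nt)\, dt \\
&= \left[ \frac{-\cos(nt)}{n} \right]_{t=0}^{\pi} \\
&= \frac{1 - \cos(\pi n)}{n} = \frac{1 - (-1)^n}{n} =
\begin{cases} \frac{2}{n} & \text{if $n$ is odd,} \\ 0 & \text{if $n$ is even.} \end{cases}
\end{aligned}$$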

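The Fourier series is

$$\frac{\pi}{2} + \sum_{\substack{n=1 \\ n \text{ odd}}}^\infty \frac{2}{n} \sin(nt)
= \frac{\pi}{2} + \sum_{k=1}^\infty \frac{2}{2k-1} \sin\bigl( (2k-1)\, t \bigr) .$$

The first 3 harmonics of the series for $f(t)$ are

$$\frac{\pi}{2} + 2 \sin(t) + \frac{2}{3} \sin(3t) + \cdots$$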

The plot of these first three and also of the first 20 terms of the series is given in Figure 4.7.

Figure 4.7: First 3 (left graph) and 20 (right graph) harmonics of the square wave function.

We have so far skirted the issue of convergence. For example, if $f(t)$ is the square wave
function, the equation

$$f(t) = \frac{\pi}{2} + \sum_{k=1}^\infty \frac{2}{2k-1} \sin\bigl( (2k-1)\, t \bigr)$$

is only an equality for such $t$ where $f(t)$ is continuous. We do not get an equality for
$t = -\pi, 0, \pi$ and all the other discontinuities of $f(t)$. It is not hard to see that when $t$ is an
integer multiple of $\pi$ (which gives all the discontinuities), then $\sin\bigl( (2k-1)\, t \bigr) = 0$ and so

$$\frac{\pi}{2} + \sum_{k=1}^\infty \frac{2}{2k-1} \sin\bigl( (2k-1)\, t \bigr) = \frac{\pi}{2} .$$

We redefine $f(t)$ on $[-\pi, \pi]$ as

$$f(t) = \begin{cases}
0 & \text{if } -\pi < t < 0, \\
\pi & \text{if } 0 < t < \pi, \\
\pi/2 & \text{if } t = -\pi,\ t = 0, \text{ or } t = \pi,
\end{cases}$$

and extend periodically. The series equals this new extended 𝑓 (𝑡) everywhere, including
the discontinuities. We will generally not worry about changing the function values at
several (finitely many) points.
We will say more about convergence in the next section. Roughly speaking, if 𝑓 (𝑡) is a
nice enough function then the series converges to 𝑓 (𝑡) wherever 𝑓 (𝑡) is continuous. Let us,
however, briefly mention an effect of the discontinuity. Zoom in near the discontinuity in
the square wave. Further, plot the first 100 harmonics, see Figure 4.8 on the following page.
While the series is a very good approximation away from the discontinuities, the error (the
overshoot) near the discontinuity at 𝑡 = 𝜋 does not seem to be getting any smaller as we
take more and more harmonics. This behavior is known as the Gibbs phenomenon. The
region where the error is large does get smaller, however, the more terms in the series we
take.
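
The overshoot can be estimated without plotting. The Python sketch below evaluates the
partial sum of the square wave series at $t = \pi/(2M)$, where $M$ is the number of nonzero
terms kept. At that point the derivative of the partial sum vanishes, and it is the first local
maximum to the right of the jump at $t = 0$ (the jump at $t = \pi$ behaves the same way by
periodicity). The function names are our own.

```python
import math


def square_wave_partial_sum(t, terms):
    """pi/2 plus the first `terms` nonzero sine terms of the square wave series."""
    return math.pi / 2 + sum(
        2.0 / (2 * k - 1) * math.sin((2 * k - 1) * t)
        for k in range(1, terms + 1)
    )


def first_peak(terms):
    """Partial sum at t = pi / (2 * terms), its first local maximum right of t = 0."""
    return square_wave_partial_sum(math.pi / (2 * terms), terms)


for terms in (10, 100, 1000):
    # The square wave equals pi just right of the jump, so this is the overshoot.
    print(terms, first_peak(terms) - math.pi)
```

In this sketch the overshoot settles near 0.28, roughly 9 percent of the jump of size $\pi$,
instead of shrinking to zero, while the location of the peak moves toward the discontinuity.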
We can think of a periodic function as a “signal” that is a superposition of many signals
of pure frequency. For example, we could think of the square wave as a tone of certain base
frequency coming through the speaker of your computer. This base frequency is called the
fundamental frequency and that is the “musical note” that you identify this sound as. But the
square wave will also be a superposition of many different pure tones of frequencies that
are multiples of the fundamental frequency. In music, the higher frequencies are called
the overtones. All the frequencies that appear are called the spectrum of the signal. On the
other hand, if the signal were a simple sine wave instead of the square wave, then it is only
the pure tone of fundamental frequency (no overtones). The simplest way to make sound
using a computer is the square wave, and the sound is very different from a pure tone. If
you ever played video games from the 1980s or so, then you heard what square waves
sound like.
Figure 4.8: Gibbs phenomenon in action.

4.2.4 Exercises
Exercise 4.2.3: Suppose 𝑓 (𝑡) is defined on [−𝜋, 𝜋] as sin(5𝑡) + cos(3𝑡). Extend periodically and
compute the Fourier series of 𝑓 (𝑡).

Exercise 4.2.4: Suppose 𝑓 (𝑡) is defined on [−𝜋, 𝜋] as |𝑡|. Extend periodically and compute the
Fourier series of 𝑓 (𝑡).

Exercise 4.2.5: Suppose $f(t)$ is defined on $[-\pi, \pi]$ as $|t|^3$. Extend periodically and compute the
Fourier series of $f(t)$.

Exercise 4.2.6: Suppose $f(t)$ is defined on $(-\pi, \pi]$ as
$$f(t) = \begin{cases} -1 & \text{if } -\pi < t \leq 0, \\ 1 & \text{if } 0 < t \leq \pi. \end{cases}$$
Extend periodically and compute the Fourier series of $f(t)$.

Exercise 4.2.7: Suppose $f(t)$ is defined on $(-\pi, \pi]$ as $t^3$. Extend periodically and compute the
Fourier series of $f(t)$.

Exercise 4.2.8: Suppose $f(t)$ is defined on $[-\pi, \pi]$ as $t^2$. Extend periodically and compute the
Fourier series of $f(t)$.

There is another form of the Fourier series using complex exponentials $e^{int}$ for $n =
\ldots, -2, -1, 0, 1, 2, \ldots$ instead of $\cos(nt)$ and $\sin(nt)$ for positive $n$. This form may be easier
to work with sometimes. It is certainly more compact to write, and there is only one
formula for the coefficients. On the downside, the coefficients are complex numbers.

Exercise 4.2.9: Let

$$f(t) = \frac{a_0}{2} + \sum_{n=1}^\infty a_n \cos(nt) + b_n \sin(nt) .$$

Use Euler’s formula $e^{i\theta} = \cos(\theta) + i \sin(\theta)$ to show that there exist complex numbers $c_m$ such that

$$f(t) = \sum_{m=-\infty}^\infty c_m e^{imt} .$$

Note that the sum now ranges over all the integers including negative ones. Do not worry about
convergence in this calculation. Hint: It may be better to start from the complex exponential form
and write the series as
$$c_0 + \sum_{m=1}^\infty \Bigl( c_m e^{imt} + c_{-m} e^{-imt} \Bigr) .$$

Exercise 4.2.101: Suppose 𝑓 (𝑡) is defined on [−𝜋, 𝜋] as 𝑓 (𝑡) = sin(𝑡). Extend periodically and
compute the Fourier series.

Exercise 4.2.102: Suppose 𝑓 (𝑡) is defined on (−𝜋, 𝜋] as 𝑓 (𝑡) = sin(𝜋𝑡). Extend periodically and
compute the Fourier series.

Exercise 4.2.103: Suppose $f(t)$ is defined on $(-\pi, \pi]$ as $f(t) = \sin^2(t)$. Extend periodically and
compute the Fourier series.

Exercise 4.2.104: Suppose $f(t)$ is defined on $(-\pi, \pi]$ as $f(t) = t^4$. Extend periodically and
compute the Fourier series.

4.3 More on the Fourier series


Note: 2 lectures, §9.2–§9.3 in [EP], §10.3 in [BD]

4.3.1 2𝐿-periodic functions


We computed the Fourier series for a $2\pi$-periodic function, but what about functions of
different periods? Fear not, the computation is a simple case of change of variables.
We just rescale the independent axis. Consider a $2L$-periodic function $f(t)$. The $L$ is called
the half period. Let $s = \frac{\pi}{L} t$. Then the function

$$g(s) = f\Bigl( \frac{L}{\pi} s \Bigr)$$

is $2\pi$-periodic and we know what to do with it. We must also rescale all our sines and
cosines. In the series, we use $\frac{\pi}{L} t$ as the variable. That is, we want to write

$$f(t) = \frac{a_0}{2} + \sum_{n=1}^\infty a_n \cos\Bigl( \frac{n\pi}{L} t \Bigr) + b_n \sin\Bigl( \frac{n\pi}{L} t \Bigr) .$$

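If we change variables to $s$, we see that

$$g(s) = \frac{a_0}{2} + \sum_{n=1}^\infty a_n \cos(ns) + b_n \sin(ns) .$$

We compute $a_n$ and $b_n$ as before. After we write down the integrals, we change variables
from $s$ back to $t$, noting also that $ds = \frac{\pi}{L}\, dt$.

$$\begin{aligned}
a_0 &= \frac{1}{\pi} \int_{-\pi}^{\pi} g(s)\, ds = \frac{1}{L} \int_{-L}^{L} f(t)\, dt , \\
a_n &= \frac{1}{\pi} \int_{-\pi}^{\pi} g(s) \cos(ns)\, ds = \frac{1}{L} \int_{-L}^{L} f(t) \cos\Bigl( \frac{n\pi}{L} t \Bigr)\, dt , \\
b_n &= \frac{1}{\pi} \int_{-\pi}^{\pi} g(s) \sin(ns)\, ds = \frac{1}{L} \int_{-L}^{L} f(t) \sin\Bigl( \frac{n\pi}{L} t \Bigr)\, dt .
\end{aligned}$$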

The two most common half periods that show up in examples are 𝜋 and 1 because of
the simplicity of the formulas. We should stress that we have done no new mathematics,
we have only changed variables. If you understand the Fourier series for 2𝜋-periodic
functions, you understand it for 2𝐿-periodic functions. You can think of it as just using
different units for time. All that we are doing is moving some constants around, but all the
mathematics is the same.

Example 4.3.1: Let
$$f(t) = |t| \quad \text{for } -1 < t \leq 1,$$
extended periodically. The plot of the periodic extension is given in Figure 4.9. Compute
the Fourier series of $f(t)$.

Figure 4.9: Periodic extension of the function $f(t)$.

First, we recognize that $f$ is 2-periodic and so $L = 1$. We want to write
$f(t) = \frac{a_0}{2} + \sum_{n=1}^\infty a_n \cos(n\pi t) + b_n \sin(n\pi t)$. We start with $a_n$ for $n \geq 1$. We note that $|t| \cos(n\pi t)$ is
even and for $0 \leq t \leq 1$, $f(t) = |t| = t$. Hence,
$$\begin{aligned}
a_n &= \int_{-1}^{1} f(t) \cos(n\pi t)\, dt \\
&= 2 \int_{0}^{1} t \cos(n\pi t)\, dt \\
&= 2 \left[ \frac{t}{n\pi} \sin(n\pi t) \right]_{t=0}^{1} - 2 \int_{0}^{1} \frac{1}{n\pi} \sin(n\pi t)\, dt \\
&= 0 + \frac{1}{n^2 \pi^2} \Bigl[ 2 \cos(n\pi t) \Bigr]_{t=0}^{1}
= \frac{2 \bigl( (-1)^n - 1 \bigr)}{n^2 \pi^2}
= \begin{cases} 0 & \text{if $n$ is even,} \\ \frac{-4}{n^2 \pi^2} & \text{if $n$ is odd.} \end{cases}
\end{aligned}$$

Next we find $a_0$:
$$a_0 = \int_{-1}^{1} |t|\, dt = 1 .$$
You should be able to find this integral by thinking about the integral as the area under the
graph without doing any computation at all. Finally, we find $b_n$. Notice that $|t| \sin(n\pi t)$ is
odd, and so
$$b_n = \int_{-1}^{1} f(t) \sin(n\pi t)\, dt = 0 .$$

Hence, the series is

$$\frac{1}{2} + \sum_{\substack{n=1 \\ n \text{ odd}}}^\infty \frac{-4}{n^2 \pi^2} \cos(n\pi t) .$$

The first few terms of the series up to the 3rd harmonic are
$$\frac{1}{2} - \frac{4}{\pi^2} \cos(\pi t) - \frac{4}{9\pi^2} \cos(3\pi t) - \cdots$$
The plot of these few terms and also a plot up to the 20th harmonic is given in Figure 4.10.
You should notice how close the graph is to the real function. You should also notice that
there is no “Gibbs phenomenon” present as there are no discontinuities.
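
The rescaled coefficient formulas can be checked numerically on this example. The Python
sketch below is a minimal version for a general half period $L$. The function name
`fourier_coefficients_2L`, the midpoint rule, and the step count are our own assumptions.

```python
import math


def fourier_coefficients_2L(f, n, L, n_steps=20000):
    """Approximate (a_n, b_n) of a 2L-periodic f from its values on (-L, L]."""
    h = 2 * L / n_steps
    a = 0.0
    b = 0.0
    for i in range(n_steps):
        t = -L + (i + 0.5) * h
        a += f(t) * math.cos(n * math.pi * t / L)
        b += f(t) * math.sin(n * math.pi * t / L)
    # The factor 1/L replaces the factor 1/pi of the 2*pi-periodic case.
    return a * h / L, b * h / L


# f(t) = |t| on (-1, 1], so L = 1.
for n in range(0, 4):
    a_n, b_n = fourier_coefficients_2L(abs, n, 1.0)
    print(n, a_n, b_n)
```

With $n = 0$ the cosine is identically 1, so the same function also returns $a_0$. The printed
values can be compared with $a_0 = 1$, $a_n = -4/(n^2\pi^2)$ for odd $n$, and zero otherwise.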

Figure 4.10: Fourier series of $f(t)$ up to the 3rd harmonic (left graph) and up to the 20th harmonic (right
graph).

4.3.2 Convergence
We will need the one-sided limits of functions. We will use the following notation

$$f(c-) = \lim_{t \uparrow c} f(t), \qquad \text{and} \qquad f(c+) = \lim_{t \downarrow c} f(t) .$$

If you are unfamiliar with this notation, $\lim_{t \uparrow c} f(t)$ means we are taking a limit of $f(t)$ as $t$
approaches $c$ from below (i.e. $t < c$) and $\lim_{t \downarrow c} f(t)$ means we are taking a limit of $f(t)$ as $t$
approaches $c$ from above (i.e. $t > c$). For example, for the square wave function
$$f(t) = \begin{cases} 0 & \text{if } -\pi < t \leq 0, \\ \pi & \text{if } 0 < t \leq \pi, \end{cases} \tag{4.8}$$

we have $f(0-) = 0$ and $f(0+) = \pi$.



Let 𝑓 (𝑡) be a function defined on an interval [𝑎, 𝑏]. Suppose that we find finitely many
points 𝑎 = 𝑡0 , 𝑡1 , 𝑡2 , . . . , 𝑡 𝑘 = 𝑏 in the interval, such that 𝑓 (𝑡) is continuous on the intervals
(𝑡0 , 𝑡1 ), (𝑡1 , 𝑡2 ), . . . , (𝑡 𝑘−1 , 𝑡 𝑘 ). Also suppose that all the one sided limits exist, that is, all of
𝑓 (𝑡0 +), 𝑓 (𝑡1 −), 𝑓 (𝑡1 +), 𝑓 (𝑡2 −), 𝑓 (𝑡2 +), . . . , 𝑓 (𝑡 𝑘 −) exist and are finite. Then we say 𝑓 (𝑡) is
piecewise continuous.
If moreover, 𝑓 (𝑡) is differentiable at all but finitely many points, and 𝑓 ′(𝑡) is piecewise
continuous, then 𝑓 (𝑡) is said to be piecewise smooth.
Example 4.3.2: The square wave function, (4.8) extended periodically, is piecewise smooth
on [−𝜋, 𝜋] or any other finite interval, so we just say that 𝑓 (𝑡) is piecewise smooth without
mentioning an interval.
Example 4.3.3: The function 𝑓 (𝑡) = |𝑡| is piecewise smooth.
Example 4.3.4: The function $f(t) = \frac{1}{t}$ is not piecewise smooth on $[-1, 1]$ (or any other
interval containing zero). In fact, it is not even piecewise continuous.

Example 4.3.5: The function $f(t) = \sqrt[3]{t}$ is not piecewise smooth on $[-1, 1]$ (or any other
interval containing zero). The function $f(t)$ is continuous, but its derivative $f'(t) = \frac{1}{3 t^{2/3}}$ is
unbounded near zero and hence not piecewise continuous.
Piecewise smooth functions have an easy answer on the convergence of the Fourier
series.

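Theorem 4.3.1. Suppose $f(t)$ is a $2L$-periodic piecewise smooth function. Let

$$\frac{a_0}{2} + \sum_{n=1}^\infty a_n \cos\Bigl( \frac{n\pi}{L} t \Bigr) + b_n \sin\Bigl( \frac{n\pi}{L} t \Bigr)$$

be the Fourier series for $f(t)$. Then the series converges for all $t$. If $f(t)$ is continuous at $t$, then

$$f(t) = \frac{a_0}{2} + \sum_{n=1}^\infty a_n \cos\Bigl( \frac{n\pi}{L} t \Bigr) + b_n \sin\Bigl( \frac{n\pi}{L} t \Bigr) .$$

Otherwise,

$$\frac{f(t-) + f(t+)}{2} = \frac{a_0}{2} + \sum_{n=1}^\infty a_n \cos\Bigl( \frac{n\pi}{L} t \Bigr) + b_n \sin\Bigl( \frac{n\pi}{L} t \Bigr) .$$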

If we happen to have that $f(t) = \frac{f(t-) + f(t+)}{2}$ at all the discontinuities, the Fourier series
converges to $f(t)$ everywhere. We can always just redefine $f(t)$ by changing the value at
each discontinuity appropriately. Then we can write an equals sign between $f(t)$ and the
series without any worry. We mentioned this fact briefly at the end of the last section.
The theorem does not say how fast the series converges. Think back to the discussion
of the Gibbs phenomenon in the last section. The closer you get to the discontinuity, the
more terms you need to take to get an accurate approximation to the function.
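
To see both cases of the theorem numerically, the Python sketch below evaluates a partial
sum of the square wave series from the previous section at the jump $t = 0$ and at the point of
continuity $t = \pi/2$. The function name and the number of terms are our own choices.

```python
import math


def partial_sum(t, terms):
    """pi/2 plus the first `terms` nonzero terms of the square wave (4.8) series."""
    return math.pi / 2 + sum(
        2.0 / (2 * k - 1) * math.sin((2 * k - 1) * t)
        for k in range(1, terms + 1)
    )


# At the jump, every sine term vanishes: the value is (f(0-) + f(0+)) / 2 = pi/2.
print(partial_sum(0.0, 1000), math.pi / 2)
# At a point of continuity, the partial sums approach f(pi/2) = pi.
print(partial_sum(math.pi / 2, 1000), math.pi)
```

Even at $t = \pi/2$, far from the jumps, the error after 1000 terms is still visible in the
third decimal place, which illustrates that the theorem says nothing about speed.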

4.3.3 Differentiation and integration of Fourier series


Not only do Fourier series converge nicely, but they are also easy to differentiate and
integrate. We can do this just by differentiating or integrating term by term.
Theorem 4.3.2. Suppose

$$f(t) = \frac{a_0}{2} + \sum_{n=1}^\infty a_n \cos\Bigl( \frac{n\pi}{L} t \Bigr) + b_n \sin\Bigl( \frac{n\pi}{L} t \Bigr)$$

is a piecewise smooth continuous function and the derivative $f'(t)$ is piecewise smooth. Then the
derivative can be obtained by differentiating term by term,

$$f'(t) = \sum_{n=1}^\infty \frac{-a_n n\pi}{L} \sin\Bigl( \frac{n\pi}{L} t \Bigr) + \frac{b_n n\pi}{L} \cos\Bigl( \frac{n\pi}{L} t \Bigr) .$$
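
As a quick illustration of the theorem, we can differentiate the series found in Example 4.3.1
for $f(t) = |t|$ on $(-1, 1]$, which is continuous and piecewise smooth. The worked computation
below is our own check, not part of the original example. It compares the term-by-term
derivative with the coefficients of $f'(t)$ computed directly.

```latex
% Series for f(t) = |t| on (-1, 1], from Example 4.3.1 (L = 1, b_n = 0):
f(t) = \frac{1}{2} + \sum_{\substack{n=1 \\ n \text{ odd}}}^\infty
       \frac{-4}{n^2 \pi^2} \cos(n \pi t) .

% Differentiating term by term as in Theorem 4.3.2:
f'(t) = \sum_{\substack{n=1 \\ n \text{ odd}}}^\infty
        \frac{4}{n \pi} \sin(n \pi t) .

% Direct computation for f'(t) = -1 on (-1, 0) and f'(t) = 1 on (0, 1).
% Since f' is odd, a_0 = a_n = 0, and
b_n = \int_{-1}^{1} f'(t) \sin(n \pi t) \, dt
    = 2 \int_{0}^{1} \sin(n \pi t) \, dt
    = \frac{2 \bigl( 1 - (-1)^n \bigr)}{n \pi}
    = \begin{cases}
        \frac{4}{n \pi} & \text{if $n$ is odd,} \\
        0 & \text{if $n$ is even.}
      \end{cases}
```

The two computations agree. Notice that the coefficients of $f'(t)$ decay only like $1/n$, as
expected for a function with jumps.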

It is important that the function is continuous. It can have corners, but no jumps.
Otherwise, the differentiated series will fail to converge. For an exercise, take the series
obtained for the square wave and try to differentiate the series. Similarly to differentiation,
integration of Fourier series is also done term by term.
Theorem 4.3.3. Suppose

$$f(t) = \frac{a_0}{2} + \sum_{n=1}^\infty a_n \cos\Bigl( \frac{n\pi}{L} t \Bigr) + b_n \sin\Bigl( \frac{n\pi}{L} t \Bigr)$$

is a piecewise smooth function. Then the antiderivative is obtained by antidifferentiating term by
term and so

$$F(t) = \frac{a_0 t}{2} + C + \sum_{n=1}^\infty \frac{a_n L}{n\pi} \sin\Bigl( \frac{n\pi}{L} t \Bigr) + \frac{-b_n L}{n\pi} \cos\Bigl( \frac{n\pi}{L} t \Bigr) ,$$
where $F'(t) = f(t)$ and $C$ is an arbitrary constant.
Note that the series for $F(t)$ is no longer a Fourier series as it contains the $\frac{a_0 t}{2}$ term. The
antiderivative of a periodic function need no longer be periodic and so we should not
expect a Fourier series. Unless, of course, $a_0 = 0$.

4.3.4 Rates of convergence and smoothness


We consider an example of a periodic function differentiable once but not twice.
Example 4.3.6: Take the function
$$f(t) = \begin{cases} (t + 1)\, t & \text{if } -1 < t \leq 0, \\ (1 - t)\, t & \text{if } 0 < t \leq 1, \end{cases}$$
and extend to a 2-periodic function. The derivative $f'(t)$ exists for all $t$, but $f''(t)$ does not
exist if $t$ is an integer. In particular, $f'(t) = 2t + 1$ if $-1 \leq t \leq 0$ and $f'(t) = 1 - 2t$ if $0 \leq t \leq 1$,
so $f'(t)$ has a “corner” at 0. See Figure 4.11 for the plot of $f(t)$ and $f'(t)$.
Figure 4.11: Smooth 2-periodic function (the smooth line) with its nonsmooth derivative (the jagged line).

Exercise 4.3.1: Compute 𝑓 ′′(0+) and 𝑓 ′′(0−).


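Let us compute the Fourier-series coefficients. The actual computation involves several
integrations by parts and is left to the student.
$$\begin{aligned}
a_0 &= \int_{-1}^{1} f(t)\, dt = \int_{-1}^{0} (t + 1)\, t\, dt + \int_{0}^{1} (1 - t)\, t\, dt = 0 , \\
a_n &= \int_{-1}^{1} f(t) \cos(n\pi t)\, dt = \int_{-1}^{0} (t + 1)\, t \cos(n\pi t)\, dt + \int_{0}^{1} (1 - t)\, t \cos(n\pi t)\, dt = 0 , \\
b_n &= \int_{-1}^{1} f(t) \sin(n\pi t)\, dt = \int_{-1}^{0} (t + 1)\, t \sin(n\pi t)\, dt + \int_{0}^{1} (1 - t)\, t \sin(n\pi t)\, dt \\
&= \frac{4 \bigl( 1 - (-1)^n \bigr)}{\pi^3 n^3}
= \begin{cases} \frac{8}{\pi^3 n^3} & \text{if $n$ is odd,} \\ 0 & \text{if $n$ is even.} \end{cases}
\end{aligned}$$

That is, the series is

$$\sum_{\substack{n=1 \\ n \text{ odd}}}^\infty \frac{8}{\pi^3 n^3} \sin(n\pi t) .$$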
This series converges very fast. If we plot up to the third harmonic, that is,
$$\frac{8}{\pi^3} \sin(\pi t) + \frac{8}{27 \pi^3} \sin(3\pi t) ,$$
the plot is almost indistinguishable from the plot of $f(t)$ in Figure 4.11. In fact, the coefficient
$\frac{8}{27 \pi^3}$ is already just 0.0096 (approximately). The reason for this behavior is the $n^3$ term in
the denominator. The coefficients $b_n$ in this case go to zero as fast as $1/n^3$ goes to zero.
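
Since the integration by parts is left to the reader, a numerical check of the result may be
reassuring. The Python sketch below approximates $b_n$ for this example with the midpoint
rule and prints it next to $4(1 - (-1)^n)/(\pi^3 n^3)$. The function names and the step count
are our own choices.

```python
import math


def f(t):
    """The function of Example 4.3.6 on (-1, 1]."""
    return (t + 1) * t if t <= 0 else (1 - t) * t


def b_coefficient(n, n_steps=20000):
    """Approximate b_n, the integral of f(t) sin(n pi t) over [-1, 1] (here L = 1)."""
    h = 2.0 / n_steps
    total = 0.0
    for i in range(n_steps):
        t = -1.0 + (i + 0.5) * h
        total += f(t) * math.sin(n * math.pi * t)
    return total * h


for n in range(1, 6):
    exact = 4 * (1 - (-1) ** n) / (math.pi ** 3 * n ** 3)
    print(n, b_coefficient(n), exact)
```

The printed values for odd $n$ drop by a factor of 27 from $n = 1$ to $n = 3$, which is the
$1/n^3$ decay at work.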
For functions constructed piecewise from polynomials as above, it is generally true that
if you have one derivative but not two derivatives, the Fourier coefficients will go to zero
approximately like 1/𝑛 3 . If you have only a continuous function that is not differentiable,
then the Fourier coefficients will go to zero as 1/𝑛 2 . If you have discontinuities, then the
Fourier coefficients will go to zero approximately as 1/𝑛 . For more general functions the
story is somewhat more complicated but the same idea holds, the more derivatives you
have, the faster the coefficients go to zero. Similar reasoning works in reverse. If the
coefficients go to zero like 1/𝑛 2 (or faster), you always obtain a continuous function. If they
go to zero like 1/𝑛 3 (or faster), you obtain an everywhere differentiable function.
To justify this behavior, take for example the function defined by the Fourier series

$$f(t) = \sum_{n=1}^\infty \frac{1}{n^3} \sin(nt) .$$

When we differentiate term by term, we notice

$$f'(t) = \sum_{n=1}^\infty \frac{1}{n^2} \cos(nt) .$$

Therefore, the coefficients now go down like $1/n^2$, which means that we have a continuous
function. The derivative of $f'(t)$ is defined at most points, but there are points where $f'(t)$
is not differentiable. It has corners, but no jumps. If we differentiate again (where we can),
we find that the function $f''(t)$ now fails to be continuous (has jumps)

$$f''(t) = \sum_{n=1}^\infty \frac{-1}{n} \sin(nt) .$$

This function is similar to the sawtooth. If we tried to differentiate the series again, we
would obtain
$$\sum_{n=1}^\infty -\cos(nt) ,$$
which does not converge!
Exercise 4.3.2: Use a computer to plot the series we obtained for 𝑓 (𝑡), 𝑓 ′(𝑡) and 𝑓 ′′(𝑡). That is,
plot say the first 5 harmonics of the functions. At what points does 𝑓 ′′(𝑡) have the discontinuities?

4.3.5 Exercises
Exercise 4.3.3: Let
$$f(t) = \begin{cases} 0 & \text{if } -1 < t \leq 0, \\ t & \text{if } 0 < t \leq 1, \end{cases}$$
extended periodically.
a) Compute the Fourier series for 𝑓 (𝑡).
b) Write out the series explicitly up to the 3rd harmonic.

Exercise 4.3.4: Let
$$f(t) = \begin{cases} -t & \text{if } -1 < t \leq 0, \\ t^2 & \text{if } 0 < t \leq 1, \end{cases}$$
extended periodically.

a) Compute the Fourier series for 𝑓 (𝑡).


b) Write out the series explicitly up to the 3rd harmonic.

Exercise 4.3.5: Let
$$f(t) = \begin{cases} \frac{-t}{10} & \text{if } -10 < t \leq 0, \\ \frac{t}{10} & \text{if } 0 < t \leq 10, \end{cases}$$
extended periodically (period is 20).

a) Compute the Fourier series for 𝑓 (𝑡).


b) Write out the series explicitly up to the 3rd harmonic.

Exercise 4.3.6: Let $f(t) = \sum_{n=1}^\infty \frac{1}{n^3} \cos(nt)$. Is $f(t)$ continuous and differentiable everywhere?
Find the derivative (if it exists everywhere) or justify why $f(t)$ is not differentiable everywhere.

Exercise 4.3.7: Let $f(t) = \sum_{n=1}^\infty \frac{(-1)^n}{n} \sin(nt)$. Is $f(t)$ differentiable everywhere? Find the
derivative (if it exists everywhere) or justify why $f(t)$ is not differentiable everywhere.

Exercise 4.3.8: Let
$$f(t) = \begin{cases}
0 & \text{if } -2 < t \leq 0, \\
t & \text{if } 0 < t \leq 1, \\
-t + 2 & \text{if } 1 < t \leq 2,
\end{cases}$$
extended periodically.

a) Compute the Fourier series for 𝑓 (𝑡).


b) Write out the series explicitly up to the 3rd harmonic.

Exercise 4.3.9: Let
$$f(t) = e^t \quad \text{for } -1 < t \leq 1$$
extended periodically.

a) Compute the Fourier series for 𝑓 (𝑡).


b) Write out the series explicitly up to the 3rd harmonic.
c) What does the series converge to at 𝑡 = 1?

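Exercise 4.3.10: Let
$$f(t) = t^2 \quad \text{for } -1 < t \leq 1$$
extended periodically.

a) Compute the Fourier series for $f(t)$.

b) By plugging in $t = 0$, evaluate $\displaystyle \sum_{n=1}^\infty \frac{(-1)^n}{n^2} = -1 + \frac{1}{4} - \frac{1}{9} + \cdots$.

c) Now evaluate $\displaystyle \sum_{n=1}^\infty \frac{1}{n^2} = 1 + \frac{1}{4} + \frac{1}{9} + \cdots$.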

Exercise 4.3.11: Let
$$f(t) = \begin{cases} 0 & \text{if } -3 < t \leq 0, \\ t & \text{if } 0 < t \leq 3, \end{cases}$$
extended periodically. Suppose 𝐹(𝑡) is the function given by the Fourier series of 𝑓 . Without
computing the Fourier series evaluate

a) 𝐹(2) b) 𝐹(−2) c) 𝐹(4)


d) 𝐹(−4) e) 𝐹(3) f) 𝐹(−9)

Exercise 4.3.101: Let
$$f(t) = t^2 \quad \text{for } -2 < t \leq 2$$
extended periodically.

a) Compute the Fourier series for 𝑓 (𝑡).


b) Write out the series explicitly up to the 3rd harmonic.

Exercise 4.3.102: Let

𝑓 (𝑡) = 𝑡 for −𝜆 < 𝑡 ≤ 𝜆 (for some 𝜆 > 0)

extended periodically.

a) Compute the Fourier series for 𝑓 (𝑡).


b) Write out the series explicitly up to the 3rd harmonic.

Exercise 4.3.103: Let

$$f(t) = \frac{1}{2} + \sum_{n=1}^\infty \frac{1}{n (n^2 + 1)} \sin(n\pi t) .$$

Compute $f'(t)$.

Exercise 4.3.104: Let
\[
f(t) = \frac{1}{2} + \sum_{n=1}^{\infty} \frac{1}{n^3} \cos(nt).
\]

a) Find the antiderivative.


b) Is the antiderivative periodic?

Exercise 4.3.105: Let


𝑓 (𝑡) = 𝑡/2 for −𝜋 < 𝑡 < 𝜋
extended periodically.

a) Compute the Fourier series for 𝑓 (𝑡).


b) Plug in 𝑡 = 𝜋/2 to find a series representation for 𝜋/4.
c) Using the first 4 terms of the result from part b) approximate 𝜋/4.

Exercise 4.3.106: Let
\[
f(t) =
\begin{cases}
0 & \text{if } -2 < t \leq 0, \\
2 & \text{if } 0 < t \leq 2,
\end{cases}
\]
extended periodically. Suppose 𝐹(𝑡) is the function given by the Fourier series of 𝑓 . Without
computing the Fourier series evaluate

a) 𝐹(0) b) 𝐹(−1) c) 𝐹(1)


d) 𝐹(−2) e) 𝐹(4) f) 𝐹(−9)

4.4 Sine and cosine series


Note: 2 lectures, §9.3 in [EP], §10.4 in [BD]

4.4.1 Odd and even periodic functions


You may have noticed by now that an odd function has no cosine terms in the Fourier
series and an even function has no sine terms in the Fourier series. This observation is not
a coincidence. Let us look at even and odd periodic functions in more detail.
Recall that a function 𝑓 (𝑡) is odd if 𝑓 (−𝑡) = − 𝑓 (𝑡). A function 𝑓 (𝑡) is even if 𝑓 (−𝑡) = 𝑓 (𝑡).
For example, cos(𝑛𝑡) is even and sin(𝑛𝑡) is odd. Similarly the function 𝑡 𝑘 is even if 𝑘 is even
and odd if 𝑘 is odd.
Exercise 4.4.1: Take two functions 𝑓 (𝑡) and 𝑔(𝑡) and define their product ℎ(𝑡) = 𝑓 (𝑡)𝑔(𝑡).
a) Suppose both 𝑓 (𝑡) and 𝑔(𝑡) are odd. Is ℎ(𝑡) odd or even?
b) Suppose one is even and one is odd. Is ℎ(𝑡) odd or even?
c) Suppose both are even. Is ℎ(𝑡) odd or even?
If 𝑓 (𝑡) and 𝑔(𝑡) are both odd, then 𝑓 (𝑡) + 𝑔(𝑡) is odd. Similarly for even functions. On
the other hand, if 𝑓 (𝑡) is odd and 𝑔(𝑡) even, then we cannot say anything about the sum
𝑓 (𝑡) + 𝑔(𝑡). In fact, the Fourier series of any function is a sum of an odd (the sine terms)
and an even (the cosine terms) function.
In this section, we consider odd and even periodic functions. We have previously
defined the 2𝐿-periodic extension of a function defined on the interval [−𝐿, 𝐿]. Sometimes
we are only interested in the function on the range [0, 𝐿], and it would be convenient to have
an odd (resp. even) function. If the function is odd (resp. even), all the cosine (resp. sine)
terms disappear. What we will do is take the odd (resp. even) extension of the function to
[−𝐿, 𝐿] and then extend periodically to a 2𝐿-periodic function.
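Take a function $f(t)$ defined on $[0, L]$. On $(-L, L]$ define the functions
\[
F_{\text{odd}}(t) \overset{\text{def}}{=}
\begin{cases}
f(t) & \text{if } 0 \leq t \leq L, \\
-f(-t) & \text{if } -L < t < 0,
\end{cases}
\]
\[
F_{\text{even}}(t) \overset{\text{def}}{=}
\begin{cases}
f(t) & \text{if } 0 \leq t \leq L, \\
f(-t) & \text{if } -L < t < 0.
\end{cases}
\]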

Extend 𝐹odd (𝑡) and 𝐹even (𝑡) to be 2𝐿-periodic. Then 𝐹odd (𝑡) is called the odd periodic extension
of 𝑓 (𝑡), and 𝐹even (𝑡) is called the even periodic extension of 𝑓 (𝑡). For the odd extension, we
generally assume that 𝑓 (0) = 𝑓 (𝐿) = 0.
Exercise 4.4.2: Check that 𝐹odd (𝑡) is odd and 𝐹even (𝑡) is even. For 𝐹odd , assume 𝑓 (0) = 𝑓 (𝐿) = 0.
Example 4.4.1: Take the function 𝑓 (𝑡) = 𝑡 (1 − 𝑡) defined on [0, 1]. Figure 4.12 on the next
page shows the plots of the odd and even periodic extensions of 𝑓 (𝑡).
Figure 4.12: Odd and even 2-periodic extension of 𝑓 (𝑡) = 𝑡 (1 − 𝑡), 0 ≤ 𝑡 ≤ 1.
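
The definitions of the odd and even periodic extensions translate directly into a few lines of code. The following is a minimal sketch, not part of the original text; the helper names `odd_extension` and `even_extension` are our own, and the reduction of $t$ to the base interval $(-L, L]$ is one possible way to implement the periodicity.

```python
import math

def _reduce(t, L):
    """Shift t by a multiple of the period 2L so that it lands in (-L, L]."""
    return t - 2 * L * math.ceil((t - L) / (2 * L))

def odd_extension(f, L):
    """Odd 2L-periodic extension of f, where f is given on [0, L]."""
    def F(t):
        s = _reduce(t, L)
        return f(s) if s >= 0 else -f(-s)
    return F

def even_extension(f, L):
    """Even 2L-periodic extension of f, where f is given on [0, L]."""
    def F(t):
        s = _reduce(t, L)
        return f(s) if s >= 0 else f(-s)
    return F

# The function of Example 4.4.1 on [0, 1].
f = lambda t: t * (1 - t)
F_odd = odd_extension(f, 1.0)
F_even = even_extension(f, 1.0)
```

Evaluating `F_odd` and `F_even` on a grid over $[-2, 2]$ should reproduce the two curves of Figure 4.12.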

4.4.2 Sine and cosine series


Let $f(t)$ be an odd $2L$-periodic function. We write the Fourier series for $f(t)$. First, we compute the coefficients $a_n$ (including $n = 0$) and get
\[
a_n = \frac{1}{L} \int_{-L}^{L} f(t) \cos\left(\frac{n\pi}{L} t\right) dt = 0.
\]
That is, the Fourier series of an odd function has no cosine terms. The integral is zero as $f(t) \cos\left(\frac{n\pi}{L} t\right)$ is an odd function (the product of an odd and an even function is odd) and the integral of an odd function over a symmetric interval is zero. The function $f(t) \sin\left(\frac{n\pi}{L} t\right)$ is even as it is the product of two odd functions. The integral of an even function over a symmetric interval $[-L, L]$ is twice the integral of the function over the interval $[0, L]$. So
\[
b_n = \frac{1}{L} \int_{-L}^{L} f(t) \sin\left(\frac{n\pi}{L} t\right) dt = \frac{2}{L} \int_{0}^{L} f(t) \sin\left(\frac{n\pi}{L} t\right) dt.
\]
The Fourier series of an odd $f(t)$ is then
\[
\sum_{n=1}^{\infty} b_n \sin\left(\frac{n\pi}{L} t\right).
\]

Similarly, suppose $f(t)$ is an even $2L$-periodic function. For exactly the same reasons as above, we find that $b_n = 0$ and
\[
a_n = \frac{2}{L} \int_{0}^{L} f(t) \cos\left(\frac{n\pi}{L} t\right) dt.
\]
The formula still works for $n = 0$, in which case it becomes
\[
a_0 = \frac{2}{L} \int_{0}^{L} f(t) \, dt.
\]
The Fourier series of an even $f(t)$ is then
\[
\frac{a_0}{2} + \sum_{n=1}^{\infty} a_n \cos\left(\frac{n\pi}{L} t\right).
\]

An interesting consequence is that the coefficients of the Fourier series of an odd (or
even) function can be computed by just integrating over the half interval [0, 𝐿]. Therefore,
we can compute the Fourier series of the odd (or even) extension of a function by computing
certain integrals over the interval where the original function is defined.

Theorem 4.4.1. Let $f(t)$ be a piecewise smooth function defined on $[0, L]$. Then the odd periodic extension of $f(t)$ has the Fourier series
\[
F_{\text{odd}}(t) = \sum_{n=1}^{\infty} b_n \sin\left(\frac{n\pi}{L} t\right),
\]
where
\[
b_n = \frac{2}{L} \int_{0}^{L} f(t) \sin\left(\frac{n\pi}{L} t\right) dt.
\]
The even periodic extension of $f(t)$ has the Fourier series
\[
F_{\text{even}}(t) = \frac{a_0}{2} + \sum_{n=1}^{\infty} a_n \cos\left(\frac{n\pi}{L} t\right),
\]
where
\[
a_n = \frac{2}{L} \int_{0}^{L} f(t) \cos\left(\frac{n\pi}{L} t\right) dt.
\]

We call the series $\sum_{n=1}^{\infty} b_n \sin\left(\frac{n\pi}{L} t\right)$ the sine series of $f(t)$ and we call the series $\frac{a_0}{2} + \sum_{n=1}^{\infty} a_n \cos\left(\frac{n\pi}{L} t\right)$ the cosine series of $f(t)$. We often do not actually care what happens outside of $[0, L]$. We simply pick whichever series fits our problem better.
It is not necessary to start with the full Fourier series to obtain the sine and cosine series.
The sine series is really the eigenfunction expansion of 𝑓 (𝑡) using eigenfunctions of the
eigenvalue problem 𝑥 ′′ + 𝜆𝑥 = 0, 𝑥(0) = 0, 𝑥(𝐿) = 0. The cosine series is the eigenfunction
expansion of 𝑓 (𝑡) using eigenfunctions of the eigenvalue problem 𝑥 ′′ + 𝜆𝑥 = 0, 𝑥 ′(0) = 0,
𝑥 ′(𝐿) = 0. We would, therefore, get the same formulas by defining the inner product
\[
\langle f(t), g(t) \rangle = \int_{0}^{L} f(t) g(t) \, dt,
\]

and following the procedure of § 4.2. This point of view is useful, as we commonly use
a specific series that arose because our underlying question led to a certain eigenvalue

problem. If the eigenvalue problem is not one of the three we covered so far, you can still
do an eigenfunction expansion, generalizing the results of this chapter. We will deal with
such a generalization in chapter 5.
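
The half-interval formulas of Theorem 4.4.1 are easy to evaluate numerically. The sketch below is our own illustration, not the author's code: the names `integrate`, `sine_coefficient`, and `cosine_coefficient` are hypothetical helpers, and composite Simpson's rule is just one convenient quadrature choice. As a test function we use $f(t) = t(1-t)$ on $[0, 1]$, for which integrating by parts gives $b_n = \frac{8}{\pi^3 n^3}$ for odd $n$ and $b_n = 0$ for even $n$.

```python
import math

def integrate(g, a, b, n=2000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def sine_coefficient(f, L, n):
    """b_n = (2/L) * integral over [0, L] of f(t) sin(n pi t / L)."""
    return (2 / L) * integrate(lambda t: f(t) * math.sin(n * math.pi * t / L), 0, L)

def cosine_coefficient(f, L, n):
    """a_n = (2/L) * integral over [0, L] of f(t) cos(n pi t / L); n = 0 is allowed."""
    return (2 / L) * integrate(lambda t: f(t) * math.cos(n * math.pi * t / L), 0, L)

f = lambda t: t * (1 - t)
b1 = sine_coefficient(f, 1.0, 1)    # compare with 8 / pi^3
b2 = sine_coefficient(f, 1.0, 2)    # compare with 0
a0 = cosine_coefficient(f, 1.0, 0)  # compare with 2 * (1/6) = 1/3
```

Only values of $f$ on $[0, L]$ are ever used, which is the point of the theorem: neither extension has to be constructed explicitly.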
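Example 4.4.2: Find the Fourier series of the even periodic extension of the function $f(t) = t^2$ for $0 \leq t \leq \pi$.
We want to write
\[
f(t) = \frac{a_0}{2} + \sum_{n=1}^{\infty} a_n \cos(nt),
\]
where
\[
a_0 = \frac{2}{\pi} \int_{0}^{\pi} t^2 \, dt = \frac{2\pi^2}{3},
\]
and
\[
\begin{aligned}
a_n &= \frac{2}{\pi} \int_{0}^{\pi} t^2 \cos(nt) \, dt
= \frac{2}{\pi} \left[ t^2 \frac{1}{n} \sin(nt) \right]_{0}^{\pi} - \frac{4}{n\pi} \int_{0}^{\pi} t \sin(nt) \, dt \\
&= \frac{4}{n^2 \pi} \Bigl[ t \cos(nt) \Bigr]_{0}^{\pi} - \frac{4}{n^2 \pi} \int_{0}^{\pi} \cos(nt) \, dt
= \frac{4 (-1)^n}{n^2}.
\end{aligned}
\]
The last integral is zero for every $n \geq 1$, so only the boundary term contributes.

Note that we “detected” the continuity of the extension since the coefficients decay as $\frac{1}{n^2}$. That is, the even periodic extension of $t^2$ has no jump discontinuities. It does have corners, since the derivative, which is an odd function and a sine series, has jumps; it has a Fourier series whose coefficients decay only as $\frac{1}{n}$.
Explicitly, the first few terms of the series for $f(t)$ are
\[
\frac{\pi^2}{3} - 4 \cos(t) + \cos(2t) - \frac{4}{9} \cos(3t) + \cdots
\]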
Exercise 4.4.3:

a) Compute the derivative of the even periodic extension of 𝑓 (𝑡) above and verify it has jump
discontinuities. Use the actual definition of 𝑓 (𝑡), not its cosine series!
b) Why is it that the derivative of the even periodic extension of 𝑓 (𝑡) is the odd periodic extension
of 𝑓 ′(𝑡)?
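
The coefficients found in Example 4.4.2 can be sanity-checked by summing the cosine series numerically. The following sketch is our own illustration; `cosine_partial_sum` is a hypothetical helper name. On $[0, \pi]$ the partial sums should approach $t^2$, and at $t = \pi$ convergence is slowest because every term has the same sign there.

```python
import math

def cosine_partial_sum(t, terms):
    """Partial sum pi^2/3 + sum_{n=1}^{terms} 4 (-1)^n / n^2 * cos(n t)."""
    total = math.pi ** 2 / 3
    for n in range(1, terms + 1):
        total += 4 * (-1) ** n / n ** 2 * math.cos(n * t)
    return total

approx_at_1 = cosine_partial_sum(1.0, 20000)       # should be close to 1^2 = 1
approx_at_pi = cosine_partial_sum(math.pi, 20000)  # should be close to pi^2
```

At $t = \pi$ the series reads $\frac{\pi^2}{3} + 4 \sum_{n=1}^{\infty} \frac{1}{n^2}$, so agreement with $\pi^2$ is the same statement as $\sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6}$.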

4.4.3 Application
Fourier series ties in to the boundary value problems that we studied earlier. Consider the
boundary value problem

𝑥 ′′(𝑡) + 𝜆 𝑥(𝑡) = 𝑓 (𝑡), 0 < 𝑡 < 𝐿,

with the Dirichlet boundary conditions 𝑥(0) = 0, 𝑥(𝐿) = 0. The Fredholm alternative
(Theorem 4.1.2 on page 194) says that as long as 𝜆 is not an eigenvalue of the underlying
homogeneous problem, there exists a unique solution. Eigenfunctions of this eigenvalue
problem are the functions $\sin\left(\frac{n\pi}{L} t\right)$. To find the solution, we first find the Fourier sine
series for 𝑓 (𝑡). We write 𝑥(𝑡) also as a sine series, but with unknown coefficients. We
substitute the series for 𝑥(𝑡) into the equation and solve for the unknown coefficients. If we
have the Neumann boundary conditions 𝑥 ′(0) = 0, 𝑥 ′(𝐿) = 0, we do the same procedure using
the cosine series. Let us see how this method works on examples.
Example 4.4.3: Take the boundary value problem
\[
x''(t) + 2x(t) = f(t), \qquad 0 < t < 1,
\]
where $f(t) = t$ on $0 < t < 1$, and satisfying the Dirichlet boundary conditions $x(0) = 0$, $x(1) = 0$. We write $f(t)$ as a sine series
\[
f(t) = \sum_{n=1}^{\infty} c_n \sin(n\pi t).
\]
Compute
\[
c_n = 2 \int_{0}^{1} t \sin(n\pi t) \, dt = \frac{2 (-1)^{n+1}}{n\pi}.
\]
We write $x(t)$ as
\[
x(t) = \sum_{n=1}^{\infty} b_n \sin(n\pi t).
\]
We plug in to obtain
\[
\begin{aligned}
x''(t) + 2x(t) &= \underbrace{\sum_{n=1}^{\infty} -b_n n^2 \pi^2 \sin(n\pi t)}_{x''} + 2 \underbrace{\sum_{n=1}^{\infty} b_n \sin(n\pi t)}_{x} \\
&= \sum_{n=1}^{\infty} b_n (2 - n^2 \pi^2) \sin(n\pi t) \\
&= f(t) = \sum_{n=1}^{\infty} \frac{2 (-1)^{n+1}}{n\pi} \sin(n\pi t).
\end{aligned}
\]
Therefore,
\[
b_n (2 - n^2 \pi^2) = \frac{2 (-1)^{n+1}}{n\pi}
\]
or
\[
b_n = \frac{2 (-1)^{n+1}}{n\pi (2 - n^2 \pi^2)}.
\]
That $2 - n^2 \pi^2$ is not zero for any $n$, and that we can solve for $b_n$, is precisely because 2 is not an eigenvalue of the problem. We have thus obtained a Fourier series for the solution
\[
x(t) = \sum_{n=1}^{\infty} \frac{2 (-1)^{n+1}}{n\pi (2 - n^2 \pi^2)} \sin(n\pi t).
\]

See Figure 4.13 for a graph of the solution. Notice that because the eigenfunctions satisfy
the boundary conditions, and 𝑥 is written in terms of the eigenfunctions, 𝑥 also
satisfies the boundary conditions.


Figure 4.13: Plot of the solution of 𝑥 ′′ + 2𝑥 = 𝑡, 𝑥(0) = 0, 𝑥(1) = 0.
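
For this particular problem we can also compare the series with a closed form. The sketch below is our own check, not part of the original example: solving $x'' + 2x = t$ by undetermined coefficients gives $x = \frac{t}{2} + A \cos(\sqrt{2}\, t) + B \sin(\sqrt{2}\, t)$, and the Dirichlet conditions force $A = 0$ and $B = \frac{-1}{2 \sin \sqrt{2}}$. The function names are hypothetical.

```python
import math

def x_series(t, terms=200):
    """Partial sum of the sine series for x(t) from Example 4.4.3."""
    total = 0.0
    for n in range(1, terms + 1):
        b_n = 2 * (-1) ** (n + 1) / (n * math.pi * (2 - n ** 2 * math.pi ** 2))
        total += b_n * math.sin(n * math.pi * t)
    return total

def x_closed_form(t):
    """x = t/2 - sin(sqrt(2) t) / (2 sin(sqrt(2))), which vanishes at t = 0 and t = 1."""
    r = math.sqrt(2)
    return t / 2 - math.sin(r * t) / (2 * math.sin(r))

# Largest disagreement on a grid in [0, 1].
max_gap = max(abs(x_series(i / 20) - x_closed_form(i / 20)) for i in range(21))
```

The coefficients decay like $\frac{1}{n^3}$, so a couple of hundred terms already agree with the closed form to several digits; the minimum near $t = 0.5$ is about $-0.079$, consistent with Figure 4.13.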

Example 4.4.4: We handle the Neumann conditions with cosine series. Take the boundary value problem
\[
x''(t) + 2x(t) = f(t), \qquad 0 < t < 1,
\]
where again $f(t) = t$ on $0 < t < 1$, but now satisfying the Neumann boundary conditions $x'(0) = 0$, $x'(1) = 0$. We write $f(t)$ as a cosine series
\[
f(t) = \frac{c_0}{2} + \sum_{n=1}^{\infty} c_n \cos(n\pi t),
\]
where
\[
c_0 = 2 \int_{0}^{1} t \, dt = 1,
\]
and
\[
c_n = 2 \int_{0}^{1} t \cos(n\pi t) \, dt = \frac{2 \bigl( (-1)^n - 1 \bigr)}{\pi^2 n^2} =
\begin{cases}
\frac{-4}{\pi^2 n^2} & \text{if } n \text{ odd}, \\
0 & \text{if } n \text{ even}.
\end{cases}
\]
We write $x(t)$ as a cosine series
\[
x(t) = \frac{a_0}{2} + \sum_{n=1}^{\infty} a_n \cos(n\pi t).
\]
We plug in to obtain
\[
\begin{aligned}
x''(t) + 2x(t) &= \sum_{n=1}^{\infty} \Bigl[ -a_n n^2 \pi^2 \cos(n\pi t) \Bigr] + a_0 + 2 \sum_{n=1}^{\infty} \Bigl[ a_n \cos(n\pi t) \Bigr] \\
&= a_0 + \sum_{n=1}^{\infty} a_n (2 - n^2 \pi^2) \cos(n\pi t) \\
&= f(t) = \frac{1}{2} + \sum_{\substack{n=1 \\ n \text{ odd}}}^{\infty} \frac{-4}{\pi^2 n^2} \cos(n\pi t).
\end{aligned}
\]
Therefore, $a_0 = \frac{1}{2}$, $a_n = 0$ for $n$ even ($n \geq 2$), and for $n$ odd, we have
\[
a_n (2 - n^2 \pi^2) = \frac{-4}{\pi^2 n^2},
\]
or
\[
a_n = \frac{-4}{n^2 \pi^2 (2 - n^2 \pi^2)}.
\]
The Fourier series for the solution $x(t)$ is
\[
x(t) = \frac{1}{4} + \sum_{\substack{n=1 \\ n \text{ odd}}}^{\infty} \frac{-4}{n^2 \pi^2 (2 - n^2 \pi^2)} \cos(n\pi t).
\]
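
As with the Dirichlet problem, the cosine series of Example 4.4.4 can be compared against a closed form. This is our own check under the following assumption: writing $x = \frac{t}{2} + A \cos(\sqrt{2}\, t) + B \sin(\sqrt{2}\, t)$ and imposing $x'(0) = 0$, $x'(1) = 0$ gives $B = \frac{-1}{2\sqrt{2}}$ and $A = \frac{1 - \cos \sqrt{2}}{2 \sqrt{2} \sin \sqrt{2}}$. The function names below are hypothetical.

```python
import math

def x_cosine_series(t, terms=200):
    """Partial sum 1/4 + sum over odd n of -4 / (n^2 pi^2 (2 - n^2 pi^2)) cos(n pi t)."""
    total = 0.25
    for n in range(1, terms + 1, 2):
        a_n = -4 / (n ** 2 * math.pi ** 2 * (2 - n ** 2 * math.pi ** 2))
        total += a_n * math.cos(n * math.pi * t)
    return total

def x_neumann_closed_form(t):
    """Closed form with x'(0) = 0 and x'(1) = 0."""
    r = math.sqrt(2)
    A = (1 - math.cos(r)) / (2 * r * math.sin(r))
    B = -1 / (2 * r)
    return t / 2 + A * math.cos(r * t) + B * math.sin(r * t)

neumann_gap = max(
    abs(x_cosine_series(i / 20) - x_neumann_closed_form(i / 20)) for i in range(21)
)
```

Here the coefficients decay like $\frac{1}{n^4}$, so the agreement is even better than in the Dirichlet case.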

4.4.4 Exercises
Exercise 4.4.4: Take 𝑓 (𝑡) = (𝑡 − 1)2 defined on 0 ≤ 𝑡 ≤ 1.

a) Sketch the plot of the even periodic extension of 𝑓 .


b) Sketch the plot of the odd periodic extension of 𝑓 .

Exercise 4.4.5: Find the Fourier series of both the odd and even periodic extension of the function
𝑓 (𝑡) = (𝑡 − 1)2 for 0 ≤ 𝑡 ≤ 1. Can you tell which extension is continuous from the Fourier-series
coefficients?

Exercise 4.4.6: Find the Fourier series of both the odd and even periodic extension of the function
𝑓 (𝑡) = 𝑡 for 0 ≤ 𝑡 ≤ 𝜋.

Exercise 4.4.7: Find the Fourier series of the even periodic extension of the function 𝑓 (𝑡) = sin 𝑡
for 0 ≤ 𝑡 ≤ 𝜋.

Exercise 4.4.8: Consider


𝑥 ′′(𝑡) + 4𝑥(𝑡) = 𝑓 (𝑡),
where 𝑓 (𝑡) = 1 on 0 < 𝑡 < 1.

a) Solve for the Dirichlet conditions 𝑥(0) = 0, 𝑥(1) = 0.


b) Solve for the Neumann conditions 𝑥 ′(0) = 0, 𝑥 ′(1) = 0.

Exercise 4.4.9: Consider


𝑥 ′′(𝑡) + 9𝑥(𝑡) = 𝑓 (𝑡),
for 𝑓 (𝑡) = sin(2𝜋𝑡) on 0 < 𝑡 < 1.

a) Solve for the Dirichlet conditions 𝑥(0) = 0, 𝑥(1) = 0.


b) Solve for the Neumann conditions 𝑥 ′(0) = 0, 𝑥 ′(1) = 0.

Exercise 4.4.10: Consider

𝑥 ′′(𝑡) + 3𝑥(𝑡) = 𝑓 (𝑡), 𝑥(0) = 0, 𝑥(1) = 0,

where $f(t) = \sum_{n=1}^{\infty} b_n \sin(n\pi t)$. Write the solution $x(t)$ as a Fourier series, where the coefficients are given in terms of $b_n$.

Exercise 4.4.11: Let 𝑓 (𝑡) = 𝑡 2 (2 − 𝑡) for 0 ≤ 𝑡 ≤ 2. Let 𝐹(𝑡) be the odd periodic extension.
Compute 𝐹(1), 𝐹(2), 𝐹(3), 𝐹(−1), 𝐹(9/2), 𝐹(101), 𝐹(103). Note: Do not compute the sine series.

Exercise 4.4.101: Let 𝑓 (𝑡) = 𝑡/3 on 0 ≤ 𝑡 < 3.

a) Find the Fourier series of the even periodic extension.


b) Find the Fourier series of the odd periodic extension.

Exercise 4.4.102: Let 𝑓 (𝑡) = cos(2𝑡) on 0 ≤ 𝑡 < 𝜋.

a) Find the Fourier series of the even periodic extension.


b) Find the Fourier series of the odd periodic extension.

Exercise 4.4.103: Let $f(t)$ be defined on $0 \leq t < 1$. Consider the average of the two extensions $g(t) = \frac{F_{\text{odd}}(t) + F_{\text{even}}(t)}{2}$.

a) What is $g(t)$ if $0 \leq t < 1$? (Justify!)

b) What is $g(t)$ if $-1 < t < 0$? (Justify!)


Exercise 4.4.104: Let $f(t) = \sum_{n=1}^{\infty} \frac{1}{n^2} \sin(nt)$. Solve $x'' - x = f(t)$ for the Dirichlet conditions $x(0) = 0$ and $x(\pi) = 0$.

Exercise 4.4.105 (challenging): Let $f(t) = t + \sum_{n=1}^{\infty} \frac{1}{2^n} \sin(nt)$. Solve $x'' + \pi x = f(t)$ for the Dirichlet conditions $x(0) = 0$ and $x(\pi) = 1$. Hint: Note that $\frac{t}{\pi}$ satisfies the given Dirichlet conditions.

4.5 Applications of Fourier series


Note: 2 lectures, §9.4 in [EP], not in [BD]

4.5.1 Periodically forced oscillation


We return to the forced oscillations. Consider a mass-spring system as before, where we have a mass $m$ on a spring with spring constant $k$, with damping $c$, and a force $F(t)$ applied to the mass. Suppose the forcing function $F(t)$ is $2L$-periodic for some $L > 0$. We saw this problem in chapter 2 with $F(t) = F_0 \cos(\omega t)$. The equation that governs this particular setup is

𝑚𝑥 ′′(𝑡) + 𝑐𝑥 ′(𝑡) + 𝑘𝑥(𝑡) = 𝐹(𝑡). (4.9)

The general solution of (4.9) consists of the complementary solution 𝑥 𝑐 , which solves
the associated homogeneous equation 𝑚𝑥 ′′ + 𝑐𝑥 ′ + 𝑘𝑥 = 0, and a particular solution of (4.9)
we call 𝑥 𝑝 . For 𝑐 > 0, the complementary solution 𝑥 𝑐 will decay as time goes by. Therefore,
we are mostly interested in a particular solution 𝑥 𝑝 that does not decay and is periodic with
the same period as 𝐹(𝑡). We call this particular solution the steady periodic solution and we
write it as 𝑥 𝑠𝑝 as before. What is new in this section is that we consider an arbitrary forcing
function 𝐹(𝑡) instead of a simple cosine.
For simplicity, suppose 𝑐 = 0. The problem with 𝑐 > 0 is very similar. The equation

𝑚𝑥 ′′ + 𝑘𝑥 = 0

has the general solution


𝑥(𝑡) = 𝐴 cos(𝜔0 𝑡) + 𝐵 sin(𝜔0 𝑡),
where $\omega_0 = \sqrt{\frac{k}{m}}$. Any solution to $m x''(t) + k x(t) = F(t)$ is of the form $A \cos(\omega_0 t) + B \sin(\omega_0 t) + x_{sp}$. The steady periodic solution $x_{sp}$ has the same period as $F(t)$.
In the spirit of the last section and the idea of undetermined coefficients, we first write
\[
F(t) = \frac{c_0}{2} + \sum_{n=1}^{\infty} c_n \cos\left(\frac{n\pi}{L} t\right) + d_n \sin\left(\frac{n\pi}{L} t\right).
\]
Then we write a proposed steady periodic solution $x$ as
\[
x(t) = \frac{a_0}{2} + \sum_{n=1}^{\infty} a_n \cos\left(\frac{n\pi}{L} t\right) + b_n \sin\left(\frac{n\pi}{L} t\right),
\]

where 𝑎 𝑛 and 𝑏 𝑛 are unknowns. We plug 𝑥 into the differential equation and solve for 𝑎 𝑛
and 𝑏 𝑛 in terms of 𝑐 𝑛 and 𝑑𝑛 . This process is perhaps best understood by example.

Example 4.5.1: Suppose that 𝑘 = 2 and 𝑚 = 1. The units are again the mks units (meters-
kilograms-seconds). There is a jetpack strapped to the mass, which fires with a force of 1
newton for 1 second and then is off for 1 second, and so on. We want to find the steady
periodic solution.
The equation is, therefore,
𝑥 ′′ + 2𝑥 = 𝐹(𝑡),
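where $F(t)$ is the step function
\[
F(t) =
\begin{cases}
0 & \text{if } -1 < t < 0, \\
1 & \text{if } 0 < t < 1,
\end{cases}
\]
extended periodically. We write
\[
F(t) = \frac{c_0}{2} + \sum_{n=1}^{\infty} c_n \cos(n\pi t) + d_n \sin(n\pi t).
\]
We compute
\[
\begin{aligned}
c_n &= \int_{-1}^{1} F(t) \cos(n\pi t) \, dt = \int_{0}^{1} \cos(n\pi t) \, dt = 0 \quad \text{for } n \geq 1, \\
c_0 &= \int_{-1}^{1} F(t) \, dt = \int_{0}^{1} dt = 1, \\
d_n &= \int_{-1}^{1} F(t) \sin(n\pi t) \, dt
= \int_{0}^{1} \sin(n\pi t) \, dt
= \left[ \frac{-\cos(n\pi t)}{n\pi} \right]_{t=0}^{1} \\
&= \frac{1 - (-1)^n}{\pi n} =
\begin{cases}
\frac{2}{\pi n} & \text{if } n \text{ odd}, \\
0 & \text{if } n \text{ even}.
\end{cases}
\end{aligned}
\]
So
\[
F(t) = \frac{1}{2} + \sum_{\substack{n=1 \\ n \text{ odd}}}^{\infty} \frac{2}{\pi n} \sin(n\pi t).
\]
We want to try
\[
x(t) = \frac{a_0}{2} + \sum_{n=1}^{\infty} a_n \cos(n\pi t) + b_n \sin(n\pi t).
\]
Once we plug $x$ into the differential equation $x'' + 2x = F(t)$, it is clear that $a_n = 0$ for $n \geq 1$ as there are no corresponding terms in the series for $F(t)$. Similarly, $b_n = 0$ for even $n$. Hence we try
\[
x(t) = \frac{a_0}{2} + \sum_{\substack{n=1 \\ n \text{ odd}}}^{\infty} b_n \sin(n\pi t).
\]
We plug into the differential equation and obtain
\[
\begin{aligned}
x'' + 2x &= \sum_{\substack{n=1 \\ n \text{ odd}}}^{\infty} \Bigl[ -b_n n^2 \pi^2 \sin(n\pi t) \Bigr] + a_0 + 2 \sum_{\substack{n=1 \\ n \text{ odd}}}^{\infty} \Bigl[ b_n \sin(n\pi t) \Bigr] \\
&= a_0 + \sum_{\substack{n=1 \\ n \text{ odd}}}^{\infty} b_n (2 - n^2 \pi^2) \sin(n\pi t) \\
&= F(t) = \frac{1}{2} + \sum_{\substack{n=1 \\ n \text{ odd}}}^{\infty} \frac{2}{\pi n} \sin(n\pi t).
\end{aligned}
\]
So $a_0 = \frac{1}{2}$, $b_n = 0$ for even $n$, and for odd $n$, we get
\[
b_n = \frac{2}{\pi n (2 - n^2 \pi^2)}.
\]
The steady periodic solution has the Fourier series
\[
x_{sp}(t) = \frac{1}{4} + \sum_{\substack{n=1 \\ n \text{ odd}}}^{\infty} \frac{2}{\pi n (2 - n^2 \pi^2)} \sin(n\pi t).
\]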

We know this is the steady periodic solution as it contains no terms of the complementary
solution and it is periodic with the same period as 𝐹(𝑡) itself. See Figure 4.14 on the facing
page for the plot of this solution.
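
A short numerical experiment can confirm that the series of Example 4.5.1 behaves as claimed. The sketch below is our own illustration with hypothetical function names. It evaluates a partial sum of $x_{sp}$ and estimates $x'' + 2x$ with a central difference; away from the jumps of $F$ the result should be close to 1 while the jetpack fires and close to 0 while it is off. A partial sum of the series for $F$ converges slowly, so we only expect agreement to two or three digits.

```python
import math

def x_sp(t, terms=400):
    """Partial sum of the steady periodic solution from Example 4.5.1 (odd n only)."""
    total = 0.25
    for n in range(1, terms + 1, 2):
        b_n = 2 / (math.pi * n * (2 - n ** 2 * math.pi ** 2))
        total += b_n * math.sin(n * math.pi * t)
    return total

def left_hand_side(t, h=1e-5):
    """Central-difference estimate of x'' + 2x for the partial sum."""
    second = (x_sp(t + h) - 2 * x_sp(t) + x_sp(t - h)) / h ** 2
    return second + 2 * x_sp(t)

force_on = left_hand_side(0.5)    # F = 1 on (0, 1)
force_off = left_hand_side(-0.5)  # F = 0 on (-1, 0)
```

The values of `x_sp` stay between roughly 0.17 and 0.33, and `x_sp(t + 2)` equals `x_sp(t)` up to rounding, matching the period of $F$.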

4.5.2 Resonance
Just as when the forcing function was a simple cosine, we may encounter resonance. We
assume 𝑐 = 0 and so we discuss only pure resonance. Let 𝐹(𝑡) be 2𝐿-periodic and consider

𝑚𝑥 ′′(𝑡) + 𝑘𝑥(𝑡) = 𝐹(𝑡).

When we expand 𝐹(𝑡) and find that some of its terms coincide with the complementary
solution to 𝑚𝑥 ′′ + 𝑘𝑥 = 0, we cannot use those terms in the guess. Just like before, they
disappear when we plug them into the left-hand side, and we get a contradictory equation
(such as 0 = 1). That is, suppose

𝑥 𝑐 = 𝐴 cos(𝜔0 𝑡) + 𝐵 sin(𝜔0 𝑡),


where $\omega_0 = \frac{N\pi}{L}$ for some positive integer $N$. We have to modify our guess and try
\[
x(t) = \frac{a_0}{2} + t \left( a_N \cos\left(\frac{N\pi}{L} t\right) + b_N \sin\left(\frac{N\pi}{L} t\right) \right) + \sum_{\substack{n=1 \\ n \neq N}}^{\infty} a_n \cos\left(\frac{n\pi}{L} t\right) + b_n \sin\left(\frac{n\pi}{L} t\right).
\]

Figure 4.14: Plot of the steady periodic solution $x_{sp}$ of Example 4.5.1.

In other words, we multiply the offending term by 𝑡. From then on, we proceed as before.
The solution is not a Fourier series (it is not even periodic) since it contains these terms multiplied by $t$. The terms $t \left( a_N \cos\left(\frac{N\pi}{L} t\right) + b_N \sin\left(\frac{N\pi}{L} t\right) \right)$ eventually dominate and lead to wild oscillations. As before, this behavior is called pure resonance or just resonance.
Note that we may hit the resonance frequency with infinitely many terms (overtones) of the forcing function $F$. That is, suppose we use the same “shape” for $F$, and we change the base frequency (we change the $L$). Then different terms from the Fourier series of $F$ may interfere with the complementary solution, that is, $\frac{n\pi}{L} = \omega_0$ for some $n$, and possibly cause resonance. Theoretically, infinitely many base frequencies could cause resonance. However, since everything is an approximation in any real-life application, only the first few terms of $F$, and hence only a few such frequencies, would matter in real life.
Example 4.5.2: We want to solve the equation

\[
2x'' + 18\pi^2 x = F(t), \tag{4.10}
\]
where
\[
F(t) =
\begin{cases}
-1 & \text{if } -1 < t < 0, \\
1 & \text{if } 0 < t < 1,
\end{cases}
\]
extended periodically. We note that
\[
F(t) = \sum_{\substack{n=1 \\ n \text{ odd}}}^{\infty} \frac{4}{\pi n} \sin(n\pi t).
\]

Exercise 4.5.1: Compute the Fourier series of $F$ to verify the equation above.

As $\sqrt{\frac{k}{m}} = \sqrt{\frac{18\pi^2}{2}} = 3\pi$, the solution to (4.10) is

𝑥(𝑡) = 𝑐 1 cos(3𝜋𝑡) + 𝑐 2 sin(3𝜋𝑡) + 𝑥 𝑝 (𝑡)


for some particular solution 𝑥 𝑝 .
If we just try an 𝑥 𝑝 given as a Fourier series with sin(𝑛𝜋𝑡) as usual, the complementary
equation, 2𝑥 ′′ + 18𝜋2 𝑥 = 0, eats our 3rd harmonic. That is, the term with sin(3𝜋𝑡) is already
in our complementary solution. Therefore, we pull that term out and multiply it by 𝑡. We
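also add a cosine term to get everything right. That is, we try
\[
x_p(t) = a_3 t \cos(3\pi t) + b_3 t \sin(3\pi t) + \sum_{\substack{n=1 \\ n \text{ odd} \\ n \neq 3}}^{\infty} b_n \sin(n\pi t).
\]
We compute the second derivative.
\[
\begin{aligned}
x_p''(t) = {} & -6 a_3 \pi \sin(3\pi t) - 9\pi^2 a_3 t \cos(3\pi t) + 6 b_3 \pi \cos(3\pi t) - 9\pi^2 b_3 t \sin(3\pi t) \\
& + \sum_{\substack{n=1 \\ n \text{ odd} \\ n \neq 3}}^{\infty} (-n^2 \pi^2 b_n) \sin(n\pi t).
\end{aligned}
\]
We now plug into the left-hand side of the differential equation.
\[
\begin{aligned}
2x_p'' + 18\pi^2 x_p = {} & -12 a_3 \pi \sin(3\pi t) - 18\pi^2 a_3 t \cos(3\pi t) + 12 b_3 \pi \cos(3\pi t) - 18\pi^2 b_3 t \sin(3\pi t) \\
& + 18\pi^2 a_3 t \cos(3\pi t) + 18\pi^2 b_3 t \sin(3\pi t) \\
& + \sum_{\substack{n=1 \\ n \text{ odd} \\ n \neq 3}}^{\infty} (-2n^2 \pi^2 b_n + 18\pi^2 b_n) \sin(n\pi t).
\end{aligned}
\]
We simplify,
\[
2x_p'' + 18\pi^2 x_p = -12 a_3 \pi \sin(3\pi t) + 12 b_3 \pi \cos(3\pi t) + \sum_{\substack{n=1 \\ n \text{ odd} \\ n \neq 3}}^{\infty} (-2n^2 \pi^2 b_n + 18\pi^2 b_n) \sin(n\pi t).
\]
This series has to equal the series for $F(t)$. We equate the coefficients and solve for $a_3$ and $b_n$:
\[
\begin{aligned}
a_3 &= \frac{4/(3\pi)}{-12\pi} = \frac{-1}{9\pi^2}, \\
b_3 &= 0, \\
b_n &= \frac{4}{n\pi (18\pi^2 - 2n^2 \pi^2)} = \frac{2}{\pi^3 n (9 - n^2)} \quad \text{for } n \text{ odd and } n \neq 3.
\end{aligned}
\]
That is,
\[
x_p(t) = \frac{-1}{9\pi^2} t \cos(3\pi t) + \sum_{\substack{n=1 \\ n \text{ odd} \\ n \neq 3}}^{\infty} \frac{2}{\pi^3 n (9 - n^2)} \sin(n\pi t).
\]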

When 𝑐 > 0, you do not have to worry about pure resonance. There are never any
conflicts, and you do not need to multiply any terms by 𝑡. There is a corresponding concept
of practical resonance, and it is very similar to the ideas we already explored in chapter 2.
Basically, what happens in practical resonance is that one of the coefficients in the series
for 𝑥 𝑠𝑝 can get very big. Let us not go into details here.
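
Returning to the pure resonance of Example 4.5.2, the key claim is that the single term $\frac{-1}{9\pi^2} t \cos(3\pi t)$ accounts for the third harmonic $\frac{4}{3\pi} \sin(3\pi t)$ of $F$. The sketch below is our own finite-difference check of that claim; the function names and the step size `h` are arbitrary choices.

```python
import math

def resonant_term(t):
    """The secular part of x_p from Example 4.5.2: -t cos(3 pi t) / (9 pi^2)."""
    return -t * math.cos(3 * math.pi * t) / (9 * math.pi ** 2)

def lhs(g, t, h=1e-4):
    """Central-difference estimate of 2 g'' + 18 pi^2 g at t."""
    second = (g(t + h) - 2 * g(t) + g(t - h)) / h ** 2
    return 2 * second + 18 * math.pi ** 2 * g(t)

def third_harmonic(t):
    """The n = 3 term of the Fourier series of F: 4 / (3 pi) * sin(3 pi t)."""
    return 4 / (3 * math.pi) * math.sin(3 * math.pi * t)

# Disagreement at a few sample times; each should be tiny.
gaps = [abs(lhs(resonant_term, t) - third_harmonic(t)) for t in (0.3, 1.7, 5.2)]
```

The amplitude of `resonant_term` grows linearly with $t$, which is the unbounded oscillation described above.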

4.5.3 Exercises
Exercise 4.5.2: Let $F(t) = \frac{1}{2} + \sum_{n=1}^{\infty} \frac{1}{n^2} \cos(n\pi t)$. Find the steady periodic solution to $x'' + 2x = F(t)$. Express your solution as a Fourier series.

Exercise 4.5.3: Let $F(t) = \sum_{n=1}^{\infty} \frac{1}{n^3} \sin(n\pi t)$. Find the steady periodic solution to $x'' + x' + x = F(t)$. Express your solution as a Fourier series.

Exercise 4.5.4: Let $F(t) = \sum_{n=1}^{\infty} \frac{1}{n^2} \cos(n\pi t)$. Find the steady periodic solution to $x'' + 4x = F(t)$. Express your solution as a Fourier series.
Exercise 4.5.5: Let 𝐹(𝑡) = 𝑡 for −1 < 𝑡 < 1 and extended periodically. Find the steady periodic
solution to 𝑥 ′′ + 𝑥 = 𝐹(𝑡). Express your solution as a series.

Exercise 4.5.6: Let 𝐹(𝑡) = 𝑡 for −1 < 𝑡 < 1 and extended periodically. Find the steady periodic
solution to 𝑥 ′′ + 𝜋2 𝑥 = 𝐹(𝑡). Express your solution as a series.

Exercise 4.5.101: Let $F(t) = \sin(2\pi t) + 0.1 \cos(10\pi t)$. Find the steady periodic solution to $x'' + \sqrt{2}\, x = F(t)$. Express your solution as a Fourier series.

Exercise 4.5.102: Let $F(t) = \sum_{n=1}^{\infty} e^{-n} \cos(2nt)$. Find the steady periodic solution to $x'' + 3x = F(t)$. Express your solution as a Fourier series.

Exercise 4.5.103: Let $F(t) = |t|$ for $-1 \leq t \leq 1$ extended periodically. Find the steady periodic solution to $x'' + \sqrt{3}\, x = F(t)$. Express your solution as a series.

Exercise 4.5.104: Let 𝐹(𝑡) = |𝑡| for −1 ≤ 𝑡 ≤ 1 extended periodically. Find the steady periodic
solution to 𝑥 ′′ + 𝜋2 𝑥 = 𝐹(𝑡). Express your solution as a series.

4.6 PDEs, separation of variables, and the heat equation


Note: 2 lectures, §9.5 in [EP], §10.5 in [BD]
Recall that a partial differential equation or PDE is an equation containing the partial
derivatives with respect to several independent variables. Solving PDEs will be our main
application of Fourier series.
A PDE is said to be linear if the dependent variable and its derivatives appear at most to
the first power and in no functions. We will only talk about linear PDEs. Together with a
PDE, we usually specify some boundary conditions, where the value of the solution or its
derivatives is given along the boundary of a region, and/or some initial conditions where
the value of the solution or its derivatives is given for some initial time. Sometimes such
conditions are mixed together and we will refer to them simply as side conditions.
We will study three specific partial differential equations, each one representing a
general class of equations. First, we will study the heat equation, which is an example of a
parabolic PDE. Next, we will study the wave equation, which is an example of a hyperbolic
PDE. Finally, we will study the Laplace equation, which is an example of an elliptic PDE.
Each of our examples will illustrate behavior that is typical for the whole class.

4.6.1 Heat on an insulated wire


We start with the heat equation. Consider a wire (or a thin metal rod) of length 𝐿 insulated
along its length except at the endpoints. Let 𝑥 denote the position along the wire and let 𝑡
denote time. See Figure 4.15.

Figure 4.15: Insulated wire.

Let $u(x, t)$ denote the temperature at point $x$ at time $t$. The equation governing this setup is the so-called one-dimensional heat equation:
\[
\frac{\partial u}{\partial t} = k \frac{\partial^2 u}{\partial x^2},
\]
where $k > 0$ is a constant (the thermal conductivity of the material). That is, the change in heat with respect to time at some point is proportional to the second derivative of the heat in the $x$ direction, that is, along the wire. This makes sense; if at a fixed $t$ the graph of the heat distribution has a maximum (the graph is concave down and the second $x$ derivative is negative), then heat should flow away from the maximum and so the $t$ derivative should also be negative. Similarly at a minimum, heat wants to flow in.
We generally use a more convenient notation for partial derivatives. We write $u_t$ instead of $\frac{\partial u}{\partial t}$, and we write $u_{xx}$ instead of $\frac{\partial^2 u}{\partial x^2}$. With this notation the heat equation becomes
\[
u_t = k u_{xx}.
\]

The region in which we will solve the heat equation is given by

0<𝑥<𝐿 and 𝑡 > 0.

We must also have some side conditions on the boundaries of that region. We assume that the ends of the wire are either exposed and touching some body of constant heat, or the ends are insulated. If the ends of the wire are kept at temperature 0, then the conditions are
\[
u(0, t) = 0 \quad \text{and} \quad u(L, t) = 0 \quad \text{for } t > 0.
\]
If, on the other hand, the ends are insulated, the conditions are
\[
u_x(0, t) = 0 \quad \text{and} \quad u_x(L, t) = 0 \quad \text{for } t > 0.
\]
Let us see why that is so. If 𝑢𝑥 is positive at some point 𝑥0 , then at a particular time, 𝑢 is
smaller to the left of 𝑥 0 and higher to the right of 𝑥0 . Heat is flowing from high heat to
low heat, that is, to the left. On the other hand, if 𝑢𝑥 is negative, then heat is again flowing
from high heat to low heat, that is, to the right. So when 𝑢𝑥 is zero, we are at a point where
heat is not flowing in either direction. In other words, 𝑢𝑥 (0, 𝑡) = 0 means no heat is flowing
in or out of the wire at the point 𝑥 = 0.
We have two conditions along the 𝑥-axis as there are two derivatives in the 𝑥 direction.
These side conditions are said to be homogeneous (i.e. 𝑢 or a derivative of 𝑢 is set to zero).
We also need an initial condition—the temperature distribution at time 𝑡 = 0. That is,

𝑢(𝑥, 0) = 𝑓 (𝑥) for 0 < 𝑥 < 𝐿,

for some known function 𝑓 (𝑥). This initial condition is not a homogeneous side condition.

4.6.2 Separation of variables


The heat equation is linear as 𝑢 and its derivatives do not appear to any powers or in any
functions, and it is homogeneous—there is no term independent of 𝑢. Thus the principle
of superposition still applies for the heat equation (without side conditions): If 𝑢1 and 𝑢2
are solutions and 𝑐1 , 𝑐2 are constants, then 𝑢 = 𝑐1 𝑢1 + 𝑐 2 𝑢2 is also a solution.

Exercise 4.6.1: Verify the principle of superposition for the heat equation.

Superposition preserves some side conditions. If 𝑢1 and 𝑢2 are solutions that satisfy
𝑢(0, 𝑡) = 0 and 𝑢(𝐿, 𝑡) = 0, and 𝑐 1 , 𝑐 2 are constants, then 𝑢 = 𝑐1 𝑢1 + 𝑐 2 𝑢2 is still a solution
that satisfies 𝑢(0, 𝑡) = 0 and 𝑢(𝐿, 𝑡) = 0. Similarly for the side conditions 𝑢𝑥 (0, 𝑡) = 0 and
𝑢𝑥 (𝐿, 𝑡) = 0. In general, superposition preserves all homogeneous side conditions.
The method of separation of variables is to try to find solutions that are products of
functions of one variable. For the heat equation, we try to find solutions of the form

𝑢(𝑥, 𝑡) = 𝑋(𝑥)𝑇(𝑡).

That the desired particular solution we are looking for is of this form is too much to hope for.
What is perfectly reasonable to ask, however, is to find enough “building-block” solutions
of the form 𝑢(𝑥, 𝑡) = 𝑋(𝑥)𝑇(𝑡) using this procedure so that the desired solution to the PDE
is somehow constructed from these building blocks by the use of superposition.
Let us try to solve the heat equation problem

𝑢𝑡 = 𝑘𝑢𝑥𝑥 , with 𝑢(0, 𝑡) = 0, 𝑢(𝐿, 𝑡) = 0, and 𝑢(𝑥, 0) = 𝑓 (𝑥).

We guess 𝑢(𝑥, 𝑡) = 𝑋(𝑥)𝑇(𝑡). We will try to make this guess satisfy the differential equation,
𝑢𝑡 = 𝑘𝑢𝑥𝑥 , and the homogeneous side conditions, 𝑢(0, 𝑡) = 0 and 𝑢(𝐿, 𝑡) = 0. Then, as
superposition preserves the differential equation and the homogeneous side conditions,
we will try to build up a solution from these building blocks to solve the nonhomogeneous
initial condition 𝑢(𝑥, 0) = 𝑓 (𝑥).
First, we plug 𝑢(𝑥, 𝑡) = 𝑋(𝑥)𝑇(𝑡) into the heat equation to obtain

𝑋(𝑥)𝑇 ′(𝑡) = 𝑘𝑋 ′′(𝑥)𝑇(𝑡).

We rewrite as
\[
\frac{T'(t)}{k T(t)} = \frac{X''(x)}{X(x)}.
\]
This equation must hold for all 𝑥 and all 𝑡. But the left-hand side does not depend on 𝑥
and the right-hand side does not depend on 𝑡. Hence, each side must be a constant. Let us
call this constant −𝜆 (the minus sign is for convenience later). We obtain the two equations

\[
\frac{T'(t)}{k T(t)} = -\lambda = \frac{X''(x)}{X(x)}.
\]
In other words,

𝑋 ′′(𝑥) + 𝜆𝑋(𝑥) = 0,
𝑇 ′(𝑡) + 𝜆𝑘𝑇(𝑡) = 0.

The boundary condition 𝑢(0, 𝑡) = 0 implies 𝑋(0)𝑇(𝑡) = 0. We are looking for a nontrivial
solution, and so we can assume that 𝑇(𝑡) is not identically zero. Hence 𝑋(0) = 0. Similarly,
𝑢(𝐿, 𝑡) = 0 implies 𝑋(𝐿) = 0. We are looking for nontrivial solutions 𝑋 of the eigenvalue
problem $X'' + \lambda X = 0$, $X(0) = 0$, $X(L) = 0$. We have previously found that the only eigenvalues are $\lambda_n = \frac{n^2 \pi^2}{L^2}$, for integers $n \geq 1$, where eigenfunctions are $\sin\left(\frac{n\pi}{L} x\right)$. Hence, let us pick the solutions
\[
X_n(x) = \sin\left(\frac{n\pi}{L} x\right).
\]
The corresponding $T_n$ must satisfy the equation
\[
T_n'(t) + \frac{n^2 \pi^2}{L^2} k T_n(t) = 0.
\]
This is one of our fundamental equations, and the solution is an exponential:
\[
T_n(t) = e^{\frac{-n^2 \pi^2}{L^2} k t},
\]
where we picked the particular solution where conveniently $T_n(0) = 1$. Our building-block solutions are
\[
u_n(x, t) = X_n(x) T_n(t) = \sin\left(\frac{n\pi}{L} x\right) e^{\frac{-n^2 \pi^2}{L^2} k t}.
\]
We note that $u_n(x, 0) = \sin\left(\frac{n\pi}{L} x\right)$. We write $f(x)$ as the sine series
\[
f(x) = \sum_{n=1}^{\infty} b_n \sin\left(\frac{n\pi}{L} x\right).
\]
That is, we find the Fourier series of the odd periodic extension of $f(x)$. We used the sine series as it corresponds to the eigenvalue problem for $X(x)$ above. Finally, we use superposition to write the solution as
\[
u(x, t) = \sum_{n=1}^{\infty} b_n u_n(x, t) = \sum_{n=1}^{\infty} b_n \sin\left(\frac{n\pi}{L} x\right) e^{\frac{-n^2 \pi^2}{L^2} k t}.
\]
Why does this solution work? First note that it is a solution to the heat equation by superposition. It satisfies $u(0, t) = 0$ and $u(L, t) = 0$, because $x = 0$ or $x = L$ makes all the sines vanish. Finally, plugging in $t = 0$, we notice that $T_n(0) = 1$, and so
\[
u(x, 0) = \sum_{n=1}^{\infty} b_n u_n(x, 0) = \sum_{n=1}^{\infty} b_n \sin\left(\frac{n\pi}{L} x\right) = f(x).
\]
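
The series solution is straightforward to evaluate once the sine coefficients $b_n$ are known. The following is a minimal sketch under our own conventions: `heat_dirichlet` is a hypothetical helper, the list `b` holds $b_1, b_2, \ldots$ (so `b[0]` is $b_1$), and only finitely many terms are summed.

```python
import math

def heat_dirichlet(b, k, L, x, t):
    """Partial sum of u(x,t) = sum b_n sin(n pi x / L) exp(-n^2 pi^2 k t / L^2)."""
    total = 0.0
    for i, b_n in enumerate(b):
        n = i + 1
        decay = math.exp(-((n * math.pi / L) ** 2) * k * t)
        total += b_n * math.sin(n * math.pi * x / L) * decay
    return total

# Two building blocks on a wire of length 2 with k = 0.1 (arbitrary sample values).
u_initial = heat_dirichlet([1.0, 0.5], 0.1, 2.0, 0.5, 0.0)
u_end = heat_dirichlet([1.0, 0.5], 0.1, 2.0, 2.0, 1.0)
u_single_mode = heat_dirichlet([1.0], 0.1, 2.0, 1.0, 3.0)
```

Each mode decays at its own rate $\frac{n^2 \pi^2}{L^2} k$, so higher harmonics die out much faster than the first one.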

Example 4.6.1: Consider an insulated wire of length 1 whose ends are embedded in ice
(temperature 0). Let 𝑘 = 0.003 and let the initial heat distribution be 𝑢(𝑥, 0) = 50 𝑥 (1 − 𝑥).
See Figure 4.16 on the following page. Suppose we want to find the temperature function
𝑢(𝑥, 𝑡). In particular, suppose we want to find when (at what time 𝑡) does the maximum
temperature in the wire drop to one half of the initial maximum of 12.5.

Figure 4.16: Initial distribution of temperature in the wire.

We are solving the following PDE problem:

𝑢𝑡 = 0.003 𝑢𝑥𝑥 for 0 < 𝑥 < 1 and 𝑡 > 0,


𝑢(0, 𝑡) = 𝑢(1, 𝑡) = 0 for 𝑡 > 0,
𝑢(𝑥, 0) = 50 𝑥 (1 − 𝑥) for 0 < 𝑥 < 1.
Í∞
Write 𝑓 (𝑥) = 50 𝑥 (1 − 𝑥) for 0 < 𝑥 < 1 as a sine series 𝑓 (𝑥) = 𝑛=1 𝑏 𝑛 sin(𝑛𝜋𝑥), where

200 (−1)𝑛
(
1
if 𝑛 even,

200 0
𝑏𝑛 = 2 50 𝑥 (1 − 𝑥) sin(𝑛𝜋𝑥) 𝑑𝑥 = 3 3 − =
0 𝜋 𝑛 𝜋3 𝑛 3 400
𝜋3 𝑛 3
if 𝑛 odd.

We plug in these coefficients into the series for 𝑢(𝑥, 𝑡) to obtain the solution

Õ 400 2 𝜋2 0.003 𝑡
𝑢(𝑥, 𝑡) = sin(𝑛𝜋𝑥) 𝑒 −𝑛 .
𝑛=1
𝜋3 𝑛 3
𝑛 odd

We plot the solution Figure 4.17 on the next page for 0 ≤ 𝑡 ≤ 100.
Finally, we answer the question about the maximum temperature. It is relatively easy
to see that the maximum temperature at any fixed time is always at 𝑥 = 0.5, in the middle
of the wire. The plot of 𝑢(𝑥, 𝑡) confirms this intuition. If we plug in 𝑥 = 0.5, we get

\[
u(0.5,t) = \sum_{\substack{n=1 \\ n \text{ odd}}}^{\infty} \frac{400}{\pi^3 n^3} \sin(n\pi\, 0.5)\, e^{-n^2 \pi^2\, 0.003\, t} .
\]

For 𝑛 = 3 and higher (remember 𝑛 is only odd), the terms of the series are insignificant
compared to the first term. The first term in the series is already a very good approximation
of the function. Hence
\[
u(0.5,t) \approx \frac{400}{\pi^3}\, e^{-\pi^2\, 0.003\, t} .
\]

Figure 4.17: Plot of the temperature 𝑢(𝑥, 𝑡) of the wire at position 𝑥 at time 𝑡 for 0 ≤ 𝑡 ≤ 100. Notice the
side conditions 𝑢(0, 𝑡) = 𝑢(1, 𝑡) = 0 and how the exponential makes the temperature decay with time.

The approximation gets better and better as 𝑡 gets larger as the other terms decay much
faster. We plot the function 𝑢(0.5, 𝑡), the temperature at the midpoint of the wire at time 𝑡,
and its approximation by the first term in Figure 4.18 on the following page.
After 𝑡 = 5 or so, it would be hard to tell the difference between the first term of the
series for 𝑢(𝑥, 𝑡) and the real solution 𝑢(𝑥, 𝑡). This behavior is a general feature of solving
the heat equation. If you are interested in behavior for large enough 𝑡, only the first one or
two terms may be necessary.
We get back to the question of when is the maximum temperature one half of the initial
maximum temperature. That is, when is the temperature at the midpoint 12.5/2 = 6.25. The
graph suggests that the approximation by the first term will be close enough. We solve
\[
6.25 = \frac{400}{\pi^3}\, e^{-\pi^2\, 0.003\, t} .
\]
That is,
\[
t = \frac{\ln\left(\frac{6.25\, \pi^3}{400}\right)}{-\pi^2\, 0.003} \approx 24.5 .
\]
So the maximum temperature drops to half at about 𝑡 = 24.5.
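
The estimate can be compared against the full series. The sketch below is illustrative only: the function names and the use of bisection are assumptions, chosen because the midpoint temperature crosses 6.25 exactly once between 𝑡 = 0 and 𝑡 = 100.

```python
import math

K = 0.003      # the constant k from Example 4.6.1
TARGET = 6.25  # half of the initial maximum 12.5


def midpoint_temperature(t, terms=50):
    """Partial sum of the series for u(0.5, t); sin(n pi / 2) alternates +1, -1."""
    total = 0.0
    for m in range(terms):
        n = 2 * m + 1
        sign = 1.0 if m % 2 == 0 else -1.0
        total += sign * 400.0 / (math.pi ** 3 * n ** 3) * math.exp(-(n ** 2) * math.pi ** 2 * K * t)
    return total


def half_time_first_term():
    """Closed-form time from the first-term approximation."""
    return math.log(TARGET * math.pi ** 3 / 400.0) / (-(math.pi ** 2) * K)


def half_time_bisection(lo=0.0, hi=100.0, tol=1e-10):
    """Solve midpoint_temperature(t) = TARGET by bisection on [lo, hi]."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if midpoint_temperature(mid) > TARGET:
            lo = mid  # still too hot, so the crossing is later
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    print(half_time_first_term(), half_time_bisection())
```

The two times differ only in the third decimal place, which supports using the first term alone for this question.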

Figure 4.18: Temperature at the midpoint of the wire (the bottom curve), and the approximation of this
temperature by using only the first term in the series (top curve).

We mention an interesting behavior of the solution to the heat equation. The heat
equation “smoothes” out the function 𝑓 (𝑥) as 𝑡 grows. For a fixed 𝑡, the solution is a Fourier
series with coefficients $b_n e^{\frac{-n^2\pi^2}{L^2} k t}$. If 𝑡 > 0, then these coefficients go to zero faster than
any $\frac{1}{n^p}$ for any power 𝑝. In other words, the Fourier series has infinitely many derivatives
everywhere. Thus even if the function 𝑓 (𝑥) has jumps and corners, for a fixed 𝑡 > 0
the solution 𝑢(𝑥, 𝑡) as a function of 𝑥 is as smooth as we want it to be.
Example 4.6.2: When the initial condition is already a sine series, then there is no need to
compute anything, you just need to plug in. Consider

𝑢𝑡 = 0.3 𝑢𝑥𝑥 , 𝑢(0, 𝑡) = 𝑢(1, 𝑡) = 0, 𝑢(𝑥, 0) = 0.1 sin(𝜋𝑥) + sin(2𝜋𝑥).

The solution is then

\[
u(x,t) = 0.1 \sin(\pi x)\, e^{-0.3 \pi^2 t} + \sin(2\pi x)\, e^{-1.2 \pi^2 t} .
\]

4.6.3 Insulated ends


Now suppose the ends of the wire are insulated. In this case, we are solving the problem

𝑢𝑡 = 𝑘𝑢𝑥𝑥 with 𝑢𝑥 (0, 𝑡) = 0, 𝑢𝑥 (𝐿, 𝑡) = 0, and 𝑢(𝑥, 0) = 𝑓 (𝑥).

Yet again we try a solution of the form 𝑢(𝑥, 𝑡) = 𝑋(𝑥)𝑇(𝑡). By the same procedure as before,
we plug into the heat equation and arrive at the following two equations:

𝑋 ′′(𝑥) + 𝜆𝑋(𝑥) = 0,
𝑇 ′(𝑡) + 𝜆𝑘𝑇(𝑡) = 0.

At this point, the story changes slightly. The boundary condition 𝑢𝑥 (0, 𝑡) = 0 implies
𝑋 ′(0)𝑇(𝑡) = 0. Hence 𝑋 ′(0) = 0. Similarly, 𝑢𝑥 (𝐿, 𝑡) = 0 implies 𝑋 ′(𝐿) = 0. We want
nontrivial solutions 𝑋 of the eigenvalue problem 𝑋 ′′ + 𝜆𝑋 = 0, 𝑋 ′(0) = 0, 𝑋 ′(𝐿) = 0.
We previously found that the only eigenvalues are $\lambda_n = \frac{n^2\pi^2}{L^2}$, for integers 𝑛 ≥ 0, where the
eigenfunctions are $\cos\left(\frac{n\pi}{L} x\right)$ (we include the constant eigenfunction). We pick the solutions
\[
X_n(x) = \cos\left(\frac{n\pi}{L} x\right) \quad \text{and} \quad X_0(x) = 1 .
\]
The corresponding 𝑇𝑛 must satisfy the equation

\[
T_n'(t) + \frac{n^2\pi^2}{L^2}\, k\, T_n(t) = 0 .
\]
For 𝑛 ≥ 1, as before,
\[
T_n(t) = e^{\frac{-n^2\pi^2}{L^2} k t} .
\]
For 𝑛 = 0, we have 𝑇0′(𝑡) = 0 and hence 𝑇0 (𝑡) = 1. Our building-block solutions are
\[
u_n(x,t) = X_n(x) T_n(t) = \cos\left(\frac{n\pi}{L} x\right) e^{\frac{-n^2\pi^2}{L^2} k t}
\]
and
\[
u_0(x,t) = 1 .
\]
We note that $u_n(x,0) = \cos\left(\frac{n\pi}{L} x\right)$. We write 𝑓 using the cosine series
\[
f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty} a_n \cos\left(\frac{n\pi}{L} x\right) .
\]

That is, we find the Fourier series of the even periodic extension of 𝑓 (𝑥).
We use superposition to write the solution as

\[
u(x,t) = \frac{a_0}{2} + \sum_{n=1}^{\infty} a_n u_n(x,t) = \frac{a_0}{2} + \sum_{n=1}^{\infty} a_n \cos\left(\frac{n\pi}{L} x\right) e^{\frac{-n^2\pi^2}{L^2} k t} .
\]

Example 4.6.3: Try the same equation as before, but with insulated ends. We are solving
the following PDE problem

𝑢𝑡 = 0.003 𝑢𝑥𝑥 for 0 < 𝑥 < 1 and 𝑡 > 0,


𝑢𝑥 (0, 𝑡) = 𝑢𝑥 (1, 𝑡) = 0 for 𝑡 > 0,
𝑢(𝑥, 0) = 50 𝑥 (1 − 𝑥) for 0 < 𝑥 < 1.

For this problem, we must find the cosine series of 𝑢(𝑥, 0). For 0 < 𝑥 < 1, we have
\[
50\, x\, (1-x) = \frac{25}{3} + \sum_{\substack{n=2 \\ n \text{ even}}}^{\infty} \left( \frac{-200}{\pi^2 n^2} \right) \cos(n\pi x) .
\]

The calculation is left to the reader. Hence, the solution to the PDE problem, plotted in
Figure 4.19, is given by the series
\[
u(x,t) = \frac{25}{3} + \sum_{\substack{n=2 \\ n \text{ even}}}^{\infty} \left( \frac{-200}{\pi^2 n^2} \right) \cos(n\pi x)\, e^{-n^2 \pi^2\, 0.003\, t} .
\]


Figure 4.19: Plot of the temperature of the insulated wire at position 𝑥 at time 𝑡.

Note in the graph that as time goes on, the temperature evens out across the wire.
Eventually, all the terms except the constant die out, and you will be left with a uniform
temperature of 25/3 ≈ 8.33 along the entire length of the wire.
Let us expand on the last point. The constant term in the series is
\[
\frac{a_0}{2} = \frac{1}{L} \int_0^L f(x)\, dx .
\]
In other words, $\frac{a_0}{2}$ is the average value of 𝑓 (𝑥), that is, the average of the initial temperature.
As the wire is insulated everywhere, no heat can get out, no heat can get in. So the
temperature tries to distribute evenly over time, and the average temperature must always
be the same; in particular, it is always $\frac{a_0}{2}$. As time goes to infinity, the temperature goes to
the constant $\frac{a_0}{2}$ everywhere.
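
The conservation of the average can be observed numerically for Example 4.6.3. The following is a rough sketch: the function names, the cutoff of 200 terms, and the midpoint-rule average are all illustrative assumptions rather than anything from the text.

```python
import math

K = 0.003  # the constant k from Example 4.6.3


def insulated_partial_sum(x, t, terms=200):
    """Partial sum of the Example 4.6.3 cosine series using even n = 2, 4, ..., 2*terms."""
    total = 25.0 / 3.0  # the constant term a_0 / 2
    for m in range(1, terms + 1):
        n = 2 * m
        a_n = -200.0 / (math.pi ** 2 * n ** 2)
        total += a_n * math.cos(n * math.pi * x) * math.exp(-(n ** 2) * math.pi ** 2 * K * t)
    return total


def spatial_average(t, samples=400):
    """Average of the partial sum over 0 < x < 1 using the midpoint rule."""
    h = 1.0 / samples
    return h * sum(insulated_partial_sum((i + 0.5) * h, t) for i in range(samples))


if __name__ == "__main__":
    for t in (0.0, 10.0, 100.0):
        print(t, spatial_average(t), insulated_partial_sum(0.5, t))
```

The average column stays at 25/3 for every 𝑡, while the midpoint temperature drifts from 12.5 toward that same value.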

4.6.4 Exercises
Exercise 4.6.2: Consider a wire of length 2, with 𝑘 = 0.001 and an initial temperature distribution
𝑢(𝑥, 0) = 50𝑥. Both ends are embedded in ice (temperature 0). Find the solution as a series.

Exercise 4.6.3: Find a series solution of

𝑢𝑡 = 𝑢𝑥𝑥 for 0 < 𝑥 < 1 and 𝑡 > 0,


𝑢(0, 𝑡) = 𝑢(1, 𝑡) = 0 for 𝑡 > 0,
𝑢(𝑥, 0) = 100 for 0 < 𝑥 < 1.

Exercise 4.6.4: Find a series solution of

𝑢𝑡 = 𝑢𝑥𝑥 for 0 < 𝑥 < 𝜋 and 𝑡 > 0,


𝑢𝑥 (0, 𝑡) = 𝑢𝑥 (𝜋, 𝑡) = 0 for 𝑡 > 0,
𝑢(𝑥, 0) = 3 cos(𝑥) + cos(3𝑥) for 0 < 𝑥 < 𝜋.

Exercise 4.6.5: Find a series solution of

𝑢𝑡 = (1/3) 𝑢𝑥𝑥 for 0 < 𝑥 < 𝜋 and 𝑡 > 0,
𝑢𝑥 (0, 𝑡) = 𝑢𝑥 (𝜋, 𝑡) = 0 for 𝑡 > 0,
𝑢(𝑥, 0) = 10𝑥/𝜋 for 0 < 𝑥 < 𝜋.

Exercise 4.6.6: Find a series solution of

𝑢𝑡 = 𝑢𝑥𝑥 for 0 < 𝑥 < 1 and 𝑡 > 0,


𝑢(0, 𝑡) = 0, 𝑢(1, 𝑡) = 100 for 𝑡 > 0,
𝑢(𝑥, 0) = sin(𝜋𝑥) for 0 < 𝑥 < 1.

Hint: Use the fact that 𝑢(𝑥, 𝑡) = 100𝑥 is a solution satisfying 𝑢𝑡 = 𝑢𝑥𝑥 , 𝑢(0, 𝑡) = 0, 𝑢(1, 𝑡) = 100.
Then use superposition.

Exercise 4.6.7: Find the steady-state temperature solution as a function of 𝑥 alone, by letting
𝑡 → ∞ in the solution from exercises 4.6.5 and 4.6.6. Verify that it satisfies the equation 𝑢𝑥𝑥 = 0.

Exercise 4.6.8: Use separation of variables to find a nontrivial solution to 𝑢𝑥𝑥 + 𝑢𝑦𝑦 = 0, where
𝑢(𝑥, 0) = 0 and 𝑢(0, 𝑦) = 0. Hint: Try 𝑢(𝑥, 𝑦) = 𝑋(𝑥)𝑌(𝑦).

Exercise 4.6.9 (challenging): Suppose that one end of the wire is insulated (say at 𝑥 = 0) and the
other end is kept at zero temperature. That is, find a series solution of

𝑢𝑡 = 𝑘𝑢𝑥𝑥 for 0 < 𝑥 < 𝐿 and 𝑡 > 0,


𝑢𝑥 (0, 𝑡) = 𝑢(𝐿, 𝑡) = 0 for 𝑡 > 0,
𝑢(𝑥, 0) = 𝑓 (𝑥) for 0 < 𝑥 < 𝐿.

Express any coefficients in the series by integrals of 𝑓 (𝑥).



Exercise 4.6.10 (challenging): Suppose that the wire is circular and insulated, so there are no
ends. You can think of this as simply connecting the two ends and making sure the solution matches
up at the ends. That is, find a series solution of

𝑢𝑡 = 𝑘𝑢𝑥𝑥 for 0 < 𝑥 < 𝐿 and 𝑡 > 0,


𝑢(0, 𝑡) = 𝑢(𝐿, 𝑡), 𝑢𝑥 (0, 𝑡) = 𝑢𝑥 (𝐿, 𝑡) for 𝑡 > 0,
𝑢(𝑥, 0) = 𝑓 (𝑥) for 0 < 𝑥 < 𝐿.

Express any coefficients in the series by integrals of 𝑓 (𝑥).

Exercise 4.6.11: Consider a wire insulated on both ends, 𝐿 = 1, 𝑘 = 1, and 𝑢(𝑥, 0) = cos²(𝜋𝑥).

a) Find the solution 𝑢(𝑥, 𝑡). Hint: a trig identity.


b) Find the average temperature.
c) Initially the temperature variation is 1 (maximum minus the minimum). Find the time when
the variation is 1/2.

Exercise 4.6.101: Find a series solution of

𝑢𝑡 = 3𝑢𝑥𝑥 for 0 < 𝑥 < 𝜋 and 𝑡 > 0,


𝑢(0, 𝑡) = 𝑢(𝜋, 𝑡) = 0 for 𝑡 > 0,
𝑢(𝑥, 0) = 5 sin(𝑥) + 2 sin(5𝑥) for 0 < 𝑥 < 𝜋.

Exercise 4.6.102: Find a series solution of

𝑢𝑡 = 0.1𝑢𝑥𝑥 for 0 < 𝑥 < 𝜋 and 𝑡 > 0,


𝑢𝑥 (0, 𝑡) = 𝑢𝑥 (𝜋, 𝑡) = 0 for 𝑡 > 0,
𝑢(𝑥, 0) = 1 + 2 cos(𝑥) for 0 < 𝑥 < 𝜋.

Exercise 4.6.103: Use separation of variables to find a nontrivial solution to 𝑢𝑥𝑡 = 𝑢𝑥𝑥 .

Exercise 4.6.104: Use a variation on separation of variables to find a nontrivial solution to


𝑢𝑥 + 𝑢𝑡 = 𝑢. Hint: Try 𝑢(𝑥, 𝑡) = 𝑋(𝑥) + 𝑇(𝑡).

Exercise 4.6.105: Suppose that the temperature on the wire is fixed at 0 at the ends, 𝐿 = 1, 𝑘 = 1,
and 𝑢(𝑥, 0) = 100 sin(2𝜋𝑥).

a) What is the temperature at 𝑥 = 1/2 at any time?

b) What are the maximum and the minimum temperature on the wire at 𝑡 = 0?
c) At what time is the maximum temperature on the wire exactly one half of the initial maximum
at 𝑡 = 0?

4.7 One-dimensional wave equation


Note: 1 lecture, §9.6 in [EP], §10.7 in [BD]
Imagine a tensioned guitar string of length 𝐿 that can vibrate. We will only consider
vibrations in one direction. Let 𝑥 denote the position along the string, let 𝑡 denote time,
and let 𝑦(𝑥, 𝑡) denote the displacement of the string from the rest position. See Figure 4.20.

Figure 4.20: Vibrating string of length 𝐿, 𝑥 is position, 𝑦 is displacement.

The equation that governs this setup is the so-called one-dimensional wave equation:

𝑦𝑡𝑡 = 𝑎²𝑦𝑥𝑥 ,

for some constant 𝑎 > 0. The intuition is similar to the heat equation, replacing velocity
with acceleration: the acceleration at a specific point is proportional to the second derivative
of the shape of the string. In other words, when the string is concave down then 𝑦 𝑥𝑥 is
negative and the string wants to accelerate downwards, so 𝑦𝑡𝑡 should be negative. And
vice versa. The wave equation is an example of a hyperbolic PDE.
We will again solve for 𝑦 in the region 0 < 𝑥 < 𝐿 and 𝑡 > 0. Assume that the ends of
the string are fixed in place as on the guitar:

𝑦(0, 𝑡) = 0 and 𝑦(𝐿, 𝑡) = 0 for 𝑡 > 0.

We have two conditions along the 𝑥-axis as there are two derivatives in the 𝑥 direction.
There are also two derivatives along the 𝑡 direction and hence we need two further
conditions here. We need to know the initial position and the initial velocity of the string.
That is, for some known functions 𝑓 (𝑥) and 𝑔(𝑥), we impose

𝑦(𝑥, 0) = 𝑓 (𝑥) and 𝑦𝑡 (𝑥, 0) = 𝑔(𝑥) for 0 < 𝑥 < 𝐿.

The equation is linear and homogeneous, so superposition works just as it did for the
heat equation. Superposition also preserves the homogeneous side conditions 𝑦(0, 𝑡) = 0
and 𝑦(𝐿, 𝑡) = 0. Again we will use separation of variables to find enough building-block
solutions to get the particular solution also solving the nonhomogeneous initial conditions.
There is one change however. We will solve two separate problems and add their solutions.

The two problems we will solve are

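\[
\begin{aligned}
w_{tt} &= a^2 w_{xx} && \text{for } 0 < x < L \text{ and } t > 0, \\
w(0,t) &= w(L,t) = 0 && \text{for } t > 0, \\
w(x,0) &= 0 && \text{for } 0 < x < L, \\
w_t(x,0) &= g(x) && \text{for } 0 < x < L,
\end{aligned}
\tag{4.11}
\]
and
\[
\begin{aligned}
z_{tt} &= a^2 z_{xx} && \text{for } 0 < x < L \text{ and } t > 0, \\
z(0,t) &= z(L,t) = 0 && \text{for } t > 0, \\
z(x,0) &= f(x) && \text{for } 0 < x < L, \\
z_t(x,0) &= 0 && \text{for } 0 < x < L.
\end{aligned}
\tag{4.12}
\]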
The principle of superposition implies that 𝑦 = 𝑤 + 𝑧 solves the wave equation and
the homogeneous side conditions. Furthermore, 𝑦(𝑥, 0) = 𝑤(𝑥, 0) + 𝑧(𝑥, 0) = 𝑓 (𝑥) and
𝑦𝑡 (𝑥, 0) = 𝑤 𝑡 (𝑥, 0) + 𝑧 𝑡 (𝑥, 0) = 𝑔(𝑥). Hence, 𝑦 is a solution to

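\[
\begin{aligned}
y_{tt} &= a^2 y_{xx} && \text{for } 0 < x < L \text{ and } t > 0, \\
y(0,t) &= y(L,t) = 0 && \text{for } t > 0, \\
y(x,0) &= f(x) && \text{for } 0 < x < L, \\
y_t(x,0) &= g(x) && \text{for } 0 < x < L.
\end{aligned}
\tag{4.13}
\]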

The reason for all this complexity is that superposition only works for homogeneous
conditions such as 𝑦(0, 𝑡) = 𝑦(𝐿, 𝑡) = 0, 𝑦(𝑥, 0) = 0, or 𝑦𝑡 (𝑥, 0) = 0. Therefore, we can use
separation of variables to find many building-block solutions solving all the homogeneous
conditions. We can then use them to construct a solution satisfying the remaining
nonhomogeneous condition.
Let us start with (4.11). We try a solution of the form 𝑤(𝑥, 𝑡) = 𝑋(𝑥)𝑇(𝑡) again. We plug
into the wave equation to obtain

\[
X(x) T''(t) = a^2 X''(x) T(t) .
\]

Rewriting, we get
\[
\frac{T''(t)}{a^2 T(t)} = \frac{X''(x)}{X(x)} .
\]
Again, the left-hand side depends only on 𝑡 and the right-hand side depends only on 𝑥. So
both sides equal a constant, which we denote by −𝜆:
\[
\frac{T''(t)}{a^2 T(t)} = -\lambda = \frac{X''(x)}{X(x)} .
\]

We solve to get two ordinary differential equations
\[
\begin{aligned}
X''(x) + \lambda X(x) &= 0, \\
T''(t) + \lambda a^2 T(t) &= 0.
\end{aligned}
\]

The condition 0 = 𝑤(0, 𝑡) = 𝑋(0)𝑇(𝑡) implies 𝑋(0) = 0 and 𝑤(𝐿, 𝑡) = 0 implies that 𝑋(𝐿) = 0.
Therefore, the only nontrivial solutions for the first equation are when $\lambda = \lambda_n = \frac{n^2\pi^2}{L^2}$ and
they are
\[
X_n(x) = \sin\left(\frac{n\pi}{L} x\right) .
\]
The general solution for 𝑇 for this particular 𝜆𝑛 is
\[
T_n(t) = A \cos\left(\frac{n\pi a}{L} t\right) + B \sin\left(\frac{n\pi a}{L} t\right) .
\]
We also have the condition that 𝑤(𝑥, 0) = 0 or 𝑋(𝑥)𝑇(0) = 0. This implies that 𝑇(0) = 0,
which in turn forces 𝐴 = 0. It is convenient to pick $B = \frac{L}{n\pi a}$ (you will see why in a moment)
and hence
\[
T_n(t) = \frac{L}{n\pi a} \sin\left(\frac{n\pi a}{L} t\right) .
\]
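Our building-block solutions are
\[
w_n(x,t) = \frac{L}{n\pi a} \sin\left(\frac{n\pi}{L} x\right) \sin\left(\frac{n\pi a}{L} t\right) .
\]
We differentiate in 𝑡:
\[
\frac{\partial w_n}{\partial t}(x,t) = \sin\left(\frac{n\pi}{L} x\right) \cos\left(\frac{n\pi a}{L} t\right) .
\]
Hence,
\[
\frac{\partial w_n}{\partial t}(x,0) = \sin\left(\frac{n\pi}{L} x\right) .
\]
We expand 𝑔(𝑥) in terms of these sines as
\[
g(x) = \sum_{n=1}^{\infty} b_n \sin\left(\frac{n\pi}{L} x\right) .
\]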

Using superposition we write the solution to (4.11) as a series


\[
w(x,t) = \sum_{n=1}^{\infty} b_n w_n(x,t) = \sum_{n=1}^{\infty} b_n \frac{L}{n\pi a} \sin\left(\frac{n\pi}{L} x\right) \sin\left(\frac{n\pi a}{L} t\right) .
\]

Exercise 4.7.1: Check that 𝑤(𝑥, 0) = 0 and 𝑤 𝑡 (𝑥, 0) = 𝑔(𝑥).


We solve (4.12) similarly. We again try 𝑧(𝑥, 𝑡) = 𝑋(𝑥)𝑇(𝑡). The procedure works exactly
the same at first. We obtain
\[
\begin{aligned}
X''(x) + \lambda X(x) &= 0, \\
T''(t) + \lambda a^2 T(t) &= 0,
\end{aligned}
\]
and the conditions 𝑋(0) = 0, 𝑋(𝐿) = 0. Again, $\lambda = \lambda_n = \frac{n^2\pi^2}{L^2}$ and
\[
X_n(x) = \sin\left(\frac{n\pi}{L} x\right) .
\]

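This time, the condition on 𝑇 is 𝑇 ′(0) = 0. Thus we get that 𝐵 = 0, and we take
\[
T_n(t) = \cos\left(\frac{n\pi a}{L} t\right) .
\]
Our building-block solution is
\[
z_n(x,t) = \sin\left(\frac{n\pi}{L} x\right) \cos\left(\frac{n\pi a}{L} t\right) .
\]
As $z_n(x,0) = \sin\left(\frac{n\pi}{L} x\right)$, we expand 𝑓 (𝑥) in terms of these sines as
\[
f(x) = \sum_{n=1}^{\infty} c_n \sin\left(\frac{n\pi}{L} x\right) .
\]

We write down the solution to (4.12) as a series
\[
z(x,t) = \sum_{n=1}^{\infty} c_n z_n(x,t) = \sum_{n=1}^{\infty} c_n \sin\left(\frac{n\pi}{L} x\right) \cos\left(\frac{n\pi a}{L} t\right) .
\]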

Exercise 4.7.2: Fill in the details in the derivation of the solution of (4.12). Check that the solution
satisfies all the side conditions.
Putting these two solutions together, let us state the result as a theorem.
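Theorem 4.7.1. Take the problem
\[
\begin{aligned}
y_{tt} &= a^2 y_{xx} && \text{for } 0 < x < L \text{ and } t > 0, \\
y(0,t) &= y(L,t) = 0 && \text{for } t > 0, \\
y(x,0) &= f(x) && \text{for } 0 < x < L, \\
y_t(x,0) &= g(x) && \text{for } 0 < x < L,
\end{aligned}
\tag{4.14}
\]
where
\[
f(x) = \sum_{n=1}^{\infty} c_n \sin\left(\frac{n\pi}{L} x\right) \quad \text{and} \quad g(x) = \sum_{n=1}^{\infty} b_n \sin\left(\frac{n\pi}{L} x\right) .
\]
Then the solution 𝑦(𝑥, 𝑡) can be written as a sum of the solutions of (4.11) and (4.12):
\[
\begin{aligned}
y(x,t) &= \sum_{n=1}^{\infty} b_n \frac{L}{n\pi a} \sin\left(\frac{n\pi}{L} x\right) \sin\left(\frac{n\pi a}{L} t\right) + c_n \sin\left(\frac{n\pi}{L} x\right) \cos\left(\frac{n\pi a}{L} t\right) \\
&= \sum_{n=1}^{\infty} \sin\left(\frac{n\pi}{L} x\right) \left[ b_n \frac{L}{n\pi a} \sin\left(\frac{n\pi a}{L} t\right) + c_n \cos\left(\frac{n\pi a}{L} t\right) \right] .
\end{aligned}
\]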

Example 4.7.1: Consider a string of length 2 plucked in the middle, so that it has the initial shape
given in Figure 4.21 on the next page. That is,
\[
f(x) =
\begin{cases}
0.1\, x & \text{if } 0 \leq x \leq 1, \\
0.1\, (2 - x) & \text{if } 1 < x \leq 2.
\end{cases}
\]

Figure 4.21: Initial shape of a plucked string from Example 4.7.1.

Let the string start at rest (𝑔(𝑥) = 0), and let 𝑎 = 1 for simplicity. In other words, we
wish to solve the problem:

𝑦𝑡𝑡 = 𝑦 𝑥𝑥 for 0 < 𝑥 < 2 and 𝑡 > 0,


𝑦(0, 𝑡) = 𝑦(2, 𝑡) = 0 for 𝑡 > 0,
𝑦(𝑥, 0) = 𝑓 (𝑥) for 0 < 𝑥 < 2,
𝑦𝑡 (𝑥, 0) = 0 for 0 < 𝑥 < 2.

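We leave the details of computing the sine series of 𝑓 (𝑥) to the reader. The series is
\[
f(x) = \sum_{n=1}^{\infty} \frac{0.8}{n^2 \pi^2} \sin\left(\frac{n\pi}{2}\right) \sin\left(\frac{n\pi}{2} x\right) .
\]
Note that $\sin\left(\frac{n\pi}{2}\right)$ is the sequence 1, 0, −1, 0, 1, 0, −1, . . . for 𝑛 = 1, 2, 3, 4, . . .. Therefore,
\[
f(x) = \frac{0.8}{\pi^2} \sin\left(\frac{\pi}{2} x\right) - \frac{0.8}{9\pi^2} \sin\left(\frac{3\pi}{2} x\right) + \frac{0.8}{25\pi^2} \sin\left(\frac{5\pi}{2} x\right) - \cdots
\]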

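The solution 𝑦(𝑥, 𝑡) is given by
\[
\begin{aligned}
y(x,t) &= \sum_{n=1}^{\infty} \frac{0.8}{n^2 \pi^2} \sin\left(\frac{n\pi}{2}\right) \sin\left(\frac{n\pi}{2} x\right) \cos\left(\frac{n\pi}{2} t\right) \\
&= \sum_{m=1}^{\infty} \frac{0.8\,(-1)^{m+1}}{(2m-1)^2 \pi^2} \sin\left(\frac{(2m-1)\pi}{2} x\right) \cos\left(\frac{(2m-1)\pi}{2} t\right) \\
&= \frac{0.8}{\pi^2} \sin\left(\frac{\pi}{2} x\right) \cos\left(\frac{\pi}{2} t\right) - \frac{0.8}{9\pi^2} \sin\left(\frac{3\pi}{2} x\right) \cos\left(\frac{3\pi}{2} t\right) \\
&\quad + \frac{0.8}{25\pi^2} \sin\left(\frac{5\pi}{2} x\right) \cos\left(\frac{5\pi}{2} t\right) - \cdots
\end{aligned}
\]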

See Figure 4.22 on the following page for a plot for 0 < 𝑡 < 3. Notice that unlike the
heat equation, the solution does not become “smoother,” the “sharp edges” remain. We
will see the reason for this behavior in the next section where we derive the solution to the
wave equation in a different way.
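
The series for the plucked string is easy to evaluate numerically. The following is a minimal sketch, with the function name `plucked_string` and the cutoff of 200 terms chosen here for illustration; it uses the second form of the series above, summed over 𝑚.

```python
import math


def plucked_string(x, t, terms=200):
    """Partial sum of the Example 4.7.1 series (L = 2, a = 1), summed over m = 1..terms."""
    total = 0.0
    for m in range(1, terms + 1):
        n = 2 * m - 1  # only odd n contribute
        coeff = 0.8 * (-1) ** (m + 1) / (n ** 2 * math.pi ** 2)
        total += coeff * math.sin(n * math.pi * x / 2) * math.cos(n * math.pi * t / 2)
    return total


if __name__ == "__main__":
    # Shape of the string at a few times; the period in t is 4.
    for t in (0.0, 1.0, 2.0, 4.0):
        print(t, [round(plucked_string(x, t), 4) for x in (0.5, 1.0, 1.5)])
```

At 𝑡 = 0 the values reproduce the triangular pluck, at 𝑡 = 2 the string is the mirror image below the axis, and at 𝑡 = 4 it returns to the initial shape.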

Figure 4.22: Shape of the plucked string for 0 < 𝑡 < 3.

Make sure you understand what a plot such as the one in the figure is telling you.
For each fixed 𝑡, you can think of the function 𝑦(𝑥, 𝑡) as just a function of 𝑥. This function
gives you the shape of the string at time 𝑡. See Figure 4.23 on the next page for plots of 𝑦
as a function of 𝑥 at several different values of 𝑡. On this plot, you can see the sharp edges
remaining much better.
Figure 4.23: Plucked string for 𝑡 = 0, 𝑡 = 0.4, 𝑡 = 0.8, and 𝑡 = 1.2.

One thing to take away from all this is how a guitar sounds. Notice that the (angular)
frequencies that come up in the solution are $n \frac{\pi a}{L}$. That is, there is a certain base fundamental
frequency $\frac{\pi a}{L}$, and then we also get all the multiples of this frequency, which in music are
called the overtones. Which overtones appear and with what amplitude is what musicians
call the timbre of the note. Mathematicians usually call this the spectrum. Because all the
frequencies are multiples of one frequency (the fundamental), we get a nice pleasing sound.
The fundamental frequency $\frac{\pi a}{L}$ increases as we decrease length 𝐿. That is, if we place a
finger on the fingerboard and then pluck a string we get a higher note. The constant 𝑎 is
given by
\[
a = \sqrt{\frac{T}{\rho}} ,
\]
where 𝑇 is tension and 𝜌 is the linear density of the string. Tightening the string (turning
the tuning peg on a guitar) increases 𝑎 and hence produces a higher fundamental frequency
(a higher note). On the other hand, using a heavier string reduces 𝑎 and produces a lower
fundamental frequency (a lower note). A bass guitar has longer thicker strings, while a
ukulele has short strings made of lighter material.
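
The dependence of the note on length, tension, and density can be tabulated with a few lines of code. This is only a sketch: the function names are made up for illustration, the sample numbers (0.65 m, 70 N, 0.0004 kg/m) are hypothetical values rather than data from the text, and SI units are assumed so that 𝑎 comes out in meters per second. Dividing the angular frequency by 2𝜋 converts it to hertz.

```python
import math


def wave_speed(tension, linear_density):
    """The constant a = sqrt(T / rho) from the wave equation."""
    return math.sqrt(tension / linear_density)


def fundamental_angular_frequency(length, tension, linear_density):
    """Fundamental angular frequency pi * a / L."""
    return math.pi * wave_speed(tension, linear_density) / length


def overtone_frequencies_hz(length, tension, linear_density, count=4):
    """First `count` multiples of the fundamental, converted from rad/s to Hz."""
    omega = fundamental_angular_frequency(length, tension, linear_density)
    return [n * omega / (2 * math.pi) for n in range(1, count + 1)]


if __name__ == "__main__":
    # Hypothetical string: 0.65 m long, 70 N tension, 0.0004 kg/m.
    print(overtone_frequencies_hz(0.65, 70.0, 0.0004))
    # Halving the length (fretting at the midpoint) doubles every frequency.
    print(overtone_frequencies_hz(0.325, 70.0, 0.0004))
```

Because 𝑎 grows like the square root of the tension, the tension must be quadrupled to double the frequency, whereas halving the length does the same job directly.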
Something rather interesting is the almost-symmetry between space and time. In its
simplest form, we see this symmetry in the solutions
\[
\sin\left(\frac{n\pi}{L} x\right) \sin\left(\frac{n\pi a}{L} t\right) .
\]
Except for the constant 𝑎, this solution looks the same if we flip time and space. In general,
the solution for a fixed 𝑥 is a Fourier series in 𝑡, for a fixed 𝑡 it is a Fourier series in 𝑥, and
the coefficients are related. If the shape 𝑓 (𝑥) or the initial velocity 𝑔(𝑥) have lots of corners,
then the sound wave will have lots of corners. That is because the Fourier coefficients of
the initial shape decay to zero (as 𝑛 → ∞) at the same rate as the Fourier coefficients of
the wave in time (for some fixed 𝑥). So if you use a sharp object to pick the string, you get
a sharper sound with lots of high-frequency components, while if you use your thumb,
you get a softer sound without so many high overtones. Similarly, if you pluck close to
the bridge (close to one end of the string), you are getting a pluck that looks more like the
sawtooth, and you get an even sharper sound.
In fact, if you look at the formula for the solution, you see that for any fixed 𝑥, we get an
almost arbitrary Fourier series in 𝑡, everything except the constant term. In theory, you can
obtain any timbre you want by plucking the string in just the right way. Of course, we are
considering an ideal string of no stiffness and no air resistance. Those variables clearly
impact the sound as well.

4.7.1 Exercises
Exercise 4.7.3: Solve

𝑦𝑡𝑡 = 9𝑦 𝑥𝑥 for 0 < 𝑥 < 1 and 𝑡 > 0,


𝑦(0, 𝑡) = 𝑦(1, 𝑡) = 0 for 𝑡 > 0,
𝑦(𝑥, 0) = sin(3𝜋𝑥) + (1/4) sin(6𝜋𝑥) for 0 < 𝑥 < 1,
𝑦𝑡 (𝑥, 0) = 0 for 0 < 𝑥 < 1.

Exercise 4.7.4: Solve

𝑦𝑡𝑡 = 4𝑦 𝑥𝑥 for 0 < 𝑥 < 1 and 𝑡 > 0,


𝑦(0, 𝑡) = 𝑦(1, 𝑡) = 0 for 𝑡 > 0,
𝑦(𝑥, 0) = sin(3𝜋𝑥) + (1/4) sin(6𝜋𝑥) for 0 < 𝑥 < 1,
𝑦𝑡 (𝑥, 0) = sin(9𝜋𝑥) for 0 < 𝑥 < 1.

Exercise 4.7.5: Derive the solution for a general plucked string of length 𝐿 and any constant 𝑎 (in
the equation 𝑦𝑡𝑡 = 𝑎²𝑦𝑥𝑥 ), where we raise the string some distance 𝑏 at the midpoint and let go.

Exercise 4.7.6: Imagine that a stringed musical instrument falls on the floor. Suppose that the
length of the string is 1 and 𝑎 = 1. When the musical instrument hits the ground the string was in
rest position and hence 𝑦(𝑥, 0) = 0. However, the string was moving at some velocity at impact
(𝑡 = 0), say 𝑦𝑡 (𝑥, 0) = −1. Find the solution 𝑦(𝑥, 𝑡) for the shape of the string at time 𝑡.

Exercise 4.7.7 (challenging): Suppose that you have a vibrating string and that there is air
resistance proportional to the velocity. That is, you have

𝑦𝑡𝑡 = 𝑎²𝑦𝑥𝑥 − 𝑘𝑦𝑡 for 0 < 𝑥 < 1 and 𝑡 > 0,


𝑦(0, 𝑡) = 𝑦(1, 𝑡) = 0 for 𝑡 > 0,
𝑦(𝑥, 0) = 𝑓 (𝑥) for 0 < 𝑥 < 1,
𝑦𝑡 (𝑥, 0) = 0 for 0 < 𝑥 < 1.

Suppose that 0 < 𝑘 < 2𝜋𝑎. Derive a series solution to the problem. Any coefficients in the series
should be expressed as integrals of 𝑓 (𝑥).

Exercise 4.7.8: Suppose you touch the guitar string exactly in the middle to ensure another condition
𝑦(𝐿/2, 𝑡) = 0 for all time. Which multiples of the fundamental frequency $\frac{\pi a}{L}$ show up in the solution?

Exercise 4.7.101: Solve


𝑦𝑡𝑡 = 𝑦 𝑥𝑥 for 0 < 𝑥 < 𝜋, 𝑡 > 0,
𝑦(0, 𝑡) = 𝑦(𝜋, 𝑡) = 0 for 𝑡 > 0,
𝑦(𝑥, 0) = sin(𝑥) for 0 < 𝑥 < 𝜋,
𝑦𝑡 (𝑥, 0) = sin(𝑥) for 0 < 𝑥 < 𝜋.

Exercise 4.7.102: Solve


𝑦𝑡𝑡 = 25𝑦 𝑥𝑥 for 0 < 𝑥 < 2 and 𝑡 > 0,
𝑦(0, 𝑡) = 𝑦(2, 𝑡) = 0 for 𝑡 > 0,
𝑦(𝑥, 0) = 0 for 0 < 𝑥 < 2,
𝑦𝑡 (𝑥, 0) = sin(𝜋𝑥) + 0.1 sin(2𝜋𝑥) for 0 < 𝑥 < 2.

Exercise 4.7.103: Solve


𝑦𝑡𝑡 = 2𝑦 𝑥𝑥 for 0 < 𝑥 < 𝜋 and 𝑡 > 0,
𝑦(0, 𝑡) = 𝑦(𝜋, 𝑡) = 0 for 𝑡 > 0,
𝑦(𝑥, 0) = 𝑥 for 0 < 𝑥 < 𝜋,
𝑦𝑡 (𝑥, 0) = 0 for 0 < 𝑥 < 𝜋.

Exercise 4.7.104: What happens when 𝑎 = 0? Find a solution to 𝑦𝑡𝑡 = 0, 𝑦(0, 𝑡) = 𝑦(𝜋, 𝑡) = 0,
𝑦(𝑥, 0) = sin(2𝑥), 𝑦𝑡 (𝑥, 0) = sin(𝑥).

4.8 D’Alembert solution of the wave equation


Note: 1 lecture, different from §9.6 in [EP], part of §10.7 in [BD]
We have solved the wave equation by using Fourier series. But it is often more convenient
to use the so-called d’Alembert solution to the wave equation‗ . While this solution can be
derived using Fourier series as well, it is really an awkward use of those concepts. It is
easier and more instructive to derive this solution by making a correct change of variables
to get an equation that can be solved by simple integration.
Suppose we wish to solve the wave equation

𝑦𝑡𝑡 = 𝑎²𝑦𝑥𝑥 (4.15)

in the region 0 < 𝑥 < 𝐿 and 𝑡 > 0 subject to the side conditions

𝑦(0, 𝑡) = 𝑦(𝐿, 𝑡) = 0 for 𝑡 > 0,


𝑦(𝑥, 0) = 𝑓 (𝑥) for 0 < 𝑥 < 𝐿, (4.16)
𝑦𝑡 (𝑥, 0) = 𝑔(𝑥) for 0 < 𝑥 < 𝐿.

4.8.1 Change of variables


We will transform the equation into a simpler form where it can be solved by simple
integration. We change variables to 𝜉 = 𝑥 − 𝑎𝑡, 𝜂 = 𝑥 + 𝑎𝑡. The chain rule says:

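\[
\begin{aligned}
\frac{\partial}{\partial x} &= \frac{\partial \xi}{\partial x} \frac{\partial}{\partial \xi} + \frac{\partial \eta}{\partial x} \frac{\partial}{\partial \eta} = \frac{\partial}{\partial \xi} + \frac{\partial}{\partial \eta} , \\
\frac{\partial}{\partial t} &= \frac{\partial \xi}{\partial t} \frac{\partial}{\partial \xi} + \frac{\partial \eta}{\partial t} \frac{\partial}{\partial \eta} = -a \frac{\partial}{\partial \xi} + a \frac{\partial}{\partial \eta} .
\end{aligned}
\]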
We compute

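\[
\begin{aligned}
y_{xx} &= \frac{\partial^2 y}{\partial x^2} = \left( \frac{\partial}{\partial \xi} + \frac{\partial}{\partial \eta} \right) \left( \frac{\partial y}{\partial \xi} + \frac{\partial y}{\partial \eta} \right) = \frac{\partial^2 y}{\partial \xi^2} + 2 \frac{\partial^2 y}{\partial \xi \partial \eta} + \frac{\partial^2 y}{\partial \eta^2} , \\
y_{tt} &= \frac{\partial^2 y}{\partial t^2} = \left( -a \frac{\partial}{\partial \xi} + a \frac{\partial}{\partial \eta} \right) \left( -a \frac{\partial y}{\partial \xi} + a \frac{\partial y}{\partial \eta} \right) = a^2 \frac{\partial^2 y}{\partial \xi^2} - 2a^2 \frac{\partial^2 y}{\partial \xi \partial \eta} + a^2 \frac{\partial^2 y}{\partial \eta^2} .
\end{aligned}
\]
In the computations above, we used the fact from calculus that $\frac{\partial^2 y}{\partial \xi \partial \eta} = \frac{\partial^2 y}{\partial \eta \partial \xi}$. We plug what
we got into the wave equation,
\[
0 = a^2 y_{xx} - y_{tt} = 4a^2 \frac{\partial^2 y}{\partial \xi \partial \eta} = 4a^2 y_{\xi\eta} .
\]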
Therefore, the wave equation (4.15) transforms into $y_{\xi\eta} = 0$. It is easy to find the general
solution to this new equation by integrating twice. Keeping 𝜉 constant, we integrate with
respect to 𝜂 first‗ and note that the constant of integration depends on 𝜉; for each 𝜉, we may
get a different constant of integration. We get $y_{\xi} = C(\xi)$. Next, we integrate with respect to
𝜉 and note that the constant of integration depends on 𝜂. Thus, $y = \int C(\xi)\, d\xi + B(\eta)$. The
solution must, therefore, be of the following form for some functions 𝐴(𝜉) and 𝐵(𝜂):

𝑦 = 𝐴(𝜉) + 𝐵(𝜂) = 𝐴(𝑥 − 𝑎𝑡) + 𝐵(𝑥 + 𝑎𝑡).

‗ Named after the French mathematician Jean le Rond d’Alembert (1717–1783).

The solution is a superposition of two functions (waves) traveling at speed 𝑎 in opposite
directions. The coordinates 𝜉 and 𝜂 are called the characteristic coordinates, and a similar
technique can be applied to more complicated hyperbolic PDEs. In § 1.9 it is used to solve
first-order linear PDEs. Basically, to solve the wave equation (or more general hyperbolic
equations) we find certain characteristic curves along which the equation is really just an
ODE, or a pair of ODEs. In this case these are the curves where 𝜉 and 𝜂 are constant.
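
As a sanity check on the general form, one can differentiate 𝑦 = 𝐴(𝑥 − 𝑎𝑡) + 𝐵(𝑥 + 𝑎𝑡) directly. The short computation below is a sketch that assumes 𝐴 and 𝐵 are twice differentiable, which the text has not yet required:

```latex
\begin{aligned}
y_t    &= -a\,A'(x - at) + a\,B'(x + at), &\qquad y_{tt} &= a^2 A''(x - at) + a^2 B''(x + at), \\
y_x    &= A'(x - at) + B'(x + at),        &\qquad y_{xx} &= A''(x - at) + B''(x + at).
\end{aligned}
```

Comparing the two second derivatives gives 𝑦𝑡𝑡 = 𝑎²𝑦𝑥𝑥 for any such choice of 𝐴 and 𝐵.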

4.8.2 D’Alembert’s formula


We know what any solution must look like, but we need to solve for the given side
conditions. We will just give the formula and see that it works. Let 𝐹(𝑥) denote the odd
periodic extension of 𝑓 (𝑥), and let 𝐺(𝑥) denote the odd periodic extension of 𝑔(𝑥). Define
\[
A(x) = \frac{1}{2} F(x) - \frac{1}{2a} \int_0^x G(s)\, ds , \qquad B(x) = \frac{1}{2} F(x) + \frac{1}{2a} \int_0^x G(s)\, ds .
\]

We claim these 𝐴(𝑥) and 𝐵(𝑥) give the solution. Explicitly, the solution is 𝑦(𝑥, 𝑡) =
𝐴(𝑥 − 𝑎𝑡) + 𝐵(𝑥 + 𝑎𝑡) or in other words:
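\[
\begin{aligned}
y(x,t) &= \frac{1}{2} F(x - at) - \frac{1}{2a} \int_0^{x-at} G(s)\, ds + \frac{1}{2} F(x + at) + \frac{1}{2a} \int_0^{x+at} G(s)\, ds \\
&= \frac{F(x - at) + F(x + at)}{2} + \frac{1}{2a} \int_{x-at}^{x+at} G(s)\, ds .
\end{aligned}
\tag{4.17}
\]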

Let us check that the d’Alembert formula really works. First,


\[
y(x,0) = \frac{F(x) + F(x)}{2} + \frac{1}{2a} \int_x^x G(s)\, ds = F(x) .
\]

So far so good. Assume for simplicity 𝐹 is differentiable. And we use the first form of (4.17)
as it is easier to differentiate. By the fundamental theorem of calculus we have
\[
y_t(x,t) = \frac{-a}{2} F'(x - at) + \frac{1}{2} G(x - at) + \frac{a}{2} F'(x + at) + \frac{1}{2} G(x + at) .
\]
So
\[
y_t(x,0) = \frac{-a}{2} F'(x) + \frac{1}{2} G(x) + \frac{a}{2} F'(x) + \frac{1}{2} G(x) = G(x) .
\]
‗There is nothing special about 𝜂, you can integrate with 𝜉 first, if you wish.

Yay! We’re smoking now. OK, now the boundary conditions. Note that 𝐹(𝑥) and 𝐺(𝑥) are
odd. So
\[
y(0,t) = \frac{F(-at) + F(at)}{2} + \frac{1}{2a} \int_{-at}^{at} G(s)\, ds = \frac{-F(at) + F(at)}{2} + \frac{1}{2a} \int_{-at}^{at} G(s)\, ds = 0 + 0 = 0 .
\]

Now 𝐹(𝑥) is odd and 2𝐿-periodic, so

𝐹(𝐿 − 𝑎𝑡) + 𝐹(𝐿 + 𝑎𝑡) = 𝐹(−𝐿 − 𝑎𝑡) + 𝐹(𝐿 + 𝑎𝑡) = −𝐹(𝐿 + 𝑎𝑡) + 𝐹(𝐿 + 𝑎𝑡) = 0.

Next, 𝐺(𝑠) is odd and 2𝐿-periodic, so we change variables 𝑣 = 𝑠 − 𝐿. We then notice that
𝐺(𝑣 + 𝐿) = 𝐺(𝑣 − 𝐿) = −𝐺(−𝑣 + 𝐿), so 𝐺(𝑣 + 𝐿) is odd as a function of 𝑣:
\[
\int_{L-at}^{L+at} G(s)\, ds = \int_{-at}^{at} G(v + L)\, dv = 0 .
\]

Hence
\[
y(L,t) = \frac{F(L - at) + F(L + at)}{2} + \frac{1}{2a} \int_{L-at}^{L+at} G(s)\, ds = 0 + 0 = 0 .
\]
And voilà, it works.
Example 4.8.1: D’Alembert says that the solution is a superposition of two functions
(waves) moving in the opposite direction at “speed” 𝑎. To get an idea of how it works, we
work out the following example. Consider the simpler setup

𝑦𝑡𝑡 = 𝑦 𝑥𝑥 for 0 < 𝑥 < 1 and 𝑡 > 0,


𝑦(0, 𝑡) = 𝑦(1, 𝑡) = 0 for 𝑡 > 0,
𝑦(𝑥, 0) = 𝑓 (𝑥) for 0 < 𝑥 < 1,
𝑦𝑡 (𝑥, 0) = 0 for 0 < 𝑥 < 1.
Let 𝑓 (𝑥) be the following triangular impulse of height 1 centered at 𝑥 = 0.5:
\[
f(x) =
\begin{cases}
0 & \text{if } 0 \leq x < 0.45, \\
20\, (x - 0.45) & \text{if } 0.45 \leq x < 0.5, \\
20\, (0.55 - x) & \text{if } 0.5 \leq x < 0.55, \\
0 & \text{if } 0.55 \leq x \leq 1.
\end{cases}
\]


The graph of this impulse is the top left plot in Figure 4.24 on the facing page.
Let 𝐹(𝑥) be the odd periodic extension of 𝑓 (𝑥). Then (4.17) says that the solution is
\[
y(x,t) = \frac{F(x - t) + F(x + t)}{2} .
\]
It is not hard to compute specific values of 𝑦(𝑥, 𝑡). For example, to compute 𝑦(0.1, 0.6),
we notice 𝑥 − 𝑡 = −0.5 and 𝑥 + 𝑡 = 0.7. Now 𝐹(−0.5) = − 𝑓 (0.5) = −20 (0.55 − 0.5) = −1
and 𝐹(0.7) = 𝑓 (0.7) = 0. Hence 𝑦(0.1, 0.6) = (−1 + 0)/2 = −0.5. As you can see the d’Alembert
solution is much easier to actually compute and to plot than the Fourier series solution.
See Figure 4.24 on the next page for plots of the solution 𝑦 for several different 𝑡.
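
The computation in this example translates almost line for line into code. The sketch below is illustrative: the helper `odd_periodic_extension` and its modular-reduction approach are assumptions made here, not something taken from the text.

```python
def f(x):
    """Triangular impulse from Example 4.8.1, defined for 0 <= x <= 1."""
    if 0.45 <= x < 0.5:
        return 20 * (x - 0.45)
    if 0.5 <= x < 0.55:
        return 20 * (0.55 - x)
    return 0.0


def odd_periodic_extension(func, x, length=1.0):
    """Evaluate the odd, 2*length-periodic extension of func at any real x."""
    # Reduce x to the interval [-length, length) using periodicity.
    r = (x + length) % (2 * length) - length
    # On the negative half, use oddness: F(r) = -f(-r).
    return func(r) if r >= 0 else -func(-r)


def dalembert(x, t):
    """y(x, t) = (F(x - t) + F(x + t)) / 2 for a = 1 and zero initial velocity."""
    left = odd_periodic_extension(f, x - t)
    right = odd_periodic_extension(f, x + t)
    return 0.5 * (left + right)


if __name__ == "__main__":
    print(dalembert(0.1, 0.6))  # the value worked out in the text, about -0.5
    print([round(dalembert(0.1 * i, 0.2), 3) for i in range(11)])
```

Evaluating on a grid of 𝑥 values for a fixed 𝑡 gives the data for plots like those in Figure 4.24, with no series to truncate.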

Figure 4.24: Plot of the d’Alembert solution for 𝑡 = 0, 𝑡 = 0.2, 𝑡 = 0.4, and 𝑡 = 0.6.

4.8.3 Another way to solve for the side conditions


It is perhaps easier and more useful to remember the procedure rather than memorizing the
formula itself. The important thing is that a solution to the wave equation is a superposition
of two waves traveling in opposite directions. That is,

𝑦(𝑥, 𝑡) = 𝐴(𝑥 − 𝑎𝑡) + 𝐵(𝑥 + 𝑎𝑡).

If you think about it, the exact formulas for 𝐴 and 𝐵 are not hard to guess once you realize
what kind of side conditions 𝑦(𝑥, 𝑡) is supposed to satisfy. Let us walk through the formula
again, but slightly differently. The best approach is to do it in stages. When 𝑔(𝑥) = 0 (and hence
𝐺(𝑥) = 0), the solution is
\[
\frac{F(x - at) + F(x + at)}{2} .
\]
On the other hand, when 𝑓 (𝑥) = 0 (and hence 𝐹(𝑥) = 0), we let
\[
H(x) = \int_0^x G(s)\, ds .
\]

The solution in this case is
\[
\frac{1}{2a} \int_{x-at}^{x+at} G(s)\, ds = \frac{-H(x - at) + H(x + at)}{2a} .
\]
By superposition, we get a solution for the general side conditions (4.16) (when neither
𝑓 (𝑥) nor 𝑔(𝑥) is identically zero):
\[
y(x,t) = \frac{F(x - at) + F(x + at)}{2} + \frac{-H(x - at) + H(x + at)}{2a} .
\tag{4.18}
\]
Do note the minus sign before the 𝐻 and the 𝑎 in the second denominator.

Exercise 4.8.1: Check that the new formula (4.18) satisfies the side conditions (4.16).

Warning: Make sure you use the odd periodic extensions 𝐹(𝑥) and 𝐺(𝑥), when you
have formulas for 𝑓 (𝑥) and 𝑔(𝑥). The thing is, those formulas in general hold only for
0 < 𝑥 < 𝐿 and are not usually equal to 𝐹(𝑥) and 𝐺(𝑥) for other 𝑥.
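
Formula (4.18), together with the warning about extensions, can be packaged as a small routine. This is a sketch under stated assumptions: the names `dalembert_general` and `simpson` are invented here, 𝐻 is approximated by a composite Simpson rule (which is accurate for smooth 𝐺 but cruder when 𝐺 has jumps), and the test data 𝑎 = 3, 𝐿 = 𝜋, 𝑓 (𝑥) = sin(2𝑥), 𝑔(𝑥) = sin(𝑥) are chosen only because Theorem 4.7.1 gives a closed form to compare against.

```python
import math


def odd_periodic_extension(func, length):
    """Return the odd, 2*length-periodic extension of func as a new function."""
    def extended(x):
        r = (x + length) % (2 * length) - length  # reduce to [-length, length)
        return func(r) if r >= 0 else -func(-r)
    return extended


def simpson(func, lo, hi, panels=200):
    """Composite Simpson rule on [lo, hi]; also valid when hi < lo."""
    h = (hi - lo) / panels
    total = func(lo) + func(hi)
    for i in range(1, panels):
        total += (4 if i % 2 else 2) * func(lo + i * h)
    return total * h / 3.0


def dalembert_general(f, g, a, length):
    """Build y(x, t) from formula (4.18) using the odd periodic extensions F and G."""
    F = odd_periodic_extension(f, length)
    G = odd_periodic_extension(g, length)

    def H(x):
        return simpson(G, 0.0, x)  # H(x) = integral of G from 0 to x

    def y(x, t):
        shape_part = 0.5 * (F(x - a * t) + F(x + a * t))
        velocity_part = (-H(x - a * t) + H(x + a * t)) / (2 * a)
        return shape_part + velocity_part

    return y


if __name__ == "__main__":
    y = dalembert_general(lambda x: math.sin(2 * x), math.sin, a=3.0, length=math.pi)
    x, t = 1.0, 0.7
    series = math.sin(2 * x) * math.cos(6 * t) + math.sin(x) * math.sin(3 * t) / 3
    print(y(x, t), series)
```

The two printed numbers agree closely, and the arguments 𝑥 − 𝑎𝑡 and 𝑥 + 𝑎𝑡 in this run fall outside 0 < 𝑥 < 𝜋, which is exactly where using 𝑓 and 𝑔 instead of 𝐹 and 𝐺 would go wrong in general.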

4.8.4 Some remarks


We remark that the formula 𝑦(𝑥, 𝑡) = 𝐴(𝑥 − 𝑎𝑡) + 𝐵(𝑥 + 𝑎𝑡) is the reason why the solution
of the wave equation does not get “nicer” as time goes on, that is, why in the examples
where the initial conditions had corners, the solution also has corners at every time 𝑡.
The corners bring us to another interesting remark. Nobody ever notices at first that
our example solutions are not even differentiable (they have corners): In Example 4.8.1
above, the solution is not differentiable whenever 𝑥 = 𝑡 + 0.5 or 𝑥 = −𝑡 + 0.5 for example.
Really to be able to compute 𝑢𝑥𝑥 or 𝑢𝑡𝑡 , you need not one, but two derivatives. Fear not, we
could think of a shape that is very nearly 𝐹(𝑥) but does have two derivatives by rounding
the corners a little bit, and then the solution would be very nearly $\frac{F(x-t) + F(x+t)}{2}$ and nobody
would notice the switch.
One final remark is what the d’Alembert solution tells us about what part of the initial
conditions influences the solution at a certain point. We can figure this out by “traveling
backwards along the characteristics.” Suppose that the string is very long (perhaps infinite)
for simplicity. Since the solution at time 𝑡 is
\[ y(x, t) = \frac{F(x - at) + F(x + at)}{2} + \frac{1}{2a} \int_{x-at}^{x+at} G(s) \, ds , \]
we notice that we have only used the initial conditions in the interval [𝑥 − 𝑎𝑡, 𝑥 + 𝑎𝑡]. The
endpoints of this interval are called the wavefronts, as that is where the wavefront of
an initial (𝑡 = 0) disturbance at 𝑥 is located at time 𝑡. If 𝑎 = 1, an observer sitting at 𝑥 = 0 at time 𝑡 = 1 has only
seen the initial conditions for 𝑥 in the range [−1, 1] and is blissfully unaware of anything
else. This is why, for example, we do not know that a supernova has occurred in the
universe until we see its light, millions of years from the time when it did in fact happen.
4.8.5 Exercises
Exercise 4.8.2: Using the d’Alembert solution solve 𝑦𝑡𝑡 = 4𝑦 𝑥𝑥 , 0 < 𝑥 < 𝜋, 𝑡 > 0, 𝑦(0, 𝑡) =
𝑦(𝜋, 𝑡) = 0, 𝑦(𝑥, 0) = sin 𝑥, and 𝑦𝑡 (𝑥, 0) = sin 𝑥. Hint: Note that sin 𝑥 is the odd periodic
extension of 𝑦(𝑥, 0) and 𝑦𝑡 (𝑥, 0).
Exercise 4.8.3: Using the d’Alembert solution solve 𝑦𝑡𝑡 = 2𝑦 𝑥𝑥 , 0 < 𝑥 < 1, 𝑡 > 0, 𝑦(0, 𝑡) =
𝑦(1, 𝑡) = 0, 𝑦(𝑥, 0) = sin⁵(𝜋𝑥), and 𝑦𝑡 (𝑥, 0) = sin³(𝜋𝑥).
Exercise 4.8.4: Take 𝑦𝑡𝑡 = 4𝑦 𝑥𝑥 , 0 < 𝑥 < 𝜋, 𝑡 > 0, 𝑦(0, 𝑡) = 𝑦(𝜋, 𝑡) = 0, 𝑦(𝑥, 0) = 𝑥(𝜋 − 𝑥), and
𝑦𝑡 (𝑥, 0) = 0.
a) Solve using the d’Alembert formula. Hint: You can use the sine series for 𝑦(𝑥, 0).
b) Find the solution as a function of 𝑥 for a fixed 𝑡 = 0.5, 𝑡 = 1, and 𝑡 = 2. Do not use the sine
series here.
Exercise 4.8.5: Derive the d’Alembert solution for 𝑦𝑡𝑡 = 𝑎²𝑦𝑥𝑥 , 0 < 𝑥 < 𝜋, 𝑡 > 0, 𝑦(0, 𝑡) =
𝑦(𝜋, 𝑡) = 0, 𝑦(𝑥, 0) = 𝑓 (𝑥), and 𝑦𝑡 (𝑥, 0) = 0, using the Fourier series solution of the wave equation,
by applying an appropriate trigonometric identity. Hint: Do it first for a single term of the Fourier
series solution, in particular do it when 𝑦 is sin(𝑛𝑥) cos(𝑛𝑎𝑡).
Exercise 4.8.6: The d’Alembert solution still works if there are no boundary conditions and the
initial condition is defined on the whole real line. Suppose that 𝑦𝑡𝑡 = 𝑦 𝑥𝑥 (for all 𝑥 on the real line
and 𝑡 ≥ 0), 𝑦(𝑥, 0) = 𝑓 (𝑥), and 𝑦𝑡 (𝑥, 0) = 0, where
\[
f(x) =
\begin{cases}
0 & \text{if } x < -1, \\
x + 1 & \text{if } -1 \leq x < 0, \\
-x + 1 & \text{if } 0 \leq x < 1, \\
0 & \text{if } 1 < x.
\end{cases}
\]

Solve using the d’Alembert solution. That is, write down a piecewise definition for the solution.
Then sketch the solution for 𝑡 = 0, 𝑡 = 1/2, 𝑡 = 1, and 𝑡 = 2.
Exercise 4.8.101: Using the d’Alembert solution solve 𝑦𝑡𝑡 = 9𝑦 𝑥𝑥 , 0 < 𝑥 < 1, 𝑡 > 0, 𝑦(0, 𝑡) =
𝑦(1, 𝑡) = 0, 𝑦(𝑥, 0) = sin(2𝜋𝑥), and 𝑦𝑡 (𝑥, 0) = sin(3𝜋𝑥).
Exercise 4.8.102: Take 𝑦𝑡𝑡 = 4𝑦𝑥𝑥 , 0 < 𝑥 < 1, 𝑡 > 0, 𝑦(0, 𝑡) = 𝑦(1, 𝑡) = 0, 𝑦(𝑥, 0) = 𝑥 − 𝑥², and
𝑦𝑡 (𝑥, 0) = 0. Using the d’Alembert solution find the solution at
a) 𝑡 = 0.1, b) 𝑡 = 1/2, c) 𝑡 = 1.
You may have to split your answer up by cases.
Exercise 4.8.103: Take 𝑦𝑡𝑡 = 100𝑦 𝑥𝑥 , 0 < 𝑥 < 4, 𝑡 > 0, 𝑦(0, 𝑡) = 𝑦(4, 𝑡) = 0, 𝑦(𝑥, 0) = 𝐹(𝑥),
and 𝑦𝑡 (𝑥, 0) = 0. Suppose that 𝐹(0) = 0, 𝐹(1) = 2, 𝐹(2) = 3, 𝐹(3) = 1. Using the d’Alembert
solution find
a) 𝑦(1, 1), b) 𝑦(4, 3), c) 𝑦(3, 9).
4.9 Steady state temperature and the Laplacian


Note: 1 lecture, §9.7 in [EP], §10.8 in [BD]
Consider an insulated wire, a plate, or a 3-dimensional object. We apply certain
fixed temperatures on the ends of the wire, the edges of the plate, or on all sides of the
3-dimensional object. We wish to find out what is the steady-state temperature distribution.
That is, we wish to know what will be the temperature after a long enough period of time.
We are really seeking a solution to the heat equation that is not dependent on time. We
first solve the problem in one space variable. We are looking for a function 𝑢 that satisfies

𝑢𝑡 = 𝑘𝑢𝑥𝑥 ,

but such that 𝑢𝑡 = 0 for all 𝑥 and 𝑡. Hence, we are looking for a function of 𝑥 alone that
satisfies 𝑢𝑥𝑥 = 0. It is easy to solve this equation by integration, and we see that 𝑢 = 𝐴𝑥 + 𝐵
for some constants 𝐴 and 𝐵.
Consider an insulated wire where we apply constant temperature 𝑇1 at one end (say
where 𝑥 = 0) and 𝑇2 on the other end (at 𝑥 = 𝐿 where 𝐿 is the length of the wire). Our
steady-state solution is
\[ u(x) = \frac{T_2 - T_1}{L} x + T_1 . \]
It is simply a straight line from one end to the other. This solution agrees with the
common-sense intuition on how heat should be distributed in the wire. So in one dimension, the
steady-state solutions are just straight lines.
Things are more complicated in two or more space dimensions. We restrict ourselves to
two space dimensions for simplicity. The heat equation in two space variables is

𝑢𝑡 = 𝑘(𝑢𝑥𝑥 + 𝑢 𝑦 𝑦 ), (4.19)

or more commonly written as 𝑢𝑡 = 𝑘Δ𝑢 or 𝑢𝑡 = 𝑘∇²𝑢. The Δ and ∇² symbols both mean
∂²/∂𝑥² + ∂²/∂𝑦². We will use Δ here. The reason for using such a notation is that you can define Δ
to be the right thing for any number of space dimensions and then the heat equation is
always 𝑢𝑡 = 𝑘Δ𝑢. The operator Δ is called the Laplacian.
OK, now that we have notation out of the way, let us see what an equation for the
steady-state solution looks like. We are looking for a solution to (4.19) that does not depend
on 𝑡, that is, 𝑢𝑡 = 0. Hence, we are looking for a function 𝑢(𝑥, 𝑦) such that

Δ𝑢 = 𝑢𝑥𝑥 + 𝑢 𝑦 𝑦 = 0.

This equation is called the Laplace equation‗ and is an example of an elliptic equation.
Solutions to the Laplace equation are called harmonic functions and have many nice
properties and applications far beyond the steady-state heat problem.
‗ Named after the French mathematician Pierre-Simon, marquis de Laplace (1749–1827).
Harmonic functions in two variables are no longer just linear (so not just plane graphs).
For example, you can check that the functions 𝑥² − 𝑦² and 𝑥𝑦 are harmonic. However,
note that if 𝑢𝑥𝑥 is positive, so that 𝑢 is concave up in the 𝑥 direction, then 𝑢𝑦𝑦 must be negative
and 𝑢 must be concave down in the 𝑦 direction. A harmonic function can never have any
“hilltop” or “valley” on the graph. This observation is consistent with our intuitive idea of
steady-state heat distribution; the hottest or coldest spot should not be inside.
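
The claim that 𝑥² − 𝑦² and 𝑥𝑦 are harmonic can also be checked numerically. The sketch below is an illustration only; the function name `laplacian` and the sample point are arbitrary choices, and the five-point difference formula is an approximation of 𝑢𝑥𝑥 + 𝑢𝑦𝑦 (it happens to be exact, up to rounding, for quadratic polynomials).

```python
def laplacian(u, x, y, h=1e-3):
    """Five-point central-difference approximation of u_xx + u_yy at (x, y)."""
    return (u(x + h, y) + u(x - h, y) + u(x, y + h) + u(x, y - h)
            - 4 * u(x, y)) / (h * h)

harmonic_1 = lambda x, y: x**2 - y**2    # concave up in x, concave down in y
harmonic_2 = lambda x, y: x * y
not_harmonic = lambda x, y: x**2 + y**2  # has a "valley" at the origin

for name, u in [("x^2 - y^2", harmonic_1), ("xy", harmonic_2),
                ("x^2 + y^2", not_harmonic)]:
    # The first two values should be essentially zero; the third should not.
    print(name, laplacian(u, 0.3, -0.7))
```

The third function is included for contrast: it is concave up in both directions, so its Laplacian is positive and it cannot be a steady-state temperature.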
Commonly the Laplace equation is part of a so-called Dirichlet problem‗ . That is, we
have a region in the 𝑥𝑦-plane and we specify certain values along the boundaries of the
region. We then try to find a solution 𝑢 to the Laplace equation defined on this region such
that 𝑢 agrees with the values we specified on the boundary.
In this section, we consider a rectangular region. For simplicity, we specify boundary
values to be zero at three of the four edges and only specify an arbitrary function at one edge.
As we still have the principle of superposition, we can use this simpler solution to derive
the general solution for arbitrary boundary values by solving four different problems, one for
each edge, and adding those solutions together. This setup is left as an exercise.
We wish to solve the following problem. Let ℎ and 𝑤 be the height and width of our
rectangle, with one corner at the origin and lying in the first quadrant, so that the region is
given by 0 < 𝑥 < 𝑤 and 0 < 𝑦 < ℎ. Consider the problem

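\begin{align*}
\Delta u &= 0, \tag{4.20} \\
u(0, y) &= 0 \quad \text{for } 0 < y < h, \tag{4.21} \\
u(x, h) &= 0 \quad \text{for } 0 < x < w, \tag{4.22} \\
u(w, y) &= 0 \quad \text{for } 0 < y < h, \tag{4.23} \\
u(x, 0) &= f(x) \quad \text{for } 0 < x < w. \tag{4.24}
\end{align*}
[Figure: the rectangle with corners (0, 0), (𝑤, 0), (𝑤, ℎ), and (0, ℎ); Δ𝑢 = 0 inside, 𝑢 = 𝑓 (𝑥) on the bottom edge, and 𝑢 = 0 on the other three edges.]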

The method we apply is separation of variables. Again, we will come up with
enough building-block solutions satisfying all the homogeneous boundary conditions (all
conditions except (4.24)). We notice that superposition still works for the equation and all
the homogeneous conditions. Therefore, we can use the Fourier series for 𝑓 (𝑥) to find a
solution 𝑢 that also solves (4.24).
We try 𝑢(𝑥, 𝑦) = 𝑋(𝑥)𝑌(𝑦). We plug 𝑢 into the equation to get

𝑋 ′′𝑌 + 𝑋𝑌 ′′ = 0.

We put the 𝑋s on one side and the 𝑌s on the other to get

\[ -\frac{X''}{X} = \frac{Y''}{Y} . \]
‗ Named after the German mathematician Johann Peter Gustav Lejeune Dirichlet (1805–1859).
The left-hand side only depends on 𝑥 and the right-hand side only depends on 𝑦. Therefore,
there is some constant 𝜆 such that 𝜆 = −𝑋′′/𝑋 = 𝑌′′/𝑌. And we get two equations

𝑋 ′′ + 𝜆𝑋 = 0,
𝑌 ′′ − 𝜆𝑌 = 0.

Furthermore, the homogeneous boundary conditions imply that 𝑋(0) = 𝑋(𝑤) = 0 and
𝑌(ℎ) = 0. Using the equation for 𝑋, we have already seen that there is a nontrivial solution
if and only if 𝜆 = 𝜆𝑛 = 𝑛²𝜋²/𝑤² and the solution is a multiple of
\[ X_n(x) = \sin\left( \frac{n\pi}{w} x \right) . \]
For these given 𝜆𝑛 , the general solution for 𝑌 (one for each 𝑛) is
\[ Y_n(y) = A_n \cosh\left( \frac{n\pi}{w} y \right) + B_n \sinh\left( \frac{n\pi}{w} y \right) . \tag{4.25} \]
There is only one condition on 𝑌𝑛 and hence we can pick one of 𝐴𝑛 or 𝐵𝑛 to be something
convenient. It will be useful to have 𝑌𝑛 (0) = 1, so let 𝐴𝑛 = 1. Setting 𝑌𝑛 (ℎ) = 0 and solving
for 𝐵𝑛 , we get
\[ B_n = \frac{-\cosh\left( \frac{n\pi h}{w} \right)}{\sinh\left( \frac{n\pi h}{w} \right)} . \]
After we plug the 𝐴𝑛 and 𝐵𝑛 into (4.25) and simplify by using the identity sinh(𝛼 − 𝛽) =
sinh(𝛼) cosh(𝛽) − cosh(𝛼) sinh(𝛽), we find
\[ Y_n(y) = \frac{\sinh\left( \frac{n\pi(h - y)}{w} \right)}{\sinh\left( \frac{n\pi h}{w} \right)} . \]

We define 𝑢𝑛 (𝑥, 𝑦) = 𝑋𝑛 (𝑥)𝑌𝑛 (𝑦). Note that 𝑢𝑛 satisfies (4.20)–(4.23). Observe
\[ u_n(x, 0) = X_n(x) Y_n(0) = \sin\left( \frac{n\pi}{w} x \right) . \]
Suppose
\[ f(x) = \sum_{n=1}^{\infty} b_n \sin\left( \frac{n\pi x}{w} \right) . \]
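Then we get a solution of (4.20)–(4.24) of the following form.
\[
u(x, y) = \sum_{n=1}^{\infty} b_n u_n(x, y)
= \sum_{n=1}^{\infty} b_n \sin\left( \frac{n\pi}{w} x \right)
\left( \frac{\sinh\left( \frac{n\pi(h - y)}{w} \right)}{\sinh\left( \frac{n\pi h}{w} \right)} \right) .
\]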
As 𝑢𝑛 satisfies (4.20)–(4.23) and any linear combination (finite or infinite) of 𝑢𝑛 also satisfies
(4.20)–(4.23), then 𝑢 satisfies (4.20)–(4.23). We plug in 𝑦 = 0 to see 𝑢 satisfies (4.24) as well.
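
The series solution above can be evaluated numerically by truncating the sum. The following Python sketch is an illustration under stated assumptions: the function names are hypothetical, the coefficients 𝑏𝑛 are computed from the standard sine-series formula 𝑏𝑛 = (2/𝑤) ∫₀ʷ 𝑓 (𝑥) sin(𝑛𝜋𝑥/𝑤) 𝑑𝑥 with a midpoint rule, and the ratio of hyperbolic sines is rewritten with exponentials only to avoid overflow for large 𝑛.

```python
import math

def sinh_ratio(a, b):
    """sinh(a)/sinh(b) for 0 <= a <= b and b > 0, written to avoid overflow."""
    return math.exp(a - b) * (1 - math.exp(-2 * a)) / (1 - math.exp(-2 * b))

def sine_coefficient(f, n, w, steps=2000):
    """Approximate b_n = (2/w) * integral_0^w f(x) sin(n pi x / w) dx (midpoint rule)."""
    dx = w / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * dx
        total += f(x) * math.sin(n * math.pi * x / w)
    return 2 / w * total * dx

def rectangle_solution(f, w, h, x, y, terms=50):
    """Partial sum of the series solving (4.20)-(4.24) at the point (x, y)."""
    total = 0.0
    for n in range(1, terms + 1):
        b_n = sine_coefficient(f, n, w)
        y_n = sinh_ratio(n * math.pi * (h - y) / w, n * math.pi * h / w)
        total += b_n * math.sin(n * math.pi * x / w) * y_n
    return total
```

Away from the edge 𝑦 = 0 the factor 𝑌𝑛 (𝑦) decays exponentially in 𝑛, so a modest number of terms is usually enough there; near 𝑦 = 0 the convergence is only as good as that of the sine series of 𝑓.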
Example 4.9.1: Take 𝑤 = ℎ = 𝜋 and let 𝑓 (𝑥) = 𝜋. Let us compute the sine series for the
function 𝜋 (same as the series for the square wave). For 0 < 𝑥 < 𝜋, we have

\[ f(x) = \sum_{\substack{n=1 \\ n \text{ odd}}}^{\infty} \frac{4}{n} \sin(nx) . \]

The solution 𝑢(𝑥, 𝑦), see Figure 4.25, to the corresponding Dirichlet problem is given as

\[
u(x, y) = \sum_{\substack{n=1 \\ n \text{ odd}}}^{\infty} \frac{4}{n} \sin(nx)
\left( \frac{\sinh\bigl( n(\pi - y) \bigr)}{\sinh(n\pi)} \right) .
\]

Figure 4.25: Steady state temperature of a square plate, three sides held at zero and one side held at 𝜋.

This scenario corresponds to the steady-state temperature on a square plate of width 𝜋
with three sides held at 0 degrees and one side held at 𝜋 degrees. If we have arbitrary boundary
data on all sides, then we solve four problems, each using one piece of nonhomogeneous
data. Then we use the principle of superposition to add up all four solutions to have a
solution to the original problem.
A different way to visualize solutions of the Laplace equation is to take a wire and bend
it so that it corresponds to the graph of the temperature above the boundary of your region.
Cut a rubber sheet in the shape of your region—a square in our case—and stretch it fixing
the edges of the sheet to the wire. The rubber sheet is a good approximation of the graph
of the solution to the Laplace equation with the given boundary data.

4.9.1 Exercises
Exercise 4.9.1: In the region described by 0 < 𝑥 < 𝜋 and 0 < 𝑦 < 𝜋, solve the problem

Δ𝑢 = 0, 𝑢(𝑥, 0) = sin 𝑥, 𝑢(𝑥, 𝜋) = 0, 𝑢(0, 𝑦) = 0, 𝑢(𝜋, 𝑦) = 0.

Exercise 4.9.2: In the region described by 0 < 𝑥 < 1 and 0 < 𝑦 < 1, solve the problem

𝑢𝑥𝑥 + 𝑢 𝑦 𝑦 = 0,
𝑢(𝑥, 0) = sin(𝜋𝑥) − sin(2𝜋𝑥), 𝑢(𝑥, 1) = 0,
𝑢(0, 𝑦) = 0, 𝑢(1, 𝑦) = 0.

Exercise 4.9.3: In the region described by 0 < 𝑥 < 1 and 0 < 𝑦 < 1, solve the problem

𝑢𝑥𝑥 + 𝑢 𝑦 𝑦 = 0,
𝑢(𝑥, 0) = 𝑢(𝑥, 1) = 𝑢(0, 𝑦) = 𝑢(1, 𝑦) = 𝐶.

for some constant 𝐶. Hint: Guess, then check your intuition.

Exercise 4.9.4: In the region described by 0 < 𝑥 < 𝜋 and 0 < 𝑦 < 𝜋, solve

Δ𝑢 = 0, 𝑢(𝑥, 0) = 0, 𝑢(𝑥, 𝜋) = 𝜋, 𝑢(0, 𝑦) = 𝑦, 𝑢(𝜋, 𝑦) = 𝑦.

Hint: Try a solution of the form 𝑢(𝑥, 𝑦) = 𝑋(𝑥) + 𝑌(𝑦) (different separation of variables).

Exercise 4.9.5: Use the solution of Exercise 4.9.4 to solve

Δ𝑢 = 0, 𝑢(𝑥, 0) = sin 𝑥, 𝑢(𝑥, 𝜋) = 𝜋, 𝑢(0, 𝑦) = 𝑦, 𝑢(𝜋, 𝑦) = 𝑦.

Hint: Use superposition.

Exercise 4.9.6: In the region described by 0 < 𝑥 < 𝑤 and 0 < 𝑦 < ℎ, solve the problem

𝑢𝑥𝑥 + 𝑢 𝑦 𝑦 = 0,
𝑢(𝑥, 0) = 0, 𝑢(𝑥, ℎ) = 𝑓 (𝑥),
𝑢(0, 𝑦) = 0, 𝑢(𝑤, 𝑦) = 0.

The solution should be in series form using the Fourier-series coefficients of 𝑓 (𝑥).
Exercise 4.9.7: In the region described by 0 < 𝑥 < 𝑤 and 0 < 𝑦 < ℎ, solve the problem

𝑢𝑥𝑥 + 𝑢 𝑦 𝑦 = 0,
𝑢(𝑥, 0) = 0, 𝑢(𝑥, ℎ) = 0,
𝑢(0, 𝑦) = 𝑓 (𝑦), 𝑢(𝑤, 𝑦) = 0.

The solution should be in series form using the Fourier-series coefficients of 𝑓 (𝑦).

Exercise 4.9.8: In the region described by 0 < 𝑥 < 𝑤 and 0 < 𝑦 < ℎ, solve the problem

𝑢𝑥𝑥 + 𝑢 𝑦 𝑦 = 0,
𝑢(𝑥, 0) = 0, 𝑢(𝑥, ℎ) = 0,
𝑢(0, 𝑦) = 0, 𝑢(𝑤, 𝑦) = 𝑓 (𝑦).

The solution should be in series form using the Fourier-series coefficients of 𝑓 (𝑦).

Exercise 4.9.9: In the region described by 0 < 𝑥 < 1 and 0 < 𝑦 < 1, solve the problem

𝑢𝑥𝑥 + 𝑢 𝑦 𝑦 = 0,
𝑢(𝑥, 0) = sin(9𝜋𝑥), 𝑢(𝑥, 1) = sin(2𝜋𝑥),
𝑢(0, 𝑦) = 0, 𝑢(1, 𝑦) = 0.

Hint: Use superposition.

Exercise 4.9.10: In the region described by 0 < 𝑥 < 1 and 0 < 𝑦 < 1, solve the problem

𝑢𝑥𝑥 + 𝑢 𝑦 𝑦 = 0,
𝑢(𝑥, 0) = sin(𝜋𝑥), 𝑢(𝑥, 1) = sin(𝜋𝑥),
𝑢(0, 𝑦) = sin(𝜋𝑦), 𝑢(1, 𝑦) = sin(𝜋𝑦).

Hint: Use superposition.

Exercise 4.9.11 (challenging): Using only your intuition find 𝑢(1/2, 1/2), for the problem Δ𝑢 = 0,
where 𝑢(0, 𝑦) = 𝑢(1, 𝑦) = 100 for 0 < 𝑦 < 1, and 𝑢(𝑥, 0) = 𝑢(𝑥, 1) = 0 for 0 < 𝑥 < 1. Explain.

Exercise 4.9.101: In the region described by 0 < 𝑥 < 1 and 0 < 𝑦 < 1, solve the problem

\[
\Delta u = 0, \quad u(x, 0) = \sum_{n=1}^{\infty} \frac{1}{n^2} \sin(n\pi x), \quad
u(x, 1) = 0, \quad u(0, y) = 0, \quad u(1, y) = 0.
\]

Exercise 4.9.102: In the region described by 0 < 𝑥 < 1 and 0 < 𝑦 < 2, solve the problem

Δ𝑢 = 0, 𝑢(𝑥, 0) = 0.1 sin(𝜋𝑥), 𝑢(𝑥, 2) = 0, 𝑢(0, 𝑦) = 0, 𝑢(1, 𝑦) = 0.


4.10 Dirichlet problem in the circle and the Poisson kernel


Note: 2 lectures, §9.7 in [EP], §10.8 in [BD]

4.10.1 Laplace in polar coordinates


A more natural setting for the Laplace equation Δ𝑢 = 0 is a circle rather than a rectangle.
On the other hand, what makes the problem somewhat more difficult is that we need polar
coordinates.
Recall that the polar coordinates for the (𝑥, 𝑦)-plane are (𝑟, 𝜃):
\[ x = r \cos\theta, \qquad y = r \sin\theta, \]
where 𝑟 ≥ 0 and −𝜋 < 𝜃 ≤ 𝜋. So the point (𝑥, 𝑦) is distance 𝑟 from the
origin at an angle 𝜃 from the positive 𝑥-axis.
Now that we know our coordinates, let us give the problem we wish to solve. We have
a circular region of radius 1, and we are interested in the Dirichlet problem for the Laplace
equation for this region. Let 𝑢(𝑟, 𝜃) denote the temperature at the point (𝑟, 𝜃) in polar
coordinates.
We have the problem:
\begin{equation}
\begin{aligned}
\Delta u &= 0 && \text{for } r < 1, \\
u(1, \theta) &= g(\theta) && \text{for } -\pi < \theta \leq \pi.
\end{aligned}
\tag{4.26}
\end{equation}
[Figure: a disc of radius 1 with Δ𝑢 = 0 inside and 𝑢(1, 𝜃) = 𝑔(𝜃) on the boundary circle.]
The first issue we face is that we do not know the
Laplacian in polar coordinates. Normally, we would
find 𝑢𝑥𝑥 and 𝑢𝑦𝑦 in terms of the derivatives in 𝑟 and 𝜃.
We would need to solve for 𝑟 and 𝜃 in terms of 𝑥 and 𝑦.
In this case, it is more convenient to work in reverse. We
compute derivatives in 𝑟 and 𝜃 in terms of derivatives
in 𝑥 and 𝑦 and then we solve. The computations are
easier this way. First,
\begin{align*}
x_r &= \cos\theta, & x_\theta &= -r \sin\theta, \\
y_r &= \sin\theta, & y_\theta &= r \cos\theta.
\end{align*}

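By the chain rule, we obtain
\begin{align*}
u_r &= u_x x_r + u_y y_r = \cos(\theta) u_x + \sin(\theta) u_y , \\
u_{rr} &= \cos(\theta) (u_{xx} x_r + u_{xy} y_r) + \sin(\theta) (u_{yx} x_r + u_{yy} y_r) \\
&= \cos^2(\theta) u_{xx} + 2 \cos(\theta) \sin(\theta) u_{xy} + \sin^2(\theta) u_{yy} .
\end{align*}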
Similarly for the 𝜃 derivative. Note that we have to use the product rule for the second
derivative.

\begin{align*}
u_\theta &= u_x x_\theta + u_y y_\theta = -r \sin(\theta) u_x + r \cos(\theta) u_y , \\
u_{\theta\theta} &= -r \cos(\theta) u_x - r \sin(\theta) (u_{xx} x_\theta + u_{xy} y_\theta)
 - r \sin(\theta) u_y + r \cos(\theta) (u_{yx} x_\theta + u_{yy} y_\theta) \\
&= -r \cos(\theta) u_x - r \sin(\theta) u_y + r^2 \sin^2(\theta) u_{xx}
 - 2 r^2 \sin(\theta) \cos(\theta) u_{xy} + r^2 \cos^2(\theta) u_{yy} .
\end{align*}

Let us now try to find 𝑢𝑥𝑥 + 𝑢𝑦𝑦 . We start with (1/𝑟²) 𝑢𝜃𝜃 to get rid of those pesky 𝑟². If we add
𝑢𝑟𝑟 and use the fact that cos²(𝜃) + sin²(𝜃) = 1, we get
\[ \frac{1}{r^2} u_{\theta\theta} + u_{rr} = u_{xx} + u_{yy} - \frac{1}{r} \cos(\theta) u_x - \frac{1}{r} \sin(\theta) u_y . \]
We are not quite there yet, but all we are lacking is (1/𝑟) 𝑢𝑟 . We add it to obtain the Laplacian in
polar coordinates:
\[ \Delta u = u_{xx} + u_{yy} = \frac{1}{r^2} u_{\theta\theta} + \frac{1}{r} u_r + u_{rr} . \]
Notice that the Laplacian in polar coordinates no longer has constant coefficients.
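
As a sanity check on the polar form of the Laplacian, one can apply it numerically to functions whose Cartesian Laplacian is known. The sketch below is an illustration only: the name `polar_laplacian`, the step size, and the test functions are choices made here, and central differences only approximate the derivatives. It uses 𝑟² cos(2𝜃), which is 𝑥² − 𝑦² and hence harmonic, and 𝑟², which is 𝑥² + 𝑦² with Laplacian 4.

```python
import math

def polar_laplacian(u, r, theta, h=1e-3):
    """Approximate (1/r^2) u_thetatheta + (1/r) u_r + u_rr by central differences."""
    u_r = (u(r + h, theta) - u(r - h, theta)) / (2 * h)
    u_rr = (u(r + h, theta) - 2 * u(r, theta) + u(r - h, theta)) / h**2
    u_tt = (u(r, theta + h) - 2 * u(r, theta) + u(r, theta - h)) / h**2
    return u_tt / r**2 + u_r / r + u_rr

harmonic = lambda r, theta: r**2 * math.cos(2 * theta)  # x^2 - y^2 in polar form
paraboloid = lambda r, theta: r**2                      # x^2 + y^2 in polar form

print(polar_laplacian(harmonic, 0.8, 0.5))    # expected to be close to 0
print(polar_laplacian(paraboloid, 0.8, 0.5))  # expected to be close to 4
```

Dropping the (1/𝑟) 𝑢𝑟 term from `polar_laplacian` makes both checks fail, which is a quick way to see that the term is really needed.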

4.10.2 Series solution


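Let us separate variables as usual. That is, we try 𝑢(𝑟, 𝜃) = 𝑅(𝑟)Θ(𝜃). Then
\[ 0 = \Delta u = \frac{1}{r^2} R \Theta'' + \frac{1}{r} R' \Theta + R'' \Theta . \]
We put 𝑅 on one side and Θ on the other and conclude that both sides must be constant.
\begin{align*}
\frac{1}{r^2} R \Theta'' &= - \left( \frac{1}{r} R' + R'' \right) \Theta , \\
\frac{\Theta''}{\Theta} &= - \frac{r R' + r^2 R''}{R} = -\lambda .
\end{align*}
We get two equations:
\begin{align*}
\Theta'' + \lambda \Theta &= 0, \\
r^2 R'' + r R' - \lambda R &= 0.
\end{align*}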

We first focus on Θ. We know that 𝑢(𝑟, 𝜃) ought to be 2𝜋-periodic in 𝜃, that is, 𝑢(𝑟, 𝜃) =
𝑢(𝑟, 𝜃 + 2𝜋). Therefore, the solution to Θ′′ + 𝜆Θ = 0 must be 2𝜋-periodic. We have seen
such a problem in Example 4.1.5. We conclude that 𝜆 = 𝑛² for a nonnegative integer
𝑛 = 0, 1, 2, 3, . . .. The equation becomes Θ′′ + 𝑛²Θ = 0. When 𝑛 = 0 the equation is just
Θ′′ = 0, so we have the general solution 𝐴𝜃 + 𝐵. As Θ is periodic, 𝐴 = 0. For convenience,
we write this solution as
\[ \Theta_0 = \frac{a_0}{2} \]
for some constant 𝑎0 . For positive 𝑛, the solution to Θ′′ + 𝑛²Θ = 0 is

Θ𝑛 = 𝑎 𝑛 cos(𝑛𝜃) + 𝑏 𝑛 sin(𝑛𝜃),

for some constants 𝑎 𝑛 and 𝑏 𝑛 .


Next, we consider the equation for 𝑅,

\[ r^2 R'' + r R' - n^2 R = 0 . \]

This equation appeared in exercises before: we solved it in Exercise 2.1.6 and Exercise 2.1.7
on page 83. The idea is to try a solution 𝑟ˢ and if that does not give us two solutions, also
try a solution of the form 𝑟ˢ ln 𝑟. We name the solution 𝑅𝑛 as usual. When 𝑛 = 0 we obtain
\[ R_0 = A r^0 + B r^0 \ln r = A + B \ln r , \]
and if 𝑛 > 0, we get
\[ R_n = A r^n + B r^{-n} . \]
The function 𝑢(𝑟, 𝜃) must be finite (it cannot blow up) at the origin, that is, when 𝑟 = 0. So
𝐵 = 0 in both cases as otherwise 𝑟⁻ⁿ or ln 𝑟 does blow up as 𝑟 → 0. Set 𝐴 = 1 in both cases;
the constants in Θ𝑛 will pick up the slack so nothing is lost. That is,
\[ R_0 = 1 \qquad \text{and} \qquad R_n = r^n . \]

Our building block solutions are


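\[ u_0(r, \theta) = \frac{a_0}{2} \qquad \text{and} \qquad u_n(r, \theta) = a_n r^n \cos(n\theta) + b_n r^n \sin(n\theta) . \]
Putting everything together our solution is
\[ u(r, \theta) = \frac{a_0}{2} + \sum_{n=1}^{\infty} a_n r^n \cos(n\theta) + b_n r^n \sin(n\theta) . \]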

We look at the boundary condition in (4.26),



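\[ g(\theta) = u(1, \theta) = \frac{a_0}{2} + \sum_{n=1}^{\infty} a_n \cos(n\theta) + b_n \sin(n\theta) . \]

Therefore, to solve (4.26) we expand 𝑔(𝜃), which is a 2𝜋-periodic function, as a Fourier
series, and then multiply the 𝑛th term by 𝑟ⁿ. To find the 𝑎𝑛 and the 𝑏𝑛 , we compute
\[
a_n = \frac{1}{\pi} \int_{-\pi}^{\pi} g(\theta) \cos(n\theta) \, d\theta
\qquad \text{and} \qquad
b_n = \frac{1}{\pi} \int_{-\pi}^{\pi} g(\theta) \sin(n\theta) \, d\theta .
\]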
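
The recipe above (expand 𝑔 in a Fourier series, then multiply the 𝑛th term by 𝑟ⁿ) can be sketched in a few lines of Python. This is an illustration under assumptions: the function names are hypothetical, the integrals for 𝑎𝑛 and 𝑏𝑛 are replaced by an equally spaced Riemann sum, and the series is truncated at `n_max` terms.

```python
import math

def fourier_coefficients(g, n_max, samples=2048):
    """Approximate a_n and b_n for n = 0..n_max by a Riemann sum over (-pi, pi]."""
    d = 2 * math.pi / samples
    thetas = [-math.pi + (k + 1) * d for k in range(samples)]
    a = [sum(g(t) * math.cos(n * t) for t in thetas) * d / math.pi
         for n in range(n_max + 1)]
    b = [sum(g(t) * math.sin(n * t) for t in thetas) * d / math.pi
         for n in range(n_max + 1)]
    return a, b

def disc_solution(g, r, theta, n_max=40):
    """Truncated series solution of the Dirichlet problem (4.26) on the unit disc."""
    a, b = fourier_coefficients(g, n_max)
    u = a[0] / 2
    for n in range(1, n_max + 1):
        # Multiply the n-th Fourier term of g by r^n.
        u += r**n * (a[n] * math.cos(n * theta) + b[n] * math.sin(n * theta))
    return u
```

Because of the factor 𝑟ⁿ, the truncated series converges quickly for 𝑟 well inside the disc and more slowly as 𝑟 approaches 1. Recomputing the coefficients on every call is wasteful; for many evaluation points one would compute them once and reuse them.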

Example 4.10.1: Suppose we wish to solve

Δ𝑢 = 0, 0 ≤ 𝑟 < 1, −𝜋 < 𝜃 ≤ 𝜋,
𝑢(1, 𝜃) = cos(10 𝜃), −𝜋 < 𝜃 ≤ 𝜋.
The 𝑔(𝜃) is already expanded, so the solution is
\[ u(r, \theta) = r^{10} \cos(10\theta) . \]
See the plot in Figure 4.26. The thing to notice in this example is that the effect of a high
frequency is mostly felt at the boundary. In the middle of the disc, the solution is very
close to zero. That is because 𝑟¹⁰ is rather small when 𝑟 is close to 0.

Figure 4.26: The solution of the Dirichlet problem in the disc with cos(10 𝜃) as boundary data.

Example 4.10.2: Let us solve a more difficult problem. Consider a long rod with circular
cross section of radius 1. Suppose we wish to solve the steady-state heat problem in
the rod. If the rod is long enough, we simply need to solve the Laplace equation in two
dimensions. We put the center of the rod at the origin, and we have exactly the region
we are currently studying—a circle of radius 1. For the boundary conditions, suppose in
Cartesian coordinates 𝑥 and 𝑦, the temperature on the boundary is 0 when 𝑦 < 0, and it is
2𝑦 when 𝑦 > 0.
Let us set the problem up. As 𝑦 = 𝑟 sin(𝜃), then on the circle of radius 1, that is, where
𝑟 = 1, we have 2𝑦 = 2 sin(𝜃). So
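\begin{align*}
\Delta u &= 0, \qquad 0 \leq r < 1, \quad -\pi < \theta \leq \pi, \\
u(1, \theta) &=
\begin{cases}
2 \sin(\theta) & \text{if } 0 \leq \theta \leq \pi, \\
0 & \text{if } -\pi < \theta < 0.
\end{cases}
\end{align*}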
We must now compute the Fourier series for the boundary condition. By now the
reader has plentiful experience in computing Fourier series, and so we simply state that


\[ u(1, \theta) = \frac{2}{\pi} + \sin(\theta) + \sum_{n=1}^{\infty} \frac{-4}{\pi (4n^2 - 1)} \cos(2n\theta) . \]

Exercise 4.10.1: Compute the series for 𝑢(1, 𝜃) and verify that it really is what we have just claimed.
Hint: Be careful, make sure not to divide by zero.

To obtain the solution (see Figure 4.27), we write its series by multiplying terms in the
series for 𝑔(𝜃) by 𝑟ⁿ in the right places:
\[ u(r, \theta) = \frac{2}{\pi} + r \sin(\theta) + \sum_{n=1}^{\infty} \frac{-4 r^{2n}}{\pi (4n^2 - 1)} \cos(2n\theta) . \]

Figure 4.27: The solution of the Dirichlet problem with boundary data 0 for 𝑦 < 0 and 2𝑦 for 𝑦 > 0.
4.10.3 Poisson kernel


There is another way to solve the Dirichlet problem—with the help of an integral kernel.
That is, we will find a function 𝑃(𝑟, 𝜃, 𝛼) called the Poisson kernel‗ such that
\[ u(r, \theta) = \frac{1}{2\pi} \int_{-\pi}^{\pi} P(r, \theta, \alpha) \, g(\alpha) \, d\alpha . \]
While the integral will generally not be solvable analytically, it can be evaluated numerically.
In fact, unless the boundary data is given as a Fourier series already, it may be much easier
to numerically evaluate this formula as there is only one integral to evaluate.
The formula also has theoretical applications. For instance, as 𝑃(𝑟, 𝜃, 𝛼) will have
infinitely many derivatives, then via differentiating under the integral, we find that the
solution 𝑢(𝑟, 𝜃) has infinitely many derivatives, at least when inside the circle, 𝑟 < 1. By
“having infinitely many derivatives,” what you should think of is that 𝑢(𝑟, 𝜃) has “no
corners” and all of its partial derivatives of all orders exist and also have “no corners.”
We will compute the formula for 𝑃(𝑟, 𝜃, 𝛼) from the series solution, and this idea can be
applied anytime you have a convenient series solution where the coefficients are obtained
via integration. Hence you can apply this reasoning to obtain such integral kernels for other
equations, such as the heat equation. The computation is long and tedious, but not overly
difficult. Since the ideas are often applied in similar contexts, it is good to understand how
this computation works.
What we do is start with the series solution and replace the coefficients with the integrals
that compute them. Then we try to write everything as a single integral. We must use a
different dummy variable for the integration and hence we use 𝛼 instead of 𝜃.

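\begin{align*}
u(r, \theta) &= \frac{a_0}{2} + \sum_{n=1}^{\infty} a_n r^n \cos(n\theta) + b_n r^n \sin(n\theta) \\
&= \underbrace{\left( \frac{1}{2\pi} \int_{-\pi}^{\pi} g(\alpha) \, d\alpha \right)}_{\frac{a_0}{2}}
 + \sum_{n=1}^{\infty} \underbrace{\left( \frac{1}{\pi} \int_{-\pi}^{\pi} g(\alpha) \cos(n\alpha) \, d\alpha \right)}_{a_n} r^n \cos(n\theta) \\
&\qquad + \underbrace{\left( \frac{1}{\pi} \int_{-\pi}^{\pi} g(\alpha) \sin(n\alpha) \, d\alpha \right)}_{b_n} r^n \sin(n\theta) \\
&= \frac{1}{2\pi} \int_{-\pi}^{\pi} \left( g(\alpha) + 2 \sum_{n=1}^{\infty} g(\alpha) \cos(n\alpha) \, r^n \cos(n\theta) + g(\alpha) \sin(n\alpha) \, r^n \sin(n\theta) \right) d\alpha \\
&= \frac{1}{2\pi} \int_{-\pi}^{\pi} \underbrace{\left( 1 + 2 \sum_{n=1}^{\infty} r^n \bigl( \cos(n\alpha) \cos(n\theta) + \sin(n\alpha) \sin(n\theta) \bigr) \right)}_{P(r, \theta, \alpha)} g(\alpha) \, d\alpha .
\end{align*}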
‗ Named for the French mathematician Siméon Denis Poisson (1781–1840).
OK, so we have what we wanted, the expression in the parentheses is the Poisson kernel,
𝑃(𝑟, 𝜃, 𝛼). However, we can do a lot better. It is still given as a series, and we would really
like to have a nice simple expression for it. We must work a little harder. The trick is to
rewrite everything in terms of complex exponentials. Let us work just on the kernel.

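\begin{align*}
P(r, \theta, \alpha) &= 1 + 2 \sum_{n=1}^{\infty} r^n \bigl( \cos(n\alpha) \cos(n\theta) + \sin(n\alpha) \sin(n\theta) \bigr) \\
&= 1 + 2 \sum_{n=1}^{\infty} r^n \cos\bigl( n(\theta - \alpha) \bigr) \\
&= 1 + \sum_{n=1}^{\infty} r^n \left( e^{in(\theta - \alpha)} + e^{-in(\theta - \alpha)} \right) \\
&= 1 + \sum_{n=1}^{\infty} \left( r e^{i(\theta - \alpha)} \right)^n + \sum_{n=1}^{\infty} \left( r e^{-i(\theta - \alpha)} \right)^n .
\end{align*}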

In the expression above, we recognize the geometric series. Recall from calculus that if 𝑧 is a
complex number where |𝑧| < 1, then
\[ \sum_{n=1}^{\infty} z^n = \frac{z}{1 - z} . \]
Note that 𝑛 starts at 1, and that is why we have the 𝑧 in the numerator. It is the standard
geometric series multiplied by 𝑧. We can use 𝑧 = 𝑟𝑒^{𝑖(𝜃−𝛼)}, as lo and behold |𝑟𝑒^{𝑖(𝜃−𝛼)}| = 𝑟 < 1.
We continue with the computation.
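\begin{align*}
P(r, \theta, \alpha) &= 1 + \sum_{n=1}^{\infty} \left( r e^{i(\theta - \alpha)} \right)^n + \sum_{n=1}^{\infty} \left( r e^{-i(\theta - \alpha)} \right)^n \\
&= 1 + \frac{r e^{i(\theta - \alpha)}}{1 - r e^{i(\theta - \alpha)}} + \frac{r e^{-i(\theta - \alpha)}}{1 - r e^{-i(\theta - \alpha)}} \\
&= \frac{\left( 1 - r e^{i(\theta - \alpha)} \right) \left( 1 - r e^{-i(\theta - \alpha)} \right)
 + \left( 1 - r e^{-i(\theta - \alpha)} \right) r e^{i(\theta - \alpha)}
 + \left( 1 - r e^{i(\theta - \alpha)} \right) r e^{-i(\theta - \alpha)}}
 {\left( 1 - r e^{i(\theta - \alpha)} \right) \left( 1 - r e^{-i(\theta - \alpha)} \right)} \\
&= \frac{1 - r^2}{1 - r e^{i(\theta - \alpha)} - r e^{-i(\theta - \alpha)} + r^2} \\
&= \frac{1 - r^2}{1 - 2r \cos(\theta - \alpha) + r^2} .
\end{align*}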

That is a formula we can live with. The solution to the Dirichlet problem using the Poisson
kernel is
\[ u(r, \theta) = \frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{1 - r^2}{1 - 2r \cos(\theta - \alpha) + r^2} \, g(\alpha) \, d\alpha . \]
Sometimes the formula for the Poisson kernel is given together with the constant 1/(2𝜋), in
which case we should, of course, not leave it in front of the integral. Sometimes the limits
of the integral are given as 0 to 2𝜋; everything inside is 2𝜋-periodic in 𝛼, so this does not
change the integral.
Let us not leave the Poisson kernel without explaining
its geometric meaning. Let 𝑠 be the distance from (𝑟, 𝜃) to
(1, 𝛼). This distance 𝑠 in polar coordinates is given precisely
by the square root of 1 − 2𝑟 cos(𝜃 − 𝛼) + 𝑟². That is, the
Poisson kernel is really the formula
\[ \frac{1 - r^2}{s^2} . \]
[Figure: the unit disc with the interior point (𝑟, 𝜃), the boundary point (1, 𝛼), and the segment of length 𝑠 joining them.]
One final note we make about the formula is that it is
really a weighted average of the boundary values. First, we
look at what happens at the origin, that is, when 𝑟 = 0:
\begin{align*}
u(0, 0) &= \frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{1 - 0^2}{1 - 2(0) \cos(0 - \alpha) + 0^2} \, g(\alpha) \, d\alpha \\
&= \frac{1}{2\pi} \int_{-\pi}^{\pi} g(\alpha) \, d\alpha .
\end{align*}

So 𝑢(0, 0) is precisely the average value of 𝑔(𝜃) and therefore the average value of 𝑢 on the
boundary. This is a general feature of harmonic functions: the value at some point 𝑝 is
equal to the average of the values on a circle centered at 𝑝.
What the formula says at other points inside the circle is that the value of the solution is
a weighted average of the boundary data 𝑔(𝜃). The kernel is bigger when (1, 𝛼) is closer to
(𝑟, 𝜃). Therefore, when computing 𝑢(𝑟, 𝜃), we give more weight to the values 𝑔(𝛼) when
(1, 𝛼) is closer to (𝑟, 𝜃) and less weight to the values 𝑔(𝛼) when (1, 𝛼) is far from (𝑟, 𝜃).
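
The Poisson kernel formula lends itself to direct numerical evaluation, since only one integral is needed. The Python sketch below is an illustration with hypothetical function names; it replaces the integral by an equally spaced Riemann sum over (−𝜋, 𝜋], which is an approximation that behaves well for 𝑟 < 1 but degrades as 𝑟 approaches 1, where the kernel becomes sharply peaked.

```python
import math

def poisson_kernel(r, theta, alpha):
    """P(r, theta, alpha) = (1 - r^2) / (1 - 2 r cos(theta - alpha) + r^2)."""
    return (1 - r**2) / (1 - 2 * r * math.cos(theta - alpha) + r**2)

def poisson_solution(g, r, theta, samples=4096):
    """Approximate (1/(2 pi)) * integral of P(r, theta, alpha) g(alpha) d alpha, for r < 1."""
    d = 2 * math.pi / samples
    total = 0.0
    for k in range(samples):
        alpha = -math.pi + (k + 1) * d
        total += poisson_kernel(r, theta, alpha) * g(alpha)
    return total * d / (2 * math.pi)
```

Two quick consistency checks follow from the text: with 𝑔 identically 1 the result should be 1 (the weights average to one), and with 𝑔(𝛼) = cos(10𝛼) the result should match 𝑟¹⁰ cos(10𝜃) from Example 4.10.1.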

4.10.4 Exercises
Exercise 4.10.2: Using series solve Δ𝑢 = 0, 𝑢(1, 𝜃) = |𝜃|, for −𝜋 < 𝜃 ≤ 𝜋.

Exercise 4.10.3: Using series solve Δ𝑢 = 0, 𝑢(1, 𝜃) = 𝑔(𝜃) for the following data. Hint: trig
identities.

a) 𝑔(𝜃) = 1/2 + 3 sin(𝜃) + cos(3𝜃) b) 𝑔(𝜃) = 3 cos(3𝜃) + 3 sin(3𝜃) + sin(9𝜃)


c) 𝑔(𝜃) = 2 cos(𝜃 + 1) d) 𝑔(𝜃) = sin²(𝜃)

Exercise 4.10.4: Using the Poisson kernel, give the solution to Δ𝑢 = 0, where 𝑢(1, 𝜃) is zero for 𝜃
outside the interval [−𝜋/4, 𝜋/4] and 𝑢(1, 𝜃) is 1 for 𝜃 on the interval [−𝜋/4, 𝜋/4].
Exercise 4.10.5:

a) Draw a graph for the Poisson kernel as a function of 𝛼 when 𝑟 = 1/2 and 𝜃 = 0.
b) Describe what happens to the graph when you make 𝑟 bigger (as it approaches 1).
c) Knowing that the solution 𝑢(𝑟, 𝜃) is the weighted average of 𝑔(𝜃) with Poisson kernel as the
weight, explain what your answer to part b) means.

Exercise 4.10.6: Let 𝑔(𝜃) be the function 𝑥𝑦 = cos 𝜃 sin 𝜃 on the boundary. Use the series solution
to find a solution to the Dirichlet problem Δ𝑢 = 0, 𝑢(1, 𝜃) = 𝑔(𝜃). Now convert the solution to
Cartesian coordinates 𝑥 and 𝑦. Is this solution surprising? Hint: use your trig identities.

Exercise 4.10.7: Carry out the computation we needed in the separation of variables and solve
𝑟²𝑅′′ + 𝑟𝑅′ − 𝑛²𝑅 = 0, for 𝑛 = 0, 1, 2, 3, . . ..

Exercise 4.10.8 (challenging): Derive the series solution to the Dirichlet problem if the region is a
circle of radius 𝜌 rather than 1. That is, solve Δ𝑢 = 0, 𝑢(𝜌, 𝜃) = 𝑔(𝜃).

Exercise 4.10.9 (challenging):

a) Find the solution for Δ𝑢 = 0, 𝑢(1, 𝜃) = 𝑥²𝑦³ + 5𝑥². Write the answer in Cartesian
coordinates.
b) Now solve Δ𝑢 = 0, 𝑢(1, 𝜃) = 𝑥^𝑘 𝑦^ℓ. Write the solution in Cartesian coordinates.
c) Suppose you have a polynomial
\[ P(x, y) = \sum_{j=0}^{m} \sum_{k=0}^{n} c_{j,k} \, x^j y^k . \]
Solve Δ𝑢 = 0, 𝑢(1, 𝜃) = 𝑃(𝑥, 𝑦) (that is, write down the formula for the answer). Write the answer in Cartesian
coordinates.

Notice the answer is again a polynomial in 𝑥 and 𝑦. See also Exercise 4.10.6.

Exercise 4.10.101: Using series solve
\[ \Delta u = 0, \qquad u(1, \theta) = 1 + \sum_{n=1}^{\infty} \frac{1}{n^2} \sin(n\theta) . \]

Exercise 4.10.102: Using the series solution find the solution to Δ𝑢 = 0, 𝑢(1, 𝜃) = 1 − cos(𝜃).
Express the solution in Cartesian coordinates (that is, using 𝑥 and 𝑦).

Exercise 4.10.103:

a) Try and guess a solution to Δ𝑢 = −1, 𝑢(1, 𝜃) = 0. Hint: try a solution that only depends
on 𝑟. Also first, don’t worry about the boundary condition.
b) Now solve Δ𝑢 = −1, 𝑢(1, 𝜃) = sin(2𝜃) using superposition.

Exercise 4.10.104 (challenging): Derive the Poisson kernel solution if the region is a circle of
radius 𝜌 rather than 1. That is, solve Δ𝑢 = 0, 𝑢(𝜌, 𝜃) = 𝑔(𝜃).
Chapter 5

More on eigenvalue problems

5.1 Sturm–Liouville problems


Note: 2 lectures, §10.1 in [EP], §11.2 in [BD]

5.1.1 Boundary value problems


In chapter 4, we encountered several different eigenvalue problems such as:
𝑋 ′′(𝑥) + 𝜆𝑋(𝑥) = 0,
with different boundary conditions
𝑋(0) = 0 𝑋(𝐿) = 0 (Dirichlet), or
𝑋 ′(0) = 0 𝑋 ′(𝐿) = 0 (Neumann), or
𝑋 ′(0) = 0 𝑋(𝐿) = 0 (Mixed), or
𝑋(0) = 0 𝑋 ′(𝐿) = 0 (Mixed), . . .
These boundary problems came up in the study of the heat equation 𝑢𝑡 = 𝑘𝑢𝑥𝑥 when
we were trying to solve the equation by the method of separation of variables in § 4.6.
Dirichlet conditions correspond to applying a zero temperature at the ends, Neumann
means insulating the ends, etc. Other types of endpoint conditions also arise naturally,
such as the Robin boundary conditions
ℎ𝑋(0) − 𝑋 ′(0) = 0, ℎ𝑋(𝐿) + 𝑋 ′(𝐿) = 0,
for some constant ℎ. These conditions come up when the ends are immersed in some
medium.
In the separation of variables computation we encountered an eigenvalue problem and
found the eigenfunctions 𝑋𝑛 (𝑥). We then found the eigenfunction decomposition of the initial
temperature 𝑓 (𝑥) = 𝑢(𝑥, 0),

\[ f(x) = \sum_{n=1}^{\infty} c_n X_n(x) . \]
Once we had this decomposition and found suitable 𝑇𝑛 (𝑡) such that 𝑇𝑛 (0) = 1 and such
that 𝑇𝑛 (𝑡)𝑋𝑛 (𝑥) were solutions to the heat equation, we wrote the solution to the original
problem, including the initial condition, as

\[ u(x, t) = \sum_{n=1}^{\infty} c_n T_n(t) X_n(x) . \]

To study more general problems with this method, we must study more general
eigenvalue problems. First, we study second-order linear equations of the form

\[ \frac{d}{dx} \left( p(x) \frac{dy}{dx} \right) - q(x) y + \lambda r(x) y = 0 . \tag{5.1} \]

Essentially any second-order linear equation of the form 𝑎(𝑥)𝑦 ′′ +𝑏(𝑥)𝑦 ′ +𝑐(𝑥)𝑦 +𝜆𝑑(𝑥)𝑦 = 0
can be written as (5.1) after multiplying by a proper factor.
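
As a hedged aside (this formula is standard but is not stated in the text above), one possible choice of the "proper factor" can be written down explicitly, assuming 𝑎(𝑥) ≠ 0 on the interval and that the integral below makes sense. Define

\[
p(x) = \exp\left( \int \frac{b(x)}{a(x)} \, dx \right), \qquad
q(x) = -\frac{p(x) \, c(x)}{a(x)}, \qquad
r(x) = \frac{p(x) \, d(x)}{a(x)} .
\]

Then 𝑝′ = 𝑝𝑏/𝑎, and multiplying the equation by 𝑝/𝑎 gives

\[
\frac{p}{a} \left( a y'' + b y' + c y + \lambda d y \right)
= p y'' + p' y' + \frac{p c}{a} y + \lambda \frac{p d}{a} y
= \frac{d}{dx} \left( p \frac{dy}{dx} \right) - q y + \lambda r y .
\]

For the Bessel equation of Example 5.1.1, this recipe gives 𝑝(𝑥) = 𝑥, 𝑞(𝑥) = 𝑛²/𝑥, and 𝑟(𝑥) = 𝑥, with multiplying factor 𝑝/𝑎 = 1/𝑥.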
Example 5.1.1 (Bessel): Put the following equation into the form (5.1):

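\[ x^2 y'' + x y' + \left( \lambda x^2 - n^2 \right) y = 0 . \]
Multiply both sides by 1/𝑥 to obtain
\begin{align*}
\frac{1}{x} \left( x^2 y'' + x y' + \left( \lambda x^2 - n^2 \right) y \right)
&= x y'' + y' + \left( \lambda x - \frac{n^2}{x} \right) y \\
&= \frac{d}{dx} \left( x \frac{dy}{dx} \right) - \frac{n^2}{x} y + \lambda x y = 0.
\end{align*}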

The Bessel equation turns up for example in the solution of the two-dimensional wave
equation. If you want to see how one solves the equation, you can look at subsection 7.3.3.
The so-called Sturm–Liouville problem‗ is to seek nontrivial solutions to

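\begin{equation}
\begin{aligned}
& \frac{d}{dx} \left( p(x) \frac{dy}{dx} \right) - q(x) y + \lambda r(x) y = 0, \qquad a < x < b, \\
& \alpha_1 y(a) - \alpha_2 y'(a) = 0, \\
& \beta_1 y(b) + \beta_2 y'(b) = 0.
\end{aligned}
\tag{5.2}
\end{equation}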

By nontrivial, we again mean a 𝑦 that is not simply the constant zero (which is always a
solution). In particular, we ask which 𝜆s allow such nontrivial solutions. The 𝜆s that admit
nontrivial solutions are called the eigenvalues and the corresponding nontrivial solutions
are called eigenfunctions. The constants 𝛼1 and 𝛼2 should not both be zero; the same applies
to 𝛽1 and 𝛽2 .
‗ Named after the French mathematicians Jacques Charles François Sturm (1803–1855) and Joseph
Liouville (1809–1882).

Theorem 5.1.1. Suppose 𝑝(𝑥), 𝑝 ′(𝑥), 𝑞(𝑥) and 𝑟(𝑥) are continuous on [𝑎, 𝑏] and suppose 𝑝(𝑥) > 0
and 𝑟(𝑥) > 0 for all 𝑥 in [𝑎, 𝑏]. Then the Sturm–Liouville problem (5.2) has an increasing sequence
of eigenvalues
𝜆1 < 𝜆2 < 𝜆3 < · · ·
such that
\[ \lim_{n \to \infty} \lambda_n = +\infty \]

and such that to each 𝜆𝑛 there is (up to a constant multiple) a single eigenfunction 𝑦𝑛 (𝑥).
Moreover, if 𝑞(𝑥) ≥ 0 and 𝛼1 , 𝛼2 , 𝛽1 , 𝛽2 ≥ 0, then 𝜆𝑛 ≥ 0 for all 𝑛.

Problems satisfying the hypothesis of the theorem (including the “Moreover”) are called
regular Sturm–Liouville problems, and we will only consider such problems here. That is, a
regular problem is one where 𝑝(𝑥), 𝑝 ′(𝑥), 𝑞(𝑥) and 𝑟(𝑥) are continuous, 𝑝(𝑥) > 0, 𝑟(𝑥) > 0,
𝑞(𝑥) ≥ 0, and 𝛼1 , 𝛼2 , 𝛽1 , 𝛽2 ≥ 0, where 𝛼1 and 𝛼2 are not both zero, and 𝛽1 and 𝛽2 are not
both zero. Note: Be careful about the signs. Also be careful about the inequalities for 𝑟 and
𝑝: they must be strict for all 𝑥 in the interval [𝑎, 𝑏], including the endpoints!
When zero is an eigenvalue, we usually start labeling the eigenvalues from 0 rather
than from 1 for convenience. That is, we label the eigenvalues 𝜆0 < 𝜆1 < 𝜆2 < · · · .
Example 5.1.2: The problem 𝑦 ′′ + 𝜆𝑦 = 0, 0 < 𝑥 < 𝐿, 𝑦(0) = 0, and 𝑦(𝐿) = 0 is a regular
Sturm–Liouville problem: 𝑝(𝑥) = 1, 𝑞(𝑥) = 0, 𝑟(𝑥) = 1, and we have 𝑝(𝑥) = 1 > 0 and
𝑟(𝑥) = 1 > 0. We also have 𝑎 = 0, 𝑏 = 𝐿, 𝛼1 = 𝛽 1 = 1, 𝛼2 = 𝛽 2 = 0. The eigenvalues are
$\lambda_n = \frac{n^2 \pi^2}{L^2}$ and eigenfunctions are $y_n(x) = \sin\left( \frac{n\pi}{L} x \right)$. All eigenvalues are nonnegative as
predicted by the theorem.

Exercise 5.1.1: Find eigenvalues and eigenfunctions for

𝑦 ′′ + 𝜆𝑦 = 0, 𝑦 ′(0) = 0, 𝑦 ′(1) = 0.

Identify the 𝑝, 𝑞, 𝑟, 𝛼 𝑗 , 𝛽 𝑗 . Can you use the theorem above to make the search for eigenvalues easier?
Hint: Consider the condition −𝑦 ′(0) = 0.

Example 5.1.3: Find eigenvalues and eigenfunctions of the problem

𝑦 ′′ + 𝜆𝑦 = 0, 0 < 𝑥 < 1,
ℎ𝑦(0) − 𝑦 ′(0) = 0, 𝑦 ′(1) = 0, ℎ > 0.

These equations give a regular Sturm–Liouville problem.

Exercise 5.1.2: Identify 𝑝, 𝑞, 𝑟, 𝛼 𝑗 , 𝛽 𝑗 in the example above.

By Theorem 5.1.1, 𝜆 ≥ 0. So the general solution (without boundary conditions) is


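\[
\begin{aligned}
y(x) &= A \cos\left( \sqrt{\lambda}\, x \right) + B \sin\left( \sqrt{\lambda}\, x \right) && \text{if } \lambda > 0, \\
y(x) &= A x + B && \text{if } \lambda = 0.
\end{aligned}
\]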

Let us see if 𝜆 = 0 is an eigenvalue: We must satisfy 0 = ℎ𝐵 − 𝐴 and 𝐴 = 0, hence 𝐵 = 0
(as ℎ > 0). Therefore, 0 is not an eigenvalue (no nonzero solution, so no eigenfunction).
Now we try 𝜆 > 0. We plug in the boundary conditions:

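\[
\begin{aligned}
0 &= hA - \sqrt{\lambda}\, B, \\
0 &= -A \sqrt{\lambda} \sin\left( \sqrt{\lambda} \right) + B \sqrt{\lambda} \cos\left( \sqrt{\lambda} \right).
\end{aligned}
\]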

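If $A = 0$, then $B = 0$ and vice versa; therefore, both are nonzero. So $B = \frac{hA}{\sqrt{\lambda}}$, and
$0 = -A \sqrt{\lambda} \sin\left( \sqrt{\lambda} \right) + \frac{hA}{\sqrt{\lambda}} \sqrt{\lambda} \cos\left( \sqrt{\lambda} \right)$. As $A \neq 0$ we get
\[ 0 = -\sqrt{\lambda} \sin\left( \sqrt{\lambda} \right) + h \cos\left( \sqrt{\lambda} \right), \]
or
\[ \frac{h}{\sqrt{\lambda}} = \tan \sqrt{\lambda}. \]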
We use a computer to find 𝜆𝑛 . There are tables available, though using a computer or a
graphing calculator is far more convenient nowadays. The easiest method is to plot the
functions $h/x$ and $\tan x$ and see for which $x$ they intersect. There is an infinite number of
intersections. Denote the first intersection by $\sqrt{\lambda_1}$, the second intersection by $\sqrt{\lambda_2}$, etc. For
example, when $h = 1$, we get $\sqrt{\lambda_1} \approx 0.86$, $\sqrt{\lambda_2} \approx 3.43$, . . . . That is $\lambda_1 \approx 0.74$, $\lambda_2 \approx 11.73$, . . . .
A plot for ℎ = 1 is given in Figure 5.1 on the facing page. The appropriate eigenfunction

(let $A = 1$ for convenience, then $B = h/\sqrt{\lambda}$) is
\[ y_n(x) = \cos\left( \sqrt{\lambda_n}\, x \right) + \frac{h}{\sqrt{\lambda_n}} \sin\left( \sqrt{\lambda_n}\, x \right). \]

When ℎ = 1 we get (approximately)

\[
y_1(x) \approx \cos(0.86\, x) + \frac{1}{0.86} \sin(0.86\, x), \qquad
y_2(x) \approx \cos(3.43\, x) + \frac{1}{3.43} \sin(3.43\, x), \qquad \ldots
\]
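
The numbers above can be reproduced with a few lines of code. The following is a minimal sketch, not the method the text prescribes: instead of plotting, it applies bisection to the equivalent equation $x \sin x - h \cos x = 0$ (obtained by multiplying $h/x = \tan x$ by $x \cos x$, which avoids the poles of the tangent). The function names `bisect` and `sqrt_eigenvalues` are made up for this illustration.

```python
import math

def bisect(func, lo, hi, tol=1e-12):
    """Find a root of func in [lo, hi]; func(lo) and func(hi) must differ in sign."""
    f_lo = func(lo)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if f_lo * f_mid <= 0:
            hi = mid                 # the sign change is in [lo, mid]
        else:
            lo, f_lo = mid, f_mid    # the sign change is in [mid, hi]
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

def sqrt_eigenvalues(h, count):
    """First `count` positive solutions x = sqrt(lambda) of h/x = tan(x)."""
    g = lambda x: x * math.sin(x) - h * math.cos(x)
    roots = []
    for n in range(count):
        # For h > 0 the n-th solution lies where tan(x) > 0,
        # that is, in the interval (n*pi, n*pi + pi/2).
        lo = n * math.pi + 1e-9
        hi = n * math.pi + math.pi / 2
        roots.append(bisect(g, lo, hi))
    return roots

roots = sqrt_eigenvalues(h=1.0, count=3)
eigenvalues = [x * x for x in roots]
for x, lam in zip(roots, eigenvalues):
    print(f"sqrt(lambda) = {x:.4f}, lambda = {lam:.4f}")
```

Bracketing each root between $n\pi$ and $n\pi + \pi/2$ relies on $h > 0$; for other boundary conditions the brackets would have to be chosen differently.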

5.1.2 Orthogonality
We have seen the notion of orthogonality before. For example, we have shown that sin(𝑛𝑥)
are orthogonal for distinct 𝑛 on [0, 𝜋]. For general Sturm–Liouville problems we need
a more general setup. Let 𝑟(𝑥) be a weight function (any function, though generally we
assume it is positive) on [𝑎, 𝑏]. Two functions 𝑓 (𝑥), 𝑔(𝑥) are said to be orthogonal with
respect to the weight function 𝑟(𝑥) when
\[ \int_a^b f(x)\, g(x)\, r(x) \, dx = 0. \]

Figure 5.1: Plot of $\frac{1}{x}$ and $\tan x$.

In this setting, we define the inner product as


\[ \langle f, g \rangle \overset{\text{def}}{=} \int_a^b f(x)\, g(x)\, r(x) \, dx, \]

and then say 𝑓 and 𝑔 are orthogonal whenever ⟨ 𝑓 , 𝑔⟩ = 0. The results and concepts are
again analogous to finite-dimensional linear algebra.
The idea of the given inner product is that those 𝑥 where 𝑟(𝑥) is greater have more
weight. Nontrivial (nonconstant) 𝑟(𝑥) arise naturally, for example from a change of variables.
Hence, you could think of a change of variables such that 𝑑𝜉 = 𝑟(𝑥) 𝑑𝑥.
Eigenfunctions of a regular Sturm–Liouville problem satisfy an orthogonality property,
just like the eigenfunctions in § 4.1. Its proof is very similar to the analogous Theorem 4.1.1
on page 193.

Theorem 5.1.2. Suppose we have a regular Sturm–Liouville problem

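\[
\begin{aligned}
& \frac{d}{dx}\left( p(x) \frac{dy}{dx} \right) - q(x) y + \lambda r(x) y = 0, \\
& \alpha_1 y(a) - \alpha_2 y'(a) = 0, \\
& \beta_1 y(b) + \beta_2 y'(b) = 0.
\end{aligned}
\]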

Let 𝑦 𝑗 and 𝑦 𝑘 be two distinct eigenfunctions for two distinct eigenvalues 𝜆 𝑗 and 𝜆 𝑘 . Then
\[ \int_a^b y_j(x)\, y_k(x)\, r(x) \, dx = 0, \]

that is, 𝑦 𝑗 and 𝑦 𝑘 are orthogonal with respect to the weight function 𝑟.
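
As a numerical sanity check of the orthogonality statement (not a proof), the sketch below takes the eigenfunctions of Example 5.1.3 with $h = 1$ and weight $r(x) = 1$ on $[0, 1]$ and approximates their inner products with Simpson's rule. The helper names `root_in`, `eigenfunction`, and `inner`, and the choice of 2000 panels, are assumptions made for this illustration.

```python
import math

H = 1.0  # the constant h of Example 5.1.3 (value chosen for illustration)

def root_in(lo, hi, tol=1e-13):
    """Bisection for x*sin(x) - H*cos(x) = 0 on [lo, hi]."""
    g = lambda x: x * math.sin(x) - H * math.cos(x)
    g_lo = g(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g_lo * g(mid) <= 0:
            hi = mid
        else:
            lo, g_lo = mid, g(mid)
    return 0.5 * (lo + hi)

def eigenfunction(n):
    """y_n(x) = cos(w x) + (H/w) sin(w x), where w = sqrt(lambda_n), n = 1, 2, ..."""
    w = root_in((n - 1) * math.pi + 1e-9, (n - 1) * math.pi + math.pi / 2)
    return lambda x: math.cos(w * x) + (H / w) * math.sin(w * x)

def inner(f, g, a=0.0, b=1.0, r=lambda x: 1.0, panels=2000):
    """Weighted inner product <f, g> on [a, b] by composite Simpson's rule (panels even)."""
    step = (b - a) / panels
    total = f(a) * g(a) * r(a) + f(b) * g(b) * r(b)
    for i in range(1, panels):
        x = a + i * step
        total += (4 if i % 2 else 2) * f(x) * g(x) * r(x)
    return total * step / 3

y1, y2, y3 = eigenfunction(1), eigenfunction(2), eigenfunction(3)
print(inner(y1, y2), inner(y1, y3), inner(y2, y3))  # each should be close to 0
print(inner(y1, y1))                                 # strictly positive
```

The same `inner` function accepts a nonconstant weight through the parameter `r`, which is what a problem with $r(x) \neq 1$ would require.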

5.1.3 Fredholm alternative


The Fredholm alternative theorem we talked about before (Theorem 4.1.2 on page 194) holds
for all regular Sturm–Liouville problems. We state it here for completeness.

Theorem 5.1.3 (Fredholm alternative). Suppose that we have a regular Sturm–Liouville problem.
Then either

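\[
\begin{aligned}
& \frac{d}{dx}\left( p(x) \frac{dy}{dx} \right) - q(x) y + \lambda r(x) y = 0, \\
& \alpha_1 y(a) - \alpha_2 y'(a) = 0, \\
& \beta_1 y(b) + \beta_2 y'(b) = 0,
\end{aligned}
\]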

has a nonzero solution (𝜆 is an eigenvalue), or

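\[
\begin{aligned}
& \frac{d}{dx}\left( p(x) \frac{dy}{dx} \right) - q(x) y + \lambda r(x) y = f(x), \\
& \alpha_1 y(a) - \alpha_2 y'(a) = 0, \\
& \beta_1 y(b) + \beta_2 y'(b) = 0,
\end{aligned}
\]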

has a unique solution for any 𝑓 (𝑥) continuous on [𝑎, 𝑏].

We use this theorem in much the same way as we did in § 4.4, namely when
solving more general nonhomogeneous boundary value problems. The theorem does not
help us solve the problem, but it tells us when a unique solution exists, so that we know
when to spend time looking for it. To solve the problem we decompose 𝑓 (𝑥) and 𝑦(𝑥) in
terms of eigenfunctions of the homogeneous problem, and then solve for the coefficients of
the series for 𝑦(𝑥).

5.1.4 Eigenfunction series


What we want to do with the eigenfunctions once we have them is to compute the
eigenfunction decomposition of an arbitrary function 𝑓 (𝑥). That is, we wish to write


\[ f(x) = \sum_{n=1}^{\infty} c_n y_n(x), \tag{5.3} \]

where 𝑦𝑛 (𝑥) are eigenfunctions. We wish to find out if we can represent any function
𝑓 (𝑥) in this way, and if so, we wish to calculate the 𝑐 𝑛 (and of course we would want to
know if the sum converges). OK, so imagine we could write 𝑓 (𝑥) as (5.3). We will assume
convergence and the ability to integrate the series term by term. Because of orthogonality
we have
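\[
\begin{aligned}
\langle f, y_m \rangle = \int_a^b f(x)\, y_m(x)\, r(x) \, dx
 &= \int_a^b \left( \sum_{n=1}^{\infty} c_n y_n(x) \right) y_m(x)\, r(x) \, dx \\
 &= \sum_{n=1}^{\infty} c_n \int_a^b y_n(x)\, y_m(x)\, r(x) \, dx \\
 &= c_m \int_a^b y_m(x)\, y_m(x)\, r(x) \, dx = c_m \langle y_m, y_m \rangle.
\end{aligned}
\]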

Hence,
\[
c_m = \frac{\langle f, y_m \rangle}{\langle y_m, y_m \rangle}
    = \frac{\int_a^b f(x)\, y_m(x)\, r(x) \, dx}{\int_a^b \bigl( y_m(x) \bigr)^2 r(x) \, dx}.
\tag{5.4}
\]

Note that $y_m$ are known up to a constant multiple, so we could have picked a scalar
multiple of an eigenfunction such that $\langle y_m, y_m \rangle = 1$ (if we had an arbitrary eigenfunction
$\tilde{y}_m$, divide it by $\sqrt{\langle \tilde{y}_m, \tilde{y}_m \rangle}$). When $\langle y_m, y_m \rangle = 1$, we have the simpler form $c_m = \langle f, y_m \rangle$.
The following theorem holds more generally, but the statement given is enough for our
purposes.

Theorem 5.1.4. Suppose 𝑓 is a piecewise smooth continuous function on [𝑎, 𝑏]. If 𝑦1 , 𝑦2 , . . . are
eigenfunctions of a regular Sturm–Liouville problem, one for each eigenvalue, then there exist real
constants 𝑐 1 , 𝑐2 , . . . given by (5.4) such that (5.3) converges and holds for 𝑎 < 𝑥 < 𝑏.

Example 5.1.4: Consider

𝑦 ′′ + 𝜆𝑦 = 0, 0 < 𝑥 < 𝜋/2,


𝑦(0) = 0, 𝑦 ′(𝜋/2) = 0.

The above is a regular Sturm–Liouville problem, and Theorem 5.1.1 on page 275 says that
if 𝜆 is an eigenvalue then 𝜆 ≥ 0.
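Suppose $\lambda = 0$. The general solution is $y(x) = Ax + B$. We plug in the boundary conditions
to get $0 = y(0) = B$, and $0 = y'(\pi/2) = A$. Hence $\lambda = 0$ is not an eigenvalue.
So let us consider $\lambda > 0$, where the general solution is
\[ y(x) = A \cos\left( \sqrt{\lambda}\, x \right) + B \sin\left( \sqrt{\lambda}\, x \right). \]
Plugging in the boundary conditions we get $0 = y(0) = A$ and $0 = y'(\pi/2) = \sqrt{\lambda}\, B \cos\left( \sqrt{\lambda}\, \frac{\pi}{2} \right)$.
Since $A$ is zero, $B$ cannot be zero. Hence $\cos\left( \sqrt{\lambda}\, \frac{\pi}{2} \right) = 0$. This means that $\sqrt{\lambda}\, \frac{\pi}{2}$ is an
odd integral multiple of $\pi/2$, i.e. $(2n - 1) \frac{\pi}{2} = \sqrt{\lambda_n}\, \frac{\pi}{2}$. Solving for $\lambda_n$ we get

\[ \lambda_n = (2n - 1)^2. \]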

We can take 𝐵 = 1. Our eigenfunctions are

\[ y_n(x) = \sin\bigl( (2n - 1) x \bigr). \]




A little bit of calculus shows


\[ \int_0^{\frac{\pi}{2}} \Bigl( \sin\bigl( (2n - 1) x \bigr) \Bigr)^2 \, dx = \frac{\pi}{4}. \]

So any piecewise smooth function 𝑓 (𝑥) on [0, 𝜋/2] can be written as



\[ f(x) = \sum_{n=1}^{\infty} c_n \sin\bigl( (2n - 1) x \bigr), \]

where
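\[
c_n = \frac{\langle f, y_n \rangle}{\langle y_n, y_n \rangle}
    = \frac{\int_0^{\frac{\pi}{2}} f(x) \sin\bigl( (2n - 1) x \bigr) \, dx}{\int_0^{\frac{\pi}{2}} \Bigl( \sin\bigl( (2n - 1) x \bigr) \Bigr)^2 \, dx}
    = \frac{4}{\pi} \int_0^{\frac{\pi}{2}} f(x) \sin\bigl( (2n - 1) x \bigr) \, dx.
\]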

Note that the series converges to an odd $2\pi$-periodic extension of $f(x)$. With the regular
sine series, we would expect a function with period $2 \cdot \frac{\pi}{2} = \pi$.

Exercise 5.1.3 (challenging): In the example above, the function is defined on 0 < 𝑥 < 𝜋/2, yet the
series with respect to the eigenfunctions sin((2𝑛 − 1)𝑥) converges to an odd 2𝜋-periodic extension
of 𝑓 (𝑥). Find out how the extension is defined for 𝜋/2 < 𝑥 < 𝜋.

Let us compute an example. Consider 𝑓 (𝑥) = 𝑥 for 0 < 𝑥 < 𝜋/2. Some calculus later we
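find
\[
c_n = \frac{4}{\pi} \int_0^{\frac{\pi}{2}} f(x) \sin\bigl( (2n - 1) x \bigr) \, dx = \frac{4 (-1)^{n+1}}{\pi (2n - 1)^2},
\]
and so for $x$ in $[0, \pi/2]$,
\[ f(x) = \sum_{n=1}^{\infty} \frac{4 (-1)^{n+1}}{\pi (2n - 1)^2} \sin\bigl( (2n - 1) x \bigr). \]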

This is different from the 𝜋-periodic regular sine series which can be computed to be

\[ f(x) = \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n} \sin(2 n x). \]

Both series converge to 𝑓 (𝑥) for 0 < 𝑥 < 𝜋/2, but the eigenfunctions involved come from
different eigenvalue problems.
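
The coefficient formula (5.4) can also be evaluated numerically. The sketch below does this for $f(x) = x$ and the eigenfunctions $\sin\bigl((2n-1)x\bigr)$ on $[0, \pi/2]$, compares the result with the closed form $\frac{4(-1)^{n+1}}{\pi(2n-1)^2}$, and evaluates a partial sum. The helper names `simpson`, `coefficient`, and `partial_sum`, the 40-term cutoff, and the use of Simpson's rule are choices made for this illustration only.

```python
import math

def simpson(func, a, b, panels=2000):
    """Composite Simpson's rule on [a, b] (panels must be even)."""
    step = (b - a) / panels
    total = func(a) + func(b)
    for i in range(1, panels):
        total += (4 if i % 2 else 2) * func(a + i * step)
    return total * step / 3

def coefficient(f, n):
    """c_n from (5.4) for y_n(x) = sin((2n-1)x) on [0, pi/2] with weight r(x) = 1."""
    y = lambda x: math.sin((2 * n - 1) * x)
    numerator = simpson(lambda x: f(x) * y(x), 0.0, math.pi / 2)
    denominator = simpson(lambda x: y(x) ** 2, 0.0, math.pi / 2)
    return numerator / denominator

def partial_sum(coeffs, x):
    """Sum of c_n sin((2n-1)x) over the given coefficients (n starts at 1)."""
    return sum(c * math.sin((2 * n - 1) * x) for n, c in enumerate(coeffs, start=1))

f = lambda x: x
coeffs = [coefficient(f, n) for n in range(1, 41)]
closed_form = [4 * (-1) ** (n + 1) / (math.pi * (2 * n - 1) ** 2) for n in range(1, 41)]

print(max(abs(c - d) for c, d in zip(coeffs, closed_form)))  # small quadrature error
print(partial_sum(coeffs, 1.0))                               # close to f(1.0) = 1.0
```

Replacing `f` by another piecewise smooth function gives its coefficients with respect to the same eigenfunctions, without redoing the calculus by hand.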

5.1.5 Exercises
Exercise 5.1.4: Find eigenvalues and eigenfunctions of

𝑦 ′′ + 𝜆𝑦 = 0, 𝑦(0) − 𝑦 ′(0) = 0, 𝑦(1) = 0.

Exercise 5.1.5: Expand the function 𝑓 (𝑥) = 𝑥 on 0 ≤ 𝑥 ≤ 1 using eigenfunctions of the system

𝑦 ′′ + 𝜆𝑦 = 0, 𝑦 ′(0) = 0, 𝑦(1) = 0.

Exercise 5.1.6: Suppose that you had a Sturm–Liouville problem on the interval [0, 1] and came
up with 𝑦𝑛 (𝑥) = sin(𝛾𝑛𝑥), where 𝛾 > 0 is some constant. Decompose 𝑓 (𝑥) = 𝑥, 0 < 𝑥 < 1 in
terms of these eigenfunctions.

Exercise 5.1.7: Find eigenvalues and eigenfunctions of

𝑦 (4) + 𝜆𝑦 = 0, 𝑦(0) = 0, 𝑦 ′(0) = 0, 𝑦(1) = 0, 𝑦 ′(1) = 0.

This problem is not a Sturm–Liouville problem, but the idea is the same.

Exercise 5.1.8 (more challenging): Find eigenvalues and eigenfunctions for

\[ \frac{d}{dx}\left( e^x y' \right) + \lambda e^x y = 0, \qquad y(0) = 0, \qquad y(1) = 0. \]
Hint: First write the system as a constant-coefficient system to find general solutions. Do note that
Theorem 5.1.1 on page 275 guarantees 𝜆 ≥ 0.

Exercise 5.1.101: Find eigenvalues and eigenfunctions of

𝑦 ′′ + 𝜆𝑦 = 0, 𝑦(−1) = 0, 𝑦(1) = 0.

Exercise 5.1.102: Put the following problems into the standard form for Sturm–Liouville problems,
that is, find 𝑝(𝑥), 𝑞(𝑥), 𝑟(𝑥), 𝛼1 , 𝛼 2 , 𝛽 1 , and 𝛽2 , and decide if the problems are regular or not.

a) 𝑥 𝑦 ′′ + 𝜆𝑦 = 0 for 0 < 𝑥 < 1, 𝑦(0) = 0, 𝑦(1) = 0.


b) (1 + 𝑥 2 )𝑦 ′′ + 2𝑥𝑦 ′ + (𝜆 − 𝑥 2 )𝑦 = 0 for −1 < 𝑥 < 1, 𝑦(−1) = 0, 𝑦(1) + 𝑦 ′(1) = 0.‗

‗ In an earlier version of this book, a typo rendered the equation as $(1 + x^2) y'' - 2 x y' + (\lambda - x^2) y = 0$,
ending up with something harder than intended. Try this equation for a further challenge.

5.2 Higher-order eigenvalue problems


Note: 1 lecture, §10.2 in [EP], exercises in §11.2 in [BD]
The eigenfunction series can arise even from higher-order equations. Consider an
elastic beam (say made of steel). We will study the transversal vibrations (vibrations where
the beam “bends”) of the beam. Suppose the beam lies along the 𝑥-axis and let 𝑦(𝑥, 𝑡)
measure the displacement of the point 𝑥 on the beam at time 𝑡. See Figure 5.2.

Figure 5.2: Transversal vibrations of a beam.

The equation that governs this setup is

\[ a^4 \frac{\partial^4 y}{\partial x^4} + \frac{\partial^2 y}{\partial t^2} = 0, \]
for some constant $a > 0$. Let us not worry about the physics‗ .
Suppose the beam is of length 1 and simply supported (hinged) at the ends. The beam
is displaced by some function 𝑓 (𝑥) at time 𝑡 = 0 and then let go (initial velocity is 0). Then
𝑦 satisfies:
\[
\begin{aligned}
& a^4 y_{xxxx} + y_{tt} = 0 \qquad (0 < x < 1, \; t > 0), \\
& y(0, t) = y_{xx}(0, t) = 0, \\
& y(1, t) = y_{xx}(1, t) = 0, \\
& y(x, 0) = f(x), \qquad y_t(x, 0) = 0.
\end{aligned}
\tag{5.5}
\]

Again we try $y(x, t) = X(x) T(t)$ and plug in to get $a^4 X^{(4)} T + X T'' = 0$ or

\[ \frac{X^{(4)}}{X} = \frac{-T''}{a^4 T} = \lambda. \]
The equations are
\[ T'' + \lambda a^4 T = 0 \qquad \text{and} \qquad X^{(4)} - \lambda X = 0. \]
‗ If you are interested, $a^4 = \frac{EI}{\rho}$, where $E$ is the elastic modulus, $I$ is the second moment of area of the cross
section, and $\rho$ is linear density.

The boundary conditions 𝑦(0, 𝑡) = 𝑦 𝑥𝑥 (0, 𝑡) = 0 and 𝑦(1, 𝑡) = 𝑦 𝑥𝑥 (1, 𝑡) = 0 imply

𝑋(0) = 𝑋 ′′(0) = 0 and 𝑋(1) = 𝑋 ′′(1) = 0.

The initial homogeneous condition 𝑦𝑡 (𝑥, 0) = 0 implies

𝑇 ′(0) = 0.

As usual, we leave the nonhomogeneous 𝑦(𝑥, 0) = 𝑓 (𝑥) for later.


Considering the equation for $T$, that is, $T'' + \lambda a^4 T = 0$, and physical intuition leads us
to the fact that if 𝜆 is an eigenvalue then 𝜆 > 0: We expect vibration and not exponential
growth nor decay in the 𝑡 direction (there is no friction in our model for instance). So there
are no negative eigenvalues. Similarly 𝜆 = 0 is not an eigenvalue.

Exercise 5.2.1: Justify 𝜆 > 0 just from the equation for 𝑋 and the boundary conditions.
Let $\omega = \sqrt[4]{\lambda}$, that is, $\omega^4 = \lambda$, to avoid writing the fourth root all the time. Notice $\omega > 0$.
The general solution to $X^{(4)} - \omega^4 X = 0$ is

\[ X(x) = A e^{\omega x} + B e^{-\omega x} + C \sin(\omega x) + D \cos(\omega x). \]

Now $0 = X(0) = A + B + D$, $0 = X''(0) = \omega^2 (A + B - D)$. Solving, $D = 0$ and $B = -A$. So

\[ X(x) = A e^{\omega x} - A e^{-\omega x} + C \sin(\omega x). \]

Also $0 = X(1) = A (e^{\omega} - e^{-\omega}) + C \sin \omega$, and $0 = X''(1) = A \omega^2 (e^{\omega} - e^{-\omega}) - C \omega^2 \sin \omega$. This
means that $C \sin \omega = 0$ and $A (e^{\omega} - e^{-\omega}) = 2 A \sinh \omega = 0$. If $\omega > 0$, then $\sinh \omega \neq 0$ and so
$A = 0$. Thus $C \neq 0$, otherwise $\lambda$ is not an eigenvalue. Also, $\omega$ must be an integer multiple
of $\pi$. Hence $\omega = n\pi$ and $n \geq 1$ (as $\omega > 0$). We can take $C = 1$. So the eigenvalues are
$\lambda_n = n^4 \pi^4$, and corresponding eigenfunctions are $\sin(n \pi x)$.
Now $T'' + n^4 \pi^4 a^4 T = 0$. The general solution is $T(t) = A \sin(n^2 \pi^2 a^2 t) + B \cos(n^2 \pi^2 a^2 t)$.
But $T'(0) = 0$ and hence $A = 0$. We take $B = 1$ to make $T(0) = 1$ for convenience. So our
solutions are $T_n(t) = \cos(n^2 \pi^2 a^2 t)$.
The eigenfunctions are just the sines, so we decompose the function 𝑓 (𝑥) using the sine
series. That is, we find numbers 𝑏 𝑛 such that for 0 < 𝑥 < 1,

\[ f(x) = \sum_{n=1}^{\infty} b_n \sin(n \pi x). \]

Then the solution to (5.5) is



\[ y(x, t) = \sum_{n=1}^{\infty} b_n X_n(x) T_n(t) = \sum_{n=1}^{\infty} b_n \sin(n \pi x) \cos(n^2 \pi^2 a^2 t). \]

The point is that 𝑋𝑛 𝑇𝑛 is a solution that satisfies all the homogeneous conditions (all
conditions except the initial position). And since 𝑇𝑛 (0) = 1,


\[ y(x, 0) = \sum_{n=1}^{\infty} b_n X_n(x) T_n(0) = \sum_{n=1}^{\infty} b_n X_n(x) = \sum_{n=1}^{\infty} b_n \sin(n \pi x) = f(x). \]

So 𝑦(𝑥, 𝑡) solves (5.5).


The natural (angular) frequencies of the system are $n^2 \pi^2 a^2$. These frequencies are
all integer multiples of the fundamental frequency $\pi^2 a^2$, so we get a nice musical note.
The exact frequencies and their amplitude are what musicians call the timbre of the note
(outside of music it is called the spectrum).
The timbre of a beam is different from that of a vibrating string where we get “more” of
the lower frequencies since we get all integer multiples, 1, 2, 3, 4, 5, . . .. For a steel beam
we get only the square multiples 1, 4, 9, 16, 25, . . .. That is why when you hit a steel beam
you hear a very pure sound. The sound of a xylophone or vibraphone is, therefore, very
different from that of a guitar or piano.
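Example 5.2.1: Consider $f(x) = \frac{x(x-1)}{10}$. On $0 < x < 1$ (you know how to do this by now)

\[ f(x) = \sum_{\substack{n=1 \\ n \text{ odd}}}^{\infty} \frac{-4}{5 \pi^3 n^3} \sin(n \pi x). \]

Hence, the solution to (5.5) with the given initial position $f(x)$ is

\[ y(x, t) = \sum_{\substack{n=1 \\ n \text{ odd}}}^{\infty} \frac{-4}{5 \pi^3 n^3} \sin(n \pi x) \cos(n^2 \pi^2 a^2 t). \]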
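
A partial sum of this series is straightforward to evaluate. The sketch below assumes $a = 1$ by default and truncates the series after 50 odd terms; both choices, and the function names `beam_displacement` and `initial_shape`, are made only for this illustration. It checks that the truncated series reproduces the initial shape at $t = 0$.

```python
import math

def beam_displacement(x, t, a=1.0, terms=50):
    """Partial sum of the series for y(x, t) from Example 5.2.1 (odd n only)."""
    total = 0.0
    for n in range(1, 2 * terms, 2):          # n = 1, 3, 5, ...
        b_n = -4.0 / (5.0 * math.pi ** 3 * n ** 3)
        total += (b_n * math.sin(n * math.pi * x)
                  * math.cos(n ** 2 * math.pi ** 2 * a ** 2 * t))
    return total

def initial_shape(x):
    """The initial position f(x) = x(x - 1)/10."""
    return x * (x - 1) / 10

for x in (0.1, 0.25, 0.5, 0.9):
    print(x, beam_displacement(x, 0.0), initial_shape(x))
```

Because the coefficients decay like $1/n^3$, a few dozen terms already agree with $f(x)$ to several digits.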

There are other boundary conditions than just hinged ends. There are three basic
possibilities: hinged, free, or fixed. Consider the end at 𝑥 = 0. For the other end, it is the
same idea. If the end is hinged, then

𝑢(0, 𝑡) = 𝑢𝑥𝑥 (0, 𝑡) = 0.

If the end is free, that is, it is just floating in air, then

𝑢𝑥𝑥 (0, 𝑡) = 𝑢𝑥𝑥𝑥 (0, 𝑡) = 0.

And finally, if the end is clamped or fixed, for example it is welded to a wall, then

𝑢(0, 𝑡) = 𝑢𝑥 (0, 𝑡) = 0.

5.2.1 Exercises
Exercise 5.2.2: Suppose you have a beam of length 5 with free ends. Let 𝑦 be the transverse
deviation of the beam at position 𝑥 on the beam (0 < 𝑥 < 5). You know that the constants are such
that this satisfies the equation 𝑦𝑡𝑡 + 4𝑦 𝑥𝑥𝑥𝑥 = 0. Suppose you know that the initial shape of the
beam is the graph of 𝑥(5 − 𝑥), and the initial velocity is uniformly equal to 2 (same for each 𝑥) in the
positive 𝑦 direction. Set up the equation together with the boundary and initial conditions. Just set
up, do not solve.

Exercise 5.2.3: Suppose you have a beam of length 5 with one end free and one end fixed (the
fixed end is at 𝑥 = 5). Let 𝑢 be the longitudinal deviation of the beam at position 𝑥 on the beam
(0 < 𝑥 < 5). You know that the constants are such that this satisfies the equation 𝑢𝑡𝑡 = 4𝑢𝑥𝑥 .
Suppose you know that the initial displacement of the beam is $\frac{x-5}{50}$, and the initial velocity is $\frac{-(x-5)}{100}$
in the positive 𝑢 direction. Set up the equation together with the boundary and initial conditions.
Just set up, do not solve.

Exercise 5.2.4: Suppose the beam is 𝐿 units long, everything else kept the same as in (5.5). What is
the equation and the series solution?

Exercise 5.2.5: Suppose you have

𝑎 4 𝑦 𝑥𝑥𝑥𝑥 + 𝑦𝑡𝑡 = 0 (0 < 𝑥 < 1, 𝑡 > 0),


𝑦(0, 𝑡) = 𝑦 𝑥𝑥 (0, 𝑡) = 0,
𝑦(1, 𝑡) = 𝑦 𝑥𝑥 (1, 𝑡) = 0,
𝑦(𝑥, 0) = 𝑓 (𝑥), 𝑦𝑡 (𝑥, 0) = 𝑔(𝑥).

That is, you have also an initial velocity. Find a series solution. Hint: Use the same idea as we did
for the wave equation.

Exercise 5.2.101: Suppose you have a beam of length 1 with hinged ends. Let 𝑦 be the transverse
deviation of the beam at position 𝑥 on the beam (0 < 𝑥 < 1). You know that the constants are such
that this satisfies the equation 𝑦𝑡𝑡 + 4𝑦 𝑥𝑥𝑥𝑥 = 0. Suppose you know that the initial shape of the
beam is the graph of sin(𝜋𝑥), and the initial velocity is 0. Solve for 𝑦.

Exercise 5.2.102: Suppose you have a beam of length 10 with two fixed ends. Let 𝑦 be the transverse
deviation of the beam at position 𝑥 on the beam (0 < 𝑥 < 10). You know that the constants are such
that this satisfies the equation 𝑦𝑡𝑡 + 9𝑦 𝑥𝑥𝑥𝑥 = 0. Suppose you know that the initial shape of the
beam is the graph of $\sin^2(\pi x)$, and the initial velocity is uniformly equal to 𝑥(10 − 𝑥). Set up the
equation together with the boundary and initial conditions. Just set up, do not solve.

5.3 Steady periodic solutions


Note: 1–2 lectures, §10.3 in [EP], not in [BD]

5.3.1 Forced vibrating string


Consider a guitar string of length 𝐿. We studied this setup in § 4.7. Let 𝑥 be the position
on the string, 𝑡 the time, and 𝑦 the displacement of the string. See Figure 5.3.

Figure 5.3: Vibrating string.

The problem is governed by the wave equation

\[
\begin{aligned}
& y_{tt} = a^2 y_{xx}, \\
& y(0, t) = 0, \qquad y(L, t) = 0, \\
& y(x, 0) = f(x), \qquad y_t(x, 0) = g(x).
\end{aligned}
\tag{5.6}
\]

We found that the solution is of the form


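\[
y = \sum_{n=1}^{\infty} \left( A_n \cos\left( \frac{n \pi a}{L} t \right) + B_n \sin\left( \frac{n \pi a}{L} t \right) \right) \sin\left( \frac{n \pi}{L} x \right),
\]

where $A_n$ and $B_n$ are determined by the initial conditions. The natural frequencies of the
system are the (angular) frequencies $\frac{n \pi a}{L}$ for integers $n \geq 1$.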
But these are free vibrations. What if there is an external force acting on the string? Let
us assume, say, air vibrations (noise), for example from a second string. Or perhaps a jet
engine. For simplicity, assume nice pure sound and assume the force is uniform at every
position on the string: The external force as a function of time 𝑡 is given as 𝐹0 cos(𝜔𝑡), as
force per unit mass. We will call this the forcing function. Then our wave equation becomes
(remember force is mass times acceleration)

\[ y_{tt} = a^2 y_{xx} + F_0 \cos(\omega t), \tag{5.7} \]

with the same boundary conditions of course.



Suppose we want to find the solution here that satisfies the equation (5.7) and
𝑦(0, 𝑡) = 0, 𝑦(𝐿, 𝑡) = 0, 𝑦(𝑥, 0) = 0, 𝑦𝑡 (𝑥, 0) = 0. (5.8)
That is, the string is initially at rest. First, we find a particular solution 𝑦 𝑝 of (5.7) that
satisfies only the conditions 𝑦(0, 𝑡) = 𝑦(𝐿, 𝑡) = 0. We define the functions 𝑓 and 𝑔 as
\[ f(x) = -y_p(x, 0), \qquad g(x) = -\frac{\partial y_p}{\partial t}(x, 0). \]
We then find the solution $y_c$ of (5.6). If we add the two solutions, we find that $y = y_c + y_p$
solves (5.7) with the initial conditions.
Exercise 5.3.1: Check that 𝑦 = 𝑦 𝑐 + 𝑦 𝑝 solves (5.7) and the side conditions (5.8).
So the big issue here is to find the particular solution 𝑦 𝑝 . We look at the equation and
we make an educated guess
𝑦 𝑝 (𝑥, 𝑡) = 𝑋(𝑥) cos(𝜔𝑡).
We plug in to get
\[ -\omega^2 X \cos(\omega t) = a^2 X'' \cos(\omega t) + F_0 \cos(\omega t), \]
or $-\omega^2 X = a^2 X'' + F_0$ after canceling the cosine. We know how to find a general solution to
this equation (it is a nonhomogeneous constant-coefficient equation). The general solution
is
\[ X(x) = A \cos\left( \frac{\omega}{a} x \right) + B \sin\left( \frac{\omega}{a} x \right) - \frac{F_0}{\omega^2}. \]
The endpoint conditions imply 𝑋(0) = 𝑋(𝐿) = 0. So
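\[ 0 = X(0) = A - \frac{F_0}{\omega^2}, \]
or $A = \frac{F_0}{\omega^2}$, and also

\[ 0 = X(L) = \frac{F_0}{\omega^2} \cos\left( \frac{\omega L}{a} \right) + B \sin\left( \frac{\omega L}{a} \right) - \frac{F_0}{\omega^2}. \]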
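Assuming that $\sin\left( \frac{\omega L}{a} \right)$ is not zero we can solve for $B$ to get
\[ B = \frac{-F_0 \left( \cos\left( \frac{\omega L}{a} \right) - 1 \right)}{\omega^2 \sin\left( \frac{\omega L}{a} \right)}. \tag{5.9} \]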

Therefore,
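\[
X(x) = \frac{F_0}{\omega^2} \left( \cos\left( \frac{\omega}{a} x \right) - \frac{\cos\left( \frac{\omega L}{a} \right) - 1}{\sin\left( \frac{\omega L}{a} \right)} \sin\left( \frac{\omega}{a} x \right) - 1 \right).
\]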
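The particular solution $y_p$ we are looking for is

\[
y_p(x, t) = \frac{F_0}{\omega^2} \left( \cos\left( \frac{\omega}{a} x \right) - \frac{\cos\left( \frac{\omega L}{a} \right) - 1}{\sin\left( \frac{\omega L}{a} \right)} \sin\left( \frac{\omega}{a} x \right) - 1 \right) \cos(\omega t).
\]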

Exercise 5.3.2: Check that 𝑦 𝑝 works.

Now we get to the point that we skipped. Suppose $\sin\left( \frac{\omega L}{a} \right) = 0$. What this means is
that $\omega$ is equal to one of the natural frequencies of the system, i.e. a multiple of the base
frequency $\frac{\pi a}{L}$. We notice that if $\omega$ is not equal to a multiple of $\frac{\pi a}{L}$, but is very close, then
the coefficient $B$ in (5.9) seems to become very large as the denominator goes to zero. But
let us not jump to conclusions just yet because the numerator may also go to zero. When
$\omega = \frac{n \pi a}{L}$ for $n$ even, then $\cos\left( \frac{\omega L}{a} \right) = 1$, we could simply take $B = 0$ to obtain a nice bounded
solution of the same form, $y_p = \frac{F_0}{\omega^2} \left( \cos\left( \frac{\omega}{a} x \right) - 1 \right) \cos(\omega t)$, and so no resonance is happening.
Resonance occurs only when both $\cos\left( \frac{\omega L}{a} \right) = -1$ and $\sin\left( \frac{\omega L}{a} \right) = 0$, that is, when $\omega = \frac{n \pi a}{L}$ for
odd $n$. When $n$ is odd, if we take $\omega$ approaching $\frac{n \pi a}{L}$, then the numerator in the coefficient
𝐵 is not going to zero, but the denominator is, so 𝐵 really does “blow up.” We could again
try to explicitly solve for the resonance solution if we wanted to, but it is, in the right sense,
the limit of solutions as 𝜔 gets close to a resonance frequency. In real life, pure resonance
never occurs anyway.
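
The different behavior for odd and even $n$ is easy to see numerically. The following sketch evaluates the coefficient $B$ from (5.9) for forcing frequencies slightly above the first three natural frequencies, with $F_0 = L = a = 1$ chosen only for illustration; the function name `coefficient_B` is made up here.

```python
import math

def coefficient_B(omega, F0=1.0, L=1.0, a=1.0):
    """B from (5.9); only valid when sin(omega*L/a) is nonzero."""
    s = omega * L / a
    return -F0 * (math.cos(s) - 1.0) / (omega ** 2 * math.sin(s))

for n in (1, 2, 3):
    natural = n * math.pi            # n*pi*a/L with a = L = 1
    for eps in (1e-1, 1e-2, 1e-3):
        # Force the string slightly off the n-th natural frequency.
        print(n, eps, coefficient_B(natural + eps))
```

For $n = 1$ and $n = 3$ the printed values grow in magnitude roughly like $1/\varepsilon$ as $\varepsilon$ shrinks, while for $n = 2$ they shrink toward zero, in line with the discussion above.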
The calculation of resonance frequencies above explains why a string begins to vibrate
if the identical string is plucked close by. In the absence of friction this vibration would
get louder and louder as time goes on. On the other hand, you are unlikely to get large
vibration if the forcing frequency is not close to a resonance frequency even if you have
a jet engine running close to the string. That is, the amplitude does not keep increasing
unless you tune to just the right frequency.
Similar resonance phenomena occur when you break a wine glass using human voice
(yes, this is possible, but not easy‗ ) if you happen to hit just the right frequency. However,
a glass has much purer sound, i.e. it is more like a vibraphone, so there are far fewer
resonance frequencies to hit.
When the forcing function is more complicated, you decompose it in terms of the
Fourier series and apply the result above. You may also need to solve the problem above if
the forcing function is a sine rather than a cosine, but if you think about it, the solution is
almost the same.
Example 5.3.1: Let us do the computation for specific values. Suppose 𝐹0 = 1 and 𝜔 = 1
and $L = 1$ and $a = 1$. Then
\[ y_p(x, t) = \left( \cos(x) - \frac{\cos(1) - 1}{\sin(1)} \sin(x) - 1 \right) \cos(t). \]

Write $B = \frac{\cos(1) - 1}{\sin(1)}$ for simplicity.
Then plug in $t = 0$ to get

\[ f(x) = -y_p(x, 0) = -\cos x + B \sin x + 1, \]

and after differentiating in $t$ we see that $g(x) = -\frac{\partial y_p}{\partial t}(x, 0) = 0$.
‗ Mythbusters, episode 31, Discovery Channel, originally aired May 18th 2005.

Hence to find 𝑦 𝑐 , we need to solve the problem

𝑦𝑡𝑡 = 𝑦 𝑥𝑥 ,
𝑦(0, 𝑡) = 0, 𝑦(1, 𝑡) = 0,
𝑦(𝑥, 0) = − cos 𝑥 + 𝐵 sin 𝑥 + 1,
𝑦𝑡 (𝑥, 0) = 0.

If we use d’Alembert, we note that the formula that we use to define 𝑦(𝑥, 0) is not odd,
hence it is not a simple matter of plugging in the expression for 𝑦(𝑥, 0) to the d’Alembert
formula directly! You must define 𝐹 to be the odd, 2-periodic extension of 𝑦(𝑥, 0). Then
our solution is
\[
y(x, t) = \frac{F(x + t) + F(x - t)}{2} + \left( \cos(x) - \frac{\cos(1) - 1}{\sin(1)} \sin(x) - 1 \right) \cos(t). \tag{5.10}
\]

It is not hard to compute specific values for an odd periodic extension of a function and
hence (5.10) is a wonderful solution to the problem. For example, it is very easy to have a
computer do it, unlike a series solution. A plot is given in Figure 5.4.
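
To illustrate how little code this takes, here is a minimal sketch that evaluates (5.10) for the values of the example ($F_0 = \omega = L = a = 1$). The function names are chosen here for illustration; the odd, 2-periodic extension is implemented by reducing the argument to the interval $[-1, 1)$ and reflecting.

```python
import math

B = (math.cos(1.0) - 1.0) / math.sin(1.0)

def y_p(x, t):
    """Particular solution from Example 5.3.1."""
    return (math.cos(x) - B * math.sin(x) - 1.0) * math.cos(t)

def f(x):
    """Initial position for y_c on 0 <= x <= 1, namely -y_p(x, 0)."""
    return -math.cos(x) + B * math.sin(x) + 1.0

def F(z):
    """Odd, 2-periodic extension of f."""
    z = (z + 1.0) % 2.0 - 1.0        # reduce to the interval [-1, 1)
    return f(z) if z >= 0.0 else -f(-z)

def y(x, t):
    """The solution (5.10): d'Alembert part plus the particular solution."""
    return 0.5 * (F(x + t) + F(x - t)) + y_p(x, t)

# The string starts at rest in the zero position and the ends stay fixed.
print([round(y(x, 0.0), 12) for x in (0.1, 0.5, 0.9)])
print([round(y(0.0, t), 12) for t in (0.3, 1.7, 4.0)])
print(y(0.5, 2.0))
```

Evaluating `y` on a grid of $(x, t)$ values is all that is needed to produce a surface plot of the solution.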

Figure 5.4: Plot of $y(x, t) = \frac{F(x+t) + F(x-t)}{2} + \left( \cos(x) - \frac{\cos(1) - 1}{\sin(1)} \sin(x) - 1 \right) \cos(t)$.

5.3.2 Underground temperature oscillations


Let $u(x, t)$ be the temperature at a certain location at depth $x$ underground at time $t$. See
Figure 5.5.

Figure 5.5: Underground temperature (the axis shown is the depth $x$).

The temperature $u$ satisfies the heat equation $u_t = k u_{xx}$, where $k$ is the diffusivity of the soil. We
know the temperature at the surface $u(0, t)$ from weather records. Let us assume for simplicity that

\[ u(0, t) = T_0 + A_0 \cos(\omega t), \]

where $T_0$ is the yearly mean temperature, and $t = 0$ is midsummer (you can put negative sign above to
make it midwinter if you wish). $A_0$ gives the typical
variation for the year. That is, the hottest temperature is $T_0 + A_0$ and the coldest is $T_0 - A_0$.
For simplicity, we assume that $T_0 = 0$. The frequency $\omega$ is picked depending on the units of
$t$, such that when $t = 1$ year, then $\omega t = 2\pi$. For example if $t$ is in years, then $\omega = 2\pi$.
It seems reasonable that the temperature at depth 𝑥 also oscillates with the same
frequency. This, in fact, is the steady periodic solution, a solution independent of the initial
conditions. So we are looking for a solution of the form

𝑢(𝑥, 𝑡) = 𝑉(𝑥) cos(𝜔𝑡) + 𝑊(𝑥) sin(𝜔𝑡)

for the problem


𝑢𝑡 = 𝑘𝑢𝑥𝑥 , 𝑢(0, 𝑡) = 𝐴0 cos(𝜔𝑡). (5.11)
We employ the complex exponential here to make calculations simpler. Suppose we
have a complex-valued function

\[ h(x, t) = X(x)\, e^{i \omega t}. \]

We look for an ℎ such that Re ℎ = 𝑢. To find an ℎ, whose real part satisfies (5.11), we look
for an ℎ such that
\[ h_t = k h_{xx}, \qquad h(0, t) = A_0 e^{i \omega t}. \tag{5.12} \]

Exercise 5.3.3: Suppose ℎ satisfies (5.12). Use Euler’s formula for the complex exponential to check
that 𝑢 = Re ℎ satisfies (5.11).

Substitute ℎ into (5.12).


\[ i \omega X e^{i \omega t} = k X'' e^{i \omega t}. \]
Hence,
\[ k X'' - i \omega X = 0, \]
or
\[ X'' - \alpha^2 X = 0, \]
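where $\alpha = \pm \sqrt{\frac{i \omega}{k}}$. Note that $\pm \sqrt{i} = \pm \frac{1 + i}{\sqrt{2}}$ so you could simplify to $\alpha = \pm (1 + i) \sqrt{\frac{\omega}{2k}}$. Hence
the general solution is
\[ X(x) = A e^{-(1+i) \sqrt{\frac{\omega}{2k}}\, x} + B e^{(1+i) \sqrt{\frac{\omega}{2k}}\, x}. \]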
We assume that an 𝑋(𝑥) that solves the problem must be bounded as 𝑥 → ∞ since 𝑢(𝑥, 𝑡)
should be bounded (we are not worrying about Earth’s core!). If you use Euler’s formula
to expand the complex exponentials, note that the second term is unbounded (if 𝐵 ≠ 0),
while the first term is always bounded. Hence 𝐵 = 0.

Exercise 5.3.4: Use Euler’s formula to show that $e^{(1+i) \sqrt{\frac{\omega}{2k}}\, x}$ is unbounded as $x \to \infty$, while
$e^{-(1+i) \sqrt{\frac{\omega}{2k}}\, x}$ is bounded as $x \to \infty$.

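Furthermore, $X(0) = A_0$ since $h(0, t) = A_0 e^{i \omega t}$. Thus $A = A_0$. This means that
\[
h(x, t) = A_0 e^{-(1+i) \sqrt{\frac{\omega}{2k}}\, x} e^{i \omega t}
        = A_0 e^{-(1+i) \sqrt{\frac{\omega}{2k}}\, x + i \omega t}
        = A_0 e^{-\sqrt{\frac{\omega}{2k}}\, x} e^{i \left( \omega t - \sqrt{\frac{\omega}{2k}}\, x \right)}.
\]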

We need to get the real part of ℎ, so we apply Euler’s formula to get


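\[
h(x, t) = A_0 e^{-\sqrt{\frac{\omega}{2k}}\, x} \left( \cos\left( \omega t - \sqrt{\frac{\omega}{2k}}\, x \right) + i \sin\left( \omega t - \sqrt{\frac{\omega}{2k}}\, x \right) \right).
\]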
Then finally
\[
u(x, t) = \operatorname{Re} h(x, t) = A_0 e^{-\sqrt{\frac{\omega}{2k}}\, x} \cos\left( \omega t - \sqrt{\frac{\omega}{2k}}\, x \right).
\]
Yay! Notice the phase is different at different depths. At depth $x$ the phase is delayed by
$x \sqrt{\frac{\omega}{2k}}$. For example in cgs units (centimeters-grams-seconds) we have $k = 0.005$ (typical
value for soil), $\omega = \frac{2\pi}{\text{seconds in a year}} = \frac{2\pi}{31{,}557{,}341} \approx 1.99 \times 10^{-7}$. Then if we compute where the
phase shift $x \sqrt{\frac{\omega}{2k}} = \pi$ we find the depth in centimeters where the seasons are reversed.

That is, we get the depth at which summer is the coldest and winter is the warmest. We
get approximately 700 centimeters, which is approximately 23 feet below ground.
Be careful not to jump to conclusions. The temperature swings decay rapidly as you
dig deeper. The amplitude of the temperature swings is $A_0 e^{-\sqrt{\frac{\omega}{2k}}\, x}$. This function decays
very quickly as $x$ (the depth) grows. Let us again take typical parameters as above. We
also assume that our surface temperature swing is ±15° Celsius, that is, $A_0 = 15$. Then the
maximum temperature variation at 700 centimeters is only ±0.66° Celsius.
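
The two numbers quoted above (about 700 centimeters and about ±0.66° Celsius) can be checked with a short computation. The sketch below uses the parameter values from the text; the variable and function names are made up for this illustration.

```python
import math

k = 0.005                                # soil diffusivity in cgs units (from the text)
seconds_per_year = 31_557_341            # value used in the text
omega = 2 * math.pi / seconds_per_year   # about 1.99e-7
decay = math.sqrt(omega / (2 * k))       # sqrt(omega / (2k)), in 1/cm

def temperature(x, t, A0=15.0):
    """u(x, t) = A0 exp(-decay x) cos(omega t - decay x), with T0 = 0."""
    return A0 * math.exp(-decay * x) * math.cos(omega * t - decay * x)

def amplitude(x, A0=15.0):
    """Size of the temperature swing at depth x (in cm)."""
    return A0 * math.exp(-decay * x)

reversal_depth = math.pi / decay         # depth where the phase shift equals pi
print(f"seasons reversed at about {reversal_depth:.0f} cm")
print(f"swing at 700 cm: +/-{amplitude(700.0):.2f} degrees Celsius")
print(f"midsummer (t = 0) temperature at that depth: {temperature(reversal_depth, 0.0):.2f}")
```

The last line shows a negative value at midsummer, which is the season reversal; the swing there is only a small fraction of the surface swing.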
You need not dig very deep to get an effective “refrigerator,” with nearly constant
temperature. That is why wines are kept in a cellar; you need consistent temperature. The
temperature differential could also be used for energy. A home could be heated or cooled
by taking advantage of the fact above. Even without Earth’s core you could heat a home
in the winter and cool it in the summer. Earth’s core makes the temperature higher the
deeper you dig, although you need to dig somewhat deep to feel a difference. We did not
take that into account above.

5.3.3 Exercises
Exercise 5.3.5: Suppose that the forcing function for the vibrating string is 𝐹0 sin(𝜔𝑡). Derive the
particular solution 𝑦 𝑝 .

Exercise 5.3.6: Take the forced vibrating string. Suppose that 𝐿 = 1, 𝑎 = 1. Suppose that the
forcing function is the square wave that is 1 on the interval 0 < 𝑡 < 𝜋 and −1 on the interval
−𝜋 < 𝑡 < 0. Find the particular solution. Hint: You may want to use result of Exercise 5.3.5.

Exercise 5.3.7: The units are cgs (centimeters-grams-seconds). For $k = 0.005$, $\omega = 1.991 \times 10^{-7}$,
𝐴0 = 20. Find the depth at which the temperature variation is half (±10 degrees) of what it is on the
surface.

Exercise 5.3.8: Derive the solution for underground temperature oscillation without assuming that
𝑇0 = 0.

Exercise 5.3.101: Take the forced vibrating string. Suppose that 𝐿 = 1, 𝑎 = 1. Suppose that the
forcing function is a triangle wave, $|t| - \frac{\pi}{2}$ on $-\pi < t < \pi$ extended periodically. Find the particular
solution.

Exercise 5.3.102: The units are cgs (centimeters-grams-seconds). For $k = 0.01$, $\omega = 1.991 \times 10^{-7}$,
𝐴0 = 25. Find the depth at which the summer is again the hottest point.
Chapter 6

The Laplace transform

6.1 The Laplace transform


Note: 1.5–2 lectures, §10.1 in [EP], §6.1 and parts of §6.2 in [BD]

6.1.1 The transform


In this chapter, we will discuss the Laplace transform‗ . The Laplace transform is a
very efficient method to solve certain ODE or PDE problems. The transform takes a
differential equation and turns it into an algebraic equation. If the algebraic equation
can be solved, applying the inverse transform gives us our desired solution. The Laplace
transform also has applications in the analysis of electrical circuits, NMR spectroscopy,
signal processing, and elsewhere. Finally, understanding the Laplace transform will also
help with understanding the related Fourier transform, which, however, requires more
understanding of complex numbers. We will not cover the Fourier transform.
The Laplace transform also gives a lot of insight into the nature of the equations we are
dealing with. It can be seen as converting between the time and the frequency domain. For
example, take the standard equation

𝑚𝑥 ′′(𝑡) + 𝑐𝑥 ′(𝑡) + 𝑘𝑥(𝑡) = 𝑓 (𝑡).

We can think of 𝑡 as time and 𝑓 (𝑡) as incoming signal. The Laplace transform will convert
the equation from a differential equation in time to an algebraic (no derivatives) equation,
where the new independent variable 𝑠 is the frequency.†
‗ Just like the Laplace equation and the Laplacian, the Laplace transform is also named after Pierre-Simon,
marquis de Laplace (1749–1827).
† Really, it is the “frequency” in terms of complex numbers, but we digress.

We can think of the Laplace transform as a black box. It eats functions and spits out
functions in a new variable. We write ℒ{𝑓(𝑡)} = 𝐹(𝑠) for the Laplace transform of 𝑓(𝑡).
It is common to write lower case letters for functions in the time domain and upper case
letters for functions in the frequency domain. We use the same letter to denote that one
function is the Laplace transform of the other. For example, 𝐹(𝑠) is the Laplace transform
of 𝑓(𝑡). Let us define the transform.
\[
\mathcal{L}\{f(t)\} = F(s) \overset{\text{def}}{=} \int_0^\infty e^{-st} f(t) \, dt .
\]

We note that we are only considering 𝑡 ≥ 0 in the transform. Of course, if we think of 𝑡 as
time, there is no problem: we are generally interested in finding out what will happen in
the future (the Laplace transform is one place where it is safe to ignore the past). Let us
compute some simple transforms.
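Example 6.1.1: Suppose 𝑓(𝑡) = 1, then
\[
\mathcal{L}\{1\} = \int_0^\infty e^{-st} \, dt
= \left[ \frac{e^{-st}}{-s} \right]_{t=0}^{\infty}
= \lim_{h \to \infty} \left[ \frac{e^{-st}}{-s} \right]_{t=0}^{h}
= \lim_{h \to \infty} \left( \frac{e^{-sh}}{-s} - \frac{1}{-s} \right)
= \frac{1}{s} .
\]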

The limit (the improper integral) only exists if 𝑠 > 0. So ℒ{1} is only defined for 𝑠 > 0.
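Example 6.1.2: Suppose $f(t) = e^{-at}$, then
\[
\mathcal{L}\{e^{-at}\} = \int_0^\infty e^{-st} e^{-at} \, dt
= \int_0^\infty e^{-(s+a)t} \, dt
= \left[ \frac{e^{-(s+a)t}}{-(s+a)} \right]_{t=0}^{\infty}
= \frac{1}{s+a} .
\]
The limit only exists if 𝑠 + 𝑎 > 0. So $\mathcal{L}\{e^{-at}\}$ is only defined for 𝑠 + 𝑎 > 0.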


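Example 6.1.3: Suppose 𝑓(𝑡) = 𝑡, then using integration by parts
\[
\begin{aligned}
\mathcal{L}\{t\} &= \int_0^\infty e^{-st} t \, dt \\
&= \left[ \frac{-t e^{-st}}{s} \right]_{t=0}^{\infty} + \frac{1}{s} \int_0^\infty e^{-st} \, dt \\
&= 0 + \frac{1}{s} \left[ \frac{e^{-st}}{-s} \right]_{t=0}^{\infty} \\
&= \frac{1}{s^2} .
\end{aligned}
\]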
Again, the limit only exists if 𝑠 > 0.
Example 6.1.4: A common function is the unit step function, which is sometimes called the
Heaviside function‗. This function is generally given as
\[
u(t) =
\begin{cases}
0 & \text{if } t < 0, \\
1 & \text{if } t \geq 0.
\end{cases}
\]
‗ The function is named after the English mathematician, engineer, and physicist Oliver Heaviside
(1850–1925). Only by coincidence is the function “heavy” on “one side.”

Let us find the Laplace transform of 𝑢(𝑡 − 𝑎), where 𝑎 ≥ 0 is some constant. That is, it is
the function that is 0 for 𝑡 < 𝑎 and 1 for 𝑡 ≥ 𝑎.
\[
\mathcal{L}\{u(t-a)\} = \int_0^\infty e^{-st} u(t-a) \, dt
= \int_a^\infty e^{-st} \, dt
= \left[ \frac{e^{-st}}{-s} \right]_{t=a}^{\infty}
= \frac{e^{-as}}{s} ,
\]

where of course 𝑠 > 0 (and 𝑎 ≥ 0 as we said before).
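The four transforms computed in Examples 6.1.1 through 6.1.4 can be sanity-checked numerically. The following is a minimal sketch, not part of the original text: the helper `laplace_numeric`, the finite upper limit standing in for infinity, and the sample values 𝑠 = 2 and 𝑎 = 1.5 are all illustrative assumptions.

```python
import math

def laplace_numeric(f, s, upper=60.0, n=60000):
    """Approximate the integral of e^{-st} f(t) over [0, upper] by Simpson's rule.

    `upper` stands in for infinity, so the result is only meaningful when
    e^{-st} f(t) is negligible beyond it. `n` must be even.
    """
    h = upper / n
    total = f(0.0) + math.exp(-s * upper) * f(upper)
    for i in range(1, n):
        t = i * h
        weight = 4 if i % 2 else 2
        total += weight * math.exp(-s * t) * f(t)
    return total * h / 3

# Assumed sample values; any s > 0 and a >= 0 would do.
s, a = 2.0, 1.5

approx = {
    "1": laplace_numeric(lambda t: 1.0, s),
    "exp(-a t)": laplace_numeric(lambda t: math.exp(-a * t), s),
    "t": laplace_numeric(lambda t: t, s),
    "u(t - a)": laplace_numeric(lambda t: 1.0 if t >= a else 0.0, s),
}
exact = {
    "1": 1 / s,
    "exp(-a t)": 1 / (s + a),
    "t": 1 / s**2,
    "u(t - a)": math.exp(-a * s) / s,
}

for name in approx:
    print(f"L{{{name}}}: numeric {approx[name]:.6f}, formula {exact[name]:.6f}")
```

The step function is discontinuous, so Simpson's rule is less accurate for it than for the smooth integrands; the agreement there should be expected only to a few digits.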


By applying similar procedures, we can compute the transforms of many elementary
functions. Many basic transforms are listed in Table 6.1 (see also appendix B).

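𝑓(𝑡)        ℒ{𝑓(𝑡)}                      𝑓(𝑡)               ℒ{𝑓(𝑡)}
--------------------------------------------------------------------------------
𝐶           $\frac{C}{s}$                sin(𝜔𝑡)            $\frac{\omega}{s^2+\omega^2}$
𝑡           $\frac{1}{s^2}$              cos(𝜔𝑡)            $\frac{s}{s^2+\omega^2}$
$t^2$       $\frac{2}{s^3}$              sinh(𝜔𝑡)           $\frac{\omega}{s^2-\omega^2}$
$t^3$       $\frac{6}{s^4}$              cosh(𝜔𝑡)           $\frac{s}{s^2-\omega^2}$
$t^n$       $\frac{n!}{s^{n+1}}$         𝑢(𝑡 − 𝑎) (𝑎 ≥ 0)    $\frac{e^{-as}}{s}$
$e^{-at}$   $\frac{1}{s+a}$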

Table 6.1: Some Laplace transforms (𝐶, 𝜔, and 𝑎 are constants).

Exercise 6.1.1: Verify Table 6.1.

Since the transform is defined by an integral, we can use the linearity properties of the
integral. For example, suppose 𝐶 is a constant, then
\[
\mathcal{L}\{C f(t)\} = \int_0^\infty e^{-st} C f(t) \, dt
= C \int_0^\infty e^{-st} f(t) \, dt
= C \mathcal{L}\{f(t)\} .
\]
So we can “pull” a constant out of the transform. The same holds for addition. As linearity
is important, we state it as a theorem.

Theorem 6.1.1 (Linearity of the Laplace transform). Suppose that 𝐴, 𝐵, and 𝐶 are constants,
then
\[
\mathcal{L}\{A f(t) + B g(t)\} = A \mathcal{L}\{f(t)\} + B \mathcal{L}\{g(t)\} ,
\]
and in particular,
\[
\mathcal{L}\{C f(t)\} = C \mathcal{L}\{f(t)\} .
\]

Exercise 6.1.2: Verify the theorem. That is, show that $\mathcal{L}\{A f(t) + B g(t)\} = A \mathcal{L}\{f(t)\} + B \mathcal{L}\{g(t)\}$.

These rules together with Table 6.1 on the previous page make it easy to find the Laplace
transform of a whole lot of functions already. For example:
\[
\mathcal{L}\{2 + 5t + 9e^{-2t}\} = \mathcal{L}\{2\} + 5 \mathcal{L}\{t\} + 9 \mathcal{L}\{e^{-2t}\}
= \frac{2}{s} + \frac{5}{s^2} + \frac{9}{s+2} .
\]
Be careful! The Laplace transform of a product is not the product of the transforms. In
general
\[
\mathcal{L}\{f(t) g(t)\} \neq \mathcal{L}\{f(t)\} \mathcal{L}\{g(t)\} .
\]
Moreover, not all functions have a Laplace transform. For example, the function $\frac{1}{t}$ does
not have a Laplace transform as the integral diverges for all 𝑠. Similarly, tan 𝑡 or $e^{t^2}$ do not
have Laplace transforms.

6.1.2 Existence and uniqueness


When does the Laplace transform exist? A function 𝑓(𝑡) is of exponential order as 𝑡 goes to
infinity if
\[
|f(t)| \leq M e^{ct} ,
\]
for some constants 𝑀 and 𝑐, for sufficiently large 𝑡 (say for all $t > t_0$ for some $t_0$). The
simplest way to check this condition is to try and compute
\[
\lim_{t \to \infty} \frac{f(t)}{e^{ct}} .
\]

If the limit exists and is finite (usually zero), then 𝑓 (𝑡) is of exponential order.
Exercise 6.1.3: Use L’Hopital’s rule from calculus to show that a polynomial is of exponential order.
Hint: Note that a sum of two exponential-order functions is also of exponential order. Then show
that 𝑡 𝑛 is of exponential order for any 𝑛.
For an exponential-order function, we have existence and uniqueness of the Laplace
transform.
Theorem 6.1.2 (Existence). Let 𝑓(𝑡) be continuous on the interval [0, ∞) and of exponential order
for a certain constant 𝑐. Then 𝐹(𝑠) = ℒ{𝑓(𝑡)} is defined for all 𝑠 > 𝑐.
The existence is not difficult to see. Let 𝑓(𝑡) be of exponential order, that is, $|f(t)| \leq M e^{ct}$
for all 𝑡 > 0 (for simplicity $t_0 = 0$). Let 𝑠 > 𝑐, or in other words (𝑠 − 𝑐) > 0. By the comparison
theorem from calculus, the improper integral defining ℒ{𝑓(𝑡)} exists because the following
integral exists
\[
\int_0^\infty e^{-st} (M e^{ct}) \, dt
= M \int_0^\infty e^{-(s-c)t} \, dt
= M \left[ \frac{e^{-(s-c)t}}{-(s-c)} \right]_{t=0}^{\infty}
= \frac{M}{s-c} .
\]
The transform also exists for some other functions that are not of exponential order, but
that will not be relevant to us. Before dealing with uniqueness, we note that for functions
of exponential order, their Laplace transform decays at infinity:
\[
\lim_{s \to \infty} F(s) = 0 .
\]

Theorem 6.1.3 (Uniqueness). Let 𝑓 (𝑡) and 𝑔(𝑡) be continuous and of exponential order. Suppose
that there exists a constant 𝐶, such that 𝐹(𝑠) = 𝐺(𝑠) for all 𝑠 > 𝐶. Then 𝑓 (𝑡) = 𝑔(𝑡) for all 𝑡 ≥ 0.

Both theorems hold for piecewise continuous functions as well. Recall that piecewise
continuous means that the function is continuous except perhaps at a discrete set of points,
where it has jump discontinuities like the Heaviside function. Uniqueness, however, does
not “see” values at the discontinuities. So we can only conclude that 𝑓 (𝑡) = 𝑔(𝑡) outside of
discontinuities. For example, the unit step function is sometimes defined using 𝑢(0) = 1/2.
This new step function, however, has the exact same Laplace transform as the one we
defined earlier, where 𝑢(0) = 1.

6.1.3 The inverse transform


As we said, the Laplace transform will allow us to convert a differential equation into an
algebraic equation. Once we solve the algebraic equation in the frequency domain, we will
want to get back to the time domain, as that is what we are interested in. Given a function
𝐹(𝑠), we wish to find a function 𝑓(𝑡) such that ℒ{𝑓(𝑡)} = 𝐹(𝑠). Theorem 6.1.3 says that the
solution 𝑓(𝑡) is unique. So we can without fear make the following definition.
Suppose 𝐹(𝑠) = ℒ{𝑓(𝑡)} for some function 𝑓(𝑡). Define the inverse Laplace transform as
\[
\mathcal{L}^{-1}\{F(s)\} \overset{\text{def}}{=} f(t) .
\]


There is an integral formula for the inverse, but it is not as simple as the transform itself—it
requires complex numbers and path integrals. For us it will suffice to compute the inverse
using Table 6.1 on page 295.
Example 6.1.5: Take $F(s) = \frac{1}{s+1}$. Find the inverse Laplace transform.
We look at the table to find
\[
\mathcal{L}^{-1}\left\{ \frac{1}{s+1} \right\} = e^{-t} .
\]
As the Laplace transform is linear, the inverse Laplace transform is also linear. That is,
\[
\mathcal{L}^{-1}\{A F(s) + B G(s)\} = A \mathcal{L}^{-1}\{F(s)\} + B \mathcal{L}^{-1}\{G(s)\} .
\]
Of course, we also have $\mathcal{L}^{-1}\{A F(s)\} = A \mathcal{L}^{-1}\{F(s)\}$.

Example 6.1.6: Take $F(s) = \frac{s^2+s+1}{s^3+s}$. Find the inverse Laplace transform.
First we use the method of partial fractions to write 𝐹 in a form where we can use Table 6.1
on page 295. We factor the denominator as $s(s^2+1)$ and write
\[
\frac{s^2+s+1}{s^3+s} = \frac{A}{s} + \frac{Bs+C}{s^2+1} .
\]
Putting the right-hand side over a common denominator and equating the numerators
we get $A(s^2+1) + s(Bs+C) = s^2+s+1$. Expanding and equating coefficients, we obtain
𝐴 + 𝐵 = 1, 𝐶 = 1, 𝐴 = 1, and thus 𝐵 = 0. In other words,
\[
F(s) = \frac{s^2+s+1}{s^3+s} = \frac{1}{s} + \frac{1}{s^2+1} .
\]
By linearity of the inverse Laplace transform, we get
\[
\mathcal{L}^{-1}\left\{ \frac{s^2+s+1}{s^3+s} \right\}
= \mathcal{L}^{-1}\left\{ \frac{1}{s} \right\} + \mathcal{L}^{-1}\left\{ \frac{1}{s^2+1} \right\}
= 1 + \sin t .
\]
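A partial fraction decomposition such as the one in Example 6.1.6 is easy to get wrong by a sign or a coefficient, so a quick exact check can be worthwhile. The sketch below is an illustration only (the function names and sample points are assumptions): it compares both sides at a few rational points using exact arithmetic. Since the difference of the two sides is a polynomial of degree at most 2 divided by $s^3+s$, agreement at more than two points forces the identity.

```python
from fractions import Fraction

def original(s):
    """F(s) = (s^2 + s + 1) / (s^3 + s) from Example 6.1.6."""
    return (s**2 + s + 1) / (s**3 + s)

def decomposed(s):
    """Claimed partial fraction form 1/s + 1/(s^2 + 1)."""
    return 1 / s + 1 / (s**2 + 1)

# Arbitrary nonzero rational sample points (assumed for illustration).
samples = [Fraction(1, 2), Fraction(1), Fraction(2), Fraction(-3), Fraction(7, 5)]

# Fractions are compared exactly, so there is no rounding tolerance to choose.
all_equal = all(original(s) == decomposed(s) for s in samples)
print("decomposition agrees at all sample points:", all_equal)
```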

Another useful property is the so-called shifting property or the first shifting property
\[
\mathcal{L}\{e^{-at} f(t)\} = F(s+a) ,
\]
where 𝐹(𝑠) is the Laplace transform of 𝑓(𝑡).

Exercise 6.1.4: Derive the first shifting property from the definition of the Laplace transform.

The shifting property can be used, for example, when the denominator is a more
complicated quadratic that may come up in the method of partial fractions. We complete
the square and write such quadratics as $(s+a)^2 + b$ and then use the shifting property.
Example 6.1.7: Find $\mathcal{L}^{-1}\left\{ \frac{1}{s^2+4s+8} \right\}$.
First, we complete the square to make the denominator $(s+2)^2 + 4$. We find
\[
\mathcal{L}^{-1}\left\{ \frac{1}{s^2+4} \right\}
= \frac{1}{2} \mathcal{L}^{-1}\left\{ \frac{2}{s^2+2^2} \right\}
= \frac{1}{2} \sin(2t) .
\]
Finally, we put it all together with the shifting property to find
\[
\mathcal{L}^{-1}\left\{ \frac{1}{s^2+4s+8} \right\}
= \mathcal{L}^{-1}\left\{ \frac{1}{(s+2)^2+4} \right\}
= \frac{1}{2} e^{-2t} \sin(2t) .
\]
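The first shifting property used in Example 6.1.7 can be checked numerically by transforming $e^{-2t} \cdot \frac{1}{2}\sin(2t)$ directly and comparing with $F(s+2)$. This is a rough sketch under stated assumptions: `laplace_numeric` is a hypothetical helper that truncates the improper integral at a finite upper limit, and 𝑠 = 1 is an arbitrary sample point.

```python
import math

def laplace_numeric(f, s, upper=40.0, n=40000):
    """Simpson's rule for the integral of e^{-st} f(t) over [0, upper]; n must be even."""
    h = upper / n
    total = f(0.0) + math.exp(-s * upper) * f(upper)
    for i in range(1, n):
        t = i * h
        total += (4 if i % 2 else 2) * math.exp(-s * t) * f(t)
    return total * h / 3

def f(t):
    return 0.5 * math.sin(2 * t)      # inverse transform of 1/(s^2 + 4)

def F(s):
    return 1 / (s**2 + 4)             # transform of (1/2) sin(2t)

a = 2.0   # shift from completing the square: s^2 + 4s + 8 = (s + 2)^2 + 4
s = 1.0   # assumed sample point

shifted_numeric = laplace_numeric(lambda t: math.exp(-a * t) * f(t), s)
shifted_formula = F(s + a)                  # first shifting property
direct_formula = 1 / (s**2 + 4 * s + 8)     # the function we started with

print(shifted_numeric, shifted_formula, direct_formula)
```

All three printed numbers should agree to many digits if the property and the completed square are both right.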

Often, we want to be able to apply the inverse Laplace transform to rational functions,
that is, functions of the form
\[
\frac{F(s)}{G(s)}
\]
where 𝐹(𝑠) and 𝐺(𝑠) are polynomials. If $\frac{F(s)}{G(s)}$ is the Laplace transform of an exponential-order
function, it goes to zero as 𝑠 → ∞, and so the degree of 𝐹(𝑠) must be smaller than
that of 𝐺(𝑠). Such rational functions are called proper rational functions and we can always
apply the method of partial fractions without polynomial division. Of course, we still need
to be able to factor the denominator into linear and quadratic terms, which involves finding
the roots of the denominator.

6.1.4 Exercises
Exercise 6.1.5: Find the Laplace transform of $3 + t^5 + \sin(\pi t)$.

Exercise 6.1.6: Find the Laplace transform of $a + bt + ct^2$ for some constants 𝑎, 𝑏, and 𝑐.

Exercise 6.1.7: Find the Laplace transform of 𝐴 cos(𝜔𝑡) + 𝐵 sin(𝜔𝑡).

Exercise 6.1.8: Find the Laplace transform of $\cos^2(\omega t)$.

Exercise 6.1.9: Find the inverse Laplace transform of $\frac{4}{s^2-9}$.

Exercise 6.1.10: Find the inverse Laplace transform of $\frac{2s}{s^2-1}$.

Exercise 6.1.11: Find the inverse Laplace transform of $\frac{1}{(s-1)^2(s+1)}$.

Exercise 6.1.12: Find the Laplace transform of
\[
f(t) =
\begin{cases}
t & \text{if } t \geq 1, \\
0 & \text{if } t < 1.
\end{cases}
\]

Exercise 6.1.13: Find the inverse Laplace transform of $\frac{s}{(s^2+s+2)(s+4)}$.

Exercise 6.1.14: Find the Laplace transform of $\sin\bigl(\omega(t-a)\bigr)$.




Exercise 6.1.15: Find the Laplace transform of 𝑡 sin(𝜔𝑡). Hint: Several integrations by parts.

Exercise 6.1.101: Find the Laplace transform of $4(t+1)^2$.

Exercise 6.1.102: Find the inverse Laplace transform of $\frac{8}{s^3(s+2)}$.

Exercise 6.1.103: Find the Laplace transform of $t e^{-t}$. Hint: Integrate by parts.

Exercise 6.1.104: Find the Laplace transform of $\sin(t) e^{-t}$. Hint: Integrate by parts.

6.2 Transforms of derivatives and ODEs


Note: 2 lectures, §7.2–7.3 in [EP], §6.2 and §6.3 in [BD]

6.2.1 Transforms of derivatives


Let us see how the Laplace transform is used for differential equations. First we find the
Laplace transform of the derivative of a function. Suppose 𝑔(𝑡) is a differentiable function
of exponential order, that is, $|g(t)| \leq M e^{ct}$ for some 𝑀 and 𝑐. So ℒ{𝑔(𝑡)} exists, and what
is more, $\lim_{t \to \infty} e^{-st} g(t) = 0$ when 𝑠 > 𝑐. Then, integrating by parts,
\[
\mathcal{L}\{g'(t)\} = \int_0^\infty e^{-st} g'(t) \, dt
= \Bigl[ e^{-st} g(t) \Bigr]_{t=0}^{\infty} - \int_0^\infty (-s) e^{-st} g(t) \, dt
= -g(0) + s \mathcal{L}\{g(t)\} .
\]
We repeat this procedure for higher derivatives. The results are listed in Table 6.2. The
procedure also works for continuous piecewise smooth functions, that is, functions that
are continuous with a piecewise continuous derivative.

𝑓(𝑡)         ℒ{𝑓(𝑡)} = 𝐹(𝑠)
------------------------------------------------------
$g'(t)$      $s G(s) - g(0)$
$g''(t)$     $s^2 G(s) - s g(0) - g'(0)$
$g'''(t)$    $s^3 G(s) - s^2 g(0) - s g'(0) - g''(0)$

Table 6.2: Laplace transforms of derivatives (𝐺(𝑠) = ℒ{𝑔(𝑡)} as usual).




Exercise 6.2.1: Verify Table 6.2.

6.2.2 Solving ODEs with the Laplace transform


Notice that the Laplace transform turns differentiation into multiplication by 𝑠. It is this
property that makes it useful to apply the transform to differential equations.
Example 6.2.1: Consider the problem
\[
x''(t) + x(t) = \cos(2t), \qquad x(0) = 0, \qquad x'(0) = 1 .
\]
We will take the Laplace transform of both sides of the equation. By 𝑋(𝑠), we will, as usual,
denote the Laplace transform of 𝑥(𝑡).
\[
\begin{aligned}
\mathcal{L}\{x''(t) + x(t)\} &= \mathcal{L}\{\cos(2t)\} , \\
s^2 X(s) - s x(0) - x'(0) + X(s) &= \frac{s}{s^2+4} .
\end{aligned}
\]
We plug in the initial conditions now, which streamlines the computations, to find
\[
s^2 X(s) - 1 + X(s) = \frac{s}{s^2+4} .
\]
We solve for 𝑋(𝑠),
\[
X(s) = \frac{s}{(s^2+1)(s^2+4)} + \frac{1}{s^2+1} .
\]
We use partial fractions (exercise) to write
\[
X(s) = \frac{1}{3} \, \frac{s}{s^2+1} - \frac{1}{3} \, \frac{s}{s^2+4} + \frac{1}{s^2+1} .
\]
Now take the inverse Laplace transform to obtain
\[
x(t) = \frac{1}{3} \cos(t) - \frac{1}{3} \cos(2t) + \sin(t) .
\]
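One way to gain confidence in the solution of Example 6.2.1 is to integrate the initial value problem numerically and compare with the closed form. The sketch below uses a hand-written classical Runge–Kutta (RK4) stepper; the function names, the step count, and the comparison time 𝑡 = 5 are assumptions made for illustration and are not part of the Laplace method itself.

```python
import math

def rk4(rhs, t0, y0, t_end, steps):
    """Integrate y' = rhs(t, y) from t0 to t_end with the classical RK4 scheme."""
    h = (t_end - t0) / steps
    t, y = t0, list(y0)
    for _ in range(steps):
        k1 = rhs(t, y)
        k2 = rhs(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
        k3 = rhs(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
        k4 = rhs(t + h, [yi + h * ki for yi, ki in zip(y, k3)])
        y = [yi + h / 6 * (a + 2 * b + 2 * c + d)
             for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]
        t += h
    return y

def rhs(t, y):
    """x'' + x = cos(2t) written as a first-order system in (x, x')."""
    x, v = y
    return [v, math.cos(2 * t) - x]

def x_exact(t):
    """Solution obtained with the Laplace transform in Example 6.2.1."""
    return math.cos(t) / 3 - math.cos(2 * t) / 3 + math.sin(t)

t_end = 5.0
x_num, v_num = rk4(rhs, 0.0, [0.0, 1.0], t_end, 5000)
print(x_num, x_exact(t_end))
```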
The procedure for linear constant-coefficient equations is as follows: Take an ordinary
differential equation in the time variable 𝑡. Apply the Laplace transform to transform the
equation into an algebraic (nondifferential) equation in the frequency domain. All the 𝑥(𝑡),
𝑥′(𝑡), 𝑥′′(𝑡), and so on, will be converted to 𝑋(𝑠), 𝑠𝑋(𝑠) − 𝑥(0), $s^2 X(s) - s x(0) - x'(0)$, and
so on. Solve the equation for 𝑋(𝑠). Then taking the inverse transform, if possible, find 𝑥(𝑡).
It should be noted that since not every function has a Laplace transform, not every
equation can be solved in this manner. Also if the equation is not a linear constant-coefficient
ODE, then applying the Laplace transform may not give us an algebraic equation.

6.2.3 Using the Heaviside function


Before we move on to more general equations than those we could solve before, we want to
consider the Heaviside function. See Figure 6.1 on the following page for the graph.
\[
u(t) =
\begin{cases}
0 & \text{if } t < 0, \\
1 & \text{if } t \geq 0.
\end{cases}
\]
The Heaviside function is useful for putting other functions together or cutting functions
off. Usually we use 𝑢(𝑡 − 𝑎) for some constant 𝑎; we just shift the graph to the right by 𝑎.
That is, 𝑢(𝑡 − 𝑎) = 0 when 𝑡 < 𝑎 and 𝑢(𝑡 − 𝑎) = 1 when 𝑡 ≥ 𝑎. For example, suppose 𝑓(𝑡) is
a “signal”, and we started receiving sin 𝑡 at time 𝑡 = 𝜋. The function 𝑓(𝑡) is then defined as
\[
f(t) =
\begin{cases}
0 & \text{if } t < \pi, \\
\sin t & \text{if } t \geq \pi.
\end{cases}
\]
Using the Heaviside function, 𝑓(𝑡) can be written as
\[
f(t) = u(t - \pi) \sin t .
\]

Figure 6.1: Plot of the Heaviside (unit step) function 𝑢(𝑡).

Similarly, the step function that is 1 on the interval [1, 2) and 0 elsewhere is written as

𝑢(𝑡 − 1) − 𝑢(𝑡 − 2).

With the Heaviside function we can express functions defined piecewise. If 𝑓(𝑡) = 𝑡 when
𝑡 is in [0, 1], 𝑓(𝑡) = −𝑡 + 2 when 𝑡 is in [1, 2], and 𝑓(𝑡) = 0 otherwise, then you write
\[
f(t) = t \bigl( u(t) - u(t-1) \bigr) + (-t+2) \bigl( u(t-1) - u(t-2) \bigr) .
\]
How does the Heaviside function interact with the Laplace transform? We saw that
\[
\mathcal{L}\{u(t-a)\} = \frac{e^{-as}}{s} .
\]
This computation can be generalized into a shifting property or second shifting property.
\[
\mathcal{L}\{f(t-a) \, u(t-a)\} = e^{-as} \mathcal{L}\{f(t)\} . \tag{6.1}
\]
For example,
\[
\mathcal{L}\{t \, u(t-1)\} = \mathcal{L}\bigl\{ \bigl( (t-1) + 1 \bigr) u(t-1) \bigr\}
= e^{-s} \mathcal{L}\{t + 1\}
= e^{-s} \left( \frac{1}{s^2} + \frac{1}{s} \right) .
\]
Example 6.2.2: Consider the mass-spring system

𝑥 ′′(𝑡) + 𝑥(𝑡) = 𝑓 (𝑡), 𝑥(0) = 0, 𝑥 ′(0) = 0,

where 𝑓 (𝑡) = 1 if 1 ≤ 𝑡 < 5 and zero otherwise. The 𝑓 (𝑡) is not periodic and defined
piecewise; no problem for Laplace. Imagine a rocket attached to the mass is fired for 4
seconds starting at 𝑡 = 1. Or perhaps imagine an RLC circuit, where the voltage is raised at
a constant rate for 4 seconds starting at 𝑡 = 1 and then held steady again starting at 𝑡 = 5
(recall that 𝑓 (𝑡) represents the derivative of the voltage in the RLC circuit).
We write 𝑓(𝑡) = 𝑢(𝑡 − 1) − 𝑢(𝑡 − 5). We transform the equation and we plug in the initial
conditions as before
\[
s^2 X(s) + X(s) = \frac{e^{-s}}{s} - \frac{e^{-5s}}{s} .
\]
We solve for 𝑋(𝑠),
\[
X(s) = \frac{e^{-s}}{s(s^2+1)} - \frac{e^{-5s}}{s(s^2+1)} .
\]
We leave it as an exercise to the reader to show that
\[
\mathcal{L}^{-1}\left\{ \frac{1}{s(s^2+1)} \right\} = 1 - \cos t .
\]
In other words, $\mathcal{L}\{1 - \cos t\} = \frac{1}{s(s^2+1)}$. Using the second shifting property (6.1), we find
\[
\mathcal{L}^{-1}\left\{ \frac{e^{-s}}{s(s^2+1)} \right\}
= \mathcal{L}^{-1}\bigl\{ e^{-s} \mathcal{L}\{1 - \cos t\} \bigr\}
= \bigl( 1 - \cos(t-1) \bigr) u(t-1) .
\]
Similarly,
\[
\mathcal{L}^{-1}\left\{ \frac{e^{-5s}}{s(s^2+1)} \right\}
= \mathcal{L}^{-1}\bigl\{ e^{-5s} \mathcal{L}\{1 - \cos t\} \bigr\}
= \bigl( 1 - \cos(t-5) \bigr) u(t-5) .
\]
Hence, the solution is
\[
x(t) = \bigl( 1 - \cos(t-1) \bigr) u(t-1) - \bigl( 1 - \cos(t-5) \bigr) u(t-5) .
\]

The plot of this solution is given in Figure 6.2.
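The piecewise solution of Example 6.2.2 can be checked by plugging it back into the equation. The following sketch approximates 𝑥′′ with a central finite difference and evaluates the residual 𝑥′′ + 𝑥 − 𝑓 at a few times away from the switching points 𝑡 = 1 and 𝑡 = 5 (at the switching points themselves 𝑥′′ jumps, so the finite difference is not meaningful there). The sample times and the step size are illustrative assumptions.

```python
import math

def u(t):
    """Heaviside unit step with u(0) = 1, as defined in the text."""
    return 1.0 if t >= 0 else 0.0

def f(t):
    """Forcing: 1 on [1, 5) and 0 elsewhere."""
    return u(t - 1) - u(t - 5)

def x(t):
    """Solution from Example 6.2.2."""
    return (1 - math.cos(t - 1)) * u(t - 1) - (1 - math.cos(t - 5)) * u(t - 5)

def residual(t, h=1e-3):
    """x'' + x - f at time t, with x'' replaced by a central difference."""
    second = (x(t + h) - 2 * x(t) + x(t - h)) / h**2
    return second + x(t) - f(t)

# Assumed sample times, chosen away from the jumps of f at t = 1 and t = 5.
sample_times = [0.5, 2.0, 3.0, 4.5, 7.0, 12.0]
worst = max(abs(residual(t)) for t in sample_times)
print("largest residual:", worst)
```

A residual on the order of the finite-difference error (far below 1) at every sample time is consistent with the formula solving the equation on each piece.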

Figure 6.2: Plot of 𝑥(𝑡).

6.2.4 Transfer functions


The Laplace transform leads to the following useful concept for studying the steady-state
behavior of a linear system. Consider an equation of the form
𝐿𝑥 = 𝑓 (𝑡),
where 𝐿 is a linear constant-coefficient differential operator. Then 𝑓 (𝑡) is usually thought
of as input of the system and 𝑥(𝑡) is thought of as the output of the system. For example,
for a mass-spring system the input is the forcing function and the output is the behavior of
the mass. We would like to have a convenient way to study the behavior of the system for
different inputs.
Let us suppose that all the initial conditions are zero. We take the Laplace transform of
the equation to obtain the equation
𝐴(𝑠)𝑋(𝑠) = 𝐹(𝑠).
Solving for the ratio 𝑋(𝑠)/𝐹(𝑠), we obtain the so-called transfer function 𝐻(𝑠) = 1/𝐴(𝑠), that is,
\[
H(s) = \frac{X(s)}{F(s)} .
\]
In other words, 𝑋(𝑠) = 𝐻(𝑠)𝐹(𝑠). We obtain an algebraic dependence of the output of
the system based on the input. We can now easily study the steady-state behavior of the
system given different inputs by simply multiplying by the transfer function. Moreover, it
is possible to compute the 𝐻(𝑠) without knowing exactly what the equation is by observing
the output 𝑋(𝑠) for a given input 𝐹(𝑠). Once 𝐻(𝑠) is known, you can find the output for
any input.
Example 6.2.3: Given $x'' + \omega_0^2 x = f(t)$ (assume the initial conditions are zero), let us find
the transfer function.
First, we take the Laplace transform of the equation,
\[
s^2 X(s) + \omega_0^2 X(s) = F(s) .
\]
Now we solve for the transfer function 𝑋(𝑠)/𝐹(𝑠),
\[
H(s) = \frac{X(s)}{F(s)} = \frac{1}{s^2 + \omega_0^2} .
\]
Let us see how to use the transfer function. Suppose we have the constant input 𝑓(𝑡) = 1.
Hence 𝐹(𝑠) = 1/𝑠, and
\[
X(s) = H(s) F(s) = \frac{1}{s^2 + \omega_0^2} \, \frac{1}{s} .
\]
Taking the inverse Laplace transform of 𝑋(𝑠), we obtain
\[
x(t) = \frac{1 - \cos(\omega_0 t)}{\omega_0^2} .
\]
Similarly, for any other input 𝐹(𝑠), the output is $X(s) = H(s) F(s) = \frac{1}{s^2 + \omega_0^2} F(s)$.

6.2.5 Transforms of integrals


A feature of the Laplace transform is that it can also easily deal with integral equations,
that is, equations in which integrals rather than derivatives of functions appear. The basic
property, which can be proved by applying the definition and doing integration by parts, is
\[
\mathcal{L}\left\{ \int_0^t f(\tau) \, d\tau \right\} = \frac{1}{s} F(s) .
\]
It is sometimes useful (e.g. for computing the inverse transform) to write this as
\[
\int_0^t f(\tau) \, d\tau = \mathcal{L}^{-1}\left\{ \frac{1}{s} F(s) \right\} .
\]
Example 6.2.4: To compute $\mathcal{L}^{-1}\left\{ \frac{1}{s(s^2+1)} \right\}$ we could apply this integration rule:
\[
\mathcal{L}^{-1}\left\{ \frac{1}{s} \, \frac{1}{s^2+1} \right\}
= \int_0^t \mathcal{L}^{-1}\left\{ \frac{1}{s^2+1} \right\} d\tau
= \int_0^t \sin \tau \, d\tau
= 1 - \cos t .
\]
Example 6.2.5: An equation containing an integral of the unknown function is called an
integral equation. Consider
\[
x(t) - t = \int_0^t x(\tau) \, d\tau ,
\]
where we wish to solve for 𝑥(𝑡). We apply the Laplace transform to get
\[
X(s) - \frac{1}{s^2} = \frac{1}{s} X(s) ,
\]
where 𝑋(𝑠) = ℒ{𝑥(𝑡)}. Thus
\[
X(s) = \frac{1}{s(s-1)} = \frac{1}{s-1} - \frac{1}{s} .
\]
The inverse Laplace transform gives
\[
x(t) = e^t - 1 .
\]

6.2.6 Periodic functions


The reader might ask: What about periodic functions as our input 𝑓(𝑡)? That is, a function
𝑓(𝑡) where 𝑓(𝑡) = 𝑓(𝑡 + 𝑃) for some constant 𝑃 (the period). Well, let us compute 𝐹(𝑠):
\[
\begin{aligned}
F(s) = \int_0^\infty e^{-st} f(t) \, dt
&= \int_0^P e^{-st} f(t) \, dt + \int_P^\infty e^{-st} f(t) \, dt \\
&= \int_0^P e^{-st} f(t) \, dt + \int_0^\infty e^{-s(t+P)} f(t+P) \, dt
= \int_0^P e^{-st} f(t) \, dt + e^{-Ps} \int_0^\infty e^{-st} f(t) \, dt \\
&= \int_0^P e^{-st} f(t) \, dt + e^{-Ps} F(s) .
\end{aligned}
\]
Solving for 𝐹(𝑠), we get
\[
F(s) = \frac{1}{1 - e^{-Ps}} \int_0^P e^{-st} f(t) \, dt .
\]
As before, computing the inverse would be more complex and possibly involve consulting
a table. Let us not worry about computing the inverse here.
Example 6.2.6: Suppose our function 𝑓(𝑡) is a version of the sawtooth, that is, let 𝑓(𝑡) = 𝑡
for 0 ≤ 𝑡 < 1 and use 𝑓(𝑡) = 𝑓(𝑡 + 1) to extend it periodically. So 𝑓(𝑡) = 𝑡 − 1 for 1 ≤ 𝑡 < 2,
𝑓(𝑡) = 𝑡 − 2 for 2 ≤ 𝑡 < 3, etc. Then 𝑃 = 1 and a short computation with integration by
parts gets
\[
F(s) = \frac{1}{1 - e^{-s}} \int_0^1 e^{-st} \, t \, dt
= \frac{1}{1 - e^{-s}} \left( \frac{-e^{-s}}{s} - \frac{e^{-s}}{s^2} + \frac{1}{s^2} \right)
= \frac{-e^{-s}}{(1 - e^{-s}) s} + \frac{1}{s^2} .
\]
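The periodic-function formula and the closed form in Example 6.2.6 can be compared with a brute-force evaluation of the defining integral. The sketch below is an illustration only: `simpson` is a hypothetical helper, 𝑠 = 1.5 is an arbitrary sample value, and the improper integral is cut off after 40 periods, where $e^{-st}$ is negligible.

```python
import math

def simpson(g, a, b, n=200):
    """Composite Simpson's rule for the integral of g over [a, b]; n must be even."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

s = 1.5   # assumed sample value of s > 0
P = 1.0   # period of the sawtooth

# Formula for periodic functions: one-period integral divided by 1 - e^{-Ps}.
one_period = simpson(lambda t: math.exp(-s * t) * t, 0.0, P)
via_formula = one_period / (1 - math.exp(-P * s))

# Closed form from Example 6.2.6.
closed_form = -math.exp(-s) / ((1 - math.exp(-s)) * s) + 1 / s**2

# Definition of the transform, integrated period by period so that no
# subinterval straddles a jump of the sawtooth f(t) = t - k on [k, k + 1).
brute_force = sum(
    simpson(lambda t, k=k: math.exp(-s * t) * (t - k), float(k), float(k + 1))
    for k in range(40)
)

print(via_formula, closed_form, brute_force)
```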

6.2.7 Exercises
Exercise 6.2.2: Using the Heaviside function write down the piecewise function that is 0 for 𝑡 < 0,
$t^2$ for 𝑡 in [0, 1] and 𝑡 for 𝑡 > 1.

Exercise 6.2.3: Using the Laplace transform solve

𝑚𝑥′′ + 𝑐𝑥′ + 𝑘𝑥 = 0, 𝑥(0) = 𝑎, 𝑥′(0) = 𝑏,

where 𝑚 > 0, 𝑐 > 0, 𝑘 > 0, and $c^2 - 4km > 0$ (system is overdamped).

Exercise 6.2.4: Using the Laplace transform solve

𝑚𝑥′′ + 𝑐𝑥′ + 𝑘𝑥 = 0, 𝑥(0) = 𝑎, 𝑥′(0) = 𝑏,

where 𝑚 > 0, 𝑐 > 0, 𝑘 > 0, and $c^2 - 4km < 0$ (system is underdamped).

Exercise 6.2.5: Using the Laplace transform solve

𝑚𝑥′′ + 𝑐𝑥′ + 𝑘𝑥 = 0, 𝑥(0) = 𝑎, 𝑥′(0) = 𝑏,

where 𝑚 > 0, 𝑐 > 0, 𝑘 > 0, and $c^2 = 4km$ (system is critically damped).

Exercise 6.2.6: Solve 𝑥 ′′ + 𝑥 = 𝑢(𝑡 − 1) for initial conditions 𝑥(0) = 0 and 𝑥 ′(0) = 0.

Exercise 6.2.7: Show the “differentiation of the transform” property. Suppose ℒ{𝑓(𝑡)} = 𝐹(𝑠),
then show
\[
\mathcal{L}\{-t f(t)\} = F'(s) .
\]
Hint: Differentiate under the integral sign.

Exercise 6.2.8: Solve $x''' + x = t^3 u(t-1)$ for initial conditions 𝑥(0) = 1 and 𝑥′(0) = 0, 𝑥′′(0) = 0.

Exercise 6.2.9: Show the second shifting property: $\mathcal{L}\{f(t-a) \, u(t-a)\} = e^{-as} \mathcal{L}\{f(t)\}$.

Exercise 6.2.10: Consider the mass-spring system with a rocket from Example 6.2.2. We noticed
that the solution kept oscillating after the rocket stopped running. The amplitude of the oscillation
depends on the time that the rocket was fired (for 4 seconds in the example).

a) Find a formula for the amplitude of the resulting oscillation in terms of the amount of time the
rocket is fired.
b) Is there a nonzero time (if so what is it?) for which the rocket fires and the resulting oscillation
has amplitude 0 (the mass is not moving)?

Exercise 6.2.11: Define
\[
f(t) =
\begin{cases}
(t-1)^2 & \text{if } 1 \leq t < 2, \\
3 - t & \text{if } 2 \leq t < 3, \\
0 & \text{otherwise.}
\end{cases}
\]

a) Sketch the graph of 𝑓 (𝑡).
b) Write down 𝑓 (𝑡) using the Heaviside function.
c) Solve 𝑥 ′′ + 𝑥 = 𝑓 (𝑡), 𝑥(0) = 0, 𝑥 ′(0) = 0 using the Laplace transform.

Exercise 6.2.12: Find the transfer function for 𝑚𝑥 ′′ + 𝑐𝑥 ′ + 𝑘𝑥 = 𝑓 (𝑡), 𝑥(0) = 0, 𝑥 ′(0) = 0.

Exercise 6.2.13: Suppose 𝐿𝑥 = 𝑓 (𝑡), 𝑥(0) = 0, 𝑥 ′(0) = 0 for the input function 𝑓 (𝑡) = 𝑡 has the
output 𝑥(𝑡) = 2𝑒 −𝑡 + 𝑡𝑒 −𝑡 + (𝑡 − 2).

a) Find 𝐹(𝑠), 𝑋(𝑠), and the transfer function 𝐻(𝑠).


b) If the input is instead 𝑓(𝑡) = sin(𝑡), find the new output 𝑥(𝑡).

Exercise 6.2.14: Suppose 𝑓 (𝑡) = 1 if 0 ≤ 𝑡 < 1 and 𝑓 (𝑡) = 0 if 1 ≤ 𝑡 < 2, and then extend
periodically for all 𝑡 ≥ 0 so that 𝑓 (𝑡) = 𝑓 (𝑡 + 2). Compute the Laplace transform 𝐹(𝑠).

Exercise 6.2.101: Using the Heaviside function 𝑢(𝑡), write down the function
\[
f(t) =
\begin{cases}
0 & \text{if } t < 1, \\
t - 1 & \text{if } 1 \leq t < 2, \\
1 & \text{if } 2 \leq t.
\end{cases}
\]


Exercise 6.2.102: Solve $x'' - x = (t^2 - 1) u(t-1)$ for initial conditions 𝑥(0) = 1, 𝑥′(0) = 2 using
the Laplace transform.

Exercise 6.2.103: Find the transfer function for 𝑥 ′ + 𝑥 = 𝑓 (𝑡), 𝑥(0) = 0.

Exercise 6.2.104: Suppose 𝐿𝑥 = 𝑓 (𝑡), 𝑥(0) = 0, 𝑥 ′(0) = 0 for the input function 𝑓 (𝑡) = 4 has the
output 𝑥(𝑡) = 1 − cos(2𝑡). Find 𝐹(𝑠), 𝑋(𝑠), and the transfer function 𝐻(𝑠).

6.3 Convolution
Note: 1 or 1.5 lectures, §7.2 in [EP], §6.6 in [BD]

6.3.1 The convolution


The Laplace transformation of a product is not the product of the transforms. All hope
is not lost, however. We simply have to use a different type of a “product.” Take two
functions 𝑓(𝑡) and 𝑔(𝑡) defined for 𝑡 ≥ 0, and define the convolution‗ of 𝑓(𝑡) and 𝑔(𝑡) as
\[
(f * g)(t) \overset{\text{def}}{=} \int_0^t f(\tau) g(t - \tau) \, d\tau . \tag{6.2}
\]

As you can see, the convolution of two functions of 𝑡 is another function of 𝑡.


Example 6.3.1: Take $f(t) = e^t$ and 𝑔(𝑡) = 𝑡 for 𝑡 ≥ 0. Then
\[
(f * g)(t) = \int_0^t e^{\tau} (t - \tau) \, d\tau = e^t - t - 1 .
\]

To solve the integral we did one integration by parts.


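Example 6.3.2: Take 𝑓(𝑡) = sin(𝜔𝑡) and 𝑔(𝑡) = cos(𝜔𝑡) for 𝑡 ≥ 0. Then
\[
(f * g)(t) = \int_0^t \sin(\omega \tau) \cos\bigl( \omega (t - \tau) \bigr) \, d\tau .
\]
Apply the identity
\[
\cos(\theta) \sin(\psi) = \frac{1}{2} \bigl( \sin(\theta + \psi) - \sin(\theta - \psi) \bigr) ,
\]
to get
\[
\begin{aligned}
(f * g)(t) &= \int_0^t \frac{1}{2} \bigl( \sin(\omega t) - \sin(\omega t - 2 \omega \tau) \bigr) \, d\tau \\
&= \left[ \frac{1}{2} \tau \sin(\omega t) - \frac{1}{4\omega} \cos(\omega t - 2 \omega \tau) \right]_{\tau=0}^{t} \\
&= \frac{1}{2} t \sin(\omega t) .
\end{aligned}
\]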
The formula holds only for 𝑡 ≥ 0. The functions 𝑓 , 𝑔, and 𝑓 ∗ 𝑔 are undefined for 𝑡 < 0.
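The two convolutions in Examples 6.3.1 and 6.3.2 can be reproduced numerically straight from definition (6.2). This is a minimal sketch: the helper name `convolve`, the Simpson discretization, and the sample values 𝑡 = 2 and 𝜔 = 3 are assumptions for illustration.

```python
import math

def convolve(f, g, t, n=2000):
    """Approximate (f*g)(t), the integral of f(tau) g(t - tau) over [0, t].

    Uses composite Simpson's rule with an even number n of subintervals.
    """
    if t == 0:
        return 0.0
    h = t / n
    total = f(0.0) * g(t) + f(t) * g(0.0)
    for i in range(1, n):
        tau = i * h
        total += (4 if i % 2 else 2) * f(tau) * g(t - tau)
    return total * h / 3

t = 2.0       # assumed sample time
omega = 3.0   # assumed sample frequency

# Example 6.3.1: e^t convolved with t should give e^t - t - 1.
conv_exp_t = convolve(math.exp, lambda tau: tau, t)
formula_exp_t = math.exp(t) - t - 1

# Example 6.3.2: sin(omega t) convolved with cos(omega t) should give (1/2) t sin(omega t).
conv_sin_cos = convolve(lambda tau: math.sin(omega * tau),
                        lambda tau: math.cos(omega * tau), t)
formula_sin_cos = 0.5 * t * math.sin(omega * t)

# Swapping the arguments illustrates that convolution is commutative.
conv_t_exp = convolve(lambda tau: tau, math.exp, t)

print(conv_exp_t, formula_exp_t)
print(conv_sin_cos, formula_sin_cos)
```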
‗ For those that have seen convolution before, you may have seen it defined as $(f * g)(t) = \int_{-\infty}^{\infty} f(\tau) g(t - \tau) \, d\tau$.
This definition agrees with (6.2) if you define 𝑓(𝑡) and 𝑔(𝑡) to be zero for 𝑡 < 0. When discussing the Laplace
transform, the definition we gave is sufficient. Convolution does occur in many other applications, however,
where you may have to use the more general definition with infinities.

Convolution has many properties that make it behave like a product. Let 𝑐 be a constant
and 𝑓 , 𝑔, and ℎ be functions. It is a calculus exercise to verify that

𝑓 ∗ 𝑔 = 𝑔∗ 𝑓,
(𝑐 𝑓 ) ∗ 𝑔 = 𝑓 ∗ (𝑐 𝑔) = 𝑐( 𝑓 ∗ 𝑔),
( 𝑓 + 𝑔) ∗ ℎ = 𝑓 ∗ ℎ + 𝑔 ∗ ℎ,
( 𝑓 ∗ 𝑔) ∗ ℎ = 𝑓 ∗ (𝑔 ∗ ℎ).

The most interesting property for us is the following theorem.

Theorem 6.3.1. If 𝑓(𝑡) and 𝑔(𝑡) are of exponential order, then so is (𝑓 ∗ 𝑔)(𝑡) and
\[
\mathcal{L}\{(f * g)(t)\} = \mathcal{L}\left\{ \int_0^t f(\tau) g(t - \tau) \, d\tau \right\}
= \mathcal{L}\{f(t)\} \, \mathcal{L}\{g(t)\} .
\]
In other words, the Laplace transform of a convolution is the product of the Laplace
transforms: $\mathcal{L}\{(f * g)(t)\} = F(s) G(s)$, or in reverse, $\mathcal{L}^{-1}\{F(s) G(s)\} = (f * g)(t)$.


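Example 6.3.3: Suppose we wish to find the inverse Laplace transform of
\[
\frac{1}{s^2 (s+1)} = \frac{1}{s^2} \, \frac{1}{s+1} .
\]
We recognize the two entries of Table 6.1:
\[
\mathcal{L}^{-1}\left\{ \frac{1}{s^2} \right\} = t
\qquad \text{and} \qquad
\mathcal{L}^{-1}\left\{ \frac{1}{s+1} \right\} = e^{-t} .
\]
Therefore, we convolve 𝑡 and $e^{-t}$,
\[
\mathcal{L}^{-1}\left\{ \frac{1}{s^2} \, \frac{1}{s+1} \right\}
= \int_0^t \tau e^{-(t - \tau)} \, d\tau
= e^{-t} + t - 1 .
\]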

The calculation of the integral involved an integration by parts.

6.3.2 Solving ODEs


The next example demonstrates the full power of the convolution and the Laplace transform.
We can give the solution to the forced oscillation problem for any forcing function as a
definite integral.
Example 6.3.4: Find the solution to
\[
x'' + \omega_0^2 x = f(t), \qquad x(0) = 0, \qquad x'(0) = 0 ,
\]
for an arbitrary function 𝑓(𝑡).
We first apply the Laplace transform to the equation. Denote the transform of 𝑥(𝑡) by
𝑋(𝑠) and the transform of 𝑓(𝑡) by 𝐹(𝑠) as usual. We get
\[
s^2 X(s) + \omega_0^2 X(s) = F(s) ,
\]
or in other words,
\[
X(s) = F(s) \, \frac{1}{s^2 + \omega_0^2} .
\]
Recall that $H(s) = \frac{1}{s^2 + \omega_0^2}$ is the transfer function. We know
\[
\mathcal{L}^{-1}\left\{ \frac{1}{s^2 + \omega_0^2} \right\} = \frac{\sin(\omega_0 t)}{\omega_0} .
\]
Therefore,
\[
x(t) = \int_0^t f(t - \tau) \, \frac{\sin(\omega_0 \tau)}{\omega_0} \, d\tau ,
\]
or, if we reverse the order,
\[
x(t) = \int_0^t f(\tau) \, \frac{\sin\bigl( \omega_0 (t - \tau) \bigr)}{\omega_0} \, d\tau .
\]

Notice one more feature of the example above. We can now see how the Laplace
transform handles resonance. Suppose that $f(t) = \cos(\omega_0 t)$. Then
\[
x(t) = \int_0^t \frac{\sin(\omega_0 \tau)}{\omega_0} \cos\bigl( \omega_0 (t - \tau) \bigr) \, d\tau
= \frac{1}{\omega_0} \int_0^t \sin(\omega_0 \tau) \cos\bigl( \omega_0 (t - \tau) \bigr) \, d\tau .
\]
We have computed the convolution of sine and cosine in Example 6.3.2. Hence
\[
x(t) = \left( \frac{1}{\omega_0} \right) \left( \frac{1}{2} \, t \sin(\omega_0 t) \right)
= \frac{1}{2 \omega_0} \, t \sin(\omega_0 t) .
\]
Note the 𝑡 in front of the sine. The solution, therefore, grows without bound as 𝑡 gets large,
meaning we get resonance.
The general idea here is that if 𝐻(𝑠) is the transfer function, then 𝑋(𝑠) = 𝐻(𝑠)𝐹(𝑠). If we
find the $h(t) = \mathcal{L}^{-1}\{H(s)\}$, then
\[
x(t) = \mathcal{L}^{-1}\{X(s)\} = \mathcal{L}^{-1}\{F(s) H(s)\} = (f * h)(t)
= \int_0^t f(\tau) h(t - \tau) \, d\tau .
\]

Hence, we can solve any constant-coefficient equation with an arbitrary forcing function
𝑓 (𝑡) as a definite integral using convolution. A definite integral, rather than a closed form
solution, is usually enough for most practical purposes. It is not hard to numerically
evaluate a definite integral.
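As an illustration of that last remark, the following sketch evaluates the convolution integral from Example 6.3.4 numerically for two inputs whose outputs are known in closed form from the text: the constant input 𝑓(𝑡) = 1 and the resonant input $f(t) = \cos(\omega_0 t)$. The function name `forced_response`, the Simpson discretization, and the values $\omega_0 = 2$ and 𝑡 = 3 are assumptions made for this example.

```python
import math

def forced_response(f, omega0, t, n=2000):
    """x(t) for x'' + omega0^2 x = f(t), x(0) = x'(0) = 0.

    Evaluates the integral of f(tau) * sin(omega0 (t - tau)) / omega0 over
    [0, t] by composite Simpson's rule (n must be even).
    """
    if t == 0:
        return 0.0
    h = t / n

    def integrand(tau):
        return f(tau) * math.sin(omega0 * (t - tau)) / omega0

    total = integrand(0.0) + integrand(t)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3

omega0 = 2.0   # assumed natural frequency
t = 3.0        # assumed evaluation time

# Constant input: expected output (1 - cos(omega0 t)) / omega0^2.
constant_input = forced_response(lambda tau: 1.0, omega0, t)
constant_exact = (1 - math.cos(omega0 * t)) / omega0**2

# Resonant input: expected output t sin(omega0 t) / (2 omega0).
resonant_input = forced_response(lambda tau: math.cos(omega0 * tau), omega0, t)
resonant_exact = t * math.sin(omega0 * t) / (2 * omega0)

print(constant_input, constant_exact)
print(resonant_input, resonant_exact)
```

The same function accepts any forcing that can be evaluated pointwise, which is the practical appeal of writing the solution as a definite integral.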

6.3.3 Volterra integral equation


A common integral equation is the Volterra integral equation‗
\[
x(t) = f(t) + \int_0^t g(t - \tau) x(\tau) \, d\tau ,
\]

where 𝑓 (𝑡) and 𝑔(𝑡) are known functions and 𝑥(𝑡) is an unknown we wish to solve for. To
find 𝑥(𝑡), we apply the Laplace transform to the equation to obtain

𝑋(𝑠) = 𝐹(𝑠) + 𝐺(𝑠)𝑋(𝑠),

where 𝑋(𝑠), 𝐹(𝑠), and 𝐺(𝑠) are the Laplace transforms of 𝑥(𝑡), 𝑓 (𝑡), and 𝑔(𝑡) respectively.
We find
\[
X(s) = \frac{F(s)}{1 - G(s)} .
\]
To find 𝑥(𝑡), we now need to find the inverse Laplace transform of 𝑋(𝑠).
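Example 6.3.5: Solve
\[
x(t) = e^{-t} + \int_0^t \sinh(t - \tau) x(\tau) \, d\tau .
\]
We apply the Laplace transform to obtain
\[
X(s) = \frac{1}{s+1} + \frac{1}{s^2 - 1} X(s) ,
\]
or
\[
X(s) = \frac{\frac{1}{s+1}}{1 - \frac{1}{s^2 - 1}}
= \frac{s - 1}{s^2 - 2}
= \frac{s}{s^2 - 2} - \frac{1}{s^2 - 2} .
\]
It is not hard to apply Table 6.1 on page 295 to find
\[
x(t) = \cosh\bigl( \sqrt{2} \, t \bigr) - \frac{1}{\sqrt{2}} \sinh\bigl( \sqrt{2} \, t \bigr) .
\]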
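The solution of Example 6.3.5 can be checked by substituting it back into the integral equation and evaluating the integral numerically. The sketch below does this at a few sample times; the function names, the Simpson discretization, and the chosen times are illustrative assumptions.

```python
import math

def x(t):
    """Candidate solution from Example 6.3.5."""
    r = math.sqrt(2)
    return math.cosh(r * t) - math.sinh(r * t) / r

def right_hand_side(t, n=2000):
    """e^{-t} plus the integral of sinh(t - tau) x(tau) over [0, t] (Simpson's rule)."""
    if t == 0:
        return 1.0
    h = t / n

    def integrand(tau):
        return math.sinh(t - tau) * x(tau)

    total = integrand(0.0) + integrand(t)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return math.exp(-t) + total * h / 3

# Assumed sample times; the two sides should agree at each of them.
sample_times = (0.0, 0.5, 1.0, 2.0)
gaps = [abs(x(t) - right_hand_side(t)) for t in sample_times]
print("largest mismatch:", max(gaps))
```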

6.3.4 Exercises
Exercise 6.3.1: Let $f(t) = t^2$ for 𝑡 ≥ 0, and 𝑔(𝑡) = 𝑢(𝑡 − 1). Compute 𝑓 ∗ 𝑔.
Exercise 6.3.2: Let 𝑓 (𝑡) = 𝑡 for 𝑡 ≥ 0, and 𝑔(𝑡) = sin 𝑡 for 𝑡 ≥ 0. Compute 𝑓 ∗ 𝑔.
Exercise 6.3.3: Find the solution to

𝑚𝑥 ′′ + 𝑐𝑥 ′ + 𝑘𝑥 = 𝑓 (𝑡), 𝑥(0) = 0, 𝑥 ′(0) = 0,

for an arbitrary function 𝑓 (𝑡), where 𝑚 > 0, 𝑐 > 0, 𝑘 > 0, and 𝑐 2 − 4𝑘𝑚 > 0 (the system is
overdamped). Write the solution as a definite integral.
‗ Named for the Italian mathematician Vito Volterra (1860–1940).

Exercise 6.3.4: Find the solution to

𝑚𝑥 ′′ + 𝑐𝑥 ′ + 𝑘𝑥 = 𝑓 (𝑡), 𝑥(0) = 0, 𝑥 ′(0) = 0,

for an arbitrary function 𝑓 (𝑡), where 𝑚 > 0, 𝑐 > 0, 𝑘 > 0, and 𝑐 2 − 4𝑘𝑚 < 0 (the system is
underdamped). Write the solution as a definite integral.

Exercise 6.3.5: Find the solution to

𝑚𝑥 ′′ + 𝑐𝑥 ′ + 𝑘𝑥 = 𝑓 (𝑡), 𝑥(0) = 0, 𝑥 ′(0) = 0,

for an arbitrary function 𝑓 (𝑡), where 𝑚 > 0, 𝑐 > 0, 𝑘 > 0, and 𝑐 2 = 4𝑘𝑚 (the system is critically
damped). Write the solution as a definite integral.

Exercise 6.3.6: Solve
\[
x(t) = e^{-t} + \int_0^t \cos(t - \tau) x(\tau) \, d\tau .
\]

Exercise 6.3.7: Solve
\[
x(t) = \cos t + \int_0^t \cos(t - \tau) x(\tau) \, d\tau .
\]

Exercise 6.3.8: Compute $\mathcal{L}^{-1}\left\{ \frac{s}{(s^2+4)^2} \right\}$ using convolution.

Exercise 6.3.9: Write down the solution to $x'' - 2x = e^{-t^2}$, 𝑥(0) = 0, 𝑥′(0) = 0 as a definite
integral. Hint: Do not try to compute the Laplace transform of $e^{-t^2}$.

Exercise 6.3.101: Let 𝑓(𝑡) = cos 𝑡 for 𝑡 ≥ 0, and $g(t) = e^{-t}$. Compute 𝑓 ∗ 𝑔.

Exercise 6.3.102: Compute $\mathcal{L}^{-1}\left\{ \frac{5}{s^4+s^2} \right\}$ using convolution.

Exercise 6.3.103: Solve 𝑥 ′′ + 𝑥 = sin 𝑡, 𝑥(0) = 0, 𝑥 ′(0) = 0 using convolution.

Exercise 6.3.104: Solve 𝑥 ′′′ + 𝑥 ′ = 𝑓 (𝑡), 𝑥(0) = 0, 𝑥 ′(0) = 0, 𝑥 ′′(0) = 0 using convolution. Write
the result as a definite integral.

6.4 Dirac delta and impulse response


Note: 1 or 1.5 lectures, §7.6 in [EP], §6.5 in [BD]

6.4.1 Rectangular pulse


Often we study a physical system by putting in a short pulse and then seeing what the
system does. The resulting behavior is often called impulse response, and understanding it
tells us how the system responds to any input. Let us see what we mean by a pulse. The
simplest kind of a pulse is a simple rectangular pulse defined by
\[
\varphi(t) =
\begin{cases}
0 & \text{if } t < a, \\
M & \text{if } a \leq t < b, \\
0 & \text{if } b \leq t.
\end{cases}
\]


See Figure 6.3 for a graph.
Notice that
\[
\varphi(t) = M \bigl( u(t-a) - u(t-b) \bigr) ,
\]
where 𝑢(𝑡) is the unit step function. Let us take the Laplace transform of a rectangular
pulse,
\[
\mathcal{L}\{\varphi(t)\} = \mathcal{L}\bigl\{ M \bigl( u(t-a) - u(t-b) \bigr) \bigr\}
= M \, \frac{e^{-as} - e^{-bs}}{s} .
\]

Figure 6.3: Sample rectangular pulse with 𝑎 = 0.5, 𝑏 = 1, and 𝑀 = 2.

For simplicity, let 𝑎 = 0. It is also convenient to set 𝑀 = 1/𝑏 so that
\[
\int_0^\infty \varphi(t) \, dt = 1 .
\]

That is, to have the pulse have “unit mass.” For such a pulse,

𝑢(𝑡) − 𝑢(𝑡 − 𝑏) 1 − 𝑒 −𝑏𝑠


 
ℒ 𝜑(𝑡) = ℒ .

=
𝑏 𝑏𝑠

We want 𝑏 to be very small; we wish to have the pulse be very short and very tall. By
−𝑏𝑠
letting 𝑏 go to zero, we arrive at the concept of the Dirac delta function. The limit of 1−𝑒𝑏𝑠
as 𝑏 → 0 is 1, so we are looking for a “function” whose Laplace transform is 1.
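The limit can also be checked numerically. The following is only a small sketch: the function name `pulse_transform` and the sample value 𝑠 = 2 are our own choices for illustration. It evaluates the transform of the unit-mass pulse for shrinking widths 𝑏.

```python
import math

def pulse_transform(b: float, s: float) -> float:
    """Laplace transform (1 - e^{-bs}) / (bs) of the unit-mass pulse of width b."""
    # -expm1(-b*s) equals 1 - e^{-bs} and stays accurate when b*s is tiny.
    return -math.expm1(-b * s) / (b * s)

# Assumed sample value s = 2; any fixed s > 0 behaves the same way.
for b in (1.0, 0.1, 0.01, 0.001):
    print(b, pulse_transform(b, s=2.0))
```

The printed values approach 1 as 𝑏 shrinks, in line with the limit computed above.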

6.4.2 The delta function


The Dirac delta function‗ is not exactly a function; it is sometimes called a generalized function.
We avoid unnecessary details and simply say that it is an object that does not really make
sense unless we integrate it. The motivation is that we would like a “function” 𝛿(𝑡) such
that for any continuous function 𝑓 (𝑡),
∫ ∞
𝛿(𝑡) 𝑓 (𝑡) 𝑑𝑡 = 𝑓 (0).
−∞

The formula should hold if we integrate over any interval that contains 0, not just (−∞, ∞).
So 𝛿(𝑡) is a “function” with all its “mass” at the single point 𝑡 = 0. For any interval† [𝑐, 𝑑],
\[
\int_c^d \delta(t) \, dt =
\begin{cases}
1 & \text{if the interval } [c,d] \text{ contains } 0, \text{ i.e. } c \le 0 \le d, \\
0 & \text{otherwise.}
\end{cases}
\]

Unfortunately there is no such function in the classical sense. You could informally think
that 𝛿(𝑡) is zero for 𝑡 ≠ 0 and somehow infinite at 𝑡 = 0.
A good way to think about 𝛿(𝑡) is as a limit of pulses of decreasing length whose integral
is 1. For example, consider a rectangular pulse 𝜑(𝑡) as above with 𝑎 = 0 and 𝑀 = 1/𝑏, that
is, \( \varphi(t) = \frac{u(t)-u(t-b)}{b} \). Compute

\[ \int_{-\infty}^{\infty} \varphi(t) f(t) \, dt = \int_{-\infty}^{\infty} \frac{u(t) - u(t-b)}{b} f(t) \, dt = \frac{1}{b} \int_0^b f(t) \, dt . \]

If 𝑓 (𝑡) is continuous at 𝑡 = 0, then for very small 𝑏, the function 𝑓 (𝑡) is approximately equal
to 𝑓 (0) on the interval [0, 𝑏]. We approximate the integral
\[ \frac{1}{b} \int_0^b f(t) \, dt \approx \frac{1}{b} \int_0^b f(0) \, dt = f(0) . \]

Hence,

\[ \lim_{b \to 0} \int_{-\infty}^{\infty} \varphi(t) f(t) \, dt = \lim_{b \to 0} \frac{1}{b} \int_0^b f(t) \, dt = f(0) . \]
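To see the limit at work, here is a rough numerical sketch. The helper `pulse_average`, the midpoint rule, and the test function cos 𝑡 are assumptions made for illustration only. It computes the average of 𝑓 over [0, 𝑏], which is exactly the integral of 𝜑(𝑡)𝑓(𝑡).

```python
import math

def pulse_average(f, b, n=1000):
    """Approximate (1/b) * integral_0^b f(t) dt using the midpoint rule with n cells."""
    h = b / n
    return sum(f((i + 0.5) * h) for i in range(n)) * h / b

# For f = cos we have f(0) = 1; the exact average over [0, b] is sin(b)/b.
for b in (1.0, 0.1, 0.01):
    print(b, pulse_average(math.cos, b))
```

As 𝑏 decreases, the printed averages move toward 𝑓(0) = 1.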
Let us therefore accept 𝛿(𝑡) as an object that is possible to integrate. We often want to
shift 𝛿 to another point, for example 𝛿(𝑡 − 𝑎). In that case,
\[ \int_{-\infty}^{\infty} \delta(t-a) f(t) \, dt = f(a) . \]

Note that 𝛿(𝑎 − 𝑡) is the same object as 𝛿(𝑡 − 𝑎). With that, the convolution of 𝛿(𝑡) with 𝑓 (𝑡)
is again 𝑓 (𝑡),
\[ (f * \delta)(t) = \int_0^t f(\tau) \delta(t-\tau) \, d\tau = f(t) . \]

‗ Named after the English physicist and mathematician Paul Adrien Maurice Dirac (1902–1984).
† It is important that we consider 𝑐 and 𝑑 as part of the interval. One could write this integral as \( \int_{c^-}^{d^+} \).

As we can integrate 𝛿(𝑡), we compute its Laplace transform:


\[ \mathcal{L}\bigl\{ \delta(t-a) \bigr\} = \int_0^\infty e^{-st} \delta(t-a) \, dt = e^{-as} . \]

In particular,

\[ \mathcal{L}\bigl\{ \delta(t) \bigr\} = 1 . \]


Remark 6.4.1: The Laplace transform of 𝛿(𝑡 − 𝑎) would be the Laplace transform of the
derivative of the Heaviside function 𝑢(𝑡 − 𝑎), if the Heaviside function had a derivative.
First,

\[ \mathcal{L}\bigl\{ u(t-a) \bigr\} = \frac{e^{-as}}{s} . \]

To obtain what the Laplace transform of the derivative would be, we multiply by 𝑠, to obtain
\( e^{-as} \), which is the Laplace transform of 𝛿(𝑡 − 𝑎). We see the same thing using integration,

\[ \int_0^t \delta(s-a) \, ds = u(t-a) . \]

So in a certain sense

\[ \text{``} \; \frac{d}{dt} \bigl[ u(t-a) \bigr] = \delta(t-a) . \; \text{''} \]
This line of reasoning allows us to talk about derivatives of functions with jump discontinu-
ities. We can think of the derivative of the Heaviside function 𝑢(𝑡 − 𝑎) as being somehow
infinite at 𝑎, which is precisely our intuitive understanding of the delta function.
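Example 6.4.1: Let us compute \( \mathcal{L}^{-1}\left\{ \frac{s+1}{s} \right\} \). So far we only computed the inverse transform
of proper rational functions in the 𝑠 variable. That is, the numerator was of lower degree
than the denominator. Not so with \( \frac{s+1}{s} \). We can use the delta function to compute,

\[ \mathcal{L}^{-1}\left\{ \frac{s+1}{s} \right\} = \mathcal{L}^{-1}\left\{ 1 + \frac{1}{s} \right\} = \mathcal{L}^{-1}\{1\} + \mathcal{L}^{-1}\left\{ \frac{1}{s} \right\} = \delta(t) + 1 . \]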

The resulting object is a generalized function and only makes sense when put underneath
an integral.

6.4.3 Impulse response


As we said before, in the differential equation 𝐿𝑥 = 𝑓 (𝑡), we think of 𝑓 (𝑡) as input, and 𝑥(𝑡)
as the output. We think of the delta function as an impulse, and so to find the response to
an impulse, we use the delta function in place of 𝑓 (𝑡). The solution to

𝐿𝑥 = 𝛿(𝑡)

is called the impulse response.



Example 6.4.2: Solve (find the impulse response)

\[ x'' + \omega_0^2 x = \delta(t), \qquad x(0) = 0, \quad x'(0) = 0 . \tag{6.3} \]

Apply the Laplace transform to the equation, and denote the transform of 𝑥(𝑡) by 𝑋(𝑠):

\[ s^2 X(s) + \omega_0^2 X(s) = 1, \qquad \text{and so} \qquad X(s) = \frac{1}{s^2 + \omega_0^2} . \]

The inverse Laplace transform produces (for 𝑡 > 0)

\[ x(t) = \frac{\sin(\omega_0 t)}{\omega_0} . \]
Remark 6.4.2: Perhaps an astute reader will notice that it does not seem like 𝑥 ′(0) = 0.
However, we really want to think of 𝑥(𝑡) as being 0 for 𝑡 ≤ 0, so 𝑥(𝑡) has a “corner” at 𝑡 = 0,
that is, 𝑥′ has a jump discontinuity there, which is what produces the 𝛿(𝑡) when we take
𝑥′′. See Remark 6.4.1. The initial condition really is \( x'(0^-) = \lim_{t \uparrow 0} x'(t) = 0 \).
Let us notice something about the example above. In Example 6.3.4, we found that
when the input is 𝑓(𝑡), the solution to 𝐿𝑥 = 𝑓(𝑡) is given by

\[ x(t) = \int_0^t f(\tau) \, \frac{\sin\bigl( \omega_0 (t-\tau) \bigr)}{\omega_0} \, d\tau . \]
That is, the solution for an arbitrary input is given as convolution with the impulse response. Let
us see why. The key is to notice that for functions 𝑓(𝑡) and ℎ(𝑡),

\[ (f * h)''(t) = \frac{d^2}{dt^2} \left[ \int_0^t f(\tau) h(t-\tau) \, d\tau \right] = \int_0^t f(\tau) h''(t-\tau) \, d\tau = (f * h'')(t) . \]

We simply differentiate twice under the integral‗. Suppose that ℎ(𝑡) is the impulse response
(solution to \( h'' + \omega_0^2 h = \delta(t) \), that is, equation (6.3) with ℎ in place of 𝑥). If we convolve both
sides of this equation with 𝑓, the left-hand side becomes

\[ f * (h'' + \omega_0^2 h) = (f * h'') + \omega_0^2 (f * h) = (f * h)'' + \omega_0^2 (f * h) . \]

The right-hand side becomes

\[ (f * \delta)(t) = f(t) . \]

Therefore, if ℎ is the impulse response, then 𝑥(𝑡) = (𝑓 ∗ ℎ)(𝑡) is the solution to

\[ x'' + \omega_0^2 x = f(t) . \]
The procedure works for any constant-coefficient linear equation 𝐿𝑥 = 𝑓 (𝑡). If you find
the impulse response ℎ (solution to 𝐿ℎ = 𝛿(𝑡)), you also know how to obtain the output
𝑥(𝑡) for any input 𝑓 (𝑡). We simply convolve, 𝑥(𝑡) = ( 𝑓 ∗ ℎ)(𝑡). As you may have noticed
in the example, the impulse response is in fact just the inverse Laplace transform of the
transfer function, that is, \( h(t) = \mathcal{L}^{-1}\bigl\{ H(s) \bigr\} \).
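As a sanity check of the claim 𝑥 = 𝑓 ∗ ℎ, we can convolve numerically. This is a sketch under stated assumptions: the value 𝜔0 = 2, the constant input 𝑓(𝑡) = 1, the trapezoid rule, and all function names are ours. For this input the equation 𝑥′′ + 𝜔0²𝑥 = 1 with zero initial conditions has the closed-form solution (1 − cos(𝜔0𝑡))/𝜔0², which gives something to compare against.

```python
import math

OMEGA0 = 2.0  # assumed sample value of omega_0

def convolve(f, h, t, n=2000):
    """Approximate (f*h)(t) = integral_0^t f(tau) h(t - tau) dtau by the trapezoid rule."""
    if t <= 0:
        return 0.0
    step = t / n
    total = 0.5 * (f(0.0) * h(t) + f(t) * h(0.0))
    for i in range(1, n):
        tau = i * step
        total += f(tau) * h(t - tau)
    return total * step

def impulse_response(t):
    """h(t) = sin(omega_0 t) / omega_0, the impulse response from Example 6.4.2."""
    return math.sin(OMEGA0 * t) / OMEGA0

def constant_input(t):
    """Assumed test input f(t) = 1."""
    return 1.0

def exact_solution(t):
    """Closed-form solution of x'' + omega_0^2 x = 1 with x(0) = x'(0) = 0."""
    return (1.0 - math.cos(OMEGA0 * t)) / OMEGA0 ** 2

print(convolve(constant_input, impulse_response, 1.5), exact_solution(1.5))
```

The two printed numbers should agree to several decimal places; any other input 𝑓 can be substituted for `constant_input`.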

‗You should really think of the integral going over (−∞, ∞) rather than over [0, 𝑡] and simply assume
that 𝑓 (𝑡) and ℎ(𝑡) are zero for negative 𝑡.

6.4.4 Three-point beam bending


A quite different example application where the delta function appears is in representing
point loads on a steel beam. Consider a beam of length 𝐿, resting on two simple supports
at the ends. Let 𝑥 denote the position on the beam, and let 𝑦(𝑥) denote the deflection of the
beam in the vertical direction. The deflection 𝑦(𝑥) satisfies the Euler–Bernoulli equation‗ ,

\[ EI \frac{d^4 y}{dx^4} = F(x), \]
where 𝐸 and 𝐼 are constants† and 𝐹(𝑥) is the force applied per unit length at position 𝑥. The
situation we are interested in is when the force is applied at a single point as in Figure 6.4.

Figure 6.4: Three-point bending.

The equation becomes


\[ EI \frac{d^4 y}{dx^4} = -F \delta(x-a), \]

where 𝑥 = 𝑎 is the point where the load is applied. The constant 𝐹 is the force applied and
the minus sign indicates that the force is downward, that is, in the negative 𝑦 direction.
The end points of the beam satisfy the conditions,

𝑦(0) = 0, 𝑦 ′′(0) = 0,
𝑦(𝐿) = 0, 𝑦 ′′(𝐿) = 0.

See § 5.2 for further information about endpoint conditions applied to beams.
Example 6.4.3: Suppose that the length of the beam is 2, and 𝐸𝐼 = 1 for simplicity. Further
suppose that the force 𝐹 = 1 is applied at 𝑥 = 1. That is, we have the equation

\[ \frac{d^4 y}{dx^4} = -\delta(x-1), \]
and the endpoint conditions are

𝑦(0) = 0, 𝑦 ′′(0) = 0, 𝑦(2) = 0, 𝑦 ′′(2) = 0.


‗ Named for the Swiss mathematicians Jacob Bernoulli (1654–1705), Daniel Bernoulli (1700–1782), the
nephew of Jacob, and Leonhard Paul Euler (1707–1783).
† 𝐸 is the elastic modulus and 𝐼 is the second moment of area. Let us not worry about the details and

simply think of these as some given constants.



We could integrate, but using the Laplace transform is even easier. We apply the
transform in the 𝑥 variable rather than the 𝑡 variable. We again denote the transform of
𝑦(𝑥) by 𝑌(𝑠).

\[ s^4 Y(s) - s^3 y(0) - s^2 y'(0) - s y''(0) - y'''(0) = -e^{-s} . \]

We notice that 𝑦(0) = 0 and 𝑦′′(0) = 0. Let us call \( C_1 = y'(0) \) and \( C_2 = y'''(0) \). We solve for
𝑌(𝑠),

\[ Y(s) = \frac{-e^{-s}}{s^4} + \frac{C_1}{s^2} + \frac{C_2}{s^4} . \]

We take the inverse Laplace transform utilizing the second shifting property (6.1) to take
the inverse of the first term:

\[ y(x) = \frac{-(x-1)^3}{6} u(x-1) + C_1 x + \frac{C_2}{6} x^3 . \]
We still need to apply two of the endpoint conditions. As the conditions are at 𝑥 = 2 we
can simply replace 𝑢(𝑥 − 1) = 1 when taking the derivatives. Therefore,

\[ 0 = y(2) = \frac{-(2-1)^3}{6} + C_1 (2) + \frac{C_2}{6} 2^3 = \frac{-1}{6} + 2 C_1 + \frac{4}{3} C_2 , \]

and

\[ 0 = y''(2) = \frac{-3 \cdot 2 \cdot (2-1)}{6} + \frac{C_2}{6} \, 3 \cdot 2 \cdot 2 = -1 + 2 C_2 . \]

Hence, \( C_2 = \frac{1}{2} \). Solving for \( C_1 \) using the first equation, we obtain \( C_1 = \frac{-1}{4} \). Our solution for
the beam deflection is

\[ y(x) = \frac{-(x-1)^3}{6} u(x-1) - \frac{x}{4} + \frac{x^3}{12} . \]
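It is easy to double-check this answer against the endpoint conditions. The short sketch below (function names are ours, chosen only for illustration) encodes the deflection and its second derivative, obtained by differentiating the formula on each side of 𝑥 = 1, and evaluates them at the supports.

```python
def deflection(x):
    """Beam deflection y(x) from Example 6.4.3 (length 2, EI = 1, unit load at x = 1)."""
    step = 1.0 if x >= 1 else 0.0          # Heaviside u(x - 1)
    return -step * (x - 1) ** 3 / 6 - x / 4 + x ** 3 / 12

def second_derivative(x):
    """y''(x): the cubic terms differentiate to -(x - 1) u(x - 1) + x/2."""
    step = 1.0 if x >= 1 else 0.0
    return -step * (x - 1) + x / 2

for x in (0.0, 2.0):
    print(x, deflection(x), second_derivative(x))  # all should be (numerically) zero
print(deflection(1.0))  # deflection under the load, -1/6
```

Note also that 𝑦′′′ equals 1/2 to the left of the load and −1/2 to the right, so it jumps by −1 at 𝑥 = 1, which is what the term −𝛿(𝑥 − 1) in the equation requires.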

6.4.5 Exercises
Exercise 6.4.1: Solve (find the impulse response) 𝑥 ′′ + 𝑥 ′ + 𝑥 = 𝛿(𝑡), 𝑥(0) = 0, 𝑥 ′(0) = 0.

Exercise 6.4.2: Solve (find the impulse response) 𝑥 ′′ + 2𝑥 ′ + 𝑥 = 𝛿(𝑡), 𝑥(0) = 0, 𝑥 ′(0) = 0.

Exercise 6.4.3: A pulse can come later and can be bigger. Solve 𝑥 ′′ + 4𝑥 = 4𝛿(𝑡 − 1), 𝑥(0) = 0,
𝑥 ′(0) = 0.

Exercise 6.4.4: Suppose that 𝑓 (𝑡) and 𝑔(𝑡) are differentiable functions (and the derivatives are
continuous) and suppose that 𝑓 (𝑡) = 𝑔(𝑡) = 0 for all 𝑡 ≤ 0. Show that

( 𝑓 ∗ 𝑔)′(𝑡) = ( 𝑓 ′ ∗ 𝑔)(𝑡) = ( 𝑓 ∗ 𝑔 ′)(𝑡).

Exercise 6.4.5: Suppose that 𝐿𝑥 = 𝛿(𝑡), 𝑥(0) = 0, 𝑥 ′(0) = 0, has the solution 𝑥(𝑡) = 𝑡𝑒 −𝑡 for
𝑡 > 0. Find the solution to 𝐿𝑥 = 𝑡 2 , 𝑥(0) = 0, 𝑥 ′(0) = 0 for 𝑡 > 0.
Exercise 6.4.6: Compute \( \mathcal{L}^{-1}\left\{ \frac{s^2+s+1}{s^2} \right\} \).

Exercise 6.4.7 (challenging): Solve Example 6.4.3 via integrating 4 times in the 𝑥 variable.

Exercise 6.4.8: Suppose we have a beam of length 1 simply supported at the ends and suppose that
force 𝐹 = 1 is applied at 𝑥 = 3/4 in the downward direction. Suppose that 𝐸𝐼 = 1 for simplicity.
Find the beam deflection 𝑦(𝑥).

Exercise 6.4.101: Solve (find the impulse response) 𝑥 ′′ = 𝛿(𝑡), 𝑥(0) = 0, 𝑥 ′(0) = 0.

Exercise 6.4.102: Solve (find the impulse response) 𝑥 ′ + 𝑎𝑥 = 𝛿(𝑡), 𝑥(0) = 0, 𝑥 ′(0) = 0.

Exercise 6.4.103: Suppose that 𝐿𝑥 = 𝛿(𝑡), 𝑥(0) = 0, 𝑥 ′(0) = 0, has the solution 𝑥(𝑡) = 𝑒 𝑡 sin(𝑡)
for 𝑡 > 0. Find (in closed form) the solution to 𝐿𝑥 = 𝑒 𝑡 , 𝑥(0) = 0, 𝑥 ′(0) = 0 for 𝑡 > 0.
Exercise 6.4.104: Compute \( \mathcal{L}^{-1}\left\{ \frac{s^2}{s^2+1} \right\} \).

Exercise 6.4.105: Compute \( \mathcal{L}^{-1}\left\{ \frac{3 s^2 e^{-s} + 2}{s^2} \right\} \).

6.5 Solving PDEs with the Laplace transform


Note: 1–1.5 lecture, can be skipped
The Laplace transform comes from the same family of transforms as does the Fourier
series‗ , which we used in chapter 4 to solve partial differential equations (PDEs). It is
therefore not surprising that we can also solve PDEs with the Laplace transform.
Given a PDE in two independent variables 𝑥 and 𝑡, we use the Laplace transform on
one of the variables (taking the transform of everything in sight), and derivatives in that
variable become multiplications by the transformed variable 𝑠. The PDE becomes an ODE,
which we solve. Afterwards, we invert the transform to find a solution to the original
problem. It is best to see the procedure on an example.
Example 6.5.1: Consider the first-order PDE

𝑦𝑡 = −𝛼𝑦 𝑥 , for 𝑥 > 0, 𝑡 > 0,

with side conditions


𝑦(0, 𝑡) = 𝐶, 𝑦(𝑥, 0) = 0.
We will assume 𝛼 > 0 is a constant. This equation is called the convection equation or
sometimes the transport equation, and it already made an appearance in § 1.9, with different
conditions. See Figure 6.5 for a diagram of the setup.
A physical setup of this equation is a river of solid goo, as we do not want anything to
diffuse. The function 𝑦 is the concentration of some toxic substance†. The variable 𝑥 denotes
position where 𝑥 = 0 is the location of a factory spewing the toxic substance into the
river. The toxic substance flows into the river so that at 𝑥 = 0 the concentration is always 𝐶.
We wish to see what happens past the factory, that is, at 𝑥 > 0. Let 𝑡 be the time, and assume
the factory started operations at 𝑡 = 0, so that at 𝑡 = 0 the river is just pure goo.

Figure 6.5: Transport equation on a half line.
Consider a function of two variables 𝑦(𝑥, 𝑡). Let us fix 𝑥 and transform the 𝑡 variable.
For convenience, we treat the transformed 𝑠 variable as a parameter, since there are no
derivatives in 𝑠. That is, we write 𝑌(𝑥) for the transformed function, and treat it as a
function of 𝑥, leaving 𝑠 as a parameter.
\[ Y(x) = \mathcal{L}\bigl\{ y(x,t) \bigr\} = \int_0^\infty y(x,t) e^{-st} \, dt . \]
‗There is also a Fourier transform on the real line that looks sort of like the Laplace transform.
† It’s a river of goo already, we’re not hurting the environment much more.

The transform of a derivative with respect to 𝑥 is just differentiating the transformed
function:

\[ \mathcal{L}\bigl\{ y_x(x,t) \bigr\} = \int_0^\infty y_x(x,t) e^{-st} \, dt = \frac{d}{dx} \left[ \int_0^\infty y(x,t) e^{-st} \, dt \right] = Y'(x) . \]

To transform the derivative in 𝑡 (the variable being transformed), we use the rules from
§ 6.2:

\[ \mathcal{L}\bigl\{ y_t(x,t) \bigr\} = s Y(x) - y(x,0) . \]

In our specific case, 𝑦(𝑥, 0) = 0, and so \( \mathcal{L}\{ y_t(x,t) \} = s Y(x) \). We transform the equation
to find

\[ s Y(x) = -\alpha Y'(x) . \]
This ODE needs an initial condition. The initial condition is the other side condition of the
PDE, the one given at 𝑥 = 0. Everything is transformed, so we must also transform
this condition:

\[ Y(0) = \mathcal{L}\bigl\{ y(0,t) \bigr\} = \mathcal{L}\{ C \} = \frac{C}{s} . \]

We solve the ODE problem 𝑠𝑌(𝑥) = −𝛼𝑌′(𝑥), \( Y(0) = \frac{C}{s} \), to find

\[ Y(x) = \frac{C}{s} e^{-\frac{s}{\alpha} x} . \]
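We are not done: we have 𝑌(𝑥), but we really want 𝑦(𝑥, 𝑡). We transform the 𝑠 variable
back to 𝑡. Let

\[
u(t) =
\begin{cases}
0 & \text{if } t < 0, \\
1 & \text{otherwise}
\end{cases}
\]

be the Heaviside function. As

\[ \mathcal{L}\bigl\{ u(t-a) \bigr\} = \int_0^\infty u(t-a) e^{-st} \, dt = \int_a^\infty e^{-st} \, dt = \frac{e^{-as}}{s} , \]

we find

\[ y(x,t) = \mathcal{L}^{-1}\left\{ \frac{C}{s} e^{-\frac{s}{\alpha} x} \right\} = C u\bigl( t - x/\alpha \bigr) . \]

In other words,

\[
y(x,t) =
\begin{cases}
0 & \text{if } t < x/\alpha, \\
C & \text{otherwise.}
\end{cases}
\]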
See Figure 6.6 on the next page for a diagram of this solution. The line of slope 1/𝛼 indicates
the wavefront of the toxic substance in the picture as it is leaving the factory. What the
equation does is simply move the initial condition to the right at speed 𝛼.
Shhh… 𝑦 is not differentiable, it is not even continuous (nobody ever seems to notice).
How could we plug something that’s not differentiable into the equation? Well, just think
of a differentiable function very very close to 𝑦. Or, if you recognize the derivative of the
Heaviside function as the delta function, then all is well too:

\[ y_t(x,t) = \frac{\partial}{\partial t} \Bigl[ C u\bigl( t - x/\alpha \bigr) \Bigr] = C u'\bigl( t - x/\alpha \bigr) = C \delta\bigl( t - x/\alpha \bigr) \]

and

\[ y_x(x,t) = \frac{\partial}{\partial x} \Bigl[ C u\bigl( t - x/\alpha \bigr) \Bigr] = -\frac{C}{\alpha} u'\bigl( t - x/\alpha \bigr) = -\frac{C}{\alpha} \delta\bigl( t - x/\alpha \bigr) . \]

So \( y_t = -\alpha y_x \).

Figure 6.6: Wavefront of toxic substance is a line of slope 1/𝛼.
The Laplace transform is very good with constant-coefficient equations. One advantage
of Laplace is that it easily handles nonhomogeneous side conditions. Let us try a more
complicated example.
Example 6.5.2: Consider

𝑦𝑡 + 𝑦 𝑥 + 𝑦 = 0, for 𝑥 > 0, 𝑡 > 0,


𝑦(0, 𝑡) = sin(𝑡), 𝑦(𝑥, 0) = 0.

Again, we transform 𝑡, and we write 𝑌(𝑥) for the transformed function. As 𝑦(𝑥, 0) = 0,
we find

\[ s Y(x) + Y'(x) + Y(x) = 0, \qquad Y(0) = \frac{1}{s^2+1} . \]

The solution of the transformed equation is

\[ Y(x) = \frac{1}{s^2+1} e^{-(s+1)x} = \frac{1}{s^2+1} e^{-xs} e^{-x} . \]

Using the second shifting property (6.1) and linearity of the transform, we obtain the
solution

\[ y(x,t) = e^{-x} \sin(t-x) u(t-x) . \]
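We can test this solution numerically away from the wavefront 𝑡 = 𝑥. The sketch below is only an illustration: the names `y` and `residual`, the step size, and the sample points are our own assumptions. It estimates 𝑦𝑡 + 𝑦𝑥 + 𝑦 by central differences and checks the side condition at 𝑥 = 0.

```python
import math

def y(x, t):
    """Solution of Example 6.5.2: e^{-x} sin(t - x) u(t - x)."""
    return math.exp(-x) * math.sin(t - x) if t >= x else 0.0

def residual(x, t, h=1e-5):
    """Central-difference estimate of y_t + y_x + y at the point (x, t)."""
    y_t = (y(x, t + h) - y(x, t - h)) / (2 * h)
    y_x = (y(x + h, t) - y(x - h, t)) / (2 * h)
    return y_t + y_x + y(x, t)

print(residual(0.5, 2.0))        # behind the wavefront (t > x): close to 0
print(residual(2.0, 0.5))        # ahead of the wavefront (t < x): exactly 0
print(y(0.0, 1.0), math.sin(1))  # side condition y(0, t) = sin(t)
```

The difference quotients are only meaningful at points whose neighbors lie on the same side of the line 𝑡 = 𝑥, which is why the sample points are chosen away from it.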

We can also detect when the problem is ill-posed in the sense that it has no solution. Let
us change the equation to

− 𝑦𝑡 + 𝑦 𝑥 = 0, for 𝑥 > 0, 𝑡 > 0,


𝑦(0, 𝑡) = sin(𝑡), 𝑦(𝑥, 0) = 0.

Then the problem has no solution. First, let us see this in the language of § 1.9. The
characteristic curves are 𝑡 = −𝑥 + 𝐶. If 𝜏 is the characteristic coordinate, then we find the
equation 𝑦𝜏 = 0 along the curve, meaning a solution is constant along characteristic curves.
But these curves intersect both the 𝑥-axis and the 𝑡-axis. For example, the curve 𝑡 = −𝑥 + 1
intersects at (1, 0) and (0, 1). The solution is constant along the curve so 𝑦(1, 0) should
equal 𝑦(0, 1). But 𝑦(1, 0) = 0 and 𝑦(0, 1) = sin(1) ≠ 0. See Figure 6.7.

Figure 6.7: Ill-posed problem. The characteristic curve 𝑡 = −𝑥 + 1 joins (1, 0), where 𝑦 = 0, to (0, 1), where 𝑦 = sin(1).

Now consider the transform. The transformed problem is

\[ -s Y(x) + Y'(x) = 0, \qquad Y(0) = \frac{1}{s^2+1} , \]

and the solution ought to be

\[ Y(x) = \frac{1}{s^2+1} e^{sx} . \]

Importantly, this Laplace transform does not decay to zero at infinity! That is, since 𝑥 > 0
in the region of interest,

\[ \lim_{s \to \infty} \frac{1}{s^2+1} e^{sx} = \infty \neq 0 . \]
It almost looks as if we could use the shifting property, but notice that the shift is in the
wrong direction.
Of course, we need not restrict ourselves to first-order equations, although the compu-
tations become more involved for higher-order equations.

Example 6.5.3: Let us use Laplace for the following problem:

𝑦𝑡 = 𝑦 𝑥𝑥 , 0 < 𝑥 < ∞, 𝑡 > 0,


𝑦 𝑥 (0, 𝑡) = 𝑓 (𝑡),
𝑦(𝑥, 0) = 0.

This problem corresponds to a half-infinite insulated rod with a given heat flux at one
end. The setup would come up in real life in the case of a sufficiently long rod, where
we would not consider a long enough time for our solution to notice that we do not have
an infinite rod but just a very long one. In any case, the real-life situation imposes other
conditions on our solution 𝑦, for example, we will assume that the solution is bounded.
This boundedness condition will stand in for a boundary condition at the infinite end.
Boundedness is also sufficient for the Laplace transform in the 𝑡 variable to exist.
Transform the equation in the 𝑡 variable to find

𝑠𝑌(𝑥) = 𝑌 ′′(𝑥).

The general solution to this ODE is

\[ Y(x) = A e^{\sqrt{s}\, x} + B e^{-\sqrt{s}\, x} . \]
Note that 𝐴 and 𝐵 depend on 𝑠. As 𝑦 is bounded, then 𝑌(𝑥) is bounded for any fixed 𝑠 > 0,
so for 𝑌(𝑥) to stay bounded as 𝑥 → ∞, we must have 𝐴 = 0.
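Now consider the boundary condition at 𝑥 = 0. Transforming it gives 𝑌′(0) = 𝐹(𝑠), and so
\( -\sqrt{s}\, B = F(s) \). In other words,

\[ Y(x) = -F(s) \frac{1}{\sqrt{s}} e^{-\sqrt{s}\, x} . \]

If we look up the inverse transform in a table such as the one in Appendix B (or we spend
the afternoon doing calculus), we find

\[ \mathcal{L}^{-1}\left\{ \frac{1}{\sqrt{s}} e^{-\sqrt{s}\, x} \right\} = \frac{1}{\sqrt{\pi t}} e^{\frac{-x^2}{4t}} . \]

So

\[ y(x,t) = \mathcal{L}^{-1}\left\{ F(s) \frac{-1}{\sqrt{s}} e^{-\sqrt{s}\, x} \right\} = \int_0^t f(\tau) \frac{-1}{\sqrt{\pi (t-\tau)}} e^{\frac{-x^2}{4(t-\tau)}} \, d\tau . \]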
Laplace can solve problems where separation of variables fails. Laplace does not mind
nonhomogeneity, but it is essentially only useful for constant-coefficient equations.
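The function taken from the table in the example above, \( \frac{1}{\sqrt{\pi t}} e^{-x^2/(4t)} \), should itself solve the heat equation 𝑦𝑡 = 𝑦𝑥𝑥 for 𝑡 > 0, since the convolution is built out of it. A quick finite-difference sketch can confirm this; the names `kernel` and `heat_residual`, the step size, and the sample points are assumptions for illustration.

```python
import math

def kernel(x, t):
    """k(x, t) = exp(-x^2 / (4t)) / sqrt(pi t), defined for t > 0."""
    return math.exp(-x * x / (4 * t)) / math.sqrt(math.pi * t)

def heat_residual(x, t, h=1e-3):
    """Finite-difference estimate of k_t - k_xx at (x, t)."""
    k_t = (kernel(x, t + h) - kernel(x, t - h)) / (2 * h)
    k_xx = (kernel(x + h, t) - 2 * kernel(x, t) + kernel(x - h, t)) / (h * h)
    return k_t - k_xx

for point in ((1.0, 1.0), (0.5, 2.0), (2.0, 0.7)):
    print(point, heat_residual(*point))  # each residual should be close to 0
```

The residuals are small but not exactly zero, because the derivatives are only approximated.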

6.5.1 Exercises
Exercise 6.5.1: Solve

𝑦𝑡 + 𝑦 𝑥 = 1, 0 < 𝑥 < ∞, 𝑡 > 0,


𝑦(0, 𝑡) = 1, 𝑦(𝑥, 0) = 0.

Exercise 6.5.2: Solve

𝑦𝑡 + 𝛼𝑦 𝑥 = 0, 0 < 𝑥 < ∞, 𝑡 > 0,


𝑦(0, 𝑡) = 𝑡, 𝑦(𝑥, 0) = 0.

Exercise 6.5.3: Solve

𝑦𝑡 + 2𝑦 𝑥 = 𝑥 + 𝑡, 0 < 𝑥 < ∞, 𝑡 > 0,


𝑦(0, 𝑡) = 0, 𝑦(𝑥, 0) = 0.

Exercise 6.5.4: For an 𝛼 > 0, solve

𝑦𝑡 + 𝛼𝑦 𝑥 + 𝑦 = 0, 0 < 𝑥 < ∞, 𝑡 > 0,


𝑦(0, 𝑡) = sin(𝑡), 𝑦(𝑥, 0) = 0.

Exercise 6.5.5: Find the corresponding ODE problem for 𝑌(𝑥), after transforming the 𝑡 variable

𝑦𝑡𝑡 + 3𝑦 𝑥𝑥 + 𝑦 𝑥𝑡 + 3𝑦 𝑥 + 𝑦 = sin(𝑥) + 𝑡, 0 < 𝑥 < 1, 𝑡 > 0,


𝑦(0, 𝑡) = 1, 𝑦(1, 𝑡) = 𝑡, 𝑦(𝑥, 0) = 1 − 𝑥, 𝑦𝑡 (𝑥, 0) = 1.

Do not solve the problem.

Exercise 6.5.6: Write down a solution to

𝑦𝑡 = 𝑦 𝑥𝑥 , 0 < 𝑥 < ∞, 𝑡 > 0,


\[ y_x(0,t) = e^{-t}, \qquad y(x,0) = 0, \]

as a definite integral (convolution). Assume that 𝑦 is bounded.

Exercise 6.5.7: Use the Laplace transform in 𝑡 to solve

𝑦𝑡𝑡 = 𝑦 𝑥𝑥 , −∞ < 𝑥 < ∞, 𝑡 > 0,


𝑦(𝑥, 0) = 0, 𝑦𝑡 (𝑥, 0) = sin(𝑥).

Assume that 𝑦 is bounded. Hint: Note that for any fixed 𝑠 > 0, \( e^{sx} \) blows up as 𝑥 → +∞ and \( e^{-sx} \)
blows up as 𝑥 → −∞.

Exercise 6.5.101: Solve

𝑦𝑡 + 𝑦 𝑥 = 1, 0 < 𝑥 < ∞, 𝑡 > 0,


𝑦(0, 𝑡) = 0, 𝑦(𝑥, 0) = 0.

Exercise 6.5.102: For a 𝑐 > 0, solve

𝑦𝑡 + 𝑦 𝑥 + 𝑐𝑦 = 0, 0 < 𝑥 < ∞, 𝑡 > 0,


𝑦(0, 𝑡) = sin(𝑡), 𝑦(𝑥, 0) = 0.

Exercise 6.5.103: Find the corresponding ODE problem for 𝑌(𝑥), after transforming the 𝑡 variable

𝑦𝑡𝑡 + 3𝑦 𝑥𝑥 + 𝑦 = 𝑥 + 𝑡, −1 < 𝑥 < 1, 𝑡 > 0,


𝑦(−1, 𝑡) = 0, 𝑦(1, 𝑡) = 0, 𝑦(𝑥, 0) = (1 − 𝑥 2 ), 𝑦𝑡 (𝑥, 0) = 0.

Do not solve the problem.

Exercise 6.5.104: Use the Laplace transform in 𝑡 to solve

𝑦𝑡𝑡 = 𝑦 𝑥𝑥 , −∞ < 𝑥 < ∞, 𝑡 > 0,


𝑦(𝑥, 0) = sin(𝑥), 𝑦𝑡 (𝑥, 0) = 0.

Assume that 𝑦 is bounded. Hint: Note that for any fixed 𝑠 > 0, \( e^{sx} \) blows up as 𝑥 → +∞ and \( e^{-sx} \)
blows up as 𝑥 → −∞.

Exercise 6.5.105: Use the Laplace transform in 𝑡 to solve

𝑦𝑡 = 𝑦 𝑥𝑥 , 0 < 𝑥 < ∞, 𝑡 > 0,


𝑦(0, 𝑡) = 𝑓 (𝑡), 𝑦(𝑥, 0) = 0,

where 𝑓 (𝑡) is some function. Assume that 𝑦 is bounded. Give the answer as a convolution.
Chapter 7

Power-series methods

7.1 Power series


Note: 1.5 or 2 lectures, §8.1 in [EP], §5.1 in [BD]
Many functions can be written in terms of a power series

\[ \sum_{k=0}^\infty a_k (x - x_0)^k . \]

Assuming a solution of a differential equation is a power series, we can perhaps use a
method reminiscent of undetermined coefficients: we try to solve for the numbers \( a_k \).
Before we carry out this process, we review some results and concepts about power series.

7.1.1 Definition
As we said, a power series is an expression such as

\[ \sum_{k=0}^\infty a_k (x - x_0)^k = a_0 + a_1 (x - x_0) + a_2 (x - x_0)^2 + a_3 (x - x_0)^3 + \cdots , \tag{7.1} \]

where \( a_0, a_1, a_2, \ldots, a_k, \ldots \) and \( x_0 \) are constants. Let

\[ S_n(x) = \sum_{k=0}^n a_k (x - x_0)^k = a_0 + a_1 (x - x_0) + a_2 (x - x_0)^2 + a_3 (x - x_0)^3 + \cdots + a_n (x - x_0)^n \]

denote the so-called partial sum. If for some 𝑥, the limit

\[ \lim_{n \to \infty} S_n(x) = \lim_{n \to \infty} \sum_{k=0}^n a_k (x - x_0)^k \]

exists, we say the series (7.1) converges at 𝑥. At \( x = x_0 \), the series always converges to \( a_0 \).
When (7.1) converges at any other \( x \neq x_0 \), we say (7.1) is a convergent power series, and we
write

\[ \sum_{k=0}^\infty a_k (x - x_0)^k = \lim_{n \to \infty} \sum_{k=0}^n a_k (x - x_0)^k . \]

If the series does not converge for any point \( x \neq x_0 \), we say that the series is divergent.
Example 7.1.1: The series

\[ \sum_{k=0}^\infty \frac{1}{k!} x^k = 1 + x + \frac{x^2}{2} + \frac{x^3}{6} + \cdots \]

is convergent for any 𝑥. Recall that 𝑘! = 1 · 2 · 3 · · · 𝑘 is the factorial. By convention we
define 0! = 1. You may recall that this series converges to \( e^x \).
We say that (7.1) converges absolutely at 𝑥 whenever the limit

\[ \lim_{n \to \infty} \sum_{k=0}^n |a_k| \, |x - x_0|^k \]

exists. That is, the series \( \sum_{k=0}^\infty |a_k| \, |x - x_0|^k \) is convergent. If (7.1) converges absolutely at 𝑥,
then it converges at 𝑥. However, the opposite implication is not true.
Example 7.1.2: The series

\[ \sum_{k=1}^\infty \frac{1}{k} x^k \]

converges absolutely for all 𝑥 in the interval (−1, 1). It converges at 𝑥 = −1, as \( \sum_{k=1}^\infty \frac{(-1)^k}{k} \)
converges (conditionally) by the alternating series test. The power series does not converge
absolutely at 𝑥 = −1, because \( \sum_{k=1}^\infty \frac{1}{k} \) does not converge. The series diverges at 𝑥 = 1.

7.1.2 Radius of convergence


If a power series converges absolutely at some 𝑥 1 , then for all 𝑥 such that |𝑥 − 𝑥 0 | ≤ |𝑥1 − 𝑥 0 |
(that is, 𝑥 is closer than 𝑥1 to 𝑥 0 ) we have |𝑎 𝑘 (𝑥 − 𝑥 0 ) 𝑘 | ≤ |𝑎 𝑘 (𝑥 1 − 𝑥 0 ) 𝑘 | for all 𝑘. As
the numbers |𝑎 𝑘 (𝑥1 − 𝑥 0 ) 𝑘 | sum to some finite limit, summing smaller positive numbers
|𝑎 𝑘 (𝑥 − 𝑥 0 ) 𝑘 | must also have a finite limit. Hence, the series must converge absolutely at 𝑥.
Theorem 7.1.1. For a power series (7.1), there exists a number 𝜌 (we allow 𝜌 = ∞) called the
radius of convergence such that the series converges absolutely on the interval (𝑥0 − 𝜌, 𝑥 0 + 𝜌)
and diverges for 𝑥 < 𝑥0 − 𝜌 and 𝑥 > 𝑥0 + 𝜌. We write 𝜌 = ∞ if the series converges for all 𝑥.
See Figure 7.1 on the next page. In Example 7.1.1, the radius of convergence is 𝜌 = ∞ as
the series converges everywhere. In Example 7.1.2, the radius of convergence is 𝜌 = 1. We
note that 𝜌 = 0 is another way of saying that the series is divergent.

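Figure 7.1: Convergence of a power series. The series diverges to the left of \( x_0 - \rho \), converges absolutely between \( x_0 - \rho \) and \( x_0 + \rho \), and diverges to the right of \( x_0 + \rho \).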

A useful test for convergence of a series is the ratio test. Suppose that

\[ \sum_{k=0}^\infty c_k \]

is a series and the limit

\[ L = \lim_{k \to \infty} \left| \frac{c_{k+1}}{c_k} \right| \]

exists. Then the series converges absolutely if 𝐿 < 1 and diverges if 𝐿 > 1.
We apply this test to the power series (7.1). Let \( c_k = a_k (x - x_0)^k \) in the test. Compute

\[ L = \lim_{k \to \infty} \left| \frac{c_{k+1}}{c_k} \right| = \lim_{k \to \infty} \left| \frac{a_{k+1} (x - x_0)^{k+1}}{a_k (x - x_0)^k} \right| = \lim_{k \to \infty} \left| \frac{a_{k+1}}{a_k} \right| |x - x_0| . \]

Define 𝐴 by

\[ A = \lim_{k \to \infty} \left| \frac{a_{k+1}}{a_k} \right| . \]

Then the series (7.1) converges absolutely if \( 1 > L = A |x - x_0| \). If 𝐴 > 0, then the series
converges absolutely if \( |x - x_0| < 1/A \), and diverges if \( |x - x_0| > 1/A \). That is, the radius of
convergence is 1/𝐴. If 𝐴 = 0, then the series always converges.
A similar test is the root test. Suppose

\[ L = \lim_{k \to \infty} \sqrt[k]{|c_k|} \]

exists. Then \( \sum_{k=0}^\infty c_k \) converges absolutely if 𝐿 < 1 and diverges if 𝐿 > 1. We can use the
same calculation as above to find 𝐴. Let us summarize.
Theorem 7.1.2 (Ratio and root tests for power series). Consider a power series

\[ \sum_{k=0}^\infty a_k (x - x_0)^k \]

such that

\[ A = \lim_{k \to \infty} \left| \frac{a_{k+1}}{a_k} \right| \qquad \text{or} \qquad A = \lim_{k \to \infty} \sqrt[k]{|a_k|} \]

exists. If 𝐴 = 0, then the radius of convergence of the series is ∞. Otherwise, the radius of
convergence is 1/𝐴.

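Example 7.1.3: Consider

\[ \sum_{k=0}^\infty 2^{-k} (x - 1)^k . \]

We compute the limit in the ratio test,

\[ A = \lim_{k \to \infty} \left| \frac{a_{k+1}}{a_k} \right| = \lim_{k \to \infty} \left| \frac{2^{-k-1}}{2^{-k}} \right| = \lim_{k \to \infty} 2^{-1} = \frac{1}{2} . \]

Therefore, the radius of convergence is 2, and the series converges absolutely on the interval
(−1, 3). We could just as well have used the root test:

\[ A = \lim_{k \to \infty} \sqrt[k]{|a_k|} = \lim_{k \to \infty} \sqrt[k]{|2^{-k}|} = \lim_{k \to \infty} 2^{-1} = \frac{1}{2} . \]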
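When the limits are awkward to compute by hand, one can at least look at the ratio and the root at a large but finite index. The following sketch uses hypothetical helper names and the arbitrarily chosen index 𝑘 = 50. It does not prove anything about the limit, but it suggests what 𝐴 should be.

```python
def ratio_estimate(a, k):
    """|a_{k+1} / a_k| at a finite index k; approximates A when the limit exists."""
    return abs(a(k + 1) / a(k))

def root_estimate(a, k):
    """|a_k|^(1/k) at a finite index k >= 1; approximates A when the limit exists."""
    return abs(a(k)) ** (1.0 / k)

def example_coefficient(k):
    """a_k = 2^{-k}, the coefficients from Example 7.1.3."""
    return 2.0 ** (-k)

A = ratio_estimate(example_coefficient, 50)
print(A, root_estimate(example_coefficient, 50))
print("estimated radius of convergence:", 1 / A)
```

For these coefficients both estimates are 1/2 (up to rounding in the root), giving radius 2 as in the example. For coefficients such as 1/𝑘 the ratio 𝑘/(𝑘 + 1) approaches 1 only slowly, so a finite index gives just a rough idea.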
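Example 7.1.4: Consider

\[ \sum_{k=0}^\infty \frac{1}{k^k} x^k . \]

Compute the limit for the root test,

\[ A = \lim_{k \to \infty} \sqrt[k]{|a_k|} = \lim_{k \to \infty} \sqrt[k]{\left| \frac{1}{k^k} \right|} = \lim_{k \to \infty} \sqrt[k]{\left| \frac{1}{k} \right|^k} = \lim_{k \to \infty} \frac{1}{k} = 0 . \]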

So the radius of convergence is ∞: The series converges everywhere. The ratio test would
also work here.
The root or the ratio test as given does not always apply. That is, the limit of \( \left| \frac{a_{k+1}}{a_k} \right| \)
or \( \sqrt[k]{|a_k|} \) might not exist. There exist more sophisticated ways of finding the radius of
convergence, but those would be beyond the scope of this chapter. The two methods above
cover many of the series that arise in practice. Often both tests apply, though the limit
might be easier to compute in one test than the other.
We remark that at the endpoints, 𝑥 = 𝑥 0 − 𝜌 and 𝑥 = 𝑥 0 + 𝜌, the series may or may not
converge, and the tests above say nothing about convergence there. Sometimes convergence
at the endpoints is important, but for our purposes, we will not worry about it much.

7.1.3 Analytic functions


Functions represented by power series are called analytic functions. Not every function is
analytic, although the majority of the functions you have seen in calculus are. An analytic
function 𝑓 (𝑥) is equal to its Taylor series‗ (a power series computed from 𝑓 ) near a point 𝑥 0 .
That is, for 𝑥 near \( x_0 \),

\[ f(x) = \sum_{k=0}^\infty \frac{f^{(k)}(x_0)}{k!} (x - x_0)^k = f(x_0) + f'(x_0)(x - x_0) + \frac{f''(x_0)}{2} (x - x_0)^2 + \cdots , \tag{7.2} \]

where \( f^{(k)}(x_0) \) denotes the 𝑘th derivative of 𝑓(𝑥) at the point \( x_0 \).


‗ Named after the English mathematician Sir Brook Taylor (1685–1731).

For example, sine is an analytic function and its Taylor series around \( x_0 = 0 \) is given by

\[ \sin(x) = \sum_{n=0}^\infty \frac{(-1)^n}{(2n+1)!} x^{2n+1} . \]

In Figure 7.2, we plot sin(𝑥) and the truncations of the series up to degree 5 and 9. You can
see that the approximation is very good for 𝑥 near 0, but gets worse for 𝑥 further away
from 0. This is what happens in general. To get a good approximation far away from 𝑥0
you need to take more and more terms of the Taylor series.
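The same behavior can be seen numerically. Below is a small sketch; the function name `sine_taylor` and the sample points 0.5 and 3 are our own choices. It sums the sine series up to a given degree and compares the result with the library sine.

```python
import math

def sine_taylor(x, degree):
    """Partial sum of the sine series about 0, keeping terms up to x**degree."""
    total = 0.0
    n = 0
    while 2 * n + 1 <= degree:
        total += (-1) ** n * x ** (2 * n + 1) / math.factorial(2 * n + 1)
        n += 1
    return total

for x in (0.5, 3.0):
    for degree in (5, 9):
        error = abs(sine_taylor(x, degree) - math.sin(x))
        print(f"x = {x}, degree {degree}: error {error:.2e}")
```

Near 0 the degree 5 truncation is already accurate, while at 𝑥 = 3 it is visibly off and the degree 9 truncation is needed to get close.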

Figure 7.2: The sine function and its Taylor approximations around \( x_0 = 0 \) of 5th and 9th degree.

7.1.4 Manipulating power series


One of the main properties of power series that we will use is that we can differentiate
them term by term. That is, suppose that \( \sum a_k (x - x_0)^k \) is a convergent power series. Then
for 𝑥 within the radius of convergence, we have

\[ \frac{d}{dx} \left[ \sum_{k=0}^\infty a_k (x - x_0)^k \right] = \sum_{k=1}^\infty k a_k (x - x_0)^{k-1} = a_1 + 2 a_2 (x - x_0) + 3 a_3 (x - x_0)^2 + \cdots . \]

Notice that the term corresponding to 𝑘 = 0 disappeared as it was constant. The radius of
convergence of the differentiated series is the same as that of the original.
Example 7.1.5: Let us show that the exponential \( y = e^x \) solves 𝑦′ = 𝑦. Suppose we did not
know that. Write

\[ y = e^x = \sum_{k=0}^\infty \frac{1}{k!} x^k . \]

Differentiate 𝑦,

\[ y' = \sum_{k=1}^\infty k \frac{1}{k!} x^{k-1} = \sum_{k=1}^\infty \frac{1}{(k-1)!} x^{k-1} . \]

We reindex the series by simply replacing 𝑘 with 𝑘 + 1. The series does not change, what
changes is simply how we write it. After reindexing the series starts at 𝑘 = 0 again.

\[ \sum_{k=1}^\infty \frac{1}{(k-1)!} x^{k-1} = \sum_{k+1=1}^\infty \frac{1}{\bigl( (k+1) - 1 \bigr)!} x^{(k+1)-1} = \sum_{k=0}^\infty \frac{1}{k!} x^k . \]

That was precisely the power series for \( e^x \) we started with, so we showed that \( \frac{d}{dx} [ e^x ] = e^x \).
Convergent power series can be added and multiplied together, and multiplied by
constants using the following rules. First, we can add series by adding term by term,

\[ \left( \sum_{k=0}^\infty a_k (x - x_0)^k \right) + \left( \sum_{k=0}^\infty b_k (x - x_0)^k \right) = \sum_{k=0}^\infty (a_k + b_k)(x - x_0)^k . \]

We can multiply by constants,

\[ \alpha \left( \sum_{k=0}^\infty a_k (x - x_0)^k \right) = \sum_{k=0}^\infty \alpha a_k (x - x_0)^k . \]

We can also multiply series together,

\[ \left( \sum_{k=0}^\infty a_k (x - x_0)^k \right) \left( \sum_{k=0}^\infty b_k (x - x_0)^k \right) = \sum_{k=0}^\infty c_k (x - x_0)^k , \]

where \( c_k = a_0 b_k + a_1 b_{k-1} + \cdots + a_k b_0 \). The radius of convergence of the sum or the product
is at least the minimum of the radii of convergence of the two series involved.

7.1.5 Power series for rational functions


Polynomials are simply finite power series. That is, a polynomial is a power series where
the 𝑎 𝑘 are zero for all 𝑘 large enough. We can always expand a polynomial as a power series
about any point 𝑥0 by writing the polynomial as a polynomial in (𝑥 − 𝑥 0 ). For example, let
us write 2𝑥 2 − 3𝑥 + 4 as a power series around 𝑥0 = 1:

2𝑥 2 − 3𝑥 + 4 = 3 + (𝑥 − 1) + 2(𝑥 − 1)2 .

In other words, 𝑎 0 = 3, 𝑎 1 = 1, 𝑎2 = 2, and all other 𝑎 𝑘 = 0. To do this, we know that 𝑎 𝑘 = 0


for all 𝑘 ≥ 3 as the polynomial is of degree 2. We write 𝑎 0 + 𝑎1 (𝑥 − 1) + 𝑎 2 (𝑥 − 1)2 , we
expand, and we solve for 𝑎 0 , 𝑎1 , and 𝑎2 . We could have also differentiated at 𝑥 = 1 and
used the Taylor series formula (7.2).
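For polynomials of higher degree, expanding and matching coefficients by hand gets tedious. One mechanical alternative, not used in the text above and shown here only as a sketch, is repeated synthetic division by (𝑥 − 𝑥0). The function name `shift_polynomial` and the coefficient ordering (constant term first) are our own conventions.

```python
def shift_polynomial(coeffs, x0):
    """Re-expand a polynomial about x0.

    coeffs[k] is the coefficient of x**k; the result b satisfies
    sum(coeffs[k] * x**k) == sum(b[k] * (x - x0)**k).
    """
    b = list(coeffs)
    n = len(b)
    # Each pass is one synthetic division by (x - x0); the remainder lands in b[i].
    for i in range(n):
        for j in range(n - 2, i - 1, -1):
            b[j] += x0 * b[j + 1]
    return b

# 2x^2 - 3x + 4 around x0 = 1, with coefficients listed from the constant term up.
print(shift_polynomial([4, -3, 2], 1))  # [3, 1, 2], i.e. 3 + (x - 1) + 2(x - 1)^2
```

The output reproduces the coefficients 3, 1, 2 found above.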

Let us look at rational functions, that is, ratios of polynomials. An important fact is
that a series for a function only defines the function on an interval even if the function is
defined elsewhere. For example, for −1 < 𝑥 < 1,

\[ \frac{1}{1-x} = \sum_{k=0}^\infty x^k = 1 + x + x^2 + \cdots \]

This series is called the geometric series. The ratio test tells us that the radius of convergence
is 1. The series diverges for 𝑥 ≤ −1 and 𝑥 ≥ 1, even though \( \frac{1}{1-x} \) is defined for all 𝑥 ≠ 1.
We can use the geometric series together with rules for addition and multiplication of
power series to expand rational functions around a point, as long as the denominator is
not zero at 𝑥 0 . Note that as for polynomials, we could equivalently use the Taylor series
expansion (7.2).
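Example 7.1.6: Expand \( \frac{x}{1+2x+x^2} \) as a power series around the origin (\( x_0 = 0 \)) and find the
radius of convergence.
First, write \( 1 + 2x + x^2 = (1+x)^2 = \bigl( 1 - (-x) \bigr)^2 \). Compute

\[
\begin{aligned}
\frac{x}{1+2x+x^2} & = x \left( \frac{1}{1-(-x)} \right)^2 \\
& = x \left( \sum_{k=0}^\infty (-1)^k x^k \right)^2 \\
& = x \left( \sum_{k=0}^\infty c_k x^k \right) \\
& = \sum_{k=0}^\infty c_k x^{k+1} ,
\end{aligned}
\]

where to get \( c_k \), we use the formula for the product of series: \( c_0 = 1 \), \( c_1 = -1 - 1 = -2 \),
\( c_2 = 1 + 1 + 1 = 3 \), etc. Therefore

\[ \frac{x}{1+2x+x^2} = \sum_{k=1}^\infty (-1)^{k+1} k x^k = x - 2x^2 + 3x^3 - 4x^4 + \cdots \]

The radius of convergence is at least 1. We use the ratio test

\[ \lim_{k \to \infty} \left| \frac{a_{k+1}}{a_k} \right| = \lim_{k \to \infty} \left| \frac{(-1)^{k+2} (k+1)}{(-1)^{k+1} k} \right| = \lim_{k \to \infty} \frac{k+1}{k} = 1 . \]

So the radius of convergence is actually equal to 1.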
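The coefficients in the example can be reproduced mechanically with the product formula \( c_k = a_0 b_k + a_1 b_{k-1} + \cdots + a_k b_0 \). The sketch below works with truncated coefficient lists; the function name `cauchy_product` and the truncation at six terms are assumptions for illustration.

```python
def cauchy_product(a, b):
    """Coefficients c_k = a_0 b_k + a_1 b_{k-1} + ... + a_k b_0 of a truncated product.

    Only as many coefficients as the shorter input are returned, since later
    ones would need terms that were truncated away.
    """
    n = min(len(a), len(b))
    return [sum(a[i] * b[k - i] for i in range(k + 1)) for k in range(n)]

geometric = [(-1) ** k for k in range(6)]   # 1/(1+x) = sum of (-1)^k x^k
c = cauchy_product(geometric, geometric)     # coefficients of 1/(1+x)^2
series = [0] + c                             # multiplying by x shifts every power up by one

print(c)       # [1, -2, 3, -4, 5, -6]
print(series)  # [0, 1, -2, 3, -4, 5, -6], i.e. x - 2x^2 + 3x^3 - ...
```

The shifted list matches the coefficients \( (-1)^{k+1} k \) found in the example.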
When the rational function is more complicated, it is also possible to use the method of
partial fractions. For example, to find the Taylor series for \( \frac{x^3+x}{x^2-1} \), we write

\[ \frac{x^3+x}{x^2-1} = x + \frac{1}{1+x} - \frac{1}{1-x} = x + \sum_{k=0}^\infty (-1)^k x^k - \sum_{k=0}^\infty x^k = -x + \sum_{\substack{k=3 \\ k \text{ odd}}}^\infty (-2) x^k . \]

7.1.6 Exercises

Exercise 7.1.1: Is $\sum_{k=0}^\infty e^k x^k$ a convergent power series? If so, find the radius of convergence.
Exercise 7.1.2: Is $\sum_{k=0}^\infty k x^k$ a convergent power series? If so, find the radius of convergence.
Exercise 7.1.3: Is $\sum_{k=0}^\infty k! x^k$ a convergent power series? If so, find the radius of convergence.
Exercise 7.1.4: Is $\sum_{k=0}^\infty \frac{1}{(2k)!} (x-10)^k$ a convergent power series? If so, find the radius of convergence.
Exercise 7.1.5: Determine the Taylor series for sin 𝑥 around the point 𝑥0 = 𝜋.
Exercise 7.1.6: Find the Taylor series for ln 𝑥 around the point 𝑥0 = 1, and the radius of convergence.
Exercise 7.1.7: Determine the Taylor series and its radius of convergence of $\frac{1}{1+x}$ around $x_0 = 0$.
Exercise 7.1.8: Determine the Taylor series and its radius of convergence of $\frac{x}{4-x^2}$ around $x_0 = 0$.
Hint: You will not be able to use the ratio test.
Exercise 7.1.9: Expand $x^5 + 5x + 1$ as a power series around $x_0 = 5$.

Exercise 7.1.10: Suppose that the ratio test applies to a series $\sum_{k=0}^\infty a_k x^k$. Show, using the ratio test,
that the radius of convergence of the differentiated series is the same as that of the original series.
Exercise 7.1.11: Suppose that $f$ is an analytic function such that $f^{(n)}(0) = n$. Find $f(1)$.

Exercise 7.1.101: Is $\sum_{n=1}^\infty (0.1)^n x^n$ a convergent power series? If so, find the radius of convergence.
Exercise 7.1.102 (challenging): Is $\sum_{n=1}^\infty \frac{n!}{n^n} x^n$ a convergent power series? If so, find the radius of
convergence.
Exercise 7.1.103: Using the geometric series, expand $\frac{1}{1-x}$ around $x_0 = 2$. For what $x$ does the
series converge?
Exercise 7.1.104 (challenging): Find the Taylor series for $x^7 e^x$ around $x_0 = 0$.
Exercise 7.1.105 (challenging): Imagine $f$ and $g$ are analytic functions such that $f^{(k)}(0) = g^{(k)}(0)$
for all large enough $k$. What can you say about $f(x) - g(x)$?
Exercise 7.1.106: Reindex the following series to have the powers be $x^k$.

a) $\sum_{k=3}^\infty k(k-1) x^{k+3}$    b) $\sum_{k=1}^\infty (k+1) x^{k-1}$    c) $\sum_{k=5}^\infty 2k x^{k-2}$

7.2 Series solutions of linear second-order ODEs


Note: 1.5 or 2 lectures, §8.2 in [EP], §5.2 and §5.3 in [BD]
Consider a linear second-order homogeneous ODE of the form

𝑃(𝑥)𝑦 ′′ + 𝑄(𝑥)𝑦 ′ + 𝑅(𝑥)𝑦 = 0.

Suppose that 𝑃(𝑥), 𝑄(𝑥), and 𝑅(𝑥) are polynomials. We will try a solution of the form

\[
y = \sum_{k=0}^\infty a_k (x - x_0)^k
\]

and solve for the 𝑎 𝑘 to try to obtain a solution defined in some interval around 𝑥0 .
The point $x_0$ is called an ordinary point if $P(x_0) \neq 0$. That is, the functions
\[
\frac{Q(x)}{P(x)} \qquad \text{and} \qquad \frac{R(x)}{P(x)}
\]
are defined for $x$ near $x_0$. If $P(x_0) = 0$, then we say $x_0$ is a singular point. Handling singular
points is harder than ordinary points and so we now focus only on ordinary points.
Example 7.2.1: We start with a very simple example

𝑦 ′′ − 𝑦 = 0.

Let us try a power series solution near 𝑥0 = 0, which is an ordinary point. Every point is
an ordinary point in fact, as the equation is a constant-coefficient one. We already know
we should obtain exponentials or the hyperbolic sine and cosine, but let us pretend we do
not know this fact.
We try
\[
y = \sum_{k=0}^\infty a_k x^k .
\]
If we differentiate, the $k = 0$ term is a constant and hence disappears. We get
\[
y' = \sum_{k=1}^\infty k a_k x^{k-1} .
\]
We differentiate yet again to obtain (now the $k = 1$ term disappears)
\[
y'' = \sum_{k=2}^\infty k(k-1) a_k x^{k-2} .
\]
We reindex the series (replace $k$ with $k + 2$) to obtain
\[
y'' = \sum_{k=0}^\infty (k+2)(k+1) a_{k+2} x^k .
\]

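Now we plug $y$ and $y''$ into the differential equation
\[
\begin{aligned}
0 = y'' - y &= \left( \sum_{k=0}^\infty (k+2)(k+1) a_{k+2} x^k \right) - \left( \sum_{k=0}^\infty a_k x^k \right) \\
&= \sum_{k=0}^\infty \Bigl( (k+2)(k+1) a_{k+2} x^k - a_k x^k \Bigr) \\
&= \sum_{k=0}^\infty \bigl( (k+2)(k+1) a_{k+2} - a_k \bigr) x^k .
\end{aligned}
\]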

As $y'' - y$ is supposed to be equal to 0, we know that the coefficients of the resulting series
must be equal to 0. Therefore,
\[
(k+2)(k+1) a_{k+2} - a_k = 0, \qquad \text{or} \qquad a_{k+2} = \frac{a_k}{(k+2)(k+1)} .
\]
The equation above is called a recurrence relation for the coefficients of the power series. It
does not matter what 𝑎 0 or 𝑎 1 are. They can be arbitrary. But once we pick 𝑎 0 and 𝑎 1 , all
other coefficients are determined by the recurrence relation.
Let us see what the coefficients must be. First, 𝑎0 and 𝑎 1 are arbitrary. Then,
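\[
a_2 = \frac{a_0}{2}, \quad
a_3 = \frac{a_1}{(3)(2)}, \quad
a_4 = \frac{a_2}{(4)(3)} = \frac{a_0}{(4)(3)(2)}, \quad
a_5 = \frac{a_3}{(5)(4)} = \frac{a_1}{(5)(4)(3)(2)}, \quad \ldots
\]
For even $k$, that is, $k = 2n$, we have
\[
a_k = a_{2n} = \frac{a_0}{(2n)!} .
\]
For odd $k$, that is, $k = 2n + 1$, we have
\[
a_k = a_{2n+1} = \frac{a_1}{(2n+1)!} .
\]
We write down the series for the solution
\[
y = \sum_{k=0}^\infty a_k x^k
= \sum_{n=0}^\infty \left( \frac{a_0}{(2n)!} x^{2n} + \frac{a_1}{(2n+1)!} x^{2n+1} \right)
= a_0 \sum_{n=0}^\infty \frac{1}{(2n)!} x^{2n} + a_1 \sum_{n=0}^\infty \frac{1}{(2n+1)!} x^{2n+1} .
\]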

We recognize the two series as the hyperbolic sine and cosine. Therefore,
𝑦 = 𝑎0 cosh 𝑥 + 𝑎 1 sinh 𝑥.
Of course, in general we will not be able to recognize the series that appears, since
usually there will not be any elementary function that matches it. In that case, we will be
content with the series.
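When a series is not recognizable, its coefficients can still be generated from the recurrence relation and the partial sums evaluated numerically. Here is a minimal sketch for the recurrence of Example 7.2.1; the function names and the choice of 30 terms are ours, purely for illustration.

```python
import math


def series_coefficients(a0, a1, count):
    """First `count` coefficients from a_{k+2} = a_k / ((k+2)(k+1))."""
    a = [a0, a1]
    for k in range(count - 2):
        a.append(a[k] / ((k + 2) * (k + 1)))
    return a[:count]


def evaluate(coeffs, x):
    """Partial sum of the power series at x."""
    return sum(c * x ** k for k, c in enumerate(coeffs))


cosh_like = series_coefficients(1.0, 0.0, 30)   # a0 = 1, a1 = 0
sinh_like = series_coefficients(0.0, 1.0, 30)   # a0 = 0, a1 = 1

x = 1.5
print(evaluate(cosh_like, x), math.cosh(x))
print(evaluate(sinh_like, x), math.sinh(x))
```

The two printed pairs should agree to many digits, which is consistent with $y = a_0 \cosh x + a_1 \sinh x$.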
Example 7.2.2: Let us do a more complex example. Consider Airy's equation‗:
\[
y'' - xy = 0,
\]
near the point $x_0 = 0$. Note that $x_0 = 0$ is an ordinary point.
‗ Named after the English mathematician Sir George Biddell Airy (1801–1892).

We try
\[
y = \sum_{k=0}^\infty a_k x^k .
\]
We differentiate twice (as above) to obtain
\[
y'' = \sum_{k=2}^\infty k(k-1) a_k x^{k-2} .
\]

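We plug $y$ into the equation
\[
\begin{aligned}
0 = y'' - xy &= \left( \sum_{k=2}^\infty k(k-1) a_k x^{k-2} \right) - x \left( \sum_{k=0}^\infty a_k x^k \right) \\
&= \left( \sum_{k=2}^\infty k(k-1) a_k x^{k-2} \right) - \left( \sum_{k=0}^\infty a_k x^{k+1} \right) .
\end{aligned}
\]
We reindex to make things easier to sum
\[
\begin{aligned}
0 = y'' - xy &= \left( 2a_2 + \sum_{k=1}^\infty (k+2)(k+1) a_{k+2} x^k \right) - \left( \sum_{k=1}^\infty a_{k-1} x^k \right) \\
&= 2a_2 + \sum_{k=1}^\infty \bigl( (k+2)(k+1) a_{k+2} - a_{k-1} \bigr) x^k .
\end{aligned}
\]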

Again $y'' - xy$ is supposed to be 0, so $a_2 = 0$, and
\[
(k+2)(k+1) a_{k+2} - a_{k-1} = 0, \qquad \text{or} \qquad a_{k+2} = \frac{a_{k-1}}{(k+2)(k+1)} .
\]
We jump in steps of three. First, since $a_2 = 0$, we must have $a_5 = 0$, $a_8 = 0$, $a_{11} = 0$, etc. In
general, $a_{3n+2} = 0$.
The constants 𝑎0 and 𝑎 1 are arbitrary and we obtain
\[
a_3 = \frac{a_0}{(3)(2)}, \quad
a_4 = \frac{a_1}{(4)(3)}, \quad
a_6 = \frac{a_3}{(6)(5)} = \frac{a_0}{(6)(5)(3)(2)}, \quad
a_7 = \frac{a_4}{(7)(6)} = \frac{a_1}{(7)(6)(4)(3)}, \quad \ldots
\]
For $a_k$ where $k$ is a multiple of 3, that is, $k = 3n$, we notice that
\[
a_{3n} = \frac{a_0}{(2)(3)(5)(6) \cdots (3n-1)(3n)} .
\]
For $a_k$ where $k = 3n + 1$, we notice
\[
a_{3n+1} = \frac{a_1}{(3)(4)(6)(7) \cdots (3n)(3n+1)} .
\]

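In other words, if we write down the series for $y$, it has two parts
\[
\begin{aligned}
y &= \left( a_0 + \frac{a_0}{6} x^3 + \frac{a_0}{180} x^6 + \cdots + \frac{a_0}{(2)(3)(5)(6) \cdots (3n-1)(3n)} x^{3n} + \cdots \right) \\
&\quad + \left( a_1 x + \frac{a_1}{12} x^4 + \frac{a_1}{504} x^7 + \cdots + \frac{a_1}{(3)(4)(6)(7) \cdots (3n)(3n+1)} x^{3n+1} + \cdots \right) \\
&= a_0 \left( 1 + \frac{1}{6} x^3 + \frac{1}{180} x^6 + \cdots + \frac{1}{(2)(3)(5)(6) \cdots (3n-1)(3n)} x^{3n} + \cdots \right) \\
&\quad + a_1 \left( x + \frac{1}{12} x^4 + \frac{1}{504} x^7 + \cdots + \frac{1}{(3)(4)(6)(7) \cdots (3n)(3n+1)} x^{3n+1} + \cdots \right) .
\end{aligned}
\]
We define
\[
\begin{aligned}
y_1(x) &= 1 + \frac{1}{6} x^3 + \frac{1}{180} x^6 + \cdots + \frac{1}{(2)(3)(5)(6) \cdots (3n-1)(3n)} x^{3n} + \cdots , \\
y_2(x) &= x + \frac{1}{12} x^4 + \frac{1}{504} x^7 + \cdots + \frac{1}{(3)(4)(6)(7) \cdots (3n)(3n+1)} x^{3n+1} + \cdots ,
\end{aligned}
\]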

and write the general solution to the equation as 𝑦(𝑥) = 𝑎0 𝑦1 (𝑥) + 𝑎 1 𝑦2 (𝑥). If we plug
in 𝑥 = 0 into the power series for 𝑦1 and 𝑦2 , we find 𝑦1 (0) = 1 and 𝑦2 (0) = 0. Similarly,
𝑦1′ (0) = 0 and 𝑦2′ (0) = 1. Therefore 𝑦 = 𝑎0 𝑦1 + 𝑎 1 𝑦2 is a solution that satisfies the initial
conditions 𝑦(0) = 𝑎0 and 𝑦 ′(0) = 𝑎1 .
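The coefficients of $y_1$ and $y_2$ can be generated directly from the recurrence relation. The sketch below is only an illustration (the function name `airy_coefficients` is ours); it uses exact rational arithmetic so the fractions $\frac{1}{6}$, $\frac{1}{180}$, $\frac{1}{12}$, $\frac{1}{504}$ appear as is.

```python
from fractions import Fraction


def airy_coefficients(a0, a1, count):
    """First `count` coefficients for y'' - x y = 0 around x0 = 0.

    Uses a_2 = 0 and a_{k+2} = a_{k-1} / ((k+2)(k+1)) for k >= 1.
    """
    a = [Fraction(a0), Fraction(a1), Fraction(0)]
    for k in range(1, count - 2):
        a.append(a[k - 1] / ((k + 2) * (k + 1)))
    return a[:count]


y1 = airy_coefficients(1, 0, 8)   # a0 = 1, a1 = 0
y2 = airy_coefficients(0, 1, 8)   # a0 = 0, a1 = 1
print([str(c) for c in y1])
print([str(c) for c in y2])
```

Every third coefficient starting from $a_2$ stays zero, matching $a_{3n+2} = 0$.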

Figure 7.3: The two solutions $y_1$ and $y_2$ to Airy's equation, plotted for $-5 \leq x \leq 5$.

The functions 𝑦1 and 𝑦2 cannot be written in terms of the elementary functions that you
know. See Figure 7.3 for the plot of the solutions 𝑦1 and 𝑦2 . These functions have many
interesting properties. For example, they are oscillatory for negative 𝑥 (like solutions to
𝑦 ′′ + 𝑦 = 0) and for positive 𝑥 they grow without bound (like solutions to 𝑦 ′′ − 𝑦 = 0).
Sometimes a series solution may turn out to be a polynomial. Let us see an example.

Example 7.2.3: Let us find a solution to the so-called Hermite's equation of order $n$‗:
\[
y'' - 2xy' + 2ny = 0 .
\]
Let us find a solution around the point $x_0 = 0$. We try
\[
y = \sum_{k=0}^\infty a_k x^k .
\]
We differentiate (as above) to obtain
\[
y' = \sum_{k=1}^\infty k a_k x^{k-1} , \qquad
y'' = \sum_{k=2}^\infty k(k-1) a_k x^{k-2} .
\]

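Now we plug into the equation
\[
\begin{aligned}
0 &= y'' - 2xy' + 2ny \\
&= \left( \sum_{k=2}^\infty k(k-1) a_k x^{k-2} \right) - 2x \left( \sum_{k=1}^\infty k a_k x^{k-1} \right) + 2n \left( \sum_{k=0}^\infty a_k x^k \right) \\
&= \left( \sum_{k=2}^\infty k(k-1) a_k x^{k-2} \right) - \left( \sum_{k=1}^\infty 2k a_k x^k \right) + \left( \sum_{k=0}^\infty 2n a_k x^k \right) \\
&= \left( 2a_2 + \sum_{k=1}^\infty (k+2)(k+1) a_{k+2} x^k \right) - \left( \sum_{k=1}^\infty 2k a_k x^k \right) + \left( 2n a_0 + \sum_{k=1}^\infty 2n a_k x^k \right) \\
&= 2a_2 + 2n a_0 + \sum_{k=1}^\infty \bigl( (k+2)(k+1) a_{k+2} - 2k a_k + 2n a_k \bigr) x^k .
\end{aligned}
\]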

As $y'' - 2xy' + 2ny = 0$, we have
\[
(k+2)(k+1) a_{k+2} + (-2k + 2n) a_k = 0, \qquad \text{or} \qquad a_{k+2} = \frac{(2k - 2n)}{(k+2)(k+1)} a_k .
\]
This recurrence relation actually includes 𝑎2 = −𝑛𝑎0 (which comes about from the constant
term 2𝑎2 + 2𝑛𝑎 0 = 0). Again 𝑎0 and 𝑎 1 are arbitrary.

\[
\begin{aligned}
a_2 &= \frac{-2n}{(2)(1)} a_0 , & a_3 &= \frac{2(1-n)}{(3)(2)} a_1 , \\
a_4 &= \frac{2(2-n)}{(4)(3)} a_2 = \frac{2^2 (2-n)(-n)}{(4)(3)(2)(1)} a_0 , &
a_5 &= \frac{2(3-n)}{(5)(4)} a_3 = \frac{2^2 (3-n)(1-n)}{(5)(4)(3)(2)} a_1 , \quad \ldots
\end{aligned}
\]
‗ Named after the French mathematician Charles Hermite (1822–1901).
We separate the even and odd coefficients to find that

\[
\begin{aligned}
a_{2m} &= \frac{2^m (-n)(2-n) \cdots (2m-2-n)}{(2m)!} \, a_0 , \\
a_{2m+1} &= \frac{2^m (1-n)(3-n) \cdots (2m-1-n)}{(2m+1)!} \, a_1 .
\end{aligned}
\]
We write down the two series, one with the even powers and one with the odd.

\[
\begin{aligned}
y_1(x) &= 1 + \frac{2(-n)}{2!} x^2 + \frac{2^2 (-n)(2-n)}{4!} x^4 + \frac{2^3 (-n)(2-n)(4-n)}{6!} x^6 + \cdots , \\
y_2(x) &= x + \frac{2(1-n)}{3!} x^3 + \frac{2^2 (1-n)(3-n)}{5!} x^5 + \frac{2^3 (1-n)(3-n)(5-n)}{7!} x^7 + \cdots .
\end{aligned}
\]
Then
𝑦(𝑥) = 𝑎0 𝑦1 (𝑥) + 𝑎 1 𝑦2 (𝑥).
We remark that if 𝑛 is a positive even integer, then 𝑦1 (𝑥) is a polynomial as all the
coefficients in the series beyond degree 𝑛 are zero. If 𝑛 is a positive odd integer, then 𝑦2 (𝑥)
is a polynomial. For example, if 𝑛 = 4, then

\[
y_1(x) = 1 + \frac{2(-4)}{2!} x^2 + \frac{2^2 (-4)(2-4)}{4!} x^4 = 1 - 4x^2 + \frac{4}{3} x^4 .
\]
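The termination of the series for $n = 4$ can be seen by running the recurrence relation. The following is a minimal sketch (the name `hermite_series` is hypothetical); exact fractions are used so that the vanishing coefficients are exactly zero.

```python
from fractions import Fraction


def hermite_series(n, a0, a1, count):
    """First `count` coefficients for y'' - 2x y' + 2n y = 0 around x0 = 0.

    Uses a_{k+2} = (2k - 2n) / ((k+2)(k+1)) * a_k.
    """
    a = [Fraction(a0), Fraction(a1)]
    for k in range(count - 2):
        a.append(Fraction(2 * k - 2 * n, (k + 2) * (k + 1)) * a[k])
    return a[:count]


# n = 4 with a0 = 1, a1 = 0 gives y1; the factor (2k - 2n) vanishes at k = 4,
# so a_6 and every later even coefficient is zero.
y1 = hermite_series(4, 1, 0, 10)
print([str(c) for c in y1])
```

The nonzero entries are the coefficients of $1 - 4x^2 + \frac{4}{3}x^4$.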

7.2.1 Exercises
In the following exercises, when asked to solve an equation using power series methods,
you should find the first few terms of the series, and if possible find a general formula for
the 𝑘 th coefficient.

Exercise 7.2.1: Use power series methods to solve 𝑦 ′′ + 𝑦 = 0 at the point 𝑥0 = 1.

Exercise 7.2.2: Use power series methods to solve 𝑦 ′′ + 4𝑥𝑦 = 0 at the point 𝑥0 = 0.

Exercise 7.2.3: Use power series methods to solve 𝑦 ′′ − 𝑥𝑦 = 0 at the point 𝑥0 = 1.

Exercise 7.2.4: Use power series methods to solve 𝑦 ′′ + 𝑥 2 𝑦 = 0 at the point 𝑥0 = 0.

Exercise 7.2.5: The methods work for orders other than second order. Try the methods of this section
to solve the first-order equation $y' - xy = 0$ at the point $x_0 = 0$.

Exercise 7.2.6 (Chebyshev’s equation of order 𝑝):

a) Solve (1 − 𝑥 2 )𝑦 ′′ − 𝑥𝑦 ′ + 𝑝 2 𝑦 = 0 using power series methods at 𝑥0 = 0.


b) For what 𝑝 is there a polynomial solution?

Exercise 7.2.7: Find a polynomial solution to (𝑥 2 + 1)𝑦 ′′ − 2𝑥𝑦 ′ + 2𝑦 = 0 using power series
methods.

Exercise 7.2.8:

a) Use power series methods to solve (1 − 𝑥)𝑦 ′′ + 𝑦 = 0 at the point 𝑥0 = 0.


b) Use the solution to part a) to find a solution for 𝑥𝑦 ′′ + 𝑦 = 0 around the point 𝑥0 = 1.

Exercise 7.2.101: Use power series methods to solve 𝑦 ′′ + 2𝑥 3 𝑦 = 0 at the point 𝑥 0 = 0.

Exercise 7.2.102 (challenging): Power-series methods also work for nonhomogeneous equations.

a) Use power series methods to solve $y'' - xy = \frac{1}{1-x}$ at the point $x_0 = 0$. Hint: Recall the
geometric series.
b) Now solve for the initial condition $y(0) = 0$, $y'(0) = 0$.

Exercise 7.2.103: Attempt to solve 𝑥 2 𝑦 ′′ − 𝑦 = 0 at 𝑥 0 = 0 using the power series method of this
section (𝑥0 is a singular point). Can you find at least one solution? Can you find more than one
solution?

7.3 Singular points and the method of Frobenius


Note: 1.5 or 2 lectures, §8.4 and §8.5 in [EP], §5.4–§5.7 in [BD]
The behavior of ODEs at singular points can be complicated. For certain singular points,
we can find a solution on at least one side of the singular point using a modification of the
power series. Let us look at some examples before giving a general method.

7.3.1 Examples
Example 7.3.1: Consider the simple first-order equation
2𝑥𝑦 ′ − 𝑦 = 0.
Note that 𝑥 = 0 is a singular point. Setting 𝑥 = 0 in the equation, we find that any solution
defined near zero satisfies 𝑦(0) = 0, but it is even worse. If we try to plug in

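\[
y = \sum_{k=0}^\infty a_k x^k ,
\]
we obtain
\[
\begin{aligned}
0 = 2xy' - y &= 2x \left( \sum_{k=1}^\infty k a_k x^{k-1} \right) - \left( \sum_{k=0}^\infty a_k x^k \right) \\
&= -a_0 + \sum_{k=1}^\infty (2k a_k - a_k) x^k .
\end{aligned}
\]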

First, 𝑎 0 = 0. Next, the only way to solve 0 = 2𝑘𝑎 𝑘 − 𝑎 𝑘 = (2𝑘 − 1) 𝑎 𝑘 for 𝑘 = 1, 2, 3, . . . is for
𝑎 𝑘 = 0 for all 𝑘. Therefore, in this manner we only get the trivial solution 𝑦 = 0. We need a
nonzero solution to get the general solution to the equation.
Let us try $y = x^r$ for some real number $r$. Consequently our solution, if we can find
one, may only make sense for positive $x$. Then $y' = r x^{r-1}$. So
\[
0 = 2xy' - y = 2x r x^{r-1} - x^r = (2r - 1) x^r .
\]
Thus $r = 1/2$ and so $y = x^{1/2}$. As the equation is linear, the general solution for positive $x$ is
\[
y = C x^{1/2} .
\]
If $C \neq 0$, then the derivative of the solution "blows up" at $x = 0$ (the singular point). There
is only one solution that is differentiable at $x = 0$ and that's the trivial solution $y = 0$.
Not every problem with a singular point has a solution of the form $y = x^r$, of course.
But perhaps we can combine the methods. What we will do is to try a solution of the form
\[
y = x^r f(x)
\]
for positive $x$, where $f(x)$ is an analytic function (a power series).

Example 7.3.2: Consider the equation
\[
4x^2 y'' - 4x^2 y' + (1 - 2x) y = 0,
\]
and again note that $x = 0$ is a singular point.
Let us try
\[
y = x^r \sum_{k=0}^\infty a_k x^k = \sum_{k=0}^\infty a_k x^{k+r} ,
\]
where $r$ is a real number, not necessarily an integer. Again if such a solution exists, it may
only exist for positive $x$. First we find the derivatives
\[
\begin{aligned}
y' &= \sum_{k=0}^\infty (k+r) a_k x^{k+r-1} , \\
y'' &= \sum_{k=0}^\infty (k+r)(k+r-1) a_k x^{k+r-2} .
\end{aligned}
\]

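We plug those into our equation:
\[
\begin{aligned}
0 &= 4x^2 y'' - 4x^2 y' + (1 - 2x) y \\
&= 4x^2 \left( \sum_{k=0}^\infty (k+r)(k+r-1) a_k x^{k+r-2} \right) - 4x^2 \left( \sum_{k=0}^\infty (k+r) a_k x^{k+r-1} \right) + (1 - 2x) \left( \sum_{k=0}^\infty a_k x^{k+r} \right) \\
&= \left( \sum_{k=0}^\infty 4(k+r)(k+r-1) a_k x^{k+r} \right) - \left( \sum_{k=0}^\infty 4(k+r) a_k x^{k+r+1} \right) + \left( \sum_{k=0}^\infty a_k x^{k+r} \right) - \left( \sum_{k=0}^\infty 2a_k x^{k+r+1} \right) \\
&= \left( \sum_{k=0}^\infty 4(k+r)(k+r-1) a_k x^{k+r} \right) - \left( \sum_{k=1}^\infty 4(k+r-1) a_{k-1} x^{k+r} \right) + \left( \sum_{k=0}^\infty a_k x^{k+r} \right) - \left( \sum_{k=1}^\infty 2a_{k-1} x^{k+r} \right) \\
&= 4r(r-1) a_0 x^r + a_0 x^r + \sum_{k=1}^\infty \Bigl( 4(k+r)(k+r-1) a_k - 4(k+r-1) a_{k-1} + a_k - 2a_{k-1} \Bigr) x^{k+r} \\
&= \bigl( 4r(r-1) + 1 \bigr) a_0 x^r + \sum_{k=1}^\infty \Bigl( \bigl( 4(k+r)(k+r-1) + 1 \bigr) a_k - \bigl( 4(k+r-1) + 2 \bigr) a_{k-1} \Bigr) x^{k+r} .
\end{aligned}
\]
First, to have a solution we must have $\bigl( 4r(r-1) + 1 \bigr) a_0 = 0$. Supposing $a_0 \neq 0$,
\[
4r(r-1) + 1 = 0 .
\]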

This equation is called the indicial equation. This particular indicial equation has a double
root at 𝑟 = 1/2.
OK, so we know what 𝑟 has to be. That knowledge we obtained simply by looking at
the coefficient of 𝑥 𝑟 . All other coefficients of 𝑥 𝑘+𝑟 also have to be zero so
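\[
\bigl( 4(k+r)(k+r-1) + 1 \bigr) a_k - \bigl( 4(k+r-1) + 2 \bigr) a_{k-1} = 0 .
\]
If we plug in $r = 1/2$ and solve for $a_k$, we get
\[
a_k = \frac{4(k + 1/2 - 1) + 2}{4(k + 1/2)(k + 1/2 - 1) + 1} \, a_{k-1} = \frac{1}{k} \, a_{k-1} .
\]
Let us set $a_0 = 1$. Then
\[
a_1 = \frac{1}{1} a_0 = 1, \quad
a_2 = \frac{1}{2} a_1 = \frac{1}{2}, \quad
a_3 = \frac{1}{3} a_2 = \frac{1}{3 \cdot 2}, \quad
a_4 = \frac{1}{4} a_3 = \frac{1}{4 \cdot 3 \cdot 2}, \quad \ldots
\]
Extrapolating, we notice that
\[
a_k = \frac{1}{k(k-1)(k-2) \cdots 3 \cdot 2} = \frac{1}{k!} .
\]
In other words,
\[
y = \sum_{k=0}^\infty a_k x^{k+r} = \sum_{k=0}^\infty \frac{1}{k!} x^{k+1/2} = x^{1/2} \sum_{k=0}^\infty \frac{1}{k!} x^k = x^{1/2} e^x .
\]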

That was lucky! In general, we will not be able to write the series in terms of elementary
functions.
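The simplification of the recurrence at $r = 1/2$ is easy to get wrong by hand, so here is a small check, offered as a sketch with a made-up function name. It runs the recurrence with exact fractions for a given $r$ and compares the result with $\frac{1}{k!}$.

```python
from fractions import Fraction
from math import factorial


def frobenius_coefficients(r, count):
    """Coefficients a_k (with a_0 = 1) of Example 7.3.2 for a given root r.

    Uses (4(k+r)(k+r-1) + 1) a_k = (4(k+r-1) + 2) a_{k-1}.
    """
    a = [Fraction(1)]
    for k in range(1, count):
        numerator = 4 * (k + r - 1) + 2
        denominator = 4 * (k + r) * (k + r - 1) + 1
        a.append(numerator / denominator * a[k - 1])
    return a


a = frobenius_coefficients(Fraction(1, 2), 8)
print([str(c) for c in a])
print(a == [Fraction(1, factorial(k)) for k in range(8)])
```

With $r = 1/2$ the numerator reduces to $4k$ and the denominator to $4k^2$, which is where the factor $\frac{1}{k}$ comes from.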
We have one solution, let us call it $y_1 = x^{1/2} e^x$. But what about a second solution? If we
want a general solution, we need two linearly independent solutions. Picking $a_0$ to be a
different constant only gets us a constant multiple of $y_1$, and we do not have any other $r$ to
try; we only have one solution to the indicial equation. Well, there are powers of $x$ floating
around and we are taking derivatives, so perhaps the logarithm (the antiderivative of $x^{-1}$) is
around as well. It turns out we want to try for another solution of the form
\[
y_2 = \sum_{k=0}^\infty b_k x^{k+r} + (\ln x) y_1 ,
\]
which in our case is
\[
y_2 = \sum_{k=0}^\infty b_k x^{k+1/2} + (\ln x) x^{1/2} e^x .
\]
We now differentiate this equation, substitute into the differential equation and solve for
$b_k$. A long computation ensues and we obtain some recursion relation for $b_k$. The reader
can (and should) try this to obtain for example the first three terms
\[
b_1 = b_0 - 1, \quad b_2 = \frac{2b_1 - 1}{4}, \quad b_3 = \frac{6b_2 - 1}{18}, \quad \ldots
\]
We then fix 𝑏 0 and obtain a solution 𝑦2 . Then we write the general solution as 𝑦 = 𝐴𝑦1 +𝐵𝑦2 .

7.3.2 The method of Frobenius


Before giving the general method, let us clarify when the method applies. Let
𝑃(𝑥)𝑦 ′′ + 𝑄(𝑥)𝑦 ′ + 𝑅(𝑥)𝑦 = 0
be an ODE. As before, if $P(x_0) = 0$, then $x_0$ is a singular point. If we divide by $P(x)$ to put
the equation in standard form $y'' + \frac{Q(x)}{P(x)} y' + \frac{R(x)}{P(x)} y = 0$, perhaps the singularities introduced
are not too bad. More specifically, if the limits
\[
\lim_{x \to x_0} (x - x_0) \frac{Q(x)}{P(x)} \qquad \text{and} \qquad \lim_{x \to x_0} (x - x_0)^2 \frac{R(x)}{P(x)}
\]
both exist and are finite, then we say that $x_0$ is a regular singular point.
Example 7.3.3: Often, and for the rest of this section, 𝑥0 = 0. Consider
\[
x^2 y'' + x(1 + x) y' + (\pi + x^2) y = 0 .
\]
Write
\[
\begin{aligned}
\lim_{x \to 0} x \frac{Q(x)}{P(x)} &= \lim_{x \to 0} x \frac{x(1+x)}{x^2} = \lim_{x \to 0} (1 + x) = 1, \\
\lim_{x \to 0} x^2 \frac{R(x)}{P(x)} &= \lim_{x \to 0} x^2 \frac{(\pi + x^2)}{x^2} = \lim_{x \to 0} (\pi + x^2) = \pi .
\end{aligned}
\]
So $x = 0$ is a regular singular point.
On the other hand, if we make the slight change
\[
x^2 y'' + (1 + x) y' + (\pi + x^2) y = 0,
\]
then
\[
\lim_{x \to 0} x \frac{Q(x)}{P(x)} = \lim_{x \to 0} x \frac{(1+x)}{x^2} = \lim_{x \to 0} \frac{1+x}{x} = \text{DNE}.
\]
Here DNE stands for does not exist. The point 0 is singular, but not a regular singular point.
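When $P$, $Q$, and $R$ are polynomials and $x_0 = 0$, the two limits can be decided by comparing the lowest powers of $x$ that appear. The sketch below is our own illustration, not a method from the text: it assumes polynomial coefficients given as lists (lowest degree first), and the function names are hypothetical. The limit of $x \frac{Q}{P}$ exists exactly when $x Q(x)$ vanishes at $0$ to at least the same order as $P(x)$, and similarly for $x^2 \frac{R}{P}$.

```python
import math


def order_at_zero(coeffs):
    """Index of the lowest-degree nonzero coefficient (None for the zero polynomial)."""
    for i, c in enumerate(coeffs):
        if c != 0:
            return i
    return None


def classify_origin(P, Q, R):
    """Classify x = 0 for P y'' + Q y' + R y = 0 with polynomial P, Q, R."""
    p = order_at_zero(P)
    if p is None:
        raise ValueError("P must not be the zero polynomial")
    if p == 0:
        return "ordinary"
    q = order_at_zero(Q)
    r = order_at_zero(R)
    q_ok = q is None or q + 1 >= p      # x Q / P has a finite limit
    r_ok = r is None or r + 2 >= p      # x^2 R / P has a finite limit
    return "regular singular" if q_ok and r_ok else "singular, not regular"


# The two equations of Example 7.3.3
print(classify_origin([0, 0, 1], [0, 1, 1], [math.pi, 0, 1]))
print(classify_origin([0, 0, 1], [1, 1], [math.pi, 0, 1]))
```

For non-polynomial coefficients such as $\sin(x)$ or $e^x$ this shortcut does not apply directly, and the limits have to be computed as in the example.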
We now discuss the general Method of Frobenius‗ . We only consider the method at the
point 𝑥 = 0 for simplicity. If 𝑥0 ≠ 0, then in the solution, we would replace every 𝑥 with
(𝑥 − 𝑥 0 ). The main idea is the following theorem.
Theorem 7.3.1 (Method of Frobenius). Suppose that
\[
P(x) y'' + Q(x) y' + R(x) y = 0 \tag{7.3}
\]
has a regular singular point at $x = 0$. Then there exists at least one solution of the form
\[
y = x^r \sum_{k=0}^\infty a_k x^k .
\]
A solution of this form is called a Frobenius-type solution.

‗ Named after the German mathematician Ferdinand Georg Frobenius (1849–1917).

The method usually breaks down like this:

(i) We seek a Frobenius-type solution of the form
\[
y = \sum_{k=0}^\infty a_k x^{k+r} .
\]
We plug this $y$ into equation (7.3). We collect terms and write everything as a single
series.

(ii) The obtained series must be zero. Setting the first coefficient (usually the coefficient of
$x^r$) in the series to zero we obtain the indicial equation, which is a quadratic polynomial
in $r$.

(iii) If the indicial equation has two real roots $r_1$ and $r_2$ such that $r_1 - r_2$ is not an integer,
then we have two linearly independent Frobenius-type solutions. Using the first root,
we plug in
\[
y_1 = x^{r_1} \sum_{k=0}^\infty a_k x^k ,
\]
and we solve for all $a_k$ to obtain the first solution. Then using the second root, we
plug in
\[
y_2 = x^{r_2} \sum_{k=0}^\infty b_k x^k ,
\]
and solve for all $b_k$ to obtain the second solution.

(iv) If the indicial equation has a doubled root $r$, then we find one solution
\[
y_1 = x^r \sum_{k=0}^\infty a_k x^k ,
\]
and then we obtain a new solution by plugging
\[
y_2 = x^r \sum_{k=0}^\infty b_k x^k + (\ln x) y_1 ,
\]
into equation (7.3) and solving for the constants $b_k$.

(v) If the indicial equation has two real roots such that $r_1 - r_2$ is an integer, then one
solution is
\[
y_1 = x^{r_1} \sum_{k=0}^\infty a_k x^k ,
\]
and the second linearly independent solution is of the form
\[
y_2 = x^{r_2} \sum_{k=0}^\infty b_k x^k + C (\ln x) y_1 ,
\]
where we plug $y_2$ into (7.3) and solve for the constants $b_k$ and $C$.

(vi) Finally, if the indicial equation has complex roots, then solving for $a_k$ in the solution
\[
y = x^{r_1} \sum_{k=0}^\infty a_k x^k
\]
results in a complex-valued function; all the $a_k$ are complex numbers. We obtain
our two linearly independent solutions‗ by taking the real and imaginary parts of $y$.
The main idea is to find at least one Frobenius-type solution. If we are lucky and find
two, we are done. If we only get one, we either use the ideas above or even a different
method such as reduction of order (see § 2.1) to obtain a second solution.
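Deciding which of the cases (iii) through (vi) applies only requires the roots of the indicial equation. The sketch below rests on an assumption not spelled out in the text: for a regular singular point at $x = 0$, with $p_0 = \lim_{x\to 0} x \frac{Q(x)}{P(x)}$ and $q_0 = \lim_{x\to 0} x^2 \frac{R(x)}{P(x)}$, the indicial equation takes the standard form $r(r-1) + p_0 r + q_0 = 0$. The function names and the tolerance are ours.

```python
import cmath


def indicial_roots(p0, q0):
    """Roots of r(r - 1) + p0*r + q0 = 0, that is, r^2 + (p0 - 1) r + q0 = 0."""
    b = p0 - 1
    disc = cmath.sqrt(b * b - 4 * q0)
    return (-b + disc) / 2, (-b - disc) / 2


def frobenius_case(p0, q0, tol=1e-12):
    """Name the case of the method that the roots fall under."""
    r1, r2 = indicial_roots(p0, q0)
    if abs(r1.imag) > tol:
        return "complex roots"
    diff = (r1 - r2).real
    if abs(diff) <= tol:
        return "double root"
    if abs(diff - round(diff)) <= tol:
        return "roots differ by an integer"
    return "roots do not differ by an integer"


# Example 7.3.2: 4x^2 y'' - 4x^2 y' + (1 - 2x) y = 0 has p0 = 0 and q0 = 1/4.
print(indicial_roots(0, 0.25), frobenius_case(0, 0.25))
```

For Example 7.3.2 this reproduces the double root $r = 1/2$. Because the comparison uses floating point with a tolerance, borderline inputs should be checked by hand.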

7.3.3 Bessel functions


An important class of functions that arise commonly in physics are the Bessel functions† .
For example, these functions appear when solving the wave equation in two and three
dimensions. First consider Bessel’s equation of order 𝑝:
\[
x^2 y'' + x y' + \left( x^2 - p^2 \right) y = 0 .
\]


We allow 𝑝 to be any number, not just an integer, although integers and multiples of 1/2 are
most important in applications.
When we plug
\[
y = \sum_{k=0}^\infty a_k x^{k+r}
\]
into Bessel's equation of order $p$, we obtain the indicial equation
\[
r(r-1) + r - p^2 = (r - p)(r + p) = 0 .
\]
We obtain two roots, $r_1 = p$ and $r_2 = -p$. If $p$ is not an integer, then following the method
of Frobenius and setting $a_0 = 1$, we find linearly independent solutions of the form
\[
\begin{aligned}
y_1 &= x^p \sum_{k=0}^\infty \frac{(-1)^k x^{2k}}{2^{2k} k! (k+p)(k-1+p) \cdots (2+p)(1+p)} , \\
y_2 &= x^{-p} \sum_{k=0}^\infty \frac{(-1)^k x^{2k}}{2^{2k} k! (k-p)(k-1-p) \cdots (2-p)(1-p)} .
\end{aligned}
\]
‗ See Joseph L. Neuringera, The Frobenius method for complex roots of the indicial equation, International
Journal of Mathematical Education in Science and Technology, Volume 9, Issue 1, 1978, 71–77.
† Named after the German astronomer and mathematician Friedrich Wilhelm Bessel (1784–1846).

Exercise 7.3.1:

a) Verify that the indicial equation of Bessel’s equation of order 𝑝 is (𝑟 − 𝑝)(𝑟 + 𝑝) = 0.


b) Suppose 𝑝 is not an integer. Carry out the computation to obtain the solutions 𝑦1 and 𝑦2
above.
Bessel functions are convenient constant multiples of $y_1$ and $y_2$. First we must define
the gamma function
\[
\Gamma(x) = \int_0^\infty t^{x-1} e^{-t} \, dt .
\]
Notice that Γ(1) = 1. The gamma function also has a wonderful property

Γ(𝑥 + 1) = 𝑥Γ(𝑥).

From this property, it follows that Γ(𝑛) = (𝑛 − 1)! when 𝑛 is an integer. So the gamma
function is a continuous version of the factorial. We compute:

Γ(𝑘 + 𝑝 + 1) = (𝑘 + 𝑝)(𝑘 − 1 + 𝑝) · · · (2 + 𝑝)(1 + 𝑝)Γ(1 + 𝑝),


Γ(𝑘 − 𝑝 + 1) = (𝑘 − 𝑝)(𝑘 − 1 − 𝑝) · · · (2 − 𝑝)(1 − 𝑝)Γ(1 − 𝑝).

Exercise 7.3.2: Verify the identities above using Γ(𝑥 + 1) = 𝑥Γ(𝑥).
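A numerical spot check of the first identity is sketched below. It is not a substitute for the verification asked for in the exercise; it only compares both sides for sample values of $k$ and $p$ that we picked arbitrarily, using the standard library function `math.gamma`. The helper name `gamma_via_product` is ours.

```python
import math


def gamma_via_product(k, p):
    """Right-hand side (k+p)(k-1+p)...(1+p) * Gamma(1+p)."""
    product = 1.0
    for j in range(1, k + 1):
        product *= j + p
    return product * math.gamma(1 + p)


k, p = 5, 0.3
print(math.gamma(k + p + 1), gamma_via_product(k, p))

# Gamma(n) = (n - 1)! for a positive integer n
print(math.gamma(6), math.factorial(5))
```

The same helper with $p$ replaced by $-p$ checks the second identity, as long as $p$ is not a positive integer.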

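We define the Bessel functions of the first kind of order $p$ and $-p$ as
\[
\begin{aligned}
J_p(x) &= \frac{1}{2^p \Gamma(1+p)} \, y_1 = \sum_{k=0}^\infty \frac{(-1)^k}{k! \, \Gamma(k+p+1)} \left( \frac{x}{2} \right)^{2k+p} , \\
J_{-p}(x) &= \frac{1}{2^{-p} \Gamma(1-p)} \, y_2 = \sum_{k=0}^\infty \frac{(-1)^k}{k! \, \Gamma(k-p+1)} \left( \frac{x}{2} \right)^{2k-p} .
\end{aligned}
\]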

As these are constant multiples of the solutions we found above, these are both solutions
to Bessel’s equation of order 𝑝. The constants are picked for convenience.
When $p$ is not an integer, $J_p$ and $J_{-p}$ are linearly independent. When $n$ is an integer we
obtain
\[
J_n(x) = \sum_{k=0}^\infty \frac{(-1)^k}{k! \, (k+n)!} \left( \frac{x}{2} \right)^{2k+n} .
\]
In this case
\[
J_n(x) = (-1)^n J_{-n}(x),
\]
and so $J_{-n}$ is not a second linearly independent solution. The other solution is the so-called
Bessel function of second kind. These make sense only for integer orders $n$ and are defined as
limits of linear combinations of $J_p(x)$ and $J_{-p}(x)$, as $p$ approaches $n$ in the following way:
\[
Y_n(x) = \lim_{p \to n} \frac{\cos(p\pi) J_p(x) - J_{-p}(x)}{\sin(p\pi)} .
\]

Each linear combination of 𝐽𝑝 (𝑥) and 𝐽−𝑝 (𝑥) is a solution to Bessel’s equation of order 𝑝.
Then as we take the limit as 𝑝 goes to 𝑛, we see that 𝑌𝑛 (𝑥) is a solution to Bessel’s equation
of order 𝑛. It also turns out that 𝑌𝑛 (𝑥) and 𝐽𝑛 (𝑥) are linearly independent. Therefore when
𝑛 is an integer, we have the general solution to Bessel’s equation of order 𝑛:

𝑦 = 𝐴𝐽𝑛 (𝑥) + 𝐵𝑌𝑛 (𝑥),

for arbitrary constants 𝐴 and 𝐵. Note that 𝑌𝑛 (𝑥) goes to negative infinity at 𝑥 = 0. Many
mathematical software packages have these functions 𝐽𝑛 (𝑥) and 𝑌𝑛 (𝑥) defined, so they
can be used just like say sin(𝑥) and cos(𝑥). In fact, Bessel functions have some similar
properties. For example, −𝐽1 (𝑥) is the derivative of 𝐽0 (𝑥), and in general the derivative of 𝐽𝑛 (𝑥)
can be written as a linear combination of 𝐽𝑛−1 (𝑥) and 𝐽𝑛+1 (𝑥). Furthermore, these functions
oscillate, although they are not periodic. See Figure 7.4 for graphs of Bessel functions.
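For a nonnegative integer order, the series for $J_n$ can be summed directly. The sketch below is only an illustration: the function name, the number of terms, and the sample point are our choices, and a partial sum like this is reasonable for moderate $x$ only. It also checks numerically that $-J_1$ is the derivative of $J_0$ using a central difference.

```python
import math


def bessel_j(n, x, terms=40):
    """Partial sum of the series for J_n(x), with n a nonnegative integer."""
    return sum(
        (-1) ** k / (math.factorial(k) * math.factorial(k + n)) * (x / 2) ** (2 * k + n)
        for k in range(terms)
    )


print(bessel_j(0, 0.0))                     # J_0(0)

# Central-difference approximation of J_0'(x), compared with -J_1(x)
x, h = 1.7, 1e-6
derivative = (bessel_j(0, x + h) - bessel_j(0, x - h)) / (2 * h)
print(derivative, -bessel_j(1, x))
```

For larger arguments the alternating terms become large before they decay, so library implementations are a better choice there.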

Figure 7.4: Plot of $J_0(x)$ and $J_1(x)$ in the first graph and $Y_0(x)$ and $Y_1(x)$ in the second graph, for $0 \leq x \leq 10$.

Example 7.3.4: Other equations can sometimes be solved in terms of the Bessel functions.
For example, given a positive constant $\lambda$,
\[
x y'' + y' + \lambda^2 x y = 0,
\]
can be changed to $x^2 y'' + x y' + \lambda^2 x^2 y = 0$. Changing variables $t = \lambda x$, we obtain, via the
chain rule, the equation in $y$ and $t$:
\[
t^2 y'' + t y' + t^2 y = 0,
\]
which we recognize as Bessel's equation of order 0. Therefore the general solution is
$y(t) = A J_0(t) + B Y_0(t)$, or in terms of $x$:
\[
y = A J_0(\lambda x) + B Y_0(\lambda x) .
\]

This equation comes up, for example, when finding the fundamental modes of vibration of
a circular drum, but we digress.

7.3.4 Exercises
Exercise 7.3.3: Find a particular (Frobenius-type) solution of 𝑥 2 𝑦 ′′ + 𝑥𝑦 ′ + (1 + 𝑥)𝑦 = 0.

Exercise 7.3.4: Find a particular (Frobenius-type) solution of 𝑥𝑦 ′′ − 𝑦 = 0.

Exercise 7.3.5: Find a particular (Frobenius-type) solution of $y'' + \frac{1}{x} y' - xy = 0$.

Exercise 7.3.6: Find the general solution of 2𝑥𝑦 ′′ + 𝑦 ′ − 𝑥 2 𝑦 = 0.

Exercise 7.3.7: Find the general solution of 𝑥 2 𝑦 ′′ − 𝑥𝑦 ′ − 𝑦 = 0.

Exercise 7.3.8: In the following equations classify the point 𝑥 = 0 as ordinary, regular singular,
or singular but not regular singular.

a) 𝑥 2 (1 + 𝑥 2 )𝑦 ′′ + 𝑥𝑦 = 0 b) 𝑥 2 𝑦 ′′ + 𝑦 ′ + 𝑦 = 0
c) 𝑥 𝑦 ′′ + 𝑥 3 𝑦 ′ + 𝑦 = 0 d) 𝑥𝑦 ′′ + 𝑥𝑦 ′ − 𝑒 𝑥 𝑦 = 0
e) 𝑥 2 𝑦 ′′ + 𝑥 2 𝑦 ′ + 𝑥 2 𝑦 = 0

Exercise 7.3.101: In the following equations classify the point 𝑥 = 0 as ordinary, regular singular,
or singular but not regular singular.

a) 𝑦 ′′ + 𝑦 = 0 b) 𝑥 3 𝑦 ′′ + (1 + 𝑥)𝑦 = 0
c) 𝑥 𝑦 ′′ + 𝑥 5 𝑦 ′ + 𝑦 = 0 d) sin(𝑥)𝑦 ′′ − 𝑦 = 0
e) cos(𝑥)𝑦 ′′ − sin(𝑥)𝑦 = 0

Exercise 7.3.102: Find the general solution of 𝑥 2 𝑦 ′′ − 𝑦 = 0.

Exercise 7.3.103: Find a particular solution of 𝑥 2 𝑦 ′′ + (𝑥 − 3/4)𝑦 = 0.

Exercise 7.3.104 (tricky): Find the general solution of 𝑥 2 𝑦 ′′ − 𝑥𝑦 ′ + 𝑦 = 0.


Chapter 8

Nonlinear systems

8.1 Linearization, critical points, and equilibria


Note: 1 lecture, §6.1–§6.2 in [EP], §9.2–§9.3 in [BD]
Except for a few brief detours in chapter 1, we considered mostly linear equations.
Linear equations suffice in many applications, but in reality most phenomena require
nonlinear equations. Nonlinear equations, however, are notoriously more difficult to
understand than linear ones, and many strange new phenomena appear when we allow
our equations to be nonlinear.
Not to worry, we did not waste all this time studying linear equations. Nonlinear
equations can often be approximated by linear ones if we only need a solution “locally,” for
example, only for a short period of time, or only for certain parameters. Understanding
linear equations can also give us qualitative understanding about a more general nonlinear
problem. The idea is similar to what you did in calculus in trying to approximate a function
by a line with the right slope.
In § 2.4 we looked at the pendulum of length $L$. The goal was to
solve for the angle $\theta(t)$ as a function of the time $t$. The equation for
the setup is the nonlinear equation
\[
\theta'' + \frac{g}{L} \sin \theta = 0 .
\]
Instead of solving this equation, we solved the rather easier linear
equation
\[
\theta'' + \frac{g}{L} \theta = 0 .
\]
While the solution to the linear equation is not exactly what we were looking for, it is rather
close to the original, as long as the angle 𝜃 is small and the time period involved is short.
You might ask: Why don't we just solve the nonlinear problem? Well, it might be very
difficult, impractical, or impossible to solve analytically, depending on the equation in
question. We may not even be interested in the actual solution; we might only be interested
in some qualitative idea of what the solution is doing. For example, what happens as time
goes to infinity?

8.1.1 Autonomous systems and phase plane analysis


We restrict our attention to a two-dimensional autonomous system

𝑥 ′ = 𝑓 (𝑥, 𝑦), 𝑦 ′ = 𝑔(𝑥, 𝑦),

where 𝑓 (𝑥, 𝑦) and 𝑔(𝑥, 𝑦) are functions of two variables, and the derivatives are taken with
respect to time 𝑡. Solutions are functions 𝑥(𝑡) and 𝑦(𝑡) such that

\[
x'(t) = f\bigl( x(t), y(t) \bigr), \qquad y'(t) = g\bigl( x(t), y(t) \bigr) .
\]

The way we will analyze the system is very similar to § 1.6, where we studied a single
autonomous equation. The ideas in two dimensions are the same, but the behavior can be
far more complicated.
It may be best to think of the system of equations as the single vector equation
\[
\begin{bmatrix} x \\ y \end{bmatrix}' = \begin{bmatrix} f(x,y) \\ g(x,y) \end{bmatrix} . \tag{8.1}
\]
As in § 3.1 we draw the phase portrait (or phase diagram), where each point $(x, y)$ corresponds
to a specific state of the system. We draw the vector field given at each point $(x, y)$ by the
vector $\left[ \begin{smallmatrix} f(x,y) \\ g(x,y) \end{smallmatrix} \right]$. And as before if we find solutions, we draw the trajectories by plotting all
points $\bigl( x(t), y(t) \bigr)$ for a certain range of $t$.


Example 8.1.1: Consider the second-order equation $x'' = -x + x^2$. Write this equation as a
first-order nonlinear system
\[
x' = y, \qquad y' = -x + x^2 .
\]

The phase portrait with some trajectories is drawn in Figure 8.1 on the next page.
From the phase portrait it should be clear that even this simple system has fairly
complicated behavior. Some trajectories keep oscillating around the origin, and some go
off towards infinity. We will return to this example often, and analyze it completely in this
(and the next) section.
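Trajectories like the ones in the phase portrait can be approximated numerically. The following is a rough sketch, not the method used to produce the figure: it applies the classical fourth-order Runge–Kutta scheme, which is our own choice here, and the step size, starting points, and `escape_radius` cutoff are arbitrary illustrative values.

```python
def field(x, y):
    """Right-hand side of x' = y, y' = -x + x^2."""
    return y, -x + x * x


def rk4_step(x, y, h):
    """One step of the classical fourth-order Runge-Kutta method."""
    k1x, k1y = field(x, y)
    k2x, k2y = field(x + h / 2 * k1x, y + h / 2 * k1y)
    k3x, k3y = field(x + h / 2 * k2x, y + h / 2 * k2y)
    k4x, k4y = field(x + h * k3x, y + h * k3y)
    return (x + h / 6 * (k1x + 2 * k2x + 2 * k3x + k4x),
            y + h / 6 * (k1y + 2 * k2y + 2 * k3y + k4y))


def trajectory(x0, y0, h=0.01, steps=2000, escape_radius=10.0):
    """Points along the trajectory starting at (x0, y0)."""
    points = [(x0, y0)]
    for _ in range(steps):
        x, y = rk4_step(*points[-1], h)
        points.append((x, y))
        if max(abs(x), abs(y)) > escape_radius:
            break  # the trajectory has left the region of interest
    return points


around_origin = trajectory(0.5, 0.0)   # keeps circling the origin
escaping = trajectory(1.5, 0.0)        # heads off towards infinity
print(len(around_origin), len(escaping))
```

The first run uses all its steps while staying near the origin, whereas the second stops early once it leaves the box of radius `escape_radius`, matching the two kinds of behavior described above.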
If we zoom into the diagram near a point where $\left[ \begin{smallmatrix} f(x,y) \\ g(x,y) \end{smallmatrix} \right]$ is not zero, then nearby the
arrows point in essentially the same direction and have essentially the same
magnitude. In other words the behavior is not that interesting near such a point. We are of
course assuming that $f(x, y)$ and $g(x, y)$ are continuous.
Let us concentrate on those points in the phase diagram above where the trajectories
seem to start, end, or go around. We see two such points: (0, 0) and (1, 0). The trajectories
seem to go around the point (0, 0), and they seem to either go in or out of the point (1, 0).

Figure 8.1: Phase portrait with some trajectories of $x' = y$, $y' = -x + x^2$.

These points are precisely those points where the derivatives of both 𝑥 and 𝑦 are zero. Let
us define the critical points as the points (𝑥, 𝑦) such that

\[
\begin{bmatrix} f(x,y) \\ g(x,y) \end{bmatrix} = \vec{0} .
\]

In other words, these are the points where both 𝑓 (𝑥, 𝑦) = 0 and 𝑔(𝑥, 𝑦) = 0.
The critical points are where the behavior of the system is in some sense the most
complicated. If $\left[ \begin{smallmatrix} f(x,y) \\ g(x,y) \end{smallmatrix} \right]$ is zero, then nearby, the vector can point in any direction whatsoever.
Also, the trajectories are either going towards, away from, or around these points, so if we
are looking for long-term qualitative behavior of the system, we should look at what is
happening near the critical points.
Critical points are also sometimes called equilibria, since we have so-called equilibrium
solutions at critical points. If (𝑥0 , 𝑦0 ) is a critical point, then we have the solutions

𝑥(𝑡) = 𝑥0 , 𝑦(𝑡) = 𝑦0 .

In Example 8.1.1 on the facing page, there are two equilibrium solutions:

𝑥(𝑡) = 0, 𝑦(𝑡) = 0, and 𝑥(𝑡) = 1, 𝑦(𝑡) = 0.

Compare this discussion on equilibria to the discussion in § 1.6. The underlying concept is
exactly the same.

8.1.2 Linearization
In § 3.5 we studied the behavior of a homogeneous linear system of two equations near a
critical point. For a linear system of two variables given by an invertible matrix, the only
critical point is the origin (0, 0). Let us put the understanding we gained in that section to
good use understanding what happens near critical points of nonlinear systems.
In calculus we learned to estimate a function by taking its derivative and linearizing.
We work similarly with nonlinear systems of ODEs. Suppose (𝑥0 , 𝑦0 ) is a critical point.
First change variables to (𝑢, 𝑣), so that (𝑢, 𝑣) = (0, 0) corresponds to (𝑥 0 , 𝑦0 ). That is,

𝑢 = 𝑥 − 𝑥0 , 𝑣 = 𝑦 − 𝑦0 .

Next we need to find the derivative. In multivariable calculus you may have seen that the
several-variable version of the derivative is the Jacobian matrix‗. The Jacobian matrix of the
vector-valued function $\begin{bmatrix} f(x,y) \\ g(x,y) \end{bmatrix}$ at $(x_0, y_0)$ is

\[
\begin{bmatrix}
\frac{\partial f}{\partial x}(x_0,y_0) & \frac{\partial f}{\partial y}(x_0,y_0) \\
\frac{\partial g}{\partial x}(x_0,y_0) & \frac{\partial g}{\partial y}(x_0,y_0)
\end{bmatrix}.
\]

This matrix gives the best linear approximation as 𝑢 and 𝑣 (and therefore 𝑥 and 𝑦) vary.
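We define the linearization of the equation (8.1) as the linear system

\[
\begin{bmatrix} u \\ v \end{bmatrix}' =
\begin{bmatrix}
\frac{\partial f}{\partial x}(x_0,y_0) & \frac{\partial f}{\partial y}(x_0,y_0) \\
\frac{\partial g}{\partial x}(x_0,y_0) & \frac{\partial g}{\partial y}(x_0,y_0)
\end{bmatrix}
\begin{bmatrix} u \\ v \end{bmatrix}.
\]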
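
One way to see where this linear system comes from is the first-order Taylor expansion, sketched here under the assumption that $f$ and $g$ have continuous first partial derivatives near $(x_0, y_0)$. Because $(x_0, y_0)$ is a critical point, the constant terms $f(x_0, y_0)$ and $g(x_0, y_0)$ vanish, and $u' = x'$, $v' = y'$ because $x_0$ and $y_0$ are constants:

\[
\begin{aligned}
u' = f(x,y) &\approx \frac{\partial f}{\partial x}(x_0,y_0)\,u + \frac{\partial f}{\partial y}(x_0,y_0)\,v, \\
v' = g(x,y) &\approx \frac{\partial g}{\partial x}(x_0,y_0)\,u + \frac{\partial g}{\partial y}(x_0,y_0)\,v.
\end{aligned}
\]

Replacing the approximations by equalities gives the linearization.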

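Example 8.1.2: Let us keep with the same equations as Example 8.1.1: $x' = y$, $y' = -x + x^2$.
There are two critical points, $(0, 0)$ and $(1, 0)$. The Jacobian matrix at any point is

\[
\begin{bmatrix}
\frac{\partial f}{\partial x}(x,y) & \frac{\partial f}{\partial y}(x,y) \\
\frac{\partial g}{\partial x}(x,y) & \frac{\partial g}{\partial y}(x,y)
\end{bmatrix}
=
\begin{bmatrix} 0 & 1 \\ -1 + 2x & 0 \end{bmatrix}.
\]

Therefore at $(0, 0)$, we have $u = x$ and $v = y$, and the linearization is

\[
\begin{bmatrix} u \\ v \end{bmatrix}' =
\begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}
\begin{bmatrix} u \\ v \end{bmatrix}.
\]

At the point $(1, 0)$, we have $u = x - 1$ and $v = y$, and the linearization is

\[
\begin{bmatrix} u \\ v \end{bmatrix}' =
\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}
\begin{bmatrix} u \\ v \end{bmatrix}.
\]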

The phase diagrams of the two linearizations at the point (0, 0) and (1, 0) are given in
Figure 8.2 on the next page. Note that the variables are now 𝑢 and 𝑣. Compare Figure 8.2
with Figure 8.1 on the preceding page, and look especially at the behavior near the critical
points.
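
The two matrices in Example 8.1.2 can be double-checked numerically. The following is a minimal sketch, not part of the original text: it approximates the Jacobian by central differences, and the helper names `field` and `jacobian` as well as the step size `h` are choices made here for illustration.

```python
# Vector field of Example 8.1.2: x' = y, y' = -x + x^2.
def field(x, y):
    return (y, -x + x * x)


def jacobian(F, x0, y0, h=1e-6):
    """Approximate the 2x2 Jacobian of F at (x0, y0) by central differences."""
    fxp, gxp = F(x0 + h, y0)
    fxm, gxm = F(x0 - h, y0)
    fyp, gyp = F(x0, y0 + h)
    fym, gym = F(x0, y0 - h)
    return [
        [(fxp - fxm) / (2 * h), (fyp - fym) / (2 * h)],
        [(gxp - gxm) / (2 * h), (gyp - gym) / (2 * h)],
    ]


for point in [(0.0, 0.0), (1.0, 0.0)]:
    # Both components of the field vanish at a critical point.
    assert field(*point) == (0.0, 0.0)
    print(point, jacobian(field, *point))
```

Up to rounding, the printed matrices should agree with the two linearization matrices above.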
‗ Named for the German mathematician Carl Gustav Jacob Jacobi (1804–1851).

Figure 8.2: Phase diagram with some trajectories of linearizations at the critical points $(0, 0)$ (left) and
$(1, 0)$ (right) of $x' = y$, $y' = -x + x^2$.

8.1.3 Exercises
Exercise 8.1.1: Sketch the phase plane vector field for:
a) $x' = x^2$, $y' = y^2$, b) $x' = (x - y)^2$, $y' = -x$, c) $x' = e^y$, $y' = e^x$.
Exercise 8.1.2: Match systems
1) $x' = x^2$, $y' = y^2$, 2) $x' = xy$, $y' = 1 + y^2$, 3) $x' = \sin(\pi y)$, $y' = x$,
to the vector fields below. Justify.

a) b) c)

Exercise 8.1.3: Find the critical points and linearizations of the following systems.
a) $x' = x^2 - y^2$, $y' = x^2 + y^2 - 1$, b) $x' = -y$, $y' = 3x + yx^2$,
c) $x' = x^2 + y$, $y' = y^2 + x$.
Exercise 8.1.4: For the following systems, verify they have a critical point at $(0, 0)$, and find the
linearization at $(0, 0)$.
a) $x' = x + 2y + x^2 - y^2$, $y' = 2y - x^2$ b) $x' = -y$, $y' = x - y^3$
c) $x' = ax + by + f(x,y)$, $y' = cx + dy + g(x,y)$, where $f(0,0) = 0$, $g(0,0) = 0$, and all first
partial derivatives of $f$ and $g$ are also zero at $(0, 0)$, that is,
$\frac{\partial f}{\partial x}(0,0) = \frac{\partial f}{\partial y}(0,0) = \frac{\partial g}{\partial x}(0,0) = \frac{\partial g}{\partial y}(0,0) = 0$.
Exercise 8.1.5: Take $x' = (x - y)^2$, $y' = (x + y)^2$.

a) Find the set of critical points.
b) Sketch a phase diagram and describe the behavior near the critical point(s).
c) Find the linearization. Is it helpful in understanding the system?

Exercise 8.1.6: Take $x' = x^2$, $y' = x^3$.

a) Find the set of critical points.
b) Sketch a phase diagram and describe the behavior near the critical point(s).
c) Find the linearization. Is it helpful in understanding the system?

Exercise 8.1.101: Find the critical points and linearizations of the following systems.

a) $x' = \sin(\pi y) + (x - 1)^2$, $y' = y^2 - y$, b) $x' = x + y + y^2$, $y' = x$,
c) $x' = (x - 1)^2 + y$, $y' = x^2 + y$.

Exercise 8.1.102: Match systems

1) $x' = y^2$, $y' = -x^2$, 2) $x' = y$, $y' = (x - 1)(x + 1)$,
3) $x' = y + x^2$, $y' = -x$,

to the vector fields below. Justify.

a) b) c)

Exercise 8.1.103: The idea of critical points and linearization works in higher dimensions as well.
You simply make the Jacobian matrix bigger by adding more functions and more variables. For the
following system of 3 equations find the critical points and their linearizations:

\[
x' = x + z^2, \qquad y' = z^2 - y, \qquad z' = z + x^2.
\]

Exercise 8.1.104: Any two-dimensional non-autonomous system $x' = f(x,y,t)$, $y' = g(x,y,t)$
can be written as a three-dimensional autonomous system (three equations). Write down this
autonomous system using the variables $u$, $v$, $w$.

8.2 Stability and classification of isolated critical points


Note: 1.5–2 lectures, §6.1–§6.2 in [EP], §9.2–§9.3 in [BD]

8.2.1 Isolated critical points and almost linear systems


A critical point is isolated if it is the only critical point in some small “neighborhood” of the
point. That is, if we zoom in far enough it is the only critical point we see. In the example
above, each critical point was isolated. If, on the other hand, there were a whole curve of
critical points, then the points on it would not be isolated.
A system is called almost linear at a critical point (𝑥0 , 𝑦0 ), if the critical point is isolated
and the Jacobian matrix at the point is invertible, or equivalently if the linearized system
has an isolated critical point. In such a case, the nonlinear terms are very small and the
system behaves like its linearization, at least if we are close to the critical point.
For example, the system in Examples 8.1.1 and 8.1.2 has two isolated critical points
$(0, 0)$ and $(1, 0)$, and is almost linear at both critical points as the Jacobian matrices at both
points, $\begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}$ and $\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$, are invertible.
On the other hand, the system $x' = x^2$, $y' = y^2$ has an isolated critical point at $(0, 0)$,
however the Jacobian matrix

\[
\begin{bmatrix} 2x & 0 \\ 0 & 2y \end{bmatrix}
\]

is zero when $(x, y) = (0, 0)$. So the system is not almost linear. An even worse example is the
system $x' = x$, $y' = x^2$, which does not have isolated critical points; $x'$ and $y'$ are both zero
whenever $x = 0$, that is, on the entire $y$-axis.
Fortunately, most often critical points are isolated, and the system is almost linear at
the critical points. So if we learn what happens there, we will have figured out the majority
of situations that arise in applications.

8.2.2 Stability and classification of isolated critical points


Once we have an isolated critical point, know that the system is almost linear at that critical point,
and have computed the associated linearized system, we can classify what happens to the
solutions. We more or less use the classification for linear two-variable systems from § 3.5,
with one minor caveat. Let us list the behaviors depending on the eigenvalues of the
Jacobian matrix at the critical point in Table 8.1. This table is very similar
to Table 3.1 on page 150, with the exception of missing “center” points. We will discuss
centers later, as they are more complicated.

Eigenvalues of the Jacobian matrix   Behavior                 Stability
----------------------------------   ----------------------   ---------------------
real and both positive               source / unstable node   unstable
real and both negative               sink / stable node       asymptotically stable
real and opposite signs              saddle                   unstable
complex with positive real part      spiral source            unstable
complex with negative real part      spiral sink              asymptotically stable

Table 8.1: Behavior of an almost linear system near an isolated critical point.

In the third column, we mark points as asymptotically stable or unstable. Formally, a
stable critical point $(x_0, y_0)$ is one where given any small distance $\epsilon$ to $(x_0, y_0)$, and any initial
condition within a perhaps smaller radius around $(x_0, y_0)$, the trajectory of the system
never goes further away from $(x_0, y_0)$ than $\epsilon$. An unstable critical point is one that is not
stable. Informally, a point is stable if, when we start close to a critical point and follow a trajectory,
we either go towards, or at least not away from, this critical point.
A stable critical point $(x_0, y_0)$ is called asymptotically stable if given any initial condition
sufficiently close to $(x_0, y_0)$ and any solution $\bigl(x(t), y(t)\bigr)$ satisfying that condition, then

\[
\lim_{t \to \infty} \bigl(x(t), y(t)\bigr) = (x_0, y_0).
\]

That is, the critical point is asymptotically stable if any trajectory for a sufficiently close
initial condition goes towards the critical point (𝑥0 , 𝑦0 ).
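
The classification in Table 8.1 can be turned into a small routine. This is a sketch under stated assumptions rather than a definitive implementation: the function names `eigenvalues_2x2` and `classify` and the tolerance `tol` are made up here, and the two cases the table does not cover (a singular Jacobian, or purely imaginary eigenvalues) are reported as undetermined.

```python
import cmath


def eigenvalues_2x2(J):
    """Eigenvalues of a 2x2 matrix, computed from its trace and determinant."""
    (a, b), (c, d) = J
    tr = a + d
    det = a * d - b * c
    root = cmath.sqrt(tr * tr - 4 * det)
    return (tr + root) / 2, (tr - root) / 2


def classify(J, tol=1e-12):
    """Return (behavior, stability) for the Jacobian J, following Table 8.1."""
    (a, b), (c, d) = J
    if abs(a * d - b * c) <= tol:
        # Jacobian not invertible: the system is not almost linear here.
        return ("not almost linear", "undetermined")
    l1, l2 = eigenvalues_2x2(J)
    if abs(l1.imag) > tol:  # complex conjugate pair
        if abs(l1.real) <= tol:
            # Purely imaginary: the table does not decide this case.
            return ("center in the linearization", "undetermined")
        if l1.real > 0:
            return ("spiral source", "unstable")
        return ("spiral sink", "asymptotically stable")
    if l1.real > 0 and l2.real > 0:
        return ("source", "unstable")
    if l1.real < 0 and l2.real < 0:
        return ("sink", "asymptotically stable")
    return ("saddle", "unstable")


# The two linearization matrices from Example 8.1.2.
print(classify([[0, 1], [-1, 0]]))
print(classify([[0, 1], [1, 0]]))
```

The first matrix falls into the undetermined center case, which is why centers need the separate treatment given later in this section.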
Example 8.2.1: Consider $x' = -y - x^2$, $y' = -x + y^2$. See Figure 8.3 on the next page for
the phase diagram. Let us find the critical points. These are the points where $-y - x^2 = 0$
and $-x + y^2 = 0$. The first equation means $y = -x^2$, and so $y^2 = x^4$. Plugging into the
second equation we obtain $-x + x^4 = 0$. Factoring, we obtain $x(x^3 - 1) = 0$. Since we are
looking only for real solutions we get either $x = 0$ or $x = 1$. Solving for the corresponding
$y$ using $y = -x^2$, we get two critical points, one being $(0, 0)$ and the other being $(1, -1)$.
Clearly the critical points are isolated.
Let us compute the Jacobian matrix:

\[
\begin{bmatrix} -2x & -1 \\ -1 & 2y \end{bmatrix}.
\]

At the point $(0, 0)$ we get the matrix $\begin{bmatrix} 0 & -1 \\ -1 & 0 \end{bmatrix}$ and so the two eigenvalues are $1$ and $-1$. As
the matrix is invertible, the system is almost linear at $(0, 0)$. As the eigenvalues are real and
of opposite signs, we get a saddle point, which is an unstable equilibrium point.
At the point $(1, -1)$ we get the matrix $\begin{bmatrix} -2 & -1 \\ -1 & -2 \end{bmatrix}$ and computing the eigenvalues we get $-1$,
$-3$. The matrix is invertible, and so the system is almost linear at $(1, -1)$. As we have real
eigenvalues and both negative, the critical point is a sink, and therefore an asymptotically
stable equilibrium point. That is, if we start with any point $(x_i, y_i)$ close to $(1, -1)$ as an
initial condition and plot a trajectory, it approaches $(1, -1)$. In other words,

\[
\lim_{t \to \infty} \bigl(x(t), y(t)\bigr) = (1, -1).
\]

Figure 8.3: The phase portrait with a few sample trajectories of $x' = -y - x^2$, $y' = -x + y^2$.

As you can see from the diagram, this behavior is true even for some initial points quite far
from (1, −1), but it is definitely not true for all initial points.
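
The approach to the sink in Example 8.2.1 can be observed numerically. The sketch below is illustrative only: it uses the classical fourth-order Runge-Kutta method (a standard integrator, not something the text prescribes), and the starting point $(1.2, -0.8)$, the step size, and the helper names are assumptions made here.

```python
def field(x, y):
    # Example 8.2.1: x' = -y - x^2, y' = -x + y^2.
    return (-y - x * x, -x + y * y)


def rk4_step(F, x, y, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1x, k1y = F(x, y)
    k2x, k2y = F(x + h * k1x / 2, y + h * k1y / 2)
    k3x, k3y = F(x + h * k2x / 2, y + h * k2y / 2)
    k4x, k4y = F(x + h * k3x, y + h * k3y)
    return (x + h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6,
            y + h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6)


x, y = 1.2, -0.8            # hypothetical initial condition near (1, -1)
for _ in range(2000):       # integrate from t = 0 to t = 20 with h = 0.01
    x, y = rk4_step(field, x, y, 0.01)
print(x, y)
```

The printed point should be very close to $(1, -1)$. A starting point near the saddle at the origin would generally behave quite differently.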
Example 8.2.2: Let us look at $x' = y + y^2 e^x$, $y' = x$. First let us find the critical points.
These are the points where $y + y^2 e^x = 0$ and $x = 0$. Simplifying we get $0 = y + y^2 = y(y + 1)$.
So the critical points are $(0, 0)$ and $(0, -1)$, and hence are isolated. Let us compute the
Jacobian matrix:

\[
\begin{bmatrix} y^2 e^x & 1 + 2y e^x \\ 1 & 0 \end{bmatrix}.
\]

At the point $(0, 0)$ we get the matrix $\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$ and so the two eigenvalues are $1$ and $-1$. As
the matrix is invertible, the system is almost linear at $(0, 0)$. And, as the eigenvalues are
real and of opposite signs, we get a saddle point, which is an unstable equilibrium
point.
At the point $(0, -1)$ we get the matrix $\begin{bmatrix} 1 & -1 \\ 1 & 0 \end{bmatrix}$ whose eigenvalues are $\frac{1}{2} \pm i \frac{\sqrt{3}}{2}$. The matrix
is invertible, and so the system is almost linear at $(0, -1)$. As we have complex eigenvalues
with positive real part, the critical point is a spiral source, and therefore an unstable
equilibrium point.
See Figure 8.4 on the following page for the phase diagram. Notice the two critical
points, and the behavior of the arrows in the vector field around these points.
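
The eigenvalues quoted at $(0, -1)$ in Example 8.2.2 can be verified from the characteristic polynomial of the matrix there. This is only a check of the stated values:

\[
\det \begin{bmatrix} 1 - \lambda & -1 \\ 1 & -\lambda \end{bmatrix}
= -\lambda(1 - \lambda) + 1 = \lambda^2 - \lambda + 1 = 0,
\qquad
\lambda = \frac{1 \pm \sqrt{1 - 4}}{2} = \frac{1}{2} \pm i \frac{\sqrt{3}}{2}.
\]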

8.2.3 The trouble with centers


Recall, a linear system with a center means that trajectories travel in closed elliptical orbits
in some direction around the critical point. Such a critical point we call a center or a stable
center. It is not an asymptotically stable critical point, as the trajectories never approach the
critical point, but at least if you start sufficiently close to the critical point, you stay close to
the critical point. The simplest example of such behavior is the linear system with a center.
Another example is the critical point (0, 0) in Example 8.1.1 on page 352.

Figure 8.4: The phase portrait with a few sample trajectories of $x' = y + y^2 e^x$, $y' = x$.

The trouble with a center in a nonlinear system is that whether the trajectory goes
towards or away from the critical point is governed by the sign of the real part of the
eigenvalues of the Jacobian matrix, and the Jacobian matrix in a nonlinear system changes
from point to point. Since this real part is zero at the critical point itself, it can have either
sign nearby, meaning the trajectory could be pulled towards or away from the critical point.
Example 8.2.3: An example of such a problematic behavior is the system $x' = y$, $y' = -x + y^3$.
The only critical point is the origin $(0, 0)$. The Jacobian matrix is

\[
\begin{bmatrix} 0 & 1 \\ -1 & 3y^2 \end{bmatrix}.
\]

At $(0, 0)$ the Jacobian matrix is $\begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}$, which has eigenvalues $\pm i$. So the linearization has a
center.
Via the quadratic equation, the eigenvalues of the Jacobian matrix at any point $(x, y)$ are

\[
\lambda = \frac{3}{2} y^2 \pm i \frac{\sqrt{4 - 9y^4}}{2}.
\]
At any point where 𝑦 ≠ 0 (so at most points near the origin), the eigenvalues have a positive
real part (𝑦 2 can never be negative). This positive real part pulls the trajectory away from
the origin. A sample trajectory for an initial condition near the origin is given in Figure 8.5
on the facing page.
The moral of the example is that further analysis is needed when the linearization has a
center. The analysis will in general be more complicated than in the example above and
is more likely to involve case-by-case consideration. Such a complication should not be
surprising to you. By now in your mathematical career, you have seen many places where
a simple test is inconclusive, and more careful, perhaps ad hoc, analysis is required. Recall
for example when the second derivative test for maxima or minima is inconclusive.
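
A numerical experiment can illustrate, though not prove, the outward drift in Example 8.2.3. The sketch below integrates $x' = y$, $y' = -x + y^3$ with the classical fourth-order Runge-Kutta method and records the distance from the origin at a few times. The initial condition $(0.5, 0)$, the step size, and the sampling times are assumptions chosen here for illustration.

```python
import math


def field(x, y):
    # Example 8.2.3: x' = y, y' = -x + y^3.
    return (y, -x + y ** 3)


def rk4_step(F, x, y, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1x, k1y = F(x, y)
    k2x, k2y = F(x + h * k1x / 2, y + h * k1y / 2)
    k3x, k3y = F(x + h * k2x / 2, y + h * k2y / 2)
    k4x, k4y = F(x + h * k3x, y + h * k3y)
    return (x + h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6,
            y + h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6)


x, y = 0.5, 0.0                      # hypothetical initial condition
h = 0.001
distances = [math.hypot(x, y)]       # distance from the origin at t = 0
for step in range(1, 2001):          # integrate up to t = 2
    x, y = rk4_step(field, x, y, h)
    if step % 500 == 0:              # sample at t = 0.5, 1.0, 1.5, 2.0
        distances.append(math.hypot(x, y))
print(distances)
```

The sampled distances grow, consistent with the spiral source in Figure 8.5, even though the linearization at the origin is a center.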

Figure 8.5: An unstable critical point (spiral source) at the origin for $x' = y$, $y' = -x + y^3$, even if the
linearization has a center.

8.2.4 Conservative equations


An equation of the form
𝑥 ′′ + 𝑓 (𝑥) = 0,
where 𝑓 (𝑥) is an arbitrary function, is called a conservative equation. For example, the
pendulum equation is a conservative equation. The equations are conservative as there
is no friction in the system so the energy in the system is “conserved.” Let us write this
equation as a system of nonlinear ODEs.

𝑥 ′ = 𝑦, 𝑦 ′ = − 𝑓 (𝑥).

These types of equations have the advantage that we can solve for their trajectories easily.
The trick is to first think of $y$ as a function of $x$ for a moment. Then use the chain rule

\[
x'' = y' = \frac{dy}{dx} x' = y \frac{dy}{dx},
\]

where the prime indicates a derivative with respect to $t$. We obtain $y \frac{dy}{dx} + f(x) = 0$. We
integrate with respect to $x$ to get $\int y \frac{dy}{dx} \, dx + \int f(x) \, dx = C$. In other words

\[
\frac{1}{2} y^2 + \int f(x) \, dx = C.
\]

We obtained an implicit equation for the trajectories, with different $C$ giving different
trajectories. The value of $C$ is conserved on any trajectory. This expression is sometimes
called the Hamiltonian or the energy of the system. If you look back to § 1.8, you will notice
that $y \frac{dy}{dx} + f(x) = 0$ is an exact equation, and we just found a potential function.
Example 8.2.4: Let us find the trajectories for the equation $x'' + x - x^2 = 0$, which is the
equation from Example 8.1.1 on page 352. The corresponding first-order system is

\[
x' = y, \qquad y' = -x + x^2.
\]

Trajectories satisfy

\[
\frac{1}{2} y^2 + \frac{1}{2} x^2 - \frac{1}{3} x^3 = C.
\]

We solve for $y$

\[
y = \pm \sqrt{-x^2 + \frac{2}{3} x^3 + 2C}.
\]
Plotting these graphs we get exactly the trajectories in Figure 8.1 on page 353. In
particular we notice that near the origin the trajectories are closed curves: they keep going
around the origin, never spiraling in or out. Therefore we discovered a way to verify that
the critical point at (0, 0) is a stable center. The critical point at (1, 0) is a saddle as we
already noticed. This example is typical for conservative equations.
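
The conservation of $C$ in Example 8.2.4 can be checked numerically. This is a minimal sketch, assuming a hypothetical initial condition $(0.5, 0)$ inside the region of closed orbits and using the classical fourth-order Runge-Kutta method; the names `energy` and `drift` are introduced here for illustration.

```python
def field(x, y):
    # First-order system for x'' + x - x^2 = 0: x' = y, y' = -x + x^2.
    return (y, -x + x * x)


def energy(x, y):
    # The conserved quantity (1/2) y^2 + (1/2) x^2 - (1/3) x^3.
    return 0.5 * y * y + 0.5 * x * x - x ** 3 / 3


def rk4_step(F, x, y, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1x, k1y = F(x, y)
    k2x, k2y = F(x + h * k1x / 2, y + h * k1y / 2)
    k3x, k3y = F(x + h * k2x / 2, y + h * k2y / 2)
    k4x, k4y = F(x + h * k3x, y + h * k3y)
    return (x + h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6,
            y + h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6)


x, y = 0.5, 0.0              # hypothetical initial condition
c0 = energy(x, y)
drift = 0.0
for _ in range(10000):       # integrate from t = 0 to t = 10 with h = 0.001
    x, y = rk4_step(field, x, y, 0.001)
    drift = max(drift, abs(energy(x, y) - c0))
print(c0, drift)
```

The largest deviation `drift` should be at the level of the integrator's error, which is what we expect if the exact trajectory stays on a level curve of the energy.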
Consider an arbitrary conservative equation $x'' + f(x) = 0$. All critical points occur
when $y = 0$ (the $x$-axis), that is, when $x' = 0$. The critical points are those points on the
$x$-axis where $f(x) = 0$. The trajectories are given by

\[
y = \pm \sqrt{-2 \int f(x) \, dx + 2C}.
\]

So all trajectories are mirrored across the $x$-axis. In particular, there can be no spiral sources
nor sinks. The Jacobian matrix is

\[
\begin{bmatrix} 0 & 1 \\ -f'(x) & 0 \end{bmatrix}.
\]

The critical point is almost linear if $f'(x) \neq 0$ at the critical point. Let $J$ denote the Jacobian
matrix. The eigenvalues of $J$ are solutions to

\[
0 = \det(J - \lambda I) = \lambda^2 + f'(x).
\]

Therefore $\lambda = \pm \sqrt{-f'(x)}$. In other words, either we get real eigenvalues of opposite signs
(if $f'(x) < 0$), or we get purely imaginary eigenvalues (if $f'(x) > 0$). There are only two
possibilities for critical points, either an unstable saddle point, or a stable center. There are
never any sinks or sources.
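
The sign test above is simple enough to write down directly. The following sketch applies it to Example 8.2.4, where $f(x) = x - x^2$; the function names are hypothetical, and the case $f'(x) = 0$ is reported separately because the critical point is then not almost linear.

```python
def classify_conservative(f_prime_value):
    """Classify a critical point (x, 0) of x'' + f(x) = 0 from the sign of f'(x)."""
    if f_prime_value > 0:
        return "stable center"       # purely imaginary eigenvalues
    if f_prime_value < 0:
        return "saddle"              # real eigenvalues of opposite signs
    return "not almost linear"       # f'(x) = 0, the test does not apply


def f_prime(x):
    # Derivative of f(x) = x - x^2 from Example 8.2.4.
    return 1 - 2 * x


for x in (0, 1):                     # the critical points are at x = 0 and x = 1
    print(x, classify_conservative(f_prime(x)))
```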

8.2.5 Exercises
Exercise 8.2.1: For the systems below, find and classify the critical points, also indicate if the
equilibria are stable, asymptotically stable, or unstable.
a) $x' = -x + 3x^2$, $y' = -y$ b) $x' = x^2 + y^2 - 1$, $y' = x$
c) $x' = y e^x$, $y' = y - x + y^2$

Exercise 8.2.2: Find the implicit equations of the trajectories of the following conservative systems.
Next find their critical points (if any) and classify them.

a) $x'' + x + x^3 = 0$ b) $\theta'' + \sin \theta = 0$
c) $z'' + (z - 1)(z + 1) = 0$ d) $x'' + x^2 + 1 = 0$

Exercise 8.2.3: Find and classify the critical point(s) of $x' = -x^2$, $y' = -y^2$.

Exercise 8.2.4: Suppose $x' = -xy$, $y' = x^2 - 1 - y$.

a) Show there are two spiral sinks at $(-1, 0)$ and $(1, 0)$.
b) For any initial point of the form $(0, y_0)$, find what is the trajectory.
c) Can a trajectory starting at $(x_0, y_0)$ where $x_0 > 0$ spiral into the critical point at $(-1, 0)$?
Why or why not?

Exercise 8.2.5: In the example $x' = y$, $y' = y^3 - x$ show that for any trajectory, the distance from
the origin is an increasing function. Conclude that the origin behaves like a spiral source. Hint:
Consider $f(t) = \bigl(x(t)\bigr)^2 + \bigl(y(t)\bigr)^2$ and show it has positive derivative.

Exercise 8.2.6: Suppose $f$ is always positive. Find the trajectories of $x'' + f(x') = 0$. Are there any
critical points?

Exercise 8.2.7: Suppose that $x' = f(x, y)$, $y' = g(x, y)$. Suppose that $g(x, y) > 1$ for all $x$ and $y$.
Are there any critical points? What can we say about the trajectories as $t$ goes to infinity?

Exercise 8.2.101: For the systems below, find and classify the critical points.

a) $x' = -x + x^2$, $y' = y$ b) $x' = y - y^2 - x$, $y' = -x$ c) $x' = xy$, $y' = x + y - 1$

Exercise 8.2.102: Find the implicit equations of the trajectories of the following conservative systems.
Next find their critical points (if any) and classify them.

a) $x'' + x^2 = 4$ b) $x'' + e^x = 0$ c) $x'' + (x + 1) e^x = 0$

Exercise 8.2.103: The conservative system $x'' + x^3 = 0$ is not almost linear. Classify its critical
point(s) nonetheless.

Exercise 8.2.104: Derive an analogous classification of critical points for equations in one dimension,
such as $x' = f(x)$ based on the derivative. A point $x_0$ is critical when $f(x_0) = 0$ and almost linear
if in addition $f'(x_0) \neq 0$. Figure out if the critical point is stable or unstable depending on the sign
of $f'(x_0)$. Explain. Hint: see § 1.6.
8.3 Applications of nonlinear systems


Note: 2 lectures, §6.3–§6.4 in [EP], §9.3, §9.5 in [BD]
In this section we study two very standard examples of nonlinear systems. First, we
look at the nonlinear pendulum equation. We saw the pendulum equation’s linearization
before, but we noted it was only valid for small angles and short times. Now we find
out what happens for large angles. Next, we look at the predator-prey equation, which
finds various applications in modeling problems in biology, chemistry, economics, and
elsewhere.

8.3.1 Pendulum
The first example we study is the pendulum equation $\theta'' + \frac{g}{L} \sin \theta = 0$. Here, $\theta$ is the angular
displacement, $g$ is the gravitational acceleration, and $L$ is the length of the pendulum. In
this equation we disregard friction, so we are talking about an idealized pendulum.
This equation is a conservative equation, so we can use our
analysis of conservative equations from the previous section. Let us
change the equation to a two-dimensional system in variables $(\theta, \omega)$
by introducing the new variable $\omega$:

\[
\begin{bmatrix} \theta \\ \omega \end{bmatrix}' =
\begin{bmatrix} \omega \\ -\frac{g}{L} \sin \theta \end{bmatrix}.
\]

The critical points of this system are when $\omega = 0$ and $-\frac{g}{L} \sin \theta = 0$, or in other words
if $\sin \theta = 0$. So the critical points are when $\omega = 0$ and $\theta$ is a multiple of $\pi$. That is,
the points are $\ldots, (-2\pi, 0), (-\pi, 0), (0, 0), (\pi, 0), (2\pi, 0), \ldots$. While there are infinitely many
critical points, they are all isolated. Let us compute the Jacobian matrix:

\[
\begin{bmatrix}
\frac{\partial}{\partial \theta} \bigl( \omega \bigr) & \frac{\partial}{\partial \omega} \bigl( \omega \bigr) \\
\frac{\partial}{\partial \theta} \bigl( -\frac{g}{L} \sin \theta \bigr) & \frac{\partial}{\partial \omega} \bigl( -\frac{g}{L} \sin \theta \bigr)
\end{bmatrix}
=
\begin{bmatrix} 0 & 1 \\ -\frac{g}{L} \cos \theta & 0 \end{bmatrix}.
\]

For conservative equations, there are two types of critical points: either stable centers
or saddle points. The eigenvalues of the Jacobian matrix are $\lambda = \pm \sqrt{-\frac{g}{L} \cos \theta}$.
The eigenvalues are going to be real when $\cos \theta < 0$. This happens at the odd
multiples of $\pi$. The eigenvalues are going to be purely imaginary when $\cos \theta > 0$.
This happens at the even multiples of $\pi$. Therefore the system has a stable center
at the points $\ldots, (-2\pi, 0), (0, 0), (2\pi, 0), \ldots$, and it has an unstable saddle at the points
$\ldots, (-3\pi, 0), (-\pi, 0), (\pi, 0), (3\pi, 0), \ldots$. Look at the phase diagram in Figure 8.6 on the next
page, where for simplicity we let $\frac{g}{L} = 1$.

Figure 8.6: Phase plane diagram and some trajectories of the nonlinear pendulum equation.

In the linearized equation we have only a single critical point, the center at $(0, 0)$. Now
we see more clearly what we meant when we said the linearization is good for small
angles. The horizontal axis is the deflection angle. The vertical axis is the angular velocity
of the pendulum. Suppose we start at $\theta = 0$ (no deflection), and we start with a small
angular velocity $\omega$. Then the trajectory keeps going around the critical point $(0, 0)$ in an
approximate circle. This corresponds to short swings of the pendulum back and forth.
When $\theta$ stays small, the trajectories really look like circles and hence are very close to our
linearization.
When we give the pendulum a big enough push, it goes across the top and keeps
spinning about its axis. This behavior corresponds to the wavy curves that do not cross the
horizontal axis in the phase diagram. Let us suppose we look at the top curves, when the
angular velocity 𝜔 is large and positive. Then the pendulum is going around and around
its axis. The velocity is going to be large when the pendulum is near the bottom, and the
velocity is the smallest when the pendulum is close to the top of its loop.
At each critical point, there is an equilibrium solution. Consider the solution 𝜃 = 0;
the pendulum is not moving and is hanging straight down. This is a stable place for the
pendulum to be, hence this is a stable equilibrium.
The other type of equilibrium solution is at the unstable point, for example 𝜃 = 𝜋. Here
the pendulum is upside down. Sure you can balance the pendulum this way and it will
stay, but this is an unstable equilibrium. Even the tiniest push will make the pendulum
start swinging wildly.
See Figure 8.7 on the following page for a diagram. The first picture is the stable
equilibrium 𝜃 = 0. The second corresponds to those “almost circles” in the phase diagram
around 𝜃 = 0 when the angular velocity is small. The third is the unstable equilibrium
𝜃 = 𝜋. The last picture corresponds to the wavy lines for large angular velocities.
The quantity

\[
\frac{1}{2} \omega^2 - \frac{g}{L} \cos \theta
\]

is conserved by any solution. This is the energy or the Hamiltonian of the system.

Figure 8.7: Various possibilities for the motion of the pendulum. Panels, left to right: $\theta = 0$; small angular velocities; $\theta = \pi$; large angular velocities.

We have a conservative equation and so (exercise) the trajectories are given by

\[
\omega = \pm \sqrt{\frac{2g}{L} \cos \theta + C},
\]

for various values of $C$. Let us look at the initial condition of $(\theta_0, 0)$, that is, we take the
pendulum to angle $\theta_0$, and just let it go (initial angular velocity 0). We plug the initial
conditions into the above and solve for $C$ to obtain

\[
C = -\frac{2g}{L} \cos \theta_0.
\]

Thus the expression for the trajectory is

\[
\omega = \pm \sqrt{\frac{2g}{L}} \sqrt{\cos \theta - \cos \theta_0}.
\]
Let us figure out the period. That is, the time it takes for the pendulum to swing back
and forth. We notice that the trajectory about the origin in the phase plane is symmetric
about both the 𝜃 and the 𝜔-axis. That is, in terms of 𝜃, the time it takes from 𝜃0 to −𝜃0 is
the same as it takes from −𝜃0 back to 𝜃0 . Furthermore, the time it takes from −𝜃0 to 0 is the
same as to go from 0 to 𝜃0 . Therefore, let us find how long it takes for the pendulum to go
from angle 0 to angle $\theta_0$, which is a quarter of the full oscillation, and then multiply by 4.
We figure out this time by finding $\frac{dt}{d\theta}$ and integrating from 0 to $\theta_0$. The period is four
times this integral. Let us stay in the region where $\omega$ is positive. Since $\omega = \frac{d\theta}{dt}$, inverting
we get

\[
\frac{dt}{d\theta} = \sqrt{\frac{L}{2g}} \, \frac{1}{\sqrt{\cos \theta - \cos \theta_0}}.
\]

Therefore the period $T$ is given by

\[
T = 4 \sqrt{\frac{L}{2g}} \int_0^{\theta_0} \frac{1}{\sqrt{\cos \theta - \cos \theta_0}} \, d\theta.
\]
The integral is an improper integral, and we cannot in general evaluate it symbolically. We
must resort to numerical approximation if we want to compute a particular $T$.
Recall from § 2.4, the linearized equation $\theta'' + \frac{g}{L} \theta = 0$ has period

\[
T_{\text{linear}} = 2\pi \sqrt{\frac{L}{g}}.
\]

We plot $T$, $T_{\text{linear}}$, and the relative error $\frac{T - T_{\text{linear}}}{T}$ in Figure 8.8. The relative error says how far
our approximation is from the real period percentage-wise. Note that $T_{\text{linear}}$ is simply a
constant; it does not change with the initial angle $\theta_0$. The actual period $T$ gets larger and
larger as $\theta_0$ gets larger. Notice how the relative error is small when $\theta_0$ is small. It is still
only 15% when $\theta_0 = \pi/2$, that is, a 90 degree angle. The error is 3.8% when starting at $\pi/4$, a
45 degree angle. At a 5 degree initial angle, the error is only 0.048%.
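
The quoted percentages can be reproduced with a short computation. The sketch below does not integrate the improper integral directly. Instead it relies on a standard fact that is not derived in this text: the substitution $\sin(\theta/2) = \sin(\theta_0/2) \sin \varphi$ turns the period integral into a complete elliptic integral of the first kind, which can be evaluated through the arithmetic-geometric mean, giving $T = T_{\text{linear}} / \operatorname{agm}\bigl(1, \cos(\theta_0/2)\bigr)$. The function names and the fixed iteration count are choices made here.

```python
import math


def agm(a, b, iterations=40):
    """Arithmetic-geometric mean of two positive numbers (converges very quickly)."""
    for _ in range(iterations):
        a, b = (a + b) / 2, math.sqrt(a * b)
    return (a + b) / 2


def period(theta0, g_over_L=1.0):
    """Period of the frictionless pendulum released from rest at angle 0 < theta0 < pi."""
    t_linear = 2 * math.pi / math.sqrt(g_over_L)
    return t_linear / agm(1.0, math.cos(theta0 / 2))


def relative_error(theta0):
    """Relative error (T - T_linear) / T, here with g/L = 1."""
    t_linear = 2 * math.pi
    t = period(theta0)
    return (t - t_linear) / t


for degrees in (90, 45, 5):
    print(degrees, relative_error(math.radians(degrees)))
```

The printed values should match the 15%, 3.8%, and 0.048% figures above to the stated precision.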

Figure 8.8: The plot of $T$ and $T_{\text{linear}}$ with $\frac{g}{L} = 1$ (left), and the plot of the relative error $\frac{T - T_{\text{linear}}}{T}$ (right), for
$\theta_0$ between 0 and $\pi/2$.

While it is not immediately obvious from the formula, it is true that

\[
\lim_{\theta_0 \uparrow \pi} T = \infty.
\]

That is, the period goes to infinity as the initial angle approaches the unstable equilibrium
point. So if we put the pendulum almost upside down it may take a very long time before
it gets down. This is consistent with the limiting behavior, where the exactly upside down
pendulum never makes an oscillation, so we could think of that as infinite period.
8.3.2 Predator-prey or Lotka–Volterra systems


One of the most common simple applications of nonlinear systems is the so-called
predator-prey or Lotka–Volterra‗ system. For example, these systems arise when two species
interact, one as the prey and one as the predator. It is then no surprise that the equations
also see applications in economics. The system also arises in chemical reactions. In biology,
this system of equations explains the natural periodic variations of populations of different
species in nature. Before the application of differential equations, these periodic variations
in the population baffled biologists.
We keep with the classical example of hares and foxes in a forest, as it is the easiest to
understand.
𝑥 = # of hares (the prey),
𝑦 = # of foxes (the predator).
When there are a lot of hares, there is plenty of food for the foxes, so the fox population
grows. However, when the fox population grows, the foxes eat more hares, so when there
are lots of foxes, the hare population should go down, and vice versa. The Lotka–Volterra
model proposes that this behavior is described by the system of equations

𝑥 ′ = (𝑎 − 𝑏𝑦)𝑥,
𝑦 ′ = (𝑐𝑥 − 𝑑)𝑦,

where 𝑎, 𝑏, 𝑐, 𝑑 are some parameters that describe the interaction of the foxes and hares† .
In this model, these are all positive numbers.
Let us analyze the idea behind this model. The model is a slightly more complicated
idea based on the exponential population model. First expand,

𝑥 ′ = (𝑎 − 𝑏𝑦)𝑥 = 𝑎𝑥 − 𝑏𝑦𝑥.

The hares are expected to simply grow exponentially in the absence of foxes. That is where
the 𝑎𝑥 term comes in, the growth in population is proportional to the population itself.
We are assuming the hares always find enough food and have enough space to reproduce.
However, there is another component −𝑏𝑦𝑥, that is, the population also is decreasing
proportionally to the number of foxes. Together we can write the equation as (𝑎 − 𝑏𝑦)𝑥, so
it is like exponential growth or decay but the constant depends on the number of foxes.
The equation for foxes is very similar, expand again

𝑦 ′ = (𝑐𝑥 − 𝑑)𝑦 = 𝑐𝑥𝑦 − 𝑑𝑦.

The foxes need food (hares) to reproduce: the more food, the bigger the rate of growth,
hence the 𝑐𝑥𝑦 term. On the other hand, there are natural deaths in the fox population, and
hence the −𝑑𝑦 term.
‗ Named for the American mathematician, chemist, and statistician Alfred James Lotka (1880–1949) and
the Italian mathematician and physicist Vito Volterra (1860–1940).
†This interaction does not end well for the hare.
Without further delay, let us start with an explicit example. Suppose the equations are
𝑥 ′ = (0.4 − 0.01𝑦)𝑥, 𝑦 ′ = (0.003𝑥 − 0.3)𝑦.
See Figure 8.9 for the phase portrait. In this example it makes sense to also plot 𝑥 and 𝑦 as
graphs with respect to time. Therefore the second graph in Figure 8.9 is the graph of 𝑥 and
𝑦 on the vertical axis (the prey 𝑥 is the thinner line with taller peaks), against time on the
horizontal axis. The particular solution graphed was with initial conditions of 20 foxes and
50 hares.

Figure 8.9: The phase portrait (left) and graphs of $x$ and $y$ for a sample solution (right).

Let us analyze what we see on the graphs. We work in the general setting rather than
putting in specific numbers. We start with finding the critical points. Set (𝑎 − 𝑏𝑦)𝑥 = 0,
and (𝑐𝑥 − 𝑑)𝑦 = 0. The first equation is satisfied if either 𝑥 = 0 or 𝑦 = 𝑎/𝑏 . If 𝑥 = 0, the
second equation implies 𝑦 = 0. If 𝑦 = 𝑎/𝑏 , the second equation implies 𝑥 = 𝑑/𝑐 . There are
two equilibria: at (0, 0) when there are no animals at all, and at (𝑑/𝑐 , 𝑎/𝑏 ). In our specific
example 𝑥 = 𝑑/𝑐 = 100, and 𝑦 = 𝑎/𝑏 = 40. This is the point where there are 100 hares and 40
foxes.
We compute the Jacobian matrix:

\[
\begin{bmatrix} a - by & -bx \\ cy & cx - d \end{bmatrix}.
\]

At the origin $(0, 0)$ we get the matrix $\begin{bmatrix} a & 0 \\ 0 & -d \end{bmatrix}$, so the eigenvalues are $a$ and $-d$, hence real
and of opposite signs. So the critical point at the origin is a saddle. This makes sense. If
you started with some foxes but no hares, then the foxes would go extinct, that is, you
would approach the origin. If you started with no foxes and a few hares, then the hares
would keep multiplying without check, and so you would go away from the origin.
OK, how about the other critical point at $(d/c, a/b)$? Here the Jacobian matrix becomes

\[
\begin{bmatrix} 0 & -\frac{bd}{c} \\ \frac{ac}{b} & 0 \end{bmatrix}.
\]

The eigenvalues satisfy $\lambda^2 + ad = 0$. In other words, $\lambda = \pm i \sqrt{ad}$. The eigenvalues being
purely imaginary, we are in the case where we cannot quite decide using only linearization.
We could have a stable center, spiral sink, or a spiral source. That is, the equilibrium could
be asymptotically stable, stable, or unstable. Of course I gave you a picture above that
seems to imply it is a stable center. But never trust a picture only. Perhaps the oscillations
are getting larger and larger, but only very slowly. Of course this would be bad as it would
imply something will go wrong with our population sooner or later. And I only graphed a
very specific example with very specific trajectories.
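The linearization at the two critical points can be checked numerically. The following is a minimal sketch, not part of the original text, assuming the constants 𝑎 = 0.4, 𝑏 = 0.01, 𝑐 = 0.003, 𝑑 = 0.3 from Exercise 8.3.102 (they give 𝑑/𝑐 = 100 and 𝑎/𝑏 = 40, matching the specific example). The helper names `jacobian` and `eigenvalues_2x2` are hypothetical.

```python
import cmath

# Assumed constants (borrowed from Exercise 8.3.102): d/c = 100, a/b = 40.
a, b, c, d = 0.4, 0.01, 0.003, 0.3


def jacobian(x, y):
    """Jacobian of x' = (a - b y) x, y' = (c x - d) y at the point (x, y)."""
    return [[a - b * y, -b * x],
            [c * y, c * x - d]]


def eigenvalues_2x2(m):
    """Eigenvalues of a 2-by-2 matrix, computed from trace and determinant."""
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    root = cmath.sqrt(tr * tr - 4 * det)
    return (tr + root) / 2, (tr - root) / 2


# Saddle at the origin: the eigenvalues should be a and -d.
origin_eigs = eigenvalues_2x2(jacobian(0.0, 0.0))
# Interior critical point: eigenvalues should be (numerically) purely imaginary.
interior_eigs = eigenvalues_2x2(jacobian(d / c, a / b))

print("origin:  ", origin_eigs)
print("interior:", interior_eigs)
```

Up to rounding, the interior eigenvalues have zero real part, which is exactly the case where linearization alone does not settle stability.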
How can we be sure we are in the stable situation? As we said before, in the case of
purely imaginary eigenvalues, we have to do a bit more work. Previously we found that for
conservative systems, there was a certain quantity that was conserved on the trajectories,
and hence the trajectories had to go in closed loops. We can use a similar technique here.
We just have to figure out what is the conserved quantity. After some trial and error we
find the constant
\[
C = \frac{y^a x^d}{e^{cx+by}} = y^a x^d e^{-cx-by}
\]
is conserved. Such a quantity is called the constant of motion. Let us check 𝐶 really is a
constant of motion. How do we check, you say? Well, a constant is something that does
not change with time, so let us compute the derivative with respect to time:

\[
C' = a y^{a-1} y' x^d e^{-cx-by} + y^a d x^{d-1} x' e^{-cx-by} + y^a x^d e^{-cx-by} (-c x' - b y') .
\]

Our equations give us what 𝑥 ′ and 𝑦 ′ are so let us plug those in:

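\[
\begin{aligned}
C' &= a y^{a-1} (cx - d) y \, x^d e^{-cx-by} + y^a d x^{d-1} (a - by) x \, e^{-cx-by} \\
&\qquad + y^a x^d e^{-cx-by} \bigl( -c(a - by)x - b(cx - d)y \bigr) \\
&= y^a x^d e^{-cx-by} \Bigl( a(cx - d) + d(a - by) + \bigl( -c(a - by)x - b(cx - d)y \bigr) \Bigr) \\
&= 0 .
\end{aligned}
\]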

So along the trajectories 𝐶 is constant. In fact, the expression \(C = \frac{y^a x^d}{e^{cx+by}}\) gives us an implicit
equation for the trajectories. In any case, once we have found this constant of motion, it
must be true that the trajectories are simple curves, namely the level curves of \(\frac{y^a x^d}{e^{cx+by}}\). It turns out,
the critical point at (𝑑/𝑐 , 𝑎/𝑏 ) is a maximum for 𝐶 (left as an exercise). So (𝑑/𝑐 , 𝑎/𝑏 ) is a stable
equilibrium point, and we do not have to worry about the foxes and hares going extinct or
their populations exploding.
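The conservation of 𝐶 can also be observed numerically. Below is a rough sketch, assuming the constants from Exercise 8.3.102 and a hand-written classical Runge–Kutta step (the function names are hypothetical). It tracks ln 𝐶 = 𝑎 ln 𝑦 + 𝑑 ln 𝑥 − 𝑐𝑥 − 𝑏𝑦 along a computed trajectory starting at 50 hares and 20 foxes and reports the largest deviation from the initial value.

```python
import math

# Assumed constants (borrowed from Exercise 8.3.102).
a, b, c, d = 0.4, 0.01, 0.003, 0.3


def rhs(x, y):
    """Right-hand side of x' = (a - b y) x, y' = (c x - d) y."""
    return (a - b * y) * x, (c * x - d) * y


def rk4_step(x, y, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1x, k1y = rhs(x, y)
    k2x, k2y = rhs(x + h / 2 * k1x, y + h / 2 * k1y)
    k3x, k3y = rhs(x + h / 2 * k2x, y + h / 2 * k2y)
    k4x, k4y = rhs(x + h * k3x, y + h * k3y)
    return (x + h / 6 * (k1x + 2 * k2x + 2 * k3x + k4x),
            y + h / 6 * (k1y + 2 * k2y + 2 * k3y + k4y))


def log_C(x, y):
    """Logarithm of the constant of motion C = y^a x^d e^(-c x - b y)."""
    return a * math.log(y) + d * math.log(x) - c * x - b * y


def max_drift(x0, y0, h=0.01, steps=10000):
    """Largest deviation of ln C from its initial value along the run."""
    x, y = x0, y0
    reference = log_C(x, y)
    worst = 0.0
    for _ in range(steps):
        x, y = rk4_step(x, y, h)
        worst = max(worst, abs(log_C(x, y) - reference))
    return worst


print(max_drift(50.0, 20.0))  # hares x = 50, foxes y = 20
```

A tiny drift is what one expects if 𝐶 really is conserved; whatever drift remains comes from the numerical method, not from the model.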
One blemish on this wonderful model is that the numbers of foxes and hares are discrete
quantities and we are modeling with continuous variables. Our model has no problem
with there being 0.1 fox in the forest for example, while in reality that makes no sense. The
approximation is a reasonable one as long as the numbers of foxes and hares are large, but
it does not make much sense for small numbers. One must be careful in interpreting any
results from such a model.

An interesting consequence (perhaps counterintuitive) of this model is that adding
animals to the forest might lead to extinction, because the variations will get too big, and
one of the populations will get close to zero. For example, suppose there are 20 foxes and
50 hares as before, but now we bring in more foxes, bringing their number to 200. If we
run the computation, we find the number of hares will plummet to just slightly more than
1 hare in the whole forest. In reality that most likely means the hares die out, and then the
foxes will die out as well as they will have nothing to eat.
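One way to reproduce this computation without solving the differential equation is to use the constant of motion. The sketch below is an illustration under stated assumptions: the constants are those of Exercise 8.3.102, and the function `min_hares` is a hypothetical helper. The hare population is smallest where 𝑥′ = 0, that is, where 𝑦 = 𝑎/𝑏, so we solve ln 𝐶(𝑥, 𝑎/𝑏) = ln 𝐶(𝑥₀, 𝑦₀) for the smaller root 𝑥 by bisection.

```python
import math

# Assumed constants (borrowed from Exercise 8.3.102).
a, b, c, d = 0.4, 0.01, 0.003, 0.3


def log_C(x, y):
    """Logarithm of the constant of motion C = y^a x^d e^(-c x - b y)."""
    return a * math.log(y) + d * math.log(x) - c * x - b * y


def min_hares(x0, y0, tol=1e-10):
    """Smallest hare population on the closed trajectory through (x0, y0)."""
    level = log_C(x0, y0)
    # At the extreme values of x we have x' = 0, so y = a/b there.
    target = level - (a * math.log(a / b) - b * (a / b))
    # d*ln(x) - c*x is increasing on (0, d/c), so bisection finds the
    # smaller of the two roots, which is the minimum hare population.
    lo, hi = 1e-12, d / c
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if d * math.log(mid) - c * mid < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


print(min_hares(50.0, 200.0))  # 50 hares, 200 foxes
```

The result is a value a little above 1, in line with the text. The same function applied to other starting populations shows how quickly the minimum drops as the starting point moves away from the equilibrium.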
Showing that a system of equations has a stable solution can be a very difficult problem.
When Isaac Newton put forth his laws of motion and gravitation, he proved that a single
planet orbiting a single sun is a stable system. But any solar system with more than 1
planet proved very difficult indeed. In fact, such a system behaves chaotically (see § 8.5),
meaning small changes in initial conditions lead to very different long-term outcomes.
From numerical experimentation and measurements, we know the earth will not fly out
into the empty space or crash into the sun, for at least some millions of years or so. But we
do not know what happens beyond that.

8.3.3 Exercises
Exercise 8.3.1: Take the damped nonlinear pendulum equation 𝜃′′ + 𝜇𝜃 ′ + ( 𝑔/𝐿) sin 𝜃 = 0 for
some 𝜇 > 0 (that is, there is some friction).
a) Suppose 𝜇 = 1 and 𝑔/𝐿 = 1 for simplicity, find and classify the critical points.
b) Do the same for any 𝜇 > 0 and any 𝑔 and 𝐿, but such that the damping is small, in particular,
\(\mu^2 < 4(g/L)\).
c) Explain what your findings mean, and whether they agree with what you expect in reality.
Exercise 8.3.2: Suppose the hares do not grow exponentially, but logistically. In particular consider
\(x' = (0.4 - 0.01y)x - \gamma x^2, \quad y' = (0.003x - 0.3)y .\)
For the following two values of 𝛾, find and classify all the critical points in the positive quadrant,
that is, for 𝑥 ≥ 0 and 𝑦 ≥ 0. Then sketch the phase diagram. Discuss the implication for the long
term behavior of the population.
a) 𝛾 = 0.001, b) 𝛾 = 0.01.
Exercise 8.3.3:
a) Suppose 𝑥 and 𝑦 are positive variables. Show \(\frac{yx}{e^{x+y}}\) attains a maximum at (1, 1).
b) Suppose 𝑎, 𝑏, 𝑐, 𝑑 are positive constants, and also suppose 𝑥 and 𝑦 are positive variables.
Show \(\frac{y^a x^d}{e^{cx+by}}\) attains a maximum at (𝑑/𝑐 , 𝑎/𝑏 ).
Exercise 8.3.4: Suppose that for the pendulum equation we take a trajectory giving the spinning-around
motion, for example \(\omega = \sqrt{\frac{2g}{L} \cos\theta + \frac{2g}{L} + \omega_0^2}\). This is the trajectory where the lowest
angular velocity is \(\omega_0\). Find an integral expression for how long it takes the pendulum to go all the
way around.

Exercise 8.3.5 (challenging): Take the pendulum, suppose the initial position is 𝜃 = 0.

a) Find the expression for 𝜔 giving the trajectory with initial condition (0, 𝜔0 ). Hint: Figure
out what 𝐶 should be in terms of 𝜔0 .
b) Find the crucial angular velocity 𝜔1 , such that for any higher initial angular velocity, the
pendulum will keep going around its axis, and for any lower initial angular velocity, the
pendulum will simply swing back and forth. Hint: When the pendulum doesn’t go over the
top the expression for 𝜔 will be undefined for some 𝜃s.
c) What do you think happens if the initial condition is (0, 𝜔1 ), that is, the initial angle is 0, and
the initial angular velocity is exactly 𝜔1 .

Exercise 8.3.101: Take the damped nonlinear pendulum equation 𝜃 ′′ + 𝜇𝜃 ′ + ( 𝑔/𝐿) sin 𝜃 = 0 for
some 𝜇 > 0 (that is, there is friction). Suppose the friction is large, in particular \(\mu^2 > 4(g/L)\).

a) Find and classify the critical points.


b) Explain what your findings mean, and whether they agree with what you expect in reality.

Exercise 8.3.102: Suppose we have the predator-prey system where the foxes are also killed
at a constant rate ℎ (ℎ foxes killed per unit time): 𝑥 ′ = (𝑎 − 𝑏𝑦)𝑥, 𝑦 ′ = (𝑐𝑥 − 𝑑)𝑦 − ℎ.

a) Find the critical points and the Jacobian matrices of the system.
b) Put in the constants 𝑎 = 0.4, 𝑏 = 0.01, 𝑐 = 0.003, 𝑑 = 0.3, ℎ = 10. Analyze the critical
points. What do you think it says about the forest?

Exercise 8.3.103 (challenging): Suppose the foxes never die. That is, we have the system
𝑥 ′ = (𝑎 − 𝑏𝑦)𝑥, 𝑦 ′ = 𝑐𝑥𝑦. Find the critical points and notice they are not isolated. What will
happen to the population in the forest if it starts at some positive numbers? Hint: Think of the
constant of motion.

8.4 Limit cycles


Note: less than 1 lecture, discussed in §6.1 and §6.4 in [EP] , §9.7 in [BD]
For nonlinear systems, trajectories do not simply need to approach or leave a single
point. They may in fact approach a larger set, such as a circle or another closed curve.
Example 8.4.1: The Van der Pol oscillator‗ is the following equation

\[
x'' - \mu (1 - x^2) x' + x = 0,
\]

where 𝜇 is some positive constant. The Van der Pol oscillator originated with electrical
circuits, but finds applications in diverse fields such as biology, seismology, and other
physical sciences.
For simplicity, let us use 𝜇 = 1. A phase diagram is given in the left-hand plot in
Figure 8.10. Notice how the trajectories seem to very quickly settle on a closed curve. On
the right-hand side is the plot of a single solution for 𝑡 = 0 to 𝑡 = 30 with initial conditions
𝑥(0) = 0.1 and 𝑥 ′(0) = 0.1. The solution quickly tends to a periodic solution.
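The approach to the closed curve can be reproduced with a short computation. The following is a minimal sketch, not the method used to draw the figure: it assumes the first-order form 𝑥′ = 𝑦, 𝑦′ = 𝜇(1 − 𝑥²)𝑦 − 𝑥 and a hand-written Runge–Kutta step, and the name `late_amplitude` is hypothetical. It measures the largest |𝑥| late in the run for two rather different starting points.

```python
def vdp_rhs(x, y, mu):
    """Van der Pol equation as a first-order system with y = x'."""
    return y, mu * (1 - x * x) * y - x


def rk4_step(x, y, h, mu):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1x, k1y = vdp_rhs(x, y, mu)
    k2x, k2y = vdp_rhs(x + h / 2 * k1x, y + h / 2 * k1y, mu)
    k3x, k3y = vdp_rhs(x + h / 2 * k2x, y + h / 2 * k2y, mu)
    k4x, k4y = vdp_rhs(x + h * k3x, y + h * k3y, mu)
    return (x + h / 6 * (k1x + 2 * k2x + 2 * k3x + k4x),
            y + h / 6 * (k1y + 2 * k2y + 2 * k3y + k4y))


def late_amplitude(x0, y0, mu=1.0, t_end=30.0, t_from=20.0, h=0.01):
    """Largest |x(t)| for t_from <= t <= t_end, starting from (x0, y0)."""
    x, y = x0, y0
    amplitude = 0.0
    for n in range(int(round(t_end / h))):
        x, y = rk4_step(x, y, h, mu)
        if (n + 1) * h >= t_from:
            amplitude = max(amplitude, abs(x))
    return amplitude


# A start near the origin and a start well outside the closed curve.
print(late_amplitude(0.1, 0.1))
print(late_amplitude(3.0, 0.0))
```

Both runs should report nearly the same late-time amplitude, which is what one expects if both solutions have settled onto the same closed curve.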


Figure 8.10: The phase portrait (left) and a graph of a sample solution (right) of the Van der Pol oscillator.

The Van der Pol oscillator is an example of so-called relaxation oscillation. The word
relaxation comes from the sudden jump (the very steep part of the solution). For larger 𝜇
the steep part becomes even more pronounced, while for small 𝜇 the limit cycle looks more like a
circle. In fact, setting 𝜇 = 0, we get 𝑥 ′′ + 𝑥 = 0, which is a linear system with a center, and
all trajectories become circles.
A trajectory in the phase portrait that is a closed curve (a curve that is a loop) is called a
closed trajectory. A limit cycle is a closed trajectory such that at least one other trajectory
spirals into it (or spirals out of it). For example, the closed curve in the phase portrait for
the Van der Pol equation is a limit cycle. If all trajectories that start near the limit cycle
spiral into it, the limit cycle is called asymptotically stable. The limit cycle in the Van der Pol
oscillator is asymptotically stable.
‗ Named for the Dutch physicist Balthasar van der Pol (1889–1959).
Given a closed trajectory on an autonomous system, any solution that starts on it is
periodic. Such a curve is called a periodic orbit. More precisely, if \(\bigl(x(t), y(t)\bigr)\) is a solution
such that for some \(t_0\) the point \(\bigl(x(t_0), y(t_0)\bigr)\) lies on a periodic orbit, then both 𝑥(𝑡) and 𝑦(𝑡)
are periodic functions (with the same period). That is, there is some number 𝑃 such that
𝑥(𝑡) = 𝑥(𝑡 + 𝑃) and 𝑦(𝑡) = 𝑦(𝑡 + 𝑃).
Consider the system
𝑥 ′ = 𝑓 (𝑥, 𝑦), 𝑦 ′ = 𝑔(𝑥, 𝑦), (8.2)
where the functions 𝑓 and 𝑔 have continuous derivatives in some region 𝑅 in the plane.
Theorem 8.4.1 (Poincaré–Bendixson‗ ). Suppose 𝑅 is a closed bounded region (a region in the
plane that includes its boundary and does not have points arbitrarily far from the origin). Suppose
\(\bigl(x(t), y(t)\bigr)\) is a solution of (8.2) in 𝑅 that exists for all \(t \geq t_0\). Then either the solution is a periodic
function, or the solution tends towards a periodic solution in 𝑅.
The main point of the theorem is that if you find one solution that exists for all 𝑡 large
enough (that is, as 𝑡 goes to infinity) and stays within a bounded region, then you have
found either a periodic orbit, or a solution that spirals towards a limit cycle or tends to a
critical point. That is, in the long term, the behavior is very close to a periodic function.
Note that a constant solution at a critical point is periodic (with any period). The theorem is
more a qualitative statement rather than something to help us in computations. In practice
it is hard to find analytic solutions and so hard to show rigorously that they exist for all time.
But if we think the solution exists we numerically solve for a large time to approximate the
limit cycle. Another caveat is that the theorem only works in two dimensions. In three
dimensions and higher, there is simply too much room.
The theorem applies to all solutions in the Van der Pol oscillator. Solutions that start at
any point except the origin (0, 0) tend to the periodic solution around the limit cycle, and
the initial condition of (0, 0) gives the constant solution 𝑥 = 0, 𝑦 = 0.
Example 8.4.2: Consider
\[
x' = y + (x^2 + y^2 - 1)^2 x, \qquad y' = -x + (x^2 + y^2 - 1)^2 y .
\]

A vector field along with solutions with initial conditions (1.02, 0), (0.9, 0), and (0.1, 0) are
drawn in Figure 8.11 on the facing page.
Notice that points on the unit circle (distance one from the origin) satisfy \(x^2 + y^2 - 1 = 0\).
And 𝑥(𝑡) = sin(𝑡), 𝑦(𝑡) = cos(𝑡) is a solution of the system. Therefore we have a closed
trajectory. For points off the unit circle, the second term in 𝑥 ′ pushes the solution further
away from the 𝑦-axis than the system 𝑥 ′ = 𝑦, 𝑦 ′ = −𝑥, and 𝑦 ′ pushes the solution further
away from the 𝑥-axis than the linear system 𝑥 ′ = 𝑦, 𝑦 ′ = −𝑥. In other words for all other
initial conditions the trajectory will spiral out.
‗ Ivar Otto Bendixson (1861–1935) was a Swedish mathematician.

Figure 8.11: Unstable limit cycle example.

This means that for initial conditions inside the unit circle, the solution spirals out
towards the periodic solution on the unit circle, and for initial conditions outside the unit
circle the solutions spiral off towards infinity. Therefore the unit circle is a limit cycle, but
not an asymptotically stable one. The Poincaré–Bendixson Theorem applies to the initial
points inside the unit circle, as those solutions stay bounded, but not to those outside, as
those solutions go off to infinity.

A very similar analysis applies to the system

\[
x' = y + (x^2 + y^2 - 1) x, \qquad y' = -x + (x^2 + y^2 - 1) y .
\]

We still obtain a closed trajectory on the unit circle, and points outside the unit circle spiral
out to infinity, but now points inside the unit circle spiral towards the critical point at the
origin. So this system does not have a limit cycle, even though it has a closed trajectory.
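The behavior of both systems may be easier to see in polar coordinates. The following is a sketch of that computation, not taken from the text: it assumes the standard identities \(r r' = x x' + y y'\) and \(r^2 \theta' = x y' - y x'\), where \(r^2 = x^2 + y^2\), and the variables 𝑟 and 𝜃 are introduced here only for illustration.

```latex
% System of Example 8.4.2 (squared factor):
\begin{align*}
r r' &= x x' + y y' = (r^2 - 1)^2 (x^2 + y^2) = (r^2 - 1)^2 r^2
  & &\Longrightarrow & r' &= r (r^2 - 1)^2 \geq 0 , \\
r^2 \theta' &= x y' - y x' = -x^2 - y^2 = -r^2
  & &\Longrightarrow & \theta' &= -1 .
\end{align*}
% Second system (factor not squared):
\begin{align*}
r' &= r (r^2 - 1) , & \theta' &= -1 .
\end{align*}
```

In the first system 𝑟 never decreases and is constant only for 𝑟 = 0 or 𝑟 = 1, so solutions inside the unit circle move out towards it and solutions outside move away. In the second system 𝑟′ is negative for 0 < 𝑟 < 1 and positive for 𝑟 > 1, so nothing spirals into or out of the unit circle from nearby.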
Due to the Picard theorem (Theorem 3.1.1 on page 124) we find that no matter where
we are in the plane we can always find a solution a little bit further in time, as long as 𝑓 and
𝑔 have continuous derivatives. So if we find a closed trajectory in an autonomous system,
then for every initial point inside the closed trajectory, the solution will exist for all time
and it will stay bounded (it will stay inside the closed trajectory). So the moment we found
the solution above going around the unit circle, we knew that for every initial point inside
the circle, the solution exists for all time and the Poincaré–Bendixson theorem applies.

Let us next look for conditions when limit cycles (or periodic orbits) do not exist. We
assume the equation (8.2) is defined on a simply connected region, that is, a region with no
holes we can go around. For example the entire plane is a simply connected region, and
so is the inside of the unit disc. However, the entire plane minus a point is not a simply
connected region as it has a “hole” at the origin.

Theorem 8.4.2 (Bendixson–Dulac‗ ). Suppose 𝑅 is a simply connected region, and the expression†
\[
\frac{\partial f}{\partial x} + \frac{\partial g}{\partial y}
\]
is either always positive or always negative on 𝑅 (except perhaps a small set such as on isolated
points or curves). Then the system (8.2) has no closed trajectory inside 𝑅.
The theorem gives a way of ruling out the existence of a closed trajectory, and hence
a way of ruling out limit cycles. The expression \(\frac{\partial f}{\partial x} + \frac{\partial g}{\partial y}\) may seem random, but it is
called the divergence of the vector field. Divergence measures how much the vector field is
“expanding” at any point and comes up in many other contexts. The theorem says that if
the vector field is either expanding everywhere on 𝑅 or contracting everywhere on 𝑅, then
there is no closed trajectory and so no limit cycle. Perhaps this is intuitive—if particles
travel along the vector fields and are getting further and further apart, then we do not
expect any particles to travel in loops. The exception about points or curves means that
divergence can be zero at a few points, or on a curve, but not on any larger set.
Example 8.4.3: Consider \(x' = y + y^2 e^x\), \(y' = x\) in the entire plane (see Example 8.2.2
on page 359). The plane is simply connected and the theorem applies. We compute
\(\frac{\partial f}{\partial x} + \frac{\partial g}{\partial y} = y^2 e^x + 0\). The function \(y^2 e^x\) is always positive except on the line 𝑦 = 0. Therefore,
via the theorem, the system has no closed trajectories.
A common informal, but not quite correct, way to state the theorem is to conclude there
are no periodic solutions that stay in 𝑅. The example above has two critical points and
hence it has constant solutions. Constant functions are periodic. The conclusion of the
theorem is that there exist no trajectories that form closed curves, or in other words, that
there exist no nonconstant periodic solutions that stay in 𝑅.
Example 8.4.4: Consider a somewhat more complicated example. Take the system \(x' = -y - x^2\),
\(y' = -x + y^2\) (see Example 8.2.1 on page 358). We compute \(\frac{\partial f}{\partial x} + \frac{\partial g}{\partial y} = -2x + 2y = 2(-x + y)\).
This expression takes on both signs, so if we are talking about the whole plane we
cannot simply apply the theorem. However, we could apply it on the set where −𝑥 + 𝑦 ≥ 0.
Via the theorem, there is no closed trajectory in that set. Similarly, there is no closed
trajectory in the set −𝑥 + 𝑦 ≤ 0. We cannot conclude (yet) that there is no closed trajectory
in the entire plane. For all we know, perhaps half of it is in the set where −𝑥 + 𝑦 ≥ 0 and
the other half is in the set where −𝑥 + 𝑦 ≤ 0.
The key is to look at the line where −𝑥 + 𝑦 = 0, or 𝑥 = 𝑦. On this line, \(x' = -y - x^2 = -x - x^2\)
and \(y' = -x + y^2 = -x + x^2\). In particular, if 𝑥 = 𝑦, then 𝑥 ′ ≤ 𝑦 ′. So the arrows, the vectors
(𝑥 ′ , 𝑦 ′), always point into the set where −𝑥 + 𝑦 ≥ 0. There is no way we can start in the
set where −𝑥 + 𝑦 ≥ 0 and go into the set where −𝑥 + 𝑦 ≤ 0. Once we are in the set where
−𝑥 + 𝑦 ≥ 0, we stay there. So no closed trajectory can have points in both sets.
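The claim that the arrows on the line 𝑥 = 𝑦 point into the set where −𝑥 + 𝑦 ≥ 0 can be written as a one-line computation. This is a sketch of the reasoning already given above, with the quantity 𝑦 − 𝑥 tracked along a solution at the moment it sits on the line:

```latex
\[
\frac{d}{dt}\bigl( y - x \bigr) \Big|_{x = y}
  = y' - x'
  = \bigl( -x + x^2 \bigr) - \bigl( -x - x^2 \bigr)
  = 2x^2 \geq 0 .
\]
```

So on the line 𝑥 = 𝑦 the quantity 𝑦 − 𝑥 is not decreasing, and a trajectory cannot cross from the side where −𝑥 + 𝑦 ≥ 0 to the side where −𝑥 + 𝑦 < 0 except possibly through the origin, where 𝑥′ = 𝑦′ = 0 and the solution is constant.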
‗ Henri Dulac (1870–1955) was a French mathematician.
† Usually the expression in the Bendixson–Dulac Theorem is \(\frac{\partial(\varphi f)}{\partial x} + \frac{\partial(\varphi g)}{\partial y}\) for some continuously
differentiable function 𝜑. For simplicity, let us just consider the case 𝜑 = 1.

Example 8.4.5: Consider \(x' = y + (x^2 + y^2 - 1)x\), \(y' = -x + (x^2 + y^2 - 1)y\) in the region 𝑅
given by \(x^2 + y^2 > \frac{1}{2}\). That is, 𝑅 is the region outside a circle of radius \(\frac{1}{\sqrt{2}}\) centered at the
origin. Then there is a closed trajectory in 𝑅, namely 𝑥 = sin(𝑡), 𝑦 = cos(𝑡). Furthermore,

\[
\frac{\partial f}{\partial x} + \frac{\partial g}{\partial y} = 4x^2 + 4y^2 - 2,
\]

which is always positive on 𝑅. So what is going on? The Bendixson–Dulac theorem does
not apply since the region 𝑅 is not simply connected: it has a hole, the disc we cut out!
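For completeness, here is a sketch of how the divergence in this example is computed, term by term, with 𝑓 and 𝑔 denoting the right-hand sides of the 𝑥′ and 𝑦′ equations:

```latex
\begin{align*}
\frac{\partial f}{\partial x}
  &= \frac{\partial}{\partial x} \Bigl[ y + (x^2 + y^2 - 1) x \Bigr]
   = 3x^2 + y^2 - 1 , \\
\frac{\partial g}{\partial y}
  &= \frac{\partial}{\partial y} \Bigl[ -x + (x^2 + y^2 - 1) y \Bigr]
   = x^2 + 3y^2 - 1 , \\
\frac{\partial f}{\partial x} + \frac{\partial g}{\partial y}
  &= 4x^2 + 4y^2 - 2
   = 4 \Bigl( x^2 + y^2 - \tfrac{1}{2} \Bigr) .
\end{align*}
```

The last form makes it plain that the divergence is positive exactly where \(x^2 + y^2 > \frac{1}{2}\), which is the region 𝑅.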

8.4.1 Exercises
Exercise 8.4.1: Show that the following systems have no closed trajectories.

a) \(x' = x^3 + y, \quad y' = y^3 + x^2\),
b) \(x' = e^{x-y}, \quad y' = e^{x+y}\),
c) \(x' = x + 3y^2 - y^3, \quad y' = y^3 + x^2\).

Exercise 8.4.2: Formulate a condition for a 2-by-2 linear system \(\vec{x}\,' = A \vec{x}\) to not be a center using
the Bendixson–Dulac theorem. That is, the theorem says something about certain elements of 𝐴.

Exercise 8.4.3: Explain why the Bendixson–Dulac Theorem does not apply for any conservative
system 𝑥 ′′ + ℎ(𝑥) = 0.

Exercise 8.4.4: A system such as 𝑥 ′ = 𝑥, 𝑦 ′ = 𝑦 has solutions that exist for all time 𝑡, yet there are
no closed trajectories. Explain why the Poincaré–Bendixson Theorem does not apply.

Exercise 8.4.5: Differential equations can also be given in different coordinate systems. Suppose we
have the system \(r' = 1 - r^2\), \(\theta' = 1\) given in polar coordinates. Find all the closed trajectories and
check if they are limit cycles and if so, if they are asymptotically stable or not.

Exercise 8.4.101: Show that the following systems have no closed trajectories.

a) \(x' = x + y^2, \quad y' = y + x^2\),
b) \(x' = -x \sin^2(y), \quad y' = e^x\),
c) \(x' = x y^2, \quad y' = x + x^2\).

Exercise 8.4.102: Suppose an autonomous system in the plane has a solution \(x = \cos(t) + e^{-t}\),
\(y = \sin(t) + e^{-t}\). What can you say about the system (in particular about limit cycles and periodic
solutions)?

Exercise 8.4.103: Show that the limit cycle of the Van der Pol oscillator (for 𝜇 > 0) must not lie
completely in the set where −1 < 𝑥 < 1. Compare with Figure 8.10 on page 373.

Exercise 8.4.104: Suppose we have the system 𝑟 ′ = sin(𝑟), 𝜃′ = 1 given in polar coordinates. Find
all the closed trajectories.

8.5 Chaos
Note: 1 lecture, §6.5 in [EP], §9.8 in [BD]
You have surely heard the story about the flap of a butterfly wing in the Amazon
causing hurricanes in the North Atlantic. In a prior section, we mentioned that a small
change in initial conditions of the planets can lead to very different configuration of the
planets in the long term. These are examples of chaotic systems. Mathematical chaos is
not really chaos; there is precise order behind the scenes. Everything is still deterministic.
However a chaotic system is extremely sensitive to initial conditions. This also means even
small errors induced via numerical approximation create large errors very quickly, so it is
almost impossible to numerically approximate for long times. This is a large part of the
trouble, as chaotic systems cannot be in general solved analytically.
Take the weather, the most well-known chaotic system. A small change in the initial
conditions (the temperature at every point of the atmosphere for example) produces
drastically different predictions in relatively short time, and so we cannot accurately
predict weather. And we do not actually know the exact initial conditions. We measure
temperatures at a few points with some error, and then we somehow estimate what is
in between. There is no way we can accurately measure the effects of every butterfly
wing. Then we solve the equations numerically introducing new errors. You should not
trust weather prediction more than a few days out. But we will see we can still get some
information about a chaotic system on a longer time scale. In the context of weather, we
can study the climate.
Chaotic behavior was first noticed by Edward Lorenz‗ in the 1960s when trying to
model thermally induced air convection (movement). Lorenz was looking at the relatively
simple system:

\[
x' = -10x + 10y, \qquad y' = 28x - y - xz, \qquad z' = -\frac{8}{3} z + xy .
\]
A small change in the initial conditions yields a very different solution after a reasonably
short time.
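The sensitivity to initial conditions is easy to observe numerically. Here is a minimal sketch, with several assumptions that are not from the text: the starting point (1, 1, 1), the perturbation size 10⁻⁸, the step size, and the hand-written Runge–Kutta step are all illustrative choices, and `max_separation` is a hypothetical helper. It follows two solutions that start almost at the same point and records how far apart they get.

```python
import math


def lorenz(state):
    """Right-hand side of the Lorenz system from the text."""
    x, y, z = state
    return (-10 * x + 10 * y, 28 * x - y - x * z, -8 / 3 * z + x * y)


def rk4_step(state, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    def shifted(s, k, w):
        return tuple(si + w * ki for si, ki in zip(s, k))

    k1 = lorenz(state)
    k2 = lorenz(shifted(state, k1, h / 2))
    k3 = lorenz(shifted(state, k2, h / 2))
    k4 = lorenz(shifted(state, k3, h))
    return tuple(s + h / 6 * (p + 2 * q + 2 * r + w)
                 for s, p, q, r, w in zip(state, k1, k2, k3, k4))


def max_separation(start, eps=1e-8, t_end=40.0, h=0.01):
    """Largest distance between two runs whose starts differ by eps in x."""
    u = start
    v = (start[0] + eps, start[1], start[2])
    worst = 0.0
    for _ in range(int(round(t_end / h))):
        u = rk4_step(u, h)
        v = rk4_step(v, h)
        worst = max(worst, math.dist(u, v))
    return worst


# Assumed starting point; any generic point should behave similarly.
print(max_separation((1.0, 1.0, 1.0)))
```

Even though the two starting points agree to about eight decimal places, the two computed solutions end up as far apart as two unrelated points on the trajectory. Shortening `t_end` shows roughly when the separation becomes visible.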
A simple example the reader can experiment with, and which displays
chaotic behavior, is a double pendulum. The equations for this setup are
somewhat complicated, and their derivation is quite tedious, so we will not
bother to write them down. The idea is to put a pendulum on the end of
another pendulum. The movement of the bottom mass will appear chaotic.
This type of chaotic system is a basis for a whole number of office novelty
desk toys. It is simple to build a version. Take a piece of a string. Tie two
heavy nuts at different points of the string; one at the end, and one a bit
above. Now give the bottom nut a little push. As long as the swings are not too big and
the string stays tight, you have a double pendulum system.
‗ Edward Norton Lorenz (1917–2008) was an American mathematician and meteorologist.

8.5.1 Duffing equation and strange attractors


Let us study the so-called Duffing equation:

\[
x'' + a x' + b x + c x^3 = C \cos(\omega t) .
\]

Here 𝑎, 𝑏, 𝑐, 𝐶, and 𝜔 are constants. Except for the \(c x^3\) term, this equation looks like a
forced mass-spring system. The \(c x^3\) term means the spring does not exactly obey Hooke’s law
(which no real-world spring actually obeys exactly). When 𝑐 is not zero, the equation
does not have a closed form solution, so we must resort to numerical solutions, as is usual
for nonlinear systems. Not all choices of constants and initial conditions exhibit chaotic
behavior. Let us study
\[
x'' + 0.05 x' + x^3 = 8 \cos(t) .
\]
The equation is not autonomous, so we cannot draw the vector field in the phase plane.
We can still draw trajectories. In Figure 8.12, we plot trajectories for 𝑡 going from 0 to 15 for
two very close initial conditions (2, 3) and (2, 2.9), and also the solutions in the (𝑥, 𝑡) space.
The two trajectories are close at first, but after a while diverge significantly. This sensitivity
to initial conditions is precisely what we mean by the system behaving chaotically.


Figure 8.12: On the left, two trajectories in phase space for 0 ≤ 𝑡 ≤ 15 for the Duffing equation, one with
initial conditions (2, 3) and the other with (2, 2.9). On the right, the two solutions in (𝑥, 𝑡)-space.

Let us see the long term behavior. In Figure 8.13 on the following page, we plot the
behavior of the system for initial conditions (2, 3) for a longer period of time. It is hard to
see any particular pattern in the shape of the solution except that it seems to oscillate, but
each oscillation appears quite unique. The oscillation is expected due to the forcing term.
We mention that to produce the picture accurately, a ridiculously large number of steps‗
had to be used in the numerical algorithm, as even small errors quickly propagate in a
chaotic system.
‗ In fact for reference, 30,000 steps were used with the Runge–Kutta algorithm, see exercises in § 1.7.

Figure 8.13: The solution to the given Duffing equation for 𝑡 from 0 to 100.

It is very difficult to analyze chaotic systems, or to find the order behind the madness,
but let us try to do something that we did for the standard mass-spring system. One
way we analyzed the system is that we figured out what was the long term behavior (not
dependent on initial conditions). From the figure above, it is clear that we will not get a
nice exact description of the long term behavior for this chaotic system, but perhaps we
can find some order to what happens on each “oscillation” and what these oscillations
have in common.
The concept we explore is that of a Poincaré section‗ . Instead of looking at 𝑡 in a certain
interval, we look at where the system is at a certain sequence of points in time. Imagine
flashing a strobe at a fixed frequency and drawing the points where the solution is during
the flashes. The right strobing frequency depends on the system in question. The correct
frequency for the forced Duffing equation (and other similar systems) is the frequency of
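the forcing term. For the Duffing equation above, find a solution \(\bigl(x(t), y(t)\bigr)\), and look at
the points

\[
\bigl(x(0), y(0)\bigr), \quad \bigl(x(2\pi), y(2\pi)\bigr), \quad \bigl(x(4\pi), y(4\pi)\bigr), \quad \bigl(x(6\pi), y(6\pi)\bigr), \quad \ldots
\]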

As we are really not interested in the transient part of the solution, that is, the part of
the solution that depends on the initial condition, we skip some number of steps in the
beginning. For example, we might skip the first 100 such steps and start plotting points at
𝑡 = 100(2𝜋), that is,

\[
\bigl(x(200\pi), y(200\pi)\bigr), \quad \bigl(x(202\pi), y(202\pi)\bigr), \quad \bigl(x(204\pi), y(204\pi)\bigr), \quad \ldots
\]

The plot of these points is the Poincaré section. After plotting enough points, a curious
pattern emerges in Figure 8.14 on the next page (the left-hand picture), a so-called strange
attractor.
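The strobing procedure can be sketched in a few lines of code. This is an illustration under stated assumptions rather than the program used for the figure: it writes the Duffing equation above as the system 𝑥′ = 𝑦, 𝑦′ = −0.05𝑦 − 𝑥³ + 8 cos(𝑡), uses a hand-written Runge–Kutta step, and the function name `poincare_section` and the step counts are hypothetical choices. Many more points and smaller steps would be needed to reproduce a detailed picture.

```python
import math


def duffing_rhs(t, x, y):
    """x'' + 0.05 x' + x^3 = 8 cos(t) as a first-order system with y = x'."""
    return y, -0.05 * y - x ** 3 + 8 * math.cos(t)


def rk4_step(t, x, y, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1x, k1y = duffing_rhs(t, x, y)
    k2x, k2y = duffing_rhs(t + h / 2, x + h / 2 * k1x, y + h / 2 * k1y)
    k3x, k3y = duffing_rhs(t + h / 2, x + h / 2 * k2x, y + h / 2 * k2y)
    k4x, k4y = duffing_rhs(t + h, x + h * k3x, y + h * k3y)
    return (x + h / 6 * (k1x + 2 * k2x + 2 * k3x + k4x),
            y + h / 6 * (k1y + 2 * k2y + 2 * k3y + k4y))


def poincare_section(x0, y0, skip=100, count=100, steps_per_period=400):
    """Points (x, y) at t = 2*pi*k for k = skip, ..., skip + count - 1."""
    h = 2 * math.pi / steps_per_period
    x, y = x0, y0
    points = []
    for k in range(1, skip + count):
        # Integrate over one forcing period, ending at t = 2*pi*k.
        for j in range(steps_per_period):
            t = ((k - 1) * steps_per_period + j) * h
            x, y = rk4_step(t, x, y, h)
        if k >= skip:
            points.append((x, y))
    return points


section = poincare_section(2.0, 3.0)
print(len(section), section[:3])
```

Plotting the pairs in `section` as dots, with any plotting tool, gives a rough version of the Poincaré section. Because the system is chaotic, the individual points depend on the step size, but the set they trace out should not.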
‗ Named for the French polymath Jules Henri Poincaré (1854–1912).

Figure 8.14: Strange attractor. The left plot is with no phase shift, the right plot has phase shift 𝜋/4.

Given a sequence of points, an attractor is a set that the points in the sequence
eventually get closer and closer to, that is, they are attracted. The Poincaré section is not
really the attractor itself, but as the points are very close to it, we see its shape. The strange
attractor is a very complicated set. It has fractal structure, that is, if you zoom in as far as
you want, you keep seeing the same complicated structure.
The initial condition makes no difference. If we start with a different initial condition,
the points eventually gravitate towards the attractor, and so as long as we throw away
the first few points, we get the same picture. Similarly small errors in the numerical
approximations do not matter here.
An amazing thing is that a chaotic system such as the Duffing equation is not random
at all. There is a very complicated order to it, and the strange attractor says something
about this order. We cannot quite say what state the system will be in eventually, but given
the fixed strobing frequency we narrow it down to the points on the attractor.
If we use a phase shift, for example 𝜋/4, and look at the times
𝜋/4 , 2𝜋 + 𝜋/4, 4𝜋 + 𝜋/4, 6𝜋 + 𝜋/4, ...

we obtain a slightly different attractor. The picture is the right-hand side of Figure 8.14. It
is as if we had rotated, moved, and slightly distorted the original. For each phase shift you
can find the set of points to which the system periodically keeps coming back.
Study the pictures and notice especially the scales—where are these attractors located
in the phase plane. Notice the regions where the strange attractor lives and compare it to
the plot of the trajectories in Figure 8.12 on page 379.
Let us compare this section to the discussion in § 2.6 on forced oscillations. Consider
\[
x'' + 2p x' + \omega_0^2 x = \frac{F_0}{m} \cos(\omega t) .
\]
This is like the Duffing equation, but with no \(x^3\) term. The steady periodic solution is of
the form
\[
x = C \cos(\omega t + \gamma) .
\]
Strobing using the frequency 𝜔, we obtain a single point in the phase space. The attractor
in this setting is a single point, an expected result as the system is not chaotic. It is
the opposite of chaotic: Any difference induced by the initial conditions dies away very
quickly, and we settle into always the same steady periodic motion.

8.5.2 The Lorenz system


In two dimensions to find chaotic behavior, we must study forced, or non-autonomous,
systems such as the Duffing equation. The Poincaré–Bendixson Theorem says that a
solution to an autonomous two-dimensional system that exists for all time in the future
and does not go towards infinity is periodic or tends towards a periodic solution. Hardly
the chaotic behavior we are looking for.
In three dimensions, even autonomous systems can be chaotic. Let us very briefly
return to the Lorenz system

\[
x' = -10x + 10y, \qquad y' = 28x - y - xz, \qquad z' = -\frac{8}{3} z + xy .
\]
The Lorenz system is an autonomous system in three dimensions exhibiting chaotic behavior.
See Figure 8.15 for a sample trajectory, which is now a curve in three-dimensional space.


Figure 8.15: A trajectory in the Lorenz system.



The solutions tend to an attractor in space, the so-called Lorenz attractor. In this case, no
strobing is necessary; the solution will tend towards the attractor set. Again we cannot
quite see the attractor itself, but if we try to follow a solution for long enough, as in the
figure, we get a pretty good picture of what the attractor looks like. The Lorenz attractor is
also a strange attractor and has a complicated fractal structure. And, just as for the Duffing
equation, what we want to draw is not the whole trajectory, but start drawing the trajectory
after a while, once it is close to the attractor.
The path of the trajectory is not simply a repeating figure-eight. The trajectory spins
some seemingly random number of times on the left, then spins a number of times on the
right, and so on. As this system arose in weather prediction, one can perhaps imagine a few
days of warm weather and then a few days of cold weather, where it is not easy to predict
when the weather will change, just as it is not really easy to predict far in advance when
the solution will jump onto the other side. See Figure 8.16 for a plot of the 𝑥 component of
the solution drawn above. A negative 𝑥 corresponds to the left “loop” and a positive 𝑥
corresponds to the right “loop”. On the other hand, while we cannot predict the weather,
we can say something about the climate—the weather will be somewhere near the attractor.
Most of the mathematics we studied in this book is quite classical and well understood.
On the other hand, chaos, including the Lorenz system, continues to be the subject of
current research. Furthermore, chaos has found applications not just in the sciences, but
also in art.


Figure 8.16: Graph of the 𝑥(𝑡) component of the solution.

8.5.3 Exercises
Exercise 8.5.1: For the non-chaotic equation \(x'' + 2p x' + \omega_0^2 x = \frac{F_0}{m} \cos(\omega t)\), suppose we strobe
with frequency 𝜔 as we mentioned above. Use the known steady periodic solution to find precisely
the point which is the attractor for the Poincaré section.

Exercise 8.5.2 (project): A simple fractal attractor can be drawn via the following chaos game.
Draw the three vertices of a triangle and label them, say $p_1$, $p_2$, and $p_3$. Draw some random point 𝑝
(it does not have to be one of the three points above). Roll a die to pick one of $p_1$, $p_2$, or $p_3$ randomly
(for example 1 and 4 mean $p_1$, 2 and 5 mean $p_2$, and 3 and 6 mean $p_3$). Suppose we picked $p_2$, then
let $p_{\text{new}}$ be the point exactly halfway between 𝑝 and $p_2$. Draw this point and let 𝑝 now refer to this
new point $p_{\text{new}}$. Rinse, repeat. Try to be precise and draw as many iterations as possible. Your
points will be attracted to the so-called Sierpinski triangle. A computer was used to run the game
for 10,000 iterations to obtain the picture in Figure 8.17.

Figure 8.17: 10,000 iterations of the chaos game producing the Sierpinski triangle.
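The chaos game of Exercise 8.5.2 is easy to hand to a computer. The following is only a sketch: the triangle, the starting point, the fixed random seed, and the function name are hypothetical choices, not the ones used to produce the figure.

```python
import math
import random


def chaos_game(vertices, start, iterations, seed=0):
    """Return the list of points visited by the chaos game."""
    rng = random.Random(seed)      # fixed seed so that runs are repeatable
    p = start
    points = []
    for _ in range(iterations):
        target = rng.choice(vertices)        # the "die roll" picks a vertex
        p = ((p[0] + target[0]) / 2.0,       # move halfway towards that vertex
             (p[1] + target[1]) / 2.0)
        points.append(p)
    return points


# Hypothetical triangle (equilateral with side 1) and starting point.
VERTICES = [(0.0, 0.0), (1.0, 0.0), (0.5, math.sqrt(3) / 2)]
points = chaos_game(VERTICES, start=(0.3, 0.2), iterations=10000)
print(points[:3])
```

Feeding the list of points to any plotting tool as a scatter plot should give a picture resembling the figure. Since every new point is a midpoint of a point in the triangle and a vertex, no point ever leaves the triangle.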

Exercise 8.5.3 (project): Construct the double pendulum described in the text with a string and
two nuts (or heavy beads). Play around with the position of the middle nut, and perhaps use different
weight nuts. Describe what you find.

Exercise 8.5.4 (computer project): Use computer software (such as Matlab, Octave, or perhaps
even a spreadsheet) to plot the solution of the given forced Duffing equation with Euler’s method.
Plot the solution for 𝑡 from 0 to 100 with several different (small) step sizes. Discuss.

Exercise 8.5.101: Find critical points of the Lorenz system and the associated linearizations.
Appendix A

Linear algebra

A.1 Vectors, mappings, and matrices


Note: 2 lectures
In real life, there is most often more than one variable. We wish to organize dealing with
multiple variables in a consistent manner, and in particular organize dealing with linear
equations and linear mappings, as those are both rather useful and rather easy to handle.
Mathematicians joke that “to an engineer every problem is linear, and everything is a
matrix.” And well, they (the engineers) are not wrong. Quite often, solving an engineering
problem is figuring out the right finite-dimensional linear problem to solve, which is
then solved with some matrix manipulation. Most importantly, linear problems are the
ones that we know how to solve, and we have many tools to solve them. For engineers,
mathematicians, physicists, and anybody else in a technical field, it is absolutely vital to
learn linear algebra.
As motivation, suppose we wish to solve

𝑥 − 𝑦 = 2,
2𝑥 + 𝑦 = 4,

for 𝑥 and 𝑦. That is, we desire numbers 𝑥 and 𝑦 such that the two equations are satisfied.
Let us perhaps start by adding the equations together to find

𝑥 + 2𝑥 − 𝑦 + 𝑦 = 2 + 4, or 3𝑥 = 6.

In other words, 𝑥 = 2. Once we have that, we plug 𝑥 = 2 into the first equation to find
2 − 𝑦 = 2, so 𝑦 = 0. OK, that was easy. What is all this fuss about linear equations? Well,
try doing this if you have 5000 unknowns*. Also, we may have such equations not just of
numbers, but of functions and derivatives of functions in differential equations. Clearly we
need a systematic way of doing things. A nice consequence of making things systematic
and simpler to write down is that it becomes easier to have computers do the work for us.
Computers are rather stupid: they do not think, but they are very good at doing lots of repetitive
tasks precisely, as long as we figure out a systematic way for them to perform the tasks.

* One of the downsides of making everything look like a linear problem is that the number of variables
tends to become huge.
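As a small taste of what "systematic" might mean, here is a sketch that solves the motivating system 𝑥 − 𝑦 = 2, 2𝑥 + 𝑦 = 4 by elimination. Rather than adding the equations as we did by hand, it subtracts a multiple of the first equation from the second, which is a variation that works for any such pair as long as the first coefficient is not zero. The function name and argument order are made up for this illustration.

```python
from fractions import Fraction


def solve_2x2(a11, a12, b1, a21, a22, b2):
    """Solve a11*x + a12*y = b1, a21*x + a22*y = b2 by elimination.

    Assumes a11 is not zero and that the system has exactly one solution.
    """
    a11, a12, b1, a21, a22, b2 = map(Fraction, (a11, a12, b1, a21, a22, b2))
    # Subtract a multiple of the first equation from the second to eliminate x.
    factor = a21 / a11
    a22_new = a22 - factor * a12
    b2_new = b2 - factor * b1
    # The second equation now involves only y; substitute back to find x.
    y = b2_new / a22_new
    x = (b1 - a12 * y) / a11
    return x, y


x, y = solve_2x2(1, -1, 2, 2, 1, 4)   # x - y = 2, 2x + y = 4
print(x, y)   # prints "2 0"
```

The same two steps, eliminate and then substitute back, are what a computer repeats many times over when there are thousands of unknowns.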

A.1.1 Vectors and operations on vectors


Consider 𝑛 real numbers as an 𝑛-tuple:
\[ (x_1, x_2, \ldots, x_n) . \]
The set of such 𝑛-tuples is the so-called 𝑛-dimensional space, often denoted by $\mathbb{R}^n$. Sometimes
we call this the 𝑛-dimensional euclidean space*. In two dimensions, $\mathbb{R}^2$ is called the cartesian
plane†. Each such 𝑛-tuple represents a point in the 𝑛-dimensional space. For example, the
point (1, 2) in the plane $\mathbb{R}^2$ is one unit to the right and two units up from the origin.
When we do algebra with these 𝑛-tuples of numbers we call them vectors‡. Mathematicians
are keen on separating what is a vector and what is a point of the space or in the
plane, and it turns out to be an important distinction. However, for the purposes of linear
algebra we can think of everything being represented by a vector. A way to think of a
vector, which is especially useful in calculus and differential equations, is an arrow. It is an
object that has a direction and a magnitude. For instance, the vector (1, 2) is the arrow from
the origin to the point (1, 2) in the plane. The magnitude is the length of the arrow. See
Figure A.1. If we think of vectors as arrows, the arrow does not always have to start at the
origin. If we do move it around, however, it should always keep the same direction and
the same magnitude.

Figure A.1: The vector (1, 2) drawn as an arrow from the origin to the point (1, 2).

As vectors are arrows, when we want to give a name to a vector, we draw a little arrow
above it:
\[ \vec{x} \]
* Named after the ancient Greek mathematician Euclid of Alexandria (around 300 BC), possibly the most
famous of mathematicians; even small towns often have Euclid Street or Euclid Avenue.
† Named after the French mathematician René Descartes (1596–1650). It is “cartesian” as his name in
Latin is Renatus Cartesius.
‡ A common notation to distinguish vectors from points is to write (1, 2) for the point and ⟨1, 2⟩ for the
vector. We write both as (1, 2).



Another popular notation is a bold x, although we will use the little arrows. It may be easy
to write a bold letter in a book, but it is not so easy to write it by hand on paper or on the
board. Mathematicians often do not even write the arrows. A mathematician would write
𝑥 and remember that 𝑥 is a vector and not a number. Just like you remember that Bob is
your uncle, and you don’t have to keep repeating “Uncle Bob” and you can just say “Bob.”
In this book, however, we will call Bob “Uncle Bob” and write vectors with the little arrows.
The magnitude can be computed using the Pythagorean theorem. The vector (1, 2)
drawn in the figure has magnitude $\sqrt{1^2 + 2^2} = \sqrt{5}$. The magnitude is denoted by $\lVert \vec{x} \rVert$, and,
in any number of dimensions, it can be computed in the same way:
\[ \lVert \vec{x} \rVert = \lVert (x_1, x_2, \ldots, x_n) \rVert = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2} . \]

For reasons that will become clear in the next section, we often write vectors as so-called
column vectors:
\[ \vec{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} . \]
Don’t worry. It is just a different way of writing the same thing. For example, the vector
(1, 2) can be written as
\[ \begin{bmatrix} 1 \\ 2 \end{bmatrix} . \]

The fact that we write arrows above vectors allows us to write several vectors $\vec{x}_1$, $\vec{x}_2$,
etc., without confusing these with the components of some other vector $\vec{x}$.
So where is the algebra from linear algebra? Well, arrows can be added, subtracted, and
multiplied by numbers. First we consider addition. If we have two arrows, we simply move
along one, and then along the other. See Figure A.2.

Figure A.2: Adding the vectors (1, 2), drawn dotted, and (2, −3), drawn dashed. The result, (3, −1), is
drawn as a solid arrow.

It is rather easy to see what it does to the numbers that represent the vectors. Suppose
we want to add (1, 2) to (2, −3) as in the figure. We travel along (1, 2) and then we travel
along (2, −3). What we did was travel one unit right, two units up, and then we travelled
two units right, and three units down (the negative three). That means that we ended up
at (1 + 2, 2 + (−3)) = (3, −1). That is how addition always works:

\[ \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} + \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} = \begin{bmatrix} x_1 + y_1 \\ x_2 + y_2 \\ \vdots \\ x_n + y_n \end{bmatrix} . \]

Subtracting is similar. What $\vec{x} - \vec{y}$ means visually is that we first travel along $\vec{x}$, and then
we travel backwards along $\vec{y}$. See Figure A.3. It is like adding $\vec{x} + (-\vec{y})$ where $-\vec{y}$ is the
arrow we obtain by erasing the arrow head from one side and drawing it on the other side,
that is, we reverse the direction. In terms of the numbers, we simply go backwards both
horizontally and vertically, so we negate both numbers. For instance, if $\vec{y}$ is (−2, 1), then
$-\vec{y}$ is (2, −1).

Figure A.3: Subtraction, the vector (1, 2), drawn dotted, minus (−2, 1), drawn dashed. The result, (3, 1),
is drawn as a solid arrow.

Another intuitive thing to do to a vector is to scale it. We represent this by multiplication
of a number with a vector. Because of this, when we wish to distinguish between vectors
and numbers, we call the numbers scalars. For example, suppose we want to travel three
times further. If the vector is (1, 2), travelling 3 times further means going 3 units to the
right and 6 units up, so we get the vector (3, 6). We just multiply each number in the vector
by 3. If 𝛼 is a number, then
\[ \alpha \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} \alpha x_1 \\ \alpha x_2 \\ \vdots \\ \alpha x_n \end{bmatrix} . \]
Scaling (by a positive number) multiplies the magnitude and leaves direction untouched.
The magnitude of (1, 2) is $\sqrt{5}$. The magnitude of 3 times (1, 2), that is, of (3, 6), is $3\sqrt{5}$.

If we multiply a vector by a negative scalar, the vector is not only scaled, but it also
switches direction. Multiplying (1, 2) by −3 means we should go 3 times further but in the
opposite direction, so 3 units to the left and 6 units down, or in other words, (−3, −6). As
we mentioned above, $-\vec{y}$ is the reverse of $\vec{y}$, and this is the same as $(-1)\vec{y}$.
In Figure A.4, you can see a couple of examples of what scaling a vector means visually.

Figure A.4: A vector $\vec{x}$, the vector $2\vec{x}$ (same direction, double the magnitude), and the vector $-1.5\vec{x}$
(opposite direction, 1.5 times the magnitude).

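We put all of these operations together to work out more complicated expressions. Let
us compute a small example:
\[ 3 \begin{bmatrix} 1 \\ 2 \end{bmatrix} + 2 \begin{bmatrix} -4 \\ -1 \end{bmatrix} - 3 \begin{bmatrix} -2 \\ 2 \end{bmatrix} = \begin{bmatrix} 3(1) + 2(-4) - 3(-2) \\ 3(2) + 2(-1) - 3(2) \end{bmatrix} = \begin{bmatrix} 1 \\ -2 \end{bmatrix} . \]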
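The three operations are entry-by-entry recipes, so they translate almost word for word into code. The sketch below stores a vector as a plain Python list; the function names are our own and not a standard library interface.

```python
def add(x, y):
    """Add two vectors entry by entry."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same number of entries")
    return [a + b for a, b in zip(x, y)]


def scale(alpha, x):
    """Multiply every entry of the vector x by the scalar alpha."""
    return [alpha * a for a in x]


def subtract(x, y):
    """Compute x - y as x + (-1) y, just as in the text."""
    return add(x, scale(-1, y))


# The small example: 3 (1, 2) + 2 (-4, -1) - 3 (-2, 2)
result = subtract(add(scale(3, [1, 2]), scale(2, [-4, -1])), scale(3, [-2, 2]))
print(result)   # prints [1, -2]
```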

We said a vector is a direction and a magnitude. Magnitude is easy to represent: it is
just a number. The direction is usually given by a vector with magnitude one. We call such
a vector a unit vector. That is, $\vec{u}$ is a unit vector when $\lVert \vec{u} \rVert = 1$. For instance, the vectors
(1, 0), $(1/\sqrt{2}, 1/\sqrt{2})$, and (0, −1) are all unit vectors.
To represent the direction of a vector $\vec{x}$, we need to find the unit vector in the same
direction. To do so, we simply rescale $\vec{x}$ by the reciprocal of the magnitude: $\frac{1}{\lVert \vec{x} \rVert} \vec{x}$, or more
concisely, $\frac{\vec{x}}{\lVert \vec{x} \rVert}$.
As an example, the unit vector in the direction of (1, 2) is the vector
\[ \frac{1}{\sqrt{1^2 + 2^2}} (1, 2) = \left( \frac{1}{\sqrt{5}}, \frac{2}{\sqrt{5}} \right) . \]
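Magnitude and direction can be computed the same way in any number of dimensions. The following sketch uses made-up function names and treats the zero vector, which has no direction, as an error.

```python
import math


def magnitude(x):
    """Square root of the sum of the squares of the entries."""
    return math.sqrt(sum(a * a for a in x))


def unit_vector(x):
    """Rescale x by the reciprocal of its magnitude."""
    norm = magnitude(x)
    if norm == 0:
        raise ValueError("the zero vector has no direction")
    return [a / norm for a in x]


u = unit_vector([1, 2])
print(u)              # approximately [0.447, 0.894], that is (1/sqrt(5), 2/sqrt(5))
print(magnitude(u))   # approximately 1
```

Because of rounding, the computed magnitude of a unit vector is only approximately 1, so comparisons should allow a small tolerance.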

A.1.2 Linear mappings and matrices


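A vector-valued function 𝐹 is a rule that takes a vector $\vec{x}$ and returns another vector $\vec{y}$. For
example, 𝐹 could be a scaling that doubles the size of vectors:
\[ F(\vec{x}) = 2\vec{x} . \]
Applied to say (1, 3) we get
\[ F\left( \begin{bmatrix} 1 \\ 3 \end{bmatrix} \right) = 2 \begin{bmatrix} 1 \\ 3 \end{bmatrix} = \begin{bmatrix} 2 \\ 6 \end{bmatrix} . \]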

If 𝐹 is a mapping that takes vectors in $\mathbb{R}^2$ to $\mathbb{R}^2$ (such as the above), we write

\[ F \colon \mathbb{R}^2 \to \mathbb{R}^2 . \]

The words function and mapping are used rather interchangeably, although more often than
not, mapping is used when talking about a vector-valued function, and the word function is
often used when the function is scalar-valued.
A beginning student of mathematics (and many a seasoned mathematician) who sees
an expression such as
\[ f(3x + 8y) \]
yearns to write
\[ 3 f(x) + 8 f(y) . \]
After all, who has not wanted to write $\sqrt{x + y} = \sqrt{x} + \sqrt{y}$ or something like that at some
point in their mathematical lives? Wouldn’t life be simple if we could do that? Of course
we cannot always do that (for example, not with the square roots!). But there are many
other functions where we can do exactly the above. Such functions are called linear.
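A mapping $F \colon \mathbb{R}^n \to \mathbb{R}^m$ is called linear if

\[ F(\vec{x} + \vec{y}) = F(\vec{x}) + F(\vec{y}) , \]

for any vectors $\vec{x}$ and $\vec{y}$, and also

\[ F(\alpha \vec{x}) = \alpha F(\vec{x}) , \]

for any scalar 𝛼. The 𝐹 we defined above that doubles the size of all vectors is linear. Let
us check:
\[ F(\vec{x} + \vec{y}) = 2(\vec{x} + \vec{y}) = 2\vec{x} + 2\vec{y} = F(\vec{x}) + F(\vec{y}) , \]
and also
\[ F(\alpha \vec{x}) = 2\alpha \vec{x} = \alpha 2 \vec{x} = \alpha F(\vec{x}) . \]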
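The two conditions can also be spot-checked numerically. The sketch below tests them for one choice of inputs only, so a passing check is merely evidence and not a proof, while a failing check does show that a mapping is not linear. The helper names, the sample vectors, and the entrywise square root used as a non-linear example are all chosen just for illustration.

```python
import math


def double(x):
    """The mapping F(x) = 2x from the text."""
    return [2 * a for a in x]


def root(x):
    """Entrywise square root, a mapping that is not linear."""
    return [math.sqrt(a) for a in x]


def close(u, v):
    """Compare two vectors entry by entry, allowing for rounding."""
    return all(math.isclose(a, b, abs_tol=1e-12) for a, b in zip(u, v))


def looks_linear(F, x, y, alpha):
    """Check F(x + y) = F(x) + F(y) and F(alpha x) = alpha F(x) for these inputs."""
    sum_xy = [a + b for a, b in zip(x, y)]
    additive = close(F(sum_xy), [a + b for a, b in zip(F(x), F(y))])
    homogeneous = close(F([alpha * a for a in x]), [alpha * a for a in F(x)])
    return additive and homogeneous


print(looks_linear(double, [1, 3], [4, 9], 5))   # prints True
print(looks_linear(root, [1, 3], [4, 9], 5))     # prints False
```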
We also call a linear function a linear transformation. If you want to be really fancy and
impress your friends, you can call it a linear operator. When a mapping is linear we often do
not write the parentheses. We write simply

\[ F\vec{x} \]

instead of $F(\vec{x})$. We do this because linearity means that the mapping 𝐹 behaves like
multiplying $\vec{x}$ by “something.” That something is a matrix.
A matrix is an 𝑚 × 𝑛 array of numbers (𝑚 rows and 𝑛 columns). A 3 × 5 matrix is
\[ A = \begin{bmatrix} a_{11} & a_{12} & a_{13} & a_{14} & a_{15} \\ a_{21} & a_{22} & a_{23} & a_{24} & a_{25} \\ a_{31} & a_{32} & a_{33} & a_{34} & a_{35} \end{bmatrix} . \]
The numbers $a_{ij}$ are called elements or entries.

A column vector is simply an 𝑚 × 1 matrix. Similarly to a column vector there is also a
row vector, which is a 1 × 𝑛 matrix. If we have an 𝑛 × 𝑛 matrix, then we say that it is a square
matrix.
How does a matrix 𝐴 relate to a linear mapping? A matrix tells you where certain
special vectors go. We give a name to those certain vectors. The standard basis vectors of $\mathbb{R}^n$
are
\[ \vec{e}_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix} , \quad \vec{e}_2 = \begin{bmatrix} 0 \\ 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix} , \quad \vec{e}_3 = \begin{bmatrix} 0 \\ 0 \\ 1 \\ \vdots \\ 0 \end{bmatrix} , \quad \cdots , \quad \vec{e}_n = \begin{bmatrix} 0 \\ 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix} . \]
In $\mathbb{R}^3$, these vectors are
\[ \vec{e}_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} , \quad \vec{e}_2 = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} , \quad \vec{e}_3 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} . \]
You may recall from calculus of several variables that these are sometimes called $\vec{\imath}$, $\vec{\jmath}$, $\vec{k}$.
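The reason these are called a basis is that every other vector can be written as a unique
linear combination of them. For example, in $\mathbb{R}^3$ the vector (4, 5, 6) can be written as
\[ 4\vec{e}_1 + 5\vec{e}_2 + 6\vec{e}_3 = 4 \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} + 5 \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} + 6 \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 4 \\ 5 \\ 6 \end{bmatrix} . \]
So how does a matrix represent a linear mapping? Well, the columns of the matrix are
the vectors where the matrix, as a linear mapping, takes $\vec{e}_1$, $\vec{e}_2$, etc. For instance, consider
\[ M = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} . \]
As a linear mapping, $M \colon \mathbb{R}^2 \to \mathbb{R}^2$ takes $\vec{e}_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$ to $\begin{bmatrix} 1 \\ 3 \end{bmatrix}$ and $\vec{e}_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$ to $\begin{bmatrix} 2 \\ 4 \end{bmatrix}$. In other
words,
\[ M\vec{e}_1 = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ 3 \end{bmatrix} , \quad \text{and} \quad M\vec{e}_2 = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 2 \\ 4 \end{bmatrix} . \]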
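More generally, if we have an 𝑛 × 𝑚 matrix 𝐴, that is, we have 𝑛 rows and 𝑚 columns,
then the mapping $A \colon \mathbb{R}^m \to \mathbb{R}^n$ takes $\vec{e}_j$ to the 𝑗th column of 𝐴. For example,
\[ A = \begin{bmatrix} a_{11} & a_{12} & a_{13} & a_{14} & a_{15} \\ a_{21} & a_{22} & a_{23} & a_{24} & a_{25} \\ a_{31} & a_{32} & a_{33} & a_{34} & a_{35} \end{bmatrix} \]
represents a mapping from $\mathbb{R}^5$ to $\mathbb{R}^3$ that does
\[ A\vec{e}_1 = \begin{bmatrix} a_{11} \\ a_{21} \\ a_{31} \end{bmatrix} , \quad A\vec{e}_2 = \begin{bmatrix} a_{12} \\ a_{22} \\ a_{32} \end{bmatrix} , \quad A\vec{e}_3 = \begin{bmatrix} a_{13} \\ a_{23} \\ a_{33} \end{bmatrix} , \quad A\vec{e}_4 = \begin{bmatrix} a_{14} \\ a_{24} \\ a_{34} \end{bmatrix} , \quad A\vec{e}_5 = \begin{bmatrix} a_{15} \\ a_{25} \\ a_{35} \end{bmatrix} . \]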

What about another vector $\vec{x}$ that is not in the standard basis? Where does it go? We use
linearity. First, we write the vector as a linear combination of the standard basis vectors:
\[ \vec{x} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{bmatrix} = x_1 \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix} + x_2 \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix} + x_3 \begin{bmatrix} 0 \\ 0 \\ 1 \\ 0 \\ 0 \end{bmatrix} + x_4 \begin{bmatrix} 0 \\ 0 \\ 0 \\ 1 \\ 0 \end{bmatrix} + x_5 \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 1 \end{bmatrix} = x_1 \vec{e}_1 + x_2 \vec{e}_2 + x_3 \vec{e}_3 + x_4 \vec{e}_4 + x_5 \vec{e}_5 . \]
Then

\[ A\vec{x} = A(x_1 \vec{e}_1 + x_2 \vec{e}_2 + x_3 \vec{e}_3 + x_4 \vec{e}_4 + x_5 \vec{e}_5) = x_1 A\vec{e}_1 + x_2 A\vec{e}_2 + x_3 A\vec{e}_3 + x_4 A\vec{e}_4 + x_5 A\vec{e}_5 . \]

If we know where 𝐴 takes all the basis vectors, we know where it takes all vectors.
Suppose 𝑀 is the 2 × 2 matrix from above, then
\[ M \begin{bmatrix} -2 \\ 0.1 \end{bmatrix} = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \begin{bmatrix} -2 \\ 0.1 \end{bmatrix} = -2 \begin{bmatrix} 1 \\ 3 \end{bmatrix} + 0.1 \begin{bmatrix} 2 \\ 4 \end{bmatrix} = \begin{bmatrix} -1.8 \\ -5.6 \end{bmatrix} . \]
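The recipe "the columns tell you where the basis vectors go, and linearity does the rest" can be written out directly. In this sketch a matrix is stored as a list of rows, and the function names are invented for the illustration.

```python
def column(M, j):
    """Return column j (counting from 0) of a matrix stored as a list of rows."""
    return [row[j] for row in M]


def apply_by_columns(M, x):
    """Compute M x as x_1 (first column) + x_2 (second column) + ..."""
    if any(len(row) != len(x) for row in M):
        raise ValueError("x must have one entry per column of M")
    result = [0] * len(M)
    for j, x_j in enumerate(x):
        result = [r + x_j * c for r, c in zip(result, column(M, j))]
    return result


M = [[1, 2],
     [3, 4]]
print(apply_by_columns(M, [1, 0]))      # prints [1, 3], the first column
print(apply_by_columns(M, [0, 1]))      # prints [2, 4], the second column
print(apply_by_columns(M, [-2, 0.1]))   # approximately [-1.8, -5.6]
```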

Every linear mapping from $\mathbb{R}^m$ to $\mathbb{R}^n$ can be represented by an 𝑛 × 𝑚 matrix. You
just figure out where it takes the standard basis vectors. Conversely, every 𝑛 × 𝑚 matrix
represents a linear mapping. Hence, we may think of matrices being linear mappings, and
linear mappings being matrices.
Or can we? In this book we study mostly linear differential operators, and linear
differential operators are linear mappings, although they are not acting on $\mathbb{R}^n$, but on an
infinite-dimensional space of functions:

\[ L f = g . \]

For a function 𝑓 we get a function 𝑔, and 𝐿 is linear in the sense that

\[ L(f + h) = Lf + Lh , \quad \text{and} \quad L(\alpha f) = \alpha L f , \]

for any number (scalar) 𝛼 and all functions 𝑓 and ℎ.


So the answer is not really. But if we consider vectors in finite-dimensional spaces
$\mathbb{R}^n$ then yes, every linear mapping is a matrix. We have mentioned at the beginning of
this section that we can “make everything a vector.” That’s not strictly true, but it is true
approximately. Those “infinite-dimensional” spaces of functions can be approximated by a
finite-dimensional space, and then linear operators are just matrices. And as far as actual
computations that we can do on a computer, we can work
only with finitely many dimensions anyway. If you ask a computer or your calculator to
plot a function, it samples the function at finitely many points and then connects the dots*.
It does not actually give you infinitely many values. The way that you have been using the
computer or your calculator so far has already been a certain approximation of the space
of functions by a finite-dimensional space.

* If you have ever used Matlab, you may have noticed that to plot a function, we take a vector of inputs,
ask Matlab to compute the corresponding vector of values of the function, and then we ask it to plot the
result.
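To end the section, we notice how $A\vec{x}$ can be written more succinctly. Suppose

\[ A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix} \quad \text{and} \quad \vec{x} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} . \]
Then
\[ A\vec{x} = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} a_{11} x_1 + a_{12} x_2 + a_{13} x_3 \\ a_{21} x_1 + a_{22} x_2 + a_{23} x_3 \end{bmatrix} . \]
For example,
\[ \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \begin{bmatrix} 2 \\ -1 \end{bmatrix} = \begin{bmatrix} 1 \cdot 2 + 2 \cdot (-1) \\ 3 \cdot 2 + 4 \cdot (-1) \end{bmatrix} = \begin{bmatrix} 0 \\ 2 \end{bmatrix} . \]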
That is, you take the entries in a row of the matrix, you multiply them by the entries in
your vector, you add things up, and that’s the corresponding entry in the resulting vector.
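The row-by-row rule is how one would usually compute $A\vec{x}$ in practice. A minimal sketch, with a matrix stored as a list of rows and with hypothetical function names:

```python
def row_times_vector(row, x):
    """Multiply matching entries of a row and a vector and add things up."""
    return sum(a * b for a, b in zip(row, x))


def mat_vec(A, x):
    """Entry i of A x comes from row i of A and the vector x."""
    if any(len(row) != len(x) for row in A):
        raise ValueError("each row of A must be as long as x")
    return [row_times_vector(row, x) for row in A]


print(mat_vec([[1, 2], [3, 4]], [2, -1]))   # prints [0, 2]
```

This gives the same answer as adding up multiples of the columns; only the order in which the individual products are summed is different.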

A.1.3 Exercises
Exercise A.1.1: On a piece of graph paper draw the vectors:

a) $\begin{bmatrix} 2 \\ 5 \end{bmatrix}$    b) $\begin{bmatrix} -2 \\ -4 \end{bmatrix}$    c) (3, −4)

Exercise A.1.2: On a piece of graph paper draw the vector (1, 2) starting at (based at) the given
point:

a) based at (0, 0) b) based at (1, 2) c) based at (0, −1)

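Exercise A.1.3: On a piece of graph paper draw the following operations. Draw and label the
vectors involved in the operations as well as the result:

a) $\begin{bmatrix} 1 \\ -4 \end{bmatrix} + \begin{bmatrix} 2 \\ 3 \end{bmatrix}$    b) $\begin{bmatrix} -3 \\ 2 \end{bmatrix} - \begin{bmatrix} 1 \\ 3 \end{bmatrix}$    c) $3 \begin{bmatrix} 2 \\ 1 \end{bmatrix}$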

Exercise A.1.4: Compute the magnitude of

a) $\begin{bmatrix} 7 \\ 2 \end{bmatrix}$    b) $\begin{bmatrix} -2 \\ 3 \\ 1 \end{bmatrix}$    c) (1, 3, −4)

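Exercise A.1.5: Compute

a) $\begin{bmatrix} 2 \\ 3 \end{bmatrix} + \begin{bmatrix} 7 \\ -8 \end{bmatrix}$    b) $\begin{bmatrix} -2 \\ 3 \end{bmatrix} - \begin{bmatrix} 6 \\ -4 \end{bmatrix}$    c) $- \begin{bmatrix} -3 \\ 2 \end{bmatrix}$

d) $4 \begin{bmatrix} -1 \\ 5 \end{bmatrix}$    e) $5 \begin{bmatrix} 1 \\ 0 \end{bmatrix} + 9 \begin{bmatrix} 0 \\ 1 \end{bmatrix}$    f) $3 \begin{bmatrix} 1 \\ -8 \end{bmatrix} - 2 \begin{bmatrix} 3 \\ -1 \end{bmatrix}$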

Exercise A.1.6: Find the unit vector in the direction of the given vector

a) $\begin{bmatrix} 1 \\ -3 \end{bmatrix}$    b) $\begin{bmatrix} 2 \\ 1 \\ -1 \end{bmatrix}$    c) (3, 1, −2)

Exercise A.1.7: If $\vec{x} = (1, 2)$ and $\vec{y}$ are added together, we find $\vec{x} + \vec{y} = (0, 2)$. What is $\vec{y}$?

Exercise A.1.8: Write (1, 2, 3) as a linear combination of the standard basis vectors $\vec{e}_1$, $\vec{e}_2$, and $\vec{e}_3$.

Exercise A.1.9: If the magnitude of $\vec{x}$ is 4, what is the magnitude of

a) $0\vec{x}$    b) $3\vec{x}$    c) $-\vec{x}$    d) $-4\vec{x}$    e) $\vec{x} + \vec{x}$    f) $\vec{x} - \vec{x}$

Exercise A.1.10: Suppose a linear mapping $F \colon \mathbb{R}^2 \to \mathbb{R}^2$ takes (1, 0) to (2, −1) and it takes (0, 1)
to (3, 3). Where does it take

a) (1, 1)    b) (2, 0)    c) (2, −1)

Exercise A.1.11: Suppose a linear mapping $F \colon \mathbb{R}^3 \to \mathbb{R}^2$ takes (1, 0, 0) to (2, 1), it takes (0, 1, 0)
to (3, 4), and it takes (0, 0, 1) to (5, 6). Write down the matrix representing the mapping 𝐹.

Exercise A.1.12: Suppose that a mapping $F \colon \mathbb{R}^2 \to \mathbb{R}^2$ takes (1, 0) to (1, 2), (0, 1) to (3, 4), and
(1, 1) to (0, −1). Explain why 𝐹 is not linear.

Exercise A.1.13 (challenging): Let $\mathbb{R}^3$ represent the space of quadratic polynomials in 𝑡: a point
$(a_0, a_1, a_2)$ in $\mathbb{R}^3$ represents the polynomial $a_0 + a_1 t + a_2 t^2$. Consider the derivative $\frac{d}{dt}$ as a mapping
of $\mathbb{R}^3$ to $\mathbb{R}^3$, and note that $\frac{d}{dt}$ is linear. Write down $\frac{d}{dt}$ as a 3 × 3 matrix.

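Exercise A.1.101: Compute the magnitude of

a) $\begin{bmatrix} 1 \\ 3 \end{bmatrix}$    b) $\begin{bmatrix} 2 \\ 3 \\ -1 \end{bmatrix}$    c) (−2, 1, −2)

Exercise A.1.102: Find the unit vector in the direction of the given vector

a) $\begin{bmatrix} -1 \\ 1 \end{bmatrix}$    b) $\begin{bmatrix} 1 \\ -1 \\ 2 \end{bmatrix}$    c) (2, −5, 2)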

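Exercise A.1.103: Compute

a) $\begin{bmatrix} 3 \\ 1 \end{bmatrix} + \begin{bmatrix} 6 \\ -3 \end{bmatrix}$    b) $\begin{bmatrix} -1 \\ 2 \end{bmatrix} - \begin{bmatrix} 2 \\ -1 \end{bmatrix}$    c) $- \begin{bmatrix} -5 \\ 3 \end{bmatrix}$

d) $2 \begin{bmatrix} -2 \\ 4 \end{bmatrix}$    e) $3 \begin{bmatrix} 1 \\ 0 \end{bmatrix} + 7 \begin{bmatrix} 0 \\ 1 \end{bmatrix}$    f) $2 \begin{bmatrix} 2 \\ -3 \end{bmatrix} - 6 \begin{bmatrix} 2 \\ -1 \end{bmatrix}$

Exercise A.1.104: If the magnitude of $\vec{x}$ is 5, what is the magnitude of

a) $4\vec{x}$    b) $-2\vec{x}$    c) $-4\vec{x}$

Exercise A.1.105: Suppose a linear mapping $F \colon \mathbb{R}^2 \to \mathbb{R}^2$ takes (1, 0) to (1, −1) and it takes (0, 1)
to (2, 0). Where does it take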

a) (1, 1) b) (0, 2) c) (1, −1)



A.2 Matrix algebra


Note: 2–3 lectures

A.2.1 One-by-one matrices


Let us motivate what we want to achieve with matrices. Real-valued linear mappings of the
real line, linear functions that eat numbers and spit out numbers, are just multiplications
by a number. Consider a mapping defined by multiplying by a number. Let’s call this
number 𝛼. The mapping then takes 𝑥 to 𝛼𝑥. We can add such mappings: If we have another
mapping 𝛽, then
𝛼𝑥 + 𝛽𝑥 = (𝛼 + 𝛽)𝑥.
We get a new mapping 𝛼 + 𝛽 that multiplies 𝑥 by, well, 𝛼 + 𝛽. If 𝐷 is a mapping that doubles
its input, 𝐷𝑥 = 2𝑥, and 𝑇 is a mapping that triples, 𝑇𝑥 = 3𝑥, then 𝐷 + 𝑇 is a mapping that
multiplies by 5, (𝐷 + 𝑇)𝑥 = 5𝑥.
Similarly we can compose such mappings, that is, we could apply one and then the other.
We take 𝑥, we run it through the first mapping 𝛼 to get 𝛼 times 𝑥, then we run 𝛼𝑥 through
the second mapping 𝛽. In other words,

𝛽(𝛼𝑥) = (𝛽𝛼)𝑥.

We just multiply those two numbers. Using our doubling and tripling mappings, if we
double and then triple, that is, 𝑇(𝐷𝑥), then we obtain 3(2𝑥) = 6𝑥. The composition 𝑇𝐷 is
the mapping that multiplies by 6. For larger matrices, composition also ends up being a
kind of multiplication.

A.2.2 Matrix addition and scalar multiplication


The mappings that multiply numbers by numbers are just 1 × 1 matrices. The number 𝛼
above could be written as a matrix [𝛼]. Perhaps we would want to do to all matrices the
same things that we did to those 1 × 1 matrices at the start of this section above. First, let
us add matrices. If we have a matrix 𝐴 and a matrix 𝐵 that are of the same size, say 𝑚 × 𝑛,
then they are mappings from $\mathbb{R}^n$ to $\mathbb{R}^m$. The mapping 𝐴 + 𝐵 should also be a mapping
from $\mathbb{R}^n$ to $\mathbb{R}^m$, and it should do the following to vectors:

\[ (A + B)\vec{x} = A\vec{x} + B\vec{x} . \]

It turns out you just add the matrices element-wise: If the 𝑖𝑗th entry of 𝐴 is $a_{ij}$, and the 𝑖𝑗th
entry of 𝐵 is $b_{ij}$, then the 𝑖𝑗th entry of 𝐴 + 𝐵 is $a_{ij} + b_{ij}$. If

\[ A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \end{bmatrix} , \]

then
\[ A + B = \begin{bmatrix} a_{11} + b_{11} & a_{12} + b_{12} & a_{13} + b_{13} \\ a_{21} + b_{21} & a_{22} + b_{22} & a_{23} + b_{23} \end{bmatrix} . \]
We illustrate on a more concrete example:
\[ \begin{bmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{bmatrix} + \begin{bmatrix} 7 & 8 \\ 9 & 10 \\ 11 & -1 \end{bmatrix} = \begin{bmatrix} 1 + 7 & 2 + 8 \\ 3 + 9 & 4 + 10 \\ 5 + 11 & 6 - 1 \end{bmatrix} = \begin{bmatrix} 8 & 10 \\ 12 & 14 \\ 16 & 5 \end{bmatrix} . \]
Let us check that this does the right thing to a vector. We use some of the vector algebra
that we already know, and regroup things:
\begin{align*}
\begin{bmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{bmatrix} \begin{bmatrix} 2 \\ -1 \end{bmatrix} + \begin{bmatrix} 7 & 8 \\ 9 & 10 \\ 11 & -1 \end{bmatrix} \begin{bmatrix} 2 \\ -1 \end{bmatrix}
&= \left( 2 \begin{bmatrix} 1 \\ 3 \\ 5 \end{bmatrix} - \begin{bmatrix} 2 \\ 4 \\ 6 \end{bmatrix} \right) + \left( 2 \begin{bmatrix} 7 \\ 9 \\ 11 \end{bmatrix} - \begin{bmatrix} 8 \\ 10 \\ -1 \end{bmatrix} \right) \\
&= 2 \left( \begin{bmatrix} 1 \\ 3 \\ 5 \end{bmatrix} + \begin{bmatrix} 7 \\ 9 \\ 11 \end{bmatrix} \right) - \left( \begin{bmatrix} 2 \\ 4 \\ 6 \end{bmatrix} + \begin{bmatrix} 8 \\ 10 \\ -1 \end{bmatrix} \right) \\
&= 2 \begin{bmatrix} 1 + 7 \\ 3 + 9 \\ 5 + 11 \end{bmatrix} - \begin{bmatrix} 2 + 8 \\ 4 + 10 \\ 6 - 1 \end{bmatrix} = 2 \begin{bmatrix} 8 \\ 12 \\ 16 \end{bmatrix} - \begin{bmatrix} 10 \\ 14 \\ 5 \end{bmatrix} \\
&= \begin{bmatrix} 8 & 10 \\ 12 & 14 \\ 16 & 5 \end{bmatrix} \begin{bmatrix} 2 \\ -1 \end{bmatrix} \quad \left( = \begin{bmatrix} 2(8) - 10 \\ 2(12) - 14 \\ 2(16) - 5 \end{bmatrix} = \begin{bmatrix} 6 \\ 10 \\ 27 \end{bmatrix} \right) .
\end{align*}
If we replaced the numbers by letters, that would constitute a proof! Notice that we did not
really have to compute what the result is to convince ourselves that the two expressions
were equal.
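If we do want to see the numbers, the same check takes a few lines of code. The sketch below stores matrices as lists of rows; the function names are our own.

```python
def mat_add(A, B):
    """Add two matrices of the same size element-wise."""
    if len(A) != len(B) or any(len(ra) != len(rb) for ra, rb in zip(A, B)):
        raise ValueError("addition is undefined for matrices of different sizes")
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def mat_vec(A, x):
    """Apply the matrix A to the vector x."""
    return [sum(a * c for a, c in zip(row, x)) for row in A]


A = [[1, 2], [3, 4], [5, 6]]
B = [[7, 8], [9, 10], [11, -1]]
x = [2, -1]

left = mat_vec(mat_add(A, B), x)                                 # (A + B) x
right = [p + q for p, q in zip(mat_vec(A, x), mat_vec(B, x))]    # A x + B x
print(mat_add(A, B))   # prints [[8, 10], [12, 14], [16, 5]]
print(left, right)     # prints [6, 10, 27] [6, 10, 27]
```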
If the sizes of the matrices do not match, then addition is undefined. If 𝐴 is 3 × 2 and 𝐵
is 2 × 5, then we cannot add the matrices. We do not know what that could possibly mean.
It is also useful to have a matrix that when added to any other matrix does nothing.
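This is the zero matrix, the matrix of all zeros:
\[ \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} + \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} . \]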

We often denote the zero matrix by 0 without specifying size. We would then just write
𝐴 + 0, where we just assume that 0 is the zero matrix of the same size as 𝐴.
There are really two things we can multiply matrices by. We can multiply matrices by
scalars or we can multiply by other matrices. Let us first consider multiplication by scalars.
For a matrix 𝐴 and a scalar 𝛼, we want 𝛼𝐴 to be the matrix that accomplishes

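\[ (\alpha A)\vec{x} = \alpha (A\vec{x}) . \]

That is just scaling the result by 𝛼. If you think about it, scaling every term in 𝐴 by 𝛼
achieves just that: If

\[ A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix} , \quad \text{then} \quad \alpha A = \begin{bmatrix} \alpha a_{11} & \alpha a_{12} & \alpha a_{13} \\ \alpha a_{21} & \alpha a_{22} & \alpha a_{23} \end{bmatrix} . \]

For example,
\[ 2 \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix} = \begin{bmatrix} 2 & 4 & 6 \\ 8 & 10 & 12 \end{bmatrix} . \]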
Let us list some properties of matrix addition and scalar multiplication. Denote by 0
the zero matrix, by 𝛼, 𝛽 scalars, and by 𝐴, 𝐵, 𝐶 matrices. Then:

𝐴 + 0 = 𝐴 = 0 + 𝐴,
𝐴 + 𝐵 = 𝐵 + 𝐴,
(𝐴 + 𝐵) + 𝐶 = 𝐴 + (𝐵 + 𝐶),
𝛼(𝐴 + 𝐵) = 𝛼𝐴 + 𝛼𝐵,
(𝛼 + 𝛽)𝐴 = 𝛼𝐴 + 𝛽𝐴.

These rules should look very familiar.

A.2.3 Matrix multiplication


As we mentioned above, composition of linear mappings is also a multiplication of matrices.
Suppose 𝐴 is an 𝑚 × 𝑛 matrix, that is, 𝐴 takes $\mathbb{R}^n$ to $\mathbb{R}^m$, and 𝐵 is an 𝑛 × 𝑝 matrix, that is, 𝐵
takes $\mathbb{R}^p$ to $\mathbb{R}^n$. The composition 𝐴𝐵 should work as follows

\[ AB\vec{x} = A(B\vec{x}) . \]

First, a vector $\vec{x}$ in $\mathbb{R}^p$ gets taken to the vector $B\vec{x}$ in $\mathbb{R}^n$. Then the mapping 𝐴 takes it to the
vector $A(B\vec{x})$ in $\mathbb{R}^m$. In other words, the composition 𝐴𝐵 should be an 𝑚 × 𝑝 matrix. In
terms of sizes we should have

“ [𝑚 × 𝑛] [𝑛 × 𝑝] = [𝑚 × 𝑝]. ”

Notice how the middle size must match.


OK, now we know what sizes of matrices we should be able to multiply and what the
product should be. Let us see how to actually compute matrix multiplication. We start
with the so-called dot product (or inner product) of two vectors. Usually this is a row vector
multiplied with a column vector of the same size. Dot product multiplies each pair of
entries from the first and the second vector and sums these products. The result is a single
number. For example,

\[ \begin{bmatrix} a_1 & a_2 & a_3 \end{bmatrix} \cdot \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} = a_1 b_1 + a_2 b_2 + a_3 b_3 . \]

And similarly for larger (or smaller) vectors. A dot product is really a product of two
matrices: a 1 × 𝑛 matrix and an 𝑛 × 1 matrix resulting in a 1 × 1 matrix—a number.
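Armed with the dot product we define the product of matrices. We denote by $\operatorname{row}_i(A)$
the 𝑖th row of 𝐴 and by $\operatorname{column}_j(A)$ the 𝑗th column of 𝐴. For an 𝑚 × 𝑛 matrix 𝐴 and an
𝑛 × 𝑝 matrix 𝐵 we can compute the product 𝐴𝐵: The matrix 𝐴𝐵 is an 𝑚 × 𝑝 matrix whose
𝑖𝑗th entry is the dot product
\[ \operatorname{row}_i(A) \cdot \operatorname{column}_j(B) . \]
For example, given a 2 × 3 and a 3 × 2 matrix we should end up with a 2 × 2 matrix:
\[ \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix} \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \\ b_{31} & b_{32} \end{bmatrix} = \begin{bmatrix} a_{11} b_{11} + a_{12} b_{21} + a_{13} b_{31} & a_{11} b_{12} + a_{12} b_{22} + a_{13} b_{32} \\ a_{21} b_{11} + a_{22} b_{21} + a_{23} b_{31} & a_{21} b_{12} + a_{22} b_{22} + a_{23} b_{32} \end{bmatrix} , \tag{A.1} \]
or with some numbers:
\[ \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix} \begin{bmatrix} -1 & 2 \\ -7 & 0 \\ 1 & -1 \end{bmatrix} = \begin{bmatrix} 1 \cdot (-1) + 2 \cdot (-7) + 3 \cdot 1 & 1 \cdot 2 + 2 \cdot 0 + 3 \cdot (-1) \\ 4 \cdot (-1) + 5 \cdot (-7) + 6 \cdot 1 & 4 \cdot 2 + 5 \cdot 0 + 6 \cdot (-1) \end{bmatrix} = \begin{bmatrix} -12 & -1 \\ -33 & 2 \end{bmatrix} . \]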

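A useful consequence of the definition is that the evaluation $A\vec{x}$ for a matrix 𝐴 and a
(column) vector $\vec{x}$ is also matrix multiplication. That is really why we think of vectors as
column vectors, or 𝑛 × 1 matrices. For example,
\[ \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \begin{bmatrix} 2 \\ -1 \end{bmatrix} = \begin{bmatrix} 1 \cdot 2 + 2 \cdot (-1) \\ 3 \cdot 2 + 4 \cdot (-1) \end{bmatrix} = \begin{bmatrix} 0 \\ 2 \end{bmatrix} . \]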
If you look at the last section, that is precisely the last example we gave.
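The definition of the product is short enough to transcribe directly. This is a sketch with matrices stored as lists of rows and an invented function name; it makes no attempt to be fast.

```python
def mat_mul(A, B):
    """Product of an m x n matrix A and an n x p matrix B.

    The ij-th entry is the dot product of row i of A with column j of B.
    """
    n = len(B)
    if any(len(row) != n for row in A):
        raise ValueError("the middle sizes must match")
    p = len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
            for i in range(len(A))]


A = [[1, 2, 3],
     [4, 5, 6]]
B = [[-1, 2],
     [-7, 0],
     [1, -1]]
print(mat_mul(A, B))                          # prints [[-12, -1], [-33, 2]]

# A column vector is an n x 1 matrix, so evaluating a matrix at a vector
# is the same computation.
print(mat_mul([[1, 2], [3, 4]], [[2], [-1]]))   # prints [[0], [2]]
```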
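You should stare at the computation of multiplication of matrices 𝐴𝐵 and the previous
definition of $A\vec{y}$ as a mapping for a moment. What we are doing with matrix multiplication
is applying the mapping 𝐴 to the columns of 𝐵. This is usually written as follows. Suppose
we write the 𝑛 × 𝑝 matrix $B = [\vec{b}_1 \ \vec{b}_2 \ \cdots \ \vec{b}_p]$, where $\vec{b}_1, \vec{b}_2, \ldots, \vec{b}_p$ are the columns of 𝐵.
Then for an 𝑚 × 𝑛 matrix 𝐴,
\[ AB = A[\vec{b}_1 \ \vec{b}_2 \ \cdots \ \vec{b}_p] = [A\vec{b}_1 \ A\vec{b}_2 \ \cdots \ A\vec{b}_p] . \]

The columns of the 𝑚 × 𝑝 matrix 𝐴𝐵 are the vectors $A\vec{b}_1, A\vec{b}_2, \ldots, A\vec{b}_p$. For example, in
(A.1), the columns of
\[ \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix} \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \\ b_{31} & b_{32} \end{bmatrix} \]
are
\[ \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix} \begin{bmatrix} b_{11} \\ b_{21} \\ b_{31} \end{bmatrix} \quad \text{and} \quad \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix} \begin{bmatrix} b_{12} \\ b_{22} \\ b_{32} \end{bmatrix} . \]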
This is a very useful way to understand what matrix multiplication is. It should also make
it easier to remember how to perform matrix multiplication.
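The column point of view can itself be used as an algorithm: apply 𝐴 to each column of 𝐵 and place the results side by side. A sketch, again with matrices as lists of rows and with hypothetical names:

```python
def mat_vec(A, x):
    """Apply the matrix A to the vector x."""
    return [sum(a * c for a, c in zip(row, x)) for row in A]


def columns(B):
    """Return the columns of B as a list of vectors."""
    return [list(col) for col in zip(*B)]


def mat_mul_by_columns(A, B):
    """Compute AB by applying A to every column of B."""
    image_columns = [mat_vec(A, b) for b in columns(B)]
    # The images are the columns of AB; transpose back to a list of rows.
    return [list(row) for row in zip(*image_columns)]


A = [[1, 2, 3],
     [4, 5, 6]]
B = [[-1, 2],
     [-7, 0],
     [1, -1]]
print(mat_mul_by_columns(A, B))   # prints [[-12, -1], [-33, 2]]
```

The result agrees with the numerical example computed entry by entry from the definition.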

A.2.4 Some rules of matrix algebra


For multiplication we want an analogue of a 1. That is, we desire a matrix that just leaves
everything as it found it. This analogue is the so-called identity matrix. The identity matrix
is a square matrix with 1s on the main diagonal and zeros everywhere else. It is usually
denoted by 𝐼. For each size we have a different identity matrix and so sometimes we may
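denote the size as a subscript. For example, $I_3$ is the 3 × 3 identity matrix

\[ I = I_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} . \]
Let us see how the matrix works on a smaller example,

\[ \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} a_{11} \cdot 1 + a_{12} \cdot 0 & a_{11} \cdot 0 + a_{12} \cdot 1 \\ a_{21} \cdot 1 + a_{22} \cdot 0 & a_{21} \cdot 0 + a_{22} \cdot 1 \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} . \]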

Multiplication by the identity from the left looks similar, and also does not touch anything.
We have the following rules for matrix multiplication. Suppose that 𝐴, 𝐵, 𝐶 are matrices
of the correct sizes so that the following make sense. Let 𝛼 denote a scalar (number). Then

𝐴(𝐵𝐶) = (𝐴𝐵)𝐶 (associative law),


𝐴(𝐵 + 𝐶) = 𝐴𝐵 + 𝐴𝐶 (distributive law),
(𝐵 + 𝐶)𝐴 = 𝐵𝐴 + 𝐶𝐴 (distributive law),
𝛼(𝐴𝐵) = (𝛼𝐴)𝐵 = 𝐴(𝛼𝐵),
𝐼𝐴 = 𝐴 = 𝐴𝐼 (identity).

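Example A.2.1: Let us demonstrate a couple of these rules. For example, the associative
law:
\[ \underbrace{\begin{bmatrix} -3 & 3 \\ 2 & -2 \end{bmatrix}}_{A} \left( \underbrace{\begin{bmatrix} 4 & 4 \\ 1 & -3 \end{bmatrix}}_{B} \underbrace{\begin{bmatrix} -1 & 4 \\ 5 & 2 \end{bmatrix}}_{C} \right) = \underbrace{\begin{bmatrix} -3 & 3 \\ 2 & -2 \end{bmatrix}}_{A} \underbrace{\begin{bmatrix} 16 & 24 \\ -16 & -2 \end{bmatrix}}_{BC} = \underbrace{\begin{bmatrix} -96 & -78 \\ 64 & 52 \end{bmatrix}}_{A(BC)} , \]

and
\[ \left( \underbrace{\begin{bmatrix} -3 & 3 \\ 2 & -2 \end{bmatrix}}_{A} \underbrace{\begin{bmatrix} 4 & 4 \\ 1 & -3 \end{bmatrix}}_{B} \right) \underbrace{\begin{bmatrix} -1 & 4 \\ 5 & 2 \end{bmatrix}}_{C} = \underbrace{\begin{bmatrix} -9 & -21 \\ 6 & 14 \end{bmatrix}}_{AB} \underbrace{\begin{bmatrix} -1 & 4 \\ 5 & 2 \end{bmatrix}}_{C} = \underbrace{\begin{bmatrix} -96 & -78 \\ 64 & 52 \end{bmatrix}}_{(AB)C} . \]

Or how about multiplication by scalars:

\[ 10 \left( \underbrace{\begin{bmatrix} -3 & 3 \\ 2 & -2 \end{bmatrix}}_{A} \underbrace{\begin{bmatrix} 4 & 4 \\ 1 & -3 \end{bmatrix}}_{B} \right) = 10 \underbrace{\begin{bmatrix} -9 & -21 \\ 6 & 14 \end{bmatrix}}_{AB} = \underbrace{\begin{bmatrix} -90 & -210 \\ 60 & 140 \end{bmatrix}}_{10(AB)} , \]
\[ \left( 10 \underbrace{\begin{bmatrix} -3 & 3 \\ 2 & -2 \end{bmatrix}}_{A} \right) \underbrace{\begin{bmatrix} 4 & 4 \\ 1 & -3 \end{bmatrix}}_{B} = \underbrace{\begin{bmatrix} -30 & 30 \\ 20 & -20 \end{bmatrix}}_{10A} \underbrace{\begin{bmatrix} 4 & 4 \\ 1 & -3 \end{bmatrix}}_{B} = \underbrace{\begin{bmatrix} -90 & -210 \\ 60 & 140 \end{bmatrix}}_{(10A)B} , \]

and
\[ \underbrace{\begin{bmatrix} -3 & 3 \\ 2 & -2 \end{bmatrix}}_{A} \left( 10 \underbrace{\begin{bmatrix} 4 & 4 \\ 1 & -3 \end{bmatrix}}_{B} \right) = \underbrace{\begin{bmatrix} -3 & 3 \\ 2 & -2 \end{bmatrix}}_{A} \underbrace{\begin{bmatrix} 40 & 40 \\ 10 & -30 \end{bmatrix}}_{10B} = \underbrace{\begin{bmatrix} -90 & -210 \\ 60 & 140 \end{bmatrix}}_{A(10B)} . \]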

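A multiplication rule, one you have used since primary school on numbers, is quite
conspicuously missing for matrices. That is, matrix multiplication is not commutative.
Firstly, just because 𝐴𝐵 makes sense, it may be that 𝐵𝐴 is not even defined. For example, if
𝐴 is 2 × 3, and 𝐵 is 3 × 4, then we can multiply 𝐴𝐵 but not 𝐵𝐴.
Even if 𝐴𝐵 and 𝐵𝐴 are both defined, it does not mean that they are equal. For example,
take $A = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix}$:

\[ AB = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix} = \begin{bmatrix} 1 & 2 \\ 1 & 2 \end{bmatrix} \neq \begin{bmatrix} 1 & 1 \\ 2 & 2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} = BA . \]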
1 1 0 2 1 2 2 2 0 2 1 1

A.2.5 Inverse
A couple of other algebra rules you know for numbers do not quite work on matrices:

(i) 𝐴𝐵 = 𝐴𝐶 does not necessarily imply 𝐵 = 𝐶, even if 𝐴 is not 0.

(ii) 𝐴𝐵 = 0 does not necessarily mean that 𝐴 = 0 or 𝐵 = 0.

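For example:
\[ \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} 0 & 2 \\ 0 & 0 \end{bmatrix} . \]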
To make these rules hold, we do not just need one of the matrices to not be zero, we
would need to “divide” by a matrix. This is where the matrix inverse comes in. Suppose
that 𝐴 and 𝐵 are 𝑛 × 𝑛 matrices such that

𝐴𝐵 = 𝐼 = 𝐵𝐴.

Then we call 𝐵 the inverse of 𝐴 and we denote 𝐵 by $A^{-1}$. Perhaps not surprisingly,
$(A^{-1})^{-1} = A$, since if the inverse of 𝐴 is 𝐵, then the inverse of 𝐵 is 𝐴. If the inverse of 𝐴
exists, then we say 𝐴 is invertible. If 𝐴 is not invertible, we say 𝐴 is singular.
If 𝐴 = [𝑎] is a 1 × 1 matrix, then $A^{-1}$ is $a^{-1} = \frac{1}{a}$. That is where the notation comes from.
The computation is not nearly as simple when 𝐴 is larger.
The proper formulation of the cancellation rule is:

If 𝐴 is invertible, then 𝐴𝐵 = 𝐴𝐶 implies 𝐵 = 𝐶.



The computation is what you would do in regular algebra with numbers, but you have to
be careful never to commute matrices:

𝐴𝐵 = 𝐴𝐶,
𝐴−1 𝐴𝐵 = 𝐴−1 𝐴𝐶,
𝐼𝐵 = 𝐼𝐶,
𝐵 = 𝐶.

And similarly for cancellation on the right:


If 𝐴 is invertible, then 𝐵𝐴 = 𝐶𝐴 implies 𝐵 = 𝐶.
The rule says, among other things, that the inverse of a matrix is unique if it exists: If
𝐴𝐵 = 𝐼 = 𝐴𝐶, then 𝐴 is invertible and 𝐵 = 𝐶.
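We will see later how to compute an inverse of a matrix in general. For now, let us note
that there is a simple formula for the inverse of a 2 × 2 matrix
\[
\begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1}
= \frac{1}{ad - bc}
\begin{bmatrix} d & -b \\ -c & a \end{bmatrix} .
\]
For example:
\[
\begin{bmatrix} 1 & 1 \\ 2 & 4 \end{bmatrix}^{-1}
= \frac{1}{1 \cdot 4 - 1 \cdot 2}
\begin{bmatrix} 4 & -1 \\ -2 & 1 \end{bmatrix}
=
\begin{bmatrix} 2 & -1/2 \\ -1 & 1/2 \end{bmatrix} .
\]
Let’s try it:
\[
\begin{bmatrix} 1 & 1 \\ 2 & 4 \end{bmatrix}
\begin{bmatrix} 2 & -1/2 \\ -1 & 1/2 \end{bmatrix}
=
\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\qquad \text{and} \qquad
\begin{bmatrix} 2 & -1/2 \\ -1 & 1/2 \end{bmatrix}
\begin{bmatrix} 1 & 1 \\ 2 & 4 \end{bmatrix}
=
\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} .
\]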
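The 2 × 2 formula translates directly into code. Here is a small sketch in plain Python using exact fractions; the function name `inverse_2x2` is a hypothetical helper of ours, and it simply refuses to proceed when 𝑎𝑑 − 𝑏𝑐 is zero, since the formula would then divide by zero.

```python
from fractions import Fraction

def inverse_2x2(M):
    """Inverse of [[a, b], [c, d]] via the formula with 1/(ad - bc)."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("ad - bc = 0, the formula does not apply")
    s = Fraction(1) / det  # exact arithmetic, no rounding
    return [[d * s, -b * s], [-c * s, a * s]]

inv = inverse_2x2([[1, 1], [2, 4]])
print([[str(v) for v in row] for row in inv])  # [['2', '-1/2'], ['-1', '1/2']]
```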
Just as we cannot divide by every number, not every matrix is invertible. In the case of
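matrices however we may have singular matrices that are not zero. For example,
\[
\begin{bmatrix} 1 & 1 \\ 2 & 2 \end{bmatrix}
\]
is a singular matrix. But didn’t we just give a formula for an inverse? Let us try it:
\[
\begin{bmatrix} 1 & 1 \\ 2 & 2 \end{bmatrix}^{-1}
= \frac{1}{1 \cdot 2 - 1 \cdot 2}
\begin{bmatrix} 2 & -1 \\ -2 & 1 \end{bmatrix}
= \; ?
\]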
We get into a bit of trouble; we are trying to divide by zero.
So a 2 × 2 matrix 𝐴 is invertible whenever

𝑎𝑑 − 𝑏𝑐 ≠ 0

and otherwise it is singular. The expression 𝑎𝑑 − 𝑏𝑐 is called the determinant and we will
look at it more carefully in a later section. There is a similar expression for a square matrix
of any size.

A.2.6 Diagonal matrices


A simple (and surprisingly useful) type of a square matrix is a so-called diagonal matrix. It
is a matrix whose entries are all zero except those on the main diagonal from top left to
bottom right. For example a 4 × 4 diagonal matrix is of the form
\[
\begin{bmatrix}
d_1 & 0 & 0 & 0 \\
0 & d_2 & 0 & 0 \\
0 & 0 & d_3 & 0 \\
0 & 0 & 0 & d_4
\end{bmatrix} .
\]
Such matrices have nice properties when we multiply by them. If we multiply them by a
vector, they multiply the $k^{\text{th}}$ entry by $d_k$. For example,
\[
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{bmatrix}
\begin{bmatrix} 4 \\ 5 \\ 6 \end{bmatrix}
=
\begin{bmatrix} 1 \cdot 4 \\ 2 \cdot 5 \\ 3 \cdot 6 \end{bmatrix}
=
\begin{bmatrix} 4 \\ 10 \\ 18 \end{bmatrix} .
\]
Similarly, when they multiply another matrix from the left, they multiply the $k^{\text{th}}$ row by $d_k$.
For example,
\[
\begin{bmatrix} 2 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & -1 \end{bmatrix}
\begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}
=
\begin{bmatrix} 2 & 2 & 2 \\ 3 & 3 & 3 \\ -1 & -1 & -1 \end{bmatrix} .
\]
On the other hand, multiplying on the right, they multiply the columns:
\[
\begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}
\begin{bmatrix} 2 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & -1 \end{bmatrix}
=
\begin{bmatrix} 2 & 3 & -1 \\ 2 & 3 & -1 \\ 2 & 3 & -1 \end{bmatrix} .
\]
And it is really easy to multiply two diagonal matrices together: we multiply the entries:
\[
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{bmatrix}
\begin{bmatrix} 2 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & -1 \end{bmatrix}
=
\begin{bmatrix} 1 \cdot 2 & 0 & 0 \\ 0 & 2 \cdot 3 & 0 \\ 0 & 0 & 3 \cdot (-1) \end{bmatrix}
=
\begin{bmatrix} 2 & 0 & 0 \\ 0 & 6 & 0 \\ 0 & 0 & -3 \end{bmatrix} .
\]
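For this last reason, they are easy to invert; you simply invert each diagonal element:
\[
\begin{bmatrix} d_1 & 0 & 0 \\ 0 & d_2 & 0 \\ 0 & 0 & d_3 \end{bmatrix}^{-1}
=
\begin{bmatrix} d_1^{-1} & 0 & 0 \\ 0 & d_2^{-1} & 0 \\ 0 & 0 & d_3^{-1} \end{bmatrix} .
\]
Let us check an example
\[
\underbrace{\begin{bmatrix} 2 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 4 \end{bmatrix}^{-1}}_{A^{-1}}
\underbrace{\begin{bmatrix} 2 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 4 \end{bmatrix}}_{A}
=
\underbrace{\begin{bmatrix} \frac{1}{2} & 0 & 0 \\ 0 & \frac{1}{3} & 0 \\ 0 & 0 & \frac{1}{4} \end{bmatrix}}_{A^{-1}}
\underbrace{\begin{bmatrix} 2 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 4 \end{bmatrix}}_{A}
=
\underbrace{\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}}_{I} .
\]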

It is no wonder that the way we solve many problems in linear algebra (and in differential
equations) is to try to reduce the problem to the case of diagonal matrices.

A.2.7 Transpose
Vectors do not always have to be column vectors, that is just a convention. Swapping rows
and columns is from time to time needed. The operation that swaps rows and columns is
the so-called transpose. The transpose of 𝐴 is denoted by $A^T$. Example:
\[
\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}^T
=
\begin{bmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{bmatrix} .
\]
Transpose takes an 𝑚 × 𝑛 matrix to an 𝑛 × 𝑚 matrix.
A key feature of the transpose is that if the product 𝐴𝐵 makes sense, then $B^T A^T$ also
makes sense, at least from the point of view of sizes. In fact, we get precisely the transpose
of 𝐴𝐵. That is:
\[
(AB)^T = B^T A^T .
\]
For example,
\[
\left(
\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}
\begin{bmatrix} 0 & 1 \\ 1 & 0 \\ 2 & -2 \end{bmatrix}
\right)^T
=
\begin{bmatrix} 0 & 1 & 2 \\ 1 & 0 & -2 \end{bmatrix}
\begin{bmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{bmatrix} .
\]
It is left to the reader to verify that computing the matrix product on the left and then
transposing is the same as computing the matrix product on the right.
If we have a column vector $\vec{x}$ to which we apply a matrix $A$ and we transpose the result,
then the row vector $\vec{x}^T$ applies to $A^T$ from the left:
\[
(A \vec{x})^T = \vec{x}^T A^T .
\]
Another place where transpose is useful is when we wish to apply the dot product‗ to
two column vectors:
\[
\vec{x} \cdot \vec{y} = \vec{y}^T \vec{x} .
\]
That is the way that one often writes the dot product in software.
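As an illustration, here is a short sketch in plain Python, with matrices stored as lists of rows and column vectors as 𝑛 × 1 matrices. The helpers `transpose` and `matmul` are our own names, not library functions. The sketch checks the product rule for the transpose on the example above and writes a dot product as a 1 × 1 matrix product.

```python
def transpose(M):
    """Swap rows and columns of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

A = [[1, 2, 3], [4, 5, 6]]
B = [[0, 1], [1, 0], [2, -2]]
print(transpose(A))  # [[1, 4], [2, 5], [3, 6]]
print(transpose(matmul(A, B)) == matmul(transpose(B), transpose(A)))  # True

# Dot product of two column vectors written as y^T x (a 1x1 matrix).
x = [[1], [2], [3]]
y = [[4], [5], [6]]
print(matmul(transpose(y), x))  # [[32]]
```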
We say a matrix $A$ is symmetric if $A = A^T$. For example,
\[
\begin{bmatrix} 1 & 2 & 3 \\ 2 & 4 & 5 \\ 3 & 5 & 6 \end{bmatrix}
\]
is a symmetric matrix. Notice that a symmetric matrix is always square, that is, 𝑛 × 𝑛.
Symmetric matrices have many nice properties† , and come up quite often in applications.
‗As a side note, mathematicians write $\vec{y}^T \vec{x}$ and physicists write $\vec{x}^T \vec{y}$. Shhh. . . don’t tell anyone, but the
physicists are probably right on this.
†Although so far we have not learned enough about matrices to really appreciate them.

A.2.8 Exercises
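Exercise A.2.1: Add the following matrices

a) $\begin{bmatrix} -1 & 2 & 2 \\ 5 & 8 & -1 \end{bmatrix} + \begin{bmatrix} 3 & 2 & 3 \\ 8 & 3 & 5 \end{bmatrix}$
b) $\begin{bmatrix} 1 & 2 & 4 \\ 2 & 3 & 1 \\ 0 & 5 & 1 \end{bmatrix} + \begin{bmatrix} 2 & -8 & -3 \\ 3 & 1 & 0 \\ 6 & -4 & 1 \end{bmatrix}$

Exercise A.2.2: Compute

a) $3 \begin{bmatrix} 0 & 3 \\ -2 & 2 \end{bmatrix} + 6 \begin{bmatrix} 1 & 5 \\ -1 & 5 \end{bmatrix}$
b) $2 \begin{bmatrix} -3 & 1 \\ 2 & 2 \end{bmatrix} - 3 \begin{bmatrix} 2 & -1 \\ 3 & 2 \end{bmatrix}$

Exercise A.2.3: Multiply the following matrices

a) $\begin{bmatrix} -1 & 2 \\ 3 & 1 \\ 5 & 8 \end{bmatrix} \begin{bmatrix} 3 & -1 & 3 & 1 \\ 8 & 3 & 2 & -3 \end{bmatrix}$
b) $\begin{bmatrix} 1 & 2 & 3 \\ 3 & 1 & 1 \\ 1 & 0 & 3 \end{bmatrix} \begin{bmatrix} 2 & 3 & 1 & 7 \\ 1 & 2 & 3 & -1 \\ 1 & -1 & 3 & 0 \end{bmatrix}$
c) $\begin{bmatrix} 4 & 1 & 6 & 3 \\ 5 & 6 & 5 & 0 \\ 4 & 6 & 6 & 0 \end{bmatrix} \begin{bmatrix} 2 & 5 \\ 1 & 2 \\ 3 & 5 \\ 5 & 6 \end{bmatrix}$
d) $\begin{bmatrix} 1 & 1 & 4 \\ 0 & 5 & 1 \end{bmatrix} \begin{bmatrix} 2 & 2 \\ 1 & 0 \\ 6 & 4 \end{bmatrix}$

Exercise A.2.4: Compute the inverse of the given matrices

a) $\begin{bmatrix} -3 \end{bmatrix}$
b) $\begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}$
c) $\begin{bmatrix} 1 & 4 \\ 1 & 3 \end{bmatrix}$
d) $\begin{bmatrix} 2 & 2 \\ 1 & 4 \end{bmatrix}$

Exercise A.2.5: Compute the inverse of the given matrices

a) $\begin{bmatrix} -2 & 0 \\ 0 & 1 \end{bmatrix}$
b) $\begin{bmatrix} 3 & 0 & 0 \\ 0 & -2 & 0 \\ 0 & 0 & 1 \end{bmatrix}$
c) $\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & 0.01 & 0 \\ 0 & 0 & 0 & -5 \end{bmatrix}$

Exercise A.2.101: Add the following matrices

a) $\begin{bmatrix} 2 & 1 & 0 \\ 1 & 1 & -1 \end{bmatrix} + \begin{bmatrix} 5 & 3 & 4 \\ 1 & 2 & 5 \end{bmatrix}$
b) $\begin{bmatrix} 6 & -2 & 3 \\ 7 & 3 & 3 \\ 8 & -1 & 2 \end{bmatrix} + \begin{bmatrix} -1 & -1 & -3 \\ 6 & 7 & 3 \\ -9 & 4 & -1 \end{bmatrix}$

Exercise A.2.102: Compute

a) $2 \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} + 3 \begin{bmatrix} -1 & 3 \\ 1 & 2 \end{bmatrix}$
b) $3 \begin{bmatrix} 2 & -1 \\ 1 & 3 \end{bmatrix} - 2 \begin{bmatrix} 2 & 1 \\ -1 & 2 \end{bmatrix}$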

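Exercise A.2.103: Multiply the following matrices

a) $\begin{bmatrix} 2 & 1 & 4 \\ 3 & 4 & 4 \end{bmatrix} \begin{bmatrix} 2 & 4 \\ 6 & 3 \\ 3 & 5 \end{bmatrix}$
b) $\begin{bmatrix} 0 & 3 & 3 \\ 2 & -2 & 1 \\ 3 & 5 & -2 \end{bmatrix} \begin{bmatrix} 6 & 6 & 2 \\ 4 & 6 & 0 \\ 2 & 0 & 4 \end{bmatrix}$
c) $\begin{bmatrix} 3 & 4 & 1 \\ 2 & -1 & 0 \\ 4 & -1 & 5 \end{bmatrix} \begin{bmatrix} 0 & 2 & 5 & 0 \\ 2 & 0 & 5 & 2 \\ 3 & 6 & 1 & 6 \end{bmatrix}$
d) $\begin{bmatrix} -2 & -2 \\ 5 & 3 \\ 2 & 1 \end{bmatrix} \begin{bmatrix} 0 & 3 \\ 1 & 3 \end{bmatrix}$

Exercise A.2.104: Compute the inverse of the given matrices

a) $\begin{bmatrix} 2 \end{bmatrix}$
b) $\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$
c) $\begin{bmatrix} 1 & 2 \\ 3 & 5 \end{bmatrix}$
d) $\begin{bmatrix} 4 & 2 \\ 4 & 4 \end{bmatrix}$

Exercise A.2.105: Compute the inverse of the given matrices

a) $\begin{bmatrix} 2 & 0 \\ 0 & 3 \end{bmatrix}$
b) $\begin{bmatrix} 4 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & -1 \end{bmatrix}$
c) $\begin{bmatrix} -1 & 0 & 0 & 0 \\ 0 & 2 & 0 & 0 \\ 0 & 0 & 3 & 0 \\ 0 & 0 & 0 & 0.1 \end{bmatrix}$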


A.3 Elimination
Note: 2–3 lectures

A.3.1 Linear systems of equations


One application of matrices is to solve systems of linear equations‗. Consider the following
system of linear equations
\[
\begin{aligned}
2 x_1 + 2 x_2 + 2 x_3 &= 2 , \\
x_1 + x_2 + 3 x_3 &= 5 , \\
x_1 + 4 x_2 + x_3 &= 10 .
\end{aligned}
\tag{A.2}
\]
There is a systematic procedure called elimination to solve such a system. In this
procedure, we attempt to eliminate each variable from all but one equation. We want to
end up with equations such as 𝑥3 = 2, where we can just read off the answer.
We write a system of linear equations as a matrix equation:
\[
A \vec{x} = \vec{b} .
\]
The system (A.2) is written as
\[
\underbrace{\begin{bmatrix} 2 & 2 & 2 \\ 1 & 1 & 3 \\ 1 & 4 & 1 \end{bmatrix}}_{A}
\underbrace{\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}}_{\vec{x}}
=
\underbrace{\begin{bmatrix} 2 \\ 5 \\ 10 \end{bmatrix}}_{\vec{b}} .
\]
If we knew the inverse of $A$, then we would be done; we would simply solve the equation:
\[
\vec{x} = A^{-1} A \vec{x} = A^{-1} \vec{b} .
\]
Well, but that is part of the problem: we do not know how to compute the inverse for
matrices bigger than 2 × 2. We will see later that to compute the inverse we are really
solving $A \vec{x} = \vec{b}$ for several different $\vec{b}$. In other words, we will need to do elimination to
find $A^{-1}$. In addition, we may wish to solve $A \vec{x} = \vec{b}$ if $A$ is not invertible, or perhaps not
even square.
Let us return to the equations themselves and see how we can manipulate them. There
are a few operations we can perform on the equations that do not change the solution.
First, perhaps an operation that may seem stupid, we can swap two equations in (A.2):
\[
\begin{aligned}
x_1 + x_2 + 3 x_3 &= 5 , \\
2 x_1 + 2 x_2 + 2 x_3 &= 2 , \\
x_1 + 4 x_2 + x_3 &= 10 .
\end{aligned}
\]
‗Although perhaps we have this backwards, quite often we solve a linear system of equations to find out
something about matrices, rather than vice versa.

Clearly these new equations have the same solutions $x_1, x_2, x_3$. A second operation is that
we can multiply an equation by a nonzero number. For example, we multiply the third
equation in (A.2) by 3:
\[
\begin{aligned}
2 x_1 + 2 x_2 + 2 x_3 &= 2 , \\
x_1 + x_2 + 3 x_3 &= 5 , \\
3 x_1 + 12 x_2 + 3 x_3 &= 30 .
\end{aligned}
\]
Finally, we can add a multiple of one equation to another equation. For instance, we add 3
times the third equation in (A.2) to the second equation:
\[
\begin{aligned}
2 x_1 + 2 x_2 + 2 x_3 &= 2 , \\
(1 + 3) x_1 + (1 + 12) x_2 + (3 + 3) x_3 &= 5 + 30 , \\
x_1 + 4 x_2 + x_3 &= 10 .
\end{aligned}
\]
The same $x_1, x_2, x_3$ should still be solutions to the new equations. These were just examples;
we did not get any closer to the solution. We must do these three operations in some
more logical manner, but it turns out these three operations suffice to solve every system
of linear equations.
The first thing is to write the equations in a more compact manner. Given
\[
A \vec{x} = \vec{b} ,
\]
we write down the so-called augmented matrix
\[
[ A \mid \vec{b} \, ] ,
\]
where the vertical line is just a marker for us to know where the “right-hand side” of the
equation starts. For the system (A.2) the augmented matrix is
\[
\left[ \begin{array}{ccc|c}
2 & 2 & 2 & 2 \\
1 & 1 & 3 & 5 \\
1 & 4 & 1 & 10
\end{array} \right] .
\]
The entire process of elimination, which we will describe, is often applied to any sort of
matrix, not just an augmented matrix. Simply think of the matrix as the 3 × 4 matrix
\[
\begin{bmatrix}
2 & 2 & 2 & 2 \\
1 & 1 & 3 & 5 \\
1 & 4 & 1 & 10
\end{bmatrix} .
\]

A.3.2 Row echelon form and elementary operations


We apply the three operations above to the matrix. We call these the elementary operations
or elementary row operations. Translating the operations to the matrix setting, the operations
become:

(i) Swap two rows.

(ii) Multiply a row by a nonzero number.

(iii) Add a multiple of one row to another row.
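The three operations are easy to express in code. The following is a minimal sketch in plain Python, assuming the matrix is a list of rows that is modified in place and that rows are indexed from 0; the function names are ours. The example repeats the earlier computation of adding 3 times the third equation of (A.2) to the second.

```python
def swap_rows(M, i, j):
    """(i) Swap rows i and j."""
    M[i], M[j] = M[j], M[i]

def scale_row(M, i, c):
    """(ii) Multiply row i by a nonzero number c."""
    if c == 0:
        raise ValueError("the multiplier must be nonzero")
    M[i] = [c * v for v in M[i]]

def add_multiple(M, src, dst, c):
    """(iii) Add c times row src to row dst."""
    M[dst] = [d + c * s for s, d in zip(M[src], M[dst])]

# Augmented matrix of the system (A.2); rows are numbered 0, 1, 2.
M = [[2, 2, 2, 2],
     [1, 1, 3, 5],
     [1, 4, 1, 10]]
add_multiple(M, src=2, dst=1, c=3)
print(M[1])  # [4, 13, 6, 35]
```

Requiring a nonzero multiplier in `scale_row` matters: multiplying a row by zero would throw away an equation and could change the set of solutions.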

We run these operations until we get into a state where it is easy to read off the answer, or
until we get into a contradiction indicating no solution.
More specifically, we run the operations until we obtain the so-called row echelon form.
Let us call the first (from the left) nonzero entry in each row the leading entry. A matrix is
in row echelon form if the following conditions are satisfied:

(i) The leading entry in any row is strictly to the right of the leading entry of the row
above.

(ii) Any zero rows are below all the nonzero rows.

(iii) All leading entries are 1.

A matrix is in reduced row echelon form if furthermore the following condition is satisfied.

(iv) All the entries above a leading entry are zero.

Note that the definition applies to matrices of any size.


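Example A.3.1: The following matrices are in row echelon form. The leading entries are
marked:
\[
\begin{bmatrix} \boxed{1} & 2 & 9 & 3 \\ 0 & 0 & \boxed{1} & 5 \\ 0 & 0 & 0 & \boxed{1} \end{bmatrix}
\qquad
\begin{bmatrix} \boxed{1} & -1 & -3 \\ 0 & \boxed{1} & 5 \\ 0 & 0 & \boxed{1} \end{bmatrix}
\qquad
\begin{bmatrix} \boxed{1} & 2 & 1 \\ 0 & \boxed{1} & 2 \\ 0 & 0 & 0 \end{bmatrix}
\qquad
\begin{bmatrix} 0 & \boxed{1} & -5 & 2 \\ 0 & 0 & 0 & \boxed{1} \\ 0 & 0 & 0 & 0 \end{bmatrix}
\]
None of the matrices above are in reduced row echelon form. For example, in the first matrix
none of the entries above the second and third leading entries are zero; they are 9, 3, and 5.
The following matrices are in reduced row echelon form. The leading entries are marked:
\[
\begin{bmatrix} \boxed{1} & 3 & 0 & 8 \\ 0 & 0 & \boxed{1} & 6 \\ 0 & 0 & 0 & 0 \end{bmatrix}
\qquad
\begin{bmatrix} \boxed{1} & 0 & 2 & 0 \\ 0 & \boxed{1} & 3 & 0 \\ 0 & 0 & 0 & \boxed{1} \end{bmatrix}
\qquad
\begin{bmatrix} \boxed{1} & 0 & 3 \\ 0 & \boxed{1} & -2 \\ 0 & 0 & 0 \end{bmatrix}
\qquad
\begin{bmatrix} 0 & \boxed{1} & 2 & 0 \\ 0 & 0 & 0 & \boxed{1} \\ 0 & 0 & 0 & 0 \end{bmatrix}
\]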
The procedure we will describe to find a reduced row echelon form of a matrix is called
Gauss–Jordan elimination. The first part of it, which obtains a row echelon form, is called
Gaussian elimination or row reduction. For some problems, a row echelon form is sufficient,
and it is a bit less work to only do this first part.
To attain the row echelon form we work systematically. We go column by column,
starting at the first column. We find the topmost entry in the first column that is not zero, and
we call it the pivot. If there is no nonzero entry we move to the next column. We swap rows
to put the row with the pivot as the first row. We divide the first row by the pivot to make
the pivot entry be a 1. Now look at all the rows below and subtract the correct multiple of
the pivot row so that all the entries below the pivot become zero.
After this procedure we forget that we had a first row (it is now fixed), and we forget
about the column with the pivot and all the preceding zero columns. Below the pivot row,
all the entries in these columns are just zero. Then we focus on the smaller matrix and we
repeat the steps above.
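The steps above can be sketched in a few lines of plain Python. This is only an illustration under our own conventions: the function name `row_echelon` is hypothetical, the matrix is a list of rows, and exact fractions are used so that no rounding occurs.

```python
from fractions import Fraction

def row_echelon(M):
    """Return a row echelon form of M with all leading entries equal to 1."""
    A = [[Fraction(v) for v in row] for row in M]
    rows, cols = len(A), len(A[0])
    r = 0  # row where the next pivot will be placed
    for c in range(cols):
        if r == rows:
            break
        # Topmost row at or below r with a nonzero entry in column c.
        p = next((i for i in range(r, rows) if A[i][c] != 0), None)
        if p is None:
            continue  # no pivot in this column, move to the next one
        A[r], A[p] = A[p], A[r]            # swap the pivot row into place
        pivot = A[r][c]
        A[r] = [v / pivot for v in A[r]]   # make the pivot entry a 1
        for i in range(r + 1, rows):       # zero out the entries below
            f = A[i][c]
            A[i] = [a - f * b for a, b in zip(A[i], A[r])]
        r += 1  # from now on ignore the rows above
    return A

R = row_echelon([[2, 2, 2, 2], [1, 1, 3, 5], [1, 4, 1, 10]])
print([[str(v) for v in row] for row in R])
# [['1', '1', '1', '1'], ['0', '1', '0', '3'], ['0', '0', '1', '2']]
```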
It is best shown by example, so let us go back to the example from the beginning of the
section. We keep the vertical line in the matrix, even though the procedure works on any
matrix, not just an augmented matrix. We start with the first column and we locate the
pivot, in this case the first entry of the first column.
\[
\left[ \begin{array}{ccc|c}
\boxed{2} & 2 & 2 & 2 \\
1 & 1 & 3 & 5 \\
1 & 4 & 1 & 10
\end{array} \right]
\]
We multiply the first row by 1/2.
\[
\left[ \begin{array}{ccc|c}
\boxed{1} & 1 & 1 & 1 \\
1 & 1 & 3 & 5 \\
1 & 4 & 1 & 10
\end{array} \right]
\]
We subtract the first row from the second and third row (two elementary operations).
\[
\left[ \begin{array}{ccc|c}
1 & 1 & 1 & 1 \\
0 & 0 & 2 & 4 \\
0 & 3 & 0 & 9
\end{array} \right]
\]
We are done with the first column and the first row for now. We almost pretend the matrix
does not have the first column and the first row.
\[
\left[ \begin{array}{ccc|c}
* & * & * & * \\
* & 0 & 2 & 4 \\
* & 3 & 0 & 9
\end{array} \right]
\]
OK, look at the second column, and notice that now the pivot is in the third row.
\[
\left[ \begin{array}{ccc|c}
1 & 1 & 1 & 1 \\
0 & 0 & 2 & 4 \\
0 & \boxed{3} & 0 & 9
\end{array} \right]
\]
We swap rows.
\[
\left[ \begin{array}{ccc|c}
1 & 1 & 1 & 1 \\
0 & \boxed{3} & 0 & 9 \\
0 & 0 & 2 & 4
\end{array} \right]
\]
And we divide the pivot row by 3.
\[
\left[ \begin{array}{ccc|c}
1 & 1 & 1 & 1 \\
0 & \boxed{1} & 0 & 3 \\
0 & 0 & 2 & 4
\end{array} \right]
\]
We do not need to subtract anything as everything below the pivot is already zero. We
move on, we again start ignoring the second row and second column and focus on
\[
\left[ \begin{array}{ccc|c}
* & * & * & * \\
* & * & * & * \\
* & * & 2 & 4
\end{array} \right] .
\]
We find the pivot, then divide that row by 2:
\[
\left[ \begin{array}{ccc|c}
1 & 1 & 1 & 1 \\
0 & 1 & 0 & 3 \\
0 & 0 & \boxed{2} & 4
\end{array} \right]
\to
\left[ \begin{array}{ccc|c}
1 & 1 & 1 & 1 \\
0 & 1 & 0 & 3 \\
0 & 0 & \boxed{1} & 2
\end{array} \right] .
\]
The matrix is now in row echelon form.
The equation corresponding to the last row is 𝑥 3 = 2. We know 𝑥3 and we could
substitute it into the first two equations to get equations for 𝑥1 and 𝑥 2 . Then we could
do the same thing with 𝑥2 , until we solve for all 3 variables. This procedure is called
backsubstitution and we can achieve it via elementary operations. We start from the lowest
pivot (leading entry in the row echelon form) and subtract the right multiple from the row
above to make all the entries above this pivot zero. Then we move to the next pivot and so
on. After we are done, we will have a matrix in reduced row echelon form.
We continue our example. Subtract the last row from the first to get
\[
\left[ \begin{array}{ccc|c}
1 & 1 & 0 & -1 \\
0 & 1 & 0 & 3 \\
0 & 0 & \boxed{1} & 2
\end{array} \right] .
\]
The entry above the pivot in the second row is already zero. So we move onto the next
pivot, the one in the second row. We subtract this row from the top row to get
\[
\left[ \begin{array}{ccc|c}
1 & 0 & 0 & -4 \\
0 & \boxed{1} & 0 & 3 \\
0 & 0 & 1 & 2
\end{array} \right] .
\]
The matrix is in reduced row echelon form.
If we now write down the equations for $x_1, x_2, x_3$, we find
\[
x_1 = -4 , \qquad x_2 = 3 , \qquad x_3 = 2 .
\]

In other words, we have solved the system.
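For comparison, here is a hedged sketch of the whole procedure in plain Python. The name `rref` is our own, and the sketch makes one simplification relative to the text: it clears the entries above and below each pivot in the same pass instead of doing backsubstitution afterwards. The reduced row echelon form it arrives at is the same.

```python
from fractions import Fraction

def rref(M):
    """Return (reduced row echelon form of M, list of pivot column indices)."""
    A = [[Fraction(v) for v in row] for row in M]
    rows, cols = len(A), len(A[0])
    pivots = []
    r = 0
    for c in range(cols):
        if r == rows:
            break
        p = next((i for i in range(r, rows) if A[i][c] != 0), None)
        if p is None:
            continue
        A[r], A[p] = A[p], A[r]
        pivot = A[r][c]
        A[r] = [v / pivot for v in A[r]]
        for i in range(rows):              # clear the column above and below
            if i != r:
                f = A[i][c]
                A[i] = [a - f * b for a, b in zip(A[i], A[r])]
        pivots.append(c)
        r += 1
    return A, pivots

R, pivots = rref([[2, 2, 2, 2], [1, 1, 3, 5], [1, 4, 1, 10]])
print([str(row[-1]) for row in R])  # ['-4', '3', '2']
print(pivots)                       # [0, 1, 2]
```

With a pivot in each of the first three columns (indexed from 0), the last column of the result lists the values of the three unknowns.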



A.3.3 Non-unique solutions and inconsistent systems


It is possible that the solution of a linear system of equations is not unique, or that no
solution exists. Suppose for a moment that the row echelon form we found was
\[
\left[ \begin{array}{ccc|c}
1 & 2 & 3 & 4 \\
0 & 0 & 1 & 3 \\
0 & 0 & 0 & 1
\end{array} \right] .
\]
Then we have an equation 0 = 1 coming from the last row. That is impossible and the
equations are what we call inconsistent. There is no solution to $A \vec{x} = \vec{b}$.
On the other hand, if we find a row echelon form
\[
\left[ \begin{array}{ccc|c}
1 & 2 & 3 & 4 \\
0 & 0 & 1 & 3 \\
0 & 0 & 0 & 0
\end{array} \right] ,
\]
then there is no issue with finding solutions. In fact, we will find way too many. Let us
continue with backsubstitution (subtracting 3 times the second row from the first) to find the
reduced row echelon form and let’s mark the pivots.
\[
\left[ \begin{array}{ccc|c}
\boxed{1} & 2 & 0 & -5 \\
0 & 0 & \boxed{1} & 3 \\
0 & 0 & 0 & 0
\end{array} \right]
\]
The last row is all zeros; it just says 0 = 0 and we ignore it. The two remaining equations
are
\[
x_1 + 2 x_2 = -5 , \qquad x_3 = 3 .
\]
Let us solve for the variables that corresponded to the pivots, that is, $x_1$ and $x_3$ as there
was a pivot in the first column and in the third column:
\[
\begin{aligned}
x_1 &= -2 x_2 - 5 , \\
x_3 &= 3 .
\end{aligned}
\]

The variable 𝑥2 can be anything you wish and we still get a solution. The 𝑥2 is called a free
variable. There are infinitely many solutions, one for every choice of 𝑥2 . If we pick 𝑥2 = 0,
then 𝑥1 = −5, and 𝑥3 = 3 give a solution. But we also get a solution by picking say 𝑥2 = 1,
in which case 𝑥1 = −7 and 𝑥3 = 3, or by picking 𝑥2 = −5 in which case 𝑥1 = 5 and 𝑥3 = 3.
The general idea is that if any row has all zeros in the columns corresponding to the
variables, but a nonzero entry in the column corresponding to the right-hand side $\vec{b}$, then
the system is inconsistent and has no solutions. In other words, the system is inconsistent
if you find a pivot on the right side of the vertical line drawn in the augmented matrix.
Otherwise, the system is consistent, and at least one solution exists.

Suppose the system is consistent (at least one solution exists):


(i) If every column corresponding to a variable has a pivot element, then the solution is
unique.
(ii) If there are columns corresponding to variables with no pivot, then those are free
variables that can be chosen arbitrarily, and there are infinitely many solutions.

When $\vec{b} = \vec{0}$, we have a so-called homogeneous matrix equation
\[
A \vec{x} = \vec{0} .
\]
There is no need to write an augmented matrix in this case. As the elementary operations
do not do anything to a zero column, it always stays a zero column. Moreover, $A \vec{x} = \vec{0}$
always has at least one solution, namely $\vec{x} = \vec{0}$. Such a system is always consistent. It may
have other solutions: If you find any free variables, then you get infinitely many solutions.
The set of solutions of $A \vec{x} = \vec{0}$ comes up quite often so people give it a name. It is called
the nullspace or the kernel of $A$. One place where the kernel comes up is invertibility of a
square matrix $A$. If the kernel of $A$ contains a nonzero vector, then it contains infinitely
many vectors (there was a free variable). But then it is impossible to invert $\vec{0}$, since infinitely
many vectors go to $\vec{0}$, so there is no unique vector that $A$ takes to $\vec{0}$. So if the kernel is
nontrivial, that is, if there are any nonzero vectors in the kernel, in other words, if there are
any free variables, or in yet other words, if the row echelon form of $A$ has columns without
pivots, then $A$ is not invertible. We will return to this idea later.

A.3.4 Linear independence and rank


If rows of a matrix correspond to equations, it may be good to find out how many equations
we really need to find the same set of solutions. Similarly, if we find a number of solutions
to a linear equation $A \vec{x} = \vec{0}$, we may ask if we found enough so that all other solutions can
be formed out of the given set. The concept we want is that of linear independence. That
same concept is useful for differential equations, for example in chapter 2.
Given row or column vectors $\vec{y}_1, \vec{y}_2, \ldots, \vec{y}_n$, a linear combination is an expression of the
form
\[
\alpha_1 \vec{y}_1 + \alpha_2 \vec{y}_2 + \cdots + \alpha_n \vec{y}_n ,
\]
where $\alpha_1, \alpha_2, \ldots, \alpha_n$ are all scalars. For example, $3 \vec{y}_1 + \vec{y}_2 - 5 \vec{y}_3$ is a linear combination of
$\vec{y}_1$, $\vec{y}_2$, and $\vec{y}_3$.
We have seen linear combinations before. The expression
\[
A \vec{x}
\]
is a linear combination of the columns of $A$, while
\[
\vec{x}^T A = (A^T \vec{x})^T
\]
is a linear combination of the rows of $A$.

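The way linear combinations come up in our study of differential equations is similar to
the following computation. Suppose that $\vec{x}_1, \vec{x}_2, \ldots, \vec{x}_n$ are solutions to $A \vec{x}_1 = \vec{0}$, $A \vec{x}_2 = \vec{0}$,
. . . , $A \vec{x}_n = \vec{0}$. Then the linear combination
\[
\vec{y} = \alpha_1 \vec{x}_1 + \alpha_2 \vec{x}_2 + \cdots + \alpha_n \vec{x}_n
\]
is a solution to $A \vec{y} = \vec{0}$:
\[
\begin{aligned}
A \vec{y} &= A ( \alpha_1 \vec{x}_1 + \alpha_2 \vec{x}_2 + \cdots + \alpha_n \vec{x}_n ) \\
&= \alpha_1 A \vec{x}_1 + \alpha_2 A \vec{x}_2 + \cdots + \alpha_n A \vec{x}_n
= \alpha_1 \vec{0} + \alpha_2 \vec{0} + \cdots + \alpha_n \vec{0} = \vec{0} .
\end{aligned}
\]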

So if you have found enough solutions, you have them all. The question is, when did
we find enough of them?
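We say the vectors $\vec{x}_1, \vec{x}_2, \ldots, \vec{x}_n$ are linearly independent if the only solution to
\[
\alpha_1 \vec{x}_1 + \alpha_2 \vec{x}_2 + \cdots + \alpha_n \vec{x}_n = \vec{0}
\]
is $\alpha_1 = \alpha_2 = \cdots = \alpha_n = 0$. Otherwise, we say the vectors are linearly dependent.
For example, the vectors $\begin{bmatrix} 1 \\ 2 \end{bmatrix}$ and $\begin{bmatrix} 0 \\ 1 \end{bmatrix}$ are linearly independent. Let’s try:
\[
\alpha_1 \begin{bmatrix} 1 \\ 2 \end{bmatrix}
+ \alpha_2 \begin{bmatrix} 0 \\ 1 \end{bmatrix}
= \begin{bmatrix} \alpha_1 \\ 2 \alpha_1 + \alpha_2 \end{bmatrix}
= \vec{0}
= \begin{bmatrix} 0 \\ 0 \end{bmatrix} .
\]
So $\alpha_1 = 0$, and then it is clear that $\alpha_2 = 0$ as well. In other words, the two vectors are
linearly independent.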
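If a set of vectors is linearly dependent, that is, the equation above has a solution where some
of the $\alpha_j$ are nonzero, then we can solve for one vector in terms of the others. Suppose
$\alpha_1 \neq 0$. Since $\alpha_1 \vec{x}_1 + \alpha_2 \vec{x}_2 + \cdots + \alpha_n \vec{x}_n = \vec{0}$,
then
\[
\vec{x}_1 = \frac{-\alpha_2}{\alpha_1} \vec{x}_2 + \frac{-\alpha_3}{\alpha_1} \vec{x}_3 + \cdots + \frac{-\alpha_n}{\alpha_1} \vec{x}_n .
\]
For example,
\[
2 \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}
- 4 \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}
+ 2 \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix}
= \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} ,
\]
and so
\[
\begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}
= 2 \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}
- \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix} .
\]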
You may have noticed that solving for those 𝛼 𝑗 s is just solving linear equations, and so
you may not be surprised that to check if a set of vectors is linearly independent we use
row reduction.
Given a set of vectors, we may not be interested in just finding if they are linearly
independent or not, we may be interested in finding a linearly independent subset. Or
perhaps we may want to find some other vectors that give the same linear combinations
and are linearly independent. The way to figure this out is to form a matrix out of our
vectors. If we have row vectors we consider them as rows of a matrix. If we have column
vectors we consider them columns of a matrix. The set of all linear combinations of a set of
vectors is called their span.
\[
\operatorname{span} \bigl\{ \vec{x}_1, \vec{x}_2, \ldots, \vec{x}_n \bigr\}
= \bigl\{ \text{Set of all linear combinations of } \vec{x}_1, \vec{x}_2, \ldots, \vec{x}_n \bigr\} .
\]

Given a matrix $A$, the maximal number of linearly independent rows is called the rank
of $A$, and we write “$\operatorname{rank} A$” for the rank. For example,
\[
\operatorname{rank}
\begin{bmatrix} 1 & 1 & 1 \\ 2 & 2 & 2 \\ -1 & -1 & -1 \end{bmatrix}
= 1 .
\]
The second and third row are multiples of the first one. We cannot choose more than one
row and still have a linearly independent set. But what is
\[
\operatorname{rank}
\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}
= \; ?
\]
That seems to be a tougher question to answer. The first two rows are linearly independent
(neither is a multiple of the other), so the rank is at least two. If we set up the
equations for $\alpha_1$, $\alpha_2$, and $\alpha_3$, we would find a system with infinitely many solutions.
One solution is
\[
\begin{bmatrix} 1 & 2 & 3 \end{bmatrix}
- 2 \begin{bmatrix} 4 & 5 & 6 \end{bmatrix}
+ \begin{bmatrix} 7 & 8 & 9 \end{bmatrix}
= \begin{bmatrix} 0 & 0 & 0 \end{bmatrix} .
\]
So the set of all three rows is linearly dependent, and the rank cannot be 3. Therefore the rank
is 2.
But how can we do this in a more systematic way? We find the row echelon form!
\[
\text{Row echelon form of} \quad
\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}
\quad \text{is} \quad
\begin{bmatrix} 1 & 2 & 3 \\ 0 & 1 & 2 \\ 0 & 0 & 0 \end{bmatrix} .
\]
The elementary row operations do not change the set of linear combinations of the rows
(that was one of the main reasons for defining them as they were). In other words, the
span of the rows of $A$ is the same as the span of the rows of the row echelon form of
$A$. In particular, the number of linearly independent rows is the same. And in the row
echelon form, all nonzero rows are linearly independent. This is not hard to see. Consider
the two nonzero rows in the example above. Suppose we tried to solve for the $\alpha_1$ and $\alpha_2$ in
\[
\alpha_1 \begin{bmatrix} 1 & 2 & 3 \end{bmatrix}
+ \alpha_2 \begin{bmatrix} 0 & 1 & 2 \end{bmatrix}
= \begin{bmatrix} 0 & 0 & 0 \end{bmatrix} .
\]
Since the first column of the row echelon matrix has zeros except in the first row, we get
that $\alpha_1 = 0$. For the same reason, $\alpha_2$ is zero. We only have two nonzero rows, and they are
linearly independent, so the rank of the matrix is 2.
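The systematic recipe is short enough to sketch in plain Python. The function name `rank` below is ours; the sketch performs the forward elimination with exact fractions and counts the pivots, which is the same as counting the nonzero rows of a row echelon form. It does not bother to scale the pivots to 1, as that does not affect the count.

```python
from fractions import Fraction

def rank(M):
    """Number of pivots found by forward elimination on M."""
    A = [[Fraction(v) for v in row] for row in M]
    rows, cols = len(A), len(A[0])
    r = 0  # number of pivots found so far
    for c in range(cols):
        if r == rows:
            break
        p = next((i for i in range(r, rows) if A[i][c] != 0), None)
        if p is None:
            continue
        A[r], A[p] = A[p], A[r]
        for i in range(r + 1, rows):
            f = A[i][c] / A[r][c]
            A[i] = [a - f * b for a, b in zip(A[i], A[r])]
        r += 1
    return r

print(rank([[1, 1, 1], [2, 2, 2], [-1, -1, -1]]))  # 1
print(rank([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))     # 2
```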

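The span of the rows is called the row space. The row space of $A$ and the row echelon
form of $A$ are the same. In the example,
\[
\begin{aligned}
\text{row space of }
\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}
&= \operatorname{span} \left\{
\begin{bmatrix} 1 & 2 & 3 \end{bmatrix} ,
\begin{bmatrix} 4 & 5 & 6 \end{bmatrix} ,
\begin{bmatrix} 7 & 8 & 9 \end{bmatrix}
\right\} \\
&= \operatorname{span} \left\{
\begin{bmatrix} 1 & 2 & 3 \end{bmatrix} ,
\begin{bmatrix} 0 & 1 & 2 \end{bmatrix}
\right\} .
\end{aligned}
\]
Similarly to row space, the span of columns is called the column space.
\[
\text{column space of }
\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}
= \operatorname{span} \left\{
\begin{bmatrix} 1 \\ 4 \\ 7 \end{bmatrix} ,
\begin{bmatrix} 2 \\ 5 \\ 8 \end{bmatrix} ,
\begin{bmatrix} 3 \\ 6 \\ 9 \end{bmatrix}
\right\} .
\]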
So it may also be good to find the number of linearly independent columns of $A$. One
way to do that is to find the number of linearly independent rows of $A^T$. It is a tremendously
useful fact that the number of linearly independent columns is always the same as the
number of linearly independent rows:

Theorem A.3.1. $\operatorname{rank} A = \operatorname{rank} A^T$

In particular, to find a set of linearly independent columns we need to look at where
the pivots were. If you recall above, when solving $A \vec{x} = \vec{0}$ the key was finding the pivots;
any non-pivot columns corresponded to free variables. That means we can solve for the
non-pivot columns in terms of the pivot columns. Let’s see an example. First we reduce
some random matrix:
\[
\begin{bmatrix} 1 & 2 & 3 & 4 \\ 2 & 4 & 5 & 6 \\ 3 & 6 & 7 & 8 \end{bmatrix} .
\]
We find a pivot and reduce the rows below:
\[
\begin{bmatrix} \boxed{1} & 2 & 3 & 4 \\ 2 & 4 & 5 & 6 \\ 3 & 6 & 7 & 8 \end{bmatrix}
\to
\begin{bmatrix} \boxed{1} & 2 & 3 & 4 \\ 0 & 0 & -1 & -2 \\ 3 & 6 & 7 & 8 \end{bmatrix}
\to
\begin{bmatrix} \boxed{1} & 2 & 3 & 4 \\ 0 & 0 & -1 & -2 \\ 0 & 0 & -2 & -4 \end{bmatrix} .
\]
We find the next pivot, make it one, and rinse and repeat:
\[
\begin{bmatrix} \boxed{1} & 2 & 3 & 4 \\ 0 & 0 & \boxed{-1} & -2 \\ 0 & 0 & -2 & -4 \end{bmatrix}
\to
\begin{bmatrix} \boxed{1} & 2 & 3 & 4 \\ 0 & 0 & \boxed{1} & 2 \\ 0 & 0 & -2 & -4 \end{bmatrix}
\to
\begin{bmatrix} \boxed{1} & 2 & 3 & 4 \\ 0 & 0 & \boxed{1} & 2 \\ 0 & 0 & 0 & 0 \end{bmatrix} .
\]
The final matrix is the row echelon form of the matrix. Consider the pivots that we marked.
The pivot columns are the first and the third column. All other columns correspond to free
variables when solving $A \vec{x} = \vec{0}$, so all other columns can be solved in terms of the first and
the third column. In other words
\[
\text{column space of }
\begin{bmatrix} 1 & 2 & 3 & 4 \\ 2 & 4 & 5 & 6 \\ 3 & 6 & 7 & 8 \end{bmatrix}
= \operatorname{span} \left\{
\begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} ,
\begin{bmatrix} 2 \\ 4 \\ 6 \end{bmatrix} ,
\begin{bmatrix} 3 \\ 5 \\ 7 \end{bmatrix} ,
\begin{bmatrix} 4 \\ 6 \\ 8 \end{bmatrix}
\right\}
= \operatorname{span} \left\{
\begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} ,
\begin{bmatrix} 3 \\ 5 \\ 7 \end{bmatrix}
\right\} .
\]
We could perhaps use another pair of columns to get the same span, but the first and the
third are guaranteed to work because they are pivot columns.
The discussion above could be expanded into a proof of the theorem if we wanted. As
each nonzero row in the row echelon form contains a pivot, then the rank is the number of
pivots, which is the same as the maximal number of linearly independent columns.
The idea also works in reverse. Suppose we have a bunch of column vectors and we just
need to find a linearly independent set. For example, suppose we started with the vectors
\[
\vec{v}_1 = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} , \quad
\vec{v}_2 = \begin{bmatrix} 2 \\ 4 \\ 6 \end{bmatrix} , \quad
\vec{v}_3 = \begin{bmatrix} 3 \\ 5 \\ 7 \end{bmatrix} , \quad
\vec{v}_4 = \begin{bmatrix} 4 \\ 6 \\ 8 \end{bmatrix} .
\]
These vectors are not linearly independent as we saw above. In particular, the span of $\vec{v}_1$
and $\vec{v}_3$ is the same as the span of all four of the vectors. So $\vec{v}_2$ and $\vec{v}_4$ can both be written as
linear combinations of $\vec{v}_1$ and $\vec{v}_3$. A common thing that comes up in practice is that one
gets a set of vectors whose span is the set of solutions of some problem. But perhaps we
get way too many vectors, and we want to simplify. For example above, all vectors in the span
of $\vec{v}_1, \vec{v}_2, \vec{v}_3, \vec{v}_4$ can be written $\alpha_1 \vec{v}_1 + \alpha_2 \vec{v}_2 + \alpha_3 \vec{v}_3 + \alpha_4 \vec{v}_4$ for some numbers $\alpha_1, \alpha_2, \alpha_3, \alpha_4$.
But it is also true that every such vector can be written as $a \vec{v}_1 + b \vec{v}_3$ for two numbers $a$ and
$b$. And one has to admit, that looks much simpler. Moreover, these numbers $a$ and $b$ are
unique. More on that in the next section.
To find this linearly independent set we simply take our vectors and form the matrix
$[ \vec{v}_1 \; \vec{v}_2 \; \vec{v}_3 \; \vec{v}_4 ]$, that is, the matrix
\[
\begin{bmatrix} 1 & 2 & 3 & 4 \\ 2 & 4 & 5 & 6 \\ 3 & 6 & 7 & 8 \end{bmatrix} .
\]
We crank up the row-reduction machine, feed this matrix into it, find the pivot columns,
and pick those. In this case, $\vec{v}_1$ and $\vec{v}_3$.
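The “row-reduction machine” could be sketched as follows in plain Python. The function name `pivot_columns` is hypothetical, the vectors are placed as columns of a matrix, and columns are indexed from 0, so the first and third vectors appear as indices 0 and 2.

```python
from fractions import Fraction

def pivot_columns(M):
    """Indices (from 0) of the pivot columns found by forward elimination."""
    A = [[Fraction(v) for v in row] for row in M]
    rows, cols = len(A), len(A[0])
    pivots = []
    r = 0
    for c in range(cols):
        if r == rows:
            break
        p = next((i for i in range(r, rows) if A[i][c] != 0), None)
        if p is None:
            continue  # a non-pivot column (a free variable)
        A[r], A[p] = A[p], A[r]
        for i in range(r + 1, rows):
            f = A[i][c] / A[r][c]
            A[i] = [a - f * b for a, b in zip(A[i], A[r])]
        pivots.append(c)
        r += 1
    return pivots

vectors = [[1, 2, 3], [2, 4, 6], [3, 5, 7], [4, 6, 8]]
M = [list(row) for row in zip(*vectors)]  # the vectors become columns
cols = pivot_columns(M)
print(cols)                        # [0, 2]
print([vectors[c] for c in cols])  # [[1, 2, 3], [3, 5, 7]]
```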

A.3.5 Computing the inverse


If the matrix $A$ is square and there exists a unique solution $\vec{x}$ to $A \vec{x} = \vec{b}$ for any $\vec{b}$ (there are
no free variables), then $A$ is invertible. This is equivalent to the $n \times n$ matrix $A$ being of
rank $n$.
In particular, if $A \vec{x} = \vec{b}$ then $\vec{x} = A^{-1} \vec{b}$. Now we just need to compute what $A^{-1}$ is. We
can surely do elimination every time we want to find $A^{-1} \vec{b}$, but that would be ridiculous.
The mapping $A^{-1}$ is linear and hence given by a matrix, and we have seen that to figure out
the matrix we just need to find where $A^{-1}$ takes the standard basis vectors $\vec{e}_1, \vec{e}_2, \ldots, \vec{e}_n$.
That is, to find the first column of $A^{-1}$, we solve $A \vec{x} = \vec{e}_1$, because then $A^{-1} \vec{e}_1 = \vec{x}$. To
find the second column of $A^{-1}$, we solve $A \vec{x} = \vec{e}_2$. And so on. It is really just $n$ eliminations
that we need to do. But it gets even easier. If you think about it, the elimination is the same
for everything on the left side of the augmented matrix. Doing $n$ eliminations separately
we would redo most of the computations. Best is to do all at once.
Therefore, to find the inverse of 𝐴, we write an 𝑛 × 2𝑛 augmented matrix [ 𝐴 | 𝐼 ], where
𝐼 is the identity matrix, whose columns are precisely the standard basis vectors. We then
perform row reduction until we arrive at the reduced row echelon form. If 𝐴 is invertible,
then pivots can be found in every column of 𝐴, and so the reduced row echelon form of
[ 𝐴 | 𝐼 ] looks like [ 𝐼 | 𝐴−1 ]. We then just read off the inverse 𝐴−1 . If you do not find a pivot
in every one of the first 𝑛 columns of the augmented matrix, then 𝐴 is not invertible.
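This is best seen by example. Suppose we wish to invert the matrix
\[
\begin{bmatrix} 1 & 2 & 3 \\ 2 & 0 & 1 \\ 3 & 1 & 0 \end{bmatrix} .
\]
We write the augmented matrix and we start reducing:
\[
\begin{aligned}
& \left[ \begin{array}{ccc|ccc}
1 & 2 & 3 & 1 & 0 & 0 \\
2 & 0 & 1 & 0 & 1 & 0 \\
3 & 1 & 0 & 0 & 0 & 1
\end{array} \right]
\to
\left[ \begin{array}{ccc|ccc}
1 & 2 & 3 & 1 & 0 & 0 \\
0 & -4 & -5 & -2 & 1 & 0 \\
0 & -5 & -9 & -3 & 0 & 1
\end{array} \right]
\to \\
\to {} &
\left[ \begin{array}{ccc|ccc}
1 & 2 & 3 & 1 & 0 & 0 \\
0 & 1 & 5/4 & 1/2 & -1/4 & 0 \\
0 & -5 & -9 & -3 & 0 & 1
\end{array} \right]
\to
\left[ \begin{array}{ccc|ccc}
1 & 2 & 3 & 1 & 0 & 0 \\
0 & 1 & 5/4 & 1/2 & -1/4 & 0 \\
0 & 0 & -11/4 & -1/2 & -5/4 & 1
\end{array} \right]
\to \\
\to {} &
\left[ \begin{array}{ccc|ccc}
1 & 2 & 3 & 1 & 0 & 0 \\
0 & 1 & 5/4 & 1/2 & -1/4 & 0 \\
0 & 0 & 1 & 2/11 & 5/11 & -4/11
\end{array} \right]
\to
\left[ \begin{array}{ccc|ccc}
1 & 2 & 0 & 5/11 & -15/11 & 12/11 \\
0 & 1 & 0 & 3/11 & -9/11 & 5/11 \\
0 & 0 & 1 & 2/11 & 5/11 & -4/11
\end{array} \right]
\to \\
\to {} &
\left[ \begin{array}{ccc|ccc}
1 & 0 & 0 & -1/11 & 3/11 & 2/11 \\
0 & 1 & 0 & 3/11 & -9/11 & 5/11 \\
0 & 0 & 1 & 2/11 & 5/11 & -4/11
\end{array} \right] .
\end{aligned}
\]
So
\[
\begin{bmatrix} 1 & 2 & 3 \\ 2 & 0 & 1 \\ 3 & 1 & 0 \end{bmatrix}^{-1}
=
\begin{bmatrix}
-1/11 & 3/11 & 2/11 \\
3/11 & -9/11 & 5/11 \\
2/11 & 5/11 & -4/11
\end{bmatrix} .
\]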
Not too terrible, no? Perhaps harder than inverting a 2 × 2 matrix for which we had a
simple formula, but not too bad. Really in practice this is done efficiently by a computer.
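To give an idea of how a computer might go about it, here is a minimal sketch in plain Python that row reduces $[ A \mid I ]$ with exact fractions. The function name `inverse` is ours, and this is only an illustration of the procedure described above; numerical libraries use more refined variants of elimination.

```python
from fractions import Fraction

def inverse(M):
    """Invert a square matrix by row reducing the augmented matrix [M | I]."""
    n = len(M)
    A = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(M)]
    for c in range(n):
        # Look for a pivot in column c, at or below row c.
        p = next((i for i in range(c, n) if A[i][c] != 0), None)
        if p is None:
            raise ValueError("no pivot in every column: matrix is not invertible")
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for i in range(n):                 # clear the rest of the column
            if i != c:
                f = A[i][c]
                A[i] = [a - f * b for a, b in zip(A[i], A[c])]
    return [row[n:] for row in A]          # right half is the inverse

inv = inverse([[1, 2, 3], [2, 0, 1], [3, 1, 0]])
for row in inv:
    print([str(v) for v in row])
# ['-1/11', '3/11', '2/11']
# ['3/11', '-9/11', '5/11']
# ['2/11', '5/11', '-4/11']
```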

A.3.6 Exercises
Exercise A.3.1: Compute the reduced row echelon form for the following matrices:
       
1 3 1 3 3 3 6 6 6 7 7
a) b) c) d)
0 1 1 6 −3 −2 −3 1 1 0 1
9 3 0 2  2 1 3 −3 6 6 5 0 2 0 −1
       
e) 8 6 3 6 f)  6 0 0 −1 g) 0 −2 2 h) 6 6 −3 3 
7 9 7 9 −2 4 4 3  6 5 6 6 2 −3 5 
       
Exercise A.3.2: Compute the inverse of the given matrices
1 0 0 1 1 1 1 2 3
     
a) 0 0 1 b) 0 2 1 c) 2 0 1
0 1 0 0 0 1 0 2 1
     
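Exercise A.3.3: Solve (find all solutions), or show no solution exists
a) $\begin{aligned} 4x_1 + 3x_2 &= -2 \\ -x_1 + x_2 &= 4 \end{aligned}$
b) $\begin{aligned} x_1 + 5x_2 + 3x_3 &= 7 \\ 8x_1 + 7x_2 + 8x_3 &= 8 \\ 4x_1 + 8x_2 + 6x_3 &= 4 \end{aligned}$
c) $\begin{aligned} 4x_1 + 8x_2 + 2x_3 &= 3 \\ -x_1 - 2x_2 + 3x_3 &= 1 \\ 4x_1 + 8x_2 &= 2 \end{aligned}$
d) $\begin{aligned} x + 2y + 3z &= 4 \\ 2x - y + 3z &= 1 \\ 3x + y + 6z &= 6 \end{aligned}$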

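Exercise A.3.4: By computing the inverse, solve the following systems for 𝑥®.
a) $\begin{bmatrix} 4 & 1 \\ -1 & 3 \end{bmatrix} \vec{x} = \begin{bmatrix} 13 \\ 26 \end{bmatrix}$
b) $\begin{bmatrix} 3 & 3 \\ 3 & 4 \end{bmatrix} \vec{x} = \begin{bmatrix} 2 \\ -1 \end{bmatrix}$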
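Exercise A.3.5: Compute the rank of the given matrices
a) $\begin{bmatrix} 6 & 3 & 5 \\ 1 & 4 & 1 \\ 7 & 7 & 6 \end{bmatrix}$
b) $\begin{bmatrix} 5 & -2 & -1 \\ 3 & 0 & 6 \\ 2 & 4 & 5 \end{bmatrix}$
c) $\begin{bmatrix} 1 & 2 & 3 \\ -1 & -2 & -3 \\ 2 & 4 & 6 \end{bmatrix}$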
     
Exercise A.3.6: For the matrices in Exercise A.3.5, find a linearly independent set of row vectors
that span the row space (they do not need to be rows of the matrix).

Exercise A.3.7: For the matrices in Exercise A.3.5, find a linearly independent set of columns that
span the column space. That is, find the pivot columns of the matrices.

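Exercise A.3.8: Find a linearly independent subset of the following vectors that has the same span.
\[
\begin{bmatrix} -1 \\ 1 \\ 2 \end{bmatrix} , \quad
\begin{bmatrix} 2 \\ -2 \\ -4 \end{bmatrix} , \quad
\begin{bmatrix} -2 \\ 4 \\ 1 \end{bmatrix} , \quad
\begin{bmatrix} -1 \\ 3 \\ -2 \end{bmatrix}
\]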
       

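Exercise A.3.101: Compute the reduced row echelon form for the following matrices:
a) $\begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}$
b) $\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$
c) $\begin{bmatrix} 1 & 1 \\ -2 & -2 \end{bmatrix}$
d) $\begin{bmatrix} 1 & -3 & 1 \\ 4 & 6 & -2 \\ -2 & 6 & -2 \end{bmatrix}$
e) $\begin{bmatrix} 2 & 2 & 5 & 2 \\ 1 & -2 & 4 & -1 \\ 0 & 3 & 1 & -2 \end{bmatrix}$
f) $\begin{bmatrix} -2 & 6 & 4 & 3 \\ 6 & 0 & -3 & 0 \\ 4 & 2 & -1 & 1 \end{bmatrix}$
g) $\begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$
h) $\begin{bmatrix} 1 & 2 & 3 & 3 \\ 1 & 2 & 3 & 5 \end{bmatrix}$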
   
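Exercise A.3.102: Compute the inverse of the given matrices
a) $\begin{bmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}$
b) $\begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 0 \\ 1 & 0 & 0 \end{bmatrix}$
c) $\begin{bmatrix} 2 & 4 & 0 \\ 2 & 2 & 3 \\ 2 & 4 & 1 \end{bmatrix}$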
     
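Exercise A.3.103: Solve (find all solutions), or show no solution exists
a) $\begin{aligned} 4x_1 + 3x_2 &= -1 \\ 5x_1 + 6x_2 &= 4 \end{aligned}$
b) $\begin{aligned} 5x + 6y + 5z &= 7 \\ 6x + 8y + 6z &= -1 \\ 5x + 2y + 5z &= 2 \end{aligned}$
c) $\begin{aligned} a + b + c &= -1 \\ a + 5b + 6c &= -1 \\ -2a + 5b + 6c &= 8 \end{aligned}$
d) $\begin{aligned} -2x_1 + 2x_2 + 8x_3 &= 6 \\ x_2 + x_3 &= 2 \\ x_1 + 4x_2 + x_3 &= 7 \end{aligned}$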

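Exercise A.3.104: By computing the inverse, solve the following systems for 𝑥®.
a) $\begin{bmatrix} -1 & 1 \\ 3 & 3 \end{bmatrix} \vec{x} = \begin{bmatrix} 4 \\ 6 \end{bmatrix}$
b) $\begin{bmatrix} 2 & 7 \\ 1 & 6 \end{bmatrix} \vec{x} = \begin{bmatrix} 1 \\ 3 \end{bmatrix}$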
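Exercise A.3.105: Compute the rank of the given matrices
a) $\begin{bmatrix} 7 & -1 & 6 \\ 7 & 7 & 7 \\ 7 & 6 & 2 \end{bmatrix}$
b) $\begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 2 & 2 & 2 \end{bmatrix}$
c) $\begin{bmatrix} 0 & 3 & -1 \\ 6 & 3 & 1 \\ 4 & 7 & -1 \end{bmatrix}$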
     
Exercise A.3.106: For the matrices in Exercise A.3.105, find a linearly independent set of row
vectors that span the row space (they do not need to be rows of the matrix).

Exercise A.3.107: For the matrices in Exercise A.3.105, find a linearly independent set of columns
that span the column space. That is, find the pivot columns of the matrices.

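Exercise A.3.108: Find a linearly independent subset of the following vectors that has the same
span.
\[
\begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} , \quad
\begin{bmatrix} 3 \\ 1 \\ -5 \end{bmatrix} , \quad
\begin{bmatrix} 0 \\ 3 \\ -1 \end{bmatrix} , \quad
\begin{bmatrix} -3 \\ 2 \\ 4 \end{bmatrix}
\]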
       

A.4 Subspaces, dimension, and the kernel


Note: 1 lecture

A.4.1 Subspaces, basis, and dimension


We often find ourselves looking at the set of solutions of a linear equation 𝐿 𝑥® = 0® for some
matrix 𝐿, that is, we are interested in the kernel of 𝐿. The set of all such solutions has a nice
structure: It looks and acts a lot like some euclidean space ℝ 𝑘 .
We say that a set 𝑆 of vectors in ℝ𝑛 is a subspace if whenever 𝑥® and 𝑦® are members of 𝑆
and 𝛼 is a scalar, then
𝑥® + 𝑦®, and 𝛼 𝑥®
are also members of 𝑆. That is, we can add and multiply by scalars and we still land in 𝑆.
So every linear combination of vectors of 𝑆 is still in 𝑆. That is really what a subspace is.
It is a subset where we can take linear combinations and still end up being in the subset.
Consequently the span of a number of vectors is automatically a subspace.
Example A.4.1:

(i) If we let 𝑆 = ℝ𝑛 , then this 𝑆 is a subspace of ℝ𝑛 . Adding any two vectors in ℝ𝑛 gets a
vector in ℝ𝑛 , and so does multiplying by scalars.

(ii) The set 𝑆 ′ = {0®}, that is, the set of the zero vector by itself, is also a subspace of ℝ𝑛 .
There is only one vector in this subspace, so we only need to verify the definition for
that one vector, and everything checks out: 0® + 0® = 0® and 𝛼0® = 0®.

(iii) The set 𝑆 ′′ of all the vectors of the form (𝑎, 𝑎) for any real number 𝑎, such as (1, 1), (3, 3),
or (−0.5, −0.5) is a subspace of ℝ2 . Adding two such vectors, say (1, 1) + (3, 3) = (4, 4)
again gets a vector of the same form, and so does multiplying by a scalar, say
8(1, 1) = (8, 8).

If 𝑆 is a subspace and we can find 𝑘 linearly independent vectors in 𝑆

𝑣®1 , 𝑣®2 , . . . , 𝑣® 𝑘 ,

such that every other vector in 𝑆 is a linear combination of 𝑣®1 , 𝑣®2 , . . . , 𝑣® 𝑘 , then the set
{𝑣®1 , 𝑣®2 , . . . , 𝑣® 𝑘 } is called a basis of 𝑆. In other words, 𝑆 is the span of {𝑣®1 , 𝑣®2 , . . . , 𝑣® 𝑘 } and
there is no smaller subset of these vectors that spans 𝑆. We say that 𝑆 has dimension 𝑘,
and we write
dim 𝑆 = 𝑘.
We have the following theorem.

Theorem A.4.1. If 𝑆 ⊂ ℝ𝑛 is a subspace and 𝑆 is not the trivial subspace {0®}, then there exists a
unique positive integer 𝑘 (the dimension) and a (not unique) basis {𝑣®1 , 𝑣®2 , . . . , 𝑣® 𝑘 }, such that every
𝑤® in 𝑆 can be uniquely represented by
\[
\vec{w} = \alpha_1 \vec{v}_1 + \alpha_2 \vec{v}_2 + \cdots + \alpha_k \vec{v}_k ,
\]
for some scalars 𝛼1 , 𝛼2 , . . . , 𝛼 𝑘 . By “uniquely represented” we mean that these scalars 𝛼1 , 𝛼2 , . . . ,
𝛼 𝑘 are unique.
Just as a vector in ℝ 𝑘 is represented by a 𝑘-tuple of numbers, so is a vector in a
𝑘-dimensional subspace of ℝ𝑛 represented by a unique 𝑘-tuple of numbers, at least once
we have fixed a basis. A different basis would give a different 𝑘-tuple of numbers for the
same vector.
We should reiterate that while 𝑘 is unique (a subspace cannot have two different
dimensions—every basis has the same number of vectors), the set of basis vectors is not at
all unique. There are lots of different bases for any given subspace. Finding just the right
basis for a subspace is a large part of what one does in linear algebra. In fact, that is what
we spend a lot of time on in linear differential equations, although at first glance it may not
seem like that is what we are doing.
Example A.4.2:
(i) The standard basis
𝑒®1 , 𝑒®2 , . . . , 𝑒®𝑛 ,
is a basis of ℝ𝑛 , (hence the name). So as expected

dim ℝ𝑛 = 𝑛.
 
(ii) Both sets {(1, 0), (0, 1)} and {(1, 1), (1, −1)} are bases of ℝ2 . A vector (3, 7) can be
written as (3, 7) = 3(1, 0) + 7(0, 1) (and in no other way with the first basis), and it can
be written as (3, 7) = 5(1, 1) − 2(1, −1) (and in no other way with the second basis).

(iii) The subspace {0®} is of dimension 0. A basis is simply an empty set of vectors.

(iv) The subspace 𝑆 ′′ from Example A.4.1, that is, the set of vectors (𝑎, 𝑎), is of dimension 1.
One possible basis is simply {(1, 1)}, the single vector (1, 1): Every vector in 𝑆 ′′ can be
represented by 𝑎(1, 1) = (𝑎, 𝑎). Similarly another possible basis would be {(−1, −1)}.
Then the vector (𝑎, 𝑎) would be represented as (−𝑎)(−1, −1).

(v) The set {(1, 1), (−1, −1)} does span 𝑆 ′′, but it is not a basis as it is not a linearly
independent set. Consequently, a vector such as (2, 2) can be written in multiple ways
with this set of vectors: (2, 2) = 2(1, 1) + 0(−1, −1), or (2, 2) = 5(1, 1) + 3(−1, −1), etc.
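To see the uniqueness of the representation in a small case, here is a Python sketch that finds the coefficients of a vector of ℝ2 in a given basis. The helper name `coordinates_2d` is hypothetical, and we assume integer entries. It solves the 2 × 2 system with the standard closed-form formula and reports `None` when the two vectors are linearly dependent and so are not a basis.

```python
from fractions import Fraction

def coordinates_2d(v1, v2, w):
    """Return (a1, a2) with w = a1*v1 + a2*v2, or None if v1, v2 are not a basis of R^2."""
    # Determinant of the matrix whose columns are v1 and v2.
    det = v1[0] * v2[1] - v2[0] * v1[1]
    if det == 0:
        return None  # v1 and v2 are linearly dependent
    a1 = Fraction(w[0] * v2[1] - v2[0] * w[1], det)
    a2 = Fraction(v1[0] * w[1] - w[0] * v1[1], det)
    return a1, a2

print(coordinates_2d((1, 0), (0, 1), (3, 7)))     # coefficients 3 and 7
print(coordinates_2d((1, 1), (1, -1), (3, 7)))    # coefficients 5 and -2
print(coordinates_2d((1, 1), (-1, -1), (2, 2)))   # None: not a basis
```

The last call corresponds to item (v): the two vectors span 𝑆 ′′ but are not linearly independent, so no unique pair of coefficients exists.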
Row and column spaces of a matrix are also examples of subspaces, as they are given
as the span of vectors. We can use what we know about rank, row spaces, and column
spaces from the previous section to find a basis.

Example A.4.3: In the last section, we considered the matrix
\[
A = \begin{bmatrix} 1 & 2 & 3 & 4 \\ 2 & 4 & 5 & 6 \\ 3 & 6 & 7 & 8 \end{bmatrix} .
\]
 
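Using row reduction to find the pivot columns, we found
\[
\text{column space of } A
= \text{column space of } \begin{bmatrix} 1 & 2 & 3 & 4 \\ 2 & 4 & 5 & 6 \\ 3 & 6 & 7 & 8 \end{bmatrix}
= \operatorname{span} \left\{ \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} , \begin{bmatrix} 3 \\ 5 \\ 7 \end{bmatrix} \right\} .
\]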
 
   
What we did was we found a basis of the column space. This basis has two elements, and
so the column space of 𝐴 is two-dimensional. Notice that the rank of 𝐴 is two.
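We would have followed the same procedure if we wanted to find a basis of the subspace
𝑋 spanned by
\[
\begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} , \quad
\begin{bmatrix} 2 \\ 4 \\ 6 \end{bmatrix} , \quad
\begin{bmatrix} 3 \\ 5 \\ 7 \end{bmatrix} , \quad
\begin{bmatrix} 4 \\ 6 \\ 8 \end{bmatrix} .
\]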
       
We would have simply formed the matrix 𝐴 with these vectors as columns and repeated
the computation above. The subspace 𝑋 is then the column space of 𝐴.
Example A.4.4: Consider the matrix
\[
L = \begin{bmatrix} 1 & 2 & 0 & 0 & 3 \\ 0 & 0 & 1 & 0 & 4 \\ 0 & 0 & 0 & 1 & 5 \end{bmatrix} .
\]
 
Conveniently, the matrix is in reduced row echelon form. The matrix is of rank 3. The
column space is the span of the pivot columns. It is the 3-dimensional space
\[
\text{column space of } L
= \operatorname{span} \left\{ \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} , \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} , \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \right\}
= \mathbb{R}^3 .
\]

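The row space is the 3-dimensional space
\[
\text{row space of } L
= \operatorname{span} \left\{
\begin{bmatrix} 1 & 2 & 0 & 0 & 3 \end{bmatrix} ,
\begin{bmatrix} 0 & 0 & 1 & 0 & 4 \end{bmatrix} ,
\begin{bmatrix} 0 & 0 & 0 & 1 & 5 \end{bmatrix}
\right\} .
\]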


     

As these vectors have 5 components, we think of the row space of 𝐿 as a subspace of ℝ5 .


The way the dimensions worked out in the examples is not an accident. Since the
number of vectors that we needed to take was always the same as the number of pivots,
and the number of pivots is the rank, we get the following result.

Theorem A.4.2 (Rank). The dimension of the column space and the dimension of the row space of
a matrix 𝐴 are both equal to the rank of 𝐴.
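
As a quick check of the theorem, consider again the matrix 𝐴 from Example A.4.3. The row reduction below is our own computation, written out as a sketch:
\[
\begin{bmatrix} 1 & 2 & 3 & 4 \\ 2 & 4 & 5 & 6 \\ 3 & 6 & 7 & 8 \end{bmatrix}
\to
\begin{bmatrix} 1 & 2 & 3 & 4 \\ 0 & 0 & -1 & -2 \\ 0 & 0 & -2 & -4 \end{bmatrix}
\to
\begin{bmatrix} 1 & 2 & 0 & -2 \\ 0 & 0 & 1 & 2 \\ 0 & 0 & 0 & 0 \end{bmatrix} .
\]
There are two pivots, in the 1st and 3rd columns, so the rank is 2. The pivot columns (1, 2, 3) and (3, 5, 7) of 𝐴 form a basis of the column space, and the nonzero rows (1, 2, 0, −2) and (0, 0, 1, 2) of the reduced row echelon form are a basis of the row space. Both spaces have dimension 2, as the theorem says.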

A.4.2 Kernel
The set of solutions of a linear equation 𝐿 𝑥® = 0®, the kernel of 𝐿, is a subspace: If 𝑥® and 𝑦®
are solutions, then
\[
L(\vec{x} + \vec{y}) = L\vec{x} + L\vec{y} = \vec{0} + \vec{0} = \vec{0},
\qquad \text{and} \qquad
L(\alpha \vec{x}) = \alpha L \vec{x} = \alpha \vec{0} = \vec{0}.
\]

So 𝑥® + 𝑦® and 𝛼 𝑥® are solutions. The dimension of the kernel is called the nullity of the
matrix.
The same sort of idea governs the solutions of linear differential equations. We try to
describe the kernel of a linear differential operator, and as it is a subspace, we look for a
basis of this kernel. Much of this book is dedicated to finding such bases.
The kernel of a matrix is the same as the kernel of its reduced row echelon form. For
a matrix in reduced row echelon form, the kernel is rather easy to find. If a vector 𝑥® is
applied to a matrix 𝐿, then each entry in 𝑥® corresponds to a column of 𝐿, the column that
the entry multiplies. To find the kernel, pick a non-pivot column and make a vector that has a
−1 in the entry corresponding to this non-pivot column and zeros at all the other entries
corresponding to the other non-pivot columns. Then for each entry corresponding
to a pivot column, make it precisely the value in the corresponding row of the non-pivot
column, to make the vector be a solution to 𝐿 𝑥® = 0®. This procedure is best understood by
example.

Example A.4.5: Consider
\[
L = \begin{bmatrix} 1 & 2 & 0 & 0 & 3 \\ 0 & 0 & 1 & 0 & 4 \\ 0 & 0 & 0 & 1 & 5 \end{bmatrix} .
\]
This matrix is in reduced row echelon form, with pivots in the 1st, 3rd, and 4th columns. There are two non-pivot
columns, so the kernel has dimension 2, that is, it is the span of 2 vectors. Let us find the
first vector. We look at the first non-pivot column, the 2nd column, and we put a −1 in the
2nd entry of our vector. We put a 0 in the 5th entry as the 5th column is also a non-pivot
column:
\[
\begin{bmatrix} ? \\ -1 \\ ? \\ ? \\ 0 \end{bmatrix} .
\]
 
Let us fill the rest. When this vector hits the first row, we get a −2 and 1 times whatever the
first question mark is. So make the first question mark 2. For the second and third rows,
it is sufficient to make the question marks zero. We are really filling in the non-pivot
column into the remaining entries. Let us check:
\[
\begin{bmatrix} 1 & 2 & 0 & 0 & 3 \\ 0 & 0 & 1 & 0 & 4 \\ 0 & 0 & 0 & 1 & 5 \end{bmatrix}
\begin{bmatrix} 2 \\ -1 \\ 0 \\ 0 \\ 0 \end{bmatrix}
=
\begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} .
\]
 
Yay! How about the second vector? We start with
\[
\begin{bmatrix} ? \\ 0 \\ ? \\ ? \\ -1 \end{bmatrix} .
\]
 
We set the first question mark to 3, the second to 4, and the third to 5. Let us check:
\[
\begin{bmatrix} 1 & 2 & 0 & 0 & 3 \\ 0 & 0 & 1 & 0 & 4 \\ 0 & 0 & 0 & 1 & 5 \end{bmatrix}
\begin{bmatrix} 3 \\ 0 \\ 4 \\ 5 \\ -1 \end{bmatrix}
=
\begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} .
\]
 
There are two non-pivot columns, so we only need two vectors. We have found a basis of
the kernel. So,
\[
\text{kernel of } L
= \operatorname{span} \left\{
\begin{bmatrix} 2 \\ -1 \\ 0 \\ 0 \\ 0 \end{bmatrix} ,
\begin{bmatrix} 3 \\ 0 \\ 4 \\ 5 \\ -1 \end{bmatrix}
\right\} .
\]
 
   
What we did in finding a basis of the kernel is that we expressed all solutions of 𝐿 𝑥® = 0®
as a linear combination of some given vectors.
The procedure to find a basis of the kernel of a matrix 𝐿:

(i) Find the reduced row echelon form of 𝐿.

(ii) Write down a basis of the kernel as above, one vector for each non-pivot column.
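
The two-step procedure above can be sketched in Python as follows. The names `rref` and `kernel_basis` are our own, and we assume integer or rational entries so that `Fraction` keeps the arithmetic exact. This is an illustration of the recipe, not a numerically robust routine.

```python
from fractions import Fraction

def rref(A):
    """Return (R, pivots): the reduced row echelon form of A and its pivot column indices."""
    R = [[Fraction(x) for x in row] for row in A]
    pivots, r = [], 0
    for c in range(len(R[0])):
        if r == len(R):
            break  # every row already holds a pivot
        # Look for a nonzero entry in column c at or below row r.
        p = next((i for i in range(r, len(R)) if R[i][c] != 0), None)
        if p is None:
            continue  # no pivot in this column
        R[r], R[p] = R[p], R[r]
        lead = R[r][c]
        R[r] = [x / lead for x in R[r]]
        for i in range(len(R)):
            if i != r and R[i][c] != 0:
                f = R[i][c]
                R[i] = [a - f * b for a, b in zip(R[i], R[r])]
        pivots.append(c)
        r += 1
    return R, pivots

def kernel_basis(A):
    """One kernel vector per non-pivot column, built as described in the text."""
    R, pivots = rref(A)
    n = len(R[0])
    basis = []
    for free in (c for c in range(n) if c not in pivots):
        v = [Fraction(0)] * n
        v[free] = Fraction(-1)         # -1 in the chosen non-pivot column
        for row, pc in enumerate(pivots):
            v[pc] = R[row][free]       # copy the non-pivot column into the pivot entries
        basis.append(v)
    return basis

L = [[1, 2, 0, 0, 3], [0, 0, 1, 0, 4], [0, 0, 0, 1, 5]]
for v in kernel_basis(L):
    print([str(x) for x in v])
```

For the matrix 𝐿 of Example A.4.5, the sketch produces the same two vectors as the hand computation.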

The rank of a matrix is the dimension of the column space, and that is the span of the
pivot columns, while the kernel is the span of vectors, one for each non-pivot column. So
the two numbers must add to the number of columns.

Theorem A.4.3 (Rank–Nullity). If a matrix 𝐴 has 𝑛 columns, rank 𝑟, and nullity 𝑘 (dimension
of the kernel), then
𝑛 = 𝑟 + 𝑘.

The theorem is immensely useful in applications. It allows one to compute the rank
𝑟 if one knows the nullity 𝑘 and vice versa, without doing any extra work. An example
application is a simple version of the so-called Fredholm alternative. A similar result is true
for differential equations. Consider
\[
A \vec{x} = \vec{b},
\]
where 𝐴 is a square 𝑛 × 𝑛 matrix. There are then two mutually exclusive possibilities:

(i) A nonzero solution 𝑥® to 𝐴 𝑥® = 0® exists.

(ii) The equation 𝐴 𝑥® = 𝑏® has a unique solution 𝑥® for every 𝑏®.

How does the Rank–Nullity theorem come into the picture? Well, if 𝐴 𝑥® = 0® has a nonzero
solution 𝑥®, then the nullity 𝑘 is positive. But then the rank 𝑟 = 𝑛 − 𝑘 must be
less than 𝑛. It means that the column space of 𝐴 is of dimension less than 𝑛, so it is a
subspace that does not include everything in ℝ𝑛 . So ℝ𝑛 has to contain some vector 𝑏® not in
the column space of 𝐴. In fact, most vectors in ℝ𝑛 are not in the column space of 𝐴.
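
To make the first possibility concrete, here is a small example of our own choosing (the matrix is an illustration and does not appear elsewhere in the text):
\[
A = \begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix} ,
\qquad
A \begin{bmatrix} 2 \\ -1 \end{bmatrix} = \begin{bmatrix} 2 - 2 \\ 4 - 4 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} .
\]
The nullity is 1, so the rank is 2 − 1 = 1, and the column space is the span of (1, 2). The vector 𝑏® = (1, 0) is not a multiple of (1, 2), so 𝐴 𝑥® = 𝑏® has no solution for this 𝑏®, and possibility (ii) fails, as expected.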

A.4.3 Exercises
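Exercise A.4.1: For the following sets of vectors, find a basis for the subspace spanned by the vectors,
and find the dimension of the subspace.
a) $\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} , \begin{bmatrix} -1 \\ -1 \\ -1 \end{bmatrix}$
b) $\begin{bmatrix} 1 \\ 0 \\ 5 \end{bmatrix} , \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} , \begin{bmatrix} 0 \\ -1 \\ 0 \end{bmatrix}$
c) $\begin{bmatrix} -4 \\ -3 \\ 5 \end{bmatrix} , \begin{bmatrix} 2 \\ 3 \\ 3 \end{bmatrix} , \begin{bmatrix} 2 \\ 0 \\ 2 \end{bmatrix}$
d) $\begin{bmatrix} 1 \\ 3 \\ 0 \end{bmatrix} , \begin{bmatrix} 0 \\ 2 \\ 2 \end{bmatrix} , \begin{bmatrix} -1 \\ -1 \\ 2 \end{bmatrix}$
e) $\begin{bmatrix} 1 \\ 3 \end{bmatrix} , \begin{bmatrix} 0 \\ 2 \end{bmatrix} , \begin{bmatrix} -1 \\ -1 \end{bmatrix}$
f) $\begin{bmatrix} 3 \\ 1 \\ 3 \end{bmatrix} , \begin{bmatrix} 2 \\ 4 \\ -4 \end{bmatrix} , \begin{bmatrix} -5 \\ -5 \\ -2 \end{bmatrix}$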
           
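Exercise A.4.2: For the following matrices, find a basis for the kernel (nullspace).
a) $\begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 5 \\ 1 & 1 & -4 \end{bmatrix}$
b) $\begin{bmatrix} 2 & -1 & -3 \\ 4 & 0 & -4 \\ -1 & 1 & 2 \end{bmatrix}$
c) $\begin{bmatrix} -4 & 4 & 4 \\ -1 & 1 & 1 \\ -5 & 5 & 5 \end{bmatrix}$
d) $\begin{bmatrix} -2 & 1 & 1 & 1 \\ -4 & 2 & 2 & 2 \\ 1 & 0 & 4 & 3 \end{bmatrix}$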
       
Exercise A.4.3: Suppose a 5 × 5 matrix 𝐴 has rank 3. What is the nullity?

Exercise A.4.4: Suppose that 𝑋 is the set of all the vectors of ℝ3 whose third component is zero. Is
𝑋 a subspace? And if so, find a basis and the dimension.

Exercise A.4.5: Consider a square matrix 𝐴, and suppose that 𝑥® is a nonzero vector such that
𝐴 𝑥® = 0®. What does the Fredholm alternative say about invertibility of 𝐴?

Exercise A.4.6: Consider
\[
M = \begin{bmatrix} 1 & 2 & 3 \\ 2 & ? & ? \\ -1 & ? & ? \end{bmatrix} .
\]
 
If the nullity of this matrix is 2, fill in the question marks. Hint: What is the rank?

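Exercise A.4.101: For the following sets of vectors, find a basis for the subspace spanned by the
vectors, and find the dimension of the subspace.
a) $\begin{bmatrix} 1 \\ 2 \end{bmatrix} , \begin{bmatrix} 1 \\ 1 \end{bmatrix}$
b) $\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} , \begin{bmatrix} 2 \\ 2 \\ 2 \end{bmatrix} , \begin{bmatrix} 1 \\ 1 \\ 2 \end{bmatrix}$
c) $\begin{bmatrix} 5 \\ 3 \\ 1 \end{bmatrix} , \begin{bmatrix} 5 \\ -1 \\ 5 \end{bmatrix} , \begin{bmatrix} -1 \\ 3 \\ -4 \end{bmatrix}$
d) $\begin{bmatrix} 2 \\ 2 \\ 4 \end{bmatrix} , \begin{bmatrix} 2 \\ 2 \\ 3 \end{bmatrix} , \begin{bmatrix} 4 \\ 4 \\ -3 \end{bmatrix}$
e) $\begin{bmatrix} 1 \\ 0 \end{bmatrix} , \begin{bmatrix} 2 \\ 0 \end{bmatrix} , \begin{bmatrix} 3 \\ 0 \end{bmatrix}$
f) $\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} , \begin{bmatrix} 2 \\ 0 \\ 0 \end{bmatrix} , \begin{bmatrix} 0 \\ 1 \\ 2 \end{bmatrix}$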
           
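Exercise A.4.102: For the following matrices, find a basis for the kernel (nullspace).
a) $\begin{bmatrix} 2 & 6 & 1 & 9 \\ 1 & 3 & 2 & 9 \\ 3 & 9 & 0 & 9 \end{bmatrix}$
b) $\begin{bmatrix} 2 & -2 & -5 \\ -1 & 1 & 5 \\ -5 & 5 & -3 \end{bmatrix}$
c) $\begin{bmatrix} 1 & -5 & -4 \\ 2 & 3 & 5 \\ -3 & 5 & 2 \end{bmatrix}$
d) $\begin{bmatrix} 0 & 4 & 4 \\ 0 & 1 & 1 \\ 0 & 5 & 5 \end{bmatrix}$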
       
Exercise A.4.103: Suppose the column space of a 9 × 5 matrix 𝐴 is of dimension 3. Find
a) Rank of 𝐴.
b) Nullity of 𝐴.
c) Dimension of the row space of 𝐴.
d) Dimension of the nullspace of 𝐴.
e) Size of the maximum subset of linearly independent rows of 𝐴.

A.5 Inner product and projections


Note: 1–2 lectures

A.5.1 Inner product and orthogonality


To do basic geometry, we need length, and we need angles. We have already seen the
euclidean length, so let us figure out angles. Mostly, we are worried about the right angle‗ .
Given two (column) vectors in ℝ𝑛 , we define the (standard) inner product as the dot
product:
\[
\langle \vec{x}, \vec{y} \rangle = \vec{x} \cdot \vec{y} = \vec{y}^T \vec{x}
= x_1 y_1 + x_2 y_2 + \cdots + x_n y_n = \sum_{i=1}^n x_i y_i .
\]

Why do we seemingly give a new notation for the dot product (⟨ 𝑥®, 𝑦®⟩ instead of just 𝑥® · 𝑦®)?
Because there are other possible inner products, which are not the dot product, although
we will not worry about others here. An inner product can even be defined on spaces of
functions as we do in chapter 4:
\[
\langle f(t), g(t) \rangle = \int_a^b f(t) g(t) \, dt .
\]

But we digress.
The inner product satisfies the following rules:
(i) ⟨𝑥®, 𝑥®⟩ ≥ 0, and ⟨𝑥®, 𝑥®⟩ = 0 if and only if 𝑥® = 0,

(ii) ⟨𝑥®, 𝑦®⟩ = ⟨ 𝑦®, 𝑥®⟩,

(iii) ⟨𝑎 𝑥®, 𝑦®⟩ = ⟨𝑥®, 𝑎 𝑦®⟩ = 𝑎⟨𝑥®, 𝑦®⟩,

(iv) ⟨𝑥® + 𝑦®, 𝑧®⟩ = ⟨𝑥®, 𝑧®⟩ + ⟨ 𝑦®, 𝑧®⟩ and ⟨𝑥®, 𝑦® + 𝑧®⟩ = ⟨𝑥®, 𝑦®⟩ + ⟨ 𝑥®, 𝑧®⟩.
Anything that satisfies the properties above can be called an inner product, although in
this section we are concerned with the standard inner product in ℝ𝑛 .
The standard inner product gives the euclidean length:
\[
\lVert \vec{x} \rVert = \sqrt{\langle \vec{x}, \vec{x} \rangle} = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2} .
\]

How does it give angles? You may recall a formula for the standard inner product (the dot
product) from multivariable calculus in two or three dimensions in terms of the angle 𝜃
between the vectors:
⟨ 𝑥®, 𝑦®⟩ = ∥𝑥®∥∥ 𝑦®∥ cos 𝜃.
That is, 𝜃 is the angle that 𝑥® and 𝑦® make when they are based at the same point.
‗When Euclid defined angles in his Elements, the only angle he ever really defined was the right angle.

In ℝ𝑛 (any dimension), we are simply going to say that 𝜃 from the formula is what the
angle is. This makes sense as any two vectors based at the origin lie in a 2-dimensional
plane (subspace), and the formula works in 2 dimensions. In fact, one could even talk
about angles between functions this way, and we do in chapter 4, where we talk about
orthogonal functions (functions at right angle to each other).
To compute the angle we compute
\[
\cos \theta = \frac{\langle \vec{x}, \vec{y} \rangle}{\lVert \vec{x} \rVert \lVert \vec{y} \rVert} .
\]
Our angles are always in radians. We are computing the cosine of the angle, which is really
the best we can do. Given two vectors at an angle 𝜃, we can give the angle as −𝜃, 2𝜋 − 𝜃,
etc., see Figure A.5. Fortunately, cos 𝜃 = cos(−𝜃) = cos(2𝜋 − 𝜃). If we solve for 𝜃 using the
inverse cosine cos−1 , we can just decree that 0 ≤ 𝜃 ≤ 𝜋.

Figure A.5: Angle between vectors.

Example A.5.1: Let us compute the angle between the vectors (3, 0) and (1, 1) in the plane.
Compute
\[
\cos \theta = \frac{\langle (3, 0), (1, 1) \rangle}{\lVert (3, 0) \rVert \lVert (1, 1) \rVert}
= \frac{3 + 0}{3 \sqrt{2}} = \frac{1}{\sqrt{2}} .
\]
Therefore 𝜃 = 𝜋/4.
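
The same computation can be sketched in Python for vectors in any dimension. The function name `angle` is our own, and we assume both vectors are nonzero. The result is in radians and lies between 0 and 𝜋, matching the convention above.

```python
import math

def angle(x, y):
    """Angle in radians, between 0 and pi, between nonzero vectors x and y."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    # Clamp to [-1, 1] to guard against rounding error before taking the inverse cosine.
    c = max(-1.0, min(1.0, dot / (norm_x * norm_y)))
    return math.acos(c)

print(angle((3, 0), (1, 1)))   # approximately pi/4
print(math.pi / 4)
```

The clamping step is a design choice for floating point arithmetic: rounding can push the computed cosine slightly outside the interval from −1 to 1, where the inverse cosine is not defined.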
As we said, the most important angle is the right angle. A right angle is 𝜋/2 radians,
and cos(𝜋/2) = 0, so the formula is particularly easy in this case. We say vectors 𝑥® and 𝑦® are
orthogonal if they are at right angles, that is, if

⟨𝑥®, 𝑦®⟩ = 0.

The vectors (1, 0, 0, 1) and (1, 2, 3, −1) are orthogonal. So are (1, 1) and (1, −1). However,
(1, 1) and (1, 2) are not orthogonal as their inner product is 3 and not 0.

A.5.2 Orthogonal projection


A typical application of linear algebra is to take a difficult problem, write everything in
the right basis, and in this new basis the problem becomes simple. A particularly useful

basis is an orthogonal basis, a basis where all the basis vectors are orthogonal. When we
draw a coordinate system in two or three dimensions, we almost always draw our axes as
orthogonal to each other.
Generalizing this concept to functions, it is particularly useful in chapter 4 to express a
function using a particular orthogonal basis, the Fourier series.
To express one vector in terms of an orthogonal basis, we need to first project one vector
onto another. Given a nonzero vector 𝑣® , we define the orthogonal projection of 𝑤® onto 𝑣® as
\[
\operatorname{proj}_{\vec{v}}(\vec{w}) = \left( \frac{\langle \vec{w}, \vec{v} \rangle}{\langle \vec{v}, \vec{v} \rangle} \right) \vec{v} .
\]

For the geometric idea, see Figure A.6. That is, we find the “shadow of 𝑤®” on the line
spanned by 𝑣® if the direction of the sun’s rays were exactly perpendicular to the line.
Another way of thinking about it is that the tip of the arrow of proj𝑣® (𝑤®) is the closest point
on the line spanned by 𝑣® to the tip of the arrow of 𝑤®. In terms of euclidean distance,
𝑢® = proj𝑣® (𝑤®) minimizes the distance ∥𝑤® − 𝑢®∥ among all vectors 𝑢® that are multiples of
𝑣® . Because of this, this projection comes up often in applied mathematics in all sorts of
contexts where we cannot solve a problem exactly: We cannot always solve “Find 𝑤® as a multiple
of 𝑣® ,” but proj𝑣® (𝑤®) is the best “solution.”

Figure A.6: Orthogonal projection.

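The formula follows from basic trigonometry. The length of proj𝑣® (𝑤®) should be cos 𝜃
times the length of 𝑤®, that is, (cos 𝜃)∥𝑤®∥. We take the unit vector in the direction of 𝑣® , that
is, 𝑣®/∥𝑣®∥, and we multiply it by the length of the projection. In other words,
\[
\operatorname{proj}_{\vec{v}}(\vec{w})
= (\cos \theta) \lVert \vec{w} \rVert \frac{\vec{v}}{\lVert \vec{v} \rVert}
= \frac{(\cos \theta) \lVert \vec{w} \rVert \lVert \vec{v} \rVert}{\lVert \vec{v} \rVert^2} \, \vec{v}
= \frac{\langle \vec{w}, \vec{v} \rangle}{\langle \vec{v}, \vec{v} \rangle} \, \vec{v} .
\]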

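Example A.5.2: Suppose we wish to project the vector (3, 2, 1) onto the vector (1, 2, 3).
Compute
\[
\begin{aligned}
\operatorname{proj}_{(1,2,3)} \bigl( (3, 2, 1) \bigr)
&= \frac{\langle (3, 2, 1), (1, 2, 3) \rangle}{\langle (1, 2, 3), (1, 2, 3) \rangle} (1, 2, 3)
= \frac{3 \cdot 1 + 2 \cdot 2 + 1 \cdot 3}{1 \cdot 1 + 2 \cdot 2 + 3 \cdot 3} (1, 2, 3) \\
&= \frac{10}{14} (1, 2, 3) = \left( \frac{5}{7}, \frac{10}{7}, \frac{15}{7} \right) .
\end{aligned}
\]
Let us double check that the projection is orthogonal. That is, 𝑤® − proj𝑣® (𝑤®) ought to be
orthogonal to 𝑣® , see the right angle in Figure A.6. That is,
\[
(3, 2, 1) - \operatorname{proj}_{(1,2,3)} \bigl( (3, 2, 1) \bigr)
= \left( 3 - \frac{5}{7}, 2 - \frac{10}{7}, 1 - \frac{15}{7} \right)
= \left( \frac{16}{7}, \frac{4}{7}, \frac{-8}{7} \right)
\]
ought to be orthogonal to (1, 2, 3). We compute the inner product and we had better get
zero:
\[
\left\langle \left( \frac{16}{7}, \frac{4}{7}, \frac{-8}{7} \right) , (1, 2, 3) \right\rangle
= \frac{16}{7} \cdot 1 + \frac{4}{7} \cdot 2 - \frac{8}{7} \cdot 3 = 0 .
\]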

A.5.3 Orthogonal basis


As we said, a basis 𝑣®1 , 𝑣®2 , . . . , 𝑣® 𝑛 is an orthogonal basis if all vectors in the basis are orthogonal
to each other, that is, if
\[
\langle \vec{v}_j, \vec{v}_k \rangle = 0
\]
for all choices of 𝑗 and 𝑘 where 𝑗 ≠ 𝑘 (a nonzero vector cannot be orthogonal to itself).
A basis is furthermore called an orthonormal basis if all the vectors in a basis are also
unit vectors, that is, if all the vectors have magnitude 1. For example, the standard basis
{(1, 0, 0), (0, 1, 0), (0, 0, 1)} is an orthonormal basis of ℝ3 : Any pair is orthogonal, and each
vector is of unit magnitude.
The reason why we are interested in orthogonal (or orthonormal) bases is that they
make it really simple to represent a vector (or a projection onto a subspace) in the basis. The
formula for the orthogonal projection onto a vector gives us the coefficients. In chapter 4,
we use the same idea by finding the correct orthogonal basis for the set of solutions of a
differential equation. We are then able to find any particular solution by simply applying
the orthogonal projection formula, which is just a couple of inner products.
Let us come back to linear algebra. Consider a subspace 𝑆 and an orthogonal basis
𝑣®1 , 𝑣®2 , . . . , 𝑣® 𝑛 for 𝑆. We wish to express some vector 𝑥® in terms of this basis. If 𝑥® is not in
the span of this basis (when it is not in 𝑆), then of course it is not possible, but the following
formula gives us at least the orthogonal projection onto the subspace 𝑆, or in other words,
the best approximation to 𝑥® in the subspace—the vector in 𝑆 closest to 𝑥®.
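First suppose that 𝑥® is in the span. Then it is the sum of the orthogonal projections:
\[
\vec{x} = \operatorname{proj}_{\vec{v}_1}(\vec{x}) + \operatorname{proj}_{\vec{v}_2}(\vec{x}) + \cdots + \operatorname{proj}_{\vec{v}_n}(\vec{x})
= \frac{\langle \vec{x}, \vec{v}_1 \rangle}{\langle \vec{v}_1, \vec{v}_1 \rangle} \vec{v}_1
+ \frac{\langle \vec{x}, \vec{v}_2 \rangle}{\langle \vec{v}_2, \vec{v}_2 \rangle} \vec{v}_2
+ \cdots
+ \frac{\langle \vec{x}, \vec{v}_n \rangle}{\langle \vec{v}_n, \vec{v}_n \rangle} \vec{v}_n .
\]
In other words, if we want to write 𝑥® = 𝑎1 𝑣®1 + 𝑎 2 𝑣®2 + · · · + 𝑎 𝑛 𝑣® 𝑛 , then
\[
a_1 = \frac{\langle \vec{x}, \vec{v}_1 \rangle}{\langle \vec{v}_1, \vec{v}_1 \rangle} , \quad
a_2 = \frac{\langle \vec{x}, \vec{v}_2 \rangle}{\langle \vec{v}_2, \vec{v}_2 \rangle} , \quad
\ldots , \quad
a_n = \frac{\langle \vec{x}, \vec{v}_n \rangle}{\langle \vec{v}_n, \vec{v}_n \rangle} .
\]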
Another way to derive this formula is to work in reverse. Suppose that 𝑥® = 𝑎1 𝑣®1 + 𝑎 2 𝑣®2 +
· · · + 𝑎 𝑛 𝑣® 𝑛 . Take an inner product with 𝑣® 𝑗 , and use the properties of the inner product:
\[
\begin{aligned}
\langle \vec{x}, \vec{v}_j \rangle &= \langle a_1 \vec{v}_1 + a_2 \vec{v}_2 + \cdots + a_n \vec{v}_n , \vec{v}_j \rangle \\
&= a_1 \langle \vec{v}_1, \vec{v}_j \rangle + a_2 \langle \vec{v}_2, \vec{v}_j \rangle + \cdots + a_n \langle \vec{v}_n, \vec{v}_j \rangle .
\end{aligned}
\]
As the basis is orthogonal, ⟨𝑣® 𝑘 , 𝑣® 𝑗 ⟩ = 0 whenever 𝑘 ≠ 𝑗. That means that only one of
the terms, the 𝑗th one, on the right-hand side is nonzero and we get
\[
\langle \vec{x}, \vec{v}_j \rangle = a_j \langle \vec{v}_j, \vec{v}_j \rangle .
\]
Solving for 𝑎 𝑗 we find 𝑎 𝑗 = ⟨𝑥®, 𝑣® 𝑗 ⟩ / ⟨𝑣® 𝑗 , 𝑣® 𝑗 ⟩ as before.

Example A.5.3: The vectors (1, 1) and (1, −1) form an orthogonal basis of ℝ2 . Suppose we
wish to represent (3, 4) in terms of this basis, that is, we wish to find 𝑎1 and 𝑎 2 such that
\[
(3, 4) = a_1 (1, 1) + a_2 (1, -1) .
\]
We compute:
\[
a_1 = \frac{\langle (3, 4), (1, 1) \rangle}{\langle (1, 1), (1, 1) \rangle} = \frac{7}{2} ,
\qquad
a_2 = \frac{\langle (3, 4), (1, -1) \rangle}{\langle (1, -1), (1, -1) \rangle} = \frac{-1}{2} .
\]
So
\[
(3, 4) = \frac{7}{2} (1, 1) + \frac{-1}{2} (1, -1) .
\]
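If the basis is orthonormal rather than orthogonal, then all the denominators are one. It
is easy to make a basis orthonormal: divide all the vectors by their size. If you want to
decompose many vectors, it may be better to find an orthonormal basis. In the example
above, the orthonormal basis we would thus create is
\[
\left( \frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}} \right) , \quad
\left( \frac{1}{\sqrt{2}}, \frac{-1}{\sqrt{2}} \right) .
\]
Then the computation would have been
\[
\begin{aligned}
(3, 4) &= \left\langle (3, 4), \left( \tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}} \right) \right\rangle \left( \tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}} \right)
+ \left\langle (3, 4), \left( \tfrac{1}{\sqrt{2}}, \tfrac{-1}{\sqrt{2}} \right) \right\rangle \left( \tfrac{1}{\sqrt{2}}, \tfrac{-1}{\sqrt{2}} \right) \\
&= \frac{7}{\sqrt{2}} \left( \tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}} \right)
+ \frac{-1}{\sqrt{2}} \left( \tfrac{1}{\sqrt{2}}, \tfrac{-1}{\sqrt{2}} \right) .
\end{aligned}
\]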

Maybe the example is not so awe inspiring, but given vectors in ℝ20 rather than ℝ2 ,
then surely one would much rather do 20 inner products (or 40 if we did not have an
orthonormal basis) rather than solving a system of twenty equations in twenty unknowns
using row reduction of a 20 × 21 matrix.
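
The coefficient formula from Example A.5.3 takes only a few lines of Python. The names `dot` and `coefficients` are our own, and we assume that the basis really is orthogonal and that the entries are integers. The sketch does not check orthogonality.

```python
from fractions import Fraction

def dot(x, y):
    """Standard inner product of two vectors of the same length."""
    return sum(a * b for a, b in zip(x, y))

def coefficients(basis, x):
    """Coefficients a_j = <x, v_j> / <v_j, v_j> of x in an orthogonal basis."""
    return [Fraction(dot(x, v), dot(v, v)) for v in basis]

basis = [(1, 1), (1, -1)]
a = coefficients(basis, (3, 4))
# Rebuild the vector from the coefficients to confirm the representation.
rebuilt = [sum(c * v[i] for c, v in zip(a, basis)) for i in range(2)]
print([str(c) for c in a], rebuilt)
```

Each coefficient costs two inner products (one if the basis is orthonormal), which is the count of 40 versus 20 inner products mentioned above for ℝ20 .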
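As we said above, the formula still works even if 𝑥® is not in the subspace, although
then it does not get us the vector 𝑥® but its projection. More concretely, suppose that 𝑆 is a
subspace that is the span of the orthogonal basis 𝑣®1 , 𝑣®2 , . . . , 𝑣® 𝑛 and 𝑥® is any vector. Let proj𝑆 (𝑥®) be the vector in
𝑆 that is the closest to 𝑥®. Then
\[
\operatorname{proj}_S(\vec{x})
= \frac{\langle \vec{x}, \vec{v}_1 \rangle}{\langle \vec{v}_1, \vec{v}_1 \rangle} \vec{v}_1
+ \frac{\langle \vec{x}, \vec{v}_2 \rangle}{\langle \vec{v}_2, \vec{v}_2 \rangle} \vec{v}_2
+ \cdots
+ \frac{\langle \vec{x}, \vec{v}_n \rangle}{\langle \vec{v}_n, \vec{v}_n \rangle} \vec{v}_n .
\]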

Of course, if 𝑥® is in 𝑆, then proj𝑆 (𝑥®) = 𝑥®, as the closest vector in 𝑆 to 𝑥® is 𝑥® itself. But
true utility is obtained when 𝑥® is not in 𝑆. In much of applied mathematics, we cannot find
an exact solution to a problem, but we try to find the best solution out of a small subset
(subspace). The partial sums of Fourier series from chapter 4 are one example. Another
example is least squares approximation to fit a curve to data. Yet another example is given
by the most commonly used numerical methods to solve partial differential equations, the
finite element methods.
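Example A.5.4: The vectors (1, 2, 3) and (3, 0, −1) are orthogonal, and so they are an
orthogonal basis of a subspace 𝑆:
\[
S = \operatorname{span} \bigl\{ (1, 2, 3), (3, 0, -1) \bigr\} .
\]
Let us find the vector in 𝑆 that is closest to (2, 1, 0). That is, let us find proj𝑆 ((2, 1, 0)).
\[
\begin{aligned}
\operatorname{proj}_S \bigl( (2, 1, 0) \bigr)
&= \frac{\langle (2, 1, 0), (1, 2, 3) \rangle}{\langle (1, 2, 3), (1, 2, 3) \rangle} (1, 2, 3)
+ \frac{\langle (2, 1, 0), (3, 0, -1) \rangle}{\langle (3, 0, -1), (3, 0, -1) \rangle} (3, 0, -1) \\
&= \frac{2}{7} (1, 2, 3) + \frac{3}{5} (3, 0, -1) \\
&= \left( \frac{73}{35}, \frac{4}{7}, \frac{9}{35} \right) .
\end{aligned}
\]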

A.5.4 The Gram–Schmidt process


Before leaving orthogonal bases, let us note a procedure for manufacturing them out of any
old basis. It may not be difficult to come up with an orthogonal basis for a 2-dimensional
subspace, but for a 20-dimensional subspace, it seems a daunting task. Fortunately, the
orthogonal projection can be used to “project away” the bits of the vectors that are making
them not orthogonal. It is called the Gram–Schmidt process.
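We start with a basis of vectors 𝑣®1 , 𝑣®2 , . . . , 𝑣® 𝑛 . We construct an orthogonal basis
𝑤®1 , 𝑤®2 , . . . , 𝑤® 𝑛 as follows.
\[
\begin{aligned}
\vec{w}_1 &= \vec{v}_1 , \\
\vec{w}_2 &= \vec{v}_2 - \operatorname{proj}_{\vec{w}_1}(\vec{v}_2) , \\
\vec{w}_3 &= \vec{v}_3 - \operatorname{proj}_{\vec{w}_1}(\vec{v}_3) - \operatorname{proj}_{\vec{w}_2}(\vec{v}_3) , \\
\vec{w}_4 &= \vec{v}_4 - \operatorname{proj}_{\vec{w}_1}(\vec{v}_4) - \operatorname{proj}_{\vec{w}_2}(\vec{v}_4) - \operatorname{proj}_{\vec{w}_3}(\vec{v}_4) , \\
&\;\;\vdots \\
\vec{w}_n &= \vec{v}_n - \operatorname{proj}_{\vec{w}_1}(\vec{v}_n) - \operatorname{proj}_{\vec{w}_2}(\vec{v}_n) - \cdots - \operatorname{proj}_{\vec{w}_{n-1}}(\vec{v}_n) .
\end{aligned}
\]
What we do is at the 𝑘th step, we take 𝑣® 𝑘 and we subtract the projection of 𝑣® 𝑘 to the subspace
spanned by 𝑤®1 , 𝑤®2 , . . . , 𝑤® 𝑘−1 .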

Example A.5.5: Consider the vectors (1, 2, −1) and (0, 5, −2) and call 𝑆 the span of the two
vectors. Let us find an orthogonal basis of 𝑆 via the Gram–Schmidt process:
\[
\begin{aligned}
\vec{w}_1 &= (1, 2, -1) , \\
\vec{w}_2 &= (0, 5, -2) - \operatorname{proj}_{(1,2,-1)} \bigl( (0, 5, -2) \bigr) \\
&= (0, 5, -2) - \frac{\langle (0, 5, -2), (1, 2, -1) \rangle}{\langle (1, 2, -1), (1, 2, -1) \rangle} (1, 2, -1)
= (0, 5, -2) - 2 (1, 2, -1) = (-2, 1, 0) .
\end{aligned}
\]
So (1, 2, −1) and (−2, 1, 0) span 𝑆 and are orthogonal. Let us check: ⟨(1, 2, −1), (−2, 1, 0)⟩ = 0.
Suppose we wish to find an orthonormal basis, not just an orthogonal one. Well, we
simply make the vectors into unit vectors by dividing them by their magnitude. The two
vectors making up the orthonormal basis of 𝑆 are:
\[
\frac{1}{\sqrt{6}} (1, 2, -1) = \left( \frac{1}{\sqrt{6}}, \frac{2}{\sqrt{6}}, \frac{-1}{\sqrt{6}} \right) ,
\qquad
\frac{1}{\sqrt{5}} (-2, 1, 0) = \left( \frac{-2}{\sqrt{5}}, \frac{1}{\sqrt{5}}, 0 \right) .
\]
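
The process is mechanical enough to sketch in a few lines of Python. The names `dot` and `gram_schmidt` are our own. We assume integer or rational entries and that the input vectors are linearly independent, since otherwise some 𝑤® 𝑘 becomes the zero vector and the next projection divides by zero.

```python
from fractions import Fraction

def dot(x, y):
    """Standard inner product of two vectors of the same length."""
    return sum(a * b for a, b in zip(x, y))

def gram_schmidt(vectors):
    """Turn linearly independent vectors into an orthogonal list with the same span."""
    ws = []
    for v in vectors:
        w = [Fraction(a) for a in v]
        for u in ws:
            # Subtract the projection of v onto each previously built w.
            c = dot(v, u) / dot(u, u)
            w = [a - c * b for a, b in zip(w, u)]
        ws.append(w)
    return ws

for w in gram_schmidt([(1, 2, -1), (0, 5, -2)]):
    print([str(a) for a in w])
```

The sketch returns an orthogonal basis and does not normalize. Dividing each vector by its magnitude, as in the example above, would then require square roots and so leave exact rational arithmetic.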

A.5.5 Exercises
Exercise A.5.1: Find the 𝑠 that makes the following vectors orthogonal: (1, 2, 3), (1, 1, 𝑠).

Exercise A.5.2: Find the angle 𝜃 between (1, 3, 1), (2, 1, −1).

Exercise A.5.3: Given that ⟨𝑣® , 𝑤®⟩ = 3 and ⟨𝑣® , 𝑢® ⟩ = −1 compute
a) ⟨𝑢® , 2𝑣® ⟩
b) ⟨𝑣® , 2𝑤® + 3𝑢® ⟩
c) ⟨𝑤® + 3𝑢® , 𝑣® ⟩

Exercise A.5.4: Suppose 𝑣® = (1, 1, −1). Find
a) proj𝑣® ((1, 0, 0))
b) proj𝑣® ((1, 2, 3))
c) proj𝑣® ((1, −1, 0))

Exercise A.5.5: Consider the vectors (1, 2, 3), (−3, 0, 1), (1, −5, 3).
a) Check that the vectors are linearly independent and so form a basis of ℝ3 .
b) Check that the vectors are mutually orthogonal, and are therefore an orthogonal basis.
c) Represent (1, 1, 1) as a linear combination of this basis.
d) Make the basis orthonormal.

Exercise A.5.6: Let 𝑆 be the subspace spanned by (1, 3, −1), (1, 1, 1). Find an orthogonal basis of
𝑆 by the Gram–Schmidt process.

Exercise A.5.7: Starting with (1, 2, 3), (1, 1, 1), (2, 2, 0), follow the Gram–Schmidt process to find
an orthogonal basis of ℝ3 .

Exercise A.5.8: Find an orthogonal basis of ℝ3 such that (3, 1, −2) is one of the vectors. Hint:
First find two extra vectors to make a linearly independent set.

Exercise A.5.9: Using cosines and sines of 𝜃, find a unit vector 𝑢® in ℝ2 that makes angle 𝜃 with
𝚤® = (1, 0). What is ⟨𝚤® , 𝑢® ⟩?

Exercise A.5.101: Find the 𝑠 that makes the following vectors orthogonal: (1, 1, 1), (1, 𝑠, 1).

Exercise A.5.102: Find the angle 𝜃 between (1, 2, 3), (1, 1, 1).

Exercise A.5.103: Given that ⟨𝑣® , 𝑤®⟩ = 1 and ⟨𝑣® , 𝑢® ⟩ = −1 and ∥𝑣® ∥ = 3, compute
a) ⟨3𝑢® , 5𝑣® ⟩
b) ⟨𝑣® , 2𝑤® + 3𝑢® ⟩
c) ⟨𝑤® + 3𝑣® , 𝑣® ⟩

Exercise A.5.104: Suppose 𝑣® = (1, 0, −1). Find
a) proj𝑣® ((0, 2, 1))
b) proj𝑣® ((1, 0, 1))
c) proj𝑣® ((4, −1, 0))

Exercise A.5.105: The vectors (1, 1, −1), (2, −1, 1), (1, −5, 3) form an orthogonal basis. Represent
the following vectors in terms of this basis:

a) (1, −8, 4) b) (5, −7, 5) c) (0, −6, 2)

Exercise A.5.106: Let 𝑆 be the subspace spanned by (2, −1, 1), (2, 2, 2). Find an orthogonal basis
of 𝑆 by the Gram–Schmidt process.

Exercise A.5.107: Starting with (1, 1, −1), (2, 3, −1), (1, −1, 1), follow the Gram–Schmidt process
to find an orthogonal basis of ℝ3 .

A.6 Determinant
Note: 1 lecture
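For square matrices we define a useful quantity called the determinant. Define the
determinant of a 1 × 1 matrix as the value of its only entry
\[
\det \bigl( \begin{bmatrix} a \end{bmatrix} \bigr) \overset{\text{def}}{=} a .
\]
For a 2 × 2 matrix, define
\[
\det \left( \begin{bmatrix} a & b \\ c & d \end{bmatrix} \right) \overset{\text{def}}{=} ad - bc .
\]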
Before defining the determinant for larger matrices, we note the meaning of the
determinant. An 𝑛 × 𝑛 matrix gives a mapping of the 𝑛-dimensional euclidean space ℝ𝑛
to itself. So a 2 × 2 matrix 𝐴 is a mapping of the plane to itself. The determinant of 𝐴 is the
factor by which the area of objects changes. If we take the unit square (square of side 1) in
the plane, then 𝐴 takes the square to a parallelogram of area |det(𝐴)|. The sign of det(𝐴)
denotes a change of orientation (negative if the axes get flipped). For example, let
 
\[ A = \begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix} . \]

Then det(𝐴) = 1 + 1 = 2. Let us see where 𝐴 sends the unit square—the square with
vertices (0, 0), (1, 0), (0, 1), and (1, 1). The point (0, 0) gets sent to (0, 0).
              
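\[
\begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ -1 \end{bmatrix}, \qquad
\begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \qquad
\begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 2 \\ 0 \end{bmatrix} .
\]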

The image of the square is another square with vertices (0, 0), (1, −1), (1, 1), and (2, 0). The
image square has a side of length $\sqrt{2}$, and it is therefore of area 2. See Figure A.7.

Figure A.7: Image of the unit square via the mapping $A$.
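
The area claim for this example can be checked numerically. The following is a minimal sketch in Python, not part of the original text: the helper names `apply`, `det2`, and `shoelace_area` are illustrative assumptions, and the shoelace formula is brought in only as an independent way to measure the area of the image.

```python
# Sketch: check that A maps the unit square to a region of area |det(A)|.
# Helper names are illustrative; matrices are lists of rows.

def apply(A, v):
    """Multiply the 2x2 matrix A by the vector v."""
    return (A[0][0] * v[0] + A[0][1] * v[1],
            A[1][0] * v[0] + A[1][1] * v[1])

def det2(A):
    """Determinant of a 2x2 matrix: ad - bc."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

def shoelace_area(points):
    """Area of a polygon whose vertices are listed in order around its boundary."""
    total = 0
    n = len(points)
    for k in range(n):
        x0, y0 = points[k]
        x1, y1 = points[(k + 1) % n]
        total += x0 * y1 - x1 * y0
    return abs(total) / 2

A = [[1, 1], [-1, 1]]
square = [(0, 0), (1, 0), (1, 1), (0, 1)]   # unit square, vertices in boundary order
image = [apply(A, v) for v in square]

print(image)                                # vertices of the image square
print(shoelace_area(image), abs(det2(A)))   # the two numbers should agree
```

Listing the vertices in boundary order matters for the shoelace formula, which is why the square is entered as (0, 0), (1, 0), (1, 1), (0, 1) rather than in the order used in the text.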

In general, the image of a square is going to be a parallelogram. In high school geometry,
you may have seen a formula for computing the area of a parallelogram with vertices (0, 0),
$(a, c)$, $(b, d)$ and $(a + b, c + d)$. The area is

\[ \left\lvert \det\left(\begin{bmatrix} a & b \\ c & d \end{bmatrix}\right) \right\rvert = \lvert ad - bc \rvert . \]

The vertical lines above mean absolute value. The matrix $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$ carries the unit square to
the given parallelogram.
There are a number of ways to define the determinant for an 𝑛 × 𝑛 matrix. Let us use
the so-called cofactor expansion. We define 𝐴 𝑖𝑗 as the matrix 𝐴 with the 𝑖 th row and the 𝑗 th
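column deleted. For example,

\[ \text{if} \quad A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}, \quad \text{then} \quad A_{12} = \begin{bmatrix} 4 & 6 \\ 7 & 9 \end{bmatrix} \quad \text{and} \quad A_{23} = \begin{bmatrix} 1 & 2 \\ 7 & 8 \end{bmatrix} . \]

We now define the determinant recursively

\[ \det(A) \overset{\text{def}}{=} \sum_{j=1}^n (-1)^{1+j} a_{1j} \det(A_{1j}), \]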

or in other words
\[
\det(A) = a_{11} \det(A_{11}) - a_{12} \det(A_{12}) + a_{13} \det(A_{13}) - \cdots
\begin{cases}
+ a_{1n} \det(A_{1n}) & \text{if $n$ is odd,} \\
- a_{1n} \det(A_{1n}) & \text{if $n$ even.}
\end{cases}
\]

For a 3 × 3 matrix, we get det(𝐴) = 𝑎11 det(𝐴11 ) − 𝑎 12 det(𝐴12 ) + 𝑎 13 det(𝐴13 ). For example,
1 2 3      
©  ª 5 6 4 6 4 5
det ­ 4 5 6 ® = 1 · det
 − 2 · det + 3 · det
7 8 9 8 9 7 9 7 8
« ¬
= 1(5 · 9 − 6 · 8) − 2(4 · 9 − 6 · 7) + 3(4 · 8 − 5 · 7) = 0.
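
The recursive definition translates almost directly into code. Below is a minimal sketch, assuming matrices are stored as Python lists of rows; the function names `minor` and `det` are illustrative choices, not notation from the text, and indices are 0-based, so the sign $(-1)^{1+j}$ becomes `(-1) ** j`.

```python
# Sketch of the cofactor expansion along the first row.

def minor(A, i, j):
    """Return A with row i and column j deleted (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Determinant of a square matrix given as a list of rows."""
    n = len(A)
    if n == 1:
        return A[0][0]          # base case: a 1x1 matrix
    # Alternate signs across the first row, recursing on the submatrices.
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(n))

A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(minor(A, 0, 1))   # the submatrix written A_12 in the text
print(minor(A, 1, 2))   # the submatrix written A_23 in the text
print(det(A))
```

The recursion visits on the order of $n!$ terms, so this sketch is only meant for small matrices such as the ones in this section.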

It turns out that we did not necessarily have to use the first row. That is, for any $i$,

\[ \det(A) = \sum_{j=1}^n (-1)^{i+j} a_{ij} \det(A_{ij}) . \]

It is sometimes useful to use a row other than the first. In the following example it is more
convenient to expand along the second row. Notice that for the second row we are starting
with a negative sign.
1 2 3      
©  ª 2 3 1 3 1 2
det ­ 0 5 0 ® = −0 · det + 5 · det − 0 · det
7 8 9 8 9 7 9 7 8
« ¬
= 0 + 5(1 · 9 − 3 · 7) + 0 = −60.

Let us check if it is really the same as expanding along the first row,

1 2 3      
©  ª 5 0 0 0 0 5
det ­ 0 5 0 ® = 1 · det − 2 · det + 3 · det
7 8 9 8 9 7 9 7 8
« ¬
= 1(5 · 9 − 0 · 8) − 2(0 · 9 − 0 · 7) + 3(0 · 8 − 5 · 7) = −60.

In computing the determinant, we alternately add and subtract the determinants of the
submatrices 𝐴 𝑖𝑗 multiplied by 𝑎 𝑖𝑗 for a fixed 𝑖 and all 𝑗. The numbers (−1)𝑖+𝑗 det(𝐴 𝑖𝑗 ) are
called cofactors of the matrix. And that is why this method of computing the determinant
is called the cofactor expansion.
Similarly, we do not need to expand along a row; we can expand along a column. For
any $j$,

\[ \det(A) = \sum_{i=1}^n (-1)^{i+j} a_{ij} \det(A_{ij}) . \]

A related fact is that

\[ \det(A) = \det(A^T) . \]
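
The freedom to expand along any row or column, and the transpose identity, can be spot-checked on the example matrix used above. This is a sketch under the same assumptions as before (lists of rows, 0-based indices); `expand_row`, `expand_col`, and `transpose` are hypothetical helper names introduced only for this illustration.

```python
# Sketch: expansions along every row and column give the same determinant.

def minor(A, i, j):
    """Return A with row i and column j deleted (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Cofactor expansion along the first row."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(len(A)))

def expand_row(A, i):
    """Cofactor expansion along row i."""
    return sum((-1) ** (i + j) * A[i][j] * det(minor(A, i, j)) for j in range(len(A)))

def expand_col(A, j):
    """Cofactor expansion along column j."""
    return sum((-1) ** (i + j) * A[i][j] * det(minor(A, i, j)) for i in range(len(A)))

def transpose(A):
    """Swap rows and columns."""
    return [list(col) for col in zip(*A)]

A = [[1, 2, 3], [0, 5, 0], [7, 8, 9]]
rows = [expand_row(A, i) for i in range(3)]
cols = [expand_col(A, j) for j in range(3)]
print(rows, cols, det(transpose(A)))
```

Since $(-1)^{i+j}$ depends only on the parity of $i + j$, shifting both indices from 1-based to 0-based does not change any sign.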

A matrix is upper triangular if all elements below the main diagonal are 0. For example,

1 2 3
 
0 5 6
 
0 0 9
 
is upper triangular. Similarly a lower triangular matrix is one where everything above the
diagonal is zero. For example,
1 0 0
 
4 5 0 .
 
7 8 9
 
The determinant for triangular matrices is very simple to compute. Consider the lower
triangular matrix. If we expand along the first row, we find that the determinant is 1 times
the determinant of the lower triangular matrix $\begin{bmatrix} 5 & 0 \\ 8 & 9 \end{bmatrix}$. So the determinant is just the product
of the diagonal entries:

\[ \det\left(\begin{bmatrix} 1 & 0 & 0 \\ 4 & 5 & 0 \\ 7 & 8 & 9 \end{bmatrix}\right) = 1 \cdot 5 \cdot 9 = 45 . \]

Similarly for upper triangular matrices

\[ \det\left(\begin{bmatrix} 1 & 2 & 3 \\ 0 & 5 & 6 \\ 0 & 0 & 9 \end{bmatrix}\right) = 1 \cdot 5 \cdot 9 = 45 . \]
In general, if $A$ is triangular, then

\[ \det(A) = a_{11} a_{22} \cdots a_{nn} . \]

If $A$ is diagonal, then it is also triangular (upper and lower), so the same formula applies.
For example,

\[ \det\left(\begin{bmatrix} 2 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 5 \end{bmatrix}\right) = 2 \cdot 3 \cdot 5 = 30 . \]

In particular, the identity matrix $I$ is diagonal, and the diagonal entries are all 1. Thus,

\[ \det(I) = 1 . \]
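
For triangular matrices the whole computation reduces to one product. The sketch below is an illustration only: `is_triangular` and `det_triangular` are assumed names, and the triangularity check is included because the product formula is not valid for general matrices.

```python
# Sketch: determinant of a triangular matrix as the product of its diagonal.
from math import prod

def is_triangular(A):
    """True if every entry below the diagonal is 0, or every entry above it is 0."""
    n = len(A)
    upper = all(A[i][j] == 0 for i in range(n) for j in range(i))
    lower = all(A[i][j] == 0 for i in range(n) for j in range(i + 1, n))
    return upper or lower

def det_triangular(A):
    """Product of the diagonal entries; refuses matrices that are not triangular."""
    if not is_triangular(A):
        raise ValueError("matrix is not triangular")
    return prod(A[i][i] for i in range(len(A)))

lower = [[1, 0, 0], [4, 5, 0], [7, 8, 9]]
upper = [[1, 2, 3], [0, 5, 6], [0, 0, 9]]
diag = [[2, 0, 0], [0, 3, 0], [0, 0, 5]]
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

for M in (lower, upper, diag, identity):
    print(det_triangular(M))
```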

The determinant tells you how geometric objects scale. If $B$ doubles the sizes of
geometric objects and $A$ triples them, then $AB$ (which applies $B$ to an object and then
applies $A$) should make the size go up by a factor of 6. This is true in general:
Theorem A.6.1.
det(𝐴𝐵) = det(𝐴) det(𝐵).
This property is one of the most useful, and it is employed often to actually compute
determinants. A particularly interesting consequence is to note what it means for the
existence of inverses. Take 𝐴 and 𝐵 to be inverses, that is, 𝐴𝐵 = 𝐼. Then
det(𝐴) det(𝐵) = det(𝐴𝐵) = det(𝐼) = 1.
Neither det(𝐴) nor det(𝐵) can be zero. This fact is an extremely useful property of the
determinant, and one which is used often in this book:
Theorem A.6.2. An 𝑛 × 𝑛 matrix 𝐴 is invertible if and only if det(𝐴) ≠ 0.
In fact, $\det(A^{-1}) \det(A) = 1$ says that

\[ \det(A^{-1}) = \frac{1}{\det(A)} . \]
So we know what the determinant of 𝐴−1 is without computing 𝐴−1 .
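
The product rule and its consequence for inverses can be tried on small examples. The following is a sketch, not a proof: the matrices are chosen for illustration (two of them reuse examples from this section), `matmul` and `det` are assumed helper names, and exact fractions are used so that the comparison with 1 does not suffer from rounding.

```python
# Sketch: check det(AB) = det(A) det(B) and det(A^{-1}) = 1/det(A) on examples.
from fractions import Fraction

def minor(A, i, j):
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Cofactor expansion along the first row."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(len(A)))

def matmul(A, B):
    """Product of two square matrices of the same size."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[1, 2, 3], [0, 5, 0], [7, 8, 9]]     # determinant -60 (computed in the text)
B = [[1, 0, 0], [4, 5, 0], [7, 8, 9]]     # determinant 45 (lower triangular)
print(det(matmul(A, B)), det(A) * det(B))

# An inverse pair: P is the 2x2 example from the start of the section.
half = Fraction(1, 2)
P = [[1, 1], [-1, 1]]
Q = [[half, -half], [half, half]]
print(matmul(P, Q))                        # should be the identity
print(det(P) * det(Q))                     # should be 1
```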
Let us return to the formula for the inverse of a 2 × 2 matrix:

\[ \begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1} = \frac{1}{ad - bc} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix} . \]

Notice the determinant of the matrix $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$ in the denominator of the fraction. The formula
only works if the determinant is nonzero, otherwise we are dividing by zero.
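
As a small illustration of the formula and of the role the determinant plays in it, here is a sketch in Python. The function name `inverse2` is an assumption made for this example; exact fractions are used so the entries come out as in a hand computation, and a zero determinant is reported as an error rather than attempted.

```python
# Sketch: the 2x2 inverse formula, guarded against a zero determinant.
from fractions import Fraction

def inverse2(M):
    """Inverse of a 2x2 matrix [[a, b], [c, d]] via (1/(ad - bc)) [[d, -b], [-c, a]]."""
    (a, b), (c, d) = M
    determinant = a * d - b * c
    if determinant == 0:
        raise ValueError("matrix is singular: determinant is zero")
    scale = Fraction(1, determinant)
    return [[scale * d, scale * (-b)],
            [scale * (-c), scale * a]]

A = [[1, 1], [-1, 1]]
A_inv = inverse2(A)
print(A_inv)

# A singular example: the second row is twice the first, so ad - bc = 0.
try:
    inverse2([[2, 1], [4, 2]])
except ValueError as err:
    print(err)
```

Passing integer entries keeps `Fraction(1, determinant)` exact; for floating-point entries one would divide directly instead.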
A common notation for the determinant is a pair of vertical lines:
\[ \begin{vmatrix} a & b \\ c & d \end{vmatrix} = \det\left(\begin{bmatrix} a & b \\ c & d \end{bmatrix}\right) . \]
Personally, I find this notation confusing as vertical lines usually mean a positive quantity,
while determinants can be negative. Also think about how to write the absolute value of a
determinant. This notation is not used in this book.

A.6.1 Exercises
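Exercise A.6.1: Compute the determinant of the following matrices:

a) $\begin{bmatrix} 3 \end{bmatrix}$    b) $\begin{bmatrix} 1 & 3 \\ 2 & 1 \end{bmatrix}$    c) $\begin{bmatrix} 2 & 1 \\ 4 & 2 \end{bmatrix}$    d) $\begin{bmatrix} 1 & 2 & 3 \\ 0 & 4 & 5 \\ 0 & 0 & 6 \end{bmatrix}$

e) $\begin{bmatrix} 2 & 1 & 0 \\ -2 & 7 & -3 \\ 0 & 2 & 0 \end{bmatrix}$    f) $\begin{bmatrix} 2 & 1 & 3 \\ 8 & 6 & 3 \\ 7 & 9 & 7 \end{bmatrix}$    g) $\begin{bmatrix} 0 & 2 & 5 & 7 \\ 0 & 0 & 2 & -3 \\ 3 & 4 & 5 & 7 \\ 0 & 0 & 2 & 4 \end{bmatrix}$    h) $\begin{bmatrix} 0 & 1 & 2 & 0 \\ 1 & 1 & -1 & 2 \\ 1 & 1 & 2 & 1 \\ 2 & -1 & -2 & 3 \end{bmatrix}$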

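Exercise A.6.2: For which $x$ are the following matrices singular (not invertible)?

a) $\begin{bmatrix} 2 & 3 \\ 2 & x \end{bmatrix}$    b) $\begin{bmatrix} 2 & x \\ 1 & 2 \end{bmatrix}$    c) $\begin{bmatrix} x & 1 \\ 4 & x \end{bmatrix}$    d) $\begin{bmatrix} x & 0 & 1 \\ 1 & 4 & 2 \\ 1 & 6 & 2 \end{bmatrix}$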
 
Exercise A.6.3: Compute

\[ \det\left( \begin{bmatrix} 2 & 1 & 2 & 3 \\ 0 & 8 & 6 & 5 \\ 0 & 0 & 3 & 9 \\ 0 & 0 & 0 & 1 \end{bmatrix}^{-1} \right) \]

without computing the inverse.

Exercise A.6.4: Suppose

\[ L = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 2 & 1 & 0 & 0 \\ 7 & \pi & 1 & 0 \\ 28 & 5 & -99 & 1 \end{bmatrix} \qquad \text{and} \qquad U = \begin{bmatrix} 5 & 9 & 1 & -\sin(1) \\ 0 & 1 & 88 & -1 \\ 0 & 0 & 1 & 3 \\ 0 & 0 & 0 & 1 \end{bmatrix} . \]

Let $A = LU$. Compute $\det(A)$ in a simple way, without computing what $A$ is. Hint: First read off
$\det(L)$ and $\det(U)$.

Exercise A.6.5: Consider the linear mapping from $\mathbb{R}^2$ to $\mathbb{R}^2$ given by the matrix $A = \begin{bmatrix} 1 & x \\ 2 & 1 \end{bmatrix}$ for
some number $x$. You wish to make $A$ such that it doubles the area of every geometric figure. What
are the possibilities for $x$? (There are two answers.)

Exercise A.6.6: Suppose 𝐴 and 𝑆 are 𝑛 × 𝑛 matrices, and 𝑆 is invertible. Suppose that det(𝐴) = 3.
Compute det(𝑆 −1 𝐴𝑆) and det(𝑆𝐴𝑆−1 ). Justify your answer using the theorems in this section.

Exercise A.6.7: Let 𝐴 be an 𝑛×𝑛 matrix such that det(𝐴) = 1. Compute det(𝑥𝐴) given a number 𝑥.
Hint: First try computing det(𝑥𝐼), then note that 𝑥𝐴 = (𝑥𝐼)𝐴.

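Exercise A.6.101: Compute the determinant of the following matrices:

a) $\begin{bmatrix} -2 \end{bmatrix}$    b) $\begin{bmatrix} 2 & -2 \\ 1 & 3 \end{bmatrix}$    c) $\begin{bmatrix} 2 & 2 \\ 2 & 2 \end{bmatrix}$    d) $\begin{bmatrix} 2 & 9 & -11 \\ 0 & -1 & 5 \\ 0 & 0 & 3 \end{bmatrix}$

e) $\begin{bmatrix} 2 & 1 & 0 \\ -2 & 7 & 3 \\ 1 & 1 & 0 \end{bmatrix}$    f) $\begin{bmatrix} 5 & 1 & 3 \\ 4 & 1 & 1 \\ 4 & 5 & 1 \end{bmatrix}$    g) $\begin{bmatrix} 3 & 2 & 5 & 7 \\ 0 & 0 & 2 & 0 \\ 0 & 4 & 5 & 0 \\ 2 & 1 & 2 & 4 \end{bmatrix}$    h) $\begin{bmatrix} 0 & 2 & 1 & 0 \\ 1 & 2 & -3 & 4 \\ 5 & 6 & -7 & 8 \\ 1 & 2 & 3 & -2 \end{bmatrix}$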

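Exercise A.6.102: For which $x$ are the following matrices singular (not invertible)?

a) $\begin{bmatrix} 1 & 3 \\ 1 & x \end{bmatrix}$    b) $\begin{bmatrix} 3 & x \\ 1 & 3 \end{bmatrix}$    c) $\begin{bmatrix} x & 3 \\ 3 & x \end{bmatrix}$    d) $\begin{bmatrix} x & 1 & 0 \\ 1 & 4 & 0 \\ 1 & 6 & 2 \end{bmatrix}$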
 
Exercise A.6.103: Compute

\[ \det\left( \begin{bmatrix} 3 & 4 & 7 & 12 \\ 0 & -1 & 9 & -8 \\ 0 & 0 & -2 & 4 \\ 0 & 0 & 0 & 2 \end{bmatrix}^{-1} \right) \]

without computing the inverse.

Exercise A.6.104 (challenging): Find all the $x$ that make the matrix inverse

\[ \begin{bmatrix} 1 & 2 \\ 1 & x \end{bmatrix}^{-1} \]

have only integer entries (no fractions). Note that there are two answers.
Appendix B

Table of Laplace Transforms

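The function $u$ is the Heaviside function, $\delta$ is the Dirac delta function, and

\[ \Gamma(t) = \int_0^\infty e^{-\tau} \tau^{t-1} \, d\tau, \qquad \operatorname{erf}(t) = \frac{2}{\sqrt{\pi}} \int_0^t e^{-\tau^2} \, d\tau, \qquad \operatorname{erfc}(t) = 1 - \operatorname{erf}(t) . \]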
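$f(t)$                                                              $F(s) = \mathcal{L}\{f(t)\} = \int_0^\infty e^{-st} f(t) \, dt$
------------------------------------------------------------------------------------------------------------
$C$                                                                 $\frac{C}{s}$
$t$                                                                 $\frac{1}{s^2}$
$t^n$                                                               $\frac{n!}{s^{n+1}}$
$t^p$  $(p > -1)$                                                   $\frac{\Gamma(p+1)}{s^{p+1}}$
$e^{-at}$                                                           $\frac{1}{s+a}$
$\sin(\omega t)$                                                    $\frac{\omega}{s^2+\omega^2}$
$\cos(\omega t)$                                                    $\frac{s}{s^2+\omega^2}$
$\sinh(\omega t)$                                                   $\frac{\omega}{s^2-\omega^2}$
$\cosh(\omega t)$                                                   $\frac{s}{s^2-\omega^2}$
$u(t-a)$  $(a \geq 0)$                                              $\frac{e^{-as}}{s}$
$\delta(t)$                                                         $1$
$\delta(t-a)$  $(a \geq 0)$                                         $e^{-as}$
$\operatorname{erf}\bigl(\frac{t}{2a}\bigr)$                        $\frac{1}{s} e^{(as)^2} \operatorname{erfc}(as)$
$t \sin(\omega t)$                                                  $\frac{2\omega s}{(s^2+\omega^2)^2}$
$t \cos(\omega t)$                                                  $\frac{s^2-\omega^2}{(s^2+\omega^2)^2}$
$e^{-at} \sin(\omega t)$                                            $\frac{\omega}{(s+a)^2+\omega^2}$
$e^{-at} \cos(\omega t)$                                            $\frac{s+a}{(s+a)^2+\omega^2}$
$e^{-at} \sinh(\omega t)$                                           $\frac{\omega}{(s+a)^2-\omega^2}$
$e^{-at} \cosh(\omega t)$                                           $\frac{s+a}{(s+a)^2-\omega^2}$
$\frac{1}{\sqrt{\pi t}} \exp\bigl(\frac{-a^2}{4t}\bigr)$  $(a > 0)$           $\frac{e^{-a\sqrt{s}}}{\sqrt{s}}$
$\frac{a}{\sqrt{4\pi t^3}} \exp\bigl(\frac{-a^2}{4t}\bigr)$  $(a > 0)$        $e^{-a\sqrt{s}}$
$\frac{1}{\sqrt{\pi t}} - a e^{a^2 t} \operatorname{erfc}(a\sqrt{t})$  $(a > 0)$   $\frac{1}{\sqrt{s}+a}$
$\operatorname{erfc}\bigl(\frac{a}{2\sqrt{t}}\bigr)$  $(a > 0)$     $\frac{e^{-a\sqrt{s}}}{s}$
$\frac{1}{2\sqrt{\pi t^3}} (e^{bt} - e^{at})$                       $\sqrt{s-a} - \sqrt{s-b}$
$\frac{1}{t} (e^{bt} - e^{at})$                                     $\ln\bigl(\frac{s-a}{s-b}\bigr)$
$J_0(at)$                                                           $\frac{1}{\sqrt{s^2+a^2}}$
$J_0(a\sqrt{t})$                                                    $\frac{1}{s} \exp\bigl(\frac{-a^2}{4s}\bigr)$
$\frac{1}{\sqrt{\pi t}} \cos(a\sqrt{t})$                            $\frac{1}{\sqrt{s}} \exp\bigl(\frac{-a^2}{4s}\bigr)$
$\frac{2}{\sqrt{\pi}} \sinh(a\sqrt{t})$                             $\frac{a}{s^{3/2}} \exp\bigl(\frac{a^2}{4s}\bigr)$
$\frac{1}{\sqrt{\pi t}} e^{at} (1 + 2at)$                           $\frac{s}{(s-a)^{3/2}}$
$a f(t) + b g(t)$                                                   $a F(s) + b G(s)$
$f(at)$  $(a > 0)$                                                  $\frac{1}{a} F\bigl(\frac{s}{a}\bigr)$
$f(t-a) u(t-a)$  $(a \geq 0)$                                       $e^{-as} F(s)$
$e^{-at} f(t)$                                                      $F(s+a)$
$g'(t)$                                                             $s G(s) - g(0)$
$g''(t)$                                                            $s^2 G(s) - s g(0) - g'(0)$
$g^{(n)}(t)$                                                        $s^n G(s) - s^{n-1} g(0) - \cdots - g^{(n-1)}(0)$
$(f * g)(t) = \int_0^t f(\tau) g(t-\tau) \, d\tau$                  $F(s) G(s)$
$t f(t)$                                                            $-F'(s)$
$t^n f(t)$                                                          $(-1)^n F^{(n)}(s)$
$\int_0^t f(\tau) \, d\tau$                                         $\frac{1}{s} F(s)$
$\frac{f(t)}{t}$                                                    $\int_s^\infty F(\sigma) \, d\sigma$
$f(t)$ periodic with period $P$                                     $\frac{1}{1-e^{-Ps}} \int_0^P e^{-st} f(t) \, dt$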
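
Individual entries of the table can be sanity-checked numerically by approximating the defining integral. The sketch below is only an illustration under stated assumptions: `laplace_numeric` is a hypothetical helper, the infinite integral is truncated at an arbitrarily chosen $T = 40$ (the tail is negligible for the values of $s$ used here), and composite Simpson's rule is used for the quadrature.

```python
# Sketch: compare a numerically computed Laplace transform with a table entry.
import math

def laplace_numeric(f, s, T=40.0, n=40000):
    """Approximate the integral of exp(-s*t) * f(t) over [0, T] by Simpson's rule."""
    if n % 2:
        n += 1                      # Simpson's rule needs an even number of steps
    h = T / n

    def g(t):
        return math.exp(-s * t) * f(t)

    total = g(0.0) + g(T)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(k * h)
    return total * h / 3

s, omega = 2.0, 3.0

# Entry: sin(omega t)  ->  omega / (s^2 + omega^2)
approx_sin = laplace_numeric(lambda t: math.sin(omega * t), s)
exact_sin = omega / (s ** 2 + omega ** 2)

# Entry: t  ->  1 / s^2
approx_t = laplace_numeric(lambda t: t, s)
exact_t = 1 / s ** 2

print(approx_sin, exact_sin)
print(approx_t, exact_t)
```

The truncation point and step count would need to grow for small $s$, where $e^{-st}$ decays slowly.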
Further Reading

[BM] Paul W. Berg and James L. McGregor, Elementary Partial Differential Equations,
Holden-Day, San Francisco, CA, 1966.

[BD] William E. Boyce and Richard C. DiPrima, Elementary Differential Equations and
Boundary Value Problems, 11th edition, John Wiley & Sons Inc., New York, NY, 2017.

[EP] C.H. Edwards and D.E. Penney, Differential Equations and Boundary Value Problems:
Computing and Modeling, 5th edition, Pearson, 2014.

[F] Stanley J. Farlow, An Introduction to Differential Equations and Their Applications,
McGraw-Hill, Inc., Princeton, NJ, 1994. (Published also by Dover Publications, 2006.)

[I] E.L. Ince, Ordinary Differential Equations, Dover Publications, Inc., New York, NY,
1956.

[T] William F. Trench, Elementary Differential Equations with Boundary Value Problems.
Books and Monographs. Book 9. 2013. https://digitalcommons.trinity.edu/mono/9
Solutions to Selected Exercises

0.2.101: Compute 𝑥 ′ = −2𝑒 −2𝑡 and 𝑥 ′′ = 4𝑒 −2𝑡 . Then (4𝑒 −2𝑡 ) + 4(−2𝑒 −2𝑡 ) + 4(𝑒 −2𝑡 ) = 0.
0.2.102: Yes.
0.2.103: 𝑦 = 𝑥 𝑟 is a solution for 𝑟 = 0 and 𝑟 = 2.
0.2.104: 𝐶1 = 100, 𝐶2 = −90
0.2.105: 𝜑 = −9𝑒 8𝑠
0.2.106: a) 𝑥 = 9𝑒 −4𝑡 b) 𝑥 = cos(2𝑡) + sin(2𝑡) c) 𝑝 = 4𝑒 3𝑞 d) 𝑇 = 3 sinh(2𝑥)
0.3.101: a) PDE, equation, second-order, linear, nonhomogeneous, constant-coefficient.
b) ODE, equation, first-order, linear, nonhomogeneous, not constant-coefficient, not autonomous.
c) ODE, equation, seventh-order, linear, homogeneous, constant-coefficient, autonomous.
d) ODE, equation, second-order, linear, nonhomogeneous, constant-coefficient, autonomous.
e) ODE, system, second-order, nonlinear.
f) PDE, equation, second-order, nonlinear.
0.3.102: equation: $a(x) y = b(x)$, solution: $y = \frac{b(x)}{a(x)}$.
0.3.103: 𝑘 = 0 or 𝑘 = 1
1.1.101: $y = e^x + \frac{x^2}{2} + 9$
1.1.102: 𝑥 = (3𝑡 − 2)1/3

1.1.103: $x = \sin^{-1}\bigl(t + 1/\sqrt{2}\bigr)$

1.1.104: 170
1.1.105: If $n \neq 1$, then $y = \bigl((1-n)x + 1\bigr)^{1/(1-n)}$. If $n = 1$, then $y = e^x$.
1.1.106: The equation is 𝑟 ′ = −𝐶 for some constant 𝐶. The snowball will be completely
melted in 25 minutes from time 𝑡 = 0.
1.1.107: 𝑦 = 𝐴𝑥 3 + 𝐵𝑥 2 + 𝐶𝑥 + 𝐷, so 4 constants.
1.2.101: 𝑦 = 0 is a solution such that 𝑦(0) = 0.

1.2.102: Yes, a solution exists. The equation is $y' = f(x, y)$ where $f(x, y) = xy$. The
function $f(x, y)$ is continuous and $\frac{\partial f}{\partial y} = x$, which is also continuous near (0, 0). So a
solution exists and is unique. (In fact, $y = 0$ is the solution.)
1.2.103: No, the equation is not defined at (𝑥, 𝑦) = (1, 0).
1.2.104: a) 𝑦 ′ = cos 𝑦, b) 𝑦 ′ = 𝑦 cos(𝑥), c) 𝑦 ′ = sin 𝑥. Justification left to reader.
1.2.105: Picard does not apply as $f$ is not continuous at $y = 0$. The equation does not
have a continuously differentiable solution. Suppose it did. Notice that $y'(0) = 1$. By
the first derivative test, $y(x) > 0$ for small positive $x$. But then for those $x$, we have
$y'(x) = f\bigl(y(x)\bigr) = 0$. It is not possible for $y'$ to be continuous, $y'(0) = 1$ and $y'(x) = 0$ for
arbitrarily small positive $x$.
1.2.106: The solution is $y(x) = \int_{x_0}^x f(s) \, ds + y_0$, and this does indeed exist for every $x$.

1.3.101: $y = C e^{x^2}$
1.3.102: $x = e^{t^3} + 1$
1.3.103: $x^3 + x = t + 2$
1.3.104: $y = \frac{1}{1 - \ln x}$
1.3.105: sin(𝑦) = − cos(𝑥) + 𝐶
1.3.106: The range is approximately 7.45 to 12.15 minutes.
1.3.107: a) $x = \frac{1000 e^t}{e^t + 24}$. b) 102 rabbits after one month, 861 after 5 months, 999 after 10
months, 1000 after 15 months.
1.4.101: $y = C e^{-x^3} + 1/3$
1.4.102: $y = 2 e^{\cos(2x) + 1} + 1$
1.4.103: 250 grams
1.4.104: $P(5) = 1000 e^{2 \times 5 - 0.05 \times 5^2} = 1000 e^{8.75} \approx 6.31 \times 10^6$
1.4.105: $A h' = I - k h$, where $k$ is a constant with units m²/s.
1.5.101: $y = \frac{2}{3x - 2}$
1.5.102: $y = \frac{3 - x^2}{2x}$
1.5.103: $y = \bigl(7 e^{3x} + 3x + 1\bigr)^{1/3}$
1.5.104: $y = \pm\sqrt{x^2 - \ln(C - x)}$
1.6.101: a) 0, 1, 2 are critical points. b) 𝑥 = 0 is unstable (semistable), 𝑥 = 1 is stable,
and 𝑥 = 2 is unstable. c) 1
1.6.102: a) There are no critical points. b) ∞

1.6.103: a) $\frac{dx}{dt} = kx(M - x) + A$    b) $\frac{kM + \sqrt{(kM)^2 + 4Ak}}{2k}$
1.6.104: a) 𝛼 is a stable critical point, 𝛽 is an unstable one. b) 𝛼, c) 𝛼, d) ∞ or DNE.
1.7.101: Approximately: 1.0000, 1.2397, 1.3829

1.7.102: a) 0, 8, 12 b) 𝑥(4) = 16, so errors are: 16, 8, 4. c) Factors are 0.5 and 0.5.
1.7.103: a) 0, 0, 0 b) 𝑥 = 0 is a solution so errors are: 0, 0, 0.
1.7.104: a) Improved Euler: 𝑦(1) ≈ 3.3897 for ℎ = 1/4, 𝑦(1) ≈ 3.4237 for ℎ = 1/8, b)
Standard Euler: 𝑦(1) ≈ 2.8828 for ℎ = 1/4, 𝑦(1) ≈ 3.1316 for ℎ = 1/8, c) 𝑦 = 2𝑒 𝑥 − 𝑥 − 1, so
𝑦(1) is approximately 3.4366. d) Approximate errors for improved Euler: 0.046852 for
ℎ = 1/4 and 0.012881 for ℎ = 1/8. For standard Euler: 0.55375 for ℎ = 1/4 and 0.30499 for
ℎ = 1/8. Factor is approximately 0.27 for improved Euler and 0.55 for standard Euler.
1.8.101: a) 𝑒 𝑥 𝑦 + sin(𝑥) = 𝐶 b) 𝑥 2 + 𝑥𝑦 − 2𝑦 2 = 𝐶 c) 𝑒 𝑥 + 𝑒 𝑦 = 𝐶 d) 𝑥 3 + 3𝑥𝑦 + 𝑦 3 = 𝐶
1.8.102: a) Integrating factor is $y$, equation becomes $dx + 3y^2 \, dy = 0$. b) Integrating
factor is $e^x$, equation becomes $e^x \, dx - e^{-y} \, dy = 0$. c) Integrating factor is $y^2$, equation
becomes $(\cos(x) + y) \, dx + x \, dy = 0$. d) Integrating factor is $x$, equation becomes
$(2xy + y^2) \, dx + (x^2 + 2xy) \, dy = 0$.
1.8.103: a) The equation is $-f(x) \, dx + \frac{1}{g(y)} \, dy = 0$, and this is exact because $M = -f(x)$,
$N = \frac{1}{g(y)}$, so $M_y = 0 = N_x$. b) $-x \, dx + \frac{1}{y} \, dy = 0$, leads to potential function $F(x, y) =
-\frac{x^2}{2} + \ln\lvert y \rvert$, solving $F(x, y) = C$ leads to the same solution as the example.
1.9.101: a) $u = \frac{1}{1 + (x + 5t)^2}$    b) $u = \cos(x - 2t)$
1.9.102: $u = \cos(x - t) e^{-t^2/2}$
1.9.103: 𝑢 = 𝑥 + 4𝑡
2.1.101: Yes. To justify try to find a constant 𝐴 such that sin(𝑥) = 𝐴𝑒 𝑥 for all 𝑥.
2.1.102: No. 𝑒 𝑥+2 = 𝑒 2 𝑒 𝑥 .
2.1.103: 𝑦=5
2.1.104: 𝑦 = 𝐶1 ln(𝑥) + 𝐶2
2.1.105: 𝑦 ′′ − 3𝑦 ′ + 2𝑦 = 0
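2.2.101: $y = C_1 e^{(-2+\sqrt{2})x} + C_2 e^{(-2-\sqrt{2})x}$
2.2.102: $y = C_1 e^{3x} + C_2 x e^{3x}$
2.2.103: $y = e^{-x/4} \cos\bigl((\sqrt{7}/4)x\bigr) - \sqrt{7} e^{-x/4} \sin\bigl((\sqrt{7}/4)x\bigr)$
2.2.104: $y = \frac{2(a-b)}{5} e^{-3x/2} + \frac{3a+2b}{5} e^x$
2.2.105: $z(t) = 2 e^{-t} \cos(t)$
2.2.106: $y = \frac{a\beta - b}{\beta - \alpha} e^{\alpha x} + \frac{b - a\alpha}{\beta - \alpha} e^{\beta x}$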
2.2.107: 𝑦 ′′ − 𝑦 ′ − 6𝑦 = 0
2.3.101: 𝑦 = 𝐶1 𝑒 𝑥 + 𝐶2 𝑥 3 + 𝐶3 𝑥 2 + 𝐶4 𝑥 + 𝐶5
2.3.102: a) 𝑟 3 − 3𝑟 2 + 4𝑟 − 12 = 0 b) 𝑦 ′′′ − 3𝑦 ′′ + 4𝑦 ′ − 12𝑦 = 0 c) 𝑦 = 𝐶1 𝑒 3𝑥 + 𝐶2 sin(2𝑥) +
𝐶3 cos(2𝑥)
2.3.103: 𝑦 = 0
2.3.104: No. 𝑒 1 𝑒 𝑥 − 𝑒 𝑥+1 = 0.

2.3.105: Yes. (Hint: First note that sin(𝑥) is bounded. Then note that 𝑥 and 𝑥 sin(𝑥) cannot
be multiples of each other.)
2.3.106: 𝑦 ′′′ − 𝑦 ′′ + 𝑦 ′ − 𝑦 = 0
2.4.101: 𝑘 = 8/9 (and larger)
2.4.102: a) $0.05 I'' + 0.1 I' + (1/5) I = 0$    b) $I = C e^{-t} \cos(\sqrt{3}\, t - \gamma)$    c) $I = 10 e^{-t} \cos(\sqrt{3}\, t) + \frac{10}{\sqrt{3}} e^{-t} \sin(\sqrt{3}\, t)$
2.4.103: a) $k = 500000$    b) $\frac{1}{5\sqrt{2}} \approx 0.141$    c) 45000 kg    d) 11250 kg
2.4.104: $m_0 = \frac{1}{3}$. If $m < m_0$, then the system is overdamped and will not oscillate.
2.5.101: $y = \frac{-16 \sin(3x) + 6 \cos(3x)}{73}$
2.5.102: a) $y = \frac{2e^x + 3x^3 - 9x}{6}$    b) $y = C_1 \cos(\sqrt{2}\, x) + C_2 \sin(\sqrt{2}\, x) + \frac{2e^x + 3x^3 - 9x}{6}$
2.5.103: $y(x) = x^2 - 4x + 6 + e^{-x}(x - 5)$
2.5.104: $y = \frac{2x e^x - (e^x + e^{-x}) \ln(e^{2x} + 1)}{4}$
2.5.105: $y = \frac{-\sin(x + c)}{3} + C_1 e^{\sqrt{2}\, x} + C_2 e^{-\sqrt{2}\, x}$
2.5.106: 𝑦= 𝑥 2 + 2𝑥 + 3

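2.6.101: $\omega = \frac{\sqrt{31}}{4\sqrt{2}} \approx 0.984$,  $C(\omega) = \frac{16}{3\sqrt{7}} \approx 2.016$
2.6.102: $x_{sp} = \frac{(\omega_0^2 - \omega^2) F_0}{m(2\omega p)^2 + m(\omega_0^2 - \omega^2)^2} \cos(\omega t) + \frac{2\omega p F_0}{m(2\omega p)^2 + m(\omega_0^2 - \omega^2)^2} \sin(\omega t) + \frac{A}{k}$, where $p = \frac{c}{2m}$ and
$\omega_0 = \sqrt{\frac{k}{m}}$.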
2.6.103: a) 𝜔 = 2 b) 25
𝐶1 3𝑥 𝐶1 3𝑥
3.1.101: 𝑦1 = 𝐶1 𝑒 3𝑥 , 𝑦2 = 𝑦(𝑥) = 𝐶2 𝑒 𝑥 + 2 𝑒 , 𝑦3 = 𝑦(𝑥) = 𝐶3 𝑒 𝑥 + 2 𝑒
3.1.102: 𝑥 = 53 𝑒 2𝑡 − 23 𝑒 −𝑡 , 𝑦 = 35 𝑒 2𝑡 + 43 𝑒 −𝑡
3.1.103: 𝑥1′ = 𝑥2 , 𝑥 2′ = 𝑥3 , 𝑥3′ = 𝑥1 + 𝑡
3.1.104: 𝑦3′ + 𝑦1 + 𝑦2 = 𝑡, 𝑦4′ + 𝑦1 − 𝑦2 = 𝑡 2 , 𝑦1′ = 𝑦3 , 𝑦2′ = 𝑦4
3.1.105: 𝑥1 = 𝑥2 = 𝑎𝑡. Explanation of the intuition is left to reader.
3.1.106: a) Left to reader. b) 𝑥1′ = 𝑟−𝑠 𝑟 𝑟
𝑉 𝑥 2 − 𝑉 𝑥 1 , 𝑥 2 = 𝑉 (𝑥 1 − 𝑥 2 ).

c) As 𝑡 goes to infinity,
both 𝑥 1 and 𝑥 2 go to zero, explanation is left to reader.
3.2.101: −15
3.2.102: −2
3.2.103: 𝑥® = −5
 15 
1 
h
1/𝑎
i /𝑎 0 0
0
3.2.104: a) 0 1/𝑏 b) 0 1/𝑏 0
0 0 1/𝑐
3.3.101: Yes.
 𝑡
𝑒 𝑒 −𝑡 = 0®
 cosh(𝑡)   
3.3.102: No. 2 1
− 1
− 1
 𝑥 ′  3 −1   𝑥   𝑡
3.3.103: = + 𝑒
𝑦 𝑡 0 𝑦 0
2
𝐶2 𝑒 𝑡 +𝐶1
h i
a) 𝑥® ′ = 𝑥® b) 𝑥® =
 0 2𝑡 
3.3.104: 0 2𝑡 2
𝐶2 𝑒 𝑡
h1i h0i h 3
i
3.4.101: a) Eigenvalues: 4, 0, −1 Eigenvectors: 0 , 1 , 5
h1i h0i h i 1 0 −2
3
b) 𝑥® = 𝐶1 0 𝑒 4𝑡 + 𝐶2 1 + 𝐶3 5 𝑒 −𝑡
1 0 −2
√ √ h i h i
1+ 3𝑖 1− 3𝑖 −2 −2
3.4.102: a) Eigenvalues: 2 , 2 , Eigenvectors: √
1− 3𝑖
, √
1+ 3𝑖
 √    √  
3𝑡 3𝑡
−2 cos 2 −2 sin 2
b) 𝑥® = 𝐶1 𝑒 𝑡/2  √√
3𝑡

3𝑡
 + 𝐶2 𝑒 𝑡/2 √
3𝑡
 √ √
3𝑡

cos + 3 sin
2 2 sin 2 − 3 cos 2

𝑥® = 𝐶1 𝑒 𝑡 + 𝐶2 𝑒
1  1  −𝑡
3.4.103: 1 −1
h i h i
cos(𝑡) sin(𝑡)
3.4.104: 𝑥® = 𝐶1 − sin(𝑡) + 𝐶2 cos(𝑡)

3.5.101: a) Two eigenvalues: ± 2 so the behavior is a saddle. b) Two eigenvalues: 1
and 2, so the behavior is a source. c) Two eigenvalues: ±2𝑖, so the behavior is a center
(ellipses). d) Two eigenvalues: −1 and −2, so the behavior is a sink. e) Two eigenvalues:
5 and −3, so the behavior is a saddle.
3.5.102: Spiral source.
3.5.103:

The solution does not move anywhere if 𝑦 = 0. When 𝑦 is positive, the solution moves
(with constant speed) in the positive 𝑥 direction. When 𝑦 is negative, the solution moves
(with constant speed) in the negative 𝑥 direction. It is not one of the behaviors we saw.
Note that the matrix has a double eigenvalue 0 and the general solution is 𝑥 = 𝐶1 𝑡 + 𝐶2
and 𝑦 = 𝐶1 , which agrees with the description above.
h 1 i √ √  h 0 i √ √ 
3.6.101: 𝑥® = −1 𝑎 1 cos( 3 𝑡) + 𝑏 1 sin( 3 𝑡) + 1 𝑎 2 cos( 2 𝑡) + 𝑏 2 sin( 2 𝑡) +
1   −2
h0i −1
𝑎 3 cos(𝑡) + 𝑏 3 sin(𝑡) +

0 1/2
cos(2𝑡)
1 −1/3

𝑘 0
h𝑚 0 0
i h −𝑘 i h 1
i p p
𝑥® ′′ = 𝑥®. Solution: 𝑥® = 𝑎 1 cos( 3𝑘/𝑚 𝑡)+ 𝑏 1 sin( 3𝑘/𝑚 𝑡)

3.6.102: 0 𝑚 0 𝑘 −2𝑘 𝑘 −2
h i 0 0 𝑚 0 𝑘 −𝑘 h1i 1
1 p p
𝑎 2 cos( 𝑘/𝑚 𝑡) + 𝑏2 sin( 𝑘/𝑚 𝑡) 𝑎3 𝑡 + 𝑏3 .
 
+ 0 + 1
−1 1
p
3.6.103: 𝑥2 = (2/5) cos( 1/6 𝑡) − (2/5) cos(𝑡)
h1i h 1
i h 0
i
3.7.101: a) 3, 0, 0 b) No defects. c) 𝑥® = 𝐶1 1 𝑒 3𝑡 + 𝐶2 0 + 𝐶3 1
1 −1 −1

3.7.102: a) 1, 1, 2
b) Eigenvalue
h i 1 has a defect
 h i of 1h i h i
0 1 0 3
c) 𝑥® = 𝐶1 1 𝑒𝑡 + 𝐶2 0 +𝑡 1 𝑒𝑡 + 𝐶3 3 𝑒 2𝑡
−1 0 −1 −2
3.7.103: a) 2, 2, 2
b) Eigenvalue
h 0 i 2 has a defect
 h 0 i of 2h 0 i  h 1 i h 0 i h 0 i
𝑡2
c) 𝑥® = 𝐶1 3 𝑒 + 𝐶2 −1 + 𝑡 3 𝑒 + 𝐶3 0 + 𝑡 −1 +
2𝑡 2𝑡
2
3 𝑒 2𝑡
1 0 1 0 0 1

𝐴=
5 5
3.7.104: 05
 
𝑒 3𝑡 +𝑒 −𝑡 𝑒 −𝑡 −𝑒 3𝑡
3.8.101: 𝑒 𝑡𝐴 = 2
𝑒 −𝑡 −𝑒 3𝑡
2
𝑒 3𝑡 +𝑒 −𝑡
2 2

3𝑒 𝑡 3𝑡
" #
2𝑒 3𝑡 −4𝑒 2𝑡 +3𝑒 𝑡 2 − 3𝑒2 −𝑒 3𝑡 +4𝑒 2𝑡 −3𝑒 𝑡
3.8.102: 𝑒 𝑡𝐴 = 2𝑒 𝑡 −2𝑒 2𝑡 𝑒𝑡 2𝑒 2𝑡 −2𝑒 𝑡
3𝑒 𝑡 3𝑒 3𝑡
2𝑒 −5𝑒 2𝑡 +3𝑒 𝑡
3𝑡
2 − 2 −𝑒 +5𝑒 2𝑡 −3𝑒 𝑡
3𝑡

h i h i
(𝑡+1) 𝑒 2𝑡 −𝑡𝑒 2𝑡 (1−𝑡) 𝑒 2𝑡
3.8.103: a) 𝑒 𝑡𝐴 = 𝑡𝑒 2𝑡 (1−𝑡) 𝑒 2𝑡
b) 𝑥® = (2−𝑡) 𝑒 2𝑡
h i
1+2𝑡+5𝑡 2 3𝑡+6𝑡 2 𝑒 0.1𝐴 ≈
 1.25 0.36 
3.8.104: 2𝑡+4𝑡 2 1+2𝑡+5𝑡 2 0.24 1.25

5(3𝑛 ) − 2𝑛+2 4(3𝑛 ) − 2𝑛+2 3 − 2(3𝑛 ) 2(3𝑛 ) − 2


   
3.8.105: a) b)
5(2𝑛 ) − 5(3𝑛 ) 5(2𝑛 ) − 4(3𝑛 ) 3 − 3𝑛+1 3𝑛+1 − 2
   
1 0 0 1
c) if 𝑛 is even, and if 𝑛 is odd.
0 1 1 0
3.9.101: The general solution is (particular solutions should agree with one of these):
𝑥(𝑡) = 𝐶1 𝑒 9𝑡 + 4𝐶2 𝑒 4𝑡 − 𝑡/3 − 5/54, 𝑦(𝑡) = 𝐶1 𝑒 9𝑡 − 𝐶2 𝑒 4𝑡 + 𝑡/6 + 7/216
3.9.102: The general solution is (particular solutions should agree with one of these):
𝑥(𝑡) = 𝐶1 𝑒 𝑡 + 𝐶2 𝑒 −𝑡 + 𝑡𝑒 𝑡 , 𝑦(𝑡) = 𝐶1 𝑒 𝑡 − 𝐶2 𝑒 −𝑡 + 𝑡𝑒 𝑡
5 𝑡
𝑥® = 2𝑒 −𝑡−1 + −1 −𝑡
2 𝑒
1  
1

3.9.103: 1 −1
 1    √   √ 
𝑡 cos(𝑡)
3.9.104: 𝑥® = 9
1
140 + 1√
𝑒 6𝑡 + 1
140 − 1√
𝑒− 6𝑡 − 60 − 70
120 6 120 6
 1
  −9 1 9𝑡 cos(𝑡)

+ −1 80 sin(2𝑡) + 30 cos(2𝑡) + 40 − 30
q
4.1.101: 𝜔=𝜋 15
2

4.1.102: 𝜆 𝑘 = 4𝑘 2 𝜋2 for 𝑘 = 1, 2, 3, . . . 𝑥 𝑘 = cos(2𝑘𝜋𝑡) + 𝐵 sin(2𝑘𝜋𝑡) (for any 𝐵)


4.1.103: 𝑥(𝑡) = − sin(𝑡)
4.1.104: General solution is 𝑥 = 𝐶𝑒 −𝜆𝑡 . Since 𝑥(0) = 0 then 𝐶 = 0, and so 𝑥(𝑡) = 0.
Therefore, the solution is always identically zero. One condition is always enough to
guarantee a unique solution for a first-order equation.
√ √
3
√ √ √ 3 √ √ 3
3 −3 𝜆 3 3 𝜆 3 𝜆
4.1.105: 3 𝑒 2 − 3 cos 2 + sin 2 =0
4.2.101: sin(𝑡)


Í (𝜋−𝑛) sin(𝜋𝑛+𝜋2 )+(𝜋+𝑛) sin(𝜋𝑛−𝜋2 )
4.2.102: 𝜋𝑛 2 −𝜋3
sin(𝑛𝑡)
𝑛=1
1
4.2.103: 2 − 12 cos(2𝑡)

𝜋4 Í (−1)𝑛 (8𝜋2 𝑛 2 −48)
4.2.104: 5 + 𝑛4
cos(𝑛𝑡)
𝑛=1

16(−1)𝑛 𝑛𝜋 𝜋
2 𝑡 2𝑡 cos 𝜋𝑡 − 2 𝑡
8 8 16 4 16 3𝜋
Í    
4.3.101: a) 6 + 𝜋2 𝑛 2
cos b) 6 − 𝜋2
cos + 𝜋2 9𝜋2
cos +···
𝑛=1

(−1)𝑛+1 2𝜆 𝑛𝜋 𝜋 𝜆
𝜆 𝑡 𝜆𝑡 𝜆 𝑡 𝜆 𝑡
2𝜆 2𝜋 2𝜆 3𝜋
Í    
4.3.102: a) 𝑛𝜋 sin b) 𝜋 sin − 𝜋 sin + 3𝜋 sin −···
𝑛=1

𝜋
𝑓 ′(𝑡) =
Í
4.3.103: 𝑛 2 +1
cos(𝑛𝜋𝑡)
𝑛=1

𝑡
a) 𝐹(𝑡) = +𝐶+ 1
Í
4.3.104: 2 𝑛4
sin(𝑛𝑡) b) no
𝑛=1

(−1)𝑛+1
b) 𝑓 is continuous at 𝑡 = 𝜋/2 so the Fourier series converges
Í
4.3.105: a) 𝑛 sin(𝑛𝑡)
𝑛=1

(−1)𝑛+1
to 𝑓 (𝜋/2) = 𝜋/4. Obtain 𝜋/4 =
Í
2𝑛−1 = 1 − 1/3 + 1/5 − 1/7 + · · · . c) Using the first 4
𝑛=1
terms get 76/105 ≈ 0.72 (quite a bad approximation, you would have to take about 50 terms
to start to get to within 0.01 of 𝜋/4).
4.3.106: a) 𝐹(0) = 1, b) 𝐹(−1) = 0, c) 𝐹(1) = 2, d) 𝐹(−2) = 1, e) 𝐹(4) = 1, f) 𝐹(−9) = 0
∞ ∞
−4 𝑛𝜋 2(−1)𝑛+1 𝑛𝜋
3 𝑡 3 𝑡
Í  Í 
4.4.101: a) 1/2 + 𝜋2 𝑛 2
cos b) 𝜋𝑛 sin
𝑛=1 𝑛=1
𝑛 odd

Í −4𝑛
4.4.102: a) cos(2𝑡) b) 𝜋𝑛 2 −4𝜋
sin(𝑛𝑡)
𝑛=1
𝑛 odd
4.4.103: a) 𝑓 (𝑡) b) 0

Í −1
4.4.104: 𝑛 2 (1+𝑛 2 )
sin(𝑛𝑡)
𝑛=1

𝑡 Í 1
4.4.105: 𝜋 + 2𝑛 (𝜋−𝑛 2 )
sin(𝑛𝑡)
𝑛=1

4.5.101: 𝑥= √ 1 sin(2𝜋𝑡) + √ 0.1 2 cos(10𝜋𝑡)


2−4𝜋2 2−100𝜋

Í 𝑒 −𝑛
4.5.102: 𝑥= 2 cos(2𝑛𝑡)
𝑛=1 3−(2𝑛)

𝑥= 1 √−4
Í
4.5.103: √ + cos(𝑛𝜋𝑡)
𝑛=1 𝑛 𝜋 ( 3−𝑛 𝜋 )
2 3 2 2 2 2

𝑛 odd

𝑥= 1 2
𝑡 −4
Í
4.5.104: 2𝜋2
− 𝜋 3 sin(𝜋𝑡) + 𝑛 2 𝜋4 (1−𝑛 2 )
cos(𝑛𝜋𝑡)
𝑛=3
𝑛 odd

4.6.101: 𝑢(𝑥, 𝑡) = 5 sin(𝑥) 𝑒 −3𝑡 + 2 sin(5𝑥) 𝑒 −75𝑡


4.6.102: 𝑢(𝑥, 𝑡) = 1 + 2 cos(𝑥) 𝑒 −0.1𝑡
4.6.103: 𝑢(𝑥, 𝑡) = 𝑒 𝜆𝑡 𝑒 𝜆𝑥 for some 𝜆
4.6.104: 𝑢(𝑥, 𝑡) = 𝐴𝑒 𝑥 + 𝐵𝑒 𝑡
4.6.105: a) 0, b) minimum −100, maximum 100, c) 𝑡 = ln 2
4𝜋2
.
𝑦(𝑥, 𝑡) = sin(𝑥) sin(𝑡) + cos(𝑡)

4.7.101:
4.7.102: 𝑦(𝑥, 𝑡) = 1 1
5𝜋 sin(𝜋𝑥) sin(5𝜋𝑡) + 100𝜋 sin(2𝜋𝑥) sin(10𝜋𝑡)

2(−1)𝑛+1 √
𝑦(𝑥, 𝑡) = 2 𝑡)
Í
4.7.103: 𝑛 sin(𝑛𝑥) cos(𝑛
𝑛=1
4.7.104: 𝑦(𝑥, 𝑡) = sin(2𝑥) + 𝑡 sin(𝑥)
sin(2𝜋(𝑥−3𝑡))+sin(2𝜋(3𝑡+𝑥)) cos(3𝜋(𝑥−3𝑡))−cos(3𝜋(3𝑡+𝑥))
4.8.101: 𝑦(𝑥, 𝑡) = 2 + 18𝜋


 𝑥 − 𝑥 2 − 0.04 if 0.2 ≤ 𝑥 ≤ 0.8


4.8.102: a) 𝑦(𝑥, 0.1) = 0.6𝑥 if 𝑥 ≤ 0.2

 0.6 − 0.6𝑥 if 𝑥 ≥ 0.8


b) 𝑦(𝑥, 1/2) = −𝑥 + 𝑥 2 c) 𝑦(𝑥, 1) = 𝑥 − 𝑥 2
4.8.103: a) 𝑦(1, 1) = −1/2 b) 𝑦(4, 3) = 0 c) 𝑦(3, 9) = 1/2
∞  
sinh(𝑛𝜋(1−𝑦))
𝑢(𝑥, 𝑦) = 1
Í
4.9.101: 𝑛2
sin(𝑛𝜋𝑥) sinh(𝑛𝜋)
𝑛=1
 
sinh(𝜋(2−𝑦))
4.9.102: 𝑢(𝑥, 𝑦) = 0.1 sin(𝜋𝑥) sinh(2𝜋)

1 𝑛
𝑢 =1+ 𝑟
Í
4.10.101: 𝑛2
sin(𝑛𝜃)
𝑛=1
4.10.102: 𝑢 =1−𝑥
4.10.103: a) 𝑢 = −1
4 𝑟 +
2 1
4 b) 𝑢 = −1 2
4 𝑟 + 1
4 + 𝑟 2 sin(2𝜃)
𝜋
𝜌2 − 𝑟 2

1
4.10.104: 𝑢(𝑟, 𝜃) = 𝑔(𝛼) 𝑑𝛼
2𝜋 −𝜋 𝜌2 − 2𝑟𝜌 cos(𝜃 − 𝛼) + 𝑟 2
𝑛 2 𝜋2 𝑛𝜋 𝑛𝜋
5.1.101: 𝜆𝑛 = 𝑛 = 1, 2, 3, . . .. For odd 𝑛, 𝑦𝑛 = cos 2 𝑥 , for even 𝑛, 𝑦𝑛 = sin 2 𝑥
 
4 , .
5.1.102: a) 𝑝(𝑥) = 1, 𝑞(𝑥) = 0, 𝑟(𝑥) = 𝑥1 , 𝛼1 = 1, 𝛼2 = 0, 𝛽 1 = 1, 𝛽 2 = 0. The problem is not
regular. b) 𝑝(𝑥) = 1 + 𝑥 2 , 𝑞(𝑥) = 𝑥 2 , 𝑟(𝑥) = 1, 𝛼1 = 1, 𝛼2 = 0, 𝛽 1 = 1, 𝛽 2 = 1. The problem is
regular.
5.2.101: 𝑦(𝑥, 𝑡) = sin(𝜋𝑥) cos(2𝜋2 𝑡)
5.2.102: 9𝑦 𝑥𝑥𝑥𝑥 + 𝑦𝑡𝑡 = 0 (0 < 𝑥 < 10, 𝑡 > 0), 𝑦(0, 𝑡) = 𝑦 𝑥 (0, 𝑡) = 0, 𝑦(10, 𝑡) =
𝑦 𝑥 (10, 𝑡) = 0, 𝑦(𝑥, 0) = sin2 (𝜋𝑥), 𝑦𝑡 (𝑥, 0) = 𝑥(10 − 𝑥).
∞  
cos(𝑛)−1
𝑦 𝑝 (𝑥, 𝑡) = −4
Í
5.3.101: 𝜋𝑛 4
cos(𝑛𝑥) − sin(𝑛)
sin(𝑛𝑥) − 1 cos(𝑛𝑡).
𝑛=1
𝑛 odd

5.3.102: Approximately 1991 centimeters


6.1.101: $\frac{8}{s^3} + \frac{8}{s^2} + \frac{4}{s}$
6.1.102: $2t^2 - 2t + 1 - e^{-2t}$
6.1.103: $\frac{1}{(s+1)^2}$
6.1.104: $\frac{1}{s^2 + 2s + 2}$
6.2.101: $f(t) = (t - 1)\bigl(u(t - 1) - u(t - 2)\bigr) + u(t - 2)$
6.2.102: $x(t) = (2e^{t-1} - t^2 - 1) u(t - 1) - \frac{1}{2} e^{-t} + \frac{3}{2} e^t$
6.2.103: $H(s) = \frac{1}{s+1}$
6.2.104: $F(s) = \frac{4}{s}$, $X(s) = \frac{1}{s} - \frac{s}{s^2+4}$, $H(s) = \frac{1}{s^2+4}$.
6.3.101: $\frac{1}{2} (\cos t + \sin t - e^{-t})$
6.3.102: $5t - 5 \sin t$
6.3.103: $\frac{1}{2} (\sin t - t \cos t)$
6.3.104: $\int_0^t f(\tau) \bigl(1 - \cos(t - \tau)\bigr) \, d\tau$
6.4.101: 𝑥(𝑡) = 𝑡
6.4.102: 𝑥(𝑡) = 𝑒 −𝑎𝑡
6.4.103: 𝑥(𝑡) = (𝑒 𝑡 ) ∗ (𝑒 𝑡 sin(𝑡)) = 𝑒 𝑡 (1 − cos(𝑡))
6.4.104: 𝛿(𝑡) − sin(𝑡)
6.4.105: 3𝛿(𝑡 − 1) + 2𝑡
6.5.101: 𝑦 = (𝑥 − 𝑡)𝑢(𝑡 − 𝑥) + 𝑡
6.5.102: 𝑦 = 𝑒 −𝑐𝑥 sin(𝑡 − 𝑥)𝑢(𝑡 − 𝑥)
𝑥
6.5.103: 𝑠 2𝑌(𝑥) − 𝑠(1 − 𝑥 2 ) + 3𝑌 ′′(𝑥) + 𝑌(𝑥) = 𝑠 + 1
𝑠2
, 𝑌(−1) = 0, 𝑌(1) = 0.
6.5.104: 𝑦 = sin(𝑥) cos(𝑡)
∫𝑡 −𝑥 2
𝑥
6.5.105: 𝑦(𝑥, 𝑡) = 0
𝑓 (𝜏) √ 𝑒 4(𝑡−𝜏) 𝑑𝜏
2 𝜋(𝑡−𝜏)3

7.1.101: Yes. Radius of convergence is 10.


7.1.102: Yes. Radius of convergence is 𝑒.

7.1.103: $\frac{1}{1-x} = -\frac{1}{1-(2-x)}$ so $\frac{1}{1-x} = \sum_{n=0}^\infty (-1)^{n+1} (x - 2)^n$, which converges for $1 < x < 3$.

1
𝑥𝑛
Í
7.1.104: (𝑛−7)!
𝑛=7
7.1.105: 𝑓 (𝑥) − 𝑔(𝑥) is a polynomial. Hint: Use Taylor series.

Õ ∞
Õ ∞
Õ
7.1.106: a) (𝑘 − 3)(𝑘 − 4)𝑥 𝑘 b) (𝑘 + 2)𝑥 𝑘 c) 2(𝑘 + 2)𝑥 𝑘
𝑘=6 𝑘=0 𝑘=3

7.2.101: 𝑎2 = 0, 𝑎3 = 0, 𝑎4 = 0, recurrence relation (for 𝑘 ≥ 5): 𝑎 𝑘 = −2𝑎 𝑘−5


𝑘(𝑘−1)
, so:
𝑎0 5 𝑎1 6 𝑎 0 10 𝑎1 11 𝑎0 𝑎1
𝑦(𝑥) = 𝑎0 + 𝑎 1 𝑥 − 10 𝑥 − 15 𝑥 + 450 𝑥 + 825 𝑥 − 47250 𝑥 − 99000 𝑥 + · · ·
15 16

𝑎 𝑘−3 +1
7.2.102: a) 𝑎2 = 21 , and for 𝑘 ≥ 3 we have 𝑎 𝑘 = 𝑘(𝑘−1)
, so
𝑎 +1 𝑎 +1 𝑎 +7 𝑎 +13 43 8 𝑎0 +187 9 𝑎 1 +517 10
𝑦(𝑥) = 𝑎0 +𝑎1 𝑥+ 21 𝑥 2 + 06 𝑥 3 + 112 𝑥 4 + 40 𝑥 + 180
3 5 0
𝑥 6 + 1504 𝑥 7 + 2240 𝑥 + 12960 𝑥 + 45360 𝑥 +· · ·
b) 𝑦(𝑥) = 2 𝑥 + 6 𝑥 + 12 𝑥 + 40 𝑥 + 180 𝑥 + 504 𝑥 + 2240 𝑥 + 12960 𝑥 + 45360
1 2 1 3 1 4 3 5 7 6 13 7 43 8 187 9
𝑥 +···
517 10

7.2.103: Applying the method of this section directly we obtain 𝑎 𝑘 = 0 for all 𝑘 and so
𝑦(𝑥) = 0 is the only solution we find.
7.3.101: a) ordinary, b) singular but not regular singular, c) regular singular, d) regular
singular, e) ordinary.
√ √
1+ 5 1− 5
7.3.102: 𝑦 = 𝐴𝑥 2 + 𝐵𝑥 2


(−1) 𝑘
𝑦 = 𝑥 3/2 𝑥𝑘 (Note that for convenience we did not pick 𝑎0 = 1.)
Í
7.3.103: 𝑘! (𝑘+2)!
𝑘=0
7.3.104: 𝑦 = 𝐴𝑥 + 𝐵𝑥 ln(𝑥)
8.1.101: a) Critical points (1, 0) and (1, 1). At (1, 0) using 𝑢 = 𝑥 − 1, 𝑣 = 𝑦 the linearization
is 𝑢 ′ = 𝜋𝑣, 𝑣 ′ = −𝑣. At (1, 1) using 𝑢 = 𝑥 − 1, 𝑣 = 𝑦 − 1 the linearization is 𝑢 ′ = −𝜋𝑣, 𝑣 ′ = 𝑣.
b) Critical points (0, 0) and (0, −1). At (0, 0) using $u = x$, $v = y$ the linearization is
$u' = u + v$, $v' = u$. At (0, −1) using $u = x$, $v = y + 1$ the linearization is $u' = u - v$, $v' = u$.
c) Critical point (1/2, −1/4). Using 𝑢 = 𝑥 − 1/2, 𝑣 = 𝑦 + 1/4 the linearization is 𝑢 ′ = −𝑢 + 𝑣,
𝑣 ′ = 𝑢 + 𝑣.
8.1.102: 1) is c), 2) is a), 3) is b)
8.1.103: Critical points are (0, 0, 0), and (−1, 1, −1). The linearization at the origin using
variables 𝑢 = 𝑥, 𝑣 = 𝑦, 𝑤 = 𝑧 is 𝑢 ′ = 𝑢, 𝑣 ′ = −𝑣, 𝑧 ′ = 𝑤. The linearization at the point
(−1, 1, −1) using variables 𝑢 = 𝑥 + 1, 𝑣 = 𝑦 − 1, 𝑤 = 𝑧 + 1 is 𝑢 ′ = 𝑢 − 2𝑤, 𝑣 ′ = −𝑣 − 2𝑤,
𝑤 ′ = 𝑤 − 2𝑢.
8.1.104: 𝑢 ′ = 𝑓 (𝑢, 𝑣, 𝑤), 𝑣 ′ = 𝑔(𝑢, 𝑣, 𝑤), 𝑤 ′ = 1.
8.2.101: a) (0, 0): saddle (unstable), (1, 0): source (unstable), b) (0, 0): spiral sink
(asymptotically stable), (0, 1): saddle (unstable), c) (1, 0): saddle (unstable), (0, 1):
source (unstable)
8.2.102: a) 12 𝑦 2 + 31 𝑥 3 − 4𝑥 = 𝐶, critical points: (−2, 0), an unstable saddle, and (2, 0), a
stable center. b) 12 𝑦 2 + 𝑒 𝑥 = 𝐶, no critical points. c) 12 𝑦 2 + 𝑥𝑒 𝑥 = 𝐶, critical point at
(−1, 0) is a stable center.
p
8.2.103: Critical point at (0, 0). Trajectories are 𝑦 = ± 2𝐶 − (1/2)𝑥 4 , for 𝐶 > 0, these give
closed curves around the origin, so the critical point is a stable center.
8.2.104: A critical point 𝑥0 is stable if 𝑓 ′(𝑥 0 ) < 0 and unstable when 𝑓 ′(𝑥0 ) > 0.
8.3.101: a) Critical points are 𝜔 = 0, 𝜃 = 𝑘𝜋 for any integer 𝑘. When 𝑘 is odd, we have a
saddle point. When 𝑘 is even we get a sink. b) The findings mean the pendulum will
simply go to one of the sinks, for example (0, 0) and it will not swing back and forth. The
friction is too high for it to oscillate, just like an overdamped mass-spring system.

8.3.102: a) Solving for the critical points we get $(0, -h/d)$ and $\bigl(\frac{bh+ad}{ac}, \frac{a}{b}\bigr)$. The Jacobian
matrix at $(0, -h/d)$ is $\begin{bmatrix} a + bh/d & 0 \\ -ch/d & -d \end{bmatrix}$ whose eigenvalues are $a + bh/d$ and $-d$. The eigenvalues
are real of opposite signs and we get a saddle. (In the application, however, we are only
looking at the positive quadrant so this critical point is irrelevant.) At $\bigl(\frac{bh+ad}{ac}, \frac{a}{b}\bigr)$ we get
Jacobian matrix $\begin{bmatrix} 0 & -\frac{b(bh+ad)}{ac} \\ \frac{ac}{b} & \frac{bh+ad}{a} - d \end{bmatrix}$. b) For the specific numbers given, the second critical point
is $\bigl(\frac{550}{3}, 40\bigr)$, the matrix is $\begin{bmatrix} 0 & -11/6 \\ 3/25 & 1/4 \end{bmatrix}$, which has eigenvalues $\frac{5 \pm i\sqrt{327}}{40}$. Therefore there is a
spiral source; the solution spirals outwards. The solution eventually hits one of the axes,
$x = 0$ or $y = 0$, so something will die out in the forest.
8.3.103: The critical points are on the line 𝑥 = 0. In the positive quadrant the 𝑦 ′ is always
positive and so the fox population always grows. The constant of motion is 𝐶 = 𝑦 𝑎 𝑒 −𝑐𝑥−𝑏 𝑦 ,
for any 𝐶 this curve must hit the 𝑦-axis (why?), so the trajectory will simply approach a
point on the 𝑦 axis somewhere and the number of hares will go to zero.
8.4.101: Use Bendixson–Dulac Theorem. a) 𝑓𝑥 + 𝑔 𝑦 = 1 + 1 > 0, so no closed trajectories.
b) 𝑓𝑥 + 𝑔 𝑦 = − sin2 (𝑦) + 0 < 0 for all 𝑥, 𝑦 except the lines given by 𝑦 = 𝑘𝜋 (where we get
zero), so no closed trajectories. c) 𝑓𝑥 + 𝑔 𝑦 = 𝑦 2 + 0 > 0 for all 𝑥, 𝑦 except the line given by
𝑦 = 0 (where we get zero), so no closed trajectories.
8.4.102: Using Poincaré–Bendixson Theorem, the system has a limit cycle, which is the
unit circle centered at the origin, as 𝑥 = cos(𝑡) + 𝑒 −𝑡 , 𝑦 = sin(𝑡) + 𝑒 −𝑡 gets closer and closer
to the unit circle. Thus 𝑥 = cos(𝑡), 𝑦 = sin(𝑡) is the periodic solution.
8.4.103: 𝑓 (𝑥, 𝑦) = 𝑦, 𝑔(𝑥, 𝑦) = 𝜇(1 − 𝑥 2 )𝑦 − 𝑥. So 𝑓𝑥 + 𝑔 𝑦 = 𝜇(1 − 𝑥 2 ). The Bendixson–Dulac
Theorem says there is no closed trajectory lying entirely in the set 𝑥 2 < 1.
8.4.104: The closed trajectories are those where sin(𝑟) = 0, therefore, all the circles centered
at the origin with radius that is a multiple of 𝜋 are closed trajectories.
8.5.101: Critical points: $(0, 0, 0)$, $(3\sqrt{8}, 3\sqrt{8}, 27)$, $(-3\sqrt{8}, -3\sqrt{8}, 27)$. Linearization at $(0, 0, 0)$
using $u = x$, $v = y$, $w = z$ is $u' = -10u + 10v$, $v' = 28u - v$, $w' = -(8/3)w$. Linearization at
$(3\sqrt{8}, 3\sqrt{8}, 27)$ using $u = x - 3\sqrt{8}$, $v = y - 3\sqrt{8}$, $w = z - 27$ is $u' = -10u + 10v$, $v' = u - v - 3\sqrt{8}\, w$,
$w' = 3\sqrt{8}\, u + 3\sqrt{8}\, v - (8/3)w$. Linearization at $(-3\sqrt{8}, -3\sqrt{8}, 27)$ using $u = x + 3\sqrt{8}$, $v = y + 3\sqrt{8}$,
$w = z - 27$ is $u' = -10u + 10v$, $v' = u - v + 3\sqrt{8}\, w$, $w' = -3\sqrt{8}\, u - 3\sqrt{8}\, v - (8/3)w$.
√ √
A.1.101: a) 10 b) 14 c) 3
" −1 #  √1 
√  6  
2  −1  √2 , √−5 , √2
A.1.102: a) b)  √6  c)
√1 2 33 33 33
2 √ 
 6
           
9 −3 5 −4 3 −8
A.1.103: a) b) c) d) e) f)
−2 3 −3 8 7 0
A.1.104: a) 20 b) 10 c) 20
A.1.105: a) (3, −1) b) (4, 0) c) (−1, −1)

   5 −3 0
7 4 4  
A.2.101: a) b)  13 10 6
2 3 4 −1 3 1
 
   
−1 13 2 −5
A.2.102: a) b)
9 14 5 5
  18 18 12   11 12 36 14  −2 −12
22 31     
A.2.103: a) b)  6 0 8  c) −2 4 5 −2 d)  3 24 
42 44 34 48 −2  13 38 20
   28  1
 9 
     
  0 1 −5 2 1/2 −1/4
A.2.104: a) 1/2 b) c) d)
1 0 3 −1 −1/2 1/2

−1 0 0 0 
  1/4 0 0  
1/2 0  0 1/2 0 0 
A.2.105: a) b)  0 1/5 0  c) 
0 1/3 1/3 0 
0 0 −1 0 0 
 0 0 0 10

      1 0 0  1 0 0 77/15 
1 0 1 1 0 1 1   
A.3.101: a) b) c) d) 0 1 −1/3 e) 0 1 0 −2/15
0 1 0 0 1 0 0 0 0 0 0 1 −8/5 
 0   
1 0 −1/2 0     
  0 0 0 0 1 2 3 0
f) 0 1 1/2 1/2 g) h)
0 0 0 0 0 0 0 0 0 1
 0 0 
0 −1 0 0 0 1   5/2 1 −3
   
A.3.102: a) 1 0 0 b) 0 1 −1 c) −1 −1/2 3/2 

0 0 1 1 −1 0  −1 0 1 
   
A.3.103: a) $x_1 = -2$, $x_2 = 7/3$ b) no solution c) $a = -3$, $b = 10$, $c = -8$ d) $x_3$ is free,
$x_1 = -1 + 3x_3$, $x_2 = 2 - x_3$
   
A.3.104: a) $\begin{bmatrix} -1 \\ 3 \end{bmatrix}$ b) $\begin{bmatrix} -3 \\ 1 \end{bmatrix}$
A.3.105: a) 3 b) 1 c) 2
           
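A.3.106: a) $\begin{bmatrix} 1 & 0 & 0 \end{bmatrix}$, $\begin{bmatrix} 0 & 1 & 0 \end{bmatrix}$, $\begin{bmatrix} 0 & 0 & 1 \end{bmatrix}$ b) $\begin{bmatrix} 1 & 1 & 1 \end{bmatrix}$ c) $\begin{bmatrix} 1 & 0 & 1/3 \end{bmatrix}$, $\begin{bmatrix} 0 & 1 & -1/3 \end{bmatrix}$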

7 −1 7 1 0 3


           
A.3.107: a) 7 ,  7  , 6
  b) 1 c) 6 , 3
 
7  6  2 2 4 7
           
3 0
   
A.3.108:  1  ,  3 
−5 −1
   
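A.4.101: a) $\begin{bmatrix} 1 \\ 2 \end{bmatrix}$, $\begin{bmatrix} 1 \\ 1 \end{bmatrix}$ dimension 2, b) $\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$, $\begin{bmatrix} 1 \\ 1 \\ 2 \end{bmatrix}$ dimension 2, c) $\begin{bmatrix} 5 \\ 3 \\ 1 \end{bmatrix}$, $\begin{bmatrix} 5 \\ -1 \\ 5 \end{bmatrix}$, $\begin{bmatrix} -1 \\ 3 \\ -4 \end{bmatrix}$ dimension 3,
d) $\begin{bmatrix} 2 \\ 2 \\ 4 \end{bmatrix}$, $\begin{bmatrix} 2 \\ 2 \\ 3 \end{bmatrix}$ dimension 2, e) $\begin{bmatrix} 1 \\ 0 \end{bmatrix}$ dimension 1, f) $\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$, $\begin{bmatrix} 0 \\ 1 \\ 2 \end{bmatrix}$ dimension 2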
3 3
    −1 1 −1 0
−1  0         
A.4.102: a)   ,   b) −1 c)  1  d)  0  , 1
0 3
 
0 −1 0 −1
 0  −1        
   
A.4.103: a) 3 b) 2 c) 3 d) 2 e) 3
A.5.101: 𝑠 = −2
A.5.102: 𝜃 ≈ 0.3876
A.5.103: a) −15 b) −1 c) 28
A.5.104: a) (−1/2, 0, 1/2) b) (0, 0, 0) c) (2, 0, −2)
A.5.105: a) (1, 1, −1) − (2, −1, 1) + 2(1, −5, 3) b) 2(2, −1, 1) + (1, −5, 3) c) 2(1, 1, −1) −
2(2, −1, 1) + 2(1, −5, 3)
A.5.106: (2, −1, 1), (2/3, 8/3, 4/3)
A.5.107: (1, 1, −1), (0, 1, 1), (4/3, −2/3, 2/3)
A.6.101: a) −2 b) 8 c) 0 d) −6 e) −3 f) 28 g) 16 h) −24
A.6.102: a) 3 b) 9 c) 3 d) 1/4
A.6.103: 1/12

A.6.104: 1 and 3
Index

absolute convergence, 328
acceleration, 24
adding vectors, 387
addition of matrices, 127
Airy’s equation, 336
algebraic multiplicity, 162
almost linear, 357
amplitude, 98
analytic functions, 330
angular frequency, 98
ansatz, 84
antiderivative, 22
antidifferentiate, 22
associated homogeneous equation, 104, 138
associative law, 400
asymptotically stable, 358
asymptotically stable limit cycle, 374
atan, 99
atan2, 99
attractor, 381
augmented matrix, 132, 408
autonomous, 19
autonomous equation, 51
autonomous system, 124

backsubstitution, 411
basis, 391, 421
beating, 112
Bendixson–Dulac Theorem, 375
Bernoulli equation, 47
Bessel function of second kind, 348
Bessel function of the first kind, 348
Bessel’s equation, 347
boundary conditions for a PDE, 232
boundary value problem, 189
Burger’s equation, 18

cartesian plane, 386
catenary, 14
Cauchy–Euler equation, 82, 89
center, 149, 359
cgs units, 291, 292
chaotic systems, 378
characteristic coordinates, 72, 253
characteristic curve, 73
characteristic equation, 85
characteristics, 72
Chebyshev’s equation of order 𝑝, 340
Chebyshev’s equation of order 1, 83
clamped end of beam, 284
closed curves, 362
closed trajectory, 373
coefficients, 18
cofactor, 130, 438
cofactor expansion, 130, 437
column space, 416
column vector, 127, 387
commute, 129
complementary solution, 104
complete eigenvalue, 162
complex conjugate, 143
complex number, 86
complex roots, 87
conservative equation, 361
conservative vector field, 64
consistent, 412
constant of motion, 370
constant-coefficient, 19, 84, 137
convection, 73
convection equation, 320
convergence of a power series, 328
convergent power series, 328


converges absolutely, 328
convolution, 308
corresponding eigenfunction, 190
cosine series, 220
critical point, 51, 353
critically damped, 101

d’Alembert solution to the wave equation, 252
damped, 99
damped motion, 96
damped nonlinear pendulum equation, 371
defect, 163
defective eigenvalue, 163
deficient matrix, 163
delta function, 314
dependent variable, 10
determinant, 129, 436
diagonal matrix, 153, 403
    matrix exponential of, 170
diagonalization, 171
differential equation, 10
Dirac delta function, 314
direction, 386, 389
direction field, 124
Dirichlet boundary conditions, 221, 273
Dirichlet problem, 259
displacement vector, 153
distance, 24
distributive law, 400
divergence, 376
divergent power series, 328
dot product, 128, 199, 398
Duffing equation, 379
dynamic damping, 161

echelon form, 409
eigenfunction, 190, 274
eigenfunction decomposition, 273, 278
eigenvalue, 140, 274
eigenvalue of a boundary value problem, 190
eigenvector, 140
eigenvector decomposition, 179, 186
element of a matrix, 127, 390
elementary operations, 408
elementary row operations, 132, 408
elimination, 407
ellipses (vector field), 149
elliptic PDE, 232
endpoint problem, 189
entry of a matrix, 127, 390
envelope curves, 101
equilibrium, 353
equilibrium solution, 51, 353
euclidean space, 386
Euler’s equation, 82
Euler’s formula, 87
Euler’s method, 57
Euler–Bernoulli equation, 317
even function, 202, 218
even periodic extension, 218
exact equation, 63
existence and uniqueness, 30, 80, 90
existence and uniqueness for systems, 124
exponential growth, 17
exponential growth model, 12
exponential of a matrix, 169
exponential order, 296
extend periodically, 198

first shifting property, 298
first-order differential equation, 10
first-order linear equation, 40
first-order linear system of ODEs, 136
first-order method, 58
first-order system, 119
fixed end of beam, 284
forced motion, 96
    systems, 159
four fundamental equations, 13
Fourier series, 200
fourth-order method, 60
Fredholm alternative, 426
    simple case, 194
    Sturm–Liouville problems, 278
free end of beam, 284

free motion, 96
free variable, 133, 412
frequency, 98
Frobenius method, 345
Frobenius-type solution, 345
fundamental frequency, 205, 248
fundamental matrix, 137
fundamental matrix solution, 137, 170

Gauss–Jordan elimination, 409
Gaussian elimination, 409
general solution, 13
general solution to a system, 120
generalized eigenvectors, 163, 166
generalized function, 314
Genius software, 9
geometric multiplicity, 162
geometric series, 270, 333
Gibbs phenomenon, 205
Gram–Schmidt process, 433

half period, 208
Hamiltonian, 361
harmonic, 200
harmonic conjugate, 71
harmonic function, 71, 258
harvesting, 53
heat equation, 17, 232
Heaviside function, 294
Hermite’s equation of order 𝑛, 339
Hermite’s equation of order 2, 83
hinged end of beam, 284
homogeneous, 19
homogeneous equation, 48
homogeneous linear equation, 79
homogeneous matrix equation, 413
homogeneous side conditions, 233
homogeneous system, 137
Hooke’s law, 96, 152
hyperbolic cosine, 14
hyperbolic PDE, 232
hyperbolic sine, 14

identity matrix, 128, 400
ill-posed, 323
imaginary part, 87
implicit solution, 35
Improved Euler’s method, 62
impulse response, 313, 315
inconsistent, 412
inconsistent system, 133
indefinite integral, 22
independent variable, 10
indicial equation, 344
inhomogeneous, 19
initial condition, 13
initial condition for a PDE, 72
initial condition for a system, 120
initial conditions for a PDE, 232
inner product, 128, 398, 428
inner product of functions, 200, 277
integral equation, 305, 311
integrate, 22
integrating factor, 40
integrating factor method, 40
    for systems, 177
inverse Laplace transform, 297
invertible matrix, 129, 401
IODE software, 9
isolated critical point, 357

Jacobian matrix, 354

kernel, 413

la vie, 106
Laplace equation, 232, 258
Laplace transform, 293
Laplacian, 258
Laplacian in polar coordinates, 265
leading entry, 133, 409
Leibniz notation, 23, 33
limit cycle, 373
linear combination, 80, 90, 391, 413
linear equation, 18, 40, 79
linear first-order system, 120
linear mapping, 390

linear operator, 80, 104
linear PDE, 232
linear systems, 120
linearity of the Laplace transform, 295
linearization, 354
linearly dependent, 90, 414
linearly independent, 81, 90, 414
    for vector-valued functions, 137
logistic equation, 51
    with harvesting, 53
Lorenz attractor, 383
Lorenz system, 382
Lotka–Volterra, 368
lower triangular, 438

magnitude, 387
mass matrix, 153
mathematical model, 12
mathematical solution, 12
matrix, 127, 390
matrix exponential, 169
matrix inverse, 129, 401
matrix product, 399
matrix-valued function, 136
Maxwell’s equations, 17
mechanical vibrations, 17
method of characteristics, 72
Method of Frobenius, 345
method of partial fractions, 297
Mixed boundary conditions, 273
mks units, 99, 102, 227
multiplication of complex numbers, 86
multiplicity, 93
multiplicity of an eigenvalue, 162

𝑛-dimensional space, 386
natural (angular) frequency, 98
natural frequency, 111, 155
natural mode of oscillation, 155
Neumann boundary conditions, 222, 273
Newton’s law of cooling, 17, 36, 44, 51
Newton’s second law, 96, 97, 122, 152
nilpotent, 171
nonhomogeneous, 19
nonlinear equation, 18
normal mode of oscillation, 155
nullity, 424
nullspace, 413

odd function, 201, 218
odd periodic extension, 218
ODE, 11, 17
one-dimensional heat equation, 232
one-dimensional wave equation, 243
operator, 80, 390
order, 17
ordinary differential equation, 11
Ordinary differential equations, 17
ordinary point, 335
orthogonal, 429
    functions, 193, 200
    vectors, 198
    with respect to a weight, 276
orthogonal basis, 431
orthogonal projection, 430
orthogonality, 193
orthonormal basis, 431
overdamped, 100
overtones, 205, 248

parabolic PDE, 232
parallelogram, 130, 436
partial differential equation, 11, 232
Partial differential equations, 17
partial sum, 327
particular solution, 13, 104
PDE, 11, 17, 232
period, 98
periodic, 198
periodic extension, 198
periodic orbit, 374
phase diagram, 52, 352
phase plane portrait, 124
phase portrait, 52, 124, 352
phase shift, 98
Picard’s theorem, 30, 124

piecewise continuous, 211
piecewise smooth, 211
pivot, 409
Poincaré section, 380
Poincaré–Bendixson Theorem, 374
Poisson kernel, 269
potential function, 63
power series, 327
practical resonance, 115, 231
practical resonance amplitude, 115
practical resonance frequency, 115
predator-prey, 368
product of matrices, 128, 399
projection, 200
    orthogonal, 430
proper rational function, 298
pseudo-frequency, 102
pulse, 313
pure resonance, 113, 229

quadratic formula, 85

radius of convergence, 328
rank, 415
ratio test for series, 329
real part, 87
real-world problem, 12
rectangular pulse, 313
recurrence relation, 336
reduced row echelon form, 133, 409
reduction of order method, 81
regular singular point, 345
regular Sturm–Liouville problem, 275
reindexing the series, 332
relaxation oscillation, 373
repeated roots, 92
resonance, 113, 160, 229, 310
RLC circuit, 96
Robin boundary conditions, 273
root test for series, 329
row echelon form, 409
row operations, 132, 408
row reduction, 409
row space, 416
row vector, 127, 391
Runge–Kutta method, 61

saddle point, 148
sawtooth, 201, 306
scalar, 127, 388
scalar multiplication, 127
scale a vector, 388
second shifting property, 302
second-order differential equation, 14
second-order linear differential equation, 79
second-order method, 59
second-order system, 119
semistable critical point, 53
separable, 33
separation of variables, 234
shifting property, 298, 302
side conditions for a PDE, 232
Sierpinski triangle, 384
simple harmonic motion, 98
simply connected region, 375
sine series, 220
singular matrix, 129, 401
singular point, 335
singular solution, 35
sink, 147
slope field, 27
solution, 10
solution curve, 124
solution to a system, 120
source, 147
span, 415
spectrum, 205, 248
spiral sink, 150
spiral source, 149
square matrix, 127, 391
square wave, 116, 203
stable center, 359
stable critical point, 51, 357
stable node, 147
standard basis vectors, 391

standard inner product, 428
steady periodic solution, 114, 226
steady-state temperature, 241, 258
stiff problem, 61
stiffness matrix, 153
strange attractor, 380
Sturm–Liouville problem, 274
    regular, 275
subspace, 421
subtracting vectors, 388
superposition, 79, 90, 137, 233
symmetric matrix, 193, 198, 404
system of differential equations, 17, 119

Taylor series, 330
tedious, 106, 108, 114, 182, 269, 378
thermal conductivity, 232
three mass system, 152
three-point beam bending, 317
timbre, 248, 284
total derivative, 63
trajectory, 124
transfer function, 304
transformation, 390
transient solution, 114
transport equation, 17, 72, 320
transpose, 128, 404
transversal vibrations, 282
trigonometric series, 200

undamped, 98
undamped motion, 96
    systems, 152
underdamped, 101
undetermined coefficients, 105
    for second-order systems, 159, 185
    for systems, 182
unforced motion, 96
unit step function, 294
unit vector, 389
unstable critical point, 51, 357
unstable node, 147
upper triangular, 438
upper triangular matrix, 164

Van der Pol oscillator, 373
variable-coefficient, 19
variation of parameters, 108
    for systems, 184
vector, 127, 386
vector field, 64, 124, 352
vector-valued function, 136, 389
velocity, 24
Volterra integral equation, 311

wave equation, 232, 243, 252
wave equation in 2 dimensions, 17
wavefronts, 256
weight function, 276
