Part D. Complex Analysis
Chap. 13 Complex Numbers and Functions. Complex Differentiation
Sec. 13.1 Complex Numbers and Their Geometric Representation
The operation most likely to be new to you is division (or forming a quotient). Thus make sure that you remember how to calculate the quotient of two complex numbers as given in equation (7), Example 2, p. 610, and Prob. 3. In (7) we take the number z₂ from the denominator, form its complex conjugate z̄₂, and build a new quotient z̄₂/z̄₂. We multiply the given quotient by this new quotient z̄₂/z̄₂ (which is equal to 1 and thus allowed):
  z₁/z₂ = (z₁/z₂)·(z̄₂/z̄₂) = (z₁ z̄₂)/(z₂ z̄₂),
which we multiply out, recalling that i² = −1 [see (5), p. 609]. The final result is a complex number in a form that allows us to separate its real (Re z) and imaginary (Im z) parts. Also remember that 1/i = −i (see Prob. 1), as it occurs frequently. We continue by defining the complex plane and use it to graph complex numbers (note Fig. 318, p. 611, and Fig. 322, p. 612). We use equation (8), p. 612, to go from complex to real.
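As a quick numerical companion to this rule, here is a minimal Python sketch (the helper name `quotient` is ours, not the book's) that divides two complex numbers by multiplying with the conjugate of the denominator:

```python
def quotient(z1: complex, z2: complex) -> complex:
    """Compute z1/z2 via (z1 * conj(z2)) / (z2 * conj(z2)).

    The denominator z2 * conj(z2) = |z2|^2 is real, so the real and
    imaginary parts of the result can be read off directly.
    """
    c = z2.conjugate()
    denom = (z2 * c).real        # |z2|^2, a real number
    w = z1 * c
    return complex(w.real / denom, w.imag / denom)

print(quotient(26 - 18j, 6 - 2j))  # → (4.8-1.4j)
```

The result agrees with Python's built-in complex division, which rests on essentially the same idea.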
(I2) i³ = i²·i = (−1)·i = −i.
Here we used (I1), i² = −1, in the second equality. To get (I3), we apply (I1) twice:
(I3) i⁴ = i²·i² = (−1)·(−1) = 1.
(I5) 1/i = ī/(i·ī) = −i/((0 + i)(0 − i)) = −i/(0² + 1²) = −i.
By (I5) and (I1) we get
Chap. 13 Complex Numbers and Functions. Complex Differentiation 259
  1/i² = (1/i)² = (−i)² = i² = −1,
and the pattern repeats itself. Memorize that i² = −1 and 1/i = −i, as they will appear quite frequently.
  Start!  i⁰ = 1,  i¹ = i,  i² = −1,  i³ = −i,
          i⁴ = 1,  i⁵ = i,  i⁶ = −1,  i⁷ = −i,
          i⁸ = 1,  i⁹ = i,  … .
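The period-4 pattern is easy to confirm in a few lines of Python (1j plays the role of i):

```python
cycle = [1, 1j, -1, -1j]        # i^0, i^1, i^2, i^3
for n in range(12):
    # i**n depends only on n mod 4, which is the "pattern repeats" claim
    assert 1j ** n == cycle[n % 4]
print("1/i =", 1 / 1j)          # -i, as (I5) states
```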
In components, with z₁ = x₁ + iy₁ and z₂ = x₂ + iy₂,
  z₁/z₂ = (x₁ + iy₁)/(x₂ + iy₂) = ((x₁ + iy₁)(x₂ − iy₂))/((x₂ + iy₂)(x₂ − iy₂))   (N.B. corresponds to multiplication by 1)
260 Complex Analysis Part D
  = (x₁x₂ + iy₁x₂ − ix₁y₂ − i²y₁y₂)/(x₂² + y₂²) = (x₁x₂ + y₁y₂)/(x₂² + y₂²) + i·(x₂y₁ − x₁y₂)/(x₂² + y₂²).
For example,
  (26 − 18i)/(6 − 2i) = ((26 − 18i)(6 + 2i))/((6 − 2i)(6 + 2i)) = (192 − 56i)/(6² + 2²) = (192 − 56i)/(36 + 4) = (192 − 56i)/40 = 4.8 − 1.4i.
5. Pure imaginary number. If z = x + iy is pure imaginary, then z̄ = −z.
Proof. Let z = x + iy be pure imaginary. Then x = 0, by the definition at the bottom of p. 609. Hence
  z = iy,  z̄ = −iy = −z.
Conversely, assume that z̄ = −z, that is, x − iy = −x − iy. By the definition of equality (p. 609) we know that the real parts must be equal and that the imaginary parts must be equal. Thus
  x = −x,
  2x = 0,  x = 0,
and
  −y = −y (no condition on y),
so z = iy is pure imaginary.
With z₁ = −2 + 11i and z₂ = 2 − i,
  z₁ − z₂ = (−2 + 11i) − (2 − i) = −4 + 12i,
so that
  (z₁ − z₂)²/16 = (16 − 96i + 144i²)/16 = (−128 − 96i)/16 = −128/16 − (96/16)i = −8 − 6i.
Next consider
  (z₁/4 − z₂/4)².
We have
  z₁/4 = (−2 + 11i)/4 = −2/4 + (11/4)i,  z₂/4 = (2 − i)/4 = 2/4 − (1/4)i.
Their difference is
  z₁/4 − z₂/4 = −2/4 − 2/4 + (11/4 + 1/4)i = −1 + 3i.
Hence
  (z₁/4 − z₂/4)² = (−1 + 3i)² = 1 − 6i + 9i² = −8 − 6i,
which agrees with (z₁ − z₂)²/16, as expected.
19. Real part and imaginary part of z/z̄. For z = x + iy, we have by (7), p. 610,
  z/z̄ = (z·z)/(z̄·z),
where we multiplied numerator and denominator by the conjugate of the denominator, using that the conjugate of the conjugate of a complex number is the complex number itself (which you may want to prove!). Then
  z/z̄ = z²/(z̄z) = (x² − y² + 2ixy)/(x² + y²) = (x² − y²)/(x² + y²) + i·2xy/(x² + y²),
so that
  Re(z/z̄) = (x² − y²)/(x² + y²);  Im(z/z̄) = 2xy/(x² + y²).
Sec. 13.2 Polar Form of Complex Numbers. Powers and Roots
Polar coordinates, defined by (1) and (2) on p. 613, play a more important role in complex analysis than
in calculus. Their study gives a deeper understanding of multiplication and division of complex numbers
(pp. 615–616) and absolute values. More details are as follows.
The polar angle (taken counterclockwise, see Fig. 323, p. 614) of a complex number is determined only up to integer multiples of 2π. While often this is not essential, there are situations where it matters. For this purpose, we introduce the concept of the principal value Arg z in (5), p. 614, and illustrate it in Example 1 and Probs. 9 and 13.
The triangle inequality defined in (6), p. 614, and illustrated in Example 2, p. 615, is very important
since it will be used frequently in establishing bounds such as in Chap. 15.
Often it will be used in its generalized form (6*), p. 615, which can be understood by the following
geometric reasoning. Draw several complex numbers as little arrows and let each tail coincide with the
preceding head. This gives you a zigzagging line of n parts, and the left side of (6*) equals the distance
from the tail of z1 to the head of zn. Can you “see” it? Now take your zigzag line and pull it taut; then you
have the right side as the length of the zigzag line straightened out.
In almost all cases when we use (6*) in establishing bounds, it will not matter whether or not the right
side of (6*) is much larger than the left. However, it will be essential that we have such an upper bound
for the absolute value of the sum on the left, so that in a limit process, the latter cannot go to infinity.
The last topic is roots of complex numbers, illustrated in Figs. 327–329, p. 617, and Prob. 21. Look at
these figures and see how, for different n, the roots of unity (16), p. 617, lie symmetrically on the unit
circle.
1. Polar form. Sketch z = 1 + i to understand what is going on. Point z is the point (1, 1) in the complex plane. From this we see that the distance of z from the origin is √2. This is the absolute value of z. Furthermore, z lies on the bisecting line of the first quadrant, so that its argument (the angle between the positive ray of the x-axis and the segment from 0 to z) is 45° or π/4.
Now we show how the results follow from (3) and (4), p. 613. In the notation of (3) and (4) we have z = x + iy = 1 + i. Hence the real part of z is x = 1 and the imaginary part of z is y = 1. From (3) we obtain
  |z| = √(1² + 1²) = √2,
as before. From (4) we obtain
  tan θ = y/x = 1,  θ = 45° or π/4.
Hence the polar form (2), p. 613, is
  z = √2 (cos(π/4) + i sin(π/4)).
Note that here we have explained the first part of Example 1, p. 614, in great detail.
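Python's `cmath` module performs exactly this conversion; a quick check of the values above (the test tolerances are our choices):

```python
import cmath, math

z = 1 + 1j
r, theta = cmath.polar(z)        # r = |z|, theta = Arg z in (-pi, pi]
print(r, theta)                  # sqrt(2) ≈ 1.414..., pi/4 ≈ 0.785...
assert math.isclose(r, math.sqrt(2))
assert math.isclose(theta, math.pi / 4)
# rebuild z from its polar form r(cos θ + i sin θ):
assert abs(cmath.rect(r, theta) - z) < 1e-12
```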
Sec. 13.2 Prob. 1. Sketch: the point z = 1 + i in the complex plane, at 45° from the positive x-axis.
Polar form of a quotient. We first simplify the given quotient by multiplying numerator and denominator by the conjugate of the denominator; the denominator becomes 8 + 4/9 = 76/9, the numerator −38/9, and hence
  z = (−38/9)/(76/9) = −38/76 = −1/2.
Hence z = −1/2 corresponds to (−1/2, 0) in the complex plane. Since z lies on the negative real semiaxis, |z| = 1/2 and the polar angle is 180°, that is, Arg z = π.
Next, for z = 1 + (1/2)i we have, by (3), p. 613,
  |z| = √(1² + (1/2)²) = √(1 + 1/4) = √5/2,
and, by (4),
  tan θ = y/x = (1/2)/1 = 1/2;  θ = arctan(1/2).
The desired polar form of z is
  z = |z|(cos θ + i sin θ) = √(1 + 1/4) (cos(arctan(1/2)) + i sin(arctan(1/2))).
9. Principal argument. The first and second quadrants correspond to 0 ≤ Arg z ≤ π. The third and fourth quadrants correspond to −π < Arg z < 0. Note that Arg z is continuous on the positive real semiaxis and has a jump of 2π on the negative real semiaxis. This is a convenient convention. Points on the negative real semiaxis, e.g., −4.7, have the principal argument Arg z = π. To find the principal argument of z = −1 + i, we convert z to polar form:
  |z| = √((−1)² + 1²) = √2,
  tan θ = y/x = 1/(−1) = −1.
Hence
  θ = 3π/4 = 135°  or  θ = −π/4,
and the argument is in any case determined only up to multiples of 2π. To avoid this ambiguity, we take the principal argument Arg z [see (5), p. 614], noting that z = −1 + i lies in the second quadrant. We have
  Arg z = 3π/4.
13. De Moivre's formula. From the polar form just obtained, −1 + i = √2 (cos(3π/4) + i sin(3π/4)). Then, using De Moivre's formula (13), p. 616, with r = √2 and n = 20,
  (−1 + i)²⁰ = (√2)²⁰ (cos(20·3π/4) + i sin(20·3π/4))
  = 2¹⁰ (cos 15π + i sin 15π)
  = 2¹⁰ (cos π + i sin π)
  = −2¹⁰ = −1024.
Hence
  arg z = π ± 2nπ, n = 0, 1, 2, …;  Arg z = π.
17. Conversion to x + iy. To convert from polar form to the form x + iy, we have to evaluate sin θ and cos θ for the given θ. Here
  √8 (cos(π/4) + i sin(π/4)) = √8 (√2/2 + i √2/2) = √16/2 + i √16/2 = 2 + 2i.
21. Roots. From Prob. 1 and Example 1, p. 614, in this section, we know that 1 + i in polar form is
  1 + i = √2 (cos(π/4) + i sin(π/4)).
Hence the three cube roots of 1 + i have absolute value (√2)^(1/3) = 2^(1/6) and arguments
  (π/4 + 2kπ)/3 = π/12 + 8kπ/12 = (1 + 8k)π/12.
Hence
  (1 + i)^(1/3) = 2^(1/6) (cos((1 + 8k)π/12) + i sin((1 + 8k)π/12)),
where k = 0, 1, 2 (3 roots; thus 3 values of k). Written out, we get:
  For k = 0:  z₀ = 2^(1/6) (cos(π/12) + i sin(π/12)).
  For k = 1:  z₁ = 2^(1/6) (cos(9π/12) + i sin(9π/12)).
  For k = 2:  z₂ = 2^(1/6) (cos(17π/12) + i sin(17π/12)).
The three roots are regularly spaced around a circle of radius 2^(1/6) = 1.1225 with center 0.
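The same three roots drop out of a short Python loop over k (our variable names; tolerance choices arbitrary):

```python
import cmath, math

z = 1 + 1j
r, arg = abs(z), cmath.phase(z)            # sqrt(2) and pi/4
roots = [cmath.rect(r ** (1 / 3), (arg + 2 * math.pi * k) / 3)
         for k in range(3)]
for w in roots:
    assert abs(w ** 3 - z) < 1e-9          # each root cubes back to 1 + i
    assert math.isclose(abs(w), 2 ** (1 / 6))   # radius 2^(1/6) ≈ 1.1225
# angles 15°, 135°, and 255° (the last reported as -105° by phase)
print([cmath.phase(w) * 180 / math.pi for w in roots])
```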
Sec. 13.2 Prob. 21. The three roots z₀, z₁, z₂ of (1 + i)^(1/3) in the complex plane: all of absolute value ≈ 1.12, spaced 120° apart, with z₀ at 15° from the positive x-axis.
29. Equations involving roots of complex numbers. Applying the usual formula for the solutions of a quadratic equation,
  z = (−b ± √(b² − 4ac))/(2a),
to
  (Eq) z² − z + 1 − i = 0,
we first have
  (A) z = (1 ± √(1 − 4(1 − i)))/2 = (1 ± √(−3 + 4i))/2.
Now, in (A), we have to simplify √(−3 + 4i). Let w = p + qi, with p, q real, denote a value of this square root. Then
  w² = (p + qi)² = p² − q² + 2pqi = −3 + 4i.
We know that, for two complex numbers to be equal, their real parts and imaginary parts must be equal, respectively. Hence, from the imaginary part,
  2pq = 4,
  (B) pq = 2,  q = 2/p.
This can then be used in the real part:
  p² − q² = −3,
  p² − 4/p² = −3,
  p⁴ − 4 = −3p²,  p⁴ + 3p² − 4 = 0.
To solve this quartic equation, we set h = p² and get the quadratic equation
  h² + 3h − 4 = 0,  (h + 4)(h − 1) = 0.
Hence
  p² = 1 and p² = −4.
Since p is real, only p² = 1 is possible, giving p = ±1 and, by (B), q = ±2. Thus the two values of √(−3 + 4i) are
  1 + 2i and −1 − 2i = −(1 + 2i),
so that
  z = (1 ± √(−3 + 4i))/2 = (1 ± (1 + 2i))/2.
This gives us the desired solutions to (Eq), that is,
  z₁ = (1 + (1 + 2i))/2 = (2 + 2i)/2 = 1 + i
and
  z₂ = (1 − (1 + 2i))/2 = −2i/2 = −i.
Verify the result by plugging the two values into equation (Eq) and see that you get zero.
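A direct check with `cmath` (a sketch; `cmath.sqrt` returns one square root of the complex discriminant, here 1 + 2i):

```python
import cmath

a, b, c = 1, -1, 1 - 1j            # z^2 - z + (1 - i) = 0
d = cmath.sqrt(b * b - 4 * a * c)  # a square root of -3 + 4i
z1 = (-b + d) / (2 * a)
z2 = (-b - d) / (2 * a)
print(z1, z2)                      # 1 + i and -i
for z in (z1, z2):
    assert abs(z * z - z + 1 - 1j) < 1e-12   # both satisfy (Eq)
```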
Sec. 13.3 Derivative. Analytic Function
The material follows the calculus you are used to with certain differences due to working in the complex
plane with complex functions f.z/. In particular, the concept of limit is different as z may approach z0 from
any direction (see pp. 621–622 and Example 4). This also means that the derivative, which looks the
same as in calculus, is different in complex analysis. Open the textbook on p. 623 and take a look at
Example 4. We show from the definition of derivative (4), p. 622, which uses the concept of limit, that f(z) = z̄ is not differentiable. The essence of the example is that approaching z along path I in Fig. 334, p. 623, gives a value different from that along path II. This is not allowed with limits (see pp. 621–622).
We call those functions that are differentiable in some domain analytic (p. 623). You can think of them as the “good functions,” and they will form the preferred functions of complex analysis and its applications. Note that f(z) = z̄ is not analytic. (You may want to build a small list of nonanalytic functions as you encounter them. In Sec. 13.4 we shall learn a famous method for testing analyticity.)
The differentiation rules are the same as in real calculus (see Example 3, pp. 622–623 and Prob. 19).
Here are two examples:
  f(z) = (1 − z)¹⁶,  f′(z) = 16(1 − z)¹⁵·(−1) = −16(1 − z)¹⁵   (note the chain-rule factor −1);
  f(z) = i,  f′(z) = 0,
since i is a constant.
Go over the material to see that many concepts from calculus carry over to complex analysis. Use this
section as a reference section for many of the concepts needed for Part D.
We are asked to write
  |z + 1 − 5i|
in the form |z − a|:
  |z + 1 − 5i| = |z + (1 − 5i)| = |z − (−1 + 5i)|,
so a = −1 + 5i.
7. Regions. Half-plane. Let z = x + yi. Then Re z = x, as defined on p. 609. We are required to determine what
  Re z ≥ −1
represents, that is,
  x ≥ −1.
This is a closed right half-plane bounded by the vertical line x = −1, that is, the half-plane to the right of x = −1, including the boundary.
11. Function values are obtained, as in calculus, by substitution of the given value into the given
function. Perhaps a quicker solution than the one shown on p. A35 of the textbook, and following the
approach of p. 621, is as follows. The function
  f(z) = 1/(1 − z), evaluated at z = 1 − i,
is
  f(1 − i) = 1/(1 − (1 − i)) = 1/i = −i,
with the last equality by (I5) in Prob. 1 of Sec. 13.1 on p. 258 of this Manual. Hence
  Re f = Re(−i) = 0,  Im f = Im(−i) = −1.
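In Python this one-line evaluation looks as follows (our check, not the book's):

```python
f = lambda z: 1 / (1 - z)        # the given function
w = f(1 - 1j)                    # denominator collapses to i, and 1/i = -i
print(w.real, w.imag)            # 0.0 -1.0
assert w == -1j
```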
17. Continuity. Let us use polar coordinates (Sec. 13.2) to see whether the function defined by
  f(z) = Re(z)/(1 − |z|)  for z ≠ 0,   f(0) = 0,
is continuous at z = 0. Then x = r cos θ, y = r sin θ by (1), p. 613, and, using the material on p. 613, we get
  f(z) = Re(z)/(1 − |z|) = x/(1 − |z|) = (r cos θ)/(1 − r).
We note that, as r → 0,
  (r cos θ)/(1 − r) → 0  for any value of θ,
so that f(z) → 0 = f(0). By (3), p. 622, we can conclude that f is continuous at z = 0.
Remark. Be aware of the chain rule. Thus if, for example, we want to differentiate
  f(z) = (z − 4i)⁸,
the chain rule gives f′(z) = 8(z − 4i)⁷·1 = 8(z − 4i)⁷.
Sec. 13.4 Cauchy–Riemann Equations. Laplace's Equation
Analytic functions f(z) = u(x, y) + iv(x, y) satisfy the Cauchy–Riemann equations
  (1) u_x = v_y,  u_y = −v_x
(Theorem 1, p. 625) as well as Laplace's equation ∇²u = 0, ∇²v = 0 (Theorem 3, p. 628; see also Example 4, p. 629, and Prob. 15). The converse of Theorem 1 is also true (Theorem 2, p. 627), provided the derivatives in (1) are continuous. For these reasons the Cauchy–Riemann equations are most important in complex analysis, which is the study of analytic functions.
Examples 1 and 2, p. 627, and Probs. 3 and 5 use the Cauchy–Riemann equations to test whether
the given functions are analytic. In particular, note that Prob. 5 gives complete details on how to use
the Cauchy–Riemann equations (1), p. 625, in complex form (7), p. 628, and even how to conclude nonanalyticity by observing the given function. You have to memorize the Cauchy–Riemann equations (1). Remember the minus sign in the second equation!
Problem Set 13.4. Page 629
3. Check of analyticity. Cauchy–Riemann equations (1), p. 625. The given function is
  f(z) = e^(−2x) (cos 2y − i sin 2y),
so that u = e^(−2x) cos 2y and v = −e^(−2x) sin 2y. To check whether f is analytic, we want to apply the important Cauchy–Riemann equations (1), p. 625. To this end, we compute the following four partial (real) derivatives:
  u_x = −2e^(−2x) cos 2y,   v_y = −e^(−2x) (2 cos 2y) = −2e^(−2x) cos 2y,
  u_y = −2e^(−2x) sin 2y,   v_x = 2e^(−2x) sin 2y.
Note that the factor −2 in u_x and the factor 2 in v_y result from the chain rule. Can you identify the use of the chain rule in the other two partial derivatives? We see that
  u_x = −2e^(−2x) cos 2y = v_y
and
  u_y = −2e^(−2x) sin 2y = −v_x.
This shows that the Cauchy–Riemann equations are satisfied for all z = x + iy, and we conclude that f is indeed analytic.
In Sec. 13.5 we will learn that the given function f equals the complex exponential function e^w with w = −2x − 2iy = −2z, and that, in general, the complex exponential function is analytic.
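The conclusion can be spot-checked numerically. The sketch below (the helper `cr_residuals` is ours; step size and test point are arbitrary) approximates both Cauchy–Riemann residuals by central differences, and both come out near zero for this f:

```python
import cmath

def cr_residuals(f, z, h=1e-6):
    """Return (u_x - v_y, v_x + u_y) at z, by central differences."""
    fx = (f(z + h) - f(z - h)) / (2 * h)            # u_x + i*v_x
    fy = (f(z + 1j * h) - f(z - 1j * h)) / (2 * h)  # u_y + i*v_y
    return fx.real - fy.imag, fx.imag + fy.real

f = lambda z: cmath.exp(-2 * z)   # equals e^{-2x}(cos 2y - i sin 2y)
r1, r2 = cr_residuals(f, 0.3 + 0.7j)
print(r1, r2)                     # both ≈ 0: the CR equations hold
```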
5. Not analytic. We show that f(z) = Re(z²) − i Im(z²) is not analytic, in three different ways.
Solution 1. Since z² = x² − y² + 2ixy, we have
  u = x² − y²,  v = −2xy.
To test whether f(z) satisfies the Cauchy–Riemann equations (1), p. 625, we have to take four partial derivatives:
  u_x = 2x and v_y = −2x,
so that
  (XCR1) u_x ≠ v_y.
(We could stop here and have a complete answer that the given function is not analytic! However, for demonstration purposes we continue.)
Next,
  u_y = −2y and v_x = −2y.
The second Cauchy–Riemann equation (1), p. 625 (remember the minus sign!) requires u_y = −v_x, that is, −2y = 2y, which holds only on the line y = 0; likewise (XCR1) holds only on x = 0. Thus the Cauchy–Riemann equations hold in no domain. We note that the functions u(x, y), v(x, y) have continuous partial derivatives and conclude, by Theorems 1, p. 625, and 2, p. 627, that f(z) is not analytic.
Solution 2. We use polar coordinates: x = r cos θ, y = r sin θ. Hence, in polar form,
  u = x² − y² = r² cos²θ − r² sin²θ = r²(cos²θ − sin²θ),
  v = −2xy = −2r² cos θ sin θ.
We have
  u_r = 2r(cos²θ − sin²θ),  v_θ = 2r²(sin²θ − cos²θ),
and thus
  (1/r)v_θ = 2r(sin²θ − cos²θ).
We see that u_r = −(1/r)v_θ, so that
  u_r ≠ (1/r)v_θ.
This means that f does not satisfy the first Cauchy–Riemann equation in polar coordinates (7), p. 628, and f is not analytic. (Again we could stop here. However, for pedagogical reasons we continue.)
Similarly,
  v_r = −4r cos θ sin θ  and  (1/r)u_θ = −4r sin θ cos θ,
so that v_r = (1/r)u_θ. The second Cauchy–Riemann equation in polar coordinates (7), however, requires v_r = −(1/r)u_θ, which again holds only where sin θ cos θ = 0. In any case, the first Cauchy–Riemann equation is not satisfied, and we conclude that f(z) is not analytic.
Solution 3. We note that f(z) = Re(z²) − i Im(z²) is the complex conjugate of z², and the conjugate of z² equals (z̄)². From Example 4, p. 623 in Sec. 13.3, we know that z̄ is not differentiable, and one concludes in the same way that the given f(z) = (z̄)² is not differentiable in any domain. Hence f(z) is not analytic (by definition on p. 623).
Remark. Solution 3 is the most elegant one. Solution 1 is the standard one where we stop when the
first Cauchy–Riemann equation is not satisfied. Solution 2 is included here to show how the Cauchy–
Riemann equations are calculated in polar coordinates. (Here Solution 2 is more difficult than
Solution 1 but sometimes conversion to polar makes calculating the partial derivatives simpler.)
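For contrast with Prob. 3, the same finite-difference check (the helper `cr_residuals` is ours; the test point is arbitrary) shows the Cauchy–Riemann residuals far from zero for this f:

```python
def cr_residuals(f, z, h=1e-6):
    """Return (u_x - v_y, v_x + u_y) at z, by central differences."""
    fx = (f(z + h) - f(z - h)) / (2 * h)            # u_x + i*v_x
    fy = (f(z + 1j * h) - f(z - 1j * h)) / (2 * h)  # u_y + i*v_y
    return fx.real - fy.imag, fx.imag + fy.real

f = lambda z: z.conjugate() ** 2   # Re(z^2) - i Im(z^2)
r1, r2 = cr_residuals(f, 1 + 2j)
print(r1, r2)                      # ≈ 4 and -8: the CR equations fail here
```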
15. Harmonic functions appear as real and imaginary parts of analytic functions.
First solution method. Identifying the function.
If you remember that the given function u D x=.x 2 C y2/ is the real part of 1=z, then you are done.
Indeed,
  1/z = 1/(x + iy) = (x − iy)/((x + iy)(x − iy)) = (x − iy)/(x² + y²) = x/(x² + y²) − i·y/(x² + y²),
so that clearly
  Re(1/z) = x/(x² + y²),
and hence the given function u, being the real part of a function analytic for z ≠ 0, is harmonic. Moreover, our derivation also shows that a conjugate harmonic of u is −y/(x² + y²).
Second solution method. Direct calculation as in Example 4, p. 629.
If you don't remember that, you have to work systematically by differentiation, beginning with proving that u satisfies Laplace's equation (8), p. 628. Such somewhat lengthy differentiations, as well as other calculations, can often be simplified and made more reliable by introducing suitable shorter notations for certain expressions. In the present case we can write
  (A) u = x/G,  where G = x² + y².
By applying the product rule of differentiation (and the chain rule), not the quotient rule, we obtain the first partial derivative
  (B) u_x = 1/G − x(2x)/G² = 1/G − 2x²/G².
By differentiating this again, using the product and chain rules, we obtain the second partial derivative:
  (C) u_xx = −2x/G² − 4x/G² + 8x³/G³.
Similarly, the partial derivative of u with respect to y is obtained from (A) in the form
  (D) u_y = −2xy/G².
The partial derivative of this with respect to y is
  (E) u_yy = −2x/G² + 8xy²/G³.
Adding (C) and (E) and noting that x² + y² = G, we get
  u_xx + u_yy = −8x/G² + 8x(x² + y²)/G³ = −8x/G² + 8x/G² = 0.
This shows that u = x/G = x/(x² + y²) satisfies Laplace's equation (8), p. 628, and thus is harmonic.
Next we want to determine a harmonic conjugate. From (D) and the second Cauchy–Riemann equation (1), p. 625, we obtain
  v_x = −u_y = 2xy/G².
Integration of 2x/G² = G_x/G², with respect to x, gives −1/G, so that integration of v_x, with respect to x, gives
  (F) v = −y/G + h(y) = −y/(x² + y²) + h(y).
Now we show that h(y) must be a constant. We obtain, by differentiating (F) with respect to y and taking the common denominator G², the following:
  v_y = −1/G + 2y²/G² + h′(y) = (−x² − y² + 2y²)/G² + h′(y) = (y² − x²)/G² + h′(y).
On the other hand, by (B),
  u_x = 1/G − 2x²/G² = (x² + y² − 2x²)/G² = (y² − x²)/G².
The first Cauchy–Riemann equation requires v_y = u_x, hence
  (y² − x²)/G² + h′(y) = (y² − x²)/G²,  h′(y) = 0,
as we claimed. Since this constant is arbitrary, we can choose h(y) = 0 and obtain, from (F), the desired conjugate harmonic
  v = −y/G = −y/(x² + y²),
which is the same answer as in our first solution method.
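A numerical cross-check of both claims (a finite-difference Laplacian; helper names and the test point are our choices):

```python
def u(x, y): return x / (x * x + y * y)
def v(x, y): return -y / (x * x + y * y)

def laplacian(g, x, y, h=1e-4):
    # five-point approximation of g_xx + g_yy
    return (g(x + h, y) + g(x - h, y) + g(x, y + h) + g(x, y - h)
            - 4 * g(x, y)) / (h * h)

x0, y0 = 0.8, -0.5
print(laplacian(u, x0, y0), laplacian(v, x0, y0))  # both ≈ 0 (harmonic)
# u + iv agrees with the analytic function 1/z:
assert abs(complex(u(x0, y0), v(x0, y0)) - 1 / complex(x0, y0)) < 1e-12
```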
Sec. 13.5 Exponential Function
It would be useful for you to remember equations (7), (8), and (9). The periodicity (12), p. 632, has no counterpart in real calculus. It motivates the fundamental region (13), p. 632, of e^z.
Solving complex equations, such as Prob. 19, gives practice in the use of complex elementary
functions and illustrates the difference between these functions and their real counterparts. In
particular, Prob. 19 has infinitely many solutions in complex but only one solution in real!
Problem Set 13.5. Page 632
Computation of e^z. We have
  e^(2+3πi) = e² (cos 3π + i sin 3π)
  = e² (cos(π + 2π) + i sin(π + 2π))   [since cos 3π = cos(π + 2π); same for sin 3π]
  = e² (cos π + i sin π)   (cos and sin both have period 2π)
  = e² (−1 + i·0)
  = −e² ≈ −7.389.
9. Polar form. We want to write z = 4 + 3i in exponential form (6), p. 631. This means expressing it in the form z = re^(iθ). We have
  r = |z| = √(x² + y²) = √(4² + 3²) = √25 = 5.
We know, by Sec. 13.2, pp. 613–619, that the principal argument of the given z is
  θ = Arg z = arctan(y/x) = arctan(3/4) = 0.643501.
Hence
  z = 5e^(i arctan(3/4)) = 5e^(0.643501i).
Checking the answer. By (2), p. 613, in Sec. 13.2, we know that any complex number z = x + iy has the polar form
  z = r(cos θ + i sin θ).
Here
  z = 5(cos 0.643501 + i sin 0.643501) = 5(0.8 + 0.6i) = 4 + 3i,
as expected.
15. Real and imaginary parts. We want to find the real and imaginary parts of exp.z2/: From the
beginning of Sec. 13.5 of the textbook we know that the notation exp means
  exp(z²) = e^(z²).
Now
  z² = (x + iy)² = x² − y² + 2ixy,
and
  e^(i·2xy) = cos(2xy) + i sin(2xy)   [by (1), p. 630; (5), p. 631].
Putting it together,
  e^(z²) = e^(x²−y²) e^(2ixy) = e^(x²−y²) (cos 2xy + i sin 2xy).
Hence
  Re exp(z²) = e^(x²−y²) cos 2xy;  Im exp(z²) = e^(x²−y²) sin 2xy.
19. Equation. We have to solve
  (A) e^z = 1 = 1 + i·0.
In terms of real and imaginary parts, by (1), p. 630, this is
  (B) e^x cos y = 1,  (C) e^x sin y = 0.
Since e^x > 0 for all real x, but the product in (C) must equal zero, we need sin y = 0, hence
  (D) y = 0, ±π, ±2π, ….
Since the product in (B) is positive, cos y has to be positive. If we look at (D), we know that cos y is −1 for y = ±π, ±3π, ±5π, … but +1 for y = 0, ±2π, ±4π, …. Hence (B) and (D) give
  (E) y = 0, ±2π, ±4π, ….
Since (B) requires that the product be equal to 1 and the cosine for the values of y in (E) is 1, we have e^x = 1. Hence
  (F) x = 0.
Together, x = 0 and y = 0, ±2π, ±4π, …, that is,
  z = x + yi = ±2nπi,  n = 0, 1, 2, ….
Note that (A), being complex, has infinitely many solutions, in contrast to the same equation in real, which has only one solution.
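A quick confirmation of the solution set with `cmath` (tolerances are our choices):

```python
import cmath, math

# e^z = 1 exactly for z = 2n*pi*i and for no other z
for n in range(-3, 4):
    assert abs(cmath.exp(2 * math.pi * n * 1j) - 1) < 1e-12
# a nonzero real part x spoils it, since |e^z| = e^x:
assert abs(abs(cmath.exp(0.1 + 2 * math.pi * 1j)) - 1) > 0.09
print("e^z = 1 exactly on z = 2n*pi*i")
```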
Sec. 13.6 Trigonometric and Hyperbolic Functions. Euler’s Formula
In complex, the exponential, trigonometric, and hyperbolic functions are related by the definitions (1),
p. 633, and (11), p. 635, and by the Euler formula (5), p. 634, as well as by (14) and (15), p. 635. Thus we
can convert them back and forth. Formulas (6) and (7) are needed for computing values. Problem 9 uses
such a formula to compute function values.
1. Real and imaginary parts of cosh z. To prove that
  cosh z = cosh x cos y + i sinh x sin y,
we do the following. We start with the definition of cosh z. Since we want to avoid carrying a factor ½ along, we multiply both sides of (11), p. 635, by 2 and get
  2 cosh z = e^z + e^(−z)
  = e^x (cos y + i sin y) + e^(−x) (cos y − i sin y)
  = cos y (e^x + e^(−x)) + i sin y (e^x − e^(−x))
  = cos y (2 cosh x) + i sin y (2 sinh x)   (by (17), p. A65 of Sec. A3.1 of App. 3)
  = 2 cosh x cos y + 2i sinh x sin y.
Division by 2 on both sides yields the desired result. Note that the formula just proven is useful because it expresses cosh z in terms of its real and imaginary parts.
The related formula for sinh z follows the same proof pattern, this time starting with 2 sinh z = e^z − e^(−z). Fill in the details.
9. Function values. The strategy for Probs. 6–12 is to find formulas in this section or in the problem
set that allow us to get, as an answer, a real number or complex number. For example, the formulas
in Prob. 1 are of the type we want for this kind of problem.
In the present case, by Prob. 1 (just proved before!), we denote the first given complex number by z₁ = −1 + 2i, so that x₁ = −1 and y₁ = 2, and use
  cosh z₁ = cosh(−1 + 2i) = cosh(−1) cos 2 + i sinh(−1) sin 2.
With
  cosh(−1) = (1 + e²)/(2e) = 1.543081  and  sinh(−1) = (1 − e²)/(2e) = −1.175201,
this gives
  cosh(−1 + 2i) = 1.543081·(−0.4161468) + i·(−1.175201)·(0.9092974)
  = −0.642148 − 1.068607i,
which corresponds to the rounded answer on p. A36.
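`cmath.cosh` agrees with the hand computation (rounded as in the text):

```python
import cmath, math

w = cmath.cosh(-1 + 2j)
u = math.cosh(-1) * math.cos(2)    # real part: cosh x cos y
v = math.sinh(-1) * math.sin(2)    # imaginary part: sinh x sin y
print(w)                           # ≈ -0.642148 - 1.068607i
assert abs(w - complex(u, v)) < 1e-12
```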
For the second function value, z₂ = 2 + i, we notice that, by (1), p. 633,
  cos z = ½ (e^(iz) + e^(−iz)).
Now
  iz₂ = i(2 + i) = −1 + 2i = z₁.
Hence
  cos z₂ = ½ (e^(iz₂) + e^(−iz₂)) = ½ (e^(z₁) + e^(−z₁))   [by (A)]
  = cosh z₁
  = cosh(−1 + 2i),
so cos(2 + i) has the same value, −0.642148 − 1.068607i, as before.
13. Equations. We want to show that the complex cosine function is even.
First solution, directly from definition (1), p. 633. We start with
  cos(−z) = ½ (e^(i(−z)) + e^(−i(−z))).
Now
  i(−z) = i(−x − iy) = −ix − i²y = y − ix,
while
  −iz = −i(x + iy) = −ix − i²y = y − ix,
so we have
  i(−z) = −iz.
Similarly,
  −i(−z) = −i(−x − iy) = ix + i²y = −y + ix,
and
  iz = i(x + iy) = ix + i²y = −y + ix,
so that
  −i(−z) = iz.
Putting these two boxed equations to good use, we have
  cos(−z) = ½ (e^(−iz) + e^(iz)) = ½ (e^(iz) + e^(−iz)) = cos z.
Thus cos(−z) = cos z, which means that the complex cosine function (like its real counterpart) is even.
Second solution, by using (6a), p. 634. From that formula we know that
  cos(−z) = cos(−x + i(−y))
  = cos(−x) cosh(−y) − i sin(−x) sinh(−y)
  = cos x cosh y − i (−sin x)(−sinh y)
  = cos x cosh y − i sin x sinh y
  = cos z.
The second equality used that, for real x and y, both cos and cosh are even and sin and sinh are odd, that is,
  cos(−x) = cos x,  cosh(−y) = cosh y,  sin(−x) = −sin x,  sinh(−y) = −sinh y.
Similarly, show that the complex sine function is odd, that is, sin(−z) = −sin z.
17. Equations. To solve the given complex equation, cosh z = 0, we use that, by the first equality in Prob. 1, p. 636, of Sec. 13.6, the given equation is equivalent to the pair of real equations
  cosh x cos y = 0,  sinh x sin y = 0.
Since cosh x ≠ 0 for all x, we must have cos y = 0, hence y = ±(2n + 1)π/2, where n = 0, 1, 2, …. For these y we have sin y ≠ 0, noting that the real cos and sin have no common zeros! Hence sinh x = 0, so that x = 0. Thus our reasoning gives the solution
  z = (x, y) = (0, ±(2n + 1)π/2), that is, z = ±(2n + 1)πi/2, where n = 0, 1, 2, ….
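Numerically, `cmath.cosh` vanishes at exactly these points and nowhere on the real axis (a sketch with our tolerances):

```python
import cmath, math

for n in range(4):
    z = (2 * n + 1) * math.pi * 1j / 2   # odd multiples of pi*i/2
    assert abs(cmath.cosh(z)) < 1e-12
    assert abs(cmath.cosh(-z)) < 1e-12
assert all(math.cosh(x) >= 1 for x in (-2.0, 0.0, 3.5))  # no real zeros
print("cosh z = 0 exactly at z = ±(2n+1)·pi·i/2")
```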
Sec. 13.7 Logarithm. General Power. Principal Value
Note the difference between the real logarithm ln x, which is a function defined for x > 0, and the complex logarithm ln z, which is an infinitely many-valued relation that, by formula (3), p. 637, “decomposes” into infinitely many functions.
5. Principal value. Note that the real logarithm of a negative number is undefined. The principal value Ln z of ln z is defined by (2), p. 637, that is,
  Ln z = ln|z| + i Arg z,
where Arg z is the principal value of arg z. Now recall from (5), p. 614 of Sec. 13.2, that the principal value of the argument of z is defined by
  −π < Arg z ≤ π.
In particular, for a negative real number we always have Arg z = +π, as you should keep in mind. From this, and (2), we obtain the answer.
15. All values of a complex logarithm. We need the absolute value and the argument of z = e^(0.6+0.4i) because, by (1) and (2), p. 637,
  ln z = ln|z| + i arg z.
Now the absolute value of the exponential function e^(iy) with a pure imaginary exponent always equals 1, as you should memorize; the derivation is
  |e^(iy)| = |cos y + i sin y| = √(cos²y + sin²y) = 1.
(Can you see where this calculation would break down if y were not real?) In our case,
  |z| = |e^(0.6)| |e^(0.4i)| = e^(0.6).
The argument of e^(0.4i) is obtained from (10), p. 631 in Sec. 13.5, that is,
  arg(e^(0.4i)) = 0.4 ± 2nπ,  so 0.4 = Arg z.
Next we note that
  e^(0.4i) = e^(0+0.4i) = e⁰ (cos 0.4 + i sin 0.4)   [by (1), p. 630, Sec. 13.5]
  = cos 0.4 + i sin 0.4.
Hence
  z = e^(0.6) e^(0.4i) = e^(0.6) (cos 0.4 + i sin 0.4) = 1.67828 + 0.70957i,
and all values of the logarithm are
  ln z = 0.6 + (0.4 ± 2nπ)i,  n = 0, 1, 2, …,  with principal value Ln z = 0.6 + 0.4i.
23. General powers. Principal value. We start with the given expression and use (8), p. 640, and the definition of principal value to get
  (1 + i)^(1−i) = e^((1−i) Ln(1+i)).
Also
  |1 + i| = √(1² + 1²) = √2
and Arg(1 + i) = π/4, so that
  Ln(1 + i) = ln√2 + iπ/4,
and therefore
  (1 − i) Ln(1 + i) = (1 − i)(ln√2 + iπ/4)
  = ln√2 + iπ/4 − i ln√2 − i²π/4
  = (ln√2 + π/4) + i(π/4 − ln√2).
Thus
  (1 + i)^(1−i) = e^(ln√2 + π/4) (cos(π/4 − ln√2) + i sin(π/4 − ln√2)) = √2 e^(π/4) (cos(π/4 − ln√2) + i sin(π/4 − ln√2)).
Numerical values are
  π/4 − ln√2 = 0.4388246,
  cos(π/4 − ln√2) = cos 0.4388246 = 0.9052517,
  sin(π/4 − ln√2) = sin 0.4388246 = 0.4248757,
  √2 e^(π/4) = 3.1017664,
so that
  (1 + i)^(1−i) = 3.1017664 (0.9052517 + 0.4248757i) ≈ 2.8079 + 1.3179i.
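Python's complex power uses the same principal value, so the result can be verified directly (tolerances are ours):

```python
import cmath, math

z = (1 + 1j) ** (1 - 1j)                  # principal value
byhand = cmath.exp((1 - 1j) * cmath.log(1 + 1j))  # e^{(1-i) Ln(1+i)}
assert abs(z - byhand) < 1e-12
r = math.sqrt(2) * math.exp(math.pi / 4)  # ≈ 3.1017664
t = math.pi / 4 - math.log(math.sqrt(2))  # ≈ 0.4388246
assert abs(z - cmath.rect(r, t)) < 1e-9
print(z)                                  # ≈ 2.8079 + 1.3179i
```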
Chap. 14 Complex Integration
Sec. 14.1 Line Integral in the Complex Plane
A complex line integral over a curve C can be reduced, by (5), p. 645, to two real line integrals, where C is the curve of integration and the resulting integrals are real. (However, having not studied real line integrals is not a hindrance to learning and enjoying complex analysis, as we go in a systematic fashion, with the only prerequisite for Part D being elementary calculus.)
The first practical method of complex integration involves indefinite integration and substitution of limits, and is directly inspired by elementary calculus. It requires that the function be analytic. The details are given in Theorem 1, formula (9), p. 647, and illustrated below in Examples 1–4 and Probs. 23 and 27.
A prerequisite to understanding the second practical method of integration (use of a representation of
a path) is to understand parametrization of complex curves (Examples 1–4, p. 647, Probs. 1, 7, and 19).
Indeed, (10), p. 647, of Theorem 2 is a more general approach than (9) of Theorem 1, because Theorem 2 applies to any continuous complex function, not just analytic functions. However, the price of generality is a slight increase in difficulty.
Problem Set 14.1. Page 651
1. Path. Since the given z(t) = x(t) + iy(t) is linear in the parameter t, the representation is that of a straight line in the complex z-plane. Its slope is positive, that is,
  y(t)/x(t) = 1/2.
The path starts at t = 2 with
  z₀ = z(2) = 2 + i
and ends at t = 5 with
  z₁ = z(5) = 5 + (5/2)i.
Sketch it.
7. Path. We build up the representation stepwise: z(t) = e^(it) (0 ≤ t ≤ 2π) represents a unit circle (i.e., radius 1, center 0) traveled in the counterclockwise direction. Hence
  z(t) = 4e^(it)  (0 ≤ t ≤ π)
represents a semicircle (half circle) of radius 4 with center 0 traversed in the counterclockwise direction. Finally,
  z(t) = 2 + 4e^(it)  (0 ≤ t ≤ π)
is a shift of that semicircle to center 2, corresponding to the answer on p. A36 in App. 2 of the textbook.
Remark. Our solution demonstrates a way of doing mathematics by going from a simpler problem, whose answer we know, to more difficult problems, whose answers we infer from the simple problem.
19. Parametrization. Setting x = t, we get y = 1 − (1/4)x² = 1 − (1/4)t², so the curve can be represented by z(t) = t + i(1 − (1/4)t²). The initial point (at t = −2) is z₀ = −2 + 0i.
Integration of a nonanalytic function. The integrand is
  w = u + iv = f(z) = Re z = x,
which is not analytic. (The second Cauchy–Riemann equation is satisfied, but, of course, that is not enough for analyticity.) Hence we cannot apply the first method (9), p. 647, which would be more convenient, but we must use the second method (10), p. 647.
The shortest path from z₀ = 1 + i to z₁ = 3 + 3i is a straight-line segment with these points as endpoints. Sketch the path. The difference of these points is
  (A) z₁ − z₀ = (3 + 3i) − (1 + i) = 2 + 2i.
The segment is represented by
  (B) z(t) = z₀ + (z₁ − z₀)t,  0 ≤ t ≤ 1,
since z(0) = z₀ and z(1) = z₁, because z₀ cancels when t = 1. Hence (B) is a general representation of a segment with given endpoints z₀ and z₁, with t ranging from 0 to 1. Substituting (A) into (B) and using z₀ = z(0) = 1 + i, we obtain
  (C) z(t) = 1 + i + (2 + 2i)t = 1 + 2t + i(1 + 2t).
Differentiation gives
  ż(t) = dz/dt = 2 + 2i.
By (10), p. 647,
  ∫_C f(z) dz = ∫_a^b f[z(t)] ż(t) dt
  = ∫_0^1 (1 + 2t)(2 + 2i) dt
  = (2 + 2i) ∫_0^1 (1 + 2t) dt.
Now
  ∫ (1 + 2t) dt = ∫ dt + 2 ∫ t dt = t + 2·(t²/2) = t + t²,
so that
  ∫_C f(z) dz = (2 + 2i) [t + t²]₀¹ = (2 + 2i)(1 + 1) = 2(2 + 2i) = 4 + 4i,
which is the final answer on p. A37 in App. 2 of the textbook (with a somewhat different parametrization).
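The second method is easy to mimic with a Riemann sum over the parametrization (a sketch; the midpoint rule and the choice of n are ours):

```python
z0, z1 = 1 + 1j, 3 + 3j
dz = z1 - z0                     # constant z'(t) on a segment
n = 10_000
total = 0j
for k in range(n):
    t = (k + 0.5) / n            # midpoint rule on [0, 1]
    z = z0 + dz * t
    total += z.real * dz / n     # f[z(t)] z'(t) dt with f(z) = Re z
print(total)                     # ≈ 4 + 4i
```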
23. Integration by the first method (Theorem 1, p. 647). From (3), p. 630 of Sec. 13.5 of the text, we know that e^z is analytic. Hence we may use indefinite integration and substitution of upper and lower limits. We have
  ∫ e^z dz = e^z + const   [by (2), p. 630],
so that
  (I1) ∫ from πi to 2πi of e^z dz = [e^z] from πi to 2πi = e^(2πi) − e^(πi).
Hence, since e^(2πi) = 1 and e^(πi) = −1, the integral equals 1 − (−1) = 2.
27. Integration by the first method (Theorem 1, p. 647). The integrand sec² z is analytic except at the points where cos z is 0 [see Example 2(b), pp. 634–635 of the textbook]. Since these points lie off the path, we can use tan z as an antiderivative:
  (I2) ∫ from π/4 to πi/4 of sec² z dz = [tan z] from π/4 to πi/4 = tan(πi/4) − tan(π/4).
Also, tan(π/4) = 1 and
  tan(πi/4) = sin(πi/4)/cos(πi/4) = (i sinh(π/4))/cosh(π/4) = i tanh(π/4),
where we used
  sin iz = i sinh z  and  cos iz = cosh z
with z = π/4. A numeric value, to six significant digits, of the desired real hyperbolic tangent is 0.655794. Hence (I2) evaluates to
  i tanh(π/4) − 1 = −1 + 0.655794i.
Remember that the real hyperbolic tangent varies between −1 and 1, as can be inferred from the behavior of the curves of sinh x and cosh x in Fig. 551 and confirmed in Fig. 552, p. A65 (in Part A3.1 of App. 3 of the textbook).
Sec. 14.2 Cauchy's Integral Theorem
Cauchy's integral theorem states that
  ∮_C f(z) dz = 0
if f(z) is analytic and C is a simple closed path, both living in a complex domain D that is simply connected. The little circle on the integral sign marks a contour integral.
Take a look at Fig. 345, p. 652, for the meaning of simple closed path and Fig. 346, p. 653, for a simply connected domain. In its basic form, Theorem 1 (Cauchy's integral theorem) requires that the path not touch itself (a circle, an ellipse, a rectangle, etc., but not a figure 8) and that it lie inside a domain D that has no holes (see Fig. 347, p. 653).
You have to memorize Cauchy’s integral theorem. Not only is this theorem important by itself, as a
main instrument of complex integration, it also has important implications explored further in this
section as well as in Secs. 14.3 and 14.4.
Other highlights in Sec. 14.2 are path independence (Theorem 2, p. 655), deformation of path (p. 656,
Example 6, Prob. 11), and extending Cauchy’s theorem to multiply connected domains (pp. 658–659).
We show where we can use Cauchy’s integral theorem (Examples 1 and 2, p. 653, Probs. 9 and 13) and
where we cannot use the theorem (Examples 3 and 5, pp. 653–654, Probs. 11 and 23). Often the
decision hinges on the location of the points at which the integrand f.z/ is not analytic. If the points lie
inside C (Prob. 23) then we cannot use Theorem 1 but use integration methods of Sec. 14.1. If the points
lie outside C (Prob. 13) we can use Theorem 1.
9. Cauchy's integral theorem is applicable, since f(z) = e^(−z²) is analytic for all z, and thus entire (see p. 630 in Sec. 13.5 of the textbook). Hence, by Cauchy's theorem (Theorem 1, p. 653),
  ∮_C e^(−z²) dz = 0,  with C the unit circle, counterclockwise.
11. Cauchy's integral theorem (Theorem 1, p. 653) is not applicable. Deformation of path. We see that 2z − 1 = 0 at z = 1/2. Hence, at this point, the function
  f(z) = 1/(2z − 1)
is not analytic. Since z = 1/2 lies inside the contour of integration (the unit circle), Cauchy's theorem is not applicable. Hence we have to integrate by the use of path. However, we can choose a most convenient path by applying the principle of deformation of path, described on p. 656 of the textbook. This allows us to move the given unit circle e^(it) so that it is centered at 1/2. We obtain the path C given by
  z(t) = 1/2 + e^(it),  0 ≤ t ≤ 2π.
Differentiation gives ż(t) = ie^(it). Using the second evaluation method (Theorem 2, p. 647, of Sec. 14.1), we get
  ∫_C f(z) dz = ∫_a^b f[z(t)] ż(t) dt   [by (10), p. 647]
  = ∫_0^(2π) (1/(2e^(it))) ie^(it) dt
  = (i/2) ∫_0^(2π) dt
  = (i/2)·2π
  = πi.
Note that the answer also follows directly from (3), p. 656, with m = −1 and z₀ = 1/2.
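The value πi can also be confirmed with a Riemann sum over the original unit circle, no deformation needed (a sketch; n is an arbitrary choice, and convergence is fast because the integrand is smooth on the path):

```python
import cmath, math

n = 4096
total = 0j
for k in range(n):
    t = 2 * math.pi * (k + 0.5) / n
    z = cmath.exp(1j * t)                # the original unit circle
    dz = 1j * z * (2 * math.pi / n)      # z'(t) dt
    total += dz / (2 * z - 1)
print(total)                             # ≈ pi*i
```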
13. Nonanalytic outside the contour. To solve the problem, we consider z⁴ − 1.1 = 0, so that
z⁴ = 1.1. By (15), p. 617 of Sec. 13.2, the four roots are

    z = r [cos(2kπ/4) + i sin(2kπ/4)],   k = 0, 1, 2, 3,   r = ⁴√1.1.

Since z₀, z₁, z₂, z₃ all lie on the circle with center (0, 0) and radius r = ⁴√1.1 = 1.0241 > 1, they are
outside the given unit circle C. Hence f(z) is analytic on and inside the unit circle C. Hence Cauchy’s
integral theorem applies and gives us

    ∮_C f(z) dz = ∮_C 1/(z⁴ − 1.1) dz = 0.
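A quick numeric confirmation (my own sketch; the helper and step count are assumptions): the roots indeed have modulus above 1, and the integral over the unit circle vanishes.

```python
import cmath
import math

# The four roots of z^4 = 1.1 all have modulus 1.1**(1/4) ≈ 1.0241 > 1,
# so 1/(z^4 - 1.1) is analytic on and inside the unit circle.
roots = [1.1 ** 0.25 * cmath.exp(1j * math.pi * k / 2) for k in range(4)]
print([abs(r) for r in roots])  # each ≈ 1.0241

def circle_integral(f, center, radius, n=4000):
    # Rectangle-rule approximation of the counterclockwise circle integral.
    h = 2 * math.pi / n
    return sum(f(center + radius * cmath.exp(1j * k * h))
               * 1j * radius * cmath.exp(1j * k * h) for k in range(n)) * h

val = circle_integral(lambda z: 1 / (z ** 4 - 1.1), 0, 1)
print(abs(val))  # close to 0, as Cauchy's integral theorem predicts
```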
    ∮_C (2z − 1)/(z² − z) dz,   where C is as given in the accompanying figure on p. 659.

We use partial fractions (given hint) on the integrand. We note that the denominator of the
integrand factors into z² − z = z(z − 1), so that we write

    (2z − 1)/(z² − z) = A/z + B/(z − 1).

Chap. 14 Complex Integration 299

Multiplying the expression by z and then substituting z = 0 gives the value for A:

    (2z − 1)/(z − 1) = A + Bz/(z − 1);   at z = 0:   (−1)/(−1) = A + 0,   A = 1.

Multiplying instead by z − 1 and then substituting z = 1 gives B:

    (2z − 1)/z = A(z − 1)/z + B;   at z = 1:   1 = 0 + B,   B = 1.

Hence

    (2z − 1)/(z(z − 1)) = 1/z + 1/(z − 1).

The integrand is not analytic at z = 0 and z = 1, which clearly lie inside C. Hence Cauchy’s integral
theorem, p. 653, does not apply. Instead we use (3), p. 656, with m = 1 for the two integrands obtained
by partial fractions, with z₀ = 0 in the first integral and z₀ = 1 in the second. Hence we get

    ∮_C (2z − 1)/(z² − z) dz = ∮_C (1/z) dz + ∮_C 1/(z − 1) dz = 2πi + 2πi = 4πi.
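The figure on p. 659 is not reproduced here, but any simple closed path enclosing both z = 0 and z = 1 gives the same value. As a numeric sanity check (my own sketch; the circle |z − 1/2| = 1 is a stand-in contour, and the helper name is an assumption):

```python
import cmath
import math

def circle_integral(f, center, radius, n=4000):
    # Rectangle-rule approximation of the counterclockwise circle integral.
    h = 2 * math.pi / n
    return sum(f(center + radius * cmath.exp(1j * k * h))
               * 1j * radius * cmath.exp(1j * k * h) for k in range(n)) * h

# The circle |z - 1/2| = 1 encloses both singularities z = 0 and z = 1.
val = circle_integral(lambda z: (2 * z - 1) / (z ** 2 - z), 0.5, 1)
print(val)  # close to 4*pi*i
```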
    (1)   ∮_C f(z)/(z − z₀) dz = 2πi f(z₀),

with an integrand

    g(z) = f(z)/(z − z₀),   with f(z) analytic,

300 Complex Analysis Part D

so that

    f(z) = (z − z₀) g(z).
The next task consists of identifying where the point z₀ lies with respect to the contour C of
integration. If z₀ lies inside C (and the conditions of Theorem 1 are satisfied), then (1) is applied directly
(Examples 1 and 2, p. 661). If z₀ lies outside C, then we use Cauchy’s integral theorem of Sec. 14.2 (Prob.
3). We extend our discussion to several points at which g(z) is not analytic.
Example 3, pp. 661–662, and Probs. 1 and 11 illustrate that the evaluation of (1) depends on the
location of the points at which g(z) is not analytic, relative to the contour of integration. The section
ends with multiply connected domains (3), p. 662 (Prob. 19).
1. Contour integration by Cauchy’s integral formula (1), p. 660. The contour |z + 1| = 1 can be
written as |z − (−1)| = 1. Thus, it is a circle of radius 1 with center −1. The given function to be
integrated is

    g(z) = z²/(z² − 1).

Our first task is to find out where g(z) is not analytic. We consider

    z² − 1 = 0,   so that z² = 1,   giving z = 1 and z = −1.

Our next task is to find out which of these two values lies inside the contour and to make sure that
neither of them lies on the contour (a case we would not yet be able to handle). The value z = 1 lies
outside the circle (contour) and z = −1 lies inside the contour. We have

    g(z) = z²/(z² − 1) = z²/((z − 1)(z + 1)).

Also

    g(z) = z²/(z² − 1) = f(z)/(z − z₀) = f(z)/(z − (−1)).

Together

    f(z)/(z + 1) = z²/((z + 1)(z − 1)),   hence   f(z) = z²/(z − 1),

and

    ∮_C z²/(z² − 1) dz = ∮_C f(z)/(z − z₀) dz   [in the form (1), p. 660]
                       = ∮_C [z²/(z − 1)]/(z − (−1)) dz   [note z₀ = −1]
                       = 2πi f(z₀) = 2πi f(−1)
                       = 2πi · (−1/2)
                       = −πi.
    |z₀ − z₁| = |i − 1| = |−1 + i| = √(x² + y²) = √((−1)² + 1²) = √2 > 1.4.

Hence both points z = ±1 lie outside the contour, and

    ∮_C z²/(z² − 1) dz = 0   [by setting f(z) = z²/(z − 1) in (1)],

since the integrand is analytic on and inside C.
integrand is

    g(z) = 1/(z² + 4).

We consider z² + 4 = 0, so that z = ±2i. Hence the points at which g(z) is not analytic are z = 2i and
z = −2i.
To see whether these points lie inside the contour C, we write z = 2i = x + yi, so that
x = 0 and y = 2; one checks that z = 2i lies inside C, whereas z = −2i lies outside. We write

    g(z) = 1/(z² + 4) = f(z)/(z − z₀) = f(z)/(z − 2i).

Together

    f(z)/(z − 2i) = 1/(z² + 4) = 1/((z + 2i)(z − 2i)),

where

    f(z) = 1/(z + 2i).

Cauchy’s integral formula gives us

    ∮_C dz/(z² + 4) = ∮_C f(z)/(z − z₀) dz   [by (1), p. 660]
                   = ∮_C [1/(z + 2i)]/(z − 2i) dz
                   = 2πi f(z₀)
                   = 2πi f(2i)
                   = 2πi · 1/(2i + 2i)
                   = 2πi · 1/(4i)
                   = π/2.
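As a numeric check (my own sketch; a small circle about 2i — an assumed stand-in for the book’s contour — encloses 2i but not −2i):

```python
import cmath
import math

def circle_integral(f, center, radius, n=4000):
    # Rectangle-rule approximation of the counterclockwise circle integral.
    h = 2 * math.pi / n
    return sum(f(center + radius * cmath.exp(1j * k * h))
               * 1j * radius * cmath.exp(1j * k * h) for k in range(n)) * h

# The circle |z - 2i| = 1 encloses z = 2i but not z = -2i.
val = circle_integral(lambda z: 1 / (z ** 2 + 4), 2j, 1)
print(val)  # close to pi/2 (a real number)
```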
13. Contour integral. We use Cauchy’s integral formula. The integral is of the form (1), p. 660, with
z − z₀ = z − 2, hence z₀ = 2. Also, f(z) = z + 2 is analytic, so that we can use (1) and calculate

    2πi f(2) = 2πi · 4 = 8πi.
19. Annulus. We have to find the points in the annulus 1 < |z| < 3 at which

    g(z) = e^(z²)/(z²(z − 1 − i)) = e^(z²)/(z²[z − (1 + i)])

is not analytic. We see that z = 1 + i is such a point in the annulus. Another point is z = 0, but this is
not in the annulus, that is, not between the circles, but in the “hole.” Hence we calculate

    f(z) = [z − (1 + i)] g(z) = e^(z²)/z²,

and the integral equals

    2πi f(1 + i) = 2πi · e^((1+i)²)/(1 + i)² = 2πi · e^(2i)/(2i) = π e^(2i),

since (1 + i)² = 2i.
1. Contour integration. Use of a third derivative. Using (1), p. 664, we see that the given function is

    sin z/z⁴ = f(z)/(z − z₀)^(n+1),   with f(z) = sin z,   z₀ = 0,   and n + 1 = 4.

By (1), p. 664,

    (A)   ∮_C f(z)/(z − z₀)⁴ dz = (2πi/3!) f⁽³⁾(z₀).

Since f(z) = sin z, we have f′(z) = cos z, f″(z) = −sin z, f⁽³⁾(z) = −cos z. Furthermore z₀ = 0 and

    f⁽³⁾(z₀) = −cos 0 = −1.

Hence

    ∮_C (sin z/z⁴) dz = (2πi/3!) · (−1) = −(2πi)/(3 · 2 · 1) = −πi/3.
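A numeric check of this value (my own sketch; the helper and step count are assumptions):

```python
import cmath
import math

def circle_integral(f, center, radius, n=4000):
    # Rectangle-rule approximation of the counterclockwise circle integral.
    h = 2 * math.pi / n
    return sum(f(center + radius * cmath.exp(1j * k * h))
               * 1j * radius * cmath.exp(1j * k * h) for k in range(n)) * h

# sin z / z^4 over the unit circle; (1) with n = 3 predicts -pi*i/3.
val = circle_integral(lambda z: cmath.sin(z) / z ** 4, 0, 1)
print(val)  # close to -pi*i/3
```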
5. Contour integration. This is similar to Prob. 1. Here the denominator of the function to be
integrated is (z − 1/2)⁴, and (z − 1/2)⁴ = 0 gives z₀ = 1/2, which lies inside the unit circle. To use
Theorem 1, p. 664, we need the third derivative of cosh 2z. We have, by the chain rule,

    f(z) = cosh 2z,
    f′(z) = 2 sinh 2z,
    f″(z) = 4 cosh 2z,
    f⁽³⁾(z) = 8 sinh 2z.

We evaluate the last equality at z₀ = 1/2 and get

    f⁽³⁾(1/2) = 8 sinh(2 · 1/2) = 8 sinh 1 = 9.40161.

Thus

    ∮_C cosh 2z/(z − 1/2)⁴ dz = (2πi/3!) · 9.40161 = (π/3) · 9.40161 · i = 9.84534 i.
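A numeric check of this value (my own sketch; the helper and step count are assumptions):

```python
import cmath
import math

def circle_integral(f, center, radius, n=4000):
    # Rectangle-rule approximation of the counterclockwise circle integral.
    h = 2 * math.pi / n
    return sum(f(center + radius * cmath.exp(1j * k * h))
               * 1j * radius * cmath.exp(1j * k * h) for k in range(n)) * h

# cosh 2z / (z - 1/2)^4 over the unit circle; (1) predicts (2*pi*i/3!) * 8*sinh(1).
val = circle_integral(lambda z: cmath.cosh(2 * z) / (z - 0.5) ** 4, 0, 1)
expected = complex(0, 8 * math.pi * math.sinh(1) / 3)
print(val, expected)  # both ≈ 9.84534i
```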
Next consider

    ∮_C (tan πz)/z² dz,   C the ellipse x²/(1/4)² + y² = 1, traversed clockwise.

The first derivative will occur because the given function is (tan πz)/z², so that (1), p. 664, applies
with n + 1 = 2, hence n = 1, z₀ = 0, and f(z) = tan πz. The points z = ±1/2, ±3/2, … at which tan πz is
not analytic lie outside the ellipse. Now

    f′(z) = π/cos² πz,   so that   f′(0) = π.

Hence (1), p. 664, gives the value of the integral in the counterclockwise direction, that is,

    (B)   2πi f′(0) = 2πi · π = 2π²i.

Since the contour is to be traversed in the clockwise direction, we obtain a minus sign in result (B)
and get the final answer −2π²i.
Next consider

    ∮_C (Ln z)/(z − 2)² dz,   C: |z − 3| = 2, traversed counterclockwise.

We see that the given integrand is Ln z/(z − 2)² and the contour of integration is a circle of radius 2
with center 3. At 0 and on the ray of the negative real axis, the function Ln z is not analytic, and it is essential
that these points lie outside the contour. Otherwise, that is, if that ray intersected or touched the
contour, we would not be able to integrate. Fortunately, in our problem, the circle lies entirely to the
right of these points.
In view of the fact that the integrand is not analytic at z = z₀ = 2, which lies inside the contour,
then, according to (1), p. 664, with n + 1 = 2, hence n = 1, and z₀ = 2, the integral equals 2πi times
the value of the first derivative of Ln z evaluated at z₀ = 2. The derivative of Ln z is

    (Ln z)′ = 1/z,

which, evaluated at z = z₀ = 2, is 1/2. This gives a factor 1/2 to the result, so that the final answer is

    2πi · 1/2 = πi.
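A numeric check (my own sketch; `cmath.log` returns the principal value Ln z, and the circle |z − 3| = 2 stays in the right half-plane, away from the branch cut):

```python
import cmath
import math

def circle_integral(f, center, radius, n=4000):
    # Rectangle-rule approximation of the counterclockwise circle integral.
    h = 2 * math.pi / n
    return sum(f(center + radius * cmath.exp(1j * k * h))
               * 1j * radius * cmath.exp(1j * k * h) for k in range(n)) * h

# Ln z / (z - 2)^2 over |z - 3| = 2; (1) with n = 1 predicts 2*pi*i * (1/2) = pi*i.
val = circle_integral(lambda z: cmath.log(z) / (z - 2) ** 2, 3, 2)
print(val)  # close to pi*i
```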
Chap. 15 Power Series, Taylor Series
We shift our studies from complex functions to power series of complex functions, marking the beginning
of another distinct approach to complex integration. It is called “residue integration” and relies on
generalized Taylor series—topics to be covered in Chap. 16. However, to properly understand these topics,
we have to start with power series and Taylor series, which are the themes of Chap. 15.
The second approach to complex integration, based on residues, owes much to Weierstrass (see
footnote 5, p. 703 in the textbook), Riemann (see footnote 4, p. 625 in Sec. 13.4), and others. Weierstrass,
in particular, championed the use of power series in complex analysis and left a distinct mark on the field
through teaching it to his students (who took good lecture notes for posterity; indeed we own such a
handwritten copy) and his relatively few but important publications during his lifetime. (His collected work
is much larger, as it also contains unpublished material.)
The two approaches to complex integration coexist and should not be a source of confusion. (For more
on this topic turn to p. x of the Preface of the textbook and read the first paragraph.)
We start with convergence tests for complex series, which are quite similar to those for real series.
Indeed, if you have a good understanding of real series, Sec. 15.1 may be a review and you could move on
to the next section on power series and their radius of convergence. We learn that complex power series
represent analytic functions (Sec. 15.3) and that, conversely, every analytic function can be represented by
a power series in terms of a (complex) Taylor series (Sec. 15.4). Moreover, we can generate new power
series from old power series (of analytic functions) by termwise differentiation and termwise integration.
We conclude our study with uniform convergence.
From calculus, you want to review sequences and series and their convergence tests. You should
remember analytic functions and Cauchy’s integral formula (1), p. 660 in Sec. 14.3. A knowledge of how to
calculate real Taylor series is helpful for Sec. 15.4. The material is quite hands-on in that you will construct
power series and calculate their radii of convergence.
Problem: show that the sequence

    z_n = n/(4 + 2ni)

converges and find its limit.

First solution method:

    z_n = n/(4 + 2ni)
        = [n/(4 + 2ni)] · [(4 − 2ni)/(4 − 2ni)]   [by (7), p. 610 of Sec. 13.1]
        = n(4 − 2ni)/(4² + (2n)²)
        = 4n/(4² + 4n²) − i · 2n²/(16 + 4n²)
        = 4n/(16 + 4n²) − i · n²/(8 + 2n²).

We have just written z_n in the form

    z_n = x_n + iy_n.

By Theorem 1, p. 672, we treat each of the sequences {x_n} and {y_n} separately when characterizing the
behavior of {z_n}. Thus

    lim_{n→∞} x_n = lim_{n→∞} 4n/(16 + 4n²)
                  = lim_{n→∞} (1/n)/(4/n² + 1)   (divide numerator and denominator by 4n²)
                  = 0/(0 + 1) = 0.

Furthermore,

    lim_{n→∞} y_n = lim_{n→∞} [−n²/(8 + 2n²)]
                  = lim_{n→∞} [−1/(8/n² + 2)]   (divide numerator and denominator by n²)
                  = −1/(0 + 2) = −1/2.

Hence the sequence converges to

    0 − i/2 = −i/2.
Second solution method (as given on p. A38): divide numerator and denominator by n:

    z_n = n/(4 + 2ni) = 1/(4/n + 2i).

Now

    lim_{n→∞} z_n = lim_{n→∞} 1/(4/n + 2i) = 1/(0 + 2i) = 1/(2i) = −i/2,

where we used 1/i = −i. Since the sequence converges, it is also bounded.
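A quick numeric illustration of this limit (my own sketch; the function name is an assumption):

```python
# The terms z_n = n/(4 + 2ni) approach -i/2: the real part decays like 1/n
# and the imaginary part tends to -1/2.
def z(n):
    return n / complex(4, 2 * n)

for n in (10, 1000, 100000):
    print(n, z(n))  # real part -> 0, imaginary part -> -0.5
```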
Next consider the sequence z_n = (−1)ⁿ + 10i. We have

    |z_n| = |(−1)ⁿ + 10i| = √([(−1)ⁿ]² + 10²) = √(1 + 100) = √101 < 11,

so the sequence is bounded. For odd subscripts the terms are −1 + 10i and for even subscripts 1 + 10i. The sequence has two
limit points, −1 + 10i and 1 + 10i, but, by the definition of convergence (p. 672), a convergent sequence can have only
one. Hence the sequence {z_n} diverges.
9. Sequence. Calculate

    |z_n| = |(0.9 + 0.1i)^(2n)| = (|0.9 + 0.1i|²)ⁿ = (0.81 + 0.01)ⁿ = 0.82ⁿ → 0   as n → ∞.

Hence the sequence converges, with limit 0.
13. Bounded complex sequence. To verify the claim of this problem, we first have to show that:
(i) If a complex sequence is bounded, then the two corresponding sequences of real parts and
imaginary parts are also bounded.
Proof of (i). Let {z_n} be an arbitrary complex sequence that is bounded. This means that there is a
constant K such that |z_n| < K for all n. Write

    z_n = x_n + i y_n,

so that

    |z_n|² = x_n² + y_n².
Now

    x_n² ≤ x_n² + y_n² = |z_n|²,   since x_n² ≥ 0, y_n² ≥ 0.

Furthermore,

    x_n² = |x_n|²,   since x_n is real.

Thus

    |x_n|² ≤ |z_n|²,   hence   |x_n| ≤ |z_n|,

so that

    |x_n| < K.

Similarly,

    |y_n|² = y_n² ≤ |z_n|² < K²,   so that   |y_n| < K.
Since n was arbitrary, we have shown that {x_n} and {y_n} are bounded by the constant K. Next we
have to show that:
(ii) If the two sequences of real parts and imaginary parts are bounded, then the complex sequence is also
bounded.
Proof of (ii). Let {x_n} and {y_n} be bounded sequences of the real parts and imaginary parts, respectively.
This means that there is a constant L such that

    |x_n| < L/√2,   |y_n| < L/√2.

Then

    |x_n|² < L²/2,   |y_n|² < L²/2,

so that

    |z_n|² = x_n² + y_n² < L²/2 + L²/2 = L²,   hence   |z_n| < L.
Next we test the series Σ_{n=1}^∞ iⁿ/(n² − i) for convergence. Its terms satisfy

    |z_n| = |iⁿ/(n² − i)|
          = |iⁿ|/|n² − i|
          = 1/√(n⁴ + 1)   [by (3), p. 613 in Sec. 13.2]
          < 1/√(n⁴) = 1/n².

Since

    Σ_{n=1}^∞ 1/n²   converges [see p. 677 in the proof of (c) of Theorem 8],

we conclude, by the comparison test, p. 675, that the series given in this problem also converges.
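A numeric illustration of this convergence (my own sketch; the function name and cutoffs are assumptions): since the tail after N terms is below Σ_{n>N} 1/n² < 1/N, successive partial sums agree closely.

```python
# Partial sums of sum_{n>=1} i^n/(n^2 - i); absolute convergence by comparison
# with sum 1/n^2 means partial sums settle down quickly.
def partial(N):
    return sum(1j ** n / (n * n - 1j) for n in range(1, N + 1))

s1, s2 = partial(2000), partial(4000)
print(s1, abs(s2 - s1))  # the difference is below the tail bound 1/2000
```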
23. Series convergent? Ratio test. We apply Theorem 8, p. 677. First we form the ratio z_{n+1}/z_n and
simplify algebraically. Since

    z_n = (−1)ⁿ (1 + i)^(2n+1)/(2n)!,

the test ratio is

    z_{n+1}/z_n = [(−1)^(n+1) (1 + i)^(2(n+1)+1)/(2(n + 1))!] / [(−1)ⁿ (1 + i)^(2n+1)/(2n)!]
                = (−1) (1 + i)^(2n+3)/(1 + i)^(2n+1) · (2n)!/(2n + 2)!
                = (−1) (1 + i)²/((2n + 2)(2n + 1))
                = (−1) (2i)/((2n + 2)(2n + 1)),

since (1 + i)² = 2i. Then we take the absolute value of the ratio and simplify by (3), p. 613, of Sec. 13.2:

    |z_{n+1}/z_n| = |2i|/((2n + 2)(2n + 1)) = 2/((2n + 2)(2n + 1)).

Hence

    lim_{n→∞} |z_{n+1}/z_n| = lim_{n→∞} 2/((2n + 2)(2n + 1)) = L = 0,

because

    lim_{n→∞} 1/((n + 1)(2n + 1)) = lim_{n→∞} 1/(n + 1) · lim_{n→∞} 1/(2n + 1) = 0 · 0 = 0.

Thus, by the ratio test (Theorem 8), the series converges absolutely and hence converges.
Since analytic functions can be represented by infinite power series (1), p. 680, such series are very
important to complex analysis, much more so than in calculus. Here z₀, called the center of the series,
can be any complex number (once chosen, it is fixed). When z₀ = 0, we get (2), p. 680. An example is

    (E)   e^z = 1 + z/1! + z²/2! + ⋯ .

More on this in Sec. 15.4. We want to know where (1) converges and use the Cauchy–Hadamard formula
(6), p. 683, in Theorem 2 to determine the radius of convergence R, that is,

    (6)   R = lim_{n→∞} |a_n/a_{n+1}|   [remember that the (n + 1)st term is in the denominator!].

Formula (6) shows that the radius of convergence is the limit of the quotient |a_n/a_{n+1}| (if it exists).
This in turn is the reciprocal of the limit L of the quotient |a_{n+1}/a_n| in the ratio test (Theorem 8,
p. 677). This is understandable; if L is small, then its reciprocal, the radius of convergence R, will be large. The
following table characterizes (6).
R = c (c a constant: real, positive)   Convergence in the disk |z − z₀| < c   Ex. 5, p. 683, Prob. 13
The given series is in powers of z, and its center is 0. We use the Cauchy–Hadamard formula (6), p. 683, to
determine the radius of convergence R. Forming the quotient of successive coefficients, we find that, as n → ∞,

    |a_n/a_{n+1}| = (2n + 2)(2n + 1) → ∞.

Hence

    R = ∞.

This means that the series converges everywhere; see Example 2, p. 680, and the top of p. 683, of
the textbook.
Note that our series is of the form

    Σ_{n=0}^∞ a_n z^(2n).

Had the limit of |a_n/a_{n+1}| been a finite value R instead, the radius of convergence of the given
series would have been √R (see the next problem).
Remark. Plausibility of result. From regular calculus you may recognize the real series

    Σ_{n=0}^∞ [(−1)ⁿ/(2n)!] (x/2)^(2n) = cos(x/2),

the Taylor series for cos(x/2). The complex analog is cos(z/2). Since the complex cosine
function is an entire function, it has an infinite radius of convergence.
Next consider

    Σ_{n=0}^∞ 16ⁿ (z + i)^(4n).

Since z + i = z − (−i), the center of the series is −i. We can write the series as

    Σ_{n=0}^∞ 16ⁿ tⁿ,   where   (A) t = (z + i)⁴.

We use the Cauchy–Hadamard formula (6), p. 683, to determine the radius of convergence R_t [where
the subscript t refers to the substitution (A)]:

    |a_n/a_{n+1}| = 16ⁿ/16^(n+1) = 1/16 = R_t.

This is the radius of convergence of the given series regarded as a series in t. From (A), convergence
requires

    |t| = |z + i|⁴ < 1/16,   that is,   |z + i| < (1/16)^(1/4) = 1/2.

Hence the series converges in the open disk |z + i| < 1/2 of radius 1/2 and center −i.
Next consider

    Σ_{n=0}^∞ [(2n)!/(4ⁿ (n!)²)] (z − 2i)ⁿ,

with coefficients a_n = (2n)!/(4ⁿ (n!)²). Then

    a_n/a_{n+1} = [(2n)!/(4ⁿ (n!)²)] · [4^(n+1) ((n + 1)!)²/(2n + 2)!]
                = 4(n + 1)²/((2n + 2)(2n + 1))
                = 2(n + 1)/(2n + 1).

Hence, putting the fractions together and simplifying further,

    R = lim_{n→∞} (2n + 2)/(2n + 1) = lim_{n→∞} (2 + 2/n)/(2 + 1/n) = (2 + 0)/(2 + 0) = 1.

Thus the series converges in the open disk |z − 2i| < 1 of radius R = 1 and center 2i.
5. Radius of convergence by differentiation: Theorem 3, p. 687. We start with the geometric series

    (A)   g(z) = Σ_{n=0}^∞ [(z − 2i)/2]ⁿ = 1 + (z − 2i)/2 + [(z − 2i)/2]² + [(z − 2i)/2]³ + ⋯,

which converges for |(z − 2i)/2| < 1, that is, |z − 2i| < 2. Termwise differentiation gives

    (B)   g′(z) = 0 + 1/2 + (2/2²)(z − 2i) + (3/2³)(z − 2i)² + ⋯
                = Σ_{n=1}^∞ n(z − 2i)^(n−1)/2ⁿ,   where |z − 2i| < 2.

By Theorem 3, the differentiated series has the same radius of convergence, R = 2.
Complete the problem by verifying the result by the Cauchy–Hadamard formula (6), p. 683, in
Sec. 15.2.
9. Radius of convergence by integration: Theorem 4, p. 688. We start with the geometric series (see
Example 1, p. 680), which has radius of convergence 1:

    Σ_{n=0}^∞ wⁿ = 1 + w + w² + w³ + ⋯,   |w| < 1.

Hence,

    Σ_{n=0}^∞ (−2w)ⁿ = 1 − 2w + 4w² − 8w³ + ⋯,   |w| < 1/2,

and then,

    Σ_{n=1}^∞ (−2w)ⁿ = −2w + 4w² − 8w³ + ⋯,   |w| < 1/2.

Setting w = z² gives

    (E)   Σ_{n=1}^∞ (−2)ⁿ z^(2n) = −2z² + 4z⁴ − 8z⁶ + ⋯,

valid for |z²| < 1/2 and hence for |z| < 1/√2.
Our aim is to produce the series given in the problem. We observe that the desired series has
factors n + 2, n + 1, and n in the denominator of its coefficients. This suggests that we should use
three integrations. We use Theorem 4, p. 688, to justify termwise integration. We divide (E) by z:

    −2z + 4z³ − 8z⁵ + ⋯ = Σ_{n=1}^∞ (−2)ⁿ z^(2n−1),

and integrate termwise,

    ∫(−2z) dz = −2 z²/2,   ∫4z³ dz = 4 z⁴/4,   ∫(−8z⁵) dz = −8 z⁶/6,   …,

which gives

    Σ_{n=1}^∞ (−2)ⁿ z^(2n)/(2n),   where |z| < 1/√2.

However, we want the factor 1/n, so we multiply the result by 2, that is,

    (F)   2 Σ_{n=1}^∞ (−2)ⁿ z^(2n)/(2n) = Σ_{n=1}^∞ (−2)ⁿ z^(2n)/n.

Next we aim for the factor 1/(n + 1). We multiply the series obtained in (F) by z,

    Σ_{n=1}^∞ (−2)ⁿ z^(2n+1)/n,

integrate termwise using

    ∫ z^(2n+1) dz = z^(2n+2)/(2n + 2),

and multiply the result by 2 [to obtain precisely the factor 1/(n + 1)]:

    (G)   2 Σ_{n=1}^∞ (−2)ⁿ z^(2n+2)/(n(2n + 2)) = Σ_{n=1}^∞ (−2)ⁿ z^(2n+2)/(n(n + 1)).

We multiply the right-hand side of (G) by z,

    Σ_{n=1}^∞ (−2)ⁿ z^(2n+3)/(n(n + 1)),

integrate termwise using

    ∫ z^(2n+3) dz = z^(2n+4)/(2n + 4),

and multiply by 2, which yields

    Σ_{n=1}^∞ (−2)ⁿ z^(2n+4)/(n(n + 1)(n + 2)).

However, our desired series is in powers of z^(2n) instead of z^(2n+4). Thus we divide by z⁴ and get

    (H)   (1/z⁴) Σ_{n=1}^∞ (−2)ⁿ z^(2n+4)/(n(n + 1)(n + 2)) = Σ_{n=1}^∞ (−2)ⁿ z^(2n)/(n(n + 1)(n + 2)).

But this is precisely the desired series. Since our derivation from (E) to (H) did not change the radius
of convergence (Theorem 4), we conclude that the series given in this problem converges for
|z| < 1/√2, that is, it has center 0 and radius 1/√2.
Do part (a) of the problem, that is, obtain the answer by (6), p. 683.
10, 13, 14, 18 Hint. For Problems 10, 13, 14, and 18, the notation for the coefficients is
explained on pp. 1026–1028 of Sec. 24.4 of the textbook.
17. Odd functions. The even-numbered coefficients in (2), p. 685, are zero because f(−z) = −f(z)
implies

    a_{2m}(−z)^(2m) = a_{2m}(−1)^(2m) z^(2m) = a_{2m} z^(2m),

while oddness requires this term to equal −a_{2m} z^(2m). But

    a_{2m} z^(2m) = −a_{2m} z^(2m)

means

    a_{2m} = −a_{2m},

so that a_{2m} = 0.
Every analytic function f(z) can be represented by a Taylor series (Theorem 1, p. 691), and a general way of
doing so is given by (1) and (2), p. 690. It would be useful if you knew some Taylor series, such as those for e^z
[see (12), p. 694] and for sin z and cos z [(14), p. 695]. Also important is the geometric series (11) in Example 1
and Prob. 19. The section ends with practical methods to develop power series by substitution,
integration, geometric series, and binomial series with partial fractions (pp. 695–696, Examples 5–8, Prob.
3).
Example 2, p. 694, shows the Maclaurin series of the exponential function. Using it to define e^z would
have forced us to introduce series rather early. We tried this out several times with student groups of
different interests, but found the approach chosen in our book didactically superior.
3. Maclaurin series. Sine function. To obtain the Maclaurin series for sin 2z² we start with (14),
p. 695, writing t instead of z:

    sin t = Σ_{n=0}^∞ (−1)ⁿ t^(2n+1)/(2n + 1)! = t − t³/3! + t⁵/5! − ⋯ .

Setting t = 2z² gives

    sin 2z² = Σ_{n=0}^∞ (−1)ⁿ (2z²)^(2n+1)/(2n + 1)!
            = Σ_{n=0}^∞ (−1)ⁿ 2^(2n+1) z^(4n+2)/(2n + 1)!
            = 2z² − (2³/3!) z⁶ + (2⁵/5!) z¹⁰ − ⋯
            = 2z² − (4/3) z⁶ + (4/15) z¹⁰ − ⋯ .

The center of the series thus obtained is z₀ = 0 (i.e., z = z − z₀ = z − 0) by the definition of Maclaurin
series on p. 690. The radius of convergence is R = ∞, since the series converges for all z.
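The partial sums can be checked against the built-in sine (my own sketch; the function name and term count are assumptions):

```python
import cmath
import math

def sin2z2_series(z, terms=8):
    # Partial sum of sin(2z^2) = sum_{n>=0} (-1)^n 2^(2n+1) z^(4n+2) / (2n+1)!
    return sum((-1) ** n * 2 ** (2 * n + 1) * z ** (4 * n + 2)
               / math.factorial(2 * n + 1) for n in range(terms))

z = 0.3 + 0.2j
print(sin2z2_series(z), cmath.sin(2 * z * z))  # the two values agree closely
```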
Next consider

    S(z) = ∫_0^z sin t² dt.

To find the Maclaurin series of S(z) we start with the Maclaurin series for sin w and set w = t². From
Prob. 3 of this section we know that

    sin t² = Σ_{n=0}^∞ (−1)ⁿ t^(4n+2)/(2n + 1)! = t² − t⁶/3! + t¹⁰/5! − ⋯ .

Termwise integration gives

    ∫_0^z sin t² dt = ∫_0^z Σ_{n=0}^∞ (−1)ⁿ t^(4n+2)/(2n + 1)! dt
                   = Σ_{n=0}^∞ (−1)ⁿ t^(4n+3)/[(2n + 1)!(4n + 3)], evaluated from t = 0 to t = z,
                   = Σ_{n=0}^∞ (−1)ⁿ z^(4n+3)/[(2n + 1)!(4n + 3)],

which we obtained by setting t = z as required by the upper limit of integration. The lower limit t = 0
contributed 0. Hence

    S(z) = Σ_{n=0}^∞ (−1)ⁿ z^(4n+3)/[(2n + 1)!(4n + 3)] = z³/(1!·3) − z⁷/(3!·7) + z¹¹/(5!·11) − ⋯ .

Since the radius of convergence for the Maclaurin series of the sine function is R = ∞, so is R for S(z).
First solution: We want to find the Taylor series of 1/(1 − z) with center z₀ = i. We know the geometric
series

    1/(1 − z) = Σ_{n=0}^∞ zⁿ   [by (11), p. 694],

with z₀ = 0. Thus consider

    1/(1 − z) = 1/[(1 − i) − (z − i)],

and

    1/[(1 − i) − (z − i)] = 1/[(1 − i)(1 − (z − i)/(1 − i))] = [1/(1 − i)] · 1/[1 − (z − i)/(1 − i)].

With w = (z − i)/(1 − i) and

    1/(1 − w) = Σ_{n=0}^∞ wⁿ,

we have the desired Taylor series, which is

    (S)   1/(1 − z) = [1/(1 − i)] Σ_{n=0}^∞ [(z − i)/(1 − i)]ⁿ = Σ_{n=0}^∞ [1/(1 − i)^(n+1)] (z − i)ⁿ.

Now

    1/(1 − i) = (1 + i)/2   [by (7), p. 610],

so that (S) becomes

    1/(1 − z) = [(1 + i)/2] Σ_{n=0}^∞ [(1 + i)/2]ⁿ (z − i)ⁿ = Σ_{n=0}^∞ [(1 + i)/2]^(n+1) (z − i)ⁿ.
This is precisely the answer on p. A39 of the textbook with the terms written out.
The radius of convergence of the series is determined by |w| < 1, that is,

    |(z − i)/(1 − i)| < 1.

Now |1 − i| = √2. Hence

    |z − i|/√2 < 1,   and so   |z − i| < √2.

Remark. The method of applying (1), p. 690, directly is less attractive here, as it involves
repeatedly differentiating functions of the form 1/(1 − z)ⁿ.
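The expansion can be checked numerically at a point inside the disk of convergence (my own sketch; the sample point and term count are assumptions):

```python
# Check 1/(1-z) = sum ((1+i)/2)^(n+1) (z-i)^n near z0 = i.
c = (1 + 1j) / 2
z = 1j + 0.3          # |z - i| = 0.3 < sqrt(2), inside the disk of convergence
s = sum(c ** (n + 1) * (z - 1j) ** n for n in range(60))
print(s, 1 / (1 - z))  # the partial sum matches the closed form
```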
21. Taylor series. Sine function. For this problem, we develop the Taylor series directly with (1),
p. 690. This is like the method used in regular calculus. We have, for f(z) = sin z and z₀ = π/2,

    f(z₀) = 1,   f′(z₀) = cos(π/2) = 0,   f″(z₀) = −sin(π/2) = −1,   …,
    f⁽⁶⁾(z) = −sin z,   f⁽⁶⁾(z₀) = −sin(π/2) = −1.

Hence the Taylor series for sin z with z₀ = π/2 is

    f(z) = 1 − (1/2!)(z − π/2)² + (1/4!)(z − π/2)⁴ − (1/6!)(z − π/2)⁶ + ⋯
         = Σ_{n=0}^∞ [(−1)ⁿ/(2n)!] (z − π/2)^(2n),

with radius of convergence R = ∞.
The material in this section is for general information about uniform convergence (defined on p. 698) of
arbitrary series with variable terms (functions of z). What you should know is the content of Theorem 1, p.
699. Example 4 and Prob. 13 illustrate the Weierstrass M-test, p. 703.
3. Power series. By Theorem 1, p. 699, a power series in powers of z − z₀ converges uniformly in the
closed disk |z − z₀| ≤ r, where r < R and R is the radius of convergence of the series. Hence solving
Probs. 2–9 amounts to determining the radius of convergence. In Prob. 3 we have a power series in
powers of

    (A)   Z = (z + i)²

of the form

    (B)   Σ_{n=0}^∞ a_n Zⁿ

with coefficients a_n = 3^(−n). Hence the Cauchy–Hadamard formula (6), p. 683 in Sec. 15.2, gives

    R = |a_n/a_{n+1}| = 3^(−n)/3^(−(n+1)) = 3,

so the series (B) converges uniformly in every closed disk |Z| ≤ r < R = 3. Substituting (A) and taking
square roots, we see that this means uniform convergence of the given power series in powers of z + i
in every closed disk |z + i| ≤ √r with √r < √3. Writing

    (D)   δ = √3 − √r,   so that   √r = √3 − δ,   δ > 0,

we obtain uniform convergence in every closed disk

    |z + i| ≤ √3 − δ   (δ > 0).

This is the form in which the answer is given on p. A39 in App. 2 of the textbook.
7. Power series. No uniform convergence. We have to calculate the radius of convergence of

    Σ_{n=1}^∞ (n!/n²) (z − i)ⁿ.

We want to use the Cauchy–Hadamard formula (6), p. 683 of Sec. 15.2. We start with

    a_n = n!/n²,   a_{n+1} = (n + 1)!/(n + 1)².

Thus

    |a_n/a_{n+1}| = [n!/n²] · [(n + 1)²/(n + 1)!] = (n + 1)/n² → 0   as n → ∞.

Hence R = 0, which means that the given series converges only at the center

    z₀ = i.

Hence it does not converge uniformly anywhere. Indeed, the result is not surprising, since n! grows
much faster than n².
Next consider

    Σ_{n=1}^∞ sinⁿ|z| / n².

Since |z| = r = √(x² + y²) is a real number, sin|z| is a real number with |sin|z|| ≤ 1. Hence

    |sinⁿ|z| / n²| ≤ 1/n²   for all z.

Since

    Σ_{m=1}^∞ 1/m²   converges (see Sec. 15.1 in the proof of Theorem 8, p. 677),

we know, by the Weierstrass M-test, p. 703, that the given series converges uniformly.
Solution for the Harmonic Series Problem (see p. 298 of the Student Solutions Manual). The harmonic
series is

    (HS)   1 + 1/2 + 1/3 + 1/4 + ⋯ = Σ_{m=1}^∞ 1/m.

The harmonic series diverges. One elementary way to show this is to consider particular partial sums of
the series.
    s₁ = 1,
    s₂ = 1 + 1/2,
    s₄ = 1 + 1/2 + (1/3 + 1/4)                              [2 terms]
       > 1 + 1/2 + (1/4 + 1/4) = 1 + 2·(1/2),
    s₈ = 1 + 1/2 + (1/3 + 1/4) + (1/5 + 1/6 + 1/7 + 1/8)    [2 terms, then 4 terms]
       > 1 + 1/2 + 2·(1/4) + 4·(1/8) = 1 + 3·(1/2),
    s₁₆ = 1 + 1/2 + (1/3 + 1/4) + (1/5 + ⋯ + 1/8) + (1/9 + ⋯ + 1/16)   [2, 4, then 8 terms]
        > 1 + 1/2 + 2·(1/4) + 4·(1/8) + 8·(1/16) = 1 + 4·(1/2),

and so on: each new block of 2^(k−1) terms exceeds 2^(k−1) · (1/2^k) = 1/2. Thus in general

    s_(2^n) > 1 + n/2.

As n → ∞, the right side tends to ∞. This shows that the sequence of partial sums s_(2^n) is
unbounded, and hence the sequence of all partial sums of the series is unbounded. Hence the harmonic
series diverges.
Another way to show that the harmonic series diverges is by the integral test from calculus,
which we can use since f(x) = 1/x is continuous, positive, and decreasing on the real interval [1, ∞):

    (A)   ∫_1^∞ dx/x = lim_{t→∞} ∫_1^t dx/x = lim_{t→∞} ln t = ∞.

Since the integral in (A) does not exist (diverges), the related harmonic series (HS) [whose nth term equals
f(n)] diverges.
Remark. The name harmonic comes from overtones in music (harmony!). The harmonic series is so
important because, although its terms go to zero as m ! 1, it still diverges. Go back to p. 298.
Chap. 16 Laurent Series. Residue Integration
In Chap. 16, we solve complex integrals over simple closed paths C where the integrand f.z/ is analytic
except at a point z0 (or at several such points) inside C: In this scenario we cannot use Cauchy’s integral
theorem (1), p. 653, but need to continue our study of complex series, which we began in Chap. 15. We
generalize Taylor series to Laurent series which allow such singularities at z 0. Laurent series have both
positive and negative integer powers and have no significant counterpart in calculus. Their study provides
the background theory (Sec. 16.2) needed for these complex integrals with singularities. We shall use
residue integration, in Sec. 16.3, to solve them. Perhaps most amazing is that we can use residue
integration to even solve certain types of real definite integrals (Sec. 16.4) that would be difficult to solve
with regular calculus. This completes our study of the second approach to complex integration based on
residues that we began in Chap. 15.
Before you study this chapter you should know analytic functions (p. 625, in Sec. 13.4), Cauchy’s integral
theorem (p. 653, in Sec. 14.2), power series (Sec. 15.2, pp. 680–685), and Taylor series (1), p. 690. From
calculus, you should know how to integrate elementary functions, possibly several times in succession, as
well as how to factor quadratic polynomials and check whether their roots lie inside a circle or other
simple closed paths.
Laurent series generalize Taylor series by allowing the development of a function f.z/ in powers of z z 0
when f.z/ is singular at z0 (for “singular,” see p. 693 of Sec. 15.4 in the textbook). A Laurent series (1), p.
709, consists of positive as well as negative integer powers of z z 0 and a constant. The Laurent series
converges in an annulus, a circular ring with center z 0 as shown in Fig. 370, p. 709 of the textbook.
The details are given in the important Theorem 1, p. 709, and expressed by (1) and (2), which can be
written in shortened form (10) and (20), p. 710.
Take a look at Example 4, p. 713, and Example 5, pp. 713–714. A function may have different Laurent
series in different annuli with the same center z 0. Of these series, the most important Laurent series is the
one that converges directly near the center z 0, at which the given function has a singularity. Its negative
powers form the so-called principal part of the singularity of f.z/ at z 0 (Example 4 with z0 D 0 and Probs. 1
and 8).
Hint. To obtain the Laurent series for Probs. 1–8, use either a familiar Maclaurin series of Chap. 15 or a
series in powers of 1/z.
1. Laurent series near a singularity at 0. To solve this problem we start with the Maclaurin series for
cos z, that is,

    (A)   cos z = 1 − z²/2! + z⁴/4! − z⁶/6! + ⋯   [by (14), p. 695].

Division by z⁴ gives the desired Laurent series:

    z^(−4) cos z = 1/z⁴ − 1/(2z²) + 1/24 − z²/720 + ⋯ .

The principal part consists of

    1/z⁴ − 1/(2z²).
Next consider z³ cosh(1/z). We have

    cosh(1/z) = 1 + 1/(2! z²) + 1/(4! z⁴) + ⋯ .

Multiplication by z³ yields

    z³ cosh(1/z) = z³ + z/2! + 1/(4! z) + 1/(6! z³) + ⋯
                 = z³ + z/2 + 1/(24 z) + 1/(720 z³) + ⋯ .

We see that the principal part is

    1/(24 z) + 1/(720 z³) + ⋯ .

Furthermore, the series converges for all z ≠ 0, or equivalently the region of convergence is

    0 < |z| < ∞.
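The coefficient of 1/z above (the residue at 0, which the series predicts to be 1/4! = 1/24) can be read off numerically as a contour integral over the unit circle (my own sketch; the step count is an assumption):

```python
import cmath
import math

# b1 = (1/(2*pi*i)) * integral of z^3 * cosh(1/z) over the unit circle.
n = 4096
h = 2 * math.pi / n
total = 0j
for k in range(n):
    z = cmath.exp(1j * k * h)
    total += z ** 3 * cmath.cosh(1 / z) * 1j * z
total *= h
b1 = total / (2j * math.pi)
print(b1)  # close to 1/24 ≈ 0.0416667
```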
15. Laurent series. Singularity at z₀ = π. We use (6), p. A64, of Sec. A3.1 in App. 3 of the textbook and
simplify by noting that cos π = −1 and sin π = 0:

    cos z = cos((z − π) + π) = cos(z − π) cos π − sin(z − π) sin π = −cos(z − π).

Now

    cos w = 1 − w²/2! + w⁴/4! − w⁶/6! + ⋯ .

We set w = z − π and get

    cos(z − π) = 1 − (z − π)²/2! + (z − π)⁴/4! − (z − π)⁶/6! + ⋯ .

Then

    (B)   −cos(z − π) = −1 + (z − π)²/2! − (z − π)⁴/4! + (z − π)⁶/6! − ⋯ .

We multiply by

    1/(z − π)²

and get, by (B),

    cos z/(z − π)² = −(z − π)^(−2) + 1/2 − (z − π)²/24 + (z − π)⁴/720 − ⋯ .
We need the Laurent series of 1/(1 − z²), so we set w = z² in the geometric series:

    1/(1 − z²) = Σ_{n=0}^∞ z^(2n) = 1 + z² + z⁴ + z⁶ + ⋯,   |z²| < 1, that is, |z| < 1.

Similarly, we obtain the Laurent series converging for |z| > 1 by the following trick, which you should
remember:

    1/(1 − z²) = −(1/z²) · 1/(1 − 1/z²)
               = −(1/z²) Σ_{n=0}^∞ (1/z²)ⁿ
               = −(1/z²)(1 + 1/z² + 1/z⁴ + 1/z⁶ + ⋯)
               = −Σ_{n=0}^∞ 1/z^(2n+2),   |z| > 1.
23. Taylor and Laurent series. We want all Taylor and Laurent series of

    z⁸/(1 − z⁴)   with center z₀ = 0.

We start with

    1/(1 − w) = Σ_{n=0}^∞ wⁿ,   |w| < 1   [by (11), p. 694].

Setting w = z⁴ gives

    1/(1 − z⁴) = Σ_{n=0}^∞ z^(4n),   |z| < 1,

and multiplication by z⁸ gives the Taylor series

    z⁸/(1 − z⁴) = Σ_{n=0}^∞ z^(4n+8) = z⁸ + z¹² + z¹⁶ + ⋯,   |z| < 1.

For |z| > 1 we use, as above,

    1/(1 − w²) = −Σ_{n=0}^∞ 1/w^(2n+2),   |w| > 1.

Setting w² = z⁴, so that

    1/(1 − z⁴) = −Σ_{n=0}^∞ 1/z^(4n+4),   |z| > 1,

we obtain the desired Laurent series with center z₀ = 0 for |z| > 1:

    z⁸/(1 − z⁴) = −Σ_{n=0}^∞ z⁸/z^(4n+4) = −Σ_{n=0}^∞ z^(4−4n) = −z⁴ − 1 − 1/z⁴ − 1/z⁸ − ⋯ .

Note that we could have developed the Laurent series without using the result of Prob. 19 (but in
the same vein as Prob. 19) by starting with

    1/(1 − z⁴) = −(1/z⁴) · 1/(1 − 1/z⁴),   etc.
If the principal part of the Laurent series valid near z₀ consists of the single term b₁/(z − z₀),
then the isolated singularity at z = z₀ is a simple pole (Example 1, pp. 715–716). However, if the principal
part is of the form

    b₁/(z − z₀) + b₂/(z − z₀)² + ⋯ + b_m/(z − z₀)^m,

then we have a pole of order m. It can also happen that the principal part has infinitely many terms; then
f(z) has an isolated essential singularity at z = z₀ (see Example 1, pp. 715–716, Prob. 17).
A third concept is that of a zero, which follows our intuition. A function f(z) has a zero at z = z₀ if f(z₀) = 0.
Just as poles have orders, so do zeros. If f(z₀) = 0 but the derivative f′(z₀) ≠ 0, then the zero is a simple
zero (i.e., a first-order zero). If f(z₀) = 0, f′(z₀) = 0, but f″(z₀) ≠ 0, then we have a second-order zero, and
so on (see Prob. 3 for a fourth-order zero). This relates to Taylor series because, when developing Taylor
series, we calculate f(z₀), f′(z₀), f″(z₀), …, f⁽ⁿ⁾(z₀) by (4), p. 691, in Sec. 15.4. In the case of a second-order
zero, the first two coefficients of the Taylor series are zero. Thus zeros can be classified by Taylor series,
as shown by (3), p. 717.
Make sure that you understand the material of this section, in particular the concepts of pole and order
of a pole, as you will need these for residue integration. Theorem 4, p. 717, relates poles and zeros and will
be used frequently in Sec. 16.3.

Problem Set 16.2. Page 719
3. Zeros. We claim that f(z) = (z + 81i)⁴ has a fourth-order zero at z = −81i. We show this directly:

    f′(z) = 4(z + 81i)³,   f″(z) = 12(z + 81i)²,   f⁽³⁾(z) = 24(z + 81i),   f⁽⁴⁾(z) = 24,

so that f(−81i) = f′(−81i) = f″(−81i) = f⁽³⁾(−81i) = 0 but f⁽⁴⁾(−81i) = 24 ≠ 0.
Hence, by the definition of the order of a zero, p. 717, we conclude that the order at z₀ = −81i is 4. Note that we
demonstrated a special case of the theorem that states that if g has a zero of first order (simple zero)
at z₀, then gⁿ (n a positive integer) has a zero of nth order at z₀.
5. Zeros. Cancellation. The point of this, and similar problems, is that we have to be cautious. In the
present case, z = 0 is not a zero of the given function because

    z^(−2) sin² z = z^(−2) (z − z³/3! + ⋯)² = 1 − z²/3 + ⋯,

which equals 1, not 0, at z = 0.
Next suppose that f has an nth-order zero at z₀, so that

    f(z₀) = 0, f′(z₀) = 0, …, f⁽ⁿ⁻¹⁾(z₀) = 0,

and consider h(z) = f(z)². By successive product differentiation, the derivatives of h(z) will be zero at z₀ as long as a
factor of z − z₀ is present in each term. If n = 1, this happens for h and h′, giving a second-order zero z₀
of h. If n = 2, we have a factor (z − z₀)⁴ and obtain h, h′, h″, h‴ equal to zero at z₀, giving a fourth-order zero
z₀ of h. And so on.
By the definition on p. 715, cot z is singular where it is not analytic. This occurs where sin z = 0, hence
for

    (B)   z = 0, ±π, ±2π, … .

Since cos z and sin z share no common zeros, we conclude that cot z is singular exactly where sin z is 0, as
given in (B). These zeros of sin z are simple, so cot z has simple poles there. Next we consider

    cot⁴ z = cos⁴ z / sin⁴ z.

Now sin⁴ z = 0 for z as given in (B). But, since sin⁴ z is the sine function to the fourth power and sin z
has simple zeros, the zeros of sin⁴ z are of order 4. Hence, by Theorem 4, p. 717, cot⁴ z has poles of
order 4 at the points (B).
But we are not finished yet. Inspired by Example 5, p. 718, we see that cos z has an essential singularity at ∞. We claim that cos⁴ z also has an essential singularity at ∞. To show this we would have to develop the Maclaurin series of cos⁴ w. One way to do this is to develop the first few terms of that series by (1), p. 690, of Sec. 15.4. We get (using calculus: product rule, chain rule)

(C) cos⁴ w = 1 − 4w²/2! + 40w⁴/4! − ⋯ = 1 − 2w² + (5/3)w⁴ − ⋯.

The odd powers are zero because in the derivation of (C) these terms contain sine factors (chain rule!) that are zero at w0 = 0.

We set w = 1/z and multiply out the coefficients in (C):

(D) cos⁴(1/z) = 1 − 2z⁻² + (5/3)z⁻⁴ − ⋯.

We see that the principal part of the Laurent series (D) is (D) without the constant term 1. It is infinite (infinitely many negative powers occur), and thus cos⁴ z has an essential singularity at ∞ by p. 718. Since multiplication of the series by 1/sin⁴ z does not change the type of singularity, we conclude that cot⁴ z also has an essential singularity at ∞.
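As an independent numerical check of the coefficients 1, −2, and 5/3 in (C), we can use Cauchy's coefficient formula, a_k = (1/(2πi)) ∮ f(w)/w^(k+1) dw. The short pure-Python sketch below is our own verification, not part of the textbook; the helper name `maclaurin_coeff` is our own choice.

```python
import cmath
import math

def maclaurin_coeff(f, k, radius=1.0, n=4096):
    # Cauchy's coefficient formula: a_k = (1/(2*pi*i)) * integral of
    # f(w)/w^(k+1) dw over |w| = radius. Parametrizing w = radius*e^(it)
    # reduces this to the average of f(w)*w^(-k) over the circle, and the
    # trapezoid sum is spectrally accurate for analytic f.
    total = 0j
    for j in range(n):
        w = radius * cmath.exp(2j * math.pi * j / n)
        total += f(w) / w ** k
    return total / n

f = lambda w: cmath.cos(w) ** 4
print(abs(maclaurin_coeff(f, 0) - 1) < 1e-9)      # True: constant term 1
print(abs(maclaurin_coeff(f, 2) - (-2)) < 1e-9)   # True: coefficient -2
print(abs(maclaurin_coeff(f, 4) - 5 / 3) < 1e-9)  # True: coefficient 5/3
print(abs(maclaurin_coeff(f, 1)) < 1e-9)          # True: odd powers vanish
```

Changing `k` picks out any Maclaurin coefficient, so the same few lines also confirm that all odd-power coefficients are zero.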
Sec. 16.3 Residue Integration Method

This section deals with evaluating complex integrals (1), p. 720, taken over a simple closed path C. The important concept is that of a residue, which is the coefficient b1 of a Laurent series that converges for all points near a singularity z = z0 inside C, as explained on p. 720. Examples 1 and 2 show how to evaluate integrals that have only one singularity within C.
A systematic study of residue integration requires us to consider simple poles (i.e., of order 1) and poles
of higher order. For simple poles, we use (3) or (4), on p. 721, to compute residues. This is shown in
Example 3, p. 722, and Prob. 5. The discussion extends to higher order poles (of order m) and leads to (5),
p. 722, and Example 4, p. 722. It is critical that you determine the order of the poles inside C correctly. In
many cases we can use Theorem 4, on p. 717 of Sec. 16.2, to determine m. However, when h(z) in Theorem 4 is also zero at z0, the theorem cannot be applied. This is illustrated in Prob. 3.
Having determined the residues correctly, it is fairly straightforward to use the residue theorem
(Theorem 1, p. 723) to evaluate integrals (1), p. 720, as shown in Examples 5 and 6, p. 724, and Prob. 17.
Problem Set 16.3. Page 725

3. Residues. The given function is f(z) = (sin 2z)/z⁶, which is singular at z0 = 0.
However, since both sin 2z and z⁶ are 0 for z0 = 0, we cannot use Theorem 4 of Sec. 16.2, p. 717, to determine the order of the pole there. Hence we cannot apply (5), p. 722, directly, as we do not know the value of m.
We develop the first few terms of the Laurent series for f(z). From (14) in Sec. 15.4, p. 695, we know that

sin w = w − w³/3! + w⁵/5! − w⁷/7! + ⋯.

We set w = 2z and get

(A) sin 2z = 2z − (2z)³/3! + (2z)⁵/5! − (2z)⁷/7! + ⋯.

Since we need the Laurent series of (sin 2z)/z⁶, we multiply (A) by z⁻⁶ and get

(B) z⁻⁶ sin 2z = z⁻⁶ [2z − (2z)³/3! + (2z)⁵/5! − (2z)⁷/7! + ⋯]
  = 2z⁻⁵ − (8/3!)z⁻³ + (32/5!)z⁻¹ − (128/7!)z + ⋯.
The principal part of (B) is (see definition on p. 709)

2z⁻⁵ − (8/3!)z⁻³ + (32/5!)z⁻¹.

We see that

f(z) = (sin 2z)/z⁶ has a pole of fifth order at z = z0 = 0 [by (2), p. 715].

Note that the pole of f is only of fifth order and not of sixth order because sin 2z has a simple zero at z = 0.
Using the first line in the proof of (5), p. 722, we see that the coefficient of z⁻¹ in the Laurent series (B) is
340 Complex Analysis Part D
b1 = 32/5! = 32/(5 · 4 · 3 · 2 · 1) = 4/15.

Alternatively, by (5*), p. 722 (taking m = 6),

Res_{z=0} f(z) = (1/5!) lim_{z→0} (d⁵/dz⁵)[z⁶ · (sin 2z)/z⁶]
  = (1/5!) lim_{z→0} (d⁵/dz⁵) sin 2z.
We need
g(z) = sin 2z;
g′(z) = 2 cos 2z;
g″(z) = −4 sin 2z;
g‴(z) = −8 cos 2z;
g^(4)(z) = 16 sin 2z;
g^(5)(z) = 32 cos 2z.
Then g^(5)(0) = 32 cos 0 = 32. Hence

Res_{z=z0=0} (sin 2z)/z⁶ = (1/5!) · 32 = 4/15, as before.
Remark. In certain problems, developing a few terms of the Laurent series may be easier than using (5), p. 722, if the differentiation is labor intensive, such as requiring several applications of the quotient rule of calculus (see p. 623 of Sec. 13.3).
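The residue b1 = 4/15 can also be checked numerically, since b1 = (1/(2πi)) ∮ f(z) dz over a small circle around the singularity. The following pure-Python sketch is our own check, not part of the textbook solution; the helper name `residue` is our own.

```python
import cmath
import math

def residue(f, z0, radius=0.5, n=4096):
    # Res f(z) at z0 = (1/(2*pi*i)) * integral of f over |z - z0| = radius.
    # With z = z0 + radius*e^(it), the residue reduces to the average of
    # f(z)*(z - z0) over the circle (trapezoid sum on a periodic integrand).
    total = 0j
    for j in range(n):
        z = z0 + radius * cmath.exp(2j * math.pi * j / n)
        total += f(z) * (z - z0)
    return total / n

f = lambda z: cmath.sin(2 * z) / z ** 6
print(abs(residue(f, 0) - 4 / 15) < 1e-9)   # True: b1 = 32/5! = 4/15
```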
5. Residues. Simple poles. Step 1. Find the singularities. The function f(z) = 8/(1 + z²) is singular where 1 + z² = (z − i)(z + i) = 0, that is, at z = i and z = −i.
Step 2. Determine the order of the singularities and determine whether they are poles. Since the numerator of f is 8 = h(z) ≠ 0 (in Theorem 4), we see that the singularities in Step 1 are simple, i.e., of order 1. Furthermore, by Theorem 4, p. 717, we have two poles of order 1 at i and −i, respectively.
Step 3. Compute the value of the residues. We can do this in two ways.
Solution 1. By (3), p. 721, we have

Res_{z=i} f(z) = lim_{z→i} (z − i) · 8/(1 + z²)
  = lim_{z→i} (z − i) · 8/[(z − i)(z + i)]
  = lim_{z→i} 8/(z + i)
  = 8/(2i) = 4/i = −4i.

Also

Res_{z=−i} f(z) = lim_{z→−i} (z + i) · 8/[(z − i)(z + i)]
  = lim_{z→−i} 8/(z − i)
  = 8/(−2i) = 4i.

Hence the two residues are Res_{z=i} f(z) = −4i and Res_{z=−i} f(z) = 4i.
Solution 2. By (4), p. 721, we have

Res_{z=z0} f(z) = p(z0)/q′(z0) = 8/(1 + z²)′ |_{z=z0} = 8/(2z)|_{z=z0} = 8/(2z0).

For z0 = i we have

Res_{z=i} f(z) = 8/(2i) = −4i,

and for z0 = −i

Res_{z=−i} f(z) = 8/(−2i) = 4i,
as before.
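Both residues can be confirmed with a small-circle contour sum; this is our own numerical check (the helper `residue` is our own, not a library call):

```python
import cmath
import math

def residue(f, z0, radius=0.5, n=4096):
    # numerical residue: (1/(2*pi*i)) * contour integral over |z - z0| = radius,
    # computed as the average of f(z)*(z - z0) on the circle
    total = 0j
    for j in range(n):
        z = z0 + radius * cmath.exp(2j * math.pi * j / n)
        total += f(z) * (z - z0)
    return total / n

f = lambda z: 8 / (1 + z * z)
r_i = residue(f, 1j)     # pole at z = i
r_mi = residue(f, -1j)   # pole at z = -i
print(abs(r_i - (-4j)) < 1e-9, abs(r_mi - 4j) < 1e-9)   # True True
```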
15. Residue integration. The integrand f(z) = tan 2πz = (sin 2πz)/(cos 2πz) is singular where cos 2πz = 0, that is, at

2πz = ±π/2, ±3π/2, ±5π/2, ⋯,

and hence at

(A) z = ±1/4, ±3/4, ±5/4, ⋯.

Since sin 2πz ≠ 0 at these points, we can use Theorem 4, p. 717, to conclude that we have infinitely many simple poles at (A).

Consider the path of integration C: |z − 0.2| = 0.2. It is a circle in the complex plane with center 0.2 and radius 0.2. We need to be concerned only with those poles that lie inside C. There is only one pole of interest, namely z0 = 1/4. With p(z) = sin 2πz and q(z) = cos 2πz we have

p(1/4) = sin(π/2) = 1,  q′(1/4) = −2π sin(π/2) = −2π.

Hence

Res_{z0=1/4} f(z) = p(1/4)/q′(1/4) = −1/(2π).

Thus, by the residue theorem (Theorem 1, p. 723),

∮_C f(z) dz = ∮_{|z−0.2|=0.2} tan 2πz dz = 2πi · Res_{z0=1/4} f(z) = 2πi · [−1/(2π)] = −i.
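Assuming the integrand is tan 2πz as reconstructed above (so that the pole z = 1/4 lies inside |z − 0.2| = 0.2), the whole contour integral can be approximated directly by a trapezoid sum. This is our own numerical check, not part of the textbook solution:

```python
import cmath
import math

def contour_integral(f, center, radius, n=8192):
    # trapezoid approximation of the integral of f over |z - center| = radius;
    # dz = i*(z - center) dt, so the sum is scaled by 2*pi*i/n at the end
    total = 0j
    for j in range(n):
        z = center + radius * cmath.exp(2j * math.pi * j / n)
        total += f(z) * (z - center)
    return total * (2j * math.pi / n)

val = contour_integral(lambda z: cmath.tan(2 * math.pi * z), 0.2, 0.2)
print(abs(val - (-1j)) < 1e-6)   # True: 2*pi*i * (-1/(2*pi)) = -i
```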
17. Residue integration. We use the same approach as in Prob. 15. We note that

cos z = 0 at z = ±π/2, ±3π/2, ±5π/2, ⋯,

so the integrand e^z/cos z has simple poles at these points. Here the closed path is a circle:

C: |z − πi/2| = 4.5,

and only z = π/2 and z = −π/2 lie within C. By (4), p. 721, with p(z) = e^z and q′(z) = −sin z,

Res_{z=π/2} f(z) = e^{π/2}/(−sin(π/2)) = −e^{π/2}

and

Res_{z=−π/2} f(z) = e^{−π/2}/(−sin(−π/2)) = e^{−π/2}.

Hence

∮_C f(z) dz = ∮_{|z−πi/2|=4.5} (e^z/cos z) dz
  = 2πi(−e^{π/2} + e^{−π/2})
  = −2πi(e^{π/2} − e^{−π/2})
  = −2πi · 2 sinh(π/2)  [since sinh is an odd function]
  = −4πi sinh(π/2)
  ≈ −28.919i.
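The value −4πi sinh(π/2) can be confirmed by numerically integrating e^z/cos z around the circle |z − πi/2| = 4.5. This is our own check in plain Python:

```python
import cmath
import math

def contour_integral(f, center, radius, n=20000):
    # trapezoid approximation of the closed contour integral over the circle
    total = 0j
    for j in range(n):
        z = center + radius * cmath.exp(2j * math.pi * j / n)
        total += f(z) * (z - center)
    return total * (2j * math.pi / n)

val = contour_integral(lambda z: cmath.exp(z) / cmath.cos(z), 0.5j * math.pi, 4.5)
expected = -4j * math.pi * math.sinh(math.pi / 2)    # about -28.919i
print(abs(val - expected) < 1e-6)   # True
```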
Sec. 16.4 Residue Integration of Real Integrals

It is surprising that residue integration, a method of complex analysis, can also be used to evaluate certain kinds of complicated real integrals. The key ideas in this section are as follows. To apply residue integration, we need a closed path, that is, a contour. Take a look at the different real integrals in the textbook, pp. 725–732. For real integrals (1), p. 726, we obtain a contour by the transformation (2), p. 726. This is illustrated in Example 1 and Prob. 7.

For real integrals (4), p. 726, and (10), p. 729 (real "Fourier integrals"), we start from a finite interval from −R to R on the real axis (the x-axis) and close it in the complex plane by a semicircle S as shown in Fig. 374, p. 727. Then we "blow up" this contour and make an assumption (degree of the denominator ≥ degree of the numerator + 2) under which the integral over the blown-up semicircle will be 0. Note that we only take those poles that are in the upper half-plane of the complex plane and ignore the others. Example 2, p. 728, and Prob. 11 solve integrals of the kind given by (4). Real Fourier integrals (10) are solved in Example 3, pp. 729–730, and Prob. 21.

Finally, we solve real integrals (11) whose integrand becomes infinite at some point a in the interval of integration (Fig. 377, p. 731; Example 4, p. 732; Prob. 25), which requires the concept of the Cauchy principal value (13), p. 730. The pole a lies on the real axis of the complex plane.
Problem Set 16.4. Page 733
7. Integral involving sine. Here the given integral is

∫_0^{2π} a/(a − sin θ) dθ = a ∫_0^{2π} 1/(a − sin θ) dθ.

Using (2), p. 726, we get

sin θ = (1/(2i))(z − 1/z)

and

dθ = dz/(iz)  [see textbook after (2)].
Hence

a ∫_0^{2π} dθ/(a − sin θ) = a ∮_C 1/[a − (1/(2i))(z − 1/z)] · dz/(iz),

where C is the unit circle.

Now

iz · [a − (1/(2i))(z − 1/z)] = iaz − (z² − 1)/2 = −(1/2)(z² − 2aiz − 1),

so that the last integral is equal to

−2a ∮_C dz/(z² − 2aiz − 1).
We need to find the roots of z² − 2aiz − 1. Using the familiar formula for the roots of a quadratic equation,

az² + bz + c = 0,  z1,2 = [−b ± √(b² − 4ac)]/(2a),

we see that the integrand has simple poles at

z1 = ai + √(1 − a²) and z2 = ai − √(1 − a²).

However, z1 is outside the unit circle and thus of no interest (see p. 726). Hence, by (3), p. 721, in Sec. 16.3 of the textbook, we compute the residue at z2:

Res_{z=z2} f(z) = Res_{z=z2} 1/[(z − z1)(z − z2)]
  = lim_{z→z2} (z − z2) · 1/[(z − z1)(z − z2)]
  = lim_{z→z2} 1/(z − z1)
  = 1/(z2 − z1)
  = 1/{[ai − √(1 − a²)] − [ai + √(1 − a²)]}
  = −1/(2√(1 − a²)).
Thus by Theorem 1, p. 723 (Residue Theorem),

a ∫_0^{2π} dθ/(a − sin θ) = −2a ∮_C dz/(z² − 2aiz − 1)
  = −2a · 2πi · Res_{z=z2} f(z)
  = −4πai · [−1/(2√(1 − a²))]
  = 2πai/√(1 − a²).

For a > 1 we have

√(1 − a²) = √[(−1)(a² − 1)] = i√(a² − 1),

so that the final answer is

2πai/[i√(a² − 1)] = 2πa/√(a² − 1).
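The closed form 2πa/√(a² − 1) (valid for a > 1) is easy to test against a direct numerical evaluation of the real integral. The sample value a = 2 is our own choice; this check is not part of the textbook solution:

```python
import math

def real_integral(a, n=20000):
    # midpoint rule for the smooth 2*pi-periodic integrand a/(a - sin(theta));
    # for smooth periodic integrands this rule converges extremely fast
    h = 2 * math.pi / n
    return h * sum(a / (a - math.sin((j + 0.5) * h)) for j in range(n))

a = 2.0
exact = 2 * math.pi * a / math.sqrt(a * a - 1)   # residue-theorem result
print(abs(real_integral(a) - exact) < 1e-9)      # True
```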
11. Improper integral: Infinite interval of integration. Use of (7), p. 728. The integrand, considered as a function of complex z, is

f(z) = 1/(1 + z²)².

This shows that there are singularities (p. 715) at z = i and z = −i, respectively.

We have to consider only z = i, since it lies in the upper half-plane (defined on p. 619, Sec. 13.3), and ignore z = −i, since it lies in the lower half-plane. This is as in Example 2, p. 728 (where, likewise, only the pole in the upper half-plane was used). Furthermore, since the numerator of f(z) is not zero for z = i, we have a pole of order 2 at z = i by Theorem 4 of Sec. 16.2 on p. 717. (The ignored singularity at z = −i also leads to a pole of order 2.)
The degree of the numerator of f(z) is 0 and the degree of the denominator is 4, so that we are allowed to apply (7), p. 728. We have, by (5*), p. 722,

(d/dz)[(z − i)² · 1/((z − i)²(z + i)²)] = (d/dz)(z + i)⁻² = −2(z + i)⁻³,

and then find the residue

Res_{z=i} f(z) = −2(z + i)⁻³ |_{z=i} = −2/(i + i)³ = −2/(2³i³) = −1/(4i³) = −i/4.

Hence, by (7), p. 728,

∫_{−∞}^{∞} dx/(x² + 1)² = 2πi · Res_{z=i} f(z) = 2πi · (−i/4) = 2π/4 = π/2.
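Both the residue −i/4 and the value π/2 can be double-checked numerically; this is our own verification, not part of the textbook solution:

```python
import cmath
import math

def residue(f, z0, radius=0.5, n=4096):
    # numerical residue as the average of f(z)*(z - z0) over a small circle
    total = 0j
    for j in range(n):
        z = z0 + radius * cmath.exp(2j * math.pi * j / n)
        total += f(z) * (z - z0)
    return total / n

f = lambda z: 1 / (1 + z * z) ** 2
r = residue(f, 1j)                                   # pole of order 2 at z = i
print(abs(r - (-0.25j)) < 1e-9)                      # True: Res = -i/4
print(abs(2j * math.pi * r - math.pi / 2) < 1e-9)    # True: integral = pi/2
```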
21. Improper integral: Infinite interval of integration. Simple pole in upper half-plane. Simple pole on real axis. Fourier integral. We note that the given integral is a Fourier integral of the form

∫_{−∞}^{∞} f(x) sin sx dx with f(x) = 1/[(x − 1)(x² + 4)] and s = 1 [see (8), p. 729].
This gives singularities at z = 1, 2i, −2i, respectively. The pole at z = 2i lies in the upper half-plane (defined in Sec. 13.3 on p. 619), while the pole at z = 1 lies on the contour. Because of the pole on the contour, we need to find the principal value by (14) in Theorem 1 on pp. 731–732 rather than using (10) on p. 729. (The simple pole z = −2i lies in the lower half-plane and, thus, is not wanted.)
Sec. 16.4 Prob. 21. Fourier integral. [Figure: only the poles z = 1 (on the real axis) and z = 2i (in the upper half-plane) are used in the residue integration; z = −2i is ignored.]

We consider f(z)e^{iz} (s = 1), as discussed on p. 729 and in Example 3. Using (4), p. 721, we get
Res_{z=1} f(z)e^{iz} = [p(z)/q′(z)] e^{iz} |_{z=1},

where p(z) = 1 and q(z) = (z − 1)(z² + 4) = z³ − z² + 4z − 4, so that q′(z) = 3z² − 2z + 4. Hence

Res_{z=1} f(z)e^{iz} = e^i/q′(1) = e^i/(3 − 2 + 4) = e^i/5.

By Euler's formula, e^i = cos 1 + i sin 1, so that

Re Res_{z=1} f(z)e^{iz} = Re[(1/5)(cos 1 + i sin 1)] = (cos 1)/5.
Also

Res_{z=2i} f(z)e^{iz} = [p(z)/q′(z)] e^{iz} |_{z=2i}
  = e^{i·2i}/[3(2i)² − 2(2i) + 4]
  = e^{−2}/(−12 − 4i + 4)
  = e^{−2}/(−8 − 4i)
  = e^{−2}(−8 + 4i)/(8² + 4²)
  = (−8e^{−2} + 4e^{−2}i)/80
  = −e^{−2}/10 + (e^{−2}/20)i.

Hence

Re Res_{z=2i} f(z)e^{iz} = −e^{−2}/10.

Using (14), p. 732, the solution to the desired real Fourier integral (with s = 1) is

pr.v. ∫_{−∞}^{∞} sin x/[(x − 1)(x² + 4)] dx = 2π · (−e^{−2}/10) + π · (cos 1)/5 = (π/5)(cos 1 − e^{−2}).
Note that we wrote pr.v., that is, Cauchy principal value (p. 730), on account of the pole on the contour (x-axis) and the behavior of the integrand.
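With f(x) = 1/((x − 1)(x² + 4)) and s = 1, formula (14) combines 2π times the real part of the residue at 2i with π times the real part of the residue at 1. The following sketch (our own check; the numerical `residue` helper is our own, not a library function) reproduces the closed form (π/5)(cos 1 − e⁻²):

```python
import cmath
import math

def residue(f, z0, radius=0.3, n=4096):
    # numerical residue as the average of f(z)*(z - z0) over a small circle
    total = 0j
    for j in range(n):
        z = z0 + radius * cmath.exp(2j * math.pi * j / n)
        total += f(z) * (z - z0)
    return total / n

f = lambda z: cmath.exp(1j * z) / ((z - 1) * (z * z + 4))
r_axis = residue(f, 1.0)   # simple pole on the real axis
r_up = residue(f, 2j)      # simple pole in the upper half-plane
pv = 2 * math.pi * r_up.real + math.pi * r_axis.real   # (14) for sin sx, s = 1
exact = math.pi * (math.cos(1) - math.exp(-2)) / 5
print(abs(pv - exact) < 1e-9)   # True
```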
25. Improper integrals. Poles on the real axis. We use (14) and the approach of Example 4, p. 732. The integrand is

(A) f(z) = (z + 5)/(z³ − z) = (z + 5)/[z(z − 1)(z + 1)],

which has three simple poles at 0, 1, −1. With p(z) = z + 5 and q(z) = z³ − z we have q′(z) = 3z² − 1. Hence at z = 0

Res_{z=0} f(z) = p(0)/q′(0) = 5/(3 · 0² − 1) = −5.

At z = 1

Res_{z=1} f(z) = p(1)/q′(1) = 6/(3 · 1² − 1) = 3.

Finally at z = −1

Res_{z=−1} f(z) = p(−1)/q′(−1) = (−1 + 5)/[3 · (−1)² − 1] = 4/2 = 2.

We are ready to use (14), p. 732. Note that there are no poles in the upper half-plane, as (A) does not contain factors with nonzero imaginary parts. This means that the first summation in (14) is zero. Hence

pr.v. ∫_{−∞}^{∞} (x + 5)/(x³ − x) dx = πi(−5 + 3 + 2) = πi · 0 = 0.
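The three residues −5, 3, 2 and their zero sum can be verified numerically; this is our own check:

```python
import cmath
import math

def residue(f, z0, radius=0.4, n=4096):
    # numerical residue as the average of f(z)*(z - z0) over a small circle;
    # radius 0.4 keeps each circle away from the neighboring poles at distance 1
    total = 0j
    for j in range(n):
        z = z0 + radius * cmath.exp(2j * math.pi * j / n)
        total += f(z) * (z - z0)
    return total / n

f = lambda z: (z + 5) / (z ** 3 - z)
r0, r1, rm1 = residue(f, 0), residue(f, 1), residue(f, -1)
print(abs(r0 + 5) < 1e-9, abs(r1 - 3) < 1e-9, abs(rm1 - 2) < 1e-9)  # True True True
print(abs(r0 + r1 + rm1) < 1e-9)   # True: pr.v. = pi*i*0 = 0
```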
Chap. 17 Conformal Mapping
We shift gears and introduce a third approach to problem solving in complex analysis. Recall that so far we
covered two approaches of complex analysis. The first method concerned evaluating complex integrals by
Cauchy’s integral formula (Sec. 14.3, p. 660 of the textbook and p. 291 in this Manual). Specific
background material needed was Cauchy’s integral theorem (Sec. 14.2) and, in general, Chaps. 13 and 14.
The second method dealt with residue integration, which we applied to both complex integrals in Sec.
16.3 (p. 719 in the textbook and p. 291 in this Manual) and real integrals in Sec. 16.4 (p. 725,
p. 326 in this Manual). The background material was general power series, Taylor series (Chap. 15), and,
most importantly, Laurent series, which admitted negative powers (Sec. 16.1, p. 708) and thus led to the study of poles (Sec. 16.2, p. 715).
The new method is a geometric approach to complex analysis and involves the use of conformal
mappings. We need to explain two terms: (a) mapping and (b) conformal. For (a), recall from p. 621 in Sec.
13.3 that any complex function f(z), where z = x + iy is a complex variable, can be written in the form

(1) w = f(z) = u(x, y) + iv(x, y).

We want to study the geometry of complex functions f(z) and consider (1).
In basic (real) calculus we graphed continuous real functions y D f.x/ of a real variable x as curves in the
Cartesian xy-plane. This required one (real) plane. If you look at (1), you may notice that we need to
represent geometrically both the variable z and the variable w as points in the complex plane. The idea is
to use two separate complex planes for the two variables: one for the z-plane and one for the w-plane.
And this is indeed what we shall do. So if we graph the points z D x C iy in the z-plane (as we have done
many times in Chap. 13) and, in addition, graph the corresponding w D u C iv (points obtained from
plugging in z into f ) in the w-plane (with uv-axes), then the function w D f.z/ defines a correspondence
(mapping) between the points of these two planes (for details, see p. 737). In practice, the graphs are
usually not obtained pointwise, as suggested by the definition, but from mappings of sectors, rays, lines,
circles, etc.
We don’t just take any function f.z/ but we prefer analytic functions. In comes the concept of (b)
conformality. The mapping (1) is conformal if it preserves angles between oriented curves both in
magnitude as well as in sense. Theorem 1, on p. 738 in Sec. 17.1, links the concepts of analyticity with
conformality: An analytic function w D f.z/ is conformal except at points z 0 (critical points) where its
derivative f 0.z0/ D 0:
The rest of the chapter discusses important conformal mappings and creates their graphs. Sections 17.1
(p. 737) and 17.4 (p. 750) examine conformal mappings of the major analytic functions from Chap. 13.
Sections 17.2 (p. 742) and 17.3 (p. 746) deal with the novel linear fractional transformation, a transformation that
is a fraction (see p. 746). The chapter concludes with Riemann surfaces, which allow multivalued relations
of Sec. 13.7 (p. 636) to become single-valued and hence functions in the usual sense. We will see the
astonishing versatility of conformal mapping in Chapter 18 where we apply it to practical problems in
potential theory.
You might have to allocate more study time for this chapter than you did for Chaps. 15 and 16.
You should study this chapter diligently so that you will be well prepared for the applications in Chap. 18.
As background material for Chap. 17 you should remember Chap. 13, including how to graph complex
numbers (Sec. 13.1, p. 608), polar form of complex numbers (Sec. 13.2, p. 613), complex functions (pp.
620–621), ez (Sec. 13.5, p. 630), Euler’s formula (5), p. 634, sinz;cosz;sinhz;coshz, and their various
formulas (Sec. 13.6, p. 633), and the multivalued relations of Sec. 13.7, p. 636 (for optional Sec. 17.5).
Furthermore, you should know how to find roots of polynomials and know how to algebraically
manipulate fractions (in Secs. 17.2 and 17.3).
Sec. 17.1 Geometry of Analytic Functions: Conformal Mapping
We discussed mappings and conformal mappings in detail in the opening to Chap. 17 of this Manual.
Related material in the textbook is: mapping (1), p. 737, and illustrated by Example 1; conformal, p. 738;
conformality and analyticity in Theorem 1, p. 738. The section continues with four more examples of
conformal mappings and their graphs. They are w = zⁿ (Example 2, p. 739), w = z + 1/z (Joukowski airfoil, Example 3, pp. 739–740), w = e^z (Example 4, p. 740), and w = Ln z (Example 5, p. 741). The last topic is the
magnification ratio, which is illustrated in Prob. 33, p. 742.
In the examples in the text and the exercises, we consider how sectors, rays, lines, circles, etc. are
mapped from the z-plane onto the w-plane by the specific given mapping. We use polar coordinates and
Cartesian coordinates. Since there is no general rule that fits all problems, you have to look over,
understand, and remember the specific mappings discussed in the examples in the text and
supplemented by those from the problem sets. To fully understand specific mappings, make graphs or
sketches. Finally you may want to build a table of conformal mappings:
Mapping | Region to be Mapped | Image of Region | Reference
w = zⁿ | sector 0 ≤ θ ≤ π/n | upper half-plane | Example 2, p. 739
Put in more mappings and graphs or sketches. The table does not have to be complete, it is just to help
you remember the most important examples for exams and for solving problems.
Illustration of mapping. Turn to p. 621 of Sec. 13.3 and look at Example 1. Note that this example defines a mapping w = f(z) = z² + 3z. It then shows how the point z0 = 1 + 3i (from the z-plane) is mapped onto

w0 = f(z0) = f(1 + 3i) = (1 + 3i)² + 3(1 + 3i) = 1 + 6i + 9i² + 3 + 9i = −5 + 15i

(of the w-plane). A second such example is Example 2, p. 621.
More details on Example 1, p. 737. Turn to p. 737 and take a look at the example and Fig. 378. We remember that the function f(z) = z² is analytic (see pp. 622–624 of Sec. 13.3 and Example 1, p. 627, of Sec. 13.4). The mapping is

w = f(z) = z².
It has a critical point where the derivative of its underlying function f is zero, that is, where

f′(z) = 2z = 0.

Thus the critical point is at z = 0. By Theorem 1, p. 738, f(z) is conformal except at z = 0. Indeed, at z = 0
conformality is violated in that the angles are doubled, as clearly shown in Fig. 378, p. 737. The same
reasoning is used in Example 2, p. 739.
3. Mapping. To obtain a figure similar to Fig. 378, p. 737, we follow Example 1 on that page. Using polar forms [see (6), p. 631, in Sec. 13.5], we write z = re^{iθ} and w = Re^{iΘ}, so that

w = Re^{iΘ} = z³ = r³e^{3iθ}.

We compare the moduli and arguments (for definition, see p. 613) and get

R = r³ and Θ = 3θ.
Hence circles r = r0 are mapped onto circles R = r0³, and rays θ = θ0 are mapped onto rays Θ = 3θ0. Note that the resulting circle R = r0³ is bigger than r = r0 when r0 > 1 and smaller when r0 < 1. Furthermore, the process of mapping a ray θ = θ0 onto a ray Θ = 3θ0 corresponds to a rotation.
We are ready to draw the desired figure and consider the region

1 ≤ r ≤ 1.3 with π/9 ≤ θ ≤ 2π/9.

It gets mapped onto the region 1³ ≤ R ≤ (1.3)³ with 3 · π/9 ≤ Θ ≤ 3 · 2π/9. This simplifies to

1 ≤ R ≤ 2.197 with π/3 ≤ Θ ≤ 2π/3.
Sec. 17.1 Prob. 3. Given region and its image under the mapping w = z³
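The corner values R = r0³ and Θ = 3θ can be spot-checked in a few lines (our own numerical check, not part of the textbook solution):

```python
import cmath
import math

# corner points of the region 1 <= r <= 1.3, pi/9 <= theta <= 2*pi/9
images = {}
for r0, th in [(1.0, math.pi / 9), (1.3, 2 * math.pi / 9)]:
    w = (r0 * cmath.exp(1j * th)) ** 3          # the mapping w = z^3
    images[(r0, th)] = (abs(w), cmath.phase(w))

print(abs(images[(1.0, math.pi / 9)][0] - 1.0) < 1e-12)                  # True: R = 1
print(abs(images[(1.3, 2 * math.pi / 9)][0] - 2.197) < 1e-12)            # True: R = 1.3^3
print(abs(images[(1.0, math.pi / 9)][1] - math.pi / 3) < 1e-12)          # True: Theta = pi/3
print(abs(images[(1.3, 2 * math.pi / 9)][1] - 2 * math.pi / 3) < 1e-12)  # True
```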
7. Mapping of curves. Rotation. First we want to show that the given mapping w = iz is indeed a rotation. To do this, we express z in polar coordinates [by (6), p. 631], that is, z = re^{iθ}, so that

w = iz = e^{iπ/2} · re^{iθ} = re^{i(θ + π/2)}.

This shows that this mapping, w = iz, is indeed a rotation about 0 through an angle of π/2 in the positive sense, that is, in a counterclockwise direction.
We want to determine the images of x = 1, 2, 3, 4, and so we consider the more general problem of determining the image of x = c, where c is a constant. For x = c, z becomes

z = x + iy = c + iy,

so that

(B) w = iz = i(c + iy) = −y + ic.

This means that the image of points on a line x = c is the line w = −y + ic. Thus x = 1 is mapped onto w = −y + i, x = 2 onto w = −y + 2i, etc. Furthermore, for y = 0, the point z = x = c on the real axis is mapped by (B) onto w = ic on the imaginary axis. So z = x = 1 is mapped onto w = i, and z = x = 2 is mapped onto w = 2i, etc.

Similar steps for horizontal lines y = k = const give us

w = i(x + ik) = −k + ix.

Hence y = 1 is mapped onto w = −1 + ix, and y = 2 onto w = −2 + ix. (Do you see a counterclockwise rotation by π/2?) Furthermore, the point z = ik on the imaginary axis is mapped onto w = −k; thus z = i is mapped onto w = −1 and z = 2i onto w = −2. Complete the problem by sketching or graphing the desired images.
11. Mapping of regions. To examine the given mapping, w = z², we express z in polar coordinates, z = re^{iθ}, so that

w = z² = r²e^{2iθ}.

This shows that w = z² doubles angles at z = 0 and squares the moduli. Hence, for our problem, under the given mapping,

−π/8 < θ < π/8 becomes −π/4 < Θ < π/4,

or equivalently (since θ = Arg z and Θ = Arg w)

−π/8 < Arg z < π/8 becomes −π/4 < Arg w < π/4 [for the definition of Arg, see (5), p. 614].

Furthermore, r = 1/2 maps onto R = (1/2)² = 1/4.
Sec. 17.1 Prob. 11. Given region and its image under the mapping w = f(z) = z²
15. Mapping of regions. The given region (see p. 619 in Sec. 13.3)

|z − 1/2| ≤ 1/2

is a closed circular disk of radius 1/2 with center at x = 1/2. Its boundary circle, written out, is

(x − 1/2)² + y² = (1/2)²,

and rearranged

x² − x + 1/4 + y² = 1/4, that is, x² + y² − x = 0.

But x² + y² = z z̄ and x = (z + z̄)/2, so the boundary is
(*) z z̄ − (z + z̄)/2 = 0.
Now we are ready to consider the given mapping, w = 1/z, so that z = 1/w, and obtain, substituting into (*),

(1/w)(1/w̄) − (1/2)(1/w + 1/w̄) = 0.

Multiplying by 2w w̄ gives

2 − (w̄ + w) = 0.

Since w + w̄ = (u + iv) + (u − iv) = 2u [by (1), p. 737, and the definition of the complex conjugate], this becomes

2 − 2u = 0 so that u = 1.
This shows that, for the given mapping, the boundary circle maps onto the vertical line u = 1. The center (1/2, 0) of the disk maps onto

f(1/2) = 1/(1/2) = 2 = u + iv, so that u = 2.
Since the image of the center has u = 2 > 1, the closed disk is mapped onto the half-plane u ≥ 1.
21. Failure of conformality. Cubic polynomial. The general cubic polynomial (CP) is

(CP) f(z) = a3z³ + a2z² + a1z + a0  (a3 ≠ 0).

Conformality fails at the critical points. These are the points at which the derivative of the cubic polynomial is zero. We differentiate (CP) and set the derivative to zero:

f′(z) = 3a3z² + 2a2z + a1 = 0.

By the quadratic formula,

z1,2 = [−2a2 ± √(4a2² − 4 · 3a3 · a1)]/(2 · 3a3) = [−a2 ± √(a2² − 3a3a1)]/(3a3).

Thus the mapping w = f(z) is not conformal where f′(z) = 0, that is, at

z = [−a2 ± √(a2² − 3a3a1)]/(3a3).
Remark. You may want to verify that our answer corresponds to the answer on p. A41 in Appendix 2 of the textbook. Set

a3 = 1, a2 = a, a1 = b, a0 = c.

Note that we can set a3 = 1 in (CP) without loss of generality, as we can always divide the cubic polynomial by a3 if 0 < |a3| ≠ 1.
33. Magnification ratio. The magnification ratio is M = |f′(z)| (see p. 741). For w = e^z we obtain

|(e^z)′| = |e^z| = e^x (by Sec. 13.5, p. 630).
Sec. 17.2 Linear Fractional Transformations (Möbius Transformations)

The linear fractional transformation

(1) w = (az + b)/(cz + d)  (where ad − bc ≠ 0)

is useful in modeling and solving boundary value problems in potential theory [as in Example 2 of Sec. 18.2, where the first function on p. 765 of the textbook is a linear fractional transformation (LFT) of the form (1) with a = b = d = 1 and c = −1].
LFTs are versatile because—with different constants—LFTs can model translations, rotations, linear
transformations, and inversions of circles as shown in (3), p. 743. They also have attractive properties
(Theorem 1, p. 744). Problem 3 (in a matrix setting) and Prob. 5 (in a general setting) explore the
relationship between LFT (1) and its inverse (4), p. 745. Fixed points are defined on p. 745 and illustrated in
Probs. 13 and 17.
3. Matrices. a. Using 2 × 2 matrices, prove that the coefficient matrices of (1), p. 743, and (4), p. 745, are inverses of each other, provided that

ad − bc = 1.

The coefficient matrix of (1) is A = [a  b; c  d], and its inverse is

(M2) A⁻¹ = (1/det A) [d  −b; −c  a]  (where det A = ad − bc).

The inverse mapping of (1) is

(4) z = (dw − b)/(−cw + a),

so that its coefficient matrix is

(M3) B = [d  −b; −c  a].

Looking at (M2) and (M3), we see that the only way to have A⁻¹ = B is for

1/det A = 1/(ad − bc) = 1, that is, ad − bc = 1.

Conversely, if ad − bc = 1, then 1/det A = 1/(ad − bc) = 1, so that A⁻¹ = 1 · B = B [by (M2), (M3)]. This proves a.
b. The composition of LFTs corresponds to the multiplication of coefficient matrices.
Hint: Start by defining two general LFTs of the form (1) that are different from each other.
5. Inverse. a. To derive (4) from (1), we start from

(1) w = (az + b)/(cz + d)  (where ad − bc ≠ 0)

and multiply both sides by cz + d, obtaining

(cz + d)w = az + b.

Next we group the z-terms together on the left and the other terms on the right:

czw − az = b − dw,

so that

(A0) z = (b − dw)/(cw − a).

This is not quite (4) yet. To obtain (4), we multiply (A0) by (−1)/(−1) (which we can always do) and get

z = −(b − dw)/[−(cw − a)] = (−b + dw)/(−cw + a) = (dw − b)/(−cw + a).

But this is precisely (4)! (Note that the result is determined only up to a common factor in the numerator and the denominator.)
b. Derive (1) from (4).
This follows the same approach as in a, this time starting with (4) and deriving (1). For practice you
should fill in the steps.
7. Inverse mapping. The given mapping is a linear fractional transformation. Using (1), p. 743, we have

w = i/(2z − 1) = (0 · z + i)/(2z − 1) = (az + b)/(cz + d),

so that, by comparison,

a = 0, b = i, c = 2, d = −1.

We now use (4), p. 745, with the values of a, b, c, d just determined and get that the inverse mapping of (1) is

z = z(w) = (dw − b)/(−cw + a) = (−w − i)/(−2w + 0) = −(w + i)/(−2w) = (w + i)/(2w).
To check that our answer is correct, we solve z(w) for w and have

z = (w + i)/(2w), hence 2wz = w + i.

Subtracting w gives us

2wz − w = i, so that w(2z − 1) = i.

We solve the last equation for w and get

w = i/(2z − 1).
The last fraction is precisely the given mapping with which we started, which validates our answer.
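The round trip z ↦ w ↦ z can be confirmed in a few lines (our own check, with a few arbitrarily chosen sample points):

```python
# forward mapping w(z) = i/(2z - 1) and the inverse z(w) = (w + i)/(2w)
w = lambda z: 1j / (2 * z - 1)
z_inv = lambda u: (u + 1j) / (2 * u)

for z0 in (0.3 + 0.7j, -1.2 + 0.1j, 2 - 3j):
    assert abs(z_inv(w(z0)) - z0) < 1e-12
print("inverse verified")
```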
13. Fixed points. The fixed points of the mapping w = 16z⁵ are those points z that are mapped onto themselves, as explained on p. 745. This means that for our given mapping we consider

16z⁵ = z, that is, 16z⁵ − z = z(16z⁴ − 1) = 0.

The first root ("fixed point") is immediate, that is, z = 0. We then have to solve

(B) 16z⁴ − 1 = 0.

For the next fixed points, from basic elementary algebra, we use the factorization

(C) a² − b² = (a + b)(a − b),

which gives

(D) 16z⁴ − 1 = (4z² + 1)(4z² − 1) = 0.

For the second factor in (D) we use (C) again and, setting it to zero, obtain z = 1/2 and z = −1/2. The first factor, 4z² + 1 = 0, yields

z = i/2 and z = −i/2.

We have found five fixed points, and, since a quintic polynomial has five roots (not necessarily distinct), we know we have found all fixed points of w.

Remark. We wanted to show how to solve this problem step by step. However, we could have solved the problem more elegantly by factoring the given polynomial immediately in three steps:

16z⁵ − z = z(16z⁴ − 1) = z(4z² + 1)(4z² − 1) = z(4z² + 1)(2z + 1)(2z − 1).

Another way is to solve the problem in polar coordinates with (15), p. 617 (whose usage is illustrated in Prob. 21 of Sec. 13.2 on p. 264 in this Manual).
17. Linear fractional transformations (LFTs) with fixed points. In general, fixed points of mappings w = f(z) are defined by

(E) w = f(z) = z.

Here f is an LFT:

(1) w = (az + b)/(cz + d).

Taking (E) and (1) together gives the starting point of the general problem of finding fixed points of LFTs, that is,

(az + b)/(cz + d) = w = z.

This corresponds to (5), p. 745. We obtain

(F) az + b − z(cz + d) = 0.

For our problem, we have to find all LFTs with fixed point z = 0. This means that (F) must have the root z = 0. We have from (F) that

(F*) z(cz + d − a) = b,

and the desired fixed point z = 0 makes the left-hand side of (F*) equal to 0, so that the right-hand side of (F*) must be 0; hence

b = 0.

The answer is therefore

(G) w = (az + b)/(cz + d) = (az + 0)/(cz + d) = az/(cz + d).

To check our answer, let us find the fixed points of (G). We have

az/(cz + d) = z, so that az = z(cz + d).

This gives

z(cz + d − a) = 0,

which confirms that z = 0 is a fixed point of (G).
Sec. 17.3 Special Linear Fractional Transformations
happens in connection with boundary value problems for PDEs in two space variables. The term is not a
technical term.
3. Fixed points. To show that a transformation and its inverse have the same fixed points, we proceed as follows. If a function w = f(z) maps z1 onto w1, we have w1 = f(z1), and, by the definition of the inverse f⁻¹, we also have z1 = f⁻¹(w1). Now, for a fixed point z1 = w1, we have z1 = f(z1), hence z1 = f⁻¹(z1), as claimed.
5. Filling in the details of Example 2, p. 748, by formula (2), p. 746. We want to derive the mapping in Example 2, p. 748, from (2), p. 746. As required in Example 2, we set

z1 = 0, z2 = 1, z3 = ∞;  w1 = −1, w2 = −i, w3 = 1

in

(2) [(w − w1)/(w − w3)] · [(w2 − w3)/(w2 − w1)] = [(z − z1)/(z − z3)] · [(z2 − z3)/(z2 − z1)]

and get

(A) [(w + 1)/(w − 1)] · [(−i − 1)/(−i + 1)] = [(z − 0)/(z − ∞)] · [(1 − ∞)/(1 − 0)].

On the left-hand side, we can simplify by (7), p. 610, of Sec. 13.1, and obtain

(−i − 1)/(−i + 1) = (−1 − i)(1 + i)/[(1 − i)(1 + i)] = (−1 − i − i − i²)/(1² + 1²) = −2i/2 = −i.

On the right-hand side, as indicated by Theorem 1, p. 746, we replace (1 − ∞)/(z − ∞) by 1. Together we obtain, from (A),

(−i)(w + 1)/(w − 1) = z.

This gives us the intermediate result

(w + 1)/(w − 1) = z/(−i) = iz.

Note that we used 1/i = −i, hence 1/(−i) = i (by Prob. 1, p. 612, solved on p. 258 in this Manual). We solve for w and get

w + 1 = iz(w − 1);  w + 1 = izw − iz;  w − izw = −iz − 1;  w(1 − iz) = −iz − 1,

so that

w = (−iz − 1)/(1 − iz) = (iz + 1)/(iz − 1),

or, alternatively, multiplying the numerator and the denominator by i,

w = i(−iz − 1)/[i(1 − iz)] = (z − i)/(z + i),

both of which lead to the desired result.
13. LFT for given points. Our task is to determine which LFT maps 0, 1, ∞ onto ∞, 1, 0. Setting

z1 = 0, z2 = 1, z3 = ∞;  w1 = ∞, w2 = 1, w3 = 0

in (2), we have

(B) [(w − ∞)/(w − 0)] · [(1 − 0)/(1 − ∞)] = [(z − 0)/(z − ∞)] · [(1 − ∞)/(1 − 0)].

As required by Theorem 1, we have to replace, on the left-hand side, (w − ∞)/(1 − ∞) by 1, and, on the right-hand side, (1 − ∞)/(z − ∞) by 1. Hence (B) reduces to

1/w = z, that is, w = 1/z,

and here

0 ↦ ∞, 1 ↦ 1, ∞ ↦ 0.

Looking at how these three points are mapped, we conjecture that w = 1/z and see that this
mapping does fulfill the three requirements.

17. Mapping of a disk onto a disk. The point z0 = i/2 is to be mapped onto w = 0, which leads us to Example 4, pp. 748–749. We set

z0 = i/2 and c = z̄0 = −i/2

in (3), p. 749, and obtain

w = (z − z0)/(cz − 1) = (z − i/2)/(−(i/2)z − 1) = (2z − i)/(−iz − 2).

Complete the answer by sketching the images of the lines x = const and y = const.
19. Mapping of an angular region onto a unit disk. Our task is to find an analytic function, w = f(z), that maps the region 0 ≤ arg z ≤ π/4 onto the unit disk |w| ≤ 1. We follow Example 6, p. 749, which combines a linear fractional transformation with another transformation. We know, from Example 2, p. 739, of Sec. 17.1, that t = z⁴ maps the given angular region 0 ≤ arg z ≤ π/4 onto the upper t-half-plane. (Make a sketch, similar to Fig. 382, p. 739.) (Note that the transformation t = z⁸ would map the given region onto the full plane, but this would be of no help in obtaining the desired unit disk in the next step.)
Next we use (2) in Theorem 1, p. 746, to map that t-half-plane onto the unit disk |w| ≤ 1 in the w-plane. We note that this is the inverse problem of the problem solved in Example 3 on p. 748 of the text.

Clearly, the real t-axis (the boundary of the half-plane) must be mapped onto the unit circle |w| = 1. Since no specific points on the real t-axis and their images on the unit circle |w| = 1 are prescribed, we can obtain infinitely many solutions (mapping functions).
For instance, if we map t1 = −1, t2 = 0, t3 = 1 onto w1 = −1, w2 = −i, w3 = 1, respectively—a rather natural choice, under which −1 and 1 are fixed points—we obtain, with these values inserted into (2) in modified form (2*), that is, inserted into

(2*) [(w − w1)/(w − w3)] · [(w2 − w3)/(w2 − w1)] = [(t − t1)/(t − t3)] · [(t2 − t3)/(t2 − t1)],

the equation

(C) [(w + 1)/(w − 1)] · [(−i − 1)/(−i + 1)] = [(t + 1)/(t − 1)] · [(0 − 1)/(0 + 1)].

Cross-multiplying and collecting the terms in w, we obtain

(D) w = (i − t)/(it − 1) = (t − i)/(−it + 1).

From above we know that t = z⁴, which, substituted into (D), gives us

(E) w = (t − i)/(−it + 1) = (z⁴ − i)/(−iz⁴ + 1),

which is the answer given on p. A42. Note that the mapping defined by (E) maps t = i onto w = 0, the center of the disk.
Sec. 17.3 Prob. 19. z-, t-, and w-planes and regions for the given LFT
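That (E) maps the sector onto the unit disk can be spot-checked numerically: the boundary rays arg z = 0 and arg z = π/4 give real t = z⁴, hence |w| = 1, while points between the rays give |w| < 1. The sample points below are our own choices:

```python
import cmath

w = lambda z: (z ** 4 - 1j) / (-1j * z ** 4 + 1)   # mapping (E)

# a point strictly inside the sector (arg z = pi/8) lands inside the disk
inside = abs(w(1.1 * cmath.exp(1j * cmath.pi / 8))) < 1
# points on the boundary rays land on the unit circle
boundary = all(abs(abs(w(z0)) - 1) < 1e-12
               for z0 in (0.7, 1.5, 0.9 * cmath.exp(1j * cmath.pi / 4)))
print(inside, boundary)   # True True
```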
The condition on y gives no further restriction, since y ranges between −π and π. Indeed, the side x = −1/2 of R is mapped onto the circle |w| = e^{−1/2} in the w-plane and the side x = 1/2 onto the circle of radius e^{1/2}. The images of the two horizontal sides of R lie on the negative real w-axis, extending from −e^{−1/2} to −e^{1/2} and coinciding.
Remark. Take another look at Example 4 in Sec. 17.1 on p. 740 to see how other rectangles are
mapped by the complex exponential function.
[Figure: the rectangle ABCD, −1/2 ≤ x ≤ 1/2, −π ≤ y ≤ π, in the z-plane and its image, the annulus e^{−1/2} ≤ |w| ≤ e^{1/2}, in the w-plane, under w = e^z]
(1) w = sin z = sin x cosh y + i cos x sinh y [p. 750 or (6b), p. 634, in Sec. 13.6].
Since cosh y ≥ 1 and, because 0 < x < π/2, also sin x > 0, we have u = sin x cosh y > 0.
This means that the entire image of R lies in the right half-plane of the w-plane. The
lower horizontal side y = 0, 0 < x < π/2, gives v = cos x sinh 0 = 0 and u = sin x,
so it is mapped from 0 to 1 on the real u-axis.
The upper horizontal side y = 2, π/2 > x > 0, is mapped onto the upper right part of the ellipse
u²/cosh² 2 + v²/sinh² 2 = 1   (u > 0, v > 0).
The left side x = 0 gives u = 0 and v = sinh y,
so it is mapped into the v-axis u = 0 from i sinh 2 to 0. Note that, since the region to be mapped consists of
the interior of a rectangle but not its boundary, the graphs also consist of the interior of the regions
without the boundary.
v
D
3
D C
A B A B C
Sec. 17.4 Prob. 11. Rectangle and its image under w = sin z
21. Mapping w D cosz. We note that the rectangle to be mapped is the same as in Prob. 11. We can
solve this problem in two ways.
Method 1. Expressing cosine in terms of sine. We relate the present problem to Prob. 11 by using
cos z = sin(z + π/2).
We set
t = z + π/2.
Then the image of the given rectangle [x in (0, π/2), y in (0, 2)] in the t-plane is bounded by Re t in
(π/2, π) and Im t in (0, 2); i.e., the rectangle is shifted π/2 to the right. Now
w = cos z = sin t,
and use zA, zB, zC, and zD as the four corners of the rectangle as in Prob. 11. Now,
The upper horizontal side y = 2, π/2 > x > 0, is mapped onto the lower right part of the ellipse
u²/cosh² 2 + v²/sinh² 2 = 1   (u > 0, v < 0).
Sec. 17.4 Prob. 20. Given region in the z-plane and its images in the t- and
w-planes for the mapping w = cos z
1. Square root. We are given that z moves from z = 1/4 twice around the circle |z| = 1/4, and we want to know
what w = √z does. We write
(A) z = (1/4)e^{iθ}.
Hence the given mapping is
w = √z = (1/4)^{1/2} e^{iθ/2} [by (A)]
= (1/2) e^{iθ/2}.
Thus, as θ runs from 0 to 4π (twice around the given circle), θ/2 runs from 0 to 2π, so w moves once around the circle |w| = 1/2.
First, we can work with a complex potential F = Φ + iΨ [see (2), p. 760].
This idea is so powerful because (2) allows us to model problems in distinct areas such as in electrostatic
fields (Secs. 18.1, p. 759, 18.2, p. 763, 18.5, p. 777), heat conduction (Sec. 18.3, p. 767), and fluid flow
(Sec. 18.4, p. 771). The main adjustment needed, in each different area, is the interpretation of Φ and Ψ in
(2), specifically the meaning of Φ = const and its associated conjugate potential Ψ = const. In electrostatic
fields, Φ = const are the electrostatic equipotential lines and Ψ = const are the lines of electrical force—
the two types of lines intersecting at right angles. For heat flow, they are isotherms and heat flow lines,
respectively. And finally, for fluid flow, they are equipotential lines and streamlines.
Second, we can apply conformal mapping to potential theory because Theorem 1, p. 763 in Sec. 18.2,
asserts “closure” of harmonic functions under conformal mapping in the sense that harmonic functions
remain harmonic under conformal mapping.
Potential theory is arguably the most important reason for the importance of complex analysis in
applied mathematics. Here, in Chap. 18, the third approach to solving problems in complex analysis—the
geometric approach of conformal mapping applied to solving boundary value problems in two–
dimensional potential theory—comes to full fruition.
As background, it is very important that you remember conformal mapping of basic analytic functions
(power function, exponential function in Sec. 17.1, p. 737, trigonometric and hyperbolic functions in Sec.
17.4, p. 750), and linear fractional transformations [(1), p. 743, and (2), p. 746]. For Sec. 18.1, you may
also want to review Laplace’s equation and Coulomb’s law (pp. 400–401 in Sec. 9.7), for Sec. 18.5,
Cauchy’s integral formula (Theorem 1, p. 660 in Sec. 14.3), and the basics of how to construct Fourier
series (see pp. 476–479, pp. 486–487 in Secs. 11.1 and 11.2, respectively). The chapter ends with a brief
review of complex analysis in part D on p. 371 of this Manual.
We know from electrostatics that the force of attraction between two particles of opposite or the same
charge is governed by Coulomb’s law (12) in Sec. 9.7, p. 401. Furthermore, this force is the gradient of a
function ˆ known as the electrostatic potential. Here we are interested in the electrostatic potential ˆ
because, at any points in the electrostatic field that are free of charge, ˆ is the solution of Laplace’s
equation in 3D:
∇²Φ = Φxx + Φyy + Φzz = 0 (see Sec. 12.11, pp. 593–594, pp. 596–598).
Laplace’s equation is so important that the study of its solutions is called potential theory.
Since we want to apply complex analysis to potential theory, we restrict our studies to two dimensions
throughout the entire chapter. Laplace's equation in 2D becomes
∇²Φ = Φxx + Φyy = 0.
Then the equipotential surfaces Φ(x, y, z) = const (from the 3D case) appear as equipotential lines in the xy-
plane (Examples 1–3, pp. 759–760).
The next part of Sec. 18.1 introduces the key idea that it is advantageous to work with complex
potentials instead of just real potentials. The underlying formula for this bold step is
(2) F(z) = Φ(x, y) + iΨ(x, y),
where F is the complex potential (corresponding to the real potential Φ) and Ψ is the complex conjugate
potential (uniquely determined up to an additive constant, see p. 629 of Sec. 13.4). The advantages
of using complex potentials F are:
1. It is mathematically easier to solve problems with F in complex analysis because we can use conformal
mappings.
2. Formula (2) has a physical meaning. The curves Ψ = const (“lines of force”) intersect the curves Φ =
const (“equipotential lines”) at right angles in the xy-plane because of conformality (p. 738).
Illustrations of (2) are given in Examples 4–6 on p. 761 and in Probs. 3 and 15. The section concludes
with the method of superposition (Example 7, pp. 761–762, Prob. 11).
3. Potential between two coaxial cylinders. The first cylinder has radius r1 = 10 [cm] and potential U1 =
10 [kV]. The second cylinder has radius r2 = 1 [m] = 100 [cm] and potential
U2 = −10 [kV]. From Example 2, p. 759 in Sec. 18.1, we know that the potential Φ(r) between two
coaxial cylinders is given by
Φ(r) = a ln r + b, where a and b are to be determined from the given boundary conditions,
so that
Φ(10) = a ln 10 + b = 10, Φ(100) = a ln 100 + b = −10.
Chap. 18 Complex Analysis and Potential Theory 377
We subtract the second equation from the first to eliminate b and get
20 = a ln 10 − a ln 100 = −a ln 10, hence a = −20/ln 10.
Then b = 10 − a ln 10 = 10 + 20 = 30, so that
Φ(r) = 30 − (20/ln 10) ln r and F(z) = 30 − (20/ln 10) Ln z, where Φ(r) = Re F(z).
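A two-line numerical check (a Python sketch, not part of the Manual) confirms that these values of a and b reproduce the two boundary potentials:

```python
import math

a = -20 / math.log(10)          # from -a ln 10 = 20; a is about -8.6859
b = 10 - a * math.log(10)       # = 30

def phi(r):
    # potential between the coaxial cylinders, Phi(r) = a ln r + b
    return a * math.log(r) + b

assert abs(phi(10) - 10) < 1e-12    # inner cylinder: 10 kV
assert abs(phi(100) + 10) < 1e-12   # outer cylinder: -10 kV
print("a =", round(a, 4), " b =", round(b, 4))
```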
11. Two source lines. Verification of Example 7, pp. 761–762. The equipotential lines in Example 7, p. 761,
are
(A) |(z − c)/(z + c)| = k = const (k and c real).
Hence
|z − c| = k |z + c|.
Using this, and writing (A) in terms of the real and imaginary parts (squaring both sides) and taking all the terms to the
left, we obtain
(B) (1 − K)(x² + y² + c²) − 2cx(1 + K) = 0, where K = k².
We consider two cases. First, consider k = 1, hence K = 1; most terms in (B) cancel, and we are left
with
|z − c|² = |z + c|², hence x = 0 (the y-axis),
on which |(z − c)/(z + c)| = 1 and Ln 1 = 0, so the y-axis is the equipotential line of potential zero.
Second, for K ≠ 1, division of (B) by 1 − K gives
x² + y² + c² − 2Lx = 0, where L = c(1 + K)/(1 − K).
Completing the square,
(x − L)² + y² = L² − c².
This is a circle with center at L on the real axis and radius √(L² − c²). We simplify L² − c² (by inserting L):
L² − c² = c²(1 + K)²/(1 − K)² − c²
= c²[(1 + K)²/(1 − K)² − 1]
= c²[(1 + K)² − (1 − K)²]/(1 − K)²
= c²[(1 + 2K + K²) − (1 − 2K + K²)]/(1 − K)²
= 4c²K/(1 − K)².
Hence
√(L² − c²) = √(4c²K/(1 − K)²) = 2c√K/(1 − K) = 2ck/(1 − k²) (using K = k²).
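The circles just found are the classical circles of Apollonius. A short numerical check (a Python sketch, not part of the Manual; c and k are chosen arbitrarily) confirms the center and the radius:

```python
import cmath
import math

c, k = 2.0, 0.5
K = k * k
L = c * (1 + K) / (1 - K)        # center on the real axis
rad = 2 * c * k / (1 - k * k)    # radius 2ck/(1 - k^2)

# Every point of that circle satisfies the defining relation |z - c|/|z + c| = k:
for j in range(360):
    z = L + rad * cmath.exp(1j * math.radians(j))
    assert abs(abs(z - c) / abs(z + c) - k) < 1e-9
print("center L =", L, " radius =", rad)
```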
z² = (x + iy)² = x² − y² + 2ixy
gives the potential in sectors of opening π/2 bounded by the bisecting straight lines of the quadrants
because
x² − y² = 0 when y = ±x.
Similarly, higher powers of z give potentials in sectors of smaller openings on whose boundaries the
potential is zero. For
Φ0 = Re z³ = x³ − 3xy²
we have
Φ0 = 0 when y = ±x/√3;
these are the boundaries given in the problem, the opening of the sector being π/3, that is, 60°. To
satisfy the other boundary condition, multiply Φ0 by 220 [V].
Here we experience, for the first time, the full power of applying the geometric approach of conformal
mappings to boundary value problems (“Dirichlet problems,” p. 564, p. 763) in two-dimensional potential
theory. Indeed, we continue to solve problems of electrostatic potentials in a complex setting (2), p. 760
(see Example 1, p. 764, Example 2, p. 765; Probs. 7 and 17). However, now we apply conformal mappings
(defined on p. 738 in Sec. 17.1) with the purpose of simplifying the problem by mapping a given domain
onto one for which the solution is known or can be found more easily. This solution, thus obtained, is
mapped back to the given domain.
Our approach of using conformal mappings is theoretically sound and, if applied properly, will give us
correct answers. Indeed, Theorem 1, p. 763, assures us that if we apply any conformal mapping to a given
harmonic function then the resulting function is still harmonic. [Recall that harmonic functions (p. 460 in
Sec. 10.8) are those functions that are solutions to Laplace’s equation (from Sec. 18.1) and have
continuous second-order partial derivatives.]
7. Mapping by w = sin z. Look at Sec. 17.4, pp. 750–751 (also Prob. 11, p. 754 of the textbook, solved on
p. 348 of this Manual) for the conformal mapping by
w = sin z = sin x cosh y + i cos x sinh y.
We conclude that the lower side (zA to zB), 0 < x < π/2 (y = 0), of the given rectangle D maps onto 0 <
u < 1 (v = 0) because cosh 0 = 1 and sinh 0 = 0. The right side (zB to zC), 0 < y < 1 (x = π/2), maps onto
1 < u < cosh 1 (v = 0). The upper side (zC to zD) maps onto a quarter of the ellipse
u²/cosh² 1 + v²/sinh² 1 = 1
in the first quadrant of the w-plane. Finally, the left side (zD to zA) maps onto sinh 1 > v > 0 (u =
0).
In the w-plane the potential is
Φ*(u, v) = u² − v².
Hence Φ = sin² x on the lower side (y = 0), and grows from 0 to 1. On the right side, Φ = cosh² y,
which grows from 1 to cosh² 1. On the upper side, Φ = sin² x cosh² 1 − cos² x sinh² 1, which begins with the value
cosh² 1 and decreases to −sinh² 1. Finally, on the left side it begins with
−sinh² 1 (there Φ = −sinh² y) and returns to its value 0 at the origin. Note that the images of horizontal and vertical line segments lie on the ellipses and hyperbolas
u²/cosh² c + v²/sinh² c = 1,
u²/sin² k − v²/cos² k = 1.
Sec. 18.2 Prob. 7. Given region and image under conformal mapping w = sin z
(A) z = (Z − i/2)/((i/2)Z + 1).
We can multiply both the numerator and denominator in (A) by 2 and get the answer on p. A43 in
App. 2:
(A2) z = (2Z − i)/(iZ + 2).
To complete the problem, we evaluate (A2) with Z = 0.6 + 0.8i and −0.6 + 0.8i, respectively.
We get for Z = 0.6 + 0.8i
(B) z1 = (2Z − i)/(iZ + 2) = (1.2 + 0.6i)/(1.2 + 0.6i) = 1,
which is the desired value. Similarly, you can show that for Z = −0.6 + 0.8i, one gets z = −1. Thus
|Z| = |±0.6 + 0.8i| = √(0.6² + 0.8²) = √1 = 1,
which means that |Z| = 1. And (B), with a similar calculation, shows that our chosen Z's get mapped by
(A2) onto z = ±1, so that indeed |z| = 1. Together, this shows that (A2) is the desired LFT as described in
Prob. 17 and illustrated in Fig. 407, p. 766. Convince yourself that Fig. 407 is correct.
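You can confirm the mapping (A2) numerically (a Python sketch, not part of the Manual):

```python
import cmath
import math

def g(Z):
    # the LFT (A2): z = (2Z - i)/(iZ + 2)
    return (2 * Z - 1j) / (1j * Z + 2)

assert abs(g(0.6 + 0.8j) - 1) < 1e-12    # Z = 0.6 + 0.8i  ->  z = 1
assert abs(g(-0.6 + 0.8j) + 1) < 1e-12   # Z = -0.6 + 0.8i ->  z = -1
assert abs(g(0.5j)) < 1e-12              # note: Z = i/2 goes to the center z = 0

# The unit circle |Z| = 1 goes onto the unit circle |z| = 1:
for j in range(360):
    Z = cmath.exp(1j * math.radians(j))
    assert abs(abs(g(Z)) - 1) < 1e-9
print("LFT checks passed")
```

The circle-to-circle property reflects |2Z − i| = |iZ + 2| for |Z| = 1.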
Complex analysis can model two-dimensional heat problems that are independent of time. From the top
of p. 564 in Sec. 12.6, we know that the heat equation is
(H) Tt = c²∇²T.
We assume that the heat flow is independent of time (“steady”), which means that Tt = 0. Hence (H)
reduces to Laplace's equation
∇²T = Txx + Tyy = 0.
This allows us to introduce methods of complex analysis because T [or T(x, y)] is the real part of the
complex heat potential
F(z) = T(x, y) + iΨ(x, y).
[Terminology: T(x, y) is called the heat potential, Ψ(x, y) = const are called heat flow lines, and T(x, y) =
const are called isotherms.]
It follows that we can reinterpret all the examples of Secs. 18.1 and 18.2 in electrostatics as problems of
heat flow (p. 767). This is another great illustration of Underlying Theme 3 on p. ix of the textbook of the
powerful unifying principles of engineering mathematics.
7. Temperature in thin metal plate. A potential in a sector (in an angular region) whose sides are kept at
constant temperatures is of the form
T(x, y) = aθ + b
(A) = a arctan(y/x) + b
= a Arg z + b (see the similar Example 3 on pp. 768–769).
The two constants, a and b, can be determined from the given values on the two sides Arg z = 0 and
Arg z = π/2. Namely, for Arg z = 0 (the x-axis) we have
T = b = T1.
For Arg z = π/2 we have
T = a·π/2 + T1 = T2,
so that
a = 2(T2 − T1)/π.
Hence
T(x, y) = [2(T2 − T1)/π] Arg z + T1.
Complete the problem by finding the associated complex potential F(z) obeying Re F(z) = T(x, y), and check your result on p. A43 in App. 2 of the textbook.
15. Temperature in thin metal plate with portion of boundary insulated. Mixed boundary value problem.
We start as in Prob. 7 by noting that a potential in an angular region whose sides are kept at
constant temperatures is of the form
T(x, y) = a Arg z + b,
using the fact that Arg z = θ = Im(Ln z) is a harmonic function. We determine the values of the
two constants a and b from the given values on the two sides Arg z = 0 and Arg z = π/4. For Arg z = 0
(the x-axis) we have T = b = −20, and for Arg z = π/4 we have
T = a·π/4 − 20 = 60, so that a = 320/π.
Hence a potential that satisfies the conditions of the problem is
(C) T = (320/π) Arg z − 20.
Now comes an important observation. The curved portion of the boundary (a circular arc) is
insulated. Hence, on this arc, the normal derivative of the temperature T must be zero. But the
normal direction is the radial direction; so the partial derivative with respect to r must vanish. Now
formula (C) shows that T is independent of r, that is, the condition under discussion is automatically
satisfied. (If this were not the case, the whole solution would not be valid.) Finally we derive the
complex potential F. From Sec. 13.7 we recall that
(D) Ln z = ln |z| + i Arg z.
Hence for Arg z to become the real part (as it must be, because F = T + iΨ), we must
multiply both sides of (D) by −i. Indeed, then
−i Ln z = −i ln |z| + Arg z.
Hence from this and (C) we see that the desired complex heat potential is
(E) F(z) = −20 + (320/π)(−i Ln z)
= −20 − (320/π) i Ln z,
which, by (C) and (E), leads to the answer given on p. A43 in App. 2 of the textbook.
The key formula of this section is (3), p. 771,
(3) V = V1 + iV2 = $\overline{F'(z)}$  (the overbar denoting the complex conjugate).
It derives its importance from relating the complex velocity vector of the fluid flow
(1) V = V1 + iV2
to the complex potential F = Φ + iΨ, whose imaginary part Ψ gives the streamlines of the flow in the form
Ψ(x, y) = const.
Similarly, the real part Φ gives the equipotential lines of the flow:
Φ(x, y) = const.
386 Complex Analysis Part D
The use of (3), p. 771, is illustrated for different flows in Example 1 (“flow around a corner,” p. 772), Prob.
7 (“parallel flow”), Example 2, and Prob. 15 (“flow around a cylinder”).
Flows may be compressible or incompressible, rotational or irrotational, or may differ by other general
properties. We reach the connection to complex analysis, that is, Laplace's equation (5) applied to Φ and
Ψ of (2), written out
∇²Φ = Φxx + Φyy = 0, ∇²Ψ = Ψxx + Ψyy = 0,
by first assuming the flow to be incompressible and irrotational (see Theorem 1, p. 773).
Rotational flows can be modeled to some extent by complex logarithms, as shown in the textbook on
pp. 776–777 in the context of a Team Project.
7. Parallel flow. Our task is to interpret the flow with complex potential F(z) = z.
The stream function Ψ gives the streamlines Ψ = const and is generally more important than the
velocity potential Φ, which gives the equipotential lines Φ = const. The flow can best be visualized in
terms of the velocity vector V, which is obtained from the complex potential in the form (3), p. 771,
(3) V = V1 + iV2 = $\overline{F'(z)}$.
(We need no special vector notation in this case, because a complex function V can always be
regarded as a vector function with components V1 and V2.) Hence, for the given complex potential
(A) F(z) = z = x + iy, so that Φ = x and Ψ = y,
we have F′(z) = 1; thus,
(B) V = V1 = 1 and V2 = 0.
The velocity vector in (B) is parallel to the x-axis and is positive, i.e., V D V 1 points to the right (in the
positive x-direction).
Hence we are dealing with a uniform flow (a flow of constant velocity) that is parallel (the
streamlines are straight lines parallel to the x-axis) and is flowing to the right (because V is positive).
From (A) we see that the equipotential lines are vertical parallel straight lines; indeed, Φ = x = const gives the vertical lines x = const.
15. Flow around a cylinder. Here we are asked to change F(z) in Example 2, p. 772, slightly to obtain a
flow around a cylinder of radius r0 that gives the flow in Example 2 if r0 → 1. We set
F(z) = az + 1/(az)
= are^{iθ} + (1/(ar))e^{−iθ} [by (6), p. 631, applied to both terms].
The stream function Ψ is the imaginary part of F. Since, by Euler's formula,
e^{±iθ} = cos θ ± i sin θ,
we obtain
Ψ(r, θ) = Im(F)
= Im[are^{iθ} + (1/(ar))e^{−iθ}]
= Im[ar(cos θ + i sin θ) + (1/(ar))(cos θ − i sin θ)] (by Euler's formula applied twice)
= Im[ar cos θ + (1/(ar)) cos θ + i(ar sin θ − (1/(ar)) sin θ)] (regrouping for the imaginary part)
= ar sin θ − (1/(ar)) sin θ
= (ar − 1/(ar)) sin θ.
The streamlines are the curves Ψ = const. As in Example 2 of the text, the streamline Ψ = 0 consists of
the x-axis (θ = 0 and π), where sin θ = 0, and of the locus where the other factor of Ψ
is zero, that is,
ar − 1/(ar) = 0, thus (ar)² = 1 or a = 1/r.
Since we were given that the cylinder has radius r = r0, we must have
a = 1/r0.
Hence
F(z) = az + 1/(az) = z/r0 + r0/z.
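As a check (a Python sketch, not part of the Manual), Ψ = Im F vanishes both on the cylinder |z| = r0 and on the x-axis, so both belong to the streamline Ψ = 0:

```python
import cmath
import math

r0 = 2.5   # arbitrary cylinder radius

def F(z):
    # complex potential of the flow around the cylinder |z| = r0
    return z / r0 + r0 / z

# Psi = Im F = 0 on the cylinder ...
for j in range(360):
    z = r0 * cmath.exp(1j * math.radians(j))
    assert abs(F(z).imag) < 1e-9
# ... and on the x-axis (z real, z != 0):
for x in (0.1, 1.0, -4.2, 7.3):
    assert abs(F(x).imag) < 1e-12
print("Psi vanishes on the cylinder and on the x-axis")
```

On the cylinder, F(r0·e^{iθ}) = e^{iθ} + e^{−iθ} = 2 cos θ, which is real, as the code confirms.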
Sec. 18.5 Poisson’s Integral Formula for Potentials
The beauty of this section is that it brings together various material from complex analysis and Fourier
analysis. The section applies Cauchy’s integral formula (1), p. 778 (see Theorem 1, p. 660 in Sec. 14.3), to
a complex potential F.z/ and uses it on p. 778 to derive Poisson’s integral formula (5), p. 779.
Take a look at pp. 779–780. Formula (5) yields the potential in a disk D with boundary |z| = R, a circle. Ordinarily the boundary potential is continuous. However, this requirement can be loosened: (5) is
applicable even if the boundary potential is only piecewise continuous, such as in Figs. 405 and 406 of a typical
example of a potential between two semicircular plates (Example 2 on p. 765).
From (5) we obtain the potential in a region R by mapping R conformally onto D, solving the problem in
D, and then using the mapping to obtain the potential in R. The latter is given by the important formula
(7), p. 780,
(7) Φ(r, θ) = a0 + Σ_{n=1}^{∞} (r/R)ⁿ (aₙ cos nθ + bₙ sin nθ).
On the boundary we have r = R, so that in (7)
(r/R)ⁿ = 1ⁿ = 1,
and (7) simplifies to a genuine Fourier series:
(7′) Φ(R, θ) = a0 + Σ_{n=1}^{∞} (aₙ cos nθ + bₙ sin nθ).
To determine (7′) requires that we compute the Fourier coefficients of (7) by (8), p. 780, under the
simplification r = R. Hence the techniques of calculating Fourier series explained in Sec. 11.1, pp.
474–483 of the textbook and pp. 202–208 of Vol. 1 of this Manual, and furthermore in Sec. 11.2, pp.
483–491 of the textbook and pp. 208–211 of Vol. 1 of this Manual, come into play. This is illustrated in
Prob. 13 and Example 1.
7–19. Harmonic functions in a disk. In each of Probs. 7–19 we are given a boundary function Φ(1, θ).
Then, using (7), p. 780, and the related formula (8), we want to find the potential Φ(r, θ) in the
open unit disk r < 1, compute some values of Φ(r, θ), and sketch the equipotential lines. We note
that, typically, these problems are solved by Fourier series as explained above.
7. Sinusoidal boundary values lead to a series (7) that, in this problem, reduces to finitely many terms
(a “trigonometric polynomial”). The given boundary function
Φ(1, θ) = a cos² 4θ
is not immediately one of the terms in (7), but we can express it in terms of a cosine function of
multiple angle as follows. Indeed, in App. 3, p. A64 of the textbook, we read
cos² x = 1/2 + (1/2) cos 2x.
We set x = 4θ and get
cos² 4θ = 1/2 + (1/2) cos 8θ.
Hence
Φ(1, θ) = a cos² 4θ = a/2 + (a/2) cos 8θ.
From (7) we now see immediately that the potential in the unit disk satisfying the given boundary
condition is
Φ(r, θ) = a/2 + (a/2) r⁸ cos 8θ.
Note that the answer is already in the desired form, so we do not need to calculate the Fourier
coefficients by (8)!
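The half-angle identity used above is easy to confirm numerically (a Python sketch, not part of the Manual):

```python
import math

# cos^2(4*theta) = 1/2 + (1/2) cos(8*theta), so the boundary function already
# consists of the constant term and the n = 8 cosine term of the series (7).
for j in range(1000):
    th = -math.pi + j * (2 * math.pi / 1000)
    lhs = math.cos(4 * th) ** 2
    rhs = 0.5 + 0.5 * math.cos(8 * th)
    assert abs(lhs - rhs) < 1e-12
print("identity verified on [-pi, pi]")
```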
13. The given boundary function, Φ(1, θ) = θ if −π/2 < θ < π/2 and 0 otherwise, is odd, so all aₙ are zero, and the
Fourier sine coefficients are
bₙ = (2/π) ∫₀^{π/2} θ sin nθ dθ, n = 1, 2, 3, …
= (2/π) [sin nθ/n² − θ cos nθ/n]₀^{π/2}
= (2/π) [sin(nπ/2)/n² − (π/2) cos(nπ/2)/n].
In particular,
b1 = (2/π)(sin(π/2) − (π/2) cos(π/2)) = (2/π)(1 − 0) = 2/π.
For n = 2, 3, 4, …, we get the following values for the Fourier coefficients:
b2 = (2/π)(0 + π/4) = 1/2, b3 = (2/π)(−1/9 − 0) = −2/(9π), b4 = (2/π)(0 − π/8) = −1/4.
Observe that in computing bₙ for n odd, the cosine terms are zero, while for n even, the sine terms are zero.
Hence, putting it together,
Φ(1, θ) = (2/π) sin θ + (1/2) sin 2θ − (2/(9π)) sin 3θ − (1/4) sin 4θ + + − − ⋯.
From this, we obtain the potential (7) in the disk (R = 1) in the form
(A) Φ(r, θ) = (2/π) r sin θ + (1/2) r² sin 2θ − (2/(9π)) r³ sin 3θ − (1/4) r⁴ sin 4θ + + − − ⋯.
The following figure shows the given boundary potential (straight lines), approximations of it [the sums of
the first one, two, three, and four terms of the series (A) with r = 1, drawn dot-dash], along with an
approximation of the potential on a circle of smaller radius r (the sum of those four terms, drawn
with a long dash). Make a sketch of the disk (a circle) and indicate the boundary values around the circle.
Sec. 18.5 Prob. 13. Boundary potential and approximations for r = 1 and for a smaller radius r
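You can recompute the coefficients numerically and compare them with the closed form obtained above (a Python sketch, not part of the Manual):

```python
import math

def bn_closed(n):
    # closed form obtained above by integration by parts
    return (2 / math.pi) * (math.sin(n * math.pi / 2) / n**2
                            - (math.pi / 2) * math.cos(n * math.pi / 2) / n)

def bn_numeric(n, steps=200000):
    # bn = (1/pi) * integral of theta*sin(n*theta) over (-pi/2, pi/2);
    # the integrand is even, so this equals (2/pi) * integral over (0, pi/2).
    # Midpoint rule:
    h = math.pi / steps
    s = 0.0
    for j in range(steps):
        th = -math.pi / 2 + (j + 0.5) * h
        s += th * math.sin(n * th) * h
    return s / math.pi

for n in range(1, 5):
    assert abs(bn_closed(n) - bn_numeric(n)) < 1e-6
print([round(bn_closed(n), 4) for n in range(1, 5)])   # [0.6366, 0.5, -0.0707, -0.25]
```

The printed values are 2/π, 1/2, −2/(9π), and −1/4, as computed above.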
Recall three concepts (needed in this section): analytic functions (p. 623) are functions that are defined
and differentiable at every point in a domain D: Furthermore, one is able to test whether a function is
analytic by the two very important Cauchy–Riemann equations on p. 625. Harmonic functions (p. 460) are
functions that are solutions to Laplace’s equation r2ˆ D 0 and their second-order partial derivatives are
continuous. Finally, a Dirichlet problem (p. 564) is a boundary value problem where the values of the
function are prescribed (given) along the boundary.
The material is very accessible and needs some understanding of how to evaluate double integrals and
also apply Cauchy’s integral formula (Sec. 14.3, p. 660). We derive general properties of harmonic
functions from analytic functions. Indeed, the first two mean value theorems go together, in that Theorem
1, (p. 781; Prob. 3) is for analytic functions and leads directly to Theorem 2 (p. 782; Prob. 7) for harmonic
functions. Similarly, Theorems 3 and 4 are related to each other. Of the general properties of harmonic
functions, the maximum principle of Theorem 4, p. 783, is quite important. The chapter ends on a high
note with Theorem 5, p. 784, which states that an existing solution to a Dirichlet problem for the 2D
Laplace equation must be unique.
Orientation. We have reached the end of Part D on complex analysis, a field whose diversity of topics
and richness of ideas may represent a challenge to the student. Thus we include, for study purposes, a
brief review of complex analysis on p. 371 of this Manual.
3. Mean value of an analytic function. Verification of Theorem 1, p. 781, for the given problem. The
problem is to verify that Theorem 1, p. 781, holds for
(A) F(z) = (3z − 2)²,
that is, that the mean value formula
(2*) F(z0) = (1/(2π)) ∫₀^{2π} F(z0 + re^{iα}) dα
holds for (A). Here we integrate F(z) = (3z − 2)² around the circle |z − 4| = 1 of radius r = 1 and center
z0 = 4, and hence we have to verify (2) with these values. Since
F(4) = (3·4 − 2)² = 100,
we have to show that the integral on the right-hand side of (2*) takes on that value of 100, that is,
we must show that
(2**) (1/(2π)) ∫₀^{2π} F(4 + 1·e^{iα}) dα = 100.
We go in a stepwise fashion. The path of integration is the circle |z − 4| = 1, so that
z = z0 + re^{iα} = 4 + 1·e^{iα} = 4 + e^{iα}.
Then
F(z0 + e^{iα}) = [3(4 + e^{iα}) − 2]² = (12 + 3e^{iα} − 2)² = (10 + 3e^{iα})² = 100 + 60e^{iα} + 9e^{2iα}.
First we compute the indefinite integral:
∫ F(4 + 1·e^{iα}) dα = ∫ (100 + 60e^{iα} + 9e^{2iα}) dα
= 100 ∫ dα + 60 ∫ e^{iα} dα + 9 ∫ e^{2iα} dα
= 100α + (60/i)e^{iα} + (9/(2i))e^{2iα}.
Next we consider the definite integral
∫₀^{2π} F(4 + 1·e^{iα}) dα = [100α + (60/i)e^{iα} + (9/(2i))e^{2iα}]_{α=0}^{α=2π}.
At the upper limit α = 2π the value is
100·2π + (60/i)e^{2πi} + (9/(2i))e^{4πi} = 200π + 60/i + 9/(2i) = 200π + 129/(2i),
and at the lower limit α = 0 it is
0 + 60/i + 9/(2i) = 129/(2i).
Hence the difference between the value at the upper limit and the value at the lower limit is
∫₀^{2π} F(4 + 1·e^{iα}) dα = 200π + 129/(2i) − 129/(2i) = 200π.
The integral in (2**) has a factor 1/(2π) in front, so that we put that factor in front of the last
integral and obtain
(1/(2π)) ∫₀^{2π} F(4 + 1·e^{iα}) dα = (1/(2π))·200π = 100, where 100 = F(4).
Thus we have shown that (2**) holds and thereby verified Theorem 1 for (A).
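The same verification can be done numerically (a Python sketch, not part of the Manual), approximating the mean of F over the circle by an equally spaced sum:

```python
import cmath
import math

def F(z):
    return (3 * z - 2) ** 2

n = 3600
# mean of F over the circle |z - 4| = 1, sampled at n equally spaced angles
mean = sum(F(4 + cmath.exp(1j * 2 * math.pi * (j + 0.5) / n))
           for j in range(n)) / n
assert abs(mean - 100) < 1e-9    # mean value = F(4) = 100
print("mean over the circle:", round(mean.real, 6))   # prints 100.0
```

The equally spaced sum is exact here (up to roundoff) because the oscillatory terms 60e^{iα} and 9e^{2iα} average to zero over a full period.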
7. Mean values of harmonic functions. Verification of Theorem 2, p. 782. Our problem is similar in spirit
to that of Prob. 3 in that it requires us to verify another mean value theorem for a given example—
here for a harmonic function. Turn to p. 782 and look at the two formulas [one with no number, one
numbered (3)] in the proof of Theorem 2. We shall verify them for a given function Φ defined at a point
(x0, y0) and on a circle. To get better familiarity with the material, you may want to write down all the
details of the solution with the integrals, as we did in Prob. 3. We verify Theorem 2 for
(B) Φ(x, y) = (x − 1)(y − 1), with center z0 = 2 − 2i and the circle |z − z0| = 1.
The function Φ(x, y) is indeed harmonic (for the definition, see pp. 628 and 758–759). You should verify this
by differentiation, that is, by showing that Φ is a solution of
∇²Φ = Φxx + Φyy = 0 [(1), p. 759].
We continue. We note that z0 = x0 + iy0 = 2 − 2i is the center of the circle in (B).
In terms of the real and imaginary parts, the path 2 − 2i + e^{iα} is then [by Euler's formula (5), p. 634 in
Sec. 13.6]
(C) x = 2 + cos α, y = −2 + sin α.
This is the representation we need, since Φ is a real function of the two real variables x and y. We see that
Φ(x0, y0) = (2 − 1)(−2 − 1) = −3.
Hence we have to show that each of the two mean values equals −3.
Substituting (C) into (B) (which is a completely schematic process) gives
(D) Φ = (1 + cos α)(−3 + sin α) = −3 + sin α − 3 cos α + cos α sin α,
so that
∫₀^{2π} (−3 + sin α − 3 cos α + cos α sin α) dα
= [−3α − cos α − 3 sin α + (1/2) sin² α]₀^{2π}
= (−6π − cos 2π − 3 sin 2π + (1/2) sin² 2π) − (−cos 0 − 3 sin 0 + (1/2) sin² 0)
= (−6π − 1) − (−1) = −6π.
We have to multiply this result by a factor 1/(2π). (This is the factor in front of the unnumbered
formula of the first integral in the proof of Theorem 2.) Doing so we get
(1/(2π))·(−6π) = −3.
This is the mean value of the given harmonic function over the circle considered and completes the
verification of the first part of the theorem for our given data.
Next we work on (3), p. 782. Now we calculate the mean value over the disk of radius 1 and center
(2, −2). The integrand of the double integral in formula (3) in the proof of Theorem 2 is similar
to that in (D). However, in (D) we had r = 1 (the circle over which we integrated), whereas now we
have r variable and we integrate over it from 0 to 1. In addition we have a factor r resulting
from the element of area in polar coordinates, which is r dr dα. Hence, instead of (1 + cos α)(−3 + sin α)
in (D), we now have
r(1 + r cos α)(−3 + r sin α) = r(−3 + r sin α − 3r cos α + r² cos α sin α).
Integrating over α from 0 to 2π, the terms with sin α, cos α, and cos α sin α = (1/2) sin 2α all integrate to zero, so the inner integral equals
−3·2πr = −6πr.
Hence
∫₀^{1} (−6πr) dr = −6π ∫₀^{1} r dr = −6π [r²/2]₀^{1} = −3π.
In front of the double integral we have the factor 1/(πr0²) = 1/π because the circle of integration has
radius r0 = 1. Hence the mean value over the disk is (1/π)·(−3π) = −3, as expected.
Remark. The problem requires you only to verify (3). We also verified the first formula in the
proof of Theorem 2 to give you a more complete illustration of the theorem.
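Both mean values can also be checked numerically (a Python sketch, not part of the Manual):

```python
import math

def phi(x, y):
    return (x - 1) * (y - 1)

N = 2000
# mean over the circle of radius 1 about (2, -2):
circ = sum(phi(2 + math.cos(2 * math.pi * (j + 0.5) / N),
               -2 + math.sin(2 * math.pi * (j + 0.5) / N))
           for j in range(N)) / N
assert abs(circ + 3) < 1e-9

# mean over the disk, in polar coordinates with area element r dr d(alpha):
M = 400
total = 0.0
for i in range(M):
    r = (i + 0.5) / M
    for j in range(M):
        alpha = 2 * math.pi * (j + 0.5) / M
        total += phi(2 + r * math.cos(alpha), -2 + r * math.sin(alpha)) * r
disk = total * (1.0 / M) * (2 * math.pi / M) / math.pi   # divide by the area pi*1^2
assert abs(disk + 3) < 1e-6
print("circle mean:", round(circ, 6), " disk mean:", round(disk, 6))
```

Both printed means agree with Φ(2, −2) = −3.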
19. Location of maxima of a harmonic function and its conjugate. The question is whether a harmonic
function Φ and a harmonic conjugate Ψ in a region R have their maximum at the same point of R.
The answer is “not in general.” We look for a counterexample that is as simple as possible. For
example, a simple case would be the conjugate harmonic functions
Φ = x = Re z and Ψ = y = Im z in the square 0 ≤ x ≤ 1, 0 ≤ y ≤ 1.
Then Φ attains its maximum 1 on the whole side x = 1 and Ψ on the whole side y = 1, and (1, 1) is a point
where both functions Φ and Ψ have a maximum. But, if we leave out that point (1, 1) from the square
and consider only the remaining points, then Φ still attains its maximum at the points (1, y), y < 1, and Ψ at the points (x, 1), x < 1,
but there is no longer a point at which both are maximal.
You may want to investigate the question further. What about a triangle, a square with vertices ±1,
±i, and so on?
Since complex analysis is a rather diverse area, we include this brief review of the essential ideas of
complex analysis. Our main point is that to get a good grasp of the field, keep the three approaches
(methods) of complex analysis apart and firmly planted in your mind. This is in tune with Underlying
Theme 4 of “Clearly identifying the conceptual structure of subject matter” on p. x of the textbook. The
three approaches were [with particularly important sections marked in boldface, page references given
for the Textbook (T) and this Manual (M)]:
1. Evaluating integrals by Cauchy’s integral formula [see Sec. 14.3, p. 660 (T), p. 291 (M); general
background Chap. 13, p. 608 (T), p. 257 (M), and Chap. 14, p. 643 (T), p. 283 (M)]. The method required
a basic understanding of analytic functions [p. 623 (T), p. 267 (M)], the Cauchy–Riemann equations [p.
625 (T), p. 269 (M)], and Cauchy’s integral theorem [p. 653 (T), p. 288 (M)].
2. Residue integration [applied to complex integrals see Sec. 16.3, p. 719 (T), p. 322 (M); applied to real
integrals see Sec. 16.4, p. 725 (T), p. 326 (M); general background Chap. 15, p. 671 (T), p. 298 (M), and
Chap. 16, p. 708 (T), p. 316 (M)]. The method needed a basic understanding of radius of convergence of
power series and the Cauchy–Hadamard formula [p. 683 (T), p. 303 (M)] and Taylor series [p. 690 (T), p.
309 (M)]. This led to the very important Laurent series [which allowed negative powers, p. 709 (T), p.
316 (M)] and gave us order of singularities, poles, and zeros [p. 717 (T), p. 320 (M)].
3. Geometric approach of conformal mapping applied to potential theory [in electrostatic fields Sec. 18.1,
p. 759 (T), p. 353 (M); Sec. 18.2, p. 763 (T), p. 357 (M); Sec. 18.5, p. 777 (T), p. 364 (M), in heat
conduction, Sec. 18.3, p. 767 (T), p. 359 (M), in fluid flow in Sec. 18.4, p. 771 (T), p. 361 (M); general
background in Chap. 17, p. 736 (T), p. 332 (M)]. The method required an understanding of conformal
mapping [p. 738 (T), p. 333 (M)], linear fractional transformations [p. 743 (T), p. 339 (M)], and their
fixed points [pp. 745, 746 (T), pp. 339, 341 (M)], and a practical understanding of how to apply
conformal mappings to basic complex functions.
In general, just like in regular calculus, you have to know basic complex functions (sine, cosine,
exponential, logarithm, power function, etc.) and know how they are different from their real
counterparts. You have to know Euler’s formula [(5), p. 634 (T), p. 277 (M)] and Laplace’s equation
[Theorem 3, p. 628 (T), p. 269 (M)].
400 Numeric Analysis Part E
PART E
Numeric
Analysis
Numeric analysis in Part E (also known as numerics or numerical analysis) is an area rich in applications
that include modeling chemical or biological processes, planning ecologically sound heating systems,
determining trajectories of satellites and spacecraft, and many others. Indeed, in your career as an
engineer, physicist, applied mathematician, or in another field, it is likely that you will encounter projects
that will require the use of some numerical methods, with the help of some software or CAS (computer
algebra system), to solve a problem by generating results in terms of tables of numbers or figures.
The study of numeric analysis completes your prior studies in the sense that a lot of the material you
learned before from a more algebraic perspective is now presented again from a numeric perspective. At
first, we familiarize you with general concepts needed throughout numerics (floating point, roundoff,
stability, algorithm, errors, etc.) and with general tasks (solution of equations, interpolation, numeric
integration and differentiation) in Chap. 19. Then we continue with numerics for linear systems of
equations and eigenvalue problems for matrices in Chap. 20—material previously presented in an
algebraic fashion in Chaps. 7 and 8. Finally, in Chap. 21 we discuss numerical methods for differential
equations (ODEs and PDEs) and thus related to Part A and Chap. 12.
Use of Technology. We have listed on pp. 788–789 software, computer algebra systems (CASs),
programmable graphic calculators, computer guides, etc. In particular, note the Mathematica Computer
Guide and Maple Computer Guide (for stepwise guidance on how to solve problems by writing programs
for these two CASs) by Kreyszig and Norminton that accompany the textbook (see p. 789). However, the
problems in the problem sets in the textbook can be solved by a simple calculator, perhaps with some
graphing capabilities, except for the CAS projects, CAS experiments, or CAS problems (see Remark on
Software Use on p. 788 of textbook).
3. Programming
as shown on p. 791. Solving a single equation of the form f(x) = 0, as shown in Sec. 19.2, may serve as one
of many illustrations.
From calculus, you should review Taylor series [in formula (1), p. 690, replace complex z with real x],
limits and convergence (see pp. 671–672), and, for Sec. 19.5, review, from calculus, the basics of how one
developed, geometrically, the Riemann integral.
Make sure that you understand Example 1. Here is a self-test. (a) Round the number 1:23454621 to seven
decimals, abbreviated (7D). (b) Round the number 398:723555 to four decimals (4D). Please close
this Student Solutions Manual (!). Check the answer on p. 27 of the Manual. If your answer is correct,
great.
If not, go over your answers and study Example 1 again.
The standard decimal system is not very useful for scientific computation, and so we introduce the
floating-point system on p. 791. For instance,

0.02000 = 0.2000 · 10^(−1),

where a fixed number of significant digits is kept and the magnitude is carried in the power of 10. (The
example on p. 791 also writes a number with 13 leading zeros compactly in this way.)
The roundoff rule for significant digits is as follows. To round a number x to k significant digits, do the
following three steps:
1. Write x in the form x = ±m · 10^n, 0.1 ≤ |m| < 1, where n is an integer [see also (1), p. 792].
2. For now, ignore the factor 10^n. Apply the roundoff rule (for decimals) on p. 792 to m only.
3. Take the result from step 2 and multiply it by 10^n. This gives the desired number x rounded to k
significant digits.
Self-test: Apply the roundoff rule for significant digits to round 102:89565 to six significant digits (6S).
Check your result on p. 27.
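The three steps above can be sketched in code. This is our own illustration, not from the textbook; the helper name round_sig is hypothetical, and ordinary floating-point arithmetic may differ from exact decimal rounding in borderline cases.

```python
import math

def round_sig(x, k):
    """Round x to k significant digits by the three-step rule:
    write x = ±m * 10**n with 0.1 <= |m| < 1, round m to k decimals, rescale."""
    if x == 0:
        return 0.0
    n = math.floor(math.log10(abs(x))) + 1       # step 1: exponent so 0.1 <= |m| < 1
    m = abs(x) / 10**n                           # step 1: the mantissa m
    m = math.floor(m * 10**k + 0.5) / 10**k      # step 2: add half a unit, then chop
    return math.copysign(m * 10**n, x)           # step 3: restore 10**n and the sign

print(round_sig(102.89565, 6))   # self-test value: rounds to 102.896 (6S)
```

The same helper reproduces the self-test answer checked on p. 27.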
The computations in numerics of unknown quantities are approximations, that is, they are not exact but
involve errors (p. 794). Rounding, as just discussed by the roundoff rule, produces roundoff errors
bounded by (3), p. 793. To gain accuracy in calculations that involve rounding, one may carry extra digits
called guarding digits (p. 793). Severe problems in calculations may involve the loss of significant digits
that can occur when we subtract two numbers of about the same size as shown in Example 2 on pp. 793–
794 and in Problem 9.
We also distinguish between the error, defined by (6) and (6*), and the relative error, defined by (7) and
(7'), p. 794. The relative error is given by

Relative error = Error / True value   (where True value ≠ 0).
As one continues to compute over many steps, errors tend to get worse, that is, they propagate. In
particular, bounds for errors add under addition and subtraction and bounds for relative errors add under
multiplication and division (see Theorem 1, p. 795).
Other concepts to consider are underflow, overflow (p. 792), basic error principle, and algorithm (p.
796). Most important is the concept of stability because we want algorithms to be stable in that small
changes in our initial data should only cause small changes in our final results.
Remark on calculations and exam. Your answers may vary slightly in some later digits from the answers
given here and those in App. 2 of the textbook. You may have used different order of calculations,
rounding, technology, etc. Also, for the exam, ask your professor what technology is allowed and be
familiar with the use and the capabilities of that technology as it may save you valuable time on the exam
and give you a better grade. It may also be a good idea, for practice, to use the same technology for your
homework.
4 Numeric Analysis Part E
9. Quadratic equation. The given coefficients are

a = 1,   b = −30,   c = 1.

(a) 4S. We get, for the square-root term calculated with 4S (“significant digits,” see pp. 791–792),

√((−30)² − 4) = √(900 − 4) = √896 = 29.933 = 29.93 (4S)
and hence, by the usual formula (4),

x1 = (30 + 29.93)/2 = 29.97,   x2 = (30 − 29.93)/2 = 0.035.

It is important to notice that x2, obtained from 4S values, is just 2S—i.e., we have lost two digits. As an
alternative method of solution for x2, use (5), p. 794,

(5)   x1 = (−b − sgn(b)·√(b² − 4ac))/(2a),   x2 = c/(a·x1).

The root x1 (where the similar-size numbers are added) equals 29.97, as before. For x2, you now obtain

x2 = c/(a·x1) = 1/29.97 = 0.0333667 = 0.03337 (to four significant digits).
(b) 2S. With 2S the calculations are as follows. We have to calculate the square root as

√((−30)² − 4) = √(900 − 4) = √896 = √900 = 30 (to two significant digits, i.e., 2S).
Hence, by (4),

x1 = (30 + 30)/2 = 60/2 = 30

and
Chap. 19 Numerics in General 5
x2 = (30 − 30)/2 = 0.

In contrast, from (5), you obtain better results for the second root. We still have x1 = 30, but now
x2 = c/(a·x1) = 1/30 = 0.033 (2S).
Purpose of Prob. 9. The point of this and similar examples and problems is not to show that calculations
with fewer significant digits generally give inferior results (this is fairly plain, although not always the
case). The point is to show, in terms of simple numbers, what will happen in principle, regardless of the
number of digits used in a calculation. Here, formula (4) illustrates the loss of significant digits, easily
recognizable when we work with pencil (or calculator) and paper, but difficult to spot in a long
computation in which only a few (if any) intermediate results are printed out. This explains the necessity
of developing programs that are virtually free of possible cancellation effects.
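The cancellation is easy to reproduce by simulating k-digit arithmetic. The following sketch is our own illustration; the helper sig rounds to k significant digits (half up, as in the text's roundoff rule) using Python's decimal module.

```python
from decimal import Decimal, ROUND_HALF_UP

def sig(d, k):
    """Round the Decimal d to k significant digits (round half up)."""
    if d == 0:
        return d
    q = Decimal(1).scaleb(d.adjusted() - k + 1)  # one unit in the k-th significant digit
    return d.quantize(q, rounding=ROUND_HALF_UP)

a, b, c = Decimal(1), Decimal(-30), Decimal(1)
root = sig(sig(b*b - 4*a*c, 4).sqrt(), 4)   # sqrt(896) -> 29.93 (4S)
x1      = sig((-b + root) / (2*a), 4)       # 29.97: terms of similar size are added
x2_bad  = sig((-b - root) / (2*a), 4)       # 0.03500: cancellation, only ~2S correct
x2_good = sig(c / (a * x1), 4)              # 0.03337: formula (5), x2 = c/(a*x1)
```

Printing the four values reproduces the 4S results of part (a) above.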
19. We obtain the Maclaurin series for the exponential function by (12), p. 694, of the textbook where
we replace z, a complex number, by u, a real number. [For those familiar with complex numbers, note
that (12) holds for any complex number z D x C iy and so in particular for z D x C iy D x C i 0 D x D
Rez, thereby justifying the use of (12)! Or consult your old calculus book. Or compute it yourself.]
Anyhow, we have

(12')   f(x) = e^x = 1 + x/1! + x²/2! + x³/3! + x⁴/4! + x⁵/5! + ··· + x¹⁰/10! + ··· .
[All computations to six digits (6S).] We are given that the exact 6S value of 1/e is

(A)   1/e = 0.367879 (6S).

(a) Direct computation of e^(−1), that is, (12') with x = −1: with five terms,

(B5)   1 − 1/1! + 1/2! − 1/3! + 1/4! − 1/5! = 0.366667 (6S).
With eight terms,

(B8)   1 − 1/1! + 1/2! − 1/3! + 1/4! − 1/5! + 1/6! − 1/7! + 1/8! = 0.367882,

Error = (A) − (B8) = 0.367879 − 0.367882 = −0.000003.

With ten terms,

(B10)   1 − 1/1! + 1/2! − 1/3! + 1/4! − 1/5! + 1/6! − 1/7! + 1/8! − 1/9! + 1/10! = 0.367879,

Error = (A) − (B10) = 0.367879 − 0.367879 = 0.
(b) For the 1/e¹ method, that is, computing e^x with x = 1 and then taking the reciprocal, we get

(C)   f(1) = e = 1 + 1/1! + 1/2! + 1/3! + 1/4! + 1/5! + ··· + 1/10! + ··· ,

so (C), with five terms, is

(C5)   e ≈ 1 + 1/1! + 1/2! + 1/3! + 1/4! + 1/5! = 2.71667,

giving the reciprocal

(C5*)   1/e = 1/2.71667 = 0.368098 [using the result of (C5)],

Error = (A) − (C5*) = 0.367879 − 0.368098 = −0.000219.
This is much better than the corresponding result (B5) in (a). With seven terms we obtain

(C7)   1 + 1/1! + 1/2! + 1/3! + 1/4! + 1/5! + 1/6! + 1/7! = 2.71825,

(C7*)   1/2.71825 = 0.367884.

This result is almost as good as (B8) in (a), that is, the one with eight terms. With ten terms we get

(C10)   1 + 1/1! + 1/2! + ··· + 1/10! = 2.71828,
Sec. 19.1, Prob. 19. Table. Computation of e^(−1) and 1/e¹ for the Maclaurin series, as a
computer would do it. The exact 6S value is 0.367879; errors are Exact − Result.

No. of  Term            (a) e^(−1)   Error          (b) e¹       1/e¹         Error
terms   1/n!            result (B)   Exact − (B)    result (C)   result (C*)  Exact − (C*)
 0      1               1            −0.632121      1            1            −0.632121
 1      1               0             0.367879      2            0.5          −0.132121
 2      0.5             0.5          −0.132121      2.5          0.4          −0.032121
 3      0.166667        0.333333      0.034546      2.66667      0.375000     −0.007121
 4      0.0416667       0.375000     −0.007121      2.70834      0.369230     −0.001351
 5      0.00833333      0.366667      0.001212      2.71667      0.368098     −0.000219
 6      0.00138889      0.368056     −0.000177      2.71806      0.367909     −0.000030
 7      0.000198413     0.367858      0.000021      2.71826      0.367882     −0.000003
 8      0.0000248016    0.367883     −0.000004      2.71828      0.367880     −0.000001
 9      0.00000275573   0.367880     −0.000001      2.71828      0.367880     −0.000001
10      0.000000275573  0.367880     −0.000001      2.71828      0.367880     −0.000001

In column (a) the terms 1/n! of (12') are added with alternating signs (x = −1); in column (b) all terms
are added with plus signs (x = 1).
giving

(C10*)   1/2.71828 = 0.367879,

in agreement with (A).
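The comparison of the two methods can be checked quickly in code. This sketch (ours) uses ordinary double precision rather than 6S arithmetic, but the qualitative conclusion is the same: with five terms, the reciprocal method is markedly more accurate.

```python
import math

def exp_partial(x, n):
    """Partial sum of the Maclaurin series of e**x through the x**n/n! term."""
    return sum(x**k / math.factorial(k) for k in range(n + 1))

exact  = 1.0 / math.e
direct = exp_partial(-1.0, 5)        # like (B5): alternating series for 1/e
recip  = 1.0 / exp_partial(1.0, 5)   # like (C5*): compute e first, then 1/e
assert abs(recip - exact) < abs(direct - exact)   # reciprocal method wins
```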
(1)   f(x) = 0
appears in many applications in engineering. This problem appeared, for example, in the context of
characteristic equations (Chaps. 2, 4, 8), finding eigenvalues (Chap. 8), and finding zeros of Bessel
functions (Chap. 12). We distinguish between algebraic equations, where (1) is a polynomial equation,
and transcendental equations, such as

f(x) = tan x − x = 0.

In the former case, the solutions of (1) are also called roots, and the problem of finding them is called
finding roots.
Since, in general, there are no direct formulas for solving (1), except in a few simple cases, the task of
solving (1) is made for numerics.
The first method described is a fixed-point iteration on pp. 798–801 in the text and illustrated by
Example 1, pp. 799–800, and Example 2, pp. 800–801. The main idea is to transform equation (1) from
above by some algebraic process into the form
(2) x D g.x/:
This in turn leads us to choose an x0 and compute x1 = g(x0), x2 = g(x1), and in general

(3)   x_{n+1} = g(x_n),   n = 0, 1, 2, ... .

We have set up an iteration: we substitute x0 into g and get g(x0) = x1, the next value for the iteration,
and so on.
A solution to (2) is called a fixed point as motivated on top of p. 799. Furthermore, Example 1
demonstrates the method and shows that the “algebraic process” that we use to transform (1) to (2) is
not unique. Indeed, the quadratic equation in Example 1 is written in two ways, (4a) and (4b), and the
corresponding iterations are illustrated in Fig. 426 at the bottom of p. 799. Making the “best” choice for g(x)
can pose a significant challenge. More on this method is given in Theorem 1 (sufficient condition for
convergence), Example 2, and Prob. 1.
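A sketch of the iteration (3) in code (our own illustration): we rewrite the quadratic x² − 3x + 1 = 0 as x = g(x) = (x² + 1)/3, an illustrative choice of g that converges because |g′(x)| = 2|x|/3 < 1 near the fixed point.

```python
def fixed_point(g, x0, tol=1e-10, max_iter=100):
    """Iterate x_{n+1} = g(x_n) until successive values agree within tol."""
    x = x0
    for _ in range(max_iter):
        x_new = g(x)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    raise RuntimeError("fixed-point iteration did not converge")

# x**2 - 3*x + 1 = 0 rewritten as x = (x**2 + 1)/3
root = fixed_point(lambda x: (x*x + 1) / 3, x0=1.0)   # -> (3 - sqrt(5))/2
```

A different algebraic rewrite of the same equation can diverge; that is the point of comparing (4a) and (4b) in Example 1.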
Most important in this section is the famous Newton method. The method is defined recursively by

(5)   x_{n+1} = x_n − f(x_n)/f′(x_n),   where n = 0, 1, 2, ..., N − 1.
The details are given in Fig. 428 on p. 801 and in the algorithm in Table 19.1 on p. 802. Newton’s
method can either be derived by a geometric argument or by Taylor’s formula (5*), p. 801. Examples 3,
4, 5, and 6 show the versatility of Newton’s method in that it can be applied to transcendental and
algebraic equations. Problem 21 gives complete details on how to use the method. Newton’s method
converges with second order (Theorem 2, p. 804). Example 7, p. 805, shows when Newton's method runs
into difficulties due to ill-conditioning, that is, when the denominator of (5) is very small in absolute
value near a solution of (1).
Newton’s method can be modified if we replace the derivative f 0.x/ in (5) by the difference quotient
f 0.xn/ D f.x n/
xn xn 1
and simplify algebraically. The result is the secant method given by (10), p. 806, which is illustrated by
Example 8 and Prob. 27. Its convergence is superlinear (nearly as fast as Newton’s method). The method
may be advantageous over Newton's method when the derivative is difficult or computationally
expensive to evaluate.
where s is such that g(s) = s [the intersection of y = x and y = g(x) in Fig. 427] and
Suppose we start with x1 > s. Then g(x1) ≤ g(s) by (C). If g(x1) = g(s) [which could happen if
g(x) is constant between s and x1], then x1 is a solution of f(x) = 0, and we are done. Otherwise
g(x1) < g(s), and by the definition of x2 [formula (3), p. 798, in the text] and since s is a fixed point
[s = g(s)], we obtain

x2 = g(x1) < g(s) = s.

Hence, by (B),

g(x2) ≥ g(s).
The equality sign would give a solution, as before. Strict inequality, and the use of (3) in the text, give
x3 = g(x2) > s, and so on. This gives a sequence of values that are alternately larger and smaller than s,
as illustrated in Fig. 427 of the text.
Complete the problem by considering monotonicity, as in Example 1, p. 799.
21. Newton’s method. The equation is f.x/ D x3 5x C 3 D 0 with x0 D 2;0; 2: The derivative of f.x/ is
f 0.x/ D 3x2 5:
3xn2 5 :
We have nothing to compute for the iteration n = 0. For the iteration n = 1 we have

x1 = x0 − f(x0)/f′(x0)
   = 2 − (2³ − 5·2 + 3)/(3·2² − 5)
   = 2 − (8 − 10 + 3)/(12 − 5)
   = 2 − 1/7
   = 2 − 0.1428571429
   = 2 − 0.142857 = 1.85714 (6S).
For n = 2,

x2 = x1 − f(x1)/f′(x1)
   = 1.85714 − [(1.85714)³ − 5·1.85714 + 3]/[3·(1.85714)² − 5]
   = 1.85714 − 0.119518/5.34691
   = 1.85714 − 0.0223527 = 1.83479 (6S).
For n = 3,

x3 = x2 − f(x2)/f′(x2)
   = 1.83479 − [(1.83479)³ − 5·1.83479 + 3]/[3·(1.83479)² − 5]
   = 1.83479 − 0.00278677/5.09936
   = 1.83479 − 0.000546494 = 1.834243506 = 1.83424 (6S).
For n = 4 we obtain

x4 = x3 − f(x3)/f′(x3)
   = 1.83424 − [(1.83424)³ − 5·1.83424 + 3]/[3·(1.83424)² − 5]
   = 1.83424 (6S).

Because we have the same value for the root (6S) as we had in the previous iteration, we are
finished.
Hence the iterative sequence converges to x4 = 1.83424 (6S), which is the first root of the given
cubic polynomial.

The next set of iterations starts with x0 = 0 and converges to x4 = 0.656620 (6S), which is the second
root of the given cubic polynomial. Finally, starting with x0 = −2 yields x4 = −2.49086 (6S).
The details are given in the three-part table on the next page. Note that your answer might vary
slightly in the last digits, depending on what CAS or software or calculator you are using.
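The three runs can be reproduced with a short routine. This is an illustrative sketch; the tolerance and iteration cap are our own choices.

```python
def newton(f, df, x0, tol=1e-9, max_iter=50):
    """Newton's method (5): x_{n+1} = x_n - f(x_n)/f'(x_n)."""
    x = x0
    for _ in range(max_iter):
        step = f(x) / df(x)
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("Newton's method did not converge")

f  = lambda x: x**3 - 5*x + 3
df = lambda x: 3*x**2 - 5
roots = [newton(f, df, x0) for x0 in (2.0, 0.0, -2.0)]
# roots approx 1.83424, 0.656620, -2.49086, matching the 6S values above
```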
27. Secant method. We apply the secant method (10), p. 806, to the same cubic equation as in Prob. 21,

(P)   x³ − 5x + 3 = 0.

This time we are looking for only one root, starting from the given values x0 = 1.0 and x1 = 2.0. The
constant 3 in f cancels in the difference f(x_n) − f(x_{n−1}) in the denominator, and we get the following
formula for our iteration:

x_{n+1} = x_n − (x_n³ − 5x_n + 3)(x_n − x_{n−1}) / [x_n³ − x_{n−1}³ − 5(x_n − x_{n−1})].

For n = 1,

x2 = 2.0 − [(2.0)³ − 5·2.0 + 3]·(2.0 − 1.0) / [(2.0)³ − (1.0)³ − 5·(2.0 − 1.0)]
   = 2.0 − 1·1/2 = 2.0 − 0.50 = 1.5 (exact).
Next we use x1 = 2.0 and x2 = 1.5 to get

x3 = x2 − (x2³ − 5x2 + 3)(x2 − x1) / [x2³ − x1³ − 5(x2 − x1)]
   = 1.5 − [(1.5)³ − 5·1.5 + 3]·(1.5 − 2.0) / [(1.5)³ − (2.0)³ − 5·(1.5 − 2.0)]
   = 1.5 − (−1.125)·(−0.5)/(−2.125)
   = 1.5 + 0.264706 = 1.76471 (6S).
The next iteration uses x2 = 1.5 and x3 = 1.76471 to get x4 = 1.87360 (6S). Convergence occurs at step
n = 8, giving x8 = 1.83424, which is one of the roots of (P). The following table shows all the steps.
Note that only after we computed x8 and found it equal (6S) to x7 did we conclude convergence.
Sec. 19.2 Prob. 27.
Table A. Secant method with 6S accuracy
n    x_n
2    1.5
3    1.76471
4    1.87360
5    1.83121
6    1.83412
7    1.83424
8    1.83424
For 12S values, convergence occurs at n = 10.
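Table A can be reproduced by a short routine (an illustrative sketch, using the general form (10) rather than the cancelled formula):

```python
def secant(f, x0, x1, tol=1e-9, max_iter=50):
    """Secant method (10): replace f' in Newton's method by a difference quotient."""
    for _ in range(max_iter):
        f0, f1 = f(x0), f(x1)
        x2 = x1 - f1 * (x1 - x0) / (f1 - f0)
        x0, x1 = x1, x2
        if abs(x1 - x0) < tol:
            return x1
    raise RuntimeError("secant method did not converge")

root = secant(lambda x: x**3 - 5*x + 3, 1.0, 2.0)
# the iterates begin 1.5, 1.76471, 1.87360, ... as in Table A
```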
In interpolation we are given data

(A)   (x0, f0), (x1, f1), ..., (xn, fn),

that is, values of a function f(x). The function may be a “mathematical” function, such as a Bessel function, or a “measured” function, say,
air resistance of an airplane at different speeds. In interpolation, we want to find approximate values of
f.x/ for new x that lie between those given in (A). The idea in interpolation (p. 808) is to find a polynomial
p_n(x) of degree n or less—the so-called “interpolation polynomial”—that goes through the values in (A),
that is,

(1)   p_n(x0) = f0,   p_n(x1) = f1,   ...,   p_n(xn) = fn.

We call p_n(x) a polynomial approximation of f(x) and use it to get those new f(x)'s mentioned before. When
they lie within the interval [x0, xn], then we call this interpolation and, if they lie outside the interval,
extrapolation.
Lagrange interpolation. The interpolation polynomial p_n satisfying (1) for given data exists and is
unique (see p. 809), but it may be expressed in different forms. The first type of
interpolation is the Lagrange interpolation, discussed on pp. 809–812. Take a careful look at the linear
case (2), p. 809, which is illustrated in Fig. 431. Example 1 on the next page applies linear Lagrange
interpolation to the natural logarithm to 3D accuracy. If you understand this example well, then the rest of
the material follows the same idea, except for details and more involved (but standard) notation. Example
2, pp. 810–811, does the same calculations for quadratic Lagrange interpolation [formulas (3a), (3b), p.
810] and obtains 4D accuracy. Further illustration of the (quadratic) technique applied to the sine function
and error function is shown in Probs. 7 and 9, respectively. This all can be generalized by (4a), (4b) on p.
811. Various error estimates are discussed on pp. 811–812. Example 3(B) illustrates the basic error
principle from Sec. 19.1 on p. 796.
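A generic sketch of Lagrange interpolation in code (the helper is our own; the data are the ln x values used in Example 2 and Prob. 15, so the interpolated value approximates ln 9.2):

```python
def lagrange(xs, fs, x):
    """Evaluate the Lagrange interpolation polynomial through (xs[i], fs[i]) at x."""
    p = 0.0
    for i, (xi, fi) in enumerate(zip(xs, fs)):
        L = 1.0                      # Lagrange basis polynomial L_i(x)
        for j, xj in enumerate(xs):
            if j != i:
                L *= (x - xj) / (xi - xj)
        p += fi * L
    return p

p = lagrange([9.0, 9.5, 11.0], [2.1972, 2.2513, 2.3979], 9.2)   # approx ln 9.2
```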
Newton’s form of interpolation. We owe the greatest contribution to polynomial interpolation to Sir
Isaac Newton (on his life cf. footnote 3, p. 15, of the textbook), whose forms of interpolation have three
advantages over those of Lagrange:
1. If we want a higher degree of accuracy, then, in Newton’s form, we can use all previous work and just
add another term. This flexibility is not possible with Lagrange’s form of interpolation.
3. Finally, it is easier to use the basic error principle from Sec. 19.1 for Newton’s forms of interpolation.
The first interpolation of Newton is Newton’s divided difference interpolation (10), p. 814, with the kth
divided difference defined recursively by (8), p. 813. The corresponding algorithm is given in Table 19.2, p.
814, and the method is illustrated by Example 4, p. 815, and by Probs. 13 and 15. The computation requires that we
set up a divided difference table, as shown on the top of p. 815. To understand this table, it may be useful
to write out the formulas for the terms, using (7), (8), and the unnumbered equations between them on p.
813.
If the nodes are equally spaced apart by a distance h, then we obtain Newton’s forward difference
interpolation (14), p. 816, with the kth forward difference defined by (13), p. 816. [This corresponds to
(10) and (8) for the arbitrarily spaced case.] An error analysis is given by (16) and the method is illustrated
by Example 5, pp. 817–818.
If we run the subscripts of the nodes backward (see the second column in the table at the top of p. 819),
then we obtain Newton's backward difference interpolation (18), p. 818, illustrated in Example 6.
7. Interpolation and extrapolation. We use quadratic interpolation through three points. From (3a),
(3b), p. 810, we know that

p2(x) = L0(x) f0 + L1(x) f1 + L2(x) f2
is the quadratic polynomial needed for interpolation, which goes through the three given
points (x0, f0), (x1, f1), and (x2, f2). For our problem:
k    x_k          f_k
0    x0 = 0       f0 = sin 0 = 0
1    x1 = π/4     f1 = sin(π/4) = √2/2
2    x2 = π/2     f2 = sin(π/2) = 1
so that the desired quadratic polynomial for interpolating sin x, with nodes at x = 0, π/4, and π/2, is

(A)   p2(x) = [(x − π/4)(x − π/2)] / [(0 − π/4)(0 − π/2)] · sin 0
            + [(x − 0)(x − π/2)] / [(π/4 − 0)(π/4 − π/2)] · sin(π/4)
            + [(x − 0)(x − π/4)] / [(π/2 − 0)(π/2 − π/4)] · sin(π/2).
We use (A) to compute sin x for values of x that lie outside the interval [0, π/2] (“extrapolation”).
The error function

erf x = (2/√π) ∫_0^x e^(−t²) dt

cannot be evaluated by elementary calculus and, thus, is an example where numerical methods are
appropriate.
Our problem is similar in spirit to Prob. 7. From (3a), (3b), and the given data for the error function
erf x, we obtain the Lagrange polynomial and simplify. The approximate value p2(0.75) = 0.70929 is not
very accurate. The exact 5S value is erf 0.75 = 0.71116, so that the error is

0.71116 − 0.70929 = 0.00187.
Sec. 19.3 Prob. 9. The functions erf x and Lagrange polynomial p2(x).
See also Fig. 554 on p. A68 in App. A of the textbook.
13. Lower degree. Newton's divided difference interpolation. We need, from pp. 813–814, the divided
differences

a_{j+1} = f[x_j, x_{j+1}] = (f_{j+1} − f_j)/(x_{j+1} − x_j)

and

a_{j+2} = f[x_j, x_{j+1}, x_{j+2}] = (f[x_{j+1}, x_{j+2}] − f[x_j, x_{j+1}])/(x_{j+2} − x_j).
From the five given points (x_j, f_j) we construct a table similar to the one in Example 4, p. 815. We get

j    x_j    f_j = f(x_j)    a_{j+1} = f[x_j, x_{j+1}]        a_{j+2} = f[x_j, x_{j+1}, x_{j+2}]
0    −4     50
                            (18 − 50)/(−2 − (−4)) = −16.0
1    −2     18                                               (−8.0 + 16.0)/(0 − (−4)) = 2.0
                            (2 − 18)/(0 − (−2)) = −8.0
2     0      2                                               (0 + 8.0)/(2 − (−2)) = 2.0
                            (2 − 2)/(2 − 0) = 0
3     2      2                                               (8.0 − 0)/(4 − 0) = 2.0
                            (18 − 2)/(4 − 2) = 8.0
4     4     18
From the table and (C), with j = 0, we get the following interpolation polynomial. Note that, because
all the a_{j+2} differences are equal, we do not need to compute the remaining differences, and the
polynomial is of degree 2:

p2(x) = f0 + (x − x0) f[x0, x1] + (x − x0)(x − x1) f[x0, x1, x2]   (see the formula at the top of p. 814)
      = 50 − 16(x + 4) + 2(x + 4)(x + 2) = 2x² − 4x + 2.
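The divided-difference triangle of this problem can be generated programmatically; a sketch (helper name ours):

```python
def divided_differences(xs, fs):
    """Build the triangle of Newton divided differences, one list per order."""
    triangle = [list(fs)]
    for order in range(1, len(xs)):
        prev = triangle[-1]
        triangle.append([(prev[i + 1] - prev[i]) / (xs[i + order] - xs[i])
                         for i in range(len(prev) - 1)])
    return triangle

t = divided_differences([-4, -2, 0, 2, 4], [50, 18, 2, 2, 18])
# t[1] = [-16.0, -8.0, 0.0, 8.0]; t[2] = [2.0, 2.0, 2.0], so the data are quadratic
```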
15. Newton’s divided difference formula (10), p. 814. Using the data from Example 2, p. 810, we build
the following table:
j xj fj f xj;xjC1 f xj;xjC1;xjC2
0 9:0 2:1972
D 0:1082
1 9:5 2:2513
D 0:005235
D 0:09773
2 11:0 2:3979
Then, using the table with the values needed for (10) underscored, the desired polynomial is
20 Numeric Analysis Part E
We continue our study of interpolation started in Sec. 19.3. Since, for large n, the interpolation polynomial
p_n(x) may oscillate wildly between the nodes x0, x1, x2, ..., xn, the approach of Sec. 19.3 (one Newton
interpolation polynomial through all the nodes) may not be good enough. Indeed, this is illustrated in
Fig. 434, p. 821, for n = 10, and it was shown by the renowned numerical analyst Carl Runge that, in
general, this example exhibits numeric instability. Also look at Fig. 435, p. 821.
The new approach is to use n low-degree polynomials involving two or three nodes instead of one
high-degree polynomial connecting all the nodes! This method of spline interpolation, initiated by I. J.
Schoenberg, is widely used in applications and forms the basis for CAD (computer-aided design), for
example, in car design (Bézier curves, named after the French engineer P. Bézier of the Renault Automobile
Company; see p. 827 in Problem Set 19.4).
Here we concentrate on cubic splines as they are the most important ones in applications: they
are smooth (continuous first derivative) and also have continuous second derivatives. Theorem 1 guarantees
their existence and uniqueness. The proof and its completion (Prob. 3) suggest the approach for
determining splines. The best way to understand Sec. 19.4 is to study Example 1, p. 824. It uses (12), (13),
and (14) (equidistant nodes) on pp. 823–824. A second illustration is Prob. 13. Figure 437 of the Shrine of
the Book in Jerusalem in Example 2 (p. 825) shows the interpolation polynomial of degree 12, which
oscillates (reminiscent of Runge’s example in Fig. 434), whereas the spline follows the contour of the
building quite accurately.
3. Existence and uniqueness of cubic splines. Derivation of (7) and (8) from (6), p. 822, in the proof
of Theorem 1. Formula (6), p. 822, for the unique cubic polynomial is quite involved:

q_j(x) = f(x_j) c_j² (x − x_{j+1})² [1 + 2c_j(x − x_j)] + f(x_{j+1}) c_j² (x − x_j)² [1 − 2c_j(x − x_{j+1})]
       + k_j c_j² (x − x_j)(x − x_{j+1})² + k_{j+1} c_j² (x − x_j)²(x − x_{j+1}).
We need to differentiate (6) twice to get (7) and (8), and one might make some errors in the
paper-and-pencil derivation. The point of the problem, then, is that we can minimize our chance of
making errors by introducing suitable short notations.
For instance, for the expressions involving x, we may set

X_j = x − x_j,   X_{j+1} = x − x_{j+1},
and, for the constant quantities occurring in (6), we may choose the short notations

A = c_j² f(x_j),   B = 2c_j,   C = c_j² f(x_{j+1}),   D = c_j² k_j,   E = c_j² k_{j+1}.

Then formula (6) becomes simply

q_j(x) = A X_{j+1}²(1 + B X_j) + C X_j²(1 − B X_{j+1}) + D X_j X_{j+1}² + E X_j² X_{j+1}.
Differentiate this twice with respect to x, applying the product rule for the second derivative, that is,

(uv)'' = u''v + 2u'v' + uv'',

and noting that the first derivative of X_j is simply 1, and so is that of X_{j+1}. (Of course, one may do
the differentiations in two steps if one wants to.) We obtain

q_j''(x) = A(2 + 2B X_j + 4B X_{j+1}) + C(2 − 2B X_{j+1} − 4B X_j) + D(4X_{j+1} + 2X_j) + E(2X_{j+1} + 4X_j),

where 4 = 2 · 2, with one 2 resulting from the product rule and the other from differentiating a
square, and the vanished terms arise from factors whose second derivative is zero. Now calculate q_j''
at x = x_j. Since X_j = x − x_j, we see that X_j = 0 at x = x_j, while

X_{j+1} = x_j − x_{j+1} = −1/c_j   [see (6*), p. 822, which defines c_j].

Inserting this, as well as the expressions for A, B, ..., E, we obtain

q_j''(x_j) = f(x_j) c_j² (2 − 8) + f(x_{j+1}) c_j² (2 + 4) − 4c_j k_j − 2c_j k_{j+1}.

Collecting the factors involving c_j gives (7), that is,

q_j''(x_j) = −6c_j² f(x_j) + 6c_j² f(x_{j+1}) − 4c_j k_j − 2c_j k_{j+1}.
Similarly, at x = x_{j+1} we have X_{j+1} = 0 and

X_j = x_{j+1} − x_j = 1/c_j,

so that

q_j''(x_{j+1}) = f(x_j) c_j² (2 + 4) + f(x_{j+1}) c_j² (2 − 8) + 2c_j k_j + 4c_j k_{j+1}.

Again, cancellation of some factors c_j and simplification finally gives (8), that is,

q_j''(x_{j+1}) = 6c_j² f(x_j) − 6c_j² f(x_{j+1}) + 2c_j k_j + 4c_j k_{j+1}.
For practice and obtaining familiarity with cubic splines, you may want to work out all the details
of the derivation.
13. Determination of a spline. We proceed as in Example 1, p. 824. Arrange the given data in a table for
easier work:

j    x_j    f(x_j)    k_j
0    0       1         0
1    1       0        −2
2    2      −1         2
3    3       0        −6

Here k0 = 0 and k3 = −6 are given; k1 = −2 and k2 = 2 are determined in Step 1 from the equations (14),
p. 824 (e.g., k0 + 4k1 + k2 = 3(f2 − f0) = −6).
Since there are four nodes, the spline will consist of three polynomials, q 0.x/; q1.x/; and q2.x/. The
polynomial q0.x/ gives the spline for x from 0 to 1, q 1.x/ gives the spline for x from 1 to 2; and q 2.x/
gives the spline for x from 2 to 3, respectively.
Step 2 for q0(x). Determine the coefficients of the spline from (13), p. 823. In general, j = 0, ..., n − 1,
so that, in the present case, we have j = 0 (this will give the spline from 0 to 1), j = 1 (which will give
the spline from 1 to 2), and j = 2 (which will give the spline from 2 to 3). Take j = 0. Then, with h = 1,

a00 = q0(x0) = f0 = 1,
a01 = q0'(x0) = k0 = 0,
a02 = (1/2) q0''(x0) = 3(f1 − f0) − (k1 + 2k0) = 3(0 − 1) − (−2 + 0) = −1,
a03 = (1/6) q0'''(x0) = 2(f0 − f1) + (k1 + k0) = 2(1 − 0) + (−2 + 0) = 0.

With these Taylor coefficients we obtain, from (12), p. 823, the first part of the spline in the form

q0(x) = 1 − x².
Step 2 for q1(x). Take j = 1. Then

a10 = q1(x1) = f1 = 0,
a11 = q1'(x1) = k1 = −2,
a12 = (1/2) q1''(x1) = 3(f2 − f1) − (k2 + 2k1) = 3(−1 − 0) − (2 − 4) = −1,
a13 = (1/6) q1'''(x1) = 2(f1 − f2) + (k2 + k1) = 2(0 + 1) + (2 − 2) = 2.

With these coefficients and x1 = 1 we obtain, from (12), p. 823, with j = 1, the polynomial

q1(x) = −2(x − 1) − (x − 1)² + 2(x − 1)³ = −1 + 6x − 7x² + 2x³.
Step 2 for q2(x). Take j = 2. Then

a20 = q2(x2) = f2 = −1,
a21 = q2'(x2) = k2 = 2,
a22 = (1/2) q2''(x2) = 3(f3 − f2) − (k3 + 2k2) = 3(0 + 1) − (−6 + 4) = 5,
a23 = (1/6) q2'''(x2) = 2(f2 − f3) + (k3 + k2) = 2(−1 − 0) + (−6 + 2) = −6.

With these coefficients and x2 = 2 we obtain, from (12), p. 823, with j = 2, the polynomial

q2(x) = −1 + 2(x − 2) + 5(x − 2)² − 6(x − 2)³.
To check the answer, you should verify that the spline gives the function values f(x_j) and the values k_j of
the derivatives in the table at the beginning. Also make sure that the first and second derivatives of the
spline at x = 1 are continuous, by verifying that

q0'(1) = q1'(1) = −2   and   q0''(1) = q1''(1) = −2.
We see that in the graph the curve q0 is represented by the dashed line, q1 by the dotted line,
and q2 by the dot-dash line.
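The three pieces can be checked numerically. This sketch uses the polynomials worked out in this problem (with the minus signs that the printing dropped restored); the derivative helper is our own.

```python
q0 = lambda x: 1 - x**2                                       # spline on [0, 1]
q1 = lambda x: -1 + 6*x - 7*x**2 + 2*x**3                     # spline on [1, 2]
q2 = lambda x: -1 + 2*(x - 2) + 5*(x - 2)**2 - 6*(x - 2)**3   # spline on [2, 3]

def d(f, x, h=1e-6):
    """Central-difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

# interpolation values at the nodes
assert (q0(0), q0(1), q1(2), q2(3)) == (1, 0, -1, 0)
# first derivatives match at the joints (and equal k1 = -2, k2 = 2)
assert abs(d(q0, 1) - d(q1, 1)) < 1e-6 and abs(d(q1, 2) - d(q2, 2)) < 1e-6
```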
The essential idea of numeric integration is to approximate the integral by a sum that can be easily
evaluated. There are different ways to do this approximation and the best way to understand them is to
look at the diagrams.
The simplest numeric integration is the rectangular rule, where we approximate the area under the
curve by rectangles of given (often equal) width, each of height given by a constant value of the integrand
(usually the value at an endpoint or the midpoint) over that width, as shown in Fig. 441 on p. 828. This
gives us formula (1) and is illustrated in Prob. 1.
We usually get more accuracy if we replace the rectangles by trapezoids, as in Fig. 442, p. 828, and we
obtain the trapezoidal rule (2) as illustrated in Example 1, p. 829, and Prob. 5. We discuss various error
estimates of the trapezoidal rule (see pp. 829–831) in equations (3), (4), and (5) and apply them in
Example 2 and Prob. 5.
Most important in this section is Simpson's rule on p. 832:

(7)   ∫_a^b f(x) dx ≈ (h/3)(f0 + 4f1 + 2f2 + 4f3 + ··· + 2f_{2m−2} + 4f_{2m−1} + f_{2m}),

where h = (b − a)/(2m) and f_j = f(a + jh). Simpson's rule is, for many purposes, the most
important practical method. The discussion on numeric integration ends with Gauss integration (11), p.
837, with Table 19.7 listing nodes and coefficients for n D 2;3;4;5 (see Examples 7 and 8, pp. 837–838,
Prob. 25).
Whereas integration is a process of “smoothing,” numeric differentiation “makes things rough” (tends
to enlarge errors) and should be avoided as much as possible by changing models—but we shall need it in
Chap. 21 on the numeric solution of partial differential equations (PDEs).
Problem Set 19.5. Page 839
1. Rectangular rule (1), p. 828. This rule is generally too inaccurate in practice. Our task is to evaluate
the integral of Example 1, p. 829,

J = ∫_0^1 e^(−x²) dx,

by means of the rectangular rule (1) with intervals of size 0.1. The integral cannot be evaluated by
elementary calculus, but leads to the error function erf x, defined by (35), p. A67, in Sec. A3.1 of
App. 3 of the textbook.
Since, in (1), we take the midpoints 0.05, 0.15, ..., 0.95, we calculate:

j     x_j     x_j²      f(x_j) = exp(−x_j²)
1     0.05    0.0025    0.997503
2     0.15    0.0225    0.977751
3     0.25    0.0625    0.939413
4     0.35    0.1225    0.884706
5     0.45    0.2025    0.816686
6     0.55    0.3025    0.738968
7     0.65    0.4225    0.655406
8     0.75    0.5625    0.569783
9     0.85    0.7225    0.485537
10    0.95    0.9025    0.405555

Sum: Σ_{j=1}^{10} f(x_j) = 7.471308.
Since the upper limit of integration is b = 1, the lower limit a = 0, and the number of subintervals is
n = 10, we get

h = (b − a)/n = (1 − 0)/10 = 0.1.

Hence, by (1), p. 828,

Rectangular rule:   J = ∫_0^1 e^(−x²) dx ≈ h · Σ_{j=1}^{10} f(x_j) = 0.1 · 7.471308 = 0.7471308 = 0.747131 (6S).
We compare our result with the one obtained in Example 1, p. 829, by the trapezoidal rule (2) on
that page. This shows that the trapezoidal rule gave a more accurate answer, as was expected.
Here are some questions worth pondering, related to the rectangular rule in our calculations.
When using the rectangular rule, the approximate value was larger than the true value. Why?
(Answer: The curve of the integrand is concave.)
What would you get if you took the left endpoint of each subinterval? (Answer: An upper bound for
the value of the integral.)
If you took the right endpoint? (Answer: A lower bound.)
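The midpoint computation above takes only a few lines of code (a sketch of ours):

```python
import math

f = lambda x: math.exp(-x * x)
h = 0.1
J = h * sum(f(0.05 + j * h) for j in range(10))   # midpoints 0.05, 0.15, ..., 0.95
# J agrees with the tabulated value 0.747131 (6S)
```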
5. Trapezoidal rule: Error estimation by halving. The question asks us to evaluate the integral

J = ∫_0^1 sin(πx/2) dx

by the trapezoidal rule (2), p. 829, with h = 1, 0.5, 0.25 and to estimate its error for h = 0.5 and
h = 0.25 by halving, defined by (5), p. 830.
Step 1. Obtain the true value of J. The purpose of such problems (which can readily be solved by
calculus) is to demonstrate a numeric method and its quality—by allowing us to calculate errors (6),
p. 794, and error estimates [here (5), p. 830]. We solve the indefinite integral by the substitution

u = πx/2,   du/dx = π/2,   dx = (2/π) du,

and get

(A)   J = ∫_0^1 sin(πx/2) dx = [−(2/π) cos(πx/2)]_0^1 = −(2/π)(cos(π/2) − cos 0) = −(2/π)(0 − 1)
        = 2/π = 0.63662.
2
Step 2a. Evaluate the integral by the trapezoidal rule (2), p. 821, with h D 1: In the trapezoidal rule (2) we
subdivide the interval of integration a x b into n subintervals of equal length h, so that
b a h D
:n
We also approximate f by a broken line of segments as in Fig. 442, p. 828, and obtain
Zb 1
(2) Jh D f.x/dx D h f.a/ C f.x1/ C f.x2/ C C f.x n 1/ Cf.b/ : a 2
From (A), we know that the limits of integration are a = 0, b = 1. With h = 1 we get

n = (b − a)/h = (1 − 0)/1 = 1 interval, that is, the interval [a, b] = [0, 1].

Hence

(B)   J_1.0 = h [ (1/2)f(a) + (1/2)f(b) ] = 1.0 [ (1/2) sin 0 + (1/2) sin(π/2) ] = 1.0 [ 0 + (1/2)·1 ]
            = 1/2 = 0.50000.
Step 2b. Evaluate the integral by the trapezoidal rule (2) with h = 0.5. We get

n = (b − a)/h = (1 − 0)/0.5 = 2 intervals.

The whole interval extends from 0 to 1, so that the two equally spaced subintervals are [0, 1/2] and
[1/2, 1]. Hence

(C)   J_0.5 = h [ (1/2)f(0) + f(1/2) + (1/2)f(1) ]
            = 0.5 [ (1/2) sin 0 + sin(π/4) + (1/2) sin(π/2) ]
            = 0.5 [ 0 + √2/2 + 1/2 ]   [using sin(π/4) = √2/2]
            = 0.60355,

with an error of 0.63662 − 0.60355 = 0.03307.
Step 2c. Evaluate the integral by the trapezoidal rule (2) with h = 0.25. We get

n = (b − a)/h = (1 − 0)/0.25 = 4 intervals,

J_0.25 = 0.25 [ (1/2) sin 0 + sin(π/8) + sin(π/4) + sin(3π/8) + (1/2) sin(π/2) ]
       = 0.25 (0 + 0.38268 + 0.70711 + 0.92388 + 0.50000) = 0.62842.
Step 3a. Estimate the error by halving, that is, calculate ε_0.5. Here we obtain, using (5),

ε_0.5 ≈ (1/3)(J_0.5 − J_1.0) = (1/3)(0.60355 − 0.50000) = 0.03452.

The agreement of this estimate 0.03452 with the actual value of the error 0.03307 is good.
Step 3b. Estimate the error by halving, that is, calculate ε_0.25. We get, using (5),

ε_0.25 ≈ (1/3)(J_0.25 − J_0.5) = (1/3)(0.62842 − 0.60355) = 0.00829,

which compares very well with the actual error, that is, 0.63662 − 0.62842 = 0.00820.
Although, in other cases, the difference between estimate and actual value may be larger, estimation
will still serve its purpose, namely, to give an impression of the order of magnitude of the error.
Remark. Note that since we calculated the integral by (2), p. 829, for three choices of h D 1, 0:5, 0:25
in Steps 2a–2c, we were able to make two error estimates (5), p. 830, in steps 3a, 3b.
Sec. 19.5 Prob. 5. Given sine curve and approximating polygons in the three trapezoidal rules used.
The agreement of these estimates with the actual value of the errors is very good
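Steps 2a–3b can be reproduced with a small routine (an illustrative sketch; names are ours):

```python
import math

def trapezoid(f, a, b, n):
    """Composite trapezoidal rule (2) with n subintervals."""
    h = (b - a) / n
    inner = sum(f(a + j * h) for j in range(1, n))
    return h * (0.5 * f(a) + inner + 0.5 * f(b))

f = lambda x: math.sin(math.pi * x / 2)
J10, J05, J025 = (trapezoid(f, 0.0, 1.0, n) for n in (1, 2, 4))
est = (J025 - J05) / 3        # error estimate for J_0.25 by halving, formula (5)
```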
17. Simpson’s rule for a nonelementary integral. Simpson’s rule (7), p. 832, is
Zb h
(7) f.x/dx .f0 C 4f1 C 2f2 C 4f3 C C 2f2m 2 C 4f2m 1 C f2m/; a 3
where
0 x
Being nonelementary means that we cannot solve the integral by calculus. For x D 1, its exact value
(by your CAS or Table A4 on p. A98 in App. 5) is
Z1
sin x
Si.1/ D dx D 0:9460831:
0 x
We construct a table with both 2m = 2 and 2m = 4, with values of the integrand f_j = (sin x_j)/x_j
accurate to seven digits:

2m = 2:                        2m = 4:
j    x_j    f_j                j    x_j     f_j
0    0      1.0000000          0    0       1.0000000
1    0.5    0.9588511          1    0.25    0.9896158
2    1.0    0.8414710          2    0.5     0.9588511
                               3    0.75    0.9088517
                               4    1.0     0.8414710
Simpson’s rule, with m D 1; i.e., h D 0:5, is
h 0:5
Si.1/ D .f0 C 4f1 C f2/ D .1 C 4 0:9588511 C 0:8414710/ D 0:9461459:
3 3
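A sketch of Simpson's rule (7) in code (our own helper), applied to the sine integral:

```python
import math

def simpson(f, a, b, m):
    """Simpson's rule (7) with 2m subintervals of length h = (b - a)/(2m)."""
    h = (b - a) / (2 * m)
    s = f(a) + f(b)
    s += 4 * sum(f(a + (2 * j - 1) * h) for j in range(1, m + 1))   # odd-index nodes
    s += 2 * sum(f(a + 2 * j * h) for j in range(1, m))             # even interior nodes
    return h / 3 * s

sinc = lambda x: math.sin(x) / x if x else 1.0
si1_2 = simpson(sinc, 0.0, 1.0, 1)   # 2m = 2
si1_4 = simpson(sinc, 0.0, 1.0, 2)   # 2m = 4; closer to Si(1) = 0.9460831
```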
25. Gauss integration for the error function. n = 5 is required. The transformation x = (t + 1)/2 must convert the interval [0, 1] to [−1, 1]:

∫_0^1 e^(−x²) dx = (1/2) ∫_−1^1 e^(−(t+1)²/4) dt.

Gauss integration gives 0.746824127. Note the high accuracy achieved with a rather modest amount of work. Multiply this by 2/√π to obtain an approximation to the error function erf 1 (= 0.842700793):

(2/√π) · 0.746824127 = 0.842700786.
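A sketch of the computation with NumPy's Gauss–Legendre nodes (the node count 5 is our assumption; it reproduces the quoted accuracy):

```python
import math
import numpy as np

# Nodes t_i and weights w_i of 5-point Gauss-Legendre quadrature on [-1, 1]
t, w = np.polynomial.legendre.leggauss(5)

# (1/2) * sum w_i * exp(-((t_i + 1)/2)^2)  approximates  the integral on [0, 1]
integral = 0.5 * np.sum(w * np.exp(-0.25 * (t + 1.0) ** 2))
erf1 = 2.0 / math.sqrt(math.pi) * integral

print(integral)  # about 0.746824
print(erf1)      # about 0.842701, vs erf(1) = 0.842700793
```

The transformation to [−1, 1] is exactly the substitution x = (t + 1)/2 used above.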
Solution to Self Test on Rounding Problem to Decimals (see p. 2 of this Solutions Manual and Study Guide)
Then we chop off the eighth digit “6” and obtain the rounded number to seven decimals (7D) 1.2345462.
398.723555 + 0.00005 (four zeros before the 5) = 398.723605.
Next we chop off from the fifth digit onward, that is, “05” and obtain the rounded number to four
decimals (4D) 398:7236:
Solution to Self Test on Rounding Problem to Significant Digits (see p. 3 of this Solutions Manual and Study
Guide)
2. We ignore the factor 10³. Then we apply the roundoff rule for decimals to the number 0.10289565 to
get
3. Finally we have to reintroduce the factor 10³ to obtain our final answer, that is,
Gauss elimination with back substitution is a systematic way of solving systems of linear equations (1), p.
845. We discussed this method before in Sec. 7.3 (pp. 272–282) in the context of linear algebra. This time
the context is numerics and the current discussion is kept independent of Chap. 7, except for an occasional
reference to that chapter. Pay close attention to the partial pivoting introduced here, as it is the main
difference between the Gauss elimination presented in Sec. 20.1 and that of Sec. 7.3. The reason that we
need pivoting in numerics is that we have only a finite number of digits available. With many systems, this
can result in a severe loss of accuracy. Here (p. 846), to pivot a_kk, we choose as our pivoting equation the
one that has the absolutely largest coefficient a_jk in column k on or below the main diagonal. The details
are explained carefully in a completely worked out Example 1, pp. 846–847. The importance of this
particular partial pivoting strategy is demonstrated in Example 3, pp. 848–849. In (a) the “absolutely
largest” partial pivoting strategy is not followed and leads to a bad value for x1. This corresponds to the
method of Sec. 7.3. In (b) it is followed and a good value for x1 is obtained!
2 Numeric Analysis Part E
Table 20.1, p. 849, presents Gauss elimination with back substitution in algorithmic form. The section
ends with an operation count of 2n³/3 for Gauss elimination (p. 850) and n² + n for back substitution (p.
851). Operation count is one way to judge the quality of a numeric method.
The solved problems show that a system of linear equations may have no solution (Prob. 3), a unique
solution (Prob. 9), or infinitely many solutions (Prob. 11). This was also explained in detail on pp. 277–280
in Sec. 7.3. You may want to solve a few problems by hand until you feel reasonably comfortable with the
Gaussian algorithm and the particular type of pivoting.
3. System without a solution. The given system is

[Eq. (1)]   7.2x1 − 3.5x2 = 16.0,      [Eq. (2)]   −14.4x1 + 7.0x2 = 31.0.

Multiplying Eq. (1) by 2 gives 14.4x1 − 7.0x2 = 32.0. If we add this equation to the second equation [Eq. (2)] of the given system, we get

0 · x1 + 0 · x2 = 63.0.

This last equation has no solution because x1 and x2 are each multiplied by 0, added, and equated to 63.0! Or, looking at it in another way, we get the false statement that 0 = 63. [A solution would exist if the right sides of Eq. (1) and Eq. (2) were related in the same fashion, for instance, 16.0 and −32.0 instead of 31.0.] Of course, for most systems with more than two equations, one cannot immediately see whether there will be solutions, but Gauss elimination with partial pivoting will work in each case, giving the solution(s) or indicating that there is none. Geometrically, the result means that these equations represent two lines with the same slope, 7.2/3.5 ≈ 2.06. Hence Eq. (1) and Eq. (2) are parallel lines, as shown in the figure on the next page.
9. System with a unique solution. Pivoting. ALGORITHM GAUSS, p. 849. Open your textbook to
p. 849 and consider Table 20.1, which contains the algorithm for the Gauss elimination. To follow the
discussion, trace it for Prob. 9 in terms of matrices with paper and pencil. In each case, write down
Chap. 20 Numeric Linear Algebra 3
all three rows of a matrix, not just one or two rows, as is done below to save some space and to avoid
copying the same numbers several times.
At the beginning, k D 1. Since a11 D 0, we must pivot. Between lines 1 and 2 in Table 20.1 we search
for the absolutely greatest aj1: This is a31.D 13/. According to the algorithm, we have to interchange
Eqs. (1) (current row) and (3) (row with the maximum), that is, Rows 1 and 3 of the augmented matrix.
This gives
Sec. 20.1 Prob. 3. The two parallel lines represented by Eq. (1) and Eq. (2) in the x1x2-plane
        [ 13   −8    0 |  178.54 ]
(A)     [  6    0   −8 |  −85.88 ]
        [  0    6   13 |  137.86 ]
Don’t forget to interchange the entries on the right side (that is, in the last column of the augmented
matrix).
To get 0 as the first entry of Row 2, subtract m21 = a21/a11 = 6/13 = 0.461538 times Row 1 from Row 2. The new Row 2 is

(A2)   [ 0   3.692308   −8 | −168.28308 ].

For Row 3 the multiplier is

m31 = a31/a11 = 0/13 = 0.
Hence, the operations in line 4 simply have no effect; they merely reproduce Row 3 of the matrix in (A). This
completes k D 1.
Next is k = 2. In the loop between lines 1 and 2 in Table 20.1, we have the following: Since 6 > 3.692308, the maximum is in Row 3, so we interchange Row 2 (A2) and Row 3 in (A). This gives the matrix

(B)     [ 13   −8         0 |  178.54    ]
        [  0    6        13 |  137.86    ]
        [  0    3.692308  −8 | −168.28308 ]
In line 4 of the table with k = 2 and j = k + 1 = 3 we calculate

m32 = a32/a22 = 3.692308/6 = 0.615385.

Performing the operations in line 5 of the table for p = 3, 4, we obtain the new Row 3

(B3)   [ 0   0   −16 | −253.12 ].
The system and its matrix have now reached triangular form. We begin back substitution with line 6 of the table:

x3 = a34/a33 = −253.12/(−16) = 15.82.

(Remember that, in the table, the right sides b1, b2, b3 are denoted by a14, a24, a34, respectively.) Line 7 of the table with i = 2, 1 gives

x2 = (137.86 − 13x3)/6 = −11.3    and    x1 = (178.54 + 8x2)/13 = 6.78.
Note that, depending on the number of digits you use in your calculation, your values may be
slightly affected by roundoff.
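The whole elimination can be checked by a small program. The sketch below implements partial pivoting as in Table 20.1; the Prob. 9 data are entered as reconstructed above, so treat them as a hypothetical reading of the original:

```python
import numpy as np

def gauss_solve(A, b):
    """Gauss elimination with partial pivoting and back substitution,
    in the spirit of ALGORITHM GAUSS in Table 20.1."""
    a = np.hstack([np.asarray(A, float), np.asarray(b, float).reshape(-1, 1)])
    n = len(a)
    for k in range(n - 1):
        # partial pivoting: absolutely largest a_jk on or below the diagonal
        j = k + np.argmax(np.abs(a[k:, k]))
        if j != k:
            a[[k, j]] = a[[j, k]]
        for i in range(k + 1, n):
            m = a[i, k] / a[k, k]          # multiplier m_ik
            a[i, k:] -= m * a[k, k:]
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):         # back substitution
        x[i] = (a[i, -1] - a[i, i + 1:n] @ x[i + 1:]) / a[i, i]
    return x

# Prob. 9 system in its original order (a11 = 0 forces the first pivot swap)
A = [[0, 6, 13], [6, 0, -8], [13, -8, 0]]
b = [137.86, -85.88, 178.54]
x = gauss_solve(A, b)
print(x)
```

The program reproduces the solution x1 = 6.78, x2 = −11.3, x3 = 15.82 found above.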
11. System with more than one solution. Homogeneous system. A homogeneous system always has the
trivial solution x1 = 0, x2 = 0, ..., xn = 0. Say the coefficient matrix of the homogeneous system has rank
r. The homogeneous system has a nontrivial solution if and only if r < n.
The details are given in Theorem 2, p. 290 in Sec. 7.5, and related Theorem 3, p. 291.
In the present problem, we have a homogeneous system with n = 3 equations. For such a system, we
may have r D 3 (the trivial solution only), r D 2 [one (suitable) unknown remains arbitrary— infinitely
many solutions], and r D 1 [two (suitable) variables remain arbitrary, infinitely many solutions]. Note
that r D 0 is impossible unless the matrices are zero matrices. In most cases we have choices as to
which of the variables we want to leave arbitrary; the present result will show this. To avoid
misunderstandings: we need not determine those ranks, but the Gauss elimination will automatically
give all solutions. Your CAS may give only some solutions (for example, those obtained by equating
arbitrary unknowns to zero); so be careful.
We end up with a “triangular” system of the form (after interchanging rows 2 and 3)

(S2)   3.4x1 − 6.12x2 = 0,
(S1)   4.32x3 = 0,
        0 = 0.

Note that the last equation contains no information. From (S1), we get x3 = 0.
Since the system reduced to two equations (S1) and (S2) in three unknowns, we have the choice of
one parameter t.
If we set

(S3)   x1 = t   (arbitrary),

then (S2) becomes
(S2*)   3.4t − 6.12x2 = 0,

so that

(S2**)   x2 = (3.4/6.12)t = 0.556t.
Then the solution consists of equations (S3), (S2**), and (S1). This corresponds to the solution on p.
A48 in App. 2 of the textbook.
If we set
(S4)   x2 = t̃   (arbitrary; we call it t̃ instead of t to show its independence from t); then

(S5)   x1 = (6.12/3.4) t̃ = 1.8 t̃,
and the solution consists of (S5), (S4), and (S1). The two solutions are equivalent.
The inspiration for this section is the observation that an n × n invertible matrix can be written in the form

(2)   A = LU,

where L is lower triangular and U is upper triangular. Setting y = Ux, the system Ax = b becomes

Ax = LUx = L(Ux) = Ly = b,   so that   (3a) Ly = b   and   (3b) Ux = y.
This means we can solve first (3a) for y and then (3b) for x: Both systems (3a), (3b) are triangular, so we can
solve them as in the back substitution for the Gauss elimination. Indeed, this is our approach with
Doolittle’s method on p. 854. The example is the same as Example 1, on p. 846 in Sec. 20.1. However,
Doolittle requires only about half as many operations as Gauss elimination.
If we assign 1;1;:::;1 to the main diagonal of the matrix U (instead of L) we get Crout’s method.
A third method based on (2) is Cholesky’s method, where the n × n matrix A is symmetric, positive definite. This means

(symmetric)          A = Aᵀ,

and

(positive definite)   xᵀAx > 0   for all x ≠ 0.
Under Cholesky’s method, we get formulas (6), p. 855, for factorization. The method is illustrated by
Example 2, pp. 855–856, and Prob. 7. Cholesky’s method is attractive because it is numerically stable
(Theorem 1, p. 856).
Matrix inversion by the Gauss–Jordan elimination method is discussed on pp. 856–857 and shown in
Prob. 17.
More Details on Example 1. Doolittle’s Method, pp. 853–854. In the calculation of the entries of L and U (or
Lᵀ in Cholesky’s method) in the factorization A = LU with given A, we employ the usual matrix
multiplication.
In all three methods in this section, the point is that the calculation can proceed in an order such that we
solve only one equation at a time. This is possible because we are dealing with triangular matrices, so that
the sums of n D 3 products often reduce to sums of two products or even to a single product, as we will see.
This will be a discussion of the steps of the calculation, on p. 853, in terms of the matrix equation A D LU,
written out
[ 3  5  2 ]   [ 1    0    0 ] [ u11  u12  u13 ]
[ 0  8  2 ] = [ m21  1    0 ] [  0   u22  u23 ]
[ 6  2  8 ]   [ m31  m32  1 ] [  0    0   u33 ]
Remember that, in Doolittle’s method, the main diagonal of L is 1, 1, 1. Also, the notation mjk suggests multiplier, because, in Doolittle’s method, the matrix L is the matrix of the multipliers in the Gauss elimination. Begin with Row 1 of A. The entry a11 = 3 is the dot product of the first row of L and the first column of U; thus,

3 = [1  0  0] [u11  0  0]ᵀ = 1 · u11,

where the 1 is prescribed. Thus, u11 = 3. Similarly, a12 = 5 = 1 · u12 + 0 · u22 + 0 · 0 = u12; thus u12 = 5. Finally, a13 = 2 = u13. This
takes care of the first row of A. In connection with the second row of A we have to consider the second row
of L, which involves m21 and 1. We obtain
a21 = 0 = m21 · u11 + 1 · 0 + 0 · 0 = 3m21,   hence m21 = 0;
a22 = 8 = m21 · u12 + 1 · u22 + 0 · 0 = u22,   hence u22 = 8;
a23 = 2 = m21 · u13 + 1 · u23 + 0 · 0 = u23,   hence u23 = 2.

In connection with the third row of A we have to consider the third row of L, consisting of m31, m32, 1. We obtain

a31 = 6 = m31 · u11 = 3m31,   hence m31 = 2;
a32 = 2 = m31 · u12 + m32 · u22 = 2 · 5 + 8m32,   hence m32 = −1;
a33 = 8 = m31 · u13 + m32 · u23 + 1 · u33 = 4 − 2 + u33,   hence u33 = 6.
In (4), on p. 854, the first line concerns the first row of A and the second line concerns the first column of A;
hence in that respect the order of calculation is slightly different from that in Example 1.
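A minimal Doolittle factorization in Python, applied to the matrix of Example 1 (the function name and loop organization are our own):

```python
import numpy as np

def doolittle(A):
    """Doolittle LU factorization: A = LU with unit diagonal in L."""
    A = np.asarray(A, float)
    n = len(A)
    L = np.eye(n)
    U = np.zeros((n, n))
    for i in range(n):
        for k in range(i, n):          # row i of U
            U[i, k] = A[i, k] - L[i, :i] @ U[:i, k]
        for j in range(i + 1, n):      # column i of L (the multipliers m_ji)
            L[j, i] = (A[j, i] - L[j, :i] @ U[:i, i]) / U[i, i]
    return L, U

A = [[3, 5, 2], [0, 8, 2], [6, 2, 8]]  # matrix of Example 1, Sec. 20.1
L, U = doolittle(A)
print(L)   # multipliers m21 = 0, m31 = 2, m32 = -1
print(U)   # u11 = 3, u22 = 8, u33 = 6
```

Each entry is obtained by solving exactly one equation at a time, just as in the hand calculation above.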
7. Cholesky’s method. The coefficient matrix A of the given system of linear equations is given by
A = [  9   6  12 ]
    [  6  13  11 ]     (as explained in Sec. 7.3, pp. 272–273).
    [ 12  11  26 ]
We clearly see that the given matrix A is symmetric, since the entries off the main diagonal are mirror
images of each other (see definition of symmetric on p. 335 in Sec. 8.3). The Cholesky factorization of
A (see top of p. 856 in Example 1) is
[  9   6  12 ]   [ l11   0    0  ] [ l11  l21  l31 ]
[  6  13  11 ] = [ l21  l22   0  ] [  0   l22  l32 ]
[ 12  11  26 ]   [ l31  l32  l33 ] [  0    0   l33 ]
We do not have to check whether A is also positive definite because, if it is not, all that would happen
is that we would obtain a complex triangular matrix L and would then probably choose another
method. We continue.
Going through A row by row and applying matrix multiplication (row times column) just as before, we calculate:

a11 = 9 = l11²,   hence l11 = 3;
a12 = 6 = l11 · l21,   hence l21 = 2;
a13 = 12 = l11 · l31,   hence l31 = 4;
a22 = 13 = l21² + l22²,   hence l22 = √(13 − 4) = 3;
a23 = 11 = l21 · l31 + l22 · l32 = 8 + 3 l32,   hence l32 = 1;
a33 = 26 = l31² + l32² + l33² = 16 + 1 + l33²,   hence l33 = 3.
Now solve Ax = b, where b = [17.4  23.6  30.8]ᵀ. We first use L and solve Ly = b, where y = [y1  y2  y3]ᵀ. Since L is triangular, we only do a substitution as in the Gauss algorithm; but since L is lower triangular, whereas Gauss elimination produces an upper triangular matrix, we begin with the first equation
and obtain y1. Then obtain y2 and finally y3. This simple calculation is written to the right of the
corresponding equations:
[ 3  0  0 ] [y1]   [17.4]        y1 = 17.4/3 = 5.8,
[ 2  3  0 ] [y2] = [23.6]        y2 = (23.6 − 2y1)/3 = 4,
[ 4  1  3 ] [y3]   [30.8]        y3 = (30.8 − 4y1 − y2)/3 = 1.2.
In the second part of the procedure you solve Lᵀx = y for x. This is another back substitution. Since Lᵀ is
upper triangular, just as in the Gauss method after the elimination has been completed, the present back
substitution is exactly as in the Gauss method, beginning with the last equation, which gives x3, then using
the second equation to get x2, and finally the first equation to obtain x1.
Details on the back substitution are as follows:
[ 3  2  4 ] [x1]   [5.8]                      (S1)   3x1 + 2x2 + 4x3 = 5.8,
[ 0  3  1 ] [x2] = [ 4 ]    written out is    (S2)   3x2 + x3 = 4,
[ 0  0  3 ] [x3]   [1.2]                      (S3)   3x3 = 1.2.

From (S3), x3 = 1.2/3 = 0.4; then (S2) gives x2 = (4 − 0.4)/3 = 1.2. Substituting into (S1),

3x1 + 2x2 + 4x3 = 3x1 + 2 · 1.2 + 4 · 0.4 = 5.8,   so that   (S6)   x1 = (5.8 − 2.4 − 1.6)/3 = 1.8/3 = 0.6.
We check the solution by substituting it into the given linear system written as a matrix equation.
Indeed,
        [  9   6  12 ] [x1]   [  9   6  12 ] [0.6]
Ax =    [  6  13  11 ] [x2] = [  6  13  11 ] [1.2]
        [ 12  11  26 ] [x3]   [ 12  11  26 ] [0.4]

        [  9 · 0.6 + 6 · 1.2 + 12 · 0.4 ]   [ 5.4 + 7.2 + 4.8  ]   [17.4]
     =  [  6 · 0.6 + 13 · 1.2 + 11 · 0.4 ] = [ 3.6 + 15.6 + 4.4 ] = [23.6] = b,
        [ 12 · 0.6 + 11 · 1.2 + 26 · 0.4 ]   [ 7.2 + 13.2 + 10.4 ]  [30.8]
which is correct.
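The factorization and both substitutions can be confirmed with NumPy (here the two triangular systems are solved with the general solver rather than dedicated forward/back substitution routines):

```python
import numpy as np

A = np.array([[9.0, 6, 12], [6, 13, 11], [12, 11, 26]])
b = np.array([17.4, 23.6, 30.8])

L = np.linalg.cholesky(A)        # lower triangular factor, A = L L^T
print(L)                          # diagonal 3, 3, 3 as computed by hand

y = np.linalg.solve(L, b)         # Ly = b  (forward substitution)
x = np.linalg.solve(L.T, y)       # L^T x = y  (back substitution)
print(y, x)
```

This reproduces y = [5.8, 4, 1.2] and the solution x = [0.6, 1.2, 0.4].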
Discussion. We want to show that A is positive definite, that is, by the definition on p. 346 in Prob. 24 in Sec. 8.4, and also on p. 855:

xᵀAx > 0   for all x ≠ 0.
We calculate
            [  9   6  12 ] [x1]
xᵀAx = [x1  x2  x3] [  6  13  11 ] [x2]
            [ 12  11  26 ] [x3]

= (9x1 + 6x2 + 12x3) x1 + (6x1 + 13x2 + 11x3) x2 + (12x1 + 11x2 + 26x3) x3

= 9x1² + 12x1x2 + 24x1x3 + 22x2x3 + 13x2² + 26x3².
We get the quadratic form Q and want to show that (A) is true for Q:

(A)   Q = 9x1² + 12x1x2 + 24x1x3 + 22x2x3 + 13x2² + 26x3² > 0   for all (x1, x2, x3) ≠ (0, 0, 0).

Since Q cannot, by inspection, be written in the form ( · )², it is not trivial to show that (A) is true. Thus we look for
other ways to verify (A). One such way is to use a mathematical result given in Prob. 25, p. 346.
It states that positive definiteness (PD) holds if and only if all the principal minors of A are positive.
This result is also known as Sylvester’s criterion.
For the given matrix A, we have three leading principal minors. They are:

a11 = 9 > 0;

| a11  a12 |   | 9   6 |
|           | = |       | = 9 · 13 − 6 · 6 = 117 − 36 = 81 > 0;
| a21  a22 |   | 6  13 |

          | 13  11 |     | 6  11 |      | 6  13 |
det A = 9 |        | − 6 |       | + 12 |       |
          | 11  26 |     | 12 26 |      | 12 11 |

      = 9 · 217 − 6 · 24 + 12 · (−90) = 729 > 0.
Since all principal minors of A are positive, we conclude, by Sylvester’s criterion, that A is indeed
positive definite.
The moral of the story is that, for large A; showing positive definiteness is not trivial, although in
some cases it may be concluded from the kind of physical (or other) application.
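The three leading principal minors (and, as a cross-check, the eigenvalues of the symmetric matrix, which must all be positive) can be computed directly; a sketch:

```python
import numpy as np

A = np.array([[9, 6, 12], [6, 13, 11], [12, 11, 26]])

# leading principal minors D1, D2, D3 for Sylvester's criterion
minors = [np.linalg.det(A[:k, :k]) for k in (1, 2, 3)]
print(minors)                     # approximately 9, 81, 729 -- all positive

# cross-check: a symmetric matrix is positive definite iff
# all its eigenvalues are positive
print(np.linalg.eigvalsh(A))
```

Both tests confirm that A is positive definite, as argued above.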
17. Matrix inversion. Gauss–Jordan method. The method suggested in this section is illustrated in detail
in Sec. 7.8 by Example 1, on pp. 303–304, in the textbook, as well as in Prob. 1 on pp. 123–124 in
Volume I of the Student Solutions Manual. It may be useful to look at one or both examples. In
your answer, you may want to write down the matrix operations stated here in our solution to
Prob. 17 to the right of the matrix as is done in Example 1, p. 303, of the textbook.
The matrix to be inverted is

G = [  1  −4   2 ]
    [ −4  25   4 ]
    [  2   4  24 ].

We append the 3 × 3 unit matrix I on the right, obtaining the 3 × 6 matrix G1.
Thus the left 3 3 submatrix is the given matrix and the right 3 3 submatrix is the 3 3 unit matrix I. We apply
the Gauss–Jordan method to G1 to obtain the desired inverse matrix. At the end of the process, the left 3 3
submatrix will be the 3 3 unit matrix, and the right 3 3 submatrix will be the inverse of the given matrix.
The entry −4 in Row 2 of G1 is absolutely the largest value in Column 1, so we interchange Row 2 and Row 1 and
get
G2 = [ −4  25   4 | 0  1  0 ]
     [  1  −4   2 | 1  0  0 ]
     [  2   4  24 | 0  0  1 ].
Next we replace Row 2 by Row 2 + (1/4) Row 1 and replace Row 3 by Row 3 + (1/2) Row 1. This gives us the new matrix
G3 = [ −4  25     4  | 0  1     0 ]
     [  0   2.25  3  | 1  0.25  0 ]
     [  0  16.5  26  | 0  0.5   1 ].
After pivoting in the second column (interchanging Rows 2 and 3), eliminating below the pivot, dividing each row by its diagonal entry, and eliminating in the third column, we reach

G7 = [ 1  −6.25  0 | −11/6  −7/12   1/4 ]
     [ 0   1     0 |  26/9    5/9  −1/3 ]
     [ 0   0     1 | −11/6   −1/3   1/4 ].
Finally, we eliminate in the second column of G7. We do this by replacing Row 1 of G7 by Row 1 + 6.25 Row 2. The final matrix is

G8 = [ 1  0  0 | 146/9   26/9  −11/6 ]
     [ 0  1  0 |  26/9    5/9  −1/3  ]
     [ 0  0  1 | −11/6   −1/3   1/4  ].
The last three columns constitute the inverse of the given matrix, that is,

G⁻¹ = [ 146/9   26/9  −11/6 ]
      [  26/9    5/9  −1/3  ]
      [ −11/6   −1/3   1/4  ].

As a check, one verifies that GG⁻¹ = I and G⁻¹G = I.
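A compact Gauss–Jordan inversion in Python; note that the sign pattern of G used here is our reconstruction of the garbled original, so treat the matrix as an assumption. This variant eliminates above and below each pivot immediately, whereas the hand calculation first triangularizes; the result is the same.

```python
import numpy as np

def gauss_jordan_inverse(A):
    """Invert A by Gauss-Jordan: reduce [A | I] until the left half is I."""
    A = np.asarray(A, float)
    n = len(A)
    M = np.hstack([A, np.eye(n)])
    for k in range(n):
        p = k + np.argmax(np.abs(M[k:, k]))   # partial pivoting
        M[[k, p]] = M[[p, k]]
        M[k] /= M[k, k]                        # normalize the pivot row
        for i in range(n):                     # eliminate above and below
            if i != k:
                M[i] -= M[i, k] * M[k]
    return M[:, n:]

G = [[1, -4, 2], [-4, 25, 4], [2, 4, 24]]      # assumed reading of Prob. 17
Ginv = gauss_jordan_inverse(G)
print(Ginv * 36)   # 36 * G^{-1} has integer entries
```

Multiplying by 36 (= det G) shows the exact fractional entries found above.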
p. 859). Equation (3) shows how Gauss–Seidel continues with these starting values. And here comes a
crucial point that is particular to the method, that is, Gauss–Seidel always uses (where possible) the most
recent and therefore “most up to date” approximation for each unknown (“successive corrections”). This is
shown in the darker shaded blue area in (3) and explained in detail in the textbook as well as in Prob. 9.
The second method, Jacobi iteration (13), p. 862 (Prob. 17), is very similar to Gauss–Seidel but avoids
using the most recent approximation of each unknown within an iteration cycle. Instead, as is much more
common with iteration methods, all values are updated at once (“simultaneous corrections”).
For these methods to converge, we require “diagonal dominance,” that is, the largest (in absolute value)
element in each row must be on the diagonal.
Other aspects of Gauss–Seidel include a more formal discussion [precise formulas (4), (5), (6)],
ALGORITHM GAUSS–SEIDEL (see p. 860), convergence criteria (p. 861, Example 2, p. 862), and residual (12).
Pay close attention to formulas (9), (10), (11) for matrix norms (Prob. 19) on p. 861, as they will play an
important role in Sec. 20.4.
9. Gauss–Seidel iteration. We write down the augmented matrix of the given system of linear equations
(see p. 273 of Sec. 7.3 in the textbook):
A = [ 5   1    2 |  19 ]
    [ 1   4   −2 |  −2 ]
    [ 2   3    8 |  39 ].
This is a case in which we do not need to reorder the given linear equations, since we note that the
large entries 5, 4, 8 of the coefficient part of the augmented matrix stand on the main diagonal. Hence
we can expect convergence.
Remark. If, say, instead the augmented matrix had been
[ 5   1    2 |  19 ]
[ 2   3    8 |  39 ]
[ 1   4   −2 |  −2 ],
meaning that 5, 3, −2 would be the entries of the main diagonal so that 8 and 4 would be larger entries
outside the main diagonal, then we would have had to reorder the equations, that is, exchange the
second and third equations. This would have led to a system corresponding to augmented matrix A
above and expected convergence.
We continue. We divide the equations so that their main diagonal entries equal 1 and keep these
terms on the left while moving the other terms to the right of the equal sign. In detail, this means that
we multiply the first given equation of the problem by 1/5, the second one by 1/4, and the third one by 1/8.
We get

x1 + (1/5)x2 + (2/5)x3 = 19/5,
(1/4)x1 + x2 − (1/2)x3 = −1/2,
(1/4)x1 + (3/8)x2 + x3 = 39/8,

and then

         x1 = 19/5 − (1/5)x2 − (2/5)x3,
(GS)     x2 = −1/2 − (1/4)x1 + (1/2)x3,
         x3 = 39/8 − (1/4)x1 − (3/8)x2.
We start from x1^(0) = 1, x2^(0) = 1, x3^(0) = 1 (or any reasonable choice) and get

x1^(1) = 19/5 − (1/5) · 1 − (2/5) · 1 = 16/5 = 3.2   (exact),

x2^(1) = −1/2 − (1/4) x1^(1) + (1/2) x3^(0) = −0.5 − 0.8 + 0.5 = −0.8   (exact),

x3^(1) = 39/8 − (1/4) x1^(1) − (3/8) x2^(1) = 4.875 − 0.8 + 0.3 = 4.375   (exact).
Note that we always use the latest possible value in the iteration; that is, for example, in computing x2^(1) we use x1^(1) (new! and not x1^(0)) and x3^(0) (no newer value available). In computing x3^(1) we use x1^(1) (new!) and x2^(1) (new!) (see also p. 859 of the textbook).
Then we substitute x1^(1) = 3.2, x2^(1) = −0.8, x3^(1) = 4.375 into system (GS) and get the next approximations.
The results are summarized in the following table. The values were computed to 6S with two guard digits
for accuracy.
Prob. 9. Gauss–Seidel Iteration Method. Table of Iterations. Five Steps.

Step       x1           x2           x3
m = 1     3.2         −0.8          4.375
m = 2     2.21000      1.13500      3.89688
m = 3     2.01425      0.944875     4.01711
m = 4     2.00418      1.00751      3.99614
m = 5     2.00004      0.998059     4.00072
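The iteration (GS) can be reproduced with a few lines; a sketch (starting values and step count as in the table):

```python
def gauss_seidel(x1, x2, x3, steps=5):
    """Gauss-Seidel iteration for system (GS); each freshly computed value
    is used immediately ("successive corrections")."""
    history = []
    for _ in range(steps):
        x1 = 19/5 - (1/5) * x2 - (2/5) * x3
        x2 = -1/2 - (1/4) * x1 + (1/2) * x3   # uses the new x1
        x3 = 39/8 - (1/4) * x1 - (3/8) * x2   # uses the new x1 and x2
        history.append((x1, x2, x3))
    return history

for m, x in enumerate(gauss_seidel(1.0, 1.0, 1.0), start=1):
    print(m, x)
```

The printed values agree with the table, converging toward the solution (2, 1, 4).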
17. Jacobi Iteration. Convergence related to eigenvalues. An outline of the solution is as follows. You may
want to work out some more of the details. We are asked to consider the matrix of the system of
linear equations in Prob. 10 on p. 863, that is,
Ã = [ 4  0  5 ]
    [ 1  6  2 ]
    [ 8  2  1 ].
We note that ã31 = 8 is a large entry outside the main diagonal (see Remark in Prob. 9 above). To
obtain convergence, we reorder the rows as shown, that is, we exchange Row 3 with Row 1, and get,
[ 8  2  1 ]
[ 1  6  2 ]
[ 4  0  5 ].
Then we divide the rows by the diagonal entries 8, 6, and 5, respectively, as required in (13), p. 862 (see ajj = 1 at the end of the formula). (Equivalently, this means we take (1/8) Row 1, (1/6) Row 2, (1/5) Row 3):

A = [ 1     1/4   1/8 ]
    [ 1/6   1     1/3 ]
    [ 4/5   0     1   ].
As described in the problem, we now have to consider

B = I − A = [  0     −1/4   −1/8 ]
            [ −1/6    0     −1/3 ]
            [ −4/5    0      0   ].
The eigenvalues are obtained as the solutions of the characteristic equation (see pp. 326–327)

              | −λ     −1/4   −1/8 |
det(B − λI) = | −1/6   −λ     −1/3 | = −λ³ + (17/120)λ − 1/15 = 0.
              | −4/5    0     −λ   |
A sketch, as given below, shows that there is a real root near −0.5, but there are no further real roots because, for large |λ|, the curve comes closer and closer to the curve of −λ³. Hence the other eigenvalues must be complex conjugates. A root-finding method (see Sec. 19.2, pp. 801–806, also Prob. 21 in the Student Solutions Manual on p. YY) gives a more accurate value of −0.5196. Division of the characteristic equation by λ + 0.5196 gives the quadratic equation

λ² − 0.5196λ + 0.1283 = 0.
The roots are 0.2598 ± 0.2466i [by the well-known root-finding formula (4) for quadratic equations on p. 54 of the textbook or on p. 15 in Volume I of the Student Solutions Manual]. Since all three roots are less than 1 in absolute value, that is,

|0.2598 ± 0.2466i| = √(0.2598² + 0.2466²) = √0.1283 = 0.3582 < 1    and    |−0.5196| < 1,

the spectral radius is less than 1, by definition. This is necessary and sufficient for convergence (see the end of the section at the top of p. 863).
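The spectral-radius argument can be checked numerically; a sketch using NumPy's eigenvalue routine on the matrix B from above:

```python
import numpy as np

B = np.array([[0.0,  -1/4, -1/8],
              [-1/6,  0.0, -1/3],
              [-4/5,  0.0,  0.0]])

lam = np.linalg.eigvals(B)
rho = max(abs(lam))               # spectral radius of B
print(lam)                        # one real root near -0.5196, one complex pair
print(rho)                        # about 0.5196 < 1  ->  the iteration converges
```

The moduli of the eigenvalues match the hand computation: 0.5196 for the real root and 0.3582 for the complex pair.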
Sec. 20.3 Prob. 17. Sketch of det(B − λI) as a function of λ, showing the real root near −0.5
19. Matrix norms. The given matrix is

C = [ 10   1   1 ]
    [  1  10   1 ]
    [  1   1  10 ].

The Frobenius norm (9), p. 861, is

‖C‖ = ( Σj Σk cjk² )^(1/2) = √(10² + 1 + 1 + 1 + 10² + 1 + 1 + 1 + 10²) = √303 = 17.41.
The column “sum” norm is

(10)   ‖C‖ = max over k of  Σj |cjk| = 12.
Note that, to compute (10), we took the absolute value of each entry in each column and added
them up. Each column gave the value of 12. So the maximum over the three columns was 12.
Similarly, by (11), p. 861, the row “sum” norm is 12:
Together this problem illustrates that the three norms usually tend to give values of a similar order
of magnitude. Hence, one often chooses the norm that is most convenient from a computational point
of view. However, a matrix norm often results from the choice of a vector norm. When this happens,
we are not completely free to choose the norm. This new aspect will be introduced in the next section
of this chapter.
A computational problem is called ill-conditioned (p. 864) if small changes in the data cause large changes
in the solution. The desirable counterpart, where small changes in data cause only small changes in the
solution, is labeled well-conditioned. Take a look at Fig. 445 at the bottom of p. 864. The system in (a) is
well-conditioned. The system shown in part (b) is ill-conditioned because, if we raise or lower one of the
lines just a little bit, the point of intersection (the solution) will move substantially, signifying
ill-conditioning. Example 1, p. 865, expresses the same idea in an algebraic example.
Keeping these examples in mind, we move to the central concept of this section, the condition number κ(A) of a square matrix on p. 868:

κ(A) = ‖A‖ ‖A⁻¹‖.

Here κ is the Greek letter kappa (see back inside cover of textbook), ‖A‖ denotes the norm of matrix A, and ‖A⁻¹‖ denotes the norm of its inverse. We need to backtrack and look at the concept of norms, which is of general interest in numerics.
Vector norms ‖x‖ for column vectors x = [xj] with n components (n fixed), p. 866, are generalized concepts of length or distance and are defined by four properties (3). Most common are the l1-norm (5), the “Euclidean” or l2-norm (6), and the l∞-norm (7)—all illustrated in Example 3, p. 866. Matrix norms, p. 867, build on vector norms and are defined by

‖A‖ = max over x ≠ 0 of  ‖Ax‖ / ‖x‖.

We use the l1-norm (5) for matrices—obtaining the column “sum” norm (10)—and the l∞-norm (7) for matrices—obtaining the row “sum” norm (11)—both on p. 861 of Sec. 20.3. Example 4, pp. 866–867, illustrates this. We continue our discussion of the condition number.
We take the coefficient matrix A of a linear system Ax D b and calculate .A/: If .A/ is small, then the
linear system is well-conditioned (Theorem 1, Example 5, p. 868).
We look at the proof of Theorem 1. We see from (15), p. 868, that a small condition number κ(A) gives a small difference in norm, ‖x − x̃‖, between an approximate solution x̃ and the unknown exact solution x of a linear system Ax = b.
Problem 9 gives a complete example on how to compute the condition number .A/ for the well-
conditioned case. Contrast this with Prob. 19, which solves an ill-conditioned system by Gauss elimination
with partial pivoting and also computes the very large condition number .A/. See also Example 1, p. 865,
and Example 6, p. 869.
Finally, the topic of residual [see (1), p. 865] is explored in Example 2, p. 865, and Prob. 21.
There is no sharp dividing line between well-conditioned and ill-conditioned as discussed in “Further
Comments on Condition Numbers” at the bottom of p. 870.
Problem Set 20.4. Page 871

9. Condition number. The given matrix is

A = [ 2  1 ]
    [ 0  4 ],

and its inverse is

A⁻¹ = (1/8) [ 4  −1 ]   =   [ 1/2  −1/8 ]
            [ 0   2 ]       [  0    1/4 ].
We want the matrix norms for A and A⁻¹, that is, ‖A‖ and ‖A⁻¹‖. We begin with the l1-vector norm, which is defined by (5), p. 866. We have to remember that the l1-vector norm gives, for matrices, the column “sum” norm (the “sum” indicating that we take sums of absolute values), as explained in the blue box in the middle of p. 867. This gives, under the l1-norm [summing over the absolute values of the entries of each column (here columns 1 and 2) and then selecting the maximum],

‖A‖ = max {|2| + |0|, |1| + |4|} = max {2, 5} = 5,

and

‖A⁻¹‖ = max {1/2 + 0, 1/8 + 1/4} = max {1/2, 3/8} = 1/2,    so that    κ(A) = ‖A‖ ‖A⁻¹‖ = 5 · 1/2 = 2.5.
Now we turn to the l∞-vector norm, defined by (7), p. 866. We have to remember that this vector norm gives for matrices the row “sum” norm. This gives, under the l∞-norm [summing over the absolute values of the entries of each row (here rows 1 and 2) and then selecting the maximum],

‖A‖ = max {|2| + |1|, |0| + |4|} = max {3, 4} = 4,

and

‖A⁻¹‖ = max {1/2 + 1/8, 0 + 1/4} = max {5/8, 1/4} = 5/8,    so that    κ(A) = 4 · 5/8 = 2.5.
Since the value of the condition number is not large, we conclude that the matrix A is not ill-
conditioned.
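A quick check with NumPy (norm orders 1 and ∞ correspond to the column and row “sum” norms):

```python
import numpy as np

A = np.array([[2.0, 1], [0, 4]])
Ainv = np.linalg.inv(A)
print(Ainv)                       # [[0.5 -0.125], [0 0.25]]

kappa1 = np.linalg.norm(A, 1) * np.linalg.norm(Ainv, 1)              # 5 * 1/2
kappainf = np.linalg.norm(A, np.inf) * np.linalg.norm(Ainv, np.inf)  # 4 * 5/8
print(kappa1, kappainf)           # 2.5 in both norms
print(np.linalg.cond(A, 1))       # built-in: same as kappa1
```

Both norms happen to give the same condition number 2.5 here, confirming the hand computation.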
19. An ill-conditioned system. We are given

A = [ 4.50  3.55 ]     and     b1 = [ 5.2 ]
    [ 3.55  2.80 ]                  [ 4.1 ].
We use Gauss elimination with partial pivoting (p. 846) to obtain a solution to the linear system.
We form the augmented matrix (pp. 845, 847):

[A | b1] = [ 4.50  3.55 | 5.2 ]
           [ 3.55  2.80 | 4.1 ].

We pivot 4.5 in Row 1 and use it to eliminate 3.55 in Row 2, that is,

Row 2 − (3.55/4.50) Row 1,

and get

[ 4.5  3.55            | 5.2            ]
[ 0    −0.000555555595 | −0.00222222228 ].

Back substitution yields

x2 = −0.00222222228 / (−0.000555555595) = 3.999999992 ≈ 4,

and then x1 = (5.2 − 3.55x2)/4.5 = −2.
The coefficient matrix A is as before, with b2 slightly different from b1, that is,

b2 = [ 5.2 ]
     [ 4.0 ].
We form the augmented matrix

[A | b2] = [ 4.50  3.55 | 5.2 ]
           [ 3.55  2.80 | 4.0 ]

and use Gauss elimination with partial pivoting with exactly the same row operation but startlingly different numbers!

[ 4.5  3.55            | 5.2       ]
[ 0    −0.000555555595 | −0.102222 ]

(There will be a small, nonzero value in the a21 position due to using a finite number of digits.) Back substitution yields

x2 = 183.87 ≈ 184,

and then x1 = (5.2 − 3.55x2)/4.5 = −144.
3. Computing the condition number of A. First, we need the inverse of A. By (4*), p. 304, we have

A⁻¹ = (1 / (4.50 · 2.80 − 3.55 · 3.55)) [  2.80  −3.55 ]
                                        [ −3.55   4.50 ]

    = −400 [  2.80  −3.55 ]   =   [ −1120   1420 ]
           [ −3.55   4.50 ]       [  1420  −1800 ].
The l1-norm for matrix A, which we obtain by summing over the absolute values of the entries of each column (here columns 1 and 2) and then selecting the maximum, is

‖A‖ = max {2.80 + 3.55, 3.55 + 4.50} = max {6.35, 8.05} = 8.05,

and similarly

‖A⁻¹‖ = max {1120 + 1420, 1420 + 1800} = max {2540, 3220} = 3220.

Hence κ(A) = ‖A‖ ‖A⁻¹‖ = 8.05 · 3220 = 25921.
Furthermore, because matrix A is symmetric (and, consequently, so is its inverse A⁻¹), the values of the l∞-norm, i.e., the row “sum” norm, for both matrices A and A⁻¹ are equal to their corresponding values of the l1-norm. Hence the computation of κ(A) would yield the same value.
4. Interpretation and discussion of result. The condition number κ(A) = 25921 is very large, signifying that
the given system is indeed very ill-conditioned. This was confirmed by direct calculations in steps 1 and 2
by Gauss elimination with partial pivoting, where a small change by 0:1 in the second component from b 1
to b2 causes the solution to change from [−2, 4]ᵀ to [−144, 184]ᵀ, a change of about 1,000 times that of
that component! Note that we used 10 decimals in our first set of calculations to get satisfactory results.
You may want to experiment with a small number of decimals and see how you get nonsensical results.
Furthermore, note that the two rows of A are almost proportional.
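The whole experiment fits in a few lines; a sketch:

```python
import numpy as np

A = np.array([[4.50, 3.55], [3.55, 2.80]])
b1 = np.array([5.2, 4.1])
b2 = np.array([5.2, 4.0])                 # second component changed by only 0.1

x1 = np.linalg.solve(A, b1)
x2 = np.linalg.solve(A, b2)
print(x1, x2)                             # about [-2 4] and [-144 184]

kappa = np.linalg.norm(A, 1) * np.linalg.norm(np.linalg.inv(A), 1)
print(kappa)                              # about 25921 = 8.05 * 3220
```

The tiny change in b produces a drastic change in x, exactly as the huge condition number predicts.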
21. Small residuals for very poor solutions. Use (2), p. 865, defining the residual of the “approximate solution” [−10.0  14.1]ᵀ of the actual solution [−2  4]ᵀ, to obtain

r = [ 5.2 ]  −  [ 4.50  3.55 ] [ −10.0 ]
    [ 4.1 ]     [ 3.55  2.80 ] [  14.1 ]

  = [ 5.2 ]  −  [ 5.055 ]
    [ 4.1 ]     [ 3.980 ]

  = [ 0.145 ]
    [ 0.120 ].
While the residual is not very large, the approximate solution has a first component that is 5 times that of the true solution and a second component that is about 3.5 times as great. For ill-conditioned matrices, a small residual does not mean a good approximation.
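A sketch of the residual computation:

```python
import numpy as np

A = np.array([[4.50, 3.55], [3.55, 2.80]])
b = np.array([5.2, 4.1])

x_true = np.array([-2.0, 4.0])
x_bad = np.array([-10.0, 14.1])           # the "approximate" solution

r = b - A @ x_bad                         # residual (2)
print(r)                                  # [0.145 0.12] -- deceptively small...
print(x_bad - x_true)                     # ...although the error itself is huge
```

The contrast between the small residual and the large error is the point of the problem.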
We may describe the underlying problem as follows. We obtained several points in the xy-plane, say by
some experiment, through which we want to fit a straight line. We could do this visually by fitting a line in
such a way that the absolute vertical distance of the points from the line would be as short as possible, as
suggested by Fig. 447, p. 873. Now, to obtain a tractable algebraic model, note that when the absolute vertical distance of a point from the line is smallest, then so is the square of that distance. (The reason we do not want to use the absolute value itself is that it is not differentiable throughout its domain.) Thus we want to fit
a straight line in such a way that the sum of the squares of the distances of all those points from the line is
minimal, i.e., “least”—giving us the name “least squares method.”
The formal description of fitting a straight line by the least squares method is given in (2), p. 873, and
solved by two normal equations (4). While these equations are not particularly difficult, you need some
practice, such as Prob. 1, in order to remember how to correctly set up and solve such problems on the
exam.
The least squares method also plays an important role in regression analysis in statistics. Indeed, the
normal equations (4) show up again in Sec. 25.9, as (10) on p. 1105.
We extend the method to fitting a parabola by the least squares method and obtain three normal
equations (8), p. 874. This generalization is illustrated in Example 2, p. 874, with Fig. 448 on p. 875, and in
complete detail in Prob. 9.
Finally, the most general case is (5) and (6), p. 874.
1. Fitting by a straight line. Method of least squares. We are given four points (0, 2), (2, 0), (3, −2), (5, −3)
through which we should fit algebraically (instead of geometrically or sketching approximately) a
straight line. We use the method of least squares of Example 1, on p. 873 in the textbook. This
requires that we compute the auxiliary quantities needed in the normal equations (4), p. 873 in the textbook. When
using paper and pencil or if you use your computer as a typesetting tool, you may organize the
auxiliary quantities needed in (4) in a table as follows:
 xj    yj    xj²    xjyj
  0     2     0       0
  2     0     4       0
  3    −2     9      −6
  5    −3    25     −15
Sum:  10    −3    38    −21

From the last line of the table we see that the sums are

Σxj = 10,    Σyj = −3,    Σxj² = 38,    Σxjyj = −21,
and n D 4, since we used four pairs of values. This determines the following coefficients for the variables of
(4), p. 873:
(1)    4a + 10b = −3,
(2)   10a + 38b = −21,

and gives the augmented matrix

[  4  10 |  −3 ]
[ 10  38 | −21 ].
This would be a nice candidate for Cramer’s rule. Indeed, we shall solve the system by Cramer’s rule (2),
(3), Example 1, p. 292 in Sec. 7.6. Following that page, we have
D = det A = | 4   10 | = 4 · 38 − 10 · 10 = 152 − 100 = 52.
            | 10  38 |

Furthermore,

a = (1/D) |  −3  10 | = (−3 · 38 − 10 · (−21))/52 = (−114 + 210)/52 = 96/52 = 24/13 = 1.846,
          | −21  38 |

b = (1/D) | 4    −3 | = (4 · (−21) − (−3) · 10)/52 = (−84 + 30)/52 = −54/52 = −27/26 = −1.038.
          | 10  −21 |
From this we immediately get our desired straight line:
y D a C bx
D 1:846 1:038x:
Sec. 20.5 Prob. 1. Given data and straight line fitted by least squares. (Note that the
axes have equal scales.)
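As a quick check, the normal equations can be set up and solved numerically. A minimal NumPy sketch for the data of Prob. 1 (variable names are ours, not the textbook's):

```python
import numpy as np

# Data of Prob. 1 (the y-values at x = 3 and x = 5 are negative).
x = np.array([0.0, 2.0, 3.0, 5.0])
y = np.array([2.0, 0.0, -2.0, -3.0])
n = len(x)

# Normal equations (4): coefficient matrix and right-hand side.
A = np.array([[n, x.sum()], [x.sum(), (x**2).sum()]])
rhs = np.array([y.sum(), (x * y).sum()])
a, b = np.linalg.solve(A, rhs)

print(round(a, 4), round(b, 4))  # 1.8462 -1.0385, i.e. a = 24/13, b = -27/26
```

The exact fractions 24/13 and −27/26 agree with the Cramer's-rule computation above.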
9. Fitting by a quadratic parabola. A quadratic parabola is uniquely determined by three given points. In
this problem, five points are given. We can fit a quadratic parabola by solving the normal equations (8),
p. 874. We arrange the data and the auxiliary quantities needed in (8) again in a table:

 x     y    x²    x³     x⁴    xy    x²y
 2     3     4     8      16    6     12
 3     0     9    27      81    0      0
 5    −1    25   125     625   −5    −25
 6     0    36   216    1296    0      0
 7     2    49   343    2401   14     98
Sum   23     4   123    719   4419   15     85
We use Gauss elimination but, noting that the largest numbers occur in the third normal equation, we swap
the first and third rows of the augmented matrix:

[123   719   4419 | 85]   Row 3
[ 23   123    719 | 15]   Row 1
[  5    23    123 |  4]

Eliminating the first column gives

[123   719        4419     | 85        ]
[  0   −11.4472   −107.317 | −0.894309 ]   Row 2 − (23/123) Row 1
[  0   −6.22764   −56.6341 |  0.544715 ]   Row 3 − (5/123) Row 1

and eliminating the second column gives

[123   719        4419     | 85        ]
[  0   −11.4472   −107.317 | −0.894309 ]
[  0    0          1.75    |  1.03125  ]   Row 3 − (6.22764/11.4472) Row 2

Back substitution gives us, from the last row of the last matrix,

b2 = 1.03125/1.75 = 0.589286,

so that, from the second row,

b1 = (−0.894309 + 107.317·b2)/(−11.4472) = −5.44643,

and, from the first row,

123 b0 = 85 − 719 b1 − 4419 b2 = 1396.93,   hence   b0 = 1396.93/123 = 11.3571.

Hence the desired quadratic parabola that fits the data by the least squares principle is

y = 11.3571 − 5.44643x + 0.589286x².
Sec. 20.5 Prob. 9. Given points and quadratic parabola fitted by least squares.
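The same check works for the parabola. A sketch that builds the normal equations (8) from a design matrix (the data values are those reconstructed in the table above):

```python
import numpy as np

x = np.array([2.0, 3.0, 5.0, 6.0, 7.0])
y = np.array([3.0, 0.0, -1.0, 0.0, 2.0])

M = np.vander(x, 3, increasing=True)   # columns 1, x, x^2
# Normal equations (8): (M^T M) b = M^T y
b = np.linalg.solve(M.T @ M, M.T @ y)
print(b)   # coefficients b0, b1, b2 of y = b0 + b1*x + b2*x^2
```

Solving exactly by Cramer's rule gives b0 = 27984/2464, b1 = −13420/2464, b2 = 1452/2464, which the floating-point solution reproduces.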
11. Comparison of linear and quadratic fit. The figure below shows that a straight line obviously
is not sufficient; the quadratic parabola gives a much better fit. Whether the fit by a quadratic
polynomial is satisfactory, and whether the remaining discrepancies can be attributed to chance
variations (such as inaccuracy of measurement), depends on the physical or other law underlying the
data. Calculation shows that the augmented matrix of the normal equations for the straight line is

[ 5   10 |  8.3 ]
[10   30 | 17.5 ]

and gives y = 1.48 + 0.09x. The augmented matrix for the quadratic polynomial is

[ 5    10    30 |  8.30     ]
[10    30   100 | 17.5      ]
[30   100   354 | Σ xj² yj  ]

and gives y = 1.896 − 0.741x + 0.208x². For practice, you should fill in the details.
Sec. 20.5 Prob. 11. Fit by a straight line and by a quadratic parabola.
The central issue in finding the eigenvalues of an n × n matrix is to determine the roots of the corresponding
characteristic polynomial of degree n. This is usually quite difficult and requires the use of an iterative
numerical method, say from Sec. 19.2, or from Secs. 20.8 and 20.9 for matrices with additional properties.
However, sometimes we may only want some rough approximation of one or more eigenvalues of the
matrix, thereby avoiding costly computations. This leads to our main topic of eigenvalue inclusion.
Gerschgorin was only 30 years old when he published his beautiful and imaginative theorem, Theorem 1, p.
879. Take a look at Gerschgorin's theorem at the bottom of that page. Formula (1) says that the eigenvalues
of an n × n matrix lie in the complex plane in closed circular disks. The centers of these disks are the
elements of the diagonal of the matrix, and the radius of each disk is determined by the sum of the absolute
values of the off-diagonal elements in the corresponding row. Turn over to p. 880 and look at Example
1, which applies Gerschgorin's theorem to a 3 × 3 matrix and gets three disks, so-called Gerschgorin disks,
two of which overlap as shown in Fig. 449. The centers of these disks can serve as crude approximations of
the eigenvalues of the matrix and the radii of the disks as the corresponding error bounds.
Problems 1 and 5 are further illustrations of Gerschgorin’s theorem for real and complex matrices,
respectively.
Gerschgorin's theorem (Theorem 1) and its extension (Theorem 2, p. 881) are types of theorems known as
inclusion theorems. Inclusion theorems (p. 882) are theorems that give point sets in the complex plane that
"include," i.e., contain, one or several eigenvalues of a given matrix. Other such theorems are Schur's
theorem (Theorem 4, p. 882), Perron's theorem (Theorem 5, p. 882) for real or complex square matrices,
and the Collatz inclusion theorem (Theorem 6, p. 883), which applies only to real square matrices whose
elements are all positive. Be aware that, throughout Secs. 20.7–20.9, some theorems can only be applied to
certain types of matrices.
Finally, Probs. 7, 11, and 13 are of a more theoretical nature.
1. Determination of the Gerschgorin disks. The diagonal entries of the given real matrix (which we shall
denote by A)

A = [  5   2   4 ]
    [ −2   0   2 ]
    [  2   4   7 ]

are 5, 0, and 7. By Gerschgorin's theorem (Theorem 1, p. 879), these are the centers of the three desired
Gerschgorin disks D1, D2, and D3, respectively. For the first disk we have, by (1), p. 879, the radius

|a12| + |a13| = |2| + |4| = 6,

so that

D1:  |λ − 5| ≤ 6,

or, equivalently,

D1:  center 5, radius 6.

This means that, to obtain the radius of a Gerschgorin disk, we add up the absolute values of the entries in the
same row as the diagonal entry (except for the value of the diagonal entry itself). Thus, for the other two
Gerschgorin disks, we have

D2:  center 0, radius |−2| + |2| = 4,
D3:  center 7, radius |2| + |4| = 6.

Below is a sketch of the three Gerschgorin disks. Note that their union lies in the closed interval
−4 ≤ x ≤ 13 of the real axis.
Sec. 20.7 Prob. 1. Gerschgorin disks. The disks have centers 5, 0, 7 and radii 6, 4, 6,
respectively.
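The disk computation is easy to script. A sketch using the matrix as reconstructed above (the sign of the entry a21 is inferred from the characteristic polynomial worked out next):

```python
import numpy as np

A = np.array([[5.0, 2.0, 4.0],
              [-2.0, 0.0, 2.0],
              [2.0, 4.0, 7.0]])

centers = np.diag(A)
radii = np.abs(A).sum(axis=1) - np.abs(centers)  # off-diagonal row sums
print(centers)  # [5. 0. 7.]
print(radii)    # [6. 4. 6.]

# Gerschgorin: every eigenvalue lies in at least one disk.
for lam in np.linalg.eigvals(A):
    assert any(abs(lam - c) <= r + 1e-9 for c, r in zip(centers, radii))
```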
The actual eigenvalues can be found from the characteristic determinant:

p(λ) = det(A − λI) = | 5−λ    2     4   |
                     | −2    −λ     2   |
                     |  2     4    7−λ  |

     = (5−λ) | −λ   2  |  − 2 | −2   2  |  + 4 | −2  −λ |
             |  4  7−λ |      |  2  7−λ |      |  2   4 |

     = (5−λ)[−λ(7−λ) − 8] − 2[−2(7−λ) − 4] + 4[−8 + 2λ],

so that (after multiplying by −1) the characteristic equation is

λ³ − 12λ² + 23λ + 36 = 0.

We want to find the roots of this characteristic polynomial p(λ). We use the following two observations:

F1. The product of the eigenvalues equals (−1)ⁿ times the constant term of the (monic) characteristic
polynomial.

F2. The sum of the eigenvalues equals (−1) times the coefficient of the second-highest term of
the (monic) characteristic polynomial. (Another example is discussed on pp. 129–130, in Volume 1, of the
Student Solutions Manual.)

By F1, any integer eigenvalue must divide the constant term, which we factor: 36 = 2·2·3·3. We calculate,
starting with the smallest factors, both positive and negative: p(1) = 1 − 12 + 23 + 36 = 48 ≠ 0;
p(−1) = 0. We found an eigenvalue! Thus a factor is (λ + 1) and we could use long
division and apply the well-known quadratic formula for finding the remaining roots. Or we can continue:
p(2) = 42, p(−2) = −66, p(4) = 0. We found another eigenvalue. From F2, we know that the sum of the
three eigenvalues must equal 12. Hence −1 + 4 + λ3 = 12, so the remaining eigenvalue must
be λ3 = 9. Hence the three eigenvalues (the spectrum) are −1, 4, 9.
3. Discussion. The inclusion interval obtained from Gerschgorin's theorem is larger; this is typical. But
the theorem is best possible in the sense that, for any given set of disks (with real or complex
centers), we can find a corresponding matrix whose spectrum cannot be included in a set of smaller closed
disks with the main diagonal entries of that matrix as centers.
5. Gerschgorin disks. Complex matrix. To obtain the radii of the Gerschgorin disks, we compute, by
(1), p. 879,

|a12| + |a13| = |i| + |1 + i| = 1 + √(1² + 1²) = 1 + √2   [by (3), p. 613],
|a21| + |a23| = |−i| + |0| = 1,
|a31| + |a32| = |1 − i| + |0| = √(1² + (−1)²) = √2.

The diagonal elements, and hence the centers of the Gerschgorin disks, are 2, 3, and 8.
Putting it all together: the disks are

D1: center 2, radius 1 + √2;   D2: center 3, radius 1;   D3: center 8, radius √2,

just as in Prob. 1. You may want to sketch the Gerschgorin disks and determine the closed interval in
which they lie.
The determination of the actual eigenvalues is as follows. Developing the determinant along the last
row, with the usual checkerboard pattern in mind giving the correct plus and minus signs of the
cofactors (see bottom of p. 294), we obtain

p(λ) = det(A − λI) = | 2−λ    i     1+i |
                     | −i    3−λ    0   |
                     | 1−i    0    8−λ  |

     = −λ³ + 13λ² − 43λ + 34.

To look for integer roots we factor the constant term: 34 = 2·17. However, none of its positive and
negative factors, when substituted into the characteristic polynomial, yields p(λ) equal to zero. Hence we
would have to resort to a root-finding method from Sec. 19.2, p. 802, such as Newton's method. A starting
value, as suggested by Gerschgorin's theorem, would be λ = 1.0000. However, the problem suggests the
use of a CAS (if available). Using a CAS (here Mathematica), the spectrum {λ1, λ2, λ3} is

λ1 = 1.16308,   λ2 = 3.51108,   λ3 = 8.32584.
Comment. We initially tried to use the approach of Prob. 1: we determined the characteristic
polynomial, factored the constant term, and then checked whether any of these factors
yielded zeros. The point is to first try a simpler approach and only then go to more involved
methods.
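A CAS is not strictly required; NumPy's Hermitian eigensolver reproduces the spectrum (matrix entries as reconstructed from the radii and the determinant above):

```python
import numpy as np

A = np.array([[2, 1j, 1 + 1j],
              [-1j, 3, 0],
              [1 - 1j, 0, 8]])

lam = np.linalg.eigvalsh(A)   # real eigenvalues of a Hermitian matrix
print(np.round(lam, 5))       # ≈ [1.16308, 3.51108, 8.32584]
```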
7. Similarity transformation. The matrix in Prob. 2 shows a typical situation. It may have resulted from a
numeric method of diagonalization that left off-diagonal entries of various sizes but not exceeding 10⁻²
in absolute value. Gerschgorin's theorem then gives circles of radius 2·10⁻². These furnish bounds for
the deviation of the eigenvalues from the main diagonal entries. This describes the starting situation for
the present problem. Now, in various applications, one is often interested in the eigenvalue of largest or
smallest absolute value. In our matrix, the smallest eigenvalue is about 5, with a maximum possible
deviation of 2·10⁻², as given by Gerschgorin's theorem. We now wish to decrease the size of this
Gerschgorin disk as much as possible. Example 2, on p. 881 in the text, shows us how we should
proceed. The entry 5 stands in the first row and column. Hence we should apply to A a similarity
transformation involving a diagonal matrix T with main diagonal a, 1, 1, where a is as large as possible.
The inverse of T is the diagonal matrix with main diagonal 1/a, 1, 1. Leave a arbitrary and first determine
the result of the similarity transformation (as in Example 2).
B = T⁻¹AT = [ 1/a  0  0 ] [ 5     0.01  0.01 ] [ a  0  0 ]
            [ 0    1  0 ] [ 0.01  8     0.01 ] [ 0  1  0 ]
            [ 0    0  1 ] [ 0.01  0.01  9    ] [ 0  0  1 ]

          = [ 5       0.01/a   0.01/a ]
            [ 0.01a   8        0.01   ]
            [ 0.01a   0.01     9      ].
We see that the Gerschgorin disks of the transformed matrix B, by Gerschgorin's theorem, p. 879, are

Center   Radius
5        0.02/a
8        0.01(a + 1)
9        0.01(a + 1)
The last two disks must be small enough so that they do not touch or overlap the first disk. Since
8 − 5 = 3, the radius of the second disk, after the transformation, must be less than 3 − 0.02/a; that is,

0.01(a + 1) < 3 − 0.02/a.

Multiplying both sides by 100a gives

a² + a < 300a − 2.
If we replace the inequality sign by an equality sign, we obtain the quadratic equation

a² − 299a + 2 = 0.

Hence a must be less than the larger root 298.9933 of this equation; say, for convenience, a = 298.
Then the radius of the second disk is 0.01(a + 1) = 2.99, so that the disk will not touch the first one,
and neither will the third, which is farther away from the first. The first disk is substantially reduced
in size, by a factor of almost 300, the radius of the reduced disk being 0.02/298 = 0.000067114.
The choice of a = 100 would give a reduction by a factor 100, as requested in the problem. Our
systematic approach shows that we can do better.
With a = 100 the transformation reads

[ 0.01  0  0 ] [ 5.00  0.01  0.01 ] [ 100  0  0 ]   [ 5  0.0001  0.0001 ]
[ 0     1  0 ] [ 0.01  8.00  0.01 ] [ 0    1  0 ] = [ 1  8       0.01   ]
[ 0     0  1 ] [ 0.01  0.01  9.00 ] [ 0    0  1 ]   [ 1  0.01    9      ].
Remark. In general, the error bounds from the Gerschgorin disks are quite poor unless the off-diagonal
entries are very small. However, for an eigenvalue in an isolated Gerschgorin disk, as in Fig. 449,
p. 880, it can be meaningful to make an error bound smaller by choosing an appropriate similarity
transformation

B = T⁻¹AT,

where T is a diagonal matrix. Do you know why this is possible? Answer: This is allowed by Theorem 2,
p. 878, which ensures that similarity transformations preserve eigenvalues. So here we picked the
smallest eigenvalue and made the error bound smaller by a factor 1/100, as requested.
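The whole computation of this problem can be replayed numerically. A sketch (matrix and T as assumed in the solution above):

```python
import numpy as np

A = np.array([[5.0, 0.01, 0.01],
              [0.01, 8.0, 0.01],
              [0.01, 0.01, 9.0]])

def gerschgorin_radii(M):
    """Off-diagonal absolute row sums, i.e. the Gerschgorin radii."""
    return np.abs(M).sum(axis=1) - np.abs(np.diag(M))

a = 298.0
T = np.diag([a, 1.0, 1.0])
B = np.linalg.inv(T) @ A @ T    # similarity transformation, spectrum unchanged

print(gerschgorin_radii(A))     # radii of A: 0.02 each
print(gerschgorin_radii(B)[0])  # radius of the first disk of B: 0.02/298
```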
11. Spectral radius. By definition (see p. 324), the spectral radius of a square matrix A is the absolute value
of an eigenvalue of A that is largest in absolute value. Since every eigenvalue of A lies in a Gerschgorin
disk, for every eigenvalue of A we must have (make a sketch)

(I)   |λj| ≤ |ajj| + Σ(k≠j) |ajk|,

where we sum over all off-diagonal entries in Row j (and the eigenvalues of A are numbered suitably).
Since (I) is true for all eigenvalues of A, it must be true for the eigenvalue of A that is largest in
absolute value, that is, the largest |λj|. But this is, by definition, the spectral radius ρ(A). The right-hand
side of (I) is at most the row "sum" norm of A. Hence we have proven that

ρ(A) ≤ ‖A‖  (row sum norm).
13. Spectral radius. The row sum norm was used in Prob. 11, but we could also use the Frobenius norm

√( Σj Σk cjk² )   [see (9), p. 861]

to find an upper bound. In this case, we would get (calling the elements ajk, since we called the matrix
in Prob. 1 A)

|λj| ≤ √( Σ(j=1..3) Σ(k=1..3) ajk² ) = √122 = 11.05.
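Both bounds are one-liners in NumPy. A sketch using the Prob. 1 matrix (entries as reconstructed earlier):

```python
import numpy as np

A = np.array([[5.0, 2.0, 4.0],
              [-2.0, 0.0, 2.0],
              [2.0, 4.0, 7.0]])

rho = max(abs(np.linalg.eigvals(A)))        # spectral radius, here 9
row_sum_norm = np.abs(A).sum(axis=1).max()  # bound of Prob. 11, here 13
frobenius = np.sqrt((A**2).sum())           # bound of Prob. 13, sqrt(122) ≈ 11.05

print(rho, row_sum_norm, frobenius)
```

Both bounds hold, and here the Frobenius norm happens to be the tighter of the two.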
The main attraction of the power method is its simplicity. For an n × n matrix A with a dominant
eigenvalue ("dominant" means "largest in absolute value"), the method usually gives us an approximation
(1), p. 885, of that eigenvalue. Furthermore, if the matrix A is symmetric, that is, ajk = akj [by (1),
p. 335], then we also get an error bound (2) for approximation (1). Convergence may be slow but can be
improved by a spectral shift (Example 2, p. 887). Another use for a spectral shift is to make the method
converge to the smallest eigenvalue, as shown in Prob. 11. Scaling can provide a convergent sequence of
eigenvectors (for more information, see Example 1, p. 886). The power method is explained in great detail
in Prob. 5.
More details on Example 1, pp. 886–887. Application of Power Method, Error Bound (Theorem 1, p. 885).
Scaling. We take a closer look at the six vectors listed at the beginning of the example:

x0 = [1, 1, 1]ᵀ,   x1 = [0.890244, 0.609756, 1]ᵀ,   x2 = [0.890244, 0.609756, 1]ᵀ,
x5 = […, 0.504682, 1]ᵀ,   x10 = […, 0.500146, 1]ᵀ,   x15 = […, 0.500005, 1]ᵀ.
Vector x0 was scaled. The others were obtained by multiplication by the given matrix A and subsequent
scaling. We can use any of these vectors for obtaining a corresponding Rayleigh quotient q as an
approximate value of an (unknown) eigenvalue of A and a corresponding error bound ı for q. Hence we
have six possibilities using one of the given vectors, and indeed many more if we want to compute further
vectors. Note that we must not use two of the given vectors because of the scaling, but just one vector. For
instance, if we use x1 and then its product Ax1, we can compute m0 = x1ᵀx1, m1 = x1ᵀAx1, and
m2 = (Ax1)ᵀAx1. These now give the Rayleigh quotient q and error bound δ of q by (1), (2), p. 885:

q = m1/m0 = 0.716048,
δ = √(m2/m0 − q²) = 0.038887,

while the actual error of q is ε = λ − q = 0.72 − q = 0.003952.
These values agree with those for j = 2 in the table for Example 1 on p. 887 of the textbook.
5. Power method with scaling. The given matrix is

A = [  2  −1   1 ]
    [ −1   3   2 ]
    [  1   2   3 ].

Use the same notation as in Example 1 in the text. From x0 = [1 1 1]ᵀ calculate Ax0 and then scale it as
indicated in the problem, calling the resulting vector x1. This is the first step. In the second step
calculate Ax1 and then scale it, calling the resulting vector x2. And so on. More details are as follows:

Iteration 1: We start with

x0 = [1, 1, 1]ᵀ.

Multiplication by the given matrix A gives us

Ax0 = [  2  −1   1 ] [1]   [2]
      [ −1   3   2 ] [1] = [4]
      [  1   2   3 ] [1]   [6].
The calculations that give the approximations q (Rayleigh quotients) and error bounds are as follows.
For m0, m1, and m2 we get

m0 = x0ᵀx0 = 1·1 + 1·1 + 1·1 = 3,
m1 = x0ᵀAx0 = 1·2 + 1·4 + 1·6 = 12,
m2 = (Ax0)ᵀAx0 = 2·2 + 4·4 + 6·6 = 56,

so that

q = m1/m0 = 12/3 = 4,
δ² = m2/m0 − q² = 56/3 − 4² = 18.66667 − 16 = 2.666667,
δ = √2.666667 = 1.632993,
q + δ = 5.632993.
Iteration 2: If this is not sufficient, we iterate, using a scaling factor. We choose the absolutely largest
component of Ax0. This is 6, so we get

x1 = (1/6) [2]   [0.3333333]
           [4] = [0.6666667]
           [6]   [1        ],

Ax1 = [  2  −1   1 ] [0.3333333]   [1       ]
      [ −1   3   2 ] [0.6666667] = [3.666667]
      [  1   2   3 ] [1        ]   [4.666667].
As before, we compute the values required to obtain our next approximation q and error bound δ:

m0 = x1ᵀx1 = 1.555556,
m1 = x1ᵀAx1 = 7.444445,
m2 = (Ax1)ᵀAx1 = 36.22223,

q = m1/m0 = 7.444445/1.555556 = 4.785713,
δ² = m2/m0 − q² = 36.22223/1.555556 − (4.785713)² = 23.28571 − 22.90305 = 0.38266,
δ = √0.38266 = 0.6186,
q + δ = 5.404308.

It is important to notice that we have a loss of significant digits (subtractive cancellation) in the computation
of δ². The two terms used in the subtraction are similar in size, and we go from seven digits to five. This
suggests that, for more than three iterations, we might require our numbers to carry more digits.
Iteration 3: Again, if the result is not good enough, we move to the next iteration, using the
largest component of Ax1 as our scaling factor. This is 4.666667, so we get for x2

x2 = [ 1/4.666667        ]   [0.2142857]
     [ 3.666667/4.666667 ] = [0.7857143]
     [ 4.666667/4.666667 ]   [1        ],

from which

Ax2 = [  2  −1   1 ] [0.2142857]   [0.6428571]
      [ −1   3   2 ] [0.7857143] = [4.142857 ]
      [  1   2   3 ] [1        ]   [4.785714 ].

This is followed by one more scaling step for the final result

x3 = [ 0.6428571/4.785714 ]   [0.1343284]
     [ 4.142857/4.785714  ] = [0.8656717]
     [ 4.785714/4.785714  ]   [1        ].

The approximations and error bounds follow as before:

m0 = x2ᵀx2 = 1.663265,
m1 = x2ᵀAx2 = 8.178571,
m2 = (Ax2)ᵀAx2 = 40.47959,

q = m1/m0 = 8.178571/1.663265 = 4.917179,
δ² = m2/m0 − q² = 40.47959/1.663265 − (4.917179)² = 24.33743 − 24.17865 = 0.1587774,
δ = √0.1587774 = 0.3984688,
q + δ = 5.315648.
The results are summarized and rounded in the following table. Note how the value of δ gets smaller, so
that we have a smaller error bound on q.

                     Step 1           Step 2             Step 3
m0                   x0ᵀx0 = 3        x1ᵀx1 = 1.55556    x2ᵀx2 = 1.663
m1                   x0ᵀAx0 = 12      x1ᵀAx1 = 7.44444   x2ᵀAx2 = 8.179
m2                   (Ax0)ᵀAx0 = 56   (Ax1)ᵀAx1 = 36.22  (Ax2)ᵀAx2 = 40.48
q = m1/m0            4                4.786              4.917
δ² = m2/m0 − q²      2.667            0.3827             0.1588
δ                    1.633            0.6186             0.3985
q − δ                2.367            4.167              4.519
q + δ                5.633            5.404              5.316
Solving the characteristic equation −λ³ + 8λ² − 15λ = 0 shows that the matrix has the eigenvalues 0, 3,
and 5. We see that the interval obtained in the first step includes the eigenvalues 3 and 5. Only in the
second step and third step of the iteration did we obtain intervals that include only the largest eigenvalue,
as is usually the case from the beginning on. The reason for this interesting observation is the fact that x0 is
a linear combination of all three eigenvectors,

x0 = z1 − (1/3)(z2 + z3),

as can easily be verified, and it takes several iterations until the powers of the largest eigenvalue
make the iterate xj come close to z1, the eigenvector corresponding to λ = 5. This situation occurs
quite frequently, and one needs more steps for obtaining satisfactory results the closer in absolute
value the other eigenvalues are to the absolutely largest one.
11. Spectral shift. Smallest eigenvalue. For the shift k = −3 suggested in the problem, the shifted matrix is

B = A − 3I = [ −1  −1   1 ]
             [ −1   0   2 ]
             [  1   2   0 ].

Now the power method converges to the eigenvalue λmax of largest absolute value. (Here we assume
that the matrix does not have −λmax as another eigenvalue.) Accordingly, to obtain convergence to
the smallest eigenvalue, make a shift to A + kI with a negative k. Choose k by trial and error,
reasoning about as follows. The given matrix has trace A = 2 + 3 + 3 = 8. This is the sum of the
eigenvalues. From Prob. 5 we know that the absolutely largest eigenvalue is about 5. Hence the
sum of the other eigenvalues equals about 3. Hence k = −3, as suggested in the problem, seems to be a
reasonable choice. Our computation of the Rayleigh quotients and error bounds gives, for the first
step,

x0 = [1 1 1]ᵀ,   x1 = Bx0 = [−1 1 3]ᵀ,   m0 = 3,   m1 = 3,   m2 = 11,   q = 1,
δ = √(11/3 − 1²) = √(8/3),

and so on. We see that the Rayleigh quotients seem to converge to −3, which corresponds to the eigenvalue
0 of the given matrix. It is interesting that the sequence of the δ is not monotone; δ first increases
and starts decreasing only when q gets closer to the limit −3. This is typical. Also, note that the
error bounds are much larger than the actual errors of q. This is also typical.
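A quick numerical check of the shifted iteration (same reconstructed matrix as in Prob. 5):

```python
import numpy as np

A = np.array([[2.0, -1.0, 1.0],
              [-1.0, 3.0, 2.0],
              [1.0, 2.0, 3.0]])   # eigenvalues 0, 3, 5

B = A - 3.0 * np.eye(3)           # shifted matrix, eigenvalues -3, 0, 2
x = np.ones(3)
for _ in range(50):
    y = B @ x
    q = (x @ y) / (x @ x)         # Rayleigh quotient for B
    x = y / np.abs(y).max()       # scaling

print(q + 3.0)   # q -> -3, so q + 3 -> 0, the smallest eigenvalue of A
```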
A good way to understand the QR-factorization method is to work through Example 2, pp. 894–896, with a
further demonstration of the method in Prob. 7. Both examples and both problems are concerned with the
same real symmetric matrices, respectively.
An outline of this section is as follows: Discussion of the problem and a biographic reference to
Householder. Formula (1), on p. 889, is the general set of formulas for the similarity transformations Pr
used to obtain, in stages, the tridiagonal matrix B.
Figure 450 illustrates, visually, how a 5 × 5 matrix A gets transformed into A1, A2, A3, so that at the end
B = A3.
Formulas (2) and (3), p. 889, show the general form of the similarity transformations Pr and the associated
unit vectors vr.
The important formula (4), at the top of p. 890, defines the components of the unit vectors vr of (2) and (3).
Notice that, in (4b), sgn a21 is the sign function. It extracts the sign from a number, here a21. This function
gives "plus one" when a number is zero or positive and "minus one" when a number is negative.
For each iteration in formula (4) we increase, by 1, all subscripts of the components of the column
vectors vr (r = 2 for step 2). We iterate n − 2 times for an n × n matrix.
More Details on Example 2, p. 894. QR-Factorization Method. The tridiagonalized matrix is (p. 895)

B = [ 6    √18   0  ]
    [ √18  7    √2 ]
    [ 0    √2   6  ].

We use the abbreviations c2, s2, and t2 for cos θ2, sin θ2, and tan θ2, respectively. We multiply B from the
left by

C2 = [ c2   s2  0 ]
     [ −s2  c2  0 ]
     [ 0    0   1 ].

The purpose of this multiplication is to obtain a matrix C2B = [bjk⁽²⁾] for which the off-diagonal entry b21⁽²⁾
is zero. Now this entry is the inner product of Row 2 of C2 times Column 1 of B, that is,

b21⁽²⁾ = −s2 b11 + c2 b21 = 0,   hence   t2 = s2/c2 = b21/b11 = √18/6 = 1/√2.

From this and the formulas that express cos and sin in terms of tan we obtain

c2 = 1/√(1 + t2²) = √(2/3) = 0.816496581,
s2 = t2/√(1 + t2²) = 1/√3 = 0.577350269.

The angle θ3 is determined similarly, with the purpose of obtaining b32⁽³⁾ = 0 in C3C2B = [bjk⁽³⁾].
Problem Set 20.9. Page 896
3. Tridiagonalization. The given matrix

A = [ 7   2   3 ]
    [ 2  10   6 ]
    [ 3   6   7 ]

is symmetric. Hence we can apply Householder's method for obtaining a tridiagonal matrix (which will
have two zeros in the locations of the entries 3). Proceed as in Example 1 of the text. Since A is of size
n = 3, we have to perform n − 2 = 1 step. (In Example 1 we had n = 4 and needed n − 2 = 2 steps.)
Calculate the vector v1 from (4), p. 890. Denote it simply by v and its components by v1 (= 0), v2, v3,
because we do only one step. Similarly, denote S1 in (4c) by S. Compute

S = √(a21² + a31²) = √(2² + 3²) = √13 = 3.60555.
If we compute using, say, six digits, we may expect that, instead of those two zeros in the
tridiagonalized matrix, we obtain entries of the order 10⁻⁶ or even larger in absolute value. We always
have v1 = 0. From (4a) we obtain the second component

v2 = √((1 + a21/S)/2) = √((1 + 2/3.60555)/2) = 0.881675.

From (4b), with j = 3 and sgn a21 = +1 (because a21 is positive), we obtain the third component

v3 = a31/(2 v2 S) = 3/(2 · 0.881675 · 3.60555) = 0.471858.
With these values we now compute Pr from (2), where r = 1, …, n − 2, so that we have only r = 1 and
can denote P1 simply by P. Note well that vᵀv would be the dot product of the vector with itself (thus the
square of its length), whereas vvᵀ is a 3 × 3 matrix because of the usual matrix multiplication. We thus
obtain from (2), p. 889,

P = I − 2vvᵀ

  = I − 2 [ v1²    v1v2   v1v3 ]
          [ v2v1   v2²    v2v3 ]
          [ v3v1   v3v2   v3²  ]

  = [ 1 − 2v1²   −2v1v2     −2v1v3   ]
    [ −2v2v1     1 − 2v2²   −2v2v3   ]
    [ −2v3v1     −2v3v2     1 − 2v3² ]

  = [ 1.0   0           0         ]
    [ 0    −0.554700   −0.832051  ]
    [ 0    −0.832051    0.554700  ].
Finally use P, and its inverse P⁻¹ = P, for the similarity transformation that will produce the
tridiagonal matrix

B = PAP = P [ 7.0   −3.605556    0.000001 ]
            [ 2.0  −10.539321   −4.992308 ]
            [ 3.0   −9.152565   −1.109404 ]

  = [  7.0       −3.605551    0.000001 ]
    [ −3.605551  13.461538    3.692308 ]
    [  0.000001   3.692308    3.538462 ]

(the entries 0.000001 stem from the six-digit rounding; exactly, they are 0).
The point of the use of similarity transformations is that they preserve the spectrum of A, consisting of
the eigenvalues

2,  6,  16,

which can be found, for instance, by graphing the characteristic polynomial of A and applying Newton's
method for improving the values obtained from the graph.
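The single Householder step of this problem can be replayed in a few lines following (4); a sketch (full double precision here, so the off-tridiagonal entries come out at machine-epsilon level rather than 10⁻⁶):

```python
import numpy as np

A = np.array([[7.0, 2.0, 3.0],
              [2.0, 10.0, 6.0],
              [3.0, 6.0, 7.0]])

S = np.hypot(A[1, 0], A[2, 0])             # sqrt(a21^2 + a31^2) = sqrt(13)
v = np.zeros(3)                            # v1 = 0
v[1] = np.sqrt(0.5 * (1.0 + A[1, 0] / S))  # (4a); sgn(a21) = +1 here
v[2] = A[2, 0] / (2.0 * v[1] * S)          # (4b)

P = np.eye(3) - 2.0 * np.outer(v, v)       # (2): symmetric, orthogonal, P^-1 = P
B = P @ A @ P
print(np.round(B, 6))                      # tridiagonal, spectrum preserved
```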
7. QR-factorization. The purpose of this factorization is the determination of approximate values of all the
eigenvalues of a given matrix. To save work, one usually begins by tridiagonalizing the matrix, which
must be symmetric. This was done in Prob. 3. The matrix at the end of that problem
B0 = [bjk] = [  7.0           −3.605551275    0           ]
             [ −3.605551275   13.46153846     3.692307692 ]
             [  0              3.692307692    3.538461538 ]
is tridiagonal (note that greater accuracy is being used here). Hence the QR method can begin. We proceed
as in Example 2, on p. 894, of the textbook. To save writing, we write c2, s2, t2 for cos θ2, sin θ2, tan θ2,
respectively. Consider
C2 = [ c2   s2  0 ]
     [ −s2  c2  0 ]
     [ 0    0   1 ]
with the angle of rotation θ2 determined so that, in the product W0 = C2B0 = [wjk⁽⁰⁾], the entry w21⁽⁰⁾
is zero. By the usual matrix multiplication (row times column), w21⁽⁰⁾ is the inner product of Row 2 of
C2 times Column 1 of B0, that is,

w21⁽⁰⁾ = −s2 b11⁽⁰⁾ + c2 b21⁽⁰⁾ = 0,   hence   t2 = b21⁽⁰⁾/b11⁽⁰⁾.

From this, and the formulas for cos and sin in terms of tan (usually discussed in calculus), we obtain

(I/1)   c2 = 1/√(1 + (b21⁽⁰⁾/b11⁽⁰⁾)²) = 0.889000889,
        s2 = (b21⁽⁰⁾/b11⁽⁰⁾)/√(1 + (b21⁽⁰⁾/b11⁽⁰⁾)²) = −0.457905,

and hence

W0 = C2B0 = [ 7.874007873   −9.369450382   −1.690727888 ]
            [ 0             10.31632        3.28246      ]
            [ 0              3.692307692    3.538461538  ].
C2 has served its purpose: instead of b21⁽⁰⁾ = −3.605551275 we now have w21⁽⁰⁾ = 0. (Instead of w21⁽⁰⁾ = 0,
on the computer we may get 10⁻¹⁰ or another very small entry; the use of more digits in B0 ensured the
0.) Now use the abbreviations c3, s3, t3 for cos θ3, sin θ3, tan θ3, and consider the matrix

C3 = [ 1   0    0  ]
     [ 0   c3   s3 ]
     [ 0  −s3   c3 ]
with the angle of rotation θ3 such that, in the product matrix R0 = [rjk] = C3W0 = C3C2B0, the entry r32 is
zero. This entry is the inner product of Row 3 of C3 times Column 2 of W0; hence t3 = w32⁽⁰⁾/w22⁽⁰⁾ and

(II/1)   c3 = 1/√(1 + t3²) = 0.9415130836,   s3 = t3/√(1 + t3²) = 0.3369764287.

R0 = C3W0 = C3C2B0 = [ 7.874007873   −9.369450382   −1.690727888 ]
                     [ 0             10.95716904     4.282861708  ]
                     [ 0              0              2.225394561  ].
(Again, instead of 0, you might obtain 10⁻¹⁰ or another very small term; similarly in the further calculations.)
Finally, we multiply R0 from the right by C2ᵀC3ᵀ. This gives

B1 = R0C2ᵀC3ᵀ = C3C2B0C2ᵀC3ᵀ = [ 11.29032258    −5.017347637   0            ]
                               [ −5.017347637   10.61443933    0.7499055128 ]
                               [  0              0.7499055119  2.095238095  ].
The given matrix B0 (and, thus, also the matrix B1) has the eigenvalues 16, 6, 2. We see that the main
diagonal entries of B1 are approximations that are not very accurate, a fact that we could have concluded
from the relatively large size of the off-diagonal entries of B 1. In practice, one would perform further steps
of the iteration until all off-diagonal entries have decreased in absolute value to less than a given bound.
The answer, on p. A51 in App. 2, gives the results of two more steps, which are obtained by the following
calculations.
Step 2. The calculations are the same as before, with B0 = [bjk⁽⁰⁾] replaced by B1 = [bjk⁽¹⁾]. Hence, instead
of (I/1), we now have

(I/2)   c2 = 1/√(1 + (b21⁽¹⁾/b11⁽¹⁾)²) = 0.9138287756,
        s2 = (b21⁽¹⁾/b11⁽¹⁾)/√(1 + (b21⁽¹⁾/b11⁽¹⁾)²) = −0.4060997031.

We can now write the matrix C2, which has the same general form as before, and calculate the product

W1 = [wjk⁽¹⁾] = C2B1 = [ 12.35497   −8.89554       −0.30454      ]
                       [ 0           7.662236711    0.6852852366 ]
                       [ 0           0.7499055119   2.095238095  ].
Now calculate the entries of C3 from (II/1), with t3 = w32⁽⁰⁾/w22⁽⁰⁾ replaced by t3 = w32⁽¹⁾/w22⁽¹⁾; this gives
c3 = 0.995245, s3 = 0.097405. Then

R1 = C3W1 = [ 12.35497   −8.89554       −0.30454      ]
            [ 0           7.698845998    0.8861131001 ]
            [ 0           0              2.018524735  ]

and

B2 = R1C2ᵀC3ᵀ = [ 14.90278952    −3.126499072   0            ]
                [ −3.126499074    7.088284172   0.1966142499 ]
                [  0              0.1966142491  2.008926316  ].
The approximations of the eigenvalues have improved. The off-diagonal entries are smaller than in B1.
Nevertheless, in practice, the accuracy would still not be sufficient, so that one would do several more
steps. We do one more step, whose result is also given on p. A51 in App. 2 of the textbook.

Step 3. The calculations are the same as in Step 2, with B1 = [bjk⁽¹⁾] replaced by B2 = [bjk⁽²⁾]. Hence we
compute

(I/3)   c2 = 1/√(1 + (b21⁽²⁾/b11⁽²⁾)²) = 0.9786942487,
        s2 = (b21⁽²⁾/b11⁽²⁾)/√(1 + (b21⁽²⁾/b11⁽²⁾)²) = −0.205323,

and then

(II/3)   c3 = 1/√(1 + t3²) = 0.9995126436,
         s3 = t3/√(1 + t3²) = 0.03121658809.
R2 = C3W2 = C3C2B2 = [ 15.22721682   −4.515275007   −0.04036944359 ]
                     [ 0              6.298390090    0.2550432812  ]
                     [ 0              0              2.001940393   ]

and, finally,

B3 = R2C2ᵀC3ᵀ = C3C2B2C2ᵀC3ᵀ = [ 15.82987970    −1.293204857    0             ]
                               [ −1.293204856    6.169155576    0.06249374942 ]
                               [  0              0.06249374864  2.000964734   ].
Further steps would show convergence to 16, 6, 2, with roundoff errors in the last digits. Rounding effects
are also shown in small deviations of B2 and B3 from symmetry. Note that, for simplicity in displaying the
process, some very small numbers were set equal to zero.
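Each QR step Bs+1 = RsQs can be delegated to a library factorization. A sketch starting from B0 (off-diagonal signs as reconstructed here; the diagonals of the iterates do not depend on the QR sign convention):

```python
import numpy as np

B = np.array([[7.0, -3.605551275, 0.0],
              [-3.605551275, 13.46153846, 3.692307692],
              [0.0, 3.692307692, 3.538461538]])

for step in range(3):
    Q, R = np.linalg.qr(B)   # factor B = QR
    B = R @ Q                # next iterate, similar to B

print(np.round(np.diag(B), 6))   # approaches the eigenvalues 16, 6, 2
```

After the three steps the diagonal matches B3 above; more steps drive the off-diagonal entries toward 0.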
PART F
Optimization,
Graphs
The purpose of Part F is to introduce the main ideas and methods of unconstrained and constrained
optimization (Chap. 22) and graphs and combinatorial optimization (Chap. 23). These topics of discrete
mathematics are particularly well suited for modeling large-scale real-world problems and have many
applications as described on p. 949 of the textbook.
The objective function f depends on variables x1, …, xn whose values we can choose, that is, control. Hence
these variables are called control variables. This idea of "control" can be immediately understood if we
think of an application such as the yield of a chemical process that depends on pressure x1 and
temperature x2.
In most optimization problems, the control variables are restricted, that is, they are subject to some
constraints, as shall be illustrated in Secs. 22.2–22.4.
However, certain types of optimization problems have no restrictions and thus fall into the category of
unconstrained optimization. The theoretical details of such problems are explained on the bottom third
of p. 951 and continued on p. 952. Within unconstrained optimization, the textbook selects a particular
way of solving such problems, namely the method of steepest descent or gradient method. It is illustrated
in Example 1, pp. 952–953, and in great detail in Prob. 3.
3. Method of steepest descent. The function to be minimized is

(A)   f(x) = 2x1² + x2² − 4x1 + 4x2,

with the starting value (expressed as a column vector) x0 = [0 0]ᵀ. We proceed as in Example 1,
p. 952, beginning with the general formulas and using the starting value later. To simplify notation,
let us denote the components of the gradient of f by f1 and f2. Then the gradient of f is [see also (1),
p. 396]

(B)   ∇f(x) = [f1  f2]ᵀ = [4x1 − 4   2x2 + 4]ᵀ.

Furthermore,

z(t) = [z1  z2]ᵀ = x − t∇f(x),

which, in terms of components, is

(C)   z1(t) = x1 − tf1,   z2(t) = x2 − tf2.

Now obtain g(t) = f(z(t)) from f(x) in (A) by replacing x1 with z1 and x2 with z2. This gives

g(t) = 2(x1 − tf1)² + (x2 − tf2)² − 4(x1 − tf1) + 4(x2 − tf2).

From (C) we see that z1′ = −f1 and z2′ = −f2 with respect to t. We substitute this and z1 and z2 from
(C), differentiate, and set g′(t) = 0. The coefficient of t in g′(t) is

D = 4f1² + 2f2².

We denote the sum of the other terms in g′(t) by N (suggesting "numerator"), that is,

N = −4f1x1 − 2f2x2 + 4f1 − 4f2,

so that g′(t) = Dt + N = 0 and

t = −N/D.

Next we start the iteration process.

Step 1. At x0 = [0 0]ᵀ,

f1 = 4·0 − 4 = −4,   f2 = 2·0 + 4 = 4,
N = −4·(−4)·0 − 2·4·0 + 4·(−4) − 4·4 = −16 − 16 = −32,
D = 4·(−4)² + 2·4² = 96,

so that

t = t0 = −N/D = 32/96 = 1/3 = 0.3333333.
From this and (B) and (C) we obtain the next approximation x1 of the desired solution in the form

x1 = z(t0) = [0 − (1/3)·(−4),  0 − (1/3)·4]ᵀ = [4/3, −4/3]ᵀ.

Also, from (A),

f(x1) = 2(4/3)² + (−4/3)² − 4(4/3) + 4(−4/3) = (32 + 16 − 48 − 48)/9 = −48/9 = −5.333333.

This completes the first step.
Step 2. We now have x1 = 4/3, x2 = −4/3. Then from (B) we get

f1 = 4·(4/3) − 4 = 16/3 − 12/3 = 4/3,
f2 = 2·(−4/3) + 4 = −8/3 + 12/3 = 4/3,

D = 4(4/3)² + 2(4/3)² = 64/9 + 32/9 = 96/9,
N = −4·(4/3)·(4/3) − 2·(4/3)·(−4/3) + 4·(4/3) − 4·(4/3) = −64/9 + 32/9 + 0 = −32/9,

so that

t = t1 = −N/D = (32/9)/(96/9) = 1/3 = 0.3333333.
From this and (B) and (C) we obtain the next approximation x2 of the desired solution in the form

x2 = z(t1) = [x1 - t1 f1   x2 - t1 f2]^T
   = [4/3 - (1/3)(4/3)   -4/3 - (1/3)(4/3)]^T = [8/9  -16/9]^T = [0.8888889  -1.777778]^T.

Also from (A) we find that f(x2) is
Chap. 22 Unconstrained Optimization. Linear Programming 5
f(x2) = 2(8/9)^2 + (-16/9)^2 - 4(8/9) + 4(-16/9) = -480/81 = -5.925926.
We now use the components x1 = 8/9 and x2 = -16/9 of x2. Then from (B) we get

f1 = 4 · (8/9) - 4 = 32/9 - 36/9 = -4/9,

f2 = 2 · (-16/9) + 4 = -32/9 + 36/9 = 4/9,

D t = [4 (-4/9)^2 + 2 (4/9)^2] t = (64/81 + 32/81) t = (96/81) t,

N = -4 · (-4/9) · (8/9) - 2 · (4/9) · (-16/9) + 4 · (-4/9) - 4 · (4/9)
  = (128 + 128 - 144 - 144)/81 = -32/81,

so that

t = t2 = -N/D = (32/81)/(96/81) = 1/3 = 0.3333333.
From this and (B) and (C) we obtain the next approximation x3 of the desired solution in the form

x3 = z(t2) = [x1 - t2 f1   x2 - t2 f2]^T
   = [8/9 - (1/3)(-4/9)   -16/9 - (1/3)(4/9)]^T = [(24 + 4)/27   (-48 - 4)/27]^T
   = [28/27  -52/27]^T = [1.037037  -1.925926]^T.

Also from (A),

f(x3) = 2(28/27)^2 + (-52/27)^2 - 4(28/27) + 4(-52/27)
      = (2 · 28^2 + 52^2 - 4 · 27 · 28 - 4 · 27 · 52)/27^2
      = (1568 + 2704 - 3024 - 5616)/729 = -4368/729 = -5.991770.
The results for the first seven steps, with six-significant-digit accuracy, are as follows.

Discussion. Table I gives a more accurate answer in more steps than is required by the problem. Table II
gives the same answer (this time as fractions), thereby ensuring total accuracy. With the help of your
computer algebra system (CAS) or calculator, you can readily convert the fractions of Table II to the
desired number of decimals of your final answer and check your result. Thus any variation in your answer
from the given answer, due to rounding errors or the technology used, can be

Sec. 22.1. Prob. 3. Table I. Method of steepest descent. Seven steps with 6S accuracy and one guarding digit
n       x1            x2             f
0    0.000000      0.000000      0.000000
1    1.333333     -1.333333     -5.333333
2    0.8888889    -1.777778     -5.925925
3    1.0370370    -1.925926     -5.991770
4    0.9876543    -1.975309     -5.999056
5    1.004115     -1.991769     -5.999894
6    0.9986283    -1.997256     -5.999998
7    1.000457     -1.999086     -5.999999
Sec. 22.1. Prob. 3. Table II. Method of steepest descent.
Seven steps expressed as fractions to ensure complete accuracy
n       x1            x2                f
0       0             0                 0
1       4/3          -4/3              -16/3
2       8/9          -16/9             -160/27
3       28/27        -52/27            -1456/243
4       80/81        -160/81           -13120/2187
5       244/243      -484/243          -118096/19683
6       728/729      -1456/729         -1062880/177147
7       2188/2187    -4372/2187        -9565936/1594323
checked with Tables I and II. Furthermore, the last column in each table shows that the values of f
converge toward a minimum value of -6. We can readily see this and other information from the given
function (A) by completing the square, as follows. Recall that, for a quadratic expression

a x^2 + b x + c,

completing the square amounts to writing it in the form

a(x - d)^2 + e,   where   d = -b/(2a)   and   e = c - b^2/(4a).
We apply this method twice to our given function f, that is, first to the x1-terms 2x1^2 - 4x1, and then
to the x2-terms x2^2 + 4x2. For the x1-terms we note that a = 2, b = -4, c = 0, so that

d = -b/(2a) = 4/4 = 1   and   e = c - b^2/(4a) = 0 - 16/8 = -2.

This gives us

(F)   2x1^2 - 4x1 = 2(x1 - 1)^2 - 2.

Similarly, for the x2-terms we have a = 1, b = 4, c = 0, hence d = -2, e = -4, and

(G)   x2^2 + 4x2 = (x2 + 2)^2 - 4.

Adding (F) and (G) together, we see that, by completing the square, f can be written as

(H)   f(x) = 2(x1 - 1)^2 + (x2 + 2)^2 - 6.

Equation (H) explains the numeric results. It shows that the minimum f(x) = -6 occurs at x1 = 1 and
x2 = -2, which is in reasonably good agreement with the corresponding entries for n = 7 in the tables.
Furthermore, we see, geometrically, that the level curves f = const are ellipses with principal axes in
the directions of the coordinate axes (the function has no x1x2-term) and with semiaxes in the ratio 1 : √2.
Remark. Your answer requires only three steps. We give seven steps for a better illustration of the
method. Also note that in our calculation we used fractions, thereby maintaining higher accuracy,
and converted these fractions into decimals only when needed.
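The whole iteration above can be condensed into a few lines of code. This is a minimal sketch using the function (A) and the exact step size t = -N/D derived above; the function names are our own.

```python
# Steepest descent with exact line search for
#   f(x1, x2) = 2*x1**2 + x2**2 - 4*x1 + 4*x2      [the function (A)].
def gradient(x1, x2):
    return 4*x1 - 4, 2*x2 + 4

def steepest_descent(x1, x2, steps):
    for _ in range(steps):
        f1, f2 = gradient(x1, x2)
        D = 4*f1**2 + 2*f2**2                   # coefficient of t in g'(t)
        N = -4*f1*x1 - 2*f2*x2 + 4*f1 - 4*f2    # remaining terms of g'(t)
        t = -N / D                              # exact minimizer of g(t)
        x1, x2 = x1 - t*f1, x2 - t*f2           # next approximation z(t)
    return x1, x2

print(steepest_descent(0.0, 0.0, 7))   # approx (1.000457, -1.999086), cf. Table I
```

Running one step reproduces x1 = (4/3, -4/3), and seven steps reproduce the last row of Table I.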
The remaining sections of this chapter deal with constrained optimization, which differs from
unconstrained optimization in that, in addition to the objective function, there are also some constraints.
We consider only problems that have a linear objective function and whose constraints are linear.
Solving such problems is called linear programming (or linear optimization, p. 954). A typical example
is as follows.
Consider a linear objective function, such as the revenue z in Example 1, p. 954, with the usual
additional constraints x1 ≥ 0, x2 ≥ 0 on the variables; the goal there is to find x = (x1, x2) that
maximizes the revenue z in the objective function.
The inequality (1) can be converted into an equality by introducing a variable x3 (where x3 ≥ 0), thus
obtaining
The variable x3 has taken up the slack or difference between the two sides of the inequality. Thus x 3 is
called a slack variable (see p. 956). We also introduce a slack variable x 4 for equation .2/ as shown in
Example 2, p. 956. This leads to the normal form of a linear optimization problem. This is an important
concept because any problem has to be first converted to a normal form before a systematic method of
solution (as shown in the next section) can be applied.
Problems 3, 21, and Fig. 474 of Example 1 on p. 955 explore the geometric aspects of linear
programming problems.
3. Region, constraints. Perhaps the easiest way to do this problem is to denote x1 by x and x2 by y. Then
our axes are labeled in a more familiar way and we can rewrite the problem as

(A')   -0.5x + y ≤ 2,
(B')   x + y ≥ 2,
(C')   -x + 5y ≥ 5.

Consider inequality (A'). It is equivalent to

(A'')   y ≤ 0.5x + 2.

Graphing the line

y = 0.5x + 2,

we get line 1 in Fig. A. Since (A'') is an inequality of the kind ≤, the region determined by (A''), and
hence by (A'), lies below line 1.
Similarly, (B') implies (B'')   y ≥ -x + 2. We consider y = -x + 2 and get line 2 in Fig. A. Since (B'')
is an inequality of the kind ≥, the region determined by (B') lies above line 2.
Also (C') implies (C'')   y ≥ 0.2x + 1, which, as an equality, gives line 3 in Fig. A. Since we have ≥,
the corresponding region lies above line 3, as shaded in Fig. A.
Taking (A''), (B''), (C'') together gives the intersection of all three regions. This is precisely the
region below line 1, above line 2, and above line 3. Its boundary runs along line 1 down to (0, 2), along
line 2 from (0, 2) to (5/6, 7/6), and from (5/6, 7/6) outward along line 3 (which passes through (1, 1.2)).
Together we have the infinite region with boundaries as marked in Fig. B, with the notation x1 (for x)
and x2 (for y). Note that the region lies entirely in the first quadrant of the x1x2-plane, so that the
conditions x1 ≥ 0, x2 ≥ 0 (often imposed by the kind of application, for instance, number of items
produced, time or quantity of raw material needed, etc.) are automatically satisfied.
[Figures A and B not reproduced: Fig. A shows the three lines y = 0.5x + 2 (line 1), y = -x + 2 (line 2),
and y = 0.2x + 1 (line 3) in the xy-plane with the corresponding half-planes shaded; Fig. B shows the
resulting region, with the corner points (0, 2) and (5/6, 7/6) and the point (2, 0) on line 2 marked.]

Sec. 22.2 Prob. 3. Fig. B. Final solution: region determined by the three inequalities given in
the problem statement
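A quick membership test catches sign errors in such regions. The inequality directions below are our reading of the problem statement, consistent with the corner points (0, 2) and (5/6, 7/6) named above:

```python
# Membership test for the region of Sec. 22.2, Prob. 3
# (inequality directions are our reading of the problem statement).
def feasible(x, y):
    return (-0.5*x + y <= 2) and (x + y >= 2) and (-x + 5*y >= 5)

assert feasible(0, 2)          # corner where lines 1 and 2 meet
assert feasible(5/6, 7/6)      # corner where lines 2 and 3 meet
assert feasible(1, 1.5)        # an interior point
assert not feasible(0, 0)      # the origin violates (B') and (C')
print("region checks pass")
```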
z = c = const,

beginning with its position for c = 0 (which is shown in Fig. 474, p. 955), and increase c continuously.
21. Maximum profit. The profit per lamp L1 is $150 and that per lamp L2 is $100. Hence the total profit
for producing x1 lamps L1 and x2 lamps L2 is

f(x1, x2) = 150x1 + 100x2.

We want to determine x1 and x2 such that the profit f(x1, x2) is as large as possible.
Limitations arise due to the available workforce. For the sake of simplicity the problem talks
about two workers W1 and W2, but it is clear how the corresponding constraints could be made into
a larger problem if teams of workers were involved or if additional constraints arose from raw
material. The assumption is that, for this kind of high-quality work, W1 is available 100 hours per
month and that he or she assembles three lamps L1 per hour or two lamps L2 per hour. Hence W1
needs 1/3 hour for assembling a lamp L1 and 1/2 hour for assembling a lamp L2. For a production of x1
lamps L1 and x2 lamps L2, this gives the restriction (constraint)

(A)   (1/3)x1 + (1/2)x2 ≤ 100.

The 80 available hours of W2 give, in the same way, the constraint

(B)   (1/3)x1 + (1/6)x2 ≤ 80.

(B) with the equality sign gives a straight line that intersects the x1-axis at 240 (put x2 = 0) and the
x2-axis at 480 (put x1 = 0); see Fig. C. If we put x1 = 0 and x2 = 0, the inequality (B) becomes
0 + 0 ≤ 80, which is true. Hence the region to be determined extends from that line downward.
Similarly for (A), whose boundary line intersects the x1-axis at 300 and the x2-axis at 200. And the
region must lie in the first quadrant because we must have x1 ≥ 0 and x2 ≥ 0.
The intersection of those two lines is at (210, 60). This gives the maximum profit

f(210, 60) = 150 · 210 + 100 · 60 = 37,500.

Next we reason graphically that (210, 60) does give the maximum profit. The straight line on which the
profit equals 37,500 is

x2 = 375 - 1.5x1.

Writing the line f = const in the form

x2 = c - 1.5x1   (c = const/100),

which corresponds to moving the line up and down as c varies, it becomes clear that (210, 60) does give
the maximum profit. We indicate the solution by a small circle in Fig. C.
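Since a linear objective attains its maximum at a corner of the feasible region, the reasoning above can be checked by evaluating the profit at the corner points. The corner list is our reading of the constraint lines:

```python
# Corner points of the feasible region of Prob. 21 (our reading of the
# constraints (A) and (B)): the optimum is the best corner.
def profit(x1, x2):
    return 150*x1 + 100*x2

corners = [(0, 0), (240, 0), (210, 60), (0, 200)]
best = max(corners, key=lambda p: profit(*p))
print(best, profit(*best))   # (210, 60) 37500
```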
This section forms the heart of Chap. 22 and explains the very important simplex method, which can
briefly be described as follows. The given optimization problem has to be expressed in normal form (1),
(2), p. 958, a concept explained in Sec. 22.2. Our discussion follows the example in the textbook which
first appeared as Example 1, p. 954, and continued as Example 2, p. 956, both in Sec. 22.2. Now here, in
Sec. 22.3, one constructs an augmented matrix as in (4), p. 959. Here z is the variable to be maximized, x 1,
x2 are the nonbasic variables, x3, x4 the basic variables, and b comes from the right-hand sides of the
equalities of the equations of the constraints of the normal form. Basic variables are the slack variables
and are characterized by the fact that their columns have only one nonzero entry (see p. 960).
Sec. 22.2 Prob. 21. Fig. C. Constraints (A) (lower line) and (B) [figure not reproduced: the two
constraint lines in the first quadrant of the x1x2-plane, with the optimal point (210, 60) marked by a
small circle]
From the initial simplex table, we select the column of the pivot by finding the first negative entry in
Row 1. Then we want to find the row of the pivot, which we obtain by dividing the right-hand sides by the
corresponding entries of the column just selected and take the smallest quotient. This will give us the
desired pivot row. Finally use this pivot row to eliminate entries above and below the pivot, just like in
the Gauss–Jordan method. This will lead to the second simplex table (5), p. 960. Repeat these steps until
there are no more negative entries in Row 1. Then set the nonbasic variables to zero and read off the
solution (p. 961).
Go over the details of this example with paper and pencil so that you get a firm grasp of this important
method. The advantage of this method over a geometric approach is that it allows us to solve large
problems in a systematic fashion.
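The three operations O1–O3 can be sketched compactly in code. This is a minimal tableau version for small, bounded, well-behaved problems (no anti-cycling rule); the constraint data in the usage line is our reading of Prob. 3 below, whose maximum 480/11 it reproduces:

```python
# Minimal simplex sketch: maximize c·x subject to A x <= b, x >= 0,
# following the rules described above (pivot column = first negative
# entry in Row 1, pivot row = smallest positive quotient).
from fractions import Fraction

def simplex_max(c, A, b):
    m, n = len(A), len(c)
    # Row 1 encodes z - c·x = 0; then one row per constraint with slacks.
    T = [[Fraction(1)] + [-Fraction(cj) for cj in c] + [Fraction(0)]*m + [Fraction(0)]]
    for i in range(m):
        slack = [Fraction(1) if j == i else Fraction(0) for j in range(m)]
        T.append([Fraction(0)] + [Fraction(a) for a in A[i]] + slack + [Fraction(b[i])])
    while True:
        # O1: column of the first negative entry in Row 1.
        piv_col = next((j for j in range(1, 1 + n + m) if T[0][j] < 0), None)
        if piv_col is None:
            return T[0][-1]        # no negative entries: maximum reached
        # O2: row with the smallest positive quotient b_i / a_i (assumes boundedness).
        rows = [i for i in range(1, m + 1) if T[i][piv_col] > 0]
        piv_row = min(rows, key=lambda i: T[i][-1] / T[i][piv_col])
        # O3: eliminate entries above and below the pivot (Gauss-Jordan style).
        p = T[piv_row][piv_col]
        for i in range(m + 1):
            if i != piv_row and T[i][piv_col] != 0:
                f = T[i][piv_col] / p
                T[i] = [a - f*ap for a, ap in zip(T[i], T[piv_row])]

zmax = simplex_max([3, 2], [[3, 4], [4, 3], [10, 2]], [60, 60, 120])
print(zmax)   # Fraction(480, 11), about 43.64
```

Using exact fractions mirrors the manual computation and avoids rounding in the quotients.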
Further detailed illustrations of the simplex method are given in Prob. 3 (maximization) and Prob. 7
(minimization).
3. Maximization by the simplex method. Begin by writing the problem in normal form; see (1) and (2),
p. 958. The inequalities are converted to equations by introducing slack variables, one slack variable
per inequality. In (A) and (B) we have the variables x1 and x2. Hence we denote the slack variables by
x3 [for the first inequality in (B)], x4 [for the second inequality in (B)], and x5 (for the third). This
gives the normal form (with the objective function written as an equation)

z - 3x1 - 2x2 = 0,
3x1 + 4x2 + x3 = 60,
4x1 + 3x2 + x4 = 60,
10x1 + 2x2 + x5 = 120.

This is a linear system of equations. The corresponding augmented matrix (a concept you should know;
see Sec. 7.3, p. 273) is called the initial simplex table and is denoted by T0. It is

        z    x1   x2   x3   x4   x5     b
      [ 1   -3   -2    0    0    0      0 ]
(D)   [ 0    3    4    1    0    0     60 ]
      [ 0    4    3    0    1    0     60 ]
      [ 0   10    2    0    0    1    120 ]

Take a look at (3) on p. 963, which has an extra line on top showing z, the variables, and b [denoting
the terms on the right side in (C)]. We added such a line in (D) as well; in the printed solution, dashed
lines also separate the first row of T0 from the others, as well as the columns corresponding to z, to the
given variables, to the slack variables, and to the right sides.
Perform Operation O1. The first column with a negative entry in Row 1 is Column 2, the entry being
-3. This is the column of the first pivot. Perform Operation O2. We divide the right sides by the
corresponding entries of the column just selected. This gives

60/3 = 20,   60/4 = 15,   120/10 = 12.

The smallest positive of these quotients is 12. It corresponds to Row 4. Hence select Row 4 as the row of
the pivot. Perform Operation O3, that is, create zeros in Column 2 by the row operations

Row 1 + (3/10) Row 4,
Row 2 - (3/10) Row 4,
Row 3 - (4/10) Row 4.

This gives the simplex table

        z    x1    x2     x3   x4    x5       b
      [ 1    0   -7/5     0    0    3/10     36 ]     Row 1 + (3/10) Row 4
T1 =  [ 0    0   17/5     1    0   -3/10     24 ]     Row 2 - (3/10) Row 4
      [ 0    0   11/5     0    1   -2/5      12 ]     Row 3 - (4/10) Row 4
      [ 0   10    2       0    0    1       120 ]
This was the first step. (Note that the extra line on top of the augmented matrix showing z, the variables,
and b, as well as the dashed lines, is optional but is put in for better understanding.) Now comes the
second step, which is necessary because of the negative entry -7/5 in Row 1 of T1. Hence the column of
the pivot is Column 3 of T1. We compute

24/(17/5) = 120/17 = 7.06,   12/(11/5) = 60/11 = 5.45,   120/2 = 60
and compare. The second of these is the smallest. Hence the pivot row is Row 3. To create zeros in
Column 3 we have to do the row operations

Row 1 + (7/11) Row 3,
Row 2 - (17/11) Row 3,
Row 4 - (10/11) Row 3.

The result is

        z    x1    x2     x3     x4       x5        b
      [ 1    0    0       0     7/11     1/22    480/11 ]     Row 1 + (7/11) Row 3
T2 =  [ 0    0    0       1   -17/11     7/22     60/11 ]     Row 2 - (17/11) Row 3
      [ 0    0   11/5     0     1       -2/5      12    ]
      [ 0   10    0       0   -10/11    15/11   1200/11 ]     Row 4 - (10/11) Row 3
Since no more negative entries appear in Row 1, we are finished. From Row 1 we see that

fmax = 480/11 = 43.64.

In Row 4 we divide the entry in Column 7 by the entry in Column 2 and obtain the corresponding
x1-value

x1 = (1200/11)/10 = 120/11 = 10.91.

Similarly, in Row 3 we divide the entry in Column 7 by the entry in Column 3 and obtain the
corresponding x2-value

x2 = 12/(11/5) = 60/11 = 5.45.
You may want to convince yourself that the maximum is taken at one of the vertices of the polygon
determined by the constraints. This vertex is marked by a small circle in Fig. D.

[Fig. D not reproduced: the feasible polygon in the first quadrant of the x1x2-plane, with the optimal
vertex (120/11, 60/11) marked by a small circle.]
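The optimum can be verified with exact arithmetic: it lies on the boundaries of the second and third constraints (their slack variables vanish) and strictly inside the first. The constraint data is our reading of the problem statement:

```python
# Exact check of the simplex result of Prob. 3 using stdlib fractions
# (constraint data is our reading of the problem statement).
from fractions import Fraction as F

x1, x2 = F(120, 11), F(60, 11)
assert 3*x1 + 4*x2 <= 60       # first constraint, slack x3 > 0
assert 4*x1 + 3*x2 == 60       # second constraint active (x4 = 0)
assert 10*x1 + 2*x2 == 120     # third constraint active (x5 = 0)
print(3*x1 + 2*x2)             # Fraction(480, 11) = fmax
```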
7. Minimization by the simplex method. The given problem, in normal form [with z = f(x1, x2) written as
an equation], is

z - 5x1 + 20x2 = 0,
-2x1 + 10x2 + x3 = 5,
2x1 + 5x2 + x4 = 10.

Since we minimize (instead of maximizing), we consider the columns whose first entry is positive
(instead of negative). There is only one such column, namely, Column 3. The quotients are

5/10 = 1/2   and   10/5 = 2.

The smaller of these is 1/2. Hence we have to choose Row 2 as pivot row and 10 as the pivot. We
create zeros by the row operations Row 1 - 2 Row 2 (this gives the new Row 1) and Row 3 - (1/2)
Row 2 (this gives the new Row 3), leaving Row 2 unchanged. The result is

        z    x1    x2    x3     x4      b
      [ 1   -1     0    -2      0     -10  ]
      [ 0   -2    10     1      0       5  ]
      [ 0    3     0    -1/2    1     15/2 ]

From Row 1 we thus have

fmin = -10.

From Row 2, with Columns 3 and 6, we see that

x2 = 5/10 = 1/2.

The slack variable x4 appears in the second constraint, written as an equation, that is,

2x1 + 5x2 + x4 = 10.

Hence, with the nonbasic variable x1 = 0,

x4 = 10 - 2 · 0 - 5 · (1/2) = 15/2.

Since this problem involves only two variables (not counting the slack variables), as a control and to
better understand the problem, you may want to graph the constraints. You will notice that they
determine a quadrangle. When you calculate the values of f at the four vertices of the quadrangle,
you should obtain

f(0, 0) = 0,   f(5, 0) = 25,   f(5/2, 1) = -15/2,   f(0, 1/2) = -10,

confirming that fmin = -10.
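The suggested control can be done in a few lines; the vertex list is our reading of the quadrangle determined by the two constraints and the coordinate axes:

```python
# Evaluating f = 5*x1 - 20*x2 at the four vertices of the feasible
# quadrangle of Prob. 7 (vertex list is our reading of the constraints).
from fractions import Fraction as F

f = lambda x1, x2: 5*x1 - 20*x2
vertices = [(F(0), F(0)), (F(5), F(0)), (F(5, 2), F(1)), (F(0), F(1, 2))]
values = {v: f(*v) for v in vertices}
print(min(values.values()))   # Fraction(-10, 1), attained at (0, 1/2)
```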
The next problem is to maximize f = 7x1 + 14x2 subject to

0 ≤ x1 ≤ 6,   0 ≤ x2 ≤ 3,   7x1 + 14x2 ≤ 84.

In normal form (with slack variables x3, x4, x5) this becomes

z - 7x1 - 14x2 = 0,
x1 + x3 = 6,
x2 + x4 = 3,
7x1 + 14x2 + x5 = 84.
The first pivot must be in Column 2 because of the entry -7 in this column. We determine the row of
the first pivot by calculating

6/1 = 6   (from Row 2),   84/7 = 12   (from Row 4);

Row 3 has the entry 0 in Column 2 and contributes no quotient. Since 6 is smallest, Row 2 is the pivot
row. With this the next simplex table becomes

        z   x1   x2    x3   x4   x5    b
      [ 1    0  -14     7    0    0   42 ]     Row 1 + 7 Row 2
T1 =  [ 0    1    0     1    0    0    6 ]
      [ 0    0    1     0    1    0    3 ]     Row 3
      [ 0    0   14    -7    0    1   42 ]     Row 4 - 7 Row 2
From Row 2 we see that x1 = 6, and x2 + x4 = x2 + 3 = 3, hence x2 = 0.
(More simply: x1, x4, x5 are basic; x2, x3 are nonbasic. Equating the latter to zero gives x2 = 0,
x3 = 0.) Thus z = 42 at the point (6, 0) on the x1-axis.
Column 3 of T1 contains the negative entry -14. Hence this column is the column of the next pivot.
To obtain the row of the pivot, we calculate

3/1 = 3   (from Row 3)   and   42/14 = 3   (from Row 4).

Since both ratios give 3, we have a choice of using Row 3 or Row 4 as the pivot row. We pick Row 3.
We obtain

        z   x1   x2   x3    x4    x5    b
      [ 1    0    0    7    14     0   84 ]     Row 1 + 14 Row 3
T2 =  [ 0    1    0    1     0     0    6 ]
      [ 0    0    1    0     1     0    3 ]
      [ 0    0    0   -7   -14     1    0 ]     Row 4 - 14 Row 3
There are no more negative entries in Row 1. Hence we have reached the maximum, zmax = 84. We see
that x1, x2, x5 are basic, and x3, x4 are nonbasic variables. zmax occurs at (6, 3) because x1 = 6 (from
Row 2 and Column 2) and x2 = 3 (from Row 3 and Column 3). The point (6, 3) corresponds to a degenerate
solution because x5 = 0/1 = 0 (from Row 4 and Column 6), in addition to x3 = 0 and x4 = 0. Geometrically,
the boundary line of the third constraint, written as the equation

7x1 + 14x2 + x5 = 84,

also passes through (x1, x2) = (6, 3), with x5 = 0, because

7 · 6 + 14 · 3 + 0 = 84.
Observation. In Example 1, p. 962, we reached a degenerate solution before we reached the maximum
(the optimal solution), and, for this reason, we had to do an additional step, that is, Step 2, on p. 964. In
contrast, in the present problem we reached the maximum when we reached a degenerate solution.
Hence no additional work was necessary.
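The degeneracy is easy to see numerically: at (6, 3) all three slack variables vanish, so three constraint boundaries pass through one vertex. A small check (constraint data as read above):

```python
# At (6, 3) all three slacks vanish: the solution is degenerate because
# three constraint boundaries meet in this single vertex.
x1, x2 = 6, 3
x3 = 6 - x1                  # slack of x1 <= 6
x4 = 3 - x2                  # slack of x2 <= 3
x5 = 84 - 7*x1 - 14*x2       # slack of 7*x1 + 14*x2 <= 84
print(x3, x4, x5, 7*x1 + 14*x2)   # 0 0 0 84
```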
Chap. 23 Graphs. Combinatorial Optimization
The field of combinatorial optimization deals with problems that are discrete [in contrast to functions in
vector calculus (Chaps. 9 and 10) which are continuous and differentiable] and whose solutions are often
difficult to obtain due to an extremely large number of cases that underlie the solution. Indeed, the
“combinatorial nature” of the field gives us difficulties because, even for relatively small n,
n! = 1 · 2 · 3 ⋯ n (for n! read “n factorial,” see p. 1025 in Sec. 24.4 of the textbook) is very large.
For example, convince yourself that

10! = 1 · 2 · 3 · 4 · 5 · 6 · 7 · 8 · 9 · 10 = 3,628,800.
We look for optimal or suboptimal solutions to discrete problems, with a typical example being the
traveling salesman problem on p. 976 of the textbook (turn to that page and read the description). In that
problem, even for 10 cities, there are already

10!/2 = 3,628,800/2 = 1,814,400 possible routes.
Logistics dictates that the salesman needs some software tool for identifying an optimal or suboptimal
(but acceptable) route that he or she should take!
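The count above is quickly confirmed with the standard library:

```python
# The route count quoted above, computed directly.
import math

routes = math.factorial(10) // 2
print(routes)   # 1814400
```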
We start gently by discussing graphs and digraphs in Sec. 23.1, p. 970, as they are useful for modeling
combinatorial problems. A chapter orientation table summarizes the content of Chap. 23.
This section discusses important concepts that are used in this chapter. A graph G consists of points and
the lines that connect these points, as shown in Fig. 477, p. 971. We call the points vertices and the
connecting lines edges. This allows us to define the graph G as two finite sets, that is, G D .V;E/ where V is
a set of vertices and E a set of edges. Also, we do not allow isolated vertices, loops, and multiple edges, as
shown in Fig. 478, p. 971.
If, in addition, each of the edges has a direction, then graph G is called a directed graph or digraph (p.
972 and Fig. 479).
Another concept is degree of a vertex (p. 971), which measures how many edges are incident with that
vertex. For example, in Fig. 477, vertex 1 has degree 3 because there are three edges that are “involved
with” (i.e., end or start at) that vertex. These edges are denoted by e 1 D .1;4/ (connecting vertex 1 with
vertex 4), e2 D .1;2/ (vertex 1 with 2), and e5 D .1;3/ (vertex 1 with 3). Continuing with our example, e 1 D .
1;4/ indicates that vertex 1 is adjacent to vertex 4. Also vertex 1 is adjacent to vertex 2 and vertex 3,
respectively.
Whereas in a digraph we can only traverse in the direction of each edge, in a graph (being always
undirected), we can travel each edge in both directions.
While it is visually indispensable to draw graphs when discussing specific applications (routes of airlines,
networks of computers, organizational charts of companies, and others; see p. 971), when using
computers, it is preferable to represent graphs and digraphs by adjacency matrices (Examples 1, 2, p. 973,
Prob. 11) or incidence lists of vertices and edges (Example 3). Adjacency matrices contain only zeros
and ones; they indicate whether pairs of vertices are connected, by a 1 for “yes” and a 0 for “no.”
(Since loops are not allowed in a graph G, the entries on the main diagonal of these matrices are
always 0.)
11. Adjacency matrix. Digraph. The four vertices of the figure are denoted by 1, 2, 3, 4, and its four
edges by e1, e2, e3, e4. We observe that each edge has a direction, indicated by an arrowhead, which
means that the given figure is a digraph. Edge e1 goes from vertex 1 to vertex 2, edge e2 goes from
vertex 1 to vertex 3, and so on. There are two edges connecting vertices 1 and 3. They have opposite
directions (e2 goes from vertex 1 to vertex 3, and e3 from vertex 3 to vertex 1, respectively).
Note that, in a graph, there cannot be two edges connecting the same pair of vertices.

An adjacency matrix has entries 1 and 0 and indicates whether any two vertices in the graph are
connected by an edge. If “yes,” that is, the two vertices are connected, then the corresponding entry
is a 1, and if “no,” a 0. For n vertices, such an indexing scheme requires a square n × n matrix.
Our digraph has n = 4 vertices, so that A is a 4 × 4 matrix. Its entry a12 = 1 because the digraph has
an edge (namely, e1) that goes from vertex 1 to vertex 2. Now comes an important point worth
taking some time to think about: entry a12 is the entry in Row 1 and Column 2. Since e1 goes from 1
to 2, by definition, the row number is the number of the vertex at which an edge begins, and the
column number is the number of the vertex at which the edge ends. Think this over and look at the
matrix in Example 2 on p. 973. Since there are three edges that begin at 1 and end at 2, 3, 4, and
since there is no edge that begins at 1 and ends at 1 (no loop), the first row of A is
0 1 1 1.
Since the digraph has four edges, the matrix A must have four 1's: the three we have just listed and a
fourth resulting from the edge that goes from 3 to 1. Obviously, this gives the entry a31 = 1.
Continuing in this way we obtain the matrix

      [ 0  1  1  1 ]
A  =  [ 0  0  0  0 ]
      [ 1  0  0  0 ]
      [ 0  0  0  0 ],
which is the answer on p. A55 of the book. Note that the second and fourth rows of A contain only
zeros, since there are no directed edges that begin at vertex 2 or vertex 4, respectively. In other words,
there are no edges with initial point 2 or 4!
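The row/column convention (row = initial vertex, column = terminal vertex) translates directly into code:

```python
# Adjacency matrix of the digraph of Prob. 11 from its edge list
# (row = vertex where the edge begins, column = vertex where it ends).
edges = [(1, 2), (1, 3), (3, 1), (1, 4)]   # e1, e2, e3, e4
n = 4
A = [[0]*n for _ in range(n)]
for tail, head in edges:
    A[tail-1][head-1] = 1                  # 0-based indexing internally
print(A)   # [[0, 1, 1, 1], [0, 0, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0]]
```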
15. Deriving the graph for a given adjacency matrix. The given matrix, say M, of the wanted graph
GM is

      [ 0  1  0  0 ]
M  =  [ 1  0  0  0 ]
      [ 0  0  0  1 ]
      [ 0  0  1  0 ],
which is a 4 × 4 matrix, so the corresponding graph GM has four vertices. Since the matrix has four 1's
and each edge contributes two 1's, the graph GM has two edges. Since m12 = 1, the graph has the edge
(1, 2); here we have numbered the four vertices by 1, 2, 3, 4, and 1 and 2 are the endpoints of this
edge. Similarly, m34 = 1 implies that GM has the edge (3, 4) with endpoints 3 and 4. An adjacency
matrix of a graph is always symmetric. Hence we must have m21 = 1 because m12 = 1 and, similarly,
m43 = 1 since m34 = 1. Differently formulated, the vertices 1 and 2 are adjacent; they are connected
by an edge in GM, namely, by (1, 2). This results in m12 = 1 as well as m21 = 1.
Similarly for (3, 4). Together, this gives a graph that consists of two disjoint edges, as shown below.

[Sketch: four vertices 1, 2, 3, 4 with the two disjoint edges (1, 2) and (3, 4).]
19. Incidence matrix B̃ of a digraph. The incidence matrix of a graph or digraph is an n × m matrix,
where n is the number of vertices and m is the number of edges. Each row corresponds to one of the
vertices and each column to one of the edges. Hence, in the case of a graph, each column contains
two 1's. In the case of a digraph each column contains a 1 and a -1.

In this problem we look at the digraph from Prob. 11. Since, for that digraph, the number of
vertices = number of edges = 4, the incidence matrix is square (which is not the most general case)
and of dimension 4 × 4. The first column corresponds to edge e1, which goes from vertex 1 to vertex 2.
Hence, by definition, b̃11 = -1 and b̃21 = 1. The second column corresponds to edge e2, which
goes from vertex 1 to vertex 3. Hence b̃12 = -1 and b̃32 = 1. Proceeding in this way we get
       [ -1  -1   1  -1 ]
B̃  =  [  1   0   0   0 ]
       [  0   1  -1   0 ]
       [  0   0   0   1 ]
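The sign convention (-1 where an edge leaves a vertex, +1 where it enters) can be checked in a few lines on the same edge list:

```python
# Incidence matrix of the digraph of Prob. 11: column k gets -1 at the
# vertex edge e_k leaves and +1 at the vertex it enters.
edges = [(1, 2), (1, 3), (3, 1), (1, 4)]   # e1, e2, e3, e4
n, m = 4, len(edges)
B = [[0]*m for _ in range(n)]
for k, (tail, head) in enumerate(edges):
    B[tail-1][k] = -1
    B[head-1][k] = 1
print(B)
```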
We distinguish between walk, trail, path, and cycle as shown in Fig. 481, p. 976. A path requires that each
vertex is visited at most once. A cycle is a path that ends at the same vertex from which it started. We also
call such a path closed. Thus a cycle is a closed path.
A weighted graph G D .V;E/ is one in which each edge has a given weight or length that is positive. For
example, in a graph that shows the routes of an airline, the vertices represent the cities, an edge between
two cities shows that the airline flies directly between those two cities, and the weight of an edge
indicates the (flight) distance in miles between such two cities.
A shortest path is a path such that the sum of the length of its edges is minimum; see p. 976. A shortest
path problem means finding a shortest path in a weighted graph G. A Hamiltonian cycle (Prob. 11) is a
cycle that contains all the vertices of a graph. An example of a shortest path problem is the traveling
salesman problem, which requires the determination of a shortest Hamiltonian cycle. For more details on
this important problem in combinatorial optimization, see the last paragraph on p. 976 or our opening
discussion of this chapter.
Moore’s BFS algorithm, p. 977 (with a backtracking rule in Prob. 1), is a systematic way of determining
a shortest path in a connected graph whose edges all have length 1. The algorithm uses a breadth first
search (BFS); that is, at each step, the algorithm visits all neighboring (i.e., adjacent) vertices of a
vertex reached. This is in contrast to a depth first search (DFS), which makes a long trail, as in a maze.
Finally we discuss the complexity of an algorithm (see pp. 978–979) and the symbol O, suggesting
“order.” In this “big O” notation, an algorithm of complexity am^2 + bm + d, say, with constants
a, b, d, is of order O(m^2). This means that O records only the fastest growing term of the given
expression; indeed, am^2 >> bm + d for all sufficiently large m (the symbol “>>” means “much greater
than”), and constant factors such as a are dropped. A more formal definition of O, involving a constant
k, is given and used in Prob. 19. Note that Moore’s BFS algorithm is of complexity O(m).
1. Shortest path. Moore’s algorithm. We want to find the shortest path from s to t and its length, using
Moore’s algorithm (p. 977) and Example 1, p. 978. We numbered the vertices arbitrarily. This means
we picked a vertex and numbered it 1 and then numbered the other vertices consecutively 2 , 3 , :::.
We note that s ( 9 ) is a vertex that belongs to a hexagon ( 2 , 7 , 8 , 9 , 10 , 11 ). According to step 1 in
Moore’s algorithm, s gets a label 0. s has two adjacent vertices ( 8 and 10 ), which get the label 1.
Each of the latter has one adjacent vertex ( 7 and 11 , respectively), which gets the label 2. These two
vertices now labeled 2 are adjacent to the last still unlabeled vertex of the hexagon ( 2 ), which thus
gets the label 3. This leaves five vertices still unlabeled ( 1 , 3 , 4 , 5 ,
6 ). Two ( 1 , 3 ) of these five vertices are adjacent to the vertex ( 2 ) labeled 3 and thus get the label
4. Vertex 1 , labeled 4, is adjacent to the vertex t ( 6 ), which thus gets labeled 5, provided that there
is no shorter way for reaching t.
There is no shorter way. We could reach t ( 6 ) from the right, but the other vertex adjacent to t,
i.e., ( 5 ), gets the label 4 because the vertex ( 4 ) adjacent to it is labeled 3 since it is adjacent to a
vertex of the hexagon ( 7 ) labeled 2. This gives the label 5 for t ( 6 ), as before.
Hence, by Moore’s algorithm, the length of the shortest path from s to t is 5: The shortest path
goes through nodes 0, 1, 2, 3, 4, 5, as shown in the diagram on the next page in heavier lines.
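Moore's labeling can be replayed in code. The edge list below is our reconstruction of the Prob. 1 graph from the description above (the hexagon 2–7–8–9–10–11 plus the outside vertices 1, 3, 4, 5 and t = 6); since every edge has length 1, the BFS label of a vertex equals its shortest distance from s = 9.

```python
# Moore's BFS labeling on our reading of the Prob. 1 graph.
from collections import deque

edges = [(9, 8), (9, 10), (8, 7), (10, 11), (7, 2), (11, 2),
         (2, 1), (2, 3), (7, 4), (4, 5), (1, 6), (5, 6)]
adj = {}
for u, v in edges:
    adj.setdefault(u, []).append(v)
    adj.setdefault(v, []).append(u)

label = {9: 0}            # Step 1: s gets label 0
queue = deque([9])
while queue:
    u = queue.popleft()
    for v in adj[u]:
        if v not in label:            # first visit assigns the label
            label[v] = label[u] + 1
            queue.append(v)
print(label[6])   # 5, the length of a shortest path from s = 9 to t = 6
```

The intermediate labels agree with the prose: 8 and 10 get label 1; 7 and 11 get label 2; 2 and 4 get label 3; 1, 3, 5 get label 4; and t gets label 5.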
11. Hamiltonian cycle. For the definition of a Hamiltonian cycle, see our brief discussion before or turn
to p. 976 of the textbook. Sketch the following Hamiltonian cycle (of the graph of Prob. 1), which we
describe as follows. We start at s and go downward, taking the next three vertices on the hexagon H;
then the vertex outside H labeled 4 (vertex 3); then the vertex inside H; then t; then the vertex to the
right of t (vertex 5); and then the vertex below it (vertex 4). Then we return to H, taking the remaining
two vertices of H, and return to s.
13. Postman problem. This problem models the typical behavior of a letter carrier. Naively stated, the
postman starts at his post office, picks up his bags of mail, delivers the mail to all the houses, and
comes back to the post office from which he/she started. (We assume that every house gets mail.)
[Sketch for Prob. 1, not reproduced: the graph with vertices numbered 1–11, s = 9, t = 6, the Moore
labels 0–5 attached to the vertices, and a shortest path from s to t drawn in heavier lines.]
Thus the postman goes through all the streets (“edges”), visits each house (“vertex”) at least once, and
returns to the vertex, which is the post office, from where he/she came. Naturally, the postman
wants to travel the shortest distance possible.

We solve the problem by inspection. In the present situation, with the post office s located at
vertex 1, the postman can travel in four different ways:
Each route contains 3–4 and 4–3, that is, vertices 3 and 4 are each traversed twice. The length of
the first route is (with the parentheses corresponding to the different parts of the trail)

(2 + 1) + 4 + (3 + 4 + 5) + 4 + (2) = 3 + 4 + 12 + 4 + 2 = 25,

and so is that of all other three routes. Each route is optimal and represents a walk of minimum
length 25.
19. Order. We can formalize the discussion of order O (pp. 978–979 in the textbook) as follows. We say
that a function g(m) is of the order of h(m), that is,

g(m) = O(h(m)),

if there are positive constants k and m0 such that

g(m) ≤ k h(m)   for all m ≥ m0.

This means that, from a point m0 onward, the curve of k h(m) always lies above g(m).

(a) To show that

(O1)   √(1 + m^2) = O(m),

we do the following. First,

0 ≤ m^2 + 1 ≤ m^2 + 2m + 1 = (m + 1)^2   for all m ≥ 1,

so that, taking square roots,

0 ≤ √(1 + m^2) ≤ m + 1   for all m ≥ 1.

Furthermore, m + 1 ≤ 2m for all m ≥ 1, so that together

0 ≤ √(1 + m^2) ≤ 2m   for all m ≥ 1,

from which, by the definition of order, equation (O1) follows directly with k = 2 (and m0 = 1). Another,
more elegant, solution can be obtained by noting that

√(1 + m^2) = m √(1/m^2 + 1) ≤ m √2 < 2m   for all m ≥ 1.
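A quick numeric sanity check of the bound just proved:

```python
# Verifying sqrt(1 + m**2) <= 2*m for a range of m >= 1.
import math

assert all(math.sqrt(1 + m*m) <= 2*m for m in range(1, 1001))
print("bound holds for m = 1..1000")
```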
In this section we consider connected graphs G (p. 981) with edges of positive length. Connectivity
allows us to travel from any vertex of G to any other vertex of G along some path, as, say, in Figs. 487
and 488 on p. 983. (Figure 478, p. 971, is not connected.) Now if we take a shortest path in a connected
graph that extends through several edges and remove its last edge, the new (shortened) path is also a
shortest path (to the prior vertex). This is the essence of Bellman’s minimality principle (Theorem 1,
Fig. 486, p. 981) and leads to the Bellman equations (1), p. 981. These equations in turn suggest a
method to compute the lengths of shortest paths in G and form the heart of Dijkstra’s algorithm.
Dijkstra’s algorithm, p. 982, partitions the vertices of G into a set PL of permanently labeled vertices
and a set TL of temporarily labeled vertices. At each iteration (Steps 2 and 3), it selects from TL a
vertex k with minimum temporary label L̃k, removes vertex k from TL, and places it into PL; L̃k becomes
the permanent label Lk. This signifies that we have found a shortest path from vertex 1 to vertex k.
Then, using the idea of Bellman’s equations, it updates the temporary labels of the remaining vertices
in Step 3. The iterations continue until all vertices are permanently labeled, that is, until TL = ∅ and
PL is the set of all vertices of G. Then the algorithm returns the lengths Lk (k = 2, ..., n) of shortest
paths from the given vertex (denoted by 1) to every other vertex in G. There is one more idea to note:
each vertex that is not adjacent to vertex 1 gets the temporary label ∞ in Step 1 (the initialization
step). This is illustrated in Prob. 5.
Note that, in Step 2, the algorithm always picks, among all temporarily labeled vertices, one whose
label is smallest, i.e., one that is closest to the permanently labeled set. Furthermore, the algorithm
solves a more general problem than the one in Sec. 23.3, where the lengths of the edges were all equal
to 1. To completely understand this algorithm, follow its steps while going through Example 1, p. 982,
with a sketch of Fig. 487, p. 983, at hand.
The problem of finding the shortest (“optimal”) distance in a graph has many applications in various
networks, such as networks of roads, railroad tracks, airline routes, as well as computer networks, the
Internet, and others (see opening paragraph of Sec. 23.2, p. 975). Thus Dijkstra’s algorithm is a very
important algorithm as it forms a theoretical basis for solving problems in different network settings. In
particular, it forms a basis for GPS navigation systems in cars, where we need directions on how to travel
between two points on a map.
1. Shortest path.
(a) By inspection:
We drop 40 because 12 + 28 = 40 does the same.
We drop 36 because 12 + 16 = 28 is shorter.
We drop 28 because 16 + 8 = 24 is shorter.
Chap. 23 Graphs. Combinatorial Optimization 9
Dijkstra’s algorithm runs as follows. (Sketch the figure yourself and keep it handy while you are
working.)
Step 1
L̃4 = min(36, 12 + 16) = 28,
where 36 is the old temporary label of vertex 4, and 16 is the distance from vertex 2 to vertex 4. Vertex 2 belongs to the set of permanently labeled vertices, and 28 shows that vertex 4 is now closer to this set PL than it had been before. This is the end of Step 1.
Step 2
1. Extend the set PL by including that vertex of TL that is closest to a vertex in PL, that is, add to PL the vertex with the smallest temporary label. Now vertex 3 has the temporary label 40, and vertex 4 has the temporary label 28. Accordingly, include vertex 4 in PL. Its permanent label is L4 = 28.
Step 3
Since only a single vertex, 3, is left in TL, we finally assign its temporary label 36 as the permanent label of vertex 3. Hence the remaining roads are those of lengths 12, 16, and 8. The total length of the remaining roads is 12 + 16 + 8 = 36, and these roads satisfy the condition that they connect all four communities.
Since Dijkstra’s algorithm gives a shortest path from vertex 1 to each other vertex, it follows that
these shortest paths also provide paths from any of these vertices to every other vertex, as required
in the present problem. The solution agrees with the above solution by inspection.
5. Dijkstra’s algorithm. Use of labels L̃j = l1j = ∞. The procedure is the same as in Example 1, p. 982, and as in Prob. 1 just considered. You should make a sketch of the graph and use it to follow the steps.
Step 1
1. Vertex 1 gets the permanent label 0. The other vertices get the temporary labels 2 (vertex 2), ∞ (vertex 3), 5 (vertex 4), and ∞ (vertex 5).
The further work is an application of Operation 2 [assigning a permanent label to the (or a) vertex closest to PL] and Operation 3 [updating the temporary labels of the vertices that are still in the set TL of temporarily labeled vertices], in alternating order.
2. L2 = 2. Thus PL = {1, 2}, TL = {3, 4, 5}.
3. L̃3 = min(∞, 2 + 3) = 5.
   L̃4 = min(5, 2 + 1) = 3.
   L̃5 = min(∞, ∞) = ∞.
Step 2
1. L4 = min(5, 3, ∞) = 3. Thus PL = {1, 2, 4}, TL = {3, 5}. Two vertices are left in TL; hence we have to make two updates.
2. L̃3 = min(5, 3 + 1) = 4.
   L̃5 = min(∞, 3 + 4) = 7.
Step 3
1. L3 = min(4, 7) = 4.
2. L̃5 = min(7, 4 + 2) = 6.
Step 4
1. L5 = L̃5 = 6.
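The labels computed in these steps can be checked with a short heap-based sketch of Dijkstra’s algorithm. The edge list below is reconstructed from the min(…) computations of this problem (e.g., l12 = 2, l24 = 1, l34 = 1), not taken from the book’s figure, so treat it as an assumption:

```python
import heapq

def dijkstra(adj, source):
    """Return the permanent labels Lk (shortest-path lengths) from source."""
    dist = {source: 0}
    pq = [(0, source)]                        # entries (temporary label, vertex)
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):     # stale entry: u is already permanent
            continue
        for v, w in adj[u]:                   # Bellman-style update: L~v = min(L~v, Lu + luv)
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return dist

# Edge lengths read off from the computations of Prob. 5 above (reconstructed).
edges = [(1, 2, 2), (1, 4, 5), (2, 3, 3), (2, 4, 1), (3, 4, 1), (4, 5, 4), (3, 5, 2)]
adj = {v: [] for v in range(1, 6)}
for i, j, w in edges:
    adj[i].append((j, w))
    adj[j].append((i, w))

print(dijkstra(adj, 1))   # permanent labels L2 = 2, L3 = 4, L4 = 3, L5 = 6
```

The heap replaces the explicit scan of TL for the smallest temporary label; stale heap entries are skipped instead of being removed.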
A tree is a graph that is connected and has no cycles (for the definition of “connected,” see p. 977; for “cycle,” p. 976). A spanning tree [see Fig. 489(b), p. 984] in a connected graph G is a tree that contains all the vertices of G. A shortest spanning tree T in a connected graph G, whose edges have positive lengths, is a spanning tree for which the sum of the lengths of all edges of T is minimal compared to the sum of the lengths of all edges of any other spanning tree in G.
Sections 23.4 (p. 984) and 23.5 (p. 988) are both devoted to finding the shortest spanning tree, a problem also known as the minimum spanning tree (MST) problem.
Kruskal’s greedy algorithm (p. 985; see also Example 1 and Prob. 5) is a systematic method for finding a
shortest spanning tree. The efficiency of the algorithm is improved by using double labeling of vertices
(look at Table 23.5 on p. 986, which is related to Example 1). Complexity considerations (p. 987) make this
algorithm attractive for sparse graphs, that is, graphs with very few edges.
A greedy algorithm makes, at any instance, a decision that is locally optimal, that is, looks optimal at the
moment, and hopes that, in the end, this strategy will lead to the desired global (or overall) optimum. Do
you see that Kruskal uses such a strategy? Is Dijkstra’s algorithm a greedy algorithm? (For answer see p.
20).
12 Optimization, Graphs Part F
More details on Example 1, p. 985. Application of Kruskal’s algorithm with double labeling of vertices
(Table 23.3, p. 985). We reproduce the list of double labels, that is, Table 23.5, p. 986, and give some
further explanations to it. Note that this table was obtained from the rather simple Table 23.4, p. 985.
          Choice 1   Choice 2   Choice 3   Choice 4   Choice 5
Vertex    (3, 6)     (1, 2)     (1, 3)     (4, 5)     (3, 4)
1                    (1, 0)
2                    (1, 1)
3         (3, 0)                (1, 1)
4                                          (4, 0)     (1, 3)
5                                          (4, 4)     (1, 4)
6         (3, 3)                (1, 3)
By going line by line through our table, we can see what the shortest spanning tree looks like. Follow
our discussion and sketch our findings, obtaining a shortest spanning tree.
Line 2. (1, 1) shows that 2 is in a subtree with root 1 and is preceded by 1. [This tree consists of the
single edge (1, 2).]
Line 3. (3, 0) means that 3 first is a root, and (1, 1) shows that later it is in a subtree with root 1, and
then is preceded by 1, that is, joined to the root by a single edge (1, 3).
Line 4. (4, 0) shows that 4 first is a root, and (1, 3) shows that later it is in a subtree with root 1 and is
preceded by 3.
Line 5. (4, 4) shows that 5 first belongs to a subtree with root 4 and is preceded by 4, and (1, 4) shows
that later 5 is in a (larger) subtree with root 1 and is still preceded by 4. This subtree actually is the
whole tree to be found because we are now dealing with Choice 5.
Line 6. (3, 3) shows that 6 is first in a subtree with root 3 and is preceded by 3, and then later is in a
subtree with root 1 and is still preceded by 3.
5. Kruskal’s algorithm. Trees constitute a very important type of graph. Kruskal’s algorithm is straightforward. It begins by ordering the edges of a given graph G in ascending order of length. The length of an edge (i, j) is denoted by lij. Arrange the result in a table similar to Table 23.4 on p. 985.
The given graph G has n = 5 vertices. Hence a spanning tree in G has n − 1 = 4 edges, so that you can terminate your table when four edges have been chosen. Pick edges of the spanning tree to be obtained in order of length, rejecting an edge when it would create a cycle. This gives the following table. (Look at the given graph!)
Edge     Length   Choice
(1, 4)   2        1st
(3, 4)   2        2nd
(4, 5)   3        3rd
(3, 5)   4        (Reject)
(1, 2)   5        4th
We see that the spanning tree is the one in the answer on p. A56 and has the length L = 12.
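The choices in the table can be reproduced with a short union-find sketch of Kruskal’s algorithm. The edge list is the one tabulated above; the parent array plays the role of the double labels, since following parents from a vertex leads to the root of its current subtree:

```python
def kruskal(n, edges):
    """Return (chosen edges, total length) of a shortest spanning tree."""
    parent = list(range(n + 1))               # parent[v] == v means v is a root

    def root(v):                              # find the root of v's subtree
        while parent[v] != v:
            v = parent[v]
        return v

    chosen, total = [], 0
    for i, j, w in sorted(edges, key=lambda e: e[2]):   # ascending length
        ri, rj = root(i), root(j)
        if ri == rj:                          # same subtree: edge would close a cycle, reject
            continue
        parent[max(ri, rj)] = min(ri, rj)     # merge subtrees, keeping the smaller root
        chosen.append((i, j))
        total += w
    return chosen, total

# Edges (i, j, length) of Prob. 5, as listed in the table above.
edges = [(1, 4, 2), (3, 4, 2), (4, 5, 3), (3, 5, 4), (1, 2, 5)]
tree, length = kruskal(5, edges)
print(tree, length)   # [(1, 4), (3, 4), (4, 5), (1, 2)] 12
```

Edge (3, 5) is the one rejected: by the time it is examined, vertices 3 and 5 already share the root 1.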
In the case of the present small graph we would not gain much by double labeling. Nevertheless,
to understand the process as such (and also for a better understanding of the table on p. 986) do the
following for the present graph and tree. Graph the growing tree as on p. 986. Double label the
vertices, but attach a label only if it is new or if it changes in a step.
[Sketch: the growing tree, with a double label attached only when it is new or changes in a step — (1, 0) at vertex 1, (1, 1) at vertices 2 and 4, and (1, 4) at vertices 3 and 5.]
From these graphs we can now see what a corresponding table looks like. This table is simpler
than that in the book because the root of the growing tree (subtree of the answer) does not change;
it remains vertex 1.
          Choice 1   Choice 2   Choice 3   Choice 4
Vertex    (1, 4)     (3, 4)     (4, 5)     (1, 2)
1         (1, 0)
2                                          (1, 1)
3                    (1, 4)
4         (1, 1)
5                               (1, 4)
We see that vertex 1 is the root of every tree in the graph. Vertex 2 gets the label (1, 1) because
vertex 1 is its root as well as its predecessor. In the label (1, 4) of vertex 3 the 1 is the root and 4 the
predecessor. Label (1, 1) of vertex 4 shows that the root as well as the predecessor is 1. Finally,
vertex 5 has the root 1 and the predecessor 4.
17. Trees that are paths. Let T be a tree with exactly two vertices of degree 1. Suppose that T is not a path. Then it must have at least one vertex v of degree d ≥ 3. Each of the d edges incident with v eventually leads to at least one vertex of degree 1, because T is a tree and so cannot have cycles (definition on p. 976 in Sec. 23.2). These end vertices are distinct for different edges, so T has at least d ≥ 3 vertices of degree 1. This contradicts the assumption that T has but two vertices of degree 1.
Sec. 23.5 Shortest Spanning Trees: Prim’s Algorithm
From the previous section, recall that a spanning tree is a tree in a connected graph that contains all vertices of the graph. Comparing all such trees may yield a shortest one, that is, one for which the sum of the lengths of the edges is smallest. We assume that all the lengths are positive (p. 984 of the textbook).
Another popular method to find a shortest spanning tree is Prim’s algorithm. This algorithm is more involved than Kruskal’s algorithm and should be used when the graph has many edges.
Prim’s algorithm shares similarities with Dijkstra’s algorithm. Both have a similar three-step structure: an initialization step, a middle step where most of the action takes place, and an updating (final) step. Thus, if you studied and understood Dijkstra’s algorithm, you will readily appreciate Prim’s algorithm. Where Dijkstra’s algorithm fixes a permanent label in the second step, Prim’s algorithm adds an edge to a tree T. Prim’s algorithm is illustrated in Example 1, p. 990. (For comparison, Dijkstra’s algorithm was illustrated in Example 1, p. 982.)
Here are two simple questions (open book) to test your understanding of the material. Can Prim’s
algorithm be applied to the graph of Example 1, p. 983? Can Dijkstra’s algorithm be applied to the graph
of Example 1, p. 990? Give an answer (Yes or No) and give a reason. Then turn to p. 20 to check your
answer.
9. Shortest spanning tree obtained by Prim’s algorithm. In each step, U is the set of vertices of the tree
T to be grown, and S is the set of edges of T . The beginning is at vertex 1, as always. The table is
similar to that in Example 1 on p. 990. It contains the initial labels and then, in each column, the
effect of relabeling. Explanations follow after the table.
                      Relabeling
Vertex   Initial     (I)         (II)        (III)
2        l12 = 16    l24 = 4     l24 = 4     –
3        l13 = 8     l34 = 2     –           –
4        l14 = 4     –           –           –
5        l15 = ∞     l45 = 14    l35 = 10    l35 = 10
1. i(k) = 1, U = {1}, S = ∅. Vertices 2, 3, 4 are adjacent to vertex 1. This gives them initial labels equal to the lengths of the edges connecting them with vertex 1 (see the table). Vertex 5 gets the initial label ∞ because the graph has no edge (1, 5); that is, vertex 5 is not adjacent to vertex 1.
2. λ4 = l14 = 4 is the smallest of the initial labels. Hence include vertex 4 in U and edge (1, 4) as the first edge of the growing tree T. Thus, U = {1, 4}, S = {(1, 4)}.
3. Each time we include a vertex in U (and the corresponding edge in S) we have to update the labels. This gives the three numbers in column (I), because vertex 2 is adjacent to vertex 4, with l24 = 4 [the length of edge (2, 4)], and so is vertex 3, with l34 = 2 [the length of edge (3, 4)]. Vertex 5 is also adjacent to vertex 4, so that ∞ is now gone and replaced by l45 = 14 [the length of edge (4, 5)].
2. λ3 = l34 = 2 is the smallest of the labels in (I). Hence include vertex 3 in U and edge (3, 4) in S. We now have U = {1, 3, 4} and S = {(1, 4), (3, 4)}.
3. Column (II) shows the next updating. l24 = 4 remains because vertex 2 is not closer to the new vertex 3 than to vertex 4. Vertex 5 is closer to vertex 3 than to vertex 4; hence the update is l35 = 10, replacing 14.
2. The end of the procedure is now quite simple. l24 is smaller than l35 in column (II), so that we set λ2 = l24 = 4 and include vertex 2 in U and edge (2, 4) in S. We thus have U = {1, 2, 3, 4} and S = {(1, 4), (3, 4), (2, 4)}.
3. Updating gives no change because vertex 5 is closer to vertex 3, whereas it is not even adjacent to vertex 2.
2. λ5 = l35 = 10. U = {1, 2, 3, 4, 5}, so that our spanning tree T consists of the edges S = {(1, 4), (3, 4), (2, 4), (3, 5)}.
The length of the shortest spanning tree is
L(T) = Σ lij = l14 + l34 + l24 + l35 = 4 + 2 + 4 + 10 = 20.
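The run above can be reproduced with a short heap-based sketch of Prim’s algorithm. The edge lengths are reconstructed from the table and discussion of this problem (l12 = 16, l13 = 8, l14 = 4, l24 = 4, l34 = 2, l35 = 10, l45 = 14), so treat the edge list as an assumption:

```python
import heapq

def prim(adj, start=1):
    """Grow a shortest spanning tree from `start`; return (edge set S, L(T))."""
    in_U = {start}                            # U: vertices of the growing tree
    S, total = [], 0
    # heap of (label, outside vertex, nearest tree vertex) = candidate edges
    pq = [(w, v, start) for v, w in adj[start]]
    heapq.heapify(pq)
    while pq:
        w, v, u = heapq.heappop(pq)
        if v in in_U:                         # outdated label: v already in the tree
            continue
        in_U.add(v)                           # Step 2: include the closest vertex and its edge
        S.append((u, v))
        total += w
        for x, wx in adj[v]:                  # Step 3: update labels of vertices still outside
            if x not in in_U:
                heapq.heappush(pq, (wx, x, v))
    return S, total

edges = [(1, 2, 16), (1, 3, 8), (1, 4, 4), (2, 4, 4), (3, 4, 2), (3, 5, 10), (4, 5, 14)]
adj = {v: [] for v in range(1, 6)}
for i, j, w in edges:
    adj[i].append((j, w))
    adj[j].append((i, w))

S, L = prim(adj)
print(S, L)   # tree edges (1,4), (4,3), (4,2), (3,5) with L(T) = 20
```

Pushing a fresh heap entry on every update and discarding stale ones on pop mirrors the relabeling columns (I)–(III) of the table.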
1. Theme. Sections 23.6 and 23.7 cover the third major topic of flow problems in networks. They have
many applications in electrical networks, water pipes, communication networks, traffic flow in
highways, and others. A typical example is the trucking problem. A trucking company wants to
transport crates, by truck, from a factory (“the source”) located in one city to a warehouse (“target”)
located far away in another city, over a network of roads. There are certain constraints. The roads, due to their construction (major highway, two-lane road), have certain capacities, that is, they allow only a certain number of trucks and cars. They are also affected by the traffic flow, that is, the number of trucks and cars on the road at different times. The company wants to determine the maximum number of crates it can ship under the given constraints.
Section 23.6 covers the terminology and theory needed to analyze such problems and illustrates them
by examples. Section 23.7 gives a systematic way to determine maximum flow in a network.
The edge condition requires that the flow fij in each edge (i, j) not exceed the edge’s capacity cij:
0 ≤ fij ≤ cij.
The vertex condition (Kirchhoff’s law) applies to each vertex i that is not s or t. It is given by
Inflow = Outflow.
3. Paths, p. 992
Definition of a path P: v1 → vk in a digraph G as a sequence of edges
(v1, v2), (v2, v3), …, (vk−1, vk).
Our goal is to maximize the flow, and thus we look for a path P: s → t from the source to the sink whose edges are not fully used, so that we can push additional flow through P. This leads to flow augmenting paths; see the definition at the top of p. 993. Do you see that Conditions (i) and (ii) mean fij < cij and fij > 0 for forward and backward edges, respectively?
We introduce the concept of a cut set (S, T) because we want to know what is flowing from s to t. So we cut the network somewhere between s and t and see what flows through the edges hit by the cut. The cut set is precisely the set of edges hit by the cut; see the upper half of p. 994.
On the cut set we define the capacity cap(S, T) to be the sum of the capacities of all forward edges going from S to T. Write it out in a formula and compare your answer with (3), p. 994.
6. Four theorems, pp. 995–996. The section discusses the following theorems about cut sets and flows. They are:
Theorem 1. Net flow in cut sets. It states that any given flow in a network G is the net flow through any cut set (S, T) of G.
Theorem 2. Upper bound for flows. A flow f in a network G cannot exceed the capacity of any cut set (S, T) in G.
Theorem 3. Main Theorem. Augmenting path theorem for flows. It states that a flow from s to t in a network G is maximum if and only if there does not exist a flow augmenting path s → t in G.
The last theorem is by Ford and Fulkerson. It is
Theorem 4. Max-Flow Min-Cut Theorem. It states that the maximum flow in any network G is equal to the capacity of a cut set of minimum capacity (“minimum cut set”) in G.
7. Illustrations of Concepts.
An example of a network is given in Fig. 493, p. 992. Forward edge and backward edge are
illustrated in Figs. 494 and 495 on the same page. Example 1, p. 993, and Prob. 15 determine flow
augmenting paths. Figure 498 and explanation, p. 994, as well as Probs. 3 and 5 illustrate cut sets
and capacity. Note that, in the network in Fig. 498, the first number on each edge denotes capacity
and the second number flow. Intuitively, if you think of edges as roads, then capacity of the road
means how many cars can actually be on the road and flow denotes how many cars actually are on
the road. Finally, Prob. 17 finds maximum flow.
3. Cut sets, capacity. We are given that S = {1, 2, 3}. T consists of the vertices that are not in S. Looking at Fig. 498, p. 994, we see that T = {4, 5, 6}. First draw Fig. 498 (without any cut) and then draw a curve that separates S from T. This is the cut. Then we see that the curve cuts the edge (1, 4), whose capacity is 10, the edge (5, 2), which is a backward edge, the edge (3, 5), whose capacity is 5, and the edge (3, 6), whose capacity is 13. By definition (3), p. 994, the capacity cap(S, T) is the sum of the capacities of the forward edges from S to T. Here we have three forward edges and hence
cap(S, T) = 10 + 5 + 13 = 28.
The edge (5, 2) goes from vertex 5, which belongs to T, to vertex 2, which belongs to S. This shows that edge (5, 2) is indeed a backward edge, as noted above. And backward edges are not included in the capacity of a cut set, by definition.
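Definition (3) is easy to evaluate mechanically: sum the capacities of exactly those edges that start in S and end in T. A minimal sketch on the four cut edges of Prob. 3; the capacity assigned to the backward edge (5, 2) is arbitrary, since it does not enter cap(S, T):

```python
def cut_capacity(S, cap):
    """cap(S, T): sum of capacities of forward edges (i in S, j not in S)."""
    return sum(c for (i, j), c in cap.items() if i in S and j not in S)

# The four edges of Fig. 498 hit by the cut of Prob. 3. The capacity of the
# backward edge (5, 2) is a made-up placeholder; it is excluded either way.
cap = {(1, 4): 10, (5, 2): 99, (3, 5): 5, (3, 6): 13}
print(cut_capacity({1, 2, 3}, cap))   # 28
```

The filter `i in S and j not in S` is precisely the forward-edge condition, so (5, 2) drops out automatically.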
5. Cut sets, capacity. Here S = {1, 2, 4, 5}. Looking at the graph in Fig. 499, p. 997, we see that T = {3, 6, 7}. We draw Fig. 499 and insert the cut, that is, a curve that separates S from T. We see that the curve cuts the edges (2, 3), (5, 3), and (5, 6). These edges are all forward edges and thus contribute to cap(S, T). The capacities of these edges are 8, 4, and 4, respectively. Using (3), p. 994, we have
cap(S, T) = 8 + 4 + 4 = 16.
1 – 2 – 5, Δf = 2,
1 – 4 – 2 – 5, Δf = 2, etc.
From this, we see that the path 1 – 2 – 5 is flow augmenting and admits an additional flow
Δf = min(4 − 2, 8 − 5) = min(2, 3) = 2.
Here 2 = 4 − 2 comes from edge (1, 2) and 3 = 8 − 5 from edge (2, 5).
Furthermore, we see that another flow augmenting path is 1 – 4 – 2 – 5 and admits an increase of the given flow
Δf = min(10 − 3, 5 − 3, 8 − 5) = min(7, 2, 3) = 2.
And so on. Of course, if we increased the flow on 1 – 2 – 5 by 2, then we would have on edge (2, 5), instead of (8, 5), the new values (8, 7) and could then increase the flow on 1 – 4 – 2 – 5 only by 8 − 7 = 1, the edge (2, 5) now being the bottleneck edge.
For such a small network we can find flow augmenting paths (if they exist) by trial and error. For
large networks we need an algorithm, such as that of Ford and Fulkerson in Sec. 23.7, pp. 998–1000.
17. Maximum flow. The given flow in the network depicted in this problem on p. 997 is 10. We can see this by looking at the two edges (4, 6) and (5, 6) that go into the target t (the sink, vertex 6); they carry the flow 1 + 9 = 10. Another way is to look at the three edges (1, 3), (1, 4), and (1, 2) that leave vertex 1 (the source s); they carry the flow 5 + 3 + 2 = 10.
To find the maximum flow by inspection we note the following. Each of the three edges going out from vertex 1 could carry an additional flow of 3. This is computed as the difference of capacity (the first number on the edge) and flow (the second number on the edge).
Since the additional flow is 3, we may augment the given flow by 3 by using the path 1 – 4 – 5 – 6. Then the edges (1, 4) and (4, 5) are used to capacity. This increases the given flow from 10 to 10 + 3 = 13. Next we can use the path 1 – 2 – 4 – 6. Its capacity is 2.
This increases the flow from 13 to 13 + 2 = 15. For this new increased flow the capacity of the path 1 – 3 – 5 – 6 is 1,
because the first increase of 3 increased the flow in edge (5, 6) from 9 to 12. Hence we can increase our flow from 15 to 15 + 1 = 16.
Finally, consider the path 1 – 3 – 4 – 6. The edge (4, 3) is a backward edge in this path.
By decreasing the existing flow in edge (4, 3) from 2 to 1, we can push a flow 1 through this path.
Then edge (4, 6) is used to capacity, whereas edge (1, 3) is still not fully used. But since both edges going into vertex 6, that is, edges (4, 6) and (5, 6), are now used to capacity, we cannot augment the flow further, so that we have reached the maximum flow
f = 16 + 1 = 17.
For our solution of maximum flow f = 17, the flows in the edges are
f12 = 4 (instead of 2)    f13 = 7 (instead of 5)    f14 = 6 (instead of 3)
f24 = 4 (instead of 2)    f35 = 8 (instead of 7)    f43 = 1 (instead of 2)
f45 = 5 (instead of 2)    f46 = 4 (instead of 1)    f56 = 13 (instead of 9)
You should sketch the network with the new flows and check that Kirchhoff’s law (Inflow = Outflow) holds at every vertex other than s and t.
7. Maximum flow. Example 1 in the text on pp. 999–1000 shows how we can proceed in applying the
Ford–Fulkerson algorithm for obtaining flow augmenting paths until the maximum flow is reached.
No algorithms would be needed for the modest problems in our problem sets. Hence the point of
this, and similar problems, is to obtain familiarity with the most important algorithms for basic tasks
in this chapter, as they will be needed for solving large-scale real-life problems. Keep this in mind to
avoid misunderstandings. From time to time look at Example 1 in the text, which is similar and may
help you to see what to do next.
1. The given initial flow is f = 6. This can be seen by looking at the flows 2 in edge (1, 2), 1 in edge (1, 3), and 3 in edge (1, 4), which begin at s and whose sum is 6, or, more simply, by looking at the flows 5 and 1 in the two edges (2, 5) and (3, 5), respectively, which end at vertex 5 (the target t).
3. Scan 1. This means labeling vertices 2, 3, and 4, which are adjacent to vertex 1, as explained in Step 3 of Table 23.8 (the table of the Ford–Fulkerson algorithm). In the present case this amounts to the following. j = 2 is the first unlabeled vertex in this process, which corresponds to the first part of Step 3 in Table 23.8. We have c12 > f12 and compute Δ12 = c12 − f12 = 4 − 2 = 2, hence Δ2 = 2, and label vertex 2. Vertices 3 and 4 are labeled similarly.
4. Scan 2. This is necessary since we have not yet reached t (vertex 5), that is, we have not yet obtained a flow augmenting path. Adjacent to vertex 2 are the vertices 1, 4, and 5. Vertices 1 and 4 are already labeled. Hence the only vertex to be considered is vertex 5. The calculation of Δ5 differs from the corresponding previous ones. From the table we see that Δ5 = min(Δ2, c25 − f25).
The idea here is that Δ25 = c25 − f25 = 8 − 5 = 3 alone is of no help, because in the previous edge (1, 2) you can increase the flow only by Δ2 = 2. Hence Δ5 = min(2, 3) = 2.
7. Remove the labels from vertices 2, 3, 4, 5, and go to Step 3. Sketch the given network with the new flows f12 = 4 and f25 = 7. The other flows remain the same as before. We will now obtain a second flow augmenting path.
3. We scan 1. Adjacent are 2, 3, 4. We have c12 = f12; edge (1, 2) is used to capacity and is no longer to be considered. For vertex 3 we compute Δ3 = c13 − f13 = 3 − 1 = 2, and for vertex 4, Δ4 = c14 − f14 = 10 − 3 = 7.
We need not scan 2, because we now have f12 = 4, so that c12 − f12 = 0; edge (1, 2) is used to capacity, and the condition c12 > f12 in the algorithm is not satisfied. Scan 3. Adjacent to vertex 3 are the vertices 4 and 5. For vertex 4 we have the backward edge (4, 3) with c43 = 6 but f43 = 0, so that the condition f43 > 0 is violated. Similarly, for vertex 5 we have c35 = f35 = 1, so that the condition c35 > f35 is violated, and we must go on to vertex 4.
Scan 4. Adjacent to vertex 4 is the unlabeled vertex 2, with c42 − f42 = 5 − 3 = 2, hence Δ2 = min(Δ4, 2) = 2. Scanning 2 then reaches vertex 5 with Δ5 = min(Δ2, c25 − f25) = min(2, 8 − 7) = 1. This yields the second flow augmenting path 1 – 4 – 2 – 5 with Δt = 1.
7. Remove the labels from vertices 2, 3, 4, 5 and go to Step 3. Sketch the given network with the new flows, writing the capacities and flows on each edge. This gives edge (1, 2): (4, 4), edge (1, 3): (3, 1), edge (1, 4): (10, 4), edge (2, 5): (8, 8), edge (3, 5): (1, 1), edge (4, 2): (5, 4), and edge (4, 3): (6, 0). We see that the two edges going into vertex 5 are used to capacity; hence the flow f = 9 is maximum. Indeed, the algorithm shows that vertex 5 can no longer be reached.
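The result f = 9 can be reproduced with a compact sketch of the Ford–Fulkerson method. The capacities below are read off from the (capacity, flow) pairs just listed, so they are a reconstruction; also, this sketch finds augmenting paths by breadth-first search (the Edmonds–Karp variant) rather than by the exact scanning order of Table 23.8:

```python
from collections import deque

def max_flow(cap, s, t):
    """Repeatedly find a flow augmenting path (forward edges with f < c,
    backward edges with f > 0) and push the bottleneck flow through it."""
    flow = {e: 0 for e in cap}

    def residual(u, v):
        r = cap.get((u, v), 0) - flow.get((u, v), 0)   # forward capacity left
        return r + flow.get((v, u), 0)                 # plus cancellable backward flow

    vertices = {u for e in cap for u in e}
    total = 0
    while True:
        pred = {s: None}                               # BFS for an augmenting path
        q = deque([s])
        while q and t not in pred:
            u = q.popleft()
            for v in vertices:
                if v not in pred and residual(u, v) > 0:
                    pred[v] = u
                    q.append(v)
        if t not in pred:
            return total, flow                         # t unreachable: flow is maximum
        path, v = [], t                                # walk predecessors back to s
        while pred[v] is not None:
            path.append((pred[v], v))
            v = pred[v]
        delta = min(residual(u, v) for u, v in path)   # bottleneck Δt
        for u, v in path:                              # push Δt through the path
            fwd = min(delta, cap.get((u, v), 0) - flow.get((u, v), 0))
            if (u, v) in cap and fwd > 0:
                flow[(u, v)] += fwd
            if fwd < delta:                            # remainder cancels reverse flow
                flow[(v, u)] -= delta - fwd
        total += delta

# Capacities reconstructed from the final (capacity, flow) pairs of Prob. 7.
cap = {(1, 2): 4, (1, 3): 3, (1, 4): 10, (2, 5): 8, (3, 5): 1, (4, 2): 5, (4, 3): 6}
print(max_flow(cap, 1, 5)[0])   # 9
```

The maximum matches the hand computation: the cut around vertex 5 has capacity c25 + c35 = 8 + 1 = 9.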
We consider graphs. A bipartite graph G = (V, E) allows us to partition (“partite”) the vertex set V into two (“bi”) sets S and T, where S and T share no elements. The requirement S ∩ T = ∅ follows from the nature of a partition.
Other concepts that follow are matching and maximum cardinality matching (p. 1001 of the textbook),
exposed vertex, complete matching, alternating path, and augmenting path (p. 1002).
A matching M in G = (S, T; E) is a set M of edges of the graph G such that no two of these edges have a vertex in common. In the special case where the set M consists of the greatest possible number of edges, M is called a maximum cardinality matching in G. Matchings are shown in Fig. 503 at the bottom of p. 1001.
A vertex is exposed or not covered by M if the vertex is not an endpoint of an edge in M. If, in addition,
the matching leaves no vertices exposed, then M is known as a complete matching. Can you see that this
exists only if S and T have the same number of vertices?
An alternating path consists alternately of edges that are in M and not in M, as shown below. Closely related is an augmenting path, which is an alternating path whose endpoints a and b are both exposed. This leads to Theorem 1, the augmenting path theorem for bipartite matching. It states that a matching in a bipartite graph is of maximum cardinality if and only if there does not exist an augmenting path with respect to the matching.
The theorem forms the basis for algorithm matching, pp. 1003–1004, and is illustrated in Example 1. Go
through the algorithm and example to convince yourself how the algorithm works. In addition to the label
of the vertex, the method also requires a label that keeps track of backtracking paths.
[Figure (Sec. 23.8): alternating path and augmenting path P, with exposed endpoints a and b. Heavy edges are those belonging to a matching M.]
We augment a given matching by one edge by dropping from the matching M those edges of an augmenting path P that belong to M (two edges in the figure above) and adding to M the other edges of P (three in the figure; do you see it?).
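This drop-and-add step is exactly the symmetric difference M Δ P. A minimal sketch on a hypothetical augmenting path a – 1 – 2 – 3 – 4 – b, where the vertex names and the assumption that edges (1, 2) and (3, 4) are in M are made up for illustration:

```python
# Current matching M (two edges) and a hypothetical augmenting path P
# with exposed endpoints "a" and "b". frozensets make edges undirected.
M = {frozenset(e) for e in [(1, 2), (3, 4)]}
P = [frozenset(e) for e in [("a", 1), (1, 2), (2, 3), (3, 4), (4, "b")]]

# Augment: drop the edges of P that are in M, add the other edges of P.
M_new = M.symmetric_difference(P)

print(len(M), len(M_new))   # 2 3 -- the matching gained one edge
```

Since P alternates and has one more non-matching edge than matching edges, the symmetric difference always grows the matching by exactly one edge.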
1. A graph that is not bipartite. We proceed in the order of the numbers of the vertices. We put vertex
1 into S and its adjacent vertices 2, 3 into T . Then we consider 2, which is now in T . Hence, for the
graph to be bipartite, its adjacent vertices 1 and 3 should be in S. But vertex 3 has just been put into
T . This contradicts the definition of a bipartite graph on p. 1001 and shows that the graph is not
bipartite.
7. Bipartite graph. Since a graph can be drawn in different ways, one cannot see immediately whether it is bipartite. Hence in the present problem we have to proceed systematically.
1. We put vertex 1 into S and all its adjacent vertices 2, 4, 6 into T.
2. Vertex 2 is in T; hence its adjacent vertices 1, 3, 5 must all be put into S. This gives
(P)   S = {1, 3, 5},   T = {2, 4, 6}.
3. Next consider vertex 3, which is in S. For the graph to be bipartite, its adjacent vertices 2, 4, 6 should be in T, as is the case by (P).
4. Vertex 4 is in T, and its adjacent vertices 1, 3, 5 are in S by (P).
5. Vertex 5 is in S. Hence, for the graph to be bipartite, its adjacent vertices 2, 4, 6 should be in T. This is indeed true by (P).
6. Vertex 6 is in T and its adjacent vertices 1, 3, 5 are in S.
Since none of the six steps gave us a contradiction, we conclude that the given graph in this problem is bipartite. Take another look at the figure of the graph on p. 1005 to realize that, although the number of vertices and edges is small, the present problem is not completely trivial. We can sketch the graph in such a way that it is immediately seen to be bipartite.
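The systematic procedure above is a two-coloring by breadth-first search, which a short sketch can automate. The adjacency below is reconstructed from the steps of this problem (each of 1, 3, 5 adjacent to each of 2, 4, 6), so treat the edge list as an assumption:

```python
from collections import deque

def bipartition(adj):
    """Two-color the graph by BFS; return (S, T) or None if not bipartite."""
    color = {}
    for start in adj:                     # handle every connected component
        if start in color:
            continue
        color[start] = 0
        q = deque([start])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if v not in color:        # put the neighbor on the other side
                    color[v] = 1 - color[u]
                    q.append(v)
                elif color[v] == color[u]:
                    return None           # two adjacent vertices on the same side
    S = {v for v, c in color.items() if c == 0}
    T = {v for v, c in color.items() if c == 1}
    return S, T

# Graph of Prob. 7 as reconstructed above.
adj = {v: [] for v in range(1, 7)}
for i in (1, 3, 5):
    for j in (2, 4, 6):
        adj[i].append(j)
        adj[j].append(i)

print(bipartition(adj))   # ({1, 3, 5}, {2, 4, 6})
```

Run on the graph of Prob. 1 instead, the triangle 1-2-3 would make the function return None, matching the contradiction found by hand there.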
17. K4 is planar because we can draw it as a square A, B, C, D, then add one diagonal, say AC, inside, and then join B and D not by a diagonal inside (which would cross AC) but by a curve outside the square.
Answer to the question on greedy algorithms (see p. 10 in Sec. 23.4 of this Student Solutions Manual and Study Guide). Yes, definitely, Dijkstra’s algorithm is an example of a greedy algorithm: in Steps 2 and 3 it always commits to the temporarily labeled vertex that is currently closest, a locally optimal choice.
Answer to the self-test on Prim’s and Dijkstra’s algorithms (see p. 12 of Sec. 23.5). Yes in both cases, since each algorithm requires only a connected graph with positive edge lengths.