Math336 Ch4
Contents
4.1 Vector and matrix norms
4.2 Contraction mappings
4.3 Iterative methods for solving linear systems
An iterative method for the solution of a system of n linear equations in n unknowns of the form
Ax = b
starts with an initial approximation (also called initial guess) $x_0 \in \mathbb{R}^n$ to the solution $x$ and generates a sequence of vectors $\{x_\nu\} \subset \mathbb{R}^n$ that ideally converges to the solution $x$ as $\nu \to \infty$.
Here we will discuss classical iterative schemes based on a transformation of the equation $Ax = b$ into a fixed point form
$$G(x) = x$$
for some appropriate function $G : \mathbb{R}^n \to \mathbb{R}^n$.
Given an initial guess $x_0$, the fixed point iterates $\{x_\nu\}$ [also called successive approximations] are naturally defined as
$$x_{\nu+1} = G(x_\nu), \qquad \nu = 0, 1, 2, \dots$$
Remark 4.1. We will discuss several iterative methods for the solution of $Ax = b$, where we assume that $A$ has non-zero diagonal elements. These methods are based on the splitting
$$A = A_L + D + A_R$$
where $A_L$ is the strictly lower triangular part of $A$, $D = \operatorname{diag}(a_{11}, \dots, a_{nn})$ is its diagonal part, and $A_R$ is its strictly upper triangular part.
Algorithm 4.1 (Jacobi Method). The Jacobi method is based on the observation that
$$Ax = b \iff (A_L + D + A_R)\,x = b \iff Dx = -(A_L + A_R)\,x + b \iff x = -D^{-1}(A_L + A_R)\,x + D^{-1}b =: G_J(x).$$
Similarly, the Gauss-Seidel method is based on the observation that
$$Ax = b \iff (A_L + D + A_R)\,x = b \iff (A_L + D)\,x = -A_R\,x + b \iff x = -(A_L + D)^{-1}A_R\,x + (A_L + D)^{-1}b =: G(x).$$
4.1 Vector and matrix norms

A vector norm on $\mathbb{F}^n$ is a function $\|\cdot\| : \mathbb{F}^n \to \mathbb{R}$ such that
(N1) $\|x\| \ge 0$
(N2) $\|x\| = 0$ iff $x = 0$
(N3) $\|\alpha x\| = |\alpha|\,\|x\|$
(N4) $\|x + y\| \le \|x\| + \|y\|$ [triangle inequality]
hold for all $x, y \in \mathbb{F}^n$ and $\alpha \in \mathbb{F}$.
We first prove the Cauchy-Schwarz-Bunyakovsky inequality $|\langle x, y\rangle| \le \|x\|_2\,\|y\|_2$ ($x, y \in \mathbb{F}^n$). For any $\alpha \in \mathbb{F}$,
$$0 \le \|\alpha x + y\|_2^2 = \sum_{k=1}^{n} (\alpha x_k + y_k)\overline{(\alpha x_k + y_k)} = |\alpha|^2\|x\|_2^2 + \alpha\sum_{k=1}^{n} x_k\overline{y_k} + \overline{\alpha}\sum_{k=1}^{n}\overline{x_k}\,y_k + \|y\|_2^2 = |\alpha|^2\|x\|_2^2 + 2\Re\big[\alpha\langle x, y\rangle\big] + \|y\|_2^2.$$
Choosing $\alpha = t\,\overline{\langle x, y\rangle}$ with $t \in \mathbb{R}$ (the case $\langle x, y\rangle = 0$ being trivial), the right-hand side becomes a real quadratic in $t$,
$$t^2\,|\langle x, y\rangle|^2\,\|x\|_2^2 + 2t\,|\langle x, y\rangle|^2 + \|y\|_2^2 \ge 0 \qquad \text{for all } t \in \mathbb{R}.$$
Therefore its discriminant satisfies
$$\big(2\,|\langle x, y\rangle|^2\big)^2 - 4\,|\langle x, y\rangle|^2\,\|x\|_2^2\,\|y\|_2^2 \le 0,$$
or equivalently
$$|\langle x, y\rangle|^2 \le \|x\|_2^2\,\|y\|_2^2.$$
Thus the result. $\square$
Theorem 4.2 (triangle inequality). For each $x = [x_1 \ \dots \ x_n]^t$ and $y = [y_1 \ \dots \ y_n]^t$ in $\mathbb{F}^n$, we have
$$\|x + y\|_2 \le \|x\|_2 + \|y\|_2.$$
Proof. This follows from the Cauchy-Schwarz-Bunyakovsky inequality upon noting that
$$\|x + y\|_2^2 = \|x\|_2^2 + 2\Re\langle x, y\rangle + \|y\|_2^2 \le \|x\|_2^2 + 2\|x\|_2\|y\|_2 + \|y\|_2^2 = \big(\|x\|_2 + \|y\|_2\big)^2. \qquad \square$$
In particular, every vector norm induces a metric
$$d(x, y) := \|x - y\| \qquad (x, y \in \mathbb{F}^n).$$
Definition 4.3 (equivalence of norms). Two vector norms $\|\cdot\|$ and $\|\cdot\|'$ on $\mathbb{F}^n$ are said to be equivalent provided there exist constants $c_1, c_2 > 0$ such that $c_1\|x\|' \le \|x\| \le c_2\|x\|'$ for all $x \in \mathbb{F}^n$.
Example 4.2. The $p$-norms on $\mathbb{F}^n$ defined above are pairwise equivalent; for $p, q \in \{1, 2, \infty\}$ we have $\|x\|_p \le c_{p,q}\,\|x\|_q$ for all $x$, with constants $c_{p,q}$ depending only on $n$ (for instance $\|x\|_\infty \le \|x\|_2 \le \|x\|_1 \le \sqrt{n}\,\|x\|_2 \le n\,\|x\|_\infty$).
Definition 4.4. A matrix norm on $\mathbb{F}^{m\times n}$ (the collection of all $m \times n$ matrices with entries in $\mathbb{F}$) is a function $\|\cdot\| : \mathbb{F}^{m\times n} \to \mathbb{R}$ such that
(M1) $\|A\| \ge 0$
(M2) $\|A\| = 0$ iff $A = 0$
(M3) $\|\alpha A\| = |\alpha|\,\|A\|$
(M4) $\|A + B\| \le \|A\| + \|B\|$ [triangle inequality]
holds for all $A, B \in \mathbb{F}^{m\times n}$ and $\alpha \in \mathbb{F}$. In the case $m = n$, a matrix norm is called sub-multiplicative provided
(M5) $\|AB\| \le \|A\|\,\|B\|$
For $A \in \mathbb{F}^{n\times n}$,
$$\|A\| := \sup_{\|x\|=1}\|Ax\| = \sup_{\|x\|\le 1}\|Ax\| = \sup_{x\ne 0}\frac{\|Ax\|}{\|x\|}$$
defines a submultiplicative matrix norm on $\mathbb{F}^{n\times n}$ called the induced matrix norm or natural norm.
Proof. Since
$$Ax = \sum_{i=1}^{n} (Ax)_i\, e_i = \sum_{i=1}^{n}\Big(\sum_{j=1}^{n} a_{ij} x_j\Big) e_i,$$
we get
$$\|Ax\| \le \sum_{i=1}^{n}\sum_{j=1}^{n} |a_{ij} x_j|\,\|e_i\| \le \sum_{i=1}^{n}\sum_{j=1}^{n} |a_{ij}|\,|x_j|\,\|e_i\|,$$
and this can be used to show [using equivalence of norms on finite dimensional vector spaces] that $\|A\| < \infty$. Alternatively, since the norm is continuous (i.e. $\big|\,\|x\| - \|y\|\,\big| \le \|x - y\|$), the set $\{x \in \mathbb{F}^n : \|x\| = 1\}$ is compact, and this combined with the continuity of matrix multiplication implies that $\{\|Ax\| : \|x\| = 1\}$ is bounded. Thus $\|A\| < \infty$.
For $x \ne 0$, we have $\dfrac{\|Ax\|}{\|x\|} = \Big\|A\dfrac{x}{\|x\|}\Big\|$ and $\Big\|\dfrac{x}{\|x\|}\Big\| = 1$, so that
$$\sup_{x\ne 0}\frac{\|Ax\|}{\|x\|} \le \sup_{\|x\|=1}\|Ax\| \le \sup_{\|x\|\le 1}\|Ax\|.$$
On the other hand, if $\|x\| \in (0, 1]$, we have $\|Ax\| \le \dfrac{\|Ax\|}{\|x\|}$, so that
$$\sup_{\|x\|\le 1}\|Ax\| \le \sup_{x\ne 0}\frac{\|Ax\|}{\|x\|}.$$
For the induced $1$- and $\infty$-norms one has the explicit formulas
$$\|A\|_1 := \sup_{\|x\|_1=1}\|Ax\|_1 = \max_{1\le k\le n}\sum_{j=1}^{n} |a_{jk}| \quad\text{(column sum norm)}$$
and
$$\|A\|_\infty := \sup_{\|x\|_\infty=1}\|Ax\|_\infty = \max_{1\le j\le n}\sum_{k=1}^{n} |a_{jk}| \quad\text{(row sum norm)}.$$
Indeed, for every $x \in \mathbb{F}^n$,
$$\|Ax\|_\infty = \max_{1\le j\le n} |(Ax)_j| = \max_{1\le j\le n}\Big|\sum_{k=1}^{n} a_{jk} x_k\Big| \le \max_{1\le j\le n}\sum_{k=1}^{n} |a_{jk}|\,|x_k| \le \Big(\max_{1\le j\le n}\sum_{k=1}^{n} |a_{jk}|\Big)\max_{1\le k\le n}|x_k| = \Big(\max_{1\le j\le n}\sum_{k=1}^{n} |a_{jk}|\Big)\|x\|_\infty,$$
and an analogous estimate (with the roles of rows and columns interchanged) holds for $\|Ax\|_1$, so that
$$\|A\|_1 \le \max_{1\le k\le n}\sum_{j=1}^{n} |a_{jk}| \qquad\text{and}\qquad \|A\|_\infty \le \max_{1\le j\le n}\sum_{k=1}^{n} |a_{jk}|.$$
To see that equality holds for the $1$-norm, choose $i$ such that
$$\sum_{j=1}^{n} |a_{ji}| = \max_{1\le k\le n}\sum_{j=1}^{n} |a_{jk}|$$
and choose $x = (x_1, \dots, x_n)^t \in \mathbb{F}^n$ with $x_i = 1$ and $x_k = 0$ for $k \ne i$ (i.e. $x = e_i$). Then $\|x\|_1 = 1$ and
$$\|Ax\|_1 = \sum_{j=1}^{n} |(Ax)_j| = \sum_{j=1}^{n}\Big|\sum_{k=1}^{n} a_{jk} x_k\Big| = \sum_{j=1}^{n} |a_{ji}| = \max_{1\le k\le n}\sum_{j=1}^{n} |a_{jk}|.$$
To see that equality holds for the infinity norm, we choose $i$ such that
$$\sum_{k=1}^{n} |a_{ik}| = \max_{1\le j\le n}\sum_{k=1}^{n} |a_{jk}|$$
and choose $z \in \mathbb{F}^n$ as
$$z_k = \begin{cases} \dfrac{\overline{a_{ik}}}{|a_{ik}|}, & a_{ik} \ne 0, \\[1ex] 1, & a_{ik} = 0. \end{cases}$$
Then $\|z\|_\infty = 1$ and
$$\|Az\|_\infty = \max_{1\le j\le n} |(Az)_j| = \max_{1\le j\le n}\Big|\sum_{k=1}^{n} a_{jk} z_k\Big| \ge \Big|\sum_{k=1}^{n} a_{ik} z_k\Big| = \sum_{k=1}^{n} |a_{ik}| = \max_{1\le j\le n}\sum_{k=1}^{n} |a_{jk}|. \qquad \square$$
Theorem 4.5 (Schur's lemma). To each $A \in \mathbb{F}^{n\times n}$, there corresponds a unitary matrix $Q$ (i.e. $Q^*Q = QQ^* = I$) such that $Q^*AQ$ is upper triangular.
Proof. We argue by induction on $n$; the case $n = 1$ is trivial. Assume that the claim is true for each $(n-1)\times(n-1)$ matrix. Let $\lambda$ be an eigenvalue of the $n\times n$ matrix $A_n$ with the corresponding eigenvector $u$. Normalizing if necessary, we may assume that $\langle u, u\rangle = 1$. Choose vectors $v_2, \dots, v_n \in \mathbb{C}^n$ so that $\{u, v_2, \dots, v_n\}$ is an orthonormal basis in $\mathbb{C}^n$ (e.g. use Gram-Schmidt). The matrix
$$U_n := \begin{bmatrix} u & v_2 & \cdots & v_n \end{bmatrix}$$
is unitary since we clearly have $U_n^* U_n = I$. With the aid of $\langle u, u\rangle = 1$ and $\langle u, v_j\rangle = 0$ ($j = 2, \dots, n$), we see that
$$U_n^* A_n U_n = U_n^*\begin{bmatrix} \lambda u & A_n v_2 & \cdots & A_n v_n \end{bmatrix} =: \begin{bmatrix} \lambda & R \\ 0 & A_{n-1} \end{bmatrix}$$
where $R$ is a row vector of size $(n-1)$, and $A_{n-1}$ is an $(n-1)\times(n-1)$ matrix. By induction, there exists a unitary matrix $Q_{n-1}$ such that $Q_{n-1}^* A_{n-1} Q_{n-1}$ is upper triangular. The matrix
$$Q_n := U_n\begin{bmatrix} 1 & 0 \\ 0 & Q_{n-1} \end{bmatrix}$$
is unitary since
$$Q_n^* Q_n = \begin{bmatrix} 1 & 0 \\ 0 & Q_{n-1}^* \end{bmatrix} U_n^* U_n \begin{bmatrix} 1 & 0 \\ 0 & Q_{n-1} \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & Q_{n-1}^* \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 0 & Q_{n-1} \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & Q_{n-1}^* Q_{n-1} \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & I_{n-1} \end{bmatrix} = I_n.$$
Moreover
$$Q_n^* A_n Q_n = \begin{bmatrix} 1 & 0 \\ 0 & Q_{n-1}^* \end{bmatrix} U_n^* A_n U_n \begin{bmatrix} 1 & 0 \\ 0 & Q_{n-1} \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & Q_{n-1}^* \end{bmatrix}\begin{bmatrix} \lambda & R \\ 0 & A_{n-1} \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 0 & Q_{n-1} \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & Q_{n-1}^* \end{bmatrix}\begin{bmatrix} \lambda & R Q_{n-1} \\ 0 & A_{n-1} Q_{n-1} \end{bmatrix} = \begin{bmatrix} \lambda & R Q_{n-1} \\ 0 & Q_{n-1}^* A_{n-1} Q_{n-1} \end{bmatrix}$$
is upper triangular. $\square$
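In practice one rarely builds this recursion by hand; the sketch below (our own illustration, assuming SciPy is available) uses `scipy.linalg.schur` to compute a Schur form $A = QTQ^*$ and checks the properties asserted in the lemma.

```python
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))

# Complex Schur form: A = Q T Q*, with Q unitary and T upper triangular.
T, Q = schur(A, output='complex')

assert np.allclose(Q.conj().T @ Q, np.eye(5))   # Q is unitary
assert np.allclose(np.tril(T, -1), 0)           # T is upper triangular
assert np.allclose(Q @ T @ Q.conj().T, A)       # A = Q T Q*
```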
Theorem 4.6 (spectral theorem: Hermitian matrices). The eigenvalues of an $n\times n$ Hermitian matrix $A$ (so $A^* = A$) are real, and the eigenvectors form an orthonormal basis in $\mathbb{C}^n$.
Proof. Let $Q$ be a unitary matrix such that $Q^*AQ$ is upper triangular. Since
$$(Q^*AQ)^* = Q^*A^*Q = Q^*AQ,$$
the upper triangular matrix $D = Q^*AQ$ is also lower triangular and hence in fact diagonal, say $D = \operatorname{diag}(\lambda_1, \dots, \lambda_n)$. Note that then
$$AQ = QD.$$
If $Q = \begin{bmatrix} u_1 & \cdots & u_n \end{bmatrix}$, this equation is equivalent to
$$\begin{bmatrix} Au_1 & \cdots & Au_n \end{bmatrix} = \begin{bmatrix} \lambda_1 u_1 & \cdots & \lambda_n u_n \end{bmatrix},$$
or
$$Au_j = \lambda_j u_j \qquad (j = 1, \dots, n).$$
Hence the eigenvectors of $A$ form an orthonormal basis in $\mathbb{C}^n$. Finally, since $D^* = D$, the diagonal entries $\lambda_1, \dots, \lambda_n$, i.e. the eigenvalues of $A$, are real. $\square$

We next compute the induced $2$-norm and show that $\|A\|_2 = \sqrt{\rho(A^*A)}$, where $\rho$ denotes the spectral radius. Since $A^*A$ is Hermitian and $\langle A^*Ax, x\rangle = \|Ax\|_2^2 \ge 0$ for every $x$, we deduce that the eigenvalues are all nonnegative. Denote the eigenvectors by $u_j$ and eigenvalues by $\mu_j^2$ (with $\mu_j \ge 0$):
$$A^*A u_j = \mu_j^2 u_j.$$
Given $x \in \mathbb{C}^n$, write it as
$$x = \sum_{j=1}^{n} \alpha_j u_j.$$
Then we compute
$$\|x\|_2^2 = \langle x, x\rangle = \Big\langle \sum_{j=1}^{n}\alpha_j u_j,\ \sum_{k=1}^{n}\alpha_k u_k\Big\rangle = \sum_{j=1}^{n} |\alpha_j|^2$$
and
$$\|Ax\|_2^2 = \langle Ax, Ax\rangle = \langle x, A^*Ax\rangle = \Big\langle \sum_{j=1}^{n}\alpha_j u_j,\ \sum_{k=1}^{n}\mu_k^2\alpha_k u_k\Big\rangle = \sum_{j=1}^{n}\mu_j^2|\alpha_j|^2 \le \rho(A^*A)\sum_{j=1}^{n}|\alpha_j|^2 = \rho(A^*A)\,\|x\|_2^2,$$
so that
$$\|A\|_2 \le \sqrt{\rho(A^*A)}.$$
Conversely, choose $j$ with
$$\mu_j^2 = \rho(A^*A).$$
Then
$$\|A\|_2^2 = \sup_{\|x\|_2=1}\|Ax\|_2^2 \ge \|Au_j\|_2^2 = \langle Au_j, Au_j\rangle = \langle u_j, A^*Au_j\rangle = \mu_j^2 = \rho(A^*A),$$
whence $\|A\|_2 = \sqrt{\rho(A^*A)}$.
In particular, if $A$ is Hermitian, note that
$$Q^*A^*AQ = Q^*A^*Q\,Q^*AQ = (Q^*AQ)^*\,Q^*AQ = D^*D = D^2,$$
so that
• the diagonal entries of $D$ are the eigenvalues of $A$, and
• the diagonal entries of $D^2$ are the eigenvalues of $A^*A$.
This gives
$$\rho(A^*A) = \rho(A)^2,$$
hence $\|A\|_2 = \rho(A)$, and we are done. $\square$
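A short numerical check of both identities (the example matrices below are our own; `numpy.linalg.norm(A, 2)` returns the largest singular value, i.e. the induced $2$-norm):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))

# ||A||_2 = sqrt(rho(A^T A)) for a general (real) matrix A.
rho_AtA = np.abs(np.linalg.eigvals(A.T @ A)).max()
assert np.isclose(np.linalg.norm(A, 2), np.sqrt(rho_AtA))

# For a Hermitian (here: real symmetric) matrix, ||A||_2 = rho(A).
S = A + A.T
assert np.isclose(np.linalg.norm(S, 2), np.abs(np.linalg.eigvals(S)).max())
```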
Theorem 4.8. Let $A \in \mathbb{F}^{n\times n}$.
(i) For every matrix norm induced from a vector norm,
$$\rho(A) \le \|A\|.$$
(ii) For every $\varepsilon > 0$ there exists a vector norm on $\mathbb{F}^n$ such that the induced matrix norm satisfies
$$\|A\| \le \rho(A) + \varepsilon.$$
Proof.
(i) Let $\lambda$ be an eigenvalue of $A$ with an associated eigenvector $u$. Normalizing $u$ if necessary, we may assume that $\|u\| = 1$. Then
$$\|A\| = \sup_{\|x\|=1}\|Ax\| \ge \|Au\| = \|\lambda u\| = |\lambda|\,\|u\| = |\lambda|.$$
Since $\lambda$ was an arbitrary eigenvalue, $\rho(A) \le \|A\|$.
(ii) Let $\varepsilon > 0$. By Schur's lemma there is a unitary matrix $Q$ such that $B := Q^*AQ$ is upper triangular; its diagonal entries are the eigenvalues of $A$, and we set $b := \max_{i<j}|b_{ij}|$. For $0 < \delta \le 1$ let $D := \operatorname{diag}(1, \delta, \delta^2, \dots, \delta^{n-1})$, so that the entries of $C := D^{-1}BD$ satisfy $c_{ij} = b_{ij}\,\delta^{j-i}$; in particular $\|C\|_\infty \le \rho(A) + (n-1)\,\delta\, b$. We now set
$$V := QD$$
and define a vector norm $\|\cdot\|_*$ on $\mathbb{C}^n$ by setting
$$\|x\|_* := \|V^{-1}x\|_\infty, \qquad x \in \mathbb{C}^n.$$
Finally we have
$$\|Ax\|_* = \|V^{-1}Ax\|_\infty = \|CV^{-1}x\|_\infty \le \|C\|_\infty\,\|V^{-1}x\|_\infty = \|C\|_\infty\,\|x\|_*,$$
where we have used that
$$Q^*AQ = B \ \Rightarrow\ V^{-1}AV = (D^{-1}Q^*)A(QD) = D^{-1}(Q^*AQ)D = D^{-1}BD = C \ \Rightarrow\ V^{-1}A = CV^{-1}.$$
Accordingly
$$\|A\|_* = \sup_{\|x\|_*=1}\|Ax\|_* \le \|C\|_\infty \le \rho(A) + (n-1)\,\delta\, b \le \rho(A) + \varepsilon,$$
provided $\delta$ is chosen so small that $(n-1)\,\delta\, b \le \varepsilon$. $\square$
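The proof is constructive, and the construction is easy to carry out numerically. The sketch below is our own illustration (the function name `epsilon_norm_factor` and the test matrix are ours, and SciPy's Schur routine is assumed): it builds $V = QD$ and verifies that the induced norm $\|V^{-1}AV\|_\infty$ is at most $\rho(A) + \varepsilon$.

```python
import numpy as np
from scipy.linalg import schur

def epsilon_norm_factor(A, eps):
    """Return V such that the norm x -> ||V^{-1} x||_inf induces ||A|| <= rho(A) + eps
    (a sketch of the scaling construction from the proof)."""
    B, Q = schur(A.astype(complex), output='complex')   # Q* A Q = B upper triangular
    n = A.shape[0]
    b = np.abs(np.triu(B, 1)).max() if n > 1 else 0.0
    delta = 1.0 if b == 0.0 else min(1.0, eps / ((n - 1) * b))
    D = np.diag(delta ** np.arange(n))
    return Q @ D

A = np.array([[0.5, 10.0],
              [0.0,  0.4]])           # rho(A) = 0.5, but ||A||_inf = 10.5
eps = 1e-2
V = epsilon_norm_factor(A, eps)
C = np.linalg.inv(V) @ A @ V          # = D^{-1} B D
rho = np.abs(np.linalg.eigvals(A)).max()
assert np.linalg.norm(C, np.inf) <= rho + eps + 1e-12
```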
4.2 Contraction mappings

Recall that a map $G : \mathbb{F}^n \to \mathbb{F}^n$ is a contraction with contraction constant $q \in [0, 1)$ provided $\|G(x) - G(y)\| \le q\,\|x - y\|$ for all $x, y \in \mathbb{F}^n$.

Theorem 4.10 (Banach fixed point theorem). Suppose that $G : \mathbb{F}^n \to \mathbb{F}^n$ is a contraction with contraction constant $q$. Then:
(i) $G$ has a unique fixed point, say $x$.
(ii) For any $x_0 \in \mathbb{F}^n$, the sequence of successive approximations
$$x_{\nu+1} = G(x_\nu), \qquad \nu = 0, 1, 2, \dots$$
converges to this unique fixed point. Moreover, we have the a priori error estimate
$$\|x_\nu - x\| \le \frac{q^\nu}{1-q}\,\|x_1 - x_0\|,$$
and the a posteriori error estimate
$$\|x_\nu - x\| \le \frac{q}{1-q}\,\|x_\nu - x_{\nu-1}\|$$
for all $\nu \ge 1$.
Proof. Let $x_0 \in \mathbb{F}^n$ be arbitrary, and define the sequence
$$x_{\nu+1} = G(x_\nu), \qquad \nu = 0, 1, 2, \dots$$
Then
$$\|x_{\nu+1} - x_\nu\| = \|G(x_\nu) - G(x_{\nu-1})\| \le q\,\|x_\nu - x_{\nu-1}\| \le \dots \le q^\nu\,\|x_1 - x_0\|, \qquad \nu = 1, 2, \dots$$
Hence for $\mu > \nu$, we have
$$\|x_\mu - x_\nu\| \le \|x_\mu - x_{\mu-1}\| + \dots + \|x_{\nu+1} - x_\nu\| \le (q^{\mu-1} + \dots + q^{\nu})\,\|x_1 - x_0\| = q^{\nu}(1 + \dots + q^{\mu-\nu-1})\,\|x_1 - x_0\| \le \frac{q^\nu}{1-q}\,\|x_1 - x_0\|. \qquad (4.1)$$
This shows that $\{x_\nu\}$ is a Cauchy sequence in $\mathbb{F}^n$. Since $\mathbb{F}^n$ is complete, $\{x_\nu\}$ converges to some $x \in \mathbb{F}^n$. This is a fixed point since
$$x = \lim_{\nu\to\infty} x_\nu = \lim_{\nu\to\infty} G(x_{\nu-1}) = G\big(\lim_{\nu\to\infty} x_{\nu-1}\big) = G(x),$$
where we have used that $G$ is continuous. The uniqueness of $x$ follows from the preceding theorem. Letting $\mu \to \infty$ in (4.1) yields the a priori estimate. The a posteriori estimate is a consequence of the a priori estimate applied to the sequence $\{z_\mu\}$ defined as $z_0 = x_\nu$ and $z_{\mu+1} = G(z_\mu)$ for $\mu \ge 0$. $\square$
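The a posteriori estimate is what one typically uses as a stopping criterion in practice. A minimal sketch (our own; the function name `fixed_point`, the tolerance, and the example map are ours):

```python
import numpy as np

def fixed_point(G, x0, q, tol=1e-10, max_iter=1000):
    """Successive approximations x_{nu+1} = G(x_nu) for a contraction G with constant q < 1.
    Stops when the a posteriori bound q/(1-q) * ||x_{nu+1} - x_nu|| drops below tol."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        x_new = G(x)
        if q / (1.0 - q) * np.linalg.norm(x_new - x) <= tol:
            return x_new
        x = x_new
    return x

# Example: G(x) = x/2 + 1 is a contraction with q = 1/2 and fixed point x = (2, ..., 2).
x_star = fixed_point(lambda x: 0.5 * x + 1.0, np.zeros(3), q=0.5)
assert np.allclose(x_star, 2.0)
```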
Theorem 4.11 (successive approximations). Let $I$ be the $n\times n$ identity matrix and $B \in \mathbb{F}^{n\times n}$. Suppose that $\|B\| < 1$ for some matrix norm induced from a vector norm $\|\cdot\|$ on $\mathbb{F}^n$. Then:
(i) $I - B$ is invertible, i.e. $\forall z \in \mathbb{F}^n\ \exists!\ x \in \mathbb{F}^n : x - Bx = z$.
(ii) $\|(I - B)^{-1}\| \le \dfrac{1}{1 - \|B\|}$.
Writing $x_\nu := \sum_{k=0}^{\nu} B^k z$ for the partial sums of the Neumann series, we have
$$\|x_\nu\| = \Big\|\sum_{k=0}^{\nu} B^k z\Big\| \le \sum_{k=0}^{\nu} \|B^k z\| \le \sum_{k=0}^{\nu} \|B\|^k\,\|z\| \le \frac{\|z\|}{1 - \|B\|}.$$
Since $\lim_{\nu\to\infty} x_\nu = (I - B)^{-1} z$, it follows that
$$\|(I - B)^{-1} z\| \le \frac{\|z\|}{1 - \|B\|} \qquad \text{for all } z \in \mathbb{F}^n.$$
Thus the result follows. $\square$
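Numerically, the partial sums of the Neumann series converge to $(I-B)^{-1}z$ whenever $\|B\| < 1$; a small check (the matrix and right-hand side below are our own example):

```python
import numpy as np

B = np.array([[0.2, 0.1],
              [0.0, 0.3]])
assert np.linalg.norm(B, np.inf) < 1          # hypothesis of the theorem

z = np.array([1.0, 0.5])
term = z.copy()
partial = z.copy()                            # x_nu = sum_{k=0}^{nu} B^k z
for _ in range(200):
    term = B @ term                           # next term B^k z
    partial += term

x_exact = np.linalg.solve(np.eye(2) - B, z)
assert np.allclose(partial, x_exact)
# Bound from the proof: ||(I - B)^{-1} z|| <= ||z|| / (1 - ||B||).
assert np.linalg.norm(x_exact, np.inf) <= np.linalg.norm(z, np.inf) / (1 - np.linalg.norm(B, np.inf))
```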
The successive approximations $x_{\nu+1} = Bx_\nu + z$ converge for every initial guess $x_0 \in \mathbb{F}^n$ and every $z \in \mathbb{F}^n$ if and only if $\rho(B) < 1$.

Proof. If $\rho(B) \ge 1$, then there exists an eigenvalue $\lambda$ of $B$ with $|\lambda| \ge 1$. Let $x$ be an associated eigenvector, and set
$$z = x = x_0.$$
Then the sequence of successive approximations
$$x_{\nu+1} = Bx_\nu + z$$
diverges since
$$x_\nu = \sum_{k=0}^{\nu} B^k z = \Big(\sum_{k=0}^{\nu} \lambda^k\Big) z,$$
and the partial sums $\sum_{k=0}^{\nu}\lambda^k$ do not converge when $|\lambda| \ge 1$.
Conversely, if $\rho(B) < 1$, then by Theorem 4.8 there exists a vector norm $\|\cdot\|$ on $\mathbb{F}^n$ such that, for the induced matrix norm on $\mathbb{F}^{n\times n}$, there holds
$$\|B\| < 1.$$
Thus the result follows from Theorem 4.11 on successive approximations. $\square$
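Note that it is the spectral radius, not any particular norm, that decides convergence: the matrix $B$ in the sketch below (our own example) has $\|B\|_\infty = 2.5 > 1$ yet $\rho(B) = 0.5$, and the iteration still converges.

```python
import numpy as np

B = np.array([[0.5, 2.0],
              [0.0, 0.5]])
z = np.array([1.0, 1.0])

assert np.linalg.norm(B, np.inf) > 1              # some induced norms exceed 1 ...
assert np.abs(np.linalg.eigvals(B)).max() < 1     # ... but rho(B) < 1

x = np.zeros(2)
for _ in range(200):
    x = B @ x + z                                  # successive approximations

assert np.allclose(x, np.linalg.solve(np.eye(2) - B, z))
```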
4.3 Iterative methods for solving linear systems

Remark 4.5. We will discuss several iterative methods for the solution of $Ax = b$, where we assume that $A$ has non-zero diagonal elements. These methods are based on the splitting
$$A = A_L + D + A_R$$
of Remark 4.1, where $A_L$ is the strictly lower triangular part of $A$, $D$ its diagonal part, and $A_R$ its strictly upper triangular part.
Algorithm 4.3 (Jacobi iteration). Given an initial guess $x_0$, the Jacobi iteration reads
$$x_{\nu+1} = G_J(x_\nu) = \underbrace{-D^{-1}(A_L + A_R)}_{T_J}\,x_\nu + D^{-1}b, \qquad \nu = 0, 1, 2, \dots$$
where $T_J$ is called the Jacobi iteration matrix.
Indeed,
$$Ax = b \iff (A_L + D + A_R)\,x = b \iff Dx = -(A_L + A_R)\,x + b \iff x = -D^{-1}(A_L + A_R)\,x + D^{-1}b =: G_J(x),$$
so that the exact solution of the linear system $Ax = b$ is the unique fixed point of the function $G_J$.
Note also that
$$G_J(x) = \underbrace{-D^{-1}(A_L + A_R)}_{T_J}\,x + D^{-1}b = \underbrace{-D^{-1}(A - D)}_{T_J}\,x + D^{-1}b = \underbrace{\big(I - D^{-1}A\big)}_{T_J}\,x + D^{-1}b.$$
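A minimal componentwise implementation (our own sketch; the function name `jacobi`, the iteration count, and the test system are ours, and a nonzero diagonal is assumed):

```python
import numpy as np

def jacobi(A, b, x0, num_iter=50):
    """Jacobi iteration x_{nu+1} = -D^{-1}(A_L + A_R) x_nu + D^{-1} b,
    written componentwise using the off-diagonal part R = A_L + A_R."""
    d = np.diag(A)                  # diagonal of A (assumed nonzero)
    R = A - np.diag(d)              # A_L + A_R
    x = np.asarray(x0, dtype=float)
    for _ in range(num_iter):
        x = (b - R @ x) / d
    return x

# Strictly diagonally dominant test system (so the Jacobi iteration converges).
A = np.array([[ 4.0, -1.0,  0.0],
              [-1.0,  4.0, -1.0],
              [ 0.0, -1.0,  4.0]])
b = np.array([1.0, 2.0, 3.0])
x = jacobi(A, b, np.zeros(3))
assert np.allclose(A @ x, b)
```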
Algorithm 4.4 (Jacobi iteration with relaxation). Given an initial guess $x_0$, the Jacobi iteration with a positive relaxation parameter $\omega$ is defined as
$$x_{\nu+1} = G_{J,\omega}(x_\nu) = \underbrace{\big(I - \omega D^{-1}A\big)}_{T_{J,\omega}}\,x_\nu + \omega D^{-1}b, \qquad \nu = 0, 1, 2, \dots$$
where $T_{J,\omega}$ is called the Jacobi iteration matrix with relaxation parameter $\omega$.
The exact solution of the linear system $Ax = b$ is the unique fixed point of the function
$$G_{J,\omega}(x) = \underbrace{\big(I - \omega D^{-1}A\big)}_{T_{J,\omega}}\,x + \omega D^{-1}b$$
since
$$G_{J,\omega}(x) = x \iff x = \big(I - \omega D^{-1}A\big)x + \omega D^{-1}b \iff 0 = -\omega D^{-1}Ax + \omega D^{-1}b \iff 0 = -\omega D^{-1}(Ax - b) \iff Ax = b.$$
The Jacobi iteration with relaxation parameter $\omega = 1$ clearly coincides with the standard Jacobi iteration.
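In residual form the update is simply $x_{\nu+1} = x_\nu + \omega D^{-1}(b - Ax_\nu)$, which makes the relaxed variant a one-line change (our own sketch; the name `jacobi_relaxed` and the choice $\omega = 0.8$ are ours):

```python
import numpy as np

def jacobi_relaxed(A, b, x0, omega=1.0, num_iter=50):
    """Relaxed Jacobi iteration x_{nu+1} = (I - omega D^{-1} A) x_nu + omega D^{-1} b,
    written in residual form; omega = 1 recovers the standard Jacobi iteration."""
    d = np.diag(A)
    x = np.asarray(x0, dtype=float)
    for _ in range(num_iter):
        x = x + omega * (b - A @ x) / d
    return x

A = np.array([[ 4.0, -1.0],
              [-1.0,  4.0]])
b = np.array([1.0, 2.0])
assert np.allclose(jacobi_relaxed(A, b, np.zeros(2), omega=0.8), np.linalg.solve(A, b))
```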
Algorithm 4.5 (Gauss-Seidel iteration). Given an initial guess $x_0$, the Gauss-Seidel iteration reads
$$x_{\nu+1} = G_{GS}(x_\nu) = \underbrace{-(A_L + D)^{-1}A_R}_{T_{GS}}\,x_\nu + (A_L + D)^{-1}b, \qquad \nu = 0, 1, 2, \dots$$
where $T_{GS}$ is called the Gauss-Seidel iteration matrix.
Indeed,
$$Ax = b \iff (A_L + D + A_R)\,x = b \iff (A_L + D)\,x = -A_R\,x + b \iff x = -(A_L + D)^{-1}A_R\,x + (A_L + D)^{-1}b =: G_{GS}(x),$$
so that the exact solution of the linear system $Ax = b$ is the unique fixed point of the function $G_{GS}$.
As opposed to the Jacobi iteration, the Gauss-Seidel iteration is implicit as it requires the solution of a (lower triangular) linear system
$$(A_L + D)\,x_{\nu+1} = -A_R\,x_\nu + b, \qquad \nu = 0, 1, 2, \dots$$
at each iteration.
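Because the system matrix $A_L + D$ is lower triangular, each step costs only a forward substitution. A sketch (our own; the name `gauss_seidel` and the test system are ours, and SciPy's triangular solver is assumed):

```python
import numpy as np
from scipy.linalg import solve_triangular

def gauss_seidel(A, b, x0, num_iter=50):
    """Gauss-Seidel iteration: solve (A_L + D) x_{nu+1} = -A_R x_nu + b at every step."""
    L_plus_D = np.tril(A)           # A_L + D (lower triangular part, diagonal included)
    A_R = np.triu(A, 1)             # strictly upper triangular part
    x = np.asarray(x0, dtype=float)
    for _ in range(num_iter):
        x = solve_triangular(L_plus_D, b - A_R @ x, lower=True)
    return x

A = np.array([[ 4.0, -1.0,  0.0],
              [-1.0,  4.0, -1.0],
              [ 0.0, -1.0,  4.0]])
b = np.array([1.0, 2.0, 3.0])
assert np.allclose(A @ gauss_seidel(A, b, np.zeros(3)), b)
```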
Algorithm 4.6 (Gauss-Seidel iteration with relaxation, or the successive over-relaxation method (SOR)). Given an initial guess $x_0$, the Gauss-Seidel iteration with a positive relaxation parameter $\omega$ (the Successive Over-Relaxation method, SOR) reads
$$x_{\nu+1} = G_{GS,\omega}(x_\nu) = \underbrace{(\omega A_L + D)^{-1}\big[(1-\omega)D - \omega A_R\big]}_{T_{GS,\omega}}\,x_\nu + \omega(\omega A_L + D)^{-1}b, \qquad \nu = 0, 1, 2, \dots$$
where $T_{GS,\omega}$ is called the Gauss-Seidel iteration matrix with relaxation parameter $\omega$.
The exact solution of the linear system $Ax = b$ is the unique fixed point of the function
$$G_{GS,\omega}(x) = \underbrace{(\omega A_L + D)^{-1}\big[(1-\omega)D - \omega A_R\big]}_{T_{GS,\omega}}\,x + \omega(\omega A_L + D)^{-1}b$$
since
$$G_{GS,\omega}(x) = x \iff x = (\omega A_L + D)^{-1}\big[(1-\omega)D - \omega A_R\big]x + \omega(\omega A_L + D)^{-1}b \iff (\omega A_L + D)\,x = \big[(1-\omega)D - \omega A_R\big]x + \omega b \iff \omega A_L\,x = -\omega(D + A_R)\,x + \omega b \iff \omega(A_L + D + A_R)\,x = \omega b \iff \omega Ax = \omega b \iff Ax = b.$$
Just like the standard Gauss-Seidel iteration, the SOR method is implicit as it requires the solution of the (lower triangular) linear system
$$(\omega A_L + D)\,x_{\nu+1} = \big[(1-\omega)D - \omega A_R\big]x_\nu + \omega b, \qquad \nu = 0, 1, 2, \dots$$
at each iteration.
When $\omega = 1$, we have
$$G_{GS,1}(x) = -(A_L + D)^{-1}A_R\,x + (A_L + D)^{-1}b = G_{GS}(x);$$
that is, the SOR method with $\omega = 1$ coincides with the standard Gauss-Seidel iteration.
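A sketch of the SOR step, again implemented as a forward substitution (our own; the name `sor`, the choice $\omega = 1.1$, and the test system are ours):

```python
import numpy as np
from scipy.linalg import solve_triangular

def sor(A, b, x0, omega=1.0, num_iter=100):
    """SOR iteration: solve (omega*A_L + D) x_{nu+1} = [(1-omega)D - omega*A_R] x_nu + omega*b
    at every step; omega = 1 recovers the Gauss-Seidel iteration."""
    D = np.diag(np.diag(A))
    A_L = np.tril(A, -1)
    A_R = np.triu(A, 1)
    M = omega * A_L + D                              # lower triangular
    x = np.asarray(x0, dtype=float)
    for _ in range(num_iter):
        rhs = ((1.0 - omega) * D - omega * A_R) @ x + omega * b
        x = solve_triangular(M, rhs, lower=True)
    return x

A = np.array([[ 4.0, -1.0,  0.0],
              [-1.0,  4.0, -1.0],
              [ 0.0, -1.0,  4.0]])
b = np.array([1.0, 2.0, 3.0])
assert np.allclose(A @ sor(A, b, np.zeros(3), omega=1.1), b)
```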