
MA423 Matrix Computations

Lecture 9: Cholesky Factorization

Rafikul Alam
Department of Mathematics
IIT Guwahati

R. Alam, IIT Guwahati (July-Nov 2023) MA423 MC


Outline

• Characterization of positive definite matrices


• Cholesky factorization



Quadratic forms
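A pure quadratic $f(x, y)$ comes directly from a symmetric 2 by 2 matrix!

$$x^\top A x \text{ in } \mathbb{R}^2: \quad ax^2 + 2bxy + cy^2 = \begin{bmatrix} x & y \end{bmatrix} \begin{bmatrix} a & b \\ b & c \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}.$$

For any symmetric matrix $A$, the product $x^\top A x$ is a pure quadratic form $f(x_1, \dots, x_n)$:

$$x^\top A x \text{ in } \mathbb{R}^n: \quad \begin{bmatrix} x_1 & \cdots & x_n \end{bmatrix} \begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{n1} & \cdots & a_{nn} \end{bmatrix} \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} x_i x_j.$$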
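
As a small illustration of the identity above, the following Python sketch evaluates $x^\top A x$ as the double sum and compares it with $ax^2 + 2bxy + cy^2$. The function name `quadratic_form` and the numerical values of `a`, `b`, `c`, `x`, `y` are illustrative assumptions, not part of the lecture.

```python
def quadratic_form(A, x):
    """Return x^T A x computed as the double sum of a_ij * x_i * x_j."""
    n = len(x)
    return sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

# Illustrative values (assumed): a symmetric 2 by 2 matrix and a test point.
a, b, c = 2.0, 1.0, 3.0
A = [[a, b], [b, c]]
x, y = 1.5, -2.0

lhs = quadratic_form(A, [x, y])                 # matrix form
rhs = a * x**2 + 2 * b * x * y + c * y**2       # expanded quadratic
```

Both `lhs` and `rhs` should agree up to rounding for any choice of the values.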



Quadratic forms

Let $f : \mathbb{R}^2 \to \mathbb{R}$ be a smooth function and $p \in \mathbb{R}^2$. Then

$$f(p + h) = f(p) + \nabla f(p) \cdot h + \frac{1}{2} h^\top H_f(p) h + O(\|h\|_2^3),$$

where the symmetric matrix
$$H_f(p) := \begin{bmatrix} f_{xx}(p) & f_{xy}(p) \\ f_{xy}(p) & f_{yy}(p) \end{bmatrix}$$
is the Hessian of $f$ at $p$.

Thus, if $p$ is a critical point (that is, $\nabla f(p) = 0$), then $f$ has a local minimum or maximum at $p$ according as the quadratic form $x^\top H_f(p) x$ is positive or negative for all nonzero $x \in \mathbb{R}^2$.

On the other hand, $f$ has a saddle point at $p$ if $x^\top H_f(p) x$ takes both positive and negative values.
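
The classification above can be sketched in Python for the 2 by 2 case. The sketch relies on a standard fact not stated on the slide: a symmetric 2 by 2 matrix is positive definite exactly when its (1,1) entry and its determinant are both positive, and it is indefinite when its determinant is negative. The function name `classify_critical_point` is a hypothetical helper.

```python
def classify_critical_point(fxx, fxy, fyy):
    """Classify a critical point from the entries of the 2 by 2 Hessian."""
    det = fxx * fyy - fxy**2
    if det > 0 and fxx > 0:
        return "local minimum"      # Hessian is positive definite
    if det > 0 and fxx < 0:
        return "local maximum"      # Hessian is negative definite
    if det < 0:
        return "saddle point"       # quadratic form takes both signs
    return "inconclusive"           # semidefinite case: second-order test fails

# f(x, y) = x^2 + y^2 at the origin has Hessian [[2, 0], [0, 2]].
kind_bowl = classify_critical_point(2.0, 0.0, 2.0)
# f(x, y) = x^2 - y^2 at the origin has Hessian [[2, 0], [0, -2]].
kind_saddle = classify_critical_point(2.0, 0.0, -2.0)
```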



Positive definite matrices
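A symmetric matrix $A \in \mathbb{R}^{n \times n}$ is said to be
• positive semidefinite if $x^\top A x \ge 0$ for all $x \in \mathbb{R}^n$ (written as $A \succeq 0$)
• positive definite if $x^\top A x > 0$ for all nonzero $x \in \mathbb{R}^n$ (written as $A \succ 0$)

A matrix $A \in \mathbb{C}^{n \times n}$ is said to be
• positive semidefinite if $x^* A x \ge 0$ for all $x \in \mathbb{C}^n$ (written as $A \succeq 0$)
• positive definite if $x^* A x > 0$ for all nonzero $x \in \mathbb{C}^n$ (written as $A \succ 0$)

A real positive definite matrix is also referred to as a symmetric positive definite (SPD) matrix.

Remark: Let $A \in \mathbb{C}^{n \times n}$. Then $x^* A x \in \mathbb{R}$ for all $x \in \mathbb{C}^n \iff A = A^*$.

But $A \in \mathbb{R}^{n \times n}$ and $x^\top A x \in \mathbb{R}$ for all $x \in \mathbb{R}^n$ $\not\Rightarrow$ $A = A^\top$.

Indeed, if $A = \begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix}$ then $x^\top A x = (x_1 + x_2)^2 \ge 0$ for all $x \in \mathbb{R}^2$ but $A \ne A^\top$.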



Positive definite matrices
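If $A \in \mathbb{R}^{n \times n}$ is partitioned in the form
$$A = \begin{bmatrix} A_m & B \\ C & D \end{bmatrix}, \quad A_m \in \mathbb{R}^{m \times m},$$
then $A_m$ is called a principal submatrix of $A$. Note that

$$A^\top = A \iff A_m^\top = A_m, \quad C = B^\top, \quad D^\top = D.$$

It follows that if $A$ is SPD then so is $A_m$. Indeed, for any nonzero $x \in \mathbb{R}^m$, we have

$$x^\top A_m x = \begin{bmatrix} x \\ 0 \end{bmatrix}^\top \begin{bmatrix} A_m & B \\ C & D \end{bmatrix} \begin{bmatrix} x \\ 0 \end{bmatrix} > 0.$$

In particular, if $A$ is SPD then $a_{jj} = e_j^\top A e_j > 0$ for $j = 1:n$. Also, $A$ is nonsingular (why?).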



Positive definite matrices
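Facts: Let $A \in \mathbb{R}^{n \times n}$ be an SPD matrix. Then the following results hold:

1. If $X \in \mathbb{R}^{n \times p}$ with $\mathrm{rank}(X) = p$ then $X^\top A X$ is SPD. Indeed, for all nonzero $y \in \mathbb{R}^p$,

$$Xy \ne 0 \ \text{(why?)} \quad \text{and} \quad y^\top (X^\top A X) y = (Xy)^\top A (Xy) > 0 \implies X^\top A X \text{ is SPD}.$$

2. Leading principal submatrices of $A$ are SPD, that is, $A(1:j, 1:j)$ is SPD for $j = 1:n$.

3. Let $A = \begin{bmatrix} A_m & B^\top \\ B & D \end{bmatrix}$. Then $S := D - B A_m^{-1} B^\top$ is the Schur complement of $A_m$. Now

$$\begin{bmatrix} A_m & B^\top \\ B & D \end{bmatrix} = \begin{bmatrix} I & 0 \\ B A_m^{-1} & I \end{bmatrix} \begin{bmatrix} A_m & 0 \\ 0 & D - B A_m^{-1} B^\top \end{bmatrix} \begin{bmatrix} I & 0 \\ B A_m^{-1} & I \end{bmatrix}^\top$$

shows that
$$A \text{ is SPD} \iff A_m \text{ and } S := D - B A_m^{-1} B^\top \text{ are SPD}.$$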
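
The block factorization in Fact 3 can be checked by multiplying out the right-hand side. The following worked step is a sketch of that check; the symbol $M$ is introduced here for convenience and does not appear on the slide. Since $A_m$ is symmetric, $(B A_m^{-1})^\top = A_m^{-1} B^\top$, so

$$\begin{bmatrix} I & 0 \\ B A_m^{-1} & I \end{bmatrix} \begin{bmatrix} A_m & 0 \\ 0 & S \end{bmatrix} \begin{bmatrix} I & A_m^{-1} B^\top \\ 0 & I \end{bmatrix} = \begin{bmatrix} A_m & 0 \\ B & S \end{bmatrix} \begin{bmatrix} I & A_m^{-1} B^\top \\ 0 & I \end{bmatrix} = \begin{bmatrix} A_m & B^\top \\ B & B A_m^{-1} B^\top + S \end{bmatrix},$$

and $B A_m^{-1} B^\top + S = D$ by the definition of $S$. Writing $M$ for the unit lower triangular factor, we have $A = M \,\mathrm{diag}(A_m, S)\, M^\top$ with $M$ nonsingular. Fact 1 with $X = M^\top$ (and, for the converse, with $X = M^{-\top}$) then shows that $A$ is SPD exactly when $\mathrm{diag}(A_m, S)$ is SPD, which holds exactly when both diagonal blocks are SPD.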



LDV factorization

Theorem: Suppose that all leading principal submatrices of $A \in \mathbb{R}^{n \times n}$ are nonsingular. Then $A = LDV$ is a unique decomposition of $A$, where $L$ is unit lower triangular, $D$ is diagonal, and $V$ is unit upper triangular.

Proof: By assumption, $A$ has a unique LU factorization $A = LU$. Let $D := \mathrm{diag}(u_{11}, \dots, u_{nn})$, where $u_{11}, \dots, u_{nn}$ are the diagonal entries of $U$ (all nonzero, since $A$ is nonsingular). Then $V := D^{-1} U$ is unit upper triangular and $A = LDV$. $\blacksquare$

Corollary: If $A \in \mathbb{R}^{n \times n}$ is symmetric and all leading principal submatrices of $A$ are nonsingular then $A = LDL^\top$ is a unique factorization of $A$, where $L$ is unit lower triangular and $D$ is a diagonal matrix. (Indeed, $A = A^\top = V^\top D L^\top$, so uniqueness of the LDV factorization gives $V = L^\top$.)

Corollary: If $A$ is SPD then $A = LDL^\top$ is a unique factorization of $A$, where $L$ is unit lower triangular and $D$ is a diagonal SPD matrix.
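
A minimal Python sketch of how $L$ and $D$ in $A = LDL^\top$ might be computed for a symmetric matrix is given below. It is obtained by equating entries of $A$ and $LDL^\top$ column by column and is an illustration only; the function name `ldl_factor` and the test matrix are assumptions, and no pivoting is attempted.

```python
def ldl_factor(A):
    """Return (L, d) with A = L * diag(d) * L^T, L unit lower triangular.

    Assumes A is symmetric with nonsingular leading principal submatrices.
    """
    n = len(A)
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    d = [0.0] * n
    for j in range(n):
        # Diagonal entry of D from the (j, j) entry of A = L D L^T.
        d[j] = A[j][j] - sum(L[j][k] ** 2 * d[k] for k in range(j))
        if d[j] == 0.0:
            raise ZeroDivisionError("zero pivot: a leading principal submatrix is singular")
        # Entries of column j of L below the diagonal.
        for i in range(j + 1, n):
            L[i][j] = (A[i][j] - sum(L[i][k] * L[j][k] * d[k] for k in range(j))) / d[j]
    return L, d

# Illustrative SPD matrix (assumed for this sketch).
A = [[16.0, -16.0, 0.0], [-16.0, 41.0, -5.0], [0.0, -5.0, 5.0]]
L, d = ldl_factor(A)
```

For an SPD input, every entry of `d` should come out positive, in line with the second corollary.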



Cholesky factorization
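Theorem: Let $A \in \mathbb{R}^{n \times n}$ be nonsingular. Then $A$ is SPD $\iff$ $A = GG^\top$, where $G$ is a unique lower triangular matrix with positive diagonal entries.

Proof: $A = GG^\top \Rightarrow A^\top = A$ and $x^\top A x = x^\top G G^\top x = (G^\top x)^\top G^\top x = \|G^\top x\|_2^2 > 0$ for $x \ne 0$ (since $G$ is nonsingular) $\Rightarrow A$ is SPD.

$A$ is SPD $\Rightarrow A = LDL^\top$ is a unique factorization, where $L$ is unit lower triangular and $D$ is a diagonal SPD matrix. Let $D$ be given by $D = \mathrm{diag}(d_{11}, \dots, d_{nn})$. Since $d_{jj} > 0$, define $\sqrt{D} := \mathrm{diag}(\sqrt{d_{11}}, \dots, \sqrt{d_{nn}})$ and $G := L\sqrt{D}$. Then $A = L\sqrt{D}\,(L\sqrt{D})^\top = GG^\top$. $\blacksquare$

Definition: If $A$ is SPD then $A = GG^\top$, where $G$ is lower triangular with positive diagonal entries, is called the Cholesky factorization of $A$ and $G$ is called the Cholesky factor of $A$.

Example:
$$\begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 2 \\ 1 & 2 & 3 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 1 & 1 \end{bmatrix}^\top.$$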



Algorithm (inner product)

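Let $A := \begin{bmatrix} a_{11} & a_{21} \\ a_{21} & a_{22} \end{bmatrix}$ and $G := \begin{bmatrix} g_{11} & 0 \\ g_{21} & g_{22} \end{bmatrix}$. Then $A = GG^\top$ yields

$$\begin{bmatrix} a_{11} & a_{21} \\ a_{21} & a_{22} \end{bmatrix} = \begin{bmatrix} g_{11} & 0 \\ g_{21} & g_{22} \end{bmatrix} \begin{bmatrix} g_{11} & g_{21} \\ 0 & g_{22} \end{bmatrix} = \begin{bmatrix} g_{11}^2 & g_{11} g_{21} \\ g_{11} g_{21} & g_{21}^2 + g_{22}^2 \end{bmatrix}.$$

Equating the columns, we have

$$\begin{array}{lcl} a_{11} = g_{11}^2 & & g_{11} = \sqrt{a_{11}} \\ a_{21} = g_{11} g_{21} & \implies & g_{21} = a_{21}/g_{11} \\ a_{22} = g_{21}^2 + g_{22}^2 & & g_{22} = \sqrt{a_{22} - g_{21}^2} \end{array}$$

Remark: The factorization is possible if $a_{11} > 0$ and $a_{22} - g_{21}^2 > 0$.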



Algorithm (inner product)
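More generally, equating columns on both sides of $A = GG^\top$, we have

$$\begin{bmatrix} a_{11} \\ \vdots \\ a_{n1} \end{bmatrix} = g_{11} \begin{bmatrix} g_{11} \\ \vdots \\ g_{n1} \end{bmatrix}, \qquad \begin{bmatrix} a_{22} \\ \vdots \\ a_{n2} \end{bmatrix} = g_{21} \begin{bmatrix} g_{21} \\ \vdots \\ g_{n1} \end{bmatrix} + g_{22} \begin{bmatrix} g_{22} \\ \vdots \\ g_{n2} \end{bmatrix},$$

$$\begin{bmatrix} a_{jj} \\ \vdots \\ a_{nj} \end{bmatrix} = g_{j1} \begin{bmatrix} g_{j1} \\ \vdots \\ g_{n1} \end{bmatrix} + g_{j2} \begin{bmatrix} g_{j2} \\ \vdots \\ g_{n2} \end{bmatrix} + \cdots + g_{jj} \begin{bmatrix} g_{jj} \\ \vdots \\ g_{nj} \end{bmatrix}, \quad j = 1:n.$$

Algorithm (Inner product):

For $j = 1:n$
    $g_{jj} = \sqrt{a_{jj} - \sum_{k=1}^{j-1} g_{jk}^2}$
    $g_{ij} = \Big(a_{ij} - \sum_{k=1}^{j-1} g_{ik} g_{jk}\Big) / g_{jj}, \quad i = j+1:n$
end

Cost: $n^3/3$ flops, half the cost of Gaussian elimination (GE).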
Algorithm (inner product)
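Example: Consider $A = \begin{bmatrix} 16 & -16 & 0 \\ -16 & 41 & -5 \\ 0 & -5 & 5 \end{bmatrix}$. Then

$$g_{11} = \sqrt{a_{11}} = \sqrt{16} = 4, \qquad g_{21} = \frac{a_{21}}{g_{11}} = \frac{-16}{4} = -4, \qquad g_{31} = \frac{a_{31}}{g_{11}} = \frac{0}{4} = 0,$$

$$g_{22} = \sqrt{a_{22} - g_{21}^2} = \sqrt{41 - 16} = 5, \qquad g_{32} = \frac{a_{32} - g_{31} g_{21}}{g_{22}} = \frac{-5 - 0 \times (-4)}{5} = -1,$$

$$g_{33} = \sqrt{a_{33} - g_{31}^2 - g_{32}^2} = \sqrt{5 - 0 - 1} = 2.$$

Hence
$$\begin{bmatrix} 16 & -16 & 0 \\ -16 & 41 & -5 \\ 0 & -5 & 5 \end{bmatrix} = \begin{bmatrix} 4 & 0 & 0 \\ -4 & 5 & 0 \\ 0 & -1 & 2 \end{bmatrix} \begin{bmatrix} 4 & 0 & 0 \\ -4 & 5 & 0 \\ 0 & -1 & 2 \end{bmatrix}^\top.$$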
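
The inner-product algorithm, applied to the example above, can be sketched in plain Python as follows. This is a minimal illustration using lists of lists; the function name `cholesky_inner` and the positivity check are assumptions added for this sketch.

```python
import math

def cholesky_inner(A):
    """Inner-product Cholesky: return lower triangular G with A = G G^T."""
    n = len(A)
    G = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Diagonal entry: a_jj minus the squares already computed in row j.
        s = A[j][j] - sum(G[j][k] ** 2 for k in range(j))
        if s <= 0.0:
            raise ValueError("matrix is not positive definite")
        G[j][j] = math.sqrt(s)
        # Entries of column j below the diagonal.
        for i in range(j + 1, n):
            G[i][j] = (A[i][j] - sum(G[i][k] * G[j][k] for k in range(j))) / G[j][j]
    return G

A = [[16.0, -16.0, 0.0], [-16.0, 41.0, -5.0], [0.0, -5.0, 5.0]]
G = cholesky_inner(A)   # rows should match (4, 0, 0), (-4, 5, 0), (0, -1, 2)
```

The check `s <= 0.0` reflects the earlier remark that the factorization is possible only when each quantity under the square root is positive.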



Algorithm (outer product)
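Let $A \in \mathbb{R}^{n \times n}$ be SPD. Then $A = GG^\top$ can be written as
$$\begin{bmatrix} a_{11} & h^\top \\ h & \hat{A} \end{bmatrix} = \begin{bmatrix} g_{11} & 0 \\ g & \hat{G} \end{bmatrix} \begin{bmatrix} g_{11} & g^\top \\ 0 & \hat{G}^\top \end{bmatrix}.$$

Equating the blocks, we have

$$\begin{array}{lcl} a_{11} = g_{11}^2 & \implies & g_{11} = \sqrt{a_{11}} \\ h = g_{11}\, g & \implies & g = h/g_{11} \\ \hat{A} = g g^\top + \hat{G}\hat{G}^\top & \implies & \hat{A} - g g^\top = \hat{G}\hat{G}^\top \end{array}$$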

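for k = 1:n
    A(k,k) = sqrt(A(k,k));                    % g_kk = sqrt(a_kk)
    g = A(k+1:n,k)/A(k,k); A(k+1:n,k) = g;    % scale column k below the diagonal
    A(k+1:n,k+1:n) = A(k+1:n,k+1:n) - g*g';   % subtract the outer product g*g'
end
G = tril(A);   % the lower triangle of the overwritten A holds G
Cost: $n^3/3$ flops, half the cost of Gaussian elimination (GE).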
Example

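$$\begin{bmatrix} 25 & 15 & -5 \\ 15 & 18 & 0 \\ -5 & 0 & 11 \end{bmatrix} = \begin{bmatrix} g_{11} & 0 & 0 \\ g_{21} & g_{22} & 0 \\ g_{31} & g_{32} & g_{33} \end{bmatrix} \begin{bmatrix} g_{11} & g_{21} & g_{31} \\ 0 & g_{22} & g_{32} \\ 0 & 0 & g_{33} \end{bmatrix} = \begin{bmatrix} 5 & 0 & 0 \\ 3 & g_{22} & 0 \\ -1 & g_{32} & g_{33} \end{bmatrix} \begin{bmatrix} 5 & 3 & -1 \\ 0 & g_{22} & g_{32} \\ 0 & 0 & g_{33} \end{bmatrix}$$

Equating (2, 2) blocks, we have

$$\begin{bmatrix} 18 & 0 \\ 0 & 11 \end{bmatrix} - \begin{bmatrix} 3 \\ -1 \end{bmatrix} \begin{bmatrix} 3 & -1 \end{bmatrix} = \begin{bmatrix} 9 & 3 \\ 3 & 10 \end{bmatrix} = \begin{bmatrix} g_{22} & 0 \\ g_{32} & g_{33} \end{bmatrix} \begin{bmatrix} g_{22} & g_{32} \\ 0 & g_{33} \end{bmatrix} = \begin{bmatrix} 3 & 0 \\ 1 & g_{33} \end{bmatrix} \begin{bmatrix} 3 & 1 \\ 0 & g_{33} \end{bmatrix}$$

Equating the (2, 2) entry, we have $10 - 1 = g_{33}^2 \implies g_{33} = 3$.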
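
The outer-product algorithm, applied to this example, can be sketched in plain Python as follows. The function name `cholesky_outer` is an assumption, and unlike the MATLAB loop the sketch works on a copy and updates only the lower triangle of the trailing block, which is all that is needed because the trailing block stays symmetric.

```python
import math

def cholesky_outer(A):
    """Outer-product Cholesky: return lower triangular G with A = G G^T."""
    n = len(A)
    W = [row[:] for row in A]          # working copy; A is left unchanged
    for k in range(n):
        if W[k][k] <= 0.0:
            raise ValueError("matrix is not positive definite")
        W[k][k] = math.sqrt(W[k][k])   # g_kk = sqrt(a_kk)
        for i in range(k + 1, n):
            W[i][k] /= W[k][k]         # g = h / g_kk
        # Trailing block minus the outer product g g^T (lower triangle only).
        for i in range(k + 1, n):
            for j in range(k + 1, i + 1):
                W[i][j] -= W[i][k] * W[j][k]
    # Keep the lower triangle and zero out the rest.
    return [[W[i][j] if j <= i else 0.0 for j in range(n)] for i in range(n)]

A = [[25.0, 15.0, -5.0], [15.0, 18.0, 0.0], [-5.0, 0.0, 11.0]]
G = cholesky_outer(A)   # rows should match (5, 0, 0), (3, 3, 0), (-1, 1, 3)
```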



Solving SPD system
Let $A \in \mathbb{R}^{n \times n}$ be SPD and $b \in \mathbb{R}^n$. Then the system $Ax = b$ can be solved using Cholesky factorization as follows.

• Compute the Cholesky factorization $A = GG^\top$. Cost: $n^3/3$ flops.
• Solve the lower triangular system $Gy = b$. Cost: $n^2$ flops.
• Solve the upper triangular system $G^\top x = y$. Cost: $n^2$ flops.

The MATLAB command chol computes the Cholesky factorization of a positive definite matrix A. More specifically, the commands

R = chol(A) and L = chol(A,'lower')

compute an upper triangular matrix R and a lower triangular matrix L such that

$A = R^\top R$ and $A = LL^\top$.
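
The three steps (factor, forward substitution, back substitution) can be sketched in plain Python as follows. All function names are hypothetical helpers written for this illustration, and the test system is an assumption chosen so that the exact solution is $x = (1, 2, 3)$.

```python
import math

def cholesky(A):
    """Return lower triangular G with A = G G^T (inner-product form)."""
    n = len(A)
    G = [[0.0] * n for _ in range(n)]
    for j in range(n):
        G[j][j] = math.sqrt(A[j][j] - sum(G[j][k] ** 2 for k in range(j)))
        for i in range(j + 1, n):
            G[i][j] = (A[i][j] - sum(G[i][k] * G[j][k] for k in range(j))) / G[j][j]
    return G

def forward_substitution(G, b):
    """Solve G y = b for lower triangular G."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(G[i][k] * y[k] for k in range(i))) / G[i][i]
    return y

def back_substitution_transpose(G, y):
    """Solve G^T x = y, reading the entries of G^T directly from G."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(G[k][i] * x[k] for k in range(i + 1, n))) / G[i][i]
    return x

def solve_spd(A, b):
    """Solve A x = b for SPD A via A = G G^T."""
    G = cholesky(A)
    return back_substitution_transpose(G, forward_substitution(G, b))

A = [[25.0, 15.0, -5.0], [15.0, 18.0, 0.0], [-5.0, 0.0, 11.0]]
b = [40.0, 51.0, 28.0]          # equals A times (1, 2, 3)
x = solve_spd(A, b)
```

Once $G$ is known, each additional right-hand side costs only the two triangular solves.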



A direct proof of Cholesky factorization
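Problem: Let $A = \begin{bmatrix} a_{11} & h^\top \\ h & D \end{bmatrix}$, where $h \in \mathbb{R}^{n-1}$, be SPD. Then the Schur complement $S := D - hh^\top/a_{11}$ is SPD. Now use

$$\begin{aligned} \begin{bmatrix} a_{11} & h^\top \\ h & D \end{bmatrix} &= \begin{bmatrix} 1 & 0 \\ h/a_{11} & I_{n-1} \end{bmatrix} \begin{bmatrix} a_{11} & h^\top \\ 0 & D - hh^\top/a_{11} \end{bmatrix} \\ &= \begin{bmatrix} 1 & 0 \\ h/a_{11} & I_{n-1} \end{bmatrix} \begin{bmatrix} a_{11} & 0 \\ 0 & D - hh^\top/a_{11} \end{bmatrix} \begin{bmatrix} 1 & 0 \\ h/a_{11} & I_{n-1} \end{bmatrix}^\top \\ &= \begin{bmatrix} \sqrt{a_{11}} & 0 \\ h/\sqrt{a_{11}} & I_{n-1} \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & D - hh^\top/a_{11} \end{bmatrix} \begin{bmatrix} \sqrt{a_{11}} & 0 \\ h/\sqrt{a_{11}} & I_{n-1} \end{bmatrix}^\top \end{aligned}$$

and induction on $n$ to prove that the Cholesky factorization $A = GG^\top$ exists and is unique.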
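
The induction suggested above translates directly into a recursive procedure: factor the Schur complement $S = \hat{G}\hat{G}^\top$ and assemble $G$ from $\sqrt{a_{11}}$, $h/\sqrt{a_{11}}$ and $\hat{G}$. The following Python sketch illustrates the existence part of the argument only; the function name `cholesky_recursive` and the test matrix are assumptions.

```python
import math

def cholesky_recursive(A):
    """Recursive Cholesky via the Schur complement S = D - h h^T / a11."""
    n = len(A)
    a11 = A[0][0]
    if a11 <= 0.0:
        raise ValueError("matrix is not positive definite")
    r = math.sqrt(a11)
    if n == 1:
        return [[r]]                                  # base case: 1 by 1 matrix
    g = [A[i][0] / r for i in range(1, n)]            # h / sqrt(a11)
    # Schur complement of a11, an (n-1) by (n-1) matrix.
    S = [[A[i][j] - A[i][0] * A[j][0] / a11 for j in range(1, n)]
         for i in range(1, n)]
    G_hat = cholesky_recursive(S)                     # induction hypothesis
    # Assemble G = [[sqrt(a11), 0], [h / sqrt(a11), G_hat]].
    G = [[r] + [0.0] * (n - 1)]
    for i in range(n - 1):
        G.append([g[i]] + G_hat[i])
    return G

A = [[16.0, -16.0, 0.0], [-16.0, 41.0, -5.0], [0.0, -5.0, 5.0]]
G = cholesky_recursive(A)
```

If the Schur complement at some level has a nonpositive leading entry, the sketch raises an error, which corresponds to $A$ not being SPD.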
