
Sessions 6 - 8: Eigenvalues and eigenvectors

Topics: Eigenvalues and eigenvectors; Algebraic and geometric multiplicity;


Diagonalization; The case of symmetric matrices; Defective matrices and
generalized eigenvectors; Properties of eigenvalues and the Cayley-
Hamilton theorem; Functions of matrices, Properties of the matrix
exponential
MATLAB demo: Matrix operations, Eigenvalues and eigenvectors, Jordan
form of defective matrices, Matrix exponential

Introduction

A vector pre-multiplied by a matrix transforms to a vector which is a linear combination of its columns (Session 2). Certain special vectors, called eigenvectors, associated with a square matrix maintain the same or opposite orientation (i.e.; they become a scalar multiple of themselves) when pre-multiplied by it – i.e.; the nontrivial vectors 𝑋 such that 𝐴𝑋 = 𝜆𝑋, where 𝐴 is an 𝑛 × 𝑛 matrix, 𝑋 belongs to $R^n$ and 𝜆 is a scalar multiplier that scales 𝑋. The vectors 𝑋 are called the eigenvectors and the 𝜆's are called the eigenvalues.

Eigenvalues and eigenvectors

The eigenvalue problem 𝐴𝑋 = 𝜆𝑋 is equivalent to (𝐴 − 𝜆𝐼)𝑋 = 𝑂 (𝑂 is the null vector in $R^n$), a homogeneous system that admits the trivial solution 𝑋 = 𝑂 for any 𝜆. Nontrivial solutions require that |𝐴 − 𝜆𝐼| = 0 – an 𝑛th degree polynomial equation in 𝜆 (called the characteristic equation) with 𝑛 solutions, real or complex (complex eigenvalues occur as conjugate pairs for real matrices 𝐴). Henceforth, we assume 𝐴 to be real.

Associated with each distinct eigenvalue 𝜆 found above, one independent eigenvector 𝑋
exists as the nontrivial solution of the homogeneous system (𝐴 − 𝜆𝐼)𝑋 = 𝑂. i.e.; the
rank(𝐴 − 𝜆𝐼) is (𝑛 − 1) for each distinct 𝜆.

Consider an eigenvalue that is repeated 𝑚 times (called algebraic multiplicity 𝑚). Such
an eigenvalue may not admit 𝑚 independent eigenvectors (i.e.; rank(𝐴 − 𝜆𝐼) may be
higher than (𝑛 − 𝑚)). The number of independent eigenvectors 𝑞 (𝑞 ≤ 𝑚) associated with
a repeated eigenvalue is called its degeneracy (or geometric multiplicity). Degeneracy is
the dimension of the null space of 𝐴 − 𝜆𝐼; i.e.; 𝑞 = 𝑛 − rank(𝐴 − 𝜆𝐼).

𝑞 = 1 is called simple degeneracy while 𝑞 = 𝑚 is called full degeneracy. For example, a
thrice-repeated eigenvalue that admits two independent eigenvectors has degeneracy
𝑞 = 2; but if three independent eigenvectors exist for the same eigenvalue, it has full
degeneracy (𝑞 = 3). Hence a repeated eigenvalue may exhibit simple degeneracy (𝑞 = 1), partial degeneracy (1 < 𝑞 < 𝑚), or full degeneracy (𝑞 = 𝑚).

One method of determining eigenvectors is to solve (𝐴 − 𝜆𝐼)𝑋 = 𝑂 by reducing it to the row echelon form. Alternate methods include the adjoint method, Gram-Schmidt decomposition, Singular Value Decomposition (SVD) and numerical techniques. We shall limit our discussion to the row echelon form approach and the adjoint method.

The row echelon form approach is applicable for both distinct and repeated eigenvalues.
The eigenvectors 𝑋 are determined by solving (𝐴 − 𝜆𝐼)𝑋 = 𝑂.

The adjoint approach works when the eigenvalues are distinct, and also when the repeated eigenvalues have full degeneracy. The adjoint of (𝐴 − 𝜆𝐼) (the adjoint is the transpose of the cofactor matrix) is expressed as a function of 𝜆. If the eigenvalues are distinct, each of them is substituted one after another into the adjoint and any non-zero column will be the corresponding eigenvector. For a repeated eigenvalue 𝜆 having full degeneracy (𝑞 = 𝑚), the independent columns of the matrix $\frac{1}{(m-1)!}\frac{d^{m-1}}{d\lambda^{m-1}}\,\mathrm{adj}(A - \lambda I)$ are eigenvectors of 𝐴 for that 𝜆.

Let us consider a few examples with distinct eigenvalues first.

Example 1: Determine the eigenvalues and eigenvectors of $A = \begin{bmatrix} 1 & 1 \\ -2 & 4 \end{bmatrix}$. Verify with MATLAB.

We solve $AX = \lambda X \Rightarrow \begin{bmatrix} 1-\lambda & 1 \\ -2 & 4-\lambda \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$

Eigenvalues are the roots of $\begin{vmatrix} 1-\lambda & 1 \\ -2 & 4-\lambda \end{vmatrix} = 0 \Rightarrow \lambda^2 - 5\lambda + 6 = 0 \Rightarrow \lambda = 2, 3$

The eigenvalues are distinct. The eigenvector corresponding to 𝜆 = 2 is determined as follows using the row echelon form reduction:


$\begin{bmatrix} 1-\lambda & 1 \\ -2 & 4-\lambda \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \Rightarrow \begin{bmatrix} -1 & 1 \\ -2 & 2 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \xrightarrow{-2R_1+R_2\to R_2} \begin{bmatrix} -1 & 1 \\ 0 & 0 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$

$\Rightarrow x_2 = x_1 \Rightarrow X_1 = \begin{bmatrix} x_1 \\ x_1 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \begin{bmatrix} -3 \\ -3 \end{bmatrix}, \ldots = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 \\ 1 \end{bmatrix}$

The eigenvectors aren’t unique and it doesn’t matter. Why?

If 𝐴𝑋 = 𝜆𝑋, then 𝐴(𝑘𝑋) = 𝜆(𝑘𝑋), which means that if 𝑋 is an eigenvector corresponding to 𝜆, so is 𝑘𝑋; i.e.; the scalar multiplier is irrelevant. One way to make the eigenvector unique is to normalize it to unit length by dividing it by its length (implemented in MATLAB – still not unique due to the uncertainty in sign), as shown in the final result above. Another way is to make the numerically largest element unity.

The eigenvector corresponding to 𝜆 = 3 is found as follows:

$\begin{bmatrix} 1-\lambda & 1 \\ -2 & 4-\lambda \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \Rightarrow \begin{bmatrix} -2 & 1 \\ -2 & 1 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \xrightarrow{-R_1+R_2\to R_2} \begin{bmatrix} -2 & 1 \\ 0 & 0 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$

$\Rightarrow x_2 = 2x_1 \Rightarrow X_2 = \begin{bmatrix} x_1 \\ 2x_1 \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \end{bmatrix}, \begin{bmatrix} -1 \\ -2 \end{bmatrix}, \ldots = \frac{1}{\sqrt{5}}\begin{bmatrix} 1 \\ 2 \end{bmatrix}$

Now use the adjoint method: $\mathrm{adj}(A - \lambda I) = \begin{bmatrix} 4-\lambda & -1 \\ 2 & 1-\lambda \end{bmatrix}$

Set $\lambda = 2$ to get $\mathrm{adj}(A - \lambda I) = \begin{bmatrix} 2 & -1 \\ 2 & -1 \end{bmatrix}$. Both columns give $\begin{bmatrix} 1 \\ 1 \end{bmatrix}$ as an eigenvector.

Set $\lambda = 3$ to get $\mathrm{adj}(A - \lambda I) = \begin{bmatrix} 1 & -1 \\ 2 & -2 \end{bmatrix}$. Both columns give $\begin{bmatrix} 1 \\ 2 \end{bmatrix}$ as an eigenvector. ◼
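The MATLAB verification asked for in this example can be sketched as follows (a minimal check using the built-in eig function; MATLAB normalizes the eigenvectors to unit length, and their sign and ordering may differ from the hand calculation):

A = [1 1; -2 4];
[V, D] = eig(A)    % diag(D) holds the eigenvalues 2 and 3;
                   % the columns of V are the corresponding unit-length eigenvectors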

Example 2: Determine the eigenvalues and eigenvectors of $A = \begin{bmatrix} 10 & -2 & 4 \\ -20 & 4 & -10 \\ -30 & 6 & -13 \end{bmatrix}$

Solve $AX = \lambda X \Rightarrow \begin{bmatrix} 10-\lambda & -2 & 4 \\ -20 & 4-\lambda & -10 \\ -30 & 6 & -13-\lambda \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$

For nontrivial solutions, $\begin{vmatrix} 10-\lambda & -2 & 4 \\ -20 & 4-\lambda & -10 \\ -30 & 6 & -13-\lambda \end{vmatrix} = 0 \Rightarrow$

$(10-\lambda)\{(\lambda-4)(\lambda+13) + 60\} + 2\{20(\lambda+13) - 300\} + 4\{-120 + 30(4-\lambda)\} = 0 \Rightarrow$

$(10-\lambda)\{\lambda^2 + 9\lambda + 8\} + 2\{20\lambda - 40\} - 120\lambda = 0 \Rightarrow$

$-\lambda^3 + \lambda^2 + 2\lambda = 0 \Rightarrow \lambda(\lambda^2 - \lambda - 2) = 0 \Rightarrow \lambda(\lambda-2)(\lambda+1) = 0 \Rightarrow \lambda = -1, 0, 2$

The eigenvalues are distinct. First, we use the row echelon form approach to find the
eigenvectors. The eigenvector corresponding to 𝜆 = −1 is found as:
$\begin{bmatrix} 10-\lambda & -2 & 4 \\ -20 & 4-\lambda & -10 \\ -30 & 6 & -13-\lambda \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \Rightarrow \begin{bmatrix} 11 & -2 & 4 \\ -20 & 5 & -10 \\ -30 & 6 & -12 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$

$\xrightarrow{\frac{R_2}{-5}\to R_2,\ \frac{R_3}{-6}\to R_3} \begin{bmatrix} 11 & -2 & 4 \\ 4 & -1 & 2 \\ 5 & -1 & 2 \end{bmatrix} \xrightarrow{11R_2\to R_2,\ 11R_3\to R_3} \begin{bmatrix} 11 & -2 & 4 \\ 44 & -11 & 22 \\ 55 & -11 & 22 \end{bmatrix} \xrightarrow{-4R_1+R_2\to R_2,\ -5R_1+R_3\to R_3} \begin{bmatrix} 11 & -2 & 4 \\ 0 & -3 & 6 \\ 0 & -1 & 2 \end{bmatrix}$

$\xrightarrow{R_2\leftrightarrow R_3} \begin{bmatrix} 11 & -2 & 4 \\ 0 & -1 & 2 \\ 0 & -3 & 6 \end{bmatrix} \xrightarrow{-3R_2+R_3\to R_3} \begin{bmatrix} 11 & -2 & 4 \\ 0 & -1 & 2 \\ 0 & 0 & 0 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$

Solve: Let $x_3 = t$. $-x_2 + 2x_3 = 0 \Rightarrow x_2 = 2t$. $11x_1 - 2x_2 + 4x_3 = 0 \Rightarrow x_1 = 0$

i.e.; $X_1 = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 2t \\ t \end{bmatrix} = \frac{1}{\sqrt{5}}\begin{bmatrix} 0 \\ 2 \\ 1 \end{bmatrix}$

The eigenvector corresponding to 𝜆 = 0 is found as:


$\begin{bmatrix} 10-\lambda & -2 & 4 \\ -20 & 4-\lambda & -10 \\ -30 & 6 & -13-\lambda \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \Rightarrow \begin{bmatrix} 10 & -2 & 4 \\ -20 & 4 & -10 \\ -30 & 6 & -13 \end{bmatrix} \xrightarrow{\frac{R_1}{2}\to R_1,\ \frac{R_2}{2}\to R_2} \begin{bmatrix} 5 & -1 & 2 \\ -10 & 2 & -5 \\ -30 & 6 & -13 \end{bmatrix}$

$\xrightarrow{2R_1+R_2\to R_2,\ 6R_1+R_3\to R_3} \begin{bmatrix} 5 & -1 & 2 \\ 0 & 0 & -1 \\ 0 & 0 & -1 \end{bmatrix} \xrightarrow{-R_2+R_3\to R_3} \begin{bmatrix} 5 & -1 & 2 \\ 0 & 0 & -1 \\ 0 & 0 & 0 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$

Solve: Let $x_2 = t$ be the free variable. The second row gives $x_3 = 0$, and $5x_1 - x_2 + 2x_3 = 0 \Rightarrow x_1 = \frac{t}{5}$

i.e.; $X_2 = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} \frac{t}{5} \\ t \\ 0 \end{bmatrix} = \frac{1}{\sqrt{26}}\begin{bmatrix} 1 \\ 5 \\ 0 \end{bmatrix}$

The eigenvector corresponding to 𝜆 = 2 is found as:


$\begin{bmatrix} 10-\lambda & -2 & 4 \\ -20 & 4-\lambda & -10 \\ -30 & 6 & -13-\lambda \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \Rightarrow \begin{bmatrix} 8 & -2 & 4 \\ -20 & 2 & -10 \\ -30 & 6 & -15 \end{bmatrix} \xrightarrow{\frac{R_1}{2}\to R_1,\ \frac{R_2}{2}\to R_2,\ \frac{R_3}{3}\to R_3} \begin{bmatrix} 4 & -1 & 2 \\ -10 & 1 & -5 \\ -10 & 2 & -5 \end{bmatrix}$

$\xrightarrow{5R_1\to R_1,\ 2R_2\to R_2,\ 2R_3\to R_3} \begin{bmatrix} 20 & -5 & 10 \\ -20 & 2 & -10 \\ -20 & 4 & -10 \end{bmatrix} \xrightarrow{R_1+R_2\to R_2,\ R_1+R_3\to R_3} \begin{bmatrix} 20 & -5 & 10 \\ 0 & -3 & 0 \\ 0 & -1 & 0 \end{bmatrix} \xrightarrow{R_2\leftrightarrow R_3} \begin{bmatrix} 20 & -5 & 10 \\ 0 & -1 & 0 \\ 0 & -3 & 0 \end{bmatrix}$

$\xrightarrow{-3R_2+R_3\to R_3} \begin{bmatrix} 20 & -5 & 10 \\ 0 & -1 & 0 \\ 0 & 0 & 0 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$

Solve: Let $x_3 = t$. Then $x_2 = 0$ and $20x_1 - 5x_2 + 10x_3 = 0 \Rightarrow x_1 = -\frac{t}{2}$

i.e.; $X_3 = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} -\frac{t}{2} \\ 0 \\ t \end{bmatrix} = \frac{1}{\sqrt{5}}\begin{bmatrix} 1 \\ 0 \\ -2 \end{bmatrix}$

Now use the adjoint method: $(A - \lambda I) = \begin{bmatrix} 10-\lambda & -2 & 4 \\ -20 & 4-\lambda & -10 \\ -30 & 6 & -13-\lambda \end{bmatrix}$

$\mathrm{adj}(A - \lambda I) = \begin{bmatrix} \lambda^2+9\lambda+8 & -2\lambda-2 & 4\lambda+4 \\ 40-20\lambda & \lambda^2+3\lambda-10 & -10\lambda+20 \\ -30\lambda & 6\lambda & \lambda^2-14\lambda \end{bmatrix}$

Set $\lambda = -1$ to get $\mathrm{adj}(A - \lambda I) = \begin{bmatrix} 0 & 0 & 0 \\ 60 & -12 & 30 \\ 30 & -6 & 15 \end{bmatrix}$. All columns give $\begin{bmatrix} 0 \\ 2 \\ 1 \end{bmatrix}$ as an eigenvector.

Set $\lambda = 0$ to get $\mathrm{adj}(A - \lambda I) = \begin{bmatrix} 8 & -2 & 4 \\ 40 & -10 & 20 \\ 0 & 0 & 0 \end{bmatrix}$. All columns give $\begin{bmatrix} 1 \\ 5 \\ 0 \end{bmatrix}$ as an eigenvector.

Set $\lambda = 2$ to get $\mathrm{adj}(A - \lambda I) = \begin{bmatrix} 30 & -6 & 12 \\ 0 & 0 & 0 \\ -60 & 12 & -24 \end{bmatrix}$. All columns give $\begin{bmatrix} 1 \\ 0 \\ -2 \end{bmatrix}$ as an eigenvector. ◼
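A similar MATLAB sketch for this example (assuming the built-in eig; the scaling and ordering of the computed eigenvectors may differ from the hand calculation):

A = [10 -2 4; -20 4 -10; -30 6 -13];
[V, D] = eig(A)        % expect the eigenvalues -1, 0, 2 on the diagonal of D
V ./ max(abs(V))       % rescale each eigenvector so its largest-magnitude entry is +/-1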

Example 3: Determine the eigenvalues and eigenvectors of $A = \begin{bmatrix} -2 & -5 \\ 1 & 2 \end{bmatrix}$

We solve $AX = \lambda X \Rightarrow \begin{bmatrix} -2-\lambda & -5 \\ 1 & 2-\lambda \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$

For nontrivial solutions, $\begin{vmatrix} -2-\lambda & -5 \\ 1 & 2-\lambda \end{vmatrix} = 0 \Rightarrow \lambda^2 + 1 = 0 \Rightarrow \lambda = -i, i$

The eigenvector corresponding to 𝜆 = −𝑖 is found from the row echelon form as follows:

$\begin{bmatrix} -2-\lambda & -5 \\ 1 & 2-\lambda \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \Rightarrow \begin{bmatrix} -2+i & -5 \\ 1 & 2+i \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$

Solve: $x_1 + (2+i)x_2 = 0$ or $x_1 = -(2+i)x_2 \Rightarrow X_1 = \begin{bmatrix} -(2+i)x_2 \\ x_2 \end{bmatrix} = \begin{bmatrix} 2+i \\ -1 \end{bmatrix}$

The eigenvector corresponding to 𝜆 = 𝑖 is found as follows:

$\begin{bmatrix} -2-\lambda & -5 \\ 1 & 2-\lambda \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \Rightarrow \begin{bmatrix} -2-i & -5 \\ 1 & 2-i \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$

Solve: $x_1 + (2-i)x_2 = 0$ or $x_1 = -(2-i)x_2 \Rightarrow X_2 = \begin{bmatrix} -(2-i)x_2 \\ x_2 \end{bmatrix} = \begin{bmatrix} 2-i \\ -1 \end{bmatrix}$

Now use the adjoint method: $\mathrm{adj}(A - \lambda I) = \begin{bmatrix} 2-\lambda & 5 \\ -1 & -2-\lambda \end{bmatrix}$

Set $\lambda = -i$ to get $\mathrm{adj}(A - \lambda I) = \begin{bmatrix} 2+i & 5 \\ -1 & -2+i \end{bmatrix}$. Both columns give $\begin{bmatrix} 2+i \\ -1 \end{bmatrix}$ as an eigenvector.

Set $\lambda = i$ to get $\mathrm{adj}(A - \lambda I) = \begin{bmatrix} 2-i & 5 \\ -1 & -2-i \end{bmatrix}$. Both columns give $\begin{bmatrix} 2-i \\ -1 \end{bmatrix}$ as an eigenvector. ◼

Diagonalization

Diagonalization decouples a system of coupled ordinary differential equations arising in applications such as free vibration analysis, state space models of dynamic systems, etc.

Matrices that possess a full set of linearly independent eigenvectors (i.e.; 𝑛 linearly independent eigenvectors for an 𝑛 × 𝑛 square matrix) can be diagonalized. This occurs when the eigenvalues are distinct or when all repeated eigenvalues have full degeneracy – real symmetric matrices and Hermitian matrices (complex matrices whose conjugate transpose is identical to themselves; the same as symmetric matrices when real) are known to meet these conditions and have a full set of linearly independent eigenvectors.

A relationship of the form $B = Q^{-1}AQ$ between square matrices 𝐴 and 𝐵 involving a nonsingular matrix 𝑄 is called a similarity transformation. Two matrices are said to be similar if they have the same Jordan form (discussed later).

Let matrix 𝐴 possess a full set of linearly independent eigenvectors that are columns of the matrix 𝑃. Since the columns of 𝑃 are linearly independent, 𝑃 is non-singular; i.e.; det(𝑃) is non-zero and $P^{-1}$ exists. It can be shown that $P^{-1}AP = D$, a diagonal matrix whose diagonal elements are the eigenvalues associated with the corresponding columns of 𝑃 (proof omitted); i.e.; a matrix with a full set of linearly independent eigenvectors can be transformed to a diagonal matrix by means of a similarity transformation.

In free vibration analysis, eigenvalues are related to the free vibration frequencies, and
eigenvectors denote mode shapes. What do complex modes (Example 3) represent in
real structures?

Example 4: Form the 𝑃 matrix in Example 3 and diagonalize 𝐴.

$P = \begin{bmatrix} 2+i & 2-i \\ -1 & -1 \end{bmatrix} \Rightarrow \det(P) = (-2-i) - (-2+i) = -2i \Rightarrow P^{-1} = -\frac{1}{2i}\begin{bmatrix} -1 & -2+i \\ 1 & 2+i \end{bmatrix}$

$P^{-1}AP = -\frac{1}{2i}\begin{bmatrix} -1 & -2+i \\ 1 & 2+i \end{bmatrix}\begin{bmatrix} -2 & -5 \\ 1 & 2 \end{bmatrix}\begin{bmatrix} 2+i & 2-i \\ -1 & -1 \end{bmatrix} = -\frac{1}{2i}\begin{bmatrix} -1 & -2+i \\ 1 & 2+i \end{bmatrix}\begin{bmatrix} 1-2i & 1+2i \\ i & -i \end{bmatrix} = -\frac{1}{2i}\begin{bmatrix} -2 & 0 \\ 0 & 2 \end{bmatrix} = \begin{bmatrix} -i & 0 \\ 0 & i \end{bmatrix}$


The diagonal elements of 𝑃−1 𝐴𝑃 are the complex eigenvalues of 𝐴 (from Example 3)
corresponding to the columns of 𝑃. ◼
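A quick MATLAB sketch of this diagonalization (1i is MATLAB's imaginary unit; round-off may leave tiny nonzero off-diagonal entries):

A = [-2 -5; 1 2];
P = [2+1i 2-1i; -1 -1];
D = P\A*P          % (P\A)*P equals inv(P)*A*P; expect diag([-1i, 1i])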

The case of symmetric matrices

Real symmetric matrices offer analytical and computational advantages: they have real eigenvalues and possess a full set of linearly independent and mutually orthogonal eigenvectors (two vectors are said to be orthogonal if their inner product is zero), even if the eigenvalues are repeated (proof omitted). This does not mean that any two eigenvectors computed for a repeated eigenvalue will automatically be orthogonal – but an orthogonal set can always be chosen (see Example 5).

A matrix is said to be orthogonal if (i) every pair of columns is orthogonal, and (ii) the magnitude of each column (positive square root of the inner product of a column with itself) is unity. In other words, matrix 𝐴 is orthogonal if its inverse $A^{-1}$ is identical to its transpose $A^T$; i.e.; $A^{-1} = A^T \Rightarrow AA^{-1} = AA^T = I$.

e.g.: $A = \dfrac{1}{3}\begin{bmatrix} 1 & 2 & 2 \\ 2 & 1 & -2 \\ 2 & -2 & 1 \end{bmatrix}$ is orthogonal – do a quick check

The orthogonality of the eigenvectors of a symmetric matrix 𝐴 makes the 𝑃 matrix (having as its columns the suitably normalized eigenvectors of 𝐴; see Example 5) orthogonal, or $P^{-1} = P^T$. Hence a symmetric matrix can be diagonalized as $P^{-1}AP = P^TAP = D$.

Let us consider a few examples.

Example 5: Show that the eigenvalues of $A = \begin{bmatrix} 6 & -2 & 2 \\ -2 & 3 & -1 \\ 2 & -1 & 3 \end{bmatrix}$ are real and the eigenvectors are orthogonal. Diagonalize 𝐴. Verify with MATLAB.

Solve $AX = \lambda X \Rightarrow \begin{bmatrix} 6-\lambda & -2 & 2 \\ -2 & 3-\lambda & -1 \\ 2 & -1 & 3-\lambda \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$


For nontrivial solutions, $\begin{vmatrix} 6-\lambda & -2 & 2 \\ -2 & 3-\lambda & -1 \\ 2 & -1 & 3-\lambda \end{vmatrix} = 0 \Rightarrow$

$(6-\lambda)\{(3-\lambda)^2 - 1\} + 2\{-2(3-\lambda) + 2\} + 2\{2 - 2(3-\lambda)\} = 0 \Rightarrow$

$(6-\lambda)\{\lambda^2 - 6\lambda + 8\} - 8(3-\lambda) + 8 = 0 \Rightarrow$

$-\lambda^3 + 12\lambda^2 - 36\lambda + 32 = 0 \Rightarrow (\lambda-2)^2(\lambda-8) = 0$

Eigenvalues are 𝜆 = 2 (repeated twice), and 𝜆 = 8. All are real (symmetric matrix).

The eigenvector corresponding to 𝜆 = 2 is found as:


$\begin{bmatrix} 6-\lambda & -2 & 2 \\ -2 & 3-\lambda & -1 \\ 2 & -1 & 3-\lambda \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \Rightarrow \begin{bmatrix} 4 & -2 & 2 \\ -2 & 1 & -1 \\ 2 & -1 & 1 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \xrightarrow{\frac{1}{2}R_1+R_2\to R_2,\ -\frac{1}{2}R_1+R_3\to R_3} \begin{bmatrix} 4 & -2 & 2 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$

Let $x_3 = t$, $x_2 = s$ be the free variables. Then $4x_1 - 2x_2 + 2x_3 = 0 \Rightarrow x_1 = \frac{1}{2}s - \frac{1}{2}t$

i.e.; $\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} \frac{1}{2}s - \frac{1}{2}t \\ s \\ t \end{bmatrix}$ generates two independent eigenvectors (two free variables). Set $t = 1, s = 1$ to get $X_1 = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}$. Then set $t = 1, s = -1$ to get $X_2 = \begin{bmatrix} -1 \\ -1 \\ 1 \end{bmatrix}$. Verify that $X_1$ and $X_2$ are independent and orthogonal.

We can't pick any 𝑡 and 𝑠 and hope for the eigenvectors to turn out to be orthogonal. Care is required in choosing suitable values of 𝑡 and 𝑠. For instance, first set $t = 1, s = 0$ to get $X_1 = \begin{bmatrix} -\frac{1}{2} \\ 0 \\ 1 \end{bmatrix}$. Then set $t = 0, s = 1$ to get $X_2 = \begin{bmatrix} \frac{1}{2} \\ 1 \\ 0 \end{bmatrix}$. Verify that $X_1$ and $X_2$ are independent but not orthogonal.

The eigenvector corresponding to 𝜆 = 8 is found as:

$\begin{bmatrix} 6-\lambda & -2 & 2 \\ -2 & 3-\lambda & -1 \\ 2 & -1 & 3-\lambda \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \Rightarrow \begin{bmatrix} -2 & -2 & 2 \\ -2 & -5 & -1 \\ 2 & -1 & -5 \end{bmatrix} \xrightarrow{-R_1+R_2\to R_2,\ R_1+R_3\to R_3} \begin{bmatrix} -2 & -2 & 2 \\ 0 & -3 & -3 \\ 0 & -3 & -3 \end{bmatrix} \xrightarrow{-R_2+R_3\to R_3} \begin{bmatrix} -2 & -2 & 2 \\ 0 & -3 & -3 \\ 0 & 0 & 0 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$

Let $x_3 = t$ be the free variable. Then $-3x_2 - 3x_3 = 0 \Rightarrow x_2 = -t$ and $-2x_1 - 2x_2 + 2x_3 = 0 \Rightarrow x_1 = 2t$

Hence $X_3 = \begin{bmatrix} 2t \\ -t \\ t \end{bmatrix} = \begin{bmatrix} 2 \\ -1 \\ 1 \end{bmatrix}$ with $t = 1$.

We now normalize all the eigenvectors as (verify that 𝑃 is orthogonal):

$X_1 = \frac{1}{\sqrt{2}}\begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix},\ X_2 = \frac{1}{\sqrt{3}}\begin{bmatrix} -1 \\ -1 \\ 1 \end{bmatrix},\ X_3 = \frac{1}{\sqrt{6}}\begin{bmatrix} 2 \\ -1 \\ 1 \end{bmatrix} \Rightarrow P = \begin{bmatrix} 0 & -\frac{1}{\sqrt{3}} & \frac{2}{\sqrt{6}} \\ \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{3}} & -\frac{1}{\sqrt{6}} \\ \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{3}} & \frac{1}{\sqrt{6}} \end{bmatrix}$

Hence $P^TAP = P^T\begin{bmatrix} 6 & -2 & 2 \\ -2 & 3 & -1 \\ 2 & -1 & 3 \end{bmatrix}P = P^T\begin{bmatrix} 0 & -\frac{2}{\sqrt{3}} & \frac{16}{\sqrt{6}} \\ \frac{2}{\sqrt{2}} & -\frac{2}{\sqrt{3}} & -\frac{8}{\sqrt{6}} \\ \frac{2}{\sqrt{2}} & \frac{2}{\sqrt{3}} & \frac{8}{\sqrt{6}} \end{bmatrix} = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 8 \end{bmatrix}$

Since 𝜆 = 2 has full degeneracy (𝑞 = 𝑚 = 2) and 𝜆 = 8 is distinct, the adjoint method of determining the eigenvectors will work.

$A - \lambda I = \begin{bmatrix} 6-\lambda & -2 & 2 \\ -2 & 3-\lambda & -1 \\ 2 & -1 & 3-\lambda \end{bmatrix} \Rightarrow \mathrm{adj}(A - \lambda I) = \begin{bmatrix} \lambda^2-6\lambda+8 & -2\lambda+4 & 2\lambda-4 \\ -2\lambda+4 & \lambda^2-9\lambda+14 & -\lambda+2 \\ 2\lambda-4 & -\lambda+2 & \lambda^2-9\lambda+14 \end{bmatrix}$

Evaluate $\mathrm{adj}(A - \lambda I)$ at $\lambda = 8$ to get $\begin{bmatrix} 24 & -12 & 12 \\ -12 & 6 & -6 \\ 12 & -6 & 6 \end{bmatrix}$. Each column gives $\begin{bmatrix} 2 \\ -1 \\ 1 \end{bmatrix}$ as the eigenvector associated with $\lambda = 8$, which matches with $X_3$ found above.

Obtain $\frac{1}{1!}\frac{d}{d\lambda}\mathrm{adj}(A - \lambda I) = \begin{bmatrix} 2\lambda-6 & -2 & 2 \\ -2 & 2\lambda-9 & -1 \\ 2 & -1 & 2\lambda-9 \end{bmatrix}$, which at $\lambda = 2$ is $\begin{bmatrix} -2 & -2 & 2 \\ -2 & -5 & -1 \\ 2 & -1 & -5 \end{bmatrix}$.

Note that $C_3 = -2C_1 + C_2$. Hence only $C_1$ and $C_2$ are independent; i.e.; the eigenvectors for $\lambda = 2$ are $\begin{bmatrix} -1 \\ -1 \\ 1 \end{bmatrix}$ and $\begin{bmatrix} 2 \\ 5 \\ 1 \end{bmatrix}$. The first of these matches with $X_2 = \begin{bmatrix} -1 \\ -1 \\ 1 \end{bmatrix}$, but the second doesn't match with $X_1 = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}$. Why? Linear dependence!! $\frac{2}{3}\begin{bmatrix} -1 \\ -1 \\ 1 \end{bmatrix} + \frac{1}{3}\begin{bmatrix} 2 \\ 5 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}$ ◼
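For the MATLAB verification requested in this example, note that eig applied to a symmetric matrix returns an orthonormal set of eigenvectors, so a minimal sketch is (the columns of P and the ordering of the eigenvalues may differ from the hand calculation):

A = [6 -2 2; -2 3 -1; 2 -1 3];
[P, D] = eig(A);
P'*P               % identity matrix (up to round-off): P is orthogonal
P'*A*P             % reproduces diag([2 2 8]) up to ordering and round-off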

Defective matrices and generalized eigenvectors

We found that matrices possessing a full set of linearly independent eigenvectors (regardless of distinct or repeated eigenvalues) can be diagonalized. We also learnt that if the eigenvalues are distinct, a full set of linearly independent eigenvectors exists. But with repeated eigenvalues, the existence of a full set of linearly independent eigenvectors isn't guaranteed.

A matrix that falls short of a full set of linearly independent eigenvectors cannot be diagonalized. Such matrices are called defective matrices. Let us consider an example of a defective matrix.

Example 6: Determine if the matrices $P = \begin{bmatrix} 4 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & 4 \end{bmatrix}$ and $Q = \begin{bmatrix} 4 & 1 & 0 \\ 0 & 4 & 1 \\ 0 & 0 & 4 \end{bmatrix}$ are defective.

For $P$: $(P - \lambda I)X = O \Rightarrow \begin{bmatrix} 4-\lambda & 0 & 0 \\ 0 & 4-\lambda & 0 \\ 0 & 0 & 4-\lambda \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$

For nontrivial solutions, $\begin{vmatrix} 4-\lambda & 0 & 0 \\ 0 & 4-\lambda & 0 \\ 0 & 0 & 4-\lambda \end{vmatrix} = 0 \Rightarrow (4-\lambda)^3 = 0 \Rightarrow \lambda = 4$ (thrice)

For $\lambda = 4$: $\begin{bmatrix} 4-\lambda & 0 & 0 \\ 0 & 4-\lambda & 0 \\ 0 & 0 & 4-\lambda \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \Rightarrow \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$
$x_1$, $x_2$ and $x_3$ are free variables; i.e.; $x_3 = t$, $x_2 = s$, $x_1 = r$; $r, s, t \in R$

Hence the eigenvector $\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} r \\ s \\ t \end{bmatrix}$; $r, s, t \in R$; i.e.; $\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$ are the three linearly independent eigenvectors in this case.

Since P has three linearly independent eigenvectors for its thrice-repeated eigenvalue, it
is not defective.

For $Q$: $(Q - \lambda I)X = O \Rightarrow \begin{bmatrix} 4-\lambda & 1 & 0 \\ 0 & 4-\lambda & 1 \\ 0 & 0 & 4-\lambda \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$

For nontrivial solutions, $\begin{vmatrix} 4-\lambda & 1 & 0 \\ 0 & 4-\lambda & 1 \\ 0 & 0 & 4-\lambda \end{vmatrix} = 0 \Rightarrow (4-\lambda)^3 = 0 \Rightarrow \lambda = 4$ (thrice)

For $\lambda = 4$: $\begin{bmatrix} 4-\lambda & 1 & 0 \\ 0 & 4-\lambda & 1 \\ 0 & 0 & 4-\lambda \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \Rightarrow \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$

$x_2$ and $x_3$ are leading variables while $x_1$ is a free variable. Let $x_1 = t$; $t \in R$. From the first and second rows, we respectively get $x_2 = 0$ and $x_3 = 0$.

Hence the eigenvector $\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} t \\ 0 \\ 0 \end{bmatrix}$; $t \in R$; i.e.; $\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$ is the only eigenvector.

Since 𝑄 has just one eigenvector corresponding to the thrice-repeated eigenvalue, it is defective. ◼
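The degeneracy test 𝑞 = 𝑛 − rank(𝐴 − 𝜆𝐼) used above is easy to automate in MATLAB. A minimal sketch for the two matrices of this example:

lambda = 4; n = 3;
P = [4 0 0; 0 4 0; 0 0 4];
Q = [4 1 0; 0 4 1; 0 0 4];
qP = n - rank(P - lambda*eye(n))   % 3: full degeneracy, P is not defective
qQ = n - rank(Q - lambda*eye(n))   % 1: only one eigenvector, Q is defective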

If defective matrices cannot be diagonalized due to the shortage of linearly independent eigenvectors to form a basis set, how close can we get to diagonalizing such matrices? The answer lies in generating additional basis vectors called generalized eigenvectors.

A generalized eigenvector 𝑋 of the matrix 𝐴 corresponding to the repeated eigenvalue 𝜆 of algebraic multiplicity 𝑚 satisfies $(A - \lambda I)^r X = O$, where 𝑟 is a positive integer that runs from 1, 2, 3, … to at most 𝑚, so as to generate 𝑚 linearly independent generalized eigenvectors; i.e.; the generalized eigenvectors include all the eigenvectors (the case 𝑟 = 1) corresponding to 𝜆, and together they form a basis of $R^n$ for an 𝑛 × 𝑛 matrix 𝐴.

With the matrix of generalized eigenvectors (called the 𝑄 matrix), matrix 𝐴 can be block-diagonalized (but cannot be diagonalized) as $Q^{-1}AQ$. (A block diagonal matrix is a square matrix that can be partitioned along its main diagonal into square submatrices such that all nonzero elements are confined within these submatrices.) The block diagonalized form of 𝐴 is called the Jordan canonical form. Each block of $Q^{-1}AQ$, called a Jordan block, is associated with one and only one linearly independent eigenvector. Each Jordan block has the eigenvalue along its main diagonal, 0's below and 1's immediately above its main diagonal. The remaining elements are 0's. The number of 1's in a block denotes the number of generalized eigenvectors attached to that block's eigenvector. The Jordan canonical form is unique up to a rearrangement of the Jordan blocks (see Example 7).

For example, an 8 × 8 matrix with eigenvalues 𝜆₁ (𝑚 = 2, 𝑞 = 1), 𝜆₂ (𝑚 = 4, 𝑞 = 2), and 𝜆₃ (𝑚 = 2, 𝑞 = 2) can be block-diagonalized into one of two forms: in both, 𝜆₁ contributes one 2 × 2 block and 𝜆₃ contributes two 1 × 1 blocks, while 𝜆₂ contributes either two 2 × 2 blocks (first form) or one 3 × 3 block and one 1 × 1 block (second form).

The final form obtained (using $Q^{-1}AQ$) between the above two depends on how the chain of generalized eigenvectors is determined. The procedure for determining the generalized eigenvectors in the context of this example is discussed next.

Look at 𝜆₂ (𝑚 = 4, 𝑞 = 2) to determine the eigenvectors and generalized eigenvectors. Let the independent eigenvectors of 𝜆₂ be 𝑋₁ and 𝑋₂; i.e.; $(A - \lambda_2 I)X_1 = O$ and $(A - \lambda_2 I)X_2 = O$. Two generalized eigenvectors 𝑋₃ and 𝑋₄ are created in one of the following two ways:

Option 1: Generalized eigenvector series created from both 𝑋₁ and 𝑋₂ as the 𝑋₁–𝑋₃ series and the 𝑋₂–𝑋₄ series, leading to the first block diagonal form above:

Eigenvector $X_1$: $(A - \lambda_2 I)X_1 = O$    Eigenvector $X_2$: $(A - \lambda_2 I)X_2 = O$


Generalized eigenvector $X_3$: $(A - \lambda_2 I)X_3 = X_1 \neq O \Rightarrow (A - \lambda_2 I)^2 X_3 = (A - \lambda_2 I)X_1 = O$

Generalized eigenvector $X_4$: $(A - \lambda_2 I)X_4 = X_2 \neq O \Rightarrow (A - \lambda_2 I)^2 X_4 = (A - \lambda_2 I)X_2 = O$

Option 2: Generalized eigenvector series created from 𝑋₁ alone (or 𝑋₂ alone) as the 𝑋₁–𝑋₃–𝑋₄ (or 𝑋₂–𝑋₃–𝑋₄) series, leading to the second block diagonal form above:

Eigenvector $X_1$: $(A - \lambda_2 I)X_1 = O$    Eigenvector $X_2$: $(A - \lambda_2 I)X_2 = O$

Generalized eigenvector $X_3$: $(A - \lambda_2 I)X_3 = X_1 \neq O \Rightarrow (A - \lambda_2 I)^2 X_3 = (A - \lambda_2 I)X_1 = O$

Generalized eigenvector $X_4$: $(A - \lambda_2 I)X_4 = X_3 \neq O \Rightarrow (A - \lambda_2 I)^3 X_4 = (A - \lambda_2 I)^2 X_3 = O$

The generalized eigenvectors associated with an eigenvalue of algebraic multiplicity 𝑚 and simple degeneracy (𝑞 = 1) are calculated as follows.

Let $(A - \lambda I)X_1 = O$, where 𝜆 is an eigenvalue with algebraic multiplicity 𝑚 and 𝑞 = 1. The associated eigenvector is 𝑋₁. The remaining (𝑚 − 1) generalized eigenvectors 𝑋₂, 𝑋₃, … 𝑋ₘ are all chained to 𝑋₁ as:

$(A - \lambda I)X_2 = X_1 \neq O \Rightarrow (A - \lambda I)^2 X_2 = (A - \lambda I)X_1 = O$

$(A - \lambda I)X_3 = X_2 \neq O \Rightarrow (A - \lambda I)^3 X_3 = (A - \lambda I)^2 X_2 = O$

$\vdots$

$(A - \lambda I)X_m = X_{m-1} \neq O \Rightarrow (A - \lambda I)^m X_m = (A - \lambda I)^{m-1} X_{m-1} = O$

By solving each equation in succession, 𝑋₂, 𝑋₃, … 𝑋ₘ are determined.

If the degeneracy of an eigenvalue 𝜆 is such that 1 < 𝑞 < 𝑚, we have 𝑞 linearly independent eigenvectors and the remaining 𝑚 − 𝑞 generalized eigenvectors must be determined. While the same procedure described above for simple degeneracy (which determines the eigenvectors at the start) can still be used, ambiguities in the generalized eigenvector chains arise, as noted earlier with the Jordan blocks. Hence, the following approach is recommended:

For a square matrix 𝐴 of order 𝑛, find the smallest integer 𝑘 (called the index of the eigenvalue) such that $\operatorname{rank}((A - \lambda I)^k) = n - m$. This means the longest chain of eigenvectors and generalized eigenvectors runs from 1, 2, 3, … 𝑘 for the particular 𝜆. In other words, 𝑘 is the size of the largest Jordan block for the particular 𝜆. A generalized

eigenvector $X_j$ of order 𝑗 satisfies $(A - \lambda I)^j X_j = O$ and $(A - \lambda I)^{j-1} X_j \neq O$ for $j = k, k-1, \ldots, 2$. From $X_j$ we get $X_{j-1} = (A - \lambda I)X_j$. Continue along this path to get all the generalized eigenvectors and finally the eigenvector.

The following examples demonstrate the procedure for determining generalized eigenvectors.

Example 7: Find the eigenvectors & generalized eigenvectors of $A = \begin{bmatrix} 0 & 6 & -5 \\ 1 & 0 & 2 \\ 3 & 2 & 4 \end{bmatrix}$ and evaluate $Q^{-1}AQ$, where 𝑄 is the matrix of generalized eigenvectors.

Set $|A - \lambda I| = \begin{vmatrix} -\lambda & 6 & -5 \\ 1 & -\lambda & 2 \\ 3 & 2 & 4-\lambda \end{vmatrix} = 0 \Rightarrow -\lambda\{\lambda^2 - 4\lambda - 4\} - 6\{-\lambda - 2\} - 5\{2 + 3\lambda\} = 0 \Rightarrow$

$\lambda^3 - 4\lambda^2 + 5\lambda - 2 = 0 \Rightarrow (\lambda-1)^2(\lambda-2) = 0 \Rightarrow \lambda = 1$ (twice repeated), $2$

For $\lambda = 1$ (twice repeated): $(A - \lambda I)X_1 = O \Rightarrow \begin{bmatrix} -1 & 6 & -5 \\ 1 & -1 & 2 \\ 3 & 2 & 3 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$

$\begin{bmatrix} -1 & 6 & -5 & 0 \\ 1 & -1 & 2 & 0 \\ 3 & 2 & 3 & 0 \end{bmatrix} \xrightarrow{R_1+R_2\to R_2,\ 3R_1+R_3\to R_3} \begin{bmatrix} -1 & 6 & -5 & 0 \\ 0 & 5 & -3 & 0 \\ 0 & 20 & -12 & 0 \end{bmatrix} \xrightarrow{-4R_2+R_3\to R_3} \begin{bmatrix} -1 & 6 & -5 & 0 \\ 0 & 5 & -3 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$

Solve with $x_3 = t$ as free variable to get $x_2 = \frac{3}{5}t$, $x_1 = -\frac{7}{5}t$. Hence $X_1 = \begin{bmatrix} -7 \\ 3 \\ 5 \end{bmatrix}$

The associated generalized eigenvector is obtained from $(A - \lambda I)X_2 = X_1 \Rightarrow$

$\begin{bmatrix} -1 & 6 & -5 \\ 1 & -1 & 2 \\ 3 & 2 & 3 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} -7 \\ 3 \\ 5 \end{bmatrix}$

$\begin{bmatrix} -1 & 6 & -5 & -7 \\ 1 & -1 & 2 & 3 \\ 3 & 2 & 3 & 5 \end{bmatrix} \xrightarrow{R_1+R_2\to R_2,\ 3R_1+R_3\to R_3} \begin{bmatrix} -1 & 6 & -5 & -7 \\ 0 & 5 & -3 & -4 \\ 0 & 20 & -12 & -16 \end{bmatrix} \xrightarrow{-4R_2+R_3\to R_3} \begin{bmatrix} -1 & 6 & -5 & -7 \\ 0 & 5 & -3 & -4 \\ 0 & 0 & 0 & 0 \end{bmatrix}$

Set $x_3 = t$ as free variable to get $x_2 = \frac{3t-4}{5}$, $x_1 = \frac{-7t+11}{5}$ so that $\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \frac{1}{5}\left\{\begin{bmatrix} -7 \\ 3 \\ 5 \end{bmatrix}t + \begin{bmatrix} 11 \\ -4 \\ 0 \end{bmatrix}\right\}$

Hence the generalized eigenvector is $X_2 = \begin{bmatrix} \frac{11}{5} \\ -\frac{4}{5} \\ 0 \end{bmatrix}$ for $t = 0$

For $\lambda = 2$: $(A - \lambda I)X_3 = O \Rightarrow \begin{bmatrix} -2 & 6 & -5 \\ 1 & -2 & 2 \\ 3 & 2 & 2 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$

$\begin{bmatrix} -2 & 6 & -5 & 0 \\ 1 & -2 & 2 & 0 \\ 3 & 2 & 2 & 0 \end{bmatrix} \xrightarrow{\frac{1}{2}R_1+R_2\to R_2,\ \frac{3}{2}R_1+R_3\to R_3} \begin{bmatrix} -2 & 6 & -5 & 0 \\ 0 & 1 & -\frac{1}{2} & 0 \\ 0 & 11 & -\frac{11}{2} & 0 \end{bmatrix} \xrightarrow{-11R_2+R_3\to R_3} \begin{bmatrix} -2 & 6 & -5 & 0 \\ 0 & 1 & -\frac{1}{2} & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$

Set $x_3 = t$ as free variable to get $x_2 = \frac{t}{2}$, $x_1 = -t$ so that $X_3 = \begin{bmatrix} -2 \\ 1 \\ 2 \end{bmatrix}$

Hence $Q = \begin{bmatrix} -7 & \frac{11}{5} & -2 \\ 3 & -\frac{4}{5} & 1 \\ 5 & 0 & 2 \end{bmatrix}$

Compute (use MATLAB) $J = Q^{-1}AQ = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix}$, which is block diagonal.

What if we had chosen $Q = \begin{bmatrix} -2 & -7 & \frac{11}{5} \\ 1 & 3 & -\frac{4}{5} \\ 2 & 5 & 0 \end{bmatrix}$?

Compute (use MATLAB) $J = Q^{-1}AQ = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix}$, which is block diagonal. The Jordan blocks simply moved around!!! The Jordan canonical form is unique up to a rearrangement of the Jordan blocks. ◼
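MATLAB's jordan function (available with the Symbolic Math Toolbox) returns both a transformation matrix and the Jordan form and can be used to check this example; note that the transformation matrix it returns need not equal the Q constructed by hand, and the block ordering may differ:

A = [0 6 -5; 1 0 2; 3 2 4];
[Q, J] = jordan(A)     % J contains a 2x2 block for lambda = 1 and a 1x1 block for lambda = 2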

Example 8: Find the eigenvectors & generalized eigenvectors of $A = \begin{bmatrix} 1 & 2 & 3 \\ 0 & 1 & 4 \\ 0 & 0 & 1 \end{bmatrix}$ and evaluate $Q^{-1}AQ$, where 𝑄 is the matrix of generalized eigenvectors.

Set $|A - \lambda I| = \begin{vmatrix} 1-\lambda & 2 & 3 \\ 0 & 1-\lambda & 4 \\ 0 & 0 & 1-\lambda \end{vmatrix} = 0 \Rightarrow (1-\lambda)^3 = 0 \Rightarrow \lambda = 1$ (thrice repeated)

For $\lambda = 1$, $A - \lambda I = A - I = \begin{bmatrix} 0 & 2 & 3 \\ 0 & 0 & 4 \\ 0 & 0 & 0 \end{bmatrix}$ has rank 2. Hence $q = 3 - 2 = 1$. Hence there is one eigenvector and two generalized eigenvectors.

Since 𝑞 = 1 (simple degeneracy) we can use both the methods discussed earlier.

Method 1: We first find the eigenvectors.

For $\lambda = 1$: $(A - \lambda I)X_1 = O \Rightarrow \begin{bmatrix} 0 & 2 & 3 \\ 0 & 0 & 4 \\ 0 & 0 & 0 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$

Solve with $x_1 = t$ as free variable to get $x_3 = 0$, $x_2 = 0$. Hence $X_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$

The associated generalized eigenvector $X_2$ is obtained from $(A - \lambda I)X_2 = X_1 \Rightarrow$

$\begin{bmatrix} 0 & 2 & 3 \\ 0 & 0 & 4 \\ 0 & 0 & 0 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$

Solve with $x_1 = t$ as free variable to get $x_3 = 0$, $x_2 = \frac{1}{2}$. Hence $X_2 = \begin{bmatrix} t \\ \frac{1}{2} \\ 0 \end{bmatrix}$

Hence the generalized eigenvector $X_2 = \begin{bmatrix} 0 \\ \frac{1}{2} \\ 0 \end{bmatrix}$ for $t = 0$

The generalized eigenvector $X_3$ is obtained from $(A - \lambda I)X_3 = X_2 \Rightarrow$

$\begin{bmatrix} 0 & 2 & 3 \\ 0 & 0 & 4 \\ 0 & 0 & 0 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ \frac{1}{2} \\ 0 \end{bmatrix}$

Solve with $x_1 = t$ as free variable to get $x_3 = \frac{1}{8}$, $x_2 = -\frac{3}{16}$. Hence $X_3 = \begin{bmatrix} t \\ -\frac{3}{16} \\ \frac{1}{8} \end{bmatrix}$

Hence the generalized eigenvector $X_3 = \begin{bmatrix} 0 \\ -\frac{3}{16} \\ \frac{1}{8} \end{bmatrix}$ for $t = 0$

Hence $Q = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \frac{1}{2} & -\frac{3}{16} \\ 0 & 0 & \frac{1}{8} \end{bmatrix}$

Compute (use MATLAB) $J = Q^{-1}AQ = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix}$, which is block diagonal with a single Jordan block. Notice the two 1's above the main diagonal elements.

Method 2: We determine the chain of generalized eigenvectors and eigenvectors.

First find the index $k$ of the eigenvalue – i.e.; the smallest integer $k$ such that $\operatorname{rank}((A - \lambda I)^k) = n - m$. Since $n = 3$ and $m = 3$, we need $\operatorname{rank}((A - \lambda I)^k) = n - m = 0$.

$A - \lambda I = A - I = \begin{bmatrix} 0 & 2 & 3 \\ 0 & 0 & 4 \\ 0 & 0 & 0 \end{bmatrix} \Rightarrow (A - I)^2 = \begin{bmatrix} 0 & 2 & 3 \\ 0 & 0 & 4 \\ 0 & 0 & 0 \end{bmatrix}\begin{bmatrix} 0 & 2 & 3 \\ 0 & 0 & 4 \\ 0 & 0 & 0 \end{bmatrix} = \begin{bmatrix} 0 & 0 & 8 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}$ has rank 1.

$(A - I)^3 = \begin{bmatrix} 0 & 2 & 3 \\ 0 & 0 & 4 \\ 0 & 0 & 0 \end{bmatrix}\begin{bmatrix} 0 & 0 & 8 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}$ has rank 0.

Clearly $(A - I)^3 X_3 = O$ for any $X_3$. We also require $(A - I)^2 X_3 \neq O \Rightarrow \begin{bmatrix} 0 & 0 & 8 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} \neq O$

Choose $X_3 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$, which satisfies both these requirements, as the generalized eigenvector.

Then $X_2 = (A - I)X_3 = \begin{bmatrix} 0 & 2 & 3 \\ 0 & 0 & 4 \\ 0 & 0 & 0 \end{bmatrix}\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 3 \\ 4 \\ 0 \end{bmatrix}$ and $X_1 = (A - I)X_2 = \begin{bmatrix} 0 & 2 & 3 \\ 0 & 0 & 4 \\ 0 & 0 & 0 \end{bmatrix}\begin{bmatrix} 3 \\ 4 \\ 0 \end{bmatrix} = \begin{bmatrix} 8 \\ 0 \\ 0 \end{bmatrix}$

Hence $Q = \begin{bmatrix} 8 & 3 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & 1 \end{bmatrix}$

Compute (use MATLAB) $J = Q^{-1}AQ = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix}$ as before. ◼
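The same check with jordan for this example (Symbolic Math Toolbox assumed; the returned transformation matrix may differ from either Q constructed above, but J should be the single 3x3 Jordan block):

A = [1 2 3; 0 1 4; 0 0 1];
[Q, J] = jordan(A)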

Properties of eigenvalues and the Cayley-Hamilton theorem

The following properties of eigenvalues are given without proof:

(i) The sum of the eigenvalues is equal to the trace (sum of the principal diagonal
elements) of the square matrix

(ii) The product of the eigenvalues is equal to the determinant of the square matrix

(iii) A square matrix satisfies its own characteristic equation (Cayley-Hamilton theorem)

The example below verifies the above:

Example 9: Verify the above properties for $A = \begin{bmatrix} 10 & -2 & 4 \\ -20 & 4 & -10 \\ -30 & 6 & -13 \end{bmatrix}$ of Example 2.

We found 𝜆(𝜆 − 2)(𝜆 + 1) = 0 to be the characteristic equation ⟹ 𝜆 = −1, 0, 2

Prop (i): Sum of the eigenvalues = −1 + 0 + 2 = 1

Trace of the given matrix = 10 + 4 − 13 = 1. Verified.

Prop (ii): Product of the eigenvalues = −1 × 0 × 2 = 0

det(A) = 10(−52 + 60) + 2(260 − 300) + 4(−120 + 120) = 0. Verified.

Prop (iii): Cayley-Hamilton theorem

Characteristic equation is 𝜆(𝜆 − 2)(𝜆 + 1) = 0. Verify that 𝐴(𝐴 − 2𝐼)(𝐴 + 𝐼) = 𝑂

$A(A - 2I)(A + I) = \begin{bmatrix} 10 & -2 & 4 \\ -20 & 4 & -10 \\ -30 & 6 & -13 \end{bmatrix}\begin{bmatrix} 8 & -2 & 4 \\ -20 & 2 & -10 \\ -30 & 6 & -15 \end{bmatrix}\begin{bmatrix} 11 & -2 & 4 \\ -20 & 5 & -10 \\ -30 & 6 & -12 \end{bmatrix} = \begin{bmatrix} 10 & -2 & 4 \\ -20 & 4 & -10 \\ -30 & 6 & -13 \end{bmatrix}\begin{bmatrix} 8 & -2 & 4 \\ 40 & -10 & 20 \\ 0 & 0 & 0 \end{bmatrix} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}$. Verified.

An interesting observation is that the product of two non-null matrices may be null. ◼
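A minimal MATLAB sketch of these three verifications (trace, det, eig and the matrix products are all built-in; round-off may leave values that are only approximately zero):

A = [10 -2 4; -20 4 -10; -30 6 -13];
lam = eig(A);
[sum(lam)  trace(A)]               % property (i): both equal 1
[prod(lam) det(A)]                 % property (ii): both equal 0
A*(A - 2*eye(3))*(A + eye(3))      % property (iii): the zero matrix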
Example 10: Find the inverse of $A = \begin{bmatrix} 2 & -1 & 1 \\ -1 & 2 & -1 \\ 1 & -1 & 2 \end{bmatrix}$ using the Cayley-Hamilton theorem.

The characteristic equation is $\begin{vmatrix} 2-\lambda & -1 & 1 \\ -1 & 2-\lambda & -1 \\ 1 & -1 & 2-\lambda \end{vmatrix} = 0 \Rightarrow$

$(2-\lambda)\{(2-\lambda)^2 - 1\} + 1\{-(2-\lambda) + 1\} + 1\{1 - (2-\lambda)\} = 0 \Rightarrow$

$(2-\lambda)\{\lambda^2 - 4\lambda + 3\} + \{\lambda - 1\} + \{\lambda - 1\} = 0 \Rightarrow \lambda^3 - 6\lambda^2 + 9\lambda - 4 = 0$

Hence $A^3 - 6A^2 + 9A - 4I = O \Rightarrow A^2 - 6A + 9I - 4A^{-1} = O \Rightarrow A^{-1} = \frac{1}{4}\{A^2 - 6A + 9I\}$

$A^2 = \begin{bmatrix} 2 & -1 & 1 \\ -1 & 2 & -1 \\ 1 & -1 & 2 \end{bmatrix}\begin{bmatrix} 2 & -1 & 1 \\ -1 & 2 & -1 \\ 1 & -1 & 2 \end{bmatrix} = \begin{bmatrix} 6 & -5 & 5 \\ -5 & 6 & -5 \\ 5 & -5 & 6 \end{bmatrix}$

$A^{-1} = \frac{1}{4}\{A^2 - 6A + 9I\} = \frac{1}{4}\left\{\begin{bmatrix} 6 & -5 & 5 \\ -5 & 6 & -5 \\ 5 & -5 & 6 \end{bmatrix} - 6\begin{bmatrix} 2 & -1 & 1 \\ -1 & 2 & -1 \\ 1 & -1 & 2 \end{bmatrix} + 9\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}\right\} = \frac{1}{4}\begin{bmatrix} 3 & 1 & -1 \\ 1 & 3 & 1 \\ -1 & 1 & 3 \end{bmatrix}$ ◼
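A quick MATLAB check of the Cayley-Hamilton based inverse (a sketch; inv is used here only for comparison):

A = [2 -1 1; -1 2 -1; 1 -1 2];
Ainv = (A^2 - 6*A + 9*eye(3))/4    % the formula derived above
Ainv*A                             % identity matrix
norm(Ainv - inv(A))                % essentially zero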

Functions of matrices

Recall the definition of the exponential function $e^x = 1 + x + \frac{x^2}{2!} + \cdots + \frac{x^n}{n!} + \cdots$

Let us extend the above definition for evaluating $e^A$, the exponential of the matrix 𝐴. (The matrix 𝐴 must be square: for a non-square 𝐴, $A^2 = AA$ and the higher powers of 𝐴 are undefined, and there is no unique identity 𝐼 such that $IA = AI = A$.) If we restrict this discussion to diagonalizable matrices 𝐴, $D = P^{-1}AP \Rightarrow A = PDP^{-1}$; i.e.;

$e^A = I + A + \frac{A^2}{2!} + \cdots + \frac{A^n}{n!} + \cdots$

$= I + PDP^{-1} + \frac{(PDP^{-1})^2}{2!} + \cdots + \frac{(PDP^{-1})^n}{n!} + \cdots$

$= I + PDP^{-1} + \frac{PDP^{-1}PDP^{-1}}{2!} + \cdots + \frac{PDP^{-1}PDP^{-1}\cdots PDP^{-1}}{n!} + \cdots$

$= PIP^{-1} + PDP^{-1} + \frac{PD^2P^{-1}}{2!} + \cdots + \frac{PD^nP^{-1}}{n!} + \cdots$

$= P\left\{I + D + \frac{D^2}{2!} + \cdots + \frac{D^n}{n!} + \cdots\right\}P^{-1}$

$= Pe^DP^{-1}$

Example 11: Evaluate $e^A$ if $A = \begin{bmatrix} -4 & 2 \\ -3 & 1 \end{bmatrix}$

Determine the eigenvalues from $\begin{vmatrix} -4-\lambda & 2 \\ -3 & 1-\lambda \end{vmatrix} = 0 \Rightarrow \lambda^2 + 3\lambda + 2 = 0 \Rightarrow \lambda = -2, -1$

Eigenvectors are the solutions of $\begin{bmatrix} -4-\lambda & 2 \\ -3 & 1-\lambda \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$

$\lambda = -2$: $\begin{bmatrix} -2 & 2 \\ -3 & 3 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \Rightarrow \begin{bmatrix} 1 \\ 1 \end{bmatrix}$ is an eigenvector

$\lambda = -1$: $\begin{bmatrix} -3 & 2 \\ -3 & 2 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \Rightarrow \begin{bmatrix} 2 \\ 3 \end{bmatrix}$ is an eigenvector

Hence $P = \begin{bmatrix} 1 & 2 \\ 1 & 3 \end{bmatrix} \Rightarrow P^{-1} = \begin{bmatrix} 3 & -2 \\ -1 & 1 \end{bmatrix}$ and $D = \begin{bmatrix} -2 & 0 \\ 0 & -1 \end{bmatrix}$

$e^A = Pe^DP^{-1} = \begin{bmatrix} 1 & 2 \\ 1 & 3 \end{bmatrix}\begin{bmatrix} e^{-2} & 0 \\ 0 & e^{-1} \end{bmatrix}\begin{bmatrix} 3 & -2 \\ -1 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 2 \\ 1 & 3 \end{bmatrix}\begin{bmatrix} 3e^{-2} & -2e^{-2} \\ -e^{-1} & e^{-1} \end{bmatrix} = \begin{bmatrix} 3e^{-2} - 2e^{-1} & -2e^{-2} + 2e^{-1} \\ 3e^{-2} - 3e^{-1} & -2e^{-2} + 3e^{-1} \end{bmatrix}$ ◼
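In MATLAB the matrix exponential is expm (not exp, which works element-wise). A minimal check of this example:

A = [-4 2; -3 1];
expm(A)                            % built-in matrix exponential
P = [1 2; 1 3];  D = diag([-2 -1]);
P*expm(D)/P                        % P*exp(D)*inv(P): same result up to round-off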

The Cayley-Hamilton theorem is very useful in reducing matrix polynomials, which means it enables the evaluation of functions of a matrix when the function can be expanded as a power series (a Taylor series, for instance).

When a polynomial 𝑝(𝑥) is divided by the polynomial 𝑑(𝑥) of degree not exceeding that
of 𝑝(𝑥), we get the quotient polynomial 𝑞(𝑥) and a remainder polynomial 𝑟(𝑥) of lower
degree than 𝑑(𝑥), just as with numbers. i.e.; 𝑝(𝑥) = 𝑞(𝑥)𝑑(𝑥) + 𝑟(𝑥).

If we have a matrix polynomial 𝑝(𝐴), where 𝐴 is a square matrix, then we can write using the above notation 𝑝(𝐴) = 𝑞(𝐴)𝑑(𝐴) + 𝑟(𝐴). If we choose 𝑑(𝐴) to be the characteristic polynomial ∆(𝐴) of the matrix 𝐴, then 𝑝(𝐴) = 𝑞(𝐴)∆(𝐴) + 𝑟(𝐴), or in terms of 𝑥, 𝑝(𝑥) = 𝑞(𝑥)∆(𝑥) + 𝑟(𝑥). When 𝑥 = 𝜆ᵢ, 𝑖 = 1, 2, … 𝑛, ∆(𝑥) = 0, since the eigenvalues are the roots of the characteristic polynomial (and ∆(𝐴) = 𝑂 by the Cayley-Hamilton theorem). This provides a powerful technique for evaluating functions of a matrix, as the degree of 𝑟(𝑥) is lower than that of ∆(𝑥). We only need the eigenvalues, not the eigenvectors, to work with. This is illustrated below.

Example 12: Evaluate $e^A$ if $A = \begin{bmatrix} -4 & 2 \\ -3 & 1 \end{bmatrix}$ – the same matrix as in Example 11.

Determine the eigenvalues from $\begin{vmatrix} -4-\lambda & 2 \\ -3 & 1-\lambda \end{vmatrix} = 0 \Rightarrow \lambda^2 + 3\lambda + 2 = 0 \Rightarrow \lambda = -2, -1$

$p(A) = q(A)\Delta(A) + r(A) \Rightarrow e^A = q(A)(A^2 + 3A + 2I) + (aA + bI) = aA + bI$, since $r(A)$ is of lower degree than $\Delta(A)$ and $A^2 + 3A + 2I = O$ by the Cayley-Hamilton theorem.

Write $e^x = q(x)(x^2 + 3x + 2) + (ax + b)$, which reduces to $e^x = ax + b$ when 𝑥 is an eigenvalue, since $x^2 + 3x + 2 = 0$ for $x = -2, -1$. Using this, determine 𝑎 and 𝑏 from:

$e^{-2} = -2a + b$

$e^{-1} = -a + b$

Hence, $a = e^{-1} - e^{-2}$ and $b = 2e^{-1} - e^{-2}$

Hence, $e^A = aA + bI = (e^{-1} - e^{-2})\begin{bmatrix} -4 & 2 \\ -3 & 1 \end{bmatrix} + (2e^{-1} - e^{-2})\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 3e^{-2} - 2e^{-1} & -2e^{-2} + 2e^{-1} \\ 3e^{-2} - 3e^{-1} & -2e^{-2} + 3e^{-1} \end{bmatrix}$ as in Example 11 ◼

Note that if eigenvalues are repeated, the above procedure needs a slight modification as follows: For a twice-repeated eigenvalue, first use the equation $e^x = ax + b$ at $x = \lambda$. Then take the derivative of both sides and set $x = \lambda$; i.e.; $\frac{d}{dx}e^x = \frac{d}{dx}(ax + b) \Rightarrow e^x = a$ at $x = \lambda$ (the repeated eigenvalue). Take $(n-1)$ derivatives for an eigenvalue repeated 𝑛 times to generate as many equations as the unknowns.

Properties of the matrix exponential
1. $e^{At} = I + At + \frac{(At)^2}{2!} + \cdots + \frac{(At)^n}{n!} + \cdots$
2. $\frac{d}{dt}e^{At} = Ae^{At} = e^{At}A$
3. $e^{(A+B)t} = e^{At}e^{Bt} = e^{Bt}e^{At}$ if $AB = BA$; otherwise $e^{(A+B)t} \neq e^{At}e^{Bt} \neq e^{Bt}e^{At}$
4. $(e^{At})^{-1} = e^{-At}$ (the matrix $e^{At}$ is always nonsingular)
5. If $V$ is nonsingular, then $e^{(VAV^{-1})t} = I + VAV^{-1}t + \frac{(VAV^{-1}VAV^{-1})t^2}{2!} + \cdots + \frac{(VAV^{-1}VAV^{-1}\cdots VAV^{-1})t^n}{n!} + \cdots = V\left\{I + At + \frac{(At)^2}{2!} + \cdots + \frac{(At)^n}{n!} + \cdots\right\}V^{-1} = Ve^{At}V^{-1}$
6. If $v$ is an eigenvector corresponding to the eigenvalue $\lambda$ of $A$, then $e^{At}v = \left(I + At + \frac{(At)^2}{2!} + \cdots + \frac{(At)^n}{n!} + \cdots\right)v = v + (\lambda t)v + \frac{(\lambda t)^2}{2!}v + \cdots + \frac{(\lambda t)^n}{n!}v + \cdots = e^{\lambda t}v$
7. If the matrix $A$ has a single eigenvalue $\lambda$ repeated $n$ times, then $(A - \lambda I)^k = O$ for some $k$ ($0 < k \leq n$). Hence:
$e^{At} = e^{(\lambda I + A - \lambda I)t} = e^{\lambda I t}e^{(A-\lambda I)t} = e^{\lambda t}Ie^{(A-\lambda I)t}$
$= e^{\lambda t}I\left(I + (A-\lambda I)t + \frac{(A-\lambda I)^2t^2}{2!} + \cdots + \frac{(A-\lambda I)^nt^n}{n!} + \cdots\right)$
$= e^{\lambda t}\left(I + (A-\lambda I)t + \frac{(A-\lambda I)^2t^2}{2!} + \cdots + \frac{(A-\lambda I)^{k-1}t^{k-1}}{(k-1)!}\right)$

Example 13: Determine $e^{At}$ if $A = \begin{bmatrix} 1 & 4 \\ -1 & -3 \end{bmatrix}$. Use property 7 above.

Determine the eigenvalues of $A$: $\lambda = -1$ (repeated twice)

$A - \lambda I = \begin{bmatrix} 2 & 4 \\ -1 & -2 \end{bmatrix}$; $(A - \lambda I)^2 = \begin{bmatrix} 2 & 4 \\ -1 & -2 \end{bmatrix}\begin{bmatrix} 2 & 4 \\ -1 & -2 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}$

Hence, $e^{At} = e^{\lambda t}\left(I + (A - \lambda I)t + \frac{(A-\lambda I)^2t^2}{2!} + \cdots + \frac{(A-\lambda I)^{k-1}t^{k-1}}{(k-1)!}\right)$

$= e^{-t}\left(\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} + \begin{bmatrix} 2 & 4 \\ -1 & -2 \end{bmatrix}t\right) = e^{-t}\begin{bmatrix} 1+2t & 4t \\ -t & 1-2t \end{bmatrix}$ ◼

Note: The exponential of a defective matrix 𝐴 can be determined from its generalized eigenvectors (or by means of the Cayley-Hamilton theorem).
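If the Symbolic Math Toolbox is available, expm also accepts symbolic arguments, so the closed form of Example 13 can be checked directly (a minimal sketch under that assumption):

syms t
A = [1 4; -1 -3];
simplify(expm(A*t))                % exp(-t)*[1+2*t, 4*t; -t, 1-2*t]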

In-class practice: Determine the eigenvectors and generalized eigenvectors of the matrix $A = \begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$.

Take 10 minutes to do this problem (solution below).

SOLUTION:

$A - \lambda I = \begin{bmatrix} -\lambda & 0 & 1 & 0 \\ 0 & -\lambda & 0 & 1 \\ 0 & 0 & -\lambda & 0 \\ 0 & 0 & 0 & -\lambda \end{bmatrix}$

$\det(A - \lambda I) = \lambda^4 = 0 \Rightarrow \lambda = 0$ ($m = 4$)

At $\lambda = 0$, $A - \lambda I = \begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \Rightarrow \operatorname{rank}(A - \lambda I) = 2 \Rightarrow$ degeneracy $q = 4 - 2 = 2$

There are 2 eigenvectors and 2 generalized eigenvectors.

How do we know which of the following chain structures applies?

(i) one eigenvector + 2 generalized eigenvectors, followed by the second eigenvector, or
(ii) one eigenvector + one generalized eigenvector, followed by the second eigenvector + its generalized eigenvector

Note that $n - m = 4 - 4 = 0$. Let's find the smallest integer $k$ for which $\operatorname{rank}((A - \lambda I)^k) = n - m$.

At $\lambda = 0$, $(A - \lambda I)^2 = \begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}\begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \Rightarrow \operatorname{rank}((A - \lambda I)^2) = 0$

i.e.; $k = 2$. So one Jordan block is of size 2, and so the second must be of size 2 as well.

i.e.; we have situation (ii) above. Now let’s determine the generalized eigenvectors and
then the eigenvectors.

$(A - \lambda I)^2 X = O$ is satisfied by any vector, but the linearly independent ones are:

$X_1 = [1\ 0\ 0\ 0]'$, $X_2 = [0\ 1\ 0\ 0]'$, $X_3 = [0\ 0\ 1\ 0]'$, $X_4 = [0\ 0\ 0\ 1]'$

But $(A - \lambda I)X_1 = O$ and $(A - \lambda I)X_2 = O$. Hence $X_1$ and $X_2$ are actually eigenvectors.

$(A - \lambda I)X_3 = [1\ 0\ 0\ 0]' = X_1 \Rightarrow X_3$ is a generalized eigenvector of the chain $X_3 - X_1$


$(A - \lambda I)X_4 = [0\ 1\ 0\ 0]' = X_2 \Rightarrow X_4$ is a generalized eigenvector of the chain $X_4 - X_2$

Hence $Q = [X_1\ X_3\ X_2\ X_4] = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$. Now use MATLAB to compute $J = Q^{-1}AQ$.
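A minimal MATLAB sketch of the computation requested above (jordan, if available with the Symbolic Math Toolbox, gives the same block structure):

A = [0 0 1 0; 0 0 0 1; 0 0 0 0; 0 0 0 0];
Q = [1 0 0 0; 0 0 1 0; 0 1 0 0; 0 0 0 1];   % columns X1, X3, X2, X4
J = Q\A*Q                                    % two 2x2 Jordan blocks with eigenvalue 0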

-§§§§§§§§§§§-

------------------------------------------------------------------------------------------------------------------------------------------------------------
Advanced Engineering Math (DE/AE ZG535) course notes
Param, Core Engineering Group, WILP Division, BITS Pilani
