
Numerical Analysis for Scientists and Engineers: Theory and C Programs

Madhumangal Pal
Department of Applied Mathematics with Oceanology and Computer Programming
Vidyasagar University
Midnapore - 721102

Dedicated to my parents
Preface

Numerical Analysis is a multidisciplinary subject. It forms an integral part of
the undergraduate and postgraduate curriculum in Mathematics, Computer Science,
Physics, Commerce and different Engineering streams. Numerical Analysis shows the
way to obtain numerical answers to applied problems. Numerical methods step in
where analytical methods fail or are too complicated to solve the problem, for example,
in finding the roots of transcendental equations or in solving non-linear differential
equations. So, it is quite impossible to train students in applied sciences or engineering
without an adequate knowledge of numerical methods.
The book is suitable for undergraduate as well as postgraduate students and
advanced readers. Ample material is presented so that instructors will be able to select
topics appropriate to their needs. The book contains ten chapters.
In Chapter 1, different types of errors and their sources in numerical computation
are presented. The representation of floating point numbers and their arithmetic are
studied in this chapter.
Finite difference operators and the relations among them are studied in detail in
Chapter 2. Difference equations and their solution methods are also introduced here.
Chapter 3 is devoted to single and bivariate interpolation. Different types of
interpolation methods, such as those of Lagrange, Newton, Bessel, Stirling, Hermite
and Everett, are incorporated here. Inverse and cubic spline interpolation techniques
are also presented in this chapter, along with several bivariate methods.
Several methods, viz., graphical, tabulation, bisection, regula-falsi, fixed point
iteration, Newton-Raphson, Aitken, secant, Chebyshev and Muller, are studied in detail
in Chapter 4 for solving algebraic and transcendental equations. The geometrical
meaning and the rate of convergence of these methods are also presented. A very new
method, modified Newton-Raphson with cubic convergence, is introduced here. The
Birge-Vieta, Bairstow and Graeffe root squaring methods are deduced and illustrated
for finding the roots of a polynomial equation. Methods to solve systems of non-linear
equations are also introduced.
Chapter 5 deals with the solution of systems of linear equations. Different direct and
iterative methods, such as matrix inverse, Gauss-Jordan, Gauss elimination, LU
decomposition, Cholesky, matrix partition, Jacobi, Gauss-Seidel and relaxation, are
studied in detail. Very new methods to find a tri-diagonal determinant and to solve a
tri-diagonal system of equations are incorporated. A method to solve ill-conditioned
systems is discussed. The generalised inverse of a matrix is introduced, and the least
squares solution method for an inconsistent system is illustrated here.
Determination of the eigenvalues and eigenvectors of a matrix is a very important
problem in applied science and engineering. In Chapter 6, different methods, viz.,
Leverrier-Faddeev, Rutishauser, power, Jacobi, Givens and Householder, are presented
to find the eigenvalues and eigenvectors of arbitrary and symmetric matrices.
Chapter 7 contains an in-depth presentation of several methods to find the derivative
and integral of a function. Three types of integration methods, viz., Newton-Cotes
(trapezoidal, Simpson, Boole, Weddle), Gaussian (Gauss-Legendre, Lobatto, Radau,
Gauss-Chebyshev, Gauss-Hermite, Gauss-Laguerre, Gauss-Jacobi) and Monte Carlo,
are studied in detail. The Euler-Maclaurin sum formula and Romberg integration are
also studied in this chapter, and an introduction to double integration is given.
To solve ordinary differential equations, the Taylor series, Picard, Euler, Runge-Kutta,
Runge-Kutta-Fehlberg, Runge-Kutta-Butcher, Adams-Bashforth-Moulton, Milne, finite-
difference, shooting and finite element methods are discussed in Chapter 8. Stability
analysis of some of these methods is also presented.
An introduction to the solution of partial differential equations is given in Chapter 9.
Finite difference methods to solve parabolic, hyperbolic and elliptic PDEs are discussed here.
Least squares approximation techniques are discussed in Chapter 10. Methods to fit
straight-line, parabolic, geometric and other curves are illustrated here. Orthogonal
polynomials, their applications and Chebyshev approximation are discussed in this
chapter.
Algorithms and C programs are supplied for most of the important methods
discussed in this book.
First of all, I would like to thank Prof. N. Dutta and Prof. R.N. Jana, as it was from
their book that I learnt my first lessons in the subject.
In writing this book I have taken help from several books, research articles and some
websites mentioned in the bibliography, and I acknowledge them gratefully.
This book could not have been completed without the moral and loving support and
continuous encouragement of my wife Anita and my son Aniket.
Also, I would like to express my sincere appreciation to my teachers and colleagues,
especially Prof. M. Maiti, Prof. T.K. Pal and Prof. R.N. Jana, who took all the
academic and administrative loads of the department on their shoulders, providing
me with sufficient time to write this book. I would also like to acknowledge my other
colleagues Dr. K. De and Dr. S. Mondal for their encouragement.
I express my sincerest gratitude to my teacher Prof. G.P. Bhattacharjee, Indian
Institute of Technology, Kharagpur, for his continuous encouragement.

I feel great reverence for my parents, sisters, sister-in-law and relatives for their
blessings and being a constant source of inspiration.
I would like to thank Sk. Md. Abu Nayeem, Dr. Amiya K. Shyamal and Dr. Anita
Saha for scrutinizing the manuscript.
I shall be glad to receive constructive criticism for the improvement of the book
from experts as well as learners.
I thank Narosa Publishing House Pvt. Ltd. for their sincere care in the
publication of the book.
Madhumangal Pal
Contents

1 Errors in Numerical Computations 1


1.1 Sources of Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Exact and Approximate Numbers . . . . . . . . . . . . . . . . . . . . . . 2
1.3 Absolute, Relative and Percentage Errors . . . . . . . . . . . . . . . . . 4
1.4 Valid Significant Digits . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.5 Propagation of Errors in Arithmetic Operations . . . . . . . . . . . . . . 9
1.5.1 The errors in sum and difference . . . . . . . . . . . . . . . . . . 9
1.5.2 The error in product . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.5.3 The error in quotient . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.5.4 The errors in power and in root . . . . . . . . . . . . . . . . . . . 14
1.5.5 Error in evaluation of a function of several variables . . . . . . . 15
1.6 Significant Error . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
1.7 Representation of Numbers in Computer . . . . . . . . . . . . . . . . . . 18
1.8 Arithmetic of Normalized Floating Point Numbers . . . . . . . . . . . . 19
1.8.1 Addition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
1.8.2 Subtraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
1.8.3 Multiplication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
1.8.4 Division . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
1.9 Effect of Normalized Floating Point Representations . . . . . . . . . . . 22
1.9.1 Zeros in floating point numbers . . . . . . . . . . . . . . . . . . . 23
1.10 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23

2 Calculus of Finite Diff. and Diff. Equs 27


2.1 Finite Difference Operators . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.1.1 Forward differences . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.1.2 Backward differences . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.1.3 Central differences . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.1.4 Shift, Average and Differential operators . . . . . . . . . . . . . . 30
2.1.5 Factorial notation . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.2 Properties of Forward Differences . . . . . . . . . . . . . . . . . . . . . . 32


2.2.1 Properties of shift operators . . . . . . . . . . . . . . . . . . . . . 33


2.3 Relations Among Operators . . . . . . . . . . . . . . . . . . . . . . . . . 34
2.4 Representation of Polynomial using Factorial Notation . . . . . . . . . . 39
2.5 Difference of a Polynomial . . . . . . . . . . . . . . . . . . . . . . . . . . 43
2.6 Summation of Series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
2.7 Worked out Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
2.8 Difference Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
2.8.1 Formation of difference equations . . . . . . . . . . . . . . . . . . 53
2.9 Solution of Difference Equations . . . . . . . . . . . . . . . . . . . . . . 55
2.9.1 Iterative method . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
2.9.2 Solution using symbolic operators . . . . . . . . . . . . . . . . . 56
2.9.3 Generating function . . . . . . . . . . . . . . . . . . . . . . . . . 63
2.10 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66

3 Interpolation 71
3.1 Lagrange’s Interpolation Polynomial . . . . . . . . . . . . . . . . . . . . 72
3.1.1 Lagrangian interpolation formula for equally spaced points . . . 75
3.2 Properties of Lagrangian Functions . . . . . . . . . . . . . . . . . . . . . 77
3.3 Error in Interpolating Polynomial . . . . . . . . . . . . . . . . . . . . . . 79
3.4 Finite Differences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
3.4.1 Forward differences . . . . . . . . . . . . . . . . . . . . . . . . . . 87
3.4.2 Backward differences . . . . . . . . . . . . . . . . . . . . . . . . . 88
3.4.3 Error propagation in a difference table . . . . . . . . . . . . . . . 88
3.5 Newton’s Forward Difference Interpolation Formula . . . . . . . . . . . 90
3.5.1 Error in Newton’s forward formula . . . . . . . . . . . . . . . . . 92
3.6 Newton’s Backward Difference Interpolation Formula . . . . . . . . . . . 95
3.6.1 Error in Newton’s backward interpolation formula . . . . . . . . 98
3.7 Gaussian Interpolation Formulae . . . . . . . . . . . . . . . . . . . . . . 99
3.7.1 Gauss’s forward difference formula . . . . . . . . . . . . . . . . . 99
3.7.2 Remainder in Gauss’s forward central difference formula . . . . . 102
3.7.3 Gauss’s backward difference formula . . . . . . . . . . . . . . . . 102
3.7.4 Remainder of Gauss’s backward central difference formula . . . . 105
3.8 Stirling’s Interpolation Formula . . . . . . . . . . . . . . . . . . . . . . . 105
3.9 Bessel’s Interpolation Formula . . . . . . . . . . . . . . . . . . . . . . . . 106
3.10 Everett’s Interpolation Formula . . . . . . . . . . . . . . . . . . . . . . . 109
3.10.1 Relation between Bessel’s and Everett’s formulae . . . . . . . . . 111
3.11 Interpolation by Iteration (Aitken’s Interpolation) . . . . . . . . . . . . 113
3.12 Divided Differences and their Properties . . . . . . . . . . . . . . . . . . 116
3.12.1 Properties of divided differences . . . . . . . . . . . . . . . . . . 117
3.13 Newton’s Fundamental Interpolation Formula . . . . . . . . . . . . . . . 121

3.14 Deductions of other Interpolation Formulae from Newton’s
Divided Difference Formula . . . . . . . . . . . . . . . . . . . . . . . . . 124
3.14.1 Newton’s forward difference interpolation formula . . . . . . . . 124
3.14.2 Newton’s backward difference interpolation formula . . . . . . . 124
3.14.3 Lagrange’s interpolation formula . . . . . . . . . . . . . . . . . . 125
3.15 Equivalence of Lagrange’s and Newton’s divided
difference formulae . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
3.16 Inverse Interpolation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
3.16.1 Inverse interpolation based on Lagrange’s formula . . . . . . . . 132
3.16.2 Method of successive approximations . . . . . . . . . . . . . . . . 132
3.16.3 Based on Newton’s backward difference interpolation formula . . 134
3.16.4 Use of inverse interpolation to find a root of an equation . . . . . 135
3.17 Choice and use of Interpolation Formulae . . . . . . . . . . . . . . . . . 136
3.18 Hermite’s Interpolation Formula . . . . . . . . . . . . . . . . . . . . . . 139
3.19 Spline Interpolation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
3.19.1 Cubic spline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
3.20 Bivariate Interpolation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156
3.20.1 Local matching methods . . . . . . . . . . . . . . . . . . . . . . . 156
3.20.2 Global methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
3.21 Worked out Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
3.22 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181

4 Sol. of Algebraic and Transcendental Equs. 189


4.1 Location of Roots . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 190
4.1.1 Graphical method . . . . . . . . . . . . . . . . . . . . . . . . . . 190
4.1.2 Method of tabulation . . . . . . . . . . . . . . . . . . . . . . . . 191
4.2 Bisection Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193
4.3 Regula-Falsi Method (Method of False Position) . . . . . . . . . . . . . 198
4.4 Iteration Method or Fixed Point Iteration . . . . . . . . . . . . . . . . . 202
4.4.1 Estimation of error . . . . . . . . . . . . . . . . . . . . . . . . . . 204
4.5 Acceleration of Convergence: Aitken’s ∆²-Process . . . . . . . . . . . . 208
4.6 Newton-Raphson Method or Method of Tangent . . . . . . . . . . . . . 210
4.6.1 Convergence of Newton-Raphson method . . . . . . . . . . . . . 213
4.7 Newton-Raphson Method for Multiple Root . . . . . . . . . . . . . . . . 218
4.8 Modification on Newton-Raphson Method . . . . . . . . . . . . . . . . . 219
4.9 Modified Newton-Raphson Method . . . . . . . . . . . . . . . . . . . . . 222
4.10 Secant Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 226
4.10.1 Convergence of secant method . . . . . . . . . . . . . . . . . . . 226
4.11 Chebyshev Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 230
4.12 Muller Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 230

4.13 Roots of Polynomial Equations . . . . . . . . . . . . . . . . . . . . . . . 237


4.13.1 Domains of roots . . . . . . . . . . . . . . . . . . . . . . . . . . . 238
4.14 Birge-Vieta Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 239
4.15 Bairstow Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245
4.16 Graeffe’s Root Squaring Method . . . . . . . . . . . . . . . . . . . . . . 253
4.17 Solution of Systems of Nonlinear Equations . . . . . . . . . . . . . . . . 261
4.17.1 The method of iteration . . . . . . . . . . . . . . . . . . . . . . . 261
4.17.2 Seidel method . . . . . . . . . . . . . . . . . . . . . . . . . 262
4.17.3 Newton-Raphson method . . . . . . . . . . . . . . . . . . . . . . 266
4.18 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 271

5 Solution of System of Linear Equations 275


5.1 Cramer’s Rule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 276
5.1.1 Computational aspect of Cramer’s rule . . . . . . . . . . . . . . . 277
5.2 Evaluation of Determinant . . . . . . . . . . . . . . . . . . . . . . . . . . 278
5.3 Inverse of a Matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 287
5.3.1 Gauss-Jordan Method . . . . . . . . . . . . . . . . . . . . . . . . 287
5.4 Matrix Inverse Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . 293
5.5 Gauss Elimination Method . . . . . . . . . . . . . . . . . . . . . . . . . 297
5.6 Gauss-Jordan Elimination Method . . . . . . . . . . . . . . . . . . . . . 302
5.7 Method of Matrix Factorization . . . . . . . . . . . . . . . . . . . . . . . 304
5.7.1 LU Decomposition Method . . . . . . . . . . . . . . . . . . . . . 304
5.8 Gauss Elimination Method to Find the Inverse of a Matrix . . . . . . 313
5.9 Cholesky Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 314
5.10 Matrix Partition Method . . . . . . . . . . . . . . . . . . . . . . . . . . 317
5.11 Solution of Tri-diagonal Systems . . . . . . . . . . . . . . . . . . . . . . 320
5.12 Evaluation of Tri-diagonal Determinant . . . . . . . . . . . . . . . . . . 325
5.13 Vector and Matrix Norms . . . . . . . . . . . . . . . . . . . . . . . . . . 326
5.14 Ill-Conditioned Linear Systems . . . . . . . . . . . . . . . . . . . . . . . 327
5.14.1 Method to solve ill-conditioned system . . . . . . . . . . . . . . . 329
5.15 Generalized Inverse (g-inverse) . . . . . . . . . . . . . . . . . . . . . . . 330
5.15.1 Greville’s algorithm for Moore-Penrose inverse . . . . . . . . . . 331
5.16 Least Squares Solution for Inconsistent Systems . . . . . . . . . . . . . . 334
5.17 Jacobi’s Iteration Method . . . . . . . . . . . . . . . . . . . . . . . . . . 338
5.17.1 Convergence of Gauss-Jacobi’s iteration . . . . . . . . . . . . . . 339
5.18 Gauss-Seidel’s Iteration Method . . . . . . . . . . . . . . . . . . . . 344
5.18.1 Convergence of Gauss-Seidel’s method . . . . . . . . . . . . . 346
5.19 The Relaxation Method . . . . . . . . . . . . . . . . . . . . . . . . . . . 352
5.20 Successive Overrelaxation (S.O.R.) Method . . . . . . . . . . . . . . . . 354
5.21 Comparison of Direct and Iterative Methods . . . . . . . . . . . . . . . . 358

5.22 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 359

6 Eigenvalues and Eigenvectors of a Matrix 365


6.1 Eigenvalue of a Matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . 365
6.2 Leverrier-Faddeev Method to Construct Characteristic Equation . . . . 368
6.2.1 Eigenvectors using Leverrier-Faddeev method . . . . . . . . . . . 372
6.3 Eigenvalues for Arbitrary Matrices . . . . . . . . . . . . . . . . . . . . . 374
6.3.1 Rutishauser method . . . . . . . . . . . . . . . . . . . . . . . . . 374
6.3.2 Power method . . . . . . . . . . . . . . . . . . . . . . . . . . . . 375
6.3.3 Power method for least eigenvalue . . . . . . . . . . . . . . . . . 380
6.4 Eigenvalues for Symmetric Matrices . . . . . . . . . . . . . . . . . . . . 380
6.4.1 Jacobi’s method . . . . . . . . . . . . . . . . . . . . . . . . . . . 381
6.4.2 Eigenvalues of a Symmetric Tri-diagonal Matrix . . . . . . . . . 390
6.4.3 Givens method . . . . . . . . . . . . . . . . . . . . . . . . . . . . 392
6.4.4 Householder’s method . . . . . . . . . . . . . . . . . . . . . . . . 394
6.5 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 401

7 Differentiation and Integration 403


7.1 Differentiation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 403
7.1.1 Error in Numerical Differentiation . . . . . . . . . . . . . . . . . 403
7.2 Differentiation Based on Newton’s Forward Interpolation
Polynomial . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 404
7.3 Differentiation Based on Newton’s Backward Interpolation
Polynomial . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 407
7.4 Differentiation Based on Stirling’s Interpolation Formula . . . . . . . . . 410
7.5 Differentiation Based on Lagrange’s Interpolation Polynomial . . . . . . 413
7.6 Two-point and Three-point Formulae . . . . . . . . . . . . . . . . . . . . 419
7.6.1 Error analysis and optimum step size . . . . . . . . . . . . . . . . 420
7.7 Richardson’s Extrapolation Method . . . . . . . . . . . . . . . . . . . . 424
7.8 Cubic Spline Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 430
7.9 Determination of Extremum of a Tabulated Function . . . . . . . . . . . 431
7.10 Integration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 432
7.11 General Quadrature Formula Based on Newton’s Forward
Interpolation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 433
7.11.1 Trapezoidal Rule . . . . . . . . . . . . . . . . . . . . . . . . . . . 433
7.11.2 Simpson’s 1/3 rule . . . . . . . . . . . . . . . . . . . . . . . . . . 438
7.11.3 Simpson’s 3/8 rule . . . . . . . . . . . . . . . . . . . . . . . . . . 445
7.11.4 Boole’s rule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 446
7.11.5 Weddle’s rule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 446
7.12 Integration Based on Lagrange’s Interpolation . . . . . . . . . . . . . . . 448

7.13 Newton-Cotes Integration Formulae (Closed type) . . . . . . . . . . . . 449


7.13.1 Some results on Cotes coefficients . . . . . . . . . . . . . . . . . . 450
7.13.2 Deduction of quadrature formulae . . . . . . . . . . . . . . . . . 452
7.14 Newton-Cotes Formulae (Open Type) . . . . . . . . . . . . . . . . . . . 453
7.15 Gaussian Quadrature . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 455
7.15.1 Gauss-Legendre integration methods . . . . . . . . . . . . . . . . 456
7.15.2 Lobatto integration methods . . . . . . . . . . . . . . . . . . . . 462
7.15.3 Radau integration methods . . . . . . . . . . . . . . . . . . . . . 464
7.15.4 Gauss-Chebyshev integration methods . . . . . . . . . . . . . . . 466
7.15.5 Gauss-Hermite integration methods . . . . . . . . . . . . . . . . 468
7.15.6 Gauss-Laguerre integration methods . . . . . . . . . . . . . . . . 469
7.15.7 Gauss-Jacobi integration methods . . . . . . . . . . . . . . . . . 470
7.16 Euler-Maclaurin’s Sum Formula . . . . . . . . . . . . . . . . . . . . . . . 473
7.17 Romberg’s Integration . . . . . . . . . . . . . . . . . . . . . . . . . . . . 480
7.18 Double Integration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 486
7.18.1 Trapezoidal method . . . . . . . . . . . . . . . . . . . . . . . . . 486
7.18.2 Simpson’s 1/3 method . . . . . . . . . . . . . . . . . . . . . . . . 490
7.19 Monte Carlo Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 492
7.19.1 Generation of random numbers . . . . . . . . . . . . . . . . . . . 495
7.20 Worked out Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . 497
7.21 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 502

8 Ordinary Differential Equations 511


8.1 Taylor’s Series Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . 513
8.2 Picard’s Method of Successive Approximations . . . . . . . . . . . . . . 515
8.3 Euler’s Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 517
8.3.1 Geometrical interpretation of Euler’s method . . . . . . . . . . . 518
8.4 Modified Euler’s Method . . . . . . . . . . . . . . . . . . . . . . . . . . . 520
8.4.1 Geometrical interpretation of modified Euler’s method . . . . . . 522
8.5 Runge-Kutta Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . 526
8.5.1 Second-order Runge-Kutta method . . . . . . . . . . . . . . . . . 526
8.5.2 Fourth-order Runge-Kutta Method . . . . . . . . . . . . . . . . . 528
8.5.3 Runge-Kutta method for a pair of equations . . . . . . . . . . . . 533
8.5.4 Runge-Kutta method for a system of equations . . . . . . . . . . 537
8.5.5 Runge-Kutta method for second order differential equation . . . 538
8.5.6 Runge-Kutta-Fehlberg method . . . . . . . . . . . . . . . . . . . 539
8.5.7 Runge-Kutta-Butcher method . . . . . . . . . . . . . . . . . . . . 540
8.6 Predictor-Corrector Methods . . . . . . . . . . . . . . . . . . . . . . . . 541
8.6.1 Adams-Bashforth-Moulton methods . . . . . . . . . . . . . . . . 541
8.6.2 Milne’s method . . . . . . . . . . . . . . . . . . . . . . . . . . . . 547

8.7 Finite Difference Method . . . . . . . . . . . . . . . . . . . . . . . . . . 552


8.7.1 Second order initial value problem (IVP) . . . . . . . . . . . . . 553
8.7.2 Second order boundary value problem (BVP) . . . . . . . . . . . 555
8.8 Shooting Method for Boundary Value Problem . . . . . . . . . . . . . . 560
8.9 Finite Element Method . . . . . . . . . . . . . . . . . . . . . . . . . . . 563
8.10 Discussion About the Methods . . . . . . . . . . . . . . . . . . . . . . . 571
8.11 Stability Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 572
8.11.1 Model differential problem . . . . . . . . . . . . . . . . . . . . . . 572
8.11.2 Model difference problem . . . . . . . . . . . . . . . . . . . . . . 572
8.11.3 Stability of Euler’s method . . . . . . . . . . . . . . . . . . . . . 573
8.11.4 Stability of Runge-Kutta methods . . . . . . . . . . . . . . . . . 575
8.11.5 Stability of Finite difference method . . . . . . . . . . . . . . . . 577
8.12 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 577

9 Partial Differential Equations 583


9.1 Finite-Difference Approximations to Partial Derivatives . . . . . . . . . 585
9.2 Parabolic Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 586
9.2.1 An explicit method . . . . . . . . . . . . . . . . . . . . . . . . . . 586
9.2.2 Crank-Nicolson implicit method . . . . . . . . . . . . . . . . . . 588
9.3 Hyperbolic Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . 597
9.3.1 Implicit difference methods . . . . . . . . . . . . . . . . . . . . . 599
9.4 Elliptic Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 600
9.4.1 Iterative methods . . . . . . . . . . . . . . . . . . . . . . . . . . . 605
9.5 Stability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 615
9.6 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 617

10 Least Squares Approximation 619


10.1 General Least Squares Method . . . . . . . . . . . . . . . . . . . . . . . 619
10.2 Fitting of a Straight Line . . . . . . . . . . . . . . . . . . . . . . . . . . 620
10.3 Fitting of a Parabolic Curve . . . . . . . . . . . . . . . . . . . . . . . . . 623
10.4 Fitting of a Polynomial of Degree k . . . . . . . . . . . . . . . . . . . . . 625
10.5 Fitting of Other Curves . . . . . . . . . . . . . . . . . . . . . . . . . . . 626
10.5.1 Geometric curve . . . . . . . . . . . . . . . . . . . . . . . . . . . 626
10.5.2 Rectangular hyperbola . . . . . . . . . . . . . . . . . . . . . . . . 627
10.5.3 Exponential curve . . . . . . . . . . . . . . . . . . . . . . . . . . 628
10.6 Weighted Least Squares Method . . . . . . . . . . . . . . . . . . . . . . 628
10.6.1 Fitting of a weighted straight line . . . . . . . . . . . . . . . . . . 628
10.7 Least Squares Method for Continuous Data . . . . . . . . . . . . . . . . 630
10.8 Approximation Using Orthogonal Polynomials . . . . . . . . . . . . . . . 633
10.9 Approximation of Functions . . . . . . . . . . . . . . . . . . . . . . . . . 636

10.9.1 Chebyshev polynomials . . . . . . . . . . . . . . . . . . . . . . . 637


10.9.2 Expansion of function using Chebyshev polynomials . . . . . . . 640
10.9.3 Economization of power series . . . . . . . . . . . . . . . . . . . . 645
10.10Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 646
List of Algorithms and Programs

Sl. No. Description Algo. Prog.


3.1 Lagrange’s interpolation for single variable 85 85
3.2 Newton’s forward interpolation 93 94
3.3 Aitken’s interpolation 115 115
3.4 Interpolation by divided difference 129 130
3.5 Interpolation by cubic spline 149 151
3.6 Lagrange bivariate interpolation 162 163
4.1 Solution of an equation by bisection method 197 197
4.2 Solution of an equation by Regula-Falsi method 200 201
4.3 Solution of an equation by fixed point iteration method 207 208
4.4 Solution of an equation by Newton-Raphson method 224 225
4.5 Solution of an equation by secant method 228 229
4.6 Roots of polynomial equation by Birge-Vieta method 243 244
4.7 Roots of polynomial equation by Bairstow method 250 251
4.8 Seidel iteration method for a pair of non-linear equations 264 264
4.9 Newton-Raphson method for a pair of equations 268 269
5.1 Determinant using partial pivoting 282 283
5.2 Determinant using complete pivoting 284 285
5.3 Determination of matrix inverse 290 291
5.4 Solution of a system of equations by matrix inverse method 294 294
5.5 Solution of a system of equations by Gauss elimination method 300 301
5.6 Solution of a system of equations by LU decomposition method 310 311
5.7 Solution of a tri-diagonal system of equations 323 323
5.8 Solution of a system of equations by Gauss-Jacobi’s iteration 341 342
5.9 Solution of a system of equations by Gauss-Seidel’s iteration 350 351
5.10 Solution of a system of equations by Gauss-Seidel SOR method − 357
6.1 Characteristic polynomial of a matrix by Leverrier-Faddeev method 370 370
6.2 Largest eigenvalue by power method 378 379
6.3 Eigenvalue of a real symmetric matrix by Jacobi’s method 386 387
6.4 Eigenvalue of a real symmetric matrix by Householder method 398 399
7.1 First derivative based on Lagrange’s interpolation 415 416
7.2 First derivative using Richardson extrapolation 428 429
7.3 Integration by trapezoidal rule 437 438
7.4 Integration by Simpson’s 1/3 rule 444 444
7.5 Gauss-Legendre quadrature 461 461
7.6 Romberg’s integration 483 484
7.7 Double integration using trapezoidal rule 488 489
7.8 Integration by Monte Carlo method 494 496
8.1 Solution of a first order differential equation by Euler’s method 519 519
8.2 Solution of a first order differential equation by modified Euler’s method 524 524
8.3 Solution of a first order differential equation by fourth order Runge-Kutta method 531 532
8.4 Solution of a pair of first order differential equations by Runge-Kutta method 535 536
8.5 Solution of a first order differential equation by Adams-Bashforth-Moulton method 544 545
8.6 Solution of a first order differential equation by Milne’s predictor-corrector method 550 550
8.7 Solution of a second order BVP using finite difference method 557 557
9.1 Solution of heat equation using Crank-Nicolson implicit method 593 594
9.2 Solution of Poisson’s equation using Gauss-Seidel S.O.R. method 612 612
10.1 Fitting of straight line by least square method 622 623
10.2 Approximation of a function by Chebyshev polynomial 642 643
Chapter 1

Errors in Numerical
Computations

The solutions of mathematical problems are of two types: analytical and numerical.
Analytical solutions can be expressed in closed form and are error free. A numerical
method, on the other hand, solves a problem with the help of a computational machine
(computer, calculator, etc.). For some classes of problems it is very difficult to obtain
an analytical solution. For example, the population of India is known for the years
1951, 1961, 1971, 1981, 1991 and 2001; no analytical method is available to determine
the population in, say, the year 2000, but a numerical method can estimate it. Again,
the solutions of non-linear differential equations often cannot be determined by
analytical methods, but such problems can easily be solved by numerical methods.
Numerical computations are almost invariably contaminated by errors, and it is
important to understand the source, propagation, magnitude, and rate of growth of
these errors.
In this age of computers, many complicated and large problems are solved in
significantly less time. But without numerical methods we cannot solve mathematical
problems on a computer, as analytical methods are not suitable for machine
computation. Thus, numerical methods are highly valued and extensively used by
mathematicians, computer scientists, statisticians, engineers and others.

1.1 Sources of Errors

The solution of a problem obtained by a numerical method contains some errors. To
minimize the errors, it is essential to identify the causes or sources of the errors and
their growth and propagation in numerical computation. Three types of errors, viz.,
inherent errors, round-off errors and truncation errors, occur in finding the solution of
a problem using a numerical method. These three types of errors are discussed below.
(i) Inherent errors: This type of error is present in the statement of the problem itself,
before its solution is determined. Inherent errors occur due to the simplified assumptions
made in the process of mathematical modelling of a problem. They can also arise when
the data are obtained from physical measurements of the parameters of the proposed
problem.
(ii) Round-off errors: Generally, numerical methods are carried out using a calculator
or computer. In numerical computation, all numbers are represented by decimal
fractions. Some numbers, such as 1/3, 2/3, 1/7, etc., cannot be represented by a decimal
fraction with a finite number of digits; to get a result, they must be rounded off to
some finite number of digits. Moreover, calculators and computers can store numbers
only up to some finite number of digits, so in arithmetic computation some errors occur
due to the finite representation of numbers. These errors are called round-off errors,
and they depend on the word length of the computational machine.
(iii) Truncation errors: These errors occur due to the finite representation of an
inherently infinite process. An example is the use of a finite number of terms of an
infinite series to compute the value of cos x, sin x, e^x, etc.
The Taylor series expansion of sin x is
$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots.$$
This is an infinite series expansion. If only the first five terms are taken to compute
the value of sin x for a given x, then we obtain an approximate result; the error occurs
due to the truncation of the series. If we retain the first n terms, the truncation error
(E_trunc) satisfies
$$|E_{\text{trunc}}| \le \frac{|x|^{2n+1}}{(2n+1)!}.$$
It may be noted that the truncation error is independent of the computational machine.
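The effect of truncation can be seen directly. The following C fragment (a small sketch of our own, not one of the book's listed programs) sums the first n terms of the series for sin x and compares the actual error with the bound above.

```c
#include <stdio.h>
#include <math.h>

/* Sketch (ours): sum the first n terms of the Taylor series of sin x
   and compare the truncation error with the bound |x|^(2n+1)/(2n+1)!. */
double sin_taylor(double x, int n) {
    double term = x, sum = x;                 /* first term is x */
    for (int k = 1; k < n; k++) {
        /* next term = previous term * (-x^2) / ((2k)(2k+1)) */
        term *= -x * x / ((2.0 * k) * (2.0 * k + 1.0));
        sum += term;
    }
    return sum;
}

int main(void) {
    double x = 1.0;
    int n = 5;                                /* retain the first five terms */
    double approx = sin_taylor(x, n);
    double bound = pow(fabs(x), 2 * n + 1) / tgamma(2.0 * n + 2.0); /* (2n+1)! */
    printf("approx = %.10f  error = %.3e  bound = %.3e\n",
           approx, fabs(sin(x) - approx), bound);
    return 0;
}
```

For x = 1 and n = 5 the measured error is about 2.5 × 10^(−8), just inside the bound.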

1.2 Exact and Approximate Numbers

To solve a problem, two types of numbers are used: exact and approximate. An exact
number gives the true value of a result, while an approximate number gives a value
close to the true value.

For example, in the statements ‘a triangle has three sides’, ‘there are 2000 people in a
locality’, ‘a book has 450 pages’, the numbers 3, 2000 and 450 are exact numbers. But
in the assertions ‘the height of a pupil is 178 cm’, ‘the radius of the Earth is 6400 km’,
‘the mass of a match box is ten grams’, the numbers 178, 6400 and 10 are approximate
numbers.
This is due to the imperfection of the measuring instruments we use. There are no
absolutely exact measuring instruments; each has its own accuracy. Thus, ‘the height
of a pupil is 178 cm’ is not an absolute measurement. In the second example, the
radius of the Earth is itself an approximate concept; actually, the Earth is not a sphere
at all, and we can speak of its radius only in approximate terms. In the last example,
the approximate nature of the number is also explained by the fact that different boxes
may have different masses; the number 10 defines the mass of one particular box.
One important observation is that the same number may be exact as well as
approximate. For example, the number 3 is exact when it represents the number of
sides of a triangle, and approximate if we use it to represent the number π when
calculating the area of a circle by the formula πr².
Independently of any measurement, the numbers 1, 2, 3, 1/2, 5/3, √2, π, e, etc.,
written in this manner, are exact. An approximate value of π is 3.1416; a better
approximation is 3.14159265. But one cannot write down the exact value of π.
The accuracy of calculations is characterized by the number of digits in the result which
enjoy confidence. The significant digits (or significant figures) of a number are all its
digits except the zeros which appear to the left of the first non-zero digit. Zeros at the
end of a number are always significant digits. For example, the numbers 0.001205 and
356.800 have 4 and 6 significant digits respectively.
In practical calculations, some numbers contain a large number of digits, and it is
necessary to cut them down to a usable number of figures. This process is called
rounding-off of numbers: the number is replaced by another number consisting of
fewer digits. One keeps one or several digits of the number, taken from left to right,
and discards all others.

The following rules of rounding-off are commonly used:

(i) If the discarded digits constitute a number which is larger than half the unit in the
last decimal place that remains, then the last digit that is left is increased by one.
If the discarded digits constitute a number which is smaller than half the unit in
the last decimal place that remains, then the digits that remain do not change.

(ii) If the discarded digits constitute a number which is equal to half the unit in the
last decimal place that remains, then the last digit that is left is increased by one
if it is odd, and is left unchanged if it is even.

This rule is often called the rule of the even digit. If a number is rounded using the
above rules, then the number is said to be correct up to some (say n) significant figures.
The following numbers are rounded off correctly to five significant figures:
Exact number Round-off number
25.367835 25.368
28.353215 28.353
3.785353 3.7854
5.835453 5.8355
6.73545 6.7354
4.83275 4.8328
0.005834578 0.0058346
3856754 38568 × 10²
2.37 2.3700
8.99997 9.0000
9.99998 10.000
From the above examples it is easy to see that, while rounding a number, an error
is generated; this error is sometimes called round-off error.
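The rule of the even digit corresponds to the default IEEE mode "round to nearest, ties to even". A brief C sketch (ours; note that decimal ties are generally not representable exactly in binary, so machine results near a tie may differ from careful hand rounding):

```c
#include <stdio.h>
#include <math.h>

/* Sketch (ours): round x to n significant figures. nearbyint() uses
   the default IEEE rounding mode, round-to-nearest with ties to even,
   i.e. the rule of the even digit. Because most decimal fractions are
   not exact in binary, a paper tie may already sit slightly above or
   below the tie in memory. */
double round_sig(double x, int n) {
    if (x == 0.0) return 0.0;
    int m = (int)floor(log10(fabs(x)));       /* decimal exponent of x        */
    double scale = pow(10.0, n - 1 - m);      /* put n digits before the point */
    return nearbyint(x * scale) / scale;
}

int main(void) {
    printf("%.7g\n", round_sig(25.367835, 5));  /* expect 25.368             */
    printf("%.7g\n", round_sig(6.73545, 5));    /* expect 6.7354 (even digit) */
    printf("%.7g\n", round_sig(4.83275, 5));    /* expect 4.8328 (odd digit)  */
    return 0;
}
```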

1.3 Absolute, Relative and Percentage Errors

Let xT be the exact value of a number and xA be its approximate value. If xA < xT ,
then we say that the number xA is an approximate value of the number xT by defect
and if xA > xT , then it is an approximate value of xT by excess.
The difference between the exact value xT and its approximate value xA is an error.
As a rule, it is not possible to determine the value of the error xT − xA and even its
sign, since the exact number xT is unknown.
The errors are represented in three ways, viz., absolute error, relative error and
percentage error.

Absolute error:
The absolute error of the approximate number xA is a quantity (∆x) which satisfies the
inequality
∆x ≥ |xT − xA |.
The absolute error is the upper bound of the deviation of the exact number xT from
its approximation, i.e.,
xA − ∆x ≤ xT ≤ xA + ∆x.
The above result can be written in the form

xT = xA ± ∆x. (1.1)

In other words, the absolute error of the number x is the difference between true
value and approximate value, i.e.,
∆x = |xT − xA |.
It may be noted from the rounding process that, if a number is rounded to m decimal
places, then
absolute error ≤ (1/2) × 10^(−m). (1.2)

The absolute error measures only the quantitative aspect of the error but not the
qualitative one, i.e., it does not show whether the measurement and calculation were
accurate. For example, the length and the width of a table are measured with a scale
(whose division is 1 cm) and the following results are obtained: the width w = 5 ± 0.5
cm and the length l = 100 ± 0.5 cm. In both cases the absolute error is the same,
0.5 cm, yet it is obvious that the second measurement was more accurate than the
first. To estimate the quality of calculations or measurements, the concept of a relative
error is introduced.

Relative error:
The relative error (δx) of the number xA is
δx = ∆x/|xA| or ∆x/|xT|, where |xT| ≠ 0 and |xA| ≠ 0.
This expression can be written as
xT = xA(1 ± δx) or xA = xT(1 ± δx).
Note that relative error is the absolute error when measuring 1 unit.
For the measurements of the length and the width of the table (discussed earlier) the
relative errors are
δw = 0.5/5 = 0.1 and δl = 0.5/100 = 0.005.
In these cases, one concludes that the measurement of the length of the table is
relatively more accurate than that of its width. So one conclusion can be drawn:
the relative error measures both the quantity and the quality of the calculation or
measurement. Thus, the relative error is a better measure of error than the absolute error.

Percentage error:
The percentage error of an approximate number xA is δx × 100%.
It is a particular type of relative error, sometimes called the relative percentage
error. The percentage error gives the total error while measuring 100 units instead of
1 unit; it too measures both the quantity and quality of a measurement. When the
relative error is very small, the percentage error is usually quoted.
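The three measures are mechanical to compute. A minimal C sketch (ours), using the data of Example 1.3.1 below:

```c
#include <stdio.h>
#include <math.h>

/* Sketch (ours): absolute, relative and percentage errors of an
   approximation, with the data of Example 1.3.1. Exact binary
   arithmetic differs slightly from the hand computation. */
int main(void) {
    double xt = 1.0 / 3.0;                  /* exact value       */
    double xa = 0.333;                      /* approximate value */
    double abs_err = fabs(xt - xa);
    double rel_err = abs_err / fabs(xt);
    printf("absolute = %.6f  relative = %.6f  percentage = %.4f%%\n",
           abs_err, rel_err, 100.0 * rel_err);
    return 0;
}
```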

Note 1.3.1 The absolute error of a number correct to n significant figures cannot be
greater than half a unit in the nth place.

Note 1.3.2 The relative error and percentage error are independent of the unit of
measurement, while absolute error depends on the measuring unit.

Difference between relative error and absolute error:

The absolute error measures only the quantity of error; it is the total amount of error
incurred by the approximate value. The relative error measures both the quantity and
the quality of the measurement; it is the total error while measuring one unit. The
absolute error depends on the measuring unit, but the relative error does not.

Example 1.3.1 Find the absolute, relative and percentage error in xA when xT = 1/3
and xA = 0.333.

Solution. The absolute error is
∆x = |xT − xA| = |1/3 − 0.333| = (1 − 0.999)/3 = 0.001/3 ≈ 0.00033.
The relative error is
δx = ∆x/xT = 0.00033/(1/3) = 0.00099 ≈ 0.001.
The percentage error is δx × 100% = 0.00099 × 100% = 0.099% ≈ 0.1%.

Example 1.3.2 An exact number xT is in the interval [28.03, 28.08]. Assuming an
approximate value, find the absolute and the percentage errors.

Solution. The middle of the given interval is taken as the approximate value, i.e.,
xA = 28.055. The absolute error is half the length of the interval, i.e., ∆x = 0.025.
The relative error is δx = ∆x/xA = 0.025/28.055 = 0.000891···.
It is conventional to round off the error to one or two non-zero digits. Therefore,
δx = 0.0009 and the percentage error is 0.09%.

Example 1.3.3 Determine the absolute error and the exact number corresponding
to the approximate number xA = 5.373 if the percentage error is 0.01%.

Solution. Here the relative error δx = 0.01% = 0.0001.
The absolute error ∆x = |xA × δx| = 5.373 × 0.0001 = 0.0005373 ≈ 0.00054.
The exact value is 5.373 ± 0.00054.

Example 1.3.4 Find out in which of the following cases the quality of calculation is
better: xT = 15/17 ≈ 0.8824 or yT = √51 ≈ 7.141.

Solution. To find the absolute errors, we take the numbers xA and yA with a larger
number of decimal digits: xA ≈ 0.882353, yA = √51 ≈ 7.141428.
Therefore, the absolute error in xT is |0.882353··· − 0.8824| ≈ 0.000047,
and in yT it is |7.141428··· − 7.141| ≈ 0.00043.
The relative error in xA is 0.000047/0.8824 ≈ 0.0000533 ≈ 0.0053%,
and the relative error in yA is 0.00043/7.141 ≈ 0.0000602 ≈ 0.006%.
Hence the quality of calculation is slightly better in the first case, since the relative
error in xA is smaller than that in yA.

1.4 Valid Significant Digits

A real number can be represented in many different ways. For example, the number
840000 can be represented as 840 × 10^3 or 84.0 × 10^4 or 0.840 × 10^6. (Note
that in these representations the last three significant zeros are lost.) The last form of
the notation is known as the normalized form and is the one commonly used. In this
case, we say that 0.840 is the mantissa of the number and 6 is its order.
Every positive decimal number, exact as well as approximate, can be expressed as
$$a = d_1 \times 10^m + d_2 \times 10^{m-1} + \cdots + d_n \times 10^{m-n+1} + \cdots,$$
where the di are the digits constituting the number (i = 1, 2, ...), with d1 ≠ 0, and
10^(m−i+1) is the value of the ith decimal position (counting from the left).
The digit dn of the approximate number a is a valid significant digit (or simply a
valid digit) if it satisfies the inequality
$$\Delta a \le 0.5 \times 10^{m-n+1}, \qquad (1.3)$$
i.e., the absolute error does not exceed half a unit of the decimal place in which dn
appears.
If inequality (1.3) is not satisfied, then the digit dn is said to be doubtful. It is
obvious that if the digit dn is valid, then all the preceding digits to the left of it are
also valid.

Theorem 1.1 If a number is correct up to n significant figures and the first significant
digit of the number is k, then the relative error is less than
$$\frac{1}{k \times 10^{n-1}}.$$

Proof. Let xA be the approximate value of the exact number xT. Also, let xA be
correct up to n significant figures and m decimal places. Then three possibilities may
occur:
(i) m < n, (ii) m = n, (iii) m > n.
By (1.2), the absolute error is ∆x ≤ 0.5 × 10^(−m).
Case I. When m < n.
In this case, the total number of digits in the integral part is n − m. If k is the first
significant digit of xT, then
∆x ≤ 0.5 × 10^(−m) and |xT| ≥ k × 10^(n−m−1) − 0.5 × 10^(−m).

Therefore, the relative error is
$$\delta x = \frac{\Delta x}{|x_T|} \le \frac{0.5 \times 10^{-m}}{k \times 10^{n-m-1} - 0.5 \times 10^{-m}} = \frac{1}{2k \times 10^{n-1} - 1}.$$
Since n is a positive integer and k is an integer lying between 1 and 9,
$$2k \times 10^{n-1} - 1 > k \times 10^{n-1}$$
for all k and n except k = n = 1. Hence,
$$\delta x < \frac{1}{k \times 10^{n-1}}.$$
Case II. When m = n.
In this case, the first significant digit is the first digit after the decimal point, i.e., the
integral part is zero. As before,
$$\delta x = \frac{0.5 \times 10^{-m}}{k \times 10^{n-m-1} - 0.5 \times 10^{-m}} = \frac{1}{2k \times 10^{n-1} - 1} < \frac{1}{k \times 10^{n-1}}.$$

Case III. When m > n.
In this case, the first significant digit k is at the (n − m + 1)th, i.e., −(m − n − 1)th,
position, and the integral part is zero. Then ∆x ≤ 0.5 × 10^(−m) and
|xT| ≥ k × 10^(−(m−n+1)) − 0.5 × 10^(−m).
Therefore,
$$\delta x = \frac{0.5 \times 10^{-m}}{k \times 10^{-(m-n+1)} - 0.5 \times 10^{-m}} = \frac{1}{2k \times 10^{n-1} - 1} < \frac{1}{k \times 10^{n-1}}.$$
Hence the theorem.
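The bound of Theorem 1.1 is easy to check numerically. In the following C sketch (ours), xA = 0.8824 approximates xT = 15/17 correct to n = 4 significant figures with leading digit k = 8, so the relative error should stay below 1/(k × 10^(n−1)) = 1/8000.

```c
#include <stdio.h>
#include <math.h>

/* Sketch (ours): check the bound of Theorem 1.1 on the data of
   Example 1.3.4. Here n = 4 significant figures, leading digit k = 8,
   so the relative error must be below 1/(k*10^(n-1)). */
int main(void) {
    double xt = 15.0 / 17.0, xa = 0.8824;
    int k = 8, n = 4;
    double rel = fabs(xt - xa) / fabs(xt);
    double bound = 1.0 / (k * pow(10.0, n - 1));
    printf("relative error = %.3e  bound = %.3e\n", rel, bound);
    return 0;
}
```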

1.5 Propagation of Errors in Arithmetic Operations

1.5.1 The errors in sum and difference


Consider the exact numbers X1, X2, ..., Xn and let their approximations be respectively
x1, x2, ..., xn. Let ∆x1, ∆x2, ..., ∆xn be the errors in x1, x2, ..., xn, i.e., Xi = xi ± ∆xi,
i = 1, 2, ..., n. Also, let X = X1 + X2 + ··· + Xn and x = x1 + x2 + ··· + xn.
Therefore, the total absolute error is
|X − x| = |(X1 − x1 ) + (X2 − x2 ) + · · · + (Xn − xn )|
≤ |X1 − x1 | + |X2 − x2 | + · · · + |Xn − xn |.

Thus the absolute error in the sum is

|∆x| = |∆x1 | + |∆x2 | + · · · + |∆xn |. (1.4)

Thus the absolute error in sum of approximate numbers is equal to the sum of the
absolute errors of the numbers.
From (1.4), it follows that the absolute error of the algebraic sum must not be smaller
than the absolute error of the least exact term.
The following points should be kept in mind when adding numbers of different abso-
lute accuracy.

(i) identify a number (or numbers) of the least accuracy (i.e., a number which has
the maximum absolute error),

(ii) round-off more exact numbers so as to retain in them one digit more than in the
identified number (i.e., retain one reserve digit),

(iii) perform addition taking into account all the retained digits,

(iv) round-off the result by discarding one digit.
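Rule (1.4) and the steps above translate directly into code. A minimal C sketch (ours, with illustrative data):

```c
#include <stdio.h>
#include <math.h>

/* Sketch (ours): absolute errors add up in a sum, as in (1.4).
   The values and their absolute errors are illustrative. */
int main(void) {
    double x[]  = {205.2, 185.3, 17.45, 8.35};   /* approximate terms */
    double dx[] = {0.05, 0.05, 0.005, 0.005};    /* their abs. errors */
    double sum = 0.0, err = 0.0;
    for (int i = 0; i < 4; i++) {
        sum += x[i];
        err += fabs(dx[i]);   /* in the bound, errors accumulate and never cancel */
    }
    printf("sum = %.2f +/- %.3f\n", sum, err);
    return 0;
}
```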



Subtraction
Let x1 and x2 be two approximate values of the corresponding exact numbers X1
and X2, and let X = X1 − X2 and x = x1 − x2.
Then X1 = x1 ± ∆x1 and X2 = x2 ± ∆x2 , where ∆x1 and ∆x2 are the errors in x1
and x2 respectively.
Therefore, |X − x| = |(X1 − x1 ) − (X2 − x2 )| ≤ |X1 − x1 | + |X2 − x2 |. Hence,
|∆x| = |∆x1 | + |∆x2 |. (1.5)
Thus the absolute error in the difference of two numbers is equal to the sum of the
individual absolute errors.

1.5.2 The error in product


Let us consider two exact numbers X1 and X2 and their approximate values x1 and x2.
Also, let ∆x1 and ∆x2 be the errors in x1 and x2, i.e., X1 = x1 ± ∆x1 and X2 = x2 ± ∆x2.
Now, X1X2 = x1x2 ± x1∆x2 ± x2∆x1 ± ∆x1·∆x2.
Then |X1X2 − x1x2| ≤ |x1∆x2| + |x2∆x1| + |∆x1·∆x2|. The last term on the right hand
side is small, so we discard it and divide both sides by |x1x2|.
Thus the relative error in the product is
$$\left|\frac{X_1X_2 - x_1x_2}{x_1x_2}\right| = \left|\frac{\Delta x_1}{x_1}\right| + \left|\frac{\Delta x_2}{x_2}\right|. \qquad (1.6)$$
Thus the relative error in the product of two numbers is equal to the sum of the
individual relative errors.
The result (1.6) can easily be extended to the product of several numbers, so that if
X = X1X2···Xn and x = x1x2···xn, then
$$\left|\frac{X - x}{x}\right| = \left|\frac{\Delta x_1}{x_1}\right| + \left|\frac{\Delta x_2}{x_2}\right| + \cdots + \left|\frac{\Delta x_n}{x_n}\right|. \qquad (1.7)$$
That is, the total relative error in the product of n numbers is equal to the sum of the
individual relative errors.

A particular case
Let the approximate numbers x1, x2, ..., xn be all positive and x = x1x2···xn.
Then log x = log x1 + log x2 + ··· + log xn, so that
$$\frac{\Delta x}{x} = \frac{\Delta x_1}{x_1} + \frac{\Delta x_2}{x_2} + \cdots + \frac{\Delta x_n}{x_n},$$
that is,
$$\left|\frac{\Delta x}{x}\right| = \left|\frac{\Delta x_1}{x_1}\right| + \left|\frac{\Delta x_2}{x_2}\right| + \cdots + \left|\frac{\Delta x_n}{x_n}\right|.$$
Usually, the following steps are followed when multiplying two numbers:

(i) identify the number with the least number of valid digits,

(ii) round off the remaining factors so that they contain one significant digit more
than the number of valid significant digits in the identified number,

(iii) retain as many significant digits in the product as there are valid significant digits
in the least exact factor (the identified number).
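A short C sketch (ours) of rule (1.7), estimating the relative and absolute error of a product from the factors' errors; the data anticipate Example 1.5.4 later in this section.

```c
#include <stdio.h>
#include <math.h>

/* Sketch (ours) of (1.7): the relative error of a product is the sum
   of the factors' relative errors. Data anticipate Example 1.5.4. */
int main(void) {
    double x1 = 8.6,    dx1 = 0.05;     /* all written digits valid */
    double x2 = 34.359, dx2 = 0.0005;
    double prod = x1 * x2;
    double rel  = fabs(dx1 / x1) + fabs(dx2 / x2);
    printf("product = %.2f  relative error = %.5f  absolute error = %.2f\n",
           prod, rel, prod * rel);       /* expect about 0.00583 and 1.7 */
    return 0;
}
```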

Example 1.5.1 Show that when an approximate number x1 is multiplied by an
exact factor k, the relative error of the product is equal to the relative error of x1,
and the absolute error is |k| times the absolute error of x1.

Solution. Let x = kx1, where k is an exact factor other than zero. Then the relative
error is
δx = |∆x/x| = |(k ∆x1)/(k x1)| = |∆x1/x1| = δx1.
The absolute error is |∆x| = |k ∆x1| = |k| |∆x1|, i.e., |k| times the absolute error in x1.

1.5.3 The error in quotient


Let us consider two exact numbers X1 and X2 and their approximate values x1 and x2.
Also, let X = X1/X2 and x = x1/x2. Then X1 = x1 + ∆x1 and X2 = x2 + ∆x2, where
∆x1 and ∆x2 are the errors. Let x1 ≠ 0 and x2 ≠ 0.
Now,
$$X - x = \frac{x_1 + \Delta x_1}{x_2 + \Delta x_2} - \frac{x_1}{x_2} = \frac{x_2\,\Delta x_1 - x_1\,\Delta x_2}{x_2(x_2 + \Delta x_2)}.$$
Dividing both sides by x and taking absolute values,
$$\left|\frac{X - x}{x}\right| = \left|\frac{x_2\,\Delta x_1 - x_1\,\Delta x_2}{x_1(x_2 + \Delta x_2)}\right| = \left|\frac{x_2}{x_2 + \Delta x_2}\right|\left|\frac{\Delta x_1}{x_1} - \frac{\Delta x_2}{x_2}\right|.$$
The error ∆x2 is small compared to x2, so approximately x2/(x2 + ∆x2) ≈ 1. Therefore,
the above relation becomes
$$\delta x = \left|\frac{X - x}{x}\right| = \left|\frac{\Delta x_1}{x_1} - \frac{\Delta x_2}{x_2}\right| \le \left|\frac{\Delta x_1}{x_1}\right| + \left|\frac{\Delta x_2}{x_2}\right|, \qquad (1.8)$$
i.e., δx = δx1 + δx2. Hence, the total relative error in a quotient is equal to the sum of
the individual relative errors.

The relation (1.8) can also be written as
$$\left|\frac{\Delta x}{x}\right| = \left|\frac{\Delta x_1}{x_1} - \frac{\Delta x_2}{x_2}\right| \ge \left|\frac{\Delta x_1}{x_1}\right| - \left|\frac{\Delta x_2}{x_2}\right|. \qquad (1.9)$$
From this relation one can conclude that the relative error in a quotient is greater than
or equal to the difference of the individual relative errors.

A particular case
For positive approximate numbers x1 and x2, equation (1.8) can easily be deduced.
Let x = x1/x2. Then log x = log x1 − log x2. Therefore,
$$\frac{\Delta x}{x} = \frac{\Delta x_1}{x_1} - \frac{\Delta x_2}{x_2}, \quad \text{i.e.,} \quad \left|\frac{\Delta x}{x}\right| \le \left|\frac{\Delta x_1}{x_1}\right| + \left|\frac{\Delta x_2}{x_2}\right|.$$

While dividing two numbers the following points should be followed.

(i) identify the least exact number, i.e., the number with the least number of valid
digits,

(ii) round off the other number, leaving in it one significant digit more than there are
valid digits in the identified number,

(iii) retain as many significant digits in the quotient as there were in the least exact
number.

Example 1.5.2 Find the sum of the approximate numbers 0.543, 0.1834, 17.45,
0.000234, 205.2, 8.35, 185.3, 0.0863, 0.684, 0.0881, in each of which all the written
digits are valid. Find the absolute error in the sum.

Solution. The least exact numbers (those possessing the maximum absolute error)
are 205.2 and 185.3. The error of each of them is 0.05. Rounding off the other
numbers, leaving one digit more, and adding:
0.54 + 0.18 + 17.45 + 0.00 + 205.2 + 8.35 + 185.3 + 0.09 + 0.68 + 0.09 = 417.88.
Discarding one digit by rounding off the sum, we obtain 417.9.
The absolute error in the sum consists of two terms:

(i) the initial error, i.e., the sum of the errors of the least exact numbers and the
rounding errors of the other numbers: 0.05 × 2 + 0.0005 × 8 = 0.104 ≈ 0.10;

(ii) the error in rounding off the sum: 417.9 − 417.88 = 0.02.

Thus the absolute error of the sum is 0.10 + 0.02 = 0.12.


So, the sum can be written as 417.9 ± 0.12.

Example 1.5.3 Find the difference of the approximate numbers 27.5 and 35.8, having
absolute errors 0.02 and 0.03 respectively. Evaluate the absolute and relative errors
of the result.

Solution. Let x1 = 27.5 and x2 = 35.8. Then x = x1 − x2 = −8.3. The total
absolute error is ∆x = 0.02 + 0.03 = 0.05.
Thus the difference x1 − x2 is −8.3 with absolute error 0.05.
The relative error is 0.05/|−8.3| ≈ 0.006 = 0.6%.

Example 1.5.4 Find the product of the approximate numbers x1 = 8.6 and x2 =
34.359, all of whose digits are valid. Also find the relative and absolute errors.

Solution. The first number has two valid significant digits and the second has five.
Therefore, round off the second number to three significant digits. After rounding,
the numbers are x1 = 8.6 and x2 = 34.4.
Hence the product is
x = x1x2 = 8.6 × 34.4 = 295.84 ≈ 3.0 × 10².
Two significant digits are retained in the result, because the least number of valid
significant digits among the given numbers is 2.
The relative error in the product is
δx = |∆x1/x1| + |∆x2/x2| = 0.05/8.6 + 0.0005/34.359 = 0.00583 ≈ 0.58%.
The absolute error is (3.0 × 10²) × 0.00583 = 1.749 ≈ 1.7.

Example 1.5.5 Calculate the quotient x/y of the approximate numbers x = 6.845
and y = 2.53, all of whose digits are valid. Find the relative and absolute errors.

Solution. Here the dividend x = 6.845 has four valid significant digits and the
divisor has three, so we perform the division without rounding off. Thus
x/y = 6.845/2.53 = 2.71.
Three significant digits are retained in the result, since the least exact number (the
divisor y) contains three valid significant digits.
The absolute errors in x and y are respectively ∆x = 0.0005 and ∆y = 0.005.
Therefore the relative error in the quotient is
|∆x/x| + |∆y/y| = 0.0005/6.845 + 0.005/2.53 = 0.000073 + 0.00198 ≈ 0.002 = 0.2%.
The absolute error is
|x/y| × 0.002 = 2.71 × 0.002 = 0.00542 ≈ 0.005.

1.5.4 The errors in power and in root

Let us consider an approximate number x1 with relative error δx1. The problem is to
find the relative error of x = x1^m. Since
x = x1^m = x1 · x1 ··· x1 (m times),
by (1.7) the relative error δx in this product is
δx = δx1 + δx1 + ··· + δx1 (m times) = m δx1. (1.10)
Thus, when the approximate number x1 is raised to the power m, its relative error
increases m times.
Similarly, one can calculate the relative error of the number x = x1^(1/m), the mth
root of x1. Here x1 > 0. Therefore,
log x = (1/m) log x1,
so that
∆x/x = (1/m)(∆x1/x1), i.e., |∆x/x| = (1/m)|∆x1/x1|.
Hence the relative error is
δx = δx1/m,
where δx and δx1 respectively denote the relative errors in x and x1.
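These two rules can be verified by direct perturbation. In the C sketch below (ours), x1 carries a 1% relative error; raising to the power m roughly multiplies that error by m, and taking the mth root divides it by m.

```c
#include <stdio.h>
#include <math.h>

/* Sketch (ours): perturb x1 by a known relative error d and measure
   the induced relative error of x1^m (about m*d) and of the m-th
   root of x1 (about d/m). */
int main(void) {
    double x1 = 2.0, d = 0.01;          /* 1% relative error */
    int m = 3;
    double xp = x1 * (1.0 + d);         /* perturbed value   */
    double rel_pow  = fabs(pow(xp, m) - pow(x1, m)) / pow(x1, m);
    double rel_root = fabs(pow(xp, 1.0 / m) - pow(x1, 1.0 / m))
                      / pow(x1, 1.0 / m);
    printf("power: %.5f (m*d = %.5f)\n", rel_pow, m * d);
    printf("root : %.5f (d/m = %.5f)\n", rel_root, d / m);
    return 0;
}
```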
Example 1.5.6 Calculate A = X³√Y/Z², where X = 8.36, Y = 80.46, Z = 25.8. The
absolute errors in X, Y, Z are respectively 0.01, 0.02 and 0.03. Find the error of the
result.

Solution. Here the absolute errors are ∆x = 0.01, ∆y = 0.02 and ∆z = 0.03. To
calculate the intermediate results, retain one reserve digit. The approximate
intermediate values are x³ = 584.3, √y = 8.9699, z² = 665.6, where x, y, z are the
approximate values of X, Y, Z respectively.
Thus the approximate value of the expression is
a = (584.3 × 8.9699)/665.6 = 7.87.
Three significant digits are taken in the result, since the least number of significant
digits in the given numbers is 3.
Now, the relative error δa in a is given by
δa = 3δx + (1/2)δy + 2δz = 3 × (0.01/8.36) + (1/2) × (0.02/80.46) + 2 × (0.03/25.8)
≈ 0.0036 + 0.00012 + 0.0023 ≈ 0.006 = 0.6%.
The absolute error ∆a in a is 7.87 × 0.006 = 0.047.
Hence, A = 7.87 ± 0.047 and the relative error is 0.006.

1.5.5 Error in evaluation of a function of several variables

Let y = f(x1, x2, ..., xn) be a differentiable function of the n variables x1, x2, ..., xn, and let ∆xi be the error in xi, for i = 1, 2, ..., n.
Then the error ∆y in y is given by

y + ∆y = f(x1 + ∆x1, x2 + ∆x2, ..., xn + ∆xn)
       = f(x1, x2, ..., xn) + Σ_{i=1}^{n} (∂f/∂xi)∆xi + ···   (by Taylor's series expansion)
       = y + Σ_{i=1}^{n} (∂f/∂xi)∆xi   (neglecting second and higher power terms of ∆xi)

i.e., ∆y = Σ_{i=1}^{n} (∂f/∂xi)∆xi.

This formula gives the total error in computing a function of several variables.
The relative error is given by

∆y/y = Σ_{i=1}^{n} (∂f/∂xi)(∆xi/y).
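The total error formula translates directly into a short program. The C sketch below is our own illustration (not a listing from the text): it re-checks Example 1.5.6, approximating the partial derivatives numerically with an arbitrarily chosen step h = 1e-6 instead of differentiating by hand.

#include <stdio.h>
#include <math.h>

/* f(X, Y, Z) = X^3 sqrt(Y) / Z^2, the function of Example 1.5.6 */
double f(double x, double y, double z)
{
    return x * x * x * sqrt(y) / (z * z);
}

int main()
{
    double x = 8.36, y = 80.46, z = 25.8;    /* approximate values    */
    double dx = 0.01, dy = 0.02, dz = 0.03;  /* their absolute errors */
    double h = 1e-6, a = f(x, y, z);
    /* partial derivatives estimated by central differences */
    double fx = (f(x + h, y, z) - f(x - h, y, z)) / (2 * h);
    double fy = (f(x, y + h, z) - f(x, y - h, z)) / (2 * h);
    double fz = (f(x, y, z + h) - f(x, y, z - h)) / (2 * h);
    /* total absolute error: sum of |df/dxi| * (error in xi) */
    double da = fabs(fx) * dx + fabs(fy) * dy + fabs(fz) * dz;
    printf("a = %.3f, absolute error = %.3f, relative error = %.4f\n",
           a, da, da / fabs(a));
    return 0;
}

Running it reproduces a ≈ 7.87 with an absolute error close to the 0.047 obtained by hand above.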

1.6 Significant Error

Significant error occurs due to the loss of significant digits during arithmetic compu-
tation. This error occurs mainly due to the finite representation of the numbers in
computational machine (computer or calculator). The loss of significant digits occurs
due to the following two reasons:

(i) when two nearly equal numbers are subtracted and

(ii) when division is made by a very small divisor compared to the dividend.

Significant error is more serious than round-off error, as is illustrated in the following examples:
Example 1.6.1 Find the difference X = √5.36 − √5.35 and evaluate the relative error of the result.

Solution. Let X1 = √5.36 ≈ 2.315 = x1 and X2 = √5.35 ≈ 2.313 = x2.
The absolute errors are ∆x1 = 0.0005 and ∆x2 = 0.0005. Then the approximate difference is x = 2.315 − 2.313 = 0.002.
The total absolute error in the subtraction is ∆x = 0.0005 + 0.0005 = 0.001.
The relative error is δx = 0.001/0.002 = 0.5 = 50%.
However, by changing the scheme of calculation we get a more accurate result.

X = √5.36 − √5.35 = (√5.36 − √5.35)(√5.36 + √5.35)/(√5.36 + √5.35)
  = (5.36 − 5.35)/(√5.36 + √5.35) = 0.01/(√5.36 + √5.35) ≈ 0.002 = x (say).

In this case the relative error is

δx = (∆x1 + ∆x2)/(x1 + x2) = 0.001/(2.315 + 2.313) = 0.0002 = 0.02%.
Thus, when calculating x1 and x2 with the same four digits we get a better result in
the sense of a relative error.

Example 1.6.2 Calculate the values of the function y = 1 − cos x at x = 82o and
at x = 1o . Also, calculate the absolute and the relative errors of the results.

Solution. y at x = 82°:
The value of cos 82° ≈ 0.1392 = a1 (say) (correct up to four digits) and ∆a1 = 0.00005. Then y1 = 1 − 0.1392 = 0.8608 and ∆y1 = 0.00005 (from an exact number equal to unity we subtract an approximate number with an absolute error not exceeding 0.00005).
Consequently, the relative error is

δy1 = 0.00005/0.8608 = 0.000058 ≈ 0.006%.

y at x = 1°:
We have cos 1° ≈ 0.9998 = a2 (say), with ∆a2 = 0.00005.
y2 = 1 − 0.9998 = 0.0002 and ∆y2 = 0.00005.
Hence

δy2 = 0.00005/0.0002 = 0.25 = 25%.

From this example it is observed that for small values of x, a direct calculation of y = 1 − cos x gives a relative error of the order of 25%, whereas at x = 82° the relative error is only 0.006%.
Now, change the calculation procedure and use the formula y = 1 − cos x = 2 sin²(x/2) to calculate the value of y for small values of x.
Let a = sin 0°30′ ≈ 0.0087. Then ∆a = 0.00005 and

δa = 0.00005/0.0087 = 0.0058 = 0.58%.

Thus y2 = 2 × 0.0087² = 0.000151 and the relative error is
δy2 = 0.0058 + 0.0058 = 0.012 = 1.2% (using the formula δa = δx + δy if a = x·y).
The absolute error is
∆y2 = y2 × δy2 = 0.000151 × 0.012 = 0.000002.
Thus a simple transformation, of the computing formula, gives a more accurate result
for the same data.

Example 1.6.3 Find the roots of the equation x² − 1000x + 0.25 = 0.

Solution. For simplicity, it is assumed that all the calculations are performed using four significant digits. The roots of this equation are

(1000 ± √(10⁶ − 1))/2.

Now, 10⁶ − 1 = 0.1000 × 10⁷ − 0.0000 × 10⁷ = 0.1000 × 10⁷.
Thus √(10⁶ − 1) = 0.1000 × 10⁴.
Therefore the roots are

(0.1000 × 10⁴ ± 0.1000 × 10⁴)/2,

which are respectively 0.1000 × 10⁴ and 0.0000 × 10⁴. One of the roots becomes zero due to the finite representation of the numbers. But a transformed formula gives the smaller root more accurately.
The smaller root of the equation may be calculated using the transformed formula

(1000 − √(10⁶ − 1))/2 = (1000 − √(10⁶ − 1))(1000 + √(10⁶ − 1)) / [2(1000 + √(10⁶ − 1))]
                      = 1/[2(1000 + √(10⁶ − 1))] = 0.00025.

Thus the roots of the given equation are 0.1000 × 10⁴ and 0.00025.
Such a situation may be recognized by checking whether |4ac| ≪ b².
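The same effect can be reproduced on a real machine. The following C program is our own illustration: IEEE single precision carries about seven significant decimal digits, so the smaller root is not wiped out completely as on the four-digit machine, but the naive formula still loses several digits that the transformed formula retains.

#include <stdio.h>
#include <math.h>

int main()
{
    /* x*x - 1000*x + 0.25 = 0: here b*b dominates |4ac|, so the
       subtraction -b - d cancels almost all significant digits.  */
    float a = 1.0f, b = -1000.0f, c = 0.25f;
    float d = sqrtf(b * b - 4.0f * a * c);
    float x1 = (-b + d) / (2.0f * a);   /* larger root: safe          */
    float x2 = (-b - d) / (2.0f * a);   /* smaller root: cancellation */
    float x2t = c / (a * x1);           /* transformed: x1 * x2 = c/a */
    printf("naive smaller root       = %.9g\n", x2);
    printf("transformed smaller root = %.9g\n", x2t);
    return 0;
}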

It is not always possible to transform the computing formula. Therefore, when nearly
equal numbers are subtracted, they must be taken with a sufficient number of reserve
valid digits. If it is known that the first m significant digits may be lost during compu-
tation and if we need a result with n valid significant digits then the initial data should
be taken with m + n valid significant digits.

1.7 Representation of Numbers in Computer

Today, numerical computations are generally carried out by calculator or computer. Owing to the limitations of calculators, computers are widely used in numerical computation, and in this book the computer is taken as the computational machine. The representation and computation of numbers in a computer are discussed below.
In a computer, numbers are stored mainly in two forms: (i) integer or fixed point form, and (ii) real or floating point form. Before being stored, all numbers are converted into binary numbers (consisting of the two bits 0 and 1) and these converted numbers are stored in the computer memory. Generally, two bytes (16 bits; one bit can store either 0 or 1) of memory are required to store an integer and four bytes are required to store a floating point number. So, there is a limitation on the numbers that can be stored in a computer.
Storing of integers is straightforward, while the representation of floating point numbers differs from the conventional technique. The main aim of this technique is to preserve the maximum number of significant digits in a real number and to increase the range of values of the real numbers. This representation is called the normalized floating point mode. In this mode of representation, the number is converted to a proper fraction in such a way that the first digit after the decimal point is non-zero, and it is adjusted by multiplying by a suitable power of 10. For example, the number 375.3 × 10⁴ is represented in this mode as .3753 × 10⁷ = .3753E7 (E7 is used to represent 10⁷). From this example, it is observed that in normalized floating point representation, a number is a combination of two parts – mantissa and exponent.
In the above example, .3753 is the mantissa and 7 is the exponent. It may be noted that the mantissa is always greater than or equal to .1 and the exponent is an integer.
For simplicity, it is assumed that the (hypothetical) computer uses four digits to store the mantissa and two digits for the exponent. The mantissa and the exponent have their own signs.
The number .0003783 would be stored as .3783E–3. The leading zeros in this number serve only to indicate the decimal point. Thus, in this notation the range of numbers (magnitudes) is .9999 × 10⁹⁹ to .1000 × 10⁻⁹⁹.
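A minimal C sketch of this normalization is given below (our own illustration for positive numbers; the small guard added before truncation merely protects the decimal cut against the binary round-off of the stored value):

#include <stdio.h>
#include <math.h>

/* Split a positive number v into a 4-digit mantissa m (.1 <= m < 1)
   and an exponent e, so that v is stored as m * 10^e, mimicking the
   hypothetical machine of the text (truncation, not rounding).      */
void normalize(double v, double *m, int *e)
{
    *e = 0;
    while (v >= 1.0) { v /= 10.0; (*e)++; }
    while (v <  0.1) { v *= 10.0; (*e)--; }
    *m = floor(v * 10000.0 + 1e-9) / 10000.0;  /* keep four digits */
}

int main()
{
    double m; int e;
    normalize(375.3e4, &m, &e);
    printf(".%04.0fE%d\n", m * 10000.0, e);    /* prints .3753E7 */
    return 0;
}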

1.8 Arithmetic of Normalized Floating Point Numbers

In this section, the arithmetic operations on normalized floating point numbers are
discussed.

1.8.1 Addition

If two numbers have same exponent, then the mantissas are added directly and the
exponents are adjusted, if required.
If the exponents are different then lower exponent is shifted to higher exponent by
adjusting mantissa. The details about addition are discussed in the following examples.
Example 1.8.1 Add the following normalized floating point numbers.
(i) .3456E3 and .4325E3 (same exponent)
(ii) .8536E5 and .7381E5
(iii) .3758E5 and .7811E7 (different exponent)
(iv) .2538E2 and .3514E7
(v) .7356E99 and .3718E99 (overflow condition)

Solution. (i) In this case, the exponents are equal, so the mantissas are added directly. Thus the sum is .7781E3.
(ii) In this case, the exponents are equal and the sum is 1.5917E5. Here the mantissa has 5 significant digits, but our (hypothetical) computer can store only four significant figures. So, the mantissa is shifted right one place before it is stored: the exponent is increased by 1 and the last digit is truncated. The final result is .1591E6.

(iii) Here, the numbers are .3758E5 and .7811E7. The exponent of the first number
is less than that of the second number. The difference of the exponents is 7 − 5 = 2.
So the mantissa of the smaller number (here first number) is shifted right by 2 places
(the difference of the exponents) and the last 2 digits of the mantissa are discarded
as our hypothetical computer can store only 4 digits. Then the first number becomes
.0037E7. Then the result is .0037E7 + .7811E7 = .7848E7.
(iv) Here also the exponents are different and the difference is 7 − 2 = 5. The man-
tissa of first number (smaller exponent) is shifted 5 places and the number becomes
.0000E7. The final result is .0000E7 + .3514E7 = .3514E7.
(v) Here the numbers are .7356E99 and .3718E99 and they have equal exponent. So
the sum of them is 1.1074E99. In this case mantissa has five significant digits. Thus
the mantissa is shifted right and the exponent is increased by 1. Then the exponent
becomes 100. As the exponent cannot store more than two digits, in our hypothetical
computer, the number is larger than the largest number that can be stored in our
computer. This situation is called an overflow condition and the machine will give
an error message.
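These addition rules are easy to simulate. The C sketch below (our own illustration, restricted to non-negative numbers; the type num4 and the function add4 are hypothetical names) aligns the exponents, truncates the shifted mantissa and renormalizes exactly as in case (ii):

#include <stdio.h>

/* value represented = man/10000 * 10^exp, e.g. .8536E5 = {8536, 5} */
typedef struct { int man; int exp; } num4;

num4 add4(num4 a, num4 b)
{
    if (a.exp < b.exp) { num4 t = a; a = b; b = t; }
    int shift = a.exp - b.exp;
    while (shift-- > 0) b.man /= 10;   /* digits fall off the right */
    num4 s = { a.man + b.man, a.exp };
    if (s.man >= 10000) { s.man /= 10; s.exp++; }  /* renormalize   */
    return s;
}

int main()
{
    num4 a = { 8536, 5 }, b = { 7381, 5 };   /* .8536E5 + .7381E5 */
    num4 s = add4(a, b);
    printf(".%04dE%d\n", s.man, s.exp);      /* prints .1591E6 */
    return 0;
}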

1.8.2 Subtraction

The subtraction is same as addition. In subtraction one positive number and one neg-
ative number are added. The following example shows the details about subtraction.

Example 1.8.2 Subtract the normalized floating point numbers indicated below:
(i) .3628E6 from .8321E6
(ii) .3885E5 from .3892E5
(iii) .3253E–7 from .4123E–6
(iv) .5321E–99 from .5382E–99.

Solution. (i) Here the exponents are equal, and the result is
.8321E6 – .3628E6 = .4693E6.
(ii) Here the result is .3892E5 – .3885E5 = .0007E5. The most significant digit in the
mantissa is 0, so the mantissa is shifted left till the most significant digit becomes
non-zero and in each left shift of the mantissa the exponent is reduced by 1. Hence
the final result is .7000E2.
(iii) The numbers are .4123E–6 and .3253E–7. The exponents are not equal, so the
number with smaller exponent is shifted right and the exponent increased by 1 for
every right shift. Then the second number becomes .0325E–6. Thus the result is
.4123E–6 – .0325E–6 = .3798E–6.

(iv) The result is .5382E–99 – .5321E–99 = .0061E–99. For normalization, the man-
tissa is shifted left twice and in this process the exponent is reduced by 1. In first
shift, the exponent becomes –100, but our hypothetical computer can store only two
digits as exponent. So –100 cannot be accommodated in the exponent part of the
number. In this case, the result is smaller than the smallest number which could be
stored in our computer. This condition is called an underflow condition and the
computer will give an error message.

1.8.3 Multiplication

Two numbers in normalized floating point mode are multiplied by multiplying the man-
tissa and adding the exponents. After multiplication, the mantissa is converted into
normalized floating point form and the exponent is converted appropriately. The fol-
lowing example shows the steps of multiplication.
Example 1.8.3 Multiply the following numbers indicated below:
(i) .5321E5 by .4387E10
(ii) .1234E10 by .8374E–10
(iii) .1139E50 by .8502E51
(iv) .3721E–52 by .3205E-53.

Solution. (i) Here, .5321E5 × .4387E10 = .23343227E15.
The product mantissa has 8 significant figures, so the last four digits are discarded. The final result is .2334E15.
(ii) Here, .1234E10 × .8374E–10 = .10333516E0 = .1033E0 (again the last four digits of the mantissa are discarded).

(iii) .1139E50 × .8502E51 = .09683778E101.


Here, the mantissa has one 0 as most significant digit, so the mantissa is shifted
left one digit and the exponent is adjusted. The product is .9683E100. But our
hypothetical computer cannot store 3 digits as exponent. Hence, in this case, the
overflow condition occurs.
(iv) .3721E–52 × .3205E–53 = .11925805E–105 = .1192E–105.
In this case, the product is very small (as the exponent is –105). Hence the underflow
condition occurs.

1.8.4 Division

In the division, the mantissa of the numerator is divided by that of the denominator.
The exponent is obtained by subtracting exponent of denominator from the exponent

of numerator. The quotient mantissa is converted to normalized form and the exponent
is adjusted appropriately.
Example 1.8.4 Perform the following divisions
(i) .9938E5 ÷ .3281E2
(ii) .9999E2 ÷ .1230E–99
(iii) .3568E–10 ÷ .3456E97.

Solution. (i) .9938E5 ÷ .3281E2 = .3028E4.


(ii) .9999E2 ÷ .1230E–99 = .8129E102.
The result overflows.
(iii) .3568E–10 ÷ .3456E97 = .1032E–106.
In this case the result underflows.

1.9 Effect of Normalized Floating Point Representations

The truncation of the mantissa leads to very interesting results. For example, (1/6) × 12 = 2 is well known. But, when the arithmetic is performed with floating point numbers, .1667 added 12 times yields .1996E1, whereas .1667 × 12 gives .2000E1. That is, 12x = x + x + ··· + x (12 times) is not true.
It is very surprising that, due to the truncation of the mantissa, the associative and distributive laws do not always hold for normalized floating point numbers.
That is,
(i) (a + b) + c ≠ a + (b + c)
(ii) (a + b) − c ≠ (a − c) + b
(iii) a(b − c) ≠ ab − ac.
These results are illustrated in the following examples:
(i) a =.6878E1, b =.7898E1 and c =.1007E1.
Now, a + b =.1477E2
(a + b) + c = .1477E2 + .1007E1 = .1477E2 + .0100E2 = .1577E2.
Again, b + c =.8905E1.
a + (b + c)=.6878E1+.8905E1=.1578E2.
Thus, (a + b) + c ≠ a + (b + c).
(ii) Let a =.6573E1, b =.5857E–1, c =.6558E1.
Then a + b =.6631E1 and (a + b) − c =.6631E1 – .6558E1 = .7300E-1.
Again, a − c =.1500E–1 and (a − c) + b =.1500E–1 + .5857E–1 = .7357E–1.
Thus, (a + b) − c ≠ (a − c) + b.
(iii) Let a =.5673E1, b =.3583E1, c =.3572E1.

b − c =.1100E–1.
a(b − c) =.5673E1 × .1100E–1 = .0624E0 = .6240E–1.
ab =.2032E2, ac =.2026E2.
ab − ac =.6000E–1.
Thus, a(b − c) ≠ ab − ac.

The above examples are intentionally chosen to point out the occurrence of inaccuracies in normalized floating point arithmetic due to the shifting and truncation of numbers during arithmetic operations. These situations do not always happen. Here, we assume that the computer can store only four digits in the mantissa, but an actual computer can store seven digits in the mantissa (in single precision). The larger length of mantissa gives more accurate results.
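Even with the longer mantissa the failure of associativity is easy to demonstrate; in the C fragment below (our own illustration, with values adapted to the 24-bit binary mantissa of IEEE single precision) the order of the additions changes the answer:

#include <stdio.h>

int main()
{
    /* 1.0f is below the spacing of float values near 1.0e8, so it
       vanishes when added to b first.                              */
    float a = 1.0e8f, b = -1.0e8f, c = 1.0f;
    printf("(a + b) + c = %g\n", (a + b) + c);   /* prints 1 */
    printf("a + (b + c) = %g\n", a + (b + c));   /* prints 0 */
    return 0;
}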

1.9.1 Zeros in floating point numbers


The number zero has a definite meaning in mathematics, but, in computer exact equality
of a number to zero can never be guaranteed. The cause behind this situation is that
most of the numbers in floating point representation are approximate. One interesting
example is presented below to illustrate the behaviour of zero.

The roots of the quadratic equation x² + 2x − 5 = 0 are x = −1 ± √6.
The roots in floating point representation (4 digits mantissa) are .1449E1 and –
.3449E1.
But, at x = .1449E1 the left hand side of the equation is –.003, which is clearly not equal to zero, while at x = –.3449E1, the left hand side of the equation is
(–.3449E1) × (–.3449E1) + .2000E1 × (–.3449E1) – .5000E1
= .1189E2 – .6898E1 – .5000E1 = .1189E2 – .0689E2 – .0500E2 = .0000E2, which
is equal to 0.
Thus, one root satisfies the equation exactly while the other does not, though both are roots of the equation. By the property of a root of an equation, the number –0.003 should have been zero. Depending on the result of this example we may note the following.

Note 1.9.1 In any computational algorithm, it is not advisable to give any instruction
based on testing whether a floating point number is zero or not.
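In a program the test is therefore written against a small tolerance instead of against zero, as in the following C fragment (illustrative only; the tolerance EPS must be chosen to match the working accuracy, here that of the four-digit arithmetic):

#include <stdio.h>
#include <math.h>

#define EPS 5e-3   /* tolerance matched to four-digit working accuracy */

int main()
{
    double residual = -0.003;     /* left hand side at a computed root */
    if (fabs(residual) < EPS)     /* never test  residual == 0.0       */
        printf("residual %g is zero to working accuracy\n", residual);
    return 0;
}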

1.10 Exercise

1. What do you mean by the following terms in numerical analysis?


(i) truncation error, (ii) round-off error, (iii) significant error.

2. What are the different sources of computational errors in a numerical computa-


tional work ?

3. Explain what do you understand by an approximate number and significant figures


of a number.

4. What convention are used in rounding-off a number ?

5. When is a number said to be correct up to n significant figures?


Round-off the following numbers to three significant figures.
(i) 0.01302, (ii) –349.87, (iii) 0.005922, (iv) 87678, (v) 64.8523, (vi) 6380.7, (vii)
0.0000098, (viii) .2345, (ix) 0.4575, (x) 34.653, (xi) 21.752, (xii) 1.99999.

6. Define absolute, relative and percentage errors.

7. Explain when relative error is a better indicator of the accuracy of a computation


than the absolute error.

8. Find out which of the following two equalities is more exact:
(i) 6/25 ≈ 1.4 or 1/3 ≈ 0.333, (ii) 1/9 ≈ 0.1 or 1/3 ≈ 0.33, (iii) π ≈ 3.142 or √10 ≈ 3.1623.

9. Find the absolute, relative and percentage errors when (i) 2/3 is approximated
to 0.667, (ii) 1/3 is approximated to 0.333, and (iii) true value is 0.50 and its
calculated value was 0.49.

10. (i) If π is approximated as 3.14 instead of 3.14156, find the absolute, relative and
percentage errors.
(ii) Round-off the number x = 3.4516 to three significant figures and find the
absolute and the relative errors.

11. The numbers 23.982 and 3.4687 are both approximate and correct only to their
last digits. Find their difference and state how many figures in the result are
trustworthy.

12. Two lengths X and Y are measured approximately up to three significant figures
as X = 3.32 cm and Y = 5.39 cm. Estimate the error in the computed value of
X +Y.

13. Let xT and xA denote respectively the true and approximate values of a number.
Prove that the relative error in the product xA yA is approximately equal to the
sum of the relative errors in xA and yA .

14. Show that the relative error in the product of several approximate non-zero num-
bers does not exceed the sum of the relative errors of the numbers.

15. Show that the maximum relative error in the quotient of two approximate numbers
is approximately equal to the algebraic sum of the maximum relative errors of the
individual numbers.

16. Let x = 5.234 ± 0.0005 and y = 5.123 ± 0.0005. Find the percentage error of the
difference a = x − y when relative errors δx = δy = 0.0001.

17. What do you mean by the statement that xA (approximate value) has m significant
figures with respect to xT (true value) ? If the first significant figure of xA is k
and xA is correct up to n significant figures, prove that the relative error is less than 10^{1−n}/k.

18. Given a = 11 ± 0.5, b = 0.04562 ± 0.0001, c = 17200 ± 100. Find the maximum
value of the absolute error in the following expressions
(i) a + 2b − c, (ii) 2a − 5b + c and (iii) a2 .

19. Calculate the quotient a = x/y of the approximate numbers x = 5.762 and y =
1.24 if all the digits of the dividend and the divisor are valid. Find the relative
and the absolute errors.

20. (i) Establish the general formula for the absolute and relative errors of the function v = f(u1, u2, ..., un) when the absolute errors ∆ui of each independent quantity ui are known. Use this result for the function v = u1^p u2^q u3^r / (u4^s u5^t) to find the upper bound of the relative error.
(ii) Find the relative error in computing f(x) = 2x⁵ − 3x + 2 at x = 1, if the error in x is 0.005.
(iii) If y = (1.42x + 3.45)/(x + 0.75), where the coefficients are rounded-off, find the absolute and relative errors in y when x = 0.5 ± 0.1.

21. Given u = x⁴y^{5/2}; if x0, y0 be the approximate values of x, y respectively and ∆x0, ∆y0 be the absolute errors in them, determine the relative error in u.
22. Calculate x = (a + b)c/(d − e)², where a = 1.562 ± 0.001, b = 10.3 ± 0.02, c = 0.12 ± 0.04, d = 10.541 ± 0.004, e = 2.34 ± 0.006. Find the absolute and the relative errors in the result.

23. (i) Determine the number of correct digits in the number 0.2318 if the relative error is 0.3 × 10⁻¹.
(ii) Find the number of significant figures in the approximate number 0.4785, given that the relative error is 0.2 × 10⁻².

24. Find the smaller root of the equation x2 −500x+1 = 0 using four-digit arithmetic.

25. Find the value of √103 − √102 correct up to four significant figures.

26. Find an example where, in an approximate computation,
(i) (a + b) + c ≠ a + (b + c), (ii) (a + b) − c ≠ (a − c) + b, (iii) a(b − c) ≠ ab − ac.
Chapter 2

Calculus of Finite Differences and Difference Equations

Let us consider a function y = f (x) defined on [a, b]. The variables x and y are called
independent and dependent variables respectively. The points x0 , x1 , . . . , xn are taken
as equidistance, i.e., xi = x0 + ih, i = 0, 1, 2, . . . , n. Then the value of y, when x = xi ,
is denoted by yi , where yi = f (xi ). The values of x are called arguments and that of
y are called entries. The interval h is called the difference interval. In this chapter,
some important difference operators, viz., forward difference (∆), backward difference
(∇), central difference (δ), shift (E) and mean (µ) are introduced.

2.1 Finite Difference Operators

2.1.1 Forward differences


The forward difference or simply difference operator is denoted by ∆ and is defined
by

∆f (x) = f (x + h) − f (x). (2.1)

In terms of y, at x = xi the above equation gives

∆f (xi ) = f (xi + h) − f (xi ), i.e., ∆yi = yi+1 − yi , i = 0, 1, 2, . . . , n − 1. (2.2)

Explicitly, ∆y0 = y1 − y0 , ∆y1 = y2 − y1 , . . . , ∆yn−1 = yn − yn−1 .


The differences of the first differences are called second differences and they are
denoted by ∆2 y0 , ∆2 y1 , . . .. Similarly, one can define third differences, fourth differences,
etc.


Thus,

∆2 y0 = ∆y1 − ∆y0 = (y2 − y1 ) − (y1 − y0 ) = y2 − 2y1 + y0


∆2 y1 = ∆y2 − ∆y1 = (y3 − y2 ) − (y2 − y1 ) = y3 − 2y2 + y1
∆3 y0 = ∆2 y1 − ∆2 y0 = (y3 − 2y2 + y1 ) − (y2 − 2y1 + y0 ) = y3 − 3y2 + 3y1 − y0
∆3 y1 = y4 − 3y3 + 3y2 − y1

and so on.
In general,

∆n+1 f (x) = ∆[∆n f (x)], i.e., ∆n+1 yi = ∆[∆n yi ], n = 0, 1, 2, . . . . (2.3)

Also, ∆n+1 f (x) = ∆n [f (x + h) − f (x)] = ∆n f (x + h) − ∆n f (x)


and

∆n+1 yi = ∆n yi+1 − ∆n yi , n = 0, 1, 2, . . . , (2.4)

where ∆0 ≡ identity operator, i.e., ∆0 f (x) = f (x) and ∆1 ≡ ∆.


The different forward differences for the arguments x0 , x1 , . . . , x4 are shown in Table
2.1.
Table 2.1: Forward difference table.

x      y       ∆         ∆²        ∆³        ∆⁴
x0     y0
               ∆y0
x1     y1                ∆²y0
               ∆y1                 ∆³y0
x2     y2                ∆²y1                ∆⁴y0
               ∆y2                 ∆³y1
x3     y3                ∆²y2
               ∆y3
x4     y4

Table 2.1 is called forward difference or diagonal difference table.
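Such a table is easily generated by a program. The following C listing is our own sketch (the five entries of y[] are sample values): d[i][j] holds the j-th forward difference at x_i, and each printed row is one horizontal line of values for one argument.

#include <stdio.h>

#define N 5   /* number of tabulated points */

int main()
{
    double y[N] = { 3, 5, 10, 30, 80 };   /* sample entries */
    double d[N][N];
    int i, j;

    for (i = 0; i < N; i++) d[i][0] = y[i];
    /* d[i][j] = j-th forward difference at x_i */
    for (j = 1; j < N; j++)
        for (i = 0; i < N - j; i++)
            d[i][j] = d[i + 1][j - 1] - d[i][j - 1];

    for (i = 0; i < N; i++) {
        for (j = 0; j < N - i; j++)
            printf("%10.4f", d[i][j]);
        printf("\n");
    }
    return 0;
}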

2.1.2 Backward differences


The backward difference operator is denoted by ∇ and it is defined as

∇f (x) = f (x) − f (x − h). (2.5)



In terms of y, the above relation transforms to

∇yi = yi − yi−1 , i = n, n − 1, . . . , 1. (2.6)

That is,

∇y1 = y1 − y0 , ∇y2 = y2 − y1 , . . . , ∇yn = yn − yn−1 . (2.7)

These differences are called first differences. The second differences are denoted by
∇2 y2 , ∇2 y3 , . . . , ∇2 yn . That is,

∇2 y2 = ∇(∇y2 ) = ∇(y2 − y1 ) = ∇y2 − ∇y1 = (y2 − y1 ) − (y1 − y0 ) = y2 − 2y1 + y0 .

Similarly, ∇2 y3 = y3 − 2y2 + y1 , ∇2 y4 = y4 − 2y3 + y2 , and so on.


In general,

∇k yi = ∇k−1 yi − ∇k−1 yi−1 , i = n, n − 1, . . . , k, (2.8)

where ∇0 yi = yi , ∇1 yi = ∇yi .
These backward differences can be written in a tabular form and this table is known
as backward difference or horizontal table.
Table 2.2 is the backward difference table for the arguments x0 , x1 , . . . , x4 .

Table 2.2: Backward difference table.

x      y      ∇        ∇²       ∇³       ∇⁴
x0     y0
x1     y1     ∇y1
x2     y2     ∇y2      ∇²y2
x3     y3     ∇y3      ∇²y3     ∇³y3
x4     y4     ∇y4      ∇²y4     ∇³y4     ∇⁴y4

2.1.3 Central differences


The central difference operator is denoted by δ and is defined by

δf (x) = f (x + h/2) − f (x − h/2). (2.9)

In terms of y, the first central difference is

δyi = yi+1/2 − yi−1/2 (2.10)



where yi+1/2 = f (xi + h/2) and yi−1/2 = f (xi − h/2).


Thus δy1/2 = y1 − y0 , δy3/2 = y2 − y1 , . . . , δyn−1/2 = yn − yn−1 .
The second central differences are
δ 2 yi = δyi+1/2 − δyi−1/2 = (yi+1 − yi ) − (yi − yi−1 ) = yi+1 − 2yi + yi−1 .
In general,

δ n yi = δ n−1 yi+1/2 − δ n−1 yi−1/2 . (2.11)

The central difference table for the five arguments x0 , x1 , . . . , x4 is shown in


Table 2.3.

Table 2.3: Central difference table.

x      y       δ          δ²        δ³          δ⁴
x0     y0
               δy1/2
x1     y1                 δ²y1
               δy3/2                δ³y3/2
x2     y2                 δ²y2                  δ⁴y2
               δy5/2                δ³y5/2
x3     y3                 δ²y3
               δy7/2
x4     y4

It is observed that all odd differences have fraction suffices and all the even differences
are with integral suffices.

2.1.4 Shift, Average and Differential operators


Shift operator, E:
The shift operator is defined by

Ef (x) = f (x + h). (2.12)

This gives,

Eyi = yi+1 . (2.13)

That is, shift operator shifts the function value yi to the next higher value yi+1 .
The second shift operator gives

E 2 f (x) = E[Ef (x)] = E[f (x + h)] = f (x + 2h). (2.14)



In general,
Eⁿf(x) = f(x + nh) or Eⁿyi = y_{i+n}. (2.15)
The inverse shift operator E −1 is defined as
E −1 f (x) = f (x − h). (2.16)
Similarly, second and higher inverse operators are
E −2 f (x) = f (x − 2h) and E −n f (x) = f (x − nh). (2.17)
More general form of E operator is
E r f (x) = f (x + rh), (2.18)
where r is positive as well as negative rationals.

Average operator, µ:
The average operator µ is defined as
µf(x) = (1/2)[f(x + h/2) + f(x − h/2)],
i.e., µyi = (1/2)(y_{i+1/2} + y_{i−1/2}). (2.19)

Differential operator, D:
The differential operator is usually denoted by D, where
Df(x) = (d/dx)f(x) = f′(x),
D²f(x) = (d²/dx²)f(x) = f″(x). (2.20)
dx

2.1.5 Factorial notation


The factorial notation has many uses in calculus of finite difference. This is used to
find different differences and anti-differences. The nth factorial of x, denoted by x(n) ,
is defined by
x^(n) = x(x − h)(x − 2h) ··· (x − (n−1)h), (2.21)
where each factor is decreased from the earlier one by h, and x^(0) = 1.
Similarly, the nth negative factorial of x is defined by
x^(−n) = 1/[x(x + h)(x + 2h) ··· (x + (n−1)h)]. (2.22)
It may be noted that x^(−n) · (x + (n−1)h)^(n) = 1.

2.2 Properties of Forward Differences

Property 2.2.1 ∆c = 0, where c is a constant.

Property 2.2.2 ∆[f1 (x) + f2 (x) + · · · + fn (x)]


= ∆f1 (x) + ∆f2 (x) + · · · + ∆fn (x).

Property 2.2.3 ∆[cf (x)] = c∆f (x).

Combining properties (2.2.2) and (2.2.3), one can generalise the property (2.2.2) as

Property 2.2.4 ∆[c1 f1 (x) + c2 f2 (x) + · · · + cn fn (x)]


= c1 ∆f1 (x) + c2 ∆f2 (x) + · · · + cn ∆fn (x).

Property 2.2.5 ∆m ∆n f (x) = ∆m+n f (x) = ∆n ∆m f (x) = ∆k ∆m+n−k f (x),


k = 0, 1, 2, . . . , m or n.

Property 2.2.6

∆[f (x)g(x)] = f (x + h)g(x + h) − f (x)g(x)


= f (x + h)g(x + h) − f (x + h)g(x) + f (x + h)g(x) − f (x)g(x)
= f (x + h)[g(x + h) − g(x)] + g(x)[f (x + h) − f (x)]
= f (x + h)∆g(x) + g(x)∆f (x).

Also, it can be shown that

∆[f (x)g(x)] = f (x)∆g(x) + g(x + h)∆f (x)


= f (x)∆g(x) + g(x)∆f (x) + ∆f (x)∆g(x).

Property 2.2.7 ∆[f(x)/g(x)] = [g(x)∆f(x) − f(x)∆g(x)]/[g(x + h)g(x)], g(x) ≠ 0.

Proof.
∆[f(x)/g(x)] = f(x + h)/g(x + h) − f(x)/g(x)
 = [f(x + h)g(x) − g(x + h)f(x)]/[g(x + h)g(x)]
 = {g(x)[f(x + h) − f(x)] − f(x)[g(x + h) − g(x)]}/[g(x + h)g(x)]
 = [g(x)∆f(x) − f(x)∆g(x)]/[g(x + h)g(x)].

Property 2.2.8 In particular, when the numerator is 1, then

∆[1/f(x)] = −∆f(x)/[f(x + h)f(x)].

Property 2.2.9 ∆[cx ] = cx+h − cx = cx (ch − 1), for some constant c.

Property 2.2.10 ∆(^xC_r) = ^xC_{r−1}, where r is fixed and h = 1.

Proof. ∆(^xC_r) = ^{x+1}C_r − ^xC_r = ^xC_{r−1}, as h = 1 (by Pascal's rule).

Property 2.2.11 ∆x^(n) = nhx^(n−1).

Proof.
∆x^(n) = (x + h)(x + h − h)(x + h − 2h) ··· (x + h − (n−1)h)
        − x(x − h)(x − 2h) ··· (x − (n−1)h)
 = x(x − h)(x − 2h) ··· (x − (n−2)h)[(x + h) − (x − (n−1)h)]
 = nhx^(n−1).

This property is analogous to the differential formula D(xⁿ) = nxⁿ⁻¹ with h = 1.


Most of the above formulae are similar to the corresponding formulae in differential
calculus.

Property 2.2.12 The above formula can also be used to find anti-difference (like inte-
gration in integral calculus), as
1 (n)
∆−1 x(n−1) = x . (2.23)
nh

2.2.1 Properties of shift operators


Property 2.2.13 Ec = c, where c is a constant.

Property 2.2.14 E{cf (x)} = cEf (x).

Property 2.2.15 E{c1 f1 (x) + c2 f2 (x) + · · · + cn fn (x)]


= c1 Ef1 (x) + c2 Ef2 (x) + · · · + cn Efn (x).

Property 2.2.16 E m E n f (x) = E n E m f (x) = E m+n f (x).

Property 2.2.17 E n E −n f (x) = f (x).


In particular, EE⁻¹ ≡ I, where I is the identity operator; it is sometimes denoted by 1.

Property 2.2.18 (E n )m f (x) = E mn f (x).



Property 2.2.19 E[f(x)/g(x)] = Ef(x)/Eg(x).

Property 2.2.20 E{f (x) g(x)} = Ef (x) Eg(x).

Property 2.2.21 E∆f (x) = ∆Ef (x).

Property 2.2.22 ∆m f (x) = ∇m E m f (x) = E m ∇m f (x)


and ∇m f (x) = ∆m E −m f (x) = E −m ∆m f (x).

2.3 Relations Among Operators

It is clear from the forward, backward and central difference tables that in a definite
numerical case, the same values occur in the same positions, practically there are no
differences among the values of the tables, but, different symbols have been used for the
theoretical importance.
Thus

∆yi = yi+1 − yi = ∇yi+1 = δyi+1/2


∆2 yi = yi+2 − 2yi+1 + yi = ∇2 yi+2 = δ 2 yi+1

etc.
In general,
∆n yi = ∇n yi+n , i = 0, 1, 2, . . . . (2.24)

Again,
∆f (x) = f (x + h) − f (x) = Ef (x) − f (x) = (E − 1)f (x).

This relation indicates that the effect of the operator ∆ on f (x) is the same as that
of the operator E − 1 on f (x). Thus

∆≡E−1 or E ≡ ∆ + 1. (2.25)

Also,
∇f (x) = f (x) − f (x − h) = f (x) − E −1 f (x) = (1 − E −1 )f (x).

That is,
∇ ≡ 1 − E −1 . (2.26)

The higher order forward difference can be expressed in terms of the given function
values in the following way:

∆3 yi = (E − 1)3 yi = (E 3 − 3E 2 + 3E − 1)yi = y3 − 3y2 + 3y1 − y0 .



There is a relation among the central difference, δ, and the shift operator E, as
δf (x) = f (x + h/2) − f (x − h/2) = E 1/2 f (x) − E −1/2 f (x) = (E 1/2 − E −1/2 )f (x).
That is,
δ ≡ E 1/2 − E −1/2 . (2.27)
Again,

µf(x) = (1/2)[f(x + h/2) + f(x − h/2)]
      = (1/2)[E^{1/2}f(x) + E^{−1/2}f(x)] = (1/2)(E^{1/2} + E^{−1/2})f(x).

Thus,
µ ≡ (1/2)(E^{1/2} + E^{−1/2}). (2.28)
The average operator µ can also be expressed in terms of the central difference operator:

µ²f(x) = (1/4)(E^{1/2} + E^{−1/2})²f(x)
       = (1/4)[(E^{1/2} − E^{−1/2})² + 4]f(x) = (1/4)(δ² + 4)f(x).

Hence,
µ ≡ √(1 + δ²/4). (2.29)
Some more relations among the operators ∆, ∇, E and δ are deduced in the following.

∇Ef (x) = ∇f (x + h) = f (x + h) − f (x) = ∆f (x).


Also,
δE 1/2 f (x) = δf (x + h/2) = f (x + h) − f (x) = ∆f (x).
Thus,
∆ ≡ ∇E ≡ δE 1/2 . (2.30)
From the definition of E,

Ef(x) = f(x + h) = f(x) + hf′(x) + (h²/2!)f″(x) + (h³/3!)f‴(x) + ···   [by Taylor's series]
      = f(x) + hDf(x) + (h²/2!)D²f(x) + (h³/3!)D³f(x) + ···
      = [1 + hD + (h²/2!)D² + (h³/3!)D³ + ···]f(x)
      = e^{hD}f(x).

Hence,
E ≡ ehD . (2.31)

Also,
hD ≡ log E. (2.32)

This relation is used to separate the effect of E into that of the powers of ∆ and this
method of separation is called the method of separation of symbols.
The operators µ and δ can be expressed in terms of D, as shown below:

µf(x) = (1/2)[E^{1/2} + E^{−1/2}]f(x) = (1/2)(e^{hD/2} + e^{−hD/2})f(x) = cosh(hD/2)f(x)
and δf(x) = [E^{1/2} − E^{−1/2}]f(x) = (e^{hD/2} − e^{−hD/2})f(x) = 2 sinh(hD/2)f(x).

Thus,
µ ≡ cosh(hD/2) and δ ≡ 2 sinh(hD/2). (2.33)
Again,
µδ ≡ 2 cosh(hD/2) sinh(hD/2) = sinh(hD). (2.34)
The inverse relation

hD ≡ sinh−1 (µδ) (2.35)

is also useful.
Since E ≡ 1 + ∆ and E −1 ≡ 1 − ∇, [from (2.25) and (2.26)]
from (2.32), it is obtained that

hD ≡ log E ≡ log(1 + ∆) ≡ − log(1 − ∇) ≡ sinh−1 (µδ). (2.36)

The operators µ and E are commutative, as


µEf(x) = µf(x + h) = (1/2)[f(x + 3h/2) + f(x + h/2)],
while
Eµf(x) = E{(1/2)[f(x + h/2) + f(x − h/2)]} = (1/2)[f(x + 3h/2) + f(x + h/2)].
Hence,
µE ≡ Eµ. (2.37)

Example 2.3.1 Prove the following relations.
(i) 1 + δ²µ² ≡ (1 + δ²/2)²,  (ii) E^{1/2} ≡ µ + δ/2,  (iii) ∆ ≡ δ²/2 + δ√(1 + δ²/4),
(iv) (1 + ∆)(1 − ∇) ≡ 1,  (v) µδ ≡ ∆E⁻¹/2 + ∆/2,  (vi) µδ ≡ (∆ + ∇)/2,
(vii) ∆∇ ≡ ∇∆ ≡ δ².

Solution. (i) δµf(x) = (1/2)(E^{1/2} + E^{−1/2})(E^{1/2} − E^{−1/2})f(x) = (1/2)(E − E⁻¹)f(x).
Therefore,

(1 + δ²µ²)f(x) = [1 + (1/4)(E − E⁻¹)²]f(x)
 = [1 + (1/4)(E² − 2 + E⁻²)]f(x) = (1/4)(E + E⁻¹)²f(x)
 = [1 + (1/2)(E^{1/2} − E^{−1/2})²]²f(x) = (1 + δ²/2)²f(x).

Hence
1 + δ²µ² ≡ (1 + δ²/2)². (2.38)
 
(ii) (µ + δ/2)f(x) = {(1/2)[E^{1/2} + E^{−1/2}] + (1/2)[E^{1/2} − E^{−1/2}]}f(x) = E^{1/2}f(x).
Thus
E^{1/2} ≡ µ + δ/2. (2.39)
(iii)
[δ²/2 + δ√(1 + δ²/4)]f(x)
 = (1/2)(E^{1/2} − E^{−1/2})²f(x) + (E^{1/2} − E^{−1/2})√{1 + (1/4)(E^{1/2} − E^{−1/2})²}f(x)
 = (1/2)(E + E⁻¹ − 2)f(x) + (1/2)(E^{1/2} − E^{−1/2})(E^{1/2} + E^{−1/2})f(x)
 = (1/2)(E + E⁻¹ − 2)f(x) + (1/2)(E − E⁻¹)f(x)
 = (E − 1)f(x).

Hence,
δ²/2 + δ√(1 + δ²/4) ≡ E − 1 ≡ ∆. (2.40)

(iv) (1 + ∆)(1 − ∇)f(x) = (1 + ∆)[f(x) − f(x) + f(x − h)]
 = (1 + ∆)f(x − h) = f(x − h) + f(x) − f(x − h)
 = f(x).
Therefore,
(1 + ∆)(1 − ∇) ≡ 1. (2.41)

(v)
[∆E⁻¹/2 + ∆/2]f(x) = (1/2)[∆f(x − h) + ∆f(x)]
 = (1/2)[f(x) − f(x − h) + f(x + h) − f(x)]
 = (1/2)[f(x + h) − f(x − h)] = (1/2)(E − E⁻¹)f(x)
 = (1/2)(E^{1/2} + E^{−1/2})(E^{1/2} − E^{−1/2})f(x)
 = µδf(x).
Hence
∆E⁻¹/2 + ∆/2 ≡ µδ. (2.42)
(vi)
[(∆ + ∇)/2]f(x) = (1/2)[∆f(x) + ∇f(x)]
 = (1/2)[f(x + h) − f(x) + f(x) − f(x − h)]
 = (1/2)[f(x + h) − f(x − h)] = (1/2)(E − E⁻¹)f(x)
 = µδf(x)   (as in the previous case).
Thus,
µδ ≡ (∆ + ∇)/2. (2.43)
(vii) ∆∇f(x) = ∆[f(x) − f(x − h)] = f(x + h) − 2f(x) + f(x − h).
Again,
∇∆f(x) = f(x + h) − 2f(x) + f(x − h) = (E − 2 + E⁻¹)f(x)
 = (E^{1/2} − E^{−1/2})²f(x) = δ²f(x).
Hence, ∆∇ ≡ ∇∆ ≡ (E^{1/2} − E^{−1/2})² ≡ δ². (2.44)



The relations among the various operators are shown in Table 2.4.

Table 2.4: Relationship between the operators.

        E                       ∆                        ∇                        δ                           hD
E       E                       ∆ + 1                    (1 − ∇)⁻¹                1 + δ²/2 + δ√(1 + δ²/4)     e^{hD}
∆       E − 1                   ∆                        (1 − ∇)⁻¹ − 1            δ²/2 + δ√(1 + δ²/4)         e^{hD} − 1
∇       1 − E⁻¹                 1 − (1 + ∆)⁻¹            ∇                        −δ²/2 + δ√(1 + δ²/4)        1 − e^{−hD}
δ       E^{1/2} − E^{−1/2}      ∆(1 + ∆)^{−1/2}          ∇(1 − ∇)^{−1/2}          δ                           2 sinh(hD/2)
µ       (E^{1/2} + E^{−1/2})/2  (1 + ∆/2)(1 + ∆)^{−1/2}  (1 − ∇/2)(1 − ∇)^{−1/2}  √(1 + δ²/4)                 cosh(hD/2)
hD      log E                   log(1 + ∆)               −log(1 − ∇)              2 sinh⁻¹(δ/2)               hD

From the definition of derivative, we have

f′(x) = lim_{h→0} [f(x + h) − f(x)]/h = lim_{h→0} ∆f(x)/h.

Thus one can write
∆f(x) ≈ hf′(x).
Again,

f″(x) = lim_{h→0} [f′(x + h) − f′(x)]/h
      ≈ lim_{h→0} [∆f(x + h)/h − ∆f(x)/h]/h
      = lim_{h→0} [∆f(x + h) − ∆f(x)]/h² = lim_{h→0} ∆²f(x)/h².

Therefore, h²f″(x) ≈ ∆²f(x).
In general, ∆ⁿf(x) ≈ hⁿf⁽ⁿ⁾(x). That is, for small h, the operators ∆ and hD are almost equal.

2.4 Representation of Polynomial using Factorial Notation

From the definition of factorial notation,



x^(0) = 1
x^(1) = x
x^(2) = x(x − h)                                 (2.45)
x^(3) = x(x − h)(x − 2h)
x^(4) = x(x − h)(x − 2h)(x − 3h)
The above relations show that x(n) , n = 1, 2, . . . is a polynomial of degree n in x.
Also, x, x2 , x3 , . . . can be expressed in terms of factorial notations x(1) , x(2) , x(3) , . . ., as
shown below.
1 = x^(0)
x = x^(1)
x² = x^(2) + hx^(1)                              (2.46)
x³ = x^(3) + 3hx^(2) + h²x^(1)
x⁴ = x^(4) + 6hx^(3) + 7h²x^(2) + h³x^(1)
These relations show that xn can be expressed as a polynomial of x(1) , x(2) , . . . , x(n) ,
of degree n. Once a polynomial is expressed in a factorial notation, its differences can
be obtained by using the formula like differential calculus.

Example 2.4.1 Express f(x) = 2x⁴ + x³ − 5x² + 8 in factorial notation and find its first and second differences.

Solution. Here we assume that h = 1.
Then by (2.46), x = x^(1), x² = x^(2) + x^(1), x³ = x^(3) + 3x^(2) + x^(1), x⁴ = x^(4) + 6x^(3) + 7x^(2) + x^(1).
Using these values, the function f(x) becomes

f(x) = 2[x^(4) + 6x^(3) + 7x^(2) + x^(1)] + [x^(3) + 3x^(2) + x^(1)] − 5[x^(2) + x^(1)] + 8
     = 2x^(4) + 13x^(3) + 12x^(2) − 2x^(1) + 8.

Now, Property 2.2.11, i.e., ∆x^(n) = nx^(n−1) (for h = 1), is used to find the differences. Therefore,
∆f(x) = 2·4x^(3) + 13·3x^(2) + 12·2x^(1) − 2·1x^(0) = 8x^(3) + 39x^(2) + 24x^(1) − 2
and ∆²f(x) = 24x^(2) + 78x^(1) + 24.
In terms of x,
∆f(x) = 8x(x − 1)(x − 2) + 39x(x − 1) + 24x − 2
and ∆²f(x) = 24x(x − 1) + 78x + 24.
From the relations of (2.46) one can conclude the following result.

Lemma 2.4.1 Any polynomial f (x) in x of degree n can be expressed in factorial no-
tation with same degree, n.

This means, in conversion to the factorial notation, the degree of a polynomial remains
unchanged.
The above process of converting a polynomial into factorial form is a laborious technique when the degree of the polynomial is large. Another systematic process, like Maclaurin's formula in differential calculus, is used to convert a polynomial, or even a function, into factorial notation.
Let f (x) be a polynomial in x of degree n. In factorial notation, let it be

f (x) = a0 + a1 x(1) + a2 x(2) + · · · + an x(n) , (2.47)

where the ai's are unknown constants to be determined, an ≠ 0.


Now, one can find the differences of (2.47) as follows.

∆f (x) = a1 + 2a2 x(1) + 3a3 x(2) + · · · + nan x(n−1)


∆2 f (x) = 2.1a2 + 3.2a3 x(1) + · · · + n(n − 1)an x(n−2)
∆3 f (x) = 3.2.1a3 + 4.3.2.x(1) + · · · + n(n − 1)(n − 2)an x(n−3)
······ ··· ·············································
∆n f (x) = n(n − 1)(n − 2) · · · 3 · 2 · 1an = n!an .

When x = 0, the above relations give


a0 = f (0), ∆f (0) = a1 ,
∆2 f (0)
∆2 f (0) = 2.1.a2 or, a2 =
2!
3 ∆3 f (0)
∆ f (0) = 3.2.1.a3 or, a3 =
3!
·················· ······ ···············
∆n f (0)
∆n f (0) = n!an or, an = .
n!
Using these results equation (2.47) becomes

∆2 f (0) (2) ∆3 f (0) (3) ∆n f (0) (n)


f (x) = f (0) + ∆f (0)x(1) + x + x + ··· + x .
2! 3! n!
(2.48)

This formula is similar to Maclaurin’s formula of differential calculus and it is also


used to expand a function in terms of factorial notation. To perform the formula (2.48),
different forward differences are to be evaluated at x = 0, and this can be done using
forward difference table. This is a systematic process and easy to implement as computer
program.

Example 2.4.2 Express f(x) = 2x⁴ − 5x³ + 8x² + 2x − 1 in factorial notation.

Solution. Taking h = 1. f (0) = −1, f (1) = 6, f (2) = 27, f (3) = 104, f (4) = 327.

x f (x) ∆f (x) ∆2 f (x) ∆3 f (x) ∆4 f (x)


0 −1
7
1 6 14
21 42
2 27 56 48
77 90
3 104 146
223
4 327

Thus, using formula (2.48),

f(x) = f(0) + ∆f(0)x^(1) + [∆²f(0)/2!]x^(2) + [∆³f(0)/3!]x^(3) + [∆⁴f(0)/4!]x^(4)
     = 2x^(4) + 7x^(3) + 7x^(2) + 7x^(1) − 1.
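The coefficients ∆ⁱf(0)/i! can be produced mechanically from the tabulated values. The C sketch below (our own illustration, h = 1) overwrites the value array with the leading differences in place and reproduces the factorial coefficients −1, 7, 7, 7, 2 obtained above:

#include <stdio.h>

#define N 5   /* f(0), f(1), ..., f(4) */

int main()
{
    /* values of f(x) = 2x^4 - 5x^3 + 8x^2 + 2x - 1 at x = 0, ..., 4 */
    double d[N] = { -1, 6, 27, 104, 327 };
    double fact = 1.0;
    int i, j;

    /* after this loop, d[i] holds the leading difference of order i */
    for (j = 1; j < N; j++)
        for (i = N - 1; i >= j; i--)
            d[i] = d[i] - d[i - 1];

    for (i = 0; i < N; i++) {
        if (i > 1) fact *= i;                    /* fact = i!       */
        printf("a_%d = %g\n", i, d[i] / fact);   /* coeff. of x^(i) */
    }
    return 0;
}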

Example 2.4.3 If ∆f(x) = x⁴ + 2x³ + 8x² + 3, find f(x).

Solution. Applying synthetic division to express ∆f(x) in factorial notation (the constant term 3 passes straight through; the remainders of the successive divisions by x − 1, x − 2, x − 3 give the remaining coefficients):

1 |  1   2   8   0   3
  |      1   3  11
2 |  1   3  11  11
  |      2  10
3 |  1   5  21
  |      3
4 |  1   8
     1

Therefore, ∆f(x) = x^(4) + 8x^(3) + 21x^(2) + 11x^(1) + 3.
Hence,

f(x) = (1/5)x^(5) + (8/4)x^(4) + (21/3)x^(3) + (11/2)x^(2) + 3x^(1) + c   [using Property 2.2.12]
     = (1/5)x(x − 1)(x − 2)(x − 3)(x − 4) + 2x(x − 1)(x − 2)(x − 3)
       + 7x(x − 1)(x − 2) + (11/2)x(x − 1) + 3x + c,  where c is an arbitrary constant.
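The repeated synthetic division can also be coded. In the following C sketch (our own illustration for h = 1; a[] holds the ordinary coefficients, highest degree first) the constant term is set aside first, and each subsequent division by x − k leaves the next factorial coefficient as its remainder:

#include <stdio.h>

#define N 5   /* number of coefficients of the degree-4 polynomial */

int main()
{
    /* Delta f(x) = x^4 + 2x^3 + 8x^2 + 0x + 3  (Example 2.4.3) */
    double a[N] = { 1, 2, 8, 0, 3 };
    double b[N];              /* b[k] = coefficient of x^(k)       */
    int i, k, m = N - 1;      /* m = coefficients still in play    */

    b[0] = a[N - 1];          /* dividing by x leaves the constant */
    for (k = 1; k < N; k++) { /* divide by x-1, x-2, x-3, ...      */
        for (i = 1; i < m; i++) a[i] += k * a[i - 1];
        b[k] = a[m - 1];      /* remainder of this division        */
        m--;
    }
    for (k = N - 1; k >= 0; k--)
        printf("coefficient of x^(%d) = %g\n", k, b[k]);
    return 0;
}

It prints the coefficients 1, 8, 21, 11, 3 found in the table above.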

2.5 Difference of a Polynomial

Let f(x) = a0xⁿ + a1xⁿ⁻¹ + ··· + a_{n−1}x + a_n be a polynomial of degree n. From this expression one can find the successive differences of f(x).
Now, ∆f(x) = f(x + h) − f(x), where h is the spacing of x,
= a0[(x + h)ⁿ − xⁿ] + a1[(x + h)ⁿ⁻¹ − xⁿ⁻¹] + ··· + a_{n−1}[(x + h) − x].
Expanding the terms within the parentheses using the binomial theorem, we obtain

∆f(x) = a0[xⁿ + nhxⁿ⁻¹ + {n(n − 1)/2!}h²xⁿ⁻² + ··· + hⁿ − xⁿ]
      + a1[xⁿ⁻¹ + (n − 1)hxⁿ⁻² + {(n − 1)(n − 2)/2!}h²xⁿ⁻³ + ··· + hⁿ⁻¹ − xⁿ⁻¹]
      + ··· + h a_{n−1}
      = a0nhxⁿ⁻¹ + [a0{n(n − 1)/2!}h² + a1(n − 1)h]xⁿ⁻² + ··· + a_{n−1}h.
If h is constant, then the coefficients of xⁿ⁻¹, xⁿ⁻², ..., x and the constant term a_{n−1}h are constants. Denote the coefficients of xⁿ⁻¹, xⁿ⁻², ..., x and the constant term by b0, b1, ..., b_{n−2} and b_{n−1} respectively. In this notation the first difference can be written as

∆f(x) = b0xⁿ⁻¹ + b1xⁿ⁻² + ··· + b_{n−2}x + b_{n−1}, where b0 = a0nh.

It may be noted that ∆f (x) is a polynomial of degree n − 1, i.e., first difference


reduces the degree of f (x) by 1.
The second difference of f(x) is

∆²f(x) = ∆f(x + h) − ∆f(x)
 = b0[(x + h)ⁿ⁻¹ − xⁿ⁻¹] + b1[(x + h)ⁿ⁻² − xⁿ⁻²] + ··· + b_{n−2}[(x + h) − x]
 = b0(n − 1)hxⁿ⁻² + [{(n − 1)(n − 2)/2!}b0h² + (n − 2)b1h]xⁿ⁻³ + ··· + b_{n−2}h
 = c0xⁿ⁻² + c1xⁿ⁻³ + ··· + c_{n−3}x + c_{n−2},

where c0 = b0(n − 1)h, c1 = {(n − 1)(n − 2)/2!}b0h² + (n − 2)b1h, etc.


This expression shows that ∆2 f (x) is a polynomial of degree n − 2.
It may be noted that the coefficient of the leading term is c0 = b0 h(n − 1) = n(n −
1)h2 a0 and it is a constant quantity.

In this way, one can find that ∆ⁿ⁻¹f(x) is a polynomial of degree one; let it be ∆ⁿ⁻¹f(x) = p0x + p1.
Then ∆ⁿf(x) = p0(x + h) + p1 − p0x − p1 = p0h, which is a constant, and ∆ⁿ⁺¹f(x) = 0.
It can be shown that ∆ⁿf(x) = n(n − 1)(n − 2) ··· 2·1·hⁿa0 = n!hⁿa0.
Thus finally,

∆k f (x), k < n is a polynomial of degree n − k,


∆n f (x) is constant, and
∆k f (x), k > n is zero.

Alternative proof.
It is observed that a polynomial in x of degree n can be expressed as a polynomial in factorial notation with the same degree n.
Thus, if f(x) = a0xⁿ + a1xⁿ⁻¹ + a2xⁿ⁻² + ··· + a_{n−1}x + a_n is the given polynomial, then it can be written as f(x) = b0x^(n) + b1x^(n−1) + b2x^(n−2) + ··· + b_{n−1}x^(1) + b_n.
Therefore,
∆f(x) = b0nhx^(n−1) + b1(n − 1)hx^(n−2) + b2(n − 2)hx^(n−3) + ··· + b_{n−1}h.
Clearly this is a polynomial of degree n − 1.
Similarly,

∆²f(x) = b0n(n − 1)h²x^(n−2) + b1(n − 1)(n − 2)h²x^(n−3) + ··· + b_{n−2}h²,
∆³f(x) = b0n(n − 1)(n − 2)h³x^(n−3) + b1(n − 1)(n − 2)(n − 3)h³x^(n−4) + ··· + b_{n−3}h³.

In this way,

∆ⁿf(x) = b0n(n − 1)(n − 2) ··· 2·1·hⁿx^(n−n) = b0n!hⁿ, a constant quantity.

Hence ∆ⁿ⁺¹f(x) = 0.

Hence ∆n+1 f (x) = 0.

Difference of factorial power function


From the definition of factorial notation, we have

∆x^(n) = nhx^(n−1)
∆²x^(n) = nh∆x^(n−1) = nh·(n − 1)hx^(n−2) = n(n − 1)h²x^(n−2)
∆³x^(n) = n(n − 1)h²·∆x^(n−2) = n(n − 1)(n − 2)h³x^(n−3).

In this way,
∆ⁿx^(n) = n(n − 1)(n − 2) ··· 2·1·hⁿx^(n−n) = n!hⁿ.

Example 2.5.1 Given xi = x0 + ih, i = 0, 1, 2, ..., n; h > 0, and ui(x) = (x − x0)(x − x1) ··· (x − xi), prove that

∆ᵏui(x) = (i + 1)i(i − 1) ··· (i − k + 2)hᵏ(x − x0)(x − x1) ··· (x − x_{i−k}).

Solution. Here ui(x) = (x − x0)(x − x1) ··· (x − xi) = (x − x0)^{(i+1)} (say).
Therefore,

∆ui(x) = (x + h − x0)(x + h − x1) ··· (x + h − xi) − (x − x0)(x − x1) ··· (x − xi)
 = (x + h − x0)(x − x0)(x − x1) ··· (x − x_{i−1}) − (x − x0)(x − x1) ··· (x − xi)
 = (x − x0)(x − x1) ··· (x − x_{i−1})[(x + h − x0) − (x − xi)]
 = (x − x0)(x − x1) ··· (x − x_{i−1})(h + xi − x0)
 = (i + 1)h(x − x0)(x − x1) ··· (x − x_{i−1})   [since xi = x0 + ih]
 = (i + 1)h(x − x0)^{(i)}.

Similarly,

∆²ui(x) = (i + 1)h[(x + h − x0) ··· (x + h − x_{i−1}) − (x − x0) ··· (x − x_{i−1})]
 = (i + 1)h(x − x0)(x − x1) ··· (x − x_{i−2})[(x + h − x0) − (x − x_{i−1})]
 = (i + 1)h·ih·(x − x0)^{(i−1)}
 = (i + 1)i h²(x − x0)^{(i−1)}.

In a similar way,
∆³ui(x) = (i + 1)i(i − 1)h³(x − x0)^{(i−2)}.
Hence,

∆ᵏui(x) = (i + 1)i(i − 1) ··· (i − k + 2)hᵏ(x − x0)^{(i−k+1)}
 = (i + 1)i(i − 1) ··· (i − k + 2)hᵏ(x − x0)(x − x1) ··· (x − x_{i−k}).

2.6 Summation of Series

The finite difference method is also used to find the sum of a finite series. Two important
results are presented here.

Theorem 2.1 If f(x) is defined at the equally spaced points a, a + h, a + 2h, ..., b, then

Σ_{x=a}^{b} f(x) = f(a) + f(a + h) + f(a + 2h) + ··· + f(b) = [F(x)]_{a}^{b+h},

i.e., Σ_{x=a}^{b} f(x) = F(b + h) − F(a), b = a + nh for some n, (2.49)

where F(x) is an anti-difference (instead of anti-derivative) of f(x), i.e., ∆F(x) = f(x).
Proof. Since ∆F(x) = f(x), therefore,

Σ_{x=a}^{b} f(x) = Σ_{x=a}^{b} ∆F(x) = Σ_{x=a}^{b} [F(x + h) − F(x)]
 = [F(b + h) − F(b)] + [F(b) − F(b − h)] + ··· + [F(a + h) − F(a)]
 = F(b + h) − F(a).

Thus, if F(x) is the anti-difference of f(x), i.e., ∆⁻¹f(x) = F(x), then Σ_{x=a}^{b} f(x) = F(b + h) − F(a).

Example 2.6.1 Use the finite difference method to find the sum of the series Σ_{x=1}^{n} f(x), where f(x) = x(x − 1) + 3/[x(x + 1)(x + 2)].

Solution. Let

F(x) = ∆⁻¹f(x) = ∆⁻¹[x(x − 1) + 3/{x(x + 1)(x + 2)}]
     = ∆⁻¹x^(2) + 3∆⁻¹x^(−3) = x^(3)/3 + 3·x^(−2)/(−2)
     = (1/3)x(x − 1)(x − 2) − (3/2)·1/{x(x + 1)}.

Therefore,

Σ_{x=1}^{n} f(x) = F(n + 1) − F(1)
 = (1/3)(n + 1)n(n − 1) − (3/2)·1/{(n + 1)(n + 2)} − 0 + (3/2)·1/(1·2)
 = (1/3)n(n² − 1) − (3/2)·1/{(n + 1)(n + 2)} + 3/4.
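The closed form can be checked numerically; the short C program below (illustrative only) compares the direct sum with the formula for n = 10:

#include <stdio.h>

int main()
{
    int n = 10, x;
    double direct = 0.0, closed;

    for (x = 1; x <= n; x++)
        direct += x * (x - 1.0) + 3.0 / (x * (x + 1.0) * (x + 2.0));

    closed = n * ((double)n * n - 1.0) / 3.0
           - 1.5 / ((n + 1.0) * (n + 2.0)) + 0.75;
    printf("direct = %.10f, closed form = %.10f\n", direct, closed);
    return 0;
}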

Summation by parts
Like the formula ‘integration by parts’ of integral calculus there is a similar formula in
finite difference calculus. If f (x) and g(x) are two functions defined only for integral
values of x between a and b, then

Σ_{x=a}^{b} f(x)∆g(x) = [f(x)g(x)]_{a}^{b+h} − Σ_{x=a}^{b} g(x + h)∆f(x). (2.50)


Example 2.6.2 Find the sum of the series Σ_{x=1}^{n} x·4ˣ.

Solution. Let f(x) = x, ∆g(x) = 4ˣ. Then g(x) = 4ˣ/3. Hence, for h = 1,

Σ_{x=1}^{n} x·4ˣ = [x·4ˣ/3]_{1}^{n+1} − Σ_{x=1}^{n} (4^{x+1}/3)·∆x
 = (n + 1)4^{n+1}/3 − 4/3 − (4/3) Σ_{x=1}^{n} 4ˣ·1
 = (n + 1)4^{n+1}/3 − 4/3 − (4/3)·4(4ⁿ − 1)/3
 = [(3n − 1)4^{n+1} + 4]/9.

2.7 Worked out Examples

Example 2.7.1 If n is a positive integer then show that

x^(−n) = 1/(x + nh)^(n), where h is the spacing.

Solution. From the definition of x^(n), we have

x^(n) = x(x − h)(x − 2h) ··· (x − (n−2)h)(x − (n−1)h) = x^(n−1)·(x − (n−1)h).

This can be written as x^(n−1) = x^(n)/[x − (n−1)h].
Substituting n = 0, −1, −2, ..., we obtain

x^(−1) = x^(0)/(x + h) = 1/(x + h)
x^(−2) = x^(−1)/(x + 2h) = 1/[(x + h)(x + 2h)]
x^(−3) = x^(−2)/(x + 3h) = 1/[(x + h)(x + 2h)(x + 3h)].

In this way,

x^(−n) = 1/[(x + h)(x + 2h) ··· (x + nh)]

and hence

x^(−n) = 1/[(x + nh)(x + nh − h)(x + nh − 2h) ··· (x + nh − (n−1)h)] = 1/(x + nh)^(n).

Example 2.7.2 Prove the following identities.
(i) u0 + u1 + u2 + ··· + un = ⁿ⁺¹C1 u0 + ⁿ⁺¹C2 ∆u0 + ⁿ⁺¹C3 ∆²u0 + ··· + ∆ⁿu0.
(ii) u0 + xu1 + (x²/2!)u2 + (x³/3!)u3 + ··· = eˣ[u0 + x∆u0 + (x²/2!)∆²u0 + ···].
(iii) u0 − u1 + u2 − u3 + ··· = (1/2)u0 − (1/4)∆u0 + (1/8)∆²u0 − (1/16)∆³u0 + ···.
2 4 8 16
Solution. (i) u0 + u1 + u2 + ··· + un
= u0 + Eu0 + E²u0 + ··· + Eⁿu0
= (1 + E + E² + ··· + Eⁿ)u0 = [(E^{n+1} − 1)/(E − 1)]u0
= {[(1 + ∆)^{n+1} − 1]/∆}u0   [since E ≡ 1 + ∆]
= (1/∆)[ⁿ⁺¹C1 ∆ + ⁿ⁺¹C2 ∆² + ··· + ∆^{n+1}]u0
= ⁿ⁺¹C1 u0 + ⁿ⁺¹C2 ∆u0 + ⁿ⁺¹C3 ∆²u0 + ··· + ∆ⁿu0.

(ii) u0 + xu1 + (x²/2!)u2 + (x³/3!)u3 + ···
= u0 + xEu0 + (x²/2!)E²u0 + (x³/3!)E³u0 + ···
= [1 + xE + (xE)²/2! + (xE)³/3! + ···]u0
= e^{xE}u0 = e^{x(1+∆)}u0 = eˣe^{x∆}u0
= eˣ[1 + x∆ + (x∆)²/2! + (x∆)³/3! + ···]u0
= eˣ[u0 + x∆u0 + (x²/2!)∆²u0 + (x³/3!)∆³u0 + ···].

(iii)
u0 − u1 + u2 − u3 + ···
= u0 − Eu0 + E²u0 − E³u0 + ··· = (1 − E + E² − E³ + ···)u0
= (1 + E)⁻¹u0 = (2 + ∆)⁻¹u0 = (1/2)(1 + ∆/2)⁻¹u0
= (1/2)[1 − ∆/2 + ∆²/4 − ∆³/8 + ···]u0
= (1/2)u0 − (1/4)∆u0 + (1/8)∆²u0 − (1/16)∆³u0 + ···.

Example 2.7.3 Let fi = f(xi), where xi = x0 + ih, i = 1, 2, .... Prove that

fi = Eⁱf0 = Σ_{j=0}^{i} ⁱCj ∆ʲf0.

Solution. From the definition of ∆,
∆f(xi) = f(xi + h) − f(xi) = Ef(xi) − f(xi),
i.e., Ef(xi) = ∆f(xi) + f(xi) = (∆ + 1)f(xi).
Hence, E ≡ ∆ + 1 and also Eⁱ ≡ (1 + ∆)ⁱ.
Therefore,

fi = Eⁱf0 = (1 + ∆)ⁱf0 = [1 + ⁱC1 ∆ + ⁱC2 ∆² + ···]f0 = Σ_{j=0}^{i} ⁱCj ∆ʲf0.

Example 2.7.4 Prove that

∆ⁿf(x) = Σ_{i=0}^{n} (−1)ⁱ ⁿCi f[x + (n − i)h],

where h is the step-length.

Solution. We know that ∆ ≡ E − 1 and ∆ⁿ ≡ (E − 1)ⁿ.
Therefore, ∆ⁿf(x) = (E − 1)ⁿf(x) = Σ_{i=0}^{n} (−1)ⁱ ⁿCi E^{n−i}f(x)   [by the binomial theorem].
Now, Ef(x) = f(x + h), E²f(x) = f(x + 2h), ..., E^{n−i}f(x) = f[x + (n − i)h].
Hence ∆ⁿf(x) = Σ_{i=0}^{n} (−1)ⁱ ⁿCi f[x + (n − i)h].
i=0

Example 2.7.5 Find the polynomial f (x) which satisfies the following data and
hence find the value of f (1.5).

x : 1 2 3 4
f (x) : 3 5 10 30

Solution. The difference table of the given data is


x    f(x)   ∆f(x)   ∆²f(x)   ∆³f(x)
1      3
              2
2      5              3
              5                 12
3     10             15
             20
4     30

It is known that f(x0 + nh) = Eⁿf(x0) = (1 + ∆)ⁿf(x0).
Here x0 = 1, h = 1. Let x0 + nh = 1 + n = x, i.e., n = x − 1. Therefore,

f(x) = E^{x−1}f(x0) = (1 + ∆)^{x−1}f(x0)
     = f(x0) + (x − 1)∆f(x0) + [(x − 1)(x − 2)/2!]∆²f(x0)
       + [(x − 1)(x − 2)(x − 3)/3!]∆³f(x0) + ···
     = 3 + 2(x − 1) + 3(x − 1)(x − 2)/2! + 12(x − 1)(x − 2)(x − 3)/3!
     = 2x³ − (21/2)x² + (39/2)x − 8.

Thus f(1.5) = 4.375.
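The same computation is easily organized as a program: build the leading differences of the tabulated values and accumulate the expansion term by term. The C sketch below (our own illustration) reproduces f(1.5) = 4.375:

#include <stdio.h>

#define N 4

int main()
{
    double d[N] = { 3, 5, 10, 30 };   /* f(1), f(2), f(3), f(4) */
    double x = 1.5, u = x - 1.0;      /* h = 1, x0 = 1          */
    double term = 1.0, sum;
    int i, j;

    for (j = 1; j < N; j++)           /* leading differences in place */
        for (i = N - 1; i >= j; i--)
            d[i] -= d[i - 1];

    sum = d[0];
    for (i = 1; i < N; i++) {
        term *= (u - (i - 1)) / i;    /* u(u-1)...(u-i+1)/i!          */
        sum += term * d[i];
    }
    printf("f(%.1f) = %g\n", x, sum); /* prints 4.375 */
    return 0;
}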

Example 2.7.6 Find the missing term in the following table:


x : 1 2 3 4 5
f (x) : –2 3 8 – 21

Solution. Here four values of f(x) are given. So, we consider f(x) to be a polynomial of degree 3. Thus the fourth differences of f(x) vanish, i.e.,
∆⁴f(x) = 0 or, (E − 1)⁴f(x) = 0
or, (E⁴ − 4E³ + 6E² − 4E + 1)f(x) = 0
or, E⁴f(x) − 4E³f(x) + 6E²f(x) − 4Ef(x) + f(x) = 0
or, f(x + 4) − 4f(x + 3) + 6f(x + 2) − 4f(x + 1) + f(x) = 0.
Here, h = 1 as the values are in spacing of 1 unit.
For x = 1 the above equation becomes
f (5) − 4f (4) + 6f (3) − 4f (2) + f (1) = 0 or, 21 − 4f (4) + 6 × 8 − 4 × 3 − 2 = 0
or, f (4) = 13.75.

Example 2.7.7 Use finite difference method to find the values of a and b in the
following table.
x : 0 2 4 6 8 10
f (x) : –5 a 8 b 20 32

Solution. Here, four values of f (x) are known, so we can assume that f (x) is a
polynomial of degree 3. Then ∆⁴f(x) = 0,
or, (E − 1)⁴f(x) = 0
or, E⁴f(x) − 4E³f(x) + 6E²f(x) − 4Ef(x) + f(x) = 0
or, f(x + 8) − 4f(x + 6) + 6f(x + 4) − 4f(x + 2) + f(x) = 0.
[Here h = 2, because the values of x are given at intervals of 2 units.]
In this problem, two unknowns a and b are to be determined and needs two equations.
Therefore, the following equations are obtained by substituting x = 2 and x = 0 to
the above equation.
f (10) − 4f (8) + 6f (6) − 4f (4) + f (2) = 0 and
f (8) − 4f (6) + 6f (4) − 4f (2) + f (0) = 0.
These equations simplify to
32 − 4 × 20 + 6b − 4 × 8 + a = 0 and 20 − 4b + 6 × 8 − 4a − 5 = 0,
that is, 6b + a − 80 = 0 and −4b − 4a + 63 = 0. The solution of these equations is a = 2.9, b = 12.85.

 
Example 2.7.8 Find the value of (∆²/E)x².

Solution.
(∆²/E)x² = [(E − 1)²/E]x²
 = [(E² − 2E + 1)/E]x²
 = (E − 2 + E⁻¹)x² = Ex² − 2x² + E⁻¹x²
 = (x + 1)² − 2x² + (x − 1)²
 = 2.

Example 2.7.9 Show that ∇ log f(x) = −log[1 − ∇f(x)/f(x)].

Solution.

∇ log f(x) = log f(x) − log f(x − h) = log f(x) − log E⁻¹f(x)
 = log[f(x)/E⁻¹f(x)] = −log[E⁻¹f(x)/f(x)] = −log[(1 − ∇)f(x)/f(x)]
 = −log{[f(x) − ∇f(x)]/f(x)} = −log[1 − ∇f(x)/f(x)].

The following formulae for anti-differences can easily be verified, for h = 1.

(i) ∆⁻¹cf(x) = c∆⁻¹f(x), c being a constant.
(ii) ∆⁻¹x^(n) = x^(n+1)/(n + 1), n being a positive integer.
(iii) ∆⁻¹aˣ = aˣ/(a − 1), a ≠ 1.
(iv) ∆⁻¹ sin ax = −cos(ax − a/2)/(2 sin(a/2)).
(v) ∆⁻¹ cos ax = sin(ax − a/2)/(2 sin(a/2)).

2.8 Difference Equations

Like differential equations, difference equations have many applications in different


branches of mathematics, statistics and other field of science and engineering.
A difference equation is an equation containing an independent variable, a dependent
variable and successive differences of dependent variable. The difference equations are
some times called recurrence equations.
For example,
a∆2 f (x) + b∆f (x) + cf (x) = g(x), (2.51)
where a, b, c are constants, g(x) is a known function and f (x) is the unknown function.
The solution of a difference equation is the value of the unknown function.
The difference equation can also be expressed as a relation among the independent
variable and the successive values, i.e., f (x), f (x + h), f (x + 2h), . . . , of dependent
variable. For example, the difference equation
2∆2 f (x) − ∆f (x) + 5f (x) = x2 + 3x, (2.52)

is the same as 2E²f(x) − 5Ef(x) + 8f(x) = x² + 3x,
or, it can be written as
2f(x + 2h) − 5f(x + h) + 8f(x) = x² + 3x.
If f(x) is denoted by u_x, then this equation can be written as
2u_{x+2h} − 5u_{x+h} + 8u_x = x² + 3x.
When h = 1, the above equation simplifies to

2u_{x+2} − 5u_{x+1} + 8u_x = x² + 3x. (2.53)

The difference between the largest and the smallest arguments appearing in the dif-
ference equation with unit interval is called the order of the difference equation.
The order of the equation (2.53) is (x + 2) − x = 2, while the order of the equation
ux+3 − 8ux+1 + 5ux−1 = x3 + 2 is (x + 3) − (x − 1) = 4. The order of the difference
equation ∆f (x + 2) − 3f (x) = 0 is 3 as it is equivalent to ux+3 − ux+2 − 3ux = 0.
A difference equation in which ux , ux+1 , . . . , ux+n occur to the first degree only and
there are no product terms is called linear difference equation. Its general form is

a0 ux+n + a1 ux+n−1 + · · · + an ux = g(x). (2.54)

If the coefficients a0 , a1 , . . . , an are constants, then the equation (2.54) is called linear
difference equation with constant coefficients. If g(x) = 0 then the equation is
called homogenous otherwise it is called non-homogeneous difference equation.
The linear homogeneous equation with constant coefficients of order n is

a0 ux+n + a1 ux+n−1 + · · · + an ux = 0. (2.55)

This can be written as

(a0 E n + a1 E n−1 + · · · + an )ux = 0 or, f (E)ux = 0, (2.56)

where f (E) ≡ a0 E n + a1 E n−1 + · · · + an is known as the characteristic function of


(2.56).
The equation f (m) = 0 is called the auxiliary equation (A.E.) for the difference
equation (2.56).
The solution of a difference equation is a relation between the independent variable
and the dependent variable satisfying the equation.

2.8.1 Formation of difference equations


Let {1, 2, 4, 8, 16, . . .} be a sequence and its general term is 2n , n = 0, 1, 2, . . .. Let us
denote the nth term by an . Then an = 2n . Also, an+1 = 2n+1 . Thus an+1 = 2.2n = 2an ,
i.e., an+1 − 2an = 0 is the difference equation of the above sequence.

Let us consider another example of a sequence whose xth term is given by

u_x = a·2ˣ + b·3ˣ, (2.57)

where a, b are arbitrary constants. Then

u_{x+1} = a·2^{x+1} + b·3^{x+1} (2.58)

and

u_{x+2} = a·2^{x+2} + b·3^{x+2}. (2.59)

Now u_{x+1} − 2u_x = b·3ˣ [using (2.57) and (2.58)]
and u_{x+2} − 2u_{x+1} = b·3^{x+1}. [using (2.58) and (2.59)]
Therefore, u_{x+2} − 2u_{x+1} = 3(b·3ˣ) = 3(u_{x+1} − 2u_x).
That is, u_{x+2} − 5u_{x+1} + 6u_x = 0 is the required difference equation for the relation (2.57).
It may be noted that the equation (2.57) contains two arbitrary constants and the
corresponding difference equation is of order 2.

A large number of counting problems can be modelled by using recurrence relations.


Some of them are presented here.
Example 2.8.1 (Rabbits on an island). This problem was originally posed by
the Italian mathematician Leonardo Fibonacci in the thirteenth century in his book
‘Liber abaci’. The problem is stated below.
A pair of new-born rabbits (one male and other female) is kept on an island where
there is no other rabbit. A pair of rabbits does not breed until they are 2 months old.
After a pair becomes 2 months old, each pair of rabbits (of opposite sexes) produces
another pair (of opposite sexes) each month. It is assuming that no rabbits ever die.
This problem can be modelled as a recurrence relation as follows.

Let xn denote the number of pairs of rabbits on the island just after n months. At
the end of first month the number of pairs of rabbits is 1, i.e., x1 = 1. Since this pair
does not breed during second month, so x2 = 1. Now,

xn = number of pairs of rabbits just after n months


= number of pairs after (n − 1) months + number of new born pairs at
the end of nth month.

The number of new born pairs = number of pairs just after the (n − 2)th month,
since each new-born pair is produced by a pair of at least 2 months old, and it is
equal to xn−2 .
Calculus of Finite Diff. and Diff. Equs 55

Hence
xn = xn−1 + xn−2 , n ≥ 3, x1 = x2 = 1. (2.60)

This is the difference equation of the above stated problem and the solution is x1 =
1, x2 = 1, x2 = 2, x3 = 3, x4 = 5, x5 = 8, . . ., i.e., the sequence is {1, 1, 2, 3, 5, 8 . . .}
and this sequence is known as Fibonacci sequence.

Example 2.8.2 (The Tower of Hanoi). The Tower of Hanoi problem is a famous
problem of the late nineteenth century. The problem is stated below.
Let there be three pegs, numbered 1, 2, 3 and they are on a board and n discs of
difference sizes with holes in their centres. Initially, these n discs are placed on one
peg, say peg 1, in order of decreasing size, with the largest disc at the bottom. The
rules of the puzzle are that the discs can be moved from one peg to another only one
at a time and no discs can be placed on the top of a smaller disc. The problem is to
transfer all the discs from peg 1 to another peg 2, in order of size, with the largest
disc at the bottom, in minimum number of moves.

Let xn be the number of moves required to solve the problem with n discs. If n = 1,
i.e., if there is only one disc on peg 1, we simply transfer it to peg 2 by one move.
Hence x1 = 1. Now, if n > 1, starting with n discs on peg 1 we can transfer the top
n − 1 discs, following the rules of this problem to peg 3 by xn−1 moves. During these
moves the largest disc at the bottom on peg 1 remains fixed. Next, we use one move
to transfer the largest disc from peg 1 to peg 2, which was empty. Finally, we again
transfer the n − 1 discs on peg 3 to peg 2 by xn−1 moves, placing them on top of the
largest disc on peg 2 which remains fixed during these moves. Thus, when n > 1,
(n − 1) discs are transferred twice and one additional move is needed to move the
largest disc at the bottom from peg 1 to peg 2. Thus the recurrence relation is

xn = 2xn−1 + 1 for n ≥ 2 and x1 = 1. (2.61)

This is a first order non-homogeneous difference equation.

2.9 Solution of Difference Equations

Several methods are used to solve difference equations. Among them the widely used
methods are iterative method, solution using operators, solution using generating func-
tion, etc.

2.9.1 Iterative method


In this method the successive terms are substituted until the terms reduce to initial
term. The method is illustrated by example.
56 Numerical Analysis

Example 2.9.1 Solve the difference equation


xn = xn−1 + (n − 1) for n ≥ 2 and x1 = 0.

Solution.
xn = xn−1 + (n − 1) = {xn−2 + (n − 2)} + (n − 1)
= xn−2 + 2n − (1 + 2) = {xn−3 + (n − 3)} + 2n − (1 + 2)
= xn−3 + 3n − (1 + 2 + 3).

In this way, after (n − 1) steps

xn = xn−(n−1) + (n − 1)n − (1 + 2 + · · · + n − 1)
n(n − 1)
= x1 + (n − 1)n −
2
1
= 0 + n(n − 1) [since x1 = 0]
2
1
= n(n − 1).
2

Example 2.9.2 Solve the difference equation for the Tower of Hanoi problem: xn =
2xn−1 + 1, n ≥ 2 with x1 = 1.

Solution.
xn = 2xn−1 + 1 = 2(2xn−2 + 1) + 1
= 22 xn−2 + (2 + 1) = 22 (2xn−3 + 1) + (2 + 1)
= 23 xn−3 + (22 + 2 + 1) = 23 (2xn−4 + 1) + (22 + 2 + 1)
= 24 xn−4 + (23 + 22 + 2 + 1).

In this way, after (n − 1) steps,

xn = 2n−1 xn−(n−1) + (2n−2 + 2n−3 + · · · + 22 + 2 + 1)


= 2n−1 x1 + (2n−2 + 2n−3 + · · · + 22 + 2 + 1)
= 2n−1 + 2n−2 + 2n−3 + · · · + 22 + 2 + 1 [since x1 = 1]
= 2 − 1.
n

2.9.2 Solution using symbolic operators


This method is used to solve homogeneous as well as non-homogeneous difference equa-
tions. First we consider the homogeneous linear difference equations with constant
coefficients.
Calculus of Finite Diff. and Diff. Equs 57

Solution of homogeneous equations with constant coefficients


To explain the method, a general second order difference equation is considered. Let

ux+2 + aux+1 + bux = 0. (2.62)

Using shift operator E, this equation can be written as

(E 2 + aE + b)ux = 0. (2.63)

Let ux = cmx be a solution of (2.63), c is a non-zero constant.


Then, E 2 ux = ux+2 = cmx+2 , Eux = ux+1 = cmx+1 .
Using these values, the equation (2.63) reduces to cmx (m2 + am + b) = 0. Since
cmx = 0,

m2 + am + b = 0. (2.64)

This equation is called the auxiliary equation (A.E.) for the difference equation
(2.62). Since (2.64) is a quadratic equation, three types of roots may occur.
Case I. Let m1 and m2 be two distinct real roots of (2.64). In this case, the general
solution is ux = c1 mx1 + c2 mx2 , where c1 and c2 are arbitrary constants.
Case II. Let m1 , m1 be two real and equal roots of (2.64). In this case (c1 mx1 + c2 mx1 ) =
(c1 + c2 )mx1 = cmx1 is the only one solution of (2.62). To get the other solution (as a
second order difference equation should have two independent solutions, like differential
equation), let us consider ux = mx1 vx be its solution.
Since m1 , m1 are two equal roots of (2.64), the equation (2.63) may be written as
(E 2 − 2m1 E + m21 )ux = 0.
Substituting ux = mx1 vx to this equation, we obtain
1 ux+2 − 2m1 vx+1 + m1 vx = 0
mx+2 x+2 x+2

or, m1 (vx+2 − 2vx+1 + vx ) = 0 or, mx+2


x+2 2
1 ∆ vx = 0.
That is, ∆2 vx = 0. Since second difference is zero, the first difference is constant and
hence vx is linear. Let vx = c1 + c2 x, where c1 , c2 are arbitrary constants.
Hence, in this case the general solution is

ux = (c1 + c2 x)mx1 .

Case III. If the roots m1 , m2 are complex, then m1 , m2 should be conjugate complex
and let them be (α + iβ) and (α − iβ), where α, β are reals. Then the general solution
is
ux = c1 (α + iβ)x + c2 (α − iβ)x .
 To simplify the above expression, substituting α = r cos θ, β = r sin θ, where r =
α2 + β 2 and tan θ = β/α.
58 Numerical Analysis

Therefore,
ux = c1 rx (cos θ + i sin θ)x + c2 rx (cos θ − i sin θ)x
= rx {c1 (cos θx + i sin θx) + c2 (cos θx − i sin θx)}
= rx {(c1 + c2 ) cos θx + i(c1 − c2 ) sin θx}
= rx (A cos θx + B sin θx), where A = c1 + c2 and B = i(c1 − c2 ).

Example 2.9.3 Solve ux+1 − 8ux = 0.

Solution. This equation is written as (E − 8)ux = 0. Let ux = cm2 be a solution.


The A.E. is m − 8 = 0 or, m = 8.
Then ux = c8x , where c is an arbitrary constant, is the general solution.

Example 2.9.4 Solve the difference equation ux = ux−1 +ux−2 , x ≥ 2, u0 = 1, u1 =


1. Also, find the approximate value of ux when x tends to a large number.

√ ux = cm be a solution. The A.E. is m − m − 1 = 0


Solution. Let x 2

1± 5
or, m = .
2
Therefore, general solution is
 √   √ 
1+ 5 x 1− 5 x
u x = c1 + c2 ,
2 2
where c1 , c2 are arbitrary constants.
Given that u0 = 1, u1 = 1, therefore,  √   √ 
1+ 5 1− 5
1 = c1 + c2 and 1 = c1 + c2 .
2 2
Solution of these equations is √ √
5+1 1− 5
c1 = √ and c2 = − √ .
2 5 2 5
Hence, the particular solution is
 √   √ 
1 1 + 5 x+1 1 − 5 x+1
ux = √ − .
5 2 2
 1 − √5 x+1
When x → ∞ then → 0 and therefore,
2
 √ 
1 1 + 5 x+1
ux  √ .
5 2
Calculus of Finite Diff. and Diff. Equs 59

Solution of non-homogeneous difference equation


The general form of non-homogeneous linear difference equation with constant coeffi-
cients is

(a0 E n + a1 E n−1 + a2 E n−2 + · · · + an )ux = g(x), or, f (E)ux = g(x).

The solution of this equation is the combination of two solutions – complementary


function (C.F.) and particular integral or particular solution (P.S.). The C.F. is
1
the solution of the homogeneous equation f (E)ux = 0 and the P.I. is given by g(x).
f (E)
Then the general solution is
ux =C.F.+P.I.
The method to find C.F. is discussed in previous section.

Rules to find particular integral


Case I. g(x) = ax , f (a) = 0.
Since f (E) = a0 E n + a1 E n−1 + · · · + an ,

f (E)ax = a0 ax+n + a1 ax+n−1 + · · · + an ax = ax (a0 an + a1 an−1 · · · + an )


= ax f (a).
1 x 1 x
Thus P.I. = a = a , provided f (a) = 0.
f (E) f (a)
Case II. g(x) = ax φ(x). Then,

f (E)ax φ(x) = a0 E n ax φ(x) + a1 E n−1 ax φ(x) + · · · + an ax φ(x)


= ax [a0 an φ(x + n) + a1 an−1 φ(x + n − 1) + · · · + an φ(x)]
= ax [(a0 an E n + a1 an−1 E n−1 + · · · + an )φ(x)]
= ax f (aE)φ(x).
1 x 1
This gives P.I. = a φ(x) = ax φ(x), where f (aE) = 0.
f (E) f (aE)
Case III. g(x) = ax , f (a) = 0.
1
In this case, P.I. = ax 1 [by Case II]
f (aE)
Case IV. g(x) = xm (m is zero or positive integer)
1 1
Then, P.I. = xm = xm .
f (E) f (1 + ∆)
Now, this expression is evaluated by expanding [f (1 + ∆)]−1 as an infinite series of
∆ and applying different differences on xm .
60 Numerical Analysis

Case V. g(x) = sin ax or cos ax.


1 1 iax
(i) P.I. = sin ax = Imaginary part of e .
f (E) f (E)
1 1 iax
(ii) P.I. = cos ax = Real part of e .
f (E) f (E)
Example 2.9.5 Solve the equation ux+2 − 5ux+1 + 6ux = 5x + 2x .

Solution. The given equation is (E 2 − 5E + 6)ux = 5x + 2x .


Let ux = cmx be a solution of (E 2 −5E +6)ux = 0. Therefore, A.E. is m2 −5m+6 = 0
or, m = 2, 3.
Hence C.F. is c1 2x + c2 3x .
1 1 5x
P.I. of 5x = 2 5x = 5x = .
E − 5E + 6 25 − 25 + 6 6
1 1
P.I. of 2x = 2 2x = 2x 1
E − 5E + 6 (2E)2 − 5(2E) + 6
1 1
= 2x 2 1 = 2x 1
4E − 10E + 6 4(1 + ∆) − 10(1 + ∆) + 6
2

1 1
= 2x 2 1 = 2x (1 − 2∆)−1 1
4∆ − 2∆ −2∆
1
= 2x 1 = −2x−1 x(1) = −2x−1 .x
−2∆
5x
Therefore, the general solution is ux = c1 2x + c2 3x + − 2x−1 .x, where c1 , c2 are
6
arbitrary constants.

Example 2.9.6 Solve ux+2 − 4ux+1 + 3ux = 2x .x(3) .

Solution. The equation can be written as (E 2 − 4E + 3)ux = 2x x(3) . Let ux = cmx


be a solution. The A.E. is m2 − 4m + 3 = 0 or, m = 1, 3.
Therefore, C.F. is c1 1x + c2 3x = c1 + c2 3x .
1 1
P.I. = 2x x(3) = 2x x(3)
E2− 4E + 3 (2E) − 4(2E) + 3
2

1
= 2x x(3)
4(1 + ∆) − 8(1 + ∆) + 3
2

1
= 2x 2 x(3) = −2x (1 − 4∆2 )−1 x(3)
4∆ − 1
= −2x (1 + 4∆2 + 16∆4 + · · · )x(3)
= −2x [x(3) + 4.3.2x(1) ] = −2x [x(3) + 24x(1) ].
Hence, the general solution is ux = c1 + c2 3x − 2x [x(3) + 24x(1) ].
Calculus of Finite Diff. and Diff. Equs 61

Example 2.9.7 Solve ux − ux−1 + 2ux−2 = x2 + 5x .

Solution. The given equation can be written as


(E 2 − E + 2)ux−2 = x2 + 5x .
Let ux−2 = cm be a solution of (E 2 − E + √
x 2)ux−2 = 0.
1 ± i 7
Then A.E. is m2 − m + 2 = 0 or, m = .
2 √
1 7
Here the roots are complex. Let = r cos θ and = r sin θ.
√ √ 2 2
Therefore, r = 2, tan θ = 7.
√ √
The C.F. is ( 2)x [c1 cos θx + c2 sin θx] where tan θ = 7.
1
P.I. = (x2 + 5x )
E2 −E+2
1 1
= {x(2) + x(1) } + 5x
(1 + ∆) − (1 + ∆) + 2
2 25 − 5 + 2
1 5x
= {x(2)
+ x (1)
} +
∆2 + ∆ + 2 22
 2
−1
1 ∆ +∆ 5x
= 1+ {x(2) + x(1) } +
2 2 22
1 2
∆ +∆  2
∆ +∆  2 5x
= 1− + − · · · {x(2) + x(1) } +
2 2 2 22
1 (2) 1 1 5 x
= x + x(1) − (2 + 2x(1) + 1) + (2) +
2 2 4 22
1 (2) 5x 1 2 5x
= [x − 1] + = (x − x − 1) + .
2 22 2 22
Therefore, the general solution is
√ 1 5x
ux−2 = ( 2)x [c1 cos θx + c2 sin θx] + (x2 − x − 1) + ,
√ 2 22
where θ = 7 and c1 , c2 are arbitrary constants.

πx
Example 2.9.8 Show that the solution of the equation ux+2 + ux = 2 sin
 πx  πx
2
is given by ux = a cos + ε − x sin .
2 2
Solution. Let ux = cmx be a solution of ux+2 + ux = 0.
Then A.E. is m2 + 1 = 0 or, m = ±i.
Therefore, C.F. is A(i)x + B(−i)x .
Substituting 0 = r cos θ, 1 = r sin θ, where r = 1, θ = π/2.
62 Numerical Analysis

Then C.F. reduces to

A{r(cos θ + i sin θ)}x + B{r(cos θ − i sin θ)}x


 πx πx   πx πx 
= A cos + i sin + B cos − i sin
2 2 2 2
πx πx
= (A + B) cos − (B − A)i sin
2 2
πx πx
= a cos ε cos − a sin ε sin , where A + B = a cos ε, (B − A)i = a sin ε
 2  2
πx
= a cos +ε .
2
1 πx 1
P.I. = 2 sin = Imaginary part of 2 2eiπx/2
E2
+1 2 E +1
1
= I.P. of 2eiπx/2 iπ/2 2 1 [since (eiπ/2 )2 + 1 = 0]
(e E) + 1
1 1
= I.P. of 2eiπx/2 iπ 2 1 = I.P. of 2eiπx/2 1
e E +1 1 − (1 + ∆)2
 
iπx/2 1 ∆
= I.P. of 2e 1 − + ··· 1
−2∆ 2
1 1
= I.P. of 2eiπx/2 1 = I.P. of 2eiπx/2 x
 −2∆ πx πx  −2
= I.P. of (−x) cos + i sin
2 2
πx
= −x sin .
2
 πx  πx
Therefore, the general solution is ux = a cos + ε − x sin .
2 2

Example 2.9.9 Solve the following difference equation


yn+2 − 4yn+1 + 4yn = n2 + 3n .

Solution. Let yn = cmn be a trial solution of (E 2 − 4E + 4)yn = 0.


Then A.E. is m2 − 4m + 4 = 0 or, m = 2, 2.
Therefore, C.F. is (c1 + c2 n)2n .
1 1
P.I. of 3n = 3n = 3n = 3n .
E2 − 4E + 4 (E − 2)2
To find the P.I. of n2 , let yn = an2 + bn + c be the solution of yn+2 − 4yn+1 + 4yn = n2 .
Then {a(n + 2)2 + b(n + 2) + c} − 4{a(n + 1)2 + b(n + 1) + c}
+4{an2 + bn + c} = n2 .
Calculus of Finite Diff. and Diff. Equs 63

That is, an2 + (b − 4a)n + (c − 2b) = n2 . Equating both sides we have


a = 1, b − 4a = 0, c − 2b = 0, i.e., a = 1, b = 4, c = 8.
Hence, the P.I. of n2 is n2 + 4n + 8.
Therefore, the general solution is

yn = (c1 + c2 n)2n + 3n + n2 + 4n + 8.

Example 2.9.10 Solve the difference equation ∆f (n) − 4f (n) = 3,


n ≥ 2 and f (1) = 2.

Solution. The equation can be written using E operator as

(E − 1)f (n) − 4f (n) = 3 or, (E − 5)f (n) = 3.

Let f (n) = cmn be a trial solution.


The A.E. is m − 5 = 0. Therefore, C.F. is c5n .
To find P.I. let, f (n) = a be the solution of the given equation.
Then (E − 5)a = 3 or, a − 5a = 3 or, a = −3/4.
Hence, the general solution is
3
f (n) = c5n − .
4
But, given f (1) = 2. Therefore, 2 = 5c − 3/4 or c = 11/20.
Hence the particular solution is
11 n−1 3
f (n) = 5 − .
4 4

2.9.3 Generating function


Generating function is used to solve different kind of problems including combinatorial
problems. This function may also be used to solve difference equation.

Def. 2.9.1 Let {a0 , a1 , a2 , . . .} be a sequence of real numbers. The power series


G(x) = an xn (2.65)
n=0

is called the generating function for the sequence {a0 , a1 , a2 , . . .}.

In other words, an , the nth term of the sequence {an } is the coefficient of xn in the
expansion of G(x). That is, if the generating function of a sequence is known, then one
can determine all the terms of the sequence.
64 Numerical Analysis

Example 2.9.11 Use generating function to solve the difference equation


un = 2un−1 + 3, n ≥ 1, u0 = 2.

Solution. From the definition of the generating function,



 ∞
 ∞

G(x) = un xn = u0 + un xn = 2 + un xn .
n=0 n=1 n=1



That is, G(x) − 2 = un xn .
n=1
Now, un = 2un−1 + 3, n ≥ 1. Multiplying both sides by xn , we obtain
un xn = 2un−1 xn + 3xn .
Taking summation for all n = 1, 2, . . ., we have

 ∞
 ∞

un xn = 2 un−1 xn + 3 xn .
n=1 n=1 n=1

That is, ∞ ∞
 
G(x) − 2 = 2x un−1 xn−1 + 3 xn
n=1 n=1
∞ 
∞ 
= 2x un xn + 3 xn − 1
n=0 n=0
 1   ∞
 1 
= 2x G(x) + 3 −1 since xn =
1−x 1−x
n=0

3
Thus, (1 − 2x)G(x) = − 1.
1−x
Therefore, the generating function for this difference equation or for the sequence
{un } is
3 1
G(x) = −
(1 − x)(1 − 2x) 1 − 2x
5 3
= − = 5(1 − 2x)−1 − 3(1 − x)−1
1 − 2x 1 − x
∞ ∞ ∞
=5 (2x) − 3
n n
x = (5.2n − 3)xn .
n=0 n=0 n=0

The coefficient of xn in the expansion of G(x) is 5.2n − 3 and hence un = 5.2n − 3 is


the required solution.
Calculus of Finite Diff. and Diff. Equs 65

Example 2.9.12 Using generating function solve the difference equation an −


5an−1 + 6an−2 = 0, n ≥ 2 with initial conditions a0 = 2 and a1 = 3.

Solution. Let G(x) be the generating function of the sequence {an }. Then
∞ ∞

n
G(x) = a0 + a1 x + an x = 2 + 3x + an xn .
∞ n=2 n=2

That is, an x = G(x) − 2 − 3x.
n

n=2
Multiplying given equation by xn ,
an xn − 5an−1 xn + 6an−2 xn = 0.
Taking summation for n = 2, 3, . . . ,

 ∞
 ∞

an xn − 5 an−1 xn + 6 an−2 xn = 0
n=2 n=2 n=2

 ∞

or, G(x) − 2 − 3x − 5x an−1 xn−1 + 6x2 an−2 xn−2 = 0
n=2 n=2

∞ 
or, G(x) − 2 − 3x − 5x an xn − a0 + 6x2 G(x) = 0
n=0
or, G(x) − 2 − 3x − 5x[G(x) − 2] + 6x2 G(x) = 0.
2 − 7x
Therefore, G(x) = . This is the generating function for the given differ-
1 − 5x + 6x2
ence equation.
2 − 7x A B (A + B) − (3A + 2B)x
Let = + = .
1 − 5x + 6x2 1 − 2x 1 − 3x (1 − 2x)(1 − 3x)
The unknown A and B are related by the equations
A + B = 2 and 3A + 2B = 7,
3 1
whose solution is A = 3, B = −1. Thus, G(x) = − .
1 − 2x 1 − 3x
Now,
G(x) = 3(1 − 2x)−1 − (1 − 3x)−1
∞ ∞
 ∞
=3 (2x)n − (3x)n = (3.2n − 3n )xn .
n=0 n=0 n=0

Hence an = coefficient of xn in the expansion of G(x) = 3.2n − 3n , which is the


required solution.
66 Numerical Analysis

2.10 Exercise

1. Define the operators: forward difference (∆), backward difference (∇), shift (E),
central difference (δ) and average (µ).

2. Prove the following relations among the operators


 
δ2 2 δ
(i) 1 + δ µ ≡ 1 +
2 2
, (ii) E 1/2 ≡ µ +
2 2
∆+∇ ∆E −1 ∆
(iii) µδ ≡ , (iv) µδ ≡ + ,
2 2 2
δ 2 
(v) ∆ ≡ + δ 1 + (δ 2 /4), (vi) hD ≡ sinh−1 (µδ),
2
−1
(vii) hD ≡ log(1 +  ∆) ≡ − log(1 − ∇) ≡ sinh (µδ), (viii) E ≡ e ,
hD


(ix) µ ≡ 1 + (1 + ∆)−1/2 , (x) µ ≡ cosh(hD/2),
2
δ2 δ2 δ2 δ2
(xi) ∇ ≡ − + δ 1 + , (xii) E ≡ 1 + +δ 1+ ,
2 4 2 4
(xiii) E∇ ≡ ∆ ≡ δE 1/2 , (xiv) ∇ ≡ E −1 ∆.

3. Show that
(i) ∆i yk = ∇i yk+i = δ i yk+i/2 , (ii) ∆(yi2 ) = (yi + yi+1 )∆yi ,
1 ∆yi
(iii) ∆∇yi = ∇∆yi = δ 2 yi , (iv) ∆ =− .
yi yi yi+1
4. Prove the following
 n
(−1)i n!f (x + (n − i)h))
(i) ∆n f (x) =
i!(n − i)!
i=0
 (−1)i n!f (x − ih)
n
(ii) ∇n f (x) =
i!(n − i)!
i=0

2n
(−1)i (2n)!f (x + (n − i)h))
(iii) δ 2n f (x) = .
i!(2n − i)!
i=0

5. Prove the following


∆2 ∆3
(i) hD ≡ ∆ − + − ···
2 3
∆4 ∆5
(ii) h2 D2 ≡ ∆2 − ∆3 + 11 −5 + ···
12 6
∆6 ∆7
(iii) h4 D4 ≡ ∆4 − 2∆5 + 17 −7 + · · ·.
6 2
6. Prove that
(i) ∆n (eax+b ) = (eah − 1)n eax+b
Calculus of Finite Diff. and Diff. Equs 67

 
x ∆2 x Eex
(ii) e = e 2 x
E ∆ e
∆f (x)
(iii) ∆ log f (x) = log 1 +
 2 f (x)

(iv) x3 = 6x.
E
7. Prove that, if the spacing h is very small then the forward difference operator is
almost equal to differential operator, i.e., for small h, ∆n f (x) ≈ hn Dn f (x).

8. Show that the operators δ, µ, E, ∆ and ∇ are commute with one another.

9. Express ∆3 yi and ∇4 y4 in terms of y.

10. Prove the following relations:


(i) ux = ux−1 + ∆ux−2 + ∆2 ux−3 + · · · + ∆n−1 ux−n + ∆n ux−n−1
(ii) u1 + u2 + u3 + · · · + un =n C1 u0 +n C2 ∆u0 +n C3 ∆2 u0 + · · · + ∆n−1 u0
(iii) ∆n yx = yn+x −n C1 yx+n−1 +n C2 yx+n−2 − · · · + (−1)n yx
(iv) u1 x + u2 x2 + u3 x3 + · · ·
x x2 x3
= u1 + ∆u 1 + ∆2 u1 + · · ·.
1−x (1 − x)2 (1 − x)3
11. Show that
(i) ∆[f (x)g(x)] = f (x)∆g(x) + g(x + h)∆f (x)
(ii) ∆n f (x) = ∇n f (x + nh), where n is a positive integer
(iii) ∆∇f (x) = ∆f (x) − ∇f (x)
(iv) ∆f (x) + ∇f (x) = (∆/∇)f (x) − (∇/∆)f (x).

12. The nth difference ∆n be defined as ∆n f (x) = ∆n−1 f (x+h)−∆n−1 f (x), (n ≥ 1).
If f (x) is a polynomial of degree n show that ∆f (x) is a polynomial of degree n−1.
Hence deduce that the nth difference of f (x) is a constant.

13. If φr (x) = (x − x0 )(x − x1 ) · · · (x − xr ) where xr = x0 + rh, r = 0, 1, 2, . . . , n,


calculate ∆k φr (x).

14. If fi is the value of f (x) at xi where xi = x0 + ih, i = 1, 2, . . . prove that

i  
 i
fi = E i f0 = ∆j f0 .
j
j=0

15. For equally spaced points x0 , x1 , . . . , xn , where xk = x0 + kh, (h > 0, k =


0, 1, . . . , n) express ∆k y0 in terms of the ordinates.
68 Numerical Analysis

16. Taking h = 1, compute the second, third and fourth differences of f (x) = 3x4 −
2x2 + 5x − 1.
17. Construct the forward difference table for the following tabulated values of f (x)
and hence find the values of ∆2 f (3), ∆3 f (2), ∆4 f (0).

x : 0 1 2 3 4 5 6
f (x) : 4 7 10 20 45 57 70

18. Use finite difference method to find a polynomial which takes the following values:

x : –2 –1 0 1 2
f (x) : –12 –6 0 6 10

19. Compute the missing term in the following table.

x : 0 2 4 6 8
f (x) : 12 6 0 ? –25

20. Use finite difference method to find the value of f (2.2) from the following data.

x : 1 2 3 4 5
f (x) : 3 24 99 288 675

21. Find the functions, whose first differences are


(i) 3x2 + 9x + 2, (ii) x4 − 3x3 + x2 − 11x + 20.
22. Use iteration method to solve the following difference equations
(i) an = an−1 + 4, for all n ≥ 2 with a1 = 2.
(ii) un = un−1 + (n − 1), n ≥ 2 and a1 = 0.
(iii) xn = 5xn−1 + 3 for n ≥ 2 and x1 = 2.
23. Find the first five terms of the sequence defined by the following recurrence rela-
tions:
(i) xn = x2n−1 for n ≥ 2 and x1 = 1.
(ii) xn = nxn−1 + n2 xn−2 for n ≥ 2 and x0 = 1, x1 = 1.
(iii) Let x1 = 1 and for n ≥ 2, xn = x1 xn−1 + x2 xn−2 + · · · + xn−1 x1 .
(The numbers of this sequence are called Catalan numbers).
24. Mr. Das deposits Rs. 1,000 in a bank account yielding 5% compound interest
yearly.
(i) Find a recurrence relation for the amount in the account after n years.
(ii) Find an explicit formula for the amount in the account after n years.
(iii) How much will be in the account after 10 years ?
Calculus of Finite Diff. and Diff. Equs 69

25. Solve the following difference equations.


(i) un − 5un−1 + 6un−2 = n2 + 7n + 3n
(ii) ux+2 − 7ux−1 + 10ux = 12e3x + 4x
(iii) un+2 − 4un+1 + 4un = n for n ≥ 1 and u1 = 1, u2 = 4
(iv) un − 5un−1 + 6un−2 = n2 + 5n + 2n
(v) 6un+2 − 7un+1 − 20un = 3n2 − 2n + 8
(vi) f (n + 2) − 8f (n + 1) + 25f (n) = 2n2 + n + 1
(vii) ux − ux−1 + 2ux−2 = x + 2x
(viii) yn+2 − 4yn+1 + 4yn = n + 3n
(ix) un − 5un−1 + 6un−2 = n2
(x) Sn+1 = Sn + Sn−1 , n ≥ 3 and S1 = 1, S2 = 1. Find S8 .

26. Assuming un = an + b, show that the particular solution of


1
un − 5un−1 + 6un−2 = n is (2n + 7).
4
27. Show that the general solution of ux − ux−1 − ux−2 = x2
1 √ √
is x [A(1 + 5)x + B(1 − 5)x ] − (x2 + 6x + 13).
2
28. Show that 2
 the solution of ux+2
 + a 2ux = cos ax is
πx πx a cos ax + cos a(2 − x)
ux = ax A cos + B sin + .
2 2 1 + 2a2 cos 2a + a4
29. If un satisfies the difference equation un = 11un−1 +8un−2 , n ≥ 2 where u0 = 1 and
   
1 11 + a n+1 11 − a n+1
u1 = 11 then show that un is given by un = −
√ a 2 2
where a = 3 17.

30. The seeds of a certain plant when one year old produce eighteen fold. A seed is
planted and every seed subsequently produced is planted as soon as it is produced.
Prove that the number of grain at the end of nth year is
 n+1  n+1
1 11 + a 11 − a
un = −
a 2 2

where a = 3 17.

31. The first term of a sequence {un } is 1, the second is 4 and every other term is the
arithmetic mean of the two preceding terms. Find un and show that un tends to
a definite limit as n → ∞.

32. The first term of a sequence is 1, the second is 2 and every term is the sum of the
two proceeding terms. Find the nth term.
70 Numerical Analysis

33. If ur satisfies the difference equation ur − 4ur−1 + ur−2 = 0, 2 ≤ r ≤ n, where


√ A sinh(n − r)α
un = 0 and u0 = A, show that, if α = log(2 + 3) then ur = .
sinh nα
34. Show that the general solution of the national income equation
1 1
yn+2 − yn+1 − yn = nh + A where h, A are constants, is given by
2n 4
yn = c1 m1 + c2 mn2 + 4hn + 4(A − 6h) where c1 , c2 are arbitrary constants and the
values of m1 and m2 you are to actually find out. Also show that yn /n tends to
finite limit as n → ∞.

35. Use generating functions to solve the following difference equations.


(i) xn = 3xn−1 , n ≥ 1 and x0 = 2.
(ii) xn = 5xn−1 + 2n , n ≥ 1 and x0 = 2.
(iii) xn = xn−1 + n for n ≥ 1 and x0 = 1.
(iv) un − un−1 − un−2 = 0 for n ≥ 2 and u0 = 0, u1 = 1.
(v) an + an−2 = 2an−1 , n ≥ 2 and a0 = 0, a1 = 1.
Chapter 3

Interpolation

Sometimes we have to compute the value of the dependent variable for a given inde-
pendent variable, but the explicit relation between them is not known. For example,
the Indian population are known to us for the years 1951, 1961, 1971, 1981, 1991 and
2001. There is no exact mathematical expression available which will give the popula-
tion for any given year. So one can not determine the population of India in the year
2000 analytically. But, using interpolation one can determine the population (obviously
approximate) for any year.
The general interpolation problem is stated below:
Let y = f (x) be a function whose analytic expression is not known, but a table of
values of y is known only at a set of values x0 , x1 , x2 , . . ., xn of x. There is no other
information available about the function f (x). That is,
f (xi ) = yi , i = 0, 1, . . . , n. (3.1)
The problem of interpolation is to find the value of y(= f (x)) for an argument, say,
x . The value of y at x is not available in the table.
A large number of different techniques are used to determine the value of y at x = x .
But one common step is “find an approximate function, say, ψ(x), corresponding to
the given function f (x) depending on the tabulated value.” The approximate function
should be simple and easy to handle. The function ψ(x) may be a polynomial, exponen-
tial, geometric function, Taylor’s series, Fourier series, etc. When the function ψ(x) is
a polynomial, then the corresponding interpolation is called polynomial interpolation.
The polynomial interpolation is widely used interpolation technique, because, polyno-
mials are continuous and can be differentiated and integrated term by term within any
range.
A polynomial φ(x) is called interpolating polynomial if yi = f (xi ) = φ(xi ), i =
dk f  dk φ 
0, 1, 2, . . . , n and = for some finite k, and x is one of the values of x0 ,
dxk x dxk x

71
72 Numerical Analysis

y
y = f (x) + ε
6

y = φ(x)
y=f (x) :

]
z y = f (x) − ε

-x
x0 x1 x2 xn

Figure 3.1: Interpolation of a function.

x1 , . . ., xn .
The following theorem justifies the approximation of an unknown function f (x) to a
polynomial φ(x).

Theorem 3.1 If the function f (x) is continuous in [a, b], then for any pre-assigned
positive number ε > 0, there exists a polynomial φ(x) such that
|f (x) − φ(x)| < ε for all x ∈ (a, b).

This theorem ensures that the interpolating polynomial φ(x) is bounded by y =


f (x) − ε and y = f (x) + ε, for a given ε. This is shown in Figure 3.1.
Depending on the tabulated points, several interpolation methods are developed.
Among them finite-difference interpolating formulae, Lagrange’s interpolation are widely
used polynomial interpolation methods.
For the sake of convenience, a polynomial of degree n, means a polynomial of degree
not higher than n.

3.1 Lagrange’s Interpolation Polynomial

Let y = f (x) be a real valued function defined on an interval [a, b]. Let x0 , x1 , . . . , xn be
n + 1 distinct points in the interval [a, b] and y0 , y1 , . . . , yn be the corresponding values
of y at these points, i.e., yi = f (xi ), i = 0, 1, . . . , n, are given.
Interpolation 73

Now, we construct an algebraic polynomial φ(x) of degree less than or equal to n


which attains the assigned values at the points xi , that is,

φ(xi ) = yi , i = 0, 1, . . . , n. (3.2)

The polynomial φ(x) is called the interpolation polynomial and the points xi ,
i = 0, 1, . . . , n are called interpolation points.
Let the polynomial φ(x) be of the form

n
φ(x) = Li (x) yi , (3.3)
i=0

where each Li (x) is polynomial in x, of degree less than or equal to n, called the
Lagrangian function.
The polynomial φ(x) satisfies the equation (3.2) if
0, for i = j
Li (xj ) =
1, for i = j.
That is, the polynomial Li (x) vanishes only at the points x0 , x1 , . . . , xi−1 , xi+1 , . . . , xn .
So it should be of the form

Li (x) = ai (x − x0 )(x − x1 ) · · · (x − xi−1 )(x − xi+1 ) · · · (x − xn ),

where ai , a constant whose value is determined by using the relation

Li (xi ) = 1.

Then ai (xi − x0 )(xi − x1 ) · · · (xi − xi−1 )(xi − xi+1 ) · · · (xi − xn ) = 1.


or, ai = 1/{(xi − x0 )(xi − x1 ) · · · (xi − xi−1 )(xi − xi+1 ) · · · (xi − xn )}.
Therefore,
(x − x0 )(x − x1 ) · · · (x − xi−1 )(x − xi+1 ) · · · (x − xn )
Li (x) = . (3.4)
(xi − x0 )(xi − x1 ) · · · (xi − xi−1 )(xi − xi+1 ) · · · (xi − xn )
Thus, the required Lagrange’s interpolation polynomial φ(x) is

n
φ(x) = Li (x) yi ,
i=0

where Li (x) is given in (3.4).


The polynomial Li (x) can be written as
n  
x − xj
Li (x) = .
j=0
xi − xj
j=i
74 Numerical Analysis

In this notation, the polynomial φ(x) is



n n  
x − xj
φ(x) = yi .
j=0
xi − xj
i=0
j=i

The function Li (x) can also be expressed in another form as follows.


Let
w(x) = (x − x0 )(x − x1 ) · · · (x − xn ) (3.5)
be a polynomial of degree n + 1 and vanishes at x = x0 , x1 , . . . , xn .
Now, the derivative of w(x) with respect to x is given by
w (x) = (x − x1 )(x − x2 ) · · · (x − xn ) + (x − x0 )(x − x2 ) · · · (x − xn )
+ · · · + (x − x0 )(x − x1 ) · · · (x − xi−1 )(x − xi+1 ) · · · (x − xn )
+ · · · + (x − x0 )(x − x1 ) · · · (x − xn−1 ).
Therefore, w (xi ) = (xi − x0 )(xi − x1 ) · · · (xi − xi−1 )(xi − xi+1 ) · · · (xi − xn ), which is
the denominator of Li (x).
Using w(x), Li (x) becomes
w(x)
Li (x) = .
(x − xi )w (xi )
So the Lagrange’s interpolation polynomial in terms of w(x) is

n
w(x)
φ(x) = yi . (3.6)
(x − xi )w (xi )
i=0

Example 3.1.1 Obtain Lagrange’s interpolating polynomial for f (x) and find an
approximate value of the function f (x) at x = 0, given that f (−2) = −5, f (−1) = −1
and f (1) = 1.

Solution. Here x0 = −2, x1 = −1, x2 = 1 and f (x0 ) = −5, f (x1 ) = −1, f (x2 ) = 1.

2
Then f (x)  Li (x)f (xi ).
i=0
Now,
(x − x1 )(x − x2 ) (x + 1)(x − 1) x2 − 1
L0 (x) = = = .
(x0 − x1 )(x0 − x2 ) (−2 + 1)(−2 − 1) 3
(x − x0 )(x − x2 ) (x + 2)(x − 1) x2 + x − 2
L1 (x) = = = .
(x1 − x0 )(x1 − x2 ) (−1 + 2)(−1 − 1) −2
(x − x0 )(x − x1 ) (x + 2)(x + 1) x2 + 3x + 2
L2 (x) = = = .
(x2 − x0 )(x2 − x1 ) (1 + 2)(1 + 1) 6
Interpolation 75

Therefore,

x2 − 1 x2 + x − 2 x2 + 3x + 2
f (x)  × (−5) + × (−1) + ×1
3 −2 6
= 1 + x − x2 .

Thus, f (0) = 1.

The Lagrangian coefficients can be computed from the following scheme. The differ-
ences are computed, row-wise, as shown below:
x − x0 ∗ x0 − x1 x0 − x2 ··· x0 − xn
x1 − x0 x − x1 ∗ x1 − x2 ··· x1 − xn
x2 − x0 x2 − x1 x − x2 ∗ ··· x2 − xn
··· ··· ··· ··· ···
xn − x0 xn − x1 x − x2 ··· x − xn ∗
From this table, it is observed that the product of diagonal elements is w(x). The
product of the elements of first row is (x − x0 )w (x0 ), product of elements of second row
is (x − x1 )w (x1 ) and so on. Then the Lagrangian coefficient can be computed using
the formula
w(x)
Li (x) = .
(x − xi )w (xi )

Linear Lagrangian Interpolation


Let x0 and x1 be two points and y0 and y1 be the corresponding values of y. In this
case,

φ(x) = L0 (x)y0 + L1 (x)y1


x − x1 x − x0 x − x0
= y0 + y1 = y0 + (y1 − y0 ). (3.7)
x0 − x1 x1 − x0 x1 − x0
This polynomial is known as linear interpolation polynomial.

3.1.1 Lagrangian interpolation formula for equally spaced points


For the equally spaced points, xi = x0 + ih, i = 0, 1, 2 . . . , n, where h is the spacing.
Now, a new variable s is introduced, defined by x = x0 + sh.
Then x − xi = (s − i)h and xi − xj = (i − j)h.
Therefore,

w(x) = (x − x0 )(x − x1 ) · · · (x − xn ) = sh(s − 1)h(s − 2)h · · · (s − n)h


= hn+1 s(s − 1)(s − 2) · · · (s − n).
76 Numerical Analysis

Also,

w (xi ) = (xi − x0 )(xi − x1 ) · · · (xi − xi−1 )(xi − xi+1 )(xi − xi+2 ) · · · (xi − xn )
= (ih)(i − 1)h · · · (i − i − 1)h(i − i + 1)h(i − i + 2)h · · · (i − n)h
= hn i(i − 1) · · · 1 · (−1)(−2) · · · ({−(n − i)}
= hn i!(−1)n−i (n − i)!.

Using these values, the relation (3.5) becomes


n
hn+1 s(s − 1)(s − 2) · · · (s − n)
φ(x) = yi
(−1)n−i hn i!(n − i)!(s − i)h
i=0
n
s(s − 1)(s − 2) · · · (s − n)
= (−1)n−i yi , (3.8)
i!(n − i)!(s − i)
i=0
where x = x0 + sh.

For given tabulated values, the Lagrange’s interpolation polynomial exists and unique.
These are proved in the following theorem.

Theorem 3.2 The Lagrange’s interpolation polynomial exists and unique.

Proof. The Lagrange’s interpolation formula satisfied the condition

yi = φ(xi ), i = 0, 1, . . . , n. (3.9)

For n = 1,
x − x1 x − x0
φ(x) = y0 + y1 . (3.10)
x0 − x1 x1 − x0

For n = 2,

(x − x1 )(x − x2 ) (x − x0 )(x − x2 )
φ(x) = y0 + y1
(x0 − x1 )(x0 − x2 ) (x1 − x0 )(x1 − x2 )
(x − x0 )(x − x1 )
+ y2 . (3.11)
(x2 − x0 )(x2 − x1 )

In general, for any positive integer n,


n
φ(x) = Li (x)yi , (3.12)
i=0
Interpolation 77

where

(x − x0 ) · · · (x − xi−1 )(x − xi+1 ) · · · (x − xn )


Li (x) = ,
(xi − x0 ) · · · (xi − xi−1 )(xi − xi+1 ) · · · (xi − xn )
i = 0, 1, . . . , n. (3.13)

Expression (3.10) is a linear function, i.e., a polynomial of degree one and also,
φ(x0 ) = y0 and φ(x1 ) = y1 .
Also, expression (3.11) is a second degree polynomial and φ(x0 ) = y0 , φ(x1 ) =
y1 , φ(x2 ) = y2 , i.e., satisfy (3.9). Thus, the condition (3.13) for n = 1, 2 is fulfilled.
The functions (3.13) expressed in the form of a fraction whose numerator is a poly-
nomial of degree n and whose denominator is a non-zero number. Also, Li (xi ) = 1 and
Li (xj ) = 0 for j = i, j = 0, 1, . . . , n. That is, φ(xi ) = yi . Thus, the conditions of (3.9)
are satisfied. Hence, the Lagrange’s polynomial exists.

Uniqueness of the polynomial

Let φ(x) be a polynomial of degree n, where

φ(xi ) = yi , i = 0, 1, . . . , n. (3.14)

Also, let φ∗ (x) be another polynomials of degree n satisfying the conditions

φ∗ (xi ) = yi , i = 0, 1, . . . , n. (3.15)

Then from (3.14) and (3.15),

φ∗ (xi ) − φ(xi ) = 0, i = 0, 1, . . . , n. (3.16)

If φ∗ (x) − φ(x) = 0, then this difference is a polynomial of degree at most n and it has
at most n zeros, which contradicts (3.16), whose number of zeros is n + 1. Consequently,
φ∗ (x) = φ(x). Thus φ(x) is unique.

3.2 Properties of Lagrangian Functions

1. The Lagrangian functions depend only on xi ’s and independent of yi ’s or f (xi )’s.


2. The form of Lagrangian functions remain unchanged (invariant) under linear trans-
formation.
Proof. Let x = az + b, where a, b are arbitrary constants.
Then xj = azj + b and x − xj = a(z − zj ). Also, xi − xj = a(zi − zj ) (i = j).
78 Numerical Analysis

Therefore,

w(x) = (x − x0 )(x − x1 ) · · · (x − xn )
= an+1 (z − z0 )(z − z1 ) · · · (z − zn ).
w (xi ) = (xi − x0 )(xi − x1 ) · · · (xi − xi−1 )(xi − xi+1 ) · · · (xi − xn )
= an (zi − z0 )(zi − z1 ) · · · (zi − zi−1 )(zi − zi+1 ) · · · (zi − zn ).

Thus,

w(x)
Li (x) =
(x − xi )w (xi )
an+1 (z − z0 )(z − z1 ) · · · (z − zn )
=
a(z − zi )an (zi − z0 )(zi − z1 ) · · · (zi − zi−1 )(zi − zi+1 ) · · · (zi − zn )
w(z)
= = Li (z).
(z − zi )w (zi )

Thus Li (x)’s are invariant.



n
3. Sum of Lagrangian functions is 1, i.e., Li (x) = 1.
i=0
Proof. Sum of Lagrangian functions is


n 
n
w(x)
Li (x) = (3.17)
(x − xi )w (xi )
i=0 i=0

where w(x) = (x − x0 )(x − x1 ) · · · (x − xn ).


Let
1 A0 A1 Ai An
= + + ··· + + ··· + (3.18)
w(x) x − x0 x − x1 x − xi x − xn
i.e., 1 = A0 (x − x1 )(x − x2 ) · · · (x − xn ) + A1 (x − x0 )(x − x2 ) · · · (x − xn )
+ · · · + Ai (x − x0 )(x − x1 ) · · · (x − xi−1 )(x − xi+1 ) · · · (x − xn )
+ · · · + An (x − x0 )(x − x1 ) · · · (x − xn−1 ). (3.19)

When x = x0 is substituted in (3.19) then the value of A0 is given by


1 = A0 (x0 − x1 )(x0 − x2 ) · · · (x0 − xn )
That is,
1 1
A0 = =  .
(x0 − x1 )(x0 − x2 ) · · · (x0 − xn ) w (x0 )
Similarly, x = x1 gives
Interpolation 79

1
A1 = .
w (x1 )
1 1
Also, Ai = and An =  .
w (xi ) w (xn )
Using these results, equation (3.18) becomes

1 1 1
= + + ···
w(x) (x − x0 )w (x0 ) (x − x1 )w (x1 )


1 1
+ 
+ ··· +
(x − xi )w (xi ) (x − xn )w (xn )
n
w(x)
i.e., 1 = .
(x − xi )w (xi )
i=0

Hence the sum of Lagrangian functions is 1, that is,


n 
n
w(x)
Li (x) = = 1. (3.20)
(x − xi )w (xi )
i=0 i=0

3.3 Error in Interpolating Polynomial

It is obvious that if f (x) is approximated by a polynomial φ(x), then there should be


some error at the non-tabular points. The following theorem gives the amount of error
in interpolating polynomial.

Theorem 3.3 Let I be an interval contains all interpolating points x0 , x1 , . . . , xn . If


f (x) is continuous and have continuous derivatives of order n + 1 for all x in I then
the error at any point x is given by

f (n+1) (ξ)
En (x) = (x − x0 )(x − x1 ) · · · (x − xn ) , (3.21)
(n + 1)!

where ξ ∈ I.

Proof. Let the error En (x) = f (x) − φ(x), where φ(x) is a polynomial of degree less
than or equal to n, which approximates the function f (x).
Now, En (xi ) = f (xi ) − φ(xi ) = 0 for i = 0, 1, . . . , n.
By virtue of the above result, it is assumed that En (x) = w(x)k, where w(x) =
(x − x0 )(x − x1 ) . . . (x − xn ).
The error at any point, say, x = t, other than x0 , x1 , . . . , xn is En (t) = w(t)k

or, f (t) − φ(t) = kw(t). (3.22)


80 Numerical Analysis

Let us construct an auxiliary function

F (x) = f (x) − φ(x) − kw(x). (3.23)

The function vanishes at x = x0 , x1 , . . . , xn because f (xi ) = φ(xi ) and w(xi ) = 0.


Also, F (t) = 0, by (3.22).
Hence, F (x) = 0 has n + 2 roots in I. By Roll’s theorem, F  (x) = 0 has n + 1 roots
in I. F  (x) = 0 has n roots in I and finally, F (n+1) (x) = 0 must have at least one root
in I. Let ξ be one such root. Then F (n+1) (ξ) = 0. That is, f (n+1) (ξ) − 0 + k(n + 1)! = 0
[φ(x) is a polynomial of degree n so φ(n+1) (x) = 0 and w(x) is a polynomial of degree
n + 1, so w(n+1) (x) = (n + 1)!]. Thus

f (n+1) (ξ)
k= .
(n + 1)!

Therefore, the error at x = t is

En (t) = kw(t) [by (3.22)]


f (n+1) (ξ)
= (t − x0 )(t − x1 ) · · · (t − xn ) .
(n + 1)!

Hence, the error at any point x is


f (n+1) (ξ)
En (x) = (x − x0 )(x − x1 ) · · · (x − xn ) . 
(n + 1)!

Note 3.3.1 The above expression gives the error at any point x.
But practically it has little utility, because, in many cases f (n+1) (ξ) cannot be deter-
mined.
If Mn+1 be the upper bound of f (n+1) (ξ) in I, i.e., if |f (n+1) (ξ)| ≤ Mn+1 in I then
the upper bound of En (x) is

Mn+1
|En (x)| ≤ |w(x)|. (3.24)
(n + 1)!

Note 3.3.2 (Error for equispaced points)


Let xi = x0 + ih, i = 0, 1, . . . , n and x = x0 + sh then x − xi = (s − i)h.
Then the error is

f (n+1) (ξ)
En (x) = s(s − 1)(s − 2) · · · (s − n)hn+1 . (3.25)
(n + 1)!
Interpolation 81

Note 3.3.3 (Error bounds for equally spaced points, particular cases)
Assume that, f (x) is defined on [a, b] that contains the equally spaced points. Suppose,
f (x) and the derivatives up to n + 1 order are continuous and bounded on the intervals
[x0 , x1 ], [x0 , x2 ] and [x0 , x3 ] respectively. That is, |f (n+1)(ξ) | ≤ Mn+1 for x0 ≤ ξ ≤ xn ,
for n = 1, 2, 3. Then

h2 M2
(i) |E1 (x)| ≤ , x0 ≤ x ≤ x1 (3.26)
8
h3 M3
(ii) |E2 (x)| ≤ √ , x0 ≤ x ≤ x2 (3.27)
9 3
h4 M4
(iii) |E3 (x)| ≤ , x0 ≤ x ≤ x3 . (3.28)
24
Proof. (i) From (3.25),

|f (2) (ξ)|
|E1 (x)| = |s(s − 1)|h2 .
2!
Let g1 (s) = s(s − 1). g1 (s) = 2s − 1. Then s = 1/2, which is the solution of g1 (s) = 0.
The extreme value of g1 (s) is 1/4.
Therefore,
1 M2 h2 M2
|E1 (x)| ≤ h2 = .
4 2! 8
|f (3) (ξ)|
(ii) |E2 (x)| = |s(s − 1)(s − 2)|h3 .
3!
1
Let g2 (s) = s(s − 1)(s − 2). Then g2 (s) = 3s2 − 6s + 2. At g2 (s) = 0, s = 1 ± √ .
3
Again g2 (s) = 6(s − 1) < 0 at s = 1 − √13 .
Therefore, the maximum value of g2 (s) is
 1  1  1  2
1− √ −√ − √ −1 = √ .
3 3 3 3 3
Thus,
2 M3 h3 M3
|E2 (x)| ≤ √ h3 = √ .
3 3 6 9 3
|f (4 (ξ)|
(iii) |E3 (x)| = |s(s − 1)(s − 2)(s − 3)|h4 .
4!

Let g3 (s) = s(s − 1)(s − 2)(s − 3). Then g3 (s) = 4s3 − 18s2 + 22s − 6.
At extrema, g3 (s) = 0, √
i.e., 2s3 − 9s2 + 11s − 3 = 0.
3 3± 5
This gives s = , .
2 2
82 Numerical Analysis

 3 ± √5 
g3 (s) = − 18s + 11. Then
6s2 g3 (3/2)
< 0 and g3 > 0.
√ 2
3± 5 9 3
But, |g3 (s)| = 1 at s = and |g3 (s)| = at x = .
2 16 2
Therefore the maximum value of |g3 (s)| is 1.
Hence,
M4 h4 M4
|E3 (x)| ≤ 1.h4 = . 
24 24

Comparison of accuracy and O(hn+1 )


The equations (3.26), (3.27) and (3.28) give the bounds of errors for linear, quadratic
and cubic interpolation polynomials. In each of these cases the error bound |En (x)|
depends on h in two ways.
Case I. hn+1 is present explicitly in |En (x)| and En (x) is proportional to hn+1 .
Case II. The value of Mn+1 generally depends on the choice of h and tend to |f (n+1) (x0 )|
as h goes to zero.
Therefore, as h tends to zero |En (x)| converges to zero with the same rate that hn+1
converges to zero. Thus, one can say that |En (x)| = O(hn+1 ). In particular,
|E1 (x)| = O(h2 ), |E2 (x)| = O(h3 ), |E3 (x)| = O(h4 ) and so on.
As a consequence, if the derivatives of f (x) are uniformly bounded on the interval
and |h| < 1, then there is a scope to choose n sufficiently large to make hn+1 very small,
and the higher degree polynomial will have less error.
Example 3.3.1 Consider f (x) = cos x over [0, 1.5]. Determine the error bounds
for linear, quadratic and cubic Lagrange’s polynomials.

Solution. |f  (x)| = | sin x|, |f  (x)| = | cos x|, |f  (x)| = | sin x|,
|f iv (x)| = | cos x|.
|f  (x)| ≤ | cos 0| = 1.0, so that M2 = 1.0,
|f  (x)| ≤ | sin 1.5| = 0.997495, so that M3 = 0.997495,
|f iv (x)| ≤ | cos 0| = 1.0, so that M4 = 1.0.
For linear polynomial the spacing h of the points is 1.5 − 0 = 1.5 and its error bound
is
h2 M2 (1.5)2 × 1.0
|E1 (x)| ≤ ≤ = 0.28125.
8 8
For quadratic polynomial the spacing of the points is h = (1.5 − 0)/2 = 0.75 and its
error bound is
h3 M5 (0.75)3 × 0.997495
|E2 (x)| ≤ √ ≤ √ = 0.0269955.
9 3 9 3
Interpolation 83

The spacing for cubic polynomial is h = (1.5 − 0)/3 = 0.5 and thus the error bound
is
h4 M4 (0.5)4 × 1.0
|E3 (x)| ≤ ≤ = 0.0026042.
24 24

Example 3.3.2 A function f (x) defined on the interval (0, 1) is such that f (0) =
0, f (1/2) = −1, f (1) = 0. Find the quadratic polynomial p(x) which agrees with f (x)
for x = 0, 1/2, 1.
 d3 f 
  1
If  3  ≤ 1 for 0 ≤ x ≤ 1, show that |f (x) − p(x)| ≤ for 0 ≤ x ≤ 1.
dx 12
Solution. Given x0 = 0, x1 = 1/2, x2 = 1 and f (0) = 0, f (1/2) = −1, f (1) = 0.
From Lagrange’s interpolating formula, the required quadratic polynomial is

(x − x1 )(x − x2 ) (x − x0 )(x − x2 )
p(x) = f (x0 ) + f (x1 )
(x0 − x1 )(x0 − x2 ) (x1 − x0 )(x1 − x2 )
(x − x0 )(x − x1 )
+ f (x2 )
(x2 − x0 )(x2 − x1 )
(x − 1/2)(x − 1) (x − 0)(x − 1) (x − 0)(x − 1/2)
= ×0+ × (−1) + ×0
(0 − 1/2)(0 − 1) (1/2 − 0)(1/2 − 1) (1 − 0)(1 − 1/2)
= 4x(x − 1).

The error E(x) = f (x) − p(x) is given by



f (ξ)
E(x) = (x − x0 )(x − x1 )(x − x2 )
3!
 f  (ξ) 
 
or, |E(x)| = |x − x0 ||x − x1 ||x − x2 | 
3!
 d3 f 
1  
≤ |x − 0||x − 1/2||x − 1|1. as  3  ≤ 1 in 0 ≤ x ≤ 1 .
3! dx

Now, |x − 0| ≤ 1, |x − 1/2| ≤ 1/2 and |x − 1| ≤ 1 in 0 ≤ x ≤ 1.


1 1 1
Hence, |E(x)| ≤ 1. .1. = .
2 6 12
1
That is, |f (x) − p(x)| ≤ .
12

Example 3.3.3 Determine the step size h (and number of points n) to be used in
the tabulation of f (x) = cos x in the interval [1, 2] so that the quadratic interpolation
will be correct to six decimal places.
84 Numerical Analysis

Solution. The upper bound of error in quadratic polynomial is

h3 M3
|E2 (x)| ≤ √ , M3 = max f  (x).
9 3 1≤x≤2

f (x) = cos x, f  (x) = − sin x, f  (x) = − cos x, f  (x) = sin x.


max |f  (x)| = max | sin x| = 1.
1≤x≤2 1≤x≤2
h3 √
Hence √ × 1 ≤ 5 × 10−6 , i.e., h3 ≤ 45 3 × 10−6 .
9 3
2−1
This gives h ≤ 0.0427 and n = = 23.42  24.
h

Advantage and disadvantage of Lagrangian interpolation


In Lagrangian interpolation, there is no restriction on the spacing and order of the
tabulating points x0 , x1 , . . . , xn . Also, the value of y (the dependent variable) can be
calculated at any point x within the minimum and maximum values of x0 , x1 , . . . , xn .
But its main disadvantage is, if the number of interpolating points decreases or in-
creases then fresh calculation is required, the previous computations are of little help.
This disadvantage is not in Newton’s difference interpolation formulae, which are dis-
cussed in Section 3.5.
Example 3.3.4 Obtain a quadratic polynomial approximation to f (x) = e−x using
Lagrange’s interpolation method, taking three points x = 0, 1/2, 1.

Solution. Here x0 = 0, x1 = 1/2, x2 = 1 and f (x0 ) = 1, f (x1 ) = e−1/2 ,


f (x2 ) = e−1 .
The quadratic polynomial φ(x) is

(x − x1 )(x − x2 ) (x − x0 )(x − x2 )
φ(x) = f (x0 ) + f (x1 )
(x0 − x1 )(x0 − x2 ) (x1 − x0 )(x1 − x2 )
(x − x0 )(x − x1 )
+ f (x2 )
(x2 − x0 )(x2 − x1 )
(x − 1/2)(x − 1) (x − 0)(x − 1) (x − 0)(x − 1/2)
= ×1+ × e−1/2 + × e−1
(0 − 1/2)(0 − 1) (1/2 − 0)(1/2 − 1) (1 − 0)(1 − 1/2)
= (2x − 1)(x − 1) − 4e−1/2 x(x − 1) + e−1 x(2x − 1)
= 2x2 (1 − 2e−1/2 + e−1 ) − x(3 − 4e−1/2 + e−1 ) + 1
= 0.309636243x2 − 0.941756802x + 1.0.

The functions f (x) and φ(x) are shown in Figure 3.2.


Interpolation 85

y
y=f (x) 6

y=φ(x)

- x
0 2

Figure 3.2: The graph of the function f (x) = e−x and the polynomial
φ(x) = 0.309636243x2 − 0.941756802x + 1.

Algorithm 3.1 (Lagrange’s interpolation). This algorithm determines the value


of y at x = x , say, from a given table of points (xi , yi ), i = 0, 1, 2, . . . , n, using
Lagrange’s interpolation method.

Algorithm Lagrange Interpolation


Step 1: Read x, n // n represents the number of points minus one//
// x is the interpolating point//
Step 2: for i = 0 to n do //reading of tabulated values//
Read xi , yi ;
endfor;
Step 3: Set sum = 0;
Step 4: for i = 0 to n do
Step 4.1: Set prod = 1;
Step 4.2: for j = 0 to n do
x − xj
if (i = j) then prod = prod × ;
xi − xj
Step 4.3: Compute sum = sum + yi × prod;
endfor;
Step 5: Print x, sum;
end Lagrange Interpolation

Program 3.1
.
/* Program Lagrange Interpolation
This program implements Lagrange’s interpolation
formula for one dimension; xg is the interpolating points */
86 Numerical Analysis

#include <stdio.h>
#include <math.h>
void main()
{
int n, i, j; float xg, x[20], y[20], sum=0, prod=1;
printf("Enter the value of n and the data
in the form x[i],y[i] ");
scanf("%d",&n);
for(i=0;i<=n;i++) scanf("%f %f",&x[i],&y[i]);
printf("\nEnter the interpolating point x ");
scanf("%f",&xg);
for(i=0;i<=n;i++)
{
prod=1;
for(j=0;j<=n;j++)
{
if(i!=j) prod*=(xg-x[j])/(x[i]-x[j]);
}
sum+=y[i]*prod;
}
printf("\nThe given data is ");
for(i=0;i<=n;i++) printf("\n(%6.4f,%6.4f)",x[i],y[i]);
printf("\nThe value of y at x= %5.2f is %8.5f ", xg, sum);
} /* main */
A sample of input/output:
Enter the value of n and the data in the form x[i],y[i] 4
1 5
1.5 8.2
2 9.2
3.2 11
4.5 16
Enter the interpolating point x 1.75
The given data is
(1.0000,5.0000)
(1.5000,8.2000)
(2.0000,9.2000)
(3.2000,11.0000)
(4.5000,16.0000)
The value of y at x= 1.75 is 8.85925
Interpolation 87

3.4 Finite Differences

Different types of finite differences are introduced in Chapter 2. Some of them are
recapitulated here.
Let a function y = f (x) be known as (xi , yi ) at (n+1) points xi , i = 0, 1, . . . , n, where
xi ’s are equally spaced, i.e., xi = x0 + ih, h is the spacing between any two successive
points xi ’s. That is, yi = f (xi ), i = 0, 1, . . . , n.

3.4.1 Forward differences

The first forward difference of f (x) is defined as

∆f (x) = f (x + h) − f (x),

∆ is called forward difference operator.


Then ∆f (x0 ) = f (x0 + h) − f (x0 ) = f (x1 ) − f (x0 ).
That is, ∆y0 = y1 − y0 using yi = f (xi ).
Similarly, ∆y1 = y2 − y1 , ∆y2 = y3 − y2 , . . ., ∆yn−1 = yn − yn−1 .
The second order differences are

∆2 y0 = ∆(∆y0 ) = ∆(y1 − y0 )
= ∆y1 − ∆y0 = (y2 − y1 ) − (y1 − y0 ) = y2 − 2y1 + y0 .

Similarly, ∆2 y1 = y3 − 2y2 + y1 , etc.


The third order differences

∆3 y0 = ∆(∆2 y0 ) = ∆(y2 − 2y1 + y0 ) = y3 − 3y2 + 3y1 − y0 ,


∆3 y1 = y4 − 3y3 + 3y2 − y1 , etc.

In general,
∆k y0 = yk − k C1 yk−1 + k C2 yk−2 − · · · − (−1)k y0 (3.29)
∆ yi = yk+i − C1 yk+i−1 + C2 yk+i−2 − · · · − (−1) yi .
k k k k
(3.30)

It is observed that difference of any order can easily be expressed in terms of the
ordinates yi ’s with binomial coefficients.
All orders forward differences can be written in a tabular form shown in Table 3.1.
This difference table is called forward difference table or diagonal difference
table.
88 Numerical Analysis

Table 3.1: Forward difference table.

x y ∆y ∆2 y ∆3 y ∆4 y
x0 y0
∆y0
x1 y1 ∆ 2 y0
∆y1 ∆ 3 y0
x2 y2 ∆ 2 y1 ∆ 4 y0
∆y2 ∆3 y 1
x3 y3 ∆ 2 y2
∆y3
x4 y4

3.4.2 Backward differences


The first order backward difference of f (x) is defined as
∇f (x) = f (x) − f (x − h),
where ∇ is the backward difference operator.
Thus, ∇f (x1 ) = f (x1 ) − f (x0 ), or, ∇y1 = y1 − y0 .
Similarly, ∇y2 = y2 − y1 , ∇y3 = y3 − y2 , . . ., ∇yn = yn − yn−1 .
The second order differences are

∇2 y2 = ∇(∇y2 ) = ∇(y2 − y1 ) = y2 − 2y1 + y0 ,


∇2 y3 = y3 − 2y2 + y1 , etc.

The third order differences are

∇3 y3 = y3 − 3y2 + 3y1 − y0 ,
∇3 y4 = y4 − 3y3 + 3y2 − y1 , etc.

In general,
∇k yi = yi − k C1 yi−1 + k C2 yi−2 − · · · − (−1)k yi−k . (3.31)

Table 3.2 shows how the backward differences of all orders can be formed.
The backward difference table is sometimes called horizontal difference table.

3.4.3 Error propagation in a difference table


If there is an error in any entry among the tabulated values of a function, then this
error propagates to other entries of higher order differences. To illustrate the behaviour
of propagation of error, we assume that an error ε is present in the number, say, y3 .
Interpolation 89

Table 3.2: Backward difference table.

x y ∇y ∇2 y ∇3 y ∇4 y
x0 y0
x1 y1 ∇y1
x2 y2 ∇y2 ∇ 2 y2
x3 y3 ∇y3 ∇ 2 y3 ∇ 3 y3
x4 y4 ∇y4 ∇ 2 y4 ∇ 3 y4 ∇ 4 y4

Table 3.3: Error propagation in a finite difference table.

x y ∆y ∆2 y ∆3 y ∆4 y ∆5 y
x0 y0
∆y0
x1 y1 ∆ 2 y0
∆y1 ∆ 3 y0 + ε
x2 y2 ∆2 y 1 +ε ∆4 y0 − 4ε
∆y2 + ε ∆3 y 1 − 3ε ∆5 y0 + 10ε
x3 y3 + ε ∆2 y2 − 2ε ∆4 y1 + 6ε
∆y3 − ε ∆3 y 2 + 3ε ∆5 y1 − 10ε
x4 y4 ∆ 2 y3 + ε ∆4 y2 − 4ε
∆y4 ∆3 y 3 −ε
x5 y5 ∆ 2 y4
∆y5
x6 y6

Table 3.3 shows the propagation of error in a difference table and how the error affects
the differences. From this table, the following observations are noted.

(i) The effect of the error increases with the order of the differences.
(ii) The error is maximum (in magnitude) along the horizontal line through the erro-
neous tabulated value.
(iii) The second difference column has the errors ε, −2ε, ε, in the third difference col-
umn, the errors are ε, −3ε, 3ε, −ε. In the fourth difference column the expected
errors ε, −4ε, 6ε, −4ε, ε (this column is not sufficient to show all of the expected er-
rors). Thus, in the pth difference column, the coefficients of errors are the binomial
coefficients in the expansion of (1 − x)p .
90 Numerical Analysis

(iv) The algebraic sum of errors in any column (complete) is zero. If there is any error
in a single entry of a table, then from the difference table one can detect and
correct such error.

Detection of errors using difference table


Difference table may be used to detect errors in a set of tabular values. From Table
3.3, it follows that if an error is present in a given data, the differences of some order
will become alternating in sign. Thus, higher order differences should be formed till the
error is revealed.
To detect the position of the error in an entry, the following steps may be proceed.

(i) Form the difference table. If at any stage, the differences do not follow a smooth
pattern, then one can conclude that there is an error.

(ii) If the differences of some order (it is generally happens in higher order) becomes
alternating in sign then the middle entry has an error.

Finite Difference Interpolations

3.5 Newton’s Forward Difference Interpolation Formula

Let y = f (x) be a function whose explicit form is unknown. But, the values of y at
the equispaced points x0 , x1 , . . . , xn , i.e., yi = f (xi ), i = 0, 1, . . . , n are known. Since
x0 , x1 , . . . , xn are equispaced then xi = x0 + ih, i = 0, 1, . . . , n, where h is the spacing.
It is required to construct a polynomial φ(x) of degree less than or equal to n satisfying
the conditions

yi = φ(xi ), i = 0, 1, . . . , n. (3.32)

Since φ(x) is a polynomial of degree at most n, so φ(x) can be taken in the following
form

φ(x) = a0 + a1 (x − x0 ) + a2 (x − x0 )(x − x1 ) + a3 (x − x0 )(x − x1 )(x − x2 )


+ · · · + an (x − x0 )(x − x1 ) · · · (x − xn−1 ), (3.33)

where a0 , a1 , . . . , an are constants whose values are to be determined using (3.32).


To determine the values of ai ’s, substituting x = xi , i = 0, 1, 2, . . . , n.
When x = x0 then
φ(x0 ) = a0 or, a0 = y0 .
For x = x1 , φ(x1 ) = a0 + a1 (x1 − x0 )
y1 − y 0 ∆y0
or, y1 = y0 + a1 h or, a1 = = .
h h
Interpolation 91

For x = x2 , φ(x2 ) = a0 + a1 (x2 − x1 ) + a2 (x2 − x0 )(x2 − x1 )


y1 − y 0
or, y2 = y0 + .2h + a2 (2h)(h)
h
y2 − 2y1 + y0 ∆ 2 y0
or, a2 = = .
2!h2 2!h2
In this way,
∆ 3 y0 ∆ 4 y0 ∆ n y0
a3 = , a 4 = , . . . , an = .
3!h3 4!h4 n!hn
Using these values, (3.33) becomes

∆y0 ∆ 2 y0
φ(x) = y0 + (x − x0 ) + (x − x0 )(x − x1 )
h 2!h2
3
∆ y0
+(x − x0 )(x − x1 )(x − x2 )
3!h3
∆ n y0
+ · · · + (x − x0 )(x − x1 ) · · · (x − xn−1 ) . (3.34)
n!hn
Introducing the condition xi = x0 + ih, i = 0, 1, . . . , n for equispaced points and a
new variable u as x = x0 + uh.
Therefore, x − xi = (u − i)h.
So the equation (3.34) becomes

∆y0 ∆ 2 y0 ∆ 3 y0
φ(x) = y0 + (uh) + (uh)(u − 1)h 2
+ (uh)(u − 1)h(u − 2)h
h 2!h 3!h3
n
∆ y0
+ · · · + (uh)(u − 1)h(u − 2)h · · · (u − n − 1)h
n!hn
u(u − 1) 2 u(u − 1)(u − 2) 3
= y0 + u∆y0 + ∆ y0 + ∆ y0
2! 3!
u(u − 1)(u − 2) · · · (u − n − 1) n
+··· + ∆ y0 , (3.35)
n!
x − x0
where u = .
h
This is known as Newton or Newton-Gregory forward difference interpolating
polynomial.
Example 3.5.1 The following table gives the values of ex for certain equidistant
values of x. Find the value of ex when x = 0.612 using Newton’s forward difference
formulae.

x : 0.61 0.62 0.63 0.64 0.65


y : 1.840431 1.858928 1.877610 1.896481 1.915541
92 Numerical Analysis

Solution. The forward difference table is

x y ∆y ∆2 y ∆3 y
0.61 1.840431
0.018497
0.62 1.858928 0.000185
0.018682 0.000004
0.63 1.877610 0.000189
0.018871 0.0
0.64 1.896481 0.000189
0.019060
0.65 1.915541

x − x0 0.612 − 0.61
Here, x0 = 0.61, x = 0.612, h = 0.01, u = = = 0.2.
h 0.01
Then,

u(u − 1) 2 u(u − 1)(u − 2) 3


y(0.612) = y0 + u∆y0 + ∆ y0 + ∆ y0
2! 3!
0.2(0.2 − 1)
= 1.840431 + 0.2 × 0.018497 + × 0.000185
2
0.2(0.2 − 1)(0.2 − 2)
+ × 0.000004
6
= 1.840431 + 0.003699 − 0.000015 + 0.00000019
= 1.844115.

3.5.1 Error in Newton’s forward formula

The error in any polynomial interpolation formula is

f (n+1) (ξ)
E(x) = (x − x0 )(x − x1 ) · · · (x − xn )
(n + 1)!
f (n+1) (ξ)
= u(u − 1)(u − 2) · · · (u − n)hn+1 (using x = x0 + uh)
(n + 1)!

where ξ lies between min{x0 , x1 , . . . , xn , x} and max{x0 , x1 , . . . , xn , x}.


Also, f (n+1) (ξ)  hn+1 ∆n+1 y0 .
Therefore,
u(u − 1)(u − 2) · · · (u − n) n+1
E(x)  ∆ y0 .
(n + 1)!
Interpolation 93

A particular case:
If 0 < u < 1 then
 2
1 1 1
|u(u − 1)| = (1 − u)u = u − u = −2
− u ≤ and
4 2 4
|(u − 2)(u − 3) · · · (u − n)| ≤ |(−2)(−3) · · · (−n)| = n!.
Then,
1 n! 1
|E(x)| ≤ |∆n+1 y0 | = |∆n+1 y0 |.
4 (n + 1)! 4(n + 1)
Also, |∆n+1 y0 | ≤ 9 in the last significant figure.
9
Thus, |E(x)| ≤ < 1 for n > 2 and 0 < u < 1.
4(n + 1)
That is, the maximum error in Newton’s forward interpolation is 1 when |x − x0 | < h.

Newton’s forward formula is used to compute the approximate value of f (x) when
the argument x is near the beginning of the table. But this formula is not appropriate
to compute f (x) when x at the end of the table. In this situation Newton’s backward
formula is appropriate.
Algorithm 3.2 (Newton’s forward interpolation). This algorithm determines
the value of y when the value of x is given, by Newton’s forward interpolation method.
The values of xi , yi , i = 0, 1, 2, . . . , n are given and assumed that xi = x0 + ih,
i.e., the data are equispaced.

Algorithm Newton Forward Intepolation


//Assume that the data are equispaced.//
Read (xi , yi ), i = 0, 1, 2, . . . , n;
Read xg; //the value of x at which y is to be determined.//
Compute h = x1 − x0 ; //compute spacing.//
Compute u = (xg − x0 )/h;
for j = 0 to n do
dyj = yj ; //copy of y to dy//
Set prod = 1, sum = y0 ;
for i = 1 to n do
for j = 0 to (n − i) do dyj = dyj+1 − dyj ;
//dy represents the difference.//
u−i+1
Compute prod = prod × ;
i
Compute sum = sum + prod × dy0 ;
endfor;
Print ‘The value of y at x =’,xg, ‘is ’, sum;
end Newton Forward Intepolation
94 Numerical Analysis

Program 3.2.
/* Program Newton Forward Interpolation
This program finds the value of y=f(x) at a given x when
the function is supplied as (x[i],y[i]), i=0, 1, ..., n.
Assumed that x’s are equispaced.*/
#include<stdio.h>
#include<math.h>
void main()
{
int i,j,n; float x[20],y[20],xg,sum,prod=1,u,dy[20],h;
printf("Enter number of subintervals ");
scanf("%d",&n);
printf("Enter x and y values ");
for(i=0;i<=n;i++) scanf("%f %f",&x[i],&y[i]);
printf("Enter interpolating point x ");
scanf("%f",&xg);
h=x[1]-x[0];
u=(xg-x[0])/h;
for(j=0;j<=n;j++) dy[j]=y[j];
prod=1; sum=y[0];
for(i=1;i<=n;i++)
{
for(j=0;j<=n-i;j++) dy[j]=dy[j+1]-dy[j];
prod*=(u-i+1)/i;
sum+=prod*dy[0];
}
printf("The value of y at x=%f is %f ",xg,sum);
}

A sample of input/output:

Enter number of subintervals 4


Enter x and y values
140 3.685
150 5.854
160 6.302
170 8.072
180 10.225
Enter interpolating point x 142
The value of y at x=142.000000 is 4.537069
Interpolation 95

3.6 Newton’s Backward Difference Interpolation Formula

Suppose, a set of values y0 , y1 , . . . , yn of the function y = f (x) is given, at x0 , x1 , . . . , xn ,


i.e., yi = f (xi ), i = 0, 1, . . . , n. Let xi ’s are equispaced with spacing h, i.e., xi = x0 + ih.
Let us consider the polynomial φ(x) in the following form:

f (x)  φ(x) = a0 + a1 (x − xn ) + a2 (x − xn )(x − xn−1 )


+a3 (x − xn )(x − xn−1 )(x − xn−2 )
+ · · · + an (x − xn )(x − xn−1 ) · · · (x − x1 ). (3.36)

The constants ai ’s are to be determined using the conditions

yi = φ(xi ), i = 0, 1, . . . , n. (3.37)

Substituting x = xn , nn−1 , . . . , x1 in (3.36), we obtain


φ(xn) = a0, or, a0 = yn.
φ(xn−1) = a0 + a1(xn−1 − xn) or, yn−1 = yn + a1(−h) or, a1 = (yn − yn−1)/h = ∇yn/h.
φ(xn−2) = a0 + a1(xn−2 − xn) + a2(xn−2 − xn)(xn−2 − xn−1)
        = yn + [(yn − yn−1)/h](−2h) + a2(−2h)(−h)
or, yn−2 = 2yn−1 − yn + a2 · 2!h^2, or, a2 = (yn − 2yn−1 + yn−2)/(2!h^2) = ∇^2 yn/(2!h^2).
In this way, the other values are obtained as

a3 = ∇^3 yn/(3!h^3), a4 = ∇^4 yn/(4!h^4), . . . , an = ∇^n yn/(n!h^n).

When the values of ai ’s are substituted in (3.36) then the polynomial φ(x) becomes

φ(x) = yn + (x − xn) ∇yn/h + (x − xn)(x − xn−1) ∇^2 yn/(2!h^2)
     + (x − xn)(x − xn−1)(x − xn−2) ∇^3 yn/(3!h^3) + · · ·
     + (x − xn)(x − xn−1)(x − xn−2) · · · (x − x1) ∇^n yn/(n!h^n).     (3.38)
Now, a unitless variable v is introduced, defined by x = xn + vh, i.e., v = (x − xn)/h. This substitution simplifies the formula.
Also, for equispaced points, xi = x0 + ih.
Then x − xn−i = (xn + vh) − (x0 + (n − i)h) = (xn − x0) − (n − i)h + vh = (v + i)h,
i = 0, 1, . . . , n.

Using the above results, (3.38) becomes

φ(x) = yn + vh ∇yn/h + vh(v + 1)h ∇^2 yn/(2!h^2) + vh(v + 1)h(v + 2)h ∇^3 yn/(3!h^3) + · · ·
     + vh(v + 1)h(v + 2)h · · · (v + n − 1)h ∇^n yn/(n!h^n)
     = yn + v∇yn + [v(v + 1)/2!] ∇^2 yn + [v(v + 1)(v + 2)/3!] ∇^3 yn + · · ·
     + [v(v + 1)(v + 2) · · · (v + n − 1)/n!] ∇^n yn.     (3.39)

This formula is known as Newton’s backward or Newton-Gregory backward


interpolation formula.
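
Formula (3.39) is as easy to program as Newton's forward formula. The following is a minimal C sketch written for this discussion in the style of Program 3.2 (it is not one of the book's numbered programs); the names xg, dy, prod and sum simply mirror that program.

/* A sketch of Newton's backward interpolation. Assumes
   equispaced x's; the last point x[n] is taken as the
   starting point of formula (3.39). */
#include<stdio.h>
void main()
{
 int i,j,n; float x[20],y[20],dy[20],xg,h,v,prod,sum;
 printf("Enter number of subintervals ");
 scanf("%d",&n);
 printf("Enter x and y values ");
 for(i=0;i<=n;i++) scanf("%f %f",&x[i],&y[i]);
 printf("Enter interpolating point x ");
 scanf("%f",&xg);
 h=x[1]-x[0]; v=(xg-x[n])/h; /* v=(x-xn)/h */
 for(j=0;j<=n;j++) dy[j]=y[j]; /* copy of y to dy */
 prod=1; sum=y[n];
 for(i=1;i<=n;i++)
 {
  /* after this loop dy[j] holds the ith backward difference at x[j] */
  for(j=n;j>=i;j--) dy[j]=dy[j]-dy[j-1];
  prod*=(v+i-1)/i; /* v(v+1)...(v+i-1)/i! */
  sum+=prod*dy[n];
 }
 printf("The value of y at x=%f is %f ",xg,sum);
}

The only changes from the forward version are that the differences are accumulated from the bottom of the table and the factor (u − i + 1)/i is replaced by (v + i − 1)/i.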

Example 3.6.1 From the following table of values of x and f (x) determine the
value of f (0.29) using Newton’s backward interpolation formula.

x : 0.20 0.22 0.24 0.26 0.28 0.30


f (x) : 1.6596 1.6698 1.6804 1.6912 1.7024 1.7139

Solution. The difference table is

x f (x) ∆f (x) ∆2 f (x) ∆3 f (x)


0.20 1.6596
0.22 1.6698 0.0102
0.24 1.6804 0.0106 0.0004
0.26 1.6912 0.0108 0.0002 −0.0002
0.28 1.7024 0.0112 0.0004 0.0002
0.30 1.7139 0.0115 0.0003 −0.0001

Here, xn = 0.30, x = 0.29, h = 0.02, v = (x − xn)/h = (0.29 − 0.30)/0.02 = −0.5.
Then,

f(0.29) = f(xn) + v∇f(xn) + [v(v + 1)/2!] ∇^2 f(xn) + [v(v + 1)(v + 2)/3!] ∇^3 f(xn) + · · ·
        = 1.7139 − 0.5 × 0.0115 + [−0.5(−0.5 + 1)/2] × 0.0003
          + [−0.5(−0.5 + 1)(−0.5 + 2)/6] × (−0.0001)
        = 1.7139 − 0.00575 − 0.0000375 + 0.00000625
        = 1.70811875 ≃ 1.7081.

Example 3.6.2 The population of a town in decennial census were given in the
following table.
Year : 1921 1931 1941 1951 1961
Population (in thousand) : 46 66 81 93 101

Estimate the population for the year 1955 using Newton’s backward and forward
formulae and compare the results.

Solution.
Using Newton’s backward formula
The backward difference table is
Year Population ∇y ∇2 y ∇3 y ∇4 y
(x) (y)
1921 46
1931 66 20
1941 81 15 −5
1951 93 12 −3 2
1961 101 8 −4 −1 −3

Here, xn = 1961, x = 1955, h = 10, v = (x − xn)/h = (1955 − 1961)/10 = −0.6.
By Newton’s backward formula

y(1955) = yn + v∇yn + [v(v + 1)/2!] ∇^2 yn + [v(v + 1)(v + 2)/3!] ∇^3 yn
          + [v(v + 1)(v + 2)(v + 3)/4!] ∇^4 yn
        = 101 − 0.6 × 8 + [−0.6(−0.6 + 1)/2] × (−4)
          + [−0.6(−0.6 + 1)(−0.6 + 2)/6] × (−1)
          + [−0.6(−0.6 + 1)(−0.6 + 2)(−0.6 + 3)/24] × (−3)
        = 101 − 4.8 + 0.48 + 0.056 + 0.1008
        = 96.8368 ≃ 97.

Hence the approximate population of the town was 97 thousand.


Using Newton’s forward formula
The given table is written in reverse order as
Year : 1961 1951 1941 1931 1921
Population : 101 93 81 66 46

The forward difference table is


Year Population ∆y ∆2 y ∆3 y ∆4 y
x y
1961 101
−8
1951 93 −4
−12 1
1941 81 −3 −3
−15 −2
1931 66 −5
−20
1921 46
Here x0 = 1961, x = 1955, h = −10, u = (x − x0)/h = (1955 − 1961)/(−10) = 0.6.
Then
y(1955) = y0 + u∆y0 + [u(u − 1)/2!] ∆^2 y0 + [u(u − 1)(u − 2)/3!] ∆^3 y0
          + [u(u − 1)(u − 2)(u − 3)/4!] ∆^4 y0
        = 101 + 0.6 × (−8) + [0.6(0.6 − 1)/2] × (−4) + [0.6(0.6 − 1)(0.6 − 2)/6] × 1
          + [0.6(0.6 − 1)(0.6 − 2)(0.6 − 3)/24] × (−3)
        = 101 − 4.8 + 0.48 + 0.056 + 0.1008
        = 96.8368 ≃ 97.

Therefore the population of the town in the year 1955 was 97 thousand, and this result is the same as the result obtained by Newton's backward difference formula.

3.6.1 Error in Newton’s backward interpolation formula

The error in this interpolation formula is

E(x) = (x − xn)(x − xn−1) · · · (x − x1)(x − x0) f^(n+1)(ξ)/(n + 1)!
     = v(v + 1)(v + 2) · · · (v + n) h^(n+1) f^(n+1)(ξ)/(n + 1)!,     (3.40)

where v = (x − xn)/h and ξ lies between min{x0, x1, . . . , xn, x} and max{x0, x1, . . . , xn, x}.

Note 3.6.1 The Newton's backward difference interpolation formula is used to compute the value of f(x) when x is near xn, i.e., when x is at the end of the table.

Central Difference Interpolation Formulae

3.7 Gaussian Interpolation Formulae

Newton's forward and Newton's backward formulae do not give an accurate value of f(x) when x is in the middle of the table. To get a more accurate result another formula may be used. There are several methods available to solve this type of problem. Among them Gaussian forward and backward, Stirling's and Bessel's interpolation formulae are widely used.

3.7.1 Gauss’s forward difference formula


Case I: For 2n + 1 (odd) arguments
Suppose the values of the function y = f (x) are known at 2n + 1 equally spaced points
x−n , x−(n−1) , . . . , x−1 , x0 , x1 , . . . , xn−1 , xn , i.e., yi = f (xi ), i = 0, ±1, ±2, . . . , ±n.
The problem is to construct a polynomial φ(x) of degree at most 2n such that
φ(xi ) = yi , i = 0, ±1, ±2, . . . , ±n, (3.41)
where xi = x0 + ih, h is the spacing.
Let us consider φ(x) as
φ(x) = a0 + a1 (x − x0 ) + a2 (x − x0 )(x − x1 ) + a3 (x − x−1 )(x − x0 )(x − x1 )
+a4 (x − x−1 )(x − x0 )(x − x1 )(x − x2 ) + · · ·
+a2n−1 (x − x−n+1 )(x − x−n+2 ) · · · (x − x−1 )(x − x0 ) · · · (x − xn−1 )
+a2n (x − x−n+1 )(x − x−n+2 ) · · · (x − x−1 )(x − x0 ) · · · (x − xn ),
(3.42)
where ai ’s are unknown constants and their values are to be determined by substituting
x = x0 , x1 , x−1 , x2 , x−2 , . . . , xn , x−n .
Therefore,
y0 = a0.
y1 = a0 + a1(x1 − x0), i.e., y1 = y0 + a1 h, i.e., a1 = (y1 − y0)/(x1 − x0) = ∆y0/h.
y−1 = y0 + a1(−h) + a2(−h)(−2h)
    = y0 − (∆y0/h)h + a2 h^2 · 2!,
i.e., a2 = (y−1 − 2y0 + y1)/(2! h^2) = ∆^2 y−1/(2! h^2).
y2 = a0 + a1(x2 − x0) + a2(x2 − x0)(x2 − x1) + a3(x2 − x−1)(x2 − x0)(x2 − x1)
   = y0 + [(y1 − y0)/h](2h) + [(y−1 − 2y0 + y1)/(2!h^2)](2h)(h) + a3(3h)(2h)(h)
or, a3 = (y2 − 3y1 + 3y0 − y−1)/(3!h^3) = ∆^3 y−1/(3!h^3).

In this manner, the remaining values are obtained as

a4 = ∆^4 y−2/(4!h^4), a5 = ∆^5 y−2/(5!h^5), . . . , a2n−1 = ∆^(2n−1) y−(n−1)/((2n − 1)!h^(2n−1)), a2n = ∆^(2n) y−n/((2n)!h^(2n)).

Thus the Gauss's forward difference formula is

φ(x) = y0 + (x − x0) ∆y0/h + (x − x0)(x − x1) ∆^2 y−1/(2!h^2)
     + (x − x−1)(x − x0)(x − x1) ∆^3 y−1/(3!h^3) + · · ·
     + (x − x−(n−1)) · · · (x − x−1)(x − x0) · · · (x − xn−1) ∆^(2n−1) y−(n−1)/((2n − 1)!h^(2n−1))
     + (x − x−(n−1)) · · · (x − x−1)(x − x0) · · · (x − xn−1)(x − xn) ∆^(2n) y−n/((2n)!h^(2n)).     (3.43)

To simplify the above equation, a new variable s is introduced, where
s = (x − x0)/h, i.e., x = x0 + sh.
Also, x±i = x0 ± ih, i = 0, 1, 2, . . . , n.
Therefore, x − x±i = (s ∓ i)h.
Then x − x0 = sh, x − x1 = (s − 1)h, x − x−1 = (s + 1)h,
x − x2 = (s − 2)h, x − x−2 = (s + 2)h and so on.
Making use of these results, (3.43) becomes

φ(x) = y0 + sh ∆y0/h + sh(s − 1)h ∆^2 y−1/(2!h^2) + (s + 1)h sh(s − 1)h ∆^3 y−1/(3!h^3) + · · ·
     + (s + n − 1)h · · · sh(s − 1)h · · · (s − n + 1)h ∆^(2n−1) y−(n−1)/((2n − 1)!h^(2n−1))
     + (s + n − 1)h · · · sh(s − 1)h · · · (s − n + 1)h (s − n)h ∆^(2n) y−n/((2n)!h^(2n))
     = y0 + s∆y0 + [s(s − 1)/2!] ∆^2 y−1 + [(s + 1)s(s − 1)/3!] ∆^3 y−1 + · · ·
     + [(s + n − 1) · · · s(s − 1) · · · (s − n + 1)/(2n − 1)!] ∆^(2n−1) y−(n−1)
     + [(s + n − 1) · · · s(s − 1) · · · (s − n + 1)(s − n)/(2n)!] ∆^(2n) y−n
     = y0 + s∆y0 + [s(s − 1)/2!] ∆^2 y−1 + [s(s^2 − 1^2)/3!] ∆^3 y−1
     + [s(s^2 − 1^2)(s − 2)/4!] ∆^4 y−2 + · · ·
     + [s(s^2 − 1^2)(s^2 − 2^2) · · · (s^2 − (n − 1)^2)/(2n − 1)!] ∆^(2n−1) y−(n−1)
     + [s(s^2 − 1^2)(s^2 − 2^2) · · · (s^2 − (n − 1)^2)(s − n)/(2n)!] ∆^(2n) y−n.     (3.44)

The formula (3.43) or (3.44) is known as Gauss’s forward central difference


formula or the first interpolation formula of Gauss.

Case II: For 2n (even) arguments

In this case the arguments are x0, x±1, . . . , x±(n−1) and xn.
For these points the Gauss's forward interpolation formula takes the following form.

φ(x) = y0 + s∆y0 + [s(s − 1)/2!] ∆^2 y−1 + [(s + 1)s(s − 1)/3!] ∆^3 y−1
     + [(s + 1)s(s − 1)(s − 2)/4!] ∆^4 y−2
     + [(s + 2)(s + 1)s(s − 1)(s − 2)/5!] ∆^5 y−2 + · · ·
     + [(s + n − 1) · · · s · · · (s − n + 1)/(2n − 1)!] ∆^(2n−1) y−(n−1)
     = y0 + s∆y0 + [s(s − 1)/2!] ∆^2 y−1 + [s(s^2 − 1^2)/3!] ∆^3 y−1
     + [s(s^2 − 1^2)(s − 2)/4!] ∆^4 y−2 + [s(s^2 − 1^2)(s^2 − 2^2)/5!] ∆^5 y−2 + · · ·
     + [s(s^2 − 1^2) · · · (s^2 − (n − 1)^2)/(2n − 1)!] ∆^(2n−1) y−(n−1).     (3.45)

3.7.2 Remainder in Gauss’s forward central difference formula

The remainder of Gauss's forward central difference interpolation for 2n + 1 arguments is

E(x) = (x − x−n)(x − x−(n−1)) · · · (x − x−1)(x − x0) · · · (x − xn) f^(2n+1)(ξ)/(2n + 1)!
     = (s + n)(s + n − 1) · · · (s + 1)s(s − 1) · · · (s − n + 1)(s − n) h^(2n+1) f^(2n+1)(ξ)/(2n + 1)!
     = s(s^2 − 1^2) · · · (s^2 − n^2) h^(2n+1) f^(2n+1)(ξ)/(2n + 1)!,     (3.46)

where s = (x − x0)/h and ξ lies between min{x−n, x−(n−1), . . . , x0, x1, . . . , xn−1, xn}
and max{x−n, x−(n−1), . . . , x0, x1, . . . , xn−1, xn}.
In case of 2n arguments, the error is

E(x) = (s + n − 1) · · · (s + 1)s(s − 1) · · · (s − n + 1)(s − n) h^(2n) f^(2n)(ξ)/(2n)!
     = s(s^2 − 1^2) · · · (s^2 − (n − 1)^2)(s − n) h^(2n) f^(2n)(ξ)/(2n)!,     (3.47)

where min{x−(n−1), . . . , x0, x1, . . . , xn} < ξ < max{x−(n−1), . . . , x0, x1, . . . , xn}.
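
A possible C sketch of the Gauss forward formula is given below; it is an illustration written for this section, not one of the book's numbered programs. It builds the full forward difference table and picks out the central differences ∆^k y−⌊k/2⌋ used in (3.44)/(3.45). The index m of the middle point taken as x0 is assumed to be supplied by the user, and the arrays are limited to 20 points as in the other programs.

/* A sketch of Gauss's forward central difference interpolation. */
#include<stdio.h>
void main()
{
 int i,k,n,m,r; float x[20],y[20],d[20][20],xg,h,s,c,sum;
 printf("Enter number of subintervals ");
 scanf("%d",&n);
 printf("Enter x and y values ");
 for(i=0;i<=n;i++) scanf("%f %f",&x[i],&y[i]);
 printf("Enter index of middle point and interpolating point x ");
 scanf("%d %f",&m,&xg);
 h=x[1]-x[0]; s=(xg-x[m])/h;
 for(i=0;i<=n;i++) d[i][0]=y[i];
 for(k=1;k<=n;k++) /* forward difference table */
  for(i=0;i<=n-k;i++) d[i][k]=d[i+1][k-1]-d[i][k-1];
 sum=y[m]; c=1;
 for(k=1;k<=n;k++)
 {
  r=m-k/2; /* row of the central difference used in term k */
  if(r<0 || r+k>n) break; /* difference not available */
  if(k%2==1) c*=(s+(k-1)/2)/k; /* k=2j+1: new factor (s+j) */
  else c*=(s-k/2)/k;           /* k=2j:   new factor (s-j) */
  sum+=c*d[r][k];
 }
 printf("The value of y at x=%f is %f ",xg,sum);
}

The coefficient c is updated multiplicatively: after k steps it equals the product of factors s, (s − 1), (s + 1), (s − 2), . . . divided by k!, exactly the coefficients appearing in (3.44).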

3.7.3 Gauss’s backward difference formula

Case I: For 2n + 1 (odd) number of arguments

Let a function y = f(x) be known at 2n + 1 equispaced arguments x±i, i = 0, 1, 2, . . . , n, such that
x±i = x0 ± ih, i = 0, 1, 2, . . . , n.

Let y±i = f (x±i ), i = 0, 1, 2, . . . , n.


Our aim is to determine a polynomial φ(x) of degree not more than 2n which satisfies
the conditions

φ(x±i ) = y±i , i = 0, 1, . . . , n. (3.48)



The polynomial φ(x) is considered in the following form.

φ(x) = a0 + a1 (x − x0 ) + a2 (x − x−1 )(x − x0 ) + a3 (x − x−1 )(x − x0 )(x − x1 )


+a4 (x − x−2 )(x − x−1 )(x − x0 )(x − x1 )
+a5 (x − x−2 )(x − x−1 )(x − x0 )(x − x1 )(x − x2 ) + · · ·
+a2n−1 (x − x−(n−1) ) · · · (x − x−1 )(x − x0 ) · · · (x − xn−1 )
+a2n (x − x−n )(x − x−(n−1) ) · · · (x − x−1 )(x − x0 ) · · · (x − xn−1 ). (3.49)

The coefficients ai's are unknown constants. Their values are determined by using the relations (3.48), substituting x = x0, x−1, x1, x−2, x2, . . . , x−n, xn in (3.49) in succession. Note that xi − x−j = (i + j)h and x−i − xj = −(i + j)h. Then it is found that

y0 = a0.
φ(x−1) = a0 + a1(x−1 − x0), i.e., y−1 = y0 + a1(−h), i.e., a1 = (y0 − y−1)/h = ∆y−1/h.
φ(x1) = a0 + a1(x1 − x0) + a2(x1 − x−1)(x1 − x0),
i.e., y1 = y0 + h(∆y−1/h) + a2(2h)(h),
i.e., a2 = [y1 − y0 − (y0 − y−1)]/(2!h^2) = ∆^2 y−1/(2!h^2).
φ(x−2) = a0 + a1(x−2 − x0) + a2(x−2 − x−1)(x−2 − x0) + a3(x−2 − x−1)(x−2 − x0)(x−2 − x1),
i.e., y−2 = y0 + (∆y−1/h)(−2h) + [∆^2 y−1/(2!h^2)](−h)(−2h) + a3(−h)(−2h)(−3h)
          = y0 − 2(y0 − y−1) + (y1 − 2y0 + y−1) + a3(−1)^3 (3!)h^3
or, a3 = (y1 − 3y0 + 3y−1 − y−2)/(3!h^3) = ∆^3 y−2/(3!h^3).
In this manner, the other values are obtained as

a4 = ∆^4 y−2/(4!h^4), a5 = ∆^5 y−3/(5!h^5), . . . , a2n−1 = ∆^(2n−1) y−n/((2n − 1)!h^(2n−1)), a2n = ∆^(2n) y−n/((2n)!h^(2n)).

Making use of the ai's, equation (3.49) becomes

φ(x) = y0 + (x − x0) ∆y−1/(1!h) + (x − x−1)(x − x0) ∆^2 y−1/(2!h^2)
     + (x − x−1)(x − x0)(x − x1) ∆^3 y−2/(3!h^3)
     + (x − x−2)(x − x−1)(x − x0)(x − x1) ∆^4 y−2/(4!h^4) + · · ·
     + (x − x−(n−1)) · · · (x − x−1)(x − x0) · · · (x − xn−1) ∆^(2n−1) y−n/((2n − 1)!h^(2n−1))
     + (x − x−n)(x − x−(n−1)) · · · (x − x−1)(x − x0) · · · (x − xn−1) ∆^(2n) y−n/((2n)!h^(2n)).     (3.50)

As in the previous case, a new unitless variable s is introduced to reduce the above formula to a simple form, where s = (x − x0)/h, i.e., x = x0 + sh.
Then

(x − xi)/h = (x − x0 − ih)/h = s − i and (x − x−i)/h = (x − x0 + ih)/h = s + i, i = 0, 1, 2, . . . , n.

Then the above formula is transformed to

φ(x) = y0 + s∆y−1 + [(s + 1)s/2!] ∆^2 y−1 + [(s + 1)s(s − 1)/3!] ∆^3 y−2
     + [(s + 2)(s + 1)s(s − 1)/4!] ∆^4 y−2 + · · ·
     + [(s + n − 1) · · · (s + 1)s(s − 1) · · · (s − n + 1)/(2n − 1)!] ∆^(2n−1) y−n
     + [(s + n)(s + n − 1) · · · (s + 1)s(s − 1) · · · (s − n + 1)/(2n)!] ∆^(2n) y−n.     (3.51)

The above formula (3.51) is known as Gauss’s backward interpolation formula


or second interpolation formula of Gauss.

Case II: For 2n (even) number of arguments

In this case the arguments are taken as x0 , x±1 , . . . , x±(n−1) and x−n , where x±i =
x0 ± ih, i = 0, 1, . . . , n − 1 and x−n = x0 − nh.
For these equispaced points the Gauss's backward interpolation formula is

φ(x) = y0 + s∆y−1 + [(s + 1)s/2!] ∆^2 y−1 + [(s + 1)s(s − 1)/3!] ∆^3 y−2
     + [(s + 2)(s + 1)s(s − 1)/4!] ∆^4 y−2 + · · ·
     + [(s + n − 1) · · · (s + 1)s(s − 1) · · · (s − n + 1)/(2n − 1)!] ∆^(2n−1) y−n,     (3.52)

where s = (x − x0)/h.

3.7.4 Remainder of Gauss’s backward central difference formula

The remainder for (2n + 1) equispaced points is

E(x) = (x − x−n)(x − x−(n−1)) · · · (x − x−1)(x − x0) · · · (x − xn) f^(2n+1)(ξ)/(2n + 1)!
     = (s + n)(s + n − 1) · · · (s + 1)s(s − 1) · · · (s − n + 1)(s − n) h^(2n+1) f^(2n+1)(ξ)/(2n + 1)!,

where min{x−n, x−(n−1), . . . , x0, x1, . . . , xn−1, xn} < ξ < max{x−n, x−(n−1), . . . , x0, x1, . . . , xn−1, xn}.
The remainder for the case of 2n equispaced points is

E(x) = (x − x−n)(x − x−(n−1)) · · · (x − x−1)(x − x0) · · · (x − xn−1) f^(2n)(ξ)/(2n)!
     = (s + n)(s + n − 1) · · · (s + 1)s(s − 1) · · · (s − n + 1) h^(2n) f^(2n)(ξ)/(2n)!,

where min{x−n, . . . , x0, x1, . . . , xn−1} < ξ < max{x−n, . . . , x0, x1, . . . , xn−1}.

3.8 Stirling’s Interpolation Formula

Stirling’s interpolation formula is used for odd number of equispaced arguments.


This formula is obtained by taking the arithmetic mean of the Gauss’s forward and
backward difference formulae given by (3.44) and (3.51).
Therefore Stirling's formula is

φ(x) = [φ(x)_forward + φ(x)_backward]/2
     = y0 + (s/1!) (∆y−1 + ∆y0)/2 + (s^2/2!) ∆^2 y−1 + [s(s^2 − 1^2)/3!] (∆^3 y−2 + ∆^3 y−1)/2
     + [s^2(s^2 − 1^2)/4!] ∆^4 y−2 + [s(s^2 − 1^2)(s^2 − 2^2)/5!] (∆^5 y−3 + ∆^5 y−2)/2 + · · ·
     + [s^2(s^2 − 1^2)(s^2 − 2^2) · · · (s^2 − (n − 1)^2)/(2n)!] ∆^(2n) y−n.     (3.53)

The remainder of this formula is

E(x) = [s(s^2 − 1^2)(s^2 − 2^2) · · · (s^2 − n^2)/(2n + 1)!] h^(2n+1) f^(2n+1)(ξ),     (3.54)

where min{x−n, . . . , x0, . . . , xn} < ξ < max{x−n, . . . , x0, . . . , xn}.
The formula (3.53) is known as Stirling's central difference interpolation formula.

Note 3.8.1 (a) The Stirling's interpolation formula (3.53) gives the best approximate result when −0.25 < s < 0.25. So we choose x0 in such a way that s = (x − x0)/h satisfies this condition.
(b) The Stirling's interpolation formula is used when the point x, at which f(x) is to be determined, is at the centre of the table and the number of points at which the values of f(x) are known is odd.

3.9 Bessel’s Interpolation Formula

This central difference formula is also obtained by taking the arithmetic mean of Gauss's forward and backward interpolation formulae, but with the difference that the backward formula is taken after one modification.
Let us consider 2n equispaced points x−(n−1), . . . , x−1, x0, x1, . . . , xn−1, xn as arguments, where x±i = x0 ± ih, h being the spacing.
If x0, y0 be the initial values of x and y respectively, then the Gauss's backward difference interpolation formula (3.52) is

φ(x) = y0 + s∆y−1 + [s(s + 1)/2!] ∆^2 y−1 + [(s + 1)s(s − 1)/3!] ∆^3 y−2
     + [(s + 2)(s + 1)s(s − 1)/4!] ∆^4 y−2 + · · ·
     + [(s + n − 1) · · · (s + 1)s(s − 1) · · · (s − n + 1)/(2n − 1)!] ∆^(2n−1) y−n.     (3.55)
Suppose x1, y1 be the initial values of x and y. Then

(x − x1)/h = [x − (x0 + h)]/h = (x − x0)/h − 1 = s − 1.

Also, the indices of all the differences of (3.55) will increase by 1. Now, replacing s by s − 1 and increasing the indices of (3.55) by 1, the above equation becomes

φ1(x) = y1 + (s − 1)∆y0 + [s(s − 1)/2!] ∆^2 y0 + [s(s − 1)(s − 2)/3!] ∆^3 y−1
      + [(s + 1)s(s − 1)(s − 2)/4!] ∆^4 y−1
      + [(s + 1)s(s − 1)(s − 2)(s − 3)/5!] ∆^5 y−2 + · · ·
      + [(s + n − 2) · · · (s + 1)s(s − 1)(s − 2) · · · (s − n)/(2n − 1)!] ∆^(2n−1) y−(n−1).     (3.56)

Taking the arithmetic mean of (3.56) and Gauss's forward interpolation formula given by (3.45),

φ(x) = [φ1(x) + φ(x)_forward]/2
     = (y0 + y1)/2 + (s − 1/2)∆y0 + [s(s − 1)/2!] (∆^2 y0 + ∆^2 y−1)/2
     + [(s − 1/2)s(s − 1)/3!] ∆^3 y−1
     + [(s + 1)s(s − 1)(s − 2)/4!] (∆^4 y−2 + ∆^4 y−1)/2
     + [(s − 1/2)(s + 1)s(s − 1)(s − 2)/5!] ∆^5 y−2 + · · ·
     + [(s − 1/2)s(s − 1)(s + 1) · · · (s + n − 2)(s − n + 1)/(2n − 1)!] ∆^(2n−1) y−(n−1),     (3.57)

where s = (x − x0)/h.
Introducing u = s − 1/2 = (x − x0)/h − 1/2, the above formula reduces to

φ(x) = (y0 + y1)/2 + u∆y0 + [(u^2 − 1/4)/2!] (∆^2 y−1 + ∆^2 y0)/2 + [u(u^2 − 1/4)/3!] ∆^3 y−1 + · · ·
     + [u(u^2 − 1/4)(u^2 − 9/4) · · · (u^2 − (2n − 3)^2/4)/(2n − 1)!] ∆^(2n−1) y−(n−1).     (3.58)

The formulae given by (3.57) and (3.58) are known as Bessel’s central difference
interpolation formula.

Note 3.9.1 (a) The Bessel’s formula gives the best result when the starting point x0
be so chosen such that −0.25 < u < 0.25 i.e., −0.25 < s−0.5 < 0.25 or, 0.25 < s < 0.75.
(b) This central difference formula is used when the interpolating point is near the
middle of the table and the number of arguments is even.

Example 3.9.1 Use the central difference interpolation formula of Stirling or Bessel
to find the values of y at (i) x = 1.40 and (ii) x = 1.60 from the following table

x : 1.0 1.25 1.50 1.75 2.00


y : 1.0000 1.0772 1.1447 1.2051 1.2599

Solution. The central difference table is

i xi yi ∆yi ∆ 2 yi ∆ 3 yi
−2 1.00 1.0000
0.0772
−1 1.25 1.0772 –0.0097
0.0675 0.0026
0 1.50 1.1447 –0.0071
0.0604 0.0015
1 1.75 1.2051 –0.0056
0.0548
2 2.00 1.2599

(i) For x = 1.40, we take x0 = 1.50; then s = (1.40 − 1.50)/0.25 = −0.4.
The Bessel's formula gives

y(1.40) = (y0 + y1)/2 + (s − 1/2)∆y0 + [s(s − 1)/2!] (∆^2 y0 + ∆^2 y−1)/2
          + [(s − 1/2)s(s − 1)/3!] ∆^3 y−1
        = (1.1447 + 1.2051)/2 + (−0.4 − 0.5) × 0.0604
          + [−0.4(−0.4 − 1)/2] × (−0.0071 − 0.0056)/2
          + [(−0.4 − 0.5)(−0.4)(−0.4 − 1)/6] × 0.0015
        = 1.118636.

(ii) For x = 1.60, we take x0 = 1.50, then s = (1.60 − 1.50)/0.25 = 0.4.


Using Stirling's formula,

y(1.60) = y0 + s (∆y−1 + ∆y0)/2 + (s^2/2!) ∆^2 y−1 + [s(s^2 − 1^2)/3!] (∆^3 y−2 + ∆^3 y−1)/2
        = 1.1447 + 0.4 × (0.0675 + 0.0604)/2 + [(0.4)^2/2] × (−0.0071)
          + [0.4(0.16 − 1)/6] × (0.0026 + 0.0015)/2
        = 1.1447 + 0.02558 − 0.000568 − 0.0001148 = 1.1695972.
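
The arithmetic of part (ii) can be checked with a few lines of C. The following sketch, written for this discussion (not one of the book's numbered programs), simply codes the terms of Stirling's formula up to the third differences, with the values taken from the above central difference table:

/* A quick check of Stirling's formula (3.53) on Example 3.9.1(ii). */
#include<stdio.h>
void main()
{
 float y0=1.1447, dym1=0.0675, dy0=0.0604;  /* first differences */
 float d2ym1=-0.0071;                       /* second difference */
 float d3ym2=0.0026, d3ym1=0.0015;          /* third differences */
 float s=(1.60-1.50)/0.25, y;
 y = y0 + s*(dym1+dy0)/2 + s*s/2*d2ym1
       + s*(s*s-1)/6*(d3ym2+d3ym1)/2;
 printf("y(1.60) = %f\n", y); /* prints 1.169597 */
}

It prints 1.169597, in agreement with the hand computation above.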

3.10 Everett’s Interpolation Formula

In this interpolation formula, 2n (even) equispaced arguments x−(n−1), x−(n−2), . . . , x−1, x0, x1, . . . , xn−1 and xn are considered. The Everett's interpolation formula is obtained from Gauss's forward interpolation formula (3.44) by replacing the odd order differences by lower order even differences, that is, by the substitution
∆^(2k+1) y−k = ∆^(2k) y−(k−1) − ∆^(2k) y−k.
That is,
∆y0 = y1 − y0;
∆^3 y−1 = ∆^2 y0 − ∆^2 y−1;
∆^5 y−2 = ∆^4 y−1 − ∆^4 y−2;
. . .
∆^(2n−1) y−(n−1) = ∆^(2n−2) y−(n−2) − ∆^(2n−2) y−(n−1).
On substitution of these relations, equation (3.44) yields

φ(x) = y0 + s(y1 − y0) + [s(s − 1)/2!] ∆^2 y−1 + [(s + 1)s(s − 1)/3!] (∆^2 y0 − ∆^2 y−1)
     + [(s + 1)s(s − 1)(s − 2)/4!] ∆^4 y−2
     + [(s + 2)(s + 1)s(s − 1)(s − 2)/5!] (∆^4 y−1 − ∆^4 y−2) + · · ·
     + [(s + n − 1)(s + n − 2) · · · (s + 1)s(s − 1) · · · (s − n + 1)/(2n − 1)!]
       × (∆^(2n−2) y−(n−2) − ∆^(2n−2) y−(n−1))
     = sy1 + [(s + 1)s(s − 1)/3!] ∆^2 y0 + [(s + 2)(s + 1)s(s − 1)(s − 2)/5!] ∆^4 y−1 + · · ·
     + [(s + n − 1)(s + n − 2) · · · (s + 1)s(s − 1) · · · (s − n + 1)/(2n − 1)!] ∆^(2n−2) y−(n−2)
     + (1 − s)y0 + [s(s − 1)/2! − (s + 1)s(s − 1)/3!] ∆^2 y−1
     + [(s + 1)s(s − 1)(s − 2)/4! − (s + 2)(s + 1)s(s − 1)(s − 2)/5!] ∆^4 y−2 + · · ·
     = sy1 + [s(s^2 − 1^2)/3!] ∆^2 y0 + [s(s^2 − 1^2)(s^2 − 2^2)/5!] ∆^4 y−1 + · · ·
     + [s(s^2 − 1^2)(s^2 − 2^2) · · · (s^2 − (n − 1)^2)/(2n − 1)!] ∆^(2n−2) y−(n−2)
     + uy0 + [u(u^2 − 1^2)/3!] ∆^2 y−1 + [u(u^2 − 1^2)(u^2 − 2^2)/5!] ∆^4 y−2 + · · ·
     + [u(u^2 − 1^2)(u^2 − 2^2) · · · (u^2 − (n − 1)^2)/(2n − 1)!] ∆^(2n−2) y−(n−1),     (3.59)

where s = (x − x0)/h and u = 1 − s.
h
Example 3.10.1 Use Everett’s interpolation formula to find the value of y when
x = 1.60 from the following table.
x : 1.0 1.25 1.50 1.75 2.00 2.25
y : 1.0000 1.1180 1.2247 1.3229 1.4142 1.5000

Solution. The difference table is


i xi yi ∆yi ∆ 2 yi ∆ 3 yi ∆ 4 yi
−2 1.00 1.0000
0.1180
−1 1.25 1.1180 −0.0113
0.1067 0.0028
0 1.50 1.2247 −0.0085 −0.0012
0.0982 0.0016
1 1.75 1.3229 −0.0069 −0.0002
0.0913 0.0014
2 2.00 1.4142 −0.0055
0.0858
3 2.25 1.5000
We take x0 = 1.50. Here h = 0.25.
Then s = (x − x0)/h = (1.60 − 1.50)/0.25 = 0.4 and u = 1 − s = 0.6.
Using Everett's formula, the value of y(1.60) is given by

y(1.60) = sy1 + [s(s^2 − 1^2)/3!] ∆^2 y0 + [s(s^2 − 1^2)(s^2 − 2^2)/5!] ∆^4 y−1
          + uy0 + [u(u^2 − 1^2)/3!] ∆^2 y−1 + [u(u^2 − 1^2)(u^2 − 2^2)/5!] ∆^4 y−2
        = 0.4 × 1.3229 + [0.4(0.16 − 1)/6] × (−0.0069)
          + [0.4(0.16 − 1)(0.16 − 4)/120] × (−0.0002) + 0.6 × 1.2247
          + [0.6(0.36 − 1)/6] × (−0.0085) + [0.6(0.36 − 1)(0.36 − 4)/120] × (−0.0012)
        = 0.5292 + 0.0004 − 0.0000025 + 0.7348 + 0.0005 − 0.00001
        = 1.2649.
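
As with the previous example, the computation can be verified with a short C sketch (again written for this discussion, with the differences taken from the table above):

/* A check of Everett's formula (3.59) on Example 3.10.1,
   using the terms up to the fourth differences. */
#include<stdio.h>
void main()
{
 float y0=1.2247, y1=1.3229;
 float d2ym1=-0.0085, d2y0=-0.0069;  /* second differences */
 float d4ym2=-0.0012, d4ym1=-0.0002; /* fourth differences */
 float s=(1.60-1.50)/0.25, u, y;
 u=1-s;
 y = s*y1 + s*(s*s-1)/6*d2y0 + s*(s*s-1)*(s*s-4)/120*d4ym1
   + u*y0 + u*(u*u-1)/6*d2ym1 + u*(u*u-1)*(u*u-4)/120*d4ym2;
 printf("y(1.60) = %f\n", y); /* prints about 1.26489 */
}

It prints about 1.26489, which agrees with the value 1.2649 obtained above to four decimal places (the small discrepancy comes from the rounding of the individual terms in the hand computation).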

3.10.1 Relation between Bessel's and Everett's formulae

These two formulae are closely related and one formula can be deduced from the other. Starting from Bessel's formula (3.57):

φ(x) = (y0 + y1)/2 + (s − 1/2)∆y0 + [s(s − 1)/2!] (∆^2 y0 + ∆^2 y−1)/2
     + [(s − 1/2)s(s − 1)/3!] ∆^3 y−1 + · · · ,   where s = (x − x0)/h,
     = (y0 + y1)/2 + (s − 1/2)(y1 − y0) + [s(s − 1)/2!] (∆^2 y0 + ∆^2 y−1)/2
     + [(s − 1/2)s(s − 1)/3!] (∆^2 y0 − ∆^2 y−1) + · · ·
     = (1 − s)y0 + [s(s − 1)/4 − s(s − 1)(s − 1/2)/6] ∆^2 y−1 + · · ·
     + sy1 + [s(s − 1)/4 + s(s − 1)(s − 1/2)/6] ∆^2 y0 + · · ·
     = sy1 + [s(s^2 − 1^2)/3!] ∆^2 y0 + · · · + uy0 + [u(u^2 − 1^2)/3!] ∆^2 y−1 + · · · ,

where u = 1 − s.

This is the Everett’s formula up to second order differences. From this deduction it is
clear that the Everett’s formula truncated after second order differences is equivalent to
the Bessel’s formula truncated after third differences. Conversely, the Bessel’s formula
may be deduced from Everett’s formula.

Central difference table

The difference table in Figure 3.3 shows how the different orders of differences are used in the different interpolation formulae. On a central difference table for the arguments x−3, . . . , x3 it marks the upper diagonal (through y−3) for Newton's backward formula, the lower diagonal (through y3) for Newton's forward formula, the horizontal path through y0 (the even differences ∆^2 y−1, ∆^4 y−2 together with the means of the adjacent odd differences) for Stirling's formula, and the path midway between y0 and y1 (∆y0, ∆^3 y−1 together with the means of the adjacent even differences) for Bessel's formula.

[Figure 3.3: Central difference table.]

Note 3.10.1 In Newton's forward and backward interpolation formulae the first or the last interpolating point is taken as the initial point x0. But, in central difference interpolation formulae, a middle point is taken as the initial point x0.

3.11 Interpolation by Iteration (Aitken’s Interpolation)

Newton's interpolation formula generates successively higher order interpolation formulae. The Aitken interpolation formula serves the same purpose, but it has the advantage that it can easily be programmed for a computer.
Let y = f(x) be given at n + 1 distinct points x0, x1, . . . , xn, i.e., yi = f(xi), i = 0, 1, . . . , n are given, where the points xi, i = 0, 1, . . . , n need not be equispaced. To compute the value of y for a given x, the iterations proceed as follows: a first approximation to y is obtained by taking the first two points; second approximations are then obtained from the first approximations, and so on.
The linear polynomial for the points x0 and x1 is

p01(x) = [(x − x1)/(x0 − x1)] y0 + [(x − x0)/(x1 − x0)] y1 = [(x1 − x)y0 − (x0 − x)y1]/(x1 − x0)
       = (1/(x1 − x0)) | y0   x0 − x |
                       | y1   x1 − x |.     (3.60)

In general,

p0j(x) = (1/(xj − x0)) | y0   x0 − x |
                       | yj   xj − x |,   j = 1, 2, . . . , n.     (3.61)

Here p0j(x) is a polynomial of degree less than or equal to 1, for the points x0 and xj.
The polynomial

p01j(x) = (1/(xj − x1)) | p01(x)   x1 − x |
                        | p0j(x)   xj − x |,   j = 2, 3, . . . , n,     (3.62)

is a polynomial of degree less than or equal to 2.


The polynomial p01j (x) interpolates the points x0 , x1 and xj .
In general, the polynomial for the (k + 2) points x0, x1, . . . , xk and xj is

p012···kj(x) = (1/(xj − xk)) | p012···k(x)        xk − x |
                             | p012···(k−1)j(x)   xj − x |,   j = k + 1, . . . , n.     (3.63)

This polynomial is of degree at most (k + 1) and interpolates the points x0, x1, . . . , xk−1, xk and xj.
The tabular representation of this form is

xj yj p0j p01j p012j p0123j xj − x


x0 y0 x0 − x
x1 y1 p01 x1 − x
x2 y2 p02 p012 x2 − x
x3 y3 p03 p013 p0123 x3 − x
x4 y4 p04 p014 p0124 p01234 x4 − x
··· ··· ··· ··· ··· ··· ···
xn yn p0n p01n p012n p0123n xn − x

Example 3.11.1 Find the value of y(1.52) by iterated linear interpolation using
the following table.

x : 1.4 1.5 1.6 1.7


y(x) : 1.8330 1.9365 2.0396 2.1424

Solution. Here x = 1.52.


The calculations are shown below.

xj yj p0j p01j p012j xj − x


1.4 1.8330 −0.12
1.5 1.9365 1.9572 −0.02
1.6 2.0396 1.9570 1.9572 0.08
1.7 2.1424 1.9568 1.9572 1.9572 0.18

Now,
p0j = (1/(xj − x0)) | y0   x0 − x |
                    | yj   xj − x |,   j = 1, 2, 3.
p01 = (1/0.1)[1.8330 × (−0.02) − 1.9365 × (−0.12)] = 1.9572.
p02 = (1/0.2)[1.8330 × 0.08 − 2.0396 × (−0.12)] = 1.9570.
p03 = (1/0.3)[1.8330 × 0.18 − 2.1424 × (−0.12)] = 1.9568.
p01j = (1/(xj − x1)) | p01   x1 − x |
                     | p0j   xj − x |,   j = 2, 3.
p012 = (1/0.1)[1.9572 × 0.08 − 1.9570 × (−0.02)] = 1.9572.
p013 = (1/0.2)[1.9572 × 0.18 − 1.9568 × (−0.02)] = 1.9572.
p012j = (1/(xj − x2)) | p012   x2 − x |
                      | p01j   xj − x |,   j = 3.
p0123 = (1/0.1)[1.9572 × 0.18 − 1.9572 × 0.08] = 1.9572.
Therefore the interpolated value of y at x = 1.52 is 1.9572.

Algorithm 3.3 (Aitken Interpolation). This algorithm determines the value of


y at a given x from the table of (xi , yi ), i = 0, 1, 2, . . . , n, by Aitken’s iterative inter-
polation formula.

Algorithm Aitken Interpolation


Step 1: Read n, x0 , x1 , . . . , xn ; y0 , y1 , . . . , yn and xg;
// n represents the number of points, xg is the value of
x at which y is to be calculated.//
Step 2: for j = 0 to n do // reading of tabulated values//
Step 2.1. Set p(j, 0) = y(j);
Step 2.2. Compute xd(j) = x(j) − xg;
endfor;
Step 3: for k = 0 to n do
for j = k + 1 to n do
p(j, k + 1) = [p(k, k) ∗ xd(j) − p(j, k) ∗ xd(k)]/[x(j) − x(k)];
endfor;
Step 4: // printing of the table//
for j = 0 to n do
Print x(j), (p(j, k), k = 0 to j), xd(j);
endfor;
Step 5: //Printing of the value of the polynomial.//
Print p(n, n);
end Aitken Interpolation

Program 3.3.
/* Program Aitken Interpolation
This program implements Aitken’s interpolation
formula; xg is the interpolating points. */
#include <stdio.h>
#include <math.h>
void main()
{
int n, j, k;
float x[20], y[20], xg, p[20][20], xd[20];

printf("Enter the value of n and the data points


in the form x[i],y[i] ");
scanf("%d", &n);
for(j=0;j<=n;j++) scanf("%f %f", &x[j],&y[j]);
printf("\nEnter the value of x ");
scanf("%f",&xg);
for(j=0;j<=n;j++){
p[j][0]=y[j]; xd[j]=x[j]-xg;
}
for(k=0;k<=n;k++)
for(j=k+1;j<=n;j++)
p[j][k+1]=(p[k][k]*xd[j]-p[j][k]*xd[k])/(x[j]-x[k]);
for(j=0;j<=n;j++){
printf("%6.4f ",x[j]);
for(k=0;k<=j;k++) printf(" %6.4f",p[j][k]);
for(k=j+1;k<=n;k++) printf(" ");
printf(" %6.4f\n",xd[j]);
}
printf("The value of y at x= %6.3f is %8.5f ",xg,p[n][n]);
} /* main */

A sample of input/output:

Enter the value of n and the data points in the form x[i],y[i] 4
1921 46
1931 68
1941 83
1951 95
1961 105
Enter the value of x 1955
1921.0000 46.0000 -34.0000
1931.0000 68.0000 120.8000 -24.0000
1941.0000 83.0000 108.9000 92.2400 -14.0000
1951.0000 95.0000 101.5333 97.6800 99.8560 -4.0000
1961.0000 105.0000 96.1500 101.0800 98.4280 99.2848 6.0000
The value of y at x= 1955.000 is 99.28481

3.12 Divided Differences and their Properties

The Lagrange's interpolation formula has a disadvantage: if a new interpolation point is added or removed then the Lagrangian functions Li(x) have to be recomputed. But the Newton's general interpolation formula, based on divided differences, removes this drawback.
Let yi = f(xi), i = 0, 1, . . . , n be known at n + 1 points x0, x1, . . . , xn. The points x0, x1, . . . , xn are not necessarily equispaced. Then the divided differences of different orders are defined as follows:
Zeroth order divided difference:
f[x0] = f(x0).
First order divided difference:
f[x0, x1] = [f(x0) − f(x1)]/(x0 − x1).
In general, f[xi, xj] = [f(xi) − f(xj)]/(xi − xj).
Second order divided difference:
f[x0, x1, x2] = [f[x0, x1] − f[x1, x2]]/(x0 − x2).
In general, f[xi, xj, xk] = [f[xi, xj] − f[xj, xk]]/(xi − xk).
nth order divided difference:
f[x0, x1, . . . , xn] = [f[x0, x1, . . . , xn−1] − f[x1, x2, . . . , xn]]/(x0 − xn).

3.12.1 Properties of divided differences


1. Divided difference of a constant is zero.
Let f(x) = c.
Then f[x0, x1] = [f(x0) − f(x1)]/(x0 − x1) = (c − c)/(x0 − x1) = 0.
2. Divided difference of cf(x), c a constant, is the divided difference of f(x) multiplied by c.
Let g(x) = cf(x). Therefore,
g[x0, x1] = [g(x0) − g(x1)]/(x0 − x1) = [cf(x0) − cf(x1)]/(x0 − x1)
          = c [f(x0) − f(x1)]/(x0 − x1) = c f[x0, x1].

3. Divided difference is linear.
Let h(x) = af(x) + bg(x). Now,
h[x0, x1] = [h(x0) − h(x1)]/(x0 − x1) = [af(x0) + bg(x0) − af(x1) − bg(x1)]/(x0 − x1)
          = a [f(x0) − f(x1)]/(x0 − x1) + b [g(x0) − g(x1)]/(x0 − x1)
          = a f[x0, x1] + b g[x0, x1].
That is, divided difference obeys the linear property.

4. Divided differences are symmetric functions.
The first order divided difference is
f[x0, x1] = [f(x0) − f(x1)]/(x0 − x1) = [f(x1) − f(x0)]/(x1 − x0) = f[x1, x0].
That is, the first order difference is symmetric.
Also, f[x0, x1] = f(x0)/(x0 − x1) + f(x1)/(x1 − x0).
The second order difference is
f[x0, x1, x2] = [f[x0, x1] − f[x1, x2]]/(x0 − x2)
 = (1/(x0 − x2)) {[f(x0)/(x0 − x1) + f(x1)/(x1 − x0)] − [f(x1)/(x1 − x2) + f(x2)/(x2 − x1)]}
 = f(x0)/[(x0 − x2)(x0 − x1)] + f(x1)/[(x1 − x0)(x1 − x2)] + f(x2)/[(x2 − x0)(x2 − x1)].
Similarly, it can be shown that
f[x0, x1, . . . , xn] = f(x0)/[(x0 − x1)(x0 − x2) · · · (x0 − xn)]
 + f(x1)/[(x1 − x0)(x1 − x2) · · · (x1 − xn)] + · · ·
 + f(xn)/[(xn − x0)(xn − x1) · · · (xn − xn−1)]
 = Σ_{i=0}^{n} f(xi)/[(xi − x0) · · · (xi − xi−1)(xi − xi+1) · · · (xi − xn)].     (3.64)
From these relations it is easy to observe that the divided differences are symmetric.
5. For equispaced arguments, the divided differences can be expressed in terms of forward differences. That is,
f[x0, x1, . . . , xn] = ∆^n y0/(n! h^n).
In this case, xi = x0 + ih, i = 0, 1, . . . , n.
f[x0, x1] = [f(x1) − f(x0)]/(x1 − x0) = (y1 − y0)/h = ∆y0/h.
f[x0, x1, x2] = [f[x0, x1] − f[x1, x2]]/(x0 − x2) = (1/(−2h)) [∆y0/h − ∆y1/h]
             = (∆y1 − ∆y0)/(2h^2) = ∆^2 y0/(2! h^2).
The result is true for n = 1, 2. Let the result be true for n = k, i.e., f[x0, x1, . . . , xk] = ∆^k y0/(k! h^k).
Now,
f[x0, x1, . . . , xk, xk+1] = [f[x0, x1, . . . , xk] − f[x1, x2, . . . , xk+1]]/(x0 − xk+1)
 = [−1/(xk+1 − x0)] [∆^k y0/(k! h^k) − ∆^k y1/(k! h^k)]
 = [∆^k y1 − ∆^k y0]/[(k + 1) k! h^(k+1)]
 = ∆^(k+1) y0/[(k + 1)! h^(k+1)].
Hence, by mathematical induction,
f[x0, x1, . . . , xn] = ∆^n y0/(n! h^n).     (3.65)

6. Divided differences for equal arguments (confluent divided differences).
If two arguments are equal then the divided difference, as defined, has no meaning, since the denominator becomes zero. But, by a limiting process one can define divided differences for equal arguments; these are known as confluent divided differences.
f[x0, x0] = lim_{ε→0} f[x0, x0 + ε] = lim_{ε→0} [f(x0 + ε) − f(x0)]/ε = f′(x0),
provided f(x) is differentiable.
f[x0, x0, x0] = lim_{ε→0} f[x0, x0, x0 + ε] = lim_{ε→0} [f[x0, x0 + ε] − f[x0, x0]]/ε
 = lim_{ε→0} [{f(x0 + ε) − f(x0)}/ε − f′(x0)]/ε
 = lim_{ε→0} [f(x0 + ε) − f(x0) − εf′(x0)]/ε^2     (0/0 form)
 = lim_{ε→0} [f′(x0 + ε) − f′(x0)]/(2ε)     (by L'Hospital's rule)
 = f″(x0)/2!.
Similarly, it can be shown that f[x0, x0, x0, x0] = f‴(x0)/3!.
In general,
f[x0, x0, . . . , x0]  ((k + 1) times)  = f^(k)(x0)/k!.     (3.66)
In other words,
(d^k/dx^k) f(x0) = k! f[x0, x0, . . . , x0]  ((k + 1) times).     (3.67)

7. The nth order divided difference of a polynomial of degree n is constant.
Let f(x) = a0 x^n + a1 x^(n−1) + a2 x^(n−2) + · · · + an−1 x + an, (a0 ≠ 0) be a polynomial of degree n. Then

f[x, x0] = [f(x) − f(x0)]/(x − x0)
 = a0 (x^n − x0^n)/(x − x0) + a1 (x^(n−1) − x0^(n−1))/(x − x0)
   + a2 (x^(n−2) − x0^(n−2))/(x − x0) + · · · + an−1 (x − x0)/(x − x0)
 = a0 [x^(n−1) + x^(n−2) x0 + x^(n−3) x0^2 + · · · + x x0^(n−2) + x0^(n−1)]
   + a1 [x^(n−2) + x^(n−3) x0 + x^(n−4) x0^2 + · · · + x x0^(n−3) + x0^(n−2)] + · · · + an−1
 = a0 x^(n−1) + (a0 x0 + a1) x^(n−2) + (a0 x0^2 + a1 x0 + a2) x^(n−3) + · · ·
 = b0 x^(n−1) + b1 x^(n−2) + b2 x^(n−3) + · · · + bn−1,
where b0 = a0, b1 = a0 x0 + a1, b2 = a0 x0^2 + a1 x0 + a2, . . . , bn−1 = a0 x0^(n−1) + a1 x0^(n−2) + · · · + an−1.
Thus, the first order divided difference of a polynomial of degree n is a polynomial of degree n − 1.
Again, the second order divided difference is
f[x, x0, x1] = [f[x0, x] − f[x0, x1]]/(x − x1)
 = b0 (x^(n−1) − x1^(n−1))/(x − x1) + b1 (x^(n−2) − x1^(n−2))/(x − x1)
   + b2 (x^(n−3) − x1^(n−3))/(x − x1) + · · · + bn−2 (x − x1)/(x − x1)
 = b0 [x^(n−2) + x^(n−3) x1 + · · · + x1^(n−2)] + b1 [x^(n−3) + x^(n−4) x1 + · · · + x1^(n−3)] + · · · + bn−2
 = c0 x^(n−2) + c1 x^(n−3) + c2 x^(n−4) + · · · + cn−2,
where c0 = b0, c1 = b0 x1 + b1, c2 = b0 x1^2 + b1 x1 + b2, . . . .
This is a polynomial of degree n − 2. So, the second order divided difference is a polynomial of degree n − 2.
In this way, it can be shown that the nth order divided difference of a polynomial of degree n is constant and is equal to a0.

3.13 Newton’s Fundamental Interpolation Formula

Suppose the function y = f (x) is known at the points x0 , x1 , . . . , xn and yi = f (xi ),


i = 0, 1, . . . , n. The points xi , i = 0, 1, . . . , n need not be equispaced.

Now, from the definition of the first order divided difference,
f[x0, x] = [f(x) − f(x0)]/(x − x0).
Then f(x) = f(x0) + (x − x0) f[x0, x].
From the second order divided difference,
f[x0, x1, x] = [f[x0, x1] − f[x0, x]]/(x1 − x),
i.e., f[x0, x] = f[x0, x1] + (x − x1) f[x0, x1, x],
i.e., [f(x) − f(x0)]/(x − x0) = f[x0, x1] + (x − x1) f[x0, x1, x].
Thus, f(x) = f(x0) + (x − x0) f[x0, x1] + (x − x0)(x − x1) f[x0, x1, x].
Similarly, for the arguments x0, x1, x2, . . . , xn, x,

f(x) = f(x0) + (x − x0) f[x0, x1] + (x − x0)(x − x1) f[x0, x1, x2]
     + (x − x0)(x − x1)(x − x2) f[x0, x1, x2, x3] + · · ·
     + (x − x0)(x − x1) · · · (x − xn−1) f[x0, x1, . . . , xn]
     + (x − x0)(x − x1) · · · (x − xn) f[x0, x1, x2, . . . , xn, x].     (3.68)

This formula is known as Newton's fundamental or Newton's general interpolation formula including the error term.
The tabular form of divided differences is shown in Table 3.4.

Table 3.4: Divided difference table.

           x     f(x)      First        Second          Third
           x0    f(x0)
x0 − x1                    f[x0, x1]
x0 − x2    x1    f(x1)                  f[x0, x1, x2]
x1 − x2                    f[x1, x2]                    f[x0, x1, x2, x3]
x1 − x3    x2    f(x2)                  f[x1, x2, x3]
x2 − x3                    f[x2, x3]
           x3    f(x3)
Error term

The error term is

E(x) = (x − x0)(x − x1) · · · (x − xn) f[x0, x1, . . . , xn, x]
     = (x − x0)(x − x1) · · · (x − xn) f^(n+1)(ξ)/(n + 1)!,   [using (3.66)]     (3.69)

where min{x0, x1, . . . , xn, x} < ξ < max{x0, x1, . . . , xn, x}.

Example 3.13.1 Find the value of y when x = 1.5 from the following table:

x : 1 5 7 10 12
y : 0.6931 1.7918 2.0794 2.3979 2.5649

using Newton’s divided difference formula.

Solution. The divided difference table is

x y 1st 2nd 3rd 4th


1 0.6931
–4 0.2747
–6 5 1.7918 –0.0218
–9 –2 0.1438 0.0016
–11 –5 7 2.0794 –0.0075 –0.0001
–7 –3 0.1062 0.0004
–5 10 2.3979 –0.0045
–2 0.0835
12 2.5649

Here x = 1.5, x0 = 1, x1 = 5, x2 = 7, x3 = 10, x4 = 12,


f [x0 ] = 0.6931, f [x0 , x1 ] = 0.2747, f [x0 , x1 , x2 ] = −0.0218,
f [x0 , x1 , x2 , x3 ] = 0.0016, f [x0 , x1 , x2 , x3 , x4 ] = −0.0001.
The Newton’s divided difference formula is

y(1.5) = f [x0 ] + (x − x0 )f [x0 , x1 ] + (x − x0 )(x − x1 )f [x0 , x1 , x2 ]


+(x − x0 )(x − x1 )(x − x2 )f [x0 , x1 , x2 , x3 ]
+(x − x0 )(x − x1 )(x − x2 )(x − x3 )f [x0 , x1 , x2 , x3 , x4 ]
= 0.6931 + 0.1374 + 0.0382 + 0.0154 + 0.0082
= 0.8922.

3.14 Deductions of other Interpolation Formulae from Newton’s


Divided Difference Formula

3.14.1 Newton’s forward difference interpolation formula


If the arguments x0, x1, . . . , xn are equispaced, then xi = x0 + ih, i = 0, 1, . . . , n, and
f[x0, x1, . . . , xn] = ∆^n f(x0)/(n! h^n)   [see (3.65)].
In this particular case, the Newton's divided difference formula becomes

φ(x) = f(x0) + (x − x0) ∆f(x0)/(1!h) + (x − x0)(x − x1) ∆^2 f(x0)/(2!h^2)
     + (x − x0)(x − x1)(x − x2) ∆^3 f(x0)/(3!h^3) + · · ·
     + (x − x0)(x − x1) · · · (x − xn−1) ∆^n f(x0)/(n!h^n) + E(x),

where
E(x) = (x − x0)(x − x1) · · · (x − xn) f[x, x0, x1, . . . , xn]
     = (x − x0)(x − x1) · · · (x − xn) f^(n+1)(ξ)/(n + 1)!.

Now, a unitless quantity u = (x − x0)/h, i.e., x = x0 + uh, is introduced to simplify the formula. So, x − xi = (u − i)h. Then

φ(x) = f(x0) + u∆f(x0) + [u(u − 1)/2!] ∆^2 f(x0) + · · ·
     + [u(u − 1)(u − 2) · · · (u − n + 1)/n!] ∆^n f(x0) + E(x),
E(x) = u(u − 1)(u − 2) · · · (u − n) h^(n+1) f^(n+1)(ξ)/(n + 1)!,

where min{x, x0, x1, . . . , xn} < ξ < max{x, x0, x1, . . . , xn}.

3.14.2 Newton’s backward difference interpolation formula


Let
φ(x) = f(xn) + (x − xn) f[xn, xn−1] + (x − xn)(x − xn−1) f[xn, xn−1, xn−2]
     + · · · + (x − xn)(x − xn−1) · · · (x − x1) f[xn, xn−1, . . . , x1, x0] + E(x),
where
E(x) = (x − xn)(x − xn−1) · · · (x − x1)(x − x0) f[x, xn, xn−1, . . . , x1, x0].
From the relation (3.65), we have
f[xn, xn−1, . . . , xn−k] = ∆^k f(xn−k)/(k!h^k) = ∇^k f(xn)/(k!h^k).
Therefore,

φ(x) = f(xn) + (x − xn) ∇f(xn)/(1!h) + (x − xn)(x − xn−1) ∇^2 f(xn)/(2!h^2) + · · ·
     + (x − xn)(x − xn−1) · · · (x − x1) ∇^n f(xn)/(n!h^n) + E(x),

where
E(x) = (x − xn)(x − xn−1) · · · (x − x1)(x − x0) f^(n+1)(ξ)/(n + 1)!,
with min{x, x0, x1, . . . , xn} < ξ < max{x, x0, x1, . . . , xn}.

3.14.3 Lagrange’s interpolation formula


From the definition of the (n + 1)th order divided difference and (3.64) we have

f[x0, x1, . . . , xn] = Σ_{i=0}^{n} f(xi)/[(xi − x0)(xi − x1) · · · (xi − xi−1)(xi − xi+1) · · · (xi − xn)].

For the (n + 2) arguments x, x0, . . . , xn,

f[x, x0, x1, . . . , xn] = f(x)/[(x − x0)(x − x1) · · · (x − xn)]
 + Σ_{i=0}^{n} f(xi)/[(xi − x)(xi − x0) · · · (xi − xi−1)(xi − xi+1) · · · (xi − xn)]
 = f(x)/w(x) + Σ_{i=0}^{n} f(xi)/[(xi − x) w′(xi)],

where w(x) = (x − x0)(x − x1) · · · (x − xn).
Therefore,

f(x) = Σ_{i=0}^{n} w(x) f(xi)/[(x − xi) w′(xi)] + w(x) f[x, x0, x1, . . . , xn]
     = Σ_{i=0}^{n} Li(x) f(xi) + w(x) f^(n+1)(ξ)/(n + 1)!,   [using (3.65)]

where min{x0, x1, . . . , xn, x} < ξ < max{x0, x1, . . . , xn, x} and
Li(x) = w(x)/[(x − xi) w′(xi)].
This is the Lagrange's interpolation formula with error term.

Note 3.14.1 It is observed that the divided differences for equispaced arguments produce the Newton forward and backward difference formulae. Also, divided difference interpolation gives Lagrange's interpolation formula.
Actually, Lagrange's interpolation formula and Newton's divided difference interpolation formula are equivalent. This fact is proved in the following.

3.15 Equivalence of Lagrange’s and Newton’s divided


difference formulae

In the following, it is proved that the Lagrange’s interpolation formula and Newton’s
divided difference interpolation formula are equivalent.
The Lagrange's interpolation polynomial for the points (xi, yi), i = 0, 1, . . . , n of degree n is

φ(x) = Σ_{i=0}^{n} Li(x) yi,     (3.70)

where
Li(x) = (x − x0)(x − x1) · · · (x − xi−1)(x − xi+1) · · · (x − xn)/[(xi − x0)(xi − x1) · · · (xi − xi−1)(xi − xi+1) · · · (xi − xn)].     (3.71)

The Newton's interpolation formula with divided differences is given by

φ(x) = f(x0) + (x − x0) f[x0, x1] + (x − x0)(x − x1) f[x0, x1, x2] + · · ·
     + (x − x0)(x − x1) · · · (x − xn−1) f[x0, x1, . . . , xn]
     = f(x0) + (x − x0) [f(x0)/(x0 − x1) + f(x1)/(x1 − x0)]
     + (x − x0)(x − x1) × [f(x0)/((x0 − x1)(x0 − x2)) + f(x1)/((x1 − x0)(x1 − x2))
       + f(x2)/((x2 − x0)(x2 − x1))]
     + · · · + (x − x0)(x − x1) · · · (x − xn−1) × [f(x0)/((x0 − x1) · · · (x0 − xn))
       + · · · + f(xn)/((xn − x0) · · · (xn − xn−1))].     (3.72)
The coefficient of f(x0) in the above expression is

1 + (x − x0)/(x0 − x1) + (x − x0)(x − x1)/[(x0 − x1)(x0 − x2)] + · · ·
  + (x − x0)(x − x1) · · · (x − xn−1)/[(x0 − x1)(x0 − x2) · · · (x0 − xn)]
= [(x − x1)/(x0 − x1)] {1 + (x − x0)/(x0 − x2) + (x − x0)(x − x2)/[(x0 − x2)(x0 − x3)]
  + · · · + (x − x0)(x − x2) · · · (x − xn−1)/[(x0 − x2)(x0 − x3) · · · (x0 − xn)]}
= [(x − x1)(x − x2)/((x0 − x1)(x0 − x2))] {1 + (x − x0)/(x0 − x3) + · · ·
  + (x − x0)(x − x3) · · · (x − xn−1)/[(x0 − x3)(x0 − x4) · · · (x0 − xn)]}
= · · ·
= [(x − x1)(x − x2) · · · (x − xn−1)/((x0 − x1)(x0 − x2) · · · (x0 − xn−1))] {1 + (x − x0)/(x0 − xn)}
= (x − x1)(x − x2) · · · (x − xn−1)(x − xn)/[(x0 − x1)(x0 − x2) · · · (x0 − xn−1)(x0 − xn)]
= L0(x).

Similarly, it can be shown that the coefficient of f(x1) is L1(x), the coefficient of f(x2) is L2(x) and so on.
Thus, (3.72) becomes

φ(x) = L0(x) f(x0) + L1(x) f(x1) + · · · + Ln(x) f(xn) = Σ_{i=0}^{n} Li(x) f(xi).

Thus the Lagrange’s interpolation and Newton’s divided difference interpolation for-
mulae are equivalent.
Example 3.15.1 A function y = f (x) is given at the points x = x0 , x1 , x2 . Show
that the Newton’s divided difference interpolation formula and the corresponding
Lagrange’s interpolation formula are identical.

Solution. The Newton's divided difference formula is given as

y = f(x) = y0 + (x − x0) f[x0, x1] + (x − x0)(x − x1) f[x0, x1, x2]
 = y0 + (x − x0) [f(x1) − f(x0)]/(x1 − x0) + (x − x0)(x − x1)
   × [f(x0)/((x0 − x1)(x0 − x2)) + f(x1)/((x1 − x0)(x1 − x2)) + f(x2)/((x2 − x0)(x2 − x1))]
 = [1 − (x0 − x)/(x0 − x1) + (x − x0)(x − x1)/((x0 − x1)(x0 − x2))] f(x0)
   + [(x − x0)/(x1 − x0) + (x − x0)(x − x1)/((x1 − x0)(x1 − x2))] f(x1)
   + [(x − x0)(x − x1)/((x2 − x0)(x2 − x1))] f(x2)
 = [(x − x1)(x − x2)/((x0 − x1)(x0 − x2))] f(x0) + [(x − x0)(x − x2)/((x1 − x0)(x1 − x2))] f(x1)
   + [(x − x0)(x − x1)/((x2 − x0)(x2 − x1))] f(x2),

which is the Lagrange's interpolation polynomial. Hence Newton's divided difference interpolation polynomial and Lagrange's interpolation polynomial are equivalent.

Example 3.15.2 For the following table, find the interpolation polynomial using
(i) Lagrange’s formula and (ii) Newton’s divided difference formula, and hence show
that both represent same interpolating polynomial.
x : 0 2 4 8
f(x) : 3 8 11 19

Solution. (i) The Lagrange's interpolation polynomial is

φ(x) = [(x − 2)(x − 4)(x − 8)/((0 − 2)(0 − 4)(0 − 8))] × 3 + [(x − 0)(x − 4)(x − 8)/((2 − 0)(2 − 4)(2 − 8))] × 8
     + [(x − 0)(x − 2)(x − 8)/((4 − 0)(4 − 2)(4 − 8))] × 11 + [(x − 0)(x − 2)(x − 4)/((8 − 0)(8 − 2)(8 − 4))] × 19
     = [(x^3 − 14x^2 + 56x − 64)/(−64)] × 3 + [(x^3 − 12x^2 + 32x)/24] × 8
     + [(x^3 − 10x^2 + 16x)/(−32)] × 11 + [(x^3 − 6x^2 + 8x)/192] × 19
     = (1/24) x^3 − (1/2) x^2 + (10/3) x + 3.
(ii) The divided difference table is

x f (x) 1st divided 2nd divided 3rd divided


difference difference difference
0 3
2 8 5/2
4 11 3/2 –1/4
8 19 2 1/12 1/24
Newton's divided difference polynomial is

φ(x) = 3 + (x − 0) × (5/2) + (x − 0)(x − 2) × (−1/4) + (x − 0)(x − 2)(x − 4) × (1/24)
     = 3 + (5/2) x − (1/4)(x^2 − 2x) + (1/24)(x^3 − 6x^2 + 8x)
     = (1/24) x^3 − (1/2) x^2 + (10/3) x + 3.
Thus, it is observed that the interpolating polynomial obtained by both Lagrange's and Newton's divided difference formulae is one and the same.

It may be noted that Newton's formula involves fewer arithmetic operations than Lagrange's formula.

Algorithm 3.4 (Divided difference). This algorithm finds the value of y at a


given point x from a table (xi , yi ), i = 0, 1, 2, . . . , n, by Newton’s divided difference
interpolation formula.

Let
yk = f(xk) = dk,0,
f[xk, xk−1] = dk,1 = [f(xk) − f(xk−1)]/(xk − xk−1) = (dk,0 − dk−1,0)/(xk − xk−1),
f[xk, xk−1, xk−2] = dk,2 = [f[xk, xk−1] − f[xk−1, xk−2]]/(xk − xk−2) = (dk,1 − dk−1,1)/(xk − xk−2),
f[xk, xk−1, xk−2, xk−3] = dk,3 = (dk,2 − dk−1,2)/(xk − xk−3).
In general,
dk,i = (dk,i−1 − dk−1,i−1)/(xk − xk−i), i = 1, 2, . . . , n.     (3.73)

Using the above notations, the Newton’s divided difference formula (3.68) can be
written in the following form.

f (x) = d0,0 + d1,1 (x − x0 ) + d2,2 (x − x0 )(x − x1 )


+d3,3 (x − x0 )(x − x1 )(x − x2 ) + · · ·
+dn,n (x − x0 )(x − x1 ) · · · (x − xn−1 )
     = dn,n (x − xn−1)(x − xn−2)(x − xn−3) · · · (x − x0)
       + dn−1,n−1 (x − xn−2)(x − xn−3) · · · (x − x0)
       + dn−2,n−2 (x − xn−3) · · · (x − x0) + · · · + d1,1 (x − x0) + d0,0
     = (· · · (((dn,n (x − xn−1) + dn−1,n−1)(x − xn−2) + dn−2,n−2)(x − xn−3)
       + dn−3,n−3)(x − xn−4) + · · · ) + d0,0.

Algorithm Divided difference


Step 1: Read n; //the degree of the polynomial//
Read (x(i), y(i)), i = 0, 1, . . . , n //the points//
Step 2: for k = 0 to n do
Set d(k, 0) = y(k);
Step 3: for i = 0 to n do //compute all divided differences//
for k = i to n do
d(k, i) = (d(k, i − 1) − d(k − 1, i − 1))/(x(k) − x(k − i));
Step 4: Read xg; //for which the polynomial is to be calculated//
Step 5: Set sum = d(n, n) //initialization of sum//
Step 6: for k = n − 1 to 0 do
Compute sum = sum ∗ (xg − x(k)) + d(k, k);
Step 7: Print xg, sum;
end Divided difference
Program 3.4.
/* Program Divided Difference Interpolation
This program finds the value of y=f(x) at a given x when
the function is supplied as (x[i],y[i]), i=0, 1, ..., n.
*/
#include<stdio.h>
#include<math.h>
void main()
{
int i,k,n;
float x[20],y[20],xg,sum,d[20][20];
printf("Enter number of subintervals ");
scanf("%d",&n);
printf("Enter x and y values ");
for(i=0;i<=n;i++)
scanf("%f %f",&x[i],&y[i]);
printf("Enter interpolating point x ");
scanf("%f",&xg);
printf("The given values of x and y are\nx-value y-value\n");

for(i=0;i<=n;i++) printf("%f %f\n",x[i],y[i]);


for(k=0;k<=n;k++) d[k][0]=y[k];
for(i=1;i<=n;i++)
for(k=i;k<=n;k++)
d[k][i]=(d[k][i-1]-d[k-1][i-1])/(x[k]-x[k-i]);
sum=d[n][n];
for(k=n-1;k>=0;k--) sum=sum*(xg-x[k])+d[k][k];
printf("The interpolated value at x=%f is %f ",xg,sum);
}

A sample of input/output:

Enter number of subintervals 4


Enter x and y values
0.10 1.1052
0.15 1.1618
0.20 1.2214
0.25 1.2840
0.30 1.3499
Enter interpolating point x 0.12
The given values of x and y are
x-value y-value
0.100000 1.105200
0.150000 1.161800
0.200000 1.221400
0.250000 1.284000
0.300000 1.349900
The interpolated value at x=0.120000 is 1.127468

3.16 Inverse Interpolation

In interpolation, for a given set of values of x and y, the value of y is determined for a
given value of x. But the inverse interpolation is the process which finds the value of
x for a given y. Commonly used inverse interpolation formulae are based on successive
iteration.
In the following, three inverse interpolation formulae based on Lagrange, Newton
forward and Newton backward interpolation formulae are described. The inverse inter-
polation based on Lagrange’s formula is a direct method while the formulae based on
Newton’s interpolation formulae are iterative.

3.16.1 Inverse interpolation based on Lagrange’s formula


The Lagrange's interpolation formula of y on x is

y = Σ_{i=0}^{n} [w(x)/((x − xi) w′(xi))] yi.

When x and y are interchanged, the above relation changes to

x = Σ_{i=0}^{n} [w(y)/((y − yi) w′(yi))] xi = Σ_{i=0}^{n} Li(y) xi,

where
Li(y) = w(y)/[(y − yi) w′(yi)]
      = (y − y0)(y − y1) · · · (y − yi−1)(y − yi+1) · · · (y − yn)/[(yi − y0)(yi − y1) · · · (yi − yi−1)(yi − yi+1) · · · (yi − yn)].

This formula gives the value of x for a given value of y; it is known as Lagrange's inverse interpolation formula.
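
A direct C sketch of this formula follows; it is an illustration written for this discussion, not one of the book's numbered programs. It forms the Lagrangian coefficients in the y-values, with the roles of x and y interchanged.

/* A sketch of Lagrange's inverse interpolation. */
#include<stdio.h>
void main()
{
 int i,j,n; float x[20],y[20],yg,L,xval=0;
 printf("Enter number of subintervals ");
 scanf("%d",&n);
 printf("Enter x and y values ");
 for(i=0;i<=n;i++) scanf("%f %f",&x[i],&y[i]);
 printf("Enter the value of y ");
 scanf("%f",&yg);
 for(i=0;i<=n;i++)
 {
  L=1; /* the Lagrangian coefficient L_i(yg) */
  for(j=0;j<=n;j++)
   if(j!=i) L*=(yg-y[j])/(y[i]-y[j]);
  xval+=L*x[i];
 }
 printf("The value of x at y=%f is %f ",yg,xval);
}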

3.16.2 Method of successive approximations


Based on Newton’s forward difference interpolation formula
The Newton's forward difference interpolation formula is

y = y0 + u∆y0 + [u(u − 1)/2!] ∆^2 y0 + [u(u − 1)(u − 2)/3!] ∆^3 y0 + · · ·
  + [u(u − 1)(u − 2) · · · (u − n + 1)/n!] ∆^n y0,

where u = (x − x0)/h.
The above formula can be written as

u = (1/∆y0) [y − y0 − {u(u − 1)/2!} ∆^2 y0 − {u(u − 1)(u − 2)/3!} ∆^3 y0 − · · ·
    − {u(u − 1)(u − 2) · · · (u − n + 1)/n!} ∆^n y0].

Let the first approximation of u be denoted by u(1); it is obtained by neglecting the second and higher order differences:

u(1) = (y − y0)/∆y0.

Next, the second approximation, u(2) , is obtained by neglecting third and higher order
differences as follows:

1  u(1) (u(1) − 1) 2 
u(2) = y − y0 − ∆ y0 .
∆y0 2!

Similarly, the third approximation u(3) is given by

1  u(2) (u(2) − 1) 2 u(2) (u(2) − 1)(u(2) − 2) 3 


u(3) = y − y0 − ∆ y0 − ∆ y0 .
∆y0 2! 3!

In general,

1 u(k) (u(k) − 1) 2 u(k) (u(k) − 1)(u(k) − 2) 3


u(k+1) = y − y0 − ∆ y0 − ∆ y0
∆y0 2! 3!
u(k) (u(k) − 1) · · · (u(k) − k) k+1
−··· − ∆ y0 ,
(k + 1)!

k = 0, 1, 2, . . . .
This process of approximation should be continued till two successive approximations
u (k+1) and u(k) be equal up to desired number of decimal places. Then the value of x is
obtained from the relation x = x0 + u(k+1) h.
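
The iteration can be sketched in C as follows. This sketch, written for this discussion (not a numbered program of the book), reads y0 and the leading differences ∆y0, . . . , ∆^n y0 directly. Unlike the scheme above, which brings in one extra difference at each step, it uses all the available differences in every pass; it converges to the same u.

/* A sketch of inverse interpolation by successive approximation. */
#include<stdio.h>
#include<math.h>
void main()
{
 int k,j,n,it; float dy[20],y0,y,u,unew,t;
 printf("Enter highest order of difference n ");
 scanf("%d",&n);
 printf("Enter y0 and the differences dy1,...,dyn ");
 scanf("%f",&y0);
 for(k=1;k<=n;k++) scanf("%f",&dy[k]);
 printf("Enter the value of y ");
 scanf("%f",&y);
 u=(y-y0)/dy[1]; /* first approximation u(1) */
 for(it=0;it<50;it++)
 {
  unew=y-y0;
  for(k=2;k<=n;k++)
  {
   t=dy[k];
   for(j=0;j<k;j++) t*=(u-j)/(j+1); /* u(u-1)...(u-k+1)/k! */
   unew-=t;
  }
  unew/=dy[1];
  if(fabs(unew-u)<1e-6) break;
  u=unew;
 }
 printf("u = %f, so x = x0 + u*h\n",unew);
}

With the data of the next example (y0 = 4.6269, ∆y0 = 0.8302, ∆^2 y0 = 0.1789, ∆^3 y0 = 0.0405 and y = 5.0) the iteration settles at u ≈ 0.4732, in agreement with the hand computation below.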
Example 3.16.1 From the table of values
x : 1.8 2.0 2.2 2.4 2.6
y : 3.9422 4.6269 5.4571 6.4662 7.6947
find x when y = 5.0 using the method of successive approximations.

Solution. The difference table is

x y ∆y ∆2 y ∆3 y
1.8 3.9422
0.6847
2.0 4.6269 0.1455
0.8302 0.0334
2.2 5.4571 0.1789
1.0091 0.0405
2.4 6.4662 0.2194
1.2285
2.6 7.6947
Let x0 = 2.0, h = 0.2. The value of u is determined by successive approximation.
The first approximation is
u(1) = (y − y0)/∆y0 = (5.0 − 4.6269)/0.8302 = 0.4494.
u(2) = (1/∆y0) [y − y0 − {u(1)(u(1) − 1)/2!} ∆^2 y0] = u(1) − [u(1)(u(1) − 1)/2!] (∆^2 y0/∆y0)
     = 0.4494 − [0.4494(0.4494 − 1)/2] × (0.1789/0.8302) = 0.4761.
u(3) = u(1) − [u(2)(u(2) − 1)/2] (∆^2 y0/∆y0) − [u(2)(u(2) − 1)(u(2) − 2)/3!] (∆^3 y0/∆y0)
     = 0.4494 − [0.4761(0.4761 − 1)/2] × (0.1789/0.8302)
       − [0.4761(0.4761 − 1)(0.4761 − 2)/6] × (0.0405/0.8302)
     = 0.4494 + 0.0269 − 0.0031 = 0.4732.

Thus the value of x is given by x = x0 + u(3) h = 2.0 + 0.4732 × 0.2 = 2.0946.

3.16.3 Based on Newton's backward difference interpolation formula

The Newton's backward formula is

y = yn + v∇yn + [v(v + 1)/2!] ∇^2 yn + [v(v + 1)(v + 2)/3!] ∇^3 yn + · · ·
  + [v(v + 1)(v + 2) · · · (v + n − 1)/n!] ∇^n yn,

where v = (x − xn)/h, or x = xn + vh.
That is,

v = (1/∇yn) [y − yn − {v(v + 1)/2!} ∇^2 yn − {v(v + 1)(v + 2)/3!} ∇^3 yn − · · ·
    − {v(v + 1) · · · (v + n − 1)/n!} ∇^n yn].

Neglecting the second and higher order differences, the first approximation is given by
v(1) = (y − yn)/∇yn.
Similarly,
v(2) = (1/∇yn) [y − yn − {v(1)(v(1) + 1)/2!} ∇^2 yn],
v(3) = (1/∇yn) [y − yn − {v(2)(v(2) + 1)/2!} ∇^2 yn − {v(2)(v(2) + 1)(v(2) + 2)/3!} ∇^3 yn],
and so on.
In general,

v(k+1) = (1/∇yn) [y − yn − {v(k)(v(k) + 1)/2!} ∇^2 yn − {v(k)(v(k) + 1)(v(k) + 2)/3!} ∇^3 yn
         − · · · − {v(k)(v(k) + 1) · · · (v(k) + k)/(k + 1)!} ∇^(k+1) yn],
k = 0, 1, 2, . . . .

This iteration continues until two consecutive values v(k) and v(k+1) become equal up to the desired number of significant figures.
The value of x is then given by x = xn + v(k+1) h.

3.16.4 Use of inverse interpolation to find a root of an equation

Suppose x = α is a root of the equation f(x) = 0 and let it lie between a and b, i.e., a < α < b. Now, a table is constructed for some values of x within (a, b) and the corresponding values of y. Then by inverse interpolation, the value of x is determined when y = 0. This value of x is the required root.

Example 3.16.2 Find a real root of the equation x3 − 3x + 1 = 0.

Solution. Let y = x3 − 3x + 1. One root of this equation lies between 1/4 and 1/2.
Let us consider the points x = 0.25, 0.30, 0.35, 0.40, 0.45, 0.50. The table is shown
below.

x y ∆y ∆2 y ∆3 y
0.25 0.265625
0.30 0.127000 –0.138625
0.35 –0.007125 –0.134125 0.00450
0.40 –0.136000 –0.128875 0.00525 0.00075
0.45 –0.258875 –0.122875 0.00600 0.00075
0.50 –0.375000 –0.116125 0.00675 0.00075

Here the interpolating point is y = 0.


Now,
u(1) = −y0/∆y0 = 0.265625/0.138625 = 1.916140.
u(2) = −(1/∆y0) [y0 + {u(1)(u(1) − 1)/2} ∆^2 y0]
     = (1/0.138625) [0.265625 + (1.916140 × 0.916140/2) × 0.00450]
     = 1.944633.
u(3) = −(1/∆y0) [y0 + {u(2)(u(2) − 1)/2!} ∆^2 y0 + {u(2)(u(2) − 1)(u(2) − 2)/3!} ∆^3 y0]
     = (1/0.138625) [0.265625 + (1.944633 × 0.944633/2) × 0.00450
       + (1.944633 × 0.944633 × (−0.055367)/6) × 0.000750]
     = 1.945864.

Thus x = x0 + u(3) × h = 0.25 + 1.945864 × 0.05 = 0.347293, which is a required root.

3.17 Choice and use of Interpolation Formulae

If the interpolating points are not equally spaced then Lagrange’s, Newton’s divided
difference or Aitken’s iterated interpolation formulae may be used. Newton’s forward
formula is appropriate for interpolation at the beginning of the table, Newton’s back-
ward formula for interpolation at the end of the table, Stirling’s or Bessel’s formula for
interpolation at the centre of the table. It is well known that the interpolation poly-
nomial is unique and the above formulae are just different forms of one and the same
interpolation polynomial and the results obtained by the different formulae should be
identical. Practically, only a subset of the set of given interpolating points in the ta-
ble is used. For interpolation at the beginning of the table, it is better to take this
subset from the beginning of the table. This reason recommends the use of Newton’s
forward formula for interpolation at the beginning of the table. For interpolation, near
the end of the table, interpolating points should be available at the end of the table and
hence Newton’s backward formula is used for interpolation at the end of the table. For
the same reasons the central difference formulae like Stirling’s, Bessel’s, Everett’s etc.
are used for interpolation near the centre of the table. The proper choice of a central
interpolation formulae depends on the error terms of the different formulae.
For interpolation near the centre of the table, Stirling's formula gives the most accurate result for −1/4 ≤ s ≤ 1/4, and Bessel's formula gives the most accurate result near s = 1/2, i.e., for 1/4 ≤ s ≤ 3/4. If all the terms of the formulae are considered, then
both the formulae give identical result. But, if some terms are discarded to evaluate the
polynomial, then Stirling’s and Bessel’s formulae, in general, do not give the same result
and then a choice must be made between them. The choice depends on the order of the
highest difference that could be neglected so that contributions from it and further dif-
ferences would be less than half a unit in the last decimal place. If the highest difference
is of odd order, then Stirling’s formula is used and if it is of even order, then, generally,
Bessel’s formula is used. This conclusion is drawn from the following comparison.
The term of Stirling's formula containing the third differences is

[s(s^2 − 1^2)/6] (∆^3 y−1 + ∆^3 y−2)/2.

This term may be neglected if its magnitude is less than half a unit in the last place, i.e., if

|[s(s^2 − 1^2)/6] (∆^3 y−1 + ∆^3 y−2)/2| ≤ 1/2.

The maximum value of |s(s^2 − 1)/6| is 0.064, attained at s = −1/√3. Then

0.064 × |(∆^3 y−1 + ∆^3 y−2)/2| < 1/2, i.e., |(∆^3 y−1 + ∆^3 y−2)/2| < 7.8.

The term containing the third order difference of Bessel's formula will be less than half a unit in the last place if

|[s(s − 1)(s − 1/2)/6] ∆^3 y−1| < 1/2.

The maximum value of |s(s − 1)(s − 1/2)/6| is 0.008, so that |∆^3 y−1| < 62.5.
Thus, if the third difference is ignored, Bessel’s formula gives about eight times more
accurate result than Stirling’s formula. But, if the third differences need to be retained
and when the magnitude is more than 62.5, then Everett’s formula is more appropriate.
It may be reminded that the Bessel’s formula with third differences is equivalent to
Everett’s formula with second differences.
Depending on these discussions the following working rules are recommended for use
of interpolation formulae.

(i) If the interpolating point is at the beginning of the table, then use Newton’s
forward formula with a suitable starting point x0 such that 0 < u < 1.

(ii) If the interpolating point is at the end of the table, then use Newton’s backward
formula with a suitable starting point xn such that −1 < u < 0.

(iii) If the interpolating point is at the centre of the table and the difference table ends
with odd order differences, then use Stirling’s formula.

(iv) If the interpolating point is at the centre of the table and the difference table ends
with even order differences, then use Bessel’s or Everett’s formula.

(v) If the arguments are not equispaced then use Lagrange’s formula or Newton’s
divided difference formula or Aitken’s iterative formula.

In some functions, a higher degree interpolation polynomial does not always give
the best result compared to a lower degree polynomial. This fact is illustrated in the
following example.
Let f(x) = 1/(1 + x²). In [−3, 3] the second degree polynomial y = φ2(x) = 1 − 0.1x² and the
fourth degree polynomial y = φ4(x) = 0.15577u⁴ − 1.24616u³ + 2.8904u² − 1.59232u + 0.1,
where u = (x + 3)/1.5, are derived. The graphs of the curves y = f(x), y = φ2(x) and
y = φ4(x) are shown in Figure 3.4.
Figure 3.4: The graph of the curves y = f (x), y = φ2 (x) and y = φ4 (x).

For this function y = f(x), the fourth degree polynomial gives an absurd result at
x = 2. At this point f(2) = 0.2, φ2(2) = 0.6 and φ4(2) = −0.01539. It may be
noted that the functions y = f(x) and y = φ2(x) are positive for all values of x, but
y = φ4(x) is negative for some values of x. This example indicates that a higher
degree polynomial does not always give a more accurate result.
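These values are easy to check numerically. The following short C fragment (an
illustrative sketch added here, not one of the numbered programs of this book)
evaluates f(x), φ2(x) and φ4(x) at x = 2:

#include <stdio.h>
#include <math.h>

int main()
{
    double x = 2.0, u = (x + 3.0) / 1.5;
    double f  = 1.0 / (1.0 + x * x);      /* exact value     */
    double p2 = 1.0 - 0.1 * x * x;        /* second degree   */
    double p4 = 0.15577 * pow(u, 4) - 1.24616 * pow(u, 3)
              + 2.8904 * u * u - 1.59232 * u + 0.1;  /* fourth degree */
    printf("f(2)=%f  phi2(2)=%f  phi4(2)=%f\n", f, p2, p4);
    return 0;
}

It prints values close to 0.2, 0.6 and −0.015; the small deviation of the last value from
−0.01539 arises from the rounded coefficients of φ4(x) quoted above.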
3.18 Hermite’s Interpolation Formula

The interpolation formulae considered so far make use of the function values at some
number of points, say, n + 1 number of points and an nth degree polynomial is obtained.
But, if the values of the function y = f (x) and its first derivatives are known at n + 1
points then it is possible to determine an interpolating polynomial φ(x) of degree (2n+1)
which satisfies the (2n + 2) conditions

    φ(xi) = f(xi),  φ′(xi) = f′(xi),  i = 0, 1, 2, . . . , n.        (3.74)
This formula is known as Hermite’s interpolation formula. Here, the number of
conditions is (2n + 2), the number of coefficients to be determined is (2n + 2) and the
degree of the polynomial is (2n + 1).
Let us assume the Hermite’s interpolating polynomial in the form

    φ(x) = Σ_{i=0}^{n} hi(x) f(xi) + Σ_{i=0}^{n} Hi(x) f′(xi),        (3.75)

where hi(x) and Hi(x), i = 0, 1, 2, . . . , n, are polynomials in x of degree at most (2n + 1).
Using conditions (3.74), we get

    hi(xj) = 1 if i = j, and 0 if i ≠ j;   Hi(xj) = 0 for all i, j        (3.76)
    h′i(xj) = 0 for all i, j;   H′i(xj) = 1 if i = j, and 0 if i ≠ j        (3.77)
Let us consider the Lagrangian function
    Li(x) = [(x − x0)(x − x1) · · · (x − xi−1)(x − xi+1) · · · (x − xn)]
            / [(xi − x0)(xi − x1) · · · (xi − xi−1)(xi − xi+1) · · · (xi − xn)],
                                        i = 0, 1, 2, . . . , n.        (3.78)

Obviously,
    Li(xj) = 1 if i = j, and 0 if i ≠ j.        (3.79)

Since each Li(x) is a polynomial of degree n, [Li(x)]² is a polynomial of degree 2n.
Again, each Li(x) satisfies (3.79) and [Li(x)]² also satisfies (3.79). Since hi(x) and Hi(x)
are polynomials in x of degree (2n + 1), their explicit forms may be taken as

    hi(x) = (ai x + bi)[Li(x)]²
    Hi(x) = (ci x + di)[Li(x)]²        (3.80)
Using the conditions (3.76) and (3.77), we obtain

    ai xi + bi = 1
    ci xi + di = 0
    ai + 2L′i(xi) = 0        (3.81)
    ci = 1

These relations give

    ai = −2L′i(xi),   bi = 1 + 2xi L′i(xi),
    ci = 1   and   di = −xi.        (3.82)

Hence,

    hi(x) = [−2xL′i(xi) + 1 + 2xi L′i(xi)][Li(x)]²
          = [1 − 2(x − xi)L′i(xi)][Li(x)]²   and        (3.83)
    Hi(x) = (x − xi)[Li(x)]².

Then finally (3.75) becomes

    φ(x) = Σ_{i=0}^{n} [1 − 2(x − xi)L′i(xi)][Li(x)]² f(xi)
         + Σ_{i=0}^{n} (x − xi)[Li(x)]² f′(xi),        (3.84)

which is the required Hermite’s interpolation formula.
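Formula (3.84) can be evaluated directly, since L′i(xi) = Σ_{j≠i} 1/(xi − xj). The
following C function is a minimal illustrative sketch of such an evaluation (it is not one
of the numbered programs of this book); the main() tests it on the data of
Example 3.18.1 below.

#include <stdio.h>

/* Evaluates the Hermite polynomial (3.84) at xg for the nodes
   x[0..n] with function values f[0..n] and derivatives fd[0..n]. */
double hermite(double x[], double f[], double fd[], int n, double xg)
{
    int i, j;
    double sum = 0.0;
    for (i = 0; i <= n; i++) {
        double L = 1.0, Ld = 0.0;
        for (j = 0; j <= n; j++)      /* L_i(xg) */
            if (j != i) L *= (xg - x[j]) / (x[i] - x[j]);
        for (j = 0; j <= n; j++)      /* L'_i(x_i) */
            if (j != i) Ld += 1.0 / (x[i] - x[j]);
        sum += (1.0 - 2.0 * (xg - x[i]) * Ld) * L * L * f[i]
             + (xg - x[i]) * L * L * fd[i];
    }
    return sum;
}

int main()    /* data of Example 3.18.1 */
{
    double x[]  = {1.5, 2.0, 2.5};
    double f[]  = {1.14471, 1.25992, 1.35721};
    double fd[] = {0.25438, 0.20999, 0.18096};
    printf("%f\n", hermite(x, f, fd, 2, 2.8));   /* about 1.40948 */
    return 0;
}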


Example 3.18.1 Determine the Hermite's polynomial of degree 5 which satisfies
the following data and hence find an approximate value of ∛2.8.

    x                  :  1.5      2.0      2.5
    y = ∛x             :  1.14471  1.25992  1.35721
    y′ = 1/(3x^{2/3})  :  0.25438  0.20999  0.18096

Solution.
    L0(x) = (x − x1)(x − x2)/[(x0 − x1)(x0 − x2)] = (x − 2.0)(x − 2.5)/[(1.5 − 2.0)(1.5 − 2.5)] = 2x² − 9x + 10,
    L1(x) = (x − x0)(x − x2)/[(x1 − x0)(x1 − x2)] = (x − 1.5)(x − 2.5)/[(2.0 − 1.5)(2.0 − 2.5)] = −4x² + 16x − 15,
    L2(x) = (x − x0)(x − x1)/[(x2 − x0)(x2 − x1)] = (x − 1.5)(x − 2.0)/[(2.5 − 1.5)(2.5 − 2.0)] = 2x² − 7x + 6.
Therefore,
    L′0(x) = 4x − 9,  L′1(x) = −8x + 16,  L′2(x) = 4x − 7.
Hence L′0(x0) = −3, L′1(x1) = 0, L′2(x2) = 3.

    h0(x) = [1 − 2(x − x0)L′0(x0)][L0(x)]² = [1 − 2(x − 1.5)(−3)][L0(x)]²
          = (6x − 8)(2x² − 9x + 10)²,
    h1(x) = [1 − 2(x − x1)L′1(x1)][L1(x)]² = (4x² − 16x + 15)²,
    h2(x) = [1 − 2(x − x2)L′2(x2)][L2(x)]² = [1 − 2(x − 2.5)(3)](2x² − 7x + 6)²
          = (16 − 6x)(2x² − 7x + 6)².
    H0(x) = (x − x0)[L0(x)]² = (x − 1.5)(2x² − 9x + 10)²,
    H1(x) = (x − x1)[L1(x)]² = (x − 2.0)(4x² − 16x + 15)²,
    H2(x) = (x − x2)[L2(x)]² = (x − 2.5)(2x² − 7x + 6)².

The required Hermite polynomial is

    φ(x) = (6x − 8)(2x² − 9x + 10)²(1.14471) + (4x² − 16x + 15)²(1.25992)
         + (16 − 6x)(2x² − 7x + 6)²(1.35721)
         + (x − 1.5)(2x² − 9x + 10)²(0.25438)
         + (x − 2)(4x² − 16x + 15)²(0.20999)
         + (x − 2.5)(2x² − 7x + 6)²(0.18096).

To find the value of ∛2.8, substitute x = 2.8 in the above relation.
Therefore,

    ∛2.8 ≈ 10.07345 × 0.23040 + 1.25992 × 2.43360 − 1.08577 × 4.32640
         + 0.33069 × 0.23040 + 0.16799 × 2.43360 + 0.05429 × 4.32640
         = 1.40948.

3.19 Spline Interpolation

Spline interpolation is a very powerful and widely used method and has many applications
in numerical differentiation, integration, solution of boundary value problems, and two- and
three-dimensional graph plotting. The spline interpolation method interpolates a
function between a given set of points by means of piecewise smooth polynomials. In
this interpolation, the curve passes through the given set of points, and its slope
and curvature are continuous at each point. Splines of different degrees are found in
the literature; among them cubic splines are the most widely used.
3.19.1 Cubic spline


A function y(x) is called cubic spline in [x0 , xn ] if there exist cubic polynomials
p0 (x), p1 (x), . . . , pn−1 (x) such that

y(x) = pi (x) on [xi , xi+1 ], i = 0, 1, 2, . . . , n − 1. (3.85)


    p′i−1(xi) = p′i(xi),  i = 1, 2, . . . , n − 1 (equal slope),        (3.86)
    p″i−1(xi) = p″i(xi),  i = 1, 2, . . . , n − 1 (equal curvature),        (3.87)
    and pi(xi) = yi, pi(xi+1) = yi+1, i = 0, 1, . . . , n − 1.        (3.88)

It may be noted that, at the end points x0 and xn, no continuity conditions on slope and
curvature are imposed. The conditions at these points are generally assigned depending on
the application.
Let the interval [xi, xi+1], i = 0, 1, . . . , n − 1, be denoted as the ith interval.
Let hi = xi − xi−1, i = 1, 2, . . . , n, and Mi = y″(xi), i = 0, 1, 2, . . . , n.
Let the cubic spline for the ith interval be

    y(x) = ai(x − xi)³ + bi(x − xi)² + ci(x − xi) + di,  in [xi, xi+1].        (3.89)

Since it passes through the end points xi and xi+1 , therefore,

    yi = di        (3.90)

and

    yi+1 = ai(xi+1 − xi)³ + bi(xi+1 − xi)² + ci(xi+1 − xi) + di
         = ai h³i+1 + bi h²i+1 + ci hi+1 + di.        (3.91)

Differentiating equation (3.89) twice, we obtain

    y′(x) = 3ai(x − xi)² + 2bi(x − xi) + ci        (3.92)
    and y″(x) = 6ai(x − xi) + 2bi.        (3.93)

From (3.93), y″i = 2bi and y″i+1 = 6ai hi+1 + 2bi,
that is, Mi = 2bi, Mi+1 = 6ai hi+1 + 2bi.
Therefore,

    bi = Mi/2,        (3.94)
    ai = (Mi+1 − Mi)/(6hi+1).        (3.95)
Using (3.90), (3.94) and (3.95), equation (3.91) becomes

    yi+1 = [(Mi+1 − Mi)/(6hi+1)] h³i+1 + (Mi/2) h²i+1 + ci hi+1 + yi,

i.e.,
    ci = (yi+1 − yi)/hi+1 − (2hi+1 Mi + hi+1 Mi+1)/6.        (3.96)

Thus, the coefficients ai, bi, ci and di of (3.89) are determined in terms of the n + 1
unknowns M0, M1, . . . , Mn. These unknowns are determined in the following way.
The condition of equation (3.86) states that the slopes of the two cubics pi−1 and pi
are the same at xi.
Now, for the ith interval

    y′i = 3ai(xi − xi)² + 2bi(xi − xi) + ci = ci,        (3.97)

and for the (i − 1)th interval

    y′i = 3ai−1(xi − xi−1)² + 2bi−1(xi − xi−1) + ci−1.        (3.98)

Equating (3.97) and (3.98), we obtain

    ci = 3ai−1 h²i + 2bi−1 hi + ci−1.

The values of ai−1, bi−1, ci−1 and ci are substituted into the above equation to obtain

    (yi+1 − yi)/hi+1 − (2hi+1 Mi + hi+1 Mi+1)/6
        = 3[(Mi − Mi−1)/(6hi)] h²i + 2(Mi−1/2) hi + (yi − yi−1)/hi − (2hi Mi−1 + hi Mi)/6.

After simplification the above equation reduces to

    hi Mi−1 + 2(hi + hi+1)Mi + hi+1 Mi+1 = 6[(yi+1 − yi)/hi+1 − (yi − yi−1)/hi].        (3.99)

This relation is true for i = 1, 2, . . . , n − 1. Thus n − 1 equations are available for
the n + 1 unknown quantities M0, M1, . . . , Mn. Now, two more conditions are required
to solve these equations uniquely. These conditions can be assumed to take one of the
following forms:

(i) M0 = Mn = 0. If these conditions are satisfied then the corresponding spline is
called a natural spline.
(ii) M0 = Mn, M1 = Mn+1, y0 = yn, y1 = yn+1, h1 = hn+1. The corresponding spline
satisfying these conditions is called a periodic spline.

(iii) y′(x0) = y′0, y′(xn) = y′n, i.e.,

    2M0 + M1 = (6/h1)[(y1 − y0)/h1 − y′0]
    and Mn−1 + 2Mn = (6/hn)[y′n − (yn − yn−1)/hn].        (3.100)

The corresponding spline satisfying the above conditions is called a non-periodic
spline or clamped cubic spline.

(iv) If M0 = M1 − h1(M2 − M1)/h2 and Mn = Mn−1 + hn(Mn−1 − Mn−2)/hn−1, the
corresponding spline is called an extrapolated spline.

(v) If M0 = y″0 and Mn = y″n are specified, then the spline satisfying these conditions
is called an endpoint curvature-adjusted spline.
Let Ai = hi, Bi = 2(hi + hi+1), Ci = hi+1 and Di = 6[(yi+1 − yi)/hi+1 − (yi − yi−1)/hi].
Then (3.99) becomes

Ai Mi−1 + Bi Mi + Ci Mi+1 = Di , i = 1, 2, . . . , n − 1. (3.101)

The system of equations (3.101) is basically a tri-diagonal system. Depending on the
different end conditions, the tri-diagonal systems differ; they are stated below.
For the natural spline, the tri-diagonal system for M1, M2, . . . , Mn−1 is

    | B1   C1   0    0   · · ·  0      0     |  | M1   |     | D1   |
    | A2   B2   C2   0   · · ·  0      0     |  | M2   |     | D2   |
    | 0    A3   B3   C3  · · ·  0      0     |  |  ..  |  =  |  ..  |        (3.102)
    | · · ·                                  |  |      |     |      |
    | 0    0    0    0   · · ·  An−1   Bn−1  |  | Mn−1 |     | Dn−1 |

and M0 = Mn = 0.
Imposing the conditions for the non-periodic spline, we find

    2M0 + M1 = D0        (3.103)
    and Mn−1 + 2Mn = Dn,        (3.104)

where D0 = (6/h1)[(y1 − y0)/h1 − y′0]
and Dn = (6/hn)[y′n − (yn − yn−1)/hn].        (3.105)
Then equations (3.101), (3.103), (3.104) and (3.105) result in the following tri-diagonal
system for the unknowns M0, M1, . . . , Mn.

    | 2    1    0    0   · · ·  0      0      0     |  | M0   |     | D0   |
    | A1   B1   C1   0   · · ·  0      0      0     |  | M1   |     | D1   |
    | 0    A2   B2   C2  · · ·  0      0      0     |  | M2   |     | D2   |
    | · · ·                                         |  |  ..  |  =  |  ..  |        (3.106)
    | 0    0    0    0   · · ·  An−1   Bn−1   Cn−1  |  | Mn−1 |     | Dn−1 |
    | 0    0    0    0   · · ·  0      1      2     |  | Mn   |     | Dn   |
For the extrapolated spline the values of M0 and Mn are given by the relations

    M0 = M1 − h1(M2 − M1)/h2 and Mn = Mn−1 + hn(Mn−1 − Mn−2)/hn−1.        (3.107)

The first expression is rewritten as

    M1[A1 + B1 + A1h1/h2] + M2[C1 − A1h1/h2] = D1, or M1 B′1 + M2 C′1 = D1,

where B′1 = A1 + B1 + A1h1/h2 and C′1 = C1 − A1h1/h2.
Similarly, the second expression is transformed to Mn−2 A′n−1 + Mn−1 B′n−1 = Dn−1,
where A′n−1 = An−1 − Cn−1hn/hn−1 and B′n−1 = Bn−1 + Cn−1 + Cn−1hn/hn−1.
For this case, the tri-diagonal system of equations for M1, M2, . . . , Mn−1 is

    | B′1   C′1   0    0   · · ·  0       0      |  | M1   |     | D1   |
    | A2    B2    C2   0   · · ·  0       0      |  | M2   |     | D2   |
    | 0     A3    B3   C3  · · ·  0       0      |  |  ..  |  =  |  ..  |        (3.108)
    | · · ·                                      |  |      |     |      |
    | 0     0     0    0   · · ·  A′n−1   B′n−1  |  | Mn−1 |     | Dn−1 |

The values of M0 and Mn are obtained from equation (3.107).


For the endpoint curvature-adjusted spline, the values of M0 and Mn are respectively
y″0 and y″n, where y″0 and y″n are specified. The values of M1, M2, . . . , Mn−1 are given
by solving the following tri-diagonal system of equations

    | B1   C1   0    0   · · ·  0      0     |  | M1   |     | D′1   |
    | A2   B2   C2   0   · · ·  0      0     |  | M2   |     | D2    |
    | 0    A3   B3   C3  · · ·  0      0     |  |  ..  |  =  |  ..   |        (3.109)
    | · · ·                                  |  |      |     | Dn−2  |
    | 0    0    0    0   · · ·  An−1   Bn−1  |  | Mn−1 |     | D′n−1 |

where D′1 = D1 − A1 y″0 and D′n−1 = Dn−1 − Cn−1 y″n.
Example 3.19.1 Fit a cubic spline curve that passes through (0, 0.0), (1, 0.5), (2,
2.0) and (3, 1.5) with the natural end boundary conditions y″(0) = y″(3) = 0.

Solution. Here the intervals are (0, 1) (1, 2) and (2, 3), i.e., three intervals of
x, in each of which we can construct a cubic spline. These piecewise cubic spline
polynomials together gives the cubic spline curve y(x) in the entire interval (0, 3).
Here h1 = h2 = h3 = 1.
Then equation (3.99) becomes

Mi−1 + 4Mi + Mi+1 = 6(yi+1 − 2yi + yi−1 ), i = 1, 2, 3.

That is,

M0 + 4M1 + M2 = 6(y2 − 2y1 + y0 ) = 6 × (2.0 − 2 × 0.5 + 0.0) = 6


M1 + 4M2 + M3 = 6(y3 − 2y2 + y1 ) = 6 × (1.5 − 2 × 2.0 + 0.5) = −12.

Imposing the conditions M0 = y″(0) = 0 and M3 = y″(3) = 0, the above equations
simplify to

4M1 + M2 = 6, M1 + 4M2 = −12.


These equations give M1 = 12/5 and M2 = −18/5.
Let the natural cubic spline be given by

    pi(x) = ai(x − xi)³ + bi(x − xi)² + ci(x − xi) + di,

where the coefficients ai, bi, ci and di are given by the relations

    ai = (Mi+1 − Mi)/(6hi+1),  bi = Mi/2,
    ci = (yi+1 − yi)/hi+1 − (2hi+1 Mi + hi+1 Mi+1)/6,  di = yi,

for i = 0, 1, 2.
Therefore,
    a0 = (M1 − M0)/6 = 0.4,  b0 = M0/2 = 0,
    c0 = (y1 − y0)/1 − (2M0 + M1)/6 = 0.1,  d0 = y0 = 0.
    a1 = (M2 − M1)/6 = −1,  b1 = M1/2 = 6/5,
    c1 = (y2 − y1)/1 − (2M1 + M2)/6 = 1.3,  d1 = y1 = 0.5.
    a2 = (M3 − M2)/6 = 3/5,  b2 = M2/2 = −9/5,
    c2 = (y3 − y2)/1 − (2M2 + M3)/6 = 0.7,  d2 = y2 = 2.0.
Hence the required piecewise cubic splines in each interval are given by

    p0(x) = 0.4x³ + 0.1x,  0 ≤ x ≤ 1,
    p1(x) = −(x − 1)³ + 1.2(x − 1)² + 1.3(x − 1) + 0.5,  1 ≤ x ≤ 2,
    p2(x) = 0.6(x − 2)³ − 1.8(x − 2)² + 0.7(x − 2) + 2.0,  2 ≤ x ≤ 3.
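The defining conditions (3.86) and (3.87) can be verified numerically at the interior
knots. The fragment below (an illustrative sketch, not a numbered program) evaluates
the first and second derivatives of consecutive pieces at x = 1 and x = 2; each pair of
printed values agrees.

#include <stdio.h>

/* p(x) = a t^3 + b t^2 + c t + d, with t = x - xi */
double pd (double a, double b, double c, double t) { return 3*a*t*t + 2*b*t + c; }
double pdd(double a, double b, double t)           { return 6*a*t + 2*b; }

int main()   /* coefficients of Example 3.19.1 */
{
    printf("p0'(1)=%g  p1'(1)=%g\n",  pd(0.4, 0.0, 0.1, 1.0),  pd(-1.0, 1.2, 1.3, 0.0));
    printf("p0''(1)=%g p1''(1)=%g\n", pdd(0.4, 0.0, 1.0),      pdd(-1.0, 1.2, 0.0));
    printf("p1'(2)=%g  p2'(2)=%g\n",  pd(-1.0, 1.2, 1.3, 1.0), pd(0.6, -1.8, 0.7, 0.0));
    printf("p1''(2)=%g p2''(2)=%g\n", pdd(-1.0, 1.2, 1.0),     pdd(0.6, -1.8, 0.0));
    return 0;
}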

Example 3.19.2 Fit a cubic spline curve for the following data with end conditions
y′(0) = 0.2 and y′(3) = −1.

x : 0 1 2 3
y : 0 0.5 3.5 5

Solution. Here, the three intervals (0, 1) (1, 2) and (2, 3) are given in each of which
the cubic splines are to be constructed. These cubic spline functions are denoted by
y0 , y1 and y2 . In this example, h1 = h2 = h3 = 1.
For the interior points, equation (3.99) is used. That is,

M0 + 4M1 + M2 = 6(y2 − 2y1 + y0 )


M1 + 4M2 + M3 = 6(y3 − 2y2 + y1 ).

Also, from the equations (3.100), since h1 = h3 = 1,

    2M0 + M1 = 6(y1 − y0 − y′0) and M2 + 2M3 = 6(y′3 − y3 + y2),

i.e.,
    M0 + 4M1 + M2 = 15
    M1 + 4M2 + M3 = −9
    2M0 + M1 = 6(0.5 − 0 − 0.2) = 1.8
    M2 + 2M3 = 6(−1 − 5 + 3.5) = −15.

These equations give M0 = −1.36, M1 = 4.52, M2 = −1.72 and M3 = −6.64.
Let the cubic spline in each interval be given by

    yi(x) = ai(x − xi)³ + bi(x − xi)² + ci(x − xi) + di.
The coefficients are computed as


    ai = (Mi+1 − Mi)/(6hi+1),  bi = Mi/2,
    ci = (yi+1 − yi)/hi+1 − (2hi+1 Mi + hi+1 Mi+1)/6,  di = yi, for i = 0, 1, 2.

Therefore,
    a0 = (M1 − M0)/6 = 0.98,  b0 = −0.68,  c0 = 0.2,  d0 = 0.
    a1 = (M2 − M1)/6 = −1.04,  b1 = 2.26,  c1 = 1.78,  d1 = 0.5.
    a2 = (M3 − M2)/6 = −0.82,  b2 = −0.86,  c2 = 3.18,  d2 = 3.5.
6
Hence, the required piecewise cubic spline polynomials in each interval are given by

    y0(x) = 0.98x³ − 0.68x² + 0.2x,  0 ≤ x ≤ 1,
    y1(x) = −1.04(x − 1)³ + 2.26(x − 1)² + 1.78(x − 1) + 0.5,  1 ≤ x ≤ 2,
    y2(x) = −0.82(x − 2)³ − 0.86(x − 2)² + 3.18(x − 2) + 3.5,  2 ≤ x ≤ 3.

Example 3.19.3 Consider the function

    f(x) = −(11/2)x³ + 26x² − (75/2)x + 18,  1 ≤ x ≤ 2,
    f(x) = (11/2)x³ − 40x² + (189/2)x − 70,  2 ≤ x ≤ 3.

Show that f(x) is a cubic spline.
Show that f (x) is a cubic spline.

Solution. Let

    p0(x) = −(11/2)x³ + 26x² − (75/2)x + 18,  1 ≤ x ≤ 2,
    and p1(x) = (11/2)x³ − 40x² + (189/2)x − 70,  2 ≤ x ≤ 3.

Here x0 = 1, x1 = 2 and x2 = 3. The function f(x) will be a cubic spline if

(a) pi(xi) = f(xi), pi(xi+1) = f(xi+1), i = 0, 1, and
(b) p′i−1(xi) = p′i(xi), p″i−1(xi) = p″i(xi), i = 1.
But, here the values of f (x0 ), f (x1 ) and f (x2 ) are not supplied, so only the conditions
of (b) are to be checked.
Now,
    p′0(x) = −(33/2)x² + 52x − 75/2,  p′1(x) = (33/2)x² − 80x + 189/2,
    p″0(x) = −33x + 52,  p″1(x) = 33x − 80.

p′0(x1) = p′0(2) = 0.5, p′1(x1) = p′1(2) = 0.5, i.e., p′0(x1) = p′1(x1).
p″0(x1) = p″0(2) = −14 and p″1(x1) = p″1(2) = −14. Thus p″0(x1) = p″1(x1).
Hence f(x) is a cubic spline.

Algorithm 3.5 (Cubic spline). This algorithm finds the cubic spline for each of
the intervals [xi , xi+1 ], i = 0, 1, . . . , n − 1. The (i) Natural spline, (ii) Non-periodic
spline or clamped cubic spline, (iii) Extrapolated spline, and (iv) endpoint curvature-
adjusted spline are incorporated here. The spacing for xi need not be equal and
assume that x0 < x1 < · · · < xn .

Algorithm Cubic Spline


Read xi, yi, i = 0, 1, 2, . . . , n; //x's are not necessarily equispaced//
Read xg; //the value of x at which y is to be computed.//
//Computation of hi .//
for i = 1 to n do
hi = xi − xi−1 ;
endfor;
//Computation of Ai, Bi, Ci and Di//
for i = 1 to n − 1 do
    Ai = hi; Bi = 2(hi + hi+1); Ci = hi+1;
    Di = 6[(yi+1 − yi)/hi+1 − (yi − yi−1)/hi];
endfor;
endfor;
Case :
(i) Natural spline
To find M1 , M2 , . . . , Mn−1 , solve the system of tri-diagonal equation
defined in (3.102).
Set M0 = Mn = 0.
(ii) Non-periodic spline
    Read y′0, y′n; //first derivative of y at x = x0, xn//
    Compute D0 = (6/h1)[(y1 − y0)/h1 − y′0], Dn = (6/hn)[y′n − (yn − yn−1)/hn].
    To find M0, M1, M2, . . . , Mn, solve the system of tri-diagonal equations
    defined in (3.106).
(iii) Extrapolated spline
    Compute B′1 = A1 + B1 + A1h1/h2, C′1 = C1 − A1h1/h2,
    A′n−1 = An−1 − Cn−1hn/hn−1, B′n−1 = Bn−1 + Cn−1 + Cn−1hn/hn−1.
    To find M1, M2, . . . , Mn−1, solve the system of tri-diagonal equations
    defined in (3.108).
    Compute M0 = M1 − h1(M2 − M1)/h2, Mn = Mn−1 + hn(Mn−1 − Mn−2)/hn−1.
(iv) Endpoint curvature-adjusted spline
    Read y″0, y″n; //second derivative of y at x = x0, xn//
    Compute D′1 = D1 − A1y″0, D′n−1 = Dn−1 − Cn−1y″n.
    To find M1, M2, . . . , Mn−1, solve the system of tri-diagonal equations
    defined in (3.109).
    Set M0 = y″0, Mn = y″n.
endcase;

//Computation of the coefficients ai, bi, ci, di, i = 0, 1, . . . , n − 1.//
for i = 0 to n − 1 do
    ai = (Mi+1 − Mi)/(6hi+1);
    bi = Mi/2;
    ci = (yi+1 − yi)/hi+1 − (2hi+1 Mi + hi+1 Mi+1)/6;
    di = yi;
endfor;
//Printing of splines//
for i = 0 to n − 1 do
Print ‘Coefficient of ’, i, ‘th spline is’ ai , bi , ci , di ;
endfor;
//Computation of y at x = xg//
if (xg < x0 ) or (xg > xn ) then
Print ‘x outside the range’;
Stop;
endif;
for i = 0 to n − 1 do
if (xg < xi+1 ) then
j = i;
exit from for loop;
endif;
endfor;
Compute yc = aj (xg − xj )3 + bj (xg − xj )2 + cj (xg − xj ) + dj ;
Print ‘The value of y at x =’, xg, ‘is’, yc ;
end Cubic Spline
Program 3.5.
/* Program Cubic Spline
This program construct cubic splines at each interval
[x[i-1],x[i]], i=1, 2, ..., n and finds the value of
y=f(x) at a given x when the function is supplied as
(x[i],y[i]), i=0, 1, ..., n. */
#include<stdio.h>
#include<math.h>
#include<ctype.h>
#include<stdlib.h>
float M[21];
void main()
{
int i,n;
char opt,s[5];
float x[20],y[20],h[20],A[20],B[20],C[20],D[20];
float a[20],b[20],c[20],d[20],xg,yd0,ydn,temp,yc;
float TriDiag(float a[],float b[],float c[],float d[],int n);
printf("\nEnter number of subintervals ");
scanf("%d",&n);
printf("Enter x and y values ");
for(i=0;i<=n;i++) scanf("%f %f",&x[i],&y[i]);
printf("Enter interpolating point x ");
scanf("%f",&xg);
printf("The given values of x and y are\nx-value y-value\n");
for(i=0;i<=n;i++) printf("%f %f\n",x[i],y[i]);

for(i=1;i<=n;i++) h[i]=x[i]-x[i-1]; /* computation of h[i] */


for(i=1;i<n;i++) /* computation of A,B,C,D’s */
{
A[i]=h[i];
B[i]=2*(h[i]+h[i+1]);
C[i]=h[i+1];
D[i]=6*((y[i+1]-y[i])/h[i+1]-(y[i]-y[i-1])/h[i]);
}
printf("\nN Natural spline\n");
printf("P Non-Periodic spline\n");
printf("E Extrapolated spline\n");
printf("C End point Curvature adjusted spline\n");
printf(" Enter your choice ");


opt=getche();

switch(toupper(opt))
{
case 'N': /* Natural spline */
temp=TriDiag(A,B,C,D,n-1);
M[0]=0; M[n]=0;
break;
case 'P': /* Non-periodic spline */
printf("\nEnter the values of y'[0] and y'[n] ");
scanf("%f %f",&yd0,&ydn);
D[0]=6*((y[1]-y[0])/h[1]-yd0)/h[1];
D[n]=6*(ydn-(y[n]-y[n-1])/h[n])/h[n];
for(i=n+1;i>=1;i--) D[i]=D[i-1];
A[n+1]=1; B[n+1]=2;
for(i=n;i>=2;i--){
A[i]=A[i-1]; B[i]=B[i-1]; C[i]=C[i-1];}
B[1]=2; C[1]=1;
temp=TriDiag(A,B,C,D,n+1);
for(i=0;i<=n;i++) M[i]=M[i+1];
break;
case 'E': /* Extrapolated spline */
B[1]=A[1]+B[1]+A[1]*h[1]/h[2];
C[1]=C[1]-A[1]*h[1]/h[2];
A[n-1]=A[n-1]-C[n-1]*h[n]/h[n-1];
B[n-1]=B[n-1]+C[n-1]+C[n-1]*h[n]/h[n-1];
temp=TriDiag(A,B,C,D,n-1);
M[0]=M[1]-h[1]*(M[2]-M[1])/h[2];
M[n]=M[n-1]+h[n]*(M[n-1]-M[n-2])/h[n-1];
break;
case 'C': /* End point Curvature adjusted spline */
printf("\nEnter the values of y''[0] and y''[n] ");
scanf("%f %f",&yd0,&ydn);
D[1]=D[1]-A[1]*yd0;
D[n-1]=D[n-1]-C[n-1]*ydn;
temp=TriDiag(A,B,C,D,n-1);
M[0]=yd0; M[n]=ydn;
break;
default : printf("\n No choice \n");


exit(0);
} /* switch */
/* Computation of the coefficients of the splines */
for(i=0;i<=n-1;i++){
a[i]=(M[i+1]-M[i])/(6*h[i+1]); b[i]=M[i]/2;
c[i]=(y[i+1]-y[i])/h[i+1]-(2*h[i+1]*M[i]+h[i+1]*M[i+1])/6;
d[i]=y[i];
}
/* printing of splines */
printf("\nThe cubic splines are \n");
for(i=0;i<n;i++)
{
s[1]=(x[i]>0) ? '-':'+';
s[2]=(b[i]<0) ? '-':'+';
s[3]=(c[i]<0) ? '-':'+';
s[4]=(d[i]<0) ? '-':'+';
temp=fabs(x[i]);
printf("p%1d(x)=%7.4f(x%c%7.4f)^3%c%7.4f(x%c%7.4f)^2%c%7.4f
(x%c%7.4f)%c%7.4f\n",i,a[i],s[1],temp,s[2],fabs(b[i]),
s[1],temp,s[3],fabs(c[i]),s[1],temp,s[4],fabs(d[i]));
printf(" in [%7.4f,%7.4f]\n",x[i],x[i+1]);
}
/* computation of y at x=xg */
if((xg<x[0]) || (xg>x[n])){
printf("\nx outside the range ");
exit(0);
}
for(i=0;i<=n-1;i++) /* determination of y */
{
if(xg<x[i+1])
{
temp=xg-x[i];
yc=a[i]*temp*temp*temp+b[i]*temp*temp+c[i]*temp+d[i];
printf("The value of y at x=%f is %f ",xg,yc);
exit(0);
}
}
} /* main */
float TriDiag(float a[20],float b[20],float c[20],float d[20],int n)


{
/* output M[i], i=1, 2,..., n, is a global variable.*/
int i;
float gamma[22],z[22];
gamma[1]=b[1];
for(i=2;i<=n;i++)
{
if(gamma[i-1]==0.0)
{
printf("A minor is zero: Method fails ");
exit(0);
}
gamma[i]=b[i]-a[i]*c[i-1]/gamma[i-1];
}
z[1]=d[1]/gamma[1];
for(i=2;i<=n;i++)
z[i]=(d[i]-a[i]*z[i-1])/gamma[i];
/* Computation of M */
M[n]=z[n];
for(i=n-1;i>=1;i--)
M[i]=z[i]-c[i]*M[i+1]/gamma[i];
return(M[0]);
} /*end of TriDiag */

A sample of input/output:

Enter number of subintervals 3


Enter x and y values
-1 1.0
1 0.5
2 3.5
3 5.0
Enter interpolating point x 1.2
The given values of x and y are
x-value y-value
-1.000000 1.000000
1.000000 0.500000
2.000000 3.500000
3.000000 5.000000
N Natural spline
P Non-Periodic spline
E Extrapolated spline
C End point Curvature adjusted spline
Enter your choice n

The cubic splines are


p0(x)= 0.3152(x+ 1.0000)^3+ 0.0000(x+ 1.0000)^2- 1.5109(x+ 1.0000)
+ 1.0000 in [-1.0000, 1.0000]
p1(x)=-1.1630(x- 1.0000)^3+ 1.8913(x- 1.0000)^2+ 2.2717(x- 1.0000)
+ 0.5000 in [ 1.0000, 2.0000]
p2(x)= 0.5326(x- 2.0000)^3- 1.5978(x- 2.0000)^2+ 2.5652(x- 2.0000)
+ 3.5000 in [ 2.0000, 3.0000]
The value of y at x=1.200000 is 1.020696

Another input/output:

Enter number of subintervals 3


Enter x and y values
-1 1.0
1 0.5
2 3.5
3 5.0
Enter interpolating point x 1.2
The given values of x and y are
x-value y-value
-1.000000 1.000000
1.000000 0.500000
2.000000 3.500000
3.000000 5.000000

N Natural spline
P Non-Periodic spline
E Extrapolated spline
C End point Curvature adjusted spline
Enter your choice p
Enter the values of y’[0] and y’[n] 0 1
The cubic splines are
p0(x)= 0.6250(x+ 1.0000)^3- 1.3750(x+ 1.0000)^2+ 0.0000(x+ 1.0000)
+ 1.0000 in [-1.0000, 1.0000]
p1(x)=-1.3750(x- 1.0000)^3+ 2.3750(x- 1.0000)^2+ 2.0000(x- 1.0000)


+ 0.5000 in [ 1.0000, 2.0000]
p2(x)= 0.6250(x- 2.0000)^3- 1.7500(x- 2.0000)^2+ 2.6250(x- 2.0000)
+ 3.5000 in [ 2.0000, 3.0000]
The value of y at x=1.200000 is 0.984000

3.20 Bivariate Interpolation

Like single variable interpolation, bivariate interpolation has recently become important
due to its extensive use in a wide range of fields, e.g., digital image processing, digital
filter design, computer-aided design, solution of non-linear simultaneous equations, etc.
In this section, some important methods are described to construct interpolation
formulae that can be efficiently evaluated.
To construct the formulae, the following two approaches are followed:
(i) Constructing a function that matches exactly the functional values at all the data
points.

(ii) Constructing a function that approximately fits the data. This approach is desirable
when the data are likely to have errors and smooth functions are required.

On the basis of these approaches, one can use four types of methods: (i) local matching
methods, (ii) local approximation methods, (iii) global matching methods and (iv) global
approximation methods. In the local methods, the constructed function at any point
depends only on the data at relatively nearby points. In global methods, the constructed
function at any point depends on all or most of the data points.
In the matching methods, the constructed function matches the given values exactly,
but in the approximation methods the function only approximately fits the data.
Here, only the local and the global matching methods are discussed.

3.20.1 Local matching methods


Here two local matching methods are described, viz., triangular interpolation and rect-
angular grid or bilinear interpolation.

Triangular interpolation
The simplest local interpolating surface is of the form

F (x, y) = a + bx + cy.

The data at the three corners of a triangle determine the coefficients. This procedure
generates a piecewise linear surface which is globally continuous.
Suppose the function f(x, y) is known at the points (x1, y1), (x2, y2) and (x3, y3).
Let f1 = f (x1 , y1 ), f2 = f (x2 , y2 ) and f3 = f (x3 , y3 ).
Let the constructed function be

F (x, y) = a + bx + cy (3.110)

such that F (xi , yi ) = f (xi , yi ), i = 1, 2, 3.


Then

f1 = a + bx1 + cy1
f2 = a + bx2 + cy2
f3 = a + bx3 + cy3 .

These equations give the values of a, b and c as

    a = [f1(x2y3 − x3y2) − f2(x1y3 − x3y1) + f3(x1y2 − x2y1)]/∆,
    b = [(f2 − f1)(y3 − y1) − (f3 − f1)(y2 − y1)]/∆,
    c = [(f3 − f1)(x2 − x1) − (f2 − f1)(x3 − x1)]/∆,

where ∆ = (x2 − x1)(y3 − y1) − (x3 − x1)(y2 − y1).
The values of a, b, c give the required polynomial.
The values of a, b, c give the required polynomial.
But, the function F(x, y) can also be written in the following form

    F(x, y) = Af1 + Bf2 + Cf3,        (3.111)

where
    A = [(x2 − x)(y3 − y) − (x3 − x)(y2 − y)]/∆        (3.112)
    B = [(x3 − x)(y1 − y) − (x1 − x)(y3 − y)]/∆        (3.113)
    C = [(x1 − x)(y2 − y) − (x2 − x)(y1 − y)]/∆.        (3.114)

Note 3.20.1 (i) If ∆ ≠ 0, then A + B + C = 1.

(ii) If ∆ = 0, then the points (xi, yi), i = 1, 2, 3, are collinear.

(iii) Let (xi , yi ) and f (xi , yi ), i = 1, 2, . . . , n be given. If we choose non-overlapping


triangles which cover the region containing all these points, then a function that
is continuous in this region can be determined.
Example 3.20.1 For a function f (x, y), let f (1, 1) = 8, f (2, 1) = 12 and f (2, 2) =
20. Find the approximate value of f (3/2, 5/4) using triangular interpolation.

Solution. Here given that

x1 = 1, y1 = 1, f1 = f (x1 , y1 ) = 8
x2 = 2, y2 = 1, f2 = f (x2 , y2 ) = 12
x3 = 2, y3 = 2, f3 = f (x3 , y3 ) = 20
    x = 3/2,  y = 5/4.

Therefore,

    ∆ = (x2 − x1)(y3 − y1) − (x3 − x1)(y2 − y1)
      = (2 − 1)(2 − 1) − (2 − 1)(1 − 1) = 1,
    A = [(x2 − x)(y3 − y) − (x3 − x)(y2 − y)]/∆ = 1/2,
    B = [(x3 − x)(y1 − y) − (x1 − x)(y3 − y)]/∆ = 1/4,
    C = [(x1 − x)(y2 − y) − (x2 − x)(y1 − y)]/∆ = 1/4.

Thus f(3/2, 5/4) ≈ F(3/2, 5/4) = Af1 + Bf2 + Cf3 = (1/2) × 8 + (1/4) × 12 + (1/4) × 20 = 12.
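Equations (3.111)–(3.114) translate directly into a small C function. The sketch below
is illustrative only (the function name triangular is chosen for this example); it
reproduces the value 12 obtained above.

#include <stdio.h>

/* Triangular interpolation at (xg, yg) from three corner values. */
double triangular(double x[3], double y[3], double f[3], double xg, double yg)
{
    double del = (x[1]-x[0])*(y[2]-y[0]) - (x[2]-x[0])*(y[1]-y[0]);
    double A = ((x[1]-xg)*(y[2]-yg) - (x[2]-xg)*(y[1]-yg)) / del;
    double B = ((x[2]-xg)*(y[0]-yg) - (x[0]-xg)*(y[2]-yg)) / del;
    double C = ((x[0]-xg)*(y[1]-yg) - (x[1]-xg)*(y[0]-yg)) / del;
    return A*f[0] + B*f[1] + C*f[2];
}

int main()   /* data of Example 3.20.1 */
{
    double x[] = {1, 2, 2}, y[] = {1, 1, 2}, f[] = {8, 12, 20};
    printf("%f\n", triangular(x, y, f, 1.5, 1.25));  /* prints 12 */
    return 0;
}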

Bilinear interpolation
Let a function f (x, y) be known at the points (x1 , y1 ), (x1 + h, y1 ), (x1 , y1 + k) and
(x1 + h, y1 + k).
A function F (x, y) is to be constructed within the rectangle formed by these points.
Let f1 = f (x1 , y1 ), f2 = f (x1 + h, y1 ), f3 = f (x1 , y1 + k) and
f4 = f (x1 + h, y1 + k).
Let us construct a function F (x, y) of the form

F (x, y) = a + b(x − x1 ) + c(y − y1 ) + d(x − x1 )(y − y1 ) (3.115)

such that

F (x1 , y1 ) = f (x1 , y1 ) = f1 , F (x1 + h, y1 ) = f (x1 + h, y1 ) = f2 ,


F (x1 , y1 + k) = f (x1 , y1 + k) = f3 , F (x1 + h, y1 + k) = f (x1 + h, y1 + k) = f4 .

The unknowns a, b, c, d can be obtained by solving the following equations


f1 = a, f2 = a + bh, f3 = a + ck and f4 = a + hb + kc + hkd.
Thus
    a = f1,  b = (f2 − f1)/h,
    c = (f3 − f1)/k  and  d = (f4 + f1 − f2 − f3)/(hk).        (3.116)

Example 3.20.2 Find a bilinear interpolation polynomial F (x, y) for the function
f (x, y) where f (1, 1) = 8, f (2, 1) = 10, f (1, 2) = 12 and f (2, 2) = 20. Also, find an
approximate value of f (4/3, 5/3).

Solution. Here
x1 = 1, y1 = 1, f1 = f (x1 , y1 ) = 8
x1 + h = 2, y1 = 1, f2 = f (x1 + h, y1 ) = 10
x1 = 1, y1 + k = 2, f3 = f (x1 , y1 + k) = 12
x1 + h = 2, y1 + k = 2, f4 = f (x1 + h, y1 + k) = 20.
Obviously, h = 1, k = 1.
Thus, a = f1 = 8,  b = (f2 − f1)/h = (10 − 8)/1 = 2,
c = (f3 − f1)/k = (12 − 8)/1 = 4,  d = (f4 + f1 − f2 − f3)/(hk) = 6.
Hence,

    f(x, y) ≈ F(x, y) = a + b(x − x1) + c(y − y1) + d(x − x1)(y − y1)
            = 8 + 2(x − 1) + 4(y − 1) + 6(x − 1)(y − 1).
Therefore, f(4/3, 5/3) = 38/3.
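The coefficient formulae (3.116) lead to the following small C sketch (illustrative only,
with a hypothetical function name bilinear); it reproduces the value 38/3 of
Example 3.20.2.

#include <stdio.h>

/* Bilinear interpolation over the rectangle with corner (x1, y1),
   sides h and k, and corner values f1, f2, f3, f4 as in (3.115). */
double bilinear(double x1, double y1, double h, double k,
                double f1, double f2, double f3, double f4,
                double xg, double yg)
{
    double a = f1;
    double b = (f2 - f1) / h;
    double c = (f3 - f1) / k;
    double d = (f4 + f1 - f2 - f3) / (h * k);
    return a + b*(xg - x1) + c*(yg - y1) + d*(xg - x1)*(yg - y1);
}

int main()   /* data of Example 3.20.2 */
{
    printf("%f\n", bilinear(1, 1, 1, 1, 8, 10, 12, 20, 4.0/3, 5.0/3));
    /* prints 12.666667 = 38/3 */
    return 0;
}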

3.20.2 Global methods


Variables-separation method (bilinear form)
Let the interpolating points be evenly distributed over a rectangular grid. Let the
polynomial be of the form

    F(x, y) = Σ_{i=1}^{n} Σ_{j=1}^{n} aij x^{i−1} y^{j−1}.        (3.117)

This function can be written as

                                   | a11  a21  · · ·  an1 |  | 1       |
    F(x, y) = [1 y y² · · · y^{n−1}] | a12  a22  · · ·  an2 |  | x       |        (3.118)
                                   | ...  ...  ...    ... |  | ...     |
                                   | a1n  a2n  · · ·  ann |  | x^{n−1} |
That is, F(x, y) = Yᵗ(y) A X(x) (say),
where Y(y) = [1 y y² · · · y^{n−1}]ᵗ, X(x) = [1 x x² · · · x^{n−1}]ᵗ, A = [aij]_{n×n}.
Now, the function F(x, y) is constructed in such a way that

    F(xi, yj) = Y(yj) A X(xi),

where F is the rearranged array form of the values F(xi, yj), and Y(yj), X(xi)
are the matrices derived from Yᵗ(y), X(x) by introducing the points (xi, yj).
That is,

        | F(x1, y1)  F(x1, y2)  · · ·  F(x1, yn) |
    F = | F(x2, y1)  F(x2, y2)  · · ·  F(x2, yn) |,
        | · · ·                                  |
        | F(xn, y1)  F(xn, y2)  · · ·  F(xn, yn) |

            | 1  x1  x1²  · · ·  x1^{n−1} |             | 1  y1  y1²  · · ·  y1^{n−1} |
    X(xi) = | 1  x2  x2²  · · ·  x2^{n−1} |,  Yᵗ(yj) =  | 1  y2  y2²  · · ·  y2^{n−1} |.
            | · · ·                       |             | · · ·                       |
            | 1  xn  xn²  · · ·  xn^{n−1} |             | 1  yn  yn²  · · ·  yn^{n−1} |
Since the matrices X, Y and F are known, one can calculate the matrix A as (assuming
X and Y are non-singular)

    A* = (Y⁻¹)ᵗ F X⁻¹.        (3.119)

Thus F(x, y) = Yᵗ(y) A* X(x) is the required interpolating polynomial.


Example 3.20.3 Obtain a bilinear interpolating polynomial using the following
data
y 1 2
x
1 6 10
2 10 18

Solution. Here Xᵗ(x) = (1 x), Yᵗ(y) = (1 y), n = 2,

    X(xi) = | 1  1 |,   Yᵗ(yj) = | 1  1 |,
            | 1  2 |             | 1  2 |

    A* = | 1  1 |⁻¹ | 6   10 | | 1  1 |⁻¹  =  | 2  −1 | | 6   10 | | 2  −1 |  =  | 2  0 |.
         | 1  2 |   | 10  18 | | 1  2 |      | −1  1 | | 10  18 | | −1  1 |     | 0  4 |
Therefore,

    F(x, y) = [1 y] | 2  0 | | 1 |  =  2 + 4xy.
                    | 0  4 | | x |
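For this 2 × 2 case, the matrix computation of A* can be checked with a few lines of C.
The following sketch is illustrative only; since X and Y are identical and symmetric
here, the transpose in (3.119) plays no role.

#include <stdio.h>

void inv2(double m[2][2], double r[2][2])   /* inverse of a 2x2 matrix */
{
    double det = m[0][0]*m[1][1] - m[0][1]*m[1][0];
    r[0][0] =  m[1][1]/det;  r[0][1] = -m[0][1]/det;
    r[1][0] = -m[1][0]/det;  r[1][1] =  m[0][0]/det;
}

void mul2(double p[2][2], double q[2][2], double r[2][2])  /* r = p q */
{
    int i, j, k;
    for (i = 0; i < 2; i++)
        for (j = 0; j < 2; j++)
            for (r[i][j] = 0, k = 0; k < 2; k++)
                r[i][j] += p[i][k]*q[k][j];
}

int main()   /* data of Example 3.20.3 */
{
    double X[2][2] = {{1,1},{1,2}}, F[2][2] = {{6,10},{10,18}};
    double Xi[2][2], T[2][2], A[2][2];
    inv2(X, Xi);
    mul2(Xi, F, T);
    mul2(T, Xi, A);          /* A = X^{-1} F X^{-1} */
    printf("A = [%g %g; %g %g]\n", A[0][0], A[0][1], A[1][0], A[1][1]);
    /* prints A = [2 0; 0 4], i.e., F(x,y) = 2 + 4xy */
    return 0;
}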

Lagrange’s bivariate interpolation


Let f(x, y) be a function defined at (m + 1)(n + 1) distinct points (xi, yj), i = 0, 1, . . . , m;
j = 0, 1, . . . , n. Let us construct a polynomial F(x, y) of degree at most m in x and n in y,
such that

    F(xi, yj) = f(xi, yj),  i = 0, 1, . . . , m;  j = 0, 1, . . . , n.        (3.120)

As in Lagrange's polynomial (3.3) for a single variable, we define

    Lx,i(x) = wx(x)/[(x − xi)w′x(xi)],  i = 0, 1, . . . , m        (3.121)
    Ly,j(y) = wy(y)/[(y − yj)w′y(yj)],  j = 0, 1, . . . , n        (3.122)

where wx(x) = (x − x0)(x − x1) · · · (x − xm) and wy(y) = (y − y0)(y − y1) · · · (y − yn).


The functions Lx,i(x) and Ly,j(y) are polynomials of degree m in x and n in y
respectively, and also

    Lx,i(xk) = 0 if xi ≠ xk, 1 if xi = xk;   Ly,j(yk) = 0 if yj ≠ yk, 1 if yj = yk.        (3.123)

Thus the Lagrange's bivariate polynomial is

    F(x, y) = Σ_{i=0}^{m} Σ_{j=0}^{n} Lx,i(x) Ly,j(y) f(xi, yj).        (3.124)

Example 3.20.4 The following data for a function f (x, y) is given:

y 0 1
x
0 1 1.414214
1 1.732051 2

Find the Lagrange’s bivariate polynomial and hence find an approximate value of
f (0.25, 0.75).

Solution. m = 1, n = 1, x0 = 0, y0 = 0, x1 = 1, y1 = 1, f (x0 , y0 ) = 1, f (x0 , y1 ) =


1.414214, f (x1 , y0 ) = 1.732051, f (x1 , y1 ) = 2.
Then

    F(x, y) = Σ_{i=0}^{1} Σ_{j=0}^{1} Lx,i(x) Ly,j(y) f(xi, yj)
            = Lx,0(x){Ly,0(y)f(x0, y0) + Ly,1(y)f(x0, y1)}
            + Lx,1(x){Ly,0(y)f(x1, y0) + Ly,1(y)f(x1, y1)}.

Now,
    Lx,0(x) = (x − 1)/(0 − 1) = 1 − x,  Ly,0(y) = 1 − y,
    Lx,1(x) = x,  Ly,1(y) = y.

Therefore,

F (x, y) = (1 − x)(1 − y) × 1 + (1 − x)y × 1.414214 + x(1 − y) × 1.732051 + 2xy


= 1 + 0.732051x + 0.414214y − 0.146265xy.

Thus,

F (0.25, 0.75) = 1 + 0.732051 × 0.25 + 0.414214 × 0.75 − 0.146265 × 0.25 × 0.75


= 1.466248563.

Algorithm 3.6 (Lagrange’s bivariate interpolation). This algorithm finds the


value of f (x, y) by Lagrange’s interpolation method when a table of values of xi , yj ,
f (xi , yj ), i = 0, 1, . . . , m; j = 0, 1, . . . , n, is given.

Algorithm Lagrange Bivariate


Read xi , i = 0, 1, . . . , m; yj , j = 0, 1, . . . , n;// x and y values //
Read fij , i = 0, 1, . . . , m; j = 0, 1, . . . , n;// fij = f (xi , yj )//
Read xg, yg; //the values of x and y at which f (x, y) is to be determined.//
Set wx = 1, wy = 1;
for i = 0 to m do //computation of wx (xi )//
wx = wx ∗ (xg − xi );
endfor;
for j = 0 to n do //find wy (yj )//
wy = wy ∗ (yg − yj );
endfor;
Set sum = 0;
for i = 0 to m do
for j = 0 to n do
    Compute sum = sum + wx/((xg − xi) ∗ wdx(i)) ∗ wy/((yg − yj) ∗ wdy(j)) ∗ fij;
endfor;
endfor;
Print ‘The value of f (x, y) is ’, sum;
end Lagrange Bivariate

function wdx(j)
    prod = 1;
    for i = 0 to m do
        if (i ≠ j) prod = prod ∗ (xj − xi);
    endfor;
    return prod;
end wdx(j)

function wdy(j)
    prod = 1;
    for i = 0 to n do
        if (i ≠ j) prod = prod ∗ (yj − yi);
    endfor;
    return prod;
end wdy(j)

Program 3.6.
/* Program Lagrange bivariate
This program is used to find the value of a function
f(x,y) at a given point (x,y) when a set of values of
f(x,y) is given for different values of x and y, by
Lagrange bivariate interpolation formula. */
#include<stdio.h>
#include<math.h>
float x[20],y[20];
void main()
{
int i,j,n,m;
float xg,yg,f[20][20],wx=1,wy=1,sum=0;
float wdx(int j,int m); float wdy(int j,int n);
printf("Enter the number of subdivisions along x and y ");
scanf("%d %d",&m,&n);
printf("Enter x values ");
for(i=0;i<=m;i++) scanf("%f",&x[i]);
printf("Enter y values ");


for(i=0;i<=n;i++) scanf("%f",&y[i]);
printf("Enter function f(x,y) values \n");
for(i=0;i<=m;i++) for(j=0;j<=n;j++)
{printf("f(%f,%f)= ",x[i],y[j]);scanf("%f",&f[i][j]); }
printf("Enter the interpolating point ");
scanf("%f %f",&xg,&yg);
for(i=0;i<=m;i++) wx*=(xg-x[i]);
for(j=0;j<=n;j++) wy*=(yg-y[j]);
for(i=0;i<=m;i++)for(j=0;j<=n;j++)
sum+=wx*wy*f[i][j]/((xg-x[i])*wdx(i,m)*(yg-y[j])*wdy(j,n));
printf("The interpolated value at
(%8.5f,%8.5f) is %8.5f ",xg,yg,sum);
} /* main */
/* function to find w’(x[j]) */
float wdx(int j,int m)
{
int i; float prod=1;
for(i=0;i<=m;i++) if(i!=j) prod*=(x[j]-x[i]);
return prod;
}
/* function to find w’(y[j]) */
float wdy(int j,int n)
{
int i; float prod=1;
for(i=0;i<=n;i++) if(i!=j) prod*=(y[j]-y[i]);
return prod;
}

A sample of input/output:

Enter the number of subdivisions along x and y 2 2


Enter x values 0 1 2
Enter y values 0 1 2
Enter function f(x,y) values
f(0.000000,0.000000)= 2
f(0.000000,1.000000)= 3
f(0.000000,2.000000)= 6
f(1.000000,0.000000)= 3
f(1.000000,1.000000)= 5
f(1.000000,2.000000)= 9
f(2.000000,0.000000)= 6
f(2.000000,1.000000)= 9
f(2.000000,2.000000)= 14
Enter the interpolating point 0.5 0.5
The interpolated value at ( 0.50000, 0.50000) is 2.75000

Newton’s bivariate interpolation formula


Let f (x, y) be defined at (m + 1)(n + 1) distinct points (xi , yj ), i = 0, 1, . . . , m; j =
0, 1, . . . , n. Also, let xs = x0 + sh, yt = y0 + tk, x = x0 + mh and y = y0 + nk.
Some notations are defined in the following:

∆x f (x, y) = f (x + h, y) − f (x, y) = Ex f (x, y) − f (x, y)


= (Ex − 1)f (x, y)
∆y f (x, y) = f (x, y + k) − f (x, y) = Ey f (x, y) − f (x, y)
= (Ey − 1)f (x, y)
∆xx f(x, y) = (E²x − 2Ex + 1)f(x, y) = (Ex − 1)² f(x, y)
∆yy f(x, y) = (Ey − 1)² f(x, y)
∆xy f (x, y) = ∆x {f (x, y + k) − f (x, y)}
= {f (x + h, y + k) − f (x, y + k)} − {f (x + h, y) − f (x, y)}
= Ex Ey f (x, y) − Ey f (x, y) − Ex f (x, y) + f (x, y)
= (Ex − 1)(Ey − 1)f (x, y)

and so on.
Then,

    f(x, y) = f(x0 + mh, y0 + nk) = Ex^m Ey^n f(x0, y0)
            = (1 + ∆x)^m (1 + ∆y)^n f(x0, y0)
            = [1 + m∆x + (m(m − 1)/2!)∆xx + · · ·]
              × [1 + n∆y + (n(n − 1)/2!)∆yy + · · ·] f(x0, y0)
            = [1 + m∆x + n∆y + (m(m − 1)/2!)∆xx + (n(n − 1)/2!)∆yy
              + mn∆xy + · · ·] f(x0, y0).

Substituting m = (x − x0)/h and n = (y − y0)/k,
then m − 1 = (x − x0 − h)/h = (x − x1)/h and n − 1 = (y − y1)/k.
Thus

    F(x, y) = f(x0, y0) + [((x − x0)/h)∆x + ((y − y0)/k)∆y] f(x0, y0)
            + (1/2!)[((x − x0)(x − x1)/h²)∆xx + (2(x − x0)(y − y0)/(hk))∆xy
            + ((y − y0)(y − y1)/k²)∆yy] f(x0, y0) + · · ·        (3.125)
which is called Newton's bivariate interpolating polynomial.
Now, introduce the dimensionless quantities u and v defined by x = x0 + uh and y = y0 + vk.
Then x − xs = (u − s)h and y − yt = (v − t)k.
Hence, finally (3.125) becomes

    F(x, y) = f(x0, y0) + [u∆x + v∆y] f(x0, y0) + (1/2!)[u(u − 1)∆xx
            + 2uv∆xy + v(v − 1)∆yy] f(x0, y0) + · · ·        (3.126)
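Truncated after the second differences, formula (3.126) is easy to program. The
following C sketch (illustrative only) computes the required differences from a 3 × 3
equispaced table and is tested on the data of Example 3.20.5 below.

#include <stdio.h>

/* Newton's bivariate formula (3.126) up to second differences;
   f[i][j] = f(x0 + i h, y0 + j k), u = (x - x0)/h, v = (y - y0)/k. */
double newton_biv(double f[3][3], double u, double v)
{
    double dx  = f[1][0] - f[0][0];
    double dy  = f[0][1] - f[0][0];
    double dxx = f[2][0] - 2*f[1][0] + f[0][0];
    double dyy = f[0][2] - 2*f[0][1] + f[0][0];
    double dxy = f[1][1] - f[0][1] - f[1][0] + f[0][0];
    return f[0][0] + u*dx + v*dy
         + 0.5*(u*(u-1)*dxx + 2*u*v*dxy + v*(v-1)*dyy);
}

int main()
{
    double f[3][3] = {{1, 3, 5}, {-1, 2, 5}, {-5, -1, 3}};
    printf("%f %f\n", newton_biv(f, 0.75, 0.25),
                      newton_biv(f, 1.25, 1.5));   /* 0.375 and 3.0625 */
    return 0;
}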

Example 3.20.5 For the following data obtain Newton’s bivariate interpolating
polynomial and hence calculate the values of f (0.75, 0.25) and f (1.25, 1.5).

y 0 1 2
x
0 1 3 5
1 –1 2 5
2 –5 –1 3

Solution.

∆x f (x0 , y0 ) = f (x0 + h, y0 ) − f (x0 , y0 )


= f (x1 , y0 ) − f (x0 , y0 ) = −1 − 1 = −2
∆y f (x0 , y0 ) = f (x0 , y0 + k) − f (x0 , y0 )
= f (x0 , y1 ) − f (x0 , y0 ) = 3 − 1 = 2
∆xx f (x0 , y0 ) = f (x0 + 2h, y0 ) − 2f (x0 + h, y0 ) + f (x0 , y0 )
= f (x2 , y0 ) − 2f (x1 , y0 ) + f (x0 , y0 ) = −5 − 2 × (−1) + 1 = −2
∆yy f (x0 , y0 ) = f (x0 , y0 + 2k) − 2f (x0 , y0 + k) + f (x0 , y0 )
= f (x0 , y2 ) − 2f (x0 , y1 ) + f (x0 , y0 ) = 5 − 2 × 3 + 1 = 0
∆xy f (x0 , y0 ) = f (x0 + h, y0 + k) − f (x0 , y0 + k) − f (x0 + h, y0 ) + f (x0 , y0 )
= f (x1 , y1 ) − f (x0 , y1 ) − f (x1 , y0 ) + f (x0 , y0 ) = 1.
Here h = k = 1, u = (x − x0)/h = x, v = (y − y0)/k = y.
Thus,

    F(x, y) = 1 + [x × (−2) + y × 2]
            + (1/2!)[x(x − 1) × (−2) + 2xy × 1 + y(y − 1) × 0]
            = 1 − x + 2y − x² + xy.

Hence f(0.75, 0.25) ≈ F(0.75, 0.25) = 0.375 and f(1.25, 1.5) ≈ F(1.25, 1.5) = 3.0625.

3.21 Worked out Examples

Example 3.21.1 Using Lagrange's interpolation formula, express the function

    (x² + 2x + 3)/[(x + 1)x(x − 1)]

as a sum of partial fractions.

Solution. Let f(x) = x² + 2x + 3. Now, f(x) is tabulated for x = −1, 0, 1 as follows:

x : −1 0 1
f (x) : 2 3 6

The Lagrange's functions are

    L0(x) = (x − x1)(x − x2)/[(x0 − x1)(x0 − x2)] = x(x − 1)/2,
    L1(x) = (x − x0)(x − x2)/[(x1 − x0)(x1 − x2)] = (x + 1)(x − 1)/(−1),
    L2(x) = (x − x0)(x − x1)/[(x2 − x0)(x2 − x1)] = (x + 1)x/2.

By Lagrange’s interpolation formula the polynomial f (x) is given by

    f(x) = [x(x − 1)/2] × 2 + [(x + 1)(x − 1)/(−1)] × 3 + [(x + 1)x/2] × 6
         = x(x − 1) − 3(x + 1)(x − 1) + 3x(x + 1).

Hence

    (x² + 2x + 3)/[(x + 1)x(x − 1)] = f(x)/[(x + 1)x(x − 1)] = 1/(x + 1) − 3/x + 3/(x − 1).
Example 3.21.2 Find the missing term in the following table


x : 0 1 2 3 4
y : 1 2 4 ? 16

Solution.
Method 1.
Using Lagrange's formula,

    L0(x) = (x − 1)(x − 2)(x − 4)/[(0 − 1)(0 − 2)(0 − 4)] = (x³ − 7x² + 14x − 8)/(−8),
    L1(x) = (x − 0)(x − 2)(x − 4)/[(1 − 0)(1 − 2)(1 − 4)] = (x³ − 6x² + 8x)/3,
    L2(x) = (x − 0)(x − 1)(x − 4)/[(2 − 0)(2 − 1)(2 − 4)] = (x³ − 5x² + 4x)/(−4),
    L3(x) = (x − 0)(x − 1)(x − 2)/[(4 − 0)(4 − 1)(4 − 2)] = (x³ − 3x² + 2x)/24.

Therefore,

    y(x) ≈ y0 L0(x) + y1 L1(x) + y2 L2(x) + y3 L3(x)
         = [(x³ − 7x² + 14x − 8)/(−8)] × 1 + [(x³ − 6x² + 8x)/3] × 2
         + [(x³ − 5x² + 4x)/(−4)] × 4 + [(x³ − 3x² + 2x)/24] × 16
         = (5/24)x³ − (1/8)x² + (11/12)x + 1.
Thus, y(3) = 8.25.
Hence the missing term is 8.25.
Method 2.
Let us construct a polynomial of degree 3 in the form

    y(x) = a + bx + cx² + dx³.

If the curve passes through the points x = 0, 1, 2, 4, then


a = 1, a + b + c + d = 2,
a + 2b + 4c + 8d = 4, a + 4b + 16c + 64d = 16.
The solution of these equations is a = 1, b = 11/12, c = −1/8 and d = 5/24.
Therefore,

    y(x) = 1 + (11/12)x − (1/8)x² + (5/24)x³.
Thus y(3) = 8.25.

Example 3.21.3 Let f (x) = log x, x0 = 2 and x1 = 2.1. Use linear interpolation
to calculate an approximate value for f (2.05) and obtain a bound on the truncation
error.

Solution. Let f (x) = log x. The table is

x : 2.0 2.1
y : 0.693147 0.741937

The linear interpolation polynomial is

    φ(x) = [(x − 2.1)/(2.0 − 2.1)] × 0.693147 + [(x − 2.0)/(2.1 − 2.0)] × 0.741937
         = 0.487900x − 0.282653.

Thus, f(2.05) ≈ φ(2.05) = 0.717542.

The error in linear interpolation is

    |E1(x)| = |(x − x0)(x − x1)| · |f″(ξ)/2!|,  2 ≤ ξ ≤ 2.1.

The maximum value of |f″(x)| = |−1/x²| in 2 ≤ x ≤ 2.1 is |f″(2.0)| = 0.25.
Then

    |E1(x)| ≤ |(2.05 − 2)(2.05 − 2.1)| × 0.25/2 = 0.000313.
Thus the upper bound of truncation error is 0.000313.

Example 3.21.4 For the following table find the value of y at x = 2.5, using
piecewise linear interpolation.
x : 1 2 3 4 5
y : 35 40 65 72 80

Solution. The point x = 2.5 lies between 2 and 3. Then


    y(2.5) = [(2.5 − 3)/(2 − 3)] × 40 + [(2.5 − 2)/(3 − 2)] × 65 = 52.5.
Example 3.21.5 Deduce the following interpolation formula taking three points
x0, x0 + ε (ε → 0) and x1, using Lagrange's formula:

    f(x) = [(x1 − x)(x + x1 − 2x0)/(x1 − x0)²] f(x0) + [(x − x0)(x1 − x)/(x1 − x0)] f′(x0)
         + [(x − x0)²/(x1 − x0)²] f(x1) + E(x),

where E(x) = (1/6)(x − x0)²(x − x1)f‴(ξ) and min{x0, x0 + ε, x1} ≤ ξ ≤ max{x0, x0 + ε, x1}.
Solution. The Lagrange's interpolating polynomial for the points x0, x0 + ε and x1 is

    f(x) ≈ [(x − x0 − ε)(x − x1)/((x0 − x0 − ε)(x0 − x1))] f(x0)
         + [(x − x0)(x − x1)/((x0 + ε − x0)(x0 + ε − x1))] f(x0 + ε)
         + [(x − x0)(x − x0 − ε)/((x1 − x0)(x1 − x0 − ε))] f(x1) + E(x)

       = [(x − x0 − ε)(x − x1)/(−ε(x0 − x1))] f(x0) + [(x − x0)(x − x1)/(ε(x0 + ε − x1))] f(x0)
         + [(x − x0)(x − x1)/(x0 − x1 + ε)] · [(f(x0 + ε) − f(x0))/ε]
         + [(x − x0)(x − x0 − ε)/((x1 − x0)(x1 − x0 − ε))] f(x1) + E(x)

       = [(2x0 − x1 − x)(x − x1)/((x0 − x1)(x0 − x1 + ε))] f(x0)
         + [(x − x0)(x − x1)/(x0 − x1 + ε)] · [(f(x0 + ε) − f(x0))/ε]
         + [(x − x0)(x − x0 − ε)/((x1 − x0)(x1 − x0 − ε))] f(x1) + E(x)

       → [(x1 − x)(x + x1 − 2x0)/(x1 − x0)²] f(x0) + [(x − x0)(x1 − x)/(x1 − x0)] f′(x0)
         + [(x − x0)²/(x1 − x0)²] f(x1) + E(x)   as ε → 0.

The error term is

    E(x) = (x − x0)(x − x0 − ε)(x − x1) f‴(ξ)/3!
         = (1/6)(x − x0)²(x − x1) f‴(ξ), as ε → 0,

and min{x0, x0 + ε, x1} ≤ ξ ≤ max{x0, x0 + ε, x1}.
Example 3.21.6 The standard normal probability integral

    P(x) = √(2/π) ∫₀ˣ exp(−t²/2) dt

has the following values:

    x    :  1.00      1.05      1.10      1.15      1.20      1.25
    P(x) :  0.682689  0.706282  0.728668  0.749856  0.769861  0.788700

Calculate P (1.235).
Solution. The backward difference table is
x      P(x)      ∇P        ∇²P        ∇³P
1.00 0.682689
1.05 0.706282 0.023593
1.10 0.728668 0.022386 −0.001207
1.15 0.749856 0.021188 −0.001198 0.000009
1.20 0.769861 0.020005 −0.001183 0.000015
1.25 0.788700 0.018839 −0.001166 0.000017

Here x = 1.235, h = 0.05, xn = 1.25, v = (x − xn )/h = (1.235 − 1.25)/0.05 = −0.3.

    P(1.235) = P(xn) + v∇P(xn) + (v(v + 1)/2!)∇²P(xn) + (v(v + 1)(v + 2)/3!)∇³P(xn)
             = 0.788700 − 0.3 × 0.018839 + [−0.3(−0.3 + 1)/2] × (−0.001166)
             + [−0.3(−0.3 + 1)(−0.3 + 2)/6] × 0.000017
             = 0.783169.

Example 3.21.7 Find the seventh and the general terms of the series 3, 9, 20, 38,
65, . . ..

Solution. Let xi = i, i = 1, 2, 3, 4, 5, and y1 = 3, y2 = 9, y3 = 20, y4 = 38, y5 = 65.
We construct the Newton's backward interpolation polynomial using these values.
x    y    ∇y   ∇²y  ∇³y
1 3
2 9 6
3 20 11 5
4 38 18 7 2
5 65 27 9 2
Here xn = 5, v = (x − xn )/h = x − 5.

    φ(x) = yn + v∇yn + (v(v + 1)/2!)∇²yn + (v(v + 1)(v + 2)/3!)∇³yn
         = 65 + 27v + 9 · v(v + 1)/2! + 2 · v(v + 1)(v + 2)/3!
         = (1/6)(2v³ + 33v² + 193v + 390)
         = (1/6)[2(x − 5)³ + 33(x − 5)² + 193(x − 5) + 390]
         = (1/6)(2x³ + 3x² + 13x).
The seventh term is

    φ(7) = (1/6)(2 × 7³ + 3 × 7² + 13 × 7) = 154.
[Other interpolation formulae may also be used to solve this problem.]

Example 3.21.8 From the following table of sin x compute sin 12° and sin 45°.

    x         :  10°      20°      30°      40°      50°
    y = sin x :  0.17365  0.34202  0.50000  0.64279  0.76604

Solution. The difference table is


x     y        ∆y       ∆²y       ∆³y       ∆⁴y
10°   0.17365
               0.16837
20°   0.34202           −0.01039
               0.15798            −0.00480
30°   0.50000           −0.01519             0.00045
               0.14279            −0.00435
40°   0.64279           −0.01954
               0.12325
50°   0.76604

(i) To find sin 12°.
Here x0 = 10°, x = 12°, h = 10°, u = (x − x0)/h = (12° − 10°)/10° = 0.2.
By Newton's forward formula

    y(12°) = sin 12° ≈ y0 + u∆y0 + (u(u − 1)/2!)∆²y0 + (u(u − 1)(u − 2)/3!)∆³y0
           + (u(u − 1)(u − 2)(u − 3)/4!)∆⁴y0
         = 0.17365 + 0.2 × 0.16837 + [0.2(0.2 − 1)/2] × (−0.01039)
           + [0.2(0.2 − 1)(0.2 − 2)/6] × (−0.00480)
           + [0.2(0.2 − 1)(0.2 − 2)(0.2 − 3)/24] × (0.00045)
         = 0.20791.

(ii) To find sin 45°.
Here xn = 50°, x = 45°, h = 10°, v = (x − xn)/h = (45° − 50°)/10° = −0.5.
By Newton's backward formula

    y(45°) = sin 45°
         = 0.76604 − 0.5 × 0.12325 + [−0.5(−0.5 + 1)/2] × (−0.01954)
           + [−0.5(−0.5 + 1)(−0.5 + 2)/6] × (−0.00435)
           + [−0.5(−0.5 + 1)(−0.5 + 2)(−0.5 + 3)/24] × (0.00045)
         = 0.70711.

Example 3.21.9 Use Stirling’s formula to find u32 from the following table
u20 = 14.035, u25 = 13.674, u30 = 13.257,
u35 = 12.734, u40 = 12.089, u45 = 11.309.

Solution. The finite difference table is shown below.


i     xi    uxi      ∆uxi     ∆²uxi    ∆³uxi
−2 20 14.035
−0.361
−1 25 13.674 −0.056
−0.417 −0.050
0 30 13.257 −0.106
−0.523 −0.016
1 35 12.734 −0.122
−0.645 −0.013
2 40 12.089 −0.135
−0.780
3 45 11.309
Here x = 32, x0 = 30, h = 5, s = (x − x0)/h = 0.4.
Therefore,

    u32 = y0 + s(∆y−1 + ∆y0)/2 + (s²/2!)∆²y−1 + (s(s² − 1)/3!) · (∆³y−2 + ∆³y−1)/2
        = 13.257 + 0.4 × (−0.417 − 0.523)/2 + (0.4²/2) × (−0.106)
          + [0.4(0.4² − 1)/6] × (−0.050 − 0.016)/2
        = 13.059.


Example 3.21.10 The function y = ∛x is tabulated below.

    x : 5600      5700      5800      5900      6000
    y : 17.75808  17.86316  17.96702  18.06969  18.17121

Compute ∛5860 by (i) Bessel's formula, and (ii) Stirling's formula.

Solution. The finite difference table is

i     x      y         ∆y       ∆²y       ∆³y      ∆⁴y
−2 5600 17.75808
0.10508
−1 5700 17.86316 −0.00122
0.10386 0.00003
0 5800 17.96702 −0.00119 0.00001
0.10267 0.00004
1 5900 18.06969 −0.00115
0.10152
2 6000 18.17121

(i) For x = 5860, let us take x0 = 5800; then s = (5860 − 5800)/100 = 0.6.
By Bessel's formula

    y(5860) = (y0 + y1)/2 + (s − 0.5)∆y0 + (s(s − 1)/2!) · (∆²y−1 + ∆²y0)/2
            + ((s − 0.5)s(s − 1)/3!)∆³y−1
          = (17.96702 + 18.06969)/2 + (0.6 − 0.5) × 0.10267
            + [0.6(0.6 − 1)/2] × (−0.00115 − 0.00119)/2
            + (1/6)(0.6 − 0.5)(0.6)(0.6 − 1) × 0.00004
          = 18.02877.

(ii) By Stirling's formula

    y(5860) = y0 + s(∆y−1 + ∆y0)/2 + (s²/2!)∆²y−1 + (s(s² − 1)/3!) · (∆³y−2 + ∆³y−1)/2
            = 17.96702 + 0.6 × (0.10386 + 0.10267)/2 + (0.6²/2) × (−0.00119)
              + [0.6(0.6² − 1)/6] × (0.00003 + 0.00004)/2
            = 18.02877.

Thus ∛5860 = 18.02877.

Example 3.21.11 Prove that the third order divided difference of the function
f(x) = 1/x with arguments a, b, c, d is −1/(abcd).

Solution. Here f(x) = 1/x.

    f[a, b] = (f(b) − f(a))/(b − a) = (1/b − 1/a)/(b − a) = −1/(ab).
    f[a, b, c] = (f[a, b] − f[b, c])/(a − c) = (−1/(ab) + 1/(bc))/(a − c) = 1/(abc).

The third order divided difference is

    f[a, b, c, d] = (f[a, b, c] − f[b, c, d])/(a − d) = (1/(abc) − 1/(bcd))/(a − d) = −1/(abcd).

Example 3.21.12 If f(x) = 1/x, prove that f[x0, x1, . . . , xn] = (−1)ⁿ/(x0 x1 · · · xn).

Solution. The first order divided difference is

    f[x0, x1] = (f(x1) − f(x0))/(x1 − x0) = (1/x1 − 1/x0)/(x1 − x0) = −1/(x0x1) = (−1)¹/(x0x1).
The second order divided difference is

    f[x0, x1, x2] = (f[x0, x1] − f[x1, x2])/(x0 − x2) = (−1/(x0x1) + 1/(x1x2))/(x0 − x2)
                  = (−1)²/(x0x1x2).

Thus the result is true for n = 1, 2.
Let the result be true for n = k, i.e.,

    f[x0, x1, . . . , xk] = (−1)ᵏ/(x0 x1 · · · xk).

Now,

    f[x0, x1, . . . , xk, xk+1] = (f[x0, x1, . . . , xk] − f[x1, x2, . . . , xk+1])/(x0 − xk+1)
        = [1/(x0 − xk+1)] · [(−1)ᵏ/(x0 x1 · · · xk) − (−1)ᵏ/(x1 x2 · · · xk+1)]
        = [(−1)ᵏ/(x1 x2 · · · xk)] · [1/(x0 − xk+1)] · (1/x0 − 1/xk+1)
        = (−1)ᵏ⁺¹/(x0 x1 x2 · · · xk+1).

Therefore, the result is true for n = k + 1. Hence by mathematical induction the


result is true for n = 1, 2, . . . .
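The recursive definition used above is also convenient computationally. The C sketch
below (illustrative only) builds the divided-difference table in place and checks the
result of Example 3.21.12 for one set of arguments.

#include <stdio.h>

/* Returns f[x0, x1, ..., xn] computed from the recursive definition. */
double divdiff(double x[], double f[], int n)
{
    int i, j;
    double d[10];
    for (i = 0; i <= n; i++) d[i] = f[i];
    for (j = 1; j <= n; j++)           /* j-th order differences */
        for (i = n; i >= j; i--)
            d[i] = (d[i] - d[i-1]) / (x[i] - x[i-j]);
    return d[n];
}

int main()
{
    double x[] = {1, 2, 4, 5}, f[4];
    int i;
    for (i = 0; i < 4; i++) f[i] = 1.0 / x[i];
    printf("%f\n", divdiff(x, f, 3));  /* (-1)^3/(1*2*4*5) = -0.025 */
    return 0;
}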

Example 3.21.13 If f (x) = u(x)v(x) then show that

f [x0 , x1 ] = u(x0 )v[x0 , x1 ] + v(x1 )u[x0 , x1 ]

and hence deduce that if g(x) = w²(x) then

g[x0 , x1 ] = w[x0 , x1 ][w(x0 ) + w(x1 )].

Solution.
    f[x0, x1] = (f(x0) − f(x1))/(x0 − x1) = (u(x0)v(x0) − u(x1)v(x1))/(x0 − x1)
             = {u(x0)[v(x0) − v(x1)] + v(x1)[u(x0) − u(x1)]}/(x0 − x1)
             = u(x0)v[x0, x1] + v(x1)u[x0, x1].

If g(x) = w²(x), then let u(x) = v(x) = w(x). Then

    g[x0, x1] = w(x0)w[x0, x1] + w(x1)w[x0, x1] = w[x0, x1][w(x0) + w(x1)].
Example 3.21.14 Show that the nth divided difference f [x0 , x1 , . . . , xn ] can be
expressed as
f[x0, x1, . . . , xn] = D ÷ V, where

        | 1       1       1       · · ·  1       |
        | x0      x1      x2      · · ·  xn      |
    D = | x0²     x1²     x2²     · · ·  xn²     |
        | · · ·                                  |
        | x0ⁿ⁻¹   x1ⁿ⁻¹   x2ⁿ⁻¹   · · ·  xnⁿ⁻¹   |
        | y0      y1      y2      · · ·  yn      |

and

        | 1       1       1       · · ·  1       |
        | x0      x1      x2      · · ·  xn      |
    V = | x0²     x1²     x2²     · · ·  xn²     |.
        | · · ·                                  |
        | x0ⁿ⁻¹   x1ⁿ⁻¹   x2ⁿ⁻¹   · · ·  xnⁿ⁻¹   |
        | x0ⁿ     x1ⁿ     x2ⁿ     · · ·  xnⁿ     |

Solution. The Vandermonde's determinant is

                            | 1      1      · · ·  1     |
                            | x0     x1     · · ·  xn    |
    V(x0, x1, . . . , xn) = | x0²    x1²    · · ·  xn²   |.
                            | · · ·                      |
                            | x0ⁿ    x1ⁿ    · · ·  xnⁿ   |

Let
                                 | 1      1      · · ·  1        1  |
                                 | x0     x1     · · ·  xn−1     x  |
    V(x0, x1, . . . , xn−1, x) = | x0²    x1²    · · ·  xn−1²    x² |.
                                 | · · ·                            |
                                 | x0ⁿ    x1ⁿ    · · ·  xn−1ⁿ    xⁿ |

When x = x0, x1, . . . , xn−1, then V = 0,
i.e., (x − x0), (x − x1), (x − x2), . . . , (x − xn−1) are the factors of V.
Then one can write

    V(x0, x1, . . . , xn−1, x) = (x − x0)(x − x1) · · · (x − xn−1)∆,

where ∆ is a constant.
Equating the coefficients of xⁿ, the value of ∆ is given by

        | 1       1       · · ·  1        |
        | x0      x1      · · ·  xn−1     |
    ∆ = | x0²     x1²     · · ·  xn−1²    | = V(x0, x1, . . . , xn−1).
        | · · ·                           |
        | x0ⁿ⁻¹   x1ⁿ⁻¹   · · ·  xn−1ⁿ⁻¹  |

Therefore,

    V(x0, x1, . . . , xn) = V(x0, x1, . . . , xn−1) ∏_{i=0}^{n−1} (xn − xi).

Applying this result successively, the explicit expression for V is given by

    V(x0, x1, . . . , xn) = {(xn − x0)(xn − x1) · · · (xn − xn−1)} ×
                            {(xn−1 − x0)(xn−1 − x1) · · · (xn−1 − xn−2)} ×
                            · · · × {(x1 − x0)}
                          = ∏_{i>j} (xi − xj).

Let us consider the determinant

    C(x0, x1, . . . , xi−1, xi+1, . . . , xn) =
        | 1        1        · · ·  1          1          · · ·  1       |
        | x0       x1       · · ·  xi−1       xi+1       · · ·  xn      |
        | x0²      x1²      · · ·  xi−1²      xi+1²      · · ·  xn²     |.
        | · · ·                                                         |
        | x0ⁿ⁻¹    x1ⁿ⁻¹    · · ·  xi−1ⁿ⁻¹    xi+1ⁿ⁻¹    · · ·  xnⁿ⁻¹   |

It can be shown that

    C(x0, x1, . . . , xi−1, xi+1, . . . , xn) = (−1)ⁿ V(x0, x1, . . . , xi−1, xi+1, . . . , xn).

Therefore,

    V(x0, x1, . . . , xi−1, xi+1, . . . , xn)/V(x0, x1, x2, . . . , xn)
        = 1/[(x0 − xi)(x1 − xi) · · · (xi−1 − xi)(xi − xi+1) · · · (xi − xn)]
        = (−1)ⁱ/[(xi − x0)(xi − x1) · · · (xi − xi−1)(xi − xi+1) · · · (xi − xn)].
Now,

    D = (−1)ⁿ y0 C(x1, x2, . . . , xn) + (−1)ⁿ⁺¹ y1 C(x0, x2, . . . , xn) + · · ·
        + (−1)ⁿ⁺ⁿ yn C(x0, x1, . . . , xn−1)
      = Σ_{i=0}^{n} (−1)ⁱ yi V(x0, x1, . . . , xi−1, xi+1, . . . , xn).

Thus,

    D ÷ V = Σ_{i=0}^{n} (−1)ⁱ yi V(x0, x1, . . . , xi−1, xi+1, . . . , xn)/V(x0, x1, . . . , xn)
          = Σ_{i=0}^{n} (−1)ⁱ yi (−1)ⁱ/[(xi − x0)(xi − x1) · · · (xi − xi−1)(xi − xi+1) · · · (xi − xn)]
          = Σ_{i=0}^{n} yi/[(xi − x0)(xi − x1) · · · (xi − xi−1)(xi − xi+1) · · · (xi − xn)]
          = f[x0, x1, . . . , xn].

Example 3.21.15 Given y = f (x) in the following table,

x : 10 15 17
y : 3 7 11

Find the values of x when y = 10 and y = 5.

Solution. Here, the inverse Lagrange's interpolation formula is used in the following
form:

    φ(y) = Σ_{i=0}^{n} Li(y) xi.

The Lagrangian functions are

    L0(y) = (y − y1)(y − y2)/[(y0 − y1)(y0 − y2)] = (y − 7)(y − 11)/[(3 − 7)(3 − 11)] = (y² − 18y + 77)/32,
    L1(y) = (y − y0)(y − y2)/[(y1 − y0)(y1 − y2)] = (y − 3)(y − 11)/[(7 − 3)(7 − 11)] = (y² − 14y + 33)/(−16),
    L2(y) = (y − y0)(y − y1)/[(y2 − y0)(y2 − y1)] = (y − 3)(y − 7)/[(11 − 3)(11 − 7)] = (y² − 10y + 21)/32.
Then

    φ(y) = [(y² − 18y + 77)/32] × 10 − [(y² − 14y + 33)/16] × 15 + [(y² − 10y + 21)/32] × 17
         = (1/32)(137 + 70y − 3y²).

Hence,
    x(10) ≈ φ(10) = (1/32)(137 + 700 − 300) = 16.78125
and x(5) ≈ φ(5) = (1/32)(137 + 350 − 75) = 12.87500.

Example 3.21.16 Use inverse Lagrange’s interpolation to find a root of the equa-
tion y ≡ x3 − 3x + 1 = 0.

Solution. Here y(0) = 1 > 0 and y(1) = −1 < 0. One root lies between 0 and 1.
Now, x and y are tabulated, considering five points of x as 0, 0.25, 0.50, 0.75 and 1.
x : 0 0.25 0.50 0.75 1.00
y : 1 0.26563 −0.37500 −0.82813 −1.0000

Here y = 0. Then

    L0(y) = (y − y1)(y − y2)(y − y3)(y − y4)/[(y0 − y1)(y0 − y2)(y0 − y3)(y0 − y4)]
          = y1y2y3y4/[(y0 − y1)(y0 − y2)(y0 − y3)(y0 − y4)] = −0.08249/3.69194 = −0.02234.
    L1(y) = y0y2y3y4/[(y1 − y0)(y1 − y2)(y1 − y3)(y1 − y4)] = −0.31054/(−0.65125) = 0.47684.
    L2(y) = y0y1y3y4/[(y2 − y0)(y2 − y1)(y2 − y3)(y2 − y4)] = 0.21997/0.24947 = 0.88176.
    L3(y) = y0y1y2y4/[(y3 − y0)(y3 − y1)(y3 − y2)(y3 − y4)] = 0.09961/(−0.15577) = −0.63966.
    L4(y) = y0y1y2y3/[(y4 − y0)(y4 − y1)(y4 − y2)(y4 − y3)] = 0.08249/0.27190 = 0.30338.

Therefore,

    x ≈ φ(0) = Σ_{i=0}^{4} Li(0) xi = −0.02234 × 0 + 0.47684 × 0.25
        + 0.88176 × 0.50 − 0.63966 × 0.75 + 0.30338 × 1
      = 0.38373.
Hence, the approximate root is 0.38373.
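The same computation can be done by a short C sketch (illustrative only); it simply
applies Lagrange's formula with the roles of x and y interchanged and evaluates at
y = 0.

#include <stdio.h>

/* Inverse Lagrange interpolation: x as a polynomial in y, at y = yg. */
double inv_lagrange(double x[], double y[], int n, double yg)
{
    int i, j;
    double sum = 0.0;
    for (i = 0; i <= n; i++) {
        double L = 1.0;
        for (j = 0; j <= n; j++)
            if (j != i) L *= (yg - y[j]) / (y[i] - y[j]);
        sum += L * x[i];
    }
    return sum;
}

int main()   /* data of Example 3.21.16 */
{
    double x[] = {0, 0.25, 0.50, 0.75, 1.00};
    double y[] = {1, 0.26563, -0.37500, -0.82813, -1.0};
    printf("%f\n", inv_lagrange(x, y, 4, 0.0));  /* about 0.38373 */
    return 0;
}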


3.22 Exercise

1. Show that

    Σ_{i=1}^{n} w(x)/[(x − xi)w′(xi)] = 1.

2. Show that L0 (x) + L1 (x) + L2 (x) = 1 for all x.

3. Find a polynomial for f (x) where f (0) = 1, f (1) = 2 and f (3) = 5, using La-
grange’s method.

4. Show that the truncation error in quadratic interpolation in an equidistant table
is bounded by

    [h³/(9√3)] max |f‴(ξ)|.

5. Suppose that f(x) = eˣ cos x is to be approximated on [0, 1] by an interpolating


polynomial on n+1 equally spaced points 0 = x0 < x1 < · · · < xn = 1. Determine
n so that the truncation error will be less than 0.0001 in this interval.

6. Determine an appropriate step size to use in the construction of a table of f(x) =
(1 + x)⁴ on [0, 1]. The truncation error for linear interpolation is to be bounded
by 5 × 10⁻⁵.
7. The function defined by f(x) = ∫₁ˣ dt/(2√t) is tabulated for equally spaced values of
x with h = 0.1. What is the maximum error encountered if piecewise quadratic
interpolation is to be used to calculate f(a) where a ∈ [1, 2]?

8. Find the formula for the upper bound of the error involved in linearly interpolating
f(x) between a and b. Use the formula to find the maximum error encountered
when f(x) = ∫₀ˣ e^{−t²} dt is interpolated between x = 0 and x = 1.
0

9. From the following table, find the number of students who obtain less than 35
marks

Marks : 20-30 30-40 40-50 50-60 60-70


No. of Students : 32 53 50 38 33

10. If y(1) = −3, y(3) = 9, y(4) = 30 and y(6) = 132, find the Lagrange’s interpolation
polynomial that takes the same values as the function y at the given points.

11. Let the following observations follow the law of a cubic polynomial.
x : 0 1 2 3 4
f (x) : 1 −2 1 16 49

Find the extrapolated value of f (5).

12. Use Lagrange's interpolation formula to express the function

    (3x² + 2x − 5)/[(x − 1)(x − 2)(x − 3)]

as a sum of partial fractions.

13. Express the function

    (x³ + 6x + 2)/[(x + 1)x(x − 2)(x − 4)]

as a sum of partial fractions.

14. Using Lagrange’s interpolation formula, express the function f (x) = 3x2 − 2x + 5
as the sum of products of the factors (x − 1), (x − 2) and (x − 3) taken two at a
time.

15. Compute the missing values of yn and ∆yn in the following table

yn ∆yn ∆2 yn


− 1

− 4
5
6 13

− 18

− 24

16. The following table gives pressure of a steam plant at a given temperature. Using
Newton's formula, compute the pressure for a temperature of 142°C.

    Temperature (°C)   : 140    150    160    170    180
    Pressure (kgf/cm²) : 3.685  4.854  6.302  8.076  10.225
17. The following data gives the melting point of an alloy of lead and zinc, where T
is the temperature in °C and P is the percentage of lead in the alloy. Find the
melting point of the alloy containing 84% of lead using Newton’s interpolation
method.

P : 50 60 70 80
T : 222 246 272 299

18. Using a polynomial of third degree, complete the record of the export of a certain
commodity during five years, as given below.

Year, x : 1988 1989 1990 1991 1992


Export in tons, y : 450 388 − 400 470

19. Find the polynomial which attains the following values at the given points.

x : −1 0 1 2 3 4
f (x) : −16 −7 −4 −1 8 29

20. Compute log10 2.5 using Newton’s forward difference interpolation formula, given
that

x : 2.0 2.2 2.4 2.6 2.8 3.0


log10 x : 0.30103 0.34242 0.38021 0.41497 0.44716 0.47721

21. Find the missing term in the following table:

x : 0 1 2 3 4
y : 1 3 9 − 81

Why does the result differ from 3³ = 27?

22. In the following table, the value of y are consecutive terms of a series of which the
number 36 is the fifth term. Find the first and the tenth terms of the series. Find
also the polynomial which approximates these values.

x : 3 4 5 6 7 8 9
y : 18 26 36 48 62 78 96

23. From the following table determine (a) f (0.27), and (b) f (0.33).

x : 0.24 0.26 0.28 0.30 0.32 0.34


f (x) : 1.6804 1.6912 1.7024 1.7139 1.7233 1.7532

24. The population of a town in the decennial censuses was as given below. Estimate
    the population for the year 1955.

    Year : 1921 1931 1941 1951 1961
    Population (in crore) : 46 68 83 95 105

25. Using Gauss’s forward formula, find the value of f (32) given that
f (25) = 0.2707, f (30) = 0.3027, f (35) = 0.3386, f (40) = 0.3794.

26. Using Gauss’s backward formula, find the value of 518 given that
√ √
√500 = 22.360680, √510 = 22.583100,
520 = 22.803509, 530 = 23.021729.

27. Use a suitable central difference formula of either Stirling’s or Bessel’s to find the
values of f (x) from the following tabulated function at x = 1.35 and at x = 1.42.

x : 1.0 1.2 1.4 1.6 1.8


f (x) : 1.17520 1.50946 1.90430 2.37557 2.94217

28. From Bessel’s formula, derive the following formula for midway interpolation
1 1 3
y1/2 = (y0 + y1 ) − (∆2 y−1 + ∆2 y0 ) + (∆4 y−2 + ∆4 y−1 ) − · · ·
2 16 256
Also deduce this formula from Everett’s formula.
29. The function log E, where E = ∫₀^{π/2} √(1 − sin²α sin²θ) dθ, is tabulated below:

    α° : 0 5 10 15 20
    log E : 0.196120 0.195293 0.192815 0.188690 0.182928

    Compute log E at α = 12° by (a) Bessel's formula and (b) Stirling's formula and
    compare the results.

30. The values of the elliptic integral
    E(α) = ∫₀^{π/2} (1 − α sin²θ)^{−1/2} dθ
    for certain equidistant values of α are given below. Use Everett's or Bessel's
    formula to determine E(0.25).

    α : 0.20 0.22 0.24 0.26 0.28 0.30
    E(α) : 1.659624 1.669850 1.680373 1.691208 1.702374 1.713889

31. Using Everett’s formula, evaluate f (20) from the following table.

x : 14 18 22 26
f (x) : 2877 3162 3566 3990

32. Using Aitken’s method evaluate y when x = 2 from the following table.

x : 1 3 4 6
y : −3 9 30 132

33. Use the Aitken’s procedure to determine the value of f (0.2) as accurately as
possible from the following table.

x : 0.17520 0.25386 0.33565 0.42078 0.50946


f (x) : 0.84147 0.86742 0.89121 0.91276 0.93204

34. Show that the first order divided difference of a linear polynomial is independent
of the arguments.

35. Show that the second order divided difference of a quadratic polynomial is con-
stant.

36. If f′(x) is continuous for x₀ ≤ x ≤ x₁, show that f[x₀, x₁] = f′(ξ), where
    x₀ ≤ ξ ≤ x₁, and hence show that
    f[x₀, x₀] ≡ lim_{x₁→x₀} f[x₀, x₁] = f′(x₀).

37. For the equidistant values x₀, x₁, x₂, x₃, i.e., xᵢ = x₀ + ih, establish the following
    relations:
    f[x₀, x₁] = (1/h)[f(x₁) − f(x₀)],
    f[x₀, x₁, x₂] = [1/(2!h²)][f(x₂) − 2f(x₁) + f(x₀)],
    and f[x₀, x₁, x₂, x₃] = [1/(3!h³)][f(x₃) − 3f(x₂) + 3f(x₁) − f(x₀)].

38. If f(x) = (ax + b)/(cx + d), obtain expressions for f[p, q], f[p, p, q] and f[p, p, q, q].

39. If f(x) = x⁴, obtain expressions for f[a, b, c], f[a, a, b] and f[a, a, a], where a ≠ b ≠ c.

40. If f(x) = 1/(a − x), show that
    f[x₀, x₁, . . . , xₙ] = 1/[(a − x₀)(a − x₁) · · · (a − xₙ)].

41. Use Newton’s divided difference interpolation to find the interpolation polynomial
for the function y = f (x) given by the table:

x : −1 1 4 6
y : 1 −3 21 127

42. Use Newton’s divided difference interpolation to find the interpolating polynomial
for the function y = f (x) given by

x : −1 1 4 6
f (x) : 5 2 26 132

43. Using the given table of value of Bessel’s function y = J0 (x), find the root of the
equation J0 (x) = 0 lying in (2.4, 2.6) correct up to three significant digits.

x : 2.4 2.5 2.6


y : 0.0025 −0.0484 −0.0968

44. Given below is a table of values of the probability integral
    y = (2/√π) ∫₀ˣ e^{−t²} dt.
    Determine the value of x for which the value of y is 0.49.

    x : 0.45 0.46 0.47 0.48 0.49
    y : 0.4754818 0.4846555 0.4937452 0.5027498 0.5116683

45. Find x for which cosh x = 1.285, by using the inverse interpolation technique
    of successive approximation of Newton's forward difference interpolation formula,
    given the table.

    x : 0.738 0.739 0.740 0.741 0.742
    cosh x : 1.2849085 1.2857159 1.2865247 1.2873348 1.2881461

46. Use the technique of inverse interpolation to find x for which sinh x = 5.5 from
    the following table.

    x : 2.2 2.4 2.6 2.8
    sinh x : 4.457 5.466 6.695 8.192

47. Given the following table of f (x) between x = 1.1 and x = 1.5, find the zero of
f (x).

x : 1.1 1.2 1.3 1.4 1.5


f (x) : 1.769 1.472 1.103 −1.344 −1.875

48. Use the technique of inverse interpolation to find a real root of the equation
x3 − 2x − 4 = 0.

49. Using Hermite’s interpolation formula, estimate the value of log 3.2 from the fol-
lowing table

x : 3.0 3.5 4.0
y = log x : 1.09861 1.25276 1.38629
y′ = 1/x : 0.33333 0.28571 0.25000

50. Find the Hermite polynomial of the third degree approximating the function y(x)
    such that
    y(x₀) = 1, y(x₁) = 0 and y′(x₀) = y′(x₁) = 0.

51. The following values of x and y are calculated from the relation y = x3 + 10

x : 1 2 3 4 5
y : 11 18 37 74 135

Determine the cubic spline p(x) for the interval [2, 3], given that
(a) p′(1) = y′(1) and p′(5) = y′(5), (b) p″(1) = y″(1) and p″(5) = y″(5).

52. Fit a cubic spline to the function defined by the set of points given in the following
table.

x : 0.10 0.15 0.20 0.25 0.30
y = eˣ : 1.1052 1.1618 1.2214 1.2840 1.3499

Use the end conditions

(a) M₀ = M_N = 0,
(b) p′(0.10) = y′(0.10) and p′(0.30) = y′(0.30), and
(c) p″(0.10) = y″(0.10) and p″(0.30) = y″(0.30).

Interpolate in each case for x = 0.12 and state which of the end conditions gives
the best fit.

53. The distance di that a car has travelled at time ti is given below.

time ti : 0 2 4 6 8
distance di : 0 40 160 300 480

Use the values p′(0) and p′(8) = 98, and find the clamped spline for the points.

54. Fit a cubic spline to the points (0,1), (1,0), (2,0), (3,1), (4,2), (5,2) and (6,1) with
    p′(0) = −0.6, p′(6) = −1.8, p″(0) = 1 and p″(6) = −1.

55. A function f(x) is defined as follows:

    f(x) = 1 + x, 0 ≤ x ≤ 3,
    f(x) = 1 + x + (x − 3)³, 3 ≤ x ≤ 4.

    Show that f(x) is a cubic spline in [0, 4].

56. Tabulate the values of the function

f (x, y) = ex sin y + y + 1

for
x = 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5
and y = 0.1, 0.2, 0.3, 0.4, 0.5, 0.6.
Hence find the value of f (1.6, 0.33) by two-dimensional interpolation.

57. Using the following data obtain the Lagrange and Newton’s bivariate interpolating
polynomials

x : 0 0 0 1 1 1 2 2 2
y : 0 1 2 0 1 2 0 1 2
f (x, y) : 1 3 7 3 6 11 7 11 17
Chapter 4

Solution of Algebraic and Transcendental Equations

Determination of roots of algebraic and transcendental equations is a very important


problem in science and engineering.
A function f(x) is called algebraic if, to get the values of the function starting from
the given value of x, we have to perform arithmetic operations on some real numbers
and rational powers of x. On the other hand, transcendental functions include
all non-algebraic functions, i.e., the exponential functions eˣ, aˣ, the logarithmic function
log x, the trigonometric functions sin x, cos x, tan x, cot x, etc., the inverse trigonometric
functions sin⁻¹x, cos⁻¹x, etc., and others.
An equation f (x) = 0 is called algebraic or transcendental according as f (x) is
algebraic or transcendental.
The equations x³ + 7x + 3 = 0, x⁵ − 7x² + 3 = 0 are algebraic equations, whereas
eˣ + sin x = 0 and 5 log x + 3x − 2 = 0 are transcendental equations.
x

The definition of roots of an equation can be given in two different ways:


Algebraically, a number c is called a root of the equation f (x) = 0 iff f (c) = 0 and
geometrically, the real roots of the equation f (x) = 0 are the values of x where the
graph of y = f (x) meets the x-axis.
Development of numerical methods to solve algebraic or transcendental equations is
very much essential because analytic methods fail to solve polynomial equations of degree
greater than four.
Most of the numerical methods used to solve an equation are based on iterative
techniques. Different numerical methods are available to solve the equation f(x) = 0,
but each method has some advantages and disadvantages over the others.
Generally, the following aspects are considered when comparing the methods:


convergence or divergence, rate of convergence, applicability of the method, amount of


pre-calculation needed before application of the method, etc.
The process of finding the approximate values of the roots of an equation can be
divided into two stages: (i) location of the roots, and (ii) computation of the values of
the roots with the specified degree of accuracy.

4.1 Location of Roots

An interval [a, b] is said to be the location of a real root c if f (c) = 0 for a < c < b.
Mainly, two methods are used to locate the real roots of an equation: one is a graphical
method and the other is an analytic method known as the method of tabulation.

4.1.1 Graphical method


First method:
In this method, the graph of y = f(x) is drawn in a rectangular co-ordinate system.
It is obvious that the abscissas of the points where the graph intersects the x-axis are
the roots of the equation f(x) = 0. But, in practice, it is very difficult to determine
the exact value of x where the graph intersects the x-axis. For example, if x = 1.27831
is a root of an equation f(x) = 0, then we cannot read 1.2783 (four digits after the
decimal point) from the graph; the value of x can be measured only to one or two decimal
places. Nevertheless, an approximate value of the root can be determined using this method.
Second method:
Sometimes, the approximate roots of f(x) = 0 can be determined by dividing the
terms of the equation into two groups, one of which is written on the left-hand side of
the equation and the other on the right-hand side, i.e., the equation is represented as
g(x) = h(x). Then the graphs of the two functions y = g(x) and y = h(x) are drawn. The
abscissas of the points of intersection of these graphs are the roots of the equation.

Example 4.1.1 Use graphical method to locate the roots of the equation x3 − 4x −
2 = 0.

Solution. First method:


The graph of the function y = x³ − 4x − 2 is drawn in Figure 4.1(a). The curve cuts
the x-axis at three points and, consequently, the equation has three real roots. From
the figure, it is observed that the roots belong to the intervals [−2, −1], [−1, 0] and [2, 3].
Second method:
The given equation can be written as x³ = 4x + 2. The graphs of the functions y = x³
and y = 4x + 2 are drawn (Figure 4.1(b)). The abscissas of the points of intersection
of these graphs are the roots of the equation. The intervals containing the roots
are [−2, −1], [−1, 0] and [2, 3].
[Figure 4.1: Illustration of location of roots. (a) The graph of y = x³ − 4x − 2. (b) The graphs of y = x³ and y = 4x + 2.]

The graphical method of locating roots is not very useful, because drawing the graph of
the function y = f(x) is itself a complicated problem. But it makes it possible to roughly
determine the intervals containing the roots. An analytic method is then used to locate
each root more precisely.

4.1.2 Method of tabulation


This method depends on the continuity of the function f(x). Before applying the
tabulation method, the following result should be noted.

Theorem 4.1 If f (x) is continuous in the interval (a, b) and if f (a) and f (b) have the
opposite signs, then at least one real root of the equation f (x) = 0 lies within the interval
(a, b).
If f(a) and f(b) have the same sign, then f(x) = 0 has either no real root or an even
number of real roots in (a, b).

If the curve y = f(x) touches the x-axis at some point, say x = c, then c is a root
of f(x) = 0 even though f(a) and f(b), a < c < b, may have the same sign. For example,
f(x) = (x − 2)² touches the x-axis at x = 2; here f(1.5) > 0 and f(2.5) > 0, but x = 2
is a root of the equation f(x) = (x − 2)² = 0.
A trial method for tabulation is as follows:
Form a table of signs of f(x), setting x = 0, ±1, ±2, . . .. If f(x) changes sign between
two consecutive values of x, then at least one root lies between these two values, i.e.,
if f(a) and f(b) have opposite signs, then a root lies between a and b.

Example 4.1.2 Find the location of roots of the equation 8x3 − 20x2 − 2x + 5 = 0
by tabulation method.

Solution. We form a table of signs of f(x), taking x = 0, −1, 1, −2, 2, 3, as follows:

x 0 −1 1 −2 2 3
Sign of f (x) + − − − − +

The equation has three real roots as its degree is 3. The location of the roots of the
given equation are (−1, 0), (0, 1) and (2, 3).
A systematic process for tabulation
The following sequence of steps is performed to locate the roots of an equation
f(x) = 0 by the tabulation method:

1. find the first derivative f′(x),

2. prepare a table of signs of the function f(x) by setting x equal to

   (a) the roots of f′(x) = 0 or values close to them,

   (b) the boundary values (proceeding from the domain of permissible values of
       the variable),

3. determine the intervals at whose endpoints the function assumes values
   of opposite signs. Each such interval contains one and only one root in its
   interior.

Example 4.1.3 Find the number of real roots of the equation 3ˣ − 3x − 2 = 0 and
locate them.

Solution. Let f(x) = 3ˣ − 3x − 2. The domain of definition of the function f(x) is
(−∞, ∞).
Now, f′(x) = 3ˣ log 3 − 3.
The root of f′(x) = 0 is given by 3ˣ log 3 − 3 = 0,
or 3ˣ = 3/log 3, or x = (log 3 − log log 3)/log 3 = 0.914.
A table of signs of f(x) is then formed by setting x equal to
(a) values close to the root of f′(x) = 0, i.e., x = 0, x = 1, and
(b) the boundary values of the domain, i.e., x = ±∞.

x −∞ 0 1 ∞
Sign of f (x) + − − +

The equation 3ˣ − 3x − 2 = 0 has two real roots since the function changes sign twice;
one root is negative and the other is greater than 1.

A new table with smaller intervals containing the roots is constructed below.

x 0 −1 1 2
Sign of f (x) − + − +

The roots of the given equation are in (−1, 0) [as f (0).f (−1) < 0] and (1, 2).

This section was devoted to locating the roots, which is the first stage of the solution of
algebraic and transcendental equations.
The second stage is the computation of the roots with the specified degree of accuracy. In
the following sections some methods are discussed to determine the roots of an algebraic
or a transcendental equation. Before presenting the solution methods, the order of
convergence of a sequence of numbers is defined below.

Order of Convergence
Assume that the sequence {xₙ} of numbers converges to ξ and let εₙ = ξ − xₙ for n ≥ 0.
If two positive constants A ≠ 0 and p > 0 exist such that
lim_{n→∞} |εₙ₊₁|/|εₙ|ᵖ = A,    (4.1)
then the sequence is said to converge to ξ with order of convergence p. The number A
is called the asymptotic error constant.
If p = 1, the order of convergence of {xₙ} is called linear, and if p = 2, the order of
convergence is called quadratic, etc.
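From (4.1), εₙ₊₁ ≈ Aεₙᵖ, so the order p can be estimated numerically from three
successive errors as p ≈ log|εₙ₊₁/εₙ| / log|εₙ/εₙ₋₁|. The following C sketch (an
illustration only, not one of the numbered programs of this book; the sample errors
are hypothetical) applies this estimate.

/* Sketch: estimate the order of convergence p from three successive
   errors e0, e1, e2 using p = log|e2/e1|/log|e1/e0| (from (4.1)).
   The sample errors below are hypothetical. */
#include<stdio.h>
#include<math.h>
int main(void)
{
    double e0=1e-2, e1=5e-5, e2=1.2e-9; /* errors of a quadratic method */
    double p=log(fabs(e2/e1))/log(fabs(e1/e0));
    printf("estimated order p = %5.2f\n", p); /* prints about 2.01 */
    return 0;
}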
In the next section, one of the bracketing methods, called the bisection method, is
introduced.

4.2 Bisection Method

Let ξ be a root of the equation f(x) = 0 lying in the interval [a, b], i.e., f(a)·f(b) < 0,
where (b − a) is not sufficiently small. The interval [a, b] is divided into two equal
subintervals [a, c] and [c, b], each of length (b − a)/2, where c = (a + b)/2 (Figure 4.2).
If f(c) = 0, then c is an exact root.
Now, if f(c) ≠ 0, then the root lies either in the interval [a, c] or in the interval [c, b].
If f(a)·f(c) < 0, then the interval [a, c] is taken as the new interval, otherwise [c, b] is taken
as the next interval. Let the new interval be [a₁, b₁] and use the same process to select
the next interval. In the next step, let the new interval be [a₂, b₂]. The process of
bisection is continued until either the midpoint of the interval is a root, or the length
(bₙ − aₙ) of the interval [aₙ, bₙ] (at the nth step) is sufficiently small. The numbers aₙ and
bₙ are the approximate roots of the equation f(x) = 0. Finally, xₙ = (aₙ + bₙ)/2 is taken
as the approximate value of the root ξ.

a ξs - x
O c b

Figure 4.2: Illustration of bisection method.

It may be noted that when the reduced interval is [a₁, b₁], the length of the interval
is (b − a)/2; when the interval is [a₂, b₂], the length is (b − a)/2². At the nth step
the length of the interval is (b − a)/2ⁿ. In the final step, when ξ = (aₙ + bₙ)/2 is chosen
as the root, the length of the interval is (b − a)/2ⁿ⁺¹ and hence the error does not
exceed (b − a)/2ⁿ⁺¹.
Thus, if ε is the permissible error at the nth step, then the lower bound of n is obtained
from the following relation:
|b − a|/2ⁿ ≤ ε.    (4.2)
The lower bound of n is obtained by rewriting this inequality as
n ≥ [log(|b − a|) − log ε]/log 2.    (4.3)
Hence the minimum number of iterations required to achieve the accuracy ε is
log_e(|b − a|/ε) / log_e 2.    (4.4)
For example, if the length of the interval is |b − a| = 1 and ε = 0.0001, then n is given
by n ≥ 14.
The minimum number of iterations required to achieve the accuracy ε for |b − a| = 1
is shown in Table 4.1.
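Relation (4.4) is easily evaluated in C. The following sketch (an illustration only, not
one of the numbered programs of this book) reproduces the entries of Table 4.1 for
|b − a| = 1.

/* Sketch: minimum number of bisection iterations n such that
   |b-a|/2^n <= eps, i.e., n = ceil(log(|b-a|/eps)/log 2), as in (4.4). */
#include<stdio.h>
#include<math.h>
int main(void)
{
    double length=1.0; /* |b-a| */
    int k;
    for(k=2;k<=7;k++)
    {
        double eps=pow(10.0,-k);
        printf("eps=1e-%d  n=%2.0f\n", k, ceil(log(length/eps)/log(2.0)));
    }
    return 0;
}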
Theorem 4.2 Assume that f(x) is a continuous function on [a, b] and that there exists
a number ξ ∈ [a, b] such that f(ξ) = 0. If f(a) and f(b) have opposite signs, and {xₙ}
represents the sequence of midpoints generated by the bisection method, then
|ξ − xₙ| ≤ (b − a)/2ⁿ⁺¹ for n = 0, 1, 2, . . .    (4.5)

Table 4.1: Number of iterations for given ε.

ε : 10⁻² 10⁻³ 10⁻⁴ 10⁻⁵ 10⁻⁶ 10⁻⁷
n : 7 10 14 17 20 24

and therefore the sequence {xₙ} converges to the root ξ, i.e.,
lim_{n→∞} xₙ = ξ.

Proof. The root ξ and the midpoint xₙ both lie in the interval [aₙ, bₙ], so the distance
between xₙ and ξ cannot be greater than half the width of the interval [aₙ, bₙ]. Thus
|ξ − xₙ| ≤ |bₙ − aₙ|/2 for all n.    (4.6)
From the bisection method, it is observed that the successive interval widths form
the following pattern:
|b₁ − a₁| = |b₀ − a₀|/2¹, where b₀ = b and a₀ = a,
|b₂ − a₂| = |b₁ − a₁|/2 = |b₀ − a₀|/2²,
|b₃ − a₃| = |b₂ − a₂|/2 = |b₀ − a₀|/2³.
In this way, |bₙ − aₙ| = |b₀ − a₀|/2ⁿ.
Hence, using (4.6),
|ξ − xₙ| ≤ |b₀ − a₀|/2ⁿ⁺¹.
Now, taking the limit,
|ξ − xₙ| → 0 as n → ∞, i.e., lim_{n→∞} xₙ = ξ.

Note 4.2.1 If the function f (x) is continuous on [a, b] then the bisection method is
applicable. This is justified in Figure 4.3. For the function f (x) of the graph of Figure
4.3, f (a) · f (b) < 0, but the equation f (x) = 0 has no root between a and b as the
function is not continuous at x = c.
[Figure 4.3: The function has no root between a and b, though f(a) · f(b) < 0.]

Note 4.2.2 This method is very slow, but it is very simple and will surely converge
to the exact root. So the method is applicable to any function, provided only that the
function is continuous within the interval [a, b] where the root lies.
In this method, neither the derivative of the function f(x) nor any pre-manipulation
of the function is required.

Note 4.2.3 This method is also called bracketing method since the method successively
reduces the two endpoints (brackets) of the interval containing the real root.

Example 4.2.1 Find a root of the equation x² + x − 7 = 0 by bisection method,


correct up to two decimal places.

Solution. Let f(x) = x² + x − 7.
f(2) = −1 < 0 and f(3) = 5 > 0. So, a root lies between 2 and 3.

Left end point Right end point Midpoint


n an bn xn+1 f (xn+1 )
0 2 3 2.5 1.750
1 2 2.5 2.250 0.313
2 2 2.250 2.125 -0.359
3 2.125 2.250 2.188 -0.027
4 2.188 2.250 2.219 0.143
5 2.188 2.219 2.204 0.062
6 2.188 2.204 2.196 0.018
7 2.188 2.196 2.192 -0.003
8 2.192 2.196 2.194 0.008
9 2.192 2.194 2.193 0.002
10 2.192 2.193 2.193 0.002

Therefore, the root is 2.19 correct up to two decimal places.



Algorithm 4.1 (Bisection method). This algorithm finds a real root of the equa-
tion f (x) = 0 which lies in [a, b] by bisection method.

Algorithm Bisection
Input function f (x);
// Assume that f (x) is continuous within [a, b] and a root lies on [a, b].//
Read ε; //tolerance for width of the interval//
Read a, b; //input of the interval//
Compute f a = f (a); f b = f (b); //compute the function values//
if sign(f a) = sign(f b) then
//sign(f a) gives the sign of the value of f a.//
Print ‘f (a) · f (b) > 0, so there is no guarantee for a root within [a, b]’;
Stop;
endif;
do
Compute c = (a + b)/2;
Compute f c = f (c);
if f c = 0 or |f c| < ε then
a = c and b = c;
else if sign(f b) = sign(f c) then
b = c; f b = f c;
else
a = c; f a = f c;
endif;
while (|b − a| > ε);
Print ‘the desired root is’ c;
end Bisection
Program 4.1.
/* Program Bisection
Program to find a root of the equation x*x*x-2x-1=0 by
bisection method.
Assume that a root lies between a and b. */
#include<stdio.h>
#include<math.h>
#include<stdlib.h>
#define f(x) x*x*x-2*x-1 /* definition of the function f(x) */
void main()
{
float a,b,fa,fb,c,fc;
float eps=1e-5; /* error tolerance */

printf("\nEnter the value of a and b ");


scanf("%f %f",&a,&b);
fa=f(a); fb=f(b);
if(fa*fb>0)
{
printf("There is no guarantee for a root within [a,b]");
exit(0);
}
do
{
c=(a+b)/2.;
fc=f(c);
if((fc==0) || (fabs(fc)<eps))
{
a=c;b=c;
}
else if(fb*fc>0)
{
b=c; fb=fc;
}
else
{
a=c; fa=fc;
}
}while(fabs(b-a)>eps);
printf("\nThe desired root is %8.5f ",c);
} /* main */
A sample of input/output:
Enter the value of a and b 0 2
The desired root is 1.61803

Another popular method is the method of false position or the regula falsi method.
This is also a bracketing method. This method was developed because the bisection
method converges at a fairly slow speed. In general, the regula falsi method is faster
than the bisection method.

4.3 Regula-Falsi Method (Method of False Position)

The Regula-Falsi method is one of the most widely used methods for solving algebraic
and transcendental equations. This method is also known as the 'method of false position',
the 'method of chords' and the 'method of linear interpolation'.


Let a root of the equation f(x) = 0 lie in the interval [a, b], i.e., f(a) · f(b) < 0.
The idea of this method is that on a sufficiently small interval [a, b] the arc of the curve
y = f(x) is replaced by the chord joining the points (a, f(a)) and (b, f(b)). The abscissa
of the point of intersection of the chord and the x-axis is taken as the approximate value
of the root.
Let x₀ = a and x₁ = b. The equation of the chord joining the points (x₀, f(x₀)) and
(x₁, f(x₁)) is
[y − f(x₀)]/[f(x₀) − f(x₁)] = (x − x₀)/(x₀ − x₁).    (4.7)
To find the point of intersection, set y = 0 in (4.7) and let (x₂, 0) be that point.
Thus,
x₂ = x₀ − f(x₀)(x₁ − x₀)/[f(x₁) − f(x₀)].    (4.8)

This is the second approximation of the root. Now, if f(x₂) and f(x₀) are of opposite
signs, then the root lies between x₀ and x₂, and we replace x₁ by x₂ in (4.8). The
next approximation is obtained as
x₃ = x₀ − f(x₀)(x₂ − x₀)/[f(x₂) − f(x₀)].
If f(x₂) and f(x₁) are of opposite signs, then the root lies between x₁ and x₂ and the
new approximation x₃ is obtained as
x₃ = x₂ − f(x₂)(x₁ − x₂)/[f(x₁) − f(x₂)].

The procedure is repeated till the root is obtained to the desired accuracy.
If the nth approximate root (xₙ) lies between aₙ and bₙ, then the next approximate
root is thus obtained as
xₙ₊₁ = aₙ − f(aₙ)(bₙ − aₙ)/[f(bₙ) − f(aₙ)].    (4.9)

The illustration of the method is shown in Figure 4.4.
This method is also very slow and not suitable for hand calculation. The advantage
of this method is that it is very simple and the sequence {xₙ} is sure to converge.
Another advantage of this method is that it requires neither the evaluation of derivatives
nor any pre-calculation.
[Figure 4.4: Illustration of Regula-falsi method.]

To estimate the error of approximation, the following formula may be used:
|ξ − xₙ| < |xₙ − xₙ₋₁|,    (4.10)
where ξ is the exact root and xₙ₋₁ and xₙ are its approximations obtained at the (n−1)th
and nth iterations. This relation can be used when
M ≤ 2m, where M = max |f′(x)| and m = min |f′(x)| in [a, b].    (4.11)

Example 4.3.1 Find a root of the equation x³ + 2x − 2 = 0 using Regula-Falsi
method, correct up to three decimal places.

Solution. Let f(x) = x³ + 2x − 2. f(0) = −2 < 0 and f(1) = 1 > 0. Thus, one root
lies between 0 and 1. The calculations are shown in the following table.
left end right end
n point an point bn f (an ) f (bn ) xn+1 f (xn+1 )
0 0.0000 1.0 –2.0000 1.0 0.6700 –0.3600
1 0.6700 1.0 –0.3600 1.0 0.7570 –0.0520
2 0.7570 1.0 –0.0520 1.0 0.7690 –0.0072
3 0.7690 1.0 –0.0072 1.0 0.7707 –0.0010
4 0.7707 1.0 –0.0010 1.0 0.7709 –0.0001

Therefore, a root of the equation is 0.771 correct up to three decimal places.

Algorithm 4.2 (Regula-Falsi). This algorithm finds a root of the equation f (x) =
0 which lies in [x0 , x1 ], by Regula-Falsi method.

Algorithm Regula-Falsi
Input function f (x);

Read x0 , x1 , ε; //interval for the root and error tolerance//


Compute f x0 = f (x0 ); f x1 = f (x1 ); // Compute the function values at x0 and x1 //
do
Compute x2 = (x0 · fx1 − x1 · fx0)/(fx1 − fx0);
Compute fx2 = f(x2);
if |f x2 | ≤ ε then
Print ‘The root is’, x2 ;
Stop;
endif;
if sign(f x2 ) = sign(f x0 ) then
Set x1 = x2 ; f x1 = f x2 ;
else
Set x0 = x2 ; f x0 = f x2 ;
endif;
while (|f x2| > ε);
end Regula-Falsi

Program 4.2.
/* Program Regula-Falsi
Program to find a root of the equation x*x-2x-3=0 by regula
falsi method. Assumed that a root lies between x0 and x1. */
#include<stdio.h>
#include<math.h>
#include<stdlib.h>
#define f(x) x*x-2*x-3 /* definition of the function f(x) */
void main()
{
float x0,x1,x2,fx0,fx1,fx2;
float eps=1e-5; /* error tolerance */
printf("\nEnter the value of x0 and x1 ");
scanf("%f %f",&x0,&x1);
fx0=f(x0); fx1=f(x1);

if(fx0*fx1>0)
{
printf("There is no guarantee for a root within
[%6.3f,%6.3f]",x0,x1);
exit(0);
}

do
{
x2=(x0*fx1-x1*fx0)/(fx1-fx0);
fx2=f(x2);
if(fabs(fx2)<eps)
{
printf("The root is %8.5f ",x2);
exit(0);
}
if(fx2*fx0<0)
{
x1=x2; fx1=fx2;
}
else
{
x0=x2; fx0=fx2;
}
}while(fabs(fx2)>eps);
} /* main */
A sample of input/output:
Enter the value of x0 and x1 0 3
The root is 3.00000

4.4 Iteration Method or Fixed Point Iteration

The iteration method, or the method of successive approximations, is one of
the most important methods in numerical mathematics. This method is also known as
fixed-point iteration.
Let f(x) be a function continuous on the interval [a, b] such that the equation f(x) = 0
has at least one root in [a, b]. The equation f(x) = 0 can be written in the form
Let f (x) be a function continuous on the interval [a, b] and the equation f (x) = 0
has at least one root on [a, b]. The equation f (x) = 0 can be written in the form

x = φ(x). (4.12)

Suppose x₀ ∈ [a, b] is an initial guess to the desired root ξ. Then φ(x₀) is evaluated
and this value is denoted by x₁; it is the first approximation of the root ξ. Again, x₁ is
substituted for x on the right side of (4.12) to obtain a new value x₂ = φ(x₁). This
process is continued to generate the sequence of numbers x₀, x₁, x₂, . . . , xₙ, . . ., which
are defined by the following relation:

xn+1 = φ(xn ), n = 0, 1, 2, . . . (4.13)



These successive iterations are repeated till the approximations xₙ converge
to the root with the desired accuracy, i.e., |xₙ₊₁ − xₙ| < ε, where ε is a sufficiently small
positive number. The function φ(x) is called the iteration function.

Note 4.4.1 There is no guarantee that this sequence x0 , x1 , x2 , . . . will converge. The
function f (x) = 0 can be written as x = φ(x) in many different ways. This is very
significant since the form of the function φ(x) is very important both for the convergence
and for its rate.

For example, the equation x³ + x² − 1 = 0 has a root lying between 0 and 1. This
equation can be rewritten in the following ways:

x = (1 − x²)/x²; x = (1 − x²)^{1/3}; x = √[(1 − x²)/x]; x = √(1 − x³); x = 1/√(1 + x), etc.
The following theorem gives the sufficient condition for convergence of the iteration
process.

Theorem 4.3 Let ξ be a root of the equation f (x) = 0 and it can be written as x = φ(x)
and further that
1. the function φ(x) is defined and differentiable on the interval [a, b],

2. φ(x) ∈ [a, b] for all x ∈ [a, b],

3. there is a number l < 1 such that
|φ′(x)| ≤ l < 1 for x ∈ [a, b].    (4.14)

Then the sequence {xn } given by (4.13) converges to the desired root ξ irrespective of
the choice of the initial approximation x0 ∈ [a, b] and the root ξ is unique.

Proof. Since ξ is a root of the equation x = φ(x), therefore

ξ = φ(ξ). (4.15)

Also from (4.13),

xi+1 = φ(xi ). (4.16)

Subtracting (4.16) from (4.15),
ξ − xᵢ₊₁ = φ(ξ) − φ(xᵢ) = (ξ − xᵢ)φ′(ξᵢ) (by the mean value theorem),
where ξᵢ lies between ξ and xᵢ.

For i = 0, 1, 2, . . . , n,
ξ − x₁ = (ξ − x₀)φ′(ξ₀), ξ₀ lies between ξ and x₀,
ξ − x₂ = (ξ − x₁)φ′(ξ₁), ξ₁ lies between ξ and x₁,
. . .
ξ − xₙ₊₁ = (ξ − xₙ)φ′(ξₙ), ξₙ lies between ξ and xₙ.
Multiplying all these equations, we obtain the following result:
(ξ − xₙ₊₁) = (ξ − x₀)φ′(ξ₀)φ′(ξ₁) · · · φ′(ξₙ),
or |ξ − xₙ₊₁| = |ξ − x₀||φ′(ξ₀)||φ′(ξ₁)| · · · |φ′(ξₙ)| ≤ lⁿ⁺¹|ξ − x₀|,    (4.17)
where |φ′(x)| ≤ l for all x ∈ [a, b].

Now, if l < 1 then the right-hand side of (4.17) tends to zero as n → ∞.
Therefore, lim_{n→∞} xₙ₊₁ = ξ.
Hence the sequence {xₙ} converges to ξ if |φ′(x)| < 1 for all x ∈ [a, b].
Now to prove uniqueness, let ξ₁ and ξ₂ be two roots of x = φ(x), i.e., ξ₁ = φ(ξ₁) and
ξ₂ = φ(ξ₂). Then
|ξ₂ − ξ₁| = |φ(ξ₂) − φ(ξ₁)| = |φ′(c)||ξ₁ − ξ₂|,    (4.18)
where c lies between ξ₁ and ξ₂.
Equation (4.18) reduces to
|ξ₁ − ξ₂|(1 − |φ′(c)|) = 0,
and by condition (iii), |φ′(c)| < 1; hence ξ₁ = ξ₂, i.e., the two roots are not distinct,
they are equal.

4.4.1 Estimation of error


Let ξ be an exact root of the equation x = φ(x) and xn+1 = φ(xn ).
Therefore,

|ξ − xn | = |φ(ξ) − φ(xn−1 )| = |ξ − xn−1 | |φ (c)|, c lies between xn−1 and ξ


≤ l|ξ − xn−1 |
= l|ξ − xn + xn − xn−1 | ≤ l|ξ − xn | + l|xn − xn−1 |.

After rearrangement, this relation becomes


l ln
|ξ − xn | ≤ |xn − xn−1 | ≤ |x1 − x0 |. (4.19)
1−l 1−l

Let the maximum number of iterations needed to achieve the accuracy ε be N(ε).
Thus from (4.19),
[l^N/(1 − l)]|x₁ − x₀| ≤ ε.
This gives
N(ε) ≥ log[ε(1 − l)/|x₁ − x₀|] / log l.    (4.20)
For l ≤ 1/2 the estimation of the error is given by the following simple form:
|ξ − xₙ| ≤ |xₙ − xₙ₋₁|.    (4.21)

Order of convergence
The convergence of an iteration method depends on the suitable choice of the iteration
function φ(x) and of the initial guess x₀.
Let xₙ converge to the exact root ξ, so that ξ = φ(ξ).
Thus xₙ₊₁ − ξ = φ(xₙ) − φ(ξ).
Let εₙ = xₙ − ξ and note that, in general, φ′(ξ) ≠ 0. Then the above relation becomes
εₙ₊₁ = φ(εₙ + ξ) − φ(ξ)
= εₙφ′(ξ) + (εₙ²/2)φ″(ξ) + · · ·
= εₙφ′(ξ) + O(εₙ²),
i.e., εₙ₊₁ ≈ εₙφ′(ξ).
Hence the order of convergence of the iteration method is linear.

Geometric interpretation
Geometrically, the point of intersection of the line y = x and the curve y = φ(x) is a
root of the equation f(x) = 0. Depending on the value of φ′(ξ), the convergence and
divergence cases are illustrated in Figures 4.5-4.6.

Merit and Demerit
The disadvantage of the iteration method is that a pre-calculation is required to rewrite
f(x) = 0 as x = φ(x) in such a way that |φ′(x)| < 1. But the main advantage of this
method is that the operations carried out at each stage are of the same kind, which
makes it easier to develop a computer program.
This method is sometimes called linear iteration due to its linear order of
convergence.
[Figure 4.5: Convergent for |φ′(ξ)| < 1: (a) staircase solution, 0 < φ′(ξ) < 1; (b) spiral solution, −1 < φ′(ξ) < 0.]
[Figure 4.6: Divergent for |φ′(ξ)| > 1: (a) divergent for φ′(ξ) > 1; (b) divergent for φ′(ξ) < −1.]

Example 4.4.1 Consider the equation 5x³ − 20x + 3 = 0. Find the root lying in
the interval [0, 1] with an accuracy of 10⁻⁴.

Solution. The given equation is written as x = (5x³ + 3)/20 = φ(x) (say).
Now, |φ′(x)| = 15x²/20 = 3x²/4 < 1 on [0, 1]. Let x₀ = 0.5. The calculations are shown in
the following table.
n xₙ φ(xₙ) = xₙ₊₁
0 0.5 0.18125
1 0.18125 0.15149
2 0.15149 0.15087
3 0.15087 0.15086
4 0.15086 0.15086

At this stage the iteration process is terminated and ξ = 0.1509 is taken as the
required root.

Example 4.4.2 Find a root of the equation

cos x − xeˣ = 0

correct up to three decimal places.

Solution. It is easy to see that one root of the given equation lies between 0 and 1.
The equation can be written as x = e⁻ˣ cos x = φ(x) (say). Let x₀ = 0.5.
The calculations are shown in the following table.
n 0 1 2 3 4 5 6
xn 0.50000 0.53228 0.50602 0.52734 0.51000 0.52408 0.51263
xn+1 0.53228 0.50602 0.52734 0.51000 0.52408 0.51263 0.52193

n 7 8 9 10 11 12 13
xn 0.52193 0.51437 0.52051 0.51552 0.51958 0.51628 0.51896
xn+1 0.51437 0.52051 0.51552 0.51958 0.51628 0.51896 0.51678

n 14 15 16 17 18 19 20
xn 0.51678 0.51855 0.51711 0.51828 0.51733 0.51810 0.51748
xn+1 0.51855 0.51711 0.51828 0.51733 0.51810 0.51748 0.51798

Therefore, the required root is 0.518 correct up to three decimal places.

Algorithm 4.3 (Fixed point iteration). This algorithm computes a root of the
equation f(x) = 0 by rewriting the equation as x = φ(x), provided |φ′(x)| < 1 in the
interval [a, b], by the fixed point iteration method. x₀ ∈ [a, b] is the initial guess and ε
is the error tolerance.

Algorithm Iteration
Input function φ(x);
Read x0 , ε; //initial guess and error tolerance.//
Set x1 = x0 ;
do
Set x0 = x1 ;
Compute x1 = φ(x0 );
while (|x1 − x0 | > ε);
Print ‘The root is’, x1 ;
end Iteration

Program 4.3.
/* Program Fixed-Point Iteration
Program to find a root of the equation x*x*x-3x+1=0
by fixed point iteration method. phi(x) is obtained
by rewrite f(x)=0 as x=phi(x), which is to be supplied.*/
#include<stdio.h>
#include<math.h>
#include<stdlib.h>
#define phi(x) (3*x-1)/(x*x)
/* definition of the function phi(x) and it to be
changed accordingly */
void main()
{
int k=0; /* counts number of iterations */
float x1,x0; /* initial guess */
float eps=1e-5; /* error tolerance */
printf("\nEnter the initial guess x0 ");
scanf("%f",&x0);
x1=x0;
do
{
k++;
x0=x1;
x1=phi(x0);
}while(fabs(x1-x0)>eps);
printf("One root is %8.5f obtained at %d th iteration ",x1,k);
} /* main */

A sample of input/output:

Enter the initial guess x0 1


One root is 1.53209 obtained at 37th iteration

4.5 Acceleration of Convergence: Aitken’s ∆2 -Process

The rate of convergence of iteration method is linear. But, this slow rate can be accel-
erated by using Aitken’s method.
The iteration scheme of this method is obtained from fixed point iteration method as

xₙ₊₁ = φ(xₙ) with |φ′(x)| < 1.



If ξ is the root of the equation f(x) = 0, then
ξ − xₙ₊₁ = φ(ξ) − φ(xₙ) = φ′(ξ₀)(ξ − xₙ),
where ξ₀ lies between ξ and xₙ.
Let xₙ₋₁, xₙ and xₙ₊₁ be three successive approximations to the root ξ.
Then
ξ − xₙ = a(ξ − xₙ₋₁), where a = φ′(ξ₀),
ξ − xₙ₊₁ = a(ξ − xₙ).
Eliminating a from these equations, we find the relation
(ξ − xₙ)/(ξ − xₙ₊₁) = (ξ − xₙ₋₁)/(ξ − xₙ),
which gives
ξ = xₙ₊₁ − (xₙ₊₁ − xₙ)²/(xₙ₊₁ − 2xₙ + xₙ₋₁).    (4.22)
Now, introduce the forward difference operator as
∆xₙ = xₙ₊₁ − xₙ, ∆²xₙ₋₁ = xₙ₊₁ − 2xₙ + xₙ₋₁.
Then (4.22) is simplified as
ξ = xₙ₊₁ − (∆xₙ)²/∆²xₙ₋₁,    (4.23)
which is known as Aitken's ∆²-process.

Example 4.5.1 Find a root of the equation cos x − xeˣ = 0 using Aitken's ∆²-process.

Solution. Let x = e⁻ˣ cos x = φ(x) (say) and x₀ = 0.
x₁ = φ(x₀) = 0.54030, x₂ = φ(x₁) = 0.49959.
∆x₁ = x₂ − x₁ = −0.04071, ∆²x₀ = x₂ − 2x₁ + x₀ = −0.58101.
Then x₃ = x₂ − (∆x₁)²/∆²x₀ = 0.49959 − (−0.04071)²/(−0.58101) = 0.50244.
The results for n = 1, 2, 3, 4 are shown below.
n xn−1 xn xn+1 ∆xn ∆2 xn−1 xn+2
1 0.00000 0.54030 0.49959 –0.04071 –0.58101 0.50244
2 0.54030 0.49959 0.50244 0.00285 0.04356 0.50225
3 0.49958 0.50244 0.50225 –0.00019 –0.00304 0.50226
4 0.50244 0.50225 0.50226 0.00001 0.00021 0.50226

Therefore, a root is 0.5023 correct up to four decimal places.



4.6 Newton-Raphson Method or Method of Tangent

Let x₀ be an approximate root of the equation f(x) = 0. Suppose x₁ = x₀ + h is the
exact root of the equation, where h is the correction of the root (the error). Then f(x₁) = 0.
Using Taylor's series, f(x₁) = f(x₀ + h) is expanded in the following form:
f(x₀) + hf′(x₀) + (h²/2!)f″(x₀) + · · · = 0.
Neglecting the second and higher order terms, the above equation reduces to
f(x₀) + hf′(x₀) = 0, or h = −f(x₀)/f′(x₀).
Hence,
x₁ = x₀ + h = x₀ − f(x₀)/f′(x₀).    (4.24)

To compute the value of h, the second and higher powers of h were neglected, so the
value h = −f(x₀)/f′(x₀) is not exact; it is an approximate value. So x₁, obtained from
(4.24), is not a root of the equation, but it is a better approximation of the root than x₀.
In general,
xₙ₊₁ = xₙ − f(xₙ)/f′(xₙ).    (4.25)

This expression generates a sequence of approximate values x1 , x2 , . . . , xn , . . . each


successive term of which is closer to the exact value of the root ξ than its predecessor.
The method will terminate when |xn+1 − xn | becomes very small.
In Newton-Raphson method the arc of the curve y = f (x) is replaced by a tangent
to the curve, hence, this method is sometimes called the method of tangents.

Note 4.6.1 The Newton-Raphson method may also be used to find a complex root of
an equation when the initial guess is taken as a complex number.

Geometrical interpretation
The geometrical interpretation of Newton-Raphson method is shown in Figure 4.7.
In this method, a tangent is drawn at (x0 , f (x0 )) to the curve y = f (x). The tangent
cuts the x-axis at (x1 , 0). Again, a tangent is drawn at (x1 , f (x1 )) and this tangent cuts
x-axis at (x2 , 0). This process is continued until xn = ξ as n → ∞.
[Figure 4.7: Geometrical interpretation of Newton-Raphson method.]

The choice of initial guess of Newton-Raphson method is very important. If the initial
guess is near the root then the method converges very fast. If it is not so near the root
or if the starting point is wrong, then the method may lead to an endless cycle. This
is illustrated in Figure 4.8. In this figure, the initial guess x0 gives the fast convergence
to the root, the initial guess y0 leads to an endless cycle and the initial guess z0 gives a
divergent solution as f′(z₀) is very small.
[Figure 4.8: Illustration of the choice of initial guess in Newton-Raphson method.]

Even if the initial guess is not close to the exact root, the method may diverge. To
choose the initial guess the following rule may be followed:
The endpoint of the interval [a, b] at which the sign of the function coincides with the
sign of the second derivative must be taken as the initial guess. When f(b) · f″(b) > 0,
the initial guess is x₀ = b, and when f(a) · f″(a) > 0, then x₀ = a is the initial guess.
Three different cases - divergence, cycling and oscillation of the Newton-Raphson method
- are discussed below by examples.
Sometimes, if the initial guess x₀ is far away from the exact root, the sequence
{xₙ} may converge to some other root. This situation happens when the slope f′(x₀)
is small and the tangent to the curve y = f(x) is nearly horizontal. For example, if
f(x) = cos x and we try to find the root ξ = π/2 starting with x₀ = 3, then x₁ =
−4.01525, x₂ = −4.85266, . . ., and the sequence {xₙ} will converge to a different root,
−4.71239 ≈ −3π/2.
We consider another example which produces a divergent sequence. Let
f(x) = xe⁻ˣ and x₀ = 2.
Then x₁ = 4.0, x₂ = 5.33333, . . . , x₁₅ = 19.72255, . . ., and clearly {xₙ} diverges slowly
to ∞ (Figure 4.9).
[Figure 4.9: Newton-Raphson method produces a divergent sequence for f(x) = xe⁻ˣ.]

Now consider a function f (x) = x3 − x − 3 which will produce a cyclic sequence when
initial guess is x0 = 0. The sequence is
x1 = −3.0, x2 = −1.961538, x3 = −1.147176, x4 = −0.006579,
x5 = −3.000389, x6 = −1.961818, x7 = −1.147430, . . .
and it may be noted that xₖ₊₄ ≈ xₖ, k = 0, 1, 2, . . . (Figure 4.10).
But, the initial guess x0 = 2 gives the convergent sequence x1 = 1.72727, x2 =
1.67369, x3 = 1.67170, x4 = 1.67170.
The function f(x) = tan⁻¹x with x₀ = 1.45 gives a divergent oscillating sequence:
x₁ = −1.55026, x₂ = 1.84593, x₃ = −2.88911, . . .
(Figure 4.11). But if x₀ = 0.5, then x₁ = −0.07956, x₂ = 0.00034, x₃ = 0.00000.


[Figure 4.10: Newton-Raphson method produces a cyclic sequence for f(x) = x³ − x − 3 when x₀ = 0.]

4.6.1 Convergence of Newton-Raphson method

The Newton-Raphson iteration formula (4.25) is
xₙ₊₁ = xₙ − f(xₙ)/f′(xₙ).
Comparing this expression with the fixed point iteration formula xₙ₊₁ = φ(xₙ), we
obtain
φ(xₙ) = xₙ − f(xₙ)/f′(xₙ).
This can be written as
φ(x) = x − f(x)/f′(x).
It has already been proved that the iteration method converges if |φ′(x)| < 1. Therefore,
the Newton-Raphson method converges if
|d/dx [x − f(x)/f′(x)]| < 1, i.e., |f(x) · f″(x)| < |f′(x)|²    (4.26)
within the interval under consideration. The Newton-Raphson method converges if the
initial guess x₀ is chosen sufficiently close to the root and the functions f(x), f′(x) and
f″(x) are continuous and bounded in any small interval containing the root. The rate of
convergence of the Newton-Raphson method is stated in the following theorem.
[Figure 4.11: Newton-Raphson method produces a divergent oscillating sequence for f(x) = tan⁻¹x when x₀ = 1.45.]

Theorem 4.4 The rate of convergence of Newton-Raphson method is quadratic.

Proof. Let ξ be a root of the equation f(x) = 0. Then f(ξ) = 0. The iteration scheme
for Newton-Raphson method is
xₙ₊₁ = xₙ − f(xₙ)/f′(xₙ).
Let xₙ = εₙ + ξ. Therefore, the above relation becomes
εₙ₊₁ + ξ = εₙ + ξ − f(εₙ + ξ)/f′(εₙ + ξ),
i.e., εₙ₊₁ = εₙ − [f(ξ) + εₙf′(ξ) + (εₙ²/2)f″(ξ) + · · ·]/[f′(ξ) + εₙf″(ξ) + · · ·]  [by Taylor's series]
= εₙ − [εₙ + (εₙ²/2)·f″(ξ)/f′(ξ) + · · ·][1 + εₙ·f″(ξ)/f′(ξ) + · · ·]⁻¹  [as f(ξ) = 0]
= εₙ − [εₙ + (εₙ²/2)·f″(ξ)/f′(ξ) + · · ·][1 − εₙ·f″(ξ)/f′(ξ) + · · ·]
= −(εₙ²/2)·f″(ξ)/f′(ξ) + εₙ²·f″(ξ)/f′(ξ) + O(εₙ³)
= (εₙ²/2)·f″(ξ)/f′(ξ) + O(εₙ³).
Neglecting the terms of order εₙ³ and higher powers, the above expression becomes
εₙ₊₁ = Aεₙ², where A = f″(ξ)/(2f′(ξ)).    (4.27)
This relation shows that the Newton-Raphson method has quadratic convergence or
second order convergence.

Example 4.6.1 Use Newton-Raphson method to find a root of the equation
x³ + x − 1 = 0.

Solution. Let f(x) = x³ + x − 1. Then f(0) = −1 < 0 and f(1) = 1 > 0. So one
root lies between 0 and 1. Let x₀ = 0 be the initial root.
The iteration scheme is
xₙ₊₁ = xₙ − f(xₙ)/f′(xₙ) = xₙ − (xₙ³ + xₙ − 1)/(3xₙ² + 1) = (2xₙ³ + 1)/(3xₙ² + 1).
The sequence {xn } for different values of n is shown below.

n xn xn+1
0 0 1
1 1 0.7500
2 0.7500 0.6861
3 0.6861 0.6823
4 0.6823 0.6823
Therefore, a root of the equation is 0.682 correct up to three decimal places.

Example 4.6.2 Find an iteration scheme to find the kth root of a number a.

Solution. Let x be the kth root of a. That is, x = a^{1/k}, or xᵏ − a = 0.
Let f(x) = xᵏ − a. The iteration scheme is
xₙ₊₁ = xₙ − f(xₙ)/f′(xₙ) = xₙ − (xₙᵏ − a)/(k·xₙᵏ⁻¹) = (k·xₙᵏ − xₙᵏ + a)/(k·xₙᵏ⁻¹)
= (1/k)[(k − 1)xₙ + a/xₙᵏ⁻¹].

Example 4.6.3 Write down an iteration scheme for finding the square root of a positive
number N. Hence find the square root of the number 2.

Solution. Let the square root of N be x. That is, x = √N, or x² − N = 0.
Thus the root of this equation is the required value of √N. Let f(x) = x² − N. By
Newton-Raphson method,
xₙ₊₁ = xₙ − f(xₙ)/f′(xₙ) = xₙ − (xₙ² − N)/(2xₙ) = (1/2)(xₙ + N/xₙ).
This is the required iteration scheme.
Second Part. Let x = √2, or x² − 2 = 0. Also, let f(x) = x² − 2. Then f′(x) = 2x.
The Newton-Raphson iteration scheme is
xₙ₊₁ = xₙ − (xₙ² − 2)/(2xₙ) = (xₙ² + 2)/(2xₙ).
Let x0 = 1. The successive calculations are shown in the following.
n xn xn+1
0 1 1.5000
1 1.50000 1.41667
2 1.41667 1.41422
3 1.41422 1.41421

Therefore, the value of √2 is 1.4142, correct up to five significant figures.

Example 4.6.4 Find a root of the equation x log10 x = 4.77 by Newton-Raphson


method correct up to five decimal places.

Solution. Let f(x) = x log₁₀ x − 4.77. Here f(6) = −0.10109 < 0 and f(7) =
1.14569 > 0.
Therefore, one root lies between 6 and 7. Let the initial guess be x₀ = 6.
The iteration scheme is
xₙ₊₁ = xₙ − (xₙ log₁₀ xₙ − 4.77)/(log₁₀ xₙ + log₁₀ e) = (0.43429xₙ + 4.77)/log₁₀(2.71828xₙ).

The values of x0 , x1 , x2 are shown in the following table.


n xn xn+1
0 6.000000 6.083358
1 6.083358 6.083152
2 6.083152 6.083153

Therefore, one root is 6.08315 correct up to five decimal places.



Example 4.6.5 The expression xₙ₊₁ = (3xₙ² + 2)/8 is an iteration scheme to
find a root of the equation f(x) = 0. Find the function f(x).

Solution. Let α be the root obtained by performing the iteration scheme
xₙ₊₁ = (3xₙ² + 2)/8.
Therefore, lim_{n→∞} xₙ = α.
Thus, lim_{n→∞} xₙ₊₁ = (1/8)[3 lim_{n→∞} xₙ² + 2].
This gives α = (1/8)(3α² + 2), i.e., 3α² + 2 = 8α, or 3α² − 8α + 2 = 0.
Thus the required equation is 3x² − 8x + 2 = 0 and hence f(x) = 3x² − 8x + 2.

Example 4.6.6 Discuss the Newton-Raphson method for finding the root of the
equation x¹⁰ − 1 = 0 starting with x₀ = 0.5.

Solution. The real roots of this equation are ±1.
Here f(x) = x¹⁰ − 1. Therefore,
xₙ₊₁ = xₙ − (xₙ¹⁰ − 1)/(10xₙ⁹) = (9xₙ¹⁰ + 1)/(10xₙ⁹).
When x₀ = 0.5, then x₁ = [9 × (0.5)¹⁰ + 1]/[10 × (0.5)⁹] = 51.65, which is far away from
the root 1. This is because 0.5 was not close enough to the root x = 1.
But the sequence {xₙ} will converge to the root 1, although very slowly.
The initial root x₀ = 0.9 gives the first approximate root x₁ = 1.068, which is close
to the root 1.
This example points out the role of the initial approximation in the Newton-Raphson method.

Example 4.6.7 Find a complex root of the equation z³ + 2z² + 2z + 1 = 0 starting
with the initial guess −0.5 + 0.5i.

Solution. Let z₀ = −0.5 + 0.5i = (−0.5, 0.5) be the initial guess and f(z) =
z³ + 2z² + 2z + 1. Then f′(z) = 3z² + 4z + 2. The iteration scheme is
zₙ₊₁ = zₙ − f(zₙ)/f′(zₙ).

All the calculations are shown below.



n zₙ f(zₙ) f′(zₙ) zₙ₊₁


0 (-0.50000, 0.50000) ( 0.25000, 0.25000) (0.00000, 0.50000) (-1.00000, 1.00000)
1 (-1.00000, 1.00000) ( 1.00000,0.00000) (-2.00000,-2.00000) (-0.75000, 0.75000)
2 (-0.75000, 0.75000) ( 0.34375, 0.09375) (-1.00000,-0.37500) (-0.41781, 0.71918)
3 (-0.41781, 0.71918) ( 0.05444, 0.24110) (-0.69919, 1.07384) (-0.55230, 0.85744)
4 (-0.55230, 0.85744) ( 0.08475,-0.02512) (-1.49971, 0.58836) (-0.49763, 0.86214)
5 (-0.49763, 0.86214) (-0.00014, 0.00785) (-1.47746, 0.87439) (-0.50003, 0.86603)
6 (-0.50003, 0.86603) ( 0.00004,-0.00003) (-1.50005, 0.86587) (-0.50000, 0.86603)

Thus one complex root is (−0.5000, 0.8660) i.e., −0.5 + 0.866 i.
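The computation of Example 4.6.7 can be reproduced with C99 complex arithmetic.
The following sketch (an illustration only, not one of the numbered programs of this
book) iterates the same scheme using the <complex.h> header.

/* Sketch: Newton-Raphson for a complex root of z^3+2z^2+2z+1=0
   using C99 complex arithmetic (compile with a C99 compiler). */
#include<stdio.h>
#include<math.h>
#include<complex.h>
int main(void)
{
    double complex z=-0.5+0.5*I; /* initial guess z0 */
    int k;
    for(k=1;k<=20;k++)
    {
        double complex f=z*z*z+2*z*z+2*z+1; /* f(z)  */
        double complex fd=3*z*z+4*z+2;      /* f'(z) */
        double complex z1=z-f/fd;           /* scheme (4.25) */
        if(cabs(z1-z)<1e-6){ z=z1; break; }
        z=z1;
    }
    printf("root = %8.5f + %8.5f i\n", creal(z), cimag(z));
    return 0;
}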

4.7 Newton-Raphson Method for Multiple Root

The Newton-Raphson method can be used to find a multiple root of an equation.
But its generalized form
xₙ₊₁ = xₙ − p·f(xₙ)/f′(xₙ)    (4.28)
gives a faster convergent sequence. The term (1/p)f′(xₙ) is the slope of the straight line
passing through (xₙ, f(xₙ)) and intersecting the x-axis at the point (xₙ₊₁, 0). The
formula (4.28) reduces to the Newton-Raphson formula when p = 1.
If ξ is a root of f(x) = 0 with multiplicity p, then ξ is also a root of f′(x) = 0
with multiplicity (p − 1), of f″(x) = 0 with multiplicity (p − 2), and so on. Hence the
expressions
x₀ − p·f(x₀)/f′(x₀), x₀ − (p − 1)·f′(x₀)/f″(x₀), x₀ − (p − 2)·f″(x₀)/f‴(x₀), . . .
should have the same value if there is a root with multiplicity p, when the initial guess
is very close to the exact root ξ.

Theorem 4.5 The rate of convergence of the formula (4.28) is quadratic.

Proof. Let ξ be a root of multiplicity p of the equation f(x) = 0. Then
f(ξ) = f′(ξ) = f″(ξ) = · · · = f⁽ᵖ⁻¹⁾(ξ) = 0 and f⁽ᵖ⁾(ξ) ≠ 0. Let εₙ = xₙ − ξ. Then from
(4.28),
εₙ₊₁ = εₙ − p·f(εₙ + ξ)/f′(εₙ + ξ).
By Taylor's series, since the first p − 1 derivatives vanish at ξ,
f(εₙ + ξ) = (εₙᵖ/p!)·f⁽ᵖ⁾(ξ) + [εₙᵖ⁺¹/(p + 1)!]·f⁽ᵖ⁺¹⁾(ξ) + · · ·,
f′(εₙ + ξ) = [εₙᵖ⁻¹/(p − 1)!]·f⁽ᵖ⁾(ξ) + (εₙᵖ/p!)·f⁽ᵖ⁺¹⁾(ξ) + · · ·.
Therefore,
εₙ₊₁ = εₙ − p·(εₙ/p)[1 + εₙ·f⁽ᵖ⁺¹⁾(ξ)/((p + 1)·f⁽ᵖ⁾(ξ)) + · · ·][1 + εₙ·f⁽ᵖ⁺¹⁾(ξ)/(p·f⁽ᵖ⁾(ξ)) + · · ·]⁻¹
= εₙ − εₙ[1 + εₙ·f⁽ᵖ⁺¹⁾(ξ)/((p + 1)·f⁽ᵖ⁾(ξ)) + · · ·][1 − εₙ·f⁽ᵖ⁺¹⁾(ξ)/(p·f⁽ᵖ⁾(ξ)) + · · ·]
= εₙ²[1/p − 1/(p + 1)]·f⁽ᵖ⁺¹⁾(ξ)/f⁽ᵖ⁾(ξ) + O(εₙ³)
= εₙ²·f⁽ᵖ⁺¹⁾(ξ)/[p(p + 1)·f⁽ᵖ⁾(ξ)] + O(εₙ³).
Thus εₙ₊₁ = Aεₙ², where A = f⁽ᵖ⁺¹⁾(ξ)/[p(p + 1)·f⁽ᵖ⁾(ξ)].
This shows that the rate of convergence is quadratic.

Example 4.7.1 Find the double root of the equation x³ − 3x² + 4 = 0.

Solution. Let x₀ = 1.5 and f(x) = x³ − 3x² + 4. Then
f′(x) = 3x² − 6x, f″(x) = 6x − 6.
x₁ = x₀ − 2f(x₀)/f′(x₀) = 1.5 − 2 × 0.625/(−2.25) = 2.05556, and
x₁ = x₀ − f′(x₀)/f″(x₀) = 1.5 − (−2.25)/3 = 2.25000.
The close values of x₁ indicate that there is a double root near 2. Let x₁ = 2.05556.
Then x₂ = x₁ − 2f(x₁)/f′(x₁) = 2.05556 − 2 × 0.00943/0.34262 = 2.00051,
x₂ = x₁ − f′(x₁)/f″(x₁) = 2.05556 − 0.34262/6.33336 = 2.00146.
Thus there is a double root at x = 2.00051, which is sufficiently close to the actual
root 2.
The Newton-Raphson method with the same initial guess x₀ = 1.5 produces the sequence
x₁ = 1.77778, x₂ = 1.89352, x₃ = 1.94776, x₄ = 1.97410, x₅ = 1.98714,
x₆ = 1.99353, x₇ = 1.99689, x₈ = 1.99850, x₉ = 1.99961, x₁₀ = 1.99980.
Thus at the 10th iteration the Newton-Raphson method produces the root 2.000 correct
up to three decimal places, while the formula (4.28) needs only two iterations.
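Scheme (4.28) with p = 2 is a one-line change to the Newton-Raphson loop. The
following C sketch (an illustration only, not one of the numbered programs of this book)
repeats the computation of Example 4.7.1.

/* Sketch: generalized Newton-Raphson (4.28) with p=2 for the double
   root of f(x)=x^3-3x^2+4, starting from x0=1.5. */
#include<stdio.h>
#include<math.h>
double f(double x){ return x*x*x-3*x*x+4; }
double fd(double x){ return 3*x*x-6*x; }
int main(void)
{
    double x=1.5, xnew;
    int p=2, k;                  /* p = multiplicity of the root */
    for(k=1;k<=20;k++)
    {
        xnew=x-p*f(x)/fd(x);     /* scheme (4.28) */
        printf("k=%d  x=%8.5f\n", k, xnew);
        if(fabs(xnew-x)<1e-5) break;
        x=xnew;
    }
    return 0;
}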

4.8 Modification on Newton-Raphson Method

In the Newton-Raphson method, the derivative of the function f (x) is calculated at each
point xn . That is, at each iteration two functions are evaluated at xn , n = 0, 1, 2, . . ..

But some functions take much time to evaluate the derivative. To save this time one
can change the iteration scheme of the Newton-Raphson method to
xₙ₊₁ = xₙ − f(xₙ)/f′(x₀).    (4.29)
That is, the derivative of f(x) is calculated only at the initial guess instead of at several
different points xₙ. This method reduces the time for calculating the derivatives. But
the rate of convergence of this method is linear, which is proved in Theorem 4.6.

Theorem 4.6 The rate of convergence of the formula (4.29) is linear.

Proof. Let ξ be the root of the equation f(x) = 0. Then f(ξ) = 0, and let εₙ = xₙ − ξ.
Therefore, from (4.29),
εₙ₊₁ = εₙ − f(εₙ + ξ)/f′(x₀) = εₙ − [f(ξ) + εₙf′(ξ) + · · ·]/f′(x₀)
= εₙ[1 − f′(ξ)/f′(x₀)] + O(εₙ²).
Neglecting εₙ² and higher powers of εₙ and denoting A = 1 − f′(ξ)/f′(x₀), the above error
term becomes
εₙ₊₁ = Aεₙ.    (4.30)
This proves that the rate of convergence of the formula (4.29) is linear.
[Figure 4.12: Geometrical meaning of the formula (4.29).]



Geometrical interpretation

The gradient of the line drawn at the point xₙ is f′(x₀) for all n.
Thus the line passing through the point (xₙ, f(xₙ)) is parallel to the tangent drawn
at (x₀, f(x₀)); i.e., the tangent at (xₙ, f(xₙ)) in the Newton-Raphson method is replaced
by a line parallel to the tangent drawn at (x₀, f(x₀)) and passing through the point
(xₙ, f(xₙ)). This phenomenon is shown in Figure 4.12.
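In C, the scheme (4.29) differs from the Newton-Raphson program only in that the
slope is computed once, at x₀. A sketch (an illustration only, not one of the numbered
programs of this book) for the equation of Example 4.8.1 below:

/* Sketch: scheme (4.29), x_{n+1} = x_n - f(x_n)/f'(x0), with the
   derivative evaluated only once, at the initial guess x0. */
#include<stdio.h>
#include<math.h>
double f(double x){ return x*x*x-x+1; }
double fd(double x){ return 3*x*x-1; }
int main(void)
{
    double x=-1.5, xnew;
    double slope=fd(x);          /* f'(x0), computed once */
    int k;
    for(k=1;k<=50;k++)
    {
        xnew=x-f(x)/slope;
        if(fabs(xnew-x)<1e-5) break;
        x=xnew;
    }
    printf("root = %8.5f after %d iterations\n", xnew, k);
    return 0;
}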

Example 4.8.1 Find a root of the equation x³ − x + 1 = 0 using formula (4.29)
and the Newton-Raphson method, correct up to four decimal places.

Solution. One root of this equation lies between −2 and −1. Let x₀ = −1.5 and
f(x) = x³ − x + 1. Then f′(x₀) = 5.75.
The iteration scheme of the formula (4.29) is
xₙ₊₁ = xₙ − f(xₙ)/f′(x₀) = xₙ − (xₙ³ − xₙ + 1)/5.75 = −(xₙ³ − 6.75xₙ + 1)/5.75.
All the calculations are shown in the following table.
n xn xn+1
0 –1.50000 –1.34783
1 –1.34783 –1.33032
2 –1.33032 –1.32614
3 –1.32614 –1.32508
4 –1.32508 –1.32481
5 –1.32481 –1.32474
6 –1.32474 –1.32472
7 –1.32472 –1.32472

Therefore, one root of the given equation is −1.3247, correct up to four decimal places
attained at 7th iteration.
Using Newton-Raphson method
The iteration scheme for the Newton-Raphson method is
xₙ₊₁ = xₙ − f(xₙ)/f′(xₙ) = xₙ − (xₙ³ − xₙ + 1)/(3xₙ² − 1) = (2xₙ³ − 1)/(3xₙ² − 1).

Let x0 = −1.5. The successive calculations are shown below.


n xn xn+1
0 –1.50000 –1.34783
1 –1.34783 –1.32520
2 –1.32520 –1.32472
3 –1.32472 –1.32472

Therefore, a root is −1.3247 correct up to four decimal places, attained at 3rd itera-
tion.

This example shows that the Newton-Raphson method is much faster than the method
given by (4.29).

4.9 Modified Newton-Raphson Method¹

[¹ H. H. H. Homeier, A modified Newton method for root finding with cubic convergence, J. Computational and Applied Mathematics, 157 (2003) 227-230.]

In this method, the iteration scheme is taken as
xₙ₊₁ = xₙ − f(xₙ)/f′(xₙ + a(xₙ)f(xₙ)) = φ(xₙ) (say).    (4.31)
That is,
φ(x) = x − f(x)/f′(x + a(x)f(x)),
where a(x) is a smooth function.


Now,
f  (x)
φ (x) = 1 − 
f (x + a(x)f (x))
f (x)f  (x + a(x)f (x))(1 + a (x)f (x) + a(x)f  (x))
+ (4.32)
{f  (x + a(x)f (x))}2

f  (x)
and φ (x) = −
f  (x + a(x)f (x))
f  (x)f  (x + a(x)f (x))(1 + a (x)f (x) + a(x)f  (x))
+ 2
{f  (x + a(x)f (x))}2
f (x){f  (x + a(x)f (x))}2 {1 + a (x)f (x) + a(x)f  (x)}2
− 2
{f  (x + a(x)f (x))}3

1
H.H.H.Homeier, A modified Newton method for root finding with cubic convergence, J. Computa-
tional and Applied Mathematics, 157 (2003) 227-230.
Sol. of Algebraic and Transcendental Equs. 223

f (x)f  (x + a(x)f (x)){1 + a (x)f (x) + a(x)f  (x)}2


+
{f  (x + a(x)f (x))}2
{f (x)} f (x + a(x)f (x))a (x)
2 
+
{f  (x + a(x)f (x))}2
f (x)f  (x + a(x)f (x)){2a (x)f  (x) + a(x)f  (x)}
+ .
{f  (x + a(x)f (x))}2

If ξ be the root of the equation f (x) = 0 then f (ξ) = 0 and hence φ(ξ) = ξ, φ (ξ) = 0
[from (4.32)] and

f  (ξ) 2f  (ξ)f  (ξ){1 + a(ξ)f  (ξ)}


φ (ξ) = − +
f  (ξ) {f  (ξ)}2
f  (ξ)
=  {1 + 2a(ξ)f  (ξ)}. (4.33)
f (ξ)
1 1
Now, if a(ξ) = − then φ (ξ) = 0. This is the easiest way to put a(x) = −  .
2f  (ξ) 2f (x)
Hence the iteration scheme for modified Newton-Raphson method is
f (xn ) 1
xn+1 = xn − , where a(xn ) = −  . (4.34)
f  (xn + a(xn )f (xn )) 2f (xn )
Also we have

φ(ξ) = ξ, φ (ξ) = 0, φ (ξ) = 0. (4.35)

The rate of convergence of this method is evaluated below.

Theorem 4.7 The rate of convergence of the modified Newton-Raphson method is cubic.

Proof. Let ξ be the root of the equation f(x) = 0. Also, let εₙ = xₙ − ξ.
Therefore, xₙ₊₁ = φ(xₙ) = φ(εₙ + ξ),
or εₙ₊₁ + ξ = φ(ξ) + εₙφ′(ξ) + (εₙ²/2!)φ″(ξ) + (εₙ³/3!)φ‴(ξ) + · · ·,
or εₙ₊₁ = (εₙ³/3!)φ‴(ξ) + O(εₙ⁴),
using the facts φ(ξ) = ξ, φ′(ξ) = 0, φ″(ξ) = 0 from (4.35).
Neglecting the terms εₙ⁴ and higher powers of εₙ, the above equation reduces to
εₙ₊₁ = Aεₙ³, where A = φ‴(ξ)/3!.    (4.36)
This shows that the rate of convergence of the modified Newton-Raphson method is cubic.
This method evaluates three quantities at each iteration: f(xₙ), f′(xₙ) (needed for
a(xₙ)) and f′(xₙ + a(xₙ)f(xₙ)).

Example 4.9.1 Find a root of the equation x³ − 3x² + 4 = 0 using the modified
Newton-Raphson method, starting with x₀ = 1.5.

Solution. Let f(x) = x³ − 3x² + 4. Then f′(x) = 3x² − 6x and
a(x) = −1/(2f′(x)) = −1/(6x² − 12x).
Let g(x) = x + a(x)f(x) = x − (x³ − 3x² + 4)/(6x² − 12x) = (5x³ − 9x² − 4)/(6x² − 12x).
Then the iteration scheme is xₙ₊₁ = xₙ − f(xₙ)/f′(g(xₙ)).
The calculations for each value of n are listed below.
n xn f (xn ) g(xn ) f  (g(xn )) xn+1
0 1.50000 0.62500 1.63889 –1.77546 1.85202
1 1.85202 0.06245 1.89000 –0.62370 1.95215
2 1.95215 0.00676 1.96421 –0.21090 1.98420
3 1.98420 0.00074 1.98816 –0.07062 1.99468
4 1.99468 0.00008 1.99601 –0.02389 1.99803
5 1.99803 0.00001 1.99852 –0.00887 1.99916
6 1.99916 0.000002 1.99937 –0.00378 1.99969

Therefore, a root of the given equation is 2.000, correct up to three decimal places,
and this value is attained at the 6th iteration, while the Newton-Raphson method takes
10 iterations (see Example 4.7.1).
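The scheme (4.34) is easily programmed. The following is a minimal C sketch in the
style of Program 4.4 (it is not one of the book's numbered programs), applied to the
equation of Example 4.9.1; the functions f and fd (its derivative) must be changed for
other equations.

/* Sketch of the modified Newton-Raphson iteration (4.34) for
   f(x)=x*x*x-3*x*x+4; g holds x+a(x)f(x) with a(x)=-1/(2f'(x)). */
#include<stdio.h>
#include<math.h>
float f(float x) { return(x*x*x-3*x*x+4); }
float fd(float x) { return(3*x*x-6*x); } /* derivative of f(x) */
void main()
{
    int k=0;           /* counts number of iterations */
    float x0,g,x1=1.5; /* x1 holds the initial guess */
    float eps=1e-5;    /* error tolerance */
    do
    {
        k++;
        x0=x1;
        g=x0-f(x0)/(2*fd(x0)); /* g = x0 + a(x0)f(x0) */
        x1=x0-f(x0)/fd(g);     /* scheme (4.34) */
    }while(fabs(x1-x0)>eps);
    printf("One root is %8.5f obtained at %d th iteration",x1,k);
}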

Algorithm 4.4 (Newton-Raphson method). This algorithm finds a root of the


equation f(x) = 0 by Newton-Raphson method, when f(x), f′(x) and an initial guess
x0 are supplied.

Algorithm Newton-Raphson
// f d(x) is the derivative of f (x) and ε is the error tolerance.//
Input function f (x), f d(x);
Read x0 , ε;
Set x1 = x0 ;
do
Set x0 = x1 ;
Compute x1 = x0 − f (x0 )/f d(x0 );
while (|x1 − x0 | > ε);
Print ‘The root is’, x1 ;
end Newton-Raphson

Program 4.4.
/* Program Newton-Raphson
Program to find a root of the equation x*x*x-3x+1=0 by Newton-
Raphson method. f(x) and its derivative fd(x) are to be supplied. */
#include<stdio.h>
#include<math.h>
#include<stdlib.h>
void main()
{
int k=0; /* counts number of iterations */
float x1,x0; /* x0 is the initial guess */
float eps=1e-5; /* error tolerance */
float f(float x);
float fd(float x);
printf("\nEnter the initial guess x0 ");
scanf("%f",&x0);
x1=x0;
do
{
k++;
x0=x1;
x1=x0-f(x0)/fd(x0);
}while(fabs(x1-x0)>eps);
printf("One root is %8.5f obtained at %d th iteration ",x1,k);
} /* main */
/* definition of the function f(x) */
float f(float x)
{
return(x*x*x-3*x+1);
}
/* definition of the function fd(x) */
float fd(float x)
{
return(3*x*x-3);
}

A sample of input/output:

Enter the initial guess x0 1.1


One root is 1.53209 obtained at 7 th iteration

4.10 Secant Method

The main drawback of the Newton-Raphson method is the need to evaluate the derivative
at several points. In many cases, calculation of derivatives takes much time, and in
some cases a closed form expression for f′(x) is not available.
To remove this drawback, the derivative f′(x) is approximated by the backward
difference

f′(xi) ≈ [f(xi) − f(xi−1)]/(xi − xi−1),
where xi and xi−1 are two approximations to the root that need not satisfy the condition
f(xi) · f(xi−1) < 0.
Then from the Newton-Raphson method

xi+1 = xi − f(xi)/f′(xi) = xi − f(xi)(xi − xi−1)/[f(xi) − f(xi−1)].          (4.37)

This formula is the same as the formula for the regula-falsi method, and it needs two
initial guesses x0 and x1 of the root.

Note 4.10.1 The regula-falsi method needs an interval containing the root, i.e., if
[x0, x1] is the interval then f(x0) · f(x1) < 0. But the secant method needs only two
values x0 and x1 near the exact root, and not necessarily f(x0) · f(x1) < 0.

Geometrical interpretation

A geometrical interpretation of secant method is illustrated in Figure 4.13.


A secant line is drawn connecting the points (xi−1, f(xi−1)) and (xi, f(xi)). The point
where it cuts the x-axis is xi+1. Another secant is drawn connecting (xi, f(xi)) and
(xi+1, f(xi+1)) to obtain xi+2, and so on.

4.10.1 Convergence of secant method

The formula for secant method is

xn+1 = xn − (xn − xn−1)f(xn)/[f(xn) − f(xn−1)].          (4.38)

Let ξ be the exact root of the equation f (x) = 0 and the error at the nth iteration is
εn = xn − ξ. Also f (ξ) = 0.

Figure 4.13: Geometrical interpretation of secant method.

Then (4.38) becomes

εn+1 = εn − (εn − εn−1)f(εn + ξ)/[f(εn + ξ) − f(εn−1 + ξ)]
     = εn − (εn − εn−1)[f(ξ) + εn f′(ξ) + (εn²/2)f″(ξ) + · · ·] / [(εn − εn−1)f′(ξ) + ½(εn² − εn−1²)f″(ξ) + · · ·]
     = εn − [εn + (εn²/2)·f″(ξ)/f′(ξ) + · · ·][1 + ½(εn + εn−1)·f″(ξ)/f′(ξ) + · · ·]⁻¹
     = ½ εn εn−1 f″(ξ)/f′(ξ) + O(εn²εn−1 + εnεn−1²).

Thus εn+1 = c εn εn−1, where c = ½ f″(ξ)/f′(ξ). This is a non-linear difference equation;
to solve it, let εn+1 = Aεnᵖ. Then εn = Aεn−1ᵖ, which gives εn−1 = εn^(1/p) A^(−1/p).
Therefore, Aεnᵖ = c εn εn^(1/p) A^(−1/p), i.e., εnᵖ = c A^(−(1+1/p)) εn^(1+1/p).
Equating the powers of εn on both sides, we obtain the equation for p:

p = 1 + 1/p,  i.e.,  p = (1 ± √5)/2.

The positive sign gives p = 1.618. Hence εn+1 = A εn^1.618.
Thus the rate of convergence of the secant method is 1.618, which is smaller than that
of the Newton-Raphson method, so this method converges at a slower rate. However, the
secant method evaluates only one function per iteration, whereas the Newton-Raphson
method evaluates two functions, f and f′, in each iteration. In this respect, the
secant method is more efficient than the Newton-Raphson method.

Example 4.10.1 Find a root of the equation x3 − 8x − 4 = 0 using secant method.

Solution. Let f(x) = x³ − 8x − 4. One root lies between 3 and 4. Let the initial
approximations be x0 = 3, x1 = 3.5. The formula for x2 is
x2 = [x0 f(x1) − x1 f(x0)]/[f(x1) − f(x0)].
The calculations are shown below:
x0 f (x0 ) x1 f (x1 ) x2 f (x2 )
3.0000 –1.0000 3.5000 10.8750 3.0421 –0.1841
3.5000 10.8750 3.0421 –0.1841 3.0497 –0.0333
3.0421 –0.1841 3.0497 –0.0333 3.0514 0.0005
3.0497 –0.0333 3.0514 0.0005 3.0514 0.0005
Therefore, a root is 3.051 correct up to four significant figures.

Algorithm 4.5 (Secant method). This algorithm finds a root of the equation
f (x) = 0 by secant method when two initial guesses x0 and x1 are supplied.

Algorithm Secant
// The iteration terminates when |f(x2)| < ε, where ε is the error tolerance. If
|f(x1) − f(x0)| is very small (in this case the slope of the secant is very small)
the method fails; δ is a very small positive quantity, e.g., 10⁻⁵.//
Read ε and δ;
Input function f (x);
1. f x0 = f (x0 ); f x1 = f (x1 );
if |f x1 − f x0 | < δ then
Print ‘Slope too small, the method does not give correct root or fail’
Stop;
endif;
x0 · f x1 − x1 · f x0
Compute x2 = ;
f x1 − f x0
Compute f x2 = f (x2 );
if |f x2 | < ε then
Print ‘A root is’, x2 ;
Stop;
endif;
Set f x0 = f x1 ; f x1 = f x2 ;
Set x0 = x1 ; x1 = x2 ;
Go to 1;
end Secant

Program 4.5 .
/* Program Secant
Program to find a root of the equation x*sin(x)-1=0 by
secant method. It is assumed that a root lies between x0 and x1.*/
#include<stdio.h>
#include<math.h>
#include<stdlib.h>
void main()
{
float x0,x1,x2,fx0,fx1,fx2;
float eps=1e-5; /* error tolerance */
float delta=1e-5; /* slope */
float f(float x);
printf("\nEnter the values of x0 and x1 ");
scanf("%f %f",&x0,&x1);
fx0=f(x0);
fx1=f(x1);
if(fabs(fx1-fx0)<delta){
printf("Slope too small, the method does not give correct root or fails");
exit(0);
}
do
{
x2=(x0*fx1-x1*fx0)/(fx1-fx0);
fx2=f(x2);
if(fabs(fx2)<eps){
printf("One root is %8.5f ",x2);
exit(0);
}
fx0=fx1; fx1=fx2;
x0=x1; x1=x2;
}while(fabs(fx2)>eps);
} /* main */
/* definition of function f(x), it may change accordingly */
float f(float x)
{
return(x*sin(x)-1);
}

A sample of input/output:

Enter the values of x0 and x1 0 1


One root is 1.11416

4.11 Chebyshev Method

Let us consider the equation f(x) = 0. The function f(x) is expanded by Taylor’s series
in the neighbourhood of xn as 0 = f(x) = f(xn) + (x − xn)f′(xn) + · · · .
This relation gives x = xn − f(xn)/f′(xn).
This is the (n + 1)th approximation to the root. Therefore,

xn+1 = xn − f(xn)/f′(xn).          (4.39)

Again, expanding f(x) by Taylor’s series and retaining terms up to second order,

0 = f(x) = f(xn) + (x − xn)f′(xn) + ((x − xn)²/2) f″(xn).

Therefore, f(xn+1) = f(xn) + (xn+1 − xn)f′(xn) + ((xn+1 − xn)²/2) f″(xn) = 0.

Substituting the value of xn+1 − xn from (4.39) into the last term, we find

f(xn) + (xn+1 − xn)f′(xn) + ½ ([f(xn)]²/[f′(xn)]²) f″(xn) = 0.

Thus,

xn+1 = xn − f(xn)/f′(xn) − ½ ([f(xn)]²/[f′(xn)]³) f″(xn).          (4.40)
f  (xn ) 2 [f  (xn )]3

This formula is the extended form of Newton-Raphson formula and it is known as


Chebyshev’s formula.
The rate of convergence of this method is cubic.
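As an illustration (not one of the book's numbered programs), the following minimal C
sketch applies the Chebyshev formula (4.40) to the test equation x³ − 3x + 1 = 0 used
in Program 4.4, starting from x0 = 1.4 near the root 1.53209 found earlier; fd and fdd
denote the first and second derivatives.

/* Sketch of Chebyshev's iteration (4.40) for f(x)=x*x*x-3*x+1. */
#include<stdio.h>
#include<math.h>
float f(float x) { return(x*x*x-3*x+1); }
float fd(float x) { return(3*x*x-3); }  /* f'(x)  */
float fdd(float x) { return(6*x); }     /* f''(x) */
void main()
{
    int k=0;
    float x0,x1=1.4; /* initial guess */
    float eps=1e-5;  /* error tolerance */
    do
    {
        k++;
        x0=x1;
        x1=x0-f(x0)/fd(x0)
             -0.5*f(x0)*f(x0)*fdd(x0)/(fd(x0)*fd(x0)*fd(x0)); /* (4.40) */
    }while(fabs(x1-x0)>eps);
    printf("One root is %8.5f obtained at %d th iteration",x1,k);
}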

4.12 Muller Method

The main idea of this method is that the function f(x) is approximated by a quadratic
polynomial passing through three points in the neighbourhood of the root. A root of
this quadratic is taken as an approximation to the root of the equation f(x) = 0.
Let xn−2, xn−1, xn be any three distinct approximations to a root of the equation
f(x) = 0. We denote f(xn−2) = fn−2, f(xn−1) = fn−1 and f(xn) = fn.

Let the quadratic polynomial be

f (x) = ax2 + bx + c. (4.41)

Suppose, (4.41) passes through the points (xn−2 , fn−2 ), (xn−1 , fn−1 ) and (xn , fn ),
then

ax2n−2 + bxn−2 + c = fn−2 (4.42)


ax2n−1 + bxn−1 + c = fn−1 (4.43)
ax2n + bxn + c = fn (4.44)

Eliminating a, b, c from (4.41)-(4.44), we obtain the determinant equation

| f(x)    x²       x      1 |
| fn−2    xn−2²    xn−2   1 |
| fn−1    xn−1²    xn−1   1 | = 0.
| fn      xn²      xn     1 |
By expanding this determinant the function f (x) can be written as

f(x) = [(x − xn−1)(x − xn)]/[(xn−2 − xn−1)(xn−2 − xn)] fn−2
     + [(x − xn−2)(x − xn)]/[(xn−1 − xn−2)(xn−1 − xn)] fn−1
     + [(x − xn−2)(x − xn−1)]/[(xn − xn−2)(xn − xn−1)] fn.          (4.45)
(xn − xn−2 )(xn − xn−1 )
This is a quadratic polynomial passing through the given points.
Let h = x − xn, hn = xn − xn−1 and hn−1 = xn−1 − xn−2. Then the above equation
reduces to

[h(h + hn)]/[hn−1(hn−1 + hn)] fn−2 − [h(h + hn + hn−1)]/[hn hn−1] fn−1
     + [(h + hn)(h + hn + hn−1)]/[hn(hn + hn−1)] fn = 0,          (4.46)

since f (x) = 0.
Now, introducing

λ = h/hn,  λn = hn/hn−1  and  δn = 1 + λn,

the equation (4.46) reduces to the form

λ²(fn−2λn² − fn−1λnδn + fnλn)δn⁻¹
     + λ{fn−2λn² − fn−1δn² + fn(λn + δn)}δn⁻¹ + fn = 0          (4.47)

or,

λ²cn + λgn + δnfn = 0,          (4.48)

where gn = λn²fn−2 − δn²fn−1 + (λn + δn)fn
and   cn = λn(λnfn−2 − δnfn−1 + fn).

The equation (4.48) can be rewritten as

δnfn(1/λ²) + gn(1/λ) + cn = 0.

Solving this equation for 1/λ gives

λ = −2δnfn/[gn ± √(gn² − 4δnfncn)].          (4.49)
The sign in the denominator of (4.49) is taken as + or − according as gn > 0 or
gn < 0, so that the denominator is largest in magnitude.
Thus

λ = (x − xn)/(xn − xn−1),  i.e.,  x = xn + (xn − xn−1)λ.          (4.50)
Now, replacing x on the left hand side by xn+1, we obtain the formula

xn+1 = xn + (xn − xn−1 )λ, (4.51)

which is called the Muller method.
This method is also an iterative method and, unlike the Newton-Raphson method, it is
free from the evaluation of derivatives.

Example 4.12.1 Find a root of the equation x3 − 3x − 5 = 0 using Muller’s method


which lies between 2 and 3.

Solution. Let x0 = 2.0, x1 = 2.5 and x2 = 3.0, f (x) = x3 − 3x − 5.

hn = xn − xn−1,  λn = hn/hn−1,  δn = 1 + λn
gn = λn²fn−2 − δn²fn−1 + (λn + δn)fn
cn = λn(λnfn−2 − δnfn−1 + fn)
λ = −2δnfn/[gn ± √(gn² − 4δnfncn)]
xn+1 = xn + hnλ.

All the calculations are shown in the following.


n xn−2 xn−1 xn gn cn λ xn+1
2 2.00000 2.50000 3.00000 23.50000 3.75000 –1.43497 2.28252
3 2.50000 3.00000 2.28252 3.89279 -1.74262 0.00494 2.27897
4 3.00000 2.28252 2.27897 -0.04477 0.00010 -0.01263 2.27902
5 2.28252 2.27897 2.27902 0.00056 0.00000 -0.00338 2.27902

Therefore, one root is 2.2790 correct up to four decimal places.
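Muller's method is also easy to program. The following minimal C sketch (not one of
the book's numbered programs) follows the notation of (4.46)-(4.51) for the above
equation x³ − 3x − 5 = 0; fabs is used under the square root as a guard, so this
sketch handles only real roots.

/* Sketch of Muller's method for f(x)=x*x*x-3*x-5 with three
   starting points x0, x1, x2 near the root. */
#include<stdio.h>
#include<math.h>
float f(float x) { return(x*x*x-3*x-5); }
void main()
{
    float x0=2.0,x1=2.5,x2=3.0,x3;
    float hn,hn1,ln,dn,gn,cn,disc,lambda;
    float eps=1e-5; /* error tolerance */
    do
    {
        hn=x2-x1; hn1=x1-x0;
        ln=hn/hn1; dn=1+ln; /* lambda_n and delta_n */
        gn=ln*ln*f(x0)-dn*dn*f(x1)+(ln+dn)*f(x2);
        cn=ln*(ln*f(x0)-dn*f(x1)+f(x2));
        disc=sqrt(fabs(gn*gn-4*dn*f(x2)*cn)); /* assumes real roots */
        lambda=-2*dn*f(x2)/((gn>=0)?gn+disc:gn-disc); /* (4.49) */
        x3=x2+hn*lambda; /* (4.51) */
        x0=x1; x1=x2; x2=x3;
    }while(fabs(hn*lambda)>eps);
    printf("One root is %8.5f",x2);
}

With the starting points 2.0, 2.5, 3.0 the first pass reproduces the values of the
table above (gn = 23.5, cn = 3.75, λ = −1.43497) and the iteration converges to 2.27902.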

Summary of the Methods

Method                  Formula                                       Order of          Function
                                                                      convergence       evaluations
                                                                                        per step
1. Bisection            xn+1 = (xn + xn−1)/2                          gain of one            1
                                                                      bit per iteration
2. False position       xn+1 = (xn−1fn − xnfn−1)/(fn − fn−1)               1                 1
3. Iteration            xn+1 = φ(xn)                                       1                 1
4. Newton-Raphson       xn+1 = xn − f(xn)/f′(xn)                           2                 2
5. Secant               xn+1 = (xn−1fn − xnfn−1)/(fn − fn−1)             1.62                1
6. Modified
   Newton-Raphson       xn+1 = xn − fn/f′(xn − ½fn/fn′)                    3                 3
7. Chebyshev            xn+1 = xn − fn/fn′ − ½(fn²/fn′³)fn″                3                 3
8. Muller               xn+1 = xn + (xn − xn−1)λ                         1.84                1

Example 4.12.2 Consider the iteration method xn+1 = φ(xn ), n = 0, 1, 2, . . . for


solving the equation f (x) = 0. If the iteration function is in the form

φ(x) = x − αf (x) − β{f (x)}2 − γ{f (x)}3

where α, β and γ are arbitrary parameters then find the values of α, β, γ such that
the iteration method has (i) third and (ii) fourth order convergence.

Solution. Let ξ be an exact root of the equation f (x) = 0, i.e., f (ξ) = 0.


Now,

φ′(x) = 1 − αf′(x) − 2βf(x)f′(x) − 3γ{f(x)}²f′(x)
φ″(x) = −αf″(x) − 2β{f′(x)}² − 2βf(x)f″(x) − 6γf(x){f′(x)}²
        − 3γ{f(x)}²f″(x)
φ‴(x) = −αf‴(x) − 6βf′(x)f″(x) − 2βf(x)f‴(x) − 6γ{f′(x)}³
        − 18γf(x)f′(x)f″(x) − 3γ{f(x)}²f‴(x).

Substituting f(ξ) = 0 into the above equations,

φ(ξ) = ξ
φ′(ξ) = 1 − αf′(ξ)
φ″(ξ) = −αf″(ξ) − 2β{f′(ξ)}²
φ‴(ξ) = −αf‴(ξ) − 6βf′(ξ)f″(ξ) − 6γ{f′(ξ)}³.
Let εn = xn − ξ.
Then εn+1 + ξ = φ(εn + ξ) = φ(ξ) + εnφ′(ξ) + ½εn²φ″(ξ) + (εn³/6)φ‴(ξ) + (εn⁴/24)φ⁽ⁱᵛ⁾(ξ) + · · ·
(i) For third order convergence, φ′(ξ) and φ″(ξ) should be zero. In this case,

εn+1 = (εn³/6) φ‴(ξ) + · · ·   or,   εn+1 = Aεn³.

Thus, 1 − αf′(ξ) = 0 and −αf″(ξ) − 2β{f′(ξ)}² = 0.
That is, α = 1/f′(ξ),  β = −f″(ξ)/(2{f′(ξ)}³).
Hence, for third order convergence, the values of α and β are given by

α = 1/f′(x),  β = −f″(x)/(2{f′(x)}³).

(ii) For fourth order convergence, φ′(ξ) = φ″(ξ) = φ‴(ξ) = 0.
In this case εn+1 = Aεn⁴.
Then α = 1/f′(x),  β = −f″(x)/(2{f′(x)}³)
and −αf‴(ξ) − 6βf′(ξ)f″(ξ) − 6γ{f′(ξ)}³ = 0.
That is,

6γ{f′(x)}³ = −αf‴(x) − 6βf′(x)f″(x) = −f‴(x)/f′(x) + 3{f″(x)}²/{f′(x)}²

or,  γ = −f‴(x)/(6{f′(x)}⁴) + {f″(x)}²/(2{f′(x)}⁵).

Example 4.12.3 The equation x3 − 5x2 + 4x − 3 = 0 has one root near x = 4,


which is to be computed by the iteration

x0 = 4,
xn+1 = [3 + (k − 4)xn + 5xn² − xn³]/k,  k an integer.
Find the value of k such that the iteration scheme has fast convergence.

Solution. Let lim(n→∞) xn = lim(n→∞) xn+1 = ξ.
Since ξ is a root of the given equation,

ξ 3 − 5ξ 2 + 4ξ − 3 = 0.

Substituting xn = ξ + εn, xn+1 = ξ + εn+1 into the given iteration scheme, we get

k(εn+1 + ξ) = 3 + (k − 4)(εn + ξ) + 5(εn + ξ)² − (εn + ξ)³
or, kεn+1 = (3 − 4ξ + 5ξ² − ξ³) + εn{(k − 4) + 10ξ − 3ξ²} + εn²(5 − 3ξ) − εn³
or, kεn+1 = εn{(k − 4) + 10ξ − 3ξ²} + εn²(5 − 3ξ) − εn³.


If k − 4 + 10ξ − 3ξ² = 0, i.e., k = 3ξ² − 10ξ + 4, then

εn+1 = Aεn², where A = (5 − 3ξ)/k,

i.e., the rate of convergence is 2.
The value of ξ is determined from the equation

ξ³ − 5ξ² + 4ξ − 3 = 0.

The Newton-Raphson method is used to determine the approximate value of ξ as
ξ = 4.22069. Thus the required value of k is k = 3ξ² − 10ξ + 4 = 15.23577.

Example 4.12.4 Consider the following iteration scheme

xn+1 = xn − af(xn)/f′(xn − bf(xn)/f′(xn))

where a, b are arbitrary parameters, for solving the equation f (x) = 0. Determine
a and b such that the iteration method is of order as high as possible for finding a
simple root of f (x) = 0.

Solution. Here the iteration scheme is xn+1 = φ(xn), where

φ(x) = x − af(x)/f′(x − bf(x)/f′(x)) = x − af(x)/f′(g(x))

and
g(x) = x − bf(x)/f′(x),   g′(x) = 1 − [b{f′(x)}² − bf(x)f″(x)]/{f′(x)}².
If ξ is a root of f(x) = 0 then f(ξ) = 0, g(ξ) = ξ, g′(ξ) = 1 − b.

φ′(x) = 1 − [af′(x)f′(g(x)) − af(x)f″(g(x))g′(x)]/{f′(g(x))}²
φ′(ξ) = 1 − af′(ξ)f′(ξ)/{f′(ξ)}² = 1 − a.

Now, we choose a = 1 (so that 1 − a = 0), which gives φ′(ξ) = 0. Then

φ′(x) = 1 − f′(x)/f′(g(x)) + f(x)f″(g(x))g′(x)/{f′(g(x))}²

φ″(x) = −[f″(x)f′(g(x)) − f′(x)f″(g(x))g′(x)]/{f′(g(x))}²
        + f′(x)f″(g(x))g′(x)/{f′(g(x))}²
        + [f(x)f‴(g(x)){g′(x)}² + f(x)f″(g(x))g″(x)]{f′(g(x))}²/{f′(g(x))}⁴
        − f(x)f″(g(x))g′(x) · 2f′(g(x)) · f″(g(x))g′(x)/{f′(g(x))}⁴

φ″(ξ) = −[f″(ξ)f′(ξ) − f′(ξ)f″(ξ)(1 − b)]/{f′(ξ)}² + f′(ξ)f″(ξ)(1 − b)/{f′(ξ)}²
      = (f″(ξ)/f′(ξ)){−1 + 2(1 − b)}.

Now, φ″(ξ) = 0 if −1 + 2(1 − b) = 0, i.e., b = 1/2.

From the relation xn+1 = φ(xn) one can write

εn+1 + ξ = φ(εn + ξ) = φ(ξ) + εnφ′(ξ) + (εn²/2)φ″(ξ) + (εn³/6)φ‴(ξ) + · · ·
or, εn+1 = εnφ′(ξ) + (εn²/2)φ″(ξ) + (εn³/6)φ‴(ξ) + · · ·

If a = 1 and b = 1/2 then φ′(ξ) = 0 and φ″(ξ) = 0. Then

εn+1 ≈ (εn³/6)φ‴(ξ).
Hence the iteration scheme will have a third order convergence when a = 1 and
b = 1/2.

4.13 Roots of Polynomial Equations

The polynomial of degree n is generally denoted by Pn (x) and is defined as

Pn (x) ≡ a0 xn + a1 xn−1 + · · · + an−1 x + an = 0 (4.52)

where a0 , a1 , . . . , an are real coefficients.


A number ξ is a root of the polynomial Pn (x) iff Pn (x) is exactly divisible by x − ξ.
If Pn (x) is exactly divisible by (x − ξ)k (k ≥ 1), but is not divisible by (x − ξ)k+1 ,
then ξ is a k-fold root or a root of multiplicity k of the polynomial Pn(x). The roots
of multiplicity k = 1 are called simple or single roots.
The following theorem assures the existence of the roots of a polynomial equation.

Theorem 4.8 (The fundamental theorem of algebra). Every polynomial equation


with any numerical coefficients whose degree is not lower than unity has at least one
root, real or complex.

From this theorem it can be proved that:


Every polynomial Pn (x) of degree n (n ≥ 1) with any numerical coefficients has exactly
n roots, real or complex.
The roots of the equation (4.52) may be real or complex. If the coefficients of (4.52)
are real and has a complex root α + iβ of multiplicity k then (4.52) has a complex root
α − iβ also of multiplicity k.
The number of positive and negative roots of a polynomial equation can be determined
by using Descartes’ rule of signs:
The number of positive real roots of the algebraic equation Pn (x) = 0 with real coefficients
either is equal to the number of sign changes in the sequence of the coefficients of the
equation Pn (x) = 0 or is less than the number of sign changes by an even integer. The
number of negative roots of the equation is equal to the number of sign changes in the
sequence of coefficients of Pn (−x) or is smaller by an even integer.
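For example, for x³ + x² − x + 2 = 0 the sequence of coefficient signs is +, +, −, +,
which has two changes, so the equation has either two positive real roots or none; here
Pn(−x) = −x³ + x² + x + 2 has one sign change, so the equation has exactly one negative
real root (in fact x = −2, since x³ + x² − x + 2 = (x + 2)(x² − x + 1)).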
The Sturm’s theorem gives the exact number of positive real roots of a polynomial
equation.
Let f (x) be a given polynomial of degree n and let f1 (x) be its first order derivative.
Let f2 (x) be the remainder of f (x) when it is divided by f1 (x) taken with reverse sign.

Similarly, f3 (x) is the remainder of f1 (x) when it is divided by f2 (x) with the reverse sign
and so on. This division process is terminated when the quotient becomes a constant.
Thus we obtain a sequence of function f (x), f1 (x), f2 (x), . . . , fn (x) called the Sturm
functions or the Sturm sequences.
Theorem 4.9 (Sturm). The number of real roots of the equation f(x) = 0 on [a, b] is
equal to the difference between the number of changes of sign in the Sturm sequence at
x = a and at x = b, provided f(a) ≠ 0 and f(b) ≠ 0.
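For example, for f(x) = x³ − 3x + 1 the Sturm sequence is f(x) = x³ − 3x + 1,
f1(x) = 3x² − 3, f2(x) = 2x − 1, f3(x) = 9/4. At x = 0 the signs are +, −, −, +,
giving two changes of sign, while at x = 2 the signs are +, +, +, +, giving none;
hence the equation f(x) = 0 has 2 − 0 = 2 real roots in [0, 2].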

4.13.1 Domains of roots


Let the polynomial of degree n be
a0 xn + a1 xn−1 + · · · + an−1 x + an = 0, (4.53)
where a0 , a1 , . . . , an are real coefficients, and let A = max{|a1 |, |a2 |, . . . , |an |} and B =
max{|a0 |, |a1 |, . . . , |an−1 |}. Then the roots of the equation (4.53) lie in the interval
r < |x| < R, where

r = 1/(1 + B/|an|)   and   R = 1 + A/|a0|.          (4.54)
Here r is the lower bound and R is the upper bound of the positive roots of the
equation (4.53) and −R and −r are the lower and the upper bounds of the negative
roots respectively.
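For example, for the equation x³ − 3x + 1 = 0, A = max{0, 3, 1} = 3 and
B = max{1, 0, 3} = 3, so that r = 1/(1 + 3/1) = 0.25 and R = 1 + 3/1 = 4; all the
roots therefore lie in the region 0.25 < |x| < 4.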

The Lagrange’s or Newton’s method may also be used to find the upper bound of the
positive roots of the equation (4.53).
Theorem 4.10 (Lagrange’s). If the coefficients of the polynomial equation
a0xⁿ + a1xⁿ⁻¹ + · · · + an−1x + an = 0
satisfy the conditions a0 > 0, a1, a2, . . . , am−1 ≥ 0, am < 0, for some m ≤ n, then an
upper bound of the positive roots of the equation is 1 + (B/a0)^(1/m), where B is the
greatest of the absolute values of the negative coefficients of the polynomial.
Theorem 4.11 (Newton’s). If for x = c the polynomial
f(x) ≡ a0xⁿ + a1xⁿ⁻¹ + · · · + an−1x + an
and its derivatives f′(x), f″(x), . . . assume positive values, then c is an upper bound
of the positive roots of the equation f(x) = 0.
The roots of a polynomial equation can be determined in two techniques – iteration
methods and direct methods. In this section, two iteration methods, viz., Birge-Vieta
and Bairstow methods, and one direct method – Graeffe’s root squaring method are
discussed.

Iterative Methods

4.14 Birge-Vieta Method

This method is based on the Newton-Raphson method. Here a real number ξ is deter-
mined such that (x − ξ) is a factor of the polynomial

Pn (x) = xn + a1 xn−1 + a2 xn−2 + · · · + an−1 x + an = 0. (4.55)

Let Qn−1 (x) and R be the quotient and remainder when Pn (x) is divided by the
factor (x − ξ), where Qn−1 (x) is a polynomial of degree (n − 1) of the form

Qn−1 (x) = xn−1 + b1 xn−2 + b2 xn−3 + · · · + bn−2 x + bn−1 . (4.56)

Thus
Pn (x) = (x − ξ)Qn−1 (x) + R. (4.57)

The value of R depends on ξ. Now, the problem is to find the value of ξ starting
from an initial guess x0 such that R(ξ) = 0 or it is equivalent to

R(ξ) = Pn (ξ) = 0. (4.58)

The value of ξ can be determined by Newton-Raphson method or any other method.


The Newton-Raphson methods for (4.58) is

xk+1 = xk − Pn(xk)/Pn′(xk),  k = 0, 1, 2, . . . .          (4.59)

The values of Pn(xk) and Pn′(xk) can be determined by synthetic division. To
determine the values of b1, b2, . . . , bn−1 and R, compare the coefficients of like
powers of x on both sides of (4.57) to obtain the following relations:

a1 = b1 − ξ b1 = a1 + ξ
a2 = b2 − ξb1 b2 = a2 + ξb1
.. ..
. .
ak = bk − ξbk−1 bk = ak + ξbk−1
.. ..
. .
an = R − ξbn−1 R = an + ξbn−1
From (4.57),

Pn (ξ) = R = bn (say). (4.60)



Hence,

bk = ak + ξbk−1 , k = 1, 2, . . . , n, with b0 = 1. (4.61)

Thus bn is the value of Pn(ξ).

To determine the value of Pn′, differentiate (4.57) with respect to x:

Pn′(x) = (x − ξ)Q′n−1(x) + Qn−1(x).

That is,

Pn′(ξ) = Qn−1(ξ) = ξⁿ⁻¹ + b1ξⁿ⁻² + · · · + bn−2ξ + bn−1.          (4.62)

Thus

Pn′(xi) = xiⁿ⁻¹ + b1xiⁿ⁻² + · · · + bn−2xi + bn−1.          (4.63)

Pn′(x) can be evaluated in the same way as Pn(x). Differentiating (4.61) with respect
to ξ, we obtain
dbk/dξ = bk−1 + ξ dbk−1/dξ.
Let
dbk/dξ = ck−1.          (4.64)

Then the above relation becomes

ck−1 = bk−1 + ξck−2

This gives

ck = bk + ξck−1 , k = 1, 2, . . . , n − 1. (4.65)

Differentiating the relation Pn(ξ) = R = bn (equation (4.60)) with respect to ξ gives

Pn′(ξ) = dR/dξ = dbn/dξ = cn−1   [using (4.64)].
Thus the Newton-Raphson method becomes

xk+1 = xk − bn/cn−1,  k = 0, 1, 2, . . . .          (4.66)
cn−1
This method is known as Birge-Vieta method.
The Table 4.2 is useful to determine bk and ck for hand calculations.

Table 4.2: Scheme to calculate b’s and c’s.

x0 |  1    a1    a2      · · ·   an−2      an−1      an
   |       x0    x0b1    · · ·   x0bn−3    x0bn−2    x0bn−1
x0 |  1    b1    b2      · · ·   bn−2      bn−1      bn = R
   |       x0    x0c1    · · ·   x0cn−3    x0cn−2
   |  1    c1    c2      · · ·   cn−2      cn−1 = Pn′(x0)

Example 4.14.1 Find all the roots of the polynomial equation x4 −8x3 +14.91x2 +
9.54x − 25.92 = 0. One root of the equation lies between 1 and 2.

Solution. Let the polynomial be denoted by P4 (x). Also, let the initial guess be
x0 = 1.2.

1.2 1 –8 14.91 9.54 –25.92


1.2 –8.16 8.10 21.168
1.2 1 –6.8 6.75 17.64 –4.752=b4 = P4 (x0 )
1.2 –6.72 0.036
1 –5.6 0.03 17.676=c3 = P4′(x0)

Therefore,
x1 = x0 − b4/c3 = 1.2 − (−4.752/17.676) = 1.46884.

1.46884 1 –8 14.91 9.54 –25.92


1.46884 –9.59323 7.80949 25.48362
1.46884 1 –6.53116 5.31677 17.34949 –0.43638=b4
1.46884 –7.43574 –3.11243
1 –5.06232 –2.11897 14.23706=c3

Then x2 = x1 − b4/c3 = 1.46884 − (−0.43638/14.23706) = 1.49949.
c3 14.23706

1.49949 1 –8 14.91 9.54 –25.92


1.49949 –9.74745 7.74119 25.91298
1.49949 1 –6.50051 5.16255 17.28119 –0.00702=b4
1.49949 –7.49898 –3.50345
1 –5.00102 –2.33643 13.77774=c3

Then x3 = x2 − b4/c3 = 1.49949 − (−0.00702/13.77774) = 1.50000.
c3 13.77774

Therefore, one root is 1.50000, which is an exact root. The reduced polynomial is

x3 − 6.50051x2 + 5.16255x + 17.28119 = 0.

One root of this equation lies between 4 and 5. Let x0 = 4.0.

4 1 –6.50051 5.16255 17.28119


4 –10.00204 –19.35796
4 1 –2.50051 –4.83949 –2.07677=b3
4 5.99796
1 1.49949 1.15847= c2

Therefore,
x1 = x0 − b3/c2 = 4 − (−2.07677/1.15847) = 5.79268.

5.79268 1 –6.50051 5.16255 17.28119


5.79268 –4.10023 –6.15366
5.79268 1 –0.70783 1.06232 23.43485=b3
5.79268 29.45491
1 5.08485 30.51723= c2

x2 = x1 − b3/c2 = 5.79268 − (23.43485/30.51723) = 5.02476.

5.02476 1 –6.50051 5.16255 17.28119


5.02476 –7.41529 –11.31948
5.02476 1 –1.47575 –2.25274 5.96171=b3
5.02476 17.83292
1 3.54901 15.58018= c2

x3 = x2 − b3/c2 = 5.02476 − (5.96171/15.58018) = 4.64211.

4.64211 1 –6.50051 5.16255 17.28119


4.64211 –8.62690 –16.08188
4.64211 1 –1.85840 –3.46435 1.19931=b3
4.64211 12.92229
1 2.78371 9.45794= c2

x4 = x3 − b3/c2 = 4.64211 − (1.19931/9.45794) = 4.51531.

We take x = 4.5 as another root. The next reduced equation is

x2 − 1.85840x − 3.46435 = 0

and roots of this equation are


x = 3.00953, −1.15113.
Hence the roots of the given equation are

1.5, 4.5, 3.00953, −1.15113.

But the roots 3.00953 and −1.15113 contain some error, because we approximated
4.51531 as 4.5. Some more iterations are required to obtain the root 4.5 and the
correct quotient polynomial.

Algorithm 4.6 (Birge-Vieta method). This algorithm finds a root of a polyno-


mial equation when the coefficients a1 , a2 , . . . , an and an initial guess x0 are given.

Algorithm Birge-Vieta
//n is the degree of the polynomial and N is the maximum number of iterations to
be performed. Assume that the leading coefficient is one. ε is the error tolerance.//
Read x0 , ε, n, N ;
Read ai , i = 1, 2, . . . , n;

for i = 1 to N do
Set b0 = 1, c0 = 1;
for k = 1 to n do
Compute bk = ak + x0 bk−1 ;
endfor;
for k = 1 to n − 1 do
Compute ck = bk + x0 ck−1 ;
endfor;
bn
Compute x1 = x0 − ;
cn−1
if |x1 − x0 | < ε then
Print‘One root is’, x1 ;
Stop;
else
Set x0 = x1 ;
endif;
endfor;
Print ‘Root not found in N iterations’;
end Birge-Vieta

Program 4.6 .
/* Program Birge-Vieta for polynomial equation
Program to find a root of the polynomial equation by
Birge-Vieta method. Leading coefficient is 1 */
#include<stdio.h>
#include<math.h>
#include<stdlib.h>
void main()
{
int n, N,i,k;
float x0,x1,a[10],b[10],c[10];
float epp=1e-5; /* error tolerance */
printf("\nEnter the degree of the polynomial ");
scanf("%d",&n);
printf("Enter the coefficients of the polynomial,
except leading coeff.");
for(i=1;i<=n;i++) scanf("%f",&a[i]);
printf("Enter the initial guess x0 ");
scanf("%f",&x0);
printf("Enter maximum number of iterations to be done ");
scanf("%d",&N);
b[0]=1; c[0]=1;
for(i=1;i<=N;i++)
{
for(k=1;k<=n;k++) b[k]=a[k]+x0*b[k-1];
for(k=1;k<=n-1;k++) c[k]=b[k]+x0*c[k-1];
x1=x0-b[n]/c[n-1];
if(fabs(x1-x0)<epp)
{
printf("One root is %8.5f obtained at %d iterations",x1,i);
printf("\nCoefficients of the reduced polynomial are\n ");
for(k=0;k<=n-1;k++) printf("%f ",b[k]);
exit(0);
}
else
x0=x1;
} /* i loop */
printf("\n Root not found at %d iterations ",N);
} /* main */

A sample of input/output:
Enter the degree of the polynomial 4
Enter the coefficients of the polynomial, except leading coeff.
-3 0 6 -4
Enter the initial guess x0 1
Enter maximum number of iterations to be done 100
One root is 1.00000 obtained at 1 iterations
Coefficients of the reduced polynomial are
1.000000 -2.000000 -2.000000 4.000000

4.15 Bairstow Method

The roots of a polynomial equation can also be determined by extracting a quadratic


factor from the polynomial. The roots (real or complex) of a quadratic equation can be
determined using the well-known closed form formula. The Bairstow method is used to
extract such a quadratic factor.
Let the polynomial Pn (x) of degree n be

xn + a1 xn−1 + a2 xn−2 + · · · + an−1 x + an = 0. (4.67)

Let x2 + px + q be a factor of (4.67). If (4.67) is divided by the factor x2 + px + q then


we obtain a polynomial Qn−2 (x) of degree (n − 2) and a remainder Rx + S of degree
one, where R and S are independent of x. Thus the polynomial Pn (x) can be written
as
Pn (x) = (x2 + px + q)Qn−2 (x) + Rx + S (4.68)

where
Qn−2 (x) = xn−2 + b1 xn−3 + · · · + bn−3 x + bn−2 . (4.69)

The values of R and S depend on p and q. If x² + px + q is a factor of Pn(x), then
R and S should be zero. Thus our problem is to determine the values of p and q such
that
R(p, q) = 0 and S(p, q) = 0.          (4.70)

These equations are two non-linear equations in p and q. The values of p and q can
then be determined by Newton-Raphson method for two variables (see Section 4.17.3).
Let (pt , qt ) be the true values of p and q and ∆p, ∆q be the corrections to p and q.
Then
pt = p + ∆p and qt = q + ∆q.
Thus

R(pt , qt ) = R(p + ∆p, q + ∆q) = 0 and S(pt , qt ) = S(p + ∆p, q + ∆q) = 0.



Therefore, by Taylor’s series

∂R ∂R
R(p + ∆p, q + ∆q) = R(p, q) + ∆p + ∆q + ··· = 0
∂p ∂q
∂S ∂S
and S(p + ∆p, q + ∆q) = S(p, q) + ∆p + ∆q + · · · = 0.
∂p ∂q

The derivatives are evaluated at (p, q). Neglecting the square and higher powers of
∆p and ∆q the above equations reduce to

∆pRp + ∆qRq = −R (4.71)


∆pSp + ∆qSq = −S. (4.72)

The values of ∆p and ∆q are thus given by

∆p = −(RSq − SRq)/(RpSq − RqSp),   ∆q = −(SRp − RSp)/(RpSq − RqSp).          (4.73)

Now, we determine the coefficient of the polynomial Qn−2 (x) and the expression for
R and S in terms of p and q.
From equations (4.67)-(4.69)

xn + a1 xn−1 + a2 xn−2 + · · · + an−1 x + an


= (x2 + px + q)(xn−2 + b1 xn−3 + · · · + bn−3 x + bn−2 ). (4.74)

Equating the coefficients of xn , xn−1 , . . . on both sides, we get

a1 = b1 + p b1 = a1 − p
a2 = b2 + pb1 + q b2 = a2 − pb1 − q
.. ..
. .
ak = bk + pbk−1 + qbk−2 bk = ak − pbk−1 − qbk−2 (4.75)
.. ..
. .
an−1 = R + pbn−2 + qbn−3 R = an−1 − pbn−2 − qbn−3
an = S + qbn−2 S = an − qbn−2 .

In general,

bk = ak − pbk−1 − qbk−2 , k = 1, 2, . . . , n (4.76)

where b0 = 1, b−1 = 0.
In this notation,
R = bn−1 , S = bn + pbn−1 . (4.77)

Thus R and S are available when b’s are known. To determine the partial derivatives
Rp , Rq , Sp and Sq , differentiating (4.76) with respect to p and q.
∂bk/∂p = −bk−1 − p ∂bk−1/∂p − q ∂bk−2/∂p,   ∂b0/∂p = ∂b−1/∂p = 0          (4.78)
∂bk/∂q = −bk−2 − p ∂bk−1/∂q − q ∂bk−2/∂q,   ∂b0/∂q = ∂b−1/∂q = 0          (4.79)

Denoting
∂bk/∂p = −ck−1,  k = 1, 2, . . . , n          (4.80)
and
∂bk/∂q = −ck−2,          (4.81)
Then (4.78) becomes
ck−1 = bk−1 − pck−2 − qck−3 (4.82)

and (4.79) becomes


ck−2 = bk−2 − pck−3 − qck−4 . (4.83)

Thus the recurrence relation to determine ck using bk is

ck = bk − pck−1 − qck−2 , k = 1, 2, . . . , n − 1 and c0 = 1, c−1 = 0. (4.84)

Therefore,

Rp = ∂bn−1/∂p = −cn−2
Sp = ∂bn/∂p + p ∂bn−1/∂p + bn−1 = bn−1 − cn−1 − pcn−2
Rq = ∂bn−1/∂q = −cn−3
Sq = ∂bn/∂q + p ∂bn−1/∂q = −(cn−2 + pcn−3).
To find the explicit expressions for ∆p and ∆q, substitute the above values in (4.73).
Therefore,

∆p = −[bncn−3 − bn−1cn−2]/[cn−2² − cn−3(cn−1 − bn−1)]
∆q = −[bn−1(cn−1 − bn−1) − bncn−2]/[cn−2² − cn−3(cn−1 − bn−1)].          (4.85)
Therefore, the improved values of p and q are p + ∆p and q + ∆q. Thus if p0 , q0 be
the initial values of p and q then the improved values are

p1 = p0 + ∆p and q1 = q0 + ∆q. (4.86)



The values of bk’s and ck’s may be calculated using the following scheme (when p0
and q0 are taken as the initial values of p and q):

      1     a1      a2       · · ·   ak          · · ·   an−1        an
−p0         −p0     −p0b1    · · ·   −p0bk−1     · · ·   −p0bn−2     −p0bn−1
−q0                 −q0      · · ·   −q0bk−2     · · ·   −q0bn−3     −q0bn−2
      1     b1      b2       · · ·   bk          · · ·   bn−1        bn
−p0         −p0     −p0c1    · · ·   −p0ck−1     · · ·   −p0cn−2
−q0                 −q0      · · ·   −q0ck−2     · · ·   −q0cn−3
      1     c1      c2       · · ·   ck          · · ·   cn−1
Once p1 and q1 are evaluated, the next improved values p2 , q2 are determined from
the relation
p2 = p1 + ∆p, q2 = q1 + ∆q.
In general,
pk+1 = pk + ∆p, qk+1 = qk + ∆q, (4.87)

the values of ∆p and ∆q are determined at p = pk and q = qk .


The repetition is to be terminated when p and q have been obtained to the desired
accuracy.
The polynomial

Qn−2 (x) = Pn (x)/(x2 + px + q) = xn−2 + b1 xn−3 + · · · + bn−3 x + bn−2

is called the deflated polynomial. The next quadratic factor can be obtained by a
similar process from the deflated polynomial.
The rate of convergence of this method is quadratic as the computations of ∆p and
∆q are based on Newton-Raphson method.
Example 4.15.1 Extract a quadratic factor using the Bairstow method from the
equation
x4 + 4x3 − 7x2 − 22x + 24 = 0.

Solution. Let the initial guess of p and q be p0 = 0.5 and q0 = 0.5. Then

1.00000 4.00000 −7.00000 −22.00000 24.00000


−0.5 −0.50000 −1.75000 4.62500 9.56250
−0.5 −0.50000 −1.75000 4.62500
1.00000 3.50000 = b1 −9.25000 −19.12500 38.18750 = b4
−0.5 −0.50000 −1.50000 5.62500
−0.5 −0.50000 −1.50000
1.00000 3.00000 −11.25000 −15.00000
= c1 = c2 = c3

∆p = −[b4c1 − b3c2]/[c2² − c1(c3 − b3)] = 0.88095,
∆q = −[b3(c3 − b3) − b4c2]/[c2² − c1(c3 − b3)] = −3.07143.

Therefore, p1 = p0 + ∆p = 1.38095, q1 = q0 + ∆q = −2.57143.

Second iteration

1.00000 4.00000 −7.00000 −22.00000 24.00000


−1.38095 −1.38095 −3.61678 11.11025 5.73794
2.57143 2.57143 6.73469 −20.68805
1.00000 2.61905 −8.04535 −4.15506 9.04989
−1.38095 −1.38095 −1.70975 9.92031
2.57143 2.57143 3.18367
1.00000 1.23810 −7.18367 8.94893

∆p = 0.52695, ∆q = −0.29857.
p2 = p1 + ∆p = 1.90790, q2 = q1 + ∆q = −2.86999.

Third iteration

1.00000 4.00000 −7.00000 −22.00000 24.00000


−1.90790 −1.90790 −3.99152 15.49504 0.95517
2.86999 2.86999 6.00432 −23.30873
1.00000 2.09210 −8.12152 −0.50064 1.64644
−1.90790 −1.90790 −0.35144 10.68990
2.86999 2.86999 0.52866
1.00000 0.18420 −5.60297 10.71793

∆p = 0.08531, ∆q = −0.12304.
p3 = p2 + ∆p = 1.99321, q3 = q2 + ∆q = −2.99304.

Fourth iteration

1.00000 4.00000 −7.00000 −22.00000 24.00000


−1.99321 −1.99321 −3.99995 15.95943 0.06808
2.99304 2.99304 6.00642 −23.96501
1.00000 2.00679 −8.00692 −0.03416 0.10307
−1.99321 −1.99321 −0.02709 10.04768
2.99304 2.99304 0.04067
1.00000 0.01359 −5.04097 10.05419

∆p = 0.00676, ∆q = −0.00692.
p4 = p3 + ∆p = 1.99996, q4 = q3 + ∆q = −2.99996.

Fifth iteration

1.00000 4.00000 −7.00000 −22.00000 24.00000


−1.99996 −1.99996 −4.00000 15.99978 0.00037
2.99996 2.99996 6.00004 −23.99981
1.00000 2.00004 −8.00004 −0.00018 0.00055
−1.99996 −1.99996 −0.00015 10.00027
2.99996 2.99996 0.00023
1.00000 0.00008 −5.00023 10.00031

∆p = 0.00004, ∆q = −0.00004.
p5 = p4 + ∆p = 2.00000, q5 = q4 + ∆q = −3.00000.
Therefore, a quadratic factor is x² + 2x − 3, which is equal to (x − 1)(x + 3). The
deflated polynomial is Q2(x) = x² + 2.00004x − 8.00004 ≈ x² + 2x − 8.
Thus P4(x) = (x − 1)(x + 3)(x² + 2x − 8) = (x − 1)(x + 3)(x − 2)(x + 4).
Hence the roots of the given equation are 1, −3, 2, −4.

Algorithm 4.7 (Bairstow method). This algorithm extracts a quadratic fac-


tor from a polynomial of degree n and also determines the deflated polynomial, by
Bairstow method.

Algorithm Bairstow
// Extract a quadratic factor x2 + px + q from a polynomial Pn (x) = xn + a1 xn−1 +
· · · + an−1 x + an of degree n and determines the deflated polynomial Qn−2 (x) =
xn−2 + b1 xn−3 + b2 xn−4 + · · · + bn−2 .//
Read n, a1 , a2 , . . . , an ; //the degree and the coefficients of the polynomial.//
Read p, q, ε; //the initial guess of p, q and error tolerance.//
Set b0 = 1, b−1 = 0, c0 = 1, c−1 = 0;
//Compute bk and ck
1. for k = 1 to n do
Compute bk = ak − pbk−1 − qbk−2 ;
endfor;
for k = 1 to n − 1 do
Compute ck = bk − pck−1 − qck−2 ;
endfor;
Compute ∆p = −[bncn−3 − bn−1cn−2]/[cn−2² − cn−3(cn−1 − bn−1)];
        ∆q = −[bn−1(cn−1 − bn−1) − bncn−2]/[cn−2² − cn−3(cn−1 − bn−1)];
Compute pnew = p + ∆p, qnew = q + ∆q;
if (|pnew − p| < ε) and (|qnew − q| < ε) then
Print ‘The values of p and q are’, pnew , qnew ;
Stop;
endif;
Set p = pnew , q = qnew ;

go to 1.
Print ‘The coefficients of deflated polynomial are’,b1 , b2 , . . . , bn−2 ;
end Bairstow
Program 4.7 .
/* Program Bairstow for polynomial equation
Program to find all the roots of a polynomial equation by
Bairstow method. Leading coefficient is 1. Assume
initial guess for all p and q are 0.5, 0.5. */
#include<stdio.h>
#include<math.h>
#include<stdlib.h>
void main()
{
int n,i,k;
float p,q,pnew,qnew,a[10],b[10],c[10],bm1,cm1,delp,delq;
float epp=1e-5; /* error tolerance */
void findroots(float p, float q);
printf("\nEnter the degree of the polynomial ");
scanf("%d",&n);
printf("Enter the coefficients of the polynomial,
except leading coeff.");
for(i=1;i<=n;i++) scanf("%f",&a[i]);
q=0.5;
p=0.5;
printf("The roots are \n");
do
{
b[0]=1; bm1=0; c[0]=1; cm1=0;
pnew=p; qnew=q;
do{
p=pnew; q=qnew;
b[1]=a[1]-p*b[0]-q*bm1;
c[1]=b[1]-p*c[0]-q*cm1;
for(k=2;k<=n;k++) b[k]=a[k]-p*b[k-1]-q*b[k-2];
for(k=2;k<=n;k++) c[k]=b[k]-p*c[k-1]-q*c[k-2];
delp=-(b[n]*c[n-3]-b[n-1]*c[n-2])/
(c[n-2]*c[n-2]-c[n-3]*(c[n-1]-b[n-1]));
delq=-(b[n-1]*(c[n-1]-b[n-1])-b[n]*c[n-2])/
(c[n-2]*c[n-2]-c[n-3]*(c[n-1]-b[n-1]));

pnew=p+delp;
qnew=q+delq;
}while((fabs(pnew-p)>epp || fabs(qnew-q)>epp));
findroots(p,q);
n-=2;
for(i=1;i<=n;i++) a[i]=b[i];
}while(n>2);

/* deflated polynomial is quadratic */


if(n==2) findroots(b[1],b[2]);

/* deflated polynomial is linear */


if(n==1) printf("%f ",-b[1]);
} /* main */

/* finds the roots of the quadratic x*x+px+q=0 */


void findroots(float p, float q)
{
float dis;
dis=p*p-4*q;
if(dis>=0)
{
printf("%f %f\n",(-p+sqrt(dis))/2,(-p-sqrt(dis))/2);
}
else
{
printf("(%f,%f), (%f,%f)\n",-p/2,sqrt(fabs(dis))/2,
-p/2,-sqrt(fabs(dis))/2);
}
} /* findroots */

A sample of input/output:

Enter the degree of the polynomial 5


Enter the coefficients of the polynomial, except leading coeff.
3 4 -5 6 1
The roots are
(0.604068,0.729697), (0.604068,-0.729697)
(-2.030660,1.861934), (-2.030660,-1.861934)
-0.146816

Direct Method

4.16 Graeffe’s Root Squaring Method

This method may be used to find all the roots of all types (real, equal or complex) of a
polynomial equation with real coefficients. In this method, an equation is constructed
whose roots are squares of the roots of the given equation, then another equation whose
roots are squares of the roots of this new equation and so on, the process of root-squaring
being continued as many times as necessary.
Let the given equation be

xn + a1 xn−1 + a2 xn−2 + · · · + an−1 x + an = 0 (4.88)

whose roots are ξ1 , ξ2 , . . . , ξn .


Now, an equation is constructed whose roots are −ξ12 , −ξ22 , . . . , −ξn2 using the following
technique.
Separating the even and odd powers of x in (4.88), and squaring both sides

(xn + a2 xn−2 + a4 xn−4 + · · · )2 = (a1 xn−1 + a3 xn−3 + · · · )2 .

After simplification the above equation becomes

x²ⁿ − (a1² − 2a2)x²ⁿ⁻² + (a2² − 2a1a3 + 2a4)x²ⁿ⁻⁴ + · · · + (−1)ⁿan² = 0.          (4.89)

Setting z = −x² in the above equation, we obtain an equation of the form

zⁿ + b1zⁿ⁻¹ + · · · + bn−1z + bn = 0,

where
b1 = a1² − 2a2
b2 = a2² − 2a1a3 + 2a4
. . .
bk = ak² − 2ak−1ak+1 + 2ak−2ak+2 − · · ·          (4.90)
. . .
bn = an².

This new equation has roots −ξ1², −ξ2², . . . , −ξn². The coefficients bk can be
obtained from Table 4.3.
The (k + 1)th column, i.e., bk, of Table 4.3 can be obtained as follows: the terms
alternate in sign, starting with a positive sign. The first term is ak². The second
term is twice the product of ak−1 and ak+1. The third term is twice the product of
ak−2 and ak+2. This process is continued until there are no available coefficients to
form the cross product terms.

Table 4.3: Graeffe’s root-squaring scheme.

1    a1       a2        a3        a4        · · ·   an
1    a1²      a2²       a3²       a4²       · · ·   an²
     −2a2     −2a1a3    −2a2a4    −2a3a5    · · ·
              2a4       2a1a5     2a2a6     · · ·
                        −2a6      −2a1a7    · · ·
                                  2a8       · · ·
1    b1       b2        b3        b4        · · ·   bn
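Although the book gives no separate program for Graeffe's method, one squaring step
(4.90) is straightforward to code. The following C sketch is an illustration under
stated assumptions: the polynomial is monic (a[0] = 1) and, for the hypothetical test
equation x³ − 2x² − 5x + 6 = 0 (roots 3, 1, −2), all roots are real and distinct in
magnitude, so formula (4.94) below applies.

/* Sketch of Graeffe's root squaring: one squaring step (4.90) is
   applied m=5 times, then |root i| = (c_i/c_(i-1))^(1/p), p=2^m. */
#include<stdio.h>
#include<math.h>
void squaring(int n, double a[], double b[])
{
    int k,j;
    double s,sign;
    for(k=0;k<=n;k++)
    {
        s=a[k]*a[k]; sign=-2.0;
        for(j=1;k-j>=0 && k+j<=n;j++)
        {
            s+=sign*a[k-j]*a[k+j]; /* cross products alternate in sign */
            sign=-sign;
        }
        b[k]=s;
    }
}
void main()
{
    double a[4]={1,-2,-5,6},b[4]; /* x^3-2x^2-5x+6=0, roots 3,1,-2 */
    int i,m,n=3;
    for(m=1;m<=5;m++) /* five squarings, p=2^5=32 */
    {
        squaring(n,a,b);
        for(i=0;i<=n;i++) a[i]=b[i];
    }
    for(i=1;i<=n;i++) /* magnitudes of the roots */
        printf("%f\n",pow(fabs(a[i]/a[i-1]),1.0/32.0));
}

This sketch prints the magnitudes 3, 2, 1; the signs are then fixed by substituting
each candidate into the original equation.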
The root-squaring process is repeated a sufficient number of times, say m times, and
we obtain the equation

xⁿ + c1xⁿ⁻¹ + c2xⁿ⁻² + · · · + cn−1x + cn = 0.          (4.91)

Let the roots of (4.91) be α1, α2, . . . , αn. These roots are the 2ᵐth powers of the
roots of the equation (4.88) with opposite signs, i.e.,

αi = −ξi^(2^m),  i = 1, 2, . . . , n.

The relations between roots and coefficients give

−c1 = α1 + α2 + · · · + αn = Σi αi = −Σi ξiᵖ
 c2 = α1α2 + α1α3 + · · · + αn−1αn = Σi<j αiαj = Σi<j ξiᵖξjᵖ
−c3 = Σi<j<k αiαjαk = −Σi<j<k ξiᵖξjᵖξkᵖ
· · · · · · · · · · · ·
(−1)ⁿcn = α1α2 · · · αn = (−1)ⁿξ1ᵖξ2ᵖ · · · ξnᵖ,  where p = 2ᵐ.          (4.92)

In the following different cases are considered separately.


Case I. Roots are real and unequal in magnitudes.
Let us consider
|ξn| < |ξn−1| < · · · < |ξ2| < |ξ1|;
then
|αn| ≪ |αn−1| ≪ · · · ≪ |α2| ≪ |α1|.
That is,

|ξn|^(2^m) ≪ |ξn−1|^(2^m) ≪ · · · ≪ |ξ2|^(2^m) ≪ |ξ1|^(2^m),          (4.93)

since all the roots are widely separated in magnitude at the final stage.
Then from (4.92),

c1 = ξ1ᵖ[1 + (ξ2/ξ1)ᵖ + (ξ3/ξ1)ᵖ + · · · + (ξn/ξ1)ᵖ] ≈ ξ1ᵖ,

since (ξi/ξ1)ᵖ → 0, i > 1, at the desired level of accuracy.
Similarly,

c2 = (ξ1ξ2)ᵖ[1 + (ξ1ξ3/ξ1ξ2)ᵖ + · · · + (ξn−1ξn/ξ1ξ2)ᵖ] ≈ (ξ1ξ2)ᵖ

and so on; finally,

cn = (ξ1ξ2 · · · ξn)ᵖ.

Thus, at the desired level of accuracy,

|ξ1| = c1^(1/p),  |ξ2| = (c2/c1)^(1/p),  . . . ,  |ξn| = (cn/cn−1)^(1/p),  p = 2ᵐ.          (4.94)

This determines the absolute values of the roots. By substituting these values in the
original equation (4.88) one can determine the sign of the roots. The squaring process is
terminated when another squaring process produces new coefficients that are almost the
squares of the corresponding coefficients ck ’s i.e., when the cross product terms become
negligible with respect to square terms. Thus the final stage is identified by the fact
that on root-squaring at that stage all the cross products will vanish.
Case II. All roots are real with one pair of equal magnitude.
Let ξ1 , ξ2 , . . . , ξn be the roots of the given equation, if a pair of roots are equal in
magnitude then this pair is conveniently called a double root. A double root can be
identified in the following way:
If the magnitude of the coefficient ck is about half the square of the magnitude of the
corresponding coefficient in the previous equation, then it indicates that ξk is a double
root. The double root is determined by the following process.
We have
αk ≈ −ck/ck−1  and  αk+1 ≈ −ck+1/ck.
Then
αkαk+1 ≈ αk² ≈ |ck+1/ck−1|.
Therefore,
|αk²| = |ξk|^(2·2^m) = |ck+1/ck−1|.
Thus the roots are given by

|ξ1| = c1^(1/p),  |ξ2| = (c2/c1)^(1/p),  . . . ,  |ξk−1| = (ck−1/ck−2)^(1/p),
|ξk| = (ck+1/ck−1)^(1/(2p)),  . . . ,  |ξn| = (cn/cn−1)^(1/p),          (4.95)

where p = 2ᵐ.

This gives the magnitude of the double root. The sign is determined by substituting
the root into the equation.
The double root can also be determined directly since αk and αk+1 converge to the
same root after sufficient squaring. Generally, the rate of convergence to the double
root is slow.
Case III. One pair of complex roots and other roots are distinct in magnitude.

Let ξk and ξk+1 form a complex pair and let

ξk , ξk+1 = ρk e±iθk

where ρk = |ξk | = |ξk+1 |.


For sufficiently large m, ρk can be determined as in the previous case:

|αk|² ≈ |ck+1/ck−1|,  or,  ρk^(2p) ≈ |ck+1/ck−1|,  where p = 2ᵐ,

and θk is determined from the relation

2ρkᵖ cos pθk ≈ ck/ck−1.

Thus the roots are given by

|ξ1| = c1^(1/p),  |ξ2| = (c2/c1)^(1/p),  . . . ,  |ξk−1| = (ck−1/ck−2)^(1/p),
ρk² = (ck+1/ck−1)^(1/p),
|ξk+2| = (ck+2/ck+1)^(1/p),  . . . ,  |ξn| = (cn/cn−1)^(1/p),  p = 2ᵐ.


The real roots ξ1 , ξ2 , . . . , ξk−1 , ξk+2 , . . . , ξn are then corrected for sign.
If the equation has only one pair of complex roots ξk , ξk+1 = u ± iv then the sum of
the roots is
ξ1 + ξ2 + · · · + ξk−1 + 2u + ξk+2 + · · · + ξn = −a1 .

From this relation one can determine the value of u. Then the value of v can be
determined from the relation v² = ρk² − u².
The presence of complex roots in the (k + 1)th column is identified by the following
technique: if the coefficients of xⁿ⁻ᵏ fluctuate both in magnitude and sign in the
successive squarings, a complex pair is indicated by this oscillation.

Merits and Demerits


1. All roots are found at the end of the method, i.e., at one execution of the method
all the roots are determined, including complex roots.

2. No initial guess is required.

3. As a direct method, there is no scope for correcting the error generated in any
stage. If any error is generated at any stage, then the error propagates to all the
subsequent computations and ultimately gives a wrong result.

4. The method is laborious, and to get a very accurate result the squaring process
has to be repeated a large number of times.

5. There is a chance of data overflow on a computer, since the coefficients grow
rapidly under repeated squaring.

In the following, three examples are considered to discuss the three possible cases of
Graeffe’s method. The following table shows the Graeffe root-squaring scheme for a
fourth degree equation.

1    a1      a2        a3        a4
1    a1²     a2²       a3²       a4²
     −2a2    −2a1a3    −2a2a4
             2a4
1    c1      c2        c3        c4

Example 4.16.1 Find the roots of the equation

2x4 − 15x3 + 40x2 − 45x + 18 = 0

correct up to four decimal places by Graeffe’s root squaring method.

Solution. All the calculations are shown in the following table. The number within
parentheses represents the exponent of the adjacent number, i.e., 0.75(02) means
0.75 × 10².

m 2m x4 x3 x2 x 1
0 1 1.00000 −0.75000(01) 0.20000(02) −0.22500(02) 0.90000(01)
1.00000 0.56250(02) 0.40000(03) 0.50625(03) 0.81000(02)
−0.40000(02) −0.33750(03) −0.36000(03)
0.18000(02)
1 2 1.00000 0.16250(02) 0.80500(02) 0.14625(03) 0.81000(02)
1.00000 0.26406(03) 0.64803(04) 0.21389(05) 0.65610(04)
−0.16100(03) −0.47531(04) −0.13041(05)
0.16200(03)
2 4 1.00000 0.10306(03) 0.18891(04) 0.83481(04) 0.65610(04)
1.00000 0.10622(05) 0.35688(07) 0.69690(08) 0.43047(08)
−0.37783(04) −0.17207(07) −0.24789(08)
0.13122(05)
3 8 1.00000 0.68436(04) 0.18612(07) 0.44901(08) 0.43047(08)
1.00000 0.46835(08) 0.34640(13) 0.20161(16) 0.18530(16)
−0.37223(07) −0.61457(12) −0.16023(15)
0.86093(08)
4 16 1.00000 0.43113(08) 0.28495(13) 0.18559(16) 0.18530(16)
1.00000 0.18587(16) 0.81195(25) 0.34443(31) 0.34337(31)
−0.56989(13) −0.16002(24) −0.10560(29)
0.37060(16)
5 32 1.0000 0.18530(16) 0.79595(25) 0.34337(31) 0.34337(31)

This is the final equation since all the cross products vanish at the next step and all
the roots are real and distinct in magnitude.
Therefore,

|ξ1| = (0.18530 × 10¹⁶)^(1/32) = 3.0000,
|ξ2| = (0.79595 × 10²⁵/0.18530 × 10¹⁶)^(1/32) = 2.0000,
|ξ3| = (0.34337 × 10³¹/0.79595 × 10²⁵)^(1/32) = 1.5000,
|ξ4| = (0.34337 × 10³¹/0.34337 × 10³¹)^(1/32) = 1.0000.

Hence the roots of the given equation are 3, 2, 1.5, 1.



Example 4.16.2 Find the roots of the equation x4 − 3x3 + 6x − 4 = 0 correct up


to four decimal places by Graeffe’s root squaring method.

Solution. The necessary calculations are shown below.


m 2m x4 x3 x2 x 1
0 1 1.00000 -0.30000(01) 0.00000(00) 0.60000(01) -0.40000(01)
1.00000 0.90000(01) 0.00000(00) 0.36000(02) 0.16000(02)
0.00000(00) 0.36000(02) 0.00000(00)
-0.80000(01)
1 2 1.00000 0.90000(01) 0.28000(02) 0.36000(02) 0.16000(02)
1.00000 0.81000(02) 0.78400(03) 0.12960(04) 0.25600(03)
-0.56000(02) -0.64800(03) -0.89600(03)
0.32000(02)
2 4 1.00000 0.25000(02) 0.16800(03) 0.40000(03) 0.25600(03)
1.00000 0.62500(03) 0.28224(05) 0.16000(06) 0.65536(05)
-0.33600(03) -0.20000(05) -0.86016(05)
0.51200(03)
3 8 1.00000 0.28900(03) 0.87360(04) 0.73984(05) 0.65536(05)
1.00000 0.83521(05) 0.76318(08) 0.54736(10) 0.42950(10)
-0.17472(05) -0.42763(08) -0.11450(10)
0.13107(06)
4 16 1.00000 0.66049(05) 0.33686(08) 0.43286(10) 0.42950(10)
1.00000 0.43625(10) 0.11347(16) 0.18737(20) 0.18447(20)
-0.67372(08) -0.57180(15) -0.28936(18)
0.85899(10)
5 32 1.00000 0.42951(10) 0.56296(15) 0.18447(20) 0.18447(20)

The cross products vanish at the next step and hence this is the final equation; the
characteristic behaviour of a double root (each new coefficient is about half the
square of the previous one) appears in the coefficient of x².
Therefore,

|ξ1| = (0.42951 × 10¹⁰)^(1/32) = 2.0000,
|ξ2| = |ξ3| = (0.18447 × 10²⁰/0.42951 × 10¹⁰)^(1/64) = 1.4142,
|ξ4| = (0.18447 × 10²⁰/0.18447 × 10²⁰)^(1/32) = 1.0000.

Here 1.4142 as well as −1.4142 satisfies the given equation; hence the roots of the
given equation are 2, ±1.4142, 1.

Example 4.16.3 Find the roots of the equation x4 − 5x3 + x2 − 7x + 10 = 0 correct


up to four decimal places by Graeffe’s root squaring method.

Solution. The necessary calculations are shown below.


m 2m x4 x3 x2 x 1
0 1 1.00000 -0.50000(01) 0.10000(01) -0.70000(01) 0.10000(02)
1.00000 0.25000(02) 0.10000(01) 0.49000(02) 0.10000(03)
-0.20000(01) -0.70000(02) -0.20000(02)
0.20000(02)
1 2 1.00000 0.23000(02) -0.49000(02) 0.29000(02) 0.10000(03)
1.00000 0.52900(03) 0.24010(04) 0.84100(03) 0.10000(05)
0.98000(02) -0.13340(04) 0.98000(04)
0.20000(03)
2 4 1.00000 0.62700(03) 0.12670(04) 0.10641(05) 0.10000(05)
1.00000 0.39313(06) 0.16053(07) 0.11323(09) 0.10000(09)
-0.25340(04) -0.13344(08) -0.25340(08)
0.20000(05)
3 8 1.00000 0.39060(06) -0.11719(08) 0.87891(08) 0.10000(09)
1.00000 0.15256(12) 0.13732(15) 0.77248(16) 0.10000(17)
0.23437(08) -0.68659(14) 0.23437(16)
0.20000(09)
4 16 1.00000 0.15259(12) 0.68665(14) 0.10069(17) 0.10000(17)
1.00000 0.23283(23) 0.47148(28) 0.10137(33) 0.10000(33)
-0.13733(15) -0.30727(28) -0.13733(31)
0.20000(17)
5 32 1.00000 0.23283(23) 0.16422(28) 0.10000(33) 0.10000(33)

Since c2 alternates in sign, it indicates that there is a pair of complex roots.

|ξ1| = (0.23283 × 10²³)^(1/32) = 5.0000,
|ξ4| = (0.10000 × 10³³/0.10000 × 10³³)^(1/32) = 1.0000.

These two roots are positive.

ρ2² = (0.10000 × 10³³/0.23283 × 10²³)^(1/32) = 2.0000.

If ξ2, ξ3 = u ± iv, then 2u + 5 + 1 = 5 (the sum of the roots). Therefore, u = −0.5. Then

v = √(ρ2² − u²) = √(2 − 0.25) = 1.3229.
Hence the roots are 5, 1, −0.5 ± 1.3229i.

4.17 Solution of Systems of Nonlinear Equations

To solve a system of nonlinear equations the following methods are discussed in this
section.

1. The method of iteration (fixed point iteration)

2. Seidal iteration

3. Newton-Raphson method.

4.17.1 The method of iteration


Let the system of nonlinear equations be

f (x, y) = 0
and g(x, y) = 0. (4.96)

whose real roots are required within a specified accuracy. The above system can be
rewritten as

x = F (x, y)
and y = G(x, y). (4.97)

The function F and G may be obtained in many different ways.


Let (x0, y0) be the initial guess to a root (ξ, η) of the system (4.96). Then we obtain
the following sequence {(xn, yn)} of approximations:

x1 = F (x0 , y0 ), y1 = G(x0 , y0 )
x2 = F (x1 , y1 ), y2 = G(x1 , y1 )
··············· ··············· (4.98)
xn+1 = F (xn , yn ), yn+1 = G(xn , yn ).

If the sequence (4.98) converges, i.e.,

lim xn = ξ and lim yn = η


n→∞ n→∞

then

ξ = F (ξ, η) and η = G(ξ, η). (4.99)

Like the iteration process for single variable, the above sequence surely converge to
a root under certain condition. The sufficient condition is stated below.

Theorem 4.12 Assume that the functions x = F (x, y), y = G(x, y) and their first
order partial derivatives are continuous on a region R that contains a root (ξ, η). If the
starting point (x0, y0) is sufficiently close to (ξ, η) and if

|∂F/∂x| + |∂F/∂y| < 1   and   |∂G/∂x| + |∂G/∂y| < 1          (4.100)

for all (x, y) ∈ R, then the iteration scheme (4.98) converges to the root (ξ, η).

The corresponding condition for the functions x = F(x, y, z), y = G(x, y, z),
z = H(x, y, z) is

|∂F/∂x| + |∂F/∂y| + |∂F/∂z| < 1,
|∂G/∂x| + |∂G/∂y| + |∂G/∂z| < 1
and
|∂H/∂x| + |∂H/∂y| + |∂H/∂z| < 1

for all (x, y, z) ∈ R.

4.17.2 Seidal method


An improvement of the iteration method can be made by using the most recently
computed value of x while computing y, i.e., xn+1 is used in the calculation of yn+1.
Therefore, the iteration scheme becomes
xn+1 = F(xn, yn)
yn+1 = G(xn+1, yn).          (4.101)
This method is called Seidal iteration. In case of three variables the scheme is
xn+1 = F (xn , yn , zn )
yn+1 = G(xn+1 , yn , zn ) (4.102)
and zn+1 = H(xn+1 , yn+1 , zn ).

Example 4.17.1 Solve the following system of equations

x = (8x − 4x² + y² + 1)/8   and   y = (2x − x² + 4y − y² + 3)/4,
starting with (x0 , y0 ) = (1.1, 2.0), using (i) iteration method, and (ii) Seidal iteration
method.

Solution.
(i) Iteration method
Let
F(x, y) = (8x − 4x² + y² + 1)/8   and   G(x, y) = (2x − x² + 4y − y² + 3)/4.
The iteration scheme is

xn+1 = F(xn, yn) = (8xn − 4xn² + yn² + 1)/8,
yn+1 = G(xn, yn) = (2xn − xn² + 4yn − yn² + 3)/4.
The values of xn, yn, xn+1 and yn+1 for n = 0, 1, . . . are shown in the following table.

n xn yn xn+1 yn+1
0 1.10000 2.00000 1.12000 1.99750
1 1.12000 1.99750 1.11655 1.99640
2 1.11655 1.99640 1.11641 1.99660
3 1.11641 1.99660 1.11653 1.99661
4 1.11653 1.99661 1.11652 1.99660

Therefore, a root correct up to four decimal place is (1.1165, 1.9966).

(ii) Seidal method


The iteration scheme for the Seidal method is

xn+1 = F(xn, yn) = (8xn − 4xn² + yn² + 1)/8,
yn+1 = G(xn+1, yn) = (2xn+1 − xn+1² + 4yn − yn² + 3)/4.
The calculations are shown below.

n xn yn xn+1 yn+1
0 1.10000 2.00000 1.12000 1.99640
1 1.12000 1.99640 1.11600 1.99663
2 1.11600 1.99663 1.11659 1.99660
3 1.11659 1.99660 1.11650 1.99660

Therefore, the root correct up to four decimal places is (1.1165, 1.9966).

Algorithm 4.8 (Seidal iteration). This algorithm is used to solve a pair of non-linear
equations by Seidal iteration, when the initial guess is given.

Algorithm Seidal-Iteration-2D
// Let (x0 , y0 ) be the initial guess of the system of equations x = F (x, y), y = G(x, y).
ε be the error tolerance, maxiteration represents the maximum number of repetitions
to be done.//
Input functions F (x, y), G(x, y).
Read ε, maxiteration, x0 , y0 ;
Set k = 0, error = 1;
While k < maxiteration and error > ε do
Set k = k + 1;
Compute x1 = F (x0 , y0 ), y1 = G(x1 , y0 );
Compute error = |x1 − x0 | + |y1 − y0 |;
Set x0 = x1 , y0 = y1 ;
endwhile;
if error < ε then
Print ‘The sequence converge to the root’, x1 , y1 ;
Stop;
else
Print ‘The iteration did not converge after’, k,’iterations’;
Stop;
endif;
end Seidal-Iteration-2D
Program 4.8.
/* Program Seidal for a pair of non-linear equations
Program to find a root of a pair of non-linear equations
by Seidal method. Assumed that the equations are given
in the form x=f(x,y) and y=g(x,y).
The equations taken are x*x+4y*y-4=0, x*x-2x-y+1=0. */
#include<stdio.h>
#include<math.h>
#include<stdlib.h>

void main()
{
int k=0,maxiteration;
float error=1,eps=1e-5,x0,y0; /*initial guesses for x and y*/
float x1,y1;
float f(float x, float y);

float g(float x, float y);


printf("Enter initial guess for x and y ");
scanf("%f %f",&x0,&y0);
printf("Maximum iterations to be allowed ");
scanf("%d",&maxiteration);
while((k<maxiteration) && (error>eps))
{
k++;
x1=f(x0,y0); y1=g(x1,y0);
error=fabs(x1-x0)+fabs(y1-y0);
x0=x1; y0=y1;
}
if(error<eps)
{
printf("The sequence converges to the\n");
printf("root (%7.5f, %7.5f) at %d iterations",x1,y1,k);
exit(0);
}
else
{
printf("The iteration did not converge after %d iterations",k);
exit(0);
}
} /* main */

/* definition of f(x,y) */
float f(float x, float y)
{
float f1;
f1=sqrt(4-4*y*y);
return f1;
}

/* definition of g(x,y) */
float g(float x, float y)
{
float g1;
g1=x*x-2*x+1;
return g1;
}

A sample of input/output:

Enter initial guess for x and y 0.5 0.5


Maximum iterations to be allowed 100
The sequence converges to the root (0.00000, 1.00000) at 68 iterations

4.17.3 Newton-Raphson method

Let (x0 , y0 ) be an initial guess to the root (ξ, η) of the equations

f (x, y) = 0 and g(x, y) = 0. (4.103)

If (x0 + h, y0 + k) is the root of the above system then

f (x0 + h, y0 + k) = 0
g(x0 + h, y0 + k) = 0. (4.104)

Assume that f and g are differentiable. Expanding (4.104) by Taylor’s series,

f(x0, y0) + h(∂f/∂x)(x0,y0) + k(∂f/∂y)(x0,y0) + · · · = 0
g(x0, y0) + h(∂g/∂x)(x0,y0) + k(∂g/∂y)(x0,y0) + · · · = 0          (4.105)

Neglecting square and higher order terms, the above equations simplify to

h ∂f0/∂x + k ∂f0/∂y = −f0
h ∂g0/∂x + k ∂g0/∂y = −g0          (4.106)

where f0 = f(x0, y0), ∂f0/∂x = (∂f/∂x)(x0,y0), etc.
The above system can be written in matrix form as

[∂f0/∂x  ∂f0/∂y] [h]   [−f0]             [h]        [−f0]
[∂g0/∂x  ∂g0/∂y] [k] = [−g0]    or       [k] = J⁻¹ [−g0].

Alternatively, h and k can be evaluated as

    1 | −f0   ∂f0/∂y |       1 | ∂f0/∂x   −f0 |             | ∂f0/∂x   ∂f0/∂y |
h = — |              |,  k = — |               |,  where J = |                 |.   (4.107)
    J | −g0   ∂g0/∂y |       J | ∂g0/∂x   −g0 |             | ∂g0/∂x   ∂g0/∂y |
Thus h and k are determined by one of the above two ways. Therefore, the new
approximations are then given by
x1 = x0 + h, y1 = y0 + k. (4.108)
The process is to be repeated until the roots are achieved to the desired accuracy.
The general formula is xn+1 = xn + h, yn+1 = yn + k; h, k are evaluated at (xn , yn )
instead at (x0 , y0 ).
If the iteration converges (the condition is stated below) then the rate of convergence
is quadratic.

Theorem 4.13 Let (x0 , y0 ) be an initial guess to a root (ξ, η) of the system f (x, y) =
0, g(x, y) = 0 in a closed neighbourhood R containing (ξ, η). If
1. f, g and their first order partial derivatives are continuous and bounded in R, and
2. $J \neq 0$ in R, then the sequence of approximations $x_{n+1} = x_n + h$, $y_{n+1} = y_n + k$,
where h and k are given by (4.107), converges to the root (ξ, η).

Example 4.17.2 Use Newton-Raphson method to solve the system $x^2 - 2x - y + 0.5 = 0$, $x^2 + 4y^2 - 4 = 0$ with the starting value $(x_0, y_0) = (2.00, 0.25)$.

Solution. Let $f(x, y) = x^2 - 2x - y + 0.5$ and $g(x, y) = x^2 + 4y^2 - 4$. Then
$$\frac{\partial f}{\partial x} = 2x - 2,\quad \frac{\partial f}{\partial y} = -1,\quad \frac{\partial g}{\partial x} = 2x,\quad \frac{\partial g}{\partial y} = 8y.$$
Therefore,
$$J = \begin{pmatrix} \partial f/\partial x & \partial f/\partial y\\ \partial g/\partial x & \partial g/\partial y \end{pmatrix} = \begin{pmatrix} 2x - 2 & -1\\ 2x & 8y \end{pmatrix},\qquad \begin{pmatrix} -f_0\\ -g_0 \end{pmatrix} = \begin{pmatrix} -0.25\\ -0.25 \end{pmatrix}.$$
At $(x_0, y_0)$, $J_0 = \begin{pmatrix} 2 & -1\\ 4 & 2 \end{pmatrix}$. Therefore
$$\begin{pmatrix} 2 & -1\\ 4 & 2 \end{pmatrix}\begin{pmatrix} h\\ k \end{pmatrix} = \begin{pmatrix} -0.25\\ -0.25 \end{pmatrix}\quad\text{or}\quad \begin{pmatrix} h\\ k \end{pmatrix} = \frac{1}{8}\begin{pmatrix} 2 & 1\\ -4 & 2 \end{pmatrix}\begin{pmatrix} -0.25\\ -0.25 \end{pmatrix} = \begin{pmatrix} -0.09375\\ 0.06250 \end{pmatrix}.$$
Thus $x_1 = x_0 + h = 2.00 - 0.09375 = 1.90625$ and $y_1 = y_0 + k = 0.25 + 0.0625 = 0.31250$.
At $(x_1, y_1)$, $J_1 = \begin{pmatrix} 1.81250 & -1.00000\\ 3.81250 & 2.50000 \end{pmatrix}$ and $\begin{pmatrix} -f_1\\ -g_1 \end{pmatrix} = \begin{pmatrix} -0.00879\\ -0.02441 \end{pmatrix}$, so
$$J_1\begin{pmatrix} h\\ k \end{pmatrix} = \begin{pmatrix} -f_1\\ -g_1 \end{pmatrix}\quad\text{or}\quad \begin{pmatrix} h\\ k \end{pmatrix} = \frac{1}{8.34375}\begin{pmatrix} 2.50000 & 1.00000\\ -3.81250 & 1.81250 \end{pmatrix}\begin{pmatrix} -0.00879\\ -0.02441 \end{pmatrix} = \begin{pmatrix} -0.00556\\ -0.00129 \end{pmatrix}.$$
Therefore $x_2 = x_1 + h = 1.90625 - 0.00556 = 1.90069$ and $y_2 = y_1 + k = 0.31250 - 0.00129 = 0.31121$.
At $(x_2, y_2)$, $J_2 = \begin{pmatrix} 1.80138 & -1.00000\\ 3.80138 & 2.48968 \end{pmatrix}$ and $\begin{pmatrix} -f_2\\ -g_2 \end{pmatrix} = \begin{pmatrix} -0.00003\\ -0.00003 \end{pmatrix}$, so
$$J_2\begin{pmatrix} h\\ k \end{pmatrix} = \begin{pmatrix} -f_2\\ -g_2 \end{pmatrix}\quad\text{or}\quad \begin{pmatrix} h\\ k \end{pmatrix} = \frac{1}{8.28624}\begin{pmatrix} 2.48968 & 1.00000\\ -3.80138 & 1.80138 \end{pmatrix}\begin{pmatrix} -0.00003\\ -0.00003 \end{pmatrix} = \begin{pmatrix} -0.00001\\ 0.00001 \end{pmatrix}.$$
Hence $x_3 = x_2 + h = 1.90069 - 0.00001 = 1.90068$ and $y_3 = y_2 + k = 0.31121 + 0.00001 = 0.31122$.
Thus, one root is x = 1.9007, y = 0.3112 correct up to four decimal places.

Algorithm 4.9 (Newton-Raphson method for pair of equations). This algo-


rithm solves a pair of non-linear equations by Newton-Raphson method. The initial
guess of a root is to be supplied.

Algorithm Newton-Raphson -2D


//(x0 , y0 ) is initial guess, ε is the error tolerance.//
Input functions f (x, y), g(x, y), fx (x, y), fy (x, y), gx (x, y), gy (x, y).
Read x0 , y0 , ε, maxiteration;
for i = 1 to maxiteration do
Compute f0 = f (x0 , y0 ), g0 = g(x0 , y0 );
if (|f0 | < ε and |g0 | < ε) then
Print ‘A root is’, x0 , y0 ;
Stop;
endif;
Compute $delfx = (\partial f/\partial x)_{(x_0,y_0)}$, $delfy = (\partial f/\partial y)_{(x_0,y_0)}$;
Compute $delgx = (\partial g/\partial x)_{(x_0,y_0)}$, $delgy = (\partial g/\partial y)_{(x_0,y_0)}$;
Compute J = delf x ∗ delgy − delgx ∗ delf y;
Compute h = (−f0 ∗ delgy + g0 ∗ delf y)/J;
Compute k = (−g0 ∗ delf x + f0 ∗ delgx)/J;
Compute x0 ← x0 + h, y0 ← y0 + k;
endfor;
Print ‘Solution does not converge in’, maxiteration, ‘iteration’;
end Newton-Raphson -2D
Program 4.9.
/* Program Newton-Raphson (for a pair of non-linear equations)
Program to find a root of a pair of non-linear equations
by Newton-Raphson method. Partial derivative of f and g
w.r.t. x and y are to be supplied.
The equations taken are 3x*x-2y*y-1=0, x*x-2x+2y-8=0. */
#include<stdio.h>
#include<math.h>
#include<stdlib.h>
void main()
{
int i,maxiteration;
float eps=1e-5,x0,y0; /* initial guesses for x and y */
float J,k,h,delfx,delfy,delgx,delgy,f0,g0;
float f(float x, float y);
float fx(float x, float y);
float fy(float x, float y);
float g(float x, float y);
float gx(float x, float y);
float gy(float x, float y);
printf("Enter initial guess for x and y ");
scanf("%f %f",&x0,&y0);
printf("Maximum iterations to be allowed ");
scanf("%d",&maxiteration);

for(i=1;i<=maxiteration;i++)
{
f0=f(x0,y0);
g0=g(x0,y0);
if(fabs(f0)<eps && fabs(g0)<eps){

printf("One root is (%7.5f, %7.5f) obtained at %d iterations",


x0,y0,i);
exit(0);
}
delfx=fx(x0,y0); delfy=fy(x0,y0);
delgx=gx(x0,y0); delgy=gy(x0,y0);
J=delfx*delgy-delgx*delfy;
h=(-f0*delgy+g0*delfy)/J;
k=(-g0*delfx+f0*delgx)/J;
x0+=h; y0+=k;
}
printf("Iteration does not converge in %d iterations",
maxiteration);
} /* main */

/* definition of f(x,y) and its partial derivative w.r.t x and y


fx(x,y) and fy(x,y) */
float f(float x, float y)
{
return (3*x*x-2*y*y-1);
}
float fx(float x, float y)
{
return (6*x);
}
float fy(float x, float y)
{
return (-4*y);
}
/*definition of g(x,y) and its partial derivative w.r.t x and y
gx(x,y) and gy(x,y) */
float g(float x, float y)
{
return (x*x-2*x+2*y-8);
}
float gx(float x, float y)
{
return (2*x-2);
}

float gy(float x, float y)


{
return (2);
}

A sample of input/output:

Enter initial guess for x and y 2.5 3


Maximum iterations to be allowed 50
One root is (2.64005, 3.15512) obtained at 4 iterations

4.18 Exercise

1. Find the number of real roots of the equation (i) $2x - 3 = 0$,
(ii) $x^{10} - 4x^4 - 100x + 200 = 0$.

2. Describe the methods to locate the roots of the equation f (x) = 0.

3. Obtain a root for each of the following equations using bisection method, regula-falsi method, iteration method, Newton-Raphson method and secant method:
(i) $x^3 + 2x^2 - x + 7 = 0$, (ii) $x^3 - 4x - 9 = 0$, (iii) $\cos x = 3x - 1$,
(iv) $e^{-x} = 10x$, (v) $\sin x = 10(x - 1)$, (vi) $\sin^2 x = x^2 - 1$,
(vii) $\tan x - \tanh x = 0$, (viii) $x^3 + 0.5x^2 - 7.5x - 9.0 = 0$, (ix) $\tan x + x = 0$,
(x) $x^3 - 5.2x^2 - 17.4x + 21.6 = 0$, (xi) $x^7 + 28x^4 - 480 = 0$,
(xii) $(x - 1)(x - 2)(x - 3) = 0$, (xiii) $x - \cos x = 0$, (xiv) $x + \log x = 2$,
(xv) $\sin x = \frac{1}{2}x$, (xvi) $x^3 - \sin x + 1 = 0$, (xvii) $2x = \cos x + 3$,
(xviii) $x\log_{10} x = 4.77$, (xix) $x^2 - \sin \pi x = 0$, (xx) $2x - 2x^2 - 1 = 0$,
(xxi) $2\log x - \frac{x}{2} + 1 = 0$, (xxii) $\sqrt{1 + x} = 1/x$, (xxiii) $\log x + (x + 1)^3 = 0$.

4. Find a root of the equation $x = \frac{1}{2} + \sin x$ by using the fixed point iteration method
$$x_{n+1} = \frac{1}{2} + \sin x_n,\quad x_0 = 1,$$
correct to six decimal places.

5. Use Newton-Raphson method for multiple roots to find the roots of the following equations:
(i) $x^3 - 3x + 2 = 0$, (ii) $x^4 + 2x^3 - 2x - 1 = 0$.

6. Describe the bisection method to find a root of the equation f (x) = 0 when
f (a) · f (b) < 0, a, b be two specified numbers. Is this condition necessary to get a
root using this method ?

7. Describe the regula-falsi method for finding a real root of an equation. Why is this method called a bracketing method? Give a geometric interpretation of this method. What is the rate of convergence of the regula-falsi method? Compare this method with the Newton-Raphson method, and discuss the advantages and disadvantages of this method.

8. Compare bisection method and regula-falsi method. Also compare bracketing


methods and iterative methods.

9. What is the main difference between regula-falsi method and secant method ?

10. Explain how an equation f (x) = 0 can be solved by the method of iteration (fixed
point iteration) and deduce the condition of convergence. Show that the rate of
convergence of iteration method is one. Give a geometric interpretation of this
method. Discuss the advantages and disadvantages of this method.

11. Find the iteration schemes to solve the following equations using fixed point iteration method:
(i) $2x - \sin x - 1 = 0$, (ii) $x^3 - 2x - 1 = 0$, (iii) $x + \sin x = 0$.

12. Describe the Newton-Raphson method for computing a simple real root of an equation f(x) = 0. Give a geometrical interpretation of the method. What are the advantages and disadvantages of this method? Prove that the Newton-Raphson method has second order convergence.

13. Find the iteration schemes to solve the following equations using Newton-Raphson method:
(i) $2x - \cos x - 1 = 0$, (ii) $x^5 + 3x^2 - 1 = 0$, (iii) $x^2 - 2 = 0$.

14. Use Newton-Raphson method to find the value of the following terms:
(i) $1/15$, (ii) $\sqrt{11}$, (iii) $\sqrt[3]{5}$.

15. Show that an iterative method for computing $\sqrt[k]{a}$ is given by
$$x_{n+1} = \frac{1}{k}\left[(k-1)x_n + \frac{a}{x_n^{k-1}}\right]$$
and also deduce that
$$\varepsilon_{n+1} \simeq -\frac{k-1}{2\sqrt[k]{a}}\,\varepsilon_n^2,$$
where $\varepsilon_n$ is the error at the nth iteration. What is the order of convergence of this iterative method?

16. Use Newton-Raphson method and modified Newton-Raphson method to find the roots of the following equations and compare the methods:
(i) $x^3 - 3x + 2 = 0$, (ii) $x^4 + 2x^3 - 2x - 1 = 0$.

17. Solve the following equations using Newton-Raphson method:
(i) $z^3 - 4iz^2 - 3e^z = 0$, $z_0 = -0.53 - 0.36i$, (ii) $1 + z^2 + z^3 = 0$, $z_0 = 1 + i$.

18. Compare Newton-Raphson method and modified Newton-Raphson method to find a root of the equation f(x) = 0.

19. Use modified Newton-Raphson method to find a root of the equation
$$\int_0^x e^{-t^2}\,dt = \frac{1}{10}$$
correct to six decimal places.

20. Devise a scheme to find the value of $\sqrt[5]{a}$ using modified Newton-Raphson method.

21. The formula
$$x_{n+1} = x_n - \frac{1}{2}\,\frac{f\big(x_n + 2f(x_n)/f'(x_n)\big)}{f'(x_n)}$$
is used to find a multiple root of multiplicity two of the equation f(x) = 0. Show that the rate of convergence of this method is cubic.

22. The iteration scheme
$$x_{n+1} = x_n - \frac{3\log_e x_n - e^{-x_n}}{p}$$
is used to find the root of the equation $e^{-x} - 3\log_e x = 0$. Show that p = 3 gives rapid convergence.

23. Use Muller's method to find a root of the equations
(i) $x^3 - 2x - 1 = 0$, (ii) $x^4 - 6x^2 + 3x - 1 = 0$.

24. Determine the order of convergence of the iterative method
$$x_{n+1} = \frac{x_0 f(x_n) - x_n f(x_0)}{f(x_n) - f(x_0)}$$
for finding the simple root of the equation f(x) = 0.

25. Find the order of convergence of the Steffensen method
$$x_{n+1} = x_n - \frac{f(x_n)}{g(x_n)},\qquad g(x_n) = \frac{f(x_n + f(x_n)) - f(x_n)}{f(x_n)},\qquad n = 0, 1, 2, \ldots.$$
Use this method to find a root of the equation $x - 1 + e^x = 0$.

26. Show that the iteration scheme to find the value of $\sqrt{a}$ using Chebyshev third order method is given by
$$x_{n+1} = \frac{1}{2}\left(x_n + \frac{a}{x_n}\right) - \frac{1}{8x_n}\left(x_n - \frac{a}{x_n}\right)^2.$$
Use this scheme to find the value of $\sqrt{17}$.

27. An iteration scheme is given by
$$x_0 = 5,\qquad x_{n+1} = \frac{1}{16}x_n^4 - \frac{1}{2}x_n^3 + 8x_n - 12.$$
Show that it gives cubic convergence to ξ = 4.

28. Using Graeffe's root squaring method, find the roots of the following equations:
(i) $x^3 - 4x^2 + 3x + 1 = 0$, (ii) $x^3 - 3x - 5 = 0$,
(iii) $x^4 - 2x^3 + 1.99x^2 - 2x + 0.99 = 0$, (iv) $x^3 + 5x^2 - 44x - 60 = 0$.

29. Use Birge-Vieta method to find the roots of the equations to 3 decimal places.
(i) $x^4 - 8x^3 + 14.91x^2 + 9.54x - 25.92 = 0$, (ii) $x^3 - 4x + 1 = 0$,
(iii) $x^4 - 1.1x^3 - 0.2x^2 - 1.2x + 0.9 = 0$.

30. Find the quadratic factors of the following polynomial equations using Bairstow's method.
(i) $x^4 - 8x^3 + 39x^2 - 62x + 50 = 0$, (ii) $x^3 - 2x^2 + x - 2 = 0$,
(iii) $x^4 - 6x^3 + 18x^2 - 24x + 16 = 0$, (iv) $x^3 - 2x + 1 = 0$.

31. Solve the following systems of nonlinear equations using iteration method
(i) $x^2 + y = 11$, $y^2 + x = 7$, (ii) $2xy - 3 = 0$, $x^2 - y - 2 = 0$.

32. Solve the following systems of nonlinear equations using Seidal method
(i) $x^2 + 4y^2 - 4 = 0$, $x^2 - 2x - y + 1 = 0$, start with (1.5, 0.5),
(ii) $3x^2 - 2y^2 - 1 = 0$, $x^2 - 2x + y^2 + 2y - 8 = 0$, start with (−1, 1).

33. Solve the following systems of nonlinear equations using Newton-Raphson method
(i) $3x^2 - 2y^2 - 1 = 0$, $x^2 - 2x + 2y - 8 = 0$, start with initial guess (2.5, 3),
(ii) $x^2 - x + y^2 + z^2 - 5 = 0$, $x^2 + y^2 - y + z^2 - 4 = 0$, $x^2 + y^2 + z^2 + z - 6 = 0$,
start with (−0.8, 0.2, 1.8) and (1.2, 2.2, −0.2).

34. Use Newton's method to find all nine solutions to $7x^3 - 10x - y - 1 = 0$, $8y^3 - 11y + x - 1 = 0$. Use the starting points (0, 0), (1, 0), (0, 1), (−1, 0), (0, −1), (1, 1), (−1, 1), (1, −1) and (−1, −1).

35. What are the differences between direct method and iterative method ?
Chapter 5

Solution of System of Linear Equations

A system of m linear equations in n unknowns (variables) is written as
$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1\\ \cdots\cdots\cdots\cdots\qquad &\cdots\\ a_{i1}x_1 + a_{i2}x_2 + \cdots + a_{in}x_n &= b_i\\ \cdots\cdots\cdots\cdots\qquad &\cdots\\ a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m. \end{aligned} \tag{5.1}$$
The quantities $x_1, x_2, \ldots, x_n$ are the unknowns (variables) of the system and $a_{11}, a_{12}, \ldots, a_{mn}$ are the coefficients of the unknowns of the system. The numbers $b_1, b_2, \ldots, b_m$ are the constant or free terms of the system.
The above system of equations (5.1) can be written as
$$\sum_{j=1}^{n} a_{ij}x_j = b_i,\quad i = 1, 2, \ldots, m. \tag{5.2}$$
Also, the system of equations (5.1) can be written in matrix form as
$$AX = b, \tag{5.3}$$
where
$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2n}\\ \cdots & \cdots & \cdots & \cdots\\ a_{i1} & a_{i2} & \cdots & a_{in}\\ \cdots & \cdots & \cdots & \cdots\\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix},\quad b = \begin{pmatrix} b_1\\ b_2\\ \vdots\\ b_i\\ \vdots\\ b_m \end{pmatrix}\quad\text{and}\quad X = \begin{pmatrix} x_1\\ x_2\\ \vdots\\ x_i\\ \vdots\\ x_n \end{pmatrix}. \tag{5.4}$$


The system of linear equations (5.1) is consistent if it has a solution. If a system of linear equations has no solution, then it is inconsistent (or incompatible). A consistent system of linear equations may have one solution or several solutions; it is said to be determinate if there is exactly one solution and indeterminate if there is more than one solution.
Generally, the following three types of the elementary transformations to a system
of linear equations are used.
Interchange: The order of two equations can be changed.
Scaling: Multiplication of both sides of an equation of the system by any non-zero
number.
Replacement: Addition to (subtraction from) both sides of one equation of the cor-
responding sides of another equation multiplied by any number.
A system in which the constant terms b1 , b2 , . . . , bm are zero is called a homogeneous
system.
Two basic techniques are used to solve a system of linear equations:
(i) direct method, and (ii) iteration method.
Several direct methods are used to solve a system of equations, among them following
are most useful.
(i) Cramer’s rule, (ii) matrix inversion, (iii) Gauss elimination, (iv) decomposition, etc.
The most widely used iteration methods are (i) Jacobi’s iteration, (ii) Gauss-Seidal’s
iteration, etc.

Direct Methods

5.1 Cramer’s Rule

To solve a system of linear equations, a simple (but not efficient) method was discovered by Gabriel Cramer in 1750.
Let the determinant of the coefficients of the system (5.2) be $D = |a_{ij}|$; $i, j = 1, 2, \ldots, n$, i.e., $D = |A|$. In this method, it is assumed that $D \neq 0$. Cramer's rule is described in the following. From the properties of determinants,
$$x_1 D = x_1\begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2n}\\ \cdots & \cdots & \cdots & \cdots\\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{vmatrix} = \begin{vmatrix} x_1a_{11} & a_{12} & \cdots & a_{1n}\\ x_1a_{21} & a_{22} & \cdots & a_{2n}\\ \cdots & \cdots & \cdots & \cdots\\ x_1a_{n1} & a_{n2} & \cdots & a_{nn} \end{vmatrix}$$
$$= \begin{vmatrix} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n & a_{12} & \cdots & a_{1n}\\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n & a_{22} & \cdots & a_{2n}\\ \cdots & \cdots & \cdots & \cdots\\ a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n & a_{n2} & \cdots & a_{nn} \end{vmatrix}\quad [\text{using the operation } C_1' = C_1 + x_2C_2 + \cdots + x_nC_n]$$
$$= \begin{vmatrix} b_1 & a_{12} & \cdots & a_{1n}\\ b_2 & a_{22} & \cdots & a_{2n}\\ \cdots & \cdots & \cdots & \cdots\\ b_n & a_{n2} & \cdots & a_{nn} \end{vmatrix}\quad [\text{using (5.1)}] = D_{x_1}\ \text{(say)}.$$
Therefore, $x_1 = D_{x_1}/D$. Similarly, $x_2 = \dfrac{D_{x_2}}{D}, \ldots, x_n = \dfrac{D_{x_n}}{D}$.
In general, $x_i = \dfrac{D_{x_i}}{D}$, where
$$D_{x_i} = \begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1\,i-1} & b_1 & a_{1\,i+1} & \cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2\,i-1} & b_2 & a_{2\,i+1} & \cdots & a_{2n}\\ \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots\\ a_{n1} & a_{n2} & \cdots & a_{n\,i-1} & b_n & a_{n\,i+1} & \cdots & a_{nn} \end{vmatrix},\quad i = 1, 2, \ldots, n.$$
Example 5.1.1 Use Cramer’s rule to solve the following systems of equations

x1 + x2 + x3 = 2
2x1 + x2 − x3 = 5
x1 + 3x2 + 2x3 = 5.

Solution. The determinant D of the system is
$$D = \begin{vmatrix} 1 & 1 & 1\\ 2 & 1 & -1\\ 1 & 3 & 2 \end{vmatrix} = 5.$$
The determinants $D_1$, $D_2$ and $D_3$ are shown below:
$$D_1 = \begin{vmatrix} 2 & 1 & 1\\ 5 & 1 & -1\\ 5 & 3 & 2 \end{vmatrix} = 5,\quad D_2 = \begin{vmatrix} 1 & 2 & 1\\ 2 & 5 & -1\\ 1 & 5 & 2 \end{vmatrix} = 10,\quad D_3 = \begin{vmatrix} 1 & 1 & 2\\ 2 & 1 & 5\\ 1 & 3 & 5 \end{vmatrix} = -5.$$
Thus, $x_1 = \dfrac{D_1}{D} = \dfrac{5}{5} = 1$, $x_2 = \dfrac{D_2}{D} = \dfrac{10}{5} = 2$, $x_3 = \dfrac{D_3}{D} = -\dfrac{5}{5} = -1$.
Therefore the solution is x1 = 1, x2 = 2, x3 = −1.
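For small systems the rule translates directly into code. The following C program is a minimal sketch of Cramer's rule for a 3 × 3 system, written in the style of the programs of this chapter; it is an illustration only, not one of the book's numbered programs, and the helper det3() is our own. It solves the system of Example 5.1.1 by direct expansion of the four determinants.

/* Minimal sketch (illustrative only) of Cramer's rule for a 3x3
   system. det3() expands a 3x3 determinant directly. */
#include<stdio.h>
#include<stdlib.h>
float det3(float m[3][3])
{
    return m[0][0]*(m[1][1]*m[2][2]-m[1][2]*m[2][1])
          -m[0][1]*(m[1][0]*m[2][2]-m[1][2]*m[2][0])
          +m[0][2]*(m[1][0]*m[2][1]-m[1][1]*m[2][0]);
}
void main()
{
    /* the system of Example 5.1.1 */
    float a[3][3]={{1,1,1},{2,1,-1},{1,3,2}}, b[3]={2,5,5};
    float ai[3][3],D,x[3];
    int i,j,k;
    D=det3(a);
    if(D==0)
    {
        printf("D=0: Cramer's rule is not applicable");
        exit(0);
    }
    for(i=0;i<3;i++) /* x_i = D_{x_i}/D */
    {
        for(j=0;j<3;j++)
            for(k=0;k<3;k++) ai[j][k]=a[j][k];
        for(j=0;j<3;j++) ai[j][i]=b[j]; /* replace ith column by b */
        x[i]=det3(ai)/D;
    }
    printf("Solution: x1=%8.5f x2=%8.5f x3=%8.5f",x[0],x[1],x[2]);
} /* main */

For larger n the determinants should instead be evaluated by the triangularization method of the next section.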

5.1.1 Computational aspect of Cramer’s rule


It may be noted that Cramer's rule involves computing (n + 1) determinants, each of order n (for a system of n equations in n variables). Again, the Laplace expansion method (the conventional method to find the value of a determinant) requires (n! − 1) additions and n!(n − 1) multiplications to evaluate a determinant of order n. Thus, computing a determinant of order 10 needs nearly 33 million multiplications and about 3.6 million additions. So, Cramer's rule is only of theoretical interest due to this computational inefficiency. But the time complexity can be reduced by triangularizing the determinant, and that method is a polynomial time bound algorithm.

5.2 Evaluation of Determinant

Triangularization, also known as the Gauss reduction method, is the best way to evaluate a determinant. The basic principle of this method is to convert the given determinant into a lower or upper triangular form using only elementary row operations. If the determinant D is reduced to a triangular form D′, then the value of D is obtained by multiplying the diagonal elements of D′.
Let
$$D = \begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2n}\\ \cdots & \cdots & \cdots & \cdots\\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{vmatrix}$$
be a determinant of order n.
Using the elementary row operations, D can be reduced to the following form:
$$D' = \begin{vmatrix} a_{11} & a_{12} & a_{13} & \cdots & a_{1n}\\ 0 & a_{22}^{(1)} & a_{23}^{(1)} & \cdots & a_{2n}^{(1)}\\ 0 & 0 & a_{33}^{(2)} & \cdots & a_{3n}^{(2)}\\ \cdots & \cdots & \cdots & \cdots & \cdots\\ 0 & 0 & 0 & \cdots & a_{nn}^{(n-1)} \end{vmatrix},$$
where
$$a_{ij}^{(k)} = a_{ij}^{(k-1)} - \frac{a_{ik}^{(k-1)}}{a_{kk}^{(k-1)}}\,a_{kj}^{(k-1)}; \tag{5.5}$$
$i, j = k+1, \ldots, n$; $k = 1, 2, \ldots, n-1$ and $a_{ij}^{(0)} = a_{ij}$, $i, j = 1, 2, \ldots, n$.
Then the value of D is equal to $a_{11}a_{22}^{(1)}a_{33}^{(2)}\cdots a_{nn}^{(n-1)}$.
To compute the value of $a_{ij}^{(k)}$ one division is required. If $a_{kk}^{(k-1)}$ is zero then further reduction is not possible, and if $a_{kk}^{(k-1)}$ is small then the division leads to the loss of significant digits. To prevent the loss of significant digits, the pivoting techniques are used.
A pivot is the largest magnitude element in a row or a column or the principal
diagonal or the leading or trailing sub-matrix of order i (2 ≤ i ≤ n).
For example, for the matrix
$$A = \begin{pmatrix} 5 & -1 & 0 & 5\\ -10 & 8 & 3 & 10\\ 20 & 3 & -30 & 8\\ 3 & 50 & 9 & 10 \end{pmatrix}$$
20 is the pivot for the first column, −30 is the pivot for the principal diagonal, 50 is the pivot for this matrix and −10 is the pivot for the trailing sub-matrix $\begin{pmatrix} 5 & -1\\ -10 & 8 \end{pmatrix}$.
In the elementary row operations, if any one of the pivot elements is zero or very small relative to the other elements in that row, then the remaining rows are rearranged in such a way that the pivot becomes non-zero or not a very small number. This method is called pivoting. Pivoting is of two types: partial pivoting and complete pivoting. Both are discussed in the following.

Partial pivoting
In the first stage, the first pivot is determined by finding the largest element in mag-
nitude among the elements of first column and let it be ai1 . Then rows i and 1 are
interchanged. In the second stage, the second pivot is determined by finding the largest
element in magnitude among the elements of second column leaving first element and
let it be aj2 . The second and jth rows are interchanged. This process is repeated for
(n − 1) times. In general, at the kth stage, the smallest index j is chosen for which
$$|a_{jk}^{(k)}| = \max\{|a_{kk}^{(k)}|, |a_{k+1\,k}^{(k)}|, \ldots, |a_{nk}^{(k)}|\}$$
and the rows k and j are interchanged.

Complete pivoting or full pivoting


The largest element in magnitude is determined among all the elements of the determi-
nant and let it be |alm |.
Taking alm as the first pivot by interchanging first row and the lth row and of first
column and mth column. In second stage, the largest element in magnitude is deter-
mined among all elements leaving the first row and first column and taking this element
as second pivot.
In general, l and m are chosen such that
$$|a_{lm}^{(k)}| = \max\{|a_{ij}^{(k)}|;\ i, j = k, k+1, \ldots, n\}.$$
Then the rows k, l and the columns k, m are interchanged, and $a_{kk}$ becomes a pivot.
The complete pivoting is more complicated than the partial pivoting. The partial
pivoting is preferred for hand computation.
Note 5.2.1 If the coefficient matrix A is diagonally dominant, i.e.,
$$\sum_{\substack{j=1\\ j\neq i}}^{n} |a_{ij}| < |a_{ii}| \quad\text{or}\quad \sum_{\substack{j=1\\ j\neq i}}^{n} |a_{ji}| < |a_{ii}|,\quad\text{for } i = 1, 2, \ldots, n, \tag{5.6}$$
or real symmetric and positive definite, then no pivoting is necessary.

Note 5.2.2 Every diagonally dominant matrix is non-singular.
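Condition (5.6) is easy to test in a program. The following C function is a small sketch, not one of the book's numbered programs, which checks row-wise diagonal dominance of an n × n matrix stored with indices from 1, as in the other programs of this chapter.

/* Sketch: returns 1 if the matrix a satisfies condition (5.6)
   row-wise, i.e., the sum of |a[i][j]|, j != i, is less than
   |a[i][i]| for every row i; otherwise returns 0. */
#include<math.h>
int diag_dominant(float a[10][10], int n)
{
    int i,j;
    float sum;
    for(i=1;i<=n;i++)
    {
        sum=0;
        for(j=1;j<=n;j++)
            if(j!=i) sum+=fabs(a[i][j]);
        if(sum>=fabs(a[i][i])) return 0;
    }
    return 1;
}

If diag_dominant(a, n) returns 1, the elimination methods described in this chapter may be applied without pivoting.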

To illustrate the partial pivoting, let us consider the matrix
$$A = \begin{pmatrix} 1 & 7 & 3\\ 4 & 5 & 1\\ -8 & 1 & 6 \end{pmatrix}.$$
The largest element (in magnitude) in the first column is −8. Then interchanging first and third rows, i.e.,
$$A \sim \begin{pmatrix} -8 & 1 & 6\\ 4 & 5 & 1\\ 1 & 7 & 3 \end{pmatrix}.$$
The largest element in the second column leaving the first row is 7, so interchanging second and third rows. The matrix after partial pivoting is
$$A \sim \begin{pmatrix} -8 & 1 & 6\\ 1 & 7 & 3\\ 4 & 5 & 1 \end{pmatrix}.$$
 
Let us consider the matrix $B = \begin{pmatrix} 1 & 7 & 3\\ 4 & -8 & 5\\ 2 & 6 & 1 \end{pmatrix}$ to illustrate the complete pivoting. The largest element (in magnitude) is determined among all the elements of the matrix. It is −8, attained at the (2, 2) position. Therefore, first and second columns, and first and second rows are interchanged. The matrix is transferred to
$$B \sim \begin{pmatrix} -8 & 4 & 5\\ 7 & 1 & 3\\ 6 & 2 & 1 \end{pmatrix}.$$
The number −8 is the first pivot. To find the second pivot, the largest element is determined from the trailing sub-matrix $\begin{pmatrix} 1 & 3\\ 2 & 1 \end{pmatrix}$. The largest element is 3 and it is at the position (2, 3).
Interchanging second and third columns, the final matrix (after complete pivoting) is
$$\begin{pmatrix} -8 & 5 & 4\\ 7 & 3 & 1\\ 6 & 1 & 2 \end{pmatrix}.$$

Example 5.2.1 Compute the determinant of the following matrix by a triangular-


ization algorithm using (i) partial pivoting, and (ii) complete pivoting:
 
$$A = \begin{pmatrix} 2 & 0 & 4\\ 4 & 6 & 1\\ 5 & 1 & -2 \end{pmatrix}.$$

Solution. (i) The largest element in the first column is 5, which is the first pivot of A.
Interchanging first and third rows, we obtain
$$\begin{pmatrix} 5 & 1 & -2\\ 4 & 6 & 1\\ 2 & 0 & 4 \end{pmatrix}$$
and sign = −1.
Adding $-\frac{4}{5}$ times the first row to the second row and $-\frac{2}{5}$ times the first row to the third row, i.e., $R_2' = R_2 - \frac{4}{5}R_1$ and $R_3' = R_3 - \frac{2}{5}R_1$, we get
$$\begin{pmatrix} 5 & 1 & -2\\ 0 & 26/5 & 13/5\\ 0 & -2/5 & 24/5 \end{pmatrix}.$$
The second pivot is $\frac{26}{5}$, which is the largest element (in magnitude) among the elements of the second column except the first row. Since this element is already in the (2, 2) position, no interchange of rows is required.
Adding $\frac{2/5}{26/5}$ times the second row to the third row, i.e., $R_3' = R_3 + \frac{2/5}{26/5}R_2$, we obtain
$$\begin{pmatrix} 5 & 1 & -2\\ 0 & 26/5 & 13/5\\ 0 & 0 & 5 \end{pmatrix}.$$
Hence the determinant is sign × (5)(26/5)(5) = −1(5)(26/5)(5) = −130.


(ii) The largest element in A is 6. Interchanging first and second columns and setting sign = −1; and then interchanging first and second rows and setting sign = −sign = 1, we have
$$\begin{pmatrix} 6 & 4 & 1\\ 0 & 2 & 4\\ 1 & 5 & -2 \end{pmatrix}.$$
Adding $-\frac{1}{6}$ times the first row to the third row, i.e., $R_3' = R_3 - \frac{1}{6}R_1$, we obtain
$$\begin{pmatrix} 6 & 4 & 1\\ 0 & 2 & 4\\ 0 & 13/3 & -13/6 \end{pmatrix}.$$
The pivot of the trailing sub-matrix $\begin{pmatrix} 2 & 4\\ 13/3 & -13/6 \end{pmatrix}$ is 13/3. Interchanging the second and third rows, we have
$$\begin{pmatrix} 6 & 4 & 1\\ 0 & 13/3 & -13/6\\ 0 & 2 & 4 \end{pmatrix}$$
and sign = −1.
Adding $-\frac{2}{13/3}$ times the second row to the third row, i.e., $R_3' = R_3 - \frac{6}{13}R_2$, we obtain
$$\begin{pmatrix} 6 & 4 & 1\\ 0 & 13/3 & -13/6\\ 0 & 0 & 5 \end{pmatrix}.$$
Therefore, |A| = sign × (6)(13/3)(5) = −130.

The algorithms for triangularization, i.e., to find the value of a determinant using partial and complete pivoting, are presented below.

Algorithm 5.1 (Evaluation of determinant using partial pivoting). This


algorithm finds the value of a determinant of order n × n using partial pivoting.

Algorithm Det Partial Pivoting.


//The value of determinant using partial pivoting.//
Let A = [aij ] be an n × n matrix.
Step 1. Read the matrix A = [aij ], i, j = 1, 2, . . . , n.
Step 2. Set k = 1 and sign = 1// sign indicates the sign of the determinant
when interchanges two rows.//
Step 3. Find a pivot from the elements akk , ak+1k , · · · , ank in the kth
column of A, and let ajk be the pivot. That is, |ajk | =
max{|akk |, |ak+1 k |, . . . , |ank |}.
Step 4. If ajk = 0 then |A| = 0; print the value of |A| and Stop.
Step 5. If j = k then go to Step 6.
Otherwise interchange the kth and jth rows and set sign = −sign.
Step 6. Subtract $a_{jk}/a_{kk}$ times the kth row from the jth row for $j = k+1, k+2, \ldots, n$, i.e., for $j = k+1, k+2, \ldots, n$ do $R_j' = R_j - (a_{jk}/a_{kk})R_k$, where $R_j$, $R_j'$ are the old and new jth rows respectively.
// This step makes $a_{k+1\,k}, a_{k+2\,k}, \ldots, a_{nk}$ zero.//
Step 7. Increment k by 1 i.e., set k = k+1. If k < n then goto Step 3. Otherwise,
//Triangularization is complete.//
Compute |A| = sign × product of diagonal elements.
Print |A| and Stop.
end Det Partial Pivoting.

Program 5.1.
/* Program Partial Pivoting
Program to find the value of a determinant using partial pivoting */
#include<stdio.h>
#include<stdlib.h>
#include<math.h>
void main()
{
int n,k,i,j,sign=1;
float a[10][10],b[10],prod,temp;
int max1(float b[],int k, int n);
printf("\nEnter the size of the determinant ");
scanf("%d",&n);
printf("Enter the elements rowwise ");
for(i=1;i<=n;i++) for(j=1;j<=n;j++)
scanf("%f",&a[i][j]);
for(k=1;k<=n;k++)
{
for(i=k;i<=n;i++) b[i]=a[i][k];
/* copy from a[k][k] to a[n][k] into b */
j=max1(b,k,n); /* finds pivot position */
if(a[j][k]==0)
{
printf("The value of determinant is 0");
exit(0);
}
if(j!=k) /* interchange k and j rows */
{
sign=-sign;

for(i=1;i<=n;i++){
temp=a[j][i]; a[j][i]=a[k][i]; a[k][i]=temp;
}
}
for(j=k+1;j<=n;j++) /* makes a[k+1][k] to a[n][k] zero */
{
temp=a[j][k]/a[k][k];
for(i=1;i<=n;i++) a[j][i]-=temp*a[k][i];
}
} /* end of k loop */
prod=sign;
/* product of diagonal elements */
for(i=1;i<=n;i++) prod*=a[i][i];
printf("The value of the determinant is %f ",prod);
}/* main */
/* finds position of maximum element among n numbers */
int max1(float b[],int k, int n)
{
float temp; int i,j;
temp=fabs(b[k]); j=k; /* initial maximum */
for(i=k+1;i<=n;i++)
if(temp<fabs(b[i])) {temp=fabs(b[i]); j=i;}
return j;
}

A sample of input/output:

Enter the size of the determinant 3


Enter the elements rowwise
0 2 5
1 3 -8
6 5 1
The value of the determinant is -163.000000

Algorithm 5.2 (Evaluation of determinant using complete pivoting). This


algorithm finds the value of a determinant of order n × n using complete pivoting.

Algorithm Det Complete Pivoting.


Let A = [aij ] be an n × n matrix.
Step 1. Read the matrix A = [aij ], i, j = 1, 2, . . . , n.
Step 2. Set k = 1 and sign = 1.
Step 3. Find a pivot from the elements of the trailing sub-matrix
$$\begin{pmatrix} a_{kk} & a_{k\,k+1} & \cdots & a_{kn}\\ a_{k+1\,k} & a_{k+1\,k+1} & \cdots & a_{k+1\,n}\\ \cdots & \cdots & \cdots & \cdots\\ a_{nk} & a_{n\,k+1} & \cdots & a_{nn} \end{pmatrix}$$
of A, and let $a_{pq}$ be the pivot, i.e., $|a_{pq}| = \max\{|a_{ij}|;\ i, j = k, k+1, \ldots, n\}$.
Step 4. If apq = 0 then |A| = 0; print the value of |A| and Stop.
Step 5. If p = k then goto Step 6. Otherwise, interchange the kth and the pth
rows and set sign = −sign.
Step 6. If q = k then goto Step 7. Otherwise, interchange the kth and the qth
columns and set sign = −sign.
Step 7. Subtract $a_{jk}/a_{kk}$ times the kth row from the jth row for $j = k+1, k+2, \ldots, n$, i.e., for $j = k+1, k+2, \ldots, n$ do $R_j' = R_j - (a_{jk}/a_{kk})R_k$, where $R_j$, $R_j'$ are the old and new jth rows respectively.
Step 8. Increment k by 1 i.e., set k = k + 1.
If k < n then goto Step 3. Otherwise,
compute |A| = sign × product of diagonal elements.
Print |A| and Stop.
end Det Complete Pivoting.

Program 5.2.
/*Program Complete Pivoting
Program to find the value of a determinant using complete pivoting */
#include<stdio.h>
#include<stdlib.h>
#include<math.h>
void main()
{
int n,k,i,j,sign=1,p,q;
float a[10][10],prod,max,temp;
printf("\nEnter the size of the determinant ");
scanf("%d",&n);
printf("Enter the elements rowwise ");
for(i=1;i<=n;i++) for(j=1;j<=n;j++) scanf("%f",&a[i][j]);
for(k=1;k<=n;k++)
{
/* finds the position of the pivot element */
max=fabs(a[k][k]); p=k; q=k; /* set initial maximum */

for(i=k;i<=n;i++)
for(j=k;j<=n;j++)
if(max<fabs(a[i][j])) { max=fabs(a[i][j]); p=i; q=j;}
if(a[p][q]==0)
{
printf("The value of determinant is 0");
exit(0);
}
if(p!=k) /* interchange k and p rows */
{
sign=-sign;
for(i=1;i<=n;i++)
{
temp=a[p][i]; a[p][i]=a[k][i]; a[k][i]=temp;
}
}
if(q!=k) /* interchange k and q columns */
{
sign=-sign;
for(i=1;i<=n;i++)
{
temp=a[i][q]; a[i][q]=a[i][k]; a[i][k]=temp;
}
}
for(j=k+1;j<=n;j++) /* makes a[k+1][k] to a[n][k] zero */
{
temp=a[j][k]/a[k][k];
for(i=1;i<=n;i++) a[j][i]-=temp*a[k][i];
}
} /* end of k loop */
prod=sign;
for(i=1;i<=n;i++) /* product of diagonal elements*/
prod*=a[i][i];
printf("The value of the determinant is %f ",prod);
}/* main */

A sample of input/output:

Enter the size of the determinant 4


Enter the elements rowwise

-2 3 8 4
6 1 0 5
-8 3 1 2
3 8 7 10
The value of the determinant is -1273.000000

Advantages and disadvantages of partial and complete pivoting


The general disadvantage of pivoting is that the symmetry or regularity of the original matrix may be lost. Partial pivoting requires less time, in terms of interchanges and the search for the pivot, than complete pivoting. A combination of partial and complete pivoting is expected to be very effective, not only for computing a determinant but also for solving a system of linear equations. Pivoting brings in stability where a method would otherwise become unstable for a problem, and it reduces the error due to the loss of significant digits.

5.3 Inverse of a Matrix

From the theory of matrices, it is well known that every square non-singular matrix has a unique inverse. The inverse of a matrix A is defined by
$$A^{-1} = \frac{\text{adj } A}{|A|}. \tag{5.7}$$
The matrix adj A is called the adjoint of A and is defined as
$$\text{adj } A = \begin{pmatrix} A_{11} & A_{21} & \cdots & A_{n1}\\ A_{12} & A_{22} & \cdots & A_{n2}\\ \cdots & \cdots & \cdots & \cdots\\ A_{1n} & A_{2n} & \cdots & A_{nn} \end{pmatrix},$$
where $A_{ij}$ is the cofactor of $a_{ij}$ in |A|.
The main difficulty of this method is to compute adj A. From the definition of adj A, it is easy to observe that computing adj A requires the evaluation of $n^2$ determinants, each of order (n − 1). So, it is very much time consuming. Many efficient methods are available to find the inverse of a matrix; among them, the Gauss-Jordan method is the most popular. In the following, the Gauss-Jordan method is discussed to find the inverse of a square non-singular matrix.

5.3.1 Gauss-Jordan Method


In this method, the given matrix A is augmented with a unit matrix of the same size, i.e., if the order of A is n × n then the order of the augmented matrix [A | I] will be n × 2n. The augmented matrix looks like
$$[A\,|\,I] = \left(\begin{array}{cccc|cccc} a_{11} & a_{12} & \cdots & a_{1n} & 1 & 0 & \cdots & 0\\ a_{21} & a_{22} & \cdots & a_{2n} & 0 & 1 & \cdots & 0\\ \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots\\ a_{n1} & a_{n2} & \cdots & a_{nn} & 0 & 0 & \cdots & 1 \end{array}\right). \tag{5.8}$$
Then the inverse of A is computed in two stages. In the first stage, A is converted into an upper triangular form using only elementary row operations (the Gauss elimination method discussed in Section 5.5). In the second stage, the upper triangular matrix (obtained in the first stage) is reduced to an identity matrix by row operations. All these operations are applied to the augmented matrix [A | I]. After completion of these stages, the augmented matrix [A | I] is turned to [I | A⁻¹], i.e., the inverse of A is obtained from the right half of the augmented matrix.
Thus
$$[A\,|\,I] \xrightarrow{\text{Gauss-Jordan}} [I\,|\,A^{-1}].$$
At the end of the operations the matrix shown in (5.8) reduces to the following form:
$$\left(\begin{array}{cccc|cccc} 1 & 0 & \cdots & 0 & a_{11}' & a_{12}' & \cdots & a_{1n}'\\ 0 & 1 & \cdots & 0 & a_{21}' & a_{22}' & \cdots & a_{2n}'\\ \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots\\ 0 & 0 & \cdots & 1 & a_{n1}' & a_{n2}' & \cdots & a_{nn}' \end{array}\right). \tag{5.9}$$

 
Example 5.3.1 Find the inverse of the following matrix: $A = \begin{pmatrix} 2 & 4 & 5\\ 1 & -1 & 2\\ 3 & 4 & 5 \end{pmatrix}$.

Solution. The augmented matrix [A | I] can be written as
$$[A\,|\,I] = \left(\begin{array}{ccc|ccc} 2 & 4 & 5 & 1 & 0 & 0\\ 1 & -1 & 2 & 0 & 1 & 0\\ 3 & 4 & 5 & 0 & 0 & 1 \end{array}\right). \tag{5.10}$$

Stage I. (Reduction to upper triangular form):
In the first column 3 is the largest element, so the first ($R_1$) and third ($R_3$) rows are interchanged to bring the pivot element 3 to the $a_{11}$ position. Then (5.10) becomes
$$\left(\begin{array}{ccc|ccc} 3 & 4 & 5 & 0 & 0 & 1\\ 1 & -1 & 2 & 0 & 1 & 0\\ 2 & 4 & 5 & 1 & 0 & 0 \end{array}\right)$$
$$\sim \left(\begin{array}{ccc|ccc} 1 & 4/3 & 5/3 & 0 & 0 & 1/3\\ 1 & -1 & 2 & 0 & 1 & 0\\ 2 & 4 & 5 & 1 & 0 & 0 \end{array}\right)\quad R_1' = \tfrac13 R_1$$
$$\sim \left(\begin{array}{ccc|ccc} 1 & 4/3 & 5/3 & 0 & 0 & 1/3\\ 0 & -7/3 & 1/3 & 0 & 1 & -1/3\\ 0 & 4/3 & 5/3 & 1 & 0 & -2/3 \end{array}\right)\quad R_2' = R_2 - R_1;\ R_3' = R_3 - 2R_1$$
(The largest element (in magnitude) in the second column is $-\tfrac73$, which is already at the $a_{22}$ position, and so there is no need to interchange any rows.)
$$\sim \left(\begin{array}{ccc|ccc} 1 & 4/3 & 5/3 & 0 & 0 & 1/3\\ 0 & 1 & -1/7 & 0 & -3/7 & 1/7\\ 0 & 0 & 13/7 & 1 & 4/7 & -6/7 \end{array}\right)\quad R_2' = -\tfrac37 R_2;\ R_3' = R_3 - \tfrac43 R_2'$$
$$\sim \left(\begin{array}{ccc|ccc} 1 & 4/3 & 5/3 & 0 & 0 & 1/3\\ 0 & 1 & -1/7 & 0 & -3/7 & 1/7\\ 0 & 0 & 1 & 7/13 & 4/13 & -6/13 \end{array}\right)\quad R_3' = \tfrac{7}{13} R_3$$

Stage II. (Make the left half a unit matrix):
$$\sim \left(\begin{array}{ccc|ccc} 1 & 0 & 13/7 & 0 & 4/7 & 1/7\\ 0 & 1 & -1/7 & 0 & -3/7 & 1/7\\ 0 & 0 & 1 & 7/13 & 4/13 & -6/13 \end{array}\right)\quad R_1' = R_1 - \tfrac43 R_2$$
$$\sim \left(\begin{array}{ccc|ccc} 1 & 0 & 0 & -1 & 0 & 1\\ 0 & 1 & 0 & 1/13 & -5/13 & 1/13\\ 0 & 0 & 1 & 7/13 & 4/13 & -6/13 \end{array}\right)\quad R_1' = R_1 - \tfrac{13}{7}R_3;\ R_2' = R_2 + \tfrac17 R_3$$
The left half becomes a unit matrix; thus the inverse of the given matrix is
$$A^{-1} = \begin{pmatrix} -1 & 0 & 1\\ 1/13 & -5/13 & 1/13\\ 7/13 & 4/13 & -6/13 \end{pmatrix}.$$

Algorithm 5.3 (Matrix inverse). The following algorithm computes the inverse
of a non-singular square matrix of order n × n and if the matrix is singular it prints
the message ‘the matrix is singular and hence not invertible’.

Algorithm Matrix Inversion (using partial pivoting).


Let A = [aij ] be an n × n matrix.
Step 1. Read the matrix A = [aij ], i, j = 1, 2, . . . , n.
Step 2. //Augment the matrix A.//
Augment the matrix A by a unit matrix of order n × n. The resultant matrix A becomes of order n × 2n, i.e.,
$$a_{i\,n+j} = \begin{cases} 0, & \text{for } i \neq j\\ 1, & \text{for } i = j \end{cases}$$
for i, j = 1, 2, . . . , n.

Stage I. Make upper triangular form.


Step 3. Set k = 1.
Step 4. Find a pivot from the elements akk , ak+1 k , . . . , ank in the kth column of
A and let ajk be the pivot.
Step 5. If ajk = 0 then print ‘the matrix is singular and hence not invertible’
and Stop.
Step 6. If j = k then goto Step 7.
Otherwise interchange the kth and jth rows.
Step 7. If akk = 1 then divide all the elements of kth row by akk .
Subtract ajk times the kth row to the jth row for
j = k + 1, k + 2, · · · , 2n;
i.e., Rj = Rj − ajk Rk .
//This step makes ak+1 k , ak+2 k , . . . , ank zero.//
Step 8. Increase k by 1 i.e., set k = k + 1.
If k < n then goto Step 4. Otherwise goto Step 9.
//Stage I is completed.//

Stage II. //Make the left half of A a unit matrix.//


Step 9. Set k = 2.

Step 10. Subtract ajk times the kth row to the jth row for
j = k − 1, k − 2, . . . , 1.
Step 11. Increase k by 1.
If k < n then goto Step 10.
Otherwise, print the right half of A as inverse of A and Stop.
end Matrix Inversion

Program 5.3.
/* Program Matrix Inverse
Program to find the inverse of a square matrix using
partial pivoting */
#include<stdio.h>
#include<stdlib.h>
#include<math.h>
#define zero 0.00001
void main()
{
int n,m,k,i,j;
float a[10][20],temp;
printf("\nEnter the size of the matrix ");
scanf("%d",&n);
printf("Enter the elements rowwise ");
for(i=1;i<=n;i++) for(j=1;j<=n;j++) scanf("%f",&a[i][j]);
/* augment the matrix A */
for(i=1;i<=n;i++) for(j=1;j<=n;j++) a[i][n+j]=0;
for(i=1;i<=n;i++) a[i][n+i]=1;
m=2*n;
for(k=1;k<=n;k++)
{
/* finds pivot element and its position */
temp=fabs(a[k][k]); j=k; /* initial maximum */
for(i=k+1;i<=n;i++)
if(temp<fabs(a[i][k])){
temp=fabs(a[i][k]); j=i;
}
if(fabs(a[j][k])<=zero) /* if a[j][k]=0 */
{
printf("The matrix is singular and is not invertible");
exit(0);
}
if(j!=k) /* interchange k and j rows */
{
for(i=1;i<=m;i++){
temp=a[j][i]; a[j][i]=a[k][i]; a[k][i]=temp;
}
}

if(a[k][k]!=1)
{
temp=a[k][k];
for(i=1;i<=m;i++) a[k][i]/=temp;
}
for(j=k+1;j<=n;j++) /* makes a[k+1][k] to a[n][k] zero */
{
temp=a[j][k];
for(i=1;i<=m;i++) a[j][i]-=temp*a[k][i];
}
} /* end of k loop */

/* make left half of A to a unit matrix */


for(k=2;k<=n;k++)
{
for(j=k-1;j>=1;j--)
{
temp=a[j][k];
for(i=1;i<=m;i++) a[j][i]-=temp*a[k][i];
}
}
printf("\nThe inverse matrix is \n");
for(i=1;i<=n;i++)
{
for(j=n+1;j<=m;j++)
printf("%f ",a[i][j]); printf("\n");
}
}/* main */

A sample of input/output:

Enter the size of the matrix 3


Enter the elements rowwise
0 1 2
3 -2 1
4 3 2
The inverse matrix is
-0.218750 0.125000 0.156250
-0.062500 -0.250000 0.187500
0.531250 0.125000 -0.093750

Complexity of the algorithm


Step 4 determines the maximum among $n-k+1$ elements and takes $n-k$ comparisons. Step 6 takes O(n) operations if the kth and jth rows need to be interchanged. For a fixed j, Step 7 needs $n-k$ additions and $n-k$ multiplications. Since j runs from $k+1$ to n, the total time taken by Step 7 is $(n-k)^2$. Steps 4 to 7 are repeated n times ($k = 1, 2, \ldots, n$); therefore, Stage I takes
$$\sum_{k=1}^{n}\big[(n-k)^2 + O(n) + (n-k+1)\big] = O(n^3)$$
operations.
Similarly, Stage II takes $O(n^3)$ time. Hence the time complexity to compute the inverse of a non-singular matrix is $O(n^3)$.
Since Stage I is similar to the algorithm Det Partial Pivoting, the time complexity to compute the determinant is also $O(n^3)$.

5.4 Matrix Inverse Method

The system of equations (5.1) can be written in the matrix form (5.3) as

Ax = b

where A, b and x are defined in (5.4).


The solution of Ax = b is given by
$$x = A^{-1}b, \tag{5.11}$$

where A−1 is the inverse of the matrix A.


Once the inverse of A is known then post multiplication of it with b gives the solution
vector x.
Example 5.4.1 Solve the following system of equations by matrix inverse method
x + 2y + 3z = 10, x + 3y − 2z = 7, 2x − y + z = 5.

Solution. The given system of equations is Ax = b, where
$$A = \begin{pmatrix} 1 & 2 & 3\\ 1 & 3 & -2\\ 2 & -1 & 1 \end{pmatrix},\quad x = \begin{pmatrix} x\\ y\\ z \end{pmatrix},\quad b = \begin{pmatrix} 10\\ 7\\ 5 \end{pmatrix}.$$
Now, $|A| = \begin{vmatrix} 1 & 2 & 3\\ 1 & 3 & -2\\ 2 & -1 & 1 \end{vmatrix} = -30 \neq 0$.
That is, A is non-singular and hence $A^{-1}$ exists. Here,
$$\text{adj } A = \begin{pmatrix} 1 & -5 & -13\\ -5 & -5 & 5\\ -7 & 5 & 1 \end{pmatrix}.$$
Thus,
$$A^{-1} = \frac{\text{adj } A}{|A|} = -\frac{1}{30}\begin{pmatrix} 1 & -5 & -13\\ -5 & -5 & 5\\ -7 & 5 & 1 \end{pmatrix}.$$
Therefore,
$$x = A^{-1}b = -\frac{1}{30}\begin{pmatrix} 1 & -5 & -13\\ -5 & -5 & 5\\ -7 & 5 & 1 \end{pmatrix}\begin{pmatrix} 10\\ 7\\ 5 \end{pmatrix} = -\frac{1}{30}\begin{pmatrix} -90\\ -60\\ -30 \end{pmatrix} = \begin{pmatrix} 3\\ 2\\ 1 \end{pmatrix}.$$
Hence the required solution is x = 3, y = 2, z = 1.

Algorithm 5.4 (Matrix inverse method). This algorithm is used to solve a


system of linear equations Ax = b, where A = [aij ]n×n , x = (x1 , x2 , . . . , xn )t ,
b = (b1 , b2 , . . . , bn )t , by matrix inverse method.

Algorithm Matrix Inverse Method


Step 1. Read the coefficient matrix A = [aij ], i, j = 1, 2, . . . , n and the right
hand vector b = (b1 , b2 , . . . , bn )t .
Step 2. Compute the inverse of A by the algorithm Matrix Inverse.
Step 3. If A is invertible then compute x = A−1 b and print x =
(x1 , x2 , . . . , xn )t .
Otherwise, print ‘A is singular and the system has either no solution or
has infinitely many solutions’.
end Matrix Inverse Method
Program 5.4.
/* Program Matrix Inverse Method
Program to find the solution of a system of linear
equation by matrix inverse method. Partial pivoting
is used to find matrix inverse. */
#include<stdio.h>
#include<stdlib.h>
#include<math.h>
#define zero 0.00001
float a[10][20],ai[10][10];
int n;
int matinv();
void main()
{
int i,j;
float b[10],x[10];

printf("\nEnter the size of the coefficient matrix ");


scanf("%d",&n);
printf("Enter the elements rowwise ");
for(i=1;i<=n;i++) for(j=1;j<=n;j++) scanf("%f",&a[i][j]);
printf("Enter the right hand vector ");
for(i=1;i<=n;i++) scanf("%f",&b[i]);
i=matinv();
if(i==0)
{
printf("Coefficient matrix is singular: ");
printf("The system has either no solution or many solutions");
exit(0);
}
for(i=1;i<=n;i++)
{
x[i]=0;
for(j=1;j<=n;j++) x[i]+=ai[i][j]*b[j];
}
printf("Solution of the system is\n ");
for(i=1;i<=n;i++) printf("%8.5f ",x[i]);
} /* main */
/* function to find matrix inverse */
int matinv()
{
int i,j,m,k; float temp;
/* augment the matrix A */
for(i=1;i<=n;i++) for(j=1;j<=n;j++) a[i][n+j]=0;
for(i=1;i<=n;i++) a[i][n+i]=1;
m=2*n;

for(k=1;k<=n;k++)
{
/* finds pivot element and its position */
temp=fabs(a[k][k]); j=k; /* initial maximum */
for(i=k+1;i<=n;i++)
if(temp<fabs(a[i][k]))
{
temp=fabs(a[i][k]); j=i;
}

if(fabs(a[j][k])<=zero) /* if a[j][k]=0 */
{
printf("The matrix is singular and is not invertible");
return(0);
}
if(j!=k) /* interchange k and j rows */
{
for(i=1;i<=m;i++)
{
temp=a[j][i]; a[j][i]=a[k][i]; a[k][i]=temp;
}
}
if(a[k][k]!=1)
{
temp=a[k][k];
for(i=1;i<=m;i++) a[k][i]/=temp;
}
for(j=k+1;j<=n;j++) /* makes a[k+1][k] to a[n][k] zero */
{
temp=a[j][k];
for(i=1;i<=m;i++) a[j][i]-=temp*a[k][i];
}
} /* end of k loop */
/* make left half of A to a unit matrix */
for(k=2;k<=n;k++)
for(j=k-1;j>=1;j--){
temp=a[j][k];
for(i=1;i<=m;i++) a[j][i]-=temp*a[k][i];
}
printf("\nThe inverse matrix is \n");
for(i=1;i<=n;i++)
for(j=n+1;j<=m;j++) ai[i][j-n]=a[i][j];
return(1);
}/* matinv */

A sample of input/output:

Enter the size of the matrix 3


Enter the elements rowwise

0 1 2
3 -2 1
4 3 2
The inverse matrix is
-0.218750 0.125000 0.156250
-0.062500 -0.250000 0.187500
0.531250 0.125000 -0.093750

Complexity

The time complexity of this method is O(n3 ) as the method involves computation of
A−1 .

5.5 Gauss Elimination Method

In this method, the variables are eliminated by a process of systematic elimination.


Suppose the system has n variables and n equations of the form (5.1). This procedure
reduces the system of linear equations to an equivalent upper triangular system which
can be solved by back–substitution. To convert an upper triangular system, x1 is
eliminated from second equation to nth equation, x2 is eliminated from third equation
to nth equation, x3 is eliminated from fourth equation to nth equation, and so on and
finally, xn−1 is eliminated from nth equation.
To eliminate $x_1$ from the second, third, $\ldots$, and nth equations, the first equation is multiplied by $-\dfrac{a_{21}}{a_{11}}, -\dfrac{a_{31}}{a_{11}}, \ldots, -\dfrac{a_{n1}}{a_{11}}$ respectively and successively added to the second, third, $\ldots$, nth equations (assuming that $a_{11} \neq 0$). This gives

$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + \cdots + a_{1n}x_n &= b_1\\ a_{22}^{(1)}x_2 + a_{23}^{(1)}x_3 + \cdots + a_{2n}^{(1)}x_n &= b_2^{(1)}\\ a_{32}^{(1)}x_2 + a_{33}^{(1)}x_3 + \cdots + a_{3n}^{(1)}x_n &= b_3^{(1)}\\ \cdots\cdots\cdots\qquad &\cdots\\ a_{n2}^{(1)}x_2 + a_{n3}^{(1)}x_3 + \cdots + a_{nn}^{(1)}x_n &= b_n^{(1)}, \end{aligned} \tag{5.12}$$
where
$$a_{ij}^{(1)} = a_{ij} - \frac{a_{i1}}{a_{11}}a_{1j};\quad i, j = 2, 3, \ldots, n.$$
Again, to eliminate $x_2$ from the third, fourth, $\ldots$, and nth equations, the second equation is multiplied by $-\dfrac{a_{32}^{(1)}}{a_{22}^{(1)}}, -\dfrac{a_{42}^{(1)}}{a_{22}^{(1)}}, \ldots, -\dfrac{a_{n2}^{(1)}}{a_{22}^{(1)}}$ respectively (assuming that $a_{22}^{(1)} \neq 0$), and successively added to the third, fourth, $\ldots$, and nth equations to get the new system of equations
$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + \cdots + a_{1n}x_n &= b_1\\ a_{22}^{(1)}x_2 + a_{23}^{(1)}x_3 + \cdots + a_{2n}^{(1)}x_n &= b_2^{(1)}\\ a_{33}^{(2)}x_3 + \cdots + a_{3n}^{(2)}x_n &= b_3^{(2)}\\ \cdots\cdots\qquad &\cdots\\ a_{n3}^{(2)}x_3 + \cdots + a_{nn}^{(2)}x_n &= b_n^{(2)}, \end{aligned} \tag{5.13}$$
where
$$a_{ij}^{(2)} = a_{ij}^{(1)} - \frac{a_{i2}^{(1)}}{a_{22}^{(1)}}a_{2j}^{(1)};\quad i, j = 3, 4, \ldots, n.$$
Finally, after eliminating $x_{n-1}$, the above system of equations becomes
$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + \cdots + a_{1n}x_n &= b_1\\ a_{22}^{(1)}x_2 + a_{23}^{(1)}x_3 + \cdots + a_{2n}^{(1)}x_n &= b_2^{(1)}\\ a_{33}^{(2)}x_3 + \cdots + a_{3n}^{(2)}x_n &= b_3^{(2)}\\ \cdots\qquad &\cdots\\ a_{nn}^{(n-1)}x_n &= b_n^{(n-1)}, \end{aligned} \tag{5.14}$$
where
$$a_{ij}^{(k)} = a_{ij}^{(k-1)} - \frac{a_{ik}^{(k-1)}}{a_{kk}^{(k-1)}}a_{kj}^{(k-1)};$$
$i, j = k+1, \ldots, n$; $k = 1, 2, \ldots, n-1$, and $a_{pq}^{(0)} = a_{pq}$; $p, q = 1, 2, \ldots, n$.
Now, by back substitution, the values of the variables can be found as follows. From the last equation we have $x_n = \dfrac{b_n^{(n-1)}}{a_{nn}^{(n-1)}}$; from the last but one equation, i.e., the (n−1)th equation, one can find the value of $x_{n-1}$, and so on. Finally, from the first equation we obtain the value of $x_1$.
The evaluation of the elements $a_{ij}^{(k)}$ is a forward substitution and the determination of the values of the variables $x_i$ is a back substitution, since we first determine the value of the last variable $x_n$.

Note 5.5.1 The method described above assumes that the diagonal elements are non-
zero. If they are zero or nearly zero then the above simple method is not applicable to
solve a linear system though it may have a solution. If any diagonal element is zero or
very small then partial pivoting should be used to get a solution or a better solution.

It is mentioned earlier that if the system is diagonally dominant or real symmetric


and positive definite then no pivoting is necessary.
Example 5.5.1 Solve the equations by Gauss elimination method.
2x1 + x2 + x3 = 4, x1 − x2 + 2x3 = 2, 2x1 + 2x2 − x3 = 3.

Solution. Multiplying the second and third equations by 2 and 1 respectively and
subtracting them from first equation we get

2x1 + x2 + x3 = 4
3x2 − 3x3 = 0
−x2 + 2x3 = 1.

Multiplying third equation by –3 and subtracting from second equation we obtain

2x1 + x2 + x3 = 4
3x2 − 3x3 = 0
3x3 = 3.

From the third equation x3 = 1, from the second equations x2 = x3 = 1 and from
the first equation 2x1 = 4 − x2 − x3 = 2 or, x1 = 1.
Therefore the solution is x1 = 1, x2 = 1, x3 = 1.

Example 5.5.2 Solve the following system of equations by Gauss elimination


method (use partial pivoting).
x2 + 2x3 = 5
x1 + 2x2 + 4x3 = 11
−3x1 + x2 − 5x3 = −12.

Solution. The largest element (the pivot) in the coefficients of the variable x1 is −3,
attained at the third equation. So we interchange first and third equations

−3x1 + x2 − 5x3 = −12


x1 + 2x2 + 4x3 = 11
x2 + 2x3 = 5.

Multiplying the second equation by 3 and adding with the first equation we get,

−3x1 + x2 − 5x3 = −12


x2 + x3 = 3
x2 + 2x3 = 5

The second pivot is 1, which is at the positions a22 and a32 . Taking a22 = 1 as
pivot to avoid interchange of rows. Now, subtracting the third equation from second
equation, we obtain

−3x1 + x2 − 5x3 = −12


x2 + x3 = 3
−x3 = −2.

Now by back substitution, the values of x3 , x2 , x1 are obtained as


1
x3 = 2, x2 = 3 − x3 = 1, x1 = − (−12 − x2 + 5x3 ) = 1.
3
Hence the solution is x1 = 1, x2 = 1, x3 = 2.

Algorithm 5.5 (Gauss elimination). This algorithm solves a system of equations


Ax = b, where A =[aij ]n×n , x = (x1 , x2 , . . . , xn )t , b = (b1 , b2 , . . . , bn )t , by Gauss
elimination method.

Algorithm Gauss Elimination (using partial pivoting)


Step 1. Read the matrix A = [aij ], i, j = 1, 2, . . . , n.
Step 2. Set ain+1 = bi for i = 1, 2, . . . , n.
Then the augmented matrix A becomes of order n × (n + 1).
Step 3. Set k = 1.
Step 4. Find the pivot from the elements akk , ak+1k , . . . , ank . Let ajk be the
pivot.
Step 5. If ajk = 0 then // A is singular.//
Print ‘the system has either no solution or infinite many solutions’, and
Stop.
Step 6. If j = k then interchange the rows j and k.
Step 7. Subtract ajk /akk times the kth row to the jth row for j = k + 1, k +
2, . . . , n.
Step 8. Increase k by 1.
If k = n then //forward substitution is over.//
go to Step 9.
Otherwise go to Step 4.
Step 9. //Back substitution.//
$x_n = a_{n\,n+1}/a_{nn}$.
Compute the remaining $x_i$ using the expression
$$x_i = \frac{1}{a_{ii}}\left(a_{i\,n+1} - \sum_{j=i+1}^{n} a_{ij}x_j\right),\quad\text{for } i = n-1, n-2, \ldots, 1.$$
Step 10. Print x =(x1 , x2 , . . . , xn )t as solution.
end Gauss Elimination

Program 5.5 .
/* Program Gauss-elimination
Program to find the solution of a system of linear equations by
Gauss elimination method. Partial pivoting is used. */
#include<stdio.h>
#include<stdlib.h>
#include<math.h>
#define zero 0.00001
void main()
{
int i,j,k,n,m;
float a[10][10],b[10],x[10],temp;
printf("\nEnter the size of the coefficient matrix ");
scanf("%d",&n);
printf("Enter the elements rowwise ");
for(i=1;i<=n;i++) for(j=1;j<=n;j++) scanf("%f",&a[i][j]);
printf("Enter the right hand vector ");
for(i=1;i<=n;i++) scanf("%f",&b[i]);
/* augment A with b[i], i.e., a[i][n+1]=b[i] */
m=n+1;
for(i=1;i<=n;i++) a[i][m]=b[i];
for(k=1;k<=n;k++)
{
/* finds pivot element and its position */
temp=fabs(a[k][k]); j=k; /* initial maximum */
for(i=k+1;i<=n;i++)
if(temp<fabs(a[i][k]))
{
temp=fabs(a[i][k]); j=i;
}
if(fabs(a[j][k])<=zero) /* if a[j][k]=0 */
{
printf("The matrix is singular:");
printf("The system has either no solution or many solutions");
exit(0);
}
if(j!=k) /* interchange k and j rows */
{
for(i=1;i<=m;i++)

{
temp=a[j][i]; a[j][i]=a[k][i]; a[k][i]=temp;
}
}
for(j=k+1;j<=n;j++) /* makes a[k+1][k] to a[n][k] zero */
{
temp=a[j][k]/a[k][k];
for(i=1;i<=m;i++) a[j][i]-=temp*a[k][i];
}
} /* end of k loop */
/* forward substitution is over */
/* backward substitution */
x[n]=a[n][m]/a[n][n];
for(i=n-1;i>=1;i--)
{
x[i]=a[i][m];
for(j=i+1;j<=n;j++) x[i]-=a[i][j]*x[j];
x[i]/=a[i][i];
}
printf("Solution of the system is\n ");
for(i=1;i<=n;i++) printf("%8.5f ",x[i]);
} /* main */

A sample of input/output:

Enter the size of the coefficient matrix 3


Enter the elements rowwise
1 1 1
2 -1 3
3 1 -1
Enter the right hand vector
3 16 -3
Solution of the system is
1.00000 -2.00000 4.00000

5.6 Gauss-Jordan Elimination Method

In Gauss elimination method, the coefficient matrix is reduced to an upper triangular


form and the solution is obtained by back substitution. But, the Gauss-Jordan method
reduces the coefficient matrix to a diagonal matrix rather than upper triangular matrix

and produces the solution of the system without using back substitution. At the end of the Gauss-Jordan method, the system of equations (5.2) reduces to the following form:
$$\begin{pmatrix} 1 & 0 & \cdots & 0\\ 0 & 1 & \cdots & 0\\ \cdots & \cdots & \cdots & \cdots\\ 0 & 0 & \cdots & 1 \end{pmatrix}\begin{pmatrix} x_1\\ x_2\\ \vdots\\ x_n \end{pmatrix} = \begin{pmatrix} b_1'\\ b_2'\\ \vdots\\ b_n' \end{pmatrix}. \tag{5.15}$$
The solution of the system is given by
$$x_1 = b_1',\ x_2 = b_2',\ \ldots,\ x_n = b_n'.$$
Thus the Gauss-Jordan method gives
$$[A\,|\,b] \xrightarrow{\text{Gauss-Jordan}} [I\,|\,b']. \tag{5.16}$$

Generally, the Gauss-Jordan method is not used to solve a system of equations,


because it is more costly than the Gauss elimination method. But, this method is used
to find the matrix inverse (discussed in Section 5.3).
Example 5.6.1 Solve the following equations by Gauss-Jordan elimination method.

x1 + x2 + x3 = 3
2x1 + 3x2 + x3 = 6
x1 − x2 − x3 = −3.
     
Solution. Here $A = \begin{pmatrix} 1 & 1 & 1\\ 2 & 3 & 1\\ 1 & -1 & -1 \end{pmatrix}$, $x = \begin{pmatrix} x_1\\ x_2\\ x_3 \end{pmatrix}$ and $b = \begin{pmatrix} 3\\ 6\\ -3 \end{pmatrix}$.
The augmented matrix [A | b] is
$$[A\,|\,b] = \left(\begin{array}{ccc|c} 1 & 1 & 1 & 3\\ 2 & 3 & 1 & 6\\ 1 & -1 & -1 & -3 \end{array}\right)$$
$$\sim \left(\begin{array}{ccc|c} 1 & 1 & 1 & 3\\ 0 & 1 & -1 & 0\\ 0 & -2 & -2 & -6 \end{array}\right)\quad R_2' = R_2 - 2R_1,\ R_3' = R_3 - R_1$$
$$\sim \left(\begin{array}{ccc|c} 1 & 1 & 1 & 3\\ 0 & 1 & -1 & 0\\ 0 & 0 & -4 & -6 \end{array}\right)\quad R_3' = R_3 + 2R_2$$
$$\sim \left(\begin{array}{ccc|c} 1 & 1 & 1 & 3\\ 0 & 1 & -1 & 0\\ 0 & 0 & 1 & 3/2 \end{array}\right)\quad R_3' = -\tfrac14 R_3$$
$$\sim \left(\begin{array}{ccc|c} 1 & 0 & 2 & 3\\ 0 & 1 & -1 & 0\\ 0 & 0 & 1 & 3/2 \end{array}\right)\quad R_1' = R_1 - R_2$$
$$\sim \left(\begin{array}{ccc|c} 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 3/2\\ 0 & 0 & 1 & 3/2 \end{array}\right)\quad R_1' = R_1 - 2R_3,\ R_2' = R_2 + R_3$$

The equivalent system of equations is
$$x_1 = 0,\quad x_2 = 3/2,\quad x_3 = 3/2.$$
Hence the required solution is x1 = 0, x2 = 3/2, x3 = 3/2.
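Although no separate numbered program is given for this method, the reduction used in Example 5.6.1 may be sketched in C as follows. This is an illustrative sketch only, not a listing from the text; the array bounds, the zero tolerance and the input/output style follow Program 5.5, and partial pivoting is included.

/* Sketch of the Gauss-Jordan elimination method: the augmented
   matrix [A|b] is reduced to [I|x], so no back substitution is
   needed. Partial pivoting is used as in Program 5.5. */
#include<stdio.h>
#include<stdlib.h>
#include<math.h>
#define zero 0.00001
void main()
{
    int i,j,k,n;
    float a[10][11],temp;
    printf("\nEnter the size of the coefficient matrix ");
    scanf("%d",&n);
    printf("Enter the augmented matrix [A|b] rowwise ");
    for(i=1;i<=n;i++) for(j=1;j<=n+1;j++) scanf("%f",&a[i][j]);
    for(k=1;k<=n;k++)
    {
        /* partial pivoting: bring max |a[i][k]|, i>=k, to row k */
        j=k;
        for(i=k+1;i<=n;i++)
            if(fabs(a[i][k])>fabs(a[j][k])) j=i;
        if(fabs(a[j][k])<=zero)
        {
            printf("The matrix is singular");
            exit(0);
        }
        if(j!=k) /* interchange k and j rows */
            for(i=1;i<=n+1;i++)
            {
                temp=a[j][i]; a[j][i]=a[k][i]; a[k][i]=temp;
            }
        temp=a[k][k]; /* make the pivot 1 */
        for(i=1;i<=n+1;i++) a[k][i]/=temp;
        for(j=1;j<=n;j++) /* make other entries of column k zero */
            if(j!=k)
            {
                temp=a[j][k];
                for(i=1;i<=n+1;i++) a[j][i]-=temp*a[k][i];
            }
    }
    printf("Solution of the system is\n");
    for(i=1;i<=n;i++) printf("%8.5f ",a[i][n+1]);
} /* main */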

5.7 Method of Matrix Factorization

5.7.1 LU Decomposition Method


This method is also known as factorization or LU decomposition method or Crout’s
reduction method.
Let the system of linear equations be
Ax = b (5.17)
where A, x, b are given by (5.4).
The matrix A can be factorized into the form A = LU, where L and U are the
lower and upper triangular matrices respectively. If the principal minors of A are non-
singular, i.e.,
$$a_{11} \neq 0,\quad \begin{vmatrix} a_{11} & a_{12}\\ a_{21} & a_{22} \end{vmatrix} \neq 0,\quad \begin{vmatrix} a_{11} & a_{12} & a_{13}\\ a_{21} & a_{22} & a_{23}\\ a_{31} & a_{32} & a_{33} \end{vmatrix} \neq 0,\ \cdots,\ |A| \neq 0, \tag{5.18}$$
then this factorization is possible and it is unique.


The matrices L and U are of the form
   
l11 0 0 · · · 0 u11 u12 u13 · · · u1n
 l21 l22 0 · · · 0   0 u22 u23 · · · u2n 
   
   · · · u3n 
L =  l31 l32 l33 · · · 0  and U =  0 0 u33 . (5.19)
 .. .. .. .. ..   .. .. .. .. .. 
 . . . . .   . . . . . 
ln1 ln2 ln3 · · · lnn 0 0 0 · · · unn
The equation Ax = b becomes LUx = b. Let Ux = z then Lz = b, where
z = (z1 , z2 , . . . , zn )t is an intermediate variable vector. The value of z i.e., z1 , z2 , . . . , zn
can be determined by forward substitution in the following equations.

l11 z1 = b1
l21 z1 + l22 z2 = b2
l31 z1 + l32 z2 + l33 z3 = b3 (5.20)
···································· ··· ···
ln1 z1 + ln2 z2 + ln3 z3 + · · · + lnn zn = bn .
After determination of z, one can compute the value of x i.e., x1 , x2 , . . . , xn from the
equation Ux = z i.e., from the following equations by the backward substitution.
$$\begin{aligned} u_{11}x_1 + u_{12}x_2 + u_{13}x_3 + \cdots + u_{1n}x_n &= z_1\\ u_{22}x_2 + u_{23}x_3 + \cdots + u_{2n}x_n &= z_2\\ u_{33}x_3 + \cdots + u_{3n}x_n &= z_3\\ \cdots\qquad &\cdots\\ u_{n-1\,n-1}x_{n-1} + u_{n-1\,n}x_n &= z_{n-1}\\ u_{nn}x_n &= z_n. \end{aligned} \tag{5.21}$$
When uii = 1, for i = 1, 2, . . . , n, then the method is known as Crout’s decom-
position method. When lii = 1, for i = 1, 2, . . . , n then the method is known as
Doolittle’s method for decomposition. In particular, when lii = uii for i = 1, 2, . . . , n
then the corresponding method is called Cholesky’s decomposition method.

Procedure to compute L and U


Here, we assume that $u_{ii} = 1$ for $i = 1, 2, \ldots, n$. From the relation LU = A, i.e., from
$$\begin{pmatrix} l_{11} & l_{11}u_{12} & l_{11}u_{13} & \cdots & l_{11}u_{1n}\\ l_{21} & l_{21}u_{12}+l_{22} & l_{21}u_{13}+l_{22}u_{23} & \cdots & l_{21}u_{1n}+l_{22}u_{2n}\\ l_{31} & l_{31}u_{12}+l_{32} & l_{31}u_{13}+l_{32}u_{23}+l_{33} & \cdots & l_{31}u_{1n}+l_{32}u_{2n}+l_{33}u_{3n}\\ \vdots & \vdots & \vdots & \ddots & \vdots\\ l_{n1} & l_{n1}u_{12}+l_{n2} & l_{n1}u_{13}+l_{n2}u_{23}+l_{n3} & \cdots & l_{n1}u_{1n}+l_{n2}u_{2n}+\cdots+l_{nn} \end{pmatrix} = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2n}\\ \cdots & \cdots & \cdots & \cdots\\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix}$$
we obtain
$$l_{i1} = a_{i1},\ i = 1, 2, \ldots, n\quad\text{and}\quad u_{1j} = \frac{a_{1j}}{l_{11}},\ j = 2, 3, \ldots, n.$$
The second column of L and the second row of U are determined from the relations
$$l_{i2} = a_{i2} - l_{i1}u_{12},\ \text{for } i = 2, 3, \ldots, n;\qquad u_{2j} = \frac{a_{2j} - l_{21}u_{1j}}{l_{22}},\ \text{for } j = 3, 4, \ldots, n.$$
Next, third column of L and third row of U are determined in a similar way.
In general, $l_{ij}$ and $u_{ij}$ are given by
$$l_{ij} = a_{ij} - \sum_{k=1}^{j-1} l_{ik}u_{kj},\quad i \geq j, \tag{5.22}$$
$$u_{ij} = \frac{a_{ij} - \sum_{k=1}^{i-1} l_{ik}u_{kj}}{l_{ii}},\quad i < j, \tag{5.23}$$
$u_{ii} = 1$; $l_{ij} = 0$ for $j > i$; and $u_{ij} = 0$ for $i > j$.

Alternatively, the vectors z and x can be determined from the equations

$$z = L^{-1}b \tag{5.24}$$
and
$$x = U^{-1}z. \tag{5.25}$$

It may be noted that the computation of inverse of a triangular matrix is easier than
an arbitrary matrix.
The inverse of A can also be determined from the relation

A−1 = U−1 L−1 . (5.26)
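Since the inverse of a lower triangular matrix is again lower triangular, $L^{-1}$ can be built column by column by forward substitution: the jth column of $L^{-1}$ is the solution of $Ly = e_j$, where $e_j$ is the jth unit vector. The following C function is a sketch based on this observation (an assumed helper, not a numbered program of the book); $U^{-1}$ can be obtained similarly by backward substitution.

/* Sketch: inverse of a non-singular lower triangular matrix l
   (indexed from 1) by forward substitution; the jth column of
   linv is the solution of Ly = e_j. Returns 0 if l is singular. */
int lower_inverse(float l[10][10], float linv[10][10], int n)
{
    int i,j,k;
    float sum;
    for(j=1;j<=n;j++)
        for(i=1;i<=n;i++)
        {
            if(l[i][i]==0) return 0; /* singular */
            sum=(i==j)?1:0;          /* ith component of e_j */
            for(k=1;k<i;k++) sum-=l[i][k]*linv[k][j];
            linv[i][j]=sum/l[i][i];  /* comes out zero when i<j */
        }
    return 1;
}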

Some properties of triangular matrices


Let L = [lij ] and U = [uij ] denote respectively the lower and upper triangular matrices.

• The determinant of a triangular matrix is the product of the diagonal elements, i.e., $|L| = \prod_{i=1}^{n} l_{ii}$ and $|U| = \prod_{i=1}^{n} u_{ii}$.

• Product of two lower (upper) triangular matrices is a lower (upper) triangular


matrix.

• The inverse of lower (upper) triangular matrix is also a lower (upper) triangular
matrix.
• Since A = LU, $|A| = |LU| = |L|\,|U| = \Big(\prod_{i=1}^{n} l_{ii}\Big)\Big(\prod_{i=1}^{n} u_{ii}\Big)$ (a code sketch follows).
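As a small illustration of the last property, once the factors of Program 5.6 are available in Crout's form (with $u_{ii} = 1$), |A| is simply the product of the diagonal elements of L. The following function is an assumed add-on, not part of the printed listing (and the note below on the limitations of this approach should be kept in mind).

/* Assumed add-on to Program 5.6: with Crout's factors (u[i][i]=1),
   |A| = l[1][1]*l[2][2]*...*l[n][n]. */
float det_from_lu(float l[10][10], int n)
{
    int i;
    float det=1;
    for(i=1;i<=n;i++) det*=l[i][i];
    return det;
}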

It may be remembered that computing a determinant by LU decomposition is not always a good method, since the factorization may fail or become unstable when a leading principal minor vanishes or nearly vanishes.
Example 5.7.1 Factorize the matrix
 
$$A = \begin{pmatrix} 2 & -2 & 1\\ 5 & 1 & -3\\ 3 & 4 & 1 \end{pmatrix}$$
into the form LU, where L and U are lower and upper triangular matrices, and hence solve the system of equations $2x_1 - 2x_2 + x_3 = 2$, $5x_1 + x_2 - 3x_3 = 0$, $3x_1 + 4x_2 + x_3 = 9$. Determine $L^{-1}$ and $U^{-1}$ and hence find $A^{-1}$. Also determine |A|.

Solution. Let
$$\begin{pmatrix} 2 & -2 & 1\\ 5 & 1 & -3\\ 3 & 4 & 1 \end{pmatrix} = \begin{pmatrix} l_{11} & 0 & 0\\ l_{21} & l_{22} & 0\\ l_{31} & l_{32} & l_{33} \end{pmatrix}\begin{pmatrix} 1 & u_{12} & u_{13}\\ 0 & 1 & u_{23}\\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} l_{11} & l_{11}u_{12} & l_{11}u_{13}\\ l_{21} & l_{21}u_{12}+l_{22} & l_{21}u_{13}+l_{22}u_{23}\\ l_{31} & l_{31}u_{12}+l_{32} & l_{31}u_{13}+l_{32}u_{23}+l_{33} \end{pmatrix}.$$
Comparing both sides, we have
$l_{11} = 2$, $l_{21} = 5$, $l_{31} = 3$;
$l_{11}u_{12} = -2$ or $u_{12} = -2/l_{11} = -1$;
$l_{11}u_{13} = 1$ or $u_{13} = 1/l_{11} = 1/2$;
$l_{21}u_{12} + l_{22} = 1$ or $l_{22} = 1 - l_{21}u_{12} = 6$;
$l_{31}u_{12} + l_{32} = 4$ or $l_{32} = 4 - l_{31}u_{12} = 7$;
$l_{21}u_{13} + l_{22}u_{23} = -3$ or $u_{23} = (-3 - l_{21}u_{13})/l_{22} = -11/12$;
$l_{31}u_{13} + l_{32}u_{23} + l_{33} = 1$ or $l_{33} = 1 - l_{31}u_{13} - l_{32}u_{23} = 71/12$.
Hence L and U are given by
$$L = \begin{pmatrix} 2 & 0 & 0\\ 5 & 6 & 0\\ 3 & 7 & 71/12 \end{pmatrix},\qquad U = \begin{pmatrix} 1 & -1 & 1/2\\ 0 & 1 & -11/12\\ 0 & 0 & 1 \end{pmatrix}.$$

Second Part. The given system of equations can be written as Ax = b, where
$$A = \begin{pmatrix} 2 & -2 & 1\\ 5 & 1 & -3\\ 3 & 4 & 1 \end{pmatrix},\quad x = \begin{pmatrix} x_1\\ x_2\\ x_3 \end{pmatrix},\quad b = \begin{pmatrix} 2\\ 0\\ 9 \end{pmatrix}.$$
Using A = LU, the equation Ax = b reduces to LUx = b. Let Ux = y. Then Ly = b.
From the relation Ly = b, we have
$$\begin{pmatrix} 2 & 0 & 0\\ 5 & 6 & 0\\ 3 & 7 & 71/12 \end{pmatrix}\begin{pmatrix} y_1\\ y_2\\ y_3 \end{pmatrix} = \begin{pmatrix} 2\\ 0\\ 9 \end{pmatrix},$$
that is,
$$2y_1 = 2,\qquad 5y_1 + 6y_2 = 0,\qquad 3y_1 + 7y_2 + \frac{71}{12}y_3 = 9.$$
The solution of this system is $y_1 = 1$, $y_2 = -\frac56$, $y_3 = 2$.
Thus $y = (1, -5/6, 2)^t$.
Now, from the relation Ux = y, we have
$$\begin{pmatrix} 1 & -1 & 1/2\\ 0 & 1 & -11/12\\ 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} x_1\\ x_2\\ x_3 \end{pmatrix} = \begin{pmatrix} 1\\ -5/6\\ 2 \end{pmatrix},$$
i.e.,
$$x_1 - x_2 + \frac12 x_3 = 1,\qquad x_2 - \frac{11}{12}x_3 = -\frac56,\qquad x_3 = 2.$$
The solution of this system of equations is
$$x_3 = 2,\quad x_2 = -\frac56 + \frac{11}{12}\times 2 = 1,\quad x_1 = 1 + x_2 - \frac12 x_3 = 1.$$
Hence the required solution is x1 = 1, x2 = 1, x3 = 2.
Third Part. Applying the Gauss-Jordan method to find $L^{-1}$, the necessary augmented matrix is
$$[L\,|\,I] = \left(\begin{array}{ccc|ccc} 2 & 0 & 0 & 1 & 0 & 0\\ 5 & 6 & 0 & 0 & 1 & 0\\ 3 & 7 & 71/12 & 0 & 0 & 1 \end{array}\right)$$
$$\sim \left(\begin{array}{ccc|ccc} 1 & 0 & 0 & 1/2 & 0 & 0\\ 0 & 6 & 0 & -5/2 & 1 & 0\\ 0 & 7 & 71/12 & -3/2 & 0 & 1 \end{array}\right)\quad R_1' = \tfrac12 R_1,\ R_2' = R_2 - 5R_1',\ R_3' = R_3 - 3R_1'$$
$$\sim \left(\begin{array}{ccc|ccc} 1 & 0 & 0 & 1/2 & 0 & 0\\ 0 & 1 & 0 & -5/12 & 1/6 & 0\\ 0 & 0 & 71/12 & 17/12 & -7/6 & 1 \end{array}\right)\quad R_2' = \tfrac16 R_2,\ R_3' = R_3 - 7R_2'$$
$$\sim \left(\begin{array}{ccc|ccc} 1 & 0 & 0 & 1/2 & 0 & 0\\ 0 & 1 & 0 & -5/12 & 1/6 & 0\\ 0 & 0 & 1 & 17/71 & -14/71 & 12/71 \end{array}\right)\quad R_3' = \tfrac{12}{71} R_3$$
Hence
$$L^{-1} = \begin{pmatrix} 1/2 & 0 & 0\\ -5/12 & 1/6 & 0\\ 17/71 & -14/71 & 12/71 \end{pmatrix}.$$
The value of U−1 can also be determined in a similar way. Here we apply another
method based on the property ‘inverse of a triangular matrix is a triangular matrix
of same shape’.
 
Let U^{−1} = \begin{bmatrix} 1 & b_{12} & b_{13}\\ 0 & 1 & b_{23}\\ 0 & 0 & 1 \end{bmatrix}.
Then U^{−1}U = I gives

\begin{bmatrix} 1 & b_{12} & b_{13}\\ 0 & 1 & b_{23}\\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & −1 & 1/2\\ 0 & 1 & −11/12\\ 0 & 0 & 1 \end{bmatrix}
= \begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1 \end{bmatrix},

i.e., \begin{bmatrix} 1 & −1+b_{12} & \frac{1}{2}−\frac{11}{12}b_{12}+b_{13}\\ 0 & 1 & −\frac{11}{12}+b_{23}\\ 0 & 0 & 1 \end{bmatrix}
= \begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1 \end{bmatrix}.

Equating both sides,
−1 + b_{12} = 0 or b_{12} = 1;  \frac{1}{2} − \frac{11}{12}b_{12} + b_{13} = 0 or b_{13} = −\frac{1}{2} + \frac{11}{12}b_{12} = \frac{5}{12};
−\frac{11}{12} + b_{23} = 0 or b_{23} = \frac{11}{12}.
Hence
U^{−1} = \begin{bmatrix} 1 & 1 & 5/12\\ 0 & 1 & 11/12\\ 0 & 0 & 1 \end{bmatrix}.

Therefore,

A^{−1} = U^{−1}L^{−1}
= \begin{bmatrix} 1 & 1 & 5/12\\ 0 & 1 & 11/12\\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1/2 & 0 & 0\\ −5/12 & 1/6 & 0\\ 17/71 & −14/71 & 12/71 \end{bmatrix}
= \begin{bmatrix} 13/71 & 6/71 & 5/71\\ −14/71 & −1/71 & 11/71\\ 17/71 & −14/71 & 12/71 \end{bmatrix}.

Last Part. |A| = |L||U| = 2 \times 6 \times \frac{71}{12} \times 1 = 71.

Algorithm 5.6 (LU decomposition). This algorithm finds the solution of a sys-
tem of linear equations using LU decomposition method. Assume that the principal
minors of all order are non-zero.

Algorithm LU-decomposition
Let Ax = b be the systems of equations and A = [aij ], b = (b1 , b2 , . . . , bn )t ,
x = (x1 , x2 , . . . , xn )t .
//Assume that the principal minors of all order are non-zero.//
//Determine the matrices L and U.//
Step 1. Read the matrix A = [a_{ij}], i, j = 1, 2, . . . , n and the right hand vector b = (b_1, b_2, . . . , b_n)^t.
Step 2. l_{i1} = a_{i1} for i = 1, 2, . . . , n;  u_{1j} = a_{1j}/l_{11} for j = 2, 3, . . . , n;  u_{ii} = 1 for i = 1, 2, . . . , n.
Step 3. For i, j = 2, 3, . . . , n compute the following:
        l_{ij} = a_{ij} − \sum_{k=1}^{j−1} l_{ik}u_{kj},  i ≥ j
        u_{ij} = \Big( a_{ij} − \sum_{k=1}^{i−1} l_{ik}u_{kj} \Big)/l_{ii},  i < j.
Step 4. //Solve the system Lz = b by forward substitution.//
        z_1 = b_1/l_{11};  z_i = \Big( b_i − \sum_{j=1}^{i−1} l_{ij}z_j \Big)/l_{ii} for i = 2, 3, . . . , n.
Step 5. //Solve the system Ux = z by backward substitution.//
        Set x_n = z_n;
        x_i = z_i − \sum_{j=i+1}^{n} u_{ij}x_j for i = n−1, n−2, . . . , 1.
        Print x_1, x_2, . . . , x_n as solution.
end LU-decomposition

Program 5.6.
/* Program LU-decomposition
Solution of a system of equations by LU decomposition method.
Assume that all order principal minors are non-zero. */
#include<stdio.h>
void main()
{
float a[10][10],l[10][10],u[10][10],z[10],x[10],b[10];
int i,j,k,n;
printf("\nEnter the size of the coefficient matrix ");
scanf("%d",&n);
printf("Enter the elements rowwise ");
for(i=1;i<=n;i++) for(j=1;j<=n;j++) scanf("%f",&a[i][j]);
printf("Enter the right hand vector ");
for(i=1;i<=n;i++) scanf("%f",&b[i]);
/* computations of L and U matrices */
for(i=1;i<=n;i++) l[i][1]=a[i][1];
for(j=2;j<=n;j++) u[1][j]=a[1][j]/l[1][1];
for(i=1;i<=n;i++) u[i][i]=1;
for(i=2;i<=n;i++)
for(j=2;j<=n;j++)
if(i>=j)
{
l[i][j]=a[i][j];
for(k=1;k<=j-1;k++) l[i][j]-=l[i][k]*u[k][j];
}
else
{
u[i][j]=a[i][j];
for(k=1;k<=i-1;k++) u[i][j]-=l[i][k]*u[k][j];
u[i][j]/=l[i][i];
}
printf("\nThe lower triangular matrix L\n");
for(i=1;i<=n;i++)
{
for(j=1;j<=i;j++) printf("%f ",l[i][j]);
printf("\n");
}
printf("\nThe upper triangular matrix U\n");

for(i=1;i<=n;i++)
{
for(j=1;j<i;j++) printf(" ");
for(j=i;j<=n;j++) printf("%f ",u[i][j]);
printf("\n");
}
/* solve Lz=b by forward substitution */
z[1]=b[1]/l[1][1];
for(i=2;i<=n;i++)
{
z[i]=b[i];
for(j=1;j<=i-1;j++) z[i]-=l[i][j]*z[j];
z[i]/=l[i][i];
}

/* solve Ux=z by backward substitution */


x[n]=z[n];
for(i=n-1;i>=1;i--)
{
x[i]=z[i];
for(j=i+1;j<=n;j++) x[i]-=u[i][j]*x[j];
}
printf("The solution is ");
for(i=1;i<=n;i++) printf("%f ",x[i]);
} /* main */

A sample of input/output:

Enter the size of the coefficient matrix 3


Enter the elements rowwise
4 2 1
2 5 -2
1 -2 7
Enter the right hand vector
3 4 5

The lower triangular matrix L


4.000000
2.000000 4.000000
1.000000 -2.500000 5.187500

The upper triangular matrix U


1.000000 0.500000 0.250000
1.000000 -0.625000
1.000000
The solution is -0.192771 1.325301 1.120482

5.8 Gauss Elimination Method to Find the Inverse of a Matrix

Conventionally, the Gauss elimination is applied to the augmented matrix [A | b]. The method can also be applied to the augmented matrix [A | I]. In this process the matrix A (= LU) becomes an upper triangular matrix U, and the unit matrix I becomes a lower triangular matrix, which is the inverse of L. Then the relation AA^{−1} = I becomes LUA^{−1} = I, i.e.,

UA^{−1} = L^{−1}.                    (5.27)

Equation (5.27) equates UA^{−1} to the known lower triangular matrix L^{−1}; since U is also known, the matrix A^{−1} can be determined easily by back substitution.
This method is illustrated by an example below.
Example 5.8.1 Find the inverse of the matrix A using Gauss elimination method, where
A = \begin{bmatrix} 1 & 3 & 4\\ 1 & 0 & 2\\ −2 & 3 & 1 \end{bmatrix}.

Solution. The augmented matrix [A | I] is

[A | I] = \begin{bmatrix} 1 & 3 & 4 & | & 1 & 0 & 0\\ 1 & 0 & 2 & | & 0 & 1 & 0\\ −2 & 3 & 1 & | & 0 & 0 & 1 \end{bmatrix}

∼ \begin{bmatrix} 1 & 3 & 4 & | & 1 & 0 & 0\\ 0 & −3 & −2 & | & −1 & 1 & 0\\ 0 & 9 & 9 & | & 2 & 0 & 1 \end{bmatrix}
\quad R_2 ← R_2 − R_1, R_3 ← R_3 + 2R_1

∼ \begin{bmatrix} 1 & 3 & 4 & | & 1 & 0 & 0\\ 0 & −3 & −2 & | & −1 & 1 & 0\\ 0 & 0 & 3 & | & −1 & 3 & 1 \end{bmatrix}
\quad R_3 ← R_3 + 3R_2.
Here U = \begin{bmatrix} 1 & 3 & 4\\ 0 & −3 & −2\\ 0 & 0 & 3 \end{bmatrix}, \quad L^{−1} = \begin{bmatrix} 1 & 0 & 0\\ −1 & 1 & 0\\ −1 & 3 & 1 \end{bmatrix}.

Let A^{−1} = \begin{bmatrix} x_{11} & x_{12} & x_{13}\\ x_{21} & x_{22} & x_{23}\\ x_{31} & x_{32} & x_{33} \end{bmatrix}.
Since UA^{−1} = L^{−1},

\begin{bmatrix} 1 & 3 & 4\\ 0 & −3 & −2\\ 0 & 0 & 3 \end{bmatrix}
\begin{bmatrix} x_{11} & x_{12} & x_{13}\\ x_{21} & x_{22} & x_{23}\\ x_{31} & x_{32} & x_{33} \end{bmatrix}
= \begin{bmatrix} 1 & 0 & 0\\ −1 & 1 & 0\\ −1 & 3 & 1 \end{bmatrix}.

This implies
x_{11} + 3x_{21} + 4x_{31} = 1,  −3x_{21} − 2x_{31} = −1,  3x_{31} = −1.
These equations give x_{31} = −\frac{1}{3}, x_{21} = \frac{5}{9}, x_{11} = \frac{2}{3}.
Again, x_{12} + 3x_{22} + 4x_{32} = 0,  −3x_{22} − 2x_{32} = 1,  3x_{32} = 3,
whose solution is x_{32} = 1, x_{22} = −1, x_{12} = −1.
Finally, x_{13} + 3x_{23} + 4x_{33} = 0,  −3x_{23} − 2x_{33} = 0,  3x_{33} = 1
give x_{33} = \frac{1}{3}, x_{23} = −\frac{2}{9}, x_{13} = −\frac{2}{3}.
Hence A^{−1} = \begin{bmatrix} 2/3 & −1 & −2/3\\ 5/9 & −1 & −2/9\\ −1/3 & 1 & 1/3 \end{bmatrix}.
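A compact C sketch of this inverse-by-elimination procedure is shown below (it is an illustration, not one of the book's numbered programs). The augmented matrix [A | I] of Example 5.8.1 is stored in a single array; A is reduced to upper triangular form, and each column of A^{−1} is then obtained by back substitution as in (5.27). Pivoting is omitted, so non-zero pivots are assumed throughout.

/* Sketch of Section 5.8: Gauss elimination on [A|I], then back
   substitution for each column of the inverse; data illustrative. */
#include<stdio.h>
void main()
{
    float a[4][7]={{0},{0,1,3,4, 1,0,0},
                       {0,1,0,2, 0,1,0},
                       {0,-2,3,1, 0,0,1}};
    float x[4][4]; int i,j,k,n=3;
    for(k=1;k<n;k++)             /* eliminate below the pivot a[k][k] */
        for(i=k+1;i<=n;i++)
        {
            float m=a[i][k]/a[k][k];
            for(j=k;j<=2*n;j++) a[i][j]-=m*a[k][j];
        }
    for(k=1;k<=n;k++)            /* back substitution, column k of A^{-1} */
        for(i=n;i>=1;i--)
        {
            float s=a[i][n+k];
            for(j=i+1;j<=n;j++) s-=a[i][j]*x[j][k];
            x[i][k]=s/a[i][i];
        }
    for(i=1;i<=n;i++)
    {
        for(j=1;j<=n;j++) printf("%9.5f ",x[i][j]);
        printf("\n");
    }
}

For the data of Example 5.8.1 this prints the matrix A^{−1} obtained above.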

5.9 Cholesky Method

If the coefficient matrix A is symmetric and positive definite then this method is appli-
cable to solve the system Ax = b. This method is also known as square-root method.
Since A is symmetric then A can be written as

A = LLt , (5.28)
where L = [lij ], lij = 0, i < j, a lower triangular matrix.
Also, A can be decomposed as
A = UUt , (5.29)
Solution of System of Linear Equations 315

in terms of an upper triangular matrix U.

Using (5.28), the equation Ax = b becomes
LLt x = b. (5.30)
Let Lt x = z (5.31)
then Lz = b. (5.32)
The vector z can be obtained from (5.32) by forward substitution and the solution
vector x are determined from the equation (5.31) by back substitution. Also, z and x
can be determined by computing the inverse of L only as
z = L−1 b and x = (Lt )−1 z = (L−1 )t z. (5.33)
The inverse of A can be determined as
A−1 = (L−1 )t L−1 .

Procedure to determine L
Since A = LL^t, we have

\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2n}\\ \cdots & \cdots & \cdots & \cdots\\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix}
= \begin{bmatrix} l_{11} & 0 & \cdots & 0\\ l_{21} & l_{22} & \cdots & 0\\ \cdots & \cdots & \cdots & \cdots\\ l_{n1} & l_{n2} & \cdots & l_{nn} \end{bmatrix}
\begin{bmatrix} l_{11} & l_{21} & \cdots & l_{n1}\\ 0 & l_{22} & \cdots & l_{n2}\\ \cdots & \cdots & \cdots & \cdots\\ 0 & 0 & \cdots & l_{nn} \end{bmatrix}.

Equating the elements of both sides, we find the following equations:

l_{11} = (a_{11})^{1/2}
l_{i1} = a_{i1}/l_{11},  i = 2, 3, \ldots, n
l_{ii} = \Big( a_{ii} − \sum_{j=1}^{i−1} l_{ij}^2 \Big)^{1/2},  i = 2, 3, \ldots, n          (5.34)
l_{ij} = \frac{1}{l_{jj}}\Big( a_{ij} − \sum_{k=1}^{j−1} l_{jk}l_{ik} \Big),  for i = j+1, j+2, \ldots, n
l_{ij} = 0,  i < j.

Similarly, for the decomposition (5.29), the elements u_{ij} of U are given by

u_{nn} = (a_{nn})^{1/2}
u_{in} = a_{in}/u_{nn},  i = 1, 2, \ldots, n−1
u_{ij} = \frac{1}{u_{jj}}\Big( a_{ij} − \sum_{k=j+1}^{n} u_{ik}u_{jk} \Big),
        for i = n−2, n−3, \ldots, 1;  j = i+1, i+2, \ldots, n−1                              (5.35)
u_{ii} = \Big( a_{ii} − \sum_{k=i+1}^{n} u_{ik}^2 \Big)^{1/2},  i = n−1, n−2, \ldots, 1
u_{ij} = 0,  i > j.

Example 5.9.1 Solve the following system of equations by Cholesky method.

2x_1 + x_2 + x_3 = 6
x_1 + 3x_2 + 2x_3 = 11
x_1 + 2x_2 + 4x_3 = 13.

Solution. The given system of equations is Ax = b, where x = (x_1, x_2, x_3)^t, b = (6, 11, 13)^t, and A = \begin{bmatrix} 2 & 1 & 1\\ 1 & 3 & 2\\ 1 & 2 & 4 \end{bmatrix}.
It is observed that A is symmetric and positive definite.
Let L = \begin{bmatrix} l_{11} & 0 & 0\\ l_{21} & l_{22} & 0\\ l_{31} & l_{32} & l_{33} \end{bmatrix}.
Therefore,

LL^t = \begin{bmatrix} l_{11}^2 & l_{11}l_{21} & l_{11}l_{31}\\ l_{11}l_{21} & l_{21}^2+l_{22}^2 & l_{21}l_{31}+l_{22}l_{32}\\ l_{11}l_{31} & l_{21}l_{31}+l_{22}l_{32} & l_{31}^2+l_{32}^2+l_{33}^2 \end{bmatrix}
= \begin{bmatrix} 2 & 1 & 1\\ 1 & 3 & 2\\ 1 & 2 & 4 \end{bmatrix}.

Comparing both sides, we have
l_{11}^2 = 2 or l_{11} = \sqrt{2}
l_{11}l_{21} = 1 or l_{21} = 1/\sqrt{2}
l_{11}l_{31} = 1 or l_{31} = 1/\sqrt{2}
l_{21}^2 + l_{22}^2 = 3 or l_{22} = (3 − \tfrac{1}{2})^{1/2} = \sqrt{5/2}
l_{31}l_{21} + l_{32}l_{22} = 2 or l_{32} = \frac{1}{l_{22}}(2 − l_{31}l_{21}) = 3/\sqrt{10}
l_{31}^2 + l_{32}^2 + l_{33}^2 = 4 or l_{33} = (4 − l_{31}^2 − l_{32}^2)^{1/2} = \sqrt{13/5}.

Therefore,
L = \begin{bmatrix} \sqrt{2} & 0 & 0\\ 1/\sqrt{2} & \sqrt{5/2} & 0\\ 1/\sqrt{2} & 3/\sqrt{10} & \sqrt{13/5} \end{bmatrix}
  = \begin{bmatrix} 1.41421 & 0 & 0\\ 0.70711 & 1.58114 & 0\\ 0.70711 & 0.94868 & 1.61245 \end{bmatrix}.

From the relation Lz = b, we have

1.41421z1 = 6
0.70711z1 + 1.58114z2 = 11
0.70711z1 + 0.94868z2 + 1.61245z3 = 13.

This gives z1 = 4.24265, z2 = 5.05963, z3 = 3.22491.


Now, from the relation L^t x = z, we have

\begin{bmatrix} 1.41421 & 0.70711 & 0.70711\\ 0 & 1.58114 & 0.94868\\ 0 & 0 & 1.61245 \end{bmatrix}
\begin{bmatrix} x_1\\ x_2\\ x_3 \end{bmatrix}
= \begin{bmatrix} 4.24265\\ 5.05963\\ 3.22491 \end{bmatrix},

i.e., 1.41421x1 + 0.70711x2 + 0.70711x3 = 4.24265


1.58114x2 + 0.94868x3 = 5.05963
1.61245x3 = 3.22491

Solution of these equations is x3 = 2.00001, x2 = 1.99993, x1 = 0.99998.


Hence the solution is x1 = 1.0000, x2 = 2.0000, x3 = 2.0000, correct up to four
decimal places.
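The book gives no separate program for the Cholesky method, so the following is a minimal C sketch (an illustration, not one of the numbered programs), assuming A is symmetric and positive definite. It builds L from the relations (5.34) and then solves Lz = b and L^t x = z for the data of Example 5.9.1.

/* Sketch of the Cholesky (square-root) method; data illustrative. */
#include<stdio.h>
#include<math.h>
void main()
{
    float a[4][4]={{0},{0,2,1,1},{0,1,3,2},{0,1,2,4}};
    float b[4]={0,6,11,13};
    float l[4][4]={{0}},z[4],x[4],s; int i,j,k,n=3;
    for(j=1;j<=n;j++)            /* build L column by column, eq. (5.34) */
    {
        s=a[j][j];
        for(k=1;k<j;k++) s-=l[j][k]*l[j][k];
        l[j][j]=sqrt(s);         /* positive definiteness gives s > 0 */
        for(i=j+1;i<=n;i++)
        {
            s=a[i][j];
            for(k=1;k<j;k++) s-=l[i][k]*l[j][k];
            l[i][j]=s/l[j][j];
        }
    }
    for(i=1;i<=n;i++)            /* forward substitution: L z = b */
    {
        s=b[i];
        for(j=1;j<i;j++) s-=l[i][j]*z[j];
        z[i]=s/l[i][i];
    }
    for(i=n;i>=1;i--)            /* back substitution: L^t x = z */
    {
        s=z[i];
        for(j=i+1;j<=n;j++) s-=l[j][i]*x[j];
        x[i]=s/l[i][i];
    }
    for(i=1;i<=n;i++) printf("%8.5f ",x[i]);
}

For the data shown, the printed solution agrees with x_1 = 1, x_2 = 2, x_3 = 2 of the example.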

5.10 Matrix Partition Method

When a matrix is very large and it is not possible to store the entire matrix into the
primary memory of a computer at a time, then matrix partition method is used to
find the inverse of a matrix. When a few more variables and consequently a few more
equations are added to the original system then also this method is very useful.
Let the coefficient matrix A be partitioned as

A = \begin{bmatrix} B & C\\ D & E \end{bmatrix}                    (5.36)

where B is an l × l matrix, C is an l × m matrix, D is an m × l matrix and E is an m × m matrix; l and m are positive integers with l + m = n.
Let A^{−1} be partitioned conformably as

A^{−1} = \begin{bmatrix} P & Q\\ R & S \end{bmatrix}               (5.37)

where the matrices P, Q, R and S are of the same orders as B, C, D and E respectively. Then

AA^{−1} = \begin{bmatrix} B & C\\ D & E \end{bmatrix}\begin{bmatrix} P & Q\\ R & S \end{bmatrix}
        = \begin{bmatrix} I_1 & 0\\ 0 & I_2 \end{bmatrix},         (5.38)

where I_1 and I_2 are identity matrices of order l and m respectively. From (5.38), we have

BP + CR = I_1
BQ + CS = 0
DP + ER = 0
DQ + ES = I_2.

Now, BQ + CS = 0 gives Q = −B−1 CS i.e., DQ = −DB−1 CS.


Also, from DQ + ES = I2 , we have (E − DB−1 C)S = I2 .
Therefore, S = (E − DB−1 C)−1 .
Similarly, the other matrices are

Q = −B−1 CS
R = −(E − DB−1 C)−1 DB−1 = −SDB−1
P = B−1 (I1 − CR) = B−1 − B−1 CR.

It may be noted that, to find the inverse of A, it is required to determine the inverses
of two matrices B and (E − DB−1 C) of order l × l and m × m respectively.
That is, to compute the inverse of the matrix A of order n × n, the inverses of two
lower order (roughly half) matrices are to be determined. If the matrices B, C, D, E
are still large to fit in the computer memory, then further partition them.
 
Example 5.10.1 Find the inverse of the matrix A = \begin{bmatrix} 3 & 3 & 4\\ 2 & 1 & 1\\ 1 & 3 & 5 \end{bmatrix} using the matrix partition method. Hence find the solution of the system of equations

3x_1 + 3x_2 + 4x_3 = 5
2x_1 + x_2 + x_3 = 7
x_1 + 3x_2 + 5x_3 = 6.

Solution. Let the matrix A be partitioned as A = \begin{bmatrix} B & C\\ D & E \end{bmatrix}, where

B = \begin{bmatrix} 3 & 3\\ 2 & 1 \end{bmatrix}, \quad C = \begin{bmatrix} 4\\ 1 \end{bmatrix}, \quad D = \begin{bmatrix} 1 & 3 \end{bmatrix}, \quad E = \begin{bmatrix} 5 \end{bmatrix},

and A^{−1} = \begin{bmatrix} P & Q\\ R & S \end{bmatrix}, where P, Q, R and S are given by
S = (E − DB^{−1}C)^{−1}, R = −SDB^{−1}, P = B^{−1} − B^{−1}CR, Q = −B^{−1}CS.

Now,
B^{−1} = −\frac{1}{3}\begin{bmatrix} 1 & −3\\ −2 & 3 \end{bmatrix} = \frac{1}{3}\begin{bmatrix} −1 & 3\\ 2 & −3 \end{bmatrix}.

E − DB^{−1}C = 5 − \begin{bmatrix} 1 & 3 \end{bmatrix}\frac{1}{3}\begin{bmatrix} −1 & 3\\ 2 & −3 \end{bmatrix}\begin{bmatrix} 4\\ 1 \end{bmatrix} = 5 − \frac{14}{3} = \frac{1}{3}, \quad so S = 3.

R = −SDB^{−1} = −3\begin{bmatrix} 1 & 3 \end{bmatrix}\frac{1}{3}\begin{bmatrix} −1 & 3\\ 2 & −3 \end{bmatrix} = \begin{bmatrix} −5 & 6 \end{bmatrix}.

Q = −B^{−1}CS = −\frac{1}{3}\begin{bmatrix} −1\\ 5 \end{bmatrix}\cdot 3 = \begin{bmatrix} 1\\ −5 \end{bmatrix}.

P = B^{−1} − B^{−1}CR = \frac{1}{3}\begin{bmatrix} −1 & 3\\ 2 & −3 \end{bmatrix} − \frac{1}{3}\begin{bmatrix} −1\\ 5 \end{bmatrix}\begin{bmatrix} −5 & 6 \end{bmatrix} = \begin{bmatrix} −2 & 3\\ 9 & −11 \end{bmatrix}.

Therefore,
A^{−1} = \begin{bmatrix} −2 & 3 & 1\\ 9 & −11 & −5\\ −5 & 6 & 3 \end{bmatrix}.

Hence,
x = A^{−1}b = \begin{bmatrix} −2 & 3 & 1\\ 9 & −11 & −5\\ −5 & 6 & 3 \end{bmatrix}\begin{bmatrix} 5\\ 7\\ 6 \end{bmatrix} = \begin{bmatrix} 17\\ −62\\ 35 \end{bmatrix}.

Hence the required solution is x_1 = 17, x_2 = −62, x_3 = 35.
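A small C sketch of this partition computation is shown below (an illustration, not one of the book's numbered programs). It hard-codes the data of Example 5.10.1 with l = 2 and m = 1, so that B is a 2 × 2 block (inverted by the adjugate formula) and E is a scalar; it prints A^{−1} assembled from P, Q, R and S.

/* Sketch of the partition method for a 3x3 matrix split as l=2, m=1. */
#include<stdio.h>
void main()
{
    float B[2][2]={{3,3},{2,1}}, C[2]={4,1}, D[2]={1,3}, E=5;
    float Bi[2][2],BiC[2],P[2][2],Q[2],R[2],S,det;
    int i,j;
    det=B[0][0]*B[1][1]-B[0][1]*B[1][0];     /* inverse of the 2x2 block B */
    Bi[0][0]= B[1][1]/det; Bi[0][1]=-B[0][1]/det;
    Bi[1][0]=-B[1][0]/det; Bi[1][1]= B[0][0]/det;
    for(i=0;i<2;i++) BiC[i]=Bi[i][0]*C[0]+Bi[i][1]*C[1];
    S=1.0f/(E-(D[0]*BiC[0]+D[1]*BiC[1]));    /* S = (E - D B^{-1} C)^{-1} */
    for(j=0;j<2;j++)                         /* R = -S D B^{-1} */
        R[j]=-S*(D[0]*Bi[0][j]+D[1]*Bi[1][j]);
    for(i=0;i<2;i++)                         /* Q = -B^{-1} C S */
        Q[i]=-BiC[i]*S;
    for(i=0;i<2;i++)                         /* P = B^{-1} - (B^{-1}C) R */
        for(j=0;j<2;j++)
            P[i][j]=Bi[i][j]-BiC[i]*R[j];
    for(i=0;i<2;i++)
        printf("%8.4f %8.4f %8.4f\n",P[i][0],P[i][1],Q[i]);
    printf("%8.4f %8.4f %8.4f\n",R[0],R[1],S);
}

The printed matrix agrees with the A^{−1} found in the example.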

5.11 Solution of Tri-diagonal Systems

If the system of equations is of the form

b_1x_1 + c_1x_2 = d_1
a_2x_1 + b_2x_2 + c_2x_3 = d_2
a_3x_2 + b_3x_3 + c_3x_4 = d_3                    (5.39)
. . . . . . . . . . . . . . .
a_nx_{n−1} + b_nx_n = d_n,

then the coefficient matrix and the right hand vector are

A = \begin{bmatrix} b_1 & c_1 & 0 & \cdots & 0 & 0\\ a_2 & b_2 & c_2 & \cdots & 0 & 0\\ 0 & a_3 & b_3 & \cdots & 0 & 0\\ \cdots & \cdots & \cdots & \cdots & \cdots & \cdots\\ 0 & 0 & 0 & \cdots & a_n & b_n \end{bmatrix}
\quad and \quad d = \begin{bmatrix} d_1\\ d_2\\ \vdots\\ d_n \end{bmatrix}.   (5.40)

It may be noted that the main diagonal and the adjacent diagonals on either side of it contain the only non-zero elements; all other elements are zero. Such a matrix is called a tri-diagonal matrix and the system of equations is called a tri-diagonal system. These types of matrices occur frequently in the solution of ordinary and partial differential equations by finite difference methods.

A tri-diagonal system can be solved using the LU decomposition method. Let A = LU, where

L = \begin{bmatrix} \gamma_1 & 0 & \cdots & 0 & 0\\ \beta_2 & \gamma_2 & \cdots & 0 & 0\\ \cdots & \cdots & \cdots & \cdots & \cdots\\ 0 & 0 & \cdots & \beta_n & \gamma_n \end{bmatrix}, \quad
U = \begin{bmatrix} 1 & \alpha_1 & 0 & \cdots & 0\\ 0 & 1 & \alpha_2 & \cdots & 0\\ \cdots & \cdots & \cdots & \cdots & \cdots\\ 0 & 0 & 0 & \cdots & 1 \end{bmatrix}.

Then

LU = \begin{bmatrix} \gamma_1 & \gamma_1\alpha_1 & 0 & \cdots & 0\\ \beta_2 & \beta_2\alpha_1+\gamma_2 & \gamma_2\alpha_2 & \cdots & 0\\ 0 & \beta_3 & \beta_3\alpha_2+\gamma_3 & \cdots & 0\\ \cdots & \cdots & \cdots & \cdots & \cdots\\ 0 & 0 & 0 & \beta_n & \beta_n\alpha_{n−1}+\gamma_n \end{bmatrix}.

Now, comparing the matrix LU with A, the non-zero elements of L and U are obtained as

\gamma_1 = b_1;  \gamma_i\alpha_i = c_i, or \alpha_i = c_i/\gamma_i, i = 1, 2, \ldots, n−1;
\beta_i = a_i, i = 2, \ldots, n;  \gamma_i = b_i − \alpha_{i−1}\beta_i = b_i − a_i\frac{c_{i−1}}{\gamma_{i−1}}, i = 2, 3, \ldots, n.

Thus the elements of L and U are given by the following relations:

\gamma_1 = b_1, \quad \gamma_i = b_i − \frac{a_i c_{i−1}}{\gamma_{i−1}}, i = 2, 3, \ldots, n        (5.41)
\beta_i = a_i, i = 2, 3, \ldots, n                                                                   (5.42)
\alpha_i = c_i/\gamma_i, i = 1, 2, \ldots, n−1.

The solution of the equation (5.39), i.e., Ax = d where d = (d_1, d_2, \ldots, d_n)^t, can be obtained by solving Lz = d using forward substitution and then solving Ux = z using back substitution. The solution of Lz = d is given by

z_1 = \frac{d_1}{b_1}, \quad z_i = \frac{d_i − a_i z_{i−1}}{\gamma_i}, i = 2, 3, \ldots, n.          (5.43)

The solution of the equation Ux = z is

x_n = z_n, \quad x_i = z_i − \alpha_i x_{i+1} = z_i − \frac{c_i}{\gamma_i}x_{i+1}, i = n−1, n−2, \ldots, 1.   (5.44)

Example 5.11.1 Solve the following tri-diagonal system of equations.

x_1 + x_2 = 3, −x_1 + 2x_2 + x_3 = 6, 3x_2 + 2x_3 = 12.

Solution. Here b_1 = c_1 = 1, a_2 = −1, b_2 = 2, c_2 = 1, a_3 = 3, b_3 = 2, d_1 = 3, d_2 = 6, d_3 = 12.
Therefore,

\gamma_1 = b_1 = 1
\gamma_2 = b_2 − a_2\frac{c_1}{\gamma_1} = 2 − (−1)\cdot 1 = 3
\gamma_3 = b_3 − a_3\frac{c_2}{\gamma_2} = 2 − 3\cdot\frac{1}{3} = 1
z_1 = \frac{d_1}{b_1} = 3, \quad z_2 = \frac{d_2 − a_2z_1}{\gamma_2} = 3, \quad z_3 = \frac{d_3 − a_3z_2}{\gamma_3} = 3
x_3 = z_3 = 3, \quad x_2 = z_2 − \frac{c_2}{\gamma_2}x_3 = 2, \quad x_1 = z_1 − \frac{c_1}{\gamma_1}x_2 = 1.

Hence the required solution is x_1 = 1, x_2 = 2, x_3 = 3.
It may be noted that the equations (5.43) and (5.44) are valid only if \gamma_i ≠ 0 for all i = 1, 2, \ldots, n. If any one of the \gamma_i becomes zero at some stage, the method as it stands is not applicable. Actually, this method is based on the LU decomposition technique, and LU decomposition is applicable and unique only if the principal minors of the coefficient matrix of all orders are non-zero. But if a minor becomes zero, a modification of (5.43) and (5.44) still gives the solution of the tri-diagonal system.

Suppose \gamma_k = 0 and \gamma_i ≠ 0 for i = 1, 2, \ldots, k−1. Then let \gamma_k = s, a symbolic value of \gamma_k. The remaining \gamma_i, i = k+1, \ldots, n are calculated using the equation (5.41). These \gamma's are used to calculate z_i and x_i using the formulae (5.43) and (5.44); the values of the x_i's are obtained in terms of s. Lastly, the final solution is obtained by substituting s = 0. The following example illustrates this case.

Example 5.11.2 Solve the following system of equations.

x1 + x2 = 3,
x1 + x2 − 3x3 = −3,
−2x2 + 3x3 = 4.

Solution. Here b_1 = c_1 = 1, a_2 = b_2 = 1, c_2 = −3, a_3 = −2, b_3 = 3, d_1 = 3, d_2 = −3, d_3 = 4.
Therefore,

\gamma_1 = b_1 = 1
\gamma_2 = b_2 − a_2\frac{c_1}{\gamma_1} = 1 − 1 = 0.

Since \gamma_2 = 0, let \gamma_2 = s. Therefore,

\gamma_3 = b_3 − a_3\frac{c_2}{\gamma_2} = 3 + 2\cdot\frac{−3}{s} = 3 − \frac{6}{s}
z_1 = \frac{d_1}{b_1} = 3, \quad z_2 = \frac{d_2 − a_2z_1}{\gamma_2} = −\frac{6}{s}, \quad z_3 = \frac{d_3 − a_3z_2}{\gamma_3} = \frac{4s − 12}{3s − 6}
x_3 = z_3 = \frac{4s − 12}{3s − 6},
x_2 = z_2 − \frac{c_2}{\gamma_2}x_3 = \frac{−6}{3s − 6},
x_1 = z_1 − \frac{c_1}{\gamma_1}x_2 = \frac{9s − 12}{3s − 6}.

Substituting s = 0, the required solution is
x_1 = 2, x_2 = 1, x_3 = 2.

Algorithm 5.7 (Solution of a tri-diagonal system). This algorithm solves a


tri-diagonal system of linear equations. Assume that the principal minors are non-
zero.

Algorithm Tridiagonal
//Let the tri-diagonal system is of the form Ax = d where A and d are given by
(5.40).//
Step 1. Read the matrix A, i.e., the arrays a_i, b_i, c_i, i = 2, 3, . . . , n−1 and b_1, c_1, a_n, b_n.
Step 2. Compute \gamma_1 = b_1, \gamma_i = b_i − \frac{a_i c_{i−1}}{\gamma_{i−1}}, i = 2, 3, . . . , n.
Step 3. Compute z_1 = \frac{d_1}{b_1}, z_i = \frac{d_i − a_i z_{i−1}}{\gamma_i}, i = 2, 3, . . . , n.
Step 4. Compute x_n = z_n, x_i = z_i − \frac{c_i}{\gamma_i}x_{i+1}, i = n−1, n−2, . . . , 1.
Step 5. Print x_i, i = 1, 2, . . . , n and Stop.
end Tridiagonal

Program 5.7.
/* Program TriDiagonal
Program to solve a tri-diagonal system of equations.
The coefficient matrix are taken as a[i],b[i],c[i],
i=2, 3, ..., n-1, b[1],c[1],a[n],b[n]. The right
hand vector is d[i], i=1, 2, ..., n.*/
#include<stdio.h>
#include<stdlib.h>
float x[10]; /* x[i]is the solution of the tri-diagonal system */
void main()
{
float a[10],b[10],c[10],d[10];
int i,n; float y;
float TriDiag(float [],float [],float [],float [],int);
printf("Enter the size of the coefficient matrix ");
scanf("%d",&n);
printf("Enter first row (only non-zero elements) ");
scanf("%f %f",&b[1],&c[1]);
printf("Enter rows 2 to n-1 ");
for(i=2;i<=n-1;i++) scanf("%f %f %f",&a[i],&b[i],&c[i]);
printf("Enter last row ");
scanf("%f %f",&a[n],&b[n]);

printf("Enter the right hand vector ");


for(i=1;i<=n;i++) scanf("%f",&d[i]);
y=TriDiag(a,b,c,d,n);/* call of TriDiag to a dummy variable */
printf("The solution is \n");
for(i=1;i<=n;i++) printf("%f ",x[i]);
} /* end of main */

float TriDiag(float a[10],float b[10],float c[10],float d[10],int n)


{
/* output x[i], i=1, 2,..., n, is a global variable.*/
int i; float gamma[10],z[10];
gamma[1]=b[1];
for(i=2;i<=n;i++)
{
if(gamma[i-1]==0.0)
{
printf("A minor is zero: Method fails ");
exit(0);
}
gamma[i]=b[i]-a[i]*c[i-1]/gamma[i-1];
}
z[1]=d[1]/gamma[1];
for(i=2;i<=n;i++)
z[i]=(d[i]-a[i]*z[i-1])/gamma[i];
x[n]=z[n];
for(i=n-1;i>=1;i--)
x[i]=z[i]-c[i]*x[i+1]/gamma[i];
/* the solution x[1..n] is available globally; return its first entry */
return x[1];
} /*end of TriDiag */

A sample of input/output:

Enter the size of the coefficient matrix 4


Enter first row (only non-zero elements) 1 2
Enter rows 2 to n-1
3 2 1
2 0 -1
Enter last row
1 2

Enter the right hand vector


3 2 1 1
The solution is
0.500000 1.250000 -2.000000 1.500000

5.12 Evaluation of Tri-diagonal Determinant

For n ≥ 3, a general tri-diagonal matrix T = [t_{ij}]_{n×n} is of the form

T = \begin{bmatrix} b_1 & c_1 & 0 & \cdots & 0 & 0\\ a_2 & b_2 & c_2 & \cdots & 0 & 0\\ 0 & a_3 & b_3 & \cdots & 0 & 0\\ \cdots & \cdots & \cdots & \cdots & \cdots & \cdots\\ 0 & 0 & 0 & \cdots & a_n & b_n \end{bmatrix}

with t_{ij} = 0 for |i − j| ≥ 2.

The entire matrix can be stored using only three vectors c = (c_1, c_2, \ldots, c_{n−1}), a = (a_2, a_3, \ldots, a_n), and b = (b_1, b_2, \ldots, b_n). We define a vector d = (d_1, d_2, \ldots, d_n) as

d_i = \begin{cases} b_1 & if\ i = 1\\ b_i − \dfrac{a_i c_{i−1}}{d_{i−1}} & if\ i = 2, 3, \ldots, n. \end{cases}          (5.45)

If d_i = 0 for some i ≤ n, then set d_i = x (x is just a symbolic name) and continue to compute d_{i+1}, d_{i+2}, \ldots, d_n in terms of x by using (5.45).
The product P = \prod_{i=1}^{n} d_i (in general, a polynomial in x) evaluated at x = 0 is the value of |T|. If P is free from x, then the product P directly gives the value of |T|.
Example 5.12.1 Find the values of the determinants of the following tri-diagonal matrices:

A = \begin{bmatrix} 1 & 1 & 0\\ 1 & 1 & −2\\ 0 & −3 & 4 \end{bmatrix}, \quad
B = \begin{bmatrix} 1 & −1 & 0\\ −1 & 2 & −1\\ 0 & −1 & 2 \end{bmatrix}.

Solution. For the matrix A,

d_1 = 1, d_2 = 0; so set d_2 = x, and then d_3 = b_3 − \frac{a_3}{d_2}c_2 = \frac{4x − 6}{x}.
Therefore P = d_1 d_2 d_3 = 4x − 6, which at x = 0 gives |A| = −6.

For the matrix B,

d_1 = 1, d_2 = b_2 − \frac{a_2}{d_1}c_1 = 1, d_3 = b_3 − \frac{a_3}{d_2}c_2 = 1.
Therefore P = d_1 d_2 d_3 = 1\cdot1\cdot1 = 1, which gives |B| = 1.

1. M.E.A. El-Mikkawy, A fast algorithm for evaluating nth order tri-diagonal determinants, J. Computational and Applied Mathematics, 166 (2004) 581-584.
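For the common case in which no d_i vanishes, the algorithm is only a few lines of C. The sketch below (an illustration, not one of the book's numbered programs) evaluates |B| of Example 5.12.1; a vanishing d_i would require the symbolic treatment described above.

/* Sketch of the determinant recurrence (5.45), no-zero-d_i case. */
#include<stdio.h>
void main()
{
    float b[4]={0,1,2,2},a[4]={0,0,-1,-1},c[4]={0,-1,-1,0};
    float d,P; int i,n=3;
    d=b[1]; P=d;
    for(i=2;i<=n;i++)
    {
        d=b[i]-a[i]*c[i-1]/d;    /* recurrence (5.45) */
        P*=d;
    }
    printf("Determinant = %f\n",P);
}

For the data shown the program prints 1.000000, in agreement with |B| = 1.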

5.13 Vector and Matrix Norms

The norm of a vector is the size or length of that vector. The norm of a vector x is denoted by \|x\|. It is a real number satisfying the following conditions:

(i) \|x\| ≥ 0, and \|x\| = 0 iff x = 0                               (5.46)
(ii) \|\alpha x\| = |\alpha|\,\|x\| for any real scalar \alpha        (5.47)
(iii) \|x + y\| ≤ \|x\| + \|y\|  (triangle inequality).               (5.48)

Let x = (x_1, x_2, \ldots, x_n)^t be any vector. The commonly used norms are

(i) \|x\|_1 = \sum_{i=1}^{n} |x_i|                                    (5.49)
(ii) \|x\|_2 = \sqrt{\sum_{i=1}^{n} |x_i|^2}  (Euclidean norm)        (5.50)
(iii) \|x\|_\infty = \max_i |x_i|  (maximum norm or uniform norm).    (5.51)

Let A and B be two matrices such that A + B and AB are defined. The norm of a matrix A = [a_{ij}], denoted by \|A\|, satisfies the following conditions:

(i) \|A\| ≥ 0, and \|A\| = 0 iff A = 0                                (5.52)
(ii) \|\alpha A\| = |\alpha|\,\|A\|, \alpha a real scalar              (5.53)
(iii) \|A + B\| ≤ \|A\| + \|B\|                                        (5.54)
(iv) \|AB\| ≤ \|A\|\,\|B\|.                                            (5.55)

From (5.55), it can be verified that

\|A^k\| ≤ \|A\|^k,                                                     (5.56)

for any positive integer k.


Like the vector norms, the matrix norms may be defined as

(i) \|A\|_1 = \max_j \sum_i |a_{ij}|  (the column norm)                (5.57)
(ii) \|A\|_2 = \sqrt{\sum_i \sum_j |a_{ij}|^2}  (the Euclidean norm)   (5.58)
(iii) \|A\|_\infty = \max_i \sum_j |a_{ij}|  (the row norm).           (5.59)

The Euclidean norm is also known as Erhard-Schmidt norm or Schur norm or


the Frobenius norm.
The concept of matrix norm is used to study the stability of a system of equations.
It is also used to study the convergence of iterative methods to solve the linear system
of equations.
Example 5.13.1 Find the matrix norms \|A\|_1, \|A\|_2 and \|A\|_\infty for the matrix
A = \begin{bmatrix} 2 & 3 & 4\\ 0 & −1 & 5\\ 3 & 2 & 6 \end{bmatrix}.

Solution.
\|A\|_1 = \max\{2 + 0 + 3, 3 + 1 + 2, 4 + 5 + 6\} = 15
\|A\|_2 = \sqrt{2^2 + 3^2 + 4^2 + 0^2 + (−1)^2 + 5^2 + 3^2 + 2^2 + 6^2} = \sqrt{104}
\|A\|_\infty = \max\{2 + 3 + 4, 0 + 1 + 5, 3 + 2 + 6\} = 11.
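The three matrix norms are straightforward to compute; the following minimal C sketch (not one of the book's numbered programs) evaluates (5.57)-(5.59) for the matrix of Example 5.13.1.

/* Sketch: column, Euclidean and row norms of a matrix. */
#include<stdio.h>
#include<math.h>
void main()
{
    float a[4][4]={{0},{0,2,3,4},{0,0,-1,5},{0,3,2,6}};
    float n1=0,n2=0,ninf=0,s; int i,j,n=3;
    for(j=1;j<=n;j++)            /* column norm (5.57) */
    {
        s=0; for(i=1;i<=n;i++) s+=fabs(a[i][j]);
        if(s>n1) n1=s;
    }
    for(i=1;i<=n;i++)            /* row norm (5.59) */
    {
        s=0; for(j=1;j<=n;j++) s+=fabs(a[i][j]);
        if(s>ninf) ninf=s;
    }
    for(i=1;i<=n;i++)            /* Euclidean norm (5.58) */
        for(j=1;j<=n;j++) n2+=a[i][j]*a[i][j];
    n2=sqrt(n2);
    printf("||A||1=%f  ||A||2=%f  ||A||inf=%f\n",n1,n2,ninf);
}

The output gives 15, \sqrt{104} ≈ 10.198 and 11, as in the example.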

5.14 Ill-Conditioned Linear Systems

Before introduction of ill-conditioned system, let us consider the following system of


equations.

x + 3y = 4
\frac{1}{3}x + y = 1.33.                    (5.60)
Note that this system of equations has no solution. But, if we take the approximate
value of 13 as 0.3 then (5.60) becomes

x + 3y = 4
0.3x + y = 1.33. (5.61)

The solution of (5.61) is x = 0.1, y = 1.3.


If the approximation of 13 is taken as 0.33 then the solution of the system of equations

x + 3y = 4
0.33x + y = 1.33 (5.62)

is x = 1, y = 1.
The approximations 0.333 and 0.3333 of \frac{1}{3} give the following systems

x + 3y = 4
0.333x + y = 1.33. (5.63)

whose solution is x = 10, y = −2 and

x + 3y = 4
0.3333x + y = 1.33. (5.64)

with solution x = 100, y = −32.


The systems (5.60)-(5.64) and their solutions indicate a dangerous situation: the different approximations of \frac{1}{3} produce widely different solutions.
What is conclusion about the above systems? We may conclude that the systems are
unstable. That is, a small change in the coefficients of the system produces large change
in the solution. These systems are called ill-conditioned or ill-posed system. On the
other hand, if the change in the solution is small for small changes in the coefficients,
then the system is called well-conditioned or well-posed system.
Let the system of equations be

Ax = b. (5.65)

Let A and b be matrices obtained from A and b by introducing small changes in


A and b and let y be the solution of the new system. That is,

A y = b . (5.66)

The system (5.65) is called ill-conditioned when the changes in y are too large com-
pared to those in x. Otherwise, the system is called well-conditioned. If a system is
ill-conditioned then the corresponding coefficient matrix is called an ill-conditioned
matrix.
The system (5.62) is ill-conditioned and the corresponding coefficient matrix \begin{bmatrix} 1 & 3\\ 0.33 & 1 \end{bmatrix} is an ill-conditioned matrix.
Generally, ill-conditioning occurs when |A| is small. Different methods are available to measure the ill-condition of a matrix; one useful method is introduced here.

The quantity Cond(A), called the condition number of the matrix, defined by

Cond(A) = \|A\|\,\|A^{−1}\|,                    (5.67)

where \|A\| is any matrix norm, gives a measure of the condition of the matrix A. A large value of Cond(A) indicates ill-conditioning of the matrix or of the associated system of equations.

Let A = \begin{bmatrix} 1 & 3\\ 0.33 & 1 \end{bmatrix} and B = \begin{bmatrix} 4 & 3\\ 3 & 5 \end{bmatrix} be two matrices.
Then A^{−1} = \begin{bmatrix} 100 & −300\\ −33 & 100 \end{bmatrix} and B^{−1} = \frac{1}{11}\begin{bmatrix} 5 & −3\\ −3 & 4 \end{bmatrix}.

The Euclidean norms are \|A\|_2 = \sqrt{1 + 9 + 0.10890 + 1} = 3.3330 and \|A^{−1}\|_2 = 333.3001.
Therefore, Cond(A) = \|A\|_2 × \|A^{−1}\|_2 = 1110.88945, a very large number; hence A is ill-conditioned.
On the other hand, \|B\|_2 = 7.68115 and \|B^{−1}\|_2 = 0.69829, so Cond(B) = 5.36364, a relatively small quantity.
Therefore, B is a well-conditioned matrix.

Another indicator of an ill-conditioned matrix is presented below.
Let A = [a_{ij}] be the matrix and r_i = \Big( \sum_{j=1}^{n} a_{ij}^2 \Big)^{1/2}. The quantity

\nu(A) = \frac{|A|}{r_1 r_2 \cdots r_n}                    (5.68)

measures the smallness of the determinant of A. If \nu is very small compared to 1, then the matrix A is ill-conditioned; otherwise A is well-conditioned.

For the matrix A = \begin{bmatrix} 1 & 3\\ 0.33 & 1 \end{bmatrix}, r_1 = \sqrt{10}, r_2 = 1.05304, |A| = 0.01, so
\nu(A) = \frac{0.01}{\sqrt{10} × 1.05304} = 0.003;
and for B = \begin{bmatrix} 4 & 3\\ 3 & 5 \end{bmatrix}, r_1 = 5, r_2 = \sqrt{34}, |B| = 11, so
\nu(B) = \frac{11}{5\sqrt{34}} = 0.37730.
Hence A is ill-conditioned while B is well-conditioned.
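For a 2 × 2 matrix the indicator (5.68) takes only a few lines of C; the sketch below (illustrative only, not one of the book's numbered programs) reproduces \nu(A) ≈ 0.003 for the ill-conditioned matrix A above. The rounding error incurred in the tiny determinant is itself a symptom of the difficulty.

/* Sketch: the indicator nu(A) of (5.68) for a 2x2 matrix. */
#include<stdio.h>
#include<math.h>
void main()
{
    float a[2][2]={{1,3},{0.33,1}};
    float det=a[0][0]*a[1][1]-a[0][1]*a[1][0];  /* |A| = 0.01 */
    float r1=sqrt(a[0][0]*a[0][0]+a[0][1]*a[0][1]);
    float r2=sqrt(a[1][0]*a[1][0]+a[1][1]*a[1][1]);
    printf("nu(A) = %f\n",fabs(det)/(r1*r2));
}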

5.14.1 Method to solve ill-conditioned system


Some methods are available to solve an ill-conditioned system of linear equations. One
straight forward technique is to carry out the calculations with more number of signif-
icant digits. But, the calculations with more significant digits is time-consuming. One
suggested method is to improve upon the accuracy of the approximate solution by an
iterative method. This iterative method is discussed below.
330 Numerical Analysis

Let the system of equations be



n
aij xj = bi , i = 1, 2, . . . , n. (5.69)
j=1

Let x 12 , . . . , x
11 , x 1n be an approximate solution of (5.69). Since this is an approximate
 n
solution, aij x 1j is not necessarily equal to bi . Let bi = 1bi for this approximate solution.
j=1
Then, for this solution, (5.69) becomes

n
1j = 1bi , i = 1, 2, . . . , n.
aij x (5.70)
j=1

Subtracting (5.70) from (5.69), we obtain



n
1j ) = (bi − 1bi )
aij (xj − x
j=1
n
i.e., aij εi = di (5.71)
j=1

where εi = xi − x 1i , di = bi − 1bi , i = 1, 2, . . . , n.
Now, the solution for εi ’s is obtained by solving the system (5.71). Hence the new
solution is given by xi = εi + x 1i and these values are better approximations to xi ’s. This
technique can be repeated again to improve the accuracy.
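The following C sketch illustrates this iterative improvement on the ill-conditioned system (5.62). It is an illustration, not one of the book's numbered programs; the helper solve() is a plain Gauss elimination without pivoting, included only to keep the sketch self-contained. Each sweep forms the residual d = b − A\tilde{x}, solves the correction system (5.71), and adds the correction.

/* Sketch: iterative improvement of an approximate solution. */
#include<stdio.h>
#define N 2
void solve(float a[N][N],float b[N],float x[N])
{   /* Gauss elimination without pivoting, on copies of a and b */
    float m,aa[N][N],bb[N]; int i,j,k;
    for(i=0;i<N;i++){bb[i]=b[i];for(j=0;j<N;j++)aa[i][j]=a[i][j];}
    for(k=0;k<N-1;k++)
        for(i=k+1;i<N;i++)
        {
            m=aa[i][k]/aa[k][k];
            for(j=k;j<N;j++) aa[i][j]-=m*aa[k][j];
            bb[i]-=m*bb[k];
        }
    for(i=N-1;i>=0;i--)
    {
        x[i]=bb[i];
        for(j=i+1;j<N;j++) x[i]-=aa[i][j]*x[j];
        x[i]/=aa[i][i];
    }
}
void main()
{
    float a[N][N]={{1,3},{0.33,1}},b[N]={4,1.33};
    float x[N],r[N],e[N]; int i,j,it;
    solve(a,b,x);                /* first approximate solution */
    for(it=0;it<3;it++)          /* a few refinement sweeps */
    {
        for(i=0;i<N;i++)         /* residual d_i = b_i - sum a_ij x~_j */
        {
            r[i]=b[i];
            for(j=0;j<N;j++) r[i]-=a[i][j]*x[j];
        }
        solve(a,r,e);            /* correction system (5.71) */
        for(i=0;i<N;i++) x[i]+=e[i];
    }
    printf("x = %f, y = %f\n",x[0],x[1]);
}

In exact arithmetic the corrections would vanish; in single precision each sweep reduces the rounding error of the computed solution.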

5.15 Generalized Inverse (g-inverse)

The conventional matrix inverse (discussed in Section 5.3) is widely used in many areas of science and engineering. It is also well known that conventional inverses can be determined only for square non-singular matrices. But in many areas, such as statistics and data analysis, some kind of weak inverse of a singular square or rectangular matrix is essential. Such inverses are known as generalized inverses or g-inverses, on which a good deal of work has been done during the last three decades. The generalized inverse of an m × n matrix A is a matrix X of size n × m; different types of generalized inverses have been defined by various authors. The following matrix equations are used to classify the different types of generalized inverses of the matrix A:

(i) AXA = A
(ii) XAX = X
(iii) AX = (AX)*                                    (5.72)
(iv) XA = (XA)*,  (* denotes the conjugate transpose).

The matrix X is called

(a) a generalized inverse of A, denoted by A^−, if (i) holds;
(b) a reflexive generalized inverse of A, denoted by A_r^−, if (i) and (ii) hold;
(c) a minimum norm inverse of A, denoted by A_m^−, if (i) and (iv) hold;
(d) a least-squares inverse of A, denoted by A_l^−, if (i) and (iii) hold;
(e) the Moore-Penrose inverse of A, denoted by A^+, if (i), (ii), (iii) and (iv) all hold.

Only the Moore-Penrose inverse A^+ is unique; the other inverses are not unique. One interesting result is that when A is non-singular, all these inverses reduce to A^{−1}. Due to its uniqueness, the generalized inverse A^+ is widely used.
Some important properties are presented below:

(i) (A^+)^+ = A;                                                   (5.73)
(ii) (A^+)^t = (A^t)^+;                                            (5.74)
(iii) A^{−1} = A^+, if A is non-singular;                          (5.75)
(iv) in general, (AB)^+ ≠ B^+A^+.                                  (5.76)

Let A be a matrix of order m × n.

(v) If the rank of A is 0, then A^+ is a null matrix of order n × m.        (5.77)
(vi) If the rank of A is 1, then A^+ = \frac{1}{trace(AA^t)}A^t.             (5.78)
(vii) If the rank of A is n, then A^+ = (A^tA)^{−1}A^t.                      (5.79)
(viii) If the rank of A is m, then A^+ = A^t(AA^t)^{−1}.                     (5.80)

The g-inverse A^+ of the matrix A is used to solve the system of equations Ax = b (b ≠ 0), where A is an m × n matrix, and x and b are respectively n × 1 and m × 1 vectors. The solution of Ax = b is given by

x = A^+b                                    (5.81)

and this solution is known as the minimum norm^2 and least-squares^3 solution.

5.15.1 Greville’s algorithm for Moore-Penrose inverse

Greville4 presented a recursive algorithm to find the Moore-Penrose inverse of a matrix.


2. The Euclidean norm \|x\| = \sqrt{x^*x} is minimum for any choice of arbitrary inverse.
3. A least-squares solution minimizes \|Ax − b\| for an inconsistent system.
4. For the proof of the algorithm see Greville, T.N.E., The pseudo-inverse of a rectangular or singular matrix and its application to the solution of system of linear equations, SIAM Review, 1 (1959) 38-43.
Let

A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1k} & \cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2k} & \cdots & a_{2n}\\ \cdots & \cdots & \cdots & \cdots & \cdots & \cdots\\ a_{m1} & a_{m2} & \cdots & a_{mk} & \cdots & a_{mn} \end{bmatrix}
= (\alpha_1\ \alpha_2\ \ldots\ \alpha_k\ \ldots\ \alpha_n)          (5.82)

where \alpha_k = (a_{1k}, a_{2k}, \ldots, a_{mk})^t is the kth column of the matrix A.
Also, let A_k be the matrix formed by the first k columns, i.e., A_k = (\alpha_1\ \alpha_2\ \ldots\ \alpha_k). Then A_k = (A_{k−1}\ \alpha_k).
The proposed algorithm is recursive, and the recursion is started with

A_1^+ = 0 if \alpha_1 = 0 (null column), else A_1^+ = (\alpha_1^t\alpha_1)^{−1}\alpha_1^t.    (5.83)

Let the column vectors be

\delta_k = A_{k−1}^+\alpha_k                                        (5.84)
and \gamma_k = \alpha_k − A_{k−1}\delta_k.                          (5.85)

If \gamma_k ≠ 0, then compute

\beta_k = \gamma_k^+ = (\gamma_k^t\gamma_k)^{−1}\gamma_k^t;         (5.86)
else \beta_k = (1 + \delta_k^t\delta_k)^{−1}\delta_k^tA_{k−1}^+.    (5.87)

Then, the matrix A_k^+ is given by

A_k^+ = \begin{bmatrix} A_{k−1}^+ − \delta_k\beta_k\\ \beta_k \end{bmatrix}.    (5.88)

The process is repeated for k = 1, 2, \ldots, n.
The process is repeated for k = 1, 2, . . . , n.


Example 5.15.1 Obtain the g-inverse (Moore-Penrose inverse) of

A = \begin{bmatrix} 2 & 1 & 0 & 1\\ 1 & 0 & −1 & 0\\ 0 & 1 & 1 & 2 \end{bmatrix}

and use it to solve the following system of equations:

2x_1 + x_2 + x_4 = 4
x_1 − x_3 = 0
x_2 + x_3 + 2x_4 = 4.
Solution. Here \alpha_1 = (2, 1, 0)^t, \alpha_2 = (1, 0, 1)^t, \alpha_3 = (0, −1, 1)^t, \alpha_4 = (1, 0, 2)^t, and A_1 = \alpha_1.

A_1^+ = (\alpha_1^t\alpha_1)^{−1}\alpha_1^t = \frac{1}{5}(2\ 1\ 0) = (\tfrac{2}{5}\ \tfrac{1}{5}\ 0).

\delta_2 = A_1^+\alpha_2 = \tfrac{2}{5};
\gamma_2 = \alpha_2 − A_1\delta_2 = (1, 0, 1)^t − \tfrac{2}{5}(2, 1, 0)^t = (\tfrac{1}{5}, −\tfrac{2}{5}, 1)^t ≠ 0.
Hence \beta_2 = \gamma_2^+ = (\gamma_2^t\gamma_2)^{−1}\gamma_2^t = \tfrac{5}{6}(\tfrac{1}{5}\ −\tfrac{2}{5}\ 1) = (\tfrac{1}{6}\ −\tfrac{1}{3}\ \tfrac{5}{6});
\delta_2\beta_2 = (\tfrac{1}{15}\ −\tfrac{2}{15}\ \tfrac{1}{3}).

A_2^+ = \begin{bmatrix} A_1^+ − \delta_2\beta_2\\ \beta_2 \end{bmatrix}
      = \begin{bmatrix} 1/3 & 1/3 & −1/3\\ 1/6 & −1/3 & 5/6 \end{bmatrix}.

Now, \delta_3 = A_2^+\alpha_3 = \begin{bmatrix} 1/3 & 1/3 & −1/3\\ 1/6 & −1/3 & 5/6 \end{bmatrix}\begin{bmatrix} 0\\ −1\\ 1 \end{bmatrix} = \begin{bmatrix} −2/3\\ 7/6 \end{bmatrix};
\gamma_3 = \alpha_3 − A_2\delta_3 = \begin{bmatrix} 0\\ −1\\ 1 \end{bmatrix} − \begin{bmatrix} −1/6\\ −2/3\\ 7/6 \end{bmatrix} = \begin{bmatrix} 1/6\\ −1/3\\ −1/6 \end{bmatrix} ≠ 0.
Hence \beta_3 = (\gamma_3^t\gamma_3)^{−1}\gamma_3^t = 6(\tfrac{1}{6}\ −\tfrac{1}{3}\ −\tfrac{1}{6}) = (1\ −2\ −1);
\delta_3\beta_3 = \begin{bmatrix} −2/3 & 4/3 & 2/3\\ 7/6 & −7/3 & −7/6 \end{bmatrix}.

A_3^+ = \begin{bmatrix} A_2^+ − \delta_3\beta_3\\ \beta_3 \end{bmatrix}
      = \begin{bmatrix} 1 & −1 & −1\\ −1 & 2 & 2\\ 1 & −2 & −1 \end{bmatrix}.

Now, \delta_4 = A_3^+\alpha_4 = \begin{bmatrix} 1 & −1 & −1\\ −1 & 2 & 2\\ 1 & −2 & −1 \end{bmatrix}\begin{bmatrix} 1\\ 0\\ 2 \end{bmatrix} = \begin{bmatrix} −1\\ 3\\ −1 \end{bmatrix};
\gamma_4 = \alpha_4 − A_3\delta_4 = \begin{bmatrix} 1\\ 0\\ 2 \end{bmatrix} − \begin{bmatrix} 1\\ 0\\ 2 \end{bmatrix} = 0 (the null column vector).

So
\beta_4 = (1 + \delta_4^t\delta_4)^{−1}\delta_4^tA_3^+ = \frac{1}{12}(−1\ 3\ −1)\begin{bmatrix} 1 & −1 & −1\\ −1 & 2 & 2\\ 1 & −2 & −1 \end{bmatrix} = (−\tfrac{5}{12}\ \tfrac{3}{4}\ \tfrac{2}{3});
\delta_4\beta_4 = \begin{bmatrix} 5/12 & −3/4 & −2/3\\ −5/4 & 9/4 & 2\\ 5/12 & −3/4 & −2/3 \end{bmatrix}.

A_4^+ = \begin{bmatrix} A_3^+ − \delta_4\beta_4\\ \beta_4 \end{bmatrix}
      = \begin{bmatrix} 7/12 & −1/4 & −1/3\\ 1/4 & −1/4 & 0\\ 7/12 & −5/4 & −1/3\\ −5/12 & 3/4 & 2/3 \end{bmatrix} = A^+.

The given system is Ax = b with b = (4, 0, 4)^t, and its solution is x = A^+b. Therefore,

x = \begin{bmatrix} 7/12 & −1/4 & −1/3\\ 1/4 & −1/4 & 0\\ 7/12 & −5/4 & −1/3\\ −5/12 & 3/4 & 2/3 \end{bmatrix}\begin{bmatrix} 4\\ 0\\ 4 \end{bmatrix} = \begin{bmatrix} 1\\ 1\\ 1\\ 1 \end{bmatrix},

and the required solution is
x_1 = 1, x_2 = 1, x_3 = 1, x_4 = 1.

Note 5.15.1 It may be verified that A^+ satisfies all the conditions (5.72). Again, A_3^+ = A_3^{−1} since |A_3| = 1, i.e., A_3 is non-singular. In addition, for this A, AA^+ = I_3, but A^+A ≠ I_4, the unit matrix of order 4.

The g-inverse A+ of A can also be computed using the formula (5.80).

5.16 Least Squares Solution for Inconsistent Systems

Let the system of linear equations be

Ax = b (5.89)

where A, x and b are of order m × n, n × 1 and m × 1 respectively. Here we assume that (5.89) is inconsistent, so the system has no solution. Again, since there may be more than one x_l (least-squares solution) for which \|Ax − b\| is minimum, there exists one such x_l, say x_m, whose norm is minimum. That is, x_m is called the minimum norm least squares solution if

\|x_m\| ≤ \|x_l\|                                    (5.90)

for any x_l such that

\|Ax_l − b\| ≤ \|Ax − b\| for all x.                 (5.91)


Solution of System of Linear Equations 335

The minimum norm least squares solution can be determined using the relation

x = A+ b. (5.92)

Since A+ is unique, the minimum norm least squares solution is unique.


The least squares solution can also be determined in the following way.
The vector Ax − b, in terms of the elements of A, x and b, is

\begin{bmatrix} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n − b_1\\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n − b_2\\ \cdots\\ a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n − b_m \end{bmatrix}.

Let the square of \|Ax − b\| be denoted by S. Then

S = \sum_{i=1}^{m}\Big( \sum_{j=1}^{n} a_{ij}x_j − b_i \Big)^2.    (5.93)

Also, S is called the sum of square of residues. To solve (5.89), find x = (x1 , x2 , . . . , xn )t
from (5.93) in such a way that S is minimum. The conditions for S to be minimum are

∂S ∂S ∂S
= 0, = 0, · · · , =0 (5.94)
∂x1 ∂x2 ∂xn

The system (5.94) is non-homogeneous and consists of n equations with n unknowns


x1 , x2 , . . . , xn . This system can be solved by any method. Let x1 = x∗1 , x2 = x∗2 , . . . , xn =
x∗n be a solution of (5.94). Therefore, the least-squares solution of (5.89) is

x∗ = (x∗1 , x∗2 , . . . , x∗n )t (5.95)

and the sum of squares of residues is given by

S^* = \sum_{i=1}^{m}\Big( \sum_{j=1}^{n} a_{ij}x_j^* − b_i \Big)^2.    (5.96)

This method is not suitable for a large system of equations, while the method stated
in equation (5.92) is applicable for a large system also.
Example 5.16.1 Find the g-inverse of the singular matrix A = \begin{bmatrix} 3 & 6\\ 2 & 4 \end{bmatrix} and hence find a least squares solution of the inconsistent system

3x + 6y = 9
2x + 4y = 5.

Solution. Let \alpha_1 = \begin{bmatrix} 3\\ 2 \end{bmatrix}, \alpha_2 = \begin{bmatrix} 6\\ 4 \end{bmatrix}, A_1 = \alpha_1.
A_1^+ = (\alpha_1^t\alpha_1)^{−1}\alpha_1^t = \frac{1}{13}(3\ 2) = (\tfrac{3}{13}\ \tfrac{2}{13});
\delta_2 = A_1^+\alpha_2 = \tfrac{18}{13} + \tfrac{8}{13} = 2;
\gamma_2 = \alpha_2 − A_1\delta_2 = \begin{bmatrix} 6\\ 4 \end{bmatrix} − 2\begin{bmatrix} 3\\ 2 \end{bmatrix} = \begin{bmatrix} 0\\ 0 \end{bmatrix} (a null vector);
\beta_2 = (1 + \delta_2^t\delta_2)^{−1}\delta_2^tA_1^+ = \tfrac{1}{5}\cdot 2\cdot(\tfrac{3}{13}\ \tfrac{2}{13}) = (\tfrac{6}{65}\ \tfrac{4}{65});
\delta_2\beta_2 = (\tfrac{12}{65}\ \tfrac{8}{65}).

Therefore,
A_2^+ = \begin{bmatrix} A_1^+ − \delta_2\beta_2\\ \beta_2 \end{bmatrix} = \begin{bmatrix} 3/65 & 2/65\\ 6/65 & 4/65 \end{bmatrix} = A^+,
which is the g-inverse of A.

Second Part. The given equations can be written as Ax = b, where A = \begin{bmatrix} 3 & 6\\ 2 & 4 \end{bmatrix}, x = \begin{bmatrix} x\\ y \end{bmatrix}, b = \begin{bmatrix} 9\\ 5 \end{bmatrix}.
Then the least squares solution is given by

x = A^+b = \frac{1}{65}\begin{bmatrix} 3 & 2\\ 6 & 4 \end{bmatrix}\begin{bmatrix} 9\\ 5 \end{bmatrix} = \frac{1}{65}\begin{bmatrix} 37\\ 74 \end{bmatrix}.

Hence the least squares solution is x = \frac{37}{65}, y = \frac{74}{65}.
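Since the matrix of this example has rank 1, its g-inverse can be checked independently against property (5.78), A^+ = A^t/trace(AA^t). The short C sketch below (illustrative only, not one of the book's numbered programs) does exactly this; its output 3/65, 2/65, 6/65, 4/65 agrees with the A^+ found above.

/* Sketch: rank-1 g-inverse via property (5.78). */
#include<stdio.h>
void main()
{
    float a[2][2]={{3,6},{2,4}},t=0,ap[2][2]; int i,j;
    for(i=0;i<2;i++)             /* trace(A A^t) = sum of squares = 65 */
        for(j=0;j<2;j++) t+=a[i][j]*a[i][j];
    for(i=0;i<2;i++)             /* A^+ = A^t / trace(A A^t) */
        for(j=0;j<2;j++) ap[i][j]=a[j][i]/t;
    for(i=0;i<2;i++) printf("%f %f\n",ap[i][0],ap[i][1]);
}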

Example 5.16.2 Find the least squares solution of the following equations x + y =
3.0, 2x − y = 0.03, x + 3y = 7.03, and 3x + y = 4.97. Also, estimate the residue.

Solution. Let x, y be the least squares solution of the given system. Then the sum of squares of residues is

S = (x + y − 3.0)^2 + (2x − y − 0.03)^2 + (x + 3y − 7.03)^2 + (3x + y − 4.97)^2.

We choose x and y in such a way that S is minimum. Therefore,
\frac{\partial S}{\partial x} = 0 and \frac{\partial S}{\partial y} = 0.
That is,
2(x + y − 3.0) + 4(2x − y − 0.03) + 2(x + 3y − 7.03) + 6(3x + y − 4.97) = 0
and 2(x + y − 3.0) − 2(2x − y − 0.03) + 6(x + 3y − 7.03) + 2(3x + y − 4.97) = 0.

These equations reduce to 3x + y − 5 = 0 and 5x + 12y − 29.03 = 0.
The solution of these equations is x = 30.97/31 = 0.9990322 and y = 62.09/31 = 2.0029032, which is the required least squares solution of the given system.
The sum of squares of residues is S = (3.0019354 − 3.0)^2 + (−0.0048388 − 0.03)^2 + (7.0077418 − 7.03)^2 + (4.9999998 − 4.97)^2 = 0.0026129.
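For a small system the normal equations of this example can be formed and solved mechanically. The following C sketch (an illustration, not one of the book's numbered programs) builds A^tA and A^tb for the four given equations and solves the resulting 2 × 2 system by Cramer's rule, reproducing 3x + y = 5, 5x + 12y = 29.03 and the solution above.

/* Sketch: least squares via the normal equations A^t A x = A^t b. */
#include<stdio.h>
void main()
{
    float a[4][2]={{1,1},{2,-1},{1,3},{3,1}};
    float b[4]={3.0,0.03,7.03,4.97};
    float g[2][2]={{0,0},{0,0}},h[2]={0,0},det,x,y;
    int i,j,k;
    for(i=0;i<2;i++)             /* g = A^t A */
        for(j=0;j<2;j++)
            for(k=0;k<4;k++) g[i][j]+=a[k][i]*a[k][j];
    for(i=0;i<2;i++)             /* h = A^t b */
        for(k=0;k<4;k++) h[i]+=a[k][i]*b[k];
    det=g[0][0]*g[1][1]-g[0][1]*g[1][0];   /* Cramer's rule for 2x2 */
    x=(h[0]*g[1][1]-h[1]*g[0][1])/det;
    y=(g[0][0]*h[1]-g[1][0]*h[0])/det;
    printf("x = %f, y = %f\n",x,y);
}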

Iteration Methods

If the system of equations has a large number of variables, then the direct methods
are not much suitable. In this case, the approximate numerical methods are used to
determine the variables of the system.
The approximate methods for solving system of linear equations make it possible to
obtain the values of the roots of the system with the specified accuracy as the limit of
the sequence of some vectors. The process of constructing such a sequence is known as
the iterative process.
The efficiency of the application of approximate methods depends on the choice of
the initial vector and the rate of convergence of the process.
The following two approximate methods are widely used to solve a system of linear
equations:
(i) method of iteration (Jacobi’s iteration method), and
(ii) Gauss-Seidal’s iteration method.
Before presenting the iteration methods, some terms are introduced to analyse the
methods.
Let x_i^{(k)}, i = 1, 2, \ldots, n be the kth (k = 1, 2, \ldots) iterated value of the variable x_i and x^{(k)} = (x_1^{(k)}, x_2^{(k)}, \ldots, x_n^{(k)})^t be the solution vector obtained at the kth iteration.
The sequence \{x^{(k)}\}, k = 1, 2, \ldots is said to converge to a vector x = (x_1, x_2, \ldots, x_n)^t if for each i (= 1, 2, \ldots, n)

x_i^{(k)} → x_i as k → ∞.                                          (5.97)

Let \xi = (\xi_1, \xi_2, \ldots, \xi_n)^t be the exact solution of the system of linear equations. Then the error \varepsilon_i^{(k)} in the ith variable x_i committed at the kth iteration is given by

\varepsilon_i^{(k)} = \xi_i − x_i^{(k)}.                           (5.98)

The error vector \varepsilon^{(k)} at the kth iteration is then given by

\varepsilon^{(k)} = (\varepsilon_1^{(k)}, \varepsilon_2^{(k)}, \ldots, \varepsilon_n^{(k)})^t.   (5.99)

The error difference e^{(k)} between two consecutive iterations is given by

e^{(k)} = x^{(k+1)} − x^{(k)} = \varepsilon^{(k)} − \varepsilon^{(k+1)},     (5.100)

where e_i^{(k)} = x_i^{(k+1)} − x_i^{(k)}.
An iteration method is said to be of order p ≥ 1 if there exists a positive constant A such that for all k

\|\varepsilon^{(k+1)}\| ≤ A\|\varepsilon^{(k)}\|^p.                (5.101)

5.17 Jacobi’s Iteration Method

Let us consider a system of n linear equations containing n variables:

a11 x1 + a12 x2 + · · · + a1n xn = b1


a21 x1 + a22 x2 + · · · + a2n xn = b2 (5.102)
··························· ··· ···
an1 x1 + an2 x2 + · · · + ann xn = bn .

Also, we assume that the quantities aii are pivot elements.


The above equations can be written as

x_1 = \frac{1}{a_{11}}(b_1 − a_{12}x_2 − a_{13}x_3 − \cdots − a_{1n}x_n)
x_2 = \frac{1}{a_{22}}(b_2 − a_{21}x_1 − a_{23}x_3 − \cdots − a_{2n}x_n)                    (5.103)
\cdots
x_n = \frac{1}{a_{nn}}(b_n − a_{n1}x_1 − a_{n2}x_2 − \cdots − a_{n\,n−1}x_{n−1}).

Let x_1^{(0)}, x_2^{(0)}, \ldots, x_n^{(0)} be the initial guesses to the variables x_1, x_2, \ldots, x_n respectively (the initial guesses may be taken as zeros). Substituting these values in the right hand side of (5.103) yields the first approximation as follows:

x_1^{(1)} = \frac{1}{a_{11}}(b_1 − a_{12}x_2^{(0)} − a_{13}x_3^{(0)} − \cdots − a_{1n}x_n^{(0)})
x_2^{(1)} = \frac{1}{a_{22}}(b_2 − a_{21}x_1^{(0)} − a_{23}x_3^{(0)} − \cdots − a_{2n}x_n^{(0)})    (5.104)
\cdots
x_n^{(1)} = \frac{1}{a_{nn}}(b_n − a_{n1}x_1^{(0)} − a_{n2}x_2^{(0)} − \cdots − a_{n\,n−1}x_{n−1}^{(0)}).

Again, substituting x_1^{(1)}, x_2^{(1)}, \ldots, x_n^{(1)} in the right hand side of (5.103), we obtain the second approximation x_1^{(2)}, x_2^{(2)}, \ldots, x_n^{(2)}.
In general, if x_1^{(k)}, x_2^{(k)}, \ldots, x_n^{(k)} are the kth approximations to the roots, then the next approximations are given by

x_1^{(k+1)} = \frac{1}{a_{11}}(b_1 − a_{12}x_2^{(k)} − a_{13}x_3^{(k)} − \cdots − a_{1n}x_n^{(k)})
x_2^{(k+1)} = \frac{1}{a_{22}}(b_2 − a_{21}x_1^{(k)} − a_{23}x_3^{(k)} − \cdots − a_{2n}x_n^{(k)})   (5.105)
\cdots
x_n^{(k+1)} = \frac{1}{a_{nn}}(b_n − a_{n1}x_1^{(k)} − a_{n2}x_2^{(k)} − \cdots − a_{n\,n−1}x_{n−1}^{(k)}),
k = 0, 1, 2, \ldots.

The iteration process is continued until all the roots converge to the required number
of significant figures. This iteration method is called Jacobi’s iteration or simply the
method of iteration.
The Jacobi’s iteration method surely converges if the coefficient matrix is diagonally
dominant.

5.17.1 Convergence of Gauss-Jacobi’s iteration


The Gauss-Jacobi's iteration scheme (5.105) can also be written as

x_i^{(k+1)} = \frac{1}{a_{ii}}\Big( b_i − \sum_{j=1, j≠i}^{n} a_{ij}x_j^{(k)} \Big), \quad i = 1, 2, \ldots, n.   (5.106)

If \xi is the exact solution of the system, then

\xi_i = \frac{1}{a_{ii}}\Big( b_i − \sum_{j=1, j≠i}^{n} a_{ij}\xi_j \Big).              (5.107)

Therefore, equations (5.106) and (5.107) yield

\xi_i − x_i^{(k+1)} = −\frac{1}{a_{ii}}\sum_{j=1, j≠i}^{n} a_{ij}(\xi_j − x_j^{(k)}),
or \varepsilon_i^{(k+1)} = −\frac{1}{a_{ii}}\sum_{j=1, j≠i}^{n} a_{ij}\varepsilon_j^{(k)}.

That is,

|\varepsilon_i^{(k+1)}| ≤ \frac{1}{|a_{ii}|}\sum_{j=1, j≠i}^{n} |a_{ij}|\,|\varepsilon_j^{(k)}| ≤ \frac{1}{|a_{ii}|}\sum_{j=1, j≠i}^{n} |a_{ij}|\,\|\varepsilon^{(k)}\|.

Let A = \max_i \frac{1}{|a_{ii}|}\sum_{j=1, j≠i}^{n} |a_{ij}|.
Then the above relation becomes

\|\varepsilon^{(k+1)}\| ≤ A\|\varepsilon^{(k)}\|.                  (5.108)

This relation shows that the rate of convergence of Gauss-Jacobi's method is linear.
Again, \|\varepsilon^{(k+1)}\| ≤ A\|\varepsilon^{(k)}\| ≤ A^2\|\varepsilon^{(k−1)}\| ≤ \cdots ≤ A^{k+1}\|\varepsilon^{(0)}\|. That is,

\|\varepsilon^{(k)}\| ≤ A^k\|\varepsilon^{(0)}\|.                  (5.109)

If A < 1 then A^k → 0 as k → ∞ and consequently \|\varepsilon^{(k)}\| → 0 as k → ∞, i.e., the iteration converges.
Hence the sufficient condition for convergence of Gauss-Jacobi's method is A < 1, i.e., \sum_{j≠i} |a_{ij}| < |a_{ii}| for all i, i.e., the coefficient matrix is diagonally dominant.

The relation (5.108) can be written as

\|\varepsilon^{(k+1)}\| ≤ A\|\varepsilon^{(k)}\| = A\|e^{(k)} + \varepsilon^{(k+1)}\|  [by (5.100)]
                        ≤ A\|e^{(k)}\| + A\|\varepsilon^{(k+1)}\|,
or, \|\varepsilon^{(k+1)}\| ≤ \frac{A}{1 − A}\|e^{(k)}\|.          (5.110)

This relation gives the absolute error at the (k + 1)th iteration in terms of the error
difference at kth and (k + 1)th iterations.

Example 5.17.1 Solve the following system of linear equations by Gauss-Jacobi’s


method correct up to four decimal places and calculate the upper bound of absolute
errors.

27x + 6y − z = 54
6x + 15y + 2z = 72
x + y + 54z = 110.

Solution. Obviously, the system is diagonally dominant as


|6| + | − 1| < |27|, |6| + |2| < |15|, |1| + |1| < |54|.

The Gauss-Jacobi's iteration scheme is

x^{(k+1)} = \frac{1}{27}(54 − 6y^{(k)} + z^{(k)})
y^{(k+1)} = \frac{1}{15}(72 − 6x^{(k)} − 2z^{(k)})
z^{(k+1)} = \frac{1}{54}(110 − x^{(k)} − y^{(k)}).
Let the initial solution be (0, 0, 0). The next iterations are shown in the following
table.

k x y z
0 0 0 0
1 2.00000 4.80000 2.03704
2 1.00878 3.72839 1.91111
3 1.24225 4.14167 1.94931
4 1.15183 4.04319 1.93733
5 1.17327 4.08096 1.94083
6 1.16500 4.07191 1.93974
7 1.16697 4.07537 1.94006
8 1.16614 4.07454 1.93996
9 1.16640 4.07488 1.93999
10 1.16632 4.07477 1.93998
11 1.16635 4.07481 1.93998

The solution correct up to four decimal places is


x = 1.1664, y = 4.0748, z = 1.9400.
Here

A = \max_i \frac{1}{|a_{ii}|}\sum_{j≠i} |a_{ij}| = \max\Big\{\frac{7}{27}, \frac{8}{15}, \frac{2}{54}\Big\} = \frac{8}{15},

and e^{(0)} = (3 × 10^{−5}, 4 × 10^{−5}, 0), the difference between the last two iterates. Therefore, the upper bound of the absolute error is

\frac{A}{1 − A}\|e^{(0)}\| = 5.71 × 10^{−5}.
1−A

Algorithm 5.8 (Gauss-Jacobi’s). This algorithm finds the solution of a system


of linear equations by Gauss-Jacobi’s iteration method. The method will terminate
when |x_i^{(k+1)} − x_i^{(k)}| < ε, where ε is the supplied error tolerance, for all i.
342 Numerical Analysis

Algorithm Gauss Jacobi


Step 1. Read the coefficients aij , i, j = 1, 2, . . . , n and the right hand vector
bi , i = 1, 2, . . . , n of the system of equations and error tolerance ε.
Step 2. Rearrange the given equations, if possible, such that the system becomes
diagonally dominant.
Step 3. Rewrite the ith equation as
        x_i = \frac{1}{a_{ii}}\Big( b_i − \sum_{j≠i} a_{ij}x_j \Big), for i = 1, 2, . . . , n.
Step 4. Set the initial solution as
        x_i = 0, i = 1, 2, 3, . . . , n.
Step 5. Calculate the new values xn_i of x_i as
        xn_i = \frac{1}{a_{ii}}\Big( b_i − \sum_{j≠i} a_{ij}x_j \Big), for i = 1, 2, . . . , n.
Step 6. If |x_i − xn_i| < ε (ε is an error tolerance) for all i, then goto Step 7, else set x_i = xn_i for all i and goto Step 5.
Step 7. Print xn_i, i = 1, 2, . . . , n as solution.
end Gauss Jacobi
Program 5.8.
/*Program Gauss_Jacobi
Solution of a system of linear equations by Gauss-Jacobi’s iteration
method. Testing of diagonal dominance is also incorporated.*/
#include<stdio.h>
#include<math.h>
#include<stdlib.h>

void main()
{
float a[10][10],b[10],x[10],xn[10],epp=0.00001,sum;
int i,j,n,flag;
printf("Enter number of variables ");
scanf("%d",&n);
printf("\nEnter the coefficients rowwise ");
for(i=1;i<=n;i++)
for(j=1;j<=n;j++) scanf("%f",&a[i][j]);
printf("\nEnter right hand vector ");
for(i=1;i<=n;i++)
scanf("%f",&b[i]);
for(i=1;i<=n;i++) x[i]=0; /* initialize */

/* checking for row dominance */


flag=0;
for(i=1;i<=n;i++)
{
sum=0;
for(j=1;j<=n;j++)
if(i!=j) sum+=fabs(a[i][j]);
if(sum>fabs(a[i][i])) flag=1;
}
/* checking for column dominance */
if(flag==1)
{
flag=0;
for(j=1;j<=n;j++)
{
sum=0;
for(i=1;i<=n;i++)
if(i!=j) sum+=fabs(a[i][j]);
if(sum>fabs(a[j][j])) flag=1;
}
}

if(flag==1)
{
printf("The coefficient matrix is not diagonally dominant\n");
printf("The Gauss-Jacobi method does not converge surely");
exit(0);
}
for(i=1;i<=n;i++) printf(" x[%d] ",i);printf("\n");
do
{
for(i=1;i<=n;i++)
{
sum=b[i];
for(j=1;j<=n;j++)
if(j!=i) sum-=a[i][j]*x[j];
xn[i]=sum/a[i][i];
}
for(i=1;i<=n;i++) printf("%8.5f ",xn[i]);printf("\n");

flag=0; /* indicates |x[i]-xn[i]|<epp for all i */


for(i=1;i<=n;i++) if(fabs(x[i]-xn[i])>epp) flag=1;
if(flag==1) for(i=1;i<=n;i++) x[i]=xn[i]; /* reset x[i] */
}while(flag==1);

printf("Solution is \n");
for(i=1;i<=n;i++) printf("%8.5f ",xn[i]);
} /* main */
A sample of input/output:
Enter number of variables 3
Enter the coefficients rowwise
9 2 4
1 10 4
2 -4 10
Enter right hand vector
20 6 -15
x[1] x[2] x[3]
2.22222 0.60000 -1.50000
2.75556 0.97778 -1.70444
2.76247 1.00622 -1.66000
2.73640 0.98775 -1.65000
2.73606 0.98636 -1.65218
2.73733 0.98727 -1.65267
2.73735 0.98733 -1.65256
2.73729 0.98729 -1.65254
2.73729 0.98729 -1.65254
Solution is
2.73729 0.98729 -1.65254

5.18 Gauss-Seidal’s Iteration Method

A simple modification of Jacobi’s iteration sometimes give faster convergence. The


modified method is known as Gauss-Seidal’s iteration method.
Let us consider a system of n linear equations with n variables.

a11 x1 + a12 x2 + · · · + a1n xn = b1


a21 x1 + a22 x2 + · · · + a2n xn = b2 (5.111)
··························· ··· ···
an1 x1 + an2 x2 + · · · + ann xn = bn .

Assume that the coefficient matrix is diagonally dominant. If this is not the case, the system of equations is rearranged in such a way that this condition holds.
The equations (5.111) are rewritten in the following form:

x_1 = \frac{1}{a_{11}}(b_1 − a_{12}x_2 − a_{13}x_3 − \cdots − a_{1n}x_n)
x_2 = \frac{1}{a_{22}}(b_2 − a_{21}x_1 − a_{23}x_3 − \cdots − a_{2n}x_n)                    (5.112)
\cdots
x_n = \frac{1}{a_{nn}}(b_n − a_{n1}x_1 − a_{n2}x_2 − \cdots − a_{n\,n−1}x_{n−1}).
To solve these equations, an initial approximation x_2^{(0)}, x_3^{(0)}, \ldots, x_n^{(0)} for the variables x_2, x_3, \ldots, x_n is considered. Substituting these values into the first equation of (5.112) gives the first approximate value x_1^{(1)} of x_1. Next, substituting x_1^{(1)} for x_1 and x_3^{(0)}, x_4^{(0)}, \ldots, x_n^{(0)} for x_3, x_4, \ldots, x_n in the second equation of (5.112), we find x_2^{(1)}, the first approximate value of x_2. In general, substituting x_1^{(1)}, \ldots, x_{i−1}^{(1)}, x_{i+1}^{(0)}, \ldots, x_n^{(0)} into the ith equation of (5.112), we obtain x_i^{(1)}, and so on.

If x_i^{(k)}, i = 1, 2, \ldots, n is the kth approximate value of x_i, then the (k+1)th approximate values of x_1, x_2, \ldots, x_n are given by

x_1^{(k+1)} = \frac{1}{a_{11}}(b_1 − a_{12}x_2^{(k)} − a_{13}x_3^{(k)} − \cdots − a_{1n}x_n^{(k)})
x_2^{(k+1)} = \frac{1}{a_{22}}(b_2 − a_{21}x_1^{(k+1)} − a_{23}x_3^{(k)} − \cdots − a_{2n}x_n^{(k)})
\cdots                                                                                       (5.113)
x_i^{(k+1)} = \frac{1}{a_{ii}}(b_i − a_{i1}x_1^{(k+1)} − \cdots − a_{i\,i−1}x_{i−1}^{(k+1)} − a_{i\,i+1}x_{i+1}^{(k)} − \cdots − a_{in}x_n^{(k)})
\cdots
x_n^{(k+1)} = \frac{1}{a_{nn}}(b_n − a_{n1}x_1^{(k+1)} − a_{n2}x_2^{(k+1)} − \cdots − a_{n\,n−1}x_{n−1}^{(k+1)}),
k = 0, 1, 2, \ldots.

That is,

x_i^{(k+1)} = \frac{1}{a_{ii}}\Big( b_i − \sum_{j=1}^{i−1} a_{ij}x_j^{(k+1)} − \sum_{j=i+1}^{n} a_{ij}x_j^{(k)} \Big), \quad i = 1, 2, \ldots, n and k = 0, 1, 2, \ldots.

The method is repeated until |x_i^{(k+1)} − x_i^{(k)}| < ε for all i = 1, 2, \ldots, n, where ε > 0 is a pre-assigned number called the error tolerance. This method is called Gauss-Seidal's iteration method.

Example 5.18.1 Solve the following system of equations by Gauss-Seidal’s itera-


tion method, correct up to four decimal places.

27x + 6y − z = 54
6x + 15y + 2z = 72
x + y + 54z = 110

Solution. The iteration scheme is

x^{(k+1)} = \frac{1}{27}(54 − 6y^{(k)} + z^{(k)})
y^{(k+1)} = \frac{1}{15}(72 − 6x^{(k+1)} − 2z^{(k)})
z^{(k+1)} = \frac{1}{54}(110 − x^{(k+1)} − y^{(k+1)}).
Let y = 0, z = 0 be the initial solution. The successive iterations are shown below.

k x y z
0 − 0 0
1 2.00000 4.00000 1.92593
2 1.18244 4.07023 1.93977
3 1.16735 4.07442 1.93997
4 1.16642 4.07477 1.93998
5 1.16635 4.07480 1.93998
6 1.16634 4.07480 1.93998

The solution correct up to four decimal places is x = 1.1663, y = 4.0748, z = 1.9400.

Note 5.18.1 This solution is achieved in eleven iterations using Gauss-Jacobi’s method
while only six iterations are used in Gauss-Seidal’s method.

The sufficient condition for convergence of this method is that the diagonal elements
of the coefficient matrix are diagonally dominant. This is justified in the following.

5.18.1 Convergence of Gauss-Seidal’s method

Let A = \max_i \frac{1}{|a_{ii}|}\sum_{j=1, j≠i}^{n} |a_{ij}| and let A_i = \frac{1}{|a_{ii}|}\sum_{j=1}^{i−1} |a_{ij}|, i = 1, 2, \ldots, n.
It can easily be verified that 0 ≤ A_i ≤ A < 1 (assuming diagonal dominance).

Then, as in the Gauss-Jacobi's case,

|\varepsilon_i^{(k+1)}| ≤ \frac{1}{|a_{ii}|}\Big( \sum_{j<i} |a_{ij}|\,|\varepsilon_j^{(k+1)}| + \sum_{j>i} |a_{ij}|\,|\varepsilon_j^{(k)}| \Big)
                       ≤ \frac{1}{|a_{ii}|}\Big( \sum_{j<i} |a_{ij}|\,\|\varepsilon^{(k+1)}\| + \sum_{j>i} |a_{ij}|\,\|\varepsilon^{(k)}\| \Big)
                       ≤ A_i\|\varepsilon^{(k+1)}\| + (A − A_i)\|\varepsilon^{(k)}\|.

Then for some i,

\|\varepsilon^{(k+1)}\| ≤ A_i\|\varepsilon^{(k+1)}\| + (A − A_i)\|\varepsilon^{(k)}\|,

that is,

\|\varepsilon^{(k+1)}\| ≤ \frac{A − A_i}{1 − A_i}\|\varepsilon^{(k)}\|.

Since 0 ≤ A_i ≤ A < 1, we have \frac{A − A_i}{1 − A_i} ≤ A, and the above relation reduces to

\|\varepsilon^{(k+1)}\| ≤ A\|\varepsilon^{(k)}\|.                  (5.114)

This shows that the rate of convergence of Gauss-Seidal's iteration is also linear. Successive substitutions give
\|\varepsilon^{(k)}\| ≤ A^k\|\varepsilon^{(0)}\|.
Now, if A < 1 then \|\varepsilon^{(k)}\| → 0 as k → ∞, i.e., the sequence \{x^{(k)}\} is sure to converge when A < 1, i.e., when \sum_{j≠i} |a_{ij}| < |a_{ii}| for all i.

In other words, the sufficient condition for convergence of Gauss-Seidal's iteration is that the coefficient matrix is diagonally dominant. The absolute error at the (k+1)th iteration is given by

\|\varepsilon^{(k+1)}\| ≤ \frac{A}{1 − A}\|e^{(k)}\| when A < 1,

as in the previous section.

Note 5.18.2 Usually, the Gauss-Seidal’s method converges rapidly than the Gauss-
Jacobi’s method. But, this is not always true. There are some examples in which the
Gauss-Jacobi’s method converges faster than the Gauss-Seidal’s method.

Example 5.18.2 Solve the following system of equations by Gauss-Seidal’s method


correct to four significant figures: 3x + y + z = 3, 2x + y + 5z = 5, x + 4y + z = 2.

Solution. It may be noted that the given system is not diagonally dominant, but,
the rearranged system 3x + y + z = 3, x + 4y + z = 2, 2x + y + 5z = 5 is diagonally
dominant.
Then the Gauss-Seidal's iteration scheme is

x^{(k+1)} = \frac{1}{3}(3 − y^{(k)} − z^{(k)})
y^{(k+1)} = \frac{1}{4}(2 − x^{(k+1)} − z^{(k)})
z^{(k+1)} = \frac{1}{5}(5 − 2x^{(k+1)} − y^{(k+1)}).
Let y = 0, z = 0 be the initial values. The successive iterations are shown below.
k x y z
0 − 0 0
1 1.00000 0.25000 0.55000
2 0.73333 0.17917 0.67083
3 0.71667 0.15313 0.68271
4 0.72139 0.14898 0.68165
5 0.72312 0.14881 0.68099
6 0.72340 0.14890 0.68086
7 0.72341 0.14893 0.68085

Therefore, the solution correct up to four significant figures is


x = 0.7234, y = 0.1489, z = 0.6808.

Example 5.18.3 Solve the following system of equations using Gauss-Seidal’s


method: 3x + y + 2z = 6, −x + 4y + 2z = 5, 2x + y + 4z = 7.

Solution. The iteration scheme is

x^{(k+1)} = \frac{1}{3}(6 − y^{(k)} − 2z^{(k)})
y^{(k+1)} = \frac{1}{4}(5 + x^{(k+1)} − 2z^{(k)})
z^{(k+1)} = \frac{1}{4}(7 − 2x^{(k+1)} − y^{(k+1)}).

Let y = 0, z = 0 be the initial solution; the successive approximations are shown below.

k x y z
0 − 0 0
1 2.00000 1.75000 0.31250
2 1.20833 1.39583 0.79688
3 1.00347 1.10243 1.10243
4 0.89757 0.92328 1.07042
5 0.97866 0.95946 1.02081
6 0.99964 0.98951 1.00280
7 1.00163 0.99901 0.99943
8 1.00071 1.00046 0.99953
9 1.00016 1.00028 0.99985
10 1.00001 1.00008 0.99998

Therefore, the solution correct up to four decimal places is

x = 1.0000, y = 1.0000, z = 1.0000.


It may be noted that the given system is not diagonally dominant while the iteration
scheme converges to the exact solution.

Another interesting problem is considered in the following. The system of equations


x_1 + x_2 = 2, \quad x_1 − 3x_2 = 1

converges when the iteration scheme is taken as

x_1^{(k+1)} = 2 − x_2^{(k)}, \quad x_2^{(k+1)} = \frac{1}{3}(−1 + x_1^{(k+1)}),              (5.115)

while the Gauss-Seidal's iteration method diverges when the iteration scheme is

x_1^{(k+1)} = 1 + 3x_2^{(k)}, \quad x_2^{(k+1)} = 2 − x_1^{(k+1)}.                           (5.116)

(It may be noted that the system is not diagonally dominant.)
The calculations for these two schemes are shown below and the behaviour of the
solutions are shown in the figures 5.1 and 5.2.
For the scheme (5.115) For the scheme (5.116)
k x1 x2 k x1 x2
0 - 0 0 - 0
1 2 0.333 1 1 1
2 1.667 0.222 2 4 -2
3 1.778 0.259 3 -5 7
4 1.741 0.247 4 22 -20
5 1.753 0.251 5 -59 61
6 1.749 0.250 6 184 -182
7 1.750 0.250 7 -545 547

The exact solution of the equations is


x1 = 1.75, x2 = 0.25.
This example shows that the condition ‘diagonally dominant’ is a sufficient condition,
not necessary for Gauss-Seidal’s iteration method.

Figure 5.1: Illustration of Gauss-Seidal's method for the convergent scheme (5.115).
Figure 5.2: Illustration of Gauss-Seidal's method for the divergent scheme (5.116).

Algorithm 5.9 (Gauss-Seidal’s). This algorithm finds the solution of a system


of linear equations by Gauss-Seidal’s iteration method. The method will terminate
when |x_i^{(k+1)} − x_i^{(k)}| < ε, where ε is the supplied error tolerance, for all i.
Solution of System of Linear Equations 351

Algorithm Gauss Seidal


Step 1. Read the coefficients aij , i, j = 1, 2, . . . , n and the right hand vector
bi , i = 1, 2, . . . , n of the system of equations and error tolerance ε.
Step 2. Rearrange the given equations, if possible, such that the system becomes
diagonally dominant.
Step 3. Rewrite the ith equation as
        x_i = \frac{1}{a_{ii}}\Big( b_i − \sum_{j<i} a_{ij}x_j − \sum_{j>i} a_{ij}x_j \Big), for i = 1, 2, . . . , n.
Step 4. Set the initial solution as
        x_i = 0, i = 1, 2, 3, . . . , n.
Step 5. Calculate the new values xn_i of x_i as
        xn_i = \frac{1}{a_{ii}}\Big( b_i − \sum_{j<i} a_{ij}xn_j − \sum_{j>i} a_{ij}x_j \Big), for i = 1, 2, . . . , n.
Step 6. If |x_i − xn_i| < ε (ε is an error tolerance) for all i then goto Step 7 else set x_i = xn_i for all i and goto Step 5.
Step 7. Print xn_i, i = 1, 2, . . . , n as solution.
Step 6. If |xi − xni | < ε (ε is an error tolerance) for all i then goto Step 7 else
set xi = xni for all i and goto Step 5.
Step 7. Print xni , i = 1, 2, . . . , n as solution.
end Gauss Seidal
Program 5.9.
/* Program Gauss-Seidal
Solution of a system of linear equations by Gauss-Seidal’s
iteration method. Assume that the coefficient matrix satisfies
the condition of convergence. */
#include<stdio.h>
#include<math.h>
void main()
{
float a[10][10],b[10],x[10],xn[10],epp=0.00001,sum;
int i,j,n,flag;
printf("Enter number of variables ");
scanf("%d",&n);
printf("\nEnter the coefficients rowwise ");
for(i=1;i<=n;i++)
for(j=1;j<=n;j++) scanf("%f",&a[i][j]);
printf("\nEnter right hand vector ");
for(i=1;i<=n;i++)
scanf("%f",&b[i]);
for(i=1;i<=n;i++) x[i]=0; /* initialize */
/* testing of diagonal dominance may be included here
from the program of Gauss-Jacobi’s method */

do
{
for(i=1;i<=n;i++)
{
sum=b[i];
for(j=1;j<=n;j++)
{
if(j<i)
sum-=a[i][j]*xn[j]; /* values already updated in this pass */
else if(j>i)
sum-=a[i][j]*x[j];  /* values from the previous pass */
}
xn[i]=sum/a[i][i];
}
flag=0; /* indicates |x[i]-xn[i]|<epp for all i */
for(i=1;i<=n;i++) if(fabs(x[i]-xn[i])>epp) flag=1;
if(flag==1) for(i=1;i<=n;i++) x[i]=xn[i]; /* reset x[i] */
}
while(flag==1);
printf("Solution is \n");
for(i=1;i<=n;i++) printf("%8.5f ",xn[i]);
} /* main */

A sample of input/output:

Enter number of variables 3


Enter the coefficients rowwise
3 1 -1
2 5 2
2 4 6

Enter right hand vector


7 9 8
Solution is
2.00000 1.00000 0.00000

5.19 The Relaxation Method

The relaxation method is also an iterative method invented by Southwell in 1946.


Let x^{(k)} = (x_1^{(k)}, x_2^{(k)}, ..., x_n^{(k)})^t be the solution obtained at the kth iteration of the system of equations

    Σ_{j=1}^{n} a_ij x_j = b_i,    i = 1, 2, ..., n.                                (5.117)

In general, x^{(k)} does not satisfy these equations exactly. If r_i^{(k)} denotes the residual of the ith equation, then

    r_i^{(k)} = b_i - Σ_{j=1}^{n} a_ij x_j^{(k)}.                                   (5.118)

Now, the solution can be improved successively by reducing the largest residual to zero at each iteration.
To achieve fast convergence of the method, the equations are rearranged in such a way that the largest coefficients in the equations appear on the diagonals. Now, the largest residual (in magnitude) is determined; let it be r_p. Then the value of the variable x_p is increased by dx_p, where dx_p = r_p / a_pp (this choice makes the new pth residual zero).
In other words, x_p is changed to x_p + dx_p to relax r_p, i.e., to reduce r_p to zero. The new solution after this iteration becomes

    x^{(k+1)} = ( x_1^{(k)}, x_2^{(k)}, ..., x_{p-1}^{(k)}, x_p^{(k)} + dx_p, x_{p+1}^{(k)}, ..., x_n^{(k)} ).

The method is repeated until all the residuals become zero or very small.
Example 5.19.1 Solve the following system of equations
8x1 + x2 − x3 = 8, 2x1 + x2 + 9x3 = 12, x1 − 7x2 + 2x3 = −4
by relaxation method taking (0, 0, 0) as initial solution.

Solution. We rearrange the equations to get the largest coefficients in the diagonals
as

8x1 + x2 − x3 = 8
x1 − 7x2 + 2x3 = −4
2x1 + x2 + 9x3 = 12.

The residuals r1 , r2 , r3 can be computed from the equations

r1 = 8 − 8x1 − x2 + x3
r2 = −4 − x1 + 7x2 − 2x3
r3 = 12 − 2x1 − x2 − 9x3 .

The method is started with the initial solution (0, 0, 0), i.e., x_1 = x_2 = x_3 = 0. Then the residuals are r_1 = 8, r_2 = -4, r_3 = 12, of which the largest in magnitude is r_3. This indicates that the third equation has the most error and x_3 needs to be improved. The increment is dx_3 = r_3/a_33 = 12/9 = 1.333. The new solution then becomes (0, 0, 0 + 1.333), i.e., (0, 0, 1.333).
Similarly, we find the new residuals of large magnitudes and relax it to zero and
we repeat the process until all the residuals become zero or very small.
The detail calculations are shown in the following table.
 k    r1       r2       r3     max |r|   p    dx_p      x1     x2     x3
0 – – – – – – 0 0 0
1 8 –4 12 12 3 1.333 0 0 1.333
2 9.333 –6.666 0.003 9.333 1 1.167 1.167 0 1.333
3 –0.003 –7.833 –2.331 –7.833 2 1.119 1.167 1.119 1.333
4 –1.122 0 –3.450 –3.450 3 –0.383 1.167 1.119 0.950
5 –1.505 0.766 –0.003 –1.505 1 –0.188 0.979 1.119 0.950
6 –0.001 0.954 0.373 0.954 2 –0.136 0.979 0.983 0.950
7 0.135 0.002 0.509 0.509 3 0.057 0.979 0.983 1.007
8 0.192 –0.112 –0.004 0.192 1 0.024 1.003 0.983 1.007
9 0 –0.136 –0.052 –0.136 2 0.019 1.003 1.002 1.007
10 –0.019 –0.003 –0.071 –0.071 3 –0.008 1.003 1.002 0.999
11 –0.027 0.013 0.001 –0.027 1 0.003 1.000 1.002 0.999
12 –0.003 0.016 0.007 0.016 2 –0.002 1.000 1.000 0.999
13 –0.001 0.002 0.009 0.009 3 0.001 1.000 1.000 1.000
14 0 0 0 0 – 0 1.000 1.000 1.000

At this stage all the residuals are zero and therefore the solution of the given system
of equations is x1 = 1.000, x2 = 1.000, x3 = 1.000, which is the exact solution of the
equations.
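No numbered program accompanies this section; the following C sketch (written here for illustration, with the rearranged system of Example 5.19.1 built in) follows the scheme described above: compute the residuals, pick the one of largest magnitude, and relax it to zero.

/* Relaxation method sketch for the system of Example 5.19.1.
   Arrays are used from index 1, as in the other programs. */
#include<stdio.h>
#include<math.h>
void main()
{
 float a[4][4]={{0,0,0,0},{0,8,1,-1},{0,1,-7,2},{0,2,1,9}};
 float b[4]={0,8,-4,12};
 float x[4]={0,0,0,0},r[4],eps=0.005;
 int i,j,p,k;
 for(k=1;k<=50;k++)
 {
  for(i=1;i<=3;i++) /* residuals r[i]=b[i]-sum of a[i][j]*x[j] */
  {
   r[i]=b[i];
   for(j=1;j<=3;j++) r[i]-=a[i][j]*x[j];
  }
  p=1; /* position of the residual of largest magnitude */
  for(i=2;i<=3;i++) if(fabs(r[i])>fabs(r[p])) p=i;
  if(fabs(r[p])<eps) break; /* all residuals are small */
  x[p]+=r[p]/a[p][p]; /* relax r[p] to zero */
 }
 printf("x1=%6.3f x2=%6.3f x3=%6.3f\n",x[1],x[2],x[3]);
}

As in the worked example, the iterates approach the exact solution (1, 1, 1).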

5.20 Successive Overrelaxation (S.O.R.) Method

The convergence can be accelerated by introducing a suitable relaxation factor w. In this method, the ith equation Σ_{j=1}^{n} a_ij x_j = b_i can be written as

    Σ_{j=1}^{i-1} a_ij x_j + Σ_{j=i}^{n} a_ij x_j = b_i.                            (5.119)

Like Gauss-Seidal's method, for the solution

    ( x_1^{(k+1)}, x_2^{(k+1)}, ..., x_{i-1}^{(k+1)}, x_i^{(k)}, x_{i+1}^{(k)}, ..., x_n^{(k)} ),

the equation (5.119) becomes

    Σ_{j=1}^{i-1} a_ij x_j^{(k+1)} + Σ_{j=i}^{n} a_ij x_j^{(k)} = b_i.              (5.120)

The residual at the ith equation is then computed as

    r_i = b_i - Σ_{j=1}^{i-1} a_ij x_j^{(k+1)} - Σ_{j=i}^{n} a_ij x_j^{(k)}.        (5.121)

Let d_i^{(k)} = x_i^{(k+1)} - x_i^{(k)} denote the difference of x_i at two consecutive iterations. In the successive overrelaxation (S.O.R. or SOR) method, it is assumed that

    a_ii d_i^{(k)} = w r_i,    i = 1, 2, ..., n,                                    (5.122)

where w is a suitable factor, called the relaxation factor.


Using (5.121), equation (5.122) becomes

    a_ii x_i^{(k+1)} = a_ii x_i^{(k)} - w [ Σ_{j=1}^{i-1} a_ij x_j^{(k+1)} + Σ_{j=i}^{n} a_ij x_j^{(k)} - b_i ],   (5.123)
         i = 1, 2, ..., n;  k = 0, 1, 2, ...,

where (x_1^{(0)}, x_2^{(0)}, ..., x_n^{(0)})^t is the initial solution. The method is repeated until the desired
accuracy is achieved.
This method is called the overrelaxation method when 1 < w < 2, and is called
the under relaxation method when 0 < w < 1. When w = 1, the method becomes
Gauss-Seidal’s method.

Best relaxation factor wb


The proper choice of wb can speed up the convergence of the system. In a problem,
Carré took wb = 1.9 and found that the convergence is 40 times faster than when w = 1
(Gauss-Seidal’s method). He also observed that when w = 1.875 (a minor variation
of 1.9), the convergence is only two times faster than the Gauss-Seidal’s method. In
general, the choice of wb is not a simple task. It depends on the spectral radius of the
coefficient matrix.
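For reference, a classical result (due to Young, for symmetric positive definite, consistently ordered systems; the conditions are not developed in this book) expresses the best factor through the spectral radius ρ(B_J) of the Gauss-Jacobi iteration matrix:

    w_b = 2 / ( 1 + sqrt(1 - ρ^2(B_J)) ),

which may serve as a guide whenever ρ(B_J) can be estimated.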

Example 5.20.1 Solve the following system of equations

3x1 + x2 + 2x3 = 6
−x1 + 4x2 + 2x3 = 5
2x1 + x2 + 4x3 = 7

by SOR method taking w = 1.01.

Solution. The iteration scheme for SOR method is

    a_11 x_1^{(k+1)} = a_11 x_1^{(k)} - w ( a_11 x_1^{(k)} + a_12 x_2^{(k)} + a_13 x_3^{(k)} - b_1 )
    a_22 x_2^{(k+1)} = a_22 x_2^{(k)} - w ( a_21 x_1^{(k+1)} + a_22 x_2^{(k)} + a_23 x_3^{(k)} - b_2 )
    a_33 x_3^{(k+1)} = a_33 x_3^{(k)} - w ( a_31 x_1^{(k+1)} + a_32 x_2^{(k+1)} + a_33 x_3^{(k)} - b_3 )

or

    3 x_1^{(k+1)} = 3 x_1^{(k)} - 1.01 ( 3 x_1^{(k)} + x_2^{(k)} + 2 x_3^{(k)} - 6 )
    4 x_2^{(k+1)} = 4 x_2^{(k)} - 1.01 ( -x_1^{(k+1)} + 4 x_2^{(k)} + 2 x_3^{(k)} - 5 )
    4 x_3^{(k+1)} = 4 x_3^{(k)} - 1.01 ( 2 x_1^{(k+1)} + x_2^{(k+1)} + 4 x_3^{(k)} - 7 ).

Let x_1^{(0)} = x_2^{(0)} = x_3^{(0)} = 0.
The detailed calculations are shown in the following table.

k x1 x2 x3
0 0 0 0
1 2.02000 1.77255 0.29983
2 1.20116 1.39665 0.80526
3 0.99557 1.09326 0.98064
4 0.98169 1.00422 1.00838
5 0.99312 0.99399 1.00491
6 0.99879 0.99728 1.00125
7 1.00009 0.99942 1.00009
8 1.00013 0.99999 0.99993
9 1.00005 1.00005 0.99997

Therefore the required solution is


x1 = 1.0000, x2 = 1.0000, x3 = 1.0000
correct up to four decimal places.

Program 5.10 .
/* Program Gauss-Seidal SOR
Solution of a system of linear equations by Gauss-Seidal
successive overrelaxation (SOR) method. The relaxation factor w
lies between 1 and 2. Assume that the coefficient matrix
satisfies the condition of convergence. */
#include<stdio.h>
#include<math.h>
void main()
{
float a[10][10],b[10],x[10],xn[10],epp=0.00001,sum,w;
int i,j,n,flag;
printf("Enter number of variables ");
scanf("%d",&n);
printf("\nEnter the coefficients rowwise ");
for(i=1;i<=n;i++)
for(j=1;j<=n;j++) scanf("%f",&a[i][j]);
printf("\nEnter right hand vector ");
for(i=1;i<=n;i++) scanf("%f",&b[i]);
printf("Enter the relaxation factor w ");
scanf("%f",&w);
for(i=1;i<=n;i++) x[i]=0; /* initialize */
for(i=1;i<=n;i++) printf(" x[%d] ",i);printf("\n");
do
{
for(i=1;i<=n;i++)
{
sum=b[i]*w+a[i][i]*x[i];
for(j=1;j<=n;j++)
{
if(j<i)
sum-=a[i][j]*xn[j]*w; /* values already updated in this pass */
else
sum-=a[i][j]*x[j]*w;  /* old values (j>=i) */
}
xn[i]=sum/a[i][i];
}
for(i=1;i<=n;i++) printf("%8.5f ",xn[i]);
printf("\n");

flag=0; /* indicates |x[i]-xn[i]|<epp for all i */


for(i=1;i<=n;i++) if(fabs(x[i]-xn[i])>epp) flag=1;
if(flag==1) for(i=1;i<=n;i++) x[i]=xn[i]; /* reset x[i] */
}while(flag==1);
printf("Solution is \n");
for(i=1;i<=n;i++) printf("%8.5f ",xn[i]);
} /* main */
A sample of input/output:
Enter number of variables 3
Enter the coefficients rowwise
9 2 4
1 10 4
2 -4 10

Enter right hand vector


20 6 -15
Enter the relaxation factor w 1.01
x[1] x[2] x[3]
2.24444 0.37931 -1.81514
2.95166 1.03740 -1.67397
2.73352 0.99583 -1.64812
2.73342 0.98581 -1.65241
2.73760 0.98722 -1.65264
2.73734 0.98732 -1.65254
2.73728 0.98729 -1.65254
2.73729 0.98729 -1.65254
Solution is
2.73729 0.98729 -1.65254

5.21 Comparison of Direct and Iterative Methods

Both the direct and the iterative methods have advantages and disadvantages, and a choice between them depends on the given system of equations.
(i) The direct methods are applicable for all types of problems (when the coefficient determinant is not zero), whereas iterative methods are useful only for particular types of problems.

(ii) In direct methods the rounding errors may become large, particularly for ill-conditioned systems, while in an iterative method the rounding error is small, since only the error committed in the last iteration affects the result. Thus for ill-conditioned systems an iterative method is a good choice.
(iii) The computational effort is large in a direct method (about 2n^3/3 operations for the elimination method) and low in each step of an iterative method (2n^2 in Gauss-Jacobi's and Gauss-Seidal's methods).
(iv) Most of the direct methods operate on the coefficient matrix, and for this purpose the entire matrix has to be stored in the primary memory of the computer. But the iterative methods are applied to a single equation at a time, and hence only a single equation needs to be stored at a time in primary memory. Thus iterative methods are more efficient than direct methods with respect to space.

5.22 Exercise

1. Find the values of the following determinants using partial pivoting.

          | 0  2   5 |             | -2  3  8   4 |
    (i)   | 1  3  -8 |       (ii)  |  6  1  0   5 |
          | 6  5   1 |             | -8  3  1   2 |
                                   |  3  8  7  10 |

2. Solve the following systems of equations by Cramer’s rule.


(i) 3x1 + 2x2 + x3 = 5 (ii) 7.6x1 + 0.5x2 + 2.4x3 = 1.9
2x1 + 5x2 + x3 = −3 2.2x1 + 9.1x2 + 4.4x3 = 9.7
2x1 + x2 + 3x3 = 11 −1.3x1 + 0.2x2 + 5.8x3 = −1.4
3. Find the inverse of the matrix
 
11 3 −1
 2 5 5
1 1 1
using Gauss-Jordan method and solve the following system of equations.
11x1 + 3x2 − x3 = 15
2x1 + 5x2 + 5x3 = −11
x1 + x2 + x3 = 1.

4. Find the inverses of the following matrices (using partial pivoting).


   
0 1 2 −1 2 0
(i) 3 5 1 (ii)  1 0 5 .
6 8 9 3 8 7

5. Solve the following systems of equations by Gauss elimination method.

    (i)   x_1 + (1/2)x_2 + (1/3)x_3 = 1          (ii)  4x + y + z = 4
          (1/2)x_1 + (1/3)x_2 + (1/4)x_3 = 0           x + 4y - 2z = 4
          (1/3)x_1 + (1/4)x_2 + (1/5)x_3 = 0           3x + 2y - 4z = 6

    (iii) x_1 - 4x_2 - x_4 = 6                   (iv)  1.14x_1 - 2.15x_2 - 5.11x_3 = 2.05
          x_1 + x_2 + 2x_3 + 3x_4 = -1                 0.42x_1 - 1.13x_2 + 7.05x_3 = 0.80
          2x_1 + 3x_2 - x_3 - x_4 = -1                 -0.71x_1 + 0.81x_2 - 0.02x_3 = -1.07
          x_1 + 2x_2 + 3x_3 - x_4 = 3
6. Use Gauss elimination method to find the values of the determinants.

          | 1   4  1   3 |             | -1.6   5.4  -7.7   3.1 |
    (i)   | 0  -1  3  -1 |       (ii)  |  8.2   1.4  -2.3   0.2 |
          | 3   1  0   2 |             |  5.3  -5.9   2.7  -8.9 |
          | 1  -2  5   1 |             |  0.7   1.9  -8.5   4.8 |

7. Using LU decomposition method, solve the following systems of equations


(i) x1 + x2 + x3 = 3 (ii) x − 2y + 7z = 6
2x1 − x2 + 3x3 = 16 4x + 2y + z = 7
3x1 + x2 − x3 = −3 2x + 5y − 2z = 5.
8. Find the triangular factorization A = LU for the matrices
   
4 2 1 −5 2 −1
(i)  2 5 −2  (ii)  1 0 3 .
1 −2 7 3 1 6

9. Solve LY = B, UX = Y and verify that B = AX for B = (8, -4, 10, -4)^t, where A = LU is given by

    | 4  8  4   0 |   | 1    0    0    0 | | 4  8  4   0 |
    | 1  5  4  -3 | = | 1/4  1    0    0 | | 0  3  3  -3 |
    | 1  4  7   2 |   | 1/4  2/3  1    0 | | 0  0  4   4 |
    | 1  3  0  -2 |   | 1/4  1/3 -1/2  1 | | 0  0  0   1 |

10. Prove that the product of two upper-triangular matrices is an upper-triangular


matrix.
11. Prove that the inverse of a non-singular upper-triangular matrix is an upper-
triangular matrix.
12. Use Cholesky method to solve the following systems of linear equations.
(i) x + 2y + 3z = 0 (ii) x1 + 3x2 + 4x3 = 8
2x + y + 2z = 1 3x1 − x2 + 5x3 = 7
3x + 2y + z = 4 4x1 + 5x2 − 7x3 = 2.

13. Find the inverses of the following matrices using partition.


   
20 1 −2 8 1 −1
(i)  3 20 −1  (ii)  2 1 9 .
2 −3 20 1 −7 2

14. Find the solution of the following tri-diagonal system:

2x1 − 2x2 = 1
−x1 + 2x2 − 3x3 = −2
−2x2 + 2x3 − 4x4 = −1
x3 − x4 = 3.

15. Test the following system for ill-condition.

10x + 7y + 8z + 7w = 32
7x + 5y + 6z + 5w = 23
8x + 6y + 10z + 9w = 33
7x + 5y + 9z + 10w = 31.

16. Find the g-inverses of the following matrices

                             | 2   3  5 |
    (i)  | 2  3 |      (ii)  | 1  -1  0 |
         | 4  6 |            | 3   1  2 |

17. Find the g-inverse of the matrix

1 1 1
2 3 5

and hence solve the following system of equations.

x+y+z = 3
2x + 3y + 5z = 10.

18. Find the least squares solution of the equations

x1 + 2x2 = 3
.
2x1 + 4x2 = 7

19. Find the solution of the following equations using least squares method:
x + y + 3z = 0
2x − y + 4z = 8
x + 5y + z = 10
x + y − 2z = 2.

20. Solve the following equations by (i) Gauss-Jordan’s and (ii) Gauss-Seidal’s meth-
ods, correct up to four significant figures:
9x + 2y + 4z = 20
x + 10y + 4z = 6
2x − 4y + 10z = −15.

21. Test if the following systems of equations are diagonally dominant and hence solve
them using Gauss-Jacobi’s and Gauss-Seidal’s methods.
(i) 10x + 15y + 3z = 14 (ii) x + 3y + 4z = 7
−30x + y + 5z = 17 3x + 2y + 5z = 10
x + y + 4z = 3 x − 5y + 7z = 3.
22. Solve the following simultaneous equations
2.5x1 + 5.2x2 = 6.2
1.251x1 + 2.605x2 = 3.152
by Gauss elimination method and get your answer to 4 significant figures. Improve
the solution by iterative refinement.
23. Consider the following linear system
5x1 + 3x2 = 6, 2x1 − x2 = 4.
Can either Gauss-Jacobi’s or Gauss-Seidal’s iteration be used to solve this linear
system? Why?
24. Consider the following tri-diagonal linear system and assume that the coefficient
matrix is diagonally dominant.
d1 x1 + c1 x2 = b1
a1 x1 + d2 x2 + c2 x3 = b2
a2 x2 + d3 x3 + c3 x4 = b3
.. .. .. ..
. . . .
an−2 xn−2 + dn−1 xn−1 + cn−1 xn = bn−1
an−1 xn−1 + dn xn = bn .
Write an iterative algorithm that will solve this system.

25. Use Gauss-Seidal’s iteration to solve the following band system.


12x1 − 2x2 + x3 = 5
−2x1 + 12x2 − 2x3 + x4 = 5
x1 − 2x2 + 12x3 − 2x4 + x5 = 5
x2 − 2x3 + 12x4 − 2x5 + x6 = 5
······························ ··· ···
······························ ··· ···
x46 − 2x47 + 12x48 − 2x49 + x50 = 5
x47 − 2x48 + 12x49 − 2x50 = 5
x48 − 2x49 + 12x50 = 5.

26. Solve the following systems of equations by successive relaxation method.


(i) 2x − 3y + z = −1 (ii) x+y−z = 0
x + 4y − 3z = 0 2x + 3y + 8z = −1
x+y+z = 6 5x − 4y + 10z = 9.
27. Solve the following systems of equations by successive overrelaxation method.
(i) 5x − 2y + z = 4 (ii) x1 + x2 + x3 = 3
7x + y − 5z = 8 2x1 − x2 + 3x3 = 16
3x + 7y + 4z = 10 3x1 + x2 − x3 = −3.
Choose appropriate relaxation factor w.
28. The Hilbert matrix is a classical ill-conditioned matrix, and small changes in its
coefficients will produce a large changes in the solution to the perturbed system.
(i) Solve AX = b using the Hilbert matrix
   
1 1/2 1/3 1/4 1/5 1
 1/2 1/3 1/4 1/5 1/6  0
   
A=  1/3 1/4 1/5 1/6 1/7 , b =  0 .
  
 1/4 1/5 1/6 1/7 1/8  0
1/5 1/6 1/7 1/8 1/9 0
(ii) Solve Cx = b where
   
1.0 0.5 0.33333 0.25 0.2 1
 0.5 0.33333 0.25 0.2 0.16667   0 
   
C=  0.33333 0.25 0.2 0.16667 0.14286 , b = 
  0 .

 0.25 0.2 0.16667 0.14286 0.125   0 
0.2 0.16667 0.14286 0.125 0.11111 0
[Note that the two matrices A and C are different.]
Chapter 6

Eigenvalues and Eigenvectors of a Matrix

6.1 Eigenvalue of a Matrix

Let A be a square matrix of order n. Suppose X is a column matrix and λ is a scalar


such that

AX = λX. (6.1)

The scalar λ is called an eigenvalue or characteristic value of the matrix A


and X is called the corresponding eigenvector. The equation (6.1) can be written as
(A − λI)X = 0.
The equation

|A − λI| = 0 (6.2)

that is,

    | a_11 - λ   a_12       a_13     ...   a_1n     |
    | a_21       a_22 - λ   a_23     ...   a_2n     |
    |  ...        ...        ...     ...    ...     |  = 0                          (6.3)
    | a_n1       a_n2       a_n3     ...   a_nn - λ |

is a polynomial in λ of degree n, called characteristic equation of the matrix A. The


roots λi , i = 1, 2, . . . , n, of the equation (6.2) are the eigenvalues of A. For each value
of λi , there exists an Xi such that

AXi = λi Xi . (6.4)


The eigenvalues λi may be either distinct or repeated, they may be real or complex.
If the matrix is real symmetric then all the eigenvalues are real. If the matrix is skew-
symmetric then the eigenvalues are either zero or purely imaginary. Sometimes, the set
of all eigenvalues, λi , of a matrix A is called the spectrum of A and the largest value
of |λi | is called the spectral radius of A.

Example 6.1.1 Find the eigenvalues and eigenvectors of the matrix


 
2 −1 1
A =  −1 2 −1  .
1 −1 2

Solution. The characteristic equation of A is |A − λI| = 0.


Therefore  
 2 − λ −1 1 

 −1 2 − λ −1  = 0.
 
 1 −1 2 − λ 
By direct expansion this gives (1 - λ)^2 (4 - λ) = 0.
Hence the characteristic equation is (1 - λ)^2 (4 - λ) = 0 and the eigenvalues of A are
1, 1, and 4. The two distinct eigenvectors corresponding to the two eigenvalues λ = 1
and 4 are calculated below.
Eigenvector corresponding to the eigenvalue 1.
Let (x1 , x2 , x3 )T be the eigenvector corresponding to 1. Then
    
2 − 1 −1 1 x1 0
 −1 2 − 1 −1   x2  =  0  .
1 −1 2 − 1 x3 0

Thus

x1 − x2 + x3 = 0
−x1 + x2 − x3 = 0
x1 − x2 + x3 = 0.

The solution of this system of equations is x3 = 0, x1 = x2 . We take x2 = 1. Then


the eigenvector is (1, 1, 0)T .
Let (y1 , y2 , y3 )T be the eigenvector corresponding to λ = 4.
Then     
2 − 4 −1 1 y1 0
 −1 2 − 4 −1   y2  =  0 
1 −1 2 − 4 y3 0

Thus the system of equations are

−2y1 − y2 + y3 = 0
−y1 − 2y2 − y3 = 0
y1 − y2 − 2y3 = 0.

The solution is y1 = k, y2 = −k, y3 = k for arbitrary k. Let k = 1. Then the


eigenvector is (1, −1, 1)T .
The upper bound of the eigenvalue of a matrix is stated below.
Theorem 6.1 The largest eigenvalue (in magnitude) of a square matrix A cannot exceed the largest sum of the moduli of the elements along any row or any column, i.e.,

    |λ| ≤ max_i Σ_{j=1}^{n} |a_ij|    and    |λ| ≤ max_j Σ_{i=1}^{n} |a_ij|.        (6.5)

Proof. Let λ be an eigenvalue of A and X = (x1 , x2 , . . . , xn )T be the corresponding


eigenvector. Then AX =λX.
This equation may be written as
a11 x1 + a12 x2 + · · · + a1n xn = λx1
a21 x1 + a22 x2 + · · · + a2n xn = λx2
··························· ··· ···
ak1 x1 + ak2 x2 + · · · + akn xn = λxk
··························· ··· ···
an1 x1 + an2 x2 + · · · + ann xn = λxn .
Let |x_k| = max_i |x_i|. Now, the kth equation is divided by x_k. Therefore,

    λ = a_k1 (x_1/x_k) + a_k2 (x_2/x_k) + ... + a_kk + ... + a_kn (x_n/x_k).

Since |x_i/x_k| ≤ 1 for i = 1, 2, ..., n,

    |λ| ≤ |a_k1| + |a_k2| + ... + |a_kk| + ... + |a_kn| = Σ_{j=1}^{n} |a_kj|.

This is true for all rows k; hence

    |λ| ≤ max_i Σ_{j=1}^{n} |a_ij|.

The theorem is also true for columns, as the eigenvalues of A^T and A are the same. 
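Theorem 6.1 translates directly into a few lines of C; the helper below (illustrative, not one of the book's numbered programs) returns the row-sum bound.

/* Bound of Theorem 6.1: the largest row sum of moduli.
   Every eigenvalue of a[][] satisfies |lambda| <= eigen_bound(a,n). */
#include<math.h>
float eigen_bound(float a[10][10],int n)
{
 int i,j;
 float s,max=0;
 for(i=1;i<=n;i++)
 {
  s=0;
  for(j=1;j<=n;j++) s+=fabs(a[i][j]);
  if(s>max) max=s;
 }
 return max;
}

For the matrix of Example 6.1.1 the bound is |2| + |-1| + |1| = 4, which the largest eigenvalue 4 attains.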

Theorem 6.2 (Shifting eigenvalues). Suppose λ is an eigenvalue and X its corresponding eigenvector of A. If c is any constant, then λ - c is an eigenvalue of the matrix A - cI with the same eigenvector X, since (A - cI)X = AX - cX = (λ - c)X.

Let the characteristic polynomial of the matrix A be

    det(A - λI) = λ^n + c_1 λ^{n-1} + c_2 λ^{n-2} + ... + c_n,                      (6.6)

where

    -c_1 = Σ_{i=1}^{n} a_ii = Tr A, the sum of all diagonal elements of A, called the trace;

    c_2 = Σ_{i<j} | a_ii  a_ij |
                  | a_ji  a_jj |,  the sum of all principal minors of order two of A;

    -c_3 = Σ_{i<j<k} | a_ii  a_ij  a_ik |
                     | a_ji  a_jj  a_jk |,  the sum of all principal minors of order three of A;
                     | a_ki  a_kj  a_kk |

and finally (-1)^n c_n = det A, the determinant of the matrix A.


If the coefficients of the polynomial (6.6) are known then solving this polynomial
using the method discussed in Chapter 4, one can determine all the eigenvalues. But,
the direct expansion is very labourious and is only used for low order matrices. One
efficient method to compute the coefficients of (6.6) is Leverrier-Faddeev method.

6.2 Leverrier-Faddeev Method to Construct Characteristic Equation

This method was developed by Leverrier and latter modified by Faddeev.


Let

    det(A - λI) = λ^n + c_1 λ^{n-1} + c_2 λ^{n-2} + ... + c_n                       (6.7)

be the characteristic polynomial of the matrix A and its roots be λ_1, λ_2, ..., λ_n. Now, the sum of the kth powers of the eigenvalues is denoted by S_k, i.e.,

    S_1 = λ_1 + λ_2 + ... + λ_n = Tr A,
    S_2 = λ_1^2 + λ_2^2 + ... + λ_n^2 = Tr A^2,                                     (6.8)
    ..............................................
    S_n = λ_1^n + λ_2^n + ... + λ_n^n = Tr A^n.

Then by Newton's formula (on polynomials), for k ≤ n,

    S_k + c_1 S_{k-1} + c_2 S_{k-2} + ... + c_{k-1} S_1 = -k c_k.                   (6.9)



For k = 1, 2, ..., n, the coefficients are given by

    c_1 = -S_1
    c_2 = -(1/2)(S_2 + c_1 S_1)
    ...............................
    c_n = -(1/n)(S_n + c_1 S_{n-1} + c_2 S_{n-2} + ... + c_{n-1} S_1).

Thus computation of the coefficients c_1, c_2, ..., c_n depends on the traces of A, A^2, ..., A^n.
The Leverrier method was modified by Faddeev by generating a sequence of matrices B_1, B_2, ..., B_n, shown below:

    B_1 = A,              d_1 = Tr B_1,                    D_1 = B_1 - d_1 I
    B_2 = A D_1,          d_2 = (1/2) Tr B_2,              D_2 = B_2 - d_2 I
    B_3 = A D_2,          d_3 = (1/3) Tr B_3,              D_3 = B_3 - d_3 I        (6.10)
    ...............................................................................
    B_{n-1} = A D_{n-2},  d_{n-1} = (1/(n-1)) Tr B_{n-1},  D_{n-1} = B_{n-1} - d_{n-1} I
    B_n = A D_{n-1},      d_n = (1/n) Tr B_n,              D_n = B_n - d_n I
Thus the coefficients of the characteristic polynomial are
c1 = −d1 , c2 = −d2 , . . . , cn = −dn .
It can be verified that Dn = 0.
Example 6.2.1 Use the Leverrier-Faddeev method to find the characteristic polynomial of the matrix

    | 2  1  -1 |
    | 0  2   3 |.
    | 1  5   4 |

Solution.

    B_1 = A = | 2  1  -1 |
              | 0  2   3 |,     d_1 = Tr B_1 = 2 + 2 + 4 = 8
              | 1  5   4 |

    B_2 = A(B_1 - d_1 I) = | 2  1  -1 | | -6   1  -1 |   | -13  -9   5 |
                           | 0  2   3 | |  0  -6   3 | = |   3   3  -6 |
                           | 1  5   4 | |  1   5  -4 |   |  -2  -9  -2 |

    d_2 = (1/2) Tr B_2 = (1/2)(-13 + 3 - 2) = -6

    B_3 = A(B_2 - d_2 I) = | 2  1  -1 | | -7  -9   5 |   | -9   0   0 |
                           | 0  2   3 | |  3   9  -6 | = |  0  -9   0 |
                           | 1  5   4 | | -2  -9   4 |   |  0   0  -9 |

    d_3 = (1/3) Tr B_3 = (1/3)(-9 - 9 - 9) = -9.

Thus c_1 = -d_1 = -8, c_2 = -d_2 = 6, c_3 = -d_3 = 9. Hence the characteristic polynomial is λ^3 - 8λ^2 + 6λ + 9 = 0.

Note 6.2.1 Using this method one can compute the inverse of the matrix A. It was mentioned that D_n = 0. That is, B_n - d_n I = 0, or A D_{n-1} = d_n I. From this relation one can write D_{n-1} = d_n A^{-1}. This gives

    A^{-1} = D_{n-1}/d_n = D_{n-1}/(-c_n).                                          (6.11)
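As a check (worked here; it is not part of Example 6.2.1), apply (6.11) to the matrix of that example: d_3 = -9 and D_2 = B_2 - d_2 I, so

    A^{-1} = D_2/d_3 = -(1/9) | -7  -9   5 |   |  7/9   1  -5/9 |
                              |  3   9  -6 | = | -1/3  -1   2/3 |,
                              | -2  -9   4 |   |  2/9   1  -4/9 |

and a direct multiplication confirms that A A^{-1} = I.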

Algorithm 6.1 (Leverrier-Faddeev method). This algorithm determines the


characteristic polynomial λn + c1 λn−1 + · · · + cn−1 λ + cn = 0 of a square matrix A.

Algorithm Leverrier-Faddeev
Step 1: Read the matrix A = [a_ij], i, j = 1, 2, ..., n.
Step 2: Set B_1 = A, d_1 = Σ_{i=1}^{n} a_ii.
Step 3: for i = 2, 3, . . . , n do
Compute
(a) Bi = A(Bi−1 − di−1 I)
(b) d_i = (1/i) × (sum of the diagonal elements of B_i)
Step 4: Compute ci = −di for i = 1, 2, . . . , n.
//ci ’s are the coefficients of the polynomial.//
end Leverrier-Faddeev.
Program 6.1.
/* Program Leverrier-Faddeev
This program finds the characteristic polynomial of
a square matrix. From which we can determine all the
eigenvalues of the matrix. */
#include<stdio.h>
#include<math.h>
void main()
{
int n,i,j,k,l;
float a[10][10],b[10][10],c[10][10],d[11];
printf("Enter the size of the matrix ");
scanf("%d",&n);

printf("Enter the elements row wise ");


for(i=1;i<=n;i++) for(j=1;j<=n;j++) scanf("%f",&a[i][j]);
printf("The given matrix is\n");
for(i=1;i<=n;i++) /* printing of A */
{
for(j=1;j<=n;j++) printf("%f\t ",a[i][j]); printf("\n");
}
printf("\n");
for(i=1;i<=n;i++) for(j=1;j<=n;j++) b[i][j]=a[i][j];
d[1]=0;for(i=1;i<=n;i++) d[1]+=a[i][i];
for(i=2;i<=n;i++)
{
/* construction of B[i-1]-d[i-1] I=C (say) */
for(j=1;j<=n;j++) for(k=1;k<=n;k++) c[j][k]=b[j][k];
for(j=1;j<=n;j++) c[j][j]=c[j][j]-d[i-1];
/* product of A and B[i-1]-d[i-1]I */
for(j=1;j<=n;j++) for(k=1;k<=n;k++)
{
b[j][k]=0;
for(l=1;l<=n;l++) b[j][k]+=a[j][l]*c[l][k];
}
/* trace of B */
d[i]=0;
for(j=1;j<=n;j++)
d[i]+=b[j][j];
d[i]/=i;
} /* end of i loop */
printf("The coefficients of the characteristic polynomial are \n");
printf("1.00000 ");
for(i=1;i<=n;i++) printf("%8.5f ",-d[i]);
}/* main */
A sample of input/output:
Enter the size of the matrix 5
Enter the elements row wise
2 3 0 1 2
3 5 6 2 1
2 3 0 1 0
0 0 2 3 8
1 2 3 0 2

The given matrix is


2.000000 3.000000 0.000000 1.000000 2.000000
3.000000 5.000000 6.000000 2.000000 1.000000
2.000000 3.000000 0.000000 1.000000 0.000000
0.000000 0.000000 2.000000 3.000000 8.000000
1.000000 2.000000 3.000000 0.000000 2.000000

The coefficients of the characteristic polynomial are


1.00000 -12.00000 18.00000 -29.00000 -142.00000 14.00000

6.2.1 Eigenvectors using Leverrier-Faddeev method


The Leverrier-Faddeev method may also be used to determine all the eigenvectors.
Suppose the matrices D_1, D_2, ..., D_{n-1} and the eigenvalues λ_1, λ_2, ..., λ_n are known. Then the eigenvectors x^{(i)} can be determined using the formula

    x^{(i)} = λ_i^{n-1} e_0 + λ_i^{n-2} e_1 + λ_i^{n-3} e_2 + ... + e_{n-1},        (6.12)

where e_0 is a unit vector and e_1, e_2, ..., e_{n-1} are column vectors of the matrices D_1, D_2, ..., D_{n-1} of the same order as e_0.
Example 6.2.2 Use the Leverrier-Faddeev method to find the characteristic equation and all eigenvectors of the matrix

    A = |  9  -1   9 |
        |  3  -1   3 |.
        | -7   1  -7 |

Solution.

    B_1 = A = |  9  -1   9 |
              |  3  -1   3 |,     d_1 = Tr B_1 = 9 - 1 - 7 = 1
              | -7   1  -7 |

    D_1 = B_1 - d_1 I = |  8  -1   9 |
                        |  3  -2   3 |
                        | -7   1  -8 |

    B_2 = A D_1 = |  6   2   6 |
                  |  0   2   0 |,     d_2 = (1/2) Tr B_2 = (1/2)(6 + 2 - 4) = 2
                  | -4  -2  -4 |

    D_2 = B_2 - d_2 I = |  4   2   6 |
                        |  0   0   0 |
                        | -4  -2  -6 |

    
    B_3 = A D_2 = |  9  -1   9 | |  4   2   6 |   | 0  0  0 |
                  |  3  -1   3 | |  0   0   0 | = | 0  0  0 |
                  | -7   1  -7 | | -4  -2  -6 |   | 0  0  0 |

    d_3 = (1/3) Tr B_3 = (1/3)(0 + 0 + 0) = 0.

Thus c_1 = -d_1 = -1, c_2 = -d_2 = -2, c_3 = -d_3 = 0.
The characteristic equation is λ^3 - λ^2 - 2λ = 0.
The eigenvalues are λ_1 = 0, λ_2 = -1, λ_3 = 2.

Let e_0 = (1, 0, 0)^T; then e_1 = (8, 3, -7)^T and e_2 = (4, 0, -4)^T

(e_1, e_2 are the first columns of the matrices D_1, D_2).
The formula

    x^{(i)} = λ_i^2 e_0 + λ_i e_1 + e_2,

for λ_1 = 0 gives x^{(1)} = (4, 0, -4)^T.
Similarly, for λ_2 = -1,

    x^{(2)} = (-1)^2 (1, 0, 0)^T + (-1)(8, 3, -7)^T + (4, 0, -4)^T = (-3, -3, 3)^T,

and for λ_3 = 2,

    x^{(3)} = 2^2 (1, 0, 0)^T + 2 (8, 3, -7)^T + (4, 0, -4)^T = (24, 6, -18)^T.

Thus the eigenvectors are (4, 0, -4)^T, (-3, -3, 3)^T and (24, 6, -18)^T.

A square matrix B is said to be similar to another square matrix A if there exists


a non-singular matrix P such that B = P−1 AP. The similar matrices have the same
eigenvalues. A square matrix A is said to be diagonalisable if A is similar to a square
diagonal matrix. A square matrix A of order n is diagonalisable iff A has n linearly
independent eigenvectors. If P−1 AP is a diagonal matrix and P is orthogonal then
A is said to be orthogonally diagonalisable. It can be proved that a matrix A is
orthogonally diagonalisable iff A is real and symmetric.
If a matrix is either diagonal or lower triangular or upper triangular then its eigen-
values are the diagonal elements.

6.3 Eigenvalues for Arbitrary Matrices

Several methods are available to determine the eigenvalues and eigenvectors of a matrix.
Here, two methods – Rutishauser and Power methods are introduced.

6.3.1 Rutishauser method

Let A be a square matrix. In this method, a convergent sequence of upper triangular


matrices A1 , A2 , . . . are generated. The diagonal elements of the convergent matrix are
the eigenvalues, if they are real.
The conversion is based on LU decomposition technique, where L and U are lower
and upper triangular matrices. Initially, let A = A1 and A1 = L1 U1 with lii = 1. Then
form A_2 = U_1 L_1. The matrices A_1 and A_2 are similar, as A_2 = U_1 L_1 = U_1 A_1 U_1^{-1},
and so they have the same eigenvalues. Now, A_2 is factorized in the form A_2 = L_2 U_2 with
lii = 1. Then form A3 = U2 L2 . Proceeding this way we generate a sequence of similar
matrices A1 , A2 , A3 , . . .. In general, this sequence converge to an upper triangular
matrix or a near-triangular matrix A. If the eigenvalues are real, then they all lie on
the leading diagonal of the matrix A. But, practically this method is complicated.
Sometimes, the lower triangular matrix L is replaced by Q, where Q is an unitary or
orthogonal matrix. The QU decomposition is also not simple for practical computation.
Since the sequence {Ai } converges slowly the shifting technique may be used to
accelerate its convergence. This technique is not discussed here.

Example 6.3.1 Find all the eigenvalues of the matrix

    A = |  4  2 |
        | -1  1 |

using Rutishauser method.

Solution. Let

    A = A_1 = L_1 U_1 = | 1     0 | | u_11  u_12 |.
                        | l_21  1 | | 0     u_22 |

That is,

    |  4  2 | = | u_11        u_12             |.
    | -1  1 |   | u_11 l_21   u_12 l_21 + u_22 |

This gives u_11 = 4, u_12 = 2, l_21 = -1/4, u_22 = 3/2. Therefore,

    L_1 = |  1    0 |,    U_1 = | 4   2  |.
          | -1/4  1 |           | 0  3/2 |

Form

    A_2 = U_1 L_1 = | 4   2  | |  1    0 | = |  7/2  2  |.
                    | 0  3/2 | | -1/4  1 |   | -3/8 3/2 |

Again, let A_2 = L_2 U_2; equating elements as before gives

u_11 = 7/2, u_12 = 2, l_21 = -3/28, u_22 = 12/7. Therefore,

    L_2 = |  1     0 |,    U_2 = | 7/2   2   |.
          | -3/28  1 |           | 0    12/7 |

Form

    A_3 = U_2 L_2 = | 7/2   2   | |  1     0 | = | 23/7   2    |.
                    | 0    12/7 | | -3/28  1 |   | -9/49  12/7 |

In this way, we find

    A_4 = | 3.17391   2       |,    A_5 = | 3.10959   2       |,
          | -0.10208  1.82609 |           | -0.06080  1.89041 |

    A_6 = | 3.07049   2       |,    A_7 = | 3.04591   2       |,
          | -0.03772  1.92951 |           | -0.02402  1.95409 |

    A_8 = | 3.04591   2       |,    A_9 = | 3.04073   2       |
          | -0.00788  1.96986 |           | -0.00512  1.97504 |

and so on.
The sequence {A_i} converges slowly to an upper triangular matrix and the diagonal elements converge to the eigenvalues of A. The exact eigenvalues are 3 and 2, which are approximated by the diagonal elements of A_9.
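A compact C sketch of the iteration used in this example is given below (illustrative only: it applies Doolittle factorization without pivoting, so it assumes every A_i possesses an LU factorization with l_ii = 1).

/* Rutishauser (LU-UL) iteration sketch for the matrix of
   Example 6.3.1. Arrays are used from index 1. */
#include<stdio.h>
void main()
{
 float a[3][3]={{0,0,0},{0,4,2},{0,-1,1}};
 float l[3][3],u[3][3],s;
 int n=2,i,j,k,it;
 for(it=1;it<=20;it++)
 {
  for(i=1;i<=n;i++)
   for(j=1;j<=n;j++){ l[i][j]=(i==j); u[i][j]=0; }
  for(i=1;i<=n;i++) /* Doolittle factorization A = L U */
  {
   for(j=i;j<=n;j++) /* row i of U */
   {
    s=a[i][j];
    for(k=1;k<i;k++) s-=l[i][k]*u[k][j];
    u[i][j]=s;
   }
   for(j=i+1;j<=n;j++) /* column i of L */
   {
    s=a[j][i];
    for(k=1;k<i;k++) s-=l[j][k]*u[k][i];
    l[j][i]=s/u[i][i];
   }
  }
  for(i=1;i<=n;i++) /* reverse product: next A = U L */
   for(j=1;j<=n;j++)
   {
    s=0;
    for(k=1;k<=n;k++) s+=u[i][k]*l[k][j];
    a[i][j]=s;
   }
 }
 printf("Approximate eigenvalues: %8.5f %8.5f\n",a[1][1],a[2][2]);
}

After 20 iterations the diagonal is close to the exact eigenvalues 3 and 2.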

6.3.2 Power method


Power method is generally used to find the eigenvalue, largest in magnitude, (sometimes
called first eigenvalue) of a matrix A. Let λ1 , λ2 , . . . , λn be all eigenvalues of the matrix
A. We assume that
|λ1 | > |λ2 | > |λ3 | > · · · > |λn |,
i.e., λ1 is largest in magnitude and X1 , X2 , . . . , Xn be the eigenvectors corresponding
to the eigenvalues λ1 , λ2 , . . . , λn respectively. The method is applicable if the matrix A
has n independent eigenvectors. Then any vector X in the (vector) space of eigenvectors
X1 , X2 , . . . , Xn can be written as

X = c1 X 1 + c2 X 2 + · · · + c n X n . (6.13)

Multiplying this relation by A and using the results AX_1 = λ_1 X_1, AX_2 = λ_2 X_2, ..., AX_n = λ_n X_n, we obtain

    AX = c_1 λ_1 X_1 + c_2 λ_2 X_2 + ... + c_n λ_n X_n
       = λ_1 [ c_1 X_1 + c_2 (λ_2/λ_1) X_2 + ... + c_n (λ_n/λ_1) X_n ].             (6.14)

Again, multiplying this relation by A, A^2, ..., A^k successively, we obtain the relations

    A^2 X = λ_1^2 [ c_1 X_1 + c_2 (λ_2/λ_1)^2 X_2 + ... + c_n (λ_n/λ_1)^2 X_n ]     (6.15)
    ............................................................................
    A^k X = λ_1^k [ c_1 X_1 + c_2 (λ_2/λ_1)^k X_2 + ... + c_n (λ_n/λ_1)^k X_n ]     (6.16)

    A^{k+1} X = λ_1^{k+1} [ c_1 X_1 + c_2 (λ_2/λ_1)^{k+1} X_2 + ... + c_n (λ_n/λ_1)^{k+1} X_n ].   (6.17)

When k → ∞, the right hand sides of (6.16) and (6.17) tend to λ_1^k c_1 X_1 and λ_1^{k+1} c_1 X_1 respectively, since |λ_i/λ_1| < 1 for i = 2, ..., n. Thus for k → ∞, A^k X = λ_1^k c_1 X_1 and A^{k+1} X = λ_1^{k+1} c_1 X_1. That is, for k → ∞, λ_1 A^k X = A^{k+1} X. It is well known that two vectors are equal if their corresponding components are the same. That is,

    λ_1 = lim_{k→∞} (A^{k+1} X)_r / (A^k X)_r,    r = 1, 2, ..., n.                 (6.18)

The symbol (A^k X)_r denotes the rth component of the vector A^k X.
If |λ_2| ≪ |λ_1|, then the term within the square brackets of (6.17) tends faster to c_1 X_1, i.e., the rate of convergence is fast.
To reduce the round off error, the method is carried out by normalizing (reducing the
largest element to unity) the eigenvector at each iteration. Let X0 be a non-null initial
(arbitrary) vector (non-orthogonal to X1 ) and we compute

    Y_{i+1} = A X_i,
    X_{i+1} = Y_{i+1} / λ^{(i+1)},    for i = 0, 1, 2, ...,                         (6.19)

where λ^{(i+1)} is the largest element in magnitude of Y_{i+1}, and it is the (i+1)th approximate value of λ_1. Then

    λ_1 = lim_{k→∞} (Y_{k+1})_r / (X_k)_r,    r = 1, 2, ..., n,                     (6.20)

and Xk+1 is the eigenvector corresponding to the eigenvalue λ1 .

Note 6.3.1 The initial vector X_0 is usually chosen as X_0 = (1, 1, ..., 1)^T. But if the initial vector X_0 is poor, then the formula (6.20) may not give λ_1, i.e., the limit of the ratio (Y_{k+1})_r/(X_k)_r may not exist. If this situation occurs, then the initial vector must be changed.

Note 6.3.2 The power method is also used to find the least eigenvalue of a matrix A. If X is the eigenvector corresponding to the eigenvalue λ, then AX = λX. If A is non-singular then A^{-1} exists. Therefore, A^{-1}(AX) = λ A^{-1}X, or A^{-1}X = (1/λ)X. This means that if λ is an eigenvalue of A then 1/λ is an eigenvalue of A^{-1}, and the same eigenvector X corresponds to the eigenvalue 1/λ of the matrix A^{-1}. Thus, if λ is the largest (in magnitude) eigenvalue of A then 1/λ is the least eigenvalue of A^{-1}.
Note 6.3.3 We observed that the coefficient of X_j in (6.16) goes to zero in proportion to (λ_j/λ_1)^k and that the speed of convergence is governed by the term (λ_2/λ_1)^k. Consequently, the rate of convergence is linear.

Example 6.3.2 Find the largest eigenvalue in magnitude and corresponding eigen-
vector of the matrix

1 3 2
A =  −1 0 2 
3 4 5

Solution. Let the initial vector be X0 = (1, 1, 1)T .


The first iteration
 is given  by   
1 3 2 1 6
Y1 = AX0 =  −1 0 2   1  =  1  .
3 4 5 1 12
 
0.50000
Y 1
Therefore λ(1) = 12 and X1 = =  0.08333  .
12 1.0000
    
1 3 2 0.50000 2.75
Y2 = AX1 =  −1 0 2   0.08333  =  1.5 
3 4 5 1.00000 6.83333
 
0.40244
Y2
λ(2) = 6.83333, X2 = =  0.21951  .
6.83333 1.0000
    
1 3 2 0.40244 3.06098
Y3 = AX2 =  −1 0 2   0.21951  =  1.59756 
3 4 5 1.00000 7.08537
 
0.43201
λ(3) = 7.08537, X3 =  0.22547  .
1.00000
   
3.10843 0.43185
Y4 =  1.56799  , X4 =  0.21784  , λ(4) = 7.19793.
7.19793 1.

   
3.08691 0.43050
Y5 =  1.56950  , X5 =  0.21880  , λ(5) = 7.16691.
7.16672 1.
   
3.08691 0.43073
Y6 =  1.56950  , X6 =  0.21900  , λ(6) = 7.16672.
7.16672 1.
   
3.08772 0.43075
Y7 =  1.56927  , X7 =  0.21892  , λ(7) = 7.16818.
7.16818 1.0
   
3.08752 0.43074
Y8 =  1.56925  , X8 =  0.21893  , λ(8) = 7.16795.
7.16795 1.0
   
3.08752 0.43074
Y9 =  1.56926  , X9 =  0.21893  , λ(9) = 7.16792.
7.16792 1.0
   
3.08753 0.43074
Y10 =  1.56926  , X10 =  0.21893  , λ(10) = 7.16794.
7.16794 1.0
The required largest eigenvalue is 7.1679 correct up to four decimal places and the
corresponding eigenvector is
 
0.43074
 0.21893  .
1.00000

Algorithm 6.2 (Power method). This method determines the largest eigenvalue
(in magnitude) and its corresponding eigenvector of a square matrix A.

Algorithm Power Method


Step 1. Read the matrix A = [aij ], i, j = 1, 2, . . . , n.
Step 2. Set initial vector X0 = (1, 1, 1, . . . , 1)T of n components.
Step 3. Find the product Y = AX0 .
Step 4. Find the largest element (in magnitude) of the vector Y and let it be λ.
Step 5. Divide all the elements of Y by λ and take it as X1 , i.e., X1 = Y/λ.
Step 6. //Let X0 = (x01 , x02 , . . . , x0n ) and X1 = (x11 , x12 , . . . , x1n ).//
If |xoi − x1i | > ε for at least i then
set X0 = X1 and goto Step 3.
Step 7. Print λ as largest eigenvalue and corresponding eigenvector X1 of A.
end Power Method

Program 6.2.
/* Program Power Method
This program finds the largest eigenvalue (in magnitude)
of a square matrix.*/
#include<stdio.h>
#include<math.h>
void main()
{
int n,i,j,flag;
float a[10][10],x0[10],x1[10],y[10],lambda,eps=1e-5;
printf("Enter the size of the matrix ");
scanf("%d",&n);
printf("Enter the elements row wise ");
for(i=1;i<=n;i++) for(j=1;j<=n;j++) scanf("%f",&a[i][j]);
printf("The given matrix is\n");
for(i=1;i<=n;i++){ /* printing of A */
for(j=1;j<=n;j++) printf("%f ",a[i][j]); printf("\n");
}
printf("\n");
for(i=1;i<=n;i++) {
x0[i]=1; x1[i]=1; /* initialization */
}
do
{
flag=0;
for(i=1;i<=n;i++) x0[i]=x1[i]; /* reset x0 */
for(i=1;i<=n;i++) /* product of A and X0 */
{
y[i]=0;
for(j=1;j<=n;j++) y[i]+=a[i][j]*x0[j];
}
lambda=y[1]; /* finds the element of largest magnitude in y[i] */
for(i=2;i<=n;i++) if(fabs(lambda)<fabs(y[i])) lambda=y[i];
for(i=1;i<=n;i++) x1[i]=y[i]/lambda;
for(i=1;i<=n;i++) if(fabs(x0[i]-x1[i])>eps) flag=1;
}while(flag==1);
printf("The largest eigenvalue is %8.5f \n",lambda);
printf("The corresponding eigenvector is \n");
for(i=1;i<=n;i++) printf("%8.5f ",x1[i]);
}/* main */

A sample of input/output:
Enter the size of the matrix 5
Enter the elements row wise
3 4 5 6 7
0 0 2 1 3
3 4 5 -2 3
3 4 -2 3 4
0 1 2 0 0
The given matrix is
3.000000 4.000000 5.000000 6.000000 7.000000
0.000000 0.000000 2.000000 1.000000 3.000000
3.000000 4.000000 5.000000 -2.000000 3.000000
3.000000 4.000000 -2.000000 3.000000 4.000000
0.000000 1.000000 2.000000 0.000000 0.000000
The largest eigenvalue is 10.41317
The corresponding eigenvector is
1.00000 0.20028 0.62435 0.41939 0.13915

6.3.3 Power method for least eigenvalue


It was mentioned earlier that if λ is the largest eigenvalue of A then 1/λ is the smallest eigenvalue of A^{-1}; conversely, the smallest eigenvalue of A is the reciprocal of the largest eigenvalue of A^{-1}, and so it can be obtained by applying the power method to A^{-1}. But computation of A^{-1} is a labourious process, so a simpler process is needed. One simple method is introduced here.
If the largest eigenvalue (in magnitude) λ_1 of an n × n matrix A is known, then the eigenvalue at the other extreme of the spectrum (the one farthest from λ_1) can be computed by using the power method on the matrix B = A - λ_1 I instead of A. The eigenvalues of the matrix B are λ'_i = λ_i - λ_1 (called shifted eigenvalues), i = 1, 2, ..., n, where the λ_i are the eigenvalues of A. Obviously, λ'_n is the largest magnitude eigenvalue of B. Again, if X_n is the corresponding eigenvector, then BX_n = λ'_n X_n, or (A - λ_1 I)X_n = (λ_n - λ_1)X_n, i.e., AX_n = λ_n X_n. Hence X_n is
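In practice the shift needs only a change to the diagonal before the power-method routine is called again; a minimal sketch (the function name is chosen here for illustration):

/* Form B = A - lambda1*I in place. If mu is the dominant eigenvalue
   of B found by Program 6.2, then mu + lambda1 is an eigenvalue of
   the original matrix A - the one farthest from lambda1. */
void shift_matrix(float a[10][10],int n,float lambda1)
{
 int i;
 for(i=1;i<=n;i++) a[i][i]-=lambda1;
}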

6.4 Eigenvalues for Symmetric Matrices

The methods discussed earlier are also applicable for symmetric matrices, but, due to
the special properties of symmetric matrices some efficient methods are developed here.
Among them three commonly used methods, viz., Jacobi, Givens and Householder are
discussed here.
In linear algebra it is established that all the eigenvalues of a real symmetric matrix
are real.

6.4.1 Jacobi’s method


This method is widely used to find the eigenvalues and eigenvectors of a real symmetric matrix. Since all the eigenvalues of A are real, there exists a real orthogonal matrix S such that S^{-1}AS is a diagonal matrix D. As D and A are similar, the diagonal elements of D are the eigenvalues of A. But the computation of the matrix S is not a simple task. It is obtained by a series of orthogonal transformations S_1, S_2, ..., S_n, ... as discussed below.
Let |aij | be the largest element among the off-diagonal elements of A. Now, we
construct an orthogonal matrix S1 whose elements are defined as

sij = − sin θ, sji = sin θ, sii = cos θ, sjj = cos θ, (6.21)

all other off-diagonal elements are zero and all other diagonal elements are unity. Thus
S1 is of the form

 
1 0 ··· 0 ··· 0··· 0
0 1 ··· 0 ··· 0··· 0 
 
 .. .. .. .. .. 
. . . . .
 
0 0 · · · cos θ · · · − sin θ · · · 0 
 
S1 =  . .. .. .. ..  (6.22)
 .. . . . .
 
0 0 · · · sin θ · · · cos θ · · · 0 
 
 .. .. .. .. .. 
. . . . .
0 0 ··· 0 ··· 0 ··· 1

where cos θ, − sin θ, sin θ and cos θ are at the positions (i, i), (i, j), (j, i) and (j, j) re-
spectively.
aii aij
Let A1 = be a sub-matrix of A formed by the elements aii , aij , aji and ajj .
aji ajj
To reduce A1 to a diagonal matrix, an orthogonal transformation is applied which is
cos θ − sin θ
defined as S1 = , where θ is an unknown quantity and it will be selected
sin θ cos θ
in such a way that A1 becomes diagonal.
Now,

    S_1^{-1} A_1 S_1 = |  cos θ   sin θ | | a_ii  a_ij | | cos θ  -sin θ |
                       | -sin θ   cos θ | | a_ji  a_jj | | sin θ   cos θ |

    = | a_ii cos^2 θ + a_ij sin 2θ + a_jj sin^2 θ     (a_jj - a_ii) sin θ cos θ + a_ij cos 2θ   |.
      | (a_jj - a_ii) sin θ cos θ + a_ij cos 2θ       a_ii sin^2 θ - a_ij sin 2θ + a_jj cos^2 θ |

This matrix becomes a diagonal matrix if (a_jj - a_ii) sin θ cos θ + a_ij cos 2θ = 0, that is, if

    tan 2θ = 2a_ij / (a_ii - a_jj).                                                 (6.23)

The value of θ can be obtained from the following relation:

    θ = (1/2) tan^{-1} [ 2a_ij / (a_ii - a_jj) ].                                   (6.24)

This expression gives four values of θ, but to get the smallest rotation, θ should lie in -π/4 ≤ θ ≤ π/4. The equation (6.24) is valid for all i, j if a_ii ≠ a_jj. If a_ii = a_jj then

    θ = π/4 if a_ij > 0,    θ = -π/4 if a_ij < 0.                                   (6.25)
Thus the off-diagonal elements at the positions (i, j) and (j, i) of S_1^{-1} A S_1 vanish and the diagonal elements are modified. The first rotated matrix is obtained by computing D_1 = S_1^{-1} A S_1. In the next step the largest off-diagonal element (in magnitude) is selected from the matrix D_1 and the above process is repeated to generate another orthogonal matrix S_2 to compute D_2. That is,

    D_2 = S_2^{-1} D_1 S_2 = S_2^{-1} (S_1^{-1} A S_1) S_2 = (S_1 S_2)^{-1} A (S_1 S_2).

In this way, a series of two-dimensional rotations is performed. At the end of k transformations the matrix D_k is obtained as

    D_k = S_k^{-1} S_{k-1}^{-1} ... S_1^{-1} A S_1 S_2 ... S_{k-1} S_k
        = (S_1 S_2 ... S_k)^{-1} A (S_1 S_2 ... S_k)
        = S^{-1} A S,                                                               (6.26)

where S = S_1 S_2 ... S_k.
As k → ∞, D_k tends to a diagonal matrix. The diagonal elements of D_k are the eigenvalues and the columns of S are the corresponding eigenvectors.
The method has a drawback. The elements those are transferred to zero during
diagonalisation may not necessarily remain zero during subsequent rotations. The value
of θ must be verified for its accuracy by checking whether | sin2 θ + cos2 θ − 1 | is
sufficiently small.

Note 6.4.1 It may be noted that for orthogonal matrix S, S−1 = ST .



Note 6.4.2 It can be shown that the minimum number of rotations required to trans-
form a real symmetric matrix into a diagonal matrix is n(n − 1)/2.

Example
 6.4.1
 Find the eigenvalues and eigenvectors of the symmetric matrix
1 2 2
A=2 1 2  using Jacobi’s method.
2 2 1

Solution. The largest off-diagonal element is 2, at the (1,2), (1,3) and (2,3) positions. The rotational angle θ is given by tan 2θ = 2a_12/(a_11 - a_22) = 4/0 = ∞, i.e., θ = π/4.
Thus the orthogonal matrix
 S 1 is   √ √ 
cos π/4 − sin π/4 0 1/√2 −1/√ 2 0
S1 =  sin π/4 cos π/4 0  =  1/ 2 1/ 2 0  .
0 0 1 0 0 1
Then the first rotation yields
 √ √   √ √ 
1/ √2 1/√2 0 1 2 2 1/√2 −1/√ 2 0
D1 = S−1 1 AS1 =
 −1/ 2 1/ 2 0   2 1 2   1/ 2 1/ 2 0 
0 0 1 2 2 1 0 0 1
 √   
3 0 4/ 2 3 0 2.82843
=  0√ −1 0  =  0 −1 0 .
4/ 2 0 1 2.82843 0 1

The largest off-diagonal element of D_1 is now 2.82843, situated at the (1,3) position, and hence the rotational angle is

    θ = (1/2) tan^{-1} [ 2a_13/(a_11 - a_33) ] = 0.61548.
The second orthogonal matrix S2 is
   
cos θ 0 − sin θ 0.81650 0 −0.57735
S2 =  0 1 0 = 0 1 0 .
sin θ 0 cos θ 0.57735 0 0.81650
Then second rotation gives
D2 = S−1 D1 S2
2   
0.81650 0 0.57735 3 0 2.82843 0.81650 0 −0.57735
=  0 1 0  0 −1 0  0 1 0 
−0.577351 0 0.81650 2.82843 0 1 0.57735 0 0.81650
 
5 0 0

= 0 −1 0  .
0 0 −1

Thus D2 becomes a diagonal matrix and hence the eigenvalues are 5, −1, −1.
The eigenvectors are the columns of S, where
 √ √  
1/√2 −1/√ 2 0 0.81650 0 −0.57735
S = S1 S2 =  1/ 2 1/ 2 0   0 1 0 
0 0 1 0.57735 0 0.81650
 
0.57735 −0.70711 −0.40825
=  0.57735 0.70711 −0.40825  .
0.57735 0 0.81650

Hence the eigenvalues are 5, −1, −1 and the corresponding eigenvectors are
(0.57735, 0.57735, 0.57735)T , (−0.70711, −0.70711, 0)T ,
(−0.40825, −0.40825, 0.81650)T respectively.
Note that the eigenvectors are normalized and two independent eigenvectors (last two
vectors) for the eigenvalue −1 are obtained by this method.
In this problem, two rotations are used to convert A into a diagonal matrix. But,
this does not happen in general. The following example shows that at least six rotations
are needed to diagonalise a symmetric matrix.
Example
  6.4.2 Find all the eigenvalues and eigenvectors of the matrix
2 3 1
 3 2 2  by Jacobi’s method.
1 2 1
 
2 3 1
Solution. Let A =  3 2 2  .
1 2 1
It is a real symmetric matrix and the Jacobi’s method is applicable.
The largest off-diagonal element is at a_12 = a_21 and it is 3. Then tan 2θ = 2a_12/(a_11 - a_22) = 6/0 = ∞, and this gives θ = π/4.
Thus the orthogonal matrix S1 is
   √ √ 
cos π/4 − sin π/4 0 1/√2 −1/√ 2 0
S1 =  sin π/4 cos π/4 0  =  1/ 2 1/ 2 0  .
0 0 1 0 0 1
The first rotation gives
 √ √   √ √ 
1/ √2 1/√2 0 2 3 1 1/√2 −1/√ 2 0
D1 = S−1
1 AS1 =
 −1/ 2 1/ 2 0   3 2 2   1/ 2 1/ 2 0 
0 0 1 1 2 1 0 0 1

 √ 
5 0 3/√2
=  0√ −1 
√ 1/ 2 .
3/ 2 1/ 2 1

The largest off-diagonal element of D_1 is 3/√2, situated at the (1,3) position. Then

    tan 2θ = 2a_13/(a_11 - a_33) = (6/√2)/(5 - 1) = 1.06066,  or  θ = (1/2) tan^{-1}(1.06066) = 0.40741.

So the next orthogonal matrix is

    S_2 = | 0.91815  0  -0.39624 |
          | 0        1   0       |.
          | 0.39624  0   0.91815 |

D2 = S−1 D1 S2
2   
0.91815 0 0.39624 5 0 2.12132 0.91815 0 −0.39624
=  0 1 0  0 −1 0.70711   0 1 0 
−0.39624 0 0.91815 2.12132 0.70711 1 0.39624 0 0.91815
 
5.91548 0.28018 0
=  0.28018 −1.0 0.64923  .
0 0.64923 0.08452

The largest off-diagonal element of D_2 is 0.64923, present at the position (2,3). Then

    tan 2θ = 2a_23/(a_22 - a_33) = -1.19727,  or  θ = (1/2) tan^{-1}(-1.19727) = -0.43747.
 
1 0 0
Therefore, S3 =  0 0.90583 0.42365  .
0 −0.42365 0.90583
 
5.91548 0.25379 0.11870
D3 = S−13 D2 S3 =
 0.25379 −1.30364 0 .
0.11870 0 0.38816
Again, the largest off-diagonal element is 0.25379, located at the (1,2) position. Therefore,

    θ = (1/2) tan^{-1} [ 2a_12/(a_11 - a_22) ] = 0.03510.
 
0.99938 −0.03509 0
S4 =  0.03509 0.99938 0  .
0 0 1
 
5.92439 0 0.11863
D4 = S−1
4 D3 S4 =
 0 −1.31255 −0.00417  .
0.11863 −0.00417 0.38816

Here the largest off-diagonal element is 0.11863, at the (1,3) position. In this case

    θ = (1/2) tan^{-1} [ 2a_13/(a_11 - a_33) ] = 0.02141.
 
0.99977 0 −0.02141
S5 =  0 1 0 .
0.02141 0 0.99977
 
5.92693 −0.00009 0
D5 = S−15 D4 S5 =
 −0.00009 −1.31255 −0.00417  .
0 −0.00417 0.38562
The largest off-diagonal element in magnitude is -0.00417, situated at the (2,3) position. Then

    θ = (1/2) tan^{-1} [ 2a_23/(a_22 - a_33) ] = 0.00246.
 
1 0 0
S6 =  0 1 −0.00246  .
0 0.00246 1
 
5.92693 −0.00009 0
D6 = S−16 D5 S6 =
 −0.00009 −1.31256 0 .
0 0 0.38563
This matrix is almost diagonal and hence the eigenvalues are 5.9269, −1.3126 and
0.3856 correct up to four decimal places.
The eigenvectors are the columns of
 
0.61852 −0.54567 −0.56540
S = S1 S2 . . . S6 =  0.67629 0.73604 0.02948  .
0.40006 −0.40061 0.82430

That is, the eigenvectors corresponding to the eigenvalues 5.9269, −1.3126 and 0.3856
are respectively (0.61825, 0.67629, 0.40006)T ,
(−0.54567, 0.73604, −0.40061)T and (−0.56540, 0.02948, 0.82430)T .

Note 6.4.3 This example shows that the elements which were annihilated by a rotation
may not remain zero during the next rotations.

Algorithm 6.3 (Jacobi’s method). This method determines the eigenvalues and
eigenvectors of a real symmetric matrix A, by converting A into a diagonal matrix
by similarity transformation.

Algorithm Jacobi
Step 1. Read the symmetric matrix A = [aij ], i, j = 1, 2, . . . , n.
Step 2. Initialize D = A and S = I, a unit matrix.

Step 3. Find the largest off-diagonal element (in magnitude) from D = [dij ] and
let it be dij .
Step 4. //Find the rotational angle θ.//
If dii = djj then
if dij > 0 then θ = π/4 else θ = −π/4 endif;
else  
1 −1 2dij
θ = 2 tan ;
dii − djj
endif;
Step 5. //Compute the matrix S1 = [spq ]//
Set spq = 0 for all p, q = 1, 2, . . . , n
skk = 1, k = 1, 2, . . . , n
and sii = sjj = cos θ, sij = − sin θ, sji = sin θ.
Step 6. Find D = ST 1 ∗ D ∗ S1 and S = S ∗ S1 ;
Step 7. Repeat steps 3 to 6 until D becomes diagonal.
Step 8. Diagonal elements of D are the eigenvalues and the columns of S are
the corresponding eigenvectors.
end Jacobi
Program 6.3.
/* Program Jacobi’s Method to find eigenvalues
This program finds all the eigenvalues and the corresponding
eigenvectors of a real symmetric matrix. Assume that the
given matrix is real symmetric. */
#include<stdio.h>
#include<math.h>
void main()
{
int n,i,j,p,q,flag;
float a[10][10],d[10][10],s[10][10],s1[10][10],s1t[10][10];
float temp[10][10],theta,zero=1e-4,max,pi=3.141592654;
printf("Enter the size of the matrix ");
scanf("%d",&n);
printf("Enter the elements row wise ");
for(i=1;i<=n;i++) for(j=1;j<=n;j++) scanf("%f",&a[i][j]);
printf("The given matrix is\n");
for(i=1;i<=n;i++) /* printing of A */
{
for(j=1;j<=n;j++) printf("%8.5f ",a[i][j]); printf("\n");
}
printf("\n");

/* initialization of D and S */
for(i=1;i<=n;i++) for(j=1;j<=n;j++){
d[i][j]=a[i][j]; s[i][j]=0;
}
for(i=1;i<=n;i++) s[i][i]=1;
do
{
flag=0;
/* find largest off-diagonal element */
i=1; j=2; max=fabs(d[1][2]);
for(p=1;p<=n;p++) for(q=1;q<=n;q++)
{ if(p!=q) /* off-diagonal element */
if(max<fabs(d[p][q])){
max=fabs(d[p][q]); i=p; j=q;
}
}
if(d[i][i]==d[j][j]){
if(d[i][j]>0) theta=pi/4; else theta=-pi/4;
}
else
{
theta=0.5*atan(2*d[i][j]/(d[i][i]-d[j][j]));
}
/* construction of the matrix S1 and S1T */
for(p=1;p<=n;p++) for(q=1;q<=n;q++)
{s1[p][q]=0; s1t[p][q]=0;}
for(p=1;p<=n;p++) {s1[p][p]=1; s1t[p][p]=1;}
s1[i][i]=cos(theta); s1[j][j]=s1[i][i];
s1[j][i]=sin(theta); s1[i][j]=-s1[j][i];
s1t[i][i]=s1[i][i]; s1t[j][j]=s1[j][j];
s1t[i][j]=s1[j][i]; s1t[j][i]=s1[i][j];
/* product of S1T and D */
for(i=1;i<=n;i++)
for(j=1;j<=n;j++){
temp[i][j]=0;
for(p=1;p<=n;p++) temp[i][j]+=s1t[i][p]*d[p][j];
}
/* product of temp and S1 i.e., D=S1T*D*S1 */
for(i=1;i<=n;i++)
for(j=1;j<=n;j++){

d[i][j]=0;
for(p=1;p<=n;p++) d[i][j]+=temp[i][p]*s1[p][j];
}
/* product of S and S1 i.e., S=S*S1 */
for(i=1;i<=n;i++)
for(j=1;j<=n;j++)
{
temp[i][j]=0;
for(p=1;p<=n;p++) temp[i][j]+=s[i][p]*s1[p][j];
}
for(i=1;i<=n;i++) for(j=1;j<=n;j++) s[i][j]=temp[i][j];
for(i=1;i<=n;i++) for(j=1;j<=n;j++) /* is D diagonal ? */
{
if(i!=j) if(fabs(d[i][j]>zero)) flag=1;
}
}while(flag==1);
printf("The eigenvalues are\n");
for(i=1;i<=n;i++) printf("%8.5f ",d[i][i]);
printf("\nThe corresponding eigenvectors are \n");
for(j=1;j<=n;j++){
printf("(");
for(i=1;i<n;i++) printf("%8.5f,",s[i][j]);
printf("%8.5f)\n",s[n][j]);
}
}/* main */
A sample of input/output:
Enter the size of the matrix 4
Enter the elements row wise
1 2 3 4
2 -3 3 4
3 3 4 5
4 4 5 0
The given matrix is
1.000000 2.000000 3.000000 4.000000
2.000000 -3.000000 3.000000 4.000000
3.000000 3.000000 4.000000 5.000000
4.000000 4.000000 5.000000 0.000000
The eigenvalues are
-0.73369 -5.88321 11.78254 -3.16564

The corresponding eigenvectors are


( 0.74263, 0.04635,-0.65234, 0.14421)
( 0.13467, 0.74460, 0.06235,-0.65081)
( 0.43846, 0.33395, 0.64097, 0.53422)
(-0.48797, 0.57611,-0.39965, 0.51987)

6.4.2 Eigenvalues of a Symmetric Tri-diagonal Matrix

Let

    A = | a_1  b_2  0    0   ...  0    0   |
        | b_2  a_2  b_3  0   ...  0    0   |
        | 0    b_3  a_3  b_4 ...  0    0   |
        | ...  ...  ...  ... ...  ...  ... |
        | 0    0    0    0   ...  b_n  a_n |

be a symmetric tri-diagonal matrix. The characteristic equation of this matrix is


 
    p_n(λ) = | a_1 - λ   b_2       0         0    ...  0     0       |
             | b_2       a_2 - λ   b_3       0    ...  0     0       |
             | 0         b_3       a_3 - λ   b_4  ...  0     0       |  = 0.
             | ...       ...       ...       ...  ...  ...   ...     |
             | 0         0         0         0    ...  b_n   a_n - λ |

Expanding by minors, the sequence {p_k(λ)} satisfies the following equations:

    p_0(λ) = 1
    p_1(λ) = a_1 - λ
    p_{k+1}(λ) = (a_{k+1} - λ) p_k(λ) - b_{k+1}^2 p_{k-1}(λ),   k = 1, 2, ..., n-1.   (6.27)

The polynomial pn (λ) is the characteristic polynomial of A.


If none of b2 , b3 , . . . , bn vanish, then {pn (λ)} is a Sturm sequence. Then using the
property of Sturm sequence, one can determine the intervals (containing the eigenvalue),
by substituting different values of λ. That is, if N (λ) denotes the number of changes
in sign of the Sturm sequence for a given value of λ, then | N (a) − N (b) | represents
the number of zeros (eigenvalues) lie in [a, b], provided pn (a) and pn (b) are not zero.
Once the location of an eigenvalue is identified then using any iterative method such as
bisection method, Newton-Raphson method etc. one can determine it.
Having computed the eigenvalues of A, the eigenvectors of A can be directly com-
puted by solving the resulting homogeneous linear equations (A − λI)X = 0.

Example
  6.4.3 Find the eigenvalues of the following tri-diagonal matrix
3 1 0
 1 2 1 .
0 1 1

Solution. The Sturm sequence {p_k(λ)} is given by

    p_0(λ) = 1
    p_1(λ) = 3 - λ
    p_2(λ) = (2 - λ) p_1(λ) - 1·p_0(λ) = λ^2 - 5λ + 5
    p_3(λ) = (1 - λ) p_2(λ) - 1·p_1(λ) = -λ^3 + 6λ^2 - 9λ + 2.

Now tabulate the values of p0 , p1 , p2 , p3 for different values of λ.


λ p0 p1 p2 p3 N (λ)
–1 + + + + 0
0 + + + + 0
1 + + + – 1
2 + + – 0 1
3 + 0 – + 2
4 + – + – 3

Here p3 (2) = 0, so λ = 2 is an eigenvalue. The other two eigenvalues lie in the


intervals (0, 1) and (3, 4) as N (1) − N (0) = 1 and N (4) − N (3) = 1.
To find the eigenvalue within (0, 1) using Newton-Raphson method
Let λ^{(i)} be the ith approximate value of the eigenvalue in (0, 1). The Newton-Raphson iteration scheme is

    λ^{(i+1)} = λ^{(i)} - p_3(λ^{(i)}) / p_3'(λ^{(i)}).

Initially, let λ^{(0)} = 0.5. The successive iterations are shown below.

    λ^{(i)}      p_3(λ^{(i)})    p_3'(λ^{(i)})    λ^{(i+1)}
0.5 –1.12500 –3.75000 0.20000
0.2 0.43200 –6.72000 0.26429
0.26429 0.02205 –6.03811 0.26794
0.26794 0.00007 –6.00012 0.26795
0.26795 0.00000 –6.00000 0.26795
Hence the other eigenvalue is 0.26795.
To find the eigenvalue within (3, 4) using Newton-Raphson method
Let the initial eigenvalue be λ(0) = 3.5.

The calculations are shown in the following table.

    λ^{(i)}      p_3(λ^{(i)})    p_3'(λ^{(i)})    λ^{(i+1)}
3.5 1.12500 –3.75000 3.80000
3.80000 –0.43200 –6.72000 3.73572
3.73572 –0.02206 –6.03812 3.73206
3.73206 –0.00007 –6.00012 3.73205
3.73205 0.00000 –6.00000 3.73205

The other eigenvalue is 3.73205.
Hence all the eigenvalues are 2, 0.26795 and 3.73205; the exact values are 2 and 2 ± √3.
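The sign-change count N(λ) used above is easy to mechanize; the C sketch below (written for this example, with exact zeros in the sequence ignored for simplicity) evaluates the recurrence (6.27) and counts the changes of sign.

/* Sturm sequence sign changes for a symmetric tri-diagonal matrix
   with diagonal a[1..n] and off-diagonal b[2..n]; the number of
   eigenvalues in (p,q] is N(q)-N(p). */
#include<stdio.h>
int N(float a[],float b[],int n,float lambda)
{
 float p0=1,p1=a[1]-lambda,p2;
 int k,count=0;
 if(p1*p0<0) count++;
 for(k=2;k<=n;k++)
 {
  p2=(a[k]-lambda)*p1-b[k]*b[k]*p0;
  if(p2*p1<0) count++;
  p0=p1; p1=p2;
 }
 return count;
}
void main()
{
 float a[4]={0,3,2,1},b[4]={0,0,1,1}; /* matrix of Example 6.4.3 */
 printf("N(0)=%d N(1)=%d N(4)=%d\n",
        N(a,b,3,0.0),N(a,b,3,1.0),N(a,b,3,4.0));
}

Its output, N(0)=0, N(1)=1, N(4)=3, agrees with the table above.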

6.4.3 Givens method


The Givens method is used to find eigenvalues of a real symmetric matrix A = [aij ].
This method preserves the zeros in the off-diagonal elements, once they are created. This
method works into two steps. In first step, the given symmetric matrix is converted to
a symmetric tri-diagonal matrix using plane rotations. In second step, the eigenvalues
of this new matrix are determined by the method discussed in previous section.
The conversion to a tri-diagonal form is done by using orthogonal transformations as
in Jacobi’s method. In this case, the rotation is performed with the elements a22 , a23 , a32
and a33 .  
1 0 0 0 ··· 0
 0 cos θ − sin θ 0 · · · 0 
 
Let S1 =  
 0 sin θ cos θ 0 · · · 0  be the orthogonal matrix, where θ is unknown.
0 0 0 1 ··· 0 
0 0 0 0 ··· 1
Let B1 be the transformed matrix under orthogonal transformation S1 , Then B1 =
S−1
1 AS1 .
The elements of (1, 3) and (3, 1) positions are equal and they are −a12 sin θ+a13 cos θ.
The angle θ is now obtained by putting this value to zero.
That is,

    -a_12 sin θ + a_13 cos θ = 0,   or   tan θ = a_13 / a_12.                       (6.28)

This transformation is now considered as a rotation in the (2,3)-plane. It may be


noted that this computation is simpler than Jacobi's method. The matrix B_1 has the form

    B_1 = | a_11   a'_12  0      a_14   ...  a_1n  |
          | a'_21  a'_22  a'_23  a'_24  ...  a'_2n |
          | 0      a'_32  a'_33  a'_34  ...  a'_3n |
          | ...    ...    ...    ...    ...  ...   |
          | a_n1   a'_n2  a'_n3  a_n4   ...  a_nn  |

(the primed entries are those modified by the rotation).

Then apply the rotation in the (2,4)-plane to convert a_14 and a_41 to zero. This would not affect the zeros that have been created earlier. Successive rotations are carried out in the planes (2,3), (2,4), ..., (2,n), where the θ's are so chosen that the new elements at the positions (1,3), (1,4), ..., (1,n) vanish. After (n-2) such rotations, all elements of the first row and column (except the first two) become zero. The transformed matrix B_{n-2} after (n-2) rotations reduces to the following form:

    B_{n-2} = | a_11   a'_12  0      0      ...  0     |
              | a'_21  a'_22  a'_23  a'_24  ...  a'_2n |
              | 0      a'_32  a'_33  a'_34  ...  a'_3n |
              | ...    ...    ...    ...    ...  ...   |
              | 0      a'_n2  a'_n3  a'_n4  ...  a'_nn |

The second row of Bn−2 is treated in the same way as the first row. The rotations
are made in the planes (3, 4), (3, 5), . . . , (3, n). Thus, after (n − 2) + (n − 3) + · · · + 1
= (n − 1)(n − 2)/2 rotations the matrix A becomes a tri-diagonal matrix B of the form

        [ a1  b2   0   0  ···   0   0  ]
        [ b2  a2  b3   0  ···   0   0  ]
    B = [  0  b3  a3  b4  ···   0   0  ].
        [ ··· ··· ··· ··· ···  ···  ··· ]
        [  0   0   0   0  ···  bn  an  ]

In this process, the previously created zeros are not affected by the successive rotations.
The eigenvalues of B and A are the same, as they are similar matrices.
Example 6.4.4 Find the eigenvalues of the symmetric matrix
        [  2  3 −1 ]
    A = [  3  1  2 ]   using Givens method.
        [ −1  2 −1 ]

Solution. Let the orthogonal matrix S1 be
         [ 1    0       0    ]
    S1 = [ 0  cos θ  −sin θ  ],
         [ 0  sin θ   cos θ  ]

where tan θ = a13/a12 = −1/3, i.e., θ = −0.32175.
                  [ 1     0         0       ]
Therefore,   S1 = [ 0  0.94868   0.31623    ].
                  [ 0 −0.31623   0.94868    ]

The reduced matrix is

                   [ 1     0         0      ] [  2  3 −1 ] [ 1     0        0      ]
  B = S1^{-1}AS1 = [ 0  0.94868  −0.31623   ] [  3  1  2 ] [ 0  0.94868  0.31623   ]
                   [ 0  0.31623   0.94868   ] [ −1  2 −1 ] [ 0 −0.31623  0.94868   ]

      [ 2         3.16228   0.00001 ]
    = [ 3.16228  −0.40001   2.2     ].
      [ 0.00001   2.2       0.40001 ]

            [ 2        3.1623   0   ]
  Let B =   [ 3.1623  −0.4      2.2 ].
            [ 0        2.2      0.4 ]
The Sturm sequence is

    p0(λ) = 1
    p1(λ) = 2 − λ
    p2(λ) = (−0.4 − λ)p1(λ) − 3.1623² p0(λ) = λ² − 1.6λ − 10.8
    p3(λ) = (0.4 − λ)p2(λ) − 2.2² p1(λ) = −λ³ + 2λ² + 15λ − 14.

Now, we tabulate the values of p0 , p1 , p2 , p3 for different values of λ.


λ p0 p1 p2 p3 N (λ)
−4 + + + + 0
−3 + + + – 1
−1 + + – – 1
0 + + – – 1
1 + + – + 2
2 + 0 – + 2
4 + – – + 2
5 + – + – 3

From this table, it is observed that the eigenvalues are located in the intervals
(−4, −3), (0, 1) and (4, 5). Any iterative method may be used to find them. Using
Newton-Raphson method, we find the eigenvalues −3.47531, 0.87584 and 4.59947 of
B. These are also the eigenvalues of A.
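For a larger matrix, the sign-count table above is tedious to produce by hand, and a small routine helps. The following is a minimal C sketch (not from the text) that evaluates the Sturm sequence of a symmetric tri-diagonal matrix with diagonal a1, . . . , an and off-diagonals b2, . . . , bn and returns N(λ), the number of sign changes; treating a zero value as having the sign of the previous term is one common convention and is an assumption here.

#include <stdio.h>

/* N(lambda): number of sign changes in the Sturm sequence
   p0 = 1, p1 = a[1] - lambda,
   pk = (a[k] - lambda) p(k-1) - b[k]^2 p(k-2),  k = 2..n */
int sturm_count(int n, double a[], double b[], double lambda)
{
    double pm2 = 1.0, pm1 = a[1] - lambda, p;
    int count = 0, prev = 1, cur, k;          /* sign of p0 is + */
    cur = (pm1 > 0) - (pm1 < 0);
    if (cur == 0) cur = prev;                 /* zero takes the previous sign */
    if (cur != prev) count++;
    prev = cur;
    for (k = 2; k <= n; k++)
    {
        p = (a[k] - lambda) * pm1 - b[k] * b[k] * pm2;
        cur = (p > 0) - (p < 0);
        if (cur == 0) cur = prev;
        if (cur != prev) count++;
        prev = cur;
        pm2 = pm1; pm1 = p;
    }
    return count;
}

int main()
{
    /* tri-diagonal matrix B of Example 6.4.4 (1-based arrays) */
    double a[] = {0, 2.0, -0.4, 0.4}, b[] = {0, 0, 3.1623, 2.2};
    double lam;
    for (lam = -4; lam <= 5; lam += 1)
        printf("N(%4.1f) = %d\n", lam, sturm_count(3, a, b, lam));
    return 0;
}

The printed counts reproduce the table above, e.g. N(−4) = 0, N(1) = 2 and N(5) = 3.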

6.4.4 Householder’s method

This method is applicable to a real symmetric matrix of order n × n. It is more economical
and efficient than the Givens method. Here also a sequence of orthogonal (similarity)
transformations is used on A to get a tri-diagonal matrix. Each transformation produces
a complete row of zeros in the appropriate positions, without affecting the previous rows.

Thus, (n − 2) Householder transformations are needed to produce the tri-diagonal form.


The orthogonal transformation used in this method is of the form

    S = I − 2VV^T                                                           (6.29)

where V = (s1, s2, . . . , sn)^T is a column vector containing n components, such that

    V^T V = s1² + s2² + · · · + sn² = 1.                                    (6.30)

The matrix S is symmetric and orthogonal, since

    S^T = (I − 2VV^T)^T = I − 2VV^T = S
and
    S^T S = (I − 2VV^T)(I − 2VV^T)
          = I − 4VV^T + 4VV^T VV^T
          = I − 4VV^T + 4VV^T = I   [using (6.30)].                         (6.31)

Thus
    S^{-1}AS = S^T AS = SAS,                                                (6.32)

since S is orthogonal and symmetric.


Let A1 = A and form a sequence of transformations

    Ar = Sr Ar−1 Sr,  r = 2, 3, . . . , n − 1,                              (6.33)

where Sr = I − 2Vr Vr^T, Vr = (0, 0, . . . , 0, sr, sr+1, . . . , sn)^T and
sr² + sr+1² + · · · + sn² = 1.


At the first transformation, we find the sr's in such a way that the elements in the
positions (1, 3), (1, 4), . . . , (1, n) of A2 become zero. Also, the elements in the
corresponding positions in the first column become zero. Therefore, one rotation creates
n − 2 zeros in the first row and first column. In the second rotation, the elements in the
positions (2, 4), (2, 5), . . . , (2, n) and (4, 2), (5, 2), . . . , (n, 2) reduce to zeros.
Thus (n − 2) Householder transformations are required to obtain the tri-diagonal
matrix An−1. This method is illustrated using a 4 × 4 matrix A = [aij]4×4.
In the first transformation, let V2 = (0, s2, s3, s4)^T such that

    s2² + s3² + s4² = 1.                                                    (6.34)

Now,
                        [ 1      0          0          0       ]
                        [ 0   1 − 2s2²   −2s2s3     −2s2s4     ]
    S2 = I − 2V2V2^T =  [ 0   −2s2s3     1 − 2s3²   −2s3s4     ].           (6.35)
                        [ 0   −2s2s4     −2s3s4     1 − 2s4²   ]

The first rows of A1 and S2A1 are the same. The elements in the first row of
A2 = S2A1S2 are given by

    a′11 = a11
    a′12 = (1 − 2s2²)a12 − 2s2s3 a13 − 2s2s4 a14 = a12 − 2s2p1
    a′13 = −2s2s3 a12 + (1 − 2s3²)a13 − 2s3s4 a14 = a13 − 2s3p1
and a′14 = −2s2s4 a12 − 2s3s4 a13 + (1 − 2s4²)a14 = a14 − 2s4p1,

where p1 = s2a12 + s3a13 + s4a14.


It can be verified that

    a′11² + a′12² + a′13² + a′14² = a11² + a12² + a13² + a14².

That is,

    a′12² + a′13² + a′14² = a12² + a13² + a14² = q²  (say)                  (6.36)

and q is a known quantity.


Since the elements at the positions (1, 3) and (1, 4) of A2 need to be zeros, a13 =
0, a14 = 0.
Hence

a13 − 2p1 s3 = 0 (6.37)


a14 − 2p1 s4 = 0 (6.38)
and a12 = ±q or, a12 − 2p1 s2 = ±q. (6.39)

Multiplying equations (6.39), (6.37) and (6.38) by s2, s3 and s4 respectively and
adding them, we obtain the equation

    p1 − 2p1(s2² + s3² + s4²) = ±qs2,  or,  p1 = ∓s2q.                      (6.40)

Thus from (6.39), (6.37) and (6.38) the values of s2, s3 and s4 are obtained as

    s2² = (1/2)(1 ∓ a12/q),   s3 = ∓a13/(2s2q),   s4 = ∓a14/(2s2q).         (6.41)

It is noticed that the values of s3 and s4 depend on s2, so better accuracy can be
achieved if s2 is large. This can be obtained by taking the suitable sign in (6.41),
choosing

    s2² = (1/2)(1 + a12 × sign(a12)/q).                                     (6.42)

The sign of the square root is irrelevant, and the positive sign is taken. Hence

    s3 = a13 × sign(a12)/(2q s2),   s4 = a14 × sign(a12)/(2q s2).
Thus first transformation generates two zeros in the first row and first column. The
second transformation is required to create zeros at the positions (2, 4) and (4, 2).
In the second transformation, let V3 = (0, 0, s3, s4)^T and the matrix

         [ 1  0     0          0       ]
         [ 0  1     0          0       ]
    S3 = [ 0  0  1 − 2s3²   −2s3s4     ].                                   (6.43)
         [ 0  0  −2s3s4     1 − 2s4²   ]
The values of s3 and s4 are computed by the previous technique, and the new
matrix A3 = S3A2S3 is obtained. The zeros in the first row and first column remain
unchanged while computing A3. Thus A3 reduces to a tri-diagonal form in this case.
The application of this method to a general n × n matrix is obvious. The elements of
the vector Vk = (0, · · · , 0, sk, sk+1, · · · , sn)^T at the kth transformation are given by

    sk² = (1/2)(1 + akr × sign(akr)/q),  r = k + 1,
          where q = √(Σ_{i=k+1}^{n} aki²),
    si  = aki × sign(akr)/(2q sk),  i = k + 1, . . . , n.
Since the tri-diagonal matrix An−1 is similar to the original matrix A, they have
identical eigenvalues. The eigenvalues of An−1 are computed in the same way as in the
Givens method. Once the eigenvalues become available, the eigenvectors are obtained
by solving the homogeneous system of equations (A − λI)X = 0.
Example 6.4.5 Use the Householder method to reduce the matrix
        [  2 −1 −1  1 ]
    A = [ −1  4  1 −1 ]
        [ −1  1  3 −1 ]   into the tri-diagonal form.
        [  1 −1 −1  2 ]

Solution. First rotation.
Let V2 = (0, s2, s3, s4)^T,  q = √(a12² + a13² + a14²) = √3,

    s2² = (1/2)(1 + (−1)(−1)/√3) = 0.78868,  s2 = 0.88807,
    s3 = (−1)(−1)/(2√3 × 0.88807) = 0.32506,
    s4 = 1 × (−1)/(2√3 × 0.88807) = −0.32506.
2 3 × 0.88807 2 3 × 0.88807

    V2 = (0, 0.88807, 0.32506, −0.32506)^T.

    S2 = I − 2V2V2^T
         [ 1     0          0         0       ]
       = [ 0  −0.57734   −0.57735   0.57735   ]
         [ 0  −0.57735    0.78867   0.21133   ].
         [ 0   0.57735    0.21133   0.78867   ]

                  [ 2        1.73204   0         0        ]
    A2 = S2A1S2 = [ 1.73204  5.0       0.21132  −0.78867  ]
                  [ 0        0.21132   2.28867  −0.5      ].
                  [ 0       −0.78867  −0.5       1.71133  ]

Second transformation.
V3 = (0, 0, s3, s4)^T,  q = √(a23² + a24²) = 0.81649,

    s3² = (1/2)(1 + a23 × sign(a23)/q) = 0.62941,  s3 = 0.79335,
    s4 = a24 × sign(a23)/(2q s3) = −0.60876.

    V3 = (0, 0, 0.79335, −0.60876)^T.

                       [ 1  0     0         0        ]
    S3 = I − 2V3V3^T = [ 0  1     0         0        ]
                       [ 0  0  −0.25881   0.96592    ].
                       [ 0  0   0.96592   0.25882    ]

                  [ 2        1.73204   0         0        ]
    A3 = S3A2S3 = [ 1.73204  5.0      −0.81648   0        ]
                  [ 0       −0.81648   2        −0.57731  ].
                  [ 0        0        −0.57731   2        ]

This is the required tri-diagonal matrix similar to A.

Algorithm 6.4 (Householder method). This method converts a real symmetric


matrix A of order n × n into a real symmetric tri-diagonal form.

Algorithm Householder
Step 1. Read the symmetric matrix A = [aij ], i, j = 1, 2, . . . , n.
Step 2. Set k = 1, r = 2.
Step 3. //Compute the vector V = (v1, v2, · · · , vn)^T//
    Step 3.1. Compute q = √(Σ_{i=k+1}^{n} aki²).
    Step 3.2. Set vi = 0 for i = 1, 2, . . . , r − 1.
    Step 3.3. Compute vr² = (1/2)(1 + akr × sign(akr)/q).
    Step 3.4. Compute vi = aki × sign(akr)/(2q vr) for i = r + 1, . . . , n.
Step 4. Compute the transformation matrix S = I − 2V ∗ VT .
Step 5. Compute A = S ∗ A ∗ S.
Step 6. Set k = k + 1, r = r + 1.
Step 7. Repeat steps 3 to 6 while k ≤ n − 2.
end Householder
Program 6.4 .
/* Program Householder method
This program reduces the given real symmetric matrix
into a real symmetric tri-diagonal matrix. Assume that
the given matrix is real symmetric. */
#include<stdio.h>
#include<math.h>
void main()
{
int n,i,j,r=2,k,l,sign;
float a[10][10],v[10],s[10][10],temp[10][10],q;
printf("Enter the size of the matrix ");
scanf("%d",&n);
printf("Enter the elements row wise ");
for(i=1;i<=n;i++) for(j=1;j<=n;j++) scanf("%f",&a[i][j]);
printf("The given matrix is\n");
for(i=1;i<=n;i++) /* printing of A */
{
for(j=1;j<=n;j++) printf("%8.5f ",a[i][j]); printf("\n");
}
for(k=1;k<=n-2;k++)
{
q=0;
for(i=k+1;i<=n;i++) q+=a[k][i]*a[k][i];
q=sqrt(q);
for(i=1;i<=r-1;i++) v[i]=0;
sign=1; if(a[k][r]<0) sign=-1;
v[r]=sqrt(0.5*(1+a[k][r]*sign/q));
for(i=r+1;i<=n;i++) v[i]=a[k][i]*sign/(2*q*v[r]);
/* construction of S */
for(i=1;i<=n;i++) for(j=1;j<=n;j++) s[i][j]=-2*v[i]*v[j];

for(i=1;i<=n;i++) s[i][i]=1+s[i][i];
for(i=1;i<=n;i++) for(j=1;j<=n;j++)
{
temp[i][j]=0;
for(l=1;l<=n;l++) temp[i][j]+=s[i][l]*a[l][j];
}
for(i=1;i<=n;i++) for(j=1;j<=n;j++)
{
a[i][j]=0;
for(l=1;l<=n;l++) a[i][j]+=temp[i][l]*s[l][j];
}
r++;
} /* end of loop k */
printf("The reduced symmetric tri-diagonal matrix is\n");
for(i=1;i<=n;i++)
{
for(j=1;j<=n;j++) printf("%8.5f ",a[i][j]);
printf("\n");
}
}/* main */

A sample of input/output:

Enter the size of the matrix 5


Enter the elements row wise
1 -1 -2 1 1
-1 0 1 3 2
-2 1 3 1 1
1 3 1 4 0
1 2 1 0 5
The given matrix is
1.00000 -1.00000 -2.00000 1.00000 1.00000
-1.00000 0.00000 1.00000 3.00000 2.00000
-2.00000 1.00000 3.00000 1.00000 1.00000
1.00000 3.00000 1.00000 4.00000 0.00000
1.00000 2.00000 1.00000 0.00000 5.00000

The reduced symmetric tri-diagonal matrix is


1.00000 2.64575 -0.00000 0.00000 -0.00000
2.64575 1.00000 2.03540 -0.00000 0.00000

-0.00000 2.03540 -0.58621 0.94489 0.00000
0.00000 0.00000 0.94489 5.44237 -1.26864
-0.00000 0.00000 0.00000 -1.26864 6.14384

6.5 Exercise

1. Compare Jacobi, Givens and Householder methods to find the eigenvalues of a


real symmetric matrix.

2. If X is any vector and P = I − 2XXT , show that P is symmetric. What additional


condition is necessary in order that P is orthogonal ?

3. Use the Leverrier-Faddeev method to find the characteristic equations of the
   following matrices.
       [ 2  5  7 ]        [ 2 −1  3 −4 ]        [ 1  1 −1  1 ]
   (a) [ 6  3  4 ],   (b) [ 3 −2  4  1 ],   (c) [ 1  1 −1 −1 ].
       [ 5 −2 −3 ]        [ 5 −3 −2  2 ]        [ 1 −1  1 −1 ]
                          [ 3 −3 −1  1 ]        [ 1 −1 −1  1 ]

4. Use the Leverrier-Faddeev method to find the eigenvalues and eigenvectors of
   the matrices
       [  2  3 ]        [  5  6 −3 ]        [ 1  2 −3 ]        [  1 −2  1 −2 ]
   (a) [ −1  2 ],   (b) [ −1  0  1 ],   (c) [ 3 −1  2 ],   (d) [  2 −1  2 −1 ].
                        [  1  2 −1 ]        [ 1  0 −1 ]        [  1  1 −2 −2 ]
                                                               [ −2 −2  1  1 ]

5. Find all eigenvalues of the following matrices using Rutishauser method.
       [  4  5 ]        [ 3  2 −3 ]
   (a) [ −1  1 ],   (b) [ 0  1  1 ].
                        [ 1 −2  1 ]

6. Use power method to find the largest and the least (in magnitude) eigenvalues
   of the following matrices.
       [ 1 2 ]        [ 4 1 0 ]        [  2 −1  2 ]        [ 3 1 0 ]
   (a) [ 2 3 ],   (b) [ 1 2 1 ],   (c) [  5 −3  3 ],   (d) [ 1 2 2 ].
                      [ 0 1 1 ]        [ −1  0 −2 ]        [ 0 1 1 ]

7. Use Jacobi's method to find the eigenvalues of the following matrices.
       [ 3 2 1 ]        [ −2 −2  6 ]        [ 4 3 2 1 ]
   (a) [ 2 3 2 ],   (b) [ −2  5  4 ],   (c) [ 3 4 3 2 ].
       [ 1 2 3 ]        [  6  4  1 ]        [ 2 3 4 3 ]
                                            [ 1 2 3 4 ]

8. UseGivens method
 to find 
the eigenvalues
 of the
 following symmetric matrices.
3 2 1 2 2 6 1 1 −1
 
(a) 2 3 2 , (b) 2 5 4 , (c)  1 2 3 .
1 2 3 6 4 1 −1 3 1
9. Use Householder method to convert the above matrices to tri-diagonal form.

10. Find the eigenvalues


  of the 
following matrices using Householder method.
4 3 2 5 4 3
(a)  3 4 3 , (b)  4 5 4 .
2 3 4 3 4 5
11. Find the eigenvalues of the following tri-diagonal matrices.
        [ 2 1 0 ]        [ 3 1 0 ]        [ 4 1 0 0 ]
    (a) [ 1 3 1 ],   (b) [ 1 3 1 ],   (c) [ 1 4 1 0 ].
        [ 0 1 2 ]        [ 0 1 3 ]        [ 0 1 4 1 ]
                                          [ 0 0 1 4 ]
12. Find the spectralradii of
 the following matrices.

−2 1 1 −3 −7 −2
(a)  −6 1 3 , (b)  12 20 6 .
−12 −2 8 −20 −31 −9
 √ √ 
√1 √ 2 2 √ 2
 2 − 2 −1 2 
13. Transform the symmetric matrix A =  √ √ √ 
 2 −1 2 2  to a tri-diagonal
√ √
2 2 2 −3
form, using Givens method, by a sequence of orthogonal transformations. Use
exact arithmetic.
Chapter 7

Differentiation and Integration

7.1 Differentiation

Numerical differentiation is a method to find the derivatives of a function at some values
of the independent variable x when the function f(x) is not known explicitly, but is known
only for a set of arguments.
Like interpolation, a number of formulae for differentiation can be derived. The choice
of formula depends on the point at which the derivative is to be determined. So, to find
the derivative at a point at the beginning of the table, the formula based on Newton's
forward interpolation is used, while at a point near the end of the table, the formula
based on Newton's backward interpolation is used. If the given values of xi are not
equispaced, then the formula based on Lagrange's interpolation is appropriate.

7.1.1 Error in Numerical Differentiation


The error in polynomial interpolation is

    E(x) = (x − x0)(x − x1) · · · (x − xn) f^(n+1)(ξ)/(n + 1)! = w(x) f^(n+1)(ξ)/(n + 1)!,

where min{x, x0, . . . , xn} < ξ < max{x, x0, . . . , xn} and w(x) = (x − x0)(x − x1) · · · (x − xn).
Obviously, ξ = ξ(x) is an unknown function of x. Therefore,

    E′(x) = w′(x) f^(n+1)(ξ)/(n + 1)! + w(x) [f^(n+2)(ξ)/(n + 1)!] ξ′(x).   (7.1)

The bound of the second term is unknown due to the presence of the unknown
quantity ξ′(x).


But, at x = xi, w(x) = 0. Thus,

    E′(xi) = w′(xi) f^(n+1)(ξi)/(n + 1)!,                                   (7.2)

where min{x, x0, . . . , xn} < ξi < max{x, x0, . . . , xn}. The error can also be expressed
in terms of divided differences.
Let E(x) = w(x) f[x, x0, x1, . . . , xn], where f[x, x0, x1, . . . , xn] = f^(n+1)(ξ)/(n + 1)!.
Then E′(x) = w′(x) f[x, x0, x1, . . . , xn] + w(x) f[x, x, x0, x1, . . . , xn].
Now, this expression is differentiated (k − 1) more times by Leibnitz's theorem:

 E^(k)(x) = Σ_{i=0}^{k} kCi w^(i)(x) (d^{k−i}/dx^{k−i}) f[x, x0, . . . , xn]
          = Σ_{i=0}^{k} kCi w^(i)(x) (k − i)! f[x, x, . . . , x, x0, . . . , xn]   ((k − i + 1) x's)
          = Σ_{i=0}^{k} (k!/i!) w^(i)(x) f[x, x, . . . , x, x0, . . . , xn],        (7.3)

where w^(i)(x) denotes the ith derivative of w(x).

Note 7.1.1 If a function f(x) is well approximated by a polynomial φ(x) of degree at
most n, the slope f′(x) can also be approximated by the slope φ′(x). But the error
committed in φ′(x) is larger than the error committed in φ(x).

7.2 Differentiation Based on Newton’s Forward Interpolation


Polynomial

Suppose the function y = f(x) is known at (n + 1) equispaced points x0, x1, . . . , xn and
the values there are y0, y1, . . . , yn respectively, i.e., yi = f(xi), i = 0, 1, . . . , n. Let
xi = x0 + ih and u = (x − x0)/h, where h is the spacing.
The Newton's forward interpolation formula is

 φ(x) = y0 + u∆y0 + [u(u − 1)/2!]∆²y0 + · · · + [u(u − 1) · · · (u − n + 1)/n!]∆ⁿy0
      = y0 + u∆y0 + [(u² − u)/2!]∆²y0 + [(u³ − 3u² + 2u)/3!]∆³y0
        + [(u⁴ − 6u³ + 11u² − 6u)/4!]∆⁴y0
        + [(u⁵ − 10u⁴ + 35u³ − 50u² + 24u)/5!]∆⁵y0 + · · ·                  (7.4)

with error

    E(x) = [u(u − 1) · · · (u − n)/(n + 1)!] h^{n+1} f^(n+1)(ξ),

where min{x, x0, · · · , xn} < ξ < max{x, x0, . . . , xn}.
Differentiating (7.4) successively with respect to x (since du/dx = 1/h), we obtain

 φ′(x) = (1/h)[∆y0 + (2u − 1)/2! ∆²y0 + (3u² − 6u + 2)/3! ∆³y0
         + (4u³ − 18u² + 22u − 6)/4! ∆⁴y0
         + (5u⁴ − 40u³ + 105u² − 100u + 24)/5! ∆⁵y0 + · · · ]               (7.5)

 φ″(x) = (1/h²)[∆²y0 + (6u − 6)/3! ∆³y0 + (12u² − 36u + 22)/4! ∆⁴y0
         + (20u³ − 120u² + 210u − 100)/5! ∆⁵y0 + · · · ]                    (7.6)

 φ‴(x) = (1/h³)[∆³y0 + (24u − 36)/4! ∆⁴y0 + (60u² − 240u + 210)/5! ∆⁵y0 + · · · ]   (7.7)

and so on.
It may be noted that ∆y0, ∆²y0, ∆³y0, · · · are constants.
The above equations give the approximate derivatives of f(x) at an arbitrary point
x (= x0 + uh). When x = x0, u = 0, and the formulae become

 φ′(x0) = (1/h)[∆y0 − (1/2)∆²y0 + (1/3)∆³y0 − (1/4)∆⁴y0 + (1/5)∆⁵y0 − · · · ]   (7.8)
 φ″(x0) = (1/h²)[∆²y0 − ∆³y0 + (11/12)∆⁴y0 − (5/6)∆⁵y0 + · · · ]                 (7.9)
 φ‴(x0) = (1/h³)[∆³y0 − (3/2)∆⁴y0 + (7/4)∆⁵y0 − · · · ]                          (7.10)

and so on.
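Formulae (7.8)-(7.10) are straightforward to evaluate once the difference table is available. The following is a minimal C sketch (not one of the book's numbered programs) that builds a forward difference table and applies (7.8) and (7.9); the data are those of Example 7.2.1 below (h = 0.5), for which the answers are y′(1.5) = 4.75 and y″(1.5) = 9.0.

#include <stdio.h>

int main()
{
    /* data of Example 7.2.1: yi = f(xi), xi = 1.5, 2.0, ..., 4.0 */
    double t[6][6] = {{3.375, 7.000, 13.625, 24.000, 38.875, 59.000}};
    double h = 0.5, d1 = 0, d2 = 0;
    double c1[] = {1.0, -1.0/2, 1.0/3, -1.0/4, 1.0/5};    /* (7.8) */
    double c2[] = {1.0, -1.0, 11.0/12, -5.0/6};           /* (7.9) */
    int n = 6, i, k;

    /* forward difference table: t[k][i] = k-th difference at x_i */
    for (k = 1; k < n; k++)
        for (i = 0; i < n - k; i++)
            t[k][i] = t[k-1][i+1] - t[k-1][i];

    /* series (7.8) and (7.9), truncated after the available differences */
    for (k = 0; k < 5; k++) d1 += c1[k] * t[k+1][0];
    for (k = 0; k < 4; k++) d2 += c2[k] * t[k+2][0];
    printf("f'(1.5)  = %8.5f\n", d1 / h);        /* prints 4.75000 */
    printf("f''(1.5) = %8.5f\n", d2 / (h*h));    /* prints 9.00000 */
    return 0;
}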

Error in differentiation formula based on Newton's forward interpolation polynomial

The error in Newton's forward interpolation formula is

    E(x) = u(u − 1) · · · (u − n) h^{n+1} f^(n+1)(ξ)/(n + 1)!.

Then

 E′(x) = hⁿ [f^(n+1)(ξ)/(n + 1)!] (d/du)[u(u − 1) · · · (u − n)]
         + [u(u − 1) · · · (u − n)/(n + 1)!] h^{n+1} f^(n+2)(ξ1),           (7.11)

where min{x, x0, x1, . . . , xn} < ξ, ξ1 < max{x, x0, x1, . . . , xn}.
The error at the point x = x0, i.e., at u = 0, is

 E′(x0) = hⁿ [f^(n+1)(ξ)/(n + 1)!] (d/du)[u(u − 1) · · · (u − n)]_{u=0}
        = hⁿ (−1)ⁿ n! f^(n+1)(ξ)/(n + 1)!
          [as (d/du)[u(u − 1) · · · (u − n)]_{u=0} = (−1)ⁿ n!]
        = (−1)ⁿ hⁿ f^(n+1)(ξ)/(n + 1),                                      (7.12)

where min{x, x0, x1, . . . , xn} < ξ < max{x, x0, x1, . . . , xn}.

Example 7.2.1 From the following table find the values of dy/dx and d²y/dx² at the
point x = 1.5.
    x : 1.5    2.0    2.5     3.0     3.5     4.0
    y : 3.375  7.000  13.625  24.000  38.875  59.000

Solution. The forward difference table is


    x      y        ∆y       ∆²y      ∆³y
1.5 3.375
3.625
2.0 7.000 3.000
6.625 0.750
2.5 13.625 3.750
10.375 0.750
3.0 24.000 4.500
14.875 0.750
3.5 38.875 5.250
20.125
4.0 59.000

Here x0 = 1.5 and h = 0.5. Then u = 0 and hence

 y′(1.5) = (1/h)[∆y0 − (1/2)∆²y0 + (1/3)∆³y0 − · · · ]
         = (1/0.5)[3.625 − (1/2) × 3.000 + (1/3) × 0.750] = 4.750.

 y″(1.5) = (1/h²)(∆²y0 − ∆³y0 + · · · ) = [1/(0.5)²](3.000 − 0.750) = 9.000.

7.3 Differentiation Based on Newton’s Backward Interpolation


Polynomial

Suppose the function y = f(x) is known at (n + 1) points x0, x1, . . . , xn, i.e., yi = f(xi),
i = 0, 1, 2, . . . , n, are known. Let xi = x0 + ih, i = 0, 1, 2, . . . , n, and v = (x − xn)/h.
Then Newton's backward interpolation formula is

 φ(x) = yn + v∇yn + [v(v + 1)/2!]∇²yn + [v(v + 1)(v + 2)/3!]∇³yn
        + [v(v + 1)(v + 2)(v + 3)/4!]∇⁴yn
        + [v(v + 1)(v + 2)(v + 3)(v + 4)/5!]∇⁵yn + · · ·

The above equation is differentiated with respect to x successively.

 φ′(x) = (1/h)[∇yn + (2v + 1)/2! ∇²yn + (3v² + 6v + 2)/3! ∇³yn
         + (4v³ + 18v² + 22v + 6)/4! ∇⁴yn
         + (5v⁴ + 40v³ + 105v² + 100v + 24)/5! ∇⁵yn + · · · ]               (7.13)

 φ″(x) = (1/h²)[∇²yn + (6v + 6)/3! ∇³yn + (12v² + 36v + 22)/4! ∇⁴yn
         + (20v³ + 120v² + 210v + 100)/5! ∇⁵yn + · · · ]                    (7.14)

 φ‴(x) = (1/h³)[∇³yn + (24v + 36)/4! ∇⁴yn + (60v² + 240v + 210)/5! ∇⁵yn + · · · ]   (7.15)

and so on.
The above formulae are used to determine the approximate derivatives of first, second,
third, etc. order at any point x, where x = xn + vh.

If x = xn then v = 0. In this case, the above formulae become

 φ′(xn) = (1/h)[∇yn + (1/2)∇²yn + (1/3)∇³yn + (1/4)∇⁴yn + (1/5)∇⁵yn + · · · ]   (7.16)
 φ″(xn) = (1/h²)[∇²yn + ∇³yn + (11/12)∇⁴yn + (5/6)∇⁵yn + · · · ]                 (7.17)
 φ‴(xn) = (1/h³)[∇³yn + (3/2)∇⁴yn + (7/4)∇⁵yn + · · · ]                          (7.18)
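The same computation at the end of a table uses (7.16) and (7.17) with backward differences. The following is a minimal C sketch (not from the text), using the data of Example 7.3.1 below; it reproduces the velocity 12.75 and acceleration 9.75 obtained there.

#include <stdio.h>

int main()
{
    /* data of Example 7.3.1: displacement x at t = 0, 1, 2, 3, 4 */
    double t[5][5] = {{5, 8, 12, 17, 26}};
    double h = 1.0, v, a;
    int n = 5, i, k;

    /* backward differences: after the loops, t[k][n-1] holds
       the k-th backward difference at the last point t = 4 */
    for (k = 1; k < n; k++)
        for (i = n - 1; i >= k; i--)
            t[k][i] = t[k-1][i] - t[k-1][i-1];

    /* formulae (7.16) and (7.17), truncated after the 4th difference */
    v = (t[1][4] + t[2][4]/2 + t[3][4]/3 + t[4][4]/4) / h;
    a = (t[2][4] + t[3][4] + 11.0*t[4][4]/12) / (h*h);
    printf("velocity     = %6.3f\n", v);   /* prints 12.750 */
    printf("acceleration = %6.3f\n", a);   /* prints  9.750 */
    return 0;
}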

Error in differentiation formula based on Newton's backward interpolation polynomial

The error in Newton's backward interpolation formula is

    E(x) = v(v + 1)(v + 2) · · · (v + n) h^{n+1} f^(n+1)(ξ)/(n + 1)!,

where v = (x − xn)/h and min{x, x0, x1, . . . , xn} < ξ < max{x, x0, x1, . . . , xn}.
Then

 E′(x) = hⁿ [f^(n+1)(ξ)/(n + 1)!] (d/dv)[v(v + 1)(v + 2) · · · (v + n)]
         + h^{n+1} [v(v + 1)(v + 2) · · · (v + n)/(n + 1)!] f^(n+2)(ξ1),

where min{x, x0, x1, . . . , xn} < ξ, ξ1 < max{x, x0, x1, . . . , xn}.
The error at x = xn, i.e., at v = 0, is

 E′(xn) = hⁿ [f^(n+1)(ξ)/(n + 1)!] (d/dv)[v(v + 1)(v + 2) · · · (v + n)]_{v=0}
        = hⁿ [n!/(n + 1)!] f^(n+1)(ξ)   [as (d/dv)[v(v + 1) · · · (v + n)]_{v=0} = n!]
        = hⁿ f^(n+1)(ξ)/(n + 1).                                            (7.19)

Example 7.3.1 A particle is moving along a straight line. The displacement x at
some time instants t is given below:
    t : 0 1 2  3  4
    x : 5 8 12 17 26

Find the velocity and acceleration of the particle at t = 4.



Solution. The backward difference table is


    t   x    ∇x   ∇²x   ∇³x   ∇⁴x
0 5
1 8 3
2 12 4 1
3 17 5 1 0
4 26 9 4 3 3
The velocity is

 dx/dt = (1/h)[∇xn + (1/2)∇²xn + (1/3)∇³xn + (1/4)∇⁴xn + · · · ]
       = (1/1)[9 + (1/2) × 4 + (1/3) × 3 + (1/4) × 3]
       = 12.75.

The acceleration is

 d²x/dt² = (1/h²)[∇²xn + ∇³xn + (11/12)∇⁴xn + · · · ]
         = (1/1²)[4 + 3 + (11/12) × 3] = 9.75.

Example 7.3.2 A slider in a machine moves along a fixed straight rod. Its distance
x (in cm) along the rod is given in the following table for various values of the time t
(in seconds).

    t (sec) : 1.0   1.1   1.2   1.3   1.4   1.5
    x (cm)  : 16.40 19.01 21.96 25.29 29.03 33.21

Find the velocity and the acceleration of the slider at time t = 1.5.

Solution. The backward difference table is


    t     x      ∇x    ∇²x   ∇³x   ∇⁴x   ∇⁵x
1.0 16.40
1.1 19.01 2.61
1.2 21.96 2.95 0.34
1.3 25.29 3.33 0.38 0.04
1.4 29.03 3.74 0.41 0.03 –0.01
1.5 33.21 4.18 0.44 0.03 0.00 0.01

Here h = 0.1.

dx 1 1 1 1 1
= ∇ + ∇2 + ∇3 + ∇4 + ∇5 + · · · xn
dt h 2 3 4 5
1 1 1 1 1
= 4.18 + × 0.44 + × 0.03 + × 0.00 + × 0.01
0.1 2 3 4 5
= 44.12.
d2 x 1 11
2
= 2 ∇2 + ∇3 + ∇5 + · · · xn
dt h 12
1 11 5
= 0.44 + 0.03 + × 0.00 + × 0.01
(0.1)2 12 6
= 47.83.

Hence the velocity and acceleration are respectively 44.12 cm/sec and 47.83 cm/sec².

7.4 Differentiation Based on Stirling’s Interpolation Formula

Suppose y±i = f(x±i), i = 0, 1, . . . , n, are given for 2n + 1 equispaced points x0, x±1,
x±2, . . ., x±n, where x±i = x0 ± ih, i = 0, 1, . . . , n.
The Stirling's interpolation polynomial is

 φ(x) = y0 + (u/1!) (∆y−1 + ∆y0)/2 + (u²/2!) ∆²y−1
        + [(u³ − u)/3!] (∆³y−2 + ∆³y−1)/2 + [(u⁴ − u²)/4!] ∆⁴y−2
        + [(u⁵ − 5u³ + 4u)/5!] (∆⁵y−3 + ∆⁵y−2)/2 + · · · ,
 where u = (x − x0)/h.                                                      (7.20)

This equation is differentiated with respect to x successively.

 φ′(x) = (1/h)[(∆y−1 + ∆y0)/2 + u∆²y−1 + (3u² − 1)/6 · (∆³y−2 + ∆³y−1)/2
         + (2u³ − u)/12 · ∆⁴y−2
         + (5u⁴ − 15u² + 4)/120 · (∆⁵y−3 + ∆⁵y−2)/2 + · · · ].              (7.21)

 φ″(x) = (1/h²)[∆²y−1 + u(∆³y−2 + ∆³y−1)/2 + (6u² − 1)/12 · ∆⁴y−2
         + (2u³ − 3u)/12 · (∆⁵y−3 + ∆⁵y−2)/2 + · · · ].                     (7.22)

At x = x0, u = 0. Then

 φ′(x0) = (1/h)[(∆y0 + ∆y−1)/2 − (1/6)(∆³y−1 + ∆³y−2)/2
          + (1/30)(∆⁵y−2 + ∆⁵y−3)/2 − · · · ].                              (7.23)

and  φ″(x0) = (1/h²)[∆²y−1 − (1/12)∆⁴y−2 + · · · ].                         (7.24)

Error in differentiation formula based on Stirling's interpolation polynomial

The error of Stirling's interpolation formula is

    E(x) = u(u² − 1²)(u² − 2²) · · · (u² − n²) h^{2n+1} f^(2n+1)(ξ)/(2n + 1)!,

where min{x, x−n, . . . , x0, . . . , xn} < ξ < max{x, x−n, . . . , x0, . . . , xn}.
Then

 E′(x) = h^{2n} [f^(2n+1)(ξ)/(2n + 1)!] (d/du)[u(u² − 1²)(u² − 2²) · · · (u² − n²)]
         + h^{2n+1} [u(u² − 1²)(u² − 2²) · · · (u² − n²)/(2n + 1)!] f^(2n+2)(ξ1),   (7.25)

where min{x, x−n, . . . , x0, . . . , xn} < ξ, ξ1 < max{x, x−n, . . . , x0, . . . , xn}.
At x = x0, u = 0. Then (d/du)[u(u² − 1²)(u² − 2²) · · · (u² − n²)]_{u=0} = (−1)ⁿ(n!)².
In this case,

    E′(x0) = [(−1)ⁿ(n!)²/(2n + 1)!] h^{2n} f^(2n+1)(ξ).                     (7.26)

Example 7.4.1 Compute the values of (i) f′(3), (ii) f″(3), (iii) f′(3.1), (iv) f″(3.1)
using the following table.
    x    : 1       2       3       4       5
    f(x) : 0.0000  1.3863  3.2958  5.5452  8.0472

Solution. The central difference table is


    x         y = f(x)   ∆y      ∆²y     ∆³y      ∆⁴y
x−2 = 1 0.0000
1.3863
x−1 = 2 1.3863 0.5232
1.9095 –0.1833
x0 = 3 3.2958 0.3399 0.0960
2.2494 –0.0873
x1 = 4 5.5452 0.2526
2.5020
x2 = 5 8.0472

Since x = 3 and x = 3.1 are near the middle of the table, a formula based on central
differences may be used. Here Stirling's formula is used to find the derivatives.
(i) Here x0 = 3, h = 1, u = 0. Then

 f′(3) = (1/h)[(∆y−1 + ∆y0)/2 − (∆³y−2 + ∆³y−1)/12 + · · · ]
       = (1/1)[(1.9095 + 2.2494)/2 − (−0.1833 − 0.0873)/12] = 2.1020.

(ii) f″(3) = (1/h²)[∆²y−1 − (1/12)∆⁴y−2 + · · · ]
           = (1/1²)[0.3399 − (1/12) × 0.0960] = 0.3319.

(iii) Let x0 = 3, h = 1, u = (3.1 − 3)/1 = 0.1. Then

 f′(3.1) = (1/h)[(∆y−1 + ∆y0)/2 + u∆²y−1 + (3u² − 1)/6 · (∆³y−2 + ∆³y−1)/2
           + (2u³ − u)/12 · ∆⁴y−2 + · · · ]
         = (1/1)[(1.9095 + 2.2494)/2 + 0.1 × 0.3399
           + (3 × (0.1)² − 1)/6 × (−0.1833 − 0.0873)/2
           + (2 × (0.1)³ − 0.1)/12 × 0.0960] = 2.1345.

(iv)
 f″(3.1) = (1/h²)[∆²y−1 + u(∆³y−2 + ∆³y−1)/2 + (6u² − 1)/12 · ∆⁴y−2 + · · · ]
         = (1/1²)[0.3399 + 0.1 × (−0.1833 − 0.0873)/2 + (6(0.1)² − 1)/12 × 0.0960]
         = 0.31885.

7.5 Differentiation Based on Lagrange’s Interpolation Polynomial

The Lagrange's interpolation formula is

 φ(x) = w(x) Σ_{i=0}^{n} yi/[(x − xi) w′(xi)],
 where w(x) = (x − x0)(x − x1) · · · (x − xn).

Then

 φ′(x) = w′(x) Σ_{i=0}^{n} yi/[(x − xi) w′(xi)]
         − w(x) Σ_{i=0}^{n} yi/[(x − xi)² w′(xi)]                           (7.27)

and

 φ″(x) = w″(x) Σ_{i=0}^{n} yi/[(x − xi) w′(xi)]
         − 2w′(x) Σ_{i=0}^{n} yi/[(x − xi)² w′(xi)]
         + 2w(x) Σ_{i=0}^{n} yi/[(x − xi)³ w′(xi)].                         (7.28)

The value of w′(x) can be determined as

 w′(x) = Σ_{j=0}^{n} (x − x0)(x − x1) · · · (x − xj−1)(x − xj+1) · · · (x − xn).
The formulae (7.27) and (7.28) are valid for all x except x = xi, i = 0, 1, . . . , n.
To find the derivatives at the points x0, x1, . . . , xn, the Lagrange's polynomial is
rearranged as

 φ(x) = w(x) Σ_{i=0, i≠j}^{n} yi/[(x − xi) w′(xi)]
        + yj [(x − x0)(x − x1) · · · (x − xj−1)(x − xj+1) · · · (x − xn)]
             /[(xj − x0)(xj − x1) · · · (xj − xj−1)(xj − xj+1) · · · (xj − xn)].

Therefore,

 φ′(xj) = w′(xj) Σ_{i=0, i≠j}^{n} yi/[(xj − xi) w′(xi)]
          + yj Σ_{i=0, i≠j}^{n} 1/(xj − xi),                                (7.29)

where w′(xj) = (xj − x0)(xj − x1) · · · (xj − xj−1)(xj − xj+1) · · · (xj − xn)
             = Π_{i=0, i≠j}^{n} (xj − xi).

Note that

 (d/dx)[(x − x0) · · · (x − xj−1)(x − xj+1) · · · (x − xn)
        /((xj − x0) · · · (xj − xj−1)(xj − xj+1) · · · (xj − xn))]_{x=xj}
   = Σ_{i=0, i≠j}^{n} 1/(xj − xi).

This formula is used to find the derivative at the points x = x0, x1, . . . , xn.

Error in differentiation formula based on Lagrange's interpolation polynomial

The error term is

    E(x) = w(x) f^(n+1)(ξ)/(n + 1)!,
    where w(x) = (x − x0)(x − x1) · · · (x − xn).

Thus

    E′(x) = w′(x) f^(n+1)(ξ)/(n + 1)! + w(x) f^(n+2)(ξ1)/(n + 1)!,          (7.30)

where min{x, x0, x1, . . . , xn} < ξ, ξ1 < max{x, x0, x1, . . . , xn}.
At x = xi, i = 0, 1, 2, . . . , n,

    E′(xi) = w′(xi) f^(n+1)(ξ)/(n + 1)!.                                    (7.31)

Example 7.5.1 Use the differentiation formula based on Lagrange's interpolation
to find the values of f′(2) and f′(2.5) from the following table.

    x : 2  3  5   6
    y : 13 34 136 229

Solution. Here x0 = 2, x1 = 3, x2 = 5, x3 = 6.
 w(x) = (x − x0)(x − x1)(x − x2)(x − x3) = (x − 2)(x − 3)(x − 5)(x − 6).
 w′(x) = (x − 3)(x − 5)(x − 6) + (x − 2)(x − 5)(x − 6) + (x − 2)(x − 3)(x − 6)
         + (x − 2)(x − 3)(x − 5).
By the formula (7.29),

 f′(2) ≈ φ′(2) = w′(x0) Σ_{i=1}^{3} yi/[(x0 − xi) w′(xi)] + y0 Σ_{i=1}^{3} 1/(x0 − xi)
   = w′(2)[ y1/((2 − 3)w′(3)) + y2/((2 − 5)w′(5)) + y3/((2 − 6)w′(6)) ]
     + y0[ 1/(2 − 3) + 1/(2 − 5) + 1/(2 − 6) ].

Now, w′(2) = −12, w′(3) = 6, w′(5) = −6, w′(6) = 12.
Thus

 f′(2) ≈ −12[ 34/((−1) × 6) + 136/((−3) × (−6)) + 229/((−4) × 12) ]
         + 13[ −1 − 1/3 − 1/4 ] = 14.

Also,

 f′(2.5) ≈ w′(2.5) Σ_{i=0}^{3} yi/[(2.5 − xi) w′(xi)]
           − w(2.5) Σ_{i=0}^{3} yi/[(2.5 − xi)² w′(xi)]
   = w′(2.5)[ y0/((2.5 − 2)w′(2)) + y1/((2.5 − 3)w′(3)) + y2/((2.5 − 5)w′(5))
              + y3/((2.5 − 6)w′(6)) ]
     − w(2.5)[ y0/((2.5 − 2)²w′(2)) + y1/((2.5 − 3)²w′(3)) + y2/((2.5 − 5)²w′(5))
              + y3/((2.5 − 6)²w′(6)) ].

Now, w′(2.5) = 1.5, w(2.5) = −2.1875.
Therefore,

 f′(2.5) ≈ 1.5[ 13/(0.5 × (−12)) + 34/((−0.5) × 6) + 136/((−2.5) × (−6))
           + 229/((−3.5) × 12) ]
           + 2.1875[ 13/((0.5)² × (−12)) + 34/((−0.5)² × 6) + 136/((−2.5)² × (−6))
           + 229/((−3.5)² × 12) ]
         = 20.75.

Algorithm 7.1 (Derivative). This algorithm determines the first order derivative
of a function given in tabular form (xi , yi ), i = 0, 1, 2, . . . , n, at a given point xg, xg
may or may not be equal to the given nodes xi , based on Lagrange’s interpolation.

Algorithm Derivative Lagrange


Read xi , yi , i = 0, 1, 2, . . . , n.
Read xg; //the point at which the derivative is to be evaluated.//
Compute w (xj ), j = 0, 1, 2, . . . , n, using the function wd(j).
Set sum1 = sum2 = 0;
Check xg is equal to given nodes xi .
If xg is not equal to any node then
for i = 0 to n do
Compute t = yi /((xg − xi ) ∗ wd(i));
Compute sum1 = sum1 + t;
Compute sum2 = sum2 + t/(xg − xi );

endfor;
//compute w (xg)//
set t = 0;
for j = 0 to n do
set prod = 1;
for i = 0 to n do
if (i ≠ j) then prod = prod ∗ (xg − xi);
endfor;
Compute t = t + prod;
endfor;
//compute w(xg) //
set t1 = 1;
for i = 0 to n do
Compute t1 = t1 ∗ (xg − xi );
endfor;
Compute result = t ∗ sum1 − t1 ∗ sum2;
else //xg is equal to xj //
for i = 0 to n do
if i = j then
Compute sum1 = sum1 + yi /((xj − xi ) ∗ wd(i));
Compute sum2 = sum2 + 1/(xj − xi );
endif;
endfor;
Compute result = wd(j) ∗ sum1 + yj ∗ sum2;
endif;
Print ’The value of the derivative’, result;
function wd(j)
//This function determines w (xj ).//
Set prod = 1;
for i = 0 to n do
if (i ≠ j) then Compute prod = prod ∗ (xj − xi);
endfor;
return prod;
end wd
end Derivative Lagrange
Program 7.1
.
/* Program Derivative
Program to find the first order derivative of a function
y=f(x) given as (xi,yi), i=0, 1, 2, ..., n, using formula
based on Lagrange’s interpolation. */
#include<stdio.h>

int n; float x[20],xg;


float wd(int);
void main()
{
int i,j,flag=-1;
float y[20],sum1=0,sum2=0,prod,t,t1=1,result;
printf("Enter the value of n and the data in the form
(x[i],y[i]) \n");
scanf("%d",&n);
for(i=0;i<=n;i++) scanf("%f %f",&x[i],&y[i]);
printf("Enter the value of x at which derivative
is required \n");
scanf("%f",&xg);
for(i=0;i<=n;i++) if(x[i]==xg) flag=i;
if(flag==-1) /* xg is not equal to xi, i=0, 1, ..., n */
{
for(i=0;i<=n;i++)
{
t=y[i]/((xg-x[i])*wd(i));
sum1+=t;
sum2+=t/(xg-x[i]);
}
/* Computation of w’(xg) */
t=0;
for(j=0;j<=n;j++)
{
prod=1;
for(i=0;i<=n;i++) if(i!=j) prod*=(xg-x[i]);
t+=prod;
}
/* computation of w(xg) */
for(i=0;i<=n;i++) t1*=(xg-x[i]);
result=t*sum1-t1*sum2;
} /* end of if part */
else
{
j=flag;
for(i=0;i<=n;i++)
if(i!=j)

{
sum1+=y[i]/((x[j]-x[i])*wd(i));
sum2+=1/(x[j]-x[i]);
}
result=wd(j)*sum1+y[j]*sum2;
} /* end of else part */
printf("The value of derivative at x= %6.4f is
%8.5f",xg,result);
}
/* this function determines w’(xj) */
float wd(int j)
{
int i;float prod=1;
for(i=0;i<=n;i++) if(i!=j) prod*=(x[j]-x[i]);
return prod;
}
A sample of input/output:
Enter the value of n and the data in the form (x[i],y[i])
3
1 0.54030
2 -0.41615
3 -0.98999
4 -0.65364
Enter the value of x at which derivative is required
1.2
The value of derivative at x= 1.2000 is -0.99034

Table of derivatives
The summary of the formulae of derivatives based on finite differences is given below.

 f′(x0) ≈ (1/h)[∆ − (1/2)∆² + (1/3)∆³ − (1/4)∆⁴ + (1/5)∆⁵ − (1/6)∆⁶ + · · · ]y0    (7.32)
 f″(x0) ≈ (1/h²)[∆² − ∆³ + (11/12)∆⁴ − (5/6)∆⁵ + (137/180)∆⁶ − (7/10)∆⁷
          + (363/560)∆⁸ + · · · ]y0                                                 (7.33)
 f′(xn) ≈ (1/h)[∇ + (1/2)∇² + (1/3)∇³ + (1/4)∇⁴ + (1/5)∇⁵ + (1/6)∇⁶ + · · · ]yn    (7.34)
 f″(xn) ≈ (1/h²)[∇² + ∇³ + (11/12)∇⁴ + (5/6)∇⁵ + (137/180)∇⁶ + (7/10)∇⁷
          + (363/560)∇⁸ + · · · ]yn                                                 (7.35)
 f′(x0) ≈ (1/h)[(∆y−1 + ∆y0)/2 − (∆³y−2 + ∆³y−1)/12 + (∆⁵y−3 + ∆⁵y−2)/60 − · · · ] (7.36)
 f″(x0) ≈ (1/h²)[∆²y−1 − (1/12)∆⁴y−2 + (1/90)∆⁶y−3 − · · · ]                        (7.37)

7.6 Two-point and Three-point Formulae

Only the first term of (7.32) gives a simple formula for the first order derivative:

    f′(xi) ≈ ∆yi/h = (yi+1 − yi)/h = [y(xi + h) − y(xi)]/h.                 (7.38)

Similarly, the equation (7.34) gives

    f′(xi) ≈ ∇yi/h = (yi − yi−1)/h = [y(xi) − y(xi − h)]/h.                 (7.39)

Adding equations (7.38) and (7.39), we obtain the central difference formula for the
first order derivative, as

    f′(xi) ≈ [y(xi + h) − y(xi − h)]/(2h).                                  (7.40)

Equations (7.38)-(7.40) give two-point formulae to find the first order derivative at
x = xi. Similarly, from equation (7.33),

    f″(xi) ≈ ∆²yi/h² = (yi+2 − 2yi+1 + yi)/h²
           = [y(xi + 2h) − 2y(xi + h) + y(xi)]/h².                          (7.41)

From equation (7.35),

    f″(xi) ≈ ∇²yi/h² = (yi − 2yi−1 + yi−2)/h²
           = [y(xi) − 2y(xi − h) + y(xi − 2h)]/h².                          (7.42)

Equation (7.37) gives

    f″(x0) ≈ ∆²y−1/h² = (y1 − 2y0 + y−1)/h² = [y(x0 + h) − 2y(x0) + y(x0 − h)]/h².

In general,

    f″(xi) ≈ [y(xi + h) − 2y(xi) + y(xi − h)]/h².                           (7.43)

Equations (7.41)-(7.43) give the three-point formulae for the second order derivative.
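These formulae are one-liners in code. The following is a minimal C sketch (not one of the book's numbered programs) applying the central two-point formula (7.40) and the three-point formula (7.43) to f(x) = x cos x at x = 0.6 with h = 0.1, the setting of Example 7.6.2 below.

#include <stdio.h>
#include <math.h>

double f(double x) { return x * cos(x); }

int main()
{
    double x = 0.6, h = 0.1;
    /* central two-point formula (7.40) */
    double d1 = (f(x + h) - f(x - h)) / (2 * h);
    /* three-point formula (7.43) for the second derivative */
    double d2 = (f(x + h) - 2 * f(x) + f(x - h)) / (h * h);
    printf("f'(0.6)  ~ %8.5f  (exact %8.5f)\n", d1, cos(x) - x * sin(x));
    printf("f''(0.6) ~ %8.5f  (exact %8.5f)\n", d2, -2 * sin(x) - x * cos(x));
    return 0;
}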

7.6.1 Error analysis and optimum step size


The truncation error of the two-point formula (7.40) is O(h²). Assume that f ∈
C³[a, b] (i.e., f is continuously differentiable up to third order within [a, b]) and
x − h, x, x + h ∈ [a, b]. Then by Taylor's series

 f(xi + h) = f(xi) + hf′(xi) + (h²/2!)f″(xi) + (h³/3!)f‴(ξ1)
and
 f(xi − h) = f(xi) − hf′(xi) + (h²/2!)f″(xi) − (h³/3!)f‴(ξ2).

By subtraction,

    f(xi + h) − f(xi − h) = 2hf′(xi) + [f‴(ξ1) + f‴(ξ2)]h³/3!.              (7.44)

Since f‴ is continuous, by the intermediate value theorem there exists a number ξ
such that

    [f‴(ξ1) + f‴(ξ2)]/2 = f‴(ξ).

Thus, after rearrangement, the equation (7.44) becomes

    f′(xi) = [f(xi + h) − f(xi − h)]/(2h) − f‴(ξ)h²/3!.                     (7.45)

It may be noted that the first term on the right hand side is the two-point formula,
while the second term is the truncation error, which is of O(h²).
To find the computer's round-off error, it is assumed that f(x0 − h) = y(x0 − h) + ε−1
and f(x0 + h) = y(x0 + h) + ε1, where y(x0 − h) and y(x0 + h) are the approximate
values of the original function f at the points (x0 − h) and (x0 + h) respectively, and
ε−1 and ε1 are the round-off errors.
Thus

    f′(xi) = [f(xi + h) − f(xi − h)]/(2h) + Etrunc

and

    f′(xi) = [y(xi + h) − y(xi − h)]/(2h) + Etrunc + Eround
           = [y(xi + h) − y(xi − h)]/(2h) + E,

where

    E = Eround + Etrunc = (ε1 − ε−1)/(2h) − h²f‴(ξ)/6                       (7.46)

is the total error accumulating the round-off error (Eround ) and the truncation error
(Etrunc ).
Let |ε−1 | ≤ ε, |ε1 | ≤ ε and M3 = max |f  (x)|.
a≤x≤b
Then from (7.46), the upper bound of the total error is given by

|ε1 | + |ε−1 | h2  ε M3 h2


|E| ≤ + |f (ξ)| ≤ + . (7.47)
2h 6 h 6
d|E| ε hM3
Now, |E| will be minimum for a given h if = 0 i.e., − 2 + = 0. Thus the
dh h 3
optimum value of h to minimize |E| is
 1/3

h= (7.48)
M3
and the minimum total error is
 −1/3  2/3
3ε M3 3ε
|E| = ε + .
M3 6 M3

Example 7.6.1 If f ∈ C⁵[a, b] (i.e., the function f is continuously differentiable up
to fifth order on [a, b]) and x − 2h, x − h, x + h, x + 2h ∈ [a, b], then show that

 f′(x) ≈ [−f(x + 2h) + 8f(x + h) − 8f(x − h) + f(x − 2h)]/(12h) + h⁴f^v(ξ)/30,   (7.49)

where ξ lies between x − 2h and x + 2h. Determine the optimal value of h when
(i) |Eround| = |Etrunc|,
(ii) the total error |Eround| + |Etrunc| is minimum.

Solution. By Taylor's series expansion with step lengths h and −h,

 f(x + h) = f(x) + hf′(x) + (h²/2!)f″(x) + (h³/3!)f‴(x) + (h⁴/4!)f^iv(x)
            + (h⁵/5!)f^v(ξ1)
and
 f(x − h) = f(x) − hf′(x) + (h²/2!)f″(x) − (h³/3!)f‴(x) + (h⁴/4!)f^iv(x)
            − (h⁵/5!)f^v(ξ2).

Then by subtraction,

    f(x + h) − f(x − h) = 2hf′(x) + 2f‴(x)h³/3! + 2f^v(ξ3)h⁵/5!.

Similarly, when the step size is 2h,

    f(x + 2h) − f(x − 2h) = 4hf′(x) + 16f‴(x)h³/3! + 64f^v(ξ4)h⁵/5!.

All ξ's lie between x − 2h and x + 2h.
Now,

 f(x − 2h) − f(x + 2h) + 8f(x + h) − 8f(x − h)
    = 12hf′(x) + [16f^v(ξ3) − 64f^v(ξ4)]h⁵/120.

Since f^v(x) is continuous, f^v(ξ3) ≈ f^v(ξ4) = f^v(ξ) (say).
Then 16f^v(ξ3) − 64f^v(ξ4) = −48f^v(ξ).
Using this result the above equation becomes

 f(x − 2h) − f(x + 2h) + 8f(x + h) − 8f(x − h) = 12hf′(x) − (2/5)h⁵f^v(ξ).

Hence, the value of f′(x) is given by

 f′(x) ≈ [−f(x + 2h) + 8f(x + h) − 8f(x − h) + f(x − 2h)]/(12h) + f^v(ξ)h⁴/30.   (7.50)
The first term on the right hand side is a four-point formula to find f′(x), and the
second term is the corresponding truncation error.
Let f(x + 2h) = y2 + ε2, f(x + h) = y1 + ε1, f(x − h) = y−1 + ε−1 and
f(x − 2h) = y−2 + ε−2, where the yi and εi are the approximate values of f(x + ih)
and the corresponding round-off errors respectively. Also let

    M5 = max_{x−2h≤ξ≤x+2h} |f^v(ξ)|.

Then (7.50) becomes

 f′(x) = (−y2 + 8y1 − 8y−1 + y−2)/(12h) + (−ε2 + 8ε1 − 8ε−1 + ε−2)/(12h)
         + f^v(ξ)h⁴/30
       = (−y2 + 8y1 − 8y−1 + y−2)/(12h) + Eround + Etrunc.

Let ε = max{|ε−2|, |ε−1|, |ε1|, |ε2|}. Then

    |Eround| ≤ (|ε2| + 8|ε1| + 8|ε−1| + |ε−2|)/(12h) ≤ 18ε/(12h) = 3ε/(2h)
and |Etrunc| ≤ M5h⁴/30.

(i) If |Eround| = |Etrunc|, then 3ε/(2h) = M5h⁴/30, or, h⁵ = 45ε/M5.
Thus the optimum value of h is (45ε/M5)^{1/5}, and

    |Eround| = |Etrunc| = (3ε/2)(M5/(45ε))^{1/5} = (27M5ε⁴/160)^{1/5}.

(ii) When the total error |E| = |Eround| + |Etrunc| is minimum, then d|E|/dh = 0,
or, −3ε/(2h²) + 4M5h³/30 = 0, i.e., h⁵ = 45ε/(4M5).
Hence, in this case, the optimum value of h is (45ε/(4M5))^{1/5}.

Corollary 7.6.1 If f ∈ C⁵[a, b] and x−2, x−1, x1, x2 ∈ [a, b], then

    f′(x0) = [−f(x2) + 8f(x1) − 8f(x−1) + f(x−2)]/(12h) + h⁴f^v(ξ)/30,      (7.51)

where x−2 < ξ < x2.

Example 7.6.2 The values of x and f(x) = x cos x are tabulated as follows:
 x    : 0.2     0.3     0.4     0.5     0.6     0.7     0.8     0.9     1.0
 f(x) : 0.19601 0.28660 0.36842 0.43879 0.49520 0.53539 0.55737 0.55945 0.54030

Find the value of f′(0.6) using the two- and four-point formulae

    f′(x0) = [f(x0 + h) − f(x0 − h)]/(2h)
and f′(x0) = [−f(x0 + 2h) + 8f(x0 + h) − 8f(x0 − h) + f(x0 − 2h)]/(12h)

with step size h = 0.1.

Solution. By the two-point formula,

 f′(0.6) ≈ [f(0.7) − f(0.5)]/0.2 = (0.53539 − 0.43879)/0.2 = 0.48300,

and by the four-point formula,

 f′(0.6) ≈ [−f(0.8) + 8f(0.7) − 8f(0.5) + f(0.4)]/(12 × 0.1)
         = (−0.55737 + 8 × 0.53539 − 8 × 0.43879 + 0.36842)/1.2 = 0.48654.

The exact value is f  (0.6) = cos 0.6 − 0.6 × sin 0.6 = 0.48655.
Therefore, error in two-point formula is 0.00355 and that in four-point formula is
0.00001. Clearly, four-point formula gives better result than two-point formula.

7.7 Richardson’s Extrapolation Method

The accuracy of the computed derivative of a function can be improved using this
method. The method reduces the number of function evaluations needed to achieve
higher order accuracy.
The formula to find the first order derivative using two points is

    f′(x) = [f(x + h) − f(x − h)]/(2h) + Etrunc = g(h) + Etrunc,

where Etrunc is the truncation error and g(h) is the approximate first order derivative
of f(x).
Using Taylor's series expansion, it can be shown that Etrunc is of the following form:

    Etrunc = c1h² + c2h⁴ + c3h⁶ + · · · .

The Richardson's extrapolation method combines two values of f′(x) obtained
by a certain method with two different step sizes, say, h1 and h2. Generally, h1 and h2
are taken as h and h/2. Thus

    f′(x) = g(h) + Etrunc = g(h) + c1h² + c2h⁴ + c3h⁶ + · · ·               (7.52)
and
    f′(x) = g(h/2) + c1h²/4 + c2h⁴/16 + c3h⁶/64 + · · · .                   (7.53)

The constants c1, c2, . . . are independent of h. Eliminating c1 between (7.52) and
(7.53), we get

    f′(x) = [4g(h/2) − g(h)]/3 − (1/4)c2h⁴ − (5/16)c3h⁶ − · · ·
          = [4g(h/2) − g(h)]/3 + d1h⁴ + d2h⁶ + · · · .                      (7.54)

Denoting

    [4g(h/2) − g(h)]/3  by  g1(h/2),                                        (7.55)

equation (7.54) becomes

    f′(x) = g1(h/2) + d1h⁴ + O(h⁶).                                         (7.56)
f  (x) = g1 (h/2) + d1 h4 + O(h6 ). (7.56)



This equation shows that g1 (h/2) is an approximate value of f  (x) with fourth-order
accuracy. Thus a result accurate up to fourth order is obtained by combining two results
accurate up to second order.
Now, by repeating the above result one can obtain

f  (x) = g1 (h/2) + d1 h4 + O(h6 )


h4 (7.57)
f  (x) = g1 (h/22 ) + d1 + O(h6 )
16
Eliminating d1 from (7.57) to find O(h6 ) order formula, as

f  (x) = g2 (h/22 ) + O(h6 ), (7.58)

where
42 g1 (h/22 ) − g1 (h/2)
g2 (h/22 ) = . (7.59)
42 − 1

Thus g2 (h/22 ) is the sixth-order accurate result of f  (x).


Hence the successive higher order results can be obtained from the following formula
   
h h
  4k gk−1 m − gk−1 m−1
h 2 2
gk m = , (7.60)
2 4 −1
k

k = 1, 2, 3, . . . ; m = k, k + 1, . . .
where g0 (h) = g(h).

This process is called repeated extrapolation to the limit. The values of gk(h/2^m) for
different values of k and m are tabulated as shown in Table 7.1.

Table 7.1: Richardson's extrapolation table

    h       second      fourth      sixth       eighth
            order       order       order       order
    h       g(h)
                        g1(h/2)
    h/2     g(h/2)                  g2(h/2²)
                        g1(h/2²)                g3(h/2³)
    h/2²    g(h/2²)                 g2(h/2³)
                        g1(h/2³)
    h/2³    g(h/2³)

It may be noted that the successive values in a particular column give better
approximations of the derivative than those in the preceding columns. The process
terminates when

    |gm(h/2) − gm−1(h)| ≤ ε

for a given error tolerance ε.


We have seen that one approximate value of f  (x) is g1 (h/2) where

4g(h/2) − g(h) g(h/2) − g(h)


g1 (h/2) = = g(h/2) + .
3 3

Here g(h/2) is more accurate than g(h) and then g1 (h/2) gives an improved ap-
proximation over g(h/2). If g(h) < g(h/2), g1 (h/2) > g(h/2) and if g(h/2) < g(h),
g1 (h/2) < g(h/2). Thus the value of g1 (h/2) lies outside the interval [g(h), g(h/2)] or
[g(h/2), g(h)] as the case may be. Thus g1 (h/2) is obtained from g(h) and g(h/2) by
means of an extrapolation operation. So, this process is called (Richardson) extrapola-
tion.
Example 7.7.1 Use Richardson's extrapolation method to find f′(0.5), where
f(x) = 1/x, starting with h = 0.2.

Solution. Here h = 0.2 and x = 0.5. Then

 g(h) = [f(x + h) − f(x − h)]/(2h)
      = [1/(0.5 + 0.2) − 1/(0.5 − 0.2)]/(2 × 0.2) = −4.76190,
 g(h/2) = [f(x + h/2) − f(x − h/2)]/(2(h/2))
        = [1/(0.5 + 0.1) − 1/(0.5 − 0.1)]/0.2 = −4.16667.

Then g1(h/2) = [4g(h/2) − g(h)]/(4 − 1) = [4 × (−4.16667) − (−4.76190)]/3
             = −3.96826.

Halving the step size further, we compute

 g(h/2²) = [1/(0.5 + 0.05) − 1/(0.5 − 0.05)]/(2 × 0.05) = −4.04040,
 g1(h/2²) = [4g(h/2²) − g(h/2)]/(4 − 1)
          = [4 × (−4.04040) − (−4.16667)]/3 = −3.99831.

Now,
 g2(h/2²) = [4²g1(h/2²) − g1(h/2)]/(4² − 1)
          = [16 × (−3.99831) − (−3.96826)]/15 = −4.00031.

The above calculations are tabulated as follows:

    h      g          g1         g2
    0.2    –4.76190
                      –3.96826
    0.1    –4.16667               –4.00031
                      –3.99831
    0.05   –4.04040

Thus, after two steps we find that f′(0.5) = −4.00031, while the exact value is
f′(0.5) = [−1/x²]_{x=0.5} = −4.0.

Example 7.7.2 For the following table

 x    : 0  0.5  1    1.5  2    2.5  3    3.5  4    4.5   5
 f(x) : 1  2/3  1/2  2/5  1/3  2/7  1/4  2/9  1/5  2/11  1/6

find the value of f′(3) using Richardson's extrapolation.

Solution. Let x = 3 and h = 2.

 g(h) = [f(x + h) − f(x − h)]/(2h) = [f(5) − f(1)]/4 = (1/6 − 1/2)/4 = −0.083333.
 g(h/2) = [f(x + h/2) − f(x − h/2)]/(2(h/2)) = [f(4) − f(2)]/2
        = (1/5 − 1/3)/2 = −0.066666.
 g1(h/2) = [4g(h/2) − g(h)]/(4 − 1) = [4 × (−0.066666) − (−0.083333)]/3 = −0.061110.
 g(h/2²) = g(0.5) = [f(3.5) − f(2.5)]/(2 × 0.5) = (2/9 − 2/7)/1 = −0.063492.
 g1(h/2²) = [4g(h/2²) − g(h/2)]/(4 − 1)
          = [4 × (−0.063492) − (−0.066666)]/3 = −0.062434.

Thus
 g2(h/2²) = [4²g1(h/2²) − g1(h/2)]/(4² − 1)
          = [16 × (−0.062434) − (−0.061110)]/15 = −0.062522.

Hence f′(3) = −0.062522. (The tabulated values fit f(x) = 1/(x + 1), for which the
exact value is f′(3) = −1/16 = −0.0625.)

Algorithm 7.2 (Richardson’s extrapolation). This algorithm is used to find


the derivative using Richardson’s extrapolation formula.

The formula (7.60) can be written as

    gk(h) = [4^k gk−1(h) − gk−1(2h)]/(4^k − 1);

at each iteration h becomes half of its previous value.
We denote gk−1(2h) (calculated at the previous iteration) by go(k − 1) (old value)
and gk−1(h) by gn(k − 1) (new value) to remove h from the formula.
Then the above formula reduces to

    gn(k) = [4^k gn(k − 1) − go(k − 1)]/(4^k − 1),  k = 1, 2, . . . ,

where gn(0) = [f(x + h) − f(x − h)]/(2h) = go(0).
Algorithm Richardson extrapolation
Input function f(x);
Read x, h, ε;   //error tolerance//
Compute go(0) = [f(x + h) − f(x − h)]/(2h);
Set j = 1;
10: Set h = h/2;
Compute gn(0) = [f(x + h) − f(x − h)]/(2h);
for k = 1 to j do
    Compute gn(k) = [4^k gn(k − 1) − go(k − 1)]/(4^k − 1);
if |gn(j − 1) − gn(j)| < ε then
    Print gn(j) as the value of the derivative;
    Stop;
else
    for k = 0 to j do
        go(k) = gn(k);   //set new values as old values//
    j = j + 1;
    goto 10;
endif;
end Richardson extrapolation

Program 7.2
.
/* Program Richardson Extrapolation
This program finds the first order derivative of a function
f(x) at a given value of x by Richardson Extrapolation.
Here we assume that f(x)=1/(x*x).
*/
#include<stdio.h>
#include<math.h>
#include<stdlib.h>
void main()
{
int j=1,k;
float x,h,eps=1e-5,g0[100],gn[100];
float f(float x);
printf("Enter the value of x ");
scanf("%f",&x);
printf("Enter the value of h ");
scanf("%f",&h);
g0[0]=(f(x+h)-f(x-h))/(h+h);
start: h=h/2;
gn[0]=(f(x+h)-f(x-h))/(h+h);
for(k=1;k<=j;k++)
gn[k]=(pow(4,k)*gn[k-1]-g0[k-1])/(pow(4,k)-1);
if(fabs(gn[j-1]-gn[j])<eps)
{
printf("The derivative is %8.5f at %8.5f ",gn[j],x);
exit(0);
}
else
{
for(k=0;k<=j;k++) g0[k]=gn[k];
j++;
goto start;
}
} /* main */
/* definition of f(x) */
float f(float x)
{
return(1/(x*x));
}

A sample of input/output:

Enter the value of x 1.5


Enter the value of h 0.5
The derivative is -0.59259 at 1.50000

7.8 Cubic Spline Method

The cubic spline may be used to determine the first and second derivatives of a function.
This method works in two stages. In the first stage, the cubic splines are constructed
over suitable intervals, and in the second stage the first and second derivatives are
determined from the appropriate cubic spline. This method is more laborious than the
other methods but, once a cubic spline is constructed, it becomes very efficient. The
process of finding the derivative is illustrated by an example in the following.
Example 7.8.1 Let y = f(x) = cos x, 0 ≤ x ≤ π/2, be the function. Find the natural
cubic spline in the intervals 0 ≤ x ≤ π/4 and π/4 ≤ x ≤ π/2, and hence determine
the approximate values of f′(π/8) and f″(π/8). Also, use the two-point formula to
find the value of f′(π/8). Find the error in each case.

Solution. Here n = 2. Therefore, h = π/4, y0 = cos 0 = 1, y1 = cos π/4 = 1/√2 and
y2 = cos π/2 = 0.
Also, M0 = M2 = 0.
Then by the formula (3.99),

    M0 + 4M1 + M2 = (6/h²)[y0 − 2y1 + y2].

That is, 4M1 = (96/π²)(1 − √2), or, M1 = (24/π²)(1 − √2) = −1.007246602.
Hence the cubic spline is

    S(x) = { s1(x),  0 ≤ x ≤ π/4
           { s2(x),  π/4 ≤ x ≤ π/2,
where
    s1(x) = (4/π)[ M1 x³/6 − (1 − 1/√2 + π²M1/96) x + π/4 ]
and
    s2(x) = (4/π)[ M1 (π/2 − x)³/6 + (1/√2 − π²M1/96)(π/2 − x) ].

Now, f′(π/8) ≈ s1′(π/8) = −0.33996116 and f″(π/8) ≈ s1″(π/8) = −0.503623301.

Two-point formula.
Let h = π/30.
f (π/8 + π/30) − f (π/8 − π/30)
Then f  (π/8)  = −0.381984382.
2.π/30
The actual value of f  (π/8) is − sin π/8 = −0.382683432.
Therefore, error in cubic spline method is 0.042722272 while in two-point formula
that is 0.000699050365.
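The two derivative values quoted above follow from differentiating s1(x) directly:
s1′(x) = (4/π)[M1x²/2 − (1 − 1/√2 + π²M1/96)] and s1″(x) = (4/π)M1x. A minimal C sketch (not from the text) of this check:

#include <stdio.h>
#include <math.h>

int main()
{
    double pi = 4.0 * atan(1.0);
    double M1 = 24.0 / (pi * pi) * (1.0 - sqrt(2.0));  /* = -1.0072466 */
    double c  = 1.0 - 1.0 / sqrt(2.0) + pi * pi * M1 / 96.0;
    double x  = pi / 8.0;
    /* s1(x) = (4/pi)[M1 x^3/6 - c x + pi/4], so              */
    /* s1'(x) = (4/pi)[M1 x^2/2 - c],  s1''(x) = (4/pi) M1 x  */
    double d1 = 4.0 / pi * (M1 * x * x / 2.0 - c);
    double d2 = 4.0 / pi * M1 * x;
    printf("s1'(pi/8)  = %11.8f  (exact f'  = %11.8f)\n", d1, -sin(x));
    printf("s1''(pi/8) = %11.8f  (exact f'' = %11.8f)\n", d2, -cos(x));
    return 0;
}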

7.9 Determination of Extremum of a Tabulated Function

It is known that if a function is differentiable, then the maximum and minimum values
of the function can be determined by equating the first derivative to zero and solving
for the variable. The same approach is applicable to a tabulated function.
Now, consider the Newton's forward difference interpolation formula

 y = f(x) = y0 + u∆y0 + [u(u − 1)/2!]∆²y0 + [u(u − 1)(u − 2)/3!]∆³y0 + · · · ,   (7.61)

where u = (x − x0)/h.
Then

    dy/dx = (1/h)[∆y0 + (2u − 1)/2 ∆²y0 + (3u² − 6u + 2)/6 ∆³y0 + · · · ].

For maxima and minima, dy/dx = 0. Then

    ∆y0 + (2u − 1)/2 ∆²y0 + (3u² − 6u + 2)/6 ∆³y0 + · · · = 0.              (7.62)

For simplicity, the fourth and higher order differences are neglected, which gives the
quadratic equation for u

    au² + bu + c = 0,                                                       (7.63)

where a = (1/2)∆³y0,  b = ∆²y0 − ∆³y0,  c = ∆y0 − (1/2)∆²y0 + (1/3)∆³y0.
The values of u can be determined by solving this equation. Then the values of x
are found from the relation x = x0 + uh. Finally, the maximum (or minimum) value
of y can be obtained from the equation (7.61).
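A minimal C sketch (not from the text) of this procedure, using the leading differences of Example 7.9.1 below (x0 = 1.5, h = 0.5): since third differences are not used there, ∆³y0 = 0 and (7.63) degenerates to the linear equation bu + c = 0, which the code handles as a separate branch; for a genuine quadratic, which root of (7.63) is relevant must be judged from the table.

#include <stdio.h>
#include <math.h>

int main()
{
    /* leading forward differences at x0 = 1.5 (Example 7.9.1) */
    double d1 = 0.28768, d2 = -0.06454, d3 = 0.0;
    double x0 = 1.5, h = 0.5, u;

    /* coefficients of (7.63): a u^2 + b u + c = 0 */
    double a = d3 / 2.0;
    double b = d2 - d3;
    double c = d1 - d2 / 2.0 + d3 / 3.0;

    if (a == 0.0)
        u = -c / b;                                  /* linear case */
    else
        u = (-b + sqrt(b * b - 4 * a * c)) / (2 * a);  /* one root of (7.63) */
    printf("u = %8.5f,  x = x0 + u*h = %8.5f\n", u, x0 + u * h);
    return 0;
}

This prints u = 4.95739 and x = 3.97870, the values obtained in the example.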
Example 7.9.1 Find x for which y is maximum and also find the corresponding
value of y, from the following table.
x : 1.5 2.0 2.5 3.0 3.5
y : 0.40547 0.69315 0.91629 1.09861 1.25276

Solution. The forward difference table is


    x     y        ∆y       ∆²y
1.5 0.40547
0.28768
2.0 0.69315 –0.06454
0.22314
2.5 0.91629 –0.04082
0.18232
3.0 1.09861 –0.02817
0.15415
3.5 1.25276

Let x0 = 1.5. Using formula (7.62) (up to second differences) we have

    ∆y0 + (2u − 1)/2 · ∆²y0 = 0,  or,  0.28768 + (2u − 1)/2 × (−0.06454) = 0,
    or,  u = 4.95739.

Therefore, x = x0 + uh = 1.5 + 0.5 × 4.95739 = 3.97870.
For this x, the value of y is obtained by Newton's backward formula, with xn = 3.5
and v = (3.97870 − 3.5)/0.5 = 0.95739, as

 y(3.97870) = 1.25276 + 0.95739 × 0.15415 + [0.95739(0.95739 + 1)/2] × (−0.02817)
            = 1.37395.

This is the approximate maximum value of y at x = 3.97870.

7.10 Integration

It is well known that, even when a function f(x) is known completely, it is not always
possible to evaluate its definite integral by analytical methods. Again, in many real
life problems, we are required to integrate a function between two given limits, where
the function is not known explicitly but only in a tabular form (equally or unequally
spaced). A method known as numerical integration or quadrature can be used
to solve all such problems.
The problem of numerical integration is stated below:
Given a set of data points (x0, y0), (x1, y1), . . . , (xn, yn) of a function y = f(x), it is
required to find the value of the definite integral ∫_a^b f(x) dx. The function f(x) is
replaced by a suitable interpolating polynomial φ(x). Then the approximate value of
the definite integral is calculated using the following formula:

    ∫_a^b f(x) dx ≈ ∫_a^b φ(x) dx.                                          (7.64)

Thus, different integration formulae can be derived depending on the type of the
interpolation formulae used.
A numerical integration formula is said to be of closed type, if the limits of integra-
tion a and b are taken as interpolating points. If a and b are not taken as interpolating
points then the formula is known as open type formula.

7.11 General Quadrature Formula Based on Newton’s Forward


Interpolation

The Newton's forward interpolation formula for the equispaced points xi,
i = 0, 1, . . . , n, xi = x0 + ih, is

 φ(x) = y0 + u∆y0 + [u(u − 1)/2!]∆²y0 + [u(u − 1)(u − 2)/3!]∆³y0 + · · · ,   (7.65)

where u = (x − x0)/h, h being the spacing.
Let the interval [a, b] be divided into n equal subintervals such that a = x0 < x1 <
x2 < · · · < xn = b. Then

 I = ∫_a^b f(x) dx ≈ ∫_{x0}^{xn} φ(x) dx
   = ∫_{x0}^{xn} [ y0 + u∆y0 + (u² − u)/2! ∆²y0 + (u³ − 3u² + 2u)/3! ∆³y0 + · · · ] dx.

Since x = x0 + uh, dx = h du; when x = x0 then u = 0 and when x = xn then u = n.
Thus,

 I = ∫_0^n [ y0 + u∆y0 + (u² − u)/2! ∆²y0 + (u³ − 3u² + 2u)/3! ∆³y0 + · · · ] h du
   = h[ y0 u + (u²/2)∆y0 + (∆²y0/2!)(u³/3 − u²/2)
       + (∆³y0/3!)(u⁴/4 − u³ + u²) + · · · ]_0^n
   = nh[ y0 + (n/2)∆y0 + (2n² − 3n)/12 ∆²y0 + (n³ − 4n² + 4n)/24 ∆³y0 + · · · ].   (7.66)
From this formula, one can generate different integration formulae by substituting
n = 1, 2, 3, . . . .

7.11.1 Trapezoidal Rule


Substituting n = 1 in the equation (7.66). In this case all differences higher than the
first difference become zero. Then
% xn     h
1 1
f (x) dx = h y0 + ∆y0 = h y0 + (y1 − y0 ) = (y0 + y1 ). (7.67)
x0 2 2 2

The formula (7.67) is known as the trapezoidal rule.


In this formula, the interval [a, b] is considered as a single interval, and it gives a
very rough answer. But, if the interval [a, b] is divided into several subintervals and this
formula is applied to each of these subintervals then a better approximate result may
be obtained. This formula is known as composite formula, deduced below.

Composite trapezoidal rule


Let the interval [a, b] be divided into n equal subintervals by the points a = x0 , x1 , x2 ,
. . ., xn = b, where xi = x0 + ih, i = 1, 2, . . . , n.
Applying the trapezoidal rule to each of the subintervals, one can find the composite
formula as
% b % x1 % x2 % xn
f (x) dx = f (x) dx + f (x) dx + · · · + f (x) dx
a x0 x1 xn−1
h h h h
 [y0 + y1 ] + [y1 + y2 ] + [y2 + y3 ] + · · · + [yn−1 + yn ]
2 2 2 2
h
= [y0 + 2(y1 + y2 + · · · + yn−1 ) + yn ]. (7.68)
2

Error in trapezoidal rule


The error of the trapezoidal rule is

    E = ∫_a^b f(x) dx − (h/2)(y0 + y1).                                     (7.69)

Let y = f(x) be continuous and possess continuous derivatives of all orders. Also,
assume that there exists a function F(x) such that F′(x) = f(x) in [x0, x1].
Then

 ∫_a^b f(x) dx = ∫_{x0}^{x1} F′(x) dx = F(x1) − F(x0)
    = F(x0 + h) − F(x0)
    = F(x0) + hF′(x0) + (h²/2!)F″(x0) + (h³/3!)F‴(x0) + · · · − F(x0)
    = hf(x0) + (h²/2!)f′(x0) + (h³/3!)f″(x0) + · · ·
    = hy0 + (h²/2)y0′ + (h³/6)y0″ + · · · .                                 (7.70)

Again,

 (h/2)(y0 + y1) = (h/2)[y0 + y(x0 + h)]
    = (h/2)[y0 + y(x0) + hy′(x0) + (h²/2!)y″(x0) + · · · ]
    = (h/2)[2y0 + hy0′ + (h²/2!)y0″ + · · · ].                              (7.71)

Using (7.70) and (7.71), equation (7.69) becomes

 E = h[y0 + (h/2)y0′ + (h²/6)y0″ + · · · ] − (h/2)[2y0 + hy0′ + (h²/2)y0″ + · · · ]
   = −(h³/12)y0″ + · · ·
   = −(h³/12)f″(x0) + · · · ≈ −(h³/12)f″(ξ),                                (7.72)

where a = x0 < ξ < x1 = b.
Equation (7.72) gives the error in the interval [x0, x1].
The total error in the composite rule is

    E = −(h³/12)(y0″ + y1″ + · · · + y″n−1).

If y″(ξ) is the largest among the n quantities y0″, y1″, . . . , y″n−1, then

    E ≤ −(1/12)h³ n y″(ξ) = −[(b − a)/12] h² y″(ξ),  as nh = b − a.
Note 7.11.1 The error term shows that if the second and higher order derivatives of
f(x) vanish, then the trapezoidal rule gives the exact value of the integral. That is,
the method is exact when f(x) is linear.

Geometrical interpretation of trapezoidal rule


In this rule, the curve y = f (x) is replaced by the line joining the points A(x0 , y0 ) and
B(x1 , y1 ) (Figure 7.1). Thus the area bounded by the curve y = f (x), the ordinates
x = x0 , x = x1 and the x-axis is then approximately equivalent to the area of the
trapezium (ABCD) bounded by the line AB, x = x0 , x = x1 and x-axis.
The geometrical significance of the composite trapezoidal rule is that the curve
y = f(x) is replaced by n straight lines joining the points (x0, y0) and (x1, y1);
(x1, y1) and (x2, y2); . . .; (xn−1, yn−1) and (xn, yn). The area bounded by the curve
y = f(x), the lines x = x0, x = xn and the x-axis is then approximately equal to the
sum of the areas of the n trapeziums (Figure 7.2).


Figure 7.1: Geometrical interpretation of trapezoidal rule.



Figure 7.2: Composite trapezoidal rule.

Alternative deduction of trapezoidal rule

Let f ∈ C²[a, b], where [a, b] is a finite interval. Transfer the interval [a, b] to [−1, 1]
using the relation x = (a + b)/2 + [(b − a)/2]t = p + qt (say).
Let f(x) = f(p + qt) = g(t). When x = a, b then t = −1, 1, i.e., g(−1) = f(a),
g(1) = f(b).
Thus

 I = ∫_a^b f(x) dx = ∫_{−1}^{1} g(t) q dt = q[ ∫_{−1}^{0} g(t) dt + ∫_0^1 g(t) dt ]
   = q ∫_0^1 [g(t) + g(−t)] dt.

Now, applying integration by parts,

 I = q[{g(t) + g(−t)}t]_0^1 − q ∫_0^1 t[g′(t) − g′(−t)] dt
   = q[g(1) + g(−1)] − q ∫_0^1 t · 2t g″(c) dt,  where 0 < c < 1
                                         [by Lagrange's MVT]
   = q[f(a) + f(b)] − 2q g″(d) ∫_0^1 t² dt,  0 < d < 1,
                                         [by the MVT of integral calculus]
   = q[f(a) + f(b)] − (2/3)q g″(d)
   = q[f(a) + f(b)] − (2/3)q³ f″(p + qd)
   = q[f(a) + f(b)] − (2/3)q³ f″(ξ),  where a < ξ < b
   = [(b − a)/2][f(a) + f(b)] − (2/3)[(b − a)/2]³ f″(ξ)
   = (h/2)[f(a) + f(b)] − (1/12)h³ f″(ξ),  as h = b − a.

In this expression, the first term is the approximate value of the integral obtained by
the trapezoidal rule and the second term represents the error.
In this expression, the first term is the approximate integration obtained by trape-
zoidal rule and the second term represents the error.
% b
Algorithm 7.3 (Trapezoidal). This algorithm finds the value of f (x)dx based
a
on the tabulated values (xi , yi ), yi = f (xi ), i = 0, 1, 2, . . . , n, using trapezoidal rule.

Algorithm Trapezoidal
Input function f (x);
Read a, b, n; //the lower and upper limits and number of subintervals.//
Compute h = (b − a)/n;
Set sum = [f(a) + f(a + nh)]/2;
for i = 1 to n − 1 do
Compute sum = sum + f (a + ih);
endfor;
Compute result = sum ∗ h;
Print result;
end Trapezoidal

Program 7.3
.

/* Program Trapezoidal
This program finds the value of integration of a function
by trapezoidal rule.
Here we assume that f(x)=x^3. */
#include<stdio.h>
void main()
{
float a,b,h,sum; int n,i;
float f(float);
printf("Enter the values of a, b ");
scanf("%f %f",&a,&b);
printf("Enter the value of n ");
scanf("%d",&n);
h=(b-a)/n;
sum=(f(a)+f(a+n*h))/2.;
for(i=1;i<=n-1;i++) sum+=f(a+i*h);
sum=sum*h;
printf("The value of the integration is %8.5f ",sum);
}

/* definition of the function f(x) */


float f(float x)
{
return(x*x*x);
}

A sample of input/output:

Enter the values of a, b 0 1


Enter the value of n 100
The value of the integration is 0.25002

7.11.2 Simpson’s 1/3 rule

In this formula the interval [a, b] is divided into two equal subintervals by the points
x0 , x1 , x2 , where h = (b − a)/2, x1 = x0 + h and x2 = x1 + h.
This rule is obtained by putting n = 2 in (7.66). In this case, the third and higher
order differences do not exist.

The equation (7.66) is simplified as


$$\int_{x_0}^{x_2} f(x)\,dx \simeq 2h\Big[y_0 + \Delta y_0 + \frac{1}{6}\Delta^2 y_0\Big] = 2h\Big[y_0 + (y_1-y_0) + \frac{1}{6}(y_2-2y_1+y_0)\Big] = \frac{h}{3}[y_0+4y_1+y_2]. \qquad (7.73)$$
The above rule is known as Simpson’s 1/3 rule or simply Simpson’s rule.

Composite Simpson’s 1/3 rule


Let the interval [a, b] be divided into n (an even number) equal subintervals by the
points x0 , x1 , x2 , . . . , xn , where xi = x0 + ih, i = 1, 2, . . . , n. Then
$$\begin{aligned}
\int_a^b f(x)\,dx &= \int_{x_0}^{x_2} f(x)\,dx + \int_{x_2}^{x_4} f(x)\,dx + \cdots + \int_{x_{n-2}}^{x_n} f(x)\,dx\\
&= \frac{h}{3}[y_0+4y_1+y_2] + \frac{h}{3}[y_2+4y_3+y_4] + \cdots + \frac{h}{3}[y_{n-2}+4y_{n-1}+y_n]\\
&= \frac{h}{3}\big[y_0 + 4(y_1+y_3+\cdots+y_{n-1}) + 2(y_2+y_4+\cdots+y_{n-2}) + y_n\big]. \qquad (7.74)
\end{aligned}$$
This formula is known as Simpson’s 1/3 composite rule for numerical integration.

Error in Simpson’s 1/3 rule


The error in this formula is
$$E = \int_{x_0}^{x_2} f(x)\,dx - \frac{h}{3}[y_0+4y_1+y_2]. \qquad (7.75)$$
Let the function f(x) be continuous in [x0, x2] and possess continuous derivatives of all orders. Also, let there exist a function F(x) in [x0, x2] such that F'(x) = f(x). Then
$$\begin{aligned}
\int_{x_0}^{x_2} f(x)\,dx &= \int_{x_0}^{x_2} F'(x)\,dx = F(x_2)-F(x_0) = F(x_0+2h)-F(x_0)\\
&= 2hF'(x_0) + \frac{(2h)^2}{2!}F''(x_0) + \frac{(2h)^3}{3!}F'''(x_0) + \frac{(2h)^4}{4!}F^{iv}(x_0) + \frac{(2h)^5}{5!}F^{v}(x_0) + \cdots\\
&= 2hf(x_0) + 2h^2f'(x_0) + \frac{4}{3}h^3f''(x_0) + \frac{2}{3}h^4f'''(x_0) + \frac{4}{15}h^5f^{iv}(x_0) + \cdots. \qquad (7.76)
\end{aligned}$$
Again,
$$\begin{aligned}
\frac{h}{3}[y_0+4y_1+y_2] &= \frac{h}{3}[f(x_0)+4f(x_0+h)+f(x_0+2h)]\\
&= \frac{h}{3}\Big[f(x_0) + 4\Big\{f(x_0)+hf'(x_0)+\frac{h^2}{2!}f''(x_0)+\frac{h^3}{3!}f'''(x_0)+\frac{h^4}{4!}f^{iv}(x_0)+\cdots\Big\}\\
&\qquad + \Big\{f(x_0)+2hf'(x_0)+\frac{(2h)^2}{2!}f''(x_0)+\frac{(2h)^3}{3!}f'''(x_0)+\frac{(2h)^4}{4!}f^{iv}(x_0)+\cdots\Big\}\Big]\\
&= 2hf(x_0) + 2h^2f'(x_0) + \frac{4}{3}h^3f''(x_0) + \frac{2}{3}h^4f'''(x_0) + \frac{5}{18}h^5f^{iv}(x_0) + \cdots. \qquad (7.77)
\end{aligned}$$
Using (7.76) and (7.77), equation (7.75) becomes
$$E = \Big(\frac{4}{15}-\frac{5}{18}\Big)h^5f^{iv}(x_0) + \cdots \simeq -\frac{h^5}{90}f^{iv}(\xi), \qquad (7.78)$$
where x0 < ξ < x2. This is the error in the interval [x0, x2].
The total error in the composite formula is
$$E = -\frac{h^5}{90}\{f^{iv}(x_0)+f^{iv}(x_2)+\cdots+f^{iv}(x_{n-2})\} = -\frac{h^5}{90}\cdot\frac{n}{2}f^{iv}(\xi) = -\frac{nh^5}{180}f^{iv}(\xi)$$
(where $f^{iv}(\xi)$ is the maximum among $f^{iv}(x_0), f^{iv}(x_2), \ldots, f^{iv}(x_{n-2})$)
$$= -\frac{(b-a)}{180}h^4f^{iv}(\xi). \qquad (7.79)$$

Geometrical interpretation of Simpson’s 1/3 rule


In Simpson’s 1/3 rule, the curve y = f (x) is replaced by the second degree parabola pass-
ing through the points A(x0 , y0 ), B(x1 , y1 ) and C(x2 , y2 ). Therefore, the area bounded
by the curve y = f (x), the ordinates x = x0 , x = x2 and the x-axis is approximated to
the area bounded by the parabola ABC, the straight lines x = x0 , x = x2 and x-axis,
i.e., the area of the shaded region ABCDEA.
Figure 7.3: Geometrical interpretation of Simpson's 1/3 rule.

Example 7.11.1 Evaluate $\int_0^3 (2x-x^2)\,dx$, taking 6 intervals, by (i) trapezoidal rule, and (ii) Simpson's 1/3 rule.

Solution. Here n = 6, a = 0, b = 3, y = f(x) = 2x − x².
So, h = (b − a)/n = (3 − 0)/6 = 0.5.
The tabulated values of x and y are shown below.
x0 x1 x2 x3 x4 x5 x6
xi : 0.0 0.5 1.0 1.5 2.0 2.5 3.0
yi : 0.0 0.75 1.0 0.75 0.0 -1.25 -3.0
y0 y1 y2 y3 y4 y5 y6

(i) By trapezoidal rule:
$$\int_0^3 (2x-x^2)\,dx = \frac{h}{2}[y_0+2(y_1+y_2+y_3+y_4+y_5)+y_6] = \frac{0.5}{2}[0+2(0.75+1.0+0.75+0-1.25)-3.0] = -0.125.$$
(ii) By Simpson's rule:
$$\int_0^3 (2x-x^2)\,dx = \frac{h}{3}[y_0+4(y_1+y_3+y_5)+2(y_2+y_4)+y_6] = \frac{0.5}{3}[0+4(0.75+0.75-1.25)+2(1.0+0.0)-3.0] = \frac{0.5}{3}[0+1+2-3] = 0.$$
3

Alternative deduction of Simpson’s 1/3 rule

This rule can also be deduced by applying the MVTs of differential and of integral calculus.
Let f ∈ C⁴[a, b] and $x = \frac{a+b}{2} + \frac{b-a}{2}z = p + qz$, with $p = \frac{a+b}{2}$, $q = \frac{b-a}{2}$.
Then when x = a, b, we have z = −1, 1 respectively.
Therefore,
$$I = \int_a^b f(x)\,dx = q\int_{-1}^1 f(p+qz)\,dz = q\int_{-1}^1 g(z)\,dz, \quad\text{where } g(z)=f(p+qz),$$
$$= q\Big[\int_{-1}^0 g(z)\,dz + \int_0^1 g(z)\,dz\Big] = q\int_0^1 [g(z)+g(-z)]\,dz = q\int_0^1 \varphi(z)\,dz, \qquad (7.80)$$
where φ(z) = g(z) + g(−z).
Note that φ(0) = 2g(0) = 2f(p) = 2f((a+b)/2), φ(1) = g(1)+g(−1) = f(a)+f(b), and φ'(0) = 0.
To prove $\int_0^1 \varphi(z)\,dz = (1+c)\varphi(1) - c\varphi(0) - \int_0^1 (z+c)\varphi'(z)\,dz$ for an arbitrary constant c:
$$\begin{aligned}
\int_0^1 \varphi(z)\,dz &= \int_0^1 \varphi(z)\,d(z+c) = \int_c^{1+c}\varphi(y-c)\,dy \quad[\text{where } z+c=y]\\
&= \Big[y\varphi(y-c)\Big]_c^{1+c} - \int_c^{1+c} y\,\varphi'(y-c)\,dy\\
&= (1+c)\varphi(1) - c\varphi(0) - \int_0^1 (z+c)\varphi'(z)\,d(z+c)\\
&= (1+c)\varphi(1) - c\varphi(0) - \int_0^1 (z+c)\varphi'(z)\,dz. \qquad (7.81)
\end{aligned}$$

Now, integrating by parts three times,
$$\begin{aligned}
\int_0^1 \varphi(z)\,dz &= (1+c)\varphi(1) - c\varphi(0) - \Big[\Big(\frac{z^2}{2}+cz+c_1\Big)\varphi'(z)\Big]_0^1 + \int_0^1 \Big(\frac{z^2}{2}+cz+c_1\Big)\varphi''(z)\,dz\\
&= (1+c)\varphi(1) - c\varphi(0) - \Big(\frac{1}{2}+c+c_1\Big)\varphi'(1) + c_1\varphi'(0)\\
&\quad + \Big[\Big(\frac{z^3}{6}+\frac{cz^2}{2}+c_1z+c_2\Big)\varphi''(z)\Big]_0^1 - \int_0^1 \Big(\frac{z^3}{6}+\frac{cz^2}{2}+c_1z+c_2\Big)\varphi'''(z)\,dz\\
&= (1+c)\varphi(1) - c\varphi(0) - \Big(\frac{1}{2}+c+c_1\Big)\varphi'(1) + \Big(\frac{1}{6}+\frac{c}{2}+c_1+c_2\Big)\varphi''(1)\\
&\quad - c_2\varphi''(0) - \int_0^1 \Big(\frac{z^3}{6}+\frac{cz^2}{2}+c_1z+c_2\Big)\varphi'''(z)\,dz, \qquad (7.82)
\end{aligned}$$
where c1 and c2 are arbitrary constants; they are chosen so that the coefficients of φ'(1), φ''(1) and φ''(0) vanish (recall φ'(0) = 0). Thus
$$\frac{1}{2}+c+c_1 = 0, \qquad \frac{1}{6}+\frac{c}{2}+c_1+c_2 = 0, \qquad c_2 = 0.$$
The solution of these equations is c2 = 0, c1 = 1/6, c = −2/3.
Hence
$$\begin{aligned}
I &= q\Big[\frac{1}{3}\varphi(1) + \frac{2}{3}\varphi(0) - \int_0^1\Big(\frac{z^3}{6}-\frac{z^2}{3}+\frac{z}{6}\Big)\varphi'''(z)\,dz\Big]\\
&= h\Big[\frac{1}{3}\{f(a)+f(b)\} + \frac{4}{3}f\Big(\frac{a+b}{2}\Big)\Big] - \frac{h}{6}\int_0^1 (z^3-2z^2+z)\,\varphi'''(z)\,dz \quad\Big[\text{as } q=\frac{b-a}{2}=h\Big]\\
&= \frac{h}{3}\Big[f(a)+4f\Big(\frac{a+b}{2}\Big)+f(b)\Big] + E,
\end{aligned}$$
where
$$\begin{aligned}
E &= -\frac{h}{6}\int_0^1 z(z-1)^2\,\varphi'''(z)\,dz = -\frac{h}{6}\int_0^1 z(z-1)^2\,[g'''(z)-g'''(-z)]\,dz\\
&= -\frac{h}{6}\int_0^1 z(z-1)^2\cdot 2z\,g^{iv}(\xi)\,dz, \quad -z<\xi<z \quad[\text{by Lagrange's MVT}]\\
&= -\frac{h}{3}\,g^{iv}(\xi_1)\int_0^1 z^2(z-1)^2\,dz \quad[\text{by MVT of integral calculus}]\\
&= -\frac{h}{3}\,g^{iv}(\xi_1)\cdot\frac{1}{30} = -\frac{h}{90}\,g^{iv}(\xi_1), \quad 0<\xi_1<1.
\end{aligned}$$
Again, g(z) = f(p+qz), so $g^{iv}(z) = q^4 f^{iv}(p+qz) = h^4 f^{iv}(\xi_2)$, a < ξ2 < b. Therefore,
$$E = -\frac{h^5}{90}\,f^{iv}(\xi_2).$$

Hence,
$$\int_a^b f(x)\,dx = \frac{h}{3}\Big[f(a)+4f\Big(\frac{a+b}{2}\Big)+f(b)\Big] - \frac{h^5}{90}\,f^{iv}(\xi_2).$$
Here, the first term is the value of the integration obtained from Simpson's 1/3 rule and the second term is its error.
Algorithm 7.4 (Simpson's 1/3). This algorithm determines the value of $\int_a^b f(x)\,dx$ using Simpson's 1/3 rule.

Algorithm Simpson One Third


Input function f (x);
Read a, b, n; //the lower and upper limits and number of subintervals.//
Compute h = (b − a)/n;
Set sum = [f(a) − f(a + nh)]; //the loop below adds 2f(b) once too often; starting with f(a) − f(b) compensates for this.//
for i = 1 to n − 1 step 2 do
Compute sum = sum + 4 ∗ f (a + ih) + 2 ∗ f (a + (i + 1)h);
endfor;
Compute result = sum ∗ h/3;
Print result;
end Simpson One Third.

Program 7.4.
/* Program Simpson’s 1/3
Program to find the value of integration of a function
f(x) using Simpson’s 1/3 rule. Here we assume that f(x)=x^3.*/
#include<stdio.h>
#include<stdlib.h> /* for exit() */
void main()
{
float f(float);
float a,b,h,sum;
int i,n;
printf("\nEnter the values of a, b ");
scanf("%f %f",&a,&b);
printf("Enter the value of subintervals n ");
scanf("%d",&n);
if(n%2!=0) {
printf("Number of subdivision should be even");
exit(0);
}
h=(b-a)/n;
sum=f(a)-f(a+n*h);

for(i=1;i<=n-1;i+=2)
sum+=4*f(a+i*h)+2*f(a+(i+1)*h);
sum*=h/3.;
printf("Value of the integration is %f ",sum);
} /* main */

/* definition of the function f(x) */


float f(float x)
{
return(x*x*x);
}
A sample of input/output:
Enter the values of a, b 0 1
Enter the value of subintervals n 100
Value of the integration is 0.250000

7.11.3 Simpson’s 3/8 rule


Simpson’s 3/8 rule can be obtained by substituting n = 3 in (7.66). Note that the
differences higher than the third order do not exist here.
$$\begin{aligned}
\int_a^b f(x)\,dx = \int_{x_0}^{x_3} f(x)\,dx &= 3h\Big[y_0 + \frac{3}{2}\Delta y_0 + \frac{3}{4}\Delta^2 y_0 + \frac{1}{8}\Delta^3 y_0\Big]\\
&= 3h\Big[y_0 + \frac{3}{2}(y_1-y_0) + \frac{3}{4}(y_2-2y_1+y_0) + \frac{1}{8}(y_3-3y_2+3y_1-y_0)\Big]\\
&= \frac{3h}{8}[y_0+3y_1+3y_2+y_3]. \qquad (7.83)
\end{aligned}$$
This formula is known as Simpson’s 3/8 rule.
Now, the interval [a, b] is divided into n (divisible by 3) equal subintervals by the
points x0 , x1 , . . . , xn and the formula is applied to each of the intervals.
Then
$$\begin{aligned}
\int_{x_0}^{x_n} f(x)\,dx &= \int_{x_0}^{x_3} f(x)\,dx + \int_{x_3}^{x_6} f(x)\,dx + \cdots + \int_{x_{n-3}}^{x_n} f(x)\,dx\\
&= \frac{3h}{8}\big[(y_0+3y_1+3y_2+y_3) + (y_3+3y_4+3y_5+y_6) + \cdots + (y_{n-3}+3y_{n-2}+3y_{n-1}+y_n)\big]\\
&= \frac{3h}{8}\big[y_0 + 3(y_1+y_2+y_4+y_5+y_7+y_8+\cdots+y_{n-2}+y_{n-1})\\
&\qquad + 2(y_3+y_6+y_9+\cdots+y_{n-3}) + y_n\big]. \qquad (7.84)
\end{aligned}$$

This formula is known as Simpson’s 3/8 composite rule.


Note 7.11.2 This method is not as accurate as Simpson's 1/3 rule. The error in this formula is $-\frac{3}{80}h^5 f^{iv}(\xi)$, x0 < ξ < x3.
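The composite rule (7.84) translates directly into code. The following is a minimal C sketch in the spirit of Program 7.3; the helper name simp38 and the test integrand f(x) = x³ are our own choices, not from the text.

/* Composite Simpson's 3/8 rule (7.84); n must be divisible by 3.
   Interior points with index divisible by 3 get weight 2, the rest weight 3. */
#include<stdio.h>

float f(float x)
{
    return(x*x*x); /* assumed test integrand, as in the earlier programs */
}

float simp38(float a, float b, int n)
{
    float h=(b-a)/n, sum=f(a)+f(a+n*h);
    int i;
    for(i=1;i<=n-1;i++)
        sum+=(i%3==0 ? 2 : 3)*f(a+i*h);
    return(sum*3*h/8);
}

int main()
{
    printf("%8.5f\n", simp38(0,1,99)); /* exact value is 0.25 */
    return 0;
}

With n = 99 subintervals on [0, 1] the printed value agrees with the exact 0.25.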

7.11.4 Boole’s rule


Substituting n = 4 in (7.66), the equation reduces to
$$\begin{aligned}
\int_{x_0}^{x_4} f(x)\,dx &= 4h\Big[y_0+2\Delta y_0+\frac{5}{3}\Delta^2 y_0+\frac{2}{3}\Delta^3 y_0+\frac{7}{90}\Delta^4 y_0\Big]\\
&= 4h\Big[y_0+2(y_1-y_0)+\frac{5}{3}(y_2-2y_1+y_0)+\frac{2}{3}(y_3-3y_2+3y_1-y_0)\\
&\qquad +\frac{7}{90}(y_4-4y_3+6y_2-4y_1+y_0)\Big]\\
&= \frac{2h}{45}[7y_4+32y_3+12y_2+32y_1+7y_0]. \qquad (7.85)
\end{aligned}$$
This rule is known as Boole’s rule.
It can be shown that the error of this formula is $-\frac{8h^7}{945}f^{vi}(\xi)$, a < ξ < b.

7.11.5 Weddle’s rule


To find Weddle’s rule, substituting n = 6 in (7.66). Then
% b
f (x)dx
a
9 41 11 41 6
= 6h y0 + 3∆y0 + ∆2 y0 + 4∆3 y0 + ∆4 y0 + ∆5 y0 + ∆ y0
2 20 20 840
9 41 11 1 h 6
= 6h y0 + 3∆y0 + ∆2 y0 +4∆3 y0 + ∆4 y0 + ∆5 y0 + ∆6 y0 − ∆ y0 .
2 20 20 20 140
h 6
If the sixth order difference is very small, then we may neglect the last term ∆ y0 .
140
But, this rejection increases a negligible amount of error, though, it simplifies the inte-
gration formula. Then the above equation becomes
% x6
f (x)dx
x0
3h
= [20y0 + 60∆y0 + 90∆2 y0 + 80∆3 y0 + 41∆4 y0 + 11∆5 y0 + ∆6 y0 ]
10
3h
= [y0 + 5y1 + y2 + 6y3 + y4 + 5y5 + y6 ]. (7.86)
10
This formula is known as Weddle’s rule for numerical integration.

Composite Weddle’s rule


In this rule, the interval [a, b] is divided into n (divisible by 6) subintervals by the points x0, x1, ..., xn. Then
$$\begin{aligned}
\int_{x_0}^{x_n} f(x)\,dx &= \int_{x_0}^{x_6} f(x)\,dx + \int_{x_6}^{x_{12}} f(x)\,dx + \cdots + \int_{x_{n-6}}^{x_n} f(x)\,dx\\
&= \frac{3h}{10}[y_0+5y_1+y_2+6y_3+y_4+5y_5+y_6] + \frac{3h}{10}[y_6+5y_7+y_8+6y_9+y_{10}+5y_{11}+y_{12}]\\
&\quad + \cdots + \frac{3h}{10}[y_{n-6}+5y_{n-5}+y_{n-4}+6y_{n-3}+y_{n-2}+5y_{n-1}+y_n]\\
&= \frac{3h}{10}\big[y_0 + 5(y_1+y_5+y_7+y_{11}+\cdots+y_{n-5}+y_{n-1}) + (y_2+y_4+y_8+y_{10}+\cdots+y_{n-4}+y_{n-2})\\
&\quad + 6(y_3+y_9+y_{15}+\cdots+y_{n-3}) + 2(y_6+y_{12}+\cdots+y_{n-6}) + y_n\big]. \qquad (7.87)
\end{aligned}$$

The above formula is known as Weddle’s composite rule.


By the technique used for the trapezoidal and Simpson's 1/3 rules, one can prove that the error in Weddle's rule is $-\frac{h^7}{140}f^{vi}(\xi)$, x0 < ξ < x6.
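For computation, the composite formula (7.87) can equivalently be applied block by block, each block of six subintervals using the simple rule (7.86). A minimal C sketch follows; the helper weddle and the test integrand are our own assumptions, not from the text.

/* Composite Weddle's rule (7.87); n must be divisible by 6.
   Each block of six subintervals is handled by the simple rule (7.86). */
#include<stdio.h>

float f(float x)
{
    return(x*x*x); /* assumed test integrand */
}

float weddle(float a, float b, int n)
{
    float h=(b-a)/n, sum=0;
    int i;
    for(i=0;i<n;i+=6)  /* block starting at x_i */
        sum+=f(a+i*h)+5*f(a+(i+1)*h)+f(a+(i+2)*h)+6*f(a+(i+3)*h)
            +f(a+(i+4)*h)+5*f(a+(i+5)*h)+f(a+(i+6)*h);
    return(sum*3*h/10);
}

int main()
{
    printf("%8.5f\n", weddle(0,1,96)); /* exact value is 0.25 */
    return 0;
}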

Degree of Precision
The degree of precision of a quadrature formula is the positive integer n such that the error is zero for all polynomials of degree i ≤ n, but is non-zero for some polynomial of degree n + 1.
The degree of precision of some quadrature formulae are given in Table 7.2.

Table 7.2: Degree of precision of some quadrature formulae.

Method Degree of precision


Trapezoidal 1
Simpson’s 1/3 3
Simpson’s 3/8 3
Boole’s 5
Weddle’s 5

Comparison of Simpson’s 1/3 and Weddle’s rules

The Weddle’s rule gives more accurate result than Simpson’s 1/3 rule. But, Weddle’s
rule has a major disadvantage that it requires the number of subdivisions (n) as a
multiple of six. In many cases, the value of h = b−an (n is multiple of six) is not finite
in decimal representation. For these reasons, the values of x0 , x1 , . . . , xn can not be
determined accurately and hence the values of y i.e., y0 , y1 , . . . , yn become inaccurate.
In Simpson’s 1/3 rule, n, the number of subdivisions is even, so one can take n as 10,
20 etc. and hence h is finite in decimal representation. Thus the values of x0 , x1 , . . . , xn
and y0 , y1 , . . . , yn can be computed correctly.
However, Weddle’s rule should be used when Simpson’s 1/3 rule does not give the
desired accuracy.

7.12 Integration Based on Lagrange’s Interpolation

Let the function y = f (x) be known at the (n + 1) points x0 , x1 , . . . , xn of [a, b], these
points need not be equispaced.
The Lagrange’s interpolation polynomial is


n
w(x)
φ(x) = yi (7.88)
(x − xi )w (xi )
i=0
where w(x) = (x − x0 ) · · · (x − xn )

and φ(xi ) = yi , i = 0, 1, 2, . . . , n.
If the function f (x) is replaced by the polynomial φ(x) then
% b % b n %
 b
w(x)
f (x)dx  φ(x)dx = yi dx. (7.89)
a a a (x − xi )w (xi )
i=0

The above equation can be written as


% b 
n
f (x)dx  Ci yi , (7.90)
a i=0
% b
w(x)
where Ci = dx, i = 0, 1, 2, . . . , n. (7.91)
a (x − xi )w (xi )

It may be noted that the coefficients Ci are independent of the choice of the function
f (x) for a given set of points.

7.13 Newton-Cotes Integration Formulae (Closed type)

Let the interpolation points x0 , x1 , . . . , xn be equispaced, i.e., xi = x0 +ih, i = 1, 2, . . . , n.


Also, let x0 = a, xn = b, h = (b − a)/n and yi = f(xi), i = 0, 1, 2, ..., n. Then the definite integral $\int_a^b f(x)\,dx$ can be determined by replacing f(x) by the Lagrange's interpolation polynomial φ(x), and the approximate integration formula is given by
$$\int_a^b f(x)\,dx \simeq \sum_{i=0}^n C_i y_i, \qquad (7.92)$$
where the Ci are some constant coefficients.
Now, the explicit expressions for Ci ’s are evaluated in the following.
The Lagrange’s interpolation polynomial is

$$\phi(x) = \sum_{i=0}^n L_i(x)\,y_i, \qquad (7.93)$$
where
$$L_i(x) = \frac{(x-x_0)(x-x_1)\cdots(x-x_{i-1})(x-x_{i+1})\cdots(x-x_n)}{(x_i-x_0)(x_i-x_1)\cdots(x_i-x_{i-1})(x_i-x_{i+1})\cdots(x_i-x_n)}. \qquad (7.94)$$
Introduce x = x0 + sh. Then x − xj = (s − j)h and xi − xj = (i − j)h. Therefore,
$$L_i(x) = \frac{s(s-1)\cdots(s-i+1)(s-i-1)\cdots(s-n)\,h^n}{i(i-1)\cdots 1\cdot(-1)(-2)\cdots(i-n)\,h^n} = \frac{(-1)^{n-i}}{i!(n-i)!}\cdot\frac{s(s-1)(s-2)\cdots(s-n)}{(s-i)}. \qquad (7.95)$$
Then (7.92) becomes
$$\int_{x_0}^{x_n} f(x)\,dx \simeq \sum_{i=0}^n C_i y_i,$$
or,
$$\sum_{i=0}^n \bigg[\int_{x_0}^{x_n} \frac{(-1)^{n-i}}{i!(n-i)!}\cdot\frac{s(s-1)(s-2)\cdots(s-n)}{(s-i)}\,dx\bigg]\,y_i = \sum_{i=0}^n C_i y_i. \qquad (7.96)$$
Comparing both sides gives the expression for Ci in the form
$$C_i = \int_{x_0}^{x_n} \frac{(-1)^{n-i}}{i!(n-i)!}\cdot\frac{s(s-1)(s-2)\cdots(s-n)}{(s-i)}\,dx = \frac{(-1)^{n-i}h}{i!(n-i)!}\int_0^n \frac{s(s-1)(s-2)\cdots(s-n)}{(s-i)}\,ds, \qquad (7.97)$$
i = 0, 1, 2, ..., n, where x = x0 + sh.
Since h = (b − a)/n, we substitute
$$C_i = (b-a)H_i, \qquad (7.98)$$
where
$$H_i = \frac{1}{n}\cdot\frac{(-1)^{n-i}}{i!(n-i)!}\int_0^n \frac{s(s-1)(s-2)\cdots(s-n)}{(s-i)}\,ds, \quad i = 0,1,2,\ldots,n. \qquad (7.99)$$
These coefficients Hi are called Cotes coefficients.
Then the integration formula (7.92) becomes
$$\int_a^b f(x)\,dx \simeq (b-a)\sum_{i=0}^n H_i y_i, \qquad (7.100)$$
where the Hi's are given by (7.99).

Note 7.13.1 The Cotes coefficients Hi do not depend on the function f(x).

7.13.1 Some results on Cotes coefficients



(i) $\sum_{i=0}^n C_i = b-a$.
By the property of Lagrangian functions, $\sum_{i=0}^n \frac{w(x)}{(x-x_i)w'(x_i)} = 1$. That is,
$$\int_a^b \sum_{i=0}^n \frac{w(x)}{(x-x_i)w'(x_i)}\,dx = \int_a^b dx = b-a. \qquad (7.101)$$
Again,
$$\int_a^b \sum_{i=0}^n \frac{w(x)}{(x-x_i)w'(x_i)}\,dx = \sum_{i=0}^n \frac{(-1)^{n-i}h}{i!(n-i)!}\int_0^n \frac{s(s-1)(s-2)\cdots(s-n)}{(s-i)}\,ds = \sum_{i=0}^n C_i. \qquad (7.102)$$
Hence from (7.101) and (7.102),
$$\sum_{i=0}^n C_i = b-a. \qquad (7.103)$$
(ii) $\sum_{i=0}^n H_i = 1$.
From the relation (7.98), Ci = (b − a)Hi, so
$$\sum_{i=0}^n C_i = (b-a)\sum_{i=0}^n H_i, \quad\text{i.e.,}\quad (b-a) = (b-a)\sum_{i=0}^n H_i \quad[\text{using }(7.103)].$$
Hence,
$$\sum_{i=0}^n H_i = 1. \qquad (7.104)$$
That is, the sum of the Cotes coefficients is one.

(iii) Ci = Cn−i.
From the definition of Ci one can find
$$C_{n-i} = \frac{(-1)^{i}h}{(n-i)!\,i!}\int_0^n \frac{s(s-1)(s-2)\cdots(s-n)}{s-(n-i)}\,ds.$$
Substituting t = n − s, we obtain
$$C_{n-i} = -\frac{(-1)^{i}(-1)^{n}h}{i!(n-i)!}\int_n^0 \frac{t(t-1)(t-2)\cdots(t-n)}{t-i}\,dt = \frac{(-1)^{n-i}h}{i!(n-i)!}\int_0^n \frac{s(s-1)(s-2)\cdots(s-n)}{s-i}\,ds = C_i.$$
Hence,
$$C_i = C_{n-i}. \qquad (7.105)$$
(iv) Hi = Hn−i.
Since Ci = (b − a)Hi, dividing (7.105) by (b − a) gives
$$H_i = H_{n-i}. \qquad (7.106)$$

7.13.2 Deduction of quadrature formulae


Trapezoidal rule
Substituting n = 1 in (7.100), we get
$$\int_a^b f(x)\,dx = (b-a)\sum_{i=0}^1 H_i y_i = (b-a)(H_0y_0+H_1y_1).$$
Now H0 and H1 are obtained from (7.99) by substituting i = 0 and 1. Therefore,
$$H_0 = -\int_0^1 \frac{s(s-1)}{s}\,ds = \frac{1}{2} \quad\text{and}\quad H_1 = \int_0^1 s\,ds = \frac{1}{2}.$$
Here, h = (b − a)/n = b − a for n = 1. Hence,
$$\int_a^b f(x)\,dx = \frac{b-a}{2}(y_0+y_1) = \frac{h}{2}(y_0+y_1).$$

Simpson’s 1/3 rule


For n = 2,
$$H_0 = \frac{1}{2}\cdot\frac{1}{2}\int_0^2 (s-1)(s-2)\,ds = \frac{1}{6}, \quad H_1 = -\frac{1}{2}\int_0^2 s(s-2)\,ds = \frac{2}{3}, \quad H_2 = \frac{1}{2}\cdot\frac{1}{2}\int_0^2 s(s-1)\,ds = \frac{1}{6}.$$
In this case h = (b − a)/2. Hence equation (7.100) gives the following formula:
$$\int_a^b f(x)\,dx = (b-a)\sum_{i=0}^2 H_i y_i = (b-a)(H_0y_0+H_1y_1+H_2y_2) = \frac{h}{3}(y_0+4y_1+y_2).$$

Weddle’s rule
To deduce the Weddle’s rule, n = 6 is substituted in (7.100).
% b 
6
f (x)dx = (b − a) H i yi
a i=0
= 6h(H0 y0 + H1 y1 + H2 y2 + H3 y3 + H4 y4 + H5 y5 + H6 y6 )
= 6h[H0 (y0 + y6 ) + H1 (y1 + y5 ) + H2 (y2 + y4 ) + H3 y3 ].

To find the values of Hi ’s one may use the result Hi = Hn−i . Also the value of H3
can be obtained by the formula
Differentiation and Integration 453

H3 = 1 − (H0 + H1 + H2 + H4 + H5 + H6 ) = 1 − 2(H0 + H1 + H2 ).
%
1 1 6 s(s − 1)(s − 2) · · · (s − 6) 41
Now, H0 = · ds = .
6 6! 0 s 840
216 27 272
Similarly, H1 = , H2 = , H3 = .
840 840 840
Hence,
% b
h
f (x)dx = [41y0 +216y1 +27y2 +272y3 +27y4 +216y5 +41y6 ]. (7.107)
a 140
Again, we know that ∆6 y0 = y0 − 6y1 + 15y2 − 20y3 + 15y4 − 6y5 + y6 ,
h
i.e., 140 [y0 − 6y1 + 15y2 − 20y3 + 15y4 − 6y5 + y6 ] − 140
h
∆6 y0 = 0.
Adding left hand side of above identity (as it is zero) to the right hand side of (7.107).
After simplification the equation (7.107) finally reduces to
% b
3h h 6
f (x)dx = [y0 + 5y1 + y2 + 6y3 + y4 + 5y5 + y6 ] − ∆ y0 .
a 10 140
The first term is the well known Weddle’s rule and the last term is the error in
addition to the truncation error.

Table 7.3: Weights of the Newton-Cotes integration rule for different n (the entries are the coefficients wi in ∫ f(x)dx = h Σ wi yi).

n = 1:  1/2, 1/2
n = 2:  1/3, 4/3, 1/3
n = 3:  3/8, 9/8, 9/8, 3/8
n = 4:  14/45, 64/45, 24/45, 64/45, 14/45
n = 5:  95/288, 375/288, 250/288, 250/288, 375/288, 95/288
n = 6:  41/140, 216/140, 27/140, 272/140, 27/140, 216/140, 41/140

7.14 Newton-Cotes Formulae (Open Type)

All the formulae based on the Newton-Cotes formula developed in Section 7.13 are of closed type, i.e., they use the function values at the end points a, b of the interval [a, b] of integration. Here, some formulae are introduced that take the function values at equispaced intermediate points, but not at the end points. These formulae may be used when the function has singularity(s) at the end points or when the values of the function are unknown at the end points. Also, these methods are useful for solving ordinary differential equations numerically when the function values at the end points are not available. These formulae are sometimes known as the Steffensen formulae.

(i) Mid-point formula

$$\int_{x_0}^{x_1} f(x)\,dx = hf(x_0+h/2) + \frac{1}{24}h^3f''(\xi), \quad x_0\le\xi\le x_1. \qquad (7.108)$$

(ii) Two-point formula

$$\int_{x_0}^{x_3} f(x)\,dx = \frac{3h}{2}[f(x_1)+f(x_2)] + \frac{3h^3}{4}f''(\xi), \quad x_0\le\xi\le x_3. \qquad (7.109)$$

(iii) Three-point formula

$$\int_{x_0}^{x_4} f(x)\,dx = \frac{4h}{3}[2f(x_1)-f(x_2)+2f(x_3)] + \frac{14h^5}{45}f^{iv}(\xi), \quad x_0\le\xi\le x_4. \qquad (7.110)$$

(iv) Four-point formula

$$\int_{x_0}^{x_5} f(x)\,dx = \frac{5h}{24}[11f(x_1)+f(x_2)+f(x_3)+11f(x_4)] + \frac{95h^5}{144}f^{iv}(\xi), \quad x_0\le\xi\le x_5. \qquad (7.111)$$

These formulae may be obtained by integrating Lagrange’s interpolating polynomial


for the data points (xi , yi ), i = 1, 2, . . . , (n − 1) between the given limits.
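As an illustration of (7.110), the three-point open formula uses only the interior points x1, x2, x3 of [x0, x4], so the integrand is never evaluated at the end points. A C sketch follows; the function open3 and the test integrand sin x/x, which is indeterminate at the end point x = 0, are our own choices.

/* Three-point open Newton-Cotes formula (7.110) on [a,b]:
   h=(b-a)/4 and only the interior nodes x1,x2,x3 are used,
   so f need not be evaluated at the end points. */
#include<stdio.h>
#include<math.h>

double f(double x)
{
    return sin(x)/x;   /* 0/0 at x=0, but x=0 is an end point and is never used */
}

double open3(double a, double b)
{
    double h=(b-a)/4;
    return 4*h/3*(2*f(a+h)-f(a+2*h)+2*f(a+3*h));
}

int main()
{
    printf("Si(1) approx = %.6f\n", open3(0,1)); /* exact: 0.946083 */
    return 0;
}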

Methods Based on Undetermined Coefficients

In the Newton-Cotes method all the nodes xi , i = 0, 1, 2, . . . , n are known and eq-
uispaced. Also, the formulae obtained from Newton-Cotes method are exact for the
polynomials of degree up to n. When the nodes xi , i = 0, 1, 2, . . . , n are unknown then
one can devise some methods which give exact result for the polynomials of degree up
to 2n − 1. These methods are called Gaussian quadrature methods.

7.15 Gaussian Quadrature

The Gaussian quadrature is of the form


% b 
n
ψ(x)f (x)dx = wi f (xi ), (7.112)
a i=1

where the xi and wi are respectively called nodes and weights, and ψ(x) is called the weight function. Depending on the weight function, different quadrature formulae can be obtained.
The fundamental theorem of Gaussian quadrature states that the optimal nodes of the m-point Gaussian quadrature formula are precisely the zeros of the orthogonal polynomial for the same interval and weight function. Gaussian quadrature is optimal because it fits all polynomials up to degree 2m − 1 exactly.
To determine the weights corresponding to the Gaussian nodes xi, compute a Lagrange's interpolating polynomial for f(x) by setting
$$\pi(x) = \prod_{j=1}^m (x-x_j). \qquad (7.113)$$
Then
$$\pi'(x_j) = \prod_{\substack{i=1\\ i\ne j}}^m (x_j-x_i). \qquad (7.114)$$

Then Lagrange’s interpolating polynomial through m points is



m
π(x)
φ(x) = f (xj ) (7.115)
(x − xj )π  (xj )
j=1

for arbitrary points x. Now, determine a set of points xj and wj such that for a weight
function ψ(x) the following relation is valid.
% b % b m
π(x)ψ(x)
φ(x)ψ(x)dx = f (xj )dx
a a (x − xj )π  (xj )
j=1

m
= wj f (xj ), (7.116)
j=1

where weights wj are obtained from


% b
1 π(x)ψ(x)
wj =  dx. (7.117)
π (xj ) a x − xj
The weights wj are sometimes called the Christoffel numbers. It can be shown that the error is given by
$$E = \frac{f^{(2n)}(\xi)}{(2n)!}\int_a^b \psi(x)[\pi(x)]^2\,dx. \qquad (7.118)$$
Any finite interval [a, b] can be transformed to the interval [−1, 1] using the linear transformation
$$x = \frac{b-a}{2}\,t + \frac{b+a}{2} = qt+p. \qquad (7.119)$$
Then,
$$\int_a^b f(x)\,dx = q\int_{-1}^1 f(qt+p)\,dt. \qquad (7.120)$$
Thus, to study Gaussian quadrature, we consider the integral in the form
$$\int_{-1}^1 \psi(x)f(x)\,dx = \sum_{i=1}^n w_i f(x_i) + E. \qquad (7.121)$$

7.15.1 Gauss-Legendre integration methods


Here ψ(x) is taken as 1 and so the formula (7.121) reduces to
$$\int_{-1}^1 f(x)\,dx = \sum_{i=1}^n w_i f(x_i). \qquad (7.122)$$

It may be noted that the wi and xi together are 2n parameters, and therefore the weights and nodes can be determined such that the formula is exact when f(x) is a polynomial of degree not exceeding 2n − 1. Let
$$f(x) = c_0 + c_1x + c_2x^2 + \cdots + c_{2n-1}x^{2n-1}. \qquad (7.123)$$
Therefore,
$$\int_{-1}^1 f(x)\,dx = \int_{-1}^1 [c_0+c_1x+c_2x^2+\cdots+c_{2n-1}x^{2n-1}]\,dx = 2c_0 + \frac{2}{3}c_2 + \frac{2}{5}c_4 + \cdots. \qquad (7.124)$$
When x = xi, equation (7.123) becomes
$$f(x_i) = c_0+c_1x_i+c_2x_i^2+c_3x_i^3+\cdots+c_{2n-1}x_i^{2n-1}.$$
Substituting these values into the right hand side of (7.122) gives
$$\begin{aligned}
\int_{-1}^1 f(x)\,dx &= w_1[c_0+c_1x_1+c_2x_1^2+\cdots+c_{2n-1}x_1^{2n-1}] + w_2[c_0+c_1x_2+\cdots+c_{2n-1}x_2^{2n-1}]\\
&\quad + \cdots + w_n[c_0+c_1x_n+\cdots+c_{2n-1}x_n^{2n-1}]\\
&= c_0(w_1+w_2+\cdots+w_n) + c_1(w_1x_1+w_2x_2+\cdots+w_nx_n)\\
&\quad + c_2(w_1x_1^2+w_2x_2^2+\cdots+w_nx_n^2) + \cdots + c_{2n-1}(w_1x_1^{2n-1}+w_2x_2^{2n-1}+\cdots+w_nx_n^{2n-1}). \qquad (7.125)
\end{aligned}$$
Since (7.124) and (7.125) are identical, comparing the coefficients of the ci yields the 2n equations
$$\begin{aligned}
w_1+w_2+\cdots+w_n &= 2\\
w_1x_1+w_2x_2+\cdots+w_nx_n &= 0\\
w_1x_1^2+w_2x_2^2+\cdots+w_nx_n^2 &= \tfrac{2}{3} \qquad (7.126)\\
&\ \,\vdots\\
w_1x_1^{2n-1}+w_2x_2^{2n-1}+\cdots+w_nx_n^{2n-1} &= 0.
\end{aligned}$$

Now, equation (7.126) is a set of 2n non-linear equations in the 2n unknowns wi and xi, i = 1, 2, ..., n. The solution of these equations gives the values of wi and xi. Let wi = wi* and xi = xi*, i = 1, 2, ..., n be the solution of (7.126). Then the Gauss-Legendre formula is finally given by
$$\int_{-1}^1 f(x)\,dx = \sum_{i=1}^n w_i^* f(x_i^*). \qquad (7.127)$$

Unfortunately, the determination of the general solution of the system (7.126) is very complicated. Thus we concentrate on its particular cases.
Case I. When n = 1, the formula is
$$\int_{-1}^1 f(x)\,dx = w_1f(x_1), \quad\text{where } w_1 = 2 \text{ and } w_1x_1 = 0, \text{ i.e., } x_1 = 0.$$
Thus for n = 1,
$$\int_{-1}^1 f(x)\,dx = 2f(0). \qquad (7.128)$$

Case II. When n = 2, the integral is
$$\int_{-1}^1 f(x)\,dx = w_1f(x_1)+w_2f(x_2), \qquad (7.129)$$
and the system (7.126) reduces to
$$w_1+w_2 = 2, \quad w_1x_1+w_2x_2 = 0, \quad w_1x_1^2+w_2x_2^2 = \tfrac{2}{3}, \quad w_1x_1^3+w_2x_2^3 = 0. \qquad (7.130)$$
The above equations can also be obtained in the following way: the formula (7.129) is exact when f(x) is a polynomial of degree ≤ 3. Substituting successively f(x) = 1, x, x², x³ in (7.129) gives the system
$$\begin{aligned}
w_1+w_2 &= 2 && (f(x)=1)\\
w_1x_1+w_2x_2 &= 0 && (f(x)=x)\\
w_1x_1^2+w_2x_2^2 &= \tfrac{2}{3} && (f(x)=x^2) \qquad (7.131)\\
w_1x_1^3+w_2x_2^3 &= 0 && (f(x)=x^3).
\end{aligned}$$
The solution of these equations is $w_1 = w_2 = 1$, $x_1 = -1/\sqrt{3}$, $x_2 = 1/\sqrt{3}$. Hence, equation (7.129) becomes
$$\int_{-1}^1 f(x)\,dx = f(-1/\sqrt{3}) + f(1/\sqrt{3}).$$
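Combined with the transformation (7.119), the two-point formula already integrates any cubic over [a, b] exactly. A minimal C sketch (the helper gauss2 is our own):

/* Two-point Gauss-Legendre rule on [a,b] via the map x = p + q t,
   p=(a+b)/2, q=(b-a)/2, nodes t = -1/sqrt(3), 1/sqrt(3), weights 1. */
#include<stdio.h>
#include<math.h>

double f(double x)
{
    return x*x*x;   /* a cubic: the two-point rule is exact for it */
}

double gauss2(double a, double b)
{
    double p=(a+b)/2, q=(b-a)/2, t=1.0/sqrt(3.0);
    return q*(f(p-q*t)+f(p+q*t));
}

int main()
{
    printf("%.10f\n", gauss2(0,1)); /* exact value 0.25 */
    return 0;
}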

Case III. When n = 3, the integral becomes
$$\int_{-1}^1 f(x)\,dx = w_1f(x_1)+w_2f(x_2)+w_3f(x_3). \qquad (7.132)$$
The six unknowns x1, x2, x3 and w1, w2, w3 are related by
$$\begin{aligned}
w_1+w_2+w_3 &= 2\\
w_1x_1+w_2x_2+w_3x_3 &= 0\\
w_1x_1^2+w_2x_2^2+w_3x_3^2 &= \tfrac{2}{3}\\
w_1x_1^3+w_2x_2^3+w_3x_3^3 &= 0\\
w_1x_1^4+w_2x_2^4+w_3x_3^4 &= \tfrac{2}{5}\\
w_1x_1^5+w_2x_2^5+w_3x_3^5 &= 0.
\end{aligned}$$
These equations may also be obtained by substituting f(x) = 1, x, x², x³, x⁴, x⁵ into equation (7.132).
The solution of this system of equations is
$$x_1 = -\sqrt{3/5}, \quad x_2 = 0, \quad x_3 = \sqrt{3/5}, \quad w_1 = \frac{5}{9}, \quad w_2 = \frac{8}{9}, \quad w_3 = \frac{5}{9}.$$
For these values, equation (7.132) becomes
$$\int_{-1}^1 f(x)\,dx = \frac{1}{9}\Big[5f\big(-\sqrt{3/5}\,\big)+8f(0)+5f\big(\sqrt{3/5}\,\big)\Big]. \qquad (7.133)$$

In this way one can determine the formulae for higher values of n. The solution of the equations (7.126) for higher values of n is very complicated, as they are non-linear with respect to the nodes x1, x2, ..., xn (though linear with respect to the weights). It can be shown that the xi, i = 1, 2, ..., n are the zeros of the nth degree Legendre polynomial
$$P_n(x) = \frac{1}{2^n n!}\,\frac{d^n}{dx^n}\big[(x^2-1)^n\big],$$
which can be generated from the recurrence relation
$$(n+1)P_{n+1}(x) = (2n+1)xP_n(x) - nP_{n-1}(x), \qquad (7.134)$$
where P0(x) = 1 and P1(x) = x. Some lower order Legendre polynomials are
$$P_0(x)=1,\quad P_1(x)=x,\quad P_2(x)=\tfrac{1}{2}(3x^2-1),\quad P_3(x)=\tfrac{1}{2}(5x^3-3x),\quad P_4(x)=\tfrac{1}{8}(35x^4-30x^2+3). \qquad (7.135)$$
The weights wi are given by
$$w_i = \int_{-1}^1 \prod_{\substack{j=1\\ j\ne i}}^{n} \Big(\frac{x-x_j}{x_i-x_j}\Big)\,dx, \qquad (7.136)$$
where the xi are the known nodes. Also, the weights wi may be obtained from the relation
$$w_i = \frac{2}{(1-x_i^2)[P_n'(x_i)]^2}. \qquad (7.137)$$
It can be shown that the error of this formula is
$$E = \frac{2^{2n+1}(n!)^4}{(2n+1)[(2n)!]^3}\,f^{(2n)}(\xi), \quad -1<\xi<1. \qquad (7.138)$$

The nodes and weights for different values of n are listed in Table 7.4.

Table 7.4: Values of xi and wi for Gauss-Legendre quadrature.

n node xi weight wi order of


truncation error
2 ±0.57735027 1.00000000 f (4) (ξ)
3 0.00000000 0.88888889 f (6) (ξ)
±0.77459667 0.55555556
4 ±0.33998104 0.65214515 f (8) (ξ)
±0.86113631 0.34785485
5 0.00000000 0.56888889
±0.53846931 0.47862867 f (10) (ξ)
±0.90617985 0.23692689
6 ±0.23861919 0.46791393
±0.66120939 0.36076157 f (12) (ξ)
±0.93246951 0.17132449
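For values of n beyond the table, the nodes and weights can be computed rather than tabulated: the zeros of Pn(x) are found by Newton's method, with Pn and P'n evaluated from the recurrence (7.134), and the weights follow from (7.137). The following C sketch does this; the initial guess cos(π(i − 0.25)/(n + 0.5)) is a standard choice but is an assumption not stated in the text.

/* Nodes and weights of the n-point Gauss-Legendre rule:
   Newton's method on P_n(x)=0, with P_n from (7.134),
   then w_i = 2/((1-x_i^2)[P_n'(x_i)]^2) as in (7.137). */
#include<stdio.h>
#include<math.h>
#define PI 3.14159265358979324

void legendre(int n, double x, double *p, double *dp)
{
    double p0=1.0, p1=x, p2;
    int k;
    for(k=1;k<n;k++)  /* recurrence (7.134) */
    {
        p2=((2*k+1)*x*p1-k*p0)/(k+1);
        p0=p1; p1=p2;
    }
    *p=p1;
    *dp=n*(x*p1-p0)/(x*x-1.0);  /* standard identity for P_n' */
}

int main()
{
    int n=6, i, it;
    double x, p, dp, w;
    for(i=1;i<=n;i++)
    {
        x=cos(PI*(i-0.25)/(n+0.5));      /* assumed initial guess */
        for(it=0;it<100;it++)
        {
            legendre(n,x,&p,&dp);
            if(fabs(p/dp)<1e-14) break;
            x-=p/dp;                      /* Newton step */
        }
        w=2.0/((1.0-x*x)*dp*dp);
        printf("x=%12.8f  w=%12.8f\n", x, w);
    }
    return 0;
}

For n = 6 the printed pairs agree with the last block of Table 7.4.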

Example 7.15.1 Find the value of $\int_0^1 \frac{1}{1+x^2}\,dx$ by Gauss's formula for n = 2, 4, 6. Also, calculate the absolute errors.

Solution. To apply Gauss's formula, the limits are transformed to −1, 1 by substituting $x = \frac{1}{2}u(1-0)+\frac{1}{2}(1+0) = \frac{1}{2}(u+1)$. Then
$$I = \int_0^1 \frac{1}{1+x^2}\,dx = \int_{-1}^1 \frac{2\,du}{(u+1)^2+4} = 2\sum_{i=1}^n w_i f(x_i), \quad\text{where } f(x_i) = \frac{1}{(x_i+1)^2+4}.$$
For the two-point formula (n = 2)

x1 = −0.57735027, x2 = 0.57735027, w1 = w2 = 1.
Then I = 2[1 × 0.23931272 + 1 × 0.15412990] = 0.78688524.
For the four-point formula (n = 4)

x1 = −0.33998104, x2 = −0.86113631, x3 = −x1 , x4 = −x2 ,


w1 = w3 = 0.65214515, w2 = w4 = 0.34785485.
Then I = 2[w1 f (x1 ) + w3 f (x3 ) + w2 f (x2 ) + w4 f (x4 )]
= 2[w1 {f (x1 ) + f (−x1 )} + w2 {f (x2 ) + f (−x2 )}]
= 2[0.65214515×(0.22544737+0.17254620)+0.34785485×(0.24880059+0.13397950)]
= 0.78540297.

For the six-point formula (n = 6)

x1 = −0.23861919, x2 = −0.66120939, x3 = −0.93246951, x4 = −x1 ,


x5 = −x2 , x6 = −x3 , w1 = w4 = 0.46791393, w2 = w5 = 0.36076157,
w3 = w6 = 0.17132449.
Then I = 2[w1 {f (x1 ) + f (−x1 )} + w2 {f (x2 ) + f (−x2 ) + w3 {f (x3 ) + f (−x3 )}]
= 2[0.46791393×(0.21835488+0.18069532)+0.36076157×(0.24302641+0.14793738)
+ 0.17132449 × (0.24971530 + 0.12929187)]
= 0.78539814.
The exact value is π/4 = 0.78539816.
The following table gives a comparison among the different Gauss’s formulae.
n Exact value Gauss formula Error
2 0.78539816 0.78688524 1.49 × 10−3
4 0.78539816 0.78540297 4.81 × 10−6
6 0.78539816 0.78539814 2.00 × 10−8

Algorithm 7.5 (Gauss-Legendre's quadrature). This algorithm finds the value of $\int_a^b f(x)\,dx$ using the 6-point Gauss-Legendre quadrature method.

Algorithm Gauss Legendre


Input function f (x);
Read a, b. //the lower and upper limits of the integral.//
Assign
x1 = 0.23861919, x2 = −x1 , w1 = w2 = 0.46791393,
x3 = 0.66120939, x4 = −x3 , w3 = w4 = 0.36076157,
x5 = 0.93246951, x6 = −x5 , w5 = w6 = 0.17132449,
Set p = (a + b)/2, q = (b − a)/2 //limits changed to –1, 1//
Initialize result = 0;
for i = 1 to 6 do
Compute result = result + wi ∗ f (p + qxi );
endfor;
Compute result = result ∗ q;
Print result;
end Gauss Legendre

Program 7.5
.
/* Program Gauss-Legendre
Program to find the integration of a function by 6-point
Gauss-Legendre method. Here f(x)=x^3. */
#include<stdio.h>

void main()
{
float a,b,p,q,result=0,x[7],w[7]; int i;
float f(float);
printf("Enter the values of a, b ");
scanf("%f %f",&a,&b);
x[1]=0.23861919; x[2]=-x[1];
x[3]=0.66120939; x[4]=-x[3];
x[5]=0.93246951; x[6]=-x[5];
w[1]=w[2]=0.46791393;
w[3]=w[4]=0.36076157;
w[5]=w[6]=0.17132449;
p=(a+b)/2; q=(b-a)/2;
for(i=1;i<=6;i++)
result+=w[i]*f(p+q*x[i]);
result*=q;
printf("The value of the integration is %f ",result);
}

/* definition of the function f(x) */


float f(float x)
{
return(x*x*x);
}
A sample of input/output:
Enter the values of a, b 0 1
The value of the integration is 0.250000

7.15.2 Lobatto integration methods


Lobatto integration is a Gaussian quadrature with weight function ψ(x) = 1 in which the endpoints of the interval [−1, 1] are included among the n nodes. The remaining n − 2 nodes are free to choose. As in the Gauss-Legendre methods, the nodes are symmetrical about the origin, and the general formula is
$$\int_{-1}^1 f(x)\,dx = w_1f(-1) + w_nf(1) + \sum_{i=2}^{n-1} w_i f(x_i). \qquad (7.139)$$
Here, the total number of unknowns is 2n − 2 (n weights and n − 2 nodes), so the formula is exact for polynomials of degree up to 2n − 3.

Table 7.5: Nodes and weights for Lobatto quadrature.

n node xi weight wi order of


truncation error
3 0.00000000 1.33333333 f (4) (ξ)
±1.00000000 0.33333333
4 ±0.44721360 0.83333333 f (6) (ξ)
±1.00000000 0.16666667
5 0.00000000 0.71111111
±0.65465367 0.54444444 f (8) (ξ)
±1.00000000 0.10000000
6 ±0.28523152 0.55485837
±0.76505532 0.37847496 f (10) (ξ)
±1.00000000 0.06666667

For n = 3, the formula (7.139) becomes
$$\int_{-1}^1 f(x)\,dx = w_1f(-1)+w_3f(1)+w_2f(x_2). \qquad (7.140)$$
The formula (7.140) is exact for polynomials of degree up to three. Substituting f(x) = 1, x, x², x³ in (7.140) generates the following system of equations:
$$w_1+w_2+w_3 = 2, \quad -w_1+w_2x_2+w_3 = 0, \quad w_1+w_2x_2^2+w_3 = \tfrac{2}{3}, \quad -w_1+w_2x_2^3+w_3 = 0.$$
The solution of this system of equations is
$$x_2 = 0, \quad w_1 = w_3 = \frac{1}{3}, \quad w_2 = \frac{4}{3}.$$
Hence (7.140) becomes
$$\int_{-1}^1 f(x)\,dx = \frac{1}{3}[f(-1)+4f(0)+f(1)]. \qquad (7.141)$$

In general the nodes xi, i = 2, 3, ..., n − 1 are the zeros of the polynomial $P'_{n-1}(x)$, where Pn(x) is the Legendre polynomial of degree n.
The weights are given by
$$w_i = \frac{2}{n(n-1)[P_{n-1}(x_i)]^2}, \quad i = 2,3,\ldots,n-1, \qquad w_1 = w_n = \frac{2}{n(n-1)}. \qquad (7.142)$$
The error term is
$$E = -\frac{n(n-1)^3\,2^{2n-1}\,[(n-2)!]^4}{(2n-1)[(2n-2)!]^3}\,f^{(2n-2)}(\xi), \quad -1<\xi<1. \qquad (7.143)$$
The nodes and the corresponding weights for Lobatto integration for n = 3, 4, 5, 6 are
given in Table 7.5.

7.15.3 Radau integration methods


In the Radau method, (n + 1) points are needed to fit all polynomials of degree 2n, so it effectively fits exactly all polynomials of degree (2n − 1). It uses the weight function ψ(x) = 1, the endpoint −1 of the interval [−1, 1] is included among the n nodes, and the remaining (n − 1) nodes are free. The general formula is
$$\int_{-1}^1 f(x)\,dx = w_1f(-1) + \sum_{i=2}^n w_i f(x_i). \qquad (7.144)$$
Since the formula (7.144) contains 2n − 1 unknowns (n weights and n − 1 nodes), this method gives exact values for polynomials of degree up to 2n − 2. For n = 3, (7.144) becomes
$$\int_{-1}^1 f(x)\,dx = w_1f(-1)+w_2f(x_2)+w_3f(x_3). \qquad (7.145)$$
This formula gives exact results for polynomials of degree up to 4. For f(x) = 1, x, x², x³, x⁴, equation (7.145) generates the following system of equations:
$$w_1+w_2+w_3 = 2,\quad -w_1+w_2x_2+w_3x_3 = 0,\quad w_1+w_2x_2^2+w_3x_3^2 = \tfrac{2}{3},$$
$$-w_1+w_2x_2^3+w_3x_3^3 = 0,\quad w_1+w_2x_2^4+w_3x_3^4 = \tfrac{2}{5}.$$
The solution of these equations is
$$x_2 = \frac{1-\sqrt{6}}{5}, \quad x_3 = \frac{1+\sqrt{6}}{5}, \quad w_1 = \frac{2}{9}, \quad w_2 = \frac{16+\sqrt{6}}{18}, \quad w_3 = \frac{16-\sqrt{6}}{18}.$$

Table 7.6: Nodes and weights for Radau quadrature.

n node xi weight wi order of


truncation error
2 –1.0000000 0.5000000 f (3) (ξ)
0.3333333 1.5000000
3 −1.0000000 0.2222222
−0.2898979 1.0249717 f (5) (ξ)
0.6898979 0.7528061
4 –1.0000000 0.1250000
−0.5753189 0.6576886 f (7) (ξ)
0.1810663 0.7763870
0.8228241 0.4409244
5 −1.0000000 0.0800000
−0.7204803 0.4462078
0.1671809 0.6236530 f (9) (ξ)
0.4463140 0.5627120
0.8857916 0.2874271

The formula (7.145) becomes
$$\int_{-1}^1 f(x)\,dx = \frac{2}{9}f(-1) + \frac{16+\sqrt{6}}{18}f\Big(\frac{1-\sqrt{6}}{5}\Big) + \frac{16-\sqrt{6}}{18}f\Big(\frac{1+\sqrt{6}}{5}\Big). \qquad (7.146)$$
In general, the nodes xi, i = 2, 3, ..., n are the zeros of the polynomial
$$\frac{P_{n-1}(x)+P_n(x)}{1+x}, \qquad (7.147)$$
where Pn(x) is the Legendre polynomial.
The weights wi, i = 2, 3, ..., n are given by
$$w_i = \frac{1-x_i}{n^2[P_{n-1}(x_i)]^2} = \frac{1}{(1-x_i)[P_{n-1}'(x_i)]^2}, \quad\text{and}\quad w_1 = \frac{2}{n^2}. \qquad (7.148)$$
The error term is given by
$$E = \frac{2^{2n-1}n[(n-1)!]^4}{[(2n-1)!]^3}\,f^{(2n-1)}(\xi), \quad -1<\xi<1. \qquad (7.149)$$
n = 2, 3, 4, 5 are given in Table 7.6.

7.15.4 Gauss-Chebyshev integration methods


Gauss-Chebyshev quadrature is also known as Chebyshev quadrature. Its weight function is taken as ψ(x) = (1 − x²)^{−1/2}. The general form of this method is
$$\int_{-1}^1 \frac{f(x)}{\sqrt{1-x^2}}\,dx = \sum_{i=1}^n w_i f(x_i). \qquad (7.150)$$

These methods give exact results for polynomials of degree up to 2n − 1. The nodes xi, i = 1, 2, ..., n are the zeros of the Chebyshev polynomial
$$T_n(x) = \cos(n\cos^{-1}x). \qquad (7.151)$$
That is,
$$x_i = \cos\Big(\frac{(2i-1)\pi}{2n}\Big), \quad i = 1,2,\ldots,n. \qquad (7.152)$$
For n = 3 the equation (7.150) becomes
$$\int_{-1}^1 \frac{f(x)}{\sqrt{1-x^2}}\,dx = w_1f(x_1)+w_2f(x_2)+w_3f(x_3). \qquad (7.153)$$
The method gives exact values for polynomials of degree up to 2n − 1, i.e., up to 5. Therefore, for f(x) = 1, x, x², x³, x⁴, x⁵ the following equations are obtained from (7.153):
$$\begin{aligned}
w_1+w_2+w_3 &= \pi\\
w_1x_1+w_2x_2+w_3x_3 &= 0\\
w_1x_1^2+w_2x_2^2+w_3x_3^2 &= \tfrac{\pi}{2}\\
w_1x_1^3+w_2x_2^3+w_3x_3^3 &= 0\\
w_1x_1^4+w_2x_2^4+w_3x_3^4 &= \tfrac{3\pi}{8}\\
w_1x_1^5+w_2x_2^5+w_3x_3^5 &= 0.
\end{aligned}$$
The nodes xi, i = 1, 2, 3 can easily be obtained from the relation
$$x_i = \cos\Big((2i-1)\frac{\pi}{6}\Big), \quad i = 1,2,3.$$
That is, $x_1 = \sqrt{3}/2$, $x_2 = 0$, $x_3 = -\sqrt{3}/2$. Then the solution of the above equations is w1 = w2 = w3 = π/3.
Thus formula (7.153) finally becomes
$$\int_{-1}^1 \frac{f(x)}{\sqrt{1-x^2}}\,dx = \frac{\pi}{3}\Big[f(\sqrt{3}/2)+f(0)+f(-\sqrt{3}/2)\Big]. \qquad (7.154)$$

Table 7.7: Nodes and weights for Gauss-Chebyshev quadrature.

n node xi weight wi order of


truncation error
2 ±0.7071068 1.5707963 f (4) (ξ)
3 0.0000000 1.0471976
±0.8660254 1.0471976 f (6) (ξ)
4 ±0.3826834 0.7853982
±0.9238795 0.7853982 f (8) (ξ)
5 0.0000000 0.6283185
±0.5877853 0.6283185 f (10) (ξ)
±0.9510565 0.6283185

In general, the nodes are given by
$$x_i = \cos\Big(\frac{(2i-1)\pi}{2n}\Big), \quad i = 1,2,\ldots,n,$$
and the weights are
$$w_i = -\frac{\pi}{T_{n+1}(x_i)\,T_n'(x_i)} = \frac{\pi}{n}, \quad i = 1,2,\ldots,n. \qquad (7.155)$$
The error term is
$$E = \frac{2\pi}{2^{2n}(2n)!}\,f^{(2n)}(\xi), \quad -1<\xi<1. \qquad (7.156)$$
The explicit formula is then
$$\int_{-1}^1 \frac{f(x)}{\sqrt{1-x^2}}\,dx = \frac{\pi}{n}\sum_{i=1}^n f\Big(\cos\frac{(2i-1)\pi}{2n}\Big) + \frac{2\pi}{2^{2n}(2n)!}\,f^{(2n)}(\xi). \qquad (7.157)$$

Table 7.7 gives the values for the first few points and weights for Gauss-Chebyshev
quadrature.
Example 7.15.2 Find the value of $\int_0^1 \frac{1}{1+x^2}\,dx$ using the Gauss-Chebyshev four-point formula.

Solution. Let $f(x) = \frac{\sqrt{1-x^2}}{1+x^2}$. Here x1 = −0.3826834 = −x2, x3 = −0.9238795 = −x4 and w1 = w2 = w3 = w4 = 0.7853982.
Then
$$I = \int_0^1 \frac{dx}{1+x^2} = \frac{1}{2}\int_{-1}^1 \frac{dx}{1+x^2} = \frac{1}{2}\int_{-1}^1 \frac{f(x)}{\sqrt{1-x^2}}\,dx = \frac{1}{2}\big[w_1f(x_1)+w_2f(x_2)+w_3f(x_3)+w_4f(x_4)\big]$$
$$= \frac{1}{2}\times 0.7853982\times[2\times 0.8058636 + 2\times 0.2064594] = 0.7950767,$$
while the exact value is π/4 = 0.7853982. The absolute error is 0.0096785.

Remark 7.15.1 It may be noted that the Gauss-Legendre four-point formula gives bet-
ter result for this problem.
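Since the nodes and weights of (7.157) are available in closed form, an n-point Gauss-Chebyshev routine needs only a few lines. A C sketch follows (the helper cheb_quad is our own), applied to the integral of Example 7.15.2:

/* n-point Gauss-Chebyshev quadrature (7.157):
   integral of f(x)/sqrt(1-x^2) over [-1,1] ~ (pi/n) sum f(cos((2i-1)pi/(2n))). */
#include<stdio.h>
#include<math.h>
#define PI 3.14159265358979324

double f(double x)  /* f(x)=sqrt(1-x^2)/(1+x^2) as in Example 7.15.2 */
{
    return sqrt(1.0-x*x)/(1.0+x*x);
}

double cheb_quad(int n)
{
    double sum=0;
    int i;
    for(i=1;i<=n;i++)
        sum+=f(cos((2*i-1)*PI/(2*n)));
    return PI*sum/n;
}

int main()
{
    /* I = (1/2) * integral of f(x)/sqrt(1-x^2) dx over [-1,1] */
    printf("n=4 : %.7f\n", 0.5*cheb_quad(4));
    printf("n=16: %.7f (exact pi/4 = 0.7853982)\n", 0.5*cheb_quad(16));
    return 0;
}

With n = 4 this reproduces the value 0.7950767 found above; increasing n improves the result only slowly here because the factor moved into f(x) is not smooth at ±1.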

7.15.5 Gauss-Hermite integration methods


Gauss-Hermite quadrature, also known as Hermite quadrature, is a Gaussian quadrature over the interval (−∞, ∞) with weight function ψ(x) = e^{−x²}. The nodes xi are the zeros of the Hermite polynomial Hn(x), where
$$H_n(x) = (-1)^n e^{x^2}\frac{d^n}{dx^n}\big(e^{-x^2}\big). \qquad (7.158)$$
The zeros are symmetrical about the origin.
These polynomials satisfy the recurrence relation
$$H_n'(x) = 2nH_{n-1}(x) = 2xH_n(x) - H_{n+1}(x).$$
The first few Hermite polynomials are
$$H_0(x) = 1,\quad H_1(x) = 2x,\quad H_2(x) = 2(2x^2-1),\quad H_3(x) = 4(2x^3-3x).$$
The weights wi are given by
$$w_i = \frac{2^{n+1}n!\sqrt{\pi}}{[H_{n+1}(x_i)]^2} = \frac{2^{n-1}n!\sqrt{\pi}}{n^2[H_{n-1}(x_i)]^2}. \qquad (7.159)$$
The Gauss-Hermite formula is
$$\int_{-\infty}^{\infty} e^{-x^2}f(x)\,dx = \sum_{i=1}^n w_i f(x_i). \qquad (7.160)$$
The error term is
$$E = \frac{n!\sqrt{\pi}}{2^n(2n)!}\,f^{(2n)}(\xi). \qquad (7.161)$$
Table 7.8 gives the nodes and weights for the Gauss-Hermite integration methods.

Table 7.8: Nodes and weights for Gauss-Hermite quadrature.

n node xi weight wi order of


truncation error
2 ±0.7071068 0.8862269 f (4) (ξ)
3 +0.0000000 1.1816359
±1.2247449 0.2954090 f (6) (ξ)
4 ±0.5246476 0.8049141
±1.6506801 0.0813128 f (8) (ξ)
5 +0.0000000 0.945309
±0.958572 0.393619 f (10) (ξ)
±2.02018 0.0199532
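With the tabulated nodes and weights, formula (7.160) is immediate to apply. A C sketch with n = 5 follows; the test integral ∫e^{−x²}cos x dx = √π e^{−1/4} is our own choice, not from the text.

/* 5-point Gauss-Hermite rule (7.160) with nodes/weights from Table 7.8.
   Test: integral of exp(-x^2)cos(x) over (-inf,inf) = sqrt(pi)*exp(-0.25). */
#include<stdio.h>
#include<math.h>
#define PI 3.14159265358979324

double f(double x)
{
    return cos(x); /* assumed test integrand */
}

int main()
{
    /* nodes and weights copied from Table 7.8 (n = 5) */
    double x[5]={0.0, 0.958572, -0.958572, 2.02018, -2.02018};
    double w[5]={0.945309, 0.393619, 0.393619, 0.0199532, 0.0199532};
    double sum=0;
    int i;
    for(i=0;i<5;i++) sum+=w[i]*f(x[i]);
    printf("approx = %.6f, exact = %.6f\n", sum, sqrt(PI)*exp(-0.25));
    return 0;
}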

7.15.6 Gauss-Laguerre integration methods


Gauss-Laguerre quadrature, also known as Laguerre quadrature, is a Gaussian quadrature over the interval [0, ∞) with weight function ψ(x) = e^{−x}. The general form is
$$\int_0^\infty e^{-x}f(x)\,dx = \sum_{i=1}^n w_i f(x_i). \qquad (7.162)$$
The nodes xi are the zeros of the Laguerre polynomial
$$L_n(x) = (-1)^n e^{x}\frac{d^n}{dx^n}\big(e^{-x}x^n\big), \qquad (7.163)$$
which satisfies the recurrence relation
$$xL_n'(x) = nL_n(x) + n^2L_{n-1}(x). \qquad (7.164)$$
A few lower order Laguerre polynomials are
$$L_0(x) = 1,\quad L_1(x) = x-1,\quad L_2(x) = x^2-4x+2,\quad L_3(x) = x^3-9x^2+18x-6.$$
The weights wi are given by
$$w_i = \frac{1}{x_i[L_n'(x_i)]^2} = \frac{x_i}{(n+1)^2[L_{n+1}(x_i)]^2}. \qquad (7.165)$$
The error term is
$$E = \frac{(n!)^2}{(2n)!}\,f^{(2n)}(\xi), \quad 0<\xi<\infty. \qquad (7.166)$$
Table 7.9 gives the nodes and corresponding weights of Gauss-Laguerre integration.

Table 7.9: Nodes and weights for Gauss-Laguerre quadrature.

n node xi weight wi order of


truncation error
2 0.5857864376 0.8535533906 f (4) (ξ)
3.4142135624 0.1464466094
3 0.4157745568 0.7110930099
2.2942803603 0.2785177336 f (6) (ξ)
6.2899450829 0.0103892565
4 0.3225476896 0.6031541043
1.7457611012 0.3574186924 f (8) (ξ)
4.5366202969 0.0388879085
9.3950709123 0.0005392947
5 0.2635603197 0.5217556106
1.4134030591 0.3986668111
3.5964257710 0.0759424497 f (10) (ξ)
7.0858100059 0.0036117587
12.6408008443 0.0000233700
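In the same way, the n = 5 row of Table 7.9 evaluates integrals of the form (7.162) directly. A C sketch, tested on ∫₀^∞ e^{−x} sin x dx = 1/2 (the test integrand is our own choice):

/* 5-point Gauss-Laguerre rule (7.162) with nodes/weights from Table 7.9.
   Test: integral of exp(-x)sin(x) over [0,inf) = 1/2. */
#include<stdio.h>
#include<math.h>

double f(double x)
{
    return sin(x); /* assumed test integrand */
}

int main()
{
    double x[5]={0.2635603197, 1.4134030591, 3.5964257710,
                 7.0858100059, 12.6408008443};
    double w[5]={0.5217556106, 0.3986668111, 0.0759424497,
                 0.0036117587, 0.0000233700};
    double sum=0;
    int i;
    for(i=0;i<5;i++) sum+=w[i]*f(x[i]);
    printf("approx = %.7f, exact = 0.5\n", sum);
    return 0;
}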

7.15.7 Gauss-Jacobi integration methods


This quadrature is also called Jacobi quadrature or Mehler quadrature. It is a Gaussian quadrature over [−1, 1] with weight function
$$\psi(x) = (1-x)^\alpha(1+x)^\beta. \qquad (7.167)$$
The general form of the formula is
$$\int_{-1}^1 (1-x)^\alpha(1+x)^\beta f(x)\,dx = \sum_{i=1}^n w_i f(x_i), \qquad (7.168)$$
where the nodes xi are the zeros of the Jacobi polynomial
$$P_n^{(\alpha,\beta)}(x) = \frac{(-1)^n}{2^n n!}\,(1-x)^{-\alpha}(1+x)^{-\beta}\,\frac{d^n}{dx^n}\big[(1-x)^{\alpha+n}(1+x)^{\beta+n}\big], \quad \alpha,\beta > -1. \qquad (7.169)$$
$P_n^{(0,0)}(x)$ is the Legendre polynomial.
The weights are
$$w_i = \frac{\Gamma(n+\alpha+1)\Gamma(n+\beta+1)}{\Gamma(n+\alpha+\beta+1)}\cdot\frac{2^{2n+\alpha+\beta+1}\,n!}{(1-x_i^2)[V_n'(x_i)]^2}, \quad i = 1,2,\ldots,n, \qquad (7.170)$$
where $V_n(x) = (-1)^n\,2^n n!\,P_n^{(\alpha,\beta)}(x)$.
The error term is
$$E = \frac{\Gamma(n+\alpha+1)\Gamma(n+\beta+1)\Gamma(n+\alpha+\beta+1)}{(2n+\alpha+\beta+1)[\Gamma(2n+\alpha+\beta+1)]^2}\cdot\frac{2^{2n+\alpha+\beta+1}}{(2n)!}\,f^{(2n)}(\xi), \quad -1<\xi<1. \qquad (7.171)$$

Table 7.10 is a list of some Gaussian quadratures along with their weight functions, intervals and node polynomials.

Table 7.10: Summary of Gaussian quadrature.

Method            ψ(x)                 interval      xi are roots of
Gauss-Legendre    1                    (−1, 1)       Pn(x)
Lobatto           1                    (−1, 1)       P'_{n−1}(x)
Radau             1                    (−1, 1)       [P_{n−1}(x)+Pn(x)]/(1+x)
Gauss-Chebyshev   (1−x²)^{−1/2}        (−1, 1)       Tn(x)
Gauss-Hermite     e^{−x²}              (−∞, ∞)       Hn(x)
Gauss-Laguerre    e^{−x}               (0, ∞)        Ln(x)
Gauss-Jacobi      (1−x)^α(1+x)^β       (−1, 1)       Pn^{(α,β)}(x)

Theorem 7.1 If xi, i = 1, 2, ..., n are the zeros of an orthogonal polynomial, orthogonal with respect to the weight function ψ(x) over the interval (a, b), then the degree of precision of the formula
$$\int_a^b f(x)\psi(x)\,dx = \sum_{i=1}^n w_i f(x_i) \qquad (7.172)$$
is (2n − 1) and each of the weights wi is positive.

Proof. We define an inner product with respect to the weight function ψ(x) as
$$\langle f, g\rangle = \int_a^b f(x)g(x)\psi(x)\,dx. \qquad (7.173)$$
A set of orthogonal polynomials pn(x) satisfies the result
$$\langle p_m, p_n\rangle = \begin{cases} 1, & m = n\\ 0, & m \ne n.\end{cases} \qquad (7.174)$$
Our aim is to find the nodes and the weights such that
$$\int_a^b f(x)\psi(x)\,dx = \sum_{i=1}^n w_i f(x_i) + E \qquad (7.175)$$
is exact (E = 0) if f(x) is a polynomial of degree 2n − 1 or less.
Let f(x) be a polynomial of degree 2n − 1. If f(x) is divided by the orthogonal polynomial pn(x), then f(x) can be expressed as
$$f(x) = p_n(x)Q_{n-1}(x) + R(x), \qquad (7.176)$$
where Q_{n−1}(x) is the quotient and R(x) is the remainder, a polynomial of degree n − 1. Now, multiplying equation (7.176) by the weight function ψ(x) and integrating over [a, b], we get
$$\int_a^b f(x)\psi(x)\,dx = \int_a^b p_n(x)Q_{n-1}(x)\psi(x)\,dx + \int_a^b R(x)\psi(x)\,dx. \qquad (7.177)$$
The first term on the right hand side is zero, since Q_{n−1}(x) can be expressed as a linear combination of the orthogonal polynomials p0, p1, ..., p_{n−1} and so it must be orthogonal to pn. Then from (7.177),
$$\int_a^b f(x)\psi(x)\,dx = \int_a^b R(x)\psi(x)\,dx. \qquad (7.178)$$

Now, let the nodes xi be the n zeros of the polynomial pn(x). Then from (7.176), f(xi) = R(xi). We now introduce another special set of polynomials, the Lagrange's polynomials
$$L_i(x) = \prod_{\substack{k=1\\ k\ne i}}^{n} \frac{x-x_k}{x_i-x_k}, \qquad (7.179)$$
where Li(x) satisfies the following result:
$$L_i(x_k) = \begin{cases} 1, & i = k\\ 0, & i \ne k.\end{cases}$$
Since R(x) is of degree (n − 1), it can be written as a sum of Lagrange's polynomials, i.e.,
$$R(x) = \sum_{i=1}^n f(x_i)L_i(x). \qquad (7.180)$$
Then from (7.178),
$$\int_a^b f(x)\psi(x)\,dx = \int_a^b \sum_{i=1}^n f(x_i)L_i(x)\,\psi(x)\,dx = \sum_{i=1}^n f(x_i)\int_a^b L_i(x)\psi(x)\,dx = \sum_{i=1}^n w_i f(x_i), \qquad (7.181)$$
where $w_i = \int_a^b \psi(x)L_i(x)\,dx$.
This proves that the formula (7.172) has degree of precision 2n − 1.
Note that $L_j^2(x)$ is a polynomial of degree 2n − 2, which does not exceed 2n − 1. Let $f(x) = L_j^2(x)$. Then (7.181) reduces to
$$\int_a^b \psi(x)L_j^2(x)\,dx = \sum_{i=1}^n w_i L_j^2(x_i) = w_j.$$
Therefore,
$$w_j = \int_a^b \psi(x)L_j^2(x)\,dx = \langle L_j, L_j\rangle > 0,$$
using (7.173) and (7.174).

7.16 Euler-Maclaurin’s Sum Formula

Euler-Maclaurin's sum formula is used to determine the sum of finite and infinite series of numbers; it is also used for numerical quadrature when the values of the derivatives are known at the end points of the interval. The well-known quadrature formulae such as the trapezoidal rule and Simpson's rule, including their error terms, can be deduced from this formula.
x
let us consider the expansion of x in ascending powers of x.
e −1
x
Let x = b0 + b1 x + b2 x2 + b3 x3 + b4 x4 + b5 x5 + · · · .
e −1
That is,
 x2 x3 x4 
x = (b0 + b1 x + b2 x2 + b3 x3 + b4 x4 + b5 x5 + · · · ) x + + + + ···
2! 3! 4!
b  b b  b b b 
0 0 1 0 1 2
= b0 x + x2 + b1 + x3 + + b2 + x4 + + + b3
2! 3! 2! 4! 3! 2!
b b b b  b b b b b4 
0 1 2 3 0 1 2 3
+x5 + + + + b4 + x6 + + + + + b5 + · · · .
5! 4! 3! 2! 6! 5! 4! 3! 2!
474 Numerical Analysis

Equating the like powers of x on both sides and we find the following relations.
b0 = 1
b0 1
b1 = − =−
2! 2
b0 b1 1
b2 =− − =
3! 2! 12
b0 b1 b2
b3 =− − − =0
4! 3! 2!
b0 b1 b2 b3 1
b4 =− − − − =−
5! 4! 3! 2! 720
b0 b1 b2 b3 b4
b5 =− − − − − =0
6! 5! 4! 3! 2!
b0 b1 b2 b3 b4 b5 1
b6 =− − − − − − =
7! 6! 5! 4! 3! 2! 30240
etc.
Hence,
x 1 1 1 4 1
= 1 − x + x2 − x + x6 − · · ·
−1
ex 2 12 720 30240
1 1 1 1 1 3 1
= − + x− x + x5 − · · · . (7.182)
e −1
x x 2 12 720 30240
From the above expansion one can write (since E = e^{hD})
$$\frac{1}{E-1} = \frac{1}{e^{hD}-1} = \frac{1}{hD} - \frac{1}{2} + \frac{1}{12}(hD) - \frac{1}{720}(hD)^3 + \frac{1}{30240}(hD)^5 - \cdots.$$
Note that (Eⁿ − 1)f(x0) = f(x0 + nh) − f(x0) = f(xn) − f(x0). Now,
$$\frac{E^n-1}{E-1}f(x_0) = \Big[\frac{1}{hD} - \frac{1}{2} + \frac{1}{12}(hD) - \frac{1}{720}(hD)^3 + \frac{1}{30240}(hD)^5 - \cdots\Big](E^n-1)f(x_0).$$
That is,
$$\begin{aligned}
(E^{n-1}+E^{n-2}+\cdots+E+1)f(x_0)
&= \frac{1}{hD}[f(x_n)-f(x_0)] - \frac{1}{2}[f(x_n)-f(x_0)] + \frac{h}{12}[f'(x_n)-f'(x_0)]\\
&\quad - \frac{h^3}{720}[f'''(x_n)-f'''(x_0)] + \frac{h^5}{30240}[f^{v}(x_n)-f^{v}(x_0)] + \cdots. \qquad (7.183)
\end{aligned}$$

The left hand side of (7.183) is
$$\sum_{r=0}^{n-1}E^r f(x_0) = \sum_{r=0}^{n-1}f(x_0+rh) = \sum_{r=0}^{n-1}f(x_r) = \sum_{r=0}^{n}f(x_r) - f(x_n).$$
Also, $\frac{1}{hD}[f(x_n)-f(x_0)] = \frac{1}{h}\int_{x_0}^{x_n}f(x)\,dx$. Hence (7.183) becomes
$$\sum_{r=0}^{n}f(x_r) - f(x_n) = \frac{1}{h}\int_{x_0}^{x_n}f(x)\,dx - \frac{1}{2}[f(x_n)-f(x_0)] + \frac{h}{12}[f'(x_n)-f'(x_0)] - \frac{h^3}{720}[f'''(x_n)-f'''(x_0)] + \frac{h^5}{30240}[f^{v}(x_n)-f^{v}(x_0)] + \cdots.$$
That is,
$$\begin{aligned}
\sum_{r=0}^{n}f(x_r) &= \frac{1}{h}\int_{x_0}^{x_n}f(x)\,dx + \frac{1}{2}[f(x_n)+f(x_0)] + \frac{h}{12}[f'(x_n)-f'(x_0)]\\
&\quad - \frac{h^3}{720}[f'''(x_n)-f'''(x_0)] + \frac{h^5}{30240}[f^{v}(x_n)-f^{v}(x_0)] + \cdots. \qquad (7.184)
\end{aligned}$$
This formula may also be used as a quadrature formula when it is written as
$$\begin{aligned}
\int_{x_0}^{x_n}f(x)\,dx &= h\Big[\frac{1}{2}f(x_0)+f(x_1)+\cdots+f(x_{n-1})+\frac{1}{2}f(x_n)\Big] - \frac{h^2}{12}[f'(x_n)-f'(x_0)]\\
&\quad + \frac{h^4}{720}[f'''(x_n)-f'''(x_0)] - \frac{h^6}{30240}[f^{v}(x_n)-f^{v}(x_0)] + \cdots. \qquad (7.185)
\end{aligned}$$
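When the end-point derivatives are available in closed form, (7.185) can be coded directly; the derivative terms then correct the plain trapezoidal sum. A C sketch follows (helper names are ours), anticipating the integrand of Example 7.16.4 below:

/* Quadrature by the Euler-Maclaurin formula (7.185), using the
   end-point values of f' and f''' supplied analytically.
   Here f(x)=1/(1+x), f'(x)=-1/(1+x)^2, f'''(x)=-6/(1+x)^4. */
#include<stdio.h>
#include<math.h>

double f(double x)  { return 1.0/(1.0+x); }
double f1(double x) { return -1.0/pow(1.0+x,2); }
double f3(double x) { return -6.0/pow(1.0+x,4); }

double euler_maclaurin(double a, double b, int n)
{
    double h=(b-a)/n, sum=(f(a)+f(b))/2;
    int i;
    for(i=1;i<=n-1;i++) sum+=f(a+i*h);   /* trapezoidal part */
    return h*sum - h*h/12.0*(f1(b)-f1(a))
                 + pow(h,4)/720.0*(f3(b)-f3(a));
}

int main()
{
    printf("%.7f (exact log 3 = %.7f)\n",
           euler_maclaurin(1,5,4), log(3.0));
    return 0;
}

With a = 1, b = 5 and n = 4 this reproduces the value 1.0986625 obtained in Example 7.16.4.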

Example 7.16.1 Find the sum of the series
$$\frac{1}{1}+\frac{1}{2}+\frac{1}{3}+\cdots+\frac{1}{100}$$
correct to six decimal places.

Solution. We have to determine the value of $\sum_{x=1}^{100}\frac{1}{x}$.
Let f(x) = 1/x and h = 1, x0 = 1, xn = 100. Then
$$f'(x) = -\frac{1}{x^2},\quad f''(x) = \frac{2}{x^3},\quad f'''(x) = -\frac{6}{x^4},\quad f^{iv}(x) = \frac{24}{x^5},\quad f^{v}(x) = -\frac{120}{x^6}.$$
Now, $\int_{x_0}^{x_n}f(x)\,dx = \int_1^{100}\frac{1}{x}\,dx = \log 100 = 4.6051702$, and
$$f(x_n)+f(x_0) = \frac{1}{100}+1 = 1.01, \qquad f'(x_n)-f'(x_0) = -\frac{1}{100^2}+1 = 0.9999,$$
$$f'''(x_n)-f'''(x_0) = -\frac{6}{100^4}+6 = 5.999999, \qquad f^{v}(x_n)-f^{v}(x_0) = -\frac{120}{100^6}+120 = 120.$$
Hence
$$\begin{aligned}
\sum_{x=1}^{100}f(x) &= \frac{1}{h}\int_1^{100}f(x)\,dx + \frac{1}{2}[f(x_n)+f(x_0)] + \frac{h}{12}[f'(x_n)-f'(x_0)]\\
&\quad - \frac{h^3}{720}[f'''(x_n)-f'''(x_0)] + \frac{h^5}{30240}[f^{v}(x_n)-f^{v}(x_0)] + \cdots\\
&= 4.6051702 + \frac{1}{2}\times 1.01 + \frac{1}{12}\times 0.9999 - \frac{1}{720}\times 5.999999 + \frac{1}{30240}\times 120\\
&= 5.1891301.
\end{aligned}$$
Hence, the approximate value of the given sum is 5.189130 correct up to six decimal places.


Example 7.16.2 Using the relation $\frac{\pi^2}{6} = \sum_{x=1}^{\infty}\frac{1}{x^2}$, compute the value of π² correct to six decimals.

Solution. Let f(x) = 1/x². Here h = 1, x0 = 1, xn = ∞. Then
$$f'(x) = -\frac{2}{x^3},\quad f''(x) = \frac{6}{x^4},\quad f'''(x) = -\frac{24}{x^5},\quad f^{iv}(x) = \frac{120}{x^6},\quad f^{v}(x) = -\frac{720}{x^7}.$$
Now, $\int_{x_0}^{x_n} f(x)\,dx = \int_1^\infty \frac{dx}{x^2} = 1$, and
$$f(x_n)+f(x_0) = f(\infty)+f(1) = 1,\quad f'(x_n)-f'(x_0) = 2,\quad f'''(x_n)-f'''(x_0) = 24,\quad f^{v}(x_n)-f^{v}(x_0) = 720.$$
Hence
$$\begin{aligned}
\sum_{x=1}^{\infty}f(x) &= \frac{1}{h}\int_1^{\infty}f(x)\,dx + \frac{1}{2}[f(x_n)+f(x_0)] + \frac{h}{12}[f'(x_n)-f'(x_0)]\\
&\quad - \frac{h^3}{720}[f'''(x_n)-f'''(x_0)] + \frac{h^5}{30240}[f^{v}(x_n)-f^{v}(x_0)] + \cdots\\
&= 1 + \frac{1}{2}\times 1 + \frac{1}{12}\times 2 - \frac{1}{720}\times 24 + \frac{1}{30240}\times 720 = 1.6571429.
\end{aligned}$$
Thus π²/6 = 1.6571429, or π² = 9.9428574. Hence the approximate value of π² is 9.942857 correct up to six decimal places.

Example 7.16.3 Find the sum of the series
$$\frac{1}{51^2}+\frac{1}{53^2}+\frac{1}{55^2}+\cdots+\frac{1}{99^2}+\frac{1}{100^2}.$$

Solution. First, we find the sum $\frac{1}{51^2}+\frac{1}{53^2}+\frac{1}{55^2}+\cdots+\frac{1}{99^2}$.
Let f(x) = 1/x². Here h = 2, x0 = 51, xn = 99.
$$f'(x) = -\frac{2}{x^3}, \qquad f'''(x) = -\frac{24}{x^5}.$$
$$\int_{x_0}^{x_n} f(x)\,dx = \int_{51}^{99}\frac{dx}{x^2} = -\frac{1}{99}+\frac{1}{51} = 0.0095068.$$
$$f(x_n)+f(x_0) = \frac{1}{99^2}+\frac{1}{51^2} = 0.0004865, \qquad f'(x_n)-f'(x_0) = -\frac{2}{99^3}+\frac{2}{51^3} = 0.0000130,$$
$$f'''(x_n)-f'''(x_0) = -\frac{24}{99^5}+\frac{24}{51^5} \simeq 0.$$
Now,
$$\sum_{x=51}^{99}f(x) = \frac{1}{h}\int_{x_0}^{x_n}f(x)\,dx + \frac{1}{2}[f(x_n)+f(x_0)] + \frac{h}{12}[f'(x_n)-f'(x_0)] - \cdots$$
$$= \frac{1}{2}\times 0.0095068 + \frac{1}{2}\times 0.0004865 + \frac{2}{12}\times 0.0000130 = 0.0049988.$$
Hence the required sum is $0.0049988 + \frac{1}{100^2} = 0.0050988$.

Example 7.16.4 Find the value of $\int_1^5 \frac{dx}{1+x}$ using Maclaurin's sum formula.

Solution. The Maclaurin's sum formula is
$$\int_{x_0}^{x_n}f(x)\,dx = h\sum_{x=x_0}^{x_n}f(x) - \frac{h}{2}[f(x_n)+f(x_0)] - \frac{h^2}{12}[f'(x_n)-f'(x_0)] + \frac{h^4}{720}[f'''(x_n)-f'''(x_0)] - \cdots.$$
Let f(x) = 1/(1+x), x0 = 1, xn = 5 and h = 1.
$$f'(x) = -\frac{1}{(1+x)^2},\quad f''(x) = \frac{2}{(1+x)^3},\quad f'''(x) = -\frac{6}{(1+x)^4}.$$
$$f(x_n)+f(x_0) = \frac{1}{6}+\frac{1}{2} = 0.6666667, \qquad f'(x_n)-f'(x_0) = -\frac{1}{6^2}+\frac{1}{2^2} = 0.2222222,$$
$$f'''(x_n)-f'''(x_0) = -\frac{6}{6^4}+\frac{6}{2^4} = 0.3703704.$$
Also,
$$\sum_{x=x_0}^{x_n}f(x) = f(1.0)+f(2.0)+f(3.0)+f(4.0)+f(5.0) = \frac{1}{2}+\frac{1}{3}+\frac{1}{4}+\frac{1}{5}+\frac{1}{6} = 1.45.$$
Hence
$$\int_1^5 \frac{dx}{1+x} = 1\times 1.45 - \frac{1}{2}\times 0.6666667 - \frac{1}{12}\times 0.2222222 + \frac{1}{720}\times 0.3703704 = 1.0986625.$$
The exact value is log 6 − log 2 = 1.0986123. Thus the absolute error is 0.0000502.

Example 7.16.5 Deduce the trapezoidal and Simpson's 1/3 rules from Maclaurin's sum formula.

Solution. From (7.185),
$$\begin{aligned}
\int_{x_0}^{x_n}f(x)\,dx &= \frac{h}{2}\big[f(x_0)+2\{f(x_1)+f(x_2)+\cdots+f(x_{n-1})\}+f(x_n)\big]\\
&\quad - \frac{h^2}{12}[f'(x_n)-f'(x_0)] + \frac{h^4}{720}[f'''(x_n)-f'''(x_0)] - \cdots\\
&= I_T + E_T, \qquad (7.186)
\end{aligned}$$
where $I_T = \frac{h}{2}[f(x_0)+2\{f(x_1)+f(x_2)+\cdots+f(x_{n-1})\}+f(x_n)]$ is the composite trapezoidal rule and
$$E_T \simeq -\frac{h^2}{12}[f'(x_n)-f'(x_0)] \simeq -\frac{h^2}{12}(x_n-x_0)f''(\xi_1) = -\frac{nh^3}{12}f''(\xi_1), \quad x_0<\xi_1<x_n,$$
by the MVT of differential calculus. Thus $\int_{x_0}^{x_n}f(x)\,dx = I_T+E_T$ is the composite trapezoidal rule together with its error.
To deduce Simpson's 1/3 rule, take n even and replace h by 2h in (7.185):
$$\begin{aligned}
\int_{x_0}^{x_n}f(x)\,dx &= 2h\Big[\frac{1}{2}f(x_0)+f(x_2)+f(x_4)+\cdots+f(x_{n-2})+\frac{1}{2}f(x_n)\Big]\\
&\quad - \frac{2^2h^2}{12}[f'(x_n)-f'(x_0)] + \frac{2^4h^4}{720}[f'''(x_n)-f'''(x_0)] - \cdots. \qquad (7.187)
\end{aligned}$$
Using the combination [(7.186) × 4 − (7.187)]/3, we obtain
$$\begin{aligned}
\int_{x_0}^{x_n}f(x)\,dx &= \frac{h}{3}[f(x_0)+4f(x_1)+2f(x_2)+4f(x_3)+2f(x_4)+\cdots+f(x_n)]\\
&\quad + \frac{1}{3}(4-2^4)\frac{h^4}{720}[f'''(x_n)-f'''(x_0)] - \cdots = I_S + E_S,
\end{aligned}$$
where $I_S = \frac{h}{3}[f(x_0)+4f(x_1)+2f(x_2)+4f(x_3)+2f(x_4)+\cdots+f(x_n)]$ is the composite Simpson's 1/3 rule and
$$E_S = -\frac{h^4}{180}[f'''(x_n)-f'''(x_0)] - \cdots \simeq -\frac{h^4}{180}(x_n-x_0)f^{iv}(\xi_2) = -\frac{nh^5}{180}f^{iv}(\xi_2), \quad x_0<\xi_2<x_n,$$
by the MVT of differential calculus (since xn − x0 = nh); this agrees with the error (7.79).

7.17 Romberg’s Integration

The value of the integration obtained from the trapezoidal or Simpson's rules, or from other rules, may be improved by repeated use of the Richardson's extrapolation procedure described in Section 7.7. This method is known as Romberg's integration method, as Romberg was the first to describe the algorithm in recursive form.
We assume that f ∈ Cⁿ[a, b] for all n; then the error term for the trapezoidal rule can be represented in a series involving only even powers of h, i.e.,
$$I = \int_a^b f(x)\,dx = I_0(h) + c_1h^2 + c_2h^4 + c_3h^6 + \cdots. \qquad (7.188)$$
The Richardson's extrapolation method is used successively to eliminate c1, c2, c3 and so on. This process generates integration formulae whose error terms are of the even orders O(h⁴), O(h⁶), O(h⁸), etc.
The step length h is replaced by h/2 in (7.188); then
$$I = I_0(h/2) + c_1\frac{h^2}{4} + c_2\frac{h^4}{16} + c_3\frac{h^6}{64} + \cdots. \qquad (7.189)$$
To eliminate c1, equation (7.189) is multiplied by 4 and equation (7.188) is subtracted from it. Thus
$$3I = 4I_0(h/2) - I_0(h) - \frac{3}{4}c_2h^4 - \frac{15}{16}c_3h^6 - \cdots.$$
This equation is divided by 3, and the coefficients of h⁴, h⁶, etc. are denoted by d1, d2, etc. Thus
$$I = \frac{4I_0(h/2)-I_0(h)}{3} + d_1h^4 + d_2h^6 + \cdots. \qquad (7.190)$$
Denoting $\frac{4I_0(h/2)-I_0(h)}{3}$ by $I_1(h/2)$, the first Romberg improvement, equation (7.190) can be written as
$$I = I_1(h/2) + d_1h^4 + d_2h^6 + \cdots. \qquad (7.191)$$

Again, h is replaced by h/2 in (7.191); therefore,
$$I = I_1(h/2^2) + d_1\frac{h^4}{16} + d_2\frac{h^6}{64} + \cdots. \qquad (7.192)$$
Now, eliminating d1, we obtain
$$15I = 16I_1(h/2^2) - I_1(h/2) - \frac{3}{4}d_2h^6 - \cdots.$$
That is,
$$I = \frac{16I_1(h/2^2)-I_1(h/2)}{15} + e_1h^6 + \cdots = I_2(h/2^2) + e_1h^6 + \cdots, \qquad (7.193)$$
where
$$I_2(h/2^2) = \frac{16I_1(h/2^2)-I_1(h/2)}{15} = \frac{4^2I_1(h/2^2)-I_1(h/2)}{4^2-1} \qquad (7.194)$$
is the second Romberg improvement. In general,
$$I_m(h/2^k) = \frac{4^m I_{m-1}(h/2^k) - I_{m-1}(h/2^{k-1})}{4^m-1}, \qquad (7.195)$$
where m = 1, 2, ...; k = m, m + 1, ...; and I0(h) = I(h). Thus
$$I = I_m(h/2^m) + O(h^{2m+2}). \qquad (7.196)$$

Now,
$$I_1(h/2) = \frac{4I(h/2)-I(h)}{3} = \frac{1}{3}\Big[4\cdot\frac{h}{4}(y_0+2y_1+y_2) - \frac{h}{2}(y_0+y_2)\Big] = \frac{h/2}{3}[y_0+4y_1+y_2].$$
This is Simpson's 1/3 rule with step size h/2; that is, the first improved value is equal to the value obtained by Simpson's 1/3 rule. Similarly,
$$I_2(h/2^2) = \frac{16I_1(h/2^2)-I_1(h/2)}{15} = \frac{1}{15}\Big[16\cdot\frac{h/4}{3}\{y_0+4(y_1+y_3)+2y_2+y_4\} - \frac{h/2}{3}(y_0+4y_2+y_4)\Big]$$
[Simpson's rule for 4 and 2 intervals]
$$= \frac{2(h/4)}{45}[7y_0+32y_1+12y_2+32y_3+7y_4].$$
This is Boole's rule for step length h/4.
The Romberg integration can be carried out using the triangular array of successive approximations to the integral shown in Table 7.11.
Table 7.11: Romberg's integration table.

 n      h      0th order      1st order    2nd order   3rd order   4th order
               (trapezoidal)  (Simpson)    (Boole)
 1      h      I0(h)
 2      h/2    I0(h/2)        I1(h/2)
 4      h/2²   I0(h/2²)       I1(h/2²)     I2(h/2²)
 8      h/2³   I0(h/2³)       I1(h/2³)     I2(h/2³)    I3(h/2³)
16      h/2⁴   I0(h/2⁴)       I1(h/2⁴)     I2(h/2⁴)    I3(h/2⁴)    I4(h/2⁴)

Each entry is computed from the entry immediately to its left and the entry diagonally above it to the left.

The entries in the 0th order are computed directly and the other entries are calculated
by using the formula (7.195). The values along the diagonal would converge to the
integral.
The advantage of Romberg's method is that it gives a much more accurate result than the usual composite trapezoidal rule. A computational weakness of this method is that twice as many function evaluations are needed to decrease the error from O(h^{2n}) to O(h^{2n+2}). Practically, the computations are carried out rowwise until the desired accuracy is achieved.

Note 7.17.1 If the 0th order (starting) approximation is calculated using Simpson’s
rule then first order approximation I1 gives the Boole’s approximation and so on.

Example 7.17.1 Find the value of $\int_0^1 \frac{dx}{1+x^2}$ using Romberg's method starting with the trapezoidal rule.

Solution. Let x0 = 0, xn = 1, h = (1 − 0)/n, xi = x0 + ih.
The initial approximations are computed by the trapezoidal rule.
For n = 1, h = 1: $I_0 = \frac{h}{2}(y_0+y_1) = 0.5(1+0.5) = 0.75000$.
For n = 2, h = 0.5: $I_0 = \frac{h}{2}(y_0+2y_1+y_2) = 0.25(1+2\times 0.8+0.5) = 0.77500$.
For n = 4, h = 0.25: $I_0 = \frac{h}{2}[y_0+2(y_1+y_2+y_3)+y_4] = 0.125[1+2(0.94118+0.8+0.64)+0.5] = 0.78280$.
For n = 8, h = 0.125: $I_0 = \frac{h}{2}[y_0+2(y_1+y_2+\cdots+y_7)+y_8] = 0.0625[1+2(0.98462+0.94118+0.87671+0.8+0.71910+0.64+0.56637)+0.5] = 0.78475$.
Then
$$I_1(h/2) = \frac{4I_0(0.5)-I_0(1.0)}{3} = \frac{4\times 0.77500-0.75000}{3} = 0.78333,$$
$$I_1(h/2^2) = \frac{4I_0(0.25)-I_0(0.5)}{3} = \frac{4\times 0.78280-0.77500}{3} = 0.78540,$$
$$I_1(h/2^3) = \frac{4I_0(0.125)-I_0(0.25)}{3} = \frac{4\times 0.78475-0.78280}{3} = 0.78540.$$
All the calculations are shown in the following table.

n h I0 I1 I2 I3
1 1.000 0.75000
2 0.500 0.77500 0.78333
4 0.250 0.78280 0.78540 0.78554
8 0.125 0.78475 0.78540 0.78540 0.78540

The exact integral is π/4 = 0.78540. In each column the numbers are converging to
the value 0.78540. The values in Simpson’s rule column (I1 ) converge faster than the
values in the trapezoidal rule column (I0 ). Ultimately, the value of the integral is
0.78540 correct up to five decimal places.

Algorithm 7.6 (Romberg’s integration). This algorithm implements Romberg’s


integration starting with trapezoidal rule.

From Table 7.11 it is observed that the elements of row j are
$$I(j,k) = \frac{4^k I(j,k-1) - I(j-1,k-1)}{4^k-1} = I(j,k-1) + \frac{I(j,k-1)-I(j-1,k-1)}{4^k-1}, \quad 1 \le k \le j.$$
The values I(j, 0) are obtained from the trapezoidal rule. The algorithm terminates in the jth row when |I(j, j) − I(j − 1, j − 1)| < ε.
Algorithm Romberg Integration
Input function F (x);
Read a, b, eps, maxrow;
// a, b are lower and upper limits of the integral, eps is the error tolerance and
maxrow is maximum number of rows to be evaluated if error tolerance is not
archived.//
Real I(0 : maxrow, 0 : maxrow); //value of the integral.//
Set n = 1; //initialize the number of subintervals//
Set h = b − a; // step size and current row.//
Set j = 0;

Set error = 1; //initialize error to 1.//


Compute I(0, 0) = h ∗ [F (a) + F (b)]/2;
while (error > eps) and (j < maxrow) do
j = j + 1, h = h/2, n = 2 ∗ n;
//compute initial value using trapezoidal rule.//
Compute I(j, 0)=TRAP(a, h, n)
for k = 1 to j do
I(j, k) = I(j, k − 1) + [I(j, k − 1) − I(j − 1, k − 1)]/(4k − 1);
endfor;
Print I(j, l), l = 0, 1, . . . , j, the entries of jth row;
Compute error = |I(j − 1, j − 1) − I(j, j)|;
endwhile;
if (error < eps) then
Print ‘The value of integration’, I(j, j);
else
Print ‘The result does not achieve the desired accuracy,
the best approximation is’, I(j, j).
end Romberg Integration
Function Trap(a, h, n)
Input function f (x);
Set sum = 0;
for i = 1 to n − 1 do
sum = sum + f (a + ih);
endfor;
Compute Trap=[f (a) + f (a + nh) + 2 ∗ sum] ∗ h/2;
return;
end TRAP
Program 7.6.
/* Program Romberg
Integration by Romberg method. The initial integration is
computed by trapezoidal rule.
Here we assume that f(x)=x*x*exp(x). */
#include<stdio.h>
#include<math.h>
void main()
{
float a,b,h,error=1,eps=1e-5,I[10][10];
int n=1,j=0,k,l,maxrow;
float trap(float a, float h, int n);
float f(float x);

printf("Enter the limits a and b ");


scanf("%f %f",&a,&b);
printf("Enter max. number of rows to be computed ");
scanf("%d",&maxrow);
h=b-a;
I[0][0]=h*(f(a)+f(b))/2;
printf("Romberg integration table\n");
while((error>eps) && (j<maxrow))
{
j++; h/=2; n*=2;
I[j][0]=trap(a,h,n);
for(k=1;k<=j;k++)
I[j][k]=I[j][k-1]+(I[j][k-1]-I[j-1][k-1])/(pow(4,k)-1);
for(l=0;l<=j;l++) printf("%f ",I[j][l]); printf("\n");
error=fabs(I[j-1][j-1]-I[j][j]);
}
if(error<eps)
printf("The value of the integration is %f ",I[j][j]);
else
{
printf("The result does not achieve the desired accuracy");
printf("\nthe best approximation is %f ",I[j][j]);
}
} /* main */

/* definition of the function f(x) */


float f(float x)
{
return(x*x*exp(x));
}

/* function for the trapezoidal rule */


float trap(float a, float h, int n)
{
float sum=0;
int i;
for(i=1;i<=n-1;i++) sum+=f(a+i*h);
sum=(f(a)+f(a+n*h)+2*sum)*h/2;
return(sum);
}

A sample of input/output:
Enter the limits a and b 0 1
Enter max. number of rows to be computed 10
Romberg integration table
0.885661 0.727834
0.760596 0.718908 0.718313
0.728890 0.718321 0.718282 0.718282
0.720936 0.718284 0.718282 0.718282 0.718282
The value of the integration is 0.718282

7.18 Double Integration

The double integration


I = ∫_c^d ∫_a^b f(x, y) dx dy                                        (7.197)

can also be evaluated using the methods discussed earlier. Generally, trapezoidal or
Simpson’s methods are used. The integral can be evaluated numerically by two suc-
cessive integrations in x and y directions respectively taking one variable fixed at a
time.

7.18.1 Trapezoidal method


Keeping y fixed, the inner integral is evaluated by the trapezoidal method. Then

I = (b − a)/2 ∫_c^d [f(a, y) + f(b, y)] dy
  = (b − a)/2 [ ∫_c^d f(a, y) dy + ∫_c^d f(b, y) dy ].               (7.198)

Again, applying the trapezoidal rule to the two integrals on the right hand side, the
trapezoidal formula for the double integral is obtained as

I = (b − a)/2 [ (d − c)/2 {f(a, c) + f(a, d)} + (d − c)/2 {f(b, c) + f(b, d)} ]
  = (b − a)(d − c)/4 [f(a, c) + f(a, d) + f(b, c) + f(b, d)].        (7.199)

This expression shows that the integration can be done only if the values of the
function f(x, y) are available at the corner points (a, c), (a, d), (b, c) and (b, d) of the
rectangular region [a, b; c, d].

The composite rule may also be used to determine the integral (7.197). To do this
the interval [a, b] is divided into n equal subintervals each of length h and the interval
[c, d] into m equal subintervals each of length k. That is,
xi = x0 + ih,  x0 = a,  xn = b,  h = (b − a)/n;
yj = y0 + jk,  y0 = c,  ym = d,  k = (d − c)/m.
Now,
∫_a^b f(x, y) dx = (h/2) [f(x0, y) + 2{f(x1, y) + f(x2, y) + · · ·
                   + f(xn−1, y)} + f(xn, y)].                         (7.200)

The equation (7.200) is integrated between c and d, term by term, using trapezoidal
rule. Therefore,
I = ∫_c^d ∫_a^b f(x, y) dx dy
  = (h/2) [ (k/2){f(x0, y0) + 2(f(x0, y1) + f(x0, y2) + · · · + f(x0, ym−1)) + f(x0, ym)}
  + 2 · (k/2){f(x1, y0) + 2(f(x1, y1) + f(x1, y2) + · · · + f(x1, ym−1)) + f(x1, ym)}
  + 2 · (k/2){f(x2, y0) + 2(f(x2, y1) + f(x2, y2) + · · · + f(x2, ym−1)) + f(x2, ym)}
  + ···························································
  + 2 · (k/2){f(xn−1, y0) + 2(f(xn−1, y1) + · · · + f(xn−1, ym−1)) + f(xn−1, ym)}
  + (k/2){f(xn, y0) + 2(f(xn, y1) + f(xn, y2) + · · · + f(xn, ym−1)) + f(xn, ym)} ]
  = (hk/4) [ f00 + 2{f01 + f02 + · · · + f0,m−1} + f0m
  + 2 Σ_{i=1}^{n−1} {fi0 + 2(fi1 + fi2 + · · · + fi,m−1) + fim}
  + fn0 + 2(fn1 + fn2 + · · · + fn,m−1) + fnm ],                      (7.201)

where fij = f (xi , yj ), i = 0, 1, . . . , n; j = 0, 1, 2, . . . , m.


The method is of second order in both h and k.

Table 7.12: Tabular form of trapezoidal rule for double integration.

       y0, y1, y2, . . . , ym−1, ym
x0 :  [f00 + 2(f01 + f02 + · · · + f0,m−1) + f0,m] (k/2) = I0
x1 :  [f10 + 2(f11 + f12 + · · · + f1,m−1) + f1,m] (k/2) = I1
· · · · · · · · · · · · · · · · · · · · · · · · · · · · ·
xn :  [fn0 + 2(fn1 + fn2 + · · · + fn,m−1) + fn,m] (k/2) = In
      I = (h/2)[I0 + 2(I1 + I2 + · · · + In−1) + In]

Algorithm 7.7 (Double integration using trapezoidal rule). This algorithm


determines the double integration of f (x, y) within the region [a, b; c, d] by trapezoidal
rule.

The formula (7.201) can be written as

I = (hk/4) [ f00 + f0m + 2 Σ_{j=1}^{m−1} f0j
    + 2 Σ_{i=1}^{n−1} ( fi0 + fim + 2 Σ_{j=1}^{m−1} fij )
    + fn0 + fnm + 2 Σ_{j=1}^{m−1} fnj ]
  = (hk/4) [ sum(0) + 2 Σ_{i=1}^{n−1} sum(i) + sum(n) ],

where sum(i) = fi0 + fim + 2 Σ_{j=1}^{m−1} fij.
j=1

Algorithm Double Trapezoidal


Input function f (x, y);
Read a, b, c, d, n, m; //limits of x and y and number of subintervals.//
Compute h = (b − a)/n and k = (d − c)/m;
Compute fij = f (xi , yj ), xi = a + ih, yj = c + jk, i = 0, 1, . . . , n, j = 0, 1, . . . , m;
Set result = 0;
for i = 0 to n − 1 do
result = result + sum(i);
endfor;
Compute result = (sum(0) + 2 ∗ result + sum(n)) ∗ h ∗ k/4;
Print result;
function sum(i)
Input array fij ;
Set t = 0;

for j = 1 to m − 1 do
t = t + fij ;
endfor;
return(fi0 + fim + 2 ∗ t);
end sum
end Double Trapezoidal

Program 7.7.
/* Program Trapezoidal for Two Variables
The program to find the double integration of the function
F(x,y)=1/{(1+x*x)(1+y*y)} defined over a rectangular region
[a,b;c,d] by trapezoidal rule. */
#include<stdio.h>
int n,m; float f[15][15];
float sum(int);
float F(float,float);
void main()
{
int i,j; float a,b,c,d,h,k,x,y,result=0;
printf("Enter the limits of x and y; [a,b;c,d] ");
scanf("%f %f %f %f",&a,&b,&c,&d);
printf("Enter the number of subdivisions n,m of x,y ");
scanf("%d %d",&n,&m);
h=(b-a)/n;
k=(d-c)/m;
x=a;
for(i=0;i<=n;i++) /* computation of the function */
{
y=c;
for(j=0;j<=m;j++)
{
f[i][j]=F(x,y);
y+=k;
}
x+=h;
}
for(i=0;i<n;i++) result+=sum(i);
result=(sum(0)+2*result+sum(n))*h*k/4;
printf("The value of the integration is %8.5f",result);
}

float sum(int i)
{
float t=0; int j;
for(j=1;j<m;j++) t+=f[i][j];
return(f[i][0]+f[i][m]+2*t);
}
/* definition of the function f(x,y) */
float F(float x,float y)
{
return( (1/(1+x*x))*(1/(1+y*y)));
}

A sample of input/output:

Enter the limits of x and y; [a,b;c,d]


0 1 0 1
Enter the number of subdivisions n,m of x,y
10 10
The value of the integration is 0.69469

7.18.2 Simpson’s 1/3 method


Let x0 = a, x1 = x0 + h, x2 = b be the three points on the interval [a, b] and
y0 = c, y1 = y0 + k, y2 = d be those on [c, d], where h = (b − a)/2, k = (d − c)/2.
Then by Simpson's 1/3 rule on ∫_a^b f(x, y) dx one can write

∫_a^b f(x, y) dx = (h/3) [f(x0, y) + 4f(x1, y) + f(x2, y)].
Again, by the same rule applied to each term of the above expression, the final formula
is given by

I = (h/3) [ (k/3){f(x0, y0) + 4f(x0, y1) + f(x0, y2)}
    + 4 · (k/3){f(x1, y0) + 4f(x1, y1) + f(x1, y2)}
    + (k/3){f(x2, y0) + 4f(x2, y1) + f(x2, y2)} ]
  = (hk/9) [f00 + f02 + f20 + f22 + 4(f01 + f10 + f12 + f21) + 16 f11],  (7.202)
where fij = f (xi , yj ), i = 0, 1, 2; j = 0, 1, 2.

Table 7.13: Tabular form of Simpson’s rule for double integration.

       y0, y1, y2, y3, y4, . . . , ym−1, ym
x0 :  [f00 + 4f01 + 2f02 + 4f03 + 2f04 + · · · + 4f0,m−1 + f0,m] (k/3) = I0
x1 :  [f10 + 4f11 + 2f12 + 4f13 + 2f14 + · · · + 4f1,m−1 + f1,m] (k/3) = I1
· · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · ·
xn :  [fn0 + 4fn1 + 2fn2 + 4fn3 + 2fn4 + · · · + 4fn,m−1 + fn,m] (k/3) = In
      I = (h/3)[I0 + 4(I1 + I3 + · · · + In−1) + 2(I2 + I4 + · · · + In−2) + In]

In general,
∫_{yj−1}^{yj+1} ∫_{xi−1}^{xi+1} f(x, y) dx dy
  = (hk/9) [ fi−1,j−1 + fi−1,j+1 + fi+1,j−1 + fi+1,j+1
    + 4(fi−1,j + fi,j−1 + fi,j+1 + fi+1,j) + 16 fij ].               (7.203)

This formula is known as Simpson’s 1/3 rule for double integration.
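The composite form of this rule amounts to multiplying the one-dimensional Simpson
weights 1, 4, 2, 4, . . . , 4, 1 in the x and y directions. A minimal C sketch of this idea
follows (the names simpson2 and F are illustrative, not from a numbered program of
this chapter; n and m are assumed to be even):

/* Sketch: composite Simpson's 1/3 rule for double integration.
   The weight of node (xi,yj) is the product of the 1-D Simpson
   weights in x and y; n and m must be even. */
#include<stdio.h>

float F(float x, float y)       /* integrand, here 1/(x*x+y*y) */
{
    return(1.0/(x*x+y*y));
}

float simpson2(float a,float b,float c,float d,int n,int m)
{
    float h=(b-a)/n,k=(d-c)/m,s=0,wx,wy;
    int i,j;
    for(i=0;i<=n;i++)
    {
        wx=(i==0||i==n)?1:(i%2?4:2);      /* Simpson weight in x */
        for(j=0;j<=m;j++)
        {
            wy=(j==0||j==m)?1:(j%2?4:2);  /* Simpson weight in y */
            s+=wx*wy*F(a+i*h,c+j*k);
        }
    }
    return(s*h*k/9);
}

void main()
{
    printf("%8.5f\n",simpson2(1,2,1,2,4,4));
}

With n = m = 4 (h = k = 0.25) on [1, 2; 1, 2] this reproduces the value obtained by
hand in Example 7.18.1 below.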


Example 7.18.1 Find the value of ∫_1^2 ∫_1^2 dx dy/(x² + y²) using trapezoidal and
Simpson's 1/3 rules taking h = k = 0.25.

Solution. Since h = k = 0.25, let x = 1, 1.25, 1.50, 1.75, 2.0 and y =
1, 1.25, 1.50, 1.75, 2.0. The following table is constructed for the integrand
f(x, y) = 1/(x² + y²).

y
x 1.00 1.25 1.50 1.75 2.00
1.00 0.50000 0.39024 0.30769 0.24615 0.20000
1.25 0.39024 0.32000 0.26230 0.21622 0.17978
1.50 0.30769 0.26230 0.22222 0.18824 0.16000
1.75 0.24615 0.21622 0.18824 0.16327 0.14159
2.00 0.20000 0.17978 0.16000 0.14159 0.12500

Let x be fixed and y be the varying variable. Then by the trapezoidal rule on each
row of the above table, we obtain

I0 = ∫_1^2 f(1, y) dy = (0.25/2)[0.50000 + 2(0.39024 + 0.30769 + 0.24615) + 0.20000]
   = 0.32352.
I1 = ∫_1^2 f(1.25, y) dy = (0.25/2)[0.39024 + 2(0.32000 + 0.26230 + 0.21622) + 0.17978]
   = 0.27088.
I2 = ∫_1^2 f(1.5, y) dy = (0.25/2)[0.30769 + 2(0.26230 + 0.22222 + 0.18824) + 0.16000]
   = 0.22665.
I3 = ∫_1^2 f(1.75, y) dy = (0.25/2)[0.24615 + 2(0.21622 + 0.18824 + 0.16327) + 0.14159]
   = 0.19040.
I4 = ∫_1^2 f(2.0, y) dy = (0.25/2)[0.20000 + 2(0.17978 + 0.16000 + 0.14159) + 0.12500]
   = 0.16097.

Hence finally

∫_1^2 ∫_1^2 dx dy/(x² + y²) = (h/2)[I0 + 2(I1 + I2 + I3) + I4]
   = (0.25/2)[0.32352 + 2(0.27088 + 0.22665 + 0.19040) + 0.16097] = 0.23254.
Again, applying Simpson's 1/3 rule to each row of the above table, we have

I0 = ∫_1^2 f(1, y) dy = (0.25/3)[0.50000 + 4(0.39024 + 0.24615) + 2(0.30769) + 0.20000]
   = 0.32175.

Similarly, I1 = 0.26996, I2 = 0.22619, I3 = 0.19018, I4 = 0.16087.

Hence Simpson's 1/3 rule gives

I = (h/3)[I0 + 4(I1 + I3) + 2I2 + I4] = 0.23129.

7.19 Monte Carlo Method

The Monte Carlo method is used to solve a large number of problems. The name of the
method is derived from the name of the city Monte Carlo, the city of Monaco famous
for its casino. Credit for inventing the Monte Carlo method often goes to Stanislaw
Ulam, a Polish mathematician who worked for John von Neumann in the United States
Manhattan Project during World War II. He invented this method in 1946 and the first
paper on it was published in 1949.
This method depends on a random sample and it is not suitable for hand calculation,
because it depends on a large number of random numbers. The algorithm of Monte
Carlo method is prepared to perform only one random trial. The trial is repeated for
N times and the trials are independent. The final result is the average of all the results
obtained in different trials.
Now, the Monte Carlo method is introduced to find numerical integration of a single
valued function. Suppose the definite integral be
I = ∫_a^b g(x) dx,                                                   (7.204)

where g(x) is a real valued function defined on [a, b]. The idea is to manipulate the
definite integral into a form that can be solved by the Monte Carlo method. To do this,
the uniform probability density function (pdf) on [a, b] is defined as

f(x) = 1/(b − a),  a < x < b;   f(x) = 0, otherwise.
This function is inserted into the equation (7.204) to obtain the following expression
for I:

I = (b − a) ∫_a^b g(x) f(x) dx.                                      (7.205)

It is observed that the integral on the right hand side of the equation (7.205) is simply
the expectation of g(x) under the uniform pdf. Thus,

I = (b − a) ∫_a^b g(x) f(x) dx = (b − a) ḡ.                          (7.206)

Now, a sample xi is drawn from the pdf f(x), and for each xi the value of g(xi) is
calculated. To get a good approximation, a large number of samples is to be chosen.
Let G be the average of all the values of g(xi), i = 1, 2, . . . , N (the sample size). Then

G = (1/N) Σ_{i=1}^{N} g(xi).                                         (7.207)

It is easy to prove that the expectation of the average of N samples is the expectation
of g(x), i.e., Ḡ = ḡ. Hence

I = (b − a)G ≈ (b − a) [ (1/N) Σ_{i=1}^{N} g(xi) ].                  (7.208)

Figure 7.4: Random points are chosen in [a, b]. The value of g(x) is evaluated at each
random point (marked by straight lines in the figure). [Figure omitted.]

Figure 7.4 illustrates the Monte Carlo method.


Thus the approximate value of the integral I on [a, b] can be evaluated by taking
the average of N observations of the integrand with the random variable x sampled
uniformly over the interval [a, b]. This implies that the interval [a, b] is finite, since an
infinite interval cannot have a uniform pdf. The infinite limits of integration can be
accommodated with more sophisticated techniques.
The true variance in the average G is related to the true variance in g by

var(G) = (1/N) var(g).
Although var(g) is unknown, since it is a property of the pdf f(x) and the real
function g(x), it is a constant. Furthermore, if the error committed in the estimate of
the integral I is measured by the standard deviation, then one may expect the error in
the estimate of I to decrease by the factor 1/√N.
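As a rough numerical check of this 1/√N behaviour, the sample variance of the g(xi)
can be accumulated alongside the sum. The following C fragment is a sketch only (the
variable names are ours), estimating both the integral and its standard error:

/* Sketch: Monte Carlo estimate of an integral together with an
   estimated standard error (b-a)*sqrt(var(g)/N). */
#include<stdio.h>
#include<stdlib.h>
#include<math.h>

float g(float x) { return(1/(1+x*x)); }

void main()
{
    int i,N=10000;
    float a=0,b=1,x,gi,sum=0,sumsq=0,G,var,I,se;
    srand(100);                       /* seed for random numbers */
    for(i=0;i<N;i++)
    {
        x=a+(b-a)*(float)rand()/RAND_MAX;
        gi=g(x);
        sum+=gi; sumsq+=gi*gi;
    }
    G=sum/N;
    var=sumsq/N-G*G;                  /* sample variance of g */
    I=(b-a)*G;                        /* estimate of the integral */
    se=(b-a)*sqrt(var/N);             /* estimated standard error */
    printf("I=%f standard error=%f\n",I,se);
}

Quadrupling N should roughly halve the printed standard error, in agreement with the
1/√N law stated above.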

Algorithm 7.8 (Monte Carlo integration). This algorithm finds the value of
the integral ∫_a^b g(x) dx using the Monte Carlo method based on a sample of size N.

Algorithm Monte Carlo


Input function g(x);
Step 1: Read sample size N and the lower and upper limits a, b, of the integral.
Step 2: For i = 1 to N do steps 2.1–2.3
Step 2.1: Generate a random number yi between 0 and 1.
Step 2.2: Let xi = a + (b − a)yi . //a ≤ xi ≤ b.//
Step 2.3: Calculate g(xi) and compute sum = sum + g(xi). //Initially sum = 0.//
Step 3: Calculate I = (sum/N ) ∗ (b − a).
Step 4: Print I.
end Monte Carlo.

The function to generate the random numbers rand() is available in C programming


language though it is not available in FORTRAN. Several methods are available to
generate random numbers. A method is described in the following which will generate
pseudo-random numbers.

7.19.1 Generation of random numbers


The random numbers are generated by a random process and these numbers are the
values of a random variable. Several methods are available to generate random num-
bers. Practically, these methods do not generate ideal random numbers, because these
methods follow some algorithms. So the available methods generate pseudo random
numbers.
The commonly used simplest method to generate random numbers is the power
residue method. A sequence of non-negative integers x1 , x2 , . . . is generated from the
following relation
xn+1 = (a xn ) (mod m)
where x0 is a starting value called the seed, a and m are two positive integers (a < m).
The expression (axn ) (mod m) gives the remainder when axn is divided by m. The
possible values of xn+1 are 0, 1, 2, . . . , m−1, i.e., the number of different random numbers
is m. The period of random number depends on the values of a, x0 and m. Appropriate
choice of a, x0 and m gives a long period random numbers.
Suppose the computer which will generate the random numbers has a word-length of
b bits. Let m = 2^{b−1} − 1, a = an odd integer of the form 8k ± 3 and close to 2^{b/2},
and x0 = an odd integer between 0 and m. Now ax0 is a 2b-bit integer. The least
significant b bits form the random number x1. The process is repeated for a desired
number of times. For a 32-bit computer, m = 2^{31} − 1 = 2147483647,
a = 2^{16} + 3 = 65539, x0 = 1267835015 (0 < x0 < m).
To obtain the random numbers between [0, 1], all numbers are divided by m − 1.
These numbers are uniformly distributed over [0, 1]. The following FORTRAN function
RAND01() generates random numbers between 0 and 1.
C FORTRAN FUNCTION TO GENERATE A RANDOM NUMBER BETWEEN 0 AND 1
FUNCTION RAND01()
INTEGER*4 A,X0
REAL M
PARAMETER(M=2147483647.0)
DATA A,X0/65539,1267835015/
X0=IABS(A*X0)
RAND01=X0/M
RETURN
END
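For readers working in C, the same power residue generator may be sketched as follows
(a rendering of the FORTRAN function above, not a library routine; the bit mask
stands in for the overflow-and-IABS behaviour of a 32-bit word):

/* Sketch: power residue generator in C.
   seed_{n+1} = (a*seed_n) kept to 31 bits; result scaled to [0,1]. */
#include<stdio.h>

static unsigned long seed=1267835015UL;    /* x0 */

float rand01(void)
{
    const unsigned long a=65539UL;         /* 2^16 + 3 */
    seed=(a*seed)&0x7FFFFFFFUL;            /* low-order 31 bits */
    return(seed/2147483647.0);             /* divide by m = 2^31 - 1 */
}

void main()
{
    int i;
    for(i=0;i<5;i++) printf("%f\n",rand01());
}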

Program 7.8.
/* Program Monte Carlo
Program to find the value of the integration of 1/(1+x^2)
between 0 and 1 by Monte Carlo method for different values of N. */
#include<stdio.h>
#include<stdlib.h>
void main()
{
float g(float x); /* g(x) may be changed accordingly */
float x,y,I,sum=0.0,a=0.0,b=1.0;
int i, N;
srand(100); /* seed for random number */
printf("Enter the sample size ");
scanf("%d",&N);
for(i=0;i<N;i++)
{ /* rand() generates a random number between 0 and RAND_MAX */
y=(float)rand()/RAND_MAX;
/* generates a random number between 0 and 1*/
x=a+(b-a)*y;
sum+=g(x);
}
I=sum*(b-a)/N;
printf("%f",I);
}
/* definition of function */
float g(float x)
{
return(1/(1+x*x));
}

The results obtained for different values of N are tabulated in the following.

N Integration
500 0.790020
1000 0.789627
1500 0.786979
3000 0.786553
4000 0.784793
10000 0.784094
15000 0.782420

Choice of method

In this chapter, three types of integration techniques viz., Newton-Cotes, Gaussian and
Monte Carlo are discussed. These methods are computationally different from each
other.
No definite rule can be given for choosing an integration method, but the following
points may be kept in mind while choosing one.

(i) The simplest but crudest integration formula is the trapezoidal rule. This method
gives a rough value of the integral. When a rough value is required then this
method may be used.

(ii) Simpson’s 1/3 rule gives more accurate result and it is also simple. Practically,
this is the most widely used formula. Thus, if f (x) does not fluctuate rapidly and
is explicitly known then this method with a suitable subinterval can be used. If
high accuracy is required then the Gaussian quadrature may be used.

(iii) Double integration may be done by using Simpson’s rule with suitable subintervals.

(iv) If the integrand is known at some unequally spaced points, then the trapezoidal
rule along with Romberg’s integration is useful.

(v) If the integrand is violently oscillating or fluctuating then the Monte Carlo method
can be used.

7.20 Worked out Examples

Example 7.20.1 The arc length of the curve y = f(x) over the interval a ≤ x ≤ b
is ∫_a^b √(1 + [f′(x)]²) dx. For the function f(x) = x³, 0 ≤ x ≤ 1, find the approximate
arc length using the composite trapezoidal and Simpson's 1/3 rules with n = 10.

Solution. The arc length I = ∫_0^1 √(1 + [f′(x)]²) dx = ∫_0^1 √(1 + 9x⁴) dx.
Since n = 10, h = (1 − 0)/10 = 0.1.

x: 0.0 0.1 0.2 0.3 0.4 0.5


y : 1.00000 1.00045 1.00717 1.03581 1.10923 1.25000
x: 0.6 0.7 0.8 0.9 1.0
y : 1.47187 1.77789 2.16481 2.62772 3.16228

Trapezoidal rule gives

I = (h/2)[y0 + 2(y1 + y2 + · · · + y9) + y10]
  = (0.1/2)[1.00000 + 2(1.00045 + 1.00717 + 1.03581 + 1.10923 + 1.25000 + 1.47187
    + 1.77789 + 2.16481 + 2.62772) + 3.16228]
  = 1.55261.

Simpson's 1/3 rule gives

I = (h/3)[y0 + 4(y1 + y3 + · · · + y9) + 2(y2 + y4 + · · · + y8) + y10]
  = (0.1/3)[1.00000 + 4(1.00045 + 1.03581 + 1.25000 + 1.77789 + 2.62772)
    + 2(1.00717 + 1.10923 + 1.47187 + 2.16481) + 3.16228]
  = 1.54786.

The exact value is 1.547866.
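The computation above is easy to verify with a few lines of C; the following sketch
(names ours) evaluates the same composite Simpson's 1/3 sum for the integrand
√(1 + 9x⁴) with n = 10:

/* Check of Example 7.20.1: composite Simpson's 1/3 rule applied
   to the arc-length integrand sqrt(1+9x^4) on [0,1] with n=10. */
#include<stdio.h>
#include<math.h>

float f(float x) { return(sqrt(1+9*x*x*x*x)); }

void main()
{
    int i,n=10;
    float a=0,b=1,h=(b-a)/n,s=f(a)+f(b);
    for(i=1;i<n;i++) s+=(i%2?4:2)*f(a+i*h);  /* weights 4,2,4,... */
    printf("arc length = %8.5f\n",s*h/3);
}

The printed value agrees with 1.54786 computed by hand above.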

Example 7.20.2 Find the number n and the step size h such that the error for the
composite Simpson's 1/3 rule is less than 5 × 10^{−7} when evaluating ∫_2^5 log x dx.

Solution. Let f(x) = log x. Then f′(x) = 1/x, f″(x) = −1/x², f‴(x) = 2/x³,
f^{iv}(x) = −6/x⁴.
The maximum value of |f^{iv}(x)| = 6/x⁴ over [2, 5] occurs at the end point x = 2.
Thus |f^{iv}(ξ)| ≤ |f^{iv}(2)| = 3/8, 2 ≤ ξ ≤ 5.
The error E of Simpson's 1/3 rule is E = −(b − a) f^{iv}(ξ) h⁴/180.
Therefore,

|E| = |(b − a) f^{iv}(ξ) h⁴/180| ≤ [(5 − 2) h⁴/180] · (3/8) = h⁴/160.

Also h = (b − a)/n = 3/n. Thus, |E| ≤ h⁴/160 = 81/(n⁴ × 160) ≤ 5 × 10^{−7}.
That is, n⁴ ≥ (81/160) · 1/(5 × 10^{−7}), or n ≥ 31.72114.
Since n is an integer, we choose n = 32 and the corresponding step size is
h = 3/n = 3/32 = 0.09375.

Example 7.20.3 Obtain the approximate quadrature formula

∫_{−1}^{1} f(x) dx = (5/9) f(−√0.6) + (8/9) f(0) + (5/9) f(√0.6).

Solution. Let ∫_{−1}^{1} f(x) dx = w1 f(−√0.6) + w2 f(0) + w3 f(√0.6),
where w1, w2, w3 are to be determined. To find them, substituting f(x) = 1, x, x²
successively into the above equation, we obtain the following system of equations:

2 = w1 + w2 + w3                       (when f(x) = 1)
0 = −√0.6 w1 + √0.6 w3                 (when f(x) = x)
2/3 = 0.6 w1 + 0.6 w3                  (when f(x) = x²).

The solution of this system is w1 = w3 = 5/9 and w2 = 8/9.
Hence the result follows.

Example 7.20.4 Deduce the following quadrature formula

∫_0^n f(x) dx = n [ (3/8) f(0) + (1/24) {19f(n) − 5f(2n) + f(3n)} ].

Solution. The formula has four unknown weights, so it can be made exact for
polynomials of degree up to 3. Substituting f(x) = 1, x, x², x³ successively into the
equation

∫_0^n f(x) dx = w1 f(0) + w2 f(n) + w3 f(2n) + w4 f(3n),

we get

n = w1 + w2 + w3 + w4,              n²/2 = n w2 + 2n w3 + 3n w4,
n³/3 = n² w2 + 4n² w3 + 9n² w4,     n⁴/4 = n³ w2 + 8n³ w3 + 27n³ w4.

The solution is w1 = 3n/8, w2 = 19n/24, w3 = −5n/24, w4 = n/24.
Hence ∫_0^n f(x) dx = n [ (3/8) f(0) + (1/24) {19f(n) − 5f(2n) + f(3n)} ].

Example 7.20.5 If f(x) is a polynomial of degree 2, prove that

∫_0^1 f(x) dx = (1/12) [5f(0) + 8f(1) − f(2)].

Solution. Let the formula be
∫_0^1 f(x) dx = w1 f(0) + w2 f(1) + w3 f(2). The formula is exact for f(x) = 1, x, x².
Substituting f(x) = 1, x, x² successively into the above formula gives the following
set of equations:

1 = w1 + w2 + w3,   1/2 = w2 + 2w3,   1/3 = w2 + 4w3.

The solution of these equations is w1 = 5/12, w2 = 2/3, w3 = −1/12.
Hence the formula becomes ∫_0^1 f(x) dx = (1/12)[5f(0) + 8f(1) − f(2)].

Example 7.20.6 Deduce the following quadrature formula

∫_{−1}^{1} f(x) dx = (2/3) [f(0) + f(1/√2) + f(−1/√2)].

Solution. Let ∫_{−1}^{1} f(x) dx = w1 f(0) + w2 f(1/√2) + w3 f(−1/√2).
As in the previous examples, the system of equations for f(x) = 1, x, x² is

2 = w1 + w2 + w3,   0 = (1/√2) w2 − (1/√2) w3,   2/3 = (1/2) w2 + (1/2) w3.

The solution of these equations is w1 = w2 = w3 = 2/3. Hence the result follows.

Example 7.20.7 Write down the quadratic polynomial which takes the same values
as f(x) at x = −1, 0, 1 and integrate it to obtain the integration formula

∫_{−1}^{1} f(x) dx = (1/3) [f(−1) + 4f(0) + f(1)].

Assuming the error to have the form A f^{iv}(ξ), −1 < ξ < 1, find the value of A.

Solution. To find the quadratic polynomial, Lagrange's interpolation formula is
used. The Lagrange interpolation polynomial for the points −1, 0, 1 is

f(x) = [(x − 0)(x − 1)/((−1 − 0)(−1 − 1))] f(−1) + [(x + 1)(x − 1)/((0 + 1)(0 − 1))] f(0)
       + [(x + 1)(x − 0)/((1 + 1)(1 − 0))] f(1)
     = (1/2)(x² − x) f(−1) − (x² − 1) f(0) + (1/2)(x² + x) f(1)
     = [ (1/2) f(−1) − f(0) + (1/2) f(1) ] x² + (1/2)[ f(1) − f(−1) ] x + f(0).

This is the required quadratic polynomial. Integrating it between −1 and 1,

∫_{−1}^{1} f(x) dx = (1/2) f(−1) ∫_{−1}^{1} (x² − x) dx − f(0) ∫_{−1}^{1} (x² − 1) dx
                     + (1/2) f(1) ∫_{−1}^{1} (x² + x) dx
   = (1/2) f(−1) · (2/3) − f(0) · (−4/3) + (1/2) f(1) · (2/3)
   = (1/3) [f(−1) + 4f(0) + f(1)].

Here the error is of the form A f^{iv}(ξ). Let the error be E = A f^{iv}(ξ) = (C/4!) f^{iv}(ξ),
−1 < ξ < 1. This indicates that the above formula gives exact results for polynomials
of degree up to 3 and has an error for a polynomial of degree 4. Let f(x) = x⁴.
Then the value of ∫_{−1}^{1} x⁴ dx obtained from the above formula is
(1/3)[(−1)⁴ + 4 · 0⁴ + 1⁴] = 2/3, and the exact value is ∫_{−1}^{1} x⁴ dx = 2/5.
Therefore, C = ∫_{−1}^{1} x⁴ dx − 2/3 = −4/15.
Hence A = C/4! = −1/90.

Example 7.20.8 Derive Simpson's 1/3 rule using the method of undetermined
coefficients.

Solution. Let

I = ∫_{x0}^{x2} f(x) dx = ∫_{−h}^{h} f(z + x1) dz   (where z = x − x1)
  = ∫_{−h}^{h} F(z) dz = w1 F(−h) + w2 F(0) + w3 F(h),   F(z) = f(z + x1).

The coefficients w1, w2, w3 are to be determined. To determine these numbers, assume
that the formula is exact for F(z) = 1, z, z². Substituting F(z) = 1, z, z² successively
into the above formula gives the following equations:

w1 + w2 + w3 = 2h,   −h w1 + h w3 = 0,   h² w1 + h² w3 = (2/3) h³.

The solution of these equations is w1 = w3 = h/3 and w2 = 4h/3.
Therefore,

I = ∫_{x0}^{x2} f(x) dx = ∫_{−h}^{h} F(z) dz = (h/3) F(−h) + (4h/3) F(0) + (h/3) F(h)
  = (h/3) [F(−h) + 4F(0) + F(h)].

This can be written as

I = (h/3) [f(x1 − h) + 4f(x1) + f(x1 + h)]
  = (h/3) [f(x0) + 4f(x1) + f(x2)].

This is the required Simpson's 1/3 rule.

7.21 Exercise

1. From the following table of values, estimate y′(1.05) and y″(1.05):

x : 1.00 1.05 1.10 1.15 1.20 1.25


y : 1.1000 1.1347 1.1688 1.1564 1.2344 1.2345

2. A slider in a machine moves along a fixed straight rod. Its distance x cm along
the rod is given below for various values of time t (second). Find the velocity of
the slider and its acceleration when t = 0.3 sec.

t : 0.3 0.4 0.5 0.6 0.7 0.8


x : 3.364 3.395 3.381 3.324 3.321 3.312

Use the formula based on Newton’s forward difference interpolation to find the
velocity and acceleration.

3. Use approximate formulae to find the values of y′(2) and y″(2) from the following
table:

x : 2.0 3.0 4.0 5.0 6.0 7.0


y : –0.15917 1.09861 1.38629 1.60944 1.79176 1.94591

4. Find the values of f′(5) and f″(5) from the following table:

x : 1 2 3 4 5
f (x) : 10 26 50 82 122

5. Find the values of y′(1), y″(1.2), y′(4), y″(3.9) from the following values

x : 1 2 3 4
y : 0.54030 –0.41615 –0.98999 –0.65364

6. Use two-point and three-point formulae to find the values of f′(2.0) and f″(2.0).

x : 0.5 1.0 1.5 2.0 2.5 3.0 3.5


f (x) : –0.30103 0.00000 0.17609 0.30103 0.39794 0.47712 0.54407

7. Deduce the following relation between the differential and finite difference opera-
tors
(a) D ≡ (1/h)(∆ − ∆²/2 + ∆³/3 − ∆⁴/4 + · · ·)
(b) D ≡ (1/h)(∇ + ∇²/2 + ∇³/3 + ∇⁴/4 + · · ·)
(c) D² ≡ (1/h²)(∆² − ∆³ + (11/12)∆⁴ − (5/6)∆⁵ + (137/180)∆⁶ − · · ·)
(d) D² ≡ (1/h²)(∇² + ∇³ + (11/12)∇⁴ + (5/6)∇⁵ + (137/180)∇⁶ + · · ·)
8. Use Taylor’s series to deduce the following formula

f′(x0) = [−3f(x0) + 4f(x1) − f(x2)]/(2h) + (h²/3) f‴(ξ),   x0 < ξ < x2.
Also, determine the optimum value of h which minimizes the total error (sum of
truncation and round-off errors).

9. Deduce the formula


f′(x0) = [f(x0 + h) − f(x0)]/h − (h/2) f″(ξ),   x0 < ξ < x1.
Also, find the value of h such that the sum of truncation and round-off errors is
minimum.

10. Determine the value of k in the formula


f″(x0) = [f(x0 + h) − 2f(x0) + f(x0 − h)]/h² + O(h^k).

11. Use Lagrange interpolation formula to deduce the following formulae


(a) f″(x0) ≈ [2f(x0) − 5f(x1) + 4f(x2) − f(x3)]/h²
(b) f‴(x0) ≈ [−5f(x0) + 18f(x1) − 24f(x2) + 14f(x3) − 3f(x4)]/(2h³)
(c) f^{iv}(x0) ≈ [3f(x0) − 14f(x1) + 26f(x2) − 24f(x3) + 11f(x4) − 2f(x5)]/h⁴

12. Use Taylor’s expansion to derive the formula

f‴(x0) ≈ [f(x0 + 2h) − 2f(x0 + h) + 2f(x0 − h) − f(x0 − 2h)]/(2h³).

13. Use Richardson's extrapolation to find f′(1) from the following:

    x :    0.6     0.8     0.9     1.0     1.1     1.2     1.4
    f(x): 1.70718 1.85979 1.92586 1.98412 2.03384 2.07350 2.12899

Apply the approximate formula

f′(x0) = [f(x0 + h) − f(x0 − h)]/(2h)

with h = 0.4, 0.2 and 0.1, to find initial values.
14. Find the value of ∫_0^2 (1 + e^{−x} sin 4x) dx using basic (non-composite) rules of (i)
trapezoidal, (ii) Simpson's 1/3, (iii) Simpson's 3/8, (iv) Boole's, and (v) Weddle's.

15. Consider f (x) = 3 + sin(2 x). Use the composite trapezoidal and Simpson’s 1/3
rules with 11 points to compute an approximation to the integral of f (x) taken
over [0, 5].

16. Consider the integrals
    (a) ∫_0^5 e^{−x²} dx   (b) ∫_0^1 x⁴ e^{x−1} dx.
    Evaluate these by (i) trapezoidal rule with 11 points,
    (ii) Simpson's 1/3 rule with 11 points.

17. Evaluate the following integral using Simpson’s 1/3 rule.


∫_0^{π/2} dx/(sin²x + 2cos²x).

18. Evaluate the integral


∫_0^1 e^{x+1} dx

using Simpson’s 1/3 rule, by dividing the interval of integration into eight equal
parts.

19. Evaluate the integral

    ∫_{1.0}^{1.8} (e^x + e^{−x})/2 dx

    using Simpson's 1/3 rule and trapezoidal rule, by taking h = 0.2.

20. A curve is drawn to pass through the points given by the following table:

x 1 1.5 2 2.5 3 3.5 4


y 2 2.4 2.7 2.8 3 3.6 2.4

Use Simpson’s 1/3 rule to estimate the area bounded by the curve and the lines
x = 1 and x = 4 and x-axis.

21. Find the value of

    ∫_0^1 dx/(1 + x²)
taking 5 sub-intervals, by trapezoidal rule, correct to five significant figures. Also
find the error by comparing with the exact value.

22. Evaluate the integral

    ∫_4^{5.2} ln x dx,
using Simpson’s one-third rule with h = 0.1 and compare the result with the exact
value.

23. Find the number of subintervals n and the step size h so that the error for the
    composite trapezoidal rule is less than 5 × 10^{−4} for the approximation ∫_1^5 sin x dx.

24. Verify that the Simpson’s 1/3 rule is exact for polynomial of degree less than or
equal to 3 of the form f (x) = a0 + a1 x + a2 x2 + a3 x3 over [0,2].

25. Integrate the Lagrange’s interpolation polynomial


φ(x) = [(x − x1)/(x0 − x1)] f0 + [(x − x0)/(x1 − x0)] f1
over the interval [x0 , x1 ] and establish the trapezoidal rule.

26. The solid of revolution obtained by rotating the region under the curve y = f (x),
a ≤ x ≤ b, about the x-axis has surface area given by
area = 2π ∫_a^b f(x) √(1 + [f′(x)]²) dx.
Find the surface area for the functions (a) f (x) = cos x, 0 ≤ x ≤ π/4, (b)
f (x) = log x, 1 ≤ x ≤ 5, using trapezoidal and Simpson’s 1/3 rules with 10
subintervals.

27. Show that Simpson's 1/3 rule produces exact results for the functions f(x) = x²
    and f(x) = x³, that is, (a) ∫_a^b x² dx = b³/3 − a³/3 and
    (b) ∫_a^b x³ dx = b⁴/4 − a⁴/4, by taking four subintervals.

28. A rocket is launched from the ground. Its acceleration a(t) measured in every 5
second is tabulated below. Find the velocity and the position of the rocket at
t = 30 second. Use trapezoidal as well as Simpson’s rules. Compare the answers.

t : 0 5 10 15 20 25 30
a(t) : 40.0 46.50 49.25 52.25 55.75 58.25 60.50

29. The natural logarithm of a positive number x is defined by

    log_e x = −∫_x^1 dt/t.

    Find the maximum step size h to get the truncation error bound 0.5 × 10^{−3} when
    finding the value of log_e(0.75) by the trapezoidal rule.

30. Expand F(x), where F′(x) = f(x), by Taylor series about x0 + h/2 and establish
    the midpoint rule

    ∫_{x0}^{x1} f(x) dx = h f(x0 + h/2) + (h³/24) f″(ξ),   x0 < ξ < x1,   h = x1 − x0.

31. Complete the following table to compute the value of the integral
    ∫_0^3 sin 2x/(1 + x⁵) dx using Romberg integration.

    I0 (Trapezoidal)  I1 (Simpson's)  I2 (Boole's)  I3 (Third improvement)
    –0.00171772
     0.02377300        ···
     0.60402717        ···             ···
     0.64844713        ···             ···           ···
     0.66591329        ···             ···           ···

32. Find the values of the following integrals using Romberg integration starting from
    the trapezoidal rule, correct up to five decimal places.
    (a) ∫_1^2 √(4x − x²) dx,   (b) ∫_{1/(2π)}^{2} sin(1/x) dx.

33. Use the three-point Gauss-Legendre formula to evaluate the integrals
    (a) ∫_{−1}^{1} dx/(1 + x²),   (b) ∫_0^{π/2} sin x dx.

34. The three-point Gauss-Legendre formula is

    ∫_{−1}^{1} f(x) dx = [5f(−√0.6) + 8f(0) + 5f(√0.6)]/9.

Show that the formula is exact for f (x) = 1, x, x2 , x3 , x4 , x5 .


35. Find the value of ∫_0^2 x/(1 + x³) dx using four-point and six-point Gauss-Legendre
    quadrature formulae.
36. Find the value of ∫_{−1}^{1} e^{5x} cos x dx using
    (a) Gauss-Legendre quadrature for n = 3, 4,
    (b) Lobatto quadrature for n = 3, 4.
37. Find the value of ∫_{−1}^{1} (1 − x²) sin x dx using
    (a) six-point Gauss-Legendre quadrature,
    (b) three-point Gauss-Chebyshev quadrature.
38. Find the value of ∫_0^∞ e^{−x} cos x dx using
    (a) three-point Gauss-Laguerre quadrature,
    (b) three-point Gauss-Hermite quadrature.
39. Evaluate ∫_0^1 cos 2x (1 − x²)^{−1/2} dx using any suitable method.

40. Using the method of undetermined coefficients, derive the following formulae:
    (a) ∫_0^{2π} f(x) sin x dx = f(0) − f(2π)
    (b) ∫_0^h y dx = (h/2)(y0 + y1).

41. Find the weights w1, w2, w3 so that the relation

    ∫_{−1}^{1} f(x) dx = w1 f(−√0.6) + w2 f(0) + w3 f(√0.6)

    is exact for the functions f(x) = 1, x, x².

42. Find the values of a, b, c such that the truncation error in the formula

    ∫_{−h}^{h} f(x) dx = h[a f(−h) + b f(0) + c f(h)]

    is minimized.

43. Determine the weights and the nodes in the formula

    ∫_{−1}^{1} f(x) dx = Σ_{i=0}^{3} w_i f(x_i)

    with x0 = −1 and x3 = 1 so that the formula becomes exact for polynomials of
    the highest possible degree.

44. Find the values of a, b, c such that the formula

    ∫_0^h f(x) dx = h[a f(0) + b f(h/3) + c f(h)]

    is exact for polynomials of as high an order as possible.

45. Compare Newton-Cotes and Gaussian quadrature methods.

46. Use the Euler-Maclaurin formula to find the value of π from the relation

    π/4 = ∫_0^1 dx/(1 + x²).

47. Use the Euler-Maclaurin formula to find the values of the following series:
    (a) 1/51² + 1/53² + 1/55² + · · · + 1/99²
    (b) 1/11² + 1/12² + 1/13² + · · · + 1/99²
    (c) 1/1 + 1/2 + 1/3 + · · · + 1/20.

48. Use the Euler-Maclaurin formula to prove the following results:
    (a) Σ_{x=1}^{n} x = n(n + 1)/2
    (b) Σ_{x=1}^{n} x² = n(n + 1)(2n + 1)/6
    (c) Σ_{x=1}^{n} x⁴ = n(6n⁴ + 15n³ + 10n² − 1)/30.


49. Use the identity π²/6 = Σ_{n=1}^{∞} 1/n² to compute π².
50. Use the Euler-Maclaurin formula to find the value of ∫_0^1 x³ dx.

51. Use Euler-Maclaurin formula to deduce trapezoidal and Simpson’s 1/3 rules.

52. Evaluate the double integral

    I = ∫_0^1 ∫_0^2 2xy / [(1 + x²)(1 + y²)] dy dx

    using Simpson's 1/3 rule with step size h = k = 0.25.

53. Use Simpson's 1/3 rule to compute the integral

    I = ∫∫_R dx dy/(x² + y²)

    where R is the square region with corners (1, 1), (2, 1), (2, 2), (1, 2).
54. Use Simpson's 1/3 rule to compute the integral ∫_0^1 ∫_0^1 sin xy/(1 + xy) dx dy
    with h = k = 0.25.
55. Use the Monte Carlo method to find the value of ∫_1^5 x/(x + cos x) dx, taking
    sample size N = 10.
Chapter 8

Ordinary Differential Equations

Many problems in science and engineering can be represented in terms of differential


equations satisfying certain given conditions. The analytic methods to solve differential
equations are used for a limited classes of differential equations. Most of the differential
equations governed by physical problems do not possess closed form solutions. For
these types of problems the numerical methods are used. A number of good numerical
methods are available to find numerical solution of differential equations.
Let us consider the general first order differential equation

dy/dx = f(x, y)                                                      (8.1)

with initial condition

y(x0) = y0.                                                          (8.2)

The solution of a differential equation can be done in one of the following two forms:
(i) A series solution for y in terms of powers of x. Then the values of y can be
determined by substituting x = x0 , x1 , . . . , xn .

(ii) A set of tabulated values of y for x = x0 , x1 , . . . , xn with spacing h.


In case (i), the solution of the differential equation is computed in terms of x, and
the values of y are calculated from this solution, whereas, in case (ii), the solution of
the differential equation is obtained by applying the method repeatedly for each value
of x (= x1, x2, . . . , xn).
If the differential equation is of nth order then its general solution contains n arbitrary
constants. To find the values of these constants, n conditions are needed. The problems
in which all the conditions are specified at the initial point only, are called initial value


problems (IVPs). The problems of order two or more and for which the conditions
are given at two or more points are called boundary value problems (BVPs).
A solution of an ordinary differential equation may not always exist. The
sufficient condition for the existence of a unique solution is stated below.

Existence and Uniqueness


Theorem 8.1 (Lipschitz conditions). Let f (x, y) be a real valued function and

(i) f (x, y) is defined and continuous in the strip x0 ≤ x ≤ b, −∞ < y < ∞,

(ii) there exists a constant L such that for any x ∈ [x0 , b] and for any two numbers y1
and y2
|f (x, y1 ) − f (x, y2 )| ≤ L|y1 − y2 |,
where L is called the Lipschitz constant.

Then for any y0 , the initial value problem


dy/dx = f(x, y),   y(x0) = y0
has a unique solution y(x) for x ∈ [x0 , b].

The methods to find an approximate solution of an initial value problem are referred
to as difference methods or discrete variable methods. The solutions are determined
at a set of discrete points called a grid or mesh of points.
The errors committed in solving an initial value problem are of two types – discretiza-
tion and round-off. The discretization errors, again, are of two types – global dis-
cretization error and local discretization error. The global discretization error
or global truncation error E_i^{global} is defined as

E_i^{global} = y(xi) − Y(xi),   i = 1, 2, . . . ,                    (8.3)

that is, it is the difference between the exact solution Y(xi) and the solution y(xi)
obtained by a discrete variable method.
The local discretization error or local truncation error E_{i+1}^{local} is defined by

E_{i+1}^{local} = y(xi+1) − y(xi),   i = 0, 1, 2, . . . .            (8.4)

This error is generated at each step from xi to xi+1. The error at the end of the interval
is called the final global error (F.G.E.)

E_n^{FGE} = |y(xn) − Y(xn)|.                                         (8.5)



In analyzing different procedures used to solve ordinary as well as partial differential


equations, we use the following numerical concepts.
The order of a finite difference approximation of a differential equation is the rate
at which the global error of the finite difference solution approaches zero as the
size of the grid spacing (h) approaches zero. When applied to a differential equation
with a bounded solution, a finite difference equation is stable if it produces a bounded
solution and is unstable if it produces an unbounded solution. It is quite possible that
the numerical solution to a differential equation grows unbounded even though its exact
solution is well behaved. Of course, there are cases for which the exact solution may be
unbounded, but, for our discussion of stability we concentrate only on the cases in which
the exact solution is bounded. In stability analysis, conditions are deduced in terms of
the step size h for which the numerical solution remains bounded. In this connection,
the numerical methods are of three classes.

(i) Stable numerical scheme: The numerical solution does not blow up with choice
of step size.
(ii) Unstable numerical scheme: The numerical solution blows up with any choice
of step size.
(iii) Conditionally stable numerical scheme: Numerical solution remains bounded
with certain choices of step size.

A finite difference method is convergent if the solution of the finite difference equa-
tion approaches a limit as the size of the grid spacing tends to zero. But, there is no
guarantee, in general, that this limit corresponds to the exact solution of the differential
equation.
A finite difference equation is consistent with a differential equation if the differ-
ence between the solution of the finite difference equation and those of the differential
equation tends to zero as the size of the grid spacing tends to zero independently.
If the value of yi+1 depends only on the value of yi then the method is called a
single-step method, and if two or more values are required to evaluate yi+1 then
the method is known as a two-step or multistep method.
Again, if the value of yi+1 depends only on the values of yi, h and f(xi, yi) then the
method used to determine yi+1 is called an explicit method; otherwise the method is
called an implicit method.

8.1 Taylor’s Series Method

The Taylor’s series method is the most fundamental method and it is the standard to
which we compare the accuracy of the various other numerical methods for solving an
initial value problem.

Let us consider the first order differential equation


dy/dx = f(x, y),   y(x0) = y0.                                       (8.6)

The exact solution y(x) of (8.6) around the point x = x0 is given by

y(x) = y0 + (x − x0) y0′ + [(x − x0)²/2!] y0″ + · · ·                (8.7)

If the values of y0′, y0″, y0‴, etc. are known then (8.7) becomes a power series in x.
The derivatives can be obtained by taking the total derivatives of f(x, y).
Given that y′(x) = f(x, y), from the definition of total derivatives one can write

y″(x) = fx + y′(x) fy = fx + fy f
y‴(x) = (fxx + fxy f) + (fyx + fyy f) f + fy (fx + fy f)
      = fxx + 2fxy f + fyy f² + fy fx + fy² f
y^{iv}(x) = (fxxx + 3fxxy f + 3fxyy f² + fyyy f³) + fx (fxx + 2fxy f + fyy f²)
          + 3(fx + fy f)(fxy + fyy f) + fy² (fx + fy f),

and in general

y^{(n)} = (∂/∂x + f ∂/∂y)^{n−1} f(x, y).

The general expression for the Taylor's series method of order n is

yi+1 = yi + h y′(xi) + (h²/2!) y″(xi) + · · · + (hⁿ/n!) y^{(n)}(xi).   (8.8)
From above it is clear that if f (x, y) is not easily differentiable, then the computation
of y is very laborious. Practically, the number of terms in the expansion must be
restricted.

Error
The final global error of Taylor’s series method is of the order of O(hn+1 ). Thus, for
large value of n the error becomes small. If n is fixed then the step size h is chosen in
such a way that the global error becomes as small as desired.
Example 8.1.1 Use Taylor's series method to solve the equation

dy/dx = x − y,   y(0) = 1

at any point x and at x = 0.1.

Solution. The Taylor's series around x = 0 for y(x) is

y(x) = y0 + x y0′ + (x²/2!) y0″ + (x³/3!) y0‴ + (x⁴/4!) y0^{iv} + · · · .

Differentiating repeatedly with respect to x and substituting x = 0, we obtain

y′(x) = x − y,       y0′ = −1
y″(x) = 1 − y′,      y0″ = 1 − y0′ = 2
y‴(x) = −y″,         y0‴ = −2
y^{iv}(x) = −y‴,     y0^{iv} = 2

and so on.
The Taylor's series becomes

y(x) = 1 − x + (x²/2!) · 2 + (x³/3!) · (−2) + (x⁴/4!) · 2 + · · ·
     = 1 − x + x² − x³/3 + x⁴/12 − · · · .

This is the Taylor's series solution of the given differential equation at any point x.
Now,

y(0.1) = 1 − 0.1 + (0.1)² − (0.1)³/3 + (0.1)⁴/12 − · · · = 0.909675.
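Since the derivatives y′, y″, y‴, y^{iv} for this equation were obtained in closed form
above, the method is easy to mechanize. The following C sketch (ours; the derivative
lines must be rewritten for any other equation) advances y′ = x − y, y(0) = 1 by the
fourth-order Taylor polynomial:

/* Sketch: Taylor's series method of order 4 for y'=x-y, y(0)=1.
   d1..d4 code the hand-computed derivatives y',y'',y''',y''''. */
#include<stdio.h>

void main()
{
    float x=0,y=1,h=0.1,xn=0.5,d1,d2,d3,d4;
    while(x<xn)
    {
        d1=x-y;        /* y'    = x - y  */
        d2=1-d1;       /* y''   = 1 - y' */
        d3=-d2;        /* y'''  = -y''   */
        d4=-d3;        /* y'''' = -y'''  */
        y+=h*d1+h*h/2*d2+h*h*h/6*d3+h*h*h*h/24*d4;
        x+=h;
        printf("%f %f\n",x,y);
    }
}

The first step of this program reproduces y(0.1) = 0.909675 found above.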

8.2 Picard’s Method of Successive Approximations

In this method, the value of dependent variable y is expressed as a function of x.


Let us consider the differential equation

dy/dx = f(x, y)                                                      (8.9)

with initial condition

x = x0,   y(x0) = y0.                                                (8.10)

Now, integration of (8.9) between x0 and x gives

∫_{y0}^{y} dy = ∫_{x0}^{x} f(x, y) dx.

Thus

y(x) = y0 + ∫_{x0}^{x} f(x, y) dx.                                   (8.11)

This equation satisfies the initial condition (8.10), as

y(x0) = y0 + ∫_{x0}^{x0} f(x, y) dx = y0.

The value of y is replaced by y0 in the right hand side of (8.11) and let this solution
be y^{(1)}(x), the first approximation of y, i.e.,

y^{(1)}(x) = y0 + ∫_{x0}^{x} f(x, y0) dx.                            (8.12)

Again, the value of y^{(1)}(x) from (8.12) is substituted for y in (8.11) and the second
approximation y^{(2)}(x) is obtained as

y^{(2)}(x) = y0 + ∫_{x0}^{x} f(x, y^{(1)}) dx.                       (8.13)

In this way, the following approximations of y are generated:

y^{(3)}(x) = y0 + ∫_{x0}^{x} f(x, y^{(2)}) dx
y^{(4)}(x) = y0 + ∫_{x0}^{x} f(x, y^{(3)}) dx
· · · · · · · · · · · · · · · · · · · · · · ·
y^{(n)}(x) = y0 + ∫_{x0}^{x} f(x, y^{(n−1)}) dx.

Thus a sequence y (1) , y (2) , . . . , y (n) of y is generated in terms of x.


Note 8.2.1 If f (x, y) is continuous and has a bounded partial derivative fy (x, y) in
the neighbourhood of (x0 , y0 ), then in a certain interval containing the point x0 , the
sequence {y (i) } converges to the function y(x) which is the solution of the differential
equation (8.9) with initial condition (8.10).

Example 8.2.1 Use Picard's method to solve the differential equation y′ = x² + y
with initial condition y(0) = 0. Also, find the values of y(0.1) and y(0.2).

Solution. Let f(x, y) = x² + y, x0 = 0, y0 = 0.
Then

y^{(1)}(x) = y0 + ∫_{x0}^{x} f(x, y0) dx = 0 + ∫_0^x (x² + 0) dx = x³/3

y^{(2)}(x) = y0 + ∫_{x0}^{x} f(x, y^{(1)}(x)) dx = ∫_0^x (x² + x³/3) dx = x³/3 + x⁴/(3·4)

y^{(3)}(x) = y0 + ∫_{x0}^{x} f(x, y^{(2)}(x)) dx = ∫_0^x (x² + x³/3 + x⁴/(3·4)) dx
           = x³/3 + x⁴/(3·4) + x⁵/(3·4·5).

Similarly,

y^{(4)}(x) = x³/3 + x⁴/(3·4) + x⁵/(3·4·5) + x⁶/(3·4·5·6).

Now,

y(0.1) = (0.1)³/3 + (0.1)⁴/(3×4) + (0.1)⁵/(3×4×5) + (0.1)⁶/(3×4×5×6)
       = 3.41836 × 10^{−4}.
y(0.2) = (0.2)³/3 + (0.2)⁴/(3×4) + (0.2)⁵/(3×4×5) + (0.2)⁶/(3×4×5×6)
       = 2.80551 × 10^{−3}.
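Although Picard's method is essentially analytical, the iteration can also be imitated
numerically: store y^{(k)} on a grid and compute each new sweep by cumulative
trapezoidal quadrature. The following C sketch (ours; the grid size and the number of
sweeps are chosen arbitrarily) does this for y′ = x² + y, y(0) = 0:

/* Sketch: numerical Picard iteration for y'=x*x+y, y(0)=0 on
   [0,0.2]. Each sweep replaces y(k) by y0 + cumulative
   trapezoidal integral of f(x,y(k)). */
#include<stdio.h>
#define N 20

float f(float x, float y) { return(x*x+y); }

void main()
{
    float h=0.2/N,x[N+1],y[N+1],ynew[N+1];
    int i,k;
    for(i=0;i<=N;i++) { x[i]=i*h; y[i]=0; }  /* zeroth approximation */
    for(k=0;k<10;k++)                        /* ten Picard sweeps */
    {
        ynew[0]=0;
        for(i=1;i<=N;i++)                    /* cumulative trapezoid */
            ynew[i]=ynew[i-1]+h*(f(x[i-1],y[i-1])+f(x[i],y[i]))/2;
        for(i=0;i<=N;i++) y[i]=ynew[i];
    }
    printf("y(0.1)=%e y(0.2)=%e\n",y[N/2],y[N]);
}

The printed values approximate 3.41836 × 10^{−4} and 2.80551 × 10^{−3} obtained
above, up to the quadrature error of the grid.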

8.3 Euler’s Method

This is the most simple but crude method to solve a differential equation of the form

dy/dx = f(x, y),   y(x0) = y0.                                       (8.14)

Let x1 = x0 + h, where h is small. Then by Taylor's series

y1 = y(x0 + h) = y0 + h (dy/dx)_{x0} + (h²/2) (d²y/dx²)_{c1},
     where c1 lies between x0 and x1,
   = y0 + h f(x0, y0) + (h²/2) y″(c1).                               (8.15)

If the step size h is chosen small enough, then the second-order term may be neglected
and hence y1 is given by

y1 = y0 + h f(x0, y0).                                               (8.16)

Similarly,
y2 = y1 + h f(x1, y1)                                                (8.17)
y3 = y2 + h f(x2, y2)                                                (8.18)
and so on.

In general,

yn+1 = yn + h f(xn, yn),   n = 0, 1, 2, . . .                        (8.19)

This method is very slow. To get a reasonable accuracy with Euler's method, the
value of h should be taken small.
It may be noted that the Euler's method is a single-step explicit method.

Example 8.3.1 Find the values of y(0.1) and y(0.2) from the following differential
equation

dy/dx = x² + y² with y(0) = 1.

Solution. Let h = 0.05, x0 = 0, y0 = 1.
Then
x1 = x0 + h = 0.05
y1 = y(0.05) = y0 + h f(x0, y0) = 1 + 0.05 × (0 + 1) = 1.05
x2 = x1 + h = 0.1
y2 = y(0.1) = y1 + h f(x1, y1) = 1.05 + 0.05 × (0.05² + 1.05²) = 1.105250
x3 = x2 + h = 0.15
y3 = y(0.15) = y2 + h f(x2, y2) = 1.105250 + 0.05 × (0.1² + 1.105250²) = 1.166829
x4 = x3 + h = 0.2
y4 = y(0.2) = y3 + h f(x3, y3) = 1.166829 + 0.05 × (0.15² + 1.166829²) = 1.236029.

Hence y(0.1) = 1.105250, y(0.2) = 1.236029.

8.3.1 Geometrical interpretation of Euler’s method

The graphical representation of Euler’s method is shown in Figure 8.1.


Geometrically, the desired function curve (solution curve) is approximated by a poly-
gon train, where the direction of each part is determined by the value of the function
f (x, y) at its starting point.

Error

The local truncation error of Euler's method is O(h²); it follows obviously from (8.15).
The neglected term at each step is y″(ci) h²/2. Then at the end of the interval [x0, xn],
after n steps, the global error is

Σ_{i=1}^{n} y″(ci) h²/2 = n y″(c) h²/2 = [nh y″(c)/2] h = [(xn − x0) y″(c)/2] h = O(h).

Figure 8.1: Geometrical meaning of Euler's method. [Figure omitted: the solution
curve y = y(x) is approximated by line segments joining (x0, y0), (x1, y1), (x2, y2), . . . .]

Algorithm 8.1 (Euler’s method). This algorithm finds the solution of the equa-
tion y  = f (x, y) with y(x0 ) = y0 over the interval [x0 , xn ], by Euler’s method
yi+1 = yi + hf (xi , yi ), i = 0, 1, 2, . . . , n − 1.

Algorithm Euler
Input function f (x, y)
Read x0 , y0 , xn , h //x0 , y0 are the initial values and xn is the last value of x//
//where the process will terminate; h is the step size//
for x = x0 to xn step h do
y = y0 + h ∗ f (x, y0 );
Print x, y;
y0 = y;
endfor;
end Euler
Program 8.1.
/* Program Euler
Solution of a differential equation of the form y’=f(x,y),
y(x0)=y0 by Euler’s method. */
#include<stdio.h>
#include<math.h>
void main()
{
float x0,y0,xn,h,x,y;

float f(float x, float y);


printf("Enter the initial (x0) and final (xn) values of x ");
scanf("%f %f",&x0,&xn);
printf("Enter initial value of y ");
scanf("%f",&y0);
printf("Enter step length h ");
scanf("%f",&h);
printf(" x-value y-value\n");
for(x=x0;x<xn;x+=h)
{
y=y0+h*f(x,y0);
printf("%f %f \n",x+h,y);
y0=y;
}
} /* main */
/* definition of the function f(x,y) */
float f(float x, float y)
{
return(x*x+x*y+2);
}

A sample of input/output:

Enter the initial (x0) and final (xn) values of x


0 .2
Enter initial value of y 1
Enter step length h .05
x-value y-value
0.050000 1.100000
0.100000 1.202875
0.150000 1.309389
0.200000 1.420335

8.4 Modified Euler’s Method

In Euler’s method there is no scope to improve the value of y. The improvement can
be done using modified Euler’s method.
Now, consider the differential equation

dy/dx = f(x, y) with y(x0) = y0.                                     (8.20)

To obtain the solution at x1, integrate (8.20) over [x0, x1]. That is,

∫_{x0}^{x1} dy = ∫_{x0}^{x1} f(x, y) dx,

which gives

y1 = y0 + ∫_{x0}^{x1} f(x, y) dx.                                    (8.21)

The integration on the right hand side can be done using any numerical method. If the
trapezoidal rule is used with step size h (= x1 − x0) then the above integration becomes

y(x1) = y(x0) + (h/2)[f(x0, y(x0)) + f(x1, y(x1))].                  (8.22)

Note that the right hand side of (8.22) involves an unknown quantity y(x1). This
value can be determined by Euler's method. Let us denote this value by y^{(0)}(x1)
and the value obtained from (8.22) by y^{(1)}(x1). Then the resulting formula for
finding y1 is

y^{(1)}(x1) = y(x0) + (h/2)[f(x0, y(x0)) + f(x1, y^{(0)}(x1))].

That is,

y1^{(0)} = y0 + h f(x0, y0)
y1^{(1)} = y0 + (h/2)[f(x0, y0) + f(x1, y1^{(0)})].                  (8.23)

This is the first approximation of y1.
The second approximation is

y1^{(2)} = y0 + (h/2)[f(x0, y0) + f(x1, y1^{(1)})].                  (8.24)

The (k + 1)th approximate value of y1 is

y1^{(k+1)} = y0 + (h/2)[f(x0, y0) + f(x1, y1^{(k)})],  k = 0, 1, 2, . . . .   (8.25)

In general,

yi+1^{(0)} = yi + h f(xi, yi)
yi+1^{(k+1)} = yi + (h/2)[f(xi, yi) + f(xi+1, yi+1^{(k)})],          (8.26)
k = 0, 1, 2, . . . ;  i = 0, 1, 2, . . . .

The iterations are continued until two successive approximations yi+1^{(k)} and
yi+1^{(k+1)} coincide to the desired accuracy. The iterations converge rapidly for
sufficiently small spacing h.

Example 8.4.1 Determine the value of y when x = 0.1 and 0.2 given that
y(0) = 1 and y′ = x² − y.

Solution. Let h = 0.1, x0 = 0, y0 = 1, x1 = 0.1, x2 = 0.2 and f(x, y) = x² − y.

y1^{(0)} = y0 + h f(x0, y0) = 1 + 0.1 f(0, 1) = 1 + 0.1 × (0 − 1) = 0.9000.
y1^{(1)} = y0 + (h/2)[f(x0, y0) + f(x1, y1^{(0)})]
         = 1 + (0.1/2)[(0 − 1) + (0.1² − 0.9)] = 0.9055.
y1^{(2)} = y0 + (h/2)[f(x0, y0) + f(x1, y1^{(1)})]
         = 1 + (0.1/2)[(0 − 1) + (0.1² − 0.9055)] = 0.9052.
y1^{(3)} = y0 + (h/2)[f(x0, y0) + f(x1, y1^{(2)})]
         = 1 + (0.1/2)[(0 − 1) + (0.1² − 0.9052)] = 0.9052.

Therefore, y1 = y(0.1) = 0.9052.

y2^{(0)} = y1 + h f(x1, y1) = 0.9052 + 0.1 f(0.1, 0.9052)
         = 0.9052 + 0.1 × (0.1² − 0.9052) = 0.8157.
y2^{(1)} = y1 + (h/2)[f(x1, y1) + f(x2, y2^{(0)})]
         = 0.9052 + (0.1/2)[(0.1² − 0.9052) + (0.2² − 0.8157)] = 0.8217.
y2^{(2)} = y1 + (h/2)[f(x1, y1) + f(x2, y2^{(1)})]
         = 0.9052 + (0.1/2)[(0.1² − 0.9052) + (0.2² − 0.8217)] = 0.8214.
y2^{(3)} = y1 + (h/2)[f(x1, y1) + f(x2, y2^{(2)})]
         = 0.9052 + (0.1/2)[(0.1² − 0.9052) + (0.2² − 0.8214)] = 0.8214.

Hence, y2 = y(0.2) = 0.8214.

8.4.1 Geometrical interpretation of modified Euler’s method


Let T0 be the tangent at (x0 , y0 ) on the solution curve y = y(x), L1 is the line passing
(0) (0)
through (x1 , y1 ) of slope f (x1 , y1 ) shown in Figure 8.2. Then L is the line passes
Ordinary Differential Equations 523

(0) (0)
through C(x1 , y1 ) but with a slope equal to the average of f (x0 , y0 ) and f (x1 , y1 ).
The line L through (x0 , y0 ) and parallel to L is the approximate curve to find the
(1) (1)
improved value y1 . The ordinate of the point B is the approximate value y1 .
y
6 y = y(x) L
: L1
B L
T0
C (x1 , y1(0) )
A(x0 , y0 )
-x
O x0 x1

Figure 8.2: Geometrical meaning of modified Euler’s method.

Again, the formula (8.23) can be interpreted as follows, by writing it in the form

y1^{(1)} − y0 = (h/2)[f(x0, y0) + f(x1, y1^{(0)})].                  (8.27)

The first improvement, i.e., the right hand side of (8.27), is the area of the trapezoid
(see Figure 8.3) with the vertices (x0, 0), (x0, f(x0, y0)), (x1, f(x1, y1^{(0)})) and (x1, 0).

Figure 8.3: Geometrical interpretation of the incremented value of Euler's method.
[Figure omitted.]



Error

The error of the trapezoidal rule is −(h³/12) y‴(ci). So the local truncation error of
the modified Euler's method is O(h³).
After n steps, the accumulated error of this method is

Σ_{i=1}^{n} [ −(h³/12) y‴(ci) ] ≈ −[(xn − x0)/12] y‴(c) h² = O(h²).

Thus the global truncation error is O(h²).

Algorithm 8.2 (Modified Euler’s method). This algorithm solves the initial
value problem y  = f (x, y) with y(x0 ) = y0 over the interval [x0 , xn ] with step size h.
The formulae are given by
(0)
yi+1 = yi + hf (xi , yi )
(k) h (k−1)
yi+1 = yi + [f (xi , yi ) + f (xi+1 , yi+1 )], for k = 1, 2, . . .
2
Algorithm Modified Euler
Input function f (x, y);
Read x0 , xn , y0 , h; //initial and final values of x, initial value of y and step size h.//
Read ε; //ε is the error tolerance.//
Set y = y0 ;
for x = x0 to xn step h do
Compute f1 = f (x, y);
Compute yc = y + h ∗ f1 ; //evaluated from Euler’s method//
do
Set yp = yc ;
Compute yc = y + h2 [f1 + f (x + h, yp )] //modified Euler’s method//
while (|yp − yc | > ε) //check for accuracy//
Reset y = yc ;
Print x, y;
endfor;
end Modified Euler
Program 8.2.
/* Program Modified Euler
Solution of a differential equation of the form y’=f(x,y),
y(x0)=y0 by Modified Euler’s method. */
#include<stdio.h>
#include<math.h>

void main()
{
float x0,y0,xn,h,x,y;/*x0, xn the initial and final values of x*/
/* y0 initial value of y, h is the step length */
float eps=1e-5; /* the error tolerance */
float yc,yp,f1;
float f(float x, float y);
printf("Enter the initial (x0) and final (xn) values of x ");
scanf("%f %f",&x0,&xn);
printf("Enter initial value of y ");
scanf("%f",&y0);
printf("Enter step length h ");
scanf("%f",&h);
printf(" x-value y-value\n");
y=y0;
for(x=x0;x<xn;x+=h)
{
f1=f(x,y);
yc=y+h*f1; /* evaluated by Euler’s method */
do
{
yp=yc;
yc=y+h*(f1+f(x+h,yp))/2; /*modified Euler’s method*/
}while(fabs(yp-yc)>eps);
y=yc;
printf("%f %f\n",x+h,y);
}
} /* main */
/* definition of the function f(x,y) */
float f(float x, float y)
{
return(x*x-2*y+1);
}

A sample of input/output:

Enter the initial (x0) and final (xn) values of x


0 .5
Enter initial value of y 1
Enter step length h .1
526 Numerical Analysis

x-value y-value
0.100000 0.909546
0.200000 0.837355
0.300000 0.781926
0.400000 0.742029
0.500000 0.716660

8.5 Runge-Kutta Methods

The Euler’s method is less efficient in practical problems because if h is not sufficiently
small then this method gives inaccurate result.
The Runge-Kutta methods give more accurate results. One advantage of these methods
is that they require only values of the function at some selected points on the
subinterval; they are stable and easy to program.
The Runge-Kutta methods perform several function evaluations at each step and
avoid the computation of higher order derivatives. These methods can be constructed
for any order, i.e., second, third, fourth, fifth, etc. The fourth-order Runge-Kutta
method is more popular. These methods are single-step explicit methods.

8.5.1 Second-order Runge-Kutta method


The modified Euler's method to compute y1 is

y1 = y0 + (h/2)[f(x0, y0) + f(x1, y1^{(0)})].                        (8.28)

If y1^{(0)} = y0 + h f(x0, y0) is substituted in (8.28) then

y1 = y0 + (h/2)[f(x0, y0) + f(x0 + h, y0 + h f(x0, y0))].

Setting

k1 = h f(x0, y0) and
k2 = h f(x0 + h, y0 + h f(x0, y0)) = h f(x0 + h, y0 + k1),           (8.29)

the equation (8.28) becomes

y1 = y0 + (1/2)(k1 + k2).                                            (8.30)

This is known as the second-order Runge-Kutta formula. The local truncation error of
this formula is of O(h³).

General derivation

Assume that the solution is of the form

y1 = y0 + a k1 + b k2,                                               (8.31)

where

k1 = h f(x0, y0) and
k2 = h f(x0 + αh, y0 + βk1);   a, b, α and β are constants.

By Taylor's series,

y1 = y(x0 + h) = y0 + h y0′ + (h²/2) y0″ + (h³/6) y0‴ + · · ·
   = y0 + h f(x0, y0) + (h²/2)[ (∂f/∂x)_{(x0,y0)} + f(x0, y0)(∂f/∂y)_{(x0,y0)} ] + O(h³),

since df/dx = ∂f/∂x + f(x, y) ∂f/∂y. Also,

k2 = h f(x0 + αh, y0 + βk1)
   = h [ f(x0, y0) + αh (∂f/∂x)_{(x0,y0)} + βk1 (∂f/∂y)_{(x0,y0)} + O(h²) ]
   = h f(x0, y0) + αh² (∂f/∂x)_{(x0,y0)} + βh² f(x0, y0)(∂f/∂y)_{(x0,y0)} + O(h³).

Then the equation (8.31) becomes

y0 + h f(x0, y0) + (h²/2)[fx(x0, y0) + f(x0, y0) fy(x0, y0)] + O(h³)
 = y0 + (a + b) h f(x0, y0) + b h²[α fx(x0, y0) + β f(x0, y0) fy(x0, y0)] + O(h³).

The coefficients of f, fx and fy are compared and the following equations are obtained:

a + b = 1,   bα = 1/2 and bβ = 1/2.                                  (8.32)

Obviously, α = β, and if α is assigned any value arbitrarily, then the remaining
parameters can be determined uniquely. However, usually the parameters are chosen
as α = β = 1; then a = b = 1/2.
Thus the formula is

y1 = y0 + (1/2)(k1 + k2) + O(h³),                                    (8.33)

where k1 = h f(x0, y0) and k2 = h f(x0 + h, y0 + k1).
It follows that there are several second-order Runge-Kutta formulae and (8.33) is just
one among them.
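A C sketch of the formula (8.33) is given below (ours, in the style of the earlier
programs); it is set up for y′ = y² − x², y(0) = 2, the problem solved by hand in
Example 8.5.1 later in this section.

/* Sketch: second-order Runge-Kutta method (8.33). */
#include<stdio.h>

float f(float x, float y) { return(y*y-x*x); }

void main()
{
    float x=0,y=2,h=0.1,xn=0.2,k1,k2;
    while(x<xn)
    {
        k1=h*f(x,y);
        k2=h*f(x+h,y+k1);
        y+=(k1+k2)/2;
        x+=h;
        printf("%f %f\n",x,y);
    }
}

Its first step gives y(0.1) = 2.4875, matching the hand computation.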

8.5.2 Fourth-order Runge-Kutta Method


The fourth-order Runge-Kutta formula is
y1 = y0 + ak1 + bk2 + ck3 + dk4 , (8.34)
where k1 = hf (x0 , y0 )
k2 = hf (x0 + α0 h, y0 + β0 k1 )
k3 = hf (x0 + α1 h, y0 + β1 k1 + γ1 k2 ) (8.35)
and k4 = hf (x0 + α2 h, y0 + β2 k1 + γ2 k2 + δ1 k3 ).
The parameters a, b, c, d, α0 , β0 , α1 , β1 , γ1 , α2 , β2 , γ2 , δ1 are to be determined by ex-
panding both sides of (8.34) by Taylor’s series retaining the terms up to and including
those containing h4 . The choice of the parameters is arbitrary and depending on the
choice of parameters, several fourth-order Runge-Kutta formulae can be generated.
The above functions are expanded by using Taylor’s series method retaining terms
up to fourth order. The local truncation error is O(h5 ). Runge and Kutta obtained the
following system of equations.

$$\left.\begin{aligned}
\beta_0 &= \alpha_0\\
\beta_1 + \gamma_1 &= \alpha_1\\
\beta_2 + \gamma_2 + \delta_1 &= \alpha_2\\
a + b + c + d &= 1\\
b\alpha_0 + c\alpha_1 + d\alpha_2 &= 1/2\\
b\alpha_0^2 + c\alpha_1^2 + d\alpha_2^2 &= 1/3\\
b\alpha_0^3 + c\alpha_1^3 + d\alpha_2^3 &= 1/4\\
c\alpha_0\gamma_1 + d(\alpha_0\gamma_2 + \alpha_1\delta_1) &= 1/6\\
c\alpha_0\alpha_1\gamma_1 + d\alpha_2(\alpha_0\gamma_2 + \alpha_1\delta_1) &= 1/8\\
c\alpha_0^2\gamma_1 + d(\alpha_0^2\gamma_2 + \alpha_1^2\delta_1) &= 1/12\\
d\alpha_0\gamma_1\delta_1 &= 1/24.
\end{aligned}\right\} \tag{8.36}$$
The above system contains 11 equations in 13 unknowns, so two more conditions are required to solve it; these two conditions may be chosen arbitrarily. The most common choice is

$$\alpha_0 = \frac{1}{2}, \quad \beta_1 = 0.$$

With this choice, the solution of the system (8.36) is

$$\alpha_1 = \frac{1}{2}, \ \alpha_2 = 1, \ \beta_0 = \frac{1}{2}, \ \gamma_1 = \frac{1}{2}, \ \beta_2 = 0, \ \gamma_2 = 0, \ \delta_1 = 1, \quad a = \frac{1}{6}, \ b = c = \frac{1}{3}, \ d = \frac{1}{6}. \tag{8.37}$$

The values of these parameters are substituted in (8.34) and (8.35) and the fourth-order Runge-Kutta method is obtained as

$$y_1 = y_0 + \frac{1}{6}(k_1 + 2k_2 + 2k_3 + k_4) \tag{8.38}$$

where

$$\begin{aligned}
k_1 &= hf(x_0, y_0)\\
k_2 &= hf(x_0 + h/2, y_0 + k_1/2)\\
k_3 &= hf(x_0 + h/2, y_0 + k_2/2)\\
k_4 &= hf(x_0 + h, y_0 + k_3).
\end{aligned}$$

Starting with the initial point $(x_0, y_0)$, one can generate the sequence of solutions at $x_1, x_2, \ldots$ using the formula

$$y_{i+1} = y_i + \frac{1}{6}\big(k_1^{(i)} + 2k_2^{(i)} + 2k_3^{(i)} + k_4^{(i)}\big) \tag{8.39}$$

where

$$\begin{aligned}
k_1^{(i)} &= hf(x_i, y_i)\\
k_2^{(i)} &= hf(x_i + h/2, y_i + k_1^{(i)}/2)\\
k_3^{(i)} &= hf(x_i + h/2, y_i + k_2^{(i)}/2)\\
k_4^{(i)} &= hf(x_i + h, y_i + k_3^{(i)}).
\end{aligned}$$

Example 8.5.1 Given $y' = y^2 - x^2$, where $y(0) = 2$. Find $y(0.1)$ and $y(0.2)$ by the second-order Runge-Kutta method.

Solution. Here $h = 0.1$, $x_0 = 0$, $y_0 = 2$, $f(x, y) = y^2 - x^2$.
Then

$$\begin{aligned}
k_1 &= hf(x_0, y_0) = 0.1(2^2 - 0^2) = 0.4000,\\
k_2 &= hf(x_0 + h, y_0 + k_1) = 0.1 \times f(0 + 0.1,\ 2 + 0.4000) = 0.1 \times (2.4^2 - 0.1^2) = 0.5750.
\end{aligned}$$

Therefore, $y_1 = y_0 + \frac{1}{2}(k_1 + k_2) = 2 + \frac{1}{2}(0.4000 + 0.5750) = 2.4875$, i.e., $y(0.1) = 2.4875$.

To determine $y_2 = y(0.2)$, let $x_1 = 0.1$ and $y_1 = 2.4875$.

$$\begin{aligned}
k_1 &= hf(x_1, y_1) = 0.1 \times f(0.1, 2.4875) = 0.1 \times (2.4875^2 - 0.1^2) = 0.6178,\\
k_2 &= hf(x_1 + h, y_1 + k_1) = 0.1 \times f(0.2,\ 2.4875 + 0.6178) = 0.1 \times f(0.2, 3.1053) = 0.1 \times (3.1053^2 - 0.2^2) = 0.9603.
\end{aligned}$$

Therefore, $y_2 = y_1 + \frac{1}{2}(k_1 + k_2) = 2.4875 + \frac{1}{2}(0.6178 + 0.9603) = 3.2766$.
Hence, $y(0.2) = 3.2766$.

Example 8.5.2 Given $y' = x^2 + y^2$ with $x = 0$, $y = 1$. Find $y(0.1)$ by the fourth-order Runge-Kutta method.

Solution. Here $h = 0.1$, $x_0 = 0$, $y_0 = 1$, $f(x, y) = x^2 + y^2$.

$$\begin{aligned}
k_1 &= hf(x_0, y_0) = 0.1 \times (0^2 + 1^2) = 0.1000,\\
k_2 &= hf(x_0 + h/2, y_0 + k_1/2) = 0.1 \times f(0.05, 1.05) = 0.1 \times (0.05^2 + 1.05^2) = 0.1105,\\
k_3 &= hf(x_0 + h/2, y_0 + k_2/2) = 0.1 \times f(0.05, 1.0553) = 0.1 \times (0.05^2 + 1.0553^2) = 0.1116,\\
k_4 &= hf(x_0 + h, y_0 + k_3) = 0.1 \times f(0.1, 1.1116) = 0.1 \times (0.1^2 + 1.1116^2) = 0.1246.
\end{aligned}$$

Therefore,
$$y_1 = y_0 + \frac{1}{6}(k_1 + 2k_2 + 2k_3 + k_4) = 1 + \frac{1}{6}(0.1000 + 2 \times 0.1105 + 2 \times 0.1116 + 0.1246) = 1.1115.$$

Note 8.5.1 The Runge-Kutta method gives better results, though it has some disadvantages. It uses numerous function evaluations to find $y_{i+1}$; when the function $f(x, y)$ has a complicated analytic form, the Runge-Kutta method is very laborious.

Geometrical interpretation of $k_1, k_2, k_3, k_4$
Let $ABC$ be the solution curve (Figure 8.4) and $B$ the point on the curve at the ordinate $x_i + h/2$, the middle point of the interval $[x_i, x_i + h]$. Let $AL_1T_1$ be the tangent drawn at $A$, making an angle $\theta_1$ with the horizontal line $AT_0$; $L_1$ and $T_1$ are its points of intersection with the ordinates $BD$ and $CC_0$. Then the number $k_1$ is the approximate slope (within the factor $h$) of the tangent at $A$ of the solution curve $ABC$, i.e., $k_1 = hy_i' = hf(x_i, y_i)$. The coordinates of $L_1$ are $(x_i + h/2, y_i + k_1/2)$. The number $k_2$ is the approximate slope (within the factor $h$) of the tangent drawn to the curve $ABC$ at $L_1$. A straight line $AL_2$ is drawn parallel to the line segment $L_1T_2$; the coordinates of $L_2$ are $(x_i + h/2, y_i + k_2/2)$. The number $k_3$ is the approximate slope (within the factor $h$) of the tangent to the curve $ABC$ at the point $L_2$. Finally, a straight line is drawn through $A$ parallel to $L_2T_3$, which cuts the extension of the ordinate $C_0T_4$ at $T_4$. The coordinates of $T_4$ are $(x_i + h, y_i + k_3)$. Then $k_4$ is the approximate slope (within the factor $h$) of the tangent drawn to the curve $ABC$ at $T_4$.
[Figure 8.4: Interpretation of $k_1, k_2, k_3, k_4$: the solution curve $y = y(x)$ through the points $A$, $B$, $C$ over $[x_i, x_{i+1}]$, with the tangent directions of slopes $\theta_1, \theta_2, \theta_3, \theta_4$ at $A$, $L_1$, $L_2$, $T_4$.]

Error
The fourth-order Runge-Kutta formula can be written as

$$y_1 = y_0 + \frac{1}{6}\Big[k_1 + 4\cdot\frac{k_2 + k_3}{2} + k_4\Big].$$

This is similar to Simpson's formula with step size $h/2$ (the middle ordinate being the average of $k_2$ and $k_3$), so the local truncation error of this formula is $-\dfrac{h^5}{2880}y^{iv}(c_1)$, i.e., of $O(h^5)$, and after $n$ steps the accumulated error is

$$-\sum_{i=1}^{n}\frac{h^5}{2880}y^{iv}(c_i) = -\frac{x_n - x_0}{5760}y^{iv}(c)h^4 = O(h^4).$$

Algorithm 8.3 (Fourth-order Runge-Kutta method). This algorithm finds the solution of the differential equation $y' = f(x, y)$ with $y(x_0) = y_0$ using the fourth-order Runge-Kutta method, i.e., using the formula

$$y_{i+1} = y_i + \frac{1}{6}[k_1 + 2(k_2 + k_3) + k_4]$$

within the interval $[x_1, x_n]$ at step $h$.

Algorithm RK4
Input function f (x, y);
Read x0 , xn , y0 , h; //initial and final value of x, initial value of y and step size.//
Set y = y0 ;
for x = x0 to xn step h do
Compute k1 = h ∗ f (x, y);
Compute k2 = h ∗ f (x + h/2, y + k1 /2);
Compute k3 = h ∗ f (x + h/2, y + k2 /2);
Compute k4 = h ∗ f (x + h, y + k3 );
Compute y = y + [k1 + 2(k2 + k3 ) + k4 ]/6;
Print x, y;
endfor;
end RK4
Program 8.3.
/* Program Fourth Order Runge-Kutta
Solution of a differential equation of the form y’=f(x,y),
y(x0)=y0 by fourth order Runge-Kutta method. */
#include<stdio.h>
#include<math.h>
void main()
{
float x0,y0,xn,h,x,y,k1,k2,k3,k4;
float f(float x, float y);
printf("Enter the initial values of x and y ");
scanf("%f %f",&x0,&y0);
printf("Enter last value of x ");
scanf("%f",&xn);
printf("Enter step length h ");
scanf("%f",&h);
y=y0;
printf(" x-value y-value\n");

for(x=x0;x<xn;x+=h)
{
k1=h*f(x,y);
k2=h*f(x+h/2,y+k1/2);
k3=h*f(x+h/2,y+k2/2);
k4=h*f(x+h,y+k3);
y=y+(k1+2*(k2+k3)+k4)/6;
printf("%f %f\n",x+h,y);

}
} /* main */
/* definition of the function f(x,y) */
float f(float x, float y)
{
return(x*x-y*y+y);
}

A sample of input/output:

Enter the initial values of x and y 0 2


Enter last value of x 0.5
Enter step length h 0.1
x-value y-value
0.100000 1.826528
0.200000 1.695464
0.300000 1.595978
0.400000 1.521567
0.500000 1.468221

8.5.3 Runge-Kutta method for a pair of equations

The Runge-Kutta methods may also be used to solve a pair of first order differential
equations.
Consider a pair of first-order differential equations

$$\frac{dy}{dx} = f(x, y, z), \qquad \frac{dz}{dx} = g(x, y, z) \tag{8.40}$$

with initial conditions

$$x = x_0, \quad y(x_0) = y_0, \quad z(x_0) = z_0. \tag{8.41}$$

Then the values of $y_i$ and $z_i$ at $x_i$ are obtained by the formulae

$$y_{i+1} = y_i + \frac{1}{6}\big[k_1^{(i)} + 2k_2^{(i)} + 2k_3^{(i)} + k_4^{(i)}\big], \qquad
z_{i+1} = z_i + \frac{1}{6}\big[l_1^{(i)} + 2l_2^{(i)} + 2l_3^{(i)} + l_4^{(i)}\big], \tag{8.42}$$

where

$$\begin{aligned}
k_1^{(i)} &= hf(x_i, y_i, z_i)\\
l_1^{(i)} &= hg(x_i, y_i, z_i)\\
k_2^{(i)} &= hf(x_i + h/2, y_i + k_1^{(i)}/2, z_i + l_1^{(i)}/2)\\
l_2^{(i)} &= hg(x_i + h/2, y_i + k_1^{(i)}/2, z_i + l_1^{(i)}/2)\\
k_3^{(i)} &= hf(x_i + h/2, y_i + k_2^{(i)}/2, z_i + l_2^{(i)}/2)\\
l_3^{(i)} &= hg(x_i + h/2, y_i + k_2^{(i)}/2, z_i + l_2^{(i)}/2)\\
k_4^{(i)} &= hf(x_i + h, y_i + k_3^{(i)}, z_i + l_3^{(i)})\\
l_4^{(i)} &= hg(x_i + h, y_i + k_3^{(i)}, z_i + l_3^{(i)}).
\end{aligned} \tag{8.43}$$

Example 8.5.3 Solve the pair of differential equations
$$\frac{dy}{dx} = \frac{x+y}{z} \quad \text{and} \quad \frac{dz}{dx} = xy + z$$
with initial conditions $x_0 = 0.5$, $y_0 = 1.5$, $z_0 = 1$, for $x = 0.6$.

Solution. Let $h = 0.1$ and calculate the values of $k_1, k_2, k_3, k_4$ and $l_1, l_2, l_3, l_4$.
Here $f(x, y, z) = \dfrac{x+y}{z}$, $g(x, y, z) = xy + z$.

$$\begin{aligned}
k_1 &= hf(x_0, y_0, z_0) = 0.1 \times \frac{0.5 + 1.5}{1} = 0.2\\
l_1 &= hg(x_0, y_0, z_0) = 0.1 \times (0.5 \times 1.5 + 1) = 0.175\\
k_2 &= hf(x_0 + h/2, y_0 + k_1/2, z_0 + l_1/2) = 0.1 \times \frac{0.55 + 1.6}{1.0875} = 0.197701\\
l_2 &= hg(x_0 + h/2, y_0 + k_1/2, z_0 + l_1/2) = 0.1 \times (0.55 \times 1.6 + 1.0875) = 0.19675\\
k_3 &= hf(x_0 + h/2, y_0 + k_2/2, z_0 + l_2/2) = 0.1 \times \frac{0.55 + 1.59885}{1.098375} = 0.195639\\
l_3 &= hg(x_0 + h/2, y_0 + k_2/2, z_0 + l_2/2) = 0.1 \times (0.55 \times 1.59885 + 1.098375) = 0.197774\\
k_4 &= hf(x_0 + h, y_0 + k_3, z_0 + l_3) = 0.1 \times \frac{0.6 + 1.695639}{1.197774} = 0.191659\\
l_4 &= hg(x_0 + h, y_0 + k_3, z_0 + l_3) = 0.1 \times (0.6 \times 1.695639 + 1.197774) = 0.221516.
\end{aligned}$$

Hence,
$$y(0.6) = y_1 = y_0 + \frac{1}{6}[k_1 + 2(k_2 + k_3) + k_4] = 1.5 + \frac{1}{6}[0.2 + 2(0.197701 + 0.195639) + 0.191659] = 1.696390,$$
$$z(0.6) = z_1 = z_0 + \frac{1}{6}[l_1 + 2(l_2 + l_3) + l_4] = 1.0 + \frac{1}{6}[0.175 + 2(0.19675 + 0.197774) + 0.221516] = 1.197594.$$

Algorithm 8.4 (Runge-Kutta method for a pair of equations). A pair of first order differential equations of the form $y' = f(x, y, z)$, $z' = g(x, y, z)$ with initial conditions $x = x_0$, $y(x_0) = y_0$ and $z(x_0) = z_0$ can be solved by this algorithm using the fourth-order Runge-Kutta method. The formulae are

$$y_{i+1} = y_i + \frac{1}{6}[k_1 + 2(k_2 + k_3) + k_4], \qquad
z_{i+1} = z_i + \frac{1}{6}[l_1 + 2(l_2 + l_3) + l_4]$$

where

k1 = hf (xi , yi , zi )
l1 = hg(xi , yi , zi )
k2 = hf (xi + h/2, yi + k1 /2, zi + l1 /2)
l2 = hg(xi + h/2, yi + k1 /2, zi + l1 /2)
k3 = hf (xi + h/2, yi + k2 /2, zi + l2 /2)
l3 = hg(xi + h/2, yi + k2 /2, zi + l2 /2)
k4 = hf (xi + h, yi + k3 , zi + l3 )
l4 = hg(xi + h, yi + k3 , zi + l3 ).

Algorithm RK4 Pair


Input functions f(x, y, z) and g(x, y, z);
Read x0 , y0 , z0 , h, xn ; //initial values of x, y, z; step size and final value of x.//
Set y = y0 ;
Set z = z0 ;
for x = x0 to xn step h do
Compute the following
k1 = hf (x, y, z);
l1 = hg(x, y, z);
k2 = hf (x + h/2, y + k1 /2, z + l1 /2);
l2 = hg(x + h/2, y + k1 /2, z + l1 /2);
k3 = hf (x + h/2, y + k2 /2, z + l2 /2);
l3 = hg(x + h/2, y + k2 /2, z + l2 /2);
k4 = hf (x + h, y + k3 , z + l3 );
l4 = hg(x + h, y + k3 , z + l3 );

y = y + [k1 + 2(k2 + k3 ) + k4 ]/6;


z = z + [l1 + 2(l2 + l3 ) + l4 ]/6;
Print x, y, z;
endfor;
end RK4 Pair

Program 8.4 .
/* Program Runge-Kutta (for Pair of Equations)
Solution of a differential equation of the form y’=f(x,y,z),
z’=g(x,y,z) with x=x0, y(x0)=y0 and z(x0)=z0 by fourth order
Runge-Kutta method.
Here the equations are taken as y’=y+2z, z’=3y+2z with
y(0)=6, z(0)=4. */
#include<stdio.h>
#include<math.h>
void main()
{
float x0,y0,z0,xn,h,x,y,z,k1,k2,k3,k4,l1,l2,l3,l4;
float f(float x, float y, float z);
float g(float x, float y, float z);
printf("Enter the initial values of x, y and z ");
scanf("%f %f %f",&x0,&y0,&z0);
printf("Enter last value of x ");
scanf("%f",&xn);
printf("Enter step length h ");
scanf("%f",&h);
y=y0;
z=z0;
printf("x-value y-value z-value\n");
for(x=x0;x<xn;x+=h)
{
k1=h*f(x,y,z);
l1=h*g(x,y,z);
k2=h*f(x+h/2,y+k1/2,z+l1/2);
l2=h*g(x+h/2,y+k1/2,z+l1/2);
k3=h*f(x+h/2,y+k2/2,z+l2/2);
l3=h*g(x+h/2,y+k2/2,z+l2/2);
k4=h*f(x+h,y+k3,z+l3);
l4=h*g(x+h,y+k3,z+l3);

y=y+(k1+2*(k2+k3)+k4)/6;
z=z+(l1+2*(l2+l3)+l4)/6;
printf("%f %f %f\n",x+h,y,z);
}
} /* main */

/* definition of the function f(x,y,z) */


float f(float x, float y, float z)
{
return(y+2*z);
}
/* definition of the function g(x,y,z) */
float g(float x, float y, float z)
{
return(3*y+2*z);
}
A sample of input/output:
Enter the initial values of x, y and z
0 6 4
Enter last value of x 0.4
Enter step length h 0.1
x-value y-value z-value
0.100000 7.776608 7.140725
0.200000 10.538535 11.714149
0.300000 14.759665 18.435406
0.400000 21.147917 28.370275

8.5.4 Runge-Kutta method for a system of equations


Let

$$\begin{aligned}
\frac{dy_1}{dx} &= f_1(x, y_1, y_2, \ldots, y_n)\\
\frac{dy_2}{dx} &= f_2(x, y_1, y_2, \ldots, y_n)\\
&\cdots\cdots\cdots\cdots\cdots\\
\frac{dy_n}{dx} &= f_n(x, y_1, y_2, \ldots, y_n)
\end{aligned} \tag{8.44}$$

with initial conditions

$$y_1(x_0) = y_{10}, \ y_2(x_0) = y_{20}, \ \ldots, \ y_n(x_0) = y_{n0} \tag{8.45}$$

be a system of first order differential equations, where $y_1, y_2, \ldots, y_n$ are $n$ dependent variables and $x$ is the only independent variable.

The above system can be written in vector form as

$$\frac{d\mathbf{y}}{dx} = \mathbf{f}(x, y_1, y_2, \ldots, y_n) \quad \text{with} \quad \mathbf{y}(x_0) = \mathbf{y}_0. \tag{8.46}$$

The fourth-order Runge-Kutta method for the above system is

$$\mathbf{y}_{j+1} = \mathbf{y}_j + \frac{1}{6}[\mathbf{k}_1 + 2\mathbf{k}_2 + 2\mathbf{k}_3 + \mathbf{k}_4] \tag{8.47}$$

where

$$\mathbf{k}_1 = \begin{pmatrix} k_{11}\\ k_{21}\\ \vdots\\ k_{n1} \end{pmatrix}, \quad
\mathbf{k}_2 = \begin{pmatrix} k_{12}\\ k_{22}\\ \vdots\\ k_{n2} \end{pmatrix}, \quad
\mathbf{k}_3 = \begin{pmatrix} k_{13}\\ k_{23}\\ \vdots\\ k_{n3} \end{pmatrix}, \quad
\mathbf{k}_4 = \begin{pmatrix} k_{14}\\ k_{24}\\ \vdots\\ k_{n4} \end{pmatrix} \tag{8.48}$$

and

$$\begin{aligned}
k_{i1} &= hf_i(x_j, y_{1j}, y_{2j}, \ldots, y_{nj})\\
k_{i2} &= hf_i(x_j + h/2, y_{1j} + k_{11}/2, y_{2j} + k_{21}/2, \ldots, y_{nj} + k_{n1}/2)\\
k_{i3} &= hf_i(x_j + h/2, y_{1j} + k_{12}/2, y_{2j} + k_{22}/2, \ldots, y_{nj} + k_{n2}/2)\\
k_{i4} &= hf_i(x_j + h, y_{1j} + k_{13}, y_{2j} + k_{23}, \ldots, y_{nj} + k_{n3})
\end{aligned}$$

for $i = 1, 2, \ldots, n$.

It may be noted that $y_{kj}$ is the value of the $k$th dependent variable $y_k$ evaluated at $x_j$. The above formula in explicit vector notation is

$$\begin{pmatrix} y_{1,j+1}\\ y_{2,j+1}\\ \vdots\\ y_{n,j+1} \end{pmatrix} =
\begin{pmatrix} y_{1j}\\ y_{2j}\\ \vdots\\ y_{nj} \end{pmatrix} + \frac{1}{6}\left[
\begin{pmatrix} k_{11}\\ k_{21}\\ \vdots\\ k_{n1} \end{pmatrix} +
2\begin{pmatrix} k_{12}\\ k_{22}\\ \vdots\\ k_{n2} \end{pmatrix} +
2\begin{pmatrix} k_{13}\\ k_{23}\\ \vdots\\ k_{n3} \end{pmatrix} +
\begin{pmatrix} k_{14}\\ k_{24}\\ \vdots\\ k_{n4} \end{pmatrix}\right]. \tag{8.49}$$
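The formulas (8.47)-(8.49) carry over to code by treating $\mathbf{y}$ and the four $\mathbf{k}$-vectors as arrays of length $n$. The following C sketch of one step is illustrative: the names fsys and rk4sys are assumptions, and the two-equation test system is the one solved by Program 8.4 below, rewritten in the notation of (8.44).

#define N 2   /* number of equations in the system */
/* right-hand sides f_i of (8.44); here y1'=y1+2*y2, y2'=3*y1+2*y2 */
void fsys(float x, float y[], float dy[])
{
    dy[0]=y[0]+2*y[1];
    dy[1]=3*y[0]+2*y[1];
}
/* one fourth-order Runge-Kutta step (8.47) for the whole system */
void rk4sys(float x, float y[], float h)
{
    float k1[N],k2[N],k3[N],k4[N],t[N],d[N];
    int i;
    fsys(x,y,d);
    for(i=0;i<N;i++) { k1[i]=h*d[i]; t[i]=y[i]+k1[i]/2; }
    fsys(x+h/2,t,d);
    for(i=0;i<N;i++) { k2[i]=h*d[i]; t[i]=y[i]+k2[i]/2; }
    fsys(x+h/2,t,d);
    for(i=0;i<N;i++) { k3[i]=h*d[i]; t[i]=y[i]+k3[i]; }
    fsys(x+h,t,d);
    for(i=0;i<N;i++) { k4[i]=h*d[i]; y[i]+=(k1[i]+2*(k2[i]+k3[i])+k4[i])/6; }
}

Called repeatedly from a driver such as that of Program 8.4, starting from y[0]=6, y[1]=4 at x=0 with h=0.1, this reproduces the output of Program 8.4.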

8.5.5 Runge-Kutta method for second order differential equation


Let the second order differential equation be

$$a(x)y'' + b(x)y' + c(x)y = f(x). \tag{8.50}$$

This equation can be written as

$$y''(x) = g(x, y(x), y'(x)) \tag{8.51}$$

with initial conditions

$$x = x_0, \quad y(x_0) = y_0, \quad y'(x_0) = z_0. \tag{8.52}$$

This second order differential equation can be converted to a pair of first order differential equations by substituting $y'(x) = z(x)$. Then $y''(x) = z'(x)$, and the equation (8.51) becomes

$$\frac{dy}{dx} = z, \qquad \frac{dz}{dx} = g(x, y, z) \tag{8.53}$$

with initial conditions

$$x = x_0, \quad y(x_0) = y_0, \quad z(x_0) = z_0. \tag{8.54}$$

Now, the Runge-Kutta methods may be used to solve the equations (8.53) with the initial conditions (8.54). This system of equations generates two sequences $\{y_i\}$ and $\{z_i\}$; the first sequence is the solution of (8.51). It may be noted that the value of $z$ represents the derivative of $y$ at the same point.
Example 8.5.4 Find the value of $y(0.1)$ for the following second order differential equation by the fourth-order Runge-Kutta method:
$$2y''(x) - 5y'(x) - 3y(x) = 45e^{2x} \quad \text{with} \quad y(0) = 2, \ y'(0) = 1.$$

Solution. Substituting $y' = z$, the given equation reduces to $2z' - 5z - 3y = 45e^{2x}$, or
$$z' = \frac{1}{2}(5z + 3y + 45e^{2x}) = g(x, y, z), \text{ (say)}.$$

Here $x_0 = 0$, $y_0 = 2$, $z_0 = 1$, $h = 0.1$.

$$\begin{aligned}
k_1 &= h \times z_0 = 0.1\\
l_1 &= hg(x_0, y_0, z_0) = 0.1 \times g(0, 2, 1) = 2.80\\
k_2 &= h \times (z_0 + l_1/2) = 0.1 \times 2.4 = 0.24\\
l_2 &= hg(x_0 + h/2, y_0 + k_1/2, z_0 + l_1/2) = 0.1 \times g(0.05, 2.05, 2.4) = 3.394135\\
k_3 &= h \times (z_0 + l_2/2) = 0.1 \times 2.697067 = 0.269707\\
l_3 &= hg(x_0 + h/2, y_0 + k_2/2, z_0 + l_2/2) = 0.1 \times g(0.05, 2.12, 2.697067) = 3.478901\\
k_4 &= h \times (z_0 + l_3) = 0.1 \times 4.478901 = 0.447890\\
l_4 &= hg(x_0 + h, y_0 + k_3, z_0 + l_3) = 0.1 \times g(0.1, 2.269707, 4.478901) = 4.208338.
\end{aligned}$$

Therefore,
$$y_1 = y_0 + \tfrac{1}{6}[k_1 + 2(k_2 + k_3) + k_4] = 2 + \tfrac{1}{6}[0.1 + 2(0.24 + 0.269707) + 0.447890] = 2.261217,$$
$$z_1 = z_0 + \tfrac{1}{6}[l_1 + 2(l_2 + l_3) + l_4] = 1 + \tfrac{1}{6}[2.80 + 2(3.394135 + 3.478901) + 4.208338] = 4.459068.$$

The required value of $y(0.1)$ is 2.261217. In addition, $y'(0.1)$ is 4.459068.
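Since the reduction turns the second order equation into the pair (8.53), Program 8.4 can be reused unchanged for such problems; only the two function definitions need replacing. For Example 8.5.4 they would read as follows (a sketch; exp is supplied by math.h, which that program already includes):

/* f and g for Example 8.5.4: y'=z and z'=(5z+3y+45e^(2x))/2 */
float f(float x, float y, float z)
{
    return(z);
}
float g(float x, float y, float z)
{
    return((5*z+3*y+45*exp(2*x))/2);
}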

8.5.6 Runge-Kutta-Fehlberg method


Runge-Kutta methods have become very popular both as computational techniques and as a topic for research, and many variants of them are available. Some of them are RK-Butcher, RK-Fehlberg, RK-Merson, RK-Centroidal mean, RK-arithmetic mean, etc. The Runge-Kutta-Fehlberg method obtains better accuracy with an error estimate: at each step two different approximations for the solution, of fourth and fifth order, are calculated and compared. If the two approximations are close then the approximation is accepted; if not, the step size is reduced.
In each step the following six values are required.

$$\begin{aligned}
k_1 &= hf(x_i, y_i)\\
k_2 &= hf\Big(x_i + \frac{h}{4},\ y_i + \frac{k_1}{4}\Big)\\
k_3 &= hf\Big(x_i + \frac{3h}{8},\ y_i + \frac{3}{32}k_1 + \frac{9}{32}k_2\Big)\\
k_4 &= hf\Big(x_i + \frac{12}{13}h,\ y_i + \frac{1932}{2197}k_1 - \frac{7200}{2197}k_2 + \frac{7296}{2197}k_3\Big)\\
k_5 &= hf\Big(x_i + h,\ y_i + \frac{439}{216}k_1 - 8k_2 + \frac{3680}{513}k_3 - \frac{845}{4104}k_4\Big)\\
k_6 &= hf\Big(x_i + \frac{h}{2},\ y_i - \frac{8}{27}k_1 + 2k_2 - \frac{3544}{2565}k_3 + \frac{1859}{4104}k_4 - \frac{11}{40}k_5\Big).
\end{aligned} \tag{8.55}$$

Then an approximation using the fourth-order Runge-Kutta method is

$$y_{i+1} = y_i + \frac{25}{216}k_1 + \frac{1408}{2565}k_3 + \frac{2197}{4104}k_4 - \frac{1}{5}k_5. \tag{8.56}$$

It may be noted that the value of $k_2$ is not used in the above formula. The other value of $y$ is determined by the fifth-order Runge-Kutta method as follows:

$$y_{i+1}^{*} = y_i + \frac{16}{135}k_1 + \frac{6656}{12825}k_3 + \frac{28561}{56430}k_4 - \frac{9}{50}k_5 + \frac{2}{55}k_6. \tag{8.57}$$

If $|y_{i+1} - y_{i+1}^{*}|$ is small enough then the step is accepted; otherwise, the computation is repeated with a reduced step size $h$. The local truncation error of this method is estimated by $y_{i+1} - y_{i+1}^{*}$.
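One Runge-Kutta-Fehlberg step implementing (8.55)-(8.57) may be sketched in C as below. The function name rkf_step, the test function (that of Example 8.5.2) and the single trial step in main are illustrative assumptions; practical codes adjust $h$ more gradually than simple halving.

#include<stdio.h>
float f(float x, float y)
{
    return(x*x+y*y);
}
/* one RKF step: the fourth-order value (8.56) is returned through *y4
   and the fifth-order value (8.57) through *y5 */
void rkf_step(float x, float y, float h, float *y4, float *y5)
{
    float k1,k2,k3,k4,k5,k6;
    k1=h*f(x,y);
    k2=h*f(x+h/4, y+k1/4);
    k3=h*f(x+3*h/8, y+3*k1/32+9*k2/32);
    k4=h*f(x+12*h/13, y+1932.0*k1/2197-7200.0*k2/2197+7296.0*k3/2197);
    k5=h*f(x+h, y+439.0*k1/216-8*k2+3680.0*k3/513-845.0*k4/4104);
    k6=h*f(x+h/2, y-8.0*k1/27+2*k2-3544.0*k3/2565+1859.0*k4/4104-11.0*k5/40);
    *y4=y+25.0*k1/216+1408.0*k3/2565+2197.0*k4/4104-k5/5;
    *y5=y+16.0*k1/135+6656.0*k3/12825+28561.0*k4/56430-9.0*k5/50+2.0*k6/55;
}
void main()
{
    float y4,y5;
    rkf_step(0,1,0.1,&y4,&y5);        /* one trial step from x=0, y=1 */
    printf("%f %f %e\n",y4,y5,y4-y5); /* y4-y5 estimates the error */
    /* if |y4-y5| exceeds the tolerance, repeat with h/2 */
}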

8.5.7 Runge-Kutta-Butcher method

The RK-Butcher method is normally considered to be sixth order since it requires six function evaluations (it looks like a sixth-order method, but it is a fifth-order method only), and in practice the working order is closer to five [see J.C. Butcher, The Numerical Analysis of Ordinary Differential Equations: Runge-Kutta and General Linear Methods, (Chichester, John Wiley), 1987]. In this method, a pair of expressions is developed to determine $y$ as follows:

$$y_{i+1} = y_i + \frac{1}{90}(7k_1 + 32k_3 + 12k_4 + 32k_5 + 7k_6)$$
$$\text{and} \quad y_{i+1}^{*} = y_i + \frac{1}{6}(k_1 + 4k_4 + k_6), \tag{8.58}$$

where

$$\begin{aligned}
k_1 &= hf(x_i, y_i)\\
k_2 &= hf(x_i + h/4, y_i + k_1/4)\\
k_3 &= hf(x_i + h/4, y_i + k_1/8 + k_2/8)\\
k_4 &= hf(x_i + h/2, y_i - k_1/2 + k_3)\\
k_5 &= hf(x_i + 3h/4, y_i + 3k_1/16 + 9k_4/16)\\
k_6 &= hf(x_i + h, y_i - 3k_1/7 + 2k_2/7 + 12k_3/7 - 12k_4/7 + 8k_5/7).
\end{aligned} \tag{8.59}$$

The method will terminate when $|y_{i+1} - y_{i+1}^{*}|$ is small enough. The local truncation error of this method is estimated by $y_{i+1} - y_{i+1}^{*}$.
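A corresponding sketch of one RK-Butcher step of (8.58)-(8.59) follows; the function name is again an assumption, and the routine is called exactly like rkf_step above, using the same global f(x,y).

/* one RK-Butcher step: the first value of (8.58) is returned
   through *yb and the companion estimate through *ys */
void rkb_step(float x, float y, float h, float *yb, float *ys)
{
    float k1,k2,k3,k4,k5,k6;
    k1=h*f(x,y);
    k2=h*f(x+h/4, y+k1/4);
    k3=h*f(x+h/4, y+k1/8+k2/8);
    k4=h*f(x+h/2, y-k1/2+k3);
    k5=h*f(x+3*h/4, y+3*k1/16+9*k4/16);
    k6=h*f(x+h, y-3.0*k1/7+2.0*k2/7+12.0*k3/7-12.0*k4/7+8.0*k5/7);
    *yb=y+(7*k1+32*k3+12*k4+32*k5+7*k6)/90;
    *ys=y+(k1+4*k4+k6)/6;
}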

8.6 Predictor-Corrector Methods

The methods, viz., Taylor's series, Picard's, Euler's and Runge-Kutta, are single-step methods, as they use only one previous value to compute the successive value, i.e., only $y_i$ is used to compute $y_{i+1}$. Certain efficient methods are available which need some more previous values; these are called multistep methods. A general $k$-step method needs $y_{i-k+1}, \ldots, y_{i-1}$ and $y_i$ to compute $y_{i+1}$. A predictor-corrector method is a combination of two formulae: the first formula (called the predictor) finds an approximate value of $y_{i+1}$, and the second formula (called the corrector) improves this value. The commonly used predictor-corrector multistep methods are due to Adams-Bashforth-Moulton and Milne-Simpson. These two methods are discussed below.

8.6.1 Adams-Bashforth-Moulton methods


This is a fourth-order multistep method and it needs four values (xi−3 , yi−3 ), (xi−2 , yi−2 ),
(xi−1 , yi−1 ) and (xi , yi ) to compute yi+1 . These values are called starting values of this
method and they may be determined by using any single step method such as Euler,
Runge-Kutta, etc.
Let us consider a differential equation
$$\frac{dy}{dx} = f(x, y) \quad \text{with initial conditions} \quad x = x_0, \ y(x_0) = y_0. \tag{8.60}$$

This differential equation is integrated between $x_i$ and $x_{i+1}$ to obtain the equation

$$y_{i+1} = y_i + \int_{x_i}^{x_{i+1}} f(x, y)\, dx. \tag{8.61}$$

Now, Newton's backward interpolation formula is used for $y'$ as

$$y' = y_i' + v\nabla y_i' + \frac{v(v+1)}{2!}\nabla^2 y_i' + \frac{v(v+1)(v+2)}{3!}\nabla^3 y_i',$$

where $v = \dfrac{x - x_i}{h}$. After simplification, it reduces to

$$y' = y_i' + v\nabla y_i' + \frac{v^2 + v}{2}\nabla^2 y_i' + \frac{v^3 + 3v^2 + 2v}{6}\nabla^3 y_i'.$$

Since $y' = f(x, y)$, this value is substituted in (8.61) for $f(x, y)$. Then

$$\begin{aligned}
y_{i+1} &= y_i + h\int_0^1\Big[f_i + v\nabla f_i + \frac{v^2 + v}{2}\nabla^2 f_i + \frac{v^3 + 3v^2 + 2v}{6}\nabla^3 f_i\Big] dv\\
&= y_i + h\Big[f_i + \frac{1}{2}\nabla f_i + \frac{5}{12}\nabla^2 f_i + \frac{3}{8}\nabla^3 f_i\Big]\\
&= y_i + \frac{h}{24}(-9f_{i-3} + 37f_{i-2} - 59f_{i-1} + 55f_i), \tag{8.62}
\end{aligned}$$

where $f_j = f(x_j, y_j)$ for $j = i, i-1, i-2, i-3$.

This formula is known as the Adams-Bashforth predictor formula and is denoted by $y_{i+1}^{p}$, i.e.,

$$y_{i+1}^{p} = y_i + \frac{h}{24}(-9f_{i-3} + 37f_{i-2} - 59f_{i-1} + 55f_i). \tag{8.63}$$

The corrector formula can be developed in the same way; in it, the predicted value $y_{i+1}^{p}$ is used. Newton's backward formula is now employed on the points $(x_{i-2}, y_{i-2})$, $(x_{i-1}, y_{i-1})$, $(x_i, y_i)$ and $(x_{i+1}, y_{i+1}^{p})$. The polynomial is

$$y' = y_{i+1}' + v\nabla y_{i+1}' + \frac{v(v+1)}{2!}\nabla^2 y_{i+1}' + \frac{v(v+1)(v+2)}{3!}\nabla^3 y_{i+1}', \quad \text{where } v = \frac{x - x_{i+1}}{h}.$$

This value is substituted in (8.61). Therefore,

$$\begin{aligned}
y_{i+1} &= y_i + h\int_{-1}^{0}\Big[f_{i+1} + v\nabla f_{i+1} + \frac{v^2 + v}{2}\nabla^2 f_{i+1} + \frac{v^3 + 3v^2 + 2v}{6}\nabla^3 f_{i+1}\Big] dv\\
&\qquad [\text{since } dx = h\,dv \text{ and } y' = f(x, y)]\\
&= y_i + h\Big[f_{i+1} - \frac{1}{2}\nabla f_{i+1} - \frac{1}{12}\nabla^2 f_{i+1} - \frac{1}{24}\nabla^3 f_{i+1}\Big]\\
&= y_i + \frac{h}{24}[f_{i-2} - 5f_{i-1} + 19f_i + 9f_{i+1}]. \tag{8.64}
\end{aligned}$$

This formula is known as the Adams-Moulton corrector formula and $y_{i+1}$ is denoted by $y_{i+1}^{c}$. Thus

$$y_{i+1}^{c} = y_i + \frac{h}{24}[f_{i-2} - 5f_{i-1} + 19f_i + 9f_{i+1}], \tag{8.65}$$

where $f_{i+1} = f(x_{i+1}, y_{i+1}^{p})$.

The value of $y_{i+1}^{p}$ is computed using (8.63). The formula (8.65) can be used repeatedly to obtain the value of $y_{i+1}$ to the desired accuracy.
Example 8.6.1 Find the values of $y(0.20)$ and $y(0.25)$ from the differential equation $y' = 2y - y^2$ with $y(0) = 1$, taking step size $h = 0.05$, using the Adams-Bashforth-Moulton predictor-corrector method.

Solution. The Runge-Kutta method is used to find the starting values at $x = 0.05$, 0.10, 0.15. Here $f(x, y) = 2y - y^2$, $h = 0.05$. The values are shown below:

i   xi     yi        k1        k2        k3        k4        yi+1
0   0.00   1.000000  0.050000  0.049969  0.049969  0.049875  1.049959
1   0.05   1.049959  0.049875  0.049720  0.049720  0.049503  1.099669
2   0.10   1.099669  0.049503  0.049226  0.049228  0.048891  1.148886

Thus the starting values are
$y_0 = y(0) = 1$, $y_1 = y(0.05) = 1.049959$, $y_2 = y(0.10) = 1.099669$, $y_3 = y(0.15) = 1.148886$.

Now,
$$\begin{aligned}
y^{p}(0.20) = y_4^{p} &= y_3 + \frac{h}{24}[-9f(x_0, y_0) + 37f(x_1, y_1) - 59f(x_2, y_2) + 55f(x_3, y_3)]\\
&= 1.148886 + \frac{0.05}{24}[-9 \times 1 + 37 \times 0.997504 - 59 \times 0.990066 + 55 \times 0.977833]\\
&= 1.197375,
\end{aligned}$$
$$\begin{aligned}
y^{c}(0.20) = y_4^{c} &= y_3 + \frac{h}{24}[f(x_1, y_1) - 5f(x_2, y_2) + 19f(x_3, y_3) + 9f(x_4, y_4^{p})]\\
&= 1.148886 + \frac{0.05}{24}[0.997504 - 5 \times 0.990066 + 19 \times 0.977833 + 9 \times 0.961043]\\
&= 1.197376.
\end{aligned}$$

Thus, $y_4 = y(0.20) = 1.197376$.

$$\begin{aligned}
y^{p}(0.25) = y_5^{p} &= y_4 + \frac{h}{24}[-9f(x_1, y_1) + 37f(x_2, y_2) - 59f(x_3, y_3) + 55f(x_4, y_4)]\\
&= 1.197376 + \frac{0.05}{24}[-9 \times 0.997504 + 37 \times 0.990066 - 59 \times 0.977833 + 55 \times 0.961043]\\
&= 1.244918,
\end{aligned}$$
$$\begin{aligned}
y^{c}(0.25) = y_5^{c} &= y_4 + \frac{h}{24}[f(x_2, y_2) - 5f(x_3, y_3) + 19f(x_4, y_4) + 9f(x_5, y_5^{p})]\\
&= 1.197376 + \frac{0.05}{24}[0.990066 - 5 \times 0.977833 + 19 \times 0.961043 + 9 \times 0.940015]\\
&= 1.244919.
\end{aligned}$$

Hence $y(0.25) = 1.244919$.
It may be noted that the predicted and corrected values are equal up to five decimal places.

Error
The local truncation error for the predictor formula (8.63) is

$$h\int_0^1 \frac{v(v+1)(v+2)(v+3)}{4!}\nabla^4 f_i\, dv \simeq \frac{251}{720}y^{v}(c_{i+1})h^5 \ \big(= Y_{i+1} - y_{i+1}^{p}\big)$$

and that of the corrector formula (8.65) is

$$h\int_{-1}^{0} \frac{v(v+1)(v+2)(v+3)}{4!}\nabla^4 f_{i+1}\, dv \simeq -\frac{19}{720}y^{v}(d_{i+1})h^5 \ \big(= Y_{i+1} - y_{i+1}^{c}\big),$$

where $Y_{i+1}$ is the true value of $y$ at $x_{i+1}$.

When $h$ is small, $y^{v}(x)$ is almost constant over the interval. Then the local truncation error for the Adams-Bashforth-Moulton method is

$$Y_{i+1} - y_{i+1}^{c} \simeq -\frac{19}{270}\big(y_{i+1}^{c} - y_{i+1}^{p}\big). \tag{8.66}$$
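For instance, in Example 8.6.1, $y_4^{c} - y_4^{p} = 1.197376 - 1.197375 = 10^{-6}$, so (8.66) estimates the truncation error of the corrected value as $-(19/270) \times 10^{-6} \approx -7 \times 10^{-8}$; this confirms that the accepted value is reliable to six decimal places.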

Algorithm 8.5 (Adams-Bashforth-Moulton method). This algorithm is used to solve the differential equation $y' = f(x, y)$ with $y(x_0) = y_0$ at $x = x_1, x_2, \ldots, x_n$ with step length $h$. The predictor and corrector formulae are respectively

$$y_{i+1}^{p} = y_i + \frac{h}{24}[-9f_{i-3} + 37f_{i-2} - 59f_{i-1} + 55f_i]$$
$$\text{and} \quad y_{i+1}^{c} = y_i + \frac{h}{24}[f_{i-2} - 5f_{i-1} + 19f_i + 9f_{i+1}],$$

where $f_j = f(x_j, y_j)$, $j = i-3, i-2, i-1, i$, and $f_{i+1} = f(x_{i+1}, y_{i+1}^{p})$.

Algorithm Adams-Bashforth-Moulton
Input function f (x, y);
Read x0 , xn , h, y0 , y1 , y2 , y3 , ε;
//x0 , xn are the initial and final values of x; h is step length;
y0 , y1 , y2 , y3 are the starting values of y obtained from
any single-step method; ε is the error tolerance.//
Compute the following
x1 = x0 + h; x2 = x1 + h; x3 = x2 + h;
f0 = f (x0 , y0 ); f1 = f (x1 , y1 );
f2 = f (x2 , y2 ); f3 = f (x3 , y3 );
for x4 = x3 + h to xn step h do
Compute y^p = y3 + (h/24)[−9f0 + 37f1 − 59f2 + 55f3];
Set yold = y^p;
Compute y^c = y3 + (h/24)[f1 − 5f2 + 19f3 + 9f(x4, yold)];
if (|y^c − yold| > ε) then
Set yold = y^c;
Recalculate y^c from the above relation;
else
Reset y0 = y1 ; y1 = y2 ;
Reset y2 = y3 ; y3 = y c ;
Reset f1 = f2 ; f2 = f3 ;
Reset f3 = f (x4 , y c );
Print x4 , y c ;
endif;
endfor;
end Adams-Bashforth-Moulton

Program 8.5.
/* Program Adams-Bashforth-Moulton
Solution of a differential equation of the form y'=f(x,y),
y(x0)=y0 by the Adams-Bashforth-Moulton method.
Here the equation is taken as y'=x*x*y+y*y with y(0)=1. */
#include<stdio.h>
#include<math.h>
void main()
{
float x0,y0,xn,h,y1,y2,y3,yc,yp;
/* x0, xn are the initial and final values of x */
/* y0,y1,y2,y3 are starting values of y,
h is the step length */
float eps=1e-5; /* the error tolerance */

float x1,x2,x3,x4,f0,f1,f2,f3,yold;
float f(float x, float y);
float rk4(float x,float y,float h);
printf("Enter the initial values of x and y ");
scanf("%f %f",&x0,&y0);
printf("Enter last value of x ");
scanf("%f",&xn);
printf("Enter step length h ");
scanf("%f",&h);
printf(" x-value y-value\n");
/* initial values of y are computed using Runge-Kutta method */
x1=x0+h; x2=x1+h; x3=x2+h;
y1=rk4(x0,y0,h);
y2=rk4(x1,y1,h);
y3=rk4(x2,y2,h);
f0=f(x0,y0); f1=f(x1,y1); f2=f(x2,y2); f3=f(x3,y3);
for(x4=x3+h;x4<=xn;x4+=h)
{
yp=y3+h*(-9*f0+37*f1-59*f2+55*f3)/24;
yold=yp;
yc=yp;

do
{
yold=yc;
yc=y3+h*(f1-5*f2+19*f3+9*f(x4,yold))/24;
}while(fabs(yc-yold)>eps);

printf("%8.5f %8.5f\n",x4,yc);
y0=y1; y1=y2; y2=y3; y3=yc;
f1=f2; f2=f3; f3=f(x4,yc);
}
} /* main */
/* definition of the function f(x,y) */
float f(float x, float y)
{
return(x*x*y+y*y);
}
/* the fourth order Runge-Kutta method */

float rk4(float x,float y,float h)


{
float k1,k2,k3,k4;
k1=h*f(x,y);
k2=h*f(x+h/2,y+k1/2);
k3=h*f(x+h/2,y+k2/2);
k4=h*f(x+h,y+k3);
y=y+(k1+2*(k2+k3)+k4)/6;
return (y);
}
A sample of input/output:
Enter the initial values of x and y 0 1
Enter last value of x 0.4
Enter step length h 0.05
x-value y-value
0.20000 1.25355
0.25000 1.34099
0.30000 1.44328
0.35000 1.56458

8.6.2 Milne’s method


Another popular predictor-corrector formula is Milne's method, which is also known as the Milne-Simpson method. The differential equation $y' = f(x, y)$ with $y(x_0) = y_0$ is integrated between $x_{i-3}$ and $x_{i+1}$ to find

$$y_{i+1} = y_{i-3} + \int_{x_{i-3}}^{x_{i+1}} f(x, y)\, dx. \tag{8.67}$$

Now, the function $f(x, y)$ is replaced by Newton's forward difference formula in the form

$$f(x, y) = f_{i-3} + u\Delta f_{i-3} + \frac{u(u-1)}{2!}\Delta^2 f_{i-3} + \frac{u(u-1)(u-2)}{3!}\Delta^3 f_{i-3}, \tag{8.68}$$

where $u = \dfrac{x - x_{i-3}}{h}$.

The value of $f(x, y)$ is substituted from (8.68) into (8.67) to find

$$\begin{aligned}
y_{i+1} &= y_{i-3} + h\int_0^4\Big[f_{i-3} + u\Delta f_{i-3} + \frac{u^2 - u}{2}\Delta^2 f_{i-3} + \frac{u^3 - 3u^2 + 2u}{6}\Delta^3 f_{i-3}\Big] du\\
&= y_{i-3} + h\Big[4f_{i-3} + 8\Delta f_{i-3} + \frac{20}{3}\Delta^2 f_{i-3} + \frac{8}{3}\Delta^3 f_{i-3}\Big]\\
&= y_{i-3} + \frac{4h}{3}[2f_{i-2} - f_{i-1} + 2f_i].
\end{aligned}$$

Thus the Milne's predictor formula is

$$y_{i+1}^{p} = y_{i-3} + \frac{4h}{3}[2f_{i-2} - f_{i-1} + 2f_i]. \tag{8.69}$$

The corrector formula is developed in a similar way; the value of $y_{i+1}^{p}$ will now be used. Again, the given differential equation is integrated between $x_{i-1}$ and $x_{i+1}$ and the function $f(x, y)$ is replaced by the Newton's formula (8.68), now with $u = (x - x_{i-1})/h$. Then

$$\begin{aligned}
y_{i+1} &= y_{i-1} + \int_{x_{i-1}}^{x_{i+1}}\Big[f_{i-1} + u\Delta f_{i-1} + \frac{u(u-1)}{2}\Delta^2 f_{i-1}\Big] dx\\
&= y_{i-1} + h\int_0^2\Big[f_{i-1} + u\Delta f_{i-1} + \frac{u^2 - u}{2}\Delta^2 f_{i-1}\Big] du\\
&= y_{i-1} + h\Big[2f_{i-1} + 2\Delta f_{i-1} + \frac{1}{3}\Delta^2 f_{i-1}\Big]\\
&= y_{i-1} + \frac{h}{3}[f_{i-1} + 4f_i + f_{i+1}].
\end{aligned}$$

This formula is known as the corrector formula and is denoted by $y_{i+1}^{c}$. That is,

$$y_{i+1}^{c} = y_{i-1} + \frac{h}{3}[f(x_{i-1}, y_{i-1}) + 4f(x_i, y_i) + f(x_{i+1}, y_{i+1}^{p})]. \tag{8.70}$$

When $y_{i+1}^{p}$ has been computed using the formula (8.69), formula (8.70) can be used iteratively to obtain the value of $y_{i+1}$ to the desired accuracy.

Example 8.6.2 Find the value of $y(0.20)$ for the initial value problem
$$\frac{dy}{dx} = y^2 \sin x \quad \text{with} \quad y(0) = 1$$
using Milne's predictor-corrector method, taking $h = 0.05$.

Solution. Let $f(x, y) = y^2 \sin x$, $x_0 = 0$, $y_0 = 1$, $h = 0.05$.
The fourth-order Runge-Kutta method is used to compute the starting values $y_1$, $y_2$ and $y_3$.

i   xi     yi        k1        k2        k3        k4        yi+1
0   0.00   1.000000  0.000000  0.001250  0.001251  0.002505  1.001251
1   0.05   1.001251  0.002505  0.003765  0.003770  0.005042  1.005021
2   0.10   1.005021  0.005042  0.006328  0.006336  0.007643  1.011356

The predictor value is

$$\begin{aligned}
y_4^{p} &= y_0 + \frac{4h}{3}[2f(x_1, y_1) - f(x_2, y_2) + 2f(x_3, y_3)]\\
&= 1 + \frac{4 \times 0.05}{3}[2f(0.05, 1.001251) - f(0.10, 1.005021) + 2f(0.15, 1.011356)]\\
&= 1 + \frac{4 \times 0.05}{3}[2 \times 0.0501043 - 0.1008385 + 2 \times 0.1528516]\\
&= 1.0203380.
\end{aligned}$$

The corrector value is

$$\begin{aligned}
y_4^{c} &= y_2 + \frac{h}{3}[f(x_2, y_2) + 4f(x_3, y_3) + f(x_4, y_4^{p})]\\
&= 1.005021 + \frac{0.05}{3}[0.1008385 + 4 \times 0.1528516 + 0.2068326]\\
&= 1.0203390.
\end{aligned}$$

Again, the corrector value $y_4^{c}$ is calculated by using the formula

$$\begin{aligned}
y_4^{c} &= y_2 + \frac{h}{3}[f(x_2, y_2) + 4f(x_3, y_3) + f(x_4, y_4^{c})]\\
&= 1.005021 + \frac{0.05}{3}[0.1008385 + 4 \times 0.1528516 + 0.2068392]\\
&= 1.0203390.
\end{aligned}$$

Since these two values are the same, the required solution is $y_4 = y(0.20) = 1.0203390$, correct up to seven decimal places.

Error
The local truncation error for the predictor formula is

$$\frac{28}{90}y^{v}(c_{i+1})h^5 = O(h^5)$$

and that of the corrector formula is $-\dfrac{1}{90}y^{v}(d_{i+1})h^5 = O(h^5)$.

Note 8.6.1 Milne's predictor-corrector method is a widely used formula. In this method there is scope to improve the value of $y$ by repeated use of the corrector formula, so it gives a more accurate result. But this method needs the starting values $y_1, y_2, y_3$ to obtain $y_4$; these values may be obtained from any single-step method, such as Taylor's series, Euler's, Runge-Kutta or any similar method.

Algorithm 8.6 (Milne's Predictor-Corrector method). This algorithm is used to solve the initial value problem $y' = f(x, y)$ with $y(x_0) = y_0$ at the points $x_4, x_5, \ldots, x_n$ by Milne's method. The starting values $y_1, y_2, y_3$ may be obtained from any self-starting method, viz., Taylor's series, Euler's or Runge-Kutta methods.

Algorithm Milne
Input function f (x, y);
Read x0 , xn , h, y0 , y1 , y2 , y3 , ε;
//x0 , xn are the initial and final values of x; h is step length;
y0 , y1 , y2 , y3 are the starting values of y obtained from
any single-step method; ε is the error tolerance.//
Compute the following
x1 = x0 + h; x2 = x1 + h; x3 = x2 + h;
f1 = f (x1 , y1 );
f2 = f (x2 , y2 ); f3 = f (x3 , y3 );
for x4 = x3 + h to xn step h do
Compute y^p = y0 + (4h/3)[2f1 − f2 + 2f3];
Set yold = y^p;
Compute y^c = y2 + (h/3)[f2 + 4f3 + f(x4, yold)];
if (|y^c − yold| > ε) then
Reset yold = y^c;
Calculate y c from above relation;
else
Reset y0 = y1 ; y1 = y2 ;
Reset y2 = y3 ; y3 = y c ;
Reset f1 = f2 ; f2 = f3 ;
Compute f3 = f (x4 , y c );
Print x4 , y c ;
endif;
endfor;
end Milne

Program 8.6.
/* Program Milne Predictor-Corrector
Solution of a differential equation of the form y’=f(x,y),
y(x0)=y0 by Milne Predictor-Corrector method.
Here the equation is taken as y’=x*y+y*y with y(0)=1.
*/
#include<stdio.h>
#include<math.h>

void main()
{
float x0,y0,xn,h,y1,y2,y3,yc,yp;
/* x0, xn the initial and final values of x */
/* y0,y1,y2,y3 are starting values of y,
h is the step length */
float eps=1e-5; /* the error tolerance */
float x1,x2,x3,x4,f0,f1,f2,f3,yold;
float f(float x, float y);
float rk4(float x,float y,float h);
printf("Enter the initial values of x and y ");
scanf("%f %f",&x0,&y0);
printf("Enter last value of x ");
scanf("%f",&xn);
printf("Enter step length h ");
scanf("%f",&h);
printf(" x-value y-value\n");

/* initial values of y are computed using Runge-Kutta method */


x1=x0+h; x2=x1+h; x3=x2+h;
y1=rk4(x0,y0,h);
y2=rk4(x1,y1,h);
y3=rk4(x2,y2,h);
f1=f(x1,y1); f2=f(x2,y2); f3=f(x3,y3);
for(x4=x3+h;x4<=xn;x4+=h)
{
yp=y0+4*h*(2*f1-f2+2*f3)/3;
yold=yp;
yc=yp;
do
{
yold=yc;
yc=y2+h*(f2+4*f3+f(x4,yold))/3;
}while(fabs(yc-yold)>eps);
printf("%8.5f %8.5f\n",x4,yc);
y0=y1; y1=y2; y2=y3; y3=yc;
f1=f2; f2=f3; f3=f(x4,yc);
}
} /* main */

/* definition of the function f(x,y) */


float f(float x, float y)
{
return(x*y+y*y);
}
/* the fourth order Runge-Kutta method */
float rk4(float x,float y,float h)
{
float k1,k2,k3,k4;
k1=h*f(x,y);
k2=h*f(x+h/2,y+k1/2);
k3=h*f(x+h/2,y+k2/2);
k4=h*f(x+h,y+k3);
y=y+(k1+2*(k2+k3)+k4)/6;
return (y);
}
A sample of input/output:
Enter the initial values of x and y 0 1
Enter last value of x 0.4
Enter step length h 0.05
x-value y-value
0.20000 1.27740
0.25000 1.38050
0.30000 1.50414
0.35000 1.65418

8.7 Finite Difference Method

In this method, the derivatives $y'$ and $y''$ are replaced by finite differences (either forward or central), which generates a system of linear algebraic equations. The solution of this system is the solution of the differential equation at the different mesh points.

The central difference formulae discussed in Chapter 7 are used to replace the derivatives:

$$y'(x_i) = \frac{y_{i+1} - y_{i-1}}{2h} + O(h^2) \quad \text{and} \quad y''(x_i) = \frac{y_{i+1} - 2y_i + y_{i-1}}{h^2} + O(h^2). \tag{8.71}$$

The method to solve a first order differential equation using finite differences is nothing but Euler's method. The finite difference method is commonly used to solve second order initial value problems and boundary value problems.

8.7.1 Second order initial value problem (IVP)

Let us consider the second order linear IVP of the form

$$y'' + p(x)y' + q(x)y = r(x) \tag{8.72}$$

with the initial conditions

$$x = x_0, \quad y(x_0) = y_0, \quad y'(x_0) = y_0'. \tag{8.73}$$

The values of $y'(x_i)$ and $y''(x_i)$ are substituted from (8.71) into (8.72) to find the equation

$$\frac{y_{i+1} - 2y_i + y_{i-1}}{h^2} + p(x_i)\frac{y_{i+1} - y_{i-1}}{2h} + q(x_i)y_i = r(x_i).$$

After simplification, the above equation reduces to

$$[2 - hp(x_i)]y_{i-1} + [2h^2q(x_i) - 4]y_i + [2 + hp(x_i)]y_{i+1} = 2h^2r(x_i). \tag{8.74}$$

Let us consider

$$C_i = 2 - hp(x_i), \quad A_i = 2h^2q(x_i) - 4, \quad B_i = 2 + hp(x_i) \quad \text{and} \quad D_i = 2h^2r(x_i). \tag{8.75}$$

Then equation (8.74) is simplified to

$$C_i y_{i-1} + A_i y_i + B_i y_{i+1} = D_i. \tag{8.76}$$

The initial condition $y'(x_0) = y_0'$ reduces to

$$y_0' = \frac{y_1 - y_{-1}}{2h}, \quad \text{or} \quad y_{-1} = y_1 - 2hy_0'. \tag{8.77}$$

Again, from (8.76),

$$C_0 y_{-1} + A_0 y_0 + B_0 y_1 = D_0. \tag{8.78}$$

The quantity $y_{-1}$ is eliminated between (8.77) and (8.78), and the value of $y_1$ is obtained as

$$y_1 = \frac{D_0 - A_0 y_0 + 2hC_0 y_0'}{C_0 + B_0}. \tag{8.79}$$

Since $p, q, r$ are known functions, $A, B, C, D$ can easily be determined at $x = x_0, x_1, \ldots$; that is, the right hand side of (8.79) is a known quantity. Thus (8.79) gives the value of $y_1$; also, $y_0$ is known. Hence the value of $y_{i+1}$ can be obtained from (8.76) as

$$y_{i+1} = \frac{D_i - C_i y_{i-1} - A_i y_i}{B_i}, \quad x_{i+1} = x_i + h \tag{8.80}$$

for $i = 1, 2, \ldots$. Thus, the values of $y_1, y_2, \ldots$ are determined recursively from (8.80).
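The recursion (8.79)-(8.80) is easily programmed. The following C sketch solves the IVP of Example 8.7.1 below, for which $p(x) = 0$, $q(x) = -1$ and $r(x) = x$ (from $y'' - y = x$); the layout and names are illustrative.

/* Finite difference solution of y''+p(x)y'+q(x)y=r(x),
   y(x0)=y0, y'(x0)=dy0, by the recursion (8.79)-(8.80).
   Here p, q, r are those of Example 8.7.1. */
#include<stdio.h>
float p(float x) { return(0); }
float q(float x) { return(-1); }
float r(float x) { return(x); }
void main()
{
    float x=0,y0=0,dy0=1,h=0.01,xn=0.1;
    float A,B,C,D,yprev,ycur,ynext;
    A=2*h*h*q(x)-4; B=2+h*p(x); C=2-h*p(x); D=2*h*h*r(x);
    yprev=y0;
    ycur=(D-A*y0+2*h*C*dy0)/(C+B);   /* y1 from (8.79) */
    printf("%f %f\n",x+h,ycur);
    for(x=x+h;x<xn-h/2;x+=h)
    {
        A=2*h*h*q(x)-4; B=2+h*p(x); C=2-h*p(x); D=2*h*h*r(x);
        ynext=(D-C*yprev-A*ycur)/B;  /* formula (8.80) */
        yprev=ycur; ycur=ynext;
        printf("%f %f\n",x+h,ycur);
    }
}

Its output agrees with the table of Example 8.7.1.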

Example 8.7.1 Solve the IVP $y'' - y = x$ with $y(0) = 0$ and $y'(0) = 1$ using the finite difference method for $x = 0.01, 0.02, \ldots, 0.10$.

Solution. The second order derivative $y''$ is replaced by $\dfrac{y_{i+1} - 2y_i + y_{i-1}}{h^2}$ in the given differential equation, which gives the following system of equations:

$$\frac{y_{i+1} - 2y_i + y_{i-1}}{h^2} - y_i = x_i \quad \text{or} \quad y_{i+1} - (2 + h^2)y_i + y_{i-1} = h^2 x_i$$

and

$$y_0' = \frac{y_1 - y_{-1}}{2h} \quad \text{or} \quad y_{-1} = y_1 - 2hy_0'.$$

Again, from the above equation,
$$y_1 - (2 + h^2)y_0 + y_{-1} = h^2 x_0, \quad \text{or} \quad 2y_1 = (2 + h^2)y_0 + 2hy_0' + h^2 x_0.$$

That is,
$$y_1 = \frac{(2 + h^2)y_0 + 2hy_0' + h^2 x_0}{2} = \frac{[2 + (0.01)^2] \times 0 + 2 \times 0.01 \times 1.0 + (0.01)^2 \times 0}{2} = 0.01.$$

The values of $y_i$, $i = 2, 3, \ldots$ are obtained from the relation
$$y_{i+1} = (2 + h^2)y_i - y_{i-1} + h^2 x_i.$$

i    yi−1      xi     yi        yi+1
1    0.000000  0.01   0.010000  0.020002
2    0.010000  0.02   0.020002  0.030008
3    0.020002  0.03   0.030008  0.040020
4    0.030008  0.04   0.040020  0.050040
5    0.040020  0.05   0.050040  0.060070
6    0.050040  0.06   0.060070  0.070112
7    0.060070  0.07   0.070112  0.080168
8    0.070112  0.08   0.080168  0.090240
9    0.080168  0.09   0.090240  0.100330
10   0.090240  0.10   0.100330  0.110440

Error
The local truncation error is

$$E_{LTE} = \Big[\frac{y_{i-1} - 2y_i + y_{i+1}}{h^2} - y_i''\Big] + p_i\Big[\frac{y_{i+1} - y_{i-1}}{2h} - y_i'\Big].$$

Expanding the terms $y_{i-1}$ and $y_{i+1}$ by Taylor's series and simplifying, the above expression gives

$$E_{LTE} = \frac{h^2}{12}\big(y_i^{iv} + 2p_i y_i'''\big) + O(h^4).$$

Thus, the finite difference approximation has second-order accuracy for functions with continuous fourth derivatives.

8.7.2 Second order boundary value problem (BVP)

Let us consider the linear second order differential equation

$$y'' + p(x)y' + q(x)y = r(x), \quad a < x < b \tag{8.81}$$

with boundary conditions $y(a) = \gamma_1$ and $y(b) = \gamma_2$.

Let the interval $[a, b]$ be divided into $n$ subintervals with spacing $h$, that is, $x_i = x_{i-1} + h$, $i = 1, 2, \ldots, n$, with $x_0 = a$ and $x_n = b$.

The equation (8.81) is satisfied at $x = x_i$. Then

$$y_i'' + p(x_i)y_i' + q(x_i)y_i = r(x_i). \tag{8.82}$$

Now, $y_i'$ and $y_i''$ are replaced by the finite difference expressions

$$y_i'' = \frac{y_{i-1} - 2y_i + y_{i+1}}{h^2} + O(h^2), \qquad y_i' = \frac{y_{i+1} - y_{i-1}}{2h} + O(h^2).$$

Using these substitutions and dropping the $O(h^2)$ terms, equation (8.82) becomes

$$\frac{y_{i-1} - 2y_i + y_{i+1}}{h^2} + p(x_i)\frac{y_{i+1} - y_{i-1}}{2h} + q(x_i)y_i = r(x_i).$$

That is,

$$y_{i-1}[2 - hp(x_i)] + y_i[2h^2q(x_i) - 4] + y_{i+1}[2 + hp(x_i)] = 2h^2r(x_i). \tag{8.83}$$

Let us consider

$$C_i = 2 - hp(x_i), \quad A_i = 2h^2q(x_i) - 4, \quad B_i = 2 + hp(x_i) \quad \text{and} \quad D_i = 2h^2r(x_i). \tag{8.84}$$

With these notations, the equation (8.83) is simplified to

$$C_i y_{i-1} + A_i y_i + B_i y_{i+1} = D_i, \tag{8.85}$$

for $i = 1, 2, \ldots, n-1$. The boundary conditions then are $y_0 = \gamma_1$, $y_n = \gamma_2$.

For $i = 1$ and $i = n-1$, equation (8.85) reduces to

$$C_1\gamma_1 + A_1y_1 + B_1y_2 = D_1 \quad \text{or} \quad A_1y_1 + B_1y_2 = D_1 - C_1\gamma_1 \quad (\text{as } y_0 = \gamma_1)$$

and

$$C_{n-1}y_{n-2} + A_{n-1}y_{n-1} + B_{n-1}\gamma_2 = D_{n-1} \quad \text{or} \quad C_{n-1}y_{n-2} + A_{n-1}y_{n-1} = D_{n-1} - B_{n-1}\gamma_2 \quad (\text{as } y_n = \gamma_2).$$

The equation (8.85) in matrix notation is

$$A\mathbf{y} = \mathbf{b} \tag{8.86}$$

where

$$\mathbf{y} = [y_1, y_2, \ldots, y_{n-1}]^t,$$
$$\mathbf{b} = 2h^2\big[r(x_1) - C_1\gamma_1/(2h^2),\ r(x_2),\ \ldots,\ r(x_{n-2}),\ r(x_{n-1}) - B_{n-1}\gamma_2/(2h^2)\big]^t$$

and

$$A = \begin{pmatrix}
A_1 & B_1 & 0 & 0 & \cdots & 0 & 0\\
C_2 & A_2 & B_2 & 0 & \cdots & 0 & 0\\
0 & C_3 & A_3 & B_3 & \cdots & 0 & 0\\
\cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots\\
0 & 0 & 0 & 0 & \cdots & C_{n-1} & A_{n-1}
\end{pmatrix}.$$

Equation (8.86) is a tri-diagonal system, which can be solved by the method discussed in Chapter 5. The solution of this system, i.e., the values of $y_1, y_2, \ldots, y_{n-1}$, constitutes the approximate solution of the BVP.
Example 8.7.2 Solve the boundary value problem $y'' + xy' + 1 = 0$ with boundary conditions $y(0) = 0$, $y(1) = 0$.

Solution. Here $nh = 1$. The difference scheme is

$$\frac{y_{i-1} - 2y_i + y_{i+1}}{h^2} + x_i\frac{y_{i+1} - y_{i-1}}{2h} + 1 = 0.$$

That is,

$$y_{i-1}(2 - x_ih) - 4y_i + y_{i+1}(2 + x_ih) + 2h^2 = 0, \quad i = 1, 2, \ldots, n-1 \tag{8.87}$$

together with the boundary conditions $y_0 = 0$, $y_n = 0$.

Let $n = 2$. Then $h = 1/2$, $x_0 = 0$, $x_1 = 1/2$, $x_2 = 1$, $y_0 = 0$, $y_2 = 0$.
The difference scheme is
$$y_0(2 - x_1h) - 4y_1 + y_2(2 + x_1h) + 2h^2 = 0, \quad \text{or} \quad -4y_1 + 2(1/4) = 0, \quad \text{or} \quad y_1 = 0.125.$$
That is, $y(0.5) = 0.125$.

Let $n = 4$. Then $h = 0.25$, $x_0 = 0$, $x_1 = 0.25$, $x_2 = 0.50$, $x_3 = 0.75$, $x_4 = 1.0$, $y_0 = 0$, $y_4 = 0$.
The system of equations (8.87) becomes
$$\begin{aligned}
y_0(2 - x_1h) - 4y_1 + y_2(2 + x_1h) + 2h^2 &= 0\\
y_1(2 - x_2h) - 4y_2 + y_3(2 + x_2h) + 2h^2 &= 0\\
y_2(2 - x_3h) - 4y_3 + y_4(2 + x_3h) + 2h^2 &= 0.
\end{aligned}$$

This system is finally simplified to
$$\begin{aligned}
-4y_1 + 2.06250y_2 + 0.125 &= 0\\
1.875y_1 - 4y_2 + 2.125y_3 + 0.125 &= 0\\
1.8125y_2 - 4y_3 + 0.125 &= 0.
\end{aligned}$$

The solution of this system is
$$y_1 = y(0.25) = 0.09351, \quad y_2 = y(0.50) = 0.12075, \quad y_3 = y(0.75) = 0.08597.$$
This is also the approximate solution of the given differential equation.

Algorithm 8.7 (Finite Difference Method for BVP). Using this algorithm, the BVP $y'' + p(x)y' + q(x)y = r(x)$ with $y(x_0) = \gamma_1$, $y(x_n) = \gamma_2$ is solved by the finite difference method.

Algorithm BVP FD
Input functions p(x), q(x), r(x);
//The functions p, q, r to be changed accordingly.//
Read x0, xn, h, γ1, γ2; //The boundary values of x; step size h; and
the boundary values of y.//
for i = 1, 2, . . . , n − 1 do
Compute Ai = 2h2 q(xi ) − 4;
Compute Ci = 2 − hp(xi );
Compute Bi = 2 + hp(xi );
Compute Di = 2h2 r(xi );
Reset D1 = D1 − C1 γ1 ;
Reset Dn−1 = Dn−1 − Bn−1 γ2 ;
Solve the tri-diagonal system of equations Ay = b,
where A, b, y are given by the equation (8.86);
Print y1 , y2 , . . . , yn−1 as solution;
end BVP FD
Program 8.7.
/* Program BVP Finite Difference
This program solves the second order boundary value problem
y''+p(x)y'+q(x)y=r(x) with y(x0)=y0, y(xn)=yn by the finite
difference method. Here we consider the equation y''+2y'+y=10x
with boundary conditions y(0)=0 and y(1)=0. */

#include<stdio.h>
#include<math.h>
#include<stdlib.h>
float y[10];
void main()
{
int i,n;
float a[10],b[10],c[10],d[10],x0,xn,y0,yn,temp,h,x;
float p(float x); float q(float x); float r(float x);
float TriDiag(float a[],float b[],float c[],float d[],int n);
printf("Enter the initial and final values of x ");
scanf("%f %f",&x0,&xn);
printf("Enter the initial and final values of y ");
scanf("%f %f",&y0,&yn);
printf("Enter number of subintervals ");
scanf("%d",&n);
h=(xn-x0)/n;
x=x0;
for(i=1;i<=n-1;i++)
{
x+=h;
a[i]=2*h*h*q(x)-4;
b[i]=2+h*p(x);
c[i]=2-h*p(x);
d[i]=2*h*h*r(x);
} /* end of loop i */

d[1]-=c[1]*y0;
d[n-1]-=b[n-1]*yn;
temp=TriDiag(c,a,b,d,n-1);
y[0]=y0; y[n]=yn;
printf("The solution is\n x-value y-value\n");
for(i=0;i<=n;i++) printf("%8.5f %8.5f \n",x0+i*h,y[i]);
} /* main */

/* definition of the functions p(x), q(x) and r(x) */


float p(float x)
{
return(2);
}

float q(float x)
{
return(1);
}
float r(float x)
{
return(10*x);
}
float TriDiag(float a[10],float b[10],float c[10],float d[10],int n)
{
/* output y[i], i=1, 2,..., n, is a global variable.*/
int i; float gamma[10],z[10];
gamma[1]=b[1];
for(i=2;i<=n;i++){
if(gamma[i-1]==0.0){
printf("A minor is zero: Method fails ");
exit(0);
}
gamma[i]=b[i]-a[i]*c[i-1]/gamma[i-1];
}
z[1]=d[1]/gamma[1];
for(i=2;i<=n;i++)
z[i]=(d[i]-a[i]*z[i-1])/gamma[i];
y[n]=z[n];
for(i=n-1;i>=1;i--)
y[i]=z[i]-c[i]*y[i+1]/gamma[i];
return(y[0]);
} /*end of TriDiag */
A sample of input/output:
Enter the initial and final values of x 0 1
Enter the initial and final values of y 0 0
Enter number of subintervals 4
The solution is
x-value y-value
0.00000 0.00000
0.25000 -0.52998
0.50000 -0.69647
0.75000 -0.51154
1.00000 0.00000

Error
The local truncation error of this method is similar to that of the IVP, i.e., this method is also of second-order accuracy for functions with continuous fourth order derivatives on $[a, b]$. It may be noted that as $h \to 0$ the local truncation error tends to zero, i.e., greater accuracy in the result can be achieved by using a small $h$. But a small $h$ produces a large number of equations and takes more computational effort.

The accuracy can be improved by employing Richardson's deferred approach to the limit. The error of this method has the form

$$y(x_i) - y_i = h^2E(x_i) + O(h^4). \tag{8.88}$$

For extrapolation to the limit, we evaluate (8.88) at the two intervals $h$ and $h/2$; the corresponding values of $y_i$ are denoted by $y_i(h)$ and $y_i(h/2)$. Thus, for $x = x_i$ and the two step sizes,

$$y(x_i) - y_i(h) = h^2E(x_i) + O(h^4) \quad \text{and} \quad y(x_i) - y_i(h/2) = \frac{h^2}{4}E(x_i) + O(h^4). \tag{8.89}$$

Now, by eliminating $E(x_i)$ we find the expression for $y(x_i)$ in the form

$$y(x_i) = \frac{4y_i(h/2) - y_i(h)}{3}. \tag{8.90}$$
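For instance, at $x = 0.5$ in Example 8.7.2, $y_i(h) = 0.125$ was obtained with $h = 1/2$ and $y_i(h/2) = 0.12075$ with $h/2 = 1/4$, so (8.90) gives the extrapolated value $y(0.5) \simeq (4 \times 0.12075 - 0.125)/3 = 0.11933$.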
The existence and uniqueness conditions of BVP are stated below.

Theorem 8.2 Assume that $f(x, y, y')$ is continuous on the region $R = \{(x, y, y') : a \le x \le b, \ -\infty < y < \infty, \ -\infty < y' < \infty\}$ and that $f_y$ and $f_{y'}$ are continuous on $R$. If there exists a constant $M > 0$ for which $f_y$ and $f_{y'}$ satisfy $f_y > 0$ for all $(x, y, y') \in R$ and $|f_{y'}(x, y, y')| < M$ for all $(x, y, y') \in R$, then the BVP $y'' = f(x, y, y')$ with $y(a) = \gamma_1$ and $y(b) = \gamma_2$ has a unique solution $y = y(x)$ for $a \le x \le b$.

8.8 Shooting Method for Boundary Value Problem

This method works in three stages:

(i) the given BVP is transformed into two IVPs;

(ii) the solutions of these two IVPs are determined by Taylor's series, Runge-Kutta or any other method;

(iii) a linear combination of these two solutions is the required solution of the given BVP.

Reduction to two IVPs

Let the BVP be

$$y'' - p(x)y' - q(x)y = r(x) \quad \text{with} \quad y(a) = \alpha \text{ and } y(b) = \beta. \tag{8.91}$$

Suppose that $u(x)$ is the unique solution to the IVP

$$u''(x) - p(x)u'(x) - q(x)u(x) = r(x) \quad \text{with} \quad u(a) = \alpha \text{ and } u'(a) = 0. \tag{8.92}$$

Furthermore, suppose that $v(x)$ is the unique solution to the IVP

$$v''(x) - p(x)v'(x) - q(x)v(x) = 0 \quad \text{with} \quad v(a) = 0 \text{ and } v'(a) = 1. \tag{8.93}$$

Then the linear combination

$$y(x) = c_1u(x) + c_2v(x) \tag{8.94}$$

is the solution of (8.91) for some constants $c_1, c_2$.

As $y(a) = \alpha$ and $y(b) = \beta$,

$$\alpha = c_1u(a) + c_2v(a) \quad \text{and} \quad \beta = c_1u(b) + c_2v(b).$$

From the first equation $c_1 = 1$, and from the second equation $c_2 = \dfrac{\beta - u(b)}{v(b)}$.

Hence (8.94) reduces to

$$y(x) = u(x) + \frac{\beta - u(b)}{v(b)}v(x); \tag{8.95}$$

this is a solution of the BVP (8.91) and it also satisfies the boundary conditions $y(a) = \alpha$, $y(b) = \beta$.

The method involves solving two systems of equations over $[a, b]$. The first system is

$$u' = w(x) \quad \text{with} \quad u(a) = \alpha, \qquad
w'(x) = p(x)w(x) + q(x)u(x) + r(x) \quad \text{with} \quad w(a) = u'(a) = 0, \tag{8.96}$$

and the second system is

$$v' = z(x) \quad \text{with} \quad v(a) = 0, \qquad
z'(x) = p(x)z(x) + q(x)v(x) \quad \text{with} \quad z(a) = v'(a) = 1. \tag{8.97}$$

Finally, the desired solution $y(x)$ is obtained from (8.95).

Unfortunately, when the expression $p(x)y'(x) + q(x)y(x) + r(x)$ is non-linear, several iterations are needed to obtain the solution at $b$ to a prescribed accuracy. One may encounter difficulty in obtaining a convergent solution if $y(b)$ is a very sensitive function of $y'(a)$. In this case it may be suggested to integrate from the opposite direction, guessing a value of $y'(b)$ and iterating until $y(a)$ is sufficiently close to $y_0$. The speed of convergence depends on how good the initial guess is.

Example 8.8.1 Find the solution of the boundary value problem $y'' = y - x$ with $y(0) = 0$, $y(1) = 0$ using the shooting method.

Solution. The second-order Runge-Kutta method is used to solve the two initial value problems, with $h = 0.2$.
Here $p(x) = 0$, $q(x) = 1$, $r(x) = -x$, $a = 0$, $b = 1$, $\alpha = 0$, $\beta = 0$.
The two IVPs are

$$u' = w \quad \text{with} \quad u(0) = 0, \qquad w' = u - x \quad \text{with} \quad w(0) = u'(0) = 0 \tag{8.98}$$

and

$$v' = z \quad \text{with} \quad v(0) = 0, \qquad z' = v \quad \text{with} \quad z(0) = v'(0) = 1. \tag{8.99}$$

The solution of the system (8.98) is shown below.

i  xi   ui        wi        k1        k2        l1        l2        ui+1      wi+1
0  0.0   0.00000   0.00000   0.00000   0.00000   0.00000  -0.04000   0.00000  -0.02000
1  0.2   0.00000  -0.02000  -0.00400  -0.01200  -0.04000  -0.08080  -0.00800  -0.08040
2  0.4  -0.00800  -0.08040  -0.01608  -0.03240  -0.08160  -0.12482  -0.03224  -0.18361
3  0.6  -0.03224  -0.18361  -0.03672  -0.06201  -0.12645  -0.17379  -0.08161  -0.33373
4  0.8  -0.08161  -0.33373  -0.06675  -0.10201  -0.17632  -0.22967  -0.16598  -0.53672

The solution of the system (8.99) is shown in the following table.

i  xi   vi       zi       k1       k2       l1       l2       vi+1     zi+1
0  0.0  0.00000  1.00000  0.20000  0.20000  0.00000  0.04000  0.20000  1.02000
1  0.2  0.20000  1.02000  0.20400  0.21200  0.04000  0.08080  0.40800  1.08040
2  0.4  0.40800  1.08040  0.21608  0.23240  0.08160  0.12482  0.63224  1.18361
3  0.6  0.63224  1.18361  0.23672  0.26201  0.12645  0.17379  0.88161  1.33373
4  0.8  0.88161  1.33373  0.26675  0.30201  0.17632  0.22967  1.16598  1.53672

Now,
$$c = \frac{\beta - u(b)}{v(b)} = \frac{0 - u(1)}{v(1)} = \frac{0.16598}{1.16598} = 0.142352.$$

The values of $y(x)$ given by $y(x) = u(x) + cv(x) = u(x) + 0.142352\,v(x)$ are listed below:

x    u(x)      v(x)     y(x)     yexact
0.2   0.00000  0.20000  0.02847  0.02868
0.4  -0.00800  0.40800  0.05008  0.05048
0.6  -0.03224  0.63224  0.05776  0.05826
0.8  -0.08161  0.88161  0.04389  0.04429
1.0  -0.16598  1.16598  0.00000  0.00000
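The whole procedure may be sketched in C as below. It integrates the two systems (8.96)-(8.97) side by side, here with the classical fourth-order Runge-Kutta step (Example 8.8.1 used the second-order one), and then forms the combination (8.95). The coefficient functions are those of Example 8.8.1, and the routine names are illustrative assumptions.

/* Shooting method for y''-p(x)y'-q(x)y=r(x), y(a)=alpha, y(b)=beta.
   Here p=0, q=1, r=-x (Example 8.8.1). */
#include<stdio.h>
float p(float x) { return(0); }
float q(float x) { return(1); }
float r(float x) { return(-x); }
/* one RK4 step for u'=w, w'=p(x)w+q(x)u+rr*r(x);
   rr=1 gives system (8.96), rr=0 gives system (8.97) */
void step(float x, float h, float *u, float *w, float rr)
{
    float k1,k2,k3,k4,l1,l2,l3,l4;
    k1=h*(*w);      l1=h*(p(x)*(*w)+q(x)*(*u)+rr*r(x));
    k2=h*(*w+l1/2); l2=h*(p(x+h/2)*(*w+l1/2)+q(x+h/2)*(*u+k1/2)+rr*r(x+h/2));
    k3=h*(*w+l2/2); l3=h*(p(x+h/2)*(*w+l2/2)+q(x+h/2)*(*u+k2/2)+rr*r(x+h/2));
    k4=h*(*w+l3);   l4=h*(p(x+h)*(*w+l3)+q(x+h)*(*u+k3)+rr*r(x+h));
    *u+=(k1+2*(k2+k3)+k4)/6;
    *w+=(l1+2*(l2+l3)+l4)/6;
}
void main()
{
    float a=0,b=1,alpha=0,beta=0,h=0.2,x,c;
    float u,w=0,v=0,z=1;    /* initial values from (8.96) and (8.97) */
    u=alpha;
    for(x=a;x<b-h/2;x+=h)
    {
        step(x,h,&u,&w,1);  /* system (8.96) */
        step(x,h,&v,&z,0);  /* system (8.97) */
    }
    c=(beta-u)/v;           /* the constant of (8.95) */
    printf("c=%f\n",c);
}

To tabulate $y(x)$ as in Example 8.8.1, the intermediate $u$ and $v$ values would be stored and combined by (8.95). Since the fourth-order integration is more accurate, the constant it produces differs slightly from the value 0.142352 obtained above with the second-order method.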

8.9 Finite Element Method

The finite element method (FEM) is a widely used technique for solving many engineering problems. Here, a very brief introduction is presented, to solve a BVP by using this method; a detailed discussion of FEM is beyond the scope of this book.
The main idea of this method is that the whole interval (of integration) is divided into a finite number of subintervals, called elements, and over each element the continuous function is approximated by a suitable piecewise polynomial. The approximated problem is then solved by the Rayleigh-Ritz or Galerkin methods.
Let us consider the functional

$$J[y(x)] = \int_a^b F(x, y(x), y'(x))\, dx \tag{8.100}$$

subject to the boundary conditions

$$y(a) = y_a, \quad y(b) = y_b. \tag{8.101}$$

We assume that $F$ is differentiable. The curve $y = y(x)$ which extremizes $J$ under the boundary conditions (8.101) is a solution of the Euler equation

$$\frac{\partial F}{\partial y} - \frac{d}{dx}\Big(\frac{\partial F}{\partial y'}\Big) = 0. \tag{8.102}$$

The Euler equation (8.102) has many solutions but, for a given boundary condition, it gives a unique solution.

Let us consider the boundary value problem

$$-\frac{d}{dx}\big(p(x)y'(x)\big) + q(x)y(x) = r(x) \tag{8.103}$$

with boundary conditions (8.101). It can easily be verified that the variational form of (8.103) is given by

$$J[y(x)] = \frac{1}{2}\int_a^b\Big[p(x)\{y'(x)\}^2 + q(x)\{y(x)\}^2 - 2r(x)y(x)\Big] dx. \tag{8.104}$$

The main steps to solve a BVP using FEM are given below.
Step 1. Discretization of the interval
The interval [a, b] is divided into a finite number of subintervals, called elements, of
unequal length. Let x0 , x1 , . . . , xn , a = x0 < x1 < · · · < xn = b, be the division points,
called the nodes. Let ei = [xi , xi+1 ] be the ith element of length hi = xi+1 − xi .
Step 2. Variational formulation of BVP over the element ei

[Figure 8.5: Division of the interval into elements: nodes $a = x_0 < x_1 < \cdots < x_n = b$, with the $i$th element $e_i = [x_i, x_{i+1}]$.]

The variational form of (8.103) is given by

$$J[y(x)] = \frac{1}{2}\int_a^b\big[p(y')^2 + qy^2 - 2ry\big] dx. \tag{8.105}$$

Let $J_i$ be the functional (called the element functional) over the element $e_i = [x_i, x_{i+1}]$, $i = 0, 1, 2, \ldots, n-1$. Then

$$J_i = \frac{1}{2}\int_{x_i}^{x_{i+1}}\Big[p\Big(\frac{dy^{(i)}}{dx}\Big)^2 + q\big(y^{(i)}\big)^2 - 2ry^{(i)}\Big] dx, \tag{8.106}$$

where $y^{(i)}$ is the value of $y$ over the element $e_i$ (it is zero outside $e_i$); $x_i$, $x_{i+1}$ are the end nodes of the element $e_i$, and $\phi^{(i)} = [y(x_i)\ \ y(x_{i+1})]^T$.

Thus the functional $J$ over the whole interval $[a, b]$ is the sum of the $n$ functionals $J_i$, that is,

$$J[y] = \sum_{i=0}^{n-1} J_i. \tag{8.107}$$

The element functional $J_i$ will be extremum with respect to $\phi^{(i)}$ if

$$\frac{\partial J_i}{\partial \phi^{(i)}} = 0. \tag{8.108}$$
This gives the required finite element equation for the element ei .
Step 3. Rayleigh-Ritz finite element approximation over ei
The function $y(x)$ may be approximated over the element $e_i$ by the linear Lagrange interpolation polynomial as

$$y^{(i)}(x) = L_i(x)y_i + L_{i+1}(x)y_{i+1} \tag{8.109}$$

where

$$y_i = y(x_i), \quad L_i(x) = \frac{x_{i+1} - x}{x_{i+1} - x_i}, \quad L_{i+1}(x) = \frac{x - x_i}{x_{i+1} - x_i}. \tag{8.110}$$

The functions $L_i(x)$ and $L_{i+1}(x)$ are called shape functions.

The equation (8.109) can be written as

$$y^{(i)}(x) = \big[L_i(x)\ \ L_{i+1}(x)\big]\begin{pmatrix} y_i\\ y_{i+1}\end{pmatrix} = \mathbf{L}^{(i)}\phi^{(i)} \tag{8.111}$$

where $\mathbf{L}^{(i)} = [L_i\ \ L_{i+1}]$ and $\phi^{(i)} = [y_i\ \ y_{i+1}]^T$.

It may be noted that

$$L_i(x_j) = \begin{cases} 0, & i \ne j\\ 1, & i = j.\end{cases}$$

From (8.110), it is easy to obtain

$$\frac{\partial L_i}{\partial x} = -\frac{1}{x_{i+1} - x_i} = -\frac{1}{h_i} \quad \text{and} \quad \frac{\partial L_{i+1}}{\partial x} = \frac{1}{x_{i+1} - x_i} = \frac{1}{h_i}. \tag{8.112}$$

The value of $y^{(i)}(x)$ is substituted from (8.111) into (8.106) to obtain

$$J_i = \frac{1}{2}\int_{x_i}^{x_{i+1}}\bigg[p(x)\Big(\big[L_i'\ \ L_{i+1}'\big]\begin{pmatrix}y_i\\ y_{i+1}\end{pmatrix}\Big)^{2} + q(x)\Big(\big[L_i\ \ L_{i+1}\big]\begin{pmatrix}y_i\\ y_{i+1}\end{pmatrix}\Big)^{2} - 2r(x)\big[L_i\ \ L_{i+1}\big]\begin{pmatrix}y_i\\ y_{i+1}\end{pmatrix}\bigg] dx,$$

where the prime denotes differentiation with respect to $x$.

To extremize $J_i$, differentiate it with respect to $y_i$ and $y_{i+1}$:

$$\frac{\partial J_i}{\partial y_i} = \int_{x_i}^{x_{i+1}}\bigg[p(x)L_i'\big[L_i'\ \ L_{i+1}'\big]\begin{pmatrix}y_i\\ y_{i+1}\end{pmatrix} + q(x)L_i\big[L_i\ \ L_{i+1}\big]\begin{pmatrix}y_i\\ y_{i+1}\end{pmatrix} - r(x)L_i\bigg] dx = 0 \tag{8.113}$$

and

$$\frac{\partial J_i}{\partial y_{i+1}} = \int_{x_i}^{x_{i+1}}\bigg[p(x)L_{i+1}'\big[L_i'\ \ L_{i+1}'\big]\begin{pmatrix}y_i\\ y_{i+1}\end{pmatrix} + q(x)L_{i+1}\big[L_i\ \ L_{i+1}\big]\begin{pmatrix}y_i\\ y_{i+1}\end{pmatrix} - r(x)L_{i+1}\bigg] dx = 0. \tag{8.114}$$

These two equations can be written in matrix form as

$$\int_{x_i}^{x_{i+1}}\bigg[p(x)\begin{pmatrix}L_i'L_i' & L_i'L_{i+1}'\\ L_{i+1}'L_i' & L_{i+1}'L_{i+1}'\end{pmatrix} + q(x)\begin{pmatrix}L_iL_i & L_iL_{i+1}\\ L_{i+1}L_i & L_{i+1}L_{i+1}\end{pmatrix}\bigg] dx\begin{pmatrix}y_i\\ y_{i+1}\end{pmatrix} - \int_{x_i}^{x_{i+1}} r(x)\begin{pmatrix}L_i\\ L_{i+1}\end{pmatrix} dx = 0.$$

This gives

$$A^{(i)}\phi^{(i)} - b^{(i)} = 0, \tag{8.115}$$

where

$$A^{(i)} = \int_{x_i}^{x_{i+1}}\bigg[p(x)\begin{pmatrix}L_i'L_i' & L_i'L_{i+1}'\\ L_{i+1}'L_i' & L_{i+1}'L_{i+1}'\end{pmatrix} + q(x)\begin{pmatrix}L_iL_i & L_iL_{i+1}\\ L_{i+1}L_i & L_{i+1}L_{i+1}\end{pmatrix}\bigg] dx,$$
$$b^{(i)} = \int_{x_i}^{x_{i+1}} r(x)\begin{pmatrix}L_i\\ L_{i+1}\end{pmatrix} dx, \quad \text{and} \quad \phi^{(i)} = \begin{pmatrix}y_i\\ y_{i+1}\end{pmatrix}. \tag{8.116}$$

Step 4. Assembly of element equations

Let the elements of the matrix $A^{(i)}$ be taken as

$$A^{(i)} = \begin{pmatrix} a^{(i)}_{i,i} & a^{(i)}_{i,i+1}\\ a^{(i)}_{i+1,i} & a^{(i)}_{i+1,i+1}\end{pmatrix} \quad \text{and those of } b^{(i)} \text{ as} \quad b^{(i)} = \begin{pmatrix} b^{(i)}_i\\ b^{(i)}_{i+1}\end{pmatrix}.$$

Substituting $i = 0, 1, 2, \ldots, n-1$ in (8.115) and taking the sum over all the elements gives

$$\begin{pmatrix} a^{(0)}_{0,0} & a^{(0)}_{0,1}\\ a^{(0)}_{1,0} & a^{(0)}_{1,1}\end{pmatrix}\begin{pmatrix} y_0\\ y_1\end{pmatrix} + \begin{pmatrix} a^{(1)}_{1,1} & a^{(1)}_{1,2}\\ a^{(1)}_{2,1} & a^{(1)}_{2,2}\end{pmatrix}\begin{pmatrix} y_1\\ y_2\end{pmatrix} + \cdots + \begin{pmatrix} a^{(n-1)}_{n-1,n-1} & a^{(n-1)}_{n-1,n}\\ a^{(n-1)}_{n,n-1} & a^{(n-1)}_{n,n}\end{pmatrix}\begin{pmatrix} y_{n-1}\\ y_n\end{pmatrix}$$
$$= \begin{pmatrix} b^{(0)}_0\\ b^{(0)}_1\end{pmatrix} + \begin{pmatrix} b^{(1)}_1\\ b^{(1)}_2\end{pmatrix} + \cdots + \begin{pmatrix} b^{(n-1)}_{n-1}\\ b^{(n-1)}_n\end{pmatrix}.$$

After simplification, the assembled matrix equation becomes

$$\begin{pmatrix}
a^{(0)}_{0,0} & a^{(0)}_{0,1} & 0 & \cdots & 0 & 0\\
a^{(0)}_{1,0} & a^{(0)}_{1,1} + a^{(1)}_{1,1} & a^{(1)}_{1,2} & \cdots & 0 & 0\\
0 & a^{(1)}_{2,1} & a^{(1)}_{2,2} + a^{(2)}_{2,2} & \cdots & 0 & 0\\
\cdots & \cdots & \cdots & \cdots & \cdots & \cdots\\
0 & 0 & 0 & \cdots & a^{(n-1)}_{n,n-1} & a^{(n-1)}_{n,n}
\end{pmatrix}
\begin{pmatrix} y_0\\ y_1\\ y_2\\ \vdots\\ y_n\end{pmatrix}
= \begin{pmatrix} b^{(0)}_0\\ b^{(0)}_1 + b^{(1)}_1\\ b^{(1)}_2 + b^{(2)}_2\\ \vdots\\ b^{(n-1)}_n\end{pmatrix}.$$

This equation can be written as

$$A\phi = b. \tag{8.117}$$

It may be noted that $A$ is a tri-diagonal matrix. The solution of this system can be determined by the method discussed in Chapter 5.

Step 5. Incorporation of boundary conditions

For simplicity, let the system (8.117) be written in the following form:

$$\begin{pmatrix}
a_{0,0} & a_{0,1} & 0 & 0 & \cdots & 0 & 0\\
a_{1,0} & a_{1,1} & a_{1,2} & 0 & \cdots & 0 & 0\\
0 & a_{2,1} & a_{2,2} & a_{2,3} & \cdots & 0 & 0\\
\cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots\\
0 & 0 & 0 & 0 & \cdots & a_{n-1,n-1} & a_{n-1,n}\\
0 & 0 & 0 & 0 & \cdots & a_{n,n-1} & a_{n,n}
\end{pmatrix}
\begin{pmatrix} y_0\\ y_1\\ y_2\\ \vdots\\ y_{n-1}\\ y_n\end{pmatrix}
= \begin{pmatrix} b_0\\ b_1\\ b_2\\ \vdots\\ b_{n-1}\\ b_n\end{pmatrix}. \tag{8.118}$$

The second and the $n$th equations are $a_{1,0}y_0 + a_{1,1}y_1 + a_{1,2}y_2 = b_1$ and $a_{n-1,n-2}y_{n-2} + a_{n-1,n-1}y_{n-1} + a_{n-1,n}y_n = b_{n-1}$. Now, the boundary conditions $y(a) = y_a$ and $y(b) = y_b$ are introduced into these equations and they become

$$a_{1,1}y_1 + a_{1,2}y_2 = b_1 - a_{1,0}y_a$$
$$\text{and} \quad a_{n-1,n-2}y_{n-2} + a_{n-1,n-1}y_{n-1} = b_{n-1} - a_{n-1,n}y_b. \tag{8.119}$$

Now, the first and last rows and columns are removed from $A$, the first and last elements are removed from $b$, and the equations of (8.119) are incorporated. The equation (8.118) finally reduces to

$$\begin{pmatrix}
a_{1,1} & a_{1,2} & 0 & \cdots & 0\\
a_{2,1} & a_{2,2} & a_{2,3} & \cdots & 0\\
\cdots & \cdots & \cdots & \cdots & \cdots\\
0 & 0 & 0 & \cdots & a_{n-1,n-1}
\end{pmatrix}
\begin{pmatrix} y_1\\ y_2\\ \vdots\\ y_{n-1}\end{pmatrix}
= \begin{pmatrix} b_1\\ b_2\\ \vdots\\ b_{n-1}\end{pmatrix}. \tag{8.120}$$

The above equation is a tri-diagonal system of equations in the $(n-1)$ unknowns $y_1, y_2, \ldots, y_{n-1}$.
Example 8.9.1 Solve the boundary value problem

$$\frac{d^2y}{dx^2} + 2y = x, \quad y(0) = 1, \ y(1) = 2$$

using the finite element method for two and four elements of equal length.

Solution. Here $p(x) = -1$, $q(x) = 2$, $r(x) = x$. If the lengths of the elements are equal then $h_i = h$ (say) for all $i$.
Now, $L_i' = -\dfrac{1}{h}$, $L_{i+1}' = \dfrac{1}{h}$.

Therefore,

$$A^{(i)} = \int_{x_i}^{x_{i+1}}\bigg[-\begin{pmatrix}L_i'L_i' & L_i'L_{i+1}'\\ L_{i+1}'L_i' & L_{i+1}'L_{i+1}'\end{pmatrix} + 2\begin{pmatrix}L_iL_i & L_iL_{i+1}\\ L_{i+1}L_i & L_{i+1}L_{i+1}\end{pmatrix}\bigg] dx
= -\frac{1}{h}\begin{pmatrix}1 & -1\\ -1 & 1\end{pmatrix} + \frac{h}{3}\begin{pmatrix}2 & 1\\ 1 & 2\end{pmatrix}.$$

Also,

$$\int_{x_i}^{x_{i+1}} xL_i\, dx = \int_{x_i}^{x_{i+1}} \frac{x(x_{i+1} - x)}{h}\, dx = \frac{h}{6}(3x_i + h)$$

and

$$\int_{x_i}^{x_{i+1}} xL_{i+1}\, dx = \int_{x_i}^{x_{i+1}} \frac{x(x - x_i)}{h}\, dx = \frac{h}{6}(3x_i + 2h).$$

Then

$$b^{(i)} = \int_{x_i}^{x_{i+1}} x\begin{pmatrix}L_i\\ L_{i+1}\end{pmatrix} dx = \frac{h}{6}\begin{pmatrix}3x_i + h\\ 3x_i + 2h\end{pmatrix}.$$

For two elements of equal length.

In this case $h = 1/2$.

[Figure 8.6: Two elements: nodes $y_0, y_1, y_2$ at $x = 0, 1/2, 1$.]

For element $e_0$: $x_0 = 0$, $x_1 = 1/2$.

$$A^{(0)} = -2\begin{pmatrix}1 & -1\\ -1 & 1\end{pmatrix} + \frac{1}{6}\begin{pmatrix}2 & 1\\ 1 & 2\end{pmatrix} = \frac{1}{6}\begin{pmatrix}-10 & 13\\ 13 & -10\end{pmatrix},$$
$$b^{(0)} = \frac{1}{12}\begin{pmatrix}1/2\\ 1\end{pmatrix} = \frac{1}{24}\begin{pmatrix}1\\ 2\end{pmatrix}, \qquad \phi^{(0)} = \begin{pmatrix}y_0\\ y_1\end{pmatrix}.$$

For element $e_1$: $x_1 = 1/2$, $x_2 = 1$.

$$A^{(1)} = \frac{1}{6}\begin{pmatrix}-10 & 13\\ 13 & -10\end{pmatrix}, \qquad b^{(1)} = \frac{1}{24}\begin{pmatrix}4\\ 5\end{pmatrix}, \qquad \phi^{(1)} = \begin{pmatrix}y_1\\ y_2\end{pmatrix}.$$

Combination of these two element equations gives

$$\frac{1}{6}\begin{pmatrix}-10 & 13 & 0\\ 13 & -10 & 0\\ 0 & 0 & 0\end{pmatrix}\begin{pmatrix}y_0\\ y_1\\ y_2\end{pmatrix} + \frac{1}{6}\begin{pmatrix}0 & 0 & 0\\ 0 & -10 & 13\\ 0 & 13 & -10\end{pmatrix}\begin{pmatrix}y_0\\ y_1\\ y_2\end{pmatrix} = \frac{1}{24}\begin{pmatrix}1\\ 2\\ 0\end{pmatrix} + \frac{1}{24}\begin{pmatrix}0\\ 4\\ 5\end{pmatrix}$$

or,

$$4\begin{pmatrix}-10 & 13 & 0\\ 13 & -20 & 13\\ 0 & 13 & -10\end{pmatrix}\begin{pmatrix}y_0\\ y_1\\ y_2\end{pmatrix} = \begin{pmatrix}1\\ 6\\ 5\end{pmatrix}.$$

Incorporating the boundary conditions $y_0 = 1$, $y_2 = 2$, the second equation is written as
$$4(13y_0 - 20y_1 + 13y_2) = 6.$$
This gives $80y_1 = 52(y_0 + y_2) - 6$, that is, $y_1 = 15/8 = 1.875$.

The solution of the given equation for two elements can also be written as

$$y(x) = \begin{cases}
\dfrac{x_1 - x}{x_1 - x_0}y_0 + \dfrac{x - x_0}{x_1 - x_0}y_1 = 1 + 1.75x, & 0 \le x \le 1/2\\[2mm]
\dfrac{x_2 - x}{x_2 - x_1}y_1 + \dfrac{x - x_1}{x_2 - x_1}y_2 = 0.25x + 1.75, & 1/2 \le x \le 1.
\end{cases}$$
For four elements of equal length.

In this case $h = 1/4$.

[Figure 8.7: Four elements: nodes $y_0, \ldots, y_4$ at $x = 0, 1/4, 1/2, 3/4, 1$, with elements $e_0, e_1, e_2, e_3$.]

For element $e_0$: $x_0 = 0$, $x_1 = 1/4$.
$$A^{(0)} = \frac{1}{12}\begin{pmatrix}-46 & 49\\ 49 & -46\end{pmatrix}, \quad b^{(0)} = \frac{1}{96}\begin{pmatrix}1\\ 2\end{pmatrix}, \quad \phi^{(0)} = \begin{pmatrix}y_0\\ y_1\end{pmatrix}.$$

For element $e_1$: $x_1 = 1/4$, $x_2 = 1/2$.
$$A^{(1)} = \frac{1}{12}\begin{pmatrix}-46 & 49\\ 49 & -46\end{pmatrix}, \quad b^{(1)} = \frac{1}{96}\begin{pmatrix}4\\ 5\end{pmatrix}, \quad \phi^{(1)} = \begin{pmatrix}y_1\\ y_2\end{pmatrix}.$$

For element $e_2$: $x_2 = 1/2$, $x_3 = 3/4$.
$$A^{(2)} = \frac{1}{12}\begin{pmatrix}-46 & 49\\ 49 & -46\end{pmatrix}, \quad b^{(2)} = \frac{1}{96}\begin{pmatrix}7\\ 8\end{pmatrix}, \quad \phi^{(2)} = \begin{pmatrix}y_2\\ y_3\end{pmatrix}.$$

For element $e_3$: $x_3 = 3/4$, $x_4 = 1$.
$$A^{(3)} = \frac{1}{12}\begin{pmatrix}-46 & 49\\ 49 & -46\end{pmatrix}, \quad b^{(3)} = \frac{1}{96}\begin{pmatrix}10\\ 11\end{pmatrix}, \quad \phi^{(3)} = \begin{pmatrix}y_3\\ y_4\end{pmatrix}.$$

Now, the four element equations, each padded to the full $5 \times 5$ system, are added. After simplification, the assembled equation becomes

$$\frac{1}{12}\begin{pmatrix}
-46 & 49 & 0 & 0 & 0\\
49 & -92 & 49 & 0 & 0\\
0 & 49 & -92 & 49 & 0\\
0 & 0 & 49 & -92 & 49\\
0 & 0 & 0 & 49 & -46
\end{pmatrix}
\begin{pmatrix}y_0\\ y_1\\ y_2\\ y_3\\ y_4\end{pmatrix}
= \frac{1}{96}\begin{pmatrix}1\\ 6\\ 12\\ 18\\ 11\end{pmatrix}.$$

Incorporating the boundary conditions $y_0 = 1$, $y_4 = 2$, the above system reduces to [using (8.120)]

$$\begin{pmatrix}-92 & 49 & 0\\ 49 & -92 & 49\\ 0 & 49 & -92\end{pmatrix}\begin{pmatrix}y_1\\ y_2\\ y_3\end{pmatrix} = \frac{1}{4}\begin{pmatrix}-193\\ 6\\ -383\end{pmatrix}.$$

The solution of this system is
$$y_1 = 1.53062, \quad y_2 = 1.88913, \quad y_3 = 2.04693.$$

That is, $y(0) = 1$, $y(1/4) = 1.53062$, $y(1/2) = 1.88913$, $y(3/4) = 2.04693$, $y(1) = 2$.
Ordinary Differential Equations 571

The solution at each element is



 x1 − x x − x0

 y0 + y1 = 1 + 2.12248x, 0 ≤ x ≤ 1/4

 x1 − x0 x1 − x0





 x2 − x x − x1

 y2 = 1.17211 + 1.43404x, 1/4 ≤ x ≤ 1/2

 x2 − x1
y1 +
x2 − x1
y(x) =

 x3 − x x − x2

 y2 + y3 = 1.57353 + 0.63120x, 1/2 ≤ x ≤ 3/4

 x3 − x2 x3 − x2







 x4 − x x − x3
 y3 + y4 = 2.18772 − 0.18772x, 3/4 ≤ x ≤ 1.
x4 − x3 x4 − x3
The exact solution of the equation is
√ √ x
y = cos 2x + 1.3607 sin 2x + .
2

8.10 Discussion About the Methods

In this section, the advantages and disadvantages of the numerical methods to solve
ordinary differential equations presented in this chapter are placed.
The Taylor’s series method has a serious disadvantage that higher order derivatives of
f (x, y) are required while computing y at a given value of x. Since this method involves
several computations of higher order derivatives, it is a labourious method, but, once
the series is available then one can compute the values of y at different values of x,
provided the step size is small enough.
The Picard’s method involves successive integrations and it is difficult for a compli-
cated function.
The Euler’s method is the simplest and the most crude of all single-step methods and
gives a rough idea about the solution.
The Runge-Kutta methods are most widely used method to solve a single or a system
of IVP, though, it is laborious. These methods can also be used to solve higher order
equations. To get the starting values of some predictor-corrector methods the Runge-
Kutta methods are used. After the starting values are found, the remaining values can
be determined by using predictor-corrector methods. An important drawback of Runge-
Kutta method is that there is no technique to check or estimate the error occurs at any
step. If an error generates at any step, then it is propagated through the subsequent
steps without detection.
The multistep method is, in general, more efficient than a single-step method. When
starting values are available then at each step only one (for explicit method) or a few
572 Numerical Analysis

(for implicit method) functions evaluations are required. In this contrast, a single-step
method requires multiple functions evaluations, but, single-step method is self starting.
The finite difference method is useful to solve second order IVP and BVP. To solve
BVP, an algebraic tri-diagonal system of equations is generated. When the step size h
is small then this method gives better result, but, for small h, the number of equations
becomes very large and then it is very complicated to solve such system.

8.11 Stability Analysis

Stability analysis is an important part in the study of the numerical methods to solve
differential equations. Most of the methods used to solve differential equation are based
on difference equation. To study the stability, the model differential and difference
equations are defined in the following.

8.11.1 Model differential problem


For convenience and feasibility of analytical treatment and without loss of generality,
stability analysis will be performed on the model initial value differential equation

y  = λy, y(0) = y0 , (8.121)

where λ is a constant and it may be a real or a complex number. The solution of this
problem is

y = eλt y0 . (8.122)

In our treatment, let us consider λ = λR +iλI , where λR and λI represent respectively


the real and imaginary parts of λ, with λR ≤ 0.

8.11.2 Model difference problem


Similar to the differential problem, let us consider the single first order linear model
initial value difference problem

yn+1 = σyn , n = 0, 1, 2, . . . (8.123)

where y0 is given and σ is, in general, a complex number. The solution of this problem
is

y n = σ n y0 . (8.124)

It may be noted that the solution remains bounded only if |σ| ≤ 1.


Ordinary Differential Equations 573

Im(λh)
6

- Re(λh)
1 O

Stability region
of exact solution

Figure 8.8: Stability region of exact solution.

The connection between the exact solution and the difference solution is evident if
we evaluate the exact solution at tn = nh, for n = 0, 1, . . . where h > 0 and
yn = eλtn y0 = eλnh y0 = σ n y0 (8.125)
where σ = eλh .
If the exact solution is bounded then |σ| = |eλh | ≤ 1. This is possible if Re(λh) =
λR h ≤ 0.
That is, in the Re(λh)-Im(λh) plane, the region of stability of the exact solution is
the left half-plane as shown in Figure 8.8.
The single-step method is called absolutely stable if |σ| ≤ 1 and relatively stable
if |σ| ≤ eλh . If λ is pure imaginary and |σ| = 1, then the absolute stability is called the
periodic stability (P-stability).
When the region of stability of a difference equation is identical to the region of
stability of the differential equation, the finite difference scheme is sometimes referred
to as A-stable.

8.11.3 Stability of Euler’s method


The solution scheme of (8.121) by Euler’s method is
yn+1 = yn + hf (xn , yn ) = yn + λhyn = (1 + λh)yn . (8.126)
The solution of this difference equation is
yn = (1 + λh)n y0 = σ n y0 , (8.127)
where σ = 1 + λh.
The numerical method is stable if |σ| ≤ 1.
Now, the different cases are discussed for different nature of λ.
574 Numerical Analysis

(i) Real λ: |1 + λh| < 1 or, −2 < λh < 0,


(ii) Pure imaginary λ: √
Let λ = iw, w is real.
Then |1 + iwh| = 1 + w2 h2 > 1. That is, the method is not stable when λ is
pure imaginary.
(iii) Complex λ: Let λ = λR + iλI . Then
|σ| = |1 + λh| = |1 + λR h + iλI h| = (1 + λR h)2 + (λI h)2 ≤ 1.
It means λh lies inside the unit circle.

That is, only a small portion of the left half-plane is the region of stability for the
Euler’s method. This region is inside the circle (1 + λR h)2 + (λI h)2 = 1, which is shown
in Figure 8.9.
Im(λh)
6

qs q - Re(λh)
−2 −1 O

Stability
region

Figure 8.9: Stability region of Euler’s method.

For any value of λh in the left half-plane and outside this circle, the numerical solution
blows-up while the exact solution decays. Thus the numerical method is conditionally
stable.
To get a stable numerical solution, the step size h must be reduced so that λh falls
within the circle. If λ is real and negative then the maximum step size for stability is
0 ≤ h ≤ 2/|λ|. The circle is only tangent to the imaginary axis. If λ is real and the
numerical solution is unstable, then |1 + λh| > 1, which means that (1 + λh) is negative
with magnitude greater than 1. Since yn = (1 + λh)n y0 , the numerical solutions exhibits
oscillations with changes of sign at every step. This behavior of the numerical solutions
is a good indicator of instability.

Note 8.11.1 The numerical stability does not imply accuracy. A method can be stable
even if it gives inaccurate result. From the stability point of view, our objective is to
use the maximum step size h to reach the final destination at x = xn . If h is large then
number of function evaluations is low and needs low computational cost. This may not
be the optimum h for acceptable accuracy, but, it is optimum for stability.
Ordinary Differential Equations 575

8.11.4 Stability of Runge-Kutta methods


Let us consider the second-order Runge-Kutta method for the model equation y  = λy.
Then

k1 = hf (xn , yn ) = λhyn
k2 = hf (xn + h, yn + k1 ) = λh(yn + k1 ) = λh(yn + λhyn )
= λh(1 + λh)yn

and
 
1 (λh)2
yn+1 = yn + (k1 + k2 ) = yn + λh + yn
2 2
 
λ2 h2
= 1 + λh + yn . (8.128)
2

This expression confirms that the method is of second order accuracy. For stability,
|σ| ≤ 1 where

λ2 h2
σ = 1 + λh + . (8.129)
2
Now, the stability is discussed for different cases of λ.

λ2 h2
(i) Real λ: 1 + λh + 2≤ 1 or, −2 ≤ λh ≤ 0.
+
(ii) Pure imaginary λ: Let λ = iw. Then |σ| = 1 + 14 w4 h4 > 1. That is, the method
is unstable.
2 2
(iii) Complex λ: Let 1 + λh + λ 2h = eiθ and find the complex roots, λh, of this
polynomial for different values of θ. Note that |σ| = 1 for all values of θ.

The resulting stability region is shown in Figure 8.10.


When fourth-order Runge-Kutta method is applied to the model equation y  = λy
then

k1 = λhyn
k2 = λh(yn + k1 /2) = λh(1 + λh/2)yn
 1 1 
k3 = λh(yn + k2 /2) = λh 1 + λh + λ2 h2 yn
2 4
 1 2 2 1 3 3
k4 = λh(yn + k3 ) = λh 1 + λh + λ h + λ h yn .
2 4
576 Numerical Analysis

Im(λh)
6

2 2
Second order
RK method

3 2
q
1

- Re(λh)
−2.785 −2 −1 O

–1
 √
− 3 –2
Fourth order
RK method √
−2 2

Figure 8.10: Stability regions of Runge-Kutta methods.

Therefore,

1
yn+1 = yn + (k1 + 2k2 + 2k3 + k4 )
6
1 1 1
= 1 + λh + (λh)2 + (λh)3 + (λh)4 yn , (8.130)
2! 3! 4!

which confirms the fourth-order accuracy of the method.


For different λ, the stability of fourth-order Runge-Kutta method is discussed in the
following:

(i) Real λ: −2.785 ≤ λh ≤ 0.



(ii) Pure imaginary λ: 0 ≤ |λh| ≤ 2 2.

(iii) Complex λ: In this case, the stability region is obtained by finding the roots of
the fourth-order polynomial with complex coefficients:
1 1 1
1 + λh + (λh)2 + (λh)3 + (λh)4 = eiθ .
2! 3! 4!
Ordinary Differential Equations 577

The region of stability is shown in Figure 8.10. It shows a significant improvement


over the second-order Runge-Kutta scheme. In particular, it has a large stability region
on the imaginary axis.

8.11.5 Stability of Finite difference method


To test the stability of finite difference method, let us consider a simple second order
differential equation

y  + ky  = 0, (8.131)

where k is a constant and very large in comparison to 1.


The central difference approximation for (8.131) is
1 k
(yi+1 − 2yi + yi+1 ) + (yi+1 − yi−1 ) = 0. (8.132)
h2 2h
The solution of this difference equation is
 
2 − kh i
yi = A + B , (8.133)
2 + kh
where A and B are arbitrary constants to be determined by imposing boundary condi-
tions.
The analytical solution of (8.131) is

y(x) = A + Be−kx . (8.134)

If k > 0 and x → ∞ then the solution becomes bounded. The term e−kx is monotonic
for k > 0 and k < 0. Thus we expect the finite difference is also monotonic for k > 0
and k < 0.
If the behavior of the exponential term of (8.133) is analyzed, it is observed that it
behaves monotonic for k > 0 and k < 0 if h < |2/k|. This is the condition for stability
of the difference scheme (8.132).

8.12 Exercise

1. Discuss Taylor’s series method to solve an initial value problem of the form
dy
= f (x, y), y(x0 ) = y0 .
dx

dy
2. For the differential equation − 1 = x2 y with y(0) = 1, obtain Taylor’s series
dx
for y(x) and compute the values of y(0.1), y(0.2) correct up to four decimal places.
578 Numerical Analysis

3. Use Taylor’s series method to find the solution of


dy
= t + y 2 with y(1) = 0
dt
at t = 1.5.
4. Use Taylor’s series method to solve y  = y sin x + cos x at x = 0.5, 0.6 with initial
condition x = 0, y = 0.
5. Use Picard’s method to solve the following differential equation
dy
+ 2y = 0, y(0) = 1
dx
at x = 0.1.
6. Using the Picard’s method, find three successive approximations to the solutions
of the following differential equations:
(a) y  = 2y(1 + x), y(0) = 1,

(b) y = x − y, y(0) = 1.
7. Deduce Euler’s method to solve the initial value problem y  = f (x, y), y(x0 ) = y0
using (i) Taylor’s series, and (ii) forward difference formula.
8. Discuss Euler’s method to solve the differential equation of the form y  = f (x, y),
y(x0 ) = y0 .
9. Solve the following differential equation
dy
= 3x2 + y, y(0) = 4
dx
for the range 0.1 ≤ x ≤ 0.5, using Euler’s method by taking h = 0.1.
10. Obtain numerically the solution of
y  = x2 + y 2 , y(0) = 0.5
using Euler’s method to find the value of y at x = 0.1 and 0.2.
11. Deduce modified Euler’s method and compare it with Euler’s method.
12. Give geometrical interpretations of Euler’s and modified Euler’s methods.
13. Using modified Euler’s method, evaluate y(0.1) correct to two significant figures
from the differential equation
dy
= y + x, y = 1 when x = 0, taking h = 0.05.
dx
Ordinary Differential Equations 579

14. Deduce second and fourth order Runge-Kutta methods to solve the initial value
problem y  = f (x, y) with y(x0 ) = y0 .
15. Explain the significance of the numbers k1 , k2 , k3 , k4 used in fourth order Runge-
Kutta methods.
16. Analyze the stability of second and fourth order Runge-Kutta methods.
17. Use second order Runge-Kutta method to solve the initial value problem
dy
5 = x2 + y 2 , y(0) = 1
dx
and find y in the interval 0 ≤ x ≤ 0.4, taking h = 0.1.
18. Using Runge-Kutta method of fourth order, solve
dy
= xy + y 2 ,
dx
given that y(0) = 1. Take h = 0.2 and find y at x = 0.2, 0.4, 0.6.
19. Use Runge-Kutta method of fourth order to compute the solution of the following
problem in the interval [0,0.1] with h = 0.02.

y  = x + y, y(0) = 1.

20. Consider the following system of first order differential equation


dy dz
= y + 2z, = 3y + 2z
dx dx
with y(0) = 6, z(0) = 4. Use second and fourth order Runge-Kutta methods to
find the values of y and z at x = 0.1, 0.2.
21. Solve the system ẋ = x + 2y, ẏ = 2x + y with the initial condition x(0) = −2 and
y(0) = 2 over the interval 0 ≤ t ≤ 1 using the step size h = 0.1. Fourth order
Runge-Kutta method may be used.
22. Use fourth order Runge-Kutta method to solve the second order initial value
problem 2y  (x)−6y  (x)+2y(x) = 4ex with y(0) = 1 and y  (0) = 1, at x = 0.1, 0.2.
23. Use fourth order Runge-Kutta method to solve y  = xy  +y with initial conditions
y(0) = 1, y  (0) = 2. Take h = 0.2 and find y and y  at x = 0.2.
24. Solve the following initial value problem y  = −xy over [0, 0.2] with y(0) = 1 using
(a) fourth order Runge-Kutta method,
(b) Runge-Kutta-Butcher method, and compare them.
580 Numerical Analysis

25. Solve the following initial value problem y  + xy = 1 at x = 0.1 with initial
condition y(0) = 1 using Runge-Kutta-Fehlberg method.

26. Discuss Milne’s predictor-corrector formula to find the solution of y  = f (x, y),
y(x0 ) = y0 .

27. Use Milne’s predictor-corrector formula to find the solutions at x = 0.4, 0.5, 0.6 of
the differential equation
dy
= x3 + y 2 , y(0) = 1.
dx

28. Deduce Adams-Bashforth-Moulton predictor-corrector formula to find the solution


of y  = f (x, y), y(x0 ) = y0 .

29. Consider the initial value problem


dy
= x2 y + y 2 , y(0) = 1.
dx
Find the solution of this equation at x = 0.4, 0.5, 0.6 using Adams-Bashforth-
Moulton predictor-corrector method.

30. Use Adams-Bashforth-Moulton and Milne’s predictor-corrector methods with h =


0.5 to compute the approximate solution of the IVP y  = (x − y)/3 with y(0) = 1
over [0, 2]. The Runge-Kutta method may be used to find the starting solution.

31. Using Milne’s predictor-corrector method, find the solution of y  = xy +y 2 , y(0) =


1 at x = 0.4 given that y(0.1) = 1.11689, y(0.2) = 1.27739 and y(0.3) = 1.50412.

32. Find the solution of the IVP y  = y 2 sin x with y(0) = 1 using Adams-Bashforth-
Moulton predictor-corrector method, in the interval [0.2,0.3], given that y(0.05) =
1.00125, y(0.10) = 1.00502 and y(0.15) = 1.01136.

33. Use Adams-Bashforth method to solve the differential equation y  = 2x − 3y for


the initial condition y(0) = 1 on the interval [0,1]. The starting solutions may be
obtained by Euler’s method.

34. Consider the second order initial value problem

d2 y
+ xy = 0
dx2
with initial conditions y(0) = 1, y  (0) = 0. Compute y(0.2) and y(0.4) using
(a) Runge-Kutta methods,
(b) Finite difference method, and compare them.
Ordinary Differential Equations 581

35. Convert the following second order differential equation y  +p(x)y  +q(x)y+r(x) =
0 with y(x0 ) = y0 , y(xn ) = yn into a system of algebraic equations.

36. Solve the boundary value problem


2x  2
y  = 2
y − y+1
1+x 1 + x2
with y(0) = 1.25 and y(2) = −0.95 for x = 0.5, 1.0, 1.5, using finite difference
method.

37. Use finite difference method to solve the BVP y  + 2y  + y = 10x, y(0) = 0 and
y(1) = 0 at x = 0.25, 0.50, 0.75, and compare these results with exact solutions.

38. Use the finite difference method to find the solution of the boundary value problem
y  − xy  + 2y = x + 1 with boundary conditions y(0.9) = 0.5, y  (0.9) = 2,
y(1.2) = 1, taking h = 0.1.

39. Convert the following BVP y  − 12xy + 1 = 0 with boundary conditions y(0) =
y(1) = 0, into two initial value problems.

40. Use shooting method to solve the boundary value problem y  + xy = 0 with
boundary conditions y(0) = 0, y(1) = 1.

41. Consider the boundary value problem y  − y = 1, 0 < x < 1 with the boundary
conditions y(0) = 0, y(1) = e − 1.
Use finite element method to find the solution of this problem.

42. Solve the boundary value problem y  + y = x2 , 0 < x < 1 with boundary condi-
tion y(0) = 3, y(1) = 1 using finite element method for two and three elements of
equal lengths.

43. Use finite element method to obtain the difference scheme for the boundary value
problem
d dy
(1 + x2 ) − y = 1 + x2 , y(−1) = y(1) = 0
dx dx
with linear shape functions and element length h = 0.25.
582 Numerical Analysis
Chapter 9

Partial Differential Equations

The partial differential equation (PDE) is an important subject in applied mathematics,


physics and some branches of engineering. Several analytical methods are available to
solve PDEs, but, these methods are based on advanced mathematical techniques. Also,
several numerical methods are available to solve PDEs. The numerical methods are, in
general, simple but generate erroneous result. Among several numerical methods, the
finite-difference method is widely used for its simplicity. In this chapter, only finite-
difference method is discussed to solve PDEs.
The general second order PDE is of the form

∂2u ∂2u ∂2u ∂u ∂u


A 2
+ B + C 2
+D +E + Fu = G (9.1)
∂x ∂x∂y ∂y ∂x ∂y
i.e., Auxx + Buxy + Cuyy + Dux + Euy + F u = G, (9.2)

where A, B, C, D, E, F, G are functions of x and y.


The PDEs can be classified into three different types – elliptic, hyperbolic and
parabolic. These classifications are done by computing the discriminant

∆ = B 2 − 4AC.

The equation (9.2) is called elliptic, parabolic and hyperbolic according as the value
of ∆ at any point (x, y) are <, = or > 0.

Elliptic equations
The simplest examples of this type of PDEs are Poisson’s equation

∂2u ∂2u
+ 2 = g(x, y) (9.3)
∂x2 ∂y

583
584 Numerical Analysis

and Laplace equation


∂2u ∂2u
+ 2 =0 or ∇2 u = 0. (9.4)
∂x2 ∂y
These equations are generally associated with equilibrium or steady-state problems.
For example, the velocity potential for the steady flow of incompressible non-viscous
fluid satisfies Laplace’s equation. The electric potential V associated with a two-
dimensional electron distribution of charge density ρ satisfies Poisson’s equation
∂2V ∂2V ρ
2
+ 2
= − , where ε is the dielectric constant.
∂x ∂y ε
The analytic solution of an elliptic equation is a function of x and y which satisfies
the PDE at every point of the region S which is bounded by a plane closed curve C
and satisfies some conditions at every point on C. The condition that the dependent
variable satisfies along the boundary curve C is known as boundary condition.

Parabolic equation
In parabolic equation, time t is involved as an independent variable.
The simplest example of parabolic equation is the heat conduction equation
∂u ∂2u
= α 2. (9.5)
∂t ∂x
The solution u is the temperature at a distance x units of length from one end of
a thermally insulated bar after t seconds of heat conduction. In this problem, the
temperatures at the ends of a bar are known for all time, i.e., the boundary conditions
are known.

Hyperbolic equation
In this equation also, time t appears as an independent variable.
The simplest example of hyperbolic equation is the one-dimensional wave equation
∂2u 2
2∂ u
= c . (9.6)
∂t2 ∂x2

Here u is the transverse displacement at a distance x from one end of a vibrating


string of length l after a time t.
Hyperbolic equations generally originate from vibration problems or from problems
where discontinuities can persist in time.
The values of u at the ends of the string are usually known for all time (i.e., boundary
conditions are known) and the shape and velocity of the string are given at initial time
(i.e., initial conditions are known).
Generally, three types of problems occur in PDEs.
Partial Differential Equations 585

(i) Dirichlet’s problem


Let R be a region bounded by a closed curve C and f be a continuous function
on C.
Now the problem is to find u satisfying the Laplace’s equation

∂2u ∂2u
+ 2 = 0 in R
∂x2 ∂y (9.7)
and u = f (x, y) on C.

(ii) Cauchy’s problem

∂2u ∂2u
= for t > 0
∂t2 ∂x2
u(x, 0) = f (x) (9.8)
and ut (x, 0) = g(x)
where f (x)and g(x) are arbitrary.
(iii)

∂u ∂2u
= for t > 0 (9.9)
∂t ∂x2
and u(x, 0) = f (x).

The above cited problems are all well-defined (i.e., well-posed) and it can be shown
that each of the above problems has unique solution.
But, the problem of Laplace’s equation with Cauchy boundary conditions, i.e., the
problem

∂2u ∂2u
+ 2 =0
∂x2 ∂y (9.10)
u(x, 0) = f (x)
and uy (x, 0) = g(x)

is an ill-posed problem.

9.1 Finite-Difference Approximations to Partial Derivatives

Let the xy plane be divided into sets of equal rectangles of sides ∆x = h and ∆y = k
by drawing the equally spaced grid lines parallel to the coordinates axes, defined by

xi = ih, i = 0, ±1, ±2, . . .


yj = jk, j = 0, ±1, ±2, . . . .
586 Numerical Analysis

The value of u at a mesh point (intersection of horizontal and vertical lines) P (xi , yj )
or, at P (ih, jk) is denoted by ui,j , i.e., ui,j = u(xi , yj ) = u(ih, jk).
Now,
ui+1,j − ui,j
ux (ih, jk) = + O(h) (9.11)
h
(forward difference approximation)
ui,j − ui−1,j
= + O(h) (9.12)
h
(backward difference approximation)
ui+1,j − ui−1,j
= + O(h2 ) (9.13)
2h
(central difference approximation)

Similarly, ui,j+1 − ui,j


uy (ih, jk) = + O(k) (9.14)
k
(forward difference approximation)
ui,j − ui,j−1
= + O(k) (9.15)
k
(backward difference approximation)
ui,j+1 − ui,j−1
= + O(k 2 ) (9.16)
2k
(central difference approximation)

ui−1,j − 2ui,j + ui+1,j


uxx (ih, jk) = + O(h2 ). (9.17)
h2
ui,j−1 − 2ui,j + ui,j+1
uyy (ih, jk) = + O(k 2 ). (9.18)
k2

9.2 Parabolic Equations

9.2.1 An explicit method


Let us consider the heat conduction equation

∂u ∂2u
= α 2. (9.19)
∂t ∂x
Using the finite-difference approximation for ut and uxx , equation (9.19) reduces to
ui,j+1 − ui,j ui−1,j − 2ui,j + ui+1,j
α , (9.20)
k h2
where xi = ih and tj = jk; i, j = 0, 1, 2, . . ..
Partial Differential Equations 587

This can be written as


ui,j+1 = rui−1,j + (1 − 2r)ui,j + rui+1,j , (9.21)
where r = αk/h2 .
This formula gives the unknown ‘temperature’ ui,j+1 at the mesh point (i, j + 1)
when the values of ui−1,j , ui,j and ui+1,j are known and hence the method is called the
explicit method. It can be shown that (by stability analysis) the formula is valid for
0 < r ≤ 1/2. t 6
u u

u u
u known values
u e u
ui,j+1 ⊕ unknown value

u u u u u
ui−1,j ui,j ui+1,j
u u
6
k h-
u? u u u u u u- x
t = 0, u = f (x)

Figure 9.1: Known and unknown meshes of heat equation.

Example 9.2.1 Solve the heat equation


∂u ∂2u
=
∂t ∂x2
subject to the boundary conditions u(0, t) = 0, u(1, t) = 2t and initial condition
u(x, 0) = 12 x.

Solution. Let h = 0.2 and k = 0.01, so r = k/h2 = 0.25 < 1/2. The initial and
boundary values are shown in the following table.
i=0 i=1 i=2 i=3 i=4 i=5
x = 0 x = 0.2 x = 0.4 x = 0.6 x = 0.8 x = 1.0
j = 0, t = 0.00 0.0 0.1 0.2 0.3 0.4 0.00
j = 1, t = 0.01 0.0 0.02
j = 2, t = 0.02 0.0 0.04
j = 3, t = 0.03 0.0 0.06
j = 4, t = 0.04 0.0 0.08
j = 5, t = 0.05 0.0 0.10
588 Numerical Analysis

For this problem the equation (9.21) reduces to


1
ui,j+1 = (ui−1,j + 2ui,j + ui+1,j ).
4
Then
1
u1,1 = (u0,0 + 2u1,0 + u2,0 ) = 0.1
4
1
u2,1 = (u1,0 + 2u2,0 + u3,0 ) = 0.2
4
and so on.
The complete values are shown in the following table.
i=0 i=1 i=2 i=3 i=4 i=5
x=0 x = 0.2 x = 0.4 x = 0.6 x = 0.8 x = 1.0
j = 0, t = 0.00 0.0 0.1 0.2 0.3 0.4 0.00
j = 1, t = 0.01 0.0 0.10000 0.20000 0.30000 0.27500 0.02
j = 2, t = 0.02 0.0 0.10000 0.20000 0.26875 0.21750 0.04
j = 3, t = 0.03 0.0 0.10000 0.19219 0.23875 0.18594 0.06
j = 4, t = 0.04 0.0 0.09805 0.18078 0.21391 0.16766 0.08
j = 5, t = 0.05 0.0 0.09422 0.16838 0.19406 0.15730 0.10

9.2.2 Crank-Nicolson implicit method


The above explicit method is simple but it has one serious drawback. This method
gives a meaningfull result if 0 < r ≤ 1/2, i.e., 0 < αk/h2 ≤ 1/2 or, αk ≤ h2 /2. This
means, the time step k is necessarily small. Crank and Nicolson (1947) have developed
a method that reduces the total computation time and is valid for all finite values of
r. In this method, the equation is approximated by replacing both space and time
derivatives by their central difference approximations at a point (ih, (j + 1/2)k), which
is the midpoint of the points (ih, jk) and (ih, (j + 1)k). Thus the equation (9.19) can
be written as
   2     2 
∂u ∂ u α ∂2u ∂ u
=α 2
= 2
+ . (9.22)
∂t i,j+1/2 ∂x i,j+1/2 2 ∂x i,j ∂x2 i,j+1
Then by using central difference approximation the above equation reduces to
ui,j+1 − ui,j α ui+1,j − 2ui,j + ui−1,j ui+1,j+1 − 2ui,j+1 + ui−1,j+1
= 2
+
k 2 k k2
and after simplification this equation gives
−rui−1,j+1 + (2 + 2r)ui,j+1 − rui+1,j+1 = rui−1,j + (2 − 2r)ui,j + rui+1,j , (9.23)
Partial Differential Equations 589

where r = αk/h2 .
In general, the left hand side of (9.23) contains three unknowns and the right hand
side has three known values of u.
For j = 0 and i = 1, 2, . . . , N −1, equation (9.23) generates N simultaneous equations
for N − 1 unknown u1,1 , u2,1 , . . . , uN −1,1 (of first row) in terms of known initial and
boundary values u0,0 , u1,0 , u2,0 , . . . , uN,0 ; u0,0 and uN,0 are the boundary values and
u1,0 , u2,0 , . . . , uN −1,0 are the initial values.
Similarly, for j = 1 and i = 1, 2, . . . , N − 1 we obtain another set of unknown values
u1,2 , u2,2 , . . . , uN −1,2 in terms of calculated values for j = 0, and so on.
t 6
u u
 o
u e e e u
left * j+1 i right
boundary boundary
values s u e
u u u u values
j
U u u
⊕ unknown values
u known values
u u u u u u u- x
i=0 i−1 i i+1 i=N
Figure 9.2: Meshes of Crank-Nicolson implicit method.

In this method, the value of ui,j+1 is not given directly in terms of known ui,j at one
time step earlier but is also a function of unknown values at previous time step, and
hence the method is called an implicit method.
The system of equation (9.23) can be viewed in the following matrix notation.

  u  
d1,j

2+2r −r 1,j+1
 −r −r  u   d2,j 
2+2r    
2,j+1

 ··· ··· ··· ··· ··· ··· ···  u   d3,j 
 

3,j+1 =
 


(9.24)
 −r 2+2r −r   ..
 
..

. .
−r 2+2r uN −1,j+1 dN −1,j
where

d1,j = ru0,j + (2 − 2r)u1,j + ru2,j + ru0,j+1


di,j = rui−1,j + (2 − 2r)ui,j + rui+1,j ; i = 2, 3, . . . , N − 2
dN −1,j = ruN −2,j + (2 − 2r)uN −1,j + ruN,j + ruN,j+1 .

The right hand side of (9.24) is known.


590 Numerical Analysis

This resulting tri-diagonal system can be solved by the method discussed in


Chapter 5.
Example 9.2.2 Use the Crank-Nicolson method to calculate a numerical solution
of the problem
∂u ∂2u
= , 0 < x < 1, t > 0
∂t ∂x2
 
1 1
where u(0, t) = u(1, t) = 0, t > 0, u(x, 0) = 2x, t = 0. Mention the value of u ,
2 8
1 1 1 1
by taking h = , and k = , .
2 4 8 16
Solution. Case I.
1 1
Let h = and k = .
2 8
k 1
In this case r = 2 = .
h 2
The Crank-Nicolson scheme (9.23) now becomes

−ui−1,j+1 + 6ui,j+1 − ui+1,j+1 = ui−1,j + 2ui,j + ui+1,j .

The boundary and initial conditions are shown in Figure 9.3.


t
6u0,1 u1,1 u2,1
j=1, t= 8 u
1 e u
e

u u u2,0
j=0, t=0 u 0,0 u 1,0 u - x
i=0 i=1 i=2
x=0 x= 12 x=1

Figure 9.3: Boundary and initial values when h = 1/2, k = 1/8.

The initial and boundary values are

u0,0 = 0, u1,0 = 1, u2,0 = 2; u0,1 = 0, u2,1 = 0.

Substituting i = 0 and j = 0 to the Crank-Nicolson scheme, we obtain

−u0,1 + 6u1,1 − u2,1 = u0,0 + 2u1,0 + u2,0 .

Using initial and boundary conditions the above equation reduces to


6u1,1 = 2 + 2 = 4, i.e., u1,1 = u(1/2, 1/8) = 2/3.
Partial Differential Equations 591

Case II.
1 1 k
Let h = and k = . In this case r = 2 = 2.
4 8 h
The Crank-Nicolson scheme is

−ui−1,j+1 + 3ui,j+1 − ui+1,j+1 = ui−1,j − ui,j + ui+1,j .

The initial and boundary conditions are shown in Figure 9.4.


t
6u0,1 u1,1 u2,1 u3,1 u4,1
j=1, t= 18 u e e e u
A B C
u0,0 u1,0 u2,0 u3,0 u
j=0, t=0 u u u u u-4,0x
i=0 i=1 i=2 i=3 i=4
x=0 x= 14 x= 12 x= 34 x=1

Figure 9.4: Boundary and initial values when h = 1/4, k = 1/8.

1 3
That is, u0,0 = 0, u1,0 = , u2,0 = 1, u3,0 = , u4,0 = 2; u0,1 = 0, u4,1 = 0.
2 2
The Crank-Nicolson equations for the mesh points A(i = 1), B(i = 2) and C(i = 3)
are respectively.

−u0,1 + 3u1,1 − u2,1 = u0,0 − u1,0 + u2,0


−u1,1 + 3u2,1 − u3,1 = u1,0 − u2,0 + u3,0
−u2,1 + 3u3,1 − u4,1 = u2,0 − u3,0 + u4,0 .

Using initial and boundary conditions the above system becomes


1 1
0 + 3u1,1 − u2,1 = 0 −+1=
2 2
1 3
−u1,1 + 3u2,1 − u3,1 = − 1 + = 1
2 2
3 3
−u2,1 + 3u3,1 + 0 = 1 − + 2 = .
2 2
The above
 1 system has three equations
 1 1and
 three unknownsand the solution is
1  17 5 3 1  31
u1,1 = u , = , u2,1 = u , = , u3,1 = u , = .
4 8 42 2 8 7 4 8 42
Case III.
1 1 1
Let h = , k = . Then r = 1. To find the value of u at t = we have to apply
4 16 8
two steps instead of one step as in Case I and Case II.
592 Numerical Analysis

The Crank-Nicolson scheme for this case is

−ui−1,j+1 + 4ui,j+1 − ui+1,j+1 = ui−1,j + ui+1,j .

The initial and boundary conditions


t are shown in Figure 9.5.
6u0,2 u1,2 u2,2 u3,2 u4,2
j=2, t= 18 u e e e u
D E F
u0,1 u1,1 u2,1 u3,1 u4,1
1
j=1, t= 16 u e e e u
A B C
u0,0 u1,0 u2,0 u3,0 u4,0
j=0, t=0 u u u u u- x
i=0 i=1 i=2 i=3 i=4
x=0 x= 14 x= 12 x= 34 x=1

Figure 9.5: Boundary and initial values when h = 1/4, k = 1/16.


Also, u0,0 = 0, u1,0 = 12 , u2,0 = 1, u3,0 = 32 , u4,0 = 2; u0,1 = 0, u4,1 = 0; u0,2 = 0,
u4,2 = 0
The Crank-Nicolson equations for the mesh points A(i = 1, j = 1),
B(i = 2, j = 1) and C(i = 3, j = 1) are respectively
−u0,1 + 4u1,1 − u2,1 = u0,0 + u2,0
−u1,1 + 4u2,1 − u3,1 = u1,0 + u3,0
−u2,1 + 4u3,1 − u4,1 = u2,0 + u4,0 .

That is,
4u1,1 − u2,1 = 1
1 3
−u1,1 + 4u2,1 − u3,1 = + = 2
2 2
−u2,1 + 4u3,1 = 3.

The solution of this system is


 1 1  13 1 1  6  3 1  27
u1,1 = u , = , u2,1 = u , = , u3,1 = u , = .
4 16 28 2 16 7 4 16 28
Again, the Crank-Nicolson equations for the mesh points
D(i = 1, j = 2), E(i = 2, j = 2) and F (i = 3, j = 2) are respectively,

−u0,2 + 4u1,2 − u2,2 = u0,1 + u2,1


−u1,2 + 4u2,2 − u3,2 = u1,1 + u3,1
−u2,2 + 4u3,2 − u4,2 = u2,1 + u4,1 .
Partial Differential Equations 593

Using boundary conditions and values of right hand side obtained in first step, the
above system becomes
6 6
4u1,2 − u2,2 = 0 + =
7 7
13 27 10
−u1,2 + 4u2,2 − u3,2 = + =
28 28 7
6 6
−u2,2 + 4u3,2 = +0= .
7 7
The solution of this system is
 1 1  17  1 1  26  3 1  17
u1,2 = u , = , u2,2 = u , = , u3,2 = u , = .
4 8 49 2 8 49 4 8 49

Algorithm 9.1 (Crank-Nicolson). This algorithm solves the heat equation


∂u ∂2u
∂t = α ∂x2 subject to the boundary conditions u(0, t) = f0 (t), u(l, t) = fl (t) and
initial condition u(x, 0) = g(x). The interval [0, l] is divided into N subintervals each
of length h. k is the step length of t.

Algorithm Crank Nicolson


Input functions f0 (t), fl (t), g(x).
Read N, M ;//number of subintervals of [0, l] and [0, t] respectively.//
Read l; //upper limit of x.//
Read k; //step length of t.//
Read α;
Compute h = l/N, r = αk/h2 ;
//Construction
 of the tri-diagonal matrix.// 
b1 c1 0 · · · 0 0 0
 a2 b2 c2 · · · 0 0 0 
 

Let T =  · · · · · · · · · · · · · · · · · · · · · 
 be the tri-diagonal matrix,
 0 0 0 · · · an−1 bn−1 cn−1 
0 0 0 · · · · · · an bn
where bi = 2 + 2r, i = 1, 2, . . . , n; ai = −r, i = 2, 3, . . . , n;
ci = −r, i = 1, 2, . . . , n − 1.
for i = 1 to N − 1 do
ui = g(i ∗ h);
endfor;
for j = 0 to M do
//construction of right hand vector//
Compute d1 = rf0 (j ∗ k) + (2 − 2r)u1 + ru2 + rf0 ((j + 1) ∗ k);
Compute dN −1 = ruN −2 + (2 − 2r)uN −1 + rfl (j ∗ k) + rfl ((j + 1) ∗ k);
endfor;
594 Numerical Analysis

for i = 2 to N − 2 do
Compute di = rui−1 + (2 − 2r)ui + rui+1 ;
endfor;
Solve the tri-diagonal system TU = D, where U = (u1 , u2 , . . . , uN −1 )t
and D = (d1 , d2 , . . . , dN −1 )t .
Print ‘The values of u’, u1 , u2 , . . . , uN −1 , ‘when t =’, (j + 1) ∗ k;
endfor;
end Crank Nicolson
Program 9.1
.
/* Program Crank-Nicolson
Program to solve the heat equation by Crank-Nicolson
method. This program solves the problem of the form
Ut=alpha Uxx, with boundary conditions
U(0,t)=f0(t); U(l,t)=fl(t); and initial condition
U(x,0)=g(x). The interval for x [0,l] is divided into
N subintervals of length h. k is the step length of t.
Here g(x)=cos(pi*x/2) with u(0,t)=1, u(1,t)=0, h=k=1/3.*/
#include<stdio.h>
#include<math.h>
#include<stdlib.h>
float u[10];
void main()
{
int i,j,N,M;
float h,k,l,alpha,r,a[10],b[10],c[10],d[10],y,temp1,temp2;
float TriDiag(float a[],float b[],float c[],float d[],int n);
float g(float x);
float f0(float t);
float fl(float t);
printf("Enter the number subintervals of x and t ");
scanf("%d %d",&N,&M);
printf("Enter the step size (k) of t ");
scanf("%f",&k);
printf("Enter the upper limit of x ");
scanf("%f",&l);
printf("Enter the value of alpha ");
scanf("%f",&alpha);
h=l/N; r=alpha*k/(h*h);
for(i=1;i<=N;i++) b[i]=2+2*r;
for(i=2;i<=N;i++) a[i]=-r;
Partial Differential Equations 595

for(i=1;i<N;i++) c[i]=-r;
for(i=0;i<=N;i++) u[i]=g(i*h);
printf("h=%8.5f, k=%8.5f, r=%8.5f\n",h,k,r);
temp1=f0(0);
if(fabs(temp1)!=fabs(u[0])){
printf("u[0]=%f and f0(0)=%f are different!\n",u[0],temp1);
printf("Enter correct value ");
scanf("%f",&temp1);}
temp2=fl(0);
if(fabs(temp2)!=fabs(u[N]))
{
printf("u[N]=%f and fl(0)=%f are different!\n",u[N],temp2);
printf("Enter correct value ");
scanf("%f",&temp2);
}
printf("Solution is\nx-> ");
for(i=1;i<N;i++) printf("%8.5f ",i*h); printf("\n");
for(i=0;i<N;i++) printf("---------");printf("\n");
for(j=0;j<=M;j++)
{
if(j!=0) {temp1=f0(j*k); temp2=fl(j*k); }
/* construction of right hand vector */
d[1]=r*temp1+(2-2*r)*u[1]+r*u[2]+r*f0((j+1)*k);
d[N-1]=r*u[N-2]+(2-2*r)*u[N-1]+r*temp2+r*fl((j+1)*k);
for(i=2;i<=N-2;i++)
d[i]=r*u[i-1]+(2-2*r)*u[i]+r*u[i+1];
y=TriDiag(a,b,c,d,N-1); /*solution of tri-diagonal system*/
printf("%8.5f| ",(j+1)*k);
for(i=1;i<=N-1;i++) printf("%8.5f ",u[i]);printf("\n");
} /* end of j loop */
} /* main */
/* definitions of the initial and boundary functions*/
float g(float x)
{
return(cos(3.141592*x/2));
}
float f0(float t)
{
return(1);
}
596 Numerical Analysis

float fl(float t)
{
return(0);
}
float TriDiag(float a[10],float b[10],float c[10],float d[10],int n)
{
/* output u[i], i=1, 2,..., n, is a global variable.*/
int i; float gamma[10],z[10];
gamma[1]=b[1];
for(i=2;i<=n;i++)
{
if(gamma[i-1]==0.0)
{
printf("A minor is zero: Method fails ");
exit(0);
}
gamma[i]=b[i]-a[i]*c[i-1]/gamma[i-1];
}
z[1]=d[1]/gamma[1];
for(i=2;i<=n;i++)
z[i]=(d[i]-a[i]*z[i-1])/gamma[i];
u[n]=z[n];
for(i=n-1;i>=1;i--)
u[i]=z[i]-c[i]*u[i+1]/gamma[i];
/* for(i=1;i<=n;i++) printf("%f ",u[i]); */
return(u[0]);
} /*end of TriDiag */
A sample of input/output:
Enter the number subintervals of x and t 4 5
Enter the step size (k) of t 0.03125
Enter the upper limit of x 1
Enter the value of alpha 1
h= 0.25000, k= 0.03125, r= 0.50000
Solution is
x-> 0.25000 0.50000 0.75000
------------------------------------
0.03125| 0.86871 0.65741 0.35498
0.06250| 0.83538 0.61745 0.33080
0.09375| 0.81261 0.58746 0.31109
Partial Differential Equations 597

0.12500| 0.79630 0.56512 0.29579


0.15625| 0.78437 0.54848 0.28420
0.18750| 0.77555 0.53610 0.27550

9.3 Hyperbolic Equations

Let us consider the wave equation

∂2u 2
2∂ u
= c , t > 0, 0 < x < 1 (9.25)
∂t2 ∂x2
where initial conditions u(x, 0) = f (x) and

∂u
= g(x), 0 < x < 1 (9.26)
∂t (x,0)

and boundary conditions

u(0, t) = φ(t) and u(1, t) = ψ(t), t ≥ 0. (9.27)

This problem may occur in the transverse vibration of a stretched string. As in the
previous cases, the central-difference approximation for uxx and utt at the mesh points
(xi , tj ) = (ih, jk) are
1
uxx = (ui−1,j − 2ui,j + ui+1,j ) + O(h2 )
h2
1
and utt = 2 (ui,j−1 − 2ui,j + ui,j+1 ) + O(k 2 ),
k
where i, j = 0, 1, 2, . . ..
Using the value of uxx and utt , the equation (9.25) becomes

1 c2
2
(ui,j−1 − 2ui,j + ui,j+1 ) = 2 (ui−1,j − 2ui,j + ui+1,j )
k h
i.e.,
ui,j+1 = r2 ui−1,j + 2(1 − r2 )ui,j + r2 ui+1,j − ui,j−1 , (9.28)

where r = ck/h.
The value of ui,j+1 depends on the values of u at three time-levels (j − 1), j and
(j + 1). The known and unknown values of u are shown in Figure 9.6.
On substituting j = 0, the equation (9.28) yields

ui,1 = r2 ui−1,0 + 2(1 − r2 )ui,0 + r2 ui+1,0 − ui,−1


= r2 fi−1 + 2(1 − r2 )fi + r2 fi+1 − ui,−1 , where fi = f (xi ).
598 Numerical Analysis

t
6 e
j+1
⊕ unknown values
j u u u u known values

j−1 u

- x
i−1 i i+1

Figure 9.6: Known and unknown meshes for hyperbolic equations.

Again, the central difference approximation to the initial derivative condition gives
1
(ui,1 − ui,−1 ) = gi .
2k
Eliminating ui,−1 between above two relations and we obtain the expression for u
along t = k, i.e., for j = 1 as
1 2 
ui,1 = r fi−1 + 2(1 − r2 )fi + r2 fi+1 + 2kgi . (9.29)
2
The truncation error of this method is O(h2 +k 2 ) and the formula (9.28) is convergent
for 0 < r ≤ 1.
Example 9.3.1 Solve the second-order wave equation

∂2u ∂2u
=
∂t2 ∂x2
with boundary conditions u = 0 at x = 0 and 1, t > 0 and the initial conditions
1 ∂u
u = sin πx, = 0, when t = 0, 0 ≤ x ≤ 1, for x = 0, 0.2, 0.4, . . . , 1.0 and t =
2 ∂t
0, 0.1, 0.2, . . . , 0.5.

Solution. The explicit formula is

ui,j+1 = r2 ui−1,j + 2(1 − r2 )ui,j + r2 ui+1,j − ui,j−1 .

Let h = 0.2 and k = 0.1, so r = k/h = 0.5 < 1.


The boundary conditions become u0,j = 0, u5,j = 0 and initial conditions reduce to
1 ui,1 − ui,−1
ui,0 = sin π(ih), i = 1, 2, 3, 4, 5 and = 0, so that ui,−1 = ui,1 .
2 2k
Partial Differential Equations 599

For r = 0.5, the difference scheme is then

ui,j+1 = 0.25ui−1,j + 1.5ui,j + 0.25ui+1,j − ui,j−1 . (9.30)

For j = 0, this relation becomes

ui,1 = 0.25ui−1,0 + 1.5ui,0 + 0.25ui+1,0 − ui,−1


i.e., ui,1 = 0.125ui−1,0 + 0.75ui,0 + 0.125ui+1,0 , [using ui,−1 = ui,1 ]
= 0.125(ui−1,0 + ui+1,0 ) + 0.75ui,0 .

The above formula gives the values of u for j = 1. For j = 2, 3, . . . the values are
obtained from the formula (9.30).
Hence,

u1,1 = 0.125(u0,0 + u2,0 ) + 0.75u1,0 = 0.27986


u2,1 = 0.125(u1,0 + u3,0 ) + 0.75u2,0 = 0.45282
u3,1 = 0.125(u2,0 + u4,0 ) + 0.75u3,0 = 0.45282
u4,1 = 0.125(u3,0 + u5,0 ) + 0.75u4,0 = 0.27986.

All values are shown in the following table.

i=0 i=1 i=2 i=3 i=4 i=5


x=0 x = 0.2 x = 0.4 x = 0.6 x = 0.8 x = 1.0
j = 0, t = 0.0 0 0.29389 0.47553 0.47553 0.29389 0
j = 1, t = 0.1 0 0.27986 0.45282 0.45282 0.27986 0
j = 2, t = 0.2 0 0.23910 0.38688 0.38688 0.23910 0
j = 3, t = 0.3 0 0.17552 0.28399 0.28399 0.17552 0
j = 4, t = 0.4 0 0.09517 0.15398 0.15398 0.09517 0
j = 5, t = 0.5 0 0.00573 0.00927 0.00927 0.00573 0

The exact solution of this equation is


1
u(x, t) = sin πx cos πt.
2

9.3.1 Implicit difference methods


The implicit methods generate a tri-diagonal system of algebraic equations. So, it is
suggested that the implicit methods should not be used without simplifying assumption
to solve pure IVPs because they generate an infinite number of system of equations.
But, these methods may be used for initial-boundary value problems. Two such implicit
methods are presented below.
600 Numerical Analysis

(i) At the mesh point (ih, jk) the scheme is


 2  (   2  )
∂ u c2 ∂2u ∂ u
= + .
∂t2 i,j 2 ∂x2 i,j+1 ∂x2 i,j−1

That is,
1
[ui,j+1 − 2ui,j + ui,j−1 ] (9.31)
k2
c2
= 2 [(ui+1,j+1 − 2ui,j+1 + ui−1,j+1 ) + (ui+1,j−1 − 2ui,j−1 + ui−1,j−1 )].
2h
 2  (   2   2  )
∂ u c2 ∂2u ∂ u ∂ u
(ii) = +2 +
∂t2 i,j 4 ∂x2 i,j+1 ∂x2 i,j ∂x2 i,j−1
Using the finite difference scheme this equation reduces to
1
[ui,j+1 − 2ui,j + ui,j−1 ]
k2
c2
= 2 [(ui+1,j+1 − 2ui,j+1 + ui−1,j+1 ) (9.32)
4h
+2(ui+1,j − 2ui,j + ui−1,j ) + (ui+1,j−1 − 2ui,j−1 + ui−1,j−1 )].

Both the formulae hold good for all values of r = ck/h > 0.

9.4 Elliptic Equations

Let us consider the two-dimensional Laplace equation

∂2u ∂2u
+ 2 = 0 within R (9.33)
∂x2 ∂y
and u = f (x, y) on the boundary C.

Using the central difference approximation to both the space derivatives, the finite
difference approximation of above equation is given by
ui−1,j − 2ui,j + ui+1,j ui,j−1 − 2ui,j + ui,j+1
+ = 0.
h2 k2
If the mesh points are uniform in both x and y directions then h = k and the above
equation becomes a simple one. That is,
1
ui,j = [ui+1,j + ui−1,j + ui,j+1 + ui,j−1 ]. (9.34)
4
Partial Differential Equations 601

y
6(i, j+1)
u

u e u
(i-1,j) (i,j) (i+1,j)
u - x
(i,j-1)

Figure 9.7: Standard five-point formula.

This shows that the value of u at the point (i, j) is the average of its values at the
four neighbours – north (i, j + 1), east (i + 1, j), south (i, j − 1) and west (i − 1, j). This
is shown in Figure 9.7, and the formula is known as standard five-point formula.
It is also well known that the Laplace equation remains invariant when the coordinates
axes are rotated at an angle 450 .
Hence equation (9.34) can also be expressed in the form
1
ui,j = [ui−1,j−1 + ui+1,j−1 + ui+1,j+1 + ui−1,j+1 ]. (9.35)
4
This formula is known as diagonal five-point formula and the mesh points are
shown in Figure 9.8.
y
6(i-1,j+1) u
u (i+1,j+1)

e
(i,j)
u u- x
(i-1,j-1) (i+1,j-1)

Figure 9.8: Diagonal five-point formula.

Let us consider the Poisson’s equation in two-dimension in the form


uxx + uyy = g(x, y) (9.36)
with the boundary condition u = f (x, y) along C.
Using central difference approximation with uniform mesh points in both x and y
directions, the equation (9.36) is given by
1
ui,j = [ui−1,j + ui+1,j + ui,j−1 + ui,j+1 − h2 gi,j ] where gi,j = g(xi , yj ). (9.37)
4
602 Numerical Analysis

Let u = 0 be the boundary and xi = ih, yj = jh where i, j = 0, 1, 2, 3, 4. Then


i, j = 0, 4 represent the boundary shown in Figure 9.9.
y
6 u=0
j=4 u u u u u

j=3 u e e e u
u1,3 u2,3 u3,3
u=0, j=2 u e e e u u=0
u1,2 u2,2 u3,2
j=1 u e e e u
u1,1 u2,1 u3,1

j=0 u u u u u- x
i=0 i=1 i=2 i=3 i=4
u=0

Figure 9.9: The meshes for elliptic equation.

For i, j = 1, 2 and 3, the equation (9.37) is a system of nine equations with nine
unknowns, which is written in matrix notation as

    
4 −1 0 −1 0 0 0 0 0 u1,1 −h2 g1,1
 −1 4 −1 0 −1 0 0 0 0   u1,2   −h2 g1,2 
    
 0 −1 4 0 0 −1 0 0 0   u1,3   −h2 g1,3 
    
 −1 0 0 4 −1 0 −1 0 0   u2,1   −h2 g2,1 
    
 0 −1 0 −1 4 −1 0 −1 0   u2,2  =  −h2 g2,2  (9.38)
    
 0 0 −1 0 −1 4 0 0 −1   u2,3   −h2 g2,3 
    
 0 0 0 −1 0 0 4 −1 0   u3,1   −h2 g3,1 
    
 0 0 0 0 −1 0 −1 4 −1   u3,2   −h2 g3,2 
0 0 0 0 0 −1 0 −1 4 u3,3 −h2 g3,3

It may be noted that the coefficient matrix is symmetric, positive definite and sparse.
Thus the solution of an elliptic PDE depends on a sparse system of equations. To solve
this system an iterative method is suggested rather a direct method. Three iterative
methods are commonly used to solve such system, viz., (i) Jacobi’s method, (ii) Gauss-
Seidel’s method and (iii) successive overrelaxation method.
The another iterative method known as alternate direction implicit (ADI) method
is also used.
Partial Differential Equations 603

Method to obtain first approximate value of Laplace’s equation


Let us consider the Laplace’s equation
uxx + uyy = 0.
y
6 u1,4 u2,4 u3,4 u4,4
u0,4 u u u u u

u0,3 u e e e u u4,3
u1,3 u2,3 u3,3
u0,2 u e e e u u4,2
u1,2 u2,2 u3,2
u0,1 u e e e u u4,1
u1,1 u2,1 u3,1
u u u u u- x
u0,0 u1,0 u2,0 u3,0 u4,0

Figure 9.10: Boundary values for Laplace equation.

Let R be a square region divided into N × N small squares of side h. The boundary
values are uo,j , uN,j , ui,0 , ui,N where i, j = 0, 1, 2, . . . , N , shown in Figure 9.10.
Initially, diagonal five-point formula is used to compute the values of u2,2 , u1,3 , u3,3 ,
u1,1 and u3,1 in this sequence. The values are
1
u2,2 = (u0,0 + u4,4 + u0,4 + u4,0 )
4
1
u1,3 = (u0,2 + u2,4 + u0,4 + u2,2 )
4
1
u3,3 = (u2,2 + u4,4 + u2,4 + u4,2 )
4
1
u1,1 = (u0,0 + u2,2 + u0,2 + u2,0 )
4
1
u3,1 = (u2,0 + u4,2 + u2,2 + u4,0 ).
4
In the second step, the remaining values, viz., u2,3 , u1,2 , u3,2 and u2,1 are computed
using standard five-point formula as
1
u2,3 = (u1,3 + u3,3 + u2,2 + u2,4 )
4
1
u1,2 = (u0,2 + u2,2 + u1,1 + u1,3 )
4
604 Numerical Analysis

1
u3,2 = (u2,2 + u4,2 + u3,1 + u3,3 )
4
1
u2,1 = (u1,1 + u3,1 + u2,0 + u2,2 ).
4
These are the initial values of u at the nine internal mesh points, their accuracy can
be improved by any iterative method.

Note 9.4.1 The above scheme gives a better first approximate values, but, ui,j = 0 for
i, j = 1, 2, . . . , N − 1 may also be taken as first approximate values of u. Sometimes for
these initial values the iteration converges slowly.

Example 9.4.1 Find the first approximate values at the interior mesh points of the
following Dirichlet’s problem

uxx + uyy = 0,
u(x, 0) = 0, u(0, y) = 0,
u(x, 1) = 10x, u(1, y) = 20y.

Solution. Let the region 0 ≤ x, y ≤ 1 be divided into 4 × 4 squares of side h = 0.25.


Let xi = ih, yj = jk, i, j = 0, 1, 2, 3, 4. The mesh points are shown in Figure 9.11.
y u1,4 u2,4 u3,4 u4,4
6 2.5 5.0 7.5 10
y=1.00, u0,4 =0 u u u u u u4,4 =10.0

y =0.75, u0,3 =0 u e e e u u4,3 =7.5


u1,3 u2,3 u3,3
y=0.50, u0,2 =0 u e e e u u4,2 =5.0
u1,2 u2,2 u3,2
y=9.25, u0,1 =0 u e e e u u4,1 =2.5
u1,1 u2,1 u3,1

y=0.00, u0,0 =0 u u u u u- x
0 0 0 0
u1,0 u2,0 u3,0 u4,0
x=0 x= 14 x= 12 x= 34 x=1

Figure 9.11: Mesh points for Dirichlet’s problem.

The diagonal five-point formula is used to find the values of u2,2 , u1,3 , u3,3 , u1,1 and
u3,1 . Thus
1 1
u2,2 = (u0,0 + u4,4 + u0,4 + u4,0 ) = (0 + 10 + 0 + 0) = 2.5
4 4
Partial Differential Equations 605

1 1
u1,3 = (u0,2 + u2,4 + u0,4 + u2,2 ) = (0 + 5 + 0 + 2.5) = 1.875
4 4
1 1
u3,3 = (u2,2 + u4,4 + u2,4 + u4,2 ) = (2.5 + 10 + 5 + 5) = 5.625
4 4
1 1
u1,1 = (u0,0 + u2,2 + u0,2 + u2,0 ) = (0 + 2.5 + 0 + 0) = 0.625
4 4
1 1
u3,1 = (u2,0 + u4,2 + u2,2 + u4,0 ) = (0 + 5 + 2.5 + 0) = 1.875.
4 4
The values of u2,3 , u1,2 , u3,2 and u2,1 are obtained by using standard five-point formula.

1 1
u2,3 = (u1,3 + u3,3 + u2,2 + u2,4 ) = (1.875 + 5.625 + 2.5 + 5) = 3.75
4 4
1 1
u1,2 = (u0,2 + u2,2 + u1,1 + u1,3 ) = (0 + 2.5 + 0.625 + 1.875) = 1.25
4 4
1 1
u3,2 = (u2,2 + u4,2 + u3,1 + u3,3 ) = (2.5 + 5 + 1.875 + 5.625) = 3.75
4 4
1 1
u2,1 = (u1,1 + u3,1 + u2,0 + u2,2 ) = (0.625 + 1.875 + 0 + 2.5) = 1.25.
4 4
Thus we obtain the first approximate values of u at the interior mesh points.

9.4.1 Iterative methods


When first approximate values of u are available then those values can be improved by
applying any iterative method. There are a large number of iterative methods available
with different rates of convergence. Some of the classical methods are discussed below.
The finite-difference scheme for the Poison’s equation (9.36) when discretized using
standard five-point formula is

1
ui,j = (ui−1,j + ui+1,j + ui,j−1 + ui,j+1 − h2 gi,j ). (9.39)
4
(r)
Let ui,j denote the rth iterative value of ui,j .

Jacobi’s method
The iterative scheme to solve (9.39) for the interior mesh points is

(r+1) 1 (r) (r) (r) (r)


ui,j = [ui−1,j + ui+1,j + ui,j−1 + ui,j+1 − h2 gi,j ]. (9.40)
4
This is called the point Jacobi’s method.
606 Numerical Analysis

Gauss-Seidel’s method
In this method, the most recently computed values as soon as they are available are
used and the values of u along each row are computed systematically from left to right.
The iterative formula is
(r+1) 1 (r+1) (r) (r+1) (r)
ui,j = [ui−1,j + ui+1,j + ui,j−1 + ui,j+1 − h2 gi,j ]. (9.41)
4
The rate of convergence of this method is twice as fast as the Jacobi’s method.
Successive Over-Relaxation or S.O.R. method
In this method, the rate of convergence of an iterative method is accelerated by making
(r+1) (r) (r+1)
corrections on [ui,j − ui,j ]. If ui,j is the value obtained from a basic iterative (such
as Jacobi’s or Gauss-Seidel’s) method then the value at the next iteration is given by
(r+1) (r+1) (r)
ui,j = wui,j + (1 − w)ui,j , (9.42)
where w is the over-relaxation factor.
Thus, the Jacobi’s over-relaxation method for the Poisson’s equation is
(r+1) 1  (r) (r) (r) (r)

(r)
ui,j = w ui−1,j + ui+1,j + ui,j−1 + ui,j+1 − h2 gi,j + (1 − w)ui,j (9.43)
4
and the Gauss-Seidel’s over-relaxation method for that is
(r+1) 1  (r+1) (r) (r+1) (r)

(r)
ui,j = w ui−1,j + ui+1,j + ui,j−1 + ui,j+1 − h2 gi,j + (1 − w)ui,j . (9.44)
4
The rate of convergence of (9.43) and (9.44) depends on the value of w and its value
lies between 1 and 2. But, the choice of w is a difficult task.
It may be noted that w = 1 gives the corresponding basic iteration formula.

Example 9.4.2 Solve the Laplace’s equation ∇2 u = 0 with boundary conditions


shown in Figure 9.12.

Solution. The quantities u2,2 , u1,3 , u3,3 , u1,1 and u3,1 are computed using diagonal
five-point formula and let those be the first approximate values. That is,

(1) 1
u2,2 = (u0,0 + u4,4 + u0,4 + u4,0 ) = 45.0
4
(1) 1
u1,3 = (u0,2 + u2,4 + u0,4 + u2,2 ) = 23.75
4
(1) 1
u3,3 = (u2,2 + u4,4 + u2,4 + u4,2 ) = 41.25
4
(1) 1
u1,1 = (u0,0 + u2,2 + u0,2 + u2,0 ) = 48.75
4
(1) 1
u3,1 = (u2,0 + u4,2 + u2,2 + u4,0 ) = 66.25.
4
Partial Differential Equations 607

10u 20u 30u


0 u u40

20 u e e e u50
u1,3 u2,3 u3,3

30 u e e e u60
u1,2 u2,2 u3,2
40 u e e e
u2,1 u3,1
u70
u1,1

50 u u u u u90
60 70 80

Figure 9.12: Boundary conditions of Laplace equation.

The values of u2,3 , u1,2 , u3,2 and u2,1 are computed using standard five-point formula.

(1) 1
u2,3 = (u1,3 + u3,3 + u2,2 + u2,4 ) = 32.5
4
(1) 1
u1,2 = (u0,2 + u2,2 + u1,1 + u1,3 ) = 36.875
4
(1) 1
u3,2 = (u0,2 + u4,2 + u3,1 + u3,3 ) = 53.125
4
(1) 1
u2,1 = (u1,1 + u3,1 + u2,0 + u2,2 ) = 57.5.
4
Thus the first approximate values are obtained for nine internal mesh points. These
values can be improved by using any iterative method. Here Gauss-Seidel’s iterative
method is used to obtain better approximate values.

u1,1 u2,1 u3,1 u1,2 u2,2 u3,2 u1,3 u2,3 u3,3


48.5938 57.4609 65.1465 36.8359 44.9805 52.8442 24.8340 32.7661 41.4026
48.5742 57.1753 65.0049 37.0972 44.9707 52.8445 24.9658 32.8348 41.4198
48.5681 57.1359 64.9951 37.1262 44.9854 52.8501 24.9902 32.8489 41.4247
48.5655 57.1365 64.9966 37.1353 44.9927 52.8535 24.9960 32.8534 41.4267
48.5679 57.1393 64.9982 37.1392 44.9963 52.8553 24.9981 32.8553 41.4277
48.5696 57.1410 64.9991 37.1410 44.9982 52.8562 24.9991 32.8562 41.4281
48.5705 57.1419 64.9995 37.1419 44.9991 52.8567 24.9995 32.8567 41.4283
48.5710 57.1424 64.9998 37.1424 44.9995 52.8569 24.9998 32.8569 41.4285
48.5712 57.1426 64.9999 37.1426 44.9998 52.8570 24.9999 32.8570 41.4285
48.5713 57.1427 64.9999 37.1427 44.9999 52.8571 24.9999 32.8571 41.4285
608 Numerical Analysis

Example 9.4.3 Solve the Laplace’s equation uxx + uyy = 0 in the domain shown in
Figure 9.13, by (a) Gauss-Seidel’s method, (b) Jacobi’s method, and (c) Gauss-Seidel
S.O.R. method.

u
10
e
u
10
e
u e
u

0 u e
u1,2
e
u2,2
e0
u

0 u e e e0
u
u1,1 u2,1
u u u u
0 0

Figure 9.13: Boundary conditions of Laplace equation.

Solution.
(a) Gauss-Seidel’s method
Let u2,1 = u1,2 = u2,2 = u1,1 = 0 at the beginning.
The Gauss-Seidel’s iteration scheme is
(r+1) 1  (r) (r) 
u1,1 = u2,1 + u1,2
4
(r+1) 1  (r+1) (r) 
u2,1 = u1,1 + u2,2
4
(r+1) 1  (r+1) (r) 
u1,2 = u1,1 + u2,2 + 10
4
(r+1) 1  (r+1) (r+1) 
u2,2 = u1,2 + u2,1 + 10 .
4
For r = 0,

(1) 1 
u1,1 = 0+0 =0
4
(1) 1 
u2,1 = 0+0 =0
4
(1) 1 
u1,2 = 0 + 0 + 10 = 2.5
4
(1) 1 
u2,2 = 2.5 + 0 + 10 = 3.125.
4
The subsequent iterations are shown below.
Partial Differential Equations 609

u1,1 u2,1 u1,2 u2,2


0.00000 0.00000 2.50000 3.12500
0.62500 0.93750 3.43750 3.59375
1.09375 1.17188 3.67188 3.71094
1.21094 1.23047 3.73047 3.74023
1.24023 1.24512 3.74512 3.74756
1.24756 1.24878 3.74878 3.74939
1.24939 1.24969 3.74969 3.74985
1.24985 1.24992 3.74992 3.74996
1.24996 1.24998 3.74998 3.74999
1.24999 1.25000 3.75000 3.75000
(b) Jacobi’s method
Initially we take u2,1 = u1,2 = u2,2 = u1,1 = 0.
The Jacobi’s iteration scheme is
(r+1) 1  (r) (r)  1  (r) (r) 
u1,1 = u2,1 + u1,2 + 0 + 0 = u2,1 + u1,2
4 4
(r+1) 1  (r) (r)  1  (r) (r) 
u2,1 = u1,1 + u2,2 + 0 + 0 = u1,1 + u2,2
4 4
(r+1) 1  (r) (r)  1  (r) (r) 
u1,2 = u1,1 + u2,2 + 0 + 10 = u1,1 + u2,2 + 10
4 4
(r+1) 1  (r) (r)  1  (r) (r) 
u2,2 = u1,2 + u2,1 + 10 + 0 = u1,2 + u2,1 + 10 .
4 4
(1) (1) (1) (1)
For r = 0, u1,1 = 0, u2,1 = 0, u1,2 = 2.5, u2,2 = 2.5.
The successive iterations are given below.
u1,1 u2,1 u1,2 u2,2
0.00000 0.00000 2.50000 2.50000
0.62500 0.62500 3.12500 3.12500
0.93750 0.93750 3.43750 3.43750
1.09375 1.09375 3.59375 3.59375
1.17188 1.17188 3.67188 3.67188
1.21094 1.21094 3.71094 3.71094
1.23047 1.23047 3.73047 3.73047
1.24023 1.24023 3.74023 3.74023
1.24512 1.24512 3.74512 3.74512
1.24756 1.24756 3.74756 3.74756
1.24878 1.24878 3.74878 3.74878
1.24939 1.24939 3.74939 3.74939
1.24969 1.24969 3.74969 3.74969
1.24985 1.24985 3.74985 3.74985
1.24992 1.24992 3.74992 3.74992
610 Numerical Analysis

(c) Gauss-Seidel’s S.O.R. method


Let u2,1 = u1,2 = u2,2 = u1,1 = 0 at the beginning.
The Gauss-Seidel’s S.O.R. scheme for interior mesh points are
(r+1) w  (r+1) (r) (r+1) (r)  (r)
ui,j = ui−1,j + ui+1,j + ui,j−1 + ui,j+1 + (1 − w)ui,j .
4
For j = 1, 2 and i = 1, 2, the formulae are
(r+1) w (r) (r) (r)
u1,1 = [u2,1 + u1,2 ] + (1 − w)u1,1
4
(r+1) w (r+1) (r) (r)
u2,1 = [u1,1 + u2,2 ] + (1 − w)u2,1
4
(r+1) w (r) (r+1) (r)
u1,2 = [u2,2 + u1,1 + 10] + (1 − w)u1,2
4
(r+1) w (r+1) (r) (r+1) (r)
u2,2 = [u1,2 + u3,2 + u2,1 + 10] + (1 − w)u2,2 .
4
For w = 1.1, the values are listed below.
r u1,1 u2,1 u1,2 u2,2
1 0.00000 0.00000 2.75000 3.50625
2 0.75625 1.17219 3.64719 3.72470
3 1.24970 1.25074 3.75324 3.75363
4 1.25113 1.25123 3.75098 3.75025
5 1.25050 1.25008 3.75011 3.75003
6 1.25000 1.25000 3.75000 3.75000

This result shows that the method converges at 6th step. But, to get the same result
using Gauss-Seidel’s method (in this case w = 1) 11 iterations are needed . Also, the
rate of convergence depends on the value of w. For some specific values of w, the
value of r, at which the solution converges, are tabulated below.
w 1 1.05 1.07 1.08 1.09 1.1 1.2 1.3 1.4 1.009 1.1009 1.109
r 11 9 7 6 7 6 9 10 16 11 6 7

Example 9.4.4 Solve the Poisson’s equation uxx +uyy = 8x2 y 2 for the square region
0 ≤ x ≤ 1, 0 ≤ y ≤ 1 with h = 1/3 and the values of u on the boundary are every
where zero. Use (a) Gauss-Seidel’s method, and (b) Gauss-Seidel’s S.O.R. method.

Solution. In this problem, g(x, y) = 8x2 y 2 , h = 1/3 and the boundary conditions
are u0,0 = u1,0 = u2,0 = u3,0 = 0, u0,1 = u0,2 = u0,3 = 0,
u1,3 = u2,3 = u3,3 = 0, u3,1 = u3,2 = 0.
(a) The Gauss-Seidel’s iteration scheme is
(r+1) 1  (r+1) (r) (r+1) (r) 
ui,j = ui−1,j + ui+1,j + ui,j−1 + ui,j+1 − h2 g(ih, jk) .
4
Partial Differential Equations 611

8 2 2
Now, g(ih, jk) = 8h4 i2 j 2 = 81 i j . Thus

(r+1) 1 (r+1) (r) (r+1) (r) 1 8


u1,1 = u0,1 + u2,1 + u1,0 + u1,2 − . .1.1
4 9 81
1 (r) (r) 8 1 (r) (r) 8
= 0 + u2,1 + 0 + u1,2 − = u2,1 + u1,2 −
4 729 4 729
(r+1) 1 (r+1) (r) (r+1) (r) 1 8
u2,1 = u1,1 + u3,1 + u2,0 + u2,2 − . .22 .12
4 9 81
1 (r+1) (r) 32
= u1,1 + u2,2 −
4 729
(r+1) 1 (r+1) (r) (r+1) (r) 1 8
u1,2 = u0,2 + u2,2 + u1,1 + u1,3 − . .12 .22
4 9 81
1 (r) (r+1) 32
= u2,2 + u1,1 −
4 729
(r+1) 1 (r+1) (r) (r+1) (r) 1 8
u2,2 = u1,2 + u3,2 + u2,1 + u2,3 − . .22 .22
4 9 81
1 (r+1) (r+1) 128
= u1,2 + u2,1 − .
4 729
(0) (0) (0)
Let u2,1 = u2,2 = u1,2 = 0.
All the values are shown in the following table.

r u1,1 u2,1 u1,2 u2,2


1 -0.00274 -0.01166 -0.01166 -0.04973
2 -0.00857 -0.02555 -0.02555 -0.05667
3 -0.01552 -0.02902 -0.02902 -0.05841
4 -0.01725 -0.02989 -0.02989 -0.05884
5 -0.01769 -0.03011 -0.03011 -0.05895
6 -0.01780 -0.03016 -0.03016 -0.05898
7 -0.01782 -0.03017 -0.03017 -0.05898
8 -0.01783 -0.03018 -0.03018 -0.05898

(b) The Gauss-Seidel’s S.O.R. scheme is


(r+1) w  (r+1) (r) (r+1) (r)  (r)
ui,j = ui−1,j + ui+1,j + ui,j−1 + ui,j+1 − h2 g(ih, jh) + (1 − w)ui,j
4
w (r+1) (r) (r+1) (r) 8 2 2 (r)
= ui−1,j + ui+1,j + ui,j−1 + ui,j+1 − i j + (1 − w)ui,j .
4 729
(0) (0) (0)
Let the initial values be u2,1 = u1,2 = u2,2 = 0.
612 Numerical Analysis

For w = 1.1, the values of u1,1 , u2,1 , u1,2 and u2,2 are shown below.

r u1,1 u2,1 u1,2 u2,2


1 -0.00302 -0.01290 -0.01290 -0.05538
2 -0.00981 -0.02871 -0.02871 -0.05854
3 -0.01783 -0.03020 -0.03020 -0.05904
4 -0.01785 -0.03020 -0.03020 -0.05899
5 -0.01784 -0.03018 -0.03018 -0.05899
6 -0.01783 -0.03018 -0.03018 -0.05898
Algorithm 9.2 (Gauss-Seidel’s S.O.R. method to solve elliptic equation).
This algorithm solves an elliptic (Poisson's) equation of the form $u_{xx} + u_{yy} = g(x, y)$ with given boundary conditions $u_{i,0}, u_{i,N}, u_{0,j}, u_{N,j}$, $i, j = 0, 1, 2, \ldots, N$.
Algorithm Elliptic SOR
Input function g(x, y); //g(x, y) = 0 reduces the problem to Laplace equation//
Read boundary conditions
ui,0 , ui,N , u0,j , uN,j , i, j = 0, 1, 2, . . . , N ;
Read h, k, ε; //ε is the error tolerance//
Read w; //w = 1 gives Gauss-Seidel’s method//
Step 1. Set initial values for internal mesh points ui,j = 0
for i, j = 1, 2, . . . , N − 1.
Step 2. for j = 1, 2, . . . , N − 1 do
        for i = 1, 2, . . . , N − 1 do
            un_{i,j} = (w/4)[un_{i−1,j} + u_{i+1,j} + un_{i,j−1} + u_{i,j+1} − h²g(ih, jk)] + (1 − w)u_{i,j}
        //un_{i,j} denotes the new value of u at (i, j).//
Step 3. Print un_{i,j} for i, j = 1, 2, . . . , N − 1.
Step 4. If |un_{i,j} − u_{i,j}| < ε for all i, j = 1, 2, . . . , N − 1 then Stop.
Step 5. Set u_{i,j} = un_{i,j} for i, j = 1, 2, . . . , N − 1
        goto Step 2.
end Elliptic SOR
Program 9.2
/* Program Elliptic PDE
Program to solve the Poisson PDE by the Gauss-Seidel S.O.R.
method. This program solves a problem of the form
Uxx+Uyy=g(x,y). Boundary conditions are given by
U(x,0)=f1(x); U(0,y)=f2(y); U(x,l)=f3(x); U(l,y)=f4(y).
g(x,y)=0 gives the Laplace equation.
Here g(x,y)= -2*x*x+y*y; u(x,y)=0 at boundary.
*/
#include<stdio.h>
#include<math.h>
void main()
{
int i,j,N,flag;
float u[6][6],un[6][6],h,k,w,eps=1e-5,temp;
float g(float x,float y);
float f1(float x);
float f2(float y);
float f3(float x);
float f4(float y);
printf("Enter the number of mesh points N ");
scanf("%d",&N);
printf("Enter the step sizes of x (h) and y (k) ");
scanf("%f %f",&h,&k);
printf("Enter the relaxation factor w ");
scanf("%f",&w);
/* set boundary conditions */
for(i=0;i<=N;i++) for(j=0;j<=N;j++) u[i][j]=0;
for(i=0;i<=N;i++){
u[i][0]=f1(i*h); u[i][N]=f3(i*h);
}
for(j=0;j<=N;j++){
u[0][j]=f2(j*k); u[N][j]=f4(j*k);
}
for(i=0;i<=N;i++)
for(j=0;j<=N;j++) un[i][j]=u[i][j];
printf("The values of N, h, k and w are respectively\n");
printf("N=%3d, h=%6.4f, k=%6.4f, w=%5.3f\n",N,h,k,w);
do
{
for(i=1;i<=N-1;i++)
for(j=1;j<=N-1;j++) u[i][j]=un[i][j];
for(i=1;i<=N-1;i++)
for(j=1;j<=N-1;j++)
un[i][j]=0.25*w*(un[i-1][j]+u[i+1][j]+un[i][j-1]
+u[i][j+1]-h*h*g(i*h,j*k))+(1-w)*u[i][j];
flag=0;
for(i=1;i<=N-1;i++)
for(j=1;j<=N-1;j++) if(fabs(un[i][j]-u[i][j])>eps) flag=1;
}while(flag==1);
/* printing of the boundary and internal values */
printf("The interior and boundary values are shown below\n");
printf(" ");
for(i=0;i<=N;i++) printf("%8.5f ",i*h);printf("\n");
printf("---------");
for(i=0;i<=N;i++) printf("---------");printf("\n");
for(i=0;i<=N;i++){
printf("%8.5f| ",i*k);
for(j=0;j<=N;j++) printf("%8.5f ",un[i][j]);printf("\n");
}
}
/* definitions of the functions */
float g(float x,float y)
{
return(-2*x*x+y*y);
}
float f1(float x)
{
return 0;
}
float f2(float y)
{
return 0;
}
float f3(float x)
{
return 0;
}
float f4(float y)
{
return 0;
}
A sample of input/output:
Enter the number of mesh points N 5
Enter the step sizes of x (h) and y (k) 0.5 0.1
Enter the relaxation factor w 1.1
The values of N, h, k and w are respectively
N= 5, h=0.5000, k=0.1000, w=1.100
The interior and boundary values are shown below
0.00000 0.50000 1.00000 1.50000 2.00000 2.50000
---------------------------------------------------------------
0.00000| 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000
0.10000| 0.00000 0.37538 0.55816 0.55020 0.35862 0.00000
0.20000| 0.00000 0.82088 1.19205 1.18154 0.79930 0.00000
0.30000| 0.00000 1.21861 1.71762 1.70711 1.19702 0.00000
0.40000| 0.00000 1.21345 1.63770 1.62975 1.19669 0.00000
0.50000| 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000
9.5 Stability
To investigate the stability of a finite difference method to solve a PDE, we consider
$$u_{p,q} = e^{i\beta x}e^{\alpha t} = e^{i\beta ph}e^{\alpha qk} \quad (\text{since } x = ph,\ t = qk) = e^{i\beta ph}\,\xi^q, \tag{9.45}$$
where $\xi = e^{\alpha k}$ and $\alpha$ is, in general, a complex constant.
The finite difference equation will be stable if $|u_{p,q}|$ remains bounded for $qk \le T$, $0 \le t \le T$, $T$ finite, as $h \to 0$, $k \to 0$, and for all values of $\beta$ that satisfy the initial conditions.
If the exact solution of the finite difference equation does not increase exponentially with time, then a necessary and sufficient condition for stability is that
$$|\xi| \le 1, \quad \text{i.e., } -1 \le \xi \le 1 \text{ for real } \xi. \tag{9.46}$$
But, if $u_{p,q}$ increases with time, then the necessary and sufficient condition for stability is
$$|\xi| \le 1 + O(k). \tag{9.47}$$
Example 9.5.1 Investigate the stability of the parabolic equation $\dfrac{\partial u}{\partial t} = \dfrac{\partial^2 u}{\partial x^2}$, which is approximated by the finite difference scheme
$$\frac{1}{k}(u_{p,q+1} - u_{p,q}) = \frac{1}{h^2}(u_{p-1,q+1} - 2u_{p,q+1} + u_{p+1,q+1})$$
at $(ph, qk)$.

Solution. Substituting $u_{p,q} = e^{i\beta ph}\xi^q$ into the difference equation, we obtain
$$e^{i\beta ph}\xi^{q+1} - e^{i\beta ph}\xi^{q} = r\{e^{i\beta(p-1)h}\xi^{q+1} - 2e^{i\beta ph}\xi^{q+1} + e^{i\beta(p+1)h}\xi^{q+1}\},$$
where $r = k/h^2$.
Dividing both sides by $e^{i\beta ph}\xi^q$,
$$\xi - 1 = r\xi\{e^{-i\beta h} - 2 + e^{i\beta h}\} = r\xi(2\cos\beta h - 2) = -4r\xi\sin^2(\beta h/2).$$
That is,
$$\xi = \frac{1}{1 + 4r\sin^2(\beta h/2)}. \tag{9.48}$$
Clearly $0 < \xi \le 1$ for all $r > 0$ and all $\beta$. Therefore, the difference scheme is unconditionally stable.
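Formula (9.48) is easy to probe numerically. The short sketch below (illustrative only, not one of the book's numbered programs; the chosen values of $r$ and $\beta h$ are arbitrary) tabulates $\xi$ and confirms that it always lies in $(0, 1]$.

/* Sketch: amplification factor (9.48) of the implicit scheme,
   xi = 1 / (1 + 4 r sin^2(beta*h/2)), for assumed values of
   r = k/h^2 and beta*h.  Illustrative only. */
#include <stdio.h>
#include <math.h>
#define pi 3.1415926535

int main(void)
{
    double r[] = {0.1, 0.5, 1.0, 10.0, 100.0};
    int i, j;
    for (i = 0; i < 5; i++)
        for (j = 1; j <= 4; j++) {          /* beta*h = j*pi/4 */
            double s = sin(j * pi / 8.0);   /* sin(beta*h/2)   */
            double xi = 1.0 / (1.0 + 4.0 * r[i] * s * s);
            printf("r = %8.2f  beta*h = %4.2f*pi  xi = %8.6f\n",
                   r[i], j / 4.0, xi);
        }
    return 0;                               /* every xi is in (0,1] */
}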
Example 9.5.2 Investigate the stability of the hyperbolic equation $\dfrac{\partial^2 u}{\partial t^2} = \dfrac{\partial^2 u}{\partial x^2}$, which is approximated by the difference equation
$$\frac{1}{k^2}(u_{p,q+1} - 2u_{p,q} + u_{p,q-1}) = \frac{1}{h^2}(u_{p+1,q} - 2u_{p,q} + u_{p-1,q})$$
at $(ph, qk)$.

Solution. Substituting $u_{p,q} = e^{i\beta ph}\xi^q$ into the above equation, we obtain the relation
$$e^{i\beta ph}\xi^{q+1} - 2e^{i\beta ph}\xi^{q} + e^{i\beta ph}\xi^{q-1} = r^2\{e^{i\beta(p+1)h}\xi^{q} - 2e^{i\beta ph}\xi^{q} + e^{i\beta(p-1)h}\xi^{q}\}.$$
Dividing both sides by $e^{i\beta ph}\xi^q$ gives
$$\xi - 2 + \xi^{-1} = r^2\{e^{i\beta h} - 2 + e^{-i\beta h}\} = r^2\{2\cos\beta h - 2\} = -4r^2\sin^2(\beta h/2),$$
i.e., $\xi^2 - 2\{1 - 2r^2\sin^2(\beta h/2)\}\xi + 1 = 0$ or, $\xi^2 - 2a\xi + 1 = 0$, where
$$a = 1 - 2r^2\sin^2(\beta h/2), \quad r = k/h. \tag{9.49}$$
Let $\xi_1$ and $\xi_2$ be the two values of $\xi$; they are given by
$$\xi_1 = a + \sqrt{a^2 - 1} \quad\text{and}\quad \xi_2 = a - \sqrt{a^2 - 1}.$$
Here, $u$ does not increase exponentially with $t$ as the difference equation is a three-time-level one. Thus, a necessary condition for stability is that $|\xi| \le 1$.
It is clear from (9.49) that $a \le 1$ because $r$, $k$ and $\beta$ are real.
If $a < -1$, then $|\xi_2| > 1$, which leads to instability. When $-1 \le a \le 1$, i.e., $a^2 \le 1$, $\xi_1$ and $\xi_2$ are complex and are given by
$$\xi_1 = a + i\sqrt{1 - a^2}, \quad \xi_2 = a - i\sqrt{1 - a^2}.$$
Hence $|\xi_1| = |\xi_2| = \sqrt{a^2 + (1 - a^2)} = 1$, and this shows that a necessary condition for stability is $-1 \le a \le 1$, that is, $-1 \le 1 - 2r^2\sin^2(\beta h/2) \le 1$. The right hand inequality is obvious; the left hand one is the useful part. Then
$$-1 \le 1 - 2r^2\sin^2(\beta h/2) \quad\text{or,}\quad r^2\sin^2(\beta h/2) \le 1.$$
This gives $r = k/h \le 1$.
This condition is also a sufficient condition.
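The bound $r \le 1$ can also be seen by computing $|\xi_1|$ and $|\xi_2|$ from $\xi^2 - 2a\xi + 1 = 0$ directly. The sketch below (illustrative only, not one of the book's numbered programs) does this at the worst wave number, $\sin^2(\beta h/2) = 1$, for values of $r$ on both sides of 1.

/* Sketch: moduli of the roots of xi^2 - 2a*xi + 1 = 0 with
   a = 1 - 2 r^2 sin^2(beta*h/2), taken at sin^2 = 1 (worst case).
   For r <= 1 both roots have modulus 1; for r > 1 one root
   exceeds 1 and the scheme is unstable.  Illustrative only. */
#include <stdio.h>
#include <math.h>

int main(void)
{
    double r[] = {0.5, 0.9, 1.0, 1.1, 1.5};
    int i;
    for (i = 0; i < 5; i++) {
        double a = 1.0 - 2.0 * r[i] * r[i];   /* sin^2(beta*h/2) = 1 */
        double m1, m2;
        if (a * a <= 1.0)                     /* complex pair        */
            m1 = m2 = 1.0;                    /* |xi| = sqrt(a^2 + (1-a^2)) */
        else {                                /* real roots          */
            double d = sqrt(a * a - 1.0);
            m1 = fabs(a + d);
            m2 = fabs(a - d);
        }
        printf("r = %4.2f   |xi1| = %8.5f   |xi2| = %8.5f\n",
               r[i], m1, m2);
    }
    return 0;
}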
9.6 Exercise
1. Solve the heat equation
$$\frac{\partial u}{\partial t} = \alpha\frac{\partial^2 u}{\partial x^2}$$
subject to the conditions $u(x, 0) = 0$, $u(0, t) = 0$ and $u(1, t) = 2t$, taking $h = 1/2$, $k = 1/16$.

2. Given the differential equation $\dfrac{\partial^2 u}{\partial x^2} = \dfrac{\partial u}{\partial t}$ and the boundary conditions $u(0, t) = u(5, t) = 0$ and $u(x, 0) = x^2(30 - x^2)$, use the explicit method to obtain the solution for $x_i = ih$, $t_j = jk$; $i = 0, 1, \ldots, 5$ and $j = 0, 1, 2, \ldots, 6$.
3. Solve the differential equation $\dfrac{\partial u}{\partial t} = \dfrac{\partial^2 u}{\partial x^2}$, $0 \le x \le 1/2$, given that $u = 0$ when $t = 0$, $0 \le x \le 1/2$, and with boundary conditions $\dfrac{\partial u}{\partial x} = 0$ at $x = 0$ and $\dfrac{\partial u}{\partial x} = 1$ at $x = 1/2$ for $t > 0$, taking $h = 0.1$, $k = 0.001$.

4. Solve the following initial value problem: $f_t = f_{xx}$, $0 \le x \le 1$, subject to the initial condition $f(x, 0) = \cos\dfrac{\pi x}{2}$ and the boundary conditions $f(0, t) = 1$, $f(1, t) = 0$ for $t > 0$, taking $h = 1/3$, $k = 1/3$.

5. Solve the parabolic differential equation $u_t = u_{xx}$, $0 \le x \le 1$, subject to the boundary conditions $u = 0$ at $x = 0$ and $x = 1$ for $t > 0$ and the initial conditions
$$u(x, 0) = \begin{cases} 2x & \text{for } 0 \le x \le 1/2\\ 2(1 - x) & \text{for } 1/2 \le x \le 1,\end{cases}$$
using the explicit method.
6. The differential equation $u_{tt} = u_{xx}$, $0 \le x \le 1$, satisfies the boundary conditions $u = 0$ at $x = 0$ and $x = 1$ for $t > 0$, and the initial conditions $u(x, 0) = \sin\dfrac{\pi x}{4}$, $\left(\dfrac{\partial u}{\partial t}\right)_{(x,0)} = 0$. Compute the values of $u$ for $x = 0, 0.1, 0.2, \ldots, 0.5$ and $t = 0, 0.1, 0.2, \ldots, 0.5$.

7. Solve the hyperbolic partial differential equation $u_{tt} = u_{xx}$, $0 \le x \le 2$, $t \ge 0$, subject to the boundary conditions $u(0, t) = u(2, t) = 0$, $t \ge 0$, and the initial conditions $u(x, 0) = 5\sin\dfrac{\pi x}{2}$, $0 \le x \le 2$, $u_t(x, 0) = 0$, taking $h = 1/8$ and $k = 1/8$.

8. Solve the Poisson's equation $u_{xx} + u_{yy} = -2x^2 + y^2$ over the region $0 \le x \le 2$, $0 \le y \le 2$, taking the boundary condition $u = 0$ on all the boundary sides, with $h = 0.5$. Use Gauss-Seidel's method to improve the solution.
9. Solve the Laplace equation $u_{xx} + u_{yy} = 0$ taking $h = 1$, with boundary values as shown below.
[Mesh diagram: a square mesh with nine interior unknowns $u_{1,1}, \ldots, u_{3,3}$. The boundary values are 0 at every node of the left side; 0, 8, 10, 12, 16 along the top (left to right); 28, 24, 22 down the right side between the corner values 16 and 20; and 0, 10, 15, 18, 20 along the bottom (left to right).]
10. Solve the elliptic differential equation $u_{xx} + u_{yy} = 0$ for the region bounded by $0 \le x \le 5$, $0 \le y \le 5$, the boundary conditions being
$u = 0$ at $x = 0$ and $u = 2 + y$ at $x = 5$,
$u = x^2$ at $y = 0$ and $u = 2x$ at $y = 5$.
Take $h = k = 1$. Use
(a) Jacobi's method, (b) Gauss-Seidel's method, and (c) Gauss-Seidel's S.O.R. method.
Chapter 10
Least Squares Approximation
In science and engineering, experimental data are usually viewed by plotting them as graphs on plane paper. But the problem is: what is the 'best curve' for a given set of data? If n data points $(x_i, y_i)$, $i = 1, 2, \ldots, n$, are given, then an interpolating polynomial can be constructed using interpolation methods such as Lagrange's or Newton's. But handling a high degree polynomial is practically difficult: though it gives exact values at the given nodes $x_0, x_1, \ldots, x_n$, it may give large errors at the other points. For the given data points one can instead construct lower degree polynomials (linear, quadratic, etc.) and other types of curves, viz., geometric, exponential, etc., using the least squares method, which minimizes the sum of squares of the absolute errors. The curve fitted by this method does not always give exact values at the given nodes $x_0, x_1, \ldots, x_n$ or at other points.
The least squares method is used to fit polynomials of different degrees, other special curves, and also orthogonal polynomials.
10.1 General Least Squares Method

Let $(x_i, y_i)$, $i = 1, 2, \ldots, n$, be a bivariate sample of size n. The problem is to fit the curve
$$y = g(x; a_0, a_1, \ldots, a_k), \tag{10.1}$$
where $a_0, a_1, \ldots, a_k$ are unknown parameters to be determined from the given sample values such that the error is minimum.
When $x = x_i$, the value of y obtained from (10.1) is denoted by $Y_i$ and is given by
$$Y_i = g(x_i; a_0, a_1, \ldots, a_k). \tag{10.2}$$
The quantity $Y_i$ is called the expected or predicted value of y corresponding to $x = x_i$, and $y_i$ is called the observed value of y. These two values are, in general, different, as the observed values do not necessarily lie on the curve (10.1).
The difference $(y_i - Y_i)$ is called the residual corresponding to $x = x_i$. The parameters $a_0, a_1, \ldots, a_k$ are chosen by the least squares method in such a way that the sum of squares of the residuals, S, is minimum. Now,
$$S = \sum_{i=1}^{n}(y_i - Y_i)^2 = \sum_{i=1}^{n}\left[y_i - g(x_i; a_0, a_1, \ldots, a_k)\right]^2. \tag{10.3}$$
The expression for S contains $(k+1)$ unknowns $a_0, a_1, \ldots, a_k$. The value of S will be minimum if
$$\frac{\partial S}{\partial a_0} = 0, \quad \frac{\partial S}{\partial a_1} = 0, \quad \ldots, \quad \frac{\partial S}{\partial a_k} = 0. \tag{10.4}$$
These equations are called normal equations. The solution of these $(k+1)$ equations gives the values of the parameters $a_0, a_1, \ldots, a_k$. Let $a_0 = a_0^*$, $a_1 = a_1^*$, ..., $a_k = a_k^*$ be the solution of the system of equations (10.4).
Then the fitted curve is
$$y = g(x; a_0^*, a_1^*, \ldots, a_k^*). \tag{10.5}$$
The sum of squares of the residuals is obtained from the equation
$$S = \sum_{i=1}^{n}(y_i - Y_i)^2 = \sum_{i=1}^{n}\left[y_i - g(x_i; a_0^*, a_1^*, \ldots, a_k^*)\right]^2. \tag{10.6}$$
10.2 Fitting of a Straight Line

Let
$$y = a + bx \tag{10.7}$$
be the equation of a straight line, where a and b are two parameters whose values are to be determined. Let $(x_i, y_i)$, $i = 1, 2, \ldots, n$, be a given sample of size n.
Here S is given by
$$S = \sum_{i=1}^{n}(y_i - Y_i)^2 = \sum_{i=1}^{n}(y_i - a - bx_i)^2.$$
The normal equations are
$$\frac{\partial S}{\partial a} = -2\sum(y_i - a - bx_i) = 0, \qquad \frac{\partial S}{\partial b} = -2\sum(y_i - a - bx_i)x_i = 0,$$
which give, on simplification,
$$\sum y_i = na + b\sum x_i, \qquad \sum x_iy_i = a\sum x_i + b\sum x_i^2. \tag{10.8}$$
The solution of these equations is
$$b = \frac{n\sum x_iy_i - \sum x_i\sum y_i}{n\sum x_i^2 - \left(\sum x_i\right)^2} \quad\text{and}\quad a = \frac{1}{n}\left(\sum y_i - b\sum x_i\right). \tag{10.9}$$
But when the sample size is large or the data values are large, there is a chance of overflow while computing $\sum x_iy_i$, $\sum x_i^2$ and $\left(\sum x_i\right)^2$. Then the suggested expression for b is
$$b = \frac{\sum(x_i - \bar{x})(y_i - \bar{y})}{\sum(x_i - \bar{x})^2}, \quad\text{where } \bar{x} = \frac{1}{n}\sum x_i,\ \bar{y} = \frac{1}{n}\sum y_i. \tag{10.10}$$
Let this solution be denoted by $a = a^*$, $b = b^*$. Then the fitted straight line is
$$y = a^* + b^*x. \tag{10.11}$$
Example 10.2.1 Use the least squares method to fit the line $y = a + bx$ based on the sample $(2, 1)$, $(\frac{1}{6}, -\frac{5}{6})$, $(-\frac{3}{2}, -2)$ and $(-\frac{1}{3}, -\frac{2}{3})$. Estimate the total error.

Solution. Here $n = 4$. The normal equations are
$$\sum y_i = na + b\sum x_i, \qquad \sum x_iy_i = a\sum x_i + b\sum x_i^2.$$
The values of $\sum x_i$, $\sum y_i$, $\sum x_i^2$ and $\sum x_iy_i$ are calculated in the following table.
 x_i       y_i       x_i^2      x_i y_i
2 1 4 2
1/6 –5/6 1/36 –5/36
–3/2 –2 9/4 3
–1/3 –2/3 1/9 2/9
Total 1/3 –5/2 115/18 61/12
The normal equations then reduce to
$$-\frac{5}{2} = 4a + \frac{1}{3}b \quad\text{and}\quad \frac{61}{12} = \frac{1}{3}a + \frac{115}{18}b.$$
The solution of these equations is $a = -0.6943$, $b = 0.8319$. Therefore, the fitted straight line is
$$y = -0.6943 + 0.8319x.$$
Estimation of error.

 x       Given y (y_i)    y from the curve (Y_i)    (y_i − Y_i)^2
2 1 0.9695 0.0009
1/6 –5/6 –0.5557 0.0771
–3/2 –2 –1.9422 0.0033
–1/3 –2/3 –0.9716 0.0930
Total 0.1743
Thus the sum of squares of the errors is 0.1743.
Algorithm 10.1 (Straight line fit). This algorithm fits a straight line for the
given data points (xi , yi ), i = 1, 2, . . . , n, by least squares method.
Algorithm Straight Line
Step 1. Read (xi , yi ), i = 1, 2, . . . , n.
Step 2. //Computation of x and y//
Set sx = 0, sy = 0.
for i = 1 to n do
sx = sx + xi and sy = sy + yi
endfor;
Compute x = sx/n and y = sy/n;
Step 3. Set sxy = 0, sx2 = 0;
for i = 1 to n do
sxy = sxy + (xi − x)(yi − y);
sx2 = sx2 + (xi − x)2 ;
endfor;
Step 4. Compute b = sxy/sx2; and a = y − bx;
Step 5. Print ‘The fitted line is y=’,a, ’+’, b, ‘x’.
end Straight Line
Program 10.1
/* Program Straight Line Fit
Program to fit a straight line for the given data points
(x[i],y[i]), i=1, 2, . . ., n, by least squares method.*/
#include<stdio.h>
#include<math.h>
void main()
{
int n,i; float x[50], y[50], sx=0,sy=0,sxy=0,sx2=0,xb,yb,a,b;
char sign;
printf("Enter the sample size and the sample (x[i],y[i]) ");
scanf("%d",&n);
for(i=1;i<=n;i++) scanf("%f %f",&x[i],&y[i]);
for(i=1;i<=n;i++){
sx+=x[i]; sy+=y[i];
}
xb=sx/n; yb=sy/n;
for(i=1;i<=n;i++){
sxy+=(x[i]-xb)*(y[i]-yb);
sx2+=(x[i]-xb)*(x[i]-xb);
}
b=sxy/sx2; a=yb-b*xb;
sign=(b<0)? ’-’:’+’;
printf("\nThe fitted line is y = %f %c %f x",a,sign,fabs(b));
} /* main */
A sample of input/output:
Enter the sample size and the sample (x[i],y[i]) 5 1 12 4 13 6 10 7 8 10 3
The fitted line is y = 15.097345 - 1.053097 x
10.3 Fitting of a Parabolic Curve

Let
$$y = a + bx + cx^2 \tag{10.12}$$
be the second degree parabolic curve, where a, b, c are unknown parameters whose values are to be determined based on the sample values $(x_i, y_i)$, $i = 1, 2, \ldots, n$.
Assume that $Y_i$ is the predicted value of y obtained from the curve (10.12) at $x = x_i$, i.e., $Y_i = a + bx_i + cx_i^2$.
The sum of squares of residuals is
$$S = \sum_{i=1}^{n}(y_i - Y_i)^2 = \sum_{i=1}^{n}(y_i - a - bx_i - cx_i^2)^2.$$
The normal equations are given by
$$\frac{\partial S}{\partial a} = 0, \quad \frac{\partial S}{\partial b} = 0, \quad\text{and}\quad \frac{\partial S}{\partial c} = 0.$$
That is,
$$-2\sum_{i=1}^{n}(y_i - a - bx_i - cx_i^2) = 0 \quad\text{or}\quad \sum y_i = na + b\sum x_i + c\sum x_i^2,$$
$$-2\sum_{i=1}^{n}(y_i - a - bx_i - cx_i^2)\,x_i = 0 \quad\text{or}\quad \sum x_iy_i = a\sum x_i + b\sum x_i^2 + c\sum x_i^3,$$
$$-2\sum_{i=1}^{n}(y_i - a - bx_i - cx_i^2)\,x_i^2 = 0 \quad\text{or}\quad \sum x_i^2y_i = a\sum x_i^2 + b\sum x_i^3 + c\sum x_i^4.$$
Let $a = a^*$, $b = b^*$ and $c = c^*$ be the solution of the above equations. Then the fitted parabolic curve is
$$y = a^* + b^*x + c^*x^2. \tag{10.13}$$
Example 10.3.1 Fit a parabola to the following data by taking x as the independent variable.
x : 1 2 3 4 5 6 7 8 9
y : 2 6 7 8 10 11 11 10 9

Solution. Let the equation of the parabola be
$$y = a + bu + cu^2,$$
where $u = 5 - x$ is taken as the independent variable. Therefore, the normal equations are
$$\sum y = na + b\sum u + c\sum u^2,$$
$$\sum uy = a\sum u + b\sum u^2 + c\sum u^3,$$
$$\sum u^2y = a\sum u^2 + b\sum u^3 + c\sum u^4.$$
The calculations are shown in the following table.

 x    u    y    u^2   uy    u^3   u^2 y   u^4
1 4 2 16 8 64 32 256
2 3 6 9 18 27 54 81
3 2 7 4 14 8 28 16
4 1 8 1 8 1 8 1
5 0 10 0 0 0 0 0
6 –1 11 1 –11 –1 11 1
7 –2 11 4 –22 –8 44 16
8 –3 10 9 –30 –27 90 81
9 –4 9 16 –36 –64 144 256
Total 0 74 60 –51 0 411 708
Then the above normal equations reduce to
$$74 = 9a + 0\cdot b + 60c, \qquad -51 = 0\cdot a + 60b + 0\cdot c, \qquad 411 = 60a + 0\cdot b + 708c.$$
The solution of these equations is $b = -0.85$, $a = 10.0043$, $c = -0.2673$. Thus the fitted parabola is
$$y = 10.0043 - 0.85u - 0.2673u^2 = 10.0043 - 0.85(5 - x) - 0.2673(5 - x)^2$$
$$= -0.9282 + 3.5230x - 0.2673x^2.$$
10.4 Fitting of a Polynomial of Degree k

Let
$$y = a_0 + a_1x + a_2x^2 + \cdots + a_kx^k \tag{10.14}$$
be a polynomial (a parabola of degree k), and let $(x_i, y_i)$, $i = 1, 2, \ldots, n$, be the given sample.
The sum of squares of residuals is
$$S = \sum_{i=1}^{n}\left[y_i - (a_0 + a_1x_i + a_2x_i^2 + \cdots + a_kx_i^k)\right]^2. \tag{10.15}$$
As in the previous cases, the normal equations are given by
$$\frac{\partial S}{\partial a_0} = 0, \quad \frac{\partial S}{\partial a_1} = 0, \quad \ldots, \quad \frac{\partial S}{\partial a_k} = 0.$$
That is,
$$na_0 + a_1\sum x_i + a_2\sum x_i^2 + \cdots + a_k\sum x_i^k = \sum y_i$$
$$a_0\sum x_i + a_1\sum x_i^2 + a_2\sum x_i^3 + \cdots + a_k\sum x_i^{k+1} = \sum x_iy_i \tag{10.16}$$
$$a_0\sum x_i^2 + a_1\sum x_i^3 + a_2\sum x_i^4 + \cdots + a_k\sum x_i^{k+2} = \sum x_i^2y_i$$
$$\cdots\cdots$$
$$a_0\sum x_i^k + a_1\sum x_i^{k+1} + a_2\sum x_i^{k+2} + \cdots + a_k\sum x_i^{2k} = \sum x_i^ky_i.$$
These $(k+1)$ equations contain $(k+1)$ unknowns $a_0, a_1, \ldots, a_k$. Let $a_0 = a_0^*$, $a_1 = a_1^*$, ..., $a_k = a_k^*$ be the solution of the above system of linear equations. Then the fitted polynomial is
$$y = a_0^* + a_1^*x + a_2^*x^2 + \cdots + a_k^*x^k. \tag{10.17}$$
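A direct way to implement this section is to accumulate the power sums, assemble the augmented matrix of (10.16), and solve it by Gaussian elimination. The sketch below (illustrative only, not one of the book's numbered programs) does this for a general degree k; run with k = 2 on the data of Example 10.3.1, it reproduces the parabola found there up to rounding.

/* Sketch: least squares polynomial of degree k via the normal
   equations (10.16), solved with naive Gaussian elimination.
   Illustrative only; demonstrated on the data of Example 10.3.1. */
#include <stdio.h>
#include <math.h>
#define KMAX 10

int main(void)
{
    double x[] = {1, 2, 3, 4, 5, 6, 7, 8, 9};
    double y[] = {2, 6, 7, 8, 10, 11, 11, 10, 9};
    int n = 9, k = 2;                 /* fit a0 + a1*x + a2*x^2 */
    double A[KMAX][KMAX + 1] = {{0}}, a[KMAX];
    int i, j, l;
    for (i = 0; i < n; i++)           /* build the normal equations */
        for (j = 0; j <= k; j++) {
            for (l = 0; l <= k; l++) A[j][l] += pow(x[i], j + l);
            A[j][k + 1] += y[i] * pow(x[i], j);
        }
    for (j = 0; j <= k; j++)          /* forward elimination */
        for (i = j + 1; i <= k; i++) {
            double m = A[i][j] / A[j][j];
            for (l = j; l <= k + 1; l++) A[i][l] -= m * A[j][l];
        }
    for (i = k; i >= 0; i--) {        /* back substitution */
        a[i] = A[i][k + 1];
        for (l = i + 1; l <= k; l++) a[i] -= A[i][l] * a[l];
        a[i] /= A[i][i];
    }
    for (i = 0; i <= k; i++) printf("a%d = %f\n", i, a[i]);
    return 0;
}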
10.5 Fitting of Other Curves

10.5.1 Geometric curve

Let
$$y = ax^b \tag{10.18}$$
be the geometric curve to be fitted to the given data $(x_i, y_i)$, $i = 1, 2, \ldots, n$, where a, b are unknown parameters.
Taking logarithms on both sides, we obtain the equation
$$\log y = \log a + b\log x.$$
This can be written as
$$Y = A + bX,$$
where $Y = \log y$, $X = \log x$, $A = \log a$.
The original points $(x_i, y_i)$ in the xy-plane are transformed to $(X_i, Y_i)$, $i = 1, 2, \ldots, n$, in the XY-plane. This process is called data linearization.
Now, using the method described in Section 10.2, one can determine the values of A and b, and a can be obtained from $a = e^A$.
Example 10.5.1 Fit a curve of the type $y = ax^b$ to the following points.
x : 1 2 3 4 5
y : 3.5 6.2 9.5 15.3 20.4

Solution. Let $y = ax^b$. Then its corresponding linear curve is $Y = A + bX$, where $Y = \log y$, $X = \log x$, $A = \log a$.
The normal equations are
$$\sum Y_i = nA + b\sum X_i, \qquad \sum X_iY_i = A\sum X_i + b\sum X_i^2.$$
The values of $\sum X_i$, $\sum X_i^2$, $\sum Y_i$, $\sum X_iY_i$ are calculated in the following table.

 x     y      X         Y         X^2       XY
1 3.5 0 1.25276 0 0
2 6.2 0.69315 1.82455 0.48046 1.26469
3 9.5 1.09861 2.25129 1.20694 2.47329
4 15.3 1.38629 2.72785 1.92180 3.78159
5 20.4 1.60944 3.01553 2.59030 4.85331
Total 4.78749 11.07198 6.19950 12.37288
Then the normal equations are
$$5A + 4.78749b = 11.07198, \qquad 4.78749A + 6.19950b = 12.37288.$$
The solution of these equations is $A = 1.16445$, $b = 1.09655$, and hence $a = e^A = 3.20416$. Thus the fitted curve is
$$y = 3.20416\,x^{1.09655}.$$
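The data linearization of this section is mechanical enough to code directly. The following sketch (illustrative only, not one of the book's numbered programs) transforms the data of Example 10.5.1 to $(X, Y) = (\log x, \log y)$, fits the straight line of Section 10.2, and recovers $a = e^A$.

/* Sketch: fit y = a*x^b by data linearization,
   Y = log y = A + b*X with X = log x, and a = exp(A).
   Illustrative only; uses the data of Example 10.5.1. */
#include <stdio.h>
#include <math.h>

int main(void)
{
    double x[] = {1, 2, 3, 4, 5};
    double y[] = {3.5, 6.2, 9.5, 15.3, 20.4};
    int n = 5, i;
    double sX = 0, sY = 0, sXY = 0, sX2 = 0, A, b;
    for (i = 0; i < n; i++) {
        double X = log(x[i]), Y = log(y[i]);  /* linearize */
        sX += X; sY += Y; sXY += X * Y; sX2 += X * X;
    }
    b = (n * sXY - sX * sY) / (n * sX2 - sX * sX);   /* as in (10.9) */
    A = (sY - b * sX) / n;
    printf("a = %f, b = %f\n", exp(A), b);  /* about 3.204 and 1.097 */
    return 0;
}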
10.5.2 Rectangular hyperbola

Let the equation of the rectangular hyperbola be
$$y = \frac{1}{a + bx}. \tag{10.19}$$
This non-linear equation can be converted to linear form by substituting $Y = 1/y$. The corresponding linear equation then becomes $Y = a + bx$.
Now, using the method discussed in Section 10.2, one can fit the above curve.
10.5.3 Exponential curve

Let the exponential curve be
$$y = ae^{bx}. \tag{10.20}$$
Taking logarithms on both sides, we get
$$\log y = \log a + bx,$$
which can be written as
$$Y = A + bx,$$
where $Y = \log y$, $A = \log a$, i.e., $a = e^A$.
Now the parameters A and b can be determined by using the technique adopted in Section 10.2.
10.6 Weighted Least Squares Method

Sometimes it may happen that, while fitting a curve based on a given sample $(x_i, y_i)$, $i = 1, 2, \ldots, n$, some data points are more significant than the others. In this situation, we seek a curve that either passes through those important points or passes very near to them. The amount of 'importance' can be introduced by assigning weights to those data points. If all the data have the same importance, then the weights are set to 1.
10.6.1 Fitting of a weighted straight line

Let $y = a + bx$ be the straight line to be fitted to the given data points $(x_i, y_i)$, $i = 1, 2, \ldots, n$, with weights $w_i$, $i = 1, 2, \ldots, n$. In this case, the sum of squares of residuals is defined as
$$S = \sum_{i=1}^{n} w_i\left[y_i - (a + bx_i)\right]^2. \tag{10.21}$$
For minimum S,
$$\frac{\partial S}{\partial a} = 0, \qquad \frac{\partial S}{\partial b} = 0.$$
These give
$$-2\sum_{i=1}^{n} w_i[y_i - (a + bx_i)] = 0 \quad\text{and}\quad -2\sum_{i=1}^{n} w_i[y_i - (a + bx_i)]x_i = 0.$$
After simplification these give a system of linear equations for a and b:
$$a\sum_{i=1}^{n} w_i + b\sum_{i=1}^{n} w_ix_i = \sum_{i=1}^{n} w_iy_i \quad\text{and}\quad a\sum_{i=1}^{n} w_ix_i + b\sum_{i=1}^{n} w_ix_i^2 = \sum_{i=1}^{n} w_ix_iy_i.$$
These are the normal equations; their solution gives the values of a and b.
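These two equations form a 2x2 linear system and can be solved in closed form. The sketch below (illustrative only, not one of the book's numbered programs) computes a and b by Cramer's rule; applied to the data and weights of Example 10.6.1 below, it gives the same line.

/* Sketch: weighted least squares straight line y = a + b*x.
   Solves the 2x2 normal equations above by Cramer's rule.
   Illustrative only; data and weights are those of Example 10.6.1. */
#include <stdio.h>

int main(void)
{
    double x[] = {0, 2, 4, 6}, y[] = {10, 15, 18, 25};
    double w[] = {1, 5, 10, 1};
    int n = 4, i;
    double sw = 0, swx = 0, swy = 0, swx2 = 0, swxy = 0, det, a, b;
    for (i = 0; i < n; i++) {
        sw   += w[i];
        swx  += w[i] * x[i];
        swy  += w[i] * y[i];
        swx2 += w[i] * x[i] * x[i];
        swxy += w[i] * x[i] * y[i];
    }
    det = sw * swx2 - swx * swx;       /* determinant of the system */
    a = (swy * swx2 - swx * swxy) / det;
    b = (sw * swxy - swx * swy) / det;
    printf("y = %f + %f x\n", a, b);   /* about 10.29851 + 2.05224 x */
    return 0;
}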
Example 10.6.1 Fit the following data
x : 0 2 4 6
y : 10 15 18 25
to a straight line, considering that the data points (2, 15) and (4, 18) are more significant or reliable, with weights 5 and 10 respectively.

Solution. Let the straight line be $y = a + bx$. The normal equations are
$$a\sum w_i + b\sum w_ix_i = \sum w_iy_i \quad\text{and}\quad a\sum w_ix_i + b\sum w_ix_i^2 = \sum w_ix_iy_i.$$
The related values are calculated in the following table.
 x    y    w    wx    wx^2   wy    wxy
0 10 1 0 0 10 0
2 15 5 10 20 75 150
4 18 10 40 160 180 720
6 25 1 6 36 25 150
Total 17 56 216 290 1020
The normal equations are then 17a + 56b = 290 and 56a + 216b = 1020.
The solution of these equations is a = 10.29851, b = 2.05224.
Thus the fitted line is
y = 10.29851 + 2.05224x.
Estimation of error.
 x    y    w    Predicted y    Absolute error
0 10 1 10.29851 0.29851
2 15 5 14.40299 0.59701
4 18 10 18.50747 0.50747
6 25 1 22.61195 2.38805
Sum of squares of errors 6.40584
Example 10.6.2 Consider the above example again with the modified weights 25 and 40 instead of 5 and 10.

Solution. The modified calculations are shown below.

 x    y    w    wx    wx^2   wy    wxy
0 10 1 0 0 10 0
2 15 25 50 100 375 750
4 18 40 160 640 720 2880
6 25 1 6 36 25 150
Total 67 216 776 1130 3780
The normal equations are
$$67a + 216b = 1130 \quad\text{and}\quad 216a + 776b = 3780.$$
These give a = 11.31934, b = 1.72039.
The fitted straight line is
y = 11.31934 + 1.72039x.
Estimation of error.
x y w Predicted y Absolute error
0 10 1 11.31934 1.31934
2 15 25 14.76012 0.23988
4 18 40 18.20090 0.20090
6 25 1 21.64168 3.35832
Sum of squares of errors 13.11687
It is observed that when the weights at x = 2 and x = 4 are increased, the absolute errors in y at these points are reduced, but the sum of squares of errors increases because the data points (0, 10) and (6, 25) now carry less importance.
10.7 Least Squares Method for Continuous Data

In the previous sections, the least squares method was considered for discrete data. The method is also applicable to continuous data.
Let $y = f(x)$ be a continuous function on $[a, b]$ which is to be approximated by the kth degree polynomial
$$y = a_0 + a_1x + a_2x^2 + \cdots + a_kx^k. \tag{10.22}$$
In this case, the sum of squares of residuals S is defined as
$$S = \int_a^b w(x)\left[y - (a_0 + a_1x + a_2x^2 + \cdots + a_kx^k)\right]^2 dx, \tag{10.23}$$
where $w(x)$ is a suitable weight function.
The necessary conditions for minimum S are
$$\frac{\partial S}{\partial a_0} = \frac{\partial S}{\partial a_1} = \cdots = \frac{\partial S}{\partial a_k} = 0. \tag{10.24}$$
These give the normal equations
$$-2\int_a^b w(x)[y - (a_0 + a_1x + \cdots + a_kx^k)]\,dx = 0$$
$$-2\int_a^b w(x)[y - (a_0 + a_1x + \cdots + a_kx^k)]\,x\,dx = 0$$
$$-2\int_a^b w(x)[y - (a_0 + a_1x + \cdots + a_kx^k)]\,x^2\,dx = 0$$
$$\vdots$$
$$-2\int_a^b w(x)[y - (a_0 + a_1x + \cdots + a_kx^k)]\,x^k\,dx = 0.$$
After simplification these equations reduce to
$$a_0\int_a^b w(x)\,dx + a_1\int_a^b x\,w(x)\,dx + \cdots + a_k\int_a^b x^kw(x)\,dx = \int_a^b w(x)\,y\,dx$$
$$a_0\int_a^b x\,w(x)\,dx + a_1\int_a^b x^2w(x)\,dx + \cdots + a_k\int_a^b x^{k+1}w(x)\,dx = \int_a^b w(x)\,x\,y\,dx$$
$$a_0\int_a^b x^2w(x)\,dx + a_1\int_a^b x^3w(x)\,dx + \cdots + a_k\int_a^b x^{k+2}w(x)\,dx = \int_a^b w(x)\,x^2y\,dx \tag{10.25}$$
$$\vdots$$
$$a_0\int_a^b x^kw(x)\,dx + a_1\int_a^b x^{k+1}w(x)\,dx + \cdots + a_k\int_a^b x^{2k}w(x)\,dx = \int_a^b w(x)\,x^ky\,dx.$$
Since $w(x)$ and $y = f(x)$ are known, the above equations form a system of linear equations in the $(k+1)$ unknowns $a_0, a_1, \ldots, a_k$. This system possesses a unique solution. If
$$a_0 = a_0^*, \ a_1 = a_1^*, \ \ldots, \ a_k = a_k^*$$
is the solution, then the approximate polynomial is
$$y = a_0^* + a_1^*x + a_2^*x^2 + \cdots + a_k^*x^k.$$
Example 10.7.1 Construct a least squares quadratic approximation to the function $y = e^x$ on $[0, 1]$.

Solution. Let the weight function be $w(x) = 1$ and let $y = a_0 + a_1x + a_2x^2$ be the required quadratic approximation. Then the normal equations are
$$a_0\int_0^1 dx + a_1\int_0^1 x\,dx + a_2\int_0^1 x^2dx = \int_0^1 e^xdx$$
$$a_0\int_0^1 x\,dx + a_1\int_0^1 x^2dx + a_2\int_0^1 x^3dx = \int_0^1 xe^xdx$$
$$a_0\int_0^1 x^2dx + a_1\int_0^1 x^3dx + a_2\int_0^1 x^4dx = \int_0^1 x^2e^xdx.$$
After simplification these equations reduce to
$$a_0 + \frac{1}{2}a_1 + \frac{1}{3}a_2 = e - 1, \qquad \frac{1}{2}a_0 + \frac{1}{3}a_1 + \frac{1}{4}a_2 = 1, \qquad \frac{1}{3}a_0 + \frac{1}{4}a_1 + \frac{1}{5}a_2 = e - 2.$$
The solution of this system of equations is
$$a_0 = 1.0129913, \quad a_1 = 0.8511251, \quad a_2 = 0.8391840.$$
Thus the least squares approximation to $y = e^x$ is
$$y_l = 1.0129913 + 0.8511251x + 0.8391840x^2. \tag{10.26}$$
It is well known that the Taylor series expansions of $y = e^x$ up to second and third degree terms are
$$y_2 = 1 + x + \frac{x^2}{2} \tag{10.27}$$
$$y_3 = 1 + x + \frac{x^2}{2} + \frac{x^3}{6}. \tag{10.28}$$
The values of y obtained from (10.26), (10.27) and (10.28) are listed below.
x yl y2 y3 Exact
0.0 1.01299 1.00000 1.00000 1.00000
0.1 1.10650 1.10500 1.10517 1.10517
0.2 1.21678 1.22000 1.22133 1.22140
0.3 1.34386 1.34500 1.34950 1.34986
0.4 1.48771 1.48000 1.49067 1.49182
0.5 1.64835 1.62500 1.64583 1.64872
0.6 1.82577 1.78000 1.81600 1.82212
0.7 2.01998 1.94500 2.00217 2.01375
0.8 2.23097 2.12000 2.20533 2.22554
0.9 2.45874 2.30500 2.42650 2.45960
1.0 2.70330 2.50000 2.66667 2.71828
This table shows that the least squares quadratic approximation gives better results than the second degree Taylor series approximation, and even than the third degree one.
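The 3x3 system of Example 10.7.1 is a convenient self-check. The sketch below (illustrative only, not one of the book's numbered programs) builds the Hilbert-type coefficient matrix with entries $\int_0^1 x^{i+j}dx = 1/(i+j+1)$ and right-hand sides $e-1$, $1$, $e-2$, and solves it by Gaussian elimination, recovering the coefficients of (10.26).

/* Sketch: solve the normal equations of Example 10.7.1,
   sum_j a_j/(i+j+1) = integral_0^1 x^i e^x dx, i = 0, 1, 2.
   Illustrative only. */
#include <stdio.h>
#include <math.h>

int main(void)
{
    double A[3][4], a[3], e = exp(1.0);
    double rhs[3];
    int i, j, l;
    rhs[0] = e - 1.0;  rhs[1] = 1.0;  rhs[2] = e - 2.0;
    for (i = 0; i < 3; i++) {
        for (j = 0; j < 3; j++) A[i][j] = 1.0 / (i + j + 1);
        A[i][3] = rhs[i];
    }
    for (j = 0; j < 3; j++)               /* forward elimination */
        for (i = j + 1; i < 3; i++) {
            double m = A[i][j] / A[j][j];
            for (l = j; l < 4; l++) A[i][l] -= m * A[j][l];
        }
    for (i = 2; i >= 0; i--) {            /* back substitution */
        a[i] = A[i][3];
        for (l = i + 1; l < 3; l++) a[i] -= A[i][l] * a[l];
        a[i] /= A[i][i];
    }
    printf("a0=%9.7f a1=%9.7f a2=%9.7f\n", a[0], a[1], a[2]);
    /* expected: about 1.0129913, 0.8511251, 0.8391840 */
    return 0;
}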
10.8 Approximation Using Orthogonal Polynomials

The least squares methods discussed in the previous sections generate a system of $(k+1)$ linear equations. This system can be solved by any method presented in Chapter 5. But solving a large system of equations is problematic, especially when the system is ill-conditioned. These drawbacks can be removed by using orthogonal polynomials.
In the previous section, a function was approximated as a polynomial containing the terms $1, x, x^2, \ldots, x^k$. These terms are called base functions, because any function, or even discrete data, is approximated in terms of these functions.
Now, we assume that the base functions are some orthogonal polynomials $f_0(x), f_1(x), \ldots, f_k(x)$. Let the given function be approximated as
$$y = a_0f_0(x) + a_1f_1(x) + \cdots + a_kf_k(x), \tag{10.29}$$
where $f_i(x)$ is a polynomial in x of degree i. Then, as in the previous cases, the residue is given by
$$S = \int_a^b w(x)\left[y - \{a_0f_0(x) + a_1f_1(x) + \cdots + a_kf_k(x)\}\right]^2 dx. \tag{10.30}$$
For minimum S,
$$\frac{\partial S}{\partial a_0} = 0, \quad \frac{\partial S}{\partial a_1} = 0, \quad \ldots, \quad \frac{\partial S}{\partial a_k} = 0.$$
These equations give the following normal equations:
$$-2\int_a^b w(x)\left[y - \{a_0f_0(x) + a_1f_1(x) + \cdots + a_kf_k(x)\}\right]f_0(x)\,dx = 0$$
$$-2\int_a^b w(x)\left[y - \{a_0f_0(x) + a_1f_1(x) + \cdots + a_kf_k(x)\}\right]f_1(x)\,dx = 0$$
$$\vdots$$
$$-2\int_a^b w(x)\left[y - \{a_0f_0(x) + a_1f_1(x) + \cdots + a_kf_k(x)\}\right]f_k(x)\,dx = 0.$$
After simplification, the ith equation becomes
$$a_0\int_a^b w(x)f_0(x)f_i(x)\,dx + a_1\int_a^b w(x)f_1(x)f_i(x)\,dx + \cdots + a_i\int_a^b w(x)f_i^2(x)\,dx + \cdots + a_k\int_a^b w(x)f_k(x)f_i(x)\,dx = \int_a^b w(x)\,y\,f_i(x)\,dx, \tag{10.31}$$
$i = 0, 1, 2, \ldots, k$.
A set of polynomials $\{f_0(x), f_1(x), \ldots, f_k(x)\}$ is said to be orthogonal with respect to the weight function $w(x)$ if
$$\int_a^b f_i(x)f_j(x)w(x)\,dx = \begin{cases} 0, & \text{if } i \ne j\\ \displaystyle\int_a^b f_i^2(x)w(x)\,dx, & \text{if } i = j.\end{cases} \tag{10.32}$$
Incorporating this property, equation (10.31) becomes
$$a_i\int_a^b w(x)f_i^2(x)\,dx = \int_a^b w(x)yf_i(x)\,dx, \quad i = 0, 1, 2, \ldots, k.$$
That is,
$$a_i = \frac{\displaystyle\int_a^b w(x)\,y\,f_i(x)\,dx}{\displaystyle\int_a^b w(x)f_i^2(x)\,dx}, \quad i = 0, 1, 2, \ldots, k. \tag{10.33}$$
From this relation one can determine the values of $a_0, a_1, \ldots, a_k$, and the least squares approximation is obtained by substituting these values in (10.29). But the functions $f_0(x), f_1(x), \ldots, f_k(x)$ are yet to be specified. Several orthogonal polynomials are available in the literature; some of them are listed in Table 10.1.
Based on the problem, any one of them can be selected to fit a function.
Table 10.1: Some standard orthogonal polynomials.

Name        f_i(x)     Interval       w(x)
Legendre    P_n(x)     [−1, 1]        1
Laguerre    L_n(x)     [0, ∞)         e^{−x}
Hermite     H_n(x)     (−∞, ∞)        e^{−x²}
Chebyshev   T_n(x)     [−1, 1]        (1 − x²)^{−1/2}
Gram-Schmidt orthogonalization process

Let $f_i(x)$ be a polynomial in x of degree i and let $\{f_i(x)\}$ be a given sequence of polynomials. Then the sequence of orthogonal polynomials $\{f_i^*(x)\}$ over the interval $[a, b]$ with respect to the weight function $w(x)$ can be generated by the relation
$$f_i^*(x) = x^i - \sum_{r=0}^{i-1} a_{ir}f_r^*(x), \quad i = 1, 2, \ldots, n, \tag{10.34}$$
for suitable constants $a_{ir}$, with $f_0^*(x) = 1$.
To find the $a_{ir}$, multiply (10.34) by $w(x)f_k^*(x)$, $0 \le k \le i-1$, and integrate over $[a, b]$. Then
$$\int_a^b f_i^*(x)f_k^*(x)w(x)\,dx = \int_a^b x^if_k^*(x)w(x)\,dx - \sum_{r=0}^{i-1} a_{ir}\int_a^b f_r^*(x)f_k^*(x)w(x)\,dx.$$
Using the orthogonality property, this equation reduces to
$$\int_a^b x^if_k^*(x)w(x)\,dx - a_{ik}\int_a^b f_k^{*2}(x)w(x)\,dx = 0$$
$$\text{or,}\quad a_{ik} = \frac{\displaystyle\int_a^b x^if_k^*(x)\,w(x)\,dx}{\displaystyle\int_a^b f_k^{*2}(x)\,w(x)\,dx}, \quad 0 \le k \le i-1.$$
Thus the set of orthogonal polynomials $\{f_i^*(x)\}$ is given by
$$f_0^*(x) = 1, \qquad f_i^*(x) = x^i - \sum_{r=0}^{i-1} a_{ir}f_r^*(x), \quad i = 1, 2, \ldots, n,$$
$$\text{where}\quad a_{ir} = \frac{\displaystyle\int_a^b x^if_r^*(x)\,w(x)\,dx}{\displaystyle\int_a^b f_r^{*2}(x)\,w(x)\,dx}. \tag{10.35}$$
For discrete data, the integrals are replaced by summations.
Note 10.8.1 It may be noted that the Gram-Schmidt process generates a sequence of monic (leading coefficient unity) orthogonal polynomials.
Example 10.8.1 Use the Gram-Schmidt orthogonalization process to determine the first four orthogonal polynomials on $[-1, 1]$ with respect to the weight function $w(x) = 1$.

Solution. Let $f_0^*(x) = 1$.
Then $f_1^*(x) = x - a_{10}f_0^*(x)$, where
$$a_{10} = \frac{\int_{-1}^1 x\,dx}{\int_{-1}^1 dx} = 0.$$
Thus $f_1^*(x) = x$.
The second orthogonal polynomial is $f_2^*(x) = x^2 - a_{20}f_0^*(x) - a_{21}f_1^*(x)$, where
$$a_{20} = \frac{\int_{-1}^1 x^2\,dx}{\int_{-1}^1 dx} = \frac{1}{3}, \qquad a_{21} = \frac{\int_{-1}^1 x^2\cdot x\,dx}{\int_{-1}^1 x^2\,dx} = 0.$$
Thus $f_2^*(x) = x^2 - \frac{1}{3} = \frac{1}{3}(3x^2 - 1)$.
Again, $f_3^*(x) = x^3 - a_{30}f_0^*(x) - a_{31}f_1^*(x) - a_{32}f_2^*(x)$, where
$$a_{30} = \frac{\int_{-1}^1 x^3\,dx}{\int_{-1}^1 dx} = 0, \qquad a_{31} = \frac{\int_{-1}^1 x^3\cdot x\,dx}{\int_{-1}^1 x^2\,dx} = \frac{3}{5}, \qquad a_{32} = \frac{\int_{-1}^1 x^3\cdot\frac{1}{3}(3x^2-1)\,dx}{\int_{-1}^1 \frac{1}{9}(3x^2-1)^2\,dx} = 0.$$
Hence $f_3^*(x) = x^3 - \frac{3}{5}x = \frac{1}{5}(5x^3 - 3x)$.
Thus the first four orthogonal polynomials are $f_0^*(x) = 1$, $f_1^*(x) = x$, $f_2^*(x) = \frac{1}{3}(3x^2 - 1)$ and $f_3^*(x) = \frac{1}{5}(5x^3 - 3x)$.
These polynomials are called the (monic) Legendre polynomials.
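A quick numerical cross-check of this example: integrate the products $f_i^*(x)f_j^*(x)$ over $[-1, 1]$ and verify that the off-diagonal entries vanish. The sketch below (illustrative only; it uses a simple midpoint rule) does exactly that for the four polynomials just found.

/* Sketch: verify numerically that the monic Legendre polynomials
   of Example 10.8.1 are orthogonal on [-1,1] with w(x) = 1.
   Midpoint rule with N subintervals; illustrative only. */
#include <stdio.h>

static double f(int i, double x)      /* f0*, ..., f3* from the example */
{
    switch (i) {
        case 0:  return 1.0;
        case 1:  return x;
        case 2:  return (3.0 * x * x - 1.0) / 3.0;
        default: return (5.0 * x * x * x - 3.0 * x) / 5.0;
    }
}

int main(void)
{
    int N = 100000, i, j, k;
    double h = 2.0 / N;
    for (i = 0; i < 4; i++)
        for (j = i; j < 4; j++) {
            double s = 0.0;
            for (k = 0; k < N; k++) {
                double x = -1.0 + (k + 0.5) * h;   /* midpoint */
                s += f(i, x) * f(j, x) * h;
            }
            printf("<f%d,f%d> = %12.8f\n", i, j, s);
        }
    return 0;  /* products with i != j integrate to about 0 */
}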
10.9 Approximation of Functions

The problem of approximating functions is an important one in numerical analysis, due to its wide applications in science and engineering, especially in computer graphics for plotting two- and three-dimensional figures. Under some restrictions, almost all functions can be approximated using Taylor's series with base functions $1, x, x^2, \ldots$. It was mentioned earlier that approximation of a function using Taylor's series is not economic. But approximation using orthogonal polynomials is economic, and among the several orthogonal polynomials, the Chebyshev polynomials seem to be the most economic.
10.9.1 Chebyshev polynomials

The Chebyshev polynomial $T_n(x)$ of degree n over the interval $[-1, 1]$ is defined by
$$T_n(x) = \cos(n\cos^{-1}x), \quad n = 0, 1, 2, \ldots. \tag{10.36}$$
This expression can also be written as
$$T_n(x) = \cos n\theta, \quad\text{where } x = \cos\theta. \tag{10.37}$$
Thus $T_0(x) = 1$ and $T_1(x) = x$.
From equation (10.36), it is easy to observe that $T_n(x) = T_{-n}(x)$. Also, $T_{2n}(-x) = T_{2n}(x)$ and $T_{2n+1}(-x) = -T_{2n+1}(x)$, i.e., $T_n(x)$ is an even or odd function according as n is even or odd.
Now, from the trigonometric formula
$$\cos(n-1)\theta + \cos(n+1)\theta = 2\cos n\theta\cos\theta,$$
the recurrence relation for Chebyshev polynomials is obtained as
$$T_{n-1}(x) + T_{n+1}(x) = 2xT_n(x), \quad\text{i.e.,}\quad T_{n+1}(x) = 2xT_n(x) - T_{n-1}(x), \quad n = 1, 2, 3, \ldots. \tag{10.38}$$
The zeros of the Chebyshev polynomial $T_n(x)$ lie in $[-1, 1]$; they are all distinct and are given by
$$x_i = \cos\left(\frac{(2i+1)\pi}{2n}\right), \quad i = 0, 1, 2, \ldots, n-1. \tag{10.39}$$
These values are called the Chebyshev abscissas or nodes.
From equation (10.37), it is obvious that $|T_n(x)| \le 1$ for $-1 \le x \le 1$. Thus the extreme values of $T_n(x)$ are $-1$ and $1$.
From the recurrence relation (10.38), it is observed that the relation doubles the leading coefficient of $T_n(x)$ to get the leading coefficient of $T_{n+1}(x)$; also $T_1(x) = x$. Thus the coefficient of $x^n$ in $T_n(x)$ is $2^{n-1}$, $n \ge 1$.
This polynomial satisfies the second order differential equation
$$(1 - x^2)\frac{d^2y}{dx^2} - x\frac{dy}{dx} + n^2y = 0. \tag{10.40}$$
To justify this, let $y = T_n(x) = \cos n\theta$, $x = \cos\theta$. Then
$$\frac{dy}{dx} = \frac{n\sin n\theta}{\sin\theta} \quad\text{and}\quad \frac{d^2y}{dx^2} = \frac{-n^2\cos n\theta + n\sin n\theta\cot\theta}{\sin^2\theta} = \frac{-n^2y + x\dfrac{dy}{dx}}{1 - x^2},$$
which gives $(1 - x^2)\dfrac{d^2y}{dx^2} - x\dfrac{dy}{dx} + n^2y = 0$.
Orthogonal property

We mentioned earlier that the Chebyshev polynomials are orthogonal with respect to the weight function $(1 - x^2)^{-1/2}$, i.e.,
$$\int_{-1}^1 \frac{T_n(x)T_m(x)}{\sqrt{1 - x^2}}\,dx = 0 \quad\text{for } m \ne n.$$
To prove this, let $x = \cos\theta$. Then
$$I = \int_{-1}^1 \frac{T_n(x)T_m(x)}{\sqrt{1 - x^2}}\,dx = \int_0^{\pi} T_n(\cos\theta)T_m(\cos\theta)\,d\theta = \int_0^{\pi}\cos n\theta\cos m\theta\,d\theta$$
$$= \frac{1}{2}\int_0^{\pi}\left[\cos(m+n)\theta + \cos(m-n)\theta\right]d\theta = \frac{1}{2}\left[\frac{\sin(m+n)\theta}{m+n} + \frac{\sin(m-n)\theta}{m-n}\right]_0^{\pi}.$$
Now, when $m \ne n$ this gives $I = 0$; when $m = n = 0$, $I = \pi$; and when $m = n \ne 0$, $I = \pi/2$.
Thus
$$\int_{-1}^1 \frac{T_n(x)T_m(x)}{\sqrt{1 - x^2}}\,dx = \begin{cases} 0, & \text{if } m \ne n\\ \pi, & \text{if } m = n = 0\\ \pi/2, & \text{if } m = n \ne 0. \end{cases} \tag{10.41}$$
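Relation (10.41) can be checked numerically through the same substitution $x = \cos\theta$, which turns the weighted integral into $\int_0^{\pi}\cos n\theta\cos m\theta\,d\theta$. A small sketch (illustrative only, not one of the book's numbered programs) follows.

/* Sketch: numerical check of (10.41) via x = cos(theta):
   integral of cos(n t) cos(m t) over [0, pi], midpoint rule.
   Expected: 0 (m != n), pi (m = n = 0), pi/2 (m = n != 0). */
#include <stdio.h>
#include <math.h>
#define pi 3.1415926535

int main(void)
{
    int n, m, k, N = 100000;
    double h = pi / N;
    for (n = 0; n <= 2; n++)
        for (m = 0; m <= 2; m++) {
            double s = 0.0;
            for (k = 0; k < N; k++) {
                double t = (k + 0.5) * h;   /* midpoint of subinterval */
                s += cos(n * t) * cos(m * t) * h;
            }
            printf("n=%d m=%d I=%9.6f\n", n, m, s);
        }
    return 0;
}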
Example 10.9.1 Use Chebyshev polynomials to find a least squares approximation of second degree for $f(x) = \sqrt{1 - x^2}$ on $[-1, 1]$.

Solution. Let $f(x) \simeq a_0T_0(x) + a_1T_1(x) + a_2T_2(x)$. Then the residue is
$$S = \int_{-1}^1 w(x)\left[f(x) - \{a_0T_0(x) + a_1T_1(x) + a_2T_2(x)\}\right]^2 dx,$$
where $w(x) = (1 - x^2)^{-1/2}$. For minimum S,
$$\frac{\partial S}{\partial a_0} = 0, \quad \frac{\partial S}{\partial a_1} = 0, \quad \frac{\partial S}{\partial a_2} = 0.$$
Now,
$$\frac{\partial S}{\partial a_0} = -2\int_{-1}^1 w(x)\left[f(x) - \{a_0T_0(x) + a_1T_1(x) + a_2T_2(x)\}\right]T_0(x)\,dx = 0.$$
Using the property of orthogonal polynomials, this equation simplifies to
$$\int_{-1}^1 w(x)f(x)T_0(x)\,dx - a_0\int_{-1}^1 w(x)T_0^2(x)\,dx = 0.$$
This equation gives
$$a_0 = \frac{\int_{-1}^1 w(x)f(x)T_0(x)\,dx}{\int_{-1}^1 w(x)T_0^2(x)\,dx} = \frac{1}{\pi}\int_{-1}^1 \frac{\sqrt{1 - x^2}}{\sqrt{1 - x^2}}\,dx = \frac{2}{\pi}$$
[using (10.41) and $T_0(x) = 1$].
Similarly, from the relations $\partial S/\partial a_1 = 0$ and $\partial S/\partial a_2 = 0$, we obtain
$$a_1 = \frac{\int_{-1}^1 w(x)f(x)T_1(x)\,dx}{\int_{-1}^1 w(x)T_1^2(x)\,dx} = \frac{2}{\pi}\int_{-1}^1 x\,dx = 0,$$
$$a_2 = \frac{\int_{-1}^1 w(x)f(x)T_2(x)\,dx}{\int_{-1}^1 w(x)T_2^2(x)\,dx} = \frac{2}{\pi}\int_{-1}^1 (2x^2 - 1)\,dx = -\frac{4}{3\pi}.$$
Thus the Chebyshev approximation is
$$f(x) = \frac{2}{\pi}T_0(x) - \frac{4}{3\pi}T_2(x) = \frac{2}{\pi} - \frac{4}{3\pi}(2x^2 - 1) = 1.061033 - 0.8488264x^2.$$
Using the recurrence relation (10.38) and $T_0(x) = 1$, $T_1(x) = x$, one can generate the Chebyshev polynomials. The first eight Chebyshev polynomials are
$$T_0(x) = 1$$
$$T_1(x) = x$$
$$T_2(x) = 2x^2 - 1$$
$$T_3(x) = 4x^3 - 3x$$
$$T_4(x) = 8x^4 - 8x^2 + 1 \tag{10.42}$$
$$T_5(x) = 16x^5 - 20x^3 + 5x$$
$$T_6(x) = 32x^6 - 48x^4 + 18x^2 - 1$$
$$T_7(x) = 64x^7 - 112x^5 + 56x^3 - 7x.$$
The graphs of the first five Chebyshev polynomials are shown in Figure 10.1.
Equation (10.42) suggests that all powers of x can be expressed in terms of Chebyshev polynomials, as
$$1 = T_0(x)$$
$$x = T_1(x)$$
$$x^2 = \tfrac{1}{2}[T_0(x) + T_2(x)]$$
$$x^3 = \tfrac{1}{4}[3T_1(x) + T_3(x)]$$
$$x^4 = \tfrac{1}{8}[3T_0(x) + 4T_2(x) + T_4(x)] \tag{10.43}$$
$$x^5 = \tfrac{1}{16}[10T_1(x) + 5T_3(x) + T_5(x)]$$
$$x^6 = \tfrac{1}{32}[10T_0(x) + 15T_2(x) + 6T_4(x) + T_6(x)]$$
$$x^7 = \tfrac{1}{64}[35T_1(x) + 21T_3(x) + 7T_5(x) + T_7(x)].$$
[Figure 10.1: Chebyshev polynomials $T_n(x)$, $n = 0, 1, 2, 3, 4$, plotted on $[-1, 1]$.]
Thus every polynomial can be expressed using Chebyshev polynomials.

Example 10.9.2 Express the polynomial $x^3 + 2x^2 - 7$ in terms of Chebyshev polynomials.

Solution.
$$x^3 + 2x^2 - 7 = \tfrac{1}{4}[3T_1(x) + T_3(x)] + 2\cdot\tfrac{1}{2}[T_0(x) + T_2(x)] - 7T_0(x) = \tfrac{1}{4}T_3(x) + T_2(x) + \tfrac{3}{4}T_1(x) - 6T_0(x).$$
10.9.2 Expansion of a function using Chebyshev polynomials

Let $y = f(x)$ be a function to be approximated by Chebyshev polynomials. This can be done as $f(x) = a_0T_0(x) + a_1T_1(x) + a_2T_2(x) + \cdots + a_kT_k(x)$, where the coefficients $a_i$ are given by
$$a_i = \frac{\displaystyle\int_{-1}^1 \frac{1}{\sqrt{1 - x^2}}\,y\,T_i(x)\,dx}{\displaystyle\int_{-1}^1 \frac{1}{\sqrt{1 - x^2}}\,T_i^2(x)\,dx}, \quad i = 0, 1, 2, \ldots, k.$$
But these integrals are improper and not easy to evaluate. So a discretization technique is adopted to approximate a function using Chebyshev polynomials. The orthogonality property for the discrete case is stated below.
$$\sum_{k=0}^{n} T_i(x_k)T_j(x_k) = \begin{cases} 0, & \text{if } i \ne j\\[2pt] \dfrac{n+1}{2}, & \text{if } i = j \ne 0\\[2pt] n+1, & \text{if } i = j = 0, \end{cases} \tag{10.44}$$
where $x_k = \cos\left(\dfrac{(2k+1)\pi}{2n+2}\right)$, $k = 0, 1, 2, \ldots, n$.
This result is used to establish the following theorem.
Theorem 10.1 (Chebyshev approximation). The function $f(x)$ can be approximated over $[-1, 1]$ by Chebyshev polynomials as
$$f(x) \simeq \sum_{i=0}^{n} a_iT_i(x). \tag{10.45}$$
The coefficients $a_i$ are given by
$$a_0 = \frac{1}{n+1}\sum_{j=0}^{n} f(x_j)T_0(x_j) = \frac{1}{n+1}\sum_{j=0}^{n} f(x_j), \quad x_j = \cos\left(\frac{(2j+1)\pi}{2n+2}\right), \tag{10.46}$$
and
$$a_i = \frac{2}{n+1}\sum_{j=0}^{n} f(x_j)T_i(x_j) = \frac{2}{n+1}\sum_{j=0}^{n} f(x_j)\cos\left(\frac{(2j+1)i\pi}{2n+2}\right) \tag{10.47}$$
for $i = 1, 2, \ldots, n$.
Example 10.9.3 Approximate $f(x) = e^x$ by a second order Chebyshev approximation over the interval $[0, 1]$.

Solution. The second order Chebyshev approximation is
$$f(x) \simeq \sum_{i=0}^{2} a_iT_i(x).$$
The Chebyshev nodes are given by
$$x_j = \cos\left(\frac{(2j+1)\pi}{6}\right), \quad j = 0, 1, 2,$$
i.e., $x_0 = 0.8660254$, $x_1 = 0$, $x_2 = -0.8660254$. Then
$$a_0 = \frac{1}{3}[f(x_0) + f(x_1) + f(x_2)] = 1.2660209,$$
$$a_1 = \frac{2}{3}\sum_{j=0}^{2} f(x_j)\cos\left(\frac{(2j+1)\pi}{6}\right) = 1.1297721,$$
$$a_2 = \frac{2}{3}\sum_{j=0}^{2} f(x_j)\cos\left(\frac{2(2j+1)\pi}{6}\right) = 0.2660209.$$
Therefore, $f(x) \simeq 1.2660209\,T_0(x) + 1.1297721\,T_1(x) + 0.2660209\,T_2(x)$.
Algorithm 10.2 (Chebyshev approximation). This algorithm approximates and evaluates a function $f(x)$ by Chebyshev polynomials as
$$f(x) \simeq \sum_{i=0}^{n} a_iT_i(x), \quad\text{where}\quad a_0 = \frac{1}{n+1}\sum_{j=0}^{n} f(x_j)$$
$$\text{and}\quad a_i = \frac{2}{n+1}\sum_{j=0}^{n} f(x_j)\cos\left(\frac{(2j+1)i\pi}{2n+2}\right), \quad i = 1, 2, \ldots, n.$$
Algorithm Chebyshev
Input function F (x)
Step 1. Read n; //degree of the polynomial//
Step 2. Set P i = 3.1415926535 and A = P i/(2n + 2).
Step 3. //Calculation of nodes and function values//
for i = 0 to n do
xi = cos((2i + 1)A) and yi = F (xi );
endfor;
Step 4. //Calculation of the coefficients ai .//
for i = 0 to n do
Set ai = 0;
for j = 0 to n do
ai = ai + cos((2j + 1) ∗ i ∗ A) ∗ yj ;
endfor;
endfor;
a0 = a0 /(n + 1);
for i = 1 to n do
ai = 2ai /(n + 1);
endfor;
//Evaluation of Chebyshev polynomial approximation//
Step 5. Read x; //The point at which the function is to be evaluated//
Step 6. Set T0 = 1, T1 = x;
Step 7. //Evaluation of Chebyshev polynomial//
if n > 1 then
for i = 1 to n − 1 do
Ti+1 = 2xTi − Ti−1 ;
endfor;
endif;
Step 8. sum = a0 ∗ T0 ;
for i = 1 to n do
sum = sum + ai ∗ Ti ;
endfor;
Step 9. Print ‘The value of Chebyshev polynomial approximation is’, sum;
end Chebyshev
Program 10.2.
/* Program Chebyshev Approximation
This program approximates and evaluates a function f(x)
by Chebyshev polynomials. */
#include <stdio.h>
#include <math.h>
#define f(x) exp(x) /* definition of function f(x) */
#define pi 3.1415926535
void main()
{
int n,i,j; float a,xi,x[40],y[40],c[40];
float xg,T[40],sum;
printf("Enter the degree of the polynomial ");
scanf("%d",&n);
a=pi/(2.0*n+2.0);
for(i=0;i<=n;i++)
{
x[i]=cos((2*i+1.)*a);
xi=x[i];
y[i]=f(xi);
}
for(i=0;i<=n;i++)
{
c[i]=0;
for(j=0;j<=n;j++)
c[i]=c[i]+cos((2*j+1)*i*a)*y[j];
}
c[0]=c[0]/(n+1.);
for(i=1;i<=n;i++)
c[i]=2*c[i]/(n+1.);
/* Printing of Chebyshev coefficients*/
printf("The Chebyshev coefficients are\n ");
for(i=0;i<=n;i++) printf("%f ",c[i]);
/* Evaluation of the polynomial */
printf("\nEnter the value of x ");
scanf("%f",&xg);
T[0]=1.0; T[1]=xg;
/*Computation of Chebyshev polynomial at x*/
if(n>1)
for(i=1;i<=n-1;i++) T[i+1]=2*xg*T[i]-T[i-1];
sum=c[0]*T[0];
for(i=1;i<=n;i++)
sum+=c[i]*T[i];
printf("\nThe value of the Chebyshev polynomial approximation
at %6.3f is %9.8f",xg,sum);
} /* main */
A sample of input/output:
Enter the degree of the polynomial 4
The Chebyshev coefficients are
1.266066 1.130318 0.271495 0.044334 0.005429
Enter the value of x 0.5
The value of the Chebyshev polynomial approximation at 0.500
is 1.64842904
Minimax principle

It is well known that the error in polynomial interpolation is
$$E_n(x) = w_{n+1}(x)\frac{f^{(n+1)}(\xi)}{(n+1)!},$$
where $w_{n+1}(x) = (x - x_0)(x - x_1)\cdots(x - x_n)$ is a polynomial of degree $(n+1)$. Now,
$$|E_n(x)| \le \max_{-1\le x\le 1}|w_{n+1}(x)|\;\frac{\max\limits_{-1\le x\le 1}|f^{(n+1)}(\xi)|}{(n+1)!}. \tag{10.48}$$
The quantity $\max_{-1\le x\le 1}|f^{(n+1)}(\xi)|$ is fixed for a given function $f(x)$. Thus the error bound depends on the value of $|w_{n+1}(x)|$, and the value of $|w_{n+1}(x)|$ depends on the choice of the nodes $x_0, x_1, \ldots, x_n$. The maximum error depends on the product of $\max_{-1\le x\le 1}|w_{n+1}(x)|$ and $\max_{-1\le x\le 1}|f^{(n+1)}(\xi)|$. The Russian mathematician Chebyshev showed that $x_0, x_1, \ldots, x_n$ should be chosen such that $w_{n+1}(x) = 2^{-n}T_{n+1}(x)$. The polynomial $2^{-n}T_{n+1}(x)$ is the monic Chebyshev polynomial.
If n is fixed, then among all possible choices for $w_{n+1}(x)$, and thus among all possible choices for the nodes $x_0, x_1, \ldots, x_n$ in $[-1, 1]$, the polynomial $\widetilde{T}_{n+1}(x) = 2^{-n}T_{n+1}(x)$ is the unique choice that satisfies the relation
$$\max_{-1\le x\le 1}\{|\widetilde{T}_{n+1}(x)|\} \le \max_{-1\le x\le 1}\{|w_{n+1}(x)|\}.$$
Moreover, $\max_{-1\le x\le 1}\{|\widetilde{T}_{n+1}(x)|\} = 2^{-n}$, since $|T_{n+1}(x)| \le 1$.
This property is called the minimax principle, and the polynomial $\widetilde{T}_{n+1}(x) = 2^{-n}T_{n+1}(x)$, or $\widetilde{T}_n(x) = 2^{1-n}T_n(x)$, is called the minimax polynomial.
10.9.3 Economization of power series

From equation (10.43), it is observed that every polynomial of degree n can be expressed in terms of Chebyshev polynomials of the same degree. If
$$f(x) = a_0 + a_1x + a_2x^2 + \cdots + a_nx^n \tag{10.49}$$
is the given polynomial, then it can be expressed in the form
$$f(x) = b_0 + b_1T_1(x) + b_2T_2(x) + \cdots + b_nT_n(x). \tag{10.50}$$
For a large number of functions, it is computationally observed that an expansion of the form (10.50) converges more rapidly than the form (10.49). Thus a representation of the form (10.50) is computationally better, and the process is called economization of the power series. This is illustrated in the following example.
Example 10.9.4 Economize the power series
$$\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \frac{x^8}{8!} - \cdots$$
correct to four significant digits.

Solution. Since the result is required to four significant digits and the coefficient of the term $x^8$, $1/8! = 0.0000248$, can affect the fifth decimal place only, the terms after the fourth term may be truncated.
Hence
$$\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!}. \tag{10.51}$$
Now, express the above series using Chebyshev polynomials, as
$$\cos x = T_0(x) - \frac{1}{2!}\cdot\frac{1}{2}[T_0(x) + T_2(x)] + \frac{1}{4!}\cdot\frac{1}{8}[3T_0(x) + 4T_2(x) + T_4(x)]$$
$$\qquad - \frac{1}{6!}\cdot\frac{1}{32}[10T_0(x) + 15T_2(x) + 6T_4(x) + T_6(x)]$$
$$= 0.7651910 - 0.2298177\,T_2(x) + 0.0049479\,T_4(x) - 0.0000434\,T_6(x).$$
Again, the term $0.0000434\,T_6(x)$ does not affect the fourth decimal place, so it is discarded and the economized series is
$$\cos x = 0.7651910 - 0.2298177\,T_2(x) + 0.0049479\,T_4(x).$$
In terms of x, the series is
$$\cos x = 0.7651910 - 0.2298177(2x^2 - 1) + 0.0049479(8x^4 - 8x^2 + 1)$$
$$= 0.9999566 - 0.4992186x^2 + 0.0395832x^4.$$
This expression gives the values of $\cos x$ correct up to four decimal places.
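The claimed four-decimal accuracy is easy to verify by comparing the economized polynomial with $\cos x$ over a grid. A minimal sketch (illustrative only, not one of the book's numbered programs) follows.

/* Sketch: maximum error of the economized polynomial
   0.9999566 - 0.4992186 x^2 + 0.0395832 x^4 against cos(x)
   on [-1, 1].  Illustrative only. */
#include <stdio.h>
#include <math.h>

int main(void)
{
    double maxerr = 0.0, x;
    for (x = -1.0; x <= 1.0; x += 0.001) {
        double p = 0.9999566 + x * x * (-0.4992186 + 0.0395832 * x * x);
        double err = fabs(p - cos(x));
        if (err > maxerr) maxerr = err;
    }
    printf("max |p(x) - cos x| on [-1,1] = %9.7f\n", maxerr);
    return 0;  /* the error stays below 5e-5 */
}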
10.10 Exercise
1. Fit a straight line for the following data.
x : 1 3 5 7 9
y : 10 13 25 23 33
Also calculate the sum of squares of residuals.
2. Fit a straight line of the form y = a + bx for the following data.
x : 1951 1961 1971 1981 1991
y : 33 43 60 78 96
Find y when x = 2001.
3. If the straight line $y = a + bx$ is the best fit curve for the data points $(x_i, y_i)$, $i = 1, 2, \ldots, n$, then show that
$$\begin{vmatrix} x & y & 1\\ \sum x_i & \sum y_i & n\\ \sum x_i^2 & \sum x_iy_i & \sum x_i \end{vmatrix} = 0.$$
4. Use the weighted least squares method to fit the straight line y = a + bx to the following data.
x : 2 4 6 7 9
y : 6 10 8 15 20
w : 1 1 5 10 1

5. Fit a parabola of the form $y = a + bx + cx^2$ for the following data.

x : 0 1 2 3 4 5
y : 0 4 16 36 64 100

Estimate the residue and conclude about the data. Find the predicted value of y
when x = −1.

6. Fit a straight line and a second degree parabola for the following data and explain
which curve is better fitted for the given data by estimating the residue.

x : –1 0 3 4 6
y : 0 11 22 25 34

7. Fit a curve of the form $ae^{-bx}$ for the data given below.
x : 0.5 1.0 1.5 1.6 1.8 1.9 2.0 2.25
y : 2.1 3.4 3.9 4.8 5.1 5.8 6.0 8.9

8. Fit a curve of the type $y = cx^a$ for the following data.

x : 1 2 3 4 5
y : 0.6 1.9 4.4 7.8 11.9

9. Fit the curves $y_1 = ce^{ax}$ and $y_2 = 1/(ax + b)$ using the following data.

x : −1 0 1 2 3
y : 6.62 3.98 2.76 1.25 0.5

Also find the sum of squares of errors in both cases and conclude which curve is better for these data points.
10. Find the set of equations for determining the values of a, b, c when the curve $y = a + \dfrac{b}{x} + \dfrac{c}{x^2}$ is to be fitted to the data points $(x_i, y_i)$, $i = 1, 2, \ldots, n$.
11. Find the appropriate transformations to linearize the following curves:
(a) $y = \dfrac{a}{x+b}$, (b) $x = \dfrac{b}{y+a}$, (c) $y = \dfrac{1}{a+bx}$, (d) $y = a\log x + b$, (e) $y = \dfrac{x}{a+bx}$, (f) $y = ce^{-bx}$.
12. Find the equations for the coefficients a and b if the curve $y = a\sinh bx$ is to be fitted to the data points $(x_i, y_i)$, $i = 1, 2, \ldots, n$.

13. By expanding the expression $(\cos\theta + i\sin\theta)^n$, show that the coefficient of $x^n$ in $T_n(x)$ is $2^{n-1}$.

14. Show that $T_n(x)$ is a polynomial in x of degree n. Also show that $T_n(x)$ is an even or odd function according as n is even or odd.

15. Show that the Chebyshev polynomials are orthogonal with respect to the weight function $w(x) = (1 - x^2)^{-1/2}$.

16. Use the Gram-Schmidt orthogonalization process to find the first three orthogonal polynomials whose weight function is $w(x) = e^{-x^2}$.

17. Express the following in terms of Chebyshev polynomials:
(a) $x^4 - 3x^3 + 2x^2 + 10x - 5$, and (b) $x^5 - 6x^3 + 11x^2 - 10x + 8$.

18. Express the following as polynomials in x:
(a) $T_5(x) + 5T_3(x) - 11T_2(x) + 3T_1(x)$, and
(b) $2T_4(x) - 7T_3(x) + 2T_2(x) + 5T_0(x)$.

19. Approximate the following functions using Chebyshev polynomials:
(a) $\log(1 + x)$, $-1 < x \le 1$, (b) $\sin x$, $-1 \le x \le 1$, (c) $e^x$.
20. Suppose $y = 1 - \dfrac{x}{2!} + \dfrac{x^2}{4!} - \dfrac{x^3}{6!} + \dfrac{x^4}{8!} - \cdots$. Economize this series, near x = 1, if the fourth decimal place is not to be affected.

21. Economize the power series $\sin x = x - \dfrac{x^3}{6} + \dfrac{x^5}{120} - \dfrac{x^7}{5040} + \cdots$ on the interval $[-1, 1]$ correct up to four decimal places.