Numerical Analysis for Scientists and Engineers
Madhumangal Pal
Department of Applied Mathematics with
Oceanology and Computer Programming
Vidyasagar University
Midnapore - 721102
Dedicated to my parents
Preface
Cholesky, matrix partition, Jacobi, Gauss-Seidel and relaxation are well studied here.
The very new methods to find a tri-diagonal determinant and to solve a tri-diagonal
system of equations are incorporated. A method to solve an ill-conditioned system is
discussed. The generalised inverse of a matrix is introduced. Also, a least squares
solution method for an inconsistent system is illustrated here.
Determination of the eigenvalues and eigenvectors of a matrix is a very important
problem in applied science and engineering. In Chapter 6, different methods, viz.,
Leverrier-Faddeev, Rutishauser, Power, Jacobi, Givens and Householder, are presented
to find the eigenvalues and eigenvectors of arbitrary and symmetric matrices.
Chapter 7 contains an in-depth presentation of several methods to find the derivative
and the integral of a function. Three types of integration methods, viz., Newton-Cotes
(trapezoidal, Simpson, Boole, Weddle), Gaussian (Gauss-Legendre, Lobatto, Radau,
Gauss-Chebyshev, Gauss-Hermite, Gauss-Laguerre, Gauss-Jacobi) and Monte Carlo, are
well studied here. The Euler-Maclaurin sum formula and Romberg integration are also
studied in this chapter. An introduction to double integration is also given here.
To solve ordinary differential equations, Taylor series, Picard, Euler, Runge-Kutta,
Runge-Kutta-Fehlberg, Runge-Kutta-Butcher, Adams-Bashforth-Moulton, Milne,
finite-difference, shooting and finite element methods are discussed in Chapter 8.
Stability analysis of some of these methods is also carried out.
An introduction to the solution of partial differential equations is given in Chapter 9.
The finite difference methods to solve parabolic, hyperbolic and elliptic PDEs are
discussed here.
Least squares approximation techniques are discussed in Chapter 10. The methods
to fit straight-line, parabolic, geometric, etc., curves are illustrated here. Orthogonal
polynomials, their applications and Chebyshev approximation are discussed in this
chapter.
The algorithms and programmes in C are supplied for most of the important methods
discussed in this book.
First of all, I would like to thank Prof. N. Dutta and Prof. R.N. Jana, as it was from
their book that I learnt my first lessons in the subject.
In writing this book I have taken help from several books, research articles and some
websites mentioned in the bibliography. So, I acknowledge them gratefully.
This book could not have been completed without the moral and loving support and
continuous encouragement of my wife Anita and my son Aniket.
Also, I would like to express my sincere appreciation to my teachers and colleagues
especially Prof. M. Maiti, Prof. T.K. Pal and Prof. R.N. Jana, as they have taken all
the academic and administrative loads of the department on their shoulders, providing
me with sufficient time to write this book. I would also like to acknowledge my other
colleagues Dr. K. De and Dr. S. Mondal for their encouragement.
I express my sincerest gratitude to my teacher Prof. G.P. Bhattacharjee, Indian
Institute of Technology, Kharagpur, for his continuous encouragement.
I feel great reverence for my parents, sisters, sister-in-law and relatives for their
blessings and being a constant source of inspiration.
I would like to thank Sk. Md. Abu Nayeem, Dr. Amiya K. Shyamal and Dr. Anita
Saha for scrutinizing the manuscript.
I shall be glad to receive constructive criticism for the improvement of the book from
experts as well as learners.
I thank the Narosa Publishing House Pvt. Ltd. for their sincere care in the
publication of the book.
Madhumangal Pal
Contents
3 Interpolation
3.1 Lagrange’s Interpolation Polynomial
3.1.1 Lagrangian interpolation formula for equally spaced points
3.2 Properties of Lagrangian Functions
3.3 Error in Interpolating Polynomial
3.4 Finite Differences
3.4.1 Forward differences
3.4.2 Backward differences
3.4.3 Error propagation in a difference table
3.5 Newton’s Forward Difference Interpolation Formula
3.5.1 Error in Newton’s forward formula
3.6 Newton’s Backward Difference Interpolation Formula
3.6.1 Error in Newton’s backward interpolation formula
3.7 Gaussian Interpolation Formulae
3.7.1 Gauss’s forward difference formula
3.7.2 Remainder in Gauss’s forward central difference formula
3.7.3 Gauss’s backward difference formula
3.7.4 Remainder of Gauss’s backward central difference formula
3.8 Stirling’s Interpolation Formula
3.9 Bessel’s Interpolation Formula
3.10 Everett’s Interpolation Formula
3.10.1 Relation between Bessel’s and Everett’s formulae
3.11 Interpolation by Iteration (Aitken’s Interpolation)
3.12 Divided Differences and their Properties
3.12.1 Properties of divided differences
3.13 Newton’s Fundamental Interpolation Formula
Errors in Numerical Computations
The solutions of mathematical problems are of two types: analytical and numerical.
Analytical solutions can be expressed in closed form and are error free. Numerical
analysis, on the other hand, is the branch of mathematics that solves problems using a
computational machine (computer, calculator, etc.). For some classes of problems it is
very difficult to obtain an analytical solution. For example, the population of India is
known for the years 1951, 1961, 1971, 1981, 1991 and 2001, but no analytical method is
available to determine the population in an intermediate year, say 2000; using a
numerical method, however, one can estimate the population in that year. Again, the
solutions of many non-linear differential equations cannot be determined by analytical
methods, but such problems can easily be solved by numerical methods. Numerical
computations are almost invariably contaminated by errors, and it is important to
understand the source, propagation, magnitude, and rate of growth of these errors.
In this age of computers, many complicated and large problems can be solved in
significantly less time. But without numerical methods a computer cannot solve such
mathematical problems, since analytical methods are rarely suitable for implementation
on a computer. Thus, numerical methods are highly appreciated and extensively used
by mathematicians, computer scientists, statisticians, engineers and others.
This chapter discusses the different types of errors and their growth and propagation
in numerical computation. Three types of errors, viz., inherent errors, round-off errors
and truncation errors, occur in finding the solution of a problem by a numerical method.
These three types of errors are discussed below.
(i) Inherent errors: This type of error is present in the statement of the problem itself,
before its solution is determined. Inherent errors occur due to the simplified assumptions
made in the process of mathematical modelling of a problem. They can also arise when
the data are obtained from physical measurements of the parameters of the problem.
(ii) Round-off errors: Generally, numerical methods are carried out using a calculator
or a computer. In numerical computation, all numbers are represented by decimal
fractions. Some numbers, such as 1/3, 2/3, 1/7, etc., cannot be represented by a decimal
fraction with a finite number of digits, so to obtain a result they must be rounded off
to some finite number of digits.
Again, the machines that carry out the computation can store numbers only up to some
finite number of digits. So in arithmetic computation some errors occur due to the finite
representation of numbers; these errors are called round-off errors. Thus, round-off
errors occur due to the finite representation of numbers during arithmetic computation,
and they depend on the word length of the computational machine.
(iii) Truncation errors: These errors occur due to the finite representation of an
inherently infinite process. For example, only a finite number of terms of the infinite
series can be used to compute the value of cos x, sin x, e^x, etc.
The Taylor series expansion of sin x is

sin x = x − x^3/3! + x^5/5! − x^7/7! + · · · .
This is an infinite series expansion. If only the first five terms are taken to compute the
value of sin x for a given x, then we obtain an approximate result; here the error occurs
due to the truncation of the series. Suppose we retain the first n terms; then the
truncation error (Etrunc) satisfies

Etrunc ≤ |x|^(2n+1)/(2n + 1)!.
It may be noted that the truncation error is independent of the computational machine.
To solve a problem, two types of numbers are used: exact and approximate. An exact
number gives the true value of a result, while an approximate number gives a value
close to the true value.
For example, in the statements ‘a triangle has three sides’, ‘there are 2000 people in a
locality’, ‘a book has 450 pages’ the numbers 3, 2000 and 450 are exact numbers. But,
in the assertions ‘the height of a pupil is 178 cm’, ‘the radius of the Earth is 6400 km’,
‘the mass of a match box is ten gram’, the numbers 178, 6400 and 10 are approximate
numbers.
This is due to the imperfection of the measuring instruments we use. There are no
absolutely exact measuring instruments; each of them has its own accuracy. Thus, the
statement that the height of a pupil is 178 cm is not an absolute measurement. In the
second example, the radius of the Earth is itself an approximate notion; actually, the
Earth is not a sphere at all, and we can speak of its radius only in approximate terms.
In the last example, the approximate nature of the number also follows from the fact
that different boxes may have different masses, and the number 10 gives the mass of
one particular box.
One important observation is that the same number may be exact as well as approximate.
For example, the number 3 is exact when it represents the number of sides of a triangle,
and approximate if we use it to represent the number π when calculating the area of a
circle using the formula πr^2.
Independently, the numbers 1, 2, 3, 1/2, 5/3, √2, π, e, etc., written in this manner, are exact.
An approximate value of π is 3.1416, a better approximation of it is 3.14159265. But
one cannot write the exact value of π.
The accuracy of calculations is defined by the number of digits in the result which
enjoy confidence. The significant digits (or significant figures) of a number are all its
digits except the zeros which appear to the left of the first non-zero digit. Zeros at the
end of a number are always significant digits. For example, the numbers 0.001205 and
356.800 have 4 and 6 significant digits respectively.
In practical calculations, numbers containing a large number of digits often occur, and
it is necessary to cut them down to a usable number of figures. This process is called
rounding-off of numbers; that is, in the rounding process the number is replaced by
another number consisting of fewer digits. One or several digits, taken from left to
right, are kept with the number, and all the others are discarded.
(i) If the discarded digits constitute a number which is larger than half the unit in the
last decimal place that remains, then the last digit that is left is increased by one.
If the discarded digits constitute a number which is smaller than half the unit in
the last decimal place that remains, then the digits that remain do not change.
(ii) If the discarded digits constitute a number which is equal to half the unit in the
last decimal place that remains, then the last digit that is left is increased by one
if it is odd, and is unchanged if it is even.
This rule is often called a rule of an even digit. If a number is rounded using the
above rule then the number is called correct up to some (say n) significant figures.
The following numbers are rounded-off correctly to five significant figures:
Exact number Round-off number
25.367835 25.368
28.353215 28.353
3.785353 3.7854
5.835453 5.8355
6.73545 6.7354
4.83275 4.8328
0.005834578 0.0058346
3856754 38568 × 10^2
2.37 2.3700
8.99997 9.0000
9.99998 10.000
From the above examples it is easy to observe that, while rounding a number, an error
is generated; this error is sometimes called a round-off error.
Let xT be the exact value of a number and xA be its approximate value. If xA < xT ,
then we say that the number xA is an approximate value of the number xT by defect
and if xA > xT , then it is an approximate value of xT by excess.
The difference between the exact value xT and its approximate value xA is an error.
As a rule, it is not possible to determine the value of the error xT − xA and even its
sign, since the exact number xT is unknown.
The errors are represented in three ways, viz., absolute error, relative error and
percentage error.
Absolute error:
The absolute error of the approximate number xA is a quantity (∆x) which satisfies the
inequality
∆x ≥ |xT − xA |.
The absolute error is the upper bound of the deviation of the exact number xT from
its approximation, i.e.,
xA − ∆x ≤ xT ≤ xA + ∆x.
The above result can be written in the form
xT = xA ± ∆x. (1.1)
In other words, the absolute error of the number x is the difference between true
value and approximate value, i.e.,
∆x = |xT − xA |.
It may be noted from the rounding process that, if a number is rounded to m decimal
places, then

absolute error ≤ 0.5 × 10^(−m). (1.2)
The absolute error measures only the quantitative aspect of the error but not the
qualitative one, i.e., does not show whether the measurement and calculation were
accurate. For example, the length and the width of a table are measured with a scale
(whose division is 1 cm) and the following results are obtained: the width w = 5 ± 0.5
cm and the length l = 100 ± 0.5 cm. In both cases the absolute error is the same,
0.5 cm. It is obvious that the second measurement was more accurate than the first.
To estimate the quality of calculations or measurements, the concept of a relative error
is introduced.
Relative error:
The relative error (δx) of the number xA is

δx = ∆x/|xA| or ∆x/|xT|, where |xT| ≠ 0 and |xA| ≠ 0.
This expression can be written as
xT = xA (1 ± δx) or xA = xT (1 ± δx).
Note that relative error is the absolute error when measuring 1 unit.
For the measurements of the length and the width of the table (discussed earlier) the
relative errors are

δw = 0.5/5 = 0.1 and δl = 0.5/100 = 0.005.
In these cases, one can conclude that the measurement of the length of the table has
been relatively more accurate than that of its width. So one conclusion can be drawn:
the relative error measures the quantity and quality of the calculation and measurement.
Thus, the relative error is a better measurement of error than absolute error.
Percentage error:
The percentage error of an approximate number xA is δx × 100%.
It is a particular type of relative error, sometimes called the relative percentage error.
The percentage error gives the total error while measuring 100 units instead of 1 unit;
it too reflects both the quantity and the quality of a measurement. When the relative
error is very small, the percentage error is usually quoted.
Note 1.3.1 The absolute error of a number correct to n significant figures cannot be
greater than half a unit in the nth place.
Note 1.3.2 The relative error and percentage error are independent of the unit of
measurement, while absolute error depends on the measuring unit.
Absolute error measures only the quantity of error: it is the total amount of error
incurred by the approximate value, and it depends on the measuring unit. The relative
error, on the other hand, measures both the quantity and the quality of the measurement;
it is the total error while measuring one unit, and it does not depend on the measuring
unit.
Example 1.3.1 Find the absolute, relative and percentage errors in xA when xT = 1/3
and xA = 0.333.

Solution. The absolute error is ∆x = |1/3 − 0.333| = 1/3000 ≈ 0.00033. The relative
error is δx = ∆x/|xT| = (1/3000)/(1/3) = 0.001, and the percentage error is 0.1%.
Example 1.3.2 An exact number xT lies in the interval 28.03 ≤ xT ≤ 28.08. Find an
approximate value of xT together with its absolute, relative and percentage errors.

Solution. The middle of the given interval is taken as the approximate value, i.e.,
xA = 28.055. The absolute error is half the length of the interval, i.e., ∆x = 0.025. The
relative error is δx = ∆x/xA = 0.000891 · · · .
It is conventional to round off the error to one or two non-zero digits. Therefore,
δx ≈ 0.0009 and the percentage error is 0.09%.
Example 1.3.3 Determine the absolute error and the exact number corresponding
to the approximate number xA = 5.373 if the percentage error is 0.01%.
Example 1.3.4 Find out in which of the following cases the quality of calculation is
better: xT = 15/17 ≈ 0.8824 or yT = √51 ≈ 7.141.

Solution. To find the absolute errors, we take the numbers xA and yA with a larger
number of decimal digits: xA = 0.882353, yA = 7.141428.
Therefore, the absolute error in xT is |0.882353 · · · − 0.8824| ≈ 0.000047,
and that in yT is |7.141428 · · · − 7.141| ≈ 0.00043.
The relative error in xA is 0.000047/0.8824 ≈ 0.000053 = 0.0053%,
and the relative error in yA is 0.00043/7.141 ≈ 0.0000602 = 0.006%.
The quality of calculation is better in the first case, since the relative error in xA is
smaller than the relative error in yA.
A real number can be represented in many different ways. For example, the number
840000 can be written as 840 × 10^3 or 84.0 × 10^4 or 0.840 × 10^6. (Note that in these
representations the last three significant zeros are lost.) The last form is known as the
normalized form and is commonly used. In this case, we say that 0.840 is the
mantissa of the number and 6 is its order.
Every positive decimal number, exact as well as approximate, can be expressed as

a = d1 × 10^m + d2 × 10^(m−1) + d3 × 10^(m−2) + · · · ,

where the di are the digits constituting the number (i = 1, 2, . . .) with d1 ≠ 0, and
10^(m−i+1) is the value of the ith decimal position (counting from the left).
The digit dn of the approximate number a is a valid significant digit (or simply a
valid digit) if it satisfies the following inequality:

∆a ≤ 0.5 × 10^(m−n+1), (1.3)

i.e., the absolute error does not exceed half the unit of the decimal place in which dn
appears.
If inequality (1.3) is not satisfied, then the digit dn is said to be doubtful. It is
obvious that if the digit dn is valid, then all the preceding digits, to the left of it, are
also valid.
Theorem 1.1 If a number is correct up to n significant figures and the first significant
digit of the number is k, then the relative error is less than

1/(k × 10^(n−1)).
Proof. Let xA be the approximate value of the exact number xT. Also, let xA be correct
up to n significant figures and m decimal places. Then three possibilities may occur:
(i) m < n,
(ii) m = n,
(iii) m > n.
We have, by (1.2), the absolute error ∆x ≤ 0.5 × 10^(−m).
Case I. When m < n.
In this case, the total number of digits in the integral part is n − m. If k is the first
significant digit of xT, then

δx = ∆x/|xT| ≤ (0.5 × 10^(−m))/(k × 10^(n−m−1) − 0.5 × 10^(−m)) = 1/(2k × 10^(n−1) − 1).

Since n is a positive integer and k is an integer lying between 1 and 9,
2k × 10^(n−1) − 1 ≥ k × 10^(n−1), and hence δx < 1/(k × 10^(n−1)).
Thus the absolute error in a sum of approximate numbers is equal to the sum of the
absolute errors of the numbers.
From (1.4), it follows that the absolute error of an algebraic sum cannot be smaller
than the absolute error of the least exact term.
The following points should be kept in mind when adding numbers of different abso-
lute accuracy.
(i) identify a number (or numbers) of the least accuracy (i.e., a number which has
the maximum absolute error),
(ii) round-off more exact numbers so as to retain in them one digit more than in the
identified number (i.e., retain one reserve digit),
(iii) perform the addition taking into account all the retained digits,
(iv) round off the result, discarding one digit.
Subtraction
Let x1 and x2 be approximate values of the exact numbers X1 and X2. Let
X = X1 − X2 and x = x1 − x2.
Then X1 = x1 ± ∆x1 and X2 = x2 ± ∆x2, where ∆x1 and ∆x2 are the errors in x1
and x2 respectively.
Therefore, |X − x| = |(X1 − x1) − (X2 − x2)| ≤ |X1 − x1| + |X2 − x2|. Hence,

|∆x| = |∆x1| + |∆x2|. (1.5)
Thus the absolute error in difference of two numbers is equal to the sum of individual
absolute errors.
Thus the relative errors in product of two numbers is equal to the sum of individual
relative errors.
The result (1.6) can easily be extended to the product of several numbers, so that if
X = X1 X2 · · · Xn and x = x1 x2 · · · xn, then

(X − x)/x = ∆x1/x1 + ∆x2/x2 + · · · + ∆xn/xn. (1.7)
That is, the total relative error in product of n numbers is equal to the sum of
individual relative errors.
A particular case
Let the approximate numbers x1, x2, . . ., xn be all positive and x = x1 x2 · · · xn.
Then log x = log x1 + log x2 + · · · + log xn.

Example 1.5.1 Show that the relative error of a number is unchanged when it is
multiplied by an exact non-zero factor.

Solution. Let x = kx1, where k is an exact factor other than zero. Then

δx = ∆x/x = (k ∆x1)/(k x1) = ∆x1/x1 = δx1.

But the absolute error |∆x| = |k ∆x1| = |k| |∆x1|, i.e., |k| times the absolute error
in x1.
From this relation one can conclude that the relative error in a quotient is greater than
or equal to the difference of the individual relative errors.
A particular case
For positive approximate numbers x1 and x2, the relation (1.8) can easily be deduced.
Let x = x1/x2. Then log x = log x1 − log x2. Therefore,

∆x/x = ∆x1/x1 − ∆x2/x2, i.e., |∆x/x| ≤ |∆x1/x1| + |∆x2/x2|.
(i) identify the least exact number, i.e., the number with the least number of valid
digits,
(ii) round off the other number, leaving in it one significant digit more than there are
digits in the identified number,
(iii) retain as many significant digits in the quotient as there are in the least exact
number.
Example 1.5.2 Find the sum of the approximate numbers 0.543, 0.1834, 17.45,
0.000234, 205.2, 8.35, 185.3, 0.0863, 0.684, 0.0881, in each of which all the written
digits are valid. Find the absolute error in the sum.

Solution. The least exact numbers (those possessing the maximum absolute error)
are 205.2 and 185.3; the error of each of them is 0.05. Rounding off the other numbers,
leaving one digit more, and adding all of them:

0.54 + 0.18 + 17.45 + 0.00 + 205.2 + 8.35 + 185.3 + 0.09 + 0.68 + 0.09 = 417.88.

Rounding off the sum by discarding one digit, we obtain 417.9.
The absolute error in the sum consists of two parts:
(i) the initial error, i.e., the sum of the errors of the least exact numbers and the
rounding errors of the other numbers: 0.05 × 2 + 0.0005 × 8 = 0.104 ≈ 0.10;
(ii) the error of rounding off the sum: 417.9 − 417.88 = 0.02.
Thus the absolute error in the sum is 0.10 + 0.02 ≈ 0.12.
Example 1.5.3 Find the difference of the approximate numbers 27.5 and 35.8 hav-
ing absolute errors 0.02 and 0.03 respectively. Evaluate the absolute and the relative
errors of the result.
Example 1.5.4 Find the product of the approximate numbers x1 = 8.6 and
x2 = 34.359, all of whose digits are valid. Also find the relative and the absolute errors.

Solution. In the first number there are two valid significant digits and in the second
there are five. Therefore, round off the second number to three significant digits. After
rounding off, the numbers become x1 = 8.6 and x2 = 34.4. Hence the product is

x1 x2 = 8.6 × 34.4 = 295.84 ≈ 3.0 × 10^2.

In the result two significant digits are retained, because the least number of valid
significant digits of the given numbers is 2.
The relative error in the product is

δx = ∆x1/x1 + ∆x2/x2 = 0.05/8.6 + 0.0005/34.359 ≈ 0.00583 ≈ 0.58%.
Example 1.5.5 Calculate the quotient x/y of the approximate numbers x = 6.845
and y = 2.53, if all the digits of the numbers are valid. Find the relative and the
absolute errors.

Solution. Here the dividend x = 6.845 has four valid significant digits and the divisor
has three, so we perform the division without rounding off. Thus

x/y = 6.845/2.53 = 2.71.

Three significant digits are retained in the result, since the least exact number (the
divisor y) contains three valid significant digits.
The absolute errors in x and y are respectively ∆x = 0.0005 and ∆y = 0.005.
Let us consider an approximate number x1 which has a relative error δx1. The problem
is to find the relative error of x = x1^m. Now

x = x1^m = x1 · x1 · · · x1 (m times),

so that δx = δx1 + δx1 + · · · + δx1 = m δx1. Thus, when the approximate number x1
is raised to the power m, the relative error increases m times.
Similarly, one can calculate the relative error of the number x = x1^(1/m), the mth
root of x1, where x1 > 0. Taking logarithms,

log x = (1/m) log x1.

That is,

∆x/x = (1/m)(∆x1/x1), i.e., δx = (1/m) δx1.
Example 1.5.6 Calculate A = X^3 √Y / Z^2, where X = 8.36, Y = 80.46, Z = 25.8,
and the absolute errors in X, Y, Z are respectively 0.01, 0.02 and 0.03. Find the error
of the result.
Solution. Here the absolute errors are ∆x = 0.01, ∆y = 0.02 and ∆z = 0.03. To
calculate the intermediate results, retain one reserve digit. The approximate
intermediate values are x^3 = 584.3, √y = 8.9699, z^2 = 665.6, where x, y, z are the
approximate values of X, Y, Z respectively.
Thus the approximate value of the expression is

a = 584.3 × 8.9699/665.6 ≈ 7.87.

Three significant digits are taken in the result, since the least number of significant
digits in the given numbers is 3.
Now, the relative error δa in a is given by

δa = 3 δx + (1/2) δy + 2 δz = 3 × 0.01/8.36 + (1/2) × 0.02/80.46 + 2 × 0.03/25.8
   ≈ 0.0036 + 0.00012 + 0.0023 ≈ 0.006 = 0.6%.
For a function y = f(x1, x2, . . ., xn) of several approximate arguments, the total
absolute error is

∆y = |∂f/∂x1| ∆x1 + |∂f/∂x2| ∆x2 + · · · + |∂f/∂xn| ∆xn.

This formula gives the total error for computing a function containing several variables.
The relative error is given by

∆y/y = (∂f/∂x1)(∆x1/y) + (∂f/∂x2)(∆x2/y) + · · · + (∂f/∂xn)(∆xn/y).
Significant error occurs due to the loss of significant digits during arithmetic
computation. This error arises mainly from the finite representation of the numbers in
the computational machine (computer or calculator). The loss of significant digits
occurs for the following two reasons:
(i) when two nearly equal numbers are subtracted, and
(ii) when division is made by a very small divisor compared to the dividend.
Significant error is more serious than round-off error, as illustrated in the following
examples.
Example 1.6.1 Find the difference X = √5.36 − √5.35 and evaluate the relative
error of the result.

Solution. Let X1 = √5.36 ≈ 2.315 = x1 and X2 = √5.35 ≈ 2.313 = x2.
The absolute errors are ∆x1 = 0.0005 and ∆x2 = 0.0005. Then the approximate
difference is x = 2.315 − 2.313 = 0.002.
The total absolute error in the subtraction is ∆x = 0.0005 + 0.0005 = 0.001.
The relative error is δx = 0.001/0.002 = 0.5 = 50%.
However, by changing the scheme of calculation we get a more accurate result:

X = √5.36 − √5.35 = (√5.36 − √5.35)(√5.36 + √5.35)/(√5.36 + √5.35)
  = (5.36 − 5.35)/(√5.36 + √5.35) = 0.01/(√5.36 + √5.35) ≈ 0.002 = x (say).

In this case the relative error is

δx = (∆x1 + ∆x2)/(x1 + x2) = 0.001/(2.315 + 2.313) ≈ 0.0002 = 0.02%.

Thus, calculating x1 and x2 with the same four digits, we get a much better result in
the sense of the relative error.
Example 1.6.2 Calculate the values of the function y = 1 − cos x at x = 82° and at
x = 1°. Also calculate the absolute and the relative errors of the results.

Solution. y at x = 82°:
The value of cos 82° ≈ 0.1392 = a1 (say), correct up to four digits, and ∆a1 = 0.00005.
Then y1 = 1 − 0.1392 = 0.8608 and ∆y1 = 0.00005 (from an exact number equal to
unity we subtract an approximate number with an absolute error not exceeding
0.00005). Consequently, the relative error is

δy1 = 0.00005/0.8608 ≈ 0.000058 = 0.006%.
y at x = 1°:
We have cos 1° ≈ 0.9998 = a2 (say), ∆a2 = 0.00005. Then y2 = 1 − 0.9998 = 0.0002
and ∆y2 = 0.00005. Hence

δy2 = 0.00005/0.0002 = 0.25 = 25%.

From this example it is observed that for small values of x a direct calculation of
y = 1 − cos x gives a relative error of the order of 25%, whereas at x = 82° the relative
error is only 0.006%.
Now, change the calculation procedure and use the formula y = 1 − cos x = 2 sin^2(x/2)
to calculate the value of y for small values of x.
Let a = sin 0°30′ ≈ 0.0087. Then ∆a = 0.00005 and

δa = 0.00005/0.0087 ≈ 0.0058 = 0.58%.

Thus y2 = 2 × 0.0087^2 = 0.000151, and the relative error is

δy2 = 0.0058 + 0.0058 = 0.012 = 1.2% (using the formula δa = δx + δy when a = xy).

The absolute error is ∆y2 = y2 × δy2 = 0.000151 × 0.012 ≈ 0.000002.
Thus a simple transformation of the computing formula gives a much more accurate
result for the same data.
Example 1.6.3 Solve the quadratic equation x^2 − 1000x + 0.25 = 0.

Solution. For simplicity, it is assumed that all the calculations are performed using
four significant digits. The roots of this equation are

(1000 ± √(10^6 − 1))/2.

With four significant digits, √(10^6 − 1) ≈ 1000, so the direct formula gives the smaller
root as (1000 − 1000)/2 = 0: all its significant digits are lost by cancellation. The
smaller root is better computed from the product of the roots, as 0.25/1000 = 0.00025.
Thus the roots of the given equation are 0.1000 × 10^4 and 0.00025.
Such a situation may be recognised by checking whether |4ac| ≪ b^2.
It is not always possible to transform the computing formula. Therefore, when nearly
equal numbers are subtracted, they must be taken with a sufficient number of reserve
valid digits. If it is known that the first m significant digits may be lost during the
computation and a result with n valid significant digits is needed, then the initial data
should be taken with m + n valid significant digits.
In computers, numbers are generally stored in normalized floating point mode. In this
mode of representation, the number is converted to a proper fraction in such a way
that the first digit after the decimal point is non-zero, and it is scaled by an appropriate
power of 10. For example, the number 375.3 × 10^4 is represented in this mode as
.3753 × 10^7 = .3753E7 (E7 is used to represent 10^7). From this example, it is observed
that in normalized floating point representation a number is a combination of two parts,
the mantissa and the exponent. In the above example, .3753 is the mantissa and 7 is
the exponent. It may be noted that the mantissa is always greater than or equal to .1
and the exponent is an integer.
For simplicity, it is assumed that our (hypothetical) computer uses four digits to store
the mantissa and two digits for the exponent; the mantissa and the exponent have their
own signs.
The number .0003783 would be stored as .3783E–3: the leading zeros in this number
serve only to indicate the position of the decimal point. Thus, in this notation the range
of magnitudes is from .1000 × 10^(−99) to .9999 × 10^99.
In this section, the arithmetic operations on normalized floating point numbers are
discussed.
1.8.1 Addition
If two numbers have the same exponent, then the mantissas are added directly and the
exponent is adjusted, if required. If the exponents are different, then the number with
the lower exponent is rewritten with the higher exponent by shifting its mantissa. The
details of addition are discussed in the following examples.
Example 1.8.1 Add the following normalized floating point numbers.
(i) .3456E3 and .4325E3 (same exponent)
(ii) .8536E5 and .7381E5
(iii) .3758E5 and .7811E7 (different exponent)
(iv) .2538E2 and .3514E7
(v) .7356E99 and .3718E99 (overflow condition)
Solution. (i) In this case the exponents are equal, so the mantissas are added directly.
Thus the sum is .7781E3.
(ii) Here also the exponents are equal, and the sum is 1.5917E5. The mantissa now has
5 significant digits, but our hypothetical computer can store only four significant
figures. So the mantissa is shifted right one place before it is stored: the exponent is
increased by 1 and the last digit is truncated. The final result is .1591E6.
(iii) Here, the numbers are .3758E5 and .7811E7. The exponent of the first number
is less than that of the second; the difference of the exponents is 7 − 5 = 2.
So the mantissa of the number with the smaller exponent (here the first number) is
shifted right by 2 places (the difference of the exponents) and the last 2 digits of the
mantissa are discarded, as our hypothetical computer can store only 4 digits. The first
number then becomes .0037E7, and the result is .0037E7 + .7811E7 = .7848E7.
(iv) Here also the exponents are different and the difference is 7 − 2 = 5. The mantissa
of the first number (smaller exponent) is shifted right 5 places and the number becomes
.0000E7. The final result is .0000E7 + .3514E7 = .3514E7.
(v) Here the numbers are .7356E99 and .3718E99 and they have equal exponents, so
their sum is 1.1074E99. The mantissa now has five significant digits, so it is shifted
right and the exponent is increased by 1, making the exponent 100. As the exponent
cannot hold more than two digits in our hypothetical computer, the result is larger
than the largest number that can be stored. This situation is called an overflow
condition and the machine will give an error message.
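The shifting and truncation rules used in Examples (i)–(iv) above can be simulated directly. The sketch below is not from the book; the helper name fp_add and the (mantissa, exponent) pair encoding are my own, with the mantissa held as a four-digit integer (so 3456 stands for .3456).

```python
# Sketch of the hypothetical 4-digit machine: a number .ddddEe is held
# as (m, e) with 1000 <= m <= 9999 and integer exponent e.
# fp_add is a made-up name, not the book's notation.

def fp_add(a, b):
    """Add two positive normalized numbers, truncating shifted-out digits."""
    (ma, ea), (mb, eb) = a, b
    if ea < eb:                       # make the first operand the larger one
        (ma, ea), (mb, eb) = (mb, eb), (ma, ea)
    mb //= 10 ** (ea - eb)            # shift the smaller mantissa right
    m, e = ma + mb, ea
    if m >= 10000:                    # five significant digits: renormalize
        m //= 10                      # truncate the last digit
        e += 1
    if e > 99:                        # exponent no longer fits in two digits
        raise OverflowError("overflow condition")
    return m, e

print(fp_add((3456, 3), (4325, 3)))   # (i)   .7781E3
print(fp_add((8536, 5), (7381, 5)))   # (ii)  .1591E6
print(fp_add((3758, 5), (7811, 7)))   # (iii) .7848E7
print(fp_add((2538, 2), (3514, 7)))   # (iv)  .3514E7
```

Case (v), .7356E99 + .3718E99, raises the OverflowError, matching the overflow condition described above.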
1.8.2 Subtraction
Subtraction is performed in the same way as addition: one positive number and one
negative number are added. The following example shows the details of subtraction.
Example 1.8.2 Subtract the normalized floating point numbers indicated below:
(i) .3628E6 from .8321E6
(ii) .3885E5 from .3892E5
(iii) .3253E–7 from .4123E–6
(iv) .5321E–99 from .5382E–99.
Solution. (i) Here the exponents are equal, and the result is
.8321E6 – .3628E6 = .4693E6.
(ii) Here the result is .3892E5 – .3885E5 = .0007E5. The most significant digit in the
mantissa is 0, so the mantissa is shifted left till the most significant digit becomes
non-zero and in each left shift of the mantissa the exponent is reduced by 1. Hence
the final result is .7000E2.
(iii) The numbers are .4123E–6 and .3253E–7. The exponents are not equal, so the
number with smaller exponent is shifted right and the exponent increased by 1 for
every right shift. Then the second number becomes .0325E–6. Thus the result is
.4123E–6 – .0325E–6 = .3798E–6.
Errors in Numerical Computations 21
(iv) The result is .5382E–99 – .5321E–99 = .0061E–99. For normalization, the man-
tissa must be shifted left twice, the exponent being reduced by 1 in each shift. In the
first shift, the exponent becomes –100, but our hypothetical computer can store only
two digits as exponent, so –100 cannot be accommodated in the exponent part of the
number. In this case, the result is smaller than the smallest number which can be
stored in our computer. This condition is called an underflow condition and the
computer will give an error message.
1.8.3 Multiplication
Two numbers in normalized floating point mode are multiplied by multiplying the man-
tissa and adding the exponents. After multiplication, the mantissa is converted into
normalized floating point form and the exponent is converted appropriately. The fol-
lowing example shows the steps of multiplication.
Example 1.8.3 Multiply the following numbers indicated below:
(i) .5321E5 by .4387E10
(ii) .1234E10 by .8374E–10
(iii) .1139E50 by .8502E51
(iv) .3721E–52 by .3205E-53.
1.8.4 Division
In the division, the mantissa of the numerator is divided by that of the denominator.
The exponent is obtained by subtracting exponent of denominator from the exponent
of numerator. The quotient mantissa is converted to normalized form and the exponent
is adjusted appropriately.
Example 1.8.4 Perform the following divisions
(i) .9938E5 ÷ .3281E2
(ii) .9999E2 ÷ .1230E–99
(iii) .3568E–10 ÷ .3456E97.
b − c = .1100E–1.
a(b − c) = .5673E1 × .1100E–1 = .0624E0 = .6240E–1.
ab = .2032E2, ac = .2026E2.
ab − ac = .6000E–1.
Thus, a(b − c) ≠ ab − ac.
The above examples are intentionally chosen to point out the occurrence of inac-
curacies in normalized floating point arithmetic due to the shifting and truncation of
numbers during arithmetic operations. But these situations do not always arise. Here
we assumed that the computer can store only four digits in the mantissa; in practice
a computer stores about seven digits in the mantissa (in single precision). A larger
mantissa gives more accurate results.
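The failure of the distributive law above can be reproduced with Python's decimal module by restricting the working precision to four significant digits. The values b = 3.582 and c = 3.571 are assumed here (the excerpt shows only a, b − c and the products), chosen to be consistent with ab = .2032E2 and ac = .2026E2.

```python
from decimal import Decimal, getcontext

getcontext().prec = 4           # plays the role of the 4-digit mantissa

a = Decimal("5.673")            # .5673E1
b = Decimal("3.582")            # assumed value
c = Decimal("3.571")            # assumed value

lhs = a * (b - c)               # .6240E-1
rhs = a * b - a * c             # .6000E-1
print(lhs, rhs, lhs == rhs)
```

Note that decimal rounds rather than truncates, yet the distributive law still fails in four-digit arithmetic.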
Note 1.9.1 In any computational algorithm, it is not advisable to give any instruction
based on testing whether a floating point number is zero or not.
1.10 Exercise
9. Find the absolute, relative and percentage errors when (i) 2/3 is approximated
to 0.667, (ii) 1/3 is approximated to 0.333, and (iii) true value is 0.50 and its
calculated value was 0.49.
10. (i) If π is approximated as 3.14 instead of 3.14159, find the absolute, relative and
percentage errors.
(ii) Round-off the number x = 3.4516 to three significant figures and find the
absolute and the relative errors.
11. The numbers 23.982 and 3.4687 are both approximate and correct only to their
last digits. Find their difference and state how many figures in the result are
trustworthy.
12. Two lengths X and Y are measured approximately up to three significant figures
as X = 3.32 cm and Y = 5.39 cm. Estimate the error in the computed value of
X +Y.
13. Let xT, yT and xA, yA denote respectively the true and approximate values of two
numbers. Prove that the relative error in the product xA yA is approximately equal
to the sum of the relative errors in xA and yA.
14. Show that the relative error in the product of several approximate non-zero num-
bers does not exceed the sum of the relative errors of the numbers.
15. Show that the maximum relative error in the quotient of two approximate numbers
is approximately equal to the algebraic sum of the maximum relative errors of the
individual numbers.
16. Let x = 5.234 ± 0.0005 and y = 5.123 ± 0.0005. Find the percentage error of the
difference a = x − y when relative errors δx = δy = 0.0001.
17. What do you mean by the statement that xA (approximate value) has m significant
figures with respect to xT (true value)? If the first significant figure of xA is k
and xA is correct up to n significant figures, prove that the relative error is less
than 10^{1−n}/k.
18. Given a = 11 ± 0.5, b = 0.04562 ± 0.0001, c = 17200 ± 100. Find the maximum
value of the absolute error in the following expressions
(i) a + 2b − c, (ii) 2a − 5b + c and (iii) a^2.
19. Calculate the quotient a = x/y of the approximate numbers x = 5.762 and y =
1.24 if all the digits of the dividend and the divisor are valid. Find the relative
and the absolute errors.
20. (i) Establish the general formula for the absolute and relative errors of the function
v = f(u1, u2, ..., un) when the absolute errors ∆ui of each independent quantity ui
are known. Use this result for the function v = (u1^p u2^q u3^r)/(u4^s u5^t) to find
the upper bound of the relative error.
(ii) Find the relative error in computing f(x) = 2x^5 − 3x + 2 at x = 1, if the error
in x is 0.005.
(iii) If y = (1.42x + 3.45)/(x + 0.75), where the coefficients are rounded-off, find the
absolute and relative errors in y when x = 0.5 ± 0.1.
23. (i) Determine the number of correct digits in the number 0.2318 if the relative
error is 0.3 × 10^{−1}.
(ii) Find the number of significant figures in the approximate number 0.4785 given
that the relative error is 0.2 × 10^{−2}.
24. Find the smaller root of the equation x^2 − 500x + 1 = 0 using four-digit arithmetic.

25. Find the value of √103 − √102 correct up to four significant figures.
Let us consider a function y = f(x) defined on [a, b]. The variables x and y are called
the independent and dependent variables respectively. The points x0, x1, ..., xn are
taken as equidistant, i.e., xi = x0 + ih, i = 0, 1, 2, ..., n. Then the value of y when
x = xi is denoted by yi, where yi = f(xi). The values of x are called arguments and
those of y are called entries. The interval h is called the difference interval. In this chapter,
some important difference operators, viz., forward difference (∆), backward difference
(∇), central difference (δ), shift (E) and mean (µ) are introduced.
Thus,
∆y0 = y1 − y0, ∆y1 = y2 − y1, ..., ∆yn−1 = yn − yn−1, and so on.
In general,
∆^n yi = ∆^{n−1}yi+1 − ∆^{n−1}yi, n = 1, 2, 3, ... .
These forward differences may be arranged in a diagonal difference table, shown below
for the arguments x0, x1, ..., x4.
x     y        ∆        ∆²       ∆³       ∆⁴
x0    y0
               ∆y0
x1    y1                ∆²y0
               ∆y1               ∆³y0
x2    y2                ∆²y1               ∆⁴y0
               ∆y2               ∆³y1
x3    y3                ∆²y2
               ∆y3
x4    y4
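A table of this shape can be generated mechanically. In the sketch below (the helper name diff_table is mine, not the book's), column k of the result holds the kth forward differences of y.

```python
def diff_table(y):
    """Return the columns [y, ∆y, ∆²y, ...] for equally spaced entries y."""
    cols = [list(y)]
    while len(cols[-1]) > 1:
        prev = cols[-1]
        cols.append([prev[i + 1] - prev[i] for i in range(len(prev) - 1)])
    return cols

# y = x² at x = 0, 1, 2, 3, 4: the second differences are the constant 2
for col in diff_table([0, 1, 4, 9, 16]):
    print(col)
```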
That is, ∇y1 = y1 − y0, ∇y2 = y2 − y1, ..., ∇yn = yn − yn−1.
These differences are called first differences. The second differences are denoted by
∇²y2, ∇²y3, ..., ∇²yn. That is, ∇²yi = ∇yi − ∇yi−1,
where ∇⁰yi = yi, ∇¹yi = ∇yi.
These backward differences can be written in a tabular form and this table is known
as backward difference or horizontal table.
Table 2.2 is the backward difference table for the arguments x0 , x1 , . . . , x4 .
x     y      ∇       ∇²      ∇³      ∇⁴
x0    y0
x1    y1    ∇y1
x2    y2    ∇y2    ∇²y2
x3    y3    ∇y3    ∇²y3    ∇³y3
x4    y4    ∇y4    ∇²y4    ∇³y4    ∇⁴y4
x     y        δ         δ²       δ³        δ⁴
x0    y0
               δy1/2
x1    y1                 δ²y1
               δy3/2               δ³y3/2
x2    y2                 δ²y2                δ⁴y2
               δy5/2               δ³y5/2
x3    y3                 δ²y3
               δy7/2
x4    y4
It is observed that all odd-order differences carry fractional suffixes while all even-order
differences carry integer suffixes.
This gives
Ef(x) = f(x + h), i.e., Eyi = yi+1.
That is, the shift operator shifts the function value yi to the next higher value yi+1.
Applying the shift operator twice gives
E²f(x) = E[Ef(x)] = f(x + 2h).
In general,
E^n f(x) = f(x + nh) or E^n yi = yi+n. (2.15)
The inverse shift operator E −1 is defined as
E −1 f (x) = f (x − h). (2.16)
Similarly, second and higher inverse operators are
E −2 f (x) = f (x − 2h) and E −n f (x) = f (x − nh). (2.17)
More general form of E operator is
E r f (x) = f (x + rh), (2.18)
where r is positive as well as negative rationals.
Average operator, µ:
The average operator µ is defined as
µf(x) = (1/2)[f(x + h/2) + f(x − h/2)]
i.e., µyi = (1/2)(yi+1/2 + yi−1/2). (2.19)
Differential operator, D:
The differential operator is usually denoted by D, where
Df(x) = (d/dx)f(x) = f′(x)
D²f(x) = (d²/dx²)f(x) = f″(x). (2.20)
Combining properties (2.2.2) and (2.2.3), one can generalise the property (2.2.2) as
Property 2.2.6
∆[1/f(x)] = −∆f(x)/[f(x + h)f(x)].
Property 2.2.12 The above formula can also be used to find the anti-difference (like
integration in integral calculus), as
∆^{−1}x^{(n−1)} = x^{(n)}/(nh). (2.23)
Property 2.2.19 E[f(x)/g(x)] = Ef(x)/Eg(x).
It is clear from the forward, backward and central difference tables that in a definite
numerical case, the same values occur in the same positions, practically there are no
differences among the values of the tables, but, different symbols have been used for the
theoretical importance.
Thus
∆y0 = ∇y1, ∆y1 = ∇y2, ∆²y0 = ∇²y2, etc.
In general,
∆^n yi = ∇^n yi+n, i = 0, 1, 2, ... . (2.24)
Again,
∆f (x) = f (x + h) − f (x) = Ef (x) − f (x) = (E − 1)f (x).
This relation indicates that the effect of the operator ∆ on f (x) is the same as that
of the operator E − 1 on f (x). Thus
∆≡E−1 or E ≡ ∆ + 1. (2.25)
Also,
∇f (x) = f (x) − f (x − h) = f (x) − E −1 f (x) = (1 − E −1 )f (x).
That is,
∇ ≡ 1 − E −1 . (2.26)
The higher order forward difference can be expressed in terms of the given function
values in the following way:
There is a relation among the central difference, δ, and the shift operator E, as
δf (x) = f (x + h/2) − f (x − h/2) = E 1/2 f (x) − E −1/2 f (x) = (E 1/2 − E −1/2 )f (x).
That is,
δ ≡ E 1/2 − E −1/2 . (2.27)
Again,
µf(x) = (1/2)[f(x + h/2) + f(x − h/2)]
= (1/2)[E^{1/2}f(x) + E^{−1/2}f(x)] = (1/2)(E^{1/2} + E^{−1/2})f(x).
Thus,
µ ≡ (1/2)(E^{1/2} + E^{−1/2}). (2.28)
The average operator µ can also be expressed in terms of the central difference oper-
ator:
µ²f(x) = (1/4)(E^{1/2} + E^{−1/2})²f(x)
= (1/4)[(E^{1/2} − E^{−1/2})² + 4]f(x) = (1/4)(δ² + 4)f(x).
Hence,
µ ≡ √(1 + δ²/4). (2.29)
Some more relations among the operators ∆, ∇, E and δ are deduced in the following.
Hence,
E ≡ ehD . (2.31)
Also,
hD ≡ log E. (2.32)
This relation is used to separate the effect of E into that of the powers of ∆ and this
method of separation is called the method of separation of symbols.
The operators µ and δ can be expressed in terms of D, as shown below:
µf(x) = (1/2)[E^{1/2} + E^{−1/2}]f(x) = (1/2)[e^{hD/2} + e^{−hD/2}]f(x)
= cosh(hD/2) f(x)
and δf(x) = [E^{1/2} − E^{−1/2}]f(x) = [e^{hD/2} − e^{−hD/2}]f(x)
= 2 sinh(hD/2) f(x).
Thus,
µ ≡ cosh(hD/2) and δ ≡ 2 sinh(hD/2). (2.33)
Again,
µδ ≡ 2 cosh(hD/2) sinh(hD/2) = sinh(hD). (2.34)
The inverse relation hD ≡ sinh^{−1}(µδ) is also useful.
Since E ≡ 1 + ∆ and E^{−1} ≡ 1 − ∇ [from (2.25) and (2.26)],
from (2.32) it is obtained that
hD ≡ log E ≡ log(1 + ∆) ≡ −log(1 − ∇).
Solution. (i) δµf(x) = (1/2)(E^{1/2} + E^{−1/2})(E^{1/2} − E^{−1/2})f(x) = (1/2)[E − E^{−1}]f(x).
Therefore,
(1 + δ²µ²)f(x) = [1 + (1/4)(E − E^{−1})²]f(x)
= [1 + (1/4)(E² − 2 + E^{−2})]f(x) = (1/4)(E + E^{−1})²f(x)
= [1 + (1/2)(E^{1/2} − E^{−1/2})²]²f(x) = [1 + δ²/2]²f(x).
Hence
√(1 + δ²µ²) ≡ 1 + δ²/2. (2.38)
(ii) (µ + δ/2)f(x) = {(1/2)[E^{1/2} + E^{−1/2}] + (1/2)[E^{1/2} − E^{−1/2}]}f(x) = E^{1/2}f(x).
Thus
E^{1/2} ≡ µ + δ/2. (2.39)
(iii)
[δ²/2 + δ√(1 + δ²/4)]f(x)
= (1/2)(E^{1/2} − E^{−1/2})²f(x) + (E^{1/2} − E^{−1/2})√(1 + (1/4)(E^{1/2} − E^{−1/2})²) f(x)
= (1/2)[E + E^{−1} − 2]f(x) + (1/2)(E^{1/2} − E^{−1/2})(E^{1/2} + E^{−1/2})f(x)
= (1/2)[E + E^{−1} − 2]f(x) + (1/2)(E − E^{−1})f(x)
= (E − 1)f(x).
Hence,
δ²/2 + δ√(1 + δ²/4) ≡ E − 1 ≡ ∆. (2.40)
(iv) (1 + ∆)(1 − ∇)f(x) = EE^{−1}f(x) = f(x). Hence
(1 + ∆)(1 − ∇) ≡ 1. (2.41)
(v)
[∆E^{−1}/2 + ∆/2]f(x) = (1/2)[∆f(x − h) + ∆f(x)]
= (1/2)[f(x) − f(x − h) + f(x + h) − f(x)]
= (1/2)[f(x + h) − f(x − h)] = (1/2)[E − E^{−1}]f(x)
= (1/2)(E^{1/2} + E^{−1/2})(E^{1/2} − E^{−1/2})f(x)
= µδf(x).
Hence
∆E^{−1}/2 + ∆/2 ≡ µδ. (2.42)
(vi)
[(∆ + ∇)/2]f(x) = (1/2)[∆f(x) + ∇f(x)]
= (1/2)[f(x + h) − f(x) + f(x) − f(x − h)]
= (1/2)[f(x + h) − f(x − h)] = (1/2)[E − E^{−1}]f(x)
= µδf(x) (as in the previous case).
Thus,
µδ ≡ (∆ + ∇)/2. (2.43)
(vii) ∆∇f(x) = ∆[f(x) − f(x − h)] = f(x + h) − 2f(x) + f(x − h).
Again, ∇∆f(x) = ∇[f(x + h) − f(x)] = f(x + h) − 2f(x) + f(x − h) = δ²f(x).
Hence ∆∇ ≡ ∇∆ ≡ δ².
The relations among the various operators are shown in Table 2.4.
      E                      ∆                        ∇                        δ                         hD
E     E                      ∆ + 1                    (1 − ∇)^{−1}             1 + δ²/2 + δ√(1 + δ²/4)   e^{hD}
∆     E − 1                  ∆                        (1 − ∇)^{−1} − 1         δ²/2 + δ√(1 + δ²/4)       e^{hD} − 1
∇     1 − E^{−1}             1 − (1 + ∆)^{−1}         ∇                        −δ²/2 + δ√(1 + δ²/4)      1 − e^{−hD}
δ     E^{1/2} − E^{−1/2}     ∆(1 + ∆)^{−1/2}          ∇(1 − ∇)^{−1/2}          δ                         2 sinh(hD/2)
µ     (E^{1/2} + E^{−1/2})/2 (1 + ∆/2)(1 + ∆)^{−1/2}  (1 − ∇/2)(1 − ∇)^{−1/2}  √(1 + δ²/4)               cosh(hD/2)
hD    log E                  log(1 + ∆)               −log(1 − ∇)              2 sinh^{−1}(δ/2)          hD
x^{(0)} = 1
x^{(1)} = x
x^{(2)} = x(x − h)
x^{(3)} = x(x − h)(x − 2h)                    (2.45)
x^{(4)} = x(x − h)(x − 2h)(x − 3h)
and so on.
The above relations show that x^{(n)}, n = 1, 2, ..., is a polynomial of degree n in x.
Also, x, x², x³, ... can be expressed in terms of the factorial notations x^{(1)}, x^{(2)},
x^{(3)}, ..., as shown below.
1 = x^{(0)}
x = x^{(1)}
x² = x^{(2)} + hx^{(1)}                        (2.46)
x³ = x^{(3)} + 3hx^{(2)} + h²x^{(1)}
x⁴ = x^{(4)} + 6hx^{(3)} + 7h²x^{(2)} + h³x^{(1)}
and so on.
These relations show that x^n can be expressed as a polynomial in x^{(1)}, x^{(2)}, ..., x^{(n)}
of degree n. Once a polynomial is expressed in factorial notation, its differences can
be obtained by formulas like those of differential calculus.
Example 2.4.1 Express f (x) = 2x4 + x3 − 5x2 + 8 in factorial notation and find
its first and second differences.
Solution. Taking h = 1, the factorial form of f(x) is
f(x) = 2x^{(4)} + 13x^{(3)} + 12x^{(2)} − 2x^{(1)} + 8.
Now, Property 2.2.11, i.e., ∆x^{(n)} = nx^{(n−1)} (for h = 1), is used to find the differences.
Therefore,
∆f(x) = 2·4x^{(3)} + 13·3x^{(2)} + 12·2x^{(1)} − 2·1x^{(0)} = 8x^{(3)} + 39x^{(2)} + 24x^{(1)} − 2
and ∆²f(x) = 24x^{(2)} + 78x^{(1)} + 24.
In terms of x,
∆f(x) = 8x(x − 1)(x − 2) + 39x(x − 1) + 24x − 2
and ∆²f(x) = 24x(x − 1) + 78x + 24.
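The differences obtained in Example 2.4.1 can be checked numerically, since ∆f(x) = f(x + 1) − f(x) when h = 1. A short sketch of that check:

```python
# f and its differences from Example 2.4.1 (h = 1)
f   = lambda x: 2 * x**4 + x**3 - 5 * x**2 + 8
df  = lambda x: 8 * x * (x - 1) * (x - 2) + 39 * x * (x - 1) + 24 * x - 2
d2f = lambda x: 24 * x * (x - 1) + 78 * x + 24

for x in range(10):
    assert f(x + 1) - f(x) == df(x)        # first difference agrees
    assert df(x + 1) - df(x) == d2f(x)     # second difference agrees
print("Example 2.4.1 verified")
```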
From the relations of (2.46) one can conclude the following result.
Calculus of Finite Diff. and Diff. Equs 41
Lemma 2.4.1 Any polynomial f (x) in x of degree n can be expressed in factorial no-
tation with same degree, n.
This means, in conversion to the factorial notation, the degree of a polynomial remains
unchanged.
The above process of converting a polynomial into factorial form is a laborious tech-
nique when the degree of the polynomial is large. Another systematic process, like
Maclaurin's formula in differential calculus, is used to convert a polynomial, or even a
function, into factorial notation.
Let f (x) be a polynomial in x of degree n. In factorial notation, let it be
Solution. Taking h = 1. f (0) = −1, f (1) = 6, f (2) = 27, f (3) = 104, f (4) = 327.
∆²f(x) = ∆f(x + h) − ∆f(x)
= b0[(x + h)^{n−1} − x^{n−1}] + b1[(x + h)^{n−2} − x^{n−2}] + ··· + bn−2[(x + h) − x]
= b0[x^{n−1} + (n − 1)hx^{n−2} + {(n − 1)(n − 2)/2!}h²x^{n−3} + ··· + h^{n−1} − x^{n−1}]
  + b1[x^{n−2} + (n − 2)hx^{n−3} + {(n − 2)(n − 3)/2!}h²x^{n−4} + ··· + h^{n−2} − x^{n−2}]
  + ··· + bn−2h
= b0(n − 1)hx^{n−2} + [{(n − 1)(n − 2)/2!}b0h² + (n − 2)b1h]x^{n−3} + ··· + bn−2h
= c0x^{n−2} + c1x^{n−3} + ··· + cn−3x + cn−2.
In this way, one finds that ∆^{n−1}f(x) is a polynomial of degree one; let it be
∆^{n−1}f(x) = p0x + p1.
Then ∆^n f(x) = p0(x + h) + p1 − p0x − p1 = p0h, which is a constant, and ∆^{n+1}f(x) = 0.
It can be shown that ∆^n f(x) = n(n − 1)(n − 2) ··· 2·1·h^n a0 = n!h^n a0.
Thus finally, ∆^n f(x) = n!h^n a0, a constant, and ∆^{n+1}f(x) = 0.
Alternative proof.
It is observed that, a polynomial in x of degree n can be expressed as a polynomial in
factorial notation with same degree n.
Thus, if f (x) = a0 xn + a1 xn−1 + a2 xn−2 + · · · + an−1 x + an be the given polynomial
then it can be written as f (x) = b0 x(n) + b1 x(n−1) + b2 x(n−2) + · · · + bn−1 x(1) + bn .
Therefore,
∆f (x) = b0 nhx(n−1) + b1 h(n − 1)x(n−2) + b2 h(n − 2)x(n−3) + · · · + bn−1 h.
Clearly this is a polynomial of degree n − 1.
Similarly,
In this way,
∆x^{(n)} = nhx^{(n−1)}
∆²x^{(n)} = nh∆x^{(n−1)} = nh·(n − 1)hx^{(n−2)} = n(n − 1)h²x^{(n−2)}
∆³x^{(n)} = n(n − 1)h²·(n − 2)hx^{(n−3)} = n(n − 1)(n − 2)h³x^{(n−3)}.
In this way,
∆^n x^{(n)} = n(n − 1)(n − 2) ··· 2·1·h^n x^{(n−n)} = n!h^n.
Similarly,
In similar way,
∆3 ui (x) = (i + 1)i(i − 1)h3 (x − x0 )(i−2) .
Hence,
The finite difference method is also used to find the sum of a finite series. Two important
results are presented here.
Theorem 2.1 If f(x) is defined only for integral values of the independent variable x,
and F(x) is such that
∆F(x) = F(x + h) − F(x) = f(x), b = a + nh for some n, (2.49)
then
Σ_{x=a}^{b} f(x) = f(a) + f(a + h) + f(a + 2h) + ··· + f(b) = [F(x)]_{a}^{b+h}.
Proof.
Σ_{x=a}^{b} f(x) = Σ_{x=a}^{b} ∆F(x) = Σ_{x=a}^{b} [F(x + h) − F(x)]
= [F(b + h) − F(b)] + [F(b) − F(b − h)] + ··· + [F(a + h) − F(a)]
= F(b + h) − F(a).
Thus, if F(x) is the anti-difference of f(x), i.e., ∆^{−1}f(x) = F(x), then
Σ_{x=a}^{b} f(x) = F(b + h) − F(a).
Example 2.6.1 Use the finite difference method to find the sum of the series
Σ_{x=1}^{n} f(x), where f(x) = x(x − 1) + 3/[x(x + 1)(x + 2)].
Solution. Let (with h = 1)
F(x) = ∆^{−1}f(x) = ∆^{−1}[x(x − 1) + 3/{x(x + 1)(x + 2)}]
= ∆^{−1}x^{(2)} + 3∆^{−1}x^{(−3)} = x^{(3)}/3 + 3·x^{(−2)}/(−2)
= (1/3)x(x − 1)(x − 2) − (3/2)·1/[x(x + 1)].
Therefore,
Σ_{x=1}^{n} f(x) = F(n + 1) − F(1)
= (1/3)(n + 1)n(n − 1) − (3/2)·1/[(n + 1)(n + 2)] − 0 + (3/2)·1/(1·2)
= (1/3)n(n² − 1) − (3/2)·1/[(n + 1)(n + 2)] + 3/4.
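The closed form of Example 2.6.1 can be confirmed by brute-force summation with exact rational arithmetic; term and closed below are my own helper names.

```python
from fractions import Fraction

def term(x):
    # f(x) = x(x-1) + 3/[x(x+1)(x+2)], kept exact with Fraction
    return x * (x - 1) + Fraction(3, x * (x + 1) * (x + 2))

def closed(n):
    # (1/3)n(n²-1) - (3/2)/[(n+1)(n+2)] + 3/4
    return (Fraction(n * (n * n - 1), 3)
            - Fraction(3, 2) * Fraction(1, (n + 1) * (n + 2))
            + Fraction(3, 4))

for n in range(1, 20):
    assert sum(term(x) for x in range(1, n + 1)) == closed(n)
print("closed form of Example 2.6.1 verified")
```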
Summation by parts
Like the formula 'integration by parts' of integral calculus, there is a similar formula in
finite difference calculus. If f(x) and g(x) are two functions defined only for integral
values of x between a and b, then
Σ_{x=a}^{b} f(x)∆g(x) = [f(x)g(x)]_{a}^{b+h} − Σ_{x=a}^{b} g(x + h)∆f(x). (2.50)
Example 2.6.2 Find the sum of the series Σ_{x=1}^{n} x·4^x.
Solution. Taking f(x) = x and ∆g(x) = 4^x, so that g(x) = 4^x/3 (here h = 1),
formula (2.50) gives
Σ_{x=1}^{n} x·4^x = [x·4^x/3]_{1}^{n+1} − Σ_{x=1}^{n} (4^{x+1}/3)·1
= (n + 1)·4^{n+1}/3 − 4/3 − (4/3)·4(4^n − 1)/3
= [(3n − 1)4^{n+1} + 4]/9.
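The summation-by-parts result, Σ_{x=1}^{n} x·4^x = [(3n − 1)4^{n+1} + 4]/9, is easy to confirm by direct summation:

```python
def s(n):
    """Closed form of the sum of x*4^x for x = 1..n."""
    return ((3 * n - 1) * 4 ** (n + 1) + 4) // 9

for n in range(1, 15):
    assert s(n) == sum(x * 4 ** x for x in range(1, n + 1))
print("summation-by-parts result verified")
```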
x^{(−1)} = x^{(0)}/(x + h) = 1/(x + h)
x^{(−2)} = x^{(−1)}/(x + 2h) = 1/[(x + h)(x + 2h)]
x^{(−3)} = x^{(−2)}/(x + 3h) = 1/[(x + h)(x + 2h)(x + 3h)].
In this way,
x^{(−(n−1))} = 1/[(x + h)(x + 2h) ··· (x + (n − 1)h)]
and hence
x^{(−n)} = 1/[(x + nh)(x + nh − h)(x + nh − 2h) ··· (x + h)] = 1/(x + nh)^{(n)}.
u0 + u1 + u2 + ··· + un
= u0 + Eu0 + E²u0 + ··· + E^n u0
= (1 + E + E² + ··· + E^n)u0 = [(E^{n+1} − 1)/(E − 1)]u0
= [{(1 + ∆)^{n+1} − 1}/∆]u0    [since E ≡ 1 + ∆]
= (1/∆)[^{n+1}C1∆ + ^{n+1}C2∆² + ··· + ∆^{n+1}]u0
= ^{n+1}C1 u0 + ^{n+1}C2 ∆u0 + ^{n+1}C3 ∆²u0 + ··· + ∆^n u0.
(ii)
u0 + xu1 + (x²/2!)u2 + (x³/3!)u3 + ···
= u0 + xEu0 + (x²/2!)E²u0 + (x³/3!)E³u0 + ···
= [1 + xE + (xE)²/2! + (xE)³/3! + ···]u0
= e^{xE}u0 = e^{x(1+∆)}u0 = e^x e^{x∆}u0
= e^x[1 + x∆ + (x∆)²/2! + (x∆)³/3! + ···]u0
= e^x[u0 + x∆u0 + (x²/2!)∆²u0 + (x³/3!)∆³u0 + ···].
(iii)
u0 − u1 + u2 − u3 + ···
= u0 − Eu0 + E²u0 − E³u0 + ··· = [1 − E + E² − E³ + ···]u0
= (1 + E)^{−1}u0 = (2 + ∆)^{−1}u0 = (1/2)(1 + ∆/2)^{−1}u0
= (1/2)[1 − ∆/2 + ∆²/4 − ∆³/8 + ···]u0
= u0/2 − ∆u0/4 + ∆²u0/8 − ∆³u0/16 + ···.
∆^n f(x) = Σ_{i=0}^{n} (−1)^i ^nCi f[x + (n − i)h],
where h is the step-length.
Example 2.7.5 Find the polynomial f (x) which satisfies the following data and
hence find the value of f (1.5).
x : 1 2 3 4
f (x) : 3 5 10 30
Solution. Here four values of f(x) are given, so we take f(x) to be a polynomial
of degree 3. Then the fourth differences of f(x) vanish, i.e.,
∆4 f (x) = 0 or, (E − 1)4 f (x) = 0
or, (E 4 − 4E 3 + 6E 2 − 4E + 1)f (x) = 0
or, E 4 f (x) − 4E 3 f (x) + 6E 2 f (x) − 4Ef (x) + f (x) = 0
or, f (x + 4) − 4f (x + 3) + 6f (x + 2) − 4f (x + 1) + f (x) = 0.
Here h = 1, as the arguments are spaced 1 unit apart.
For x = 1 the above equation becomes
f (5) − 4f (4) + 6f (3) − 4f (2) + f (1) = 0 or, 21 − 4f (4) + 6 × 8 − 4 × 3 − 2 = 0
or, f (4) = 13.75.
Example 2.7.7 Use finite difference method to find the values of a and b in the
following table.
x : 0 2 4 6 8 10
f (x) : –5 a 8 b 20 32
Solution. Here, four values of f (x) are known, so we can assume that f (x) is a
polynomial of degree 3. Then, ∆4 f (x) = 0.
or, (E − 1)4 f (x) = 0
or, E 4 f (x) − 4E 3 f (x) + 6E 2 f (x) − 4Ef (x) + f (x) = 0
or, f (x + 8) − 4f (x + 6) + 6f (x + 4) − 4f (x + 2) + f (x) = 0
[Here h = 2, because the values of x are given in 2 unit interval]
In this problem, two unknowns a and b are to be determined, so two equations are
needed. The following equations are obtained by substituting x = 2 and x = 0 into
the above equation.
f (10) − 4f (8) + 6f (6) − 4f (4) + f (2) = 0 and
f (8) − 4f (6) + 6f (4) − 4f (2) + f (0) = 0.
These equations simplify to
32 − 4 × 20 + 6b − 4 × 8 + a = 0 and 20 − 4b + 6 × 8 − 4a − 5 = 0.
That is, 6b + a − 80 = 0 and −4b − 4a + 63 = 0. Solution of these equations is
a = 2.9, b = 12.85.
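The pair of linear equations in Example 2.7.7 can be solved by simple elimination; the sketch below reproduces a = 2.9 and b = 12.85.

```python
# 6b + a - 80 = 0  and  -4b - 4a + 63 = 0
# From the first, a = 80 - 6b; substituting into the second gives 20b = 257.
b = 257 / 20
a = 80 - 6 * b

# confirm both equations (up to floating point round-off)
assert abs(6 * b + a - 80) < 1e-9
assert abs(-4 * b - 4 * a + 63) < 1e-9
print(round(a, 2), b)   # 2.9 12.85
```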
Example 2.7.8 Find the value of (∆²/E)x².
Solution.
(∆²/E)x² = [(E − 1)²/E]x²
= [(E² − 2E + 1)/E]x²
= (E − 2 + E^{−1})x² = Ex² − 2x² + E^{−1}x²
= (x + 1)² − 2x² + (x − 1)²    (taking h = 1)
= 2.
Example 2.7.9 Show that ∇ log f(x) = −log[1 − ∇f(x)/f(x)].
Solution.
The difference between the largest and the smallest arguments appearing in the dif-
ference equation with unit interval is called the order of the difference equation.
The order of the equation (2.53) is (x + 2) − x = 2, while the order of the equation
ux+3 − 8ux+1 + 5ux−1 = x3 + 2 is (x + 3) − (x − 1) = 4. The order of the difference
equation ∆f (x + 2) − 3f (x) = 0 is 3 as it is equivalent to ux+3 − ux+2 − 3ux = 0.
A difference equation in which ux , ux+1 , . . . , ux+n occur to the first degree only and
there are no product terms is called linear difference equation. Its general form is
If the coefficients a0, a1, ..., an are constants, then the equation (2.54) is called a linear
difference equation with constant coefficients. If g(x) = 0 then the equation is
called homogeneous, otherwise it is called a non-homogeneous difference equation.
The linear homogeneous equation with constant coefficients of order n is
and
Let xn denote the number of pairs of rabbits on the island just after n months. At
the end of the first month the number of pairs of rabbits is 1, i.e., x1 = 1. Since this
pair does not breed during the second month, x2 = 1. Now,
the number of new-born pairs = the number of pairs just after the (n − 2)th month,
since each new-born pair is produced by a pair at least 2 months old; this number
equals xn−2.
Hence
xn = xn−1 + xn−2 , n ≥ 3, x1 = x2 = 1. (2.60)
This is the difference equation of the above stated problem and its solution is x1 = 1,
x2 = 1, x3 = 2, x4 = 3, x5 = 5, x6 = 8, ..., i.e., the sequence is {1, 1, 2, 3, 5, 8, ...},
and this sequence is known as the Fibonacci sequence.
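Iterating the recurrence (2.60) directly reproduces the sequence; fib is my own helper name.

```python
def fib(n):
    """x_n with x_1 = x_2 = 1 and x_n = x_{n-1} + x_{n-2} for n >= 3."""
    a, b = 1, 1
    for _ in range(n - 1):
        a, b = b, a + b
    return a

print([fib(n) for n in range(1, 9)])   # [1, 1, 2, 3, 5, 8, 13, 21]
```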
Example 2.8.2 (The Tower of Hanoi). The Tower of Hanoi problem is a famous
problem of the late nineteenth century. The problem is stated below.
Let there be three pegs, numbered 1, 2, 3, on a board, and n discs of
different sizes with holes in their centres. Initially, these n discs are placed on one
peg, say peg 1, in order of decreasing size, with the largest disc at the bottom. The
rules of the puzzle are that the discs can be moved from one peg to another only one
at a time and no disc can be placed on top of a smaller disc. The problem is to
transfer all the discs from peg 1 to peg 2, in order of size, with the largest
disc at the bottom, in the minimum number of moves.
Let xn be the number of moves required to solve the problem with n discs. If n = 1,
i.e., if there is only one disc on peg 1, we simply transfer it to peg 2 by one move.
Hence x1 = 1. Now, if n > 1, starting with n discs on peg 1 we can transfer the top
n − 1 discs, following the rules of this problem to peg 3 by xn−1 moves. During these
moves the largest disc at the bottom on peg 1 remains fixed. Next, we use one move
to transfer the largest disc from peg 1 to peg 2, which was empty. Finally, we again
transfer the n − 1 discs on peg 3 to peg 2 by xn−1 moves, placing them on top of the
largest disc on peg 2 which remains fixed during these moves. Thus, when n > 1,
(n − 1) discs are transferred twice and one additional move is needed to move the
largest disc at the bottom from peg 1 to peg 2. Thus the recurrence relation is
xn = 2xn−1 + 1, n ≥ 2, with x1 = 1.
Several methods are used to solve difference equations. Among them the widely used
methods are iterative method, solution using operators, solution using generating func-
tion, etc.
Solution.
xn = xn−1 + (n − 1) = {xn−2 + (n − 2)} + (n − 1)
= xn−2 + 2n − (1 + 2) = {xn−3 + (n − 3)} + 2n − (1 + 2)
= xn−3 + 3n − (1 + 2 + 3).
Proceeding in this way,
xn = xn−(n−1) + (n − 1)n − (1 + 2 + ··· + (n − 1))
= x1 + (n − 1)n − n(n − 1)/2
= 0 + (1/2)n(n − 1)    [since x1 = 0]
= (1/2)n(n − 1).
Example 2.9.2 Solve the difference equation for the Tower of Hanoi problem: xn =
2xn−1 + 1, n ≥ 2 with x1 = 1.
Solution.
xn = 2xn−1 + 1 = 2(2xn−2 + 1) + 1
= 2²xn−2 + (2 + 1) = 2²(2xn−3 + 1) + (2 + 1)
= 2³xn−3 + (2² + 2 + 1) = 2³(2xn−4 + 1) + (2² + 2 + 1)
= 2⁴xn−4 + (2³ + 2² + 2 + 1).
Proceeding in this way,
xn = 2^{n−1}x1 + (2^{n−2} + 2^{n−3} + ··· + 2 + 1) = 2^{n−1} + 2^{n−1} − 1 = 2^n − 1,
since x1 = 1.
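The closed form xn = 2^n − 1 for the Tower of Hanoi can be checked against the recurrence itself:

```python
# iterate x_n = 2*x_{n-1} + 1 with x_1 = 1 and compare with 2^n - 1
x = 1
for n in range(2, 21):
    x = 2 * x + 1
    assert x == 2 ** n - 1
print("x_n = 2^n - 1 verified for n up to 20")
```

So three discs need 7 moves, and the 64 discs of the classical legend would need 2^64 − 1 moves.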
(E 2 + aE + b)ux = 0. (2.63)
m2 + am + b = 0. (2.64)
This equation is called the auxiliary equation (A.E.) for the difference equation
(2.62). Since (2.64) is a quadratic equation, three types of roots may occur.
Case I. Let m1 and m2 be two distinct real roots of (2.64). In this case, the general
solution is ux = c1 mx1 + c2 mx2 , where c1 and c2 are arbitrary constants.
Case II. Let m1, m1 be two real and equal roots of (2.64). In this case (c1m1^x + c2m1^x) =
(c1 + c2)m1^x = cm1^x gives only one solution of (2.62). To get the other solution (as a
second order difference equation should have two independent solutions, like a differential
equation), let us consider ux = m1^x vx as a solution.
Since m1, m1 are two equal roots of (2.64), the equation (2.63) may be written as
(E² − 2m1E + m1²)ux = 0.
Substituting ux = m1^x vx into this equation, we obtain
m1^{x+2}vx+2 − 2m1^{x+2}vx+1 + m1^{x+2}vx = 0, i.e., vx+2 − 2vx+1 + vx = 0, or ∆²vx = 0.
Hence vx = c1 + c2x, and the general solution is
ux = (c1 + c2x)m1^x.
Case III. If the roots m1, m2 are complex, then they are conjugates; let them be
(α + iβ) and (α − iβ), where α, β are real. Then the general solution is
ux = c1(α + iβ)^x + c2(α − iβ)^x.
To simplify this expression, substitute α = r cos θ, β = r sin θ, where r = √(α² + β²)
and tan θ = β/α.
Therefore,
ux = c1r^x(cos θ + i sin θ)^x + c2r^x(cos θ − i sin θ)^x
= r^x{c1(cos θx + i sin θx) + c2(cos θx − i sin θx)}
= r^x{(c1 + c2) cos θx + i(c1 − c2) sin θx}
= r^x(A cos θx + B sin θx), where A = c1 + c2 and B = i(c1 − c2).
or, m = (1 ± √5)/2.
Therefore, the general solution is
ux = c1{(1 + √5)/2}^x + c2{(1 − √5)/2}^x,
where c1, c2 are arbitrary constants.
Given that u0 = 1, u1 = 1; therefore,
1 = c1 + c2 and 1 = c1(1 + √5)/2 + c2(1 − √5)/2.
The solution of these equations is
c1 = (√5 + 1)/(2√5) and c2 = −(1 − √5)/(2√5).
Hence, the particular solution is
ux = (1/√5)[{(1 + √5)/2}^{x+1} − {(1 − √5)/2}^{x+1}].
When x → ∞, {(1 − √5)/2}^{x+1} → 0 and therefore
ux ≈ (1/√5){(1 + √5)/2}^{x+1}.
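Evaluating the particular solution in floating point and rounding recovers the integer sequence, which confirms the formula; u is my own helper name.

```python
from math import sqrt

def u(x):
    """u_x = (1/√5)[((1+√5)/2)^(x+1) - ((1-√5)/2)^(x+1)], rounded."""
    r5 = sqrt(5.0)
    return round((((1 + r5) / 2) ** (x + 1) - ((1 - r5) / 2) ** (x + 1)) / r5)

print([u(x) for x in range(8)])        # [1, 1, 2, 3, 5, 8, 13, 21]
assert all(u(x) == u(x - 1) + u(x - 2) for x in range(2, 30))
```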
= 2^x · [1/(4∆² − 2∆)]·1 = 2^x · [1/(−2∆)](1 − 2∆)^{−1}·1
= 2^x · [1/(−2∆)]·1 = −2^{x−1}x^{(1)} = −2^{x−1}·x.
Therefore, the general solution is ux = c1·2^x + c2·3^x + 5^x/6 − 2^{x−1}·x, where c1, c2
are arbitrary constants.
= 2^x · 1/[4(1 + ∆)² − 8(1 + ∆) + 3] x^{(3)}
= 2^x · [1/(4∆² − 1)]x^{(3)} = −2^x(1 − 4∆²)^{−1}x^{(3)}
= −2^x(1 + 4∆² + 16∆⁴ + ···)x^{(3)}
= −2^x[x^{(3)} + 4·3·2x^{(1)}] = −2^x[x^{(3)} + 24x^{(1)}].
Hence, the general solution is ux = c1 + c2·3^x − 2^x[x^{(3)} + 24x^{(1)}].
Example 2.9.8 Show that the solution of the equation ux+2 + ux = 2 sin(πx/2)
is given by ux = a cos(πx/2 + ε) − x sin(πx/2).
Solution. Let ux = cm^x be a solution of ux+2 + ux = 0.
Then the A.E. is m² + 1 = 0, or m = ±i.
Therefore, the C.F. is A(i)^x + B(−i)^x.
Substituting 0 = r cos θ, 1 = r sin θ, we find r = 1, θ = π/2.
yn = (c1 + c2 n)2n + 3n + n2 + 4n + 8.
Def. 2.9.1 Let {a0, a1, a2, ...} be a sequence of real numbers. The power series
G(x) = Σ_{n=0}^{∞} an x^n (2.65)
is called the generating function of the sequence {an}.
In other words, an , the nth term of the sequence {an } is the coefficient of xn in the
expansion of G(x). That is, if the generating function of a sequence is known, then one
can determine all the terms of the sequence.
That is, G(x) − 2 = Σ_{n=1}^{∞} un x^n.
Now, un = 2un−1 + 3, n ≥ 1. Multiplying both sides by x^n, we obtain
un x^n = 2un−1 x^n + 3x^n.
Taking summation over all n = 1, 2, ..., we have
Σ_{n=1}^{∞} un x^n = 2 Σ_{n=1}^{∞} un−1 x^n + 3 Σ_{n=1}^{∞} x^n.
That is,
G(x) − 2 = 2x Σ_{n=1}^{∞} un−1 x^{n−1} + 3 Σ_{n=1}^{∞} x^n
= 2x Σ_{n=0}^{∞} un x^n + 3[Σ_{n=0}^{∞} x^n − 1]
= 2x G(x) + 3[1/(1 − x) − 1]    [since Σ_{n=0}^{∞} x^n = 1/(1 − x)].
Thus, (1 − 2x)G(x) = 3/(1 − x) − 1.
Therefore, the generating function for this difference equation, i.e., for the sequence
{un}, is
G(x) = 3/[(1 − x)(1 − 2x)] − 1/(1 − 2x)
= 5/(1 − 2x) − 3/(1 − x) = 5(1 − 2x)^{−1} − 3(1 − x)^{−1}
= 5 Σ_{n=0}^{∞} (2x)^n − 3 Σ_{n=0}^{∞} x^n = Σ_{n=0}^{∞} (5·2^n − 3)x^n.
Hence un = 5·2^n − 3.
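The coefficient of x^n in this series is 5·2^n − 3, so un = 5·2^n − 3; a quick check against the recurrence:

```python
# u_n = 5*2^n - 3 against u_n = 2*u_{n-1} + 3, u_0 = 2
u = lambda n: 5 * 2 ** n - 3

assert u(0) == 2
for n in range(1, 25):
    assert u(n) == 2 * u(n - 1) + 3
print("u_n = 5*2^n - 3 verified")
```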
Solution. Let G(x) be the generating function of the sequence {an}. Then
G(x) = a0 + a1x + Σ_{n=2}^{∞} an x^n = 2 + 3x + Σ_{n=2}^{∞} an x^n.
That is, Σ_{n=2}^{∞} an x^n = G(x) − 2 − 3x.
Multiplying the given equation by x^n,
an x^n − 5an−1 x^n + 6an−2 x^n = 0.
Taking summation for n = 2, 3, ...,
Σ_{n=2}^{∞} an x^n − 5 Σ_{n=2}^{∞} an−1 x^n + 6 Σ_{n=2}^{∞} an−2 x^n = 0
or, G(x) − 2 − 3x − 5x Σ_{n=2}^{∞} an−1 x^{n−1} + 6x² Σ_{n=2}^{∞} an−2 x^{n−2} = 0
or, G(x) − 2 − 3x − 5x[Σ_{n=0}^{∞} an x^n − a0] + 6x² G(x) = 0
or, G(x) − 2 − 3x − 5x[G(x) − 2] + 6x² G(x) = 0.
Therefore, G(x) = (2 − 7x)/(1 − 5x + 6x²). This is the generating function for the given
difference equation.
Let (2 − 7x)/(1 − 5x + 6x²) = A/(1 − 2x) + B/(1 − 3x) = [(A + B) − (3A + 2B)x]/[(1 − 2x)(1 − 3x)].
The unknowns A and B are related by the equations
A + B = 2 and 3A + 2B = 7,
whose solution is A = 3, B = −1. Thus, G(x) = 3/(1 − 2x) − 1/(1 − 3x).
Now,
G(x) = 3(1 − 2x)^{−1} − (1 − 3x)^{−1}
= 3 Σ_{n=0}^{∞} (2x)^n − Σ_{n=0}^{∞} (3x)^n = Σ_{n=0}^{∞} (3·2^n − 3^n)x^n.
Hence an = 3·2^n − 3^n.
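The coefficient of x^n here is 3·2^n − 3^n, so an = 3·2^n − 3^n; a quick check against the recurrence with a0 = 2, a1 = 3:

```python
# a_n = 3*2^n - 3^n against a_n - 5*a_{n-1} + 6*a_{n-2} = 0
a = lambda n: 3 * 2 ** n - 3 ** n

assert a(0) == 2 and a(1) == 3
for n in range(2, 25):
    assert a(n) - 5 * a(n - 1) + 6 * a(n - 2) == 0
print("a_n = 3*2^n - 3^n verified")
```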
2.10 Exercise
1. Define the operators: forward difference (∆), backward difference (∇), shift (E),
central difference (δ) and average (µ).
(ix) µ ≡ (1 + ∆/2)(1 + ∆)^{−1/2}, (x) µ ≡ cosh(hD/2),
(xi) ∇ ≡ −δ²/2 + δ√(1 + δ²/4), (xii) E ≡ 1 + δ²/2 + δ√(1 + δ²/4),
(xiii) E∇ ≡ ∆ ≡ δE^{1/2}, (xiv) ∇ ≡ E^{−1}∆.
3. Show that
(i) ∆^i yk = ∇^i yk+i = δ^i yk+i/2, (ii) ∆(yi²) = (yi + yi+1)∆yi,
(iii) ∆∇yi = ∇∆yi = δ²yi, (iv) ∆(1/yi) = −∆yi/(yi yi+1).
4. Prove the following:
(i) ∆^n f(x) = Σ_{i=0}^{n} [(−1)^i n!/(i!(n − i)!)] f(x + (n − i)h)
(ii) ∇^n f(x) = Σ_{i=0}^{n} [(−1)^i n!/(i!(n − i)!)] f(x − ih)
(iii) δ^{2n} f(x) = Σ_{i=0}^{2n} [(−1)^i (2n)!/(i!(2n − i)!)] f(x + (n − i)h).
(ii) e^x = (∆²/E)e^x · (Ee^x)/(∆²e^x)
(iii) ∆ log f(x) = log[1 + ∆f(x)/f(x)]
(iv) (∆²/E)x³ = 6x (taking h = 1).
7. Prove that if the spacing h is very small, then the forward difference operator is
almost equal to the differential operator, i.e., for small h, ∆^n f(x) ≈ h^n D^n f(x).
8. Show that the operators δ, µ, E, ∆ and ∇ commute with one another.
12. The nth difference ∆^n is defined by ∆^n f(x) = ∆^{n−1}f(x + h) − ∆^{n−1}f(x), (n ≥ 1).
If f(x) is a polynomial of degree n, show that ∆f(x) is a polynomial of degree n − 1.
Hence deduce that the nth difference of f(x) is a constant.
Prove that
fi = E^i f0 = Σ_{j=0}^{i} ^iCj ∆^j f0.
16. Taking h = 1, compute the second, third and fourth differences of f(x) = 3x^4 −
2x^2 + 5x − 1.
17. Construct the forward difference table for the following tabulated values of f (x)
and hence find the values of ∆2 f (3), ∆3 f (2), ∆4 f (0).
x : 0 1 2 3 4 5 6
f (x) : 4 7 10 20 45 57 70
18. Use finite difference method to find a polynomial which takes the following values:
x : –2 –1 0 1 2
f (x) : –12 –6 0 6 10
19. Find the missing term in the following table:
x : 0 2 4 6 8
f (x) : 12 6 0 ? –25
20. Use finite difference method to find the value of f (2.2) from the following data.
x : 1 2 3 4 5
f (x) : 3 24 99 288 675
30. The seeds of a certain plant when one year old produce eighteen fold. A seed is planted and every seed subsequently produced is planted as soon as it is produced. Prove that the number of grains at the end of the nth year is
u_n = (1/a)[((11 + a)/2)^{n+1} − ((11 − a)/2)^{n+1}],
where a = 3√17.
31. The first term of a sequence {un } is 1, the second is 4 and every other term is the
arithmetic mean of the two preceding terms. Find un and show that un tends to
a definite limit as n → ∞.
32. The first term of a sequence is 1, the second is 2 and every subsequent term is the sum of the two preceding terms. Find the nth term.
70 Numerical Analysis
Interpolation
Sometimes we have to compute the value of the dependent variable for a given independent variable when the explicit relation between them is not known. For example, the population of India is known for the years 1951, 1961, 1971, 1981, 1991 and 2001, but no exact mathematical expression is available that gives the population for an arbitrary year. So one cannot determine the population of India in the year 2000 analytically. Using interpolation, however, one can determine the (approximate) population for any year.
The general interpolation problem is stated below:
Let y = f (x) be a function whose analytic expression is not known, but a table of
values of y is known only at a set of values x0 , x1 , x2 , . . ., xn of x. There is no other
information available about the function f (x). That is,
f (xi ) = yi , i = 0, 1, . . . , n. (3.1)
The problem of interpolation is to find the value of y(= f(x)) for an argument, say, x′, where the value of y at x′ is not available in the table.
A large number of different techniques are used to determine the value of y at x = x′.
But one common step is “find an approximate function, say, ψ(x), corresponding to
the given function f (x) depending on the tabulated value.” The approximate function
should be simple and easy to handle. The function ψ(x) may be a polynomial, exponen-
tial, geometric function, Taylor’s series, Fourier series, etc. When the function ψ(x) is
a polynomial, then the corresponding interpolation is called polynomial interpolation.
Polynomial interpolation is the most widely used interpolation technique, because polynomials are continuous and can be differentiated and integrated term by term within any range.
A polynomial φ(x) is called an interpolating polynomial if yi = f(xi) = φ(xi), i = 0, 1, 2, . . . , n, and (d^k f/dx^k)_{x′} = (d^k φ/dx^k)_{x′} for some finite k, where x′ is one of the values x0, x1, . . . , xn.
[Figure 3.1: The polynomial y = φ(x) approximating y = f(x) within the band between y = f(x) − ε and y = f(x) + ε over the points x0, x1, x2, . . . , xn.]
The following theorem justifies the approximation of an unknown function f (x) to a
polynomial φ(x).
Theorem 3.1 If the function f (x) is continuous in [a, b], then for any pre-assigned
positive number ε > 0, there exists a polynomial φ(x) such that
|f (x) − φ(x)| < ε for all x ∈ (a, b).
Let y = f (x) be a real valued function defined on an interval [a, b]. Let x0 , x1 , . . . , xn be
n + 1 distinct points in the interval [a, b] and y0 , y1 , . . . , yn be the corresponding values
of y at these points, i.e., yi = f (xi ), i = 0, 1, . . . , n, are given.
The interpolation problem is to find a polynomial φ(x) of degree at most n such that
φ(xi ) = yi , i = 0, 1, . . . , n. (3.2)
The polynomial φ(x) is called the interpolation polynomial and the points xi ,
i = 0, 1, . . . , n are called interpolation points.
Let the polynomial φ(x) be of the form
φ(x) = Σ_{i=0}^{n} Li(x) yi, (3.3)
where each Li(x) is a polynomial in x of degree less than or equal to n, called the Lagrangian function.
The polynomial φ(x) satisfies equation (3.2) if
Li(xj) = 0 for i ≠ j and Li(xj) = 1 for i = j.
That is, the polynomial Li(x) vanishes at the points x0, x1, . . . , x_{i−1}, x_{i+1}, . . . , xn only. So it should be of the form
Li(x) = ci (x − x0)(x − x1) · · · (x − x_{i−1})(x − x_{i+1}) · · · (x − xn),
where the constant ci is determined from the condition Li(xi) = 1.
Example 3.1.1 Obtain Lagrange’s interpolating polynomial for f (x) and find an
approximate value of the function f (x) at x = 0, given that f (−2) = −5, f (−1) = −1
and f (1) = 1.
Solution. Here x0 = −2, x1 = −1, x2 = 1 and f(x0) = −5, f(x1) = −1, f(x2) = 1.
Then f(x) ≈ Σ_{i=0}^{2} Li(x) f(xi).
Now,
L0(x) = (x − x1)(x − x2)/[(x0 − x1)(x0 − x2)] = (x + 1)(x − 1)/[(−2 + 1)(−2 − 1)] = (x^2 − 1)/3,
L1(x) = (x − x0)(x − x2)/[(x1 − x0)(x1 − x2)] = (x + 2)(x − 1)/[(−1 + 2)(−1 − 1)] = (x^2 + x − 2)/(−2),
L2(x) = (x − x0)(x − x1)/[(x2 − x0)(x2 − x1)] = (x + 2)(x + 1)/[(1 + 2)(1 + 1)] = (x^2 + 3x + 2)/6.
Therefore,
f(x) ≈ (x^2 − 1)/3 × (−5) + (x^2 + x − 2)/(−2) × (−1) + (x^2 + 3x + 2)/6 × 1 = 1 + x − x^2.
Thus, f (0) = 1.
The Lagrangian coefficients can be computed from the following scheme. The differ-
ences are computed, row-wise, as shown below:
x − x0 *   x0 − x1    x0 − x2    · · ·   x0 − xn
x1 − x0    x − x1 *   x1 − x2    · · ·   x1 − xn
x2 − x0    x2 − x1    x − x2 *   · · ·   x2 − xn
· · ·      · · ·      · · ·      · · ·   · · ·
xn − x0    xn − x1    xn − x2    · · ·   x − xn *
From this table, it is observed that the product of diagonal elements is w(x). The
product of the elements of first row is (x − x0 )w (x0 ), product of elements of second row
is (x − x1 )w (x1 ) and so on. Then the Lagrangian coefficient can be computed using
the formula
Li(x) = w(x)/[(x − xi) w′(xi)].
Also, for equispaced points xi = x0 + ih,
w′(xi) = (xi − x0)(xi − x1) · · · (xi − x_{i−1})(xi − x_{i+1}) · · · (xi − xn)
= (ih)((i − 1)h) · · · (h)(−h)(−2h) · · · (−(n − i)h)
= h^n · i(i − 1) · · · 1 · (−1)(−2) · · · (−(n − i))
= (−1)^{n−i} h^n i!(n − i)!.
φ(x) = Σ_{i=0}^{n} [h^{n+1} s(s − 1)(s − 2) · · · (s − n)] / [(−1)^{n−i} h^n i!(n − i)!(s − i)h] yi
= Σ_{i=0}^{n} (−1)^{n−i} [s(s − 1)(s − 2) · · · (s − n)] / [i!(n − i)!(s − i)] yi, (3.8)
where x = x0 + sh.
For given tabulated values, the Lagrange interpolation polynomial exists and is unique. This is proved below. We seek a polynomial φ(x) of degree at most n satisfying
yi = φ(xi), i = 0, 1, . . . , n. (3.9)
For n = 1,
φ(x) = [(x − x1)/(x0 − x1)] y0 + [(x − x0)/(x1 − x0)] y1. (3.10)
For n = 2,
φ(x) = [(x − x1)(x − x2)]/[(x0 − x1)(x0 − x2)] y0 + [(x − x0)(x − x2)]/[(x1 − x0)(x1 − x2)] y1 + [(x − x0)(x − x1)]/[(x2 − x0)(x2 − x1)] y2. (3.11)
In general,
φ(x) = Σ_{i=0}^{n} Li(x) yi, (3.12)
where
Li(x) = Π_{j=0, j≠i}^{n} (x − xj)/(xi − xj), i = 0, 1, . . . , n. (3.13)
Expression (3.10) is a linear function, i.e., a polynomial of degree one, and φ(x0) = y0, φ(x1) = y1. Also, expression (3.11) is a second degree polynomial with φ(x0) = y0, φ(x1) = y1, φ(x2) = y2, i.e., it satisfies (3.9). Thus (3.9) is fulfilled for n = 1, 2.
Each function in (3.13) is a fraction whose numerator is a polynomial of degree n and whose denominator is a non-zero number. Also, Li(xi) = 1 and Li(xj) = 0 for j ≠ i, j = 0, 1, . . . , n. That is, φ(xi) = yi. Thus, the conditions of (3.9) are satisfied. Hence the Lagrange polynomial exists.
To prove the uniqueness, let φ(x) and φ*(x) be two interpolating polynomials of degree at most n satisfying
φ(xi) = yi, i = 0, 1, . . . , n, (3.14)
and
φ*(xi) = yi, i = 0, 1, . . . , n. (3.15)
Then
φ*(xi) − φ(xi) = 0, i = 0, 1, . . . , n. (3.16)
If φ*(x) − φ(x) is not identically zero, then this difference is a polynomial of degree at most n and so has at most n zeros, which contradicts (3.16), which supplies n + 1 zeros. Consequently, φ*(x) = φ(x). Thus φ(x) is unique.
Let the argument x be changed by the linear transformation x = az + b, so that xi = azi + b and x − xi = a(z − zi). Therefore,
w(x) = (x − x0)(x − x1) · · · (x − xn) = a^{n+1}(z − z0)(z − z1) · · · (z − zn)
and
w′(xi) = (xi − x0)(xi − x1) · · · (xi − x_{i−1})(xi − x_{i+1}) · · · (xi − xn) = a^n (zi − z0)(zi − z1) · · · (zi − z_{i−1})(zi − z_{i+1}) · · · (zi − zn).
Thus,
Li(x) = w(x)/[(x − xi) w′(xi)]
= a^{n+1}(z − z0)(z − z1) · · · (z − zn) / [a(z − zi) · a^n (zi − z0)(zi − z1) · · · (zi − z_{i−1})(zi − z_{i+1}) · · · (zi − zn)]
= w(z)/[(z − zi) w′(zi)] = Li(z).
That is, the Lagrangian functions are invariant under a linear transformation of the argument.
Σ_{i=0}^{n} Li(x) = Σ_{i=0}^{n} w(x)/[(x − xi) w′(xi)]. (3.17)
Consider the partial fraction expansion
1/w(x) = Σ_{i=0}^{n} Ai/(x − xi). (3.18)
Multiplying both sides by (x − xi) and letting x → xi gives Ai = 1/w′(xi); in particular, A0 = 1/w′(x0), A1 = 1/w′(x1), . . . , An = 1/w′(xn).
Using these results, equation (3.18) becomes
1/w(x) = 1/[(x − x0)w′(x0)] + 1/[(x − x1)w′(x1)] + · · · + 1/[(x − xi)w′(xi)] + · · · + 1/[(x − xn)w′(xn)],
i.e., 1 = Σ_{i=0}^{n} w(x)/[(x − xi)w′(xi)].
Hence
Σ_{i=0}^{n} Li(x) = Σ_{i=0}^{n} w(x)/[(x − xi)w′(xi)] = 1. (3.20)
Theorem (Error of Lagrange interpolation). Let I be an interval containing the points x0, x1, . . . , xn and x, and let f(x) possess continuous derivatives up to order n + 1 on I. Then
En(x) = (x − x0)(x − x1) · · · (x − xn) f^{(n+1)}(ξ)/(n + 1)!, (3.21)
where ξ ∈ I.
Proof. Let the error En (x) = f (x) − φ(x), where φ(x) is a polynomial of degree less
than or equal to n, which approximates the function f (x).
Now, En (xi ) = f (xi ) − φ(xi ) = 0 for i = 0, 1, . . . , n.
By virtue of the above result, it is assumed that En (x) = w(x)k, where w(x) =
(x − x0 )(x − x1 ) . . . (x − xn ).
The error at any point, say, x = t, other than x0, x1, . . . , xn, is En(t) = w(t)k. Consider the function F(x) = f(x) − φ(x) − k w(x), where k is chosen so that F(t) = 0. Then F(x) vanishes at the n + 2 points t, x0, x1, . . . , xn. By repeated application of Rolle's theorem, F′(x) vanishes at n + 1 points, F′′(x) at n points, and finally F^{(n+1)}(x) vanishes at least at one point, say ξ ∈ I. Since φ(x) is of degree at most n, F^{(n+1)}(x) = f^{(n+1)}(x) − k(n + 1)!, and hence
k = f^{(n+1)}(ξ)/(n + 1)!.
Note 3.3.1 The above expression gives the error at any point x.
But practically it has little utility, because, in many cases f (n+1) (ξ) cannot be deter-
mined.
If M_{n+1} is an upper bound of |f^{(n+1)}(ξ)| on I, i.e., if |f^{(n+1)}(ξ)| ≤ M_{n+1} on I, then the upper bound of En(x) is
|En(x)| ≤ [M_{n+1}/(n + 1)!] |w(x)|. (3.24)
For equally spaced points, writing x = x0 + sh, the error becomes
En(x) = s(s − 1)(s − 2) · · · (s − n) h^{n+1} f^{(n+1)}(ξ)/(n + 1)!. (3.25)
Note 3.3.3 (Error bounds for equally spaced points, particular cases)
Assume that f(x) is defined on [a, b], which contains the equally spaced points, and that f(x) and its derivatives up to order n + 1 are continuous and bounded on the intervals [x0, x1], [x0, x2] and [x0, x3] respectively. That is, |f^{(n+1)}(ξ)| ≤ M_{n+1} for x0 ≤ ξ ≤ xn, for n = 1, 2, 3. Then
(i) |E1(x)| ≤ h^2 M2/8, x0 ≤ x ≤ x1, (3.26)
(ii) |E2(x)| ≤ h^3 M3/(9√3), x0 ≤ x ≤ x2, (3.27)
(iii) |E3(x)| ≤ h^4 M4/24, x0 ≤ x ≤ x3. (3.28)
Proof. (i) From (3.25),
|E1(x)| = |s(s − 1)| h^2 |f^{(2)}(ξ)|/2!.
Let g1(s) = s(s − 1). Then g1′(s) = 2s − 1 = 0 gives s = 1/2, and the maximum of |g1(s)| on 0 ≤ s ≤ 1 is |g1(1/2)| = 1/4.
Therefore,
|E1(x)| ≤ (1/4) h^2 M2/2! = h^2 M2/8.
(ii) |E2(x)| = |s(s − 1)(s − 2)| h^3 |f^{(3)}(ξ)|/3!.
Let g2(s) = s(s − 1)(s − 2). Then g2′(s) = 3s^2 − 6s + 2, and g2′(s) = 0 gives s = 1 ± 1/√3. Again, g2′′(s) = 6(s − 1) < 0 at s = 1 − 1/√3.
Therefore, the maximum value of g2(s) is
(1 − 1/√3)(−1/√3)(−1/√3 − 1) = 2/(3√3).
Thus,
|E2(x)| ≤ [2/(3√3)] h^3 M3/3! = h^3 M3/(9√3).
(iii) |E3(x)| = |s(s − 1)(s − 2)(s − 3)| h^4 |f^{(4)}(ξ)|/4!.
Let g3(s) = s(s − 1)(s − 2)(s − 3). Then g3′(s) = 4s^3 − 18s^2 + 22s − 6.
At the extrema, g3′(s) = 0, i.e., 2s^3 − 9s^2 + 11s − 3 = 0.
This gives s = 3/2, (3 ± √5)/2.
Now g3′′(s) = 12s^2 − 36s + 22, and g3′′(3/2) < 0 while g3′′((3 ± √5)/2) > 0.
Also, |g3(s)| = 1 at s = (3 ± √5)/2 and |g3(s)| = 9/16 at s = 3/2.
Therefore the maximum value of |g3(s)| is 1.
Hence,
|E3(x)| ≤ 1 · h^4 M4/4! = h^4 M4/24.
Example 3.3.1 Find error bounds for the interpolation of f(x) = cos x on [0, 1.5] by polynomials of degree one, two and three using equally spaced points.
Solution. Here f′(x) = −sin x, f′′(x) = −cos x, f′′′(x) = sin x, f^{iv}(x) = cos x. On [0, 1.5],
|f′′(x)| ≤ |cos 0| = 1.0, so that M2 = 1.0,
|f′′′(x)| ≤ |sin 1.5| = 0.997495, so that M3 = 0.997495,
|f^{iv}(x)| ≤ |cos 0| = 1.0, so that M4 = 1.0.
For the linear polynomial the spacing of the points is h = 1.5 − 0 = 1.5 and its error bound is
|E1(x)| ≤ h^2 M2/8 = (1.5)^2 × 1.0/8 = 0.28125.
For the quadratic polynomial the spacing of the points is h = (1.5 − 0)/2 = 0.75 and its error bound is
|E2(x)| ≤ h^3 M3/(9√3) = (0.75)^3 × 0.997495/(9√3) = 0.0269955.
The spacing for the cubic polynomial is h = (1.5 − 0)/3 = 0.5 and thus the error bound is
|E3(x)| ≤ h^4 M4/24 = (0.5)^4 × 1.0/24 = 0.0026042.
Example 3.3.2 A function f (x) defined on the interval (0, 1) is such that f (0) =
0, f (1/2) = −1, f (1) = 0. Find the quadratic polynomial p(x) which agrees with f (x)
for x = 0, 1/2, 1.
If |d^3 f/dx^3| ≤ 1 for 0 ≤ x ≤ 1, show that |f(x) − p(x)| ≤ 1/12 for 0 ≤ x ≤ 1.
Solution. Given x0 = 0, x1 = 1/2, x2 = 1 and f (0) = 0, f (1/2) = −1, f (1) = 0.
From Lagrange's interpolating formula, the required quadratic polynomial is
p(x) = [(x − x1)(x − x2)]/[(x0 − x1)(x0 − x2)] f(x0) + [(x − x0)(x − x2)]/[(x1 − x0)(x1 − x2)] f(x1) + [(x − x0)(x − x1)]/[(x2 − x0)(x2 − x1)] f(x2)
= [(x − 1/2)(x − 1)]/[(0 − 1/2)(0 − 1)] × 0 + [(x − 0)(x − 1)]/[(1/2 − 0)(1/2 − 1)] × (−1) + [(x − 0)(x − 1/2)]/[(1 − 0)(1 − 1/2)] × 0
= 4x(x − 1).
Example 3.3.3 Determine the step size h (and number of points n) to be used in
the tabulation of f (x) = cos x in the interval [1, 2] so that the quadratic interpolation
will be correct to six decimal places.
Solution. The error in quadratic interpolation is
|E2(x)| ≤ h^3 M3/(9√3), where M3 = max_{1≤x≤2} |f′′′(x)|.
Here f′′′(x) = sin x and M3 = max_{1≤x≤2} |sin x| = 1, since π/2 lies in [1, 2]. For six decimal accuracy we require h^3 M3/(9√3) ≤ 0.5 × 10^{−6}, i.e., h^3 ≤ 7.79 × 10^{−6}, i.e., h ≤ 0.0198; so about n = 51 subintervals (52 points) are needed.
For the function f(x) = e^{−x} with interpolation points x0 = 0, x1 = 1/2, x2 = 1, the Lagrange quadratic polynomial is
φ(x) = [(x − x1)(x − x2)]/[(x0 − x1)(x0 − x2)] f(x0) + [(x − x0)(x − x2)]/[(x1 − x0)(x1 − x2)] f(x1) + [(x − x0)(x − x1)]/[(x2 − x0)(x2 − x1)] f(x2)
= [(x − 1/2)(x − 1)]/[(0 − 1/2)(0 − 1)] × 1 + [(x − 0)(x − 1)]/[(1/2 − 0)(1/2 − 1)] × e^{−1/2} + [(x − 0)(x − 1/2)]/[(1 − 0)(1 − 1/2)] × e^{−1}
= (2x − 1)(x − 1) − 4e^{−1/2} x(x − 1) + e^{−1} x(2x − 1)
= 2x^2(1 − 2e^{−1/2} + e^{−1}) − x(3 − 4e^{−1/2} + e^{−1}) + 1
= 0.309636243x^2 − 0.941756802x + 1.0.
Figure 3.2: The graph of the function f(x) = e^{−x} and the polynomial φ(x) = 0.309636243x^2 − 0.941756802x + 1.
Program 3.1
/* Program Lagrange Interpolation
This program implements Lagrange’s interpolation
formula for one dimension; xg is the interpolating points */
#include <stdio.h>
#include <math.h>
void main()
{
int n, i, j; float xg, x[20], y[20], sum=0, prod=1;
printf("Enter the value of n and the data in the form x[i],y[i] ");
scanf("%d",&n);
for(i=0;i<=n;i++) scanf("%f %f",&x[i],&y[i]);
printf("\nEnter the interpolating point x ");
scanf("%f",&xg);
for(i=0;i<=n;i++)
{
prod=1;
for(j=0;j<=n;j++)
{
if(i!=j) prod*=(xg-x[j])/(x[i]-x[j]);
}
sum+=y[i]*prod;
}
printf("\nThe given data is ");
for(i=0;i<=n;i++) printf("\n(%6.4f,%6.4f)",x[i],y[i]);
printf("\nThe value of y at x= %5.2f is %8.5f ", xg, sum);
} /* main */
A sample of input/output:
Enter the value of n and the data in the form x[i],y[i] 4
1 5
1.5 8.2
2 9.2
3.2 11
4.5 16
Enter the interpolating point x 1.75
The given data is
(1.0000,5.0000)
(1.5000,8.2000)
(2.0000,9.2000)
(3.2000,11.0000)
(4.5000,16.0000)
The value of y at x= 1.75 is 8.85925
Different types of finite differences are introduced in Chapter 2. Some of them are
recapitulated here.
Let a function y = f (x) be known as (xi , yi ) at (n+1) points xi , i = 0, 1, . . . , n, where
xi ’s are equally spaced, i.e., xi = x0 + ih, h is the spacing between any two successive
points xi ’s. That is, yi = f (xi ), i = 0, 1, . . . , n.
∆f (x) = f (x + h) − f (x),
∆2 y0 = ∆(∆y0 ) = ∆(y1 − y0 )
= ∆y1 − ∆y0 = (y2 − y1 ) − (y1 − y0 ) = y2 − 2y1 + y0 .
In general,
∆^k y0 = yk − kC1 y_{k−1} + kC2 y_{k−2} − · · · + (−1)^k y0, (3.29)
∆^k yi = y_{k+i} − kC1 y_{k+i−1} + kC2 y_{k+i−2} − · · · + (−1)^k yi. (3.30)
It is observed that difference of any order can easily be expressed in terms of the
ordinates yi ’s with binomial coefficients.
All orders forward differences can be written in a tabular form shown in Table 3.1.
This difference table is called forward difference table or diagonal difference
table.
x y ∆y ∆2 y ∆3 y ∆4 y
x0 y0
∆y0
x1 y1 ∆ 2 y0
∆y1 ∆ 3 y0
x2 y2 ∆ 2 y1 ∆ 4 y0
∆y2 ∆3 y 1
x3 y3 ∆ 2 y2
∆y3
x4 y4
∇3 y3 = y3 − 3y2 + 3y1 − y0 ,
∇3 y4 = y4 − 3y3 + 3y2 − y1 , etc.
In general,
∇^k yi = yi − kC1 y_{i−1} + kC2 y_{i−2} − · · · + (−1)^k y_{i−k}. (3.31)
Table 3.2 shows how the backward differences of all orders can be formed.
The backward difference table is sometimes called horizontal difference table.
x y ∇y ∇2 y ∇3 y ∇4 y
x0 y0
x1 y1 ∇y1
x2 y2 ∇y2 ∇ 2 y2
x3 y3 ∇y3 ∇ 2 y3 ∇ 3 y3
x4 y4 ∇y4 ∇ 2 y4 ∇ 3 y4 ∇ 4 y4
Table 3.3: Propagation of an error ε in a difference table.
x     y         ∆y          ∆^2 y          ∆^3 y          ∆^4 y           ∆^5 y
x0    y0
                ∆y0
x1    y1                    ∆^2 y0
                ∆y1                        ∆^3 y0 + ε
x2    y2                    ∆^2 y1 + ε                    ∆^4 y0 − 4ε
                ∆y2 + ε                    ∆^3 y1 − 3ε                    ∆^5 y0 + 10ε
x3    y3 + ε                ∆^2 y2 − 2ε                   ∆^4 y1 + 6ε
                ∆y3 − ε                    ∆^3 y2 + 3ε                    ∆^5 y1 − 10ε
x4    y4                    ∆^2 y3 + ε                    ∆^4 y2 − 4ε
                ∆y4                        ∆^3 y3 − ε
x5    y5                    ∆^2 y4
                ∆y5
x6    y6
Table 3.3 shows the propagation of error in a difference table and how the error affects
the differences. From this table, the following observations are noted.
(i) The effect of the error increases with the order of the differences.
(ii) The error is maximum (in magnitude) along the horizontal line through the erro-
neous tabulated value.
(iii) The second difference column has the errors ε, −2ε, ε; in the third difference column the errors are ε, −3ε, 3ε, −ε; and in the fourth difference column the expected errors are ε, −4ε, 6ε, −4ε, ε (the table is not large enough to show all of them). Thus, in the pth difference column, the coefficients of the errors are the binomial coefficients in the expansion of (1 − x)^p.
(iv) The algebraic sum of errors in any column (complete) is zero. If there is any error
in a single entry of a table, then from the difference table one can detect and
correct such error.
(i) Form the difference table. If at any stage the differences do not follow a smooth pattern, then one can conclude that there is an error.
(ii) If the differences of some order (this generally happens at higher orders) become alternating in sign, then the middle entry has an error.
Let y = f (x) be a function whose explicit form is unknown. But, the values of y at
the equispaced points x0 , x1 , . . . , xn , i.e., yi = f (xi ), i = 0, 1, . . . , n are known. Since
x0 , x1 , . . . , xn are equispaced then xi = x0 + ih, i = 0, 1, . . . , n, where h is the spacing.
It is required to construct a polynomial φ(x) of degree less than or equal to n satisfying
the conditions
yi = φ(xi ), i = 0, 1, . . . , n. (3.32)
Since φ(x) is a polynomial of degree at most n, so φ(x) can be taken in the following
form
φ(x) = y0 + (x − x0) ∆y0/h + (x − x0)(x − x1) ∆^2 y0/(2!h^2) + (x − x0)(x − x1)(x − x2) ∆^3 y0/(3!h^3) + · · · + (x − x0)(x − x1) · · · (x − x_{n−1}) ∆^n y0/(n!h^n). (3.34)
Introducing the condition xi = x0 + ih, i = 0, 1, . . . , n for equispaced points and a
new variable u as x = x0 + uh.
Therefore, x − xi = (u − i)h.
So the equation (3.34) becomes
φ(x) = y0 + (uh) ∆y0/h + (uh)(u − 1)h ∆^2 y0/(2!h^2) + (uh)(u − 1)h(u − 2)h ∆^3 y0/(3!h^3) + · · · + (uh)(u − 1)h(u − 2)h · · · (u − (n − 1))h ∆^n y0/(n!h^n)
= y0 + u∆y0 + [u(u − 1)/2!] ∆^2 y0 + [u(u − 1)(u − 2)/3!] ∆^3 y0 + · · · + [u(u − 1)(u − 2) · · · (u − (n − 1))/n!] ∆^n y0, (3.35)
where u = (x − x0)/h.
This is known as Newton or Newton-Gregory forward difference interpolating
polynomial.
Example 3.5.1 The following table gives the values of e^x for certain equidistant values of x. Find the value of e^x when x = 0.612 using Newton's forward difference formula.
x y ∆y ∆2 y ∆3 y
0.61 1.840431
0.018497
0.62 1.858928 0.000185
0.018682 0.000004
0.63 1.877610 0.000189
0.018871 0.0
0.64 1.896481 0.000189
0.019060
0.65 1.915541
Here, x0 = 0.61, x = 0.612, h = 0.01, u = (x − x0)/h = (0.612 − 0.61)/0.01 = 0.2.
Then, by Newton's forward formula (3.35),
y(0.612) = 1.840431 + 0.2 × 0.018497 + [0.2(0.2 − 1)/2!] × 0.000185 + [0.2(0.2 − 1)(0.2 − 2)/3!] × 0.000004
= 1.840431 + 0.003699 − 0.000015 + 0.0000002 = 1.844115.
The error in Newton's forward interpolation formula is
E(x) = (x − x0)(x − x1) · · · (x − xn) f^{(n+1)}(ξ)/(n + 1)!
= u(u − 1)(u − 2) · · · (u − n) h^{n+1} f^{(n+1)}(ξ)/(n + 1)! (using x = x0 + uh).
A particular case:
If 0 < u < 1 then
|u(u − 1)| = u(1 − u) = 1/4 − (1/2 − u)^2 ≤ 1/4 and
|(u − 2)(u − 3) · · · (u − n)| ≤ |(−2)(−3) · · · (−n)| = n!.
Then, since h^{n+1} f^{(n+1)}(ξ) ≈ ∆^{n+1} y0,
|E(x)| ≤ (1/4) [n!/(n + 1)!] |∆^{n+1} y0| = |∆^{n+1} y0|/[4(n + 1)].
Also, |∆^{n+1} y0| ≤ 9 units in the last significant figure when the tabulated values are correct to the digits given.
Thus, |E(x)| ≤ 9/[4(n + 1)] < 1 for n ≥ 2 and 0 < u < 1.
That is, the maximum error in Newton's forward interpolation is less than one unit in the last place when |x − x0| < h.
Newton's forward formula is used to compute the approximate value of f(x) when the argument x is near the beginning of the table. But this formula is not appropriate for computing f(x) when x is at the end of the table; in that situation Newton's backward formula is appropriate.
Algorithm 3.2 (Newton’s forward interpolation). This algorithm determines
the value of y when the value of x is given, by Newton’s forward interpolation method.
The values of xi , yi , i = 0, 1, 2, . . . , n are given and assumed that xi = x0 + ih,
i.e., the data are equispaced.
Program 3.2.
/* Program Newton Forward Interpolation
This program finds the value of y=f(x) at a given x when
the function is supplied as (x[i],y[i]), i=0, 1, ..., n.
Assumed that x’s are equispaced.*/
#include<stdio.h>
#include<math.h>
void main()
{
int i,j,n; float x[20],y[20],xg,sum,prod=1,u,dy[20],h;
printf("Enter number of subintervals ");
scanf("%d",&n);
printf("Enter x and y values ");
for(i=0;i<=n;i++) scanf("%f %f",&x[i],&y[i]);
printf("Enter interpolating point x ");
scanf("%f",&xg);
h=x[1]-x[0];
u=(xg-x[0])/h;
for(j=0;j<=n;j++) dy[j]=y[j];
prod=1; sum=y[0];
for(i=1;i<=n;i++)
{
for(j=0;j<=n-i;j++) dy[j]=dy[j+1]-dy[j];
prod*=(u-i+1)/i;
sum+=prod*dy[0];
}
printf("The value of y at x=%f is %f ",xg,sum);
}
3.6 Newton's Backward Difference Interpolation Formula
Let φ(x) be a polynomial of degree at most n satisfying
yi = φ(xi), i = 0, 1, . . . , n. (3.37)
a3 = ∇^3 yn/(3!h^3), a4 = ∇^4 yn/(4!h^4), . . . , an = ∇^n yn/(n!h^n).
When the values of the ai's are substituted in (3.36), the polynomial φ(x) becomes
φ(x) = yn + (x − xn) ∇yn/h + (x − xn)(x − x_{n−1}) ∇^2 yn/(2!h^2) + (x − xn)(x − x_{n−1})(x − x_{n−2}) ∇^3 yn/(3!h^3) + · · · + (x − xn)(x − x_{n−1})(x − x_{n−2}) · · · (x − x1) ∇^n yn/(n!h^n). (3.38)
Now a unitless variable v is introduced, defined by x = xn + vh, i.e., v = (x − xn)/h. This substitution simplifies the formula.
Also, for equispaced points, xi = x0 + ih. Then
x − x_{n−i} = (xn + vh) − (x0 + (n − i)h) = (xn − x0) + vh − (n − i)h = (v + i)h, i = 0, 1, . . . , n.
φ(x) = yn + vh ∇yn/h + vh(v + 1)h ∇^2 yn/(2!h^2) + vh(v + 1)h(v + 2)h ∇^3 yn/(3!h^3) + · · · + vh(v + 1)h(v + 2)h · · · (v + n − 1)h ∇^n yn/(n!h^n)
= yn + v∇yn + [v(v + 1)/2!] ∇^2 yn + [v(v + 1)(v + 2)/3!] ∇^3 yn + · · · + [v(v + 1)(v + 2) · · · (v + n − 1)/n!] ∇^n yn. (3.39)
Example 3.6.1 From the following table of values of x and f (x) determine the
value of f (0.29) using Newton’s backward interpolation formula.
Here, xn = 0.30, x = 0.29, h = 0.02, v = (x − xn)/h = (0.29 − 0.30)/0.02 = −0.5.
Then,
Example 3.6.2 The population of a town in decennial census were given in the
following table.
Year : 1921 1931 1941 1951 1961
Population (in thousand) : 46 66 81 93 101
Estimate the population for the year 1955 using Newton’s backward and forward
formulae and compare the results.
Solution.
Using Newton’s backward formula
The backward difference table is
Year Population ∇y ∇2 y ∇3 y ∇4 y
(x) (y)
1921 46
1931 66 20
1941 81 15 −5
1951 93 12 −3 2
1961 101 8 −4 −1 −3
Therefore the population of the town in the year 1955 was 97 thousand, and this result is the same as that obtained by Newton's backward difference formula.
The error in Newton's backward interpolation formula is
E(x) = (x − xn)(x − x_{n−1}) · · · (x − x1)(x − x0) f^{(n+1)}(ξ)/(n + 1)!
= v(v + 1)(v + 2) · · · (v + n) h^{n+1} f^{(n+1)}(ξ)/(n + 1)!, (3.40)
where v = (x − xn)/h and ξ lies between min{x0, x1, . . . , xn, x} and max{x0, x1, . . . , xn, x}.
Note 3.6.1 Newton's backward difference interpolation formula is used to compute the value of f(x) when x is near xn, i.e., when x is at the end of the table.
Newton's forward and backward formulae do not give an accurate value of f(x) when x is in the middle of the table. To get a more accurate result, other formulae may be used. Several methods are available to solve this type of problem; among them, the Gaussian forward and backward, Stirling's and Bessel's interpolation formulae are widely used.
Gauss's forward difference interpolation formula is
φ(x) = y0 + (x − x0) ∆y0/h + (x − x0)(x − x1) ∆^2 y_{−1}/(2!h^2) + (x − x_{−1})(x − x0)(x − x1) ∆^3 y_{−1}/(3!h^3) + · · · + (x − x_{−(n−1)}) · · · (x − x_{n−1}) ∆^{2n−1} y_{−(n−1)}/[(2n − 1)!h^{2n−1}] + (x − x_{−(n−1)}) · · · (x − x_{n−1})(x − xn) ∆^{2n} y_{−n}/[(2n)!h^{2n}]. (3.43)
In terms of s = (x − x0)/h this becomes
φ(x) = y0 + s∆y0 + [s(s − 1)/2!] ∆^2 y_{−1} + [(s + 1)s(s − 1)/3!] ∆^3 y_{−1} + · · · + [(s + n − 1) · · · s(s − 1) · · · (s − (n − 1))/(2n − 1)!] ∆^{2n−1} y_{−(n−1)} + [(s + n − 1) · · · s(s − 1) · · · (s − (n − 1))(s − n)/(2n)!] ∆^{2n} y_{−n}
= y0 + s∆y0 + [s(s − 1)/2!] ∆^2 y_{−1} + [s(s^2 − 1^2)/3!] ∆^3 y_{−1} + [s(s^2 − 1^2)(s − 2)/4!] ∆^4 y_{−2} + · · · + [s(s^2 − (n − 1)^2)(s^2 − (n − 2)^2) · · · (s^2 − 1^2)/(2n − 1)!] ∆^{2n−1} y_{−(n−1)} + [s(s^2 − (n − 1)^2)(s^2 − (n − 2)^2) · · · (s^2 − 1^2)(s − n)/(2n)!] ∆^{2n} y_{−n}. (3.44)
If 2n points x_{−(n−1)}, . . . , x0, . . . , x_{n−1}, xn are used, the series terminates with the (2n − 1)th difference:
φ(x) = y0 + s∆y0 + [s(s − 1)/2!] ∆^2 y_{−1} + [(s + 1)s(s − 1)/3!] ∆^3 y_{−1} + [(s + 1)s(s − 1)(s − 2)/4!] ∆^4 y_{−2} + [(s + 2)(s + 1)s(s − 1)(s − 2)/5!] ∆^5 y_{−2} + · · · + [(s + n − 1) · · · s(s − 1) · · · (s − (n − 1))/(2n − 1)!] ∆^{2n−1} y_{−(n−1)}
= y0 + s∆y0 + [s(s − 1)/2!] ∆^2 y_{−1} + [s(s^2 − 1^2)/3!] ∆^3 y_{−1} + [s(s^2 − 1^2)(s − 2)/4!] ∆^4 y_{−2} + [s(s^2 − 2^2)(s^2 − 1^2)/5!] ∆^5 y_{−2} + · · · + [s(s^2 − (n − 1)^2) · · · (s^2 − 1^2)/(2n − 1)!] ∆^{2n−1} y_{−(n−1)}. (3.45)
The error in Gauss's forward formula with 2n + 1 points is
E(x) = (x − x_{−n})(x − x_{−(n−1)}) · · · (x − x_{−1})(x − x0) · · · (x − xn) f^{(2n+1)}(ξ)/(2n + 1)!
= (s + n)(s + n − 1) · · · (s + 1)s(s − 1) · · · (s − n + 1)(s − n) h^{2n+1} f^{(2n+1)}(ξ)/(2n + 1)!
= s(s^2 − 1^2) · · · (s^2 − n^2) h^{2n+1} f^{(2n+1)}(ξ)/(2n + 1)!, (3.46)
where s = (x − x0)/h and ξ lies between min{x_{−n}, x_{−(n−1)}, . . . , x0, x1, . . . , x_{n−1}, xn} and max{x_{−n}, x_{−(n−1)}, . . . , x0, x1, . . . , x_{n−1}, xn}.
In case of 2n arguments, the error is
E(x) = (s + n − 1) · · · (s + 1)s(s − 1) · · · (s − n + 1)(s − n) h^{2n} f^{(2n)}(ξ)/(2n)!
= s(s^2 − 1^2) · · · (s^2 − (n − 1)^2)(s − n) h^{2n} f^{(2n)}(ξ)/(2n)!, (3.47)
The coefficients ai are unknown constants; their values are determined by using the relations (3.48). Substitute x = x0, x_{−1}, x1, x_{−2}, x2, . . . , x_{−n}, xn into (3.49) in succession, and note that xi − x_{−j} = (i + j)h and x_{−i} − xj = −(i + j)h. Then it is found that
y0 = a0,
φ(x_{−1}) = a0 + a1(x_{−1} − x0), i.e., y_{−1} = y0 + a1(−h), i.e., a1 = (y0 − y_{−1})/h = ∆y_{−1}/h,
and so on.
Proceeding in this way, Gauss's backward difference interpolation formula is obtained as
φ(x) = y0 + (x − x0) ∆y_{−1}/(1!h) + (x − x_{−1})(x − x0) ∆^2 y_{−1}/(2!h^2) + (x − x_{−1})(x − x0)(x − x1) ∆^3 y_{−2}/(3!h^3) + (x − x_{−2})(x − x_{−1})(x − x0)(x − x1) ∆^4 y_{−2}/(4!h^4) + · · · + (x − x_{−(n−1)}) · · · (x − x_{−1})(x − x0) · · · (x − x_{n−1}) ∆^{2n−1} y_{−n}/[(2n − 1)!h^{2n−1}] + (x − x_{−n})(x − x_{−(n−1)}) · · · (x − x_{−1})(x − x0)(x − x1) · · · (x − x_{n−1}) ∆^{2n} y_{−n}/[(2n)!h^{2n}]. (3.50)
As in the previous case, a new unitless variable s is introduced to reduce the above formula to a simple form, where s = (x − x0)/h, i.e., x = x0 + sh. Then
(x − xi)/h = (x − x0 − ih)/h = s − i and (x − x_{−i})/h = (x − x0 + ih)/h = s + i, i = 0, 1, 2, . . . , n.
Hence
φ(x) = y0 + s∆y_{−1} + [(s + 1)s/2!] ∆^2 y_{−1} + [(s + 1)s(s − 1)/3!] ∆^3 y_{−2} + [(s + 2)(s + 1)s(s − 1)/4!] ∆^4 y_{−2} + · · · + [(s + n − 1) · · · (s + 1)s(s − 1) · · · (s − n + 1)/(2n − 1)!] ∆^{2n−1} y_{−n} + [(s + n)(s + n − 1) · · · (s + 1)s(s − 1) · · · (s − n + 1)/(2n)!] ∆^{2n} y_{−n}. (3.51)
In this case the arguments are taken as x0 , x±1 , . . . , x±(n−1) and x−n , where x±i =
x0 ± ih, i = 0, 1, . . . , n − 1 and x−n = x0 − nh.
φ(x) = y0 + s∆y_{−1} + [(s + 1)s/2!] ∆^2 y_{−1} + [(s + 1)s(s − 1)/3!] ∆^3 y_{−2} + [(s + 2)(s + 1)s(s − 1)/4!] ∆^4 y_{−2} + · · · + [(s + n − 1) · · · (s + 1)s(s − 1) · · · (s − n + 1)/(2n − 1)!] ∆^{2n−1} y_{−n}, (3.52)
where s = (x − x0)/h.
The error in Gauss's backward formula with 2n + 1 points is
E(x) = (x − x_{−n})(x − x_{−(n−1)}) · · · (x − x_{−1})(x − x0) · · · (x − xn) f^{(2n+1)}(ξ)/(2n + 1)!
= (s + n)(s + n − 1) · · · (s + 1)s(s − 1) · · · (s − n + 1)(s − n) h^{2n+1} f^{(2n+1)}(ξ)/(2n + 1)!,
and with 2n points it is
E(x) = (x − x_{−n})(x − x_{−(n−1)}) · · · (x − x_{−1})(x − x0) · · · (x − x_{n−1}) f^{(2n)}(ξ)/(2n)!
= (s + n)(s + n − 1) · · · (s + 1)s(s − 1) · · · (s − n + 1) h^{2n} f^{(2n)}(ξ)/(2n)!,
Taking the arithmetic mean of Gauss's forward formula (3.45) and Gauss's backward formula (3.52) gives Stirling's central difference interpolation formula
φ(x) = y0 + s (∆y0 + ∆y_{−1})/2 + [s^2/2!] ∆^2 y_{−1} + [s(s^2 − 1^2)/3!] (∆^3 y_{−1} + ∆^3 y_{−2})/2 + [s^2(s^2 − 1^2)/4!] ∆^4 y_{−2} + · · · , (3.53)
where s = (x − x0)/h.
Note 3.8.1 (a) Stirling's interpolation formula (3.53) gives the best approximate result when −0.25 < s < 0.25. So we choose x0 in such a way that s = (x − x0)/h satisfies this condition.
(b) Stirling's interpolation formula is used when the point x, for which f(x) is to be determined, is at the centre of the table and the number of points at which the values of f(x) are known is odd.
Bessel's central difference formula is also obtained by taking the arithmetic mean of Gauss's forward and backward interpolation formulae, except that the backward formula is taken after one modification.
Let us consider 2n equispaced points x_{−(n−1)}, . . . , x_{−1}, x0, x1, . . . , x_{n−1}, xn as arguments, where x_{±i} = x0 ± ih and h is the spacing.
If x0, y0 are the initial values of x and y respectively, then Gauss's backward difference interpolation formula (3.52) is
φ(x) = y0 + s∆y_{−1} + [s(s + 1)/2!] ∆^2 y_{−1} + [(s + 1)s(s − 1)/3!] ∆^3 y_{−2} + [(s + 2)(s + 1)s(s − 1)/4!] ∆^4 y_{−2} + · · · + [(s + n − 1) · · · (s + 1)s(s − 1) · · · (s − n + 1)/(2n − 1)!] ∆^{2n−1} y_{−n}. (3.55)
Shifting the origin to x1 (replacing y0 by y1 and s by s − 1) in (3.52) gives the modified backward formula
φ1(x) = y1 + (s − 1)∆y0 + [s(s − 1)/2!] ∆^2 y0 + [s(s − 1)(s − 2)/3!] ∆^3 y_{−1} + [(s + 1)s(s − 1)(s − 2)/4!] ∆^4 y_{−1} + · · · . (3.56)
Taking the arithmetic mean of (3.56) and Gauss's forward interpolation formula (3.45),
φ(x) = [φ1(x) + φ_forward(x)]/2
= (y0 + y1)/2 + (s − 1/2)∆y0 + [s(s − 1)/2!] (∆^2 y0 + ∆^2 y_{−1})/2 + [(s − 1/2)s(s − 1)/3!] ∆^3 y_{−1} + [s(s − 1)(s + 1)(s − 2)/4!] (∆^4 y_{−2} + ∆^4 y_{−1})/2 + [(s − 1/2)s(s − 1)(s + 1)(s − 2)/5!] ∆^5 y_{−2} + · · · + [(s − 1/2)s(s − 1)(s + 1) · · · (s + n − 2)(s − (n − 1))/(2n − 1)!] ∆^{2n−1} y_{−(n−1)}, (3.57)
where s = (x − x0)/h.
Introducing u = s − 1/2 = (x − x0)/h − 1/2, the above formula reduces to
φ(x) = (y0 + y1)/2 + u∆y0 + [(u^2 − 1/4)/2!] (∆^2 y_{−1} + ∆^2 y0)/2 + [u(u^2 − 1/4)/3!] ∆^3 y_{−1} + · · · + [u(u^2 − 1/4)(u^2 − 9/4) · · · (u^2 − (2n − 3)^2/4)/(2n − 1)!] ∆^{2n−1} y_{−(n−1)}. (3.58)
The formulae given by (3.57) and (3.58) are known as Bessel's central difference interpolation formula.
Note 3.9.1 (a) Bessel's formula gives the best result when the starting point x0 is so chosen that −0.25 < u < 0.25, i.e., −0.25 < s − 0.5 < 0.25, or 0.25 < s < 0.75.
(b) This central difference formula is used when the interpolating point is near the middle of the table and the number of arguments is even.
Example 3.9.1 Use the central difference interpolation formula of Stirling or Bessel
to find the values of y at (i) x = 1.40 and (ii) x = 1.60 from the following table
i xi yi ∆yi ∆ 2 yi ∆ 3 yi
−2 1.00 1.0000
0.0772
−1 1.25 1.0772 –0.0097
0.0675 0.0026
0 1.50 1.1447 –0.0071
0.0604 0.0015
1 1.75 1.2051 –0.0056
0.0548
2 2.00 1.2599
Solution. (i) Here x0 = 1.50, h = 0.25 and s = (1.40 − 1.50)/0.25 = −0.4. By Bessel's formula,
y(1.40) = (y0 + y1)/2 + (s − 1/2)∆y0 + [s(s − 1)/2!] (∆^2 y0 + ∆^2 y_{−1})/2 + (1/3!)(s − 1/2)s(s − 1) ∆^3 y_{−1}
= (1.1447 + 1.2051)/2 + (−0.4 − 0.5) × 0.0604 + [(−0.4)(−0.4 − 1)/2] × [(−0.0071 − 0.0056)/2] + (1/6)(−0.4 − 0.5)(−0.4)(−0.4 − 1) × 0.0015
= 1.118636.
Everett's central difference interpolation formula is
φ(x) = u y0 + [u(u^2 − 1^2)/3!] ∆^2 y_{−1} + [u(u^2 − 1^2)(u^2 − 2^2)/5!] ∆^4 y_{−2} + · · · + s y1 + [s(s^2 − 1^2)/3!] ∆^2 y0 + [s(s^2 − 1^2)(s^2 − 2^2)/5!] ∆^4 y_{−1} + · · · ,
where s = (x − x0)/h and u = 1 − s.
Example 3.10.1 Use Everett’s interpolation formula to find the value of y when
x = 1.60 from the following table.
x : 1.0 1.25 1.50 1.75 2.00 2.25
y : 1.0000 1.1180 1.2247 1.3229 1.4142 1.5000
Deduction of Everett's formula from Bessel's formula. Starting from Bessel's formula (3.57),
φ(x) = (y0 + y1)/2 + (s − 1/2)∆y0 + [s(s − 1)/2!] (∆^2 y0 + ∆^2 y_{−1})/2 + [(s − 1/2)s(s − 1)/3!] ∆^3 y_{−1} + · · ·
= (y0 + y1)/2 + (s − 1/2)(y1 − y0) + [s(s − 1)/2!] (∆^2 y0 + ∆^2 y_{−1})/2 + [(s − 1/2)s(s − 1)/3!] (∆^2 y0 − ∆^2 y_{−1}) + · · ·
= (1 − s)y0 + [s(s − 1)/4 − s(s − 1)(s − 1/2)/6] ∆^2 y_{−1} + · · · + s y1 + [s(s − 1)/4 + s(s − 1)(s − 1/2)/6] ∆^2 y0 + · · ·
= s y1 + [s(s^2 − 1^2)/3!] ∆^2 y0 + · · · + u y0 + [u(u^2 − 1^2)/3!] ∆^2 y_{−1} + · · · ,
where u = 1 − s.
This is Everett's formula up to second order differences. From this deduction it is clear that Everett's formula truncated after the second order differences is equivalent to Bessel's formula truncated after the third differences. Conversely, Bessel's formula may be deduced from Everett's formula.
The following difference table shows how the differences of different orders are used in the different interpolation formulae.
x       y       ∆y        ∆^2 y        ∆^3 y        ∆^4 y
x_{−3}  y_{−3}
                ∆y_{−3}
x_{−2}  y_{−2}            ∆^2 y_{−3}
                ∆y_{−2}                ∆^3 y_{−3}
x_{−1}  y_{−1}            ∆^2 y_{−2}                ∆^4 y_{−3}
                ∆y_{−1}                ∆^3 y_{−2}
x0      y0                ∆^2 y_{−1}                ∆^4 y_{−2}
                ∆y0                    ∆^3 y_{−1}
x1      y1                ∆^2 y0                    ∆^4 y_{−1}
                ∆y1                    ∆^3 y0
x2      y2                ∆^2 y1                    ∆^4 y0
                ∆y2                    ∆^3 y1
x3      y3                ∆^2 y2
[In the original table, arrows mark the entries used by Newton's forward, Newton's backward, Stirling's and Bessel's formulae; Stirling's formula uses the differences on the horizontal line through y0, and Bessel's formula those between the lines through y0 and y1.]
Note 3.10.1 In Newton's forward and backward interpolation formulae, the first or the last interpolating point is taken as the initial point x0. But in the central difference interpolation formulae, a middle point is taken as the initial point x0.
In general,
p_{0j}(x) = [1/(xj − x0)] | y0  x0 − x |
                          | yj  xj − x |, j = 1, 2, . . . , n, (3.61)
where the 2 × 2 array denotes a determinant. Here p_{0j}(x) is a polynomial of degree less than or equal to 1, for the points x0 and xj.
The polynomial
p_{01j}(x) = [1/(xj − x1)] | p_{01}(x)  x1 − x |
                           | p_{0j}(x)  xj − x |, j = 2, 3, . . . , n. (3.62)
Example 3.11.1 Find the value of y(1.52) by iterated linear interpolation using the following table.
x : 1.40 1.50 1.60 1.70
y : 1.8330 1.9365 2.0396 2.1424
Solution. Here x = 1.52. Now,
p_{0j} = [1/(xj − x0)] | y0  x0 − x |
                       | yj  xj − x |, j = 1, 2, 3.
p_{01} = (1/0.1) | 1.8330  −0.12 |
                 | 1.9365  −0.02 | = 1.9572.
p_{02} = (1/0.2) | 1.8330  −0.12 |
                 | 2.0396   0.08 | = 1.9570.
p_{03} = (1/0.3) | 1.8330  −0.12 |
                 | 2.1424   0.18 | = 1.9568.
p_{01j} = [1/(xj − x1)] | p_{01}  x1 − x |
                        | p_{0j}  xj − x |, j = 2, 3.
p_{012} = (1/0.1) | 1.9572  −0.02 |
                  | 1.9570   0.08 | = 1.9572.
p_{013} = (1/0.2) | 1.9572  −0.02 |
                  | 1.9568   0.18 | = 1.9572.
p_{012j} = [1/(xj − x2)] | p_{012}  x2 − x |
                         | p_{01j}  xj − x |, j = 3.
p_{0123} = (1/0.1) | 1.9572  0.08 |
                   | 1.9572  0.18 | = 1.9572.
Therefore the interpolated value of y at x = 1.52 is 1.9572.
Program 3.3.
/* Program Aitken Interpolation
This program implements Aitken's iterated interpolation
formula; xg is the interpolating point. */
#include <stdio.h>
#include <math.h>
void main()
{
int n,j,k; float x[20],y[20],xg,p[20][20],xd[20];
printf("Enter the value of n and the data points in the form x[i],y[i] ");
scanf("%d",&n);
for(j=0;j<=n;j++) scanf("%f %f",&x[j],&y[j]);
printf("Enter the value of x "); scanf("%f",&xg);
for(j=0;j<=n;j++){ p[j][0]=y[j]; xd[j]=x[j]-xg; }
for(k=1;k<=n;k++) /* Aitken's recurrence */
 for(j=k;j<=n;j++)
  p[j][k]=(p[k-1][k-1]*xd[j]-p[j][k-1]*xd[k-1])/(x[j]-x[k-1]);
for(j=0;j<=n;j++){ printf("%10.4f %9.4f",x[j],y[j]);
 for(k=1;k<=j;k++) printf(" %9.4f",p[j][k]);
 printf(" %9.4f\n",xd[j]); }
printf("The value of y at x=%9.3f is %8.5f\n",xg,p[n][n]);
}
A sample of input/output:
Enter the value of n and the data points in the form x[i],y[i] 4
1921 46
1931 68
1941 83
1951 95
1961 105
Enter the value of x 1955
1921.0000 46.0000 -34.0000
1931.0000 68.0000 120.8000 -24.0000
1941.0000 83.0000 108.9000 92.2400 -14.0000
1951.0000 95.0000 101.5333 97.6800 99.8560 -4.0000
1961.0000 105.0000 96.1500 101.0800 98.4280 99.2848 6.0000
The value of y at x= 1955.000 is 99.28481
f[x0] = f(x0),
f[x0, x1] = (f(x0) − f(x1))/(x0 − x1).
In general, f[xi, xj] = (f(xi) − f(xj))/(xi − xj).
Second order divided difference
f[x0, x1, x2] = (f[x0, x1] − f[x1, x2])/(x0 − x2).
In general, f[xi, xj, xk] = (f[xi, xj] − f[xj, xk])/(xi − xk).
nth order divided differences
Now,
f[x0, x1, x2] = (f[x0, x1] − f[x1, x2])/(x0 − x2)
  = (1/(x0 − x2)) [ f(x0)/(x0 − x1) + f(x1)/(x1 − x0)
                    − f(x1)/(x1 − x2) − f(x2)/(x2 − x1) ]
  = f(x0)/((x0 − x2)(x0 − x1)) + f(x1)/((x1 − x0)(x1 − x2))
    + f(x2)/((x2 − x0)(x2 − x1)).
6. Divided differences for equal arguments, or confluent divided differences.
If two arguments are equal then the divided difference, as defined, has no meaning,
since the denominator becomes zero. But, by a limiting process, one can define the
divided differences for equal arguments; these are known as confluent divided
differences.
f[x0, x0] = lim(ε→0) f[x0, x0 + ε] = lim(ε→0) (f(x0 + ε) − f(x0))/ε = f′(x0),
provided f(x) is differentiable.

f[x0, x0, x0] = lim(ε→0) f[x0, x0, x0 + ε]
  = lim(ε→0) (f[x0, x0 + ε] − f[x0, x0])/ε
  = lim(ε→0) [ (f(x0 + ε) − f(x0))/ε − f′(x0) ] / ε
  = lim(ε→0) (f(x0 + ε) − f(x0) − ε f′(x0))/ε²        (0/0 form)
  = lim(ε→0) (f′(x0 + ε) − f′(x0))/(2ε)                (by L'Hospital's rule)
  = f″(x0)/2!.

Similarly, it can be shown that f[x0, x0, x0, x0] = f‴(x0)/3!.
In general, for (k + 1) equal arguments,
f[x0, x0, . . . , x0] = f⁽ᵏ⁾(x0)/k!.        (3.66)
In other words,
(dᵏ/dxᵏ) f(x0) = k! f[x0, x0, . . . , x0]   ((k + 1) arguments).   (3.67)
Let f(x) = a0xⁿ + a1xⁿ⁻¹ + · · · + an−1x + an be a polynomial of degree n. Then
f[x, x0] = (f(x) − f(x0))/(x − x0)
  = a0 (xⁿ − x0ⁿ)/(x − x0) + a1 (xⁿ⁻¹ − x0ⁿ⁻¹)/(x − x0)
    + a2 (xⁿ⁻² − x0ⁿ⁻²)/(x − x0) + · · · + an−1 (x − x0)/(x − x0)
  = a0[xⁿ⁻¹ + xⁿ⁻²x0 + xⁿ⁻³x0² + · · · + x x0ⁿ⁻² + x0ⁿ⁻¹]
    + a1[xⁿ⁻² + xⁿ⁻³x0 + xⁿ⁻⁴x0² + · · · + x x0ⁿ⁻³ + x0ⁿ⁻²] + · · · + an−1
  = a0xⁿ⁻¹ + (a0x0 + a1)xⁿ⁻² + (a0x0² + a1x0 + a2)xⁿ⁻³ + · · ·
  = b0xⁿ⁻¹ + b1xⁿ⁻² + b2xⁿ⁻³ + · · · + bn−1,
where b0 = a0, b1 = a0x0 + a1, b2 = a0x0² + a1x0 + a2, . . . ,
bn−1 = a0x0ⁿ⁻¹ + a1x0ⁿ⁻² + · · · + an−1.
Thus, the first order divided difference of a polynomial of degree n is a polynomial of
degree n − 1.
Again, the second order divided difference is
f[x, x0, x1] = (f[x, x0] − f[x0, x1])/(x − x1)
  = b0 (xⁿ⁻¹ − x1ⁿ⁻¹)/(x − x1) + b1 (xⁿ⁻² − x1ⁿ⁻²)/(x − x1)
    + b2 (xⁿ⁻³ − x1ⁿ⁻³)/(x − x1) + · · · + bn−2 (x − x1)/(x − x1)
  = b0[xⁿ⁻² + xⁿ⁻³x1 + xⁿ⁻⁴x1² + · · · + x x1ⁿ⁻³ + x1ⁿ⁻²]
    + b1[xⁿ⁻³ + xⁿ⁻⁴x1 + xⁿ⁻⁵x1² + · · · + x x1ⁿ⁻⁴ + x1ⁿ⁻³] + · · · + bn−2
  = b0xⁿ⁻² + (b0x1 + b1)xⁿ⁻³ + (b0x1² + b1x1 + b2)xⁿ⁻⁴ + · · ·
  = c0xⁿ⁻² + c1xⁿ⁻³ + c2xⁿ⁻⁴ + · · · + cn−2,
where c0 = b0, c1 = b0x1 + b1, c2 = b0x1² + b1x1 + b2, . . . ,
cn−2 = b0x1ⁿ⁻² + b1x1ⁿ⁻³ + · · · + bn−2.
This is a polynomial of degree n − 2. So, the second order divided difference is a
polynomial of degree n − 2.
In this way, it can be shown that the nth order divided difference of a polynomial
of degree n is a constant, equal to a0.
f[x0, x] = (f(x) − f(x0))/(x − x0).
f[x0, x1, x] = (f[x0, x] − f[x0, x1])/(x − x1),
i.e., f[x0, x] = f[x0, x1] + (x − x1)f[x0, x1, x],
i.e., (f(x) − f(x0))/(x − x0) = f[x0, x1] + (x − x1)f[x0, x1, x].
Thus, f(x) = f(x0) + (x − x0)f[x0, x1] + (x − x0)(x − x1)f[x0, x1, x].
Error term
Example 3.13.1 Find the value of y when x = 1.5 from the following table:
x : 1 5 7 10 12
y : 0.6931 1.7918 2.0794 2.3979 2.5649
where
E(x) = (x − xn)(x − xn−1) · · · (x − x1)(x − x0) f[x, xn, xn−1, . . . , x1, x0].
From the relation (3.65), we have
f[xn, xn−1, . . . , xn−k] = ∆ᵏf(xn−k)/(k! hᵏ) = ∇ᵏf(xn)/(k! hᵏ).
Therefore,
φ(x) = f(xn) + (x − xn) ∇f(xn)/(1! h) + (x − xn)(x − xn−1) ∇²f(xn)/(2! h²) + · · ·
       + (x − xn)(x − xn−1) · · · (x − x1) ∇ⁿf(xn)/(n! hⁿ),
and f(x) = φ(x) + E(x), where
E(x) = (x − xn)(x − xn−1) · · · (x − x1)(x − x0) f⁽ⁿ⁺¹⁾(ξ)/(n + 1)!,
with min{x, x0, x1, . . . , xn} < ξ < max{x, x0, x1, . . . , xn}.
For (n + 2) arguments x, x0, . . . , xn,
f[x, x0, x1, . . . , xn]
  = f(x)/((x − x0)(x − x1) · · · (x − xn))
    + Σ_{i=0}^{n} f(xi)/((xi − x)(xi − x0) · · · (xi − xi−1)(xi − xi+1) · · · (xi − xn))
  = f(x)/w(x) + Σ_{i=0}^{n} f(xi)/((xi − x) w′(xi)),
where w(x) = (x − x0)(x − x1) · · · (x − xn).
Note 3.14.1 It is observed that the divided differences for equispaced arguments pro-
duce the Newton forward and backward difference formulae. Also, this interpolation
gives Lagrange’s interpolation formula.
In the following, it is proved that the Lagrange’s interpolation formula and Newton’s
divided difference interpolation formula are equivalent.
The Lagrange’s interpolation polynomial for the points (xi , yi ), i = 0, 1, . . . , n of
degree n is
φ(x) = Σ_{i=0}^{n} Li(x) yi,        (3.70)
where
Li(x) = [(x − x0)(x − x1) · · · (x − xi−1)(x − xi+1) · · · (x − xn)]
        / [(xi − x0)(xi − x1) · · · (xi − xi−1)(xi − xi+1) · · · (xi − xn)].   (3.71)
The coefficient of f(x0) in Newton's divided difference interpolation formula is

1 + (x − x0)/(x0 − x1) + (x − x0)(x − x1)/((x0 − x1)(x0 − x2)) + · · ·
  + (x − x0)(x − x1) · · · (x − xn−1)/((x0 − x1)(x0 − x2) · · · (x0 − xn))

= [(x − x1)/(x0 − x1)] [ 1 + (x − x0)/(x0 − x2) + (x − x0)(x − x2)/((x0 − x2)(x0 − x3))
  + · · · + (x − x0)(x − x2) · · · (x − xn−1)/((x0 − x2)(x0 − x3) · · · (x0 − xn)) ]

= [(x − x1)(x − x2)/((x0 − x1)(x0 − x2))] [ 1 + (x − x0)/(x0 − x3)
  + · · · + (x − x0)(x − x3) · · · (x − xn−1)/((x0 − x3)(x0 − x4) · · · (x0 − xn)) ]

= · · ·

= [(x − x1)(x − x2) · · · (x − xn−1)/((x0 − x1)(x0 − x2) · · · (x0 − xn−1))]
  × [ 1 + (x − x0)/(x0 − xn) ]

= (x − x1)(x − x2) · · · (x − xn−1)(x − xn)/((x0 − x1)(x0 − x2) · · · (x0 − xn−1)(x0 − xn))

= L0(x).
Similarly, it can be shown that the coefficient of f (x1 ) is L1 (x), coefficient of f (x2 )
is L2 (x) and so on.
Thus, (3.72) becomes
φ(x) = L0(x)f(x0) + L1(x)f(x1) + · · · + Ln(x)f(xn) = Σ_{i=0}^{n} Li(x)f(xi).
Thus the Lagrange’s interpolation and Newton’s divided difference interpolation for-
mulae are equivalent.
Example 3.15.1 A function y = f (x) is given at the points x = x0 , x1 , x2 . Show
that the Newton’s divided difference interpolation formula and the corresponding
Lagrange’s interpolation formula are identical.
φ(x) = [ 1 + (x − x0)/(x0 − x1) + (x − x0)(x − x1)/((x0 − x1)(x0 − x2)) ] f(x0)
     + [ (x − x0)/(x1 − x0) + (x − x0)(x − x1)/((x1 − x0)(x1 − x2)) ] f(x1)
     + [ (x − x0)(x − x1)/((x2 − x0)(x2 − x1)) ] f(x2)
     = (x − x1)(x − x2)/((x0 − x1)(x0 − x2)) f(x0)
     + (x − x0)(x − x2)/((x1 − x0)(x1 − x2)) f(x1)
     + (x − x0)(x − x1)/((x2 − x0)(x2 − x1)) f(x2),
which is the Lagrange's interpolation formula for the three points.
Example 3.15.2 For the following table, find the interpolation polynomial using
(i) Lagrange’s formula and (ii) Newton’s divided difference formula, and hence show
that both represent same interpolating polynomial.
x : 0 2 4 8
f (x) : 3 8 11 18
It may be noted that Newton's formula involves fewer arithmetic operations
than Lagrange's formula.
Let
yk = f(xk) = dk,0.
Then
f[xk, xk−1] = dk,1 = (f(xk) − f(xk−1))/(xk − xk−1) = (dk,0 − dk−1,0)/(xk − xk−1),
f[xk, xk−1, xk−2] = dk,2 = (f[xk, xk−1] − f[xk−1, xk−2])/(xk − xk−2)
                        = (dk,1 − dk−1,1)/(xk − xk−2),
f[xk, xk−1, xk−2, xk−3] = dk,3 = (dk,2 − dk−1,2)/(xk − xk−3), and so on.
In general,
dk,i = (dk,i−1 − dk−1,i−1)/(xk − xk−i),   i = 1, 2, . . . , n.   (3.73)
Using the above notations, the Newton’s divided difference formula (3.68) can be
written in the following form.
In interpolation, for a given set of values of x and y, the value of y is determined for a
given value of x. Inverse interpolation is the reverse process, which finds the value of
x for a given y. Commonly used inverse interpolation formulae are based on successive
iteration.
In the following, three inverse interpolation formulae based on Lagrange, Newton
forward and Newton backward interpolation formulae are described. The inverse inter-
polation based on Lagrange’s formula is a direct method while the formulae based on
Newton’s interpolation formulae are iterative.
y = Σ_{i=0}^{n} [ w(x)/((x − xi) w′(xi)) ] yi.
Interchanging the roles of x and y,
x = Σ_{i=0}^{n} [ w(y)/((y − yi) w′(yi)) ] xi = Σ_{i=0}^{n} Li(y) xi,
where
Li(y) = w(y)/((y − yi) w′(yi))
      = [(y − y0)(y − y1) · · · (y − yi−1)(y − yi+1) · · · (y − yn)]
        / [(yi − y0)(yi − y1) · · · (yi − yi−1)(yi − yi+1) · · · (yi − yn)].
This formula gives the value of x for given value of y and the formula is known as
Lagrange’s inverse interpolation formula.
Next, the second approximation, u(2), is obtained by neglecting third and higher order
differences as follows:
u(2) = (1/∆y0) [ y − y0 − (u(1)(u(1) − 1)/2!) ∆²y0 ].
In general,
u(k+1) = (1/∆y0) [ y − y0 − (u(k)(u(k) − 1)/2!) ∆²y0
                 − (u(k)(u(k) − 1)(u(k) − 2)/3!) ∆³y0 − · · · ],   k = 0, 1, 2, . . . .
This process of approximation is continued until two successive approximations
u(k+1) and u(k) become equal up to the desired number of decimal places. Then the
value of x is obtained from the relation x = x0 + u(k+1) h.
Example 3.16.1 From the table of values
x : 1.8 2.0 2.2 2.4 2.6
y : 3.9422 4.6269 5.4571 6.4662 7.6947
find x when y = 5.0 using the method of successive approximations.
x y ∆y ∆2 y ∆3 y
1.8 3.9422
0.6847
2.0 4.6269 0.1455
0.8302 0.0334
2.2 5.4571 0.1789
1.0091 0.0405
2.4 6.4662 0.2194
1.2285
2.6 7.6947
and so on.
In general,
v(k+1) = (1/∇yn) [ y − yn − (v(k)(v(k) + 1)/2!) ∇²yn
                 − (v(k)(v(k) + 1)(v(k) + 2)/3!) ∇³yn − · · · ],   k = 0, 1, 2, . . . .
This iteration continues until two consecutive values v (k) and v (k+1) become equal up
to a desired number of significant figures.
The value of x is given by x = xn + v (k+1) h.
Suppose x = α is a root of the equation f(x) = 0 and let it lie between a and b, i.e.,
a < α < b. Now, a table is constructed for some values of x within (a, b) and the
corresponding values of y. Then, by inverse interpolation, the value of x is determined
for y = 0. This value of x is the required root.
Solution. Let y = x3 − 3x + 1. One root of this equation lies between 1/4 and 1/2.
Let us consider the points x = 0.25, 0.30, 0.35, 0.40, 0.45, 0.50. The table is shown
below.
x y ∆y ∆2 y ∆3 y
0.25 0.265625
0.30 0.127000 –0.138625
0.35 –0.007125 –0.134125 0.00450
0.40 –0.136000 –0.128875 0.00525 0.00075
0.45 –0.258875 –0.122875 0.00600 0.00075
0.50 –0.375000 –0.116125 0.00675 0.00075
Now,
u(1) = −y0/∆y0 = −0.265625/(−0.138625) = 1.916140.
u(2) = −(1/∆y0) [ y0 + (u(1)(u(1) − 1)/2!) ∆²y0 ]
     = (1/0.138625) [ 0.265625 + (1.916140 × 0.916140/2) × 0.00450 ]
     = 1.944633.
u(3) = −(1/∆y0) [ y0 + (u(2)(u(2) − 1)/2!) ∆²y0 + (u(2)(u(2) − 1)(u(2) − 2)/3!) ∆³y0 ]
     = (1/0.138625) [ 0.265625 + (1.944633 × 0.944633/2) × 0.00450
                      + (1.944633 × 0.944633 × (−0.055367)/6) × 0.000750 ]
     = 1.945864.
If the interpolating points are not equally spaced then Lagrange’s, Newton’s divided
difference or Aitken’s iterated interpolation formulae may be used. Newton’s forward
formula is appropriate for interpolation at the beginning of the table, Newton’s back-
ward formula for interpolation at the end of the table, Stirling’s or Bessel’s formula for
interpolation at the centre of the table. It is well known that the interpolation poly-
nomial is unique and the above formulae are just different forms of one and the same
interpolation polynomial and the results obtained by the different formulae should be
identical. Practically, only a subset of the set of given interpolating points in the ta-
ble is used. For interpolation at the beginning of the table, it is better to take this
subset from the beginning of the table. This reason recommends the use of Newton’s
forward formula for interpolation at the beginning of the table. For interpolation, near
the end of the table, interpolating points should be available at the end of the table and
hence Newton’s backward formula is used for interpolation at the end of the table. For
the same reasons the central difference formulae like Stirling’s, Bessel’s, Everett’s etc.
are used for interpolation near the centre of the table. The proper choice of a central
interpolation formulae depends on the error terms of the different formulae.
For interpolation near the centre of the table, Stirling’s formula gives the most ac-
curate result for −1/4 ≤ s ≤ 1/4, and Bessel’s formula gives most accurate result near
s = 1/2, i.e., for 1/4 ≤ s ≤ 3/4. If all the terms of the formulae are considered, then
both the formulae give identical result. But, if some terms are discarded to evaluate the
polynomial, then Stirling’s and Bessel’s formulae, in general, do not give the same result
and then a choice must be made between them. The choice depends on the order of the
highest difference that could be neglected so that contributions from it and further dif-
ferences would be less than half a unit in the last decimal place. If the highest difference
is of odd order, then Stirling’s formula is used and if it is of even order, then, generally,
Bessel’s formula is used. This conclusion is drawn from the following comparison.
The term of Stirling's formula containing the third differences is
(s(s² − 1)/3!) · (∆³y−1 + ∆³y−2)/2. Its contribution is less than half a unit in
the last place if |s(s² − 1)/6| |∆³y| < 1/2; since the maximum value of
|s(s² − 1)/6| for |s| ≤ 1/2 is 0.0625, this requires |∆³y| < 8.
The term containing the third order difference in Bessel's formula will be less than
half a unit in the last place if
| (s(s − 1)(s − 1/2)/6) ∆³y−1 | < 1/2.
The maximum value of |s(s − 1)(s − 1/2)/6| for 0 ≤ s ≤ 1 is about 0.008, so that
|∆³y−1| < 62.5.
Thus, if the third difference is to be ignored, Bessel's formula gives about eight times
more accurate result than Stirling's formula. But, if the third differences need to be
retained and their magnitude exceeds 62.5, then Everett's formula is more appropriate.
It may be reminded that the Bessel’s formula with third differences is equivalent to
Everett’s formula with second differences.
Depending on these discussions the following working rules are recommended for use
of interpolation formulae.
(i) If the interpolating point is at the beginning of the table, then use Newton’s
forward formula with a suitable starting point x0 such that 0 < u < 1.
(ii) If the interpolating point is at the end of the table, then use Newton’s backward
formula with a suitable starting point xn such that −1 < u < 0.
(iii) If the interpolating point is at the centre of the table and the difference table ends
with odd order differences, then use Stirling’s formula.
(iv) If the interpolating point is at the centre of the table and the difference table ends
with even order differences, then use Bessel’s or Everett’s formula.
(v) If the arguments are not equispaced then use Lagrange’s formula or Newton’s
divided difference formula or Aitken’s iterative formula.
In some functions, a higher degree interpolation polynomial does not always give
the best result compared to a lower degree polynomial. This fact is illustrated in the
following example.
Let f(x) = 1/(1 + x²). On [−3, 3] the second degree interpolation polynomial
y = φ2(x) = 1 − 0.1x² and the fourth degree polynomial
y = φ4(x) = 0.15577u⁴ − 1.24616u³ + 2.8904u² − 1.59232u + 0.1,
where u = (x + 3)/1.5, are derived. The graphs of the curves y = f(x), y = φ2(x) and
y = φ4(x) are shown in Figure 3.4.

Figure 3.4: The graph of the curves y = f(x), y = φ2(x) and y = φ4(x).
For this function y = f(x), the fourth degree polynomial gives an absurd result at
x = 2. At this point f(2) = 0.2, φ2(2) = 0.6 and φ4(2) = −0.01539. It may be
noted that the functions y = f(x) and y = φ2(x) are positive for all values of x, but
y = φ4(x) is negative for some values of x. This example indicates that a higher
degree polynomial does not always give a more accurate result.
The interpolation formulae considered so far make use of the function values at some
number of points, say, n + 1 number of points and an nth degree polynomial is obtained.
But, if the values of the function y = f (x) and its first derivatives are known at n + 1
points then it is possible to determine an interpolating polynomial φ(x) of degree (2n+1)
which satisfies the (2n + 2) conditions
φ(xi) = f(xi),  φ′(xi) = f′(xi),   i = 0, 1, 2, . . . , n.   (3.74)
This formula is known as Hermite’s interpolation formula. Here, the number of
conditions is (2n + 2), the number of coefficients to be determined is (2n + 2) and the
degree of the polynomial is (2n + 1).
Let us assume the Hermite’s interpolating polynomial in the form
φ(x) = Σ_{i=0}^{n} hi(x) f(xi) + Σ_{i=0}^{n} Hi(x) f′(xi),   (3.75)
where hi (x) and Hi (x), i = 0, 1, 2, . . . , n, are polynomial in x of degree at most (2n + 1).
Using conditions (3.74), we get
hi(xj) = 1 if i = j, and 0 if i ≠ j;   Hi(xj) = 0 for all i, j;   (3.76)
hi′(xj) = 0 for all i, j;   Hi′(xj) = 1 if i = j, and 0 if i ≠ j.   (3.77)
Let us consider the Lagrangian function
Li(x) = [(x − x0)(x − x1) · · · (x − xi−1)(x − xi+1) · · · (x − xn)]
        / [(xi − x0)(xi − x1) · · · (xi − xi−1)(xi − xi+1) · · · (xi − xn)],
i = 0, 1, 2, . . . , n.   (3.78)
Obviously,
Li(xj) = 1 if i = j, and 0 if i ≠ j.   (3.79)
Since each Li(x) is a polynomial of degree n, [Li(x)]² is a polynomial of degree 2n
which also satisfies (3.79). Since hi(x) and Hi(x) are polynomials in x of degree
(2n + 1), their explicit forms may be taken as
hi(x) = (ai x + bi)[Li(x)]²  and  Hi(x) = (ci x + di)[Li(x)]².
The conditions (3.76) and (3.77) then give
ai xi + bi = 1,
ci xi + di = 0,
ai + 2Li′(xi) = 0,
ci = 1.   (3.81)
Hence,
hi(x) = [1 − 2(x − xi)Li′(xi)][Li(x)]²  and  Hi(x) = (x − xi)[Li(x)]².
Solution.
L0(x) = (x − x1)(x − x2)/((x0 − x1)(x0 − x2))
      = (x − 2.0)(x − 2.5)/((1.5 − 2.0)(1.5 − 2.5)) = 2x² − 9x + 10,
L1(x) = (x − x0)(x − x2)/((x1 − x0)(x1 − x2))
      = (x − 1.5)(x − 2.5)/((2.0 − 1.5)(2.0 − 2.5)) = −4x² + 16x − 15,
L2(x) = (x − x0)(x − x1)/((x2 − x0)(x2 − x1))
      = (x − 1.5)(x − 2.0)/((2.5 − 1.5)(2.5 − 2.0)) = 2x² − 7x + 6.
Therefore
L0′(x) = 4x − 9,  L1′(x) = −8x + 16,  L2′(x) = 4x − 7.
Hence L0′(x0) = −3, L1′(x1) = 0, L2′(x2) = 3.
Spline interpolation is a very powerful and widely used method, with many applications
in numerical differentiation, integration, the solution of boundary value problems, and
two- and three-dimensional graph plotting. The spline interpolation method
interpolates a function between a given set of points by means of piecewise smooth
polynomials. In this interpolation, the curve passes through the given set of points,
and its slope and curvature are continuous at each point. Splines of different degrees
are found in the literature; among them, cubic splines are the most widely used.
It may be noted that, at the endpoints x0 and xn , no continuity on slope and curvature
are assigned. The conditions at these points are assigned, generally, depending on the
applications.
Let the interval [xi, xi+1], i = 0, 1, . . . , n − 1, be denoted as the ith interval.
Let hi = xi − xi−1, i = 1, 2, . . . , n, and Mi = y″(xi), i = 0, 1, 2, . . . , n.
Let the cubic spline on the ith interval be
y(x) = ai(x − xi)³ + bi(x − xi)² + ci(x − xi) + di,   (3.90)
so that, at x = xi,
yi = di,
and
bi = Mi/2,   (3.94)
ai = (Mi+1 − Mi)/(6hi+1).   (3.95)
Also, at x = xi+1,
yi+1 = ((Mi+1 − Mi)/(6hi+1)) hi+1³ + (Mi/2) hi+1² + ci hi+1 + yi,
i.e., ci = (yi+1 − yi)/hi+1 − (2hi+1 Mi + hi+1 Mi+1)/6.   (3.96)
Substituting the values of ai−1, bi−1, ci−1 and ci into the above equation, the
following equation is obtained.
(v) If M0 = y0″ and Mn = yn″ are specified. If a spline satisfies these conditions then
it is called an endpoint curvature-adjusted spline.
Let Ai = hi, Bi = 2(hi + hi+1), Ci = hi+1 and
Di = 6[ (yi+1 − yi)/hi+1 − (yi − yi−1)/hi ].
Then (3.99) becomes
Ai Mi−1 + Bi Mi + Ci Mi+1 = Di,   i = 1, 2, . . . , n − 1,
and, for the natural spline, M0 = Mn = 0.
Imposing the conditions for the non-periodic (clamped) spline, we find
2M0 + M1 = D0   (3.103)
and Mn−1 + 2Mn = Dn,   (3.104)
where D0 = (6/h1) [ (y1 − y0)/h1 − y0′ ]
and Dn = (6/hn) [ yn′ − (yn − yn−1)/hn ].   (3.105)
Then equations (3.101), (3.103), (3.104) and (3.105) result in the following
tri-diagonal system for the unknowns M0, M1, . . . , Mn:

| 2   1   0   0  · · ·  0     0     0    | | M0   |   | D0   |
| A1  B1  C1  0  · · ·  0     0     0    | | M1   |   | D1   |
| 0   A2  B2  C2 · · ·  0     0     0    | | M2   | = | D2   |   (3.106)
| · · ·                                  | | ...  |   | ...  |
| 0   0   0   0  · · ·  An−1  Bn−1  Cn−1 | | Mn−1 |   | Dn−1 |
| 0   0   0   0  · · ·  0     1     2    | | Mn   |   | Dn   |
For the extrapolated spline, the values of M0 and Mn are given by the relations
M0 = M1 − h1(M2 − M1)/h2  and  Mn = Mn−1 + hn(Mn−1 − Mn−2)/hn−1.   (3.107)
The first relation converts the equation for i = 1 into
M1 [A1 + B1 + A1h1/h2] + M2 [C1 − A1h1/h2] = D1,  or  M1 B1′ + M2 C1′ = D1,
where B1′ = A1 + B1 + A1h1/h2 and C1′ = C1 − A1h1/h2.
Similarly, the second relation transforms the equation for i = n − 1 into
Mn−2 An−1′ + Mn−1 Bn−1′ = Dn−1, where
An−1′ = An−1 − Cn−1hn/hn−1  and  Bn−1′ = Bn−1 + Cn−1 + Cn−1hn/hn−1.
For this case, the tri-diagonal system of equations for M1, M2, . . . , Mn−1, with the
modified first and last rows, is

| B1′  C1′  0   0  · · ·  0     0     | | M1   |   | D1   |
| A2   B2   C2  0  · · ·  0     0     | | M2   |   | D2   |
| 0    A3   B3  C3 · · ·  0     0     | | ...  | = | ...  |   (3.108)
| · · ·                               | |      |   |      |
| 0    0    0   0  · · ·  An−1′ Bn−1′ | | Mn−1 |   | Dn−1 |
Example 3.19.1 Fit a cubic spline curve that passes through (0, 0.0), (1, 0.5), (2,
2.0) and (3, 1.5) with the natural end boundary conditions y″(0) = y″(3) = 0.
Solution. Here the intervals are (0, 1), (1, 2) and (2, 3), i.e., three intervals of
x, in each of which we can construct a cubic spline. These piecewise cubic spline
polynomials together give the cubic spline curve y(x) on the entire interval (0, 3).
Here h1 = h2 = h3 = 1.
Then equation (3.99) becomes
4M1 + M2 = 6(y2 − 2y1 + y0) = 6,
M1 + 4M2 = 6(y3 − 2y2 + y1) = −12, with M0 = M3 = 0,
giving M1 = 2.4 and M2 = −3.6.
The spline in the ith interval is
pi(x) = ai(x − xi)³ + bi(x − xi)² + ci(x − xi) + di,   i = 0, 1, 2.
Therefore,
a0 = (M1 − M0)/6 = 0.4,  b0 = M0/2 = 0,
c0 = (y1 − y0)/1 − (2M0 + M1)/6 = 0.1,  d0 = y0 = 0.
a1 = (M2 − M1)/6 = −1,  b1 = M1/2 = 6/5,
c1 = (y2 − y1)/1 − (2M1 + M2)/6 = 1.3,  d1 = y1 = 0.5.
a2 = (M3 − M2)/6 = 3/5,  b2 = M2/2 = −9/5,
c2 = (y3 − y2)/1 − (2M2 + M3)/6 = 0.7,  d2 = y2 = 2.0.
Hence the required piecewise cubic splines in each interval are
p0(x) = 0.4x³ + 0.1x,  0 ≤ x ≤ 1,
p1(x) = −(x − 1)³ + 1.2(x − 1)² + 1.3(x − 1) + 0.5,  1 ≤ x ≤ 2,
p2(x) = 0.6(x − 2)³ − 1.8(x − 2)² + 0.7(x − 2) + 2.0,  2 ≤ x ≤ 3.
Example 3.19.2 Fit a cubic spline curve for the following data with end conditions
y′(0) = 0.2 and y′(3) = −1.
x : 0 1 2 3
y : 0 0.5 3.5 5
Solution. Here, the three intervals (0, 1) (1, 2) and (2, 3) are given in each of which
the cubic splines are to be constructed. These cubic spline functions are denoted by
y0 , y1 and y2 . In this example, h1 = h2 = h3 = 1.
Using equation (3.99) together with the clamped end conditions (3.103)–(3.105),
we get
M0 + 4M1 + M2 = 15,
M1 + 4M2 + M3 = −9,
2M0 + M1 = 6(0.5 − 0.2) = 1.8,
M2 + 2M3 = 6(−1 − 5 + 3.5) = −15.
Solving, M0 = −1.36, M1 = 4.52, M2 = −1.72, M3 = −6.64.
Therefore,
a0 = (M1 − M0)/6 = 0.98, b0 = −0.68, c0 = 0.2, d0 = 0,
a1 = (M2 − M1)/6 = −1.04, b1 = 2.26, c1 = 1.78, d1 = 0.5,
a2 = (M3 − M2)/6 = −0.82, b2 = −0.86, c2 = 3.18, d2 = 3.5.
Hence, the required piecewise cubic spline polynomials in each interval are
p0(x) = 0.98x³ − 0.68x² + 0.2x,  0 ≤ x ≤ 1,
p1(x) = −1.04(x − 1)³ + 2.26(x − 1)² + 1.78(x − 1) + 0.5,  1 ≤ x ≤ 2,
p2(x) = −0.82(x − 2)³ − 0.86(x − 2)² + 3.18(x − 2) + 3.5,  2 ≤ x ≤ 3.
Solution. Let
p0(x) = −(11/2)x³ + 26x² − (75/2)x + 18,  1 ≤ x ≤ 2,
and p1(x) = (11/2)x³ − 40x² + (189/2)x − 70,  2 ≤ x ≤ 3.
Here x0 = 1, x1 = 2 and x2 = 3. The function f(x) will be a cubic spline if
(a) p0 and p1 agree with the values of f(x) at x0, x1, x2, and (b) the first and
second derivatives of p0 and p1 are continuous at the interior point x1.
But here the values of f(x0), f(x1) and f(x2) are not supplied, so only the
conditions of (b) are to be checked.
Now,
p0′(x) = −(33/2)x² + 52x − 75/2,  p1′(x) = (33/2)x² − 80x + 189/2,
p0″(x) = −33x + 52,  p1″(x) = 33x − 80.
p0′(x1) = p0′(2) = 0.5, p1′(x1) = p1′(2) = 0.5, i.e., p0′(x1) = p1′(x1).
p0″(x1) = p0″(2) = −14 and p1″(x1) = p1″(2) = −14. Thus p0″(x1) = p1″(x1).
Hence f(x) is a cubic spline.
Algorithm 3.5 (Cubic spline). This algorithm finds the cubic spline on each of
the intervals [xi, xi+1], i = 0, 1, . . . , n − 1. The (i) natural spline, (ii) non-periodic
or clamped cubic spline, (iii) extrapolated spline, and (iv) endpoint curvature-
adjusted spline are incorporated here. The spacing of the xi need not be equal;
it is assumed that x0 < x1 < · · · < xn.
Program 3.5.
/* Program Cubic Spline
This program construct cubic splines at each interval
[x[i-1],x[i]], i=1, 2, ..., n and finds the value of
y=f(x) at a given x when the function is supplied as
(x[i],y[i]), i=0, 1, ..., n. */
#include<stdio.h>
#include<math.h>
#include<ctype.h>
#include<stdlib.h>
float M[21];
void main()
{
int i,n;
char opt,s[5];
float x[20],y[20],h[20],A[20],B[20],C[20],D[20];
float a[20],b[20],c[20],d[20],xg,yd0,ydn,temp,yc;
float TriDiag(float a[],float b[],float c[],float d[],int n);
printf("\nEnter number of subintervals ");
scanf("%d",&n);
printf("Enter x and y values ");
for(i=0;i<=n;i++) scanf("%f %f",&x[i],&y[i]);
printf("Enter interpolating point x ");
scanf("%f",&xg);
printf("The given values of x and y are\nx-value y-value\n");
for(i=0;i<=n;i++) printf("%f %f\n",x[i],y[i]);
/* spacings and the tri-diagonal coefficients Ai, Bi, Ci, Di of (3.99) */
for(i=1;i<=n;i++) h[i]=x[i]-x[i-1];
for(i=1;i<n;i++)
{
 A[i]=h[i]; B[i]=2*(h[i]+h[i+1]); C[i]=h[i+1];
 D[i]=6*((y[i+1]-y[i])/h[i+1]-(y[i]-y[i-1])/h[i]);
}
printf("N Natural spline\nP Non-Periodic spline\n");
printf("E Extrapolated spline\nC End point Curvature adjusted spline\n");
printf("Enter your choice ");
scanf("%s",s); opt=s[0];
switch(toupper(opt))
{
case ’N’: /* Natural spline */
temp=TriDiag(A,B,C,D,n-1);
M[0]=0; M[n]=0;
break;
case ’P’: /* Non-periodic spline */
printf("\nEnter the values of y’[0] and y’[n] ");
scanf("%f %f",&yd0,&ydn);
D[0]=6*((y[1]-y[0])/h[1]-yd0)/h[1];
D[n]=6*(ydn-(y[n]-y[n-1])/h[n])/h[n];
for(i=n+1;i>=1;i--) D[i]=D[i-1];
A[n+1]=1; B[n+1]=2;
for(i=n;i>=2;i--){
A[i]=A[i-1]; B[i]=B[i-1]; C[i]=C[i-1];}
B[1]=2; C[1]=1;
temp=TriDiag(A,B,C,D,n+1);
for(i=0;i<=n;i++) M[i]=M[i+1];
break;
case ’E’: /* Extrapolated spline */
B[1]=A[1]+B[1]+A[1]*h[1]/h[2];
C[1]=C[1]-A[1]*h[1]/h[2];
A[n-1]=A[n-1]-C[n-1]*h[n]/h[n-1];
B[n-1]=B[n-1]+C[n-1]+C[n-1]*h[n]/h[n-1];
temp=TriDiag(A,B,C,D,n-1);
M[0]=M[1]-h[1]*(M[2]-M[1])/h[2];
M[n]=M[n-1]+h[n]*(M[n-1]-M[n-2])/h[n-1];
break;
case ’C’: /* End point Curvature adjusted spline */
printf("\nEnter the values of y’’[0] and y’’[n] ");
scanf("%f %f",&yd0,&ydn);
D[1]=D[1]-A[1]*yd0;
D[n-1]=D[n-1]-C[n-1]*ydn;
temp=TriDiag(A,B,C,D,n-1);
M[0]=yd0; M[n]=ydn;
break;
A sample of input/output:
N Natural spline
P Non-Periodic spline
E Extrapolated spline
C End point Curvature adjusted spline
Enter your choice n
Another input/output:
N Natural spline
P Non-Periodic spline
E Extrapolated spline
C End point Curvature adjusted spline
Enter your choice p
Enter the values of y’[0] and y’[n] 0 1
The cubic splines are
p0(x)= 0.6250(x+ 1.0000)^3- 1.3750(x+ 1.0000)^2+ 0.0000(x+ 1.0000)
+ 1.0000 in [-1.0000, 1.0000]
Like single-variable interpolation, bivariate interpolation has recently become
important due to its extensive use in a wide range of fields, e.g., digital image
processing, digital filter design, computer-aided design, solution of non-linear
simultaneous equations etc.
In this section, some of the important methods are described to construct
interpolation formulae that can be efficiently evaluated.
To construct the formulae, the following two approaches are followed.
(i) Constructing a function that matches exactly the functional values at all the data
points.
(ii) Constructing a function that approximately fits the data. This approach is
desirable when the data are likely to have errors and a smooth function is required.
On the basis of these approaches, one can use four types of methods (i) local matching
methods, (ii) local approximation methods, (iii) global matching methods and (iv) global
approximation methods. In the local methods, the constructed function at any point
depends only on the data at relatively nearby points. In the global methods, the
constructed function at any point depends on all or most of the data points.
In the matching methods, the constructed function matches the given values exactly,
but in the approximation methods the function only approximately fits the data.
Here, the local and the global matching methods are discussed only.
Triangular interpolation
The simplest local interpolating surface is of the form
F (x, y) = a + bx + cy.
The data at the three corners of a triangle determine the coefficients. This procedure
generates a piecewise linear surface which is globally continuous.
Suppose the function f(x, y) is known at the points (x1, y1), (x2, y2) and (x3, y3).
Let f1 = f (x1 , y1 ), f2 = f (x2 , y2 ) and f3 = f (x3 , y3 ).
Let the constructed function be
F(x, y) = a + bx + cy   (3.110)
such that
f1 = a + bx1 + cy1,
f2 = a + bx2 + cy2,
f3 = a + bx3 + cy3.
Example 3.20.1 For a function f (x, y), let f (1, 1) = 8, f (2, 1) = 12 and f (2, 2) =
20. Find the approximate value of f (3/2, 5/4) using triangular interpolation.
Solution. Here
x1 = 1, y1 = 1, f1 = f(x1, y1) = 8,
x2 = 2, y2 = 1, f2 = f(x2, y2) = 12,
x3 = 2, y3 = 2, f3 = f(x3, y3) = 20,
and x = 3/2, y = 5/4.
Therefore, b(x2 − x1) = f2 − f1 gives b = 4, then b(x3 − x1) + c(y3 − y1) = f3 − f1
gives c = 8, and a = f1 − bx1 − cy1 = −4. Thus F(x, y) = −4 + 4x + 8y and
f(3/2, 5/4) ≈ F(3/2, 5/4) = 12.
Bilinear interpolation
Let a function f (x, y) be known at the points (x1 , y1 ), (x1 + h, y1 ), (x1 , y1 + k) and
(x1 + h, y1 + k).
A function F (x, y) is to be constructed within the rectangle formed by these points.
Let f1 = f (x1 , y1 ), f2 = f (x1 + h, y1 ), f3 = f (x1 , y1 + k) and
f4 = f (x1 + h, y1 + k).
Let us construct a function F(x, y) of the form
F(x, y) = a + b(x − x1) + c(y − y1) + d(x − x1)(y − y1)
such that F matches f at the four corner points. Thus
a = f1,  b = (f2 − f1)/h,
c = (f3 − f1)/k,  d = (f4 + f1 − f2 − f3)/(hk).   (3.116)
Example 3.20.2 Find a bilinear interpolation polynomial F (x, y) for the function
f (x, y) where f (1, 1) = 8, f (2, 1) = 10, f (1, 2) = 12 and f (2, 2) = 20. Also, find an
approximate value of f (4/3, 5/3).
Solution. Here
x1 = 1, y1 = 1, f1 = f (x1 , y1 ) = 8
x1 + h = 2, y1 = 1, f2 = f (x1 + h, y1 ) = 10
x1 = 1, y1 + k = 2, f3 = f (x1 , y1 + k) = 12
x1 + h = 2, y1 + k = 2, f4 = f (x1 + h, y1 + k) = 20.
Obviously, h = 1, k = 1.
Thus, a = f1 = 8,  b = (f2 − f1)/h = (10 − 8)/1 = 2,
c = (f3 − f1)/k = (12 − 8)/1 = 4,  d = (f4 + f1 − f2 − f3)/(hk) = 6.
Hence,
F(x, y) = 8 + 2(x − 1) + 4(y − 1) + 6(x − 1)(y − 1),
and f(4/3, 5/3) ≈ F(4/3, 5/3) = 8 + 2/3 + 8/3 + 4/3 = 38/3 ≈ 12.67.
where F is the rearranged array form of the column vector F(xi, yj), and Y(yj), X(xi)
are the matrices derived from Yt(y), X(x) by introducing the points (xi, yj).
That is,

    | F(x1, y1)  F(x1, y2)  · · ·  F(x1, yn) |
F = | F(x2, y1)  F(x2, y2)  · · ·  F(x2, yn) | ,
    | · · ·      · · ·      · · ·  · · ·     |
    | F(xn, y1)  F(xn, y2)  · · ·  F(xn, yn) |

        | 1  x1  x1²  · · ·  x1ⁿ⁻¹ |             | 1  y1  y1²  · · ·  y1ⁿ⁻¹ |
X(xi) = | 1  x2  x2²  · · ·  x2ⁿ⁻¹ | ,  Yt(yj) = | 1  y2  y2²  · · ·  y2ⁿ⁻¹ | .
        | · · ·                    |             | · · ·                    |
        | 1  xn  xn²  · · ·  xnⁿ⁻¹ |             | 1  yn  yn²  · · ·  ynⁿ⁻¹ |
Since the matrices X, Y and F are known, one can calculate the matrix A (assuming
X and Y are non-singular). For example, let

X(xi) = | 1  1 | ,  Yt(yj) = | 1  1 | ,  F = |  6  10 | .
        | 1  2 |             | 1  2 |        | 10  18 |

Then

A* = | 1  1 |⁻¹ |  6  10 | | 1  1 |⁻¹ = |  2  −1 | |  6  10 | |  2  −1 | = | 2  0 | .
     | 1  2 |   | 10  18 | | 1  2 |     | −1   1 | | 10  18 | | −1   1 |   | 0  4 |

Therefore,

F(x, y) = [1  y] | 2  0 | | 1 | = 2 + 4xy.
                 | 0  4 | | x |
       y :      0           1
  x
  0    :      1        1.414214
  1    :   1.732051       2
Find the Lagrange’s bivariate polynomial and hence find an approximate value of
f (0.25, 0.75).
Then
F(x, y) = Σ_{i=0}^{1} Σ_{j=0}^{1} Lx,i(x) Ly,j(y) f(xi, yj)
        = Lx,0(x){Ly,0(y)f(x0, y0) + Ly,1(y)f(x0, y1)}
        + Lx,1(x){Ly,0(y)f(x1, y0) + Ly,1(y)f(x1, y1)}.
Now,
Lx,0(x) = (x − 1)/(0 − 1) = 1 − x,  Ly,0(y) = 1 − y,
Lx,1(x) = x,  Ly,1(y) = y.
Therefore,
F(x, y) = (1 − x)(1 − y) + 1.414214(1 − x)y + 1.732051x(1 − y) + 2xy.
Thus, f(0.25, 0.75) ≈ F(0.25, 0.75) = 1.466248.
Compute sum = sum + [wx/((xg − xi) · wdx(i))] · [wy/((yg − yj) · wdy(j))] · fij;
endfor;
endfor;
Print ‘The value of f(x, y) is ’, sum;
end Lagrange Bivariate

function wdx(j)   //computes the product of (xj − xi) over all i ≠ j
prod = 1;
for i = 0 to m do
  if (i ≠ j) then prod = prod × (xj − xi);
endfor;
return prod;
end wdx(j)

function wdy(j)   //computes the product of (yj − yi) over all i ≠ j
prod = 1;
for i = 0 to n do
  if (i ≠ j) then prod = prod × (yj − yi);
endfor;
return prod;
end wdy(j)
Program 3.6.
/* Program Lagrange bivariate
This program is used to find the value of a function
f(x,y) at a given point (x,y) when a set of values of
f(x,y) is given for different values of x and y, by
Lagrange bivariate interpolation formula. */
#include<stdio.h>
#include<math.h>
float x[20],y[20];
void main()
{
int i,j,n,m;
float xg,yg,f[20][20],wx=1,wy=1,sum=0;
float wdx(int j,int m); float wdy(int j,int n);
printf("Enter the number of subdivisions along x and y ");
scanf("%d %d",&m,&n);
printf("Enter x values ");
for(i=0;i<=m;i++) scanf("%f",&x[i]);
printf("Enter y values ");
for(i=0;i<=n;i++) scanf("%f",&y[i]);
printf("Enter the values of f(x,y) ");
for(i=0;i<=m;i++) for(j=0;j<=n;j++) scanf("%f",&f[i][j]);
printf("Enter the interpolating point ");
scanf("%f %f",&xg,&yg);
for(i=0;i<=m;i++) wx*=(xg-x[i]);
for(j=0;j<=n;j++) wy*=(yg-y[j]);
for(i=0;i<=m;i++) for(j=0;j<=n;j++)
 sum+=wx*wy*f[i][j]/((xg-x[i])*wdx(i,m)*(yg-y[j])*wdy(j,n));
printf("The interpolated value at (%8.5f,%8.5f) is %8.5f\n",xg,yg,sum);
}
float wdx(int j,int m) /* product of (x[j]-x[i]) over i != j */
{
 int i; float p=1;
 for(i=0;i<=m;i++) if(i!=j) p*=(x[j]-x[i]);
 return p;
}
float wdy(int j,int n) /* product of (y[j]-y[i]) over i != j */
{
 int i; float p=1;
 for(i=0;i<=n;i++) if(i!=j) p*=(y[j]-y[i]);
 return p;
}
A sample of input/output:
f(1.000000,2.000000)= 9
f(2.000000,0.000000)= 6
f(2.000000,1.000000)= 9
f(2.000000,2.000000)= 14
Enter the interpolating point 0.5 0.5
The interpolated value at ( 0.50000, 0.50000) is 2.75000
and so on.
Then,
F(x, y) = [1 + (m∆x + n∆y) + (1/2!){m(m − 1)∆xx + 2mn∆xy + n(n − 1)∆yy}
           + · · ·] f(x0, y0).
Substituting m = (x − x0)/h and n = (y − y0)/k, and noting that
m − 1 = (x − x0 − h)/h = (x − x1)/h and n − 1 = (y − y1)/k, we get
F(x, y) = f(x0, y0) + [ ((x − x0)/h) ∆x + ((y − y0)/k) ∆y ] f(x0, y0)
        + (1/2!) [ ((x − x0)(x − x1)/h²) ∆xx + (2(x − x0)(y − y0)/(hk)) ∆xy
        + ((y − y0)(y − y1)/k²) ∆yy ] f(x0, y0) + · · ·   (3.125)
which is called Newton's bivariate interpolating polynomial.
Now introduce the dimensionless quantities u and v defined by x = x0 + uh and
y = y0 + vk. Then x − xs = (u − s)h and y − yt = (v − t)k.
Hence, finally, (3.125) becomes
F(x, y) = f(x0, y0) + [u∆x + v∆y] f(x0, y0)
        + (1/2!)[u(u − 1)∆xx + 2uv∆xy + v(v − 1)∆yy] f(x0, y0) + · · ·   (3.126)
Example 3.20.5 For the following data obtain Newton’s bivariate interpolating
polynomial and hence calculate the values of f (0.75, 0.25) and f (1.25, 1.5).
x\y :  0    1    2
 0  :  1    3    5
 1  : −1    2    5
 2  : −5   −1    3
Solution.
Here h = k = 1, u = (x − x0 )/h = x, v = (y − y0 )/k = y.
Thus,
F (x, y) = 1 + [x × (−2) + y × 2] + (1/2!)[x(x − 1) × (−2) + 2xy × 1 + y(y − 1) × 0]
= 1 − x + 2y − x² + xy.
Hence f (0.75, 0.25) ≈ F (0.75, 0.25) = 0.375 and f (1.25, 1.5) ≈ F (1.25, 1.5) = 3.0625.
x : −1 0 1
f (x) : 2 3 6
L0 (x) = (x − x1 )(x − x2 )/[(x0 − x1 )(x0 − x2 )] = x(x − 1)/2,
L1 (x) = (x − x0 )(x − x2 )/[(x1 − x0 )(x1 − x2 )] = (x + 1)(x − 1)/(−1),
L2 (x) = (x − x0 )(x − x1 )/[(x2 − x0 )(x2 − x1 )] = (x + 1)x/2.
Hence
(x² + 2x + 3)/[(x + 1)x(x − 1)] = f (x)/[(x + 1)x(x − 1)] = 1/(x + 1) − 3/x + 3/(x − 1).
Solution.
Method 1.
Using Lagrange’s formula
Therefore,
y(x) ≈ y0 L0 (x) + y1 L1 (x) + y2 L2 (x) + y3 L3 (x)
= [(x³ − 7x² + 14x − 8)/(−8)] × 1 + [(x³ − 6x² + 8x)/3] × 2
+ [(x³ − 5x² + 4x)/(−4)] × 4 + [(x³ − 3x² + 2x)/24] × 16
= (5/24)x³ − (1/8)x² + (11/12)x + 1.
Thus, y(3) = 8.25.
Hence the missing term is 8.25.
Method 2.
Let us construct a polynomial of degree 3 in the form
Therefore,
y(x) = 1 + (11/12)x − (1/8)x² + (5/24)x³.
Thus y(3) = 8.25.
Example 3.21.3 Let f (x) = log x, x0 = 2 and x1 = 2.1. Use linear interpolation
to calculate an approximate value for f (2.05) and obtain a bound on the truncation
error.
x : 2.0 2.1
y : 0.693147 0.741937
φ(x) = [(x − 2.1)/(2.0 − 2.1)] × 0.693147 + [(x − 2.0)/(2.1 − 2.0)] × 0.741937
= 0.487900x − 0.282653.
The maximum value of |f ″(x)| = 1/x² in 2 ≤ x ≤ 2.1 is |f ″(2.0)| = 0.25.
Then
|E1 (x)| ≤ (0.25/2)|(2.05 − 2)(2.05 − 2.1)| = 0.000313.
Thus the upper bound of the truncation error is 0.000313.
Example 3.21.4 For the following table find the value of y at x = 2.5, using
piecewise linear interpolation.
x : 1 2 3 4 5
y : 35 40 65 72 80
Example 3.21.5 Deduce the following interpolation formula taking three points
x0 , x0 + ε, ε → 0 and x1 using Lagrange’s formula.
where
E(x) = (1/6)(x − x0 )²(x − x1 )f ‴(ξ) and min{x0 , x0 + ε, x1 } ≤ ξ ≤ max{x0 , x0 + ε, x1 }.
Solution. The Lagrange interpolating polynomial for the points x0 , x0 + ε and x1 is
f (x) ≈ [(x − x0 − ε)(x − x1 )/((x0 − x0 − ε)(x0 − x1 ))] f (x0 )
+ [(x − x0 )(x − x1 )/((x0 + ε − x0 )(x0 + ε − x1 ))] f (x0 + ε)
+ [(x − x0 )(x − x0 − ε)/((x1 − x0 )(x1 − x0 − ε))] f (x1 ) + E(x)
= [(x − x0 − ε)(x − x1 )/(−ε(x0 − x1 ))] f (x0 ) + [(x − x0 )(x − x1 )/(ε(x0 + ε − x1 ))] f (x0 )
+ [(x − x0 )(x − x1 )/(x0 − x1 + ε)] · [f (x0 + ε) − f (x0 )]/ε
+ [(x − x0 )(x − x0 − ε)/((x1 − x0 )(x1 − x0 − ε))] f (x1 ) + E(x)
= [(2x0 − x1 − x)(x − x1 )/((x0 − x1 )(x0 − x1 + ε))] f (x0 )
+ [(x − x0 )(x − x1 )/(x0 − x1 + ε)] · [f (x0 + ε) − f (x0 )]/ε
+ [(x − x0 )(x − x0 − ε)/((x1 − x0 )(x1 − x0 − ε))] f (x1 ) + E(x)
= [(x1 − x)(x + x1 − 2x0 )/(x1 − x0 )²] f (x0 ) + [(x − x0 )(x1 − x)/(x1 − x0 )] f ′(x0 )
+ [(x − x0 )²/(x1 − x0 )²] f (x1 ) + E(x) as ε → 0.
Calculate P (1.235).
Solution. The backward difference table is
x P (x) ∇P ∇2 P ∇3 P
1.00 0.682689
1.05 0.706282 0.023593
1.10 0.728668 0.022386 −0.001207
1.15 0.749856 0.021188 −0.001198 0.000009
1.20 0.769861 0.020005 −0.001183 0.000015
1.25 0.788700 0.018839 −0.001166 0.000017
P (1.235) ≈ P (xn ) + v∇P (xn ) + [v(v + 1)/2!]∇²P (xn ) + [v(v + 1)(v + 2)/3!]∇³P (xn )
= 0.788700 − 0.3 × 0.018839 + [−0.3(−0.3 + 1)/2] × (−0.001166)
+ [−0.3(−0.3 + 1)(−0.3 + 2)/6] × 0.000017
= 0.783169.
Example 3.21.7 Find the seventh and the general terms of the series 3, 9, 20, 38,
65, . . ..
Here xn = 5, v = (x − xn )/h = x − 5.
Example 3.21.8 From the following table of sin x compute sin 12° and sin 45°.
x : 10° 20° 30° 40° 50°
y = sin x : 0.17365 0.34202 0.50000 0.64279 0.76604
Example 3.21.9 Use Stirling’s formula to find u32 from the following table
u20 = 14.035, u25 = 13.674, u30 = 13.257,
u35 = 12.734, u40 = 12.089, u45 = 11.309.
Example 3.21.10 The function y = ∛x is tabulated below.
i x y ∆y ∆2 y ∆3 y ∆4 y
−2 5600 17.75808
0.10508
−1 5700 17.86316 −0.00122
0.10386 0.00003
0 5800 17.96702 −0.00119 0.00001
0.10267 0.00004
1 5900 18.06969 −0.00115
0.10152
2 6000 18.17121
(i) For x = 5860, let us take x0 = 5800, then s = (5860 − 5800)/100 = 0.6.
By Bessel’s formula
y(5860) = (y0 + y1 )/2 + (s − 0.5)∆y0 + [s(s − 1)/2!] · (∆²y0 + ∆²y−1 )/2
+ [(s − 0.5)s(s − 1)/3!] ∆³y−1
= (17.96702 + 18.06969)/2 + (0.6 − 0.5) × 0.10267 + · · ·
Interpolation 175
Example 3.21.11 Prove that the third order divided difference of the function f (x) = 1/x with arguments a, b, c, d is −1/(abcd).
Solution. Here f (x) = 1/x.
f [a, b] = [f (b) − f (a)]/(b − a) = (1/b − 1/a)/(b − a) = −1/(ab).
f [a, b, c] = [f [a, b] − f [b, c]]/(a − c) = [−1/(ab) + 1/(bc)]/(a − c) = 1/(abc).
The third order divided difference is
f [a, b, c, d] = [f [a, b, c] − f [b, c, d]]/(a − d) = [1/(abc) − 1/(bcd)]/(a − d) = −1/(abcd).
Example 3.21.12 If f (x) = 1/x, prove that f [x0 , x1 , . . . , xn ] = (−1)ⁿ/(x0 x1 · · · xn ).
Solution.
f [x0 , x1 ] = [f (x1 ) − f (x0 )]/(x1 − x0 ) = (1/x1 − 1/x0 )/(x1 − x0 ) = −1/(x0 x1 ) = (−1)¹/(x0 x1 ).
Solution.
f [x0 , x1 ] = [f (x0 ) − f (x1 )]/(x0 − x1 ) = [u(x0 )v(x0 ) − u(x1 )v(x1 )]/(x0 − x1 )
= {u(x0 )[v(x0 ) − v(x1 )] + v(x1 )[u(x0 ) − u(x1 )]}/(x0 − x1 )
= u(x0 )v[x0 , x1 ] + v(x1 )u[x0 , x1 ].
Example 3.21.14 Show that the nth divided difference f [x0 , x1 , . . . , xn ] can be
expressed as
f [x0 , x1 , . . . , xn ] = D ÷ V, where

        | 1        1        1        · · ·  1        |
        | x0       x1       x2       · · ·  xn       |
    D = | x0^2     x1^2     x2^2     · · ·  xn^2     |
        | · · ·    · · ·    · · ·    · · ·  · · ·    |
        | x0^(n-1) x1^(n-1) x2^(n-1) · · ·  xn^(n-1) |
        | y0       y1       y2       · · ·  yn       |

and

        | 1        1        1        · · ·  1        |
        | x0       x1       x2       · · ·  xn       |
    V = | x0^2     x1^2     x2^2     · · ·  xn^2     |
        | · · ·    · · ·    · · ·    · · ·  · · ·    |
        | x0^(n-1) x1^(n-1) x2^(n-1) · · ·  xn^(n-1) |
        | x0^n     x1^n     x2^n     · · ·  xn^n     |
Let V (x0 , x1 , . . . , xn−1 , x) denote the determinant

    | 1        1        · · ·  1          1     |
    | x0       x1       · · ·  xn−1       x     |
    | x0^2     x1^2     · · ·  xn−1^2     x^2   |
    | · · ·    · · ·    · · ·  · · ·      · · · |
    | x0^n     x1^n     · · ·  xn−1^n     x^n   |.
When x = x0 , x1 , . . . , xn−1 then V = 0,
i.e., (x − x0 ), (x − x1 ), (x − x2 ), . . . , (x − xn−1 ) are the factors of V .
Then one can write
where ∆ is a constant.
Therefore,
V (x0 , x1 , . . . , xn ) = V (x0 , x1 , . . . , xn−1 ) Π_{i=0}^{n−1} (xn − xi ).
Thus,
D ÷ V = Σ_{i=0}^{n} (−1)^i yi V (x0 , x1 , . . . , xi−1 , xi+1 , . . . , xn )/V (x0 , x1 , . . . , xn )
= Σ_{i=0}^{n} (−1)^i yi · (−1)^i/[(xi − x0 )(xi − x1 ) · · · (xi − xi−1 )(xi − xi+1 ) · · · (xi − xn )]
= Σ_{i=0}^{n} yi /[(xi − x0 )(xi − x1 ) · · · (xi − xi−1 )(xi − xi+1 ) · · · (xi − xn )]
= f [x0 , x1 , . . . , xn ].
x : 10 15 17
y : 3 7 11
Then
φ(y) = [(y² − 18y + 77)/32] × 10 − [(y² − 14y + 33)/16] × 15 + [(y² − 10y + 21)/32] × 17
= (1/32)(137 + 70y − 3y²).
Hence, x(10) ≈ φ(10) = (1/32)(137 + 700 − 300) = 16.78125
and x(5) ≈ φ(5) = (1/32)(137 + 350 − 75) = 12.87500.
Example 3.21.16 Use inverse Lagrange’s interpolation to find a root of the equa-
tion y ≡ x3 − 3x + 1 = 0.
Solution. Here y(0) = 1 > 0 and y(1) = −1 < 0. One root lies between 0 and 1.
Now, x and y are tabulated, considering five points of x as 0, 0.25, 0.50, 0.75 and 1.
x : 0 0.25 0.50 0.75 1.00
y : 1 0.26563 −0.37500 −0.82813 −1.0000
Therefore,
x ≈ φ(0) = Σ_{i=0}^{4} Li (0) xi = −0.02234 × 0 + 0.47684 × 0.25
+ 0.88176 × 0.50 − 0.63963 × 0.75 + 0.30338 × 1
= 0.38373.
3.22 Exercise
1. Show that
Σ_{i=1}^{n} w(x)/[(x − xi ) w′(xi )] = 1.
3. Find a polynomial for f (x) where f (0) = 1, f (1) = 2 and f (3) = 5, using La-
grange’s method.
8. Find the formula for the upper bound of the error involved in linearly interpolating
f (x) between a and b. Use the formula to find the maximum error encountered
when f (x) = ∫₀ˣ e^(−t²) dt is interpolated between x = 0 and x = 1.
9. From the following table, find the number of students who obtain less than 35
marks
10. If y(1) = −3, y(3) = 9, y(4) = 30 and y(6) = 132, find the Lagrange’s interpolation
polynomial that takes the same values as the function y at the given points.
11. Let the following observation follows the law of a cubic polynomial
x : 0 1 2 3 4
f (x) : 1 −2 1 16 49
Express the function
(3x² + 2x − 5)/[(x − 1)(x − 2)(x − 3)]
as a sum of partial fractions.
14. Using Lagrange’s interpolation formula, express the function f (x) = 3x2 − 2x + 5
as the sum of products of the factors (x − 1), (x − 2) and (x − 3) taken two at a
time.
15. Compute the missing values of yn and ∆yn in the following table
yn ∆yn ∆2 yn
−
−
− 1
−
− 4
5
6 13
−
− 18
−
− 24
−
−
16. The following table gives pressure of a steam plant at a given temperature. Using
Newton’s formula, compute the pressure for a temperature of 142°C.
17. The following data gives the melting point of an alloy of lead and zinc; where T
is the temperature in °C and P is the percentage of lead in the alloy. Find the
melting point of the alloy containing 84% of lead using Newton’s interpolation
method.
P : 50 60 70 80
T : 222 246 272 299
18. Using a polynomial of third degree, complete the record of the export of a certain
commodity during five years, as given below.
19. Find the polynomial which attains the following values at the given points.
x : −1 0 1 2 3 4
f (x) : −16 −7 −4 −1 8 29
20. Compute log10 2.5 using Newton’s forward difference interpolation formula, given
that
x : 0 1 2 3 4
y : 1 3 9 − 81
22. In the following table, the value of y are consecutive terms of a series of which the
number 36 is the fifth term. Find the first and the tenth terms of the series. Find
also the polynomial which approximates these values.
x : 3 4 5 6 7 8 9
y : 18 26 36 48 62 78 96
23. From the following table determine (a) f (0.27), and (b) f (0.33).
24. The population of a town in decennial census were as under. Estimate the popu-
lation for the year 1955.
25. Using Gauss’s forward formula, find the value of f (32) given that
f (25) = 0.2707, f (30) = 0.3027, f (35) = 0.3386, f (40) = 0.3794.
26. Using Gauss’s backward formula, find the value of √518, given that
√500 = 22.360680, √510 = 22.583100, √520 = 22.803509, √530 = 23.021729.
27. Use a suitable central difference formula of either Stirling’s or Bessel’s to find the
values of f (x) from the following tabulated function at x = 1.35 and at x = 1.42.
28. From Bessel’s formula, derive the following formula for midway interpolation
y_{1/2} = (1/2)(y0 + y1 ) − (1/16)(∆²y−1 + ∆²y0 ) + (3/256)(∆⁴y−2 + ∆⁴y−1 ) − · · ·
Also deduce this formula from Everett’s formula.
29. The function log E, where E = ∫₀^{π/2} √(1 − sin²α sin²θ) dθ, is tabulated below:
α : 0° 5° 10° 15° 20°
log E : 0.196120 0.195293 0.192815 0.188690 0.182928
Compute log E at α = 12° by (a) Bessel’s formula and (b) Stirling’s formula and compare
the results.
for certain equidistant values of α are given below. Use Everett’s or Bessel’s
formula to determine E(0.25).
31. Using Everett’s formula, evaluate f (20) from the following table.
x : 14 18 22 26
f (x) : 2877 3162 3566 3990
32. Using Aitken’s method evaluate y when x = 2 from the following table.
x : 1 3 4 6
y : −3 9 30 132
33. Use the Aitken’s procedure to determine the value of f (0.2) as accurately as
possible from the following table.
34. Show that the first order divided difference of a linear polynomial is independent
of the arguments.
35. Show that the second order divided difference of a quadratic polynomial is con-
stant.
37. For the equidistant values x0 , x1 , x2 , x3 i.e., xi = x0 + ih, establish the following
relations
f [x0 , x1 ] = (1/h)[f (x1 ) − f (x0 )],
f [x0 , x1 , x2 ] = [1/(2!h²)][f (x2 ) − 2f (x1 ) + f (x0 )],
and f [x0 , x1 , x2 , x3 ] = [1/(3!h³)][f (x3 ) − 3f (x2 ) + 3f (x1 ) − f (x0 )].
38. If f (x) = (ax + b)/(cx + d), obtain expressions for f [p, q], f [p, p, q] and f [p, p, q, q].
39. If f (x) = x⁴, obtain expressions for f [a, b, c], f [a, a, b] and f [a, a, a], where a ≠ b ≠ c.
40. If f (x) = 1/(a − x), show that
f [x0 , x1 , . . . , xn ] = 1/[(a − x0 )(a − x1 ) · · · (a − xn )].
41. Use Newton’s divided difference interpolation to find the interpolation polynomial
for the function y = f (x) given by the table:
x : −1 1 4 6
y : 1 −3 21 127
42. Use Newton’s divided difference interpolation to find the interpolating polynomial
for the function y = f (x) given by
x : −1 1 4 6
f (x) : 5 2 26 132
43. Using the given table of value of Bessel’s function y = J0 (x), find the root of the
equation J0 (x) = 0 lying in (2.4, 2.6) correct up to three significant digits.
45. Find x for which cosh x = 1.285, by using the inverse interpolation technique
of successive approximation of Newton’s forward difference interpolation formula,
given the table.
46. Use the technique of inverse interpolation to find x for which sinh x = 5.5 from
the following table.
47. Given the following table of f (x) between x = 1.1 and x = 1.5, find the zero of
f (x).
48. Use the technique of inverse interpolation to find a real root of the equation
x3 − 2x − 4 = 0.
49. Using Hermite’s interpolation formula, estimate the value of log 3.2 from the fol-
lowing table
50. Find the Hermite polynomial of the third degree approximating the function y(x)
such that
y(x0 ) = 1, y(x1 ) = 0 and y′(x0 ) = y′(x1 ) = 0.
51. The following values of x and y are calculated from the relation y = x3 + 10
x : 1 2 3 4 5
y : 11 18 37 74 135
Determine the cubic spline p(x) for the interval [2, 3] given that
(a) p″(1) = y″(1) and p″(5) = y″(5), (b) p′(1) = y′(1) and p′(5) = y′(5).
52. Fit a cubic spline to the function defined by the set of points given in the following
table.
(a) M0 = MN = 0,
(b) p′(0.10) = y′(0.10) and p′(0.30) = y′(0.30), and
(c) p″(0.10) = y″(0.10) and p″(0.30) = y″(0.30).
Interpolate in each case for x = 0.12 and state which of the end conditions gives
the best fit.
53. The distance di that a car has travelled at time ti is given below.
time ti : 0 2 4 6 8
distance di : 0 40 160 300 480
Use the values p′(0) and p′(8) = 98, and find the clamped spline for the points.
54. Fit a cubic spline for the points (0,1), (1,0), (2,0), (3,1), (4,2), (5,2) and (6,1) and
p′(0) = −0.6, p′(6) = −1.8 and p″(0) = 1 and p″(6) = −1.
f (x) = { 1 + x,             0 ≤ x ≤ 3
       { 1 + x + (x − 3)²,   3 ≤ x ≤ 4.
f (x, y) = eˣ sin y + y + 1
for
x = 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5
and y = 0.1, 0.2, 0.3, 0.4, 0.5, 0.6.
Hence find the value of f (1.6, 0.33) by two-dimensional interpolation.
57. Using the following data obtain the Lagrange and Newton’s bivariate interpolating
polynomials
x : 0 0 0 1 1 1 2 2 2
y : 0 1 2 0 1 2 0 1 2
f (x, y) : 1 3 7 3 6 11 7 11 17
Chapter 4
Solution of Algebraic and Transcendental Equations
An interval [a, b] is said to be the location of a real root c if f (c) = 0 for some c with a < c < b.
Mainly, two methods are used to locate the real roots of an equation: one is the graphical
method and the other is an analytic method known as the method of tabulation.
Example 4.1.1 Use the graphical method to locate the roots of the equation x³ − 4x − 2 = 0.
[Figure: graphs of y = x³ and y = 4x + 2; the abscissae of their points of intersection locate the roots of x³ − 4x − 2 = 0.]
The graphical method to locate the roots is not very useful, because drawing the curve
y = f (x) is itself a complicated problem. But it makes it possible to roughly determine
the intervals containing the roots. Then an analytic method is used to locate each root.
Theorem 4.1 If f (x) is continuous in the interval (a, b) and if f (a) and f (b) have
opposite signs, then at least one real root of the equation f (x) = 0 lies within the interval
(a, b).
If f (a) and f (b) have the same sign, then f (x) = 0 has either no real root or an even
number of real roots in (a, b).
If the curve y = f (x) touches the x-axis at some point, say x = c, then c is a root
of f (x) = 0 even though f (a) and f (b), a < c < b, may have the same sign. For example,
f (x) = (x − 2)² touches the x-axis at x = 2, and f (1.5) > 0 and f (2.5) > 0, but x = 2
is a root of the equation f (x) = (x − 2)² = 0.
A trial method for tabulation is as follows:
Form a table of signs of f (x), setting x = 0, ±1, ±2, . . .. If f (x) changes its sign between
two consecutive values of x, then at least one root lies between these two values, i.e.,
if f (a) and f (b) have opposite signs then a root lies between a and b.
Example 4.1.2 Find the location of roots of the equation 8x3 − 20x2 − 2x + 5 = 0
by tabulation method.
x 0 −1 1 −2 2 3
Sign of f (x) + − − − − +
The equation has three real roots as its degree is 3. The location of the roots of the
given equation are (−1, 0), (0, 1) and (2, 3).
A systematic process for tabulation
The following sequence of steps is to be performed to locate the roots of an equation
f (x) = 0 by the tabulation method:
3. determine the intervals at the endpoints of which the function assumes values
of opposite signs. These intervals contain one and only one root each in its
interior.
Example 4.1.3 Find the number of real roots of the equation 3^x − 3x − 2 = 0 and
locate them.
x −∞ 0 1 ∞
Sign of f (x) + − − +
The equation 3^x − 3x − 2 = 0 has two real roots, since the function changes sign twice;
one of them is negative and the other is greater than 1.
Sol. of Algebraic and Transcendental Equs. 193
A new table with smaller subintervals around the locations of the roots is constructed in
the following.
x 0 −1 1 2
Sign of f (x) − + − +
The roots of the given equation are in (−1, 0) [as f (0).f (−1) < 0] and (1, 2).
This section is devoted to locating the roots, which is the first stage of the solution of
algebraic and transcendental equations.
The second stage is the computation of roots with the specified degree of accuracy. In
the following sections some methods are discussed to determine the roots of an algebraic
or a transcendental equation. Before presenting the solution methods we define the order
of convergence of a sequence of numbers in the following.
Order of Convergence
Assume that the sequence {xn } of numbers converges to ξ and let εn = ξ − xn for n ≥ 0.
If two positive constants A and p exist such that
lim_{n→∞} |εn+1 |/|εn |^p = A, (4.1)
then the sequence is said to converge to ξ with order of convergence p. The number A
is called the asymptotic error constant.
If p = 1, the order of convergence of {xn } is called linear and if p = 2, the order of
convergence is called quadratic, etc.
In the next section, one of the bracketing method called bisection method is intro-
duced.
Let ξ be a root of the equation f (x) = 0 lying in the interval [a, b], i.e., f (a) · f (b) < 0,
where (b − a) is not sufficiently small. The interval [a, b] is divided into two equal intervals
[a, c] and [c, b], each of length (b − a)/2, where c = (a + b)/2 (Figure 4.2). If f (c) = 0, then c is
an exact root.
Now, if f (c) ≠ 0, then the root lies either in the interval [a, c] or in the interval [c, b].
If f (a).f (c) < 0 then the interval [a, c] is taken as new interval, otherwise [c, b] is taken
as the next interval. Let the new interval be [a1 , b1 ] and use the same process to select
the next new interval. In the next step, let the new interval be [a2 , b2 ]. The process of
bisection is continued until either the midpoint of the interval is a root, or the length
(bn − an ) of the interval [an , bn ] (at the nth step) is sufficiently small. The numbers an and
bn are the approximate roots of the equation f (x) = 0. Finally, xn = (an + bn )/2 is taken
as the approximate value of the root ξ.
[Figure 4.2: the interval [a, b] bisected at c = (a + b)/2, with the root ξ inside.]
It may be noted that when the reduced interval is [a1 , b1 ], the length of the interval is
(b − a)/2; when the interval is [a2 , b2 ], the length is (b − a)/2². At the nth step the length
of the interval [an , bn ] is (b − a)/2^n. In the final step, when ξ ≈ (an + bn )/2 is chosen as
the root, the length of the interval is (b − a)/2^(n+1) and hence the error does not exceed
(b − a)/2^(n+1).
Thus, if ε is the error allowed at the nth step, then the lower bound of n is obtained from
the relation
|b − a|/2^n ≤ ε. (4.2)
The lower bound of n is obtained by rewriting this inequality as
n ≥ [log(|b − a|) − log ε]/log 2. (4.3)
Hence the minimum number of iterations required to achieve the accuracy ε is
log(|b − a|/ε)/log 2. (4.4)
log 2
For example, if the length of the interval is |b − a| = 1 and ε = 0.0001, then n is given
by n ≥ 14.
The minimum numbers of iterations required to achieve the accuracy ε for |b − a| = 1
are shown in Table 4.1.
Theorem 4.2 Assume that f (x) is a continuous function on [a, b] and that there exists
a number ξ ∈ [a, b] such that f (ξ) = 0. If f (a) and f (b) have opposite signs, and {xn }
represents the sequence of midpoints generated by the bisection method, then
|ξ − xn | ≤ (b − a)/2^(n+1) for n = 0, 1, 2, . . . (4.5)
and lim_{n→∞} xn = ξ.
Proof. Since the root ξ and the midpoint xn both lie in the interval [an , bn ], the distance
between xn and ξ cannot be greater than half the width of [an , bn ]. Thus
|ξ − xn | ≤ |bn − an |/2 for all n. (4.6)
From the bisection method, it is observed that the successive interval widths form
the following pattern.
|b1 − a1 | = |b0 − a0 |/2, where b0 = b and a0 = a,
|b2 − a2 | = |b1 − a1 |/2 = |b0 − a0 |/2²,
|b3 − a3 | = |b2 − a2 |/2 = |b0 − a0 |/2³.
In this way, |bn − an | = |b0 − a0 |/2^n.
Hence
|ξ − xn | ≤ |b0 − a0 |/2^(n+1) [using (4.6)].
Now, the limit gives
|ξ − xn | → 0 as n → ∞ i.e., lim xn = ξ.
n→∞
Note 4.2.1 If the function f (x) is continuous on [a, b] then the bisection method is
applicable. This is justified in Figure 4.3. For the function f (x) of the graph of Figure
4.3, f (a) · f (b) < 0, but the equation f (x) = 0 has no root between a and b as the
function is not continuous at x = c.
Figure 4.3: The function has no root between a and b, though f (a) · f (b) < 0.
Note 4.2.2 This method is very slow, but it is very simple and surely converges to
the exact root. The method is applicable for any function provided the function is
continuous within the interval [a, b] where the root lies.
In this method, neither the derivative of f (x) nor any pre-manipulation of the function
is required.
Note 4.2.3 This method is also called bracketing method since the method successively
reduces the two endpoints (brackets) of the interval containing the real root.
Algorithm 4.1 (Bisection method). This algorithm finds a real root of the equa-
tion f (x) = 0 which lies in [a, b] by bisection method.
Algorithm Bisection
Input function f (x);
// Assume that f (x) is continuous within [a, b] and a root lies on [a, b].//
Read ε; //tolerance for width of the interval//
Read a, b; //input of the interval//
Compute f a = f (a); f b = f (b); //compute the function values//
if sign(f a) = sign(f b) then
//sign(f a) gives the sign of the value of f a.//
Print ‘f (a) · f (b) > 0, so there is no guarantee for a root within [a, b]’;
Stop;
endif;
do
Compute c = (a + b)/2;
Compute f c = f (c);
if f c = 0 or |f c| < ε then
a = c and b = c;
else if sign(f b) = sign(f c) then
b = c; f b = f c;
else
a = c; f a = f c;
endif;
while (|b − a| > ε);
Print ‘the desired root is’ c;
end Bisection
Program 4.1.
/* Program Bisection
Program to find a root of the equation x*x*x-2x-1=0 by
bisection method.
Assume that a root lies between a and b. */
#include<stdio.h>
#include<math.h>
#include<stdlib.h>
#define f(x) ((x)*(x)*(x)-2*(x)-1) /* definition of the function f(x) */
void main()
{
float a,b,fa,fb,c,fc;
float eps=1e-5; /* error tolerance */
Another popular method is the method of false position or the regula falsi method.
This is also a bracketing method. This method was developed because the bisection
method converges at a fairly slow speed. In general, the regula falsi method is faster
than bisection method.
The Regula-Falsi method is one of the most widely used methods of solving algebraic
and transcendental equations. This method is also known as ‘method of false position’,
[y − f (x0 )]/[f (x0 ) − f (x1 )] = (x − x0 )/(x0 − x1 ). (4.7)
To find the point of intersection, set y = 0 in (4.7) and let (x2 , 0) be such point.
Thus,
x2 = x0 − [f (x0 )/(f (x1 ) − f (x0 ))](x1 − x0 ). (4.8)
This is the second approximation of the root. Now, if f (x2 ) and f (x0 ) are of opposite
signs then the root lies between x0 and x2 and then we replace x1 by x2 in (4.8). The
next approximation is obtained as
x3 = x0 − [f (x0 )/(f (x2 ) − f (x0 ))](x2 − x0 ).
If f (x2 ) and f (x1 ) are of opposite signs then the root lies between x1 and x2 and the
new approximation x3 is obtained as
x3 = x2 − [f (x2 )/(f (x1 ) − f (x2 ))](x1 − x2 ).
The procedure is repeated till the root is obtained to the desired accuracy.
If the nth approximate root (xn ) lies between an and bn then the next approximate
root is thus obtained as
xn+1 = an − [f (an )/(f (bn ) − f (an ))](bn − an ). (4.9)
[Figure: two cases of the regula-falsi iteration, with successive approximations x0 , x1 , x2 , . . . approaching the root ξ.]
where ξ is an exact root and xn−1 and xn are its approximations obtained at the (n−1)th
and nth iterations. This relation can be used when
M ≤ 2m, where M = max |f ′(x)| and m = min |f ′(x)| in [a, b]. (4.11)
Solution. Let f (x) = x3 + 2x − 2. f (0) = −2 < 0 and f (1) = 1 > 0. Thus, one root
lies between 0 and 1. The calculations are shown in the following table.
left end right end
n point an point bn f (an ) f (bn ) xn+1 f (xn+1 )
0 0.0000 1.0 –2.0000 1.0 0.6700 –0.3600
1 0.6700 1.0 –0.3600 1.0 0.7570 –0.0520
2 0.7570 1.0 –0.0520 1.0 0.7690 –0.0072
3 0.7690 1.0 –0.0072 1.0 0.7707 –0.0010
4 0.7707 1.0 –0.0010 1.0 0.7709 –0.0001
Algorithm 4.2 (Regula-Falsi). This algorithm finds a root of the equation f (x) =
0 which lies in [x0 , x1 ], by Regula-Falsi method.
Algorithm Regula-Falsi
Input function f (x);
Program 4.2.
/* Program Regula-Falsi
Program to find a root of the equation x*x-2x-3=0 by regula
falsi method. Assumed that a root lies between x0 and x1. */
#include<stdio.h>
#include<math.h>
#include<stdlib.h>
#define f(x) x*x-2*x-3 /* definition of the function f(x) */
void main()
{
float x0,x1,x2,fx0,fx1,fx2;
float eps=1e-5; /* error tolerance */
printf("\nEnter the value of x0 and x1 ");
scanf("%f %f",&x0,&x1);
fx0=f(x0); fx1=f(x1);
if(fx0*fx1>0)
{
printf("There is no guarantee for a root within [%6.3f,%6.3f]",x0,x1);
exit(0);
}
do
{
x2=(x0*fx1-x1*fx0)/(fx1-fx0);
fx2=f(x2);
if(fabs(fx2)<eps)
{
printf("The root is %8.5f ",x2);
exit(0);
}
if(fx2*fx0<0)
{
x1=x2; fx1=fx2;
}
else
{
x0=x2; fx0=fx2;
}
}while(fabs(fx2)>eps);
} /* main */
A sample of input/output:
Enter the value of x0 and x1 0 3
The root is 3.00000
x = φ(x). (4.12)
Let x0 ∈ [a, b] be an initial guess to the desired root ξ. Then φ(x0 ) is evaluated
and this value is denoted by x1 . It is the first approximation of the root ξ. Again, x1 is
substituted for x in the right side of (4.12) to obtain a new value x2 = φ(x1 ). This
process is continued to generate the sequence of numbers x0 , x1 , x2 , . . . , xn , . . ., which
are defined by the following relation:
xn+1 = φ(xn ), n = 0, 1, 2, . . . (4.13)
These successive iterations are repeated till the approximations xn converge to the root
with the desired accuracy, i.e., |xn+1 − xn | < ε, where ε is a sufficiently small number.
The function φ(x) is called the iteration function.
Note 4.4.1 There is no guarantee that this sequence x0 , x1 , x2 , . . . will converge. The
function f (x) = 0 can be written as x = φ(x) in many different ways. This is very
significant since the form of the function φ(x) is very important both for the convergence
and for its rate.
For example, the equation x3 + x2 − 1 = 0 has a root lies between 0 and 1. This
equation can be rewritten in the following ways:
x = (1 − x²)/x²; x = (1 − x²)^(1/3); x = (1 − x³)/x; x = √(1 − x³); x = 1/√(1 + x), etc.
The following theorem gives the sufficient condition for convergence of the iteration
process.
Theorem 4.3 Let ξ be a root of the equation f (x) = 0 and it can be written as x = φ(x)
and further that
1. the function φ(x) is defined and differentiable on the interval [a, b],
2. φ(x) ∈ [a, b] for all x ∈ [a, b], and
3. there exists a number l such that |φ′(x)| ≤ l < 1 for all x ∈ [a, b].
Then the sequence {xn } given by (4.13) converges to the desired root ξ irrespective of
the choice of the initial approximation x0 ∈ [a, b] and the root ξ is unique.
ξ = φ(ξ). (4.15)
For i = 0, 1, 2, . . . , n,
Now, if l < 1 then the right hand side of (4.17) tends to zero as n → ∞.
Therefore, lim_{n→∞} xn+1 = ξ.
Hence the sequence {xn } converges to ξ if |φ′(x)| < 1 for all x ∈ [a, b].
Now to prove the uniqueness.
Let ξ1 and ξ2 be two roots of x = φ(x), i.e., ξ1 = φ(ξ1 ) and ξ2 = φ(ξ2 ). Then
where c ∈ (ξ1 , ξ2 ).
Equation (4.18) reduces to
and by condition (iii) ξ1 = ξ2 , i.e., the two roots are not distinct, they are equal.
Let the maximum number of iterations needed to achieve the accuracy ε be N (ε).
Thus from (4.19),
[l^N/(1 − l)] |x1 − x0 | ≤ ε.
This gives
N (ε) ≥ log[ε(1 − l)/|x1 − x0 |]/log l. (4.20)
For l ≤ 1/2 the estimate of the error is given by the following simple form:
Order of convergence
The convergence of an iteration method depends on the suitable choice of the iteration
function φ(x) and x0 , the initial guess.
Let xn converge to the exact root ξ, so that ξ = φ(ξ).
Thus xn+1 − ξ = φ(xn ) − φ(ξ).
Let εn+1 = xn+1 − ξ and suppose φ′(ξ) ≠ 0. Then the above relation becomes
Geometric interpretation
Geometrically, the point of intersection of the line y = x and the curve y = φ(x) is a
root of the equation f (x) = 0. Depending on the value of φ (ξ) the convergence and
divergence cases are illustrated in Figures 4.5-4.6.
[Figure 4.5: Convergent for |φ′(ξ)| < 1: (a) staircase approach to ξ when 0 < φ′(ξ) < 1; (b) spiral approach when −1 < φ′(ξ) < 0.]
[Figure 4.6: Divergent for |φ′(ξ)| > 1: (a) divergent for φ′(ξ) > 1; (b) divergent for φ′(ξ) < −1.]
Example 4.4.1 Consider the equation 5x³ − 20x + 3 = 0. Find the root lying in
the interval [0, 1] with an accuracy of 10⁻⁴.
Solution. The given equation is written as x = (5x³ + 3)/20 = φ(x) (say).
Now, |φ′(x)| = 15x²/20 = 3x²/4 < 1 on [0, 1]. Let x0 = 0.5. The calculations are shown in
the following table.
n xn φ(xn ) = xn+1
0 0.5 0.18125
1 0.18125 0.15149
2 0.15149 0.15087
3 0.15087 0.15086
4 0.15086 0.15086
At this stage the iteration process is terminated and ξ = 0.1509 is taken as the
required root.
cos x − x eˣ = 0
Solution. It is easy to see that one root of the given equation lies between 0 and 1.
Let x0 = 0.5. The equation can be written as x = e⁻ˣ cos x = φ(x) (say).
The calculations are shown in the following table.
n 0 1 2 3 4 5 6
xn 0.50000 0.53228 0.50602 0.52734 0.51000 0.52408 0.51263
xn+1 0.53228 0.50602 0.52734 0.51000 0.52408 0.51263 0.52193
n 7 8 9 10 11 12 13
xn 0.52193 0.51437 0.52051 0.51552 0.51958 0.51628 0.51896
xn+1 0.51437 0.52051 0.51552 0.51958 0.51628 0.51896 0.51678
n 14 15 16 17 18 19 20
xn 0.51678 0.51855 0.51711 0.51828 0.51733 0.51810 0.51748
xn+1 0.51855 0.51711 0.51828 0.51733 0.51810 0.51748 0.51798
Algorithm 4.3 (Fixed point iteration). This algorithm computes a root of the
equation f (x) = 0 by rewriting the equation as x = φ(x), provided |φ′(x)| < 1 in the
interval [a, b], by the fixed point iteration method. x0 ∈ [a, b] is the initial guess and ε
is the error tolerance.
Algorithm Iteration
Input function φ(x);
Read x0 , ε; //initial guess and error tolerance.//
Set x1 = x0 ;
do
Set x0 = x1 ;
Compute x1 = φ(x0 );
while (|x1 − x0 | > ε);
Print ‘The root is’, x1 ;
end Iteration
Program 4.3.
/* Program Fixed-Point Iteration
Program to find a root of the equation x*x*x-3x+1=0
by fixed point iteration method. phi(x) is obtained
by rewrite f(x)=0 as x=phi(x), which is to be supplied.*/
#include<stdio.h>
#include<math.h>
#include<stdlib.h>
#define phi(x) ((3*(x)-1)/((x)*(x)))
/* definition of the function phi(x) and it to be
changed accordingly */
void main()
{
int k=0; /* counts number of iterations */
float x1,x0; /* initial guess */
float eps=1e-5; /* error tolerance */
printf("\nEnter the initial guess x0 ");
scanf("%f",&x0);
x1=x0;
do
{
k++;
x0=x1;
x1=phi(x0);
}while(fabs(x1-x0)>eps);
printf("One root is %8.5f obtained at %d th iteration ",x1,k);
} /* main */
A sample of input/output:
The rate of convergence of the iteration method is linear. But this slow rate can be
accelerated by using Aitken’s method.
The iteration scheme of this method is obtained from fixed point iteration method as
Example 4.5.1 Find a root of the equation cos x − xex = 0 using Aitken’s ∆2 -
process.
f (x0 ) + hf ′(x0 ) + (h²/2!)f ″(x0 ) + · · · = 0.
Neglecting the second and higher order derivatives, the above equation reduces to
f (x0 ) + hf ′(x0 ) = 0, or h = −f (x0 )/f ′(x0 ).
Hence,

x1 = x0 + h = x0 − f (x0)/f′(x0).   (4.24)
To compute the value of h, the second and higher powers of h are neglected, so the
value h = −f (x0)/f′(x0) is not exact but only approximate. Hence x1, obtained from
(4.24), is not a root of the equation, but it is a better approximation to the root than x0.
In general,

xn+1 = xn − f (xn)/f′(xn).   (4.25)
Note 4.6.1 The Newton-Raphson method may also be used to find a complex root of
an equation when the initial guess is taken as a complex number.
Geometrical interpretation
The geometrical interpretation of Newton-Raphson method is shown in Figure 4.7.
In this method, a tangent is drawn at (x0, f (x0)) to the curve y = f (x). The tangent
cuts the x-axis at (x1, 0). Again, a tangent is drawn at (x1, f (x1)) and this tangent cuts
the x-axis at (x2, 0). This process is continued until xn → ξ as n → ∞.
Sol. of Algebraic and Transcendental Equs. 211
Figure 4.7: Geometrical interpretation of the Newton-Raphson method.
The choice of initial guess of Newton-Raphson method is very important. If the initial
guess is near the root then the method converges very fast. If it is not so near the root
or if the starting point is wrong, then the method may lead to an endless cycle. This
is illustrated in Figure 4.8. In this figure, the initial guess x0 gives the fast convergence
to the root, the initial guess y0 leads to an endless cycle and the initial guess z0 gives a
divergent solution as f′(z0) is very small.
Figure 4.8: Behaviour of the Newton-Raphson method for the initial guesses x0 (fast convergence), y0 (endless cycle) and z0 (divergence).
Even if the initial guess is not close to the exact root, the method may diverge. To
choose the initial guess the following rule may be followed:
The endpoint of the interval [a, b] at which the sign of the function coincides with the
sign of the second derivative must be taken as the initial guess. When f (b) · f″(b) > 0,
the initial guess is x0 = b, and when f (a) · f″(a) > 0 then x0 = a is the initial guess.
Three different cases, divergence, cycling and oscillation of the Newton-Raphson
method, are discussed in the following by examples.
Sometimes, if the initial guess x0 is far away from the exact root then the sequence
{xn} may converge to some other root. This situation happens when the slope f′(x0)
is small and the tangent to the curve y = f (x) is nearly horizontal. For example, if
f (x) = cos x and we try to find the root ξ = π/2 starting with x0 = 3, then x1 =
−4.01525, x2 = −4.85266, . . . and the sequence {xn} will converge to the different root
−4.71239 ≈ −3π/2.
We consider another example which produces a divergent sequence. Let f (x) = xe−x
with x0 = 2.
Then x1 = 4.0, x2 = 5.33333, . . . , x15 = 19.72255, . . . and clearly {xn} diverges slowly
to ∞ (Figure 4.9).
Figure 4.9: Newton-Raphson method produces a divergent sequence for f (x) = xe−x.
Now consider the function f (x) = x3 − x − 3, which produces a cyclic sequence when
the initial guess is x0 = 0. The sequence is
x1 = −3.0, x2 = −1.961538, x3 = −1.147176, x4 = −0.006579,
x5 = −3.000389, x6 = −1.961818, x7 = −1.147430, . . .
and it may be noted that xk+4 ≈ xk, k = 0, 1, 2, . . . (Figure 4.10).
But, the initial guess x0 = 2 gives the convergent sequence x1 = 1.72727, x2 =
1.67369, x3 = 1.67170, x4 = 1.67170.
The function f (x) = tan−1 x with the initial guess x0 = 1.45 gives a divergent
oscillating sequence.
Figure 4.10: Cyclic sequence produced by the Newton-Raphson method for y = x3 − x − 3.
xn+1 = xn − f (xn)/f′(xn).

Comparing this expression with the fixed point iteration formula xn+1 = φ(xn), we
obtain

φ(xn) = xn − f (xn)/f′(xn).

This can be written as

φ(x) = x − f (x)/f′(x).
It is already proved that the iteration method converges if |φ′(x)| < 1. Therefore, the
Newton-Raphson method converges if

|d/dx {x − f (x)/f′(x)}| < 1, i.e., |f (x) · f″(x)| < |f′(x)|²   (4.26)
within the interval under consideration. The Newton-Raphson method converges if the
initial guess x0 is chosen sufficiently close to the root and the functions f (x), f′(x) and
f″(x) are continuous and bounded in any small interval containing the root. The rate of
convergence of the Newton-Raphson method is stated in the following theorem.
Figure: Oscillating divergent sequence produced by the Newton-Raphson method for y = tan−1 x with x0 = 1.45.
Proof. Let ξ be a root of the equation f (x) = 0. Then f (ξ) = 0. The iteration scheme
for the Newton-Raphson method is

xn+1 = xn − f (xn)/f′(xn).
Let xn = εn + ξ.
Therefore, the above relation becomes

εn+1 + ξ = εn + ξ − f (εn + ξ)/f′(εn + ξ)

εn+1 = εn − [f (ξ) + εn f′(ξ) + (εn²/2) f″(ξ) + · · ·]/[f′(ξ) + εn f″(ξ) + · · ·]   [by Taylor's series]

= εn − [εn + (εn²/2) f″(ξ)/f′(ξ) + · · ·]/[1 + εn f″(ξ)/f′(ξ) + · · ·]   [as f (ξ) = 0]
Neglecting the terms of order εn³ and higher, the above expression becomes

εn+1 = Aεn², where A = f″(ξ)/(2f′(ξ)).   (4.27)
This relation shows that Newton-Raphson method has quadratic convergence or sec-
ond order convergence.
Solution. Let f (x) = x3 + x − 1. Then f (0) = −1 < 0 and f (1) = 1 > 0. So one
root lies between 0 and 1. Let x0 = 0 be the initial guess.
The iteration scheme is
xn+1 = xn − f (xn)/f′(xn) = xn − (xn³ + xn − 1)/(3xn² + 1) = (2xn³ + 1)/(3xn² + 1).
n xn xn+1
0 0 1
1 1 0.7500
2 0.7500 0.6861
3 0.6861 0.6823
4 0.6823 0.6823
Therefore, a root of the equation is 0.682 correct up to three decimal places.
Example 4.6.2 Find an iteration scheme to find the kth root of a number a.
Solution. Let x be the kth root of a, so that x^k − a = 0. Taking f (x) = x^k − a, the
Newton-Raphson scheme

xn+1 = xn − f (xn)/f′(xn)

gives

xn+1 = xn − (xn^k − a)/(k xn^(k−1)) = (k xn^k − xn^k + a)/(k xn^(k−1))
= (1/k)[(k − 1)xn + a/xn^(k−1)].
216 Numerical Analysis
Example 4.6.3 Write down an iteration scheme for finding square root of a positive
number N . Hence find the square root of the number 2.
Solution. Let the square root of N be x. That is, x = √N, or x² − N = 0.
Thus the root of this equation is the required value of √N. Let f (x) = x² − N. By
Newton-Raphson method,

xn+1 = xn − f (xn)/f′(xn) = xn − (xn² − N)/(2xn) = (1/2)(xn + N/xn).
Solution. Let f (x) = x log10 x − 4.77. Here f (6) = −0.10109 < 0 and f (7) =
1.14569 > 0.
Therefore, one root lies between 6 and 7. Let the initial guess be x0 = 6.
The iteration scheme is
xn+1 = xn − (xn log10 xn − 4.77)/(log10 xn + log10 e) = (0.43429xn + 4.77)/log10(2.71828xn).
Example 4.6.5 The expression xn+1 = (3xn² + 2)/8 is an iteration scheme to
find a root of the equation f (x) = 0. Find the function f (x).

Solution. The given scheme is xn+1 = (3xn² + 2)/8. If the scheme converges, then
lim (n→∞) xn = α satisfies 8α = 3α² + 2, i.e., 3α² − 8α + 2 = 0. Hence one choice is
f (x) = 3x² − 8x + 2.
Example 4.6.6 Discuss the Newton-Raphson method to find the root of the equa-
tion x10 − 1 = 0 starting with x0 = 0.5.
Solution. Let z0 = −0.5 + 0.5i = (−0.5, 0.5) be the initial guess and f (z) =
z³ + 2z² + 2z + 1. Then f′(z) = 3z² + 4z + 2. The iteration scheme is

zn+1 = zn − f (zn)/f′(zn).
The Newton-Raphson method can be used to find the multiple root of an equation.
But, its generalized form
xn+1 = xn − p f (xn)/f′(xn)   (4.28)
gives a faster convergent sequence. The term (1/p) f′(xn) is the slope of the straight line
passing through (xn, f (xn)) and intersecting the x-axis at the point (xn+1, 0). The
formula (4.28) reduces to the Newton-Raphson formula when p = 1.
If ξ is a root of f (x) = 0 with multiplicity p, then ξ is also a root of f′(x) = 0
with multiplicity (p − 1), of f″(x) = 0 with multiplicity (p − 2), and so on. Hence the
expressions

x0 − p f (x0)/f′(x0),  x0 − (p − 1) f′(x0)/f″(x0),  x0 − (p − 2) f″(x0)/f‴(x0), . . .

should have the same value if there is a root with multiplicity p, when the initial guess
is very close to the exact root ξ.
Let ξ be a root of multiplicity p and εn = xn − ξ. Then

εn+1 = εn − p f (εn + ξ)/f′(εn + ξ)

= εn − p [f (ξ) + εn f′(ξ) + · · · + (εn^(p−1)/(p−1)!) f^(p−1)(ξ) + (εn^p/p!) f^(p)(ξ) + (εn^(p+1)/(p+1)!) f^(p+1)(ξ) + · · ·]
 / [f′(ξ) + εn f″(ξ) + · · · + (εn^(p−2)/(p−2)!) f^(p−1)(ξ) + (εn^(p−1)/(p−1)!) f^(p)(ξ) + (εn^p/p!) f^(p+1)(ξ) + · · ·]

= εn − p [(εn^p/p!) f^(p)(ξ) + (εn^(p+1)/(p+1)!) f^(p+1)(ξ) + · · ·]
 / [(εn^(p−1)/(p−1)!) f^(p)(ξ) + (εn^p/p!) f^(p+1)(ξ) + · · ·]   [since f (ξ) = f′(ξ) = · · · = f^(p−1)(ξ) = 0]

= εn − p [εn/p + (εn²/(p(p+1))) f^(p+1)(ξ)/f^(p)(ξ) + · · ·] [1 + (εn/p) f^(p+1)(ξ)/f^(p)(ξ) + · · ·]^(−1)

= εn − p [εn/p + (εn²/(p(p+1))) f^(p+1)(ξ)/f^(p)(ξ) + · · ·] [1 − (εn/p) f^(p+1)(ξ)/f^(p)(ξ) + · · ·]

= εn − εn − (εn²/(p+1)) f^(p+1)(ξ)/f^(p)(ξ) + (εn²/p) f^(p+1)(ξ)/f^(p)(ξ) + · · ·

= (1/(p(p+1))) (f^(p+1)(ξ)/f^(p)(ξ)) εn² + O(εn³).

Thus εn+1 = Aεn², where A = (1/(p(p+1))) f^(p+1)(ξ)/f^(p)(ξ).

This shows that the rate of convergence is quadratic.
In the Newton-Raphson method, the derivative of the function f (x) is calculated at each
point xn; that is, at each iteration the two functions f (xn) and f′(xn) are evaluated, n = 0, 1, 2, . . ..
But for some functions the evaluation of the derivative is time consuming. To save this
time one can change the iteration scheme of the Newton-Raphson method to

xn+1 = xn − f (xn)/f′(x0).   (4.29)
That is, the derivative of f (x) is calculated only at the initial guess x0 instead of at
every iterate xn. This method reduces the time for calculating the derivatives. But
the rate of convergence of this method is linear, which is proved in Theorem 4.6.
Solution. Let ξ be the root of the equation f (x) = 0. Then f (ξ) = 0, and εn = xn − ξ.
Therefore from (4.29),

εn+1 = εn − f (εn + ξ)/f′(x0) = εn − [f (ξ) + εn f′(ξ) + (εn²/2) f″(ξ) + · · ·]/f′(x0).

Neglecting εn² and higher powers of εn and denoting A = 1 − f′(ξ)/f′(x0), the above error
term becomes

εn+1 = Aεn.

This proves that the rate of convergence of the formula (4.29) is linear.
Figure: Geometrical interpretation of the scheme (4.29): every line is drawn with the fixed slope f′(x0).
Geometrical interpretation
Solution. One root of this equation lies between −2 and −1. Let x0 = −1.5 and
f (x) = x3 − x + 1. Then f′(x0) = 5.75.
The iteration scheme of the formula (4.29) is
xn+1 = xn − f (xn)/f′(x0) = xn − (xn³ − xn + 1)/5.75 = −(1/5.75)(xn³ − 6.75xn + 1).
All the calculations are shown in the following table.
n xn xn+1
0 –1.50000 –1.34783
1 –1.34783 –1.33032
2 –1.33032 –1.32614
3 –1.32614 –1.32508
4 –1.32508 –1.32481
5 –1.32481 –1.32474
6 –1.32474 –1.32472
7 –1.32472 –1.32472
Therefore, one root of the given equation is −1.3247, correct up to four decimal places
attained at 7th iteration.
Using Newton-Raphson method
The iteration scheme for Newton-Raphson method is
xn+1 = xn − f (xn)/f′(xn) = xn − (xn³ − xn + 1)/(3xn² − 1) = (2xn³ − 1)/(3xn² − 1).
Therefore, a root is −1.3247 correct up to four decimal places, attained at the 3rd
iteration. This example shows that the Newton-Raphson method is much faster than
the method given by (4.29).
4.9 Modified Newton-Raphson Method
xn+1 = xn − f (xn)/f′(xn + a(xn)f (xn)) = φ(xn) (say)   (4.31)

That is,

φ(x) = x − f (x)/f′(x + a(x)f (x)),

and

φ′(x) = 1 − f′(x)/f′(x + a(x)f (x))
 + f (x) f″(x + a(x)f (x)) {1 + a′(x)f (x) + a(x)f′(x)} / {f′(x + a(x)f (x))}².   (4.32)
¹H. H. H. Homeier, A modified Newton method for root finding with cubic convergence, J. Computational and Applied Mathematics, 157 (2003) 227–230.
If ξ is the root of the equation f (x) = 0, then f (ξ) = 0 and hence φ(ξ) = ξ and
φ′(ξ) = 0 [from (4.32)].
This method evaluates the three quantities f (x), f′(x) and f′(x + a(x)f (x)) at
each iteration.
Example 4.9.1 Find a root of the equation x3 −3x2 +4 = 0 using modified Newton-
Raphson method, starting with x0 = 1.5.
Therefore, a root of the given equation is 2.000, correct up to three decimal places,
and this value is attained at the 6th iteration, while the Newton-Raphson method takes
10 iterations (see Example 4.7.1).
Algorithm Newton-Raphson
// f d(x) is the derivative of f (x) and ε is the error tolerance.//
Input function f (x), f d(x);
Read x0 , ε;
Set x1 = x0 ;
do
Set x0 = x1 ;
Compute x1 = x0 − f (x0 )/f d(x0 );
while (|x1 − x0 | > ε);
Print ‘The root is’, x1 ;
end Newton-Raphson
Program 4.4.
/* Program Newton-Raphson
Program to find a root of the equation x*x*x-3x+1=0 by Newton-
Raphson method. f(x) and its derivative fd(x) are to be supplied. */
#include<stdio.h>
#include<math.h>
#include<stdlib.h>
void main()
{
int k=0; /* counts number of iterations */
float x1,x0; /* x0 is the initial guess */
float eps=1e-5; /* error tolerance */
float f(float x);
float fd(float x);
printf("\nEnter the initial guess x0 ");
scanf("%f",&x0);
x1=x0;
do
{
k++;
x0=x1;
x1=x0-f(x0)/fd(x0);
}while(fabs(x1-x0)>eps);
printf("One root is %8.5f obtained at %d th iteration ",x1,k);
} /* main */
/* definition of the function f(x) */
float f(float x)
{
return(x*x*x-3*x+1);
}
/* definition of the function fd(x) */
float fd(float x)
{
return(3*x*x-3);
}
A sample of input/output:
This formula is the same as the formula for the Regula-falsi method, and it needs
two initial guesses x0 and x1 of the root.
Note 4.10.1 The Regula-falsi method needs an interval containing the root, i.e., if
[x0, x1] is the interval then f (x0) · f (x1) < 0. But the secant method needs only two
values x0 and x1 near the exact root, and not necessarily with f (x0) · f (x1) < 0.
Geometrical interpretation
Let ξ be the exact root of the equation f (x) = 0 and the error at the nth iteration is
εn = xn − ξ. Also f (ξ) = 0.
Figure: Geometrical interpretation of the secant method: the secant through the two latest points cuts the x-axis at the next approximation.
Thus εn+1 ≈ cεn εn−1, where c = (1/2) f″(ξ)/f′(ξ). This is a non-linear difference equation and
to solve it, let εn+1 = Aεn^p. Then εn = Aεn−1^p. This gives εn−1 = εn^(1/p) A^(−1/p).

Therefore, Aεn^p = cεn εn^(1/p) A^(−1/p), i.e., εn^p = cA^(−(1+1/p)) εn^(1+1/p).

Equating the powers of εn on both sides, we obtain the equation for p:

p = 1 + 1/p, or p = (1 ± √5)/2.
The Newton-Raphson method evaluates the two functions f and f′ in each iteration. In
this context, the secant method is more efficient as compared to the Newton-Raphson
method.
Solution. Let f (x) = x3 − 8x − 4 = 0. One root lies between 3 and 4. Let the initial
approximations be x0 = 3, x1 = 3.5. The formula for x2 is

x2 = (x0 f (x1) − x1 f (x0))/(f (x1) − f (x0)).
The calculations are shown below:
x0 f (x0 ) x1 f (x1 ) x2 f (x2 )
3.0000 –1.0000 3.5000 10.8750 3.0421 –0.1841
3.5000 10.8750 3.0421 –0.1841 3.0497 –0.0333
3.0421 –0.1841 3.0497 –0.0333 3.0514 0.0005
3.0497 –0.0333 3.0514 0.0005 3.0514 0.0005
Therefore, a root is 3.051 correct up to four significant figures.
Algorithm 4.5 (Secant method). This algorithm finds a root of the equation
f (x) = 0 by secant method when two initial guesses x0 and x1 are supplied.
Algorithm Secant
// The iteration terminates when |f (x1) − f (x0)| is very small (in this case the slope of
the secant is very small and the method fails) or when |f (x2)| < ε; here ε is the error
tolerance and δ is a very small positive quantity.//
Read ε and δ;
Input function f (x);
1. f x0 = f (x0 ); f x1 = f (x1 );
if |f x1 − f x0 | < δ then
Print ‘Slope too small, the method does not give correct root or fail’
Stop;
endif;
Compute x2 = (x0 · f x1 − x1 · f x0)/(f x1 − f x0);
Compute f2 = f (x2 );
if |f2 | < ε then
Print ‘A root is’, x2 ;
Stop;
endif;
Set f x0 = f x1; f x1 = f2;
Set x0 = x1 ; x1 = x2 ;
Go to 1;
end Secant
Program 4.5.
/* Program Secant
Program to find a root of the equation x*sin(x)-1=0 by
secant method. It is assumed that a root lies between x0 and x1.*/
#include<stdio.h>
#include<math.h>
#include<stdlib.h>
void main()
{
float x0,x1,x2,fx0,fx1,fx2;
float eps=1e-5; /* error tolerance */
float delta=1e-5; /* slope */
float f(float x);
printf("\nEnter the values of x0 and x1 ");
scanf("%f %f",&x0,&x1);
fx0=f(x0);
fx1=f(x1);
if(fabs(fx1-fx0)<delta){
printf("Slope too small:\nthe method does not give correct root or fails");
exit(0);
}
do
{
x2=(x0*fx1-x1*fx0)/(fx1-fx0);
fx2=f(x2);
if(fabs(fx2)<eps){
printf("One root is %8.5f ",x2);
exit(0);
}
fx0=fx1; fx1=fx2;
x0=x1; x1=x2;
}while(fabs(fx2)>eps);
} /* main */
/* definition of function f(x), it may change accordingly */
float f(float x)
{
return(x*sin(x)-1);
}
A sample of input/output:
Let us consider the equation f (x) = 0. The function f (x) is expanded by Taylor's series
in the neighbourhood of xn as

0 = f (x) = f (xn) + (x − xn) f′(xn) + · · · .

This relation gives x = xn − f (xn)/f′(xn).

This is the (n + 1)th approximation to the root. Therefore,

xn+1 = xn − f (xn)/f′(xn).   (4.39)
Again, expanding f (x) by Taylor's series and retaining up to the second order term,
as shown below:

0 = f (x) = f (xn) + (x − xn) f′(xn) + ((x − xn)²/2) f″(xn).

Therefore,

f (xn+1) = f (xn) + (xn+1 − xn) f′(xn) + ((xn+1 − xn)²/2) f″(xn) = 0.

Substituting the value of xn+1 − xn from (4.39) into the last term, we find

f (xn) + (xn+1 − xn) f′(xn) + (1/2) ([f (xn)]²/[f′(xn)]²) f″(xn) = 0.
The main idea of this method is that the function f (x) is approximated by a quadratic
polynomial passing through three points in the neighbourhood of the root. A root of
this quadratic is taken as the next approximation to the root of the equation f (x) = 0.
Let xn−2, xn−1, xn be any three distinct approximations to a root of the equation
f (x) = 0. We denote f (xn−2) = fn−2, f (xn−1) = fn−1 and f (xn) = fn.
Suppose, (4.41) passes through the points (xn−2 , fn−2 ), (xn−1 , fn−1 ) and (xn , fn ),
then
since f (x) = 0.
Now, introduce

λ = h/hn,  λn = hn/hn−1  and  δn = 1 + λn.
Therefore, the equation (4.46) reduces to the following form
or,
λ2 cn + λgn + δn fn = 0, (4.48)
where
hn = xn − xn−1,  λn = hn/hn−1,  δn = 1 + λn,
gn = λn² fn−2 − δn² fn−1 + (λn + δn) fn,
cn = λn (λn fn−2 − δn fn−1 + fn).
Solving (4.48) for λ,

λ = −2δn fn / (gn ± √(gn² − 4δn fn cn)),

where the sign in the denominator is chosen to make its magnitude as large as possible.
Then

xn+1 = xn + hn λ.
5. Secant           xn+1 = (xn−1 fn − xn fn−1)/(fn − fn−1)         1.62   1
6. Modified
   Newton-Raphson   xn+1 = xn − fn/f′(xn − (1/2) fn/f′n)           3      3
7. Chebyshev        xn+1 = xn − fn/f′n − (1/2)(fn²/f′n³) f″n       3      3
8. Muller           xn+1 = xn + (xn − xn−1)λ                       1.84   1
where α, β and γ are arbitrary parameters, find the values of α, β, γ such that
the iteration method has (i) third and (ii) fourth order convergence.
α = 1/f′(x),  β = −f″(x)/(2{f′(x)}³).
(ii) For the fourth order convergence, φ′(ξ) = φ″(ξ) = φ‴(ξ) = 0.
In this case εn+1 = Aεn⁴.
Then α = −1/f′(x),  β = −f″(x)/(2{f′(x)}³)
and −α f‴(ξ) − 6β f′(ξ) f″(ξ) − 6γ {f′(ξ)}³ = 0.
That is,
x0 = 4,

xn+1 = (3 + (k − 4)xn + 5xn² − xn³)/k,  k an integer.

Find the value of k such that the iteration scheme converges as fast as possible. The
limit ξ of the sequence satisfies

ξ³ − 5ξ² + 4ξ − 3 = 0.
xn+1 = xn − a f (xn)/f′(xn − b f (xn)/f′(xn))
where a, b are arbitrary parameters, for solving the equation f (x) = 0. Determine
a and b such that the iteration method is of order as high as possible for finding a
simple root of f (x) = 0.
φ(x) = x − a f (x)/f′(x − b f (x)/f′(x)) = x − a f (x)/f′(g(x)),

where

g(x) = x − b f (x)/f′(x),  g′(x) = 1 − b [{f′(x)}² − f (x)f″(x)]/{f′(x)}².

If ξ is a root of f (x) = 0, then f (ξ) = 0, g(ξ) = ξ and g′(ξ) = 1 − b.
εn+1 + ξ = φ(εn + ξ) = φ(ξ) + εn φ′(ξ) + (εn²/2) φ″(ξ) + (εn³/6) φ‴(ξ) + · · ·

or, εn+1 = εn φ′(ξ) + (εn²/2) φ″(ξ) + (εn³/6) φ‴(ξ) + · · ·
εn+1 ≈ (εn³/6) φ‴(ξ).
Hence the iteration scheme will have a third order convergence when a = 1 and
b = 1/2.
Similarly, f3(x) is the remainder of f1(x) when it is divided by f2(x), taken with the
reverse sign, and so on. This division process is terminated when the quotient becomes
a constant. Thus we obtain a sequence of functions f (x), f1(x), f2(x), . . . , fn(x), called
the Sturm functions or the Sturm sequence.
Theorem 4.9 (Sturm). The number of real roots of the equation f (x) = 0 on [a, b] is
equal to the difference between the number of changes of sign in the Sturm sequence at
x = a and at x = b, provided f (a) ≠ 0 and f (b) ≠ 0.
Lagrange's or Newton's method may also be used to find the upper bound of the
positive roots of the equation (4.53).
Theorem 4.10 (Lagrange). If the coefficients of the polynomial equation

a0 x^n + a1 x^(n−1) + · · · + an−1 x + an = 0

satisfy the conditions a0 > 0; a1, a2, . . . , am−1 ≥ 0; am < 0 for some m ≤ n, then the
upper bound of the positive roots of the equation is 1 + (B/a0)^(1/m), where B is the greatest
of the absolute values of the negative coefficients of the polynomial.
Theorem 4.11 (Newton). If for x = c the polynomial

f (x) ≡ a0 x^n + a1 x^(n−1) + · · · + an−1 x + an

and its derivatives f′(x), f″(x), . . . all assume positive values, then c is an upper bound
of the positive roots of the equation f (x) = 0.
The roots of a polynomial equation can be determined by two kinds of techniques, namely
iteration methods and direct methods. In this section, two iteration methods, viz., the
Birge-Vieta and Bairstow methods, and one direct method, Graeffe's root squaring
method, are discussed.
Iterative Methods
This method is based on the Newton-Raphson method. Here a real number ξ is deter-
mined such that (x − ξ) is a factor of the polynomial
Let Qn−1 (x) and R be the quotient and remainder when Pn (x) is divided by the
factor (x − ξ), where Qn−1 (x) is a polynomial of degree (n − 1) of the form
Thus
Pn (x) = (x − ξ)Qn−1 (x) + R. (4.57)
The value of R depends on ξ. Now, the problem is to find the value of ξ starting
from an initial guess x0 such that R(ξ) = 0 or it is equivalent to
xk+1 = xk − Pn(xk)/P′n(xk),  k = 0, 1, 2, . . . .   (4.59)
The values of Pn(xk) and P′n(xk) can be determined by synthetic division. To
determine the values of b1, b2, . . . , bn−1 and R, compare the coefficients of like powers
of x on both sides of (4.57), which gives the following relations.
a1 = b1 − ξ b1 = a1 + ξ
a2 = b2 − ξb1 b2 = a2 + ξb1
.. ..
. .
ak = bk − ξbk−1 bk = ak + ξbk−1
.. ..
. .
an = R − ξbn−1 R = an + ξbn−1
From (4.57),
Hence,
That is,
Thus
P′n(x) can be evaluated in the same way as Pn(x). Differentiating (4.61) with respect to
ξ, we obtain

dbk/dξ = bk−1 + ξ dbk−1/dξ.
Let

dbk/dξ = ck−1.   (4.64)

Then the above relation becomes ck−1 = bk−1 + ξ ck−2. This gives

ck = bk + ξ ck−1,  k = 1, 2, . . . , n − 1.   (4.65)
Example 4.14.1 Find all the roots of the polynomial equation x4 −8x3 +14.91x2 +
9.54x − 25.92 = 0. One root of the equation lies between 1 and 2.
Solution. Let the polynomial be denoted by P4 (x). Also, let the initial guess be
x0 = 1.2.
Therefore,

x1 = x0 − b4/c3 = 1.2 − (−4.752)/17.676 = 1.46884.

Then x2 = x1 − b4/c3 = 1.46884 − (−0.43638)/14.23706 = 1.49949.

Then x3 = x2 − b4/c3 = 1.49949 − (−0.00702)/13.77774 = 1.50000.
Therefore, one root is 1.50000, which is an exact root. The reduced polynomial is
Therefore,

x1 = x0 − b3/c2 = 4 − (−2.07677)/1.15847 = 5.79268.
x2 = x1 − b3/c2 = 5.79268 − 23.43485/30.51723 = 5.02476.
x3 = x2 − b3/c2 = 5.02476 − 5.96171/15.58018 = 4.64211.
x4 = x3 − b3/c2 = 4.64211 − 1.19931/9.45794 = 4.51531.
x2 − 1.85840x − 3.46435 = 0
But the roots 3.00953 and −1.15113 contain some errors, because we approximated
4.51531 by 4.5. Some more iterations are required to obtain the root 4.5 and the
correct quotient polynomial.
Algorithm Birge-Vieta
//n is the degree of the polynomial and N is the maximum number of iterations to
be performed. Assume that the leading coefficient is one. ε is the error tolerance.//
Read x0 , ε, n, N ;
Read ai , i = 1, 2, . . . , n;
for i = 1 to N do
Set b0 = 1, c0 = 1;
for k = 1 to n do
Compute bk = ak + x0 bk−1 ;
endfor;
for k = 1 to n − 1 do
Compute ck = bk + x0 ck−1 ;
endfor;
Compute x1 = x0 − bn/cn−1;
if |x1 − x0 | < ε then
Print‘One root is’, x1 ;
Stop;
else
Set x0 = x1 ;
endif;
endfor;
Print ‘Root not found in N iterations’;
end Birge-Vieta
Program 4.6.
/* Program Birge-Vieta for polynomial equation
Program to find a root of the polynomial equation by
Birge-Vieta method. Leading coefficient is 1 */
#include<stdio.h>
#include<math.h>
#include<stdlib.h>
void main()
{
int n, N,i,k;
float x0,x1,a[10],b[10],c[10];
float epp=1e-5; /* error tolerance */
printf("\nEnter the degree of the polynomial ");
scanf("%d",&n);
printf("Enter the coefficients of the polynomial, except leading coeff. ");
for(i=1;i<=n;i++) scanf("%f",&a[i]);
printf("Enter the initial guess x0 ");
scanf("%f",&x0);
printf("Enter maximum number of iterations to be done ");
scanf("%d",&N);
b[0]=1; c[0]=1;
for(i=1;i<=N;i++)
{
for(k=1;k<=n;k++) b[k]=a[k]+x0*b[k-1];
for(k=1;k<=n-1;k++) c[k]=b[k]+x0*c[k-1];
x1=x0-b[n]/c[n-1];
if(fabs(x1-x0)<epp)
{
printf("One root is %8.5f obtained at %d iterations",x1,i);
printf("\nCoefficients of the reduced polynomial are\n ");
for(k=0;k<=n-1;k++) printf("%f ",b[k]);
exit(0);
}
else
x0=x1;
} /* i loop */
printf("\n Root not found at %d iterations ",N);
} /* main */
A sample of input/output:
Enter the degree of the polynomial 4
Enter the coefficients of the polynomial, except leading coeff.
-3 0 6 -4
Enter the initial guess x0 1
Enter maximum number of iterations to be done 100
One root is 1.00000 obtained at 1 iterations
Coefficients of the reduced polynomial are
1.000000 -2.000000 -2.000000 4.000000
where
Qn−2 (x) = xn−2 + b1 xn−3 + · · · + bn−3 x + bn−2 . (4.69)
These equations are two non-linear equations in p and q. The values of p and q can
then be determined by Newton-Raphson method for two variables (see Section 4.17.3).
Let (pt , qt ) be the true values of p and q and ∆p, ∆q be the corrections to p and q.
Then
pt = p + ∆p and qt = q + ∆q.
Thus
R(p + ∆p, q + ∆q) = R(p, q) + ∆p ∂R/∂p + ∆q ∂R/∂q + · · · = 0

and S(p + ∆p, q + ∆q) = S(p, q) + ∆p ∂S/∂p + ∆q ∂S/∂q + · · · = 0.
The derivatives are evaluated at (p, q). Neglecting the square and higher powers of
∆p and ∆q the above equations reduce to
Now, we determine the coefficient of the polynomial Qn−2 (x) and the expression for
R and S in terms of p and q.
From equations (4.67)-(4.69)
a1 = b1 + p b1 = a1 − p
a2 = b2 + pb1 + q b2 = a2 − pb1 − q
.. ..
. .
ak = bk + pbk−1 + qbk−2 bk = ak − pbk−1 − qbk−2 (4.75)
.. ..
. .
an−1 = R + pbn−2 + qbn−3 R = an−1 − pbn−2 − qbn−3
an = S + qbn−2 S = an − qbn−2 .
In general,
where b0 = 1, b−1 = 0.
In this notation,
R = bn−1 , S = bn + pbn−1 . (4.77)
Thus R and S are available when b’s are known. To determine the partial derivatives
Rp , Rq , Sp and Sq , differentiating (4.76) with respect to p and q.
∂bk/∂p = −bk−1 − p ∂bk−1/∂p − q ∂bk−2/∂p,  ∂b0/∂p = ∂b−1/∂p = 0   (4.78)

∂bk/∂q = −bk−2 − p ∂bk−1/∂q − q ∂bk−2/∂q,  ∂b0/∂q = ∂b−1/∂q = 0   (4.79)
Denoting

∂bk/∂p = −ck−1,  k = 1, 2, . . . , n   (4.80)

and

∂bk/∂q = −ck−2.   (4.81)
Then (4.78) becomes
ck−1 = bk−1 − pck−2 − qck−3 (4.82)
Therefore,

Rp = ∂bn−1/∂p = −cn−2,

Sp = ∂bn/∂p + p ∂bn−1/∂p + bn−1 = bn−1 − cn−1 − p cn−2,

Rq = ∂bn−1/∂q = −cn−3,

Sq = ∂bn/∂q + p ∂bn−1/∂q = −(cn−2 + p cn−3).
To find the explicit expressions for ∆p and ∆q, substitute the above values into (4.73).
Therefore,

∆p = −(bn cn−3 − bn−1 cn−2) / (cn−2² − cn−3(cn−1 − bn−1)),

∆q = −(bn−1(cn−1 − bn−1) − bn cn−2) / (cn−2² − cn−3(cn−1 − bn−1)).   (4.85)
Therefore, the improved values of p and q are p + ∆p and q + ∆q. Thus if p0 , q0 be
the initial values of p and q then the improved values are
The values of bk ’s and ck ’s may be calculated using the following scheme: (when p0
and q0 are taken as initial values of p and q).
is called the deflated polynomial. The next quadratic polynomial can be obtained in
similar process from the deflated polynomial.
The rate of convergence of this method is quadratic as the computations of ∆p and
∆q are based on Newton-Raphson method.
Example 4.15.1 Extract a quadratic factor using the Bairstow method from the
equation
x4 + 4x3 − 7x2 − 22x + 24 = 0.
Solution. Let the initial guesses of p and q be p0 = 0.5 and q0 = 0.5. Then

∆p = −(b4 c1 − b3 c2)/(c2² − c1(c3 − b3)) = 0.88095,
∆q = −(b3(c3 − b3) − b4 c2)/(c2² − c1(c3 − b3)) = −3.07143,

so p1 = p0 + ∆p = 1.38095, q1 = q0 + ∆q = −2.57143.
Second iteration
∆p = 0.52695, ∆q = −0.29857.
p2 = p1 + ∆p = 1.90790, q2 = q1 + ∆q = −2.86999.
Third iteration
∆p = 0.08531, ∆q = −0.12304.
p3 = p2 + ∆p = 1.99321, q3 = q2 + ∆q = −2.99304.
Fourth iteration
∆p = 0.00676, ∆q = −0.00692.
p4 = p3 + ∆p = 1.99996, q4 = q3 + ∆q = −2.99996.
Fifth iteration
∆p = 0.00004, ∆q = −0.00004.
p5 = p4 + ∆p = 2.00000, q5 = q4 + ∆q = −3.00000.
Therefore, a quadratic factor is x² + 2x − 3, which is equal to (x − 1)(x + 3). The
deflated polynomial is Q2(x) = x² + 2.00004x − 8.00004 ≈ x² + 2x − 8.
Thus P4(x) ≈ (x − 1)(x + 3)(x² + 2x − 8) = (x − 1)(x + 3)(x − 2)(x + 4).
Hence the roots of the given equation are 1, −3, 2, −4.
Algorithm Bairstow
// Extract a quadratic factor x2 + px + q from a polynomial Pn (x) = xn + a1 xn−1 +
· · · + an−1 x + an of degree n and determines the deflated polynomial Qn−2 (x) =
xn−2 + b1 xn−3 + b2 xn−4 + · · · + bn−2 .//
Read n, a1 , a2 , . . . , an ; //the degree and the coefficients of the polynomial.//
Read p, q, ε; //the initial guess of p, q and error tolerance.//
Set b0 = 1, b−1 = 0, c0 = 1, c−1 = 0;
// Compute bk and ck .//
1. for k = 1 to n do
Compute bk = ak − pbk−1 − qbk−2 ;
endfor;
for k = 1 to n − 1 do
Compute ck = bk − pck−1 − qck−2 ;
endfor;
Compute ∆p = −(bn cn−3 − bn−1 cn−2)/(cn−2² − cn−3(cn−1 − bn−1));
∆q = −(bn−1(cn−1 − bn−1) − bn cn−2)/(cn−2² − cn−3(cn−1 − bn−1));
Compute pnew = p + ∆p, qnew = q + ∆q;
if (|pnew − p| < ε) and (|qnew − q| < ε) then
Print ‘The values of p and q are’, pnew , qnew ;
Stop;
endif;
Set p = pnew , q = qnew ;
go to 1.
Print ‘The coefficients of deflated polynomial are’,b1 , b2 , . . . , bn−2 ;
end Bairstow
Program 4.7.
/* Program Bairstow for polynomial equation
Program to find all the roots of a polynomial equation by
Bairstow method. Leading coefficient is 1. Assume
initial guess for all p and q are 0.5, 0.5. */
#include<stdio.h>
#include<math.h>
#include<stdlib.h>
void main()
{
int n,i,k;
float p,q,pnew,qnew,a[10],b[10],c[10],bm1,cm1,delp,delq;
float epp=1e-5; /* error tolerance */
void findroots(float p, float q);
printf("\nEnter the degree of the polynomial ");
scanf("%d",&n);
printf("Enter the coefficients of the polynomial, except leading coeff. ");
for(i=1;i<=n;i++) scanf("%f",&a[i]);
q=0.5;
p=0.5;
printf("The roots are \n");
do
{
b[0]=1; bm1=0; c[0]=1; cm1=0;
pnew=p; qnew=q;
do{
p=pnew; q=qnew;
b[1]=a[1]-p*b[0]-q*bm1;
c[1]=b[1]-p*c[0]-q*cm1;
for(k=2;k<=n;k++) b[k]=a[k]-p*b[k-1]-q*b[k-2];
for(k=2;k<=n;k++) c[k]=b[k]-p*c[k-1]-q*c[k-2];
delp=-(b[n]*c[n-3]-b[n-1]*c[n-2])/
(c[n-2]*c[n-2]-c[n-3]*(c[n-1]-b[n-1]));
delq=-(b[n-1]*(c[n-1]-b[n-1])-b[n]*c[n-2])/
(c[n-2]*c[n-2]-c[n-3]*(c[n-1]-b[n-1]));
pnew=p+delp;
qnew=q+delq;
}while((fabs(pnew-p)>epp || fabs(qnew-q)>epp));
findroots(p,q);
n-=2;
for(i=1;i<=n;i++) a[i]=b[i];
}while(n>2);
if(n==2) findroots(a[1],a[2]); /* roots of the final quadratic factor */
else if(n==1) printf("%8.5f\n",-a[1]); /* root of the final linear factor */
} /* main */
/* definition of the function findroots: prints the roots of
x*x+p*x+q=0 (this part is reconstructed, as it is missing here) */
void findroots(float p, float q)
{
float d=p*p-4*q;
if(d>=0)
printf("%8.5f %8.5f\n",(-p+sqrt(d))/2,(-p-sqrt(d))/2);
else
printf("%8.5f+i%8.5f %8.5f-i%8.5f\n",-p/2,sqrt(-d)/2,-p/2,sqrt(-d)/2);
}
A sample of input/output:
Direct Method
This method may be used to find all the roots (real, repeated or complex) of a
polynomial equation with real coefficients. In this method, an equation is constructed
whose roots are the squares of the roots of the given equation; then another equation
whose roots are the squares of the roots of this new equation; and so on, the process of
root-squaring being continued as many times as necessary.
Let the given equation be

x^n + a1 x^(n−1) + · · · + an−1 x + an = 0.   (4.88)

Separating the even and odd powers and squaring gives

x^(2n) − (a1² − 2a2) x^(2n−2) + (a2² − 2a1a3 + 2a4) x^(2n−4) + · · · + (−1)^n an² = 0.   (4.89)
Putting z = −x², this becomes

z^n + b1 z^(n−1) + · · · + bn−1 z + bn = 0,

where
b1 = a21 − 2a2
b2 = a22 − 2a1 a3 + 2a4
..
.
bk = a2k − 2ak−1 ak+1 + 2ak−2 ak+2 − · · · (4.90)
..
.
bn = a2n .
The roots of this equation in z are −ξ1², −ξ2², . . . , −ξn². The coefficients bk can be
obtained from Table 4.3.
The (k + 1)th column i.e., bk of Table 4.3 can be obtained as follows:
The terms alternate in sign, starting with a positive sign. The first term is ak². The
second term is twice the product of ak−1 and ak+1. The third term is twice the product
of ak−2 and ak+2. This process is continued until there are no available coefficients to
form the cross product terms.

Table 4.3: Computation of the coefficients bk (with a0 = 1).

1   a1       a2        a3        a4        ···   an
1   a1²      a2²       a3²       a4²       ···   an²
    −2a2     −2a1a3    −2a2a4    −2a3a5    ···
             2a4       2a1a5     2a2a6     ···
                       −2a6      −2a1a7    ···
                                 2a8       ···
1   b1       b2        b3        b4        ···   bn
The root-squaring process is repeated a sufficient number of times, say m times,
and we obtain the equation

z^n + c1 z^(n−1) + · · · + cn−1 z + cn = 0.   (4.91)

Let the roots of (4.91) be α1, α2, . . . , αn. These roots are the 2^m-th powers of the roots
of the equation (4.88), with opposite signs, i.e.,

αi = −ξi^(2^m),  i = 1, 2, . . . , n.
That is, taking |ξ1| > |ξ2| > · · · > |ξn|,

|ξn|^(2^m) ≪ |ξn−1|^(2^m) ≪ · · · ≪ |ξ2|^(2^m) ≪ |ξ1|^(2^m),   (4.93)

since all the roots are widely separated in magnitude at the final stage.
Then from (4.92), writing p = 2^m,

c1 = ξ1^p [1 + (ξ2/ξ1)^p + (ξ3/ξ1)^p + · · · + (ξn/ξ1)^p] ≈ ξ1^p,

as (ξi/ξ1)^p → 0 for i > 1 at the desired level of accuracy. Similarly,

c2 = (ξ1ξ2)^p [1 + (ξ3/ξ2)^p + · · · + (ξn−1ξn/(ξ1ξ2))^p] ≈ (ξ1ξ2)^p,

and so on.
This determines the absolute values of the roots. By substituting these values in the
original equation (4.88) one can determine the sign of the roots. The squaring process is
terminated when another squaring process produces new coefficients that are almost the
squares of the corresponding coefficients ck ’s i.e., when the cross product terms become
negligible with respect to square terms. Thus the final stage is identified by the fact
that on root-squaring at that stage all the cross products will vanish.
Case II. All roots are real with one pair of equal magnitude.
Let ξ1 , ξ2 , . . . , ξn be the roots of the given equation, if a pair of roots are equal in
magnitude then this pair is conveniently called a double root. A double root can be
identified in the following way:
If the magnitude of the coefficient ck is about half the square of the magnitude of the
corresponding coefficient in the previous equation, then it indicates that ξk is a double
root. The double root is determined by the following process.
We have
$$\alpha_k \simeq -\frac{c_k}{c_{k-1}} \quad \text{and} \quad \alpha_{k+1} \simeq -\frac{c_{k+1}}{c_k}.$$
Then
$$\alpha_k\,\alpha_{k+1} \simeq \alpha_k^2 \simeq \frac{c_{k+1}}{c_{k-1}}.$$
Therefore,
$$|\alpha_k|^2 = |\xi_k|^{2(2^m)} = \left|\frac{c_{k+1}}{c_{k-1}}\right|.$$
Thus the roots are given by
$$|\xi_1| = c_1^{1/p}, \quad |\xi_2| = (c_2/c_1)^{1/p}, \ \ldots, \ |\xi_{k-1}| = (c_{k-1}/c_{k-2})^{1/p}, \ \ldots,$$
$$|\xi_k| = (c_{k+1}/c_{k-1})^{1/(2p)}, \ \ldots, \ |\xi_n| = (c_n/c_{n-1})^{1/p}, \qquad (4.95)$$
where $p = 2^m$.
This gives the magnitude of the double root. The sign is determined by substituting
the root to the equation.
The double root can also be determined directly since αk and αk+1 converge to the
same root after sufficient squaring. Generally, the rate of convergence to the double
root is slow.
Case III. One pair of complex roots, the other roots being distinct in magnitude.
Let the complex pair be
$$\xi_k, \xi_{k+1} = \rho_k e^{\pm i\theta_k} = u \pm iv.$$
The modulus $\rho_k$ is obtained as in Case II, since $\rho_k^{2p} \simeq |c_{k+1}/c_{k-1}|$. Again, the sum
of the roots of (4.88) is $-a_1$, i.e., $\xi_1 + \cdots + \xi_{k-1} + 2u + \xi_{k+2} + \cdots + \xi_n = -a_1$.
From this relation one can determine the value of $u$. Then the value of $v$ can be
determined from the relation
$$v^2 = \rho_k^2 - u^2.$$
The presence of complex roots in (k + 1)th column is identified by the following
technique:
If the coefficient of $x^{n-k}$ fluctuates in both magnitude and sign under successive
squarings, a complex pair is indicated by this oscillation.
3. As a direct method, there is no scope for correcting an error generated at any
stage. If an error occurs at some stage, it propagates through all the subsequent
computations and ultimately gives a wrong result.
4. The method is laborious, and to get a very accurate result the squaring has to be
repeated a large number of times.
In the following, three examples are considered to discuss the three possible cases of
Graeffe's method. The following table is the Graeffe root-squaring scheme for a fourth-
degree equation.

1   a_1       a_2          a_3         a_4
1   a_1^2     a_2^2        a_3^2       a_4^2
    -2a_2     -2a_1a_3     -2a_2a_4
              2a_4
1   c_1       c_2          c_3         c_4
Solution. All the calculations are shown in the following table. The number within
the parentheses represents the exponent of the adjacent number, i.e., 0.75(02) means
$0.75 \times 10^2$.
m    2^m    x^4         x^3             x^2            x              1
0 1 1.00000 −0.75000(01) 0.20000(02) −0.22500(02) 0.90000(01)
1.00000 0.56250(02) 0.40000(03) 0.50625(03) 0.81000(02)
−0.40000(02) −0.33750(03) −0.36000(03)
0.18000(02)
1 2 1.00000 0.16250(02) 0.80500(02) 0.14625(03) 0.81000(02)
1.00000 0.26406(03) 0.64803(04) 0.21389(05) 0.65610(04)
−0.16100(03) −0.47531(04) −0.13041(05)
0.16200(03)
2 4 1.00000 0.10306(03) 0.18891(04) 0.83481(04) 0.65610(04)
1.00000 0.10622(05) 0.35688(07) 0.69690(08) 0.43047(08)
−0.37783(04) −0.17207(07) −0.24789(08)
0.13122(05)
3 8 1.00000 0.68436(04) 0.18612(07) 0.44901(08) 0.43047(08)
1.00000 0.46835(08) 0.34640(13) 0.20161(16) 0.18530(16)
−0.37223(07) −0.61457(12) −0.16023(15)
0.86093(08)
4 16 1.00000 0.43113(08) 0.28495(13) 0.18559(16) 0.18530(16)
1.00000 0.18587(16) 0.81195(25) 0.34443(31) 0.34337(31)
−0.56989(13) −0.16002(24) −0.10560(29)
0.37060(16)
5 32 1.0000 0.18530(16) 0.79595(25) 0.34337(31) 0.34337(31)
This is the final equation since all the cross products vanish at the next step and all
the roots are real and distinct in magnitude.
Therefore,
$$|\xi_3| = \left(\frac{0.34337 \times 10^{31}}{0.79595 \times 10^{25}}\right)^{1/32} = 1.5000, \qquad
|\xi_4| = \left(\frac{0.34337 \times 10^{31}}{0.34337 \times 10^{31}}\right)^{1/32} = 1.0000.$$
The diminishing cross products vanish at the next step and hence this is the final
equation; the characteristic behaviour of a double root is found in the second column.
Therefore, the magnitude of the double root is 1.4142.
Here 1.4142 as well as $-1.4142$ satisfies the given equation; hence the roots of the
given equation are 2, $\pm 1.4142$, 1.
To solve a system of nonlinear equations, the following methods are discussed in this
section:
1. Fixed point iteration
2. Seidal iteration
3. Newton-Raphson method.
Let the system of two equations be
$$f(x, y) = 0 \quad \text{and} \quad g(x, y) = 0, \qquad (4.96)$$
whose real roots are required within a specified accuracy. The above system can be
rewritten as
$$x = F(x, y) \quad \text{and} \quad y = G(x, y). \qquad (4.97)$$
Starting with an initial approximation $(x_0, y_0)$, the successive approximations are
computed as
$$x_1 = F(x_0, y_0), \quad y_1 = G(x_0, y_0)$$
$$x_2 = F(x_1, y_1), \quad y_2 = G(x_1, y_1)$$
$$\cdots\cdots\cdots$$
$$x_{n+1} = F(x_n, y_n), \quad y_{n+1} = G(x_n, y_n). \qquad (4.98)$$
Like the iteration process for a single variable, the above sequence converges to a root
under certain conditions. The sufficient condition is stated below.
Theorem 4.12 Assume that the functions x = F (x, y), y = G(x, y) and their first
order partial derivatives are continuous on a region R that contains a root (ξ, η). If the
starting point (x0 , y0 ) is sufficiently close to (ξ, η) and if
$$\left|\frac{\partial F}{\partial x}\right| + \left|\frac{\partial F}{\partial y}\right| < 1
\quad \text{and} \quad
\left|\frac{\partial G}{\partial x}\right| + \left|\frac{\partial G}{\partial y}\right| < 1, \qquad (4.100)$$
for all (x, y) ∈ R, then the iteration scheme (4.98) converges to a root (ξ, η).
The condition for the functions x = F (x, y, z), y = G(x, y, z), z = H(x, y, z) is
$$\left|\frac{\partial F}{\partial x}\right| + \left|\frac{\partial F}{\partial y}\right| + \left|\frac{\partial F}{\partial z}\right| < 1,$$
$$\left|\frac{\partial G}{\partial x}\right| + \left|\frac{\partial G}{\partial y}\right| + \left|\frac{\partial G}{\partial z}\right| < 1$$
$$\text{and} \quad \left|\frac{\partial H}{\partial x}\right| + \left|\frac{\partial H}{\partial y}\right| + \left|\frac{\partial H}{\partial z}\right| < 1.$$
Example. Solve the system of equations
$$x = \frac{8x - 4x^2 + y^2 + 1}{8} \quad \text{and} \quad y = \frac{2x - x^2 + 4y - y^2 + 3}{4}$$
starting with $(x_0, y_0) = (1.1, 2.0)$, using (i) iteration method, and (ii) Seidal iteration
method.
Sol. of Algebraic and Transcendental Equs. 263
Solution.
(i) Iteration method
Let
$$F(x, y) = \frac{8x - 4x^2 + y^2 + 1}{8} \quad \text{and} \quad G(x, y) = \frac{2x - x^2 + 4y - y^2 + 3}{4}.$$
The iteration scheme is
$$x_{n+1} = F(x_n, y_n) = \frac{8x_n - 4x_n^2 + y_n^2 + 1}{8},$$
$$\text{and} \quad y_{n+1} = G(x_n, y_n) = \frac{2x_n - x_n^2 + 4y_n - y_n^2 + 3}{4}.$$
The value of xn , yn , xn+1 and yn+1 for n = 0, 1, . . . are shown in the following table.
n xn yn xn+1 yn+1
0 1.10000 2.00000 1.12000 1.99750
1 1.12000 1.99750 1.11655 1.99640
2 1.11655 1.99640 1.11641 1.99660
3 1.11641 1.99660 1.11653 1.99661
4 1.11653 1.99661 1.11652 1.99660
(ii) Seidal iteration method
In this scheme the updated value of $x$ is used at once, i.e., $x_{n+1} = F(x_n, y_n)$,
$y_{n+1} = G(x_{n+1}, y_n)$. The results are shown in the following table.
n xn yn xn+1 yn+1
0 1.10000 2.00000 1.12000 1.99640
1 1.12000 1.99640 1.11600 1.99663
2 1.11600 1.99663 1.11659 1.99660
3 1.11659 1.99660 1.11650 1.99660
Algorithm 4.8 (Seidal iteration). This algorithm is used to solve two non-linear
equations by Seidal iteration, when the initial guess is given.
Algorithm Seidal-Iteration-2D
// Let (x0 , y0 ) be the initial guess of the system of equations x = F (x, y), y = G(x, y).
ε be the error tolerance, maxiteration represents the maximum number of repetitions
to be done.//
Input functions F (x, y), G(x, y).
Read ε, maxiteration, x0 , y0 ;
Set k = 0, error = 1;
While k < maxiteration and error > ε do
Set k = k + 1;
Compute x1 = F (x0 , y0 ), y1 = G(x1 , y0 );
Compute error = |x1 − x0 | + |y1 − y0 |;
Set x0 = x1 , y0 = y1 ;
endwhile;
if error < ε then
Print ‘The sequence converges to the root’, x1 , y1 ;
Stop;
else
Print ‘The iteration did not converge after’, k,’iterations’;
Stop;
endif;
end Seidal-Iteration-2D
Program 4.8
/* Program Seidal for a pair of non-linear equations
Program to find a root of a pair of non-linear equations
by Seidal method. Assumed that the equations are given
in the form x=f(x,y) and y=g(x,y).
The equations taken are x*x+4y*y-4=0, x*x-2x-y+1=0. */
#include<stdio.h>
#include<math.h>
#include<stdlib.h>
void main()
{
int k=0,maxiteration;
float error=1,eps=1e-5,x0,y0; /*initial guesses for x and y*/
float x1,y1;
float f(float x, float y);
float g(float x, float y);
printf("Enter initial guess for x and y ");
scanf("%f %f",&x0,&y0);
printf("Maximum iterations to be allowed ");
scanf("%d",&maxiteration);
while(k<maxiteration && error>eps)
{
 k++;
 x1=f(x0,y0);
 y1=g(x1,y0); /* Seidal: the updated x is used */
 error=fabs(x1-x0)+fabs(y1-y0);
 x0=x1; y0=y1;
}
if(error<eps)
 printf("The sequence converges to the root (%f,%f)\n",x1,y1);
else
 printf("The iteration did not converge after %d iterations\n",k);
} /* main */
/* definition of f(x,y) */
float f(float x, float y)
{
float f1;
f1=sqrt(4-4*y*y);
return f1;
}
/* definition of g(x,y) */
float g(float x, float y)
{
float g1;
g1=x*x-2*x+1;
return g1;
}
A sample of input/output:
Let $(x_0, y_0)$ be an initial guess and $(x_0 + h, y_0 + k)$ the exact root, so that
$$f(x_0 + h, y_0 + k) = 0, \qquad g(x_0 + h, y_0 + k) = 0. \qquad (4.104)$$
Expanding in Taylor series and neglecting square and higher order terms, the above
equations simplify to
$$h\frac{\partial f_0}{\partial x} + k\frac{\partial f_0}{\partial y} = -f_0$$
$$h\frac{\partial g_0}{\partial x} + k\frac{\partial g_0}{\partial y} = -g_0 \qquad (4.106)$$
where $f_0 = f(x_0, y_0)$, $\dfrac{\partial f_0}{\partial x} = \dfrac{\partial f}{\partial x}\Big|_{(x_0, y_0)}$, etc.
The above system can be written as
$$\begin{pmatrix} \dfrac{\partial f_0}{\partial x} & \dfrac{\partial f_0}{\partial y} \\[2mm] \dfrac{\partial g_0}{\partial x} & \dfrac{\partial g_0}{\partial y} \end{pmatrix}
\begin{pmatrix} h \\ k \end{pmatrix} = \begin{pmatrix} -f_0 \\ -g_0 \end{pmatrix}
\quad \text{or} \quad
\begin{pmatrix} h \\ k \end{pmatrix} = J^{-1} \begin{pmatrix} -f_0 \\ -g_0 \end{pmatrix}.$$
Theorem 4.13 Let (x0 , y0 ) be an initial guess to a root (ξ, η) of the system f (x, y) =
0, g(x, y) = 0 in a closed neighbourhood R containing (ξ, η). If
1. f, g and their first order partial derivatives are continuous and bounded in R, and
2. J ≠ 0 in R, then the sequence of approximation xn+1 = xn + h, yn+1 = yn + k,
where h and k are given by (4.107), converges to the root (ξ, η).
At $(x_0, y_0)$, $J_0 = \begin{pmatrix} 2 & -1 \\ 4 & 2 \end{pmatrix}$.
Therefore,
$$\begin{pmatrix} 2 & -1 \\ 4 & 2 \end{pmatrix} \begin{pmatrix} h \\ k \end{pmatrix} = \begin{pmatrix} -0.25 \\ -0.25 \end{pmatrix}$$
or,
$$\begin{pmatrix} h \\ k \end{pmatrix} = \frac{1}{8} \begin{pmatrix} 2 & 1 \\ -4 & 2 \end{pmatrix} \begin{pmatrix} -0.25 \\ -0.25 \end{pmatrix} = \begin{pmatrix} -0.09375 \\ 0.06250 \end{pmatrix}.$$
Thus, $x_1 = x_0 + h = 2.00 - 0.09375 = 1.90625$,
$y_1 = y_0 + k = 0.25 + 0.0625 = 0.31250$.
At $(x_1, y_1)$, $J_1 = \begin{pmatrix} 1.81250 & -1.00000 \\ 3.81250 & 2.50000 \end{pmatrix}$, $\begin{pmatrix} -f_1 \\ -g_1 \end{pmatrix} = \begin{pmatrix} -0.00879 \\ -0.02441 \end{pmatrix}$.
$$J_1 \begin{pmatrix} h \\ k \end{pmatrix} = \begin{pmatrix} -f_1 \\ -g_1 \end{pmatrix}$$
or,
$$\begin{pmatrix} h \\ k \end{pmatrix} = \frac{1}{8.34375} \begin{pmatrix} 2.50000 & 1.00000 \\ -3.81250 & 1.81250 \end{pmatrix} \begin{pmatrix} -0.00879 \\ -0.02441 \end{pmatrix} = \begin{pmatrix} -0.00556 \\ -0.00129 \end{pmatrix}.$$
Therefore, $x_2 = x_1 + h = 1.90625 - 0.00556 = 1.90069$,
$y_2 = y_1 + k = 0.31250 - 0.00129 = 0.31121$.
At $(x_2, y_2)$, $J_2 = \begin{pmatrix} 1.80138 & -1.00000 \\ 3.80138 & 2.48968 \end{pmatrix}$, $\begin{pmatrix} -f_2 \\ -g_2 \end{pmatrix} = \begin{pmatrix} -0.00003 \\ -0.00003 \end{pmatrix}$.
$$J_2 \begin{pmatrix} h \\ k \end{pmatrix} = \begin{pmatrix} -f_2 \\ -g_2 \end{pmatrix}$$
or,
$$\begin{pmatrix} h \\ k \end{pmatrix} = \frac{1}{8.28624} \begin{pmatrix} 2.48968 & 1.00000 \\ -3.80138 & 1.80138 \end{pmatrix} \begin{pmatrix} -0.00003 \\ -0.00003 \end{pmatrix} = \begin{pmatrix} -0.00001 \\ 0.00001 \end{pmatrix}.$$
Hence, $x_3 = x_2 + h = 1.90069 - 0.00001 = 1.90068$,
$y_3 = y_2 + k = 0.31121 + 0.00001 = 0.31122$.
Thus, one root is $x = 1.9007$, $y = 0.3112$ correct up to four decimal places.
Compute $delgx = \dfrac{\partial g}{\partial x}\Big|_{(x_0,y_0)}$, $delgy = \dfrac{\partial g}{\partial y}\Big|_{(x_0,y_0)}$;
Compute J = delf x ∗ delgy − delgx ∗ delf y;
Compute h = (−f0 ∗ delgy + g0 ∗ delf y)/J;
Compute k = (−g0 ∗ delf x + f0 ∗ delgx)/J;
Compute x0 ← x0 + h, y0 ← y0 + k;
endfor;
Print ‘Solution does not converge in’, maxiteration, ‘iteration’;
end Newton-Raphson-2D
Program 4.9.
/* Program Newton-Raphson (for a pair of non-linear equations)
Program to find a root of a pair of non-linear equations
by Newton-Raphson method. Partial derivative of f and g
w.r.t. x and y are to be supplied.
The equations taken are 3x*x-2y*y-1=0, x*x-2x+2y-8=0. */
#include<stdio.h>
#include<math.h>
#include<stdlib.h>
void main()
{
int i,maxiteration;
float eps=1e-5,x0,y0; /* initial guesses for x and y */
float J,k,h,delfx,delfy,delgx,delgy,f0,g0;
float f(float x, float y);
float fx(float x, float y);
float fy(float x, float y);
float g(float x, float y);
float gx(float x, float y);
float gy(float x, float y);
printf("Enter initial guess for x and y ");
scanf("%f %f",&x0,&y0);
printf("Maximum iterations to be allowed ");
scanf("%d",&maxiteration);
for(i=1;i<=maxiteration;i++)
{
f0=f(x0,y0);
g0=g(x0,y0);
if(fabs(f0)<eps && fabs(g0)<eps)
{
 printf("The root is (%8.5f,%8.5f)\n",x0,y0);
 exit(0);
}
delfx=fx(x0,y0); delfy=fy(x0,y0);
delgx=gx(x0,y0); delgy=gy(x0,y0);
J=delfx*delgy-delgx*delfy;
h=(-f0*delgy+g0*delfy)/J;
k=(-g0*delfx+f0*delgx)/J;
x0+=h; y0+=k;
}
printf("Solution does not converge in %d iterations\n",maxiteration);
}/* main */
/* definitions of f, g and their partial derivatives */
float f(float x, float y) { return(3*x*x-2*y*y-1); }
float fx(float x, float y) { return(6*x); }
float fy(float x, float y) { return(-4*y); }
float g(float x, float y) { return(x*x-2*x+2*y-8); }
float gx(float x, float y) { return(2*x-2); }
float gy(float x, float y) { return(2); }
A sample of input/output:
4.18 Exercise
3. Obtain a root for each of the following equations using bisection method, regula-
falsi method, iteration method, Newton-Raphson method, secant method.
(i) x3 + 2x2 − x + 7 = 0, (ii) x3 − 4x − 9 = 0, (iii) cos x = 3x − 1
(iv) e−x = 10x, (v) sin x = 10(x − 1), (vi) sin2 x = x2 − 1
(vii) tan x − tanh x = 0, (viii) x3 + 0.5x2 − 7.5x − 9.0 = 0, (ix) tan x + x = 0,
(x) x3 − 5.2x2 − 17.4x + 21.6 = 0, (xi) x7 + 28x4 − 480 = 0,
(xii) (x − 1)(x − 2)(x − 3) = 0, (xiii) x − cos x = 0, (xiv) x + log x = 2,
(xv) $\sin x = \frac{1}{2}x$, (xvi) $x^3 - \sin x + 1 = 0$, (xvii) $2x = \cos x + 3$,
(xviii) $x \log_{10} x = 4.77$, (xix) $x^2 - \sin \pi x = 0$, (xx) $2x - 2x^2 - 1 = 0$,
(xxi) $2\log x - \dfrac{2}{x} + 1 = 0$, (xxii) $\sqrt{1 + x} = 1/x$, (xxiii) $\log x + (x + 1)^3 = 0$.
4. Find a root of the equation $x = \frac{1}{2} + \sin x$ by using the fixed point iteration method
$$x_{n+1} = \frac{1}{2} + \sin x_n, \quad x_0 = 1$$
correct to six decimal places.
5. Use Newton-Raphson method for multiple roots to find the roots of the following
equations.
(i) x3 − 3x + 2 = 0, (ii) x4 + 2x3 − 2x − 1 = 0.
6. Describe the bisection method to find a root of the equation f (x) = 0 when
f (a) · f (b) < 0, a, b be two specified numbers. Is this condition necessary to get a
root using this method ?
7. Describe regula-falsi method for finding a real root of an equation. Why this
method is called the bracketing method ? Give a geometric interpretation of this
method. What is the rate of convergence of regula-falsi method ? Compare this
method with Newton-Raphson method. Discuss the advantages and disadvantages
of this method.
9. What is the main difference between regula-falsi method and secant method ?
10. Explain how an equation f (x) = 0 can be solved by the method of iteration (fixed
point iteration) and deduce the condition of convergence. Show that the rate of
convergence of iteration method is one. Give a geometric interpretation of this
method. Discuss the advantages and disadvantages of this method.
11. Find the iteration schemes to solve the following equations using fixed point iter-
ation method:
(i) 2x − sin x − 1 = 0, (ii) x3 − 2x − 1 = 0, (iii) x + sin x = 0.
12. Describe Newton-Raphson method for computing a simple real root of an equa-
tion f (x) = 0. Give a geometrical interpretation of the method. What are the
advantages and disadvantages of this method. Prove that the Newton-Raphson
method has a second order convergence.
13. Find the iteration schemes to solve the following equations using Newton-Raphson
method:
(i) 2x − cos x − 1 = 0, (ii) x5 + 3x2 − 1 = 0, (iii) x2 − 2 = 0.
$$x_{n+1} = \frac{1}{k}\left[(k - 1)x_n + \frac{a}{x_n^{k-1}}\right]$$
16. Use Newton-Raphson method and modified Newton-Raphson method to find the
roots of the following equations and compare the methods.
(i) x3 − 3x + 2 = 0, (ii) x4 + 2x3 − 2x − 1 = 0.
17. Solve the following equation using Newton-Raphson method:
(i) z 3 − 4iz 2 − 3ez = 0, z0 = −0.53 − 0.36i, (ii) 1 + z 2 + z 3 = 0, z0 = 1 + i.
18. Compare Newton-Raphson method and modified Newton-Raphson method to find
a root of the equation f (x) = 0.
19. Use modified Newton-Raphson method to find a root of the equation
$$\int_0^x e^{-t^2}\,dt = \frac{1}{10}$$
correct to six decimal places.
20. Devise a scheme to find the value of $\sqrt[5]{a}$ using modified Newton-Raphson method.
21. The formula
$$x_{n+1} = x_n - \frac{1}{2}\,\frac{f\big(x_n + 2f(x_n)/f'(x_n)\big)}{f'(x_n)}$$
is used to find a multiple root of multiplicity two of the equation f (x) = 0. Show
that the rate of convergence of this method is cubic.
22. The iteration scheme
$$x_{n+1} = x_n - \frac{3\log_e x_n - e^{-x_n}}{p}$$
is used to find the root of the equation e−x − 3 loge x = 0. Show that p = 3 gives
rapid convergence.
23. Use Muller’s method to find a root of the equations
(i) x3 − 2x − 1 = 0, (ii) x4 − 6x2 + 3x − 1 = 0.
24. Determine the order of convergence of the iterative method
x0 f (xn ) − xn f (x0 )
xn+1 =
f (xn ) − f (x0 )
for finding the simple root of the equation f (x) = 0.
25. Find the order of convergence of the Steffensen method
f (xn ) f (xn + f (xn )) − f (xn )
xn+1 = xn − , g(xn ) = , n = 0, 1, 2, . . . .
g(xn ) f (xn )
Use this method to find a root of the equation x − 1 + ex = 0.
26. Show that the iteration scheme to find the value of $\sqrt{a}$ using Chebyshev third
order method is given by
$$x_{n+1} = \frac{1}{2}\left(x_n + \frac{a}{x_n}\right) - \frac{1}{8x_n}\left(\frac{a}{x_n} - x_n\right)^2.$$
Use this scheme to find the value of $\sqrt{17}$.
28. Using Graeffe’s root squaring method, find the roots of the following equations
(i) x3 − 4x2 + 3x + 1 = 0, (ii) x3 − 3x − 5 = 0,
(iii) x4 − 2x3 + 1.99x2 − 2x + 0.99 = 0, (iv) x3 + 5x2 − 44x − 60 = 0.
29. Use Birge-Vieta method to find the roots of the equations to 3 decimal places.
(i) x4 − 8x3 + 14.91x2 + 9.54x − 25.92 = 0, (ii) x3 − 4x + 1 = 0,
(iii) x4 − 1.1x3 − 0.2x2 − 1.2x + 0.9 = 0.
30. Find the quadratic factors of the following polynomial equations using Bairstow’s
method.
(i) x4 − 8x3 + 39x2 − 62x + 50 = 0, (ii) x3 − 2x2 + x − 2 = 0,
(iii) x4 − 6x3 + 18x2 − 24x + 16 = 0, (iv) x3 − 2x + 1 = 0.
31. Solve the following systems of nonlinear equations using iteration method
(i) x2 + y = 11, y 2 + x = 7, (ii) 2xy − 3 = 0, x2 − y − 2 = 0.
32. Solve the following systems of nonlinear equations using Seidal method
(i) x2 + 4y 2 − 4 = 0, x2 − 2x − y + 1 = 0, start with (1.5, 0.5),
(ii) 3x2 − 2y 2 − 1 = 0, x2 − 2x + y 2 + 2y − 8 = 0, start with (−1, 1).
33. Solve the following systems of nonlinear equations using Newton-Raphson method
(i) 3x2 − 2y 2 − 1 = 0, x2 − 2x + 2y − 8 = 0 start with initial guess (2.5,3),
(ii) x2 − x + y 2 + z 2 − 5 = 0, x2 + y 2 − y + z 2 − 4 = 0, x2 + y 2 + z 2 + z − 6 = 0
start with (−0.8, 0.2, 1.8) and (1.2, 2.2, −0.2).
34. Use Newton’s method to find all nine solutions to 7x3 − 10x − y − 1 = 0, 8y 3 −
11y + x − 1 = 0. Use the starting points (0, 0), (1, 0), (0, 1), (−1, 0), (0, −1), (1, 1),
(−1, 1), (1, −1) and (−1, −1).
35. What are the differences between direct method and iterative method ?
Chapter 5
Solution of System of Linear Equations
Direct Methods
To solve a system of linear equations, a simple (but not efficient) method was discovered
by Gabriel Cramer in 1750.
Let the determinant of the coefficients of the system (5.2) be $D = |a_{ij}|$; $i, j =
1, 2, \ldots, n$, i.e., $D = |A|$. In this method, it is assumed that $D \ne 0$. Cramer's
rule is described in the following. From the properties of determinants,
$$x_1 D = \begin{vmatrix} a_{11}x_1 & a_{12} & \cdots & a_{1n} \\ a_{21}x_1 & a_{22} & \cdots & a_{2n} \\ \cdots & \cdots & \cdots & \cdots \\ a_{n1}x_1 & a_{n2} & \cdots & a_{nn} \end{vmatrix}
= \begin{vmatrix} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n & a_{12} & \cdots & a_{1n} \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n & a_{22} & \cdots & a_{2n} \\ \cdots & \cdots & \cdots & \cdots \\ a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n & a_{n2} & \cdots & a_{nn} \end{vmatrix}$$
[using the operation $C_1' = C_1 + x_2C_2 + \cdots + x_nC_n$]
Solution of System of Linear Equations 277
$$= \begin{vmatrix} b_1 & a_{12} & \cdots & a_{1n} \\ b_2 & a_{22} & \cdots & a_{2n} \\ \cdots & \cdots & \cdots & \cdots \\ b_n & a_{n2} & \cdots & a_{nn} \end{vmatrix} \quad [\text{using } (5.1)]
\;=\; D_{x_1} \ \text{(say)}.$$
Therefore, $x_1 = D_{x_1}/D$.
Similarly, $x_2 = \dfrac{D_{x_2}}{D}, \ \ldots, \ x_n = \dfrac{D_{x_n}}{D}$.
In general, $x_i = \dfrac{D_{x_i}}{D}$, where
$$D_{x_i} = \begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1\,i-1} & b_1 & a_{1\,i+1} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2\,i-1} & b_2 & a_{2\,i+1} & \cdots & a_{2n} \\ \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots \\ a_{n1} & a_{n2} & \cdots & a_{n\,i-1} & b_n & a_{n\,i+1} & \cdots & a_{nn} \end{vmatrix},
\quad i = 1, 2, \ldots, n.$$
Example 5.1.1 Use Cramer’s rule to solve the following systems of equations
x1 + x2 + x3 = 2
2x1 + x2 − x3 = 5
x1 + 3x2 + 2x3 = 5.
$$D = \begin{vmatrix} a_{11} & a_{12} & a_{13} & \cdots & a_{1n} \\ 0 & a_{22}^{(1)} & a_{23}^{(1)} & \cdots & a_{2n}^{(1)} \\ 0 & 0 & a_{33}^{(2)} & \cdots & a_{3n}^{(2)} \\ \cdots & \cdots & \cdots & \cdots & \cdots \\ 0 & 0 & 0 & \cdots & a_{nn}^{(n-1)} \end{vmatrix},$$
where
$$a_{ij}^{(k)} = a_{ij}^{(k-1)} - \frac{a_{ik}^{(k-1)}}{a_{kk}^{(k-1)}}\,a_{kj}^{(k-1)}; \qquad (5.5)$$
$i, j = k+1, \ldots, n$; $k = 1, 2, \ldots, n-1$ and $a_{ij}^{(0)} = a_{ij}$, $i, j = 1, 2, \ldots, n$.
Then the value of $D$ is equal to $a_{11}a_{22}^{(1)}a_{33}^{(2)}\cdots a_{nn}^{(n-1)}$.
To compute the value of $a_{ij}^{(k)}$, one division is required. If $a_{kk}^{(k-1)}$ is zero then further
reduction is not possible. If $a_{kk}^{(k-1)}$ is small then the division leads to the loss of significant
digits. To prevent the loss of significant digits, the pivoting techniques are used.
A pivot is the largest magnitude element in a row or a column or the principal
diagonal or the leading or trailing sub-matrix of order i (2 ≤ i ≤ n).
Partial pivoting
In the first stage, the first pivot is determined by finding the largest element in mag-
nitude among the elements of first column and let it be ai1 . Then rows i and 1 are
interchanged. In the second stage, the second pivot is determined by finding the largest
element in magnitude among the elements of second column leaving first element and
let it be $a_{j2}$. The second and $j$th rows are interchanged. This process is repeated
$(n-1)$ times. In general, at the $k$th stage, the smallest index $j$ is chosen for which
$$|a_{jk}^{(k)}| = \max\{|a_{kk}^{(k)}|, |a_{k+1\,k}^{(k)}|, \ldots, |a_{nk}^{(k)}|\} = \max\{|a_{ik}^{(k)}|,\ i = k, k+1, \ldots, n\}$$
and the rows k and j are interchanged.
The largest element (in magnitude) in the first column is $-8$. Then interchanging the
first and third rows, i.e.,
$$A \sim \begin{pmatrix} -8 & 1 & 6 \\ 4 & 5 & 1 \\ 1 & 7 & 3 \end{pmatrix}.$$
The largest element in the second column, leaving the first row, is 7, so the second
and third rows are interchanged. The matrix after partial pivoting is
$$A \sim \begin{pmatrix} -8 & 1 & 6 \\ 1 & 7 & 3 \\ 4 & 5 & 1 \end{pmatrix}.$$
Let us consider the matrix $B = \begin{pmatrix} 1 & 7 & 3 \\ 4 & -8 & 5 \\ 2 & 6 & 1 \end{pmatrix}$ to illustrate the complete pivoting.
The largest element (in magnitude) is determined among all the elements of the matrix. It
is $-8$, attained at the $(2, 2)$ position. Therefore, the first and second columns, and the
first and second rows are interchanged. The matrix is transferred to
$$B \sim \begin{pmatrix} -8 & 4 & 5 \\ 7 & 1 & 3 \\ 6 & 2 & 1 \end{pmatrix}.$$
The number $-8$ is the first pivot. To find the second pivot, the largest element is
determined from the trailing sub-matrix $\begin{pmatrix} 1 & 3 \\ 2 & 1 \end{pmatrix}$. The largest element is 3 and it
is at the position $(2, 3)$. Interchanging the second and third columns, the final matrix
(after complete pivoting) is
$$B \sim \begin{pmatrix} -8 & 5 & 4 \\ 7 & 3 & 1 \\ 6 & 1 & 2 \end{pmatrix}.$$
Solution. (i) The largest element in the first column is 5, which is the first pivot of $A$.
Interchanging the first and third rows, we obtain
$$\begin{pmatrix} 5 & 1 & -2 \\ 4 & 6 & 1 \\ 2 & 0 & 4 \end{pmatrix}.$$
Subtracting $\frac{4}{5}$ times the first row from the second row and $\frac{2}{5}$ times the first row
from the third row, we obtain
$$\begin{pmatrix} 5 & 1 & -2 \\ 0 & 26/5 & 13/5 \\ 0 & -2/5 & 24/5 \end{pmatrix}.$$
The second pivot is $\dfrac{26}{5}$, which is the largest element (in magnitude) among the
elements of the second column except the first row. Since this element is in the $(2,2)$
position, no interchange of rows is required.
Adding $\dfrac{2/5}{26/5}$ times the second row to the third row, i.e., $R_3' = R_3 + \dfrac{2/5}{26/5}R_2$, we obtain
$$\begin{pmatrix} 5 & 1 & -2 \\ 0 & 26/5 & 13/5 \\ 0 & 0 & 5 \end{pmatrix}.$$
(ii) The largest element in $A$ is 6. Interchanging the first and second columns and setting
$sign = -1$; and then interchanging the first and second rows and setting $sign = -sign = 1$, we have
$$\begin{pmatrix} 6 & 4 & 1 \\ 0 & 2 & 4 \\ 1 & 5 & -2 \end{pmatrix}.$$
Adding $-\frac{1}{6}$ times the first row to the third row, i.e., $R_3' = R_3 - \frac{1}{6}R_1$, we obtain
$$\begin{pmatrix} 6 & 4 & 1 \\ 0 & 2 & 4 \\ 0 & 13/3 & -13/6 \end{pmatrix}.$$
The pivot of the trailing sub-matrix $\begin{pmatrix} 2 & 4 \\ 13/3 & -13/6 \end{pmatrix}$ is $13/3$. Interchanging the
second and third rows, we have
$$\begin{pmatrix} 6 & 4 & 1 \\ 0 & 13/3 & -13/6 \\ 0 & 2 & 4 \end{pmatrix}$$
and $sign = -1$.
Adding $-\dfrac{2}{13/3}$ times the second row to the third row, i.e., $R_3' = R_3 - \dfrac{6}{13}R_2$, we obtain
$$\begin{pmatrix} 6 & 4 & 1 \\ 0 & 13/3 & -13/6 \\ 0 & 0 & 5 \end{pmatrix}.$$
Therefore, $|A| = sign \times (6)(13/3)(5) = -130$.
The algorithms for triangularization, i.e., to find the value of a determinant using
partial and complete pivoting, are presented below.
Step 6. Subtract $\dfrac{a_{jk}}{a_{kk}}$ times the $k$th row from the $j$th row for $j = k+1, k+2, \ldots, n$,
i.e., for $j = k+1, k+2, \ldots, n$ do the following: $R_j' = R_j - \dfrac{a_{jk}}{a_{kk}}R_k$, where
$R_j$, $R_j'$ are the old and new $j$th rows respectively.
// This step makes $a_{k+1\,k}, a_{k+2\,k}, \ldots, a_{nk}$ zero.//
Step 7. Increment k by 1 i.e., set k = k+1. If k < n then goto Step 3. Otherwise,
//Triangularization is complete.//
Compute |A| = sign × product of diagonal elements.
Print |A| and Stop.
end Det Partial Pivoting.
Program 5.1.
/* Program Partial Pivoting
Program to find the value of a determinant using partial pivoting */
#include<stdio.h>
#include<stdlib.h>
#include<math.h>
void main()
{
int n,k,i,j,sign=1;
float a[10][10],b[10],prod,temp;
int max1(float b[],int k, int n);
printf("\nEnter the size of the determinant ");
scanf("%d",&n);
printf("Enter the elements rowwise ");
for(i=1;i<=n;i++) for(j=1;j<=n;j++)
scanf("%f",&a[i][j]);
for(k=1;k<=n;k++)
{
for(i=k;i<=n;i++) b[i]=a[i][k];
/* copy from a[k][k] to a[n][k] into b */
j=max1(b,k,n); /* finds pivot position */
if(a[j][k]==0)
{
printf("The value of determinant is 0");
exit(0);
}
if(j!=k) /* interchange k and j rows */
{
sign=-sign;
for(i=1;i<=n;i++){
temp=a[j][i]; a[j][i]=a[k][i]; a[k][i]=temp;
}
}
for(j=k+1;j<=n;j++) /* makes a[k+1][k] to a[n][k] zero */
{
temp=a[j][k]/a[k][k];
for(i=1;i<=n;i++) a[j][i]-=temp*a[k][i];
}
} /* end of k loop */
prod=sign;
/* product of diagonal elements */
for(i=1;i<=n;i++) prod*=a[i][i];
printf("The value of the determinant is %f ",prod);
}/* main */
/* finds position of maximum element among n numbers */
int max1(float b[],int k, int n)
{
float temp; int i,j;
temp=fabs(b[k]); j=k; /* initial maximum */
for(i=k+1;i<=n;i++)
if(temp<fabs(b[i])) {temp=fabs(b[i]); j=i;}
return j;
}
A sample of input/output:
Step 3. Find a pivot from the elements of the trailing sub-matrix
$$\begin{pmatrix} a_{kk} & a_{k\,k+1} & \cdots & a_{kn} \\ a_{k+1\,k} & a_{k+1\,k+1} & \cdots & a_{k+1\,n} \\ \cdots & \cdots & \cdots & \cdots \\ a_{nk} & a_{n\,k+1} & \cdots & a_{nn} \end{pmatrix}$$
of $A$. Let $a_{pq}$ be the pivot, i.e.,
$|a_{pq}| = \max\{|a_{kk}|, \ldots, |a_{kn}|;\ |a_{k+1\,k}|, \ldots, |a_{k+1\,n}|;\ \ldots;\ |a_{nk}|, \ldots, |a_{nn}|\}$.
Step 4. If apq = 0 then |A| = 0; print the value of |A| and Stop.
Step 5. If p = k then goto Step 6. Otherwise, interchange the kth and the pth
rows and set sign = −sign.
Step 6. If q = k then goto Step 7. Otherwise, interchange the kth and the qth
columns and set sign = −sign.
Step 7. Subtract $\dfrac{a_{jk}}{a_{kk}}$ times the $k$th row from the $j$th row for $j = k+1, k+2, \ldots, n$,
i.e., for $j = k+1, k+2, \ldots, n$ do the following:
$R_j' = R_j - \dfrac{a_{jk}}{a_{kk}}R_k$, where $R_j$, $R_j'$ are the old and new $j$th rows respectively.
Step 8. Increment k by 1 i.e., set k = k + 1.
If k < n then goto Step 3. Otherwise,
compute |A| = sign × product of diagonal elements.
Print |A| and Stop.
end Det Complete Pivoting.
Program 5.2.
/*Program Complete Pivoting
Program to find the value of a determinant using complete pivoting */
#include<stdio.h>
#include<stdlib.h>
#include<math.h>
void main()
{
int n,k,i,j,sign=1,p,q;
float a[10][10],prod,max,temp;
printf("\nEnter the size of the determinant ");
scanf("%d",&n);
printf("Enter the elements rowwise ");
for(i=1;i<=n;i++) for(j=1;j<=n;j++) scanf("%f",&a[i][j]);
for(k=1;k<=n;k++)
{
/* finds the position of the pivot element */
max=fabs(a[k][k]); p=k; q=k; /* set initial maximum */
for(i=k;i<=n;i++)
for(j=k;j<=n;j++)
if(max<fabs(a[i][j])) { max=fabs(a[i][j]); p=i; q=j;}
if(a[p][q]==0)
{
printf("The value of determinant is 0");
exit(0);
}
if(p!=k) /* interchange k and p rows */
{
sign=-sign;
for(i=1;i<=n;i++)
{
temp=a[p][i]; a[p][i]=a[k][i]; a[k][i]=temp;
}
}
if(q!=k) /* interchange k and q columns */
{
sign=-sign;
for(i=1;i<=n;i++)
{
temp=a[i][q]; a[i][q]=a[i][k]; a[i][k]=temp;
}
}
for(j=k+1;j<=n;j++) /* makes a[k+1][k] to a[n][k] zero */
{
temp=a[j][k]/a[k][k];
for(i=1;i<=n;i++) a[j][i]-=temp*a[k][i];
}
} /* end of k loop */
prod=sign;
for(i=1;i<=n;i++) /* product of diagonal elements*/
prod*=a[i][i];
printf("The value of the determinant is %f ",prod);
}/* main */
A sample of input/output:
-2 3 8 4
6 1 0 5
-8 3 1 2
3 8 7 10
The value of the determinant is -1273.000000
From the theory of matrices, it is well known that every square non-singular matrix has
a unique inverse. The inverse of a matrix A is defined by
$$A^{-1} = \frac{\mathrm{adj}\ A}{|A|}. \qquad (5.7)$$
The matrix $\mathrm{adj}\ A$ is called the adjoint of $A$ and is defined as
$$\mathrm{adj}\ A = \begin{pmatrix} A_{11} & A_{21} & \cdots & A_{n1} \\ A_{12} & A_{22} & \cdots & A_{n2} \\ \cdots & \cdots & \cdots & \cdots \\ A_{1n} & A_{2n} & \cdots & A_{nn} \end{pmatrix},$$
where $A_{ij}$ is the cofactor of $a_{ij}$ in $|A|$.
The main difficulty of this method is to compute the inverse of the matrix A. From
the definition of adj A it is easy to observe that, to compute the matrix adj A, we have
to determine n² determinants, each of order (n − 1). So it is very time consuming.
Many efficient methods are available to find the inverse of a matrix; among them, Gauss-
Jordan is the most popular. In the following, the Gauss-Jordan method is discussed to
find the inverse of a square non-singular matrix.
Example 5.3.1 Find the inverse of the matrix $A = \begin{pmatrix} 2 & 4 & 5 \\ 1 & -1 & 2 \\ 3 & 4 & 5 \end{pmatrix}$.
Solution. The augmented matrix $[A \,\vdots\, I]$ can be written as
$$[A \,\vdots\, I] = \begin{pmatrix} 2 & 4 & 5 & \vdots & 1 & 0 & 0 \\ 1 & -1 & 2 & \vdots & 0 & 1 & 0 \\ 3 & 4 & 5 & \vdots & 0 & 0 & 1 \end{pmatrix} \qquad (5.10)$$
$$\sim \begin{pmatrix} 3 & 4 & 5 & \vdots & 0 & 0 & 1 \\ 1 & -1 & 2 & \vdots & 0 & 1 & 0 \\ 2 & 4 & 5 & \vdots & 1 & 0 & 0 \end{pmatrix} \quad \text{(interchanging the first and third rows, the pivot being 3)}$$
$$\sim \begin{pmatrix} 1 & 4/3 & 5/3 & \vdots & 0 & 0 & 1/3 \\ 1 & -1 & 2 & \vdots & 0 & 1 & 0 \\ 2 & 4 & 5 & \vdots & 1 & 0 & 0 \end{pmatrix} \quad R_1' = \tfrac{1}{3}R_1$$
$$\sim \begin{pmatrix} 1 & 4/3 & 5/3 & \vdots & 0 & 0 & 1/3 \\ 0 & -7/3 & 1/3 & \vdots & 0 & 1 & -1/3 \\ 0 & 4/3 & 5/3 & \vdots & 1 & 0 & -2/3 \end{pmatrix} \quad R_2' = R_2 - R_1;\ R_3' = R_3 - 2R_1$$
(The largest element (in magnitude) in the second column is $-\tfrac{7}{3}$, which is at the $a_{22}$
position, and so there is no need to interchange any rows.)
$$\sim \begin{pmatrix} 1 & 4/3 & 5/3 & \vdots & 0 & 0 & 1/3 \\ 0 & 1 & -1/7 & \vdots & 0 & -3/7 & 1/7 \\ 0 & 0 & 13/7 & \vdots & 1 & 4/7 & -6/7 \end{pmatrix} \quad R_2' = -\tfrac{3}{7}R_2;\ R_3' = R_3 - \tfrac{4}{3}R_2'$$
$$\sim \begin{pmatrix} 1 & 4/3 & 5/3 & \vdots & 0 & 0 & 1/3 \\ 0 & 1 & -1/7 & \vdots & 0 & -3/7 & 1/7 \\ 0 & 0 & 1 & \vdots & 7/13 & 4/13 & -6/13 \end{pmatrix} \quad R_3' = \tfrac{7}{13}R_3$$
$$\sim \begin{pmatrix} 1 & 4/3 & 0 & \vdots & -35/39 & -20/39 & 43/39 \\ 0 & 1 & 0 & \vdots & 1/13 & -5/13 & 1/13 \\ 0 & 0 & 1 & \vdots & 7/13 & 4/13 & -6/13 \end{pmatrix} \quad R_1' = R_1 - \tfrac{5}{3}R_3;\ R_2' = R_2 + \tfrac{1}{7}R_3$$
$$\sim \begin{pmatrix} 1 & 0 & 0 & \vdots & -1 & 0 & 1 \\ 0 & 1 & 0 & \vdots & 1/13 & -5/13 & 1/13 \\ 0 & 0 & 1 & \vdots & 7/13 & 4/13 & -6/13 \end{pmatrix} \quad R_1' = R_1 - \tfrac{4}{3}R_2$$
The left half becomes a unit matrix; thus the inverse of the given matrix is
$$A^{-1} = \begin{pmatrix} -1 & 0 & 1 \\ 1/13 & -5/13 & 1/13 \\ 7/13 & 4/13 & -6/13 \end{pmatrix}.$$
Algorithm 5.3 (Matrix inverse). The following algorithm computes the inverse
of a non-singular square matrix of order n × n and if the matrix is singular it prints
the message ‘the matrix is singular and hence not invertible’.
The matrix $A$ is augmented with the unit matrix $I_n$, i.e.,
$$a_{i\,n+j} = \begin{cases} 0, & \text{for } i \ne j \\ 1, & \text{for } i = j \end{cases}$$
for $i, j = 1, 2, \ldots, n$.
Step 10. Subtract $a_{jk}$ times the $k$th row from the $j$th row for
$j = k-1, k-2, \ldots, 1$.
Step 11. Increase k by 1.
If k < n then goto Step 10.
Otherwise, print the right half of A as inverse of A and Stop.
end Matrix Inversion
Program 5.3.
/* Program Matrix Inverse
Program to find the inverse of a square matrix using
partial pivoting */
#include<stdio.h>
#include<stdlib.h>
#include<math.h>
#define zero 0.00001
void main()
{
int n,m,k,i,j;
float a[10][20],temp;
printf("\nEnter the size of the matrix ");
scanf("%d",&n);
printf("Enter the elements rowwise ");
for(i=1;i<=n;i++) for(j=1;j<=n;j++) scanf("%f",&a[i][j]);
/* augment the matrix A */
for(i=1;i<=n;i++) for(j=1;j<=n;j++) a[i][n+j]=0;
for(i=1;i<=n;i++) a[i][n+i]=1;
m=2*n;
for(k=1;k<=n;k++)
{
/* finds pivot element and its position */
temp=fabs(a[k][k]); j=k; /* initial maximum */
for(i=k+1;i<=n;i++)
if(temp<fabs(a[i][k])){
temp=fabs(a[i][k]); j=i;
}
if(fabs(a[j][k])<=zero) /* if a[j][k]=0 */
{
printf("The matrix is singular and is not invertible");
exit(0);
}
if(j!=k) /* interchange k and j rows */
{
for(i=1;i<=m;i++){
temp=a[j][i]; a[j][i]=a[k][i]; a[k][i]=temp;
}
}
if(a[k][k]!=1)
{
temp=a[k][k];
for(i=1;i<=m;i++) a[k][i]/=temp;
}
for(j=k+1;j<=n;j++) /* makes a[k+1][k] to a[n][k] zero */
{
temp=a[j][k];
for(i=1;i<=m;i++) a[j][i]-=temp*a[k][i];
}
} /* end of k loop */
/* make the left half of A a unit matrix */
for(k=2;k<=n;k++)
 for(j=k-1;j>=1;j--)
 {
  temp=a[j][k];
  for(i=1;i<=m;i++) a[j][i]-=temp*a[k][i];
 }
printf("The inverse matrix is \n");
for(i=1;i<=n;i++)
{
 for(j=n+1;j<=m;j++) printf("%f ",a[i][j]);
 printf("\n");
}
}/* main */
A sample of input/output:
The system of equations (5.1) can be written in the matrix form (5.3) as $A\mathbf{x} = \mathbf{b}$.
Then
$$\mathbf{x} = A^{-1}\mathbf{b}. \qquad (5.11)$$
Here $|A| = -30$ and
$$\mathrm{adj}\ A = \begin{pmatrix} 1 & -5 & -13 \\ -5 & -5 & 5 \\ -7 & 5 & 1 \end{pmatrix}.$$
Thus,
$$A^{-1} = \frac{\mathrm{adj}\ A}{|A|} = \frac{1}{-30}\begin{pmatrix} 1 & -5 & -13 \\ -5 & -5 & 5 \\ -7 & 5 & 1 \end{pmatrix}.$$
Therefore,
$$\mathbf{x} = A^{-1}\mathbf{b} = \frac{1}{-30}\begin{pmatrix} 1 & -5 & -13 \\ -5 & -5 & 5 \\ -7 & 5 & 1 \end{pmatrix}\begin{pmatrix} 10 \\ 7 \\ 5 \end{pmatrix}
= \frac{1}{-30}\begin{pmatrix} -90 \\ -60 \\ -30 \end{pmatrix} = \begin{pmatrix} 3 \\ 2 \\ 1 \end{pmatrix}.$$
Hence the required solution is $x = 3$, $y = 2$, $z = 1$.
for(k=1;k<=n;k++)
{
/* finds pivot element and its position */
temp=fabs(a[k][k]); j=k; /* initial maximum */
for(i=k+1;i<=n;i++)
if(temp<fabs(a[i][k]))
{
temp=fabs(a[i][k]); j=i;
}
if(fabs(a[j][k])<=zero) /* if a[j][k]=0 */
{
printf("The matrix is singular and is not invertible");
return(0);
}
if(j!=k) /* interchange k and j rows */
{
for(i=1;i<=m;i++)
{
temp=a[j][i]; a[j][i]=a[k][i]; a[k][i]=temp;
}
}
if(a[k][k]!=1)
{
temp=a[k][k];
for(i=1;i<=m;i++) a[k][i]/=temp;
}
for(j=k+1;j<=n;j++) /* makes a[k+1][k] to a[n][k] zero */
{
temp=a[j][k];
for(i=1;i<=m;i++) a[j][i]-=temp*a[k][i];
}
} /* end of k loop */
/* make left half of A to a unit matrix */
for(k=2;k<=n;k++)
for(j=k-1;j>=1;j--){
temp=a[j][k];
for(i=1;i<=m;i++) a[j][i]-=temp*a[k][i];
}
printf("\nThe inverse matrix is \n");
for(i=1;i<=n;i++)
for(j=n+1;j<=m;j++) ai[i][j-n]=a[i][j];
return(1);
}/* matinv */
A sample of input/output:
0 1 2
3 -2 1
4 3 2
The inverse matrix is
-0.218750 0.125000 0.156250
-0.062500 -0.250000 0.187500
0.531250 0.125000 -0.093750
Complexity
The time complexity of this method is O(n3 ) as the method involves computation of
A−1 .
where
$$a_{ij}^{(1)} = a_{ij} - \frac{a_{i1}}{a_{11}}\,a_{1j}; \quad i, j = 2, 3, \ldots, n.$$
Again, to eliminate $x_2$ from the third, fourth, $\ldots$, and $n$th equations, the second
equation is multiplied by $-\dfrac{a_{32}^{(1)}}{a_{22}^{(1)}}, -\dfrac{a_{42}^{(1)}}{a_{22}^{(1)}}, \ldots, -\dfrac{a_{n2}^{(1)}}{a_{22}^{(1)}}$ respectively (assuming that
$a_{22}^{(1)} \ne 0$), and successively added to the third, fourth, $\ldots$, and $n$th equations to get
the new system of equations, where
$$a_{ij}^{(2)} = a_{ij}^{(1)} - \frac{a_{i2}^{(1)}}{a_{22}^{(1)}}\,a_{2j}^{(1)}; \quad i, j = 3, 4, \ldots, n.$$
Finally, after eliminating $x_{n-1}$, the above system of equations becomes an upper
triangular system, where
$$a_{ij}^{(k)} = a_{ij}^{(k-1)} - \frac{a_{ik}^{(k-1)}}{a_{kk}^{(k-1)}}\,a_{kj}^{(k-1)};$$
$i, j = k+1, \ldots, n$; $k = 1, 2, \ldots, n-1$, and $a_{pq}^{(0)} = a_{pq}$; $p, q = 1, 2, \ldots, n$.
Now, by back substitution, the values of the variables can be found as follows:
From the last equation we have $x_n = \dfrac{b_n^{(n-1)}}{a_{nn}^{(n-1)}}$; from the last but one, i.e., the $(n-1)$th equation, one can find the value of $x_{n-1}$, and so on. Finally, from the first equation we obtain the value of $x_1$.
The evaluation of the elements $a_{ij}^{(k)}$ is a forward substitution and the determination of the values of the variables $x_i$ is a back substitution, since we first determine the value of the last variable $x_n$.
Note 5.5.1 The method described above assumes that the diagonal elements are non-
zero. If they are zero or nearly zero then the above simple method is not applicable to
solve a linear system though it may have a solution. If any diagonal element is zero or
very small then partial pivoting should be used to get a solution or a better solution.
Solution. Multiplying the second and third equations by 2 and 1 respectively and
subtracting them from first equation we get
2x1 + x2 + x3 = 4
3x2 − 3x3 = 0
−x2 + 2x3 = 1.
2x1 + x2 + x3 = 4
3x2 − 3x3 = 0
3x3 = 3.
From the third equation $x_3 = 1$, from the second equation $x_2 = x_3 = 1$, and from the first equation $2x_1 = 4 - x_2 - x_3 = 2$, i.e., $x_1 = 1$.
Therefore the solution is x1 = 1, x2 = 1, x3 = 1.
Solution. The largest element (the pivot) in the coefficients of the variable x1 is −3,
attained at the third equation. So we interchange first and third equations
Multiplying the second equation by 3 and adding with the first equation we get,
The second pivot is 1, which is at the positions a22 and a32 . Taking a22 = 1 as
pivot to avoid interchange of rows. Now, subtracting the third equation from second
equation, we obtain
Program 5.5 .
/* Program Gauss-elimination
Program to find the solution of a system of linear equations by
Gauss elimination method. Partial pivoting is used. */
#include<stdio.h>
#include<stdlib.h>
#include<math.h>
#define zero 0.00001
void main()
{
int i,j,k,n,m;
float a[10][10],b[10],x[10],temp;
printf("\nEnter the size of the coefficient matrix ");
scanf("%d",&n);
printf("Enter the elements rowwise ");
for(i=1;i<=n;i++) for(j=1;j<=n;j++) scanf("%f",&a[i][j]);
printf("Enter the right hand vector ");
for(i=1;i<=n;i++) scanf("%f",&b[i]);
/* augment A with b[i], i.e., a[i][n+1]=b[i] */
m=n+1;
for(i=1;i<=n;i++) a[i][m]=b[i];
for(k=1;k<=n;k++)
{
/* finds pivot element and its position */
temp=fabs(a[k][k]); j=k; /* initial maximum */
for(i=k+1;i<=n;i++)
if(temp<fabs(a[i][k]))
{
temp=fabs(a[i][k]); j=i;
}
if(fabs(a[j][k])<=zero) /* if a[j][k]=0 */
{
printf("The matrix is singular:");
printf("The system has either no solution or many solutions");
exit(0);
}
if(j!=k) /* interchange k and j rows */
{
for(i=1;i<=m;i++)
{
temp=a[j][i]; a[j][i]=a[k][i]; a[k][i]=temp;
}
}
for(j=k+1;j<=n;j++) /* makes a[k+1][k] to a[n][k] zero */
{
temp=a[j][k]/a[k][k];
for(i=1;i<=m;i++) a[j][i]-=temp*a[k][i];
}
} /* end of k loop */
/* forward substitution is over */
/* backward substitution */
x[n]=a[n][m]/a[n][n];
for(i=n-1;i>=1;i--)
{
x[i]=a[i][m];
for(j=i+1;j<=n;j++) x[i]-=a[i][j]*x[j];
x[i]/=a[i][i];
}
printf("Solution of the system is\n ");
for(i=1;i<=n;i++) printf("%8.5f ",x[i]);
} /* main */
A sample of input/output:
and produces the solution of the system without using the back substitution. At the end
of Gauss-Jordan method the system of equations (5.2) reduces to the following form:
$$\begin{pmatrix} 1 & 0 & \cdots & 0\\ 0 & 1 & \cdots & 0\\ \cdots & \cdots & \cdots & \cdots\\ 0 & 0 & \cdots & 1 \end{pmatrix} \begin{pmatrix} x_1\\ x_2\\ \vdots\\ x_n \end{pmatrix} = \begin{pmatrix} b_1\\ b_2\\ \vdots\\ b_n \end{pmatrix}. \tag{5.15}$$
The solution of the system is given by
x1 = b1 , x2 = b2 , . . . , xn = bn .
Thus the Gauss-Jordan method gives
$$\left(A \,\vdots\, b\right) \xrightarrow{\ \text{Gauss-Jordan}\ } \left(I \,\vdots\, b\right). \tag{5.16}$$
x1 + x2 + x3 = 3
2x1 + 3x2 + x3 = 6
x1 − x2 − x3 = −3.
Solution. Here $A = \begin{pmatrix} 1 & 1 & 1\\ 2 & 3 & 1\\ 1 & -1 & -1 \end{pmatrix}$, $x = \begin{pmatrix} x_1\\ x_2\\ x_3 \end{pmatrix}$ and $b = \begin{pmatrix} 3\\ 6\\ -3 \end{pmatrix}$.
The augmented matrix $\left(A \,\vdots\, b\right)$ is
$$\left(A \,\vdots\, b\right) = \left(\begin{array}{ccc|c} 1 & 1 & 1 & 3\\ 2 & 3 & 1 & 6\\ 1 & -1 & -1 & -3 \end{array}\right) \sim \left(\begin{array}{ccc|c} 1 & 1 & 1 & 3\\ 0 & 1 & -1 & 0\\ 0 & -2 & -2 & -6 \end{array}\right) \quad \begin{aligned} R_2' &= R_2 - 2R_1\\ R_3' &= R_3 - R_1 \end{aligned}$$
$$\sim \left(\begin{array}{ccc|c} 1 & 1 & 1 & 3\\ 0 & 1 & -1 & 0\\ 0 & 0 & -4 & -6 \end{array}\right) \quad R_3' = R_3 + 2R_2 \qquad \sim \left(\begin{array}{ccc|c} 1 & 1 & 1 & 3\\ 0 & 1 & -1 & 0\\ 0 & 0 & 1 & 3/2 \end{array}\right) \quad R_3' = -\tfrac{1}{4}R_3$$
$$\sim \left(\begin{array}{ccc|c} 1 & 0 & 2 & 3\\ 0 & 1 & -1 & 0\\ 0 & 0 & 1 & 3/2 \end{array}\right) \quad R_1' = R_1 - R_2 \qquad \sim \left(\begin{array}{ccc|c} 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 3/2\\ 0 & 0 & 1 & 3/2 \end{array}\right) \quad \begin{aligned} R_1' &= R_1 - 2R_3\\ R_2' &= R_2 + R_3 \end{aligned}$$
Hence the required solution is $x_1 = 0$, $x_2 = 3/2$, $x_3 = 3/2$.
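The hand reduction above can be sketched in C. This is an illustrative sketch, not one of the book's programs: it uses 0-based indexing and `double` arithmetic (the book's programs use 1-based `float` arrays), and the function name `gauss_jordan`, the bound `N`, and the singularity threshold are our own choices.

```c
#include <math.h>
#include <assert.h>
#define N 10

/* Reduce [A | b] to [I | x] by Gauss-Jordan elimination with partial
   pivoting; on success the solution overwrites b.
   Returns 0 if a pivot is (nearly) zero, 1 otherwise. */
int gauss_jordan(int n, double a[N][N], double b[N])
{
    int i, j, k, p;
    for (k = 0; k < n; k++) {
        /* partial pivoting: pick the largest |a[i][k]| for i >= k */
        p = k;
        for (i = k + 1; i < n; i++)
            if (fabs(a[i][k]) > fabs(a[p][k])) p = i;
        if (fabs(a[p][k]) < 1e-12) return 0;        /* singular */
        if (p != k) {                               /* swap rows p and k */
            double t;
            for (j = 0; j < n; j++) { t = a[p][j]; a[p][j] = a[k][j]; a[k][j] = t; }
            t = b[p]; b[p] = b[k]; b[k] = t;
        }
        /* normalise the pivot row */
        double piv = a[k][k];
        for (j = 0; j < n; j++) a[k][j] /= piv;
        b[k] /= piv;
        /* make every other entry of column k zero */
        for (i = 0; i < n; i++)
            if (i != k) {
                double m = a[i][k];
                for (j = 0; j < n; j++) a[i][j] -= m * a[k][j];
                b[i] -= m * b[k];
            }
    }
    return 1;
}
```

On the worked example above it reproduces the solution $(0, 3/2, 3/2)$.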
l11 z1 = b1
l21 z1 + l22 z2 = b2
l31 z1 + l32 z2 + l33 z3 = b3 (5.20)
···································· ··· ···
ln1 z1 + ln2 z2 + ln3 z3 + · · · + lnn zn = bn .
After determination of z, one can compute the value of x i.e., x1 , x2 , . . . , xn from the
equation Ux = z i.e., from the following equations by the backward substitution.
$$\begin{aligned} u_{11}x_1 + u_{12}x_2 + u_{13}x_3 + \cdots + u_{1n}x_n &= z_1\\ u_{22}x_2 + u_{23}x_3 + \cdots + u_{2n}x_n &= z_2\\ u_{33}x_3 + u_{34}x_4 + \cdots + u_{3n}x_n &= z_3\\ \cdots\cdots\cdots\cdots\cdots\cdots\cdots &\quad \cdots\\ u_{n-1,n-1}x_{n-1} + u_{n-1,n}x_n &= z_{n-1}\\ u_{nn}x_n &= z_n. \end{aligned} \tag{5.21}$$
When uii = 1, for i = 1, 2, . . . , n, then the method is known as Crout’s decom-
position method. When lii = 1, for i = 1, 2, . . . , n then the method is known as
Doolittle’s method for decomposition. In particular, when lii = uii for i = 1, 2, . . . , n
then the corresponding method is called Cholesky’s decomposition method.
Writing $A = LU$ and comparing the elements of both sides, the elements of $L$ and $U$ are obtained as
$$l_{ij} = a_{ij} - \sum_{k=1}^{j-1} l_{ik}u_{kj}, \quad i \geq j \tag{5.22}$$
$$u_{ij} = \frac{a_{ij} - \displaystyle\sum_{k=1}^{i-1} l_{ik}u_{kj}}{l_{ii}}, \quad i < j \tag{5.23}$$
with $u_{ii} = 1$, $l_{ij} = 0$ for $j > i$, and $u_{ij} = 0$ for $i > j$.
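The relations (5.22)–(5.23) translate almost directly into code. The following is a sketch under the text's assumption that all leading principal minors are non-zero; it uses 0-based indexing and `double` arithmetic, and the name `crout` is our own. Since every $u_{ii} = 1$, it also returns $|A| = \prod l_{ii}$ as a by-product.

```c
#include <math.h>
#include <assert.h>
#define N 10

/* Crout decomposition A = LU with u[i][i] = 1, per (5.22)-(5.23).
   Assumes all leading principal minors are non-zero.
   Returns |A| = product of the diagonal entries of L. */
double crout(int n, double a[N][N], double l[N][N], double u[N][N])
{
    int i, j, k;
    double det = 1.0;
    for (i = 0; i < n; i++)
        for (j = 0; j < n; j++) { l[i][j] = 0.0; u[i][j] = (i == j); }
    for (j = 0; j < n; j++) {
        for (i = j; i < n; i++) {          /* column j of L (i >= j), eq. (5.22) */
            double s = a[i][j];
            for (k = 0; k < j; k++) s -= l[i][k] * u[k][j];
            l[i][j] = s;
        }
        for (i = j + 1; i < n; i++) {      /* row j of U (j < i), eq. (5.23) */
            double s = a[j][i];
            for (k = 0; k < j; k++) s -= l[j][k] * u[k][i];
            u[j][i] = s / l[j][j];
        }
        det *= l[j][j];
    }
    return det;
}
```

For the matrix factorized in the next example this yields $l_{22} = 6$, $u_{12} = -1$, $u_{23} = -11/12$, $l_{33} = 71/12$ and $|A| = 71$.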
$$z = L^{-1}b \tag{5.24}$$
$$\text{and} \quad x = U^{-1}z. \tag{5.25}$$
It may be noted that the computation of inverse of a triangular matrix is easier than
an arbitrary matrix.
The inverse of $A$ can also be determined from the relation $A^{-1} = U^{-1}L^{-1}$.
• The inverse of lower (upper) triangular matrix is also a lower (upper) triangular
matrix.
$\bullet$ Since $A = LU$, $|A| = |LU| = |L||U| = \displaystyle\prod_{i=1}^{n} l_{ii}\prod_{i=1}^{n} u_{ii}$.
Example. Factorize the matrix $A = \begin{pmatrix} 2 & -2 & 1\\ 5 & 1 & -3\\ 3 & 4 & 1 \end{pmatrix}$ into the form $LU$, where $L$ and $U$ are lower and upper triangular matrices, and hence solve the system of equations $2x_1 - 2x_2 + x_3 = 2$, $5x_1 + x_2 - 3x_3 = 0$, $3x_1 + 4x_2 + x_3 = 9$. Determine $L^{-1}$ and $U^{-1}$ and hence find $A^{-1}$. Also determine $|A|$.
Solution. Let $A = LU$. Then
$$LU = \begin{pmatrix} l_{11} & 0 & 0\\ l_{21} & l_{22} & 0\\ l_{31} & l_{32} & l_{33} \end{pmatrix}\begin{pmatrix} 1 & u_{12} & u_{13}\\ 0 & 1 & u_{23}\\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} l_{11} & l_{11}u_{12} & l_{11}u_{13}\\ l_{21} & l_{21}u_{12} + l_{22} & l_{21}u_{13} + l_{22}u_{23}\\ l_{31} & l_{31}u_{12} + l_{32} & l_{31}u_{13} + l_{32}u_{23} + l_{33} \end{pmatrix}.$$
Comparing both sides, we have
$$\begin{aligned} &l_{11} = 2, \quad l_{21} = 5, \quad l_{31} = 3\\ &l_{11}u_{12} = -2 \ \text{ or, } u_{12} = -2/l_{11} = -1\\ &l_{11}u_{13} = 1 \ \text{ or, } u_{13} = 1/l_{11} = 1/2\\ &l_{21}u_{12} + l_{22} = 1 \ \text{ or, } l_{22} = 1 - l_{21}u_{12} = 6\\ &l_{31}u_{12} + l_{32} = 4 \ \text{ or, } l_{32} = 4 - l_{31}u_{12} = 7\\ &l_{21}u_{13} + l_{22}u_{23} = -3 \ \text{ or, } u_{23} = (-3 - l_{21}u_{13})/l_{22} = -11/12\\ &l_{31}u_{13} + l_{32}u_{23} + l_{33} = 1 \ \text{ or, } l_{33} = 1 - l_{31}u_{13} - l_{32}u_{23} = 71/12. \end{aligned}$$
Hence $L$ and $U$ are given by
$$L = \begin{pmatrix} 2 & 0 & 0\\ 5 & 6 & 0\\ 3 & 7 & \tfrac{71}{12} \end{pmatrix}, \qquad U = \begin{pmatrix} 1 & -1 & \tfrac{1}{2}\\ 0 & 1 & -\tfrac{11}{12}\\ 0 & 0 & 1 \end{pmatrix}.$$
That is, $Ly = b$ gives
$$2y_1 = 2, \qquad 5y_1 + 6y_2 = 0, \qquad 3y_1 + 7y_2 + \tfrac{71}{12}y_3 = 9.$$
The solution of this system is $y_1 = 1$, $y_2 = -\tfrac{5}{6}$, $y_3 = 2$. Thus $y = (1, -5/6, 2)^t$.
Now, from the relation $Ux = y$ we have
$$\begin{pmatrix} 1 & -1 & \tfrac{1}{2}\\ 0 & 1 & -\tfrac{11}{12}\\ 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} x_1\\ x_2\\ x_3 \end{pmatrix} = \begin{pmatrix} 1\\ -\tfrac{5}{6}\\ 2 \end{pmatrix},$$
i.e.,
$$x_1 - x_2 + \tfrac{1}{2}x_3 = 1, \qquad x_2 - \tfrac{11}{12}x_3 = -\tfrac{5}{6}, \qquad x_3 = 2.$$
Back substitution gives $x_3 = 2$, $x_2 = 1$, $x_1 = 1$.
To compute $L^{-1}$, apply row operations to $\left(L \,\vdots\, I\right)$:
$$\left(\begin{array}{ccc|ccc} 2 & 0 & 0 & 1 & 0 & 0\\ 5 & 6 & 0 & 0 & 1 & 0\\ 3 & 7 & \tfrac{71}{12} & 0 & 0 & 1 \end{array}\right) \sim \left(\begin{array}{ccc|ccc} 1 & 0 & 0 & \tfrac{1}{2} & 0 & 0\\ 0 & 6 & 0 & -\tfrac{5}{2} & 1 & 0\\ 0 & 7 & \tfrac{71}{12} & -\tfrac{3}{2} & 0 & 1 \end{array}\right) \quad \begin{aligned} R_1 &= \tfrac{1}{2}R_1\\ R_2 &= R_2 - 5R_1\\ R_3 &= R_3 - 3R_1 \end{aligned}$$
$$\sim \left(\begin{array}{ccc|ccc} 1 & 0 & 0 & \tfrac{1}{2} & 0 & 0\\ 0 & 1 & 0 & -\tfrac{5}{12} & \tfrac{1}{6} & 0\\ 0 & 0 & \tfrac{71}{12} & \tfrac{17}{12} & -\tfrac{7}{6} & 1 \end{array}\right) \quad \begin{aligned} R_2 &= \tfrac{1}{6}R_2\\ R_3 &= R_3 - 7R_2 \end{aligned} \qquad \sim \left(\begin{array}{ccc|ccc} 1 & 0 & 0 & \tfrac{1}{2} & 0 & 0\\ 0 & 1 & 0 & -\tfrac{5}{12} & \tfrac{1}{6} & 0\\ 0 & 0 & 1 & \tfrac{17}{71} & -\tfrac{14}{71} & \tfrac{12}{71} \end{array}\right) \quad R_3 = \tfrac{12}{71}R_3$$
Hence $L^{-1} = \begin{pmatrix} 1/2 & 0 & 0\\ -5/12 & 1/6 & 0\\ 17/71 & -14/71 & 12/71 \end{pmatrix}$.
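Inverting a lower triangular matrix, as done above by row reduction, is just a forward substitution per column: the $j$th column of $L^{-1}$ solves $Ly = e_j$. A minimal sketch (0-based indexing and `double` arithmetic; the function name is ours):

```c
#include <math.h>
#include <assert.h>
#define N 10

/* Invert a lower triangular matrix by forward substitution; the
   result is again lower triangular.  Returns 0 if some diagonal
   entry (nearly) vanishes. */
int lower_tri_inverse(int n, double l[N][N], double inv[N][N])
{
    int i, j, k;
    for (j = 0; j < n; j++) {          /* column j of inv solves L y = e_j */
        for (i = 0; i < n; i++) {
            if (fabs(l[i][i]) < 1e-12) return 0;
            double s = (i == j) ? 1.0 : 0.0;
            for (k = 0; k < i; k++) s -= l[i][k] * inv[k][j];
            inv[i][j] = s / l[i][i];
        }
    }
    return 1;
}
```

Applied to the $L$ of this example it reproduces the entries $1/2$, $-5/12$, $1/6$, $17/71$, $-14/71$, $12/71$ found above.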
The value of $U^{-1}$ can also be determined in a similar way. Here we apply another method, based on the property that the inverse of a triangular matrix is a triangular matrix of the same shape.
Let $U^{-1} = \begin{pmatrix} 1 & b_{12} & b_{13}\\ 0 & 1 & b_{23}\\ 0 & 0 & 1 \end{pmatrix}$. Then $U^{-1}U = I$ gives
$$\begin{pmatrix} 1 & b_{12} & b_{13}\\ 0 & 1 & b_{23}\\ 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & -1 & 1/2\\ 0 & 1 & -11/12\\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1 \end{pmatrix},$$
i.e.,
$$\begin{pmatrix} 1 & -1 + b_{12} & \tfrac{1}{2} - \tfrac{11}{12}b_{12} + b_{13}\\ 0 & 1 & -\tfrac{11}{12} + b_{23}\\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1 \end{pmatrix}.$$
Equating both sides,
$$-1 + b_{12} = 0 \ \text{ or, } b_{12} = 1; \qquad \tfrac{1}{2} - \tfrac{11}{12}b_{12} + b_{13} = 0 \ \text{ or, } b_{13} = -\tfrac{1}{2} + \tfrac{11}{12}b_{12} = \tfrac{5}{12};$$
$$-\tfrac{11}{12} + b_{23} = 0 \ \text{ or, } b_{23} = \tfrac{11}{12}.$$
Hence
$$U^{-1} = \begin{pmatrix} 1 & 1 & 5/12\\ 0 & 1 & 11/12\\ 0 & 0 & 1 \end{pmatrix}.$$
Therefore,
$$A^{-1} = U^{-1}L^{-1} = \begin{pmatrix} 1 & 1 & 5/12\\ 0 & 1 & 11/12\\ 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} 1/2 & 0 & 0\\ -5/12 & 1/6 & 0\\ 17/71 & -14/71 & 12/71 \end{pmatrix} = \begin{pmatrix} 13/71 & 6/71 & 5/71\\ -14/71 & -1/71 & 11/71\\ 17/71 & -14/71 & 12/71 \end{pmatrix}.$$
Last Part. $|A| = |L||U| = 2 \times 6 \times \tfrac{71}{12} \times 1 = 71$.
12
Algorithm 5.6 (LU decomposition). This algorithm finds the solution of a sys-
tem of linear equations using LU decomposition method. Assume that the principal
minors of all order are non-zero.
Algorithm LU-decomposition
Let Ax = b be the systems of equations and A = [aij ], b = (b1 , b2 , . . . , bn )t ,
x = (x1 , x2 , . . . , xn )t .
//Assume that the principal minors of all order are non-zero.//
//Determine the matrices L and U.//
Step 1. Read the matrix $A = [a_{ij}]$, $i, j = 1, 2, \ldots, n$ and the right hand vector $b = (b_1, b_2, \ldots, b_n)^t$.
Step 2. $l_{i1} = a_{i1}$ for $i = 1, 2, \ldots, n$; $u_{1j} = a_{1j}/l_{11}$ for $j = 2, 3, \ldots, n$; $u_{ii} = 1$ for $i = 1, 2, \ldots, n$.
Step 3. For $i, j = 2, 3, \ldots, n$ compute the following:
$$l_{ij} = a_{ij} - \sum_{k=1}^{j-1} l_{ik}u_{kj}, \quad i \geq j$$
$$u_{ij} = \Big(a_{ij} - \sum_{k=1}^{i-1} l_{ik}u_{kj}\Big)\Big/l_{ii}, \quad i < j.$$
Step 4. //Solve the system $Lz = b$ by forward substitution.//
$$z_1 = \frac{b_1}{l_{11}}, \qquad z_i = \frac{1}{l_{ii}}\Big(b_i - \sum_{j=1}^{i-1} l_{ij}z_j\Big) \ \text{ for } i = 2, 3, \ldots, n.$$
Step 5. //Solve the system $Ux = z$ by backward substitution.//
Set $x_n = z_n$;
$$x_i = z_i - \sum_{j=i+1}^{n} u_{ij}x_j \ \text{ for } i = n-1, n-2, \ldots, 1.$$
Print $x_1, x_2, \ldots, x_n$ as solution.
end LU-decomposition
Program 5.6.
/* Program LU-decomposition
Solution of a system of equations by LU decomposition method.
Assume that all order principal minors are non-zero. */
#include<stdio.h>
void main()
{
float a[10][10],l[10][10],u[10][10],z[10],x[10],b[10];
int i,j,k,n;
printf("\nEnter the size of the coefficient matrix ");
scanf("%d",&n);
printf("Enter the elements rowwise ");
for(i=1;i<=n;i++) for(j=1;j<=n;j++) scanf("%f",&a[i][j]);
printf("Enter the right hand vector ");
for(i=1;i<=n;i++) scanf("%f",&b[i]);
/* computations of L and U matrices */
for(i=1;i<=n;i++) l[i][1]=a[i][1];
for(j=2;j<=n;j++) u[1][j]=a[1][j]/l[1][1];
for(i=1;i<=n;i++) u[i][i]=1;
for(i=2;i<=n;i++)
for(j=2;j<=n;j++)
if(i>=j)
{
l[i][j]=a[i][j];
for(k=1;k<=j-1;k++) l[i][j]-=l[i][k]*u[k][j];
}
else
{
u[i][j]=a[i][j];
for(k=1;k<=i-1;k++) u[i][j]-=l[i][k]*u[k][j];
u[i][j]/=l[i][i];
}
printf("\nThe lower triangular matrix L\n");
for(i=1;i<=n;i++)
{
for(j=1;j<=i;j++) printf("%f ",l[i][j]);
printf("\n");
}
printf("\nThe upper triangular matrix U\n");
for(i=1;i<=n;i++)
{
for(j=1;j<i;j++) printf(" ");
for(j=i;j<=n;j++) printf("%f ",u[i][j]);
printf("\n");
}
/* solve Lz=b by forward substitution */
z[1]=b[1]/l[1][1];
for(i=2;i<=n;i++)
{
z[i]=b[i];
for(j=1;j<=i-1;j++) z[i]-=l[i][j]*z[j];
z[i]/=l[i][i];
}
/* solve Ux=z by backward substitution */
x[n]=z[n];
for(i=n-1;i>=1;i--)
{
x[i]=z[i];
for(j=i+1;j<=n;j++) x[i]-=u[i][j]*x[j];
}
printf("\nThe solution is\n");
for(i=1;i<=n;i++) printf("%f ",x[i]);
} /* main */
A sample of input/output:
Solution. The augmented matrix $\left(A \,\vdots\, I\right)$ is
$$\left(A \,\vdots\, I\right) = \left(\begin{array}{ccc|ccc} 1 & 3 & 4 & 1 & 0 & 0\\ 1 & 0 & 2 & 0 & 1 & 0\\ -2 & 3 & 1 & 0 & 0 & 1 \end{array}\right) \longrightarrow \left(\begin{array}{ccc|ccc} 1 & 3 & 4 & 1 & 0 & 0\\ 0 & -3 & -2 & -1 & 1 & 0\\ 0 & 9 & 9 & 2 & 0 & 1 \end{array}\right) \quad \begin{aligned} R_2 &\leftarrow R_2 - R_1\\ R_3 &\leftarrow R_3 + 2R_1 \end{aligned}$$
$$\longrightarrow \left(\begin{array}{ccc|ccc} 1 & 3 & 4 & 1 & 0 & 0\\ 0 & -3 & -2 & -1 & 1 & 0\\ 0 & 0 & 3 & -1 & 3 & 1 \end{array}\right) \quad R_3 \leftarrow R_3 + 3R_2$$
Here $U = \begin{pmatrix} 1 & 3 & 4\\ 0 & -3 & -2\\ 0 & 0 & 3 \end{pmatrix}$, $L^{-1} = \begin{pmatrix} 1 & 0 & 0\\ -1 & 1 & 0\\ -1 & 3 & 1 \end{pmatrix}$.
Let $A^{-1} = \begin{pmatrix} x_{11} & x_{12} & x_{13}\\ x_{21} & x_{22} & x_{23}\\ x_{31} & x_{32} & x_{33} \end{pmatrix}$. Since $UA^{-1} = L^{-1}$,
$$\begin{pmatrix} 1 & 3 & 4\\ 0 & -3 & -2\\ 0 & 0 & 3 \end{pmatrix}\begin{pmatrix} x_{11} & x_{12} & x_{13}\\ x_{21} & x_{22} & x_{23}\\ x_{31} & x_{32} & x_{33} \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0\\ -1 & 1 & 0\\ -1 & 3 & 1 \end{pmatrix}.$$
This implies
$$x_{11} + 3x_{21} + 4x_{31} = 1, \qquad -3x_{21} - 2x_{31} = -1, \qquad 3x_{31} = -1.$$
These equations give $x_{31} = -\tfrac{1}{3}$, $x_{21} = \tfrac{5}{9}$, $x_{11} = \tfrac{2}{3}$. Again,
$$x_{12} + 3x_{22} + 4x_{32} = 0, \qquad -3x_{22} - 2x_{32} = 1, \qquad 3x_{32} = 3.$$
The solution of this system is $x_{32} = 1$, $x_{22} = -1$, $x_{12} = -1$. Finally,
$$x_{13} + 3x_{23} + 4x_{33} = 0, \qquad -3x_{23} - 2x_{33} = 0, \qquad 3x_{33} = 1.$$
Then $x_{33} = \tfrac{1}{3}$, $x_{23} = -\tfrac{2}{9}$, $x_{13} = -\tfrac{2}{3}$ is the solution of these equations. Hence
$$A^{-1} = \begin{pmatrix} 2/3 & -1 & -2/3\\ 5/9 & -1 & -2/9\\ -1/3 & 1 & 1/3 \end{pmatrix}.$$
If the coefficient matrix A is symmetric and positive definite then this method is appli-
cable to solve the system Ax = b. This method is also known as square-root method.
Since A is symmetric then A can be written as
A = LLt , (5.28)
where L = [lij ], lij = 0, i < j, a lower triangular matrix.
Also, A can be decomposed as
A = UUt , (5.29)
Solution of System of Linear Equations 315
Procedure to determine L
Since $A = LL^t$, comparing the elements of both sides gives
$$a_{ij} = \sum_{k=1}^{\min(i,j)} l_{ik}l_{jk}.$$
Hence the elements of $L$ can be computed column by column as
$$l_{jj} = \sqrt{a_{jj} - \sum_{k=1}^{j-1} l_{jk}^2}, \qquad l_{ij} = \frac{1}{l_{jj}}\Big(a_{ij} - \sum_{k=1}^{j-1} l_{ik}l_{jk}\Big), \quad i > j,$$
with $l_{ij} = 0$ for $i < j$.
Similarly, for the factorization (5.29), the elements $u_{ij}$ of $U$ are given by analogous relations.
Example. Solve, by Cholesky's method, the system of equations
$$2x_1 + x_2 + x_3 = 6, \qquad x_1 + 3x_2 + 2x_3 = 11, \qquad x_1 + 2x_2 + 4x_3 = 13.$$
Solution. The coefficient matrix is symmetric and positive definite, and $A = LL^t$ gives $l_{11} = \sqrt{2} = 1.41421$, $l_{21} = l_{31} = 0.70711$, $l_{22} = 1.58114$, $l_{32} = 0.94868$, $l_{33} = 1.61245$. The system $Lz = b$ is
$$\begin{aligned} 1.41421z_1 &= 6\\ 0.70711z_1 + 1.58114z_2 &= 11\\ 0.70711z_1 + 0.94868z_2 + 1.61245z_3 &= 13. \end{aligned}$$
When a matrix is very large and it is not possible to store the entire matrix into the
primary memory of a computer at a time, then matrix partition method is used to
find the inverse of a matrix. When a few more variables and consequently a few more
equations are added to the original system then also this method is very useful.
Let the coefficient matrix $A$ be partitioned as
$$A = \begin{pmatrix} B & C\\ D & E \end{pmatrix} \tag{5.36}$$
and let
$$A^{-1} = \begin{pmatrix} P & Q\\ R & S \end{pmatrix}, \tag{5.37}$$
where the matrices $P, Q, R$ and $S$ are of the same orders as those of the matrices $B, C, D$ and $E$ respectively. Then
$$AA^{-1} = \begin{pmatrix} B & C\\ D & E \end{pmatrix}\begin{pmatrix} P & Q\\ R & S \end{pmatrix} = \begin{pmatrix} I_1 & 0\\ 0 & I_2 \end{pmatrix}, \tag{5.38}$$
where I1 and I2 are identity matrices of order l and m respectively. From (5.38), we
have
BP + CR = I1
BQ + CS = 0
DP + ER = 0
DQ + ES = I2 .
Solving these equations, we get
$$\begin{aligned} S &= (E - DB^{-1}C)^{-1}\\ Q &= -B^{-1}CS\\ R &= -(E - DB^{-1}C)^{-1}DB^{-1} = -SDB^{-1}\\ P &= B^{-1}(I_1 - CR) = B^{-1} - B^{-1}CR. \end{aligned}$$
It may be noted that, to find the inverse of A, it is required to determine the inverses
of two matrices B and (E − DB−1 C) of order l × l and m × m respectively.
That is, to compute the inverse of the matrix $A$ of order $n \times n$, the inverses of two lower order (roughly half) matrices are to be determined. If the matrices $B, C, D, E$ are still too large to fit in the computer memory, they are partitioned further.
Example 5.10.1 Find the inverse of the matrix $A = \begin{pmatrix} 3 & 3 & 4\\ 2 & 1 & 1\\ 1 & 3 & 5 \end{pmatrix}$ using the matrix partition method. Hence find the solution of the system of equations
$$3x_1 + 3x_2 + 4x_3 = 5, \qquad 2x_1 + x_2 + x_3 = 7, \qquad x_1 + 3x_2 + 5x_3 = 6.$$
Solution. Partition $A$ with $B = \begin{pmatrix} 3 & 3\\ 2 & 1 \end{pmatrix}$, $C = \begin{pmatrix} 4\\ 1 \end{pmatrix}$, $D = (1 \ \ 3)$, $E = (5)$. Now,
$$B^{-1} = -\frac{1}{3}\begin{pmatrix} 1 & -3\\ -2 & 3 \end{pmatrix} = \frac{1}{3}\begin{pmatrix} -1 & 3\\ 2 & -3 \end{pmatrix},$$
$$E - DB^{-1}C = 5 - (1 \ \ 3)\,\frac{1}{3}\begin{pmatrix} -1 & 3\\ 2 & -3 \end{pmatrix}\begin{pmatrix} 4\\ 1 \end{pmatrix} = \frac{1}{3}, \qquad S = (E - DB^{-1}C)^{-1} = 3,$$
$$R = -SDB^{-1} = -3\,(1 \ \ 3)\,\frac{1}{3}\begin{pmatrix} -1 & 3\\ 2 & -3 \end{pmatrix} = (-5 \ \ 6),$$
$$Q = -B^{-1}CS = -\frac{1}{3}\begin{pmatrix} -1 & 3\\ 2 & -3 \end{pmatrix}\begin{pmatrix} 4\\ 1 \end{pmatrix}\cdot 3 = \begin{pmatrix} 1\\ -5 \end{pmatrix},$$
$$P = B^{-1} - B^{-1}CR = \frac{1}{3}\begin{pmatrix} -1 & 3\\ 2 & -3 \end{pmatrix} - \frac{1}{3}\begin{pmatrix} -1\\ 5 \end{pmatrix}(-5 \ \ 6) = \begin{pmatrix} -2 & 3\\ 9 & -11 \end{pmatrix}.$$
Therefore,
$$A^{-1} = \begin{pmatrix} -2 & 3 & 1\\ 9 & -11 & -5\\ -5 & 6 & 3 \end{pmatrix}.$$
Hence,
$$x = A^{-1}b = \begin{pmatrix} -2 & 3 & 1\\ 9 & -11 & -5\\ -5 & 6 & 3 \end{pmatrix}\begin{pmatrix} 5\\ 7\\ 6 \end{pmatrix} = \begin{pmatrix} 17\\ -62\\ 35 \end{pmatrix}.$$
Hence the required solution is $x_1 = 17$, $x_2 = -62$, $x_3 = 35$.
Hence the required solution is x1 = 17, x2 = −62, x3 = 35.
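The $P, Q, R, S$ formulas can be sketched in C for the fixed partition used in this example (a $3 \times 3$ matrix split into a $2 \times 2$ block $B$, column $C$, row $D$ and scalar $E$). The function name and the singularity thresholds are ours; a general implementation would of course work with arbitrary block sizes.

```c
#include <math.h>
#include <assert.h>

/* Block-partition inverse of a 3x3 matrix A = [B C; D E], B 2x2, E 1x1.
   Returns 0 if B or the quantity E - D B^{-1} C is singular. */
int partition_inverse_3x3(double a[3][3], double inv[3][3])
{
    double detB = a[0][0]*a[1][1] - a[0][1]*a[1][0];
    if (fabs(detB) < 1e-12) return 0;
    double bi[2][2] = {{ a[1][1]/detB, -a[0][1]/detB},   /* B^{-1} */
                       {-a[1][0]/detB,  a[0][0]/detB}};
    double bic[2] = { bi[0][0]*a[0][2] + bi[0][1]*a[1][2],   /* B^{-1} C */
                      bi[1][0]*a[0][2] + bi[1][1]*a[1][2] };
    double dbi[2] = { a[2][0]*bi[0][0] + a[2][1]*bi[1][0],   /* D B^{-1} */
                      a[2][0]*bi[0][1] + a[2][1]*bi[1][1] };
    double e = a[2][2] - (a[2][0]*bic[0] + a[2][1]*bic[1]);  /* E - D B^{-1} C */
    if (fabs(e) < 1e-12) return 0;
    double S = 1.0/e;
    double R[2] = { -S*dbi[0], -S*dbi[1] };                  /* R = -S D B^{-1} */
    double Q[2] = { -bic[0]*S, -bic[1]*S };                  /* Q = -B^{-1} C S */
    int i, j;
    for (i = 0; i < 2; i++) {                                /* P = B^{-1} - B^{-1}C R */
        for (j = 0; j < 2; j++) inv[i][j] = bi[i][j] - bic[i]*R[j];
        inv[i][2] = Q[i];
    }
    inv[2][0] = R[0]; inv[2][1] = R[1]; inv[2][2] = S;
    return 1;
}
```

On the matrix of Example 5.10.1 it reproduces the inverse with rows $(-2, 3, 1)$, $(9, -11, -5)$, $(-5, 6, 3)$.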
$$A = \begin{pmatrix} b_1 & c_1 & 0 & 0 & \cdots & 0 & 0 & 0\\ a_2 & b_2 & c_2 & 0 & \cdots & 0 & 0 & 0\\ 0 & a_3 & b_3 & c_3 & \cdots & 0 & 0 & 0\\ \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots\\ 0 & 0 & 0 & 0 & \cdots & a_{n-1} & b_{n-1} & c_{n-1}\\ 0 & 0 & 0 & 0 & \cdots & 0 & a_n & b_n \end{pmatrix} \quad \text{and} \quad d = \begin{pmatrix} d_1\\ d_2\\ \vdots\\ d_n \end{pmatrix}. \tag{5.40}$$
It may be noted that the main diagonal and the adjacent coefficients on either side of
it consist of only non-zero elements and all other elements are zero. The matrix is called
tri-diagonal matrix and the system of equations is called a tri-diagonal system. These types of matrices occur frequently in the solution of ordinary and partial differential equations by finite difference methods.
A tri-diagonal system can be solved using LU decomposition method.
Let $A = LU$ where
$$L = \begin{pmatrix} \gamma_1 & 0 & 0 & \cdots & 0 & 0\\ \beta_2 & \gamma_2 & 0 & \cdots & 0 & 0\\ \cdots & \cdots & \cdots & \cdots & \cdots & \cdots\\ 0 & 0 & \cdots & \beta_{n-1} & \gamma_{n-1} & 0\\ 0 & 0 & \cdots & 0 & \beta_n & \gamma_n \end{pmatrix} \quad \text{and} \quad U = \begin{pmatrix} 1 & \alpha_1 & 0 & \cdots & 0 & 0\\ 0 & 1 & \alpha_2 & \cdots & 0 & 0\\ \cdots & \cdots & \cdots & \cdots & \cdots & \cdots\\ 0 & 0 & 0 & \cdots & 1 & \alpha_{n-1}\\ 0 & 0 & 0 & \cdots & 0 & 1 \end{pmatrix}.$$
Then
$$LU = \begin{pmatrix} \gamma_1 & \gamma_1\alpha_1 & 0 & \cdots & 0 & 0\\ \beta_2 & \alpha_1\beta_2 + \gamma_2 & \gamma_2\alpha_2 & \cdots & 0 & 0\\ 0 & \beta_3 & \alpha_2\beta_3 + \gamma_3 & \cdots & 0 & 0\\ \cdots & \cdots & \cdots & \cdots & \cdots & \cdots\\ 0 & 0 & 0 & \cdots & \beta_n & \beta_n\alpha_{n-1} + \gamma_n \end{pmatrix}.$$
Now, comparing the matrix $LU$ with $A$, the non-zero elements of $L$ and $U$ are obtained as
$$\gamma_1 = b_1; \quad \gamma_i\alpha_i = c_i \ \text{ or, } \alpha_i = c_i/\gamma_i, \ i = 1, 2, \ldots, n-1; \quad \beta_i = a_i, \ i = 2, \ldots, n;$$
$$\gamma_i = b_i - \alpha_{i-1}\beta_i = b_i - a_i\frac{c_{i-1}}{\gamma_{i-1}}, \quad i = 2, 3, \ldots, n.$$
Thus the elements of $L$ and $U$ are given by the following relations:
$$\gamma_1 = b_1, \qquad \gamma_i = b_i - \frac{a_ic_{i-1}}{\gamma_{i-1}}, \ i = 2, 3, \ldots, n \tag{5.41}$$
$$\beta_i = a_i, \ i = 2, 3, \ldots, n \tag{5.42}$$
$$\alpha_i = c_i/\gamma_i, \ i = 1, 2, \ldots, n-1.$$
The solution of the equation (5.39), i.e., $Ax = d$ where $d = (d_1, d_2, \ldots, d_n)^t$, can be obtained by solving $Lz = d$ using forward substitution and then solving $Ux = z$ using back substitution. The solution of $Lz = d$ is given by
$$z_1 = \frac{d_1}{b_1}, \qquad z_i = \frac{d_i - a_iz_{i-1}}{\gamma_i}, \ i = 2, 3, \ldots, n. \tag{5.43}$$
The solution of the equation $Ux = z$ is
$$x_n = z_n, \qquad x_i = z_i - \alpha_ix_{i+1} = z_i - \frac{c_i}{\gamma_i}x_{i+1}, \ i = n-1, n-2, \ldots, 1. \tag{5.44}$$
$$\gamma_1 = b_1 = 1, \qquad \gamma_2 = b_2 - a_2\frac{c_1}{\gamma_1} = 2 - (-1)\cdot 1 = 3, \qquad \gamma_3 = b_3 - a_3\frac{c_2}{\gamma_2} = 2 - 3\cdot\frac{1}{3} = 1,$$
$$z_1 = \frac{d_1}{b_1} = 3, \qquad z_2 = \frac{d_2 - a_2z_1}{\gamma_2} = 3, \qquad z_3 = \frac{d_3 - a_3z_2}{\gamma_3} = 3,$$
$$x_3 = z_3 = 3, \qquad x_2 = z_2 - \frac{c_2}{\gamma_2}x_3 = 2, \qquad x_1 = z_1 - \frac{c_1}{\gamma_1}x_2 = 1.$$
Hence the required solution is $x_1 = 1$, $x_2 = 2$, $x_3 = 3$.
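The recurrences (5.41), (5.43) and (5.44) amount to what is commonly called the Thomas algorithm. A compact sketch in C (0-based indexing, `double` arithmetic; the function name and the zero threshold are ours):

```c
#include <math.h>
#include <assert.h>
#define N 100

/* Solve a tri-diagonal system: sub-diagonal a[1..n-1], diagonal
   b[0..n-1], super-diagonal c[0..n-2], right side d, via (5.41)-(5.44).
   Returns 0 if some gamma vanishes (the plain method fails). */
int tridiag(int n, double a[N], double b[N], double c[N], double d[N], double x[N])
{
    double gamma[N], z[N];
    int i;
    gamma[0] = b[0];
    if (fabs(gamma[0]) < 1e-12) return 0;
    z[0] = d[0] / gamma[0];
    for (i = 1; i < n; i++) {
        gamma[i] = b[i] - a[i] * c[i-1] / gamma[i-1];   /* eq. (5.41) */
        if (fabs(gamma[i]) < 1e-12) return 0;
        z[i] = (d[i] - a[i] * z[i-1]) / gamma[i];       /* eq. (5.43) */
    }
    x[n-1] = z[n-1];
    for (i = n - 2; i >= 0; i--)                        /* eq. (5.44) */
        x[i] = z[i] - c[i] * x[i+1] / gamma[i];
    return 1;
}
```

On the worked example above (with $b = (1, 2, 2)$, $a_2 = -1$, $a_3 = 3$, $c = (1, 1)$, $d = (3, 6, 12)$) it reproduces the solution $(1, 2, 3)$.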
It may be noted that the equations (5.43) and (5.44) are valid only if $\gamma_i \neq 0$ for all $i = 1, 2, \ldots, n$. If any one of the $\gamma_i$ becomes zero at any stage then the method is not applicable directly. Actually, this method is based on the LU decomposition technique, and LU decomposition is applicable and unique if the principal minors of the coefficient matrix of all orders are non-zero. But, if a minor becomes zero, a modification of (5.43) and (5.44) still gives the solution of the tri-diagonal system.
x1 + x2 = 3,
x1 + x2 − 3x3 = −3,
−2x2 + 3x3 = 4.
$$\gamma_1 = b_1 = 1, \qquad \gamma_2 = b_2 - a_2\frac{c_1}{\gamma_1} = 1 - 1 = 0.$$
Since $\gamma_2 = 0$, let $\gamma_2 = s$. Therefore,
$$\gamma_3 = b_3 - a_3\frac{c_2}{\gamma_2} = 3 + 2\cdot\frac{-3}{s} = 3 - \frac{6}{s},$$
$$z_1 = \frac{d_1}{b_1} = 3, \qquad z_2 = \frac{d_2 - a_2z_1}{\gamma_2} = -\frac{6}{s}, \qquad z_3 = \frac{d_3 - a_3z_2}{\gamma_3} = \frac{4s - 12}{3s - 6},$$
$$x_3 = z_3 = \frac{4s - 12}{3s - 6}, \qquad x_2 = z_2 - \frac{c_2}{\gamma_2}x_3 = \frac{-6}{3s - 6}, \qquad x_1 = z_1 - \frac{c_1}{\gamma_1}x_2 = \frac{9s - 12}{3s - 6}.$$
Letting $s \to 0$ gives the solution. The required solution is $x_1 = 2$, $x_2 = 1$, $x_3 = 2$.
Algorithm Tridiagonal
//Let the tri-diagonal system be of the form $Ax = d$, where $A$ and $d$ are given by (5.40).//
Step 1. Read the matrix $A$, i.e., the arrays $a_i, b_i, c_i$, $i = 2, 3, \ldots, n-1$, and $b_1, c_1, a_n, b_n$; read the right hand vector $d$.
Step 2. Compute $\gamma_1 = b_1$, $\gamma_i = b_i - \dfrac{a_ic_{i-1}}{\gamma_{i-1}}$, $i = 2, 3, \ldots, n$.
Step 3. Compute $z_1 = \dfrac{d_1}{b_1}$, $z_i = \dfrac{d_i - a_iz_{i-1}}{\gamma_i}$, $i = 2, 3, \ldots, n$.
Step 4. Compute $x_n = z_n$, $x_i = z_i - \dfrac{c_i}{\gamma_i}x_{i+1}$, $i = n-1, n-2, \ldots, 1$.
Step 5. Print $x_i$, $i = 1, 2, \ldots, n$, and Stop.
end Tridiagonal
Program 5.7.
/* Program TriDiagonal
Program to solve a tri-diagonal system of equations.
The coefficient matrix are taken as a[i],b[i],c[i],
i=2, 3, ..., n-1, b[1],c[1],a[n],b[n]. The right
hand vector is d[i], i=1, 2, ..., n.*/
#include<stdio.h>
#include<stdlib.h>
float x[10]; /* x[i]is the solution of the tri-diagonal system */
void main()
{
float a[10],b[10],c[10],d[10];
int i,n; float y;
float TriDiag(float [],float [],float [],float [],int);
printf("Enter the size of the coefficient matrix ");
scanf("%d",&n);
printf("Enter first row (only non-zero elements) ");
scanf("%f %f",&b[1],&c[1]);
printf("Enter rows 2 to n-1 ");
for(i=2;i<=n-1;i++) scanf("%f %f %f",&a[i],&b[i],&c[i]);
printf("Enter last row ");
scanf("%f %f",&a[n],&b[n]);
printf("Enter the right hand vector ");
for(i=1;i<=n;i++) scanf("%f",&d[i]);
y=TriDiag(a,b,c,d,n);
if(y==0) printf("The method fails as some gamma is zero");
else
{
printf("The solution is\n");
for(i=1;i<=n;i++) printf("%8.5f ",x[i]);
}
} /* main */

/* solves the tri-diagonal system; returns 0 if some gamma vanishes */
float TriDiag(float a[],float b[],float c[],float d[],int n)
{
int i; float gama[10],z[10];
gama[1]=b[1]; if(gama[1]==0) return(0);
for(i=2;i<=n;i++)
{
gama[i]=b[i]-a[i]*c[i-1]/gama[i-1];
if(gama[i]==0) return(0);
}
z[1]=d[1]/b[1];
for(i=2;i<=n;i++) z[i]=(d[i]-a[i]*z[i-1])/gama[i];
x[n]=z[n];
for(i=n-1;i>=1;i--) x[i]=z[i]-c[i]*x[i+1]/gama[i];
return(1);
}
A sample of input/output:
5.12 Evaluation of Tri-diagonal Determinant
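A tri-diagonal determinant can be evaluated without any elimination through the classical three-term recurrence $D_i = b_iD_{i-1} - a_ic_{i-1}D_{i-2}$ with $D_0 = 1$, $D_1 = b_1$, where $D_i$ is the determinant of the leading $i \times i$ block. The sketch below implements this standard recurrence (not necessarily the particular method of this section); indexing is 0-based and the name is ours.

```c
#include <math.h>
#include <assert.h>
#define N 100

/* Determinant of a tri-diagonal matrix with diagonal b[0..n-1],
   sub-diagonal a[1..n-1] and super-diagonal c[0..n-2], via
   D_i = b_i D_{i-1} - a_i c_{i-1} D_{i-2}, D_0 = 1. */
double tridiag_det(int n, double a[N], double b[N], double c[N])
{
    double dprev = 1.0, dcur = b[0], t;   /* D_0 and D_1 */
    int i;
    for (i = 1; i < n; i++) {
        t = b[i] * dcur - a[i] * c[i-1] * dprev;
        dprev = dcur;
        dcur = t;
    }
    return dcur;
}
```

For instance, for the matrix with rows $(1, 1, 0)$, $(-1, 2, 1)$, $(0, 3, 2)$ the recurrence gives $D_3 = 3$, in agreement with direct expansion.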
The norm of a vector is the size or length of that vector. The norm of a vector $x$ is denoted by $\|x\|$. This is a real number, and the commonly used vector norms are
$$\text{(i)} \quad \|x\|_1 = \sum_{i=1}^{n} |x_i| \tag{5.49}$$
$$\text{(ii)} \quad \|x\|_2 = \sqrt{\sum_{i=1}^{n} |x_i|^2} \quad \text{(Euclidean norm)} \tag{5.50}$$
$$\text{(iii)} \quad \|x\|_\infty = \max_i |x_i| \quad \text{(maximum norm or uniform norm)}. \tag{5.51}$$
Let $A$ and $B$ be two matrices such that $A + B$ and $AB$ are defined. The norm of a matrix $A = [a_{ij}]$ is denoted by $\|A\|$, which satisfies the following conditions
Solution. For the matrix $A = \begin{pmatrix} 2 & 3 & 4\\ 0 & -1 & 5\\ 3 & 2 & 6 \end{pmatrix}$,
$$\|A\|_1 = \max\{2 + 0 + 3,\ 3 + 1 + 2,\ 4 + 5 + 6\} = 15,$$
$$\|A\|_2 = \sqrt{2^2 + 3^2 + 4^2 + 0^2 + (-1)^2 + 5^2 + 3^2 + 2^2 + 6^2} = \sqrt{104}, \quad \text{and}$$
$$\|A\|_\infty = \max\{2 + 3 + 4,\ 0 + 1 + 5,\ 3 + 2 + 6\} = 11.$$
$$x + 3y = 4, \qquad \tfrac{1}{3}x + y = 1.33. \tag{5.60}$$
Note that this system of equations has no solution. But, if we take the approximate value of $\tfrac{1}{3}$ as 0.3 then (5.60) becomes
x + 3y = 4
0.3x + y = 1.33. (5.61)
But the solution of the system
$$x + 3y = 4, \qquad 0.33x + y = 1.33 \tag{5.62}$$
is $x = 1$, $y = 1$.
The approximations 0.333 and 0.3333 of $\tfrac{1}{3}$ give the following systems:
$$x + 3y = 4, \qquad 0.333x + y = 1.33, \tag{5.63}$$
$$x + 3y = 4, \qquad 0.3333x + y = 1.33. \tag{5.64}$$
$$Ax = b \tag{5.65}$$
and let the slightly perturbed system be
$$A'y = b'. \tag{5.66}$$
The system (5.65) is called ill-conditioned when the changes in y are too large com-
pared to those in x. Otherwise, the system is called well-conditioned. If a system is
ill-conditioned then the corresponding coefficient matrix is called an ill-conditioned
matrix.
The system (5.62) is ill-conditioned, and the corresponding coefficient matrix $\begin{pmatrix} 1 & 3\\ 0.33 & 1 \end{pmatrix}$ is an ill-conditioned matrix.
Generally, ill-conditioning occurs when $|A|$ is small. Different methods are available to measure the ill-condition of a matrix; one useful method is introduced here.
The quantity Cond($A$), called the condition number of the matrix $A$, is defined by
$$\text{Cond}(A) = \|A\| \cdot \|A^{-1}\|, \tag{5.67}$$
where $\|A\|$ is any matrix norm. It gives a measure of the condition of the matrix $A$: a large value of Cond($A$) indicates ill-condition of the matrix or of the associated system of equations.
Let $A = \begin{pmatrix} 1 & 3\\ 0.33 & 1 \end{pmatrix}$ and $B = \begin{pmatrix} 4 & 3\\ 3 & 5 \end{pmatrix}$ be two matrices. Then
$$A^{-1} = \begin{pmatrix} 100 & -300\\ -33 & 100 \end{pmatrix} \quad \text{and} \quad B^{-1} = \frac{1}{11}\begin{pmatrix} 5 & -3\\ -3 & 4 \end{pmatrix}.$$
The Euclidean norms are $\|A\|_2 = \sqrt{1 + 9 + 0.1089 + 1} = 3.3330$ and $\|A^{-1}\|_2 = 333.3001$. Therefore Cond($A$) $= \|A\|_2 \times \|A^{-1}\|_2 = 1110.88945$, a very large number; hence $A$ is ill-conditioned. Whereas $\|B\|_2 = 7.68115$ and $\|B^{-1}\|_2 = 0.69829$, so Cond($B$) $= 5.36364$, a relatively small quantity. Therefore $B$ is a well-conditioned matrix.
Another indicator of an ill-conditioned matrix is presented below. Let $A = [a_{ij}]$ be the matrix and $r_i = \Big(\sum_{j=1}^{n} a_{ij}^2\Big)^{1/2}$. The quantity
$$\nu(A) = \frac{|A|}{r_1r_2\cdots r_n} \tag{5.68}$$
serves as a measure: when $|\nu(A)|$ is very small compared to 1, the matrix $A$ is ill-conditioned.
Let $\tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_n$ be an approximate solution of (5.69). Since this is an approximate solution, $\sum_{j=1}^{n} a_{ij}\tilde{x}_j$ is not necessarily equal to $b_i$; let $\sum_{j=1}^{n} a_{ij}\tilde{x}_j = \tilde{b}_i$ for this approximate solution. Then, for this solution, (5.69) becomes
$$\sum_{j=1}^{n} a_{ij}\tilde{x}_j = \tilde{b}_i, \quad i = 1, 2, \ldots, n. \tag{5.70}$$
Subtracting (5.70) from (5.69) gives
$$\sum_{j=1}^{n} a_{ij}\varepsilon_j = d_i, \quad i = 1, 2, \ldots, n, \tag{5.71}$$
where $\varepsilon_i = x_i - \tilde{x}_i$, $d_i = b_i - \tilde{b}_i$, $i = 1, 2, \ldots, n$.
Now, the values of the $\varepsilon_i$ are obtained by solving the system (5.71). Hence the new solution is given by $x_i = \varepsilon_i + \tilde{x}_i$, and these values are better approximations to the $x_i$. This technique can be repeated to improve the accuracy further.
The conventional matrix inverse (discussed in Section 5.3) is widely used in many areas
of science and engineering. It is also well known that conventional inverses can be
determined only for square non-singular matrices. But, in many areas of science and
engineering, such as statistics and data analysis, some kind of weak inverse of singular square and rectangular matrices is very much essential. The inverses of such types of matrices are known as generalized inverses or g-inverses. A good deal of work has been done on g-inverses during the last three decades. The generalized inverse of an $m \times n$
matrix A is a matrix X of size n × m. But, different types of generalized inverses are
defined by various authors. The following matrix equations are used to classify the
different types of generalized inverses for the matrix A:
(i) AXA = A
(ii) XAX = X
(5.72)
(iii) AX = (AX)∗
(iv) XA = (XA)∗ , (∗ denotes the conjugate transpose).
x = A+ b (5.81)
Let
$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1k} & \cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2k} & \cdots & a_{2n}\\ \cdots & \cdots & \cdots & \cdots & \cdots & \cdots\\ a_{m1} & a_{m2} & \cdots & a_{mk} & \cdots & a_{mn} \end{pmatrix} = (\alpha_1\ \alpha_2\ \ldots\ \alpha_k\ \ldots\ \alpha_n), \tag{5.82}$$
where $\alpha_k = (a_{1k}, a_{2k}, \ldots, a_{mk})^t$ is the $k$th column of the matrix $A$.
Also, let $A_k$ be the matrix formed by the first $k$ columns, i.e., $A_k = (\alpha_1\ \alpha_2\ \ldots\ \alpha_k)$. Then $A_k = (A_{k-1}\ \alpha_k)$.
The proposed algorithm is recursive, and the recursion is started with
$$A_1^+ = 0 \ \text{ if } \alpha_1 = 0 \text{ (null column), else } \ A_1^+ = (\alpha_1^t\alpha_1)^{-1}\alpha_1^t. \tag{5.83}$$
Let the column vectors be
$$\delta_k = A_{k-1}^+\alpha_k \tag{5.84}$$
$$\text{and} \quad \gamma_k = \alpha_k - A_{k-1}\delta_k. \tag{5.85}$$
If $\gamma_k \neq 0$, then compute
$$\beta_k = \gamma_k^+ = (\gamma_k^t\gamma_k)^{-1}\gamma_k^t; \tag{5.86}$$
$$\text{else} \quad \beta_k = (1 + \delta_k^t\delta_k)^{-1}\delta_k^tA_{k-1}^+. \tag{5.87}$$
Then, the matrix $A_k^+$ is given by
$$A_k^+ = \begin{pmatrix} A_{k-1}^+ - \delta_k\beta_k\\ \beta_k \end{pmatrix}. \tag{5.88}$$
Solution. Here $\alpha_1 = \begin{pmatrix} 2\\ 1\\ 0 \end{pmatrix}$, $\alpha_2 = \begin{pmatrix} 1\\ 0\\ 1 \end{pmatrix}$, $\alpha_3 = \begin{pmatrix} 0\\ -1\\ 1 \end{pmatrix}$, $\alpha_4 = \begin{pmatrix} 1\\ 0\\ 2 \end{pmatrix}$, $A_1 = \alpha_1$.
$$A_1^+ = (\alpha_1^t\alpha_1)^{-1}\alpha_1^t = \tfrac{1}{5}(2\ \ 1\ \ 0) = \big(\tfrac{2}{5}\ \ \tfrac{1}{5}\ \ 0\big),$$
$$\delta_2 = A_1^+\alpha_2 = \tfrac{2}{5}, \qquad \gamma_2 = \alpha_2 - A_1\delta_2 = \begin{pmatrix} 1\\ 0\\ 1 \end{pmatrix} - \tfrac{2}{5}\begin{pmatrix} 2\\ 1\\ 0 \end{pmatrix} = \begin{pmatrix} 1/5\\ -2/5\\ 1 \end{pmatrix} \neq 0.$$
Hence
$$\beta_2 = \gamma_2^+ = (\gamma_2^t\gamma_2)^{-1}\gamma_2^t = \tfrac{5}{6}\big(\tfrac{1}{5}\ \ -\tfrac{2}{5}\ \ 1\big) = \big(\tfrac{1}{6}\ \ -\tfrac{1}{3}\ \ \tfrac{5}{6}\big),$$
$$\delta_2\beta_2 = \tfrac{2}{5}\big(\tfrac{1}{6}\ \ -\tfrac{1}{3}\ \ \tfrac{5}{6}\big) = \big(\tfrac{1}{15}\ \ -\tfrac{2}{15}\ \ \tfrac{1}{3}\big), \qquad A_2^+ = \begin{pmatrix} A_1^+ - \delta_2\beta_2\\ \beta_2 \end{pmatrix} = \begin{pmatrix} 1/3 & 1/3 & -1/3\\ 1/6 & -1/3 & 5/6 \end{pmatrix}.$$
Now,
$$\delta_3 = A_2^+\alpha_3 = \begin{pmatrix} -2/3\\ 7/6 \end{pmatrix}, \qquad \gamma_3 = \alpha_3 - A_2\delta_3 = \begin{pmatrix} 0\\ -1\\ 1 \end{pmatrix} - \begin{pmatrix} -1/6\\ -2/3\\ 7/6 \end{pmatrix} = \begin{pmatrix} 1/6\\ -1/3\\ -1/6 \end{pmatrix} \neq 0.$$
Hence
$$\beta_3 = \gamma_3^+ = (\gamma_3^t\gamma_3)^{-1}\gamma_3^t = 6\big(\tfrac{1}{6}\ \ -\tfrac{1}{3}\ \ -\tfrac{1}{6}\big) = (1\ \ -2\ \ -1),$$
$$\delta_3\beta_3 = \begin{pmatrix} -2/3 & 4/3 & 2/3\\ 7/6 & -7/3 & -7/6 \end{pmatrix}, \qquad A_3^+ = \begin{pmatrix} A_2^+ - \delta_3\beta_3\\ \beta_3 \end{pmatrix} = \begin{pmatrix} 1 & -1 & -1\\ -1 & 2 & 2\\ 1 & -2 & -1 \end{pmatrix}.$$
Now,
$$\delta_4 = A_3^+\alpha_4 = \begin{pmatrix} 1 & -1 & -1\\ -1 & 2 & 2\\ 1 & -2 & -1 \end{pmatrix}\begin{pmatrix} 1\\ 0\\ 2 \end{pmatrix} = \begin{pmatrix} -1\\ 3\\ -1 \end{pmatrix},$$
$$\gamma_4 = \alpha_4 - A_3\delta_4 = \begin{pmatrix} 1\\ 0\\ 2 \end{pmatrix} - \begin{pmatrix} 1\\ 0\\ 2 \end{pmatrix} = 0 \quad \text{(the null column vector)}.$$
So that
$$\beta_4 = (1 + \delta_4^t\delta_4)^{-1}\delta_4^tA_3^+ = \tfrac{1}{12}(-1\ \ 3\ \ -1)\begin{pmatrix} 1 & -1 & -1\\ -1 & 2 & 2\\ 1 & -2 & -1 \end{pmatrix} = \big(-\tfrac{5}{12}\ \ \tfrac{3}{4}\ \ \tfrac{2}{3}\big),$$
$$\delta_4\beta_4 = \begin{pmatrix} 5/12 & -3/4 & -2/3\\ -5/4 & 9/4 & 2\\ 5/12 & -3/4 & -2/3 \end{pmatrix}, \qquad A_4^+ = \begin{pmatrix} A_3^+ - \delta_4\beta_4\\ \beta_4 \end{pmatrix} = \begin{pmatrix} 7/12 & -1/4 & -1/3\\ 1/4 & -1/4 & 0\\ 7/12 & -5/4 & -1/3\\ -5/12 & 3/4 & 2/3 \end{pmatrix} = A^+.$$
The given system is $Ax = b$ with $b = (4, 0, 4)^t$, and its solution is $x = A^+b$. Therefore,
$$x = \begin{pmatrix} 7/12 & -1/4 & -1/3\\ 1/4 & -1/4 & 0\\ 7/12 & -5/4 & -1/3\\ -5/12 & 3/4 & 2/3 \end{pmatrix}\begin{pmatrix} 4\\ 0\\ 4 \end{pmatrix} = \begin{pmatrix} 1\\ 1\\ 1\\ 1 \end{pmatrix}.$$
Therefore the required solution is $x_1 = 1$, $x_2 = 1$, $x_3 = 1$, $x_4 = 1$.
Note 5.15.1 It may be verified that $A^+$ satisfies all the conditions (5.72). Again, $A_3^+ = A_3^{-1}$ as $|A_3| = 1$, i.e., $A_3$ is non-singular. In addition to this, for this $A$, $AA^+ = I_3$, but $A^+A \neq I_4$, the unit matrix of order 4.
Ax = b (5.89)
where A, x and b are of order m×n, n×1 and m×1 respectively. Here, we assume that
(5.89) is inconsistent. Since (5.89) is inconsistent, the system has no solution. Again, since there may be more than one least squares solution $x_l$ for which $\|Ax - b\|$ is minimum, there exists one such solution (say $x_m$) whose norm is minimum. That is, $x_m$ is called the minimum norm least squares solution if $\|x_m\| \leq \|x_l\|$ for every least squares solution $x_l$.
The minimum norm least squares solution can be determined using the relation
x = A+ b. (5.92)
Also, $S$ is called the sum of squares of the residuals. To solve (5.89), find $x = (x_1, x_2, \ldots, x_n)^t$ from (5.93) in such a way that $S$ is minimum. The conditions for $S$ to be minimum are
$$\frac{\partial S}{\partial x_1} = 0, \quad \frac{\partial S}{\partial x_2} = 0, \quad \cdots, \quad \frac{\partial S}{\partial x_n} = 0. \tag{5.94}$$
$$S^* = \sum_{i=1}^{m}\Big(\sum_{j=1}^{n} a_{ij}x_j^* - b_i\Big)^2. \tag{5.96}$$
This method is not suitable for a large system of equations, while the method stated
in equation (5.92) is applicable for a large system also.
Example 5.16.1 Find the g-inverse of the singular matrix $A = \begin{pmatrix} 3 & 6\\ 2 & 4 \end{pmatrix}$ and hence find a least squares solution of the inconsistent system
$$3x + 6y = 9, \qquad 2x + 4y = 5.$$
Solution. Let $\alpha_1 = \begin{pmatrix} 3\\ 2 \end{pmatrix}$, $\alpha_2 = \begin{pmatrix} 6\\ 4 \end{pmatrix}$, $A_1 = \alpha_1$. Then
$$A_1^+ = (\alpha_1^t\alpha_1)^{-1}\alpha_1^t = \tfrac{1}{13}(3\ \ 2) = \big(\tfrac{3}{13}\ \ \tfrac{2}{13}\big),$$
$$\delta_2 = A_1^+\alpha_2 = \big(\tfrac{3}{13}\ \ \tfrac{2}{13}\big)\begin{pmatrix} 6\\ 4 \end{pmatrix} = 2,$$
$$\gamma_2 = \alpha_2 - A_1\delta_2 = \begin{pmatrix} 6\\ 4 \end{pmatrix} - 2\begin{pmatrix} 3\\ 2 \end{pmatrix} = \begin{pmatrix} 0\\ 0 \end{pmatrix} = 0 \quad \text{(a null vector)},$$
$$\beta_2 = (1 + \delta_2^t\delta_2)^{-1}\delta_2^tA_1^+ = \tfrac{1}{5}\cdot 2\cdot\big(\tfrac{3}{13}\ \ \tfrac{2}{13}\big) = \big(\tfrac{6}{65}\ \ \tfrac{4}{65}\big),$$
$$\delta_2\beta_2 = \big(\tfrac{12}{65}\ \ \tfrac{8}{65}\big), \qquad A_1^+ - \delta_2\beta_2 = \big(\tfrac{3}{65}\ \ \tfrac{2}{65}\big).$$
Therefore,
$$A_2^+ = \begin{pmatrix} A_1^+ - \delta_2\beta_2\\ \beta_2 \end{pmatrix} = \begin{pmatrix} 3/65 & 2/65\\ 6/65 & 4/65 \end{pmatrix} = A^+,$$
which is the g-inverse of $A$.
Second Part: The given equations can be written as $Ax = b$, where $A = \begin{pmatrix} 3 & 6\\ 2 & 4 \end{pmatrix}$, $x = \begin{pmatrix} x\\ y \end{pmatrix}$ and $b = \begin{pmatrix} 9\\ 5 \end{pmatrix}$. Then the least squares solution is given by
$$x = A^+b = \frac{1}{65}\begin{pmatrix} 3 & 2\\ 6 & 4 \end{pmatrix}\begin{pmatrix} 9\\ 5 \end{pmatrix} = \frac{1}{65}\begin{pmatrix} 37\\ 74 \end{pmatrix}.$$
Hence the least squares solution is $x = \dfrac{37}{65}$, $y = \dfrac{74}{65}$.
Example 5.16.2 Find the least squares solution of the following equations x + y =
3.0, 2x − y = 0.03, x + 3y = 7.03, and 3x + y = 4.97. Also, estimate the residue.
Solution. Let x, y be least squares solution of the given system. Then the square of
residues S is
S = (x + y − 3.0)2 + (2x − y − 0.03)2 + (x + 3y − 7.03)2 + (3x + y − 4.97)2 .
We choose x and y in such a way that S is minimum. Therefore
∂S ∂S
= 0 and = 0.
∂x ∂y
That is,
2(x + y − 3.0) + 4(2x − y − 0.03) + 2(x + 3y − 7.03) + 6(3x + y − 4.97) = 0
and 2(x + y − 3.0) − 2(2x − y − 0.03) + 6(x + 3y − 7.03) + 2(3x + y − 4.97) = 0.
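Setting the partial derivatives to zero as above is equivalent to solving the normal equations $A^tA\,(x, y)^t = A^tb$. A sketch for an over-determined $m \times 2$ system (the function name `lsq2` is ours, and the closed-form $2 \times 2$ solve is a convenience, not the book's program):

```c
#include <math.h>
#include <assert.h>

/* Least squares fit for an m x 2 system via the normal equations
   A^t A (x, y)^t = A^t b.  Returns 0 if A^t A is singular. */
int lsq2(int m, double A[][2], double b[], double *x, double *y)
{
    double p = 0, q = 0, r = 0, u = 0, v = 0;
    int i;
    for (i = 0; i < m; i++) {           /* accumulate A^t A and A^t b */
        p += A[i][0]*A[i][0];  q += A[i][0]*A[i][1];  r += A[i][1]*A[i][1];
        u += A[i][0]*b[i];     v += A[i][1]*b[i];
    }
    double det = p*r - q*q;
    if (fabs(det) < 1e-15) return 0;
    *x = (u*r - q*v) / det;             /* Cramer's rule on the 2x2 system */
    *y = (p*v - q*u) / det;
    return 1;
}
```

On the four equations of Example 5.16.2 it gives $x \approx 0.99903$, $y \approx 2.00290$.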
Iteration Methods
If the system of equations has a large number of variables, then the direct methods
are not much suitable. In this case, the approximate numerical methods are used to
determine the variables of the system.
The approximate methods for solving system of linear equations make it possible to
obtain the values of the roots of the system with the specified accuracy as the limit of
the sequence of some vectors. The process of constructing such a sequence is known as
the iterative process.
The efficiency of the application of approximate methods depends on the choice of
the initial vector and the rate of convergence of the process.
The following two approximate methods are widely used to solve a system of linear
equations:
(i) method of iteration (Jacobi’s iteration method), and
(ii) Gauss-Seidal’s iteration method.
Before presenting the iteration methods, some terms are introduced to analyse the
methods.
Let $x_i^{(k)}$, $i = 1, 2, \ldots, n$ be the $k$th ($k = 1, 2, \ldots$) iterated value of the variable $x_i$ and $x^{(k)} = (x_1^{(k)}, x_2^{(k)}, \ldots, x_n^{(k)})^t$ be the solution vector obtained at the $k$th iteration.
The sequence $\{x^{(k)}\}$, $k = 1, 2, \ldots$ is said to converge to a vector $x = (x_1, x_2, \ldots, x_n)^t$ if for each $i$ ($= 1, 2, \ldots, n$)
$$x_i^{(k)} \longrightarrow x_i \ \text{ as } \ k \longrightarrow \infty. \tag{5.97}$$
Let $\xi = (\xi_1, \xi_2, \ldots, \xi_n)^t$ be the exact solution of the system of linear equations. Then the error $\varepsilon_i^{(k)}$ in the $i$th variable $x_i$ committed at the $k$th iteration is given by
$$\varepsilon_i^{(k)} = \xi_i - x_i^{(k)}. \tag{5.98}$$
$$\begin{aligned} x_1 &= \frac{1}{a_{11}}(b_1 - a_{12}x_2 - a_{13}x_3 - \cdots - a_{1n}x_n)\\ x_2 &= \frac{1}{a_{22}}(b_2 - a_{21}x_1 - a_{23}x_3 - \cdots - a_{2n}x_n)\\ &\cdots\cdots\cdots\cdots\cdots\cdots\cdots\cdots\cdots\cdots\cdots\cdots\\ x_n &= \frac{1}{a_{nn}}(b_n - a_{n1}x_1 - a_{n2}x_2 - \cdots - a_{n,n-1}x_{n-1}). \end{aligned} \tag{5.103}$$
Let $x_1^{(0)}, x_2^{(0)}, \ldots, x_n^{(0)}$ be the initial guesses for the variables $x_1, x_2, \ldots, x_n$ respectively (the initial guesses may be taken as zeros). Substituting these values in the right hand side of (5.103) yields the first approximation.
The iteration process is continued until all the roots converge to the required number
of significant figures. This iteration method is called Jacobi’s iteration or simply the
method of iteration.
The Jacobi’s iteration method surely converges if the coefficient matrix is diagonally
dominant.
$$\xi_i - x_i^{(k+1)} = -\frac{1}{a_{ii}}\sum_{\substack{j=1\\ j\neq i}}^{n} a_{ij}\big(\xi_j - x_j^{(k)}\big)$$
or
$$\varepsilon_i^{(k+1)} = -\frac{1}{a_{ii}}\sum_{\substack{j=1\\ j\neq i}}^{n} a_{ij}\varepsilon_j^{(k)}.$$
That is,
$$\big|\varepsilon_i^{(k+1)}\big| \leq \frac{1}{|a_{ii}|}\sum_{\substack{j=1\\ j\neq i}}^{n} |a_{ij}|\,\big|\varepsilon_j^{(k)}\big| \leq \frac{1}{|a_{ii}|}\sum_{\substack{j=1\\ j\neq i}}^{n} |a_{ij}|\,\big\|\varepsilon^{(k)}\big\|.$$
Let
$$A = \max_i\bigg\{\frac{1}{|a_{ii}|}\sum_{\substack{j=1\\ j\neq i}}^{n} |a_{ij}|\bigg\}.$$
Then $\|\varepsilon^{(k+1)}\| \leq A\|\varepsilon^{(k)}\|$. This relation shows that the rate of convergence of Gauss-Jacobi's method is linear.
Again, $\|\varepsilon^{(k+1)}\| \leq A\|\varepsilon^{(k)}\| \leq A^2\|\varepsilon^{(k-1)}\| \leq \cdots \leq A^{k+1}\|\varepsilon^{(0)}\|$.
That is,
$$\big\|\varepsilon^{(k+1)}\big\| \leq \frac{A}{1 - A}\big\|x^{(k+1)} - x^{(k)}\big\|.$$
This relation gives the absolute error at the $(k+1)$th iteration in terms of the difference between the iterates at the $k$th and $(k+1)$th iterations.
27x + 6y − z = 54
6x + 15y + 2z = 72
x + y + 54z = 110.
k x y z
0 0 0 0
1 2.00000 4.80000 2.03704
2 1.00878 3.72839 1.91111
3 1.24225 4.14167 1.94931
4 1.15183 4.04319 1.93733
5 1.17327 4.08096 1.94083
6 1.16500 4.07191 1.93974
7 1.16697 4.07537 1.94006
8 1.16614 4.07454 1.93996
9 1.16640 4.07488 1.93999
10 1.16632 4.07477 1.93998
11 1.16635 4.07481 1.93998
Here $e = (3\times 10^{-5},\ 4\times 10^{-5},\ 0)^t$ is the difference between the last two iterates. Therefore, the upper bound of the absolute error is
$$\frac{A}{1 - A}\,\|e\| = 5.71\times 10^{-5}.$$
Step 6. If |xi − xni | < ε (ε is an error tolerance) for all i, then goto Step 7 else
set xi = xni for all i and goto Step 5.
Step 7. Print xni , i = 1, 2, . . . , n as solution.
end Gauss Jacobi
Program 5.8.
/*Program Gauss_Jacobi
Solution of a system of linear equations by Gauss-Jacobi’s iteration
method. Testing of diagonal dominance is also incorporated.*/
#include<stdio.h>
#include<math.h>
#include<stdlib.h>
void main()
{
float a[10][10],b[10],x[10],xn[10],epp=0.00001,sum;
int i,j,n,flag;
printf("Enter number of variables ");
scanf("%d",&n);
printf("\nEnter the coefficients rowwise ");
for(i=1;i<=n;i++)
for(j=1;j<=n;j++) scanf("%f",&a[i][j]);
printf("\nEnter right hand vector ");
for(i=1;i<=n;i++)
scanf("%f",&b[i]);
for(i=1;i<=n;i++) x[i]=0; /* initialize */
/* tests for diagonal dominance */
flag=0;
for(i=1;i<=n;i++)
{
sum=0;
for(j=1;j<=n;j++) if(j!=i) sum+=fabs(a[i][j]);
if(fabs(a[i][i])<sum) flag=1;
}
if(flag==1)
{
printf("The coefficient matrix is not diagonally dominant\n");
printf("The Gauss-Jacobi method does not converge surely");
exit(0);
}
for(i=1;i<=n;i++) printf(" x[%d] ",i);printf("\n");
do
{
for(i=1;i<=n;i++)
{
sum=b[i];
for(j=1;j<=n;j++)
if(j!=i) sum-=a[i][j]*x[j];
xn[i]=sum/a[i][i];
}
for(i=1;i<=n;i++) printf("%8.5f ",xn[i]);printf("\n");
flag=0; /* indicates |x[i]-xn[i]|<epp for all i */
for(i=1;i<=n;i++) if(fabs(x[i]-xn[i])>epp) flag=1;
if(flag==1) for(i=1;i<=n;i++) x[i]=xn[i]; /* reset x[i] */
}
while(flag==1);
printf("Solution is \n");
for(i=1;i<=n;i++) printf("%8.5f ",xn[i]);
} /* main */
A sample of input/output:
Enter number of variables 3
Enter the coefficients rowwise
9 2 4
1 10 4
2 -4 10
Enter right hand vector
20 6 -15
x[1] x[2] x[3]
2.22222 0.60000 -1.50000
2.75556 0.97778 -1.70444
2.76247 1.00622 -1.66000
2.73640 0.98775 -1.65000
2.73606 0.98636 -1.65218
2.73733 0.98727 -1.65267
2.73735 0.98733 -1.65256
2.73729 0.98729 -1.65254
2.73729 0.98729 -1.65254
Solution is
2.73729 0.98729 -1.65254
xi(k+1) = (1/aii) [ bi − Σ_{j=1}^{i−1} aij xj(k+1) − Σ_{j=i+1}^{n} aij xj(k) ], i = 1, 2, . . . , n and k = 0, 1, 2, . . . .

The method is repeated until |xi(k+1) − xi(k)| < ε for all i = 1, 2, . . . , n, where ε > 0 is
any pre-assigned number called the error tolerance. This method is called Gauss-Seidal’s
iteration method.
Example. Solve the system of the previous example by Gauss-Seidal's iteration method.
27x + 6y − z = 54
6x + 15y + 2z = 72
x + y + 54z = 110
k x y z
0 − 0 0
1 2.00000 4.00000 1.92593
2 1.18244 4.07023 1.93977
3 1.16735 4.07442 1.93997
4 1.16642 4.07477 1.93998
5 1.16635 4.07480 1.93998
6 1.16634 4.07480 1.93998
Note 5.18.1 This solution is achieved in eleven iterations using Gauss-Jacobi’s method
while only six iterations are used in Gauss-Seidal’s method.
The sufficient condition for convergence of this method is that the coefficient
matrix is diagonally dominant. This is justified in the following.
Let A = max_i (1/|aii|) Σ_{j=1, j≠i}^{n} |aij| and let Ai = (1/|aii|) Σ_{j=1}^{i−1} |aij|, i = 1, 2, . . . , n.

Then
|εi(k+1)| ≤ (1/|aii|) [ Σ_{j<i} |aij| |εj(k+1)| + Σ_{j>i} |aij| |εj(k)| ]
          ≤ (1/|aii|) [ Σ_{j<i} |aij| ε(k+1) + Σ_{j>i} |aij| ε(k) ]
          ≤ Ai ε(k+1) + (A − Ai) ε(k).
That is,
ε(k+1) ≤ [(A − Ai)/(1 − Ai)] ε(k).
Since 0 ≤ Ai ≤ A < 1, we have (A − Ai)/(1 − Ai) ≤ A. Therefore the above relation reduces to
ε(k+1) ≤ A ε(k).
This shows that the rate of convergence of Gauss-Seidal's iteration is also linear. The
successive substitutions give
ε(k) ≤ A^k ε(0).
Now, if A < 1 then ε(k) → 0 as k → ∞, i.e., the sequence {x(k)} is sure to converge
when A < 1, i.e.,
Σ_{j=1, j≠i}^{n} |aij| < |aii| for all i.
In other words the sufficient condition for Gauss-Seidal’s iteration is that the coefficient
matrix is diagonally dominant. The absolute error at the (k + 1)th iteration is given by
ε(k+1) ≤ [A/(1 − A)] e(k) when A < 1,
as in previous section.
Note 5.18.2 Usually, the Gauss-Seidal's method converges more rapidly than the Gauss-
Jacobi's method. But this is not always true: there are some examples in which the
Gauss-Jacobi's method converges faster than the Gauss-Seidal's method.
Solution. It may be noted that the given system is not diagonally dominant, but
the rearranged system 3x + y + z = 3, x + 4y + z = 2, 2x + y + 5z = 5 is diagonally
dominant.
Then the Gauss-Seidal's iteration scheme is
x(k+1) = (1/3) (3 − y(k) − z(k))
y(k+1) = (1/4) (2 − x(k+1) − z(k))
z(k+1) = (1/5) (5 − 2x(k+1) − y(k+1)).
Let y = 0, z = 0 be the initial values. The successive iterations are shown below.
k x y z
0 − 0 0
1 1.00000 0.25000 0.55000
2 0.73333 0.17917 0.67083
3 0.71667 0.15313 0.68271
4 0.72139 0.14898 0.68165
5 0.72312 0.14881 0.68099
6 0.72340 0.14890 0.68086
7 0.72341 0.14893 0.68085
k x y z
0 − 0 0
1 2.00000 1.75000 0.31250
2 1.20833 1.39583 0.79688
3 1.00347 1.10243 1.10243
4 0.89757 0.92328 1.07042
5 0.97866 0.95946 1.02081
6 0.99964 0.98951 1.00280
7 1.00163 0.99901 0.99943
8 1.00071 1.00046 0.99953
9 1.00016 1.00028 0.99985
10 1.00001 1.00008 0.99998
Figure 5.1: Illustration of Gauss-Seidal's method for the convergent scheme (5.115).
Figure 5.2: Illustration of Gauss-Seidal's method for the divergent scheme (5.116).
Program 5.9
.
/*Program Gauss_Seidal
Solution of a system of linear equations by Gauss-Seidal's iteration
method. Assume that the coefficient matrix satisfies the condition
of convergence.*/
#include<stdio.h>
#include<math.h>
void main()
{
float a[10][10],b[10],x[10],xn[10],epp=0.00001,sum;
int i,j,n,flag;
printf("Enter number of variables ");
scanf("%d",&n);
printf("\nEnter the coefficients rowwise ");
for(i=1;i<=n;i++)
for(j=1;j<=n;j++) scanf("%f",&a[i][j]);
printf("\nEnter right hand vector ");
for(i=1;i<=n;i++) scanf("%f",&b[i]);
for(i=1;i<=n;i++) x[i]=0; /* initialize */
do
{
for(i=1;i<=n;i++)
{
sum=b[i];
for(j=1;j<=n;j++)
{
if(j<i)
sum-=a[i][j]*xn[j];
else if(j>i)
sum-=a[i][j]*x[j];
}
xn[i]=sum/a[i][i];
}
flag=0; /* indicates |x[i]-xn[i]|<epp for all i */
for(i=1;i<=n;i++) if(fabs(x[i]-xn[i])>epp) flag=1;
if(flag==1) for(i=1;i<=n;i++) x[i]=xn[i]; /* reset x[i] */
}
while(flag==1);
printf("Solution is \n");
for(i=1;i<=n;i++) printf("%8.5f ",xn[i]);
} /* main */
5.19 Relaxation Method

Consider the system of equations
Σ_{j=1}^{n} aij xj = bi, i = 1, 2, . . . , n. (5.117)
Let x(k) = (x1(k), x2(k), . . . , xn(k)) be the kth approximate solution. If ri(k) denotes the residual of the ith equation, then
ri(k) = bi − Σ_{j=1}^{n} aij xj(k). (5.118)
Now, the solution can be improved successively by reducing the largest residual to
zero at that iteration.
To achieve fast convergence of the method, the equations are rearranged in such
a way that the largest coefficients in the equations appear on the diagonal. Now, the
largest residual (in magnitude) is determined and let it be rp. Then the value of the
variable xp is increased by dxp, where dxp = rp/app.
In other words, xp is changed to xp + dxp to relax rp, i.e., to reduce rp to zero. Then
the new solution after this iteration becomes
x(k+1) = (x1(k), x2(k), . . . , xp−1(k), xp(k) + dxp, xp+1(k), . . . , xn(k)).
The method is repeated until all the residuals become zero or very small.
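The steps above can be sketched in code as follows (an illustrative implementation, not the book's own program; the function name and fixed 3 × 3 size are assumptions made here):

```c
#include <math.h>

/* Relaxation method: repeatedly reduce the residual of largest
   magnitude to zero until every residual is below tol.
   a is the n x n coefficient matrix (n <= 3 here), b the right-hand
   side; x holds the initial guess and receives the solution. */
void relaxation(int n, double a[][3], double b[], double x[], double tol) {
    for (;;) {
        double r[3], rmax = 0.0;
        int p = 0;
        for (int i = 0; i < n; i++) {        /* r_i = b_i - sum_j a_ij x_j */
            r[i] = b[i];
            for (int j = 0; j < n; j++) r[i] -= a[i][j] * x[j];
            if (fabs(r[i]) > rmax) { rmax = fabs(r[i]); p = i; }
        }
        if (rmax < tol) return;              /* all residuals relaxed */
        x[p] += r[p] / a[p][p];              /* dx_p = r_p / a_pp zeroes r_p */
    }
}
```

Running this on the system of Example 5.19.1 below (after rearrangement) reproduces the solution (1, 1, 1).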
Example 5.19.1 Solve the following system of equations
8x1 + x2 − x3 = 8, 2x1 + x2 + 9x3 = 12, x1 − 7x2 + 2x3 = −4
by relaxation method taking (0, 0, 0) as initial solution.
Solution. We rearrange the equations to get the largest coefficients in the diagonals
as
8x1 + x2 − x3 = 8
x1 − 7x2 + 2x3 = −4
2x1 + x2 + 9x3 = 12.
The residuals are
r1 = 8 − 8x1 − x2 + x3
r2 = −4 − x1 + 7x2 − 2x3
r3 = 12 − 2x1 − x2 − 9x3.
The method is started with the initial solution (0, 0, 0), i.e., x1 = x2 = x3 = 0. Then
the residuals are r1 = 8, r2 = −4, r3 = 12, of which the largest residual in magnitude
is r3. This indicates that the third equation has the most error and x3 needs to be
improved. The increment is dx3 = r3/a33 = 12/9 = 1.333. The new solution then becomes
(0, 0, 0 + 1.333), i.e., (0, 0, 1.333).
Similarly, at each step the residual of largest magnitude is found and relaxed to zero,
and the process is repeated until all the residuals become zero or very small.
The detailed calculations are shown in the following table.
residuals max increment solution
k r1 r2 r3 (r1 , r2 , r3 ) p dxp x1 x2 x3
0 – – – – – – 0 0 0
1 8 –4 12 12 3 1.333 0 0 1.333
2 9.333 –6.666 0.003 9.333 1 1.167 1.167 0 1.333
3 –0.003 –7.833 –2.331 –7.833 2 1.119 1.167 1.119 1.333
4 –1.122 0 –3.450 –3.450 3 –0.383 1.167 1.119 0.950
5 –1.505 0.766 –0.003 –1.505 1 –0.188 0.979 1.119 0.950
6 –0.001 0.954 0.373 0.954 2 –0.136 0.979 0.983 0.950
7 0.135 0.002 0.509 0.509 3 0.057 0.979 0.983 1.007
8 0.192 –0.112 –0.004 0.192 1 0.024 1.003 0.983 1.007
9 0 –0.136 –0.052 –0.136 2 0.019 1.003 1.002 1.007
10 –0.019 –0.003 –0.071 –0.071 3 –0.008 1.003 1.002 0.999
11 –0.027 0.013 0.001 –0.027 1 0.003 1.000 1.002 0.999
12 –0.003 0.016 0.007 0.016 2 –0.002 1.000 1.000 0.999
13 –0.001 0.002 0.009 0.009 3 0.001 1.000 1.000 1.000
14 0 0 0 0 – 0 1.000 1.000 1.000
At this stage all the residuals are zero and therefore the solution of the given system
of equations is x1 = 1.000, x2 = 1.000, x3 = 1.000, which is the exact solution of the
equations.
The ith equation of the system may be written as
Σ_{j=1}^{i−1} aij xj + Σ_{j=i}^{n} aij xj = bi. (5.119)
At the (k + 1)th iteration,
Σ_{j=1}^{i−1} aij xj(k+1) + Σ_{j=i}^{n} aij xj(k) = bi. (5.120)
The corresponding residual is
ri = bi − Σ_{j=1}^{i−1} aij xj(k+1) − Σ_{j=i}^{n} aij xj(k). (5.121)
Introducing a relaxation factor w, the iteration scheme becomes
aii xi(k+1) = aii xi(k) − w [ Σ_{j=1}^{i−1} aij xj(k+1) + Σ_{j=i}^{n} aij xj(k) − bi ], (5.123)
i = 1, 2, . . . , n; k = 0, 1, 2, . . . ,
where (x1(0), x2(0), . . . , xn(0))t is the initial solution. The method is repeated until the desired
accuracy is achieved.
This method is called the overrelaxation method when 1 < w < 2, and is called
the under relaxation method when 0 < w < 1. When w = 1, the method becomes
Gauss-Seidal’s method.
Example. Solve the following system of equations by the SOR method, taking w = 1.01:
3x1 + x2 + 2x3 = 6
−x1 + 4x2 + 2x3 = 5
2x1 + x2 + 4x3 = 7
The iteration scheme is
3x1(k+1) = 3x1(k) − 1.01 [3x1(k) + x2(k) + 2x3(k) − 6]
4x2(k+1) = 4x2(k) − 1.01 [−x1(k+1) + 4x2(k) + 2x3(k) − 5]
4x3(k+1) = 4x3(k) − 1.01 [2x1(k+1) + x2(k+1) + 4x3(k) − 7].
k x1 x2 x3
0 0 0 0
1 2.02000 1.77255 0.29983
2 1.20116 1.39665 0.80526
3 0.99557 1.09326 0.98064
4 0.98169 1.00422 1.00838
5 0.99312 0.99399 1.00491
6 0.99879 0.99728 1.00125
7 1.00009 0.99942 1.00009
8 1.00013 0.99999 0.99993
9 1.00005 1.00005 0.99997
Program 5.10 .
/* Program Gauss-Seidal SOR
Solution of a system of linear equations by Gauss-Seidal
successive overrelaxation (SOR) method. The relaxation factor w
lies between 1 and 2. Assume that the coefficient matrix
satisfies the condition of convergence. */
#include<stdio.h>
#include<math.h>
void main()
{
float a[10][10],b[10],x[10],xn[10],epp=0.00001,sum,w;
int i,j,n,flag;
printf("Enter number of variables ");
scanf("%d",&n);
printf("\nEnter the coefficients rowwise ");
for(i=1;i<=n;i++)
for(j=1;j<=n;j++) scanf("%f",&a[i][j]);
printf("\nEnter right hand vector ");
for(i=1;i<=n;i++) scanf("%f",&b[i]);
printf("Enter the relaxation factor w ");
scanf("%f",&w);
for(i=1;i<=n;i++) x[i]=0; /* initialize */
for(i=1;i<=n;i++) printf(" x[%d] ",i);printf("\n");
do
{
for(i=1;i<=n;i++)
{
sum=b[i]*w+a[i][i]*x[i];
for(j=1;j<=n;j++)
{
if(j<i)
sum-=a[i][j]*xn[j]*w;
else
sum-=a[i][j]*x[j]*w;
}
xn[i]=sum/a[i][i];
}
for(i=1;i<=n;i++) printf("%8.5f ",xn[i]);
printf("\n");
flag=0; /* indicates |x[i]-xn[i]|<epp for all i */
for(i=1;i<=n;i++) if(fabs(x[i]-xn[i])>epp) flag=1;
if(flag==1) for(i=1;i<=n;i++) x[i]=xn[i]; /* reset x[i] */
}
while(flag==1);
printf("Solution is \n");
for(i=1;i<=n;i++) printf("%8.5f ",xn[i]);
} /* main */
5.21 Comparison of Direct and Iterative Methods
Both the direct and the iterative methods have some advantages and some disadvantages,
and a choice between them is based on the given system of equations.
(i) The direct methods are applicable for all types of problems (when the coefficient
determinant is not zero) whereas iterative methods are useful only for particular
types of problems.
(ii) In direct methods the rounding errors may become large, particularly for ill-conditioned
systems, while in an iterative method the rounding error is small, since it is committed
only in the last iteration. Thus for ill-conditioned systems an iterative method is a good
choice.
(iii) The computational effort is large in a direct method (it is 2n^3/3
for the elimination method) and low in each iteration of an iterative method (2n^2 in Gauss-Jacobi's
and Gauss-Seidal's methods).
(iv) Most of the direct methods are applied on the coefficient matrix and for this
purpose the entire matrix is to be stored in the primary memory of the computer.
But the iterative methods are applied to a single equation at a time, and hence
only a single equation is to be stored at a time in primary memory. Thus iterative
methods are more efficient than direct methods with respect to space.
5.22 Exercise
9. Solve LY = B, UX = Y and verify that B = AX for B = (8, −4, 10, −4)t , where
A = LU is given by
[4 8 4  0]   [ 1    0    0   0] [4 8 4  0]
[1 5 4 -3]   [1/4   1    0   0] [0 3 3 -3]
[1 4 7  2] = [1/4  2/3   1   0] [0 0 4  4].
[1 3 0 -2]   [1/4  1/3 -1/2  1] [0 0 0  1]
2x1 − 2x2 = 1
−x1 + 2x2 − 3x3 = −2
−2x2 + 2x3 − 4x4 = −1
x3 − x4 = 3.
10x + 7y + 8z + 7w = 32
7x + 5y + 6z + 5w = 23
8x + 6y + 10z + 9w = 33
7x + 5y + 9z + 10w = 31.
1 1 1
2 3 5
x+y+z = 3
2x + 3y + 5z = 10.
x1 + 2x2 = 3
.
2x1 + 4x2 = 7
19. Find the solution of the following equations using least squares method:
x + y + 3z = 0
2x − y + 4z = 8
x + 5y + z = 10
x + y − 2z = 2.
20. Solve the following equations by (i) Gauss-Jordan’s and (ii) Gauss-Seidal’s meth-
ods, correct up to four significant figures:
9x + 2y + 4z = 20
x + 10y + 4z = 6
2x − 4y + 10z = −15.
21. Test if the following systems of equations are diagonally dominant and hence solve
them using Gauss-Jacobi’s and Gauss-Seidal’s methods.
(i) 10x + 15y + 3z = 14 (ii) x + 3y + 4z = 7
−30x + y + 5z = 17 3x + 2y + 5z = 10
x + y + 4z = 3 x − 5y + 7z = 3.
22. Solve the following simultaneous equations
2.5x1 + 5.2x2 = 6.2
1.251x1 + 2.605x2 = 3.152
by Gauss elimination method and get your answer to 4 significant figures. Improve
the solution by iterative refinement.
23. Consider the following linear system
5x1 + 3x2 = 6, 2x1 − x2 = 4.
Can either Gauss-Jacobi’s or Gauss-Seidal’s iteration be used to solve this linear
system? Why?
24. Consider the following tri-diagonal linear system and assume that the coefficient
matrix is diagonally dominant.
d1 x1 + c1 x2 = b1
a1 x1 + d2 x2 + c2 x3 = b2
a2 x2 + d3 x3 + c3 x4 = b3
.. .. .. ..
. . . .
an−2 xn−2 + dn−1 xn−1 + cn−1 xn = bn−1
an−1 xn−1 + dn xn = bn .
Write an iterative algorithm that will solve this system.
Eigenvalues and Eigenvectors of a Matrix
Let A be a square matrix of order n. If X is a non-null vector and λ is a scalar such that
AX = λX, (6.1)
then λ is called an eigenvalue of A and X a corresponding eigenvector. The eigenvalues
are the roots of the characteristic equation
|A − λI| = 0, (6.2)
that is,
| a11 − λ   a12       a13     · · ·  a1n     |
| a21       a22 − λ   a23     · · ·  a2n     |
| · · ·     · · ·     · · ·   · · ·  · · ·   | = 0.    (6.3)
| an1       an2       an3     · · ·  ann − λ |
If λi is an eigenvalue and Xi a corresponding eigenvector, then
AXi = λi Xi. (6.4)
The eigenvalues λi may be either distinct or repeated, and they may be real or complex.
If the matrix is real symmetric then all the eigenvalues are real. If the matrix is skew-
symmetric then the eigenvalues are either zero or purely imaginary. Sometimes, the set
of all eigenvalues, λi, of a matrix A is called the spectrum of A and the largest value
of |λi| is called the spectral radius of A.
Thus
x1 − x2 + x3 = 0
−x1 + x2 − x3 = 0
x1 − x2 + x3 = 0.
−2y1 − y2 + y3 = 0
−y1 − 2y2 − y3 = 0
y1 − y2 − 2y3 = 0.
The theorem is also true for columns, as the eigenvalues of A^T and A are the same.
where
−c1 = Σ_{i=1}^{n} aii = Tr A, the sum of all diagonal elements of A, called the trace,

c2 = Σ_{i<j} | aii aij |
             | aji ajj |, the sum of all principal minors of order two of A,

−c3 = Σ_{i<j<k} | aii aij aik |
                | aji ajj ajk |
                | aki akj akk |, the sum of all principal minors of order three of A,

and so on. Also,
S1 = λ1 + λ2 + · · · + λn = Tr A,
S2 = λ1^2 + λ2^2 + · · · + λn^2 = Tr A^2,     (6.8)
· · · · · ·
Sn = λ1^n + λ2^n + · · · + λn^n = Tr A^n.
Solution.
         [2 1 -1]
B1 = A = [0 2  3],   d1 = Tr B1 = 2 + 2 + 4 = 8
         [1 5  4]

                    [2 1 -1] [-6  1 -1]   [-13 -9  5]
B2 = A(B1 − d1 I) = [0 2  3] [ 0 -6  3] = [  3  3 -6]
                    [1 5  4] [ 1  5 -4]   [ -2 -9 -2]

d2 = (1/2) Tr B2 = (1/2)(−13 + 3 − 2) = −6

                    [2 1 -1] [-7 -9  5]   [-9  0  0]
B3 = A(B2 − d2 I) = [0 2  3] [ 3  9 -6] = [ 0 -9  0]
                    [1 5  4] [-2 -9  4]   [ 0  0 -9]

d3 = (1/3) Tr B3 = (1/3)(−9 − 9 − 9) = −9.

Thus c1 = −d1 = −8, c2 = −d2 = 6, c3 = −d3 = 9. Hence the characteristic polyno-
mial is λ^3 − 8λ^2 + 6λ + 9 = 0.
Note 6.2.1 Using this method one can compute the inverse of the matrix A. It may
be noted that Dn = Bn − dn I = 0. Since Bn = ADn−1, this gives ADn−1 = dn I, and hence
A−1 = Dn−1/dn = Dn−1/(−cn). (6.11)
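Relation (6.11) translates directly into code. The following sketch (illustrative only; the function name and the fixed size N = 3 are assumptions made here, and no singularity check is performed) computes the inverse via the Bk, dk, Dk sequence:

```c
#include <string.h>

#define N 3

/* Inverse of an N x N matrix by the Leverrier-Faddeev relation
   A^{-1} = D_{N-1}/d_N, valid when d_N = -c_N != 0. */
void faddeev_inverse(double a[N][N], double inv[N][N]) {
    double b[N][N], d[N][N];
    double dn = 0.0;
    memcpy(b, a, sizeof b);                    /* B1 = A */
    for (int step = 1; step <= N; step++) {
        dn = 0.0;                              /* d_k = trace(B_k)/k */
        for (int i = 0; i < N; i++) dn += b[i][i];
        dn /= step;
        if (step == N) break;                  /* d holds D_{N-1} now */
        for (int i = 0; i < N; i++)            /* D_k = B_k - d_k I */
            for (int j = 0; j < N; j++)
                d[i][j] = b[i][j] - (i == j ? dn : 0.0);
        for (int i = 0; i < N; i++)            /* B_{k+1} = A * D_k */
            for (int j = 0; j < N; j++) {
                double s = 0.0;
                for (int k = 0; k < N; k++) s += a[i][k] * d[k][j];
                b[i][j] = s;
            }
    }
    for (int i = 0; i < N; i++)                /* A^{-1} = D_{N-1}/d_N */
        for (int j = 0; j < N; j++)
            inv[i][j] = d[i][j] / dn;
}
```

For the matrix of the example above, d3 = −9 and D2 is already computed, so A−1 = D2/(−9).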
Algorithm Leverrier-Faddeev
Step 1: Read the matrix A = [aij ], i, j = 1, 2, . . . , n.
n
Step 2: Set B1 = A, d1 = aii .
i=1
Step 3: for i = 2, 3, . . . , n do
Compute
(a) Bi = A(Bi−1 − di−1 I)
(b) di = 1i (sum of the diagonal elements of Bi )
Step 4: Compute ci = −di for i = 1, 2, . . . , n.
//ci ’s are the coefficients of the polynomial.//
end Leverrier-Faddeev.
Program 6.1.
/* Program Leverrier-Faddeev
This program finds the characteristic polynomial of
a square matrix. From which we can determine all the
eigenvalues of the matrix. */
#include<stdio.h>
#include<math.h>
void main()
{
int n,i,j,k,l;
float a[10][10],b[10][10],c[10][10],d[11];
printf("Enter the size of the matrix ");
scanf("%d",&n);
printf("Enter the elements row wise ");
for(i=1;i<=n;i++) for(j=1;j<=n;j++) scanf("%f",&a[i][j]);
for(i=1;i<=n;i++) for(j=1;j<=n;j++) b[i][j]=a[i][j]; /* B1=A */
d[1]=0; for(i=1;i<=n;i++) d[1]+=b[i][i];
for(k=2;k<=n;k++)
{
for(i=1;i<=n;i++) b[i][i]-=d[k-1]; /* D(k-1)=B(k-1)-d(k-1)I */
for(i=1;i<=n;i++) for(j=1;j<=n;j++) /* B(k)=A*D(k-1) */
{
c[i][j]=0;
for(l=1;l<=n;l++) c[i][j]+=a[i][l]*b[l][j];
}
for(i=1;i<=n;i++) for(j=1;j<=n;j++) b[i][j]=c[i][j];
d[k]=0; for(i=1;i<=n;i++) d[k]+=b[i][i]; d[k]/=k;
}
printf("The coefficients of the characteristic polynomial are \n");
for(i=1;i<=n;i++) printf("%8.5f ",-d[i]);
} /* main */
Example. Find the eigenvalues of the following matrix using the Leverrier-Faddeev method.
Solution.
         [ 9 -1  9]
B1 = A = [ 3 -1  3],   d1 = Tr B1 = 9 − 1 − 7 = 1
         [-7  1 -7]

                 [ 9 -1  9]   [1 0 0]   [ 8 -1  9]
D1 = B1 − d1 I = [ 3 -1  3] − [0 1 0] = [ 3 -2  3]
                 [-7  1 -7]   [0 0 1]   [-7  1 -8]

           [ 9 -1  9] [ 8 -1  9]   [ 6  2  6]
B2 = AD1 = [ 3 -1  3] [ 3 -2  3] = [ 0  2  0]
           [-7  1 -7] [-7  1 -8]   [-4 -2 -4]

d2 = (1/2) Tr B2 = (1/2)(6 + 2 − 4) = 2

                 [ 4  2  6]
D2 = B2 − d2 I = [ 0  0  0]
                 [-4 -2 -6]

           [ 9 -1  9] [ 4  2  6]   [0 0 0]
B3 = AD2 = [ 3 -1  3] [ 0  0  0] = [0 0 0]
           [-7  1 -7] [-4 -2 -6]   [0 0 0]

d3 = (1/3) Tr B3 = (1/3)(0 + 0 + 0) = 0.

Thus c1 = −d1 = −1, c2 = −d2 = −2, c3 = −d3 = 0.
The characteristic equation is λ^3 − λ^2 − 2λ = 0.
The eigenvalues are λ1 = 0, λ2 = −1, λ3 = 2.

         [1]                [ 8]        [ 4]
Let e0 = [0], and then e1 = [ 3], e2 = [ 0].
         [0]                [-7]        [-4]
Several methods are available to determine the eigenvalues and eigenvectors of a matrix.
Here, two methods – Rutishauser and Power methods are introduced.
Example 6.3.1 Find all the eigenvalues of the matrix
    [ 4 2]
A = [-1 1]
using Rutishauser method.

Solution. Let
               [ 1   0] [u11 u12]
A = A1 = L1U1 = [l21  1] [ 0  u22].
That is,
[ 4 2]   [u11      u12          ]
[-1 1] = [u11 l21  u12 l21 + u22].
This gives u11 = 4, u12 = 2, l21 = −1/4, u22 = 3/2.
Therefore,
     [  1   0]        [4  2 ]
L1 = [-1/4  1],  U1 = [0 3/2].
Form
            [4  2 ] [  1   0]   [ 7/2  2 ]
A2 = U1L1 = [0 3/2] [-1/4  1] = [-3/8 3/2].
Again, let
            [ 1   0] [u11 u12]
A2 = L2U2 = [l21  1] [ 0  u22].
That is,
[ 7/2  2 ]   [u11      u12          ]
[-3/8 3/2] = [u11 l21  u12 l21 + u22].
X = c1X1 + c2X2 + · · · + cnXn. (6.13)
AX = c1λ1X1 + c2λ2X2 + · · · + cnλnXn
   = λ1 [c1X1 + c2(λ2/λ1)X2 + · · · + cn(λn/λ1)Xn]. (6.14)
A^2 X = λ1^2 [c1X1 + c2(λ2/λ1)^2 X2 + · · · + cn(λn/λ1)^2 Xn]. (6.15)
. . .
A^k X = λ1^k [c1X1 + c2(λ2/λ1)^k X2 + · · · + cn(λn/λ1)^k Xn]. (6.16)
A^{k+1} X = λ1^{k+1} [c1X1 + c2(λ2/λ1)^{k+1} X2 + · · · + cn(λn/λ1)^{k+1} Xn]. (6.17)
When k → ∞, the right hand sides of (6.16) and (6.17) tend to λ1^k c1X1 and
λ1^{k+1} c1X1, since |λi/λ1| < 1 for i = 2, . . . , n. Thus for k → ∞, A^k X ≈ λ1^k c1X1 and
A^{k+1} X ≈ λ1^{k+1} c1X1. That is, for k → ∞, λ1 A^k X = A^{k+1} X. It is well known that two
vectors are equal if their corresponding components are same. That is,
λ1 = lim_{k→∞} (A^{k+1} X)r / (A^k X)r, r = 1, 2, . . . , n. (6.18)
The symbol (A^k X)r denotes the rth component of the vector A^k X.
If |λ2| ≪ |λ1|, then the term within square brackets of (6.17) tends faster to c1X1, i.e.,
the rate of convergence is fast.
To reduce the round-off error, the method is carried out by normalizing (reducing the
largest element to unity) the eigenvector at each iteration. Let X0 be a non-null initial
(arbitrary) vector (non-orthogonal to X1) and we compute
Yi+1 = AXi
Xi+1 = Yi+1/λ(i+1), for i = 0, 1, 2, . . . , (6.19)
where λ(i+1) is the largest element in magnitude of Yi+1 and it is the (i + 1)th approx-
imate value of λ1. Then
λ1 = lim_{k→∞} (Yk+1)r / (Xk)r, r = 1, 2, . . . , n. (6.20)
Note 6.3.1 The initial vector X0 is usually chosen as X0 = (1, 1, · · · , 1)T. But, if the
initial vector X0 is poor, then the formula (6.20) does not give λ1, i.e., the limit of the
ratio (Yk+1)r/(Xk)r may not exist. If this situation occurs, then the initial vector must be
changed.
Note 6.3.2 The power method is also used to find the least eigenvalue of a matrix
A. If X is the eigenvector corresponding to the eigenvalue λ then AX = λX. If A
is non-singular then A−1 exists. Therefore, A−1(AX) = λA−1X, or A−1X = (1/λ)X.
This means that if λ is an eigenvalue of A then 1/λ is an eigenvalue of A−1, and the same
eigenvector X corresponds to the eigenvalue 1/λ of the matrix A−1. Thus, if λ is the least
(in magnitude) eigenvalue of A then 1/λ is the largest eigenvalue of A−1, and so the power
method applied to A−1 gives the least eigenvalue of A.
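This inverse power method can be sketched as follows for a 2 × 2 matrix, where the inverse is available explicitly (an illustrative sketch only; the function name and the initial vector are assumptions made here, and no singularity or deficiency checks are done):

```c
#include <math.h>

/* Least eigenvalue (in magnitude) of a 2 x 2 matrix, by applying the
   power method to A^{-1}; assumes det != 0 and that the initial
   vector has a component along the desired eigenvector. */
double least_eigenvalue_2x2(double a[2][2], int iters) {
    double det = a[0][0]*a[1][1] - a[0][1]*a[1][0];
    double inv[2][2] = {{ a[1][1]/det, -a[0][1]/det},
                        {-a[1][0]/det,  a[0][0]/det}};
    double x[2] = {1.0, 0.0}, lambda = 1.0;
    for (int k = 0; k < iters; k++) {
        double y[2] = { inv[0][0]*x[0] + inv[0][1]*x[1],
                        inv[1][0]*x[0] + inv[1][1]*x[1] };
        lambda = fabs(y[0]) > fabs(y[1]) ? y[0] : y[1]; /* largest of y */
        x[0] = y[0] / lambda; x[1] = y[1] / lambda;     /* normalize */
    }
    return 1.0 / lambda;  /* largest eigenvalue of A^{-1} is 1/(least of A) */
}
```

For example, the matrix [[4, 1], [2, 3]] has eigenvalues 5 and 2, so the sketch converges to 2.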
Note 6.3.3 We observe that the coefficient of Xj in (6.16) goes to zero in proportion
to (λj/λ1)^k and that the speed of convergence is governed by the term (λ2/λ1)^k. Con-
sequently, the rate of convergence is linear.
Example 6.3.2 Find the largest eigenvalue in magnitude and the corresponding eigen-
vector of the matrix
    [ 1 3 2]
A = [-1 0 2].
    [ 3 4 5]
Y5 = (3.08691, 1.56950, 7.16672)T,  X5 = (0.43050, 0.21880, 1)T,  λ(5) = 7.16691.
Y6 = (3.08691, 1.56950, 7.16672)T,  X6 = (0.43073, 0.21900, 1)T,  λ(6) = 7.16672.
Y7 = (3.08772, 1.56927, 7.16818)T,  X7 = (0.43075, 0.21892, 1.0)T, λ(7) = 7.16818.
Y8 = (3.08752, 1.56925, 7.16795)T,  X8 = (0.43074, 0.21893, 1.0)T, λ(8) = 7.16795.
Y9 = (3.08752, 1.56926, 7.16792)T,  X9 = (0.43074, 0.21893, 1.0)T, λ(9) = 7.16792.
Y10 = (3.08753, 1.56926, 7.16794)T, X10 = (0.43074, 0.21893, 1.0)T, λ(10) = 7.16794.
The required largest eigenvalue is 7.1679 correct up to four decimal places and the
corresponding eigenvector is (0.43074, 0.21893, 1.00000)T.
Algorithm 6.2 (Power method). This method determines the largest eigenvalue
(in magnitude) and its corresponding eigenvector of a square matrix A.
Program 6.2.
/* Program Power Method
This program finds the largest eigenvalue (in magnitude)
of a square matrix.*/
#include<stdio.h>
#include<math.h>
void main()
{
int n,i,j,flag;
float a[10][10],x0[10],x1[10],y[10],lambda,eps=1e-5;
printf("Enter the size of the matrix ");
scanf("%d",&n);
printf("Enter the elements row wise ");
for(i=1;i<=n;i++) for(j=1;j<=n;j++) scanf("%f",&a[i][j]);
printf("The given matrix is\n");
for(i=1;i<=n;i++){ /* printing of A */
for(j=1;j<=n;j++) printf("%f ",a[i][j]); printf("\n");
}
printf("\n");
for(i=1;i<=n;i++) {
x0[i]=1; x1[i]=1; /* initialization */
}
do
{
flag=0;
for(i=1;i<=n;i++) x0[i]=x1[i]; /* reset x0 */
for(i=1;i<=n;i++) /* product of A and X0 */
{
y[i]=0;
for(j=1;j<=n;j++) y[i]+=a[i][j]*x0[j];
}
lambda=y[1]; /* finds the element of largest magnitude among y[i] */
for(i=2;i<=n;i++) if(fabs(lambda)<fabs(y[i])) lambda=y[i];
for(i=1;i<=n;i++) x1[i]=y[i]/lambda;
for(i=1;i<=n;i++) if(fabs(x0[i]-x1[i])>eps) flag=1;
}while(flag==1);
printf("The largest eigenvalue is %8.5f \n",lambda);
printf("The corresponding eigenvector is \n");
for(i=1;i<=n;i++) printf("%8.5f ",x1[i]);
}/* main */
A sample of input/output:
Enter the size of the matrix 5
Enter the elements row wise
3 4 5 6 7
0 0 2 1 3
3 4 5 -2 3
3 4 -2 3 4
0 1 2 0 0
The given matrix is
3.000000 4.000000 5.000000 6.000000 7.000000
0.000000 0.000000 2.000000 1.000000 3.000000
3.000000 4.000000 5.000000 -2.000000 3.000000
3.000000 4.000000 -2.000000 3.000000 4.000000
0.000000 1.000000 2.000000 0.000000 0.000000
The largest eigenvalue is 10.41317
The corresponding eigenvector is
1.00000 0.20028 0.62435 0.41939 0.13915
The methods discussed earlier are also applicable for symmetric matrices, but, due to
the special properties of symmetric matrices some efficient methods are developed here.
Among them three commonly used methods, viz., Jacobi, Givens and Householder are
discussed here.
In linear algebra it is established that all the eigenvalues of a real symmetric matrix
are real.
all other off-diagonal elements are zero and all other diagonal elements are unity. Thus
S1 is of the form
     [1 0 · · ·   0     · · ·   0      · · · 0]
     [0 1 · · ·   0     · · ·   0      · · · 0]
     [· · ·       ·             ·            ·]
S1 = [0 0 · · · cos θ   · · · −sin θ   · · · 0]    (6.22)
     [· · ·       ·             ·            ·]
     [0 0 · · · sin θ   · · ·  cos θ   · · · 0]
     [· · ·       ·             ·            ·]
     [0 0 · · ·   0     · · ·   0      · · · 1]
where cos θ, −sin θ, sin θ and cos θ are at the positions (i, i), (i, j), (j, i) and (j, j) re-
spectively.
Let
     [aii aij]
A1 = [aji ajj]
be the sub-matrix of A formed by the elements aii, aij, aji and ajj.
To reduce A1 to a diagonal matrix, an orthogonal transformation is applied, defined by
     [cos θ  −sin θ]
S1 = [sin θ   cos θ],
where θ is an unknown quantity, to be selected in such a way that A1 becomes diagonal.
Now,
S1−1 A1 S1
  [ cos θ  sin θ] [aii aij] [cos θ  −sin θ]
= [−sin θ  cos θ] [aji ajj] [sin θ   cos θ]
  [aii cos^2 θ + aij sin 2θ + ajj sin^2 θ     (ajj − aii) sin θ cos θ + aij cos 2θ    ]
= [(ajj − aii) sin θ cos θ + aij cos 2θ       aii sin^2 θ − aij sin 2θ + ajj cos^2 θ  ].
This matrix becomes a diagonal matrix if (ajj − aii) sin θ cos θ + aij cos 2θ = 0,
that is, if
tan 2θ = 2aij/(aii − ajj). (6.23)
This expression gives four values of θ, but, to get the smallest rotation, θ should lie in
−π/4 ≤ θ ≤ π/4. The equation (6.23) is valid if aii ≠ ajj. If aii = ajj then
θ = π/4 if aij > 0, and θ = −π/4 if aij < 0. (6.25)
−1
Thus the off-diagonal elements sij and sji of S1 A1 S1 vanish and the diagonal
elements are modified. The first diagonal matrix is obtained by computing D1 =
S−1
1 A1 S1 . In the next step largest off-diagonal (in magnitude) element is selected
from the matrix D1 and the above process is repeated to generate another orthogonal
matrix S2 to compute D2 . That is,
D2 = S−1 −1 −1
2 D1 S2 = S2 (S1 AS1 )S2 = (S1 S2 )
−1
A(S1 S2 ).
In general,
Dk = Sk−1 Sk−1−1 · · · S1−1 A S1 S2 · · · Sk−1 Sk
   = (S1S2 · · · Sk)−1 A (S1S2 · · · Sk)
   = S−1 A S (6.26)
where S = S1 S2 · · · Sk .
As k → ∞, Dk tends to a diagonal matrix. The diagonal elements of Dk are the
eigenvalues and the columns of S are the corresponding eigenvectors.
The method has a drawback: the elements that are reduced to zero during
diagonalisation may not necessarily remain zero during subsequent rotations. The value
of θ must also be verified for its accuracy by checking whether |sin^2 θ + cos^2 θ − 1| is
sufficiently small.
Note 6.4.2 It can be shown that the minimum number of rotations required to trans-
form a real symmetric matrix into a diagonal matrix is n(n − 1)/2.
Example 6.4.1 Find the eigenvalues and eigenvectors of the symmetric matrix
    [1 2 2]
A = [2 1 2]
    [2 2 1]
using Jacobi's method.
Solution. The largest off-diagonal element is 2, at the (1, 2), (1, 3) and (2, 3) positions.
The rotational angle θ is given by tan 2θ = 2a12/(a11 − a22) = 4/0 = ∞, i.e., θ = π/4.
Thus the orthogonal matrix S1 is
     [cos π/4  −sin π/4  0]   [1/√2  −1/√2  0]
S1 = [sin π/4   cos π/4  0] = [1/√2   1/√2  0].
     [0         0        1]   [0      0     1]
Then the first rotation yields
                [ 1/√2  1/√2  0] [1 2 2] [1/√2  −1/√2  0]
D1 = S1−1 AS1 = [−1/√2  1/√2  0] [2 1 2] [1/√2   1/√2  0]
                [ 0     0     1] [2 2 1] [0      0     1]
    [3     0  4/√2]   [3        0  2.82843]
  = [0    −1  0   ] = [0       −1  0      ].
    [4/√2  0  1   ]   [2.82843  0  1      ]
The largest off-diagonal element of D1 is now 2.82843, situated at the (1, 3) position, and
hence the rotational angle is
θ = (1/2) tan−1 [2a13/(a11 − a33)] = 0.61548.
The second orthogonal matrix S2 is
     [cos θ  0  −sin θ]   [0.81650  0  −0.57735]
S2 = [0      1   0    ] = [0        1   0      ].
     [sin θ  0   cos θ]   [0.57735  0   0.81650]
Then the second rotation gives
D2 = S2−1 D1 S2
   [ 0.81650  0  0.57735] [3        0  2.82843] [0.81650  0  −0.57735]
 = [ 0        1  0      ] [0       −1  0      ] [0        1   0      ]
   [−0.57735  0  0.81650] [2.82843  0  1      ] [0.57735  0   0.81650]
   [5  0  0]
 = [0 −1  0].
   [0  0 −1]
Thus D2 becomes a diagonal matrix and hence the eigenvalues are 5, −1, −1.
The eigenvectors are the columns of S, where
           [1/√2  −1/√2  0] [0.81650  0  −0.57735]
S = S1S2 = [1/√2   1/√2  0] [0        1   0      ]
           [0      0     1] [0.57735  0   0.81650]
    [0.57735  −0.70711  −0.40825]
  = [0.57735   0.70711  −0.40825].
    [0.57735   0         0.81650]
Hence the eigenvalues are 5, −1, −1 and the corresponding eigenvectors are
(0.57735, 0.57735, 0.57735)T , (−0.70711, −0.70711, 0)T ,
(−0.40825, −0.40825, 0.81650)T respectively.
Note that the eigenvectors are normalized and two independent eigenvectors (last two
vectors) for the eigenvalue −1 are obtained by this method.
In this problem, two rotations are used to convert A into a diagonal matrix. But,
this does not happen in general. The following example shows that at least six rotations
are needed to diagonalise a symmetric matrix.
Example 6.4.2 Find all the eigenvalues and eigenvectors of the matrix
[2 3 1]
[3 2 2]
[1 2 1]
by Jacobi's method.

Solution. Let
    [2 3 1]
A = [3 2 2].
    [1 2 1]
It is a real symmetric matrix and the Jacobi's method is applicable.
The largest off-diagonal element is a12 = a21 = 3.
Then tan 2θ = 2a12/(a11 − a22) = 6/0 = ∞, and this gives θ = π/4.
Thus the orthogonal matrix S1 is
     [cos π/4  −sin π/4  0]   [1/√2  −1/√2  0]
S1 = [sin π/4   cos π/4  0] = [1/√2   1/√2  0].
     [0         0        1]   [0      0     1]
The first rotation gives
                [ 1/√2  1/√2  0] [2 3 1] [1/√2  −1/√2  0]
D1 = S1−1 AS1 = [−1/√2  1/√2  0] [3 2 2] [1/√2   1/√2  0]
                [ 0     0     1] [1 2 1] [0      0     1]
    [5     0     3/√2]
  = [0    −1     1/√2].
    [3/√2  1/√2  1   ]
The largest off-diagonal element of D1 is 3/√2, situated at the (1, 3) position. Then
tan 2θ = 2a13/(a11 − a33) = (6/√2)/(5 − 1) = 1.06066, or θ = (1/2) tan−1(1.06066) = 0.40741.
So the next orthogonal matrix S2 is
     [0.91815  0  −0.39624]
S2 = [0        1   0      ].
     [0.39624  0   0.91815]
D2 = S2−1 D1 S2
   [ 0.91815  0  0.39624] [5        0        2.12132] [0.91815  0  −0.39624]
 = [ 0        1  0      ] [0       −1        0.70711] [0        1   0      ]
   [−0.39624  0  0.91815] [2.12132  0.70711  1      ] [0.39624  0   0.91815]
   [5.91548  0.28018  0      ]
 = [0.28018 −1.0      0.64923].
   [0        0.64923  0.08452]
The largest off-diagonal element of D2 is 0.64923, present at the (2, 3) position. Then
tan 2θ = 2a23/(a22 − a33) = −1.19727, or θ = (1/2) tan−1(−1.19727) = −0.43747.
Therefore,
     [1  0        0      ]
S3 = [0  0.90583  0.42365].
     [0 −0.42365  0.90583]
                  [5.91548  0.25379  0.11870]
D3 = S3−1 D2 S3 = [0.25379 −1.30364  0      ].
                  [0.11870  0        0.38816]
Again, the largest off-diagonal element is 0.25379, located at the (1, 2) position.
Therefore, θ = (1/2) tan−1 [2a12/(a11 − a22)] = 0.03510.
     [0.99938  −0.03509  0]
S4 = [0.03509   0.99938  0].
     [0         0        1]
                  [5.92439  0        0.11863]
D4 = S4−1 D3 S4 = [0       −1.31255 −0.00417].
                  [0.11863 −0.00417  0.38816]
That is, the eigenvectors corresponding to the eigenvalues 5.9269, −1.3126 and 0.3856
are respectively (0.61825, 0.67629, 0.40006)T ,
(−0.54567, 0.73604, −0.40061)T and (−0.56540, 0.02948, 0.82430)T .
Note 6.4.3 This example shows that the elements which were annihilated by a rotation
may not remain zero during the next rotations.
Algorithm 6.3 (Jacobi’s method). This method determines the eigenvalues and
eigenvectors of a real symmetric matrix A, by converting A into a diagonal matrix
by similarity transformation.
Algorithm Jacobi
Step 1. Read the symmetric matrix A = [aij ], i, j = 1, 2, . . . , n.
Step 2. Initialize D = A and S = I, a unit matrix.
Step 3. Find the largest off-diagonal element (in magnitude) from D = [dij ] and
let it be dij .
Step 4. //Find the rotational angle θ.//
If dii = djj then
if dij > 0 then θ = π/4 else θ = −π/4 endif;
else
θ = (1/2) tan−1 [2dij/(dii − djj)];
endif;
Step 5. //Compute the matrix S1 = [spq ]//
Set spq = 0 for all p, q = 1, 2, . . . , n
skk = 1, k = 1, 2, . . . , n
and sii = sjj = cos θ, sij = − sin θ, sji = sin θ.
Step 6. Find D = ST 1 ∗ D ∗ S1 and S = S ∗ S1 ;
Step 7. Repeat steps 3 to 6 until D becomes diagonal.
Step 8. Diagonal elements of D are the eigenvalues and the columns of S are
the corresponding eigenvectors.
end Jacobi
Program 6.3.
/* Program Jacobi’s Method to find eigenvalues
This program finds all the eigenvalues and the corresponding
eigenvectors of a real symmetric matrix. Assume that the
given matrix is real symmetric. */
#include<stdio.h>
#include<math.h>
void main()
{
int n,i,j,p,q,flag;
float a[10][10],d[10][10],s[10][10],s1[10][10],s1t[10][10];
float temp[10][10],theta,zero=1e-4,max,pi=3.141592654;
printf("Enter the size of the matrix ");
scanf("%d",&n);
printf("Enter the elements row wise ");
for(i=1;i<=n;i++) for(j=1;j<=n;j++) scanf("%f",&a[i][j]);
printf("The given matrix is\n");
for(i=1;i<=n;i++) /* printing of A */
{
for(j=1;j<=n;j++) printf("%8.5f ",a[i][j]); printf("\n");
}
printf("\n");
/* initialization of D and S */
for(i=1;i<=n;i++) for(j=1;j<=n;j++){
d[i][j]=a[i][j]; s[i][j]=0;
}
for(i=1;i<=n;i++) s[i][i]=1;
do
{
flag=0;
/* find largest off-diagonal element */
i=1; j=2; max=fabs(d[1][2]);
for(p=1;p<=n;p++) for(q=1;q<=n;q++)
{ if(p!=q) /* off-diagonal element */
if(max<fabs(d[p][q])){
max=fabs(d[p][q]); i=p; j=q;
}
}
if(d[i][i]==d[j][j]){
if(d[i][j]>0) theta=pi/4; else theta=-pi/4;
}
else
{
theta=0.5*atan(2*d[i][j]/(d[i][i]-d[j][j]));
}
/* construction of the matrix S1 and S1T */
for(p=1;p<=n;p++) for(q=1;q<=n;q++)
{s1[p][q]=0; s1t[p][q]=0;}
for(p=1;p<=n;p++) {s1[p][p]=1; s1t[p][p]=1;}
s1[i][i]=cos(theta); s1[j][j]=s1[i][i];
s1[j][i]=sin(theta); s1[i][j]=-s1[j][i];
s1t[i][i]=s1[i][i]; s1t[j][j]=s1[j][j];
s1t[i][j]=s1[j][i]; s1t[j][i]=s1[i][j];
/* product of S1T and D */
for(i=1;i<=n;i++)
for(j=1;j<=n;j++){
temp[i][j]=0;
for(p=1;p<=n;p++) temp[i][j]+=s1t[i][p]*d[p][j];
}
/* product of temp and S1 i.e., D=S1T*D*S1 */
for(i=1;i<=n;i++)
for(j=1;j<=n;j++){
d[i][j]=0;
for(p=1;p<=n;p++) d[i][j]+=temp[i][p]*s1[p][j];
}
/* product of S and S1 i.e., S=S*S1 */
for(i=1;i<=n;i++)
for(j=1;j<=n;j++)
{
temp[i][j]=0;
for(p=1;p<=n;p++) temp[i][j]+=s[i][p]*s1[p][j];
}
for(i=1;i<=n;i++) for(j=1;j<=n;j++) s[i][j]=temp[i][j];
for(i=1;i<=n;i++) for(j=1;j<=n;j++) /* is D diagonal ? */
{
if(i!=j) if(fabs(d[i][j])>zero) flag=1;
}
}while(flag==1);
printf("The eigenvalues are\n");
for(i=1;i<=n;i++) printf("%8.5f ",d[i][i]);
printf("\nThe corresponding eigenvectors are \n");
for(j=1;j<=n;j++){
printf("(");
for(i=1;i<n;i++) printf("%8.5f,",s[i][j]);
printf("%8.5f)\n",s[n][j]);
}
}/* main */
A sample of input/output:
Enter the size of the matrix 4
Enter the elements row wise
1 2 3 4
2 -3 3 4
3 3 4 5
4 4 5 0
The given matrix is
1.000000 2.000000 3.000000 4.000000
2.000000 -3.000000 3.000000 4.000000
3.000000 3.000000 4.000000 5.000000
4.000000 4.000000 5.000000 0.000000
The eigenvalues are
-0.73369 -5.88321 11.78254 -3.16564
Let
    [a1 b2 0  0  · · · 0  0 ]
    [b2 a2 b3 0  · · · 0  0 ]
A = [0  b3 a3 b4 · · · 0  0 ].
    [· · ·            · · · ]
    [0  0  0  0  · · · bn an]
Expanding by minors, the sequence {pk(λ)} satisfies the following equations.
p0(λ) = 1
p1(λ) = a1 − λ
pk+1(λ) = (ak+1 − λ)pk(λ) − b²k+1 pk−1(λ), k = 1, 2, . . . , n − 1. (6.27)
Example 6.4.3 Find the eigenvalues of the following tri-diagonal matrix
3 1 0
1 2 1 .
0 1 1
Then apply the rotation in the (2,4)-plane to convert a14 and a41 to zero. This would
not affect zeros that have been created earlier. Successive rotations are carried out in
the planes (2, 3), (2, 4), . . . , (2, n) where θ’s are so chosen that the new elements at
the positions (1, 3), (1, 4), . . . , (1, n) vanish. After (n − 2) such rotations, all elements
of first row and column (except first two) become zero. Then the transformed matrix
Bn−2 after (n − 2) rotations reduces to the following form:
a11 a12 0 0 · · · 0
a21 a22 a23 a24 · · · a2n
Bn−2 = 0 a32 a33 a34 · · · a3n .
0 · · · · · · · · · · · · · · ·
0 an2 an3 an4 · · · ann
The second row of Bn−2 is treated in the same way as the first row. The rotations
are made in the planes (3, 4), (3, 5), . . . , (3, n). Thus, after (n − 2) + (n − 3) + · · · + 1
= (n − 1)(n − 2)/2 rotations the matrix A becomes a tri-diagonal matrix B of the form
a1 b2 0 0 · · · 0 0
b2 a2 b3 0 · · · 0 0
B = 0 b3 a3 b4 · · · 0 0 .
· · · · · · · · · · · · · · · · · · · · ·
0 0 0 0 · · · bn an
In this process, the previously created zeros are not affected by successive rotations.
The eigenvalues of B and A are the same as they are similar matrices.
Example 6.4.4 Find the eigenvalues of the symmetric matrix
2 3 −1
A = 3 1 2 using Givens method.
−1 2 −1
where tan θ = a13/a12 = −1/3, i.e., θ = −0.32175.
Therefore,
1 0 0
S1 = 0 0.94868 0.31623 .
0 −0.31623 0.94868
p0(λ) = 1
p1(λ) = 2 − λ
p2(λ) = (−0.4 − λ)p1(λ) − (3.1623)² p0(λ) = λ² − 1.6λ − 10.8
p3(λ) = (0.4 − λ)p2(λ) − (2.2)² p1(λ) = −λ³ + 2λ² + 15λ − 14.
From this table, it is observed that the eigenvalues are located in the intervals
(−4, −3), (0, 1) and (4, 5). Any iterative method may be used to find them. Using
Newton-Raphson method, we find the eigenvalues −3.47531, 0.87584 and 4.59947 of
B. These are also the eigenvalues of A.
This method is applicable to a real symmetric matrix of order n × n. It is more economical
and efficient than the Givens method. Here also a sequence of orthogonal (similarity)
transformations is used on A to get a tri-diagonal matrix. Each transformation produces
a complete row of zeros in appropriate positions, without affecting the previous rows.
S = I − 2VVT, (6.29)
where V is a column vector with VTV = 1. (6.30)
Then
ST = (I − 2VVT)T = I − 2VVT = S
and
ST S = (I − 2VVT )(I − 2VVT )
= I − 4VVT + 4VVT VVT
= I − 4VVT + 4VVT = I. [using (6.30)]. (6.31)
Thus
S−1 AS = ST AS = SAS, (6.32)
Ar = Sr Ar−1 Sr , r = 2, 3, . . . , n − 1, (6.33)
Now,
S2 = I − 2VVT =
1 0 0 0
0 1 − 2s2² −2s2s3 −2s2s4
0 −2s2s3 1 − 2s3² −2s3s4 . (6.35)
0 −2s2s4 −2s3s4 1 − 2s4²
The first rows of A1 and S2A1 are the same. The elements in the first row of A2 =
S2A1S2 are given by
a′11 = a11
a′12 = (1 − 2s2²)a12 − 2s2s3a13 − 2s2s4a14 = a12 − 2s2p1
a′13 = −2s2s3a12 + (1 − 2s3²)a13 − 2s3s4a14 = a13 − 2s3p1
and a′14 = −2s2s4a12 − 2s3s4a13 + (1 − 2s4²)a14 = a14 − 2s4p1,
where p1 = s2a12 + s3a13 + s4a14.
Thus from (6.39), (6.37) and (6.38) the values of s2, s3 and s4 are obtained as
s2² = (1/2)(1 ∓ a12/q), s3 = ∓a13/(2s2q), s4 = ∓a14/(2s2q). (6.41)
It is noticed that the values of s3 and s4 depend on s2, so better accuracy can
be achieved if s2 becomes large. This can be done by taking a suitable sign in (6.41).
Choosing
s2² = (1/2)[1 + a12 × sign(a12)/q]. (6.42)
The sign of the square root is irrelevant and the positive sign is taken. Hence
s3 = a13 × sign(a12)/(2qs2), s4 = a14 × sign(a12)/(2qs2).
Thus first transformation generates two zeros in the first row and first column. The
second transformation is required to create zeros at the positions (2, 4) and (4, 2).
In the second transformation, let V3 = (0, 0, s3, s4)T and the matrix
1 0 0 0
S3 = 0 1 0 0 . (6.43)
0 0 1 − 2s3² −2s3s4
0 0 −2s3s4 1 − 2s4²
The values of s3 and s4 are to be computed using the previous technique. The new
matrix A3 = S3A2S3 is obtained. The zeros in the first row and first column remain
unchanged while computing A3. Thus, A3 attains tri-diagonal form in this case.
The application of this method to a general n × n matrix is obvious. The elements of
the vector Vk = (0, · · · , 0, sr, sr+1, · · · , sn)T, r = k + 1, at the kth transformation are
given by
sr² = (1/2)[1 + akr × sign(akr)/q], where q = √(Σ_{i=k+1}^{n} aki²),
si = aki × sign(akr)/(2qsr), i = r + 1, . . . , n.
Since the tri-diagonal matrix An−1 is similar to the original matrix A, they have
identical eigenvalues. The eigenvalues of An−1 are computed in the same way as in the
Givens method. Once the eigenvalues become available, the eigenvectors are obtained
by solving the homogeneous system of equations (A − λI)X = 0.
Example 6.4.5 Use the Householder method to reduce the matrix
2 −1 −1 1
A = −1 4 1 −1 into the tri-diagonal form.
−1 1 3 −1
1 −1 −1 2
S2 = I − 2V2V2T
1 0 0 0
= 0 −0.57735 −0.57735 0.57735 .
0 −0.57735 0.78867 0.21133
0 0.57735 0.21133 0.78867

2 1.73204 0 0
A2 = S2A1S2 = 1.73204 5.0 0.21132 −0.78867 .
0 0.21132 2.28867 −0.5
0 −0.78867 −0.5 1.71133
Second transformation. Let V3 = (0, 0, s3, s4)T and q = √(a23² + a24²) = 0.81649. Then
s3² = (1/2)[1 + a23 × sign(a23)/q] = 0.62941, s3 = 0.79335,
s4 = a24 × sign(a23)/(2qs3) = −0.60876.
Thus V3 = (0, 0, 0.79335, −0.60876)T.
1 0 0 0
S3 = I − 2V3V3T = 0 1 0 0 .
0 0 −0.25881 0.96592
0 0 0.96592 0.25882

2 1.73204 0 0
A3 = S3A2S3 = 1.73204 5.0 −0.81648 0 .
0 −0.81648 2 −0.57731
0 0 −0.57731 2
Algorithm Householder
Step 1. Read the symmetric matrix A = [aij ], i, j = 1, 2, . . . , n.
Step 2. Set k = 1, r = 2.
Step 3. //Compute the vector V = (v1, v2, · · · , vn)T//
Step 3.1. Compute q = √(Σ_{i=k+1}^{n} aki²).
Step 3.2. Set vi = 0 for i = 1, 2, . . . , r − 1.
Step 3.3. Compute vr² = (1/2)[1 + akr × sign(akr)/q].
Step 3.4. Compute vi = aki × sign(akr)/(2qvr) for i = r + 1, . . . , n.
Step 4. Compute the transformation matrix S = I − 2V ∗ VT .
Step 5. Compute A = S ∗ A ∗ S.
Step 6. Set k = k + 1, r = r + 1.
Step 7. Repeat steps 3 to 6 while k ≤ n − 2.
end Householder
Program 6.4.
/* Program Householder method
This program reduces the given real symmetric matrix
into a real symmetric tri-diagonal matrix. Assume that
the given matrix is real symmetric. */
#include<stdio.h>
#include<math.h>
void main()
{
int n,i,j,r=2,k,l,sign;
float a[10][10],v[10],s[10][10],temp[10][10],q;
printf("Enter the size of the matrix ");
scanf("%d",&n);
printf("Enter the elements row wise ");
for(i=1;i<=n;i++) for(j=1;j<=n;j++) scanf("%f",&a[i][j]);
printf("The given matrix is\n");
for(i=1;i<=n;i++) /* printing of A */
{
for(j=1;j<=n;j++) printf("%8.5f ",a[i][j]); printf("\n");
}
for(k=1;k<=n-2;k++)
{
q=0;
for(i=k+1;i<=n;i++) q+=a[k][i]*a[k][i];
q=sqrt(q);
for(i=1;i<=r-1;i++) v[i]=0;
sign=1; if(a[k][r]<0) sign=-1;
v[r]=sqrt(0.5*(1+a[k][r]*sign/q));
for(i=r+1;i<=n;i++) v[i]=a[k][i]*sign/(2*q*v[r]);
/* construction of S */
for(i=1;i<=n;i++) for(j=1;j<=n;j++) s[i][j]=-2*v[i]*v[j];
for(i=1;i<=n;i++) s[i][i]=1+s[i][i];
for(i=1;i<=n;i++) for(j=1;j<=n;j++)
{
temp[i][j]=0;
for(l=1;l<=n;l++) temp[i][j]+=s[i][l]*a[l][j];
}
for(i=1;i<=n;i++) for(j=1;j<=n;j++)
{
a[i][j]=0;
for(l=1;l<=n;l++) a[i][j]+=temp[i][l]*s[l][j];
}
r++;
} /* end of loop k */
printf("The reduced symmetric tri-diagonal matrix is\n");
for(i=1;i<=n;i++)
{
for(j=1;j<=n;j++) printf("%8.5f ",a[i][j]);
printf("\n");
}
}/* main */
A sample of input/output:
6.5 Exercise
3. Use the Leverrier-Faddeev method to find the characteristic equations of the following matrices.
(a)
2 5 7
6 3 4
5 −2 −3
(b)
2 −1 3 −4
3 −2 4 1
5 −3 −2 2
3 −3 −1 1
(c)
1 1 −1 1
1 1 −1 −1
1 −1 1 −1
1 −1 −1 1
4. Use the Leverrier-Faddeev method to find the eigenvalues and eigenvectors of the matrices
(a)
2 3
−1 2
(b)
5 6 −3
−1 0 1
1 2 −1
(c)
1 2 −3
3 −1 2
1 0 −1
(d)
1 −2 1 −2
2 −1 2 −1
1 1 −2 −2
−2 −2 1 1
6. Use power method to find the largest and the least (in magnitude) eigenvalues of the following matrices.
(a)
1 2
2 3
(b)
4 1 0
1 2 1
0 1 1
(c)
2 −1 2
5 −3 3
−1 0 −2
(d)
3 1 0
1 2 2
0 1 1
8. Use Givens method to find the eigenvalues of the following symmetric matrices.
(a)
3 2 1
2 3 2
1 2 3
(b)
2 2 6
2 5 4
6 4 1
(c)
1 1 −1
1 2 3
−1 3 1
9. Use Householder method to convert the above matrices to tri-diagonal form.
7.1 Differentiation
E′(xi) = w′(xi) f^(n+1)(ξi)/(n + 1)!, (7.2)
where min{x, x0 , . . . , xn } < ξi < max{x, x0 , . . . , xn }. The error can also be expressed
in terms of divided difference.
Let E(x) = w(x)f[x, x0, x1, . . . , xn], where f[x, x0, x1, . . . , xn] = f^(n+1)(ξ)/(n + 1)!.
Then E′(x) = w′(x)f[x, x0, x1, . . . , xn] + w(x)f[x, x, x0, x1, . . . , xn].
Now, this expression is differentiated (k − 1) times by Leibnitz's theorem.
E^(k)(x) = Σ_{i=0}^{k} kCi w^(i)(x) [d^(k−i)/dx^(k−i)] f[x, x0, . . . , xn]
= Σ_{i=0}^{k} kCi w^(i)(x) (k − i)! f[x, x, . . . , x, x0, . . . , xn]   (x repeated k − i + 1 times)
= Σ_{i=0}^{k} (k!/i!) w^(i)(x) f[x, x, . . . , x, x0, . . . , xn], (7.3)
φ(x) = y0 + u∆y0 + [u(u − 1)/2!]∆²y0 + · · · + [u(u − 1) · · · (u − n + 1)/n!]∆ⁿy0
= y0 + u∆y0 + [(u² − u)/2!]∆²y0 + [(u³ − 3u² + 2u)/3!]∆³y0 + [(u⁴ − 6u³ + 11u² − 6u)/4!]∆⁴y0
+ [(u⁵ − 10u⁴ + 35u³ − 50u² + 24u)/5!]∆⁵y0 + · · · (7.4)
with error
E(x) = [u(u − 1) · · · (u − n)/(n + 1)!] h^(n+1) f^(n+1)(ξ),
where min{x, x0, · · · , xn} < ξ < max{x, x0, . . . , xn}.
Differentiating (7.4) successively with respect to x, we obtain
and so on.
It may be noted that ∆y0 , ∆2 y0 , ∆3 y0 , · · · are constants.
The above equations give the approximate derivative of f (x) at arbitrary point x (=
x0 + uh).
When x = x0, u = 0, the above formulae become
φ′(x0) = (1/h)[∆y0 − (1/2)∆²y0 + (1/3)∆³y0 − (1/4)∆⁴y0 + (1/5)∆⁵y0 − · · ·] (7.8)
φ′′(x0) = (1/h²)[∆²y0 − ∆³y0 + (11/12)∆⁴y0 − (5/6)∆⁵y0 + · · ·] (7.9)
φ′′′(x0) = (1/h³)[∆³y0 − (3/2)∆⁴y0 + (7/4)∆⁵y0 − · · ·] (7.10)
and so on.
E(x) = u(u − 1) · · · (u − n) h^(n+1) f^(n+1)(ξ)/(n + 1)!.
Then
Example 7.2.1 From the following table find the values of dy/dx and d²y/dx² at the point x = 1.5.
x : 1.5 2.0 2.5 3.0 3.5 4.0
y : 3.375 7.000 13.625 24.000 38.875 59.000
φ′(x) = (1/h)[∇yn + ((2v + 1)/2!)∇²yn + ((3v² + 6v + 2)/3!)∇³yn + ((4v³ + 18v² + 22v + 6)/4!)∇⁴yn
+ ((5v⁴ + 40v³ + 105v² + 100v + 24)/5!)∇⁵yn + · · ·] (7.13)
φ′′(x) = (1/h²)[∇²yn + ((6v + 6)/3!)∇³yn + ((12v² + 36v + 22)/4!)∇⁴yn
+ ((20v³ + 120v² + 210v + 100)/5!)∇⁵yn + · · ·] (7.14)
φ′′′(x) = (1/h³)[∇³yn + ((24v + 36)/4!)∇⁴yn + ((60v² + 240v + 210)/5!)∇⁵yn + · · ·] (7.15)
and so on.
The above formulae give the approximate first, second, third, etc., order derivatives at any point x, where x = xn + vh.
E(x) = v(v + 1)(v + 2) · · · (v + n) h^(n+1) f^(n+1)(ξ)/(n + 1)!,
where v = (x − xn)/h and min{x, x0, x1, . . . , xn} < ξ < max{x, x0, x1, . . . , xn}.
Then
E′(x) = hⁿ (d/dv)[v(v + 1)(v + 2) · · · (v + n)] f^(n+1)(ξ)/(n + 1)!
+ h^(n+1) [v(v + 1)(v + 2) · · · (v + n)/(n + 1)!] f^(n+2)(ξ1),
E′(xn) = hⁿ (d/dv)[v(v + 1)(v + 2) · · · (v + n)]_{v=0} f^(n+1)(ξ)/(n + 1)!
= hⁿ [n!/(n + 1)!] f^(n+1)(ξ), as (d/dv)[v(v + 1) · · · (v + n)]_{v=0} = n!,
= hⁿ f^(n+1)(ξ)/(n + 1). (7.19)
The acceleration is
d²x/dt² = (1/h²)[∇²xn + ∇³xn + (11/12)∇⁴xn + · · ·]
= (1/1²)[4 + 3 + (11/12) × 3] = 9.75.
Example 7.3.2 A slider in a machine moves along a fixed straight rod. Its distance
x cm along the rod is given in the following table for various values of the time t
(in seconds).
Find the velocity and the acceleration of the slider at time t = 1.5.
Here h = 0.1.
dx/dt = (1/h)[∇ + (1/2)∇² + (1/3)∇³ + (1/4)∇⁴ + (1/5)∇⁵ + · · ·]xn
= (1/0.1)[4.18 + (1/2) × 0.44 + (1/3) × 0.03 + (1/4) × 0.00 + (1/5) × 0.01]
= 44.12.
d²x/dt² = (1/h²)[∇² + ∇³ + (11/12)∇⁴ + (5/6)∇⁵ + · · ·]xn
= (1/(0.1)²)[0.44 + 0.03 + (11/12) × 0.00 + (5/6) × 0.01]
= 47.83.
Hence velocity and acceleration are respectively 44.12 cm/sec and 47.83 cm/sec2 .
and φ′′(x0) = (1/h²)[∆²y−1 − (1/12)∆⁴y−2 + · · ·]. (7.24)
E(x) = u(u² − 1²)(u² − 2²) · · · (u² − n²) h^(2n+1) f^(2n+1)(ξ)/(2n + 1)!,
E′(x) = (d/du)[u(u² − 1²)(u² − 2²) · · · (u² − n²)] (du/dx) h^(2n+1) f^(2n+1)(ξ)/(2n + 1)!
+ h^(2n+1) [u(u² − 1²)(u² − 2²) · · · (u² − n²)] (d/dx)[f^(2n+1)(ξ)/(2n + 1)!]
= h^(2n) (d/du)[u(u² − 1²)(u² − 2²) · · · (u² − n²)] f^(2n+1)(ξ)/(2n + 1)!
+ h^(2n+1) [u(u² − 1²)(u² − 2²) · · · (u² − n²)] f^(2n+2)(ξ1)/(2n + 1)!, (7.25)
Example 7.4.1 Compute the values of (i) f′(3), (ii) f′′(3), (iii) f′(3.1), (iv) f′′(3.1)
using the following table.
x : 1 2 3 4 5
f(x) : 0.0000 1.3863 3.2958 5.5452 8.0472
Since x = 3 and x = 3.1 lie near the middle of the table, formulae based on central
differences may be used. Here Stirling's formula is used to find the derivatives.
(i) Here x0 = 3, h = 1, u = 0. Then
f′(3) = (1/h)[(∆y−1 + ∆y0)/2 − (∆³y−2 + ∆³y−1)/12 + · · ·]
= (1/1)[(1.9095 + 2.2494)/2 − (−0.1833 − 0.0873)/12] = 2.1020.
(ii) f′′(3) = (1/h²)[∆²y−1 − (1/12)∆⁴y−2 + · · ·] = (1/1²)[0.3399 − (1/12) × 0.0960] = 0.3319.
(iii) Let x0 = 3, h = 1, u = (3.1 − 3)/1 = 0.1. Then
φ(x) = w(x) Σ_{i=0}^{n} yi/[(x − xi)w′(xi)], where w(x) = (x − x0)(x − x1) · · · (x − xn).
Then
φ′(x) = w′(x) Σ_{i=0}^{n} yi/[(x − xi)w′(xi)] − w(x) Σ_{i=0}^{n} yi/[(x − xi)²w′(xi)], (7.27)
and
φ′′(x) = w′′(x) Σ_{i=0}^{n} yi/[(x − xi)w′(xi)] − 2w′(x) Σ_{i=0}^{n} yi/[(x − xi)²w′(xi)]
+ 2w(x) Σ_{i=0}^{n} yi/[(x − xi)³w′(xi)]. (7.28)
Again, separating out the i = j term,
φ(x) = w(x) Σ_{i=0, i≠j}^{n} yi/[(x − xi)w′(xi)] + yj(x − x0) · · · (x − xj−1)(x − xj+1) · · · (x − xn)/w′(xj).
Therefore,
φ′(xj) = w′(xj) Σ_{i=0, i≠j}^{n} yi/[(xj − xi)w′(xi)] + yj Σ_{i=0, i≠j}^{n} 1/(xj − xi). (7.29)
Note that
(d/dx)[(x − x0)(x − x1) · · · (x − xj−1)(x − xj+1) · · · (x − xn)/{(xj − x0)(xj − x1) · · · (xj − xj−1)(xj − xj+1) · · · (xj − xn)}] at x = xj
= Σ_{i=0, i≠j}^{n} 1/(xj − xi).
x : 2 3 5 6
y : 13 34 136 229
Solution. Here x0 = 2, x1 = 3, x2 = 5, x3 = 6.
w(x) = (x − x0)(x − x1)(x − x2)(x − x3) = (x − 2)(x − 3)(x − 5)(x − 6).
w′(x) = (x − 3)(x − 5)(x − 6) + (x − 2)(x − 5)(x − 6) + (x − 2)(x − 3)(x − 6)
+ (x − 2)(x − 3)(x − 5).
By the formula (7.29),
f′(2) ≈ φ′(2) = w′(x0) Σ_{i=1}^{3} yi/[(x0 − xi)w′(xi)] + y0 Σ_{i=1}^{3} 1/(x0 − xi)
= w′(2)[y1/{(2 − 3)w′(3)} + y2/{(2 − 5)w′(5)} + y3/{(2 − 6)w′(6)}]
+ y0[1/(2 − 3) + 1/(2 − 5) + 1/(2 − 6)].
Also
f′(2.5) ≈ w′(2.5) Σ_{i=0}^{3} yi/[(2.5 − xi)w′(xi)] − w(2.5) Σ_{i=0}^{3} yi/[(2.5 − xi)²w′(xi)]
= w′(2.5)[y0/{(2.5 − 2)w′(2)} + y1/{(2.5 − 3)w′(3)} + y2/{(2.5 − 5)w′(5)} + y3/{(2.5 − 6)w′(6)}]
− w(2.5)[y0/{(2.5 − 2)²w′(2)} + y1/{(2.5 − 3)²w′(3)} + y2/{(2.5 − 5)²w′(5)} + y3/{(2.5 − 6)²w′(6)}]
= 1.5[13/{0.5 × (−12)} + 34/{(−0.5) × 6} + 136/{(−2.5) × (−6)} + 229/{(−3.5) × 12}]
+ 2.1875[13/{(0.5)² × (−12)} + 34/{(−0.5)² × 6} + 136/{(−2.5)² × (−6)} + 229/{(−3.5)² × 12}]
= 20.75.
Algorithm 7.1 (Derivative). This algorithm determines the first order derivative
of a function given in tabular form (xi, yi), i = 0, 1, 2, . . . , n, at a given point xg
(which may or may not coincide with one of the nodes xi), based on Lagrange's interpolation.
endfor;
//compute w (xg)//
set t = 0;
for j = 0 to n do
set prod = 1;
for i = 0 to n do
if (i ≠ j) then prod = prod ∗ (xg − xi );
endfor;
Compute t = t + prod;
endfor;
//compute w(xg) //
set t1 = 1;
for i = 0 to n do
Compute t1 = t1 ∗ (xg − xi );
endfor;
Compute result = t ∗ sum1 − t1 ∗ sum2;
else //xg is equal to xj //
for i = 0 to n do
if i ≠ j then
Compute sum1 = sum1 + yi /((xj − xi ) ∗ wd(i));
Compute sum2 = sum2 + 1/(xj − xi );
endif;
endfor;
Compute result = wd(j) ∗ sum1 + yj ∗ sum2;
endif;
Print ’The value of the derivative’, result;
function wd(j)
//This function determines w (xj ).//
Set prod = 1;
for i = 0 to n do
if (i ≠ j) prod = prod ∗ (xj − xi );
endfor;
return prod;
end wd
end Derivative Lagrange
Program 7.1.
/* Program Derivative
Program to find the first order derivative of a function
y=f(x) given as (xi,yi), i=0, 1, 2, ..., n, using formula
based on Lagrange’s interpolation. */
#include<stdio.h>
{
sum1+=y[i]/((x[j]-x[i])*wd(i));
sum2+=1/(x[j]-x[i]);
}
result=wd(j)*sum1+y[j]*sum2;
} /* end of else part */
printf("The value of derivative at x= %6.4f is %8.5f",xg,result);
}
/* this function determines w’(xj) */
float wd(int j)
{
int i;float prod=1;
for(i=0;i<=n;i++) if(i!=j) prod*=(x[j]-x[i]);
return prod;
}
A sample of input/output:
Enter the value of n and the data in the form (x[i],y[i])
3
1 0.54030
2 -0.41615
3 -0.98999
4 -0.65364
Enter the value of x at which derivative is required
1.2
The value of derivative at x= 1.2000 is -0.99034
Table of derivatives
The summary of the formulae of derivatives based on finite differences.
f′(x0) ≈ (1/h)[∆ − (1/2)∆² + (1/3)∆³ − (1/4)∆⁴ + (1/5)∆⁵ − (1/6)∆⁶ + · · ·]y0 (7.32)
f′′(x0) ≈ (1/h²)[∆² − ∆³ + (11/12)∆⁴ − (5/6)∆⁵ + (137/180)∆⁶ − (7/10)∆⁷ + (363/560)∆⁸ + · · ·]y0 (7.33)
f′(xn) ≈ (1/h)[∇ + (1/2)∇² + (1/3)∇³ + (1/4)∇⁴ + (1/5)∇⁵ + (1/6)∇⁶ + · · ·]yn (7.34)
f′′(xn) ≈ (1/h²)[∇² + ∇³ + (11/12)∇⁴ + (5/6)∇⁵ + (137/180)∇⁶ + (7/10)∇⁷ + (363/560)∇⁸ + · · ·]yn (7.35)
f′(x0) ≈ (1/h)[(∆y−1 + ∆y0)/2 − (∆³y−2 + ∆³y−1)/12 + (∆⁵y−3 + ∆⁵y−2)/60 + · · ·] (7.36)
f′′(x0) ≈ (1/h²)[∆²y−1 − (1/12)∆⁴y−2 + (1/90)∆⁶y−3 − · · ·] (7.37)
Only the first term of (7.32) gives a simple formula for the first order derivative:
f′(xi) ≈ ∆yi/h = (yi+1 − yi)/h = [y(xi + h) − y(xi)]/h. (7.38)
Similarly, the equation (7.34) gives
f′(xi) ≈ ∇yi/h = (yi − yi−1)/h = [y(xi) − y(xi − h)]/h. (7.39)
Taking the mean of equations (7.38) and (7.39), we obtain the central difference formula for the
first order derivative, as
f′(xi) ≈ [y(xi + h) − y(xi − h)]/(2h). (7.40)
Equations (7.38)-(7.40) give two-point formulae to find the first order derivative at x = xi.
Similarly, from equation (7.33)
f(xi + h) = f(xi) + hf′(xi) + (h²/2!)f′′(xi) + (h³/3!)f′′′(ξ1)
and f(xi − h) = f(xi) − hf′(xi) + (h²/2!)f′′(xi) − (h³/3!)f′′′(ξ2).
By subtraction,
f(xi + h) − f(xi − h) = 2hf′(xi) + [{f′′′(ξ1) + f′′′(ξ2)}/3!]h³. (7.44)
Since f′′′ is continuous, by the intermediate value theorem there exists a number ξ such that
[f′′′(ξ1) + f′′′(ξ2)]/2 = f′′′(ξ).
Thus, after rearrangement the equation (7.44) becomes
is the total error accumulating the round-off error (Eround ) and the truncation error
(Etrunc ).
Let |ε−1| ≤ ε, |ε1| ≤ ε and M3 = max_{a≤x≤b} |f′′′(x)|.
Then from (7.46), the upper bound of the total error is given by
f(x + h) = f(x) + hf′(x) + (h²/2!)f′′(x) + (h³/3!)f′′′(x) + (h⁴/4!)f^iv(x) + (h⁵/5!)f^v(ξ1)
and
f(x − h) = f(x) − hf′(x) + (h²/2!)f′′(x) − (h³/3!)f′′′(x) + (h⁴/4!)f^iv(x) − (h⁵/5!)f^v(ξ2).
Then by subtraction
f(x − 2h) − f(x + 2h) + 8f(x + h) − 8f(x − h) = 12hf′(x) + [{16f^v(ξ3) − 64f^v(ξ4)}/120]h⁵.
Since f^v(x) is continuous, f^v(ξ3) ≈ f^v(ξ4) = f^v(ξ) (say). Then 16f^v(ξ3) − 64f^v(ξ4) = −48f^v(ξ).
Using this result the above equation becomes
f(x − 2h) − f(x + 2h) + 8f(x + h) − 8f(x − h) = 12hf′(x) − (2/5)h⁵f^v(ξ).
Hence, the value of f′(x) is given by
(i) If |Eround| = |Etrunc| then 3ε/(2h) = M5h⁴/30, or h⁵ = 45ε/M5.
Thus the optimum value of h is (45ε/M5)^(1/5) and
|Eround| = |Etrunc| = (3ε/2)(M5/(45ε))^(1/5) = (27M5ε⁴/160)^(1/5).
(ii) When the total error |E| = |Eround| + |Etrunc| is minimum, then d|E|/dh = 0,
or, −3ε/(2h²) + 4M5h³/30 = 0, i.e., h⁵ = 45ε/(4M5).
Hence, in this case, the optimum value of h is (45ε/(4M5))^(1/5).
Example 7.6.2 The value of x and f (x) = x cos x are tabulated as follows:
x : 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
f (x) : 0.19601 0.28660 0.36842 0.43879 0.49520 0.53539 0.55737 0.55945 0.54030
Find the value of f′(0.6) using the two- and four-point formulae
f′(x0) = [f(x0 + h) − f(x0 − h)]/(2h)
and f′(x0) = [−f(x0 + 2h) + 8f(x0 + h) − 8f(x0 − h) + f(x0 − 2h)]/(12h)
with step size h = 0.1.
The exact value is f′(0.6) = cos 0.6 − 0.6 sin 0.6 = 0.48655.
Therefore, the error in the two-point formula is 0.00355 and that in the four-point formula is
0.00001. Clearly, the four-point formula gives a better result than the two-point formula.
The accuracy of a computed derivative can be improved using this method. The method
reduces the number of function evaluations needed to achieve higher order accuracy.
The formula to find the first order derivative using two points is
f′(x) = [f(x + h) − f(x − h)]/(2h) + Etrunc = g(h) + Etrunc,
where Etrunc is the truncation error and g(h) is the approximate first order derivative
of f(x). Using Taylor's series expansion, it can be shown that Etrunc is of the following form:
Etrunc = c1h² + c2h⁴ + c3h⁶ + · · · .
f′(x) = [4g(h/2) − g(h)]/3 − (1/4)c2h⁴ − (5/16)c3h⁶ − · · ·
= [4g(h/2) − g(h)]/3 + d1h⁴ + d2h⁶ + · · · (7.54)
Denoting
[4g(h/2) − g(h)]/3 by g1(h/2), (7.55)
equation (7.54) becomes
This equation shows that g1 (h/2) is an approximate value of f (x) with fourth-order
accuracy. Thus a result accurate up to fourth order is obtained by combining two results
accurate up to second order.
Now, by repeating the above result one can obtain
where
g2(h/2²) = [4²g1(h/2²) − g1(h/2)]/(4² − 1). (7.59)
In general,
gk(h/2^m) = [4^k g_{k−1}(h/2^m) − g_{k−1}(h/2^{m−1})]/(4^k − 1), k = 1, 2, 3, . . . ; m = k, k + 1, . . . ,
where g0(h) = g(h).
This process is called repeated extrapolation to the limit. The values of gk (h/2m ) for
different values of k and m are tabulated as shown in Table 7.1.
It may be noted that the successive values in a particular column give better ap-
proximations of the derivative than the preceding columns. This process will terminate
when
|gm (h/2) − gm−1 (h)| ≤ ε
Here g(h/2) is more accurate than g(h) and then g1 (h/2) gives an improved ap-
proximation over g(h/2). If g(h) < g(h/2), g1 (h/2) > g(h/2) and if g(h/2) < g(h),
g1 (h/2) < g(h/2). Thus the value of g1 (h/2) lies outside the interval [g(h), g(h/2)] or
[g(h/2), g(h)] as the case may be. Thus g1 (h/2) is obtained from g(h) and g(h/2) by
means of an extrapolation operation. So, this process is called (Richardson) extrapola-
tion.
Example 7.7.1 Use Richardson's extrapolation method to find f′(0.5) where
f(x) = 1/x, starting with h = 0.2.
Thus, after two steps we find that f′(0.5) ≈ −4.00031 while the exact value is
f′(0.5) = [−1/x²]_{x=0.5} = −4.0.
Compute g0(0) = [f(x + h) − f(x − h)]/(2h);
2h
Set j = 1;
10: Set h = h/2;
Compute gn(0) = [f(x + h) − f(x − h)]/(2h);
2h
for k = 1 to j do
gn(k) = [4^k · gn(k − 1) − g0(k − 1)]/(4^k − 1);
endfor;
if |gn (j − 1) − gn (j)| < ε then
Print gn (j) as the value of derivative;
Stop;
else
for k = 0 to j do
g0 (k) = gn (k); //set new values as old values//
j = j + 1;
goto 10;
endif;
end Richardson extrapolation
Program 7.2.
/* Program Richardson Extrapolation
This program finds the first order derivative of a function
f(x) at a given value of x by Richardson Extrapolation.
Here we assume that f(x)=1/(x*x).
*/
#include<stdio.h>
#include<math.h>
#include<stdlib.h>
void main()
{
int j=1,k;
float x,h,eps=1e-5,g0[100],gn[100];
float f(float x);
printf("Enter the value of x ");
scanf("%f",&x);
printf("Enter the value of h ");
scanf("%f",&h);
g0[0]=(f(x+h)-f(x-h))/(h+h);
start: h=h/2;
gn[0]=(f(x+h)-f(x-h))/(h+h);
for(k=1;k<=j;k++)
gn[k]=(pow(4,k)*gn[k-1]-g0[k-1])/(pow(4,k)-1);
if(fabs(gn[j-1]-gn[j])<eps)
{
printf("The derivative is %8.5f at %8.5f ",gn[j],x);
exit(0);
}
else
{
for(k=0;k<=j;k++) g0[k]=gn[k];
j++;
goto start;
}
} /* main */
/* definition of f(x) */
float f(float x)
{
return(1/(x*x));
}
A sample of input/output:
The cubic spline may be used to determine the first and second derivatives of a function.
The method works in two stages: in the first stage the cubic splines are constructed
over suitable intervals, and in the second stage the first and second derivatives are
determined from the appropriate cubic spline. This method is more laborious than the
other methods but, once a cubic spline is constructed, it becomes very
efficient. The process of finding the derivative is illustrated by an example in the following.
Example 7.8.1 Let y = f(x) = cos x, 0 ≤ x ≤ π/2 be the function. Find the natural
cubic spline in the intervals 0 ≤ x ≤ π/4 and π/4 ≤ x ≤ π/2 and hence determine
the approximate values of f′(π/8) and f′′(π/8). Also, use the two-point formula to find
the value of f′(π/8). Find the error in each case.
Solution. Here n = 2. Therefore, h = π/4, y0 = cos 0 = 1, y1 = cos π/4 = 1/√2 and
y2 = cos π/2 = 0.
Also, M0 = M2 = 0.
Then by the formula (3.99),
M0 + 4M1 + M2 = (6/h²)[y0 − 2y1 + y2].
That is, 4M1 = (96/π²)(1 − √2) or, M1 = (24/π²)(1 − √2) = −1.007246602.
Hence the cubic spline is
S(x) = s1(x), 0 ≤ x ≤ π/4
       s2(x), π/4 ≤ x ≤ π/2,
where
s1(x) = (4/π)[M1x³/6 − (1 − 1/√2 + M1π²/96)x + π/4]
and
s2(x) = (4/π)[M1(π/2 − x)³/6 + (1/√2 − M1π²/96)(π/2 − x)].
Now, f′(π/8) ≈ s1′(π/8) = −0.33996116, f′′(π/8) ≈ s1′′(π/8) = −0.503623301.
Two-point formula.
Let h = π/30. Then
f′(π/8) ≈ [f(π/8 + π/30) − f(π/8 − π/30)]/(2π/30) = −0.381984382.
The actual value of f′(π/8) is −sin π/8 = −0.382683432.
Therefore, the error in the cubic spline method is 0.042722272 while that in the
two-point formula is 0.000699050365.
It is known that if a function is differentiable then the maximum and minimum values
of that function can be determined by equating the first derivative to zero and solving
for the variable. The same method is applicable to a tabulated function.
Consider Newton's forward difference interpolation formula:
y = f(x) = y0 + u∆y0 + [u(u − 1)/2!]∆²y0 + [u(u − 1)(u − 2)/3!]∆³y0 + · · · , (7.61)
where u = (x − x0)/h. Then
dy/dx = (1/h)[∆y0 + ((2u − 1)/2)∆²y0 + ((3u² − 6u + 2)/6)∆³y0 + · · ·].
For maxima and minima, dy/dx = 0. Then
∆y0 + ((2u − 1)/2)∆²y0 + ((3u² − 6u + 2)/6)∆³y0 + · · · = 0. (7.62)
For simplicity, the fourth and higher differences are neglected and we obtain the quadratic
equation for u as
au² + bu + c = 0, (7.63)
where a = (1/2)∆³y0, b = ∆²y0 − ∆³y0, c = ∆y0 − (1/2)∆²y0 + (1/3)∆³y0.
The values of u can be determined by solving this equation. Then the values of x
are determined from the relation x = x0 + uh. Finally, the maximum value of y can be
obtained from the equation (7.61).
Example 7.9.1 Find x for which y is maximum and also find the corresponding
value of y, from the following table.
x : 1.5 2.0 2.5 3.0 3.5
y : 0.40547 0.69315 0.91629 1.09861 1.25276
7.10 Integration
It is well known that even if a function f(x) is known completely, it is not always
possible to evaluate its definite integral by analytic methods. Again, in many
real-life problems we are required to integrate a function between two given limits,
but the function is not known explicitly; it is known only in tabular form (equally or
unequally spaced). A method known as numerical integration or quadrature
can be used to solve all such problems.
The problem of numerical integration is stated below:
Given a set of data points (x0, y0), (x1, y1), . . . , (xn, yn) of a function y = f(x), it is
required to find the value of the definite integral ∫_a^b f(x) dx. The function f(x) is replaced
by a suitable interpolating polynomial φ(x).
Then the approximate value of the definite integral is calculated using the following
formula:
∫_a^b f(x) dx ≈ ∫_a^b φ(x) dx. (7.64)
Thus, different integration formulae can be derived depending on the type of the
interpolation formulae used.
A numerical integration formula is said to be of closed type, if the limits of integra-
tion a and b are taken as interpolating points. If a and b are not taken as interpolating
points then the formula is known as open type formula.
Let y = f(x) be continuous and possess continuous derivatives of all orders. Also,
it is assumed that there exists a function F(x) such that F′(x) = f(x) in [x0, x1].
Then
∫_a^b f(x) dx = ∫_{x0}^{x1} f(x) dx = F(x1) − F(x0) = F(x0 + h) − F(x0)
= [F(x0) + hF′(x0) + (h²/2!)F′′(x0) + (h³/3!)F′′′(x0) + · · ·] − F(x0)
= hf(x0) + (h²/2!)f′(x0) + (h³/3!)f′′(x0) + · · ·
= hy0 + (h²/2)y0′ + (h³/6)y0′′ + · · · (7.70)
Again,
(h/2)(y0 + y1) = (h/2)[y0 + y(x0 + h)]
= (h/2)[y0 + y(x0) + hy′(x0) + (h²/2!)y′′(x0) + · · ·]
= (h/2)[2y0 + hy0′ + (h²/2!)y0′′ + · · ·]. (7.71)
Using (7.70) and (7.71), equation (7.69) becomes
E = h[y0 + (h/2)y0′ + (h²/6)y0′′ + · · ·] − (h/2)[2y0 + hy0′ + (h²/2!)y0′′ + · · ·]
= −(h³/12)y0′′ + · · ·
= −(h³/12)f′′(x0) + · · · ≈ −(h³/12)f′′(ξ), (7.72)
where a = x0 < ξ < x1 = b.
Equation (7.72) gives the error in the interval [x0 , x1 ].
The total error in the composite rule is
E = −(h³/12)(y0′′ + y1′′ + · · · + y′′n−1).
If y′′(ξ) is the largest among the n quantities y0′′, y1′′, . . . , y′′n−1, then
E ≤ −(1/12)h³ny′′(ξ) = −((b − a)/12)h²y′′(ξ), as nh = b − a.
Note 7.11.1 The error term shows that if the second and higher order derivatives of
f (x) vanish then the trapezoidal rule gives exact result of the integral. This means, the
method gives exact result when f (x) is linear.
[Figure: geometrical interpretation of the trapezoidal rule — the area under y = f(x) between x0 and x1 is approximated by a trapezium; in the composite rule the interval [x0, xn] is subdivided at x1, x2, . . . , xn−1.]
Let f ∈ C²[a, b], where [a, b] is a finite interval. Now, transform the interval [a, b] to [−1, 1]
using the relation x = (a + b)/2 + ((b − a)/2)t = p + qt (say).
Let f(x) = f(p + qt) = g(t). When x = a, b then t = −1, 1, i.e., g(1) = f(b), g(−1) = f(a).
Thus
I = ∫_a^b f(x) dx = ∫_{−1}^{1} g(t) q dt = q[∫_{−1}^{0} g(t) dt + ∫_{0}^{1} g(t) dt]
= q ∫_{0}^{1} [g(t) + g(−t)] dt.
In this expression, the first term is the approximate integration obtained by trape-
zoidal rule and the second term represents the error.
Algorithm 7.3 (Trapezoidal). This algorithm finds the value of ∫_a^b f(x) dx based
on the tabulated values (xi, yi), yi = f(xi), i = 0, 1, 2, . . . , n, using trapezoidal rule.
Algorithm Trapezoidal
Input function f (x);
Read a, b, n; //the lower and upper limits and number of subintervals.//
Compute h = (b − a)/n;
Set sum = [f(a) + f(a + nh)]/2;
2
for i = 1 to n − 1 do
Compute sum = sum + f (a + ih);
endfor;
Compute result = sum ∗ h;
Print result;
end Trapezoidal
Program 7.3.
/* Program Trapezoidal
This program finds the value of integration of a function
by trapezoidal rule.
Here we assume that f(x)=x^3. */
#include<stdio.h>
void main()
{
float a,b,h,sum; int n,i;
float f(float);
printf("Enter the values of a, b ");
scanf("%f %f",&a,&b);
printf("Enter the value of n ");
scanf("%d",&n);
h=(b-a)/n;
sum=(f(a)+f(a+n*h))/2.;
for(i=1;i<=n-1;i++) sum+=f(a+i*h);
sum=sum*h;
printf("The value of the integration is %8.5f ",sum);
}
A sample of input/output:
In this formula the interval [a, b] is divided into two equal subintervals by the points
x0 , x1 , x2 , where h = (b − a)/2, x1 = x0 + h and x2 = x1 + h.
This rule is obtained by putting n = 2 in (7.66). In this case, the third and higher
order differences do not exist.
Again,
(h/3)[y0 + 4y1 + y2] = (h/3)[f(x0) + 4f(x1) + f(x2)]
= (h/3)[f(x0) + 4f(x0 + h) + f(x0 + 2h)]
= (h/3)[f(x0) + 4{f(x0) + hf′(x0) + (h²/2!)f′′(x0) + (h³/3!)f′′′(x0) + (h⁴/4!)f^iv(x0) + · · ·}
+ {f(x0) + 2hf′(x0) + ((2h)²/2!)f′′(x0) + ((2h)³/3!)f′′′(x0) + ((2h)⁴/4!)f^iv(x0) + · · ·}]
= 2hf(x0) + 2h²f′(x0) + (4/3)h³f′′(x0) + (2/3)h⁴f′′′(x0) + (5/18)h⁵f^iv(x0) + · · · . (7.77)
Using (7.76) and (7.77), equation (7.75) becomes
E = (4/15 − 5/18)h⁵f^iv(x0) + · · · ≈ −(h⁵/90)f^iv(ξ), (7.78)
where x0 < ξ < x2.
This is the error in the interval [x0 , x2 ].
The total error in the composite formula is
E = −(h⁵/90){f^iv(x0) + f^iv(x2) + · · · + f^iv(xn−2)}
= −(h⁵/90)(n/2)f^iv(ξ)
= −(nh⁵/180)f^iv(ξ),
(where f^iv(ξ) is the maximum among f^iv(x0), f^iv(x2), . . . , f^iv(xn−2))
= −((b − a)/180)h⁴f^iv(ξ). (7.79)
[Figure: Simpson's 1/3 rule — the arc of y = f(x) between x0 and x2 is replaced by a parabola through the three points.]
Example 7.11.1 Evaluate ∫_0^3 (2x − x²) dx, taking 6 intervals, by (i) Trapezoidal
rule, and (ii) Simpson's 1/3 rule.
The Simpson's 1/3 value is
(0.5/3)[0 + 4(0.75 + 0.75 − 1.25) + 2(1.0 + 0.0) − 3.0]
= (0.5/3)[0 + 1 + 2 − 3] = 0.
This rule can also be deduced by applying the MVT of differential and of integral calculus.
Let f ∈ C⁴[a, b] and x = (a + b)/2 + ((b − a)/2)z = p + qz, p = (a + b)/2, q = (b − a)/2.
Then when x = a, b, we have z = −1, 1.
Therefore,
I = ∫_a^b f(x) dx = q ∫_{−1}^{1} f(p + qz) dz
= q ∫_{−1}^{1} g(z) dz, where g(z) = f(p + qz),
= q[∫_{−1}^{0} g(z) dz + ∫_{0}^{1} g(z) dz] = q ∫_{0}^{1} [g(z) + g(−z)] dz
= q ∫_{0}^{1} φ(z) dz, (7.80)
where φ(z) = g(z) + g(−z).
Integrating $\int_0^1 \phi(z)\,dz$ by parts repeatedly,
$$\int_0^1 \phi(z)\,dz = (1+c)\phi(1) - c\phi(0) - \Big(\frac12 + c + c_1\Big)\phi'(1) + c_1\phi'(0)$$
$$\qquad + \Big[\Big(\frac{z^3}{6} + \frac{cz^2}{2} + c_1z + c_2\Big)\phi''(z)\Big]_0^1 - \int_0^1\Big(\frac{z^3}{6} + \frac{cz^2}{2} + c_1z + c_2\Big)\phi'''(z)\,dz$$
$$= (1+c)\phi(1) - c\phi(0) - \Big(\frac12 + c + c_1\Big)\phi'(1) + \Big(\frac16 + \frac{c}{2} + c_1 + c_2\Big)\phi''(1)$$
$$\qquad - c_2\phi''(0) - \int_0^1\Big(\frac{z^3}{6} + \frac{cz^2}{2} + c_1z + c_2\Big)\phi'''(z)\,dz, \tag{7.82}$$
(the term $c_1\phi'(0)$ drops out, since $\phi'(0) = g'(0) - g'(0) = 0$), where $c, c_1, c_2$ are arbitrary constants chosen in such a way that the coefficients of $\phi'(1)$, $\phi''(1)$ and $\phi''(0)$ vanish. Thus
$$\frac12 + c + c_1 = 0, \quad \frac16 + \frac{c}{2} + c_1 + c_2 = 0, \quad\text{and}\quad c_2 = 0.$$
The solution of these equations is $c_2 = 0$, $c_1 = \frac16$, $c = -\frac23$.
Hence
$$I = q\Big[\frac13\phi(1) + \frac23\phi(0) - \int_0^1\Big(\frac{z^3}{6} - \frac{z^2}{3} + \frac{z}{6}\Big)\phi'''(z)\,dz\Big]$$
$$= h\Big[\frac13 f(a) + \frac13 f(b) + \frac43 f\Big(\frac{a+b}{2}\Big)\Big] - \frac{h}{6}\int_0^1(z^3 - 2z^2 + z)\phi'''(z)\,dz \quad \Big(\text{as } q = \frac{b-a}{2} = h\Big)$$
$$= \frac{h}{3}\Big[f(a) + 4f\Big(\frac{a+b}{2}\Big) + f(b)\Big] + E,$$
where
$$E = -\frac{h}{6}\int_0^1 z(z-1)^2\phi'''(z)\,dz = -\frac{h}{6}\int_0^1 z(z-1)^2[g'''(z) - g'''(-z)]\,dz$$
$$= -\frac{h}{6}\int_0^1 z(z-1)^2\cdot[2z\,g^{iv}(\xi)]\,dz, \quad -z < \xi < z \quad\text{[by Lagrange's MVT]}$$
$$= -\frac{h}{3}g^{iv}(\xi_1)\int_0^1 z^2(z-1)^2\,dz \quad\text{[by MVT of integral calculus]}$$
$$= -\frac{h}{3}g^{iv}(\xi_1)\cdot\frac{1}{30} = -\frac{h}{90}g^{iv}(\xi_1), \quad 0 < \xi_1 < 1.$$
Again, $g(z) = f(p + qz)$, so $g^{iv}(z) = q^4f^{iv}(p + qz) = h^4f^{iv}(\xi_2)$, $a < \xi_2 < b$.
Therefore,
$$E = -\frac{h^5}{90}f^{iv}(\xi_2).$$
Hence,
$$\int_a^b f(x)\,dx = \frac{h}{3}\Big[f(a) + 4f\Big(\frac{a+b}{2}\Big) + f(b)\Big] - \frac{h^5}{90}f^{iv}(\xi_2).$$
Here, the first term is the value of the integration obtained from the Simpson’s 1/3
rule and the second term is its error.
Algorithm 7.4 (Simpson's 1/3). This algorithm determines the value of $\int_a^b f(x)\,dx$ using Simpson's 1/3 rule.
Program 7.4.
/* Program Simpson’s 1/3
Program to find the value of integration of a function
f(x) using Simpson’s 1/3 rule. Here we assume that f(x)=x^3.*/
#include<stdio.h>
#include<stdlib.h> /* for exit() */
void main()
{
float f(float);
float a,b,h,sum;
int i,n;
printf("\nEnter the values of a, b ");
scanf("%f %f",&a,&b);
printf("Enter the value of subintervals n ");
scanf("%d",&n);
if(n%2!=0) {
printf("Number of subdivision should be even");
exit(0);
}
h=(b-a)/n;
sum=f(a)-f(a+n*h); /* -f(b) offsets the extra 2f(b) added in the last loop pass */
for(i=1;i<=n-1;i+=2)
sum+=4*f(a+i*h)+2*f(a+(i+1)*h);
sum*=h/3.;
printf("Value of the integration is %f ",sum);
} /* main */
/* definition of the function f(x) */
float f(float x)
{
 return(x*x*x);
}
Degree of Precision
The degree of precision of a quadrature formula is a positive integer n such that the
error is zero for all polynomials of degree i ≤ n, but it is non-zero for some polynomials
of degree n + 1.
The degree of precision of some quadrature formulae are given in Table 7.2.
The Weddle's rule gives more accurate results than Simpson's 1/3 rule. But Weddle's rule has a major disadvantage: it requires the number of subdivisions (n) to be a multiple of six. In many cases, the value of h = (b − a)/n (n a multiple of six) is not finite in decimal representation. For this reason, the values of x0, x1, ..., xn cannot be determined exactly and hence the values of y, i.e., y0, y1, ..., yn, become inaccurate.
In Simpson’s 1/3 rule, n, the number of subdivisions is even, so one can take n as 10,
20 etc. and hence h is finite in decimal representation. Thus the values of x0 , x1 , . . . , xn
and y0 , y1 , . . . , yn can be computed correctly.
However, Weddle’s rule should be used when Simpson’s 1/3 rule does not give the
desired accuracy.
Let the function y = f (x) be known at the (n + 1) points x0 , x1 , . . . , xn of [a, b], these
points need not be equispaced.
The Lagrange's interpolation polynomial is
$$\phi(x) = \sum_{i=0}^{n}\frac{w(x)}{(x - x_i)w'(x_i)}y_i, \tag{7.88}$$
where $w(x) = (x - x_0)(x - x_1)\cdots(x - x_n)$ and $\phi(x_i) = y_i$, $i = 0, 1, 2, \ldots, n$.
If the function f(x) is replaced by the polynomial φ(x) then
$$\int_a^b f(x)\,dx \simeq \int_a^b \phi(x)\,dx = \sum_{i=0}^{n}y_i\int_a^b \frac{w(x)}{(x - x_i)w'(x_i)}\,dx. \tag{7.89}$$
It may be noted that the coefficients Ci are independent of the choice of the function
f (x) for a given set of points.
i = 0, 1, 2, . . . , n and x = x0 + sh.
Since $h = \frac{b-a}{n}$, substituting,
$$C_i = (b - a)H_i, \tag{7.98}$$
where
$$H_i = \frac1n\cdot\frac{(-1)^{n-i}}{i!(n-i)!}\int_0^n\frac{s(s-1)(s-2)\cdots(s-n)}{(s-i)}\,ds, \quad i = 0, 1, 2, \ldots, n. \tag{7.99}$$
These coefficients Hi are called Cotes coefficients.
Then the integration formula (7.92) becomes
$$\int_a^b f(x)\,dx \simeq (b-a)\sum_{i=0}^{n}H_iy_i. \tag{7.100}$$
Note 7.13.1 The Cotes coefficients Hi's do not depend on the function f(x).
That is,
$$\sum_{i=0}^{n}\int_a^b\frac{w(x)}{(x - x_i)w'(x_i)}\,dx = \int_a^b dx = (b - a). \tag{7.101}$$
Again,
$$\sum_{i=0}^{n}\int_a^b\frac{w(x)}{(x - x_i)w'(x_i)}\,dx = \sum_{i=0}^{n}h\frac{(-1)^{n-i}}{i!(n-i)!}\int_0^n\frac{s(s-1)(s-2)\cdots(s-n)}{(s-i)}\,ds = \sum_{i=0}^{n}C_i. \tag{7.102}$$
Hence $\sum_{i=0}^{n}C_i = b - a$. (7.103)
(ii) $\sum_{i=0}^{n}H_i = 1$.
From the relation (7.98), $C_i = (b - a)H_i$, so
$$\sum_{i=0}^{n}C_i = (b - a)\sum_{i=0}^{n}H_i, \quad\text{or}\quad (b - a) = (b - a)\sum_{i=0}^{n}H_i \quad\text{[using (7.103)]}.$$
Hence,
$$\sum_{i=0}^{n}H_i = 1. \tag{7.104}$$
(iii) $C_i = C_{n-i}$.
From the definition of $C_i$, one can find
$$C_{n-i} = \frac{(-1)^ih}{(n-i)!\,i!}\int_0^n\frac{s(s-1)(s-2)\cdots(s-n)}{s - (n-i)}\,ds.$$
Substituting $t = n - s$, we obtain
$$C_{n-i} = -\frac{(-1)^ih(-1)^n}{i!(n-i)!}\int_n^0\frac{t(t-1)(t-2)\cdots(t-n)}{t - i}\,dt = \frac{(-1)^{n-i}h}{i!(n-i)!}\int_0^n\frac{s(s-1)(s-2)\cdots(s-n)}{s - i}\,ds = C_i.$$
Hence,
$$C_i = C_{n-i}. \tag{7.105}$$
(iv) $H_i = H_{n-i}$.
Dividing (7.105) by (b − a), we obtain
$$H_i = H_{n-i}. \tag{7.106}$$
Weddle’s rule
To deduce the Weddle's rule, n = 6 is substituted in (7.100):
$$\int_a^b f(x)\,dx = (b - a)\sum_{i=0}^{6}H_iy_i = 6h(H_0y_0 + H_1y_1 + H_2y_2 + H_3y_3 + H_4y_4 + H_5y_5 + H_6y_6)$$
$$= 6h[H_0(y_0 + y_6) + H_1(y_1 + y_5) + H_2(y_2 + y_4) + H_3y_3].$$
To find the values of Hi's one may use the result $H_i = H_{6-i}$. Also the value of H3 can be obtained from
$$H_3 = 1 - (H_0 + H_1 + H_2 + H_4 + H_5 + H_6) = 1 - 2(H_0 + H_1 + H_2).$$
Now,
$$H_0 = \frac16\cdot\frac{1}{6!}\int_0^6\frac{s(s-1)(s-2)\cdots(s-6)}{s}\,ds = \frac{41}{840}.$$
Similarly, $H_1 = \frac{216}{840}$, $H_2 = \frac{27}{840}$, $H_3 = \frac{272}{840}$.
Hence,
$$\int_a^b f(x)\,dx = \frac{h}{140}[41y_0 + 216y_1 + 27y_2 + 272y_3 + 27y_4 + 216y_5 + 41y_6]. \tag{7.107}$$
Again, we know that $\Delta^6y_0 = y_0 - 6y_1 + 15y_2 - 20y_3 + 15y_4 - 6y_5 + y_6$, i.e.,
$$\frac{h}{140}[y_0 - 6y_1 + 15y_2 - 20y_3 + 15y_4 - 6y_5 + y_6] - \frac{h}{140}\Delta^6y_0 = 0.$$
Adding the left hand side of this identity (as it is zero) to the right hand side of (7.107) and simplifying, equation (7.107) finally reduces to
$$\int_a^b f(x)\,dx = \frac{3h}{10}[y_0 + 5y_1 + y_2 + 6y_3 + y_4 + 5y_5 + y_6] - \frac{h}{140}\Delta^6y_0.$$
The first term is the well known Weddle’s rule and the last term is the error in
addition to the truncation error.
The coefficients (written as $nH_i$, so that $\int_a^b f(x)\,dx \simeq h\sum_{i=0}^{n}(nH_i)y_i$) for n = 1 to 6 are:

n = 1 :  1/2, 1/2
n = 2 :  1/3, 4/3, 1/3
n = 3 :  3/8, 9/8, 9/8, 3/8
n = 4 :  14/45, 64/45, 24/45, 64/45, 14/45
n = 5 :  95/288, 375/288, 250/288, 250/288, 375/288, 95/288
n = 6 :  41/140, 216/140, 27/140, 272/140, 27/140, 216/140, 41/140
All the formulae based on the Newton-Cotes formula developed in Section 7.13 are of closed type, i.e., they use the function values at the end points a, b of the interval [a, b] of integration. Here, some formulae are introduced that take the function values at equispaced intermediate points, but not at the end points. These formulae may be used when the function has singularities at the end points or when the values of the function are unknown at the end points. Also, these methods are useful to solve ordinary differential equations numerically when the function values at the end points are not available. These formulae are sometimes known as the Steffensen formulae.
$$\int_{x_0}^{x_1} f(x)\,dx = hf(x_0 + h/2) + \frac{1}{24}h^3f''(\xi), \quad x_0 \le \xi \le x_1. \tag{7.108}$$
$$\int_{x_0}^{x_3} f(x)\,dx = \frac{3h}{2}[f(x_1) + f(x_2)] + \frac{3h^3}{4}f''(\xi), \quad x_0 \le \xi \le x_3. \tag{7.109}$$
$$\int_{x_0}^{x_4} f(x)\,dx = \frac{4h}{3}[2f(x_1) - f(x_2) + 2f(x_3)] + \frac{14h^5}{45}f^{iv}(\xi), \quad x_0 \le \xi \le x_4. \tag{7.110}$$
$$\int_{x_0}^{x_5} f(x)\,dx = \frac{5h}{24}[11f(x_1) + f(x_2) + f(x_3) + 11f(x_4)] + \frac{95h^5}{144}f^{iv}(\xi), \quad x_0 \le \xi \le x_5. \tag{7.111}$$
In the Newton-Cotes method all the nodes xi, i = 0, 1, 2, ..., n are known and equispaced, and the formulae obtained from it are exact for polynomials of degree up to n. When the nodes xi, i = 0, 1, 2, ..., n are also treated as unknowns, one can devise methods which give exact results for polynomials of degree up to 2n − 1. These methods are called Gaussian quadrature methods.
where xi and wi are respectively called nodes and weights and ψ(x) is called the weight function. Depending on the weight function, different quadrature formulae can be obtained.
The fundamental theorem of Gaussian quadrature states that the optimal nodes of the m-point Gaussian quadrature formula are precisely the zeros of the orthogonal polynomial for the same interval and weight function. Gaussian quadrature is optimal because it fits all polynomials up to degree 2m − 1 exactly.
To determine the weights corresponding to the Gaussian nodes $x_i$, compute a Lagrange's interpolating polynomial for f(x) by assuming
$$\pi(x) = \prod_{j=1}^{m}(x - x_j). \tag{7.113}$$
Then
$$\pi'(x_j) = \prod_{\substack{i=1\\ i \ne j}}^{m}(x_j - x_i) \tag{7.114}$$
for arbitrary points x. Now, determine a set of points $x_j$ and weights $w_j$ such that for a weight function ψ(x) the following relation is valid:
$$\int_a^b \phi(x)\psi(x)\,dx = \int_a^b \sum_{j=1}^{m}\frac{\pi(x)\psi(x)}{(x - x_j)\pi'(x_j)}f(x_j)\,dx = \sum_{j=1}^{m}w_jf(x_j). \tag{7.116}$$
Thus to study the Gaussian quadrature, we consider the integral in the form
$$\int_{-1}^{1}\psi(x)f(x)\,dx = \sum_{i=1}^{n}w_if(x_i) + E. \tag{7.121}$$
It may be noted that the $w_i$ and $x_i$ together constitute 2n parameters, and therefore the weights and nodes can be determined such that the formula is exact when f(x) is a polynomial of degree not exceeding 2n − 1.
Let $f(x) = c_0 + c_1x + c_2x^2 + \cdots + c_{2n-1}x^{2n-1}$.
Therefore,
$$\int_{-1}^{1}f(x)\,dx = \int_{-1}^{1}[c_0 + c_1x + c_2x^2 + \cdots + c_{2n-1}x^{2n-1}]\,dx = 2c_0 + \frac23c_2 + \frac25c_4 + \cdots. \tag{7.124}$$
When $x = x_i$, equation (7.123) gives (7.125). Since (7.124) and (7.125) are identical, comparing the coefficients of $c_i$ gives the following 2n equations:
$$w_1 + w_2 + \cdots + w_n = 2$$
$$w_1x_1 + w_2x_2 + \cdots + w_nx_n = 0$$
$$w_1x_1^2 + w_2x_2^2 + \cdots + w_nx_n^2 = \tfrac23 \tag{7.126}$$
$$\cdots$$
$$w_1x_1^{2n-1} + w_2x_2^{2n-1} + \cdots + w_nx_n^{2n-1} = 0.$$
Thus for n = 1,
$$\int_{-1}^{1}f(x)\,dx = 2f(0). \tag{7.128}$$
For n = 2, taking f(x) = 1, x, x², x³ in turn gives
$$w_1 + w_2 = 2 \quad (f(x) = 1)$$
$$w_1x_1 + w_2x_2 = 0 \quad (f(x) = x)$$
$$w_1x_1^2 + w_2x_2^2 = \tfrac23 \quad (f(x) = x^2) \tag{7.130, 7.131}$$
$$w_1x_1^3 + w_2x_2^3 = 0 \quad (f(x) = x^3).$$
The solution of these equations is $w_1 = w_2 = 1$, $x_1 = -1/\sqrt3$, $x_2 = 1/\sqrt3$. Hence, the equation (7.129) becomes
$$\int_{-1}^{1}f(x)\,dx = f(-1/\sqrt3) + f(1/\sqrt3).$$
For n = 3, the equations are
$$w_1 + w_2 + w_3 = 2,\quad w_1x_1 + w_2x_2 + w_3x_3 = 0,\quad w_1x_1^2 + w_2x_2^2 + w_3x_3^2 = \tfrac23,$$
$$w_1x_1^3 + w_2x_2^3 + w_3x_3^3 = 0,\quad w_1x_1^4 + w_2x_2^4 + w_3x_3^4 = \tfrac25,\quad w_1x_1^5 + w_2x_2^5 + w_3x_3^5 = 0.$$
In this way one can determine the formulae for higher values of n. The solution of the equations (7.126) for higher values of n is very complicated, as they are non-linear with respect to the nodes x1, x2, ..., xn, though linear with respect to the weights. It can be shown that the $x_i$, $i = 1, 2, \ldots, n$, are the zeros of the nth degree Legendre polynomial
$$P_n(x) = \frac{1}{2^nn!}\frac{d^n}{dx^n}[(x^2 - 1)^n],$$
which can be generated from the recurrence relation
where $P_0(x) = 1$ and $P_1(x) = x$. Some lower order Legendre polynomials are
$$P_0(x) = 1,\quad P_1(x) = x,\quad P_2(x) = \tfrac12(3x^2 - 1),\quad P_3(x) = \tfrac12(5x^3 - 3x),\quad P_4(x) = \tfrac18(35x^4 - 30x^2 + 3). \tag{7.135}$$
The nodes and weights for different values of n are listed in Table 7.4.
Example 7.15.1 Find the value of $\int_0^1\frac{1}{1+x^2}\,dx$ by Gauss's formula for n = 2, 4, 6. Also, calculate the absolute errors.

Solution. To apply the Gauss's formula, the limits are transferred to −1, 1 by substituting $x = \frac12u(1-0) + \frac12(1+0) = \frac12(u+1)$. Then
$$I = \int_0^1\frac{1}{1+x^2}\,dx = \int_{-1}^{1}\frac{2\,du}{(u+1)^2 + 4} \simeq 2\sum_{i=1}^{n}w_if(u_i), \quad\text{where } f(u_i) = \frac{1}{(u_i+1)^2 + 4}.$$
For the two-point formula (n = 2), $u_1 = -0.57735027$, $u_2 = 0.57735027$, $w_1 = w_2 = 1$. Then
$$I = 2[1 \times 0.23931272 + 1 \times 0.15412990] = 0.78688524.$$
For the four-point formula (n = 4)
Program 7.5.
/* Program Gauss-Legendre
Program to find the integration of a function by 6-point
Gauss-Legendre method. Here f(x)=x^3. */
#include<stdio.h>
void main()
{
float a,b,p,q,result=0,x[7],w[7]; int i;
float f(float);
printf("Enter the values of a, b ");
scanf("%f %f",&a,&b);
x[1]=0.23861919; x[2]=-x[1];
x[3]=0.66120939; x[4]=-x[3];
x[5]=0.93246951; x[6]=-x[5];
w[1]=w[2]=0.46791393;
w[3]=w[4]=0.36076157;
w[5]=w[6]=0.17132449;
p=(a+b)/2; q=(b-a)/2;
for(i=1;i<=6;i++)
result+=w[i]*f(p+q*x[i]);
result*=q;
printf("The value of the integration is %f ",result);
}
/* definition of the function f(x) */
float f(float x)
{
 return(x*x*x);
}
$$w_1 + w_2 + w_3 = 2$$
$$-w_1 + w_2x_2 + w_3 = 0$$
$$w_1 + w_2x_2^2 + w_3 = \tfrac23$$
$$-w_1 + w_2x_2^3 + w_3 = 0.$$
The solution is $x_2 = 0$, $w_1 = w_3 = \frac13$, $w_2 = \frac43$. Hence (7.140) becomes
$$\int_{-1}^{1}f(x)\,dx = \frac13[f(-1) + 4f(0) + f(1)]. \tag{7.141}$$
In general the nodes xi, i = 2, 3, ..., n − 1 are the zeros of the polynomial $P'_{n-1}(x)$, where $P_n(x)$ is the Legendre polynomial of degree n.
Since the formula (7.144) contains 2n − 1 unknowns (n weights and n − 1 nodes), this method gives exact values for polynomials of degree up to 2n − 2. For n = 3, (7.144) becomes
$$\int_{-1}^{1}f(x)\,dx = w_1f(-1) + w_2f(x_2) + w_3f(x_3). \tag{7.145}$$
This formula gives exact results for polynomials of degree up to 4. For $f(x) = 1, x, x^2, x^3, x^4$, equation (7.145) generates the following system of equations:
$$w_1 + w_2 + w_3 = 2,\quad -w_1 + w_2x_2 + w_3x_3 = 0,\quad w_1 + w_2x_2^2 + w_3x_3^2 = \tfrac23,$$
$$-w_1 + w_2x_2^3 + w_3x_3^3 = 0,\quad w_1 + w_2x_2^4 + w_3x_3^4 = \tfrac25.$$
These methods give exact result for polynomials of degree up to 2n − 1. The nodes
xi , i = 1, 2, . . . , n are the zeros of the Chebyshev polynomials
That is,
$$x_i = \cos\frac{(2i-1)\pi}{2n}, \quad i = 1, 2, \ldots, n. \tag{7.152}$$
$$w_1 + w_2 + w_3 = \pi,\quad w_1x_1 + w_2x_2 + w_3x_3 = 0,\quad w_1x_1^2 + w_2x_2^2 + w_3x_3^2 = \tfrac{\pi}{2},$$
$$w_1x_1^3 + w_2x_2^3 + w_3x_3^3 = 0,\quad w_1x_1^4 + w_2x_2^4 + w_3x_3^4 = \tfrac{3\pi}{8},\quad w_1x_1^5 + w_2x_2^5 + w_3x_3^5 = 0.$$
Table 7.7 gives the values for the first few points and weights for Gauss-Chebyshev
quadrature.
Example 7.15.2 Find the value of $\int_0^1\frac{1}{1+x^2}\,dx$ using the Gauss-Chebyshev four-point formula.

Solution. Let $f(x) = \frac{\sqrt{1-x^2}}{1+x^2}$. Here $x_1 = -0.3826834 = -x_2$, $x_3 = -0.9238795 = -x_4$ and $w_1 = w_2 = w_3 = w_4 = 0.7853982$.
Then
$$I = \int_0^1\frac{1}{1+x^2}\,dx = \frac12\int_{-1}^{1}\frac{1}{1+x^2}\,dx = \frac12\int_{-1}^{1}\frac{f(x)}{\sqrt{1-x^2}}\,dx$$
$$= \frac12[w_1f(x_1) + w_2f(x_2) + w_3f(x_3) + w_4f(x_4)] = \frac12\times0.7853982[f(x_1) + f(x_2) + f(x_3) + f(x_4)]$$
$$= \frac12\times0.7853982[2\times0.8058636 + 2\times0.2064594] = 0.7950767,$$
while the exact value is π/4 = 0.7853982. The absolute error is 0.0096785.
Remark 7.15.1 It may be noted that the Gauss-Legendre four-point formula gives a better result for this problem.
where $V_n = (-1)^n2^nn!\,P_n^{(\alpha,\beta)}(x)$.
The error term is
Table 7.10 is a list of some Gaussian quadrature along with weights, nodes and the
corresponding intervals.
Proof. We define an inner product with respect to the weight function ψ(x) as
$$\langle f, g\rangle = \int_a^b f(x)g(x)\psi(x)\,dx. \tag{7.173}$$
The orthonormal polynomials $p_m$ satisfy
$$\langle p_m, p_n\rangle = \begin{cases}1, & m = n\\ 0, & m \ne n.\end{cases} \tag{7.174}$$
Our aim is to find the nodes and the weights such that
$$\int_a^b f(x)\psi(x)\,dx = \sum_{i=1}^{n}w_if(x_i) + E, \tag{7.175}$$
where $Q_{n-1}(x)$ is the quotient and R(x) is the remainder polynomial of degree n − 1. Now, multiplying the equation (7.176) by the weight function ψ(x) and integrating over [a, b], we get
$$\int_a^b f(x)\psi(x)\,dx = \int_a^b p_n(x)Q_{n-1}(x)\psi(x)\,dx + \int_a^b R(x)\psi(x)\,dx. \tag{7.177}$$
The first term on the right hand side is zero, since $Q_{n-1}(x)$ can be expressed as a linear combination of the orthogonal polynomials $p_0, p_1, \ldots, p_{n-1}$ and so it must be orthogonal to $p_n$. Then from (7.177),
$$\int_a^b f(x)\psi(x)\,dx = \int_a^b R(x)\psi(x)\,dx. \tag{7.178}$$
Now, let the nodes $x_i$ be the n zeros of the polynomial $p_n(x)$. Then from (7.176), $f(x_i) = R(x_i)$. We now introduce another special set of polynomials, the Lagrange's polynomials
$$L_i(x) = \prod_{\substack{k=1\\ k \ne i}}^{n}\frac{x - x_k}{x_i - x_k}, \tag{7.179}$$
which satisfy
$$L_i(x_k) = \begin{cases}1, & i = k\\ 0, & i \ne k.\end{cases}$$
Since R(x) is of degree (n − 1), it can be written as a sum of Lagrange's polynomials, i.e.,
$$R(x) = \sum_{i=1}^{n}f(x_i)L_i(x). \tag{7.180}$$
Then
$$\int_a^b R(x)\psi(x)\,dx = \sum_{i=1}^{n}f(x_i)\int_a^b L_i(x)\psi(x)\,dx = \sum_{i=1}^{n}f(x_i)w_i, \tag{7.181}$$
where $w_i = \int_a^b \psi(x)L_i(x)\,dx$.
This proves that the formula (7.172) has precision 2n − 1.
Note that $L_i^2(x)$ is a polynomial of degree 2n − 2, which does not exceed 2n − 1, so the formula is exact for it. Let $f(x) = L_i^2(x)$. Then (7.181) reduces to
$$\int_a^b\psi(x)L_i^2(x)\,dx = \sum_{k=1}^{n}w_kL_i^2(x_k) = w_i.$$
Therefore,
$$w_i = \int_a^b\psi(x)L_i^2(x)\,dx = \langle L_i, L_i\rangle > 0,$$
i.e., all the weights are positive.
Euler-Maclaurin's sum formula is used to determine the sum of finite and infinite series of numbers, and is also used for numerical quadrature when the values of the derivatives are known at the end points of the interval. The well-known quadrature formulae such as the trapezoidal rule and Simpson's rule, including their error terms, can be deduced from this formula.
This formula is also known as Maclaurin's sum formula. To deduce it, let us consider the expansion of $\frac{x}{e^x - 1}$ in ascending powers of x.
Let
$$\frac{x}{e^x - 1} = b_0 + b_1x + b_2x^2 + b_3x^3 + b_4x^4 + b_5x^5 + \cdots.$$
That is,
$$x = (b_0 + b_1x + b_2x^2 + b_3x^3 + b_4x^4 + b_5x^5 + \cdots)\Big(x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \cdots\Big)$$
$$= b_0x + \Big(\frac{b_0}{2!} + b_1\Big)x^2 + \Big(\frac{b_0}{3!} + \frac{b_1}{2!} + b_2\Big)x^3 + \Big(\frac{b_0}{4!} + \frac{b_1}{3!} + \frac{b_2}{2!} + b_3\Big)x^4$$
$$\qquad + \Big(\frac{b_0}{5!} + \frac{b_1}{4!} + \frac{b_2}{3!} + \frac{b_3}{2!} + b_4\Big)x^5 + \Big(\frac{b_0}{6!} + \frac{b_1}{5!} + \frac{b_2}{4!} + \frac{b_3}{3!} + \frac{b_4}{2!} + b_5\Big)x^6 + \cdots.$$
Equating like powers of x on both sides, we find the following relations:
$$b_0 = 1,\quad b_1 = -\frac{b_0}{2!} = -\frac12,\quad b_2 = -\frac{b_0}{3!} - \frac{b_1}{2!} = \frac1{12},\quad b_3 = -\frac{b_0}{4!} - \frac{b_1}{3!} - \frac{b_2}{2!} = 0,$$
$$b_4 = -\frac{b_0}{5!} - \frac{b_1}{4!} - \frac{b_2}{3!} - \frac{b_3}{2!} = -\frac1{720},\quad b_5 = -\frac{b_0}{6!} - \frac{b_1}{5!} - \frac{b_2}{4!} - \frac{b_3}{3!} - \frac{b_4}{2!} = 0,$$
$$b_6 = -\frac{b_0}{7!} - \frac{b_1}{6!} - \frac{b_2}{5!} - \frac{b_3}{4!} - \frac{b_4}{3!} - \frac{b_5}{2!} = \frac1{30240},\quad\text{etc.}$$
Hence,
$$\frac{x}{e^x - 1} = 1 - \frac12x + \frac1{12}x^2 - \frac1{720}x^4 + \frac1{30240}x^6 - \cdots,$$
i.e.,
$$\frac{1}{e^x - 1} = \frac1x - \frac12 + \frac1{12}x - \frac1{720}x^3 + \frac1{30240}x^5 - \cdots. \tag{7.182}$$
From the above expansion one can write
$$\frac{1}{E - 1} = \frac{1}{e^{hD} - 1} \quad (\text{since } E = e^{hD}) = \frac{1}{hD} - \frac12 + \frac1{12}(hD) - \frac1{720}(hD)^3 + \frac1{30240}(hD)^5 - \cdots.$$
Note that $(E^n - 1)f(x_0) = f(x_0 + nh) - f(x_0) = f(x_n) - f(x_0)$.
Now,
$$\frac{E^n - 1}{E - 1}f(x_0) = \Big[\frac{1}{hD} - \frac12 + \frac1{12}(hD) - \frac1{720}(hD)^3 + \frac1{30240}(hD)^5 - \cdots\Big](E^n - 1)f(x_0).$$
That is,
$$(E^{n-1} + E^{n-2} + \cdots + E + 1)f(x_0) = \frac{1}{hD}[f(x_n) - f(x_0)] - \frac12[f(x_n) - f(x_0)] + \frac{h}{12}[f'(x_n) - f'(x_0)]$$
$$\qquad - \frac{h^3}{720}[f'''(x_n) - f'''(x_0)] + \frac{h^5}{30240}[f^{v}(x_n) - f^{v}(x_0)] + \cdots. \tag{7.183}$$
Also,
$$\sum_{r=0}^{n-1}E^rf(x_0) = \sum_{r=0}^{n-1}f(x_0 + rh) = \sum_{r=0}^{n-1}f(x_r) = \sum_{r=0}^{n}f(x_r) - f(x_n),$$
and
$$\frac{1}{hD}[f(x_n) - f(x_0)] = \frac1h\int_{x_0}^{x_n}f(x)\,dx.$$
Hence (7.183) becomes
$$\sum_{r=0}^{n}f(x_r) - f(x_n) = \frac1h\int_{x_0}^{x_n}f(x)\,dx - \frac12[f(x_n) - f(x_0)] + \frac{h}{12}[f'(x_n) - f'(x_0)] - \frac{h^3}{720}[f'''(x_n) - f'''(x_0)] + \frac{h^5}{30240}[f^{v}(x_n) - f^{v}(x_0)] + \cdots.$$
That is,
$$\sum_{r=0}^{n}f(x_r) = \frac1h\int_{x_0}^{x_n}f(x)\,dx + \frac12[f(x_n) + f(x_0)] + \frac{h}{12}[f'(x_n) - f'(x_0)] - \frac{h^3}{720}[f'''(x_n) - f'''(x_0)] + \frac{h^5}{30240}[f^{v}(x_n) - f^{v}(x_0)] + \cdots. \tag{7.184}$$
This formula may also be used as a quadrature formula when it is written as
$$\int_{x_0}^{x_n}f(x)\,dx = h\Big[\frac12f(x_0) + f(x_1) + \cdots + f(x_{n-1}) + \frac12f(x_n)\Big] - \frac{h^2}{12}[f'(x_n) - f'(x_0)] + \frac{h^4}{720}[f'''(x_n) - f'''(x_0)] - \frac{h^6}{30240}[f^{v}(x_n) - f^{v}(x_0)] + \cdots. \tag{7.185}$$
Solution. We have to determine the value of $\sum_{x=1}^{100}\frac1x$.
Let $f(x) = \frac1x$ and $h = 1$, $x_0 = 1$, $x_n = 100$.
Then $f'(x) = -\frac{1}{x^2}$, $f''(x) = \frac{2}{x^3}$, $f'''(x) = -\frac{6}{x^4}$, $f^{iv}(x) = \frac{24}{x^5}$, $f^{v}(x) = -\frac{120}{x^6}$.
Now,
$$\int_{x_0}^{x_n}f(x)\,dx = \int_1^{100}\frac1x\,dx = \log 100 = 4.6051702,$$
$$f(x_n) + f(x_0) = \frac1{100} + 1 = 1.01,\quad f'(x_n) - f'(x_0) = -\frac1{100^2} + 1 = 0.9999,$$
$$f'''(x_n) - f'''(x_0) = -\frac{6}{100^4} + 6 = 5.999999,\quad f^{v}(x_n) - f^{v}(x_0) = -\frac{120}{100^6} + 120 = 120.$$
Hence
$$\sum_{x=1}^{100}f(x) = \frac1h\int_1^{100}f(x)\,dx + \frac12[f(x_n) + f(x_0)] + \frac{h}{12}[f'(x_n) - f'(x_0)] - \frac{h^3}{720}[f'''(x_n) - f'''(x_0)] + \frac{h^5}{30240}[f^{v}(x_n) - f^{v}(x_0)] + \cdots$$
$$= 4.6051702 + \frac12\times1.01 + \frac1{12}\times0.9999 - \frac1{720}\times5.999999 + \frac1{30240}\times120 = 5.1891301.$$
Hence the approximate value of the given sum is 5.189130. (The directly computed sum is 5.1873775; with h = 1 the terms of this asymptotic series stop decreasing beyond the ones retained above, so the truncated series is in fact accurate only to about two decimal places here.)
Example 7.16.2 Using the relation $\frac{\pi^2}{6} = \sum_{x=1}^{\infty}\frac{1}{x^2}$, compute the value of $\pi^2$ correct to six decimals.

Solution. Let $f(x) = \frac{1}{x^2}$. Here $h = 1$, $x_0 = 1$, $x_n = \infty$.
Then $f'(x) = -\frac{2}{x^3}$, $f''(x) = \frac{6}{x^4}$, $f'''(x) = -\frac{24}{x^5}$, $f^{iv}(x) = \frac{120}{x^6}$, $f^{v}(x) = -\frac{720}{x^7}$.
Now, $\int_{x_0}^{x_n}f(x)\,dx = \int_1^{\infty}\frac{1}{x^2}\,dx = 1$,
$f(x_n) + f(x_0) = f(\infty) + f(1) = 1$, $f'(x_n) - f'(x_0) = f'(\infty) - f'(1) = 2$,
Example 7.16.4 Find the value of $\int_1^5\frac{dx}{1+x}$ using Maclaurin's sum formula.

Solution. Rearranging (7.185),
$$\int_{x_0}^{x_n}f(x)\,dx = h\sum_{x=x_0}^{x_n}f(x) - \frac{h}{2}[f(x_n) + f(x_0)] - \frac{h^2}{12}[f'(x_n) - f'(x_0)] + \frac{h^4}{720}[f'''(x_n) - f'''(x_0)] - \cdots.$$
Let $f(x) = \frac{1}{1+x}$, $x_0 = 1$, $x_n = 5$ and $h = 1$.
$$f'(x) = -\frac{1}{(1+x)^2},\quad f''(x) = \frac{2}{(1+x)^3},\quad f'''(x) = -\frac{6}{(1+x)^4}.$$
$$f(x_n) + f(x_0) = \frac16 + \frac12 = 0.6666667,\quad f'(x_n) - f'(x_0) = -\frac{1}{6^2} + \frac{1}{2^2} = 0.2222222,$$
$$f'''(x_n) - f'''(x_0) = -\frac{6}{6^4} + \frac{6}{2^4} = 0.3703704.$$
Also,
$$\sum_{x=x_0}^{x_n}f(x) = f(1.0) + f(2.0) + f(3.0) + f(4.0) + f(5.0) = \frac12 + \frac13 + \frac14 + \frac15 + \frac16 = 1.45.$$
Hence
$$\int_1^5\frac{dx}{1+x} = 1\times1.45 - \frac12\times0.6666667 - \frac1{12}\times0.2222222 + \frac1{720}\times0.3703704 = 1.0986625.$$
The exact value is log 6 − log 2 = 1.0986123. Thus the absolute error is 0.0000502.
Example 7.16.5 Deduce trapezoidal and Simpson’s 1/3 rules from Maclaurin’s
sum formula.
where
$$I_T = \frac{h}{2}[f(x_0) + 2\{f(x_1) + f(x_2) + \cdots + f(x_{n-1})\} + f(x_n)]$$
is the composite trapezoidal rule and
$$E_T = -\frac{h^2}{12}[f'(x_n) - f'(x_0)] - \cdots = -\frac{h^2}{12}(x_n - x_0)f''(\xi_1), \quad x_0 < \xi_1 < x_n \quad\text{[by MVT of differential calculus]}$$
$$= -\frac{nh^3}{12}f''(\xi_1).$$
Thus $\int_{x_0}^{x_n}f(x)\,dx = I_T + E_T$ is the composite trapezoidal rule.
To deduce Simpson's 1/3 rule, taking n as even and replacing h by 2h in (7.185), we have
$$\int_{x_0}^{x_n}f(x)\,dx = 2h\Big[\frac12f(x_0) + f(x_2) + f(x_4) + \cdots + f(x_{n-2}) + \frac12f(x_n)\Big] - \frac{2^2h^2}{12}[f'(x_n) - f'(x_0)] + \frac{2^4h^4}{720}[f'''(x_n) - f'''(x_0)] - \cdots. \tag{7.187}$$
Combining (7.185) and (7.187) so as to cancel the h² error terms,
$$\int_{x_0}^{x_n}f(x)\,dx = \frac{h}{3}[f(x_0) + 4f(x_1) + 2f(x_2) + 4f(x_3) + 2f(x_4) + \cdots + f(x_n)] + \frac13(4 - 2^4)\frac{h^4}{720}[f'''(x_n) - f'''(x_0)] - \cdots = I_S + E_S,$$
where $I_S = \frac{h}{3}[f(x_0) + 4f(x_1) + 2f(x_2) + 4f(x_3) + 2f(x_4) + \cdots + f(x_n)]$ is the composite Simpson's 1/3 rule and
$$E_S = -\frac{h^4}{180}[f'''(x_n) - f'''(x_0)] - \cdots = -\frac{h^4}{180}(x_n - x_0)f^{iv}(\xi_2) \quad\text{[by MVT of differential calculus]}$$
$$= -\frac{nh^5}{90}f^{iv}(\xi_2), \quad x_0 < \xi_2 < x_n \quad [\text{since } (x_n - x_0)/n = 2h]$$
is the error term.
The value of the integration obtained from trapezoidal or Simpson’s rules or other rules
may be improved by repeated use of Richardson’s extrapolation procedure as described
in Section 7.7. This method is known as Romberg’s integration method as he was
the first one to describe the algorithm in recursive form.
We assume that $f \in C^n[a, b]$ for all n; then the error term for the trapezoidal rule can be represented as a series involving only even powers of h, i.e.,
$$I = \int_a^b f(x)\,dx = I_0(h) + c_1h^2 + c_2h^4 + c_3h^6 + \cdots. \tag{7.188}$$
With step size h/2,
$$I = I_0(h/2) + c_1\frac{h^2}{4} + c_2\frac{h^4}{16} + c_3\frac{h^6}{64} + \cdots. \tag{7.189}$$
To eliminate c1, equation (7.189) is multiplied by 4 and equation (7.188) is subtracted from it. Thus
$$3I = 4I_0(h/2) - I_0(h) - \frac34c_2h^4 - \frac{15}{16}c_3h^6 - \cdots.$$
Dividing this equation by 3 and denoting the coefficients of h⁴, h⁶, etc. by d1, d2, etc., we get
$$I = \frac{4I_0(h/2) - I_0(h)}{3} + d_1h^4 + d_2h^6 + \cdots. \tag{7.190}$$
The quantity $\frac{4I_0(h/2) - I_0(h)}{3}$ is denoted by $I_1(h/2)$; this is the first Romberg's improvement. With this notation equation (7.190) can be written as
$$I = I_1(h/2) + d_1h^4 + d_2h^6 + \cdots. \tag{7.191}$$
Halving the step once more,
$$I = I_1(h/2^2) + d_1\frac{h^4}{16} + d_2\frac{h^6}{64} + \cdots. \tag{7.192}$$
Now, eliminating d1, we obtain
$$15I = 16I_1(h/2^2) - I_1(h/2) - \frac34d_2h^6 + \cdots.$$
That is, $I = I_2(h/2^2) + \cdots$, where
$$I_2(h/2^2) = \frac{16I_1(h/2^2) - I_1(h/2)}{15} = \frac{4^2I_1(h/2^2) - I_1(h/2)}{4^2 - 1} \tag{7.194}$$
is the second Romberg's improvement.
In general,
$$I_m(h/2^m) = \frac{4^mI_{m-1}(h/2^m) - I_{m-1}(h/2^{m-1})}{4^m - 1}. \tag{7.195}$$
Now,
$$I_1(h/2) = \frac{4I_0(h/2) - I_0(h)}{3} = \frac13\Big[4\cdot\frac{h}{4}(y_0 + 2y_1 + y_2) - \frac{h}{2}(y_0 + y_2)\Big] = \frac{h/2}{3}[y_0 + 4y_1 + y_2].$$
This is the Simpson's 1/3 rule with step size h/2. That is, the first improved value is equal to the value obtained by Simpson's 1/3 rule.
Now,
The entries in the 0th order are computed directly and the other entries are calculated
by using the formula (7.195). The values along the diagonal would converge to the
integral.
The advantage of Romberg's method is that it gives much more accurate results than the usual composite trapezoidal rule. A computational weakness of the method is that twice as many function evaluations are needed to decrease the error from O(h^{2n}) to O(h^{2n+2}). Practically, the computations are carried out row-wise until the desired accuracy is achieved.
Note 7.17.1 If the 0th order (starting) approximation is calculated using Simpson’s
rule then first order approximation I1 gives the Boole’s approximation and so on.
Example 7.17.1 Find the value of $\int_0^1\frac{dx}{1+x^2}$ using Romberg's method starting with the trapezoidal rule.
n h I0 I1 I2 I3
1 1.000 0.75000
2 0.500 0.77500 0.78333
4 0.250 0.78280 0.78540 0.78554
8 0.125 0.78475 0.78540 0.78540 0.78540
The exact integral is π/4 = 0.78540. In each column the numbers are converging to
the value 0.78540. The values in Simpson’s rule column (I1 ) converge faster than the
values in the trapezoidal rule column (I0 ). Ultimately, the value of the integral is
0.78540 correct up to five decimal places.
From Table 7.11 it is observed that the elements of the row j are
A sample of input/output:
Enter the limits a and b 0 1
Enter max. number of rows to be computed 10
Romberg integration table
0.885661 0.727834
0.760596 0.718908 0.718313
0.728890 0.718321 0.718282 0.718282
0.720936 0.718284 0.718282 0.718282 0.718282
The value of the integration is 0.718282
can also be evaluated using the methods discussed earlier. Generally, trapezoidal or
Simpson’s methods are used. The integral can be evaluated numerically by two suc-
cessive integrations in x and y directions respectively taking one variable fixed at a
time.
Again, applying the trapezoidal rule to the two integrals on the right hand side, we obtain the trapezoidal formula for the double integral as
$$I = \frac{b-a}{2}\Big[\frac{d-c}{2}\{f(a,c) + f(a,d)\} + \frac{d-c}{2}\{f(b,c) + f(b,d)\}\Big] = \frac{(b-a)(d-c)}{4}[f(a,c) + f(a,d) + f(b,c) + f(b,d)]. \tag{7.199}$$
This expression shows that the integration can be done only if the values of the function f(x, y) are available at the corner points (a, c), (a, d), (b, c) and (b, d) of the rectangular region [a, b; c, d].
The composite rule may also be used to determine the integral (7.197). To do this
the interval [a, b] is divided into n equal subintervals each of length h and the interval
[c, d] into m equal subintervals each of length k. That is,
$$x_i = x_0 + ih,\quad x_0 = a,\ x_n = b,\quad h = \frac{b-a}{n};\qquad y_j = y_0 + jk,\quad y_0 = c,\ y_m = d,\quad k = \frac{d-c}{m}.$$
Now,
$$\int_a^b f(x, y)\,dx \simeq \frac{h}{2}[f(x_0, y) + 2\{f(x_1, y) + f(x_2, y) + \cdots + f(x_{n-1}, y)\} + f(x_n, y)]. \tag{7.200}$$
The equation (7.200) is integrated between c and d, term by term, using the trapezoidal rule. Therefore,
$$I = \int_c^d\int_a^b f(x, y)\,dx\,dy$$
$$= \frac{h}{2}\Big[\frac{k}{2}\{f(x_0, y_0) + 2(f(x_0, y_1) + \cdots + f(x_0, y_{m-1})) + f(x_0, y_m)\}$$
$$\quad + 2\cdot\frac{k}{2}\{f(x_1, y_0) + 2(f(x_1, y_1) + \cdots + f(x_1, y_{m-1})) + f(x_1, y_m)\} + \cdots$$
$$\quad + 2\cdot\frac{k}{2}\{f(x_{n-1}, y_0) + 2(f(x_{n-1}, y_1) + \cdots + f(x_{n-1}, y_{m-1})) + f(x_{n-1}, y_m)\}$$
$$\quad + \frac{k}{2}\{f(x_n, y_0) + 2(f(x_n, y_1) + \cdots + f(x_n, y_{m-1})) + f(x_n, y_m)\}\Big].$$
The computation may be arranged row-wise:
I0 = [f00 + 2(f01 + f02 + ··· + f0,m−1) + f0m]·(k/2)
I1 = [f10 + 2(f11 + f12 + ··· + f1,m−1) + f1m]·(k/2)
···
In = [fn0 + 2(fn1 + fn2 + ··· + fn,m−1) + fnm]·(k/2)
I = (h/2)[I0 + 2(I1 + I2 + ··· + In−1) + In].
In compact form,
$$I = \frac{hk}{4}\Big[f_{00} + f_{0m} + 2\sum_{j=1}^{m-1}f_{0j} + 2\sum_{i=1}^{n-1}\Big(f_{i0} + f_{im} + 2\sum_{j=1}^{m-1}f_{ij}\Big) + f_{n0} + f_{nm} + 2\sum_{j=1}^{m-1}f_{nj}\Big] = \frac{hk}{4}\Big[\mathrm{sum}(0) + 2\sum_{i=1}^{n-1}\mathrm{sum}(i) + \mathrm{sum}(n)\Big],$$
where $\mathrm{sum}(i) = f_{i0} + f_{im} + 2\sum_{j=1}^{m-1}f_{ij}$.
j=1
for j = 1 to m − 1 do
t = t + fij ;
endfor;
return(fi0 + fim + 2 ∗ t);
end sum
end Double Trapezoidal
Program 7.7.
/* Program Trapezoidal for Two Variables
The program to find the double integration of the function
F(x,y)=1/{(1+x*x)(1+y*y)} defined over a rectangular region
[a,b;c,d] by trapezoidal rule. */
#include<stdio.h>
int n,m; float f[15][15];
float sum(int);
float F(float,float);
void main()
{
int i,j; float a,b,c,d,h,k,x,y,result=0;
printf("Enter the limits of x and y; [a,b;c,d] ");
scanf("%f %f %f %f",&a,&b,&c,&d);
printf("Enter the number of subdivisions n,m of x,y ");
scanf("%d %d",&n,&m);
h=(b-a)/n;
k=(d-c)/m;
x=a;
for(i=0;i<=n;i++) /* computation of the function */
{
y=c;
for(j=0;j<=m;j++)
{
f[i][j]=F(x,y);
y+=k;
}
x+=h;
}
for(i=1;i<n;i++) result+=sum(i); /* interior rows only; sum(0) and sum(n) are added below */
result=(sum(0)+2*result+sum(n))*h*k/4;
printf("The value of the integration is %8.5f",result);
}
float sum(int i)
{
float t=0; int j;
for(j=1;j<m;j++) t+=f[i][j];
return(f[i][0]+f[i][m]+2*t);
}
/* definition of the function f(x,y) */
float F(float x,float y)
{
return( (1/(1+x*x))*(1/(1+y*y)));
}
A sample of input/output:
For Simpson's 1/3 rule the row-wise arrangement is
I0 = [f00 + 4f01 + 2f02 + 4f03 + 2f04 + ··· + 4f0,m−1 + f0m]·(k/3)
I1 = [f10 + 4f11 + 2f12 + 4f13 + 2f14 + ··· + 4f1,m−1 + f1m]·(k/3)
···
In = [fn0 + 4fn1 + 2fn2 + 4fn3 + 2fn4 + ··· + 4fn,m−1 + fnm]·(k/3)
I = (h/3)[I0 + 4(I1 + I3 + ··· + In−1) + 2(I2 + I4 + ··· + In−2) + In].
In general,
$$\int_{y_{j-1}}^{y_{j+1}}\int_{x_{i-1}}^{x_{i+1}} f(x, y)\,dx\,dy = \frac{hk}{9}\big[f_{i-1,j-1} + f_{i-1,j+1} + f_{i+1,j-1} + f_{i+1,j+1} + 4(f_{i-1,j} + f_{i,j-1} + f_{i,j+1} + f_{i+1,j}) + 16f_{ij}\big]. \tag{7.203}$$
Values of f(x, y) = 1/(x² + y²):
x \ y   1.00     1.25     1.50     1.75     2.00
1.00 0.50000 0.39024 0.30769 0.24615 0.20000
1.25 0.39024 0.32000 0.26230 0.21622 0.17978
1.50 0.30769 0.26230 0.22222 0.18824 0.16000
1.75 0.24615 0.21622 0.18824 0.16327 0.14159
2.00 0.20000 0.17978 0.16000 0.14159 0.12500
Let x be fixed and y be the varying variable. Then by the trapezoidal rule on each row of the above table, we obtain
$$I_0 = \int_1^2 f(1, y)\,dy = \frac{0.25}{2}[0.50000 + 2(0.39024 + 0.30769 + 0.24615) + 0.20000] = 0.32352.$$
$$I_1 = \int_1^2 f(1.25, y)\,dy = \frac{0.25}{2}[0.39024 + 2(0.32000 + 0.26230 + 0.21622) + 0.17978] = 0.27088.$$
$$I_2 = \int_1^2 f(1.5, y)\,dy = \frac{0.25}{2}[0.30769 + 2(0.26230 + 0.22222 + 0.18824) + 0.16000] = 0.22665.$$
$$I_3 = \int_1^2 f(1.75, y)\,dy = \frac{0.25}{2}[0.24615 + 2(0.21622 + 0.18824 + 0.16327) + 0.14159] = 0.19040.$$
$$I_4 = \int_1^2 f(2.0, y)\,dy = \frac{0.25}{2}[0.20000 + 2(0.17978 + 0.16000 + 0.14159) + 0.12500] = 0.16097.$$
Hence finally,
$$\int_1^2\int_1^2\frac{dx\,dy}{x^2 + y^2} = \frac{h}{2}[I_0 + 2(I_1 + I_2 + I_3) + I_4] = \frac{0.25}{2}[0.32352 + 2(0.27088 + 0.22665 + 0.19040) + 0.16097] = 0.23254.$$
Again, applying Simpson's 1/3 rule to each row of the above table, we have
$$I_0 = \int_1^2 f(1, y)\,dy = \frac{0.25}{3}[0.50000 + 4(0.39024 + 0.24615) + 2(0.30769) + 0.20000] = 0.32175.$$
The Monte Carlo method is used to solve a large number of problems. The name of the method is derived from the city of Monte Carlo in Monaco, famous for its casino. Credit for inventing the Monte Carlo method often goes to Stanislaw Ulam, a Polish mathematician who worked for John von Neumann on the United States' Manhattan Project during World War II. He invented the method in 1946 and the first paper on it was published in 1949.
This method depends on random sampling, and it is not suitable for hand calculation because it requires a large number of random numbers. The algorithm of the Monte Carlo method is prepared to perform only one random trial. The trial is repeated N times, the trials being independent, and the final result is the average of the results obtained in the different trials.
Now, the Monte Carlo method is introduced to find the numerical integration of a single valued function. Suppose the definite integral is
$$I = \int_a^b g(x)\,dx, \tag{7.204}$$
where g(x) is a real valued function defined on [a, b]. The idea is to manipulate the definite integral into a form that can be handled by the Monte Carlo method. To do this, the uniform probability density function (pdf) on [a, b] is defined as
$$f(x) = \begin{cases}\frac{1}{b-a}, & a < x < b\\ 0, & \text{otherwise.}\end{cases}$$
This function is inserted into equation (7.204) to obtain the following expression for I:
$$I = (b - a)\int_a^b g(x)f(x)\,dx. \tag{7.205}$$
It is observed that the integral on the right hand side of equation (7.205) is simply the expectation of g(x) under the uniform pdf. Thus,
$$I = (b - a)\int_a^b g(x)f(x)\,dx = (b - a)\bar g. \tag{7.206}$$
Now, a sample $x_i$ is drawn from the pdf f(x), and for each $x_i$ the value of $g(x_i)$ is calculated. To get a good approximation, a large number of samples is to be chosen; let G be the average of all the values $g(x_i)$, $i = 1, 2, \ldots, N$ (N being the sample size). Then
$$G = \frac1N\sum_{i=1}^{N}g(x_i). \tag{7.207}$$
It is easy to prove that the expectation of the average of N samples is the expectation of g(x), i.e., $\bar G = \bar g$. Hence
$$I = (b - a)G \simeq (b - a)\Big[\frac1N\sum_{i=1}^{N}g(x_i)\Big]. \tag{7.208}$$
[Figure 7.4: Random points are chosen in [a, b]. The value of g(x) is evaluated at each random point.]
Algorithm 7.8 (Monte Carlo integration). This algorithm finds the value of the integral ∫_a^b f(x) dx using the Monte Carlo method based on a sample of size N.
Program 7.8.
/* Program Monte Carlo
Program to find the value of the integration of 1/(1+x^2)
between 0 and 1 by Monte Carlo method for different values of N. */
#include<stdio.h>
#include<stdlib.h>
void main()
{
float g(float x); /* g(x) may be changed accordingly */
float x,y,I,sum=0.0,a=0.0,b=1.0;
int i, N;
srand(100); /* seed for random number */
printf("Enter the sample size ");
scanf("%d",&N);
for(i=0;i<N;i++)
{ /* rand() generates a random number between 0 and RAND_MAX */
y=(float)rand()/RAND_MAX;
/* generates a random number between 0 and 1*/
x=a+(b-a)*y;
sum+=g(x);
}
I=sum*(b-a)/N;
printf("%f",I);
}
/* definition of function */
float g(float x)
{
return(1/(1+x*x));
}
The results obtained for different values of N are tabulated below; the exact value of the integral is π/4 ≈ 0.785398.
N Integration
500 0.790020
1000 0.789627
1500 0.786979
3000 0.786553
4000 0.784793
10000 0.784094
15000 0.782420
Choice of method
In this chapter, three types of integration techniques viz., Newton-Cotes, Gaussian and
Monte Carlo are discussed. These methods are computationally different from each
other.
No definite rule can be given for choosing an integration method, but the following points may be kept in mind while choosing one.
(i) The simplest but crudest integration formula is the trapezoidal rule. This method
gives a rough value of the integral. When a rough value is required then this
method may be used.
(ii) Simpson's 1/3 rule gives more accurate results and is also simple. Practically, this is the most widely used formula. Thus, if f (x) does not fluctuate rapidly and is explicitly known, then this method with a suitable subinterval can be used. If high accuracy is required, then the Gaussian quadrature may be used.
(iii) Double integration may be done by using Simpson’s rule with suitable subintervals.
(iv) If the integrand is known at some unequally spaced points, then the trapezoidal
rule along with Romberg’s integration is useful.
(v) If the integrand is violently oscillating or fluctuating then the Monte Carlo method
can be used.
Example 7.20.1 The arc length of the curve y = f(x) over the interval a ≤ x ≤ b is ∫_a^b √(1 + [f′(x)]²) dx. For the function f(x) = x³, 0 ≤ x ≤ 1, find the approximate arc length using the composite trapezoidal and Simpson's 1/3 rules with n = 10.
Solution. The arc length is I = ∫_0^1 √(1 + [f′(x)]²) dx = ∫_0^1 √(1 + 9x⁴) dx.
Since n = 10, h = (1 − 0)/10 = 0.1. By Simpson's 1/3 rule,
I = (h/3)[y_0 + 4(y_1 + y_3 + ··· + y_9) + 2(y_2 + y_4 + ··· + y_8) + y_10]
= (0.1/3)[1.00000 + 4(1.00045 + 1.03581 + 1.25000 + 1.77789 + 2.62772)
+ 2(1.00717 + 1.10923 + 1.47187 + 2.16481) + 3.16228]
= 1.54786.
Example 7.20.2 Find the number n and the step size h such that the error for the composite Simpson's 1/3 rule is less than 5 × 10⁻⁷ when evaluating ∫_2^5 log x dx.
Solution. Let f(x) = log x. Then f′(x) = 1/x, f″(x) = −1/x², f‴(x) = 2/x³, f^{iv}(x) = −6/x⁴.
The maximum value of |f^{iv}(x)| = 6/x⁴ over [2, 5] occurs at the end point x = 2.
Thus |f^{iv}(ξ)| ≤ |f^{iv}(2)| = 3/8, 2 ≤ ξ ≤ 5.
The error E of Simpson's 1/3 rule is E = −(b − a) f^{iv}(ξ) h⁴/180.
Therefore,
|E| = |(b − a) f^{iv}(ξ) h⁴/180| ≤ [(5 − 2) h⁴/180] · (3/8) = h⁴/160.
Also h = (b − a)/n = 3/n. Thus |E| ≤ (1/160)(3/n)⁴ = 81/(160 n⁴) ≤ 5 × 10⁻⁷.
That is, n⁴ ≥ (81/160) · [1/(5 × 10⁻⁷)], or n ≥ 31.72114.
Since n is an integer, we choose n = 32 and the corresponding step size is
h = 3/n = 3/32 = 0.09375.
Solution. If the above formula gives exact results for polynomials of degree up to 3, let f(x) = 1, x, x², x³. Substituting f(x) = 1, x, x², x³ successively into the equation
∫_0^n f(x) dx = w_1 f(0) + w_2 f(n) + w_3 f(2n) + w_4 f(3n),
we get
n = w_1 + w_2 + w_3 + w_4,  n²/2 = n w_2 + 2n w_3 + 3n w_4,
n³/3 = n² w_2 + 4n² w_3 + 9n² w_4,  n⁴/4 = n³ w_2 + 8n³ w_3 + 27n³ w_4.
The solution is w_1 = 3n/8, w_2 = 19n/24, w_3 = −5n/24, w_4 = n/24.
Hence ∫_0^n f(x) dx = n[(3/8) f(0) + (1/24){19f(n) − 5f(2n) + f(3n)}].
Solution. Let the formula be
∫_0^1 f(x) dx = w_1 f(0) + w_2 f(1) + w_3 f(2).
The formula is exact for f(x) = 1, x, x². Substituting f(x) = 1, x, x² successively into the above formula, we find the following set of equations:
1 = w_1 + w_2 + w_3,  1/2 = w_2 + 2w_3,  1/3 = w_2 + 4w_3.
The solution of these equations is w_1 = 5/12, w_2 = 2/3, w_3 = −1/12.
Hence the formula becomes ∫_0^1 f(x) dx = (1/12)[5f(0) + 8f(1) − f(2)].
Example 7.20.7 Write down the quadrature polynomial which takes the same values as f(x) at x = −1, 0, 1 and integrate it to obtain the integration formula
∫_{−1}^{1} f(x) dx = (1/3)[f(−1) + 4f(0) + f(1)].
Assuming the error to have the form A f^{iv}(ξ), −1 < ξ < 1, find the value of A.
∫_{−1}^{1} f(x) dx = f(−1) · (1/2) · (2/3) − f(0) · (−4/3) + f(1) · (1/2) · (2/3)
= (1/3)[f(−1) + 4f(0) + f(1)].
Here the error is of the form A f^{iv}(ξ). Let the error be E = A f^{iv}(ξ) = (C/4!) f^{iv}(ξ), −1 < ξ < 1. This indicates that the above formula gives exact results for polynomials of degree up to 3 and has an error for a polynomial of degree 4. Let f(x) = x⁴.
Then the value of ∫_{−1}^{1} x⁴ dx obtained from the above formula is (1/3)[(−1)⁴ + 4 · 0⁴ + 1⁴] = 2/3, while the exact value is ∫_{−1}^{1} x⁴ dx = 2/5.
Therefore, C = ∫_{−1}^{1} x⁴ dx − 2/3 = 2/5 − 2/3 = −4/15.
Hence A = C/4! = −1/90.
Example 7.20.8 Derive Simpson's 1/3 rule using the method of undetermined coefficients.
Solution. Let
I = ∫_{x_0}^{x_2} f(x) dx = ∫_{−h}^{h} f(z + x_1) dz, where z = x − x_1,
= ∫_{−h}^{h} F(z) dz = w_1 F(−h) + w_2 F(0) + w_3 F(h), F(z) = f(z + x_1).
7.21 Exercise
2. A slider in a machine moves along a fixed straight rod. Its distance x cm along
the rod is given below for various values of time t (second). Find the velocity of
the slider and its acceleration when t = 0.3 sec.
Use the formula based on Newton’s forward difference interpolation to find the
velocity and acceleration.
3. Use approximate formulae to find the values of y′(2) and y″(2) from the following table:
4. Find the values of f′(5) and f″(5) from the following table:
x : 1 2 3 4 5
f (x) : 10 26 50 82 122
5. Find the values of y′(1), y′(1.2), y′(4), y′(3.9) from the following values
x : 1 2 3 4
y : 0.54030 –0.41615 –0.98999 –0.65364
6. Use two-point and three-point formulae to find the values of f′(2.0) and f″(2.0).
7. Deduce the following relations between the differential and finite difference operators:
(a) D ≡ (1/h)[∆ − ∆²/2 + ∆³/3 − ∆⁴/4 + ···]
(b) D ≡ (1/h)[∇ + ∇²/2 + ∇³/3 + ∇⁴/4 + ···]
(c) D² ≡ (1/h²)[∆² − ∆³ + (11/12)∆⁴ − (5/6)∆⁵ + (137/180)∆⁶ − ···]
(d) D² ≡ (1/h²)[∇² + ∇³ + (11/12)∇⁴ + (5/6)∇⁵ + (137/180)∇⁶ + ···]
8. Use Taylor's series to deduce the formula
f′(x_0) = [f(x_0 + h) − f(x_0 − h)]/(2h)
with h = 0.4, 0.2 and 0.1, to find initial values.
14. Find the value of ∫_0^2 (1 + e^{−x} sin 4x) dx using the basic (non-composite) rules of (i) trapezoidal, (ii) Simpson's 1/3, (iii) Simpson's 3/8, (iv) Boole's, and (v) Weddle's.
15. Consider f(x) = 3 + sin(2√x). Use the composite trapezoidal and Simpson's 1/3 rules with 11 points to compute an approximation to the integral of f(x) taken over [0, 5].
using Simpson’s 1/3 rule, by dividing the interval of integration into eight equal
parts.
20. A curve is drawn to pass through the points given by the following table:
Use Simpson's 1/3 rule to estimate the area bounded by the curve, the lines x = 1 and x = 4, and the x-axis.
23. Find the number of subintervals n and the step size h so that the error for the composite trapezoidal rule is less than 5 × 10⁻⁴ for the approximation of ∫_1^5 sin x dx.
24. Verify that the Simpson’s 1/3 rule is exact for polynomial of degree less than or
equal to 3 of the form f (x) = a0 + a1 x + a2 x2 + a3 x3 over [0,2].
26. The solid of revolution obtained by rotating the region under the curve y = f(x), a ≤ x ≤ b, about the x-axis has surface area given by
area = 2π ∫_a^b f(x) √(1 + [f′(x)]²) dx.
Find the surface area for the functions (a) f(x) = cos x, 0 ≤ x ≤ π/4, (b) f(x) = log x, 1 ≤ x ≤ 5, using the trapezoidal and Simpson's 1/3 rules with 10 subintervals.
27. Show that the Simpson's 1/3 rule produces exact results for the functions f(x) = x² and f(x) = x³, that is, (a) ∫_a^b x² dx = b³/3 − a³/3 and (b) ∫_a^b x³ dx = b⁴/4 − a⁴/4, by taking four subintervals.
28. A rocket is launched from the ground. Its acceleration a(t), measured every 5 seconds, is tabulated below. Find the velocity and the position of the rocket at t = 30 seconds. Use the trapezoidal as well as Simpson's rules and compare the answers.
t : 0 5 10 15 20 25 30
a(t) : 40.0 46.50 49.25 52.25 55.75 58.25 60.50
29. Find the maximum step size h to get the truncation error bound 0.5 × 10⁻³ when finding the value of log(0.75) by the trapezoidal rule.
30. Expand F(x), where F′(x) = f(x), by Taylor's series about x_0 + h/2 and establish the midpoint rule
∫_{x_0}^{x_1} f(x) dx = h f(x_0 + h/2) + (h³/24) f″(ξ), x_0 < ξ < x_1, h = x_1 − x_0.
31. Complete the following table to compute the value of the integral ∫_0^3 [sin 2x/(1 + x⁵)] dx using Romberg integration.
I0 I1 I2 I3
Trapezoidal rule Simpson's rule Boole's rule Third improvement
−0.00171772
0.02377300 ···
0.60402717 ··· ···
0.64844713 ··· ··· ···
0.66591329 ··· ··· ···
32. Find the values of the following integrals using Romberg integration starting from the trapezoidal rule, correct up to five decimal places:
(a) ∫_1^2 √(4x − x²) dx, (b) ∫_{1/(2π)}^{2} sin(1/x) dx.
40. Using the method of undetermined coefficients, derive the following formulae:
(a) ∫_0^{2π} f(x) sin x dx = f(0) − f(2π)
(b) ∫_0^h y dx = (h/2)(y_0 + y_1).
42. Find the values of a, b, c such that the truncation error in the formula
∫_{−h}^{h} f(x) dx = h[a f(−h) + b f(0) + c f(h)]
is minimized.
46. Use the Euler-Maclaurin formula to find the value of π from the relation
π/4 = ∫_0^1 dx/(1 + x²).
47. Use the Euler-Maclaurin formula to find the values of the following series:
(a) 1/51² + 1/53² + 1/55² + ··· + 1/99²
(b) 1/11² + 1/12² + 1/13² + ··· + 1/99²
(c) 1/1 + 1/2 + 1/3 + ··· + 1/20.
49. Use the identity π²/6 = Σ_{n=1}^{∞} 1/n² to compute π².
50. Use the Euler-Maclaurin formula to find the value of ∫_0^1 x³ dx.
51. Use Euler-Maclaurin formula to deduce trapezoidal and Simpson’s 1/3 rules.
where R is the square region with corners (1, 1), (2, 1), (2, 2), (1, 2).
54. Use Simpson's 1/3 rule to compute the integral ∫_0^1 ∫_0^1 [sin xy/(1 + xy)] dx dy with h = k = 0.25.
55. Use the Monte Carlo method to find the value of ∫_1^5 [x/(x + cos x)] dx, taking sample size N = 10.
Chapter 8
Ordinary Differential Equations
Consider the first order ordinary differential equation
dy/dx = f(x, y) (8.1)
with the initial condition
y(x_0) = y_0. (8.2)
The solution of a differential equation can be obtained in one of the following two forms:
(i) A series solution for y in terms of powers of x, from which the values of y can be determined by substituting x = x_0, x_1, . . . , x_n.
problems (IVPs). The problems of order two or more, for which conditions are given at two or more points, are called boundary value problems (BVPs).
A solution of an ordinary differential equation may not always exist. A sufficient condition for the existence of a unique solution is stated below.
(ii) there exists a constant L such that for any x ∈ [x0 , b] and for any two numbers y1
and y2
|f (x, y1 ) − f (x, y2 )| ≤ L|y1 − y2 |,
where L is called the Lipschitz constant.
The methods for finding an approximate solution of an initial value problem are referred to as difference methods or discrete variable methods. The solutions are determined at a set of discrete points called a grid or mesh of points.
The errors committed in solving an initial value problem are of two types– discretiza-
tion and round-off. The discretization error, again, are of two types – global dis-
cretization error and local discretization error. The global discretization error
or global truncation error E_i^{global} is defined as
E_i^{global} = Y(x_i) − y(x_i), (8.3)
that is, it is the difference between the exact solution Y(x_i) and the solution y(x_i) obtained by a discrete variable method.
The local discretization error or local truncation error E_{i+1}^{local} is defined by
E_{i+1}^{local} = y(x_{i+1}) − y_{i+1}, i = 0, 1, 2, . . . , (8.4)
where y(x) is the exact solution through (x_i, y_i). This error is generated at each step in moving from x_i to x_{i+1}. The error at the end of the interval is called the final global error (F.G.E.)
(i) Stable numerical scheme: the numerical solution does not blow up for any choice of step size.
(ii) Unstable numerical scheme: the numerical solution blows up for any choice of step size.
(iii) Conditionally stable numerical scheme: the numerical solution remains bounded for certain choices of step size.
A finite difference method is convergent if the solution of the finite difference equation approaches a limit as the size of the grid spacing tends to zero. But, there is no
guarantee, in general, that this limit corresponds to the exact solution of the differential
equation.
A finite difference equation is consistent with a differential equation if the differ-
ence between the solution of the finite difference equation and those of the differential
equation tends to zero as the size of the grid spacing tends to zero independently.
If the value of yi+1 depends only on the value of yi then the method is called single-
step method and if two or more values are required to evaluate the value of yi+1 then
the method is known as two-step or multistep method.
Again, if the value of yi+1 depends only on the values of yi , h and f (xi , yi ) then the
method used to determine yi+1 is called explicit method, otherwise the method is
called implicit method.
The Taylor’s series method is the most fundamental method and it is the standard to
which we compare the accuracy of the various other numerical methods for solving an
initial value problem.
Error
The local truncation error of the Taylor's series method is of the order O(h^{n+1}), so the final global error is O(h^n). Thus, for large values of n the error becomes small. If n is fixed, then the step size h is chosen in such a way that the global error becomes as small as desired.
Example 8.1.1 Use Taylor's series method to solve the equation
dy/dx = x − y, y(0) = 1
at any point x and at x = 0.1.
Solution. The Taylor's series for y(x) about x = 0 is
y(x) = y_0 + x y_0′ + (x²/2!) y_0″ + (x³/3!) y_0‴ + (x⁴/4!) y_0^{iv} + ··· .
Differentiating y successively with respect to x and substituting x = 0, we obtain
y′(x) = x − y,  y_0′ = −1
y″(x) = 1 − y′,  y_0″ = 1 − y_0′ = 2
y‴(x) = −y″,  y_0‴ = −2
y^{iv}(x) = −y‴,  y_0^{iv} = 2
and so on.
The Taylor's series becomes
y(x) = 1 − x + (x²/2!) · 2 + (x³/3!) · (−2) + (x⁴/4!) · 2 + ···
= 1 − x + x² − x³/3 + x⁴/12 − ··· .
This is the Taylor's series solution of the given differential equation at any point x. Now,
y(0.1) = 1 − (0.1) + (0.1)² − (0.1)³/3 + (0.1)⁴/12 − ··· = 0.909675.
Thus
y(x) = y_0 + ∫_{x_0}^{x} f(x, y) dx. (8.11)
The value of y is replaced by y_0 in the right hand side of (8.11); let this solution be y^{(1)}(x), the first approximation of y, i.e.,
y^{(1)}(x) = y_0 + ∫_{x_0}^{x} f(x, y_0) dx. (8.12)
Again, y^{(1)}(x) from (8.12) is substituted for y in the right hand side of (8.11), and the second approximation y^{(2)}(x) is obtained as
y^{(2)}(x) = y_0 + ∫_{x_0}^{x} f(x, y^{(1)}) dx. (8.13)
y^{(3)}(x) = y_0 + ∫_{x_0}^{x} f(x, y^{(2)}(x)) dx = ∫_0^x [x² + x³/3 + x⁴/(3·4)] dx
= x³/3 + x⁴/(3·4) + x⁵/(3·4·5).
Similarly,
y^{(4)}(x) = x³/3 + x⁴/(3·4) + x⁵/(3·4·5) + x⁶/(3·4·5·6).
Now,
This is the simplest but crudest method to solve a differential equation of the form
dy/dx = f(x, y), y(x_0) = y_0. (8.14)
Let x1 = x0 + h, where h is small. Then by Taylor’s series
y_1 = y(x_0 + h) = y_0 + h (dy/dx)_{x_0} + (h²/2) (d²y/dx²)_{c_1},
where c_1 lies between x_0 and x_1,
= y_0 + h f(x_0, y_0) + (h²/2) y″(c_1). (8.15)
If the step size h is chosen small enough, then the second-order term may be neglected
and hence y1 is given by
y1 = y0 + hf (x0 , y0 ). (8.16)
Similarly,
y2 = y1 + hf (x1 , y1 ) (8.17)
y3 = y2 + hf (x2 , y2 ) (8.18)
and so on.
In general,
yn+1 = yn + hf (xn , yn ), n = 0, 1, 2, . . . (8.19)
This method is very slow. To get reasonable accuracy with Euler's method, the value of h should be taken very small.
It may be noted that the Euler’s method is a single-step explicit method.
Example 8.3.1 Find the values of y(0.1) and y(0.2) from the following differential
equation
dy
= x2 + y 2 with y(0) = 1.
dx
Solution. Let h = 0.05, x_0 = 0, y_0 = 1.
Then
x_1 = x_0 + h = 0.05
y_1 = y(0.05) = y_0 + h f(x_0, y_0) = 1 + 0.05 × (0² + 1²) = 1.05
x_2 = x_1 + h = 0.1
y_2 = y(0.1) = y_1 + h f(x_1, y_1) = 1.05 + 0.05 × (0.05² + 1.05²) = 1.105250
x_3 = x_2 + h = 0.15
y_3 = y(0.15) = y_2 + h f(x_2, y_2) = 1.105250 + 0.05 × (0.1² + 1.105250²) = 1.166829
x_4 = x_3 + h = 0.2
y_4 = y(0.2) = y_3 + h f(x_3, y_3) = 1.166829 + 0.05 × (0.15² + 1.166829²) = 1.236028
Hence y(0.1) = 1.105250, y(0.2) = 1.236028.
Error
The local truncation error of Euler's method is O(h²); this follows immediately from (8.15). The neglected term at each step is (h²/2) y″(c_i). Then at the end of the interval [x_0, x_n], after n steps, the global error is
Σ_{i=1}^{n} (h²/2) y″(c_i) ≈ n (h²/2) y″(c) = (nh) y″(c) (h/2) = (x_n − x_0) y″(c) (h/2) = O(h).
[Figure: Euler's method — the polygonal approximations y_0, y_1, y_2, y_3 to the solution curve y = y(x) at x_0, x_1, x_2, x_3.]
Algorithm 8.1 (Euler’s method). This algorithm finds the solution of the equa-
tion y = f (x, y) with y(x0 ) = y0 over the interval [x0 , xn ], by Euler’s method
yi+1 = yi + hf (xi , yi ), i = 0, 1, 2, . . . , n − 1.
Algorithm Euler
Input function f (x, y)
Read x0 , y0 , xn , h //x0 , y0 are the initial values and xn is the last value of x//
//where the process will terminate; h is the step size//
for x = x0 to xn step h do
y = y0 + h ∗ f (x, y0 );
Print x, y;
y0 = y;
endfor;
end Euler
Program 8.1.
/* Program Euler
Solution of a differential equation of the form y’=f(x,y),
y(x0)=y0 by Euler’s method. */
#include<stdio.h>
#include<math.h>
void main()
{
float x0,y0,xn,h,x,y;
A sample of input/output:
In Euler’s method there is no scope to improve the value of y. The improvement can
be done using modified Euler’s method.
Now, consider the differential equation
dy
= f (x, y) with y(x0 ) = y0 . (8.20)
dx
which gives
y_1 = y_0 + ∫_{x_0}^{x_1} f(x, y) dx. (8.21)
The integration on the right hand side can be done using any numerical method. If the trapezoidal rule is used with step size h (= x_1 − x_0), then the above equation becomes
y(x_1) = y(x_0) + (h/2)[f(x_0, y(x_0)) + f(x_1, y(x_1))]. (8.22)
Note that the right hand side of (8.22) involves an unknown quantity y(x_1). This value can be determined by the Euler's method. Let us denote this value by y^{(0)}(x_1) and the value obtained from (8.22) by y^{(1)}(x_1). Then the resulting formula for finding y_1 is
y^{(1)}(x_1) = y(x_0) + (h/2)[f(x_0, y(x_0)) + f(x_1, y^{(0)}(x_1))].
That is,
y_1^{(0)} = y_0 + h f(x_0, y_0)
y_1^{(1)} = y_0 + (h/2)[f(x_0, y_0) + f(x_1, y_1^{(0)})]. (8.23)
This is the first approximation of y_1. The second approximation is
y_1^{(2)} = y_0 + (h/2)[f(x_0, y_0) + f(x_1, y_1^{(1)})]. (8.24)
The (k + 1)th approximate value of y_1 is
y_1^{(k+1)} = y_0 + (h/2)[f(x_0, y_0) + f(x_1, y_1^{(k)})], k = 0, 1, 2, . . . . (8.25)
In general,
y_{i+1}^{(0)} = y_i + h f(x_i, y_i)
y_{i+1}^{(k+1)} = y_i + (h/2)[f(x_i, y_i) + f(x_{i+1}, y_{i+1}^{(k)})], (8.26)
k = 0, 1, 2, . . . ; i = 0, 1, 2, . . . .
The iterations are continued until two successive approximations y_{i+1}^{(k)} and y_{i+1}^{(k+1)} coincide to the desired accuracy. The iterations converge rapidly for sufficiently small spacing h.
Example 8.4.1 Determine the value of y when x = 0.1 and 0.2 given that y(0) = 1 and y′ = x² − y.
through C(x_1, y_1^{(0)}) but with a slope equal to the average of f(x_0, y_0) and f(x_1, y_1^{(0)}). The line L through (x_0, y_0) and parallel to L_1 is the approximation used to find the improved value y_1^{(1)}. The ordinate of the point B is the approximate value y_1^{(1)}.
[Figure: geometrical interpretation of the modified Euler's method — the solution curve y = y(x), the tangent at A(x_0, y_0), the point C(x_1, y_1^{(0)}), and the point B with ordinate y_1^{(1)}.]
Again, the formula (8.23) can be interpreted as follows, by writing it in the form
y_1^{(1)} − y_0 = (h/2)[f(x_0, y_0) + f(x_1, y_1^{(0)})]. (8.27)
The first improvement, i.e., the right hand side of (8.27), is the area of the trapezoid (see Figure 8.3) with vertices (x_0, 0), (x_0, f(x_0, y_0)), (x_1, f(x_1, y_1^{(0)})) and (x_1, 0).
[Figure 8.3: the trapezoid under z = f(x, y) between the ordinates f(x_0, y_0) at x_0 and f(x_1, y_1^{(0)}) at x_1.]
Error
The error of the trapezoidal rule is −(h³/12) y‴(c_i). So the local truncation error of the modified Euler's method is O(h³).
After n steps, the accumulated error of this method is
Σ_{i=1}^{n} [−(h³/12) y‴(c_i)] ≈ −[(x_n − x_0)/12] y‴(c) h² = O(h²).
Algorithm 8.2 (Modified Euler’s method). This algorithm solves the initial
value problem y = f (x, y) with y(x0 ) = y0 over the interval [x0 , xn ] with step size h.
The formulae are given by
y_{i+1}^{(0)} = y_i + h f(x_i, y_i)
y_{i+1}^{(k)} = y_i + (h/2)[f(x_i, y_i) + f(x_{i+1}, y_{i+1}^{(k−1)})], for k = 1, 2, . . .
Algorithm Modified Euler
Input function f (x, y);
Read x0 , xn , y0 , h; //initial and final values of x, initial value of y and step size h.//
Read ε; //ε is the error tolerance.//
Set y = y0 ;
for x = x0 to xn step h do
Compute f1 = f (x, y);
Compute yc = y + h ∗ f1 ; //evaluated from Euler’s method//
do
Set yp = yc ;
Compute yc = y + (h/2)[f1 + f (x + h, yp )] //modified Euler’s method//
while (|yp − yc | > ε) //check for accuracy//
Reset y = yc ;
Print x, y;
endfor;
end Modified Euler
Program 8.2.
/* Program Modified Euler
Solution of a differential equation of the form y’=f(x,y),
y(x0)=y0 by Modified Euler’s method. */
#include<stdio.h>
#include<math.h>
void main()
{
float x0,y0,xn,h,x,y;/*x0, xn the initial and final values of x*/
/* y0 initial value of y, h is the step length */
float eps=1e-5; /* the error tolerance */
float yc,yp,f1;
float f(float x, float y);
printf("Enter the initial (x0) and final (xn) values of x ");
scanf("%f %f",&x0,&xn);
printf("Enter initial value of y ");
scanf("%f",&y0);
printf("Enter step length h ");
scanf("%f",&h);
printf(" x-value y-value\n");
y=y0;
for(x=x0;x<xn;x+=h)
{
f1=f(x,y);
yc=y+h*f1; /* evaluated by Euler’s method */
do
{
yp=yc;
yc=y+h*(f1+f(x+h,yp))/2; /*modified Euler’s method*/
}while(fabs(yp-yc)>eps);
y=yc;
printf("%f %f\n",x+h,y);
}
} /* main */
/* definition of the function f(x,y) */
float f(float x, float y)
{
return(x*x-2*y+1);
}
A sample of input/output:
x-value y-value
0.100000 0.909546
0.200000 0.837355
0.300000 0.781926
0.400000 0.742029
0.500000 0.716660
The Euler's method is less efficient in practical problems because if h is not sufficiently small then it gives inaccurate results.
The Runge-Kutta methods give more accurate results. One advantage of these methods is that they require only values of the function at some selected points of the subinterval; moreover, they are stable and easy to program.
The Runge-Kutta methods perform several function evaluations at each step and
avoid the computation of higher order derivatives. These methods can be constructed
for any order, i.e., second, third, fourth, fifth, etc. The fourth-order Runge-Kutta
method is more popular. These methods are single-step explicit methods.
In the modified Euler's method, the first approximation to y_1 is
y_1 = y_0 + (h/2)[f(x_0, y_0) + f(x_1, y_1^{(0)})]. (8.28)
If y_1^{(0)} = y_0 + h f(x_0, y_0) is substituted in (8.28), then
y_1 = y_0 + (h/2)[f(x_0, y_0) + f(x_0 + h, y_0 + h f(x_0, y_0))].
Setting
k_1 = h f(x_0, y_0) and
k_2 = h f(x_0 + h, y_0 + h f(x_0, y_0)) = h f(x_0 + h, y_0 + k_1), (8.29)
we obtain
y_1 = y_0 + (1/2)(k_1 + k_2). (8.30)
This is known as the second-order Runge-Kutta formula. The local truncation error of this formula is of O(h³).
General derivation
Assume that the solution is of the form
y_1 = y_0 + a k_1 + b k_2, (8.31)
where
k_1 = h f(x_0, y_0) and
k_2 = h f(x_0 + αh, y_0 + βk_1), a, b, α and β being constants.
By Taylor's series,
y_1 = y(x_0 + h) = y_0 + h y_0′ + (h²/2) y_0″ + (h³/6) y_0‴ + ···
= y_0 + h f(x_0, y_0) + (h²/2)[f_x(x_0, y_0) + f(x_0, y_0) f_y(x_0, y_0)] + O(h³),
since df/dx = ∂f/∂x + f(x, y) ∂f/∂y. Again,
k_2 = h f(x_0 + αh, y_0 + βk_1)
= h[f(x_0, y_0) + αh f_x(x_0, y_0) + βk_1 f_y(x_0, y_0) + O(h²)]
= h f(x_0, y_0) + αh² f_x(x_0, y_0) + βh² f(x_0, y_0) f_y(x_0, y_0) + O(h³).
Then the equation (8.31) becomes
y_0 + h f(x_0, y_0) + (h²/2)[f_x(x_0, y_0) + f(x_0, y_0) f_y(x_0, y_0)] + O(h³)
= y_0 + (a + b) h f(x_0, y_0) + bh²[α f_x(x_0, y_0) + β f(x_0, y_0) f_y(x_0, y_0)] + O(h³).
The coefficients of f, f_x and f_y are compared and the following equations are obtained:
a + b = 1, bα = 1/2 and bβ = 1/2. (8.32)
Obviously α = β, and if α is assigned any value arbitrarily then the remaining parameters can be determined uniquely. Usually the parameters are chosen as α = β = 1, so that a = b = 1/2. Thus the formula is
y_1 = y_0 + (1/2)(k_1 + k_2) + O(h³), (8.33)
where k_1 = h f(x_0, y_0) and k_2 = h f(x_0 + h, y_0 + k_1).
It follows that there are several second-order Runge-Kutta formulae and (8.33) is just one among them.
The values of these variables are substituted in (8.34) and (8.35) and the fourth-order Runge-Kutta method is obtained as
y_1 = y_0 + (1/6)(k_1 + 2k_2 + 2k_3 + k_4) (8.38)
where
k1 = hf (x0 , y0 )
k2 = hf (x0 + h/2, y0 + k1 /2)
k3 = hf (x0 + h/2, y0 + k2 /2)
k4 = hf (x0 + h, y0 + k3 ).
Starting with the initial point (x0 , y0 ), one can generate the sequence of solutions at
x1 , x2 , . . . using the formula
y_{i+1} = y_i + (1/6)(k_1^{(i)} + 2k_2^{(i)} + 2k_3^{(i)} + k_4^{(i)}) (8.39)
where
k_1^{(i)} = h f(x_i, y_i)
k_2^{(i)} = h f(x_i + h/2, y_i + k_1^{(i)}/2)
k_3^{(i)} = h f(x_i + h/2, y_i + k_2^{(i)}/2)
k_4^{(i)} = h f(x_i + h, y_i + k_3^{(i)}).
Therefore,
y_1 = y_0 + (1/6)(k_1 + 2k_2 + 2k_3 + k_4)
= 1 + (1/6)(0.1000 + 2 × 0.1105 + 2 × 0.1116 + 0.1246)
= 1.1115.
Note 8.5.1 The Runge-Kutta method gives better result, though, it has some disad-
vantages. This method uses numerous calculations of function to find yi+1 . When the
function f (x, y) has a complicated analytic form then the Runge-Kutta method is very
laborious.
Geometrical interpretation of k1 , k2 , k3 , k4
Let ABC be the solution curve (Figure 8.4) and B the point on the curve at the ordinate x_i + h/2, the middle point of the interval [x_i, x_i + h]. Let AL_1T_1 be the tangent drawn at A, making an angle θ_1 with the horizontal line AT_0; L_1 and T_1 are its points of intersection with the ordinates BD and CC_0. Then the number k_1 is the approximate slope (within the factor h) of the tangent at A to the solution curve ABC, i.e., k_1 = h y′ = h f(x_i, y_i). The coordinates of L_1 are (x_i + h/2, y_i + k_1/2). The number k_2 is the approximate slope (within the factor h) of the tangent drawn to the curve ABC at L_1. A straight line AL_2 is drawn parallel to the line segment L_1T_2; the coordinates of L_2 are (x_i + h/2, y_i + k_2/2). The number k_3 is the approximate slope (within the factor h) of the tangent to the curve ABC at the point L_2. Finally, a straight line is drawn through A parallel to L_2T_3, which cuts the extension of the ordinate C_0 at T_4. The coordinates of T_4 are (x_i + h, y_i + k_3). Then k_4 is the approximate slope (within the factor h) of the tangent drawn to the curve ABC at T_4.
[Figure 8.4: geometrical interpretation of k_1, k_2, k_3, k_4 — the tangents at A, L_1, L_2 and T_4 to the solution curve y = y(x) make angles θ_1, θ_2, θ_3, θ_4 with the horizontal.]
Error
The fourth-order Runge-Kutta formula
y_1 = y_0 + (1/6)[k_1 + 2(k_2 + k_3) + k_4]
is similar to Simpson's 1/3 rule with step size h/2, so the local truncation error of this formula is −(h⁵/2880) y^{iv}(c_1), i.e., of O(h⁵), and after n steps the accumulated error is
Σ_{i=1}^{n} [−(h⁵/2880) y^{iv}(c_i)] ≈ −[(x_n − x_0)/2880] h⁴ y^{iv}(c) = O(h⁴).
Algorithm RK4
Input function f (x, y);
Read x0 , xn , y0 , h; //initial and final value of x, initial value of y and step size.//
Set y = y0 ;
for x = x0 to xn step h do
Compute k1 = h ∗ f (x, y);
Compute k2 = h ∗ f (x + h/2, y + k1 /2);
Compute k3 = h ∗ f (x + h/2, y + k2 /2);
Compute k4 = h ∗ f (x + h, y + k3 );
Compute y = y + [k1 + 2(k2 + k3 ) + k4 ]/6;
Print x, y;
endfor;
end RK4
Program 8.3.
/* Program Fourth Order Runge-Kutta
Solution of a differential equation of the form y’=f(x,y),
y(x0)=y0 by fourth order Runge-Kutta method. */
#include<stdio.h>
#include<math.h>
void main()
{
float x0,y0,xn,h,x,y,k1,k2,k3,k4;
float f(float x, float y);
printf("Enter the initial values of x and y ");
scanf("%f %f",&x0,&y0);
printf("Enter last value of x ");
scanf("%f",&xn);
printf("Enter step length h ");
scanf("%f",&h);
y=y0;
printf(" x-value y-value\n");
for(x=x0;x<xn;x+=h)
{
k1=h*f(x,y);
k2=h*f(x+h/2,y+k1/2);
k3=h*f(x+h/2,y+k2/2);
k4=h*f(x+h,y+k3);
y=y+(k1+2*(k2+k3)+k4)/6;
printf("%f %f\n",x+h,y);
}
} /* main */
/* definition of the function f(x,y) */
float f(float x, float y)
{
return(x*x-y*y+y);
}
A sample of input/output:
The Runge-Kutta methods may also be used to solve a pair of first order differential
equations.
Consider a pair of first-order differential equations
dy
= f (x, y, z)
dx
(8.40)
dz
= g(x, y, z)
dx
where
k_1^{(i)} = h f(x_i, y_i, z_i),  l_1^{(i)} = h g(x_i, y_i, z_i),
k_2^{(i)} = h f(x_i + h/2, y_i + k_1^{(i)}/2, z_i + l_1^{(i)}/2),  l_2^{(i)} = h g(x_i + h/2, y_i + k_1^{(i)}/2, z_i + l_1^{(i)}/2),
k_3^{(i)} = h f(x_i + h/2, y_i + k_2^{(i)}/2, z_i + l_2^{(i)}/2),  l_3^{(i)} = h g(x_i + h/2, y_i + k_2^{(i)}/2, z_i + l_2^{(i)}/2), (8.43)
k_4^{(i)} = h f(x_i + h, y_i + k_3^{(i)}, z_i + l_3^{(i)}),  l_4^{(i)} = h g(x_i + h, y_i + k_3^{(i)}, z_i + l_3^{(i)}),
and the solutions are advanced by
y_{i+1} = y_i + (1/6)[k_1^{(i)} + 2(k_2^{(i)} + k_3^{(i)}) + k_4^{(i)}],
z_{i+1} = z_i + (1/6)[l_1^{(i)} + 2(l_2^{(i)} + l_3^{(i)}) + l_4^{(i)}].
Program 8.4.
/* Program Runge-Kutta (for Pair of Equations)
Solution of a differential equation of the form y’=f(x,y,z),
z’=g(x,y,z) with x=x0, y(x0)=y0 and z(x0)=z0 by fourth order
Runge-Kutta method.
Here the equations are taken as y’=y+2z, z’=3y+2z with
y(0)=6, z(0)=4. */
#include<stdio.h>
#include<math.h>
void main()
{
float x0,y0,z0,xn,h,x,y,z,k1,k2,k3,k4,l1,l2,l3,l4;
float f(float x, float y, float z);
float g(float x, float y, float z);
printf("Enter the initial values of x, y and z ");
scanf("%f %f %f",&x0,&y0,&z0);
printf("Enter last value of x ");
scanf("%f",&xn);
printf("Enter step length h ");
scanf("%f",&h);
y=y0;
z=z0;
printf("x-value y-value z-value\n");
for(x=x0;x<xn;x+=h)
{
k1=h*f(x,y,z);
l1=h*g(x,y,z);
k2=h*f(x+h/2,y+k1/2,z+l1/2);
l2=h*g(x+h/2,y+k1/2,z+l1/2);
k3=h*f(x+h/2,y+k2/2,z+l2/2);
l3=h*g(x+h/2,y+k2/2,z+l2/2);
k4=h*f(x+h,y+k3,z+l3);
l4=h*g(x+h,y+k3,z+l3);
y=y+(k1+2*(k2+k3)+k4)/6;
z=z+(l1+2*(l2+l3)+l4)/6;
printf("%f %f %f\n",x+h,y,z);
}
} /* main */
k_1 = h f(x_i, y_i)
k_2 = h f(x_i + h/4, y_i + k_1/4)
k_3 = h f(x_i + 3h/8, y_i + (3/32)k_1 + (9/32)k_2)
k_4 = h f(x_i + (12/13)h, y_i + (1932/2197)k_1 − (7200/2197)k_2 + (7296/2197)k_3) (8.55)
k_5 = h f(x_i + h, y_i + (439/216)k_1 − 8k_2 + (3680/513)k_3 − (845/4104)k_4)
k_6 = h f(x_i + h/2, y_i − (8/27)k_1 + 2k_2 − (3544/2565)k_3 + (1859/4104)k_4 − (11/40)k_5).
The fourth-order value of y is
y_{i+1} = y_i + (25/216)k_1 + (1408/2565)k_3 + (2197/4104)k_4 − (1/5)k_5. (8.56)
It may be noted that the value of k_2 is not used in the above formula. The other value of y is determined by the fifth-order Runge-Kutta method as follows:
y*_{i+1} = y_i + (16/135)k_1 + (6656/12825)k_3 + (28561/56430)k_4 − (9/50)k_5 + (2/55)k_6. (8.57)
If |y_{i+1} − y*_{i+1}| is small enough, then the method is terminated; otherwise, the computation is repeated by reducing the step size h. The local truncation error of this method is estimated by y_{i+1} − y*_{i+1}.
in practice the working order is closer to five. In this method, a pair of expressions is
developed to determine y as follows:

yi+1 = yi + (1/90)(7k1 + 32k3 + 12k4 + 32k5 + 7k6)
and y*_{i+1} = yi + (1/6)(k1 + 4k4 + k6),      (8.58)
where
k1 = hf (xi , yi )
k2 = hf (xi + h/4, yi + k1 /4)
k3 = hf (xi + h/4, yi + k1 /8 + k2 /8)
k4 = hf (xi + h/2, yi − k1 /2 + k3 ) (8.59)
k5 = hf (xi + 3h/4, yi + 3k1 /16 + 9k4 /16)
k6 = hf (xi + h, yi − 3k1 /7 + 2k2 /7 + 12k3 /7 − 12k4 /7 + 8k5 /7).
The method will terminate when |yi+1 − y*_{i+1}| is small enough.
The local truncation error of this method is yi+1 − y*_{i+1}.
The methods, viz., Taylor's series, Picard, Euler's and Runge-Kutta are single-step
methods, as they use only one previous value to compute the successive value, i.e., yi
is used to compute yi+1. Certain efficient methods are available which need some more
values to compute the successive value. These methods are called multistep methods.
A general k-step method needs yi−k+1, yi−k+2, . . . , yi−1 and yi to compute yi+1. The
predictor-corrector method is a combination of two formulae. The first formula (called
predictor) finds an approximate value of yi+1 and the second formula (called corrector)
improves this value. The commonly used predictor-corrector multistep methods are due
to Adams-Bashforth-Moulton and Milne-Simpson. These two methods are discussed
below.
This differential equation is integrated between xi and xi+1 to obtain the following
equation

yi+1 = yi + ∫_{xi}^{xi+1} f(x, y) dx.      (8.61)

y′ = y′i + v∇y′i + ((v² + v)/2)∇²y′i + ((v³ + 3v² + 2v)/6)∇³y′i.
Since y′ = f(x, y), this value is substituted in (8.61) for f(x, y).
Then

yi+1 = yi + h ∫₀¹ [y′i + v∇y′i + ((v² + v)/2)∇²y′i + ((v³ + 3v² + 2v)/6)∇³y′i] dv
     = yi + hy′i + (1/2)h∇y′i + (5/12)h∇²y′i + (3/8)h∇³y′i
     = yi + hfi + (1/2)h∇fi + (5/12)h∇²fi + (3/8)h∇³fi,
where fj = f(xj, yj) for all j = i, i − 1, i − 2, i − 3,
     = yi + (h/24)(−9fi−3 + 37fi−2 − 59fi−1 + 55fi).      (8.62)

y^p_{i+1} = yi + (h/24)(−9fi−3 + 37fi−2 − 59fi−1 + 55fi).      (8.63)
The corrector formula can also be developed in the same way. In the corrector formula,
the value of y^p_{i+1} is used. Again, Newton's backward formula is employed on the points
(xi−2, yi−2), (xi−1, yi−1), (xi, yi) and (xi+1, y^p_{i+1}). The polynomial is
Solution. The Runge-Kutta method is used to find the starting values at x = 0.05,
0.10, 0.15. Here f (x, y) = 2y − y 2 , h = 0.05. The values are shown below:
i xi yi k1 k2 k3 k4 yi+1
0 0.00 1.000000 0.050000 0.049969 0.049969 0.049875 1.049959
1 0.05 1.049959 0.049875 0.049720 0.049720 0.049503 1.099669
2 0.10 1.099669 0.049503 0.049226 0.049228 0.048891 1.148886
+55 × 0.977833]
= 1.197375.
y^c(0.20) = y^c_4 = y3 + (h/24)[f(x1, y1) − 5f(x2, y2) + 19f(x3, y3) + 9f(x4, y^p_4)]
= 1.148886 + (0.05/24)[0.997504 − 5 × 0.990066 + 19 × 0.977833 + 9 × 0.961043]
= 1.197376.
544 Numerical Analysis
+55 × 0.961043]
= 1.244918.
y^c(0.25) = y^c_5 = y4 + (h/24)[f(x2, y2) − 5f(x3, y3) + 19f(x4, y4) + 9f(x5, y^p_5)]
= 1.197376 + (0.05/24)[0.990066 − 5 × 0.977833 + 19 × 0.961043 + 9 × 0.940015]
= 1.244919.
Hence y(0.25) = 1.244919.
It may be noted that the predicted and corrected values are equal up to five decimal
places.
Error
The local truncation error for the predictor formula (8.63) is

h ∫₀¹ (v(v + 1)(v + 2)(v + 3)/4!) dv ∇⁴fi = (251/720) y^(v)(ci+1) h⁵ (= Yi+1 − y^p_{i+1})

y^p_{i+1} = yi + (h/24)[−9fi−3 + 37fi−2 − 59fi−1 + 55fi]
and y^c_{i+1} = yi + (h/24)[fi−2 − 5fi−1 + 19fi + 9fi+1],

where fj = f(xj, yj); j = i − 3, i − 2, i − 1, i and fi+1 = f(xi+1, y^p_{i+1}).
Algorithm Adams-Bashforth-Moulton
Input function f (x, y);
Read x0 , xn , h, y0 , y1 , y2 , y3 , ε;
//x0 , xn are the initial and final values of x; h is step length;
y0 , y1 , y2 , y3 are the starting values of y obtained from
any single-step method; ε is the error tolerance.//
Compute the following
x1 = x0 + h; x2 = x1 + h; x3 = x2 + h;
f0 = f (x0 , y0 ); f1 = f (x1 , y1 );
f2 = f (x2 , y2 ); f3 = f (x3 , y3 );
for x4 = x3 + h to xn step h do
Compute y^p = y3 + (h/24)[−9f0 + 37f1 − 59f2 + 55f3];
Set yold = y^p;
Compute y^c = y3 + (h/24)[f1 − 5f2 + 19f3 + 9f(x4, yold)];
if (|y^c − yold| > ε) then
Set yold = y^c;
Calculate y^c from the above relation;
else
Reset y0 = y1; y1 = y2;
Reset y2 = y3; y3 = y^c;
Reset f0 = f1; f1 = f2; f2 = f3;
Reset f3 = f(x4, y^c);
Print x4, y^c;
endif;
endfor;
end Adams-Bashforth-Moulton
Program 8.5.
/* Program Adams-Bashforth-Moulton
Solution of a differential equation of the form y'=f(x,y),
y(x0)=y0 by Adams-Bashforth-Moulton method.
Here the equation is taken as y’=x*x*y+y*y with y(0)=1. */
#include<stdio.h>
#include<math.h>
void main()
{
float x0,y0,xn,h,y1,y2,y3,yc,yp;
/* x0, xn are the initial and final values of x */
/* y0,y1,y2,y3 are starting values of y,
h is the step length */
float eps=1e-5; /* the error tolerance */
float x1,x2,x3,x4,f0,f1,f2,f3,yold;
float f(float x, float y);
float rk4(float x,float y,float h);
printf("Enter the initial values of x and y ");
scanf("%f %f",&x0,&y0);
printf("Enter last value of x ");
scanf("%f",&xn);
printf("Enter step length h ");
scanf("%f",&h);
printf(" x-value y-value\n");
/* initial values of y are computed using Runge-Kutta method */
x1=x0+h; x2=x1+h; x3=x2+h;
y1=rk4(x0,y0,h);
y2=rk4(x1,y1,h);
y3=rk4(x2,y2,h);
f0=f(x0,y0); f1=f(x1,y1); f2=f(x2,y2); f3=f(x3,y3);
for(x4=x3+h;x4<=xn;x4+=h)
{
yp=y3+h*(-9*f0+37*f1-59*f2+55*f3)/24;
yold=yp;
yc=yp;
do
{
yold=yc;
yc=y3+h*(f1-5*f2+19*f3+9*f(x4,yold))/24;
}while(fabs(yc-yold)>eps);
printf("%8.5f %8.5f\n",x4,yc);
y0=y1; y1=y2; y2=y3; y3=yc;
f0=f1; f1=f2; f2=f3; f3=f(x4,yc);
}
} /* main */
/* definition of the function f(x,y) */
float f(float x, float y)
{
return(x*x*y+y*y);
}
/* the fourth order Runge-Kutta method */
Now, the function f (x, y) is replaced by Newton’s forward difference formula in the
form
f(x, y) = fi−3 + u∆fi−3 + (u(u − 1)/2!)∆²fi−3 + (u(u − 1)(u − 2)/3!)∆³fi−3,      (8.68)

where u = (x − xi−3)/h.
The value of f(x, y) is substituted from (8.68) into (8.67) to find

yi+1 = yi−3 + h ∫₀⁴ [fi−3 + u∆fi−3 + ((u² − u)/2)∆²fi−3 + ((u³ − 3u² + 2u)/6)∆³fi−3] du
     = yi−3 + h[4fi−3 + 8∆fi−3 + (20/3)∆²fi−3 + (8/3)∆³fi−3]
     = yi−3 + (4h/3)[2fi−2 − fi−1 + 2fi].
y^p_{i+1} = yi−3 + (4h/3)[2fi−2 − fi−1 + 2fi].      (8.69)
The corrector formula is developed in a similar way. The value of y^p_{i+1} will now be
used. Again, the given differential equation is integrated between xi−1 and xi+1 and
the function f(x, y) is replaced by the Newton's formula (8.68). Then

yi+1 = yi−1 + ∫_{xi−1}^{xi+1} [fi−1 + u∆fi−1 + (u(u − 1)/2)∆²fi−1] dx
     = yi−1 + h ∫₀² [fi−1 + u∆fi−1 + ((u² − u)/2)∆²fi−1] du
     = yi−1 + h[2fi−1 + 2∆fi−1 + (1/3)∆²fi−1]
     = yi−1 + (h/3)[fi−1 + 4fi + fi+1].

This formula is known as the corrector formula and it is denoted by y^c_{i+1}. That is,

y^c_{i+1} = yi−1 + (h/3)[f(xi−1, yi−1) + 4f(xi, yi) + f(xi+1, y^p_{i+1})].      (8.70)

When y^p_{i+1} is computed using the formula (8.69), formula (8.70) can be used iteratively
to obtain the value of yi+1 to the desired accuracy.
Example 8.6.2 Find the value of y(0.20) for the initial value problem

dy/dx = y² sin x with y(0) = 1

using Milne's predictor-corrector method, taking h = 0.05.
i xi yi k1 k2 k3 k4 yi+1
0 0.00 1.000000 0.000000 0.001250 0.001251 0.002505 1.001251
1 0.05 1.001251 0.002505 0.003765 0.003770 0.005042 1.005021
2 0.10 1.005021 0.005042 0.006328 0.006336 0.007643 1.011356
y^c_4 = y2 + (h/3)[f(x2, y2) + 4f(x3, y3) + f(x4, y^c_4)]
= 1.005021 + (0.05/3)[0.1008385 + 4 × 0.1528516 + 0.2068392]
= 1.0203390.
Error
The local truncation error for the predictor formula is

(28/90) y^(v)(ci+1) h⁵ = O(h⁵)

and that of the corrector formula is −(1/90) y^(v)(di+1) h⁵ = O(h⁵).
Note 8.6.1 Milne's predictor-corrector method is a widely used formula. In this method
there is scope to improve the value of y by repeated use of the corrector formula, so
it gives more accurate results. But, this method needs the starting values y1, y2, y3 to
obtain y4. These values may be obtained from any single-step method, such as Taylor's
series, Euler's, Runge-Kutta or any similar method.
Algorithm Milne
Input function f (x, y);
Read x0 , xn , h, y0 , y1 , y2 , y3 , ε;
//x0 , xn are the initial and final values of x; h is step length;
y0 , y1 , y2 , y3 are the starting values of y obtained from
any single-step method; ε is the error tolerance.//
Compute the following
x1 = x0 + h; x2 = x1 + h; x3 = x2 + h;
f1 = f (x1 , y1 );
f2 = f (x2 , y2 ); f3 = f (x3 , y3 );
for x4 = x3 + h to xn step h do
Compute y^p = y0 + (4h/3)[2f1 − f2 + 2f3];
Set yold = y^p;
Compute y^c = y2 + (h/3)[f2 + 4f3 + f(x4, yold)];
if (|y^c − yold| > ε) then
Reset yold = y^c;
Calculate y^c from the above relation;
else
Reset y0 = y1; y1 = y2;
Reset y2 = y3; y3 = y^c;
Reset f1 = f2; f2 = f3;
Compute f3 = f(x4, y^c);
Print x4, y^c;
endif;
endfor;
end Milne
Program 8.6.
/* Program Milne Predictor-Corrector
Solution of a differential equation of the form y’=f(x,y),
y(x0)=y0 by Milne Predictor-Corrector method.
Here the equation is taken as y’=x*y+y*y with y(0)=1.
*/
#include<stdio.h>
#include<math.h>
void main()
{
float x0,y0,xn,h,y1,y2,y3,yc,yp;
/* x0, xn are the initial and final values of x */
/* y0,y1,y2,y3 are starting values of y,
h is the step length */
float eps=1e-5; /* the error tolerance */
float x1,x2,x3,x4,f0,f1,f2,f3,yold;
float f(float x, float y);
float rk4(float x,float y,float h);
printf("Enter the initial values of x and y ");
scanf("%f %f",&x0,&y0);
printf("Enter last value of x ");
scanf("%f",&xn);
printf("Enter step length h ");
scanf("%f",&h);
printf(" x-value y-value\n");
In this method, the derivatives y′ and y″ are replaced by finite differences (either
forward or central), which generates a system of linear algebraic equations. The solution
of this system is the solution of the differential equation at the mesh points.
The central difference formulae discussed in Chapter 7 are used to replace the derivatives:

y′(xi) = (yi+1 − yi−1)/(2h) + O(h²)
and y″(xi) = (yi+1 − 2yi + yi−1)/h² + O(h²).      (8.71)

The method to solve a first order differential equation using finite differences is
nothing but Euler's method. The finite difference method is commonly used to solve
second order initial value problems and boundary value problems.
The values of y′(xi) and y″(xi) are substituted from (8.71) into (8.72) and (8.73), to
find the equation

(yi+1 − 2yi + yi−1)/h² + p(xi)(yi+1 − yi−1)/(2h) + q(xi)yi = r(xi).
After simplification, the above equation reduces to
[2 − hp(xi )]yi−1 + [2h2 q(xi ) − 4]yi + [2 + hp(xi )]yi+1 = 2h2 r(xi ). (8.74)
Let us consider

Ci = 2 − hp(xi), Ai = 2h²q(xi) − 4, Bi = 2 + hp(xi) and Di = 2h²r(xi).      (8.75)
y′0 = (y1 − y−1)/(2h) or, y−1 = y1 − 2hy′0.      (8.77)
Again, from (8.76),
C0 y−1 + A0 y0 + B0 y1 = D0 . (8.78)
The quantity y−1 is eliminated between (8.77) and (8.78), and the value of y1 is
obtained as

y1 = (D0 − A0y0 + 2hC0y′0)/(C0 + B0).      (8.79)
yi+1 = (Di − Ciyi−1 − Aiyi)/Bi,  xi+1 = xi + h      (8.80)

for i = 1, 2, . . . .
Thus, the values of y1 , y2 , . . . are determined recursively from (8.80).
Example 8.7.1 Solve the following IVP y″ − y = x with y(0) = 0 and y′(0) = 1
using the finite difference method for x = 0.01, 0.02, . . . , 0.10.
i yi−1 xi yi yi+1
1 0.000000 0.01 0.010000 0.020002
2 0.010000 0.02 0.020002 0.030008
3 0.020002 0.03 0.030008 0.040020
4 0.030008 0.04 0.040020 0.050040
5 0.040020 0.05 0.050040 0.060070
6 0.050040 0.06 0.060070 0.070112
7 0.060070 0.07 0.070112 0.080168
8 0.070112 0.08 0.080168 0.090240
9 0.080168 0.09 0.090240 0.100330
10 0.090240 0.10 0.100330 0.110440
Error
The local truncation error is

ELTE = [(yi−1 − 2yi + yi+1)/h² − y″i] + pi[(yi+1 − yi−1)/(2h) − y′i].

Expanding the terms yi−1 and yi+1 by Taylor's series and simplifying, the above expression
gives

ELTE = (h²/12)(y^(iv)_i + 2pi y‴i) + O(h⁴).
Thus, the finite difference approximation has second-order accuracy for the functions
with continuous fourth derivatives.
Let us consider

Ci = 2 − hp(xi), Ai = 2h²q(xi) − 4, Bi = 2 + hp(xi) and Di = 2h²r(xi).      (8.84)
for i = 1, 2, . . . , n − 1.
The boundary conditions then are y0 = γ1 , yn = γ2 .
For i = 1, n − 1 equation (8.85) reduces to
C1 y0 + A1 y1 + B1 y2 = D1 or, A1 y1 + B1 y2 = D1 − C1 γ1 (as y0 = γ1 )
and Cn−1 yn−2 + An−1 yn−1 + Bn−1 yn = Dn−1
or, Cn−1 yn−2 + An−1 yn−1 = Dn−1 − Bn−1 γ2 (as yn = γ2 ).
The equation (8.85) in matrix notation is
Ay = b (8.86)
where
y = [y1, y2, . . . , yn−1]ᵗ,
b = 2h²[r(x1) − {C1γ1}/(2h²), r(x2), . . . , r(xn−2), r(xn−1) − {Bn−1γ2}/(2h²)]ᵗ

and
A = [ A1   B1   0    0    ···  0         0
      C2   A2   B2   0    ···  0         0
      0    C3   A3   B3   ···  0         0
      ···  ···  ···  ···  ···  ···       ···
      0    0    0    0    ···  Cn−1      An−1 ].
Equation (8.86) is a tri-diagonal system which can be solved by the method discussed
in Chapter 5. The solution of this system i.e., the values of y1 , y2 , . . . , yn−1 constitutes
the approximate solution of the BVP.
Example 8.7.2 Solve the following boundary value problem y″ + xy + 1 = 0 with
boundary conditions y(0) = 0, y(1) = 0.
Algorithm 8.7 (Finite Difference Method for BVP). Using this algorithm,
the BVP y″ + p(x)y′ + q(x)y = r(x) with y(x0) = γ1, y(xn) = γ2 is solved by the finite
difference method.
Algorithm BVP FD
Input functions p(x), q(x), r(x);
//The functions p, q, r to be changed accordingly.//
Read x0, xn, h, γ1, γ2; //The boundary values of x; step size h; and
the boundary values of y.//
for i = 1, 2, . . . , n − 1 do
Compute Ai = 2h2 q(xi ) − 4;
Compute Ci = 2 − hp(xi );
Compute Bi = 2 + hp(xi );
Compute Di = 2h²r(xi);
endfor;
Reset D1 = D1 − C1γ1;
Reset Dn−1 = Dn−1 − Bn−1γ2;
Solve the tri-diagonal system of equations Ay = b,
where A, b, y are given by the equation (8.86);
Print y1 , y2 , . . . , yn−1 as solution;
end BVP FD
Program 8.7
.
/* Program BVP Finite Difference
This program solves the second order boundary value problem
y''+p(x)y'+q(x)y=r(x) with y(x0)=y0, y(xn)=yn by the finite difference
method. Here we consider the equation y’’+2y’+y=10x with
boundary conditions y(0)=0 and y(1)=0. */
#include<stdio.h>
#include<math.h>
#include<stdlib.h>
float y[10];
void main()
{
int i,n;
float a[10],b[10],c[10],d[10],x0,xn,y0,yn,temp,h,x;
float p(float x); float q(float x); float r(float x);
float TriDiag(float a[],float b[],float c[],float d[],int n);
printf("Enter the initial and final values of x ");
scanf("%f %f",&x0,&xn);
printf("Enter the initial and final values of y ");
scanf("%f %f",&y0,&yn);
printf("Enter number of subintervals ");
scanf("%d",&n);
h=(xn-x0)/n;
x=x0;
for(i=1;i<=n-1;i++)
{
x+=h;
a[i]=2*h*h*q(x)-4;
b[i]=2+h*p(x);
c[i]=2-h*p(x);
d[i]=2*h*h*r(x);
} /* end of loop i */
d[1]-=c[1]*y0;
d[n-1]-=b[n-1]*yn;
temp=TriDiag(c,a,b,d,n-1);
y[0]=y0; y[n]=yn;
printf("The solution is\n x-value y-value\n");
for(i=0;i<=n;i++) printf("%8.5f %8.5f \n",x0+i*h,y[i]);
} /* main */
/* definition of the function p(x) */
float p(float x)
{
return(2);
}
float q(float x)
{
return(1);
}
float r(float x)
{
return(10*x);
}
float TriDiag(float a[10],float b[10],float c[10],float d[10],int n)
{
/* output y[i], i=1, 2,..., n, is a global variable.*/
int i; float gamma[10],z[10];
gamma[1]=b[1];
for(i=2;i<=n;i++){
if(gamma[i-1]==0.0){
printf("A minor is zero: Method fails ");
exit(0);
}
gamma[i]=b[i]-a[i]*c[i-1]/gamma[i-1];
}
z[1]=d[1]/gamma[1];
for(i=2;i<=n;i++)
z[i]=(d[i]-a[i]*z[i-1])/gamma[i];
y[n]=z[n];
for(i=n-1;i>=1;i--)
y[i]=z[i]-c[i]*y[i+1]/gamma[i];
return(y[0]);
} /*end of TriDiag */
A sample of input/output:
Enter the initial and final values of x 0 1
Enter the initial and final values of y 0 0
Enter number of subintervals 4
The solution is
x-value y-value
0.00000 0.00000
0.25000 -0.52998
0.50000 -0.69647
0.75000 -0.51154
1.00000 0.00000
Error
The local truncation error of this method is similar to that of the IVP, i.e., this method
is also of second-order accuracy for functions with continuous fourth order derivatives
on [a, b]. It may be noted that when h → 0 the local truncation error tends to zero, i.e.,
greater accuracy in the result can be achieved by using small h. But, small h produces
a large number of equations and takes more computational effort.
The accuracy can be improved by employing Richardson's deferred approach to the
limit. The error of this method has the form
For extrapolation to the limit, we find the value of (8.88) at two intervals h and h/2.
The values of yi are denoted by yi (h) and yi (h/2). Thus for x = xi for different step
sizes
y(xi) − yi(h) = h²E(xi) + O(h⁴)
and y(xi) − yi(h/2) = (h²/4)E(xi) + O(h⁴).      (8.89)
Now, by eliminating E(xi ) we find the expression for y(xi ), in the form
(ii) Solutions of these two IVPs can be determined by Taylor’s series or Runge-Kutta
or any other method,
(iii) Combination of these two solutions is the required solution of the given BVP.
Example 8.8.1 Find the solution of the boundary value problem y″ = y − x with
y(0) = 0, y(1) = 0 using the shooting method.
Solution. The second-order Runge-Kutta method is used to solve the initial value
problems with h = 0.2.
Here p(x) = 0, q(x) = 1, r(x) = −x, a = 0, b = 1, α = 0, β = 0.
Now, the two IVPs are

u′ = w with u(0) = 0,
w′ = u − x and w(0) = u′(0) = 0      (8.98)

and

v′ = z with v(0) = 0,
z′ = v and z(0) = v′(0) = 1.      (8.99)
i xi vi zi k1 k2 l1 l2 vi+1 zi+1
0 0.0 0.00000 1.00000 0.20000 0.20000 0.00000 0.04000 0.20000 1.02000
1 0.2 0.20000 1.02000 0.20400 0.21200 0.04000 0.08080 0.40800 1.08040
2 0.4 0.40800 1.08040 0.21608 0.23240 0.08160 0.12482 0.63224 1.18361
3 0.6 0.63224 1.18361 0.23672 0.26201 0.12645 0.17379 0.88161 1.33373
4 0.8 0.88161 1.33373 0.26675 0.30201 0.17632 0.22967 1.16598 1.53672
Now, c = (β − u(b))/v(b) = (0 − u(1))/v(1) = 0.16598/1.16598 = 0.142352.
The values of y(x) given by y(x) = u(x) + cv(x) = u(x) + 0.142352 v(x) are listed
below:
x u(x) v(x) y(x) yexact
0.2 +0.00000 0.20000 0.02847 0.02868
0.4 -0.00800 0.40800 0.05008 0.05048
0.6 -0.03224 0.63224 0.05776 0.05826
0.8 -0.08161 0.88161 0.04389 0.04429
1.0 -0.16598 1.16598 0.00000 0.00000
The finite element method (FEM) is a widely used technique to solve many engineering
problems. Here, a very brief introduction is presented to solve a BVP using this
method. A detailed discussion of FEM is beyond the scope of this book.
The main idea of this method is the following: the whole interval (of integration) is
divided into a finite number of subintervals called elements, and over each element the
continuous function is approximated by a suitable piecewise polynomial. This approximated
problem is solved by the Rayleigh-Ritz or Galerkin methods.
Let us consider the functional

J[y(x)] = ∫ₐᵇ F(x, y(x), y′(x)) dx      (8.100)
The Euler’s equation (8.102) has many solutions but, for a given boundary condition,
it gives a unique solution.
Let us consider the boundary value problem

−(d/dx)[p(x)y′(x)] + q(x)y(x) = r(x)      (8.103)

with boundary condition (8.101). It can easily be verified that the variational form of
(8.103) is given by

J[y(x)] = (1/2) ∫ₐᵇ [p(x){y′(x)}² + q(x){y(x)}² − 2r(x)y(x)] dx.      (8.104)
The main steps to solve a BVP using FEM are given below.
Step 1. Discretization of the interval
The interval [a, b] is divided into a finite number of subintervals, called elements, of
unequal length. Let x0 , x1 , . . . , xn , a = x0 < x1 < · · · < xn = b, be the division points,
called the nodes. Let ei = [xi , xi+1 ] be the ith element of length hi = xi+1 − xi .
Step 2. Variational formulation of BVP over the element ei
[Figure: the interval [a, b] divided into elements, with nodes a = x0 < x1 < x2 < · · · < xi−1 < xi < xi+1 < · · · < xn = b and element ei = [xi, xi+1].]
Let Ji be the functional (called element functional) over the element ei = [xi , xi+1 ],
i = 0, 1, 2, . . . , n − 1. Then
Ji = (1/2) ∫_{xi}^{xi+1} [p (dy^(i)/dx)² + q (y^(i))² − 2ry^(i)] dx,      (8.106)
where y (i) is the value of y over the element ei and it is zero outside the element ei . xi ,
xi+1 are the end nodes of the element ei and let φ(i) = [y(xi ) y(xi+1 )]T .
Thus the functional J over the whole interval [a, b] is the sum of the n functionals Ji,
that is,

J[y] = Σ_{i=0}^{n−1} Ji.      (8.107)
where

yi = y(xi) and Li(x) = (xi+1 − x)/(xi+1 − xi), Li+1(x) = (x − xi)/(xi+1 − xi).      (8.110)

The functions Li(x) and Li+1(x) are called shape functions.
The value of y^(i)(x) is substituted from (8.111) into (8.106) to obtain

Ji = (1/2) ∫_{xi}^{xi+1} [ p(x){[L′i  L′i+1][yi, yi+1]ᵗ}² + q(x){[Li  Li+1][yi, yi+1]ᵗ}²
         − 2r(x)[Li  Li+1][yi, yi+1]ᵗ ] dx,

and

∂Ji/∂yi+1 = ∫_{xi}^{xi+1} [ p(x)L′i+1[L′i  L′i+1][yi, yi+1]ᵗ
         + q(x)Li+1[Li  Li+1][yi, yi+1]ᵗ − r(x)Li+1 ] dx = 0.      (8.114)
This gives
A^(i) φ^(i) = b^(i),      (8.115)

where

A^(i) = ∫_{xi}^{xi+1} [ p(x)[L′iL′i  L′iL′i+1; L′i+1L′i  L′i+1L′i+1]
         + q(x)[LiLi  LiLi+1; Li+1Li  Li+1Li+1] ] dx,

b^(i) = ∫_{xi}^{xi+1} r(x)[Li, Li+1]ᵗ dx,  and  φ^(i) = [yi, yi+1]ᵗ.      (8.116)
Aφ = b. (8.117)
It may be noted that A is a tri-diagonal matrix. The solution of this system can be
determined by the method discussed in Chapter 5.
The second and nth equations are a1,0 y0 + a1,1 y1 + a1,2 y2 = b1 and an−1,n−2 yn−2 +
an−1,n−1 yn−1 + an−1,n yn = bn−1 . Now, the boundary conditions y(a) = ya and y(b) = yb
are introduced to the above equations and they become
Now, first and last rows and columns are removed from A also first and last elements
are removed from b and the equations of (8.119) are incorporated.
The equation (8.118) finally reduces to
[ a1,1      a1,2      0         ···  0            0          ] [ y1   ]   [ b1   ]
[ a2,1      a2,2      a2,3      ···  0            0          ] [ y2   ] = [ b2   ]      (8.120)
[ ···       ···       ···       ···  ···          ···        ] [ ...  ]   [ ...  ]
[ 0         0         0         ···  an−1,n−2     an−1,n−1   ] [ yn−1 ]   [ bn−1 ]
d²y/dx² + 2y = x, y(0) = 1, y(1) = 2

using finite element method for two and four elements of equal lengths.
Solution. Here p(x) = −1, q(x) = 2, r(x) = x. If the lengths of elements are equal
then hi = h (say) for all i.
Now, L′i = −1/h, L′i+1 = 1/h.
Therefore,

A^(i) = ∫_{xi}^{xi+1} [ −[L′iL′i  L′iL′i+1; L′i+1L′i  L′i+1L′i+1] + 2[LiLi  LiLi+1; Li+1Li  Li+1Li+1] ] dx
     = −(1/h)[1  −1; −1  1] + (h/3)[2  1; 1  2].

Also,

∫_{xi}^{xi+1} x Li dx = ∫_{xi}^{xi+1} x(xi+1 − x)/h dx = (h/6)(3xi + h)
and ∫_{xi}^{xi+1} x Li+1 dx = ∫_{xi}^{xi+1} x(x − xi)/h dx = (h/6)(3xi + 2h).

Then

b^(i) = ∫_{xi}^{xi+1} x [Li, Li+1]ᵗ dx = (h/6)[3xi + h, 3xi + 2h]ᵗ.
A^(0) = −2[1  −1; −1  1] + (1/6)[2  1; 1  2] = (1/6)[−10  13; 13  −10],
b^(0) = (1/12)[1/2, 1]ᵗ = (1/24)[1, 2]ᵗ,  φ^(0) = [y0, y1]ᵗ.
A^(1) = (1/6)[−10  13; 13  −10],  b^(1) = (1/24)[4, 5]ᵗ,  φ^(1) = [y1, y2]ᵗ.
A^(0) = (1/12)[−46  49; 49  −46],  b^(0) = (1/96)[1, 2]ᵗ,  φ^(0) = [y0, y1]ᵗ,
A^(1) = (1/12)[−46  49; 49  −46],  b^(1) = (1/96)[4, 5]ᵗ,  φ^(1) = [y1, y2]ᵗ,
A^(2) = (1/12)[−46  49; 49  −46],  b^(2) = (1/96)[7, 8]ᵗ,  φ^(2) = [y2, y3]ᵗ,
A^(3) = (1/12)[−46  49; 49  −46],  b^(3) = (1/96)[10, 11]ᵗ,  φ^(3) = [y3, y4]ᵗ.
That is, y(0) = 1, y(1/4) = 1.53062, y(1/2) = 1.88913, y(3/4) = 2.04693, y(1) = 2.
In this section, the advantages and disadvantages of the numerical methods presented
in this chapter to solve ordinary differential equations are summarized.
The Taylor's series method has a serious disadvantage in that higher order derivatives of
f(x, y) are required while computing y at a given value of x. Since this method involves
several computations of higher order derivatives, it is a laborious method; but, once
the series is available, one can compute the values of y at different values of x,
provided the step size is small enough.
Picard's method involves successive integrations and it is difficult for a complicated
function.
Euler's method is the simplest and the most crude of all single-step methods and
gives a rough idea about the solution.
The Runge-Kutta methods are the most widely used methods to solve a single IVP or
a system of IVPs, though they are laborious. These methods can also be used to solve
higher order equations. To get the starting values of some predictor-corrector methods
the Runge-Kutta methods are used. After the starting values are found, the remaining
values can be determined by using predictor-corrector methods. An important drawback
of the Runge-Kutta methods is that there is no technique to check or estimate the error
that occurs at any step. If an error is generated at any step, then it is propagated through
the subsequent steps without detection.
A multistep method is, in general, more efficient than a single-step method. When
starting values are available, at each step only one (for an explicit method) or a few
(for an implicit method) function evaluations are required. In contrast, a single-step
method requires multiple function evaluations; but a single-step method is self-starting.
The finite difference method is useful to solve second order IVPs and BVPs. To solve
a BVP, an algebraic tri-diagonal system of equations is generated. When the step size h
is small this method gives better results; but, for small h, the number of equations
becomes very large and then it is very complicated to solve such a system.
Stability analysis is an important part in the study of the numerical methods to solve
differential equations. Most of the methods used to solve differential equations are based
on difference equations. To study the stability, the model differential and difference
equations are defined in the following.
The model initial value problem is

y′(t) = λy(t), t > 0, with y(0) = y0,      (8.121)

where λ is a constant and it may be a real or a complex number. The solution of this
problem is

y(t) = e^{λt} y0.      (8.122)

The model difference equation is

yn+1 = σyn, n = 0, 1, 2, . . . ,      (8.123)

where y0 is given and σ is, in general, a complex number. The solution of this problem
is

yn = σⁿ y0.      (8.124)
[Figure 8.8: Stability region of the exact solution — the left half of the Re(λh)-Im(λh) plane.]
The connection between the exact solution and the difference solution is evident if
we evaluate the exact solution at tn = nh, for n = 0, 1, . . . where h > 0 and
yn = eλtn y0 = eλnh y0 = σ n y0 (8.125)
where σ = eλh .
If the exact solution is bounded then |σ| = |eλh | ≤ 1. This is possible if Re(λh) =
λR h ≤ 0.
That is, in the Re(λh)-Im(λh) plane, the region of stability of the exact solution is
the left half-plane as shown in Figure 8.8.
The single-step method is called absolutely stable if |σ| ≤ 1 and relatively stable
if |σ| ≤ eλh . If λ is pure imaginary and |σ| = 1, then the absolute stability is called the
periodic stability (P-stability).
When the region of stability of a difference equation is identical to the region of
stability of the differential equation, the finite difference scheme is sometimes referred
to as A-stable.
That is, only a small portion of the left half-plane is the region of stability for the
Euler’s method. This region is inside the circle (1 + λR h)2 + (λI h)2 = 1, which is shown
in Figure 8.9.
[Figure 8.9: Stability region of Euler's method — the disc (1 + λRh)² + (λIh)² ≤ 1, centred at λh = −1.]
For any value of λh in the left half-plane and outside this circle, the numerical solution
blows-up while the exact solution decays. Thus the numerical method is conditionally
stable.
To get a stable numerical solution, the step size h must be reduced so that λh falls
within the circle. If λ is real and negative then the maximum step size for stability is
0 ≤ h ≤ 2/|λ|. The circle is only tangent to the imaginary axis. If λ is real and the
numerical solution is unstable, then |1 + λh| > 1, which means that (1 + λh) is negative
with magnitude greater than 1. Since yn = (1 + λh)ⁿ y0, the numerical solution exhibits
oscillations with changes of sign at every step. This behavior of the numerical solution
is a good indicator of instability.
Note 8.11.1 Numerical stability does not imply accuracy. A method can be stable
even if it gives inaccurate results. From the stability point of view, our objective is to
use the maximum step size h to reach the final destination at x = xn. If h is large then
the number of function evaluations is low, which needs low computational cost. This
may not be the optimum h for acceptable accuracy, but it is optimum for stability.
k1 = hf(xn, yn) = λhyn
k2 = hf(xn + h, yn + k1) = λh(yn + k1) = λh(1 + λh)yn

and

yn+1 = yn + (1/2)(k1 + k2) = yn + [λh + (λh)²/2] yn
     = [1 + λh + λ²h²/2] yn.      (8.128)

This expression confirms that the method is of second order accuracy. For stability,
|σ| ≤ 1 where

σ = 1 + λh + λ²h²/2.      (8.129)
2
Now, the stability is discussed for different cases of λ.

(i) Real λ: |1 + λh + λ²h²/2| ≤ 1 or, −2 ≤ λh ≤ 0.

(ii) Pure imaginary λ: Let λ = iw. Then |σ| = √(1 + w⁴h⁴/4) > 1. That is, the method
is unstable.

(iii) Complex λ: Let 1 + λh + λ²h²/2 = e^{iθ} and find the complex roots, λh, of this
polynomial for different values of θ. Note that |σ| = 1 for all values of θ.
k1 = λhyn
k2 = λh(yn + k1/2) = λh(1 + λh/2)yn
k3 = λh(yn + k2/2) = λh[1 + λh/2 + λ²h²/4]yn
k4 = λh(yn + k3) = λh[1 + λh + λ²h²/2 + λ³h³/4]yn.
[Figure: Stability regions of the second and fourth order Runge-Kutta methods in the Re(λh)-Im(λh) plane; the fourth order region meets the real axis at λh = −2.785 and the imaginary axis at λh = ±2√2.]
Therefore,

yn+1 = yn + (1/6)(k1 + 2k2 + 2k3 + k4)
     = [1 + λh + (λh)²/2! + (λh)³/3! + (λh)⁴/4!] yn.      (8.130)
(iii) Complex λ: In this case, the stability region is obtained by finding the roots of
the fourth-order polynomial with complex coefficients:

1 + λh + (λh)²/2! + (λh)³/3! + (λh)⁴/4! = e^{iθ}.
y′ + ky = 0,      (8.131)

If k > 0 then the solution remains bounded as x → ∞. The term e^{−kx} is monotonic
for k > 0 and k < 0. Thus we expect the finite difference solution to be monotonic for
k > 0 and k < 0.
If the behavior of the exponential term of (8.133) is analyzed, it is observed that it
behaves monotonically for k > 0 and k < 0 if h < |2/k|. This is the condition for stability
of the difference scheme (8.132).
8.12 Exercise
1. Discuss Taylor's series method to solve an initial value problem of the form
dy/dx = f(x, y), y(x0) = y0.
2. For the differential equation dy/dx − 1 = x²y with y(0) = 1, obtain Taylor's series
for y(x) and compute the values of y(0.1), y(0.2) correct up to four decimal places.
14. Deduce second and fourth order Runge-Kutta methods to solve the initial value
problem y′ = f(x, y) with y(x0) = y0.
15. Explain the significance of the numbers k1 , k2 , k3 , k4 used in fourth order Runge-
Kutta methods.
16. Analyze the stability of second and fourth order Runge-Kutta methods.
17. Use second order Runge-Kutta method to solve the initial value problem
5 dy/dx = x² + y², y(0) = 1
and find y in the interval 0 ≤ x ≤ 0.4, taking h = 0.1.
18. Using Runge-Kutta method of fourth order, solve
dy/dx = xy + y²,
given that y(0) = 1. Take h = 0.2 and find y at x = 0.2, 0.4, 0.6.
19. Use Runge-Kutta method of fourth order to compute the solution of the following
problem in the interval [0,0.1] with h = 0.02.
y′ = x + y, y(0) = 1.
25. Solve the following initial value problem y′ + xy = 1 at x = 0.1 with initial
condition y(0) = 1 using Runge-Kutta-Fehlberg method.
26. Discuss Milne's predictor-corrector formula to find the solution of y′ = f(x, y),
y(x0) = y0.
27. Use Milne's predictor-corrector formula to find the solutions at x = 0.4, 0.5, 0.6 of
the differential equation
dy/dx = x³ + y², y(0) = 1.
32. Find the solution of the IVP y′ = y² sin x with y(0) = 1 using Adams-Bashforth-
Moulton predictor-corrector method, in the interval [0.2, 0.3], given that y(0.05) =
1.00125, y(0.10) = 1.00502 and y(0.15) = 1.01136.
d²y/dx² + xy = 0
with initial conditions y(0) = 1, y′(0) = 0. Compute y(0.2) and y(0.4) using
(a) Runge-Kutta methods,
(b) Finite difference method, and compare them.
35. Convert the following second order differential equation y″ + p(x)y′ + q(x)y + r(x) =
0 with y(x0) = y0, y(xn) = yn into a system of algebraic equations.
37. Use finite difference method to solve the BVP y″ + 2y′ + y = 10x, y(0) = 0 and
y(1) = 0 at x = 0.25, 0.50, 0.75, and compare these results with the exact solutions.
38. Use the finite difference method to find the solution of the boundary value problem
y″ − xy′ + 2y = x + 1 with boundary conditions y(0.9) = 0.5, y′(0.9) = 2,
y(1.2) = 1, taking h = 0.1.
39. Convert the following BVP y″ − 12xy + 1 = 0 with boundary conditions y(0) =
y(1) = 0, into two initial value problems.
40. Use shooting method to solve the boundary value problem y″ + xy = 0 with
boundary conditions y(0) = 0, y(1) = 1.
41. Consider the boundary value problem y″ − y = 1, 0 < x < 1 with the boundary
conditions y(0) = 0, y(1) = e − 1.
Use finite element method to find the solution of this problem.
42. Solve the boundary value problem y'' + y = x², 0 < x < 1 with boundary conditions y(0) = 3, y(1) = 1 using finite element method for two and three elements of equal lengths.
43. Use finite element method to obtain the difference scheme for the boundary value
problem
d/dx[(1 + x²) dy/dx] − y = 1 + x², y(−1) = y(1) = 0
with linear shape functions and element length h = 0.25.
Chapter 9
∆ = B 2 − 4AC.
The equation (9.2) is called elliptic, parabolic or hyperbolic according as the value of ∆ at any point (x, y) is less than, equal to or greater than zero.
Elliptic equations
The simplest example of this type of PDE is Poisson's equation
∂²u/∂x² + ∂²u/∂y² = g(x, y).  (9.3)
Parabolic equation
In a parabolic equation, time t is involved as an independent variable.
The simplest example of parabolic equation is the heat conduction equation
∂u/∂t = α ∂²u/∂x².  (9.5)
The solution u is the temperature at a distance x units of length from one end of
a thermally insulated bar after t seconds of heat conduction. In this problem, the
temperatures at the ends of a bar are known for all time, i.e., the boundary conditions
are known.
Hyperbolic equation
In this equation also, time t appears as an independent variable.
The simplest example of hyperbolic equation is the one-dimensional wave equation
∂²u/∂t² = c² ∂²u/∂x².  (9.6)
(i)
∂²u/∂x² + ∂²u/∂y² = 0 in R  (9.7)
and u = f(x, y) on C.
(ii)
∂²u/∂t² = ∂²u/∂x² for t > 0,
u(x, 0) = f(x)  (9.8)
and ut(x, 0) = g(x),
where f(x) and g(x) are arbitrary.
(iii)
∂u/∂t = ∂²u/∂x² for t > 0  (9.9)
and u(x, 0) = f (x).
The above cited problems are all well-defined (i.e., well-posed) and it can be shown that each of them has a unique solution.
But, the problem of Laplace’s equation with Cauchy boundary conditions, i.e., the
problem
∂²u/∂x² + ∂²u/∂y² = 0,
u(x, 0) = f(x)  (9.10)
and uy(x, 0) = g(x)
is an ill-posed problem.
Let the xy-plane be divided into sets of equal rectangles of sides ∆x = h and ∆y = k by drawing equally spaced grid lines parallel to the coordinate axes, defined by xi = ih, yj = jk; i, j = 0, 1, 2, . . ..
The value of u at a mesh point (intersection of horizontal and vertical lines) P (xi , yj )
or, at P (ih, jk) is denoted by ui,j , i.e., ui,j = u(xi , yj ) = u(ih, jk).
Now,
ux(ih, jk) = (ui+1,j − ui,j)/h + O(h)  (9.11)  (forward difference approximation)
           = (ui,j − ui−1,j)/h + O(h)  (9.12)  (backward difference approximation)
           = (ui+1,j − ui−1,j)/(2h) + O(h²)  (9.13)  (central difference approximation)
∂u/∂t = α ∂²u/∂x².  (9.19)
Using the finite-difference approximation for ut and uxx , equation (9.19) reduces to
(ui,j+1 − ui,j)/k = α(ui−1,j − 2ui,j + ui+1,j)/h²,  (9.20)
where xi = ih and tj = jk; i, j = 0, 1, 2, . . ..
Partial Differential Equations 587
[Figure: mesh points of the explicit scheme. The unknown value ui,j+1 (⊕) is obtained from the known values (●) ui−1,j, ui,j, ui+1,j one time level below; the mesh widths are h in x and k in t, and u = f(x) along t = 0.]
Solution. Let h = 0.2 and k = 0.01, so that r = k/h² = 0.25 < 1/2. The initial and boundary values are shown in the following table.
i=0 i=1 i=2 i=3 i=4 i=5
x = 0 x = 0.2 x = 0.4 x = 0.6 x = 0.8 x = 1.0
j = 0, t = 0.00 0.0 0.1 0.2 0.3 0.4 0.00
j = 1, t = 0.01 0.0 0.02
j = 2, t = 0.02 0.0 0.04
j = 3, t = 0.03 0.0 0.06
j = 4, t = 0.04 0.0 0.08
j = 5, t = 0.05 0.0 0.10
where r = αk/h2 .
In general, the left hand side of (9.23) contains three unknowns and the right hand
side has three known values of u.
For j = 0 and i = 1, 2, . . . , N − 1, equation (9.23) generates N − 1 simultaneous equations
for the N − 1 unknowns u1,1, u2,1, . . . , uN−1,1 (of the first row) in terms of known initial and
boundary values u0,0 , u1,0 , u2,0 , . . . , uN,0 ; u0,0 and uN,0 are the boundary values and
u1,0 , u2,0 , . . . , uN −1,0 are the initial values.
Similarly, for j = 1 and i = 1, 2, . . . , N − 1 we obtain another set of unknown values
u1,2 , u2,2 , . . . , uN −1,2 in terms of calculated values for j = 0, and so on.
[Figure 9.2: Meshes of Crank-Nicolson implicit method. The unknown values (⊕) at time level j + 1 and the known values (●) at level j form a six-point stencil lying between the left and right boundary values.]
In this method, the value of ui,j+1 is not given directly in terms of the known values ui,j at one time step earlier; it is also a function of the unknown neighbouring values ui−1,j+1 and ui+1,j+1 at the new time level, and hence the method is called an implicit method.
The system of equations (9.23) can be viewed in the following matrix notation:

[ 2+2r   −r                      ] [ u1,j+1   ]   [ d1,j   ]
[  −r   2+2r   −r                ] [ u2,j+1   ]   [ d2,j   ]
[        −r   2+2r   −r          ] [ u3,j+1   ] = [ d3,j   ]   (9.24)
[              ···   ···   ···   ] [   ···    ]   [  ···   ]
[                    −r   2+2r   ] [ uN−1,j+1 ]   [ dN−1,j ]
where di,j = rui−1,j + (2 − 2r)ui,j + rui+1,j, the known boundary values being absorbed into d1,j and dN−1,j.
[Figure: mesh points u0,0, u1,0, u2,0 along j = 0, t = 0, at x = 0, 1/2, 1 (i = 0, 1, 2).]
Case II.
Let h = 1/4 and k = 1/8. In this case r = k/h² = 2.
4 8 h
The Crank-Nicolson scheme is
That is, u0,0 = 0, u1,0 = 1/2, u2,0 = 1, u3,0 = 3/2, u4,0 = 2; u0,1 = 0, u4,1 = 0.
The Crank-Nicolson equations for the mesh points A(i = 1), B(i = 2) and C(i = 3) are, respectively,
That is,
4u1,1 − u2,1 = 1
−u1,1 + 4u2,1 − u3,1 = 1/2 + 3/2 = 2
−u2,1 + 4u3,1 = 3.
Using boundary conditions and values of right hand side obtained in first step, the
above system becomes
4u1,2 − u2,2 = 0 + 6/7 = 6/7
−u1,2 + 4u2,2 − u3,2 = 13/28 + 27/28 = 10/7
−u2,2 + 4u3,2 = 6/7 + 0 = 6/7.
The solution of this system is
u1,2 = u(1/4, 1/4) = 17/49, u2,2 = u(1/2, 1/4) = 26/49, u3,2 = u(3/4, 1/4) = 17/49.
for i = 2 to N − 2 do
Compute di = rui−1 + (2 − 2r)ui + rui+1 ;
endfor;
Solve the tri-diagonal system TU = D, where U = (u1 , u2 , . . . , uN −1 )t
and D = (d1 , d2 , . . . , dN −1 )t .
Print ‘The values of u’, u1 , u2 , . . . , uN −1 , ‘when t =’, (j + 1) ∗ k;
endfor;
end Crank Nicolson
Program 9.1
.
/* Program Crank-Nicolson
Program to solve the heat equation by Crank-Nicolson
method. This program solves the problem of the form
Ut=alpha Uxx, with boundary conditions
U(0,t)=f0(t); U(l,t)=fl(t); and initial condition
U(x,0)=g(x). The interval for x [0,l] is divided into
N subintervals of length h. k is the step length of t.
Here g(x)=cos(pi*x/2) with u(0,t)=1, u(1,t)=0, h=k=1/3.*/
#include<stdio.h>
#include<math.h>
#include<stdlib.h>
float u[10];
int main()
{
int i,j,N,M;
float h,k,l,alpha,r,a[10],b[10],c[10],d[10],y,temp1,temp2;
float TriDiag(float a[],float b[],float c[],float d[],int n);
float g(float x);
float f0(float t);
float fl(float t);
printf("Enter the number of subintervals of x and t ");
scanf("%d %d",&N,&M);
printf("Enter the step size (k) of t ");
scanf("%f",&k);
printf("Enter the upper limit of x ");
scanf("%f",&l);
printf("Enter the value of alpha ");
scanf("%f",&alpha);
h=l/N; r=alpha*k/(h*h);
for(i=1;i<=N;i++) b[i]=2+2*r;
for(i=2;i<=N;i++) a[i]=-r;
for(i=1;i<N;i++) c[i]=-r;
for(i=0;i<=N;i++) u[i]=g(i*h);
printf("h=%8.5f, k=%8.5f, r=%8.5f\n",h,k,r);
temp1=f0(0);
if(fabs(temp1)!=fabs(u[0])){
printf("u[0]=%f and f0(0)=%f are different!\n",u[0],temp1);
printf("Enter correct value ");
scanf("%f",&temp1);}
temp2=fl(0);
if(fabs(temp2)!=fabs(u[N]))
{
printf("u[N]=%f and fl(0)=%f are different!\n",u[N],temp2);
printf("Enter correct value ");
scanf("%f",&temp2);
}
printf("Solution is\nx-> ");
for(i=1;i<N;i++) printf("%8.5f ",i*h); printf("\n");
for(i=0;i<N;i++) printf("---------");printf("\n");
for(j=0;j<=M;j++)
{
if(j!=0) {temp1=f0(j*k); temp2=fl(j*k); }
/* construction of right hand vector */
d[1]=r*temp1+(2-2*r)*u[1]+r*u[2]+r*f0((j+1)*k);
d[N-1]=r*u[N-2]+(2-2*r)*u[N-1]+r*temp2+r*fl((j+1)*k);
for(i=2;i<=N-2;i++)
d[i]=r*u[i-1]+(2-2*r)*u[i]+r*u[i+1];
y=TriDiag(a,b,c,d,N-1); /*solution of tri-diagonal system*/
printf("%8.5f| ",(j+1)*k);
for(i=1;i<=N-1;i++) printf("%8.5f ",u[i]);printf("\n");
} /* end of j loop */
} /* main */
/* definitions of the initial and boundary functions*/
float g(float x)
{
return(cos(3.141592*x/2));
}
float f0(float t)
{
return(1);
}
float fl(float t)
{
return(0);
}
float TriDiag(float a[10],float b[10],float c[10],float d[10],int n)
{
/* output u[i], i=1, 2,..., n, is a global variable.*/
int i; float gamma[10],z[10];
gamma[1]=b[1];
for(i=2;i<=n;i++)
{
if(gamma[i-1]==0.0)
{
printf("A minor is zero: Method fails ");
exit(0);
}
gamma[i]=b[i]-a[i]*c[i-1]/gamma[i-1];
}
z[1]=d[1]/gamma[1];
for(i=2;i<=n;i++)
z[i]=(d[i]-a[i]*z[i-1])/gamma[i];
u[n]=z[n];
for(i=n-1;i>=1;i--)
u[i]=z[i]-c[i]*u[i+1]/gamma[i];
/* for(i=1;i<=n;i++) printf("%f ",u[i]); */
return(u[1]); /* solution is delivered in the global array u[1..n] */
} /*end of TriDiag */
A sample of input/output:
Enter the number subintervals of x and t 4 5
Enter the step size (k) of t 0.03125
Enter the upper limit of x 1
Enter the value of alpha 1
h= 0.25000, k= 0.03125, r= 0.50000
Solution is
x-> 0.25000 0.50000 0.75000
------------------------------------
0.03125| 0.86871 0.65741 0.35498
0.06250| 0.83538 0.61745 0.33080
0.09375| 0.81261 0.58746 0.31109
∂²u/∂t² = c² ∂²u/∂x², t > 0, 0 < x < 1  (9.25)
with initial conditions u(x, 0) = f(x) and
∂u/∂t |(x,0) = g(x), 0 < x < 1.  (9.26)
This problem may occur in the transverse vibration of a stretched string. As in the
previous cases, the central-difference approximations for uxx and utt at the mesh points
(xi , tj ) = (ih, jk) are
uxx = (ui−1,j − 2ui,j + ui+1,j)/h² + O(h²)
and utt = (ui,j−1 − 2ui,j + ui,j+1)/k² + O(k²),
where i, j = 0, 1, 2, . . ..
Using the value of uxx and utt , the equation (9.25) becomes
(ui,j−1 − 2ui,j + ui,j+1)/k² = (c²/h²)(ui−1,j − 2ui,j + ui+1,j)
i.e.,
ui,j+1 = r2 ui−1,j + 2(1 − r2 )ui,j + r2 ui+1,j − ui,j−1 , (9.28)
where r = ck/h.
The value of ui,j+1 depends on the values of u at three time-levels (j − 1), j and
(j + 1). The known and unknown values of u are shown in Figure 9.6.
On substituting j = 0, the equation (9.28) yields
ui,1 = r²fi−1 + 2(1 − r²)fi + r²fi+1 − ui,−1, where fi = f(ih) = ui,0.
[Figure 9.6: the unknown value (⊕) at time level j + 1 is obtained from the known values (●) at levels j and j − 1.]
Again, the central difference approximation to the initial derivative condition gives
(ui,1 − ui,−1)/(2k) = gi.
Eliminating ui,−1 between the above two relations, we obtain the expression for u along t = k, i.e., for j = 1, as
ui,1 = (1/2)[r²fi−1 + 2(1 − r²)fi + r²fi+1 + 2kgi].  (9.29)
The truncation error of this method is O(h2 +k 2 ) and the formula (9.28) is convergent
for 0 < r ≤ 1.
Example 9.3.1 Solve the second-order wave equation
∂²u/∂t² = ∂²u/∂x²
with boundary conditions u = 0 at x = 0 and 1, t > 0, and the initial conditions u = (1/2) sin πx, ∂u/∂t = 0 when t = 0, 0 ≤ x ≤ 1, for x = 0, 0.2, 0.4, . . . , 1.0 and t = 0, 0.1, 0.2, . . . , 0.5.
The above formula gives the values of u for j = 1. For j = 2, 3, . . . the values are
obtained from the formula (9.30).
Hence,
That is,
(1/k²)[ui,j+1 − 2ui,j + ui,j−1]  (9.31)
= (c²/2h²)[(ui+1,j+1 − 2ui,j+1 + ui−1,j+1) + (ui+1,j−1 − 2ui,j−1 + ui−1,j−1)].
(ii) ∂²u/∂t² |i,j = (c²/4)[ ∂²u/∂x² |i,j+1 + 2 ∂²u/∂x² |i,j + ∂²u/∂x² |i,j−1 ]
Using the finite difference scheme this equation reduces to
(1/k²)[ui,j+1 − 2ui,j + ui,j−1]
= (c²/4h²)[(ui+1,j+1 − 2ui,j+1 + ui−1,j+1)  (9.32)
+ 2(ui+1,j − 2ui,j + ui−1,j) + (ui+1,j−1 − 2ui,j−1 + ui−1,j−1)].
Both the formulae hold good for all values of r = ck/h > 0.
∂²u/∂x² + ∂²u/∂y² = 0 within R  (9.33)
and u = f (x, y) on the boundary C.
Using the central difference approximation to both the space derivatives, the finite
difference approximation of above equation is given by
(ui−1,j − 2ui,j + ui+1,j)/h² + (ui,j−1 − 2ui,j + ui,j+1)/k² = 0.
If the mesh points are uniform in both x and y directions then h = k and the above
equation becomes a simple one. That is,
ui,j = (1/4)[ui+1,j + ui−1,j + ui,j+1 + ui,j−1].  (9.34)
[Figure 9.7: standard five-point stencil: the point (i, j) and its four neighbours (i, j+1), (i+1, j), (i, j−1), (i−1, j).]
This shows that the value of u at the point (i, j) is the average of its values at the
four neighbours – north (i, j + 1), east (i + 1, j), south (i, j − 1) and west (i − 1, j). This
is shown in Figure 9.7, and the formula is known as standard five-point formula.
It is also well known that the Laplace equation remains invariant when the coordinate axes are rotated through an angle of 45°.
Hence equation (9.34) can also be expressed in the form
ui,j = (1/4)[ui−1,j−1 + ui+1,j−1 + ui+1,j+1 + ui−1,j+1].  (9.35)
This formula is known as diagonal five-point formula and the mesh points are
shown in Figure 9.8.
[Figure 9.8: diagonal five-point stencil: the point (i, j) and its four diagonal neighbours (i−1, j+1), (i+1, j+1), (i−1, j−1), (i+1, j−1).]
[Figure: square mesh with interior points ui,j (i, j = 1, 2, 3) and boundary value u = 0 on the sides.]
For i, j = 1, 2 and 3, the equation (9.37) is a system of nine equations with nine
unknowns, which is written in matrix notation as
4 −1 0 −1 0 0 0 0 0 u1,1 −h2 g1,1
−1 4 −1 0 −1 0 0 0 0 u1,2 −h2 g1,2
0 −1 4 0 0 −1 0 0 0 u1,3 −h2 g1,3
−1 0 0 4 −1 0 −1 0 0 u2,1 −h2 g2,1
0 −1 0 −1 4 −1 0 −1 0 u2,2 = −h2 g2,2 (9.38)
0 0 −1 0 −1 4 0 0 −1 u2,3 −h2 g2,3
0 0 0 −1 0 0 4 −1 0 u3,1 −h2 g3,1
0 0 0 0 −1 0 −1 4 −1 u3,2 −h2 g3,2
0 0 0 0 0 −1 0 −1 4 u3,3 −h2 g3,3
It may be noted that the coefficient matrix is symmetric, positive definite and sparse.
Thus the solution of an elliptic PDE depends on a sparse system of equations. To solve
this system an iterative method is suggested rather than a direct method. Three iterative
methods are commonly used to solve such system, viz., (i) Jacobi’s method, (ii) Gauss-
Seidel’s method and (iii) successive overrelaxation method.
Another iterative method, known as the alternating direction implicit (ADI) method, is also used.
[Figure 9.10: square mesh showing the boundary values u0,j, u4,j, ui,0, ui,4 and the interior points u1,1, . . . , u3,3.]
Let R be a square region divided into N × N small squares of side h. The boundary
values are u0,j, uN,j, ui,0, ui,N where i, j = 0, 1, 2, . . . , N, shown in Figure 9.10.
Initially, the diagonal five-point formula is used to compute the values of u2,2, u1,3, u3,3,
u1,1 and u3,1 in this sequence. The values are
u2,2 = (1/4)(u0,0 + u4,4 + u0,4 + u4,0)
u1,3 = (1/4)(u0,2 + u2,4 + u0,4 + u2,2)
u3,3 = (1/4)(u2,2 + u4,4 + u2,4 + u4,2)
u1,1 = (1/4)(u0,0 + u2,2 + u0,2 + u2,0)
u3,1 = (1/4)(u2,0 + u4,2 + u2,2 + u4,0).
In the second step, the remaining values, viz., u2,3 , u1,2 , u3,2 and u2,1 are computed
using the standard five-point formula as
u2,3 = (1/4)(u1,3 + u3,3 + u2,2 + u2,4)
u1,2 = (1/4)(u0,2 + u2,2 + u1,1 + u1,3)
u3,2 = (1/4)(u2,2 + u4,2 + u3,1 + u3,3)
u2,1 = (1/4)(u1,1 + u3,1 + u2,0 + u2,2).
These are the initial values of u at the nine internal mesh points; their accuracy can be improved by any iterative method.
Note 9.4.1 The above scheme gives better first approximate values; but ui,j = 0 for i, j = 1, 2, . . . , N − 1 may also be taken as the first approximate values of u. Sometimes for these initial values the iteration converges slowly.
Example 9.4.1 Find the first approximate values at the interior mesh points of the
following Dirichlet’s problem
uxx + uyy = 0,
u(x, 0) = 0, u(0, y) = 0,
u(x, 1) = 10x, u(1, y) = 20y.
[Figure: mesh for the problem with h = 1/4; along y = 0 the values u0,0, u1,0, u2,0, u3,0, u4,0 are all 0, at x = 0, 1/4, 1/2, 3/4, 1.]
The diagonal five-point formula is used to find the values of u2,2 , u1,3 , u3,3 , u1,1 and
u3,1 . Thus
u2,2 = (1/4)(u0,0 + u4,4 + u0,4 + u4,0) = (1/4)(0 + 10 + 0 + 0) = 2.5
u1,3 = (1/4)(u0,2 + u2,4 + u0,4 + u2,2) = (1/4)(0 + 5 + 0 + 2.5) = 1.875
u3,3 = (1/4)(u2,2 + u4,4 + u2,4 + u4,2) = (1/4)(2.5 + 10 + 5 + 5) = 5.625
u1,1 = (1/4)(u0,0 + u2,2 + u0,2 + u2,0) = (1/4)(0 + 2.5 + 0 + 0) = 0.625
u3,1 = (1/4)(u2,0 + u4,2 + u2,2 + u4,0) = (1/4)(0 + 5 + 2.5 + 0) = 1.875.
The values of u2,3 , u1,2 , u3,2 and u2,1 are obtained by using standard five-point formula.
u2,3 = (1/4)(u1,3 + u3,3 + u2,2 + u2,4) = (1/4)(1.875 + 5.625 + 2.5 + 5) = 3.75
u1,2 = (1/4)(u0,2 + u2,2 + u1,1 + u1,3) = (1/4)(0 + 2.5 + 0.625 + 1.875) = 1.25
u3,2 = (1/4)(u2,2 + u4,2 + u3,1 + u3,3) = (1/4)(2.5 + 5 + 1.875 + 5.625) = 3.75
u2,1 = (1/4)(u1,1 + u3,1 + u2,0 + u2,2) = (1/4)(0.625 + 1.875 + 0 + 2.5) = 1.25.
Thus we obtain the first approximate values of u at the interior mesh points.
ui,j = (1/4)(ui−1,j + ui+1,j + ui,j−1 + ui,j+1 − h²gi,j).  (9.39)
Let u(r)i,j denote the rth iterative value of ui,j.
Jacobi’s method
The iterative scheme to solve (9.39) for the interior mesh points is
u(r+1)i,j = (1/4)[u(r)i−1,j + u(r)i+1,j + u(r)i,j−1 + u(r)i,j+1 − h²gi,j].  (9.40)
Gauss-Seidel’s method
In this method, the most recently computed values as soon as they are available are
used and the values of u along each row are computed systematically from left to right.
The iterative formula is
u(r+1)i,j = (1/4)[u(r+1)i−1,j + u(r)i+1,j + u(r+1)i,j−1 + u(r)i,j+1 − h²gi,j].  (9.41)
The rate of convergence of this method is twice as fast as Jacobi's method.
Successive Over-Relaxation or S.O.R. method
In this method, the rate of convergence of an iterative method is accelerated by making
corrections on [u(r+1)i,j − u(r)i,j]. If ū(r+1)i,j is the value obtained from a basic iterative (such as Jacobi's or Gauss-Seidel's) method, then the value at the next iteration is given by
u(r+1)i,j = w ū(r+1)i,j + (1 − w) u(r)i,j,  (9.42)
where w is the over-relaxation factor.
Thus, the Jacobi’s over-relaxation method for the Poisson’s equation is
u(r+1)i,j = (w/4)[u(r)i−1,j + u(r)i+1,j + u(r)i,j−1 + u(r)i,j+1 − h²gi,j] + (1 − w)u(r)i,j  (9.43)
and the Gauss-Seidel’s over-relaxation method for that is
u(r+1)i,j = (w/4)[u(r+1)i−1,j + u(r)i+1,j + u(r+1)i,j−1 + u(r)i,j+1 − h²gi,j] + (1 − w)u(r)i,j.  (9.44)
The rate of convergence of (9.43) and (9.44) depends on the value of w and its value
lies between 1 and 2. But, the choice of w is a difficult task.
It may be noted that w = 1 gives the corresponding basic iteration formula.
Solution. The quantities u2,2 , u1,3 , u3,3 , u1,1 and u3,1 are computed using diagonal
five-point formula and let those be the first approximate values. That is,
u(1)2,2 = (1/4)(u0,0 + u4,4 + u0,4 + u4,0) = 45.0
u(1)1,3 = (1/4)(u0,2 + u2,4 + u0,4 + u2,2) = 23.75
u(1)3,3 = (1/4)(u2,2 + u4,4 + u2,4 + u4,2) = 41.25
u(1)1,1 = (1/4)(u0,0 + u2,2 + u0,2 + u2,0) = 48.75
u(1)3,1 = (1/4)(u2,0 + u4,2 + u2,2 + u4,0) = 66.25.
[Figure: boundary values for this example: left side 20, 30, 40 (top to bottom); right side 50, 60, 70; bottom row 50, 60, 70, 80, 90; interior points u1,1, . . . , u3,3.]
The values of u2,3 , u1,2 , u3,2 and u2,1 are computed using standard five-point formula.
u(1)2,3 = (1/4)(u1,3 + u3,3 + u2,2 + u2,4) = 32.5
u(1)1,2 = (1/4)(u0,2 + u2,2 + u1,1 + u1,3) = 36.875
u(1)3,2 = (1/4)(u2,2 + u4,2 + u3,1 + u3,3) = 53.125
u(1)2,1 = (1/4)(u1,1 + u3,1 + u2,0 + u2,2) = 57.5.
Thus the first approximate values are obtained for nine internal mesh points. These
values can be improved by using any iterative method. Here Gauss-Seidel’s iterative
method is used to obtain better approximate values.
Example 9.4.3 Solve the Laplace’s equation uxx + uyy = 0 in the domain shown in
Figure 9.13, by (a) Gauss-Seidel’s method, (b) Jacobi’s method, and (c) Gauss-Seidel
S.O.R. method.
[Figure 9.13: square domain with four interior mesh points u1,1, u2,1, u1,2, u2,2; the boundary value is 10 along the top side and 0 along the other three sides.]
Solution.
(a) Gauss-Seidel’s method
Let u2,1 = u1,2 = u2,2 = u1,1 = 0 at the beginning.
The Gauss-Seidel’s iteration scheme is
u(r+1)1,1 = (1/4)[u(r)2,1 + u(r)1,2]
u(r+1)2,1 = (1/4)[u(r+1)1,1 + u(r)2,2]
u(r+1)1,2 = (1/4)[u(r+1)1,1 + u(r)2,2 + 10]
u(r+1)2,2 = (1/4)[u(r+1)1,2 + u(r+1)2,1 + 10].
4
For r = 0,
u(1)1,1 = (1/4)(0 + 0) = 0
u(1)2,1 = (1/4)(0 + 0) = 0
u(1)1,2 = (1/4)(0 + 0 + 10) = 2.5
u(1)2,2 = (1/4)(2.5 + 0 + 10) = 3.125.
The subsequent iterations are shown below.
This result shows that the method converges at the 6th step. But, to get the same result using Gauss-Seidel's method (in this case w = 1), 11 iterations are needed. Also, the rate of convergence depends on the value of w. For some specific values of w, the value of r at which the solution converges is tabulated below.
w : 1    1.05  1.07  1.08  1.09  1.1  1.2  1.3  1.4  1.009  1.1009  1.109
r : 11    9     7     6     7     6    9   10   16    11      6       7
Example 9.4.4 Solve the Poisson’s equation uxx +uyy = 8x2 y 2 for the square region
0 ≤ x ≤ 1, 0 ≤ y ≤ 1 with h = 1/3 and the values of u on the boundary are everywhere
where zero. Use (a) Gauss-Seidel’s method, and (b) Gauss-Seidel’s S.O.R. method.
Solution. In this problem, g(x, y) = 8x2 y 2 , h = 1/3 and the boundary conditions
are u0,0 = u1,0 = u2,0 = u3,0 = 0, u0,1 = u0,2 = u0,3 = 0,
u1,3 = u2,3 = u3,3 = 0, u3,1 = u3,2 = 0.
(a) The Gauss-Seidel’s iteration scheme is
u(r+1)i,j = (1/4)[u(r+1)i−1,j + u(r)i+1,j + u(r+1)i,j−1 + u(r)i,j+1 − h²g(ih, jk)].
Now, g(ih, jk) = 8h⁴i²j² = (8/81)i²j². Thus
For w = 1.1, the values of u1,1 , u2,1 , u1,2 and u2,2 are shown below.
#include<stdio.h>
#include<math.h>
int main()
{
int i,j,N,flag;
float u[6][6],un[6][6],h,k,w,eps=1e-5,temp;
float g(float x,float y);
float f1(float x);
float f2(float y);
float f3(float x);
float f4(float y);
printf("Enter the number of mesh points N ");
scanf("%d",&N);
printf("Enter the step sizes of x (h) and y (k) ");
scanf("%f %f",&h,&k);
printf("Enter the relaxation factor w ");
scanf("%f",&w);
/* set boundary conditions */
for(i=0;i<=N;i++) for(j=0;j<=N;j++) u[i][j]=0;
for(i=0;i<=N;i++){
u[i][0]=f1(i*h); u[i][N]=f3(i*h);
}
for(j=0;j<=N;j++){
u[0][j]=f2(j*k); u[N][j]=f4(j*k);
}
for(i=0;i<=N;i++)
for(j=0;j<=N;j++) un[i][j]=u[i][j];
printf("The values of N, h, k and w are respectively\n");
printf("N=%3d, h=%6.4f, k=%6.4f, w=%5.3f\n",N,h,k,w);
do
{
for(i=1;i<=N-1;i++)
for(j=1;j<=N-1;j++) u[i][j]=un[i][j];
for(i=1;i<=N-1;i++)
for(j=1;j<=N-1;j++)
un[i][j]=0.25*w*(un[i-1][j]+u[i+1][j]+un[i][j-1]
+u[i][j+1]-h*h*g(i*h,j*k))+(1-w)*u[i][j];
flag=0;
for(i=1;i<=N-1;i++)
for(j=1;j<=N-1;j++) if(fabs(un[i][j]-u[i][j])>eps) flag=1;
}while(flag==1);
/* printing of the boundary and internal values */
printf("The interior and boundary values are shown below\n");
printf(" ");
for(i=0;i<=N;i++) printf("%8.5f ",i*h);printf("\n");
printf("---------");
for(i=0;i<=N;i++) printf("---------");printf("\n");
for(i=0;i<=N;i++){
printf("%8.5f| ",i*k);
for(j=0;j<=N;j++) printf("%8.5f ",un[i][j]);printf("\n");
}
}
/* definitions of the functions */
float g(float x,float y)
{
return(-2*x*x+y*y);
}
float f1(float x)
{
return 0;
}
float f2(float y)
{
return 0;
}
float f3(float x)
{
return 0;
}
float f4(float y)
{
return 0;
}
A sample of input/output:
Enter the number of mesh points N 5
Enter the step sizes of x (h) and y (k) 0.5 0.1
Enter the relaxation factor w 1.1
The values of N, h, k and w are respectively
N= 5, h=0.5000, k=0.1000, w=1.100
9.5 Stability
Example 9.5.1 Investigate the stability of the parabolic equation ∂u/∂t = ∂²u/∂x² which is approximated by the finite difference scheme
(up,q+1 − up,q)/k = (up−1,q+1 − 2up,q+1 + up+1,q+1)/h²
at (ph, qk).
where r = k/h2 .
That is,
ξ = 1/(1 + 4r sin²(βh/2)).  (9.48)
Clearly 0 < ξ ≤ 1 for all r > 0 and all β. Therefore, the difference scheme is
unconditionally stable.
Solution. Substituting up,q = e^(iβph) ξ^q in the above equation, we obtain the relation
ξ − 2 + ξ⁻¹ = r²{e^(iβh) − 2 + e^(−iβh)} = r²{2 cos βh − 2} = −4r² sin²(βh/2)
Here, u does not increase exponentially with t. As the difference equation is a three-time-level scheme, a necessary condition for stability is that |ξ| ≤ 1.
It is clear from (9.49) that a ≤ 1 because r, k and β are real.
If a < −1, then |ξ2| > 1 and this leads to instability. When −1 ≤ a ≤ 1, i.e., a² ≤ 1, then ξ1 and ξ2 are complex and they are given by
ξ1 = a + i√(1 − a²), ξ2 = a − i√(1 − a²).
Hence |ξ1| = |ξ2| = √(a² + (1 − a²)) = 1, and this shows that a necessary condition for stability is −1 ≤ a ≤ 1. That is, −1 ≤ 1 − 2r² sin²(βh/2) ≤ 1. The right hand side inequality is obvious; the left hand side inequality is the useful one. Then
−1 ≤ 1 − 2r² sin²(βh/2) or, r² sin²(βh/2) ≤ 1.
This gives r = ck/h ≤ 1.
This condition is also a sufficient condition.
9.6 Exercise
3. Solve the differential equation ∂u/∂t = ∂²u/∂x², 0 ≤ x ≤ 1/2, given that u = 0 when t = 0, 0 ≤ x ≤ 1/2, and with boundary conditions ∂u/∂x = 0 at x = 0 and ∂u/∂x = 1 at x = 1/2 for t > 0, taking h = 0.1, k = 0.001.
4. Solve the following initial value problem ft = fxx, 0 ≤ x ≤ 1 subject to the initial condition f(x, 0) = cos(πx/2) and the boundary conditions f(0, t) = 1, f(1, t) = 0 for t > 0, taking h = 1/3, k = 1/3.
u(x, 0) = 2x for 0 ≤ x ≤ 1/2, and 2(1 − x) for 1/2 ≤ x ≤ 1,
9. Solve the Laplace equation uxx + uyy = 0 taking h = 1, with boundary values as shown below.
[Figure: boundary values: top side 0, 8, 10, 12, 16 (left to right); right side 28, 24, 22 with bottom-right corner 20; bottom side 0, 10, 15, 18, 20; left side 0; interior unknowns u1,1, . . . , u3,3.]
10. Solve the elliptic differential equation uxx + uyy = 0 for the region bounded
by 0 ≤ x ≤ 5, 0 ≤ y ≤ 5, the boundary conditions being
u = 0 at x = 0 and u = 2 + y at x = 5,
u = x2 at y = 0 and u = 2x at y = 5.
Take h = k = 1. Use
(a) Jacobi’s method, (b) Gauss-Seidel’s method, and (c) Gauss-Seidel’s S.O.R.
method.
Chapter 10
In science and engineering, experimental data are usually viewed by plotting them as graphs on plane paper. But the problem is: what is the 'best curve' for a given set of data? If n data points (xi, yi), i = 1, 2, . . . , n are given, then a polynomial of degree n − 1 can be constructed using interpolation methods, such as Lagrange's, Newton's, etc. But the handling of a higher degree polynomial is practically difficult; though it gives exact values at the given nodes x1, x2, . . . , xn, it gives errors at the other points. For the given data points we can construct lower degree polynomials such as linear, quadratic, etc., and other types of curves, viz., geometric, exponential, etc., using the least squares method, which minimizes the sum of squares of the absolute errors. The curve fitted by this method does not always give exact values at the given nodes x1, x2, . . . , xn and the other points. The least squares method is used to fit polynomials of different degrees, other special curves and also to fit orthogonal polynomials, etc.
Let the approximating curve be
y = g(x; a0, a1, . . . , ak),  (10.1)
where a0, a1, . . . , ak are unknown parameters, and let Yi denote the value of y obtained from this curve at x = xi, i.e.,
Yi = g(xi; a0, a1, . . . , ak).  (10.2)
S = Σ(yi − Yi)² = Σ[yi − g(xi; a0, a1, . . . , ak)]²,  (10.3)
the sums being taken over i = 1, 2, . . . , n.
S = Σ(yi − Yi)² = Σ[yi − g(xi; a0*, a1*, . . . , ak*)]².  (10.6)
Let
y = a + bx (10.7)
be the equation of a straight line, where a and b are two parameters whose values are
to be determined. Let (xi , yi ), i = 1, 2, . . . , n, be a given sample of size n.
Here S is given by
S = Σ(yi − Yi)² = Σ(yi − a − bxi)².
Least Squares Approximation 621
Example 10.2.1 Use least squares method to fit the line y = a + bx based on the sample (2, 1), (1/6, −5/6), (−3/2, −2) and (−1/3, −2/3). Estimate the total error.
xi yi x2i xi yi
2 1 4 2
1/6 –5/6 1/36 –5/36
–3/2 –2 9/4 3
–1/3 –2/3 1/9 2/9
Total 1/3 –5/2 115/18 61/12
y = −0.6943 + 0.8319x.
Estimation of error.
Algorithm 10.1 (Straight line fit). This algorithm fits a straight line for the
given data points (xi , yi ), i = 1, 2, . . . , n, by least squares method.
Program 10.1 .
/* Program Straight Line Fit
Program to fit a straight line for the given data points
(x[i],y[i]), i=1, 2, . . ., n, by least squares method.*/
#include<stdio.h>
#include<math.h>
int main()
{
int n,i; float x[50], y[50], sx=0,sy=0,sxy=0,sx2=0,xb,yb,a,b;
char sign;
printf("Enter the sample size and the sample (x[i],y[i]) ");
scanf("%d",&n);
for(i=1;i<=n;i++) scanf("%f %f",&x[i],&y[i]);
for(i=1;i<=n;i++){
sx+=x[i]; sy+=y[i];
}
xb=sx/n; yb=sy/n;
for(i=1;i<=n;i++){
sxy+=(x[i]-xb)*(y[i]-yb);
sx2+=(x[i]-xb)*(x[i]-xb);
}
b=sxy/sx2; a=yb-b*xb;
sign=(b<0)? ’-’:’+’;
printf("\nThe fitted line is y = %f %c %f x",a,sign,fabs(b));
} /* main */
A sample of input/output:
Enter the sample size and the sample (x[i],y[i]) 5
1 12   4 13   6 10   7 8   10 3
The fitted line is y = 15.097345 - 1.053097 x
Let
y = a + bx + cx2 (10.12)
be the second degree parabolic curve, where a, b, c are unknown parameters and the
values of them are to be determined based on the sample values (xi , yi ), i = 1, 2, . . . , n.
Assume that Yi is the predicted value of y obtained from the curve (10.12) at x = xi ,
i.e.,
Yi = a + bxi + cx2i .
y = a∗ + b∗ x + c∗ x2 (10.13)
Example 10.3.1 Fit a parabola to the following data by taking x as the indepen-
dent variable.
x : 1 2 3 4 5 6 7 8 9
y : 2 6 7 8 10 11 11 10 9
y = a + bu + cu2
74 = 9a + 0·b + 60·c
−51 = 0·a + 60·b + 0·c
411 = 60·a + 0·b + 708·c.
Let
y = a0 + a1 x + a2 x2 + · · · + ak xk , (10.14)
That is,
na0 + a1Σxi + a2Σxi² + · · · + akΣxiᵏ = Σyi
a0Σxi + a1Σxi² + a2Σxi³ + · · · + akΣxiᵏ⁺¹ = Σxiyi  (10.16)
a0Σxi² + a1Σxi³ + a2Σxi⁴ + · · · + akΣxiᵏ⁺² = Σxi²yi
· · · · · · · · ·
a0Σxiᵏ + a1Σxiᵏ⁺¹ + a2Σxiᵏ⁺² + · · · + akΣxi²ᵏ = Σxiᵏyi.
Let
y = axb (10.18)
where
Y = log y, X = log x, A = log a.
Example 10.5.1 Fit a curve of the type y = axb to the following points.
x : 1 2 3 4 5
y : 3.5 6.2 9.5 15.3 20.4
Solution. Let y = axb . Then its corresponding linear curve is Y = A + bX, where
log y = Y, log x = X, log a = A.
The normal equations are
ΣYi = nA + bΣXi
ΣXiYi = AΣXi + bΣXi².
The values of Xi, Xi², Yi, XiYi are calculated in the following table.
x y X Y X2 XY
1 3.5 0 1.25276 0 0
2 6.2 0.69315 1.82455 0.48046 1.26469
3 9.5 1.09861 2.25129 1.20694 2.47329
4 15.3 1.38629 2.72785 1.92180 3.78159
5 20.4 1.60944 3.01553 2.59030 4.85331
Total 4.78749 11.07198 6.19950 12.37288
5A + 4.78749b = 11.07198,
4.78749A + 6.19950b = 12.37288.
y = aebx . (10.20)
Sometimes it may happen that while fitting a curve based on a given sample (xi, yi), i = 1, 2, . . . , n, some data points are more significant than the others. In this situation, we find a curve such that it either passes through those important points or passes very near to them. The amount of 'importance' can be introduced by assigning weights to those data points. If all the data have the same importance, then the weights are set to 1.
−2 Σwi[yi − (a + bxi)] = 0
and −2 Σwi[yi − (a + bxi)]xi = 0
a Σwi + b Σwixi = Σwiyi and a Σwixi + b Σwixi² = Σwixiyi.
These are the normal equations and give the values of a and b.
to a straight line by considering that the data points (2,15) and (4,18) are more significant or reliable, with weights 5 and 10 respectively.
Solution. Let the straight line be y = a + bx. The normal equations are
a Σwi + b Σwixi = Σwiyi
and a Σwixi + b Σwixi² = Σwixiyi.
The normal equations are then 17a + 56b = 290 and 56a + 216b = 1020.
The solution of these equations is a = 10.29851, b = 2.05224.
Thus the fitted line is
y = 10.29851 + 2.05224x.
Estimation of error.
Example 10.6.2 Consider the above example again with the modified weights 25
and 40 instead of 5 and 10.
In the previous sections, the least squares method is considered for the discrete data.
This method is also applicable for continuous data.
Let y = f (x) be a continuous function on [a, b] and it is to be approximated by the
kth degree polynomial
y = a0 + a1 x + a2 x2 + · · · + ak xk . (10.22)
∂S/∂a0 = ∂S/∂a1 = · · · = ∂S/∂ak = 0.  (10.24)
Since w(x) and y = f (x) are known, the above equations form a system of linear
equations with (k + 1) unknowns a0 , a1 , . . . , ak . This system of equations possesses a
unique solution. If
a0 = a∗0 , a1 = a∗1 , . . . , ak = a∗k
a0 ∫₀¹ dx + a1 ∫₀¹ x dx + a2 ∫₀¹ x² dx = ∫₀¹ eˣ dx
a0 ∫₀¹ x dx + a1 ∫₀¹ x² dx + a2 ∫₀¹ x³ dx = ∫₀¹ x eˣ dx
a0 ∫₀¹ x² dx + a1 ∫₀¹ x³ dx + a2 ∫₀¹ x⁴ dx = ∫₀¹ x² eˣ dx.
It is well known that the Taylor's series expansions of y = eˣ up to second and third degree terms are
y2 = 1 + x + x²/2  (10.27)
y3 = 1 + x + x²/2 + x³/6.  (10.28)
The values of y obtained from (10.26), (10.27) and (10.28) are listed below.
x    y1      y2      y3      Exact
0.0 1.01299 1.00000 1.00000 1.00000
0.1 1.10650 1.10500 1.10517 1.10517
0.2 1.21678 1.22000 1.22133 1.22140
0.3 1.34386 1.34500 1.34950 1.34986
0.4 1.48771 1.48000 1.49067 1.49182
0.5 1.64835 1.62500 1.64583 1.64872
0.6 1.82577 1.78000 1.81600 1.82212
0.7 2.01998 1.94500 2.00217 2.01375
0.8 2.23097 2.12000 2.20533 2.22554
0.9 2.45874 2.30500 2.42650 2.45960
1.0 2.70330 2.50000 2.66667 2.71828
This table shows that the least squares quadratic approximation gives better results than the second degree Taylor's series approximation, and even than the third degree approximation.
where fi (x) is a polynomial in x of degree i. Then as in the previous cases, the residue
is given by
S = ∫_a^b w(x)[y − {a0 f0(x) + a1 f1(x) + · · · + ak fk(x)}]² dx.        (10.30)
For minimum S,
∂S/∂a0 = 0, ∂S/∂a1 = 0, . . . , ∂S/∂ak = 0.
Thus f2*(x) = x² − 1/3 = (1/3)(3x² − 1).
Again, f3*(x) = x³ − a30 f0*(x) − a31 f1*(x) − a32 f2*(x)
where
a30 = ∫_{−1}^{1} x³ dx / ∫_{−1}^{1} dx = 0,
a31 = ∫_{−1}^{1} x³ · x dx / ∫_{−1}^{1} x² dx = 3/5,
a32 = ∫_{−1}^{1} x³ · (1/3)(3x² − 1) dx / ∫_{−1}^{1} (1/9)(3x² − 1)² dx = 0.
Orthogonal property
We mentioned earlier that the Chebyshev polynomials are orthogonal, and they are
orthogonal with respect to the weight function (1 − x²)^{−1/2}, i.e.,

∫_{−1}^{1} Tn(x)Tm(x)/√(1 − x²) dx = 0 for m ≠ n.

To prove it, let x = cos θ. Then

I = ∫_{−1}^{1} Tn(x)Tm(x)/√(1 − x²) dx = ∫_0^π Tn(cos θ)Tm(cos θ) dθ
  = ∫_0^π cos nθ cos mθ dθ = (1/2) ∫_0^π [cos(m + n)θ + cos(m − n)θ] dθ
  = (1/2) [sin(m + n)θ/(m + n) + sin(m − n)θ/(m − n)]_0^π  (valid when m ≠ n).

Now, when m ≠ n this gives I = 0; when m = n = 0 the integrand is 1, so I = π; and
when m = n ≠ 0, I = ∫_0^π cos² nθ dθ = π/2.
Thus

∫_{−1}^{1} Tn(x)Tm(x)/√(1 − x²) dx =  0,    if m ≠ n
                                       π,    if m = n = 0                 (10.41)
                                       π/2,  if m = n ≠ 0.
Similarly, from the relations ∂S/∂a1 = 0 and ∂S/∂a2 = 0, we obtain

a1 = ∫_{−1}^{1} w(x)f(x)T1(x) dx / ∫_{−1}^{1} w(x)T1²(x) dx = (2/π) ∫_{−1}^{1} x dx = 0

a2 = ∫_{−1}^{1} w(x)f(x)T2(x) dx / ∫_{−1}^{1} w(x)T2²(x) dx = (2/π) ∫_{−1}^{1} (2x² − 1) dx = −4/(3π).
1  = T0(x)
x  = T1(x)
x² = (1/2)[T0(x) + T2(x)]
x³ = (1/4)[3T1(x) + T3(x)]
x⁴ = (1/8)[3T0(x) + 4T2(x) + T4(x)]                                      (10.43)
x⁵ = (1/16)[10T1(x) + 5T3(x) + T5(x)]
x⁶ = (1/32)[10T0(x) + 15T2(x) + 6T4(x) + T6(x)]
x⁷ = (1/64)[35T1(x) + 21T3(x) + 7T5(x) + T7(x)].
[Figure: graphs of the Chebyshev polynomials T0(x), T1(x), T2(x), T3(x) and T4(x) for −1 ≤ x ≤ 1.]
Solution. x³ + 2x² − 7 = (1/4)[3T1(x) + T3(x)] + 2 · (1/2)[T0(x) + T2(x)] − 7T0(x)
= (1/4)T3(x) + T2(x) + (3/4)T1(x) − 6T0(x).
But, the integral in the denominator of ai is improper and it is not easy to evaluate.
So, a discretization technique is adopted here to approximate a function using Chebyshev
polynomials. The orthogonal property for discrete case is stated below.
Σ_{k=0}^{n} Ti(xk)Tj(xk) =  0,          if i ≠ j
                            (n + 1)/2,  if i = j ≠ 0                     (10.44)
                            n + 1,      if i = j = 0

where xk = cos((2k + 1)π/(2n + 2)), k = 0, 1, 2, . . . , n.
This result is used to establish the following theorem.
a0 = (1/(n + 1)) Σ_{j=0}^{n} f(xj)T0(xj) = (1/(n + 1)) Σ_{j=0}^{n} f(xj),          (10.46)

where xj = cos((2j + 1)π/(2n + 2)),

and ai = (2/(n + 1)) Σ_{j=0}^{n} f(xj)Ti(xj)
       = (2/(n + 1)) Σ_{j=0}^{n} f(xj) cos((2j + 1)iπ/(2n + 2))                    (10.47)

for i = 1, 2, . . . , n.
x0 = 0.8660254, x1 = 0, x2 = −0.8660254.

a0 = (1/3)[f(x0) + f(x1) + f(x2)] = 1.2660209
a1 = (2/3) Σ_{j=0}^{2} f(xj) cos((2j + 1)π/6) = 1.1297721
a2 = (2/3) Σ_{j=0}^{2} f(xj) cos(2(2j + 1)π/6) = 0.26602093.
Algorithm Chebyshev
Input function F (x)
Step 1. Read n; //degree of the polynomial//
Step 2. Set Pi = 3.1415926535 and A = Pi/(2n + 2).
Step 3. //Calculation of nodes and function values//
for i = 0 to n do
xi = cos((2i + 1)A) and yi = F (xi );
endfor;
Step 4. //Calculation of the coefficients ai .//
for i = 0 to n do
Set ai = 0;
for j = 0 to n do
ai = ai + cos((2j + 1) ∗ i ∗ A) ∗ yj ;
endfor;
endfor;
a0 = a0 /(n + 1);
for i = 1 to n do
ai = 2ai /(n + 1);
endfor;
//Evaluation of Chebyshev polynomial approximation//
Step 5. Read x; //The point at which the function is to be evaluated//
Step 6. Set T0 = 1, T1 = x;
Step 7. //Evaluation of Chebyshev polynomial//
if n > 1 then
for i = 1 to n − 1 do
Ti+1 = 2xTi − Ti−1 ;
endfor;
endif;
Step 8. sum = a0 ∗ T0 ;
for i = 1 to n do
sum = sum + ai ∗ Ti ;
endfor;
Step 9. Print ‘The value of Chebyshev polynomial approximation is’, sum;
end Chebyshev
Program 10.2.
/* Program Chebyshev Approximation
This program approximates and evaluates a function f(x)
by Chebyshev polynomials. */
#include <stdio.h>
#include <math.h>
#define f(x) exp(x) /* definition of function f(x) */
#define pi 3.1415926535
int main()
{
int n,i,j; float a,xi,x[40],y[40],c[40];
float xg,T[40],sum;
printf("Enter the degree of the polynomial ");
scanf("%d",&n);
a=pi/(2.0*n+2.0);
for(i=0;i<=n;i++)
{
x[i]=cos((2*i+1.)*a);
xi=x[i];
y[i]=f(xi);
}
for(i=0;i<=n;i++)
{
c[i]=0;
for(j=0;j<=n;j++)
c[i]=c[i]+cos((2*j+1)*i*a)*y[j];
}
c[0]=c[0]/(n+1.);
for(i=1;i<=n;i++)
c[i]=2*c[i]/(n+1.);
/* Printing of Chebyshev coefficients*/
printf("The Chebyshev coefficients are\n ");
for(i=0;i<=n;i++) printf("%f ",c[i]);
/* Evaluation of the polynomial */
printf("\nEnter the value of x ");
scanf("%f",&xg);
T[0]=1.0; T[1]=xg;
/*Computation of Chebyshev polynomial at x*/
if(n>1)
for(i=1;i<=n-1;i++) T[i+1]=2*xg*T[i]-T[i-1];
sum=c[0]*T[0];
for(i=1;i<=n;i++)
sum+=c[i]*T[i];
printf("\nThe value of the Chebyshev polynomial approximation"
       " at %6.3f is %9.8f",xg,sum);
} /* main */
A sample of input/output:
Minimax principle
It is well known that the error in polynomial interpolation is

En(x) = wn+1(x) f^{(n+1)}(ξ)/(n + 1)!,

where wn+1(x) = (x − x0)(x − x1) · · · (x − xn).
The expression max_{−1≤x≤1} |f^{(n+1)}(ξ)| is fixed for a given function f(x). Thus the error
bound depends on the value of |wn+1(x)|, and the value of |wn+1(x)| depends on the
choice of the nodes x0, x1, . . . , xn. The maximum error depends on the product of
max_{−1≤x≤1} |wn+1(x)| and max_{−1≤x≤1} |f^{(n+1)}(ξ)|. The Russian mathematician
Chebyshev showed that x0, x1, . . . , xn should be chosen such that wn+1(x) = 2^{−n}Tn+1(x).
The polynomial 2^{−n}Tn+1(x) is the monic Chebyshev polynomial.
If n is fixed, then among all possible choices for wn+1(x), and thus among all possible
choices for the nodes x0, x1, . . . , xn on [−1, 1], the polynomial T̃n+1(x) = 2^{−n}Tn+1(x)
is the unique choice that satisfies the relation

max_{−1≤x≤1} |T̃n+1(x)| ≤ max_{−1≤x≤1} |wn+1(x)|.
cos x = 1 − x²/2! + x⁴/4! − x⁶/6! + x⁸/8! − · · ·
correct to four significant digits.
Solution. Since the result is required correct to four significant digits, and the
coefficient of the term x⁸, 1/8! = 0.0000248, can affect the fifth decimal place only,
the terms after the fourth term may be truncated.
Hence
cos x = 1 − x²/2! + x⁴/4! − x⁶/6!.                                       (10.51)
Now, express the above series using Chebyshev polynomials, as

cos x = T0(x) − (1/2!) · (1/2)[T0(x) + T2(x)] + (1/4!) · (1/8)[3T0(x) + 4T2(x) + T4(x)]
        − (1/6!) · (1/32)[10T0(x) + 15T2(x) + 6T4(x) + T6(x)]
      = 0.7651910 − 0.2298177 T2(x) + 0.0049479 T4(x) − 0.0000434 T6(x).
Again, the term 0.0000434 T6 (x) does not affect the fourth decimal place, so it is
discarded and the economized series is
cos x = 0.7651910 − 0.2298177 T2 (x) + 0.0049479 T4 (x).
In terms of x, the series is
cos x ≃ 0.9999566 − 0.4992186 x² + 0.0395832 x⁴.
This expression gives the values of cos x correct up to four decimal places.
10.10 Exercise
x : 1 3 5 7 9
y : 10 13 25 23 33
x : 2 4 6 7 9
y : 6 10 8 15 20
w : 1 1 5 10 1
x : 0 1 2 3 4 5
y : 0 4 16 36 64 100
Estimate the residue and conclude about the data. Find the predicted value of y
when x = −1.
6. Fit a straight line and a second degree parabola for the following data and explain
which curve is better fitted for the given data by estimating the residue.
x : –1 0 3 4 6
y : 0 11 22 25 34
7. Fit a curve of the form ae−bx for the data given below.
x : 1 2 3 4 5
y : 0.6 1.9 4.4 7.8 11.9
9. Fit the curves y1 = ceax and y2 = 1/(ax + b) by using the following data.
x : −1 0 1 2 3
y : 6.62 3.98 2.76 1.25 0.5
Also find the sum of squares of errors in both the cases and conclude which curve
is better for these data points.
10. Find the set of equations to find the values of a, b, c when the curve y = a + b/x + c/x²
is to be fitted for the data points (xi, yi), i = 1, 2, . . . , n.
13. By expanding the expression (cos θ + i sin θ)^n, show that the coefficient of x^n in
Tn(x) is 2^{n−1}.
14. Show that Tn(x) is a polynomial in x of degree n. Also show that Tn(x) is an even
or an odd function according as n is even or odd.
15. Show that the Chebyshev polynomials are orthogonal with respect to the weight
function w(x) = (1 − x2 )−1/2 .
16. Use the Gram-Schmidt orthogonalization process to find the first three orthogonal
polynomials whose weight function is w(x) = e^{−x²}.