
ITERATIVE METHODS FOR THE SOLUTION OF EQUATIONS

J. F. TRAUB
BELL TELEPHONE LABORATORIES, INCORPORATED
MURRAY HILL, NEW JERSEY

TO SUSANNE

PREFACE

This book presents a general theory of iteration algorithms for the numerical solution of equations and systems of equations. The relationship between the quantity and quality of information used by an algorithm and the efficiency of the algorithm is investigated. Iteration functions are divided into four classes depending on whether they use new information at one or at several points and whether or not they reuse old information. Known iteration functions are systematized and new classes of computationally effective iteration functions are introduced. Our interest in the efficient use of information is influenced by the widespread use of computing machines.

The mathematical foundations of our subject are treated with rigor, but rigor in itself is not the main object. Some of the material is of wider application than to our theory. Belonging to this category are Chapter 3, "The Mathematics of Difference Relations"; Appendix A, "Interpolation"; and Appendix D, "Acceleration of Convergence". The inclusion of Chapter 12, "A Compilation of Iteration Functions", permits the use of this book as a handbook of iteration functions. Extensive numerical experimentation was performed on a computer; a selection of the results is reported in Appendix E.

The solution of equations is a venerable subject. Among the mathematicians who have made their contribution are Cauchy, Chebyshev, Euler, Fourier, Gauss, Lagrange, Laguerre, and Newton. E. Schröder wrote a classic paper on the subject in 1870. A glance at the bibliography indicates the level of contemporary interest. Perhaps the most important recent contribution is the book by Ostrowski; papers by Bodewig and Zajta are also outstanding.

Most of the material is new and unpublished. Every attempt has been made to keep the subject in proper historical perspective. Some of the material has been orally presented at meetings of the Association for Computing Machinery in 1961, 1962, and 1963, the American Mathematical Society in 1962 and 1963, and the International Congress of Mathematicians in 1962.

I wish to acknowledge with sincere appreciation the assistance I have received from my friends and colleagues at Bell Telephone Laboratories, Incorporated. I am particularly indebted to M. D. McIlroy, J. Morrison, and H. O. Pollak for numerous important suggestions. I want to thank A. J. Goldstein, R. W. Hamming, and E. N. Gilbert for stimulating conversations and valuable comments. My thanks to Professor G. E. Forsythe of Stanford University for his encouragement and comments during the preparation of the manuscript and for reading the final manuscript. My appreciation to S. P. Morgan and Professor A. Ralston of Stevens Institute of Technology who also read the manuscript. I am particularly grateful to J. Riordan for suggesting numerous improvements in style.

I want to thank Miss Nancy Morris for always digging up just one more reference and Mrs. Helen Carlson for editing the final manuscript. I am grateful to Mrs. Elizabeth Jenkins for her splendid supervision of the preparation of the difficult manuscript and to Miss Joy Catanzaro for her speedy and accurate typing.

To my wife, for her never-failing support and encouragement as well as her assistance in editing and proofreading, I owe the principal acknowledgment.

J. F. TRAUB

TABLE OF CONTENTS

PREFACE
TERMINOLOGY
GLOSSARY OF SYMBOLS

1. GENERAL PRELIMINARIES
1.1 Introduction
1.2 Basic Concepts and Notations
1.21 Some concepts and notations
1.22 Classification of iteration functions
1.23 Order
1.24 Concepts related to order

2. GENERAL THEOREMS ON ITERATION FUNCTIONS
2.1 The Solution of a Fixed Point Problem
2.2 Linear and Superlinear Convergence
2.21 Linear convergence
2.22 Superlinear convergence
2.23 The advantages of higher order iteration functions
2.3 The Iteration Calculus
2.31 Preparation
2.32 The theorems of the iteration calculus

3. THE MATHEMATICS OF DIFFERENCE RELATIONS
3.1 Convergence of Difference Inequalities
3.2 A Theorem on the Solutions of Certain Inhomogeneous Difference Equations
3.3 On the Roots of Certain Indicial Equations
3.31 The properties of the roots
3.32 An important special case
3.4 The Asymptotic Behavior of the Solutions of Certain Equations
3.41 Introduction
3.42 Difference equations of type 1
3.43 Difference equations of type 2

4. INTERPOLATORY ITERATION FUNCTIONS
4.1 Interpolation and the Solution of Equations
4.11 Statement and solution of an interpolation problem
4.12 Relation of interpolation to the calculation of roots
4.2 The Order of Interpolatory Iteration Functions
4.21 The order of iteration functions generated by inverse interpolation
4.22 The equal information case
4.23 The order of iteration functions generated by direct interpolation
4.3 Examples

5. ONE-POINT ITERATION FUNCTIONS
5.1 The Basic Sequence E_s
5.11 The formula for E_s
5.12 An example
5.13 The structure of E_s
5.2 Rational Approximations to E_s
5.21 Iteration functions generated by rational approximation to E_s
5.22 The formulas of Halley and Lambert
5.3 A Basic Sequence of Iteration Functions Generated by Direct Interpolation
5.31 The basic sequence
5.32 The iteration function
5.33 Reduction of degree
5.4 The Fundamental Theorem of One-Point Iteration Functions
5.5 The Coefficients of the Error Series of E_s
5.51 A recursion formula for the coefficients
5.52 A theorem concerning the coefficients

6. ONE-POINT ITERATION FUNCTIONS WITH MEMORY
6.1 Interpolatory Iteration Functions
6.11 Comments
6.12 Examples
6.2 Derivative Estimated One-Point Iteration Functions with Memory
6.21 The secant iteration function and its generalization
6.22 Estimation of f^(s-1)
6.23 Estimation of the (s-1)st derivative of the inverse function
6.24 Examples
6.3 Discussion of One-Point Iteration Functions with Memory
6.31 A conjecture
6.32 Practical considerations
6.33 Iteration functions which do not use all available information
6.34 An additional term in the error equation

7. MULTIPLE ROOTS
7.1 Introduction
7.2 The Order of E_s
7.3 The Basic Sequence ℰ_s
7.31 Introduction
7.32 The structure of ℰ_s
7.33 Formulas for ℰ_s
7.4 The Coefficients of the Error Series of ℰ_s
7.5 Iteration Functions Generated by Direct Interpolation
7.51 The error equation
7.52 On the roots of an indicial equation
7.53 The order
7.54 Discussion and examples
7.6 One-Point Iteration Functions with Memory
7.7 Some General Results
7.8 An Iteration Function of Incommensurate Order

8. MULTIPOINT ITERATION FUNCTIONS
8.1 The Advantages of Multipoint Iteration Functions
8.2 A New Interpolation Problem
8.21 A new interpolation formula
8.22 Application to the construction of multipoint iteration functions
8.3 Recursively Formed Iteration Functions
8.31 Another theorem of the iteration calculus
8.32 The generalization of the previous theorem
8.33 Examples
8.34 The construction of recursively formed iteration functions
8.4 Multipoint Iteration Functions Generated by Derivative Estimation
8.5 Multipoint Iteration Functions Generated by Composition
8.6 Multipoint Iteration Functions with Memory

9. MULTIPOINT ITERATION FUNCTIONS: CONTINUATION
9.1 Introduction
9.2 Multipoint Iteration Functions of Type 1
9.21 The third order case
9.22 The fourth order case
9.3 Multipoint Iteration Functions of Type 2
9.31 The third order case
9.32 The fourth order case
9.4 Discussion of Criteria for the Selection of an Iteration Function

10. ITERATION FUNCTIONS WHICH REQUIRE NO EVALUATION OF DERIVATIVES
10.1 Introduction
10.2 Interpolatory Iteration Functions
10.21 Direct interpolation
10.22 Inverse interpolation
10.3 Some Additional Iteration Functions

11. SYSTEMS OF EQUATIONS
11.1 Introduction
11.2 The Generation of Vector-Valued Iteration Functions by Inverse Interpolation
11.3 Error Estimates for Some Vector-Valued Iteration Functions
11.31 The generalized Newton iteration function
11.32 A third order iteration function
11.33 Some other vector-valued iteration functions
11.34 A test function
11.4 Vector-Valued Iteration Functions which Require No Derivative Evaluations

12. A COMPILATION OF ITERATION FUNCTIONS
12.1 Introduction
12.2 One-Point Iteration Functions
12.3 One-Point Iteration Functions with Memory
12.4 Multiple Roots
12.41 Multiplicity known
12.42 Multiplicity unknown
12.5 Multipoint Iteration Functions
12.6 Multipoint Iteration Functions with Memory
12.7 Systems of Equations

APPENDICES

A. INTERPOLATION
A.1 Introduction
A.2 An Interpolation Problem and Its Solution
A.21 Statement of the problem
A.22 Divided differences
A.23 The Newtonian formulation
A.24 The Lagrange-Hermite formulation
A.25 The interpolation error
A.3 The Equal Information Case
A.31 The Newtonian formulation
A.32 The Lagrange-Hermite formulation
A.33 The interpolation error
A.4 The Approximation of Derivatives
A.41 Statement of the problem
A.42 Derivative estimation from the Newtonian formulation
A.43 Derivative estimation from the Lagrange-Hermite formulation
A.5 The Error in the Approximation of Derivatives
A.51 Discussion
A.52 First proof of the error formula
A.53 Second proof of the error formula

B. ON THE jTH DERIVATIVE OF THE INVERSE FUNCTION

C. SIGNIFICANT FIGURES AND COMPUTATIONAL EFFICIENCY

D. ACCELERATION OF CONVERGENCE
D.1 Introduction
D.2 Aitken's δ² Transformation
D.3 The Steffensen-Householder-Ostrowski Iteration Function

E. NUMERICAL EXAMPLES
E.1 Introduction
E.2 Growth of the Number of Significant Figures
E.3 One-Point and One-Point with Memory Iteration Functions
E.4 Multiple Roots
E.5 Multipoint Iteration Functions
E.6 Systems of Equations

F. AREAS FOR FUTURE RESEARCH

BIBLIOGRAPHY
TERMINOLOGY

Term                                                Page
asymptotic error constant                           1.2-12
basic sequence                                      1.2-18
derivative estimated iteration function             6.2-3
informational efficiency                            1.2-16
informational usage                                 1.2-16
interpolatory iteration function                    4.1-5
iteration function                                  1.2-4
multipoint iteration function                       1.2-11
multipoint iteration function with memory           1.2-11
one-point iteration function                        1.2-10
one-point iteration function with memory            1.2-10
optimal iteration function                          1.2-18
optimal basic sequence                              1.2-18
order                                               1.2-12
order is multiplicity-dependent                     1.2-13
order is multiplicity-independent                   1.2-13

GLOSSARY OF SYMBOLS

The following list, which is intended only for reference, contains the symbols which occur most frequently in this book.

a_j(x)          f^(j)(x)/j!
A_j(x)          a_j(x)/a_1(x)
α               a zero of f
B_{j,m}(x)      a_{j+m-1}(x)/[m a_m(x)]
β               dominant zero of g_{k,a}(t)
C               asymptotic error constant
d               informational usage
e               x - α
e_i             x_i - α
EFF             informational efficiency
E_s             a certain family of iteration functions (Section 5.1)
ℰ_s(x,f,m)      E_s(x, f^{1/m}, 1) (Section 7.3)
f               function whose zero is sought
f[x_i, x_{i-1}, ..., x_{i-n}]   confluent divided difference (Appendix A.2)
𝔉               the inverse function to f
g_{k,a}(t)      t^k - a(t^{k-1} + t^{k-2} + ... + 1) (Section 3.3)
I.F.            iteration function
I_p             class of iteration functions of order p
I_{p,d}         class of iteration functions with informational usage d and order p
J               Jacobian matrix
m               the multiplicity of α
n               the number of points at which old information is reused
O               order (in the sense of order of magnitude)
p               order
r               s(n+1)
s               f and its first s - 1 derivatives are the new information used per point
u(x)            f(x)/f'(x)
V(x)            [φ(x) - α]/(x - α)^p
W(x)            [φ(x) - α]/u^p(x)
x_i             approximant to α
Y_j(x)          (-1)^j 𝔉^(j)(y)/(j! [𝔉'(y)]^j), evaluated at y = f(x)
Z_j(x)          Y_j(x) u^j(x)
~               same order of magnitude
≈               approximate equality between numbers
{x | p(x)}      the set of x for which the proposition p(x) is true

CHAPTER 1

GENERAL PRELIMINARIES

The basic concepts and notations to be used

throughout this book will be introduced in Section 1.2.


1.1 Introduction

The general area into which this book falls may be labeled algorithmics. By algorithmics we mean the study of algorithms in general and the study of the convergence and efficiency of numerical algorithms in particular.

More specifically, we shall study algorithms for the solution of equations. Our approach will be to examine the relationship between the quantity and quality of information used by an algorithm and the efficiency of that algorithm. We shall investigate the effect of reusing old information and of gathering new information at certain felicitous points.

Our interest in the efficient use of information is influenced by the widespread use of high-speed computing machines. The introduction of computers has meant that many algorithms which were formerly of only academic interest become feasible for calculation. They are, in fact, used many times in many establishments on a wide variety of problems. The efficiency of these algorithms is therefore most important. Furthermore, there are situations where the acquisition of more than a certain amount of data is prohibitively expensive. It is then imperative that as much information as possible be squeezed from the available data. Here again the question of efficiency is of paramount importance.

Iteration algorithms for the solution of equations will be studied in a systematic fashion. In the course of this study, new families of computationally effective iteration algorithms will be introduced and certain well-known iteration algorithms will be identified as special cases. It is hoped that this comprehensive approach will moderate, if not prevent, the rediscovery of special cases. This uniform approach will lead to certain natural classification schemes and will permit the uniform rigorous error analysis and the uniform establishment of convergence criteria for families of iteration algorithms. The final verdict on the usefulness of the new methods will not be available until the new methods have been tried on a variety of problems arising in practice. At present, however, extensive numerical experimentation on test problems supports the theoretical error analysis.

Although we shall confine ourselves to the solution of real equations and systems of real equations, the field of potential application of this work is of much broader scope. Thus, analogous techniques may be applied to such problems as the solution of differential and integral equations and the calculation of eigenvalues. The generalization of our results to abstract spaces is of interest. The reader is referred to Appendix F for some additional discussion of this point.

1.2 Basic Concepts and Notations

1.21 Some concepts and notations. The symbols to be introduced below are global; they will preserve their meaning for the remainder of this book. A few of these symbols will be used with other than their global meanings (for example, as dummy indices in summation) but context will always make this clear.

Our investigation concerns the approximation of a zero of f(x), or equivalently, the approximation of a root of the equation f(x) = 0. The zero and root formulations will be used interchangeably. There does not seem to be a good phrase for labeling this problem. If one says that one is solving an equation, one may be concerned with the solution of a differential equation. The adjectives algebraic and transcendental are usually used to distinguish between the cases where f is or is not a polynomial. Perhaps the best generic term is root-finding.

We will restrict ourselves to f(x) which are real single-valued functions of a real variable, possessing a certain number of continuous derivatives in the neighborhood of a real zero α. The number of continuous derivatives assumed will vary upwards from zero. The restriction to real zeros is not essential. With the exception of Chapter 11, f(x) will be a scalar function of a scalar variable; in Chapter 11, f(x) will be a vector function of a vector variable.

The "independent variable" will sometimes appear explicitly and will sometimes not appear explicitly. This will depend on whether the function or the function evaluated at a point in its domain is what is meant. It will also depend on the readability of formulas. Thus we will use both f and f(x). See Boas [1.2-1, pp. 67-68].

Derivatives of f will be denoted either by f^(j), with f^(0) = f, or by f', f'', .... If f^(1) does not vanish in a neighborhood of α and if f^(1) is continuous in this neighborhood, then f has an inverse 𝔉, and 𝔉^(1) is continuous in a neighborhood of zero.

A zero α is of multiplicity m if

f(x) = (x - α)^m g(x),

where g(x) is bounded at α and g(α) is nonzero. We shall always take m as a positive integer. Ostrowski [1.2-2, Chap. 5] deals with the case that m is not a positive integer. If m = 1, α is said to be simple; if m > 1, α is said to be nonsimple. If α is nonsimple, it is called a multiple zero.

Perhaps the most primitive procedure for approximating a real zero is the following bisection algorithm. Let a and b be two points such that f(a)f(b) < 0. Then f has at least one real zero of odd multiplicity on (a,b). For the purpose of explaining the algorithm it is sufficient to take the interval as (0,1) and to assume that f(0) < 0, f(1) > 0. Calculate f(½). If f(½) = 0, a zero has been found. If f(½) < 0, the zero lies on (½,1) and one next calculates f(¾). If f(½) > 0, the zero lies on (0,½) and one next calculates f(¼). The zero will either be found by this procedure or the zero will be known to lie on an interval of length 2^{-q} after the bisection operation has been applied q times. By then estimating the value of the zero to be the midpoint of this last interval, the value of the zero will be known to within a maximum error of 2^{-(q+1)}. Once the zero has been bracketed it takes just q evaluations of f to achieve this accuracy. Observe that the accuracy to which the zero may be found is limited only by the accuracy with which f may be evaluated. Indeed, it is only necessary to be able to decide on the sign of f.
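The procedure just described translates directly into a short program. The following is a minimal sketch (ours, not part of the original text); the function name, the test equation x² - 2 = 0, and the choice of q are illustrative assumptions only.

```python
# Minimal sketch of the bisection algorithm described above.
# Assumes f(a)*f(b) < 0; only the sign of f is actually needed.
def bisect(f, a, b, q):
    """Apply the bisection operation q times and return the midpoint estimate."""
    fa = f(a)
    for _ in range(q):
        mid = (a + b) / 2.0
        fm = f(mid)
        if fm == 0.0:
            return mid                 # a zero has been found exactly
        if fa * fm < 0.0:
            b = mid                    # the zero lies on (a, mid)
        else:
            a, fa = mid, fm            # the zero lies on (mid, b)
    return (a + b) / 2.0               # error at most (b - a)_initial * 2**-(q+1)

# Example: after q evaluations of f, the zero of x**2 - 2 on (1, 2) is known
# to within 2**-(q+1) of the original interval length.
print(bisect(lambda x: x * x - 2.0, 1.0, 2.0, 20))
```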

The bisection algorithm just described is an example of a one-point iteration function with memory, as defined below. Since no use is made of the structure of f, such as the values of its derivatives, the rate of convergence is not very high. On the other hand, the method is guaranteed to converge. Gross and Johnson [1.2-3] use a property of f to achieve faster convergence. See also Hooke and Jeeves [1.2-4] and Lehmer [1.2-5]. Kiefer [1.2-6] gives an optimal search strategy for the case of a maximum of a unimodal function.

By using additional information we can do far better than the bisection algorithm. The natural information available are the values of f and the values of its derivatives. We shall use the words information, samples, and data quite interchangeably.

Let x_i, x_{i-1}, ..., x_{i-n} be n + 1 approximants to α. Let x_{i+1} be uniquely determined by information obtained at x_i, x_{i-1}, ..., x_{i-n}. Let the function that maps x_i, x_{i-1}, ..., x_{i-n} into x_{i+1} be called φ. Thus

x_{i+1} = φ(x_i, x_{i-1}, ..., x_{i-n}).    (1-1)

We call φ an iteration function. The abbreviation I.F. will henceforth be used for iteration function and its plural. We shall also write y_{i-j} for f(x_{i-j}). Since the information used at the x_{i-j} are the values of f and its derivatives, we may write

φ = φ(x_i, f(x_i), f'(x_i), ...; x_{i-1}, f(x_{i-1}), f'(x_{i-1}), ...; ...; x_{i-n}, f(x_{i-n}), f'(x_{i-n}), ...).    (1-2)

We shall not permit I.F. as general as this; the types of I.F. to be studied will be specified in Section 1.22.

Rather than writing φ(x_i, x_{i-1}, ..., x_{i-n}), it is more convenient to write φ or φ(x). Observe further that φ is a functional depending on f and should perhaps be written φ(x,f). This notation is not necessary and will therefore not be used except in Chapter 7.

Even the simplest iteration algorithm must consist of initial approximation(s), an I.F., and numerical criteria for deciding when "convergence" is attained. We shall only be concerned with the I.F. Thus, for us, iteration algorithm will be the same as I.F.

Two I.F. are almost universally known. They are the Newton-Raphson I.F. and the secant I.F. For the former,

φ(x_i) = x_i - f_i/f'_i,    (1-3)

while for the latter,

φ(x_i, x_{i-1}) = x_i - f_i (x_i - x_{i-1})/(f_i - f_{i-1}).    (1-4)

We shall henceforth call the former Newton's I.F. The secant I.F. is closely related to regula falsi. This last method, as it is usually defined, keeps two approximants which bracket the root; the secant I.F. always uses the latest two approximants.
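For concreteness, here is a hedged sketch (ours) of the two I.F. (1-3) and (1-4); the helper names and the test equation are illustrative assumptions, not part of the text.

```python
# One application of Newton's I.F. (1-3) and of the secant I.F. (1-4).
def newton_step(f, fprime, x):
    """phi(x) = x - f(x)/f'(x)."""
    return x - f(x) / fprime(x)

def secant_step(f, x_i, x_im1):
    """phi(x_i, x_{i-1}) = x_i - f(x_i)(x_i - x_{i-1}) / (f(x_i) - f(x_{i-1}))."""
    fi, fim1 = f(x_i), f(x_im1)
    return x_i - fi * (x_i - x_im1) / (fi - fim1)

# Example: approximate the zero sqrt(2) of f(x) = x**2 - 2.
f = lambda x: x * x - 2.0
fp = lambda x: 2.0 * x

x = 1.5
for _ in range(4):
    x = newton_step(f, fp, x)      # new information: f and f' at x_i
print(x)

x_prev, x_cur = 1.0, 2.0
for _ in range(6):
    x_prev, x_cur = x_cur, secant_step(f, x_cur, x_prev)   # reuses f(x_{i-1})
print(x_cur)
```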
In much of our work we shall deal not with f, but with a normalized f defined by

u = f/f',    f' ≠ 0.    (1-5)

If f' = 0, f ≠ 0, then u is undefined. If f' = 0, f = 0, which is the case at a multiple zero, we define u = 0. The importance of u in our future work cannot be overestimated. One reason for its importance is that

lim_{x→α} u(x)/(x - α) = 1/m.

For simple zeros,

lim_{x→α} u(x)/(x - α) = 1.

Now, x - α is the error in x but is not known till α is known. On the other hand, u is known at each step of the iteration. In the new notation, Newton's I.F. becomes

φ = x - u.    (1-6)

For later convenience we introduce some additional notation. Let

e_i = x_i - α,    e = x - α.    (1-7)

Thus e_i is the error of the ith approximant. Let

a_j(x) = f^(j)(x)/j!,
A_j(x) = a_j(x)/a_1(x),
B_{j,m}(x) = a_{j+m-1}(x)/[m a_m(x)],
b_j(y) = 𝔉^(j)(y)/j!,
Y_j(x) = (-1)^j 𝔉^(j)(y)/(j! [𝔉'(y)]^j),  evaluated at y = f(x).    (1-8)

The first of these is the Taylor series coefficient for f, while the second is a "normalized Taylor series coefficient" for f. The third is a "generalized normalized Taylor series coefficient" which will be used when the multiplicity m is greater than unity. Note that B_{j,1}(x) = A_j(x). The fourth is a "normalized Taylor series coefficient" for 𝔉. The last is a Taylor series coefficient for 𝔉 with a different normalization and with f(x) substituted after differentiation.

We shall use the symbols O, ~, and ≈ according to the following conventions. If

lim_{i→∞} g(x_i) = C,

we shall write

g(x_i) → C.

If

lim_{x→α} g(x) = C,

we shall write

g(x) → C.

How the limit is taken will be clear from context. If f/g → C, where C is a nonzero constant, we shall write

f = O(g)    or    f ~ Cg.

For approximate equality between numbers, the symbol ≈ will be used. Thus

½(1 + √5) ≈ 1.618 ≈ 1.62.

The use of these symbols is illustrated in

EXAMPLE 1-1. Let

e_{i+1} = M_i e_i^2 + L_i e_i^3,

where

e_i → 0,    M_i → M ≠ 0,    L_i → L ≠ 0,

and where e_i is nonzero for all finite i. Then

e_{i+1} = M_i e_i^2 [1 + O(e_i)],

since

L_i e_i / M_i → 0.

Also,

e_{i+1} ~ M e_i^2,

since

e_{i+1}/e_i^2 = M_i + L_i e_i → M.

We shall use the notation

{x | p(x)}

to denote the set of x for which the proposition p(x) is true. Thus

J = {x | |x| < 1}

denotes the interior of the unit interval.


1.22 Classification of iteration functions. We shall classify I.F. by the information which they require. Let x_{i+1} be determined only by new information at x_i. No old information is reused. Thus

x_{i+1} = φ(x_i).    (1-9)

Then φ will be called a one-point I.F. Most I.F. which have been used for root-finding are one-point I.F. The most commonly known example is Newton's I.F.

Next let x_{i+1} be determined by new information at x_i and reused information at x_{i-1}, ..., x_{i-n}. Thus

x_{i+1} = φ(x_i; x_{i-1}, ..., x_{i-n}).    (1-10)

Then φ will be called a one-point I.F. with memory. The semicolon in (1-10) separates the point at which new data are used from the points at which old data are reused. This type of I.F. is now of special interest since the old information is easily saved in the memory (or store or storage) of a computer. The case of practical interest is when the same information, the values of f and f' for example, is used at all points. The best-known example of a one-point I.F. with memory is the secant I.F.

Now let x_{i+1} be determined by new information at x_i, x_{i-1}, ..., x_{i-k}, k ≥ 1. No old information is reused. Thus

x_{i+1} = φ(x_i, x_{i-1}, ..., x_{i-k}).    (1-11)

Then φ will be called a multipoint I.F. There are no well-known examples of multipoint I.F. They are being introduced because of certain characteristic limitations on one-point I.F. and one-point I.F. with memory.

Finally let x_{i+1} be determined by new information at x_i, x_{i-1}, ..., x_{i-k} and reused information at x_{i-k-1}, ..., x_{i-n}. Thus

x_{i+1} = φ(x_i, x_{i-1}, ..., x_{i-k}; x_{i-k-1}, ..., x_{i-n}),    n > k.    (1-12)

Then φ will be called a multipoint I.F. with memory. The semicolon in (1-12) separates the points at which new data are used from the points at which old data are reused. There are no well-known examples of multipoint I.F. with memory.


1.23 Order. We turn to the important concept of the order of an I.F. Let x_0, x_1, ..., x_i, ... be a sequence converging to α. Let e_i = x_i - α. If there exists a real number p and a nonzero constant C such that

|e_{i+1}| / |e_i|^p → C,    (1-13)

then p is called the order of the sequence and C is called the asymptotic error constant.

The remainder of this section will consist of commentary on this definition. We wish to associate the concept of order with the I.F. which generates the x_i. To emphasize this point, we can write (1-13) as

|φ(x) - α| / |x - α|^p → C.    (1-14)

We will associate an order to an I.F. whether or not the sequence generated by φ converges. The order assigned to an I.F. is the order of the sequence it generates when the sequence converges.

Recall that φ is a functional which depends on f. Hence the order of φ may be different for different classes of f. For the type of f and φ that we shall study, we shall insist that φ be of a certain order for all f whose zeros are of a certain multiplicity. We shall always assume that we are in the neighborhood of an isolated zero of f and that the order possibly depends on the multiplicity of the zero but is otherwise independent of f. (With the exception of Chapter 7, the zero will always be simple unless the contrary is explicitly stated.) To indicate that φ belongs to the class of I.F. of order p, we shall write

φ ∈ I_p.    (1-15)

The following definitions relate order to multiplicity. If the order is the same for zeros of all multiplicities, we say that the order is multiplicity-independent. If the order depends on the multiplicity we say that the order is multiplicity-dependent. If in particular the order is greater than one for simple zeros and one for all nonsimple zeros, we say that the order is linear for all nonsimple zeros. (The adjectives linear and quadratic will sometimes be used instead of first and second.)

Observe that if the order exists then it is unique. Assume that a convergent sequence has two orders, p_1 and p_2. Let p_2 = p_1 + δ, δ > 0. Then

|e_{i+1}| / |e_i|^{p_1} = [|e_{i+1}| / |e_i|^{p_2}] |e_i|^δ,

and

lim_{i→∞} |e_{i+1}| / |e_i|^{p_1} = 0,

which contradicts the assumption that the last limit is nonzero.

If the order is integral, then the absolute values may be dropped in the definition of order. We shall find that I.F. with memory never have integral order. If φ^{(p+1)} is continuous and if

φ(x) - α = C(x - α)^p [1 + O(x - α)],    (1-16)

then φ is of order p and C = φ^{(p)}(α)/p!. Equation (1-16) may be written as

φ(x) - α ~ C(x - α)^p.

In his classic paper of 1870, E. Schröder [1.2-7] defined order as follows: φ(x) is of order p if

φ(α) = α;    φ^{(j)}(α) = 0, j = 1,2,...,p-1;    φ^{(p)}(α) ≠ 0.    (1-17)

This definition is only valid for I.F. of one variable with p continuous derivatives. Although many authors have followed Schröder's lead, we prefer to state (1-17) in the conclusion of Theorem 2-2. However, a generalization of (1-17) will serve as the definition of order for the case of systems of equations in Chapter 11.

The advantage of high-order I.F. will be discussed in Section 2.23.

EXAMPLE 1-2. For Newton's method,

e_{i+1}/e_i^2 → A_2(α),    A_2 = f''/(2f').

For the secant method,

|e_{i+1}| / |e_i|^p → |A_2(α)|^{p-1},    p = ½(1 + √5) ≈ 1.62.

In Section 7.8 we will give an example of an I.F. whose order is incommensurate with the scale defined by (1-13).
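Definition (1-13) also suggests a simple numerical check of order: once the errors are small, log|e_{i+1}| / log|e_i| approaches p and e_{i+1}/e_i^p approaches C. The following sketch (ours, not from the text) illustrates this for Newton's I.F. on f(x) = x² - 2; the names and the test equation are illustrative assumptions.

```python
# Estimating the order p and the asymptotic error constant C of Newton's I.F.
# from the errors e_i = x_i - alpha, with alpha = sqrt(2).
import math

alpha = math.sqrt(2.0)
f, fp = (lambda x: x * x - 2.0), (lambda x: 2.0 * x)

x = 1.5
errors = [abs(x - alpha)]
for _ in range(3):
    x = x - f(x) / fp(x)               # one application of Newton's I.F.
    errors.append(abs(x - alpha))

for e_i, e_next in zip(errors, errors[1:]):
    if e_next == 0.0:
        break
    p_est = math.log(e_next) / math.log(e_i)   # tends to 2
    c_est = e_next / e_i**2                    # tends to A_2(alpha) = 1/(2*sqrt(2)) ~ 0.354
    print(p_est, c_est)
```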
1.24 Concepts related to order. A measure of the information used by an I.F. and a measure of the efficiency of the I.F. are required. A natural measure of the former is the informational usage, d, of an I.F., which we define as the number of new pieces of information required per iteration. Since the information to be used are the values of f and its derivatives, the informational usage is the total number of new derivative evaluations per iteration. We use here the convention that the function is the zeroth derivative. Ostrowski [1.2-8, p. 19] has suggested the "Horner" as the unit of informational usage. To indicate that φ belongs to the class of I.F. of order p and informational usage d we write

φ ∈ I_{p,d}.    (1-18)

To obtain a measure of the efficiency, we make the following definition: The informational efficiency, EFF, is the order divided by the informational usage. Thus

EFF = p/d.    (1-19)

Another measure of efficiency, called the computational efficiency, which takes into account the "cost" of calculating different derivatives, will be discussed in Appendix C. An alternative definition of informational efficiency is

*EFF = p^{1/d}.    (1-20)

Ostrowski [1.2-9, p. 20] calls *EFF an efficiency index; see Appendix C for a discussion of *EFF. Our definition of informational efficiency was chosen because it permits the simple statement of certain important results.

EXAMPLE 1-3. For Newton's I.F.,

p = 2,    d = 2,    EFF = 1,    *EFF = √2.

For the secant I.F.,

p = ½(1 + √5) ≈ 1.62,    d = 1,    EFF = *EFF = ½(1 + √5).

For one-point I.F. and one-point I.F. with memory there can be only one new evaluation of each derivative. If the first s - 1 derivatives are used, the informational usage will be s and

EFF = p/s.    (1-21)

Let n be the number of points at which old information is reused in a one-point I.F. with memory. Let

r = s(n + 1).    (1-22)

Hence r is the product of the number of pieces of new information with the total number of points at which information is used. The quantities s, n, and r will characterize certain families of I.F. These symbols will occasionally be used with other meanings as long as there is no danger of confusion.

We shall prove in Section 5.4 that for one-point I.F., EFF ≤ 1. We anticipate this result to define a one-point I.F. as optimal if EFF = 1. For optimal one-point I.F., p = d = s. A basic sequence of I.F. is an infinite sequence of I.F. such that the pth member of the sequence is of order p. This concept is defined only for I.F. of integral order. An optimal basic sequence is a basic sequence all of whose members are optimal. In Section 5.1 the properties of a particular basic sequence, which will be used extensively throughout this book, will be developed.
CHAPTER 2

GENERAL THEOREMS ON ITERATION FUNCTIONS

In this chapter I.F. will be discussed without regard to their structure. In Section 2.1 the existence and uniqueness of the iterative solution of a fixed point problem will be proven under the assumption that the I.F. satisfies a Lipschitz condition. In Section 2.2 the difference between linear and superlinear convergence will be studied in some detail, while in Section 2.3 an "iteration calculus" will be developed for I.F. which possess a certain number of continuous derivatives.

2.1 The Solution of a Fixed Point Problem

We propose to study the solution of

φ(x) = x    (2-1)

by the iteration

x_{i+1} = φ(x_i).    (2-2)

If α satisfies (2-1), then α is called a fixed point of φ. The problem of finding the fixed points of a function occurs in many branches of mathematics. To see its relation to the solution of f(x) = 0 we proceed as follows. Let g(x) be any function such that g(α) is finite and nonzero. Let

φ(x) = x - f(x)g(x).    (2-3)

Then α is a solution of f(x) = 0 if and only if α is a solution of (2-1).

We shall show that under certain hypotheses the fixed point problem has a unique solution and that the iteration defined by (2-2) converges to this solution. Because of the restrictive nature of our hypotheses, the proof will be very simple. Many other proofs have been given, both for real functions and in abstract spaces and under a variety of hypotheses. See, for example, Antosiewicz and Rheinboldt [2.1-1], Collatz [2.1-2, Chap. II], Coppel [2.1-3], Ford [2.1-4], Householder [2.1-5, Sect. 3.3], John [2.1-6, Chap. 4, Sect. 6], and Ostrowski [2.1-7, Chap. 4].


We assume that φ is defined on some closed and bounded interval J = [a,b] and that its values are in the same interval. Then if x_0 is in J, all the x_i are in J. To guarantee that (2-1) has a solution we must assume the continuity of φ. We have

LEMMA 2-1. Let φ be a continuous function from J = [a,b] to J. Then there exists an α, a ≤ α ≤ b, such that φ(α) = α.

PROOF. Since the function is from J to J,

φ(a) ≥ a,    φ(b) ≤ b.

Let h(x) = φ(x) - x. Then

h(a) ≥ 0,    h(b) ≤ 0,

and the intermediate value theorem guarantees the existence of α such that h(α) = 0. The result follows immediately.

In order to draw additional conclusions we must impose an additional condition on φ. Let

|φ(s) - φ(t)| ≤ L|s - t|,    0 ≤ L < 1,    (2-4)

for arbitrary points s and t in J. Observe that this Lipschitz condition implies continuity. We can now show that the solution of φ(x) = x is unique.

LEMMA 2-2. Let φ be a function from J to J which satisfies the Lipschitz condition (2-4). Then φ(x) = x has at most one solution.

PROOF. Assume that there are two distinct solutions, α_1 and α_2. Then

α_1 = φ(α_1),    α_2 = φ(α_2),

and

α_1 - α_2 = φ(α_1) - φ(α_2).

An application of (2-4) yields

|α_1 - α_2| = |φ(α_1) - φ(α_2)| ≤ L|α_1 - α_2| < |α_1 - α_2|,

which is a contradiction.

The existence and uniqueness of a solution α of the equation φ(x) = x has now been verified. It is a simple matter to show that the sequence defined by x_{i+1} = φ(x_i) converges to this solution.

THEOREM 2-1. Let J be a closed, bounded interval and let φ be a function from J to J which satisfies the Lipschitz condition

|φ(s) - φ(t)| ≤ L|s - t|,    0 ≤ L < 1,

for arbitrary points s and t in J. Let x_0 be an arbitrary point of J and let x_{i+1} = φ(x_i). Then the sequence {x_i} converges to the unique solution of φ(x) = x in J.

PROOF. Lemmas 2-1 and 2-2 assure us of the existence and uniqueness of a solution α of the equation φ(x) = x. This solution is the candidate for the limit of the sequence defined by x_{i+1} = φ(x_i). Thus

x_{i+1} - α = φ(x_i) - α = φ(x_i) - φ(α).

From the Lipschitz condition we conclude that

|x_{i+1} - α| ≤ L|x_i - α|,    L < 1.

Thus

|x_i - α| ≤ L^i |x_0 - α|,

and x_i → α.

If one did not know that φ(x) = x had a unique solution, one would have to show that the x_i satisfy the Cauchy criterion. A proof based on the Cauchy criterion may be found in Henrici [2.1-8, Chap. 4].
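A minimal numerical illustration of Theorem 2-1 (ours, not from the text): φ(x) = cos x maps J = [0,1] into itself and satisfies a Lipschitz condition there with L = sin 1 < 1, so the iterates x_{i+1} = φ(x_i) converge to the unique fixed point of cos x. The choice of φ and of the starting point is an illustrative assumption.

```python
# Fixed point iteration under a Lipschitz condition with L = sin(1) < 1 on [0, 1].
import math

phi = math.cos
x = 0.0                       # any starting point x_0 in J = [0, 1]
for _ in range(50):
    x = phi(x)                # x_{i+1} = phi(x_i)
print(x, abs(phi(x) - x))     # x is near the unique fixed point alpha = 0.739085...
```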
An estimate of the error of the pth approximant which depends only on the first two approximants and the Lipschitz constant may be derived as follows. From the Lipschitz condition we can conclude that for all i,

|x_{i+1} - x_i| ≤ L^i |x_1 - x_0|.    (2-5)

Let p and q be arbitrary positive integers. Then

x_{p+q} - x_p = (x_{p+q} - x_{p+q-1}) + (x_{p+q-1} - x_{p+q-2}) + ... + (x_{p+1} - x_p),

and

|x_{p+q} - x_p| ≤ |x_{p+q} - x_{p+q-1}| + |x_{p+q-1} - x_{p+q-2}| + ... + |x_{p+1} - x_p|.

An application of (2-5) shows that

|x_{p+q} - x_p| ≤ L^p (1 + L + ... + L^{q-1}) |x_1 - x_0|,

or

|x_{p+q} - x_p| ≤ [L^p/(1 - L)] |x_1 - x_0|.    (2-6)

Let q → ∞. Then

|α - x_p| ≤ [L^p/(1 - L)] |x_1 - x_0|.

This gives us a bound on the error of the pth approximant. Observe that if L is close to unity, convergence may be very slow. A good part of our effort from here on in will be devoted to the construction of I.F. which converge rapidly. To do this we shall have to impose additional conditions on φ.

2.2 Linear and Superlinear Convergence

In Section 2.1 the analytical problem, φ(x) = x, suggested the iterative procedure x_{i+1} = φ(x_i). This is not typical of the problems on which we will work. In general we will be given the equation f = 0 and we will want to construct a functional φ which will be used as an I.F. In Section 2.1 we showed that if φ satisfies a Lipschitz condition, then the iteration will converge to the unique solution of the fixed point problem. From now on, rather than seeking weak sufficient conditions under which the iterative sequence converges, we will be interested in constructing I.F. such that the sequence converges rapidly. We will be quite ready to impose rather strong conditions. In particular, we shall assume that f has a zero in the interval in which we work.


2.21 Linear convergence. Let us assume that φ' is continuous in a neighborhood of α. We will certainly insist that

φ(α) = α.    (2-7)

For if x_i → α and φ is continuous, then

α = lim_{i→∞} x_{i+1} = lim_{i→∞} φ(x_i) = φ(lim_{i→∞} x_i) = φ(α).

Now

x_{i+1} = φ(x_i) = φ(α) + φ'(ξ_i)(x_i - α),

where ξ_i lies in the interval determined by x_i and α. An application of (2-7) yields

x_{i+1} - α = φ'(ξ_i)(x_i - α).    (2-8)

We shall require that |φ'(ξ_i)| ≤ K < 1 in the neighborhood of α. Since φ' is continuous it will be sufficient to require that |φ'(α)| ≤ K < 1. Then there is a neighborhood of α such that

|φ'(x)| ≤ L,    0 ≤ L < 1,

and

|x_{i+1} - α| ≤ L|x_i - α|.

Then

|x_i - α| ≤ L^i |x_0 - α|,

and x_i → α.

If α is such that for every starting point x_0 in a sufficiently small neighborhood of α the sequence {x_i} converges to α, then α is called a point of attraction. This terminology is due to J. F. Ritt. See Ostrowski [2.2-1, p. 26] for a discussion of this and the definition of point of repulsion. In this terminology we can state our result as follows. If φ' is continuous in a neighborhood of α and if φ(α) = α and |φ'(α)| ≤ L < 1, then α is a point of attraction.

We can conclude something more. Let φ'(α) be different from zero. Then φ' does not vanish in a neighborhood of α. Let e_i = x_i - α. Then (2-8) may be written as

e_{i+1} = φ'(ξ_i) e_i.

It is clear that if e_0 is not zero and φ' does not vanish, then e_i does not vanish for any finite i. That is, the iteration cannot converge in a finite number of steps as long as the iterants lie in the neighborhood of α where φ' does not vanish. Hence

e_{i+1}/e_i = φ'(ξ_i),

and

e_{i+1}/e_i → φ'(α).

This is the case of linear or first-order convergence.

In the case of superlinear convergence it is not necessary to require that |φ'(α)| < 1; the iteration always converges in some neighborhood of α.
2.22 Superlinear convergence. We now assume that φ^{(p)} is continuous. As before we require that φ(α) = α. Then

x_{i+1} = φ(x_i) = α + φ'(α)(x_i - α) + ... + [φ^{(p)}(ξ_i)/p!](x_i - α)^p,

where ξ_i lies in the interval determined by x_i and α. It should be clear that φ is of order p only if φ^{(j)}(α) = 0, j = 1,2,...,p-1, and if φ^{(p)}(α) is nonzero. Hence we impose the following conditions on φ:

φ(α) = α;    φ^{(j)}(α) = 0, j = 1,2,...,p-1;    φ^{(p)}(α) ≠ 0.    (2-9)

Then

e_{i+1} = [φ^{(p)}(ξ_i)/p!] e_i^p,

where e_i = x_i - α. Since φ^{(p)}(α) is nonzero, φ^{(p)} does not vanish in some neighborhood of α. Then the algorithm cannot converge in a finite number of steps provided that e_0 is nonzero and that the iterants lie in the neighborhood of α where φ^{(p)} does not vanish. (See also the beginning of Section 2.23.) Furthermore,

e_{i+1}/e_i^p → φ^{(p)}(α)/p!.    (2-10)

Hartree [2.2-2] defines the order of an I.F. by the conditions (2-9). This definition of order cannot be used for one-point I.F. with memory. Hence we prefer to state these conditions as part of a theorem. We summarize the results in

THEOREM 2-2. Let φ be an I.F. such that φ^{(p)} is continuous in a neighborhood of α. Let e_i = x_i - α. Then φ is of order p if and only if

φ(α) = α;    φ^{(j)}(α) = 0, j = 1,2,...,p-1;    φ^{(p)}(α) ≠ 0.

Furthermore,

e_{i+1}/e_i^p → φ^{(p)}(α)/p!.

Recall that we assign an order to φ whether or not the sequence generated by φ converges. Sufficient conditions for a sequence to converge are derived below.
EXAMPLE 2-1. Let φ = x - f/f' (Newton's I.F.). Let f''' be continuous. A direct calculation reveals that φ(α) = α, φ'(α) = 0, φ''(α) = f''(α)/f'(α) ≠ 0. Hence Newton's I.F. is second order and

e_{i+1}/e_i^2 → A_2(α),    A_2 = f''/(2f').

We require that f''' be continuous in order to satisfy the hypothesis of Theorem 2-2 that φ'' be continuous. It will be shown in Chapter 4 that we only need f'' continuous.

We observed in Section 2.21 that a sequence formed by a linear I.F. need not converge in any neighborhood of α. We shall show that a sequence formed by a superlinear I.F. always converges in some neighborhood of α. In the following analysis the cases of linear and superlinear convergence are handled jointly.

We start our analysis from

e_{i+1} = [φ^{(p)}(ξ_i)/p!] e_i^p.    (2-11)

Let

J = {x | |x - α| ≤ Γ}

(the notation is specified at the end of Section 1.21) and let φ^{(p)} be continuous on J. Let x_0 ∈ J and let

|φ^{(p)}(x)|/p! ≤ M

for all x ∈ J. Since x_0 ∈ J,

|e_0| ≤ Γ.

Hence

|e_1| ≤ M|e_0|^p ≤ MΓ^{p-1}|e_0|.

Let

MΓ^{p-1} = L < 1.    (2-12)

Then

|e_1| ≤ LΓ < Γ.

Hence x_1 ∈ J. We now proceed to prove by induction that if (2-12) holds, then for all i,

x_i ∈ J,    |e_i| ≤ L^i Γ.    (2-13)

Assume that (2-13) holds. Then

|e_{i+1}| ≤ M|e_i|^p ≤ M|e_i|^{p-1}|e_i| ≤ MΓ^{p-1}|e_i| ≤ L · L^i Γ = L^{i+1} Γ < Γ.

Hence (2-13) holds for i + 1, which completes the proof by induction. Since

|e_i| ≤ L^i Γ,    L < 1,

we conclude that x_i → α. We have proven

THEOREM 2-3. Let φ^{(p)} be continuous on the interval

J = {x | |x - α| ≤ Γ}.

Let x_0 ∈ J and let

|φ^{(p)}(x)|/p! ≤ M

for all x ∈ J. Let

MΓ^{p-1} < 1.

Then for all i, x_i ∈ J and x_i → α.


2.23 The advantages of higher order iteration functions. If φ' is continuous in a neighborhood of α and if φ is first order, then the sequence generated by φ will converge to α if and only if |φ'(α)| < 1, unless one of the x_i becomes equal to α. That the last restriction is necessary is shown by the example

φ(x) = 2x for |x| ≤ 1,    φ(x) = 0 for |x| > 1.

Here φ' is certainly continuous in the neighborhood of zero and φ'(0) = 2, but the iteration converges for all x_0 in the unit interval. Barring such cases, |φ'(α)| will have to be less than unity for convergence. If, on the other hand, p > 1 and φ^{(p)} is continuous in the neighborhood of α, then there is always a neighborhood of α where the iteration is guaranteed to converge.

This is one of a number of advantages of superlinear convergence over linear convergence. Perhaps the most important advantage is roughly summarized by the statement that x_{i+1} agrees with α to p times as many significant figures as x_i. See Appendix C.

The higher order I.F. will often require fewer total samples of f and its derivatives. Recall that the informational efficiency of an I.F. is given by the ratio of the order to the number of pieces of new information required per iteration. As will be shown in Chapter 5, there exist I.F. of all orders whose informational efficiency is unity. We defined such I.F. as optimal. To simplify matters, let φ and ψ be optimal I.F. of orders 2 and 3 respectively. Let x_3 be generated from x_0 by the application of φ three times and let y_2 be generated from x_0 by the application of ψ twice. Although each process requires six pieces of information, x_3 - α = O[(x_0 - α)^8] whereas y_2 - α = O[(x_0 - α)^9]. Ehrmann [2.2-3] considers a certain family of I.F. whose members have arbitrary order. He discusses which order is "best" under certain assumptions.
The main drawbacks to the use of high order one-point I.F. are the increasing complexity of the formulas and the need for evaluating higher derivatives of f. In Chapters 8 and 9 we shall see how these disadvantages may be overcome by using multipoint I.F.

Observe that if f satisfies a differential equation it may be cheaper to calculate the derivatives of f from the differential equation than from f itself. In particular, let f satisfy a second order differential equation. After f and f' have been calculated, f'' is available from the differential equation. By differentiating the differential equation one may calculate f'''. This process may be continued as long as it is feasible.

It might appear that it is difficult to generate I.F. of higher order for all f. Such is not the case and algorithms for constructing I.F. of arbitrary order have been given by many authors. See, for example, Bodewig [2.2-4], Ehrmann [2.2-5], Ludwig [2.2-6], E. Schröder [2.2-7], Schwerdtfeger [2.2-8], and Zajta [2.2-9]. Many of these algorithms depend on the construction of I.F. such that

φ(α) = α,    φ^{(j)}(α) = 0,    j = 1,2,...,p-1.    (2-14)

We shall propose numerous techniques for generating I.F. of arbitrary order which also satisfy certain other criteria. It will turn out that it is not necessary to use (2-14) to construct these I.F.

From two I.F. of first order, it is possible to construct an I.F. of second order by the technique of Steffensen-Householder-Ostrowski. The reader is referred to Appendix D.

In many branches of numerical analysis one constructs approximations by insisting that the approximation be exact for a certain number of powers of x. There is one class of I.F., those generated by direct interpolation (Section 4.23), for which the I.F. yield the solution α in one step for all polynomials of less than a certain degree. This is not a general property of I.F. In fact, powers of x could not be used if for no other reason than that x^m has a root of multiplicity m.

EXAMPLE 2-2. Let

φ = x - f/f'.

Then φ is second order and yields α in one step if f is any linear polynomial. Let

ψ = x - f/f' + f^2.

Then ψ is second order. Let f = x. Then ψ will not yield the answer in one step and will not converge if |x_0| > 1.


2.3 The Iteration Calculus

We will develop a calculus of I.F. The results to be proven in this section will play a key role in much of the later work. They will not apply to I.F. with memory. The nature of our hypotheses will be such that the order will be integral. No assumptions will be made as to the structure of the I.F. Theorems 8-1 and 8-2 also belong to the iteration calculus but are deferred to Chapter 8 because their applications are in that chapter. Some of these results, for the case of simple roots, may be found in the papers by Zajta [2.3-1] and Ehrmann [2.3-2].

2.31 Preparation. In this section we do not restrict ourselves to simple zeros. The multiplicity of the zero is denoted by m and we indicate that φ is a member of the class of I.F. whose order is p by writing φ ∈ I_p. We always insist that φ is of order p for all functions f whose zeros have a certain multiplicity. However φ may be of order p for simple zeros but of a different order for multiple zeros. Hence our hypotheses contain phrases such as "Let φ ∈ I_p for some set of values of m."

EXAMPLE 2-3. Let u = f/f' and let

φ_1 = x - u,    φ_2 = x - mu,    φ_3 = x - u/u'.

Then φ_1 ∈ I_2 for simple zeros whereas φ_1 ∈ I_1 for multiple zeros. On the other hand, φ_2 ∈ I_2 for zeros of multiplicity m and φ_3 ∈ I_2 for zeros of arbitrary multiplicity. See Chapter 7 for details.

Let φ^{(p)} be continuous in a neighborhood of α. If φ ∈ I_p, then from Theorem 2-2,

φ(α) = α;    φ^{(j)}(α) = 0 for j = 1,2,...,p-1;    φ^{(p)}(α) ≠ 0.    (2-15)

The expansion of φ(x) in a Taylor series about α yields

φ(x) - α = [φ^{(p)}(ξ)/p!](x - α)^p,    (2-16)

where ξ lies in the interval determined by α and x. Now, ξ is not a continuous function of x but φ^{(p)}(ξ) may be defined as a continuous function of x as follows. Let

V(x) = [φ(x) - α]/(x - α)^p for x ≠ α,    V(α) = φ^{(p)}(α)/p!.

Then V(x) is continuous wherever φ(x) is continuous and x ≠ α. A p-fold application of L'Hospital's rule reveals that

lim_{x→α} V(x) = φ^{(p)}(α)/p!.

Hence

φ(x) - α = V(x)(x - α)^p,    (2-17)

where V(x) is a continuous function of x and V(α) is nonzero. Thus one may characterize I.F. of order p by the condition that φ(x) - α has a zero of multiplicity p at α.

By comparing (2-17) with (1-14) we observe that

C = V(α),

where C is the asymptotic error constant.
2.3-4

Equation ( 2 - 1 7 ) may be put into a form which is


more convenient for many applications. Let f' be continuous
in a neighborhood of a and let a be a simple root. Then f'(a)
is nonzero and

f (x) = f '(T])(x-a) .
Let

M x ) = Jkp- for x ^ a, * ( a ) = f'(a) .

Then

f(x) - X ( x ) ( x - a ) , (2-18)

where A(x) is continuous if f is continuous and 7\(a) / 0,


Prom ( 2 - 1 7 ) and ( 2 - 1 8 ) we conclude that

cp(x) - a = T(x)f (x), p


(2-19)

p
where T(x) = V(x)/A (x). Then T(x) is continuous wherever
cp^ ^(x) and f'(x) are continuous and f'(x) does not vanish.
p

Furthermore,

(P) a
T(a) - <P ( ) ^ 0 .n
p
pi[f'(a)]

If the multiplicity m is greater than unity, then


f is no longer proportional to x - a and so we cannot obtain
an expression like ( 2 - 1 9 ) which involves f. However,
r

2.3-5

u = f/f' has only simple zeros and hence ( 2 - 1 9 ) is easily


generalized. Let f ( ) be continuous and let a have multi-
m

plicity m with m j> 1 . Proceeding as before we conclude the


x s u c h
existence of continuous functions A-^(x) and ^g( ) that

f(x) = A ( x ) ( x - a )
1
m
, f ' ( x ) = ^ ( x M x - a ) 1 1 1
- 1
,

with

Hence

u(x) = -fj^y = p(x)(x-o), (2-20)

with

P(a) - i .

Finally, from ( 2 - 1 7 ) and (2-20),

cp(x) - a = W(x)u (x), p


(2-21)

and W(x) continuous and with

W ( x ) = ^ L , w ( a ) = ^ i a I ^ 0 .
p p
p (x) '

The importance of ( 2 - 1 7 ) , (2-19), and ( 2 - 2 1 ) in our


future work cannot be overestimated. In particular, we will
have occasion to use ( 2 - 2 1 ) in the form

a = cp(x) - W ( x ) u ( x ) . p
(2-22)
I

2.3-6

2.32 The theorems of the iteration calculus. The hypotheses of the theorems to be proven below will not be weighed down by the statement of continuity conditions on the derivatives of φ and f; they should be clear from the theorems. The notation is the same as in Section 2.31.

THEOREM 2-4. Let φ_1(x) ∈ I_{p_1} and φ_2(x) ∈ I_{p_2} for some set of values of m and let φ_3(x) = φ_2[φ_1(x)]. Then for these values of m, φ_3(x) ∈ I_{p_1 p_2}.

PROOF. From (2-17),

φ_1(x) = α + V_1(x)(x - α)^{p_1},    φ_2(x) = α + V_2(x)(x - α)^{p_2}.

Then

φ_3(x) = φ_2[φ_1(x)] = α + V_2[φ_1(x)][φ_1(x) - α]^{p_2},

φ_3(x) - α = V_2[φ_1(x)] V_1^{p_2}(x)(x - α)^{p_1 p_2}.

Let

V_3(x) = V_2[φ_1(x)] V_1^{p_2}(x).

The fact that

C_3 = V_3(α) = V_2(α) V_1^{p_2}(α) = C_2 C_1^{p_2} ≠ 0    (2-23)

completes the proof.

NOTE. Observe that (2-23) gives the formula for the asymptotic error constant of the composite I.F. in terms of the asymptotic error constants of the constituent I.F.

COROLLARY. Let

φ_i(x) ∈ I_{p_i},    i = 1,2,...,k,

for some set of values of m. Let

φ(x) = φ_{j_1}(φ_{j_2}(...(φ_{j_k}(x))...)),

where the j_i are any permutation of the numbers 1,2,...,k. Then for these values of m, φ(x) ∈ I_{p_1 p_2 ... p_k}. In particular, if p_i = p for i = 1,2,...,k, then φ(x) ∈ I_{p^k}.

EXAMPLE 2-4. Let

φ_1(x) = φ_2(x) = x - u(x)    (Newton's I.F.).

Then

φ_1 ∈ I_2,    φ_2 ∈ I_2,    C_1 = C_2 = A_2(α),    A_2 = f''/(2f').

Hence

φ_3(x) = φ_2[φ_1(x)] ∈ I_4,    C_3 = [A_2(α)]^3.
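A small numerical check of Example 2-4 (ours, not from the text): composing Newton's I.F. with itself and examining the ratio [φ_3(x) - α]/(x - α)^4 exhibits fourth-order behavior with asymptotic error constant [A_2(α)]^3. The extended-precision arithmetic and the test equation are illustrative assumptions.

```python
# Composite I.F. phi_3 = phi_2[phi_1] with phi_1 = phi_2 = Newton's I.F. for f = x**2 - 2.
from decimal import Decimal, getcontext
getcontext().prec = 80

alpha = Decimal(2).sqrt()
def newton(x):
    return x - (x * x - 2) / (2 * x)

def phi3(x):                  # phi_2[phi_1(x)]
    return newton(newton(x))

A2 = 1 / (2 * alpha)          # A_2(alpha) = f''(alpha)/(2 f'(alpha))
x = Decimal("1.4")
for _ in range(2):
    e = x - alpha
    x = phi3(x)
    print((x - alpha) / e**4)     # tends to C_3 = A2**3
print(A2 ** 3)
```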
THEOREM 2-5. Let φ(x) ∈ I_p, with p > 1, for some set of values of m. Then for these values of m there exists a function H(x) such that

φ(x) - x = -u(x)H(x),    H(α) ≠ 0.

PROOF. Since φ(α) = α, there exists a function G(x) with G(α) = 0 such that φ(x) = x - G(x). Then for p > 1,

φ'(α) = 0 = 1 - G'(α).

Therefore G(x) has a simple zero at α and G(x) = u(x)H(x) for some H(x) with H(α) ≠ 0.

Theorem 2-5 may be rephrased as follows. If α is a p-fold zero, p > 1, of the function φ(x) - α, then it is a simple zero of the function φ(x) - x. Since φ_1(x) = x - f(x) and φ_2(x) = x + f(x) are both of order one, the theorem is false when p = 1.

EXAMPLE 2-5. Since φ(x) - x has only simple zeros if the order of φ(x) is greater than unity, any I.F. which is of order p for functions f(x) with simple zeros will be of order p if f(x) is replaced by φ(x) - x. Thus if f(x) is replaced by φ(x) - x in Newton's I.F., the resulting I.F.

ψ(x) = x - [φ(x) - x]/[φ'(x) - 1]

satisfies ψ(x) ∈ I_2. Let φ(x) be Newton's I.F. Then

ψ(x) = x - u(x)/u'(x).

This generalization of Newton's I.F. is of order two for roots of arbitrary multiplicity; it will be discussed in Chapter 7.

THEOREM 2-6. Let φ_1(x) ∈ I_{p_1} and φ_2(x) ∈ I_{p_2} for some set of values of m. Then for these values of m there exists a function U(x), with U(x) bounded at α and U(α) ≠ 0, such that

φ_2(x) = φ_1(x) + U(x) u^p(x),

with

p = min[p_1, p_2], if p_1 ≠ p_2;
p = p_1, if p_1 = p_2 and φ_1^{(p_1)}(α) ≠ φ_2^{(p_1)}(α);
p > p_1, if p_1 = p_2 and φ_1^{(p_1)}(α) = φ_2^{(p_1)}(α).

PROOF. Case 1. p_1 ≠ p_2. Assume in particular that p_1 > p_2. Then

φ_1(x) = α + W_1(x) u^{p_1}(x),    φ_2(x) = α + W_2(x) u^{p_2}(x),

φ_2(x) - φ_1(x) = u^{p_2}(x){W_2(x) - W_1(x)[u(x)]^{p_1 - p_2}}.

Define

U(x) = W_2(x) - W_1(x)[u(x)]^{p_1 - p_2}.

The fact that U(α) = W_2(α) ≠ 0 completes the proof of Case 1.

Case 2. p_1 = p_2 and φ_1^{(p_1)}(α) ≠ φ_2^{(p_1)}(α). Then

φ_2(x) - φ_1(x) = u^{p_1}(x)[W_2(x) - W_1(x)].

Define U(x) = W_2(x) - W_1(x). The fact that

U(α) = W_2(α) - W_1(α) = (m^{p_1}/p_1!)[φ_2^{(p_1)}(α) - φ_1^{(p_1)}(α)] ≠ 0

completes the proof of Case 2.

Case 3. p_1 = p_2 and φ_1^{(p_1)}(α) = φ_2^{(p_1)}(α). Unless φ_1(x) ≡ φ_2(x), there exists an integer q, q > p_1, such that φ_1^{(q)}(α) ≠ φ_2^{(q)}(α), and the proof may be completed as above.
!

2 . 3 - H

The case of most interest for later applications is


( P
1>, , <Pl>, X
p
l = p
2 ' ^1 ^ ^2 * W e
P r o v e a
converse to Theorem 2-6,

THEOREM 2 - 7 .
, and let Let q>-(x) e I
p
l
p
cp (x) = cp (x) + U(x)u (x) for some set of values of m.
2 1 Then

for these values of m, cp (x) e l . , where p

P
2

P 2 = min[p,p ], 1 if p ^ P l -

4 P )
(a)m p

P 2 = p , if p x = p and U(a) ^ -
p l j

p) p
cp{ (a)m
P 2 > p , if p x = p and U(a) = -

PROOF. Case 1 . p ^ p . 1 Assume p > P . 1 Then

p
l
p
<p (x) = a + W (x)u
x 1 (x), cp (x) = <p (x) + U(x)u (x),
2 1

Hence

cp (x) - a = u
Q (x) W ( x ) + U(x)[u(x)]
1
P P l

P P l
Define W ( x ) = W ( x ) + U(x)[u(x)]
2 1 . The fact that
W ( a ) = W-^a) ^ 0 completes the proof of Case 1 .
2
r

2.3-12

P p
Case 2. p = P 1 and U(a) f - <pj ^ (a)m /p J. Then

cp (x) - a = uP(x)[W (x) + U(x)].


2 1

Define Wg(x) = W ^ x ) + U(x) . The fact that

4 (a)mP p )

1
W ( a ) = W (a) + U(a) =
2 1 , + U(a) ^ 0 p

completes the proof of Case 2.

Case 3 . P = P 1 and U(a) = - cpj ^ (a)m /p J . P p


This
case may be completed in a similar manner.

The "comparison theorem" just proved will permit

us to deduce the order of a given I.F. if it differs by terms


p
of order u from an I.F. whose order is known.
'*

The following theorem permits the calculation of

the asymptotic error constant of an I.F. of order p if the

asymptotic error constant of another I.F. of order p is known.

Recall that the asymptotic error constant C is defined by

C = l l m - « (2-24)
p
x -» a (x-a)

Absolute values are not required in (2-24) since p is an


integer.
r

2.3-13

THEOREM 2 - 8 . Let cp-^x) e I and cpg(x) e I p for


some set of values of m. Let

o ( » ) - - ; / x )
,
p
(x-a)

Let C.^ and be the asymptotic error constants of cp^ and cpg

Then for these values of m,

C = C 1 + 11m G(x).
x a

PROOF. A p-fold application of L'Hospital's rule

yields

4 p )
( « ) - q>i p )
(°)
llm G(x) = -s ± = C - C
p d 1
x — a '

EXAMPLE 2 - 6 . Let

cp (x) = x - mu(x),
1 cp (x) = x -
2 •

Then cp-^Cx) e I 2 and cp (x) e Ig, for m arbitrary.


2 As will be

shown in Section 7 * ^ 1

mu(x) = x - a - ^ i " ! (x-a) 2


+ 0[(x-a) ], 3

m^ '
r

2.3-14

where a (x) = f^ '(x)/ml.


m Then It is easy to verify that

2a
n _ m+1 m+1 "m+1
lim G(x) = -
2 -
9

1 ~ raa
m
m
C
m ma
x a m

as predicted by Theorem 2 - 8 . Let

cp(x) = ^[cp (x) 1 + cp (x)]


2 = x - | u ( x ) m + u'(x)

Since and C 2 are equal in magnitude but opposite in sign,


it is clear that cp(x) e for all m.

A more useful result than the one obtained in


Theorem 2 - 8 is given by

THEOREM 2 - 9 . Let ^ ( x ) e I p and let cp (x) 2 e I


for some set of values of m. Let

cp (x)
2 - cp (x) f

=
H(x) = 1 s x ^ a, u •
ir (x) 1

Let C 1 and C 2 be the asymptotic error constants of cp-j^x) and


cp (x)
2 . Then for the values of m,

C 2 = C + l l m H (x) .
2!. 3 - 1 5

PROOF. It Is easy to show that

1
lim u(x) =

x -+ a

From the previous theorem,

lim H(x) = lim G(x) x-a


x a x -*• a

and the result follows

EXAMPLE 2-7. Let m = 1. Then

cp (x) = x - u(x) - A ( x ) u ( x ) ,
1 2
2
cp (x) = x - 2
2 - A ^ f x ) ' u ( x )

are of third order and

2
f (i)
C 1 = 2A (a) - A 3 ( a ) , A_ = j7j7
j

Since

cp (x) = x - u(x) 1 + A (x)u(x) + A (x)u (x)


2 2
2 2
+ 0 [ v T ( x ) ] ,
r

2.3-16

2
lim H(x) = -A (a)
x —• a

Hence C 2 = Ag(a) - ^(a) .

THEOREM 2 - 1 0 . Let m = 1 and let U(x) be an arbi-

trary function which may depend on f(x) and its derivatives

and which is bounded at a. Let q>-,(x) e I and let


Jtr

U(a) t - 1

p
P![f'(a)] '
Then

cp(x) = ^ ( x ) + U(x)f (x) p


(2-25)

is the most general I.P. of order exactly p.

PROOF. From ( 2 - 1 9 ) ,

(
p ( a )
q,,(x) = a + T (x)f (x), T^a) = * . p
1 1 1 p
Pl[f'(a)]

By hypothesis,
p
«p(x) = cp-^x) + U(x)f (x) .

Then

p
<p(x) = a + f (x)[T (x) + U(x)].1 ( 2 - g 6 )
2.3-17

Define

T(x) = T (x) + U(x).


x

Since T(a) is nonzero, ep(x) e I . p The observation that


p
U(x)f (x) is the most general addend to cp-^(x) which le^ds to

(2-26) completes the proof.

Note that we do not assume that U(a) ^ 0 . It is


clear that

cp(x) = p (x) + *U(x)u (x),


C 1
q
q > p (2-27)

is also of order p. Setting U(x) = *U(x)u " (x) puts ( 2 - 2 7 )


q p

into the form of ( 2 - 2 5 ) • Note also that Theorem 2 - 1 0 need

not hold if m > 1 . For example,

cp(x) = cp (x) + U(x)


1

is also of order p if m > 1 .

The following two theorems lead to another charac-


terization of I . . F . of order p.
2.3-18

THEOREM 2 - 1 1 . Let cp(x) 6 I p for some set of values

of m. Then for these values of m, there exists a function Q ( x )

such that Q(a) ± 0 and f[cp(x)] = Q(x)f (x) . p

PROOF, Let a be a zero of multiplicity m. As was

shown in Section 2 . 3 1 * there exists a continuous function M x )

such that

f(x) = *(x)(x-a)
m

and

Then

m
f[<p(x)] = [<p(x) - a] Mcp(x)]

p
= [V(x)(x-a) ] A[cp(x)]

r a m P
= V (x)Mcp(x)][(x-a) ] .

Since M a ) ^ 0 , A(x) does not vanish in some neighborhood


of a and we may write

m p p
f[cp(x)] = V (x)X[cp(x)]A- (x)f (x).

Define

m p
Q(x) = V (x)7v[cp(x)]7v- (x).
! r

2 . 3 - 1 9

The fact that

m
(m)
"f (a)
Q ( a ) = ? 0
p! ml

completes the proof.

As a converse to Theorem 2 - 1 1 , we have

THEOREM 2 - 1 2 - Let f[cp(x)] = Q(x)f (x) for some p

set of values of m with Q(a) ^ 0 . Then for these values of m,


x
cp( ) e I .p

PROOF. Let A(x) be defined as in the previous

theorem. Then

m
f [cp(x)j = [cp(x) - a] A[cp(x)],

and from the hypothesis,

p p mp
f[cp(x)] = Q(x)f (x) = Q(x)A (x)(x-a) .

Define V(x) by

1 p
V ^ x ) = X- [cp(x)]A (x)Q(x).
Then
p
cp(x) - a = V(x)(x-a)

and the fact that V(a) ^ 0 completes the proof.


! r

2.3-20

Theorems 2 - 1 1 and 2 - 1 2 show that cp(x) e I If and


P
only If f [q>(x)] = Q(x)f (x) with Q(a) ? 0.

EXAMPLE 2 - 8 . Let m = 1 and let

1 JT"
cp(x) = x - u(x)H(x), H(x) = x , A ( ) (x)*
x u
A
2^ )
x =
2F 7

This is Halley's I.P. which will be studied in Section 5 . 2 .

Then

f[cp(x)] = f(x) - u ( x ) H ( x ) f (x) + |u (x)H (x)f"(x) +


2 2
0[u (x)],
3

H(x) = 1 + A (x)u(x)
2 + A|(X)U (X) + 2
0[u (x)]. 3

Thus

f[cp(x)] = f(x) - u(x)f"(x) - A (x)u (x)f'(x)


2
2
+ |u (x)f"(x)
2
+ 0[u (x)] 3

= 0[u (x)]
3
= 0[f (x)].
3

Therefore Halley's I.P. is third order for m = 1 .

THEOREM 2 - 1 3 . Let m = 1 and cp(x) e I . Then

P
d f Tcpfx)! (p)
= f'(a)cp (a)
dx*
x=a
2.3-21

PROOF. Let i/(x) = x - f (x) . Clearly ^(x) e I . 2

Let

¥(x) = T^tcp(x)] = q>(x) - f[<p(x)].

By Theorem 2 - 4 , f(x) e I . Hence

cp(x) - f [<p(x)] = a + V ( x ) ( x - a ) .
1
P
(2-28)

Since cp(x) e I ,
ir

P
<p(x) = a + V(x)(x-a) .

Therefore

f[cp(x)] = [V(x) - V (x)](x-a) .


1
P
(2-29)

A second expression for f[cp(x)] may be derived as follows.

Define A(x) by

f(x) = A(x)(x-a), A(a) = f'(a).


Then

p
f[cp(x)] = A[<p(x) ][cp(x) - a] = Mcp(x)]V(x)(x-a) .

(2-30)

From ( 2 - 2 9 ) and ( 2 - 3 0 ) ,

A[cp(x)]V(x) = V(x) - V ( x ) .1 (2-31)


r

2.3-22

From (2-28),

p
(P)(„, .
9 " M f W l v(a) - A . a f[g<x)]

x=a

Taking x = a in ( 2 - 3 1 ) yields the desired result.

We turn to a generalization of the previous theorem,

THEOREM 2-14. Let m = 1 and <p(x) e I . Then

J
d f [CD(X)1
f'(a)cp (a),(j)
0 < j < 2p
dx'
x=a

PROOF. Let = d^/dx^. It is clear from


Theorem 2-11 that D^f[cp(x)]| _ = 0 for 0 < j < p. Since
q/^(a) = 0 for 0 < j < p, the theorem is proved for these
values of j .

Let p £ j < 2 p , Now

2
f(x) = f'(a)(x-a) + x(x)(x-a) ,

p
cp(x) = a + V(x)(x-a) .
r

2.3-23

It is clear that if f and cp possess a sufficient number of

continuous derivatives, then T and V may be defined so as to

possess as many continuous derivatives as required. We have

f[cp(x)] = f (a)[cp(x) - a] + [cp(x)][cp(x) - af


T = f'(a)v(x)(x-a) + S(x). p

Then

J J_k k P J
D f [cp(x)j = f'(a) £ C[j,k]D V(x)D (x-a) + D S(x),
k=0

where C[j,k] denotes a binomial coefficient. Hence

J J_p
D f [cp(x)]| x=a = f'(a)p!C[j,p]D V(x)| x=a .

It is easy to show that

D J
- p
V ( x ) | x = a = - 4 f H < p ( J ) ( a ) ,

which completes the proof.


CHAPTER 3

THE MATHEMATICS OF DIFFERENCE RELATIONS

In this chapter we shall lay the mathematical

foundations prerequisite to a careful analysis of the conver-

gence and order of one-point I.F. and one-point I.F. with

memory. The two theorems of Section 3 ^ will be basic for

that analysis.

It is assumed that the reader is familiar with the

elementary aspects of difference equation theory. If this is

not the case, Hildebrand [ 3 . 0 - 1 , Chap. 3 ] * Jordan [3.0-2,

Chap. 1 1 ] , Milne-Thomson [ 3 . 0 - 3 L a n d Norlund [ 3 . 0 - 4 ] may be

used as references.
3.1-1

3.1 Convergence of Difference Inequalities

LEMMA 3 - 1 . Let 7 Q * 7 ^ * • • • >7 n


b e
nonnegative integers

and let q - 2
j = o ^y L e t M b e a
P o s i t i v e
constant and let the

6^ be a sequence of nonnegative numbers such that

n
5
I I^ +
m
n J

and such that

5, 1 T, i * 0,1,...,n. (3-1)

Then
q _ 1
Mr < i

I m p l i e s t h a t 6^-*0. '

PROOF. Let

L = Mr q _ 1
. (3-2)

L e t t be t h e f i r s t s u b s c r i p t f o r which y^ i s n o n z e r o . Then

5
n + l * M 5
n - t II ( 5
n - j ) 7 j

•J-t+1

* Mr " 5 _ q 1
n t = L6 . n t

We now p r o v e b y i n d u c t i o n t h a t

5
i + l ^ L 5
i - t ' L < 1, (3-3)

f o r a l l i . Note t h a t (3~l) and (3~3) i m p l y 6 ± + 1 £ r.


!

3.1-2

Let ( 3 - 3 ) hold for 1 = 0,1,.. .,k-l. Then

5 M 5 J
k+i^ £t n ( v /
j=t+i

L 5
* ^^^k-t - k-t

This completes the induction. Prom ( 3 - 3 ) we conclude that


§ x — 0.

LEMMA 3 - 2 . Let y Q be a positive integer and let


7 , 7 , ...,y be nonnegative integers.
1 2 n Let q = Z^ 7y Q Let
M and N be positive constants and let 6^ be a sequence of
nonnegative numbers such that

5 M 5 5 + 5 7 j + N 5 o + 1
I+I^ I° n ( i ±-j) I *

and such that

6 < r, 1 = 0,1,...,n. (3-4)

Then
q r
~ o a-1 q
y
o
2 °Mr + NT < 1

Implies that 6^ -+ 0.
3.1-3

PROOF. L e t

q
~ 7
o a - 1 7
o
L - 2 MT q
+ NT . (3-5)

Then

5
n l *
+
M5
n° II ( n n - P 5 +5 3 + N5
n
j = l

M T 7 0
" 1
II ( 2 r ) 7 j
+ N T 7
°
5
n + l * 5
n

= 8 2 " ' u
M r q _ 1
+ N r 0
= L 6 n .
n

We now p r o v e by i n d u c t i o n t h a t

5
i + l ^ L 6
i ' L < 1 ,
(3-6)

f o r a l l i . Note t h a t (3-4) and (3-6) i m p l y 6 i + 1 £ r .

Let (3-6) h o l d f o r i = 0 , l , . . . , k - l . Then

7 . - 1
n
Mr = L6,
5
k + l * 5
k
j = l

T h i s c o m p l e t e s t h e i n d u c t i o n . From (3-6) we c o n c l u d e t h a t

6 4 — 0.
!

3.2-1

3-2 A Theorem on the Solutions of Certain Inhomogeneous


Difference Equations

Let

X a
*j i+j - °* % - 1
(3-7)

be a homogeneous linear difference equation with constant


coefficients. The indlcial equation corresponding to ( 3 - 7 )
is the algebraic equation

N
t J
= 0. (3-8)
I " ,
J=0

Now, if all the roots of ( 3 - 8 ) are simple, the general solution


of ( 3 - 7 ) is given by

4=1

where the are the roots of ( 3 - 8 ) and where the c^ are con-
stants determined by the initial conditions. It is obvious
that if all the have moduli less than unity, then all
solutions of ( 3 - 7 ) converge to zero.

We wish to extend this result to the case of a


linear difference equation whose homogeneous part has constant
coefficients and whose inhomogeneous part is a sequence con-
verging to zero. Hence we shall consider
3.2-2

I K
j a
i + J " K
N = 1
0-9)

.no-

where 3^ -*• 0 .

Let us f i r s t c o n s i d e r t h e s p e c i a l c a s e of t h e

f i r s t - o r d e r d i f f e r e n c e e q u a t i o n

0
i + l + K
o a
i = p
i * (3-10)

Observe t h a t t h e s o l u t i o n of t h e i n d i c i a l e q u a t i o n i s g i v e n

b y t » -K q = p ^ . We s h a l l assume t h a t i s l e s s t h a n u n i t y

and t h a t P i -* 0 . The d i f f e r e n c e e q u a t i o n may be "summed" as

f o l l o w s . L e t I and n be a r b i t r a r y w i t h n - 1 ;> t . Then

a
l+l + K
o°l " H'

°t+2 + K
o°l+l =

V l + K
o a
n - 2 = p
n - 2 '

a
n + K
o a
n - l - P n - 1 -

M u l t i p l y i n g t h e e q u a t i o n whose l e a d i n g term i s a
n _ j b y (-K )^Q

or and a d d i n g a l l t h e e q u a t i o n s y i e l d s

° n " 0l'\ +
I P l P ? " 1
" 1
- ( 3 - 1 1 )
i = t
3.2-3

Hence

n-1
\o \
n * IPiT^laJ + £ IPJIPJ " " . 1 1 1 1
(3-12)

Since 0 we can fix I so large that the sum has magnitude

less than e / 2 . Since p.^ is less than unity we can then take

n so large that the first term on the right side of ( 3 - 1 2 )

has magnitude less than e / 2 . We conclude that —> 0. This

is the desired result for the case of a first-order difference

equation.

We shall now derive a generalization of (3~ll) for


the case of an Nth order difference equation. Thus we start
with

_N
P K 1
i+j " i ' N " '

Let py j .= 1 , 2 , . . . , N be the roots of the indlcial equation,

Let d n i be constants whose values we shall choose at our

later convenience. Let I and n be arbitrary integers such

that n - N ^ I. Then

n^N _N n-N

I d
ni Z V i + J =
Z d p
ni i'
1=1 j=0 1=1
3.2-4

or
n^N i+N n^N

X d
ni Z s-i s -K a
Z d p
ni i*
1=1 s=l 1=1

Changing the order of summation yields

n^N t+N-1 s

ni s-i

n-N s n n-N

d .K .
* I °= I d
n l V i +
I °s X ni s - i
s*t+N i«s-N s=n-N+l i=s-N

-1+2+3.

Consider 2 , This is zero if we choose d n i as a


solution of the homogeneous difference equation. It is
convenient to define d n i by

d
m - i v r 1 - 1 ( 3
- i 3 )

where the constants will be chosen later.

Consider 3 . Using ( 3 - 1 3 ) we can show that

n n^N N-l N _N

s=n-N+l i=a-N A=0 r=N-7\ j=l


3.2-5

We have the N coefficients D-^Dg, ... , D at our disposal. N We

choose them so that the coefficient of is unity and the

coefficients of the a _^ are zero for 7\ > 0. n Thus, recalling

that k n - 1,

(3-14)
N N

I * r I V j 4
* " 1 =
° ' * = 1,2,...,N-1.
r-N-y j = l

U s i n g t h e f a c t t h a t

N
1\

*r.Pi = 0, j - 1,2, . . .,N,


I
r=0

i t i s n o t d i f f i c u l t t o p r o v e t h a t t h e s y s t e m (3-14) i s e q u i v a -

l e n t t o t h e s y s t e m

N
D
j j p = 5
r,N-l' r =
0 l ... N-l
9 9 9 (3-15)

where 6^ is the Kronecker symbol. The determinant of

this system is a Vandermonde determinant. Hence Dj satisfying

(3-15)i and consequently ( 3 - 1 4 ) , exist and are indeed the

ratios of certain Vandermonde determinants. With this choice

of the Dy 3 reduces simply to o^. We conclude that


3.2-6

n-N t+N-1 s

<W*i
a a d K
n - I " I s I ni s-i
1=1 s=l 1=1

We summarize our results in

LEMMA 3 - 3 . Let

N
IN

u p K
\j i+J " i ' N
J=0

be a linear difference equation whose homogeneous part has


constant coefficients. Let the roots of the indicial equation
be simple. Let

N
d
3=1

where the are the roots of the indicial equation and where
the Dj are determined by.

I Vj 5
" r,N-l' r
" 0,1,...,N-l;
3.2-7

5
r N - l i s t h e K r o n e
^ i k e r
s y m b o l . Then

n-N t + N - i s

a
n - I d
n A - Z ° s l d
n i K
s - i ' <3-17)
1 = 1 s=l 1 = 1

where t and n a r e a r b i t r a r y i n t e g e r s such t h a t n - N j> t .

We are now r e a d y t o p r o v e

THEOREM 3-1. L e t

I V i + j - P i

be a l i n e a r d i f f e r e n c e e q u a t i o n whose homogeneous p a r t has

c o n s t a n t c o e f f i c i e n t s . L e t t h e r o o t s of t h e i n d i c i a l e q u a -

t i o n be s i m p l e and have m o d u l i l e s s t h a n u n i t y . L e t £^ -+0.

Then o ± 0 f o r a l l s e t s of i n i t i a l c o n d i t i o n s .

PROOF. Let

t+N-1 s
G
3 l - D
J Z a
s Z - s - i P j 1
" 1
' (3-18)
3 = 1 1 = 1
3.2-8

Using ( 3 - 1 6 ) , (3-17), and ( 3 - 1 8 ) we can write

n-N N

°n = E n A " I d
ilPy
G

1=1 j=l

The notational adjustments which would be necessary if one


of the roots of the indicial equation were equal to zero are
obvious. Since

N
d
ni - 1 v r 1
- 1
-
J=l

and since the having moduli less than unity, it is clear


that there exists a constant A such that

I
n-N
d A
nil < -
i=l

Let e > 0 be preassigned and arbitrary. Since ^ -* 0, there


exists a number L such that

< 2A f o r a 1 1 1
^ - L

Fix £ such that I ;> L. Observe that is independent of n.


With I fixed there exists a constant B such that

N
3.2-9

Let

p = •max[|.p |,|p 1 2 |,...,|p N | ].

Let T| be so large that

P ^ 2B"

Then for all n > T|, we have

n-N N
l * n | . * I | d n l | | P l | + Z I ^ J I P j l ^
i=t J=l

a
nl < 2T A +
^ B
= e

which completes the proof.

For the applications that we have in mind, the


inhomogeneous part of the difference equation will converge
to a nonzero constant. We wish to show that all solutions
of the inhomogeneous equation will converge to a constant.
This result follows easily from the above theorem. We have
3.2-10

COROLLARY. Let

j=0

be a linear difference equation whose homogeneous part has

constant coefficients. Let the roots of the indicial equa-

tion be simple and have moduli less than unity. Let co^ cd.

Then

CD
'i N

for all sets of initial conditions.

PROOF. Let

CD
a 85 +
i *i ~N

Then

N N
CD
X * . "i+J r
N *±+j +
* =
° V

J=0 J

Let = cd^ - cd. Then the conditions of the theorem apply


and we conclude that 0 and hence that
0)
i N

0=0
3.3-1

3.3 On the Roots of Certain Indicial Equations

The properties of the roots of the polynomial

equation

k-1
g kja (t) = t k
- a £ t J
= 0 (3-19)
j=0

will be derived. This will turn out to be the indicial


equation for certain families of difference equations which
will be encountered in our later work. The case a = 1 has
been treated by Ostrowski [ 3 . 3 - 1 , pp. 8 7 - 9 0 ] .
3.3-2

3.31 The properties of the roots. If k = 1 , the


only root of ( 3 - 1 9 ) is t = a. For the remainder of this
section we shall assume k ^> 2 . We shall permit a to be any
real number such that

ka > 1 . (3-20)

Since

g ( l ) = 1 k a
k,a " '

t = 1 is not a solution of ( 3 - 1 9 ) • It is convenient to define

(3-21)
= ( t _ l ) g ( t ) =
^ a ^ k,a t k + 1
" ( a + 1
H k + a
-

Hence G.k. ,(at ) has a root at t = 1 and roots at the roots of


0

1
By Descartes rule, g, ( t ) has exactly one real Q

jk , a
positive simple root. This unique root will be labeled p. .
ic j a
Since
k-1
g
k ,, (a) = - £ aJ,
a S k , a ( D = 1 - ka, g k a (a+l) = 1
j=l
we have

LEMMA 3 - 4 . The equation

k-1
g
k,a ( t )
= t k
" a £ t J
= 0
J=0
3.3-3

has a unique real positive simple root, , and

max[l,a] < £ ^ k a < a + 1.

Thus p k a has modulus greater than one. We shall

show later that all other roots have moduli less than one. In

the next lemma we shall prove that p. is a strictly increas-


o

K. a$

ing function of k.

LEMMA 3-5. P .
k l j f i < P k # a .
PROOF. This follows from the observation that the
recurrence relation

t&krl.aM - a =
*k,a^)

l s
implies that gjj ( P _ i )
a k a negative.

The following inequality will be needed below.

LEMMA 3 - 6 . Let ka > 1 . Then

^ >H (i • £) -
k
PROOF. Let k be fixed and define

k 2
Then J'(a) = (a+l) (ka-l)/a and therefore J(a) is a strictly

increasing function of a for ka > 1. The observation that

completes the proof.

Descartes 1
rule shows that (3, Is a simple root.
.k fa
Furthermore

LEMMA 3-7. All the roots of g k & ( t ) = 0 are simple


PROOF. Define

t k + 1 a k
G
k , a ( t ) =
^ - ^ S i c a ^ ) - " ( + D t + a .

Then G k a ( t ) has only one nonzero root,

v (a+1).

The fact that v is positive completes the proof.


I r

3.3-5

In the following two lemmas we derive better bounds

LEMMA 3 - 8 .

a + 1
k+t < > < Pk,a < a + 1.

Therefore, for a fixed

lim P k a = a + 1,

NOTE. Since ka > 1 implies that (a+1)k/(k+1) > 1,

we: need not write

max[l, ^ (a+1)] <

PROOF. Let v = (a+1)k/(k+1). Then

k + 1 k
n ( v ) _ a (a+l) /_ , l^"
G
k,a ( v ) = a
" k+1 V1 +
k,

An application of Lemma 3 - 6 shows that G, ( v ) < 0 , which 0

ic ja
together with an application of Lemma 3 - 4 completes the proof,

LEMMA 3 - 9 .

k k a k
(a+l) ' (a+l)

where e denotes the base of common logarithms.


r

3.3-6

PROOF. Let v = (a+1)k/(k+1). The only positive

root of the equation G," ( t ) = 0 is t = v - (a+l)/(k+l).


Q

Therefore G, ( t ) does not change sign in the interval


K. , a
0

v £ t a + 1. Since

k _ 1
G^ (a+1) = 2k(a+l)
a > 0,

G, _(t) is convex in the interval. Since the secant line


K. f a
through the points [v,G, ( v ) ] , [a+l,G, (a+l)] intersects the
0

Kj a ic f a
t axis at

a
t = a + 1 -
(a+1)

whereas the tangent line at [a+l,G, (a+l)] intersects the


ic j a
t axis at

t = a + 1 -
k
(a+l) '

we conclude that

k a
a + l - - ^ ( l ^ )K
O k, < a P + l -
(a+l) V ^ --K,a- ( a + l ) k«

k
An application of the well-known inequality (l + l / k ) < e

completes the proof.


Il

3-3-7

We will next study the polynomial generated by

dividing g, ( t ) by the factor t - p.


Q . Define c,.
Ka 5 j
0-£ J£ k - 1, by

-£§77" I W J
* ( 3
' 2 2 )

K a
' j=0

We first prove

LEMMA 3-10.

k-1
V c = ka-1
°J P -i*
k a
K a
j=0 '

PROOF. The result is obtained by setting t = 1 in


(3-22) and observing that g. ( l ) = 1 - ka. Q

iv , a

In the following three lemmas we shall prove that

all the roots of the quotient polynomial have moduli less

than unity. It will be convenient to sometimes abbreviate


3.3-8

LEMMA 3-11. C j > 0 f o r 0 £ j £ k - 1.

PROOF. I t i s e a s y t o show t h a t

c Q = 1, Cj = p" - a £ p 6
,

t=0
(3-23)
A second f o r m u l a f o r t h e c^ may be d e r i v e d b y n o t i n g t h a t

k-1
p k
- a £ 3 * - = 0. (3-24)
t=0

B r i n g i n g t h e l a s t k - j terms of (3-24) t o t h e r i g h t s i d e of

k - 1
t h e e q u a t i o n and d i v i d i n g b y p 0
y i e l d s

c^*» a £ - p * - a y. p"" , 6
1 £ J £ k - 1.
t = o t = i

L e t 0 = p ' 1
. Then

°3 = a 0 ( l
" i - e J )
» (3-25)

The f a c t t h a t 6^1 c o m p l e t e s t h e p r o o f .

LEMMA 3-12.

c
j - l c
j + l < °y k
> 2
> 1
1 H k " 2.
li.

3.3-9

PROOF. Case 1 . j = 1. We must show that

< c o r f r o m 3 2 3
°o°2 l ( ~ )'

a a a 2
< a - ^k,a " < ^k,a- > '

This simplifies to 0 < 1 + a - 0. _ which is true from


xV , CI

Lemma 3 - 4 .

Case 2 . J > 1. Then from ( 3 - 2 5 ) , the result is

equivalent to

(i-e , k - J - l )(i-e

or 0 < 1 - 29 + 6* which holds since 6^1.

LEMMA 3 - 1 3 . All the roots of the equation g k a (t) = 0 ,

other than the root 0, . have moduli less than one.


K, a
PROOF. The proof depends on the following theorem

which Ostrowski [ 3 . 3 - 2 , p. 9 0 ] attributes to independent

discoveries by Kakeya [ 3 . 3 ~ 3 ] and Enestrom.

If in the equation g(x) = 2 j ^ Q t> _jX


n all

coefficients bj are positive, then we have for each

root i that

141 <: max b /b


l ^ K n J
3.3-10

Applying this theorem to (3-22) we find, by virtue

of Lemmas 3-11 and 3-12, that

c c
C c
l^Kk-l J-1 o

Furthermore c,/c
jl o = k. ^ a - a < 1 from (3-23) and Lemma 3~4.
This completes the proof for the case k > 2. For k = 2,

= t + p Q - a

and Ul = P 2 > a - a < 1.

The major results of this section are summarized in

THEOREM 3-2. Let

k-1

If k 1, this equation has the real root p _ = a. Assume n

j.,a
k ^ 2 and ka > 1. Then the equation has one real positive
simple root p, and
0
a

max[l,a] < p v Q < a +.1.


3.3-H

Furthermore,

r~ o +4.1
a i ~ — s s
§a
riz < P„ < a + 1 -
( a + l ) k ^k,a
a

— — ( a + l ) k*

where e denotes the base of common logarithms. Hence

lim 6, „ = a + 1. All other roots are also simple and


v k,a
have moduli less than one.
r

3.3-12

3.32 An important special case, A case of special

interest occurs when a is a positive integer. (We will not

be interested in nonintegral values of a until we investigate

multiple roots in Chapter 7 . ) Lemma 3-4- niay be simplified to

read

a < P ^
k a < a + 1, k > 1.

Values of p, for low values of k and a may be found in


it j a
Table 3 - 1 . Observe that p. has almost attained its limit,
ic j a

a •+ 1 , by the time k has attained the value 3 or 4 . This is

particularly true if a is large. This will have important

consequences later.
3.3-13

TABLE 3 - 1 . VALUES OF p

"a 1 2 3 4

1 k

1 1.000 2.000 3.000 4.000

2 1.618 2.732 3.791 4.828

3 1.839 2.920 3.951 4.967

4 •1.928 , 2.974 3.988 4.994

"5 1.966 2.992 3.997 4.999

6 1.984 2.997 3.999 5.000

7 1.992 2.999 4.000 5.000


3.4-1

3.4 The Asymptotic Behavior of the Solutions of Certain


Difference Equations
3 .41 Introduction. Consider the difference equa-

tion

n
e
i+i - K
II e
ty (3- 2 6
)

where K is a constant and s is a positive integer. Taking

logarithms in ( 3 - 2 6 ) leads to a linear difference equation

with constant coefficients whose indicial equation is

n
n + 1 j
t - s t = 0. (3-27)

This is a special case of the equation studied in Section 3 . 3

with k = n 4- 1 and a = s. The properties of the roots of

(3-27), derived in the previous section, permit us to analyze

the asymptotic behavior of the sequence {e^}•

Now consider the difference equation

n
e
i+i = M
i II e
l-y (3-28)

where K. Is the asymptotic behavior of the sequence


generated by this equation the same as that generated by
(3-26)? The answer turns out to be in the affirmative. It
will turn out that the errors of certain important families
of I.P. are governed by difference equations of the form speci-
fied by ( 3 T 2 8 ) .
3-4-2

In Section 3 * 4 3 we shall show that the asymptotic

behavior of the sequence

n
(3-29)
e M e
i+i = i i n ( v e
i ) S + N
i e
i + 1
^
J=I

where M.^ K, N ± —• L, is the same as the asymptotic behavior

of the sequence generated by ( 3 - 2 6 ) or ( 3 - 2 8 ) . Difference

equations of the type defined by ( 3 - 2 9 ) will be encountered

in Chapter 6 . We shall designate difference equations of the

type defined by ( 3 - 2 8 ) and ( 3 - 2 9 ) as of type 1 and type 2,

respectively.
3.4-3

3.42 D i f f e r e n c e e q u a t i o n s of t y p e 1, We s h a l l

s t u d y t h e a s y m p t o t i c b e h a v i o r of t h e s o l u t i o n s of t h e d i f f e r -

ence e q u a t i o n

e
i + i = M
I n ^ 3
° )

where s i s a p o s i t i v e i n t e g e r . We s h a l l show t h a t i f

M ± - K, (3-31)

and i f t h e m a g n i t u d e s of e
0 >$^,> * • • > e
n are s u f f i c i e n t l y s m a l l .

t h e n t h e sequence of c o n v e r g e s t o z e r o . Observe t h a t i f

none of t h e M 4 a r e z e r o and i f none of e - e - , . . . , e ^ a r e z e r o ,


l o l* * n

t h e n e^ i s not z e r o f o r any f i n i t e i . We s h a l l prove t h a t

t h e r e e x i s t s a number p g r e a t e r t h a n u n i t y such t h a t

| e 1 + 1 | / | e . j j * \ c o n v e r g e s t o a n o n z e r o c o n s t a n t . I t i s e a s y t o

p r o v e ( s e e S e c t i o n 1.23) t h a t i f such a number p e x i s t s , t h e n

i t i s n e c e s s a r i l y u n i q u e .

L e t

5 ± = [ e ± | , r = s ( n + l ) > (3-32)

S i n c e c o n v e r g e n t s e q u e n c e s are n e c e s s a r i l y bounded, t h e r e

e x i s t s a c o n s t a n t M such t h a t
3.4-4

for all 1 . Then

n
B 1 + 1 * M II B j . 4

J = 0

Let
5. ^ r , 1 = 0 , 1 , . . .,n.

An application of Lemma 3 _
1 with all the y^ equal to s and
with r replacing q shows that if

M r r - 1
< 1 ,

then 6 -»• 0
i

Assume that and K are nonzero. We will investi-

gate how fast the sequence {e^} converges to zero. Let

o± = In 5 ± = ln|e |, ± J ± = lnJMj. (3-33)

Then from ( 3 ~ 3 0 ) , we have that

_n
a
i+l " i J + s
Z °±-y (3-34)
J=0

Let t be an arbitrary parameter whose value will be chosen


later. Let

c_O = 1, cj(t) = t 3
- s ^ l
t, j * l,2,.o.,n+l,
1=0

(3-35)
r

3.4-5

and let
D (t)
± = a 1 - tff _ . ± 1 (3-36)

Then it is not difficult to see that ( 3 - 3 4 ) is identical

with

X °^\ + l-3 { t ) + c
n + l ( t
K - n J
= i ^-37)

for all values of t.

Consider the equation

n
c n + 1 (t) = t n + 1
-s y t V « 0. (3-38)
t=0

Observe that this is precisely the indicial equation of the

difference equation derived from ( 3 - 2 6 ) by taking logarithms.

It is a special case of the equation

k-1
S , (t) - t
k a
k
- a £ t J
= 0 (3-39)
J-0

studied in Section 3 . 3 with k = n + 1 , a = s. The analysis


of ( 3 - 3 9 ) was based on the assumption that ka > 1 . Let r > 1
Since ka = (n-fl)s = r s the conclusions of Section 3 . 3 apply.
Hence all the roots of ( 3 - 3 8 ) are simple and there is one
f

3.4-6

real positive root greater than unity; all other roots have
moduli less than unity. The real positive root is labeled

o • We shall abbreviate 6
n+JL y s • sp and choose the
11+1 by 1 o

parameter t equal to p. Hence

c n + 1 ( p ) - 0. (3-40)

Let
C
j = Cj(p), = Dj(p).

Then ( 3 - 3 7 ) becomes

The c. are just the coefficients of the polynomial gotten by


J
dividing c ^ t ) by t - p. [See ( 3 - 2 3 ) J Hence the indicial

equation corresponding to the difference equation ( 3 - 4 l ) has

only simple roots whose moduli are less than unity. Since

—• K, we conclude that -* ln|K|. All the conditions of

the Corollary to Theorem 3 - 1 now apply and we conclude that

D ± - ^ L . (3-42)

c
j
j=0
3.4-7

An application of Lemma 3 - 1 0 , with ka = (n+l)s = r and with

^k,a * Pn+1,8 = p
' s h 0 W S t h a t

n
I C
J ' P i - (3.43)

From (3-33), (3-36), (3-40), ( 3 - 4 2 ) , and (3-43), we conclude

that

6
1+1* _^ | |(p-l)/(r-l) K

We summarize these results in

THEOREM 3-3. Let

e
i+i - I M
n e
i-j

with

| e | ^ T, ± 1 = 0,1,...,n.

Let s be a positive integer and let r = s(n+l) > 1 . Let

M ± -»• K and let | M ± | £ M for all 1 . Let

M r r _ 1
< 1.
3.4-8

Then e A -•0. Let p be the unique real positive root of

n
n + 1
t - 8 £ - 0 .

Let M i and K be nonzero. Then

p i K | ^ - ^ ^ r
- ^ . (3-44)
e
i

Table 3 - 1 gives values of p = P n + 1 s for low values


of n and s. Note that no a priori assumption has been made
of the existence of a number p such that

* o .
p
e
if

If the existence of such a number is assumed a priori, it is

not difficult to prove that it must satisfy the indicial

equation ( 3 - 3 8 ) .
3.4-9

3.43 Difference equations of type 2, We shall

study the asymptotic behavior of the solutions of the differ-

ence equation

n
e
i+i = M
i i e
n ( i-j- i)
e e s + N e
i i + 1
' ( 3
~ 4 5 )

J=I
where

M ^ K / O , N ± -» L, (3-46)

and where n and s are positive integers. We shall show that


a e
if the magnitudes of ® * ^S
0
E
*•* > E
N ^ sufficiently small, then
e^ 0 and there exists a number p greater than unity such
that I 1 /| e JL | p
converges to a nonzero constant. InJPaet,.
we shall demonstrate that the asymptotic behavior of the
sequence generated by (3-45) is identical with the asymptotic
behavior of the sequence generated by

n
e
i l = i
+
M
II e|.j, M ± - K

which was studied in the previous section.


Let

Q
± = |e |, ± r = s(n+l) (3-4?)

and let

|M | £ M,
± |N j ^ N±
I

3.4-10

f o r a l l i . Then

n
s+1
5
i + i ^ M 5
I n ( 5
i - j + 5
i ) s +
N 5
I

• J - l

L e t

5
i <> R , 1 - 0 , 1 , . . . , n .

An a p p l i c a t i o n of Lemma 3-2, w i t h a l l t h e y^ e q u a l t o s and

w i t h q e q u a l t o r , sjiows t h a t i f

2 r - s M r r - l + < x ^

t h e n 5 i -»• 0 .

Assume t h a t e^ i s n o n z e r o f o r a l l f i n i t e 1. We may

w r i t e (3-4-5) as

w i t h
e
i + l - H 4-y II (3-48)

T
i = M
i A
i + N
i 9
i ' ^3-49)

where

»i " 5 ( l - A ) " . - ~n~~^ • (3-50)

J = 1
n

We s h a l l d e m o n s t r a t e t h a t -*• 1 and 6^ •—• 0 .


3.4-11

To show that \^ -* 1 we note from (3-45) that


0 f o r a 1 1
e 1 + 1 /e -*0.
1 Therefore e
j _ A j _ « j ~* finite j and the

result follows.

To show that 6^ 0 we proceed as follows. Prom

(3~45) and (3-50),

n
e M
± - i-i ±-i x
n e
i-i-j + N
i-i l-i- e

Hence

n
6 1
" M
i - l V l e
L l - n + N
i-1 n ^ " 1
• < 3 - 5 D

n 'i. 3 n 4-i
Repeat this process for the second term on the right side of

(3-51)* Carry out this reduction a total of n - 1 times.


e e
The problem is reduced to proving th^at i + 1 / ^ ~* 0 and this

is clearly true from (3-45) •

Since 7^ 1 , 6^ 0, and M i K, we conclude that

T i K. Theorem 3-3 niay now be applied to (3-48) and we

arrive at

THEOREM 3-4. Let

n
s+1
j-i
I

3.4-12

with

| e ± | i r , 1 = 0 , 1 , . . , , n .

Let s be a positive integer and let r = s(n+l) > 1 . Let

M i •* K / 0 , N ± L, and let J M j ^ M, |N | £ N for all 1 ±

Let

2 r - s M r r - l + jjpS ^

Then e^ 0. Let p be the unique real positive root of

n
t n + 1
- s £ ^ = 0.
J-0

Assume that e i ^ 0 for all finite i. Then

f U l " , K | < P - l ) / < r - l ) .


4.0-1

CHAPTER 4

INTERP0LAT0RY ITERATION FUNCTIONS

In this chapter we shall study I.F. which are

generated by direct or inverse hyperosculatory interpolation;

such I.F. will be called interpolatory I.F. The major results

concerning the convergence and order of interpolatory I.F.

will be given in Theorems 4-1 and 4-3. A sweeping generaliza-

tion of Fourier's conditions for monotone convergence to a

solution will be given in Theorem 4-2.


4.1-1

4.1 Interpolation and the Solution of Equations

4 . 1 1 statement and solution of an interpolation

problem. The reader is referred to Appendix A for a discussion

of hyperosculatory interpolation theory. Certain salient

features of that discussion which are needed for the develop-

ment of the theory of this chapter will be repeated here.

Consider the following rather general interpolation

problem. We seek a polynomial P such that

(k«) (k.)
J x f x f o r
P ( i_j) = ( i-j) J =0,l,...,n;
(4-1)

k^ = 0,1, ... ,7^-1, ^ 1; x_ ± k ± x_ ± t if k ^ I.

That is, the first 7^-1 derivatives of P are to agree with

the first 7j-l derivatives of f at the n + 1 points

x
i ' i _2* • •
x l^t

q =
I 7
r

Then there exists a unique polynomial P of


n,7 ,7 » Q 1 • • - ,7
n

degree q - 1 which satisfies ( 4 - 1 ) . For the sake of brevity


we write
P = p (h-g)
v
n,7 n , 7 , 7 , .. . , 7 '
Q 1 n '
i

4.1-2

where 7 signifies the vector y y^ 9 ...,y . If, in particular,

7^ = s for all yy then we write the interpolatory polynomial

as P .
n, sn

q
Let f( )(t) be continuous in the interval determined

by x , x1 1 - 1 ,..-jX 1 - n , t . Then

n
f(q)[| (t)] 7
J
f(t) - P B f y (t) . II (t-xj.j) , (4-3)

x x x
where ^ ( t ) lies in the interval determined by ±> i-i> • • • > j_- r

q
Let f be nonzero and let f^ ^ be continuous on an

interval J. Let f map J into K. Then f has an inverse S and


q
zj( ^ is continuous on K. We can state the interpolation

problem for 3 as follows. We seek a polynomial Q such that

(kj (k.)
Q ( y
i-p = S ( y
i-j^ f o r J =
0,1, ...,n;
(4-4)

kj = o,i,...,7^-i, 7j ^ i; y _ ± k / y _ ± t if + i.

Then there exists a unique polynomial 7 ,y^ . . . , 7


Q 5 n of

degree q - 1 which satisfies (4-4) . For the sake of brevity

we write

Q
(4-5)
n,7 ^n,y s7 s Q 1 ..-,7 * n

If, in particular, y^ = s for all 7^, then we write the

interpolatory polynomial as ^(t).


4.1-3

The error of the interpolation is given by

( a )
« [Mt)] "
s(t) = Q ^ ( t ) + —
7 ^ II 8
(t-y,^) , (4-6)

where 6^(t) lies in the interval determined by

^i'^i-l* • • • '^i-n'^ *
f

4.1-4

4.12 Relation of interpolation to the calculation


x x x b e n + 1 a r o x l r n a n t s t o a
of roots. Let ±> i-i9 • • • > i- n PP
zero a of the function f. To calculate a new approximant
it is reasonable to calculate the zero of the polynomial
which interpolates f at the points x^,x^_^,...,x . 1-n The
x x x
process is then repeated for the set i+i' i'•••> i-n+l•
One drawback of this procedure is that a polynomial equation
must be solved at each step of the iteration. A second draw-
back is that the polynomial will have a number of zeros, some
of which may be complex, and criteria are required to select
x
one of these zeros as i + 1 - Once such criteria are established
the point is uniquely determined by the points
X • • X • -i • . . . . X . . We define $ as the function which
J
I 1-1 I-n n,7
x x x i r r t o x #
maps i> i„i> • • • > i - i+i n

The difficulties of having to solve a polynomial

equation may be avoided by interpolating the inverse to f,


# y a n d
at the points y ^ y ^ i * • • * i - n ' evaluating the interpolatory

polynomial at zero. The point is uniquely determined;


x x x i n t x
let the function which maps i> ±^i > • • • * i - L n o j_ + 1 be

labeled cp - nj/

For either of the two processes described above,


x ,x ,...,x
Q 1 n must be available. One method for obtaining
these starting values is described in Section 6 . 3 2 .
If

4.1-5

I.F. which are generated through direct or inverse

hyperosculatory interpolation will be called interpolatory I.F.

Hyperosculatory interpolation is not the only means by which

I.F. may be generated. Other techniques will be studied in

various parts of this book. It is a very useful technique

however and gives a uniform method for deriving I.F. and of

studying their properties and, in particular, their order.

The most widely known I.F. are all examples of interpolatory

I.F. A useful by-product of generating I.F. by hyperosculatory

interpolation is the introduction of a natural classification

scheme.
II

4 . 2 - 1

4.2 The Order of Interpolatory Iteration Functions


4.21 The order of iteration functions generated by
x X x b e n + 1 a r o x l
inverse interpolation. Let J_* J__]_j • • • ' i - n PP ~
mations to a zero a of f. Let 0^ ^ be the polynomial which
i n t J r i e s e n s e
interpolates S at the points y ^ y ^ i * • • • >y±- n °f
(4-4). Define a new approximation to a by

x 0 )
±+i - ^J '

.Then Repeat this procedure using the points i + i j ^ • • • > i- +l x x x


n

We shall investigate how the error of depends on the

errors at the previous n + 1 points.

We observed ( 4 - 6 ) that

(q)
3 [e.(t)] " 7

( t ) + J
- ^,7 qT n (Wi-J> -

where 0^(t) lies in the interval determined by


S e t t = T n e n
y , y _ , . . . , y _ , t , and where q =
1 1 1 1 n jy °*

7,
x
i+l - a - - -^f S ( q )
(e ) ± [I (y -j) > ±
J
( 4 - 7 )

where 0 ^ = 9^(0) . Let j = i-j x _ a


« W e c a n
write ( 4 - 7 )
in terms of either the y^_j or the ^. Since

y f x = f e
i + i • ( i+i) ' ( W i + i '
4.2-2

where T\ ±+1 lies in the interval determined by x ± + 1 and a, we

conclude that
n

M
y±+i = * i n *l-y
(4-8)

q {(i)
(-i) z (e ) ±

M =
* i ~ q!«;(p i> ' 1+

where p ± + 1 = f (Tl ) .
1+1 Since

we can also conclude that


(4-9)
n
e
i + i - (i- l )II^ ( )t-y
M
(e )qe
1

M, = -

qi II [3'(pi_j)] J

where p _ j = f (T| _^) .


± ±

It is clear that if none of the set e ,e^,...,e is Q n

zero and if does not vanish on the interval of iteration,

then e^ is not equal to zero for any finite 1.


4.2-3

We shall show, however, that If the initial approxi-

mations x ,x ,...,x , are sufficiently close to a, then e


Q 1 n ± -+ 0

We will use (4-9) for the proof. Let

J = -jx |x-a| 1 T\.

q
Let f ( ^ be continuous on J and let f' be nonzero on J. Let

f map the interval J into the interval K. Then 3 ^ is con-

tinuous on K and 0' is nonzero there. Let

l 8 ( q
^ ' ^ V |.'(y)li» a

for all y e K; that is, for all x e J. Let

A
2

Let x , x , . . . , x
Q 1 n e J. Then

max[|e |,|e |,...,|e |] £ r.


o 1 n

We shall show that if

M p q-1 ^ 1 } (4-10)

then all the e J.


!

4.2-4

Since x , x , . . . , x
Q 1 n e J, |M | £ M.
n Hence

e
l n ll = l n«
+
M
II lei.jl^iHT^r,

where the last inequality is due to (4-10). We proceed by

induction. Let x ± e J, for i = 0,1,...,k. Hence |M | £ M


k

and

J=o

which completes the induction. Since all the x.^ e J, | j ^ M

for all i.

A modification of this proof could be used to show

that e^ -»• 0. Instead we note that

n
7
5
i l ^
+
M
II (^-j) =

and use Lemma 3-1 to arrive at

LEMMA 4-1. Let q = 7y Let

J = -jx | x-a | £ r |
4.2-5

and let f ^ be continuous and f nonzero on J. Let

x ,x ,...,x
Q 1 n e J. Let

for all x e J and let M = ^ A ^ - Suppose that

M r q - 1 < 1 >

Then e i 0.

Since -*• 0, x ± -*• a, and we can conclude that


4.2-6

4.22 The equal Information case. We will now

specialize to the case of Interest for our future work. Let

7j - s, j = 0,1,...,n f

Then the same amount of information will be used at each point.

Thus the first s - 1 derivatives of the interpolatory poly-

nomial are to agree with the first s - 1 derivatives of 3 at

y y # y # L e t
i' i-l'•• ' i-n

r = s(n+l).

The results of the previous section hold for this case if we

replace q by r and 7j by s. In particular,

- * M
i II y±-y

3=o (4-11)

r
(-i) ^^(e ) ±

* M
i - " r ! « ' ( p 1 + 1 ) '

and

3=0
e M e
i + i - i II l-y

(4-12)

r r)
M. = -
'1
i-i)
n
^ (e ) ±

r ! II r * ' ( P i - j ) ] '

3=0
1

4.2-7

Let the conditions of Lemma 4 - 1 hold with q replaced by r

and 7 , replaced by s. Then |M | <; M for all i, e


± ± 0 , and
J
M - * Y ( a ) , where
1 r
v
r ( x ) was defined ( 1 - 8 ) as

r ( r )
_ _ (-D 3 (y)
r
r'[*'(y)] y=f(x)

All the conditions of Theorem 3-3 are now satisfied and we

conclude that

K ± l l _ | j {p-Mr-\
Y a ) (4. 1 3 )

where p is the unique real positive root with magnitude

greater than unity of the equation

n
t n+l _ s \ J t = o.
J=0

Furthermore, * M — - ( - l ) a ( 0 ) , where CZ (y) is defined ( 1 - 8 )


±
r
r r

as
4.2-8

Hence

i l m i - . |a (o) fp-^r-i). ( 4 . 1 4 )

Iy i x

It is not difficult to see that one can pass from ( 4 - l 4 ) to

( 4 - 1 3 ) by observing that y _j = ± f'(\_ )e _y


3 ±

Our results concerning the convergence and order

of I.F. generated by inverse interpolation are summarized in

THEOREM 4 - 1 . Let

J = jx |x-a| £ rj-.

r
Let r = s(n+l) > 1. Let f ( ^ be continuous and let f ' S ^ ^ 0

on J. Let x , x , . . . , x
Q 1 n e J and let a sequence (x ) be i

defined as follows: Let g be an interpolatory polynomial

for 3 such that the first s - 1 derivatives of 0^ g are equal

to the first s - 1 derivatives of 2f at the points y^fY^-^* • • • ^ i -

Define

x = ( x ; x x 0)
i+l Vs i l-l"-" i-n) - %3^ '
4.2-9

Let e ^ = x ^ j - a. Let

,r-l
for all x e J, and let M = ^ A g . Suppose that MT < 1,

Then, x ± e J for all 1 , e -* 0 , and ±

(4-15)

where p is the unique real positive root of

n
t n + 1
- s (4-16)

«J=o

and where

r {r)

Y (x) = (-D Z (v)


r
]
r! [S'(y)] y=f(x)

Also

i±^^|a (o)|M/M

where
4.2-10

We close this section with a number of comments.

We have proven that the order of any I.F. generated by


inverse interpolation is given by a certain number p without
making any a priori assumptions about the asymptotic behavior
of the sequence of errors. If we had assumed a priori the
e e P
existence of a number p such that I ^ ] _ I/I i I+ converges to
a limit, then it would have been much easier to prove that
this number p was determined by the indicial equation ( 4 - l 6 ) .
If n = 0 , then m is a one-point I.F.; if n > 0 ,
n, s
then cp is a one-point I.F. with memory. The order p is an
n, s
integer if and only if n = 0,- that is, if the I.F. has no

memory.

Observe that the asymptotic error constant of the

sequence {y^} depends upon OL^ whereas the asymptotic error

constant of the sequence { x ^ depends upon Y^. We shall see

that whenever we deal with direct interpolation, the asymp-

totic error constant of the sequence (x ) will depend upon A^.


i

Recall ( 1 - 8 ) that A r is the same function of f as <2 is of 3.r

Thus, in a certain sense, {y^} plays the same role for inverse

interpolation that (x^} plays for direct interpolation.

Values of p for different values of n and s may be

found in Table 3 - 1 with k = n + 1 and a = s.


4.2-11

4.23 The order of iteration functions generated

by direct interpolation. We now investigate the case where a

new approximation to a is generated by solving the polynomial

which interpolates f. We immediately turn to the case where

all the 7j are equal to s.


x x x b e n + 1 a r o x l m a t i o n s a
Let j^ j__]_* • • • ' i - n PP to
zero a of f. Let P be the polynomial whose first
n, s

s - 1 derivatives are equal to the first s - 1 derivatives


x X x
of f at J_j J__I' • • • > j_- ' n Define a new approximation to a by

x x x #
Then repeat this procedure for i i> i> + • • • ' j_- +l n

Since P is a polynomial of degree r - 1 , where


n, s
x w i l 1 n o t
r = s(n+l), j_ 2 + generally be uniquely specified by
(4-17) • It is not even clear a priori that P has a real
' * n, s
zero in the neighborhood of a. We shall prove that under

suitable conditions P^ does possess a real zero in the


n, s
neighborhood of a. Under certain hypotheses we shall, in

fact, be able to prove much more.


4.2-12

x X x e J #
Let J = jx |x-a| <; r|- and let J_>J__;L> • • • * i - n

,
Let f be nonzero on J. If the ^..j bracket a, then it Is
clear that P^ has a real zero in J.
n Hence it is sufficient
^ nj s
x e o n o n e
to investigate the case where all the -j__j l i side

of a. We shall first prove

LEMMA 4-2. Let

J = |x||x-a| £ r|.

Let f ^ be continuous on J and let f' / O . o n J. Let

of a. Let

for all x e J. Suppose that

^ i a L U , ; ! . (4-18)
v 2 r

Then P has a real root, x.,-., which lies in J.


n,s -L+J-
I ]

4.2-13

PROOF. To prove the lemma It is sufficient to prove

the result for the case that

> a, j = 0,1, .. .,n,

and f' > 0. Hence P n s ( x ) * fCXj^) is positive.


1 We shall

prove that (a-r) is negative. Since P interpolates f,


n, s
Q M
n} sQ

8
p(t) - f ( t ) . - f ( r )
| p ^ 5 (t-x^) ,
«J=o

where £(t) lies in the interval determined by

x^ j X j ^ _ - ^ , .,. *x^_^, t. Then

f ( 3
p(orr) = f(a-r) - *fo> n (a-r-x^j) ,

where £ s £(a-r). Furthermore,

n
r
s
p(a-r) = - rf(n) - - ^ - f ^ U ) J] (r+ X j L . a) ,
r

where T| lies in (a-r,a). Hence P(a-r) < 0 if

J=o
4.2-14

Since

2 1

the proof Is complete.

Hence P M has a real root in J provided that

(4-18) holds. We shall now show that under certain conditions

we can prove a much stronger result without demanding that

(4-18) holds.

LEMMA 4 - 3 . Let

J = jx |x-a| £ rj-.

Let f ^ be continuous on J and let f 'f( ) / 0 on J. Letr

x >x^,...,x
0 n e J and assume that these points all lie on one
side of a. Let these points be labeled such that x n is the
closest point to a. Let

f(x )f
n
( r )
(x ) > 0,
n r even, (4-19)

f'(x )f( (x ) < 0 ,


n
r)
n r odd. (4-20)
r

4.2-15

Then P„ „ has a real root x^ such that


n,s n+i

minta.x^] < x n + 1 < max[a,x ]. n

PROOF. There are four possible cases depending on

the signs of f(* ) and f . n We shall prove the result for

only two cases which will give the flavor of the proof; the

other cases may be handled analogously.


Case 1. f(* ) > 0, f' > 0.
n We need only prove %

that P„ (a) < 0. Since P„ interpolates f,


n, s ny s a

p , ( t ) = f(t) - f
n s
( r )
i;pn n 8
( t - x ^ j ) . (4-2i)

Hence

P
n,s^ =-^T^ (i) r)
II ( X l . a) ,
r
s

J=0

x
where £ = £(a) . For this case ^ _ j - a > 0. Hence P n (a)
Q

r_1 r
is negative if (-l) f^ ^ is negative. The proof of Case 1

may now be easily completed.


f x
Case 2. ( n ) < 0, f > 0. We need only prove

that P ^ ( a ) > 0.
n s From (4-21),

3
*»,.<«> - - ^ P - n (c-x^j) .
r

4.2-16

For this case a - x _ j > 0. ± Hence P ^ ( a ) Is positive if


n s

r
-f( ) is positivej that is if

r (r)
f(x )f( ^(x ) > 0,
n n f'(x )f n ( x ) < 0.
n

Let the hypotheses of the preceding lemma hold,

We can conclude that if a < x , then


R

a < x ± + 1 < x ,± i = n,n+l,... . (4-22)

If x < a, then

x
i ^ i + l ^ CL'
x 1 = n
' n + 1
'»" • (4-23)

Let (4-22) hold. Then the sequence {x^} is monotone decreas-

ing and bounded from below. Hence it has a limit and

x
i + l " i - j ~* °'
x
J-0,l,...,n. (4-24)

Label the limit £. Let t = x ± + 1 in (4-21). Then

f £ 8
- 4 F n ("iw-'i.,) .
4.2-17

x a n d
where | = l( j_ ) +1 where we have used the fact that
P (x ,) = 0 . Let i oo. An application of (4-24) yields
n,s i + 1 ' x 4j

f(5) = 0 . Since f' is assumed to be nonzero. £ = a and hence

x 1 ~* a. The same conclusion would have been obtained if we

had started with (4-23) • We summarize our results in

THEOREM 4 - 2 . Let

J = jx |x-a| £ rj*.

Let r = s(n+l) > 1 . r


Let f^ ^ be continuous on J and let

f , (r)
f o on J. Let x ,x ,...,x_ e J and assume that these
n
' o 1 n
points all lie on one side of a. Suppose that

f(x )f( )(x ) > 0 ,


±
r
± r even (4-25)

f'(x )f( )(x ) < 0 ,


±
r
± r odd (4-26)

where i is any of 0 , 1 , . . . , n . Define a sequence { x ^ as

follows: Let P be an interpolatory polynomial for f such


n, s
that the first s - 1 derivatives of P_ ^ are equal to the
n, s
first s - 1 derivatives of f at the points i' j__i* • • • > ± - x x x
n
9

x e a 0 31
Let jL+^ ^ P ^ ^ (whose existence we have verified in the
preceding discussion) such that x. . is real, P^ (x. ...) = 0 ,
i"t"j. n, s i"T"jL
and
x
min[a,x J < i i + 1 < max[a,x ]. i

Then the sequence {x^} converges monotonically to a.


r

4.2-18

This result is well known for the special case

n = 0 , s = 2 which is Newton's I.F. For this case the condi-

tions of Theorem 4 - 2 are known as Fourier conditions

(Fourier [ 4 . 2 - 1 ] ) . Theorem 4 - 2 gives a sweeping generalization

of Fourier 6 result.
1
Although the sufficiency of the Fourier

conditions are geometrically self-evident for Newton's method,

they are not self-evident in the general case.

Observe that the hypotheses of Theorem 4 - 2 do not

place any restrictions on the size of the interval where


(r) /
monotone convergence is guaranteed other than that f'f ' f 0 . v

Note that the condition that x ,x ,...,x all lie on one side
o 1 n n

of J is automatically satisfied if n = 0 . If ( 4 - 2 5 ) and

(4-26) do not hold, then we shall have to place restrictions

on the size of the interval where convergence is guaranteed.

We have already seen in the proof of Lemma 4 - 2 that in the

general case we require a condition on the size of the interval

in order to assure that P_ has a real zero in the interval.


n, s
To prove convergence in the general case we start
again with

f ( r )
te,(t)] n

*»..<*>-'<*> — F T — n <*-*i-j> - s

j=o
1

4.2-19

Then

n
p ( a ) = f ( r ) ( C ) e X A ( A )
n ,s -^ i n ^ i-j - I-J - > ^ ^ -
J-0

Let the conditions of Lemma 4 - 2 hold. Then there exists a

real x 1 + 1 e J such that P n s (x 1 + 1 ) = 0. Furthermore,

P , (a) = (a-x
n 8 ±+1 )P'(Tl 1+1 ),

where T l 1+1 lies in the interval determined by a and

Then

i+r

Assume that p' does not vanish in the interval determined


n,s
x
by a and i + 1 * Then

i+1 "i+1 II "i-j-


= H
i+i II l-y e

j=0

(4-27)

1 + 1 r !
c < w
4.2-20

e e e i s
Observe that if none of the set 0 > i>•••* n

zero and if f d o e s not vanish on the interval of Iteration,

then e^^ is not equal to zero for any finite i.

We shall show, however, that if the initial approxi-

mations x , x , . . . , x , are sufficiently close to a, then


Q 1 n

eA 0. Let

for all x e J. Let

P
n,s| ^ 2

x
for all x In the interval determined by a and i + 1 « l^t

H = v /|x .
1 2 Then | H ± + 1 | £ H and

n
5
i+i * H
II d
l-y 5
i-j - K-jl-

An application of Lemma 3-1 shows that if

r _ 1
HT < 1

then e A -*• 0.
4.2-21

Observe that the argument used In the proof of

Lemma 4 - 1 could not be used because depends upon

Since

we conclude that

H
i+1 - r
(-l) A (^).r A r = ijp-.

All the conditions of Theorem 3-3 are now satisfied and we

conclude that

le '
i n a U |A (a)H
r
/ ( r _ l )
(4-28)

where p is the unique real positive root with magnitude

greater than unity of the equation

n + 1
t - s £ tJ = 0.

Our results concerning the convergence and order

of I.F. generated by 'direct interpolation are summarized in


r

4.2-22

THEOREM 4 - 3 . Let

J = |x |x-a| £ r j .

r
Let r = s(n+l) > 1. Let f^ ^ be continuous and let

f , ( r ) ^ o on J.
f Let x ,x ,...,x 1 e J. Let

1
^T- 1 v,, |f'| 1 v.

for all x e J. Suppose that

21 r r-l 2 r < 1 # ( 4 _ 2 9 )

v
2

Define a sequence {x^} as follows: Let P n g be an interpo-

lators polynomial for f such that the first s - 1 derivatives

of P are equal to the first s - 1 derivatives of f at the


n,s
x
points x ^ x ^ . ^ • . . > ^_ - n Let be a point such that x^+i

is real, P n s
x
( ^ l)
+
=
Oj a n d x
j_+i e J
* T h e e x i s
t e n c e of such
a point is assured by ( 4 - 2 9 ) • Define $ by
n, s

x |+1 ~ n , ( i $
S
x ; x
i-l' • • " i-n) • x

Assume that P ;> |i for all x in the interval


n,s 2

determined by a and x 1 + 1 • Let H = v /|x and suppose that 1 2

Hr ^ 1 1
< 1. Let e _j = x ^
± - a.
4.2-23

Then, x^ e J for all 1 , e 1 -* 0 , and

IfldiU u ( „ ) ( P - l > ^ , (4-30)

where p is the unique real positive root of

t n + 1
- 8 £ t J
= 0,
J=o

and where

f ( r )
#
r r!f>

NOTE. The point may not be uniquely defined

by the hypotheses of the theorem. Additional criteria must

then be imposed in order to make $ a single-valued function.


n, s

The hypotheses of this theorem may seem rather


strong; in the cases of greatest practical interest, however,
some of the conditions are automatically satisfied. See the
examples of the next section.
The form of ( 4 - 3 0 ) is strikingly similar to the
form of ( 4 - 1 5 ) • As before, the only parameters that appear
are r and p.

If n = 0 , then <f> is a one-point I.F,; if n > 0 ,


n, s
then $n, s is a one-point I.F. with memory,
4.3 Examples

In all the examples of this section J will denote

the interval defined by

We shall find that both the Newton I.F. and the secant I.F.

may be derived from both direct and inverse interpolation.

For a given I.F., the conditions sufficient for convergence

may vary depending on the method of generation of the I.F.

This should not be surprising since Theorems 4 - 1 and 4 - 3 were

derived for general families of I.F. Hence the conditions

sufficient for convergence, for a special case which Is

covered by both theorems, need not agree. The notation in

this section is the same as in the previous section. The Yj

are defined by ( 1 - 8 ) .

EXAMPLE 4 - 1 . We perform inverse interpolation

with n = 0 and s = 2 . Hence r = 2 and

Q
o,2 ( t ) =
*i +
(^i)*!' S
i =
^ i ^

Then
x
l l - Oo^*!* = Q
o,2 ( 0 ) = X
i " u u
+
II

4.3-2

This is Newton's I.F. Then

^ 1 *"( i) 9
2 1 f
"(*i) , , 2 2
e = e 2 2
i+l " i 7 T T i " 5 ^ [ f TU, ] e ,

(4-31)

where 0 ± = f ( £ ) , p^^ = f (T^) . Let x


± Q e J and let f" be
* continuous and f'3" ^ 0 on J. Let

for all x e J. Let M = 7\ /A| and suppose that Mr < 1.


1 Then
0 and

e
I+l
t t - Y (a). 2 (4-32)

No absolute value signs are required in (4-32) since the


method is of integral order.

EXAMPLE 4 - 2 . We perform inverse interpolation


with n = 0 and s = 3• Hence r = 3 and

2 "
Q
o,3 ( t ) = 3
i +
(t-yi)^! + V
Then

x
i+l - Po,
c
3
( x
i ) =
*o,3 ( 0 ) = x
i " i " V
u x
i K ' A
2 - •&>

and

_ 3
e ± + 1
" 6[3'( P l n 3 e ±
'

Let x Q e J and let f"' be continuous and / 0 on J,

Let £ £ 13'| ^ * 2 for all x e J. Let M = \/X^ and


p
suppose that Mr < 1. Then -+ 0 and

e
i+i
f*- Y (a).
3

The I.F. of the form cp are of such importance

that they are given the special designation E . They will be


s
studied in considerable detail in Chapter 5 ; their properties

form the basis for the study of all one-point I.F.


EXAMPLE 4 - 3 . We perform inverse Interpolation
with n = 1 , s = 1 . Therefore r = 2 and

Q ( t ) +
W l
l,l - *i (t-yi)

where we have used the Newtonian form of the interpolatory


polynomial as given In Appendix A. Then

x x
l~ i-l
x
i + l - *1,1 ±{X ; X
i-1 } = Q
l,l ( 0 ) = X
i " i f
f f
i" i-l

This is the secant I.P. Then

e
i+l " " 2 S'lp^S'tPi^) i i - l ' e e

Let x ,x^ e J and suppose that the other conditions of


Q

Example 4 - 1 hold. Then e i -* 0 and

1
l!i±l[-. |Y ( a ) ! ^
2 (
|e |P ' ±^ '

where p = |(l+v/5) ~ 1 . 6 2 .
4.3-5

EXAMPLE 4 - 4 . We perform direct interpolation with


n = 0, s = 2. Therefore r = 2 and

? f +
o,2^ - i (t-^K.

Then

x (x = x u
i + l " *o,2 i> i " i*

This is Newton's I.F. again. Since P Q 2 is a linear poly-

nomial, is always uniquely specified. Since P Q 2 * f

the hypothesis of Theorem 4 - 3 which is concerned with the

nonvanishing of P is automatically satisfied by the condi-


n, s
tion on the nonvanishing of f . Also ( 4 - 2 7 ) becomes

f"U ) ± 2

e
i+l =
2f'( X i ) e
i- ( 4
^ 3 3 )

Let x Q s J and let f" be continuous and f 'f" ^ 0

on J. Let f(x o }f"(x o ) > 0 . From Theorem 4 - 2 , we conclude

that if x Q < a, then x^ converges to a monotonically from


below while if x Q > a, then x ± converges to a monotonically
from above. Furthermore

^ T - A ( a ) . (4-34)
e
i

Observe that since Yg(a) = Ag(a), ( 4 - 3 2 ) and ( 4 - 3 4 ) do not


contradict each other.
If we perform direct interpolation with n = 1 and
s == 1 we will again derive the secant I.F. The I.F. of the
form * will be studied in Section 5.3 with emphasis on
the study of * ,. The case n = 2, s - 1 will be studied in
Section 10.2.
5.0-1

CHAPTER 5

ONE-POINT ITERATION FUNCTIONS

In this chapter we shall study the theory of one-


point I.F. These I.F. are of integral order. One particular
basic sequence, E , will be studied in considerable detail
s
in Section 5 . 1 . By using E as a comparison sequence, we draw
conclusions about certain properties of arbitrary one-point
I.F. We consider Theorem 5 - 3 to be the "fundamental theorem
of one-point I.F."
5.1 The Basic Sequence E
2 s
We recall our definition of basic sequence. A
basic sequence of I.F. is an infinite sequence of I.F. such
that the pth member of the sequence is of order p. Technique
for generating basic sequences, some of which are equivalent,
are due to Bodewig [ 5 . 1 - 1 ] , Curry [ 5 . 1 - 2 ] , Ehrmann [5.1-3]*

E. Schroder [ 5 . 1 - 4 ] , Schwerdtfeger [5.1-5L and

Whittaker [ 5 . 1 - 6 ] , among others. See also Durand [5.I-7],


Householder [ 5 . 1 - 8 , Chap. 3 ] * Korganoff [ 5 . 1 - 9 * Chap. 3 ] *
Ludwig [ 5 . 1 - 1 0 ] , Ostrowski [5.1-11, Appendix J ] , and
Zajta [5.1-12]- From Theorem 2 - 6 we know that two I.F. of
order p can differ only by terms proportional to u^. Hence
if the properties of one I.F. of order p are known, many of
the properties of arbitrary I.F. of order p may be deduced.
If the properties of a basic sequence are known, then many
of the properties of arbitrary I.F. of any order may be
deduced. In Section 5 . H we shall study a basic sequence
whose simplicity of structure makes it useful as a comparison
sequence.
5.1-2

5.11 The formula for E . The I.P. m wereT


s_ n, s
defined and studied in Section 4 . 2 2 . If n = 0 , these I.F.

are without memory. Because of their importance, the cp


O, S
are given the special designation E . The conditions for the
a

s
convergence of a sequence generated by E were derived in a

s
Section 4 . 2 2 ; we assume that these conditions hold. In con-
trast with the careful analysis which is required in the
general case, we shall find that the proof that E o is of
s
order s is almost trivial.
In order that the material on E be self-contained, g

we start anew. Let f' be nonzero in a neighborhood of a and


(s)
v
let f ' be continuous in this neighborhood. Then f has an
v
inverse 3f, and 2f v i s continuous in a neighborhood of zero.

o,os be the polynomial whose first s - 1 derivatives agree


with 2f at the point y *» f (x) . Then

( ) t 8
«(*) - ^ . . ( t ) + « ' f f n (t-y) .

and

- 1
,(J>

where e(t) lies in the interval determined by y and t. Define

Ea => 0
^o,s(0)
'' K
5.1-3

Hence

E
s " I ^ ^ J ) f J
(3-D
,J=0

or

> =*-Z^^.
S (-) 5 2

Furthermore,

a » Eg .+ Lzl^lV (e)f ,s) s


(5.3)
where 6 = 6(0).

In ( 5 - 2 ) , E o is expressed as a power series in f ^.

For some applications It is more useful to express E as a


s
power series in u where u = f / f . Hence we write

s-1

-1
In practice fj is not known and we must express E in terms
s
of f and its derivatives. With the definition

Y (x) = (-D'W^fy)
J J
JI[S'(y)]
y-f(x)
we may w r i t e

s^l

E
s ( x ) =
* [
~ Z Y
J u J
(5-4)

and

a = E + (-D * s ( s )
(e) u s t
(5-5)
8
s ! [ S ' ] s

Thus

a = E g + 0 [ u s
] . (5-6)

We may w r i t e f o r m a l l y t h a t

00

a = x
" Z V J
' (5-7)
j = l

The s t r u c t u r e of t h e w i l l be i n v e s t i g a t e d i n S e c t i o n 5.13.

Assume t h a t 3 0( ^( ) s
does n o t v a n i s h i n an i n t e r v a l

a b o u t z e r o . Prom (5-5),

v a
_ (-i) - g( )f ) / _ u _ y s 1 s
0

( x - a ) s
s ! [ 3 ' ] s
'

S i n c e
we conclude that

(5-8)
(x-a)s

Let x. = x, x = s' l E e 85 x
i " 1 1 1 6 1 1
(5-8) may be
1+1

written as

Y (a).
Q

Hence E g Is of order s and has Y (a) as its g

asymptotic error constant. Since the informational usage


of E is s, its informational efficiency is unity. Hence E
is an optimal basic sequence. (These terms are defined in
Section 1.24.)

It is easy to see that

3 1
- (-l) " ^(0),

where

y A = f(x ),
±
Bodewig [ 5 . 1 - 1 3 ] attributes E o to Euler [5.1-14].
s
In the Russian literature, these formulas are credited to
Chebyshev who wrote a student paper entitled "Ca.lcul des
racines d'une equation" for which he was awarded a silver
medal. This paper which was written in 1 8 3 7 or 1 8 3 8 has not
been available to me.
We show that the formula of E. Schroder [5.1-15]

is equivalent to E . g Observe that

jl 1 d evW__\ _ 1
75 K Y
dy f '(x) dx' ' " f '(x) •

Then from ( 5 - 2 ) ,

s L y. 1 (x)
\T^Tx) dxj TiJFf'

which is Schroder's formula. Compare with the formula for a

Burmann series given by Hildebrand [ 5 . 1 - 1 6 , p. 2 5 ] .


5 . 1 2 An e x a m p l e . In c e r t a i n c a s e s one can show

t h a t t h e f o r m a l i n f i n i t e s e r i e s f o r a c o n v e r g e s t o a. A

s e r i e s s o l u t i o n of a q u a d r a t i c e q u a t i o n i s s t u d i e d by

E. S c h r o d e r [ 5 . 1 - 1 7 ] .

EXAMPLE 5 - 1 . C o n s i d e r f ( x ) = x n
- A, w i t h n an

i n t e g e r . I f n ^ 2 t h i s l e a d s t o a f o r m u l a f o r n t h r o o t s ,

w h i l e i f n = - 1 t h i s l e a d s t o a f o r m u l a f o r t h e r e c i p r o c a l

of A. I f f ( x ) = x n
- A, t h e n

»(y) = ( A + y ) l / n

and

J - l

Then

x + x
( 5 - 9 )

In p a r t i c u l a r ,
5.1-8

If n = 2 , this is Heron s method for the approximation of


1

square roots. The formula ( 5 - 9 ) was derived by Traub [5.1-18]

using the binomial expansion. If n = - 1 ,

s-l
(5-10)
J
E s = x £ (l-Ax) ,
<J=0
and

00

This geometric series converges to l/A if |l-Ax| < 1 .


Rabinowitz [ 5 . 1 - 1 9 ] points out that ( 5 - 1 0 ) may be used to
carry out multiple-precision division.

See Durand [5.1-20, pp. for the approximation


1 1
of In A, sin"' A, and tan"" A.
5.1-9

5.13 The structure of E . We defined E as


. s s

rr s-i
E
s
= x y
j-i J y
' J![*'(y)] J

y=f(x)

It is easy to show that Y^ satisfies the difference-


differential equation,

jY^x) - 2(j-l)A (x)Y _ (x) + Y ^ U ) = 0,


2 J 1 Y^x) = 1, Ag(x) =

(5-11)

The first few Yj may be calculated directly from this equation,

An explicit formula for the Y^ may be derived from


the formula for the derivative of the inverse function derived
in Appendix B. We have

'f* I r
(-l) (j r-l)! + II " ^ j - , (5-12)
1=2 V

with the sum taken over all nonnegative integers p such that

2, (i-DPi = J " 1 , (5-13)


1=2

and where r = 2 £ ± 2 p 1 # For j = l, p ± = o for all i. Prom


the definition of Y^ and ( 5 - 1 2 ) we have
5.1-10

THEOREM 5 - 1 . Let

Y,(x) - ( - l ) - y
j )
J
(y) ( ) =
, A X

J J
«Ji[3'(y)] y=f(x)

where f and are Inverse functions. Then

i=2 1

with the sum taken over all nonnegative integers such that

^ (i-l)p = J - 1 ,
±

i=2

and where r = 0^-

The first few Yj are given in Table . 5 - 1 • Observe


that Yj is independent of f; it depends only on the deriva-
tives of f. We conclude that

j=l 1=2 1

(5-14)

where the Inner sum Is taken over all nonnegative Integers p ±

such that 2 £± 2 (i-l)^ 1 = j - 1 and where r = 2 £ p . 1 2 ±


5.1-11

TABLE 5-1. FORMULAS FOR Yj

5A| - 5 A A
g 3 + A 4

l4Ag - 21AgA 3 + 6AgA^ + 3A^


5.1-12

By replacing the upper limit of the first sum by <»,

we obtain a formal infinite series formula for a.

The first few E are given byx


s

Eg - x - u,

E3 = E 2 - A u ,
2
2

(5-15)
3
E 4 = E 3 - (j2A| - A ) u , 3

3 h
- (5A - 5A A + k^* .
5 4
2 3
E = E

The following corollaries follow easily from


Theorem 5 - 1 .

COROLLARY a. Y^ is a polynomial in A , A , . .., A^ g 3

COROLLARY b. A^ is the same polynomial in

9 f * • •9 that Yj is in Ag • A^ ^••# «Aj« <

COROLLARY c. The sum of the coefficients in this


polynomial is unity.
5.1-13

PROOF. The polynomial is an identity in x. Let


1 A 1 r a l
f(x) - ( l - x ) " . At x = 0, j = f° l J* The Inverse
- 1
to f(x) is Sf(y) = 1 - y and when x = 0, y = 1. A short

calculation shows that at y = 1, Y^ = 1 for all j .

COROLLARY d, Y 1 depends on A 1 only as (-l)^A . 1

For some applications it is convenient to work with

J
Zj - Y j U . (5-16)
Then

s-1

E s = x- £ Z r (5-17)

We have

COROLLARY e. The form of is given by

where = 0, ip A = - 1 . Thus Zj is a polynomial,


homogeneous of degree zero and isobaric of weight - 1 .
5.1-14

It Is easy to show that Z^ satisfies the difference-

differential equation,

JZj(x) - (j-l)Z _ (x) + u(x)zj_ (x) = 0 ,


J 1 1 Z (x) = u ( x ) .
1

(5-18)

In terms of the forward difference operator A, this equation

may be written as

At-JZjU) ] + Z (x)Zj(x) = 0.
1

In Lemmas 5-1 and 5-2 and in Theorem 5-2, we shall

make use of the fact that E is of order s. The following


s
lemma enables us to calculate the asymptotic error constant

of an arbitrary I.P. of order p.

LEMMA 5-1. Let ep be an I.F. of order p and let C

be its asymptotic error constant. Let

cp(x) - E (x)
G(x) = -f , x ji a.
p
(x-a)
Then
C Y (a) + lim G(x) .
p
x -+ a
1

5.1-15

PROOF. Take cp = cp and


2 = E p in Theorem 2-8 and
observe that the asymptotic error constant of E is Y (a).
1? JP

A more useful result is given in

LEMMA 5-2. Let cp be an I.F. of order p and let C


be its asymptotic error constant. Let

<p(x) - E (x)
H (x) = — - E , x a .
p
u (x)
Then

C - Y (a) + lim H(x).


p
x a
PROOF. From the previous lemma,

IP
lim H(x) = lim G(x) x-q = C - Y (a), p

x a x a
since

lim Ji(*l = i.
x a
x-a "
r

5.1-16

EXAMPLE 5-2. Let

<p = x -
1-A u* 2

T h i s i s H a l l e y ' s I . F . which w i l l be d e r i v e d in S e c t i o n 5.21

S i n c e E^ = x - u [ l + A 2 u ] , i t i s e a s y t o show t h a t

l i m H(x) - - A ^ ( a ) . Then

C = Y (a) -
3 A | ( a ) - A|(o) - A ^ a ) .

The o r d e r and a s y m p t o t i c e r r o r c o n s t a n t of an I . F .

w i t h I n t e g e r - v a l u e d order may be c a l c u l a t e d u s i n g the r e s u l t s

of Theorem 2-2. I t i s awkward however t o a p p l y t h i s theorem

f o r a l l but the s i m p l e s t I . F . The f o l l o w i n g theorem p e r m i t s

t h e c a l c u l a t i o n of the o r d e r and a s y m p t o t i c e r r o r c o n s t a n t of

an I . F . b y comparing i t w i t h E p + ^ . T h i s procedure t u r n s out

t o be p a r t i c u l a r l y u s e f u l i n the development of Chapter 9.

THEOREM 5-2. cp i s of order p i f and o n l y i f

'cp(x) - E p + 1 ( x ) '
l i m
x —• a . u p
( x )
5.1-17

exists and is nonzero. Furthermore

C = lim
x -+ a . uP(x)

where C is the asymptotic error constant of cp.

PROOF. From (5-6),

p + 1
a = E p + 1 (x) + 0[u (x)].

Therefore,

P + 1
,(») -a_*W - V l ^ ) +0> (x)] f x x {

(x-a) p p
u (x) x-

P
cp(x) - E p + 1 (x) / u ( x ) N /u(xi}
p
u (x) x
V " / a V x-a y

Since

we conclude that

q>Cx) - a <p(*) - E p + 1 (x)


lim
p
x -+ a . (x-a) p
J u (x)

which completes the proof.


5.1-18

Observe that Lemmas 5-1 and 5-2 each involve two


I.F. of the same order; Theorem 5-2 involves two I.F. whose
orders differ by unity.

The next lemma is used in Section 5.51.

LEMMA 5-3-

W * > - >.(«) - ^ B,(x)

PROOF. From (5-18),

JZj(x) - £ ( J-l)Z _ (x) + u(x) £ Zj-i(x) = 0


I t j 1

J-2 J=2 J=2

This telescopes to

sZ. (x) - Z (x) + u(x) £


1
z x
j_i( ) = °
J=2

Since Z-^x) = u(x),

sZ (x) - u(x)
s 1 - £ z ^ w -» u(x)E (x)
s

J=2
5.1-19

The fact that

completes the proof.

EXAMPLE 5-3. Let s = 2. Then

^ 2 = x - u - Ju(l-u'),
E E
q3 = o
J
"2 " £UE'
s

and

E^ = x - u - |u(2A u)2 = x - u - AgU 2


5.2 R a t i o n a l A p p r o x i m a t i o n s t o E .a

I t i s common knowledge t h a t r a t i o n a l f u n c t i o n s are

o f t e n p r e f e r a b l e t o p o l y n o m i a l s f o r t h e a p p r o x i m a t i o n of

f u n c t i o n s . The r a t i o n a l f u n c t i o n a p p r o x i m a t i o n s t o a f u n c -

t i o n may be a r r a n g e d I n t o a t w o - d i m e n s i o n a l a r r a y i n d e x e d by

t h e d e g r e e s of the p o l y n o m i a l s of t h e numerator and

d e n o m i n a t o r . Such an a r r a y I s c a l l e d a Pade t a b l e . See

K o b e t l i a n t z [5-2-1], Kopal [5-2-2, Chap. I X ] , and Wall [5.2-3,

Chap. 20].

S i n c e E s i s a p o l y n o m i a l i n u w i t h c o e f f i c i e n t s

d e p e n d i n g on t h e d e r i v a t i v e s of f, we e x t e n d t h e u s u a l p r o -

c e d u r e and form a Pade t a b l e of I . F . The r a t i o n a l a p p r o x i m a -

t i o n s t o E a are c o n s t r u c t e d so as t o be o r d e r - p r e s e r v i n g .
s

For e a c h s we o b t a i n s - 1 o p t i m a l I . F . of o r d e r s . In

p a r t i c u l a r we o b t a i n t h e o f t e n r e d i s c o v e r e d H a l l e y ' s I . F .

Most of t h e m a t e r i a l of t h i s s e c t i o n f i r s t appeared i n

Traub [5-2-4],
r

5-2-2

5-21 I t e r a t i o n f u n c t i o n s g e n e r a t e d b y r a t i o n a l

a p p r o x i m a t i o n t o E . I t i s c o n v e n i e n t t o d e f i n e a p o l y -
s

nomial Y ( u , s - l ) b y

s-1
Y ( u , s - l ) = £ Y j ( x ) u j
( x ) .

Then E Q = x - Y ( u , s - l ) . We w i l l s t u d y r a t i o n a l a p p r o x i m a t i o n s

t o Y ( u , s - l ) which a r e o r d e r - p r e s e r v i n g . Define

Vb • x
- fc:5)- » + * • • - 1 . « > °.

where

R(u,s,a) = ^ R ^(x)u (x),


s
J
Q(u,s,b)> ^ ^ (x)u (x), J
Q g q ( x ) . 1.

Note t h a t t h e " c o n s t a n t term" i s a b s e n t i n R ( u , s , a ) and

p r e s e n t i n Q ( u , s , b ) . T h i s g u a r a n t e e s t h a t ^ a ^ ( a ) = a.

The a + b p a r a m e t e r s

R s ^ j ( x ) , J = l , 2 , . . . , a ; Q s ^ j ( x ) , J - 1,2,...,b,

are chosen so t h a t

R ( u , s , a ) - Y ( u , s - l ) Q ( u , s , b ) = 0 [ u s
( x ) ] . (5-19)
r

5.2-3

Then

and by Theorem 2 - 7 , i/ ^ is of order s.


a Equivalently, b

is of order a + b •+ 1. The conditions imposed by ( 5 - 1 9 ) are


satisfied if the R .,Q , are chosen so that
a

s, j s, j

r
3

J=0

where cd.v , a
_ » 1, for I £ a, cd,
c, a = 0 , for I > a, and
k * min(t-l,b). For parameters thus chosen,

Since the j ( x ) , 1 ^ J <. s - 1 , depend only on


Y
f^,

1 <L J <1 9 - 1 * and since the R .(x),Q o .(x) depend only on


these Yj(x), we conclude that the ^ a fc are all optimal I.F.
A number of formulas of type $ , , together with their asymp-
a, d

totic error constants, may be found in Table 5 - 2 . Note that

^a,o s-1

For s fixed, which of the s - 1 I.F. generated by


this process is to be preferred? There are indications
(Frame [ 5 . 2 - 5 ] and Kopal [ 5 . 2 - 6 ] ) that the I.F. which lie
near the diagonal of the Pade table are best.
3 )

TABLE 5-2. FORMULAS FOR v and C


*a,b a,

Formulas for f Asymptotic Error Constants


a
^a b " a b
C = lim '
7
x —•a (x-a;

s = 2: tJj~
r = x - u, Newton = Y (a)
l, o c
i,«
2

s = 3: y 2 ^ = x - u[l+Y u] 2 = Y (a)
3

= X H a l l e y = Y (a) - r|(a)
*L,1 • 1-Y u' 2
3

s = k: d = x - u[l+Y u+Y u ] 2 3
2
- \(a)

Y 2 + (f 2 - Y )u]
3
Y
3 ( a )

a )

2,l = V
= X U
*2,1 " Y -Y u
2 3
C
" Y (a)
2

a ) - 2Y (a)Y (a) + Y^a)


1 , 1 , 2 X 2 2
C
l,2 = V 2 3

1 - Y u + (y , - Y^)u
2
5.2-5

EXAMPLE 5-4. If two I.F. have the same order, then

a measure of which I.F. will converge faster is given by the

relative magnitudes of the asymptotic error constants. Let


n 1//n
f(x) m - A, a = A . Define C 3 by
x
a,b
h

a
^a b "
8 a b
(x-a) '
Then

_ (n-l)(2n-l)
s = 3: C
2 2
'° 6c '

n
2 , - 1
n -
1 1 ~
1 , 1
2* d
12<X

s . 4. c = (n-l)(2n-l)(3n-l)
3,o 3 2 4 a

2
C - (n -l)(2n-l)
1
2' " 72a3 '

2
_ c n(n -l)

and

c c
lim = 4, lim p 3
- ^ = 9,
n -+ oo 1,1 n oo 2,1

3
lim = •-
c 2
n eo 2,l '
5.2-6

a r e n o t t n e o n
The "^o ^ l y optimal I.F. of rational
form. Thus Kiss [ 5 - 2 - 7 ] suggests a fourth order formula
which in our notation may be written as

u(l-A u)
p

cp = x ^ (5-20)
l-2A u+A u*
2 3

Snyder [ 5 - 2 - 8 ] derives ^ 1 (Halley's I.F.) by a "method of


11
replacement, and a fourth order formula by a "method of
!f
double replacement • When Snyder's fourth order formula is

translated into our notation, it is seen to be identical

with (5-20) which Kiss derives by entirely different methods.

Hildebrand [ 5 - 2 - 9 , Sect. 9 - 1 2 ] studies I.F. generated by

Thiele's continued-fraction expansions.


5.22 The f o r m u l a s of H a l l e y and Lambert. Perhaps

the most f r e q u e n t l y r e d i s c o v e r e d I . P . i n the l i t e r a t u r e i s

•1,1 m x
~ T%ir

I t has r e c e n t l y b e e n r e d i s c o v e r e d b y Frame [5.2-10],

Richmond [5.2-11], and H. W a l l [5.2-12]. I t was d e r i v e d b y

Ey S c h r o d e r [5-2-13, p . 352] i n 1870. S a l e h o v [5.2-14]

I n v e s t i g a t e s the c o n v e r g e n c e of t h e method w h i c h he c a l l s t h e

method of t a n g e n t h y p e r b o l a s . Z a g u s k i n [5.2"15, P. 113]

p o i n t s out t h a t ^ may be d e r i v e d by t h e method of Domoryad.

Bateman [5.2-16] p o i n t s out t h a t t h e method i s due t o

H a l l e y [5.2-17] (1694).

I f f =s x 1 1
- A, H a l l e y ' s I . F . becomes

9 = x [ ( n - l ) x n
+ (n+l)A] f ( 5 _ 2 1 )

( n + l ) x n
+ ( n - l ) A

and

( x - a ) 0
12a*

In t h e c u r r e n t l i t e r a t u r e , (5-21) i s o f t e n a s c r i b e d t o

B a i l e y [5.2-18] (l94l). E q u a t i o n (5-21) was d e r i v e d by

Uspensky [5.2-19] (1927). R. Newton [5.2-20] p o i n t s out t h a t

D a v i e s and Peck [5.2-21] (1876) c a l l i t H u t t o n ' s method but

t h e y g i v e no r e f e r e n c e . Dunkel [5.2-22] n o t e s t h a t t h e f o r m u l a

was known t o Barlow [5.2-23] (l8l4). K i s s [5.2,-24] and

M u l l e r [5.2-25] a s c r i b e t h e method t o Lambert [5.2-26] (1770),


5,3 A B a s i c Sequence of I t e r a t i o n F u n c t i o n s G e n e r a t e d by

D i r e c t I n t e r p o l a t i o n .

In S e c t i o n 5 . 1 we s t u d i e d t h e b a s i c sequence E o ^ m e

S Oj s

g e n e r a t e d b y i n v e r s e I n t e r p o l a t i o n a t one p o i n t . We t u r n t o

t h e b a g i c sequence * o g e n e r a t e d b y d i r e c t i n t e r p o l a t i o n a t

OyS

one p o i n t . These l a t t e r I . F . have t h e drawback t h a t f o r

s >•?, a p o l y n o m i a l of d e g r e e s - 1 must be s o l v e d a t e a c h

i t e r a t i o n . They have t h e v i r t u e of b e i n g e x a c t f o r a l l p o l y -

n o m i a l s of d e g r e e l e s s t h a n or e q u a l t o s - 1 . In S e c t i o n 5.33*

we i n v e s t i g a t e a t e c h n i q u e w h i c h r e d u c e s t h e d e g r e e of t h e

p o l y n o m i a l t o be s o l v e d b u t w h i c h p r e s e r v e s t h e o r d e r of t h e

I . F . g e n e r a t e d .
r

5.3-2

5.31 The basic sequence 4> .


CT The I.F. <J> was
Oj s n, s

defined and studied in Section 4 . 2 3 . If n = 0 , these I.F.

are without memory. Th$ conditions for the convergence of a

sequence generated by <j> were derived in Section 4 . 2 3 ; we


o, s

assume that these conditions hold. In contrast with the

careful analysis which is required in the general case, we

shall find that the proof that $


trivial. is of order s is almost
o, s
In order that the material on S> 0 be self-
Q , S
contained, we start anew. Let be the polynomial whose
o, s

first s - 1 derivatives agree with f at the point x i f Then

and f(t) = P , ( t ) + gr 4
(t-x ) , s
( 5 - 2 2 )
0 s ±

where i^(t) lies in the interval determined by x.^ and t.

Define x i + 1 by

P x
o,s( i l>+ " <>• ( 5 - 2 3 )

Let a real root of ( 5 ~ 2 3 ) be chosen by some criteria. Let

the function that maps x. into x. be labeled * . Thus


i 1+1 o,s

x ( x
i+l - *o, B i>'
5.3-3

The order and asymptotic error constant of $ are now


o, s
easily obtained. Set t = a in (5-22). Then

° = *o,a<«> • * # ' ( a )
< e 1 ) . ; .

where e i » - a and where £^ = £^(a) . Since

where Tl^ ^ lies in the interval determined by x


+ 1 + 1 and a,

we conclude that

' t \ f ( B )
( ^ s
P
o,s^i+l^ i+l ^e
s! e
i #

Le* ^ o e nonzero in the interval determined by x . a n d


P fc
a
( s)
v
and let f ' be nonzero in the interval of iteration.
Observe that p' - * f ' ( a ) . Then
o, s a

s
^-(-l) A (a), s A 8 - f ^ . (5-24)
e
i

Slnoe s is an integer, it is not necessary to take absolute

values in (5-24). We conclude that $ Q fl is a basic sequence


r

5.3-4

5.32 The Iteration function ^ ^ > 0 The I.F. <t> Q 2

is Newton's I.F. We turn to Q . This I.F. was already

studied by Gauchy [5,3-1]. See also Hltotumato [5.3-2]. We

must solve

2
0 = f(x) + f'(x)(t-x) + | f " ( x ) ( t - x ) « 0. (5-25)

Then

$ . £1 + ltd
0,3
= x
— grr fvi--_ 4 1 A u
2
)i
' U = X ,
f '
A
2 2f' *

a will be a fixed point of $ 0 ^ if and only if we take

the + sign if f' > 0 and the - sign if f' < 0. If this choice

of sign is made, then $ - differs only by terms of O(u^)

from E ^ . Thus

^0,3 = x
~ T» +
JT> ( l - ^ A u ) ^ .
2 (5-26)

For x close to a, severe cancellation is bound to

occur in (5-26). This may easily be avoided by observing that

2
- b + (b -4ac)^ _ -2c
2 a 2
b + (b -4ac)*'

and hence taking

« = x (5_ ) 2 7

0 , 5
1 + (l-4A u)t
0
r

5.3-5

This last form is clearly best for computational purposes.

Although generations of schoolboys have been drilled to

rationalize the denominator, it is preferable, in this case,

to irratlonalize the denominator.

The generalized Fourier conditions of Theorem 4-2

assume a particularly simple form for $ Q ^. Let

f'f'"< 0

over the interval of Iteration. Then if x Q > a, the x^ con-

verge to a monotonically from above; if x Q < a, the con-

verge to a monotonically from below. Observe that in the

case of Newton's method, monotone convergence is only possible

from one side. This is because we demand that f(x )f"(x ) > 0
1
for Newton s I.F. and f must change sign as x goes through a.
5.3-6

5.33 Reduction of decree. We observed that the

use of $ o f s requires the solution of an (s-l) degree poly-

nomial at each iteration. We can effect certain degree-

reducing changes which change the asymptotic error constant

but which do not affect the order.

Consider the following change. Replace one of

the t - x factors in (t-x) ~ by Eg - x = -u. (Eg = x - u

is Newton's I.F.) Define

( 5 - 2 8 )

We shall show that although R Is a polynomial of degree s - 2


o

s
In t - x, it still leads to an I.F. of order s. Observe that

s 2
p
0 , 3 < * > - V * > - ^ r ^ f 1
<*-*> - <t-E ) 2

and

Since

2
E 2 - a - V(x)(x-a) , v ( a ) - Ag(a),
and

f(t) - P , ( t )
0 # + f ( 3 )
f p ) l ( t . , ) B >
5.3-7

we conclude that

0 «• f(a) = R ( a ) + (-l) (x-a)


s
s
f ( B )
( 0 .
sJ (s-l)l ' V
W

where i lies In the interval determined by x and a. Define


x , „ by
i+1
R (x s i + 1 ) = 0.

Then

R (a) - -
B R (Tl
B 1 + 1 )(x 1 + 1 -a),

where T ) 1 + 1 lies in the interval determined by x ± + 1 and a.

Assume that R ( T 1 S 1 + 1 ) does not vanish. Then if e ± 0,

'i+1 s
(-l) [A (a) - A ^ ^ ^ A g C a ) ] .
g ( -29)
5

EXAMPLE 5-5. Let s = 3. Then we must solve

R ( t ) = 0.
3 Prom (5-28),

0 - f(x) + (t-x)[f'(x) + if"(x)(-u)]

or

= X
* " I^AgH-
5.3-8

This is Halley's I.F. which we have now generated by linear-

izing a second degree equation. From (5-29),

'i+1 2
A ^ a ) - A (a) = Y (a) - Y (a),
3
2

as we found in Section 5 . 2 1 .

EXAMPLE 5 - 6 . Let s = 4 . Then

0 = f(x) + f'(x)(t-x) + (t-x) ;


\ f"(x) - \ f"{x)u

or

0 = u + (t-x) + (t-x) [A -A u]2


2 3

and

2.u
cp = x -
1 + [1 - 4u(A -A u)]i 2 3

Then

'i+1
- [A (a) - A ( a ) A ( a ) ] .
4 2 3 (5-30)

One could also arrive at ( 5 - 3 0 ) by expanding <p into a power

series in u and applying Lemma 5 - 2 .


r

5.3-9

T h e r e a r e n u m e r o u s o t h e r w a y s b y w h i c h t h e d e g r e e

o f P (t) c a n b e l o w e r e d w i t h o u t c h a n g i n g t h e o r d e r . I f o n e
O y S

o f t h e t - x t e r m s i n ( t - x ) 8
" " 1
w e r e r e p l a c e d b y E^ - x , t h e n

t h e d e g r e e w o u l d b e l o w e r e d b y o n e b u t n e i t h e r t h e o r d e r n o r

t h e a s y m p t o t i c e r r o r c o n s t a n t w o u l d b e c h a n g e d . We s h a l l

^ c o n t e n t o u r s e l v e s w i t h t w o m o r e e x a m p l e s .

EXAMPLE 5-7. I n

P 0 j 3 ( t ) = f ( x ) + ( t - x ) f ' ( x ) 4- i ( t - x ) 2
f " ( x ) ,

r e p l a c e ( t - x ) 2
b y ( E g - x ) 2
= u 2
. T h e n

/ 0 = f ( x ) + ( t - x ) f ' ( x ) + ^ f ' ( x )

o r

p
<p = x - u - u A 2 ,

w h i c h i s j u s t E ^ .

EXAMPLE 5-8. I n

P
o,4 ( t )
= f ( x
> +
( t - x ) f ' ( x ) + | ( t - x ) 2
f " ( x ) + \ ( t - x ) 3
f ' " ( x ) ,
5.3-10

replace one of the t - x In the quadratic term by E^ - x and


2
replace (t-x) in the cubic term by ( E g - x ) = u . Then

0 - f(x) + (t-x) f'(x) - \ f " ( x ) ( u + A u ) + \ f"'(x)u


2
2 J

cp = x - u

1 + A u +
g [k^ - Ag^u'

This is y 1 2 derived in Section 5,21,


5.4-1

5.4 The Fundamental Theorem of One-Point Iteration Functions

We review some of the terminology introduced in

Section 1.24 which will be used in this section. The

informational usage, d, of cp is defined as the number of new

pieces of information required per iteration. If an I.F.

belongs to the class of I.F. of order p and informational

usage d, we write cp e ^ 3 . * T
^ e informational efficiency 9 EFF,

of cp is defined by EFF = p/d. If EFF = 1 , cp is an optimal I.F,

In this section we consider both simple and multiple zeros.

If the order of cp is independent of the multiplicity m of the

zero, then we say that its order is multiplicity-independent.

The reader familiar with I.F. has no doubt observed

that one-point I.F. of order p depend explicitly on at least

f and its first p - 1 derivatives. Hence the informational

usage of the I.F. is at least p. A theorem which gives a

formal proof of this fact is given below. It is this theorem

which causes us to label as optimal those I.F. whose informa-

tional efficiency is unity. The theorem is quite simple to


11
prove and the result is in the "folklore of numerical

analysis. Its importance is that it makes us look for types

of I.F. whose informational efficiency is greater than unity.

Multipoint I.F. and I.F. with memory are not subject to the

conclusions of this "fundamental theorem of one-point I.F."

Recall that E s , which was studied in Section 5 . 1 , is of order

p = s. This fact will be emphasized in this section by

writing E . We give two equivalent formulations of


THEOREM 5 - 3 * Let m = 1 . Let cp denote any one-

point I.F. Then there exist cp of all orders such that

EFF(cp) = 1 and for all cp, EFF(cp) <; 1 . Moreover cp must depend

explicitly on at least the first p - 1 derivatives of f,

ALTERNATIVE FORMULATION. Let m = 1 . . Let cp denote

any one-point I.F. Then for all p, there exist cp e I and

if cp e H^r>* then d ^ p . Moreover cp must depend explicitly

on the first p - 1 derivatives of f.

PROOF. Since E^ I . there exist optimal


e I.F.
p ° p p'
of all orders. This disposes of the first part of the proof.

Let cp^ e Ip. From Theorem 2 - 1 0 , the most general I.F. of

order p is

p
cp = cp + U f
x

where U is any function bounded at a. Hence U cannot contain

any terms in l/f. We take cp = E_. I 1 The most convenient form


P
to take for E is given by ( 5 - 2 ) ,

P
. f i ^ i ( j ) J „
L
E p = x g f

J!

p _ 1
E p depends explicitly on f , f , f ( ) . Since the
p _ 1
highest power appearing in E is f , none of its terms can
p
be cancelled by U f . Since cp is a one-point I.P., none of
5 . 4 - 3

the f(^) can be approximated by lower derivatives to within


0

terms of 0(u ) , I > 0 . Hence cp must depend explicitly on


1
f ,f', . .. ,f ^P"" ^ which completes the proof.

The restriction to one-point I.F. is essential

as the following considerations show. Observe that

f [
V ( x S X ) 3 =
V >
x u 2
( >
x +
°-tu (x) ] .
3

Recall that
2
E ( x ) = x - u(x) - A ( x ) u ( x ) .
3 2

Since

cp(x) = x - u( ) - X
f [
\ : $ x ) ]

and E^ differ only by terms of 0 [ u ( x ) ] , cp is third order. 3

That is, the second derivative appearing explicitly in has

been approximated in cp. Observe that cp uses information at

x and at x - u(x) and is therefore an example of the multi-

point I.F. which are studied in Chapters 8 and 9 . Such

approximation of derivatives is impossible for one-point I.F.

We turn to the case of multiple zeros.


COROLLARY. Let m be arbitrary and known. There

exist cp of all orders, with cp depending explicitly on m, such

that EPF(cp) = 1 and for all cp, EFF(cp) £ 1 . Moreover cp must

depend explicitly on the first p - 1 derivatives of f.

ALTERNATIVE FORMULATION. Let m be arbitrary and

known. Then for all p there exist cp depending explicitly on m

such that cp e I p p and if cp e then d ^> p. Moreover, cp

must depend explicitly on the first p - 1 derivatives of f.


1//m
PROOF. Define F = f . Then F has only simple

zeros and F ^ depends only on f ^ , I <; j. An application

of Theorem 5-3 completes the proof.

Note that this corollary assures us of the existence

of optimal I.F. whose order is multiplicity-independent. A

basic sequence of such I.F. will be explicitly given in


m - 1
Section 7-3. Observe that if we define G = f ( ) and insert

G into any optimal one-point I.F. of order p, we obtain a

one-point I.F. with informational efficiency equal to unity.

^ This approach leads to I.F. which depend explicitly on


f ( m - l ) .(m) f (m+p-2)

A case of greater interest Is-when m is not known. i^Note

that u = f/f' has only simple zeros. Replacing f by u in any

/-^ optimal one-point I.F. of order p leads to an I.F. which is

of order p. We conclude that there exist I.F. of all orders

such that EFF(cp) = p/(p+l) for zeros of all multiplicities.

These I.F. do not contain m explicitly. These matters will be

taken up in greater detail in Chapter 7.


r

5.5-1

5-5 The Coefficients of the Error Series of E


s
^ We saw in Section 5 . 1 that

E g - a = Y (a)(x-a)
g
s
+ 0[(x-a) s + 1
]. (5-31)

In certain special cases, an explicit expression may be found

for the error of E . Thus for the calculation of square


p
roots, f = x - A and

E a =
e2

a = A
4
2 " 2(a+e)' > e = x - a.

In general, the expression for E - a, is an infinite series


s
whose leading term is given by the first term on the right

side of ( 5 - 3 1 ) • The coefficients of the infinite series are


E
is^ ( a ) / J ! - A
recursion formula for these coefficients will

be found which does not involve differentiation. An interest-

ing property of the coefficients will be proven in Section 5 . 5 2 .


5 . 5 - 2

5.51 A recursion formula for the coefficients.

We recall the definitions of the following symbols which will

be used frequently;

( \ f(x)
. (v )f _( x^) .
. (, f
_ |( x^ )
( j ) ( j )

u ( x ) = ^ , e-x-a.
f f
f 7 a x = A x ) =

The expansion of u(x) into a power series in e which

will be needed below is derived now. Define by

00
l
u(x) = f(x)/f'(x) = £ ve.
t ( 5 - 3 2 )

1=1
Since

00
J a e J _ 1
f(x) = £ a ^ , f'(x) = £ j '
J=l j=l

and u(x)f'(x) = f ( x ) , we obtain

I
a 1
l =
I
V
q ( t + 1
- ) t l-q' q a
+ - 1 * 2 , . . . ,
q=l

or

t
A
l =
Z
V
q ( t + 1
-^Vl-q' 1 - 1 , 2 , . . . , ( 5 - 3 3 )

q=l
5.5-3

and finally

V
l =
h " Z V t + 1
" q )
Vl-q' t=l,2,..., (5-34)
q-i

as a recursion formula for the calculation of the with

= 1. In ( 5 - 3 3 ) and ( 5 - 3 4 ) . as throughout this section, the

functions which occur are evaluated at a unless otherwise

Indicated.

An explicit formula for the may also be obtained.

It is not difficult to prove that

l V j I W"- J " ^ V \ (5-35)


J-l i=l

a a n d w h e r e t h e
where r = i inner sum is taken over all

integers such that ia^ = j .

From either ( 5 - 3 4 ) or ( 5 - 3 5 ) , the first few v l may

be calculated as

v, = 1,

V = A
2 ~ 2'

2
v 3 = 2A - 2A , 3

v 4 = - 4A| + 7A 2 A3 - 3 A 4 .
We are ready to turn to the problem of finding the

coefficients of the error series. Define t » a by


v jS
00

(x) = £ T t / B e 1
' . (5-36)
1=0

Since E (x) e I , we expect x,


0 = 0 , for 0 < I < s, ando

S S •s C
t = a. This may be proven directly by induction on s.
o, s
Let s = 1 . Then

E ( x ) = x « a + (x-a) = a + e.
1

Now assume t . a = 0 , 0 < I < s, t a = a. Substitute (5-36)


Ks } S O, S
into the formula of Lemma 5 - 3 ,

E !
« s « W = . W - f

to find

00 00 00

t=0 l=S l=s

which completes the induction.


Substituting ( 5 - 3 6 ) into ( 5 - 3 7 ) yields

00 00 00

V ^ V ^ . u(x) V # ^t-i A

t=0 t=0 1=0


r

5.5-5

and using the expansion of u(x) given by ( 5 - 3 2 ) , multiplying


t
the series, and equating the coefficient of e to zero yields

t-1

ST
t,s+l +
^- W,s
s +
I t + l - r r , s - °-
rv T
(~ )
5 38

r=l
Since the may be considered known, ( 5 - 3 8 ) can be used to

determine the t * . A number of i9 0 are given in Table 5 - 3


c, s c, s
These results are summarized in

THEOREM 5 - 4 . Let

00

E
1=0 1=1

Then

t-1

, s + l + U-sK,s +
I
r v T
t + l - r r , s - °'
r=l

with t a = a, T-, -1 = 1 , t . n = 0 for I > 1 , and t , „ = 0


O, S JL,J_ v,-L C,S

for 0 < l < s and s > 1 .


5.5-6
a

TABLE 5 - 3 - FORMULAS FOR t .


•I, s

t 1 2 3 k
i I
0 a a a
a
1 1

2 A
2

2
3 - 2A + 2A 3 2 A
2 " A
3

h 4A| - 7A A 2 3 + 3A^ - 9 A | + 12A A 2 3 -3A U 5A^ - 5 A A


2 3 + A^
5 . 5 - 7

Using Table 5 - 3 we find

E 2 ( x ) - a = A 2 e 2
+ (- 2k\ + 2 A 3 ) e 3
+ (kk\ - T A ^ + lk^ k
+ 0 ( e 5
) ,

E (x) - a = (2k* - A 3 ) e 3
+ (- 9A 3
+ 12A A £ 3 - 3A^)e U
+ 0 ( e 5
) , ( 5 " 3 9 )

E 4 ( x ) - a = (5A , 3
- 5A A 2 3 + A 4 ) e 4
+ 0(e ).
5

Observe that for the cases worked out above, the coefficient
s
of e in the expansion of E (x) - a is Y ( a ) , as expected.
s s
(See Table 5 - 1 for the formulas of Y .)
s
r

5.52 A theorem concerning the coefficients. The

following theorem may be used to check tables of the t „ _

and has a somewhat surprising corollary.

THEOREM 5-5

s=l

PROOF. Note that since t , a = 0 , for I < s, (5-40)

is equivalent to Z t « _ = A,. The proof is by induction


S —JL v, S v
T T 1 A
on t. For t = 1, ^ s = i 1 = = i« Let
Z
o^i t„ „ — A . for r - 1 , 2 , w i t h I > 2. Sum the

recursion formula for t» _ to obtain


v , S

£ *C 't^-l

t + l - r l
r , s
s=l s=l s=l s=l r = l

or

I l-l r

I H,s - I r v
t l-r
+ I ^s'
s=l r=l s=l
5.5-9

where we have used the fact that t = 0 for r < s. An


r, s

application of the inductive hypothesis yields

I l-l I
I T
t,s * I r v
t+l-r r " A
I ^ + 1
" ^ v
q V l - q " t v
l
s=l r=l q=l

Finally, the recursion formula for the v^, (5-34), yields

s=l

which completes the proof.

COROLLARY. Let k be an arbitrary positive integer.

Then

s=l

PROOF.

J^- X\. 00

s=l s=l t=0

k k oo k

4=1 8=1 t=k+l 8=1


5.5-10

Since

I
s=l s=l

an application of Theorem 5 - 5 yields

k k
£ E ( x ) = ka +
g £ A^e* + 0 ( e k + 1
)
s=l 1=1

The fact that

1=1 1=1
and hence that

t=l

yields the corollary.

Taking the limit as k oo of ( 5 - 4 l ) yields a special

case of the general theorem that if a series is convergent,

then it is also Cesaro summable.


6 . 0 - 1

CHAPTER 6

ONE-POINT ITERATION FUNCTIONS WITH MEMORY

Two classes of one-point I.F. with memory are:

studied: Interpolatory I.F. and derivative estimated I.F.

These I.F. are always of nonintegral order. The structure

of the main results, given by Theorems 6 - 1 to 6 - 4 , Is

remarkably simple.
6.1-1

6.1 Interpolatory Iteration Functions

In Chapter 4 we developed the general theory of

I.F. generated by direct or inverse hyperosculatory inter-

polation; such I.F. are called interpolatory I.F. If n * 0 ,

the interpolatory I.F. are one-point I.F,; if n > 0 , they

are one-point I.F.'with memory. The reader is referred to

Theorems 4 - 1 , 4 - 2 , and 4 - 3 for results concerning the conver-

gence and order of interpolatory I.F. The generalization to

multiple zeros is handled in Chapter J\ • In this section

we limit ourselves to comments and examples. The notation is

the same as in Chapter 4 .


6 . 1 - 2

6.11 Comments. Observe that Interpolatory one-

point I.F. with memory use s pieces of new information at x^

and reuse s pieces of old Information at the n points


x
i-l' i-2> * ' i-n
x # # x #
Thus their informational usage is s.
Their order is determined by the unique positive real root

of the equation

n
n + 1 J
t - s ) t = 0 . ( 6 - 1 )

j=0

As was shown in Section 3 . 3 , bounds on this root are given by

Hence bounds on the informational efficiency are given by


1 < EFF < 1 + i
s

Furthermore,

n i ~*
f P = 3 + 1
n
1
oo
n+l,s 9

Theorem 5 - 3 states that the informational efficiency of any

one-point I.F. is less than or equal to unity. Hence the

increase in informational efficiency for interpolatory I.F.

with memory is directly traceable to the reuse of old infor-

mation. On the other hand, we conclude from ( 6 - 2 ) that the

old information adds less than one to the order of an Inter-

polatory I.F.
6.1-3

The dependence of the order on n and s may be seen

from Table 3-1 w i t h k = n + 1 and a = s . Observe t h a t the

order approaches i t s l i m i t i n g v a l u e , s •+ 1, q u i t e r a p i d l y as

a f u n c t i o n of n; t h i s i s p a r t i c u l a r l y true f o r l a r g e s . The

case of most p r a c t i c a l i n t e r e s t i s n = 1. Then (6-1) may be

s o l v e d e x a c t l y and

(s +4s)^].
p +
l,s"" 2

A drawback of i n t e r p o l a t o r y I . F . w i t h memory i s t h a t

m u l t i p l e p r e c i s i o n a r i t h m e t i c may have t o be used f o r at l e a s t

p a r t of the c a l c u l a t i o n . A drawback of i n t e r p o l a t o r y I . F .

g e n e r a t e d by d i r e c t i n t e r p o l a t i o n i s t h a t a p o l y n o m i a l of

degree r - 1 must be s o l v e d f o r each i t e r a t i o n . A r e d u c t i o n

of d e g r e e t e c h n i q u e , demonstrated f o r o n e - p o i n t I . F . in

S e c t i o n 5.33, i s a l s o a v a i l a b l e f o r o n e - p o i n t I . F , w i t h memory

an example i s g i v e n i n S e c t i o n 10.21.
6 . 1 - 4

6.16 Examples. The reader is referred to Appendix A

for the interpolation formulas used in the following examples.

The notation is the same as in Chapter 4. We shall not give

the conditions for convergence; such conditions may be found

in the examples of Section 4.3. The first three examples use

inverse interpolation; the next three examples use direct

interpolation. We take y^^ = f 1 = f ( x ^ ) ^ !J 1 - ^(y^),

EXAMPLE 6 - 1 . n ~ 1 , s * 1 , (secant I.F.). In the

Newtonian formulation,

f
i
= x
* ± ~ yi'^i^i-i] - i - f t x 1 > x 1 , 1 ] '

In the Lagrange-Hermite formulation,

f x l x
i i-l" 'i^l l

e
|Y (oi) P
a " 1
, P «• i(l+v/5) - 1 , 6 2 .
e
i

EFF - £ -v 1 . 6 2 .
II

6.1-5

EXAMPLE 6-2. n » 2, s = 1

ff
1
4

cp + j-* r i i i
,1 "
2
+
[f Lx ,x _ J
i i 1 " f Lx^x^g]/'

1
- ^ f t — |Y (a)|^P- ), 3 p „ !.84.
e
l'

EPF = 1.84.

EXAMPLE 6-3. n = 1 , s = 2.

'0,2
+ f H
' 1 , 2 =
i '

a, -X ^ x 7
'o,2 - i " T '
I
I

H . — 1 Xi 1 ,] V l fl_ . _ _ 1 _ 2
f f
i- i-iif; ^I^T; (f.-f,.,) 2
t ; f f ; _ / ^ ^ i

| e 1
^±|i- l Y ^ a ) ! * ^ " ) ,
p
p = l + v / 3 ^ 2.73.
e '
• 1

EFF - -| ^ 1 . 3 7 .
6 . 1 - 6

EXAMPLE 6 - 4 . n = 1 , s = 1 , (secant I.F.). In the

Newtonian formulation,

P ( t ) = f +
l,l i (t-x )f[x ,x _ ], ± 1 1 1

x
" i ' fLx ,x _ J 1 1 1

In the Lagrangian, formulation,

f X f X
<D = l i-l" i-l l
1,1 i i-i f _ f

Observe that for the case n = 1 , s = 1 , the I.F. generated by

direct and inverse interpolation are identical. This is the

only case, for n > 0 , where this is so.

e
! i+l p 1
- |A (a)| -
2
p = |Yja)|P-\
e P '
• 1

p « |(l+ v/5) ~ 1 . 6 2 ,

EFF = ~ 1 . 6 2 ,
s
6 . 1 - 7

EXAMPLE 6 - 5 . n = 2, s = 1 . The second degree

polynomial equation

P ^(t) = f
2 ± + (t-x )f[x ,x _ ] + (t-x )(t-x +h)f[x ,x _ ,x _ ],
± 1 1 1 ± 1 ± 1 1 1 2

h = x x
i " i-l* f [ x
i' i-l' i-2
x x ]
" x
i" i-2 x

must be solved for t - x . i

i f ^ ^ l M a ) ! * ^ 1
. ) , p . 1.84.

EFP = f ~ 1 . 8 4

This I.F. is discussed in Section 10,21

EXAMPLE 6 - 6 , n = 1 , s = 2 . The third degree poly-

nomial equation

0 = P x 2 (t) - f ± + (t-x )f' + ± ± (t- ) H,


X l
2

f + f 2 f x x ]
f x
i - ^ i^ i-l^ x
i i-l " t i* i-i
H = + (t-x ) ±
x x 2
i" i-l (x -x _ )
1 j L 1

must be solved for t - x


6 . 1 - 8

e
i4-1
i ± l i - | A , ( a ) p ( p _ l )
, P = 1 + v / 3 - 2 . 7 3

EPF = ^ 1.37
r

6 . 2 - 1

6.2 Derivative Estimated One-Point Iteration Functions With


Memory

6.21 The secant iteration function and its

generalization. We have used direct and inverse interpolation

to derive one-point I.F. with memory. We observed that the

secant I.F. could be generated from either direct or inverse

interpolation. We now give two additional derivations of the

secant I.F.; the method of derivation suggests a second general

technique for generating one-point I.F. with memory.

The secant I.F., together with slight modifications

thereof, must share with Halley s I.F. (Section 5 . 2 ) 1


the dis-

tinction of being the most often rediscovered I.F. in the

literature. Discussions of the secant I.F. may be found in

Bachmann[6.2-1], Collatz [ 6 . 2 - 2 , Chap. I l l ] , Hsu [ 6 . 2 - 3 ] ,

Jeeves [ 6 . 2 - 4 ] , Ostrowski [ 6 . 2 - 5 , Chap. 3 ] , and Putzer [ 6 . 2 - 6 ] .

Its order seems to have been first given by Bachmann [ 6 . 2 - 7 ]

( 1 9 5 4 ) . Let 3 be the inverse to f. One way of writing


1
Newton s I.F. is

1
Newton s I.F. is second order and uses two pieces of Informa-

tion. We estimate 3 ' ( y ) by % 1 where

"3(y) - s ( y i _ i ) '
Q 1 / L (t) - *(y) + (t-y)
y-yi_
i-l
6 , 2 - 2

is the first degree polynomial which agrees with S at the

points y and y^.^* It is convenient to use the symbols y

and y^ and the symbols x and x i interchangeably. Replacing

3' by ^ in (6-3) leads to

x x
i" l-l

which is the secant I.F.

On the other hand, we may write

x
* 2 f '(x)

and estimate f'(x) by differentiating the first degree poly-

nomial

P ^ ( t ) m f(x) + (t-x)
1 1 f(x) - t(x y
x-x i-1 lmml

which agrees with f at the points x and X j ^ - Again the secant

I.F. is generated.

In the following sections we will deal with two

broad generalizations of these ideas.

a. Rather than estimating the first derivative of an

optimal I.F. of second order, we estimate the

(s-l)st derivative of an optimal I.F. of order s.


b. Rather than estimating the first derivative from

two values of the function, we estimate the

(s-l)st derivative from n + 1 values (one new and

n old) of the first s - 2 derivatives.

1 1
In general the estimation of f and the estimation of

2r 1
leads to different I.F. We investigate the former

in Section 6 . 2 2 and the latter in Section 6 . 2 3 . After study-

ing the estimation of derivatives in the optimal basic

sequence E , we will be able to deal with the case of arbi-

trary optimal I.F. The one-point I.F. with memory thus gen-

erated will be said to be derivative estimated.


r

6 . 2 - 4

6.22 Estimation of f f . We first develop the

theory of one-point I.F. with memory which are generated by


v
estimating f ' in the I.F. E ; the corresponding theory
o

for arbitrary optimal I.F. then follows easily. From

Section 5-11,
s-1

E = x +
s
>1
( 6 - 5 )

H ( j ) f
y,(x) _
= -(—- i ) » (,y )
9 U ~ J J .
3
J![3'(y)] J

y-f(x)

E uses s - 1 derivatives and hence s pieces of information,


s
s 1
We estimate f ^ ^ ^ ( x ) from f ^ f x ^ ) ,
1 with j * 0 , 1 , . . . , n ,

and I = 0 , 1 , . . . , s - 2 . The symbols x and x i will be inter-

changeably. The I.F. so generated use s - 1 pieces of new

information at the point x^ and reuse s - 1 pieces of informa-

tion from the previous n points. Hence s - 1 will be of basic

importance and we define

S - s - 1 . ( 6 - 6 )

In place of r = s(n+l), we introduce

R « S(n+1). ( 6 - 7 )

The advantage of using S and R instead of s and r will become

evident below.
r

6.2-5

Let P Q ( t ) be the polynomial such that


Ti y O

P ^S
n
( x
i-p " ^ f ( x
i-j)' J »- 0,l,...,n, I - 0,1,...,S-1.

(6-8)

pj; I(x)
S
We estimate f^ ^(x) by S
. Let
IJ y O

n ! s -
f S ) x )
* n < ~ P ( x ) ( 6 _ 9 )

As shown in Appendix A,

n
f (S) ( x ) . . f U ) ( x ) = | j f ( R ) ( | i ) jj (x-x^j) , 3
(6-10)

where ^ lies in the interval determined by x i , x i . .. j X ^ .

Let * E ^ o be generated from E c , n by estimating


s
f^(x) by * f ^ ) ( x ) . A new approximation to a is defined by

x,., = * E _
k j 5 x j x x
i+l " ' n , S ^ i i - l ' * * " i - n ^ *

We derive the error equation for * E n g. Recall that


S
is independent of f^ ^ for j < S. Let * ^ n > s be generated
S
from Y g by estimating f^ ^ by * f ^ . Then

Sri
_ -v _l_
* -n,S" =
EJ x +
I V* + X Y
n,S u S
- S E + Y
* n,S u S
* ( 6
' 1 ; L )

J-l
On the other hand,

S S + l ) S 4 : L
a = E s + Y u s + «( (© )f ' ,
1 ( 6 - 1 2 )

where 0^ lies in the interval determined by 0 and y^. Hence

E
* n,S " a
"
uS
I* n,S-V
Y
" i
f^TTT 8 ( S + 1 )
(«i)f S + 1
-

(s)
Since by corollary d of Theorem 5 - 1 * depends on f v 7
only

as (-l) A , A S
s s = f^ V(S!f')* ^
S
conclude that

* E a _ (-D u S S
r. (S) _ ( S ) l _ (-l)°;
f f
S +x 1
g ( S (Sfl),
+l) ( 0 f) l f N-S+l

An application of ( 6 - 1 0 ) yields

*E_n , S„ - a = -
J
R ! f

Let

e x a e = x a = E
i-j - l-j ' > i+l i+l " * n , S " °"

7 - F'TV ! = g 7 ^ - p
±
6

where T| lies in the interval determined by x


1 i and a and

P l = f (Tl ) . 1 Then
6.2-7

- * M
i e
i S ( e
i - r e
i ) S +
* N
i e
i + 1
-

R ( R )
(-i) f (^)
v - / 1 3
*M, - - " q l i ' [f'CnJ] , (6-13)

( _ 1 ) S + i ( s i )
5 + ( }

*N, = - S+l*
1
( S + l ) ! [ * ' ( P i n

This is the error equation for


„; we use It to
nf o
ume
derive the conditions for convergence and the order. Ass

that e i is nonzero for all finite i. We shall show that if


x x # #, x a r e s u f f c : J e n t l o s e
o' l' * n i - y °l to a, then e ^ ^ + 0 .

Let

|x-a| £ r j .

v
Let f ' be continuous on J and let f be nonzero on J. Let

f map the interval J into the interval K, Since n > 0 ,

R ^ S + 1 , Hence we can conclude that g ( s + 1


) is continuous

on K. Let

•(H)
6 . 2 - 8

for all x e J. Let

S
v
l 3 v

1 1
S+l*
v
2

Let

3(8+1)1

for all y e K; that is, for all x e J. Let

x
*N =

Let x .x,,,..,x e J. Then


o 1 n

max[|e |,|e |,...,|e |] £ r .


o 1 n

By an inductive argument analogous to one employed in

Section 4 . 2 1 , it can be shown that if

S
2 R
" S
* M r R
" 1
+ *NT £ I,

then x ± e J for all 1. Hence | * M ± | £ *M, 1*^1 ^ *N for all i

Then

n
5
1+1 < * M 5
i II ( l-j 6 + 5
i) S
+ ^sf , 1
b ± _ 5= le _j I .
±

j=l
An application of Lemma 3-2 shows that if

+ * N r S
< 1,

then 6 4 -*0. Hence e. -* 0 .


1 i
Observe that

*M ±
n
-+ - ( - l ) A ( a ) ,R *N. - ± Y s + 1 (a).

All the conditions of Theorem 3-4 now apply and we conclude

that

(6-14)

where p is the unique real positive root with magnitude greater

than unity of the equation

n
n + 1
t - S
j=0

We summarize our results in

THEOREM 6 - 1 . Let
r

6.2-10

Let R m s ( n + l ) , n > 0. Let f^ ^ R


be continuous and let f be
x x x
nonzero on J. Let 0 * i>*"> n e J and let a sequence {x^

be defined as follows; Let be defined by (6-8) and


(s)
(6-9). Let * E n g be generated from E s + 1 by estimating f v
'

by • Define
x i+1 * * n,s( i' i-l''' " i-n)'
E x x x

x
Let ~ i - j " °"

T s + D

for all x e J and let

v 2 X 2

Assume that e A is nonzero for all finite i. Suppose that


1 1 S
a^^Mr *- + *NT < 1.
Then, x ± e J for all 1, e ± -+ 0, and

iy*)l ~ ( p 1 ) / ( R 1 )
~ . < - 6 15)
6.2-11

where p Is the unique real positive root of

n
t n + 1
- S > t J
= 0,
J«0

and where A R = f^/iRlf).

(S)
v
We turn to the case where f ' is estimated In an

arbitrary optimal one-point I.F. It will turn out that the

order and asymptotic error constant of the I.F. so generated


(S)
are identical with the case where f v
' is estimated in £ 5 + 1 •

From Theorem 2 - 1 0 ,

* s + i - V i + U f S + 1
< 6
" 1 6
)

is the most general I.F. of order S + 1. Let cpg +1 be an

optimal one-point I.F. Let *cp n s be generated from c p s + 1 by


(S) S '
estimating f ^ ' by * f £ )
v
in cp g+1 and let * U ^ S be defined

analogously. Then

S+l
a
n,S ~ " n,S " "n,S J
6 . 2 - 1 2

Hence

s+i
*<p . - a - *E TI q - a + *u f ,
Y
n , S n,S n,fa

n
*U
*cp Q - a * e . ^ = *M.ef [f (e. < - e . ) S
+ (*N. + S+X
y
n,S l+l i i ii- l-J 1' t 1
[S'(p.)] J
—1 l

(6-17)

where and were defined In ( 6 - 1 3 ) . We can proceed as

before to arrive at

THEOREM 6 - 2 . Let

x |x-a| £ ry.

R
Let R = S(n+l), n > 0. Let f^ ^ be continuous and let f be

nonzero on J. Let x , x ^ , . . . , x
Q n e J and let a sequence {x } i

be defined as follows$' Let * f £ ) be defined by ( 6 - 8 ) and ( 6 - 9 ) s

e ar
Let <P Slfl t> * arbitrary optimal one-point I.F.j let
S+1

cp s + 1 m E g + 1 + Uf . Let *cp n g be generated from c p s+1 by


S
estimating f^ ^ by in < p s+1 ; then

Define

x = c ( x ; x x
i+l * Pn,S i i-i""' i-n^
6.2-13

Let
Let e
1-J = x1-J - a,

R1 v 3 1 |f'I > v ,
2

for all x e J. Let

q
V V n 0 +

S+'l* .S+l*
v
2 A
2

Assume that e ± Is nonzero for all finite 1. Suppose that


2 H-S # M r R-l + * ^.
Nr < x >

Then, x ± e J for all 1 , e ± -»• 0, and

(p l)/(R l)
1 ^ ± 4 - IVa)l - - , ( 6 - 1 8 )

where.p is the unique real positive root of

n
t n+l _ s \ t J , 0 j

j-0

and where A R = /(Rlf) .


6.2-14

6.23 Estimation of g^"" ^.


We turn to one-point 1

f s-l)
v y
I.F. with memory which are generated by estimating £ in
the I.F. E ; the corresponding theory for arbitrary optimal
s
I.F. then follows easily.

We again take S = s - 1 . R = S(n+l); the

symbols y and y^ are used interchangeably. Let g be the

polynomial such that

(6-19)

We estimate ^ S
\ y ) by oj^ (y)
s . Let

l { 3)
z (y)
n s ^ (y).
s (6-20)

As shown in Appendix A,

n
s (y) - ^
(s) s )
( y ) =fTSf ( R )
(ei) II ±
S
(y-y -j) > (6-21)

where 6 ± lies in the interval determined by y^Y^i> ••• >y±-n'


1
Let E n s be generated from E g + 1 by estimating
s
g ^ ) ( ) by
y
1
3^ ^(y).
S
A new approximation to a is defined by

x l E x j x ,x
i+l ~ n,S^ i i-l'*'* i-n^
r

6.2-15

I
We derive the error equation for E N g . We have

_ S-l 1

i-E _ x + Vv L D l B (J) Jf + Izlif y s) s f

n,S - x +
L 31 S! tf
n

= E g + J b | ^ ^S) S f f ( 6 . 2 2 )

a -E s
+ ^ ^ )fS s
+ ^ he )f ,+1
±
2+1

(6-23)

where 8 i lies in the Interval determined by 0 and y^. Prom

(6-21), (6-22), and (6-23),

k S+l

Since

a
V j T

where T l i + 1 lies in the interval determined by x 1 + 1 and a and

f w e c o n c u d e
P i + 1 = ^i+l^ l that

y± i -
+ X + i ^ II (yi-j-yi) s
+ V ^ f " - 1

M
i+1 =
- R!g'(p 1 + 1 ) >

8 1 1
L ( - i ) * ^ ^ )
N
i+1 ~ ~ (SH)J*'(P i + 1 ) '
r

6 . 2 - 1 6

Because of the dependence of (6-24) on Y „y


±
we

shall focus our attention on H,

H = |y| |y| £ A}-

Let 3 ^ ) R
be continuous on H and let 3 ' be nonzero there.

Let y y , . . . , y
o J 1 n e H. Since M 1 + 1 and N 1 + 1 depend on y 1 + 1

through the parameter P i+1 5 we cannot prove that all y i e H in

a manner analogous to the proof of the last section. Instead

we assume that y^^ e H for all I and that y i Is nonzero for all

finite i. Let

3
R! ^ V l 'I^V (s+1)! ^ V

for all y e H. Let

V *2

Then

n
S , 1S+1
|y i + 1 l <; lM\y±\s II (| _jl yi + |y J) + ±
s 1
k\j \
±+1

An application of Lemma 3 - 2 shows that if

2 R - S ljy^R-1 + i N A S < l f
6 . 2 - 1 ?

then y A -*• 0 . Hence e ± -*• 0 . Observe that

S (R)
( - 1 ) % ( 0 ) , a R (y) = f r ^ { f }

All the conditions of Theorem 3-4 are now satisfied and we can

conclude that

where p Is the unique real positive root of

n
.n+
T A1 „ V 4-J -
t" - S ^ t = °-
j=0

Since

a n d a w e
where lies in the interval determined by y ^ j *

have that

J > l i - | y R ( „ ) | ( P - l ) / ( » - l ) .

We summarize our results in


6.2-18

THEOREM 6 - 3 . Let

H = iy I <; a }

Let R = S(n+l), n > 0 . Let be continuous and let be

nonzero on H. Let x , x ^ , . . . , x Q n be given and let a sequence

{x } be defined as follows:
± Let i S
^ ^ be defined by (6-19)

and ( 6 - 2 0 ) . Let 1
E n g be generated from E s + 1 by estimating

3 ^ by 35 ^ .
i
I I
S
Define

_ x ; x x
i+1 "n,S^ i i-l'''*' i-n)'

Assume that y ± e H for all 1 and that y ± is nonzero for all

finite i. Let

l * ( R )
l s , K H s * 1S ( S + 1 )
1 ,
R!

for all y e H and let

\ 2 A 2

Suppose that 2 R _ S 1
Mr R _ 1
+ W l S
< 1.
6.2-19

Then y ± -• 0 and

^ i ± l l . | V 0 ) | ( P - l ) / ( H - l ) , (6-25)

where p Is the unique real positive root of

n
t n + i
- s
i
y t j
= o,

and where

(R)

Furthermore,

K±i"_l V a ) |(p-i)/(H-i), (6-26)

where

R ( R )

Y (x) = . (-D g (y)


R

R! [ 3 ' ( y ) ] J

y=f(x)

K
The case where % ' is estimated in an arbitrary

optimal one-point I.F. may be handled in a fashion which


/—
should by now be familiar. We summarize the results in
6 . 2 - 2 0

THEOREM 6 - 4 . Let

H = {y| |y| 1 A}.

Let R = S(n+l), n > 0 . Let 3 f ( R


) be continuous and let 3 ' be

nonzero on H. Let x ,x^,.,.,x be given and let a sequence

{x } be defined as follows:
± Let be defined by ( 6 - 1 9 )

and ( 6 - 2 0 ) . Let cpg +1 be an arbitrary optimal one-point I.P.;


S+1 1

let c p s+1 * E
s + 1 + f u
• Let <P 5 be generated from c p
n s+1 t»y

estimating 0 ^ by in c p s+1 ; then

Define

x ± c p x ; x ,x
i+l " n,S^ i i - l ' * * * i-n^

Assume that y ± e H for all 1 and that y ± is nonzero for all

finite i. Let

\^ S + l )
\ .lu I s K

(S+1)! * V I V s 1
*
6,2-21

for all y e H. Let

K
1 X
3 + V

R S X R _ 1
Suppose that 2 " MA + ^NA^ < 1

Then y i -*• 0 and

1
- ^ 4 - l ^ o ) ! ^ " ' ^ - ! ) .

where p Is the unique real positive root of

n
t.n+1 - S > +.tJ _= 0.
^
1 1 T X
Q
J

j=0

Furthermore,
6,2-22

6.24 Examples. The reader is referred to Appendix A

for the approximate derivative formulas used in the following

examples. We shall not give the conditions for convergence.

The notation is the same as in Sections 6 . 2 2 and 6 . 2 3 .

EXAMPLE 6-7. n ~ 1, S = 1 , (secant I . F . ) .

E
* i,i = x
i ' Tp-

^ ± 4 - lAgCcOp- , 1
P - i(l+V*>) - 1.62.

EXAMPLE 6-8. n = 2, S » 1 .

*f' - f [ x , x _ ] + f [ x , x _ ] - f [ x _ , x _ ] ,
1 1 1 ± ± 2 1 1 1 2

* 2,i
E
- x
i " ;7'
*2

I ^ ± 4 - . lAgCa)!*^"^. P . 1.84.
e
' i
r

6 . 2 - 2 3

EXAMPLE 6 - 9 . n = 1 , S =2 .

* l " x p f ^
f
{ 2 f
I +
^ ' 3 f C x
i' i-l }'X ]

U
2
f
r
1 i " i

l A ^ C a ) ! ^ 5
" 1 5
, P - 1 + 7 3 - 2 . 7 3

EXAMPLE 6 - 1 0 . n = 1 , S = 2 . We estimate f" by * f 1

in Halley's I.F. rather than in . Thus

u
i
- *i " 1 - I ( u X j <

|A (ct)
4 | * ( P - D , p . i + v / 3 ^ 2 . 7 3 .

e
i'
6.2-24

EXAMPLE 6 - 1 1 . n = 1 , S = 1 , (secant I.F.).

1 , ' 1
* 1 - f [ x ± , x 1 - 1 J '

1 1 '
E
l , l " x
i ~ f
i V

v/5)
p
I * ± i i - |Y (a)| -\2 p = | ( 1 + - 1 . 6 2 ,

EXAMPLE 6 - 1 2 . n = 2 , S - 1 .

• U ' 1 , 1 1
3
2 * f i
[ x 1 , x 1 _ 1 J +
f [ x 1 , x 1 _ 2 J " f L x ^ ^ x ^ J '

1 1 '
2,l
8
E
" X
l " f
l 2'

b±ij^,y (o ),i(P-l), p . 1.84.


e
l
6 . 2 - 2 5

EXAMPLE 6 - 1 3 . n = 1 , S = 2

1 " 1 _ -3
5 T + ,
i f -f f, , f [ x
i' i-lx J

i M-l

l
S , 2
X
- i " i U +
A
^ ± 4 - l Y . f a ) ! ^ " 1
) , P = 1 + 7 3 ^ 2, 7 3
P 4
| . , I "
6.3 Discussion of One-Point Iteration Functions With Memory

6,31 A conjecture. The theory of interpolatory I.F.

and derivative estimated I.F. has been developed in

Sections 6 . 1 and 6 . 2 . In the case of interpolatory I.F.,the

first s - 1 derivatives of f are evaluated. Hence s pieces

of new information are required for each iteration. For the

case of derivative estimated I.F., the (s-l)st derivative of

an optimal one-point I.F. is estimated from the first

s - 2 derivatives at n + 1 points. If we set S - s - 1 , then

S pieces of new information are used for each iteration. For

interpolatory I.F.,, r » s(n-fl) represents the product of the

number of new pieces of information per iteration with the

number of points at which information is used; R = S(n+l) plays

the corresponding role for derivative estimated I.F. A glance

at Theorems 4 - 1 , 4-3, 6-1, 6-2, 6-3, and 6 - 4 , reveals a remark-

able regularity in the structure of the results. The only

parameters which enter are p and r or p and R; p depends only

on s and n or S and n. When we deal with inverse interpolation

the asymptotic error constant depends on Y^; when we deal with

direct interpolation the asymptotic error constant depends

on A .r From Theorems 6 - 2 and 6 - 4 we can conclude that the

asymptotic behavior of a sequence generated by a derivative

estimated I.F. is independent of the optimal one-point I.F. in

which the derivative is estimated.


r

Values of p for different values of n and s or n and

S may be found In Table 3-1,

The effect of estimating the highest derivative of

optimal one-point I.F, has been studied. The idea of estimat-

ing the two highest derivatives suggests itself. Calculations

of orders of such methods indicate that such a procedure would

not be profitable.

We have proved, for the case of interpolatory and

derivative estimated I.F,, that the old information adds less

than unity to the order. We conjecture that this is true no

matter how the old information is reused.

CONJECTURE. Let

388
cp x
cp[x^; i - i , x
i - 2 ' ** * l-n^
9 x

« G ^JL*^ ^'^i * * ^ * 1
1 # 0 # ,
^i-l'^l—l ^i l'" * * ,p, #

f U - i ) v f f' f U - i ) ~
x I I I
i~l '•*" l-n' i-n' i-n""' i-n

be any one-point I.F. with memory. (The semicolon shows that

new information is used only at x^.) Let cp e ^I^. Then

p < I .+ 1.
In particular, we conjecture that it is impossible

to construct a one-point I.F, with memory which is of second

order and which does not require the evaluation of derivatives.

The fact that neyf information is used at only one point is

critical; in Section 8.6 we give I.F. which are of order

greater than two and which do not require the evaluation of

any derivatives. Those I.F. are, however, multipoint I.F.

with memory.

It must be emphasized that the conjecture does not

apply to the case where the sequence of approximants generated

by an I.F. is "milked" for more information. See Appendix D

for a discussion of acceleration of convergence.


6.32 Practical considerations. One-point I.F, with

memory typically contain terms which approach o/o as the

approximants approach a. Consider, for instance, the I.F. of

Example 6 - 9 :

# f
l " x p i j ^ { 2 f
i + f
i-l " 3 f [ x ^ x , ^ ] } ,

(6-29)

2
1 i
U

* l,2- i- i-57* i-
E x u f

x
i

In theory, *f- - * f " ( a ) .


L In practice, it approaches a o/o form

which naturally poses computational difficulties. Note that

*f-" is multiplied by u£
L
2
which goes to zero quite rapidly. Note

also that the last term of (6-29) may be regarded as a correc-

tlon term to x^ u^. Hence *f^ need not be Imown too

accurately. It might be worthwhile to do at least part of the

computation with multiple precision arithmetic; this matter

has not been fully explored.

In order to use an I.F. such as C


P , n + 1 approxi-
Y
n,s

mants to a must be available. This suggests using g ,

which requires but two approximants, followed successively

i?Q o'*»«'9v* a*
a t t h e
by <Po a beginning of a calculation.
6 . 3 - 5

6.33 Iteration functions which do not use all

available information. All the I.F. studied in this chapter

use all the old information available at n points. It is

possible to construot I.F. which use only part of the old

information available at n points. This may lead to simpler

I.F. but ones which are not of as high an order. For example,

the simplest estimate of f" is

1
• x
r x
i - i •

+
Define E 1 2 by

+
» i . 2 - » i - » i - a ? t f
i - ( 6
" 3 0 )

It may be shown that the indicial equation associated with

this I.F. is t 2
- 2 t - 1 - 0 , with roots 1 ± y / 2 . Thus the
4-

order of E^ 2 is 2 . 4 l . On the other hand. * E 1 2 , which

estimates f" from f ^ f ^ ^ f ^ f ^ p is of order 2 . 7 3 - If the

evaluation of f and f is expensive, it is preferable to use

the slightly more complicated *E-^ g.


r

6 . 3 - 6

6 . 3 4 An a d d i t i o n a l term i n the e r r o r e q u a t i o n . In

S e c t i o n 5 . 5 a d i f f e r e n c e e q u a t i o n was d e r i v e d which p e r m i t t e d

t h e r e c u r s i v e c a l c u l a t i o n of t h e c o e f f i c i e n t s of the e r r o r

s e r i e s of E . G e n e r a l i z a t i o n t o I . F . w i t h memory has not been

a t t e m p t e d . The f i r s t two terms of the e r r o r s e r i e s have been

worked out f o r a number of I . F . and are g i v e n b e l o w . The

l e a d i n g term of t h e s e I . F . i s , of c o u r s e , g i v e n by Theorem 6 - 1 .

*E 1 e -e s e c a n t I . F . ,
, 1 " a
* A
2 ^ a
^ e
i e
i - 1 " ^(a) - (a) i " i - r

( 6 - 3 1 )

*E 2 , 1 ( 6 - 3 2 )
- ° * - A
3 ( ° 0 e
i e
i _ i e
i - 2 + A
2 ( a
) e
i >

*E n
1,2 • a
* V ) i i-i a e e +
1 |( ) - V )
2A a a e{. ( 6 - 3 3 )
7.0-1

CHAPTER 7

MULTIPLE ROOTS

In S e c t i o n 7.2 we show t h a t a l l E are of l i n e a r

s
o r d e r f o r nonsimple z e r o . In S e c t i o n 7-3 we s t u d y an o p t i m a l

b a s i c sequence { g } whose o r d e r i s m u l t i p l i c i t y - i n d e p e n d e n t .
s

Our r e s u l t s on I . F . g e n e r a t e d by d i r e c t i n t e r p o l a t i o n are

e x t e n d e d t o t h e c a s e of m u l t i p l e r o o t s i n S e c t i o n 7*5.

Theorem 7-6 g i v e s a n e c e s s a r y and s u f f i c i e n t c o n d i t i o n f o r an

I . F . t o be of second order f o r r o o t s of a r b i t r a r y m u l t i p l i c i t y .

An I . F . of incommensurate o r d e r i s s t u d i e d i n S e c t i o n 7.8.
7 . 1 - 1

7 . 1 I n t r o d u c t i o n

A number of d e f i n i t i o n s which r e l a t e o r d e r t o

m u l t i p l i c i t y are g i v e n i n S e c t i o n 1 . 2 3 .

Three f u n c t i o n a l s are of p a r t i c u l a r i n t e r e s t :

v-frs G = fi^ ), 1
p = f lA. ( 7 - 1 )

Observe t h a t u has o n l y s i m p l e z e r o s ; G and P have o n l y s i m p l e

z e r o s p r o v i d e d t h a t the z e r o of f has m u l t i p l i c i t y m. The

m u l t i p l i c i t y must be known a p r i o r i i f G or F are t o be used.

I f f and i t s d e r i v a t i v e s are r e p l a c e d by u, G, or F and t h e i r

d e r i v a t i v e s i n any I . F . , then the e n t i r e t h e o r y which p e r t a i n s

t o s i m p l e z e r o s may be a p p l i e d . If, f o r example, we r e p l a c e f

b y u in N e w t o n ' s I . F . , t h e n we g e n e r a t e

u
9 - X -

w h i c h i s second o r d e r f o r z e r o s o f a l l m u l t i p l i c i t i e s and

cp-a u"( a)
/ n 2 2u' a) #

( x - a ) v 1

I t i s w e l l known t h a t Newton's I . F . i s of l i n e a r

o r d e r f o r a l l nonsimple r o o t s . E. S c h r o d e r [ 7 . 1 - 1 , p . 3 2 4 ]

p o i n t s out t h a t

cp = x - mu ( 7 - 2 )
7.1-2

i s of second o r d e r f o r z e r o s of m u l t i p l i c i t y m; t h i s f a c t has

been o f t e n r e d i s c o v e r e d . In S e c t i o n 7.2 we w i l l prove t h a t

a l l E are of l i n e a r o r d e r f o r nonsimple z e r o s . In
s

S e c t i o n 7-3 we w i l l c o n s t r u c t an o p t i m a l b a s i c sequence of

w h i c h (7-2) i s t h e f i r s t member.
7.2-1

7.2 The Order of E o

E g was d e r i v e d from t h e T a y l o r s e r i e s e x p a n s i o n

of 3 under t h e a s s u m p t i o n t h a t f has o n l y simple z e r o s . We

c a n , n e v e r t h e l e s s , i n q u i r e as t o the b e h a v i o r of E when
s
a p p l i e d t o the c a l c u l a t i o n of m u l t i p l e z e r o s . We prove

THEOREM 7-1. The o r d e r of E g i s l i n e a r f o r a l l

nonsimple z e r o s . Moreover,

E_ , n - a / -i \ s

• S i - - 1
^ - n a - * - ) . ( 7 - 3 )
3 I m
t = i

PROOF. From (5-l6) and (5-17),

E s + 1 ( x ) = x - £ Z j ( x ) , Z j ( x ) = Y j ( x ) u J
( x )

j = l

Hence

E g + 1 ( x ) - a - x - a - ^ Z ^ ( x ) .

J - l

D e f i n e 7 ^ by

Z j ( x ) = £ 7 (*-a) . tt3
1
(7-4)
t = l
7.2-2

Hence

E g + 1 ( x ) - a = ( x - a ) + 0 [ ( x - a ) 2
] . (7-5)
1
- I

S i n c e Z
j ( x
) s a t i s f i e s t h e d i f f e r e n c e - d i f f e r e n t i a l e q u a t i o n

J Z j ( x ) - ( j - l ) Z J _ 1 ( x ) + u ( x ) Z j _ 1 ( x ) = 0, Z 1 ( x ) = u ( x ) ,

(7-6)

we f i n d on s u b s t i t u t i n g (7-4) i n t o (7-6) and n o t i n g

u(x) = ( x - a ) / m + 0 [ ( x - a ) 2
] , t h a t

I7 l9i - ( M ) 7 l j H + £ 7 1 ) y 1 - 0, y 1 } 1 = 1. (7-7)

S e t t i n g M = l / m y i e l d s

,1-1-M
71 , J - 1 ' = M,
'1,J
7
1 , 1

Hence

(7-8)

where C[M,J] d e n o t e s a b i n o m i a l c o e f f i c i e n t . S u b s t i t u t i n g

(7-8) i n t o (7-5) y i e l d s

E g + 1 (x) - a = (x-a) £ (-D C[M,J] + 0[(X-AF] =


J
(x-a)(-l) C[M-l,s] s
+ 0[(X-AF],
J=0
r

7.2-3

where we have used t h e w e l l - k n o w n I d e n t i t y

£ (-l) C[M,j]
J
- (-1) C[M-1,b].
8

J=0

Hence

E s + 1 ( x ) - a = ( x - a ) + 0[(x-a) ] .

-t=i

S i n c e m i s a p o s i t i v e i n t e g e r , the c o e f f i c i e n t of ( x - a ) i s z e r o

i f and o n l y i f m = 1; f a c t o r i n g out l/m from each term of t h e

p r o d u c t c o m p l e t e s t h e p r o o f .

EXAMPLE 7-1.

E 2 ( x ) - a = ( l - ( x - a ) + 0 [ ( x - a ) 2
]

which i s a w e l l - k n o w n r e s u l t ,

The a s y m p t o t i c e r r o r c o n s t a n t i s g i v e n by the r i g h t

s i d e of (7-3). We have

COROLLARY. Let

G ( M , S ) = L = I L [J ( i - t m )

s Jm
1=1
7.2-4

Then

Q(m,s) < 1 , l i m Q(m,s) = 1,


m —•• oo

PROOF. S i n c e G(m,s) may be w r i t t e n a s

G(m,s) = II ( l -

t = l

the r e s u l t f o l l o w s i m m e d i a t e l y .

I f m > 1 and i f a sequence of approximants i s formed

by x ± + 1 = E 2 ( x ± ) , t h e n

x
i + l " a
= C 1
" m ) ( x
i " a
) +
0 [ ( X i - a ) 2
] . (7-9)

S i n c e t h e sequence c o n v e r g e s l i n e a r l y , we can a p p l y A i t k e n ' s

5 formula (Appendix D) and e s t i m a t e a b y

a* = x - ( x
i + 2 ~ x
i + l ) 2

i+2 x
i + 2 " 2 x
i + l + x
i *

Using a * as an e s t i m a t e f o r a,

x
i + l " a w
C 1
- m ) ( x
i " a
)

may be used t o c a l c u l a t e an e s t i m a t e f o r m. Note t h a t m i s

an i n t e g e r - v a l u e d v a r i a b l e . Once m i s known, t h e second order

I . F . , cp = x - mu m a y b e u s e d .
7.3-1

7.3 The B a s i c Sequence g


s
7.31 I n t r o d u c t i o n . In S e c t i o n 5*13* e x p l i c i t

f o r m u l a s were d e r i v e d f o r E g ; t h e s e I . P . form an o p t i m a l b a s i c

sequence f o r simple z e r o s . In S e c t i o n 7.2 we showed t h a t the

o r d e r of E i s l i n e a r f o r nonsimple z e r o s and hence t h a t {E )


s s

i s not a b a s i c sequence f o r m > 1. An o p t i m a l b a s i c sequence

i s c o n s t r u c t e d below f o r t h e case where m i s a r b i t r a r y but

known.

S i n c e t h e m u l t i p l i c i t y of a z e r o i s o f t e n not known

a p r i o r i , the r e s u l t s are of l i m i t e d v a l u e as f a r as p r a c t i c a l

problems are c o n c e r n e d . The s t u d y i s , however, of c o n s i d e r a b l e

t h e o r e t i c a l i n t e r e s t and l e a d s t o some s u r p r i s i n g r e s u l t s .

We w i l l f i n d t h a t m u l t i p l i c a t i o n of the terms of E g by c e r t a i n

p o l y n o m i a l s i n m l e a d s t o I . F . w i t h the d e s i r e d p r o p e r t i e s .

These p o l y n o m i a l s are found e x p l i c i t l y ; t h e i r c o e f f i c i e n t s

depend on S t i r l i n g numbers of the f i r s t and second k i n d .

One may d e r i v e t h e s e new I . F . by t h e f o l l o w i n g

t e c h n i q u e . Let a be a z e r o of m u l t i p l i c i t y m. Let

h(x) = f^ (x) m
=z .

Then h ( x ) has a simple z e r o a t a. Let H be t h e i n v e r s e t o h .

Then p r o c e e d i n g as i n S e c t i o n 5.11 i t i s c l e a r t h a t

s-1

j=0
7.3-2

i s of o r d e r s f o r a l l m. In p a r t i c u l a r .

.pl/rri/ \

cp 2 = H(z) - z H ' ( z ) = x - h '(x]T = X


" m u
( x
) >

( 7 - H )

cp 3 = H(z) - z H ' ( z ) + | z 2
H " ( z ) = x - |m(3-m)u(x) - m 2
A 2 ( x ) u 2
( x ) .

cp 2 was known t o E. S c h r o d e r [7-3-1] (1870). See a l s o

Bodewig [7-3-2] and O s t r o w s k l '[7.3-3, Chap. 8].

R a t h e r t h a n u s i n g (7-10) t o g e n e r a t e h i g h e r o r d e r

I . F . of t h i s t y p e , we a t t a c k t h e problem from a n o t h e r p o i n t

of v i e w .
T

7*3-3

7.32 The s t r u c t u r e of g . I t i s a d v a n t a g e o u s t o

e x t e n d the n o t a t i o n f o r I . F . so t h a t f and m a p p e a r e x p l i c i t l y

as p a r a m e t e r s . Thus we r e p l a c e (5-17) by

E a + 1 ( x , f , l ) = x - ^ Z j ( x , f , l ) .

L e t

F = f l / m
. (7-12)

Observe t h a t a z e r o of m u l t i p l i c i t y m of f i s a z e r o of

m u l t i p l i c i t y 1 of F. C l e a r l y , {E 1 ( x , P , l ) ) i s an o p t i m a l

b a s i c sequence f o r a l l m. ( I t i s c o n v e n i e n t t o use E g + 1

r a t h e r t h a n E t h r o u g h o u t t h i s s e c t i o n . ) Let
s

Z j ( x , P , l ) = W j ( x , f , m ) .

Then (5-18) becomes

J W j ( x , f , m ) - ( j - l ) W J _ 1 ( x , f , m ) + m u ( x ) W j _ 1 ( x , f , m ) = 0 ,

(7-13)

W ^ ( x , f , m ) = rau(x),

w h i l e Lemma 5 3
_
becomes
7.3-4

LEMMA 7-1.

e s+1 ( x , f , m ) = fi (x,f,m) - a
s
e^(x,f,m).
w
s

We s e e k c o e f f i c i e n t s p ..(m) such t h a t

e s + 1 (x,f,m) = x - £ p s ^ J ( m ) Z j ( x , f , l ) ;

(7-14)

P s ^ j ( m ) =0, s < j ,

and such t h a t { £ s + 1 ( x , f , m ) } i s an o p t i m a l b a s i c s e q u e n c e .

S u b s t i t u t i n g (7-l4) i n t o t h e formula o f Lemma 7-1

y i e l d s

x - ^ p s ^ ( r a ) Z J ( x , f , l )

J=l

s-1 s-1
- u ( x ) 1 - y p . ( m ) Z j ( x , f , l )
s v
' B l i J

3=1
We use (5-18) t o e l i m i n a t e u ( x ) z l ( x , f , l ) and f i n d

X Z j U ^ l H - sp S j .(m) + sp _ (m) s 1}i + m J p B . l j J . ( m )


1 - n J p 8 . l j J ( m ) ]

j=l

where we have t a k e n p _(m), which may be c o n s i d e r e d as t h e


s , o

c o e f f i c i e n t of x i n (7-l4), e q u a l t o u n i t y . Then,

s p s ^ ( m ) + ( m j - s ) p s _ 1 ^ j ( m ) - m J p s - 1 ^ - : L ( m ) = 0, (7-15)

w i t h p (m) = 1 f o r s ^> 0 and p . (m) = 0, f o r s < j as


s , o s , j

i n i t i a l c o n d i t i o n s . E q u a t i o n (7-15) p e r m i t s t h e r e c u r s i v e

c a l c u l a t i o n of the p ^(m). E q u a t i o n (7-15) shows t h a t


J
S
9
p ^(m) i s a p o l y n o m i a l i n m; an e x p l i c i t formula i s d e r i v e d
J
S
9
b e l o w .

D e f i n e the a s s o c i a t e d f u n c t i o n s a. . (m) by
^9 J

W t ( x , f , m ) = ^ a t ^ j ( m ) Z J ( x , f , l ) , t < j . (7-l6)
j = l

The ^(m) were i n t r o d u c e d by Z a j t a [7.3-4]. S i n c e

s s t

e s + 1 ( x , f , m ) = x - ^ W t ( x , f , m ) = x - ^ ^ a ^ f m ) Z^ ( x , f , l )

1=1 t = l j = l

= X

j = l l=i
r

7.3-6

we have

P s ^ ( m ) = £ a
t , j ( m )
' ( 7
" 1 7 )

1=3

To f i n d a r e c u r s i o n formula f o r t h e j ( m
) w e
s u b s t i t u t e

(7-16) i n t o (7-13), and use (5-18) t o e l i m i n a t e u ( x ) Z ^ ( x , f , 1 ) .

Then

1-

^ Zj(x,f,l)[ta^j(m) - U - l ^ ^ j C m ) - m
«J0^_i^j_i( ) m + m J a
t - l , j ( m
^ =
°*

J=l

Hence

(t+l)a t + 1 ^(m) + (mj-t)a ^(m) t - mja^^Cm) = 0 (7-18)

w i t h a (m) = 1, a, (m) = 0 for t > 0, and a Am) = 0 for

I < j , as i n i t i a l c o n d i t i o n s . E q u a t i o n (7-18) shows t h a t

a. . i s a p o l y n o m i a l i n m.

We d i g r e s s b r i e f l y t o l i s t some d e f i n i t i o n s from t h e

C a l c u l u s of F i n i t e ' D i f f e r e n c e s . The r e a d e r i s r e f e r r e d t o

Jordan [7-3-5] o r
Riordan [7*3-6] f o r t h e standard t h e o r y .

Our n o t a t i o n i s n o t q u i t e s t a n d a r d . We d e f i n e :
7.3-7

i-i
[ x ] t = J[ ( x - i ) " P a l l i n g F a c t o r i a l "

1=0

l-l
p [ x ] t - ( x + i ) " R i s i n g F a c t o r i a l "

1=0

C[x,l] = [x],/ll " F a l l i n g B i n o m i a l C o e f f i c i e n t "

C _ [ x , t ] = Ax],/II " R i s i n g F a c t o r i a l C o e f f i c i e n t "

[ x ] ^ = ^ S^ " S t i r l i n g Numbers of the F i r s t Kind"

I J=0

l
x " S t i r l i n g Numbers of the Second Kind"

j=0

S t i r l i n g numbers of t h e f i r s t and second kind are o f t e n denoted

by two d i f f e r e n t t y p e s of s ; f o r example, by s and S. We

w i l l use S and T, There i s no d a n g e r of c o n f u s i n g the l a t t e r

w i t h the u s u a l symbol f o r a Chebyshey p o l y n o m i a l .

We c o n t i n u e our s t u d y of p .(m) and ap ^(m). A

g e n e r a t i n g f u n c t i o n f o r the j ( m
) ^ a
y be d e r i v e d by d e f i n i n g

hj(x,m) = ^ o ^{m)x .
t
l

t=0
7 . 3 - 8

I t f o l l o w s from ( 7 - 1 8 ) t h a t h j ( x , m ) s a t i s f i e s

( l - x ) h j ( x , m ) + mjhj(x,m) - m j h J _ 1 ( x , m ) - 0, ( 7 - 1 9 )

where m I s a p a r a m e t e r . A s o l u t i o n of ( 7 - 1 9 ) Is

h j ( x , m ) = [1 - ( l - x ) m
] J
.

I t i s not d i f f i c u l t t o v e r i f y t h a t the f u n c t i o n s k- . (m) which

^9 J
s a t i s f y

00

[1 - ( l - x ) m
] J
= £

1=0

a l s o s a t i s f y (7~l8) and I t s I n i t i a l c o n d i t i o n s , and hence

00

l
[1 - ( l - x ) m
] J
= £ a l f i (m)x .

O b s e r v i n g t h a t

1 - ( l - x ) m
= £ (-D^Cim.rW
r = l

and a p p l y i n g the m u l t i n o m i a l theorem y i e l d s

a t j J ( m ) = J . ( - l ) < - +
J I n • ( 7 - 2 0 )

r = l
7 . 3 - 9

w i t h t h e sum t a k e n o v e r a l l n o n n e g a t i v e i n t e g e r s a r such

t h a t

L ra
r -
l
> I a
r = J '
r = l r = l

On t h e o t h e r hand,

[1 - ( l - x ) m
] J
= £ C [ j , r ] ( - l ) r
( l - x ) r m

r=0

t = j r=0

Thus

a
t , j ( m ) =
1 ( - 1
) r c
[ J ^ H - l ) t
C [ r m , t j . (7-21)
r^O

Since C [ r m , t ] i s a p o l y n o m i a l i n m of degree I, ( 7 - 2 1 ) e x h i b i t s

an e x p a n s i o n of cj^ j(m) i n t h e p o l y n o m i a l s C [ r m , t ] . S i n c e

r=0 k-0

k=0 r=0
r

7.3-10

Using

r=0

y i e l d s

a Jm)t - ( - l ) t + J
# I S
t , k T
k , J m k
' ^-22)

E q u a t i o n (7-22) e x h i b i t s a. ,(m) a s a p o l y n o m i a l i n m. I t

i s n o t d i f f i c u l t t o show t h a t

k
C r [ m x , t ] - £ C r [ x , j ] ( - 1 ) ^ J | X \ , k T
k , Jm

j=0 k = j

and hence t h a t

C r [ m x , t ] = ^ a t ^ ( m ) C r [ x , j ]

J=0

Thus C [mx,t] i s a g e n e r a t i n g f u n c t i o n f o r t h e a. .(m) r e l a -

t i v e t o t h e base f u n c t i o n s C [ x , j ] . Since C [x l] s = C[x+l-l l] 9

we have e q u i v a l e n t l y t h a t

C [ m x + t - l , t ] = ^ a t ^ ( m ) C [ x + J - l , j ] .

j=0
7.3-11

We now d e r i v e an e x p l i c i t f o r m u l a f o r t h e p . ( m ) . R e c a l l i n g

t h a t

1=3

and u s i n g (7-22) y i e l d s

k=j t=k

S i n c e

t ! "*,,k ~ s ! S
s + l , k + l '
t=k

Our r e s u l t s are summarized i n


7.3-12

THEOREM 7-2. Define p 0 ,(m) by

fi (x,f,m)
s+1 = x - ^ p s ^ ( m ) Z J ( x , f , l ) ; P s ^ ( m
) = °; f o r s
< J«

D e f i n e a . (m) by
^ 9 J

W t ( x , f , m ) = £ a t ^ ( m ) Z j ( x , f , 1 ) ; cr^j(m) = 0 . f o r l< j .

J - l

Then

00

[1 - ( l - x ) m
] J
= £ CT
t,j( ) ^
m x
(7-24)
t=0

= ]T ( - l ) r
C [ j , r ] ( - l ) t
C [ r m , t ] , (7-25)
r=0

,j(m) = ( - D t + J
ff £ S ^ k T k ^ m k
, (7-26)

C r [ m x , t ] = Y a
t ^ ( m ) C r [ x , j ] , (7-27)
j=0

I
J + S S T m k
P 8 j J ( » ) - ( - D f t s + l , k + l k , J ' ^ '
7.3-13

The f o l l o w i n g c o r o l l a r i e s f o l l o w from t h e v a r l o u

p a r t s of Theorem 7-2. T h i s Is not a complete l i s t of the

p r o p e r t i e s of j ( m
) a n d
j ( m
) *

COROLLARY a. j(m) Is a p o l y n o m i a l i n m of

d e g r e e I.

PROOF. T h i s f o l l o w s from (7-26).

COROLLARY b . ^(m) = m. l

PROOF. T h i s f o l l o w s from (7-26), s i n c e

T — q _ -1

COROLLARY c. 1 ( m ) = (-l) l + 1
C[m,I]

PROOF. T h i s f o l l o w s from (7-25).

COROLLARY d. j(m) = 0, f o r l < j and I > m j .

PROOF. For I < j , ^(m) was d e f i n e d as z e r o .

For I > m j , t h i s f o l l o w s from (7-24).


r

7.3-14

COROLLARY e .

00

t=j t=o

PROOF. S e t x = 1 In ( 7 - 2 4 ) and use C o r o l l a r y d.

COROLLARY f. j ( l ) = & t j ( K r o n e c k e r d e l t a ) .

PROOF. Set m = 1 In (7-24).

COROLLARY g .

«t 00

^ a ^ j ( m ) =
X a
£ , j ( m
) = C
t m + t
" 1
' ^ *

j=o ' J-0

PROOF. Set x = 1 In (7-27).

COROLLARY h . The l e a d i n g c o e f f i c i e n t of a. ,(m) i s

1
( - l ^ C i y i ] ! .
1=0

PROOF. T h i s f o l l o w s from (7-26) and n o t i n g t h a t

i=0
r

7.3-15

COROLLARY 1. p 0 ,(m) i s a p o l y n o m i a l i n m of
s , j

d e g r e e s .

PROOF. T h i s f o l l o w s from (7-28).

COROLLARY J . p (m) = m S
.

PROOF. T h i s f o l l o w s from (7-28).

COROLLARY k. p o , (m) = 1 + ( - l ) s + 1
C [ ' m - l , s ] .

PROOF. T h i s f o l l o w s from (7-23) and C o r o l l a r y c

COROLLARY £. The l e a d i n g c o e f f i c i e n t of p ,(m) i s

g i v e n by

.8 ^
4 ^ £ ( - l ^ C t j , ! ] ! 3
.

1=0

PROOF. T h i s f o l l o w s from (7-28) and

T
s,J =
^ i V - I (-D^tj,!]! . 3

1=0
r

7 . 3 - 1 6

COROLLARY ra. p n .(m) = 1 f o r e v e r y m such t h a t s ^ mj


s , j

PROOF. T h i s f o l l o w s from (7-23) and C o r o l l a r i e s d

and e .

COROLLARY n . p , ( l ) = 1.
s
> J

PROOF. T h i s f o l l o w s f^om C o r o l l a r y ( ( 7 - 2 8 ) .

COROLLARY o .

p s ^ ( m ) ( - l ) J + 1
C [ l / m , j ] = 1 , s = 1 , 2 , 3 , . . . .

J=l

PROOF.

J=l
r

7 . 3 - 1 7

i s t o h o l d f o r ^ a r b i t r a r y f. Take f ( x ) = x m
. Then

P(x) = f 1 / i n
( x ) = x and hence E s + 1 ( x , x , l ) = g s + 1 ( x , x m
, m ) = 0 ,

f o r s = 1 , 2 , 3 , . . . . I t i s not d i f f i c u l t t o show t h a t

Z j [ x , x m
, l ] - ( - l ) J + 1
C [ l / m , j ] x . Thus

0 = x - £ p s ^ ( m ) i ( - l ) J + 1
C [ l / m , j ] x ,

J-l

and t h e r e s u l t f o l l o w s .

C o r o l l a r y n shows t h a t f o r m = 1 ,

e s + 1 ( x , f , m ) .- x - £ p s ^ ( m ) Z J ( x , f , l )

r e d u c e s as e x p e c t e d t o

E g + 1 ( x ) = x - ^ Z j ( x , f , l ) .

J-l

Thus e g + 1 ( x , f , l ) - E g + 1 ( x ) .

C o r o l l a r y e l e a d s t o an i n t e r e s t i n g r e s u l t . From

( 7 - 1 4 ) and ( 7 - 2 3 ) ,

s s

e s + 1 ( x , f , m ) = x - ^ ^ a ^ j ( m ) Z j ( x , f , l ) .

j = l t = j
7 . 3 - 1 8

Then

00 00

l i m S s + 1 ( x , f , m ) = x - V V a t j j ( m ) Z J ( x J f , 1 )
> —+ 00 t—i, . _ .

00

= X - £ Z j ( x , f , l ) = l i m E s + 1 ( x , f , l )

j = l 3 0 0
7.3-19

7.33 Formulas f o r e . T a b l e 7-1 l i s t s some o f t h e


s

a„ . (m) e x p r e s s e d i n terms of C [ r m , t ] , w h i l e T a b l e 7-2 l i s t s


^9 J
some o f t h e a, . (m) e x p r e s s e d i n powers of m. T a b l e 7-3 l i s t s
^90

some o f t h e p o ^ ( m ) . Using T a b l e 7~3j r e c a l l i n g t h a t


S
9 J
Z j ( x ) = Y j ( x ) u ^ ( x ) , and u s i n g t h e e x p r e s s i o n s f o r Y
j ( x
) g i v e n

i n T a b l e 5~1* e n a b l e s us t o c a l c u l a t e a number of t h e

e s + 1 ( x , f , m ) ;

S = x - mu(x),
2

£ = x - mu(x)| (3-m) + mA(x)u(x)


Q 2

r
P Pi 2
= x - mu(xWg (m -6m+ll) + m(2-m)A(x)u(x) + m 2A|(x) - A (x)u (x)k
2
£
3

3 2 2
S = x - mu(x)|- ^ (m-10m+35m-50) + ~ m(7m-30m+35)A(x)u(x)
5 2

1m"(5-3m)
2 2Ag(x) - A (x)u (x) 3
2

+ nr 5A^(x) - 5A(x)A(x) + A (x) u(x)j


3
2 3 4

J
T

7 . 3 - 2 0

J
o
+

M
i—i
O

o
H
I

oo

o O
+ l

oo
s
OJ
o o
oo oo

OO -3-

o o
H oo oo
I

OJ oo
n
OJ
oj
o
t

o
CVI +
+
OJ OO -3-

o o o
OJ OJ OJ

H OJ oo -3-

O o o o

OJ oo
7.3-21

H
I

oo oo
CO
OJ
OO

CO

OJ
OJ OJ H
l
0

3
OJ

oo
I
0
OJ OJ
I H OJ
I
3
9 OJ
I

-3-
OJ

OJ OO
TABLE 7-3. p .(m) = ( - l ) S + J
\ S ,. . ,.T. .m . k

J
* s , jv
' K
' si s+l,k+l k , j

? i * 2 3 k

3
i

1 m

2
2 - (m/2)(m-3) m

(m/6)(m -6m+ll) - m (m-2)


3
3 2 2
m

4 - (m/24)(m -10m +35m-50) m


k
3 2
(m /l2)(7m -30m+35)
2 2
I - (m /2)(3m-5)
3

From Corollary" k, the expressions in t h i s column can be replaced by

p s , i ( m )
- 1 +
n
1=1
r

7.4-1

7o4 The C o e f f i c i e n t s of t h e E r r o r S e r i e s of £ g

The e r r o r s e r i e s f o r E ( x , f , l ) was s t u d i e d i n
s

S e c t i o n 5 . 5 . We now d e r i v e an a l g o r i t h m f o r c a l c u l a t i n g t h e

c o e f f i c i e n t s of the e r r o r s e r i e s of e ( x , f , m ) .
o

R e c a l l the d e f i n i t i o n s of the f o l l o w i n g symbols

which w i l l be used f r e q u e n t l y :

* M - m > v - ' - ^ - » , w ^

a . . - (x)

Observe t h a t B . n (x) = A . ( x ) .

L e t a be a z e r o of m u l t i p l i c i t y m. The e x p a n s i o n

of u(x) i n t o a power s e r i e s in e, which w i l l be needed b e l o w ,

i s d e r i v e d now. Throughout t h i s s e c t i o n , a l l f u n c t i o n s are

e v a l u a t e d a t a u n l e s s o t h e r w i s e i n d i c a t e d . D e f i n e a)^(m) by

(x) = £ o ) ( m ) e
t
4
. (7-29)
1=1
S i n c e

00 oo
r - 1
f(x) = ] [ ae, r
r
f'(x) = £ r a p e 3

r=m r=m
r

7 . 4 - 2

and m u ( x ) f ,
( x ) = m f ( x ) , we o b t a i n

l+l-m
Y C D
q ( m
) ( ^ + 1
" (
l ) a
t + l - q , 1 = m
' r a + 1
>
q=l

or

l-l

q=l

( 7 - 3 0 )

S i n c e ^ = A^, we o b s e r v e t h a t ( 7 ~ 3 0 ) r e d u c e s t o ( 5 -
3 4 ) when

m = 1 , and hence t h a t ^ ( l ) = as e x p e c t e d .

I t i s n o t d i f f i c u l t t o prove t h a t an e x p l i c i t

f o r m u l a f o r oo^(m) i s g i v e n by

r^ 1 J
[(m+i)B,_,, j " 1

» c-)t - - B t > B + - 1 B t . J > m ^ ( - D r


r ! n — 3 7 1 ^ -
j = i i = i

( 7 - 3 1 )

where r = > and where t h e i n n e r sum i s t a k e n o v e r a l l

A
n o n n e g a t i v e i n t e g e r s a ± such t h a t ) i a ± = j . Observe t h a t

1 = 1

o) p (m) i s the same f u n c t i o n of the B. as v 0 i s of the A.


t l f > m t i

e x c e p t f o r c o e f f i c i e n t s which depend on m o n l y . Prom e i t h e r

( 7 - 3 0 ) or ( 7 - 3 1 ) * the f i r s t few u^(m) niay be c a l c u l a t e d as


7.4-3

cu^m) = 1,

= _ B
2,m'

c 3 ( m ) = ( m + l ) B ^ m - 2 B ^ m ,

co (m) 4 = - ( m + l ) 2
B ^ m + ( S r n ^ B ^ ^ - 3 B ^ m .

We t u r n t o t h e problem of f i n d i n g the c o e f f i c i e n t s

of the e r r o r s e r i e s . D e f i n e 7\
P (m) by

Ks $ S
oo

e s ( x , f , m ) = ^ ^ t , s ( m
) e t
- (7-32)
1=0
S i n c e e ( x , f , m ) e I . we e x p e c t t» „ = 0 f o r 0 < I < s, and
o S C, S

t = ex. T h i s may be p r o v e n d i r e c t l y by i n d u c t i o n on s .

L e t s = 1. Then

e i ( x , f ,m) = x = a + ( x - a ) = a .+ e.

Now assume A (m) = a and A, (m) = 0, f o r 0 < I < s e

U, o v •S
S u b s t i t u t e ( 7 - 3 2 ) i n t o the f o r m u l a of Lemma 7 - 1 ,

S B + 1 ( x , f , m ) = C ( x , f , m ) - c ' ( x , f , m ) , (7-33)
t o f i n d

00 00

t=0 t=s t=S

w h i c h c o m p l e t e s the i n d u c t i o n .

S u b s t i t u t i n g (7-32) i n t o (7-33), u s i n g

oo
mu
(x) = Y
%
l
<» (m)e , t
1=1

i
and e q u a t i n g t o z e r o the c o e f f i c i e n t of e , we a r r i v e a t

THEOREM 7-3. Let the m u l t i p l i c i t y of a be m. L e t

00 00
l l
e a ( x , f , m ) = ^ -K (m)e , l)S mu(x) = ^' as (m)e .
t

1=0 ' 1=1


Then

l-l
s
\ , s + l ( m
) +
C ^ s J ^ s ^ ) +
^ r a 3
t + l - r ( m
^ r , s = ° '
r = l

(7-34)

w i t h ^ 0 j S ( m ) = a, ^ ^ ( m ) = 1, A ^ - ^ m ) = 0 f o r I > l , and

A t ^ g ( m ) = 0 f o r 0 < I < s and s > 1.


7.4-5

PQ
+

B
<N OO
Q oo cvi
pq g
pq^
OO OJ

+ +

-=J-
pq
s
« °°
OO <N
pq oo oj B
pq
I ^ oo
oo pq
q a + a
OJ OJ OJ CvJ
pq ^ pq
00 H oo
1 i i
«—riOJ r-rjCM OO

pq
oo
+
B B
OO oo

+ OJ
CvJ B ^ PQ
OJ *» -3-
pq OJ oj +

OO OJ
pq
OJ

r—1 Q H

^ O H CvJ OO ^j-
7.4-6

S i n c e the o)^(m) may be c o n s i d e r e d as known, (7"3^)

can be used t o d e t e r m i n e the 7v„ (m) . Note t h a t A- (m) i s t h e


v yS S
same f u n c t i o n of a),,(m) as t« _ i s of v . . Some of the A- (m)
v V/S C v^S
are l i s t e d i n T a b l e 7-4. U s i n g t h i s t a b l e we f i n d

e (.x,f,m)
2 - a = B g e + (m+l)B + 2B-
2^m

(m+l) B^ 2
- (3m+4)B p B- + 3B, ^

| ( ^ 3 ) B ^ m - B

|(m+l)(2m+7)B 3
+ 3(m+3)B B_ - 3B,

2 o
e^(x,f,m) - a = •Km +6m+8)B^ - (m+4)B Q B_ + B,,
7.5-1

7.5 I t e r a t i o n F u n c t i o n s G e n e r a t e d By D i r e c t I n t e r p o l a t i o n

In t h i s s e c t i o n we g e n e r a l i z e Theorem 4-3 t o t h e c a s e

of m u l t i p l e r o o t s . We have t o assume s t r o n g e r c o n d i t i o n s t h a n

i n t h e c a s e of s i m p l e r o o t s i n o r d e r t o c a r r y t h r o u g h our

a n a l y s i s .
7.5-2

7.51 The e r r o r e q u a t i o n * L e t x
J^ ^_IJ•••' i-
X x
n
b e

n + 1 9.pproilim^iitB t o a z e r o a o f m u l t i p l i c i t y m. L e t

o b e t h e
p o l y n o m i a l whose f i r s t s - 1 d e r i v a t i v e s a r e
n j s

e q u a l t o t h e f i r s t s - 1 d e r i v a t i v e s of f a t x
j ^ x
i _ i * • • • ' x
i - n *

D e f i n e a new approjtimaht t o a b y

P
n , s ( x
i + l ) - ° - (7-35)

Then r e p e a t t h i s p r o c e d u r e u s i n g t h e p o i n t s x
^ + ^ , x
i ^ * * • ' x
i - n + l

We assume t h a t we c a n f i n d a r e a l r o o t o f P w h i c h s a t i s f i e s
n, s

(7-35) . I f P„
n
0
s
h a s a number o f r e a l r o o t s 9 one o f them i s
9

c h o s e n a s b y some c r i t e r i a .

We have

^ ) - + n 1
n ( t - x ^ ) 3
,

where ^ ( t ) l i e s i n t h e i n t e r v a l d e t e r m i n e d b y

+•. = v
x
i ' x
l - l ' • ' x
i - n ' t
' a n d w h e r e r
= s ( n + l ) . S e t t = x
i + 1

Then

n
f^'te ]
f x 3
( i i) = + — n (-i+i-i-j) *
J=o
7-5-3

x
where = £±( ±+i) • S i n c e the m u l t i p l i c i t y of a i s

f
( x
i + l ) =
ml < x
i + l - a
) '

where T|^ + 1 l i e s i n the i n t e r v a l d e t e r m i n e d by x ^ + ^ and a.

L e t t i n g e
j__j = x
i - j " a
* w e a r r i v e a t

e
m
i + l =
/
^
n \ r ml
r T ~JjnT7Z
1
v

U
s

l
i + l

i + 1
y

;
rr

j=0
/
r
xs
11 ^ e
t „
i - j - e
i + l ) •

E q u a t i o n (7-36) i s the e r r o r e q u a t i o n f o r the c a s e of m u l t i p l e

r o o t s . We assume t h a t

m < r = s ( n + l ) . (7-37)

B e f o r e a n a l y z i n g t h i s e r r o r e q u a t i o n we i n v e s t i g a t e the r o o t s

of an i n d i c i a l e q u a t i o n .
7.5-4

7.52 On t h e r o o t s of an I n d i c i a l e q u a t i o n . In

S e c t i o n 3.3 we i n v e s t i g a t e d t h e p r o p e r t i e s of t h e r o o t s of

the p o l y n o m i a l e q u a t i o n

k-1
g k > a ( t ) = t k
- a £ t J
= 0, (7-38)

under t h e a s s u m p t i o n t h a t f o r k > 1,

k a > 1. (7-39)

The o r d e r of t h e I . F . w h i c h are b e i n g s t u d i e d i n S e c t i o n 7.5

i s d e t e r m i n e d by t h e r o o t s of t h e e q u a t i o n

m t n + 1
- s Y t j
= 0, (7-40)

w h i c h i s of t h e form (7-38) w i t h

k = n + 1, a = (7_4i)

S i n c e we demand t h a t m < r = s ( n + l ) , (7-39) i s s a t i s f i e d and

Theorem 3-2 becomes


7.5-5

THEOREM 7-4. L e t

n
n-fl s_
(t) = t t J
= 0
s
n + l , s / m m i

I f n = 0, t h i s e q u a t i o n has the r e a l r o o t ^ = s/m.

Assume n J> 1 a n d
m < r = s ( n + l ) . Then t h e e q u a t i o n has one

r e a l p o s i t i v e s i m p l e r o o t P n + 1 S / / m
and

max 1 ^ < - + 1,
± s
m ^ ^ n + l , s / m ^ m

F u r t h e r m o r e ,
j

e s/m . o • s , -i _
m
* + 1 " n+l ^ p
n + l , s / m ^ m n+1
m v

+ 1 m
+ 1
m

where e d e n o t e s t h e b a s e of n a t u r a l l o g a r i t h m s . Hence

= ^ + 1.
^ P
n + l , s / m =
m
n —• oo

A l l o t h e r r o o t s are a l s o s i m p l e and have m o d u l i l e s s t h a n one.

T a b l e s 7-5, 7-6, 7-7, and 7-8 g i v e v a l u e s of p.

a = s/m, f o r m = 1, 2, 3, and 4 r e s p e c t i v e l y . T a b l e 7-5 i s

i d e n t i c a l w i t h T a b l e 3-1 and i s r e p e a t e d here t o f a c i l i t a t e

c o m p a r i s o n w i t h the o t h e r t h r e e t a b l e s . The c a s e s i n w h i c h

the c o n d i t i o n ka > 1 i s v i o l a t e d are l e f t b l a n k .


r

7.5-6

TABLE 7-5. VALUES OF P k

"a 1/1 2/1 3/1 V i

1 2.000 3.000 4.000

2 1.618 2.732 3.791 4.828

3 1.839 2.920 3.951 4.967

4 1.928 2.974 3.988 4.994

5 1.966 2.992 3.997 4.999

6 1.984 2.997 3.999 5.000

7 1.992 2.999 4.000 5.000


7.5-7

TABLE 7 - 6 . VALUES OP 0.
K, d

"a 1 / 2 2/2 3/2 4/2

i k

1 1.500 2.000

2 1.618 2.186 2.732

3 1.234 1.839 2.390 2.920

4 1.349 1.928 2.459 2.974

5 1.410 1.966 2.484 2.992

6 1.445 1.984 2.494 2.997

7 1.466 1.992 2.498 2.999


7.5-8

TABLE 7-7. VALUES OP p.

"a 1 / 3 2/3 3/3 4/3

i k
1 1.333

2 1.215 1.618 2.000

3 1.446 1.839 2.210

4 1.126 1.552 1.928 2.284

5 1.199 1.604 1.966 2.313

6 1.243 1.631 1.984 2.325

7 1.271 1.646 1.992 2.330


I

7 . 5 - 9

TABLE 7-8. VALUES OF P k &

a* 1 / 4 2/4 3/4 4/4

\ k

2 1.319 I.618

3 1.234 1.548 1.839

4 1.349 1.648 1.928

5 1.079 1.410 1.697 1.966

6 1.130 1.445 1.721 1.984

7 1.163 1.466 1.734 1.992


1

7.5-10

7.53 The o r d e r . We r e t u r n t o the a n a l y s i s of the

e r r o r e q u a t i o n ,

T i
M
e
+ = ± n ( e
i . j - e
i + i ) s
>

(7-42)

^ f ( r )
U )

We s h a l l assume t h a t

e
i + l
- T ^ - 0
- (7-^3)

Hence e^ 0.

We may r e w r i t e (7-42) as

'i+1
J=0

where

(7-45)

Prom (7-43), we c o n c l u d e t h a t A ± 1. Assume t h a t e ± does n o t

v a n i s h f o r any f i n i t e i . Let

a
i = l n 5
i = l n | e ± | , T ± = l n | T 1 | . (7-46)
r

7.5-11

Prom (7-44),

mo T + s
i+1
±
' 1 - J '
j=0

or

n
T
^ + ^ (7-47)
'i+1 m m

Now, (7-47) i s i d e n t i c a l w i t h (3-34), e x c e p t t h a t T ± / m r e p l a c e s

J± and s/m r e p l a c e s s . Observe t h a t

mi f ( r )
( a )
T ± In
f W
( a )

whereas

J ± -•> I n K .

F u r t h e r m o r e , (3-43) i s r e p l a c e d by

n
m x

I p - i

J=0

By methods a n a l o g o u s t o t h o s e used i n the proof of Theorem 3 - 3

one may show t h a t

( p - l ) / ( r - m )
'i+1 m i f ^ ( c t )

We summarize our r e s u l t s i n
7.5-12

THEOREM 7-5. L e t

J = | x | x - a | ^ rj*

where a i s a z e r o of m u l t i p l i c i t y m. L e t r = s ( n + l ) and

assume t h a t m < r . L e t f ( r
) be c o n t i n u o u s and l e t f ( m
) f ( r
)

be n o n z e r o on J . L e t x - x . . , . . . e J and d e f i n e a
o 1* n

sequence { x ^ a s f o l l o w s : L e t P N g be an i n t e r p o l a t o r y

p o l y n o m i a l f o r f such t h a t t h e f i r s t s - 1 d e r i v a t i v e s o f

P a r e e q u a l t o t h e f i r s t s - 1 d e r i v a t i v e s of f a t t h e
n , s

p o i n t s x
i ^ x
i » i > • • • > x
i - n ' Assume t h a t t h e r e e x i s t s a r e a l

number, JL+1
x
£ J
# such t h a t P N s ( x
i + i ) =
° * D e f i n e <& n g b y

x
l + l " ° n , s ( x
i 5 x
i - l ' # 0
- ' x
i - n ) #

Let e ± _ j = x ± _ j - a and assume t h a t e ± does n o t v a n i s h f o r

a n y f i n i t e i b u t t h a t e
i + 1 / e
j _ ~* ° «

Then

( p - l ) / ( r - m )

i+ll
; m l f ( r )
( a )
(7-48)
P r I
f < m
> ( a )
l i e

where p i s t h e unique r e a l p o s i t i v e r o o t of

n+1 - 2-
m
Y
/_j
t J
= 0 .

j=0
7 . 5 - 1 3

Note t h a t (7-48) r e d u c e s t o (4-30) i f m = 1 . For

the case m = 1, we showed t h a t e^ 0 p r o v i d e d the i n i t i a l

e r r o r s were s u f f i c i e n t l y s m a l l . For the case m > 1, we

assume e u i /e, 0 , which i m p l i e s e. - * 0 „


7.54 D i s c u s s i o n and e x a m p l e s . The r e s u l t s of

S e c t i o n 7*53 are v e r y s a t i s f y i n g ; the o n l y p a r a m e t e r s t h a t

a p p e a r i n t h e c o n c l u s i o n of Theorem 7~5 are t h e o r d e r , the

p r o d u c t of t h e number of new p i e c e s of i n f o r m a t i o n w i t h t h e

number of p o i n t s a t w h i c h i n f o r m a t i o n i s u s e d , and t h e m u l t i -

p l i c i t y of the z e r o . The e f f e c t of the m u l t i p l i c i t y i s t o

r e d u c e t h e f a c t o r s , w h i c h a p p e a r s l i n e a r l y i n the e q u a t i o n

w h i c h d e t e r m i n e s the o r d e r , t o s/m. For n f i x e d , the o r d e r

depends o n l y on the r a t i o of s t o m. Thus the o r d e r i s the

same f o r n = l , s = 1, m = 1 ( s e c a n t I . F . ) , as f o r n = 1,

s ~ 2, m = 2. F u r t h e r m o r e , the l i m i t i n g v a l u e of t h e o r d e r

as n -* 00 i s s i m p l y 1 + s/m.

EXAMPLE 7-2. s = 3, n = 0. T h i s I . F . was d i s c u s s e d

i n S e c t i o n 5.32 f o r t h e case of s i m p l e r o o t s . I f m = 1,

e
i

I f m = 2,
r

7 . 5 - 1 5

EXAMPLE 7 - 3 . s - 1 , n - 2 . T h i s I . F . i s d i s c u s s e d

i n S e c t i o n 1 0 . 2 1 f o r t h e case of s i m p l e r o o t s . I f m = 1 ,

' i + 1
|A.(a)p ( p _ l )
, p ^ 1.84,
1*1

I f m = 2,

p - 1
'i+11 1 f'"(a
p <v 1 . 2 3 .
3 f"(a
T

7 - 6 - 1

7 . 6 O n e - P o i n t I . F . With Memory

A number of t e c h n i q u e s f o r c o n s t r u c t i n g o n e - p o i n t

I . F . w i t h memory f o r the case of m u l t i p l e z e r o s are g i v e n i n

t h i s s e c t i o n . ^ S i n c e t h e s e t e c h n i q u e s are v a r i a t i o n s on

e a r l i e r themes we s h a l l p a s s o v e r them l i g h t l y .

The I . F . s t u d i e d i n S e c t i o n 7.5 are o n e - p o i n t w i t h

memory i f n > 0. We showed t h a t i f m < r = s ( n + l ) , t h e n

bounds on the o r d e r p a r e g i v e n by

max 1, —

L M

S i n c e u = f / f ' has o n l y s i m p l e z e r o s , we can a p p l y t h e

t h e o r y which p e r t a i n s t o s i m p l e z e r o s by r e p l a c i n g f and i t s

d e r i v a t i v e s by u and i t s d e r i v a t i v e s . As an e x a m p l e , we

c o n s i d e r d i r e c t i n t e r p o l a t i o n s t u d i e d i n S e c t i o n 4.23. The

c o n c l u s i o n of Theorem 4-3 s t a t e s t h a t

( p - l ) / ( r - l )
.(r)

L e t <& ( u )
Q be t h e I . F . g e n e r a t e d from 0 by r e p l a c i n g f
n, s n, s

by u. Then

( p - l ) / ( r - l )
u < r
\ a
r lu' ( a
x - a | p
I

7 - 6 - 2

In S e c t i o n 7 . 4 , co, ( m ) was d e f i n e d b y

00

mu
(x) = Y u We >
t
l e = x - a.

1=1
Hence

y
\x-a\

In p a r t i c u l a r ,

x
i " x
i - l
( 7 - 5 0 )
l , l ( u )
- X
i " U
i
u
i " u
i - l

i s of o r d e r § ( 1 + ^ 5 ) ^ 1 . 6 2 f o r a l l m. I t i s n o t d i f f i c u l t

t o show t h a t t h e f i r s t two terms of t h e e r r o r s e r i e s a r e

g i v e n by

e.e* B. - J t e i ,
* L , l ( u )
" a
~ - B
2 , m ( a ) e
i e
i - l i j.m ma 9

9
m

( 7 - 5 1 )
B = a
J+"i-l a = f ( j )

j,m ma m
9 a
j JT
T

7.6-3

^ has an i n f o r m a t i o n a l usage of two and an

i n f o r m a t i o n a l e f f i c i e n c y of .81 f o r a l l m. *]_ ^ s u f f e r s f r o m

t h e drawback t h a t as the a p p r o x i m a n t s c o n v e r g e t o a, u

a p p r p a c h e s a o/o form w h i c h may n e c e s s i t a t e m u l t i p l e p r e c i s i o n

a r i t h m e t i c .

I f m i s known, a number of o t h e r t e c h n i q u e s are

a v a i l a b l e . S i n c e f^ 1 7 1
" ) 1
has o n l y s i m p l e z e r o s , t h e t h e o r y

w h i c h p e r t a i n s t o s i m p l e z e r o s i s a p p l i c a b l e t o t h e I . P .

g e n e r a t e d by r e p l a c i n g f by f ( m - 1
^ # Such I . F . have t h e a d v a n -

t a g e t h a t no o/o forms o c c u r ; t h e y have t h e d i s a d v a n t a g e t h a t

r a t h e r h i g h d e r i v a t i v e s of f must b e ' u s e d .

We can a l s o r e p l a c e f by F = f ^
1 1 1 1
. We can perform

d e r i v a t i v e e s t i m a t i o n on the new I . F . For example, d e f i n e

* F n a n a l o g o u s l y t o *f n [ . (See S e c t i o n 6.22.) I t i s c l e a r

t h a t t h e I . F .

* n E
1 ( F ) = x
" ( "
7 5 2
)
*n

has o r d e r s r a n g i n g from ^(1+^/5) t o 2, as a f u n c t i o n of n.

As a f i n a l e x a m p l e , we perform d e r i v a t i v e e s t i m a t i o n

on t h e second o r d e r I . F . ,

f
£ 2 *b x - m
7.6-4

D e f i n e

* n
£
1 = x
" m
^ " 7 5 3
)
n

I t may be shown t h a t the o r d e r of t h i s I . F . r a n g e s from 1 t o

^ ( l + v / 5 ) as a f u n c t i o n of n and m. ( s e c a n t I . F . ) i s of

l i n e a r o r d e r f o r a l l n o n s i m p l e z e r o s . The e s s e n t i a l d i f f e r e n c e

b e t w e e n (7-53) and (7-52) i s t h a t the denominator on the r i g h t

s i d e of (7-53) i s a l i n e a r c o m b i n a t i o n of ?±_y 0 < j £ n,

whereas t h e d e n o m i n a t o r on t h e r i g h t s i d e of (7-52) i s a

l i n e a r c o m b i n a t i o n of f ^ j , 0 ^ j ^ n.
1!

7.7-1

7.7 Some G e n e r a l R e s u l t s

The f o l l o w i n g s t a t e m e n t s t y p i f y the r e s u l t s which

we have o b t a i n e d i n t h i s c h a p t e r :

a. E . s = 2,3*...., i s of l i n e a r order f o r a l l non-


s
simple z e r o s .

b . S . s = 2,3,...., i s o p t i m a l ; i t depends e x p l i c i t l y
s ^

on m and i t s o r d e r i s m u l t i p l i c i t y - i n d e p e n d e n t .

c. A n o n o p t i m a l m u l t i p l i c i t y - i n d e p e n d e n t L P . may be

g e n e r a t e d from any I . F . by r e p l a c i n g f and i t s

d e r i v a t i v e s by u and i t s d e r i v a t i v e s .

d. $ , g e n e r a t e d by d i r e c t i n t e r p o l a t i o n , i s m u l t i -
n, s

p l i c i t y d e p e n d e n t ^ b u t not of l i n e a r o r d e r i f

m < r = s(n+1).

In t h i s s e c t i o n we s h a l l make some g e n e r a l remarks,

F i r s t we s u g g e s t the f o l l o w i n g

^ CONJECTURE. I t i s i m p o s s i b l e t o c o n s t r u c t an o p t i m a l

I . F . which does not depend e x p l i c i t l y on m and which i s

m u l t i p l i c i t y - i n d e p e n d e n t .
7.7-2

We have e x c l u d e d e x p l i c i t dependence on m because

of b and we have r e s t r i c t e d o u r s e l v e s t o o p t i m a l I . F . because

of c .

The f o l l o w i n g i n c o r r e c t d i s c u s s i o n by Bodewig [7.7-1]

i l l u s t r a t e s t h e p i t f a l l s which one may e n c o u n t e r i n the s t u d y

of m u l t i p l e r o o t s . Bodewig a r g u e s as f o l l o w s : Since we

r e q u i r e cp(a) = a, we t a k e

cp(x) = x - f (x)H(x) .

A n e c e s s a r y c o n d i t i o n t h a t cp b e of second order i s t h a t

cp/(a) = 0 = 1 - f'(a)H(a) - f ( a ) H ' ( a ) . (7-5*0

S i n c e f ( a ) and f ' ( a ) a r e b o t h z e r o f o r m > 1, Bodewig r e a s o n s

t h a t t h e second and t h i r d terms i n (7-54) v a n i s h i f a i s a

m u l t i p l e r o o t . This i g n o r e s t h e p o s s i b i l i t y t h a t H(x) has a

s i n g u l a r i t y a t x = a. In Newton's I . F . , f o r example,

H(x) = l / f ' ( x ) . Hence f ( a ) H ' ( a ) = (l-m,)7m w h i l e

f ' ( a ) H ( a ) = 1 and none of t h e terms of (7-5*0 v a n i s h when

m > 1. Taking H(x) = m / f ' ( x ) or H(x) = l / [ f ' ( x ) u ' ( x ) ] shows

t h a t B o d e w i g ' s c o n c l u s i o n t h a t cp'(a) ^ 0 f o r m u l t i p l e r o o t s

i s i n c o r r e c t .

The above r e a s o n i n g l e d t o d i f f i c u l t i e s because t h e

c o n d i t i o n cp(a) = a, which i m p l i e s f ( x ) H ( x ) 0, p e r m i t s H(x)

"1 m

t o be of 0 [ ( x - a ) " ] . Since u ( x ) = 0 ( x - a ) , we can a v o i d t h e

d i f f i c u l t y by t a k i n g
7 . 7 - 3

cp(x) = x - u ( x ) h ( x ) .

Then

<p(x) - a = x - a - + 0 [ ( x - a ) 2
] | h ( x )

= ( x - a ) + h ( x ) 0 [ ( x - a ) 2
] .
m

L e t h be c o n t i n u o u s i n a c l o s e d i n t e r v a l about a. Then cp i s

of second o r d e r i f and o n l y i f

h ( x ) - m = O ( x - a ) . ( 7 - 5 5 )

In p a r t i c u l a r , we r e q u i r e

h ( a ) = ra. ( 7 - 5 6 )

I f h ' i s c o n t i n u o u s , t h e n ( 7 - 5 6 ) i m p l i e s ( 7 " 5 5 ) . E q u a t i o n ( 7 - 5 5 )

i s e q u i v a l e n t t o

h ( x
u | x j m
- D ^ O . ( 7 - 5 7 )

We summarize our r e s u l t s i n
7.7-4

THEOREM 7-6. L e t cp(x) = x - u ( x ) h ( x ) „ Let a be a

r o o t of m u l t i p l i c i t y m. I f h i s c o n t i n u o u s i n a c l o s e d inter-

v a l a b o u t a, t h e n cp i s of second o r d e r i f and o n l y i f

h ( x ) - m n / n

I f h ' i s c o n t i n u o u s t h e n cp i s of second o r d e r i f and o n l y i f

h ( a ) = m.

For I . F . of second o r d e r , the C o n j e c t u r e w h i c h we

s t a t e d e a r l i e r i n t h i s s e c t i o n i s e q u i v a l e n t t o t h e f o l l o w i n g

L e t cp(x) = x - u ( x ) h ( x ) , Where h does n o t depend upon any

d e r i v a t i v e s of f h i g h e r t h a n t h e f i r s t and does n o t depend

e x p l i c i t l y on m. L e t h ( x ) be a c o n t i n u o u s l y d i f f e r e n t i a b l e

f u n c t i o n of x . Then i t i s i m p o s s i b l e t h a t h ( a ) = m f o r a l l f

whose z e r o s have m u l t i p l i c i t y m.

In S e c t i o n 7-8 we c o n s t r u c t an h w h i c h depends o n l y

on f and f' and f o r w h i c h h - * m . But t h i s h i s n o t c o n t i n u -

o u s l y d i f f e r e n t i a b l e a t a.
7 . 8 - 1

7 . 8 An I . F . Of Incommensurate Order

LEMMA 7 - 2 . Let f ( m + 1
) be c o n t i n u o u s i n t h e n e i g h b o r -

hood of a z e r o a of m u l t i p l i c i t y m. Let

Then h ( x ) -»• m .

PROOF. L e t

f ( x ) = ( x - a ) m
g ( x ) . ( 7 - 5 8 )

Then

g ( x ) - g ( a ) - f ( m
m
}
, ( a )
t 0.

L e t

a ( x ) . ( x - a ) ^ f | f .

Then G(x) = O ( x - a ) and

u(x) = * = g . ( 7 - 5 9 )

From ( 7 - 5 8 ) and ( 7 - 5 9 )

, x _ l n l f ( x ) I _ m l n l x - a l + l n | g | _^ m

w
" l n | u ( x ) I ~ l n l x - a - ln|m+G| '
7.8-2

LEMMA 7-3. L e t f and h be as i n Lemma 7-2. Then

uh' -* 0.

PROOF. We s h a l l assume f o r s i m p l i c i t y t h a t f and

f ' !
are p o s i t i v e ; the r e s u l t i s t r u e i n g e n e r a l . Then f o r

x £ a,

h ,(„) - 1 , _ m f ( x ) u ' ( x )
<*> ~ u l n [ u ( x ) j { l n [ u ( x ) ] ) 2 u ( x ) '

Hence

( \ 1 h ( x ) u ' ( x ) '
» ( ) h ' ( x )
x
- l n [ u ( x ) ] " l n f u ( x ) j -

The f a c t t h a t h -+ m, u' l / m , l n [ u ] -oo c o m p l e t e s the p r o o f

LEMMA 7-4. Let f and h be d e f i n e d as In Lemma 7-2.

Let

cp(x) = x - u ( x ) h ( x ) . (7-60)

Then cp' ( x ) 0 .

PROOF. For x ^ a, h and hence cp a r e d i f f e r e n t i a b l e

and

q)'(x) = 1 - u ' ( x ) h ( x ) - u ( x ) h ' ( x ) .

Note t h a t h -+ m from Lemma 7-2, uh' 0 from Lemma 7-3, and

u' l / m ; the c o n c l u s i o n f o l l o w s I m m e d i a t e l y .
1

7.8-3

We have shown t h a t h , which depends o n l y on f

and f , has the p r o p e r t y t h a t h - * m . S i n c e cp' 0, one might

hope t o c o n c l u d e t h a t cp i s second o r d e r b u t t h i s i s not the

c a s e . I t may be shown t h a t (h-m)/u -+ oo. Then an a p p l i c a t i o n

of Theorem 7-6 shows t h a t cp i s not second o r d e r . I n s t e a d we

have

THEOREM 7-7. Let h and f s a t i s f y the c o n d i t i o n s of

Lemma 7-2 and l e t cp = x - uh. Then

1/m.
f ( m )
( q )
- ln( m
m •
(7-61)

PROOF. S i n c e

I n If
cp = x - u h = x - u
l n | u | '

we h a v e

cp-q
I n I x - a = I n I x - q I 1 _ I n l f
x - q x - q I n u

R e c a l l t h a t

n _ l n | f I m m I n I x - q I + l n l g l x - q
l n | u | I n x - q | - ln|ra+G ' u
" m+G
7.8-4

where

f = ( x - a ) m
g , G = ( x - a ) ^ .

The f a c t t h a t

g^i^fll, Q . o ( x - a )
&
m i ^ —\ /

p e r m i t s t h e c o m p l e t i o n of t h e p r o o f .

T h i s theorem shows t h a t the o r d e r of cp i s incom-

m e n s u r a t e w i t h t h e order s c a l e d e f i n e d by (1-14) . Observe

t h a t i f e i s an a r b i t r a r y p r e a s s i g n e d p o s i t i v e number, t h e n

cp-a
x - a 9

x - a
1+e

We can w r i t e (7-61) as

1/m
f ( m )
( a )
ln^m
mi
cp-q
C = -
x - a l n | x - a |

Thus cp c o n v e r g e s l i n e a r l y and i t s a s y m p t o t i c e r r o r c o n s t a n t

g o e s l o g a r i t h m i c a l l y t o z e r o . I t i s c l e a r t h a t the sequence

of x 1 g e n e r a t e d by cp c o n v e r g e s i f x Q i s s u f f i c i e n t l y c l o s e t o

In p a r t i c u l a r , l e t f ( x ) = ( x - a ) m
. Then C = - l n ( m ) / l n | x - a |

and C < 1 i f | x - a | < l / m .


f

7-8-5

The f o l l o w i n g g e n e r a l i z a t i o n s u g g e s t s i t s e l f .

R e c a l l t h a t

h = l n f
1 = m l n l x - a l + l n l g l
In u| l n | x - a | - ln|m+G|'

where ln|m+G| -» In m. S i n c e h -+ m, t h i s s u g g e s t s t h a t we can

a c c e l e r a t e t h e c o n v e r g e n c e of h t o m by d e f i n i n g

h = l n
l f
1 h - m l f
X 1
2 l n | u | + l n | h 1 | ^ n
l " I n j u r

or more g e n e r a l l y

h l n | f
n
i + l " l n j u j + l n j l i j ' 1
" 0
> - ^ - - - ; h
0 - L

T h i s , i n t u r n , s u g g e s t s i m p l i c i t l y d e f i n i n g h. by

^ =
l n | u j n
+ f
l n | i i | • (7-62)

For f = ( x - a ) m
, (7-62) may be d e r i v e d from a n o t h e r

p o i n t of v i e w . We have u = ( x - a ) / m and

( m u ) m
= ( x - a ) m
= f,

and h e n c e

m - In 1f1
~ l n | u | + l n nT

The a n a l y s i s of the o r d e r of I . F . of t h e form cp = x - u h ± and

cp = x - uti has n o t been c a r r i e d o u t .


7.8-6

A f t e r f and f' have been c a l c u l a t e d , i t i s n o t

e x p e n s i v e t o c a l c u l a t e h = l n | f | / l n | u | . In a g e n e r a l r o o t -

f i n d i n g r o u t i n e i t m i g h t be w o r t h w h i l e t o a l w a y s c a l c u l a t e h

A f t e r the l i m i t of h has been d e t e r m i n e d t o the n e a r e s t

i n t e g e r , t h e r o u t i n e can s w i t c h t o <5 2 = x - mu.

You might also like