
Chapter 3

Calculus of Variations

3.1 Introduction
The calculus of variations deals with functionals, which are, put simply, functions of a function. For example, the methods of the calculus of variations can be used to find an unknown function that minimizes or maximizes a functional. Many of its methods were developed over two hundred years ago by Euler (1707-1783), Lagrange (1736-1813), and others, and it continues to supply important techniques to many branches of engineering and physics.

3.2 Functionals
As we have seen in the last section, there exists a great variety of physical problems that deal
with functionals, which are functions of a function. We are familiar with the definition of a
function. A function can be regarded as a rule that maps one number (or a set of numbers) to
another value. For example,
f(x) = x² + 2x

is a function, which maps x = 2 to f(x) = 8, and x = 3 to f(x) = 15, etc. On the other hand, a
functional is a mapping from a function (or a set of functions) to a value. That is, a functional
is a rule that assigns a real number to each function y(x) in a well-defined class. Like a function,
a functional is a rule, but its domain is some set of functions rather than a set of real numbers.
We can consider F[y(x)] as a functional for the fixed values of x. For example,

F[y(x)] = 3y² − y + 10

where

y(x) = eˣ + cos x − x,   evaluated at x = π,

is a functional. Another class of functional has the form
J[y] = ∫_a^b y(x) dx


Here J gives the area under the curve y = y(x). Hence J is not a function of x and its value will
be a number. However, this number depends on the particular form of y(x) and hence J[y] is a
functional. For a = 0 and b = π, the value of the functional when y(x) = x is

J[y] = ∫_0^π x dx = π²/2 ≈ 4.93

and when y(x) = sin x,

J[y] = ∫_0^π sin x dx = 2

Therefore the given functional J[y] maps y(x) = x to π²/2 and maps y(x) = sin x to 2. Because
an integral maps a function to a number, a functional usually involves an integral. The following
form of functional often appears in the calculus of variations,
J[y] = ∫_a^b F(x, y, y′) dx        (3.1)

The fundamental problem of the calculus of variations is to find the extremum (maximum or
minimum) of the functional (3.1).

3.3 First Variation of Functionals


Consider a function y = f (x). When the independent variable x changes to x + ∆x, then the
dependent variable y changes to f (x + ∆x) = f (x) + ∆ f (x), where ∆ f is the total change in the
function. ∆ f can be computed by expanding f (x + ∆x) using Taylor series. Thus,
f(x + Δx) = f(x) + (df/dx) Δx + (d²f/dx²) Δx²/2! + (d³f/dx³) Δx³/3! + ...

Δf ≡ f(x + Δx) − f(x) = (df/dx) Δx + (d²f/dx²) Δx²/2! + (d³f/dx³) Δx³/3! + ...        (3.2)
By definition, the differential d f of the function f (x) is how much f changes if its argument, x,
changes by an infinitesimal amount ∆x. That is
df = lim_{Δx→0} Δf = (df/dx) Δx        (3.3)
Comparing (3.2) and (3.3), we see that the differential d f is the linear part of the total change
∆ f . That is
∆ f = d f + higher-order terms in ∆x (3.4)
In line with the definition of the differential of a function f(x), we now introduce the concept of the
variation of a functional F[y(x)]. Let y(x) be changed to y(x) + δy(x), where
δy(x) is the vertical displacement of the curve y(x). It is known as the variation of y and is
denoted by δy. We introduce an alternative function of the form

Y(x) = y(x) + δy(x)        (3.5)



This is illustrated in figure 3.1, where y(x) is shown in red and Y(x) in blue.
By definition, the total change in the functional is given by

ΔF[y] = F[y(x) + δy(x)] − F[y(x)] = F[Y(x)] − F[y(x)]        (3.6)

[Figure 3.1: Plot of y(x) and a small variation from it.]

If η(x) is an arbitrary differentiable function that vanishes at the boundaries of the domain, i.e.,
η(a) = 0 and η(b) = 0, then the variation δy(x) can be represented as

δy(x) = εη(x),   y, η ∈ A        (3.7)

where ε is an arbitrary parameter independent of x. This definition enables us to write equation
(3.5) in the following form,

Y = y + εη        (3.8)
Now from (3.6), the total change in functional F is given by
∆F = F[y + εη ] − F[y] (3.9)
Using Taylor series, we can expand the first term on R.H.S. as
F[Y] = F[y + εη] = F[y] + (dF/dy) ηε + (d²F/dy²) η²ε²/2! + (d³F/dy³) η³ε³/3! + ...        (3.10)
Rearranging equation (3.10) to obtain the change in functional F:
ΔF = F[y + εη] − F[y] = (dF/dy) ηε + higher-order terms        (3.11)
By definition, first variation of a functional F[y], denoted by δ F, is how much F changes if its
argument, y, changes by an infinitesimal amount δ y. Therefore,
δF = lim_{ε→0} ΔF = lim_{ε→0} (F[y + εη] − F[y]) = (dF/dy) ηε = (dF/dy) δy        (3.12)
which shows that δ F is given by the linear part of the equation (3.11). Thus, the change in
functional F[y] and its first variation is related by the equation
∆F = δ F + higher-order terms (3.13)

Let us now define what is called the Gâteaux derivative or Gâteaux variation in the direction of
η(x). It is denoted by δF[y; η] and is defined as

δF[y; η] = lim_{ε→0} ΔF/ε = lim_{ε→0} (F[y + εη] − F[y])/ε = (d/dε) F[y + εη] |_{ε=0}        (3.14)

Note that the first variation and the Gâteaux variation are related through the parameter ε,
i.e., δF₍fv₎ = ε δF₍gv₎, where we have denoted the first variation by δF₍fv₎ and the Gâteaux variation by
δF₍gv₎. Unfortunately, in the literature, these two variations are denoted by the same symbol
δF.
Let us look at the meaning of η and ε geometrically. Since y is the unknown function to
be found so as to extremize a functional, we want to see what happens to the functional F[y]
when we perturb this function slightly. For this, we take another function η and multiply it by
a small number ε . We add εη to y and look at the value of F[y + εη ]. That is, we look at
the perturbed value of the functional due to perturbation εη . This is the shaded area shown in
figure 3.2. Now as ε → 0, we consider the limit of the shaded area divided by ε . If this limit
exists, such a limit is called the Gâteaux variation of F[y] at y for an arbitrary but fixed function
η.
[Figure 3.2: Plot of y(x) and its variation.]

Note that choosing a different η gives a different set of varied curves and hence a different
variation. Hence δ F[y; η ] depends on which function η is chosen to define the increment δ y
and this dependence is explicitly shown in the notation.
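The limit defining the Gâteaux variation in (3.14) can be checked directly with a computer algebra system. The sketch below (assuming SymPy is available; the curve y = sin x and the direction η = x(π − x) are illustrative choices) evaluates the difference quotient for the area functional J[y] = ∫₀^π y dx:

```python
import sympy as sp

x, eps = sp.symbols('x epsilon')
y = sp.sin(x)                 # the base curve
eta = x * (sp.pi - x)         # a fixed direction eta(x), with eta(0) = eta(pi) = 0

# Area functional J[y] = integral of y from 0 to pi
def J(f):
    return sp.integrate(f, (x, 0, sp.pi))

# Difference quotient (J[y + eps*eta] - J[y]) / eps
quotient = sp.simplify((J(y + eps * eta) - J(y)) / eps)

# Gateaux variation: d/deps of J[y + eps*eta], evaluated at eps = 0
gateaux = sp.diff(J(y + eps * eta), eps).subs(eps, 0)

print(quotient)   # pi**3/6
print(gateaux)    # pi**3/6
```

Because this J is linear in y, the difference quotient is already independent of ε, so it coincides with its limit, the Gâteaux variation ∫₀^π η dx = π³/6.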

First variation of the functional F[x, y, y′, y″]

We now consider the first variation of the functional F[x, y, y′, y″]

for fixed values of x. If y changes to y + εη, then y′ changes to y′ + εη′ and y″ changes to
y″ + εη″. From equation (3.8), we have

Y = y + εη
Y′ = y′ + εη′
Y″ = y″ + εη″

The new value of the functional is then

F[x, Y, Y′, Y″] = F[x, y + εη, y′ + εη′, y″ + εη″]

where εη′ is known as the variation of y′ and is denoted by δy′. Similarly, εη″ is known as the
variation of y″ and is denoted by δy″. The change in the functional F is then defined as

ΔF = F[x, y + εη, y′ + εη′, y″ + εη″] − F[x, y, y′, y″]        (3.15)

Using Taylor series, we can expand the first term on the R.H.S. as

F[x, y + εη, y′ + εη′, y″ + εη″] = F[x, y, y′, y″] + (∂F/∂y η + ∂F/∂y′ η′ + ∂F/∂y″ η″) ε
    + (∂²F/∂y² η² + ∂²F/∂y′² η′² + ∂²F/∂y″² η″² + 2 ∂²F/∂y∂y′ ηη′ + 2 ∂²F/∂y∂y″ ηη″ + 2 ∂²F/∂y′∂y″ η′η″) ε²/2! + ...

Rearranging the above Taylor series expansion, we obtain the change in the functional F:

ΔF = (∂F/∂y η + ∂F/∂y′ η′ + ∂F/∂y″ η″) ε
    + (∂²F/∂y² η² + ∂²F/∂y′² η′² + ∂²F/∂y″² η″² + 2 ∂²F/∂y∂y′ ηη′ + 2 ∂²F/∂y∂y″ ηη″ + 2 ∂²F/∂y′∂y″ η′η″) ε²/2! + ...
In analogy with the differential of a function, the linear part of ΔF is called the
first variation of the functional F. Therefore,

δF = (∂F/∂y) ηε + (∂F/∂y′) η′ε + (∂F/∂y″) η″ε        (3.16)
Since

δy = εη,   δy′ = εη′,   δy″ = εη″

the variation of F can be written as

δF = (∂F/∂y) δy + (∂F/∂y′) δy′ + (∂F/∂y″) δy″        (3.17)

Now, the total differential dF of a function F(x, y, y′, y″), when x is considered fixed, is given by

dF = (∂F/∂y) dy + (∂F/∂y′) dy′ + (∂F/∂y″) dy″

Formula (3.17) for δF has the same form as the above formula for dF. Thus the variation of F
is given by the same formula as the differential of F, if x is considered to be fixed.
It is to be noted that the differential of a function is the first-order approximation to the
change in that function along a particular curve, while the variation of a functional is the first-order
approximation to the change in the functional from one curve to another.
We mention here that the sum of the terms in ε and ε² is called the second variation of F, and
the sum of the terms in ε, ε², and ε³ is called the third variation of F. However, when the term
variation is used alone, the first variation is meant.

Some rules of variational calculus

The variational operator δ follows the rules of the differential operator d of calculus. Let F1 and F2
be any continuous and differentiable functionals. Then we have the following results:

• δ(Fⁿ) = n Fⁿ⁻¹ δF

• δ(F1 + F2) = δF1 + δF2

• δ(F1 F2) = F1 δF2 + F2 δF1

• δ(F1/F2) = (F2 δF1 − F1 δF2) / F2²
It is easy to show that the operators d/dx and δ are commutative. The commutative property
may be written mathematically as

(d/dx)(δy) = δ(dy/dx)

The proof is as follows:

(d/dx)(δy) = (d/dx)(εη) = ε (dη/dx) = εη′ = δy′ = δ(dy/dx)
That is, the differential of the variation of a function is identical to the variation of the differential
of the same function.
Another commutative property is the one that states that the variation of the integral of a
functional F is the same as the integral of the variation of the same functional, or mathematically

δ ∫ F dx = ∫ δF dx

Note that the two integrals must be evaluated between the same two limits.
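These rules can be verified from the Gâteaux definition (3.14). A small sketch (assuming SymPy; the pointwise functionals F₁ = y² and F₂ = sin y + y are arbitrary illustrative choices) checks the product and quotient rules:

```python
import sympy as sp

x, eps = sp.symbols('x epsilon')
y = sp.Function('y')(x)
eta = sp.Function('eta')(x)

def gateaux(F):
    """Gateaux variation of a pointwise functional F(y) in the direction eta."""
    return sp.diff(F.subs(y, y + eps * eta), eps).subs(eps, 0)

F1 = y**2
F2 = sp.sin(y) + y

# Product rule: delta(F1*F2) = F1*delta(F2) + F2*delta(F1)
lhs = gateaux(F1 * F2)
rhs = F1 * gateaux(F2) + F2 * gateaux(F1)
print(sp.simplify(lhs - rhs))       # 0

# Quotient rule: delta(F1/F2) = (F2*delta(F1) - F1*delta(F2)) / F2**2
lhs_q = gateaux(F1 / F2)
rhs_q = (F2 * gateaux(F1) - F1 * gateaux(F2)) / F2**2
print(sp.simplify(lhs_q - rhs_q))   # 0
```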

First variation of the functional ∫_a^b F(x, y, y′, y″) dx

Next we consider the first variation of the functional defined by

J[y] = ∫_a^b F(x, y, y′, y″) dx

If y changes to Y = y + εη , then y′ changes to Y ′ = y′ + εη ′ and y′′ changes to Y ′′ = y′′ + εη ′′ .


The change in functional, ∆J, is given by

∆J = J[Y ] − J[y] = J[y + εη ] − J[y] (3.18)

where

J[y + εη] = ∫_a^b F[x, y + εη, y′ + εη′, y″ + εη″] dx

Therefore, the change in the functional is given by

ΔJ = ∫_a^b F[x, y + εη, y′ + εη′, y″ + εη″] dx − ∫_a^b F(x, y, y′, y″) dx        (3.19)

As previously defined, the Gâteaux derivative or Gâteaux variation in the direction of η(x) is
given by

δJ[y; η] = lim_{ε→0} ΔJ/ε = lim_{ε→0} (J[y + εη] − J[y])/ε = (d/dε) J[y + εη] |_{ε=0}        (3.20)

Example 3.1

Consider the functional

J[y] = ∫_0^1 (x² − y² + y′²) dx

with y(0) = 0 and y(1) = 1. Calculate ΔJ and δJ[y; η] when y(x) = x and η(x) = x².
We first evaluate J[y],

J[y] = ∫_0^1 (x² − y² + y′²) dx = ∫_0^1 (x² − x² + 1) dx = ∫_0^1 dx = 1

The family of curves y + εη is given by x + εx². We next evaluate J on the family y + εη to get

J[y + εη] = ∫_0^1 [x² − (y + εη)² + (y′ + εη′)²] dx
          = ∫_0^1 [x² − (x + εx²)² + (1 + 2εx)²] dx
          = 1 + (3/2)ε + (17/15)ε²

Hence, the change in the functional

ΔJ = J[y + εη] − J[y] = (3/2)ε + (17/15)ε²

The derivative of the functional

(d/dε) J[y + εη] = 3/2 + (34/15)ε

Evaluating this derivative at ε = 0 gives the Gâteaux derivative

(d/dε) J[y + εη] |_{ε=0} = 3/2

Hence we conclude that the variation δJ = 1.5 in the direction η(x) = x².
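The computations in Example 3.1 can be reproduced symbolically; a minimal sketch, assuming SymPy:

```python
import sympy as sp

x, eps = sp.symbols('x epsilon')
y = x            # the base curve y(x) = x
eta = x**2       # the chosen direction eta(x) = x**2

# Integrand of J evaluated on the perturbed curve y + eps*eta
F = x**2 - (y + eps * eta)**2 + (sp.diff(y + eps * eta, x))**2
J_perturbed = sp.integrate(F, (x, 0, 1))

print(sp.expand(J_perturbed))            # 17*eps**2/15 + 3*eps/2 + 1

# Gateaux derivative: d/deps at eps = 0
delta_J = sp.diff(J_perturbed, eps).subs(eps, 0)
print(delta_J)                           # 3/2
```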

3.4 The Fundamental Problem


A fundamental problem of the calculus of variations can be stated as follows: given a functional
J and a well-defined set of functions A, determine which functions in A afford a minimum (or
maximum) value to J. The word minimum can be interpreted as a local minimum or an absolute
minimum, i.e., a minimum relative to all elements in A. The well-defined set A is called the set of
admissible functions; its members are the competing functions for extremizing J.
For example, the set of admissible functions might be the set of all continuous functions on an
interval [a, b], or the set of all continuously differentiable functions on [a, b] satisfying conditions
such as f(a) = 0.
Classical calculus of variations restricts itself to functionals that are defined by certain integrals
and to the determination of both necessary and sufficient conditions for extrema. The problem of
extremizing a functional J over the set A is called a variational problem. To a certain degree the
calculus of variations could be termed the calculus of functionals. In the present discussion we
restrict ourselves to an analysis of necessary conditions for extrema. An elementary treatment
of sufficient conditions can be found in Gelfand and Fomin.
Let us concentrate on the simplest class of variational problems, in which the unknown is a
continuously differentiable scalar function, and the functional to be minimized depends upon at
most its second derivative. As already mentioned, the basic minimization problem, then, is to
determine a suitable function y = y(x) that minimizes the objective functional
J[y] = ∫_a^b F(x, y, y′, y″) dx,   y ∈ A        (3.21)

where F(x, y, y′, y″) is some given function and A is an admissible class of functions. The integrand
F is known as the Lagrangian for the variational problem. We assume that the Lagrangian is
continuously differentiable in each of its four arguments x, y, y′, and y″.
Very often, we encounter variational problems in which the integrand F takes the simple form
F(x, y, y′ ) and hence have the functional in the form
J[y] = ∫_a^b F(x, y, y′) dx,   y ∈ A        (3.22)

3.5 Maxima and Minima


One of the central problems in the calculus is to maximize or minimize a given real valued
function of a single variable. If f is a given function defined in an open interval (a, b), then f

has a local minimum at a point x = x₀ in (a, b) if f(x₀) ≤ f(x) for all x near x₀ on both sides
of x = x₀. In other words, f has a local minimum at a point x = x₀ in (a, b) if f(x₀) ≤ f(x)
for all x satisfying |x − x₀| < δ for some δ > 0. If f has a local minimum at x₀ in (a, b) and f is
differentiable in (a, b), then it is well known that

f ′ (x0 ) = 0 (3.23a)

Similar statements can be made if f has a local maximum at x0 . The aforementioned condition
(3.23a) is called a necessary condition for a local minimum; that is, if f has a local minimum
at x0 , then (3.23a) necessarily follows. Equation (3.23a) is not sufficient for a local minimum,
however; that is, if (3.23a) holds, it does not guarantee that x0 provides an actual minimum.
The following conditions are sufficient conditions for f to have a local minimum at x0

f ′ (x0 ) = 0 and f ′′ (x0 ) > 0 (3.23b)

provided f″ exists. Again, similar conditions can be formulated for local maxima. If (3.23a)
holds, we say f is stationary at x₀ and that x₀ is an extreme point for f.

3.5.1 Maxima and minima of functionals


Instead of extremizing functions in calculus, the calculus of variations deals with extremizing
functionals. The necessary condition for the functional J[y] to have an extremum at y(x) = ŷ(x)
is that its variation vanishes for y = ŷ. That is,

δ J[ŷ; η ] = 0 (3.24)

for y = ŷ and for all admissible variations η .


The fact that the condition (3.24) holds for all admissible variations η often allows us to
eliminate η from the condition and obtain an equation just in terms of ŷ, which can then be
solved for ŷ. Generally the equation for ŷ is a differential equation. Since (3.24) is a necessary
condition we are not guaranteed that solutions ŷ actually will provide a minimum. Therefore the
solutions ŷ to (3.24) are called (local) extremals or stationary functions, and are the candidates
for maxima and minima. If δ J[ŷ; η ] = 0, we say J is stationary at ŷ in the direction η .
Based on the variations δy and δy′, we distinguish between a strong extremum and a weak
extremum. A strong extremum occurs when δy is small while δy′ may be large, whereas a weak
extremum occurs when both δy and δy′ are small.

Example 3.2

Consider the functional

J[y] = ∫_0^1 (1 + y′(x)²) dx

with y(0) = 0 and y(1) = 1. Let ŷ(x) = x and η(x) = x(1 − x). The family of curves ŷ + εη
is given by x + εx(1 − x) and a few members are sketched in figure 3.3. We evaluate J on the
family ŷ + εη to get
[Figure 3.3: The one-parameter family of curves x + εx(1 − x).]

J[ŷ + εη] = ∫_0^1 [1 + (ŷ′(x) + εη′(x))²] dx
          = ∫_0^1 [1 + (1 + ε(1 − 2x))²] dx
          = 2 + ε²/3

Then the derivative of the functional

(d/dε) J[ŷ + εη] = 2ε/3

Evaluating this derivative at ε = 0 gives the Gâteaux derivative

δJ[ŷ; η] = (d/dε) J[ŷ + εη] |_{ε=0} = 0

Hence we conclude that the variation δJ[ŷ; η] = 0 and J is stationary at ŷ = x in the direction
η = x(1 − x).

Example 3.3

Consider the functional

J[y] = ∫_0^{2π} (1 + y′(x)²) dx

with y(0) = 0 and y(2π) = 2π. Let ŷ(x) = x and η(x) = sin x. The family of curves ŷ + εη is
given by x + ε sin x and a few members are sketched in figure 3.4. We evaluate J on the family
ŷ + εη to get

J[ŷ + εη] = ∫_0^{2π} [1 + (ŷ′(x) + εη′(x))²] dx
          = ∫_0^{2π} [1 + (1 + ε cos x)²] dx
          = π(4 + ε²)
[Figure 3.4: The one-parameter family of curves x + ε sin x.]

Then the derivative of the functional

(d/dε) J[ŷ + εη] = 2πε

Evaluating this derivative at ε = 0 gives the Gâteaux derivative

δJ[ŷ; η] = (d/dε) J[ŷ + εη] |_{ε=0} = 0

Hence we conclude that the variation δJ[ŷ; η] = 0 and J is stationary at ŷ = x in the direction
η = sin x.
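Example 3.3 can also be checked symbolically (a sketch assuming SymPy):

```python
import sympy as sp

x, eps = sp.symbols('x epsilon')
y_hat = x             # the candidate extremal
eta = sp.sin(x)       # the chosen direction

# Integrand 1 + (y' + eps*eta')^2 = 1 + (1 + eps*cos x)^2
integrand = 1 + (sp.diff(y_hat + eps * eta, x))**2
J = sp.integrate(integrand, (x, 0, 2 * sp.pi))

print(sp.factor(J))                     # pi*(eps**2 + 4)
print(sp.diff(J, eps).subs(eps, 0))     # 0, so J is stationary at y = x
```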

3.6 The Simplest Problem


The simplest problem of calculus of variations is to determine a function y(x) for which the value
of the following functional
J[y] = ∫_a^b F(x, y, y′) dx        (3.25)

is a minimum. Here y ∈ C²[a, b]¹ and F is a given function that is twice continuously
differentiable on [a, b] × ℝ². In order to uniquely specify a minimizing function, we must impose
suitable boundary conditions. Any type of boundary condition, including Dirichlet (essential)
and Neumann (natural) boundary conditions, may be prescribed. In the interests of brevity, we
shall impose Dirichlet boundary conditions of the form

y(a) = α,   y(b) = β

That is, the graphs of the admissible functions pass through the end points (a, α) and (b, β).
We seek a necessary condition for the functional J[y] to be a minimum. For this, we need to
compute the Gâteaux variation δJ. Let y(x) be a local minimum and η(x) a twice continuously

¹ C²[a, b] is the set of all continuous functions on an interval [a, b] whose second derivative is also continuous. If y ∈ C²[a, b], we say y is a function of class C² on [a, b].

differentiable function satisfying η(a) = η(b) = 0. Then Y = y + εη is an admissible function
and the new functional becomes

J[Y] = ∫_a^b F[x, Y, Y′] dx = ∫_a^b F[x, y + εη, y′ + εη′] dx        (3.26)

Its derivative with respect to the parameter ε is

(d/dε) J[Y] = ∫_a^b (∂/∂ε) F[x, Y, Y′] dx
            = ∫_a^b (∂F/∂Y ∂Y/∂ε + ∂F/∂Y′ ∂Y′/∂ε) dx = ∫_a^b (∂F/∂Y η + ∂F/∂Y′ η′) dx
Evaluating the above integral at ε = 0, we obtain

(d/dε) J[y + εη] |_{ε=0} = ∫_a^b (∂F/∂y η + ∂F/∂y′ η′) dx        (3.27)

As we have seen earlier, the necessary condition for the functional J[y] to have an extremum at
y is that its variation vanishes at y. That is,

δJ[y; η] = (d/dε) J[y + εη] |_{ε=0} = 0        (3.28)

Therefore, from (3.27) the necessary condition for the functional J[y] to have an extremum at y
is given by

∫_a^b (∂F/∂y η + ∂F/∂y′ η′) dx = 0        (3.29)

for all η ∈ C²[a, b] with η(a) = η(b) = 0.

An alternate approach for the derivation of equation (3.29)


Since the first variation and the Gâteaux variation are linearly related through the parameter ε, the
Gâteaux variation in equation (3.28) may be replaced by the first variation. Thus the necessary
condition given by equation (3.28) becomes

δJ = δ ∫_a^b F(x, y, y′) dx = ∫_a^b δF dx = 0

Hence, using equation (3.16), we can write

∫_a^b δF dx = ∫_a^b (∂F/∂y δy + ∂F/∂y′ δy′) dx
            = ∫_a^b (∂F/∂y ηε + ∂F/∂y′ η′ε) dx = 0

Dividing this by ε, we have

∫_a^b (∂F/∂y η + ∂F/∂y′ η′) dx = 0

which is the same as (3.29).


Condition (3.29) is not useful as it stands for determining y(x). Using the fact that it must
hold for all η, however, we can eliminate η′ and thereby obtain a condition on y
alone. First we integrate the second term in (3.29) by parts² to obtain

∫_a^b (∂F/∂y′) η′ dx = [(∂F/∂y′) η]_a^b − ∫_a^b (d/dx)(∂F/∂y′) η dx

Thus, condition (3.29) can be written as

∫_a^b (∂F/∂y − (d/dx)(∂F/∂y′)) η dx + [(∂F/∂y′) η]_a^b = 0        (3.30)

Since η(a) = η(b) = 0, the boundary term vanishes and thus the condition (3.30)
becomes

∫_a^b (∂F/∂y − (d/dx)(∂F/∂y′)) η dx = 0        (3.31)

The above equation must hold for arbitrary limits of integration. This is possible only if the
integrand is identically zero (du Bois-Reymond lemma). Therefore, we have

(∂F/∂y − (d/dx)(∂F/∂y′)) η = 0

Since η(x) is an arbitrary admissible function, equation (3.31) holds only if

∂F/∂y − (d/dx)(∂F/∂y′) = 0
We will state this result in the form of a theorem.
Theorem: If a function y provides a local minimum to the functional

J[y] = ∫_a^b F(x, y, y′) dx

where y ∈ C²[a, b] and

y(a) = α,   y(b) = β

then y must satisfy the equation

∂F/∂y − (d/dx)(∂F/∂y′) = 0,   x ∈ [a, b]        (3.32a)
Equation (3.32a) is called the Euler–Lagrange equation or simply Euler equation. There
are two important aspects of the derivation of the Euler–Lagrange equation that deserve close
inspection. First, it provides a necessary condition for a local minimum but not a sufficient
one. It is analogous to the derivative condition f ′ (x) = 0 in differential calculus. Therefore its
² ∫ u v′ dx = uv − ∫ u′ v dx

solutions are not necessarily local minima. It is a second-order ordinary differential equation
with a solution that is required to satisfy two conditions at the boundaries of the domain of
solution. Such boundary value problems may have no solution, one unique solution, or multiple
solutions depending on the situation. A case with multiple solutions implies that more than
one path from point (a, α) to point (b, β) satisfies the Euler–Lagrange equation. However, not
all of these paths will necessarily minimize the functional J[y]. A second important aspect of the
Euler–Lagrange equation is related to our assumption that the curve y(x) ∈ C2 [a, b]. Indeed, our
considerations focused only on such smooth functions. However, the actual path that extremizes
an integral might be one with a corner or a kink. Such paths are not relevant for the use of the
Euler–Lagrange equation in Newtonian mechanics. However, they are often the true solutions in
other problems in the calculus of variations, as we have seen in the case of physics of soap films.
It may be worthwhile to note that if y is treated as the independent variable and x as the dependent
variable, then the Euler–Lagrange equation (3.32a) takes the form

∂F/∂x − (d/dy)(∂F/∂x′) = 0,   y ∈ [α, β]        (3.32b)
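For concrete Lagrangians, the Euler–Lagrange equation can be generated mechanically. SymPy provides `euler_equations` for this; the Lagrangian below, F = y′² + y², is just an illustrative choice:

```python
import sympy as sp
from sympy.calculus.euler import euler_equations

x = sp.symbols('x')
y = sp.Function('y')

# Illustrative Lagrangian F(x, y, y') = y'^2 + y^2
F = sp.diff(y(x), x)**2 + y(x)**2

# Returns [Eq(dF/dy - d/dx(dF/dy'), 0)], i.e. equation (3.32a) for this F
eqs = euler_equations(F, y(x), x)
print(eqs[0])    # the EL equation 2*y - 2*y'' = 0
```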

3.6.1 Essential and natural boundary conditions


In the derivation of the Euler–Lagrange equation, we used the conditions that η (a) = η (b) = 0,
which means that the variations δ y(a) = δ y(b) = 0. These conditions are a consequence of our
imposition of fixed values of y(x) at the endpoints a and b. That is
y(a) = α , y(b) = β
where α and β are constants. This is called the essential (or Dirichlet) boundary condition. In
some applications, we may need to apply other types of boundary conditions to the function
y(x).
If we still want the last term in equation (3.30) to vanish (so that we obtain the familiar
Euler–Lagrange equation) while allowing δy(a) and δy(b) to be non-zero, then we need to have

∂F/∂y′ |_{x=a} = 0,   ∂F/∂y′ |_{x=b} = 0

This is called a natural (or Neumann) boundary condition. A system may also have a natural
boundary condition at one end (x = a) and an essential boundary condition at the other end
(x = b).

3.6.2 Other forms of Euler–Lagrange equation


The functional F in the Euler–Lagrange equation is a function of x, y, and y′. Therefore,

dF/dx = ∂F/∂x + (∂F/∂y)(dy/dx) + (∂F/∂y′)(dy′/dx)
      = ∂F/∂x + y′ ∂F/∂y + y″ ∂F/∂y′        (3.33)

But we have

(d/dx)(y′ ∂F/∂y′) = y″ ∂F/∂y′ + y′ (d/dx)(∂F/∂y′)        (3.34)

Subtracting (3.34) from (3.33), we have

dF/dx − (d/dx)(y′ ∂F/∂y′) = ∂F/∂x + y′ ∂F/∂y − y′ (d/dx)(∂F/∂y′)

Rewriting the above equation gives

(d/dx)(F − y′ ∂F/∂y′) − ∂F/∂x = y′ (∂F/∂y − (d/dx)(∂F/∂y′))

By the Euler–Lagrange equation (3.32a) we see that the right-hand side of the above equation
is zero. Thus,

(d/dx)(F − y′ ∂F/∂y′) − ∂F/∂x = 0        (3.35)

Equation (3.35) is another useful form of the Euler–Lagrange equation.

3.6.3 Special cases


Case I. Often in applications, the functional F does not depend directly on x and the Euler–
Lagrange equation, in this case, takes a particularly nice form. Here we have ∂F/∂x = 0 and
the corresponding form of the Euler–Lagrange equation (3.35) becomes

(d/dx)(F − y′ ∂F/∂y′) = 0

Integrating, we get the first integral of the Euler–Lagrange equation

F − y′ ∂F/∂y′ = C        (3.36)

Thus, the extremizing function y is obtained as the solution of a first-order differential equation
(3.36) involving y and y′ only. This simplified form of the Euler–Lagrange equation (3.36) is known
as the Beltrami identity. The combination F − y′ F_{y′} that appears on the left of the Beltrami
identity is sometimes referred to as the Hamiltonian.
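The Beltrami identity can be checked on a concrete example. For the x-independent Lagrangian F = y′² + y² (an illustrative choice), the EL equation is y″ = y, which y = cosh x solves; along this extremal the combination F − y′ F_{y′} should be constant. A sketch assuming SymPy:

```python
import sympy as sp

x, u, v = sp.symbols('x u v')       # u plays the role of y, v of y'

F = v**2 + u**2                     # x-independent Lagrangian F(y, y') = y'^2 + y^2
H = F - v * sp.diff(F, v)           # Beltrami combination F - y' * dF/dy'

# Substitute the extremal y = cosh(x), which solves the EL equation y'' = y
y_ext = sp.cosh(x)
H_on_extremal = H.subs({u: y_ext, v: sp.diff(y_ext, x)})

print(sp.simplify(H_on_extremal))   # 1, a constant: cosh(x)**2 - sinh(x)**2 = 1
```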
Case II. If F is independent of y, then ∂F/∂y = 0 and the Euler–Lagrange equation
(3.32a) becomes

(d/dx)(∂F/∂y′) = 0

Integrating, we get the first integral of the Euler–Lagrange equation as

∂F/∂y′ = k        (3.37)

where k is a constant. Note that equation (3.37) is a first-order differential equation involving x
and y′.

Case III. If F is independent of y′, then ∂F/∂y′ = 0 and the Euler–Lagrange equation
(3.32a) becomes

∂F/∂y = 0

This is an algebraic equation in x and y rather than a differential equation; solving it gives the
extremal y = y(x) directly.

3.7 Advanced Variational Problems

3.7.1 Variational problems with high-order derivatives


Here we will consider the problem of finding the function y(x) that extremizes the integral

J[y] = ∫_a^b F(x, y, y′, y″) dx        (3.38)

with prescribed Dirichlet (essential) boundary conditions

y(a) = α,   y′(a) = α′
y(b) = β,   y′(b) = β′

Here y ∈ C⁴[a, b] and F is a given function that is twice continuously differentiable in each of
its arguments. The necessary condition for the functional J[y] to be a minimum is that the
function y(x) satisfies the following Euler–Lagrange equation

∂F/∂y − (d/dx)(∂F/∂y′) + (d²/dx²)(∂F/∂y″) = 0        (3.39)
Instead of the Dirichlet-type boundary conditions we may also prescribe Neumann-type
(natural) boundary conditions of the form

(∂F/∂y′ − (d/dx)(∂F/∂y″)) |_{x=a} = 0,   ∂F/∂y″ |_{x=a} = 0
(∂F/∂y′ − (d/dx)(∂F/∂y″)) |_{x=b} = 0,   ∂F/∂y″ |_{x=b} = 0
In general, when the functional contains higher derivatives of y(x), the function y(x) which
extremizes the functional

J[y] = ∫_a^b F(x, y, y′, y″, ..., y⁽ⁿ⁾) dx        (3.40)

must be a solution of the equation

∂F/∂y − (d/dx)(∂F/∂y′) + (d²/dx²)(∂F/∂y″) − ... + (−1)ⁿ (dⁿ/dxⁿ)(∂F/∂y⁽ⁿ⁾) = 0        (3.41)

Equation (3.41) is a differential equation of order 2n and is called the Euler–Poisson equation. Its
general solution contains 2n arbitrary constants, which may be determined from the 2n
boundary conditions.
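SymPy's `euler_equations` also handles Lagrangians containing higher derivatives, producing equation (3.41) directly. For the illustrative choice F = (y″)², the Euler–Poisson equation should reduce to 2y⁗ = 0:

```python
import sympy as sp
from sympy.calculus.euler import euler_equations

x = sp.symbols('x')
y = sp.Function('y')

F = sp.diff(y(x), x, 2)**2          # illustrative Lagrangian F = (y'')^2
eqs = euler_equations(F, y(x), x)   # Euler-Poisson equation of order 2n = 4
print(eqs[0])                       # equivalent to 2*y''''(x) = 0
```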

3.7.2 Variational problems with several independent variables


If the extremal function u is a function of two independent variables x and y, and the functional
to be extremized is of the form

J[u] = ∬_R F(x, y, u, u_x, u_y) dx dy        (3.42)

then u(x, y) must be a solution of the equation

∂F/∂u − (∂/∂x)(∂F/∂u_x) − (∂/∂y)(∂F/∂u_y) = 0        (3.43)

This second-order partial differential equation, which must be satisfied by the extremizing function
u(x, y), is called the Ostrogradsky equation after the Russian mathematician M. Ostrogradsky.
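`euler_equations` accepts several independent variables as well, which yields the Ostrogradsky equation (3.43). With the Dirichlet-type Lagrangian F = (u_x² + u_y²)/2 (an illustrative choice), the result should be Laplace's equation:

```python
import sympy as sp
from sympy.calculus.euler import euler_equations

x, y = sp.symbols('x y')
u = sp.Function('u')

# Illustrative Lagrangian F = (u_x^2 + u_y^2) / 2
F = (sp.diff(u(x, y), x)**2 + sp.diff(u(x, y), y)**2) / 2

eqs = euler_equations(F, u(x, y), [x, y])
print(eqs[0])    # -u_xx - u_yy = 0, i.e. Laplace's equation
```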

3.8 Application of EL Equation: Minimal Path Problems


This section deals with a few classical problems to illustrate the methodology of solving variational
problems with the Euler–Lagrange equation. Problems of determining shortest distances furnish a
useful introduction to the theory of the calculus of variations because the properties characterizing
their solutions are familiar ones which illustrate many of the general principles common to all of
the problems suggested above.

3.8.1 Shortest distance


Let us begin with the simplest case of all, the problem of determining the shortest distance
joining two given points. Let P(x₁, y₁) and Q(x₂, y₂) be two fixed points in a plane. Then we
want to find the shortest distance between these two points. The length of the curve using the
arc-length expression is

L = J[y(x)] = ∫_P^Q ds = ∫_{x₁}^{x₂} √(1 + y′(x)²) dx

The variational problem is to find the plane curve whose length is shortest, i.e., to determine the
function y(x) which minimizes the functional J[y]. The curve y(x) which minimizes the functional
J[y] is determined by solving the Euler–Lagrange equation (3.32a)

∂F/∂y − (d/dx)(∂F/∂y′) = 0

In the present problem

F = √(1 + y′(x)²)

which is a special case in which F is independent of x and y. Then according to (3.37) the EL
equation reduces to

∂F/∂y′ = k

where k is a constant. The derivative

∂F/∂y′ = y′ / √(1 + y′(x)²) = k

Therefore,

y′ = k √(1 + y′²)

Solving for y′, we obtain

y′ = √(k²/(1 − k²)) = m

Integrating, y = mx + c, where the constants m and c are to be found using the boundary conditions
y(x₁) = y₁ and y(x₂) = y₂. Thus, the straight line joining the two points P(x₁, y₁) and Q(x₂, y₂),

y = ((y₂ − y₁)/(x₂ − x₁)) x + (x₂y₁ − x₁y₂)/(x₂ − x₁)

is the curve with shortest length.
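A quick numerical sanity check (assuming SymPy; the perturbation amplitude 1/4 is an arbitrary choice) compares the length of the straight line with that of a bowed curve through the same endpoints:

```python
import sympy as sp

x = sp.symbols('x', real=True)

def arc_length(f):
    """Numerical arc length of the graph of f on [0, 1]."""
    return sp.Integral(sp.sqrt(1 + sp.diff(f, x)**2), (x, 0, 1)).evalf()

straight = x                                      # extremal joining (0, 0) and (1, 1)
perturbed = x + sp.Rational(1, 4) * x * (1 - x)   # same endpoints, bowed curve

print(arc_length(straight))    # sqrt(2) = 1.4142...
print(arc_length(perturbed))   # slightly larger, as expected for a non-extremal curve
```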

3.8.2 The brachistochrone problem

Let P(x1 , y1 ) and Q(x2 , y2 ) be two points on a vertical plane. Consider a curved path connecting
these points. We allow a particle, without friction, to slide down this path under the influence
of gravity. The question here is what is the shape of curve that allows the particle to complete
the journey in the shortest possible time. Clearly, the shortest path from point P to point Q is
the straight line that connects the two points. However, along the straight line, the acceleration
is constant and not necessarily optimal. Naive guesses for the path's optimal shape, including
a straight line, a circular arc, a parabola, and a catenary, are all wrong.
In order to calculate the optimal curve we set up a two-dimensional Cartesian coordinate
system on the vertical plane that contains the two points P and Q as shown in figure 3.5. Our
goal is to find the path that minimizes the time it takes for an object to move from point P to
point Q.

Figure 3.5: A particle sliding down a curved path.
3.8. APPLICATION OF EL EQUATION: MINIMAL PATH PROBLEMS 33

From figure 3.5 we see that at any point c(x, y) on the curve y(x), the gravitational force vector F
decomposes into a component Ft tangent to the curve and a component Fn normal to it at c. The
component Fn does nothing to move the particle along the path; only the component Ft has any
effect. The magnitude of F is the same at every point on the curve (F = mg, where m is the mass
of the particle and g is the gravitational acceleration), but Fn and Ft depend on the steepness of
the curve at c. The steeper the curve, the larger Ft is, and the faster the particle moves. So it
would be better if the path were steeper close to point P, so that the velocity of the particle
increases rapidly, and then flattened out towards Q. Such a curve is certainly longer than the
straight line connecting
the end points. But the extra speed that the particle develops just as it is released will more
than make up for the extra distance that it must travel, and it will arrive at Q in less time than
it takes along a straight line. The curve along which the particle takes the least time to go from
P to Q is called the Brachistochrone (from the Greek words for shortest time). This famous
problem, known as the Brachistochrone Problem, was posed by Johann Bernoulli (1667-1748) in
1696. The problem was solved by Johann Bernoulli, his older brother Jakob Bernoulli, Newton,
and L’Hospital.
Let us begin our own study of the problem by deriving a formula relating the choice of the
curve y to the time required for a particle to fall from P to Q. The instantaneous velocity of the
particle along the curve is v = ds/dt, where s denotes the arc length. Therefore,

    dt = ds/v = √(dx² + dy²)/v = (1/v) √(1 + y′(x)²) dx        (3.44)
Let τ be the time of descent from P to Q along the curve y = y(x). Then,

    τ = ∫₀^τ dt = ∫₀^S ds/v        (3.45)
where S is the total arc length of the curve. If the origin of the coordinate system is taken as
the starting point P, we have, using (3.44),

    τ = ∫₀^{x2} √(1 + y′(x)²)/v dx        (3.46)
To obtain an expression for v we use the fact that energy is conserved through the motion. Thus,
the total energy at any time t must be the same as the total energy at time zero (corresponding
to location P), which we may take to be zero; that is,

    (1/2) m v² + m g (−y) = 0

Solving for v gives v = √(2gy). Therefore the time required for the particle to descend is

    τ[y] = (1/√(2g)) ∫₀^{x2} √((1 + y′(x)²)/y(x)) dx        (3.47)

where we have explicitly noted that τ depends on the curve y(x). Equation (3.47) defines a
functional.

The Brachistochrone problem can be stated as: find the function y(x) that minimizes the
functional

    τ = J[y] = (1/√(2g)) ∫₀^{x2} √((1 + y′(x)²)/y(x)) dx        (3.48)
subject to the conditions y(0) = 0 and y(x2) = y2 > 0. We could experiment with formula (3.48)
to determine the shortest time, but it would clearly be tedious to choose trial functions y(x)
one after another and compare the resulting times.
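To make this concrete, here is a rough numerical experiment (my own sketch, not from the text): the functional (3.48) is evaluated by a midpoint rule for two trial curves through P(0, 0) and Q(1, 1), with y measured downward as in figure 3.5. The helper `descent_time` and the trial curve y = √x are assumptions chosen for the demo.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2

def descent_time(y, dy, x2, n=200_000):
    """Midpoint-rule estimate of tau[y] = (1/sqrt(2g)) * integral from 0 to x2
    of sqrt((1 + y'^2)/y) dx, for a curve with y(0) = 0, y measured downward."""
    h = x2 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h          # midpoints avoid the singularity at x = 0
        total += math.sqrt((1.0 + dy(x) ** 2) / y(x)) * h
    return total / math.sqrt(2.0 * G)

# Trial 1: the straight line y = x.
t_line = descent_time(lambda x: x, lambda x: 1.0, 1.0)

# Trial 2: a curve that drops steeply near P, y = sqrt(x) (same endpoints).
t_sqrt = descent_time(lambda x: math.sqrt(x), lambda x: 0.5 / math.sqrt(x), 1.0)

print(t_line)  # ≈ 0.64 s
print(t_sqrt)  # ≈ 0.59 s -- the steeper start wins
```

Even this crude experiment shows a steep initial drop beating the straight line, but searching over all admissible curves this way is hopeless, which is why the Euler–Lagrange machinery below is needed.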
First of all we note that

    F = √((1 + y′²)/y)
which is independent of x, and therefore we can apply the Beltrami identity (3.36)

    F − y′ ∂F/∂y′ = B
where B is a constant. Now

    ∂F/∂y′ = (1/√y) · (2y′)/(2√(1 + y′²)) = y′/(√y √(1 + y′²))

Therefore the Beltrami identity becomes

    √(1 + y′²)/√y − y′²/(√y √(1 + y′²)) = B

Creating a common denominator on the left-hand side produces

    (1 + y′² − y′²)/(√y √(1 + y′²)) = 1/(√y √(1 + y′²)) = B

The above equation simplifies to

    y (1 + y′²) = C

where C = 1/B² is another constant. Equivalently,

    y [1 + (dy/dx)²] = C
That is, the solution to the brachistochrone problem is the solution y = y(x) of the above ordinary
differential equation. To solve this differential equation, we first rewrite it in the following form:
    y = C/(1 + y′²)
Substituting y′ = cot θ (where θ is a parameter) in the differential equation, we obtain

    y = C/(1 + cot²θ) = C sin²θ = (C/2)(1 − cos 2θ)

Now dx can be expressed as follows:

    dx = dy/y′ = (C sin 2θ dθ)/cot θ = (2C sin θ cos θ dθ)/cot θ = 2C sin²θ dθ

so that

    dx = C(1 − cos 2θ) dθ
Integrating the above differential equation, we obtain

    x = C(θ − (sin 2θ)/2) + D
where the constant of integration D is determined from the condition that the curve starts at
the origin (x = 0 when y = 0, i.e., at θ = 0), giving D = 0. Putting 2θ = φ, we can write

    x = (C/2)(φ − sin φ)   and   y = (C/2)(1 − cos φ)
This is the parametric equation of a cycloid. A cycloid is the locus of a point fixed on the
circumference of a circle as the circle rolls along a flat horizontal surface; see figure 3.6. It
can be shown that there is one and only one cycloid passing through the points P and Q. The
parametric equations of the cycloid may be written in the following standard form:
    x(φ) = a(φ − sin φ),   y(φ) = a(1 − cos φ)        (3.49)
where a = C/2 is the radius of the rolling circle and φ is the angle of rotation. Using the
condition that the curve (cycloid) passes through Q(x2 , y2 ), the value of the constant a can be
determined.
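One way to carry out this last step numerically is a simple bisection on the endpoint condition (1 − cos φ)/(φ − sin φ) = y2/x2, which fixes φ at Q and hence a. The sketch below is my own illustration; the function name `cycloid_through` and the sample endpoint Q(1, 1) are assumptions.

```python
import math

def cycloid_through(x2, y2, tol=1e-12):
    """Find a and phi2 so that the cycloid x = a(phi - sin phi),
    y = a(1 - cos phi) starting at the origin passes through (x2, y2).
    Solves (1 - cos phi) - (y2/x2)*(phi - sin phi) = 0 by bisection."""
    ratio = y2 / x2
    f = lambda p: (1.0 - math.cos(p)) - ratio * (p - math.sin(p))
    lo, hi = 1e-9, 2.0 * math.pi - 1e-9   # f > 0 near 0, f < 0 near 2*pi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    phi2 = 0.5 * (lo + hi)
    a = y2 / (1.0 - math.cos(phi2))       # radius of the rolling circle
    return a, phi2

a, phi2 = cycloid_through(1.0, 1.0)
print(a, phi2)   # a ≈ 0.573, phi2 ≈ 2.412
```

With a and φ2 in hand, the full brachistochrone from P to Q is given directly by (3.49) for 0 ≤ φ ≤ φ2.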

Figure 3.6: The cycloid acts as a brachistochrone.

Another remarkable characteristic of the brachistochrone particle is that when two particles
at rest are simultaneously released from two different points M and N of the curve they will
reach the terminal point of the curve at the same time, if the terminal point is the lowest point
on the path (see figure 3.7). Such a curve is called an isochrone or a tautochrone. This is also
counterintuitive, since the two particles clearly have different geometric distances to cover;
however, because both move under gravity and the slope of the curve differs at the two locations,
the particle starting from the higher location gathers much greater speed than the particle
starting at the lower location. Hence the brachistochrone problem may also be posed with a specified
terminal point and a variable starting point, leading to the class of variational problems with
open boundary.
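The tautochrone property can be checked numerically (my own sketch; the parameter values a = 0.5 and the release angles are arbitrary): releasing particles at rest from several points on the cycloid (3.49) and integrating dt = ds/v with v = √(2g(y − y0)) and ds = 2a sin(φ/2) dφ, every descent time comes out equal to the analytic value T = π√(a/g).

```python
import math

G = 9.81  # gravitational acceleration, m/s^2

def time_to_bottom(a, phi0, n=400_000):
    """Time for a particle released at rest at angle phi0 on the cycloid
    x = a(phi - sin phi), y = a(1 - cos phi) to reach the lowest point phi = pi.
    Integrates dt = ds/v by the midpoint rule (which avoids the v = 0 start)."""
    y0 = a * (1.0 - math.cos(phi0))
    h = (math.pi - phi0) / n
    t = 0.0
    for i in range(n):
        phi = phi0 + (i + 0.5) * h
        y = a * (1.0 - math.cos(phi))
        v = math.sqrt(2.0 * G * (y - y0))    # energy conservation from rest
        t += 2.0 * a * math.sin(0.5 * phi) / v * h
    return t

a = 0.5
for phi0 in (0.3, 1.0, 2.0):            # three different release points
    print(time_to_bottom(a, phi0))      # all ≈ pi*sqrt(a/g) ≈ 0.709 s
```

The three printed times agree to within the integration error, illustrating that the descent time is independent of the starting point.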

Figure 3.7: The tautochrone.
