Université Paris Cité 2022-2023
Basic notes on functional derivatives
(extracted directly from some notes I wrote on a first course on classical
mechanics and Lagrangians)
D.A. Steer
APC, 10 rue Alice Domon et Léonie Duquet, 75205 Paris Cedex 13, France
(November 17, 2022)
Contents
1 Calculus of variations a): functionals and the Euler-Lagrange equations
1.1 Examples of functionals
1.2 Euler-Lagrange equations: derivation
1.2.1 General EL equations
1.3 A note on Functionals and functional derivatives
1.3.1 Functional derivative
1.3.2 Functional derivatives and the EL equation
1.3.3 Functional Integration
2 Calculus of variations b): Symmetries and conservation laws
1 Calculus of variations a): functionals and the Euler-Lagrange equations
How does one find the trajectory which minimises a functional (a problem also known as the principle of least
action)? Calculations of this type have countless applications outside the domain
of classical mechanics. They range from optics (the Snell-Descartes law and mirages; see exercises
below), to the problem of the brachistochrone, through to finding geodesics of particles in
relativity, and classical and quantum field theory. We begin by giving some simple examples
of applications of this type of calculation.
1.1 Examples of functionals
In general we wish to minimise a functional F [qα ] which depends on a path qα (τ ). Depending
on which calculation we are doing, τ may not necessarily be time. Here are some examples of
the type of calculation we might want to do:
1. Consider 2-dimensional Euclidean space, with coordinates (x, y). Let us fix two points
A = (xA , yA ) and B = (xB , yB ), and consider a path y(x) which goes between these two
points. Clearly the distance L between the points A and B depends on the path y(x)
between these two points:
functional L[y], path y(x)
Which path minimises the functional L[y]? From experience we know the answer – a
straight line!
2. One can ask the same question, but on the surface of the sphere. Then
functional L[θ], path θ(φ)
Which path minimises the functional L[θ]? The answer is again known from experience
– great circles.
3. The brachistochrone. In the vertical plane with coordinates (x, y) (x horizontal and y
vertical), consider two given points A and B connected by a wire of shape y(x). A bead
of mass m starts at point A with zero velocity, and travels to B under the influence of
gravity (we neglect any frictional forces). What is the shape of the curve y(x) such that
the time taken for the bead to go from A to B is minimum?
functional T [y], path y(x)
(This problem was first posed in 1696, and gave birth to the calculus of variations.)
4. What must be the shape of the curve y(x) between A and B such that the area of the surface of
revolution it generates is minimum?
functional A[y], path y(x)
5. A plane takes off from New York bound for London. The on-board computer must choose
the optimal path $\vec{x}(t)$ (a sequence of altitudes, longitudes and latitudes for all t) such
that, given the wind directions etc., the consumption of fuel is minimum:
functional $V_F[\vec{x}]$ (volume of fuel), path (r(t), θ(t), φ(t)) (sequence of altitudes, latitudes and longitudes)
In all these examples one must minimise a functional F [qα ] which depends on a path qα (τ ):
for the different examples we have
1. α = 1 (one degree of freedom); path parametrised by τ = x; and q1 (τ ) = y(x). The
functional F = L.
2. α = 1 (one degree of freedom); path parametrised by τ = φ; and q1 (τ ) = θ(φ). The
functional F = L.
3. α = 1 (one degree of freedom); path parametrised by τ = x; and q1 (τ ) = y(x). The
functional F = T .
4. α = 1 (one degree of freedom); path parametrised by τ = x; and q1 (τ ) = y(x). The
functional F = A.
5. α = 3 (three degrees of freedom); path parametrised by τ = t; and q1 (τ ) = r(t), q2 (τ ) =
θ(t), q3 (τ ) = φ(t). The functional F = VF .
Notice also that the functionals in all these examples are numbers whose value depends on
the value of a function at all points.
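To make concrete the idea that a functional is a number depending on a whole path, here is a minimal numerical sketch for example 1. It assumes the standard arc-length expression $L[y] = \int \sqrt{1 + y'(x)^2}\, dx$ (not written out above), approximated by summing short straight segments. The function name `path_length`, the end points A = (0, 0), B = (1, 1) and the particular bump are illustrative choices only.

```python
import math

def path_length(y, x_a, x_b, n=2000):
    """Approximate L[y] by summing n straight segments of the path y(x)."""
    dx = (x_b - x_a) / n
    total = 0.0
    for k in range(n):
        x0 = x_a + k * dx
        x1 = x0 + dx
        total += math.hypot(dx, y(x1) - y(x0))
    return total

# Two paths joining A = (0, 0) and B = (1, 1).
def straight(x):
    return x

def bumpy(x):
    # Same end points as the straight line, plus a bump vanishing at x = 0 and x = 1.
    return x + 0.2 * math.sin(math.pi * x)

length_straight = path_length(straight, 0.0, 1.0)
length_bumpy = path_length(bumpy, 0.0, 1.0)
print(length_straight, length_bumpy)
```

Each path gives one number; the straight line gives the smaller one, as expected from experience.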
1.2 Euler-Lagrange equations: derivation
As section 1.1 shows, the functional, the path and the coordinates change from one example to the
next. We derive the EL equations for a functional
\[ F[\vec{x}] \quad \text{with path } \vec{x}(t) \tag{1} \]
It is very easy to translate the results we will obtain via $\vec{x}(t) \to q_\alpha(\tau)$, and hence to apply them to all the
examples considered above.
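Given a path $\vec{x}(t)$, the simplest functionals are integrals along the path of $\vec{x}(t)$ and its
derivatives with respect to t:
\begin{align*}
F_1[\vec{x}] &= \int_{t_i}^{t_f} |\vec{x}(t)|^2 \, dt \\
F_2[\vec{x}] &= \int_{t_i}^{t_f} |\dot{\vec{x}}(t)|^2 \, dt \\
F_3[\vec{x}] &= \int_{t_i}^{t_f} \left( \vec{x}(t) \cdot \dot{\vec{x}}(t) \right) dt \\
&\;\;\vdots
\end{align*}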
Note that a functional is a scalar: on the RHS one must integrate over a scalar.
A general functional takes the form (see footnote 1)
\[ F[\vec{x}] = \int_{t_i}^{t_f} f(\vec{x}, \dot{\vec{x}}, t) \, dt \tag{2} \]
How do we find the path $\hat{\vec{x}}$ which minimises the functional $F[\vec{x}]$ subject to the conditions
$\vec{x}(t_i) = \vec{x}_i$ and $\vec{x}(t_f) = \vec{x}_f$? The answer is that the path $\hat{\vec{x}}$ is the solution of the Euler-Lagrange
equations.
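Note that throughout this course we will use the ‘Einstein summation convention’, in which a repeated index is summed over:
\[ \vec{y} \cdot \frac{\partial}{\partial \vec{x}} = \sum_i y_i \frac{\partial}{\partial x_i} = y_i \frac{\partial}{\partial x_i} \tag{3} \]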
Let $\hat{\vec{x}}(t)$ be the minimising path which we are trying to determine, and $\vec{\eta}(t)$ a small variation
about that path. Thus we construct the new path
\[ \vec{x}(t) = \hat{\vec{x}}(t) + \vec{\eta}(t) \sim \hat{\vec{x}}(t) \tag{4} \]
and also impose
\[ \vec{\eta}(t_i) = \vec{\eta}(t_f) = 0 \tag{5} \]
Hence $\vec{x}(t)$ is a path which is infinitesimally close to $\hat{\vec{x}}(t)$, and which also starts at $\vec{x}_i$ and finishes
at $\vec{x}_f$. Notice that there is no variation of the independent variable t but only variation of the
functions $\vec{x}(t)$.
Footnote 1: You can ask why we don’t consider functionals containing up to n derivatives of $\vec{x}$, that is
$F = \int_{t_i}^{t_f} f(\vec{x}, \dot{\vec{x}}, \ldots, \vec{x}^{(n)}, t) \, dt$ where $\vec{x}^{(n)} = d^n \vec{x}/dt^n$. One reason is pragmatic: none of the examples we study is of this
form. The second is fundamental: in the case of the action $F[\vec{x}] = S[\vec{x}]$, when there is a dependence on third
or higher order derivatives of the generalised coordinates, the theory suffers from the so-called Ostrogradski
instability: the theory has no stable ground state. See for example http://arxiv.org/pdf/astro-ph/0601672,
page 4, for a very readable introduction to this instability.
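Then
\begin{align*}
F[\vec{x}] &= \int_{t_i}^{t_f} f(\hat{\vec{x}} + \vec{\eta}, \dot{\hat{\vec{x}}} + \dot{\vec{\eta}}, t) \, dt \\
&= \int_{t_i}^{t_f} dt \left[ f(\hat{\vec{x}}, \dot{\hat{\vec{x}}}, t) + \frac{\partial f}{\partial \vec{x}} \cdot \vec{\eta} + \frac{\partial f}{\partial \dot{\vec{x}}} \cdot \dot{\vec{\eta}} + \ldots \right] \\
&= F[\hat{\vec{x}}] + \int_{t_i}^{t_f} dt \left[ \frac{\partial f}{\partial \vec{x}} \cdot \vec{\eta} + \frac{\partial f}{\partial \dot{\vec{x}}} \cdot \dot{\vec{\eta}} + \ldots \right] \tag{6}
\end{align*}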
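Now integrate by parts the second term in the last line:
\[ \int_{t_i}^{t_f} dt \, \frac{\partial f}{\partial \dot{\vec{x}}} \cdot \dot{\vec{\eta}} = \left[ \frac{\partial f}{\partial \dot{\vec{x}}} \cdot \vec{\eta} \right]_{t_i}^{t_f} - \int_{t_i}^{t_f} dt \, \frac{d}{dt}\left( \frac{\partial f}{\partial \dot{\vec{x}}} \right) \cdot \vec{\eta} \tag{7} \]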
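The first term vanishes by the boundary conditions (5). Hence,
\[ \delta F = F[\vec{x}] - F[\hat{\vec{x}}] = \int_{t_i}^{t_f} dt \left[ \frac{\partial f}{\partial \vec{x}} - \frac{d}{dt} \frac{\partial f}{\partial \dot{\vec{x}}} \right] \cdot \vec{\eta} + \ldots \tag{8} \]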
The functional is minimum (technically, at an extremum) if δF = 0. This relation must hold
for any infinitesimal $\vec{\eta}$, no matter how small. Hence we can first neglect the higher order
terms in the Taylor expansion, denoted by '…'. The remaining term is linear in $\vec{\eta}$.
In the discussion presented so far we are working under the assumption that the generalised
coordinates $q_\alpha$ (which in this case are the components of $\vec{x}$) are all independent. Hence the $\delta x_i = \eta_i$ (i = 1, 2, 3)
are also independent (see footnote 2). Since δF must vanish for arbitrary, independent $\eta_i(t)$, this is only possible if the coefficient of $\vec{\eta}$ vanishes for all $t_i \le t \le t_f$.
Thus $\hat{\vec{x}}(t)$ extremises F if and only if
\[ \frac{\partial f}{\partial \vec{x}} - \frac{d}{dt} \frac{\partial f}{\partial \dot{\vec{x}}} = 0 \tag{9} \]
along the path $\hat{\vec{x}}(t)$. This is the Euler-Lagrange equation, and the theory that underlies it
is the calculus of variations. Notice that the above equation is a compact form for writing three
equations, one for each component of $\vec{x}$:
\[ \frac{\partial f}{\partial x_i} - \frac{d}{dt} \frac{\partial f}{\partial \dot{x}_i} = 0 \qquad (i = 1, 2, 3) \tag{10} \]
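As a quick check of (10), consider example 1 of section 1.1. The sketch below assumes the standard arc-length integrand $f(y, y', x) = \sqrt{1 + y'^2}$ with $y' = dy/dx$ (this explicit form is not given above), and uses the translation $t \to x$, $\vec{x}(t) \to y(x)$:

```latex
\begin{align*}
L[y] &= \int_{x_A}^{x_B} \sqrt{1 + y'^2}\, dx, \qquad f(y, y', x) = \sqrt{1 + y'^2}, \\
\frac{\partial f}{\partial y} &= 0, \qquad \frac{\partial f}{\partial y'} = \frac{y'}{\sqrt{1 + y'^2}}, \\
\frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y'} = 0
&\;\Longrightarrow\; \frac{y'}{\sqrt{1 + y'^2}} = \text{const}
\;\Longrightarrow\; y' = \text{const}
\;\Longrightarrow\; y(x) = y_A + \frac{y_B - y_A}{x_B - x_A}\,(x - x_A).
\end{align*}
```

The extremal path is the straight line through A and B, in agreement with the answer anticipated from experience.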
1.2.1 General EL equations
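The argument is straightforwardly generalised to functionals
\[ F[q_\alpha] = \int_{\tau_i}^{\tau_f} d\tau \, f(q_\alpha, \dot{q}_\alpha, \tau) \tag{11} \]
where the path is $q_\alpha(\tau)$ and $\dot{q}_\alpha = dq_\alpha/d\tau$. In that case
\[ \delta F = F[q_\alpha] - F[\hat{q}_\alpha] = \int_{\tau_i}^{\tau_f} d\tau \left[ \frac{\partial f}{\partial q_\alpha} - \frac{d}{d\tau} \frac{\partial f}{\partial \dot{q}_\alpha} \right] \delta q_\alpha \tag{12} \]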
Footnote 2: This will not be the case when the system is subject to constraints: see below.
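Since the $q_\alpha$ are assumed independent, the path which minimises $F[q_\alpha]$ subject to the conditions
that $q_\alpha(\tau_i)$ and $q_\alpha(\tau_f)$ are fixed satisfies
\[ \frac{\partial f}{\partial q_\alpha} - \frac{d}{d\tau} \frac{\partial f}{\partial \dot{q}_\alpha} = 0 \tag{13} \]
Note that these are N equations, one for each value of α.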
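The statement that δF has no term linear in the variation on a solution of the EL equations can be checked numerically. The sketch below is an illustration under assumptions not taken from the text: the integrand $f = \frac{1}{2}\dot{q}^2 - \frac{1}{2}q^2$ (whose EL equation is $\ddot{q} + q = 0$), the interval $0 \le \tau \le 1$, and the variation $\eta(\tau) = \sin(\pi\tau)$ are all illustrative choices, as are the function names.

```python
import math

def functional(q, tau_i=0.0, tau_f=1.0, n=1000):
    """Midpoint-style approximation of F[q] = int (q'^2/2 - q^2/2) dtau."""
    h = (tau_f - tau_i) / n
    total = 0.0
    for k in range(n):
        t0 = tau_i + k * h
        t1 = t0 + h
        q_dot = (q(t1) - q(t0)) / h      # finite-difference derivative on the step
        q_mid = 0.5 * (q(t0) + q(t1))    # average value of q on the step
        total += h * (0.5 * q_dot**2 - 0.5 * q_mid**2)
    return total

def q_hat(t):
    # Solves the EL equation q'' + q = 0 for this choice of f.
    return math.sin(t)

def eta(t):
    # Variation vanishing at both end points tau = 0 and tau = 1.
    return math.sin(math.pi * t)

def delta_F(eps):
    """F[q_hat + eps * eta] - F[q_hat]."""
    return functional(lambda t: q_hat(t) + eps * eta(t)) - functional(q_hat)

ratio = delta_F(0.01) / delta_F(0.005)
print(ratio)  # close to 4: halving eps divides delta F by about four
```

Halving the size of the variation reduces δF by a factor of about four rather than two, which is what one expects when the linear term vanishes and only the second-order term survives.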
Finally, observe the following three points:
1. The functions f and g related by
\[ g(q_\alpha, \dot{q}_\alpha, \tau) = f(q_\alpha, \dot{q}_\alpha, \tau) + \text{const} \tag{14} \]
satisfy the same EL equations (though the values of the corresponding functionals differ).
2. The functions f and g related by
\[ g = f + \frac{d\Lambda(q_\alpha, \tau)}{d\tau} \tag{15} \]
where $\Lambda(q_\alpha, \tau)$ is an arbitrary scalar function of τ and $q_\alpha$ (but not $\dot{q}_\alpha$) also satisfy the
same EL equations. To see this, either plug straight into the EL equations (which is a
mess), or recall that the EL equations are derived from F, which itself only changes by a
constant under the transformation: the extra term integrates to
$\Lambda(q_\alpha(\tau_f), \tau_f) - \Lambda(q_\alpha(\tau_i), \tau_i)$, which is fixed by the boundary conditions and is therefore the same for every path.
3. In (13), apart from their independence, the $q_\alpha$ are arbitrary. Suppose we’d decided to
work with different independent generalised coordinates $Q_\alpha$. Then in terms of these the
EL equations must take the same form
\[ \frac{\partial f}{\partial Q_\alpha} - \frac{d}{d\tau} \frac{\partial f}{\partial \dot{Q}_\alpha} = 0. \tag{16} \]
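Point 2 can be illustrated numerically. The sketch below assumes, purely for illustration, $f = \frac{1}{2}\dot{q}^2$ and $\Lambda(q, \tau) = \tau q^2$, and compares two different paths with the same end points q(0) = 0 and q(1) = 1. None of these choices, nor the function names, come from the text.

```python
def integrate(integrand, tau_i=0.0, tau_f=1.0, n=2000):
    """Midpoint rule for the integral of integrand(tau) over [tau_i, tau_f]."""
    h = (tau_f - tau_i) / n
    return sum(integrand(tau_i + (k + 0.5) * h) for k in range(n)) * h

def f(q, q_dot, tau):
    return 0.5 * q_dot**2

def g(q, q_dot, tau):
    # g = f + dLambda/dtau with Lambda(q, tau) = tau * q^2, so that
    # dLambda/dtau = q^2 + 2 * tau * q * q_dot along a path.
    return f(q, q_dot, tau) + q**2 + 2.0 * tau * q * q_dot

# Two different paths (q, q_dot) with the same end points q(0) = 0, q(1) = 1.
paths = {
    "linear": (lambda t: t, lambda t: 1.0),
    "quadratic": (lambda t: t**2, lambda t: 2.0 * t),
}

differences = {}
for name, (q, q_dot) in paths.items():
    F_f = integrate(lambda t: f(q(t), q_dot(t), t))
    F_g = integrate(lambda t: g(q(t), q_dot(t), t))
    differences[name] = F_g - F_f

print(differences)  # both close to Lambda(1, 1) - Lambda(0, 0) = 1
```

The two functionals differ by the same constant on both paths, so they single out the same extremal path.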
1.3 A note on Functionals and functional derivatives
Functions f are maps from (for example) the reals to the reals
\[ f : \mathbb{R} \to \mathbb{R} \quad \text{(function)}. \tag{17} \]
That is, given an $x \in \mathbb{R}$, $f(x) \in \mathbb{R}$. Of course functions may also be vector-valued: an example is
the magnetic field $\vec{B}$, which associates a vector to every point of 3-space. In this
case the function is a mapping $\mathbb{R}^3 \to \mathbb{R}^3$. One can also have scalar functions, such as $\phi(\vec{x})$
(the Higgs field is a possible example), which is a mapping $\mathbb{R}^3 \to \mathbb{R}$, as well as complex
functions.
Let $\mathcal{F}$ denote the space of functions. In physics one generally deals with infinitely
differentiable functions defined on an underlying coordinate space which is a manifold M; the
space of such functions is denoted by $C^\infty(M)$. A functional F is a map
\[ F : \mathcal{F} \to \mathbb{R} \quad \text{(functional)}. \tag{18} \]
That is, for $f \in \mathcal{F}$, $F[f] \in \mathbb{R}$. A functional depends on the value of the function $f(\vec{x})$ at all
points. (It should now be clear that a functional is not a function of a function, which is
nothing other than a function!)
Note that the arguments of a function and a functional are labelled differently: f (•) and
F [•] respectively.
1.3.1 Functional derivative
The functional derivative
\[ \frac{\delta F}{\delta f(x)} \]
of a functional F [f ] is defined as follows: for any infinitesimal δf (x),
\[ \delta F[f] = F[f + \delta f] - F[f] \equiv \int dx \, \frac{\delta F}{\delta f(x)} \, \delta f(x) \tag{19} \]
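This is in analogy with an ordinary function of n variables, for which, to first order,
\[ \delta f(x_1, \ldots, x_n) = f(x_1 + \delta x_1, \ldots, x_n + \delta x_n) - f(x_1, \ldots, x_n) = \sum_{j=1}^{n} \frac{\partial f}{\partial x_j} \delta x_j \tag{20} \]
In (19) the continuous label x plays the role of the discrete index j, and the integral replaces the sum.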
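The analogy between (19) and (20) suggests a simple way to estimate a functional derivative on a grid: perturb the function at a single grid point and divide the change in F by the perturbation times the grid spacing. The following is a minimal sketch of that idea. The test functional $F[f] = \int_0^1 f(x)^2 dx$ (for which $\delta F/\delta f(x) = 2f(x)$), the sample function and all names are illustrative assumptions.

```python
def F(values, dx):
    """Discretised F[f] = int f(x)^2 dx on a uniform grid."""
    return sum(v * v for v in values) * dx

def functional_derivative(F, values, dx, j, eps=1e-6):
    """Estimate dF/df(x_j): bump f at one grid point, divide by eps * dx.

    A bump of height eps on a single cell is a variation delta f whose
    integral is eps * dx, which is why dx appears in the denominator.
    """
    bumped = list(values)
    bumped[j] += eps
    return (F(bumped, dx) - F(values, dx)) / (eps * dx)

n = 100
dx = 1.0 / n
grid = [(k + 0.5) * dx for k in range(n)]
f_values = [x * (1.0 - x) for x in grid]   # sample function f(x) = x (1 - x)

j = 30
estimate = functional_derivative(F, f_values, dx, j)
print(estimate, 2.0 * f_values[j])  # the two numbers should nearly agree
```

Without the factor of dx the result would be the ordinary partial derivative of the discretised sum, which vanishes as the grid is refined; dividing by dx is what turns the sum over j into the integral over x.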
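Similarly the functional Taylor series is:
\[ F[f_0 + \delta f] = F[f_0] + \int dx \, \frac{\delta F[f_0]}{\delta f(x)} \, \delta f(x) + \frac{1}{2} \int dx_1 \, dx_2 \, \frac{\delta^2 F[f_0]}{\delta f(x_1) \delta f(x_2)} \, \delta f(x_1) \delta f(x_2) + \ldots \tag{21} \]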
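Specific examples:
1. Consider
\[ F[f] = \int dx \, f(x) \quad \Rightarrow \quad \delta F = \int dx \, \delta f(x) \tag{22} \]
so that comparing with (19) gives
\[ \frac{\delta F}{\delta f(x)} = 1 \tag{23} \]
2. Consider
\[ F[f] = \int dx \, \delta(x - y) f(x) \quad \Rightarrow \quad \delta F = \int dx \, \delta(x - y) \, \delta f(x) \tag{24} \]
where δ(x − y) is the Dirac delta function. Note that the left hand side is not only a functional, but also a function of y, since y is
not integrated over on the right hand side. Comparing with (19) gives
\[ \frac{\delta F}{\delta f(x)} = \delta(x - y) \tag{25} \]
But by definition $F[f] = \int dx \, \delta(x - y) f(x) = f(y)$. So we arrive at the very useful identity
\[ \frac{\delta f(y)}{\delta f(x)} = \delta(x - y). \tag{26} \]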
3. Consider
\[ F[f] = \int G(x, y) f(y) \, dy \tag{27} \]
Note that the left hand side is not only a functional, but also a function of x, since x
is not integrated over on the right hand side. It therefore makes no sense to calculate
δF/δf (x), but we can calculate δF/δf (z). From above
\[ \delta F = \int dy \, G(x, y) \, \delta f(y) = \int dz \, G(x, z) \, \delta f(z) \tag{28} \]
since y is a dummy integration variable. Hence comparing with (19)
\[ \frac{\delta F}{\delta f(z)} = G(x, z) \tag{29} \]
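4. Consider
\[ F[f] = a + \int dx \, b(x) f(x) + \frac{1}{2} \int dx \, dy \, c(x, y) f(x) f(y) \tag{30} \]
where, without loss of generality, one can assume c(x, y) = c(y, x). Then
\[ \delta F[f] = \int dx \, b(x) \, \delta f(x) + \frac{1}{2} \int dx \, dy \, c(x, y) \left[ f(x) \delta f(y) + f(y) \delta f(x) \right] \tag{31} \]
so that from (19), using the symmetry of c:
\[ \frac{\delta F}{\delta f(x)} = b(x) + \int dy \, c(x, y) f(y) \tag{32} \]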
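5. Consider
\[ F[f] = \int dx \, \delta(x - y) \left( f(x) \right)^n = f(y)^n \quad \Rightarrow \quad \delta F = \int dx \, \delta(x - y) \, n f(x)^{n-1} \, \delta f(x) \tag{33} \]
(using that $(f + \delta f)^n = f^n + n f^{n-1} \delta f + \ldots$). Thus
\[ \frac{\delta \left( f(y)^n \right)}{\delta f(x)} = \delta(x - y) \, n f(x)^{n-1} \tag{34} \]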
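As a consistency check linking example 4 to the Taylor series (21), one can differentiate (32) once more, using the identity (26). This worked step is an addition for illustration, not part of the original list:

```latex
\begin{align*}
\frac{\delta^2 F}{\delta f(x)\,\delta f(z)}
&= \frac{\delta}{\delta f(z)} \left[ b(x) + \int dy\, c(x,y) f(y) \right]
 = \int dy\, c(x,y)\, \delta(z - y) = c(x,z), \\
F[0] &= a, \qquad
\left.\frac{\delta F}{\delta f(x)}\right|_{f=0} = b(x), \qquad
\left.\frac{\delta^2 F}{\delta f(x)\,\delta f(z)}\right|_{f=0} = c(x,z).
\end{align*}
```

So (30) is exactly the expansion (21) about $f_0 = 0$, with a, b and c playing the roles of the zeroth, first and second functional derivatives; all higher functional derivatives of this F vanish.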
1.3.2 Functional derivatives and the EL equation
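Finally, now return to Eq. (12), which we can rewrite as
\[ \delta F = F[q_\alpha] - F[\hat{q}_\alpha] = \int_{\tau_i}^{\tau_f} d\tau \left[ \frac{\partial f}{\partial q_\alpha} - \frac{d}{d\tau} \frac{\partial f}{\partial \dot{q}_\alpha} \right] \delta q_\alpha \equiv \int_{\tau_i}^{\tau_f} d\tau \, \frac{\delta F}{\delta q_\alpha} \, \delta q_\alpha. \tag{35} \]
Thus the EL equations are equivalent to setting
\[ \frac{\delta F}{\delta q_\alpha} = 0 \tag{36} \]
exactly as one would expect at an extremum.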
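The identification in (35) can be probed numerically by discretising F and estimating $\delta F/\delta q(\tau_k)$ at an interior grid point. The sketch below again assumes the illustrative integrand $f = \frac{1}{2}\dot{q}^2 - \frac{1}{2}q^2$ (one generalised coordinate, EL equation $\ddot{q} + q = 0$); the grid, the two test paths and the function names are hypothetical choices.

```python
import math

def action(q, h):
    """Discretised F[q] = int (q'^2/2 - q^2/2) dtau for grid values q[0..n]."""
    total = 0.0
    for k in range(len(q) - 1):
        q_dot = (q[k + 1] - q[k]) / h
        q_mid = 0.5 * (q[k] + q[k + 1])
        total += h * (0.5 * q_dot**2 - 0.5 * q_mid**2)
    return total

def functional_derivative(q, h, k, eps=1e-4):
    """Central-difference estimate of dF/dq(tau_k) at an interior grid point."""
    up = list(q)
    up[k] += eps
    down = list(q)
    down[k] -= eps
    return (action(up, h) - action(down, h)) / (2.0 * eps * h)

n = 200
h = 1.0 / n
taus = [k * h for k in range(n + 1)]
solution = [math.sin(t) for t in taus]   # satisfies q'' + q = 0
straight = list(taus)                    # q = tau does not

k = n // 2  # grid point tau = 0.5
on_solution = functional_derivative(solution, h, k)
off_solution = functional_derivative(straight, h, k)
print(on_solution, off_solution)
```

On the solution the estimate is zero up to discretisation error, while on the straight path it reproduces $\partial f/\partial q - \frac{d}{d\tau}\,\partial f/\partial \dot{q} = -q - \ddot{q} = -0.5$ at τ = 0.5.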
1.3.3 Functional Integration
Functional integration, or path integration, is a huge subject in itself, and provides one of the
most powerful methods of modern theoretical physics. The functional integration approach is
(essentially by definition) central to systems with an infinite number of degrees of freedom, and
is very suitable for the introduction and formulation of the diagrammatic perturbation theory of
quantum field theory and statistical physics. Any beginning course on quantum field theory
starts with functional integrals: this is well outside the scope of the present course.
For a curious reader, the path integral (denoted by $\int \mathcal{D}f(x)$) was first introduced by Feynman:
for example in quantum mechanics, the transition amplitude between an initial quantum state
$|q_i t_i\rangle$ and a final quantum state $|q_f t_f\rangle$ is written as
\[ \langle q_f t_f | q_i t_i \rangle = \int \mathcal{D}q \, e^{\frac{i}{\hbar} S[q]} \tag{37} \]
where for simplicity we have assumed one generalised coordinate q, and S is the action mentioned
above. The classical path is q(t). Rather loosely, a preliminary way to think of $\mathcal{D}q$ is
as
\[ \mathcal{D}q \propto \prod_t dq(t) \tag{38} \]
2 Calculus of variations b): Symmetries and conservation laws
In nearly all physical phenomena there exist quantities which are conserved during the evolution
of the system — depending on the problem, these may be the angular momentum, the total
energy, the total momentum etc. Knowing the existence of such conserved quantities is crucial
in determining the dynamics of the system under consideration (think of the collision of billiard
balls for example). The aim of this short section is to see how such conservation laws appear
in the functional approach.
In the following we work with $F[q_\alpha] = \int_{\tau_i}^{\tau_f} f(q_\alpha, \dot{q}_\alpha, \tau) \, d\tau$.
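Definition: A function $C(q_\alpha, \dot{q}_\alpha, \tau)$ is a constant of motion (= conserved quantity) along
the path $\hat{q}_\alpha$ which solves the EL equations, if its total derivative with respect to τ vanishes along the path:
\[ \frac{dC}{d\tau} = \frac{\partial C}{\partial q_\alpha} \dot{q}_\alpha + \frac{\partial C}{\partial \dot{q}_\alpha} \ddot{q}_\alpha + \frac{\partial C}{\partial \tau} = 0 \tag{39} \]
Note the use of the summation convention here. The total derivative takes into account the
τ -dependence due to the evolution in τ of the $q_\alpha(\tau)$ and $\dot{q}_\alpha(\tau)$.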
When do we find conserved quantities?
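1. Suppose f has no explicit τ dependence, ∂f /∂τ = 0, so that $f = f(q_\alpha, \dot{q}_\alpha)$.
Along the trajectory satisfying the EL equations, it follows that
\begin{align*}
\frac{df}{d\tau} &= \frac{\partial f}{\partial q_\alpha} \dot{q}_\alpha + \frac{\partial f}{\partial \dot{q}_\alpha} \ddot{q}_\alpha \\
&= \frac{d}{d\tau}\left( \frac{\partial f}{\partial \dot{q}_\alpha} \right) \dot{q}_\alpha + \frac{\partial f}{\partial \dot{q}_\alpha} \ddot{q}_\alpha \\
&= \frac{d}{d\tau}\left( \frac{\partial f}{\partial \dot{q}_\alpha} \dot{q}_\alpha \right) \tag{40}
\end{align*}
where in going from the 1st to the 2nd line, we’ve used the EL equations. Therefore, moving
the left hand side to the right hand side, it follows that
\[ h = \frac{\partial f}{\partial \dot{q}_\alpha} \dot{q}_\alpha - f = \text{constant} \tag{41} \]
Namely h is conserved if ∂f /∂τ = 0 (Beltrami's theorem).
If such a conserved quantity exists, use it! It provides a different way of encoding the EL
equations and has the advantage of being first order in τ, often making it easier to
find the path which minimises the functional F .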
2. Suppose f does not depend explicitly on one of the generalised coordinates, say $q_1$, i.e. $\partial f/\partial q_1 = 0$,
so that $f = f(q_2, q_3, \ldots, q_N, \dot{q}_1, \ldots, \dot{q}_N, \tau)$.
Then from the EL equations, it follows that
\[ \frac{d}{d\tau} \frac{\partial f}{\partial \dot{q}_1} = 0 \quad \Rightarrow \quad \frac{\partial f}{\partial \dot{q}_1} = \text{constant} \tag{42} \]
There are numerous examples of this. One is that of a particle moving in a central
potential V (r), say in two dimensions. Then the obvious generalised coordinates are
(r, θ) and the dynamics of the particle is determined by minimising a functional
$F[r, \theta] = \int dt \, f(\dot{r}, \dot{\theta}, r)$. The corresponding conserved quantity $\partial f/\partial \dot{\theta}$ is nothing other than the
angular momentum of the particle.
3. Noether’s theorem and symmetry
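What happens when f is invariant (or symmetric) under a change of generalised coordinates?
Consider a one-parameter family of maps
\[ q_\alpha(\tau) \to Q_\alpha(s, \tau) \qquad (s \in \mathbb{R}) \tag{43} \]
such that $Q_\alpha(0, \tau) = q_\alpha(\tau)$. If this transformation leaves the functional form of $f(q_\alpha, \dot{q}_\alpha, \tau)$
invariant to linear order in s, it is said to be a continuous symmetry of f . Mathematically,
a continuous symmetry requires
\[ \frac{\partial f}{\partial Q_\alpha} \left. \frac{\partial Q_\alpha}{\partial s} \right|_{s=0} + \frac{\partial f}{\partial \dot{Q}_\alpha} \left. \frac{\partial \dot{Q}_\alpha}{\partial s} \right|_{s=0} = 0 \tag{44} \]
Why? Since we work to linear order, (43) can be written as
\[ q_\alpha(\tau) \to Q_\alpha(s, \tau) = q_\alpha(\tau) + s h_\alpha(\tau) + \ldots \tag{45} \]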
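Of course, for any given Lagrangian (here the integrand f ), the aim is to find exactly which $h_\alpha$ (or equivalently
which $Q_\alpha(s, \tau)$) leave the Lagrangian invariant. Then, on doing a Taylor expansion to
linear order,
\[ f(Q_\alpha(s, \tau), \dot{Q}_\alpha(s, \tau), \tau) = f(q_\alpha, \dot{q}_\alpha, \tau) + s \left[ \left. \frac{\partial f}{\partial Q_\alpha} \right|_{s=0} h_\alpha + \left. \frac{\partial f}{\partial \dot{Q}_\alpha} \right|_{s=0} \dot{h}_\alpha \right] + \ldots \tag{46} \]
For a symmetry, the term proportional to s must vanish. This is nothing other than
the above condition (44), since once you have found $Q_\alpha(s, \tau)$ then $h_\alpha$ is obtained from (45)
through
\[ h_\alpha(\tau) = \left. \frac{\partial Q_\alpha(s, \tau)}{\partial s} \right|_{s=0} \tag{47} \]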
Statement of theorem:
Noether’s theorem states that for each such symmetry there is a conserved quantity given
by
\[ \frac{\partial f}{\partial \dot{q}_\alpha} \left. \frac{\partial Q_\alpha}{\partial s} \right|_{s=0} = \text{constant} \tag{48} \]
(Again recall that there’s an implicit summation over α here.)
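Proof:
By definition
\begin{align*}
0 = \left. \frac{\partial f}{\partial s} \right|_{s=0} &= \frac{\partial f}{\partial Q_\alpha} \left. \frac{\partial Q_\alpha}{\partial s} \right|_{s=0} + \frac{\partial f}{\partial \dot{Q}_\alpha} \left. \frac{\partial \dot{Q}_\alpha}{\partial s} \right|_{s=0} \\
&= \left. \frac{\partial Q_\alpha}{\partial s} \right|_{s=0} \frac{\partial f}{\partial q_\alpha} + \left. \frac{\partial \dot{Q}_\alpha}{\partial s} \right|_{s=0} \frac{\partial f}{\partial \dot{q}_\alpha} \\
&= \left. \frac{\partial Q_\alpha}{\partial s} \right|_{s=0} \frac{d}{d\tau}\left( \frac{\partial f}{\partial \dot{q}_\alpha} \right) + \left. \frac{\partial \dot{Q}_\alpha}{\partial s} \right|_{s=0} \frac{\partial f}{\partial \dot{q}_\alpha} \\
&= \frac{d}{d\tau} \left( \frac{\partial f}{\partial \dot{q}_\alpha} \left. \frac{\partial Q_\alpha}{\partial s} \right|_{s=0} \right) \tag{49}
\end{align*}
where the third line uses the EL equations and the last line uses $\partial \dot{Q}_\alpha/\partial s = \frac{d}{d\tau}\left( \partial Q_\alpha/\partial s \right)$.
Since the LHS vanishes for a continuous symmetry, it follows that (48) is conserved.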
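As a small illustration connecting Noether's theorem to the central-potential example of point 2, consider a rotation by an angle s acting on the coordinates (r, θ) with an integrand of the form $f(\dot{r}, \dot{\theta}, r)$. The labels R and Θ for the transformed coordinates are introduced here only for this sketch:

```latex
\begin{align*}
(r, \theta) &\to (R, \Theta) = (r, \theta + s), \qquad
\left.\frac{\partial R}{\partial s}\right|_{s=0} = 0, \quad
\left.\frac{\partial \Theta}{\partial s}\right|_{s=0} = 1, \quad
\left.\frac{\partial \dot R}{\partial s}\right|_{s=0}
 = \left.\frac{\partial \dot\Theta}{\partial s}\right|_{s=0} = 0, \\
\text{(44):}\quad & \frac{\partial f}{\partial r}\cdot 0 + \frac{\partial f}{\partial \theta}\cdot 1
 + \frac{\partial f}{\partial \dot r}\cdot 0 + \frac{\partial f}{\partial \dot\theta}\cdot 0
 = \frac{\partial f}{\partial \theta} = 0, \\
\text{(48):}\quad & \frac{\partial f}{\partial \dot r}\cdot 0 + \frac{\partial f}{\partial \dot\theta}\cdot 1
 = \frac{\partial f}{\partial \dot\theta} = \text{constant}.
\end{align*}
```

The symmetry condition holds because f does not depend on θ, and the conserved quantity given by the theorem is $\partial f/\partial \dot{\theta}$, the same quantity obtained directly from the EL equations in (42).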
We will see numerous applications of Noether’s theorem and conserved quantities once we
have defined the action for classical mechanics. (For example, many systems are invariant under
spatial translations $\vec{x}(t) \to \vec{x}(t) + s\vec{n}$. The quantity conserved by Noether’s theorem will turn
out to be the total linear momentum.)
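The conservation laws (41) and (42) can also be checked numerically along a solution of the EL equations. Since the action for mechanics has not yet been defined here, the sketch below simply assumes an illustrative integrand $f = \frac{1}{2}(\dot{r}^2 + r^2\dot{\theta}^2) + 1/r$ (unit mass in a central potential V(r) = −1/r), integrates its EL equations with a standard fourth-order Runge-Kutta step, and monitors $\partial f/\partial\dot{\theta} = r^2\dot{\theta}$ and $h = \frac{1}{2}(\dot{r}^2 + r^2\dot{\theta}^2) - 1/r$. The initial conditions, step size and function names are arbitrary choices.

```python
def derivs(state):
    """EL equations for f = (r'^2 + r^2 theta'^2)/2 + 1/r as a first-order system."""
    r, theta, r_dot, theta_dot = state
    r_ddot = r * theta_dot**2 - 1.0 / r**2       # d/dt(r') = df/dr
    theta_ddot = -2.0 * r_dot * theta_dot / r    # from d/dt(r^2 theta') = 0
    return (r_dot, theta_dot, r_ddot, theta_ddot)

def rk4_step(state, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(k, factor):
        return tuple(s + factor * dt * ki for s, ki in zip(state, k))
    k1 = derivs(state)
    k2 = derivs(shifted(k1, 0.5))
    k3 = derivs(shifted(k2, 0.5))
    k4 = derivs(shifted(k3, 1.0))
    return tuple(
        s + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )

def angular_momentum(state):
    """Conserved quantity (42): df/d(theta') = r^2 theta'."""
    r, _, _, theta_dot = state
    return r**2 * theta_dot

def beltrami_h(state):
    """Conserved quantity (41): h = (df/dq') q' - f."""
    r, _, r_dot, theta_dot = state
    return 0.5 * (r_dot**2 + r**2 * theta_dot**2) - 1.0 / r

state = (1.0, 0.0, 0.0, 1.2)   # (r, theta, r', theta') at t = 0
l0, h0 = angular_momentum(state), beltrami_h(state)
max_drift_l = max_drift_h = 0.0
for _ in range(10000):         # integrate up to t = 10 with dt = 1e-3
    state = rk4_step(state, 1e-3)
    max_drift_l = max(max_drift_l, abs(angular_momentum(state) - l0))
    max_drift_h = max(max_drift_h, abs(beltrami_h(state) - h0))

print(max_drift_l, max_drift_h)  # both should be tiny (integration error only)
```

Both quantities stay constant up to the integration error, even though r, θ and their derivatives all change along the path.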