GR II
Lecture Notes
Abstract
These notes represent the material covered in the Part II lecture General Relativity
(GR). While the course is largely self-contained and some aspects of Newtonian Gravity
and Special Relativity will be reviewed, it is assumed that readers are already familiar
with these topics. Calculus in N dimensions and Linear Algebra will also be used
extensively without being introduced.
There is a wide range of books available on the topic and these notes have found
inspiration in several of these. Likewise, these notes benefit considerably from other lecture
notes used for this course or its Part III extension in previous years. Readers may find it
helpful to consult any of these as alternative sources for the material, although the goal
of these notes is to make this an optional rather than a necessary procedure for following
the material. We note in particular the lecture notes for Part III GR by Harvey Reall
[36], and the Part II GR notes by Gary Gibbons [37] and Stephen Siklos [38].
The content of these notes is too comprehensive to be put on the blackboard verbatim.
A condensed version that closely mirrors the blackboard content
will be generated at some later stage.
A subset of the wealth of literature on Einstein’s theory is given as follows.
• S. M. Carroll: “Spacetime and Geometry: An Introduction to General Relativity” [8]; cf. also [7].
• R. d’Inverno: “Introducing Einstein’s Relativity” [9].
• J. B. Hartle: “Gravity, An Introduction to Einstein’s General Relativity” [11].
• L. P. Hughston & K. P. Tod: “An Introduction to General Relativity” [15].
• C. W. Misner, K. S. Thorne & J. A. Wheeler: “Gravitation” [17].
• W. Rindler: “Relativity: Special, General, and Cosmological” [20].
• L. Ryder: “Introduction to General Relativity” [21].
• B. Schutz: “A first course in general relativity” [24].
• H. Stephani: “An Introduction to Special and General Relativity” [27].
• R. M. Wald: “General Relativity” [30].
• S. Weinberg: “Gravitation and Cosmology: Principles and Applications of the General Theory of Relativity” [31].
I have not read all of these books, but will attempt here to give my two cents of guidance
based on what I have read. Schutz’s book is an excellent very first reading of general
relativity. I also enjoyed Carroll’s book a lot (on top of a good compromise between
mathematical foundation and physics, I enjoyed his sense of humor). I found d’Inverno
amazingly readable, especially given that it goes quite a bit beyond the standard material
on several occasions. I may be biased, but I certainly enjoyed how much of the material
in his book turned out to be of high value in numerical relativity. (As an aside: its German
translation, while equally readable, has a good number of typos in its first edition, which is
the one I know.) Misner, Thorne & Wheeler is often referred to as “The Bible of GR” and you
will quickly find out why (starting when carrying it home). It was my first introduction
to the geometrical foundation of relativity and it is simply breathtaking at providing the
reader with a visual idea of curved geometry and its mathematical toolkit. Weinberg is
also a classic, but focuses more on the field theoretical side rather than geometric images.
I enjoyed the Cosmology part most. I have frequently used Ryder and Wald for selected
chapters but have not read them from the beginning (simply because I only knew about
them at a later stage when reading books from the beginning had become an unaffordable
luxury). Ryder seems a great introduction while Wald is rightfully famous for considerable
mathematical rigor and depth (if you like, a good stepping stone towards Hawking & Ellis
[13]). I have heard good things about Hartle’s book but have not got my hands on it myself
yet. It goes without saying that these are merely my own humble opinions. As usual with
textbooks, the recommendation is to have a look yourself and find your optimal selection.
Chocolate is a wonderful thing in my opinion, but I know people who just don’t happen
to like it...
Ulrich Sperhake
Contents
A Preliminaries
A.1 Units and constants of nature
A.2 Newtonian gravity
A.2.1 A tale of three masses
A.2.2 Equivalence principles
A.2.3 Gravitational redshift
A.2.4 An index based formulation of Newtonian Gravity
A.2.5 The need for general relativity
A.3 A review of special relativity
A.3.1 Notation and metric
A.3.2 Lorentz transformations
A.3.3 World lines and the four velocity
A.3.4 Time dilation and Lorentz contraction
A.3.5 Four momentum and Doppler shift
B Differential geometry
B.1 Manifolds and tensors
B.1.1 Functions and curves
B.1.2 Vectors
B.1.3 Covectors / one-forms
B.1.4 Tensors
B.1.5 Tensor operations
B.1.6 Tensor fields
B.1.7 Integral curves
B.2 The metric tensor
B.2.1 Metrics
B.2.2 Lorentzian signature
B.3 Geodesics
B.3.1 Curves revisited
B.3.2 Geodesic curves defined by a variational principle: Version 1
B.3.3 Geodesic curves defined by a variational principle: Version 2
B.4 The covariant derivative
B.5 The Levi-Civita connection
B.6 Parallel transport
B.7 Normal coordinates
B.8 The Riemann tensor
B.8.1 The commutator
B.8.2 Second derivatives and the Riemann tensor
B.8.3 Symmetries of the Riemann tensor
B.8.4 Parallel transport and curvature
B.8.5 Geodesic deviation
E Cosmology
E.1 Homogeneity and Isotropy
E.2 The Friedmann equations
E.2.1 Ricci tensor and Christoffel symbols
E.2.2 The cosmological matter fields
E.2.3 The Einstein equations in cosmology
E.3 Cosmological redshift
E.4 Cosmological models
E.4.1 General considerations
E.4.2 Selected solutions to the Friedmann equations
A Preliminaries
A.1 Units and constants of nature
The units we use for measuring things in our day-to-day lives are naturally adjusted to the
magnitude of the size or mass of ourselves and the objects we tend to deal with. It does not
matter here, whether you prefer Imperial or SI units; on a good pub outing, you will very
likely consume of the order $O(1)$ pints or liters of beer rather than, say, $O(10^{-2})$ or $O(10^{2})$.
When dealing with the wide range of objects in physics, these units are often not the most suitable
because we have essentially no intuitive understanding of numbers such as $2 \times 10^{30}$, the mass
of the sun in kg. Here lies one reason why physicists often introduce units other than those
used in supermarkets. It is not the only reason, however.
A second, and more profound, reason arises from the seeming constancy of certain values in
nature. While we cannot be absolutely certain that the speed of light, Planck’s $\hbar$ or Newton’s
gravitational constant are genuinely constant over all of space and time, experiments and
observations made so far suggest that they are, and we will follow in this course the working
hypothesis that this is indeed the case.
Constants of nature have two prominent implications: (i) they relate what previously appeared
to be different fundamental physical dimensions and (ii) they give us an intuitive notion about
the regime of validity of a physical theory. In this section, we will discuss these two phenomena
for the speed of light c, Newton’s constant G and Planck’s constant $\hbar$.
The constancy of the speed of light, of course, was one of the key ingredients in Einstein’s derivation of the theory of
special relativity. It turns out to be very convenient in these notes and, indeed, in much of the research
in relativity, to measure all velocities in units of the speed of light, i.e. to set c = 1. This is to be
applied quite literally to Eq. (A.1), so that
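\[ c = 3.00 \times 10^{8}\,\mathrm{m/s} \overset{!}{=} 1 \quad\Rightarrow\quad 1\,\mathrm{s} = 3.00 \times 10^{8}\,\mathrm{m} . \tag{A.2} \]
Note that we really mean that 1 second is the same as $3.00 \times 10^{8}$ meters. This notion is most
familiar from the use of “light years” for astrophysical distances,
\[ 1\,\mathrm{yr} = 365.25\,\frac{\mathrm{days}}{\mathrm{year}} \times 86\,400\,\frac{\mathrm{seconds}}{\mathrm{day}} \times 2.9979 \times 10^{8}\,\frac{\mathrm{m}}{\mathrm{s}} = 9.4607 \times 10^{15}\,\mathrm{m} . \tag{A.3} \]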
It is a testament to the intuitive potential of this concept that the light year is frequently used
in public presentations of astrophysical results and in science fiction, whereas astrophysicists
at work tend to use the unit parsec instead. A parsec, 1 pc ≈ 3.26 lightyears, is the distance at
which a celestial body undergoes a parallax of 1 arcsecond while the Earth orbits once around
the sun – parsec = short for parallax second.
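The conversions above are easy to check numerically. The following is a minimal sketch that treats "multiply by c" as the conversion from seconds to meters; the value of the astronomical unit and the reading of the parsec as the distance at which a baseline of 1 AU subtends 1 arcsecond are assumptions added here for illustration and are not quoted in these notes.

```python
import math

C = 2.9979e8             # speed of light in m/s, the value used in Eq. (A.3)
SECONDS_PER_DAY = 86_400
DAYS_PER_YEAR = 365.25
AU = 1.495978707e11      # astronomical unit in m (assumed; not quoted in the notes)


def seconds_to_meters(t_seconds):
    """With c = 1 a time interval is a length: multiply by c."""
    return C * t_seconds


light_year = seconds_to_meters(DAYS_PER_YEAR * SECONDS_PER_DAY)

# Distance at which a baseline of 1 AU subtends an angle of 1 arcsecond.
one_arcsecond = math.radians(1.0 / 3600.0)
parsec = AU / math.tan(one_arcsecond)

print(f"1 light year = {light_year:.4e} m")
print(f"1 parsec     = {parsec:.4e} m = {parsec / light_year:.2f} light years")
```

The first number should agree with Eq. (A.3) to the quoted digits, and the second with 1 pc ≈ 3.26 light years.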
The speed of light thus gives us a natural unit for velocities and establishes a fundamental
link between time and spatial distance. It furthermore tells us when a velocity is large in
an absolute sense, namely in terms of a dimensionless number. Absolute numbers in physics
only give us a real sense of the magnitude of something when that number is dimensionless.
Often, such numbers also suggest when a physical theory hits the limits of its regime of validity.
For instance, for velocities $v \ll c$, the Galileo transformations give us an exquisitely accurate
rule for transforming from one coordinate frame to another moving with v relative to the
first. For v ≈ c, however, we know that this rule breaks down and we need to use Lorentz
transformations instead. In fact, Galileo transformations turn out to be the leading order Taylor
expansion of Lorentz transformations around v = 0. Likewise, the Newtonian expression for
kinetic energy $m v^2/2$ is the leading-order approximation obtained from Taylor expanding the
relativistic $E^2 = p^2 c^2 + m^2 c^4$ around v = 0. We have here a first warning that a theory that is
practically used only in the limit of a small dimensionless number may turn out to be merely
a leading order approximation of a more fundamental theory. This may also be the case of
General Relativity itself.
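The statement about the kinetic energy can be illustrated numerically. The sketch below compares the exact kinetic energy, obtained from $E^2 = p^2 c^2 + m^2 c^4$ with the standard relativistic momentum $p = \gamma m v$ (assumed here, since it is not written out above), with the Newtonian $m v^2/2$. Function names are purely illustrative.

```python
import math


def kinetic_energy_relativistic(m, v, c=1.0):
    """Exact kinetic energy E - m c^2, with E^2 = p^2 c^2 + m^2 c^4 and p = gamma m v."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    p = gamma * m * v
    energy = math.sqrt((p * c) ** 2 + (m * c ** 2) ** 2)
    return energy - m * c ** 2


def kinetic_energy_newtonian(m, v):
    """Leading-order term of the Taylor expansion around v = 0."""
    return 0.5 * m * v ** 2


for v in (0.001, 0.01, 0.1, 0.5, 0.9):
    exact = kinetic_energy_relativistic(1.0, v)
    newton = kinetic_energy_newtonian(1.0, v)
    print(f"v/c = {v:5.3f}   relative error of m v^2/2: {(exact - newton) / exact:.2e}")
```

For small v the relative error shrinks like $v^2/c^2$, as expected from the next term in the expansion, while for $v \approx c$ the Newtonian expression fails badly.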
in general behave quite differently than Newtonian theory would predict; we need general
relativity for their modeling. The sun has a radius $R_\odot = 6.957 \times 10^{5}$ km, so
\[ \frac{M_\odot}{R_\odot} \ll 1 . \tag{A.8} \]
Solar dynamics are accurately modelled using Newtonian gravity. For example, the relativistic
effects of light bending near the solar surface are very small and require rather high-precision
measurements to become detectable. We will return to this in Sec. D.3.2 below.
Note that in many physical systems, the regimes of high velocity and strong gravity overlap.
For example, the velocity of a test mass in a circular orbit of radius r around a spherically
symmetric body of mass M is (using Newtonian theory) given by
\[ \frac{v^2}{c^2} = \frac{G M}{c^2 r} , \tag{A.9} \]
and the escape velocity from the surface of a spherically symmetric body of mass M and radius
R is
\[ v_e = \sqrt{\frac{2 G M}{R}} . \tag{A.10} \]
So we have $v^2 \sim M/R$ when the velocity is determined by gravitational effects and the regime
v ≈ 1 coincides with the regime M/R ≈ 1. Post-Newtonian theory is a whole branch of
gravitational research concerned with expanding general relativity around Newtonian gravity
in terms of a power series in a dimensionless parameter $\epsilon = v^2 = M/R$ [5]. If, on the other
hand, large velocities are of non-gravitational origin, special relativity provides a satisfactory
description. This applies, for example, to collision experiments at particle colliders.
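To get a feeling for the numbers, the following sketch evaluates the dimensionless ratio $GM/(c^2 R)$ and the Newtonian velocities of Eqs. (A.9) and (A.10) for the sun. The SI value of G is an assumption added here (it is not quoted in this part of the notes), and the function names are illustrative only.

```python
import math

G = 6.674e-11      # Newton's constant in m^3 kg^-1 s^-2 (assumed SI value)
C = 2.9979e8       # speed of light in m/s
M_SUN = 1.989e30   # solar mass in kg
R_SUN = 6.957e8    # solar radius in m (6.957e5 km)


def compactness(mass, radius):
    """Dimensionless ratio G M / (c^2 R); values of order unity signal strong gravity."""
    return G * mass / (C ** 2 * radius)


def circular_orbit_velocity(mass, r):
    """Newtonian circular orbit velocity, Eq. (A.9)."""
    return math.sqrt(G * mass / r)


def escape_velocity(mass, radius):
    """Newtonian escape velocity from the surface, Eq. (A.10)."""
    return math.sqrt(2.0 * G * mass / radius)


print(f"G M / (c^2 R) for the sun: {compactness(M_SUN, R_SUN):.3e}")
print(f"orbit at the surface:  v/c = {circular_orbit_velocity(M_SUN, R_SUN) / C:.3e}")
print(f"escape from the surface: v/c = {escape_velocity(M_SUN, R_SUN) / C:.3e}")
```

Both velocities come out at a few times $10^{-3}$ of the speed of light, so that $v^2/c^2$ is of the same order as $GM/(c^2 R)$, in line with the estimate $v^2 \sim M/R$.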
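classical physics break down and we have entered the realm of quantum mechanics. For example,
we can safely track the sun using optical light ($\nu \sim 5 \times 10^{14}$ Hz), since
\[ \frac{\hbar\omega}{M_\odot c^2} = \frac{\omega}{M_\odot} = \frac{2\pi \times 5 \times 10^{14}\,\mathrm{Hz}}{1.989 \times 10^{30}\,\mathrm{kg} \times 8.5223 \times 10^{50}\,\frac{\mathrm{Hz}}{\mathrm{kg}}} \approx 1.85 \times 10^{-66} \ll 1 . \tag{A.14} \]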
Life doesn’t get much more classical than that. How about tracking protons? Using the proton
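mass in SI units, $m_p = 1.6726219 \times 10^{-27}$ kg, we obtain
\[ \frac{\omega}{m_p} = \frac{2\pi \times 5 \times 10^{14}\,\mathrm{Hz}}{1.6726 \times 10^{-27}\,\mathrm{kg} \times 8.5223 \times 10^{50}\,\frac{\mathrm{Hz}}{\mathrm{kg}}} \approx 2.2 \times 10^{-9} \ll 1 , \tag{A.15} \]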
which is still ok. For instance, we can safely trace the trajectory of protons in bubble chambers.
Next let us consider energy levels in atoms. For this purpose recall that the energy difference
between different electron states in an atom is of the order of electron volts and that
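\[ 1\,\mathrm{eV} = 1.602176565 \times 10^{-19}\,\mathrm{J} = 1.602176565 \times 10^{-19}\,\frac{\mathrm{kg\,m^2}}{\mathrm{s^2}} \]
\[ \Rightarrow\quad m_{\mathrm{eV}} = \frac{1\,\mathrm{eV}}{c^2} = 1.78269 \times 10^{-36}\,\mathrm{kg} . \tag{A.16} \]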
If we wish to probe energy levels in atoms using optical light, we have
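\[ \frac{\hbar\omega}{m_{\mathrm{eV}} c^2} = \frac{\omega}{m_{\mathrm{eV}}} = \frac{2\pi \times 5 \times 10^{14}\,\mathrm{Hz}}{1.78269 \times 10^{-36}\,\mathrm{kg} \times 8.5223 \times 10^{50}\,\frac{\mathrm{Hz}}{\mathrm{kg}}} \approx 2 = O(1) , \tag{A.17} \]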
and have definitely reached the quantum regime. The light thrown at the atoms is manifestly
perturbing the very energy levels we are interested in studying.
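The three estimates (A.14), (A.15) and (A.17) follow the same pattern and can be reproduced with a few lines of code. This is a sketch under the assumption of standard SI values for $\hbar$ and c; the helper `quantum_ratio` and the constant name `HZ_PER_KG` are made up for this illustration.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant in J s (assumed SI value)
C = 2.99792458e8         # speed of light in m/s
EV = 1.602176565e-19     # 1 eV in J, the value used in Eq. (A.16)

HZ_PER_KG = C ** 2 / HBAR   # conversion factor c^2 / hbar between mass and frequency


def quantum_ratio(omega, mass_kg):
    """hbar * omega / (m c^2): energy of the probing light relative to the rest energy."""
    return omega / (mass_kg * HZ_PER_KG)


omega_optical = 2.0 * math.pi * 5e14   # angular frequency of optical light in Hz

masses = {
    "sun": 1.989e30,
    "proton": 1.6726219e-27,
    "1 eV / c^2": EV / C ** 2,
}
for name, mass in masses.items():
    print(f"{name:>12}: hbar omega / (m c^2) = {quantum_ratio(omega_optical, mass):.3e}")
```

The three ratios span some 66 orders of magnitude, from the thoroughly classical sun to the O(1) value that marks the quantum regime.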
An alternative way to look at the unity of Planck’s constant is to consider the Compton wavelength
\[ \bar{\lambda} = \frac{\hbar}{m c} = \frac{1}{m} , \tag{A.18} \]
so in natural units a particle’s mass is merely the inverse of its Compton wavelength. The
dimensionless quantity then is the ratio of the Compton wavelength of the object to its size
or the characteristic length scale of its available volume. Macroscopic objects are much larger
than their Compton wavelength. For the sun, for instance, we obtain the absurdly small value
\[ \bar{\lambda}_\odot = \frac{\hbar}{M_\odot c} = 0.5028 \times 10^{-30}\,\mathrm{kg}^{-1} = 0.177 \times 10^{-72}\,\mathrm{m} , \tag{A.19} \]
and clearly $\bar{\lambda}_\odot / R_\odot \ll 1$. The sun as a compound object is a classical object through and
through. Of course, quantum effects play a very important role for the behaviour of the sun’s
constituent matter, but not for the sun as a lump object. For a proton, on the other hand, the
Compton wavelength is
\[ \bar{\lambda}_p = \frac{\hbar}{m_p c} = 2.10268 \times 10^{-16}\,\mathrm{m} = 0.210268\,\mathrm{fm} . \tag{A.20} \]
The radius of atomic nuclei ranges from O(1) to O(10) fm, so the size of the available volume is
comparable to the proton’s Compton wavelength and quantum effects are important.
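The comparison between Compton wavelength and size can be sketched as follows. The SI values of $\hbar$ and c are assumptions added here, and the length scale of 1 fm used for the proton in a nucleus is simply the lower end of the range quoted above.

```python
HBAR = 1.054571817e-34   # reduced Planck constant in J s (assumed SI value)
C = 2.99792458e8         # speed of light in m/s


def reduced_compton_wavelength(mass_kg):
    """Reduced Compton wavelength hbar / (m c) in meters, cf. Eq. (A.18)."""
    return HBAR / (mass_kg * C)


# (mass in kg, characteristic length scale in m)
objects = {
    "sun": (1.989e30, 6.957e8),
    "proton in a nucleus": (1.6726219e-27, 1.0e-15),
}
for name, (mass, size) in objects.items():
    wavelength = reduced_compton_wavelength(mass)
    print(f"{name:>20}: wavelength = {wavelength:.3e} m, wavelength / size = {wavelength / size:.1e}")
```

For the sun the ratio is vanishingly small, whereas for the proton it is within an order of magnitude of unity.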
In summary, we have the following three dimensionless quantities that mark the onset of the
need for new physics when they approach values of the order of unity.
(1) $v/c \approx 1$ ⇒ Galileo transformations are no longer valid and we need special relativity.
(2) $G M/(c^2 R) \approx 1$ ⇒ Newtonian gravity breaks down and we need general relativity.
(3) $\bar{\lambda}/R = \hbar/(M c R) \approx 1$ ⇒ Classical physics breaks down and we need quantum theory.
We conclude this discussion with the question of the overlap between the three regimes. We
already discussed this issue for the first two items: we may have large velocities without strong
gravity which is well described by Einstein’s theory of special relativity. General relativity fully
includes special relativity, on the other hand, so when we have strong gravity, we automatically
have relativistic effects. The most intriguing overlap is that between general relativity and
quantum theory and it remains one of the great unknowns of contemporary physics. This
overlap regime is characterized by having
\[ \frac{G M}{c^2 R} = 1 \quad\text{and}\quad \frac{\hbar}{M c R} = 1 \quad\Rightarrow\quad M^2 = \frac{\hbar c}{G} . \tag{A.21} \]
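This scale is called the Planck mass, length or time defined by
\[ \text{Planck mass}\quad M_{\mathrm{Pl}} = \sqrt{\frac{\hbar c}{G}} = 2.18 \times 10^{-8}\,\mathrm{kg} = 1.22 \times 10^{19}\,\mathrm{GeV} \tag{A.22} \]
\[ \text{Planck length}\quad L_{\mathrm{Pl}} = \frac{G}{c^2} M_{\mathrm{Pl}} = 1.62 \times 10^{-35}\,\mathrm{m} \tag{A.23} \]
\[ \text{Planck time}\quad T_{\mathrm{Pl}} = \frac{1}{c} L_{\mathrm{Pl}} = 5.39 \times 10^{-44}\,\mathrm{s} . \tag{A.24} \]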
This is the regime where we need a new theory: quantum gravity.
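The Planck scale follows directly from the two conditions in (A.21). A minimal sketch, assuming standard SI values for G, $\hbar$, c and the GeV (none of which are quoted in this section), evaluates Eqs. (A.22)-(A.24) and checks that both dimensionless ratios are indeed unity at this scale.

```python
import math

G = 6.6743e-11           # Newton's constant in m^3 kg^-1 s^-2 (assumed SI value)
HBAR = 1.054571817e-34   # reduced Planck constant in J s (assumed SI value)
C = 2.99792458e8         # speed of light in m/s
GEV = 1.602176634e-10    # 1 GeV in J (assumed SI value)

planck_mass = math.sqrt(HBAR * C / G)      # Eq. (A.22)
planck_length = G * planck_mass / C ** 2   # Eq. (A.23)
planck_time = planck_length / C            # Eq. (A.24)

# Both ratios of the summary list equal unity at the Planck scale.
gravity_ratio = G * planck_mass / (C ** 2 * planck_length)
quantum_ratio = HBAR / (planck_mass * C * planck_length)

print(f"Planck mass   = {planck_mass:.3e} kg = {planck_mass * C ** 2 / GEV:.3e} GeV")
print(f"Planck length = {planck_length:.3e} m")
print(f"Planck time   = {planck_time:.3e} s")
print(f"G M / (c^2 R) = {gravity_ratio:.6f},  hbar / (M c R) = {quantum_ratio:.6f}")
```

The printed values should be roughly $2.2 \times 10^{-8}$ kg, $1.6 \times 10^{-35}$ m and $5.4 \times 10^{-44}$ s.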
Figure 1: Illustration of the Newtonian two-body problem.
According to Newton’s 3rd law of motion, for every action force, there is a reaction force equal
in magnitude and pointing in the opposite direction. In consequence, the second body reacts
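on the first with a force $\vec{F}_{2\,\mathrm{on}\,1}$ given by
\[ \vec{F}_{2\,\mathrm{on}\,1} = G m_{1p} m_{2a} \frac{\vec{r}_2 - \vec{r}_1}{|\vec{r}_2 - \vec{r}_1|^3} \overset{!}{=} -\vec{F}_{1\,\mathrm{on}\,2} = G m_{1a} m_{2p} \frac{\vec{r}_2 - \vec{r}_1}{|\vec{r}_2 - \vec{r}_1|^3} . \tag{A.26} \]
This equality holds for arbitrary position vectors $\vec{r}_1$, $\vec{r}_2$, so that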
How about the inertial mass then? This has been studied throughout a good part of history in
a variety of experiments. An incomplete list is as follows.
(1) ∼ 500 AD: Philoponus observes that two weights differing from each other by a wide
measure fall in times whose ratio differs much less than the ratio of their weights.
(2) ∼ 1590: Galileo studies balls rolling down a slope and measures that irrespective of the
balls’ weight, they require for this an amount of time equal to within about 2 %.
(3) ≈ 1686: Newton finds the oscillation period of pendulums of different matter types equal
to within ∼ $10^{-3}$.
(4) 1922: Eötvös uses a torsion balance with arms of different material to check for a torque
exerted by the sun’s gravity. He finds none to within ∼ $5 \times 10^{-9}$.
(5) 1964: Dicke et al. perform a refined version of Eötvös’ experiment and observe no torque
to within ∼ $10^{-11}$.
More experiments have been carried out since to search for signs of inequality between the
inertial and the gravitational mass, all compatible to within error bars with the universality of
free fall. If we denote the gravitational field by $\vec{g}$, a freely falling particle in this field follows
and the universality of motion implies that all objects have the same ratio $m_i/m_p$ which, again,
we set to unity without loss of generality. Note that gravity differs in this regard from all
other interactions: inertial mass is identical to the gravitational “charge” of a body, but has no
relation to its electric charge or the body’s coupling to the weak and strong nuclear forces.
Weak Equivalence Principle (WEP): Freely falling small bodies with negligible gravitational self
interaction follow the same path if they have the same initial velocity and position.
The WEP summarizes the observations reviewed in the previous subsection. You may wonder
at this stage why this version excludes gravitational self interaction. We will return to this
question shortly, but first introduce Einstein’s version which promotes the principle to a more
general status. For this purpose, we need the following definition.
Def. : A “local inertial frame” is a coordinate frame (t, x, y, z) defined by a freely falling
observer in the same way as an inertial frame is defined in Minkowski spacetime. In this
context, “local” is defined to mean small compared with the length scale of variations
in the gravitational field $\vec{g}$.
The word “local” marks the key difference from inertial frames in special relativity. This constraint
is necessary to avoid effects such as tidal forces. As illustrated in Fig. 2, tidal forces
in an oversized laboratory give rise to a relative acceleration of two falling particles relative to
each other. Local inertial frames are central to Einstein’s version of the equivalence principle.
Figure 2: If an observer’s frame is too large, inhomogeneities in the gravitational field lead to
relative acceleration of particles when viewed inside this frame. Both particles fall towards the
Earth’s center. In a large freely falling laboratory, the different horizontal components of $\vec{g}$
make the particles appear to accelerate towards each other without apparent cause.
Einstein equivalence principle (EEP): In a local inertial frame, the results of all non-gravitational
experiments are indistinguishable from those of the same experiment performed in an inertial frame
in Minkowski spacetime.
In the 1960s, Schiff [23] conjectured that the WEP implies the EEP. The idea is that matter
is composed of particles (quarks, electrons etc.), that the binding energy merely forms a
contribution to the particles’ masses and that the overall interactions in any experiment can
thus be reduced to point particle motion that obeys the WEP. Intriguing though this idea may
be, it remains an unproven conjecture. That leaves the strong equivalence principle which is
undoubtedly a stronger requirement than the WEP.
Strong equivalence principle (SEP): The gravitational motion of a small test body (that may
have gravitational self interaction) depends only on its initial velocity and position but not on its
constitution.
Figure 3: Two observers, Alice and Bob, are located at different heights in a uniform gravitational
field $\vec{g}$. Alice sends light to Bob that undergoes a change in frequency.
Let Alice and Bob be located at x = y = 0 and z = h and z = 0, respectively; cf. Fig. 3.
According to the EEP, we can describe this scenario using the laws of special relativity in a
freely falling frame, i.e. a frame accelerated with $\vec{g}$ relative to the rest frame with gravitational
field displayed in Fig. 3.
For simplification, we assume the velocity of both Alice and Bob to be much smaller than the
speed of light, $v \ll c$, so that we can ignore $(v/c)^2$ and higher order special relativistic terms.
The trajectories of Alice and Bob in the freely falling frame are then
\[ z_A(t) = h + \frac{1}{2} g t^2 , \qquad z_B(t) = \frac{1}{2} g t^2 , \qquad v_A = v_B = g t \overset{!}{\ll} c . \tag{A.32} \]
The calculation then proceeds as follows.
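\[ \Rightarrow\quad T_1 - t_1 = \frac{h}{c} \quad\text{to leading order.} \tag{A.38} \]
6. Using this expression in Eq. (A.37) for the redshift gives
\[ \Delta\tau_B \approx \left(1 - \frac{g h}{c^2}\right) \Delta\tau_A < \Delta\tau_A . \tag{A.39} \]
The signal appears blue shifted to Bob: in terms of the wavelength λ we have
\[ c\,\Delta\tau_B = \lambda_B \approx \left(1 - \frac{g h}{c^2}\right) \lambda_A . \tag{A.40} \]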
This prediction was verified to within about 10 % by Pound & Rebka [19] in 1959 at Harvard’s
Jefferson Laboratory. With a height difference of about 22.5 m, the quantity $g h/c \approx 7 \times 10^{-7}$ m/s $\ll c$
satisfies our simplifying assumption exquisitely. The fractional change in energy
of a photon (i.e. its frequency) was $O(10^{-15})$ in this experiment. Later similar experiments, all
compatible with the equivalence principle, refined the accuracy by several orders of magnitude.
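The orders of magnitude quoted for the Pound & Rebka setup follow from Eq. (A.39) with a height difference of 22.5 m. The sketch below assumes the standard surface gravity g = 9.81 m/s², which is not stated explicitly in the text.

```python
G_EARTH = 9.81   # gravitational acceleration at the Earth's surface in m/s^2 (assumed)
HEIGHT = 22.5    # height difference in m, as quoted in the text
C = 2.9979e8     # speed of light in m/s

velocity_scale = G_EARTH * HEIGHT / C          # g h / c, to be compared with c
fractional_shift = G_EARTH * HEIGHT / C ** 2   # g h / c^2, cf. Eq. (A.39)

print(f"g h / c   = {velocity_scale:.2e} m/s")
print(f"g h / c^2 = {fractional_shift:.2e}")
```

The second number is the fractional change in the photon frequency and is of the order $10^{-15}$, which illustrates how demanding the measurement was.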
Anticipating material that we will develop further down the road of this course, we can generalize
the result (A.39) to non-uniform gravitational fields. The invariant special
relativistic interval
\[ -c^2 \Delta\tau^2 = -c^2 \Delta t^2 + \Delta x^2 + \Delta y^2 + \Delta z^2 , \tag{A.41} \]
generalizes in the case of a weak and time independent Newtonian gravitational potential
φ(x, y, z) to
\[ c^2 d\tau^2 = \left(1 + \frac{2\phi(x, y, z)}{c^2}\right) c^2 dt^2 - \left(1 - \frac{2\phi(x, y, z)}{c^2}\right) (dx^2 + dy^2 + dz^2) , \qquad \frac{\phi}{c^2} \ll 1 . \tag{A.42} \]
In Sec. G.3, we will recover this expression as the spacetime metric of general relativity in
the Newtonian limit. Note that the interval is infinitesimally small in contrast to the special
relativistic (A.41). Let Alice and Bob now be located at fixed positions $\vec{x}_A$ and $\vec{x}_B$. We calculate
the redshift from the invariant (A.42) as follows.
1. Alice emits signals at $t_A$ and $t_A + \Delta t$. Let $t_B$ denote the time when Bob receives the first
signal. When does Bob receive the second?
2. Because the spacetime is static (φ does not depend on t), the two signals travel on identical
trajectories, merely shifted in time. Bob therefore receives the second signal at $t_B + \Delta t$.
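3. The time measured by Alice’s and Bob’s clocks, however, is given by the proper times τ at
their respective positions. These are
\[ \begin{aligned}
& \Delta\tau_A^2 = \left(1 + \frac{2\phi_A}{c^2}\right) \Delta t^2 , \qquad \Delta\tau_B^2 = \left(1 + \frac{2\phi_B}{c^2}\right) \Delta t^2 \\
\Rightarrow\quad & \Delta\tau_A \approx \left(1 + \frac{\phi_A}{c^2}\right) \Delta t , \qquad \Delta\tau_B \approx \left(1 + \frac{\phi_B}{c^2}\right) \Delta t \\
\Rightarrow\quad & \Delta\tau_B \approx \left(1 + \frac{\phi_B}{c^2}\right) \left(1 + \frac{\phi_A}{c^2}\right)^{-1} \Delta\tau_A \\
\Rightarrow\quad & \Delta\tau_B \approx \left(1 + \frac{\phi_B - \phi_A}{c^2}\right) \Delta\tau_A .
\end{aligned} \tag{A.43} \]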
The redshift depends only on the potential difference between the point of emission and the
point of absorption.
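As a quick consistency check of (A.43), the following sketch compares the exact ratio of proper times for static observers in the weak-field interval (A.42) with the linearized result, in units where c = 1. The potential values are arbitrary small numbers chosen for illustration, and the uniform-field identification φ = g z (so that $\phi_A = g h$, $\phi_B = 0$) is the assumption used to connect back to Eq. (A.39).

```python
import math


def proper_time_ratio_exact(phi_a, phi_b):
    """Delta tau_B / Delta tau_A for static observers in the interval (A.42), with c = 1."""
    return math.sqrt((1.0 + 2.0 * phi_b) / (1.0 + 2.0 * phi_a))


def proper_time_ratio_linear(phi_a, phi_b):
    """Leading-order result (A.43): 1 + (phi_B - phi_A)."""
    return 1.0 + (phi_b - phi_a)


# Generic weak potentials (dimensionless, since c = 1).
phi_a, phi_b = 1.0e-6, 3.0e-6
exact = proper_time_ratio_exact(phi_a, phi_b)
linear = proper_time_ratio_linear(phi_a, phi_b)
print(f"exact = {exact:.15f}, linear = {linear:.15f}, difference = {exact - linear:.1e}")

# Uniform field phi = g z with Alice at z = h and Bob at z = 0 reproduces 1 - g h.
gh = 1.0e-6
print(f"uniform field: ratio = {proper_time_ratio_linear(gh, 0.0):.15f}")
```

The difference between the exact and the linearized ratio is of second order in the potentials, which is why it can be ignored for φ/c² ≪ 1.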
The equivalence principles played an important role in the development of general relativity.
If the response of a body’s motion to gravitational forces is independent of the properties of
the body, it suggests that the gravitational force is not a feature of the body but exclusively
of the spacetime in which it moves. To be more precise, gravity is a feature of the spacetime’s
geometry.
\[ v_i = (v_x, v_y, v_z) . \tag{A.44} \]
In view of things to come later when we discuss relativity, we will not equate the vector $\vec{v}$ with
its components. Our hesitation in this regard will become clearer further below. In contrast
to general relativity, we will also not distinguish between upstairs and downstairs indices, but
only use the latter. Again, the difference between the index positions will be clarified when we
discuss tensors in general relativity. In the example of a vector, we have one index, for example
for the components of a velocity. A quantity may have more indices, however. An example
would be the moment of inertia tensor which is matrix valued and has two indices. We will
encounter further examples as we move along.
The following rules will govern our index notation.
(1) Repeated indices in a product are summed over. For example
\[ A_{ij} v_j := \sum_{j=1}^{3} A_{ij} v_j . \tag{A.45} \]
Repeated indices appear exactly twice. More than two identical indices in one term do
not give a meaningful expression.
(2) Indices over which a summation is performed may be renamed as long as no conflict with
other indices arises. So,
\[ A_{ij} v_j = A_{ik} v_k , \tag{A.46} \]
really are the same. The j may not be replaced with an i in this case, however, since $A_{ii} v_i$
is not a well defined expression.
(3) In an equation, free (i.e. not repeated) indices must match on both sides and in added
terms. For example, $w_i + A_{ij} v_j = 0$ is a valid equation but $w_k = A_{ij} v_j$ is not.
(4) Coordinates can also be written in index form. We often use the letter x for this purpose.
For example, Cartesian coordinates can be written as $x_i = (x, y, z)$. We may also denote
spherical coordinates in this way, $x_i = (r, \theta, \varphi)$. Some expressions are valid in all coordinate
systems, others may only hold for specific coordinates. In the latter case, we will make
clear which coordinates we are using.
(5) The partial derivative with respect to the coordinate $x_i$ is sometimes denoted by $\partial_i := \partial/\partial x_i$.
Sometimes, we also use a comma for this purpose as for example in
\[ v_{i,j} := \partial_j v_i := \frac{\partial v_i}{\partial x_j} . \tag{A.47} \]
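The rules above translate almost literally into code: a repeated index becomes a sum over a loop variable, a free index labels the entries of the result. The following sketch uses plain Python lists; all function names and the sample field $v_i = (xy, yz, zx)$ are made up for this illustration, and Python's indices run from 0 to 2 rather than from 1 to 3.

```python
def contract_matrix_vector(A, v):
    """w_i = A_ij v_j: the repeated index j is summed over, i is the free index."""
    n = len(v)
    return [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]


def trace(A):
    """A_ii: one pair of repeated indices, summed over."""
    return sum(A[i][i] for i in range(len(A)))


def partial_derivative(f, x, j, h=1e-6):
    """Central-difference estimate of d_j f at the point x."""
    forward, backward = list(x), list(x)
    forward[j] += h
    backward[j] -= h
    return (f(forward) - f(backward)) / (2.0 * h)


A = [[1.0, 2.0, 0.0],
     [0.0, 1.0, 3.0],
     [4.0, 0.0, 1.0]]
v = [1.0, -1.0, 2.0]
w = contract_matrix_vector(A, v)   # renaming the summed index j -> k changes nothing


def v_field(x):
    """Sample vector field v_i(x) = (x*y, y*z, z*x)."""
    return [x[0] * x[1], x[1] * x[2], x[2] * x[0]]


point = [1.0, 2.0, 3.0]
# dv[i][j] approximates v_{i,j} = d_j v_i, cf. Eq. (A.47).
dv = [[partial_derivative(lambda p, i=i: v_field(p)[i], point, j) for j in range(3)]
      for i in range(3)]

print("A_ij v_j =", w)
print("A_ii     =", trace(A))
print("v_{i,j}  =", [[round(entry, 6) for entry in row] for row in dv])
```

Note that the result of the contraction carries exactly one index, the free index i, in agreement with rule (3).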
Let us start using the index notation in the already familiar case of the motion of a point mass.
Consider, for this purpose, Cartesian coordinates
\[ x_i = (x, y, z) , \tag{A.48} \]
and the time coordinate t. Let $\vec{g}(t, \vec{x})$ be the gravitational field and m the mass of a freely
falling particle. The equation of motion for the particle is then
where a dot denotes $\partial_t$ and we assumed equality of inertial and gravitational mass. In index
notation, this becomes
\[ \ddot{x}_i = g_i(x_j, t) , \tag{A.51} \]
where the j index on the right hand side merely denotes the coordinate labels. It is not a free
index in the sense of requiring an analog on the left hand side.
We can now introduce a non-inertial coordinate system $\tilde{x}_i$ by
We have already seen that in too large a laboratory, tidal effects will give rise to non-inertial
phenomena; cf. Fig. 2. We calculate the tidal forces by considering two particles located at $\vec{x}$
and $\vec{x} + \delta\vec{x}$. The two particles’ motion follows
\[ \frac{d^2}{dt^2} x_i = g_i(x_j, t) , \qquad \frac{d^2}{dt^2} (x_i + \delta x_i) = g_i(x_j + \delta x_j, t) \tag{A.54} \]
\[ \Rightarrow\quad \frac{d^2}{dt^2} \delta x_i = (\delta\vec{x} \cdot \vec{\nabla}) g_i + O(|\delta\vec{x}|^2) \tag{A.55} \]
\[ \Rightarrow\quad \frac{d^2}{dt^2} \delta x_i = \delta x_k \partial_k g_i + O(\delta x_j^2) , \tag{A.56} \]
where we introduced the gradient $\vec{\nabla}$ and dropped higher-order terms in the $\delta x_j$. We now define
the tidal tensor as (the minus sign is merely a convention)
\[ E_{ij} := -\partial_j g_i , \tag{A.57} \]
and write the tidal effect on the particle’s relative motion as the equation of geodesic deviation
(the name will become clear when we consider the general relativistic analog)
\[ \frac{d^2}{dt^2} \delta x_i + E_{ij} \delta x_j = 0 . \tag{A.58} \]
It follows that $E_{ij} = +\partial_j \partial_i \phi$ and, since partial derivatives commute, that the tidal tensor is
symmetric,
\[ E_{ji} = E_{ij} . \tag{A.60} \]
Generic matter distributions are described in terms of an energy density field $\rho(\vec{x}, t)$ and source
a gravitational field according to Poisson’s equation
\[ \vec{\nabla} \cdot \vec{g} = -4\pi G \rho \quad\Rightarrow\quad \vec{\nabla}^2 \phi = \partial_i \partial_i \phi = 4\pi G \rho . \tag{A.61} \]
which, as we shall see, bears considerable resemblance to the general relativistic version of the
field equations.
Finally, we note that the definition $E_{ij} = -\partial_j g_i$ implies
\[ \frac{d^2}{dt^2} \delta x_i + E_{ij} \delta x_j = 0 . \tag{A.65} \]
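The properties of the tidal tensor can be checked numerically for a concrete potential. The sketch below assumes the potential of a point mass, φ = −GM/r in units with GM = 1 (an example chosen here, not taken from the text), and estimates $E_{ij} = \partial_i \partial_j \phi$ with central finite differences. Outside the mass we have ρ = 0, so that Poisson's equation (A.61) demands a vanishing trace $E_{ii}$.

```python
import math


def phi(x, y, z, gm=1.0):
    """Newtonian potential of a point mass at the origin (units with G M = gm)."""
    return -gm / math.sqrt(x * x + y * y + z * z)


def tidal_tensor(position, h=1e-3):
    """E_ij = d_i d_j phi, estimated with central finite differences of step h."""
    E = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            def shifted(si, sj):
                p = list(position)
                p[i] += si * h
                p[j] += sj * h
                return phi(*p)
            E[i][j] = (shifted(1, 1) - shifted(1, -1)
                       - shifted(-1, 1) + shifted(-1, -1)) / (4.0 * h * h)
    return E


E = tidal_tensor((2.0, 0.0, 0.0))
trace_E = sum(E[i][i] for i in range(3))

for row in E:
    print(["{: .5f}".format(entry) for entry in row])
print(f"trace E_ii = {trace_E:.2e}")
```

At the point (2, 0, 0) the radial component $E_{xx}$ is negative and the two transverse components are positive. Inserted into the equation of geodesic deviation, this means that neighboring particles are stretched apart along the radial direction and squeezed together in the transverse directions.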
Figure 4: A point P with spherical coordinates (r, θ, φ) relative to the Cartesian axes x, y, z.
Coordinates will from now on be denoted with an upstairs index. Again, this choice will
be motivated below when we introduce differential geometry and tensors.
(2) We introduce Greek indices α, β, . . . which run from 0 to 3 and include $x^0 = t$ as the time
coordinate. We will keep the notation that middle Latin indices i, j, . . . run from 1 to 3
and will occasionally write $x^\alpha = (x^0, x^i)$ or $u^\beta = (u^0, u^j)$ etc.
We also introduce a metric as a generalization of Pythagoras’ theorem familiar from $\mathbb{R}^2$
or $\mathbb{R}^3$. In Euclidean geometry in $\mathbb{R}^3$, Pythagoras gives us the distance between two points
$x^i = (x, y, z)$ and $x^i + \Delta x^i = (x + \Delta x, y + \Delta y, z + \Delta z)$ as
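where
\[ \delta_{ij} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} , \tag{A.71} \]
is the Kronecker delta. In curvilinear coordinates, we can use the chain rule to obtain the separation
between neighboring points, but the result will in general only apply to infinitesimally close
points. Let us consider this for spherical coordinates (r, θ, φ), defined through (see Fig. 4)
\[ \begin{aligned}
x &= r \sin\theta \cos\phi , \\
y &= r \sin\theta \sin\phi , \\
z &= r \cos\theta .
\end{aligned} \tag{A.72} \]
Using
\[ \begin{aligned}
dx &= \frac{\partial x}{\partial r} dr + \frac{\partial x}{\partial \theta} d\theta + \frac{\partial x}{\partial \phi} d\phi , \\
dy &= \frac{\partial y}{\partial r} dr + \frac{\partial y}{\partial \theta} d\theta + \frac{\partial y}{\partial \phi} d\phi , \\
dz &= \frac{\partial z}{\partial r} dr + \frac{\partial z}{\partial \theta} d\theta + \frac{\partial z}{\partial \phi} d\phi ,
\end{aligned} \tag{A.73} \]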
we obtain
\[ \begin{aligned}
ds^2 &= dx^2 + dy^2 + dz^2 \\
&= dr^2 + r^2 d\theta^2 + r^2 \sin^2\theta \, d\phi^2 .
\end{aligned} \tag{A.74} \]
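The role of the infinitesimal limit in (A.74) can be made explicit with a small numerical experiment. The sketch below compares the Cartesian distance between two points with the spherical expression for coordinate differences of decreasing size; the base point and the direction of the displacement are arbitrary choices made for this illustration.

```python
import math


def to_cartesian(r, theta, phi):
    """Spherical to Cartesian coordinates, Eq. (A.72)."""
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))


def relative_mismatch(eps, r=2.0, theta=0.7, phi=1.1):
    """Relative difference between the two lines of (A.74) for a displacement of size eps."""
    dr, dtheta, dphi = 0.3 * eps, 0.5 * eps, -0.4 * eps
    p = to_cartesian(r, theta, phi)
    q = to_cartesian(r + dr, theta + dtheta, phi + dphi)
    ds2_cartesian = sum((b - a) ** 2 for a, b in zip(p, q))
    ds2_spherical = dr ** 2 + r ** 2 * dtheta ** 2 + (r * math.sin(theta) * dphi) ** 2
    return abs(ds2_cartesian - ds2_spherical) / ds2_cartesian


for eps in (1e-1, 1e-2, 1e-3, 1e-4):
    print(f"displacement scale {eps:.0e}: relative mismatch {relative_mismatch(eps):.2e}")
```

The mismatch decreases in proportion to the size of the displacement, so the two expressions agree only in the limit of infinitesimally close points.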
The second equality, however, only holds in the limit of infinitesimally small separation. In fact,
this is the general case; the only situation where we are allowed to apply the distance calculation
to finite separations $\Delta x^i$ is that of flat, Euclidean geometry in Cartesian coordinates. Again,
it is customary to write the second equality of (A.74) in index notation as
Of course, this relation remains valid in the limit of infinitesimally close points; we then merely
replace all ’∆’ with ’d’.
According to the theory of special relativity, however, no inertial frame is preferred over another.
If we denote by $\tilde{x}^{\tilde\alpha}$ the coordinate system of another inertial frame, Eq. (A.77) also holds in
this frame, i.e.
\[ \Delta s^2 = -\Delta\tilde{t}^2 + \Delta\tilde{x}^2 + \Delta\tilde{y}^2 + \Delta\tilde{z}^2 . \tag{A.78} \]
Note that this implies, in particular, that ∆s = 0 for events connected by a light ray and all
inertial observers will therefore agree on the value of the speed of light (unity in our coordinates).
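The invariance of the interval can be illustrated with the standard Lorentz boost along the x direction, which we assume here from special relativity in units with c = 1 (the general form of the transformation is derived further below). The function names are illustrative only.

```python
import math


def boost_x(dt, dx, v):
    """Standard boost with velocity v along x (c = 1), applied to coordinate differences."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return gamma * (dt - v * dx), gamma * (dx - v * dt)


def interval(dt, dx, dy=0.0, dz=0.0):
    """Delta s^2 = -dt^2 + dx^2 + dy^2 + dz^2."""
    return -dt ** 2 + dx ** 2 + dy ** 2 + dz ** 2


v = 0.6
for dt, dx in [(2.0, 0.5), (1.0, 1.0), (0.5, 3.0)]:
    dt_new, dx_new = boost_x(dt, dx, v)
    print(f"(dt, dx) = ({dt}, {dx}):  ds^2 = {interval(dt, dx): .4f}  ->  {interval(dt_new, dx_new): .4f}")
```

The second pair of events is connected by a light ray: its interval vanishes in both frames, so both observers measure the same speed of light.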
Switching again to index notation, we can write Eqs. (A.77), (A.78) as
Here Greek indices with a tilde also run from 0 to 3; the tilde has merely been introduced to
mark that this index is related to the new coordinate system x̃α̃ . Normally, we will not introduce
the tilde on the index letters, since the tilde on the x already signifies different coordinates. We
mark the index as well here because it will help us below to distinguish between the Lorentz
transformation and its inverse. In Eq. (A.79), we have also introduced the Minkowski metric
whose components are
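\[
\eta_{\alpha\beta} = \eta_{\tilde\alpha\tilde\beta} =
\begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\quad\Leftrightarrow\quad
\eta^{\alpha\beta} = \eta^{\tilde\alpha\tilde\beta} =
\begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} ,
\tag{A.80}
\]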
where η αβ is defined as the inverse matrix of ηαβ and has exactly the same components in this
case. There now remains the task of identifying the coordinate transformations that ensure the
invariance of ∆s2 . Inertial frames move with constant velocity relative to each other, so that
their coordinates are related by linear transformations of the kind
x̃α̃ = Λα̃ µ xµ + xµ0 , (A.81)
where the Λα̃ µ = const. The translation given by the constant xµ0 has no impact on the following
calculations and we can set xµ0 = 0 without loss of generality. Equation (A.79) together with
the transformation (A.81) implies
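\[ \eta_{\tilde\alpha\tilde\beta}\,\Delta\tilde x^{\tilde\alpha} \Delta\tilde x^{\tilde\beta} = \eta_{\tilde\alpha\tilde\beta}\,\Lambda^{\tilde\alpha}{}_{\mu}\Delta x^\mu\,\Lambda^{\tilde\beta}{}_{\nu}\Delta x^\nu \overset{!}{=} \eta_{\mu\nu}\,\Delta x^\mu \Delta x^\nu . \tag{A.82} \]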
This condition holds for arbitrary ∆xµ , ∆x̃α̃ , so that we require
ηµν = Λα̃ µ Λβ̃ ν ηα̃β̃ , (A.83)
or, written as a matrix multiplication,
η = ΛT ηΛ , (A.84)
where now the “T” denotes the transpose of a matrix. The class of transformations satisfying
this condition are the Lorentz transformations
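\[
\Lambda^{\tilde\alpha}{}_{\mu} =
\begin{pmatrix} \gamma & -\gamma v_j \\ -\gamma v^i & \delta^i{}_j + (\gamma - 1)\dfrac{v^i v_j}{|\vec v|^2} \end{pmatrix}
\quad\Leftrightarrow\quad
\Lambda^{\mu}{}_{\tilde\alpha} =
\begin{pmatrix} \gamma & \gamma v_j \\ \gamma v^i & \delta^i{}_j + (\gamma - 1)\dfrac{v^i v_j}{|\vec v|^2} \end{pmatrix} ,
\tag{A.85}
\]
where the Kronecker delta $\delta^i{}_j$ with one index raised has the same components as $\delta_{ij}$ in
Eq. (A.71), $v^i$ is the velocity (see Fig. 5) of the frame (x̃α̃ ) relative to the frame (xµ ),
$|\vec v|^2 := \delta_{ij} v^i v^j$ is the squared norm of this velocity, and $\gamma = 1/\sqrt{1 - |\vec v|^2}$ is the Lorentz boost factor. As one
would expect, the inverse transformation Λµ α̃ to get back from (x̃α ) to the original frame xµ is
given by merely inverting the sign of the velocity vector. One straightforwardly shows that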
Λα̃ µ Λµ β̃ = δ α̃ β̃ , Λµ α̃ Λα̃ ν = δ µ ν , (A.86)
where δ µ ν = diag(1, 1, 1, 1) is the four-dimensional Kronecker delta. In practice, one can often
choose the relative velocity v i to point in the direction of one coordinate axis. Choosing, for
instance, the x direction simplifies Eq. (A.85) to
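\[
\Lambda^{\tilde\alpha}{}_{\mu} =
\begin{pmatrix} \gamma & -\gamma v & 0 & 0 \\ -\gamma v & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\quad\Leftrightarrow\quad
\Lambda^{\mu}{}_{\tilde\alpha} =
\begin{pmatrix} \gamma & \gamma v & 0 & 0 \\ \gamma v & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} .
\tag{A.87}
\]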
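The defining condition (A.84) and the inverse relation (A.86) can be checked numerically for a boost in an arbitrary direction. The sketch below builds the matrix of Eq. (A.85) in plain Python; the helper names (`boost`, `matmul`, `max_abs_diff`) and the sample velocity are illustrative assumptions, and units with c = 1 are used as in the text.

```python
import math

ETA = [[-1.0, 0, 0, 0], [0, 1.0, 0, 0], [0, 0, 1.0, 0], [0, 0, 0, 1.0]]
IDENTITY = [[1.0 if a == b else 0.0 for b in range(4)] for a in range(4)]

def boost(v):
    """Boost matrix of Eq. (A.85) for a 3-velocity v with |v| < 1."""
    v2 = sum(vi * vi for vi in v)
    gamma = 1.0 / math.sqrt(1.0 - v2)
    lam = [[0.0] * 4 for _ in range(4)]
    lam[0][0] = gamma
    for i in range(3):
        lam[0][i + 1] = -gamma * v[i]
        lam[i + 1][0] = -gamma * v[i]
        for j in range(3):
            delta = 1.0 if i == j else 0.0
            extra = (gamma - 1.0) * v[i] * v[j] / v2 if v2 > 0 else 0.0
            lam[i + 1][j + 1] = delta + extra
    return lam

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(4) for j in range(4))

v = (0.3, -0.4, 0.5)                      # sample velocity, |v|^2 = 0.5
lam = boost(v)
lam_inv = boost(tuple(-vi for vi in v))   # inverse: flip the sign of the velocity

# Deviation from eta = Lambda^T eta Lambda, Eq. (A.84)
print(max_abs_diff(matmul(transpose(lam), matmul(ETA, lam)), ETA))
# Deviation from Lambda^{-1} Lambda = identity, Eq. (A.86)
print(max_abs_diff(matmul(lam_inv, lam), IDENTITY))
```

Both printed deviations should be at the level of floating-point rounding.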
Figure 5: An inertial frame (x̃α̃ ) moves with constant velocity v i relative to the frame (xµ ).
Def.: The interval between two spacetime events xα and xα + ∆xα is called
timelike :⇔ ηµν ∆xµ ∆xν < 0
Using the proper time, we can state the Clock postulate of special relativity:
Postulate: A clock moving on a world line xα (λ) , λ ∈ R, that is in every point timelike or null,
measures the proper time along this world line
\[ \tau := \int_{\lambda_1}^{\lambda_2} \sqrt{-\eta_{\mu\nu}\,\frac{dx^\mu}{d\lambda}\frac{dx^\nu}{d\lambda}}\;d\lambda . \tag{A.89} \]
The requirement that the curve be everywhere timelike or null implies that for all λ ∈ [λ1 , λ2 ], we
have $\eta_{\mu\nu}\,\frac{dx^\mu}{d\lambda}\frac{dx^\nu}{d\lambda} \le 0$. Note that the expression (A.89) is invariant under a reparameterization
λ → µ(λ) of the world line and that such a reparameterization does not alter the local timelike
or null character of the curve.
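To make Eq. (A.89) concrete, the following sketch integrates the proper time numerically along a sample world line: uniform circular motion with radius R and angular velocity ω, parameterized by coordinate time. The world line, the midpoint rule with finite differences, and the names `proper_time` and `circle` are illustrative assumptions. For this motion the speed is constant, v = Rω, so the result can be compared with the closed form T√(1 − v²).

```python
import math

def proper_time(worldline, lam1, lam2, steps=20000):
    """Approximate Eq. (A.89) with the midpoint rule; derivatives by central differences."""
    h = (lam2 - lam1) / steps
    tau = 0.0
    for k in range(steps):
        lam = lam1 + (k + 0.5) * h
        a = worldline(lam - 0.5 * h)
        b = worldline(lam + 0.5 * h)
        dt, dx, dy, dz = ((bi - ai) / h for ai, bi in zip(a, b))
        integrand = dt * dt - dx * dx - dy * dy - dz * dz   # -eta_{mu nu} x'^mu x'^nu
        tau += math.sqrt(max(integrand, 0.0)) * h
    return tau

R, OMEGA, T = 1.0, 0.6, 10.0   # sample values; speed v = R * OMEGA = 0.6 < 1

def circle(t):
    """World line x^alpha(t) = (t, R cos(wt), R sin(wt), 0)."""
    return (t, R * math.cos(OMEGA * t), R * math.sin(OMEGA * t), 0.0)

tau_numeric = proper_time(circle, 0.0, T)
tau_exact = T * math.sqrt(1.0 - (R * OMEGA) ** 2)
print(tau_numeric, tau_exact)
```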
It is often convenient to parameterize a timelike curve by the proper time, i.e. use λ = τ . From
ηµν uµ uν = −1 . (A.92)
By the chain rule, the four-velocity changes under a coordinate transformation (xµ ) → (x̃α̃ )
according to
ũα̃ = Λα̃ µ uµ . (A.93)
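Its norm is therefore manifestly invariant under Lorentz transformations,
\begin{align}
\eta_{\tilde\alpha\tilde\beta}\,\tilde u^{\tilde\alpha}\tilde u^{\tilde\beta}
&= \Lambda^{\mu}{}_{\tilde\alpha}\Lambda^{\nu}{}_{\tilde\beta}\,\eta_{\mu\nu}\;\Lambda^{\tilde\alpha}{}_{\rho}u^\rho\;\Lambda^{\tilde\beta}{}_{\sigma}u^\sigma \nonumber \\
&= \delta^\mu{}_\rho\,\delta^\nu{}_\sigma\,\eta_{\mu\nu}\,u^\rho u^\sigma \tag{A.94} \\
&= \eta_{\mu\nu}\,u^\mu u^\nu , \tag{A.95}
\end{align}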
where we used Eq. (A.82) for the transformation rule of the Minkowski metric and Eq. (A.86)
for the product of the Lorentz transformation matrix with its inverse. Note that we also used
the property of the Kronecker delta to replace indices according to
δ µ ρ uρ = uµ , (A.96)
which directly follows from the definition of δ µ ρ and will be frequently used in the remainder
of these notes.
A special class of curves are the geodesics. We will introduce geodesics in terms of a variational
principle. For this purpose, we use the action for timelike curves
\[ S[x^\alpha(\lambda)] = \int \underbrace{\sqrt{-\eta_{\alpha\beta}\,\frac{dx^\alpha}{d\lambda}\frac{dx^\beta}{d\lambda}}}_{=:\,L}\;d\lambda , \tag{A.97} \]
which we identify as the proper time along the curve xα (λ); cf. Eq. (A.89). Timelike geodesics
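are then defined as the curves that extremize this action. This is an Euler-Lagrange variational
problem and the solutions are obtained from the Euler-Lagrange equation
\[ \frac{d}{d\lambda}\frac{\partial L}{\partial \dot x^\mu} = \frac{\partial L}{\partial x^\mu} , \tag{A.98} \]
where $\dot x^\mu := dx^\mu/d\lambda$. With the Lagrangian L from Eq. (A.97) we obtain
\[ \frac{\partial L}{\partial x^\mu} = 0 , \qquad \frac{\partial L}{\partial \dot x^\mu} = \frac{1}{2\sqrt{-\eta_{\alpha\beta}\,\dot x^\alpha \dot x^\beta}}\left( -\eta_{\alpha\mu}\,\dot x^\alpha - \eta_{\mu\beta}\,\dot x^\beta \right) . \tag{A.99} \]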
The definition of L in Eq. (A.97) implies L = dτ /dλ, so that
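\[
\frac{d}{d\lambda}\frac{\partial L}{\partial \dot x^\mu}
= \frac{d}{d\lambda}\left( -\eta_{\mu\beta}\,\frac{dx^\beta}{d\lambda}\frac{d\lambda}{d\tau} \right)
= -\eta_{\mu\beta}\,\frac{d}{d\lambda}\frac{dx^\beta}{d\tau}
\qquad \bigg| \times \left( -\frac{\eta^{\mu\alpha}}{L} \right)
\]
\[ \Rightarrow\quad \frac{d^2 x^\alpha}{d\tau^2} = 0 . \tag{A.100} \]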
The same equation can be derived for spacelike and null geodesics; cf. Sec. B.3 below. With
this result, we can formulate the geodesic postulate of special relativity.
Postulate: Free massive (massless) particles in special relativity move on straight timelike (null)
curves,
\[ \frac{d^2 x^\alpha}{d\tau^2} = 0 . \tag{A.101} \]
Note that τ denotes the proper time only along timelike geodesics. For null geodesics it merely
parameterizes the curve.
Time dilation: Let O and Õ be two inertial observers using coordinates xµ and x̃α̃ , respec-
tively, in their rest frames and let Õ move with velocity v i relative to the frame O. Our goal is
to find the relation between the proper time measured along world lines at rest in the respective
frames.
We consider for this purpose a world line at rest in the frame Õ. The four-velocity tangential
to this world line in coordinates x̃α is
\[ \tilde u^{\tilde\alpha} = \left( \frac{d\tilde t}{d\tau} ,\, 0 ,\, 0 ,\, 0 \right) . \tag{A.102} \]
where the sign of dt̃ follows from assuming that both t̃ and τ are future oriented.
In the frame O, this world line is not at rest and the four velocity expressed in coordinates xµ
is
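\[ u^\mu = \left( \frac{dt}{d\tau} ,\, \frac{dx^i}{d\tau} \right) \overset{!}{=} \Lambda^{\mu}{}_{\tilde\alpha}\,\tilde u^{\tilde\alpha} = \left( \gamma\,\frac{d\tilde t}{d\tau} ,\, \gamma v^i\,\frac{d\tilde t}{d\tau} \right) . \tag{A.104} \]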
Let us first consider the time component of this equation. We find
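\[ \frac{dt}{d\tau} = \gamma\,\frac{d\tilde t}{d\tau} \quad\Rightarrow\quad \frac{dt}{d\tilde t} = \gamma \quad\Rightarrow\quad dt = \gamma\,d\tilde t . \tag{A.105} \]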
With the result (A.103) and the definition of γ, we can write this result as
\[ dt = \frac{d\tilde t}{\sqrt{1 - |\vec v|^2}} = \frac{d\tau}{\sqrt{1 - |\vec v|^2}} . \tag{A.106} \]
So while the moving observer ages by an amount dτ , observer O sees a larger amount of time
$dt = d\tau/\sqrt{1 - |\vec v|^2}$ elapse in his/her own frame. The moving observer Õ ages more slowly than
his twin O remaining at rest. The argument is entirely symmetric: as viewed from the rest
frame of Õ, the aging of O is slower. This is not a paradox, since the two observers cannot
return to one another to compare their two clocks without undergoing acceleration at some
point. This accelerated phase of their motion requires additional calculation which resolves the
seeming paradox. The interested reader is referred to Sec. 1.13 of Schutz [24].
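It is instructive to also consider the spatial components of Eq. (A.104), which give us
\[ \frac{dx^i}{d\tau} = \gamma v^i\,\frac{d\tilde t}{d\tau} \quad\Rightarrow\quad \frac{dx^i}{dt} = v^i , \tag{A.107} \]
so that the velocity v i denotes the coordinate velocity of frame Õ as seen in frame O.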
Lorentz contraction: We have defined the measure of time by clocks but still need the
proper size of an object. We define this concept through the length of a rod, which generalizes
obviously to the extent of an object in more than one direction.
Def.: The length in a reference frame O of a rod is defined as the proper distance ∆s between two
events A and B, where xiA is the position of the rod’s tail at a specified time tA = t0 and
xiB is the position of the rod’s head at the same time tB = t0 . Denoting xiB − xiA = ∆xi ,
the length is given by
\[ \Delta s = \sqrt{\eta_{\alpha\beta}\,\Delta x^\alpha \Delta x^\beta} = \sqrt{\delta_{ij}\,\Delta x^i \Delta x^j} . \tag{A.108} \]
Note that the length of the rod is by this definition frame dependent. We could define a
preferred measure for the rod’s length by applying the above definition in a special frame,
e.g. the frame comoving with the rod.
Let us now consider an observer O who is comoving with the rod and therefore measures its
length $\ell$ as given by (A.108). A second observer Õ is moving with velocity v i relative to the
rod. What length $\tilde\ell$ does this observer measure? Of course, both observers will agree on
the proper distance between the two events we called A and B in the above definition; ∆s2 is
Lorentz invariant. What they will not agree upon is whether these two events are simultaneous.
We start by considering the world lines xµ of the tail and y µ of the head of the rod in the system
O. They are
xµ = (ttail , xi0 ) , y µ = (thead , xi0 + ∆xi ) , (A.109)
where xi0 , ∆xi = const and ttail and thead are coordinate time which we use as parameters along
the respective world lines. Observer O will pick two simultaneous events by setting ttail = thead
and evaluate the length of the rod as
In the frame of the moving observer Õ, the world lines of the rod’s head and tail are given by
Note that here, x̃i and ỹ i are not constant; the rod is moving in this frame. In order to measure
the length of the rod, observer Õ will choose two events à and B̃, one respectively on the tail’s
and the head’s world line, that are simultaneous in her/his frame. This means setting
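\[ \tilde t_{\rm tail} = \tilde t_{\rm head} \quad\Rightarrow\quad t_{\rm tail} = t_{\rm head} + \frac{\Lambda^{\tilde 0}{}_{i}\,\Delta x^i}{\Lambda^{\tilde 0}{}_{0}} = t_{\rm head} - v_i\,\Delta x^i . \tag{A.112} \]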
We see here explicitly how the mixing of time and spatial components in the Lorentz transfor-
mation matrix alters the meaning of simultaneity from one observer to another.
All that is left to do is to evaluate the proper distance between the two events à and B̃ that
observer Õ sees as simultaneously representing tail and head, respectively, of the rod. This
proper separation will be independent of which frame, O or Õ, we choose to evaluate it in. We
choose the former frame O because it makes the comparison with the rod’s length in its own
rest frame easier. In the frame O, the coordinates of the two events are
The length is positive by definition, so that in both Eqs. (A.110) and (A.114), we take the
positive square root. Without loss of generality, we can orient our coordinates so that the rod
is aligned with, say, the x coordinate axis. Then we have
\[ \ell = \Delta x , \qquad \tilde\ell = \sqrt{1 - v_x^2}\;\Delta x . \tag{A.115} \]
This is the famous Lorentz contraction: Relative to its length in the rest frame, the rod is
shorter by a factor $\sqrt{1 - v^2}$ as viewed by an observer moving relative to the rod with a velocity
component v parallel to the rod. Note that (i) the sign of the velocity component (moving
tail-to-head or the other way round) does not affect the result, and (ii) motion perpendicular
to the rod does not contribute to the Lorentz contraction.
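The construction above can be traced numerically for a rod aligned with the x axis. The sketch below picks two events on the tail and head world lines that are simultaneous in the moving frame, and then evaluates their separation both in the moving frame and as a proper distance in the rest frame O. The function names and sample numbers are illustrative assumptions; units with c = 1 are used.

```python
import math

def boost_x(v, event):
    """Boost of Eq. (A.87) along x, applied to an event (t, x)."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    t, x = event
    return (gamma * (t - v * x), gamma * (x - v * t))

def contracted_length(v, x0, dx, t_head=0.0):
    """Rod at rest in O (tail at x0, head at x0 + dx) as measured by an observer moving with v."""
    # Simultaneity in the moving frame:
    #   gamma * (t_tail - v * x0) = gamma * (t_head - v * (x0 + dx))
    t_tail = t_head - v * dx
    tt_tail, xt_tail = boost_x(v, (t_tail, x0))
    tt_head, xt_head = boost_x(v, (t_head, x0 + dx))
    # Proper distance between the same two events, evaluated in frame O
    proper = math.sqrt(dx ** 2 - (t_head - t_tail) ** 2)
    return xt_head - xt_tail, proper, tt_head - tt_tail

length, proper, dt_tilde = contracted_length(v=0.6, x0=1.0, dx=2.0)
print(length, proper, dt_tilde)   # compare with sqrt(1 - v^2) * dx
```

Both length values should agree with √(1 − v²) ∆x, and the time difference in the moving frame should vanish up to rounding; reversing the sign of v should leave the length unchanged.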
pα = muα . (A.116)
Because the four-velocity has norm ηµν uµ uν = −1, cf. Eq. (A.92), we immediately obtain the frame invariant
relation
ηµν pµ pν = −m2 . (A.117)
Let us again consider two inertial observers O and Õ, where Õ is moving with velocity v i in
the frame O. A particle at rest in the frame Õ has a four momentum with components in this
frame given by
p̃α̃ = (m, 0, 0, 0) . (A.118)
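Relative to the frame O, the particle moves with velocity v i , and the four momentum
components in this frame are obtained from the Lorentz transformation (A.85),
\[ p^\mu = \Lambda^{\mu}{}_{\tilde\alpha}\,\tilde p^{\tilde\alpha} = \left( \gamma m ,\; \gamma m v^i \right) . \tag{A.119} \]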
Here, γm is the total relativistic mass-energy and γmv i is the linear momentum of the particle
as measured in the frame O. The components of the four momentum can therefore be written
as
pµ = (E , pi ) . (A.120)
From the norm of the four momentum, we obtain the special relativistic energy formula
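\[
\begin{aligned}
\eta_{\mu\nu}\,p^\mu p^\nu = -E^2 + |\vec p|^2 &\overset{!}{=} -m^2 \\
\Rightarrow\quad E^2 &= m^2 + |\vec p|^2 \\
\Rightarrow\quad E^2 &= m^2 c^4 + |\vec p|^2 c^2 ,
\end{aligned}
\tag{A.121}
\]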
where in the last line we restored factors of c by using dimensional arguments.
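As a quick consistency check of Eqs. (A.117) and (A.121), the sketch below assembles the components (γm, γmv^i) quoted above for a sample mass and velocity and evaluates the Minkowski norm. The names `four_momentum` and `minkowski_norm` and the numbers are illustrative assumptions; units with c = 1 are used.

```python
import math

def four_momentum(m, v):
    """Components (E, p^i) = (gamma m, gamma m v^i) for rest mass m and 3-velocity v."""
    gamma = 1.0 / math.sqrt(1.0 - sum(vi * vi for vi in v))
    return (gamma * m,) + tuple(gamma * m * vi for vi in v)

def minkowski_norm(p):
    """eta_{mu nu} p^mu p^nu with signature (-, +, +, +)."""
    return -p[0] ** 2 + sum(pi ** 2 for pi in p[1:])

m = 2.0
p = four_momentum(m, (0.3, 0.4, 0.0))
E, p_squared = p[0], sum(pi ** 2 for pi in p[1:])
print(minkowski_norm(p), -m ** 2)       # frame invariant norm, Eq. (A.117)
print(E ** 2, m ** 2 + p_squared)       # energy formula, Eq. (A.121) with c = 1
```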
According to the geodesic postulate, free massless particles move along null geodesics. For null
curves, we cannot define the four velocity, since proper time vanishes along these curves. The
curves still have tangent vectors, but they all have zero magnitude, so that we cannot define a
tangent vector of unit length. The four momentum, however, is not a vector of unit length. For
massless particles, it satisfies ηµν pµ pν = 0 and therefore is indeed a null vector. The components
are obtained from Eq. (A.120), recalling that the energy of a massless particle, e.g. a photon, is
E = hν and the momentum p = h/λ, where ν and λ are frequency and wavelength, related by
c = λν. Setting the speed of light c = 1, we can thus write the four momentum of a massless
particle as
pα = hν(1, ni ) , (A.122)
where ni is a unit vector.
The redshift can be calculated directly from the Lorentz transformation. Let us consider our
usual frames O and Õ, the latter moving with v i relative to the former. Without loss of
generality we orient the frame O such that the photon momentum points in the +x direction.
The four momentum of the photon in this frame can then be written as
pα = (E, E, 0, 0) . (A.123)
Next we assume that observer Õ is moving with velocity ~v = (v, 0, 0) relative to O. The
four-momenta p̃α̃ and pα of the particle in the two frames are then related by a Lorentz trans-
formation according to
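\[ \tilde p^{\tilde\alpha} = \Lambda^{\tilde\alpha}{}_{\mu}\,p^\mu = \left( \gamma E - \gamma v E ,\; -\gamma v E + \gamma E ,\; 0 ,\; 0 \right) =: \left( \tilde E ,\, \tilde E ,\, 0 ,\, 0 \right) . \tag{A.124} \]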
The redshift is obtained from the ratio Ẽ/E,
\[ \frac{\tilde\nu}{\nu} = \frac{\tilde E}{E} = \gamma - \gamma v = \frac{1 - v}{\sqrt{1 - v^2}} = \sqrt{\frac{1 - v}{1 + v}} = 1 - v + \mathcal{O}(v^2) . \tag{A.125} \]
As expected, the photon is redshifted if the frame Õ moves in the same direction, i.e. “tries
to run away from the photon”, but is blue-shifted if v x < 0, i.e. Õ moves towards the photon.
There is also a so-called transverse Doppler effect arising from velocity components of observer
Õ in the y or z directions (i.e. transverse to the propagation of the photon). The calculation of
this transverse effect proceeds along similar lines, but requires some care: The general Lorentz
transformation would mix x components with y or z components, so that we would first have
to decide whether the photon propagation proceeds in the x direction in the frame O or in
the frame Õ. These two cases represent different physical scenarios and would lead to different
redshift factors.
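For the longitudinal case, the chain of equalities in Eq. (A.125) can be verified numerically. The sketch below evaluates the time component of the boosted photon momentum from Eq. (A.124) and compares the ratio Ẽ/E with the closed-form factor; the function names and sample velocities are illustrative assumptions, with c = 1.

```python
import math

def boosted_photon_energy(E, v):
    """Time component of Eq. (A.124): photon along +x, observer velocity v along x."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return gamma * E - gamma * v * E

def doppler_factor(v):
    """Closed form sqrt((1 - v) / (1 + v)) from Eq. (A.125)."""
    return math.sqrt((1.0 - v) / (1.0 + v))

E = 1.0
for v in (0.6, -0.6, 0.01):
    ratio = boosted_photon_energy(E, v) / E
    print(f"v = {v:+.2f}: E~/E = {ratio:.6f}, closed form = {doppler_factor(v):.6f}, 1 - v = {1 - v:.6f}")
```

A positive v (observer receding from the source) should give a ratio below one, a negative v a ratio above one, and for small |v| the ratio should be close to 1 − v.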
B Differential geometry
Differential geometry is the mathematical formulation of the properties of curved manifolds,
i.e. the extension of flat, Euclidean geometry. Some of the observations we have made so far
suggest that the generalization of special relativity to encapsulate gravitation will follow a
path similar to that from Euclidean to curved geometry. A full discussion of differential geometry
is beyond the scope of these lectures. On the other hand the geometric view of Einstein’s gen-
eral relativity is constructive for the understanding of the theory. We will therefore pursue a
middle path in these notes; while not dealing with all aspects in full mathematical rigor, we will
introduce the main concepts as necessary to form a geometrical picture of the theory. Readers
who wish to delve deeper into the topic are referred to DAMTP’s Part III course on general
relativity, the corresponding lecture notes [36] and the books by Stewart [28], Hawking & Ellis
[13] and, especially for an intuitive pictorial introduction, Misner, Thorne & Wheeler [17].
From now on, we will extensively use Einstein’s summation convention in the same way as
introduced in Sec. A.3. We only make two additional remarks.
(1) In the literature, you will sometimes find upstairs indices referred to as contravariant and
downstairs indices as covariant. We will not use this terminology, but it is good to bear
these names in mind.
(2) An upstairs index appearing in the denominator of an expression counts as a downstairs
index. Likewise a downstairs index appearing in a denominator counts as an upstairs index.
Typically, we encounter this phenomenon when we take partial derivatives with respect to
a coordinate. We therefore use the notation
\[ \partial_\mu = \frac{\partial}{\partial x^\mu} , \tag{B.1} \]
which makes it manifest that the index really is downstairs.
Def.: An n dimensional manifold M is a set of points that locally resembles Euclidean space
Rn at each point. For our purposes, this means that there exists a one-to-one and onto
map
\[ \phi : \mathcal{M} \to U \subset \mathbb{R}^n , \qquad p \in \mathcal{M} \mapsto x^\alpha \in U \subset \mathbb{R}^n , \qquad \alpha = 0, \ldots, n-1 , \tag{B.2} \]
a coordinate map for each of them. Wherever the subsets of M overlap, we then have
multiple coordinate charts and require that these are smoothly related to each other. In
most practical applications, this subtlety is not required and one instead works with one
or more coordinate systems covering the entire manifold. We will therefore assume in the
rest of this work that we do not need to subdivide the manifold. The results we will obtain
are valid either for a global chart or for a collection of local coordinate charts.
• As we have already seen in the discussion of special relativity, there does not exist one
unique coordinate system, but an infinite number of different coordinate systems. The
coordinates serve us in labeling points and in translating operations on the manifold into
operations in the Rn , where we are already familiar with, for example, taking derivatives.
As we will discuss in more detail further below, the objects in the manifold remain invariant
under the choice of coordinates. A convenient way to think about coordinates is the use
of house numbers in a street. They are convenient, but a relabeling of houses does not
affect the physical structure of the houses or the street.
• The operations (e.g. taking derivatives) and objects (e.g. functions) that we will be dealing
with, really all live in the manifold M, not in the coordinate space U . Because the mapping
φ : M → U is one-to-one, however, this distinction is often blurred and we will not always
rigorously distinguish between operating on the manifold or in coordinate space.
Figure 7: Illustration of defining a vector as the derivative operator along a curve. Tp (M) is
the space of all vectors at point p.
f : M → R. (B.3)
The function is smooth iff for any coordinate system xα on the manifold, f (xα ) is a smooth
function from Rn to R. If a function is invariant under a change of coordinates, it is also
called a scalar.
B.1.2 Vectors
Def.: Let C ∞ be the space of all smooth functions f : M → R, λ be a smooth curve and
p ≡ λ(0) ∈ M. The tangent vector to the curve λ at p ∈ M is the map
\[ V : C^\infty \to \mathbb{R} , \qquad f \mapsto V(f) = \left. \frac{d}{dt}\, f\big(\lambda(t)\big) \right|_{t=0} . \tag{B.5} \]
A vector is thus defined as the directional derivative operator along a curve at a specific point
of that curve; for an illustration see Fig. 7. Note that vectors inherit the following properties
from derivative operators.
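We next consider the choice of a convenient basis of the vector space Tp (M). Let xα be a
coordinate system on the manifold M. Using the chain rule, we can write
\[ V(f) = \frac{d}{dt}\, f\big(x^\mu(\lambda(t))\big) = \underbrace{\left. \frac{dx^\mu}{dt} \right|_{\lambda}}_{\text{vector components}} \; \underbrace{\frac{\partial}{\partial x^\mu}}_{\text{basis vectors}} f(x^\alpha) . \tag{B.8} \]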
It can indeed be shown that Tp (M) is a vector space of dimension n and that the n partial
derivative operators ∂µ = ∂/∂xµ define a basis of this vector space. We denote the basis vectors
by either of
\[ e_\mu = \partial_\mu = \frac{\partial}{\partial x^\mu} . \tag{B.9} \]
The components of the vector V are then
\[ V^\mu = \left. \frac{dx^\mu}{dt} \right|_{\lambda} = \frac{dx^\mu}{dt} , \tag{B.10} \]
where we often drop the explicit reference to the curve λ. We can then expand the vector in
terms of the basis according to any of the following combinations,
\[ V = V^\mu e_\mu = V^\mu \partial_\mu = \frac{dx^\mu}{dt}\frac{\partial}{\partial x^\mu} = \frac{d}{dt} . \tag{B.11} \]
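Note that the vector components V µ and the basis vectors ∂/∂xµ both change when we transform
from one coordinate system (xµ ) to another (x̃α ). More specifically, they change according
to the chain rule,
\begin{align}
e_\mu = \frac{\partial}{\partial x^\mu} \quad&\to\quad \tilde e_\alpha = \frac{\partial}{\partial \tilde x^\alpha} = \frac{\partial x^\mu}{\partial \tilde x^\alpha}\frac{\partial}{\partial x^\mu} = \frac{\partial x^\mu}{\partial \tilde x^\alpha}\, e_\mu , \tag{B.12} \\
V^\mu = \frac{dx^\mu}{dt} \quad&\to\quad \tilde V^\alpha = \frac{d\tilde x^\alpha}{dt} = \frac{\partial \tilde x^\alpha}{\partial x^\nu}\frac{dx^\nu}{dt} = \frac{\partial \tilde x^\alpha}{\partial x^\nu}\, V^\nu . \tag{B.13}
\end{align}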
While the components of the vector change under a coordinate transformation according to
\[ \tilde V^\mu = \frac{\partial \tilde x^\mu}{\partial x^\alpha}\, V^\alpha , \tag{B.14} \]
The space of all covectors at a point p ∈ M is called the cotangent space Tp∗ (M) and can
be shown to be an n dimensional vector space, just like Tp (M). If eµ is a basis for the
tangent space Tp (M), we define the components of a covector η as
⇒ η(V ) = V µ ηµ . (B.20)
(iii) Transformation rule: The coordinate invariance of η(V ) determines the behaviour of the
components ηµ under a change of coordinates. Let us transform from xµ to new coordinates
x̃α . We already know the transformation rule (B.13) for the components of a vector, so
that for any V ∈ Tp (M)
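\begin{align}
& \eta(V) = \eta_\mu V^\mu \overset{!}{=} \tilde\eta_\alpha \tilde V^\alpha = \tilde\eta_\alpha\,\frac{\partial \tilde x^\alpha}{\partial x^\mu}\, V^\mu \nonumber \\
\Rightarrow\quad & \eta_\mu = \tilde\eta_\alpha\,\frac{\partial \tilde x^\alpha}{\partial x^\mu} \qquad \bigg| \cdot \frac{\partial x^\mu}{\partial \tilde x^\beta} \tag{B.21} \\
\Rightarrow\quad & \tilde\eta_\beta = \eta_\mu\,\frac{\partial x^\mu}{\partial \tilde x^\beta} . \tag{B.22}
\end{align}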
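The transformation rules (B.13) and (B.22) can be illustrated with the change from Cartesian coordinates (x, y) to polar coordinates (r, φ) in the plane. The sketch below transforms sample components of a vector and a covector and compares the contraction η(V) in both coordinate systems; the sample point, the component values and all function names are illustrative assumptions.

```python
import math

def jacobians(x, y):
    """Jacobians between Cartesian (x, y) and polar (r, phi) coordinates at the point (x, y)."""
    r2 = x * x + y * y
    r = math.sqrt(r2)
    forward = [[x / r, y / r], [-y / r2, x / r2]]   # forward[a][m] = d x~^a / d x^m
    inverse = [[x / r, -y], [y / r, x]]             # inverse[m][a] = d x^m / d x~^a
    return forward, inverse

def transform_vector(V, forward):
    """V~^a = (d x~^a / d x^m) V^m, Eq. (B.13)."""
    return [sum(forward[a][m] * V[m] for m in range(2)) for a in range(2)]

def transform_covector(eta, inverse):
    """eta~_b = eta_m (d x^m / d x~^b), Eq. (B.22)."""
    return [sum(eta[m] * inverse[m][b] for m in range(2)) for b in range(2)]

def contract(eta, V):
    """eta(V) = eta_m V^m."""
    return sum(e * v for e, v in zip(eta, V))

forward, inverse = jacobians(1.2, -0.7)
V, eta = [0.5, 2.0], [-1.5, 0.25]
V_polar = transform_vector(V, forward)
eta_polar = transform_covector(eta, inverse)
print(contract(eta, V), contract(eta_polar, V_polar))
```

The two printed numbers should agree up to rounding, which reflects the coordinate invariance of η(V).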
\[ df(V) = V(f) = \frac{df}{dt} . \tag{B.24} \]
In particular, we can regard the coordinates xα as functions on the manifold. Setting f = xα
for some fixed α ∈ {1, 2, . . . , n}, we obtain
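\[ dx^\alpha(e_\beta) = dx^\alpha\!\left( \frac{\partial}{\partial x^\beta} \right) = \frac{\partial x^\alpha}{\partial x^\beta} = \delta^\alpha{}_\beta . \tag{B.25} \]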
Recalling Eq. (B.17) for the components of a covector, we conclude the following relation for
any vector V ,
ηα dxα (V ) = ηα dxα (V β ∂β ) = ηα V β dxα (∂β ) = ηα V β δ α β = ηα V α = η(V ) , (B.26)
so that ηα dxα and η are the same one-form. The coordinate gradients dxα therefore form a
basis of the cotangent space Tp∗ (M), the dual basis of the vector basis ∂µ . We thus have the
basis expansion of a one-form η,
η = ηα dxα . (B.27)
B.1.4 Tensors
Now that we have defined vectors and covectors, we can define general tensors which include
the former two and also scalars as special cases.
Def.: A tensor T at p ∈ M of rank $\binom{r}{s}$, r, s ∈ N0 , is a multilinear map
Put bluntly, a tensor is a machine into which one plugs r one-forms and s vectors and
out pops a real number.
convenient way to obtain the components of a vector. From the basis expansion of a one-
form (B.27), we have
η(V ) = ηα dxα (V ) = ηα V α
by filling its slots with the respective basis one-forms and basis vectors:
T α1 ...αr β1 ...βs = T (dxα1 , . . . , dxαr , eβ1 , . . . , eβs ) . (B.31)
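3) We define the $\binom{1}{1}$ tensor δ through
\[ \delta : T_p^*(\mathcal{M}) \times T_p(\mathcal{M}) \to \mathbb{R} , \qquad (\eta, V) \mapsto \eta(V) \quad \forall\; \eta \in T_p^*(\mathcal{M}) ,\; V \in T_p(\mathcal{M}) . \tag{B.32} \]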
From Eq. (B.31), we obtain its components
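\[ \delta^\alpha{}_\beta = \delta(dx^\alpha, \partial_\beta) = dx^\alpha(\partial_\beta) = \frac{\partial x^\alpha}{\partial x^\beta} = \delta^\alpha{}_\beta , \tag{B.33} \]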
as the Kronecker delta.
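It can be shown that the tensors of rank $\binom{r}{s}$ form a vector space of dimension $n^{r+s}$. The
transformation properties of the components of a tensor are determined by requiring that the
number obtained by filling all its slots with one-forms and vectors is a scalar, i.e. invariant
under coordinate transformations. A straightforward calculation shows that transforming from
coordinates $x^\mu$ to $\tilde x^\alpha$ changes the components of a tensor of rank $\binom{r}{s}$ according to
\[ \tilde T^{\alpha_1 \ldots \alpha_r}{}_{\beta_1 \ldots \beta_s} = \frac{\partial \tilde x^{\alpha_1}}{\partial x^{\mu_1}} \cdots \frac{\partial \tilde x^{\alpha_r}}{\partial x^{\mu_r}}\; \frac{\partial x^{\nu_1}}{\partial \tilde x^{\beta_1}} \cdots \frac{\partial x^{\nu_s}}{\partial \tilde x^{\beta_s}}\; T^{\mu_1 \ldots \mu_r}{}_{\nu_1 \ldots \nu_s} . \tag{B.34} \]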
Note the simple rule underlying this lengthy expression: one factor ∂ x̃α /∂xµ for each upstairs
index of the tensor and one factor ∂xν /∂ x̃β for each downstairs index. The transformation rules
(B.14) and (B.22) are merely special cases of this rule for $\binom{1}{0}$ and $\binom{0}{1}$ tensors.
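\begin{align}
\text{its symmetric part} \quad & S_{\alpha\beta} := \frac{1}{2}\left( T_{\alpha\beta} + T_{\beta\alpha} \right) =: T_{(\alpha\beta)} , \tag{B.36} \\
\text{its anti-symmetric part} \quad & A_{\alpha\beta} := \frac{1}{2}\left( T_{\alpha\beta} - T_{\beta\alpha} \right) =: T_{[\alpha\beta]} . \tag{B.37}
\end{align}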
This operation can be applied over a subset of indices of tensors of higher rank, as for
example in
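\[ T^{(\alpha\beta)\gamma}{}_{\delta} := \frac{1}{2}\left( T^{\alpha\beta\gamma}{}_{\delta} + T^{\beta\alpha\gamma}{}_{\delta} \right) . \tag{B.38} \]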
For (anti-)symmetrizing over non-adjacent indices, we use the | symbol as a delimiter
between the indices we operate on and those we do not. For example,
\[ T_{(\alpha|\beta\gamma|\delta)} := \frac{1}{2}\left( T_{\alpha\beta\gamma\delta} + T_{\delta\beta\gamma\alpha} \right) . \tag{B.39} \]
We can also (anti-)symmetrize over more than two indices. This is done as follows.
• Sum over all permutations of the indices we (anti-)symmetrize over.
• For antisymmetrization, each of these terms is multiplied by the sign of its permuta-
tion.
• Divide by n! (n factorial), where n is the number of indices being (anti-)symmetrized over.
For example, this procedure gives us
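\[ T^{\alpha}{}_{[\beta\gamma\delta]} = \frac{1}{3!}\left( T^{\alpha}{}_{\beta\gamma\delta} + T^{\alpha}{}_{\gamma\delta\beta} + T^{\alpha}{}_{\delta\beta\gamma} - T^{\alpha}{}_{\beta\delta\gamma} - T^{\alpha}{}_{\delta\gamma\beta} - T^{\alpha}{}_{\gamma\beta\delta} \right) . \tag{B.40} \]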
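The three-step recipe for (anti-)symmetrizing over several indices translates directly into a short routine. The sketch below is one possible implementation in which a tensor is represented by a plain function of its indices; the names `permutation_sign`, `project` and the sample component function `T` are illustrative assumptions.

```python
from itertools import permutations
from math import factorial

def permutation_sign(perm):
    """Sign of a permutation of 0..n-1, obtained by counting inversions."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def project(component, indices, antisymmetric=True):
    """(Anti-)symmetrize component(*indices) over all listed index slots."""
    n = len(indices)
    total = 0
    for perm in permutations(range(n)):            # sum over all permutations
        sign = permutation_sign(perm) if antisymmetric else 1
        total += sign * component(*(indices[k] for k in perm))
    return total / factorial(n)                    # divide by n!

def T(b, c, d):
    """Sample components without any symmetry (arbitrary integer-valued choice)."""
    return (b + 1) * (c + 2) ** 2 * (d + 3) ** 3

print(project(T, (0, 1, 2)), project(T, (1, 0, 2)))   # antisymmetric: opposite signs
print(project(T, (1, 1, 2)))                          # repeated index: vanishes
print(project(T, (0, 1, 2), antisymmetric=False),
      project(T, (2, 0, 1), antisymmetric=False))     # symmetric: equal values
```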
the basis vector ∂α (with the same index α!). For example, let T be a $\binom{3}{2}$ tensor, ω and
Note that the derivatives ∂ x̃µ /∂xα and ∂xβ /∂ x̃µ are merely numbers and can therefore be
pulled out of the argument of T ; T is linear in its vector and covector arguments!
The components of the contracted tensor are obtained from Eq. (B.31),
• Often the same letter is used for the tensor and its contraction, as for example in
T µν ρ = T αµν αρ . This is not strictly wrong, but in index free notation, it will be
confusing if the same letter is used for different tensors.
(4) The outer product of a $\binom{p}{q}$ tensor S and a $\binom{r}{s}$ tensor T is the $\binom{p+r}{q+s}$ tensor S ⊗ T defined
through
S ⊗ T (ω 1 , . . . , ω p , η 1 , . . . , η r , V 1 , . . . , V q , W 1 , . . . , W s ) (B.45)
= S(ω 1 , . . . , ω p , V 1 , . . . , V q ) T (η 1 , . . . , η r , W 1 , . . . , W s ) , (B.46)
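\[ (S \otimes T)^{\alpha_1 \ldots \alpha_p \beta_1 \ldots \beta_r}{}_{\mu_1 \ldots \mu_q \nu_1 \ldots \nu_s} = S^{\alpha_1 \ldots \alpha_p}{}_{\mu_1 \ldots \mu_q}\; T^{\beta_1 \ldots \beta_r}{}_{\nu_1 \ldots \nu_s} . \tag{B.47} \]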
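One can furthermore show that an arbitrary tensor T of rank $\binom{r}{s}$ can be expanded in
terms of its components as
\[ T = T^{\alpha_1 \ldots \alpha_r}{}_{\beta_1 \ldots \beta_s}\; e_{\alpha_1} \otimes \ldots \otimes e_{\alpha_r} \otimes dx^{\beta_1} \otimes \ldots \otimes dx^{\beta_s} . \tag{B.48} \]
The outer products eα1 ⊗ . . . ⊗ eαr ⊗ dxβ1 ⊗ . . . ⊗ dxβs thus form a basis of the vector space
of $\binom{r}{s}$ tensors.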
Def.: A tensor field of rank $\binom{r}{s}$ is a collection of $\binom{r}{s}$ tensors, one at each point. We can regard the tensor
field as a map that associates with every point p a tensor T p of rank $\binom{r}{s}$. The tensor field is
smooth :⇔ its components in a coordinate basis are smooth functions .
The distinction between a tensor and a tensor field will often be clear from the context. Some-
times, however, we will use an index p to distinguish a vector X p at p ∈ M from the vector
field X. As an example, we consider a vector field
X : M → Tp (M) , p ↦ X p . (B.49)
If f : M → R is a smooth function on the manifold, the vector field X defines a new function
X(f ) through
X(f ) : M → R , p ↦ X p (f ) , (B.50)
i.e. at any point p, the function X(f ) returns the directional derivative df /dt along the curve
that defines the vector at that point. For a vector field, we can define smoothness in a concep-
tually different but fully equivalent way to the above smoothness criterion for tensors.
Def.: The vector field X is smooth :⇔ X(f ) is a smooth function for every smooth f .
For illustration, let xα be a coordinate system on the manifold and consider the vector field
defined by the coordinate basis vector ∂µ at every point. For a function f , the vector field
generates the new function
\[ \partial_\mu(f) : \mathcal{M} \to \mathbb{R} , \qquad p \mapsto \frac{\partial f}{\partial x^\mu} . \tag{B.51} \]
The vector field ∂µ is clearly smooth, since for every smooth function f , its partial derivative
∂f /∂xµ is also a smooth function. We now see why the two definitions of smoothness for a
vector field are equivalent: we merely expand a vector field in the coordinate basis and obtain
smoothness of all individual terms in the expansion iff the vector field’s components are smooth.
As a final example, we consider a smooth vector field V and a smooth covector field η. Then
η(V ) : M → R , p ↦ η p (V p ) , (B.52)
is a smooth function because η(V ) = ηµ V µ and the components are smooth. Throughout the
remainder of this work, we will assume all tensors to be smooth.
Figure 8: Illustration of the integral curve λ of a vector field V through the point p ∈ M.
Def.: The integral curve of a vector field V through a point p ∈ M is defined as the curve through
p whose tangent at every point q along the curve is V q .
house numbers which are convenient for labeling houses in a street but not for giving us a precise
measure of how far apart they are. We will now define the metric tensor in such a general man-
ner that it accommodates the description of spacetimes as different as those containing multiple
black holes, describing open and closed universes or the gravitational collapse of stellar cores
in supernova explosions.
According to Eq. (B.48), we can expand the metric in terms of basis one-forms as
g = gαβ dxα ⊗ dxβ , gµν = g(∂µ , ∂ν ) . (B.54)
This relation is reminiscent of the more common notation for the line element
ds2 = gαβ dxα dxβ . (B.55)
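Note, however, that the two relations express mathematically very different objects, the former
a tensor on a manifold, the latter a differential. Combining Eq. (B.34) for the transformation
of tensors under a change of coordinates with the chain rule, we directly obtain the invariance of
the line element (B.55),
\[
d\tilde s^2 = \tilde g_{\mu\nu}\,d\tilde x^\mu d\tilde x^\nu
= \frac{\partial x^\alpha}{\partial \tilde x^\mu}\frac{\partial x^\beta}{\partial \tilde x^\nu}\, g_{\alpha\beta}\; \frac{\partial \tilde x^\mu}{\partial x^\rho}\,dx^\rho\; \frac{\partial \tilde x^\nu}{\partial x^\sigma}\,dx^\sigma
= \delta^\alpha{}_\rho\,\delta^\beta{}_\sigma\, g_{\alpha\beta}\,dx^\rho dx^\sigma
= g_{\alpha\beta}\,dx^\alpha dx^\beta = ds^2 .
\tag{B.56}
\]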
A metric introduces an isomorphism between vectors and one-forms,
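\[ V \mapsto \underline{V} := g(V , \,\cdot\,) , \tag{B.57} \]
i.e. $\underline{V}$ is a one-form defined through
\[ \underline{V} : T_p(\mathcal{M}) \to \mathbb{R} , \qquad W \mapsto \underline{V}(W) := g(V , W) . \tag{B.58} \]
The components of $\underline{V}$ are obtained by expanding all involved vectors and covectors in the
coordinate basis,
\[
\begin{aligned}
& W = W^\alpha \partial_\alpha , \qquad V = V^\alpha \partial_\alpha , \qquad \underline{V} = \underline{V}_\alpha\,dx^\alpha \\
\Rightarrow\quad & \underline{V}(W) = \underline{V}_\alpha\,dx^\alpha(W^\mu \partial_\mu) = \underline{V}_\alpha W^\mu \delta^\alpha{}_\mu = \underline{V}_\mu W^\mu .
\end{aligned}
\tag{B.59}
\]
Furthermore,
\[ g(V , W) = g_{\alpha\beta}\,V^\alpha W^\beta \quad\Rightarrow\quad \underline{V}_\mu = g_{\mu\nu}\,V^\nu . \tag{B.60} \]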
In the following, we will drop the underbar in the covector and write $V_\mu = \underline{V}_\mu$. The index
position makes clear whether we have a vector or a one-form. In index free notation, the
distinction will often be clear from the context. In those rare cases where it is not, we will
explicitly state what type of tensor we are dealing with.
From now on, we will drop the exponent −1 when we write the components of the
inverse metric and merely distinguish it from the metric by the position of the indices.
Example: The line element on the unit sphere, x2 + y 2 + z 2 = 1 in R3 , is ds2 = dθ2 + sin2 θ dφ2 ,
so that
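\[ g_{\alpha\beta} = \begin{pmatrix} 1 & 0 \\ 0 & \sin^2\theta \end{pmatrix} \qquad \text{and hence} \qquad g^{\alpha\beta} = \begin{pmatrix} 1 & 0 \\ 0 & \dfrac{1}{\sin^2\theta} \end{pmatrix} . \tag{B.62} \]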
Just as the metric defines a mapping from vectors to covectors, the inverse metric defines a
map in the other direction. If η is a one-form, a tensor of rank $\binom{1}{0}$, i.e. a vector, is defined
through
\[ g^{-1}(\eta , \,\cdot\,) : T_p^*(\mathcal{M}) \to \mathbb{R} , \qquad \omega \mapsto g^{-1}(\eta , \omega) . \tag{B.63} \]
In components,
η α = g αµ ηµ . (B.64)
The two isomorphisms defined by the metric and the inverse metric through Eqs. (B.58), (B.63)
are inverses of each other,
\[ g^{-1}\big( g(V , \,\cdot\,) , \,\cdot\, \big) = V , \qquad g\big( g^{-1}(\eta , \,\cdot\,) , \,\cdot\, \big) = \eta . \tag{B.65} \]
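For the unit sphere of Eq. (B.62), lowering and raising an index reduces to multiplication with a diagonal matrix, which makes the inverse relation (B.65) easy to check numerically. In the sketch below the function names and the sample point and components are illustrative assumptions; the coordinates are (θ, φ) with sin θ ≠ 0.

```python
import math

def sphere_metric(theta):
    """Metric and inverse metric of Eq. (B.62) in coordinates (theta, phi)."""
    s2 = math.sin(theta) ** 2
    return [[1.0, 0.0], [0.0, s2]], [[1.0, 0.0], [0.0, 1.0 / s2]]

def lower(g, V):
    """V_mu = g_{mu nu} V^nu, Eq. (B.60)."""
    return [sum(g[m][n] * V[n] for n in range(2)) for m in range(2)]

def raise_index(g_inv, eta):
    """eta^alpha = g^{alpha mu} eta_mu, Eq. (B.64)."""
    return [sum(g_inv[a][m] * eta[m] for m in range(2)) for a in range(2)]

theta = 0.7
g, g_inv = sphere_metric(theta)
V = [0.3, -1.1]                          # sample vector components (V^theta, V^phi)
V_low = lower(g, V)
V_back = raise_index(g_inv, V_low)       # should reproduce V, cf. Eq. (B.65)
norm_sq = sum(a * b for a, b in zip(V_low, V))   # g(V, V) = V_mu V^mu
print(V_low, V_back, norm_sq)
```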
In analogy to Eq. (B.64), we can raise and lower any number of indices in a tensor with the
metric or its inverse. For example, if T is a tensor of rank $\binom{3}{2}$, we obtain a tensor of rank $\binom{4}{1}$
through
\[ T^{\alpha}{}_{\beta}{}^{\gamma\delta\epsilon} = g_{\beta\lambda}\, g^{\delta\mu}\, g^{\epsilon\nu}\; T^{\alpha\lambda\gamma}{}_{\mu\nu} . \tag{B.66} \]
Because these mappings between tensors of different rank are isomorphisms, we usually use the
same letter, here T , for the object, irrespective of the positions of the indices.
Lemma: For every point p ∈ M, there exists a coordinate system y α such that at p the com-
ponents gαβ are (i) non-zero only on the diagonal, i.e. for α = β, and (ii) that these
non-zero components are +1 or −1. “Sylvester’s law” furthermore states that the num-
ber of such +1 or −1 entries is invariant under any coordinate change that preserves
the requirements (i) and (ii).
Def.: The signature σ of a metric gαβ on an n-dimensional manifold M is the sum over the +1 and
−1 entries over all diagonal elements. A metric with signature σ = n is called a “Riemannian
metric” and a metric with signature σ = n − 2 is called “Lorentzian”.
For example, the four-dimensional Minkowski metric ηαβ = diag(−1, +1, +1, +1) has signature
σ = 2 and we define spacetimes accordingly in general relativity.
null :⇔ g(V , V ) = 0
B.3 Geodesics
B.3.1 Curves revisited
On a manifold with Lorentzian metric, we can distinguish between timelike, null and spacelike
vectors according to the above definition. This property is directly transferred to curves.
Note that in general, the null, time or spacelike character of a curve can change along the curve.
For curves or segments of curves that are timelike or spacelike throughout, we can define the
following measures.
where $V = \frac{d}{dt}$ is the tangent vector of the curve λ. In components, this becomes
\[ s := \int_{t_0}^{t_1} \sqrt{g_{\alpha\beta}\,\frac{dx^\alpha}{dt}\frac{dx^\beta}{dt}}\;dt , \tag{B.68} \]
which, by differentiation, also justifies our notation ds2 = gαβ dxα dxβ for the line
element.
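As an example of Eq. (B.68), the sketch below integrates the length of two curves on the unit sphere with line element ds² = dθ² + sin²θ dφ²: a circle of constant latitude θ0, whose length should be 2π sin θ0, and a meridian from pole to pole, whose length should be π. The midpoint rule, the finite-difference derivatives and the names `curve_length`, `latitude` and `meridian` are illustrative assumptions.

```python
import math

def curve_length(curve, t0, t1, steps=2000):
    """Midpoint-rule approximation of Eq. (B.68) for the unit-sphere metric."""
    h = (t1 - t0) / steps
    s = 0.0
    for k in range(steps):
        t = t0 + (k + 0.5) * h
        th_a, ph_a = curve(t - 0.5 * h)
        th_b, ph_b = curve(t + 0.5 * h)
        theta, _ = curve(t)
        dtheta = (th_b - th_a) / h          # d(theta)/dt by central difference
        dphi = (ph_b - ph_a) / h            # d(phi)/dt by central difference
        s += math.sqrt(dtheta ** 2 + math.sin(theta) ** 2 * dphi ** 2) * h
    return s

THETA0 = 1.0

def latitude(t):
    """Circle of constant latitude theta = THETA0, t in [0, 1]."""
    return (THETA0, 2.0 * math.pi * t)

def meridian(t):
    """Meridian phi = 0 from pole to pole, t in [0, 1]."""
    return (math.pi * t, 0.0)

print(curve_length(latitude, 0.0, 1.0), 2.0 * math.pi * math.sin(THETA0))
print(curve_length(meridian, 0.0, 1.0), math.pi)
```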
Def.: For timelike curves, we define the proper time along a curve as
\[ \tau(t_1) := \int_{t_0}^{t_1} \sqrt{-\,g(V , V)\big|_{\lambda(t)}}\;dt = \int_{t_0}^{t_1} \sqrt{-g_{\alpha\beta}\,\frac{dx^\alpha}{dt}\frac{dx^\beta}{dt}}\;dt , \tag{B.69} \]
For timelike curves, we define the four-velocity in analogy to Eq. (A.91) in special relativity:
Def.: The four-velocity along a timelike curve λ is the tangent vector to that curve parametrized
by proper time τ,
u^\mu := \left. \frac{dx^\mu}{d\tau} \right|_{\lambda(\tau)} . (B.70)
⇒ g_{\mu\nu} u^\mu u^\nu = -1 . (B.71)
Just as in special relativity, the four-velocity of a timelike curve is a unit vector.
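As a rough numerical illustration of Eq. (B.69), the following sketch integrates the proper time along a curve with the midpoint rule. The helper names (`proper_time`, `minkowski`, `worldline`), the step count and the finite-difference step are assumptions made for this example only.

```python
import math


def proper_time(curve, metric, t0, t1, steps=1000):
    """Midpoint-rule approximation of Eq. (B.69) for a timelike curve."""
    h = (t1 - t0) / steps
    eps = 1e-6  # step for the central-difference tangent
    tau = 0.0
    for i in range(steps):
        t = t0 + (i + 0.5) * h
        x = curve(t)
        xp, xm = curve(t + eps), curve(t - eps)
        v = [(a - b) / (2 * eps) for a, b in zip(xp, xm)]  # dx^alpha/dt
        g = metric(x)
        n = len(v)
        norm = sum(g[a][b] * v[a] * v[b] for a in range(n) for b in range(n))
        if norm >= 0:
            raise ValueError("curve is not timelike at t = %g" % t)
        tau += math.sqrt(-norm) * h
    return tau


def minkowski(x):
    """Minkowski metric diag(-1, +1, +1, +1); independent of the point x."""
    return [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]


def worldline(t):
    """Straight worldline with coordinate velocity 0.6 in the x direction."""
    return [t, 0.6 * t, 0.0, 0.0]


tau = proper_time(worldline, minkowski, 0.0, 10.0)
print(tau)
```

For this straight worldline the integrand is constant, sqrt(1 − 0.36) = 0.8, so the result should be close to 0.8 times the coordinate time elapsed, in line with time dilation in special relativity.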
Figure 10: Graphical illustration of varying curves x^α(λ) from A to B such that the action (B.77) is
extremal.
p_k := \frac{\partial L}{\partial \dot q_k} , (B.74)
is a first integral of motion, i.e. is conserved along the path
that extremizes the action S.
I := \dot q_k \frac{\partial L}{\partial \dot q_k} - L , (B.75)
is a first integral of motion.
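Proof: Part (i) follows directly from the Euler-Lagrange equation (B.73). For part (ii), we
start by differentiating Eq. (B.75),
\frac{d}{d\lambda} \left( \dot q_k \frac{\partial L}{\partial \dot q_k} - L \right) = \ddot q_k \frac{\partial L}{\partial \dot q_k} + \dot q_k \frac{d}{d\lambda} \frac{\partial L}{\partial \dot q_k} - \frac{dL}{d\lambda}
= \ddot q_k \frac{\partial L}{\partial \dot q_k} + \dot q_k \frac{d}{d\lambda} \frac{\partial L}{\partial \dot q_k} - \left( \underbrace{\frac{\partial L}{\partial \lambda}}_{=0} + \dot q_k \frac{\partial L}{\partial q_k} + \ddot q_k \frac{\partial L}{\partial \dot q_k} \right)
= \dot q_k \left( \frac{d}{d\lambda} \frac{\partial L}{\partial \dot q_k} - \frac{\partial L}{\partial q_k} \right) = 0 \quad \text{by Eq. (B.73)} . (B.76)
In the study of geodesics, this result will turn out particularly valuable if \dot q_k ≠ 0.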
Let us then extremize proper time for timelike curves. More specifically, we consider curves
xα (λ) connecting points A and B of the manifold; cf. Fig. 10. Without loss of generality, we
choose the parameter λ such that λ = 0 corresponds to point A and λ = 1 to point B. We
wish to extremize the proper time between the points which gives us the action [cf. Eq. (B.69)]
S = \int_0^1 L \, d\lambda , \qquad L = \sqrt{ -g_{\mu\nu} \dot x^\mu \dot x^\nu } . (B.77)
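Note that S is invariant under a reparametrization of the curve. For example, we can introduce
a new parameter κ required only to be a monotonic function of λ, i.e. dκ/dλ > 0. Then, by the
chain rule, we have
S = \int_0^1 \sqrt{ -g_{\mu\nu} \frac{dx^\mu}{d\lambda} \frac{dx^\nu}{d\lambda} } \, d\lambda
= \int_{\kappa(0)}^{\kappa(1)} \sqrt{ -g_{\mu\nu} \frac{dx^\mu}{d\kappa} \frac{dx^\nu}{d\kappa} } \, \frac{d\kappa}{d\lambda} \frac{d\lambda}{d\kappa} \, d\kappa
= \int_{\kappa(0)}^{\kappa(1)} \sqrt{ -g_{\mu\nu} \frac{dx^\mu}{d\kappa} \frac{dx^\nu}{d\kappa} } \, d\kappa . (B.78)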
We now apply the Euler-Lagrange equation (B.73) to the action (B.77). The derivatives of the
Lagrangian are (a dot denotes d/dλ)
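\frac{\partial L}{\partial \dot x^\alpha} = \frac{1}{2L} \left( -g_{\mu\nu} \delta^\mu{}_\alpha \dot x^\nu - g_{\mu\nu} \dot x^\mu \delta^\nu{}_\alpha \right) = -\frac{g_{\mu\alpha} \dot x^\mu}{L} , (B.79)
\frac{\partial L}{\partial x^\alpha} = \frac{1}{2L} \left( -\dot x^\mu \dot x^\nu \partial_\alpha g_{\mu\nu} \right) , (B.80)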
so that the Euler-Lagrange equation becomes
\frac{d}{d\lambda} \left( -\frac{g_{\mu\alpha} \dot x^\mu}{L} \right) + \frac{\dot x^\mu \dot x^\nu \partial_\alpha g_{\mu\nu}}{2L} = 0 . (B.81)
If you haven’t got a social life, like me, you might want to go ahead and evaluate the λ
derivatives. But there is an easier way: we reparametrize the curve using proper time
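\tau(\lambda) = \int_0^\lambda \sqrt{ -g_{\mu\nu} \frac{dx^\mu}{d\tilde\lambda} \frac{dx^\nu}{d\tilde\lambda} } \, d\tilde\lambda
\Rightarrow \left( \frac{d\tau}{d\lambda} \right)^2 = -g_{\mu\nu} \frac{dx^\mu}{d\lambda} \frac{dx^\nu}{d\lambda} = L^2
\Rightarrow \frac{d\tau}{d\lambda} = L
\Rightarrow \frac{d}{d\tau} = \frac{d\lambda}{d\tau} \frac{d}{d\lambda} = \frac{1}{L} \frac{d}{d\lambda} , (B.82)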
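where we assumed in the third line that dτ/dλ > 0, i.e. both parameters are future oriented.
Inserting this result into Eq. (B.81) gives
-L \frac{d}{d\tau} \left( g_{\mu\alpha} \frac{dx^\mu}{d\tau} \right) + \frac{L}{2} \, \partial_\alpha g_{\mu\nu} \frac{dx^\mu}{d\tau} \frac{dx^\nu}{d\tau} = 0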
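\Gamma^\alpha_{\beta\gamma} := \frac{1}{2} g^{\alpha\mu} \left( \partial_\beta g_{\gamma\mu} + \partial_\gamma g_{\mu\beta} - \partial_\mu g_{\beta\gamma} \right) . (B.85)
\frac{d^2 x^\alpha}{d\tau^2} + \Gamma^\alpha_{\beta\gamma} \frac{dx^\beta}{d\tau} \frac{dx^\gamma}{d\tau} = 0 . (B.86)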
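The definition (B.85) can be evaluated numerically. The sketch below is a minimal illustration that assumes a diagonal metric (so that the inverse is simply 1/g_{αα}) and approximates the partial derivatives by central differences; the names `christoffel` and `sphere` and the step size are choices made for this example, not part of the notes.

```python
import math


def christoffel(metric, x, h=1e-5):
    """Gamma[a][b][c] = Gamma^a_{bc} from Eq. (B.85), for a DIAGONAL metric only."""
    n = len(x)

    def dg(mu):
        # Central difference of all metric components with respect to x^mu.
        xp, xm = list(x), list(x)
        xp[mu] += h
        xm[mu] -= h
        gp, gm = metric(xp), metric(xm)
        return [[(gp[a][b] - gm[a][b]) / (2 * h) for b in range(n)]
                for a in range(n)]

    d = [dg(mu) for mu in range(n)]  # d[mu][a][b] = partial_mu g_ab
    g = metric(x)
    gamma = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for a in range(n):
        for b in range(n):
            for c in range(n):
                # For a diagonal metric g^{a mu} is non-zero only for mu = a.
                gamma[a][b][c] = 0.5 / g[a][a] * (
                    d[b][c][a] + d[c][a][b] - d[a][b][c])
    return gamma


def sphere(x):
    """Unit 2-sphere, coordinates x = (theta, phi): ds^2 = dtheta^2 + sin^2(theta) dphi^2."""
    theta = x[0]
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]


G = christoffel(sphere, [1.0, 0.3])
print(G[0][1][1], G[1][0][1])
```

For the unit sphere the two printed values should approximate the familiar results Γ^θ_{φφ} = −sin θ cos θ and Γ^φ_{θφ} = cot θ at θ = 1.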
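For spacelike geodesics, we can perform an analogous calculation, merely starting with the
action
\tilde S = \int_0^1 \tilde L \, d\lambda , \qquad \tilde L = \sqrt{ g_{\mu\nu} \dot x^\mu \dot x^\nu } , (B.87)
in place of Eq. (B.77) and then reparametrizing from λ to the proper length s according to
\frac{ds}{d\lambda} = \tilde L . (B.88)
We then obtain
\frac{d^2 x^\alpha}{ds^2} + \Gamma^\alpha_{\beta\gamma} \frac{dx^\beta}{ds} \frac{dx^\gamma}{ds} = 0 . (B.89)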
The difference to our first Lagrangian (B.77) is (i) that we do not take the square root and (ii)
that we do not restrict the discussion to timelike or spacelike or null curves. For this reason,
we need not worry about the overall sign of g_{\alpha\beta} \dot x^\alpha \dot x^\beta and, just for convenience, choose not to
put a minus sign in front.
The variation of (B.90) is straightforward. The derivatives of L̂ are
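\frac{\partial \hat L}{\partial \dot x^\mu} = g_{\alpha\beta} \dot x^\beta \delta^\alpha{}_\mu + g_{\alpha\beta} \dot x^\alpha \delta^\beta{}_\mu = 2 g_{\mu\beta} \dot x^\beta , (B.91)
\frac{\partial \hat L}{\partial x^\mu} = \dot x^\alpha \dot x^\beta \partial_\mu g_{\alpha\beta} , (B.92)
and the Euler-Lagrange equation gives us
\frac{d}{d\lambda} \frac{\partial \hat L}{\partial \dot x^\mu} - \frac{\partial \hat L}{\partial x^\mu} = 2 g_{\mu\beta} \ddot x^\beta + 2 \dot x^\beta (\partial_\nu g_{\mu\beta}) \dot x^\nu - \dot x^\alpha \dot x^\beta \partial_\mu g_{\alpha\beta} = 0
\Rightarrow g_{\mu\beta} \ddot x^\beta + \frac{1}{2} \dot x^\nu \dot x^\beta \left( \partial_\nu g_{\mu\beta} + \partial_\beta g_{\mu\nu} - \partial_\mu g_{\nu\beta} \right) = 0 \qquad | \cdot g^{\alpha\mu}
\Rightarrow \ddot x^\alpha + \Gamma^\alpha_{\nu\beta} \dot x^\nu \dot x^\beta = 0 . (B.93)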
Aside from the fact that we have the more general parameter λ instead of proper time or
length, this equation looks exactly like Eqs. (B.86), (B.89) derived above for time and spacelike
geodesics and all seems fine. But it is not quite as simple as that.
Let us consider, for example timelike geodesics and choose a parameter λ related to proper
time by
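\lambda = e^\tau \;\Rightarrow\; \frac{d}{d\tau} = \frac{d\lambda}{d\tau} \frac{d}{d\lambda} = \lambda \frac{d}{d\lambda} \;\Rightarrow\; \frac{d^2}{d\tau^2} = \frac{d}{d\tau} \left( e^\tau \frac{d}{d\lambda} \right) = \lambda \frac{d}{d\lambda} + \lambda^2 \frac{d^2}{d\lambda^2} . (B.94)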
We have demonstrated above that the action (B.77) is invariant under any reparametrization,
so its variation proceeds the same way for any λ and the geodesic equations (B.86), (B.89) still
are the correct results. Rewritten in terms of λ = eτ , however, (B.86) becomes [using (B.94)]
\ddot x^\alpha + \Gamma^\alpha_{\nu\beta} \dot x^\nu \dot x^\beta = -\frac{1}{\lambda} \frac{dx^\alpha}{d\lambda} . (B.95)
This is clearly not compatible with Eq. (B.93). So which one is correct and what is going on?
The answer arises from the fact that the action (B.90) is not invariant under a change of the
parameter λ. If we change the parameter, say from λ1 to λ2 , we are not necessarily extremizing
the same action and should not be surprised that the result of this exercise, namely Eq. (B.93),
gives us a different curve when choosing parameter λ1 than for choosing parameter λ2 . So, for
our particular choice λ = eτ , Eq. (B.95) gives us geodesics and Eq. (B.93) does not.
On the other hand, if we set λ = τ , Eq. (B.93) agrees with (B.86) and gives us geodesics.
The question then remains to figure out for which choices of the parameter λ, Eq. (B.93) is
correct. Let us first consider timelike geodesics which are given by Eq. (B.86). Let λ and τ be
monotonically increasing and, thus, invertible functions of each other: dτ /dλ > 0. Then
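\frac{d}{d\tau} = \frac{d\lambda}{d\tau} \frac{d}{d\lambda} \;\Rightarrow\; \frac{d^2}{d\tau^2} = \frac{d}{d\tau} \left( \frac{d\lambda}{d\tau} \frac{d}{d\lambda} \right) = \frac{d^2 \lambda}{d\tau^2} \frac{d}{d\lambda} + \left( \frac{d\lambda}{d\tau} \right)^2 \frac{d^2}{d\lambda^2} , (B.96)
and the geodesic equation (B.86) becomes
\frac{d^2 x^\alpha}{d\lambda^2} + \Gamma^\alpha_{\beta\gamma} \frac{dx^\beta}{d\lambda} \frac{dx^\gamma}{d\lambda} = -\left( \frac{d\lambda}{d\tau} \right)^{-2} \frac{d^2 \lambda}{d\tau^2} \frac{dx^\alpha}{d\lambda} . (B.97)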
This agrees with Eq. (B.93) if the right-hand side vanishes, which is only achieved for
\frac{d^2 \lambda}{d\tau^2} = 0 \;\Leftrightarrow\; \lambda = c_1 \tau + c_2 , \qquad c_1 , c_2 = \text{const} \in \mathbb{R} , (B.98)
i.e. if λ and τ are linearly related. We likewise find that (B.93) defines spacelike geodesics if the
parameter λ is linearly related to the proper distance s. This leads to the definition of affine
parameters.
Def.: The parameter λ along a timelike (spacelike) curve is called an affine parameter if it
is linearly related to the proper time (proper distance) along this curve: λ = c1 τ + c2
(λ = c1 s + c2 ).
For an affine parameter, timelike and spacelike geodesics are determined by Eq. (B.93).
If we choose a non-affine parameter instead, geodesics are given by Eq. (B.97).
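The distinction can be checked by hand in the simplest possible setting. The following worked example is an illustration added here, assuming Minkowski spacetime in Cartesian coordinates (all Christoffel symbols vanish) and a straight timelike line with constant a^α and u^α:

```latex
\begin{align*}
x^\alpha(\tau) &= a^\alpha + u^\alpha \tau , \qquad \Gamma^\alpha_{\beta\gamma} = 0 , \\
\lambda = e^\tau : \quad x^\alpha(\lambda) &= a^\alpha + u^\alpha \ln \lambda , \quad
  \dot x^\alpha = \frac{u^\alpha}{\lambda} , \quad
  \ddot x^\alpha = -\frac{u^\alpha}{\lambda^2} = -\frac{1}{\lambda} \dot x^\alpha \neq 0 , \\
\lambda = c_1 \tau + c_2 : \quad x^\alpha(\lambda) &= a^\alpha + u^\alpha \, \frac{\lambda - c_2}{c_1} , \quad
  \ddot x^\alpha = 0 .
\end{align*}
```

The same straight line therefore satisfies the non-affine form with the extra term on the right-hand side for λ = e^τ, and the source-free form only for the linearly related parameter.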
In this discussion, we have so far gracefully ignored null geodesics. Null geodesics are special in
the sense that they do not have a natural affine parameter analogous to proper time or proper
distance. Nevertheless, null geodesics are honorable curves that can be parametrized just like
other curves. We can even define affine parameters as follows.
In general relativity we define test particles as sufficiently small bodies that generate negligible
gravitational fields. Their motion is governed by a geodesic postulate analogous to Eq. (A.101)
in special relativity.
Geodesic postulate: Test particles with positive (zero) rest mass move on timelike (null)
geodesics.
The geodesic equation, either in the form (B.93) for an affine parameter or (B.97) for a non-
affine parameter, is a second-order ordinary differential equation. The uniqueness theorems of
the theory of ordinary differential equations ensure that a unique solution exists for specified
position xα (λ) and velocity ẋα (λ) at some λ = λ0 .
Aside from demonstrating the difference between affine and non-affine parameters, the varia-
tional method discussed in this section also serves a practical purpose: it gives us a convenient
method to calculate the Christoffel symbols without grinding through its definition (B.85).
This method is best illustrated using an example. As will be shown below in Sec. D.1, the
Schwarzschild metric for a static black hole can be written in spherical coordinates as
ds^2 = -f(r) \, dt^2 + \frac{1}{f(r)} dr^2 + r^2 d\theta^2 + r^2 \sin^2\theta \, d\phi^2 , \qquad f(r) = 1 - \frac{2M}{r} , (B.100)
where the constant M denotes the mass of the black hole. For an affine parameter λ, the
geodesic equation is then given by (B.93) if we know the Christoffel symbols. Viewed the
other way round, however, we can use Eq. (B.93) to extract the Christoffel symbols if we know
the geodesic equation. And for reasonably simple metrics like (B.100), the geodesic equation is
quite easily obtained by directly varying the Lagrangian L̂ of Eq. (B.90). For the Schwarzschild
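metric (B.100), the Lagrangian is
\hat L = -f \, \dot t^2 + \frac{\dot r^2}{f} + r^2 \dot\theta^2 + r^2 \sin^2\theta \, \dot\phi^2 . (B.101)
For the t component, we have
\frac{\partial \hat L}{\partial \dot t} = -2 f \dot t , \qquad \frac{\partial \hat L}{\partial t} = 0 , (B.102)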
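leading to
\frac{d}{d\lambda} ( -2 f \dot t ) = 0
\Rightarrow \frac{d^2 t}{d\lambda^2} + f^{-1} \frac{df}{dr} \dot r \dot t = 0
\Rightarrow \Gamma^t_{tr} = \Gamma^t_{rt} = \frac{1}{2f} \frac{df}{dr} = \frac{M}{r (r - 2M)} , \qquad \Gamma^t_{\mu\nu} = 0 \text{ otherwise} . (B.103)
Note the factor 1/2 that arises for Christoffel symbols with mixed downstairs indices: Γ^t_{tr} and Γ^t_{rt} are
equal and thus appear twice in the summation \Gamma^t_{\mu\nu} \dot x^\mu \dot x^\nu in the geodesic equation.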
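The value of Γ^t_{tr} can be cross-checked against the definition (B.85) numerically. This is a minimal sketch, assuming the diagonal form (B.100) with coordinates ordered as (t, r, θ, φ); the helper names `christoffel` and `schwarzschild`, the choice M = 1 and the evaluation point r = 10 are illustrative assumptions.

```python
import math


def christoffel(metric, x, h=1e-5):
    """Gamma[a][b][c] = Gamma^a_{bc} from Eq. (B.85), for a DIAGONAL metric only."""
    n = len(x)

    def dg(mu):
        # Central difference of all metric components with respect to x^mu.
        xp, xm = list(x), list(x)
        xp[mu] += h
        xm[mu] -= h
        gp, gm = metric(xp), metric(xm)
        return [[(gp[a][b] - gm[a][b]) / (2 * h) for b in range(n)]
                for a in range(n)]

    d = [dg(mu) for mu in range(n)]  # d[mu][a][b] = partial_mu g_ab
    g = metric(x)
    return [[[0.5 / g[a][a] * (d[b][c][a] + d[c][a][b] - d[a][b][c])
              for c in range(n)] for b in range(n)] for a in range(n)]


def schwarzschild(x, M=1.0):
    """Metric (B.100) in coordinates x = (t, r, theta, phi)."""
    _, r, theta, _ = x
    f = 1.0 - 2.0 * M / r
    diag = [-f, 1.0 / f, r ** 2, (r * math.sin(theta)) ** 2]
    return [[diag[a] if a == b else 0.0 for b in range(4)] for a in range(4)]


G = christoffel(schwarzschild, [0.0, 10.0, math.pi / 2, 0.0])
expected = 1.0 / (10.0 * (10.0 - 2.0))  # M / (r (r - 2M)) with M = 1, r = 10
print(G[0][0][1], expected)
```

The finite-difference value of Γ^t_{tr} should agree with M/(r(r − 2M)) to within the discretization error, while components such as Γ^t_{rr} vanish.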
spaces, namely Tp (M) and Tq (M). We can therefore not take the difference between them. So
how can we calculate their derivative?
The answer is to construct the so-called covariant derivative. We will do this in steps, first for
scalars, then for vectors and finally for arbitrary tensors. For scalars, this is trivial since they
are the only class of tensors for which the problem just mentioned does not arise; we can just
subtract the scalar at one point from that at another.
∇α f := (∇f )α = ∂α f . (B.105)
Recall that V (f ) is the derivative of f defined by (B.5). Covariantly differentiating vector fields
is a bit more complicated.
with the following properties (f , g are functions and X, Y , V , W are vector fields)
(2) ∇X (V + W ) = ∇X V + ∇X W
Note that we can also define ∇V as the following type of map, which is completely equivalent
to (B.106),
\nabla V : T_p^*(M) \times T_p(M) \to \mathbb{R} , \qquad (\eta, X) \mapsto \eta(\nabla_X V) . (B.107)
In this form, the tensor rank (1,1) of ∇V is manifest. In components, we use the following
notations
V^\alpha{}_{;\beta} := \nabla_\beta V^\alpha := (\nabla V)^\alpha{}_\beta . (B.108)
You may wonder at this stage that this definition is all nice and fine, but how do we actually
calculate the covariant derivative of a vector? Patience, we will come to that. First we define
another level of structure on the manifold.
Def.: Let (e_μ) be a basis of the tangent space T_p(M). The connection coefficients Γ^ρ_{μν} are defined
through
\nabla_\nu e_\mu := \nabla_{e_\nu} e_\mu = \Gamma^\rho_{\mu\nu} e_\rho . (B.109)
= V^\nu e_\nu(W^\mu) \, e_\mu + W^\mu \nabla_{V^\nu e_\nu}(e_\mu)
= V^\nu \left( \partial_\nu W^\rho + W^\mu \Gamma^\rho_{\mu\nu} \right) e_\rho , (B.110)
where in the last but one line, we used eν (f ) = ∂ν f for f = W µ , renamed the summation index
µ to ρ in the first term and inserted the connection through its definition (B.109). Since the
vector V is arbitrary in Eq. (B.111), we can rewrite this result, also defining standard notation,
in the form
W^\rho{}_{;\nu} := \nabla_\nu W^\rho := (\nabla W)^\rho{}_\nu = \partial_\nu W^\rho + \Gamma^\rho_{\mu\nu} W^\mu . (B.112)
So we now have a perfectly nice expression for the covariant derivative of a vector field provided
we know the connection. Before you conclude that we are just kicking the can down the road,
we will get to that point in due course. But first we deal with a couple of other important
points concerning Eq. (B.112).
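A short worked example may help to see the correction term of Eq. (B.112) at work. It assumes the flat plane in polar coordinates (r, θ), with the standard non-vanishing connection coefficients Γ^r_{θθ} = −r and Γ^θ_{rθ} = Γ^θ_{θr} = 1/r, and takes W to be the constant Cartesian basis vector ∂/∂x; none of this setup appears in the notes themselves.

```latex
\begin{align*}
W &= \frac{\partial}{\partial x} : \qquad W^r = \cos\theta , \qquad W^\theta = -\frac{\sin\theta}{r} , \\
\nabla_r W^r &= \partial_r \cos\theta = 0 , \\
\nabla_\theta W^r &= \partial_\theta \cos\theta + \Gamma^r_{\theta\theta} W^\theta
  = -\sin\theta + (-r) \left( -\frac{\sin\theta}{r} \right) = 0 , \\
\nabla_r W^\theta &= \partial_r \left( -\frac{\sin\theta}{r} \right) + \Gamma^\theta_{\theta r} W^\theta
  = \frac{\sin\theta}{r^2} - \frac{\sin\theta}{r^2} = 0 , \\
\nabla_\theta W^\theta &= \partial_\theta \left( -\frac{\sin\theta}{r} \right) + \Gamma^\theta_{r\theta} W^r
  = -\frac{\cos\theta}{r} + \frac{\cos\theta}{r} = 0 .
\end{align*}
```

The partial derivatives of the components do not vanish in polar coordinates, but the connection terms cancel them exactly, as expected for a vector field that is constant in the Cartesian sense.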
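First, we would like to check how it changes under a transformation of coordinates from x^α to
\tilde x^\mu. For the connection, we start with its definition (B.109) and replace e_α = ∂_α, which makes it
easier to spot where to apply the chain rule. Transformed to coordinates \tilde x^\mu, this equation becomes
(we denote \tilde\partial_\alpha := \partial / \partial \tilde x^\alpha)
\tilde\Gamma^\sigma_{\mu\nu} \tilde\partial_\sigma = \nabla_{\tilde\partial_\nu} \tilde\partial_\mu = \frac{\partial x^\alpha}{\partial \tilde x^\nu} \nabla_{\partial_\alpha} \left( \frac{\partial x^\beta}{\partial \tilde x^\mu} \partial_\beta \right)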
Def.: Let M be a manifold with connection Γλµν . The torsion tensor is defined as
The difference between two connections often appears naturally in perturbation theory where
we study small deviations from the connection of a background spacetime. This deviation is
indeed a tensor.
With Eq. (B.113), we have all tools at hand to check how the covariant derivative of a vector
transforms. Let us first consider the partial derivative on the right-hand side of Eq. (B.112),
Again, the first term on the right-hand side would give us a tensor transformation law, this time
for a tensor of rank (1,1), but the second term spoils the transformation. It looks suspiciously
similar to the extra term we obtained in Eq. (B.113) and you can probably guess where this is
heading. Combining Eqs. (B.113) and (B.115), we obtain the transformation of the covariant
derivative ∇_α W^β,
\tilde\nabla_\nu \tilde W^\rho = \tilde\partial_\nu \tilde W^\rho + \tilde\Gamma^\rho_{\mu\nu} \tilde W^\mu
Before moving on to other tensors, we mention a subtle point about the notation that has
some potential for confusion but is nonetheless used almost ubiquitously in the field. It concerns
the component functions of tensor fields; for example the components W^μ of a vector W.
Strictly speaking, these are merely functions on the manifold. We have treated them as such,
for instance, in the derivation (B.110) where we regarded eν (W µ ) as the derivative ∂ν W µ . It
is common in the literature, however, to also use W µ representing the entire vector. This is
done, for example, in the notation ∇ν W ρ for the covariant derivative of W in Eq. (B.112).
The covariant derivative of a function would just be its partial derivative, but ∇ν W ρ includes
the correction term for the covariant derivative of a vector. How do you know when terms like
W ρ are assumed to represent merely the component functions or the entire tensor? Usually
this should be clear from the context, but a good rule of thumb is that they represent the
components if the basis vectors are explicitly present in the equation, but otherwise denote the
tensor; cf. Eq. (B.110) with Eq. (B.112).
In order to define the covariant derivative of a covector field, we recall that a covector is defined
through its action on vectors; cf. Eq. (B.16).
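Def.: Let η be a covector field and V, W be two vector fields. The covariant derivative of η
is defined as the map
(\nabla_V \eta)(W) := V\big( \eta(W) \big) - \eta(\nabla_V W) . (B.119)
Note that η(W) is a function and ∇_V W is a vector, so that all terms on the right-hand
side of Eq. (B.119) are already well defined. Equation (B.119) furthermore exhibits the product
rule explicitly for differentiating η(W) if we move the last term to the left-hand side.
∇η is a tensor of rank (0,2), which can be seen as follows.
(\nabla_V \eta)(W) = V^\rho \partial_\rho (\eta_\mu W^\mu) - \eta_\mu \left( V^\rho \partial_\rho W^\mu + V^\rho \Gamma^\mu_{\nu\rho} W^\nu \right)
= V^\rho W^\mu \partial_\rho \eta_\mu - \Gamma^\mu_{\nu\rho} \eta_\mu V^\rho W^\nu
= \left( \partial_\rho \eta_\mu - \Gamma^\nu_{\mu\rho} \eta_\nu \right) V^\rho W^\mu . (B.120)
So ∇η is indeed a linear map taking two vectors as input and returning a number. Equation
(B.120) further gives us the components of the covariant derivative of a one-form η,
\eta_{\mu;\rho} := \nabla_\rho \eta_\mu = \partial_\rho \eta_\mu - \Gamma^\nu_{\mu\rho} \eta_\nu . (B.121)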
We likewise define the covariant derivative of a tensor T of rank (r,s) by filling all its slots with
r one-forms and s vectors. The result is a number and we require the Leibniz rule to hold on the
entire product. A straightforward calculation analogous to (B.120) shows that the result ∇T
is a tensor of rank (r, s+1) and has the components
\nabla_\rho T^{\mu_1 \ldots \mu_r}{}_{\nu_1 \ldots \nu_s} = \partial_\rho T^{\mu_1 \ldots \mu_r}{}_{\nu_1 \ldots \nu_s} + \Gamma^{\mu_1}_{\sigma\rho} T^{\sigma \mu_2 \ldots \mu_r}{}_{\nu_1 \ldots \nu_s} + \ldots + \Gamma^{\mu_r}_{\sigma\rho} T^{\mu_1 \ldots \mu_{r-1} \sigma}{}_{\nu_1 \ldots \nu_s}
- \Gamma^\sigma_{\nu_1 \rho} T^{\mu_1 \ldots \mu_r}{}_{\sigma \nu_2 \ldots \nu_s} - \ldots - \Gamma^\sigma_{\nu_s \rho} T^{\mu_1 \ldots \mu_r}{}_{\nu_1 \ldots \nu_{s-1} \sigma} . (B.122)
This expression is simpler than it looks at first glance. First, we get a partial derivative and then
for each tensor index one correction term constructed as follows. For each upstairs (downstairs)
index of the tensor T , we add (subtract) a term “ΓT ”. The derivative index (ρ in our case) is
always the second downstairs index of the Γ whose other indices are combined with those of T
in the only manner possible to make the free indices’ positions agree with the left-hand side.
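As a quick illustration of this recipe, here are the two lowest-rank cases beyond vectors and one-forms written out explicitly; the tensor T of rank (1,1) is an arbitrary example, and the second line is simply the rule applied to the metric.

```latex
\begin{align*}
\nabla_\rho T^\mu{}_\nu &= \partial_\rho T^\mu{}_\nu
  + \Gamma^\mu_{\sigma\rho} T^\sigma{}_\nu
  - \Gamma^\sigma_{\nu\rho} T^\mu{}_\sigma , \\
\nabla_\rho g_{\mu\nu} &= \partial_\rho g_{\mu\nu}
  - \Gamma^\sigma_{\mu\rho} g_{\sigma\nu}
  - \Gamma^\sigma_{\nu\rho} g_{\mu\sigma} .
\end{align*}
```

In each correction term the derivative index ρ sits in the second downstairs slot of Γ, and the summed index σ replaces exactly one index of the tensor.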
Theorem: On a manifold M with metric g there exists a unique torsion free connection that is
metric compatible, i.e. satisfies
∇g = 0 . (B.123)
This connection is called the Levi-Civita connection and its components are given by the
Christoffel symbols.
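\nabla_\alpha g_{\beta\gamma} = 0
\Rightarrow \partial_\alpha g_{\beta\gamma} = \Gamma^\rho_{\beta\alpha} g_{\rho\gamma} + \Gamma^\rho_{\gamma\alpha} g_{\beta\rho} . (B.126)
= \Gamma^\mu_{\beta\gamma} . (B.127)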
In general relativity we use the Levi-Civita connection and we shall henceforth assume the
connection Γαβγ to be the Levi-Civita one unless stated otherwise.
Def.: Let V be a vector field and C an integral curve of V . A tensor T is parallel transported
along C if ∇V T = 0 at every point of the curve.
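Example: Recall Eq. (B.93) for an affinely parametrized geodesic which we now write as
\frac{d^2 x^\alpha}{d\lambda^2} + \Gamma^\alpha_{\mu\nu} \frac{dx^\mu}{d\lambda} \frac{dx^\nu}{d\lambda} = 0 . (B.128)
The tangent vector of that curve is
U^\alpha = \frac{dx^\alpha}{d\lambda} , (B.129)
which becomes the four velocity (B.70) for the case of timelike geodesics
parametrized with proper time. Equation (B.128) then becomes
0 = \frac{d}{d\lambda} U^\alpha + \Gamma^\alpha_{\mu\nu} U^\mu U^\nu = \frac{dx^\beta}{d\lambda} \partial_\beta U^\alpha + \Gamma^\alpha_{\mu\nu} U^\mu U^\nu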
U = \frac{d}{d\lambda} , \quad \text{and} \quad V = \frac{d}{d\kappa} = \frac{d\lambda}{d\kappa} U , (B.131)
are tangent to the geodesic. Defining h := dλ/dκ, we have
\nabla_V V = \nabla_{hU}(hU) = h \nabla_U (hU) = h^2 \underbrace{\nabla_U U}_{=0} + U(h) \, h U = \frac{dh}{d\lambda} V , (B.132)
and
\nabla_V V = \frac{dh}{d\lambda} V , (B.133)
describes the same geodesic. κ is also affine if dh/dλ = 0 ⇔ h = const ⇔ κ =
c_1 λ + c_2, in agreement with what we found in Eq. (B.98).
\Rightarrow \frac{d}{d\lambda} T^\mu{}_\nu + \Gamma^\mu_{\rho\sigma} T^\rho{}_\nu V^\sigma - \Gamma^\rho_{\nu\sigma} T^\mu{}_\rho V^\sigma = 0 . (B.134)
The theory of ordinary differential equations ensures that a unique solution exists if initial
conditions are provided for T µ ν at some point on the curve. In the literature, you will sometimes
so that parallel transport of our (1,1) tensor T along a curve is defined by DT^\mu{}_\nu / D\lambda = 0 and
and a similar calculation shows that the angle between two spatial vectors also remains un-
changed under parallel transport. An important consequence of Eq. (B.136) is that the timelike,
spacelike or null character of the tangent vector along a geodesic is constant along the geodesic.
Unlike a normal curve, a geodesic that is timelike (spacelike, null) at some point is timelike
(spacelike, null) everywhere. We have already seen this for the specific case of the four velocity
of a timelike geodesic: uα uα = −1. In the context of timelike curves, one can also define the
acceleration.
Def.: Let u^α be the tangent vector to a timelike curve parametrized by proper time τ. The
acceleration is
a^\mu := \frac{D u^\mu}{D\tau} = u^\rho \nabla_\rho u^\mu . (B.137)
The curve is a geodesic if aµ = 0. Geodesics are therefore the analogs of the paths of freely
moving particles in Newtonian dynamics. Note that a non-affinely parametrized geodesic
satisfies aα = f uα where f is a function.
It is instructive to contrast parallel transport in general relativity with that in special relativity.
In the Minkowski spacetime with Cartesian coordinates, Γ^ρ_{μν} = 0 and Eq. (B.134) becomes
\frac{d}{d\lambda} T^\mu{}_\nu = 0 , (B.138)
so that in Cartesian coordinates parallel transport leaves tensor components unchanged and
this result is independent of the curve we choose between the points p and q. This is a key difference
between special and general relativity. As we shall see in Sec. B.8.4 below, in a curved spacetime parallel
transport of a tensor from p to q depends on which curve we choose.
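The path dependence can be made concrete with a small numerical experiment. The sketch below is an illustration under stated assumptions: it uses the unit 2-sphere with its standard Christoffel symbols Γ^θ_{φφ} = −sin θ cos θ and Γ^φ_{θφ} = cot θ, specializes Eq. (B.134) to a vector Z, and integrates it with a classical Runge-Kutta scheme once around a circle of constant θ. The function name and step count are hypothetical choices.

```python
import math


def transport_around_latitude(theta0, z0, steps=2000):
    """Parallel transport Z = (Z^theta, Z^phi) once around the circle
    theta = theta0, phi = lam on the unit sphere, using
    dZ^mu/dlam = -Gamma^mu_{rho sigma} Z^rho V^sigma with V = d/dphi."""
    s, c = math.sin(theta0), math.cos(theta0)

    def rhs(z):
        zt, zp = z
        # Only Gamma^theta_{phi phi} = -s*c and Gamma^phi_{theta phi} = c/s contribute.
        return (s * c * zp, -(c / s) * zt)

    h = 2.0 * math.pi / steps
    z = tuple(z0)
    for _ in range(steps):
        # One classical fourth-order Runge-Kutta step.
        k1 = rhs(z)
        k2 = rhs((z[0] + 0.5 * h * k1[0], z[1] + 0.5 * h * k1[1]))
        k3 = rhs((z[0] + 0.5 * h * k2[0], z[1] + 0.5 * h * k2[1]))
        k4 = rhs((z[0] + h * k3[0], z[1] + h * k3[1]))
        z = (z[0] + h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0,
             z[1] + h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0)
    return z


z_equator = transport_around_latitude(math.pi / 2, (1.0, 0.0))
z_sixty = transport_around_latitude(math.pi / 3, (1.0, 0.0))
print(z_equator, z_sixty)
```

Along the equator, which is a geodesic, the vector returns unchanged. Around the circle θ = π/3 the transported vector rotates by the angle 2π cos θ = π relative to the coordinate basis, so it comes back pointing in the opposite direction even though it was kept parallel at every step.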
Def.: Let M be a manifold with connection Γ and let p ∈ M. The exponential map is defined as
e : T_p(M) \to M , \qquad X_p \mapsto q , (B.139)
where q is the point a unit affine parameter distance along the geodesic through p with
tangent vector X_p.
Def.: Let (e_μ) be a basis of T_p(M). Normal coordinates in a neighborhood of p ∈ M are defined
as the coordinate chart that assigns to q = e(X_p) ∈ M the components X^μ of the vector X_p.
Note that this definition does not completely specify the coordinates. We still have the freedom
to choose a basis for Tp (M).
Next, we will investigate how normal coordinates can be used to control the metric components
at p.
Lemma: In normal coordinates constructed around the point p, the connection at p satisfies
Γµ(νρ) = 0. If the connection is torsion free, we furthermore have Γµνρ = 0.
Proof: In item (2) of the above set of comments we saw that the exponential map (B.139)
maps the vector λX p to the point an affine parameter distance λ along the geodesic
through X p . In the neighborhood of p, the affinely parametrized geodesic is therefore
given by
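C : [0, 1] \to M , \qquad \lambda \mapsto x^\mu(\lambda) = \lambda X_p^\mu . (B.140)
The geodesic equation for the affine parameter λ becomes
\frac{d^2 x^\mu}{d\lambda^2} + \Gamma^\mu_{\nu\rho} \frac{dx^\nu}{d\lambda} \frac{dx^\rho}{d\lambda} = \Gamma^\mu_{\nu\rho} X_p^\nu X_p^\rho = 0 \quad \text{at } p \in M \quad \forall \, X_p \in T_p(M)
\Rightarrow \Gamma^\mu_{(\nu\rho)} = 0 . (B.141)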
If the connection is torsion free, we also have Γµ[νρ] = 0 and, hence, Γµνρ = 0.
Having chosen coordinates that lead to Γ^μ_{νρ} = 0 at p ∈ M, we will in general not find the
connection to also vanish at other points q ≠ p. It is an interesting exercise to check which
piece of the above proof breaks down at q ≠ p. We will comment on this question in the actual
lectures.
Lemma: If we have a manifold with metric g and choose the Levi-Civita connection, then in
normal coordinates
∂ρ gνσ = 0 . (B.142)
Proof: The Levi-Civita connection is torsion free, so that by the previous lemma
Γρµν = 0
Next, we symmetrize the left-hand side over σ and µ and add the result to obtain
2∂ν gσµ = 0.
Note again that this result holds at p but that in general we cannot make ∂_ν g_{σμ} vanish at other
points q ≠ p. It now remains to select the normal coordinates such that the metric components
acquire the Minkowskian values.
Lemma: Let M be a manifold with a metric g of signature 2 and Γ the Levi-Civita connection.
Then we can choose normal coordinates such that at p
Proof: We already proved the first part. For the second part, let xα be normal coordinates.
By Eq. (B.140), the point an affine parameter distance λ along a geodesic through p
with tangent X p then has coordinates λ Xpµ . Now choose an orthonormal basis (eµ )
of the tangent space Tp (M) (this can always be achieved, for example by Gram-
Schmidt orthonormalisation) and consider the special case where X p = e0 . The
point an affine parameter distance λ along the geodesic through p with tangent e0
then has coordinates
λXpµ = λ (e0 )µ = (λ, 0, 0, 0) .
So the geodesic curve has coordinates xµ (λ) = (λ, 0, 0, 0). But in any coordinate
system, the tangent vector to the curve (λ, 0, 0, 0) is ∂0 = ∂/∂x0 , so that ∂0 = e0 .
We likewise show ∂µ = eµ . It follows that the {∂µ } form an orthonormal basis and,
hence,
gµν = g(∂µ , ∂ν ) = ηµν . (B.145)
In summary, we can choose coordinates such that at p ∈ M the metric is Minkowskian and the
connection coefficients vanish.
Def.: We call a coordinate frame with these properties a local inertial frame.
In a local inertial frame, we therefore recover the laws of special relativity. According to the
equivalence principle, this frame represents freely falling observers.
[V , W ]α := V µ ∂µ W α − W µ ∂µ V α , (B.146)
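Proof: Using Eqs. (B.14), (B.115) for the transformation of a vector and its partial derivative
under a change of coordinates (x^α) → (\tilde x^\mu), we find for the commutator in the new
coordinate system
\tilde V^\nu \frac{\partial \tilde W^\mu}{\partial \tilde x^\nu} - \tilde W^\nu \frac{\partial \tilde V^\mu}{\partial \tilde x^\nu}
= \frac{\partial \tilde x^\nu}{\partial x^\beta} V^\beta \frac{\partial x^\delta}{\partial \tilde x^\nu} \partial_\delta \left( \frac{\partial \tilde x^\mu}{\partial x^\gamma} W^\gamma \right) - \frac{\partial \tilde x^\nu}{\partial x^\beta} W^\beta \frac{\partial x^\alpha}{\partial \tilde x^\nu} \partial_\alpha \left( \frac{\partial \tilde x^\mu}{\partial x^\gamma} V^\gamma \right)
= \frac{\partial \tilde x^\mu}{\partial x^\gamma} V^\beta \partial_\beta W^\gamma + V^\beta W^\gamma \frac{\partial^2 \tilde x^\mu}{\partial x^\beta \partial x^\gamma} - \frac{\partial \tilde x^\mu}{\partial x^\gamma} W^\beta \partial_\beta V^\gamma - W^\beta V^\gamma \frac{\partial^2 \tilde x^\mu}{\partial x^\beta \partial x^\gamma}
= \frac{\partial \tilde x^\mu}{\partial x^\gamma} \left( V^\beta \frac{\partial W^\gamma}{\partial x^\beta} - W^\beta \frac{\partial V^\gamma}{\partial x^\beta} \right) . (B.147)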
Note that we departed here from our usual path of defining tensors as linear maps and then
deducing their transformation properties. Instead, we define the commutator through its components
and show that this definition satisfies the transformation rule of a vector under coordinate
transformations.
One straightforwardly shows that with vector fields U, V, W and a function f the commutator
satisfies
[V, W] = -[W, V] , (B.148)
[V, W + U] = [V, W] + [V, U] , (B.149)
[V, f W] = f [V, W] + V(f) \, W , (B.150)
[U, [V, W]] + [V, [W, U]] + [W, [U, V]] = 0 \qquad \text{“Jacobi Identity”} . (B.151)
because the components of these vectors are constant by construction. We state without proof
the following theorem about the inverse implication.
Theorem: If V_0, . . . , V_{m−1}, m ≤ dim(M), are vector fields that are linearly independent at
every p ∈ M and whose commutators all vanish, then we can construct coordinates
x^μ in a neighborhood of any p ∈ M such that
V_i = \frac{\partial}{\partial x^i} , \qquad i = 0, \ldots, m - 1 . (B.153)
With a torsion free connection, such as Levi-Civita, we therefore find that second covariant
derivatives of functions also commute. Note that in Eq. (B.154) we first took the outer covariant
derivative. This avoids ending up with covariant derivatives of connection coefficients which
are not well defined quantities.
Next, we consider second covariant derivatives of vectors. We find with the Levi-Civita con-
nection
− ( α ↔ β ), (B.155)
where (α ↔ β) denotes the right-hand side of the preceding lines with α and β swapped.
With Eq. (B.156), the second covariant derivative (B.155) becomes the so-called “Ricci Identity”
∇α ∇β V γ − ∇β ∇α V γ = Rγ ραβ V ρ . (B.157)
We conclude that covariant derivatives of vectors fail to commute and that the Riemann tensor
(by definition) measures this failure.
Def.: Let U, V, W be three vector fields. The Riemann tensor is the rank (1,3) tensor R with
R(U , V ) (W ) = ∇U ∇V W − ∇V ∇U W − ∇[U ,V ] W . (B.158)
R(f U , V )W = f R(U , V )W ,
R(U , f V )W = f R(U , V )W ,
So R is linear in its three vector arguments. Furthermore, the right-hand side of Eq. (B.158)
is manifestly of vector type, so that contraction with a one-form is by construction a linear
operation. Therefore, R is a tensor. In order to calculate the components, we fill the three
vector slots of R with the basis vectors, i.e. substitute in (B.158) U = eα , V = eβ and W = eρ .
We use a coordinate basis eα = ∂α so that [eα , eβ ] = 0 by Eq. (B.152),
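= (\partial_\alpha \Gamma^\nu_{\rho\beta}) e_\nu + \Gamma^\mu_{\rho\beta} \Gamma^\nu_{\mu\alpha} e_\nu - (\partial_\beta \Gamma^\nu_{\rho\alpha}) e_\nu - \Gamma^\mu_{\rho\alpha} \Gamma^\nu_{\mu\beta} e_\nu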
Equations (B.156) and (B.158) indeed define the same tensor. We have covered both defini-
tions because from case to case, either one or the other may be more convenient in practical
calculations.
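The component expression can be tested on a simple curved space. The following sketch assumes that the components are read off as R^ν_{ραβ} = ∂_α Γ^ν_{ρβ} − ∂_β Γ^ν_{ρα} + Γ^μ_{ρβ} Γ^ν_{μα} − Γ^μ_{ρα} Γ^ν_{μβ}, matching the expansion in basis vectors above, and evaluates them for the unit 2-sphere with its standard Christoffel symbols; the function names and the finite-difference step are illustrative.

```python
import math


def gamma(x):
    """Christoffel symbols G[nu][rho][beta] of the unit 2-sphere, x = (theta, phi)."""
    th = x[0]
    G = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    G[0][1][1] = -math.sin(th) * math.cos(th)            # Gamma^theta_{phi phi}
    G[1][0][1] = G[1][1][0] = math.cos(th) / math.sin(th)  # Gamma^phi_{theta phi}
    return G


def riemann(x, h=1e-5):
    """R[nu][rho][al][be] = R^nu_{rho al be}, with central differences for the derivatives."""
    n = len(x)
    G = gamma(x)

    def dG(al):
        xp, xm = list(x), list(x)
        xp[al] += h
        xm[al] -= h
        Gp, Gm = gamma(xp), gamma(xm)
        return [[[(Gp[a][b][c] - Gm[a][b][c]) / (2 * h) for c in range(n)]
                 for b in range(n)] for a in range(n)]

    d = [dG(al) for al in range(n)]  # d[al][nu][rho][be] = partial_al Gamma^nu_{rho be}
    R = [[[[0.0] * n for _ in range(n)] for _ in range(n)] for _ in range(n)]
    for nu in range(n):
        for rho in range(n):
            for al in range(n):
                for be in range(n):
                    quad = sum(G[mu][rho][be] * G[nu][mu][al]
                               - G[mu][rho][al] * G[nu][mu][be] for mu in range(n))
                    R[nu][rho][al][be] = d[al][nu][rho][be] - d[be][nu][rho][al] + quad
    return R


R = riemann([1.0, 0.3])
print(R[0][1][0][1], math.sin(1.0) ** 2)
```

For the unit sphere the component R^θ_{φθφ} should come out close to sin²θ, and swapping the last two indices flips the sign.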
(1) By definition, the Riemann tensor is antisymmetric in the last two indices
⇒ Rµ [νρσ] = 0 (B.164)
This is a tensorial equation and is therefore valid in any coordinate system. Furthermore,
the point p was arbitrary, so that Eq. (B.164) holds at all points.
(3) Again we use normal coordinates at p ∈ M and a torsion free connection, so that at p
we have Γµνρ = 0. Next, we take the partial derivative of the Riemann tensor as given in
Eq. (B.156). Because the connection vanishes, the only terms surviving in this equation
can be symbolically written as
Furthermore, the vanishing connection at p implies that covariant and partial derivatives
are the same at that point, so that
Again, this is a tensorial equation and the point p was arbitrary, so that the equality
holds in general. Note the striking similarity with the Newtonian integrability condition
(A.64).
(4) For this symmetry, we assume that the manifold is equipped with a metric and that the
connection is the Levi-Civita one. At an arbitrary point p ∈ M, using normal coordinates,
we then have ∂µ gνρ = 0. This implies
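0 = \partial_\mu \delta^\nu{}_\rho = \partial_\mu ( g^{\nu\sigma} g_{\sigma\rho} ) = g_{\sigma\rho} \partial_\mu g^{\nu\sigma} \qquad | \cdot g^{\rho\lambda}
\Rightarrow \partial_\mu g^{\nu\lambda} = 0
\Rightarrow \partial_\rho \Gamma^\lambda_{\nu\sigma} = \frac{1}{2} g^{\lambda\mu} \left( \partial_\rho \partial_\nu g_{\sigma\mu} + \partial_\rho \partial_\sigma g_{\mu\nu} - \partial_\rho \partial_\mu g_{\nu\sigma} \right)
\Rightarrow R_{\mu\nu\rho\sigma} = \frac{1}{2} \left( \partial_\rho \partial_\nu g_{\sigma\mu} + \partial_\sigma \partial_\mu g_{\nu\rho} - \partial_\sigma \partial_\nu g_{\rho\mu} - \partial_\rho \partial_\mu g_{\nu\sigma} \right) + \underbrace{\text{“}\Gamma\Gamma - \Gamma\Gamma\text{”}}_{=0}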
because gαβ is symmetric and ∂α ∂β commute. We can therefore swap the first with the
second pair of indices. Together with Eq. (B.162), we directly obtain
Note that the first of our four symmetries always holds, the second and third hold if we have
a torsion free connection, and the fourth holds if we have a metric and use the Levi-Civita
connection.
Figure 11: Integral curves and points along a closed loop along which a vector Z is parallel
transported.
[Figure labels: points p (0,0), q (δs,0), r (δs,δt), u (0,δt); the sides pq and ur are integral curves of X, the sides qr and pu are integral curves of Y.]
\lim_{\delta s, \delta t \to 0} \frac{ (Z'_p - Z_p)^\alpha }{ \delta s \, \delta t } = \left( R^\alpha{}_{\beta\mu\nu} Z^\beta Y^\mu X^\nu \right)_p . (B.171)
Proof:
Let Z p ∈ Tp (M) and (xµ ) be normal coords. at p. Because of Eq. (B.170) the integral curves
of X and Y are given by (s, 0, . . . , 0) and (0, t, 0, . . . , 0), respectively. We assume that δs
and δt are small and related by δt = aδs for a = const. We divide the closed path from p back
to p into four parts.
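(1) p → q: We transport Z_p along the curve with tangent X and parameter s, i.e. we have
∇_X Z = 0, so that
X^\sigma \nabla_\sigma Z^\mu = X^\sigma \frac{\partial Z^\mu}{\partial x^\sigma} + \Gamma^\mu_{\rho\sigma} Z^\rho X^\sigma = \frac{dZ^\mu}{ds} + \Gamma^\mu_{\rho\sigma} Z^\rho X^\sigma = 0
\Rightarrow \frac{dZ^\mu}{ds} = -\Gamma^\mu_{\rho\sigma} Z^\rho X^\sigma
\Rightarrow \frac{d^2 Z^\mu}{ds^2} = -X^\lambda \partial_\lambda \left( \Gamma^\mu_{\rho\sigma} Z^\rho X^\sigma \right) , \qquad \left[ X = X^\mu \frac{\partial}{\partial x^\mu} = \frac{d}{ds} \right] . (B.172)
Next we Taylor expand Z^μ around p and use that Γ^μ_{ρσ} = 0 at p in our normal
coordinate system,
Z_q^\mu - Z_p^\mu = \left( \frac{dZ^\mu}{ds} \right)_p \delta s + \frac{1}{2} \left( \frac{d^2 Z^\mu}{ds^2} \right)_p \delta s^2 + O(\delta s^3)
= -\frac{1}{2} \left( X^\lambda Z^\rho X^\sigma \partial_\lambda \Gamma^\mu_{\rho\sigma} \right)_p \delta s^2 + O(\delta s^3) . (B.173)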
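(2) q → r: We again use a Taylor expansion, but this time around the point q, and need to bear
in mind that the connection coefficients do not vanish at q. We obtain
Z_r^\mu - Z_q^\mu = \left( \frac{dZ^\mu}{dt} \right)_q \delta t + \frac{1}{2} \left( \frac{d^2 Z^\mu}{dt^2} \right)_q \delta t^2 + O(\delta t^3)
= -\left( \Gamma^\mu_{\rho\sigma} Z^\rho Y^\sigma \right)_q \delta t - \frac{1}{2} \left[ Y^\lambda \partial_\lambda \left( \Gamma^\mu_{\rho\sigma} Z^\rho Y^\sigma \right) \right]_q \delta t^2 + O(\delta t^3) . (B.174)
Using
\left( \Gamma^\mu_{\rho\sigma} Z^\rho Y^\sigma \right)_q \delta t = \left[ 0 + \left( Z^\rho Y^\sigma X^\lambda \partial_\lambda \Gamma^\mu_{\rho\sigma} \right)_p \delta s + O(\delta s^2) \right] \delta t , (B.175)
we find
Z_r^\mu - Z_q^\mu = -\left[ \left( Z^\rho Y^\sigma X^\lambda \partial_\lambda \Gamma^\mu_{\rho\sigma} \right)_p \delta s + O(\delta s^2) \right] \delta t - \frac{1}{2} \left\{ \left[ Y^\lambda \partial_\lambda \left( \Gamma^\mu_{\rho\sigma} Z^\rho Y^\sigma \right) \right]_p + O(\delta s) \right\} \delta t^2 + O(\delta t^3) (B.176)
= -\left( Z^\rho Y^\sigma X^\lambda \partial_\lambda \Gamma^\mu_{\rho\sigma} \right)_p \delta s \, \delta t - \frac{1}{2} \left( Z^\rho Y^\sigma Y^\lambda \partial_\lambda \Gamma^\mu_{\rho\sigma} \right)_p \delta t^2 + O(\delta s^3) .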
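(3), (4): The change of Z_p under parallel transport along the alternative route p → u → r
follows from Eq. (B.177) by simply interchanging X ↔ Y, s ↔ t,
\left( Z_r^\mu - Z_p^\mu \right)_{pur} = -\frac{1}{2} \left[ (\partial_\lambda \Gamma^\mu_{\rho\sigma}) Z^\rho \left( Y^\sigma Y^\lambda \delta t^2 + X^\sigma X^\lambda \delta s^2 + 2 X^\sigma Y^\lambda \delta t \, \delta s \right) \right]_p + O(\delta s^3) . (B.178)
The change of Z along the inverse path r → u → p is simply minus the result (B.178), so that
the change of Z_p under parallel transport along the closed loop pqrup is
Z'^\mu_p - Z^\mu_p = \left( Z_r^\mu - Z_p^\mu \right)_{pqr} - \left( Z_r^\mu - Z_p^\mu \right)_{pur} = -\left[ ( Y^\sigma X^\lambda - X^\sigma Y^\lambda ) (\partial_\lambda \Gamma^\mu_{\rho\sigma}) \right]_p Z^\rho \, \delta s \, \delta t + O(\delta s^3)
\overset{*}{=} \left( R^\mu{}_{\rho\lambda\sigma} Z^\rho Y^\lambda X^\sigma \right)_p \delta t \, \delta s + O(\delta s^3) , (B.179)
where the symbol \overset{*}{=} denotes equality in normal coordinates at p where (Γ^α_{βγ})_p = 0. Taking the
limit δs, δt → 0, we recover Eq. (B.171).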
We conclude that curvature measures the change of vectors under parallel transport along a
closed curve or, equivalently, the path dependence of parallel transport.
Def.: Let (M, Γ) be a manifold with connection. A “1-parameter family of geodesics” is a map
(ii) locally, (s, t) ↦ γ(s, t) is smooth, one-to-one and has a smooth inverse.
Figure 12: Relative geodesic acceleration illustrated for great circles on planet Earth (red
curves). Two such curves starting at the equator initially point perpendicular to the equator
but converge at the north pole. Two observers, one moving along each great circle would find
the second derivative of their separation with respect to their distance to the equator to be
negative.
Figure 13: A one-parameter family of geodesics. Curves s = const are geodesics and T µ =
dxµ /dt is their tangent vector. S = dxµ /ds is the vector pointing from one geodesic in the
direction of neighboring geodesics.
Let T be the tangent vector to the geodesics γ(s = const, t) and S the tangent vector to the
curves γ(s, t = const). In coordinates (xµ ) we can write the vectors as
T^\mu = \frac{dx^\mu}{dt} , \qquad S^\mu = \frac{dx^\mu}{ds} . (B.181)
We now consider two neighboring geodesics specified by parameters s0 and s0 + δs. These
geodesics are given by xµ (s0 , t) and xµ (s0 +δs, t) and we Taylor expand their coordinate distance
according to
xµ (s0 + δs, t) = xµ (s0 , t) + δs S µ (s0 , t) + O(δs2 ) . (B.182)
Def.: δs S is the “geodesic deviation vector” that points from one geodesic with s0 to a nearby
one with parameter s0 + δs.
⇔ T ν ∇ν (T µ ∇µ S α ) = Rα λµν T λ T µ S ν . (B.184)
Proof: We use coordinates (s, t) on the two-dimensional surface Σ spanned by the geodesics
and extend the coordinates to (s, t, . . .) in a neighborhood of Σ. In this coordinate
system, the vectors S and T have the particularly simple form
S = \frac{\partial}{\partial s} , \qquad T = \frac{\partial}{\partial t} \quad \Rightarrow \quad [S, T] = 0 , (B.185)
because the commutator vanishes for basis vectors. For a torsion free connection,
we further have for arbitrary vector fields V , W
V µ ∇µ W α − W µ ∇µ V α = V µ ∂µ W α + V µ Γαρµ W ρ − W µ ∂µ V α − W µ Γαρµ V ρ
= V µ ∂µ W α − W µ ∂µ V α = [V , W ]α , (B.186)
∇T S − ∇S T = [T , S] = 0 , (B.187)
and hence
\nabla_T \nabla_T S = \nabla_T \nabla_S T = \nabla_S \underbrace{\nabla_T T}_{=0} + R(T, S) T , (B.188)
where we have used the definition (B.158) of the Riemann tensor and the geodesic
equation ∇T T = 0.
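The situation of Fig. 12 gives a simple consistency check. The example below is an added illustration, assuming the unit 2-sphere, meridians labelled by s = φ and parametrized by the arc length t measured from the equator (so that θ = π/2 − t), and using that for a unit tangent T orthogonal to S the unit sphere satisfies R(T, S)T = −S.

```latex
\begin{align*}
S &= \frac{\partial}{\partial \phi} , \qquad
  |S| = \sqrt{g_{\phi\phi}} = \sin\theta = \cos t , \\
d(t) &:= \delta s \, |S| = \delta s \cos t
  \qquad \text{(proper separation of the meridians } s_0 \text{ and } s_0 + \delta s\text{)} , \\
\frac{d^2 d}{dt^2} &= -\delta s \cos t = -d(t) < 0 \qquad \text{for } 0 \le t < \frac{\pi}{2} .
\end{align*}
```

The separation of neighboring meridians thus decelerates at a rate proportional to the separation itself and vanishes at the pole t = π/2, which is the behaviour one expects from \nabla_T \nabla_T S = R(T, S)T = -S on the unit sphere.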
The “Einstein tensor” is G_{\alpha\beta} := R_{\alpha\beta} - \frac{1}{2} g_{\alpha\beta} R .
A very important relation is obtained from the Bianchi identity (B.167),
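$$ R_{\alpha\beta[\gamma\delta;\mu]} = 0 \quad \big| \; \cdot\, g^{\alpha\gamma} g^{\beta\delta} $$
$$ \Rightarrow \quad \frac{1}{6}\, g^{\alpha\gamma} g^{\beta\delta} \Big( R_{\alpha\beta\gamma\delta;\mu} + R_{\alpha\beta\delta\mu;\gamma} + R_{\alpha\beta\mu\gamma;\delta} - \underbrace{R_{\alpha\beta\delta\gamma;\mu}}_{=-R_{\alpha\beta\gamma\delta;\mu}} - R_{\alpha\beta\gamma\mu;\delta} - R_{\alpha\beta\mu\delta;\gamma} \Big) = 0 $$
$$ \Rightarrow \quad \frac{1}{3}\, g^{\alpha\gamma} g^{\beta\delta} \left( R_{\alpha\beta\gamma\delta;\mu} + R_{\alpha\beta\delta\mu;\gamma} + R_{\alpha\beta\mu\gamma;\delta} \right) = 0 $$
$$ \Rightarrow \quad R_{;\mu} - g^{\alpha\gamma} R_{\alpha\mu;\gamma} - g^{\beta\delta} R_{\beta\mu;\delta} = 0 $$
$$ \Rightarrow \quad \nabla_\mu R - 2 \nabla_\gamma R^\gamma{}_\mu = -2 \nabla^\gamma \left( R_{\gamma\mu} - \frac{1}{2} g_{\gamma\mu} R \right) = 0 $$
$$ \Rightarrow \quad \nabla^\mu G_{\mu\alpha} = 0 . \tag{B.189} $$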
This relation is called the “contracted Bianchi identity” and bears a striking similarity to the
Newtonian integrability condition (A.64).
C PHYSICAL LAWS IN CURVED SPACETIMES 75
Proposal: In general relativity, the laws of physics are stated in terms of tensorial equations and,
thus, are invariant under coordinate transformations. The laws are obtained from those
in special relativity by making the following substitutions,
(1) The Minkowski metric is replaced by the spacetime metric: ηµν → gµν .
(2) Partial derivatives are replaced by covariant derivatives: ∂ → ∇ .
Example: The Maxwell equations are conveniently formulated in terms of the antisymmetric
Maxwell tensor $F_{\mu\nu} = F_{[\mu\nu]}$ related to the components $E_i$, $B_i$ of the electric and
magnetic field by
$$ F_{0i} = -E_i , \qquad F_{ij} = \epsilon_{ijk} B_k , \tag{C.1} $$
where i, j, . . . = 1, 2, 3 and $\epsilon_{ijk}$ is the completely antisymmetric symbol. The
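vacuum Maxwell equations in special relativity are
$$ \eta^{\mu\nu} \partial_\mu F_{\nu\rho} = 0 , \qquad \partial_{[\mu} F_{\nu\rho]} = 0 . \tag{C.2} $$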
The covariance principle predicts that the Maxwell equations in curved spacetimes
are given by
$$ g^{\mu\nu} \nabla_\mu F_{\nu\rho} = 0 , \qquad \nabla_{[\mu} F_{\nu\rho]} = 0 . \tag{C.3} $$
Note, however, that this covariance recipe is not unique. The Riemann and Ricci tensors
vanish in special relativity, so that we could add terms involving them to the general relativistic
equations without changing the corresponding special relativistic limit.
We define the energy momentum tensor in general terms as follows. Let $x^\alpha$ be a coordinate system.
Then
$$ T^{\alpha\beta} := \text{flux of } \alpha \text{ momentum across a surface of constant } x^\beta . \tag{C.5} $$
Recall that the tensor components are defined by filling the tensor slots with the basis one-forms,
$T^{\alpha\beta} = T(dx^\alpha, dx^\beta)$. The components can be interpreted in a more intuitive manner by
assuming that $x^0 = t$ is a timelike coordinate and $x^i$, i = 1, 2, 3 are spatial coordinates, so
that
where all fluxes are measured by an observer momentarily at rest in a local inertial frame co-
moving with the matter element at point p.
The construction of the energy momentum tensor often follows the covariance principle. We
start with normal coordinates, find the energy momentum tensor from the local laws in special
relativity and then generalize the tensor to arbitrary coordinates using the coordinate invari-
ance of tensors. We will discuss below some of the most important types of matter used in
applications of general relativity.
C.2.1 Particles
We begin this discussion with the special case of point particles which are not fully consistent
forms of matter in general relativity because a finite amount of mass-energy contained inside
an infinitesimally small volume will be a black hole. Nevertheless, point particles are a very
useful concept and provide a good description of small objects that barely backreact on the
spacetime geometry. They are exceptional in this discussion because they are not of continuous
nature and are therefore conveniently described in terms of the four-momentum rather than the
energy momentum tensor. Using an energy momentum tensor with δ distributions representing
the particles would ultimately lead to the same relations that we develop here.
In special relativity (cf. Sec. A.3.5) we saw that the four-momentum of a point particle of rest
mass m in some given frame can be written as
$$ p^\mu = m u^\mu = (E, p^i) , \tag{C.7} $$
where $u^\mu$ is the particle’s four-velocity in this frame and E and $p^i$ are the particle’s energy
and linear momentum in this frame. An observer at rest in this frame has four-velocity
$w^\mu = (1, 0, 0, 0)$ and measures the particle’s energy as
$$ E = -\eta_{\mu\nu} w^\mu p^\nu . \tag{C.8} $$
The right-hand side is Lorentz invariant, but note that E is the particle’s energy in the
observer’s rest frame. The particle’s rest mass can be expressed as
$$ \eta_{\mu\nu} p^\mu p^\nu = -E^2 + \vec{p}^{\,2} = -m^2 . \tag{C.9} $$
By the covariance principle, these equations only change by substituting the metric $g_{\mu\nu}$ for the
Minkowski metric, so that
$$ m^2 = -g_{\alpha\beta}\, p^\alpha p^\beta , \tag{C.10} $$
$$ E = -g_{\alpha\beta}\, w^\alpha p^\beta . \tag{C.11} $$
A more important difference is that in general relativity Eq. (C.11) is only well defined if the
vectors $w^\alpha$ and $p^\beta$ are at the same point of the manifold; we have no recipe for multiplying
vectors at different points and, unlike in special relativity, parallel transport is path dependent.
An observer can therefore only measure the energy of the particle by being at the same location
in the spacetime.
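As a small numerical illustration of Eqs. (C.10) and (C.11), the following sketch evaluates both contractions for a particle moving with speed v along x, as seen by an observer at rest. It is only a sketch under stated assumptions: the metric is taken to be Minkowski in units with c = 1, and the helper names (`inner`, `four_momentum`, `measured_energy`, `rest_mass`) are hypothetical and not part of these notes. Passing a different metric evaluated at the common point of observer and particle would work in the same way.

```python
import math

# Minkowski metric eta_{mu nu} with signature (-,+,+,+), units with c = 1 (assumption).
ETA = [[-1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, 1.0]]

def inner(g, a, b):
    """Contraction g_{alpha beta} a^alpha b^beta of two vectors at the same point."""
    return sum(g[i][j] * a[i] * b[j] for i in range(4) for j in range(4))

def four_momentum(m, v):
    """p^mu = m u^mu for a particle of rest mass m moving with speed v along x."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return [m * gamma, m * gamma * v, 0.0, 0.0]

def measured_energy(g, w, p):
    """Energy measured by an observer with four-velocity w, Eq. (C.11)."""
    return -inner(g, w, p)

def rest_mass(g, p):
    """Rest mass from the mass-shell relation, Eq. (C.10)."""
    return math.sqrt(-inner(g, p, p))

observer = [1.0, 0.0, 0.0, 0.0]      # observer at rest in this frame
p = four_momentum(m=1.0, v=0.6)      # gamma = 1.25 for v = 0.6
print(measured_energy(ETA, observer, p), rest_mass(ETA, p))
```

Both contractions only make sense because `observer` and `p` are understood to live at the same spacetime point, which is exactly the restriction discussed above.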
Here $j_i$ is the so-called Poynting vector and also describes the energy flux. The conservation
laws for energy and momentum density follow from the Maxwell equations and are
$$ \frac{\partial \epsilon}{\partial t} + \partial_i j_i = 0 , \qquad \frac{\partial j_i}{\partial t} + \partial_j S_{ij} = 0 . \tag{C.15} $$
In special relativity, these equations are conveniently formulated in terms of the energy momentum
tensor given in an inertial frame by (recall the example in Sec. C.1 for the Maxwell
tensor $F_{\mu\nu}$)
$$ T_{\mu\nu} = \frac{1}{4\pi} \left( F_{\mu\rho} F_\nu{}^\rho - \frac{1}{4} F^{\rho\sigma} F_{\rho\sigma}\, \eta_{\mu\nu} \right) = T_{\nu\mu} . \tag{C.16} $$
With the identification
$$ T_{00} = \epsilon , \qquad T_{0i} = -j_i , \qquad T_{ij} = S_{ij} , \tag{C.17} $$
the conservation equations (C.15) can be shown to be equivalent to
The general relativistic analog follows straightforwardly from the covariance principle. The
energy momentum tensor and its conservation are given by
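$$ T_{\alpha\beta} = \frac{1}{4\pi} \left( F_{\alpha\gamma} F_\beta{}^\gamma - \frac{1}{4} F^{\gamma\delta} F_{\gamma\delta}\, g_{\alpha\beta} \right) , \tag{C.19} $$
$$ \nabla^\alpha T_{\alpha\beta} = 0 . \tag{C.20} $$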
Let us now consider an observer O with four-velocity $U^\alpha$ and a local inertial frame at point
p ∈ M where O is at rest. We can then construct an orthonormal basis starting with the
timelike basis vector $e_0 := U$ and then choosing three spatial vectors $e_i$ that are orthogonal to
U and to each other and have unit length. By the equivalence principle, we can use the laws
of special relativity in this frame and, using Eq. (C.17), obtain
C.2.3 Dust
The simplest type of continuous matter is the so-called dust, defined as the continuum limit of
a collection of non-interacting particles of rest mass m with a number density in the rest frame
denoted by n. It is often convenient to define a fluid element or, in this case, a dust element
as an infinitesimally small volume of particles with rest-frame density n.
The dust evolves purely under gravitational interaction, so that an observer comoving with
a dust element is, by definition, freely falling. In a locally comoving inertial frame, both the
particles and the observer are moving with four-velocity $u^\mu = (1, 0, 0, 0)$, the metric is locally
Minkowskian and the energy density is ρ = mn. Since the particles are not moving in this
frame, the momentum density is zero, $T^{i0} = 0$. Furthermore, the particles are not interacting,
so no energy-momentum can be transferred in spatial directions, i.e. $T^{ij} = 0$, $T^{0j} = 0$. In this
frame, the energy momentum tensor for dust is therefore given by
Here, m is merely a constant number and n, defined as the number density in the particles’
rest frame is a scalar, so that the equation is tensorial and therefore valid in every coordinate
system.
Def.: A perfect fluid is a continuous matter distribution that has no viscosity and no heat
conduction in the locally comoving frame.
The form of the energy momentum tensor for this type of matter follows from looking more
closely at the meaning of “no viscosity” and “no heat conduction”.
No heat conduction: If the total energy m of a particle contains some internal energy, we
require that this internal energy is not transferred to another particle. Energy can therefore
only flow if the particles themselves flow.
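Here, the last equality follows from the fact that in the comoving frame, $u^\alpha = (1, 0, 0, 0)$
in special relativity. The general relativistic expression follows from replacing $\eta^{\alpha\beta}$ with $g^{\alpha\beta}$
according to the covariance principle, so that
$$ T^{\alpha\beta} = (\rho + P)\, u^\alpha u^\beta + P g^{\alpha\beta} . \tag{C.27} $$
It is instructive to consider the implications of the energy conservation law $\nabla_\alpha T^{\alpha\beta} = 0$ for the
perfect fluid (C.27). We find
$$ \nabla_\alpha T^{\alpha\beta} = (\partial_\alpha \rho + \partial_\alpha P)\, u^\alpha u^\beta + (\rho + P) \left( u^\beta \nabla_\alpha u^\alpha + u^\alpha \nabla_\alpha u^\beta \right) + (\partial_\alpha P)\, g^{\alpha\beta} \overset{!}{=} 0 . \tag{C.28} $$
$$ \Rightarrow \quad u^\alpha \nabla_\alpha \rho + (\rho + P) \nabla_\alpha u^\alpha = 0 . \tag{C.29} $$
We can use this result to substitute for the first “(ρ + P )” term in Eq. (C.28), so that
$$ (\partial_\alpha \rho + \partial_\alpha P)\, u^\alpha u^\beta + u^\beta \left( -u^\alpha \nabla_\alpha \rho \right) + (\rho + P)\, u^\alpha \nabla_\alpha u^\beta + \nabla^\beta P = 0 $$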
By taking the Newtonian limit, one can indeed show that Eqs. (C.29) and (C.30) become the
law of mass conservation and the Euler equation of fluid dynamics. In order to model perfect
fluid sources, one needs one additional ingredient that is not provided by general relativity: an
equation of state relating pressure P and energy density ρ. This equation of state describes the
For the case of dust, i.e. P = 0, we see that Eq. (C.30) merely implies $u^\alpha \nabla_\alpha u^\beta = 0$, so that the
dust particles move on geodesics. This is expected since they are non-interacting and, hence,
freely falling.
It may have been noticed that all cases discussed here resulted in a symmetric energy momentum
tensor, $T_{\alpha\beta} = T_{\beta\alpha}$. This is not trivially obvious but can be shown to hold in general for the
energy momentum tensor. For example, energy flux in the $x^i$ direction is by construction energy
density × the velocity with which it flows in the $x^i$ direction. This product, however, can be
rewritten as mass-energy × velocity / volume, i.e. momentum density, and we have recovered
$T^{0i} = T^{i0}$. The symmetry of $T^{ij}$ can also be shown to hold generally. You may have come
across the Newtonian limit of this symmetry: the stress tensor $t_{ij}$ in Newtonian dynamics is
symmetric. Readers interested in more details about the energy momentum tensor are referred
to Chapter 4 of Schutz’ book [24].
$$ G_{\alpha\beta} = R_{\alpha\beta} - \frac{1}{2} R\, g_{\alpha\beta} = \frac{8\pi G}{c^4}\, T_{\alpha\beta} , \tag{C.31} $$
where we have restored the speed of light c and Newton’s gravitational constant G.
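Comments: (i) The proportionality factor $8\pi G/c^4$ is obtained from taking the Newtonian
limit of the Einstein equations. We will return to this point in Sec. G.3
below.
(ii) Einstein’s first guess at the field equations was $R_{\alpha\beta} = \kappa T_{\alpha\beta}$ with κ = const.
The contracted Bianchi identities (B.189), however, imply $\nabla^\alpha G_{\alpha\beta} = 0$, so
that
$$ \nabla^\alpha R_{\alpha\beta} - \frac{1}{2} g_{\alpha\beta} \nabla^\alpha R = \underbrace{\kappa \nabla^\alpha T_{\alpha\beta}}_{=0} - \frac{1}{2} g_{\alpha\beta} \nabla^\alpha R = 0 $$
$$ \Rightarrow \quad \nabla_\alpha R = 0 \quad \Rightarrow \quad \nabla_\alpha T = 0 . \tag{C.32} $$
$$ G_{\alpha\beta} = R_{\alpha\beta} - \frac{1}{2} g_{\alpha\beta} R = 0 \quad \big| \; \cdot\, g^{\alpha\beta} $$
$$ \Rightarrow \quad R = 0 \quad \Rightarrow \quad R_{\alpha\beta} = 0 . \tag{C.33} $$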
We conclude this discussion with Lovelock’s theorem and an important modification to the
Einstein equations, long regarded as Einstein’s biggest mistake, but by now rejuvenated to the
status of critical importance for some of relativity’s most important applications.
(1) In any coordinates and at every p ∈ M, $H_{\alpha\beta}$ is a function only of the metric, its
first and its second partial derivatives.
(2) $\nabla^\alpha H_{\alpha\beta} = 0$ .
(3) $H_{\alpha\beta}$ is linear in the second partial derivatives of the metric $\partial_\sigma \partial_\rho g_{\mu\nu}$.
$$ G_{\alpha\beta} + \Lambda g_{\alpha\beta} = \frac{8\pi G}{c^4}\, T_{\alpha\beta} , \tag{C.35} $$
where Λ is the cosmological constant presently estimated from observations to be about
$\Lambda^{-1/2} \approx 10^9$ light years. As can be seen from Eq. (C.27) for the energy momentum tensor of a perfect
fluid, the cosmological constant term in the Einstein equation is equivalent to a matter source
of perfect fluid type with an equation of state
$$ \rho = -P = \frac{\Lambda c^4}{8\pi G} . \tag{C.36} $$
This form of matter is called dark energy and trying to understand its nature is the subject of
considerable contemporary research. Note, however, that the interpretations as matter and as a
cosmological constant term are mathematically indistinguishable.
D THE SCHWARZSCHILD SOLUTION AND CLASSIC TESTS OF GR 84
Def.: A spacetime (M, g) is “symmetric in a variable s” if there exist coordinates $x^\alpha$ such that one
of the $x^\alpha = s$ and the metric components are independent of s in this coordinate system.
Def.: A spacetime (M, g) is “stationary” if there exist coordinates $x^\alpha$ such that $x^0$ is a timelike
coordinate and the metric components $g_{\alpha\beta}$ do not depend on $x^0$.
Def.: A spacetime (M, g) is “static” if it is stationary and in that coordinate system $g_{0i} = 0$ for
i = 1, 2, 3.
In order to better understand the difference between stationary and static spacetimes, let us
Under reversal of the time direction, t → −t, the line element changes to
i.e. $ds^2$ is invariant under time reversal for static spacetimes with $g_{0i} = 0$ but not for stationary
spacetimes with $g_{0i} \neq 0$.
Think of a pipe through which a fluid is flowing. If the fluid has the same constant density and
velocity at every point, the flow is stationary; the system looks the same tomorrow as today.
Under time reversal, however, the flow would change direction. The system is not static unless
the flow velocity is zero.
0 ≤ θ ≤ π, −π < φ ≤ π. (D.3)
Spherical symmetry of the spacetime implies that the proper distance between these two points
does not change under rotations. It can be shown that this condition implies that the angular
part of the line element is given by the metric on a 2-sphere: $d\theta^2 + \sin^2\theta\, d\phi^2$.
Furthermore, we demand that the line element does not change under reflection of the angular
coordinates θ → π − θ, φ → −φ. This implies that all metric cross terms involving the θ or φ
component vanish. There must then exist a coordinate system $x^\alpha = (\tilde{t}, \tilde{r}, \theta, \phi)$ such that the
spacetime metric is
where $\tilde{A}, \tilde{B}, \tilde{C}, \tilde{D}$ are functions of $(\tilde{t}, \tilde{r})$ and $\tilde{D} > 0$.
We next define a new radial coordinate by $r := \sqrt{\tilde{D}}$, so that
$$ ds^2 = -\hat{A}(\tilde{t}, r)\, d\tilde{t}^2 + 2\hat{B}(\tilde{t}, r)\, d\tilde{t}\, dr + \hat{C}(\tilde{t}, r)\, dr^2 + r^2 (d\theta^2 + \sin^2\theta\, d\phi^2) . \tag{D.5} $$
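The theory of ordinary differential equations tells us that there exists an integrating factor
$I(\tilde{t}, r)$ such that we can rewrite the expression (D.6) as a total differential
$$ d\hat{t} = I(\tilde{t}, r) \left[ -\hat{A}(\tilde{t}, r)\, d\tilde{t} + \hat{B}(\tilde{t}, r)\, dr \right] $$
$$ \Rightarrow \quad -\hat{A}\, d\tilde{t}^2 + 2\hat{B}\, d\tilde{t}\, dr = -\frac{1}{\hat{A} I^2}\, d\hat{t}^2 + \frac{\hat{B}^2}{\hat{A}}\, dr^2 $$
$$ \Rightarrow \quad ds^2 = -\frac{d\hat{t}^2}{\hat{A} I^2} + \left( \hat{C} + \frac{\hat{B}^2}{\hat{A}} \right) dr^2 + r^2 (d\theta^2 + \sin^2\theta\, d\phi^2) $$
$$ \Rightarrow \quad ds^2 = -j(\hat{t}, r)\, d\hat{t}^2 + k(\hat{t}, r)\, dr^2 + r^2 (d\theta^2 + \sin^2\theta\, d\phi^2) , \tag{D.7} $$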
where in the last step we merely renamed the free functions in a more convenient manner. Note
that up to this point, we have only used the coordinate freedom to adapt the line element to the
spherical symmetry. In order to make further progress, we need to use the Einstein equations.
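A straightforward calculation leads to the non-vanishing components of the Ricci tensor
$$ R_{\hat{t}}{}^{\hat{t}} = \frac{r\, \partial_r k + k^2 - k}{k^2 r^2} \overset{!}{=} 0 , \qquad R_{\hat{t}}{}^{r} = \frac{\partial_{\hat{t}} k}{k^2 r} = 0 , $$
$$ R_{r}{}^{\hat{t}} = -\frac{\partial_{\hat{t}} k}{j k r} = 0 , \qquad R_{r}{}^{r} = \frac{-r\, \partial_r j + jk - j}{-j k r^2} = 0 . \tag{D.8} $$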
The equations for $R_{r}{}^{\hat{t}}$ and $R_{\hat{t}}{}^{r}$ show that k only depends on r. Next, we solve $R_{\hat{t}}{}^{\hat{t}} = 0$ for the
function k. Making the Ansatz $k = r/(r - 2M)$, M = const turns out to give a solution. Plugging
this result for k into the component $R_{r}{}^{r}$ gives us
$$ -r\, \partial_r j + jk - j = 0 $$
$$ \Rightarrow \quad r\, \partial_r j - j\, \frac{r}{r - 2M} + j = 0 $$
$$ \Rightarrow \quad r (r - 2M)\, \partial_r j - 2M j = 0 . \tag{D.9} $$
Again, knowing the solution simplifies our task, so we make the Ansatz $j = (r - 2M) f(\hat{t})/r$
which turns out to solve Eq. (D.9). Requiring a metric with Lorentzian signature implies that
the otherwise arbitrary $f(\hat{t}) > 0$. Finally, we rescale the time coordinate through $dt = \sqrt{f(\hat{t})}\, d\hat{t}$
and obtain the Schwarzschild metric
$$ ds^2 = -\left( 1 - \frac{2M}{r} \right) dt^2 + \left( 1 - \frac{2M}{r} \right)^{-1} dr^2 + r^2 (d\theta^2 + \sin^2\theta\, d\phi^2) . \tag{D.10} $$
• For large values of the radius r, the Schwarzschild metric approaches the Minkowski metric.
This property is called asymptotic flatness.
• Even though we did not require any specific time dependence of the solution it turns out
to be static.
This result is known as Birkhoff’s theorem.
Theorem: Any spherically symmetric solution of the vacuum Einstein equations is given by the
Schwarzschild metric and is therefore necessarily static and asymptotically flat.
The parameter M can be shown to denote the total mass-energy of the spacetime, the so-called
Arnowitt-Deser-Misner or ADM mass [4] that coincides with the black-hole mass as defined
through the apparent horizon. These concepts are beyond the scope of our course but more
details may be found in [13, 30].
Note that the Schwarzschild metric (D.10) also describes the exterior of spherically symmetric
stars; in its derivation we required the spacetime to be spherically symmetric but of vacuum
nature only at those points where we calculated the solution. The metric inside a spherically
symmetric matter distribution will differ from the Schwarzschild metric, but in the exterior
vacuum, Eq. (D.10) is the solution.
We have a good deal more to say about the Schwarzschild metric but we leave that to a later
section and first explore the geodesics in this spacetime.
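where the dot denotes d/dλ. First we consider the θ component of the Euler-Lagrange equation
$$ \frac{d}{d\lambda} \left( \frac{\partial \hat{L}}{\partial \dot\theta} \right) - \frac{\partial \hat{L}}{\partial \theta} = 2 r^2 \ddot\theta + 4 r \dot{r} \dot\theta - 2 r^2 \sin\theta \cos\theta\, \dot\phi^2 = 0 $$
$$ \Rightarrow \quad \ddot\theta + 2\, \frac{\dot{r} \dot\theta}{r} - \sin\theta \cos\theta\, \dot\phi^2 = 0 . \tag{D.12} $$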
We can always rotate our coordinate system such that the geodesic starts at θ = π/2 with
θ̇ = 0. From Eq. (D.12) we then find θ = π/2 along the entire geodesic. We can therefore set
θ = π/2 without loss of generality for all geodesics and shall do so in the remainder of this
section.
The calculation of geodesic curves is further simplified by recalling Noether’s theorem from
Sec. B.3.2 and employing the resulting constants of motion. We have three such constants,
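$$ \text{(i)} \quad \frac{\partial \hat{L}}{\partial t} = 0 \quad \Rightarrow \quad C_1 = \frac{\partial \hat{L}}{\partial \dot{t}} = -2 \left( 1 - \frac{2M}{r} \right) \dot{t} =: -2E , \tag{D.13} $$
$$ \text{(ii)} \quad \frac{\partial \hat{L}}{\partial \phi} = 0 \quad \Rightarrow \quad C_2 = \frac{\partial \hat{L}}{\partial \dot\phi} = 2 r^2 \sin^2\theta\, \dot\phi = 2 r^2 \dot\phi =: 2L , \tag{D.14} $$
$$ \text{(iii)} \quad \frac{\partial \hat{L}}{\partial \lambda} = 0 \quad \Rightarrow \quad C_3 = -\left( 1 - \frac{2M}{r} \right) \dot{t}^2 + \left( 1 - \frac{2M}{r} \right)^{-1} \dot{r}^2 + r^2 \dot\phi^2 =: Q . \tag{D.15} $$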
Recall that the third constant of motion is $Q = \hat{L} = g_{\alpha\beta}\, \dot{x}^\alpha \dot{x}^\beta$, so that Q = −1 if the geodesic
is timelike and we choose proper time for the parametrization, λ = τ. Likewise, Q = 1 if
we have a spacelike geodesic and parametrize it with proper distance λ = s, and Q = 0 if the
geodesic is null. Recall that geodesics do not change their timelike, spacelike or null character.
To summarize, we have the following constants of motion
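$$ E = \left( 1 - \frac{2M}{r} \right) \dot{t} , \tag{D.16} $$
$$ L = r^2 \dot\phi , \tag{D.17} $$
$$ Q = -\left( 1 - \frac{2M}{r} \right) \dot{t}^2 + \left( 1 - \frac{2M}{r} \right)^{-1} \dot{r}^2 + r^2 \dot\phi^2 = \begin{cases} -1 & \text{timelike} \\ 0 & \text{null} \\ 1 & \text{spacelike} \end{cases} . \tag{D.18} $$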
In order to identify the physical significance of the constants E and L for timelike geodesics,
we consider the weak-field limit $r \gg M$. The Schwarzschild metric approaches the Minkowski
limit in this case and we are in the regime of special relativity. In this limit,
$$ E = \dot{t} = \frac{dt}{d\tau} , \tag{D.19} $$
where t is the time measured by an observer at rest and τ the proper time along the particle’s
world line. In special relativity the two are related by Eq. (A.106), i.e. dt/dτ = γ, so that
$$ L = r^2 \dot\phi = r^2 \gamma\, \frac{d\phi}{dt} . \tag{D.20} $$
If we denote the particle mass by m, we can write this as
$$ E\, m = m \gamma = \text{relativistic mass energy} , \tag{D.21} $$
$$ L\, m = m \gamma r^2\, \frac{d\phi}{dt} = \text{relativistic angular momentum} , \tag{D.22} $$
so that E and L denote the energy and angular momentum per unit mass, respectively, of the
particle.
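Now we insert Eqs. (D.16), (D.17) into the equation (D.18) for Q and obtain
$$ -E^2 + \dot{r}^2 + \frac{1}{r^2} \left( 1 - \frac{2M}{r} \right) L^2 = \left( 1 - \frac{2M}{r} \right) Q $$
$$ \Rightarrow \quad \frac{1}{2} \dot{r}^2 + V(r) = \frac{1}{2} E^2 , \qquad V(r) = \frac{1}{2} \left( 1 - \frac{2M}{r} \right) \left( \frac{L^2}{r^2} - Q \right) . \tag{D.23} $$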
and has SI units of $(\mathrm{m/s})^2$. After multiplication with the particle mass m, this gives N m in
agreement with the kinetic energy term $m \dot{r}^2/2$. There remains the term mE which we already
identified as the relativistic mass. By Einstein’s famous $E = mc^2$, this term acquires the
dimension of energy after multiplication with $c^2$. Equation (D.23) written in SI units therefore
becomes
$$ \frac{1}{2} m \dot{r}^2 + \frac{m}{2} \left( 1 - \frac{2GM}{c^2 r} \right) \left( \frac{L^2}{r^2} - Q c^2 \right) = \frac{1}{2} E^2 m c^2 . \tag{D.25} $$
There is some freedom in absorbing factors of c in the constants of motion by redefining,
for example, $\tilde{E} := cE$ or similar. Any such redefinition is equivalent to (D.25).
Now let us derive the Newtonian counterpart of this equation. It is obtained from energy
conservation. The Newtonian kinetic energy has two contributions, a radial and an angular
one,
$$ E_{\rm kin} = \frac{1}{2} m \dot{r}^2 + \frac{1}{2} m r^2 \dot\phi^2 = \frac{1}{2} m \dot{r}^2 + \frac{m}{2} \frac{L^2}{r^2} , \tag{D.26} $$
where we defined the Newtonian angular momentum per unit mass $L := r^2 \dot\phi$. Note that the
dot denotes d/dt here, since we do not distinguish between proper time and coordinate time in
Newtonian dynamics. The potential energy of a particle in a spherically symmetric field is
$$ E_{\rm pot} = -G\, \frac{M m}{r} . \tag{D.27} $$
If we denote by E the total energy per unit mass, conservation of the energy $E_{\rm kin} + E_{\rm pot}$ gives us
$$ \frac{1}{2} m \dot{r}^2 + \frac{1}{2} m \frac{L^2}{r^2} - G\, \frac{M m}{r} = m E = \text{const} , \tag{D.28} $$
which we contrast with the relativistic Eq. (D.25) slightly rearranged as
$$ \frac{1}{2} m \dot{r}^2 + \frac{1}{2} m \frac{L^2}{r^2} + Q\, G\, \frac{M m}{r} - \frac{G}{c^2} \frac{m M L^2}{r^3} = \frac{1}{2} (E^2 + Q)\, m c^2 = \text{const} . \tag{D.29} $$
In the weak-field regime, we had E = γ, so that for small v and setting Q = −1
$$ E^2 - 1 = \frac{1}{1 - v^2/c^2} - 1 \approx 1 + \frac{v^2}{c^2} - 1 = \frac{v^2}{c^2} \quad \Rightarrow \quad \frac{1}{2} m \dot{r}^2 + \frac{1}{2} m \frac{L^2}{r^2} = \frac{1}{2} m v^2 , \tag{D.30} $$
so in the limit of negligible gravitational field and low velocities, the relativistic equation merely
reduces to the Newtonian kinetic energy balance. It just happens that the term E which we
interpret as the relativistic energy in the absence of gravity enters the full blown geodesic
equation of general relativity in the form $(E^2 + Q)/2$.
Comparing the Newtonian and the relativistic equations (D.28) and (D.29), we see that they
merely differ by the extra term $-G M L^2/(c^2 r^3)$ in the relativistic equation. For the following
discussion it is convenient to write the two equations as follows
$$ \frac{1}{2} \dot{r}^2 + V_{\rm N/GR}(r) = \text{const} , \qquad V_{\rm N}(r) = \frac{1}{2} \frac{L^2}{r^2} - \frac{GM}{r} , $$
$$ V_{\rm GR}(r) = \frac{1}{2} \frac{L^2}{r^2} + Q\, \frac{GM}{r} - \frac{G}{c^2} \frac{M L^2}{r^3} , \tag{D.31} $$
with Q = −1 for timelike and Q = 0 for null geodesics. The shape of the potential determines
the possible trajectories, so let us explore the potential for the three cases in more detail. In
doing so, we shall revert to natural units and set G = c = 1.
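$$ V_{\rm N}'(r) = -\frac{L^2}{r^3} + \frac{M}{r^2} = 0 \quad \Rightarrow \quad r = \frac{L^2}{M} , $$
$$ V_{\rm N}''(r) = \frac{3 L^2}{r^4} - \frac{2M}{r^3} \quad \Rightarrow \quad V_{\rm N}''(L^2/M) = M^4/L^6 > 0 . \tag{D.32} $$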
Figure 14: The Newtonian potential $V_{\rm N}$ (upper panel) and the relativistic potential $V_{\rm GR}$ for
null (bottom left) and timelike geodesics (bottom right panel), all plotted against r/M for
selected values of the angular momentum parameter L/M (0, 1, 2, 4, 8, and $L^2/M^2 = 12$ in the
timelike panel, which also contains an inset).
The Newtonian potential has exactly one extremum and it is a minimum at $r = L^2/M$ except
for the special case L = 0 which has no extremum. This behaviour is graphically illustrated in
Fig. 14. For L > 0 the Newtonian potential always admits a stable circular orbit ($\dot{r} = 0$) which
is located at $r = L^2/M$. Furthermore, a particle with non-zero angular momentum can never
reach the origin, since the centrifugal repulsion dominates over the gravitational attraction;
cf. top panel in Fig. 14.
GR null geodesics: The relativistic potential also approaches zero as r → ∞, but in the
limit r → 0 we have $V_{\rm GR} \to -\infty$. A calculation of the extrema is quite easy for Q = 0 and
leads to
$$ V_{\rm GR}'(r) = -\frac{L^2}{r^3} + \frac{3 M L^2}{r^4} = 0 \quad \Rightarrow \quad r = 3M , $$
$$ V_{\rm GR}''(r) = \frac{3 L^2}{r^4} - \frac{12 M L^2}{r^5} \quad \Rightarrow \quad V_{\rm GR}''(3M) = -\frac{L^2}{81 M^4} < 0 . \tag{D.33} $$
For L > 0 there always exists an unstable circular orbit at r = 3M which is often referred to
as the light ring. The relativistic correction term $\propto r^{-3}$ furthermore implies an infinitely deep
potential well at r = 0 which drags in all particles with insufficient energy; cf. bottom left panel
in Fig. 14.
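GR timelike geodesics: The equations are a little more complicated but after some crunching
one finds
$$ V_{\rm GR}'(r) = -\frac{L^2}{r^3} + \frac{M}{r^2} + \frac{3 M L^2}{r^4} = 0 \quad \Rightarrow \quad r = r_\pm = \frac{L^2}{2M} \pm \sqrt{\frac{L^4}{4 M^2} - 3 L^2} , $$
$$ V_{\rm GR}''(r) = \frac{3 L^2}{r^4} - \frac{2M}{r^3} - \frac{12 M L^2}{r^5} $$
$$ \Rightarrow \quad V_{\rm GR}''(r_+) = 16 M^4\, \frac{L^2 + L \sqrt{L^2 - 12 M^2} - 12 M^2}{L^3 \left( L + \sqrt{L^2 - 12 M^2} \right)^5} > 0 \quad \text{for } L^2 > 12 M^2 , $$
$$ \wedge \quad V_{\rm GR}''(r_-) = 16 M^4\, \frac{L^2 - L \sqrt{L^2 - 12 M^2} - 12 M^2}{L^3 \left( L - \sqrt{L^2 - 12 M^2} \right)^5} < 0 \quad \text{for } L^2 > 12 M^2 . \tag{D.34} $$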
The potential is shown for various values of L in the bottom right panel in Fig. 14 which also
includes an inset zooming in on three curves to demonstrate the presence or absence of extrema.
We see that extrema only exist for $L^2 > 12 M^2$ and in that case we find a minimum, i.e. a stable
circular orbit, at $r = r_+$ and a maximum, i.e. an unstable circular orbit, at $r = r_-$. One can
furthermore show that $r_+$ ($r_-$) is monotonically increasing (decreasing) with L at fixed M and
in the limit where $L^2$ approaches $12 M^2$ from above, the two coincide: $r_+ = r_- = 6M$. Finally, in the limit of very large
angular momentum parameter L/M → ∞, the unstable circular orbit asymptotes towards the
light ring limit $r_- = 3M$. In summary, stable circular orbits exist in the range r > 6M and
unstable circular orbits at 3M < r < 6M. Note the contrast to the Newtonian case where
stable circular orbits can be found for any r.
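The statements about $r_\pm$ can be checked numerically with a few lines of code. The sketch below evaluates the first line of Eq. (D.34) in units G = c = 1; the function names `circular_orbit_radii` and `v_gr_prime` are hypothetical helpers introduced only for this illustration. As a purely numerical design choice (not taken from the derivation above), the inner radius is computed from the product of the two roots, $r_+ r_- = 3L^2$, which follows from the quadratic $M r^2 - L^2 r + 3 M L^2 = 0$ and avoids cancellation at large L.

```python
import math

def circular_orbit_radii(L2, M=1.0):
    """Return (r_plus, r_minus) for timelike circular orbits, Eq. (D.34).

    L2 is the squared angular momentum per unit mass (G = c = 1).
    Returns None if L2 < 12 M^2, where no circular orbit exists.
    """
    disc = L2 * L2 / (4.0 * M * M) - 3.0 * L2
    if disc < 0.0:
        return None
    r_plus = L2 / (2.0 * M) + math.sqrt(disc)
    r_minus = 3.0 * L2 / r_plus      # product of the roots is 3 L^2
    return r_plus, r_minus

def v_gr_prime(r, L2, M=1.0):
    """Derivative of the timelike potential V_GR (Q = -1), first line of Eq. (D.34)."""
    return -L2 / r**3 + M / r**2 + 3.0 * M * L2 / r**4

for L2 in (9.0, 12.0, 16.0, 1.0e8):
    print(L2, circular_orbit_radii(L2))
```

For M = 1 the value $L^2 = 12$ gives the coincident orbits at r = 6, smaller values give no circular orbit at all, and very large $L^2$ pushes the unstable orbit towards the light ring at r = 3.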
Newtonian calculation: The starting point for our Newtonian calculation is Eq. (D.28). It
turns out to be convenient for this calculation to switch to an inverse radial coordinate
$$ y = \frac{1}{r} , \tag{D.35} $$
and parametrize the geodesic with the orbital angle φ rather than time t. We can do this
because by definition of the angular momentum parameter
$$ \dot\phi = \frac{L}{r^2} , \tag{D.36} $$
so that t and φ are monotonic functions of each other. Denoting time derivatives with a dot as
before and φ derivatives with a prime, we obtain
$$ \frac{d}{dt} = \frac{d\phi}{dt} \frac{d}{d\phi} = \frac{L}{r^2} \frac{d}{d\phi} = L y^2\, \frac{d}{d\phi} $$
$$ \Rightarrow \quad \dot{r} = L y^2 r' = L y^2\, \frac{-1}{y^2}\, y' = -L y' . \tag{D.37} $$
$$ L^2 (y')^2 + L^2 y^2 - 2 M y = 2E . \tag{D.38} $$
$$ 2 L^2 y' y'' + 2 L^2 y y' - 2 M y' = 0 $$
$$ \Rightarrow \quad y' = 0 \quad \vee \quad y'' + y = \frac{M}{L^2} $$
$$ \Rightarrow \quad y = \frac{M}{L^2} (1 + \epsilon \cos\phi) , \tag{D.39} $$
as is straightforwardly verified by inserting the solution. The resulting curve is a hyperbola for
$\epsilon > 1$, a parabola for $\epsilon = 1$ or an ellipse (see Fig. 15) for $\epsilon < 1$. In the circular limit, $\epsilon = 0$,
we find a constant radius $r = 1/y = L^2/M$. Most importantly for our calculation, the orbit is
closed: y returns to the same value after every passage of ∆φ = 2π. Newtonian gravity predicts
no perihelion precession for Mercury (barring perturbations due to other planets that we
ignore here).
General relativistic calculation: Here, the motion is governed by the geodesic equation
(D.29) and we again change to the coordinate y = 1/r and use the angle φ to parametrize the
curve. This transformation proceeds in complete analogy to the Newtonian case above with
proper time τ taking the place of the Newtonian t and leads to
$$ L^2 (y')^2 + L^2 y^2 + 2 M Q y - 2 M L^2 y^3 = E^2 + Q $$
$$ \Rightarrow \quad (y')^2 = \frac{E^2}{L^2} - (1 - 2 M y) \left( -\frac{Q}{L^2} + y^2 \right) . \tag{D.40} $$
$$ (y')^2 + y^2 = \frac{E^2 - 1}{L^2} + \frac{2M}{L^2}\, y + 2 M y^3 . \tag{D.41} $$
Figure 15: The solution (D.39) for the case $\epsilon < 1$ is an ellipse. Do not confuse the Cartesian
coordinate $\tilde{y} = r \sin\phi$ with the inverse radius y = 1/r.
$$ \Rightarrow \quad y'' + y = \frac{M}{L^2} + 3 M y^2 , \tag{D.42} $$
where we ignored the case y′ = 0 which corresponds to a circular orbit that does not exhibit
perihelion precession by construction. Note the similarity of our equation to the Newtonian
case in the second line of Eq. (D.39): The only new feature is the extra term $3 M y^2$. This term,
however, makes the solution significantly harder, so that we resort to perturbation theory. For
this purpose we introduce the small parameter $\alpha := 3 M^2/L^2$ which is of the order of $10^{-7}$ for
Mercury. Equation (D.42) then becomes the Newtonian case plus a perturbation ∝ α,
$$ y'' + y = \frac{M}{L^2} + \alpha\, \frac{L^2}{M}\, y^2 , \tag{D.43} $$
and we likewise expand the solution in α as
Plugging this expansion into (D.43) and sorting terms according to the power of α leads to
$$ y_0'' + \alpha y_1'' + y_0 + \alpha y_1 = \frac{M}{L^2} + \alpha\, \frac{L^2}{M}\, y_0^2 $$
$$ \Rightarrow \quad \left( y_0'' + y_0 - \frac{M}{L^2} \right) + \alpha \left( y_1'' + y_1 - \frac{L^2}{M}\, y_0^2 \right) = 0 . \tag{D.45} $$
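In perturbation theory, equations of this type are solved order by order and we start with the
terms $\propto \alpha^0 = 1$. At this order, we actually recover the Newtonian case (D.39), so that
$$ y_0'' + y_0 - \frac{M}{L^2} = 0 \quad \Rightarrow \quad y_0 = \frac{M}{L^2} (1 + \epsilon \cos\phi) . \tag{D.46} $$
This expression for $y_0$ can now be used in those terms of the differential equation ∝ α which
become
$$ y_1'' + y_1 = \frac{L^2}{M}\, y_0^2 = \frac{M}{L^2} \left( 1 + 2 \epsilon \cos\phi + \epsilon^2 \cos^2\phi \right) $$
$$ = \frac{M}{L^2} \left( 1 + \frac{\epsilon^2}{2} \right) + \frac{2M}{L^2}\, \epsilon \cos\phi + \frac{M}{2 L^2}\, \epsilon^2 \cos 2\phi , \tag{D.47} $$
where we used the identity $\cos^2\phi = (1 + \cos 2\phi)/2$. As a solution, we make the Ansatz
$$ y_1 = A + B \phi \sin\phi + C \cos 2\phi $$
$$ A = \frac{M}{L^2} \left( 1 + \frac{\epsilon^2}{2} \right) , \qquad B = \frac{M}{L^2}\, \epsilon , \qquad C = -\frac{M}{6 L^2}\, \epsilon^2 . \tag{D.49} $$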
Putting together the results for $y_0$ and $y_1$, we obtain the solution to first perturbative order in
α as
$$ y = y_0 + \alpha y_1 = \frac{M}{L^2} (1 + \epsilon \cos\phi) + \alpha\, \frac{M}{L^2} \left[ 1 + \epsilon \phi \sin\phi + \epsilon^2 \left( \frac{1}{2} - \frac{1}{6} \cos 2\phi \right) \right] \tag{D.50} $$
The last term in brackets is $\propto \epsilon^2$ and therefore very small for a nearly circular orbit such as
Mercury’s around the sun. To high accuracy we can therefore write
$$ y \approx \frac{M}{L^2} \left( 1 + \alpha + \epsilon \cos\phi + \alpha \epsilon \phi \sin\phi \right) . \tag{D.51} $$
The first two constant terms in parentheses merely give us the average radius of Mercury’s
orbit and play no role in the perihelion precession. The latter two terms can be approximated
for small $\alpha \ll 1$ using the relation
so that
$$ y \approx \frac{M}{L^2} \left\{ 1 + \alpha + \epsilon \cos[\phi (1 - \alpha)] \right\} . \tag{D.53} $$
The key point is that the (inverse) radius returns to the same value as φ increases from $\phi_n$ to
$\phi_{n+1}$ where
$$ (1 - \alpha)(\phi_{n+1} - \phi_n) = 2\pi \tag{D.54} $$
$$ \Rightarrow \quad \phi_{n+1} - \phi_n = \frac{2\pi}{1 - \alpha} \approx 2\pi (1 + \alpha) \tag{D.55} $$
The angle traversed from one perihelion to the next therefore exceeds the Newtonian value 2π
by the perihelion precession angle
$$ \Delta\phi = 2 \alpha \pi = 6\pi\, \frac{M^2}{L^2} . \tag{D.56} $$
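The perturbative result (D.56) can be cross-checked by integrating the orbit equation (D.42) directly. The following sketch is an illustration under stated assumptions, not part of the derivation above: it uses a hand-written fourth-order Runge-Kutta step, units G = c = 1, and the hypothetical function name `perihelion_shift`; the default values M = 1, $L^2 = 3000$ (so that α = 0.001) and ε = 0.2 are arbitrary choices made only to keep the shift easily resolvable. A perihelion is located as a maximum of y, i.e. where y′ changes sign from positive to negative.

```python
import math

def perihelion_shift(M=1.0, L2=3000.0, ecc=0.2, steps_per_rad=500):
    """Angle between two successive perihelia minus 2*pi for y'' + y = M/L^2 + 3 M y^2."""
    def rhs(y, v):
        # First-order system: y' = v, v' = M/L^2 + 3 M y^2 - y
        return v, M / L2 + 3.0 * M * y * y - y

    h = 1.0 / steps_per_rad
    # Start at a perihelion: y is maximal and y' = 0.
    phi, y, v = 0.0, (M / L2) * (1.0 + ecc), 0.0
    perihelia = [0.0]
    while len(perihelia) < 2:
        k1y, k1v = rhs(y, v)
        k2y, k2v = rhs(y + 0.5 * h * k1y, v + 0.5 * h * k1v)
        k3y, k3v = rhs(y + 0.5 * h * k2y, v + 0.5 * h * k2v)
        k4y, k4v = rhs(y + h * k3y, v + h * k3v)
        y_new = y + h * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0
        v_new = v + h * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
        if v > 0.0 and v_new <= 0.0:
            # Linear interpolation for the zero of y' inside this step.
            perihelia.append(phi + h * v / (v - v_new))
        phi, y, v = phi + h, y_new, v_new
    return perihelia[1] - perihelia[0] - 2.0 * math.pi

alpha = 3.0 * 1.0**2 / 3000.0
print(perihelion_shift(), 2.0 * math.pi * alpha)
```

The two printed numbers should agree up to corrections of relative order α, which are neglected in the first-order expansion.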
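For a nearly circular orbit, we can express the orbital angular momentum through the expression
for $r_+$ in the first line of Eq. (D.34), which gives
$$ L^2 = \frac{M r}{1 - 3M/r} \approx M r \quad \Rightarrow \quad \Delta\phi \approx 6\pi\, \frac{M}{r} . \tag{D.57} $$
$$ \Rightarrow \quad \Delta\phi = 4.99 \times 10^{-7}\, \frac{\text{rad}}{\text{orbit}} = \frac{43''}{\text{century}} . \tag{D.58} $$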
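The order of magnitude in Eq. (D.58) can be reproduced with a short calculation in SI units, where Eq. (D.57) reads $\Delta\phi \approx 6\pi G M/(c^2 r)$. The sketch below is only illustrative: the constants and Mercury's orbital data are rounded textbook values assumed here rather than taken from these notes, r is taken to be the semi-latus rectum $a(1 - e^2)$ (which equals $L^2/M$ for the Newtonian ellipse (D.39)), and the function names are hypothetical.

```python
import math

# Rounded reference values (assumptions for this illustration).
G = 6.674e-11          # Newton's constant in m^3 kg^-1 s^-2
C = 2.998e8            # speed of light in m/s
M_SUN = 1.989e30       # solar mass in kg

def precession_per_orbit(r, mass=M_SUN):
    """Perihelion advance per orbit in radians: 6 pi G M / (c^2 r), cf. Eq. (D.57)."""
    return 6.0 * math.pi * G * mass / (C**2 * r)

def arcsec_per_century(dphi_per_orbit, period_days):
    """Convert an advance per orbit into arcseconds per Julian century."""
    orbits_per_century = 36525.0 / period_days
    return math.degrees(dphi_per_orbit * orbits_per_century) * 3600.0

# Mercury: semi-major axis, eccentricity and orbital period (rounded values).
a, e, period = 5.791e10, 0.2056, 87.969
dphi = precession_per_orbit(a * (1.0 - e**2))
print(dphi, arcsec_per_century(dphi, period))
```

With these inputs the advance per orbit comes out close to the value quoted in Eq. (D.58); small differences depend on which radius and constants are inserted.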
orbit century
Figure 16: Upper panel: Illustration how Eq. (D.60) represents a straight line with impact
parameter b. The deflection angle is zero in this case. Lower panel: In the presence of gravity,
the light ray asymptotes to φ → π + ∆φ to the left and φ → −∆φ to the right. In the figure,
the magnitude of ∆φ is vastly exaggerated.
slightly cryptic fashion. The parameter b represents the closest distance of the line to the origin
and is often called the impact parameter. The light ray is assumed here to come in from infinity
on the left (φ = π, y = 0) and to propagate to the right towards infinity at φ = 0, y = 0.
Let us now return to the case with gravitational field described by Eq. (D.59). We are interested
in small deflections of light rays that come in from infinity and, after the small deflection, move
on towards infinity. At infinity, we are looking for solutions of
$$ \frac{M}{L^2} (1 + \epsilon \sin\phi) = 0 \quad \Rightarrow \quad \sin\phi = -\frac{1}{\epsilon} . \tag{D.61} $$
Small deflection angles correspond to small corrections to the non-gravitational case where
infinity corresponded to φ = π or φ = 0, i.e. sin φ ≈ 0. We therefore expect the small-deflection
limit to be given by $1/\epsilon \ll 1$. Equation (D.61) will then be solved by φ = −∆φ and φ = π + ∆φ
with $\Delta\phi \ll 1$,
$$ \sin(-\Delta\phi) \approx -\Delta\phi = -\frac{1}{\epsilon} , \qquad \sin(\pi + \Delta\phi) \approx -\Delta\phi = -\frac{1}{\epsilon} . \tag{D.62} $$
There remains the task to express $\epsilon$ in terms of the parameters L, M and b. As before, we
define the impact parameter as the closest distance between the light ray and the origin. This
is realized at φ = π/2 where
$$ \frac{1}{b} = y(\pi/2) = \frac{M}{L^2} (1 + \epsilon) \approx \frac{M}{L^2}\, \epsilon . \tag{D.63} $$
Furthermore, we can write the (conserved!) Newtonian angular momentum mass in terms of
the particle’s mass m and velocity c as
Using the last two equations we find the deflection angle as (see lower panel of Fig. 16 for an
illustration with exaggerated magnitude of ∆φ)
$$ 2 \Delta\phi = \frac{2}{\epsilon} = \frac{2 M b}{L^2} = \frac{2M}{b} . \tag{D.65} $$
General relativistic calculation: The starting point is again the geodesic equation (D.40)
expressed in terms of the inverse radius y. We are considering null geodesics now and therefore
set Q = 0 and obtain
L2 (y 0 )2 + L2 y 2 − 2M L2 y 3 = E 2 . (D.66)
We differentiate this equation with respect to φ and divide by 2L2 y 0 which gives
y 00 + y = 3M y 2 . (D.67)
In the absence of a gravitational field we have M = 0 and recover the Newtonian case with the
solution (D.60). With gravitational field, we again assume the deflection angle to be small and
make the Ansatz that the curve is perturbatively close to the straight line y0 = (sin φ)/b,
\[
  y = y_0 + \frac{M}{b}\,\Delta y + \mathcal{O}\!\left((M/b)^2\right) . \tag{D.68}
\]
Here M/b ≪ 1 is our expansion parameter. Plugging this Ansatz into (D.67) and using that
the background solution satisfies
\[
  y_0 = \frac{1}{b}\sin\phi \quad\Rightarrow\quad y_0'' + y_0 = 0 , \tag{D.69}
\]
we find to linear order in M/b for the perturbation ∆y
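\[
  \begin{aligned}
  & \frac{M}{b}\,\Delta y'' + \frac{M}{b}\,\Delta y
    = 3M\left(\frac{1}{b}\sin\phi + \frac{M}{b}\,\Delta y\right)^2
    \approx \frac{3M}{b^2}\sin^2\phi \\
  \Rightarrow\quad & \Delta y'' + \Delta y = \frac{3}{b}\sin^2\phi
    \qquad \big|\ \cos 2\phi = \cos^2\phi - \sin^2\phi = 1 - 2\sin^2\phi \\
  \Rightarrow\quad & \Delta y'' + \Delta y = \frac{3}{b}\,\frac{1 - \cos 2\phi}{2} .
  \end{aligned} \tag{D.70}
\]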
We solve this differential equation by first considering the homogeneous part ∆y″ + ∆y = 0
which is solved by
\[
  \Delta\tilde{y} = \frac{A}{b}\cos\phi + \frac{B}{b}\sin\phi , \tag{D.71}
\]
where A and B are dimensionless constants that also satisfy |A|, |B| ≪ b/M in order to
ensure our perturbative expansion in Eq. (D.68) remains valid. A particular solution for the
inhomogeneous equation is
\[
  \Delta\hat{y} = \frac{1}{2b}\,(3 + \cos 2\phi) , \tag{D.72}
\]
as is straightforwardly checked by inserting it into (D.70). We now choose A = 2 in the
homogeneous part, so that gathering all terms together gives
\[
  y = y_0 + \frac{M}{b}\,\Delta y
    = \frac{1}{b}\sin\phi + \frac{M}{2b^2}\,(3 + \cos 2\phi)
      + \frac{2M}{b^2}\cos\phi + \frac{M}{b^2}\,B\sin\phi . \tag{D.73}
\]
With this particular choice for A we have ensured that for φ → π we have y = 0, i.e. the photon
falls in directly from the left. This corresponds to a rotation of the bottom panel in Fig. 16 by
∆φ but has no impact on the result for the deflection angle. As the photon travels to the right,
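it is deflected before escaping again to infinity y = 0 which now happens at an angle φ = δφ
determined by (D.73) to linear order as
\[
  \begin{aligned}
  0 &\approx \frac{\delta\phi}{b}\left(1 + \frac{M}{b}\,B\right) + \frac{M}{2b^2}\,(3 + 1) + \frac{2M}{b^2}
     \approx \frac{\delta\phi}{b} + \frac{M}{2b^2}\,(3 + 1) + \frac{2M}{b^2} \\
  &\Rightarrow\quad \delta\phi \approx -\frac{4M}{b} .
  \end{aligned} \tag{D.74}
\]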
Note that in the Newtonian calculation we defined ∆φ such that the total deflection angle was
2∆φ whereas here δφ is the deflection angle. The relativistic result is twice as large as the
Newtonian value (D.65).
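The perturbative result (D.74) can be cross-checked by integrating Eq. (D.67) numerically. The following is a minimal sketch, not part of the original derivation: the function name `light_deflection`, the Runge-Kutta integrator and the step size are our own choices, and b is taken here to be the impact parameter L/E at infinity, which agrees with the b used above to leading order in M/b. The function returns the magnitude of the deflection angle.

```python
import math

def light_deflection(M, b, h=1e-4):
    """Magnitude of the total deflection for y'' + y = 3 M y^2, cf. (D.67).

    Assumption of this sketch: b = L/E, so that the first integral (D.66)
    reads (y')^2 + y^2 - 2 M y^3 = 1/b^2.
    """
    # Inverse radius at closest approach (y' = 0): solve y^2 (1 - 2 M y) = 1/b^2
    # by fixed-point iteration, starting from the flat-space value 1/b.
    y = 1.0 / b
    for _ in range(100):
        y = 1.0 / (b * math.sqrt(1.0 - 2.0 * M * y))

    def rhs(y, p):
        # First-order form of (D.67): y' = p, p' = -y + 3 M y^2
        return p, -y + 3.0 * M * y * y

    # Integrate from closest approach (phi = 0 here) until y changes sign,
    # i.e. until the ray has escaped to infinity.
    phi, p = 0.0, 0.0
    for _ in range(int(4.0 / h)):
        k1y, k1p = rhs(y, p)
        k2y, k2p = rhs(y + 0.5 * h * k1y, p + 0.5 * h * k1p)
        k3y, k3p = rhs(y + 0.5 * h * k2y, p + 0.5 * h * k2p)
        k4y, k4p = rhs(y + h * k3y, p + h * k3p)
        y_new = y + h * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0
        p_new = p + h * (k1p + 2.0 * k2p + 2.0 * k3p + k4p) / 6.0
        if y_new <= 0.0:
            # Linear interpolation for the angle at which y = 0
            phi_inf = phi + h * y / (y - y_new)
            # The ray sweeps 2 * phi_inf in total; a straight line sweeps pi.
            return 2.0 * phi_inf - math.pi
        phi, y, p = phi + h, y_new, p_new
    raise RuntimeError("ray did not escape to infinity")

if __name__ == "__main__":
    M, b = 1.0e-3, 1.0
    print(light_deflection(M, b), 4.0 * M / b)
```

For M/b = 10⁻³ the two printed numbers should agree up to corrections of order (M/b)², which is the accuracy of the linearized calculation.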
For the sun with M⊙ = 1.5 km, b ≈ R⊙ ≈ 7 × 10⁵ km, we find
\[
  |\delta\phi| = 4 \times \frac{1.5\ \mathrm{km}}{7\times 10^5\ \mathrm{km}}
                 \times \frac{360}{2\pi} \times 60 \times 60'' \approx 1.77'' . \tag{D.75}
\]
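The arithmetic in (D.75) is easily reproduced. The short sketch below (the function name `deflection_arcsec` is our own) converts the angle 4M/b, or the Newtonian 2M/b, from radians to arc seconds.

```python
import math

def deflection_arcsec(M_km, b_km, relativistic=True):
    """Deflection angle in arc seconds: 4M/b (GR) or 2M/b (Newtonian)."""
    factor = 4.0 if relativistic else 2.0
    angle_rad = factor * M_km / b_km
    # radians -> degrees -> arc seconds
    return math.degrees(angle_rad) * 3600.0

if __name__ == "__main__":
    print(deflection_arcsec(1.5, 7.0e5))                      # general relativity
    print(deflection_arcsec(1.5, 7.0e5, relativistic=False))  # Newtonian value
```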
This effect was famously tested in 1919 through observations by two expeditions to Sobral
(Brazil) and to the island of Príncipe off the west coast of Africa [10], both located
in the path of totality of the solar eclipse on May 29, 1919. Both expeditions, run by Arthur
Eddington and collaborators, measured the positions of stars near the sun (then located in the
Taurus constellation) and generated results compatible with Einstein’s theory of relativity. The
confirmation of his theory catapulted Einstein to a global-star status that has lost nothing in
the nearly one hundred years since.
[Figure 17: two panels, each showing Venus, the Sun and Earth with the distances r1, r2 and the impact parameter b.]
Figure 17: Illustration of the path of a radar signal from Earth to Venus and back in Minkowski
spacetime (upper panel) where the gravitational field of the sun is ignored and in general
relativity (lower panel) where the Sun’s gravity bends the light path.
the solar exterior is modelled by the Schwarzschild metric. The two scenarios are illustrated in
Fig. 17.
Without gravitational field: This scenario is shown in the upper panel of Fig. 17. We
denote by r1 and r2 the distance of Venus and Earth from the sun, respectively. The impact
parameter b is the solar radius. The time a radar signal needs to propagate to Venus and back
then follows from the rules of flat geometry,
\[
  T = 2\left(\sqrt{r_1^2 - b^2} + \sqrt{r_2^2 - b^2}\right) . \tag{D.76}
\]
With gravitational field: We recall the geodesic equations (D.16) and (D.23) in the
Schwarzschild spacetime and set Q = 0 for null geodesics,
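\[
  \begin{aligned}
  & \dot{r}^2 + \left(1 - \frac{2M}{r}\right)\frac{L^2}{r^2} = E^2 , \qquad
    \dot{t} = \left(1 - \frac{2M}{r}\right)^{-1} E \\
  \Rightarrow\quad & \left(\frac{dr}{dt}\right)^2 = \frac{\dot{r}^2}{\dot{t}^2}
    = \left(1 - \frac{2M}{r}\right)^2 \frac{1}{E^2}
      \left[E^2 - \left(1 - \frac{2M}{r}\right)\frac{L^2}{r^2}\right] \\
  \Rightarrow\quad & \left(\frac{dr}{dt}\right)^2
    = \left(1 - \frac{2M}{r}\right)^2
      \left[1 - \left(1 - \frac{2M}{r}\right)\frac{L^2}{E^2 r^2}\right] \\
  \Rightarrow\quad & \frac{dr}{dt}
    = \pm\left(1 - \frac{2M}{r}\right)
      \sqrt{1 - \left(1 - \frac{2M}{r}\right)\frac{L^2}{E^2 r^2}} .
  \end{aligned} \tag{D.77}
\]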
Proper time on Earth is to very high precision identical with coordinate time of the Schwarzschild
metric, so that the time of passage of the radar signal is
\[
  T = 2\int_b^{r_1}\frac{dr}{f(r)} + 2\int_b^{r_2}\frac{dr}{f(r)} , \qquad
  f(r) = \left(1 - \frac{2M}{r}\right)\sqrt{1 - \frac{b^2}{r^2}\,\frac{1 - 2M/r}{1 - 2M/b}} . \tag{D.80}
\]
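\[
  \int \frac{r}{\sqrt{r^2 - b^2}}\,\frac{2M}{r}\,dr
  = \int \frac{2M}{\sqrt{r^2 - b^2}}\,dr
  = 2M\ln\left(r + \sqrt{r^2 - b^2}\right)
\]
\[
  \Rightarrow\quad \int_b^{r_1} \frac{2M}{\sqrt{r^2 - b^2}}\,dr
  = 2M\ln\frac{r_1 + \sqrt{r_1^2 - b^2}}{b} .
\]
\[
  \int \frac{r}{\sqrt{r^2 - b^2}}\,\frac{Mb}{r(r + b)}\,dr
  = \int \frac{Mb}{(r + b)\sqrt{r^2 - b^2}}\,dr = \ldots
  = M\frac{r - b}{\sqrt{r^2 - b^2}} = M\sqrt{\frac{r - b}{r + b}}
\]
\[
  \Rightarrow\quad \int_b^{r_1} \frac{Mb}{(r + b)\sqrt{r^2 - b^2}}\,dr
  = M\sqrt{\frac{r_1 - b}{r_1 + b}} .
\]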
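\[
  \begin{aligned}
  T ={}& 2\left(\sqrt{r_1^2 - b^2} + \sqrt{r_2^2 - b^2}\right)
       + 4M\left(\ln\frac{r_1 + \sqrt{r_1^2 - b^2}}{b} + \ln\frac{r_2 + \sqrt{r_2^2 - b^2}}{b}\right) \\
      & + 2M\left(\sqrt{\frac{r_1 - b}{r_1 + b}} + \sqrt{\frac{r_2 - b}{r_2 + b}}\right) .
  \end{aligned} \tag{D.83}
\]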
The first term is just the result (D.76) we obtained in the absence of gravity using the Minkowski
metric. The second and third term describe the time delay ∆T relative to the Minkowski result.
Using
M = M⊙ = 1.47 km
r1 = r♀ = 1.08 × 10⁸ km
r2 = r♁ = 1.496 × 10⁸ km
b = R⊙ = 6.96 × 10⁵ km , (D.84)
(the astronomical symbols for Venus and Earth are ♀ and ♁) we obtain ∆T ≈ 75 km ≈ 251 µs (≈ 77 km ≈ 257 µs if M⊙ is rounded to 1.5 km).
In practice, the radar signal passes a bit away from the solar surface which decreases the delay
to about 200 µs. The effect was first measured with the Massachusetts Institute of Technol-
ogy’s Haystack antenna a few years after Shapiro’s prediction and has been reinvestigated with
increasing accuracy in numerous experiments since, all compatible with the general relativis-
tic result. A chronology of experimental and observational tests of Einstein’s theory is given
in Sec. 15.9 of d’Inverno’s book [9]. We should add to this list the Nobel Prize winning
observations of the Hulse-Taylor pulsar [16, 29, 32] and the groundbreaking first detection of
gravitational waves from the black-hole binary system GW150914 [2] that kicked US presidential
hopefuls off the news headlines on February 11, 2016.
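Returning to the radar echo delay, the excess over the flat-space travel time (D.76) can be evaluated directly. The sketch below assumes the first-order result that follows from the two integrals above, namely ∆T = 4M [ln((r1 + √(r1² − b²))/b) + ln((r2 + √(r2² − b²))/b)] + 2M [√((r1 − b)/(r1 + b)) + √((r2 − b)/(r2 + b))]; the function name and the conversion constant are our own choices.

```python
import math

C_KM_PER_S = 299792.458  # speed of light in km/s

def shapiro_delay_km(M, r1, r2, b):
    """Round-trip excess light travel time (in km, c = 1) to first order in M."""
    def one_leg(r):
        # Contribution of a single leg from r = b out to radius r
        log_term = 2.0 * M * math.log((r + math.sqrt(r * r - b * b)) / b)
        sqrt_term = M * math.sqrt((r - b) / (r + b))
        return log_term + sqrt_term
    # Signal travels Earth -> Venus and back: each leg is traversed twice
    return 2.0 * (one_leg(r1) + one_leg(r2))

if __name__ == "__main__":
    # Values of (D.84), all in km
    dT = shapiro_delay_km(M=1.47, r1=1.08e8, r2=1.496e8, b=6.96e5)
    print(dT, "km =", dT / C_KM_PER_S * 1.0e6, "microseconds")
```

With M⊙ = 1.47 km this evaluates to roughly 75 km; rounding M⊙ to 1.5 km gives roughly 77 km. The logarithmic term dominates the delay.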
[Figure 18: two panels; the left panel has axes t, x and y,z, the right panel has axes t and r.]
Figure 18: Left panel: Light cones in the Minkowski spacetime in Cartesian coordinates. One
spatial direction is suppressed and time points upwards. The future pointing light cone is
shown in green, the past one in red color. Right panel: Often we are interested in the limiting
curves of outgoing and ingoing radial geodesics. We then use spherical coordinates (t, r) with
the angular dependency suppressed and show the light cones by the out and ingoing curves.
We study the geodesic equation using an affine parameter λ and set dθ = dφ = 0, i.e. we
consider radial geodesics. We therefore need the t and r component of the geodesic equation.
The former we have already obtained in Eq. (D.16),
\[
  \left(1 - \frac{2M}{r}\right)\dot{t} = E = \mathrm{const} ,
\]
but the r component we still have to work out. The Euler-Lagrange equation applied to the
Schwarzschild metric gives us
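\[
  \begin{aligned}
  & \frac{d}{d\lambda}\frac{\partial\mathcal{L}}{\partial\dot{r}} = \frac{\partial\mathcal{L}}{\partial r} \\
  \Rightarrow\quad & \frac{d}{d\lambda}\left[2\left(1 - \frac{2M}{r}\right)^{-1}\dot{r}\right]
    = -\left(1 - \frac{2M}{r}\right)^{-2}\frac{2M}{r^2}\,\dot{r}^2 - \frac{2M}{r^2}\,\dot{t}^2 \\
  \Rightarrow\quad & -2\left(1 - \frac{2M}{r}\right)^{-2}\frac{2M}{r^2}\,\dot{r}^2
    + 2\left(1 - \frac{2M}{r}\right)^{-1}\ddot{r}
    = -\left(1 - \frac{2M}{r}\right)^{-2}\frac{2M}{r^2}\,\dot{r}^2 - \frac{2M}{r^2}\,\dot{t}^2 \\
  \Rightarrow\quad & -2\dot{r}^2 + \left(1 - \frac{2M}{r}\right)\frac{r^2}{M}\,\ddot{r}
    = -\dot{r}^2 - \left(1 - \frac{2M}{r}\right)^2\dot{t}^2 = -\dot{r}^2 - E^2 \\
  \Rightarrow\quad & \left(1 - \frac{2M}{r}\right)\frac{r^2}{M}\,\ddot{r} = \dot{r}^2 - E^2 ,
  \end{aligned} \tag{D.85}
\]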
where we plugged in the above equation for ṫ. This equation is clearly solved by ṙ = ±E. It
follows that r = ±Eλ+r0 is also an affine parameter. We use that observation to reparametrize
the geodesic by r,
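\[
  \frac{dt}{dr} = \frac{\dot{t}}{\dot{r}} = \pm\frac{r}{r - 2M}
  \quad\Rightarrow\quad
  t = \pm\left(r + 2M\ln|r - 2M|\right) + \mathrm{const} . \tag{D.86}
\]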
In Fig. 19 we plot several curves given by Eq. (D.86) and also show some corresponding light
cones. Clearly, r = 2M separates two regions which we discuss in turn.
r > 2M : The + sign in (D.86) gives us outgoing and the − sign ingoing geodesics. At any given
point in the spacetime, a timelike curve must be inside the light cones constructed
from the radial geodesics. For example, curves r = const are clearly timelike and
located inside the light cones.
[Figure 19: plot with horizontal axis r (0 to 4M, with 2M marked) and vertical axis t (0 to 10M).]
Figure 19: Geodesic curves in the Schwarzschild spacetime according to Eq. (D.86). Curves
corresponding to the + sign are shown in blue, those with the − sign in orange. A few
light cones are shown in green. The dotted black line marks the location r = 2M where the
Schwarzschild metric (D.10) becomes singular.
r < 2M : This case is more complicated. First, we note that the line element (with dθ = dφ = 0)
can now be written in the form
\[
  ds^2 = -\left(\frac{2M}{r} - 1\right)^{-1} dr^2 + \left(\frac{2M}{r} - 1\right) dt^2 ,
\]
so that now grr < 0 and gtt > 0 and, hence, r is the timelike coordinate. Curves
t = const are now timelike. In our diagram this means that horizontal lines must
be inside the light cones which are, accordingly, tilted horizontally. There remains
the question whether the future light cones point to the left or right in our diagram.
Based on physical arguments, we expect them to point towards r = 0, since we
expect the gravitational field to pull objects towards the center. We already note at
this point, however, that we do not have a mathematical proof for this. For example,
we cannot use continuity of the light cones from the exterior across r = 2M because
there the metric (D.10) is singular and does not allow for a calculation of light cones.
For comparison, we now describe the same timelike geodesic in terms of Schwarzschild time t
instead of proper time τ . Note that t is equal to the proper time of an observer staying fixed
at very large radius r. We obtained the expressions
\[
  \dot{t} = \left(1 - \frac{2M}{r}\right)^{-1} E , \qquad \dot{r}^2 = \frac{2M}{r} ,
\]
in the preceding calculation and thus find
\[
  \frac{dt}{dr} = \frac{\dot{t}}{\dot{r}} = -\sqrt{\frac{r}{2M}}\left(1 - \frac{2M}{r}\right)^{-1} . \tag{D.89}
\]
After some crunching, this equation can be integrated to give us
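\[
  t - t_0 = -\frac{2}{3\sqrt{2M}}\left[r^{3/2} - r_0^{3/2} + 6M\left(\sqrt{r} - \sqrt{r_0}\right)\right]
            + 2M\ln\frac{\left(\sqrt{r} + \sqrt{2M}\right)\left(\sqrt{r_0} - \sqrt{2M}\right)}
                        {\left(\sqrt{r_0} + \sqrt{2M}\right)\left(\sqrt{r} - \sqrt{2M}\right)} . \tag{D.90}
\]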
In Fig. 20 we compare τ(r) from Eq. (D.88) and t(r) from Eq. (D.90) for an observer starting to
fall from r0 = 20 M at t0 = τ0 = 0. The coordinate time t diverges as the observer approaches
r = 2M. A second observer remaining behind at fixed r0 will therefore never see his sibling
cross the threshold r = 2M as that would only happen at t → ∞. On the other hand, we
have already seen that the falling observer has quite another experience, crossing r = 2M after
finite proper time without anything special happening (besides gradually being spaghettified
due to the effect of tidal forces, but that’s another story).
[Figure 20: plot of τ(r) and t(r); horizontal axis r/M from 0 to 20, vertical axis from 0 to 70.]
Figure 20: The trajectory of a falling observer in the Schwarzschild spacetime measured in
terms of the observer’s proper time τ (D.88) and coordinate time t (D.90) which corresponds
to the proper time of an observer staying behind at large radius. Both trajectories start from
r0 = 20 M at t0 = τ0 = 0.
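The contrast between the two time measures can be made concrete with a few lines of code. The sketch below evaluates t(r) from Eq. (D.90); for the proper time it uses τ − τ0 = 2 (r0^{3/2} − r^{3/2}) / (3√(2M)), which we obtain here by integrating ṙ = −√(2M/r) directly (an assumption of this sketch, since it is written independently of the expression referred to as (D.88)). Function names and default values are our own.

```python
import math

def coordinate_time(r, r0, M=1.0, t0=0.0):
    """Schwarzschild time t(r) of the infalling observer, Eq. (D.90); needs r > 2M."""
    s, s0, m = math.sqrt(r), math.sqrt(r0), math.sqrt(2.0 * M)
    poly = r ** 1.5 - r0 ** 1.5 + 6.0 * M * (s - s0)
    log_arg = (s + m) * (s0 - m) / ((s0 + m) * (s - m))
    return t0 - 2.0 / (3.0 * m) * poly + 2.0 * M * math.log(log_arg)

def proper_time(r, r0, M=1.0, tau0=0.0):
    """Proper time tau(r), from integrating dr/dtau = -sqrt(2M/r)."""
    return tau0 + 2.0 / (3.0 * math.sqrt(2.0 * M)) * (r0 ** 1.5 - r ** 1.5)

if __name__ == "__main__":
    r0 = 20.0  # starting radius in units of M
    for eps in (1.0, 1e-2, 1e-4, 1e-6):
        r = 2.0 + eps
        print(r, proper_time(r, r0), coordinate_time(r, r0))
```

As r → 2M the proper time approaches a finite value while t grows logarithmically without bound, which is the behaviour shown in Fig. 20.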
We could imagine a scenario where the falling observer emits light signals outwards at regular
intervals of proper time. These are picked up by the less adventurous friend who will not detect
them at regular intervals in time t but instead sees them arrive with ever increasing delays (and
redshift).
t + 2M ln |r − 2M | = −r + const . (D.91)
t̄ = t + 2M ln |r − 2M | (D.92)
\[
  \Rightarrow\quad d\bar{t} = dt + \frac{2M}{r - 2M}\,dr , \qquad \text{valid for } r > 2M \text{ or } r < 2M . \tag{D.93}
\]
[Figure 21: plot with horizontal axis r (0 to 10M, with 2M marked) and vertical axis t (0 to 10M).]
Figure 21: Geodesic curves in the Schwarzschild spacetime in ingoing Eddington Finkelstein
coordinates according to Eqs. (D.95), (D.96). The former are shown in orange, the latter in
blue. A few light cones are shown in green. The dotted black line marks the location r = 2M
where the Schwarzschild metric (D.10) becomes singular.
The Schwarzschild line element (D.10) becomes in this new coordinate system
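\[
  \begin{aligned}
  ds^2 &= -\left(1 - \frac{2M}{r}\right)\left(d\bar{t} - \frac{2M}{r - 2M}\,dr\right)^2
          + \left(1 - \frac{2M}{r}\right)^{-1} dr^2 + r^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right) \\
  \Rightarrow\quad ds^2 &= -\left(1 - \frac{2M}{r}\right) d\bar{t}^2 + \frac{4M}{r}\,d\bar{t}\,dr
          + \left(1 + \frac{2M}{r}\right) dr^2 + r^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right) .
  \end{aligned} \tag{D.94}
\]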
Ingoing and outgoing radial null geodesics are given in terms of t̄ and r by
t̄ = −r + const , (D.95)
t̄ = r + 4M ln |r − 2M | + const . (D.96)
An illustration of these geodesics together with the resulting light cones is shown in Fig. 21.
We note the following observations.
(1) The light cones now smoothly vary across r = 2M . They tilt over in the inward direction
such that at r < 2M even outgoing null geodesics are directed towards decreasing r.
(2) At large distances, the light cones approach their Minkowskian structure with 45◦ inclina-
tion.
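Both observations can be read off from the slopes dt̄/dr of the radial null directions. Differentiating (D.95) and (D.96) gives −1 for the ingoing and (r + 2M)/(r − 2M) for the outgoing family. The following sketch (function names are ours) evaluates these slopes and checks that they make the radial part of the line element (D.94) vanish.

```python
def null_slopes(r, M=1.0):
    """Slopes d(tbar)/dr of ingoing and outgoing radial null curves, from (D.95), (D.96)."""
    ingoing = -1.0
    outgoing = (r + 2.0 * M) / (r - 2.0 * M)
    return ingoing, outgoing

def line_element(slope, r, M=1.0):
    """ds^2 / dr^2 of the metric (D.94) along a radial curve with d(tbar)/dr = slope."""
    f = 1.0 - 2.0 * M / r
    return -f * slope ** 2 + 4.0 * M / r * slope + 1.0 + 2.0 * M / r

if __name__ == "__main__":
    for r in (1.0, 1.9, 2.1, 3.0, 10.0, 1000.0):
        s_in, s_out = null_slopes(r)
        print(r, s_in, s_out, line_element(s_in, r), line_element(s_out, r))
```

Inside r = 2M the outgoing slope is negative, so r decreases as t̄ increases along both families; at large r the slopes tend to ∓1, the Minkowski values.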
The location r = 2M marks a semi-transparent membrane in the sense that light rays can move
towards r < 2M from the outside, but not the other way round. Even outgoing light rays are
drawn in by the gravitational field. Since timelike curves are bounded by the light cones, all
timelike observers inside r < 2M also inevitably fall towards smaller r. This motivates the
following definition.
Def.: The outermost boundary of a region of spacetime from which no null geodesics and, hence,
no timelike curves can escape to infinity, is called an event horizon.
This horizon motivated, of course, the term black hole coined by John Wheeler in the 1960s.
Without proof, we state Israel’s theorem on the uniqueness of static spacetimes containing a
horizon.
Theorem: If a spacetime is static, asymptotically flat and contains a regular horizon then it is a
Schwarzschild spacetime.
A simplification of the line element (D.94) is obtained by transforming to the null coordinate
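\[
  \begin{aligned}
  & v = \bar{t} + r \quad\Rightarrow\quad d\bar{t} = dv - dr \\
  \Rightarrow\quad ds^2 &= -\left(1 - \frac{2M}{r}\right)\left(dv^2 - 2\,dr\,dv + dr^2\right)
      + \frac{4M}{r}\left(dv\,dr - dr^2\right) + \left(1 + \frac{2M}{r}\right) dr^2 + r^2 d\Omega^2 \\
  &= -\left(1 - \frac{2M}{r}\right) dv^2 + 2\,dr\,dv
      + dr^2\left[-\left(1 - \frac{2M}{r}\right) - \frac{4M}{r} + 1 + \frac{2M}{r}\right] + r^2 d\Omega^2 \\
  \Rightarrow\quad ds^2 &= -\left(1 - \frac{2M}{r}\right) dv^2 + 2\,dr\,dv + r^2 d\Omega^2 ,
  \end{aligned} \tag{D.97}
\]
where we introduced the notation dΩ² := dθ² + sin²θ dφ². In this line element, the null character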
of our ingoing radial null geodesics is manifest: the tangent vector to the curves v = const is
∂r and clearly g(∂r , ∂r ) = 0.
You may wonder whether the coordinate transformation (D.92) is really a legitimate way to
transform from Schwarzschild to Eddington Finkelstein coordinates; after all, (D.92) is singular
at r = 2M . This viewpoint, however, looks at the situation the wrong way round. The Edding-
ton Finkelstein version (D.94) of the Schwarzschild metric is a perfectly legitimate solution of
the Einstein equations (C.35). It is regular at r = 2M and has a clean structure of light cones.
Transforming to the Schwarzschild metric through (D.92) introduces a coordinate singularity
at r = 2M which is not surprising given that the transformation itself is singular there.
t − 2M ln |r − 2M | = r + const . (D.98)
[Figure 22: plot with horizontal axis r (0 to 10M, with 2M marked) and vertical axis t (0 to 10M).]
Figure 22: Geodesic curves in the Schwarzschild spacetime in outgoing Eddington Finkelstein
coordinates according to Eqs. (D.101), (D.102). The former are shown in orange, the latter in
blue. A few light cones are shown in green. The dotted black line marks the location r = 2M
where the Schwarzschild metric (D.10) becomes singular.
We should be a little puzzled now. With ingoing Eddington Finkelstein coordinates we have
just shown that all future pointing light cones tilt over inwards inside r < 2M and that
therefore all null geodesics and timelike curves fall inwards. Here, we use outgoing Eddington
Finkelstein coordinates and demonstrate the exact opposite; all future pointing light cones
inside r < 2M point completely outwards. What is going on and which of the results is correct?
The answer is that both are correct. And at second glance the puzzle looks less paradoxical.
By construction, the Schwarzschild spacetime is static. We should therefore expect symmetry
under time reversal. In order to fully grasp how the puzzle is resolved, we need to go one
coordinate transformation further: to Kruskal-Szekeres coordinates.
Step 2: Now we collect both, the ingoing and outgoing, coordinate transformations
\[
  \begin{aligned}
  v &= \bar{t} + r = t + r + 2M\ln(r - 2M) - 2M\ln r_* = t + r + 2M\ln\frac{r - 2M}{r_*} , \\
  u &= \tilde{t} - r = t - r - 2M\ln\frac{r - 2M}{r_*} ,
  \end{aligned} \tag{D.105}
\]
where we wrote the integration constant in the geodesic equations (D.91), (D.99) in the form
of a constant r∗ that ensures the argument of the logarithm is dimensionless. Now we combine
the in and outgoing Eddington Finkelstein coordinates into one coordinate transformation
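\[
  \frac{1}{2}(v + u) = t , \qquad \frac{1}{2}(v - u) = r + 2M\ln\frac{r - 2M}{r_*} , \tag{D.106}
\]
\[
  \Rightarrow\quad dt = \frac{1}{2}(dv + du) , \qquad dr = \frac{r - 2M}{2r}\,(dv - du) , \tag{D.107}
\]
which transforms the Schwarzschild metric into
\[
  \begin{aligned}
  ds^2 &= -\left(1 - \frac{2M}{r}\right)\frac{1}{4}(dv + du)^2
          + \frac{1}{4}\left(1 - \frac{2M}{r}\right)(dv - du)^2 + r^2 d\Omega^2 \\
  \Rightarrow\quad ds^2 &= -\left(1 - \frac{2M}{r}\right) du\,dv + r^2 d\Omega^2 .
  \end{aligned} \tag{D.108}
\]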
It will be noted here that we dropped the modulus in the logarithmic argument, i.e. we use
ln[(r − 2M)/r∗] instead of ln |(r − 2M)/r∗|. In fact, all results we have obtained for the ingoing and
outgoing Eddington Finkelstein coordinates remain the same, with or without modulus. So
we can simply accept the transformation to involve complex intermediate expressions and see
where it leads us. The end product will be real.
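\[
  \begin{aligned}
  & \Rightarrow\quad d\tilde{v} = \frac{1}{4M}\,\tilde{v}\,dv , \qquad d\tilde{u} = -\frac{1}{4M}\,\tilde{u}\,du \\
  & \Rightarrow\quad ds^2 = \frac{16M^2}{\tilde{u}\tilde{v}}\left(1 - \frac{2M}{r}\right) d\tilde{u}\,d\tilde{v} + r^2 d\Omega^2 .
  \end{aligned} \tag{D.109}
\]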
Step 4: The coordinates ũ, ṽ are null coordinates. Since we are more used to time and radius,
we now switch back to this type of coordinates. First, we realize that
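\[
  \begin{aligned}
  & \tilde{u}\tilde{v} = -e^{\frac{v - u}{4M}}
    = -\exp\left(\frac{r}{2M} + \ln\frac{r - 2M}{r_*}\right)
    = -\frac{r - 2M}{r_*}\,e^{\frac{r}{2M}} \\
  \Rightarrow\quad & ds^2 = -\frac{16M^2\,r_*}{r - 2M}\left(1 - \frac{2M}{r}\right) e^{-\frac{r}{2M}}\,d\tilde{u}\,d\tilde{v} + r^2 d\Omega^2 \\
  \Rightarrow\quad & ds^2 = -\frac{16M^2}{r/r_*}\,e^{-\frac{r}{2M}}\,d\tilde{u}\,d\tilde{v} + r^2 d\Omega^2 .
  \end{aligned} \tag{D.110}
\]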
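\[
  \Rightarrow\quad ds^2 = \frac{16M^2}{r/r_*}\,e^{-\frac{r}{2M}}\left(-d\hat{t}^2 + d\hat{r}^2\right) + r^2 d\Omega^2 , \qquad
  \hat{t}^2 - \hat{r}^2 = -\frac{r - 2M}{r_*}\,e^{\frac{r}{2M}} . \tag{D.111}
\]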
This is the Schwarzschild metric in Kruskal-Szekeres coordinates. Note that the original radius
r is implicitly defined through the last expression and still present in the metric components.
From now on we will set the integration constant r∗ = 1 as is customary in the literature. This
constant represents merely the unit in which we measure radius r and mass M . Note that we
have gained a lot with the new form of the Schwarzschild metric:
(3) The third and probably most dramatic benefit only becomes clear if we consider the allowed
range of our new coordinates. This requires a little work.
It seems that we have somehow extended our spacetime. Unlike the Schwarzschild
radius r, our new radial coordinate r̂ can take on negative values. Furthermore, we
have two different expressions of t̂ and r̂ for each of the locations r = 2M and r = 0.
In order to understand these issues better, we draw the Kruskal diagram. For this purpose, we
consider the following curves.
(i) Curves r = r0 = const are hyperbolic curves
\[
  \hat{t}^2 - \hat{r}^2 = -e^{\frac{r_0}{2M}}\,(r_0 - 2M) =: C
  \quad\Rightarrow\quad
  \hat{t} = \pm\sqrt{\hat{r}^2 + C} \ \vee\ \hat{r} = \pm\sqrt{\hat{t}^2 - C} . \tag{D.114}
\]
(ii) Curves t = t0 = const are obtained as follows. Equation (D.105) gives us u, v as functions
of t, r. This implies
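\[
  \begin{aligned}
  & \tilde{v} = e^{\frac{v}{4M}} = e^{\frac{t + r}{4M}}\sqrt{r - 2M} , \qquad
    \tilde{u} = -e^{-\frac{u}{4M}} = -e^{\frac{r - t}{4M}}\sqrt{r - 2M} \\
  \Rightarrow\quad & \hat{t} = \frac{1}{2}(\tilde{v} + \tilde{u}) = \sqrt{r - 2M}\,e^{\frac{r}{4M}}\sinh\frac{t}{4M} \\
  \wedge\quad & \hat{r} = \frac{1}{2}(\tilde{v} - \tilde{u}) = \sqrt{r - 2M}\,e^{\frac{r}{4M}}\cosh\frac{t}{4M} \\
  \Rightarrow\quad & \tanh\frac{t}{4M} = \frac{\hat{t}}{\hat{r}} .
  \end{aligned} \tag{D.115}
\]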
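The relations (D.114) and (D.115) are easy to verify numerically. The sketch below maps a point (t, r) of the exterior region r > 2M to Kruskal-Szekeres coordinates with r∗ = 1, using the sinh and cosh expressions of (D.115); the function name `kruskal_exterior` is our own, and the other regions of the diagram are deliberately not covered.

```python
import math

def kruskal_exterior(t, r, M=1.0):
    """Kruskal-Szekeres (t_hat, r_hat) for an exterior point r > 2M, with r_* = 1."""
    amplitude = math.sqrt(r - 2.0 * M) * math.exp(r / (4.0 * M))
    t_hat = amplitude * math.sinh(t / (4.0 * M))
    r_hat = amplitude * math.cosh(t / (4.0 * M))
    return t_hat, r_hat

if __name__ == "__main__":
    M, t, r = 1.0, 1.0, 3.0
    t_hat, r_hat = kruskal_exterior(t, r, M)
    # Curves r = const are hyperbolae: t_hat^2 - r_hat^2 = -(r - 2M) exp(r / 2M)
    print(t_hat ** 2 - r_hat ** 2, -(r - 2.0 * M) * math.exp(r / (2.0 * M)))
    # Curves t = const are straight lines through the origin
    print(t_hat / r_hat, math.tanh(t / (4.0 * M)))
```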
[Figure 23: Kruskal diagram with both axes running from −4 to 4; curves are labeled r = 0, 1.5M, 1.8M, 2M, 2.5M, 3M and t = ±M, ±2M.]
Figure 23: Kruskal diagram of the Schwarzschild spacetime with curves r = const and t = const
as labeled. For each value r = const there exist two curves in the spacetime.
Several examples of these curves are plotted in Fig. 23. Note that each value r = const
corresponds to two curves. In particular, there are two singularities r = 0 and two horizons
r = 2M . We now also understand the apparent paradox of the outgoing Eddington-Finkelstein
coordinates. The singularity in the future is a black hole, everything passing inside r = 2M is
doomed to fall ever inwards until it hits r = 0. The past singularity r = 0, however, is a white
hole from which all light and timelike curves move outwards. We also have two asymptotically
flat regions, one at r̂ → ∞ and one at r̂ → −∞. These two regions, however, are causally
disconnected. Since all light cones have the shape t̂ = ±r̂ + const, they open up at 45◦ and no
information can pass from the left to the right region or vice versa. Finally, we note that the
horizon r = 2M is a null surface (t̂ = ±r̂) and the singularity at r = 0 is spacelike.
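\[
  -\frac{c^2}{A}\frac{dM}{dt} = \sigma T^4 , \qquad
  \sigma = \frac{\pi^2 k_B^4}{60\hbar^3 c^2} = 5.67\times 10^{-8}\,\frac{\mathrm{J}}{\mathrm{m^2\,s\,K^4}} , \tag{D.117}
\]
where the radiated energy is Mc². Plugging in Eq. (D.116) for the temperature and A = 4π(2GM/c²)² for the surface area of a
black hole gives us an ordinary differential equation for M(t),
\[
  \frac{dM}{dt} = -\frac{\hbar c^4}{15360\pi\,G^2 M^2}
  \quad\Rightarrow\quad
  t = 5120\,\frac{\pi G^2}{\hbar c^4}\,M^3 . \tag{D.118}
\]
For a black hole of one solar mass, M⊙ = 2 × 10³⁰ kg, the evaporation time is O(10⁶⁷) yr. For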
macroscopic black holes, this is such an extreme value that we can treat them as effectively
stable objects. Primordial black holes with masses M ≪ M⊙, however, have been conjectured
to have formed in the very early universe’s density fluctuations. They would have evaporation
times much closer to the life time of our universe. If these objects exist, Hawking radiation
provides a potentially testable observational signature. Note, however, that our calculations
assume that no energy is added through accretion onto the holes. Accretion of some sort should
happen, even if only from the 2.7 K cosmic microwave background radiation, modifying the
expected evaporation times.
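To get a feeling for the numbers, the evaporation time can be evaluated in SI units. The sketch below assumes the form t = 5120 π G² M³ / (ħ c⁴), i.e. it identifies the radiated energy with Mc², and it ignores accretion as discussed above. The constants are rounded reference values and the function names are our own.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant, J s
C = 2.99792458e8        # speed of light, m/s
G = 6.67430e-11         # gravitational constant, m^3 / (kg s^2)
YEAR = 3.15576e7        # Julian year in seconds

def evaporation_time_years(mass_kg):
    """Evaporation time t = 5120 pi G^2 M^3 / (hbar c^4), converted to years."""
    seconds = 5120.0 * math.pi * G ** 2 * mass_kg ** 3 / (HBAR * C ** 4)
    return seconds / YEAR

def mass_for_lifetime_kg(lifetime_years):
    """Initial mass of a black hole that evaporates in the given time."""
    seconds = lifetime_years * YEAR
    return (seconds * HBAR * C ** 4 / (5120.0 * math.pi * G ** 2)) ** (1.0 / 3.0)

if __name__ == "__main__":
    print(evaporation_time_years(2.0e30))  # one solar mass
    print(mass_for_lifetime_kg(13.8e9))    # lifetime comparable to the age of the universe
```

The cubic dependence on M is what makes solar-mass black holes effectively stable, while a primordial black hole evaporating today would need an initial mass of order 10¹¹ kg under these assumptions.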
E Cosmology
Cosmology is the attempt to describe the entire Universe using simplifying assumptions that
still enable us to capture the essential properties of the Universe. The central concepts are those
of homogeneity and isotropy. These provide us with sufficient degrees of symmetry such that
analytic solutions of the Einstein equations are available and predict non-trivial consequences
that can be tested through astrophysical observations.
Cosmological principle: At a given moment in time, the universe is spatially homogeneous and
isotropic when viewed on a large scale.
Weyl’s postulate: The world lines of the fluid elements, that model the universe’s matter content,
are orthogonal to hypersurfaces of constant time, Σt , to which the cosmological
principle applies.
Note that we have been a bit vague so far about defining a time coordinate in this context and,
correspondingly, which spatial hypersurfaces are isotropic and homogeneous. Clearly, this is
not the case for arbitrary choices of time. For example, if an observer O finds the universe to
be isotropic, a second observer moving with constant velocity v ≠ 0 relative to O will not see
the universe as isotropic. Weyl’s postulate fixes this ambiguity: The spatial hypersurfaces with
isotropy and homogeneity are those defined by constant proper time as measured by an observer
comoving with the cosmological fluid, i.e. with the galaxy distribution averaged over a large
volume. You may wonder at this stage what that has to do with hypersurface orthogonality.
We will shortly come to this.
First, though, we will define suitable coordinates and explore the structure of the metrics
satisfying the cosmological principle. The galaxies are assumed, by construction, to have no
peculiar motion relative to the averaged large-scale motion of the cosmological fluid elements
and therefore remain at fixed positions (x1 , x2 , x3 ) in coordinates comoving with the fluid.
Furthermore, we define time t to be the proper time measured along the world lines of the
galaxies or fluid elements. Note that we assume the universe to be homogeneous in space
but not necessarily in time. We therefore allow metric components to have arbitrary time
dependency. The spatial part of the line element (i.e. setting dt = 0) at time t is
\[
  d\ell^2 = h_{ij}(t, x^k)\,dx^i dx^j . \tag{E.1}
\]
Isotropy at every point implies that the time evolution is the same in every direction, so that
none of the hij components can have a preferred time dependency. With all hij depending
on time in the same way, we can factor out a time dependent term and write the spatial line
element as
\[
  d\ell^2 = a(t)^2\,h_{ij}(x^k)\,dx^i dx^j . \tag{E.2}
\]
The spacetime metric with this spatial part and using a time coordinate given by the proper
time of comoving observers is
\[
  ds^2 = -dt^2 + g_{0i}\,dt\,dx^i + a(t)^2\,h_{ij}(x^k)\,dx^i dx^j . \tag{E.3}
\]
Now we use the hypersurface orthogonality of Weyl’s postulate. Let e0 = ∂t and ei = ∂i denote
the coordinate basis vectors. Clearly, ∂t is tangent to the world lines of observers comoving
with the cosmological fluid elements, since these are curves xi = const. By Weyl’s postulate,
these curves are orthogonal to the surface t = const. The spatial basis vectors ei are tangent
to this surface and we therefore have the condition
\[
  g_{0i} = g(e_0, e_i) = e_0\cdot e_i = 0
  \quad\Rightarrow\quad
  ds^2 = -dt^2 + a(t)^2\,h_{ij}(x^k)\,dx^i dx^j . \tag{E.4}
\]
Now we consider an observer moving with constant velocity relative to the fluid elements. The
metric in the frame of such an observer would be obtained from (E.4) by a Lorentz transformation.
This transformation would mix time and spatial coordinates and therefore lead to g0i ≠ 0;
cf. Eq. (A.85). The world line of this observer would not be orthogonal to a surface of constant
time in that frame and, as we already mentioned, such an observer would not see the universe
as isotropic.
We can further constrain the line element by considering the symmetry requirements on the
components hij . In Sec. D.1.2, we have seen that the spatial part of a spherically symmetric
metric can be written in the form [cf. Eq. (D.4)]
\[
  d\ell^2 = C(t, r)\,dr^2 + D(t, r)\left(d\theta^2 + \sin^2\theta\,d\phi^2\right) . \tag{E.5}
\]
Note that spherical symmetry means isotropy around one point. Our assumption of isotropy
around every point amounts to a so-called maximally symmetric space, which is a
stronger symmetry condition that implies spherical symmetry and more besides. One of the
“besides” that we have already identified is that the time dependency of C(t, r) can be factored
out as in Eq. (E.2). It turns out to be convenient to write this in the form C(t, r) = a(t)² e^{2β(r)}.
As we have seen in Sec. D.1.2, we can also rescale the radius to simplify the function D(t, r).
Instead of rescaling to D(t, r) = r² as in the derivation of the Schwarzschild metric, we now
use D(t, r) = a(t)² r², so that our line element (E.4) becomes
\[
  ds^2 = -dt^2 + a(t)^2\,d\ell^2 , \qquad
  d\ell^2 = e^{2\beta(r)}\,dr^2 + r^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right) . \tag{E.6}
\]
For further simplification, we focus on the spatial line element dℓ². The framework of differential
geometry we have developed in Sec. B applies to general manifolds and can therefore be used
as well to describe the three-dimensional hypersurface t = const. The only difference is that
we use Latin indices i, j, . . . = 1, 2, 3 in place of the Greek α, β, . . . = 0, . . . , 3 and that the
metric is now of signature (+ + +) instead of (− + + +). The quantities of particular interest
for our calculation are the three-dimensional Ricci tensor and scalar which we denote by R_ij
and R = R^i_i. A straightforward calculation gives us
\[
  R = \frac{2}{r^2}\left[1 - \partial_r\left(r\,e^{-2\beta}\right)\right] . \tag{E.7}
\]
This is a scalar quantity and therefore invariant under a coordinate transformation (x^i) → (x̃^m).
Furthermore we demand spatial homogeneity so that this quantity must be the same at every
point on the hypersurface t = const,
\[
  \frac{2}{r^2}\left[1 - \partial_r\left(r\,e^{-2\beta}\right)\right] = \tilde{k} = \mathrm{const} . \tag{E.8}
\]
This can be integrated to
\[
  e^{2\beta} = \frac{1}{1 - \frac{1}{6}\tilde{k}\,r^2 - \frac{A}{r}} , \qquad \text{with } A = \mathrm{const} . \tag{E.9}
\]
\[
  \Rightarrow\quad ds^2 = -dt^2 + a(t)^2\left[\frac{dr^2}{1 - kr^2} + r^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right)\right] . \tag{E.15}
\]
which is the flat metric on R3 but may also describe a topologically more complex
space such as a cylinder. Models with k = 0 are often called flat.
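\[
  r = \sin\chi \quad\Rightarrow\quad \frac{dr^2}{1 - r^2} = \frac{\cos^2\chi}{1 - \sin^2\chi}\,d\chi^2 = d\chi^2 , \tag{E.18}
\]
\[
  \Rightarrow\quad d\ell^2 = d\chi^2 + \sin^2\chi\left(d\theta^2 + \sin^2\theta\,d\phi^2\right) . \tag{E.19}
\]
\[
  r = \sinh\psi \quad\Rightarrow\quad \frac{dr^2}{1 + r^2} = \frac{\cosh^2\psi}{1 + \sinh^2\psi}\,d\psi^2 = d\psi^2 \tag{E.20}
\]
\[
  \Rightarrow\quad d\ell^2 = d\psi^2 + \sinh^2\psi\left(d\theta^2 + \sin^2\theta\,d\phi^2\right) . \tag{E.21}
\]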
This space can be viewed as the surface w² − x² − y² − z² = const in the flat manifold
with metric −dw² + dx² + dy² + dz². It is commonly viewed as a saddle. Models
with k = −1 are often called open.
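\[
  \begin{aligned}
  R_{33} &= \sin^2\theta\,R_{22} , & \Gamma^2_{21} &= \Gamma^3_{31} = \frac{1}{r} , \\
  R &= \frac{6}{a^2}\left(a\ddot{a} + \dot{a}^2 + k\right) , & \Gamma^2_{33} &= -\sin\theta\cos\theta , \qquad \Gamma^3_{32} = \cot\theta ,
  \end{aligned} \tag{E.22}
\]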
with all other non-vanishing components following by symmetry.
P = wρ , w = const , (E.27)
so that
\[
  \frac{\dot{\rho}}{\rho} = -3(1 + w)\,\frac{\dot{a}}{a}
  \quad\Rightarrow\quad \rho \propto a^{-3(1 + w)} . \tag{E.28}
\]
The important cases are dust, radiation and dark energy.
(1) Dust: Here we have
\[
  w = 0 \quad\Rightarrow\quad \rho \propto a^{-3} . \tag{E.29}
\]
Dust represents a matter dominated Universe. The pressure between the individual galaxies
is negligible, so that this type of cosmological fluid is well approximated by dust.
(2) Radiation: In the Statistical Physics lecture you have learned/will learn that photons
can be regarded as gas with equation of state P = ρ/3. This corresponds to
\[
  w = \frac{1}{3} \quad\Rightarrow\quad \rho \propto a^{-4} . \tag{E.30}
\]
As we will see below in Sec. E.3, cosmological expansion leads to a redshift of the photons
whose wavelength λ ∝ a. The four powers of a in Eq. (E.30) are therefore composed of
three factors for the density of photons and one factor for the energy per photon.
(3) Dark energy: The third type of matter is literally more obscure. Recall from Lovelock’s
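theorem in Sec. C.2.5 that we could add a cosmological term to the Einstein equations without
affecting the contracted Bianchi identities or any of the fundamental properties of the
“left-hand side” of the Einstein equations (C.35). The cosmological term can alternatively
be interpreted as part of the energy momentum tensor,
\[
  T^{(\Lambda)}_{\alpha\beta} = -\frac{\Lambda}{8\pi}\,g_{\alpha\beta}
  \quad\Rightarrow\quad \rho = -P = \frac{\Lambda}{8\pi} , \quad w = -1
  \quad\Rightarrow\quad \rho = \mathrm{const} .
\]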
This type of matter is interpreted as the non-zero ground state energy of the vacuum and
called dark energy. Do not confuse it with dark matter which is a separate dark-sector
component of the Universe that falls into either the dust or radiation category in this
discussion. As one might expect from a vacuum energy, its density is independent of the
size of the universe.
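To summarize, the energy density of the different types of matter considered is
\[
  \rho \propto a^{-3} \ \text{(dust)} , \qquad
  \rho \propto a^{-4} \ \text{(radiation)} , \qquad
  \rho = \mathrm{const} \ \text{(dark energy)} .
\]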
In an ever expanding universe, dark energy will therefore dominate over the other forms of
energy while the very early stages would be radiation dominated.
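The scaling law (E.28) can be confirmed by integrating the differential equation for ρ numerically. The following sketch rewrites ρ̇/ρ = −3(1 + w) ȧ/a as dρ/d(ln a) = −3(1 + w) ρ and compares a Runge-Kutta solution with the closed form for dust, radiation and dark energy; all function names and the normalization ρ = ρ0 at a = 1 are our own choices.

```python
import math

def density_scaling(a, w, rho0=1.0):
    """Closed-form solution rho = rho0 * a^(-3(1+w)) of Eq. (E.28), with rho(a=1) = rho0."""
    return rho0 * a ** (-3.0 * (1.0 + w))

def integrate_continuity(a_end, w, rho0=1.0, steps=1000):
    """Integrate d(rho)/d(ln a) = -3 (1 + w) rho from a = 1 to a_end with RK4."""
    h = math.log(a_end) / steps
    k = -3.0 * (1.0 + w)
    rho = rho0
    for _ in range(steps):
        k1 = k * rho
        k2 = k * (rho + 0.5 * h * k1)
        k3 = k * (rho + 0.5 * h * k2)
        k4 = k * (rho + h * k3)
        rho += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return rho

if __name__ == "__main__":
    for name, w in (("dust", 0.0), ("radiation", 1.0 / 3.0), ("dark energy", -1.0)):
        print(name, integrate_continuity(2.0, w), density_scaling(2.0, w))
```

Doubling the scale factor reduces the dust density by a factor 8 and the radiation density by a factor 16, while the dark energy density is unchanged.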
Before moving on with the Einstein equations and their solutions, we list here some parameters
that are frequently used in the literature on cosmology.
Def.: H := ȧ/a is the Hubble parameter.
      q := −a ä/ȧ² is the deceleration parameter.
      ρcrit := 3H²/(8π) is the critical density; its significance will be revealed below.
      Ω = ρ/ρcrit = 8πρ/(3H²) is the density parameter.
Note that these quantities are in general time dependent. They are often referred to as “pa-
rameters” because observations measure their present value which then is a number.
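As an illustration of the sizes involved, the sketch below converts a present-day Hubble parameter into the Hubble time 1/H0 and the critical density. It works in SI units, so Newton's constant is restored, ρcrit = 3H²/(8πG), and it uses H0 ≈ 71 km/(s Mpc), the value quoted later in these notes. The constants and function names are our own assumptions.

```python
import math

G = 6.67430e-11                     # gravitational constant, m^3 / (kg s^2)
KM_PER_MPC = 3.0856775814913673e19  # kilometres per megaparsec
SECONDS_PER_GYR = 3.15576e16        # 10^9 Julian years in seconds

def hubble_rate_si(h0_km_s_mpc):
    """Convert H0 from km/(s Mpc) to 1/s."""
    return h0_km_s_mpc / KM_PER_MPC

def hubble_time_gyr(h0_km_s_mpc):
    """Hubble time 1/H0 in Gyr."""
    return 1.0 / hubble_rate_si(h0_km_s_mpc) / SECONDS_PER_GYR

def critical_density_si(h0_km_s_mpc):
    """Critical density 3 H^2 / (8 pi G) in kg/m^3."""
    H = hubble_rate_si(h0_km_s_mpc)
    return 3.0 * H ** 2 / (8.0 * math.pi * G)

if __name__ == "__main__":
    print(hubble_time_gyr(71.0), "Gyr")
    print(critical_density_si(71.0), "kg/m^3")
```

The critical density comes out at a few hydrogen atoms per cubic metre.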
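\[
  \begin{aligned}
  3\,\frac{\dot{a}^2 + k}{a^2} - \Lambda &= 8\pi\rho \quad \text{(I)}
    & \Rightarrow\quad & \frac{1}{2}\,\frac{\dot{a}^2 + k}{a^2} - \frac{\Lambda}{6} = \frac{4\pi}{3}\,\rho , \\
  2\,\frac{\ddot{a}}{a} + \frac{\dot{a}^2 + k}{a^2} - \Lambda &= -8\pi P \quad \text{(II)}
    & \Rightarrow\quad & \frac{\ddot{a}}{a} + \frac{1}{2}\,\frac{\dot{a}^2 + k}{a^2} - \frac{\Lambda}{2} = -4\pi P , \\
  \frac{\ddot{a}}{a} &= -\frac{4\pi}{3}\,(\rho + 3P) + \frac{\Lambda}{3} \quad \text{(III)} .
  \end{aligned}
\]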
The first two equations (I) and (II) are the Friedmann equations and we have rewritten both
in a slightly different way on the right side, since these are useful in some of the calculations
we will do later on. The third equation (III) follows from the other two but will be frequently
used in its specific form. Since we will use these equations quite often in the remainder of this
section, we distinguish them by the special labels (I)-(III).
An interesting consequence is obtained by taking the derivative of Eq. (I) and multiplying
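Eq. (II) with 3ȧ/a which leads to
\[
  \begin{aligned}
  & 3\left(\frac{2\dot{a}\ddot{a}}{a^2} - 2\,\frac{\dot{a}(\dot{a}^2 + k)}{a^3}\right) = 8\pi\dot{\rho} , \qquad
    3\left(\frac{2\dot{a}\ddot{a}}{a^2} + \frac{\dot{a}^3}{a^3} + \frac{k\dot{a}}{a^3}\right) - 3\,\frac{\dot{a}}{a}\,\Lambda
    = -24\pi\,\frac{\dot{a}}{a}\,P \\
  \Rightarrow\quad & 3\left(-3\,\frac{\dot{a}^3 + \dot{a}k}{a^3} + \frac{\dot{a}}{a}\,\Lambda\right)
    = 8\pi\left(\dot{\rho} + 3\,\frac{\dot{a}}{a}\,P\right) .
  \end{aligned} \tag{E.34}
\]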
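Using Eq. (I) on the left-hand side gives
\[
  \begin{aligned}
  & -24\pi\,\frac{\dot{a}}{a}\,\rho = 8\pi\left(\dot{\rho} + 3\,\frac{\dot{a}}{a}\,P\right) \\
  \Rightarrow\quad & \dot{\rho} + 3\,\frac{\dot{a}}{a}\,(\rho + P) = 0 \qquad \big|\ \cdot\,a^3 \\
  \Rightarrow\quad & \frac{d}{dt}\left(a^3\rho\right) + P\,\frac{d}{dt}\,a^3 = 0 .
  \end{aligned} \tag{E.35}
\]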
The volume element of the metric (E.15) scales with V ∝ a³, so that our last equation can be
written as dE + P dV = 0, i.e. in the form of the first law of thermodynamics. This equation
can be shown to also follow from ∇_µ T^{µα} = 0. Here, instead, we obtained this equation by
differentiating the Einstein field equations. This is a direct manifestation of the contracted
Bianchi identities ∇_µ (G^{αµ} + Λ g^{αµ}) = 0.
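The consistency between Eq. (I) and the conservation law (E.35) can be illustrated with a simple example. The sketch below assumes a spatially flat, pressureless universe without cosmological constant (k = 0, P = 0, Λ = 0) with scale factor a(t) = t^{2/3}; this particular solution is our own choice for illustration and is not derived in the text above. The density is computed from Eq. (I), and derivatives are taken by central finite differences.

```python
import math

def scale_factor(t):
    """Assumed example: a(t) = t^(2/3), a dust-filled universe with k = 0, Lambda = 0."""
    return t ** (2.0 / 3.0)

def derivative(f, t, h=1e-4):
    """Central finite-difference approximation of df/dt."""
    return (f(t + h) - f(t - h)) / (2.0 * h)

def density_from_I(t, k=0.0, lam=0.0):
    """Energy density from Eq. (I): 3 (adot^2 + k) / a^2 - Lambda = 8 pi rho."""
    a = scale_factor(t)
    a_dot = derivative(scale_factor, t)
    return (3.0 * (a_dot ** 2 + k) / a ** 2 - lam) / (8.0 * math.pi)

def continuity_residual(t, pressure=0.0):
    """Left-hand side of rho_dot + 3 (adot / a) (rho + P) = 0, cf. (E.35)."""
    hubble = derivative(scale_factor, t) / scale_factor(t)
    rho_dot = derivative(density_from_I, t)
    return rho_dot + 3.0 * hubble * (density_from_I(t) + pressure)

if __name__ == "__main__":
    for t in (0.5, 1.0, 2.0):
        print(t, density_from_I(t), continuity_residual(t))
```

The residual vanishes up to finite-difference error, and the density follows ρ = 1/(6πt²) ∝ a⁻³ as expected for dust.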
[Figure 24: observer at r = 0 and galaxy at r = R.]
\[
  \Rightarrow\quad \frac{dt}{a(t)} = \pm\frac{dr}{\sqrt{1 - kr^2}} . \tag{E.36}
\]
One can straightforwardly show that the curves obtained from this equation also solve the
geodesic equation. Let the observer be located at r = 0 and a galaxy at r = R from where
it emits light towards the observer; cf. Fig. 24. A first signal is emitted at te and a second at
te + ∆te . These reach the observer at to and to + ∆to , respectively. The signals travel on ingoing
(towards r = 0) null geodesics, so we take the − sign in (E.36). For the two signals we thus
obtain
\[
  \int_{t_e}^{t_o}\frac{dt}{a} = -\int_R^0\frac{dr}{\sqrt{1 - kr^2}} , \qquad
  \int_{t_e + \Delta t_e}^{t_o + \Delta t_o}\frac{dt}{a} = -\int_R^0\frac{dr}{\sqrt{1 - kr^2}} . \tag{E.37}
\]
The right-hand side is the same in both equations, so that
\[ \int_{t_o}^{t_o + \Delta t_o} \frac{dt}{a} = \int_{t_e}^{t_e + \Delta t_e} \frac{dt}{a} . \tag{E.38} \]
Furthermore, we assume that ∆te , ∆to ≪ to − te , as realized for example for two consecutive
crests in a light wave. We can then regard a as nearly constant in the integrands. Finally the
wavelength of a photon is λ ∝ ∆t, so that
A final comment concerns the notion of distance in cosmology. A radial coordinate frequently
used in general relativity is the so-called areal radius R_ar defined such that a sphere of constant
R_ar has a proper surface area 4πR_ar². On a surface of constant radius, the Robertson-Walker
line element (E.15) becomes
The area of a sphere of constant r is 4πa2 r2 , so r is not an areal radius, but ar is. Now consider
the intensity of light collected at r = 0 from a source at r = R. The intensity is
\[ I := \frac{\text{energy}}{\text{area}} = \frac{E}{4\pi a^2 R^2 (1 + z)^2} . \tag{E.43} \]
The two factors of 1+z arise from (i) the redshift of each individual photon and (ii) the reduced
rate at which photons hit the observer relative to their emission rate. Astrophysicists often
use the so-called luminosity distance defined by
\[ D_L^2 := \frac{E}{4\pi I (1 + z)^2} , \tag{E.44} \]
which incorporates the redshift factors and therefore is identical to the areal radius, DL = ar.
(2) Let us again consider Λ = 0 and further assume that the energy density is positive and the
pressure is non-negative, ρ > 0, P ≥ 0. Then Eq. (III) tells us ä < 0. From observations we
furthermore know that ȧ > 0; the Universe is expanding. For vanishing ä, the curve a(t)
would be a straight line reaching the singularity a = 0 at time ∆t = −a/ȧ = −1/H, where H
would then be genuinely constant. Astrophysical observations determine the present value
of the Hubble constant H0 ≈ 71 km/(s Mpc) corresponding to −∆t = 1/H0 ≈ 13.8 Gyr.
With ä < 0, ȧ must have been larger in the past and ∆t is only an upper limit for the
age of the Universe; cf. Fig. 25. The singularity a = 0 is called the Big Bang. Near this
point, quantum effects will become important and general relativity is no longer expected
to provide an accurate description.
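The conversion between the Hubble constant and the Hubble time quoted in item (2) is simple unit arithmetic. A minimal sketch follows; the conversion constants are standard values rounded to five digits, and the function name `hubble_time_gyr` is chosen here for illustration.

```python
KM_PER_MPC = 3.0857e19        # kilometres in one megaparsec
SECONDS_PER_GYR = 3.1557e16   # seconds in 10^9 years

def hubble_time_gyr(h0_km_s_mpc):
    """Return 1/H0 in Gyr for H0 given in km/(s Mpc)."""
    h0_per_second = h0_km_s_mpc / KM_PER_MPC   # H0 in units of 1/s
    return 1.0 / h0_per_second / SECONDS_PER_GYR
```

For H0 = 71 km/(s Mpc) this gives about 13.8 Gyr, the upper limit for the age of the Universe discussed above (valid under the assumptions Λ = 0, ρ > 0, P ≥ 0 of this item).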
(3) We again consider the case Λ = 0, ρ > 0, P ≥ 0. From Eq. (I) we find
\[ \dot a^2 = \frac{8\pi}{3}\,a^2\rho - k . \tag{E.46} \]
For k = 0 or k = −1, the right-hand side is manifestly positive, so that ȧ2 > 0 always and
ȧ never reaches zero. Since ȧ > 0 today, we have ȧ > 0 always for open and flat Universes.
Next, we consider Eq. (E.35), which we write as
\[ \frac{d}{dt}(a^3\rho) = -P\,\frac{d}{dt}a^3 = -3a^2 P\,\dot a . \tag{E.47} \]
The right-hand side is non-positive, so that d(a3 ρ)/dt ≤ 0. On the other hand, ρa3 is
by construction non-negative and must therefore approach a non-negative constant at late
times. This implies
\[ \lim_{t\to\infty} a^2\rho = 0 . \tag{E.48} \]
Figure 26: Illustration of the function a(t) for open (k = −1), flat (k = 0) and closed (k = +1)
In open Universes, ȧ → 1 at late times, while in the flat case ȧ → 0. In both cases, the
expansion never stops; cf. Fig. 26.
We first consider the case Λ > 0, set k = 0 in Eq. (E.52) and introduce a new variable
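\[ u = \frac{2\Lambda}{3C}\,a^3 \quad\Rightarrow\quad \dot u = \frac{2\Lambda}{C}\,a^2\dot a \]
\[ \Rightarrow\quad \dot u^2 = \frac{4\Lambda^2}{C^2}\,a^4\left(\frac{C}{a} + \frac{\Lambda}{3}\,a^2\right) = \frac{4\Lambda^2}{C}\,a^3 + \frac{4\Lambda^3}{3C^2}\,a^6 = 6\Lambda u + 3\Lambda u^2 \]
\[ \Rightarrow\quad \dot u^2 = 3\Lambda\,(2u + u^2) \]
\[ \Rightarrow\quad \dot u = \sqrt{3\Lambda}\,(2u + u^2)^{1/2} , \tag{E.54} \]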
Assuming that the Universe starts with a big bang, we use initial conditions a = u = 0 at
t = 0, so that
\[ \int_0^u \frac{1}{\sqrt{2\tilde u + \tilde u^2}}\,d\tilde u = \sqrt{3\Lambda}\,t . \tag{E.55} \]
The integral on the left-hand side is solved with u = −1 + cosh w,
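\[ \int_0^u \frac{d\tilde u}{\sqrt{\tilde u^2 + 2\tilde u}} = \int_0^u \frac{d\tilde u}{\sqrt{(\tilde u + 1)^2 - 1}} = \int_0^w \frac{\sinh\tilde w\,d\tilde w}{\sqrt{\cosh^2\tilde w - 1}} = \int_0^w d\tilde w = w \]
\[ \Rightarrow\quad u + 1 = \cosh w = \cosh(\sqrt{3\Lambda}\,t) \]
\[ \Rightarrow\quad \frac{2\Lambda}{3C}\,a^3 = \cosh(\sqrt{3\Lambda}\,t) - 1 \quad\Rightarrow\quad a^3 = \frac{3C}{2\Lambda}\left[\cosh(\sqrt{3\Lambda}\,t) - 1\right] . \tag{E.56} \]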
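The closed-form solution (E.56) can be checked numerically. The sketch below assumes that Eq. (E.52) with k = 0 reads ȧ² = C/a + Λa²/3, which is the form used in the derivation of (E.54); the function names and the finite-difference step `dt` are choices made for this illustration.

```python
import math

def scale_factor(t, C=1.0, Lam=1.0):
    """Scale factor a(t) of Eq. (E.56) for Lambda > 0, k = 0, P = 0."""
    a_cubed = 1.5 * C / Lam * (math.cosh(math.sqrt(3.0 * Lam) * t) - 1.0)
    return a_cubed ** (1.0 / 3.0)

def friedmann_residual(t, C=1.0, Lam=1.0, dt=1e-5):
    """Return a_dot^2 - (C/a + Lam a^2 / 3); zero for an exact solution.

    a_dot is approximated by a central finite difference.
    """
    a = scale_factor(t, C, Lam)
    a_dot = (scale_factor(t + dt, C, Lam) - scale_factor(t - dt, C, Lam)) / (2.0 * dt)
    return a_dot ** 2 - (C / a + Lam * a ** 2 / 3.0)
```

The residual should vanish up to finite-difference error for any t > 0 and any positive C and Λ.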
Figure 27: Flat, matter dominated cosmological models for Λ > 0, Λ = 0, Λ < 0.
\[ a^3 = \frac{3C}{2(-\Lambda)}\left[1 - \cos(\sqrt{-3\Lambda}\,t)\right] . \tag{E.58} \]
The case Λ = 0 is obtained directly from Eq. (E.52) which, with Λ = k = 0, becomes
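\[ \dot a^2 = \frac{C}{a} \quad\Rightarrow\quad \int \sqrt{a}\,da = \int \sqrt{C}\,dt \]
\[ \Rightarrow\quad \frac{2}{3}\,a^{3/2} = \sqrt{C}\,t \quad\Rightarrow\quad a^3 = \frac{9C}{4}\,t^2 . \tag{E.59} \]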
This model is known as the Einstein-de Sitter model. For this case, k = Λ = 0, we find
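\[ H = \frac{\dot a}{a} = \frac{2}{3t} , \]
\[ q = -\frac{a\ddot a}{\dot a^2} = -\left(\frac{\dot a}{a}\right)^{-1}\frac{\ddot a}{\dot a} = \frac{1}{2} . \tag{E.60} \]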
The three different types of models (E.56), (E.59) and (E.58) are graphically illustrated in
Fig. 27.
(2) Matter dominated models with vanishing cosmological constant: Λ = 0, P = 0
Equation (E.52) now gives us
\[ \dot a^2 = \frac{C}{a} - k . \tag{E.61} \]
Figure 28: Matter dominated cosmological models with Λ = 0. Note that the model k = 0 is
the Einstein-de Sitter model also shown in Fig. 27.
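\[ \arcsin u - u\sqrt{1 - u^2} = \pm\frac{t}{C} + b_\pm \quad\Rightarrow\quad C\left[\arcsin\sqrt{\frac{a}{C}} - \sqrt{\frac{a}{C}}\,\sqrt{1 - \frac{a}{C}}\right] = \pm t + b_\pm . \tag{E.63} \]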
\[ \Rightarrow\quad a = \frac{3C}{2} \quad\wedge\quad \Lambda = \frac{4}{9C^2} . \tag{E.67} \]
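Unfortunately, this model is not stable. Let us use (E.67) as a background solution a0 and
perturb around this background using a = a0 + ε, ε ≪ a0. From Eq. (III) we obtain
\[ \frac{\ddot a}{a} = -\frac{4\pi}{3}\,\rho + \frac{\Lambda}{3} \]
\[ \Rightarrow\quad a^2\ddot a = -\frac{C}{2} + \frac{4}{27C^2}\,a^3 \]
\[ \Rightarrow\quad a_0^2\,\ddot\epsilon \approx \underbrace{-\frac{C}{2} + \frac{4}{27C^2}\,a_0^3}_{=0} + \frac{4}{27C^2}\,3a_0^2\,\epsilon + \ldots = \frac{4}{9C^2}\,\frac{9}{4}\,C^2\,\epsilon = \epsilon \]
\[ \Rightarrow\quad \ddot\epsilon = \frac{4}{9C^2}\,\epsilon = \Lambda\,\epsilon . \tag{E.68} \]
The solutions are exponential functions exp(±√Λ t). The negative exponent can be ruled
out on physical grounds. Say, ε > 0, then the Universe is less dense, the gravitational
attraction is reduced which leads to further expansion.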
\[ 3\,\frac{\dot a^2}{a^2} = \Lambda \quad\Rightarrow\quad a(t) \propto e^{\pm\sqrt{\Lambda/3}\,t} \tag{E.71} \]
The result is a bit misleading since all the three solutions can be shown to represent the
same spacetime, merely in different coordinates. Readers interested in more details are
referred to Hawking & Ellis [13]. In Fig. 29 we display the scale factor a(t) as given by
Eq. (E.70) for k = −1.
We mention in passing that for Λ < 0, there also exists a solution known as the Anti-
de Sitter spacetime. It has attracted less interest in a cosmological context, but plays a
central role in a fairly new branch of gravitational research known as the gauge-gravity
duality, sometimes also called the AdS/CFT correspondence (CFT stands for conformal
field theory).
(5) Radiation dominated, vanishing cosmological constant: P = ρ/3, Λ = 0
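We recall from Eq. (E.35) that in general (even if Λ ≠ 0),
\[ \frac{d}{dt}(a^3\rho) + P\,\frac{d}{dt}a^3 = 0 \qquad \Big|\ \text{Now set } P = \frac{\rho}{3} \]
\[ \Rightarrow\quad \frac{d}{dt}(a^3\rho) + \frac{1}{3}\,\rho\,\frac{d}{dt}a^3 = \frac{d}{dt}(a^3\rho) + \rho a^2\,\frac{da}{dt} = 0 . \tag{E.73} \]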
Figure 29: The de Sitter Universe contains no matter other than dark energy corresponding
to Λ > 0. The solutions for k = −1, 0, +1 describe the same spacetime merely in different
coordinates. The figure shows a(t) for k = −1 as given in Eq. (E.70)
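\[ 3\,\frac{\dot a^2}{a^2} = 8\pi\rho \quad\Rightarrow\quad \dot a^2 a^2 = \frac{8\pi}{3}\,a^4\rho \overset{!}{=} B \]
\[ \Rightarrow\quad \int a\,da = \pm\int \sqrt{B}\,dt \quad\Rightarrow\quad \frac{1}{2}\,a^2 = \pm\sqrt{B}\,t . \tag{E.76} \]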
The scale factor a is real and non-negative, so that we take the positive square root on both
occasions and obtain
\[ a = \sqrt{2}\,B^{1/4}\sqrt{t} . \tag{E.77} \]
Figure 30: The scale factor for radiation-dominated universes with vanishing cosmological
constant Λ = 0 and k = −1, k = 0 and k = +1. The behaviour is similar to the matter
dominated counterparts in Fig. 28.
\[ k = -1 \quad\Rightarrow\quad a = \sqrt{B}\,\sqrt{\left(1 + \frac{t}{\sqrt{B}}\right)^2 - 1} . \tag{E.79} \]
The three solutions (E.79), (E.77) and (E.78) are displayed in Fig. 30.
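The flat and open radiation solutions (E.77) and (E.79) can be checked against the Friedmann equation numerically. This sketch assumes that Eq. (I) with Λ = 0 and (8π/3)a⁴ρ = B takes the form ȧ² = B/a² − k, which generalizes the k = 0 relation in (E.76); the function names are illustrative.

```python
import math

def a_flat(t, B=1.0):
    """Radiation-dominated scale factor for k = 0, Eq. (E.77)."""
    return math.sqrt(2.0) * B ** 0.25 * math.sqrt(t)

def a_open(t, B=1.0):
    """Radiation-dominated scale factor for k = -1, Eq. (E.79)."""
    root_b = math.sqrt(B)
    return root_b * math.sqrt((1.0 + t / root_b) ** 2 - 1.0)

def radiation_residual(a_func, k, t, B=1.0, dt=1e-6):
    """Return a_dot^2 - (B / a^2 - k), using a central difference for a_dot."""
    a = a_func(t, B)
    a_dot = (a_func(t + dt, B) - a_func(t - dt, B)) / (2.0 * dt)
    return a_dot ** 2 - (B / a ** 2 - k)
```

Both residuals should vanish up to finite-difference error; at late times the open solution approaches ȧ → 1, in line with the behaviour of open Universes noted earlier.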
A brief summary of our observations is as follows. We have the following conservation laws for
the energy density.
(1) Radiation: ρa4 = const ,
(2) Matter: ρa3 = const ,
(3) Vacuum energy: ρ ∝ Λ = const .
Going back into the past when the Universe was smaller, we therefore find radiation to become
the increasingly dominant form of energy. Likewise, as a increases to the future, dark energy
will become more and more dominant. Only a stop of the expansion and an ensuing contraction
phase would then put an end to the dominance of dark energy. Our observations indicate that
at present, about 75 % of the Universe’s energy is in the form of dark energy, about 25 % in
the form of matter and only a negligible fraction as radiation. The 25 % of matter subdivide
into about 4 % of visible matter (such as stars or gas) and about 21 % dark matter whose
gravitational effect is apparent, for example in the rotation curve of galaxies, but whose nature
is unknown. It is an open puzzle that our present era coincides with a time where none of
the forms of energy completely dominates over the others. Bear in mind, however, that
modifications of Einstein’s theory cannot be ruled out and may change the picture we are
drawing here. As we have seen in our discussion of the motion of planets in Sec. A.2.5, history
has seen both, the revelation of previously unknown matter (Neptune) and a case where the
theory of gravity needed to be modified (Mercury). So stay tuned...
F SINGULARITIES AND GEODESIC INCOMPLETENESS 136
Clearly something goes bad in this metric at r = 2M where grr → ∞. We have seen in
Sec. D.4.5, however, that switching to Kruskal-Szekeres coordinates, we were able to cure this
singularity. We saw that r = 2M is still a special point, namely the location of the event horizon
that marks Schwarzschild’s solution as a black hole. But nothing really bad is happening at
that point. Likewise, the metric components diverge at r = 0 and this is still the case in the
Kruskal line element (D.111). This raises two questions. First, can we determine without a
priori knowledge of better coordinates whether such coordinates exist? Second, could there be
a further improvement over Kruskal coordinates that may even cure the singularity at r = 0?
Both questions amount to finding a criterion for deciding whether we have a coordinate singularity or a
genuine physical singularity.
In order to answer that question, we turn to our rules of tensor calculus, where we saw that
scalars are invariant under coordinate transformations. Finding a suitable curvature scalar
should then tell us more about the nature of a singularity no matter which coordinates we
happen to be using. One might first turn towards the Ricci scalar (B.8.6), but this is not
too helpful: any vacuum spacetime satisfies the vacuum version of Einstein’s field equations
Rαβ = 0, so that the Ricci scalar also vanishes in such spacetimes by construction. A more
powerful variable is the Kretschmann scalar constructed out of the Riemann tensor
While the Ricci tensor vanishes for vacuum spacetimes such as Schwarzschild, the Riemann
tensor only vanishes for the Minkowski metric. After a straightforward but tedious calculation
(preferably performed with computation packages such as Mathematica [33] or GRTensor in
Maple [34, 35]), one finds for the Schwarzschild metric that
\[ \kappa = \frac{48M^2}{r^6} . \tag{F.3} \]
This tells us that the curvature diverges at r = 0 which therefore represents a genuinely singular
point in the spacetime whereas the curvature at r = 2M is regular. Likewise, we find for the
Einstein-de Sitter Universe (E.59) that at t = 0
\[ \kappa = \frac{80}{27t^4} , \tag{F.4} \]
which therefore is also a physical singularity.
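The two Kretschmann scalars above can be evaluated with a few lines of code. The sketch below uses Eq. (F.3) directly and, for the cosmological case, the standard result κ = 12[(ä/a)² + (ȧ/a)⁴] for a spatially flat Robertson-Walker metric; that formula is quoted here as an assumption, since it is not derived in these notes. The function names are illustrative.

```python
from fractions import Fraction

def kretschmann_schwarzschild(r, M=1.0):
    """Kretschmann scalar of the Schwarzschild metric, Eq. (F.3)."""
    return 48.0 * M ** 2 / r ** 6

def flat_flrw_kretschmann_coefficient(p):
    """Coefficient c in kappa = c / t^4 for a power law a(t) ~ t^p with k = 0.

    Uses a_dot/a = p/t and a_ddot/a = p (p - 1) / t^2 in the assumed
    formula kappa = 12 [(a_ddot/a)^2 + (a_dot/a)^4].
    """
    p = Fraction(p)
    return 12 * ((p * (p - 1)) ** 2 + p ** 4)
```

At the horizon r = 2M the Schwarzschild value is finite, 3/(4M⁴), while for the Einstein-de Sitter exponent p = 2/3 the coefficient reproduces the 80/27 of Eq. (F.4).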
\[ c_0 = \frac{\dot t}{z} , \qquad c_1 = z^2\dot x , \qquad c_2 = z^2\dot y , \tag{F.6} \]
where the dot denotes differentiation with respect to an affine parameter λ. Furthermore, the
Lagrangian does not explicitly depend on λ, so that
\[ g_{\mu\nu}\dot x^\mu\dot x^\nu = -\frac{1}{z}\,\dot t^2 + z^2(\dot x^2 + \dot y^2) + z\,\dot z^2 = \epsilon , \tag{F.7} \]
where ε = +1 (−1, 0) for spacelike (timelike, null) geodesics with suitable affine parameter. A
straightforward calculation shows that the geodesic equation is solved by
\[ \dot z^2 + \frac{c_1^2 + c_2^2}{z^3} - \frac{\epsilon}{z} = c_0^2 , \tag{F.8} \]
with ṫ, ẋ and ẏ directly following from the constants of motion (F.6).
Let us now consider the special case of null geodesics with initial conditions
ẋ = ẏ = 0 , ż < 0 , z = z0 at t = 0 . (F.9)
Without loss of generality, we assume time to be increasing towards the future, i.e. ṫ = zc0 >
0 ⇒ c0 > 0. Clearly, ẋ = 0, ẏ = 0 remain valid along the entire geodesic, so that all we need
is to solve
ż 2 = c20 ⇒ ż = ±c0 . (F.10)
For our initial condition ż < 0 we use the minus sign in the square root and our solution is
z = −c0 λ + z0 . (F.11)
From this equation we conclude that the geodesic hits the point z = 0 at finite affine parameter
λ. Is z = 0 a physical singularity? “Yes” screams the Kretschmann scalar
\[ \kappa = \frac{12}{z^6} . \tag{F.12} \]
We see here an example of geodesic incompleteness.
Does the same happen at coordinate singularities? To answer this question, we consider a
second example, the Rindler spacetime (see e.g. [17]). Its metric is given by
ds2 = −z 2 dt2 + dx2 + dy 2 + dz 2 , (F.13)
with t, x, y, z ∈ R, z > 0. We use Noether’s theorem again on the Lagrangian for geodesics
with affine parameter λ which does not depend on t, x or y
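\[ -\frac{\partial L}{\partial\dot t} = 2z^2\dot t = 2c_0 \quad\Rightarrow\quad \dot t = \frac{c_0}{z^2} , \qquad \dot x = c_1 , \qquad \dot y = c_2 , \tag{F.14} \]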
Furthermore, L does not depend on λ so that
\[ -z^2\dot t^2 + \dot x^2 + \dot y^2 + \dot z^2 = \epsilon , \tag{F.15} \]
where ε = +1 (−1, 0) for spacelike (timelike, null) geodesics with suitable affine parameter.
We consider geodesics with initial conditions
ẋ = ẏ = 0 , ż < 0 , z = z0 at t = 0 . (F.16)
We assume again future pointing time, so that ṫ = c0 /z 2 > 0, so that Eq. (F.15) becomes
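\[ L = -z^2\,\frac{c_0^2}{z^4} + \underbrace{c_1^2 + c_2^2}_{=0} + \dot z^2 = \epsilon \]
\[ \Rightarrow\quad \dot z^2 = \epsilon + \frac{c_0^2}{z^2} . \tag{F.17} \]
The solution for timelike geodesics (ε = −1 which implies λ = τ) is given by
\[ z(\tau) = \sqrt{z_0^2 - \tau^2} , \qquad t(\tau) = \operatorname{artanh}\frac{\tau}{z_0} . \tag{F.18} \]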
We see that we cannot extend the geodesic beyond the affine parameter |τ | = z0 .
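The timelike solution (F.18) can be verified numerically against the normalization (F.15) and the conserved quantity in (F.14). The following sketch uses central finite differences; the function names and the step size `h` are choices made for this illustration.

```python
import math

def rindler_geodesic(tau, z0=1.0):
    """Return (t, z) along the timelike geodesic of Eq. (F.18); needs |tau| < z0."""
    return math.atanh(tau / z0), math.sqrt(z0 ** 2 - tau ** 2)

def tangent_data(tau, z0=1.0, h=1e-6):
    """Return (norm, c0): the norm -z^2 t_dot^2 + z_dot^2 and c0 = z^2 t_dot."""
    t_plus, z_plus = rindler_geodesic(tau + h, z0)
    t_minus, z_minus = rindler_geodesic(tau - h, z0)
    t_dot = (t_plus - t_minus) / (2.0 * h)
    z_dot = (z_plus - z_minus) / (2.0 * h)
    _, z = rindler_geodesic(tau, z0)
    return -z ** 2 * t_dot ** 2 + z_dot ** 2, z ** 2 * t_dot
```

For every admissible τ the norm should equal −1 and the conserved quantity should equal z0, which shows that (F.18) corresponds to the choice c0 = z0 in Eq. (F.17).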
In order to see what is happening here, we transform the Rindler metric (F.13) to new coordi-
nates T, X, Y, Z defined by
\[ x = X , \qquad y = Y , \qquad z = \sqrt{Z^2 - T^2} , \qquad t = \operatorname{artanh}\frac{T}{Z} . \tag{F.19} \]
Figure 31: The Rindler wedge. Curves of constant t and z in the T -Z plane of the Minkowski
spacetime. Note that curves t → ±∞ coincide with the curve z = 0.
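\[ ds^2 = -z^2 dt^2 + dx^2 + dy^2 + dz^2 \]
\[ = dT^2\left[-\frac{Z^2}{Z^2 - T^2} + \frac{T^2}{Z^2 - T^2}\right] + dZ^2\left[\frac{-T^2}{Z^2 - T^2} + \frac{Z^2}{Z^2 - T^2}\right] + dX^2 + dY^2 \]
\[ = -dT^2 + dX^2 + dY^2 + dZ^2 . \tag{F.20} \]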
This is simply the Minkowski spacetime which contains no singularity. The geodesic incom-
pleteness in the Rindler spacetime signifies a coordinate singularity. For illustration of the
so-called Rindler wedge we invert the coordinate transformation,
\[ T = \frac{z\tanh t}{\sqrt{1 - \tanh^2 t}} , \qquad Z = \frac{z}{\sqrt{1 - \tanh^2 t}} , \tag{F.21} \]
and show in Fig. 31 curves of constant t and z in the Minkowski spacetime spanned by T and
Z.
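The coordinate maps (F.19) and (F.21) are easy to experiment with. In the sketch below the identity 1/√(1 − tanh² t) = cosh t is used to write (F.21) as T = z sinh t, Z = z cosh t; the function names are illustrative.

```python
import math

def rindler_to_minkowski(t, z):
    """Eq. (F.21) written with hyperbolic functions: (T, Z) = (z sinh t, z cosh t)."""
    return z * math.sinh(t), z * math.cosh(t)

def minkowski_to_rindler(T, Z):
    """Eq. (F.19); only defined inside the wedge Z > |T|."""
    if Z <= abs(T):
        raise ValueError("point lies outside the Rindler wedge Z > |T|")
    return math.atanh(T / Z), math.sqrt(Z * Z - T * T)
```

Curves of constant z are the hyperbolae Z² − T² = z² and curves of constant t are straight lines T = Z tanh t through the origin, so every image point satisfies Z > |T|. This is why the Rindler coordinates cover only a wedge of the Minkowski spacetime.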
Of course, we benefited greatly in this case from “knowing” the correct coordinate transfor-
mation. Can we systematically identify the “correct” coordinates for extending a spacetime in
this way once we have identified geodesic incompleteness and convinced ourselves that the sin-
gularity is not physical? In general, there is no recipe. But in two dimensions (which includes,
for example, four-dimensional spacetimes with spherical symmetry), there exists a systematic
procedure based on using affine parameters of ingoing and outgoing null geodesics. For more
details, we refer to Sec. 6.4 in Wald [30].
G LINEARIZED THEORY AND GRAVITATIONAL WAVES 141
\[ \vec E, \vec B \propto e^{i(\vec k\cdot\vec x - \omega t)} , \tag{G.1} \]
where ~k is the wave propagation vector. This is most easily seen by rotating the coordinate
system such that ~k points in the direction of one coordinate, say z. Then
\[ \vec k = (0, 0, k) \quad\Rightarrow\quad \vec E, \vec B \propto e^{i(kz - \omega t)} = e^{ik(z - vt)} , \tag{G.2} \]
where v = ω/k is the phase velocity. Plane electromagnetic waves solve the wave equation
\[ \Box f = -\partial_t^2 f + \vec\nabla^2 f = 0 , \tag{G.3} \]
where f stands for any of the field components. Plugging (G.1) into the wave equation we
obtain the condition
ω 2 − ~k 2 = 0 . (G.4)
For a plane wave traveling in the z direction, this implies a phase velocity v = ω/k = ±1,
i.e. the wave propagates at unit speed. In relativistic notation, we write solutions to the wave
equation (G.3) as
\[ f \propto e^{i k_\alpha x^\alpha} , \qquad k_\alpha = (-\omega, \vec k) \ \ \text{with} \ \ k_\alpha k^\alpha = 0 . \tag{G.5} \]
For a plane wave traveling in the z direction, kα = (−ω, 0, 0, k) and ω = |k|.
Plane waves also exist in general relativity, either in the perturbative regime or in the fully
non-linear theory. We briefly consider the latter case before focusing on the linearized case in
the remainder of this section.
Def.: In general relativity, spacetimes admitting planar wave solutions are called pp wave spacetimes
and defined in more mathematical terms as spacetimes that admit a covariantly constant
vector field V .
A class of spacetimes which satisfies this property is given by the so-called Brinkmann metrics
ds2 = H(u, x, y)du2 + 2du dv + dx2 + dy 2 . (G.6)
It satisfies the above definition, since V := ∂v is a null vector field with
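\[ \nabla_\alpha V^\beta = \partial_\alpha V^\beta + \Gamma^\beta_{\mu\alpha}V^\mu = 0 + \Gamma^\beta_{\mu\alpha}\,\delta^\mu{}_v = \Gamma^\beta_{v\alpha} = 0 , \tag{G.7} \]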
since a straightforward calculation shows that all Christoffel symbols Γαµν with µ = v or ν = v
vanish.
The vacuum Einstein equations Rαβ = 0 for the metric (G.6) have only one non-trivial component
Ruu = 0 ⇒ ∂x2 H + ∂y2 H = 0 . (G.8)
A plane wave propagating in the z direction
\[ H = H_0\,e^{i k_\alpha x^\alpha} , \qquad H_0 = \text{const} , \qquad k_\alpha = (-\omega, 0, 0, \omega) , \tag{G.9} \]
therefore solves the Einstein equations as well as the wave equation (G.3). We have only intro-
duced the Brinkmann metrics here to illustrate how plane waves can arise in general relativity
and how they are represented mathematically. The concept of Brinkmann metrics and covari-
antly constant vectors, however, has more far-reaching consequences for the construction of
analytic solutions to the Einstein equations. For example, one can allow for more general wave
solutions with axisymmetry; the wave amplitude is no longer constant in the plane. One appli-
cation of this technique leads to the Aichelburg-Sexl metric [3] that describes a Schwarzschild
black hole moving at the speed of light. Analytic solutions of this type play important roles in
contemporary research.
A metric that is close to Minkowski is conveniently described in terms of its deviation from ηµν ,
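\[ g^{\mu\nu} = \eta^{\mu\nu} + k^{\mu\nu} , \qquad k^{\mu\nu} = \mathcal{O}(\epsilon) \ll 1 \]
\[ \Rightarrow\quad g^{\mu\nu}g_{\nu\rho} = \delta^\mu{}_\rho + k^{\mu\nu}\eta_{\nu\rho} + \eta^{\mu\nu}h_{\nu\rho} + \underbrace{k^{\mu\nu}h_{\nu\rho}}_{=\mathcal{O}(\epsilon^2)} \overset{!}{=} \delta^\mu{}_\rho . \tag{G.12} \]
In linearized theory we drop all terms beyond linear order O(ε). Here lies the key simplification
achieved with the perturbative technique. For the inverse metric perturbation we thus obtain
\[ k^{\mu\nu}\eta_{\nu\rho} + \eta^{\mu\nu}h_{\nu\rho} = 0 \qquad \Big|\ \cdot\,\eta^{\rho\sigma} \]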
Here we have raised the indices of hµν with the Minkowski metric η αβ . Note, however, that at
linear order, raising the indices instead with g µν would have led to the same result. Nevertheless,
we need to be watchful in raising and lowering indices and bear in mind which metric is used.
Unless specified otherwise, we shall from now on use the physical metric g to raise and lower
indices. Note also that k^µν ≠ h^µν. This is a general result: the perturbation of a tensor with
upstairs indices is not obtained by raising (either with g µν or η µν ) those of the downstairs tensor
perturbations.
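Let us next calculate the perturbations of the Christoffel symbols. To linear order in ε,
\[ \Gamma^\mu_{\nu\rho} = \frac{1}{2}\,\eta^{\mu\sigma}\left(\partial_\nu h_{\rho\sigma} + \partial_\rho h_{\sigma\nu} - \partial_\sigma h_{\nu\rho}\right) + \mathcal{O}(\epsilon^2) . \tag{G.14} \]
For the Riemann tensor we obtain
\[ \Rightarrow\quad R_{\mu\nu\rho\sigma} = \frac{1}{2}\left(\partial_\rho\partial_\nu h_{\mu\sigma} + \partial_\sigma\partial_\mu h_{\nu\rho} - \partial_\rho\partial_\mu h_{\nu\sigma} - \partial_\sigma\partial_\nu h_{\mu\rho}\right) \tag{G.15} \]
\[ \Rightarrow\quad R_{\mu\nu} = \partial^\rho\partial_{(\mu}h_{\nu)\rho} - \frac{1}{2}\,\partial^\rho\partial_\rho h_{\mu\nu} - \frac{1}{2}\,\partial_\mu\partial_\nu h \qquad \Big|\ h := h^\mu{}_\mu , \ \ \partial^\mu := g^{\mu\rho}\partial_\rho \tag{G.16} \]
\[ \Rightarrow\quad G_{\mu\nu} = \partial^\rho\partial_{(\mu}h_{\nu)\rho} - \frac{1}{2}\,\partial^\rho\partial_\rho h_{\mu\nu} - \frac{1}{2}\,\partial_\mu\partial_\nu h - \frac{1}{2}\,\eta_{\mu\nu}\left(\partial^\rho\partial^\sigma h_{\rho\sigma} - \partial^\rho\partial_\rho h\right) \overset{!}{=} 8\pi T_{\mu\nu} . \tag{G.17} \]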
Note that the Einstein tensor Gµν = O(ε) and, hence, the energy momentum tensor is also of
perturbative order Tµν = O(ε). Equation (G.17) gives us the Einstein equations at first order
in ε. It turns out that these equations are more conveniently expressed in terms of the trace
reversed metric perturbation.
Plugging this definition into Eq. (G.17), we obtain after a little calculation
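\[ G_{\mu\nu} = -\frac{1}{2}\,\partial^\rho\partial_\rho\bar h_{\mu\nu} + \partial^\rho\partial_{(\mu}\bar h_{\nu)\rho} - \frac{1}{2}\,\eta_{\mu\nu}\,\partial^\rho\partial^\sigma\bar h_{\rho\sigma} = 8\pi T_{\mu\nu} . \tag{G.19} \]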
Further simplification of the linearized Einstein equations is achieved by using the coordinate
freedom. Note that we have specified the background coordinates, Cartesian coordinates in an
inertial frame of the Minkowski spacetime. But we can still change the coordinates at order
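O(ε). We denote this change by a difference ξ^α = O(ε),
\[ \tilde x^\alpha = x^\alpha - \xi^\alpha \quad\Leftrightarrow\quad x^\alpha = \tilde x^\alpha + \xi^\alpha \]
\[ \frac{\partial\tilde x^\alpha}{\partial x^\mu} = \delta^\alpha{}_\mu - \partial_\mu\xi^\alpha \quad\Leftrightarrow\quad \frac{\partial x^\nu}{\partial\tilde x^\beta} = \delta^\nu{}_\beta + \tilde\partial_\beta\xi^\nu . \tag{G.20} \]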
The physical metric transforms according to the tensor transformation law (B.34), so that
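\[ \tilde g_{\mu\nu} = \eta_{\mu\nu} + \tilde h_{\mu\nu} = (\delta^\alpha{}_\mu + \partial_\mu\xi^\alpha)(\delta^\beta{}_\nu + \partial_\nu\xi^\beta)(\eta_{\alpha\beta} + h_{\alpha\beta}) = \eta_{\mu\nu} + h_{\mu\nu} + \partial_\mu\xi_\nu + \partial_\nu\xi_\mu + \mathcal{O}(\epsilon^2) \]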
We have four free functions and can use these to satisfy four relations. A particularly convenient
transformation is to choose the ξµ such that
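\[ \partial^\nu\partial_\nu\xi_\mu = -\partial^\nu\bar h_{\mu\nu} \tag{G.22} \]
\[ \Rightarrow\quad \bar{\tilde h}_{\mu\nu} = \tilde h_{\mu\nu} - \frac{1}{2}\,\eta^{\rho\sigma}\tilde h_{\rho\sigma}\,\eta_{\mu\nu} = h_{\mu\nu} + \partial_\mu\xi_\nu + \partial_\nu\xi_\mu - \frac{1}{2}\,\eta^{\rho\sigma}\left(h_{\rho\sigma} + \partial_\rho\xi_\sigma + \partial_\sigma\xi_\rho\right)\eta_{\mu\nu} \]
\[ \Rightarrow\quad \bar{\tilde h}_{\mu\nu} = \bar h_{\mu\nu} + \partial_\mu\xi_\nu + \partial_\nu\xi_\mu - \eta_{\mu\nu}\,\partial^\sigma\xi_\sigma \tag{G.23} \]
\[ \Rightarrow\quad \partial^\nu\bar{\tilde h}_{\mu\nu} = \partial^\nu\bar h_{\mu\nu} + \partial^\nu\partial_\mu\xi_\nu + \partial^\nu\partial_\nu\xi_\mu - \partial^\sigma\partial_\mu\xi_\sigma = \partial^\nu\bar h_{\mu\nu} + \partial^\nu\partial_\nu\xi_\mu = 0 . \tag{G.24} \]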
Note that the expression (G.19) for the Einstein tensor is valid in unchanged form if we replace
hµν with h̃µν , since we could have started the entire derivation with either h or h̃. With the
gauge condition (G.24), however, Eq. (G.19) simplifies to
\[ \partial^\rho\partial_\rho\bar{\tilde h}_{\mu\nu} = -16\pi T_{\mu\nu} . \tag{G.25} \]
This is a quite remarkable simplification: we merely have to solve the flat-space wave equation
for the metric components. Because the tilde is not a convenient notation, especially in combination
with the bar for the trace-reversed metric perturbation, we will drop the tilde now and
write hµν which we implicitly assume to satisfy the so-called “Lorentz gauge” condition (G.22).
where Φ is the gravitational potential. In Eqs. (A.9) and (A.10), we have seen that the Newto-
nian potential Φ ∝ v 2 where v is the velocity of objects moving in this field due to gravitational
attraction. This is indeed a generic feature of Newtonian gravity and we therefore define the
expansion parameter of the previous section as
\[ \epsilon = \frac{v^2}{c^2} = v^2 \propto \frac{M}{R} , \tag{G.27} \]
where M is the characteristic mass of the gravitational source and R the distance of moving particles
from this source. For non-relativistic motion we have ε ≪ 1 as required for a perturbative
treatment. From our discussion of the energy-momentum tensor in Sec. C.2, we furthermore
know that the component T00 represents mass-energy density ρ, the T0i components represent
momentum density ∝ ρv i and the Tij components denote the flux of this momentum in spatial
directions, i.e. Tij ∝ ρv i v j . For Newtonian sources of gravitational waves, we already know
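from the discussion following Eq. (G.17) that the energy density is ρ = O(ε), so that
\[ T_{00} = \rho = \mathcal{O}(\epsilon) , \]
\[ T_{0i} \sim \rho v^i \sim \mathcal{O}(\epsilon^{3/2}) , \]
\[ T_{ij} \sim \rho v^i v^j \sim \mathcal{O}(\epsilon^2) . \tag{G.28} \]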
In Newtonian gravity, temporal changes in the field Φ are caused by the motion of the matter
sources. Again, we use the fact that these velocities v are small, so that
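\[ \frac{\partial}{\partial t} \sim v^i\,\frac{\partial}{\partial x^i} = \mathcal{O}(\epsilon^{1/2})\,\frac{\partial}{\partial x^i} \]
\[ \Rightarrow\quad \Box\bar h_{\mu\nu} = \partial^\rho\partial_\rho\bar h_{\mu\nu} = \partial^i\partial_i\bar h_{\mu\nu} = \vec\nabla^2\bar h_{\mu\nu} = -16\pi T_{\mu\nu} \]
\[ \Rightarrow\quad \vec\nabla^2\bar h_{00} = -16\pi T_{00} = -16\pi\rho + \mathcal{O}(\epsilon^{3/2}) , \qquad \bar h_{0i} = \mathcal{O}(\epsilon^{3/2}) , \qquad \bar h_{ij} = \mathcal{O}(\epsilon^2) . \tag{G.30} \]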
This is Newton’s law (G.26) with the identification h̄00 = −4Φ. Now we merely need to
reverse-engineer the metric perturbations from h̄00 . We have
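\[ \bar h = \eta^{\mu\nu}\bar h_{\mu\nu} = 4\Phi + \mathcal{O}(\epsilon^{3/2}) = -h \]
\[ \Rightarrow\quad h_{00} = \bar h_{00} - \frac{1}{2}\,\eta_{00}\bar h = -2\Phi , \qquad h_{ij} = \bar h_{ij} - \frac{1}{2}\,\eta_{ij}\bar h = -2\Phi\,\delta_{ij} , \tag{G.31} \]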
which gives us the metric in the Newtonian limit as
which is the line element we have used in the redshift calculation in Eq. (A.42).
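The reverse-engineering step (G.31) is a plain trace reversal and can be reproduced with a few lines of code. The sketch below represents tensors as nested 4x4 lists in Cartesian inertial coordinates and neglects the higher-order components h̄0i and h̄ij; the names `trace_reverse` and `newtonian_metric_perturbation` are introduced here for illustration.

```python
# Minkowski metric in Cartesian coordinates; numerically it equals its own inverse.
ETA = [[-1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, 1.0]]

def trace_reverse(h):
    """Return h_mn - (1/2) eta_mn (eta^rs h_rs) for a 4x4 perturbation h."""
    trace = sum(ETA[m][n] * h[m][n] for m in range(4) for n in range(4))
    return [[h[m][n] - 0.5 * ETA[m][n] * trace for n in range(4)]
            for m in range(4)]

def newtonian_metric_perturbation(phi):
    """Build h_mn from hbar_00 = -4 phi, dropping the O(eps^{3/2}) components."""
    h_bar = [[0.0] * 4 for _ in range(4)]
    h_bar[0][0] = -4.0 * phi
    return trace_reverse(h_bar)
```

Applying the trace reversal twice returns the original tensor in four dimensions, which is why the same function maps h to h̄ and h̄ back to h. For a potential Φ the result is h00 = −2Φ and hij = −2Φ δij, in agreement with (G.31).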
Let us next calculate particle motion in the Newtonian limit by studying the geodesics of (A.42).
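Using proper time and timelike geodesics, we obtain [note that ẋ^i ∼ v^i = O(ε^{1/2})]
\[ L = (1 + 2\Phi)\,\dot t^2 - \delta_{ij}(1 - 2\Phi)\,\dot x^i\dot x^j \overset{!}{=} 1 \]
\[ \Rightarrow\quad \frac{d^2x^k}{dt^2} = \frac{d^2x^k}{d\tau^2} + \mathcal{O}(\epsilon^2) = -\partial_k\Phi . \tag{G.34} \]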
dt2 dτ 2
This is exactly the equation of motion for a test particle in Newtonian gravity. Note that this
calculation also confirms that the factor 8π in the Einstein equations G = 8πT is the correct
number to reproduce the Newtonian limit.
This is exactly the wave equation (G.3) we discussed in the context of plane waves in Sec. G.1.
Plane wave solutions to this equation are given by
\[ \bar h_{\mu\nu} = H_{\mu\nu}\,e^{i k_\rho x^\rho} , \qquad H_{\mu\nu} = \text{const} . \tag{G.36} \]
leaves the Lorentz gauge condition (G.22) unaffected. A short calculation shows that the
transformation (G.37) changes the plane wave (G.36) according to
It can be shown that there exists a choice Xµ such that (G.38) leads to
\[ H_{0\mu} = 0 , \qquad H^\mu{}_\mu = 0 . \tag{G.39} \]
This is the “traceless” condition and combined with the transverse condition above, it is often
referred to as the transverse-traceless gauge. In this gauge, the gravitational wave solution has
two important properties.
(1) h = 0 ⇒ hµν = h̄µν , so that we need not distinguish between the trace-reversed and
the original metric perturbation.
(2) For a plane wave propagating in the z direction, we find H0µ = H3µ = H µ µ = 0, so that
Hµν can be written as
\[ H_{\mu\nu} = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & H_+ & H_\times & 0 \\ 0 & H_\times & -H_+ & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} . \tag{G.40} \]
So what happens if such a gravitational wave passes through some arrangement of test particles?
To answer this question, we study the geodesic equation for the metric gµν = ηµν + hµν with hµν
given by Eqs. (G.36), (G.40). Let us consider a test particle initially at rest in a background
inertial frame, i.e. the four-velocity of this particle is initially uα = (1, 0, 0, 0). The geodesic
equation at the initial time is given by
\[ \frac{d}{d\tau}u^\alpha + \Gamma^\alpha_{\mu\nu}u^\mu u^\nu = \dot u^\alpha + \Gamma^\alpha_{00} = 0 . \tag{G.41} \]
The figure illustrates the motion of the four test particles as the
gravitational wave generates the oscillating perturbation. This pat-
tern motivates the index “+” in h+ .
Case 2: H+ = 0 , H× ≠ 0, so that h× oscillates. The proper distance between specific
particles can be summarized as follows.
2 particles at (−δ, −δ, 0)/√2, (δ, δ, 0)/√2 have ds² = (1 + h×) 4δ².
2 particles at (δ, −δ, 0)/√2, (−δ, δ, 0)/√2 have ds² = (1 − h×) 4δ².
The figure illustrates the motion of the four test particles as the
gravitational wave generates the oscillating perturbation. This pat-
tern motivates the index “×” in h× .
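The proper distances quoted for the two polarizations can be reproduced directly from the spatial metric gij = δij + hij with hxx = −hyy = h+ and hxy = hyx = h×, as implied by Eq. (G.40). The sketch below treats h+ and h× as fixed numbers, i.e. it evaluates the wave at one instant and assumes the coordinate positions of the test particles stay fixed; the function name is illustrative.

```python
import math

def proper_distance_squared(p1, p2, h_plus=0.0, h_cross=0.0):
    """ds^2 between two nearby particles at coordinates p1, p2 = (x, y, z).

    Uses g_ij = delta_ij + h_ij with h_xx = -h_yy = h_plus, h_xy = h_cross.
    """
    dx = p2[0] - p1[0]
    dy = p2[1] - p1[1]
    dz = p2[2] - p1[2]
    return ((1.0 + h_plus) * dx * dx + (1.0 - h_plus) * dy * dy
            + 2.0 * h_cross * dx * dy + dz * dz)
```

With δ = 1 and s = 1/√2, the pair (−s, −s, 0), (s, s, 0) gives 4(1 + h×), while the pair (s, −s, 0), (−s, s, 0) gives 4(1 − h×). Particles on the x axis at (±1, 0, 0) are stretched by h+ and unaffected by h×.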
Gravitational waves were conjectured to exist soon after Einstein published his theory,
but their nature remained under constant debate for about 40 years; even Einstein himself
vacillated on the issue. It was only in the late 1950s that results by Bondi, Pirani, Sachs
and others demonstrated convincingly that gravitational waves are not merely a gauge effect
but carry physical energy; for an overview of the history of these debates, see for instance
[22]. By now, there remains no doubt that gravitational waves carry energy and the leading
order term can be calculated analytically for a wide variety of sources. This is contained in the
famous quadrupole formula which we merely quote here; for a derivation of this formula see for
example [36]. Consider for this purpose a distribution of energy density ρ(t, ~y ) contained inside
a domain of compact support. The quadrupole tensor is defined as
\[ I_{ij} := \int \rho(t, \vec y)\, y^i y^j \, d^3y . \tag{G.44} \]
The quadrupole formula predicts the energy flux at a distance r from the source averaged over
times that are large compared with the period of the gravitational wave signal. This flux is
\[ \langle p \rangle_t = \frac{G}{5c^5}\,\big\langle \dddot Q_{ij}\,\dddot Q_{ij} \big\rangle_{t-r} ; \qquad Q_{ij} := I_{ij} - \frac{1}{3}\,I_{kk}\,\delta_{ij} , \tag{G.45} \]
where Qij is the reduced quadrupole tensor and the indices t and t − r mean that a gravitational
wave observed at time t is sourced by time variations of the sources at retarded time t − r. The
dots denote time derivatives and the symbols ⟨ . ⟩ the averaging over sufficiently long times.
Let us consider as an example a system of two equal point masses in circular orbit according
to Newtonian gravity. The energy density is
ρ(~x) = mδ(~x − ~x1 ) + mδ(~x − ~x2 ) , xi1 = r (cos φ, sin φ, 0) , xi2 = −r (cos φ, sin φ, 0) . (G.46)
The motion of two such bodies is governed by the Newtonian gravitational and centrifugal
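forces
\[ G\,\frac{m^2}{(2r)^2} \overset{!}{=} \frac{m v^2}{r} \quad\Rightarrow\quad G\,\frac{m}{4r^2} = \frac{v^2}{r} \quad\Rightarrow\quad \omega = \frac{v}{r} = \sqrt{G\,\frac{m}{4r^3}} \tag{G.47} \]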
The quadrupole tensor is
Note that we traded the quadratic cos and sin functions for linear ones to simplify taking
derivatives. The traceless quadrupole tensor is Qij = Iij − (2mr²/3) δij and thus only differs from
Iij by a constant. The time derivatives of the two are therefore equal,
\[ \dddot Q_{xx} = 8\omega^3 m r^2 \sin 2\omega t , \]
\[ \dddot Q_{yy} = -8\omega^3 m r^2 \sin 2\omega t , \]
\[ \dddot Q_{xy} = \dddot Q_{yx} = -8\omega^3 m r^2 \cos 2\omega t . \tag{G.49} \]
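Inserting the components (G.49) into the quadrupole formula (G.45) is a short exercise that can also be done numerically. The sketch below averages the contraction over one period of the signal; the function names, the default units G = c = 1 and the number of samples are choices made for this illustration.

```python
import math

def third_derivatives(t, m, r, omega):
    """Non-vanishing components (Qxx, Qyy, Qxy) of Eq. (G.49) at time t."""
    amplitude = 8.0 * omega ** 3 * m * r ** 2
    q_xx = amplitude * math.sin(2.0 * omega * t)
    q_yy = -amplitude * math.sin(2.0 * omega * t)
    q_xy = -amplitude * math.cos(2.0 * omega * t)
    return q_xx, q_yy, q_xy

def averaged_power(m, r, omega, G=1.0, c=1.0, samples=1000):
    """Time average of (G / 5 c^5) Q'''_ij Q'''_ij over one signal period."""
    period = math.pi / omega   # the components oscillate at frequency 2 * omega
    total = 0.0
    for n in range(samples):
        q_xx, q_yy, q_xy = third_derivatives(period * n / samples, m, r, omega)
        # the xy component enters twice because Q_xy = Q_yx
        total += q_xx ** 2 + q_yy ** 2 + 2.0 * q_xy ** 2
    return G / (5.0 * c ** 5) * total / samples
```

Since sin² + cos² = 1, the contraction is constant in time and the average evaluates to (128/5) G m² r⁴ ω⁶ / c⁵ for the equal-mass binary considered here.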
This loss of energy was famously identified in observations of the Hulse-Taylor pulsar starting in
the 1970s [16]. The observations were compared with higher-order predictions going beyond the
quadrupole formula and revealed excellent agreement with the predictions of general relativity
leading to the 1993 Nobel Prize. Finally, in September 2015, the LIGO gravitational wave
detectors in Hanford and Livingston, US, made the first direct detection of a gravitational wave
signal [1] using an instrumental setup that is reminiscent of the Michelson-Morley interferometer
but uses a wealth of highly advanced technology.
Figure 32: Observed signal of the black-hole binary signal GW150914 as measured with the
LIGO detectors at Hanford and Livingston (upper panels), numerical relativity predictions for
a black-hole binary using the most likely mass parameters (upper middle panels), the difference
between signal and prediction (lower middle panels) and the power spectrum in the time-
frequency domain (bottom panels). Taken from [1].
Even though gravitational waves carry a
tremendous amount of energy, they interact very weakly with matter including the detectors.
The variation in length we have displayed for the arrangements of test particles above has
been vastly exaggerated. For realistic sources the change in length ∆l/l = O(10−21 ) which
corresponds to about the width of a hair in the distance to the next star, Proxima Centauri.
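The hair-width comparison is straightforward arithmetic. The sketch below uses a distance to Proxima Centauri of about 4.24 light years and an interferometer arm length of 4 km for LIGO; both numbers are inputs assumed for this illustration and are not taken from the notes.

```python
LIGHT_YEAR_M = 9.4607e15       # metres in one light year
PROXIMA_DISTANCE_LY = 4.24     # assumed distance to Proxima Centauri
LIGO_ARM_M = 4000.0            # assumed LIGO arm length in metres

def length_change(strain, length_m):
    """Absolute length change for a fractional change strain = dl / l."""
    return strain * length_m
```

For ∆l/l = 10⁻²¹ the change over the distance to Proxima Centauri is about 4 × 10⁻⁵ m, i.e. some tens of micrometres, which is indeed comparable to the width of a hair. Over a single 4 km detector arm the same strain corresponds to only about 4 × 10⁻¹⁸ m.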
The detected signal together with the theoretical predictions and power spectra is shown in
Fig. 32. A second event has by now been detected [2], demonstrating that the first detection
was not merely a fluke. The LIGO detectors are being upgraded to higher sensitivity and
other detectors, Virgo, LIGO India and Japan’s KAGRA will join the network over the coming
years. Throughout these notes, we have encountered a number of questions that remain open to
this day (dark energy, dark matter, possible modifications of the theory of relativity). It is not
unlikely that the new field of gravitational wave astronomy will revolutionize our understanding
of the Universe. But that is a story to be told on some other future occasion...
References
[1] B. Abbott et al. Observation of Gravitational Waves from a Binary Black Hole Merger.
Phys. Rev. Lett., 116(6):061102, 2016. arXiv:1602.03837 [gr-qc].
[2] B. P. Abbott et al. GW151226: Observation of Gravitational Waves from a 22-Solar-Mass
Binary Black Hole Coalescence. Phys. Rev. Lett., 116(24):241103, 2016. arXiv:1606.04855
[gr-qc].
[3] P. C. Aichelburg and R. U. Sexl. On the Gravitational field of a massless particle. Gen.
Rel. Grav., 2:303–312, 1971.
[4] R. Arnowitt, S. Deser, and C. W. Misner. The dynamics of general relativity. In L. Witten,
editor, Gravitation an introduction to current research, pages 227–265. John Wiley, New
York, 1962. gr-qc/0405109.
[5] L. Blanchet. Gravitational Radiation from Post-Newtonian Sources and Inspiralling Compact
Binaries. Living Reviews in Relativity, 9(4), 2006. http://www.livingreviews.org/lrr-2006-4.
[6] H. Bondi and T. Gold. The Steady-State Theory of the Expanding Universe. Mon. Not.
Roy. Astron. Soc., 108:252, 1948.
[7] S. M. Carroll. Lecture notes on general relativity, 1997. gr-qc/9712019.
[8] S. M. Carroll. Spacetime and Geometry: An Introduction to General Relativity. Pearson,
2003.
[9] R. d’Inverno. Introducing Einstein’s Relativity. Oxford: Clarendon Press, 1992. ISBN-
9780198596868.
[10] F. W. Dyson, A. S. Eddington, and C. Davidson. A Determination of the Deflection of
Light by the Sun’s Gravitational Field, from Observations Made at the Total Eclipse of
May 29, 1919. Phil. Trans. Roy. Soc. Lond., A220:291–333, 1920.
[11] J. B. Hartle. Gravity: An Introduction to Einstein’s General Relativity. Pearson, 2014.
[12] S. W. Hawking. Black hole explosions. Nature, 248:30–31, 1974.
[13] S. W. Hawking and G. F. R. Ellis. The Large Scale Structure of Space-Time. Cambridge
University Press, 1973.
[14] F. Hoyle. A New Model for the Expanding Universe. Mon. Not. Roy. Astron. Soc.,
108:372–382, 1948.
[15] L. P. Hughston and K. P. Tod. An Introduction to General Relativity. Cambridge Uni-
versity Press, 1991.
[16] R. A. Hulse and J. H. Taylor. Discovery of a Pulsar in a Binary System. Astrophys. J.,
195:L51–55, 1975.
[17] C. W. Misner, K. S. Thorne, and J. A. Wheeler. Gravitation. W. H. Freeman, New York,
1973.
[18] M. E. Osinovsky. Some Remarks on the Kasner Space-Time. Nuovo Cim., 7:76–78, 1973.
[19] R. V. Pound and G. A. Rebka, Jr. Apparent Weight of Photons. Phys. Rev. Lett., 4:337–
341, 1960.
[20] W. Rindler. Relativity: Special, General, and Cosmological. Oxford University Press,
2006.
[21] L. Ryder. Introduction to General Relativity. Cambridge University Press, 2009.
[22] P. R. Saulson. Josh Goldberg and the physical reality of gravitational waves. Gen. Rel.
Grav., 43:3289–3299, 2011.
[23] L. I. Schiff. Possible New Experimental Test of General Relativity Theory. Phys. Rev.
Lett., 4:215–217, 1960.
[24] B. F. Schutz. A First Course in General Relativity. Cambridge University Press, 2009.
2nd Edition.
[25] K. Schwarzschild. On the gravitational field of a mass point according to Einstein’s
theory. Sitzungsber. Preuss. Akad. Wiss. Berlin (Math. Phys.), 1916:189–196, 1916.
physics/9905030.
[26] I. I. Shapiro. Fourth Test of General Relativity. Phys. Rev. Lett., 13:789–791, 1964.
[27] H. Stephani. An Introduction to Special and General Relativity. Cambridge University
Press, Cambridge, 2008. 3rd edition.
[28] J. Stewart. Advanced general relativity. Cambridge University Press, 1991.
[29] J. H. Taylor and J. M. Weisberg. Further experimental tests of relativistic gravity using
the binary pulsar PSR 1913+16. Astrophys. J., 345:434–450, 1989.
[30] R. M. Wald. General Relativity. The University of Chicago Press, Chicago and London,
1984.
[31] S. Weinberg. Gravitation and Cosmology: Principles and Applications of the General Theory
of Relativity. John Wiley & Sons, 1972.
[32] J. M. Weisberg, D. J. Nice, and J. H. Taylor. Timing Measurements of the Relativistic
Binary Pulsar PSR B1913+16. Astrophys. J., 722:1030–1034, 2010. arXiv:1011.0718 [astro-ph].
[33] Mathematica webpage:
https://www.wolfram.com/mathematica/.
[34] Maple webpage:
http://www.maplesoft.com/products/maple/.
[35] GRTensor webpage:
http://grtensor.phy.queensu.ca/.
[36] Harvey Reall’s lecture notes on General Relativity:
http://www.damtp.cam.ac.uk/user/hsr1000/teaching.html.
[37] Gary Gibbons’ Lecture Notes on Part II General Relativity:
http://www.damtp.cam.ac.uk/research/gr/members/gibbons/partiipublic-2006.pdf.
[38] Stephen Siklos’ Lecture Notes:
http://www.damtp.cam.ac.uk/user/stcs/gr.html.