Introduction To General Relativity and Cosmology
Ian R Kenyon
Second Edition
Books in the program range in level from short introductory texts on fast-moving
areas, graduate and upper-level undergraduate textbooks, research monographs,
and practical handbooks.
For a complete list of published and forthcoming titles, please visit iopscience.org/books/aas.
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system
or transmitted in any form or by any means, electronic, mechanical, photocopying, recording
or otherwise, without the prior permission of the publisher, or as expressly permitted by law or
under terms agreed with the appropriate rights organization. Multiple copying is permitted in
accordance with the terms of licences issued by the Copyright Licensing Agency, the Copyright
Clearance Centre and other reproduction rights organizations.
Permission to make use of IOP Publishing content other than as set out above may be sought
at [email protected].
Ian R Kenyon has asserted his right to be identified as the author of this work in accordance with
sections 77 and 78 of the Copyright, Designs and Patents Act 1988.
DOI 10.1088/2514-3433/acc3ff
Version: 20230801
AAS–IOP Astronomy
ISSN 2514-3433 (online)
ISSN 2515-141X (print)
British Library Cataloguing-in-Publication Data: A catalogue record for this book is available
from the British Library.
IOP Publishing, No.2 The Distillery, Glassfields, Avon Street, Bristol, BS2 0GR, UK
US Office: IOP Publishing, Inc., 190 North Independence Mall West, Suite 601, Philadelphia,
PA 19106, USA
To Valerie
Contents
Preface
Acknowledgments
About the Author
Author Tribute
Physical Constants and Parameters
1 Introduction
1.1 Prologue
1.2 Einstein’s Insight
1.3 Structures Seen Today
1.4 Hubble’s Law
1.5 Olbers’ Paradox
1.6 The Big Bang and the Cosmic Microwave Background
1.7 Inflation
1.8 Dark Matter
1.9 Structure Formation
1.10 Dark Energy
1.11 The Model of the Universe
1.12 The Telescopes
1.13 Luminosity
1.14 Summary of Results in Special Relativity
1.15 Exercises
Further Reading
References
Introduction to General Relativity and Cosmology (Second Edition)
Preface
The intention is to present to the student a modern, compact and digestible account
of general relativity and modern cosmology. In recent decades there have been
significant changes. The LIGO/Virgo collaboration has detected gravitational waves
from the merger of black holes and then from neutron star mergers, with ∼100
mergers observed by early 2020. This is making the properties of black holes, and
tests of GR in the strong regime, accessible as never before. In 2019 the galactic
black hole M87* was imaged for the first time by the Event Horizon Telescope
collaboration. The study of type Ia supernovae (SNe Ia) as standard candles led to the
discovery of acceleration of the expansion of the universe, a process that started around
5 Gyr ago, or “recently” as cosmologists say. This acceleration is due to the existence
of some field, most likely a scalar field; a field whose properties in most respects
match the properties of Einstein’s cosmological constant, and which appears to
account for the bulk of the energy in the universe. In the common view, another
scalar field was likely responsible for the earlier inflation, certainly by a factor of more
than $10^{30}$, terminating when the universe was only of order $10^{-35}$ s old.
Taken together these observations indicate that the aim of bringing to the student
a unified picture is appropriate and timely.
Acknowledgments
The author is grateful for the continued support and warm encouragement of
Professor Paul Newman, Head of the Particle Physics Research Group, and of
Professor Bill Chaplin, the Head of the School of Physics and Astronomy at the
University of Birmingham. This has given me access to the essential facilities needed
for carrying out the project.
Thanks to Professor Sean McGee, from the School of Physics and Astronomy for
reading and commenting on most of the cosmological component of this book; and
also for producing essential figures with CMBfast and for patiently answering many
questions. Thanks to Dr Geraint Pratten, Royal Society University Research
Fellow, also of the School of Physics and Astronomy, who was kind enough to
read and comment on two cosmology chapters and took the trouble to help me
through an area of particular difficulty. Dr Geoff Brooker of Wadham once again
took on the task of knocking my text into shape: my thanks to him for reading and
commenting on the whole text with his usual insight. Thanks also to Professor Frank
van den Bosch of Yale, for taking time to answer specific questions that puzzled me
in his area of expertise. All this help was illuminating and invaluable. Errors are the
author’s alone.
Special thanks to Dr Mark Slater of the Particle Physics Research Group in the
School of Physics and Astronomy for his invaluable, patient, and cheerful help with
software, and for installing Linux and suites of software on a sequence of laptops.
This gave me the necessary stable and reliable computing environment in which to
operate.
My thanks to people at IOP Publishing: the Senior Editor John Navas, Leigh
Jenkins, Associate Director David McDade, and his production team. Their help with
preparation of figures, acquisition of permissions, and the process of e-publication, was
throughout courteous and efficient. My thanks too, to the American Astronomical
Society for jointly publishing the book.
Thanks to the publishers and authors who permitted re-use of material under
copyright, or creative commons, etc. Each case is fully detailed in the text as it
occurs.
I am indebted to Oxford University Press and their Senior Science Editor, Sonke
Adlung, for ceding to me the copyright of General Relativity published in 1990 by
Oxford. Sonke also forwarded to me originals of the figures. Oxford also allowed me
to adapt a portion of Chapter 11 (Quantum measurement) from Quantum 20/20:
Fundamentals, Entanglement, Gauge Fields, Condensates and Topology published by
Oxford University Press in 2019. These generous acts are warmly appreciated.
Last, but not least, my heartfelt thanks to Dr Yoshinari Mikami for supplying me
with a LaTeX file of the equations from his Japanese translation of General Relativity,
and for his careful checking of the formulae. His kind act gave a timely boost toward
getting this text launched and saved several weeks of additional keypunching and
checking.
About the Author
Ian R. Kenyon
Ian R. Kenyon is an elementary particle physicist in the School of
Physics and Astronomy at the University of Birmingham, UK. He
took part in the discovery of the carriers of the weak force, working
at CERN for three years on the design, construction, data-taking
and analysis of the UA1 experiment. Earlier he designed and built
the optics for the Northwestern University 50 cm liquid helium
bubble chamber. More recently he worked on the H1 experiment at
the HERA electron–proton collider and on optoelectronics: the
design and construction with CERN and Hewlett-Packard of the then fastest link
between computers, commercialized by HP, and on the CERN-funded RD23
programme for optoelectronic readout from LHC detectors. He is the author of
four advanced textbooks for physics undergraduates: Elementary Particle Physics,
General Relativity (also translated into Japanese), The Light Fantastic: A Modern
Introduction to Classical and Quantum Optics and Quantum 20/20: Fundamentals,
Entanglement, Gauge Fields, Condensates and Topology. This present text builds on
the earlier General Relativity to cover the development together of general relativity
and cosmology.
Author Tribute
During the last stages of the preparation of this book, Prof Ian Kenyon sadly passed
away. Ian was a stalwart of Birmingham’s School of Physics and Astronomy and a
cornerstone of its particle physics group for over half a century. Fundamental
science has advanced enormously during Ian’s research career; his significant role in
the discovery of the W and Z bosons by the CERN UA1 experiment stands out
among his many contributions. Ian remained full of energy up to the very end of his
life, attending the university daily, writing prolifically, organising group seminars
and enjoying conversations on a wide range of topics with an eclectic selection of
scientists. His curiosity about nature and his tenacity in seeking to understand its
basic mechanisms were as remarkable as the breadth of his knowledge. This book
and his other titles form part of a rich and lasting legacy. He was an outstanding
physicist who will be fondly remembered and greatly missed.
Physical Constants and Parameters
Introduction to General Relativity and Cosmology
(Second Edition)
Ian R Kenyon
Chapter 1
Introduction
1.1 Prologue
This chapter contains a brief survey of the twin topics of general relativity and
cosmology, highlighting the key experimental discoveries and concepts. The body of
the book fleshes out these themes. Einstein’s theory is the first topic, including the
tests that his theory has passed within the solar system. Two consequent topics of
great current interest come next: black holes and gravitational waves. This allows a
smooth transition into cosmology: to an account of our understanding of how the
universe developed. The narrative shows the success of a model for the universe
picturing it as a space that initially expanded violently and almost instantaneously,
then more sedately and is now destined to expand ever more rapidly. This is a model
whose framework was provided by Einstein. The contents of the universe in this
successful model are mainly dark energy and cold dark matter, which paradoxically
are not directly detectable. Ordinary matter accounts for only about 5% of the
energy in the universe and the stars are just 5% of that 5%. Our understanding of the
whole has been built principally on observations of this tiny residue and of radiation
from the initial compressed, extremely hot phase of the universe. Our model, which
combines dark energy (Λ) with cold dark matter (CDM) and is called ΛCDM,
provides a consistent interpretation of the evolution of the universe over some 13,800
million years since the Big Bang that set it off.
Chapters 2–6 are used to introduce the concepts of the general theory of relativity
(GR) and Einstein’s equation linking spacetime curvature and matter. Chapter 7
describes the success of GR in passing the many tests made within the solar system.
Chapter 8 is devoted to the analysis of black hole properties and the observations
that convince us of the existence of stellar and galactic black holes. Chapter 9
develops the theory of gravitational waves and describes their discovery and their
study carried out using km-sized optical interferometers. Chapters 10–17 are used to
present the general relativistic dynamics of the universe, and then to describe the
quantitative success of the ΛCDM model in explaining the evolution of the universe.
In the limit of low velocities and small gravitational effects Einstein’s equation reduces
to Newton’s law, so that all the latter’s successes are explained consistently.
The bending of light near a massive body is more spectacular in Einstein rings,
images of sources seen through galaxies located nearer to us. Figure 1.1 shows the
image of a blue galaxy gravitationally lensed by the red galaxy nearer to the Earth;
the axial alignment is so good that the image forms almost a complete circle.
GR predicts the fate of massive stars. Initially the gravitational self-attraction of
the stellar material leads to large internal pressures and temperatures that ignite
nuclear burning. Eventually the fuel is used up and our naive expectation would be
that the star should contract to a size at which the pressure and gravitational self-
attraction are in balance. However, in 1931 Chandrasekhar showed that for a
sufficiently massive star the gravitational collapse can continue indefinitely. What is
left behind is that mysterious entity, a black hole. Spacetime is so warped that even
light cannot escape from within its horizon: a sphere of radius $2GM/c^2$ for a static
black hole of mass M, where G is the gravitational constant. GR also predicts the
existence of gravitational waves that travel at the speed of light. When they cross a
region of spacetime it is spacetime itself that vibrates. Such vibrations were detected
for the first time in 2015 (reported in 2016) by the Advanced LIGO 4 km Michelson interferometers.
The strain produced by these waves is shown in Figure 1.2: they were emitted during
the final inspiral of a binary pair of 30 solar mass black holes, merging as a single
black hole. About three solar masses (times $c^2$) were converted to $\sim 10^{48}$ J of
gravitational wave energy, which, if it could be harnessed, would power our current
civilization for $\sim 10^{28}$ years.
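The horizon radius and the radiated energy quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming rounded SI values for G, c and the solar mass (these constants and the function names are not taken from the text). It evaluates $2GM/c^2$ for a 30 solar mass black hole and the energy equivalent of three solar masses, which comes out at roughly $5 \times 10^{47}$ J, of the same order as the figure quoted.

```python
# Rounded SI constants (assumed values, not quoted in the text).
G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8        # speed of light, m s^-1
M_SUN = 1.989e30   # solar mass, kg


def schwarzschild_radius(mass_kg):
    """Horizon radius 2GM/c^2 of a static black hole, in metres."""
    return 2.0 * G * mass_kg / C**2


def rest_energy(mass_kg):
    """Energy equivalent M c^2 of a mass, in joules."""
    return mass_kg * C**2


# One of the ~30 solar mass black holes in the binary.
r_s_km = schwarzschild_radius(30 * M_SUN) / 1e3
# Energy carried off by the gravitational waves (about three solar masses).
e_gw = rest_energy(3 * M_SUN)

print(f"Horizon radius of a 30 solar mass black hole: {r_s_km:.1f} km")
print(f"Energy equivalent of 3 solar masses: {e_gw:.2e} J")
```

The horizon of such a black hole is under 100 km in radius, which gives a sense of how compact the merging objects are.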
Large black holes of order $10^{6}$ to $10^{9}$ solar masses lie at the centers of most galaxies.
The orbits of several stars close to the center of our Galaxy, near Sgr A*, have been
Figure 1.1. Einstein ring LRG 3-757 recorded using the Hubble Telescope’s Wide-Field Camera 3. Image
credit: ESA/Hubble and NASA.
[Figure 1.2 plot. Upper panel: strain ($10^{-21}$) versus time (s), comparing numerical relativity with the reconstructed (template) waveform. Lower panel: black hole separation ($R_s$) and black hole relative velocity ($c$) versus time (s).]
Figure 1.2. Estimated strain amplitude from GW150914 made with numerical relativity models of the black
hole behavior as the holes coalesce. In the lower panel the separation of the black holes is given in units of the
Schwarzschild radius $2GM/c^2$ and the relative velocity is divided by c. LIGO Open Science Center at https://losc.ligo.org/events/GW150914.
The work is reported by the LIGO Scientific Collaboration and Virgo
Collaboration (Abbott et al. 2016). Courtesy LIGO Collaboration. This figure is Figure 11.5 taken with
permission from Kenyon (2019) published by Oxford Univ. Press in 2019.
observed over decades. In the case of the star S0-2, its velocity at closest approach is
fully consistent with GR but departs by 200 km s$^{-1}$ from the Newtonian prediction
(Do et al. 2019).
It is interesting to contrast gravitation with the other long-range force in nature:
electromagnetism. The long range of the two forces is attested by the operation of
the solar system in the first case and by the Van Allen belts and solar flares in the
second. Of the two the electromagnetic force is intrinsically far stronger; the
electrical repulsion between two protons is $10^{36}$ times stronger than their mutual
gravitational attraction. The fact that gravity dominates on the large scale shows
that matter in stars and galaxies must be electrically neutral to very high precision.
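The factor of $10^{36}$ follows directly from Coulomb's law and Newton's law of gravitation, because both forces fall off as the inverse square of the separation and the separation cancels in the ratio. A minimal sketch, assuming rounded SI values for the constants (the names below are illustrative):

```python
# Rounded SI constants (assumed values, not quoted in the text).
K_COULOMB = 8.988e9    # Coulomb constant 1/(4 pi epsilon_0), N m^2 C^-2
E_CHARGE = 1.602e-19   # proton charge, C
G = 6.674e-11          # gravitational constant, m^3 kg^-1 s^-2
M_PROTON = 1.673e-27   # proton mass, kg


def electric_to_gravitational_ratio():
    """Ratio of the Coulomb repulsion to the gravitational attraction
    between two protons; independent of their separation."""
    electric = K_COULOMB * E_CHARGE**2    # force times r^2
    gravitational = G * M_PROTON**2       # force times r^2
    return electric / gravitational


ratio = electric_to_gravitational_ratio()
print(f"F_electric / F_gravity for two protons: {ratio:.2e}")
```

With these constants the ratio is a little above $10^{36}$, in line with the figure given.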
While quantized theories have been discovered to describe the non-gravitational
forces, GR remains a classical theory. The possibility of reconciling GR and
quantum mechanics, the twin achievements of twentieth century physics, has now
been pursued by theoreticians for a century. There are still only tantalizing hints of a
solution.
Turning to cosmology, GR has provided the solid framework within which the
structure and development of the universe are described. The model of spacetime,
used throughout this text, due to Friedmann, Robertson, and Walker, was derived
using GR. It provides the framework for the modern coherent description of the
development of the universe.
1 Particle properties are detailed in Appendix A.
Figure 1.3. The left panel shows a 1000 square degree image taken by the Sloan Digital Sky Survey III. The
other panels are three-dimensional presentations going back 7 Gyrs based on the galaxy redshifts. Image
prepared from SDSS-III data by Jeremy Tinker. Courtesy Jeremy Tinker and the Sloan Digital Sky Survey.
2 We will continue to receive the light previously emitted, which gets progressively more redshifted and less
intense. This effect is explained in Chapter 8 when, in an equivalent way, a source falling into a black hole is
extinguished.
relatively rapidly and the luminosity falls gradually. The result is that the luminosity
distribution of the red giants is continuous up to a maximum luminosity, and then
cuts off sharply. This produces the TRGB shown in Figure 1.4. Following the
helium ignition the star’s trajectory is to move rapidly to the left. Observation shows
that the absolute luminosity at the tip is independent of the era of star formation, or
other variables, making the TRGB luminosity an excellent standard candle.
Cepheids are often located in densely populated and dusty regions making for
Figure 1.4. The Hertzsprung–Russell plot for stars in a globular cluster in our Galaxy. $M_I$ is the absolute
magnitude of luminosity measured with an infrared filter. The precise definition of magnitude is given in
Section 1.13. (V − I) is the difference between the brightness using filters for yellow-green light (V) and
infrared light (I) and measures the temperature of the star. This uses the fact that as temperature rises the
spectrum of a black body source moves to shorter wavelengths. The color scale indicates the population
density. The sharp luminosity cutoff is at magnitude −4.05. Figure 1 from Freedman (2021). Published by the
American Astronomical Society under Creative Commons Attribution 4.0 licence.
uncertainty in estimating their luminosity. On the other hand stars at the TRGB are
usually well isolated.
Nowadays distances to stars within our Galaxy are obtained from the parallax
observed over the solar year by space telescopes such as Hipparcos and Gaia, which
are located in orbits close to the Earth. The angular resolution of the observations
is of order μas (microarcseconds). Cepheid and TRGB measurements carry the
distance scale to nearby galaxies. Much brighter standard candles are used further
off, the supernovae type Ia (SNe Ia). These are events that may occur when a white
dwarf star and a larger star are gravitationally bound (a binary pair). Material from
the larger partner flows onto the white dwarf and eventually this ignites a
thermonuclear explosion that is, for several hours, more than a billion times brighter
than the Sun. The peak luminosity and rate of luminosity decay have characteristic
correlations so that such explosions provide standard candles that can be observed at
remote distances. Observations discussed in Chapter 17 have revealed that the
expansion of the universe is accelerating due to a mysterious component of the
universe called dark energy.
[Figure 1.5 plot: intensity (MJy/sr) versus frequency (cm$^{-1}$).]
Figure 1.5. The spectrum of the CMB measured with FIRAS, the Far Infrared Absolute Spectrometer on the
COBE satellite. Figure from Fixsen et al. (1996). Courtesy of Professor Fixsen on behalf of the copyright
holders. The agreement with the line displaying the prediction of the black body spectrum at 2.7255 K is such
that the error bars on the data points would only emerge if the line width were reduced one hundredfold. 1 MJy
is $10^{-20}$ W m$^{-2}$ Hz$^{-1}$.
$$W(x)\,dx = \frac{8\pi}{c^3 h^3}\,[k_B T]^4\,\frac{x^3\,dx}{\exp x - 1}.$$
Integration (footnote 3) of the above expressions over all energies yields:
$$\text{Number density} = 2.026 \times 10^{7}\,T^{3}\ \text{m}^{-3}, \tag{1.10}$$
3 $\int_0^\infty x^2\,dx/[\exp x - 1] = 2.404$ and $\int_0^\infty x^3\,dx/[\exp x - 1] = \pi^4/15$.
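The two integrals in footnote 3 are easily confirmed numerically. The sketch below uses a simple midpoint rule, with the upper limit cut off at x = 60 where the integrand is negligible. It also evaluates the coefficient of $T^3$ in the number density, on the assumption that the photon number spectrum has the standard black body form $8\pi\,(k_B T/ch)^3\,x^2\,dx/(\exp x - 1)$; that expression and the constants are not shown in the text above. With current values of the constants the coefficient comes out near $2.03 \times 10^{7}$ m$^{-3}$ K$^{-3}$, close to the value in Equation (1.10).

```python
import math

# SI constants (assumed values, not quoted in the text).
K_B = 1.380649e-23    # Boltzmann constant, J K^-1
H = 6.62607015e-34    # Planck constant, J s
C = 2.99792458e8      # speed of light, m s^-1


def bose_integral(power, upper=60.0, steps=200_000):
    """Midpoint-rule estimate of the integral of x**power / (exp(x) - 1)
    from 0 to `upper`; the tail beyond x = 60 is negligible."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += x**power / math.expm1(x)
    return total * h


number_integral = bose_integral(2)   # compare with 2.404
energy_integral = bose_integral(3)   # compare with pi**4 / 15

# Coefficient of T^3 in the photon number density, in m^-3 K^-3.
coefficient = 8.0 * math.pi * (K_B / (H * C)) ** 3 * number_integral

print(f"Integral with x^2: {number_integral:.4f}")
print(f"Integral with x^3: {energy_integral:.4f}  (pi^4/15 = {math.pi**4 / 15:.4f})")
print(f"Number density coefficient: {coefficient:.3e} m^-3 K^-3")
```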
Figure 1.6. Temperature variation of the cosmic microwave background recorded by the Planck
Collaboration: Planck 2018 results: I Overview and the cosmological legacy of Planck (Planck
Collaboration et al. 2020). Courtesy EDP Sciences and the European Southern Observatory. This standard
display is an equal area (Mollweide) projection of both hemispheres of the sky with the galactic mid-plane
lying along the horizontal center line. The direction toward the center of the Galaxy lies at the center of the
plot. As in an atlas, the left-hand edge wraps round onto the right-hand edge, this time enclosing the viewer.
The gray boundary encloses our Galaxy, whose foreground emissions have been removed.
once every 24 hours relative to the Earth; but the plane would not rotate in the frame
defined by the distant galaxies.
1.7 Inflation
The CMB is close to being both homogeneous and isotropic: if we look north and
south we view regions of the CMB which were never in causal contact, so how could
they be so similar? The unexpected solution to this problem is that the steady
expansion of the universe was punctuated, early on, at around $10^{-36}$ s after the Big
Bang, by an almost instantaneous expansion by a factor greater than $\sim 10^{30}$. What
we now see, across the full 4π angle of the sky, would have been in causal contact
before inflation. This violent inflation is believed to be the effect of the universe
dropping from one equilibrium state to another of lower energy. It was an essentially
quantum mechanical transition, after which the universe continued expanding more
leisurely to the present time. This expansion would also be enough to account for
why the spacetime we inhabit is to the limit of observational precision a flat
spacetime.
[Figure 1.7 plot: rotation curves of 21 Sc galaxies (panels labeled by galaxy, NGC 4605 through NGC 801); velocity in plane of galaxy (km s$^{-1}$) versus distance from nucleus (kpc).]
Figure 1.7. Rotational velocities, determined from emission line spectra, as a function of the distance from the
center of the galaxy. Adapted from Figure 5 in Rubin et al. (1980). Copied from Figure 11.4 in Kenyon (1990),
published by Oxford Univ. Press, courtesy of the late Professor Rubin, the American Astronomical Society
and Oxford Univ. Press.
Figure 1.8. Evolution of radiation in the expanding universe as a function of $\log_{10}(1 + z)$, z being the redshift.
The spectral regions accessible with specific detector types are indicated by horizontal bands. The CMB was
emitted once, while various sources have over extended periods emitted the hydrogen 21 cm line and 121 nm
Lyα-line. The lower edge of the blue band at 400 nm is the wavelength below which absorption on metals in
stellar atmospheres becomes very strong. This results in the 4000 Ångstrom break in spectra. Adapted, with
permission, from the ICHEP2020 talk by David Kirkby, University of California, Irvine. https://faculty.sites.uci.edu/dkirkby/.
for detection in the infrared: from 0.6 to 28.5 μm. By optimizing on the near-infrared,
higher redshift galaxies and stars become detectable compared to, for
example, the HST. Therefore, the JWST will view further back in time to the
appearance of the earliest stars: an era that is poorly understood at present. Studies
using the JWST will for the first time access events at redshifts between 8 and 10,
well into the first billion years after the Big Bang. The resolution, limited by the
detectors, is 100 mas, comparable to the HST. A complementary advantage is that
infrared light penetrates the dust clouds that obscure star formation, better than
visible light. This is because dust particles scatter, very efficiently, radiation whose
wavelength is less than or equal to the dust particles’ size.
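The advantage of infrared detection at high redshift can be made concrete with the relation $\lambda_{\rm obs} = (1 + z)\,\lambda_{\rm emit}$. The sketch below is illustrative only: it takes the Lyα line, with an assumed rest wavelength of 121.6 nm, and asks where it is observed for sources at redshifts 8 and 10, comparing with the 0.6 to 28.5 μm range quoted above. The function names are hypothetical.

```python
LYMAN_ALPHA_NM = 121.6         # assumed rest wavelength of the Lyα line, nm
JWST_RANGE_UM = (0.6, 28.5)    # detection range quoted in the text, μm


def observed_wavelength_um(rest_nm, z):
    """Observed wavelength in μm of light emitted at rest_nm (in nm)
    by a source at redshift z: lambda_obs = (1 + z) * lambda_emit."""
    return rest_nm * (1.0 + z) / 1000.0


def in_jwst_range(wavelength_um):
    """True if the wavelength lies inside the quoted detection range."""
    low, high = JWST_RANGE_UM
    return low <= wavelength_um <= high


for z in (0, 8, 10):
    lam = observed_wavelength_um(LYMAN_ALPHA_NM, z)
    print(f"z = {z:2d}: Lyα observed at {lam:.3f} μm, in range: {in_jwst_range(lam)}")
```

Ultraviolet light emitted at these redshifts arrives at wavelengths just beyond 1 μm, that is, in the near-infrared.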
Many earthbound telescopes already have larger objectives: the two Keck
telescopes have 10 m diameter objectives each made up of 36 hexagonal segments
that work as one. The turbulence in the atmosphere brings distortions that alter over
times of order 10 ms and would vitiate the advantage of objective size. The problem
is reduced by locating telescopes on high mountains in dry climates, such as the
Keck telescopes on the Maunakea extinct volcano in Hawaii, and the four 8.2 m
diameter European Southern Observatory Very Large Telescopes (VLTs) at 2600 m
altitude at Cerro Paranal in the Atacama desert in Chile. Compensation for
turbulence is still needed, and is provided by a flexible mirror in the light path
through the telescope after the objective, a mirror that can be deformed rapidly in a
controlled manner. The mirror is warped so as to retain a point image in the image
plane of the telescope of a bright point source in the field of view; this target may be
a star or it can be generated by a laser beam incident on sodium atoms in the upper
atmosphere and causing these to emit in turn. Three giant telescopes in the 30 m
diameter class are under development, with corresponding increased resolution and
light-gathering capability.
More recently it has proved possible to interfere light from more than one
telescope by making underground piped connections that are maintained constant in
length to a small fraction of a wavelength. This gives a resolution improvement in
the case of the Keck pair from 40 mas to 4 mas; and similarly for the VLTs.
More modest telescopes have performed important surveys. The 2.5 m diameter
wide-angle telescope at Apache Point, New Mexico has carried out a program called
the Sloan Digital Sky Survey since 2000. For example it was used to observe galaxies
and quasars (active galactic nuclei) at redshifts from 0.6 to 1.0 and from 0.8 to 3.5
respectively over one third of the sky. As a result a three-dimensional slice of the
universe is recorded and accessible; reaching back to when the universe was a
quarter its present age. These and similar data have been used as input in making a
comparison between the structures seen today and the overdense regions signaled by
the CMB.
At millimeter and microwave wavelengths the telescope’s primary mirror, the dish, is a
metal mesh mirror: the mesh spacing is made smaller than the wavelengths of
interest so that it appears solid to such radiation. The radio telescope at Arecibo in
Puerto Rico had a 300 m diameter dish. Atmospheric turbulence is no problem but
absorption by air means that only certain wavelengths are usable from Earth. X-rays
are almost totally absorbed, so that satellite mounted detectors are essential.
The detectors that have produced the detailed data on the CMB have been
satellite mounted: the latest, Planck, was active from 2009 to 2013. Its parking orbit
was around 1.5 Mkm from the Earth in the opposite direction to the Sun (L2
Lagrangian point) so as to be well-shielded enough to detect the universal 2.7 K
CMB black body radiation. The bulk of the spectrum is concentrated between
30 GHz and 850 GHz (10 mm and 0.3 mm wavelength) and, serendipitously, this
lies in a gap between higher frequency radiation from dust and lower frequency
synchrotron radiation from the Galaxy. The primary mirror has a 1.5 m aperture
giving a resolution between 5′ and 30′ across the accessible spectrum. The detectors
from 30 to 100 GHz are high electron mobility transistors like those in satellite
dishes; from 100 to 850 GHz the detectors are bolometers. A typical device is a
roughly 1 cm diameter mesh of radial and azimuthal gold coated 1 μm wires
resembling an ideal spider’s web. At the center sits a transition edge sensor at 0.1 K
that conducts when warmed by the radiation absorbed on the mesh. The mesh gaps
are shorter than the wavelength so that radiation is efficiently absorbed but cosmic
rays that would warm the wires almost all pass between them. Signals from the
detectors are first amplified by electronics cooled at 40 K, and then processed,
digitized, and transmitted to Earth. After 18 months the helium coolant was
exhausted and later the satellite was passivated. Over its lifetime the temperature
precision achieved was 2 μK at the lowest frequency: this and the angular resolution
were adequate to examine the quantum fluctuations of the CMB in great detail, as
described later. Some bolometers had a rectangular pattern of wires, with only those
in one orientation being made electrically conductive. In this way the polarization of
Figure 1.9. The image of the black hole M87* recorded by the Event Horizon Telescope (EHT) on the left; in
the center the predicted image with perfect resolution; on the right the prediction taking account of the
interferometer’s intrinsic resolution. The 6.5 billion solar mass black hole was imaged by eight radio telescopes
at 1.3 mm wavelength. The bright flare is from its accretion disk. Figure from The Event Horizon Telescope
Collaboration et al. (2019), reproduced with permission under CCBY-SA-3.0.
Figure 1.10. Distribution of events as a function of the square of the incident angle with respect to the Crab
Nebula direction. The data points are compared with simulations shown by solid histograms. Figure from
Amenomori et al. (2019). Courtesy of the American Physical Society.
3400 m$^2$, part of a much larger array of 600 plastic scintillators covering 65,700 m$^2$;
all at an altitude of 4300 m in Tibet. Evidently there is a signal due to high energy
photons, above 100 TeV, originating in the Crab Nebula.
Telescopes detecting electromagnetic radiation are now complemented by the
giant interferometers that detect gravitational waves from inspiralling black hole
binaries and neutron star binaries. The aLIGO interferometer is described in detail
in a later chapter.
1.13 Luminosity
Stars emit radiation strongly across the spectrum from the infrared into the
ultraviolet. Filters are used with telescopes to select bands of the spectrum and so
gain more information. For example, the hotter the star the bluer its spectrum.
Bands covering from the ultraviolet to the near-infrared are shown in the
accompanying Table 1.1. Luminosity is the electromagnetic energy radiated from
a source per unit time, measured in watts. Flux is the electromagnetic energy flowing
across unit area per unit time, measured in W m$^{-2}$. Magnitude is a measure inherited
from the classical world when the brightest stars were designated magnitude 1, and
those barely visible as magnitude 6; which already makes for confusion.

Table 1.1. Bands from the ultraviolet to the near-infrared.
Band  Color          Wavelength range (nm)
U     Ultraviolet    325–390
B     Blue           390–490
V     Yellow/Green   490–580
R     Red            580–730
I     Infrared       730–950
In practice the filter used determines the range.

As regards the eye’s response to changes in intensity, Fechner’s law is a fairly good approximation: for
an intensity I the response is
$$S = 2.3 \log_{10}(I/I_0),$$
where $I_0$ is a constant. Nowadays the agreed way to define magnitudes is to
interpolate between magnitudes 1 and 6, and extend beyond, with a logarithmic
scale that makes use of Fechner’s law. The apparent magnitude is determined from
the measured incident flux f as
$$m = -2.5 \log_{10}(f/f_0). \tag{1.17}$$
The negative sign is needed to take care of the inversion of the magnitude scale, with
brighter sources having the lower magnitude. The constant flux f0 is set to
$2.53 \times 10^{-8}$ W m$^{-2}$, so that the ancient reference magnitudes are matched quite
well. Because of the factor 2.5, a difference in m of 5 corresponds to a ratio of 100 between
the fluxes being compared. The relevant quantity for direct comparison
between sources is the absolute magnitude. This is defined to be the magnitude
that a source would have if viewed from a distance of 10 pc:
$$M = -2.5 \log_{10}\left[\frac{L}{L_0}\right], \tag{1.18}$$
where L is the luminosity of the source and $L_0$ is the zero-point luminosity giving
zero magnitude, namely $3.0128 \times 10^{28}$ W. Flux falls off with the square of the
(luminosity) distance dL, so that
$$M = m - 5 \log_{10}\left[\frac{d_L}{10\,\mathrm{pc}}\right]. \tag{1.19}$$
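The three magnitude relations (1.17)–(1.19) can be sketched numerically as follows. This is a minimal illustration, not part of the original text: the function names are hypothetical, and the solar luminosity of 3.828 × 10^26 W is an assumed nominal value that the text does not quote.

```python
import math

F0 = 2.53e-8      # W m^-2, zero-point flux quoted in the text
L0 = 3.0128e28    # W, zero-point luminosity quoted in the text

def apparent_magnitude(flux):
    """Equation (1.17): m = -2.5 log10(f / f0)."""
    return -2.5 * math.log10(flux / F0)

def absolute_magnitude(luminosity):
    """Equation (1.18): M = -2.5 log10(L / L0)."""
    return -2.5 * math.log10(luminosity / L0)

def absolute_from_apparent(m, distance_pc):
    """Equation (1.19): M = m - 5 log10(d_L / 10 pc)."""
    return m - 5.0 * math.log10(distance_pc / 10.0)

# Assumed illustrative input: nominal solar luminosity (not from the text).
L_SUN = 3.828e26  # W
print(f"Sun, absolute bolometric magnitude: {absolute_magnitude(L_SUN):.2f}")

# A hypothetical source of apparent magnitude 7 seen from 100 pc.
print(f"M for m = 7 at 100 pc: {absolute_from_apparent(7.0, 100.0):.2f}")
```

With these inputs the Sun comes out close to the value quoted in the text, and moving a source ten times farther away than 10 pc shifts its magnitude by 5, as expected from the inverse square law.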
Flux, luminosity, and magnitude summed across the whole spectrum are known as
bolometric flux, bolometric luminosity and bolometric magnitude. Quantities for a
single spectral band have a subscript attached, as MB. The absolute bolometric
magnitude of the Sun is 4.75; that of Rigel, the brightest star in Orion (bottom right),
is 0.12. The brightest sources are distant quasars with absolute magnitudes reaching
−30. Most stars have absolute magnitudes in the range of +20 to −10. The Sun’s
where E is the total energy and p is the relativistic momentum. The Lorentz
transformation takes the same form for all four-vectors: in the case of the
four-momentum we can take Equation (1.22) and replace x everywhere by p.
Each four-vector has an invariant, formed in the same way as Equation (1.23). The
energy-momentum invariant is
$$p_0^2 - p^2 = E^2/c^2 - p^2 = m^2c^2, \tag{1.25}$$
where m would be the rest mass in the case of a particle. The four-momentum is
related to the four-velocity v whose components are
$$v_0 = c\gamma, \quad v_1 = \gamma v_x, \quad v_2 = \gamma v_y, \quad v_3 = \gamma v_z. \tag{1.26}$$
Here $v_x$, $v_y$, and $v_z$ are the standard components of the velocity, meaning that in time
dt the distance traveled along the x-direction will be $v_x\,dt$. Also $\gamma = 1/(1 - \beta^2)^{1/2}$,
where $\beta c = (v_x^2 + v_y^2 + v_z^2)^{1/2}$. The invariant for the velocity four-vector is $c^2$.
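A short numerical check may help here. The sketch below builds the four-velocity of Equation (1.26) and forms its invariant in the same way as for the other four-vectors, the time component squared minus the sum of the squared space components; the result is $c^2$ whatever the velocity. The function names are hypothetical and the speed 0.6c is an arbitrary illustrative choice.

```python
import math

C = 299_792_458.0  # speed of light, m/s

def four_velocity(vx, vy, vz):
    """Components (v0, v1, v2, v3) of Equation (1.26)."""
    beta2 = (vx**2 + vy**2 + vz**2) / C**2
    gamma = 1.0 / math.sqrt(1.0 - beta2)
    return (C * gamma, gamma * vx, gamma * vy, gamma * vz)

def invariant(four_vector):
    """Time component squared minus the sum of the squared space components."""
    t, x, y, z = four_vector
    return t**2 - (x**2 + y**2 + z**2)

# Arbitrary illustrative velocity: 0.6c along x.
u = four_velocity(0.6 * C, 0.0, 0.0)
print("invariant / c^2 =", invariant(u) / C**2)
```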
The invariants can be positive, negative or zero. Let us place P1 at the origin in
Figure 1.11. If the spacetime invariant interval $\Delta s^2$ from event P1 to event P2 is zero
then cΔt exactly equals Δr and a light ray can travel from event P1 to event P2: such
an interval is called light-like. P2 lies upward along a diagonal line. When cΔt
exceeds Δr the interval is positive and P2 can be reached from P1 by traveling slower
than the speed of light. P2 lies in the upward gray region. Such an interval is called
time-like because we can choose an inertial frame such that $r_2 = r_1$, leaving only a
separation in time. Finally, if the interval $\Delta s^2$ is negative, cΔt is less than Δr so that no
information can pass from P1 to P2. P2 then lies in the white region. Such a
separation is called space-like, and in this case an inertial frame can be found in
which t2 = t1, leaving a spatial separation. In this case whatever happens at P1 can
have no influence on what happens at P2 and vice versa. Figure 1.11 illustrates these
different cases shown with P1 located at the origin with a single spatial axis. With
three dimensions the accessible future forms a three-dimensional cone in spacetime
around the time axis.
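The classification just described depends only on the sign of the interval, which a few lines of code make explicit. This is a sketch with a hypothetical function name; the sample separations are invented for illustration and are given as (cΔt, Δx, Δy, Δz) in common length units.

```python
def classify_interval(c_dt, dx, dy, dz):
    """Classify the separation between two events from the sign of
    ds^2 = (c dt)^2 - dx^2 - dy^2 - dz^2."""
    ds2 = c_dt**2 - (dx**2 + dy**2 + dz**2)
    if ds2 > 0:
        return "time-like"
    if ds2 < 0:
        return "space-like"
    return "light-like"

# Hypothetical separations (c*dt, dx, dy, dz) in arbitrary common length units.
for event in [(5.0, 3.0, 0.0, 0.0), (2.0, 2.0, 0.0, 0.0), (1.0, 4.0, 0.0, 0.0)]:
    print(event, classify_interval(*event))
```

With measured rather than exact values the light-like case would need a tolerance instead of an exact comparison with zero.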
Figure 1.11. Section through the light cones at one point in spacetime showing one spatial dimension.
1.15 Exercises
1. Calculate whether the following spacetime intervals are space-like, time-like
or light-like: (1.0, 3.0, 0.0, 0.0); (3.0, 3.0, 0.0, 0.0); (3.0, −3.0, 0.0, 0.0); (−3.0,
3.0, 0.0, 0.0); (0.0, 3.0, 0.0, 0.0); (3.0, 1.0, 0.0, 0.0).
2. Use Stefan’s formula for the energy density, ρ(E), in black body radiation,
$\rho(E) = \alpha T^4$, where α is $7.565 \times 10^{-16}$ J m$^{-3}$ K$^{-4}$. What is the energy density
of the CMB at the current temperature of 2.7255 K? Take the mean energy
per photon to be 2.7kBT. Make an estimate of the number of CMB photons
per cubic centimeter.
3. The average density of baryonic matter (nucleons, either protons or
neutrons, and electrons) in the universe is $4.2 \times 10^{-28}$ kg m$^{-3}$. How many
nucleons are there per cubic meter? What is the ratio of CMB photons to
nucleons today?
4. Suppose a star made of antimatter annihilates on a star made of matter.
Calculate the energy release. The most energetic long duration cosmic
sources are quasars emitting up to $10^{40}$ W. Is it likely there is much
antimatter in the universe?
5. The transition between the ground and first excited state in atomic hydrogen
produces radiation of wavelength 121.6 nm, called the Lyα line. The
continuous spectrum from quasars is marked by the absorption lines due
to absorption by intergalactic clouds of hydrogen encountered on the way to
the Earth. If one such Lyα line is redshifted to 150 nm how far is the
absorbing hydrogen cloud from the Earth?
6. What is the thermal balance, heat in, heat out, of an astronaut during a free
spacewalk? Stefan’s law for radiation from a black body at temperature T
gives an energy flux per square meter $E = 5.67 \times 10^{-8}\,T^4$ W m$^{-2}$, with T in kelvin.
7. Sirius, the brightest star of all in the night sky, has an apparent magnitude
−1.46 and is 2.67 pc away. What is its absolute magnitude? The TRGB stars
have absolute magnitudes close to −4.02: what is their luminosity?
Further Reading
Two excellent parallel texts on cosmology are: Rich J 2009 Fundamentals of
Cosmology (2nd ed.; Berlin: Springer) and Ryden B 2017 Introduction to
Cosmology (2nd ed.; Cambridge: Cambridge Univ. Press). Both were
prepared before the discovery of gravitational waves.
Susskind L 2006 The Cosmic Landscape (Boston, MA: Little, Brown and
Company). This is a more popular book on cosmology by a noted theorist.
Narlikar J V 2002 An Introduction to Cosmology (3rd ed.; Cambridge:
Cambridge Univ. Press). This gives an extended and fuller account of similar
material, written by a practiced author. It is somewhat dated.
Cottrell G 2016 Telescopes: A Very Short Introduction (Oxford: Oxford Univ.
Press). This text provides a compact introduction to modern telescopes used
across the electromagnetic spectrum.
References
Abbott, B. P., Abbott, R., Abbott, T. D., et al. 2016, PhRvL, 116, 061102
Alpher, R. A., Bethe, H., & Gamow, G. 1948, PhRv, 73, 803
Amenomori, M., Bao, Y. W., Bi, X. J., et al. 2019, PhRvL, 123, 051101
Do, T., Hees, A., Ghez, A., et al. 2019, Science, 365, 664
Fixsen, D. J., Cheng, E. S., Gales, J. M., et al. 1996, ApJ, 473, 576
Freedman, W. L. 2021, ApJ, 919, 16
Hubble, E. 1929, PNAS, 15, 168
Hubble, E., & Humason, M. L. 1931, ApJ, 74, 43
Kenyon, I. R. 1990, General Relativity (Oxford: Oxford Univ. Press)
Kenyon, I. R. 2019, Quantum 20/20: Fundamentals, Entanglement, Gauge Fields, Condensates
and Topology (Oxford: Oxford Univ. Press)
Leavitt, H. S. 1908, AnHar, 60, 87
Penzias, A. A., & Wilson, R. W. 1965, ApJ, 142, 419
Planck Collaboration, Aghanim, N., Akrami, Y., et al. 2020, A&A, 641, A1
Rubin, V. C., Ford, W. K., & Thonnard, N. 1980, ApJ, 238, 471
The Event Horizon Telescope Collaboration, Akiyama, K., Alberdi, A., et al. 2019, ApJ, 875, L5
Zwicky, F. 1937, ApJ, 86, 217
Chapter 2
The Equivalence Principle
the proportions of these vary from material to material. Each atom contains a
nucleus, which is made from nucleons (i.e., neutrons and protons) with electrons
circulating around the nucleus. Nucleons feel the strong force whereas electrons do
not: thus it is reasonable to ask whether nucleons and electrons feel the same
gravitational force. The nucleon-to-electron ratio varies from unity in hydrogen to
about 2.5 for elements with high atomic number, so that any difference in the
gravitational force felt by nucleons and electrons would appear as a difference in the
gravitational acceleration of elements of high and low atomic number. In addition
nuclei are lighter than the sum of the constituent nucleon masses by the nuclear
binding energy. This nuclear binding energy varies from zero for hydrogen to 0.7%
of the mass × $c^2$ for iron. Hence if the gravitational force depended, like the strong
force, on the number of nucleons rather than the mass there would be a difference of
0.7% between the gravitational acceleration of iron and hydrogen. Finally there is
the gravitational binding energy of matter, which is a fraction $4.64 \times 10^{-10}$
($0.19 \times 10^{-10}$) of the total mass in the case of the Earth (Moon). Again, it is
reasonable to ask whether the Earth and Moon fall toward the Sun with the same
acceleration: how does gravity act on gravity?
The EP was already implicit in the Newtonian analysis of Galileo’s experiment.
The force acting on a mass $m_g$ in a gravitational field g is
$$F = m_g g. \tag{2.1}$$
Applying Newton’s second law of motion gives the acceleration a of the mass
$$F = m_i a. \tag{2.2}$$
A distinction is being made here between the gravitational mass $m_g$ and the inertial
mass $m_i$. Inertial mass appears in expressions for kinetic energy $m_i v^2/2$ and
momentum $m_i v$, so that its definition is made independent of any weighing process.
Eliminating F from the last two equations gives the acceleration
$$a = (m_g/m_i)\,g. \tag{2.3}$$
Tests from Galileo’s time up to the present reveal no variation in the rate of fall from
material to material. Therefore $m_g/m_i$ has the same value for all materials, and by
choosing units appropriately we make this ratio equal to unity. Einstein interpreted
this result as follows: the motion of a neutral test body released at a given point in
spacetime is independent of its composition, which is the weak equivalence principle
(WEP). A test body is by definition small. Massive objects such as the Earth and the
stars are, unlike test bodies, bound gravitationally and the question arises whether
the gravitational mass remains the same as the inertial (bound) mass in such cases. If
it is postulated that they are the same then the equivalence principle becomes the
strong equivalence principle (SEP).
Einstein next considered the implications of the equivalence principle for motion
in free fall. One example is the International Space Station (ISS), another is a capsule
falling radially toward the Earth, and a final example is a capsule drifting through
intergalactic space. In all three cases we need to ignore any drag forces from the
surrounding gas. Einstein posed a searching question for such systems: can an
astronaut inside a closed capsule determine his state of motion without looking out
of the capsule, whether it is in a uniform gravitational field or uniformly accelerating?
If the astronaut drops a ball it accelerates at the same rate as the capsule and
will remain at rest relative to the capsule, whatever their shared acceleration. The
occupants of the ISS find this property sometimes useful, sometimes not so useful!
However, if the capsule is in a region where the gravitational field is not uniform
then the motion is in principle detectable. To give a concrete example, consider the
capsule to be falling radially toward the Earth. Then dropping not one but two balls
will be an effective strategy because the gravitational forces on them converge at the
center of the Earth as in Figure 2.1. The astronaut could measure the resultant
movement of the balls toward each other, given a large enough capsule and a long
enough time interval. Such effects owe their origin to gradients in the field and are
called tidal effects.
The weak equivalence principle can now be restated as follows so as to exclude
tidal effects: the results of local experiments in a state of free fall are independent of
the motion. Local is used here to express the restriction to a region sufficiently small
that the gravitational field is effectively uniform. Einstein then generalized the
equivalence principle to this form:
The results of local experiments in all freely-falling frames are independent of both the
location and the time.
Figure 2.1. The trajectories of objects in free fall within a space capsule. (Labels: Capsule, Planet.)
In a nutshell, physics is the same in all freely-falling frames. Notice that one frame
in free fall can be accelerating with respect to another such frame. For example, we
can compare satellites in free fall around different stars. The EP can be viewed as an
extension of the first postulate of SR. SR requires that the result of an experiment is
the same for all inertial frames, but has nothing to say about the effect of
gravitational acceleration; the EP requires that the result of local experiments be
the same in all freely-falling frames. In the special theory of relativity it is assumed
that a single inertial frame can be applied to the whole universe, but at the cost of
neglecting acceleration! In the general theory of relativity the natural frame
anywhere is chosen to be a frame in free fall, but we cannot cover the whole
universe with one such frame.
The vast body of experimental data that has been accumulated in support of SR
can be reconciled with the EP by making the inference, as Einstein did, that physics
in free fall must be consistent with SR. Thus, we should add a rider to the EP:
The results of local experiments in free fall are consistent with SR.
Figure 2.2. The experiment of Roll, Krotkov and Dicke (Roll et al. 1964): on the left viewed from above the
North Pole; on the right a side view. (Labels: 0600, N, 1800; test masses Au and Al; direction to the Sun.)
$$v^2 = \frac{GM}{R}\,r(e),$$
where r(e) is the ratio mg(e)/mi(e). The balance of forces on the gold mass is
$$F(\mathrm{Au}) + \frac{GM m_g(\mathrm{Au})}{R^2} = \frac{m_i(\mathrm{Au})\,v^2}{R},$$
where F is the pull from the balance arm. Rearranging this equation gives
$$F(\mathrm{Au}) = \frac{GM m_i(\mathrm{Au})}{R^2}\,[r(e) - r(\mathrm{Au})].$$
There is a similar equation for the aluminum. Taking both inertial masses to be
equal to m, the torque on the balance at 0600 is
$$\Gamma = [F(\mathrm{Al}) - F(\mathrm{Au})]\,l = \frac{GMml}{R^2}\,\eta,$$
where $\eta = r(\mathrm{Au}) - r(\mathrm{Al})$ is called the Eötvös parameter. If η were non-zero there
would be a net torque that reverses every 12 hours. The authors looked for the
24 hour oscillations that such a torque would produce and saw none. Their upper
limit was
$$\eta_{\mathrm{Au,Al}} \leqslant 3 \times 10^{-11}.$$
Adelberger and colleagues have made more refined tests (Schlamminger et al.
2008). Their torsion balance was mounted on a turntable rotating at a constant rate,
which replaces the Earth’s rotation. Then, as seen by someone sitting on the
turntable, an oscillation of the balance arm at the rotation frequency would signal a
violation of the WEP. The rotation frequency was around 1 mHz, a hundred times
that of the Earth’s rotation. Oscillations of the balance arm were detected by an
optical system rotating with the turntable. At the latitude of the experiment in
Washington state the horizontal component of the Earth’s gravitational field is
1.68 cm s−2, three times that of the Sun, so that the gravitational attraction of the
Earth was used to test the WEP.
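The comparison between the two fields can be checked with a one-line Newtonian estimate. The sketch below is illustrative only: the values of $GM_\odot$ and the Earth–Sun distance are assumed standard figures not quoted in the text, and the function name is hypothetical. The horizontal component 1.68 cm s$^{-2}$ is taken from the text.

```python
GM_SUN = 1.327e20        # m^3 s^-2, assumed standard value (not from the text)
AU = 1.496e11            # m, assumed mean Earth-Sun distance (not from the text)
G_HORIZONTAL = 1.68e-2   # m s^-2, horizontal component quoted in the text

def solar_field_at_earth():
    """Newtonian field GM/R^2 of the Sun at the Earth's orbit, in m s^-2."""
    return GM_SUN / AU**2

g_sun = solar_field_at_earth()
print(f"Sun's field at the Earth:      {g_sun * 100:.2f} cm s^-2")
print(f"ratio of horizontal to solar:  {G_HORIZONTAL / g_sun:.1f}")
```

The ratio comes out a little below three, in line with the statement above.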
This approach has several advantages over the experiments relying on the Earth’s
rotation. It insulates the experiment from 24 hour cycles in temperature and
electrical power; it reduces noise in the apparatus, which generally has a 1/f or
$1/f^2$ dependence, where f is the signal frequency. The sensitivity improves as the
wire’s thickness is reduced and the mass correspondingly reduced so as not to break
the wire: 4.84 g was the practical choice for each test body. The container was
evacuated, thermally insulated, shielded by mu-metal from ambient magnetic fields
and located in a temperature stabilized underground room. Data were recorded over
75 days giving
$$\eta_{\mathrm{Be,Ti}} = (0.3 \pm 1.8) \times 10^{-13}.$$
The absence of differential acceleration, at this level of precision, toward the Earth
further validates the WEP.
The WEP and Newton’s inverse square law of gravitation work well from the
scale of the solar system down to centimeter distances. Adelberger and colleagues
took one step further: they adapted the torsion balance and ruled out any
detectable effect due to additional forces of ranges above around 50 μm
(Schlamminger et al. 2008).
The MICROSCOPE experiment reported a test of the WEP in the quiet
environment of a satellite orbiting at an altitude of 700 km (Bergé et al. 2017;
Touboul et al. 2022). The test bodies were coaxial hollow cylinders of different
alloys, Pt/Rh and Ti/Al/V, chosen both to have widely different nucleon/electron
ratios and to be easy to machine. The position of each was monitored capacitively,
and each could be moved by applying a voltage to electrodes on a silica frame
enclosing both test masses. Figure 2.3 illustrates the principle of the measurement. If
the WEP holds the accelerations toward the Earth, in the direction of the red arrows,
should be equal. Their relative acceleration in the axial direction, along the black
arrows, was monitored by comparing the electrostatic forces needed to hold them at
rest with respect to each other. If the WEP is violated the force difference would
oscillate at the orbital frequency. In order to increase the frequency of such
oscillations, and hence of data taking, the satellite could be spun around an axis
perpendicular to the orbital plane. The latest limit on the Eötvös parameter
reported in 2022 (Touboul et al. 2022) is
$$\eta_{\mathrm{Pt,Ti}} = (-1.5 \pm 2.3 \pm 1.5) \times 10^{-15} \tag{2.4}$$
Figure 2.3. MICROSCOPE experiment to test the WEP. Figure from Bergé et al. (2017). Courtesy of Professor
Bergé and the Institute of Physics.
The remote observer measures the time intervals to be dilated and light to have
undergone a gravitational redshift. Later we shall need to relate the squares of time
intervals in different frames, so we write this for later use
$$d\tau^2 = dt^2\left(1 - \frac{2GM}{rc^2}\right), \tag{2.12}$$
where dτ is the proper time and dt is the coordinate time. The derivation leading to
Equation (2.12) is heuristic, but the equation itself is rigorously correct in GR.
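To give a feeling for the size of the effect, Equation (2.12) can be evaluated for a clock at rest on the Earth’s surface compared with a remote observer. This is a rough sketch: the mass and radius of the Earth are assumed illustrative values not taken from the text, the function name is hypothetical, and the Earth’s rotation is ignored.

```python
import math

G = 6.674e-11       # m^3 kg^-1 s^-2
C = 299_792_458.0   # m/s

def proper_time_rate(mass_kg, r_m):
    """d(tau)/dt = sqrt(1 - 2GM/(r c^2)), from Equation (2.12)."""
    return math.sqrt(1.0 - 2.0 * G * mass_kg / (r_m * C**2))

# Assumed illustrative values for the Earth (not taken from the text).
M_EARTH = 5.972e24  # kg
R_EARTH = 6.371e6   # m

rate = proper_time_rate(M_EARTH, R_EARTH)
lag_per_year = (1.0 - rate) * 365.25 * 24 * 3600
print(f"fractional slowing of the surface clock: {1.0 - rate:.3e}")
print(f"accumulated lag per year: {lag_per_year * 1e3:.1f} ms")
```

The fractional slowing is a few parts in $10^{10}$, which is why atomic clocks or very narrow spectral lines are needed to detect it.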
A precise measurement of the gravitational redshift was made by Pound and
Rebka in 1960 (Pound & Rebka 1960) using photons that dropped down inside a
22.6 m tower at Harvard. The predicted spectral shift is only
$$\Delta\nu/\nu = 2.46 \times 10^{-15}.$$
Pound and Rebka exploited the contemporary discovery by Mössbauer of decays by
gamma emission in which the whole crystal recoils rather than the parent nucleus.
This means that the spectral line width is unusually narrow. In the case of
$$^{57}\mathrm{Fe}^* \to \gamma + {}^{57}\mathrm{Fe}$$
the 14.4 keV line has a fractional width of $10^{-12}$, still 500 times the gravitational
spectral shift to be measured.
Pound and Rebka placed the 57Fe* source at the top of the tower and a thin 57Fe
absorber at the foot of the tower. This absorber covered a scintillator viewed by a
photomultiplier. It was arranged that the source could be driven slowly up or down
using a transducer, so producing a Doppler shift to compensate the gravitational
spectral shift. With exact compensation the absorption of photons in the 57Fe
absorber was maximized and the photomultiplier count rate minimized. Pound and
Rebka scanned across the narrow line profile by varying the drive velocity. This gave
a large gain in sensitivity in locating the line center, and a measured gravitational
spectral line shift
$$\Delta\nu/\nu = (2.57 \pm 0.26) \times 10^{-15},$$
which agrees, within the small quoted error, with the prediction obtained from the
SEP. Other more recent tests involve direct comparison of the time-keeping of
atomic clocks or of masers. Vessot and colleagues in 1980 (Vessot et al. 1980)
compared the rate of a hydrogen maser launched to a height of 10,000 km in a
rocket with the rate of an identical maser kept in the laboratory. Two-way telemetry
was used to compensate both for atmospheric effects and for the first-order Doppler
shift due to the relative velocity of the masers. The comparison gave a rate difference
that agreed with the prediction from the EP to parts in $10^4$.
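The Doppler compensation used by Pound and Rebka can be put into numbers. The sketch below assumes only the first-order Doppler relation Δν/ν = v/c and takes the predicted shift quoted above as input; the function name is hypothetical.

```python
C = 299_792_458.0           # speed of light, m/s
PREDICTED_SHIFT = 2.46e-15  # fractional shift quoted in the text

def compensating_speed(fractional_shift):
    """First-order Doppler relation: delta_nu / nu = v / c, so v = c * shift."""
    return fractional_shift * C

v = compensating_speed(PREDICTED_SHIFT)
print(f"drive speed needed: {v * 1e6:.2f} micrometres per second")
```

The source therefore had to be driven at under one micrometre per second, which indicates how delicate the scan across the line profile was.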
An extension of the above arguments based on the EP leads to the conclusion that
the path of light is bent in a gravitational field. Using Equation (2.9) we can infer that a
photon climbing a distance d against the Earth’s gravitational pull loses energy
$$\Delta E = h\,\Delta\nu = \frac{h\nu}{c^2}\,gd,$$
where h is Planck’s constant. The kinetic energy lost by a body of mass m rising
through the same distance is remarkably similar:
$$\Delta E = mgd.$$
Therefore it emerges that a photon of energy E = hν behaves in a gravitational field
as if it possessed an inertial mass $E/c^2$! We shall see later that in Einstein’s theory of
general relativity all forms of energy couple to the gravitational field. Consequently
a photon feels the gravitational force and it follows that a photon follows a curved
path in a gravitational field. This behavior can be pictured in the following
Gedanken (thought) experiment.
Imagine a space capsule in free fall near the Earth: inside it an astronaut strapped
to one wall shines a beam of light horizontally at the opposite wall. Figure 2.4(a)
shows the light path as seen by the astronaut; his frame is in free fall so that the light
travels in a straight line to the opposite wall. An external observer at rest sees things
quite differently. At emission the lamp is at one height, but by the time the light
reaches the other wall the capsule has fallen a little. This view is shown in
Figure 2.4(b) where the light path is seen to curve. Einstein calculated the deviation
of starlight passing near the Sun’s surface on its way to the Earth; his result was
1.750 arcsec. We should note that a calculation using just the EP gives exactly half
this value: time and frame distortion contribute equally.1 The confirmation from the
measurements made in 1919 has already been discussed. Higher precision is
obtained by studying the apparent motion of radio sources that pass near the
Sun’s disk, a technique that is not restricted to times of solar eclipse.
Figure 2.4. Light path in a space capsule in free fall near the Earth as seen by (a) an occupant of the capsule
and (b) an external observer at rest with respect to the Earth’s surface.
1 Appendix G of Kenyon I R 1990 General Relativity (Oxford: Oxford Univ. Press) (Kenyon 1990).
Figure 2.5. The arrangement of a pair of radio antennae used to determine the angular position of radio
sources.
In Figure 2.5 a widely spaced pair of radio telescopes are shown receiving signals from the same
source. If the source direction makes an angle θ with the baseline, which is of length
d, then the path difference to the two dishes is d cos θ and the phase lag between
their signals is
$$\Delta = \frac{2\pi}{\lambda}\,d\cos\theta$$
at a wavelength λ, and in terms of frequency ν this becomes
$$\Delta = \frac{2\pi\nu d\cos\theta}{c}.$$
Measurement of the phase difference leads to a determination of θ. Lebach et al. (1995)
used 30 m diameter radio telescopes on a 4000 km baseline, across the USA, to study
the radio sources close to the Sun at frequencies of 2.3, 8.4, and 22.7 GHz. This is an
example of very long baseline interferometry (VLBI). They found a mean deviation
due to the Sun’s gravitational field of 0.9998 ± 0.0008 times Einstein’s prediction.
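The sensitivity of the phase lag to the source direction can be illustrated with the baseline and one of the frequencies quoted above. In this sketch the function names are hypothetical and the geometry θ = 90° (source direction perpendicular to the baseline) is an assumed choice made purely for illustration.

```python
import math

C = 299_792_458.0  # m/s

def phase_lag(freq_hz, baseline_m, theta_rad):
    """Delta = 2 pi nu d cos(theta) / c, in radians."""
    return 2.0 * math.pi * freq_hz * baseline_m * math.cos(theta_rad) / C

def phase_change_per_radian(freq_hz, baseline_m, theta_rad):
    """Magnitude of d(Delta)/d(theta) = 2 pi nu d sin(theta) / c."""
    return 2.0 * math.pi * freq_hz * baseline_m * abs(math.sin(theta_rad)) / C

ARCSEC = math.pi / (180.0 * 3600.0)  # radians per arcsecond

# Baseline and frequency from the text; theta = 90 degrees is an assumed geometry.
d, nu, theta = 4.0e6, 8.4e9, math.pi / 2
shift = phase_change_per_radian(nu, d, theta) * 1.75 * ARCSEC
print(f"phase change for a 1.75 arcsec deflection: {shift:.0f} rad")
```

A deflection of the size Einstein predicted changes the phase by thousands of radians on such a baseline, so a measurement of the phase to a fraction of a cycle determines the deflection very precisely.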
2.5 Exercises
1. Show that in the Pound–Rebka experiment the expected gravitational
redshift was 2.46 × 10−15.
2. Calculate the redshift for the 768.9 nm potassium line emitted by an atom on
the Sun’s surface.
3. One recent theoretical prediction for the mass of pseudoscalar (spin 0ℏ,
negative parity) particles known as axions is around 20 μeV c−2. Exchange
of such particles would give rise to a modification of Newton’s law of
gravitation. At what range would such a modification be expected?
4. If some of the protons on the Sun and Earth were not matched by electrons
these two bodies would be electrically charged. What fraction of the electrons
would need to be removed from both bodies in order to change the attractive
force between the Sun and the Earth by 1 part in a million?
5. It seems plausible that neutrinos have a mass of around 0.1 eV $c^{-2}$. Neutrinos
are emitted from supernovae with energies around 1 MeV. How much later
will neutrinos arrive on Earth than photons from a supernova distant 2 Mpc
from the Earth?
Further Reading
Will C M 2018 Theory and Experiment in Gravitational Physics (2nd ed.;
Cambridge: Cambridge Univ. Press). This is a thorough presentation by a
world expert on general relativity. It includes accounts of versions of the
equivalence principle, of the post-Newtonian parametrization of metric
theories of gravity, and of experimental tests.
References
Bergé, J., Touboul, P., Rodrigues, M., & Liorzou, F. 2017, JPCS, 840, 012028
Kenyon, I. R. 1990, General Relativity (Oxford: Oxford Univ. Press)
Lebach, D. E., Corey, B. E., Shapiro, I. I., et al. 1995, PhRvL, 75, 1439
Pound, R. V., & Rebka, G. A. 1960, PhRvL, 4, 337
Roll, P. G., Krotkov, R., & Dicke, R. H. 1964, AnPhy, 26, 442
Schlamminger, S., Choi, K.-Y., Wagner, T. A., Gundlach, J. H., & Adelberger, E. G. 2008,
PhRvL, 100, 041101
Touboul, P., Métris, G., Rodrigues, M., et al. 2022, CQGra, 39, 204009 arXiv:2209.15488
Vessot, R. F. C., Levine, M. W., Mattison, E. M., et al. 1980, PhRvL, 45, 2081
Williams, J. G., Turyshev, S. G., & Boggs, D. H. 2004, PhRvL, 93, 261101
Chapter 3
Space and Spacetime Curvature
The observation of the gravitational spectral shift and of the bending of light passing
close to the Sun proved that spacetime is warped by the presence of matter. This
connection brings out a fundamental point: the curvature of a space can be
determined from intrinsic measurements, i.e., from measurements confined to the
space itself. For example, just by throwing a ball we can measure the curvature of
spacetime near the Earth. In Figure 3.1 the maximum height reached above the
launch height is h, and t is the time taken to return to the launch height. ct greatly exceeds
x, the horizontal distance traveled, so the path length in spacetime is close to ct. The
path, though parabolic, is sufficiently flat that it can be treated as circular.
From the geometry of the circle the radius of curvature of the path in spacetime is
$$R = \frac{c^2t^2}{8h}.$$
Taking the acceleration due to gravity to be g, and ignoring air resistance, the height
reached is
$$h = \frac{gt^2}{8}.$$
Combining these two equations gives R, the radius of curvature in spacetime near
the Earth:
$$R = \frac{c^2}{g} \approx \text{one light year}.$$
For reference later we rewrite this as
$$1/R = \frac{GM_\oplus}{r^2c^2}, \tag{3.1}$$
where r is the distance to the center of the Earth and M⊕ is the Earth’s mass.
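The claim that every throw gives the same radius of curvature can be checked directly. The following sketch uses hypothetical function names, an assumed surface gravity of 9.81 m s$^{-2}$, and invented throw heights; it combines the two relations above and expresses R in light years.

```python
C = 299_792_458.0        # m/s
G_SURFACE = 9.81         # m s^-2, assumed surface gravity
LIGHT_YEAR = 9.4607e15   # m

def flight_time(h, g=G_SURFACE):
    """Invert h = g t^2 / 8 for the time of flight t."""
    return (8.0 * h / g) ** 0.5

def radius_of_curvature(h, t):
    """R = c^2 t^2 / (8 h), from the geometry of a shallow circular arc."""
    return C**2 * t**2 / (8.0 * h)

for h in (1.0, 5.0, 20.0):   # hypothetical throw heights in metres
    R = radius_of_curvature(h, flight_time(h))
    print(f"h = {h:5.1f} m: R = {R / LIGHT_YEAR:.3f} light years")
```

The height of the throw cancels, leaving $R = c^2/g$, a little under one light year.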
In Section 3.1 the basic concepts of curvature are introduced, using the easily
visualized example of two-dimensional spaces.
Figure 3.1. Curvature of spacetime from the flight of a ball. (Axes: ct against x; the curve is the path in spacetime.)
The path shown in Figure 3.1 is one such geodesic. The general definition of
curvature and ways to measure it are described in Section 3.2. How you compare
local vectors at different places in a curved space is described in Section 3.3. The
relationship between curvature and the metric equation is described in Section 3.4,
and the metric equation for Minkowski space is considered in Section 3.5. This
moves the discussion forward to spacetime. Section 3.6 introduces geodesics as the
paths of bodies in free fall, and tidal acceleration, which is the divergence between
nearby geodesics due to the change in the gravitational field with location. Our final
topic is the Schwarzschild metric, describing spacetime around a spherically
symmetric mass. This is introduced here heuristically; a rigorous proof of its validity
in GR is carried through in Appendix C. In Chapter 7 predictions of GR effects in
the solar system will be made with the Schwarzschild metric and their successful tests
described. The Schwarzschild metric will be used again to analyze the properties of
black holes in Chapter 8.
Figure 3.2. A two-dimensional spherical surface, such as the surface of the Earth.
at (θ,ϕ) and (θ + dθ , ϕ + dϕ) is rdθ in latitude and r sin θ dϕ in longitude. Then the
total separation is given by the quadratic equation
$$ds^2 = r^2d\theta^2 + r^2\sin^2\theta\,d\phi^2. \tag{3.2}$$
This is a metric equation. In order to reconstruct the geometry of the Earth from the
latitude and longitude a distance scale is needed and this is what the metric equation
provides. For example, the metric equation shows that near the poles (θ = 0° or
180°) changes in longitude ϕ produce only small displacements.
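The shrinking of longitude displacements toward the poles follows directly from the metric equation. In the sketch below θ is the polar angle measured from the pole, as in Equation (3.2); the function name is hypothetical and the Earth’s mean radius is an assumed value not given in the text.

```python
import math

def line_element(r, theta, dtheta, dphi):
    """ds from Equation (3.2) for small coordinate steps (angles in radians)."""
    ds2 = r**2 * dtheta**2 + r**2 * math.sin(theta)**2 * dphi**2
    return math.sqrt(ds2)

R_EARTH = 6.371e6                 # m, assumed mean radius of the Earth
one_degree = math.radians(1.0)

# Distance covered by one degree of longitude at several polar angles.
for polar_angle_deg in (90.0, 30.0, 1.0):
    ds = line_element(R_EARTH, math.radians(polar_angle_deg), 0.0, one_degree)
    print(f"theta = {polar_angle_deg:4.0f} deg: 1 deg of longitude = {ds / 1e3:6.1f} km")
```

The same step in ϕ that covers more than a hundred kilometers at the equator covers only a few kilometers one degree from the pole.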
The properties of a spherical curved surface, highlighted above, are common to
other curved two- and higher-dimensional spaces of interest. First of all, in order to
cover the whole of a curved surface or space it is necessary to use generalized or
Gaussian coordinates; rectangular Cartesian coordinates are inadequate. A second
property is that the local distances are given by a metric equation in the Gaussian
coordinate separations.
A sphere is one example of a Riemann space: in such spaces it is always possible to
match any arbitrary region by a flat or Euclidean space provided that the region
taken is small enough, simplifying a surveyor’s life. This is like being able to draw a
straight line tangent at any point on a smooth curve. Consider a general two-
dimensional Riemann surface with metric equation at a point P
$$ds^2 = a\,du^2 + 2b\,du\,dv + c\,dv^2 \tag{3.3}$$
where (u, v) are some Gaussian coordinates. The coefficients a, b, and c are
functions of position and contain all the information about the geometry of the
surface. This can be converted to obviously Euclidean form by completing the
squares
$$ds^2 = \left[\sqrt{a}\,du + \frac{b}{\sqrt{a}}\,dv\right]^2 + \left[c - \frac{b^2}{a}\right]dv^2 = dx^2 + dy^2,$$
where
$$x = \sqrt{a}\,u + \frac{b}{\sqrt{a}}\,v \quad\text{and}\quad y = \sqrt{c - \frac{b^2}{a}}\,v,$$
provided that a > 0 and $ac > b^2$. In this case it follows that a Euclidean surface will
match the surface locally at P. In other words a plane can always be drawn at any
arbitrary point on a two-dimensional Riemann surface so that it is locally tangential
to the surface. A similar procedure can be followed in higher-dimensional Riemann
spaces. Some coordinate transformation can always be found which converts the
metric equation to a sum of squares. The Cartesian coordinates which result from
this transformation describe the space tangential to the curved space at the point
selected. To reiterate: Riemann spaces are locally flat (locally Euclidean).
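The reduction to a sum of squares can be verified numerically for the two-dimensional case. The sketch below completes the square as described above, treating a, b, and c as constants at the point P; the function names and the coefficient values are hypothetical, chosen only to satisfy a > 0 and ac > b².

```python
import math

def to_local_cartesian(a, b, c, du, dv):
    """Map small steps (du, dv) to (dx, dy) by completing the square.
    Requires a > 0 and a*c > b**2 (the Riemannian case)."""
    if not (a > 0 and a * c > b * b):
        raise ValueError("need a > 0 and ac > b^2")
    dx = math.sqrt(a) * du + (b / math.sqrt(a)) * dv
    dy = math.sqrt(c - b * b / a) * dv
    return dx, dy

def metric_ds2(a, b, c, du, dv):
    """ds^2 = a du^2 + 2 b du dv + c dv^2, Equation (3.3)."""
    return a * du**2 + 2.0 * b * du * dv + c * dv**2

# Hypothetical coefficient values at the point P, and an arbitrary small step.
a, b, c = 2.0, 0.5, 1.5
du, dv = 0.03, -0.02
dx, dy = to_local_cartesian(a, b, c, du, dv)
print(metric_ds2(a, b, c, du, dv), dx**2 + dy**2)
```

The two printed numbers agree to rounding error, showing that (dx, dy) behave as Cartesian displacements in the tangent plane at P.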
If instead of having $ac > b^2$, it is the case that $b^2 > ac$ then
$$ds^2 = dx^2 - dy^2. \tag{3.4}$$
The space involved is still locally flat because it has a metric equation that reduces to
a difference of squares. Its tangent space is called pseudo-Euclidean and the space
itself is called pseudo-Riemannian. For most purposes we can ignore the distinction
and include the pseudo-Riemannian spaces with the Riemannian spaces. Referring
back to the first chapter we see that the spacetime of SR is pseudo-Euclidean, which
explains our interest in such spaces.
The shortest paths joining points a finite distance apart on a curved surface are
generally not simple straight lines. However they are the straightest lines that can be
drawn on the surfaces between points under consideration. Such paths are called
geodesics and naturally on a flat surface a geodesic is a straight line. Geodesics on a
spherical surface are the well-known great circle routes which are used by aircraft on
intercontinental flights.
Figure 3.3(a) shows a great circle path drawn to touch a line of latitude θ at P.
Figure 3.3(b) shows a diametral plane section of the sphere through P and the North
Pole. D is the center of curvature of the line of latitude, while C is the center of
curvature of the surface. A geodesic has curvature $1/r$ and the line of latitude has
curvature $1/(r\sin\theta)$. Resolving the curvature of the line of latitude perpendicular and
parallel to the surface gives $[1/(r\sin\theta)]\sin\theta = 1/r$ and $\cos\theta/(r\sin\theta)$; while for the
geodesic great circle the corresponding components are 1/r and zero. This illustrates
a very important property of the set of lines through a point on a surface that are
tangential (share the same direction) at that point. The component of curvature
perpendicular to the surface is the same for all of them, and in the case illustrated
this is 1/r . Among these lines one, the geodesic, has only this component of
curvature. It has no component of curvature lying in the surface, which makes it
the straightest path possible over the surface. This characteristic is quite general and
holds for geodesics in spaces that are less symmetric than that of a sphere, and in
Figure 3.3. (a) A line of latitude and a geodesic (great circle) on a spherical surface which are tangential to one
another at P: (b) a diametral plane section through the sphere which contains the North Pole N and the center
of the Earth C. D is the center of curvature of the line of latitude at P.
Figure 3.4. Three two-dimensional surfaces: P is dome shaped, N is saddle shaped, and F is flat. In each case
the broken line marks out a path which stays exactly a distance r from O.
Figure 3.5. A section through a spherical surface, with center of curvature at C and radius R.
reader will find it hard to picture extending the surface N continuing everywhere
with negative curvature. The reason is that surfaces of negative curvature cannot be
embedded in our (nearly) flat and Euclidean three-space. The inhabitant of two-
dimensional space could also refine the use of his measurement to make a
quantitative determination of the curvature.
We specialize to the case of a spherical surface whose cross-section is shown in
Figure 3.5. C is the center of curvature and R is the radius of curvature. The angle
subtended by a length of string r at the center of the sphere is θ, and so
\[ \theta = r/R \tag{3.8} \]
Figure 3.6. G1OG1 and G2OG2 are geodesics on a two-dimensional surface which intersect at right angles. ON
is the normal to the surface at O.
In the new Cartesian frame with coordinates $(x, y, z)$ the surface has the equation
\[ z = (K_1 x^2 + K_2 y^2)/2. \tag{3.14} \]
The planes xOz and yOz are called the principal planes. Now consider the line along
which this surface intersects the plane xOz. It has the equation
\[ z = K_1 x^2/2, \tag{3.15} \]
and its radius of curvature $R_1$ is given by the formula relating the sagitta z to the
chord length 2x for a circular arc
\[ 2R_1 z = x^2. \tag{3.16} \]
Thus the curvature of the line of intersection is
\[ 1/R_1 = K_1. \tag{3.17} \]
Similarly in the orthogonal section with yOz the curvature is $K_2$. It is not too difficult
to prove that $K_1$ and $K_2$ are the minimum and maximum curvatures for any plane
section through the surface at O containing ON. $K_1$ and $K_2$ are called the principal
curvatures of the surface at O. Their product is an invariant for the surface at O
called the Gaussian curvature K; thus
\[ K = K_1 K_2. \tag{3.18} \]
A sphere has Gaussian curvature $R^{-2}$ at any point on its surface. In the case of a
cylinder one principal plane bisects its length along a straight line; hence one principal
curvature is zero, and the Gaussian curvature is then also zero. Finally, for the saddle-
shaped surface of Figure 3.4 one of the principal planes lies along the length of the
saddle in the direction of the horse’s spine and the other lies transverse to the saddle in
the direction of the horse’s ribs. The center of curvature of the first section lies above
the saddle, while the center of curvature of the second section lies below the saddle.
Therefore K1 and K2 have opposite signs and the Gaussian curvature is negative. The
method already described for measuring the curvature of the sphere generalizes so that
for any other two-dimensional surface the curvature is given by
\[ K = \frac{3}{\pi}\lim_{r\to 0}\left(\frac{2\pi r - C}{r^3}\right). \tag{3.19} \]
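Equation (3.19) can be tried out on a sphere, where the answer should be $1/R^2$. The sketch below assumes the standard result that a circle of geodesic radius r on a sphere of radius R has circumference $C = 2\pi R\sin(r/R)$; the function names and the value of R are illustrative.

```python
import math

def circumference_on_sphere(r, R):
    """Circumference of a circle of geodesic radius r on a sphere of radius R."""
    return 2 * math.pi * R * math.sin(r / R)

def curvature_estimate(r, R):
    """Finite-r estimate of Equation (3.19): (3/pi) * (2*pi*r - C) / r**3."""
    C = circumference_on_sphere(r, R)
    return (3 / math.pi) * (2 * math.pi * r - C) / r**3

R = 2.0  # hypothetical radius, so the expected curvature is 1/R**2 = 0.25
for r in (0.5, 0.1, 0.02):
    print(r, curvature_estimate(r, R))
```

As r shrinks the estimate approaches 0.25, since the leading correction is of relative order $(r/R)^2$.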
Two other intrinsic methods for measuring K are worth discussing as they are
methods that carry over to the analysis of spacetime curvature. The first method is
based on the way the separation of geodesics grows with distance. In the case of a
plane the separation between a pair of geodesics (straight lines) through a point
increases linearly with the distance from this point. In contrast the separation of a
pair of lines of longitude shown diverging from the North Pole N in Figure 3.7 does
not vary linearly with the distance s measured from the pole. The difference in
longitude is ϕ and so the separation after a distance s is
\[ \eta = (R\sin\theta)\,\phi \tag{3.20} \]
[Diagram labels: S, ϕ, R sin θ, η, θ = s/R.]
Figure 3.7. A diagram to show how the separation of a pair of geodesics on a sphere depends on the distance
from their intersection point.
force on an electron in an electric field has magnitude and direction but only exists at
the coordinates of the electron. Wind velocity and force are both local vectors. A
simple and most useful local vector in the present context is the tangent vector to a
curve in space.
The comparison of local vectors at different places is easy in flat space. First, a
Cartesian coordinate system is set up with one vector a at location A and the other b
at B; a and b will be equal if their components are equal. Equivalently we can
imagine picking up b and carrying it to A without changing its length or direction
and then examining whether it fits a exactly. In this procedure b is said to be parallel
transported from B to A. Comparison of local vectors in a curved space is less
straightforward because it is no longer possible to set up a single Cartesian
coordinate system to cover all space. Once more it is helpful to consider the
situation of 2D, a two-dimensional being who lives on a spherical surface embedded
in our three-dimensional space. Suppose that 2D starts at the pole in Figure 3.8 with
the local vector a shown there. 2D’s only strategy is to make small steps and to carry
the local vector parallel to itself at each step. Without any reference frame to
check the parallelism even this seems difficult. However, 2D can start by moving off
in the direction of the local vector itself, and in this case parallel transport is well
defined. What 2D is doing in this case is to trace out a geodesic—a great circle on the
sphere, e.g., NA shown in Figure 3.8. After this 2D can carry any other vector
parallel to itself by traveling along a geodesic and keeping the local vector at a
constant angle to the geodesic. In cases where the route does not follow a geodesic it
would be necessary to split the path up into infinitesimal steps, each step being along
a geodesic; which is possible because the space is locally flat.
Emboldened by success, 2D could go on to parallel transport the vector a along
the closed path NABN in Figure 3.8. Each path segment is a geodesic. Starting
from the pole the local vector is carried along the direction it points (a line of
longitude) to the equator (A). From A it is parallel transported along the equator to
Figure 3.8. Parallel transport of a vector round the path NABN over a spherical surface.
B and then returned along another line of longitude to the pole. As a result the
vector is seen to rotate through an angle ϕ, which is the separation in longitude
between A and B. In general the rotation of a local vector when carried around a
closed path on any two-dimensional surface is given by the expression
\[ \phi = K \times (\text{area enclosed by path}), \tag{3.24} \]
which is easily checked for the route discussed.
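As a worked check for the route NABN, assume a sphere of radius R, so that $K = 1/R^2$. The region enclosed by two lines of longitude separated by ϕ and the equator is the fraction ϕ/2π of a hemisphere:

```latex
\begin{align*}
\text{area}(NAB) &= \frac{\phi}{2\pi}\times 2\pi R^2 = \phi R^2 ,\\
K \times \text{area}(NAB) &= \frac{1}{R^2}\times \phi R^2 = \phi .
\end{align*}
```

This reproduces the rotation ϕ found by carrying the vector around the path.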
An equivalent view of this effect is that the result of parallel transport in a curved
space depends on the path taken.
where grr is a function of r only. Figure 3.9 shows a section through the surface
embedded in a three-dimensional space, with the axis of rotation lying in the paper,
situated off to the left and parallel to the left edge of the paper. The curvature of this
surface is
\[ K = \frac{\partial g_{rr}/\partial r}{2\,r\,g_{rr}^{2}}. \tag{3.26} \]
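A quick numerical check of Equation (3.26) is possible for a sphere. The sketch assumes the surface metric has the form $ds^2 = g_{rr}\,dr^2 + r^2 d\phi^2$ with r the distance from the axis, for which a sphere of radius R has $g_{rr} = 1/(1 - r^2/R^2)$; both the assumed form and the function names are illustrative.

```python
def g_rr_sphere(r, R):
    """g_rr for a sphere of radius R, with r the distance from the axis."""
    return 1.0 / (1.0 - (r / R) ** 2)

def gaussian_curvature(g_rr, r, h=1e-6):
    """Equation (3.26): K = (dg_rr/dr) / (2 r g_rr**2), by central difference."""
    dg_dr = (g_rr(r + h) - g_rr(r - h)) / (2 * h)
    return dg_dr / (2 * r * g_rr(r) ** 2)

R = 3.0  # hypothetical radius; the expected curvature is 1/R**2
for r in (0.5, 1.0, 2.0):
    print(r, gaussian_curvature(lambda x: g_rr_sphere(x, R), r))
```

Each value should be close to 1/9, independent of r, as expected for a surface of constant curvature.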
Figure 3.9. A section through a two-dimensional cylindrically symmetric curved surface. r is the coordinate
distance from the axis of rotation: this axis is off to the left and lies in the plane of the paper, parallel to the left
edge of the paper. s is measured along the curve drawn.
1 A general proof is given in Appendix B: Kenyon I R (1990) General Relativity Oxford Univ. Press (Kenyon 1990).
Spaces with more than two dimensions require more than a single parameter to
describe the Gaussian curvature at a given point: n(n − 1)/2 independent Gaussian
curvatures are required for a space of n dimensions. A more complete and compact
description of curvature in n dimensions is embodied in the Riemann tensor,
introduced in Chapter 6.
\[ \eta_{\mu\nu} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{bmatrix}. \tag{3.30} \]
ημν is called a metric tensor. Subscripts and superscripts are introduced in Equation
(3.29) so that the notation is consistent with that used from Chapter 4 onward. For
the present the reader can take subscripts and superscripts to be equivalent. In cases
for which the system has spherical symmetry, such as spacetime around a spherical
star or planet, it may often be better to use polar coordinates; then
\[ ds^2 = c^2 dt^2 - dr^2 - r^2 d\theta^2 - r^2\sin^2\theta\, d\phi^2 = \sum_{\mu=0}^{3}\sum_{\nu=0}^{3} \eta_{\mu\nu}\, dw^\mu dw^\nu, \tag{3.31} \]
\[ \eta_{\mu\nu} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -r^2 & 0 \\ 0 & 0 & 0 & -r^2\sin^2\theta \end{bmatrix}. \tag{3.32} \]
The notation can be greatly simplified by adopting the Einstein summation
convention in which we sum over repeated indices. With this convention the metric
equation simplifies to
\[ ds^2 = \eta_{\mu\nu}\, dx^\mu dx^\nu. \tag{3.33} \]
It is useful to distinguish between Roman and Greek suffixes. When Greek letters
are used, as above, the summation is to be made over time and space coordinates.
With this interpretation the right-hand side of Equation (3.33) is identical to that of
Equation (3.29). However if Roman letters are used the summation is only made
over space coordinates (1,2,3). The metric equation for the three-dimensional
Euclidean space with Cartesian coordinates is then
\[ ds^2 = a_{ij}\, dx^i dx^j \tag{3.34} \]
where
\[ a_{ij} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}. \tag{3.35} \]
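The summation convention is simply a pair of nested sums, which can be spelled out in a few lines. This is a minimal sketch, assuming the coordinates are ordered as (ct, x, y, z); the function name and the sample displacement are illustrative.

```python
# Minkowski metric eta_{mu nu} of Equation (3.30), signature (+, -, -, -).
ETA = [
    [1, 0, 0, 0],
    [0, -1, 0, 0],
    [0, 0, -1, 0],
    [0, 0, 0, -1],
]

# Euclidean metric a_{ij} of Equation (3.35).
A = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

def interval_squared(metric, dx):
    """Sum metric[m][n] * dx[m] * dx[n] over both repeated indices."""
    n = len(dx)
    return sum(metric[m][v] * dx[m] * dx[v] for m in range(n) for v in range(n))

# Hypothetical displacement (c dt, dx, dy, dz), assumed ordering.
dx4 = [5.0, 3.0, 0.0, 4.0]
print(interval_squared(ETA, dx4))     # Greek indices: sum over 0..3
print(interval_squared(A, dx4[1:]))   # Roman indices: sum over 1..3 only
```

For this displacement the spacetime interval is 25 - 9 - 16 = 0 (a light-like separation), while the purely spatial sum gives 25.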
The observations of gravitational redshift and of the deviation of electromagnetic waves
passing the Sun show that real spacetime is curved. Therefore flat Minkowski
spacetime is inadequate and our analysis of curved spacetime will need to proceed
along the lines mapped out above for analyzing curved space. Gaussian (generalized)
coordinates $x^\mu$ can be used to cover curved spacetime. The interval ds between
events in spacetime with coordinate separations $dx^\mu$ is then given by the quadratic
metric equation
\[ ds^2 = c^2 d\tau^2 = g_{\mu\nu}\, dx^\mu dx^\nu \tag{3.36} \]
where the components of the metric tensor $g_{\mu\nu}$ are functions of the position and time.
At this point the strong equivalence principle supplies a key ingredient to under-
standing curved spacetime. This principle requires that on transforming to a freely
falling frame all local experimental measurements give results in accord with SR.
What this means in geometric terms is that a Minkowski frame matches the structure
of real spacetime locally, but not globally. Thus we can at any event in spacetime, y,
always find some frame for which
\[ g_{\mu\nu}(y) = \eta_{\mu\nu}, \qquad \left.\frac{\partial g_{\mu\nu}}{\partial x^\rho}\right|_{y} = 0. \tag{3.37} \]
This frame is in free fall and so our spacetime belongs to the category of spaces
which have a quadratic metric equation and are locally flat, known as pseudo-Riemann
spaces.
Riemann, who was a student of Gauss, initiated the analysis of curved spaces of
more than two dimensions in 1854. By the early years of the twentieth century the
mathematical properties of Riemann spaces had been extensively studied and this
material was available for Einstein to use. Einstein was fortunate in having a friend,
Marcel Grossmann, who introduced him to Riemannian geometry and who worked
with him until Einstein left Zurich in 1914. The formal development of GR will be
outlined in Chapters 5 and 6 while in the remainder of this chapter a simpler
approach will be used to infer one particularly important solution of Einstein’s
equation. This is the solution for empty space outside a spherically symmetric mass
distribution.
A geodesic in flat space (a straight line) is the path of a free body as described by
Newton’s first law of motion. Hence a time-like geodesic in Minkowski spacetime (also
a straight line) is the path of a free body. Taking the equivalence principle as a guide we
may infer that equally in curved spacetime the path of a test body in free fall follows a
time-like geodesic.
Now let’s consider the deviation between paths of nearby test bodies in free fall
toward a spherically symmetric star. Figure 3.10 shows two such bodies A and B,
both having mass m and both at a radial distance r from the star of mass M; A and B
are a tangential distance ξ apart. If m is sufficiently small we can ignore the mutual
attraction of the two masses. Resolving the gravitational force due to the star
tangential to OA gives zero at A, while for B the force has a component
\[ F_1 = \frac{GMm}{r^2}\,\frac{\xi}{r} \tag{3.38} \]
Figure 3.10. The paths of two nearby masses A and B in radial free fall toward a star whose center of mass lies
at O. ξ is their tangential separation.
The above approach cannot be extended to infer spatial curvature, so for the present
we infer that the curvature of the r − ϕ surface with θ = π /2 has a similar form
\[ K_{r\phi} = \frac{-GM}{r^3 c^2}. \tag{3.43} \]
This inference is justified in Appendix C from the solution of Einstein’s equation.
Combining Equations (3.47) and (3.44) completes the metric equation for empty space
outside a spherically symmetric mass distribution:
\[ ds^2 = c^2 d\tau^2 = c^2 dt^2\left(1 - \frac{2GM}{rc^2}\right) - \frac{dr^2}{1 - 2GM/rc^2} - r^2 d\Omega^2. \tag{3.48} \]
Note that the relative signs of the spatial components with respect to the time
component are set negative, so that in the limit of zero mass this reduces to the
Minkowski metric equation of SR:
\[ ds^2 = c^2 d\tau^2 = c^2 dt^2 - dr^2 - r^2 d\Omega^2. \tag{3.49} \]
Equation (3.48) is known as the Schwarzschild metric equation for which the metric
tensor is
\[ g_{\mu\nu} = \begin{bmatrix} \left(1 - \dfrac{2GM}{rc^2}\right) & 0 & 0 & 0 \\ 0 & -\left(1 - \dfrac{2GM}{rc^2}\right)^{-1} & 0 & 0 \\ 0 & 0 & -r^2 & 0 \\ 0 & 0 & 0 & -r^2\sin^2\theta \end{bmatrix}. \tag{3.50} \]
Note that the components of this equation do not depend on time, and so the
Schwarzschild metric is static.
Birkhoff and Langer (1923) proved that this metric equation is the unique
description in GR of spacetime outside a spherically symmetric mass distribution
(carrying no charge or angular momentum), and that it is asymptotically flat. The solution
is static and there is no other non-spherically symmetric static solution. If the
spherically symmetric mass distribution is contracting or expanding radially the
Schwarzschild solution remains valid outside this distribution. The Schwarzschild
metric will be used in calculations of the observable effects of GR in the solar system
(Chapter 8) and of the properties of non-rotating neutral black holes (Chapter 9).
It is important to retain a firm grasp on what the coordinates appearing in
Equation (3.50) are and what they are not. Referring back to Equation (2.2) we
recall that dt is the time interval between two events measured by an observer using a
clock which is in a region remote enough that spacetime is effectively flat; dt is called
the coordinate time interval. The proper time interval, dτ, is that measured on a clock
carried by someone moving from (t , r, θ , ϕ) to (t + dt , r + dr, θ + dθ , ϕ + dϕ).
The coordinate r is different from the radial distance measured from the center of
mass M, which we call a. The relationship between a and r is
\[ da = \left(\frac{1}{1 - 2GM/rc^2}\right)^{1/2} dr. \tag{3.51} \]
The area of the spherical surface labeled by r is $4\pi r^2$ and not $4\pi a^2$, and its
circumference is $2\pi r$ and not $2\pi a$.
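To see how far the radial distance a departs from the coordinate r, Equation (3.51) can be integrated numerically between two coordinate radii. The sketch below uses a simple midpoint rule and works in units where $r_0 = 2GM/c^2 = 1$; the function name, step count, and sample radii are illustrative choices.

```python
import math

def proper_radial_distance(r1, r2, r0, steps=100_000):
    """Integrate da = dr / sqrt(1 - r0/r) from r1 to r2 (midpoint rule).

    r0 = 2GM/c**2; requires r0 < r1 < r2.
    """
    if not (r0 < r1 < r2):
        raise ValueError("need r0 < r1 < r2")
    h = (r2 - r1) / steps
    total = 0.0
    for i in range(steps):
        r = r1 + (i + 0.5) * h
        total += h / math.sqrt(1.0 - r0 / r)
    return total

r0 = 1.0  # units in which 2GM/c**2 = 1
a = proper_radial_distance(2.0, 3.0, r0)
print(a)  # exceeds the coordinate difference of 1.0
```

The measured distance between the two shells is larger than the difference of their r labels, and the excess grows as the inner shell approaches $r_0$.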
From the form of the Schwarzschild metric it is clear that the factor $2GM/rc^2$ is
an important measure of the effect of mass on the curvature of spacetime. When
$2GM/rc^2$ is small compared to unity, the curvature is small and the general
relativistic effects are negligible. Conversely, if $2GM/rc^2$ approaches unity the
curvature is severe and general relativistic effects dominate. Spacetime in the vicinity
of a star whose radius shrinks to a value less than $2GM/c^2$ becomes so warped that
the region within is effectively isolated from the rest of the universe. The
phenomenon is known as a black hole and the radius of the surface of isolation,
the Schwarzschild radius, is
\[ r_0 = 2GM/c^2. \tag{3.52} \]
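A short calculation gives a feel for the sizes involved. The sketch below evaluates Equation (3.52) and the factor $2GM/rc^2$ for the Sun; the rounded constants and the solar radius are assumed values supplied for illustration, not figures taken from this chapter.

```python
G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2 (rounded, assumed)
C = 2.998e8        # speed of light, m/s (rounded, assumed)
M_SUN = 1.989e30   # solar mass, kg (rounded, assumed)
R_SUN = 6.96e8     # solar radius, m (rounded, assumed)

def schwarzschild_radius(mass_kg):
    """Equation (3.52): r0 = 2GM/c**2, in metres."""
    return 2 * G * mass_kg / C**2

def curvature_factor(mass_kg, r_m):
    """The dimensionless factor 2GM/(r c**2) at coordinate radius r."""
    return schwarzschild_radius(mass_kg) / r_m

print(schwarzschild_radius(M_SUN))        # roughly 3 km
print(curvature_factor(M_SUN, R_SUN))     # a few parts in a million
```

With these inputs the Schwarzschild radius of the Sun is about 3 km, and the factor at the solar surface is of order $10^{-6}$, which is why general relativistic effects in the solar system are small.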
3.8 Exercises
1. Calculate the approximate rotation in the horizontal plane of a vector if it is
parallel transported around the periphery of the 48 contiguous states of the
USA.
2. A two-dimensional toroidal surface has these dimensions: the mean diameter
is 20 m and the radius of the circular cross-section is 2 m. Calculate the
Gaussian curvature of the surface at the inner and outer edges of the torus.
Where does the Gaussian curvature of this surface become infinite?
3. A star with the same mass as the Sun has radius 3 km. Calculate the metric
equation for spacetime near the surface.
4. Write down the equation for an equatorial line in a space described by the
Schwarzschild metric at constant time ($\theta = \pi/2$, $\phi = 0$, t constant). You
should obtain
\[ ds^2 = \frac{dr^2}{1 - r_0/r}, \]
where $r_0 = 2GM_\odot/c^2$. Now suppose that s in this equation is used to define the
length in two-dimensional flat space with Cartesian coordinates w and r.
Show that the equation then defines a parabola
\[ w^2 = 4r_0(r - r_0). \]
Now consider the equatorial plane in the same space at constant time
($\theta = \pi/2$, t constant). Show that it is geometrically equivalent to a paraboloid
of revolution obtained by rotating the last equation around the w-axis.
5. Consider a body in a circular equatorial orbit around a spherically symmetric
mass M with angular velocity ω. Using the Schwarzschild coordinates show
that the condition that the orbital period is extremal results in a radius
$[GM/\omega^2]^{1/3}$.
6. The radius of the circle threading the center of the cavity of a torus is R. The
torus rests on a horizontal surface. A vertical circular section of the torus has
radius r. A line drawn in this plane from a point on the surface to the center
of the circle makes an angle ϕ with the horizontal. Show that the Gaussian
curvature at that point on the surface is $\cos\phi/[r(R + r\cos\phi)]$.
Further Reading
Berry M 1976 Principles of Cosmology and Gravitation (Cambridge: Cambridge
Univ. Press). This old and short book contains lots of insights, and Chapter 4
is helpful in explaining curved spacetime.
References
Birkhoff, G. D., & Langer, R. E. 1923, Relativity and Modern Physics (Cambridge, MA: Harvard
Univ. Press)
Kenyon, I. R. 1990, General Relativity (Oxford: Oxford Univ. Press)
Chapter 4
Elementary Tensor Analysis
A physical law valid in SR, that is, in a frame in free fall, when recast in tensor form is
automatically valid under general transformations. This technique fails for the central
issue of writing a relativistic law of gravitation: Newton’s law of gravitation is not
compatible with SR, so we lack a starting point. SR only applies in a frame in free fall,
which neatly disconnects it from gravitation. What Einstein did was to discover a
tensor identity between spacetime curvature and the stress–energy tensor; a tensor that
quantifies appropriately the distribution of matter and associated energy. This identity
is Einstein’s law of gravitation and forms the keystone of GR.
\[ \Lambda^{\mu}{}_{\nu} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos\theta & \sin\theta & 0 \\ 0 & \sin\theta & -\cos\theta & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}. \]
A Lorentz transformation to a frame with velocity βc along the x-axis has the form
\[ \Lambda^{\mu}{}_{\nu} = \begin{bmatrix} \cosh u & -\sinh u & 0 & 0 \\ -\sinh u & \cosh u & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}, \]
where $\cosh u = \gamma$, $\sinh u = \beta\gamma$ and $\gamma = 1/(1 - \beta^2)^{1/2}$. Finally, consider the
transformation between frames momentarily coincident in velocity near a massive spherical
body. The first frame is a frame in radial free fall, and the second is a frame at rest:
\[ \Lambda^{\mu}{}_{\nu} = \begin{bmatrix} (1 + 2\varphi/c^2)^{1/2} & 0 & 0 & 0 \\ 0 & 1/(1 + 2\varphi/c^2)^{1/2} & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \]
where the 1-axis is radial and φ is the gravitational potential.
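The Lorentz transformation quoted above can be checked against the defining property of such transformations, namely that they leave the Minkowski metric unchanged. The following is a minimal sketch in plain Python; the function names and the value β = 0.6 are illustrative.

```python
import math

# Minkowski metric, signature (+, -, -, -).
ETA = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

def boost_x(beta):
    """Lorentz boost along x with cosh u = gamma and sinh u = beta * gamma."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    ch, sh = gamma, beta * gamma
    return [[ch, -sh, 0, 0], [-sh, ch, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def transformed_metric(lam):
    """Return eta_{ab} lam[a][m] lam[b][n], summed over a and b."""
    n = 4
    return [
        [sum(ETA[a][b] * lam[a][m] * lam[b][v] for a in range(n) for b in range(n))
         for v in range(n)]
        for m in range(n)
    ]

lam = boost_x(0.6)  # hypothetical velocity, beta = 0.6
for row in transformed_metric(lam):
    print(row)
```

Up to rounding error the printed matrix is the Minkowski metric again, reflecting the identity $\cosh^2 u - \sinh^2 u = 1$.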
Let us write:
\[ dx_\mu = g_{\mu\nu}\, dx^\nu \tag{4.5} \]
and then
\[ ds^2 = dx_\mu\, dx^\mu. \tag{4.6} \]
In other words, the $dx_\mu$ are the covector components of PP′. Thus Equation (4.5) is
the general way of generating covector components from vector components.
A two-dimensional example can be used to illustrate these results. Figure 4.1
shows a local vector PP′ with the coordinate axes inclined at an angle θ. The vector
component of PP′ on the 1-axis is obtained by projecting from P′ parallel to the
2-axis, and the 2-component by projecting parallel to the 1-axis. Then in terms of
unit vectors $\mathbf{e}_1$ and $\mathbf{e}_2$ along the axes:
\[ PP' = \mathbf{e}_1\, dx^1 + \mathbf{e}_2\, dx^2. \]
The covector components given by Equation (4.5) are, in two dimensions:
\[ dx_1 = g_{11}\, dx^1 + g_{12}\, dx^2. \]
Then using Equation (4.3) we obtain
\[ dx_1 = dx^1 + dx^2\cos\theta, \quad\text{and}\quad dx_2 = dx^2 + dx^1\cos\theta. \]
Figure 4.1 shows that these covector components $dx_1$ and $dx_2$ can be obtained by
projecting perpendicularly from P′ onto the relevant axis. How this generalizes in
Euclidean spaces of higher dimension is illustrated in Figure 4.2. The vector component
of PP′ is obtained by projecting from P′ onto the 1-direction over the surface through P′
that contains all the other local basis vectors: $dx^1$ is SP. The covector component is
simply the perpendicular projection from P′ onto the 1-direction: $dx_1$ is S′P. With
rectangular Cartesian coordinates in Euclidean space the distinction between vector and
covector components disappears, which explains why covector components are not
usually used in Newtonian mechanics. Direct visualization of vector and covector
components in spacetime meets difficulties with the time components because we can
only draw Euclidean spaces in our three-dimensional world. The covector components
of local four-vectors are given by expressions similar to Equation (4.3). Other related
quantities, the scalar products of pairs of four-vectors, are also invariant. If the lowering
property of $g_{\mu\nu}$ is applied in turn to $A^\mu$ and $B^\nu$ we obtain
\[ g_{\mu\nu}A^\mu B^\nu = A_\nu B^\nu = A^\mu B_\mu. \tag{4.8} \]
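The two-dimensional example with inclined axes can be reproduced numerically. The sketch assumes unit basis vectors, so that the metric components are $g_{11} = g_{22} = 1$ and $g_{12} = g_{21} = \cos\theta$; the function names and the sample components are illustrative.

```python
import math

def metric_oblique(theta):
    """Metric for unit basis vectors inclined at angle theta: g_ij = e_i . e_j."""
    c = math.cos(theta)
    return [[1.0, c], [c, 1.0]]

def lower(g, v):
    """Covector components v_i = g_ij v^j, as in Equation (4.5)."""
    return [sum(g[i][j] * v[j] for j in range(len(v))) for i in range(len(v))]

theta = math.radians(60)
dx_up = [2.0, 1.0]                  # hypothetical vector components dx^1, dx^2
g = metric_oblique(theta)
dx_down = lower(g, dx_up)           # covector components dx_1, dx_2
ds2 = sum(lo * up for lo, up in zip(dx_down, dx_up))   # Equation (4.6)
print(dx_down, ds2)
```

For these numbers the covector components are close to 2.5 and 2.0, and $ds^2$ is close to 7, the same value the law of cosines gives for sides 2 and 1 inclined at 60°.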
Figure 4.1. The vector and covector components of an infinitesimal vector PP ′ in a two-dimensional space.
Figure 4.2. The method for constructing (a) the vector and (b) the covector 1-component of the infinitesimal
vector PP ′ in a multi-dimensional space. The surface drawn would be a normal plane where the space is flat
and three-dimensional, but would be a hypersurface in higher-dimensional spaces.
\[ D'^{\mu}{}_{\nu\sigma} = \Lambda^{\mu}{}_{\alpha}\,\Lambda_{\nu}{}^{\beta}\,\Lambda_{\sigma}{}^{\gamma}\, D^{\alpha}{}_{\beta\gamma}. \tag{4.11} \]
We see that each vector index brings a factor $\Lambda^{*}{}_{*}$ while each covector index brings a
factor $\Lambda_{*}{}^{*}$. Valid identities between tensors only connect tensors with equal
numbers of subscripts and equal numbers of superscripts (the same rank).
There are a number of other tensor properties that are needed in Chapters 5 and 6.
A frequent tensor manipulation is the process of contraction. Consider a tensor $A^{\alpha\beta}{}_{\gamma\beta}$
where the summation is implied for all values of the repeated index β, i.e., the tensor in
full is
\[ A^{\alpha 0}{}_{\gamma 0} + A^{\alpha 1}{}_{\gamma 1} + A^{\alpha 2}{}_{\gamma 2} + A^{\alpha 3}{}_{\gamma 3}. \]
so that $A^{\alpha\beta}{}_{\gamma\beta}$ transforms as a rank 2 tensor. One more contraction gives $A^{\alpha\beta}{}_{\alpha\beta}$, which is
a scalar quantity. The scalar products met in vector analysis are familiar examples of
contraction. Other tensor products involving contraction are
\[ A^{\mu\nu}B_{\nu}, \text{ which is a vector}; \qquad C^{\mu\nu\sigma}B_{\sigma\alpha}, \text{ which is a rank 3 tensor}. \tag{4.13} \]
Not all collections of numbers or functions labeled with suffixes “Fμν ” constitute a
tensor. Whether or not “Fμν ” constitutes a tensor can be tested using the quotient
theorem. This theorem states that if the product of F μν with any arbitrary tensor is also
a tensor, then F μν is itself a tensor.
Finally we collect here some useful properties of the metric tensor. The metric
tensor is symmetric under interchange of its subscripts. An antisymmetric part’s
contribution to $ds^2$ would be
\[ (g_{\mu\nu} - g_{\nu\mu})\, dx^\mu dx^\nu/2, \]
which vanishes identically, and so it can safely be neglected. Contraction with gμν is
used to generate associated tensors. For example,
Hence the effect of $g^{\beta\alpha}$ is to raise a subscript while the effect of contracting a tensor
with $g_{\alpha\beta}$ is to lower a superscript. In the case that $g_{\alpha\beta}$ is diagonal (e.g., for the
Schwarzschild metric), with no summation implied over α in this case
\[ g^{\alpha\alpha} = 1/g_{\alpha\alpha}. \tag{4.16} \]
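For a diagonal metric Equation (4.16) amounts to taking reciprocals entry by entry. The sketch below applies this to the diagonal of the Schwarzschild metric tensor of Equation (3.50), written with $r_0 = 2GM/c^2$; the function names and the sample point are illustrative.

```python
import math

def schwarzschild_metric_diag(r, theta, r0):
    """Diagonal entries of g_{mu nu} from Equation (3.50), with r0 = 2GM/c**2."""
    f = 1.0 - r0 / r
    return [f, -1.0 / f, -r**2, -(r * math.sin(theta)) ** 2]

def inverse_diag(g_diag):
    """Equation (4.16): g^{aa} = 1 / g_{aa}, with no summation over a."""
    return [1.0 / g for g in g_diag]

# Hypothetical point outside the Schwarzschild radius, in units where r0 = 1.
g_lo = schwarzschild_metric_diag(r=4.0, theta=math.pi / 3, r0=1.0)
g_up = inverse_diag(g_lo)
print([lo * up for lo, up in zip(g_lo, g_up)])
```

Each product of a lower and an upper diagonal entry is 1 to rounding error, which is the diagonal of the Kronecker delta.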
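For completeness the transformation of the Kronecker delta $\delta^{\nu}{}_{\mu}$ needs to be
discussed. Under the general transformation
\[ (\delta')^{\alpha}{}_{\beta} = \frac{\partial x'^{\alpha}}{\partial x^{\nu}}\frac{\partial x^{\mu}}{\partial x'^{\beta}}\,\delta^{\nu}{}_{\mu} = \frac{\partial x'^{\alpha}}{\partial x^{\mu}}\frac{\partial x^{\mu}}{\partial x'^{\beta}} = \delta^{\alpha}{}_{\beta}, \tag{4.17} \]
which demonstrates that this tensor is the same in all frames.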
4.4 Exercises
1. With the geometry of Figure 4.1 show that the distance PP′ is given by
\[ ds^2 = (dx^1)^2 + (dx^2)^2 + 2\, dx^1 dx^2 \cos\theta. \]
Further Reading
Laugwitz D 1965 Differential and Riemannian Geometry (New York:
Academic). This provides a thorough review of these subjects.
Chapter 5
Einstein’s Theory I
The theme of this chapter is to present physical laws in a form valid under general
transformations to accelerated frames. According to the principle of equivalence the
physical laws in any frame in free fall are consistent with SR. Then in Chapter 4 we
learnt that physical laws expressed as tensor equations automatically retain their
form under general transformations.
These two ideas are now fused using the principle of generalized covariance, first
postulated by Einstein. This states that physical laws are expressible as tensor
equations that reduce to laws consistent with SR in a frame in free fall. Hence a law
valid in SR and expressed in tensors will apply in accelerating frames too. Most laws
are already expressed in terms of vectors, which are tensors, so we are part way there.
However the spacetime derivatives which appear in dynamical relations (e.g., the rate
of change of momentum) are not tensors. These derivatives must be converted to
covariant derivatives that are tensors and which reduce to the usual spacetime
derivative in a frame in free fall.
Covariant derivatives are introduced here and are found to depend on entities
called metric connections which quantify how local vectors change when they are
transported through curved spacetime. These metric connections are then evaluated
in terms of the derivatives of the metric coefficients gμν . Newton’s second law of
motion is the fundamental dynamical equation, and we re-write it in a form valid in
accelerating frames. In the absence of non-gravitational forces the second law
describes a body in free fall and we show that such a path is a geodesic in spacetime.
A procedure for transforming to a frame in free fall is then described. Geodesics are
stationary paths in spacetime with an integral equation that will be used later when
calculating the trajectories of planets and light in curved spacetime. Finally the
correspondence between metric connections and the inertial and gravitational forces
of classical Newtonian mechanics is explored.
The new quantity $\Gamma^{\mu}_{\sigma\rho}$ introduced here is called the metric connection, and is clearly a
function of the position in spacetime. All components have to be included to
take account of the varieties of possible spacetime curvature: for example, the
1-component can change during a displacement in the 2-direction. Thus the change
in $q^\mu$ due to the physical interaction is
\[ dq^\mu + \Gamma^{\mu}_{\sigma\rho}\, q^\sigma dx^\rho. \tag{5.2} \]
The physical change in $q^\mu$ per unit path length at a point in spacetime is obtained by
dividing this expression by ds and taking the limit as ds → 0. This gives the
covariant derivative, a valid tensor (in this case a vector)
Figure 5.1. A path across a curved surface; eμ is a local basis vector drawn at two points along this path.
\[ \frac{Dq^\mu}{Ds} = \frac{dq^\mu}{ds} + \Gamma^{\mu}_{\sigma\rho}\, q^\sigma\left(\frac{dx^\rho}{ds}\right). \tag{5.3} \]
$Dq^\mu/Ds$ is the μ-component of the covariant derivative of the vector with components
$q^\mu$. It is quite easy to see that the connections are not themselves tensors. In a
frame in free fall the covariant derivative is the same as the standard derivative, so
that the metric connection vanishes. A tensor whose components all vanish in one frame
vanishes in every frame, whereas in any other frame the connection is non-zero,
so it cannot be a tensor.
Expressions for covariant derivatives of tensors of any rank can be obtained, all
of which are also tensors. For covector components (see problem 5.1)
\[ \frac{Dq_\mu}{Ds} = \frac{dq_\mu}{ds} - \Gamma^{\nu}_{\mu\rho}\, q_\nu \frac{dx^\rho}{ds}. \]
The covariant derivative of a rank-2 tensor, $A_{\mu\nu}$, is evaluated by considering the
product with vectors $A_{\mu\nu}q^\mu q^\nu$. This product is invariant under parallel transport in
free fall so that
\[ \delta\left[A_{\mu\nu}\, q^\mu q^\nu\right] = 0. \]
Expanding this using Equation (5.1) gives
\[ \frac{DA_{\mu\nu}}{Ds} = \frac{dA_{\mu\nu}}{ds} - \Gamma^{\tau}_{\mu\rho}A_{\tau\nu}\frac{dx^\rho}{ds} - \Gamma^{\tau}_{\nu\rho}A_{\mu\tau}\frac{dx^\rho}{ds} \]
which, being a tensor equation, must hold in all frames. Then using Equation (5.4)
we have in general:
\[ \frac{\partial g_{\mu\nu}}{\partial x^\rho} = \Gamma^{\tau}_{\mu\rho}\, g_{\tau\nu} + \Gamma^{\tau}_{\nu\rho}\, g_{\mu\tau}. \]
From here onward we write Γ τνρ as Γ τνρ for compactness. Now define
\[ \Gamma_{\mu\nu\rho} = g_{\mu\tau}\,\Gamma^{\tau}_{\nu\rho}, \tag{5.5} \]
The reader can check Equation (5.7) by substituting for $\Gamma_{\nu\mu\rho}$ and $\Gamma_{\mu\nu\rho}$ in Equation
(5.6).
general path parameter λ, which is linearly related to τ when the path is non light-
like:
\[ \frac{Dq^\mu}{D\lambda} = \frac{dq^\mu}{d\lambda} + \Gamma^{\mu}_{\nu\rho}\,\frac{dx^\nu}{d\lambda}\, q^\rho. \tag{5.9} \]
For covariant components
\[ \frac{Dq_\mu}{D\lambda} = \frac{dq_\mu}{d\lambda} - \Gamma^{\nu}_{\mu\rho}\,\frac{dx^\rho}{d\lambda}\, q_\nu. \]
Multiplying Equation (5.3) by $\partial s/\partial x^\nu$ gives another form:
\[ \frac{Dq^\mu}{Dx^\nu} = \frac{\partial q^\mu}{\partial x^\nu} + \Gamma^{\mu}_{\nu\rho}\, q^\rho. \tag{5.10} \]
This can be written more compactly by using the subscript comma notation for
derivatives, and introducing the subscript semicolon notation for covariant
derivatives:
\[ q^{\mu}{}_{;\nu} = \frac{Dq^\mu}{Dx^\nu}. \]
Then Equation (5.10) becomes
\[ q^{\mu}{}_{;\nu} = q^{\mu}{}_{,\nu} + \Gamma^{\mu}_{\nu\rho}\, q^\rho. \]
$$F^{\mu} = \frac{dp^{\mu}}{d\tau}$$
where F μ is the four-vector force. Although the right-hand side is not a valid tensor,
we have just learnt how to write a covariant derivative that is a valid tensor.
Replacing the derivative on the right-hand side of this equation by a covariant
derivative yields
$$F^{\mu} = \frac{Dp^{\mu}}{D\tau}. \tag{5.13}$$
This is a valid tensor equation and reduces to SR form in a frame in free fall, so that
it satisfies the principle of generalized covariance. Any other physical law consistent
with SR in a frame in free fall can be converted to a physical law valid in accelerating
frames by replacing spacetime derivatives by the equivalent covariant derivatives.
Clearly we have here a very powerful tool.
Using the subscript comma and subscript semicolon notation a standard spacetime derivative is written $p^{\mu}_{,\nu}$, while a covariant derivative is written $p^{\mu}_{;\nu}$. Hence the procedure for converting an equation so that it becomes valid in accelerating frames can be expressed pithily as: “replace commas by semicolons.”
where λ is the path parameter discussed in Section 5.3. Notice that the geodesic
Equation (5.15) does not depend on the mass of the test body. This shows that the
weak equivalence principle is already built into the very structure of Riemann
spacetime.
In Section 3.3 an alternative definition given for a geodesic was of a curve across
space with no component of curvature in that space. This property of a geodesic is
inherent in the geodesic equation, as we shall now show. The differential dx μ/ds is a
valid vector so we can write
$$\frac{Dx^{\mu}}{Ds} = \frac{dx^{\mu}}{ds}.$$
Then
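$$\frac{D^{2}x^{\mu}}{Ds^{2}} = \frac{D}{Ds}\left(\frac{dx^{\mu}}{ds}\right) = \frac{d^{2}x^{\mu}}{ds^{2}} + \Gamma^{\mu}_{\nu\rho}\, \frac{dx^{\nu}}{ds}\, \frac{dx^{\rho}}{ds},$$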
so that
$$\frac{D^{2}x^{\mu}}{Ds^{2}} = 0 \tag{5.17}$$
is the compact form of the geodesic equation. In geometrical terms the Dx μ/Ds are
gradient components and the D2x μ/Ds2 the curvature components of the path.
Therefore the geodesic equation implies that a geodesic has no component of
curvature in spacetime.
$$S = \int \left(g_{\mu\nu}\, dx^{\mu}\, dx^{\nu}\right)^{1/2}.$$
Figure 5.2. (a) A straight line path AC in space and another longer path ABC; (b) a straight line path in spacetime AC, and a displaced path ABC. [Panel (b) axes: Space x, Time t.]
This integrand is unity all along the path so that if S is stationary then so too is a
simpler integral
δI = 0 (5.18)
where
$$I = \int \left(g_{\mu\nu}\, \frac{dx^{\mu}}{ds}\, \frac{dx^{\nu}}{ds}\right) ds. \tag{5.19}$$
Equations (5.18) and (5.19) set a variational problem that is solved in the standard way in Appendix B. There it is shown that these equations are the integral form of the geodesic equation, entirely equivalent to Equation (5.15).
It is important to appreciate that a body in free fall from a fixed starting event will
follow different geodesics if it is given different starting velocities. These geodesics
will lie inside the forward light cone through the starting event; they are time-like
with ∫ ds 2 > 0. When the test body is a photon the path integral ∫ ds 2 = 0, which
defines a null geodesic. Finally the space-like geodesics that have ∫ ds 2 < 0 would
correspond to motion with velocity greater than c. Neither material particles nor
light can follow space-like geodesics; however these geodesics are useful in setting up
coordinate frames that span spacetime.
5.8 Exercises
1. Show that for covector components the second term of Equation (5.10)
becomes negative:
$$\frac{Dp_{\mu}}{Dx^{\nu}} = \frac{\partial p_{\mu}}{\partial x^{\nu}} - \Gamma^{\rho}_{\nu\mu}\, p_{\rho}$$
(Take a scalar quantity ϕ = pα q α and calculate its derivative).
2. Show that a geodesic that is time-/space-/light-like at a given point remains
time-/space-/light-like on its journey through spacetime. Hint: consider
parallel-transporting a vector Aμ and show that AμAμ is constant.
3. Take a space-like geodesic in Minkowski space and show that it is neither
maximal nor minimal.
4. The metric equation for free fall is
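$$ds^{2} = c^{2}\, dt^{2} - dr^{2} - r^{2}\, d\theta^{2}.$$
Deduce the radial and angular components of the geodesic equation:
$$\frac{d^{2}r}{d\tau^{2}} - r\left(\frac{d\theta}{d\tau}\right)^{2} = 0$$
$$\frac{d^{2}\theta}{d\tau^{2}} + \frac{2}{r}\, \frac{d\theta}{d\tau}\, \frac{dr}{d\tau} = 0.$$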
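As a numerical sanity check on the two equations quoted in exercise 4 (not a derivation of them), the sketch below integrates them with a hand-written fourth-order Runge-Kutta step and confirms that the resulting path is a straight line in Cartesian coordinates, as expected for flat space. The starting values, step size, and function names are illustrative assumptions.

```python
import math

def derivs(state):
    """Right-hand sides of the polar geodesic equations from exercise 4.

    state = (r, theta, dr/dtau, dtheta/dtau)
    """
    r, theta, vr, vtheta = state
    return (vr, vtheta, r * vtheta**2, -2.0 * vr * vtheta / r)

def rk4_step(state, h):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    def shifted(s, k, f):
        return tuple(si + f * ki for si, ki in zip(s, k))

    k1 = derivs(state)
    k2 = derivs(shifted(state, k1, h / 2))
    k3 = derivs(shifted(state, k2, h / 2))
    k4 = derivs(shifted(state, k3, h))
    return tuple(
        s + h * (a + 2 * b + 2 * c + d) / 6
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )

# Assumed start: the point (x, y) = (1, 0), moving in the +y direction
# with unit speed, i.e. r = 1, theta = 0, dr/dtau = 0, dtheta/dtau = 1.
state = (1.0, 0.0, 0.0, 1.0)
h, steps = 1.0e-3, 2000          # integrate up to tau = 2
for _ in range(steps):
    state = rk4_step(state, h)

r, theta = state[0], state[1]
x, y = r * math.cos(theta), r * math.sin(theta)
print(f"x = {x:.9f}, y = {y:.9f}")   # a straight line would give x = 1, y = 2
```

The curved-looking terms in r and θ are purely an artefact of the polar coordinates; the path itself stays on the line x = 1.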
Further Reading
Schutz B 2008 A First Course in General Relativity (2nd ed.; Cambridge:
Cambridge Univ. Press). This is a longer text by an expert and is useful in the
areas covered by this and the following chapter.
Carlip S 2019 General Relativity: A Concise Introduction. This book is pitched at
more advanced students, despite the title. It brings out interesting points and
is modern in its approach.
Misner C W, Thorne K S and Wheeler J A 1973 Gravitation (W H Freeman and company, San Francisco). A 1000 page long groundbreaking book that established the notation widely used thereafter. It is worth dipping into for
enlightenment on material in this or the following chapter. One author, Kip
Thorne, shared the Nobel Prize in Physics in 2017 for the discovery of
gravitational waves.
Chapter 6
Einstein’s Theory II
The general technique for converting an equation valid for SR into one valid in all
frames cannot be applied to gravitation because we lack the starting point—a law of
gravitation consistent with SR. Newton’s law of gravitation implies that gravitational
effects are transmitted instantaneously to all parts of the universe, faster than the speed
of light. Einstein perceived that there must be a direct link between the distribution of
mass/energy and the curvature of spacetime, expressible in tensor form. Einstein also
recognized that the stress–energy tensor provided the appropriate tensor description
for the distribution and flow of energy in spacetime. Later he identified a curvature
tensor (the Einstein tensor) having formal properties that match those of the stress–
energy tensor. Einstein simply equated these two tensors, making “curvature” at a
given point in spacetime proportional to “energy density” at the same point.
In Section 6.1 the Riemann curvature tensor is introduced; this is a tensor that
provides a complete description of curvature in multi-dimensional spaces. In Section
6.2 attention is focused on the stress–energy tensor Tμν and its properties. The
conjecture that the stress–energy tensor is proportional to some curvature tensor
leads to the selection for this role of a unique contraction of the Riemann tensor
(Section 6.3) called the Einstein tensor Gμν .
Then the metric tensor at x + Δx and its first derivative are given by expansions
around gμν(x ) and gμν,ρ(x ):
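$$g_{\mu\nu}(x + \Delta x) = \eta_{\mu\nu} + \frac{1}{2}\, g_{\mu\nu,\rho\sigma}\, \Delta x^{\rho} \Delta x^{\sigma}$$
and
$$g_{\mu\nu,\rho}(x + \Delta x) = g_{\mu\nu,\rho\sigma}\, \Delta x^{\sigma}.$$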
The change in gμν depends only on the second derivatives gμν,ρσ at x, and so these
derivatives must embody the curvature information.
In Section 5.7 it was seen that the metric connection and hence the first derivatives of the metric components correspond to the gravitational acceleration. A tidal acceleration is the difference between gravitational acceleration at different locations and so it involves the second derivatives of the metric components. There is therefore a fundamental connection between the curvature of spacetime and classical tidal forces.
Figure 6.1. A closed path in curved spacetime around which a vector is parallel transported.
In writing the last line use has been made of identities such as
$$da^{\beta}\, \Gamma^{\alpha}_{\beta\nu,\gamma} = da^{\delta}\, \Gamma^{\alpha}_{\delta\nu,\gamma}$$
which result from changing a repeated suffix, in this case β → δ . The expression for
dq α can be written compactly as
$$dq^{\alpha} = da^{\delta}\, db^{\gamma}\, q^{\beta}\, R^{\alpha}_{\beta\gamma\delta} \tag{6.2}$$
One useful result that can be proved is that Equation (6.4) can be written
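$$2R_{\alpha\beta\gamma\delta} = g_{\alpha\delta,\beta\gamma} - g_{\beta\delta,\alpha\gamma} + g_{\beta\gamma,\alpha\delta} - g_{\alpha\gamma,\beta\delta} \tag{6.5}$$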
Figure 6.2. Summary of the effects of both spacetime curvature and coordinate transformations on the metric
connection and the Riemann curvature tensor.
second the change would be linear in γ. The behavior of ρ is, however, exactly that of
the time-time component of a second-rank tensor T μν :
$$T^{\mu\nu} = \rho_{0}\, v^{\mu} v^{\nu}, \tag{6.10}$$
where v μ is the four-vector velocity of the cloud. In the frame S only the time-time
component of this tensor is non-zero; it is (T 00 )S = ρ0 c 2 . Under the transformation
from S to S ′
$$(T^{00})_{S} \Rightarrow (T^{00})_{S'} = \gamma^{2}\, (T^{00})_{S}.$$
The tensor T μν is called the stress–energy tensor. A definition of its components that
applies equally to dust clouds or more complex systems is the following: T μν is the flow
of the μ component of the four-momentum along the ν direction.
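The γ² scaling of the time-time component can be checked numerically. The sketch below builds the dust tensor of Equation (6.10) in the rest frame, applies a Lorentz boost along x to both indices, and compares the boosted T⁰⁰ with γ² times the rest-frame value. The rest density, the boost speed, and all function names are illustrative assumptions, not values from the text.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def boost_x(beta):
    """Lorentz boost matrix along x for speed beta*c, coordinates (ct, x, y, z)."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return [
        [gamma, -gamma * beta, 0.0, 0.0],
        [-gamma * beta, gamma, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]

def dust_tensor(rho0, v):
    """T^{mu nu} = rho0 * v^mu * v^nu for a four-velocity v, as in Equation (6.10)."""
    return [[rho0 * v[m] * v[n] for n in range(4)] for m in range(4)]

def transform(L, T):
    """Transform a rank-2 contravariant tensor: T'^{mu nu} = L^mu_a L^nu_b T^{ab}."""
    return [
        [
            sum(L[m][a] * L[n][b] * T[a][b] for a in range(4) for b in range(4))
            for n in range(4)
        ]
        for m in range(4)
    ]

rho0 = 1.0e-20   # assumed rest density in kg/m^3, purely illustrative
beta = 0.6       # assumed boost speed as a fraction of c
gamma = 1.0 / math.sqrt(1.0 - beta**2)

# In the rest frame S the four-velocity is (c, 0, 0, 0), so only T^00 is non-zero.
T_rest = dust_tensor(rho0, [C, 0.0, 0.0, 0.0])
T_moving = transform(boost_x(beta), T_rest)

ratio = T_moving[0][0] / T_rest[0][0]
print(ratio, gamma**2)  # the two numbers should agree
```

One factor of γ comes from the increase in energy per particle and the other from the Lorentz contraction of the volume, which is why a single four-vector cannot describe the density.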
Herein lies a paradox for the classical view. It is well known that the pressure inside a
star resists gravitational collapse, and yet it emerges that pressure can, by virtue of
contributing to a component of the stress–energy tensor, hasten gravitational collapse.
This is indeed the case, and pressure contributes to the contraction of sufficiently
massive stars to black holes.
Conservation laws for energy and momentum take particularly simple forms
when expressed in terms of the stress–energy tensor. The conservation of energy
offers a good example of the way this happens. Figure 6.3 shows a cube of edge
length l with its edges parallel to the x-, y- and z-axes in a medium whose stress–
energy tensor is T μν . The rate of change in the energy content of the box is
$$l^{3}\, \frac{\partial T^{00}}{\partial t}.$$
This change is produced by the net energy inflow through the six faces of the cube.
The energy flows through the faces x = x and x = x + l are, respectively,
$$\begin{aligned} &l^{2} c\, T^{01}(x) && \text{inward per unit time} \\ &l^{2} c\, T^{01}(x + l) && \text{outward per unit time.} \end{aligned} \tag{6.11}$$
The net flow inward from these two faces is
$$l^{2} c \left[T^{01}(x) - T^{01}(x + l)\right] = -l^{3} c\, \frac{\partial T^{01}}{\partial x}.$$
There are similar contributions from the other pairs of faces:
$$-l^{3} c \left(\frac{\partial T^{02}}{\partial y}\right) \quad \text{and} \quad -l^{3} c \left(\frac{\partial T^{03}}{\partial z}\right).$$
Summing the three contributions gives the total inflow, and so we have
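$$l^{3}\, \frac{\partial T^{00}}{\partial t} = -l^{3} c \left(\frac{\partial T^{01}}{\partial x} + \frac{\partial T^{02}}{\partial y} + \frac{\partial T^{03}}{\partial z}\right),$$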
which can be rearranged to become
$$\frac{\partial T^{00}}{\partial x^{0}} + \frac{\partial T^{0i}}{\partial x^{i}} = 0,$$
that is,
$$\frac{\partial T^{0\alpha}}{\partial x^{\alpha}} = 0.$$
This last result can be expressed in a more compact form using the subscript comma
notation for spacetime derivatives:
$$T^{0\alpha}_{,\alpha} = 0. \tag{6.12}$$
This type of derivative is known as the divergence of $T^{0\alpha}$: the derivative is taken with respect to the same spacetime component as appears in the tensor (here α), and the expression on the left-hand side is summed over all values of α. A parallel procedure applied for the conservation of linear momentum yields the three equations
$$T^{i\alpha}_{,\alpha} = 0 \quad \text{for } i = 1, 2, \text{ or } 3. \tag{6.13}$$
Finally the conservation laws of Equations (6.12) and (6.13) can be combined into a
single equation:
$$T^{\beta\alpha}_{,\alpha} = 0 \quad \text{for } \beta = 0, 1, 2, \text{ or } 3. \tag{6.14}$$
This equation summarizes the conservation laws for four-momentum in SR. Put into
words they require that the divergences of the stress–energy tensor vanish every-
where. This result from SR can be converted to a form of general validity in curved
spacetime if the simple derivatives are replaced by covariant derivatives. Thus the
laws of conservation of energy and momentum in curved spacetime take the form
$$T^{\beta\alpha}_{;\alpha} = 0, \tag{6.15}$$
where the subscript semicolon indicates a covariant derivative. This can be written in
more detail as
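$$\frac{\partial T^{\beta\alpha}}{\partial x^{\alpha}} + \Gamma^{\alpha}_{\mu\alpha}\, T^{\beta\mu} + \Gamma^{\beta}_{\mu\alpha}\, T^{\mu\alpha} = 0.$$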
The properties of the energy-momentum tensor are now summarized:
• it vanishes in the absence of matter;
• it is of second rank;
• its divergences all vanish;
• it is symmetric, i.e., T μν = T νμ.
Einstein identified the stress–energy tensor as the source of spacetime curvature and
suggested the simplest possible relationship between it and the curvature:
$$K T_{\mu\nu} = G_{\mu\nu},$$
where Gμν is the tensor describing spacetime curvature and K is some scalar constant
whose magnitude determines how effective the energy density is in distorting
spacetime.
Rβδ and gβδ are both symmetric so that the Einstein tensor is also symmetric. Like its
progenitors, the Riemann and Ricci tensors, the Einstein tensor vanishes in the
absence of any material to warp spacetime. Referring to the summary of the
properties of the stress–energy tensor given at the head of the previous section, we
see that the Einstein tensor matches these precisely. With this tensor we can now
rewrite Einstein’s Ansatz as
$$G_{\alpha\beta} = \frac{8\pi G}{c^{4}}\, T_{\alpha\beta}. \tag{6.19}$$
The value of the constant 8πG /c 4 is fixed by the requirement that, in the limit of
weak slowly varying gravitational fields, the Einstein equation should reduce to
Newton’s law of gravitation. This will be shown explicitly in Section 6.4. Note that
the G appearing on the right-hand side of Equation (6.19) is the usual gravitational
constant and not some contraction of a tensor. As with all such tensor equations the
equality will remain true if we simultaneously raise one or more indices on both sides
of the equation.
When Einstein’s equation was applied to calculate the behavior of the universe on
the large scale it was apparent that the universe could contract or expand indefinitely
depending on the starting condition and the amount of matter present. However at
that time it was known that some stars moved toward the Earth and some away,
with no sign of universal contraction or expansion. Einstein presumed that the
universe was static and was therefore in a dilemma.
Einstein found that by adding a constant term to his equation he could obtain a static
universe
$$G_{\alpha\beta} - \Lambda g_{\alpha\beta} = \frac{8\pi G}{c^{4}}\, T_{\alpha\beta} \tag{6.20}$$
where Λ is a universal constant called the cosmological constant. As we can see, it
would lead to curvature of spacetime in the absence of any matter and radiation
(Tαβ = 0). It is therefore possible by choosing Λ appropriately to obtain a static
universe. By 1931 Hubble and Humason had convincingly demonstrated that the
universe is currently expanding and Einstein was then happy to disown the cosmo-
logical constant. Ironically, modern cosmological observations show that something
very like a cosmological constant is an essential feature of the universe.
In addition the quantum field theory of fundamental particles has taught us that
the vacuum is not a featureless void, but is in a continuous ferment involving
particle–antiparticle production and absorption, making a non-zero cosmological
constant altogether reasonable. One point worth considering is that the cosmological constant is like a constant of integration: it is there, and the only question is how large it is. Finally the contribution of the cosmological constant Λgαβ
defines its equation of state. Going to the frame in free fall it only has diagonal terms
with the time or energy component and the spatial or momentum component being
equal and of opposite sign. Thus the energy density due to the cosmological constant
is positive, but totally unlike matter or radiation the pressure is negative. The
argument is refined in Appendix F. This unexpected behavior has fundamental
implications for the evolution of the universe.
where all the components of the tensor h are much less than unity. If we choose
Cartesian coordinates η00 = +1, η11 = η22 = η33 = −1. In the same limit velocities are
small (≪c ) and the spatial components of momentum are much less than energies.
The dominant term in the stress–energy tensor is therefore the energy density T00.
Thus the important part of Einstein’s equation, Equation (6.19), in the classical non-
relativistic limit is
$$R_{00} = \frac{8\pi G}{c^{4}} \left(T_{00} - \frac{T_{00}\, g_{00}}{2}\right) - \Lambda g_{00}. \tag{6.21}$$
In evaluating this expression we first note that the metric connections are linear in h;
hence to a first approximation in h the form Equation (6.4) or (6.5) can be used for
the Riemann curvature tensor rather than Equation (6.3). To a first order
approximation in h, in the classical slow-moving limit where the time derivative is
negligible compared to the spatial derivatives the Ricci tensor reduces to
$$R_{00} = \frac{h_{00,ii}}{2},$$
summing over only the spatial components i. The result of the gravitational redshift
experiment determines h00: from Equation 2.12 for instance,
$$h_{00} = -\frac{2GM}{rc^{2}} = \frac{2\phi}{c^{2}},$$
where φ is the gravitational potential. If this assignment is carried through here, then
$$R_{00} = \frac{\phi_{,ii}}{c^{2}} = \frac{\nabla^{2}\phi}{c^{2}}.$$
Suppose that the rest density of matter is ρ; then
T00 = ρc 2 .
It is a sufficient approximation to take g00 = 1 on the right-hand side of Equation
(6.21), so that
$$T_{00} - \frac{T_{00}\, g_{00}}{2} = \frac{\rho c^{2}}{2}.$$
Finally, substituting on both sides in Equation (6.21) gives
$$\nabla^{2}\phi = 4\pi G\rho - \Lambda c^{2} = 4\pi G\left[\rho - \rho_{\Lambda}\right],$$
where ρΛ = Λc 2 /[4πG ]. Ignoring Λ for the moment, this reproduces Equation (6.9),
the differential form of Newton’s law of gravitation. The comparison verifies that
the constant appearing in Einstein’s equation has magnitude 8πG /c 4; any other
choice would spoil the agreement with Newton’s law in the classical non-relativistic
limit. Λ is hard to interpret in Newtonian terms: it remains constant as space expands, while the densities of matter and radiation fall; its force is repulsive, but because the source is everywhere the net force on any mass is zero; finally we shall find that the cosmological constant expands space itself, which has no Newtonian equivalent. If you could confine the cosmological constant to a sphere of
radius r centered on a mass M, then the force on a unit mass at the boundary would
be
$$F = -\frac{GM}{r^{2}} + \frac{c^{2}\Lambda r}{3}. \tag{6.22}$$
Observed from our local perspective the impact of Λ is not felt in any direct physical
way. However about 5 Gyrs ago the effect of the repulsion of Λ had already grown
with the universe to the point of balancing the attraction of matter. From that
moment onward, the expansion of the universe accelerated, so that eventually all
cosmic structures will be pulled apart.
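To get a feel for the sizes involved in Equation (6.22), the sketch below evaluates the two terms for the Sun at the Earth's distance and finds the radius at which they cancel for a galaxy-sized mass. The value of Λ (about 1.1 × 10⁻⁵² m⁻²), the galaxy mass of 10¹² solar masses, and the function names are assumptions made for illustration; they are not taken from the text.

```python
G = 6.674e-11            # gravitational constant, m^3 kg^-1 s^-2
C = 299_792_458.0        # speed of light, m/s
M_SUN = 1.989e30         # solar mass, kg
AU = 1.496e11            # Earth-Sun distance, m
MPC = 3.0857e22          # one megaparsec, m
LAMBDA = 1.1e-52         # assumed cosmological constant, m^-2

def force_per_unit_mass(r, mass):
    """Equation (6.22): Newtonian attraction plus the repulsive Lambda term."""
    return -G * mass / r**2 + C**2 * LAMBDA * r / 3.0

def balance_radius(mass):
    """Radius where the two terms of Equation (6.22) cancel: r^3 = 3GM / (c^2 Lambda)."""
    return (3.0 * G * mass / (C**2 * LAMBDA)) ** (1.0 / 3.0)

# Ratio of the Lambda term to the Newtonian term at 1 AU from the Sun.
local_ratio = (C**2 * LAMBDA * AU / 3.0) / (G * M_SUN / AU**2)

# Balance radius for an assumed galaxy-scale mass.
galaxy_mass = 1.0e12 * M_SUN
r_eq = balance_radius(galaxy_mass)

print(f"Lambda/Newton at 1 AU: {local_ratio:.1e}")
print(f"balance radius: {r_eq / MPC:.2f} Mpc")
```

Under these assumed numbers the Λ term is utterly negligible inside the solar system and only becomes comparable to the attraction at distances of roughly a megaparsec from a large galaxy.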
At this point the equivalences between classical non-relativistic and general
relativistic quantities are collected and reviewed. Making use of the result of the
gravitational redshift experiment we have shown that
$$g_{00} = 1 + h_{00} = 1 + 2\phi/c^{2}$$
so that the metric coefficients replace the classical gravitational potential. However
gμν (and hμν ) have six independent components compared with a single classical
potential. In the Newtonian limit, dx i /dτ ≪ dx 0 /dτ and τ ≈ t , so that the geodesic
Equation (5.15) reduces to Equation (5.20)
$$\frac{d^{2}x^{i}}{dt^{2}} = -c^{2}\, \Gamma^{i}_{00},$$
where the Γ i00 are the important metric connections in the Newtonian limit. To first
order in h,
$$\Gamma^{i}_{00} = -\frac{1}{2}\, \eta^{ii}\, \frac{\partial h_{00}}{\partial x^{i}} = \frac{\partial\phi/\partial x^{i}}{c^{2}}$$
(with no summation over i). We can reiterate the conclusion drawn in Section 5.7
that the metric connections replace both the inertial and the gravitational forces of
classical non-relativistic mechanics. Finally, Equation (6.4) permits an interpretation
of the Riemann curvature tensor in classical terms. In the Newtonian limit all the
time derivatives are small compared with spatial derivatives, and so Equation (6.4)
reduces to
$$R_{0k0j} = -\frac{1}{2}\, g_{00,kj} = -\frac{\partial^{2}\phi/\partial x^{k}\, \partial x^{j}}{c^{2}},$$
which is a component of the tidal force. Table 6.1 summarizes the correspondence
between general relativistic and Newtonian kinematic quantities.
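Table 6.1.
General relativity                                    Newtonian
Metric tensor: $g_{\mu\nu} = 1 + h_{\mu\nu}$                   Gravitational potential $\phi/c^{2}$: $1 + 2\phi/c^{2}$
$g_{\mu\nu,\alpha}$, $\Gamma^{\mu}_{\nu\alpha}$                          Gravitational force$/c^{2}$: $(\partial\phi/\partial x^{\alpha})/c^{2}$
$g_{\mu\nu,\alpha\beta}$, $R^{\mu}_{\nu\alpha\beta}$                       Tidal force$/c^{2}$: $(\partial^{2}\phi/\partial x^{\alpha}\, \partial x^{\beta})/c^{2}$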
6.5 Exercises
1. Obtain the expressions given for Γ100 , Γ10 0 0
, Γ10 and Γ111 for the Schwarzschild
metric. Then calculate Γ100, 1, Γ101, 1, Γ101, 0 . Finally, calculate R 010
1
.
2. Starting from the force given by Equation (6.22) show that the corresponding
gravitational potential satisfies
∇2 φ = 4πGρ − Λc 2 .
3. Show that the Riemann curvature tensor vanishes for the metric equation
ds 2 = c 2 dt 2 − dr 2 − r 2 dθ 2 .
Chapter 7
Tests of General Relativity
Einstein’s theory passes the zeroth test because, as we saw in the last chapter, it
reproduces Newton’s laws in the limit of low velocity and weak gravitational fields.
Developments in technology opened the way to very precise tests of GR: making use of
space probes, satellites, atomic clocks, huge telescopes, and laser ranging to the Moon
and radar ranging to other planets. This hardware is supported by onboard computers
on spacecraft, and by high-rate data links to terabyte data storage and processor farms
on Earth.
In this chapter the measurements made within the solar system to test GR are
described. The earlier tests used the advance of the perihelion of Mercury, the
deflection of radiation by the Sun, and the time delay of radar signals passing by the
Sun. Together with the gravitational redshift measurements described in Chapter 2,
these are the successful classical tests of GR. More recent tests measured the frame-
dragging and geodetic precession that massive bodies, in this case the Earth, produce
according to Einstein’s theory. The observation of gravitational lensing by galaxies,
another effect predicted by GR, is also discussed here. All these effects involve
velocities much less than c and modest gravitational potentials. The physical effects
in more extreme conditions predicted by GR, involving black holes and gravita-
tional radiation, are covered in the two following chapters. Gravitational lensing is
now used to infer the mass distribution of galaxies and galaxy clusters: the method is
described in Chapter 17.
532 arcsec per century. Observation shows, however, that the precession is 43.11
arcsec larger with an error of only 0.45 arcsec, a discrepancy that was recognized as
long ago as 1859 by Leverrier.
After three centuries of telescopic measurements, the more precise radar echo
detection is now used to directly determine the distance to Mercury and other
planets. Figure 7.1 illustrates techniques using one of the 64 m antennae of the Deep
Space Network (DSN). A radar carrier pulse of 1 ms duration (300 km long) is phase
modulated with a random 255 bit code. When the pulse echo E (t + t0 ) returns it is
cross-correlated with the pattern transmitted T (t ) by forming the sum of the
products S (t0 ) = ∑t [T (t )E (t + t0 )]. The delay t0 is only known approximately,
and so the cross-correlation is repeated for 3 μs steps in t0 (every 1 km). If the echo
and the pulse pattern match, then the correlator output is large; otherwise, even for a
3 μs offset, the output is small. In practice the output rises sharply when the echo
returns from the nearest point on the planet but dies away slowly as echoes arrive
from surrounding regions of the planet’s surface. The delay until the initial sharp rise
in output from the correlator gives the distance to the planet. Corrugations of terrain
blur the precision to ± 1 km.
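The cross-correlation step can be illustrated with a toy simulation. In the sketch below the phase code is modelled as 255 random chips of value ±1, the echo is a delayed copy buried in Gaussian receiver noise, and the delay is recovered as the lag that maximizes S(t0) = Σ T(t)E(t + t0). The noise level, record length, true delay, and function names are assumptions chosen for illustration; each lag step stands in for one step in t0.

```python
import random

def make_code(n_bits, rng):
    """Random binary code, represented as chips of +1 or -1."""
    return [rng.choice((-1, 1)) for _ in range(n_bits)]

def correlate(code, echo, lag):
    """S(lag) = sum over t of T(t) * E(t + lag)."""
    return sum(code[t] * echo[t + lag] for t in range(len(code)))

def find_delay(code, echo):
    """Scan all admissible lags and return the one with the largest correlation."""
    max_lag = len(echo) - len(code)
    scores = [correlate(code, echo, lag) for lag in range(max_lag + 1)]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores

rng = random.Random(1)           # fixed seed so the run is repeatable
code = make_code(255, rng)

true_lag = 137                   # assumed true delay, in lag steps
record_len = 600                 # assumed length of the received record
echo = [rng.gauss(0.0, 0.5) for _ in range(record_len)]   # receiver noise
for t, chip in enumerate(code):
    echo[true_lag + t] += chip   # add the delayed copy of the transmitted code

estimated_lag, scores = find_delay(code, echo)
print(estimated_lag, scores[estimated_lag])
```

At the correct lag all 255 products add coherently, while at any other lag they add with random signs, which is why the correlator output collapses for even a one-step offset.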
The analysis of Mercury’s motion begins with the statement that Mercury follows
a geodesic in Schwarzschild spacetime around the Sun. Perturbations of the
predicted GR-induced precession due to the other planets are negligible. The
opportunity is taken to include some relevant comments in what follows, beyond
the bare calculation. We can assume that Mercury’s orbital plane is the Sun’s
equatorial plane (θ = π /2) so that the metric Equation (3.48) becomes
$$c^{2}\, d\tau^{2} = c^{2} Z\, dt^{2} - dr^{2}/Z - r^{2}\, d\phi^{2}, \tag{7.1}$$
where $Z = 1 - 2GM/rc^{2}$, M is the Sun’s mass, and r is the distance of Mercury from the Sun. Since $2GM/c^{2} \approx 2.95$ km and $r \approx 5.8 \times 10^{7}$ km, $Z \approx 1 - 5 \times 10^{-8}$. Multiplying this equation by the planet’s mass squared, $m^{2}$, and dividing by $d\tau^{2}$ gives
Figure 7.1. The use of radar ranging to measure planetary distances (adapted from Hellings (1984)).
$$m^{2}c^{2} = m^{2}c^{2} Z \left(\frac{dt}{d\tau}\right)^{2} - \frac{m^{2} (dr/d\tau)^{2}}{Z} - m^{2} r^{2} \left(\frac{d\phi}{d\tau}\right)^{2}. \tag{7.2}$$
In flat spacetime this reduces to
$$m^{2}c^{2} = m^{2}c^{2} \left(\frac{dt}{d\tau}\right)^{2} - m^{2} \left(\frac{dr}{d\tau}\right)^{2} - m^{2} r^{2} \left(\frac{d\phi}{d\tau}\right)^{2},$$
that is to
$$m^{2}c^{2} = m^{2}c^{2}\gamma^{2} - m^{2} v_{r}^{2}\gamma^{2} - m^{2} v_{\phi}^{2}\gamma^{2},$$
where vr and vφ are the radial and tangential components of the velocity v and
γ = 1/(1 − v2 /c 2 )1/2 . This last equation is the standard SR formula relating rest mass
to four-momentum:
$$m^{2}c^{2} = E^{2}/c^{2} - p^{2},$$
and Equation (7.2) is its equivalent in Schwarzschild spacetime. The geodesic
equation for Mercury in its integral form is given by Equations (5.18) and (5.19):
$$\delta \int d\tau \left[ Z c^{2} \left(\frac{dt}{d\tau}\right)^{2} - \frac{(dr/d\tau)^{2}}{Z} - r^{2} \left(\frac{d\phi}{d\tau}\right)^{2} \right] = 0.$$
General solutions for such equations are discussed in Appendix A. In particular if L,
the quantity within the square brackets, is independent of a coordinate x μ, then
∂L /∂(dx μ /dτ ) is conserved. Here L is independent of t giving the conservation law
$$\frac{\partial L}{\partial (c\, dt/d\tau)} = \text{constant},$$
so that
$$2 Z c\, \frac{dt}{d\tau} = \text{constant}. \tag{7.3}$$
Re-expressing this result in terms of the momentum component p0, we have
$$c p_{0} = Z c p^{0} = Z m c^{2}\, \frac{dt}{d\tau} = E,$$
where E is a constant. In words, E is a constant of motion for a body in free fall,
which in the absence of a gravitational force reduces to mc 2γ , the usual SR energy. In
SR E would be constant in the absence of any forces. L is also independent of φ, so
that
$$\frac{\partial L}{\partial (d\phi/d\tau)} = \text{constant},$$
equivalently
$$r^{2}\, \frac{d\phi}{d\tau} = J, \text{ a constant}. \tag{7.4}$$
Equation (7.4) is the equivalent of the Newtonian law of conservation of angular
momentum for Schwarzschild spacetime. Replacing dt /dτ in Equation (7.2) by
E /Zmc 2 gives
$$\frac{E^{2}}{Z c^{2}} - \frac{m^{2} (dr/d\tau)^{2}}{Z} - m^{2} r^{2} \left(\frac{d\phi}{d\tau}\right)^{2} = m^{2}c^{2}. \tag{7.5}$$
Multiplying this by Z and dropping a factor m throughout we obtain
$$\frac{E^{2}}{mc^{2}} - m \left(\frac{dr}{d\tau}\right)^{2} - Z r^{2} m \left(\frac{d\phi}{d\tau}\right)^{2} = mc^{2} - \frac{2GMm}{r},$$
which can be rearranged as
$$\frac{m (dr/d\tau)^{2}}{2} + \frac{m r^{2} (d\phi/d\tau)^{2} Z}{2} - \frac{GMm}{r} = \frac{E^{2}/mc^{2} - mc^{2}}{2} = T, \tag{7.6}$$
where T is also a conserved quantity. Equation (7.6) is the equivalent in
Schwarzschild spacetime of the Newtonian conservation law for energy. There is
a radial kinetic energy term $m(dr/d\tau)^{2}/2$, a transverse kinetic energy term $mr^{2}(d\phi/d\tau)^{2}Z/2$, and a gravitational energy term $-GMm/r$. Together, Equations
(7.3), (7.4), and (7.6) fully describe the motion of Mercury, or for that matter any test
mass in free fall in Schwarzschild spacetime. The quantities E = cp0 and
J = r 2(dφ /dτ ) are invariants of motion, like their SR counterparts. We next go on
to solve these equations of motion. Using Equation (7.4) gives
$$\frac{dr}{d\tau} = \frac{dr}{d\phi}\, \frac{d\phi}{d\tau} = \frac{J}{r^{2}}\, \frac{dr}{d\phi}$$
and putting u = 1/r this becomes
$$\frac{dr}{d\tau} = -J\, \frac{du}{d\phi}.$$
Substituting this expression for dr /dτ into Equation (7.6) gives
$$\frac{J^{2} (du/d\phi)^{2}}{2} + \frac{u^{2} J^{2} Z}{2} - GMu = \frac{T}{m}.$$
Differentiating this with respect to φ and canceling a factor du /dφ, we obtain
$$J^{2} \left(\frac{d^{2}u}{d\phi^{2}}\right) + J^{2} u - \frac{3GMu^{2}J^{2}}{c^{2}} - GM = 0.$$
Rearrangement gives
$$\frac{d^{2}u}{d\phi^{2}} + u - \frac{GM}{J^{2}} = \frac{3GMu^{2}}{c^{2}}. \tag{7.7}$$
7-4
Introduction to General Relativity and Cosmology (Second Edition)
This result can be compared to the Newtonian equation for orbits in the gravita-
tional potential of a mass M:
$$\frac{d^{2}u}{d\phi^{2}} + u - \frac{GM}{J^{2}} = 0. \tag{7.8}$$
The solution of Equation (7.8) is well known:
$$u = \frac{1 + e\cos\phi}{l},$$
where l = a(1 − e 2 ). Figure 7.2 shows a bound orbit: an ellipse with eccentricity
0 ⩽ e < 1. At aphelion φ = π , r = a(1 + e ), at perihelion φ = 0, r = a(1 − e ). Hence
the long (major) axis is 2a in length. In addition, although less easy to prove,
$$l = J^{2}/GM.$$
A solution of Equation (7.8) is clearly a very good approximate solution of Equation
(7.7) because Mercury’s orbit is nearly Newtonian. Consequently we can rewrite the
small term on the right-hand side of Equation (7.7) as
$$\frac{3GM\, (1 + e\cos\phi)^{2}}{l^{2}c^{2}}$$
and make an entirely negligible error. With this substitution Equation (7.7) becomes
$$\frac{d^{2}u}{d\phi^{2}} + u - \frac{GM}{J^{2}} = \frac{3GM}{l^{2}c^{2}} \left(1 + 2e\cos\phi + e^{2}\cos^{2}\phi\right). \tag{7.9}$$
The solution to Equation (7.9) is similar to that for Equation (7.8) with extra
particular integral terms arising from the three terms on the right-hand side.
Explicitly
$$u = \frac{1 + e\cos\phi}{l} + \frac{3GM}{l^{2}c^{2}} \left[ \left(1 + \frac{e^{2}}{2}\right) - \left(\frac{e^{2}}{6}\right)\cos 2\phi + e\phi\sin\phi \right].$$
Of the additional terms, the first is a constant and the second oscillates through two
cycles on each orbit; both these terms are immeasurably small. However, the last
term increases steadily in amplitude with φ, and hence with time, while oscillating
once per orbit, and so this term is responsible for the precession. Dropping the
unimportant terms we have
$$u = \frac{1 + e\cos\phi + e\alpha\phi\sin\phi}{l},$$
where the factor $\alpha = 3GM/lc^{2}$ is extremely small. To first order in α, $\cos[(1 - \alpha)\phi] = \cos\phi + \alpha\phi\sin\phi$, and thus
$$u = \frac{1 + e\cos[(1 - \alpha)\phi]}{l} \tag{7.10}$$
is the general relativistic solution. At perihelion we have
$$(1 - \alpha)\phi = 2n\pi,$$
that is
$$\phi = 2n\pi + 6n\pi\, \frac{GM}{lc^{2}},$$
where n is an integer. This shows that the perihelion advances by $\Delta\phi = 6\pi GM/lc^{2}$ per orbit. If the orbital period is τ, the rate of precession is
$$\frac{\Delta\phi}{\tau} = \frac{6\pi GM}{a(1 - e^{2})\,\tau c^{2}}. \tag{7.11}$$
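Equation (7.11) is easy to evaluate for Mercury. The sketch below does so, converting the advance per orbit into arcsec per century. The orbital elements used (semi-major axis 5.7909 × 10¹⁰ m, eccentricity 0.2056, period 87.969 days) and the solar mass parameter are standard values supplied here as assumptions; they do not appear in the text.

```python
import math

GM_SUN = 1.32712440018e20      # solar mass parameter G*M, m^3 s^-2 (assumed value)
C = 299_792_458.0              # speed of light, m/s
ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi
DAYS_PER_CENTURY = 36525.0

def precession_per_orbit(a, e):
    """Perihelion advance per orbit in radians: 6*pi*GM / (a * (1 - e^2) * c^2)."""
    return 6.0 * math.pi * GM_SUN / (a * (1.0 - e**2) * C**2)

def precession_arcsec_per_century(a, e, period_days):
    """Equation (7.11) expressed in arcsec per century."""
    orbits_per_century = DAYS_PER_CENTURY / period_days
    return precession_per_orbit(a, e) * orbits_per_century * ARCSEC_PER_RAD

# Assumed orbital elements for Mercury.
mercury = precession_arcsec_per_century(a=5.7909e10, e=0.2056, period_days=87.969)
print(f"Mercury: {mercury:.2f} arcsec per century")
```

With these inputs the result comes out close to 43 arcsec per century, which can be compared with the observed excess quoted earlier in the chapter.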
Table 7.1 compares the observed and predicted planetary precessions measured in
arcsec per century. Among the planets, Mercury, with the smallest orbital radius and
greatest eccentricity (which also makes it easier to locate the perihelion precisely),
has the largest precession. More recently the much larger precession of a pulsar
PSR1913+16 orbiting a companion star was measured by Hulse and Taylor (1975).
The pulsar and its companion star both have mass 1.4 M⊙, the orbital period is
7.75 hours and the orbital eccentricity is large, 0.617. The resulting precession of the
periastron, that is the point of closest approach of the binary pair, is 4.23 degrees per
year. Measurements of the precession of several such pairs agree precisely with the
GR predictions. More details about PSR1913+16 will be given in the context of
gravitational wave detection.
Figure 7.3. The deflection of a light ray passing near the Sun.
Now the paths before and after the Sun are reflections of each other in the x-axis. This makes the total deflection
$$\Delta\phi = 4GM/bc^{2} \tag{7.15}$$
which for light just grazing the Sun’s limb ($b = b_{0} = 6.96 \times 10^{8}$ m) is 1.750 arcsec.
At any larger impact parameter b, the deflection Δφ is scaled down by b0 /b. This
prediction of GR is confirmed by both the optical and radio measurements described
in Chapter 2. If the Newtonian calculation is made taking into account only the time
distortion implied by the equivalence principle, the resultant deviation is only half as
large. The GR prediction takes into account both frame and time distortion.
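The quoted numbers follow directly from Equation (7.15). The sketch below evaluates the deflection at the solar limb, the half-size value obtained when only the time distortion is included, and the 1/b scaling at a larger impact parameter. The solar mass parameter and the function names are assumptions introduced for the illustration.

```python
import math

GM_SUN = 1.32712440018e20      # solar mass parameter G*M, m^3 s^-2 (assumed value)
C = 299_792_458.0              # speed of light, m/s
B0 = 6.96e8                    # solar radius from the text, m
ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi

def deflection_arcsec(b):
    """Equation (7.15): total deflection 4GM / (b c^2), converted to arcsec."""
    return 4.0 * GM_SUN / (b * C**2) * ARCSEC_PER_RAD

limb = deflection_arcsec(B0)          # light grazing the Sun's limb
time_only = limb / 2.0                # value with time distortion only
at_two_radii = deflection_arcsec(2.0 * B0)

print(f"limb: {limb:.3f} arcsec")
print(f"time distortion only: {time_only:.3f} arcsec")
print(f"b = 2 b0: {at_two_radii:.3f} arcsec")
```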
In 2004 Shapiro and colleagues (Shapiro et al. 2004) measured the deflection of radio waves from 541 compact radio sources by the gravitational field of the Sun, using them as targets in place of Mercury. The authors made use of the generalization of the deflection formula to cover sources far from the Sun’s direction due to Ward (Ward 1970). Equation (7.15) becomes
$$\Delta\phi = \frac{(1 + \gamma)GM}{bc^{2}} \left[1 + \cos\alpha\right], \tag{7.16}$$
where α is the angle between the Sun and the source as seen from Earth. γ would be
zero (unity) in Newtonian mechanics (GR). The 2500 measurement sessions each
extended over 24 hours to cover a range of α values, thus allowing both b and α to be
determined. These sessions were recorded using very long baseline interferometry (VLBI) arrays between 1979 and 1999. A global fit made to this data set gave
γ = 0.9998 ± 0.0004, again in strong support of Einstein’s theory.
Figure 7.4. A radar beam from Earth to Venus near superior conjunction.
Figure 7.5. A sample of post-fit residuals for Earth–Venus time-delay measurements, plotted against
time in days. The solid line is the prediction using GR. Adapted from Figure 1 in Shapiro et al. (1971).
Courtesy of the American Physical Society.
which increases the time delay by 10%. The round trip takes 1300 s and the
predicted delay is merely 220 μs. Figure 7.5 shows over 600 days’ measurements of
time delays of reflections from Venus made by Shapiro and colleagues in 1971
(Shapiro et al. 1971), using two radiotelescopes: the Haystack Observatory,
Massachusetts at 7.84 GHz and another at Arecibo, Puerto Rico at 430 MHz.
The figure shows how the delay changes from a year before to a year after superior
conjunction. The solid line giving the GR prediction over that time span is in
excellent agreement with the data. One experimental difficulty is that the solar
corona has a refractive index different from unity (varying as 1/(frequency)²), and
the effect of the corona is to increase the delay. A correction for this effect was
applied before producing Figure 7.5. A second difficulty lies in the uncertainty of
our knowledge of the topography of Venus, at a precision of 1500 m or 10 μs in
timing. In 1979 Reasenberg and colleagues (Reasenberg et al. 1979) overcame
these difficulties when measuring the time delay for reflections from Mars over a
period of 14 months. The uncertainty in topography was avoided by receiving and
retransmitting the radar signal from the Viking Lander on Mars. The effect of the
corona was studied at two frequencies (2.3 and 8.4 GHz) using a transponder on a
Viking Orbiter in Mars orbit. Knowing the frequency variation of the refractive
index, a comparison of the delays could be used to correct for the coronal effect.
Timing uncertainty was reduced to about 0.1 μs. The ratio of the observed delay
to that predicted by GR was 1.000 ± 0.002, another clear success of Einstein’s
theory.
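The two-frequency correction can be illustrated with a short sketch. It assumes, as a simplification, that the corona adds a delay k/f² to the vacuum delay at frequency f, in line with the 1/(frequency)² variation quoted above. The function name and all numerical values are hypothetical and are not taken from the Viking data.

```python
def remove_corona(delay_1, f1, delay_2, f2):
    """Solve delay_i = vacuum + k / f_i**2 for the vacuum delay and the coefficient k."""
    k = (delay_1 - delay_2) / (1.0 / f1**2 - 1.0 / f2**2)
    vacuum = delay_1 - k / f1**2
    return vacuum, k


# Hypothetical illustration: build two "measured" delays, then recover the inputs.
F_LOW, F_HIGH = 2.3e9, 8.4e9   # the two frequencies quoted in the text, Hz
TRUE_VACUUM = 200e-6           # invented vacuum delay, s
TRUE_K = 5.0e13                # invented coronal coefficient, s Hz^2

delay_low = TRUE_VACUUM + TRUE_K / F_LOW**2
delay_high = TRUE_VACUUM + TRUE_K / F_HIGH**2

vacuum, k = remove_corona(delay_low, F_LOW, delay_high, F_HIGH)
print(f"coronal delay at 2.3 GHz: {k / F_LOW**2 * 1e6:.2f} microseconds")
print(f"recovered vacuum delay  : {vacuum * 1e6:.2f} microseconds")
```

Because the coronal term is much smaller at the higher frequency, the difference between the two delays fixes k directly.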
Figure 7.6. The satellite Gravity Probe B. Figure from Everitt et al. (2011). Courtesy of the American Physical
Society.
gyros that floated freely inside the GP-B. Each was a 3.8 cm diameter homogeneous
fused silica near-perfect sphere coated with a niobium film. These were contained in
an enclosure cooled by liquid helium at 1.8 K rendering the niobium superconducting.
The spheres were spun up by blasts of gas to 4300 rpm. As predicted by Fritz
London, the magnetic moment of a superconductor aligns precisely with the spin
axis. A first step was to point this London moment at a guide star IM Pegasi using
an onboard telescope. IM Pegasi is particularly stable with proper motion of under
0.15 mas/yr. The axes of the gyros were continuously monitored, over the 17 months
that the coolant lasted, to determine the spin axis orientation and so measure
its precession. Any external magnetic field was excluded by shielding. Analysis of the
data transmitted to Earth gave values for the two precessions of 6601.8 ± 18.3 mas/yr
and 37.2 ± 7.2 mas/yr in excellent agreement with the GR predictions of 6606.1 and
39.2 mas/yr. This remarkable experiment, 45 years in design and development before
launch, achieved the astonishing angular precision of one one-hundred-thousandth of
a degree per year.
A thought-provoking comment made by one of the experimenters: if space were
nothing you couldn’t twist it.
Figure 7.7. The low dispersion spectra of UM673A and UM673B recorded in December 1986, plotted as
relative flux against wavelength (Å), with the N V, Lyα, Si IV/O IV] and C IV emission lines marked.
The resolution is about 1.3 nm. Figure 2 from Surdej et al. (1987). Courtesy Springer.
7.6 Exercises
1. What is the radar echo time delay for reflection from Mars at superior
conjunction? Assume that the orbit of Mars is a circle of radius 228 million
kilometers.
2. An observer is at rest at a distance r from a star of mass M. Show that his
four-velocity in this frame (S) is $u^\alpha = (c/\sqrt{Z}, 0, 0, 0)$, where Z = 1 − 2GM/(rc²).
Then consider a frame in free fall (F) which coincides momentarily with the
observer’s rest frame. Show that his four-velocity in frame F is
$v^\alpha = (c, 0, 0, 0)$. Suppose a body with four-momentum $p^\alpha$ in frame S passes
close to the observer. Show that E*, the energy of the body in frame F, is $p_\alpha u^\alpha$.
This question illustrates that the energy measured by an observer is the
invariant $p_\alpha u^\alpha$ where $p^\alpha$ is the body’s four-momentum and $u^\alpha$ is the
observer’s four-velocity, both referred to the same frame.
3. Show that the angular radius of an Einstein ring is $\sqrt{4GM D_{LS}/(c^2 D_L D_S)}$, where M is
the mass of the lens, $D_L$ its distance from the Earth, $D_S$ the distance of the
source from the Earth, and $D_{LS}$ the source–lens separation.
4. Calculate the precession of Mars’ orbit given that the semimajor axis is
228 × 10⁹ m, the orbital period is 687 days, and the orbit’s eccentricity is
0.0934.
5. In the Schwarzschild metric the coordinate distances in the radial direction
differ from those in the angular directions. Use the transformation
r = w[1 + k]², where k = GM/[2c²w], to produce isotropic coordinates in
which there is no difference between the radial and angular directions.
Further Reading
Will C M 1986 Was Einstein Right? (New York: Basic Books). A readable
account of tests of Einstein’s theory, backed by expert knowledge of the
subject.
Will C M 2018 Theory and Experiment in Gravitational Physics (2nd ed.;
Cambridge: Cambridge Univ. Press). A thorough presentation by an expert,
giving a deep, exhaustive account of experimental tests.
References
Everitt, C. W. F., DeBra, D. B., Parkinson, B. W., et al. 2011, PhRvL, 106, 221101
Hellings, R. W. 1984, in General Relativity and Gravitation, ed. B. Bertotti, F. Felice, &
A. Pascolini (Dordrecht: Reidel), 365
Hulse, R. A., & Taylor, J. H. 1975, ApJ, 195, L51
Reasenberg, R. D., Shapiro, I. I., MacNeil, P. E., et al. 1979, ApJ, 234, L219
Shapiro, I. I., Ash, M. E., Ingalls, R. P., et al. 1971, PhRvL, 26, 1132
Shapiro, S. S., Davis, J. L., Lebach, D. E., & Gregory, J. S. 2004, PhRvL, 92, 121101
Surdej, J., Magain, P., Swings, J.-P., et al. 1987, Natur, 329, 695
Ward, W. R. 1970, ApJ, 162, 345
Chapter 8
Black Holes
In 1784 the Rev. J. Michell became the first person to draw attention to the
implications of the gravitational potential GM/r becoming large (Michell 1784). For
a body of mass m to be able to escape to infinity from a star of mass M and radius r
its kinetic energy must exceed the gravitational potential energy. It requires an initial
velocity v, such that
$$\frac{mv^2}{2} \geqslant \frac{GMm}{r}, \quad\text{i.e.}\quad v \geqslant (2GM/r)^{1/2},$$
so that escape is only possible for velocities greater than √(2GM/r). With a dense
enough star the escape velocity reaches the velocity of light: the star’s radius is then
$$r_0 \equiv 2GM/c^2. \tag{8.1}$$
A more compact star would be invisible. The curvature of spacetime is so severe that
we can only hope to give a consistent account of conditions using GR. For light
traveling radially in a region described by the Schwarzschild metric, Equation (3.48)
with ds² = dΩ² = 0 becomes
$$0 = c^2\,dt^2\,(1 - r_0/r) - \frac{dr^2}{1 - r_0/r}.$$
Thus
$$c\,dt = \frac{dr}{1 - r_0/r}. \tag{8.2}$$
If a star shrinks to a radius less than r0 ≡ 2GM/c² the time taken for light to emerge
from the spherical surface at r0 becomes infinite. An observer will never receive any
light emitted from within that radius. The spherical surface at r0 is called an event
horizon, and r0 is known as the Schwarzschild radius of the star. Any star shrinking
within its Schwarzschild radius becomes invisible, a black hole. Black holes are in one
sense simple: lack of contact with the universe outside limits the properties they can
exhibit to their mass, angular momentum, and charge.
The Schwarzschild radius of the Sun is 2.96 km and that of the Earth is 8.9 mm.
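The two radii quoted above follow directly from Equation (8.1). A minimal sketch is given below; the values of GM for the Sun and the Earth are standard figures assumed for the purpose, and the function name is illustrative.

```python
# Assumed standard values (not quoted in the text).
GM_SUN = 1.32712440018e20   # m^3 s^-2
GM_EARTH = 3.986004418e14   # m^3 s^-2
C = 2.99792458e8            # m s^-1


def schwarzschild_radius(gm):
    """r0 = 2GM/c^2 from Equation (8.1), in meters."""
    return 2.0 * gm / C**2


r0_sun = schwarzschild_radius(GM_SUN)
r0_earth = schwarzschild_radius(GM_EARTH)
print(f"Sun  : {r0_sun / 1e3:.2f} km")
print(f"Earth: {r0_earth * 1e3:.2f} mm")
```

Since r0 is proportional to M, the radius for any other mass follows by simple scaling from the solar value.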
For simplicity the analysis here will be carried through for non-rotating electrically
neutral (Schwarzschild) black holes. It is likely that most black holes are in fact
rotating, and these are given the name Kerr black holes. Where necessary, the effects of this
rotation on the analysis will be pointed out. Space is populated by matter so that any
charge on a black hole would soon be neutralized. In 1931 Chandrasekhar
(Chandrasekhar 1931) deduced that the gravitational self-attraction of a sufficiently
massive star leads inevitably to its collapse to a point.1 The mass limit depends on
the equation of state assumed for the star: Chandrasekhar gave a limit of 1.4 M⊙,
and it is now considered to be around 2.1 M⊙. A star that is 1.4 times more massive
than our Sun has a Schwarzschild radius of only about 4 km, and such a star would then
attain a mean density of about 10¹⁹ kg m⁻³, far beyond the density of nuclei. It seems
difficult to imagine how this could come about. Surely the mounting pressure would
eventually halt the contraction? In the last chapter we saw that the curvature tensor
is proportional to the stress–energy tensor, so that pressure also contributes to the
gravitational self-attraction of a star. Another paradoxical property of black holes,
treated below, is that they can radiate through a quantum field effect discovered by
Hawking (1974).
The first topic treated here is the unusual and sometimes counter-intuitive
properties of spacetime near a black hole and the orbits followed there by matter
and radiation. The interplay of gravitational and quantum physics is the second
topic: this includes an account of Hawking radiation from black holes, an outline of
the thermodynamic properties of black holes, and a brief section to expose the puzzle
over whether information is lost when matter enters a black hole. The third topic,
last but not least, recounts the now convincing experimental evidence for the
existence of both stellar and galactic black holes.
1 Chandrasekhar won the Nobel Prize in Physics for this work 52 years later, a record.
2 Hawking radiation escapes at the black hole’s surface.
horizon, consider a probe falling radially toward a star that has collapsed inside its
Schwarzschild radius. Equation (7.6) is the appropriate orbital equation. If the probe
moves along a radius with θ = π/2, ϕ = 0, then
$$\frac{(dr/d\tau)^2}{2} - \frac{GM}{r} = \frac{T}{m}.$$
For simplicity the probe is taken to have started at rest at an infinite distance from
the hole, making its kinetic energy T vanish. The equation then reduces to
$$c\,d\tau = \pm\left(\frac{r}{r_0}\right)^{1/2} dr, \tag{8.3}$$
where, as defined, r0 is 2GM/c². Taking the negative sign for travel inward and
integrating gives
$$\tau = \tau_0 - \frac{r_0}{c}\left(\frac{2}{3}\right)\left(\frac{r}{r_0}\right)^{3/2}, \tag{8.4}$$
where τ0 is the proper time when the probe reaches the center (r = 0). This path is
plotted in Figure 8.1 as the solid curve. A key feature to note is that the proper time τ,
recorded by an onboard clock, changes smoothly on crossing the horizon. Once
across the horizon the probe soon reaches the center of the hole: in the case of a hole
of mass 10 M⊙ this interval τ0 − τ is 10⁻⁴ s. In practice instruments could not
survive, being torn apart by the gravitational field gradients (tides).
Figure 8.1. The proper time τ and the coordinate time t plotted as a function of the radial coordinate r for a
probe falling radially into a black hole. τ and t are set equal at r = 5r0.
Next consider how the journey appears to a distant observer. The coordinate time t measured by
this remote observer is related to the proper time through Equation (7.3):
$$\left(1 - \frac{r_0}{r}\right)\frac{dt}{d\tau} = K, \text{ a constant.}$$
Imposing the initial conditions, with the distant probe at rest, gives K = 1. Thus
$$\left(1 - \frac{r_0}{r}\right)dt = d\tau. \tag{8.5}$$
Equation (8.4) can be used to replace dτ in Equation (8.5) and rearranging we obtain
$$c\,dt = -\frac{r^{3/2}\,dr}{(r - r_0)\,r_0^{1/2}}.$$
Integration over the inward journey yields
$$t = t_0 + \frac{r_0}{c}\left[-\frac{2}{3}\left(\frac{r}{r_0}\right)^{3/2} - 2\left(\frac{r}{r_0}\right)^{1/2} + \ln\frac{(r/r_0)^{1/2} + 1}{(r/r_0)^{1/2} - 1}\right]. \tag{8.6}$$
Remote from the black hole this reduces to
$$t \approx t_0 - \frac{r_0}{c}\left(\frac{2}{3}\right)\left(\frac{r}{r_0}\right)^{3/2} - 2\left(\frac{r_0}{c}\right)\left(\frac{r}{r_0}\right)^{1/2}.$$
By choosing t0 suitably it is possible to arrange that t and τ are equal at some large
distance R. This choice is
$$t_0 \approx \tau_0 + 2\,\frac{r_0}{c}\left(\frac{R}{r_0}\right)^{1/2}.$$
From Equation (8.6) it is clear that as r tends toward r0 then t tends to infinity. The
world line described by Equation (8.6) is shown by the broken line in Figure 8.1. The
very different behavior of coordinate and proper time as the probe approaches and
crosses the horizon illustrates vividly how the curvature of spacetime makes it
impossible to cover all spacetime with one set of Cartesian coordinates.
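The contrast between Equations (8.4) and (8.6) is easy to see numerically. The sketch below works in units with r0 = c = 1 and sets τ0 = 0; these choices, the function names, and the use of an exact rather than approximate matching constant t0 at r = 5r0 are assumptions made for illustration.

```python
import math


def proper_time(r, r0=1.0, c=1.0, tau0=0.0):
    """Proper time of Equation (8.4) for radial infall from rest at infinity."""
    return tau0 - (r0 / c) * (2.0 / 3.0) * (r / r0) ** 1.5


def coordinate_time(r, r0=1.0, c=1.0, t0=0.0):
    """Coordinate time of Equation (8.6); only defined outside the horizon, r > r0."""
    x = math.sqrt(r / r0)
    bracket = -(2.0 / 3.0) * x**3 - 2.0 * x + math.log((x + 1.0) / (x - 1.0))
    return t0 + (r0 / c) * bracket


R_MATCH = 5.0   # tau and t are set equal here, as in Figure 8.1
t0 = proper_time(R_MATCH) - coordinate_time(R_MATCH)   # exact matching constant

for r in (5.0, 2.0, 1.1, 1.001, 1.000001):
    print(f"r/r0 = {r:<9} tau = {proper_time(r):8.3f}   t = {coordinate_time(r, t0=t0):8.3f}")
```

The proper time stays finite all the way to the horizon, while the coordinate time grows logarithmically without bound as r approaches r0.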
We can show in a few steps that signals received from a probe crossing the
horizon fade away quickly. Suppose the probe emits signals at constant frequency
(from an onboard quartz crystal oscillator): as it approaches the horizon, the
photons traveling to the distant observer arrive less frequently and increasingly
redshifted. Both effects diminish the energy received so that eventually the signals
are undetectable. The photons in question follow radial paths for which the time
between emission and detection is given by Equation (8.2)
$$t' = \int_r^R \frac{dr}{c(1 - r_0/r)} = \frac{R - r}{c} + \frac{r_0}{c}\ln\left(\frac{R - r_0}{r - r_0}\right), \tag{8.7}$$
where the probe is at radius r and the observer is at radius R. As far as the observer is
concerned the arrival time T is measured relative to some fixed event, which can be
the departure of the probe. This time interval is the sum of the inward travel time of
the probe given by Equation (8.6) and the return time of the photons from probe to
observer given by Equation (8.7). The expressions simplify a good deal once it is
noted that the most significant contributions come from terms containing
ln[r − r0], which dominate when the probe approaches the horizon and r → r0:
in comparison terms like r − r0, ln[r + r0] and ln[r0] can be ignored. First from
Equation (8.6)
$$t \approx -\frac{r_0}{c}\ln\left[\left(\frac{r}{r_0}\right)^{1/2} - 1\right] \approx -\frac{r_0}{c}\ln\left(r^{1/2} - r_0^{1/2}\right) \approx -\frac{r_0}{c}\ln(r - r_0).$$
Next, from Equation (8.7)
$$t' \approx -\frac{r_0}{c}\ln(r - r_0).$$
Thus the total elapsed time is
$$T = t + t' \approx -2\,\frac{r_0}{c}\ln(r - r_0). \tag{8.8}$$
Now the energy L(T) received per unit time contains a factor (1 − r0/r)^{1/2} due to the
increase in time interval between photons and another factor (1 − r0/r)^{1/2} due to their
redshift. Thus
$$L(T) \propto \left(1 - \frac{r_0}{r}\right) = \frac{r - r_0}{r}.$$
Using Equation (8.8) this becomes
$$L(T) \propto \exp\left(-\frac{cT}{2r_0}\right).$$
This analysis applies equally to a star in the act of collapsing through its horizon: it
reddens and fades on a timescale 2r0/c. In the case of a star of mass 5 M⊙ it would
blink out in about a hundred microseconds.
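The fading timescale 2r0/c can be evaluated for any mass. The sketch below is illustrative only: the function names are hypothetical and the value of GM for the Sun is an assumed standard figure.

```python
import math

GM_SUN = 1.32712440018e20   # assumed standard value, m^3 s^-2
C = 2.99792458e8            # m s^-1


def fade_timescale(mass_in_solar_masses):
    """E-folding time 2 r0 / c of the received luminosity, in seconds."""
    r0 = 2.0 * GM_SUN * mass_in_solar_masses / C**2
    return 2.0 * r0 / C


def relative_luminosity(elapsed, timescale):
    """L(T) proportional to exp(-cT / 2 r0), normalized to unity at T = 0."""
    return math.exp(-elapsed / timescale)


t_fade = fade_timescale(5.0)
print(f"e-folding time for 5 solar masses: {t_fade:.2e} s")
print(f"fraction left after 10 e-foldings: {relative_luminosity(10.0 * t_fade, t_fade):.1e}")
```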
The critical difference between spacetime inside and outside the horizon lies in the
sign reversal of the metric coefficients g00 = (1 − r0 /r ) and g11 = −1/(1 − r0 /r ) at the
surface r = r0 . Table 8.1 illustrates this. Therefore if a small change in t is made at
constant radius inside the horizon (r < r0),
$$\frac{ds^2}{c^2} = d\tau^2 = g_{00}\,dt^2 < 0.$$
Table 8.1. Signs of the metric coefficients g00 and g11 outside and inside the horizon.
          r > r0    r < r0
  g00       +         −
  g11       −         +
This is opposite in sign to the effect of a similar small change in t outside the horizon
(r > r0), namely
$$\frac{ds^2}{c^2} = d\tau^2 = g_{00}\,dt^2 > 0.$$
Inside the horizon a separation in coordinate time has become space-like (ds² < 0).
Similar considerations show that inside the horizon a separation in radial coordinate
only has become time-like (ds² > 0). The curvature is so intense that if we insist on
using coordinates appropriate to distant flat spacetime, then we find that space and
time inside the horizon interchange the properties normally associated with them.
The absence of any indication in the world line of the onboard clock that it is crossing
the horizon was illustrated in Figure 8.1 and shows that the horizon is not a physical
singularity; rather it is a mathematical singularity.
What then is the import of the mathematical singularity with g11 diverging to
infinity as r goes to r0? The mathematical singularity arises because the set of
coordinates imposed everywhere is best suited to regions of small curvature. A set of
coordinates more appropriate to the locale of the black hole was invented by
Eddington in 1924 (Eddington 1924) and rediscovered by Finkelstein in 1958
(Finkelstein 1958); the time coordinate t is replaced by t̃ such that
$$\tilde{t} = t + \frac{r_0}{c}\ln\left|\frac{r}{r_0} - 1\right|$$
and
$$d\tilde{t} = dt - \frac{dr}{c(1 - r/r_0)}.$$
In terms of this new time coordinate the Schwarzschild metric Equation (3.48)
$$ds^2 = c^2\,dt^2\,(1 - r_0/r) - dr^2/(1 - r_0/r) - r^2\,d\Omega^2$$
becomes
$$ds^2 = \left(1 - \frac{r_0}{r}\right)c^2\,d\tilde{t}^2 - 2c\,dr\,d\tilde{t}\left(\frac{r_0}{r}\right) - dr^2\left(1 + \frac{r_0}{r}\right) - r^2\,d\Omega^2. \tag{8.9}$$
The metric coefficients of Equation (8.9) have no mathematical singularity at the
horizon. We shall show next that Eddington’s coordinates (also called Eddington–
Finkelstein coordinates) provide the basis for a clearer understanding of spacetime
structure close to the black hole, although such coordinates would be a strange
choice in a region remote from the black hole.
The radial path of a light ray in these new coordinates is
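$$0 = \left(1 - \frac{r_0}{r}\right)c^2\,d\tilde{t}^2 - 2c\,\frac{r_0}{r}\,dr\,d\tilde{t} - \left(1 + \frac{r_0}{r}\right)dr^2,$$
which has two solutions
$$\frac{d\tilde{t}}{dr} = -\frac{1}{c} \quad\text{and}\quad \frac{d\tilde{t}}{dr} = \frac{1}{c}\,\frac{1 + r_0/r}{1 - r_0/r}. \tag{8.10}$$
These solutions describe the paths of ingoing and outgoing rays respectively, and in
the absence of a black hole would become
$$\frac{d\tilde{t}}{dr} = -\frac{1}{c} \quad\text{and}\quad \frac{d\tilde{t}}{dr} = +\frac{1}{c}.$$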
Light cones constructed according to Equation (8.10) are drawn in Figure 8.2 at
various points on the trajectory of a source falling into a black hole. The maximum
inward component of the velocity of light is c throughout: the left-hand edges of the
light cones drawn in the figure have the same length and make the same angle with
the time axis. On the other hand the maximum outward component varies with the
radial distance. When the source is far from the horizon dt̃/dr is +1/c for light
directed outwards. Then as the source approaches the horizon dt̃/dr increases until,
at the horizon, it points along the time axis; the maximum outward component of the
velocity of light therefore falls to zero at the horizon. When the source goes inside
the horizon dt̃/dr becomes negative for “outgoing” as well as ingoing rays so
that the whole light cone is tilted inward. This also means that once the horizon is
crossed the source will inevitably head toward the center of the black hole. However
powerful the rocket engine may be, the probe’s velocity vector must lie inside the
light cone, and this seals the fate of the probe.
Figure 8.2. The light cones of a probe falling into a black hole. t̃ is the Eddington time coordinate.
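The tilting of the light cones can be followed by evaluating the two slopes of Equation (8.10) on either side of the horizon. The sketch below uses units with r0 = c = 1; the function names are illustrative, and `null_condition` simply substitutes a slope back into the radial null form of Equation (8.9) divided by dr².

```python
def ingoing_slope(r, r0=1.0, c=1.0):
    """dt~/dr for ingoing rays, Equation (8.10); independent of r."""
    return -1.0 / c


def outgoing_slope(r, r0=1.0, c=1.0):
    """dt~/dr for 'outgoing' rays, Equation (8.10); singular exactly at r = r0."""
    a = r0 / r
    return (1.0 + a) / (1.0 - a) / c


def null_condition(r, slope, r0=1.0, c=1.0):
    """Radial part of Equation (8.9) divided by dr^2; zero for a light ray."""
    a = r0 / r
    return (1.0 - a) * (c * slope) ** 2 - 2.0 * a * (c * slope) - (1.0 + a)


for r in (100.0, 2.0, 1.1, 0.9, 0.5):
    print(f"r/r0 = {r:<6} ingoing {ingoing_slope(r):+.3f}   outgoing {outgoing_slope(r):+.3f}")
```

Outside the horizon the outgoing slope is positive and steepens toward r0; inside it is negative, so both edges of the cone point toward smaller r.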
There exists an equally valid alternative choice for Eddington coordinates with
$$\tilde{t} = t - \frac{r_0}{c}\ln\left|\frac{r}{r_0} - 1\right|.$$
Now the light cone of a source within the horizon tilts so that it points outward. The
physical situation is that of a white hole ejecting material from the singularity at r = 0
into spacetime. Gravitational collapse may create a black hole, but we don’t know of
any physical process to generate its converse, the white hole: mathematically they
are equally valid. We have learnt too that the choice of coordinate scheme can bring
with it an apparent singularity that in fact does not exist.
However the singularity associated with the center of a black hole cannot be
removed by the choice of coordinates. In 1965 Penrose showed that physical
Hawking & Penrose (1970) refined the proof that physical singularities are
broadly expected in general relativity. Penrose was awarded the Nobel Prize in
physics in 2020 for his work showing that general relativity leads to singularities.
There is no accepted theory for describing the conditions at the center of the black
hole (r = 0), a singularity where the curvature of spacetime becomes infinite. Penrose
conjectured that there are no naked singularities, that is to say singularities that we
can directly observe, all would be hidden behind some horizon: this is called cosmic
censorship. Of course these proofs and speculations are valid only classically. What
happens in practice depends on what the quantum theory of gravitation might be,
which is so far not understood.
8.2 Orbits Around Black Holes
The analysis of stable orbits developed in Chapter 7 can also be applied when the
parent body is a Schwarzschild black hole. Combining Equations (7.4) and (7.5) to
eliminate dϕ / dτ gives
$$\frac{E^2}{Zc^2} - \frac{m^2(dr/d\tau)^2}{Z} - \frac{m^2J^2}{r^2} = m^2c^2,$$
where E is a constant of motion equivalent to the classical energy, and J, the angular
momentum per unit mass, is another constant of motion. Rearranging the above
equation gives
$$\frac{E^2}{m^2c^4} = \frac{(dr/d\tau)^2}{c^2} + Z\left(1 + \frac{J^2}{c^2r^2}\right). \tag{8.11}$$
Then
$$r = r_J = 3r_0. \tag{8.16}$$
Figure 8.3. Equivalent radial potential seen by a massive body in orbit around a Schwarzschild black hole,
plotted as (potential − 1.0) against r/r0. The labels on the potentials (2.0, 3.0, 4.0 and 5.5) refer to the
values of J²/r0²c²; points labeled A and B are marked on the curves.
The energy of this smallest stable circular orbit is given by Equation (8.11) as
$$E^2(\mathrm{min}) = \frac{8m^2c^4}{9} \quad\text{and}\quad E(\mathrm{min}) = \left(\frac{8}{9}\right)^{1/2} mc^2, \tag{8.17}$$
and lies on the potential curve labeled 3 in the figure. Therefore the binding energy
amounts to a fraction 1 − √(8/9) or 5.72% of the rest mass energy. This quantity is of
cosmic importance: it is the energy released when material accreted around a black
hole spirals into the stable orbit with lowest energy. By comparison the maximum
energy release in thermonuclear fusion when hydrogen burns to ⁵⁶Fe is only 0.84%
of the rest mass energy. Gravitational energy release is therefore the most potent
energy source contributing to stellar processes. When a body orbits a rotating black
hole the gravitational energy release can go much higher, reaching 42% of the rest
mass energy in favorable cases. In most galaxies there is a massive black hole, in our
Galaxy of mass ∼4 × 10⁶ M⊙. In many galaxies, but not ours, matter is accreting in
a disk around the black hole under the intense gravitational attraction, and then falls
into the black hole with a huge energy release in radiation. Such galactic cores are
known as active galactic nuclei (AGNs). Those AGNs whose intense radiation is
observed without obscuration by the accreting disk are known as quasars.
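The 5.72% binding energy can be reproduced from Equation (8.11) by setting dr/dτ = 0 at r = 3r0 on the curve with J²/r0²c² = 3. The sketch below uses units with r0 = 1 and expresses energies in units of mc²; the function name and the finite-difference check of the stationary point are illustrative additions.

```python
import math


def potential(r, j2, r0=1.0):
    """Right-hand side of Equation (8.11) with dr/dtau = 0; j2 stands for J^2/(r0^2 c^2)."""
    z = 1.0 - r0 / r
    return z * (1.0 + j2 * r0**2 / r**2)


R_ORBIT = 3.0   # smallest stable circular orbit, Equation (8.16), in units of r0
J2 = 3.0        # the potential curve labeled 3 in Figure 8.3

e_min = math.sqrt(potential(R_ORBIT, J2))   # E(min) / (m c^2)
binding_fraction = 1.0 - e_min
print(f"E(min)/mc^2      = {e_min:.4f}")
print(f"binding fraction = {100.0 * binding_fraction:.2f}%")

# The potential is stationary at r = 3 r0 on this curve (central difference).
h = 1e-4
gradient = (potential(R_ORBIT + h, J2) - potential(R_ORBIT - h, J2)) / (2.0 * h)
print(f"dV/dr at 3 r0    = {gradient:.1e}")
```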
In the image of the black hole M87* shown in Figure 1.9 the bright ring matches
the circular unstable photon orbits around the black hole. Matter in the accretion
disk lying along these orbits emits photons that after orbiting then escape and have
been detected by the Event Horizon Telescope (EHT) Collaboration (Beckenstein
1972). We show now that the radius of this unstable orbit is 3r0 /2.
The Schwarzschild metric equation, Equation (3.48), in the case of a photon
traveling in the equatorial plane around a static black hole is
$$Zc^2(dt/d\lambda)^2 - (dr/d\lambda)^2/Z - r^2(d\phi/d\lambda)^2 = 0, \tag{8.18}$$
where λ is the path length parameter. In this orbit the components of the
Schwarzschild metric are independent of both the azimuthal angle ϕ and time t.
Hence taking results from analyzing the precession of Mercury there are two
invariants given by Equations (7.3) and (7.4):
$$e = Zc\,dt/d\lambda \quad\text{and}\quad \ell = r^2\,d\phi/d\lambda.$$
These are equivalent to the conservation, respectively, of energy and angular
momentum. Then the equation for the photon orbit round the black hole reduces to
$$e^2/Z - (dr/d\lambda)^2/Z - \ell^2/r^2 = 0. \tag{8.19}$$
Multiplying by Z/ℓ² and putting b = ℓ/e gives
$$b^{-2} = \ell^{-2}(dr/d\lambda)^2 + Z/r^2. \tag{8.20}$$
We can interpret the parameter b physically using the top-right inset in Figure 8.4.
This inset shows a section of a photon’s path from the neighborhood of the black
hole out to a remote observer. We write R to distinguish the large values of r remote
from the black hole. Then at remote points: Z = 1 and ϕ = p/R, where p is called the
impact parameter. Equally
$$b = |\ell/e| = \frac{R^2(d\phi/d\lambda)}{Z(c\,dt/d\lambda)} = R^2\,\frac{d\phi}{c\,dt} = R^2\,\frac{d(p/R)}{c\,dt} = p\,\frac{dR}{c\,dt} = p. \tag{8.21}$$
Figure 8.4. Equivalent potential for photon near a Schwarzschild black hole. The inset shows the path from
the photon ring to the observer. The symbols are used in the text.
The main body of Figure 8.4 shows the potential from Equation (8.20). It peaks
where d(Z/r²)/dr = 0, that is where
$$r = 3GM/c^2 = 3r_0/2 = r_\gamma. \tag{8.22}$$
This is the radius of an unstable circular photon orbit; any perturbation would
propel the photon either inward or outward. We can obtain the impact parameter
for this unstable orbit by substituting rγ for r in Equation (8.20): the orbit is circular
so that dr/dλ is zero and we get
$$b_\gamma = p_\gamma = \sqrt{\frac{r_\gamma^2}{1 - r_0/r_\gamma}} = \frac{3\sqrt{3}\,r_0}{2}. \tag{8.23}$$
Then, seen from the Earth, a distance R away, the bright ring around the black
hole’s shadow has an angular radius bγ /R . If, as seems probable, the black hole is
rotating, the Kerr metric replaces the Schwarzschild metric to describe spacetime
around the black hole. The angular size of the image is reduced, but only by 5% even
if the rotation is maximal. Figure 8.5 shows photon paths from around the black
hole in a plane containing the center of the black hole. That in red is the
unstable orbit with the path to the Earth. Material anywhere along the
unstable orbit can emit photons into this orbit: thus radiation detected with impact
parameter bγ should be intense. The broken lines ending on the horizon are ingoing
paths only. Observations made by the EHT of black holes at the centers of the
nearby M87 galaxy and our own Galaxy are described in Section 8.10.
Figure 8.5. Photon paths to the Earth from near a black hole. The unstable circular path and its link to the
telescope are indicated in red. The broken lines are ingoing paths only. The components are not drawn to scale.
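Equations (8.22) and (8.23) give a quick estimate of the angular radius bγ/R of the bright ring. The sketch below applies them to the black hole in our own Galaxy, using the mass of about 4 × 10⁶ M⊙ quoted earlier. The distance of 8.2 kpc is an assumed round value that is not given in this chapter, the function names are illustrative, and rotation is neglected.

```python
import math

GM_SUN = 1.32712440018e20   # assumed standard value, m^3 s^-2
C = 2.99792458e8            # m s^-1
KPC = 3.0857e19             # one kiloparsec in meters
RAD_TO_MICROARCSEC = 180.0 * 3600.0e6 / math.pi


def critical_impact_parameter(r0):
    """b_gamma of Equation (8.23), evaluated at r_gamma = 3 r0 / 2 from Equation (8.22)."""
    r_gamma = 1.5 * r0
    return math.sqrt(r_gamma**2 / (1.0 - r0 / r_gamma))


MASS_SOLAR = 4.0e6          # mass quoted in the text for the Galactic black hole
DISTANCE = 8.2 * KPC        # assumed distance to the Galactic center

r0 = 2.0 * GM_SUN * MASS_SOLAR / C**2
b_gamma = critical_impact_parameter(r0)
theta = b_gamma / DISTANCE * RAD_TO_MICROARCSEC
print(f"b_gamma / r0   = {b_gamma / r0:.4f}")
print(f"angular radius = {theta:.0f} microarcseconds")
```

Angles of this size, a few tens of microarcseconds, indicate the resolution an instrument needs to image the ring.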
where a = J /Mc . The larger the angular momentum the smaller the radius of the
horizon becomes. If J/Mc were to exceed r0/2 the horizon radius would become complex and
the singularity at the center of the black hole might become accessible. What would
then happen is unclear; Penrose conjectured that some physical principle (cosmic
censorship) would guarantee that singularities remain hidden behind horizons. A
second important surface defines a region outside the event horizon from which
escape is possible but within which static equilibrium cannot be maintained. Any
matter in the intermediate volume, the ergosphere, will rotate with the black hole.
The outer surface of the ergosphere is a spheroid of revolution with a radial
coordinate r+, which depends on the polar angle θ it makes with the spin axis
Figure 8.6. Mass-radius diagram showing the regions where general relativistic effects and quantum effects are
important. MP is the Planck mass and ℓP the Planck length.
purely quantum effect. Quantum mechanics has revealed that the vacuum, rather
than being inert, is in a state of constant activity due to the continuous creation and
annihilation of particle–antiparticle pairs. Thus a pair of photons can be created
close to, but outside the black hole horizon with four-momenta (pc, −p) and (−pc , p).
The net four-momentum is zero, but the negative energy of one photon violates the
requirement that real photons have positive energy. According to Heisenberg’s
uncertainty principle this virtual photon can only exist for a time
$$\Delta t \sim \hbar/(pc).$$
For some directions of emission the negative-energy photon will cross the horizon.
Once across the horizon and inside the black hole the space-like components of four-
vectors become time-like, and vice-versa. Thus the photon’s negative energy
becomes an acceptable spatial momentum and its momentum converts into an
acceptable positive energy. Its lifetime is no longer restricted and so it travels quite
freely within the black hole. Its positive-energy partner travels freely outward, and
such photons make up the Hawking radiation emerging from near the horizon. An
estimate of the temperature of this radiation can be obtained as follows using the
uncertainty principle. The position of a photon emitted from the surface of the black
hole is uncertain to ∼r0 . Accordingly the uncertainty in the photon momentum is
Δp ∼ ℏ/r0 . This momentum can be expressed in terms of a thermal energy kBT at a
temperature T where kB is the Boltzmann constant: Δp ≈ kBT /c . Combining these
two relations for momentum gives
$$\frac{k_BT}{c} \approx \frac{\hbar}{r_0} = \frac{\hbar}{2GM/c^2}.$$
Thus T ≈ ℏc³/(2k_B GM). The exact expression obtained by Hawking only differs by a
factor 4π:
$$T = \frac{\hbar c^3}{8\pi k_B GM}. \tag{8.28}$$
Putting in numbers we have:
$$T = 6 \times 10^{-8}\,(M_\odot/M)\ \mathrm{K}. \tag{8.29}$$
The rate at which a black hole loses energy through Hawking radiation is
$$\frac{d(Mc^2)}{dt} = \sigma T^4\,(\text{surface area}) \propto M^{-2},$$
where σ is the Stefan–Boltzmann constant. The lifetime is consequently proportional
to M³. Small black holes therefore have higher temperatures and radiate their
energy more rapidly than larger black holes. The lifetime of a black hole is given
approximately by
$$\tau \approx \left(\frac{M}{10^{11}\ \mathrm{kg}}\right)^3 \times 10^{10}\ \text{years}. \tag{8.30}$$
Thus any black hole of one solar mass would not have had time to evaporate since
the origin of the universe; however, very small black holes could have formed early
in the life of the universe and subsequently evaporated.
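Equations (8.28) and (8.30) are simple to evaluate. The sketch below is illustrative: the function names are hypothetical, the physical constants are standard values assumed here, and the lifetime uses the approximate scaling of Equation (8.30) rather than an exact calculation.

```python
import math

# Assumed standard values (not quoted in the text).
HBAR = 1.054571817e-34   # J s
C = 2.99792458e8         # m s^-1
G = 6.67430e-11          # m^3 kg^-1 s^-2
K_B = 1.380649e-23       # J K^-1
M_SUN = 1.989e30         # kg


def hawking_temperature(mass_kg):
    """Hawking temperature of Equation (8.28), in kelvin."""
    return HBAR * C**3 / (8.0 * math.pi * K_B * G * mass_kg)


def lifetime_years(mass_kg):
    """Approximate evaporation time from Equation (8.30), in years."""
    return (mass_kg / 1.0e11) ** 3 * 1.0e10


t_sun = hawking_temperature(M_SUN)
print(f"T for one solar mass       : {t_sun:.2e} K")
print(f"lifetime for one solar mass: {lifetime_years(M_SUN):.1e} years")
print(f"lifetime for 1e11 kg       : {lifetime_years(1.0e11):.1e} years")
```

The temperature falls as 1/M while the lifetime rises as M³, which is why only very light black holes could have evaporated by now.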
Another important result proved by Hawking will be used in the next section. His
theorem maintains that in any process the surface area of a black hole cannot shrink.
The thermal nature of Hawking radiation suggests that a black hole could possess
other thermodynamic properties. Beckenstein (1972) was the first to appreciate that a
black hole should have entropy. Noting that neither classical entropy nor the area of a
black hole may grow smaller, he conjectured that the entropy of a black hole was
proportional to the area of its horizon. This idea was incorporated by Bardeen, Carter
and Hawking (Bardeen et al. 1973) in the four laws summarizing black hole
thermodynamics.
These laws are presented here and compared with the corresponding laws of
classical thermodynamics.3
• The first classical law is that the temperature of a body in thermal equilibrium
should be the same throughout the body. In the case of a black hole the
parallel law is that the interface with the rest of nature, the horizon, should be
at a uniform temperature. Locally the gravitational force at the surface of a
black hole is
$$\frac{GM}{r_0^2} = \frac{c^4}{4GM} = \frac{2\pi c}{\hbar}\,k_BT,$$
demonstrating that the Hawking temperature must also be uniform over the
horizon’s surface.
• The second classical law states that energy E is conserved. In the case of a
volume of gas V: dE = T dS − P dV , where S is the entropy, P the pressure.
The equivalent relationship for a black hole can be obtained by considering
the energy change when a mass dM falls into the black hole,
$$dM\,c^2 = T\,dS_{BH} = \frac{\hbar c^3}{8\pi k_B GM}\,dS_{BH}. \tag{8.31}$$
Rearranging and integrating gives
$$M^2 = \frac{\hbar c}{4\pi k_B G}\,S_{BH}. \tag{8.32}$$
3 These laws apply equally to rotating and charged black holes; proofs are then more complicated. I am
grateful to Jeff Forshaw for pointing this out.
Black hole entropy being proportional to surface area seems at best puzzling,
because for anything else entropy is proportional to volume. However it can be
argued that this comes about because there is no shielding from gravity. Thus adding
matter to normal objects affects their internal structure only weakly, but has a
pervasive effect on black holes. ’t Hooft and Susskind proposed that entropy being
proportional to surface area, the holographic principle, is a general property of
gravitating systems.
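As a rough numerical illustration of Equation (8.32), the sketch below evaluates the entropy of a Schwarzschild black hole in units of k_B and checks that it scales with the horizon area rather than the volume. The constants are rounded SI values and the function names are illustrative choices, not taken from the text.

```python
import math

# Rounded SI constants (assumed values, adequate for an order-of-magnitude check)
G = 6.674e-11       # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8         # speed of light, m s^-1
HBAR = 1.0546e-34   # reduced Planck constant, J s
M_SUN = 1.989e30    # solar mass, kg


def entropy_over_kb(mass_kg):
    """S_BH / k_B obtained by rearranging Equation (8.32)."""
    return 4.0 * math.pi * G * mass_kg**2 / (HBAR * C)


def horizon_area(mass_kg):
    """Area 4*pi*r0^2 of the horizon, with r0 = 2GM/c^2."""
    r0 = 2.0 * G * mass_kg / C**2
    return 4.0 * math.pi * r0**2


for n_solar in (1.0, 10.0):
    m = n_solar * M_SUN
    s = entropy_over_kb(m)
    print(f"M = {n_solar:4.1f} Msun: S/k_B = {s:.2e}, "
          f"S/(k_B A) = {s / horizon_area(m):.3e} m^-2")
```

The ratio printed in the last column is the same for both masses, since Equation (8.32) together with r0 = 2GM/c² gives S/(k_B A) = c³/(4Gℏ).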
firewall caused by the violent entropy change when information is lost as matter
enters the black hole (Almheiri et al. 2013) or that the surface is a fuzzball of string
microstates encoding the information (Chowdhury & Mathur 2008). At any event
the resolution of the paradox of information loss is theoretical work in progress.
In the explosion the accumulated elements remaining from the earlier fusion
sequence up to iron are similarly ejected. Any stars, like the Sun, containing such
elements are therefore in at least the second generation of stars.
Figure 8.7. A section taken in the orbital plane of the X-ray binary Cygnus X-1. The red arrow indicates the
flow of material from the blue supergiant HDE 226868 onto the accretion disk surrounding its black hole
partner. Labels in the figure: black hole, X-rays, HDE 226868, accretion disk.
following: that the orbital radii are r_X and r_bg for Cygnus X-1 and the blue
supergiant, respectively, that the normal to the orbital plane is tilted at an angle i
with respect to our line of sight, that τ is the orbital period and that the maximum
velocity of HDE 226868 toward us is v_bg. Then simple application of Kepler’s laws
to the orbital motion determines the mass function:
$$\frac{(m_X \sin i)^3}{(m_X + m_{bg})^2} = \frac{(r_{bg}\sin i)^3\,(2\pi/\tau)^2}{G} = \frac{v_{bg}^3\,\tau}{2\pi G}. \tag{8.38}$$
A lower limit on the mass of Cygnus X-1 can be obtained from this equation by
setting m_bg to 20 M⊙, an approximate lower limit for blue supergiant masses, and i to
90°. Then we insert the measured quantities in the equation: the period of 5.6 days
and the maximum velocity, from Doppler shift data, of 95 km s⁻¹. The emerging
lower limit for m_X is 10 M⊙, well above the upper limit (2.1 M⊙) predicted for the
mass of a neutron star. Orosz et al. (2011) made a fit to all measurements: they find
that the orbits of the two stars are only 0.2 AU across; that the mass of HDE 226868
is 19.2 M⊙; that the black hole has mass 14.8 M⊙, corresponding to a Schwarzschild
radius of 44 km; and that i is 27°. Finally the X-ray emission from Cygnus X-1 flickers
on a timescale of milliseconds: in order to show this degree of coherence the source
must be less than milli-lightseconds across, that is less than 300 km across. This
neatly matches the X-ray source to the expected region of energy release and
excitation, namely the accretion disk. Estimates of the total number of stellar black
holes in our Galaxy lie between 10⁶ and 10⁹.
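The mass-function argument can be followed numerically. The sketch below evaluates the right-hand side of Equation (8.38) from the period and velocity quoted above and then solves for m_X by bisection, taking m_bg = 20 M⊙ and i = 90° as in the text. The constants are rounded, the function names are illustrative, and the value printed depends sensitively on the adopted inputs, so it need not reproduce the rounded figure quoted in the text.

```python
import math

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2 (rounded)
M_SUN = 1.989e30   # solar mass, kg (rounded)
DAY = 86400.0      # seconds per day


def mass_function(v_max, period):
    """Right-hand side of Equation (8.38), v_bg^3 * tau / (2 pi G), in kg."""
    return v_max**3 * period / (2.0 * math.pi * G)


def solve_m_x(f, m_bg, sin_i=1.0, tol=1e-10):
    """Solve (m_X sin i)^3 / (m_X + m_bg)^2 = f for m_X by bisection.

    All masses must share one unit (solar masses below). The left-hand
    side increases monotonically with m_X, so the root is unique.
    """
    def g(m_x):
        return (m_x * sin_i)**3 / (m_x + m_bg)**2 - f

    lo, hi = 0.0, 1.0
    while g(hi) < 0.0:          # expand the bracket until it contains the root
        hi *= 2.0
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


f_solar = mass_function(95e3, 5.6 * DAY) / M_SUN
m_x_min = solve_m_x(f_solar, m_bg=20.0, sin_i=1.0)
print(f"mass function = {f_solar:.3f} Msun")
print(f"lower limit on m_X = {m_x_min:.1f} Msun")
```

Setting sin i below 1 in `solve_m_x` shows how quickly the inferred mass grows as the orbit is tilted toward face-on, which is why i = 90° gives a lower limit.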
The black holes found in X-ray binaries all have masses of order 10 M⊙. In the
following chapter the growing number of mergers of stellar black holes detected by
the gravitational wave detectors LIGO and Virgo is discussed. In these
events the merging black hole masses range from 10 M⊙ to 85 M⊙. In the
event labeled GW190521, black holes of masses 85 M⊙ and 65 M⊙ merged, giving a
black hole of mass 142 M⊙ and releasing an energy 8 M⊙c² in gravitational waves. Apart
from that instance there is only tentative evidence at present for black holes with
masses in the range 100–10⁵ M⊙. This gap bears on the question of how
supermassive black holes were formed at the centers of galaxies.
Proposed processes that would inhibit stars from ending as black holes with
intermediate masses of order 100 M⊙ (apart from mergers) involve the production,
during nuclear burning, of a large flux of energetic photons. Once
the temperature exceeds ∼10⁹ K (0.1 MeV) the fraction of photons capable of
electron–positron pair production (requiring 1.022 MeV) becomes significant. Pair
production reduces the internal photon pressure, allowing further collapse; this
leads to increased pressure and temperature with the cycle repeating and ending
with a final collapse to a black hole. Alternatively if the star is massive enough it
would explode at the end of the first cycle as a supernova, blowing itself apart
without a remnant.
Figure 8.8. Observations of the orbit of S2 made from 1992 to 2018 covering a full orbit around Sgr A*. The
left-hand panel shows the orbit projected on the sky. The upper right-hand panel shows the radial velocity as a
function of time. The panel below is an expanded view of the portion of the orbit nearest to Sgr A*. The cyan
curve is the best fit including special and general relativistic effects. Figure from Abuter et al. (2018).
Reproduced with permission © ESO.
lifetime of the solar system (Maoz 1998). The leaders of these two research teams, Andrea
Ghez and Reinhard Genzel, shared the 2020 Nobel Prize in physics (with Roger Penrose)
for their research contributions demonstrating the existence of a supermassive black hole
at the center of our Galaxy.
Returning to Figure 8.8, the top right-hand panel shows the radial velocity of S2,
and below this an expanded view of the orbit when nearest Sgr A*. There are
deviations of the measured stellar velocities from an overall Newtonian fit, in the
case of S2 amounting to 200 km s⁻¹ at the closest approach to Sgr A*. These
deviations disappear when the calculations are repeated with GR, which is
illustrated in Figure 8.9 for S2. This comparison extends the successful tests of
GR to conditions beyond the reach of solar system tests, to the strong coupling
regime, where GR effects are large.
In parallel the motion of Sgr A* has been measured using long baseline radio
interferometry. Its velocity in the galactic plane is about 7 km s⁻¹, and about 0.4 km s⁻¹
perpendicular to the plane: a thousand times slower than the motion of those nearby
stars used to infer the presence of a black hole. These velocities are consistent with the
motion expected for a 4.4 × 10⁶ M⊙ black hole under the fluctuating gravitational
effect of the stars in the Galaxy. Energy released from material falling into the black
hole is responsible for the intense radio source Sgr A*. Applying Equation (3.52) to the
galactic black hole gives a Schwarzschild radius of 1.30 × 10¹⁰ m or 0.087 AU. Using
Equation (8.29) gives a Hawking temperature of 1.4 × 10⁻¹⁴ K.
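The two numbers quoted for Sgr A* can be checked in a few lines. The sketch below assumes the Schwarzschild radius r0 = 2GM/c² and the Hawking temperature T = ℏc³/(8πGMk_B) that appears in Equation (8.31); the constants are rounded SI values and the function names are illustrative.

```python
import math

G = 6.674e-11       # gravitational constant, m^3 kg^-1 s^-2 (rounded)
C = 2.998e8         # speed of light, m s^-1
HBAR = 1.0546e-34   # reduced Planck constant, J s
K_B = 1.3807e-23    # Boltzmann constant, J K^-1
M_SUN = 1.989e30    # solar mass, kg
AU = 1.496e11       # astronomical unit, m


def schwarzschild_radius(mass_kg):
    """Horizon radius r0 = 2GM/c^2 of a non-rotating black hole, in metres."""
    return 2.0 * G * mass_kg / C**2


def hawking_temperature(mass_kg):
    """Hawking temperature hbar c^3 / (8 pi G M k_B), in kelvin."""
    return HBAR * C**3 / (8.0 * math.pi * G * mass_kg * K_B)


m_sgr_a = 4.4e6 * M_SUN
r0 = schwarzschild_radius(m_sgr_a)
print(f"r0 = {r0:.3e} m = {r0 / AU:.3f} AU")
print(f"T  = {hawking_temperature(m_sgr_a):.2e} K")
```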
Figure 8.9. The departure of the radial velocity of S2 around Sgr A* from the flat gray line showing the
Newtonian prediction. The cusped GR prediction is seen to agree with the data. The vertical axis is dV (km/s).
Figure from Abuter et al. (2018). Reproduced with permission © ESO.
The next nearest black hole of interest, M87*, is located at the center of the giant elliptical
galaxy M87, itself at the center of the ∼1500 member Virgo cluster of galaxies. The M87
galaxy is proportionately larger, comprising ∼2 × 10¹² stars within a radius of 80 kpc.
M87* lies 16.4 Mpc from the Earth, but still close enough for the orbital dynamics of stars
and gas close to the gravitational center of the galaxy to be measured with the HST and
Gemini telescopes.4 Analysis of the dynamics reveals that M87* is a monster compared
to Sgr A*, weighing in at around 6.5 × 10⁹ M⊙. M87* has a correspondingly larger
Schwarzschild radius (2GM/c²) of 1.9 × 10¹³ m or 130 AU, comparable to the size of the
solar system. There is strong emission across the γ-ray and X-ray spectrum from M87*,
the former fluctuating over times as short as days. In order that radiation from the whole
source may show this degree of temporal correlation the source must be at most light-
days across, consistent with the inferred Schwarzschild radius. We found in Section 8.2
that a black hole is encircled by possible photon orbits terminating in the closest, an
unstable circular orbit. This means that though we receive copious radiation from
material on its way into the black hole, none will come from within this orbit. Therefore
the image of a black hole should be a central shadow whose edge is defined by the
innermost, unstable photon orbit. It was shown that this orbit has a radius
3√3 r0/2 = √27 GM_BH/c², where M_BH is the black hole mass. The corresponding light rings for Sgr A* and for
M87* subtend similar opening angles on Earth, ∼40 μas: the larger mass of M87* is
compensated in the case of Sgr A* by the shorter distance. The Event Horizon Telescope
Collaboration (The Event Horizon Telescope Collaboration et al. 2019) set out to
observe the light ring of M87* using radio telescopes to achieve the required
microarcsecond angular resolution. They used the shortest wavelength, 1.3 mm, and an array of
eight telescopes whose sites enclose an area near to one hemisphere of the Earth. The
angular resolution is thus ∼1.3 mm/10,000 km, that is ∼10⁻¹⁰ rad or 20 μas: just enough!
Digitized images were recorded on memory at each site and time stamped with GPS.
Later the interference pattern was produced at a central site by synchronizing, playing
through and superposing the contents of the hard drives carrying the individual signals.5
This produced the world scale interferometer’s image of M87*. Figure 8.10 shows the
results. On the left the image of M87* reconstructed from data; at the center the predicted
image with perfect resolution showing the sharp photon ring; on the right the outcome of
convolving this predicted image with the expected resolution. This is the first image of a
black hole and as such adds the stamp of direct observation to the wealth of indirect
evidence for the existence of black holes. In 2022 the EHT collaboration imaged the black
hole at the center of our Galaxy.
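The angular scales in this argument can be estimated directly. The sketch below interprets the quoted angle as the angular diameter of the ring of radius √27 GM/c², uses the masses and the M87* distance given in the text, and assumes a distance of 8.2 kpc to Sgr A* (a value not stated in this section). The diffraction estimate is simply wavelength divided by baseline; constants are rounded and names are illustrative.

```python
import math

G = 6.674e-11       # gravitational constant, m^3 kg^-1 s^-2 (rounded)
C = 2.998e8         # speed of light, m s^-1
M_SUN = 1.989e30    # solar mass, kg
PC = 3.0857e16      # parsec, m
RAD_TO_MUAS = 180.0 / math.pi * 3600.0 * 1e6   # radians to microarcseconds


def shadow_diameter_muas(mass_solar, distance_pc):
    """Angular diameter of the ring of radius sqrt(27) GM/c^2, in microarcseconds."""
    radius_m = math.sqrt(27.0) * G * mass_solar * M_SUN / C**2
    return 2.0 * radius_m / (distance_pc * PC) * RAD_TO_MUAS


def resolution_muas(wavelength_m, baseline_m):
    """Order-of-magnitude interferometer resolution, wavelength / baseline."""
    return wavelength_m / baseline_m * RAD_TO_MUAS


print(f"M87*  : {shadow_diameter_muas(6.5e9, 16.4e6):.0f} muas")
# The 8.2 kpc distance to Sgr A* is an assumed value, not taken from the text.
print(f"Sgr A*: {shadow_diameter_muas(4.4e6, 8.2e3):.0f} muas")
print(f"EHT resolution: {resolution_muas(1.3e-3, 1.0e7):.0f} muas")
```

Both rings come out at a few tens of microarcseconds, comparable to the estimated resolution, which is the sense of "just enough" above.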
4 These are twin mountain-based telescopes at Maunakea in Hawaii and Cerro Pachon in Chile with detection
from the infrared to the ultraviolet. Both have 8 m diameter aperture primary mirrors.
5 Recall that radio telescopes respond to the electromagnetic field amplitude, not intensity.
Figure 8.10. Left: the image of the black hole M87* recorded by the EHT; Center: the predicted image with
perfect resolution; Right: the prediction taking account of the interferometer’s intrinsic resolution. Figure from
The Event Horizon Telescope Collaboration et al. (2019). Reproduced with permission under CC-BY-SA-3.0.
the quasars, were detected using radio telescopes as unresolved bright point-like
sources. Their intensity varied rapidly, which showed that they were surprisingly
compact, only as large as the solar system. In 1963 Maarten Schmidt discovered a
matching optical source for the brightest quasar 3C273, and this optical source was
used to determine the redshift. It turned out to have a high redshift (for that era) of
z = 0.16 and hence 3C273 lies at a distance from the Earth cz/H0 = 690 Mpc. Using
this distance as input the luminosity calculated for 3C273 had the astonishingly large
value 2.2 × 10⁴⁰ W. For comparison, our own typical galaxy of around 2 × 10¹¹
stars radiates just 5 × 10³⁶ W. Later on, after the introduction of X-ray telescopes on
rockets and satellites, space-based observations revealed the existence of narrow jets
of energetic material, mainly relativistic electrons, emerging from such quasars.
Often there are two back-to-back jets from individual quasars. Figure 8.11 is a
composite image of the jet from 3C273 (moving rightward) starting 12 arcsec and
ending 22 arcsec from 3C273. This figure was constructed from images taken with
detectors sensitive to parts of the spectrum: from X-rays at the left to infrared
radiation at the right. The further the jet travels from its source, the more it is
decelerated by intergalactic gas, and hence the cooler the radiation it emits. This
radiation is plane polarized as expected for synchrotron radiation emitted by
relativistic electrons. 3C273 is at an angular distance6 of 590 Mpc; its jet subtends
22 arcsec so the jet length is 60 kpc (196 klyr). The average velocity of matter in the jet
is well below c, so we can infer that 3C273 has been ejecting material for well over
200 kyr. Jets from some other AGNs extend to several 100 kpc.
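The jet length follows from the small-angle relation, length = angular distance × angle. The sketch below repeats that arithmetic with the values quoted above; the conversion factors are rounded and the function name is illustrative. The text rounds the result to 60 kpc.

```python
import math

ARCSEC = math.pi / (180.0 * 3600.0)   # one arcsecond in radians
LYR_PER_PC = 3.2616                   # light-years per parsec (rounded)


def transverse_size_pc(angular_distance_pc, angle_arcsec):
    """Small-angle estimate of the projected size, in parsecs."""
    return angular_distance_pc * angle_arcsec * ARCSEC


length_pc = transverse_size_pc(590e6, 22.0)
length_lyr = length_pc * LYR_PER_PC
print(f"jet length = {length_pc / 1e3:.0f} kpc = {length_lyr / 1e3:.0f} klyr")
# Material moving at speed c would need length_lyr years to cover this
# distance, so slower material has been travelling for longer than that.
print(f"minimum ejection time = {length_lyr / 1e3:.0f} kyr")
```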
6 In Chapter 11 we will learn that the angular distance is smaller than the proper distance 690 Mpc by a factor
(1 + z).
Figure 8.11. The image of one jet from 3C273. Blue identifies the 0.4–6 keV emission detected with the
Chandra X-ray space telescope; green identifies UV emission detected by the HST; red identifies the 3.6 μm
radiation detected by the Spitzer space telescope; the contours are drawn for 2 cm radio emission observed with
the VLA. The latter is truncated on H2, the strongest source. Figure 7 from Uchiyama et al. (2006). Courtesy
Professor C. Megan Urry for the copyright holders.
From its luminosity of 2.2 × 10⁴⁰ W it follows that over a million years 3C273
would radiate around 0.7 × 10⁵⁴ J or 3.9 × 10⁶ M⊙c². The mass of the source has been
measured by the GRAVITY collaboration (Sturm et al. 2018) to be 3 × 10⁸ M⊙.
Converting hydrogen to helium releases 0.7% of the mass as free energy, which in the
case of 3C273 would give an energy release of 2.1 × 10⁶ M⊙c² if all the mass were
converted. This upper estimate is still well short of the energy radiated by the quasar.
On the other hand, we have seen that when matter falls into a black hole between 5%
and 42% of the initial mass is released as energy, depending on the angular momentum
of the black hole. Evidently quasars are powered by black holes. On its way inbound
matter ultimately entering the black hole will carry angular momentum with it. This
has two effects: it provides the spin that gives the jets; and it makes it certain that the
black holes in the universe are predominantly Kerr black holes. Angular momentum,
lost by matter falling into the black hole, and intense magnetic fields both play some
part in spinning up the AGN and launching the jet; though the details are yet to be
worked out.
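The energy budget in this argument is simple arithmetic and can be laid out explicitly. The sketch below uses the luminosity, source mass, fusion efficiency and accretion efficiencies quoted in the text; the constants are rounded and the variable names are illustrative.

```python
C = 2.998e8           # speed of light, m s^-1 (rounded)
M_SUN = 1.989e30      # solar mass, kg
YEAR = 3.156e7        # seconds per year

LUMINOSITY = 2.2e40   # W, luminosity of 3C273 quoted in the text
SOURCE_MASS = 3e8     # solar masses, GRAVITY measurement quoted in the text

# Energy radiated over a million years, in joules and in units of Msun c^2
radiated_j = LUMINOSITY * 1e6 * YEAR
radiated_msun_c2 = radiated_j / (M_SUN * C**2)

# Energy available if the whole source mass were processed
fusion_msun_c2 = 0.007 * SOURCE_MASS          # hydrogen to helium, 0.7%
accretion_low_msun_c2 = 0.05 * SOURCE_MASS    # infall into a black hole, 5%
accretion_high_msun_c2 = 0.42 * SOURCE_MASS   # infall into a black hole, 42%

print(f"radiated in 1 Myr : {radiated_j:.2e} J = {radiated_msun_c2:.2e} Msun c^2")
print(f"fusion (0.7%)     : {fusion_msun_c2:.2e} Msun c^2")
print(f"accretion (5-42%) : {accretion_low_msun_c2:.2e} to "
      f"{accretion_high_msun_c2:.2e} Msun c^2")
```

Even the lower accretion efficiency comfortably exceeds the million-year output, whereas fusion of the whole mass does not.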
Figure 8.12 sketches the principal features of an AGN. The gravitational
attraction of the black hole and the requirement to conserve angular momentum
build a rotating accretion disk feeding the black hole. In turn this disk is fed from a
torus of gas and dust. Around the disk are clouds of matter in violent motion and
radiating Doppler broadened atomic lines. Depending on whether we are viewing
the AGN head-on along the spin axis, or sideways-on through the accretion disk, or
otherwise, AGNs exhibit very different features. For this reason it took decades to
recognize their common origin. Seen from the side the active core is hidden by the
torus which re-radiates in the radio spectrum. As the viewing angle steepens toward
the jet the energetic core comes into sight, and such objects are classed as Seyfert
galaxies. Quasars are AGNs seen head-on either along or near a jet. Typically a
quasar’s total energy flux exceeds that from the parent galaxy by a factor as large as
10⁴. Thanks to their huge luminosity quasars are sources visible to us now from the
earliest times in the life of the universe. Currently the most distant quasar,
J1342+0928, with redshift 7.54, is 4.2 Gpc away. The radiation we are receiving now from
this quasar was therefore emitted within a billion years after the Big Bang. Once an
AGN has swept up all the accessible local gas, dust, and stars, it becomes quiescent
like M87* and Sgr A*. Figure 8.13 shows the plot of the quasar comoving density as
a function of redshift. At redshift ∼6 they are sparse, only of order 1 Gpc⁻³.
Figure 8.12. A sketch, not to scale, of a section through an AGN; the black dot marks the location of the black
hole. Labels in the figure: ~30 pc; ~0.3 pc; accretion disk; hot gas clouds feeding the black hole; plasma jet
extending ~100 kpc.
Figure 8.13. The density of luminous quasars in the early universe. The reference volume is the comoving
volume. Making the extrapolation indicated by a broken line, there would only be one luminous quasar
powered by a black hole of a billion solar masses or more in the observable universe at redshift of 9. Figure 1
(a) reproduced from Fan et al. (2019) with permission under CC-BY 4.0.
8.12 Exercises
1. Calculate the radius of the horizon of a neutral non-rotating black hole
whose mass is 10⁸ M⊙. Show that the velocity of a body moving in the
smallest stable orbit around this Schwarzschild black hole is c/2. Hence
calculate the period measured locally (proper time) and that measured on a
clock remote from the black hole (coordinate time).
2. If someone is in free fall radially toward a black hole, which has a mass equal
to 10 M⊙, he would feel a lateral tidal force squeezing him. Calculate the
radial distance from the center of the black hole at which the tidal
acceleration has grown to 400 m s⁻². At this point our traveler will be
crushed. We assume that the traveler starts from rest at a place remote from
the black hole.
3. What is the Schwarzschild radius of a galactic black hole of 10⁸ M⊙? How
much mass would need to be consumed by the black hole per annum in order
to generate 10³⁹ W?
4. Show that when two black holes merge there is an increase in Bekenstein
entropy. According to Bekenstein how many bits does a Schwarzschild
black hole of mass 20 M⊙ contain? What is the temperature of the Hawking
radiation from it?
5. An astronaut falls under gravity only toward a black hole of mass 10⁴ M⊙
starting from rest at effectively infinite distance. How long does it take
according to his onboard clock to travel from being at twice the
Schwarzschild radius from the black hole center until he crosses the horizon?
6. Unruh (1976) used quantum field theory to argue that an observer moving at
constant acceleration a with respect to an inertial frame would perceive black
body radiation at a temperature
$$T = \frac{\hbar a}{2\pi k_B c}. \tag{8.39}$$
Calculate the Unruh temperature expected for an observer remaining stationary
at the horizon of a black hole. Is this consistent with the Hawking
temperature?
Further Reading
Susskind L and Lindesay J 2005 The Holographic Universe: An Introduction to
Black Holes, Information and the String Revolution (Singapore: World
Scientific) gives a less technical account of this modern approach to
cosmology.
Haardt F, Gorini V, Moschella U, Treves A and Colpi M (editors) 2016
Astrophysical Black Holes: Lecture Notes in Physics (Berlin: Springer). This
text is more wide ranging.
Blundell K 2015 Black Holes: A Very Short Introduction. A short lively account.
References
Abuter, R., Amorim, A., Anugu, N., et al. 2018, A&A, 615, L15
Almheiri, A., Marolf, D., Polchinski, J., & Sully, J. 2013, JHEP, 2013, 18
Bardeen, J. M., Carter, B., & Hawking, S. W. 1973, CMaPh, 31, 161
Bekenstein, J. D. 1972, NCimL, 4, 99
Chandrasekhar, S. 1931, ApJ, 74, 81
Chowdhury, B. D., & Mathur, S. D. 2008, CQGra, 25, 225021
Eddington, A. S. 1924, Natur, 113, 192
Fan, X., Barth, A., Banados, E., et al. 2019, BAAS, 51, 121
Finkelstein, D. 1958, PhRv, 110, 965
Hawking, S. W. 1974, Natur, 248, 30
Hawking, S. W., & Penrose, R. 1970, RSPSA, 314, 529
Maoz, E. 1998, ApJ, 494, L181
Michell, J. 1784, RSPT, 74, 35
Orosz, J. A., McClintock, J. E., Aufdenberg, J. P., et al. 2011, ApJ, 742, 84
Penrose, R. 1965, PhRvL, 14, 57
Sturm, E., Dexter, J., Pfuhl, O., et al. 2018, Natur, 563, 657
The Event Horizon Telescope Collaboration, Akiyama, K., Alberdi, A., et al. 2019, ApJ, 875, L5
Uchiyama, Y., Urry, C. M., Cheung, C., et al. 2006, ApJ, 648, 910
Unruh, W. G. 1976, PhRvD, 14, 870
Ian R Kenyon
Chapter 9
The Discovery and Study of Gravitational Waves
limit for measurement precision imposed by quantum mechanics. The LIGO team
leaders, Barry Barish, Kip Thorne, and Rainer Weiss, were awarded the Nobel Prize
in physics in 2017. In the following sections gravitational waves will be
discussed and quantified, followed by their detection.
where ημν is the Minkowski metric and all the components hμν are very much less
than unity. gμν and ημν are symmetric tensors, and hence hμν is also a symmetric
tensor. Expanding Equation (9.1) to first order in hμν gives
$$h^{\alpha}{}_{\nu,\mu\alpha} - h_{\nu\mu,\alpha}{}^{\alpha} + h^{\alpha}{}_{\mu,\alpha\nu} - h^{\alpha}{}_{\alpha,\mu\nu} = 0. \tag{9.3}$$
A symmetric tensor like hμν has in general ten independent components, but
gravitational waves are quadrupole, with only two independent polarization states.
This is similar to the situation with electromagnetism.1 The same approach is used in
both cases: gauge conditions are applied that restrict the solutions of the wave
equation, here Equation (9.3), to those of physical importance. The first, the
traceless condition, is to set the fourth term to zero; the next condition is to require
the divergence to vanish, which eliminates the first and third terms. Applying these
conditions together reduces Equation (9.3) to a wave equation
$$h_{\mu\nu,\alpha}{}^{\alpha} = \frac{\partial^2 h_{\mu\nu}}{\partial x_\alpha\,\partial x^\alpha} = 0. \tag{9.4}$$
A further and final requirement can be made on the coordinate choice
$$h_{\alpha 0} = 0. \tag{9.5}$$
The eight constraints that have been applied have left just two independent
components of hμν. These solutions appear below as the two independent polarization
states presented in Equations (9.8) and (9.9). Writing Equation (9.4) in detail
gives
1 The underlying reason is that the quanta of electromagnetism and gravitational fields are massless. For
massive particles all the degrees of freedom are necessary to describe the particle state.
$$A_{\mu\nu} = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & A_{11} & A_{12} & 0 \\ 0 & A_{12} & -A_{11} & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}. \tag{9.7}$$
The choice of gauge has made the wave amplitudes Aμν both transverse and traceless;
therefore this is called the transverse-traceless gauge. Quantities defined in this gauge
carry a superscript TT: this will only be inserted from time to time in what follows as
a reminder. The analysis for weak gravitational waves, and slow-moving weak (near
Newtonian) sources can be carried through using transverse-traceless gravitational
wave solutions of Einstein’s equation. The general solution of the form of Equation
(9.6) is made up of a linear combination of the two orthogonal states
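$$h_{\mu\nu} = h_+ (e_+)_{\mu\nu} \cos(\omega t - kz) \tag{9.8}$$
and
$$h_{\mu\nu} = h_\times (e_\times)_{\mu\nu} \cos(\omega t - kz), \tag{9.9}$$
where
$$(e_+)_{\mu\nu} = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \tag{9.10}$$
$$(e_\times)_{\mu\nu} = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \tag{9.11}$$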
and where h+ = A11 and h× = A12 . e+ and e× are two independent polarization states
of gravitational radiation. Their tensor form implies a more complicated polarization
than the linear polarization met with in the case of light.
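The properties of the two polarization tensors in Equations (9.10) and (9.11) can be verified mechanically. The sketch below checks that both are traceless and transverse for propagation along z, that they are orthogonal to one another, and that rotating e+ by 45° about the z axis yields e×. The helper names are illustrative, and the index ordering (t, x, y, z) is an assumption matching the layout of the matrices.

```python
import math

# Polarization tensors from Equations (9.10) and (9.11); index order (t, x, y, z)
E_PLUS = [[0, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, 0]]
E_CROSS = [[0, 0, 0, 0], [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 0]]


def contract(a, b):
    """Full contraction a_mn b_mn. The time components vanish here, so the
    signs of the Minkowski metric do not affect the result."""
    return sum(a[m][n] * b[m][n] for m in range(4) for n in range(4))


def spatial_trace(a):
    return a[1][1] + a[2][2] + a[3][3]


def is_transverse(a):
    """True if every component with a t or z index vanishes."""
    return all(a[m][n] == 0 and a[n][m] == 0 for m in (0, 3) for n in range(4))


def rotate_about_z(a, angle):
    """Return R a R^T for a rotation by `angle` about the z axis."""
    c, s = math.cos(angle), math.sin(angle)
    r = [[1, 0, 0, 0], [0, c, -s, 0], [0, s, c, 0], [0, 0, 0, 1]]
    return [[sum(r[m][p] * a[p][q] * r[n][q] for p in range(4) for q in range(4))
             for n in range(4)] for m in range(4)]


print("traces:", spatial_trace(E_PLUS), spatial_trace(E_CROSS))
print("transverse:", is_transverse(E_PLUS), is_transverse(E_CROSS))
print("e+ . ex =", contract(E_PLUS, E_CROSS))
rotated = rotate_about_z(E_PLUS, math.pi / 4)
print("e+ rotated by 45 degrees (x, y block):",
      [[round(rotated[m][n], 12) for n in (1, 2)] for m in (1, 2)])
```

A rotation by 45° rather than 90° connects the two states, which is one way of seeing that the polarization differs from the linear polarization of light.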
$$y = y_0 + \tfrac{1}{2}\,x_0 h_\times \cos\omega t,$$
which can be manipulated to give
$$x + y = \left(1 + \frac{h_\times}{2}\cos\omega t\right)(x_0 + y_0)$$
and
$$x - y = \left(1 - \frac{h_\times}{2}\cos\omega t\right)(x_0 - y_0).$$
The complete cycle is drawn in the right-hand column of Figure 9.1. It is not possible
to construct the e+ pattern from the e× pattern or vice versa; they are orthogonal
Figure 9.1. The effect of gravitational waves on a circle of test masses followed over one cycle. The observer is
looking toward the source. Two orthogonal states of quadrupole radiation are illustrated.
Figure 9.2. The effect of gravitational waves on a circle of test masses followed over one cycle. The observer is
looking toward the source. Right- and left-handed circularly polarized patterns are illustrated.
Figure 9.3. The patterns of tidal acceleration for the two orthogonal states of polarization, e+ and e× , when the
phase angle is zero.
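$$t_{\mu\nu} = -\frac{c^4}{8\pi G}\,G^{(2)}_{\mu\nu}, \tag{9.14}$$
where $G^{(2)}_{\mu\nu}$ only contains terms quadratic in $h_{\alpha\beta}$. This expression is evaluated in
Appendix D. The energy flow per unit area per unit time is
$$F = \frac{c^3}{16\pi G}\langle \dot h_+^2 + \dot h_\times^2 \rangle = \frac{c^3}{32\pi G}\langle \dot h_{ij}\,\dot h_{ij} \rangle, \tag{9.15}$$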
where each dot over a quantity indicates taking a time derivative. It is misleading to
assign energy to a point in spacetime because only the relative displacements are
meaningful. Furthermore it is not even possible to specify whether the energy is in
the peaks or valleys of the waves. For these reasons Equation (9.15) only applies
when averages are taken over several cycles and wavelengths. This averaging is
indicated by the angular brackets.
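To give a feel for the magnitudes in Equation (9.15), the sketch below evaluates the flux for a monochromatic wave h = h0 cos(ωt), taking the cycle average numerically. The strain amplitude 10⁻²¹ and frequency 100 Hz are illustrative assumptions, not values from this section; constants are rounded and function names are illustrative.

```python
import math

G = 6.674e-11   # gravitational constant, m^3 kg^-1 s^-2 (rounded)
C = 2.998e8     # speed of light, m s^-1


def mean_square_rate(h0, omega, samples=10000):
    """Average of (dh/dt)^2 over one cycle for h = h0 cos(omega t)."""
    period = 2.0 * math.pi / omega
    total = 0.0
    for k in range(samples):
        t = period * k / samples
        h_dot = -h0 * omega * math.sin(omega * t)
        total += h_dot**2
    return total / samples


def energy_flux(h_plus, h_cross, freq_hz):
    """Equation (9.15): F = c^3 / (16 pi G) * <hdot_+^2 + hdot_x^2>, in W m^-2."""
    omega = 2.0 * math.pi * freq_hz
    average = mean_square_rate(h_plus, omega) + mean_square_rate(h_cross, omega)
    return C**3 / (16.0 * math.pi * G) * average


# Assumed illustrative inputs: purely plus-polarized wave, h0 = 1e-21 at 100 Hz
print(f"F = {energy_flux(1e-21, 0.0, 100.0):.2e} W m^-2")
```

The large prefactor c³/G means that even a strain of this tiny size corresponds to a flux of order milliwatts per square metre.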
Solutions of Einstein’s equation will be sought for sources that are nearly Newtonian,
which means first that the curvature produced by the source and hence the strain are
small, and second that the velocity v of the material within the source is very much less
than c: GM/rc² ≪ 1 and v²/c² ≪ 1. A linear approximation can be made to the metric
in these circumstances. It turns out that the intensity of gravitational radiation
predicted by this linearized theory generally only differs by small numerical factors
from the results of more exact calculations (Davis et al. 1971). In Appendix E
the contribution due to quadrupole motion of the source is evaluated in Equation (E.4).
At a distance r from the source
$$h_{ij}(t) = \frac{2G}{rc^4}\,\ddot I^{TT}_{ij}\!\left(t - \frac{r}{c}\right), \tag{9.16}$$
where $I^{TT}_{ij}$ is the transverse-traceless part of the quadrupole moment of the source
(see Equation (E.3)). Equation (9.16) is a retarded solution: gravitational disturbances
propagate at a velocity c, and hence the amplitude at r at time t is determined by
the source behavior at an earlier time t − r/c. This argument will be omitted from
here on in order to simplify the presentation. The energy flow in the gravitational
wave is given by Equation (E.6):
$$F = \frac{G}{8\pi r^2 c^5}\langle \dddot I^{TT}_{ij}\,\dddot I^{TT}_{ij} \rangle.$$
As before the angular brackets indicate the expectation value averaged over several
cycles. The total energy flow through a sphere at distance r from the source is called
its luminosity L. In Appendix E it is shown that
$$L = \left(\frac{G}{5c^5}\right)\langle \dddot I_{ij}\,\dddot I_{ij} \rangle, \tag{9.17}$$
where $I_{ij}$ is the reduced quadrupole moment of the source:
companion so it is likely to be another neutron star. Both the pulsar and its
companion have masses close to 1.4 M⊙ and their orbits would almost fit inside the
Sun, enhancing GR effects far beyond those observed in the solar system. These GR
effects can be measured remotely because the pulsar itself is a clock as precise as any
atomic clock. The key observation made by Hulse and Taylor was that the orbits
decay at a rate that is precisely that expected due to the loss of energy in
gravitational radiation from the pair. Taylor and colleagues have timed the pulses
over four decades and determined the orbital behavior in fine detail. In Figure 9.4
the data from the orbital collapse are compared to the prediction using GR. The
predicted decay rate of the orbital period τ, is
$$\frac{d\tau}{dt} = -2.40263(5) \times 10^{-12}\ \mathrm{s\,s^{-1}}.$$
The fit to the data gives a rate 0.9983 ± 0.0016 times the GR prediction. Here we
follow the steps leading to this prediction taking, for simplicity, the case that the
binary pair have equal mass.
The quadrupole moment of the binary pair has an xx-component (see Exercises)
$$I_{xx} = 2Ma^2\cos^2\omega t = Ma^2(1 + \cos 2\omega t),$$
where ω is the angular frequency of rotation. Similarly
$$I_{yy} = Ma^2(1 - \cos 2\omega t).$$
Figure 9.4. Orbital decay of PSR 1913+16 with time. The curve represents the orbital phase shift expected due
to the emission of gravitational waves. The points are data with error bars that are smaller than the line width.
Figure 3 from Weisberg and Huang (2016). Courtesy of Professor Weisberg on behalf of the copyright holders.
4a 4 ⎝ GM ⎠
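Thus
$$\frac{dE}{E} = \frac{2}{3}\frac{d\omega}{\omega} = -\frac{2}{3}\frac{d\tau}{\tau}.$$
Then the observable quantity, the orbital decay rate, is
$$\frac{d\tau/dt}{\tau} = -\frac{3}{2}\frac{dE/dt}{E} = \frac{3L}{2E} = -\frac{768}{5}\frac{\omega^6 a^5}{c^5}.$$
Substituting for ω gives, finally,
$$\frac{d\tau/dt}{\tau} = -\frac{12}{5}\frac{G^3M^3}{c^5a^4}. \tag{9.19}$$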
Press and Thorne (1972) calculated the correction required for the case of elliptical
orbits.2 The precision with which the GR prediction has tracked the decay rate of
PSR 1913+16’s orbit for forty years is convincing, but indirect, evidence for the
existence of gravitational waves. In eight other examples of binary pulsars studied by
Weisberg and Huang similar agreement was found between the GR prediction and
the observed orbital decay rates. These very precise quantitative comparisons with
GR nicely complement the direct observations of gravitational waves by the LIGO/
Virgo collaboration, which is the next topic.
2 With eccentricity e the right-hand side of Equation (9.19) is then multiplied by a factor
$$\frac{1 + 73e^2/24 + 37e^4/96}{(1 - e^2)^{7/2}}.$$
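Equation (9.19) and the eccentricity factor can be combined into a rough estimate of the decay rate for an equal-mass pair of 1.4 M⊙ stars. The sketch below assumes the Kepler relation ω² = GM/(4a³) for two masses M on a common circle of radius a, which is the relation implied by the step from ω⁶a⁵ to G³M³/a⁴ above. The orbital period of 27907 s and eccentricity 0.617 for PSR 1913+16 are assumed values not stated in this section; constants are rounded and function names are illustrative.

```python
import math

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2 (rounded)
C = 2.998e8        # speed of light, m s^-1
M_SUN = 1.989e30   # solar mass, kg


def eccentricity_factor(e):
    """Enhancement factor for elliptical orbits quoted in the footnote."""
    return (1.0 + 73.0 * e**2 / 24.0 + 37.0 * e**4 / 96.0) / (1.0 - e**2)**3.5


def period_decay_rate(mass_kg, period_s, eccentricity=0.0):
    """d(tau)/dt from Equation (9.19) for two equal masses, dimensionless (s/s)."""
    omega = 2.0 * math.pi / period_s
    # Assumed Kepler relation for equal masses M at radius a: omega^2 = GM / (4 a^3)
    a = (G * mass_kg / (4.0 * omega**2)) ** (1.0 / 3.0)
    circular = -(12.0 / 5.0) * G**3 * mass_kg**3 / (C**5 * a**4) * period_s
    return circular * eccentricity_factor(eccentricity)


# Assumed inputs for PSR 1913+16: period 27907 s, eccentricity 0.617
rate = period_decay_rate(1.4 * M_SUN, 27907.0, eccentricity=0.617)
print(f"eccentricity factor = {eccentricity_factor(0.617):.1f}")
print(f"d(tau)/dt = {rate:.2e} s/s")
```

Under these simplifying assumptions the estimate lands within a few percent of the quoted prediction; the eccentricity factor supplies most of the rate, raising the circular-orbit value by roughly an order of magnitude.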
3 The material in this and the succeeding sections is substantially modified from similar material that appeared
in “Quantum 20/20: Fundamentals, Entanglement, Gauge Fields, Condensates and Topology” published by
Oxford Univ. Press in 2019 (Kenyon 2019). The author is indebted to Oxford Univ. Press for permission.
Figure 9.5. Strain versus time plot from the first gravitational wave signal observed by the advanced LIGO at
Hanford (H1) and Livingston (L1). LIGO Open Science Center at https://losc.ligo.org/events/GW150914.
For use later, we use Kepler’s law to obtain expressions for r and ṙ:
$$r = \left[\frac{GM}{\omega_k^2}\right]^{1/3}, \tag{9.23}$$
$$\dot r = -\frac{2}{3}\left[\frac{r\,\dot\omega_k}{\omega_k}\right]. \tag{9.24}$$
Figure 9.6. Estimated strain amplitude from GW150914 made with numerical relativity models of the black
hole behavior as the holes coalesce. In the lower panel the separation of the black holes is given in units of the
Schwarzschild radius 2GM/c² and the relative velocity is divided by c. LIGO Open Science Center at
https://losc.ligo.org/events/GW150914. The work is reported by the LIGO Scientific Collaboration and Virgo
Collaboration (Abbott et al. 2016). Courtesy LIGO Collaboration. This figure is Figure 11.5 taken with
permission from Kenyon (2019) published by Oxford Univ. Press in 2019.
Figure 9.7. Sketch of components of the advanced LIGO detectors. The mirrors TM1, TM2, TM3, and TM4
form Fabry–Perot etalons in the arms of the Michelson interferometer. PRM is the power recycling mirror,
SRM the signal recycling mirror. BS is the beam splitter. This figure is Figure 11.6 taken with permission from
Kenyon (2019) published by Oxford Univ. Press in 2019.
mirror, a second similarly massive mirror placed close to the beam splitter. This
second mirror converts each arm into a Fabry–Perot cavity and the laser is tuned to
a cavity resonance. Radiation in resonance with the cavity passes to and fro about
300 times before its intensity falls significantly. As a result the phase resolution of the
interferometer is improved by a similar factor. It is arranged that in the absence of
gravitational waves the reflected beams arrive out of phase at the photodetector, in
other words it views a dark fringe. An important consequence is that almost all the
radiation entering through the input face of the beam splitter ends up exiting
through this same face. In general the electric field at the detector is the real part of
$$ E = E_{in}[\exp(i\phi_x) - \exp(i\phi_y)]/2, \tag{9.30} $$
where ϕx,y are the phase changes in the two arms. When the interferometer is set on a
dark fringe a gravitational wave traveling perpendicular to the plane of the
interferometer produces an electric field at the detector
$$ E_{GW} = E_{in}[\exp(i\phi_{GW}) - \exp(-i\phi_{GW})]/2 \approx i\phi_{GW}E_{in}, \tag{9.31} $$
where ϕGW is the oscillating phase difference induced by the gravitational wave. This
signal is so small that in order to make a useful measurement a larger DC field is
added to it. This is achieved by offsetting the output away from the dark fringe by a
small angle in the absence of any gravitational wave. Then the total signal when a
gravitational wave is detected is
$$ E_T = E_{GW} + E_{DC}. \tag{9.32} $$
The power is then
$$ P \propto E_{DC}^2 + 2E_{DC}E_{GW} = E_{DC}[E_{DC} + 2\phi_{GW}E_{in}], $$
where the negligible term $E_{GW}^2$ has been ignored. This power is converted to a
current in the photodetector and from the current the oscillating phase ϕGW is
extracted.
4
This matter is pursued more fully by Braginsky et al. (2003).
Next consider the QRPN. The force due to each photon striking a mirror of mass
M during the measurement time τ is 2ℏk /τ . With an average of N photons arriving at
the mirror in time τ the fluctuation on this force is
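$$ \delta F = 2\sqrt{N}\,\hbar k/\tau. \tag{9.36} $$
When the freely suspended mirror responds to a gravitational wave of angular
frequency $\Omega_{GW}$ its equation of motion is
$$ F\cos(\Omega_{GW}t) = M\,d^2x/dt^2 = -M\Omega_{GW}^2\,x\cos(\Omega_{GW}t). \tag{9.37} $$
Then the arm length fluctuation due to the radiation pressure fluctuation is
$$ \delta x_r = \frac{\delta F}{M\Omega_{GW}^2} = \frac{2\sqrt{N}\,\hbar k}{\tau M\Omega_{GW}^2}. \tag{9.38} $$
The corresponding spectral noise intensity is
$$ S_r = (\delta x_r)^2\tau = N\tau\left[\frac{2\hbar k}{\tau M\Omega_{GW}^2}\right]^2 = \left(\frac{P\tau^2}{\hbar\omega}\right)\left[\frac{2\hbar k}{\tau M\Omega_{GW}^2}\right]^2 = \frac{8\pi\hbar P}{\lambda c}\Big/\left(M^2\Omega_{GW}^4\right). \tag{9.39} $$
Expressed as a strain this gives
$$ h_r = \frac{\sqrt{S_r}}{L} = \frac{1}{LM\Omega_{GW}^2}\sqrt{\frac{8\pi\hbar P}{\lambda c}}. \tag{9.40} $$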
Notice that the limiting strains due to the shot noise, Equation (9.35), and the
QRPN, Equation (9.40), have inverted dependences on the laser power P. Thus the
minimum detectable strain is obtained when these two contributions are equal:
$$ \frac{\lambda\hbar c}{2\pi P} = \frac{8\pi\hbar P}{\lambda c M^2\Omega_{GW}^4}, \tag{9.41} $$
hence we deduce that there is an optimum optical power
$$ P = \frac{\lambda c M\Omega_{GW}^2}{4\pi}. \tag{9.42} $$
Using this value for P, the minimum total spectral noise power density is
$$ S = S_s + S_r \geqslant \frac{4\hbar}{M\Omega_{GW}^2}. \tag{9.43} $$
Figure 9.8. Limit on detectable strain with a simple Michelson interferometer having 10 km arms, 10 kg
mirrors, and laser wavelength 1064 nm. Examples are shown for 50 kW and 500 kW circulating power.
We have here the standard quantum limit (SQL). Putting in the parameters of a
simple Michelson interferometer with the dimensions similar but not identical to
LIGO, the limiting strain detectable is plotted in Figure 9.8. The masses of the
mirrors are 10 kg, the laser wavelength is 1064 nm, and the arms are 10 km long.
Two cases are shown, for a circulating optical power in the arms of 50 kW and
500 kW. Raising the power lowers the shot noise and improves the sensitivity for
detecting high frequency gravitational waves. However, the radiation pressure
fluctuations increase with increasing power, which reduces the sensitivity to low
frequency waves. The frequency at which the SQL is attained rises as the circulating
optical power is increased.
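The balance between shot noise and QRPN can be explored numerically. The following is a minimal sketch, assuming that the left-hand side of Equation (9.41) is the shot-noise density $S_s$ and using the mirror mass, wavelength, and arm length quoted for Figure 9.8. The function names are illustrative only and are not taken from the text.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant (J s)
C = 2.99792458e8        # speed of light (m/s)


def shot_noise_density(power, wavelength):
    """Left-hand side of Equation (9.41), assumed here to be S_s."""
    return wavelength * HBAR * C / (2.0 * math.pi * power)


def radiation_pressure_density(power, wavelength, mass, omega_gw):
    """S_r as given in Equation (9.39)."""
    return 8.0 * math.pi * HBAR * power / (wavelength * C * mass**2 * omega_gw**4)


def optimum_power(wavelength, mass, omega_gw):
    """Power at which S_s = S_r, Equation (9.42)."""
    return wavelength * C * mass * omega_gw**2 / (4.0 * math.pi)


def sql_frequency(power, wavelength, mass):
    """Frequency (Hz) at which a given power is the optimum: Equation (9.42) inverted."""
    omega = math.sqrt(4.0 * math.pi * power / (wavelength * C * mass))
    return omega / (2.0 * math.pi)


def sql_strain(mass, arm_length, omega_gw):
    """Strain noise at the bound of Equation (9.43): sqrt(S) / L."""
    return math.sqrt(4.0 * HBAR / (mass * omega_gw**2)) / arm_length


if __name__ == "__main__":
    wavelength, mass, arm_length = 1064e-9, 10.0, 10e3  # Figure 9.8 parameters
    for power in (50e3, 500e3):
        f = sql_frequency(power, wavelength, mass)
        h = sql_strain(mass, arm_length, 2.0 * math.pi * f)
        print(f"P = {power:.0e} W: SQL reached near {f:.2f} Hz, strain {h:.2e} per root Hz")
```

Because the optimum power scales as $\Omega_{GW}^2$, a tenfold increase in power moves the frequency at which the SQL is attained up by a factor $\sqrt{10}$, in line with the trend described for Figure 9.8.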
Then our ideal detector with 100 kW power would have adequate sensitivity to
detect the event at ∼100 Hz observed by advanced LIGO. Evidently very high
optical powers are required, much greater than the wattage available from the well-
stabilized 1064 nm Nd:YAG lasers as used in LIGO.
[Figure 9.9 plot: L1 strain noise (in units of 10⁻²³/√Hz, axis range 0.3 to 2.2) versus frequency (30 to 1000 Hz). Curves: reference (without squeezing); quantum noise model (without squeezing); quantum-enhanced sensitivity; sum of non-quantum noises.]
Figure 9.9. Strain noise level in advanced LIGO with unsqueezed (black) and squeezed (green) vacuum.
Figure 1 from Tse et al. (2019). Made available through Creative Commons Attribution 4.0 International
license.
9.7 Squeezing
Surprisingly the port where radiation exits to the detector has an impact on the
noise. Through it the vacuum radiation field enters the interferometer and the
vacuum fluctuations add to both the shot noise and the QRPN, introducing some
correlation between them. It proved possible, by taking advantage of this correlation
induced between the shot noise and the QRPN, to reduce the noise below the SQL
(Unruh 1982; Braginsky et al. 1992; Kimble et al. 2001). The researchers have
adapted LIGO by injecting a squeezed vacuum into the exit port, which by virtue of
the correlation just noted, can be tuned to reduce the interferometer noise below the
SQL over a range of frequencies. This strategy is now explained.
The quantum operator for the electric field in a single optical mode can be
written5
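$$ \hat{E} = [\hat{a}\exp(-i\omega t) + \hat{a}^\dagger\exp(i\omega t)]/\sqrt{2}, \tag{9.45} $$
where $\hat{a}$ annihilates and $\hat{a}^\dagger$ creates a photon in the mode of angular frequency ω:
$$ \hat{a}|n\rangle = \sqrt{n}\,|n-1\rangle, \tag{9.46} $$
$$ \hat{a}^\dagger|n\rangle = \sqrt{n+1}\,|n+1\rangle. \tag{9.47} $$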
Here $|n\rangle$ is the state vector describing a mode containing n photons. $\hat{a}$ and $\hat{a}^\dagger$ are
conjugate operators with the property that $[\hat{a}, \hat{a}^\dagger] = 1$. Of interest here are the
quadrature operators
$$ \hat{X}_1 = (\hat{a} + \hat{a}^\dagger)/\sqrt{2}, \qquad \hat{X}_2 = i(\hat{a}^\dagger - \hat{a})/\sqrt{2}, \tag{9.48} $$
in terms of which
$$ \hat{E} = \hat{X}_1\cos(\omega t) + \hat{X}_2\sin(\omega t). \tag{9.49} $$
5
See for example Section 8.2 in Kenyon (2019) published by Oxford Univ. Press.
It follows that
[Xˆ1, Xˆ2 ] = i , (9.50)
so that X̂1 and X̂2 are also conjugate operators. Uncertainty in X1 corresponds to an
amplitude uncertainty, and impacts the QRPN; while uncertainty in X2 corresponds
to phase uncertainty, and impacts the shot noise. We can write the equivalent
uncertainty relation between the variances of the observables ($(\Delta X)^2 = \overline{X^2} - \bar{X}^2$):
$$ (\Delta X_1)^2 (\Delta X_2)^2 \geqslant 1/4. \tag{9.51} $$
A fully coherent beam attains the minimum uncertainty limit with
$$ (\Delta X_1)^2 = (\Delta X_2)^2 = 1/2. $$
The mechanism employed in beating the SQL is to squeeze the vacuum lightly.
This involves making a change in the content of a mode of the electromagnetic field
of interest:
$$ |0\rangle \rightarrow |\psi\rangle \equiv |0\rangle + s|2\rangle, \tag{9.52} $$
with s ≪ 1 being the amplitude for injecting two coherent photons into that mode.
Then
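$$ \langle\psi|\hat{X}_1^2|\psi\rangle = \langle\psi|\{\hat{a}^\dagger + \hat{a}\}^2|\psi\rangle/2 = \{\langle 0| + s\langle 2|\}\{\hat{a}^\dagger\hat{a}^\dagger + \hat{a}\hat{a} + \hat{a}^\dagger\hat{a} + \hat{a}\hat{a}^\dagger\}\{|0\rangle + s|2\rangle\}/2 = (1 + 2\sqrt{2}\,s)/2 \tag{9.53} $$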
to order s. In making the second equality we have used the property of the
annihilation operator that aˆ∣0〉 = 0. Also
6
The product of ΔX12 and ΔX22 appears to violate the uncertainty principle. If the expansion is continued to
higher orders in s this violation disappears.
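The quadrature variances of the lightly squeezed state can be checked numerically. The sketch below represents a state as a dictionary of real Fock-state amplitudes and applies the ladder-operator rules of Equations (9.46) and (9.47) directly. Two assumptions are made that go beyond the text: the state of Equation (9.52) is normalized, and all orders in s are retained. Since this state contains only even photon numbers, the mean of each quadrature vanishes and the variance equals the expectation value of the squared operator. The helper names are illustrative.

```python
import math


def annihilate(state):
    """Apply a: a|n> = sqrt(n)|n-1>; the vacuum component is removed."""
    return {n - 1: math.sqrt(n) * c for n, c in state.items() if n > 0}


def create(state):
    """Apply a-dagger: a-dagger|n> = sqrt(n+1)|n+1>."""
    return {n + 1: math.sqrt(n + 1) * c for n, c in state.items()}


def combine(u, v, sign=1.0):
    """Return u + sign * v for two states stored as {n: amplitude}."""
    out = dict(u)
    for n, c in v.items():
        out[n] = out.get(n, 0.0) + sign * c
    return out


def norm_squared(state):
    """Squared norm of a state with real amplitudes."""
    return sum(c * c for c in state.values())


def quadrature_variances(s):
    """Variances of X1 and X2 for the normalized state (|0> + s|2>)."""
    norm = math.sqrt(1.0 + s * s)
    psi = {0: 1.0 / norm, 2: s / norm}
    # X1|psi> = (a + a-dagger)|psi> / sqrt(2)
    x1_psi = combine(annihilate(psi), create(psi))
    # X2|psi> = i (a-dagger - a)|psi> / sqrt(2); the factor i does not affect the norm
    x2_psi = combine(create(psi), annihilate(psi), sign=-1.0)
    # <X^2> = |X psi|^2 because X is Hermitian, and <X> = 0 for this state
    return norm_squared(x1_psi) / 2.0, norm_squared(x2_psi) / 2.0


if __name__ == "__main__":
    for s in (0.0, -0.05, 0.05):
        v1, v2 = quadrature_variances(s)
        print(f"s = {s:+.2f}: var(X1) = {v1:.4f}, var(X2) = {v2:.4f}, product = {v1 * v2:.6f}")
```

With every order in s kept, the product of the two variances stays at or above 1/4, which is the behavior footnote 6 anticipates. A negative s reduces the variance of X1 at the expense of X2, and a positive s does the reverse.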
generate the required pairs of entangled photons appearing in Equation (9.52). The
device used in LIGO is an optical parametric oscillator (OPO). A laser pumps a
Fabry–Perot cavity containing a nonlinear crystal. The cavity mode is matched to
that of the interferometer and the excitation produces pairs of entangled photons
that are collinear and have frequency equal to that of the cavity resonance. These
pairs of entangled photons are thus in the same mode and at the correct frequency to
apply squeezing. The laser intensity is held just below the threshold for oscillation in
the crystal. With this setup the OPO input is the vacuum and the OPO output is the
squeezed vacuum. The data in Figure 9.9 show a 3 dB reduction in the noise
spectrum above 50 Hz. This is the region in which the observed mergers were
detected and shot noise dominates. There is a 14% increase in the distance at which
mergers can be detected, which increases the detection rate by 40%.
9.9 Exercises
1. Calculate the quadrupole moment of a system of two stars, each of mass M
in circular orbits of angular frequency ω at a separation 2a . Take the center
of mass as the origin and let the motion be in the x–y plane with the masses
along the x axis at time zero. Calculate Ixx, Ixy = Iyx and Iyy at time t. Then
calculate the reduced quadrupole moments using the fact that only the time
dependent part survives.
2. Calculate the luminosity of the gravitational wave emission from the binary
pair described in the previous question, in terms of M, ω, and a.
3. Using the results from the previous question, determine the frequency and
amplitude of gravitational waves from the binary pulsar system PSR 1913+16
at the surface of the Earth. You can assume that the orbits are circular
with a equal to 3.1961 light-seconds, that the masses are each 1.414 M⊙, that the
period of rotation is 27,907 s, and that the pulsar is 6.4 kpc from the Earth.
4. LISA is a gravitational wave interferometer planned for launch into space
around 2034. The laser light would follow a closed light path round an
equilateral triangle of mirrors of side length $2.5 \times 10^6$ km. Suppose the laser
used has 1 W power at wavelength 1 μm, and that 50 pW power is received in
one pass along an arm. What is the minimum strain detectable?
5. Explain how the laws of conservation of momentum and angular momentum
forbid emission of any dipole gravitational radiation.
Further Reading
Misner C W, Thorne K S and Wheeler J A 1973 Gravitation (San Francisco,
CA: W. H. Freeman). Chapter 18 contains a thorough treatment of the
formalism of gravitational waves. Gravitation established the notation
generally used today for general relativity.
Bond C, Brown D, Freise A and Strain K A 2016 Interferometric Techniques for
Gravitational-Wave Detection: Living Reviews in Relativity 19:3 (Berlin:
Springer). http://www.springer.com/gp/livingreviews/relativity.
Abbott B P et al. 2020 Prospects for Observing and Localizing Gravitational-
wave Transients with Advanced LIGO, Advanced Virgo and KAGRA: Living
Reviews in Relativity 23:3 (Berlin: Springer).
http://www.springer.com/gp/livingreviews/relativity.
The LIGO website, https://www.ligo.org/, is very useful.
Braginsky V. B. & Khalili F. Ya. 1992 Quantum Measurement ed. K. S. Thorne
(Cambridge: Cambridge Univ. Press). This succinct introduction to quantum
measurement was written by pioneers in the understanding of quantum
measurement.
Scully M O and Zubairy M S 1997 Quantum Optics (Cambridge: Cambridge
Univ. Press) contains a useful discussion of optical squeezing.
Ferrari V July 2010 The Quadrupole Formalism Applied to Binary Systems,
VESF School, Sesto val Pusteria. A useful account of GR application to
pulsars and merging binary systems. Available online.
References
Abbott, B. P., Abbott, R., Abbott, T. D., et al. 2016, PhRvL, 116, 061102
Abbott, B. P., Abbott, R., Abbott, T. D., et al. 2017, ApJ, 848, L13
Bond, C., Brown, D., Freise, A., & Strain, K. A. 2016, LRR, 19, 3
Braginsky, V. B., Gorodetsky, M. L., Khalili, F. Y., et al. 2003, PhRvD, 67, 082001
Braginsky, V. B., Khalili, F., & Thorne, K. S. 1992, Quantum Measurement (Cambridge:
Cambridge Univ. Press)
Davis, M., Ruffini, R., Press, W. H., & Price, R. H. 1971, PhRvL, 27, 1466
Chapter 10
Cosmic Dynamics
10.1 Introduction
To an excellent approximation the universe is flat. Consequently, this will be the case
looked at in detail; less will be said about spacetimes with curvature. On the largest
scale matter in the universe is approaching a homogeneous isotropic distribution.
Within the framework of GR, Friedmann, Robertson, and Walker derived the
metric for such homogeneous isotropic universes. Friedmann (again) and Le Maître
applied Einstein’s equation to these universes. The resulting three Friedmann–Le
Maître equations are the tools used here for analyzing the properties of the universe.
The inputs required when applying the Friedmann–Le Maître equations to our
universe are the densities of matter, radiation, and dark energy; plus their equations
of state, relating pressure to energy in each case. The derivations are carried through
here and the Friedmann–Le Maître equations are applied to the case of the ΛCDM
model.
The simple model that describes the universe on the large scale is that of a
homogeneous, isotropic universe with a density equal to the mean density of the
actual universe. Instead of galaxies and stars this model has a uniform fluid filling all
space. This is our working hypothesis for how the universe appears on the very large
scale. Our first step is to deduce the metric of such a model universe. Of course the
motion of galaxies is distorted from that of particles in such an ideal fluid by the local
variation of the gravitational attraction of other matter present.
We can imagine placing clocks at rest with respect to the cosmic fluid, and setting
them to read the same reference time when the fluid density and temperature reach
agreed values. Then using these synchronized clocks the physical state of the
universe will depend on time in the same way everywhere. This time is called the
cosmic time. Suppose now that a three-dimensional slice (hypersurface) is taken
through spacetime at cosmic time t. This hypersurface will also be isotropic and
homogeneous. Therefore there is a single Gaussian curvature κ (t ) characterizing it,
and this depends solely on the cosmic time:
$$ \kappa(t) = k/R^2(t) \tag{10.1} $$
where $1/R^2$ gives the magnitude and k the sign (+1, 0, −1) of this curvature. A
positively curved space with k = +1 is the three-dimensional analog of a spherical
surface, i.e., a hypersphere. It can be embedded in a four-dimensional Euclidean
space, just as a two-dimensional spherical surface is embedded in Euclidean three-
space. Let x, y, z, and w be the Cartesian coordinates in Euclidean four space, with
x, y, and z the spatial coordinates in the usual three-space. Then the hypersurface
has the equation
$$ x^2 + y^2 + z^2 + w^2 = r^2 + w^2 = R^2(t). $$
By differentiating at fixed time we obtain
$$ w\,dw = -r\,dr \quad\text{and}\quad w^2\,dw^2 = r^2\,dr^2. $$
Then
$$ dw^2 = \frac{r^2\,dr^2}{R^2 - r^2}. $$
The separation of nearby points on the hypersurface is given by
dℓ 2 = dr 2 + r 2 dΩ 2 + dw 2 ,
where (r, θ , φ) are the polar coordinates in the usual three-space and
dΩ2 = dθ 2 + sin2 θ dφ 2 . Eliminating dw2 we obtain the equation of the hypersurface
$$ d\ell^2 = \frac{R^2\,dr^2}{R^2 - r^2} + r^2\,d\Omega^2. $$
A change in angle θ produces a displacement r dθ, while a change in r in any
direction gives a displacement of $R\,dr/\sqrt{R^2 - r^2}$. These features show that the
three-dimensional hypersurface is curved and isotropic. The choice of positive curvature
means that it is a hypersphere. Note that the equation is the same for all points on
the hypersphere, and so this space is also homogeneous. Changing the notation we
introduce the dimensionless σ = r /R , so that the equation becomes
$$ d\ell^2 = R^2(t)\left(\frac{d\sigma^2}{1 - \sigma^2} + \sigma^2\,d\Omega^2\right), $$
where the time dependence of R(t ) is made explicit again. We now complete the
metric equation by incorporating the cosmic time in a way consistent with SR. The
invariant metric distance squared is
$$ ds^2 = c^2\,dt^2 - R^2(t)\left(\frac{d\sigma^2}{1 - \sigma^2} + \sigma^2\,d\Omega^2\right). $$
which gives contrasting results according to the sign of k. Let us first take k = +1,
then
d = R(t ) sin−1 σ .
If a sphere centered on A is drawn to intersect B the area of the sphere is
A = 4πR2σ 2 = 4πR2 sin2(d / R ).
As d increases from zero to πR/2 the area of the sphere increases steadily. Then as d
increases further to π R the area of the sphere falls to zero; by analogy with the case
of a two-dimensional sphere we have reached the Antipodes of A. Finally when d
reaches 2π R , σ is zero and we are back at A. A hypersurface with positive curvature
is a closed but unbounded surface. On the other hand, when k = −1 we have
d = R(t ) sinh−1 σ .
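The contrast between the curvature signs can be made concrete by evaluating the area of a sphere of proper radius d. The sketch below uses $A = 4\pi R^2\sigma^2$ with $\sigma = \sin(d/R)$ for k = +1 and $\sigma = \sinh(d/R)$ for k = −1, as in the relations above. The k = 0 branch uses the Euclidean area $4\pi d^2$, which is an added assumption for comparison, and the function name is illustrative.

```python
import math


def sphere_area(d, R, k):
    """Area of a sphere of proper radius d in a space of curvature k / R**2."""
    if k == 1:
        sigma = math.sin(d / R)    # closed space: d = R arcsin(sigma)
    elif k == -1:
        sigma = math.sinh(d / R)   # open space: d = R arcsinh(sigma)
    elif k == 0:
        return 4.0 * math.pi * d**2  # Euclidean value, added for comparison
    else:
        raise ValueError("k must be +1, 0 or -1")
    return 4.0 * math.pi * R**2 * sigma**2


if __name__ == "__main__":
    R = 1.0
    for d in (0.5, math.pi / 2, math.pi):
        areas = [sphere_area(d, R, k) for k in (1, 0, -1)]
        print(f"d = {d:.3f}: k=+1 {areas[0]:.3f}, k=0 {areas[1]:.3f}, k=-1 {areas[2]:.3f}")
```

For k = +1 the area peaks at d = πR/2 and returns to zero at the antipodal distance πR, while for k = −1 it grows faster than the Euclidean value at every d.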
1
Both Friedmann in Russia and Einstein in Switzerland belonged to clubs of young people sharing an interest
in the latest in the physical sciences.
A flat universe is open and unbounded. The FRW metric equation collapses to
ds 2 = c 2dt 2 − a 2(t )[dr 2 + r 2dΩ 2]. (10.4)
Here we have introduced notation that follows standard practice in cosmology: a(t )
varies with time and is made dimensionless; the length dimension is transferred to the
comoving length coordinate r. As the universe expands r remains constant. The
expansion is all in a(t ). In comparison with the Minkowski metric of SR the only
difference is the appearance of the time varying scale factor a(t ).2
2
a was already used in connection with Hubble’s law in Section 1.4.
[Figure 10.1 plot: time (marked 0, t1, t2) versus comoving coordinate r, with locations A, B, and C.]
Figure 10.1. The development of the universe from a singularity. The past light cones are drawn for one
location B at cosmic times t1 and t2.
c dt = a(t )dr .
Therefore the velocity of light in the comoving coordinate frame is c /a(t ). In an
expanding universe a(t ) increases with time so that the edges of the light cones in the
comoving frame are curved concave-up. Events lying within the past light cone at
time t1 can influence B at time t1. Events early in the lives of A and C can affect B at
time t2 but not earlier at time t1, before causal contact was made.
The wobbly line in the figure indicates an initial state. Note that light cones have
been drawn for the case that a(t ) shrank to zero at t = 0. This makes the light cones
become tangential at the initial point. That would allow instantaneous causal
contact across the universe. Alternatively the comoving coordinates could continue
for infinite time in the past. What actually happened is not known.
An observer in a flat, expanding, homogeneous and isotropic universe sees all galaxies
moving away radially: the observer’s frame is a comoving frame and it is also a frame
in free fall. For good measure this frame would coincide with the frame defined by the
distant galaxies, and in it the CMB would appear isotropic. From the Earth we find
that the rest frame defined by the distant galaxies and that defined by the CMB
coincide, so this frame is a comoving frame: of course we move relative to the
comoving frame with the vector sum of the Earth’s motion around the Sun, of the Sun
around the Galaxy and of the Galaxy itself. This vector sum is our secular motion.
It follows from the strong equivalence principle that because a comoving frame is
in free fall, the laws of physics in such a frame should satisfy the postulates of SR. In
the language of SR the comoving frame is an inertial frame. Other inertial frames
can be obtained from the comoving frame by boosting to a frame with constant
relative velocity. The comoving frame in free fall is unique because it is the only
frame in which the CMB is seen to be isotropic. Consequently this frame is a
preferred frame, something abhorrent to both classical and SR views.
As mentioned in Section 1.6, the existence of our local comoving frame is revealed
by a Foucault pendulum, most vividly by one located at the North Pole. To us it
appears to rotate through 360◦ in 24 hours. However, the plane of its swing remains
in a fixed orientation relative to the most distant galaxies, and to the CMB. This
makes specific the principle proposed by Mach in the 19th century: namely that the
matter in the universe determines a reference (comoving) frame. It follows that any
rotation with respect to that frame produces centrifugal and Coriolis forces.
3
Einstein initially published a paper disproving Friedmann’s solution, but then had to withdraw his disproof as
incorrect. At that point he disowned the cosmological constant.
4
For curved spacetime κ = k /R2 , where for a positive curved spacetime R is the radius of curvature in the
comoving frame, and k = ±1 indicates the sign of curvature.
cosmological constant, which does not change. For the latter the energy density and
pressure are given by Equations (F.12) and (F.13)
$$ \varepsilon_\Lambda = \rho_\Lambda c^2 = -p_\Lambda = \frac{c^4\Lambda}{8\pi G}, \tag{10.6} $$
and, to reiterate, Λ is invariant. Friedmann's equation is essentially an equation
expressing the conservation of energy. The first term in $\dot{a}^2$ is the kinetic energy; that
in ε is the gravitational energy; while the second term is the energy in the curvature
of spacetime. If spacetime is flat the kinetic and gravitational energies sum to zero:
the universe is created from nothing! The carry-home message is this: Friedmann’s
equation relates the expansion rate of the universe to the curvature and energy
density of the universe.
A second equation, Equation (F.15), is the acceleration equation
$$ \ddot{a}/a = -\frac{4\pi G}{3c^2}[\varepsilon + 3p]. \tag{10.7} $$
The RHS of the equation contains the gravitational force responsible for acceleration/
deceleration of the universe’s expansion. We see immediately that, as expected in GR,
both energy ε and pressure p contribute. The gravitational attraction of matter acts to
reduce the expansion rate. As we have seen earlier the cosmological constant, a.k.a.
dark energy, has pΛ = −εΛ , and it acts to accelerate the expansion.
The final equation, Equation (F.16), is the fluid equation
$$ \dot{\varepsilon} = -(3\dot{a}/a)[p + \varepsilon]. \tag{10.8} $$
This relates the rate of change of the energy density to the expansion of the universe.
Of the three equations, this is the only one for which a proof can be sketched using
Newtonian mechanics. On the right-hand side of this equation the pressure and
energy density of the cosmological constant mutually cancel. On the left-hand side
its contribution is equally zero because the cosmological constant does not vary. In
order to calculate the behavior of the universe from these equations the relationship
between density and pressure must be provided, that is to say, the equation of state of
the contents of the universe. We already know that for the cosmological constant.
The other components are matter, whether baryonic or dark matter, and radiation,
which can include the known neutrinos because they have negligible mass.
Notice that, in the case of a flat universe, all the terms containing a in the
Friedmann–Le Maître equations are ratios making the size of a flat universe
indeterminate. It is then convenient to choose, as we have done earlier, the scale
factor at the current time t0 to be equal to unity:
a(t0) = 1.0.
This simplifies the redshift/scale factor relationship:
$$ z = \frac{a(t_0) - a(t)}{a(t)} = 1/a - 1 \quad\text{and}\quad a = 1/(1 + z). \tag{10.9} $$
Friedmann’s equation shows that the critical energy density required to make the
universe precisely flat, εc , is given by
3H 2 = 8πGεc / c 2 .
Rearranging this in terms of the critical density ρc = εc /c 2 gives
$$ \rho_c \equiv \frac{3H^2}{8\pi G}. \tag{10.10} $$
In a flat universe
$$ \rho_r + \rho_m + \rho_\Lambda = \rho_c, $$
where the contributions are explicitly: $\rho_m$ from baryonic matter plus dark matter, $\rho_r$
from radiation, and $\rho_\Lambda$ from the cosmological constant. Friedmann's equation for a
flat universe can be re-expressed in terms of fractional contributions $\Omega_i = \rho_i/\rho_c$:
$$ \Omega \equiv \Omega_r + \Omega_m + \Omega_\Lambda = 1. $$
When referring to the current era we add a subscript 0 to all parameters as well as to
Hubble’s constant (H0). If the universe is not flat but curved then Friedmann’s
equation becomes
$$ \Omega - 1 = \frac{\kappa c^2}{[aH]^2} = \Omega_\kappa. \tag{10.11} $$
Unless otherwise stated we analyze the case of a flat universe, which is consistent
with all current measurements. Taking the current value of Hubble's constant as 70 km s⁻¹ Mpc⁻¹,
$$ \rho_{c0} = 9.14 \times 10^{-27}\ \mathrm{kg\,m^{-3}}, $$
or $1.35 \times 10^{11}\,M_\odot\,\mathrm{Mpc}^{-3}$. This is equivalent to 5.5 nucleon masses (mostly in
non-baryonic matter) per cubic meter, which is tiny compared to the $10^{30}$ nucleons per
cubic meter within the Earth.
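The critical density follows directly from Equation (10.10) once the Hubble constant is converted to SI units. The sketch below is illustrative only; the values of G, the megaparsec, the solar mass, and the nucleon mass are standard constants supplied here as assumptions, so the result should agree with the figures quoted above only to within about one percent, the difference depending on the rounding of the inputs.

```python
import math

G = 6.67430e-11          # gravitational constant (m^3 kg^-1 s^-2)
MPC = 3.0857e22          # one megaparsec (m)
M_SUN = 1.989e30         # solar mass (kg)
M_NUCLEON = 1.6726e-27   # proton mass, used as the nucleon mass (kg)


def critical_density(h0_km_s_mpc):
    """Critical mass density 3 H^2 / (8 pi G), Equation (10.10), in kg per cubic meter."""
    h0_si = h0_km_s_mpc * 1.0e3 / MPC  # convert km/s/Mpc to 1/s
    return 3.0 * h0_si**2 / (8.0 * math.pi * G)


if __name__ == "__main__":
    rho = critical_density(70.0)
    print(f"critical density: {rho:.3e} kg m^-3")
    print(f"in solar masses per cubic Mpc: {rho * MPC**3 / M_SUN:.3e}")
    print(f"in nucleon masses per cubic meter: {rho / M_NUCLEON:.2f}")
```

Since the density scales as the square of the Hubble constant, a 1% change in $H_0$ shifts the critical density by about 2%.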
The generally accepted model of the universe is an essentially flat FRW spacetime: the
content is taken to be electromagnetic radiation, cold (non-relativistic) dark matter,
baryonic matter and dark energy. This dark matter interacts gravitationally, but not
through the other (strong, weak, electromagnetic) forces. What sort of elementary
particles make up this dark matter is undetermined. The properties of dark energy are
consistent with the properties of the cosmological constant proposed by Einstein. The
cosmological constant Λ and the cold dark matter (CDM) are the origin of the name of
this standard model of cosmology, ΛCDM.
How this model has been constructed and validated is described in the remainder of
the book. In the following section we begin by considering simple models, each with
only one component from those listed, and each useful in approximating the
behavior of the universe on the large scale at some era.
with
$$ t_0 = \frac{1}{1 + w}\sqrt{\frac{c^2}{6\pi G\varepsilon_0}}, \tag{10.16} $$
and Hubble's constant
$$ H_0 = \frac{2}{(3 + 3w)t_0}. \tag{10.17} $$
We can now use the last three equations to determine how the universe would
develop if radiation, matter or dark energy alone were present. The results give clues
about the eras when either radiation, matter or dark energy dominated and how
these fit together.
• If radiation dominates (w = 1/3) we get:
$$ a(t) = (t/t_0)^{1/2}, \tag{10.18} $$
with
$$ t_0 = \frac{3}{4}\sqrt{\frac{c^2}{6\pi G\varepsilon_0}}, \tag{10.19} $$
and Hubble's constant
$$ H = \frac{1}{2t}. \tag{10.20} $$
The lifetime of the universe would be $1/(2H_0)$.
• With non-relativistic matter dominant (w = 0) we get:
$$ a(t) = (t/t_0)^{2/3}, \tag{10.21} $$
with
$$ t_0 = \sqrt{\frac{c^2}{6\pi G\varepsilon_0}}, \tag{10.22} $$
$$ \dot{a}^2 = \frac{8\pi G}{3}\rho_\Lambda a^2. $$
Taking the square root of this equation,
$$ H = \dot{a}/a = \sqrt{8\pi G\rho_\Lambda/3}. $$
The eras in the life of the universe when radiation and matter dominate depend very
much on how their energy densities change with time. Matter is non-relativistic for
all but the earliest times, thus we can take its energy density to be inversely
proportional to the volume of the universe,
$$ \rho_m \propto a^{-3}. \tag{10.25} $$
With radiation the energy density falls by an additional factor a to account for the
stretching of the wavelength, and consequent fall in the individual photon frequency
and energy,
$$ \rho_r \propto a^{-4}. \tag{10.26} $$
From the scale dependence of Equations (10.25) and (10.26) we see that at early
enough times radiation would inevitably dominate and at some later time matter
would take over. At the moment when the energy density of matter became equal to
the energy density of radiation the value of a is
$$ a_{rm} = \rho_{r0}/\rho_{m0}, \tag{10.27} $$
where ρr0 and ρm0 are the current values of ρr and ρm , respectively.
The total dark energy increases as the volume of space expands, its density remaining constant; and because dark energy's
gravitational force is repulsive this increases the expansion rate. The universe
undergoes ever accelerating expansion. So it is guaranteed that eventually dark
energy dominates over matter. The takeover from matter dominance occurs when a
has the value
$$ a_{m\Lambda} = [\rho_{m0}/\rho_\Lambda]^{1/3}. \tag{10.28} $$
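Equations (10.27) and (10.28) can be evaluated once present-day density fractions are chosen. In the sketch below the values Ω_r0 = 9.0 × 10⁻⁵, Ω_m0 = 0.31, and Ω_Λ0 = 0.69 are illustrative assumptions of roughly the size used in ΛCDM fits; they are not taken from this chapter. Ratios of the Ω values equal ratios of the densities because all share the same critical density.

```python
def equality_scale_factors(omega_r0, omega_m0, omega_l0):
    """Scale factors of radiation/matter and matter/dark-energy equality."""
    a_rm = omega_r0 / omega_m0                  # Equation (10.27)
    a_ml = (omega_m0 / omega_l0) ** (1.0 / 3)   # Equation (10.28)
    return a_rm, a_ml


def redshift(a):
    """Redshift for scale factor a, from a = 1 / (1 + z)."""
    return 1.0 / a - 1.0


if __name__ == "__main__":
    # Illustrative (assumed) present-day density fractions
    OMEGA_R0, OMEGA_M0, OMEGA_L0 = 9.0e-5, 0.31, 0.69
    a_rm, a_ml = equality_scale_factors(OMEGA_R0, OMEGA_M0, OMEGA_L0)
    print(f"radiation/matter equality: a = {a_rm:.3e}, z = {redshift(a_rm):.0f}")
    print(f"matter/dark energy equality: a = {a_ml:.3f}, z = {redshift(a_ml):.2f}")
```

With these inputs radiation/matter equality falls at a redshift of a few thousand, whereas matter/dark energy equality is comparatively recent, at a redshift well below one.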
Figure 10.2 compares the expansion of the universe predicted for radiation, matter
and dark energy dominance; arbitrarily making them to match today’s observed
expansion rate. Putting x = H0(t − t0 ), we have
• For a radiation dominated universe: $a = [1 + 2x]^{1/2}$,
• For a matter dominated universe: $a = [1 + 3x/2]^{2/3}$, and
• For dark energy dominant (Λ): $a = \exp x$.
These dependences are drawn in Figure 10.2. We shall see that the lifetime of the
universe is approximately $H_0^{-1}$, making the intercept on the time axis equal to −1.
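The three single-component curves can be compared with a few lines of code. This is a minimal sketch with illustrative function names; it checks numerically that each curve has a = 1 and unit slope da/dx = 1 at x = 0, which is what matching today's expansion rate means, and records where the radiation and matter curves reach a = 0.

```python
import math


def a_radiation(x):
    """Radiation-dominated universe, x = H0 (t - t0)."""
    return (1.0 + 2.0 * x) ** 0.5


def a_matter(x):
    """Matter-dominated universe."""
    return (1.0 + 1.5 * x) ** (2.0 / 3.0)


def a_lambda(x):
    """Dark-energy-dominated universe."""
    return math.exp(x)


def slope_today(scale_factor, h=1.0e-6):
    """Central-difference estimate of da/dx at x = 0."""
    return (scale_factor(h) - scale_factor(-h)) / (2.0 * h)


# Values of x at which a = 0, read off from the formulas above
X_START_RADIATION = -0.5        # 1 + 2x = 0
X_START_MATTER = -2.0 / 3.0     # 1 + 3x/2 = 0

if __name__ == "__main__":
    for name, f in (("radiation", a_radiation), ("matter", a_matter), ("lambda", a_lambda)):
        print(f"{name}: a(0) = {f(0.0):.3f}, da/dx at 0 = {slope_today(f):.6f}")
```

The exponential curve never reaches a = 0 at finite x, so a universe containing only dark energy has no finite age in this parametrization.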
Figure 10.2. Predicted expansion of the universe for the cases of radiation or matter or dark energy dominant.
None of the single component universes matches ours particularly well, but a mix
looks promising.
5
There is some tension within the data, not currently resolved. Measurements of H0 from the CMB data
extrapolating forward to the present give a 2% lower value, while measurements from low redshift using
standard candles, including supernovae SNe Ia, give a 2% higher value.
Table 10.1. Time, Redshift, Scale Parameter, and Temperature at Radiation/Matter Energy Equality and
Matter/Dark Energy Equality
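of their different dependence on a: $\varepsilon_0\Omega_{r0}/a^4$, $\varepsilon_0\Omega_{m0}/a^3$ and $\varepsilon_0\Omega_{\Lambda 0}$. Then Equation
(10.12) becomes
$$ \frac{\dot{a}^2}{a^2} = \frac{8\pi G\varepsilon_0}{3c^2}\left(\Omega_{r0}/a^4 + \Omega_{m0}/a^3 + \Omega_{\Lambda 0}\right). \tag{10.29} $$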
We can make this more compact, multiplying through by $a^2/H_0^2$:
$$ [\dot{a}/H_0]^2 = \Omega_{r0}/a^2 + \Omega_{m0}/a + \Omega_{\Lambda 0}a^2. $$
Then taking the square root and rearranging gives
$$ H_0\,dt = [\Omega_{r0}/a^2 + \Omega_{m0}/a + \Omega_{\Lambda 0}a^2]^{-1/2}\,da. \tag{10.30} $$
Finally this can be integrated to give
$$ H_0 t = \int_0^a [\Omega_{r0}/a^2 + \Omega_{m0}/a + \Omega_{\Lambda 0}a^2]^{-1/2}\,da. \tag{10.31} $$
This integration is carried through with the Ω values of the ΛCDM model to obtain
the expansion of the universe with time. It will also be useful to express the Hubble
constant at an earlier period in terms of $H_0$, its current value:
$$ H = \dot{a}/a = H_0[\Omega_{r0}/a^4 + \Omega_{m0}/a^3 + \Omega_{\Lambda 0}]^{1/2}. \tag{10.32} $$
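The integral in Equation (10.31) is straightforward to evaluate numerically. The following sketch uses a composite Simpson rule from the standard library only. The density fractions in the usage section are illustrative assumptions rather than values from this chapter, and the routine assumes that matter or radiation is present so that the integrand tends to zero as a approaches 0.

```python
import math


def integrand(a, omega_r0, omega_m0, omega_l0):
    """Integrand of Equation (10.31); tends to 0 as a -> 0 when matter or radiation is present."""
    if a == 0.0:
        return 0.0
    return 1.0 / math.sqrt(omega_r0 / a**2 + omega_m0 / a + omega_l0 * a**2)


def h0_times_t(a_end, omega_r0, omega_m0, omega_l0, n=20000):
    """Composite Simpson estimate of H0 * t at scale factor a_end."""
    if n % 2:
        n += 1  # Simpson's rule needs an even number of intervals
    h = a_end / n
    total = integrand(0.0, omega_r0, omega_m0, omega_l0)
    total += integrand(a_end, omega_r0, omega_m0, omega_l0)
    for i in range(1, n):
        weight = 4.0 if i % 2 else 2.0
        total += weight * integrand(i * h, omega_r0, omega_m0, omega_l0)
    return total * h / 3.0


if __name__ == "__main__":
    # Illustrative (assumed) present-day density fractions
    age = h0_times_t(1.0, 9.0e-5, 0.31, 0.69)
    print(f"H0 * t0 = {age:.4f}")
    # 1/H0 in Gyr for H0 = 70 km/s/Mpc, using 1 Mpc = 3.0857e19 km and 1 Gyr = 3.156e16 s
    hubble_time_gyr = 3.0857e19 / 70.0 / 3.156e16
    print(f"age for H0 = 70 km/s/Mpc: {age * hubble_time_gyr:.2f} Gyr")
```

For a matter-only universe the same routine returns 2/3, the value implied by $a \propto t^{2/3}$, which provides a simple check on the numerical integration.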
We now introduce the conventional deceleration parameter q using Equation (10.7)
$$ q = -\frac{a\ddot{a}}{\dot{a}^2} = \frac{1}{H^2}\frac{4\pi G}{3c^2}[\varepsilon + 3p] = \frac{1}{2\rho_c c^2}[\varepsilon + 3p], \tag{10.33} $$
where the last equality has used Equation (10.10). Next, writing the contributions of
the components of the universe and then applying the equations of state $p_i = w_i\varepsilon_i$
this becomes
$$ q = \frac{1}{2\rho_c c^2}\sum_i[\varepsilon_i + 3p_i] = \frac{1}{2}\sum_i\Omega_i(1 + 3w_i). \tag{10.34} $$
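Equation (10.34) reduces to a one-line sum once each component is given its equation-of-state parameter: w = 1/3 for radiation, 0 for matter, and −1 for the cosmological constant. The sketch below evaluates it; the present-day density fractions in the usage section are illustrative assumptions, not values from this chapter.

```python
def deceleration_parameter(components):
    """q = (1/2) * sum of Omega_i * (1 + 3 w_i), Equation (10.34).

    components: iterable of (omega_i, w_i) pairs.
    """
    return 0.5 * sum(omega * (1.0 + 3.0 * w) for omega, w in components)


if __name__ == "__main__":
    # Illustrative (assumed) present-day mix: radiation, matter, cosmological constant
    today = [(9.0e-5, 1.0 / 3.0), (0.31, 0.0), (0.69, -1.0)]
    print(f"q0 = {deceleration_parameter(today):.3f}")
```

A negative result corresponds to accelerating expansion. For single-component universes the function returns 1 for radiation, 1/2 for matter, and −1 for the cosmological constant.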
Figure 10.3. Fractional energy content of the universe in the ΛCDM model as a function of the scale factor.
The redshift is 1/a − 1, which is close to 1/a for most of the lifetime of the universe.
an acceleration of the expansion, and this will only increase as ΩΛ increases toward
unity and Ω m declines correspondingly into the future.
10.8 Exercises
1. Show that for small lookback times Δt (t = t0 − Δt ) the scale parameter can
be approximated by
$$ a(t) = 1 - \Delta t H_0 - q_0 H_0^2\Delta t^2/2. $$
Hence show that to the same approximation
$$ z = H_0\Delta t + (1 + q_0/2)H_0^2\Delta t^2, $$
and that
$$ H_0\Delta t = z - (1 + q_0/2)z^2. $$
Now write a rm = Ω r0/Ω m0, which is the scale factor at which the energy
density of matter overtakes the energy density of radiation. Then the
Friedmann equation reduces to
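$$ dt = \frac{a\,da}{\sqrt{\Omega_{r0}}\,H_0}\left[1 + \frac{a}{a_{rm}}\right]^{-1/2}. $$
Hence show that the time of matter–radiation equality was approximately
$$ t_{rm} = 0.391\,\frac{a_{rm}^2}{\sqrt{\Omega_{r0}}\,H_0} \approx 50{,}000\ \text{years}. \tag{10.36} $$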
Ω r0 H0
When integrating, try the substitution x 2 = 1 + a /a rm .
Further Reading
Liddle A 2015 An Introduction to Modern Cosmology (3rd ed.; New York:
Wiley). This text provides a compact introduction.
Peebles P J E 2020 Cosmology’s Century: an inside history of our modern
understanding of the universe Princeton University Press. This book gives
insights into the development of the subject by a Nobel laureate.
References
Friedman, A. 1922, ZPhy, 10, 377
Le Maître, G. 1927, ASSB, A47, 49
Planck Collaboration, Aghanim, N., Akrami, Y., et al. 2020, A&A, 641, A1
Robertson, H. P. 1936, ApJ, 83, 257
Walker, A. G. 1937, PLMS, s2-42, 90
Chapter 11
Distances, Horizons and Measurements
11.1 Introduction
The acceptance of an expanding universe raises the question of how to measure
distances when these are changing continually. If we could freeze the universe at a
given moment of cosmic time, then distances between points would be easy to define
and measure. That is not what we have, so how should we proceed? A connected
question is whether light or gravitational waves could have traveled from one
spacetime point to another. This range is crucial since it is the range of any causal
influence. The limits of causal influence are generically called horizons.
where dP is known as the proper distance to the quasar at the current time. To get the
proper distance when the light was emitted, simply multiply the result above by a(tq ).
The other distance frequently used is the comoving distance of Equation (10.4). For
the interval between cosmic times t1 and t2, this is
d_C = c \int_{t_1}^{t_2} \frac{dt}{a(t)},   (11.2)
which overlaps the definition of the current proper distance. Another closely related
quantity is the conformal time. The conformal time from the Big Bang up to the
cosmic time t is
\eta = \int_0^t \frac{dt}{a(t)}.   (11.3)
Now turning to the horizons: the particle horizon is defined to be the boundary of
the region containing all spacetime events that could have influenced the reference
spacetime point. Thus the distance to the particle horizon is simply the distance that
light, or equally gravitational effects, had traveled since the universe began until the
reference time t. The proper distance to the particle horizon is therefore
d_P = a(t)\, c \int_0^t \frac{dt'}{a(t')},   (11.4)
Note that dC can be re-expressed (dropping the primes on the integrated quantities)
as
d_C = c \int_0^a \frac{d(\ln a)}{\dot{a}} = \int_0^a \left[ \frac{c}{aH} \right] d(\ln a),   (11.6)
where c/[aH] is called the comoving Hubble radius. It spans the comoving distance
light travels in one expansion time of the universe, that is, while the universe expands
by a factor e. Another important boundary is the event horizon. For instance, the furthest
events now occurring that will ever be visible from Earth constitute our event
horizon. To reiterate, such horizons apply equally to gravitational and electro-
magnetic effects.
In many contexts reference is made to some structure entering the horizon or
leaving the horizon. This is best explained by saying first what being within the
horizon means. For a structure to be within the horizon requires that light from any
part of the structure has had sufficient time since some reference time to reach all
other parts of the structure. The reference time is usually the Big Bang. Within that
sufficient time the whole structure could come in causal contact. When a structure
enters the horizon all its parts come into causal contact. When a structure leaves the
horizon its parts are no longer all in causal contact with one another.
where dP is the proper distance of the source. In an expanding universe the photon
frequency (and energy) falls by a factor (1 + z), and the rate at which the photons
arrive falls by the same factor. Thus
S = L / [4\pi d_P^2 (1 + z)^2].   (11.8)
The observer only has S and L available and so a luminosity distance dL is defined by
S = L / [4\pi d_L^2].   (11.9)
Thus
dL = dP(1 + z ). (11.10)
Once the source’s redshift is measured the proper distance dP can be extracted.
In the second case, the diameter of the object viewed can be inferred independ-
ently. In an astronomical context the apparent angular diameter of any galaxy or
star, dθ, is small: if its physical diameter is ρ then by definition the angular diameter
distance is
dA = ρ /dθ . (11.11)
The coordinate system has the observer at the origin with the light rays from the
edges of the source traveling radially. Now
ρ = a(t ) dP dθ (11.12)
where t is the time at which light left the source. Using Equation (10.9) this gives
ρ = dP dθ /(1 + z ). (11.13)
Hence the angular diameter distance
dA = ρ /dθ = dP /(1 + z ). (11.14)
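The relations between the three distances can be illustrated numerically. The sketch below is an illustration under stated assumptions, not the author's calculation: it takes the flat ΛCDM parameters quoted with Figure 11.1, writes c dt/a = (c/H0) da/sqrt(Ω_r0 + Ω_m0 a + Ω_Λ0 a^4) from the Friedmann equation, and integrates with Simpson's rule. The function names are hypothetical and distances are in units of c/H0.

```python
import math

OMEGA_R0, OMEGA_M0, OMEGA_L0 = 9.0e-5, 0.30, 0.70   # values quoted with Figure 11.1

def integrand(a):
    # 1 / (a^2 H / H0) for radiation + matter + cosmological constant
    return 1.0 / math.sqrt(OMEGA_R0 + OMEGA_M0 * a + OMEGA_L0 * a**4)

def proper_distance(z, n_steps=20_000):
    """Current proper distance to redshift z in units of c/H0 (n_steps must be even)."""
    a_emit = 1.0 / (1.0 + z)
    h = (1.0 - a_emit) / n_steps
    total = integrand(a_emit) + integrand(1.0)
    for k in range(1, n_steps):
        total += (4.0 if k % 2 else 2.0) * integrand(a_emit + k * h)
    return total * h / 3.0

def luminosity_distance(z):
    return proper_distance(z) * (1.0 + z)      # Equation (11.10)

def angular_diameter_distance(z):
    return proper_distance(z) / (1.0 + z)      # Equation (11.14)

for z in (0.1, 1.0, 5.7):
    print(f"z = {z}: dP = {proper_distance(z):.3f}, "
          f"dL = {luminosity_distance(z):.3f}, dA = {angular_diameter_distance(z):.3f}")
```

Multiplying by c/H0 for a chosen Hubble constant converts these numbers to Mpc. The ratio dL/dA = (1 + z)^2 holds whatever the cosmological parameters.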
Figure 11.1. Proper distance, luminosity distance and angular diameter distance as a function of the redshift z
of the source. The ΛCDM model of the universe was used taking Ω r0 = 0.000 09, Ω m0 = 0.30 and ΩΛ0 = 0.70.
[Figure 11.2 axes: d_P in units of c/H_0 (vertical) against age of universe in Gyr (horizontal); points labelled by redshift.]
Figure 11.2. Several paths for matter at different comoving distances from the Earth. Their distances are
plotted against cosmic time. The blue curve shows the path of light reaching the Earth at the present moment.
Light emitted from the locations of the red dots arrives on Earth with the indicated redshifts. Adapted with
permission from the ICHEP2020 talk by David Kirkby, University of California, Irvine. https://faculty.sites.uci.edu/dkirkby/.
the indicated path of light and the time axis encloses all the spacetime locations that
were at some time in causal contact with us. Now because the universe is expanding
isotropically, the figure will work equally well for any line of sight over the 4π solid
angle. Thus the blue line encloses our past lightcone. Any radiation emitted from
sources within the lightcone (closer to the time axis) would have reached Earth
earlier: radiation emitted outside would either reach us later or not at all. Put
another way, the blue line marks our particle horizon.
11.4 Exercises
1. Calculate the distance to sources at redshifts 0.1 and 1.0. Calculate also the
temperature of the CMB, its peak wavelength and the age of the universe in
both cases when the radiation was emitted from the source. By what factor
would the luminosity and angular diameter estimates for the distance to the
source have differed in each case? You can assume that H has its current
value throughout the expansion.
2. Find dP(t0 ) for the second example in the previous chapter in the same
approximation as used there.
3. A source at redshift z = 4 is observed to change in intensity over a period of 5
years. What time interval does this correspond to at the source itself? Why
does this interval set an upper limit to the size of the source? Estimate this
upper limit and its angular size viewed from the Earth.
4. Show that in a non-expanding universe the observed surface brightness of a
source is independent of distance. Also show that in an expanding universe
the brightness falls off like (1 + z)^{-4}.
Further Reading
Liddle A 2015 An Introduction to Modern Cosmology (3rd ed.; New York:
Wiley). This text provides a compact introduction.
Peebles P J E 2020 Cosmology’s Century: An Inside History of Our Modern
Understanding of the Universe (Princeton, NJ: Princeton Univ. Press). This
book gives insights into the development of the subject by a Nobel laureate.
Chapter 12
Cosmic Microwave Background
12.1 Introduction
This radiation was introduced in Section 1.6. It was first observed at Bell Labs by
Penzias and Wilson with a microwave receiver previously used in satellite commu-
nication. They found that the instrument suffered from constant, unexpected noise
with a characteristic equivalent temperature of around 3 K, and this appeared to be
omnidirectional. They had no luck in eliminating it. The interpretation emerged that
this noise is the radiation expected from the initial compact, high temperature, Big
Bang with which the universe began. Penzias & Wilson (1965) earned a Nobel Prize
for their discovery. The relic radiation from the Big Bang reaches us as nearly
isotropic black body radiation, shown in Figure 1.5. The temperature is 2.7255 K,
with the peak wavelength at 1.06 mm: earlier it was hotter by a factor 1 + z at
redshift z.
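The quoted peak wavelength and the scaling of temperature with redshift can be checked in a few lines. This is a small sketch that assumes the peak refers to the wavelength form of Wien's displacement law; the constant and the function names are introduced here for illustration.

```python
WIEN_B_M_K = 2.897771955e-3   # Wien displacement constant in m K (wavelength form)
T0_K = 2.7255                 # present CMB temperature from the text

def cmb_temperature(z, t0=T0_K):
    """Temperature of the CMB at redshift z: hotter by a factor 1 + z."""
    return t0 * (1.0 + z)

def peak_wavelength_mm(temperature_k):
    """Wavelength at which the black body spectrum (per unit wavelength) peaks, in mm."""
    return WIEN_B_M_K / temperature_k * 1.0e3

print(f"peak wavelength today: {peak_wavelength_mm(T0_K):.2f} mm")
print(f"temperature at z = 1090: {cmb_temperature(1090.0):.0f} K")
```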
The cosmic microwave background (CMB) originates from an initial compact
source; yet, paradoxically, we detect it in every direction. Analogously, a two
dimensional person living on the surface of a balloon expanding from a point would
see radiation from its point origin in any direction in his/her two dimensions. We can
visualize that situation easily, and can argue that viewed from a four dimensional
space, three of which form our three dimensions, the paradox would be resolved.
Figure 12.1 shows a section through our space with us at the center and the source of
radiation apparently forming a sphere around us. Its radius is the distance that the
CMB has traveled since decoupling from matter at about 365,000 years after the Big
Bang. Before that there existed a plasma of charged baryonic matter in which
radiation was continually scattered, so that direct observation of earlier eras is
impossible: we detect the photons from the last such scattering. The CMB temperature
only deviates from isotropy by less than one part in 10^4 (∼10 μK): however a great
deal has been learnt about the universe from studying how these deviations are
distributed across the sky. This success is built on the observations made with a
sequence of impressively instrumented satellites, thus eliminating the substantial
microwave absorption by the Earth’s atmosphere at wavelengths shorter than 6 cm.
The latest satellite, Planck, was active from 2009 to 2013, having nine detectors
across the frequency range from 30 to 900 GHz (1 cm–0.3 mm). Planck was
positioned at the L2 point located along the Sun–Earth line and 1.5 Mkm beyond
the Earth. At L2 the satellite is shielded from the Sun by the Earth, which makes the
cooling required to operate detectors sensitive to the CMB radiation at 2.7255 K
manageable over the long term. This orbit is stable radially along the Sun–Earth direction. In the
perpendicular direction (lying in the plane of the Earth’s orbit) the orbit is weakly
unstable. The time taken for any offset in this direction from L2 to double is one
month, so that only small corrective rocket impulses are needed to hold station on
L2.
The frequency distribution of photons in black body radiation at temperature T
K was given by Planck (the physicist!):
n(f)\, df = \frac{8\pi f^2\, df}{c^3 [\exp(hf / k_B T) - 1]},   (12.1)
where kB is the Boltzmann constant, and h is Planck’s constant. The energy distribution
is n(f )hf df . These expressions were integrated over all f in Section 1.6 to give the total
number density and total energy density in the black body radiation:
n = 2.03 \times 10^{7}\, T^3 \text{ m}^{-3}, \quad E = 7.56 \times 10^{-16}\, T^4 \text{ J m}^{-3}.   (12.2)
This translates today, when T = 2.7255 K, to 411 photons cm^{-3} (200,000 inside a cup), and
an energy density of only 0.261 eV cm^{-3}. The mean photon energy is
E_{mean} = 2.70\, k_B T = 3.72 \times 10^{-23}\, T \text{ J}.   (12.3)
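The coefficients in Equations (12.2) and (12.3) follow from integrating Equation (12.1) with the substitution x = hf/kBT. The sketch below does the two dimensionless integrals numerically with a simple midpoint rule; the function names and the integration cut-off are choices made for this illustration.

```python
import math

H = 6.62607015e-34     # Planck constant, J s
KB = 1.380649e-23      # Boltzmann constant, J/K
C = 2.99792458e8       # speed of light, m/s
EV = 1.602176634e-19   # joules per electronvolt

def bose_integral(power, x_max=60.0, n_steps=60_000):
    """Midpoint-rule estimate of the integral of x^power / (exp(x) - 1) from 0 to infinity.

    The integrand is negligible beyond x_max = 60, so the tail is dropped.
    """
    h = x_max / n_steps
    return h * sum(((k + 0.5) * h) ** power / math.expm1((k + 0.5) * h)
                   for k in range(n_steps))

def photon_number_density(t):
    """Photons per m^3: 8 pi (kB T / h c)^3 times the integral with x^2."""
    return 8.0 * math.pi * (KB * t / (H * C)) ** 3 * bose_integral(2)

def energy_density(t):
    """Energy in J per m^3: 8 pi kB T (kB T / h c)^3 times the integral with x^3."""
    return 8.0 * math.pi * KB * t * (KB * t / (H * C)) ** 3 * bose_integral(3)

T0 = 2.7255
n_cm3 = photon_number_density(T0) / 1.0e6         # photons per cm^3 today
u_ev_cm3 = energy_density(T0) / EV / 1.0e6        # eV per cm^3 today
mean_over_kt = bose_integral(3) / bose_integral(2)

print(f"n(T=1 K) = {photon_number_density(1.0):.3e} m^-3")
print(f"E(T=1 K) = {energy_density(1.0):.3e} J m^-3")
print(f"today: {n_cm3:.0f} photons cm^-3, {u_ev_cm3:.3f} eV cm^-3")
print(f"mean photon energy = {mean_over_kt:.2f} kB T")
```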
The angular correlations observed between the temperature fluctuations in the
CMB across the sky are fundamental to understanding cosmology. Both the
intensity and polarization correlations carry useful information. The distribution
of perturbations of matter in the early universe, as preserved by the CMB, is
complemented by measurements of the distribution of matter at low redshift.
Knowing both has made it possible to ask whether the universe developed from
those early perturbations in accordance with the known physical laws to give the
present pattern of galaxies, galaxy clusters, and voids. We shall learn that it has. A
consistent history connecting the era of the CMB, 365,000 years after the Big Bang,
and now, 13,800,000,000 years later, has been constructed based on the ΛCDM
model.
1
See Table 10.1 and Exercise 9.8.
cross-section for Thomson scattering, σ (6.65 × 10^{-29} m^2) we can calculate the mean
free path between scatters:
\lambda = 1/(\sigma n_e) = 1.47 \times 10^{18} \text{ m} = 47.6 \text{ pc},
and the time interval between scatters
\tau \approx \lambda / c \approx 5 \times 10^{9} \text{ s},
or 150 years. This illustrates how dilute the plasma already was at matter/radiation
equality.
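The arithmetic behind these figures is short enough to reproduce. In the sketch below the free-electron density n_e is an assumed input, chosen here so that the quoted mean free path is recovered; it is not a value taken from the text.

```python
SIGMA_T = 6.65e-29              # Thomson cross-section, m^2 (from the text)
C = 2.998e8                     # speed of light, m/s
METRES_PER_PARSEC = 3.0857e16
SECONDS_PER_YEAR = 3.156e7

n_e = 1.02e10                   # assumed free-electron density in m^-3 (illustrative)

mean_free_path_m = 1.0 / (SIGMA_T * n_e)
mean_free_path_pc = mean_free_path_m / METRES_PER_PARSEC
time_between_scatters_s = mean_free_path_m / C
time_between_scatters_yr = time_between_scatters_s / SECONDS_PER_YEAR

print(f"mean free path = {mean_free_path_m:.2e} m = {mean_free_path_pc:.1f} pc")
print(f"time between scatters = {time_between_scatters_s:.1e} s "
      f"= {time_between_scatters_yr:.0f} years")
```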
The mean photon energy, 2.7kBT = 2.2 eV, was well below Q = 13.6 eV, the
energy required to ionize hydrogen. However, because the photons outnumbered the
baryons (explicitly nucleons) by a factor (the same as now) of 1.6 × 10^9, the photons
in the high energy tail of the black body spectrum were numerous enough to maintain
the ionization. The buildup of hydrogen atoms, through the capture of the free
electrons by the protons, took place between around 70,000 and 365,000 years after
the Big Bang. The progress of recombination can be followed by applying the Saha
ionization equation to the reactions in equilibrium:
\gamma + {}^{1}\mathrm{H} \rightleftharpoons e^{-} + p.   (12.5)
The Saha equation relates the numbers of the reacting particles in thermal
equilibrium to the temperature T. Given that in equilibrium the number of hydrogen
atoms per unit volume is nH, of free protons is np, and of free electrons is ne, then the
Saha equation gives
2
\int_0^\infty x^2\, dx / [\exp x - 1] = 2.404.
n_\gamma = 60.42 \left[ \frac{k_B T}{hc} \right]^3.   (12.8)
Thus we have
n_p = X n_b = X \eta n_\gamma = 60.42\, \eta X \left[ \frac{k_B T}{hc} \right]^3.   (12.9)
The right-hand sides of Equations (12.7) and (12.9) will be equal in thermal
equilibrium. Applying this equality to the mid-point of recombination when
X = 0.5 we get3 an equilibrium temperature of 3755 K.
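The trial-and-error solution for the temperature can be automated with a bisection search. The sketch below is hedged in one important respect: it assumes the standard textbook form of the Saha relation for hydrogen, n_p n_e / n_H = (2π m_e kB T / h^2)^{3/2} exp(−Q/kB T), with n_e = n_p and n_H = n_p (1 − X)/X, and takes η = 6.0 × 10^{-10}. The function names are hypothetical.

```python
import math

H = 6.62607015e-34          # Planck constant, J s
KB = 1.380649e-23           # Boltzmann constant, J/K
C = 2.99792458e8            # speed of light, m/s
M_E = 9.1093837e-31         # electron mass, kg
Q = 13.6 * 1.602176634e-19  # hydrogen ionization energy, J
ETA = 6.0e-10               # baryon-to-photon ratio (assumed value)

def saha_side(t, x):
    """n_p from the assumed standard Saha form with n_e = n_p and n_H = n_p (1 - x) / x."""
    thermal = (2.0 * math.pi * M_E * KB * t / H**2) ** 1.5
    return (1.0 - x) / x * thermal * math.exp(-Q / (KB * t))

def photon_side(t, x):
    """n_p = 60.42 eta X (kB T / h c)^3, as in Equation (12.9)."""
    return 60.42 * ETA * x * (KB * t / (H * C)) ** 3

def equilibrium_temperature(x, t_low=2000.0, t_high=6000.0):
    """Bisection on T for the temperature at which the two expressions for n_p agree."""
    def mismatch(t):
        # Grows with T because the Boltzmann factor rises steeply
        return math.log(saha_side(t, x)) - math.log(photon_side(t, x))
    for _ in range(60):
        t_mid = 0.5 * (t_low + t_high)
        if mismatch(t_mid) > 0.0:
            t_high = t_mid
        else:
            t_low = t_mid
    return 0.5 * (t_low + t_high)

t_half = equilibrium_temperature(0.5)
print(f"T at X = 0.5 is about {t_half:.0f} K")
```

With these inputs the result lies within a few kelvin of the quoted 3755 K; the last digits depend on the adopted values of η and Q.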
When a calculation is made taking account of the details of ionization and
recombination the temperature at which the photons decouple from matter is found
to be T = 2974 K. Correspondingly in the ΛCDM model the redshift is 1090 and the
age of the universe is 365,000 years. Figure 12.2 shows the energy distribution in all
the detectable background radiation arriving now at the Earth’s upper atmosphere:
Figure 12.2. The background radiation of the universe. For example, CXB indicates the x-ray background,
CUB the ultraviolet background, and so on. The line width indicates the experimental uncertainty.
Compilation in Figure 9 of Hill et al. (2018).
3
Taking a target value of 0.5 for X, the right-hand sides can be computed with trial values of T until they
become equal.
for instance CXB indicates the x-ray background. Despite the tiny energy per
photon the CMB dominates the energy flow. At recombination the photon flux in
the CMB would have been (1090)^2 times greater and the energy flow would have
been (1090)^3 times greater. The competing sources of radiation, stars and galaxies,
would only form much later. This all fits in with the CMB being the radiation from
the initial fireball, the Big Bang.
4
The three low-frequency detectors at 30–70 GHz were high electron mobility transistors cooled to 20 K and
the six high-frequency detectors at 100–857 GHz were bolometers cooled to 0.1 K.
over the azimuthal angle is reasonable because there is no preferred direction across
the sky. Now using the expansion in spherical harmonics this reduces to
C(\theta) = \frac{1}{4\pi} \sum_{\ell=0}^{\infty} (2\ell + 1)\, C_\ell\, P_\ell(\cos\theta)   (12.13)
where P_ℓ are Legendre polynomials: P_0(x) = 1, P_1(x) = x and P_2(x) = (3x^2 − 1)/2,
etc. The coefficient
C_\ell = \frac{1}{2\ell + 1} \sum_{m=-\ell}^{+\ell} |a_{\ell m}|^2.   (12.14)
This is the average of the 2ℓ + 1 samplings, |a_{ℓm}|^2, for a given value of ℓ. The rms
deviation of C_ℓ, ΔC_ℓ, called the cosmic variance, is given by
\frac{\Delta C_\ell}{C_\ell} = \sqrt{\frac{2}{2\ell + 1}},
and is biggest for the low ℓ multipoles with their fewer samples. The ℓ = 0 multipole
vanishes because the mean deviation is zero.
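The multipole sum, the estimator for C_ℓ and the cosmic variance are easy to express in code. The sketch below is illustrative only: the function names are hypothetical, the Legendre polynomials come from the standard three-term recurrence, and the cosmic variance is taken in its usual form ΔC_ℓ/C_ℓ = sqrt(2/(2ℓ + 1)).

```python
import math

def legendre(ell, x):
    """P_ell(x) from the recurrence (n + 1) P_{n+1} = (2n + 1) x P_n - n P_{n-1}."""
    p_prev, p_curr = 1.0, x
    if ell == 0:
        return p_prev
    for n in range(1, ell):
        p_prev, p_curr = p_curr, ((2 * n + 1) * x * p_curr - n * p_prev) / (n + 1)
    return p_curr

def correlation(theta, c_ell):
    """C(theta) of Equation (12.13), truncated to the multipoles listed in c_ell[0..ell_max]."""
    x = math.cos(theta)
    return sum((2 * ell + 1) * c * legendre(ell, x)
               for ell, c in enumerate(c_ell)) / (4.0 * math.pi)

def estimate_c_ell(a_lm):
    """Equation (12.14): average of |a_lm|^2 over the 2 ell + 1 values of m."""
    return sum(abs(a) ** 2 for a in a_lm) / len(a_lm)

def cosmic_variance(ell):
    """Fractional rms uncertainty on C_ell from having only 2 ell + 1 samples."""
    return math.sqrt(2.0 / (2 * ell + 1))

for ell in (2, 10, 100, 1000):
    print(f"ell = {ell}: fractional cosmic variance = {cosmic_variance(ell):.3f}")
```

The printed values make the point in the text concrete: the uncertainty is large at the quadrupole and falls to a few per cent by ℓ of order 1000.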
The secular motion of the Earth relative to the CMB Doppler shifts the CMB so
that its apparent temperature becomes (see Equation (1.15) and surrounding text in
Section 1.6)
T (θ ) = 〈T 〉(1 + β cos θ ),
where θ is the angle between the Earth’s instantaneous velocity, βc , and whatever
reference axis is chosen. Thus the Doppler shift induces an artificial dipole moment
(ℓ = 1) in the CMB temperature distribution. In all, the Earth’s motion has
components from the motion of the Galaxy group, of the Galaxy in the group,
and of the Sun round the Galaxy, as well as its motion round the Sun. Overall
β ∼ 3 × 10^{-3}, and this produces a few millikelvin effect, which is ∼100 times the
intrinsic temperature perturbations in the CMB. Thus a first essential step in
analyzing the CMB is to compensate for this Doppler shift before carrying out
the analysis described in this section.
A first quantity of cosmological interest is the angular power spectrum of the
temperature fluctuations
D_\ell^{TT} = \frac{\ell(\ell + 1)}{2\pi} C_\ell \langle T \rangle^2.   (12.15)
In Figure 12.3 the measurements of the temperature correlations made by the Planck
Collaboration are plotted against ℓ . In Section 12.8 we meet a complementary
measurement, that of the polarization of the CMB: the corresponding correlations in
polarization are shown in Figure 12.6. The superposed curves in both figures are the
outcome of fits to the data calculated using the ΛCDM model. These detailed
calculations, which help determine the values of cosmological parameters, are
outside our scope here. Instead the main features of the temperature and
Figure 12.3. Temperature–temperature correlations of the CMB measured by the Planck Collaboration. The
experimental errors are displayed in the lower panel. Adapted from Figure 1 in Planck Collaboration et al.
(2020). Reproduced with permission © ESO.
polarization angular spectra are discussed and inferences made about the impact
they have on the cosmological parameters. Of these parameters, we have already
met Ωr0, Ωm0, Ωb0 , ΩΛ0 and H0. In the later universe radiation from the stars reionizes
the hydrogen gas it contains and the CMB undergoes further Thomson scattering
from the free electrons thus created. We meet a measurable parameter used to
quantify this effect: τ, the optical thickness of the universe.
This gives the thermal variation across the last scattering surface, resulting in the
correlations that Planck measured.
Dark matter particles had become non-relativistic much earlier, as we shall prove.
Lacking any interactions other than gravitational, dark matter simply accumulated
in the gravitational valleys and was depleted on the hills left after inflation. This
process steadily enhanced the contrast between the hills and valleys. Relativistic
baryonic matter however formed a plasma in thermal equilibrium with the photons,
coupled by Thomson scattering of photons from electrons and Coulomb scattering
of electrons off protons. This plasma reacted to the gravitational potential with
baryonic matter flowing into valleys and becoming compressed there. The photons
then exerted a restoring pressure that produced an outflow from the potential wells.
Together, the inertia of the dust-like baryons and the spring provided by the
radiation produced plasma oscillations in the existing gravitational potential wells.
These oscillations would have traveled at the speed of sound in the plasma.5 Photons
dominated the baryons in number and in their total energy so that the speed of
sound was
c_s \approx c / \sqrt{3},
not very different from the speed of light.
Plasma oscillations that completed an exact half cycle at recombination would
have reached maximal compression at recombination and hence been hotter, and
contribute to the first and largest peak in the power spectrum in Figure 12.3. More
generally the odd numbered peaks in D_ℓ^{TT} are the result of 1/2, 3/2, 5/2, … cycles of
oscillation being completed at recombination: they correspond to maximal com-
pression at recombination and a hotter plasma. When 1, 2, 3, … full cycles are
completed at recombination there is maximal rarefaction and a colder plasma: this
gives the even order peaks in D_ℓ^{TT}. Even or odd, these peaks, the result of
longitudinal sound waves in the photon-baryon plasma, are called acoustic peaks.
During the compression of the plasma the gravitational attraction of the baryons
acts in the same sense as the gravitational attraction of the dark matter; by contrast
they are in opposition at rarefaction. Hence the relative heights of the even
numbered acoustic peaks with respect to the odd peaks are determined by the baryon
to dark matter ratio. The fits to the CMB spectra by the Planck Collaboration give
mean values Ωb0 = 0.049 and Ω m0 = 0.31. It follows that the ratio of the number of
protons to the number of photons now, and equally ever since decoupling, is
6.0 × 10^{-10}.
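The step from Ωb0 to the proton-to-photon ratio can be made explicit. The sketch below treats every baryon as having the proton mass and assumes H0 = 67.4 km/s/Mpc, a value not stated in this section; the photon density uses the coefficient of Equation (12.2).

```python
import math

G = 6.6743e-11                   # gravitational constant, m^3 kg^-1 s^-2
M_P = 1.67262e-27                # proton mass, kg
METRES_PER_MPC = 3.0857e22
H0 = 67.4e3 / METRES_PER_MPC     # assumed Hubble constant, converted to s^-1
OMEGA_B0 = 0.049                 # baryon density parameter from the text
T0 = 2.7255                      # CMB temperature today, K

rho_crit = 3.0 * H0**2 / (8.0 * math.pi * G)   # critical mass density, kg m^-3
n_baryon = OMEGA_B0 * rho_crit / M_P           # baryons per m^3 today
n_photon = 2.03e7 * T0**3                      # photons per m^3, Equation (12.2)
eta = n_baryon / n_photon

print(f"n_b = {n_baryon:.3f} m^-3, n_gamma = {n_photon:.3e} m^-3, eta = {eta:.2e}")
```

Both number densities dilute as the cube of the scale factor, which is why the ratio has stayed the same since decoupling.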
The observed angular size of the acoustic peaks is dependent on the intrinsic
curvature of the universe. A positive curvature would lead to perceived angular sizes
being smaller, and hence acoustic peaks would be displaced to larger ℓ values than
for a flat universe. Conversely a negative curvature would displace the peaks to
smaller ℓ values. The position of the peaks therefore can be used to determine the
sum Ω m0 + ΩΛ0, which would be unity in a flat universe (ignoring the tiny Ω r0). We
5
In Chapter 15 the speed of plasma waves is examined in more detail in connection with structure growth later
in the universe.
can show in a simple manner that the location of the first peak is where it would be
expected in the ΛCDM model with its flat universe.
The distance the plasma waves would have traveled up to the time of recombi-
nation is the sound horizon distance
r_s = a_{rec} \int_0^{t_{rec}} \frac{c_s\, dt}{a(t)},   (12.16)
where the subscript rec signifies the time at recombination when the components of
the plasma decouple. Using Equation (10.31) with the parameters of ΛCDM model,
trec is 365,000 years and
rs = 0.150 Mpc. (12.17)
This distance, the extent of the sound horizon over the last scattering surface, is
important in providing a well-defined length scale (a ruler) at recombination. We
shall see in Chapter 15 how a comparison is made with a length defined by the
oscillations inherited by the baryons in the universe at low redshift. The comparison
provides a crucial test for the ΛCDM model of the universe. (Note that we will need
to tweak Equation (12.16) slightly in the following chapter.) The sound horizon
today subtends an angle at the Earth of
θ = rs / dA, (12.18)
where dA is the angular diameter distance to the last scattering surface. dA evaluated
using the ΛCDM model is
d_A = \left[ \int_0^{t_0} \frac{c\, dt}{a(t)} \right] \frac{1}{1 + z_{rec}} = 13.0 \text{ Mpc}.   (12.19)
Here the time before recombination has been included because it is negligible
compared to the lifetime of the universe today. Thus the angle subtended by plasma
waves just reaching maximum compression for the first time at recombination is
θ ≈ 0.150/13.0 = 0.0115, (12.20)
that is 0.66°. Projecting this onto partial waves gives ℓ ≈ 180°/0.66° ≈ 270. This is
broadly consistent with the measurements shown in Figure 12.3: the first peak in the
power spectrum lies at ℓ ≈ 220, while the spacing between peaks is around 300.
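The chain from Equation (12.16) to the estimate of ℓ can be reproduced numerically. The following is a rough sketch under stated assumptions rather than the calculation used in the text: it takes c_s = c/sqrt(3) throughout, the density parameters quoted with Figure 11.1, z_rec = 1090, and an assumed H0 of 67.4 km/s/Mpc. Both integrals are rewritten in terms of the scale factor using c dt/a = (c/H0) da/sqrt(Ω_r0 + Ω_m0 a + Ω_Λ0 a^4).

```python
import math

OMEGA_R0, OMEGA_M0, OMEGA_L0 = 9.0e-5, 0.30, 0.70   # values quoted with Figure 11.1
Z_REC = 1090.0                                      # redshift of recombination
H0_KM_S_MPC = 67.4                                  # assumed; not stated in this section
HUBBLE_DISTANCE_MPC = 299792.458 / H0_KM_S_MPC      # c / H0 in Mpc

def integrand(a):
    # 1 / (a^2 H / H0) for radiation + matter + cosmological constant
    return 1.0 / math.sqrt(OMEGA_R0 + OMEGA_M0 * a + OMEGA_L0 * a**4)

def simpson(a_low, a_high, n_steps=100_000):
    """Composite Simpson estimate of the integral of integrand(a); n_steps must be even."""
    h = (a_high - a_low) / n_steps
    total = integrand(a_low) + integrand(a_high)
    for k in range(1, n_steps):
        total += (4.0 if k % 2 else 2.0) * integrand(a_low + k * h)
    return total * h / 3.0

a_rec = 1.0 / (1.0 + Z_REC)
# Sound horizon at recombination, as a proper length (Equation 12.16 with c_s = c / sqrt(3))
r_s_mpc = a_rec * HUBBLE_DISTANCE_MPC * simpson(0.0, a_rec) / math.sqrt(3.0)
# Angular diameter distance to last scattering (Equation 12.19, integrating from a = 0)
d_a_mpc = a_rec * HUBBLE_DISTANCE_MPC * simpson(0.0, 1.0)

theta_deg = math.degrees(r_s_mpc / d_a_mpc)
ell_estimate = 180.0 / theta_deg

print(f"r_s = {r_s_mpc:.3f} Mpc, d_A = {d_a_mpc:.1f} Mpc")
print(f"theta = {theta_deg:.2f} degrees, ell of order {ell_estimate:.0f}")
```

The outputs land close to the quoted 0.150 Mpc and 13.0 Mpc, with an angle of roughly two thirds of a degree; small differences reflect the assumed H0 and the constant sound speed.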
The curves shown in Figures 12.3 and 12.6 are the result of fitting the ΛCDM
model to the data, by varying parameters including Ω m0 and ΩΛ0 . Their sum is found
to lie close to unity providing strong evidence for the flatness of the universe. As a
measure of the sensitivity, note that if the sum Ω m0 + ΩΛ0 were reduced to 0.1 then
the first peak would move to ℓ = 800. In more detail, Figure 12.4 shows the
restriction imposed by such fits to the CMB spectrum (Suzuki et al., 2012) on the
permissible range of values of Ωm0 and ΩΛ0 . Those consistent with the CMB data lie
within the orange-red boundaries at confidence levels of 68%, 95% and 99.7%. The
line Ωm0 + ΩΛ0 = 1.0 traces out the combinations that would make the universe flat.
The other limitations shown come from measurements of baryon acoustic
Figure 12.4. Current values of ΩΛ (ΩΛ0 ) versus Ω m (Ω m0 ). Figure 5 from (Suzuki et al., 2012). Courtesy
Professor Suzuki for the copyright holders.
The last scattering of the CMB photons is spread over time and during that period
there is the opportunity for scatters to smooth out the temperature fluctuations. This
effect, known as Silk damping, is limited to the thickness of the last scattering layer.
The effect grows in importance as the potential wells’ physical scale decreases, and
thus with increasing ℓ . Making the baryon density larger would shorten the photon
mean free path and spread the damping to larger ℓ values. This as well as the relative
heights of the even to odd peaks constrains Ωb0 .
The power in the fluctuations with a given ℓ value is observed to have a Gaussian
distribution: this is what is expected to result from the model for inflation discussed
in the next chapter. One consequence of a Gaussian distribution is that the two-point
correlations analyzed above account for all correlations. All the information content
of the correlations is contained in the power spectra shown in Figures 12.3 and 12.6.
In addition the fluctuations observed in the CMB are consistent with being
adiabatic. That is to say, the fractional change induced in number density is the
same for all the particle species: this is a further prediction made with the model of
inflation presented in the next chapter.
\delta T / T = -\delta a / a = -(2/3)\Phi / c^2.
Relative to this modified black body temperature the spectral shift of the photons is
reduced to
\delta T / T = \Phi / c^2 - (2/3)\Phi / c^2 = (1/3)\Phi / c^2.
The temperature fluctuations calculated for a super-horizon mode are therefore
independent of the dimensions of the potential fluctuation and this gives a plateau in
the power spectrum at ℓ < 30, the Sachs–Wolfe plateau.
where jℓ is the ℓ th spherical Bessel function. If ns is unity the coefficient for the
spectrum shown in Figure 12.3 is simply
\ell(\ell + 1) C_\ell^{SW} = 8A_s / 25,   (12.24)
6
Section 9.9 in the second edition of “Modern Cosmology” by Scott Dodelson and Fabian Schmidt,
Academic, New York, 2021.
giving the observed near constancy of the Sachs–Wolfe plateau. Fitting the data gives
A_s = 2.1 × 10^{-9}, making the fluctuations \sqrt{A_s} ∼ few × 10^{-5}. This is the most direct
measure of the strength of the curvature fluctuations produced during inflation.
By fitting the model to the measured power spectrum the Planck Collaboration found
the value ns = 0.965, close to but slightly below unity. Two
physicists, Harrison and Zeldovich, had independently proposed that the power
spectrum for the perturbations had ns equal to unity. The distribution is then scale
invariant, that is to say each logarithmic interval in k then contains equal power. Put
another way, it is fractal. The model of inflation presented in the following chapter
accounts for ns being close to, but marginally less than unity.7
12.8 Polarization
The CMB is weakly polarized and the correlations in polarization across the sky
have been measured by the Planck Collaboration. They simultaneously fitted the
temperature correlations, the polarization correlations, and the temperature–polar-
ization correlations using the ΛCDM model, varying parameters including Ω r0 , Ω m0,
and ΩΛ0 . Two of these precise fits are shown in Figures 12.3 and 12.6.
Polarization of the CMB was produced at the last scattering surface, in the final
Thomson scatter of each photon, before it began traveling freely across the universe.
Asymmetries in the CMB temperature distribution are converted in this last scatter
to polarization of the CMB. The explanation for this conversion takes a few steps:
remember that photons are polarized perpendicular to the direction of travel. It
7
An attractive mathematical property of the Harrison–Zeldovich power spectrum is that it is the only power
law distribution that avoids a divergence at both very small and very large values of k.
starts by noting that the Thomson cross-section depends on the relative alignment of
the polarizations (e) of the incoming and outgoing photons. This happens because
the incoming photon excites the electron to oscillate along the direction of its
polarization, and then the electron emits a photon with that same polarization. As a
result the Thomson differential cross-section is
d\sigma_T / d\Omega = r_e^2 |e_{in} \cdot e_{out}|^2,   (12.25)
where re is the classical radius of the electron, 2.82 × 10^{-15} m. Thus the scattering is
strong when the polarizations of incoming and outgoing photons are aligned, and
null when they are orthogonal. The polarization of the electric field measured by the
detectors on board Planck is of course exactly the polarization of these outgoing
photons. Figure 12.5 illustrates what happens at the last scattering in the simplest
arrangement in which incoming photons enter in the horizontal plane and scatter
through 90° upward. The lower panel shows the possibilities for scattering where a
dipole temperature asymmetry exists: that is where a rotation of π around the
direction of the outgoing scattered photon reverses the sign of the temperature
perturbation. The upper panel shows the possibilities for scattering where there is a
quadrupole temperature asymmetry: that is where a rotation of π /2 around the
direction of the outgoing photon reverses the sign of the temperature perturbation.
The incident photons from the hotter regions are drawn as incoming arrows using full
lines in the figure. They are more numerous than the incident photons from the colder
[Figure 12.5 panels: Quadrupole (upper) and Dipole (lower), with hot and cold incoming directions labelled.]
Figure 12.5. The last scattering in the presence of quadrupole and dipole temperature asymmetries. Only the
polarizations of the more numerous photons from the hotter regions are drawn. The polarization of a photon
that is scattered in the direction of the outgoing ray is drawn as a solid blue line: the polarization that cannot be
scattered in that direction is shown as a broken blue line.
regions: these latter are drawn with dotted arrowed lines. The two orthogonal possible
polarizations for the incoming photons are only indicated for the more numerous hotter
photons. The polarizations that lead to scattering along the outgoing ray are drawn
with full lines; polarizations not scattered along the outgoing ray are indicated using
broken lines. In the lower panel, illustrating the case of a dipole asymmetry the more
numerous outgoing hotter photons can have either of the orthogonal polarizations. On
the other hand in the upper panel, for the case of a quadrupole temperature asymmetry,
the more numerous hotter scattered photons are linearly polarized. Thus, finally, we see
that quadrupole asymmetries in the temperature perturbations at recombination
produce corresponding net polarization patterns in the CMB.
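The argument can be checked with a toy calculation based on Equation (12.25). In the sketch below, unpolarized beams arrive along ±x and ±y and scatter through 90° into the +z direction; each incoming polarization contributes to an outgoing polarization in proportion to |e_in · e_out|^2, with the constant factor r_e^2 dropped. The geometry, the intensities (1.1 for hot, 0.9 for cold, 1.0 for average) and the function names are illustrative assumptions.

```python
X, Y, Z = (1, 0, 0), (0, 1, 0), (0, 0, 1)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def outgoing_polarized_intensity(incoming):
    """Return (I_x, I_y) for light scattered into +z.

    incoming is a list of (direction, intensity) pairs for unpolarized beams travelling
    in the x-y plane; half of each beam's intensity is assigned to each of the two
    polarizations transverse to its direction of travel.
    """
    i_x = i_y = 0.0
    for direction, intensity in incoming:
        for e_in in (X, Y, Z):
            if dot(e_in, direction) != 0:
                continue  # polarization must be perpendicular to the direction of travel
            # The outgoing photon travels along z, so e_out is along x or y
            i_x += 0.5 * intensity * dot(e_in, X) ** 2
            i_y += 0.5 * intensity * dot(e_in, Y) ** 2
    return i_x, i_y

HOT, COLD, MEAN = 1.1, 0.9, 1.0

# Dipole: the sign of the perturbation reverses under a rotation by pi about z
dipole = [((1, 0, 0), HOT), ((-1, 0, 0), COLD), ((0, 1, 0), MEAN), ((0, -1, 0), MEAN)]
# Quadrupole: the sign reverses under a rotation by pi/2 about z
quadrupole = [((1, 0, 0), HOT), ((-1, 0, 0), HOT), ((0, 1, 0), COLD), ((0, -1, 0), COLD)]

for name, pattern in (("dipole", dipole), ("quadrupole", quadrupole)):
    i_x, i_y = outgoing_polarized_intensity(pattern)
    print(f"{name}: I_x = {i_x:.2f}, I_y = {i_y:.2f}, net = {i_x - i_y:+.2f}")
```

In this toy model the dipole pattern leaves the two outgoing polarizations equally populated, while the quadrupole pattern leaves an excess polarized perpendicular to the hot axis.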
During recombination the CMB polarization was being continually reset by
Thomson scattering. It is therefore only a layer of the universe of thickness equal to
the mean free path of the photons that influences the polarization. Consequently the
strength of polarization correlations is much weaker than the thermal correlations.
The polarization across the sky is again expressed in terms of spherical harmonics
E(\theta, \phi) = \sum_{\ell=0}^{\infty} \sum_{m=-\ell}^{\ell} a_{\ell m} Y_{\ell m}(\theta, \phi).   (12.26)
The correlations between polarizations are defined in a similar way to those for
temperature asymmetries
C^{EE}(\theta) = \langle E(\hat{a}) E(\hat{b}) \rangle.   (12.27)
Projecting out the angular momentum states gives the coefficient C_ℓ^{EE}, which is
plotted in Figure 12.6. The acoustic peaks in the polarization correlations are in
antiphase to those in the temperature correlations seen in Figure 12.3. This is
expected because the polarization is proportional to the velocity of the plasma flow,
while the temperature maps the plasma density.
In principle the primordial quantum fluctuations left by inflation could have been
scalar, that is density fluctuations, or tensor fluctuations due to gravitational waves
excited during inflation. These types of fluctuation produce very different and
distinctive patterns of polarization in the area of the sky around hot and cold spots.
In the case of scalar fluctuations the electric field is either radial or tangential, the
E-mode. In contrast the tensor pattern would have the field at 45°, left or right of these
directions, swirling around the hot/cold spot. This is the B-mode, in appearance
analogous to magnetic field lines. The observed polarization is almost entirely
E-mode, thus requiring scalar primordial perturbations.
It is worth repeating that Figures 12.3 and 12.6 illustrate how well the ΛCDM
model can fit all the detailed features of the power and polarization spectra. There
are two key properties of the CMB that imply analogous features in baryonic
matter, and these features should be apparent in baryonic structures in the low
redshift universe. First the near scale invariant (Harrison–Zeldovich) power spectrum
of perturbations; second the angular correlation peak marking the sound
horizon at recombination. As we shall see, fits made with the ΛCDM model provide
a consistent match to the observed evolution.
Figure 12.6. Polarization–polarization correlations of the CMB measured by the Planck Collaboration. The
experimental errors are displayed in the lower panels. Adapted from Figure 1 in Planck Collaboration et al.
(2020). Reproduced with permission © ESO.
This is only possible energetically if the neutrinos have energies in the center-of-mass
frame greater than the electron mass $\times\, c^2$ ($\epsilon = 511$ keV). The corresponding
temperature of the universe is $\epsilon/k_B$ or $6 \times 10^9$ K. At lower temperatures the
neutrinos decouple from matter.
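As a quick numerical check of the temperature quoted above, the conversion $T = \epsilon/k_B$ can be sketched in a few lines of Python. The constant and function names are illustrative choices rather than anything defined in the text; the Boltzmann constant is the CODATA value expressed in eV per kelvin.

```python
# Sketch: convert the pair-production threshold energy to a temperature, T = epsilon / k_B.
K_B_EV_PER_K = 8.617333262e-5     # Boltzmann constant in eV/K (CODATA value)
ELECTRON_REST_ENERGY_EV = 511e3   # electron rest energy quoted in the text, in eV


def threshold_temperature(energy_ev: float) -> float:
    """Return the temperature (K) at which k_B * T equals the given energy (eV)."""
    return energy_ev / K_B_EV_PER_K


T_threshold = threshold_temperature(ELECTRON_REST_ENERGY_EV)
print(f"T = {T_threshold:.2e} K")
```

The result comes out just under $6 \times 10^9$ K, consistent with the rounded figure in the text.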
Between neutrino and photon decoupling the photons picked up energy when the
electrons and positrons annihilated
$$e^- + e^+ \rightleftharpoons \gamma + \gamma; \tag{12.29}$$
this energy is not accessible to the neutrinos. The resulting difference between the
temperatures of the CNB and the CMB can be predicted precisely:
$$T_\nu = (4/11)^{1/3}\, T_\gamma, \tag{12.30}$$
which makes the present temperature of the CNB neutrinos 1.94 K.
The ratio of the number of CNB neutrinos plus anti-neutrinos of each species to the number of
CMB photons is again precisely predicted at 3/11, so there are $1.12 \times 10^8$ m$^{-3}$ of each species today.
Hence you and I are each being traversed instantaneously by well over ten million
CNB neutrinos! The corresponding incident CMB photons mostly get absorbed by
water molecules in the atmosphere.
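Equation (12.30) and the 3/11 ratio can be evaluated directly. The sketch below assumes a present CMB temperature of 2.725 K and a present CMB photon density of $4.11 \times 10^8$ m$^{-3}$; neither value is quoted in this section, so both should be read as inputs supplied for illustration.

```python
# Sketch: present-day CNB temperature and number density from the CMB values.
T_CMB_K = 2.725            # assumed present CMB temperature, in kelvin
N_GAMMA_PER_M3 = 4.11e8    # assumed present CMB photon number density, in m^-3

# Equation (12.30): neutrinos missed the heating from electron-positron annihilation.
T_nu = (4.0 / 11.0) ** (1.0 / 3.0) * T_CMB_K

# Number of neutrinos plus anti-neutrinos relative to photons, as quoted in the text.
n_nu = (3.0 / 11.0) * N_GAMMA_PER_M3

print(f"T_nu = {T_nu:.3f} K, n_nu = {n_nu:.3e} m^-3")
```

With these inputs the temperature lands close to the 1.94 K quoted in the text and the density close to $1.12 \times 10^8$ m$^{-3}$.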
Experiments in particle physics show that the neutrino masses are of order 0.2 eV $c^{-2}$.
Consequently the neutrinos formed a relativistic gas in the early universe, and exerted
pressure. Therefore they must be included with the photons in the relativistic radiation
when calculating the properties of the universe for this period. Figure 10.3 separates the
CNB and the CMB in showing how the contents of the universe have changed with time.
Big Bang nucleosynthesis, discussed in Chapter 14, provides the most direct information
on the number density of neutrinos.
12.11 Exercises
1. Make an argument in terms of angular momentum conservation, which rules
out polarization of the CMB, originating from higher multipole components
of the temperature asymmetry than the quadrupole asymmetry.
2. Make an estimate of the optical depth along the path of the CMB, assuming
that reionization is complete at z = 10. Also assume that the universe
develops as if matter dominated so that $H = H_0/a^{3/2}$.
3. Use Equation (12.1) to show that for photon energies $E$ much larger than
$k_B T$ the black body spectrum reduces to
$$n(E)\,dE = \frac{8\pi E^2}{c^3 h^3} \exp[-E/k_B T]\, dE.$$
Show that the number with energy greater than $E$ per meter cubed is
approximately
$$8\pi \left[\frac{k_B T}{hc}\right]^3 \left[\frac{E}{k_B T}\right]^2 \exp[-E/k_B T].$$
This requires integration by parts, keeping only the term in $\left[\frac{E}{k_B T}\right]^2$. What
fraction is this of the black body spectrum? At 5700 K what fraction of the
black body photons has energy greater than 13.6 eV?
4. Calculate the redshift and the time after the Big Bang when the CMB was at
room temperature. Would it have suffered rescattering at that era?
5. At z = 1090 what was the mean free path for Thomson scattering and what
was the interval between collisions?
Further Reading
Liddle A 2015 An Introduction to Modern Cosmology (3rd ed.; New York:
Wiley) (Oxford: Oxford Univ. Press). This text provides a compact
introduction.
Samtleben D, Staggs S and Winstein B 2007 The Cosmic Microwave
Background for Pedestrians in Annual Review of Nuclear and Particle Science 57 245.
This is a very helpful 38 page-long article for the non-expert.
Dodelson S and Schmidt F 2021 Modern Cosmology (2nd ed.; Amsterdam:
Elsevier). This gives a detailed expert analysis using the Boltzmann and
Einstein equations.
References
Hill, R., Masui, K. W., & Scott, D. 2018, ApSpe, 72, 663
Penzias, A. A., & Wilson, R. W. 1965, ApJ, 142, 419
Planck Collaboration, Akrami, Y., & Arroja, F. 2020, A&A, 641, A10
Sachs, R. K., & Wolfe, A. M. 1967, ApJ, 147, 73
Suzuki, N., Rubin, D., Lidman, C., et al. 2012, ApJ, 746, 85
Chapter 13
Inflation in the Early Universe
$$a^2 \rho\, (\Omega^{-1} - 1) = -\frac{3\kappa c^2}{8\pi G}, \tag{13.1}$$
13.2 Inflation
The commonly accepted solution for all three problems was suggested by Guth
(Guth & Steinhardt, 1984) in 1981, with later contributions from Linde (Linde,
1987) and Steinhardt (Guth & Steinhardt, 1984). The solution is that the universe
inflated almost instantaneously shortly after its creation and by a huge factor: space
expanded much faster than the speed of light. As a result regions that were in causal
contact expanded to a size larger than the part of the universe we observe. This
explains why the CMB is close to homogeneous over the whole sky. As a bonus
the expansion flattened out any existing curvature of spacetime in the universe. The
cause of this inflationary epoch is thought to be a state change of the empty universe,
the vacuum, dropping to a lower energy vacuum. This change released an enormous
amount of energy into matter and radiation: an event which we recognize as being
the Big Bang.
¹ The ratio is predicted to be the inverse of the fine structure constant.
First we look further into how inflation solves the three problems, and then
describe the commonly inferred version of the phase transition. In Figure 13.1 the
broken line shows how the horizon would increase with time in the absence of
inflation, while the solid line shows how the universe now visible to us would have
grown. The figure shows again that the points on opposite sides of the sky we see
today would never have been in causal contact in the remote past. Figure 13.2 shows
the corresponding development with inflation by a factor of, for example, $10^{50}$ at
$10^{-36}$ s. With inflation the region that was in causal contact in the past has become
much larger than the universe now visible to us. This explains why the CMB is so
uniform across the whole sky. Note that the figures were drawn choosing a scale
factor $a \propto t^{1/2}$, appropriate for a radiation dominated universe: other choices would
not affect the conclusions reached here. Inflation by $10^{50}$ would flatten the curvature
of the universe by the same factor, leaving a curvature undetectable using current
measurement techniques.
The vast change in volume following inflation would so dilute the numbers of
existing monopoles, that detection of even a single monopole in the near future is
highly unlikely. Equally all other matter would be diluted to the same extent leaving
a virtually empty universe. This invites a question: why is the universe now
populated with matter, despite further gentle expansion? As previously noted, it is
thought that at the end of inflation the universe fell into a lower energy vacuum
state. The compensating huge energy release appeared as matter and radiation: this
event becomes the Big Bang.
[Figure 13.1 plot: vertical axis $R/R_0$ on a logarithmic scale from 1 down to $10^{-80}$; lines labelled "Previous size of present visible Universe", "Horizon distance" and "ct".]
Figure 13.1. The broken line indicates how the distance to the horizon increased through the life of the
universe. The solid line extrapolates the size of that part of the universe, which we now see back into the past,
without inflation.
[Figure 13.2 plot: vertical axis $R/R_0$ on a logarithmic scale from 1 down to $10^{-80}$; line labelled "Horizon distance".]
Figure 13.2. The broken line indicates how the distance to the horizon increased through the life of the
universe. The solid line extrapolates the size of that part of the universe, which we now see back into the past,
taking account of inflation.
[Figure 13.3 plot: curves labelled U(1), SU(2) and Strong; horizontal axis "log10 Energy in GeV" from 0 to 20; vertical axis ticks at 0, 20 and 40.]
Figure 13.3. Long range extrapolation of the coupling strengths of the electroweak and strong forces,
according to the standard model of particle physics.
transition is now well understood, and some reasonable inferences can be drawn
about the grand unification transition. At the electroweak phase transition the
vacuum state of the universe changed. A neutral scalar field extending uniformly
over the whole universe, the Higgs field, dropped into a lower energy state with a
release of energy into baryonic matter. It is argued that at the grand unification
transition an analogous neutral scalar field underwent a similar transition. This field
is generally called the inflaton field. Although a rigorous quantum field theory has
been developed to describe the electroweak transition, a parallel theory does not
exist yet to analyze what happens in the early universe. In such cases where the
mathematical underpinning is not available an effective field theory is built
nonetheless: this approach proved highly successful in modeling reality in the cases of the
superfluid and superconducting transitions.² The approach here is to present at an
elementary level the corresponding analysis of the inflationary epoch in terms of this
inflaton field. Following inflation by the factor $\sim 10^{30}$ any pre-existing matter and
radiation is so diluted that the universe is essentially empty, apart from the inflaton
field. This field is assumed to be a scalar field like the Higgs field, and equally, like
the Higgs field, uniform and isotropic.
Our analysis of inflation starts with the acceleration Equation (10.7) in the era of
interest
$$\ddot{a}/a = -\frac{4\pi G}{3c^2}\, [\varepsilon_\phi + 3p_\phi],$$
where $\varepsilon_\phi$ and $p_\phi$ are the energy density and pressure of the inflaton field. If there is to
be inflation, then $\ddot{a} > 0$, and in the equation of state of the inflaton $p_\phi = w\varepsilon_\phi$, $w$ must
be less than $-1/3$. The simplest case is to have $w = -1$, which, we shall see, fits the
² The success of the earlier developed effective field theory for superconductivity led Anderson to propose how
the eventual rigorous theory of the electroweak transition could be constructed.
$$H = \left[\frac{8\pi G \varepsilon_\phi}{3c^2}\right]^{1/2}. \tag{13.7}$$
If $H$ remains constant during inflation the number of e-foldings of the universe’s
dimensions during an inflationary period of duration $\Delta t$ is
$$N = H\, \Delta t. \tag{13.8}$$
Using Equation (13.5) we see that any initial regions of extreme curvature with $\Omega_k$
approaching unity were smoothed out by inflation if
$$\exp(2H\, \Delta t) > 10^{57}. \tag{13.9}$$
Applying Equation (13.8), the flattening and smoothing of the universe requires 65
or more e-foldings. The corresponding expansion in the scale factor during inflation, by a factor $e^{65} \sim 10^{28}$ or more, caused the
temperature to fall by the same large factor. At this point the universe was both very
cold and virtually empty. The burst of inflation then ended with the release of the
energy of the inflaton field into an energetic soup of matter and radiation. This
process reheated and repopulated the universe with matter and radiation. This was
effectively the Big Bang.
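The arithmetic linking Equations (13.8) and (13.9) is short enough to check by hand, and the following Python sketch does the same thing. The function name is an illustrative choice; the only input is the factor $10^{57}$ from Equation (13.9).

```python
import math


def efolds_required(curvature_suppression: float) -> float:
    """Solve exp(2 N) = curvature_suppression for N, with N = H * delta_t."""
    return 0.5 * math.log(curvature_suppression)


N_min = efolds_required(1e57)

# Growth of the scale factor itself is exp(N); express it as a power of ten.
growth_log10 = N_min / math.log(10.0)

print(f"N >= {N_min:.1f} e-foldings, scale factor grows by about 10^{growth_log10:.1f}")
```

This gives a minimum of roughly 65.6 e-foldings, in line with the "65 or more" quoted in the text, and a growth in the scale factor of about $10^{28.5}$.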
The comoving Hubble radius
$$\frac{c}{aH} = \frac{c}{\dot{a}}$$
introduced after Equation (11.6) measures the range of causal effects at a given
moment. Figure 13.4 shows how this distance changed as a function of the scale
parameter $a$ during and after inflation, on a log-log scale. During inflation $H$ was
constant so that $a$ grew exponentially, and so therefore did $\dot{a}$. As a result the horizon
for causal contact shrank correspondingly rapidly. After inflation, while radiation
dominated, $H \propto a^{-2}$ so that now the horizon grew steadily and its growth continues
to the present. The path of a typical quantum fluctuation (mode) responsible for the
thermal correlations in the photons in the CMB is drawn in Figure 13.4. During
inflation it would first exit the horizon, and then at some later time re-enter the
horizon.
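The two regimes described above correspond to straight lines of opposite slope on a log-log plot of comoving Hubble radius against scale factor. The Python sketch below illustrates this with arbitrary units ($c = 1$, and $H = 1$ at $a = 1$); the helper names are hypothetical and the normalizations are assumptions made only to expose the slopes.

```python
import math


def comoving_hubble_radius(a: float, hubble: float, c: float = 1.0) -> float:
    """Return c / (a H), the comoving Hubble radius."""
    return c / (a * hubble)


def log_slope(hubble_of_a, a: float, factor: float = 10.0) -> float:
    """Estimate d ln(c/aH) / d ln(a) between a and factor * a."""
    r1 = comoving_hubble_radius(a, hubble_of_a(a))
    r2 = comoving_hubble_radius(a * factor, hubble_of_a(a * factor))
    return math.log(r2 / r1) / math.log(factor)


# During inflation H is constant, so c/(aH) falls as 1/a.
inflation_slope = log_slope(lambda a: 1.0, 1.0)

# In the radiation era H is proportional to a^-2, so c/(aH) rises as a.
radiation_slope = log_slope(lambda a: a ** -2, 1.0)

print(inflation_slope, radiation_slope)
```

The slopes come out as $-1$ and $+1$ respectively, which is why a mode of fixed comoving wavelength first exits and later re-enters the horizon.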
[Figure 13.4 plot: horizontal axis "Log (scale factor)"; labels "Horizon", "exit horizon", "re-enter horizon", "10-36s of Inflation" and "Reheating".]
Figure 13.4. Evolution of the comoving Hubble radius during and after inflation. A typical fluctuation is
shown, first leaving and then re-entering the horizon. The energy release and reheating after inflation are
discussed later.
time measured from the moment of re-entering the horizon. Among all the modes,
those modes whose amplitude completed one half cycle of oscillation between re-
entering the horizon and recombination were in phase with one another at
recombination and added coherently to produce the first peak in the CMB
temperature correlation power spectrum. If instead these waves had re-entered with
random phases there would be cancellation between them at recombination and
hence no spectral peak in the CMB power spectrum. This coherence was essential in
producing all the acoustic peaks seen in these spectra (Dodelson, 2003). Figure 13.5
shows how the photon perturbations in plasma waves changed after inflation
between the moment of re-entering the horizon and recombination. The black line
shows an example of a wave that completed one half cycle at recombination; such
waves added coherently to give the first peak in the CMB temperature correlation
power spectrum. Referring back to Equation (12.16) we see now that the integral
used to evaluate the sound horizon distance must run from the moment at which this
perturbation re-enters the horizon up to the moment of recombination:
$$r_s = a_{\rm rec} \int_{t_{\rm reentry}}^{t_{\rm rec}} \frac{c_s\, dt}{a(t)}. \tag{13.10}$$
Going back to Figure 13.5, the blue line shows a wave that completed one full cycle
at recombination and contributed to the second peak. Waves such as that shown in
red did not change much before recombination because their wavelengths were
much larger than the horizon at recombination: these are called superhorizon modes.
We saw that their contributions at recombination formed the Sachs–Wolfe plateau.
Figure 13.5. Notional development of photon perturbations between reentering the horizon and
recombination. $\Delta t$ is the time since reentering the horizon and $\Delta t_{\rm rec}$ is the time between reentry and recombination.
Waves drawn in black contribute to the first (compression) acoustic peak; those in blue contribute to the
second (rarefaction) acoustic peak. An example of a superhorizon mode is drawn in red.
Other key features of the inhomogeneities observed in the CMB, the scale
invariance, Gaussianity and adiabaticity, also emerge naturally in the model of
inflation presented here.
where $\phi$ is the field amplitude. $\dot{\phi}^2/2$ is the density of kinetic energy and $V(\phi)$ is the
potential energy density. This potential energy density is the result of the interaction
that the field at one point has with the surrounding field. The field exerts a pressure
$$p_\phi = \dot{\phi}^2/2 - V(\phi). \tag{13.12}$$
The equation of state of the field is $p_\phi = w\varepsilon_\phi$, so that, as with dark energy, $w$ must be
negative to give inflation, and close to $-1$ to give scale invariance. That means that
the potential must dominate:
$$V(\phi) \gg \dot{\phi}^2. \tag{13.13}$$
Figure 13.6 shows a suitable form for the potential energy of the vacuum of the
universe as a function of the amplitude of the inflaton field. This potential gives rise
to what is called slow roll inflation.
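The link between potential dominance and $w \approx -1$ can be made concrete with a small Python sketch. It assumes the energy density of the field is the sum of the kinetic and potential terms described above (the standard form, taken here to be Equation (13.11)), and uses arbitrary units; the function name and the sample values are illustrative only.

```python
def equation_of_state(phi_dot: float, potential: float) -> float:
    """Return w = p / epsilon for a homogeneous scalar field."""
    kinetic = 0.5 * phi_dot ** 2
    energy_density = kinetic + potential   # assumed form of Equation (13.11)
    pressure = kinetic - potential         # Equation (13.12)
    return pressure / energy_density


# Potential-dominated (slow roll): w is very close to -1.
w_slow_roll = equation_of_state(phi_dot=0.01, potential=1.0)

# Boundary case phi_dot^2 = V: w = -1/3, where acceleration stops.
w_boundary = equation_of_state(phi_dot=1.0, potential=1.0)

# Kinetic-dominated: w approaches +1 and there is no inflation.
w_kinetic = equation_of_state(phi_dot=10.0, potential=1.0)

print(w_slow_roll, w_boundary, w_kinetic)
```

In this parametrization the condition $w < -1/3$ for accelerated expansion is equivalent to $\dot{\phi}^2 < V$, a slightly weaker requirement than the strong inequality in Equation (13.13).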
Figure 13.6. Slow roll inflation: variation of the inflaton field’s potential energy as a function of the field’s
amplitude. The solid circle marks the initial metastable state before inflation.
Initially the universe would have been in a metastable false vacuum with
vanishing amplitude of the inflaton field: at the point marked by the black ball.
Like a ball poised on a shallow slope the state of the universe rolled down the slope
ending in the true current vacuum at the minimum in the potential well. In the steep
fall into the minimum the inflaton field picked up kinetic energy so that Equation
(13.13) no longer held. Then the inflaton field performed damped simple harmonic motion (SHM) in the well.
This damping took the form of the conversion of the energy of the inflaton field to a
soup of matter and radiation that became the matter and radiation that now fill the
universe. Consequently the universe was reheated back to its temperature before
inflation.
The analysis of the slow roll starts with the fluid Equation (10.8)
$$\dot{\varepsilon} = -3H(p + \varepsilon).$$
Substituting for the energy density and pressure using Equations (13.11) and (13.12)
this becomes
$$\ddot{\phi}\dot{\phi} + \dot{V}(\phi) = -3H\dot{\phi}^2. \tag{13.14}$$
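The algebra behind this substitution, and the step to the equation of motion discussed next, can be written out as follows. This is a reconstruction under two assumptions not spelled out here: that Equation (13.11) has the standard form $\varepsilon_\phi = \dot{\phi}^2/2 + V(\phi)$, and that $V' \equiv dV/d\phi$, so that $\dot{V} = V'\dot{\phi}$ by the chain rule. The final line is presumably the form the text refers to as Equation (13.16).

```latex
\begin{align*}
\varepsilon_\phi + p_\phi &= \dot{\phi}^2, \\
\dot{\varepsilon}_\phi &= \dot{\phi}\,\ddot{\phi} + V'(\phi)\,\dot{\phi}, \\
\dot{\phi}\,\ddot{\phi} + V'(\phi)\,\dot{\phi} &= -3H\dot{\phi}^2
\quad\Longrightarrow\quad
\ddot{\phi} = -V'(\phi) - 3H\dot{\phi}.
\end{align*}
```

Dividing through by $\dot{\phi}$ in the last step requires $\dot{\phi} \neq 0$, which holds once the field has started to roll.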
In this equation the potential gradient (V ′ is negative) provides the inflationary force,
and this is opposed by a damping term on the right-hand side, called Hubble friction.
The condition for prolonged inflation is that the terms on the right-hand side almost
cancel, the potential gradient term being larger. If the energy of the inflaton field is
the dominant energy in the universe, the Hubble constant
$$H = \left[\frac{8\pi G V}{3c^2}\right]^{1/2}. \tag{13.17}$$
Then over the range where the slope of the inflaton potential is constant and sufficiently
shallow both terms on the right-hand side of Equation (13.16) remain nearly
constant and there is constant acceleration down the slope. The near constant value
of $V$ appearing in Equation (13.17) makes $H$ in turn nearly constant, producing
nearly exponential inflation:
$$a = a_{\rm initial} \exp[H(t - t_{\rm initial})]. \tag{13.18}$$
Evidently the process is self-similar: after a time $1/H$ the growth repeats itself on a
scale e-times larger. As a result the quantum fluctuations in the inflaton field would
have close to the scale invariant form suggested by Harrison–Zeldovich. The
inflaton fluctuations induce corresponding fluctuations in the curvature/gravitational
potential of the universe, whence come the CMB thermal fluctuations. In
more detail $H$ cannot be precisely constant and the slow roll must have a finite rate:
each effect gives rise to a slight deviation from the Harrison–Zeldovich scale
invariance. Both effects reduce the spectral index $n_s$ appearing in Equation
(12.21) below unity. We saw in Chapter 12 that the CMB power spectrum measured
by the Planck Collaboration is indeed slightly tilted from the Harrison–Zeldovich
form: $n_s$ is 0.965 rather than being exactly unity.
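The interplay of potential gradient and Hubble friction can be explored numerically. The Python sketch below integrates the scalar field equation of motion $\ddot{\phi} = -V'(\phi) - 3H\dot{\phi}$ for a toy quadratic potential $V = m^2\phi^2/2$. Everything specific here is an assumption made for illustration: the potential is not the one drawn in Figure 13.6, the units are chosen so that $8\pi G/c^2 = 1$ (hence $H^2 = \varepsilon_\phi/3$), and the starting amplitude, mass and time step are arbitrary.

```python
import math


def simulate_slow_roll(phi0: float = 16.0, mass: float = 1.0, dt: float = 1e-3):
    """Integrate the field equation for the toy potential V = m^2 phi^2 / 2.

    Units are chosen so that H^2 = energy_density / 3.
    Returns the e-foldings accumulated while the expansion accelerates,
    and the list of Hubble rates at every step.
    """
    phi, phi_dot, efolds = phi0, 0.0, 0.0
    hubble_history = []
    # Accelerated expansion (w < -1/3) persists while phi_dot^2 < V.
    while phi_dot ** 2 < 0.5 * mass ** 2 * phi ** 2:
        potential = 0.5 * mass ** 2 * phi ** 2
        hubble = math.sqrt((0.5 * phi_dot ** 2 + potential) / 3.0)
        # Semi-implicit Euler step: update the velocity first, then the field.
        phi_dot += (-mass ** 2 * phi - 3.0 * hubble * phi_dot) * dt
        phi += phi_dot * dt
        efolds += hubble * dt          # dN = H dt
        hubble_history.append(hubble)
    return efolds, hubble_history


efolds, hubble_history = simulate_slow_roll()
print(f"e-foldings: {efolds:.1f}")
print(f"H at start: {hubble_history[0]:.2f}, H at end: {hubble_history[-1]:.2f}")
```

With these toy settings the field delivers a little over sixty e-foldings before the kinetic term catches up with the potential. The Hubble rate drifts downward only slowly along the way, which is the kind of small departure from exact scale invariance described above.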
To conclude, inflation solves the problems of flatness and homogeneity in the
universe at the time the CMB decouples. If there are monopoles then they would be
dispersed to undetectable levels in the universe. Because the perturbations are proposed to
originate in quantum field fluctuations, they are Gaussian, while the
quantum field can be designed to give near scale invariant and adiabatic fluctuations,
which accounts naturally for these observed properties of the CMB.
Thus far we have looked at two major events in the early universe: at $10^{-36}$ s when
grand unification ended in inflation, and at 365,000 years after the Big Bang, when
radiation and matter decoupled. In between the temperature of the universe fell
through several thresholds. The one we understand best, when the nuclei of the
lightest elements in the universe condensed out of baryonic matter, is the main topic
of the next chapter.
13.6 Exercises
1. Estimate the value of the inflaton energy density at the start of inflation.
2. Explain why fluctuations that leave the Hubble horizon are frozen until they
re-enter this horizon.
3. How important is dark energy during the inflation at the grand unification
energy scale?
Further Reading
Liddle A 2015 An Introduction to Modern Cosmology (3rd ed.; New York:
Wiley) (Oxford: Oxford Univ. Press). This text provides a compact
introduction.
Dodelson S and Schmidt F 2021 Modern Cosmology (2nd ed.; New York:
Academic). Chapters 7 and 9. A text for anyone who wants to penetrate more
deeply into the subtleties of inflation and the CMB. It is rigorous, clear and
fully up-to-date.
References
Dodelson, S. 2003, AIPCP, 689, 184
Guth, A. H., & Steinhardt, P. J. 1984, SciAm, 250, 116
Linde, A. 1987, PhT, 40, 61
Chapter 14
Big Bang Nucleosynthesis
14.1 Timeline
We can infer much about the history of baryonic matter in the universe following
inflation from what we know about the properties of elementary particles. This era
culminated between one second and five minutes after the Big Bang, with baryonic
matter condensing into light nuclei; a process that is called Big Bang nucleosynthesis
(BBN). It was then that the proportions of hydrogen, helium, and lithium in the
universe were established. A small fraction of this primordial matter was later
reprocessed into the heavier elements in stars and supernovae. Table 14.1 summa-
rizes the important steps in the evolution of the early universe up to the BBN and
beyond. Before describing BBN in detail, the evolution of baryonic matter between
the end of inflation and BBN will be sketched.
The existence of the first phase change affecting the baryons after inflation, the
electroweak transition, is integral to the standard model of elementary particle
physics. At that transition the weak and electromagnetic forces became distinct and
their overall symmetry was lost. The discovery of the predicted scalar Higgs boson in
2012 at the Large Hadron Collider (LHC) at CERN was ample confirmation of the
essential soundness of our understanding of electroweak physics. The Higgs boson is
the quantum of a scalar field filling all space uniformly. At the electroweak transition
the Higgs field changed to a lower energy state and the universe dropped from one
vacuum state to one of lower energy. The Higgs mass $m_H = 125$ GeV $c^{-2}$ sets the
energy of the transition; collisions in which the available energy falls below 125 GeV
can no longer produce Higgs bosons. The model for inflation as described in the
previous chapter is analogous to the better understood electroweak transition.
After the electroweak transition the temperature of the universe continued high
enough that baryonic matter remained in the form of a quark-gluon plasma. At
particle energies below 1 GeV the quarks condensed into hadrons: they become
Table 14.1. Timeline for Radiation and Baryonic Matter up to the Decoupling of the CMB from Matter
interactions per second. Also, like photons, their number density $n_\nu$ is proportional
to $a^{-3}$, hence
$$n_\nu \propto a^{-3} \propto T^3.$$
The weak cross-sections have a temperature dependence
$$\sigma \propto G_F^2 T^2,$$
where $G_F$ is the coupling constant for the weak interaction. Combining the last three
equations, the interaction rate
$$\Gamma \propto G_F^2 T^5.$$
Space was expanding at the Hubble rate $H \propto \sqrt{G}\, T^2$, and once $\Gamma \sim H$ the weak
reactions could no longer maintain equilibrium: the neutron/proton ratio was
frozen. The freeze-out temperature is
$$T_{\rm fr} \sim (G/G_F^4)^{1/6} \sim 10^{10}\ {\rm K}.$$
The energy per particle was thus ∼0.8 MeV, and the time around one second after
the Big Bang. At freeze-out the Boltzmann factor was
$$\frac{n_n}{n_p} = \exp\left[\frac{-(m_n - m_p)c^2}{k_B T}\right] = \exp[-1.29/0.8] = 1/5. \tag{14.3}$$
After freeze-out the remaining neutrons could either decay or become bound into
light stable nuclei. Given the short neutron lifetime of 15 min it became a race to
make light nuclides; about as long as a 5000 m race. It was certainly
successful because helium makes up about a quarter of the mass of all baryonic
matter today.
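The freeze-out numbers above are easy to reproduce. The following Python sketch evaluates $k_B T$ at $10^{10}$ K and the Boltzmann factor of Equation (14.3); the constant and function names are illustrative, and the Boltzmann constant is the CODATA value expressed in MeV per kelvin.

```python
import math

K_B_MEV_PER_K = 8.617333262e-11   # Boltzmann constant in MeV/K (CODATA value)
DELTA_M_C2_MEV = 1.29             # (m_n - m_p) c^2 as used in Equation (14.3)


def neutron_to_proton_ratio(kt_mev: float) -> float:
    """Equilibrium Boltzmann factor n_n / n_p at thermal energy k_B T (MeV)."""
    return math.exp(-DELTA_M_C2_MEV / kt_mev)


kt_at_freeze_out = K_B_MEV_PER_K * 1e10          # thermal energy at 10^10 K
ratio_at_freeze_out = neutron_to_proton_ratio(0.8)

print(f"k_B T = {kt_at_freeze_out:.2f} MeV, n_n/n_p = {ratio_at_freeze_out:.3f}")
```

The thermal energy comes out a little above 0.8 MeV and the ratio very close to 1/5, matching the rounded values in the text.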
¹ In the case of decoupling Equation (12.6) the prefactor is unity.
$$\frac{n_n}{n_p} = 1/6 \tag{14.6}$$
$$(^3{\rm H} + p) \ \text{or} \ (^3{\rm He} + n) \rightarrow {}^4{\rm He} + \gamma.$$
Tritium, $^3$H, is included, although unstable, because its half-life is 12.3 years. $^4$He is
very well bound by 7.1 MeV per nucleon, while $^3$H and $^3$He have binding energies of
2.8 and 2.6 MeV per nucleon respectively. As a result most of the neutrons end up in
$^4$He. Building beyond $^4$He is difficult because, as Figure 14.1 shows, there are no
stable nuclides with mass numbers 5 and 8: the nuclides with these mass numbers
quickly decay into the deeply bound $^4$He. First, the mass number 5 nuclides’ decay
modes are
$$^5{\rm Li} \rightarrow p + {}^4{\rm He} \quad \text{and} \quad {}^5{\rm He} \rightarrow n + {}^4{\rm He},$$
with lifetimes around $10^{-22}$ s. Second, the nuclides with mass number 8 decay to a
pair of $^4$He nuclei. $^4$He is a sink for neutrons. Assuming that all the reactions ended
[Figure 14.1 plot: vertical axis "BE/nucleon in MeV" from 0 to 10; horizontal axis "Atomic number" from 0 to 70; labelled nuclides $^2$H, $^3$He, $^4$He, $^{6,7}$Li, $^9$Be and $^{56}$Fe.]
Figure 14.1. Binding energies per nucleon for stable nuclides against mass number. Above mass number 56
a slow continuous decline in binding energy per nucleon with mass number commences: a few nuclides are
shown to indicate this.
in $^4$He then the fraction, by mass, of baryonic matter converted to $^4$He in BBN
would be
$$Y = 2n_n/[n_n + n_p] = 0.28,$$
which amounts to an upper limit on $^4$He production in BBN. Very little $^6$Li and $^7$Li
are produced in BBN: the production of nuclei with higher atomic numbers only
took place much later, in stars and stellar explosions.
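The helium mass fraction follows from the neutron-to-proton ratio alone, as the short Python sketch below shows. The function name is an illustrative choice; the input ratio of 1/6 is the value from Equation (14.6).

```python
def helium_mass_fraction(neutron_to_proton: float) -> float:
    """Return Y = 2 n_n / (n_n + n_p), written in terms of r = n_n / n_p.

    Each 4He nucleus locks up two neutrons and two protons, so if every
    neutron ends up in 4He the helium mass fraction is 2 r / (1 + r).
    """
    return 2.0 * neutron_to_proton / (1.0 + neutron_to_proton)


Y_max = helium_mass_fraction(1.0 / 6.0)
print(f"Y = {Y_max:.3f}")
```

For a ratio of 1/6 the exact value is 2/7, which the text quotes as 0.28. A smaller ratio, for example after further neutron decay, lowers the result toward the observed quarter by mass.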
² The $^1$H emission lines come from the excited minority neutral atoms.
Figure 14.2. Predicted light element abundances relative to hydrogen as a function of the baryon to photon
number ratio. The broader (red) vertical band shows the range of the baryon to photon ratio consistent with
the D/H measurements. The narrower (blue) band is the expectation from the CMB. The yellow bands show
the results of measurements for $^4$He and $^7$Li. Figure 23.1 from Tanabashi et al. (2018). Courtesy of the
American Physical Society.
³ The proportion of $^3$He has only been determined locally so that no comparison is possible with the predicted
$^3$He/H ratio (sloping red line).
14.7 Exercises
1. What is the number of $^4$He nuclei formed per second in the Sun? Take the
luminosity of the Sun to be $3.8 \times 10^{26}$ W, the proton mass to be 938.272
MeV $c^{-2}$, the $^4$He mass to be 3727.380 MeV $c^{-2}$ and the electron mass to be
0.511 MeV $c^{-2}$.
2. What fraction of the hydrogen in the Sun is converted to $^4$He in 1 Gyr?
3. How many solar neutrinos pass through you per second?
4. How many of these interact inside you? Assume the body to be simply water
and the cross-section for any neutrino interaction with a nucleon or an
electron to be approximately $10^{-47}$ m$^2$, for neutrinos of energy around 1
MeV.
Further Reading
Liddle A 2015 An Introduction to Modern Cosmology (3rd ed.; New York:
Wiley) (Oxford: Oxford Univ. Press). This text provides a compact
introduction.
Coles P and Lucchin F 2002 Cosmology: The Origin and Evolution of Cosmic
Structure (2nd ed.; New York: Wiley). Clear account of the BBN and
associated topics.
References
Cyburt, R. H., Fields, B. D., Olive, K. A., & Yeh, T.-H. 2016, RvMP, 88, 015004
Tanabashi, M., Hagiwara, K., Hikasa, K., et al. 2018, PhRvD, 98, 030001
Wagoner, R. V. 1969, ApJS, 18, 247
Chapter 15
Structure Origins
15.1 Introduction
The density fluctuations of radiation and of baryonic matter at decoupling were
minuscule, $\sim 10^{-5}$. Since then the clumping together of matter has increased
enormously. We may compare local matter densities today to the critical density.
The density is about $10^2$ times higher in galaxy clusters; around $10^5$ times higher in galaxies; and
in stars like the Sun it is $10^{29}$ times higher. Most of the matter in and around a galaxy
or galaxy cluster is dark matter, called its dark matter halo. Such a halo envelops the
visible structure, rather than standing apart like a saint’s halo.
This chapter, and the final two chapters, cover the evolution from the tiny
perturbations existing in the universe after inflation to the stars, galaxies and clusters
of galaxies we see today. The early phase dominated by the growth of dark matter
fluctuations is described in this chapter.
The initial density fluctuations gravitationally attracted more matter, the potential
wells deepened, and equally the depleted regions lost matter. Countering this the
expansion of the universe acted to disperse matter: the Hubble drag. Gravitational
collapse in a static and an expanding universe are tackled in the first sections here.
The growth of the majority dark matter perturbations is then traced until
they reached equilibrium. Until decoupling the baryons oscillated within the
gravitational wells formed by accumulations of dark matter. Once released, the
baryonic matter then fell into these gravitational potential wells and enhanced their
growth.
If there had only been dark matter present it would simply have become
progressively hotter until its kinetic motion prevented any further gravitational
contraction, the state of virial equilibrium. Baryonic matter differs from dark matter
by coupling to the electromagnetic field. Ions, atoms, molecules, and electrons can
all radiate photons which carry off energy. Thus, concentrations of baryons in the
gravitational potential wells were able to cool by emitting radiation that escaped the
wells. Baryonic matter could thus continue to contract and so build the dense
structures present today: the stars, the galaxies, and the clusters of galaxies. These
are still embedded in their more massive dark matter halos.
In the previous chapter we saw that seconds after the Big Bang baryonic matter
was already non-relativistic. We know less, directly, about the state of dark matter
particles because they have never been detected. Dark matter only interacts
gravitationally; its interactions are said to be collisionless. Both baryonic and dark
matter were non-relativistic throughout the era of structure formation described
here. We show that if dark matter particles had been so light as to be relativistic their
streaming like photons would have dispersed the overdense accumulations of matter
that, in fact, grew denser and became nurseries of galaxies of stars. There is a
hierarchy of structures with stars forming before galaxies, and galaxies forming
before clusters of galaxies. We shall show in this chapter that this hierarchical
ordering provides critical evidence that dark matter is non-relativistic: that it is cold
dark matter (CDM), rather than relativistic and hot dark matter (HDM). The
ΛCDM model will be seen to make predictions that account for the large-scale
structures existing today.
The oscillations of the baryon–photon plasma were necessarily imprinted on
baryonic matter as well as on the CMB at decoupling. At the end of this chapter the
detection of these baryon acoustic oscillations (BAO) in the large-scale structure of
the universe is described. This provided a critical quantitative test of the ΛCDM
model of the evolution of the universe from decoupling at 365,000 years after the Big
Bang to the present, 13.5 Gyr later.
This isn’t the whole story describing collapse. The obvious neglected item is the
conversion of gravitational potential energy into kinetic energy during the collapse.
The increased average speed of the fluid particles would on its own lead to
expansion. Evidently the kinetic energy builds up and resists the gravitational
contraction, and this resistance acts whether the matter is baryonic or dark matter.
How the competition works out depends on how fast the pressure waves travel
compared to the free fall velocity. If the volume is tiny the pressure waves will win
and resist contraction. If the volume is very large gravitation produces contraction,
which pushes up the density, which reduces the free fall time, which makes
contraction unstoppable. The limiting size and mass at which the two forces are
in balance are named after the originator of this idea affecting our understanding of
the formation of structures in the universe: the Jeans length and the Jeans mass.
In baryonic matter pressure waves travel at the speed of sound cs, so that the time
for a pressure wave to cross a sphere of diameter d is
ts ≈ d / cs . (15.2)
For a sphere that is overdense and on the edge of stability ts ≈ tfree . The length at
which the pressure wave crossing time equals the free fall time is called the Jeans
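length, λJ: exact calculation¹ gives
\[ \lambda_J = c_s \sqrt{\frac{\pi}{G\rho}} . \tag{15.3} \]
The mass contained in a sphere of diameter λJ is the Jeans mass,
\[ M_J = \frac{4\pi}{3}\rho\left[\frac{\lambda_J}{2}\right]^3 = \frac{\pi c_s^3}{6}\left[\frac{\pi^3}{G^3\rho}\right]^{1/2} . \tag{15.4} \]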
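As a numerical cross-check of Equations (15.3) and (15.4), the following minimal sketch evaluates the Jeans length and the two equivalent forms of the Jeans mass. The function names and the sample values of cs and ρ are illustrative assumptions, not taken from the text.

```python
import math

G = 6.674e-11  # gravitational constant, m^3 kg^-1 s^-2


def jeans_length(c_s, rho):
    """Jeans length, Equation (15.3): lambda_J = c_s * sqrt(pi / (G * rho))."""
    return c_s * math.sqrt(math.pi / (G * rho))


def jeans_mass(c_s, rho):
    """Jeans mass, first form of Equation (15.4): mass in a sphere of diameter lambda_J."""
    return (4.0 * math.pi / 3.0) * rho * (jeans_length(c_s, rho) / 2.0) ** 3


def jeans_mass_closed_form(c_s, rho):
    """Second form of Equation (15.4): (pi c_s^3 / 6) * sqrt(pi^3 / (G^3 rho))."""
    return (math.pi * c_s ** 3 / 6.0) * math.sqrt(math.pi ** 3 / (G ** 3 * rho))


if __name__ == "__main__":
    # Assumed sample values: c_s = 200 m/s, rho = 1e-18 kg/m^3 (hypothetical cold cloud)
    c_s, rho = 200.0, 1.0e-18
    print("Jeans length [m]:", jeans_length(c_s, rho))
    print("Jeans mass [kg]:", jeans_mass(c_s, rho))
```

Both forms agree to rounding error, and the mass scales as cs³ and as ρ^(−1/2), which is why a high speed of sound stabilizes a region against collapse.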
If the extra mass due to the overdensity of a region exceeds the Jeans mass it will
collapse.² Given the right conditions the collapse may go as far as concentrating
matter in an overdense region that becomes a galaxy and the stars within it. The
speed of sound is seen to be a critical determinant of the response of a local
overdense region; whether it collapses or supports oscillations. The speed of sound is
given by
\[ c_s = c\sqrt{\frac{\partial p}{\partial \varepsilon}} = c\sqrt{\frac{\dot{p}}{\dot{\varepsilon}}} , \tag{15.5} \]
where p is the pressure and ε the energy density, and a dot above indicates the time
derivative of the parameter. Applying the fluid Equation (10.8) to the plasma
\[ \dot{\varepsilon} = -3H(p + \varepsilon) = -H(4\varepsilon_r + 3\varepsilon_b) . \tag{15.6} \]
Also for radiation
\[ \dot{p}_r = \dot{\varepsilon}_r / 3 , \tag{15.7} \]
and
\[ \frac{\dot{\varepsilon}_r}{-H} = -a\frac{d\varepsilon_r}{da} = 4\varepsilon_r . \tag{15.8} \]
Collecting terms, the speed of sound in the plasma is
\[ c_s = \frac{c}{\sqrt{3}}\sqrt{\frac{4\varepsilon_r}{4\varepsilon_r + 3\varepsilon_b}} . \tag{15.9} \]
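The behavior of Equation (15.9) is easy to explore numerically. The sketch below, whose function name is an illustrative assumption, returns cs/c for given radiation and baryon energy densities; only their ratio matters.

```python
import math


def sound_speed_over_c(eps_r, eps_b):
    """c_s / c in the baryon-photon plasma, Equation (15.9).

    eps_r and eps_b are the radiation and baryon energy densities in any
    common unit; only the ratio eps_b / eps_r enters the result.
    """
    return math.sqrt(4.0 * eps_r / (4.0 * eps_r + 3.0 * eps_b)) / math.sqrt(3.0)


if __name__ == "__main__":
    for ratio in (0.0, 0.1, 1.0, 10.0):
        print(ratio, sound_speed_over_c(1.0, ratio))
```

With no baryon loading the result is 1/√3, the value for a pure photon gas, and it falls steadily as the baryon energy density grows relative to that of the radiation.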
¹ Peebles P J E 1993 Principles of Physical Cosmology (Princeton, NJ: Princeton Univ. Press) 116.
² This and succeeding results in the following chapters are somewhat fluid. The choice of the λJ to be the
diameter rather than the radius is sometimes made, shifting the Jeans mass by a factor 8.
because the photons overwhelm the roughly 10¹⁰ times fewer baryons. Hence the
Jeans length in the plasma is very similar to the Hubble length
\[ c/H = c\sqrt{\frac{3}{8\pi G\rho}} . \]
After decoupling the non-relativistic baryons can be treated as an ideal gas, then
Dividing this equation by r using Equation (15.14) and ignoring terms of second
order like δδ̇,
\[ \frac{\ddot{r}}{r} = \frac{\ddot{a}}{a} - \frac{2\dot{a}}{3a}\dot{\delta} - \frac{1}{3}\ddot{\delta} . \tag{15.16} \]
At the surface of the spherical top hat the acceleration inward is given by
\[ \ddot{r} = -GM/r^2 = -\frac{4\pi}{3} G r \rho (1+\delta) . \]
This gives
\[ \frac{\ddot{r}}{r} = -\frac{4\pi}{3} G\rho(1+\delta) , \]
while the acceleration Equation (10.7) for non-relativistic matter reduces to
\[ \frac{\ddot{a}}{a} = -\frac{4\pi}{3} G\rho . \]
Making these substitutions in Equation (15.16) produces
\[ -\frac{4\pi}{3}G\rho(1+\delta) = -\frac{4\pi}{3}G\rho - \frac{2\dot{a}}{3a}\dot{\delta} - \frac{1}{3}\ddot{\delta} . \]
Multiplying by 3 and putting the Hubble parameter H in place of ȧ/a gives our final
compact result³
\[ \ddot{\delta} = 4\pi G\rho\,\delta - 2H\dot{\delta} . \tag{15.17} \]
The second term on the right-hand side of Equation (15.17) is the damping of the
growth of density fluctuations due to the expansion of the universe: it is known as the
Hubble drag. Then using Equation (10.12) to replace the density in Equation (15.17)
gives
\[ \ddot{\delta} + 2H\dot{\delta} - \frac{3}{2}H^2\Omega_m\delta = 0 . \tag{15.18} \]
This equation yields a time dependence of growth that is very different in the
radiation- and matter-dominated eras. We treat the radiation-dominated era first.
³ The effect of pressure would add a term [cs²/a²]∇²δ to the right-hand side. See Chapter 5 in Peebles P J E 1993
Principles of Physical Cosmology (Princeton, NJ: Princeton Univ. Press).
During this era the free fall time tff ∝ (Gρm)^(−1/2) was much longer than the Hubble
time, tH ∝ (Gρr)^(−1/2), which is the expansion time of the universe. The universe was
expanding so fast that the growth of the matter overdensities stalled, which is called
the Mészáros effect. At sufficiently early times in the radiation era we can drop the
final term in Equation (15.18) and take H = 1/(2t):
\[ \ddot{\delta} = -\dot{\delta}/t . \]
Thus an approximate solution is
\[ \delta = A + B\ln t = C + D\ln a , \]
with constants A, B, C, and D. More generally, changing variables in Equation
(15.18) from t to y = a/a_rm gives, after some manipulation,
\[ \delta'' + \frac{2+3y}{2y(1+y)}\delta' - \frac{3}{2y(1+y)}\delta = 0 , \tag{15.19} \]
where primes indicate differentiation with respect to y. There is a growing solution,
which also has limited growth,
\[ \delta \propto 1 + \frac{3}{2}\frac{a}{a_{rm}} , \tag{15.20} \]
only approaching linear growth with a at the transition to matter dominance.
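That the growing solution satisfies Equation (15.19) can be confirmed by direct substitution. The short sketch below does this numerically; the function names are illustrative assumptions, and y stands for a/a_rm as in the text.

```python
def delta_growing(y):
    """Growing solution of Equation (15.20), normalized to 1 as y -> 0."""
    return 1.0 + 1.5 * y


def meszaros_residual(y):
    """Left-hand side of Equation (15.19) evaluated on the growing solution.

    For delta = 1 + 3y/2 the first derivative is 3/2 and the second is 0,
    so the residual should vanish for every y > 0.
    """
    d1, d2 = 1.5, 0.0
    denom = 2.0 * y * (1.0 + y)
    return d2 + (2.0 + 3.0 * y) / denom * d1 - 3.0 / denom * delta_growing(y)


def growth_between(y_start, y_end):
    """Factor by which a dark matter overdensity grows between two epochs."""
    return delta_growing(y_end) / delta_growing(y_start)


if __name__ == "__main__":
    for y in (0.01, 0.1, 1.0, 10.0):
        print(y, meszaros_residual(y), growth_between(0.0, y))
```

The growth factor from very early times up to matter–radiation equality (y = 1) is only 2.5, which illustrates how weak the growth is while radiation dominates.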
Moving on to the matter-dominated era we can use Equation (10.23) to replace the
Hubble parameter in Equation (15.18), giving
\[ \ddot{\delta} + \frac{4\dot{\delta}}{3t} - \frac{2\delta}{3t^2} = 0 . \tag{15.21} \]
Substituting a power law solution of the form δ = constant × tⁿ gives two
solutions, which you may check: n = −1 or n = 2/3. The former solution dies away,
leaving
\[ \delta = \text{constant} \times t^{2/3} . \tag{15.22} \]
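The power law behavior can also be checked by integrating Equation (15.21) numerically. The sketch below is one possible approach, not the author's: it assumes the change of variable s = ln t, which turns the equation into the constant-coefficient form δ″ + δ′/3 − 2δ/3 = 0 (primes denoting d/ds), and uses a hand-written fourth-order Runge–Kutta step. The starting values are arbitrary and deliberately mix the growing and decaying modes.

```python
import math


def growth_rhs(delta, u):
    """Equation (15.21) as a first-order system in s = ln t, with u = d(delta)/ds."""
    return u, -u / 3.0 + 2.0 * delta / 3.0


def late_time_slope(t_ratio=1.0e4, steps=4000):
    """Integrate over a factor t_ratio in time and return d ln(delta) / d ln(t)."""
    h = math.log(t_ratio) / steps
    delta, u = 1.0, 0.0  # arbitrary start containing both modes
    for _ in range(steps):
        k1d, k1u = growth_rhs(delta, u)
        k2d, k2u = growth_rhs(delta + 0.5 * h * k1d, u + 0.5 * h * k1u)
        k3d, k3u = growth_rhs(delta + 0.5 * h * k2d, u + 0.5 * h * k2u)
        k4d, k4u = growth_rhs(delta + h * k3d, u + h * k3u)
        delta += h * (k1d + 2.0 * k2d + 2.0 * k3d + k4d) / 6.0
        u += h * (k1u + 2.0 * k2u + 2.0 * k3u + k4u) / 6.0
    return u / delta


def indicial_polynomial(n):
    """n(n - 1) + 4n/3 - 2/3, which vanishes for the allowed exponents."""
    return n * (n - 1.0) + 4.0 * n / 3.0 - 2.0 / 3.0


if __name__ == "__main__":
    print("late-time logarithmic slope:", late_time_slope())
```

Once the decaying mode has died away the logarithmic slope settles at 2/3, in agreement with the analytic solution.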
In a matter-dominated universe using Equation (10.21) gives
\[ \delta \propto t^{2/3} \propto a = 1/(1+z) . \tag{15.23} \]
Accordingly the fluctuations would have grown by a factor of 1091 between
decoupling and the present. This would only boost the fluctuations in matter from
10⁻⁵ at decoupling to 10⁻² at the present time. Vastly more compaction is required to
reach even the overdensity 10² of galaxy clusters. Because dark matter was immune
to the plasma oscillations of radiation and baryonic matter, dark matter over-
densities commenced growing from much earlier than decoupling. Without this
boost there would be no viable explanation for how today’s structures developed in
the available time.
Figure 15.1 sketches the behavior of fluctuations in dark matter, baryonic matter,
and radiation up to and beyond decoupling. After decoupling baryonic matter fell
into the gravitational potential wells provided by dark matter and thereafter tracked
dark matter. The density fluctuations continued to grow beyond the figure’s range:
Figure 15.1. Sketch of the early growth of fluctuations showing a representative mode: the solid line for
baryonic matter both in the plasma and after decoupling, the broken line for dark matter, and the dot-dash line
for the CMB. After decoupling the growth of baryonic matter structures tracks that of dark matter. The dotted
vertical lines mark the times of matter–radiation equality and of decoupling.
Figure 15.2. A sketch of the progress of a uniform spherical overdensity fluctuation that exceeds the Jeans
mass (using the top hat model). The fluctuation expands until turn-around, and then collapses and virializes at
half the turn-around radius. In the graph log ρ is the log of the density (relative to the mean density of the
universe today) plotted against the log of the scale factor, beyond decoupling. The solid curve is for a virialized
region; the broken curve is for the universe.
Figure 15.2 shows the subsequent evolution of an overdense fluctuation whose mass
became greater than the Jeans mass. The latter figure shows how the fluctuation
contracted violently before finally settling into equilibrium. This phase of structure
formation, still driven by dark matter, is the next topic.
After turn-around the density contrast between the top hat region and the
surrounding universe increased as the overdensity collapsed. During the collapse
the total energy was conserved: gravitational energy was converted to kinetic energy.
A final equilibrium state was attained when the kinetic motion resisted further
contraction. The virial theorem, described in Appendix G, relates the mean kinetic
energy of the matter particles to their mean potential energy at equilibrium:
〈PE vir〉 + 2〈KE vir〉 = 0. (15.29)
The gravitational energy of a sphere of uniform density ρs and radius R, taking
M(<r) to be the mass within radius r < R from the center, is
\[ \langle PE \rangle = -\int_0^R \frac{GM(<r)}{r}\,(4\pi r^2\rho_s\,dr) = -\int_0^R \frac{16}{3}G\pi^2\rho_s^2 r^4\,dr = -\frac{3GM^2}{5R} , \tag{15.30} \]
where M is the mass of the whole sphere. For a uniform sphere the virial equation
gives
〈KE vir〉 = 3GM 2 /[10R vir ], (15.31)
where M is again the mass of the sphere, essentially that of the fluctuation, and Rvir
its radius. The total energies at turn-around and at virialization are equal. Ignoring
the small kinetic energy at turn-around, energy conservation gives
〈PE〉ta = 〈PE〉vir + 〈KE〉vir = (1/2)〈PE〉vir , (15.32)
where the second equality comes from applying the virial theorem, Equation (15.29).
From this we learn that Rvir = Rta/2, and that the density of the fluctuation increased
by a factor 8. In parallel the density of the universe fell while the fluctuation virialized.
Virialization occurs when ϕ = 2π so that tvir = 2tta. Thus the universe expanded by a
factor 2^(2/3) and its density fell by a factor 4. Overall the contrast between the density of
the fluctuation and that of the surrounding universe grew by a factor
\[ \text{contrast} = \frac{9\pi^2}{16} \times 8 \times 4 = 18\pi^2 \approx 178 , \]
which is conventionally rounded to 200.
This is very similar to the contrast between the galaxy cluster density, in baryonic
and dark matter, and the corresponding mean density as observed in the universe
today. Further contraction to reach the overdensities of galaxies and stars depended
on baryons being able to shed energy in electromagnetic radiation, in photons that
escape the contracting mass—something dark matter cannot do.
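The arithmetic behind the virial density contrast can be laid out step by step. In this minimal sketch the function name is an illustrative assumption; the three factors are those quoted in the text.

```python
import math


def virial_density_contrast():
    """Density contrast of a virialized top hat relative to the background.

    Factors, as quoted in the text:
      * 9 pi^2 / 16 : contrast already present at turn-around
      * 2^3 = 8     : the radius halves between turn-around and virialization
      * (2^(2/3))^3 : the background dilutes while t doubles, since a ~ t^(2/3)
    """
    turnaround = 9.0 * math.pi ** 2 / 16.0
    collapse = 2.0 ** 3
    dilution = (2.0 ** (2.0 / 3.0)) ** 3
    return turnaround * collapse * dilution


if __name__ == "__main__":
    print(virial_density_contrast())
```

The product equals 18π², a little under 180, which is the origin of the round figure of 200 commonly used as the threshold for a virialized halo.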
The time it takes for a structure to collapse can be estimated using the free fall or
dynamical time 1/√(Gρ). This neglects the dissipative processes involved, but it serves
to show that the galaxies we see, whose overall matter content is denser, collapse
sooner than the galaxy clusters. For a galaxy cluster with a density ∼200 times the
density of the universe the estimate is ∼3 Gyr. Thus galaxy clusters are still forming.
In the case of the Milky Way, taking a mass of 1.5 × 10¹² M⊙, the virial radius comes
out at 18 kpc. This compares with the 8 kpc that the Sun lies from the center of the
Galaxy. The Milky Way density is now higher, ∼10⁵ times the density of the universe,
thanks to the contraction after virialization made possible by the radiative cooling of
the baryons. This contraction was happening in parallel and faster than the
virialization of larger structures.
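The collapse time estimate for a galaxy cluster can be reproduced with a few lines of code. The sketch below makes two assumptions that are not stated in this passage: that "the density of the universe" means the present critical density, and that H0 = 67.4 km s⁻¹ Mpc⁻¹. The function names are illustrative.

```python
import math

G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
MPC_M = 3.0857e22    # metres in one megaparsec
GYR_S = 3.156e16     # seconds in one gigayear


def critical_density(h0_km_s_mpc=67.4):
    """Critical density 3 H0^2 / (8 pi G) in kg/m^3 (H0 value is an assumption)."""
    h0_si = h0_km_s_mpc * 1.0e3 / MPC_M
    return 3.0 * h0_si ** 2 / (8.0 * math.pi * G)


def dynamical_time_gyr(rho):
    """Free fall (dynamical) time 1 / sqrt(G rho), in Gyr, for density rho in kg/m^3."""
    return 1.0 / math.sqrt(G * rho) / GYR_S


if __name__ == "__main__":
    rho_cluster = 200.0 * critical_density()
    print("cluster dynamical time [Gyr]:", dynamical_time_gyr(rho_cluster))
```

Under these assumptions the estimate comes out close to 3 Gyr. Using the mean matter density instead of the critical density would lengthen it by a factor of roughly 1/√Ωm.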
We can also make a crude estimate of the redshift z, at the time when virialization
occurred. At virialization the density contrast between matter in the Milky Way
Galaxy and in the universe as a whole was about 200. Currently the contrast is 10⁵.
Now the density of matter has varied like (1 + z)³ for most of the time involved.
Hence, virialization would have taken place when
\[ (1+z)^3 = 10^5/200 , \]
making z around 6.9.
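The redshift estimate is a one-line calculation, sketched here with an illustrative function name and the two contrasts from the text as default arguments.

```python
def virialization_redshift(contrast_now=1.0e5, contrast_at_virialization=200.0):
    """Solve (1 + z)^3 = contrast_now / contrast_at_virialization for z.

    This assumes, as in the text, that the mean matter density scaled as
    (1 + z)^3 over the whole interval while the galaxy's density stayed fixed.
    """
    return (contrast_now / contrast_at_virialization) ** (1.0 / 3.0) - 1.0


if __name__ == "__main__":
    print(virialization_redshift())
```

Because of the cube root, the result is insensitive to the inputs: changing the present contrast by a factor of two shifts 1 + z by only about 26%.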
where ρb is the baryon density in the cloud. Eliminating Rvir using the last two
equations gives
\[ M_b = \left[\frac{5 f_b k_B T_{vir}}{G\mu m_H}\right]^{3/2}\left[\frac{3}{4\pi\rho_b}\right]^{1/2} . \tag{15.36} \]
Note that the baryonic virial mass is very similar to the Jeans mass of Equation
(15.4). A virialized baryonic cloud can collapse by radiating the energy released; but
a virialized dark matter halo cannot contract further. Now ρb = nμmH where n is the
number density of baryons. Then inserting the constants, the virial baryon mass is
\[ M_b = 10^5\, T_{vir}^{3/2} f_b^{3/2} n^{-1/2}\, M_\odot . \tag{15.37} \]
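The numerical coefficient in Equation (15.37) can be checked by evaluating Equation (15.36) directly. The sketch below assumes SI units, with n in m⁻³ and Tvir in kelvin, and a mean molecular weight μ = 1; neither choice is stated explicitly in this passage, and the function name is illustrative.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant, J/K
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
M_H = 1.6735e-27     # mass of a hydrogen atom, kg
M_SUN = 1.989e30     # solar mass, kg


def baryon_virial_mass_msun(t_vir, f_b, n, mu=1.0):
    """Baryonic virial mass of Equation (15.36), in solar masses.

    t_vir : virial temperature in K
    f_b   : baryon fraction of the halo
    n     : baryon number density in m^-3 (assumed unit)
    mu    : mean molecular weight (assumed to be 1 by default)
    """
    rho_b = n * mu * M_H
    thermal = (5.0 * f_b * K_B * t_vir / (G * mu * M_H)) ** 1.5
    geometric = math.sqrt(3.0 / (4.0 * math.pi * rho_b))
    return thermal * geometric / M_SUN


if __name__ == "__main__":
    print("coefficient for T = 1 K, f_b = 1, n = 1 m^-3:",
          baryon_virial_mass_msun(1.0, 1.0, 1.0))
```

With these assumptions the coefficient comes out within about 10% of 10⁵ M⊙, and the function reproduces the Tvir^(3/2) and n^(−1/2) scalings of Equation (15.37).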
This equation is used as a tool for extracting the power spectrum of spatial
correlations in matter in the universe at low redshift. Such correlations have evolved
from those in matter at decoupling. In principle the current correlations should be
predictable by using the laws of physics to emulate the evolution. Using such
techniques, evidence is presented below that the baryonic acoustic oscillations persist
in the matter content of the universe at low redshift.
The observed power spectrum of the CMB fluctuations has n slightly smaller than
unity. This near scale invariance was interpreted in Chapter 13 as a consequence of
slow roll inflation. It is to be expected that the power spectrum of fluctuations in the
distribution of matter in the universe should inherit the same functional form. A
widely used way to quantify perturbations in the recent universe starts from the
fractional excess in the mass in a volume, paralleling the overdensity definition:
δM = (M − 〈M 〉)/ 〈M 〉,
where 〈M 〉 is the mass calculated taking the mean density of the universe. Then the
parameter used is the root mean square deviation of the fractional excess enclosed
within a spherical volume V of radius R:
\[ \sigma_R^2 = \frac{V}{(2\pi)^3}\int W^2(kR)\,P(k)\,4\pi k^2\,dk , \tag{15.45} \]
where W(kR) is the Fourier transform of the spherical top hat distribution with
uniform density 3/(4πR³) out to a radius R:
\[ W(x) = \frac{3 j_1(x)}{x} = \frac{3[\sin x - x\cos x]}{x^3} , \]
j1(x) being a spherical Bessel function. Taking a power spectrum P(k) ∝ kⁿ and
using x = kR again gives
\[ \sigma_R^2 \propto \frac{9V}{2\pi^2 R^{3+n}}\int_0^\infty x^n [j_1(x)]^2\,dx \propto R^{-[3+n]} . \tag{15.46} \]
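The top hat window function entering Equations (15.45) and (15.46) is simple to evaluate, provided some care is taken near x = 0, where the closed form suffers from cancellation. The sketch below is one way to do this; the function name and the switch-over threshold are illustrative choices.

```python
import math


def top_hat_window(x):
    """Fourier transform of a unit-normalized spherical top hat, W(x) = 3 j1(x) / x.

    For small |x| the leading terms of the Taylor series, 1 - x^2/10, are used
    to avoid the cancellation between sin x and x cos x.
    """
    if abs(x) < 1.0e-3:
        return 1.0 - x * x / 10.0
    return 3.0 * (math.sin(x) - x * math.cos(x)) / x ** 3


if __name__ == "__main__":
    for x in (0.0, 0.5, 2.0, 4.493409457909064, 10.0):
        print(x, top_hat_window(x))
```

W tends to 1 for wavelengths much longer than R (x = kR → 0), so such modes contribute fully to the mass fluctuation, whereas modes with kR well above unity are averaged out within the sphere. The first zero lies at the first positive root of tan x = x, near x ≈ 4.49.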
One standard choice for the parameter to quantify the clumpiness of matter is σ8, the
rms fluctuation of mass within spheres of radius 8 h⁻¹ Mpc, a size comparable to that
of a typical galaxy cluster. A value for σ8 of ∼0.8 has been determined from overall
fits in the ΛCDM model, compatible with the CMB, the BAO, and also the SN Ia
data discussed in Chapter 17.
With the Harrison–Zeldovich spectrum P(k) ∝ k we have σR ∝ R⁻² ∝ M^(−2/3). The
small-scale fluctuations have larger deviations from the average mass than do large-
scale fluctuations. Thus the probability of a fluctuation becoming non-linear and
collapsing is greatest among the small-scale fluctuations. Denser mass fluctuations
have additionally shorter free fall times. These two features assist in giving the
observed hierarchical clustering in the universe. The smaller scale dark matter
accumulations containing the future stars collapsed before the larger-scale accumu-
lations that gave rise to galaxies, and galaxy clusters continue to form into the
present era.
In Section 15.2 it was shown that dark matter streams out from fluctuations
whose masses are smaller than the Jeans mass. The effect is to wash out such low
mass fluctuations. In turn this depresses the power spectrum below the Harrison–
Zeldovich prediction at the corresponding large wave numbers. If the particles
constituting dark matter had been light, that is hot dark matter (HDM) as distinct
from CDM then their relativistic streaming at early times in the life of the universe
would have dispersed the smaller dark matter accumulations. We now explore the
effect of having hot dark matter, choosing a mass mdm = 2.5 eV c⁻² for the
individual dark matter particles. As a simple approximation we assume the dark
matter particles streamed at the speed of light while the temperature T was high
enough that their kinetic energy exceeded their rest energy mdm c². Below this
temperature we also assume that dark matter stopped streaming. Then
\[ 3k_BT \approx m_{dm}c^2 , \tag{15.47} \]
which yields a temperature of 9650 K. The universe was then still radiation
dominated, so that the relationship between the time t, the scale factor a, and the
temperature was
\[ t \propto a^2 \propto T^{-2} . \]
As a reference point we take matter/radiation equality, for which Section 12.2 gives
T = 9384 K, a = 1/3444, and t = 50,000 yr. Thus dark matter would make the
transition to non-relativistic motion when
\[ t = 50{,}000\,(9384/9650)^2 = 47{,}200\ \text{yr} . \]
At that time the particle horizon, the limit of causal contact since the universe began,
was ct = 47,200 light years or 14.48 kpc. Also
\[ a = [9384/9650]/3444 = 1/3540 , \]
so that the comoving size of that horizon, that is its size now, would be 51 Mpc.
Under our approximation of streaming at velocity c structures of any smaller size
would be dispersed. The corresponding mass of dark matter would have been
\[ M = (4\pi/3)\rho_{m0}[51\ \text{Mpc}]^3 = 4.43 \times 10^{46}\ \text{kg} = 2.23 \times 10^{16}\ M_\odot . \]
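The chain of estimates above can be collected into a short script, which also makes it easy to try other particle masses. This is a sketch under stated assumptions: the present matter density ρm0 = 2.7 × 10⁻²⁷ kg m⁻³ is an assumed value (roughly Ωm ≈ 0.3 with H0 ≈ 67 km s⁻¹ Mpc⁻¹), not one quoted in this passage, and the function name is illustrative. The equality values are those the text takes from Section 12.2.

```python
import math

K_B_EV = 8.617333e-5   # Boltzmann constant, eV/K
LY_PER_KPC = 3261.56   # light years in one kiloparsec
MPC_M = 3.0857e22      # metres in one megaparsec
M_SUN = 1.989e30       # solar mass, kg

# Matter-radiation equality reference point quoted in the text
T_EQ_K, INV_A_EQ, T_EQ_YR = 9384.0, 3444.0, 50_000.0


def hdm_streaming_scale(m_dm_ev=2.5, rho_m0=2.7e-27):
    """Comoving streaming scale and mass for hot dark matter of mass m_dm_ev (eV/c^2).

    rho_m0 is the present mean matter density in kg/m^3 (assumed value).
    Returns (temperature K, time yr, comoving size Mpc, mass in solar masses).
    """
    t_nr_kelvin = m_dm_ev / (3.0 * K_B_EV)                 # Equation (15.47)
    t_nr_yr = T_EQ_YR * (T_EQ_K / t_nr_kelvin) ** 2        # t scales as T^-2
    horizon_kpc = t_nr_yr / LY_PER_KPC                     # ct, the horizon estimate used in the text
    inv_a = INV_A_EQ * t_nr_kelvin / T_EQ_K                # 1/a scales as T
    comoving_mpc = horizon_kpc * inv_a / 1000.0
    mass_kg = (4.0 * math.pi / 3.0) * rho_m0 * (comoving_mpc * MPC_M) ** 3
    return t_nr_kelvin, t_nr_yr, comoving_mpc, mass_kg / M_SUN


if __name__ == "__main__":
    print(hdm_streaming_scale())
```

The outputs are close to the rounded figures in the text. Small differences arise from rounding of the transition temperature, and the mass scales directly with the assumed ρm0.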
Thus if the dark matter particles had mass 2.5 eV c⁻², then any dark matter
fluctuations lighter than 2.23 × 10¹⁶ M⊙ would be dispersed by the streaming of the
dark matter while relativistic. This would eliminate structures as large as the Local
Group of galaxies, the one including the Milky Way and M31. HDM would require
that the first structures were the largest: smaller structures could only result from the
fragmentation of the larger structures, and appear later. This is the reverse of what is
observed: galaxies formed at high redshifts while clusters are still forming at present.
Particle physics experiments put an upper limit of ∼1 eV c⁻² on the neutrino masses,
hence neutrinos are highly relativistic and can be ruled out as the dark matter
particles. Figure 15.3 shows the fluctuation power spectra as a function of the
fluctuation mass. It shows the initial Harrison–Zeldovich spectrum, and the mass
spectra of the surviving fluctuations for CDM and HDM. The mass assumed for the
dark matter particles is 2.5 eV c⁻². Figure 15.4 shows the same power spectra, now
as a function of the wavenumber. Both figures demonstrate the severe and unrealistic
loss of galaxy sized accumulations of matter with HDM.
A compilation of measurements of the matter power spectrum from a wide range
of cosmological probes is shown in Figure 15.5 (Planck Collaboration et al., 2020).
The solid line is an overall fit using the ΛCDM model. The goodness of the fit is
strong evidence in particular for CDM and the near scale invariance of perturba-
tions during inflation. As we shall see next, data incorporated in Figure 15.5 provide
another, crucial test of how well the ΛCDM model describes the development of the
universe.
Figure 15.3. The matter power spectrum (in arbitrary units) as a function of mass at matter–radiation equality.
The input Harrison–Zeldovich spectrum is shown as well as the spectrum expected after damping by HDM,
and after damping by CDM. Courtesy Dr Sean McGee, Institute for Gravitational Wave Astronomy and
School of Physics and Astronomy, Birmingham University.
Figure 15.4. Matter power spectrum (in arbitrary units) as a function of wavenumber at radiation-matter
equality. The input Harrison–Zeldovich spectrum is shown as well as the spectrum expected after damping by
HDM and after damping by CDM. Courtesy Dr Sean McGee, Institute for Gravitational Wave Astronomy
and School of Physics and Astronomy, Birmingham University.
Figure 15.5. Matter power spectrum inferred from different cosmological probes. The curve is from an overall
fit to this and other data using the ΛCDM model. SDSS LRG refers to luminous red galaxies from the Sloan
Digital Sky Survey. h is the ratio of the observed Hubble constant to 100 km s⁻¹ Mpc⁻¹. Figure 19 from Planck
Collaboration et al. (2020). Reproduced with permission © ESO.
⁴ The WiggleZ Dark Energy Survey was a redshift survey of 240,000 galaxies using the 3.9 m Anglo-Australian
Telescope (AAT) at Siding Spring Observatory NSW; the 6dF Galaxy Survey measured redshifts for 136,000
galaxies with the 1.2 m UK Schmidt telescope at the same location.
evolution of the universe follows the prediction of the ΛCDM model. Two
conversion factors are required. First, the separation between galaxies at measured
redshifts z and z + Δz, expressed as a comoving distance between the galaxies, is:
\[ s_{\Delta z} = \frac{\Delta z}{z}\left[\frac{cz}{H(z)}\right] . \tag{15.50} \]
Second, the comoving distance between galaxies separated by an angle ΔΘ,
expressed in terms of the angular diameter distance dA, is:
\[ s_{\Delta\Theta} = \Delta\Theta\,[(1+z)d_A] . \tag{15.51} \]
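Equations (15.50) and (15.51) need H(z) and dA, which depend on the cosmological model. The sketch below evaluates them for a flat ΛCDM universe with radiation neglected; the parameter values H0 = 67.4 km s⁻¹ Mpc⁻¹ and Ωm = 0.315 are assumptions for illustration, as are the function names. In a flat universe (1 + z)dA equals the line-of-sight comoving distance, which is computed here with Simpson's rule.

```python
import math

C_KM_S = 299792.458  # speed of light, km/s


def hubble(z, h0=67.4, omega_m=0.315):
    """H(z) in km/s/Mpc for flat LambdaCDM, neglecting radiation (assumed model)."""
    return h0 * math.sqrt(omega_m * (1.0 + z) ** 3 + (1.0 - omega_m))


def comoving_distance(z, h0=67.4, omega_m=0.315, steps=200):
    """Comoving distance in Mpc; equals (1 + z) d_A when the universe is flat."""
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of intervals
    dz = z / steps
    total = 0.0
    for i in range(steps + 1):
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight / hubble(i * dz, h0, omega_m)
    return C_KM_S * total * dz / 3.0


def radial_separation(z, delta_z, h0=67.4, omega_m=0.315):
    """Equation (15.50): (delta_z / z) [c z / H(z)] = c delta_z / H(z), in Mpc."""
    return C_KM_S * delta_z / hubble(z, h0, omega_m)


def transverse_separation(z, delta_theta_rad, h0=67.4, omega_m=0.315):
    """Equation (15.51): delta_theta [(1 + z) d_A], in Mpc."""
    return delta_theta_rad * comoving_distance(z, h0, omega_m)


if __name__ == "__main__":
    z = 0.57  # illustrative redshift
    print(radial_separation(z, 0.05), transverse_separation(z, math.radians(4.0)))
```

Comparing the two separations for the same physical feature, such as the sound horizon, is what allows H(z) and dA to be constrained separately.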
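We can also convert volumes by making use of Equations (15.50) and (15.51): the
comoving volume enclosed by a difference in redshift of Δz/z and the two
orthogonal angle separations, ΔΘx and ΔΘy, is the product of the three separations,
\[ V = s_{\Delta z}\, s_{\Delta\Theta_x}\, s_{\Delta\Theta_y} = \frac{\Delta z}{z}\left[\frac{cz}{H(z)}\right]\Delta\Theta_x\,\Delta\Theta_y\,[(1+z)d_A]^2 . \]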
The correlations observed by the SDSS collaboration are shown in Figure 15.6 and
reveal clear baryon acoustic oscillations. On the left the correlations are plotted
against the comoving distance. The quantity h appearing on the axes is H0 divided
by a nominal value of 100 km s⁻¹ Mpc⁻¹. On the right is the Fourier transform, that
is the power spectrum plotted against the wavenumber. This latter corresponds to a
section of Figure 15.5 at high magnification. The two measurements of the sound
horizon from the CMB and from these galaxy overdensity correlations at low
redshift are in excellent agreement. The fit shown in Figure 15.6 corresponds to a
comoving sound horizon distance rs, at the time baryons are released from the
radiation drag, of 153.19 Mpc.
Figure 12.4 shows the restrictions placed on the acceptable ranges of values of Ω m
and ΩΛ found by comparing the size of the BAO at the time of the CMB and at the
era of low z. You can see that because the expansion of the universe since decoupling
has been mainly controlled by matter the constraint from BAO is a good guide to the
value of Ω m . That there should be a common intersection point in Figure 12.4 is a
necessary test of the ΛCDM model, and the intersection fixes the values for Ω m and
ΩΛ . Beyond that, because the intersection point lies on the line ΩΛ + Ω m = 1.0, then
the universe must be flat.
The successful extrapolation made using the ΛCDM model from the CMB
oscillations to galaxy clustering shows that this model is a signal success. One view is
to accept that this model provides an accurate description of the development of the
universe and make inferences on this basis. The sound horizon at any redshift then
becomes a standard ruler supplementing the standard candles as a tool for
measuring cosmological distance. One application is to use the measurement of
Δz /z where Δz spans the sound horizon to determine the value of the Hubble
Figure 15.6. Correlations in the galaxy distribution density measured in the Sloan Digital Sky Survey: Baryon
Oscillation Spectroscopic Survey. On the left, the correlation coefficient plotted against the comoving
separation. On the right, the power spectrum plotted against the wavenumber. Reconstruction removes
distortions from bulk flows estimated using a three-dimensional map of the galaxy positions to infer their
peculiar velocities. Figure 11 from Anderson et al. (2014). Courtesy: Royal Astronomical Society, London and
Oxford Univ. Press, Oxford.
15.8 Exercises
1. Given that HDM was made up of particles of mass 10 eV calculate the
following: the time at which the motion becomes non-relativistic, the dark
matter density at that time, the mass of dark matter then in causal contact.
2. What percentage error is made in calculating the speed of sound by setting it
to c/√3 at matter–radiation equality?
3. In Figure 15.1 explain why the spacing of the radiation oscillations tightens
up and by how much.
4. The rotation velocity of matter at the outer edge of a spiral galaxy of radius
R is v. Show that the rotation period at this radius is very similar to the time
it took the galaxy to collapse from a density perturbation. Take the dark
matter halo to be spherical.
Further Reading
Rich J 2009 Fundamentals of Cosmology (2nd ed.; Berlin: Springer). This text
gives much useful coverage.
References
Anderson, L., Aubourg, É., Bailey, S., et al. 2014, MNRAS, 441, 24
Birkhoff, G. D., & Langer, R. E. 1923, Relativity and Modern Physics (Cambridge, MA: Harvard
Univ. Press)
Planck Collaboration, Aghanim, N., Akrami, Y., et al. 2020, A&A, 641, A1
Introduction to General Relativity and Cosmology
(Second Edition)
Ian R Kenyon
Chapter 16
Baryonic Structures
16.1 Introduction
Immediately following decoupling, the structure of the universe was relatively
simple. Overdensities of dark matter originating from the quantum fluctuations
during inflation were steadily growing; baryonic matter followed the dark matter.
The only radiation was the infrared CMB, steadily cooling as its wavelength
expanded with the universe. The universe entered the dark ages. In time the
overdense concentrations of dark matter virialized, so reaching an equilibrium.
Thereafter the accompanying baryonic overdensities could radiate energy away and
collapse further. Baryonic matter accumulations reached densities at which nuclear
burning is ignited and they became visible stars. The dark ages ended when the first
stars began to emit energy as radiation, a period known as first light. Ultraviolet
radiation, principally from the stars, reionized the hydrogen gas that still made up
the greater part of baryonic matter. What we observe now, as in Figure 1.3, are the
structures of visible matter. One early research paper likened the appearance of the
universe to foam on the top of washing up. Matter is distributed in filaments that
intersect in nodes, and between the filaments are vast, nearly empty voids. Galaxies
densely populate the threads and strings and form clusters and superclusters at the
nodes.
The processes of structure formation involve physical effects on scales remote
from processes observed on Earth. Nonetheless cosmologists have deduced the
broad principles involved. Making the assumption that the laws of physics remain
valid at all eras and on all scales has led to a consistent description of the processes at
work.
Simulations of the evolution of the universe have been helpful in supplementing
and interpreting the astronomical observations. One large-scale program of 2001
was carried out at the Max Planck Institute for Astrophysics in Munich. This
Millennium simulation followed the evolution in the expanding universe, since the
decoupling era, of 10¹⁰ collisionless (dark matter) particles under their mutual
gravitational attraction alone: each of mass 10⁹ M⊙ so that 100 would equal a dwarf
galaxy in mass. The volume was a periodic cubic box of side length 500 Mpc/h,
where h = H0/(100 km s⁻¹ Mpc⁻¹), and the initial distribution of perturbations was made scale
invariant, taking the amplitude from that seen in the CMB. The results reproduced
with surprising fidelity the organization and evolution of the universe as seen at low
redshifts. Figure 16.1 shows the simulation of the evolved cosmic web at present, the
width of the image corresponding to 500 Mpc/h. The brightness indicates the
overdensity and hence the regions populated by baryonic matter that we see as stars,
galaxies, and clusters. For comparison with the actual universe at low redshift see
Figure 1.3.¹ The filaments and nodes are plainly visible, with overdensities
comparable to those in nature. The intervening voids are also as large and as
sparsely populated as in the real universe. Such good agreement with the observed
structure is another indicator of the validity of our model of the universe.
In this chapter the formation of stars, galaxies and clusters of galaxies in their
dark matter halos is followed roughly chronologically. Cooling mechanisms and
reionization are described first. Then the way that stars form within large molecular
Figure 16.1. Millennium simulation of the evolution of the universe: carried out at the Max-Planck Institute
for Astrophysics (Springel et al. 2005). The width of the image is 500 Mpc/h. h is the usual factor
H0/(100 km s⁻¹ Mpc⁻¹). Courtesy Professor Springel.
¹ More recently the simulation has been extended to include baryons, a project called MillenniumTNG:
https://arxiv.org/abs/2210.10060.
clouds is outlined. Next the properties of galaxies, galaxy clusters, and superclusters
are discussed. Reference will be made to Chapter 8 in which we got to know the
monster black holes powering the AGNs of galaxies, and the mechanism by which
stars collapse to compact objects. The Milky Way is looked at in more detail, and
the nearby Coma cluster and supercluster. The last, but not least, topic concerns the
intergalactic matter which, unexpectedly, holds much more of the baryonic matter
than do the stars.
Figure 16.2. A sketch of baryon density versus temperature for cosmic structures. The diagonal lines are at
fixed total mass. In the differently shaded regions different cooling mechanisms operate as described in the text.
The locations of the Milky Way Galaxy and the Coma cluster are indicated. Heavily adapted from Figure 3 in
Blumenthal et al. (1984). Courtesy Springer.
useful. The interstellar medium (ISM) is a flattened disk of cold gas, mainly
hydrogen, of which 20% is in molecular form and 1% is dust. The dust grains are
typically 0.35–1.0 μm across so they radiate and absorb well at comparable
wavelengths, that is, in the near-infrared. Molecular hydrogen is produced by
the processes in which free electrons act as catalysts:
e⁻ + H → H⁻ + γ, followed by H⁻ + H → H₂ + e⁻,
or H + H⁺ → H₂⁺ + γ, followed by H₂⁺ + H → H₂ + H⁺.
The molecular hydrogen is distributed in giant molecular clouds (GMCs) of masses
∼10⁴–10⁶ M⊙ and average densities ∼10⁻¹⁸ kg m⁻³. It is in these clouds that star
production is seen to be concentrated in the Milky Way. Prolific star production is
seen too in starburst galaxies, again dominantly in GMCs. The gray region in
Figure 16.2 marked “H2” is the region where transitions between molecular
rotational and vibrational states of hydrogen are effective in cooling the molecular
clouds. These transitions involve meV energies rather than eV energies, and carry
the cooling correspondingly further. Cooling proceeds in two steps: kinetic energy is
transferred to molecular excitation in collisions, then photons carry off the
excitation energy and escape from the cloud of hydrogen. One representative
transition is the forbidden dipole transition at 43.9 meV, corresponding to 510 K
thermal excitation. This molecular cooling takes the hydrogen molecules to temper-
atures of order 200 K. The molecular clouds in our Galaxy contain metals and dust
whose transitions in the infrared are important for cooling in the area marked
“metals” in Figure 16.2. The end result of this sequence of cooling processes is that
the GMCs reach temperatures of around 10 K. The mass distribution of young stars
observed today is discussed in Section 16.5.1; masses of order M⊙ are
typical.
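The correspondence quoted above between the 43.9 meV molecular transition and a thermal excitation of about 510 K follows from E = kBT. The sketch below makes the conversion explicit and also gives the photon wavelength from λ = hc/E; the function names are illustrative.

```python
K_B_EV = 8.617333e-5   # Boltzmann constant, eV/K
HC_EV_NM = 1239.842    # Planck constant times speed of light, eV nm


def excitation_temperature(energy_ev):
    """Temperature in K at which k_B T equals the transition energy."""
    return energy_ev / K_B_EV


def photon_wavelength_um(energy_ev):
    """Wavelength in micrometres of a photon carrying the transition energy."""
    return HC_EV_NM / energy_ev / 1000.0


if __name__ == "__main__":
    e_transition = 43.9e-3  # eV, the H2 transition energy quoted in the text
    print(excitation_temperature(e_transition), photon_wavelength_um(e_transition))
```

The photon emitted lies in the mid-infrared, near 28 μm, where the cloud is transparent enough for the radiation to escape and carry the energy away.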
τ is called the optical depth. The effect on the observed polarization of the CMB was
isolated by the Planck Collaboration in their analysis, yielding a value of 0.056 for τ.
This value for τ is what we use to estimate when the universe reionized. Here the
simplifying assumption is made that the universe was neutral up to some redshift $z_{\rm re}$,
and fully ionized from then on. At times later than this redshift boundary the
electron density expressed in terms of the current density, $n_{e0}$, is $n_e = n_{e0}/a^3$. Then
$$\tau = n_{e0}\,\sigma_T\,c \int_{a_{\rm re}}^{1} \frac{dt}{a^{3}}. \tag{16.9}$$
Now
$$dt = da/\dot{a} = da/[aH],$$
so that
$$\tau = n_{e0}\,\sigma_T\,c \int_{a_{\rm re}}^{1} \frac{da}{H a^{4}}.$$
The limits of integration lie in a period that is matter dominated. Therefore we can
approximate Equation (10.32) adequately by
$$H = H_0 \sqrt{\Omega_{m0}}\; a^{-3/2}. \tag{16.10}$$
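Substituting Equation (16.10) into the integral for τ gives the closed form τ = (2/3)[n_e0 σ_T c / (H_0 √Ω_m0)][(1 + z_re)^{3/2} − 1], which can be inverted for z_re. The sketch below does this for τ = 0.056. The parameter values (H_0 = 70 km s^-1 Mpc^-1, Ω_m0 = 0.3, a baryon fraction of 0.045 of the critical density, and roughly 0.88 free electrons per baryon to allow for helium) are assumptions made for illustration, so the result should be read as an estimate only.

```python
import math

# Physical constants (SI units)
G = 6.6743e-11            # gravitational constant, m^3 kg^-1 s^-2
C = 2.99792458e8          # speed of light, m/s
SIGMA_T = 6.6524587e-29   # Thomson cross-section, m^2
M_P = 1.67262e-27         # proton mass, kg
MPC = 3.0857e22           # metres per megaparsec

# Assumed cosmological parameters (illustrative values)
H0 = 70e3 / MPC               # 70 km/s/Mpc expressed in s^-1
OMEGA_M0 = 0.3
OMEGA_B0 = 0.045
ELECTRONS_PER_BARYON = 0.88   # rough allowance for helium (assumption)


def tau_prefactor() -> float:
    """Dimensionless combination n_e0 * sigma_T * c / (H0 * sqrt(Omega_m0))."""
    rho_crit = 3 * H0**2 / (8 * math.pi * G)
    n_e0 = ELECTRONS_PER_BARYON * OMEGA_B0 * rho_crit / M_P
    return n_e0 * SIGMA_T * C / (H0 * math.sqrt(OMEGA_M0))


def optical_depth(z_re: float) -> float:
    """Optical depth for sudden, complete reionization at redshift z_re."""
    return (2.0 / 3.0) * tau_prefactor() * ((1 + z_re) ** 1.5 - 1)


def reionization_redshift(tau: float) -> float:
    """Invert optical_depth() analytically."""
    return (1 + 1.5 * tau / tau_prefactor()) ** (2.0 / 3.0) - 1


if __name__ == "__main__":
    z_re = reionization_redshift(0.056)
    print(f"tau = 0.056 gives z_re of roughly {z_re:.1f}")
```

With these assumed inputs the reionization redshift comes out between 7 and 8; changing the electron fraction or the density parameters shifts it by a few tenths.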
Figure 16.3. Star formation rates versus redshift. Far ultraviolet and infrared rest-frame measurements. The
far-UV is obscured by intervening dust, while the IR penetrates it. The reference volume is the comoving
volume. Adapted from Figure 9 in Madau & Dickinson (2014). Courtesy Professor Piero Madau and Annual
Reviews Inc.
Figure 16.4. The optical spectra of quasars at redshift 1.3, 2.9, and 5.8 plotted against wavelength in the
source’s rest frame. This shows the Lyα absorption by neutral hydrogen between the quasar sources and the
Earth. Figure 1 from McQuinn (2016). Courtesy Professor Matthew McQuinn and Annual Reviews Inc.
medium (IGM) from 1 Gyr after the Big Bang, and in particular on the progress of
reionization. The broad dominating emission peak in each spectrum is the Lyα line
emission from hydrogen atoms around the quasar; in the Lyα transition the electron
drops from the excited state (2p) to the ground state (1s). The transition energy is
10.2 eV, and the wavelength, well into the ultraviolet, is 121.6 nm. The other
component of the quasar radiation visible in the figure is a smooth continuous
spectrum, punctured by numerous absorption lines. The absorptions involve the
same Lyα transition: photons in the continuum resonantly scattering off neutral
hydrogen atoms in the gas clouds between the quasar and the Earth. Any scattered
photons are deflected and lost from the flux traveling toward the Earth. In the case
of a cloud of neutral (atomic) hydrogen at redshift z, it will scatter out
quasar radiation of observed wavelength 121.6 × (1 + z) nm: that is, blueward (leftward)
of the quasar’s Lyα line at 121.6 × (1 + $z_{\rm quasar}$) nm. The many absorption lines seen in
the figure are known as the Lyα forest. They are of great interest to cosmologists:
firstly because the pattern of absorption lines provides a map in redshift of where
clouds of neutral hydrogen gas were located along the line of sight, and secondly
because the depth of an absorption line reveals the integrated density of neutral
hydrogen gas along the line of sight through that cloud. The quasar Lyα emission
line is far broader than the absorption lines because emission is from hydrogen gas in
violent motion close to the AGN. Many millions of quasar spectra have been
recorded, and from the Lyα forest in each, information has been obtained about
the redshift distribution of the neutral component of hydrogen clouds. In addition the
lateral extent of these clouds has been extracted from the absorption spectra of
where the range of integration includes the value $a = \nu/\nu_0$ for which the δ-function
diverges. First we assume all the hydrogen atoms remain neutral. Then using
Equation (16.8) again the optical depth for photons arriving at the Earth with
frequency ν is
$$\tau(\nu) = \int_{a_q}^{1} \sigma(\nu/a - \nu_0)\, n_b(a)\, c\, dt = \int_{a_q}^{1} \sigma(\nu/a - \nu_0)\, n_b(a)\, \frac{c\, da}{a H(a)},$$
where $n_b(a)$ is the baryon density at a scale factor a. The lower limit of integration is
the scale factor at the quasar, $a_q$. Evidently the δ-function will pick out the clouds
around the scale factor $a = \nu/\nu_0$. Here $n_b(a) = n_0\, a^{-3}$, with $n_0$ taken to be the current
baryon number density. We make the substitution for $H(a)$ given in Equation
(16.10), appropriate when matter dominates. This gives
$$\tau(\nu) = \int_{a_q}^{1} \sigma(\nu/a - \nu_0)\, \frac{n_0\, c}{a^{5/2}\, H_0 \sqrt{\Omega_{m0}}}\, da.$$
After integration
$$\tau(\nu) \approx \frac{\sigma_0 n_0 c}{H_0 \sqrt{\Omega_{m0}}}\, \frac{\nu_0^{1/2}}{\nu^{3/2}}.$$
2
This analysis is based on notes and comments by Professor Tom Theuns: http://icc.dur.ac.uk/tt/IGM.pdf.
Professor Matthew McQuinn gave useful advice on this section.
If the cloud lies at redshift z, then $\nu = \nu_0/(1+z)$, and the optical depth is
$$\tau(\nu) \approx \frac{\sigma_0 n_0 c}{H_0 \sqrt{\Omega_{m0}}\,\nu_0}\,(1+z)^{3/2} \approx 3 \times 10^{4}\,(1+z)^{3/2},$$
where we have been assuming that all the hydrogen atoms remain neutral. Absorption
saturates when τ reaches unity; hence it only requires a tiny fraction of the hydrogen to
remain neutral at redshift 5, of order $10^{-5}$, to produce complete absorption.
Consequently, in Figure 16.4, the incomplete absorption seen in the spectrum of the
quasar at redshift 5.8 demonstrates that reionization is very close to being total at this
redshift. The current view based on detailed analysis is that reionization took place
between redshifts 12 and 6 (360 Myr and 920 Myr, respectively). The observed quasars
get rarer as the redshift grows. Despite that it may be possible to gather further precise
information using the recently launched JWST with its larger mirrors, and with its
detectors having better infrared sensitivity than Hubble.
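The saturation argument above can be put into numbers with a few lines of code. The sketch below takes the prefactor 3 × 10^4 from the text as given and scales the optical depth linearly with the neutral fraction; the function names are illustrative.

```python
LYA_REST_NM = 121.6    # rest wavelength of the Lyman-alpha line, nm
TAU_PREFACTOR = 3.0e4  # optical depth prefactor for fully neutral hydrogen (from the text)


def lya_optical_depth(z: float, neutral_fraction: float = 1.0) -> float:
    """Optical depth of gas at redshift z, scaled by the fraction left neutral."""
    return neutral_fraction * TAU_PREFACTOR * (1 + z) ** 1.5


def neutral_fraction_for_tau(z: float, tau: float = 1.0) -> float:
    """Neutral fraction at redshift z that produces the requested optical depth."""
    return tau / (TAU_PREFACTOR * (1 + z) ** 1.5)


def observed_lya_wavelength_nm(z: float) -> float:
    """Observed wavelength at which a cloud at redshift z absorbs."""
    return LYA_REST_NM * (1 + z)


if __name__ == "__main__":
    print(f"tau for fully neutral gas at z = 5: {lya_optical_depth(5.0):.2e}")
    print(f"neutral fraction for tau = 1 at z = 5: {neutral_fraction_for_tau(5.0):.1e}")
    print(f"neutral fraction for tau = 5 at z = 5: {neutral_fraction_for_tau(5.0, 5.0):.1e}")
    print(f"absorption by a cloud at z = 5 is seen at {observed_lya_wavelength_nm(5.0):.0f} nm")
```

An optical depth of unity at redshift 5 needs a neutral fraction of a few parts in a million, and an optical depth of a few, enough to remove essentially all the flux, needs about 10^-5. Both are in line with the tiny fraction quoted above.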
3
Many telescopes in use are sensitive to the infrared radiation from such organic molecular transitions: for
example the 50 m diameter Large Millimeter Telescope at 4640 m altitude in Sierra Negra, Mexico, sensitive to
0.85–4 mm wavelength radiation.
these molecular clouds absorb the ultraviolet radiation and re-emit the incoming
energy, principally as infrared radiation. This cocoons the cold core of each cloud.
Eventually the radiation from stars formed inside a GMC will disperse its gas and
halt further star formation.
The linewidths of emission from gas molecules in GMCs indicate that the
molecules have a velocity dispersion of ∼10 km s$^{-1}$. This far exceeds the dispersion
expected in a gas at 10 K, of only 0.2 km s$^{-1}$. The reason for the elevated velocity
dispersion is thought to be supersonic turbulent flow in the GMCs. Likely sources of
turbulence are outflows from supernovae, winds from massive stars, radiation
pressure and cosmic ray streaming. The outcome is that GMCs show a filamentary
structure with nodes on all scales. GMCs are not themselves bound by self-gravity:
to see this we can use a virial parameter comparing the kinetic and gravitational
energies of a molecule at the cloud’s surface
$$\alpha_{\rm vir} = \langle {\rm KE} \rangle / \langle {\rm GE} \rangle \approx \frac{\sigma^{2} R}{GM},$$
where M is the cloud mass, σ the line-of-sight velocity dispersion and R the cloud radius.
In the case of clouds in our Galaxy $\alpha_{\rm vir}$ ranges up to 100: they are not going to collapse as
single items. Analogous to the three-dimensional Jeans mass there is a critical mass per
unit length above which a filament in a GMC will collapse. Two radial forces are at work
(Inutsuka & Miyama 1997): the first is from the outward pressure
$$F_p \sim c_s^{2}/R, \tag{16.12}$$
where R is the outer radius of the filament and $c_s$ is the velocity of sound in the
filament. The other is the inward gravitational force
$$F_g \sim -\frac{G M_{\rm line}}{R}, \tag{16.13}$$
where $M_{\rm line}$ is the mass per unit length. Then the critical mass per unit length at
which these forces balance is
$$M_{\rm crit} \sim 2 c_s^{2}/G. \tag{16.14}$$
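To get a feel for the size of the critical mass per unit length in Equation (16.14), the sketch below evaluates it for gas at 10 K. It assumes an isothermal sound speed $c_s = \sqrt{k_B T/m}$ with m the mass of a hydrogen molecule; real clouds contain helium, which raises the mean particle mass and lowers the result somewhat.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant, J/K
M_H = 1.6735e-27     # mass of a hydrogen atom, kg
G = 6.6743e-11       # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.98847e30   # solar mass, kg
PC = 3.0857e16       # parsec, m


def sound_speed(temperature_k: float, particle_mass_kg: float = 2 * M_H) -> float:
    """Isothermal sound speed sqrt(k_B T / m) in m/s (pure H2 assumed by default)."""
    return math.sqrt(K_B * temperature_k / particle_mass_kg)


def critical_line_mass(temperature_k: float) -> float:
    """Critical mass per unit length 2 c_s^2 / G of Equation (16.14), in kg/m."""
    return 2 * sound_speed(temperature_k) ** 2 / G


if __name__ == "__main__":
    c_s = sound_speed(10.0)
    m_crit = critical_line_mass(10.0) * PC / M_SUN  # convert kg/m to solar masses per parsec
    print(f"sound speed at 10 K: {c_s / 1e3:.2f} km/s")
    print(f"critical line mass at 10 K: {m_crit:.0f} solar masses per parsec")
```

The sound speed reproduces the 0.2 km s^-1 thermal dispersion quoted for gas at 10 K, and the critical line mass comes out at roughly twenty solar masses per parsec of filament.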
Figure 16.5. Column density in the far-infrared across a molecular cloud in the Aquila Rift. This was observed
with the Herschel Space Telescope in the range 70–500 μm. The shading indicates the fraction of the critical
mass per unit length along the filaments. White indicates where this exceeds 0.5. The green and blue points
mark the Class 0 protostars and the bound prestellar cores, respectively. Figure 1(a) from André et al. (2010).
Reproduced with permission © ESO.
Aquila Rift.⁴ The filaments are typically 0.05 pc across. In the figure the identified
bound prestellar cores and Class 0 protostars are shown to be concentrated along
those filaments where the mass per length is above half the critical value.
Once collapse commences, flow and accretion from the environment will result in
a near spherical core. We can calculate the Jeans mass once a spherical core has
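formed. The local molecular number density $n_{\rm core}$ is $\sim 10^{9}$–$10^{12}$ m$^{-3}$ and the mass
density $\rho_{\rm core}$ is $\sim 10^{-18}$–$10^{-15}$ kg m$^{-3}$. Using Equation (15.1) the free fall time for the
core is
$$t_{\rm free} = [G\rho_{\rm core}]^{-1/2} = 0.21\ {\rm Myr} \left[ \frac{n_{\rm core}}{10^{11}\ {\rm m^{-3}}} \right]^{-1/2},$$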
4
This molecular cloud blocks visible radiation coming from stars in the galactic plane.
so that the formation time is of order Myr. The Jeans mass given by Equation (15.4)
is
$$M_J = (\pi/6)\, c_s^{3} \left[ \frac{\pi^{3}}{G^{3} \rho_{\rm core}} \right]^{1/2}.$$
Referencing this to typical values of temperature and molecular number density
gives
$$M_J = 1.22\, M_\odot \left[ \frac{T}{10\ {\rm K}} \right]^{3/2} \left[ \frac{n_{\rm core}}{10^{11}\ {\rm m^{-3}}} \right]^{-1/2}. \tag{16.16}$$
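The numerical coefficients in the free fall time and in Equation (16.16) can be checked directly. The sketch below assumes the core is pure molecular hydrogen (particle mass twice the proton mass) and uses the isothermal sound speed $c_s = \sqrt{k_B T/m}$; both are simplifying assumptions, as are the function names.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant, J/K
M_P = 1.67262e-27    # proton mass, kg
G = 6.6743e-11       # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.98847e30   # solar mass, kg
MYR = 3.15576e13     # seconds in a million years
M_H2 = 2 * M_P       # mass of a hydrogen molecule (pure H2 assumed)


def free_fall_time_myr(n_core: float) -> float:
    """Free fall time [G rho]^(-1/2) in Myr for a number density n_core in m^-3."""
    rho = n_core * M_H2
    return (G * rho) ** -0.5 / MYR


def jeans_mass_msun(temperature_k: float, n_core: float) -> float:
    """Jeans mass (pi/6) c_s^3 [pi^3 / (G^3 rho)]^(1/2) in solar masses."""
    rho = n_core * M_H2
    c_s = math.sqrt(K_B * temperature_k / M_H2)
    return (math.pi / 6) * c_s**3 * math.sqrt(math.pi**3 / (G**3 * rho)) / M_SUN


if __name__ == "__main__":
    print(f"free fall time at 1e11 m^-3: {free_fall_time_myr(1e11):.2f} Myr")
    print(f"Jeans mass at 10 K, 1e11 m^-3: {jeans_mass_msun(10.0, 1e11):.2f} solar masses")
    print(f"Jeans mass at 10 K, 1e17 m^-3: {jeans_mass_msun(10.0, 1e17):.1e} solar masses")
```

Both coefficients are recovered to within about one percent. The last line shows how far the Jeans mass falls when the density rises by six orders of magnitude at fixed temperature.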
Molecular clouds have uniform temperatures thanks to the turbulence and hence the
Jeans mass falls as the density increases. As a result fragmentation into smaller
collapsing bodies happens. This can continue while the fragments can radiate their
binding energy efficiently. A limit is imposed by self-absorption of radiation on the
dust and metals present in the collapsing gas. The limit is reached when the density
reaches $\sim 10^{-10}$ kg m$^{-3}$ or $10^{17}$ molecules per cubic meter. Inserting this value in
Equation (16.16) gives a minimum mass for stars of around 0.001 M⊙, that is around
the mass of the planet Jupiter.
The star formation rate in our Galaxy is equivalent to 1 M⊙ per year. Now the
mass of gas in the Milky Way is some $2 \times 10^{9}\,M_\odot$, so that this rate of star formation
would have exhausted the supply in 2 billion years, yet the Galaxy has lasted for
around 12 billion years. This discrepancy is resolved if there has been repeated
recycling of the baryons in stars, which is necessary anyway to explain the metals
and dust now present. Another source of additional gas is through accretion from
the wider environment, which can continue over a long period. Star formation takes
place in the giant molecular clouds that are overtaken by the turbulence of the
rotating spiral arms of a galaxy. In such cold regions the accumulations inferred
from Equation (16.16) to collapse are of order M⊙. Once star formation occurs the
radiation from the new stars disrupts the parent cloud. As a result little of the
baryonic matter ends up in stars: it is buffeted about by supernovae, winds from
massive stars, outflows from AGNs, and on the larger scale by galactic collisions,
and other magneto-hydrodynamic processes. Below we shall review the evidence
that the biggest portion of the baryons, 86%, is in the form of warm and hot diffuse
ionized clouds pervading and enveloping galaxies and clusters of galaxies.
Uniformity of these IMFs is consistent with the idea that stars are born in similar
environments, namely giant molecular gas clouds, with similar metallicities. This
was not the case in the early universe, when dust and metals had yet to be produced.
Then the energy radiated during the collapse of a protostar escaped directly rather
than being re-absorbed by dust and metals. Consequently it required higher
temperatures and pressures to ignite hydrogen fusion, and hence the stars would
have been more massive. Nuclear burning in these massive stars would have been
correspondingly more rapid, so that none could remain today.
The template for the IMFs at low redshifts is shown in Figure 16.6. Stars with the
highest luminosity, the hottest, having masses above 15 M⊙, are called O-type. Stellar
luminosity is roughly proportional to the mass cubed in main sequence stars. A 50 M⊙
star has a luminosity L = 20,000 L⊙ in the B-band, where the Sun’s B-band
luminosity L⊙ is $1.54 \times 10^{26}$ W. This makes the mass to light ratio for such 50 M⊙ stars
50/20,000 = 0.0025 [M/L]⊙. At the other end of the stellar mass scale are red dwarf or
M-type stars with masses up to 0.5 M⊙; these account for three quarters of the visible
stars in the Milky Way; and beyond them lie the brown dwarfs with masses less than
0.08 M⊙, making them too light to ignite hydrogen fusion in their cores.
Figure 16.6. Stellar initial mass function. The low mass stars contribute most of the galactic mass; the heaviest
stars contribute most of the galactic radiation.
16.6 Galaxies
The giant galaxies mostly have total baryonic plus dark matter masses in the range
$10^{7}$–$10^{13}\,M_\odot$. There are three broad classes of giant galaxies: around three quarters
are spirals, the rest elliptical or the rarer irregularly shaped galaxies. Figure 16.7
shows a barred spiral galaxy, UGC12158 in Pegasus, similar to our own. These are
5
The stars were ordered alphabetically according to the intensity of the hydrogen spectral lines by Williamina
Fleming in 1880. Although the ordering is now by temperature, these alphabetic labels were retained.
Figure 16.7. The barred spiral galaxy UGC12158, similar to the Milky Way. The bright regions are lit by stars
formed within giant molecular clouds. Credit: ESA/Hubble and NASA.
galaxies in which intense star formation is in progress at the bright points along the
spirals. A plausible view is that the mass concentrations from which spiral galaxies
developed would have lacked spherical symmetry. Gravitational collapse would
have been most rapid along the minor axis, leading to a pancake final shape. During
the collapse of this disk toward its core (harboring a black hole) angular momentum
would be conserved: then, just as in the case of a skater pulling her arms into her
body while rotating, this spun up the rotation of the disk. Our Sun, for example,
takes 240 Myr to complete its journey round the Galaxy’s axis. The spiral arms are
located where shock waves rotating around the core produce high pressure and
turbulence, and hence a burst of star production. These shock waves travel
maintaining constant angular velocity; by contrast, as we have seen, the gas in the
Galaxy rotates with constant linear velocity. As a result, the paths of the shock waves
trace out spirals in the gas. Along each spiral arm the UV radiation from the young
massive O-type and B-type stars ionizes the gas around them giving the beads of
light seen in Figure 16.7. The radiation from these massive stars gives spiral galaxies
their characteristic blue/white coloration, and the lifetime of such stars determines
the width of the spiral arms.
Figure 16.8 is a sketch of a section made through the rotation axis of our Galaxy. The
stars form a disk which extends out to around 17 kpc; 95% of them are concentrated in a
thickness ∼0.3 kpc. The bulge is the section taken across the bar, itself a few kpc in
Figure 16.8. A sketch of a section through the Milky Way Galaxy. At the center of the bulge lies a black hole
of mass $\sim 4 \times 10^{6}\,M_\odot$. The disk’s radius is around 17 kpc. The globular clusters are dispersed in the baryonic
halo, while the molecular clouds are largely confined to the disk. Gas clouds in the halo are likely to be atomic
rather than molecular hydrogen. The dark matter halo is roughly spherical and extends well beyond the sketch
to a radius of order 100 kpc.
radius with mass $\sim 10^{10}\,M_\odot$. At the center lies the black hole near Sgr A* with mass
$\sim 4 \times 10^{6}\,M_\odot$. Dark matter forms a halo that, judging from Figure 1.7, extends beyond
50 kpc. The total mass of the stars is $\sim 7 \times 10^{10}\,M_\odot$ and their collective luminosity is
$\sim 2 \times 10^{10}\,L_\odot$. Isolated stars, gas clouds, and around 150 globular clusters form a
spherical halo, contributing something like a percent of the total stellar mass. Globular
clusters are collections of $10^{5}$–$10^{6}$ densely packed stars, in volumes ∼1 pc$^{3}$. One possible
view is that in the hierarchical ordering of baryonic structures, with low mass structures
developing first, globular clusters would have formed earlier than galaxies. Some
globular clusters do have very low metallicity, consistent with this interpretation.
However the origin of globular clusters is still under study and not yet settled.
It was explained earlier, in Chapter 1, that a dark matter halo is essential to
explain why the rotational velocity of the visible matter v remains constant out to
radii at the limit of detectability. The total mass, baryonic plus dark matter, within
radius R of the central black hole, $M(<R)$, is given by the radial equation of motion
of a star of mass m at that radius
$$G M(<R)\, m / R^{2} = m v^{2} / R,$$
so that
$$M(<R) = v^{2} R / G. \tag{16.19}$$
Inserting the rotational velocity of the Sun, 230 km s$^{-1}$, at 8 kpc from the central black
hole gives
$$M(<R) = 1.22 \times 10^{12} \left[ \frac{R}{100\ {\rm kpc}} \right] M_\odot. \tag{16.20}$$
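Equation (16.19) is easily evaluated numerically. The sketch below assumes, as the text does, that the rotation speed stays at the Sun's 230 km s^-1 out to the radius of interest; the function name is illustrative.

```python
G = 6.6743e-11       # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.98847e30   # solar mass, kg
KPC = 3.0857e19      # kiloparsec, m


def enclosed_mass_msun(v_kms: float, radius_kpc: float) -> float:
    """Mass within radius R from M(<R) = v^2 R / G, in solar masses."""
    v = v_kms * 1e3
    return v**2 * radius_kpc * KPC / G / M_SUN


if __name__ == "__main__":
    print(f"M(<8 kpc)   = {enclosed_mass_msun(230.0, 8.0):.2e} solar masses")
    print(f"M(<100 kpc) = {enclosed_mass_msun(230.0, 100.0):.2e} solar masses")
```

At 100 kpc this reproduces the coefficient of Equation (16.20) to within about one percent; the small difference comes from rounding of the constants.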
Recently the motions of stars traveling at high velocity in orbits far out of the galactic
plane have been analyzed and this gave an independent estimate of $1.2$–$1.9 \times 10^{12}\,M_\odot$
for the total mass of our Galaxy. Using Equation (15.1) the free fall time of a
galaxy of radius R and mass M is
$$t_{\rm free} = \left[ \frac{4\pi R^{3}/3}{GM} \right]^{1/2} = 10^{9} \left[ \frac{R}{100\ {\rm kpc}} \right]^{3/2} \left[ \frac{10^{12}\,M_\odot}{M} \right]^{1/2}\ {\rm yr}.$$
Thus our Galaxy took ∼1 Gyr to form. The Milky Way is typical in having many
gravitationally bound satellite galaxies. The satellite galaxies range from dwarfs
formed from thousands of stars to the Large Magellanic Cloud with mass around
$10^{11}\,M_\odot$. Many tens of the attendant low luminosity dwarf galaxies were only
detected in recent decades. Given a strictly hierarchical evolution of the universe,
globular clusters and dwarf galaxies would be the first stellar structures to form.
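The gigayear formation time quoted above follows from the free fall time $[G\rho]^{-1/2}$ with the mean density $\rho = M/(4\pi R^{3}/3)$. A minimal sketch of the arithmetic, with illustrative function names:

```python
import math

G = 6.6743e-11       # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.98847e30   # solar mass, kg
KPC = 3.0857e19      # kiloparsec, m
YEAR = 3.15576e7     # seconds in a year


def galaxy_free_fall_time_yr(radius_kpc: float, mass_msun: float) -> float:
    """Free fall time [(4 pi R^3 / 3) / (G M)]^(1/2) in years."""
    volume = 4 * math.pi * (radius_kpc * KPC) ** 3 / 3
    return math.sqrt(volume / (G * mass_msun * M_SUN)) / YEAR


if __name__ == "__main__":
    t = galaxy_free_fall_time_yr(100.0, 1e12)
    print(f"free fall time for R = 100 kpc, M = 1e12 solar masses: {t:.2e} yr")
```

Because the time scales as $R^{3/2} M^{-1/2}$, doubling the mass at fixed radius shortens it by a factor of √2.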
In contrast to spiral galaxies, the elliptical galaxies contain predominantly older stars
and are therefore reddish in color. They lack the gas and dust, whose presence was seen
to be important in stellar nurseries. Stars in elliptical galaxies have randomly oriented
elliptical orbits around the center of mass, very different from the dominant uniform
rotation in spiral galaxies. Both the coloration and random motion are consistent with
the elliptical galaxies being the outcome of galaxy–galaxy interactions or mergers. In
such events the impact of gravitational interaction would disrupt existing coherent
rotation in the participant galaxies. Figure 16.9 illustrates the process: images of
observed pairs of galaxies in various stages of merging were selected and ordered by the
authors. In the collision the stars, being relatively widely spaced, would not collide.
However the gas clouds would collide, heat up and escape the gravitational pull of the
resultant elliptical galaxy. There would be one burst of star formation and when this
subsided there would remain the collection of yellow and red stars that we now observe.
Spiral galaxies are found in groups such as the Local Group containing the Milky
Way. This group extends over a volume of dimension ∼3 Mpc, and comprises
around thirty much smaller galaxies and one other equally large galaxy M31. By
contrast elliptical galaxies are usually found in clusters of hundreds of large galaxies,
where galaxy–galaxy collisions are likely to be more frequent.
Figure 16.9. A representative sequence of galaxy mergers presented in what would be chronological order:
reading left to right, and top to bottom. Figure 1 in Read & Ponman (1998). Courtesy Professors Read and
Ponman, the Royal Astronomical Society, London and Oxford Univ. Press, Oxford.
stellar mass function (GSMF) $\phi(M)\,dM$ is defined to be the number of galaxies per
Mpc$^{3}$ each containing a total stellar mass in the interval M to M + dM. This
distribution is parametrized using a Schechter function:
$$\phi(M)\,dM = \phi_* \left[ \frac{M}{M_*} \right]^{\alpha} \exp[-M/M_*]\, \frac{dM}{M_*}. \tag{16.21}$$
Here $\phi_*$ is the normalizing galaxy number density per Mpc$^{3}$, and $M_*$ a characteristic
galactic stellar mass. Integration over all galactic stellar masses gives the total stellar
mass density
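For a single Schechter function of the form (16.21) with α > −2, the integral of M ϕ(M) dM over all masses evaluates to ϕ∗ M∗ Γ(α + 2). The sketch below compares this closed form with a direct numerical integration on a logarithmic mass grid. The parameter values are purely hypothetical and are not the fitted values from the survey discussed in the text.

```python
import math


def schechter(mass: float, phi_star: float, m_star: float, alpha: float) -> float:
    """Number density of galaxies per unit stellar mass, from Equation (16.21)."""
    x = mass / m_star
    return phi_star * x**alpha * math.exp(-x) / m_star


def mass_density_closed_form(phi_star: float, m_star: float, alpha: float) -> float:
    """Integral of M * phi(M) dM from zero to infinity (requires alpha > -2)."""
    return phi_star * m_star * math.gamma(alpha + 2)


def mass_density_numeric(phi_star: float, m_star: float, alpha: float,
                         n_steps: int = 20000) -> float:
    """Trapezoidal integration of M * phi(M) on a logarithmic grid, dM = M d(ln M)."""
    ln_lo = math.log(1e-15 * m_star)
    ln_hi = math.log(60.0 * m_star)
    h = (ln_hi - ln_lo) / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        m = math.exp(ln_lo + i * h)
        weight = 0.5 if i in (0, n_steps) else 1.0
        total += weight * m * schechter(m, phi_star, m_star, alpha) * m
    return total * h


if __name__ == "__main__":
    # Hypothetical parameters: phi_star in Mpc^-3, m_star in solar masses.
    phi_star, m_star, alpha = 1.0e-3, 5.0e10, -1.1
    print(f"closed form: {mass_density_closed_form(phi_star, m_star, alpha):.3e} Msun/Mpc^3")
    print(f"numerical:   {mass_density_numeric(phi_star, m_star, alpha):.3e} Msun/Mpc^3")
```

The logarithmic grid is chosen because the integrand spans many decades in mass; on such a grid the contribution from low-mass galaxies falls off smoothly when α > −2.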
6
If the Hubble constant is larger than 70 km s$^{-1}$ Mpc$^{-1}$ by a factor $h_{70}$ then the entry should be multiplied by
this factor. dex$^{-1}$ indicates the entry is that for a decade of stellar mass centered on the stellar mass entry.
Figure 16.10. Measurements of low-redshift (z < 0.1) galaxy stellar mass distributions compiled by the GAMA
Collaboration, comprising several hundred thousand galaxy spectra. The solid line shows their fit to the
data. The upper panel shows the GSMF. The middle panel shows the ratio of the data to the fit. The gray lines
show examples of how randomly displacing the data points by their errors would affect the fit. The lower panel
shows the distribution of the stellar mass density. Figure 12 from Driver et al. (2022). Courtesy of the Royal
Astronomical Society, London and Oxford Univ. Press, Oxford.
Taking these two features together the authors argue that their parametrization
offers a plausible extrapolation to cover the full mass distribution. On this basis the
authors infer that at z < 0.1 the stellar mass density and the fraction this contributes
to the critical density are, respectively:
$$\rho_* = 2.97 \times 10^{8}\ M_\odot\ {\rm Mpc^{-3}} \quad \text{and} \quad \Omega_* = 2.2 \times 10^{-3}. \tag{16.22}$$
This means that only 5% of the baryons have been captured in stars up to our era.
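The step from the stellar mass density to Ω∗ and to the 5% figure can be reproduced as follows. The sketch assumes a Hubble constant of 70 km s^-1 Mpc^-1 and takes the baryon density as 4.5% of critical; both inputs are assumptions for illustration.

```python
import math

G = 6.6743e-11       # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.98847e30   # solar mass, kg
MPC = 3.0857e22      # megaparsec, m


def critical_density_msun_mpc3(h0_kms_mpc: float) -> float:
    """Critical density 3 H0^2 / (8 pi G) in solar masses per cubic megaparsec."""
    h0 = h0_kms_mpc * 1e3 / MPC
    rho_si = 3 * h0**2 / (8 * math.pi * G)
    return rho_si * MPC**3 / M_SUN


if __name__ == "__main__":
    rho_star = 2.97e8   # stellar mass density from Equation (16.22), Msun/Mpc^3
    omega_b = 0.045     # assumed baryon fraction of the critical density
    omega_star = rho_star / critical_density_msun_mpc3(70.0)
    print(f"Omega_star = {omega_star:.2e}")
    print(f"fraction of baryons in stars = {100 * omega_star / omega_b:.1f}%")
```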
In addition to the elliptical and spiral galaxies there are irregularly shaped
galaxies. These are generally smaller galaxies and in number form about a quarter of
all galaxies. Some owe their shape to collisions and interactions, others are in the
process of consolidation.
120 km s$^{-1}$, indicating the continued effect of the gravitational well of dark and
baryonic matter that they share.
The best studied cluster, the Coma cluster, is a large, densely populated cluster
containing over $10^{3}$ galaxies. It lies close in the sky to the constellation Coma
Berenices: the viewing direction is perpendicular to our galactic plane, and hence
clear of most of our Galaxy’s dust. The mean velocity of recession of the galaxies
making up the Coma cluster, v, is 6933 km s$^{-1}$. Hence the mean redshift is 0.0231 and
the corresponding distance from us is $v/H_0 = 102$ Mpc. This cluster is small enough to
have had time to collapse and virialize. Zwicky applied the virial theorem to the
motion of galaxies in this cluster and deduced the presence of the unseen dark
matter, with many times the mass of the visible matter. We can update Zwicky’s
analysis using recent data. The velocity dispersion of the galaxies along the line of
sight, σ, is ∼1000 km s$^{-1}$. We calculate the mass M within the virial radius R from the
center of the cluster, where the density falls to 200 times the critical density, that is
within 3 Mpc. Then the virial theorem gives
$$M = 3\sigma^{2} R / G, \tag{16.23}$$
where the factor 3 converts the line-of-sight dispersion to the three-dimensional
dispersion. The mass inferred is $2 \times 10^{15}\,M_\odot$. With luminosity in the visible spectrum
$10^{13}\,L_\odot$ the mass to light ratio is 200 times that of the Sun, and 50 times that for our
Galaxy. The inference from updating Zwicky’s analysis is that dark matter
dominates over baryonic matter.
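The numbers in this update of Zwicky's estimate can be reproduced in a few lines. In the sketch below the Hubble constant of 68 km s^-1 Mpc^-1 is an assumption, chosen because it reproduces the quoted distance of 102 Mpc; the function names are illustrative.

```python
G = 6.6743e-11       # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.98847e30   # solar mass, kg
MPC = 3.0857e22      # megaparsec, m
C_KMS = 299792.458   # speed of light, km/s


def redshift_and_distance(v_kms: float, h0_kms_mpc: float) -> tuple[float, float]:
    """Low-redshift approximations z = v/c and d = v/H0 (distance in Mpc)."""
    return v_kms / C_KMS, v_kms / h0_kms_mpc


def virial_mass_msun(sigma_kms: float, radius_mpc: float) -> float:
    """Virial mass M = 3 sigma^2 R / G of Equation (16.23), in solar masses."""
    sigma = sigma_kms * 1e3
    return 3 * sigma**2 * radius_mpc * MPC / G / M_SUN


if __name__ == "__main__":
    z, d = redshift_and_distance(6933.0, 68.0)  # H0 = 68 is an assumed value
    mass = virial_mass_msun(1000.0, 3.0)
    print(f"z = {z:.4f}, distance = {d:.0f} Mpc")
    print(f"virial mass = {mass:.1e} solar masses")
    print(f"mass to light ratio = {mass / 1e13:.0f} in solar units")
```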
In Figure 16.11 we see what Zwicky could not see, a composite view, 800 kpc
wide, covering the Coma cluster. The color shows where there is X-ray emission
Figure 16.11. Composite image of the Coma cluster. The pink and blue color indicates the X-ray emission
from gas at millions of degrees kelvin, and the white indicates the optical emission from galaxies. The X-ray
image was taken by the Chandra satellite (https://chandra.harvard.edu/) and the optical image by the Sloan
Digital Sky Survey (http://www.sdss.org). Courtesy Chandra and SDSS Collaborations.
from gas at ∼8 keV; superimposed in white is the optical image of the constituent
galaxies in the same region. X-rays are strongly absorbed by the Earth’s atmosphere,
so that it was only when satellites were launched in the 1970s, equipped with X-ray
detectors, that X-ray emission could be observed and studied. The X-ray emission
seen in Figure 16.11 provides direct evidence that the greater part of baryonic matter
in the universe lies in diffuse gas clouds enveloping the galaxy clusters. Including
these diffuse gas clouds builds up the baryonic contribution to 4.5% of the critical
density at the present time, the value used throughout this text.
has a distribution around unity for the clusters studied, which supports the plasma
interpretation.
Assuming the plasma observed in a cluster has reached equilibrium after
virialization, then the cluster mass including its dark matter content can be deduced.
For simplicity the gas cloud is taken to be spherically symmetric. In hydrostatic
equilibrium the radial pressure gradient in the gas at radius r balances the inward
gravitational force
$$\frac{dP_g}{dr} = -\frac{G M(<r)\, \rho_g}{r^{2}}, \tag{16.24}$$
where $P_g$ is the gas pressure, $\rho_g$ the gas density at radius r, and $M(<r)$ is the total
mass inside the sphere of radius r. Using the perfect gas law at radius r
$$P_g = \rho_g k_B T_g / \mu. \tag{16.25}$$
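Combining Equations (16.24) and (16.25), with the pressure gradient pointing inward, gives M(<r) = −[k_B T_g r / (G μ)] [d ln ρ_g / d ln r + d ln T_g / d ln r]. The sketch below evaluates this for gas at 8 keV at a radius of 3 Mpc. The logarithmic density slope of −2, the isothermal profile and the mean particle mass μ = 0.6 proton masses are illustrative assumptions, not values taken from the text.

```python
G = 6.6743e-11        # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.98847e30    # solar mass, kg
M_P = 1.67262e-27     # proton mass, kg
MPC = 3.0857e22       # megaparsec, m
KEV = 1.602177e-16    # one keV in joules


def hydrostatic_mass_msun(kt_kev: float, radius_mpc: float,
                          dlnrho_dlnr: float, dlnt_dlnr: float = 0.0,
                          mu_in_proton_masses: float = 0.6) -> float:
    """Total mass inside radius r for gas in hydrostatic equilibrium, in solar masses.

    Implements M(<r) = -(k_B T r / (G mu)) * (dln(rho)/dln(r) + dln(T)/dln(r)).
    """
    kt = kt_kev * KEV
    r = radius_mpc * MPC
    mu = mu_in_proton_masses * M_P
    return -(kt * r / (G * mu)) * (dlnrho_dlnr + dlnt_dlnr) / M_SUN


if __name__ == "__main__":
    mass = hydrostatic_mass_msun(kt_kev=8.0, radius_mpc=3.0, dlnrho_dlnr=-2.0)
    print(f"hydrostatic mass within 3 Mpc: {mass:.1e} solar masses")
```

Under these assumptions the X-ray gas implies a total mass within about 20% of the virial estimate from galaxy motions. The result scales linearly with the assumed density slope.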
terminate the life of massive stars, and AGNs all generate massive outflows of
matter and radiation. These mechanisms ionize and inject metals into the inter-
galactic clouds.
16.9 Exercises
1. A galaxy has a luminosity $10^{8}\,L_\odot$, half coming from within a radius of 4 kpc.
The observed line-of-sight velocity dispersion of its stars is 0.1 km s$^{-1}$.
Calculate the galaxy’s mass, its mass to light ratio and its overdensity.
2. Can you explain why accretion onto an AGN brings an electron with each
proton? What is the momentum flux, that is momentum per unit time,
carried by radiation from the AGN with luminosity L? For this part refer to
Section 1.14. What is the momentum flux across unit area of the accreting
surface, taking this to be spherical and of radius r? Writing $\sigma_T$ as the Thomson
scattering cross-section, how much momentum does an electron receive in
Thomson scattering from the outgoing radiation per unit time? What is the
inward gravitational force on the electron and its proton partner? From these
results show that there is a limiting luminosity that prevents more rapid
accretion: the Eddington limit. What is this limit for an AGN of mass $10^{6}\,M_\odot$?
3. A group of interacting galaxies has a velocity dispersion of 350 km s$^{-1}$. What
temperature would the intergalactic gas reach? At what wavelength would
this gas radiate most strongly?
4. In a galaxy group the gas is in hydrostatic equilibrium, so that the radial
pressure gradient balances the gravitational attraction. Suppose the gas has a
uniform temperature of $1.5 \times 10^{7}$ K and the density varies as 1/r out to a
radius r of 200 kpc. Calculate the total mass enclosed within this radius.
5. The gas in the group of the previous exercise has mass $8 \times 10^{11}\,M_\odot$ and the
light from the galaxies is $10^{10}\,L_\odot$. Assuming the mass to light ratio of the stars
is four times that of the Sun, determine the gas/star mass ratio, and the ratio
of the total mass to the baryonic mass.
6. Using the initial mass function of Equation (16.17) above a mass cutoff of
0.25 M⊙, calculate the fraction of stars with masses above 5 M⊙ and the
fraction of the total mass that they contribute.
Further Reading
Sparke L S and Gallagher J S 2007 Galaxies in the Universe: An Introduction
(Cambridge: Cambridge Univ. Press). This presents a clear account of the
astrophysics of galaxies since the early universe at roughly the same level as
the text here.
Mo H, van den Bosch F and White S 2010 Galaxy Formation and Evolution
(Cambridge: Cambridge Univ. Press). This is an advanced text with thorough
coverage at graduate level. Not for the faint-hearted.
References
André, P., Men’shchikov, A., Bontemps, S., et al. 2010, A&A, 518, L102
Blumenthal, G. R., Faber, S. M., Primack, J. R., & Rees, M. J. 1984, Natur, 311, 517
Driver, S. P., Bellstedt, S., Robotham, A. S. G., et al. 2022, MNRAS, 513, 439
Inutsuka, S., & Miyama, S. M. 1997, ApJ, 480, 681
Madau, P., & Dickinson, M. 2014, ARA&A, 52, 415
McQuinn, M. 2016, ARA&A, 54, 313
Read, A. M., & Ponman, T. J. 1998, MNRAS, 297, 143
Springel, V., White, S. D. M., Jenkins, A., et al. 2005, Natur, 435, 629
Chapter 17
The Dark Sector
17.1 Introduction
The existence and dominance of dark matter over baryonic matter has been integral
to the account of the evolution of the universe given in earlier chapters. In this
chapter the evidence for dark matter is reviewed, and considered further. That other
mysterious entity, dark energy, has been introduced and the ΛCDM model has been
seen to provide a plausible interpretation of the evolution of the universe. Crucial
evidence for the existence of dark energy came late last century; it was discovered
that the expansion of the universe is accelerating now and has been accelerating for
several gigayears. This evidence will be reviewed in the latter part of the chapter.
matter is therefore non-relativistic: cold dark matter. Finally the success, discussed
in Section 15.7, of the prediction for how the baryon acoustic oscillations developed
in the matter era, requires dark matter at a density consistent with other
measurements.
These dark matter particles could be supersymmetric particles. Supersymmetry is
the remaining unexploited symmetry consistent with the special theory of relativity.
Supposing supersymmetry to have been unbroken at the extreme energy density in
the universe shortly after inflation, then fermions (with half-integral spins in units ℏ)
and bosons (with integral spins) would have been interconvertible via supersym-
metric particles, whereas now when this symmetry has been broken fermions and
bosons are quite distinct. Any number of one species of boson, for example photons,
can exist in the same quantum state. In the case of a species of fermion, for example
electrons, a quantum state can only ever contain a single fermion. Once the
symmetry was broken the supersymmetric particles would have decayed to the
lightest neutral supersymmetric particle; making the latter the dark matter particles
in today’s universe. Pursuing this interpretation, the supersymmetry breaking
transition must have occurred before the electroweak symmetry breaking transition.
Otherwise the supersymmetric particles would be as light as the Higgs boson ($\sim 10^{2}$
GeV $c^{-2}$) and would have been produced and detected from the 13 TeV collisions
between protons at the Large Hadron Collider at CERN. Supersymmetric particles
more massive than the Higgs would become non-relativistic, and hence CDM, long
before the decoupling of matter from radiation. Many experiments have been
carried out to detect the huge numbers of such supersymmetric dark matter particles
that would currently penetrate all space (including the Earth). Searches are made for
examples of nuclei recoiling from collisions with these invisible particles. None has
been found. The null results impose very severe limits on the interaction strength of
dark matter supersymmetric particles with baryons, orders of magnitude weaker
than the weak nuclear force. A second candidate dark matter particle is the axion. Its
existence would solve a fundamental conceptual difficulty with the standard model
of elementary particle physics, namely the absence of any charge-parity violation in
the strong force. Axions would have minuscule masses, below about $10^{-3}$ eV $c^{-2}$.
An axion would convert to a pair of photons in a strong magnetic field.
Attempts to detect axions have so far also failed. Axions, despite their low mass,
would have been non-relativistic from early in the life of the universe. Thus, either
axions or supersymmetric particles could fulfill the role of dark matter.
From early in the life of the universe dark matter only interacted gravitationally
with baryonic matter and electromagnetic radiation. Dark matter could therefore
collapse directly into the gravitational potential wells resulting from quantum
fluctuations during inflation. At the same time the baryons and photons formed a
plasma that oscillated within these wells. In Chapter 15 we saw how dark matter
accumulations became the scaffolding of today’s galaxies and clusters of galaxies.
To summarize:
• There is about six times more dark matter than baryonic matter.
• Dark matter interacts gravitationally with itself and with baryonic matter.
• Dark matter has no other detectable interaction.
Figure 17.1. Gravitational lensing. A source at S is imaged at I by a galaxy cluster in the plane LL. The lateral
dimension shown here is hugely magnified; in reality all the angles are very small.
The distance of closest approach of the ray to the lensing cluster, given by Equation
(17.1), is
$$b = \theta_i D_\ell. \tag{17.3}$$
If the source is directly behind the center of the cluster, $\theta_s = 0$ and an Einstein ring
results with $\theta_i = \theta_E$. Using Equations (17.1), (17.2), and (17.3) gives
$$\theta_E = \frac{4GM}{c^2 b}\frac{D_{\ell s}}{D_s} = \frac{4GM D_{\ell s}}{c^2\,\theta_E D_\ell D_s}, \tag{17.4}$$
so that the Einstein ring has angular radius
In this case when the alignment is not exact there are two images with
$$\theta_i = \frac{1}{2}\left[\theta_s \pm \sqrt{\theta_s^2 + 4\theta_E^2}\,\right]. \tag{17.8}$$
Taking the positive sign gives $\theta_i > \theta_E$, and the image lies outside the Einstein ring.
Taking the negative sign makes $\theta_i$ negative with $|\theta_i| < \theta_E$, so the image lies inside the
ring, below the axis (the horizontal line) in the figure. This second image is inverted
top to bottom. Lensing galaxies and clusters are extended objects, not necessarily
axially symmetric, so that images also depend on the geometry of the lens.
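As a rough numerical illustration of Equation (17.8), the sketch below evaluates the two image angles for a point lens. The Einstein radius of 1 arcsec and the source offset of 0.5 arcsec are assumed values chosen only for illustration, and the function name is hypothetical.

```python
import math


def image_angles(theta_s, theta_e):
    """Return (outer, inner) image angles from Equation (17.8).

    Both inputs must share the same angular unit; the outputs use that unit.
    """
    root = math.sqrt(theta_s**2 + 4.0 * theta_e**2)
    return 0.5 * (theta_s + root), 0.5 * (theta_s - root)


# Assumed illustrative values, in arcseconds.
THETA_E = 1.0
THETA_S = 0.5

outer, inner = image_angles(THETA_S, THETA_E)
print(f"outer image: {outer:+.3f} arcsec, inner image: {inner:+.3f} arcsec")
```

The two solutions are the roots of a quadratic, so their product is $-\theta_E^2$ and their sum is $\theta_s$: as one image moves outward the other moves in toward the lens.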
A less restrictive example is that of an extended lensing galaxy axially symmetric
around a line drawn from the observer. Thanks to the symmetry the deflection
produced is the same as that of a point object on axis; the mass of this equivalent
object is equal to the integrated mass of the actual lensing object within radius b. A
further useful simplification is to take the galaxy’s mass to have uniform surface
mass density Σ across LL in Figure 17.1. Then the mass doing the lensing is
$$M = \pi b^2\,\Sigma. \tag{17.9}$$
The gravitational attraction on the ray passing through the lensing surface off axis
by b due to all the mass outside the circle of radius b cancels exactly. This result
parallels the result that a body at radius r within a spherically symmetric mass
distribution feels no effect from matter outside radius r. A critical surface mass
density is defined as that just sufficient to produce an Einstein ring. Now in the case
of an Einstein ring Equations (17.5) and (17.3) give
$$\theta_E^2 = \left[\frac{b}{D_\ell}\right]^2 = \frac{4GM}{c^2}\frac{D_{\ell s}}{D_\ell D_s}. \tag{17.10}$$
Rearranging this equation gives the critical mass required to produce an Einstein ring,
$$M = \frac{b^2 c^2 D_s}{4G D_\ell D_{\ell s}}, \tag{17.11}$$
whence the critical surface mass density is
$$\Sigma_{\mathrm{cr}} = \frac{M}{\pi b^2} = \frac{c^2}{4\pi G}\frac{D_s}{D_\ell D_{\ell s}}. \tag{17.12}$$
Notice that the result is independent of b. Whatever the area of the lensing body
there will be an Einstein ring at its edge if the surface density has the critical value
Σcr . The convergence is defined as the ratio of the surface mass density to the critical
surface mass density
$$\kappa = \Sigma/\Sigma_{\mathrm{cr}}. \tag{17.13}$$
If the convergence is larger than unity there will be multiple images. Requiring the
convergence κ to be greater than unity provides a convenient definition of the strong
gravitational lensing regime. Where the angle between the lines of sight through the
source and through the lens is large, κ will be much less than unity. In such cases the
effect of lensing will only amount to a distortion of the image. This is the weak
gravitational lensing regime.
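To give a feel for the magnitudes in Equations (17.10) to (17.13), the following sketch evaluates the Einstein radius and the critical surface mass density. The lens mass and the three distances are assumed, purely illustrative values; $D_{\ell s}$ is passed in separately because in an expanding universe it is not simply $D_s - D_\ell$. The function names are hypothetical.

```python
import math

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8        # speed of light, m s^-1
M_SUN = 1.989e30   # solar mass, kg
MPC = 3.086e22     # megaparsec, m


def einstein_radius(mass, d_l, d_s, d_ls):
    """Einstein ring angular radius in radians, from Equation (17.10)."""
    return math.sqrt(4.0 * G * mass * d_ls / (C**2 * d_l * d_s))


def sigma_critical(d_l, d_s, d_ls):
    """Critical surface mass density in kg m^-2, from Equation (17.12)."""
    return C**2 * d_s / (4.0 * math.pi * G * d_l * d_ls)


def convergence(sigma, d_l, d_s, d_ls):
    """Convergence kappa = Sigma / Sigma_cr, Equation (17.13)."""
    return sigma / sigma_critical(d_l, d_s, d_ls)


# Assumed illustrative configuration: a cluster-scale lens.
MASS = 1.0e14 * M_SUN
D_L, D_S, D_LS = 800.0 * MPC, 1600.0 * MPC, 1000.0 * MPC

theta_e = einstein_radius(MASS, D_L, D_S, D_LS)
sigma_cr = sigma_critical(D_L, D_S, D_LS)
print(f"theta_E = {math.degrees(theta_e) * 3600:.1f} arcsec")
print(f"Sigma_cr = {sigma_cr:.2f} kg m^-2")
```

As a consistency check, spreading the lens mass uniformly over the disc of radius $b = \theta_E D_\ell$ reproduces $\Sigma_{\mathrm{cr}}$, which is the statement that the result is independent of b.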
17.4 MACHOs
Dark matter could plausibly exist as dark stars lighter than the brown dwarfs. These
objects are called massive astrophysical compact halo objects or MACHOs.
Extensive searches have been made by the OGLE Collaboration (Wyrzykowski
et al. 2011) for MACHOs in the Large and Small Magellanic Clouds, satellite
galaxies of our own galaxy. Thanks to the relative motion of all three, the Earth, a
suitably located MACHO and a background star, all could at some moment line up.
Then the MACHO would briefly pass in front of the star. The consequent gravitational
lensing would cause the distant star to brighten just as briefly. In order to
exclude cases where the distant star was itself changing state, it was required that the
temporary brightening was the same all across the distant star’s spectrum. The
OGLE Collaboration found only three possible examples in 13 years. This result
imposed an upper limit on the contribution of MACHOs to dark matter of order 1%.
Figure 17.2. Axes are shown for the shear modes of a galaxy referenced to a second galaxy. The +-axes define
one mode of shear, with shrinkage along one axis and expansion along the other. Similarly the ×-axes rotated
by 45° apply for the orthogonal shear mode.
The corresponding quantities here are simply the shear components in the image
plane. Figure 17.2 shows the reference axes for the two independent shear modes
drawn at the left-hand galaxy. The axes of the two modes, labeled + and ×, make an
angle of 45°. The difference in distortion between the imaging of nearby regions
depends on the differential:
$$\frac{\partial(\theta_s)_k}{\partial(\theta_i)_j} = A_{kj} \tag{17.14}$$
where k and j are labels for axes at right angles in the image plane. As before, i and s
label image and source. Guided by Equation (9.7), A is written
$$A = (1-\kappa)\begin{bmatrix} 1+\gamma_+ & \gamma_\times \\ \gamma_\times & 1-\gamma_+ \end{bmatrix}, \tag{17.15}$$
where κ is the convergence, while γ+ and γ× are the projections of the shear γ along the
respective axes. Thus in terms of its two components
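$$\boldsymbol{\gamma} = (\gamma_+, \gamma_\times), \tag{17.16}$$
where
$$\gamma_+ = \gamma\cos 2\phi \quad\text{and}\quad \gamma_\times = \gamma\sin 2\phi, \tag{17.17}$$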
with ϕ being the angle shown in Figure 17.2. Again, as for gravitational waves, the
doubled angle 2ϕ appears here because of the way quadrupoles (spin-2 tensors)
transform under rotations. We can separate out the magnification
$$\mu = \frac{1}{\det A} = \frac{1}{(1-\kappa)^2(1-\gamma^2)} \approx 1 + 2\kappa. \tag{17.18}$$
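The size of the magnification in the weak lensing regime can be checked numerically by building the matrix A of Equation (17.15) and inverting its determinant. This is a minimal sketch; the values of the convergence and shear components are assumed for illustration, and the function names are hypothetical.

```python
def lensing_matrix(kappa, gamma_plus, gamma_cross):
    """Matrix A of Equation (17.15) as a nested list."""
    f = 1.0 - kappa
    return [[f * (1.0 + gamma_plus), f * gamma_cross],
            [f * gamma_cross, f * (1.0 - gamma_plus)]]


def magnification(kappa, gamma_plus, gamma_cross):
    """Magnification mu = 1 / det(A)."""
    a = lensing_matrix(kappa, gamma_plus, gamma_cross)
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return 1.0 / det


# Assumed weak-lensing values.
KAPPA, GAMMA_PLUS, GAMMA_CROSS = 0.02, 0.01, 0.005

mu = magnification(KAPPA, GAMMA_PLUS, GAMMA_CROSS)
print(f"mu = {mu:.4f}, first-order estimate 1 + 2*kappa = {1 + 2 * KAPPA:.4f}")
```

For shear and convergence at the per cent level the exact value and the first-order estimate agree to a few parts in a thousand, since the shear only enters at second order.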
Now the measurable quantity, from which the ellipticity of a galaxy is determined, is
the quadrupole moment. Continuing with the same perpendicular axes in the image
plane, the quadrupole moments are defined by
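$$Q_{kj} = \frac{\int I(\boldsymbol{\theta})\,\theta_k\theta_j\,d\boldsymbol{\theta}}{\int I(\boldsymbol{\theta})\,d\boldsymbol{\theta}}, \tag{17.19}$$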
where I (θ ) is the brightness at each two-dimensional point θ in the image plane.
Then the + and × ellipticity components are
$$\boldsymbol{\varepsilon} = \left(\frac{Q_{xx}-Q_{yy}}{Q_{xx}+Q_{yy}},\ \frac{2Q_{xy}}{Q_{xx}+Q_{yy}}\right), \tag{17.20}$$
irrespective of the orientation of the perpendicular x- and y-axes in the image plane.
As noted a few lines above, the observed ellipticity of a galaxy is the sum of the
intrinsic ellipticity and the shear:
$$\boldsymbol{\varepsilon} = \boldsymbol{\varepsilon}_{\mathrm{int}} + \boldsymbol{\gamma}.$$
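The way Equations (17.19) and (17.20) turn a brightness map into an ellipticity can be sketched with a toy image. The pixel grid, the elliptical Gaussian profile and its widths are all assumptions made for illustration; real shear measurements must also handle noise, the point-spread function and centroiding, none of which is attempted here.

```python
import math


def quadrupole_moments(image, half_width):
    """Brightness-weighted second moments (Q_xx, Q_yy, Q_xy), Equation (17.19).

    image[iy][ix] is the brightness at pixel offsets (ix - half_width,
    iy - half_width) measured from the centre of the grid.
    """
    total = qxx = qyy = qxy = 0.0
    for iy, row in enumerate(image):
        y = iy - half_width
        for ix, brightness in enumerate(row):
            x = ix - half_width
            total += brightness
            qxx += brightness * x * x
            qyy += brightness * y * y
            qxy += brightness * x * y
    return qxx / total, qyy / total, qxy / total


def ellipticity(qxx, qyy, qxy):
    """The (+, x) ellipticity components of Equation (17.20)."""
    denom = qxx + qyy
    return (qxx - qyy) / denom, 2.0 * qxy / denom


# Assumed toy galaxy: Gaussian profile elongated along the x-axis.
HALF_WIDTH = 15
SIGMA_X, SIGMA_Y = 3.0, 2.0
image = [[math.exp(-0.5 * ((x / SIGMA_X) ** 2 + (y / SIGMA_Y) ** 2))
          for x in range(-HALF_WIDTH, HALF_WIDTH + 1)]
         for y in range(-HALF_WIDTH, HALF_WIDTH + 1)]

e_plus, e_cross = ellipticity(*quadrupole_moments(image, HALF_WIDTH))
print(f"e_plus = {e_plus:.3f}, e_cross = {e_cross:.3f}")
```

For this profile the second moments are close to the squared widths, so the + component is close to (9 − 4)/(9 + 4) ≈ 0.38 while the × component vanishes by symmetry.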
The range of ellipticity of galaxies is around 0.3, while the shear is only around 0.01.
Therefore any attempt to determine the shear has to be statistical and requires a data
set of many millions of galaxies. The strength of the correlation between the
ellipticities of pairs of galaxies falls as their separation increases. Hence the galaxy
pairs must be grouped according to their angular separation in the sky. For each
range of angular separation the mean correlations of interest are:
$$\xi_\pm = \langle(\varepsilon_+)_a(\varepsilon_+)_b\rangle \pm \langle(\varepsilon_\times)_a(\varepsilon_\times)_b\rangle \tag{17.21}$$
where a and b label the two galaxies in a pair and the average is taken over all pairs
in the selected range of angular separation. Correlations like 〈(ε+ )a (ε× )b〉 between the
orthogonal modes should vanish, which provides an experimental test for the
validity of the interpretation given here for the shear observed. Another test involves
randomly interchanging the galaxy data set and checking that correlations disappear.
The DES Collaboration surveyed 26 million galaxies over an area of 1321
square degrees with redshifts measured to lie in the range 0.2–1.3. In Figure 17.3 the
mean correlations are plotted against the angular separation of the galaxy pair
(Troxel et al. 2018). The curves superposed in the figure are the prediction for the
shear made using the ΛCDM model parameters. This prediction depends on the
mean density of dark matter, expressed by the parameter Ω m , and on the clumpiness
of matter expressed by the parameter σ8 introduced in Section 15.6. Measurements
of cosmic shearing determine the combination $S_8 = \sigma_8\sqrt{\Omega_{m0}/0.3}$. More recently the DES
Collaboration (DES Collaboration et al. 2022), combining data from shearing,
galaxy–galaxy lensing and galaxy clustering using $10^{8}$ galaxies, find
Figure 17.3. Lensing correlations measured by the DES Collaboration. The curves are predictions made with
the ΛCDM model. Adapted from Figure 3 from Troxel et al. (2018). Courtesy American Physical Society.
S8 = 0.816 ± 0.008 and Ω m0 = 0.306 ± 0.006 in the ΛCDM model. (This is little
different from the value 0.30 for Ω m0 used in calculations in this book.)
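The quoted values can be unpacked with a few lines of arithmetic. The sketch assumes the standard definition $S_8 = \sigma_8\sqrt{\Omega_{m0}/0.3}$ and uses the central values given above; the function names are hypothetical and no error propagation is attempted.

```python
import math


def s8_from_sigma8(sigma8, omega_m0):
    """S8 under the assumed definition S8 = sigma8 * sqrt(Omega_m0 / 0.3)."""
    return sigma8 * math.sqrt(omega_m0 / 0.3)


def sigma8_from_s8(s8, omega_m0):
    """Invert the same definition to recover sigma8."""
    return s8 / math.sqrt(omega_m0 / 0.3)


S8_QUOTED = 0.816        # central value quoted in the text
OMEGA_M0_QUOTED = 0.306  # central value quoted in the text

sigma8 = sigma8_from_s8(S8_QUOTED, OMEGA_M0_QUOTED)
print(f"implied sigma8 = {sigma8:.3f}")
```

Because the matter density is so close to 0.3, the implied $\sigma_8$ differs from $S_8$ by only about one per cent.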
It is worth emphasizing that determinations of the total mass
and its distribution from cosmic shearing and galaxy–galaxy lensing are critical
because they only rely on the presumption that the deviation of light rays due to
matter is that given by general relativity.
Figure 17.4. The Bullet cluster. In both plots the green lines are the matter density contours determined from
weak lensing measurements. On the left they are superposed on the visible galaxies; on the right superposed on
the X-ray intensity in the energy range 0.5 to 5.0 keV. At the outer contour the convergence κ = 0.16 , and
going inward the contours mark successive increases of 0.07. Figure 1 from Clowe et al. (2006). Courtesy
Professor Gonzalez for the authors.
cluster which is being stripped by ram pressure exerted by the intracluster medium.
Forward of the visible bullet there is a shock wave traveling at 4700 km s−1; and in
the region between the bullet and the surface of the shock wave the temperature of
the emitted X-rays climbs to 28 keV. By contrast the temperature both in the bullet
and beyond the shock wave is much lower. (Expressing electron energy as $(3/2)k_BT$
gives a temperature of $8.2\times10^{7}$ K to power 10 keV X-ray emission.)
A significant feature seen in Figure 17.4 is the precise overlap between the total
mass distribution detected by weak gravitational lensing and the distribution of the
stars. Evidently dark matter in the galaxy clusters keeps pace with the stars, rather
than falling behind with the X-ray emitting baryonic gas. The passage of the dark
matter must therefore also be collisionless. These observations confirm the existence
of dark matter and show directly that its only interaction with baryonic matter is
gravitational.
more plausibly the energy density was exactly the critical density and would then
remain so to the present. Inflation in the first $\sim10^{-36}$ s was invoked to ensure that,
whatever the starting conditions, the universe became as flat as required, and
crucially made the universe as homogeneous as it is today. The conclusion then
drawn is that some undetected material, dark energy, makes up the deficit needed to
reach the critical density. It contributes the bulk of the energy, around 70% at the
current epoch.
Significantly, the properties of this dark energy are, so far as we can tell,
consistent with the cosmological constant in Einstein’s equation. If this association
is correct, then as discussed in Chapter 6, dark energy exerts a repulsive gravitational
force, the opposite from matter. It appears to fill space uniformly; increasing with
the volume of space, and expanding space indefinitely. This makes for a grim future,
though many gigayears away.
Figure 17.5. Evolution of the scale parameter and redshift with time for universes assuming there is dark
energy, and assuming there is no dark energy. Note that where the curve is concave down there is deceleration,
and where concave up, acceleration.
¹ SNe Ia also occur when two white dwarfs merge.
predictable manner with their luminosity. More recently Cepheid variables have
been supplemented by the TRGB method discussed in Section 1.4. The use of SNe Ia
as standard candles has carried the scale to distant galaxies with redshifts currently
above 2.0. Because the absolute peak brightness of SNe Ia is not well determined
from first principles, the overlap with shorter range standard candles is essential in
calibrating the SN Ia distance scale.²
The observed redshift and luminosity distance of a standard candle can be
connected using in turn Equations (11.10) and (11.1):
$$d_L = d_P(1+z) = c(1+z)\int_t^{t_0}\frac{dt'}{a(t')}, \tag{17.22}$$
where z is the redshift of light emitted at time t, and $t_0$ is the present time. Then Equation (10.30) can be
used to give an integral in a with explicit dependence on the proportions of matter
and dark energy in the universe:
$$d_L = \frac{c(1+z)}{H_0}\int_{a(z)}^{1}\frac{da}{\left(\Omega_{m0}\,a + \Omega_{r0} + \Omega_{\Lambda 0}\,a^4\right)^{1/2}}. \tag{17.23}$$
Inserting values of Ω m0 , etc. from a model of the universe in Equation (17.23) yields
one estimate of the luminosity distance. The luminosity distance to an SN Ia can also
be determined directly from the difference between the predicted absolute magnitude
M of the SN Ia and its measured apparent magnitude m: using Equation (1.19) we
get
$$M - m = -5\log_{10}\left[\frac{d_L}{10\ \mathrm{pc}}\right]. \tag{17.24}$$
If the model of the universe describes its evolution correctly, including any period of
accelerated expansion, then the two estimates of the luminosity distance should
coincide.
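The two routes to the luminosity distance can be made concrete with a short numerical sketch of Equations (17.23) and (17.24). It uses a simple trapezoidal rule; the Hubble constant of 70 km s⁻¹ Mpc⁻¹, the neglect of radiation and the function names are assumptions made for illustration, while the density parameters 0.30 and 0.70 are the values used elsewhere in the book.

```python
import math

C_KM_S = 299792.458  # speed of light, km s^-1


def luminosity_distance(z, h0=70.0, omega_m=0.30, omega_r=0.0,
                        omega_l=0.70, steps=10000):
    """Luminosity distance in Mpc from Equation (17.23), trapezoidal rule.

    h0 is in km s^-1 Mpc^-1 (assumed value); the integral runs over the
    scale factor a from 1/(1+z) to 1.
    """
    a_min = 1.0 / (1.0 + z)
    width = (1.0 - a_min) / steps

    def integrand(a):
        return 1.0 / math.sqrt(omega_m * a + omega_r + omega_l * a**4)

    total = 0.5 * (integrand(a_min) + integrand(1.0))
    for i in range(1, steps):
        total += integrand(a_min + i * width)
    return C_KM_S * (1.0 + z) / h0 * total * width


def distance_modulus(d_l_mpc):
    """m - M = 5 log10(d_L / 10 pc), equivalent to Equation (17.24)."""
    return 5.0 * math.log10(d_l_mpc * 1.0e6 / 10.0)


d_lcdm = luminosity_distance(1.0)
d_matter = luminosity_distance(1.0, omega_m=1.0, omega_l=0.0)
delta_mu = distance_modulus(d_lcdm) - distance_modulus(d_matter)
print(f"d_L(LCDM) = {d_lcdm:.0f} Mpc, d_L(matter only) = {d_matter:.0f} Mpc")
print(f"difference in m - M at z = 1: {delta_mu:.2f} mag")
```

The difference between the two models does not depend on the assumed Hubble constant, which cancels in the ratio of distances; at redshift 1 it comes out at roughly 0.6 magnitudes.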
In practice the SNe Ia data are presented as a plot of the distance modulus
μ = m − M against the redshift, with a prediction made using the model of the
universe superposed. The intention had been to track the slowing down of the
expansion with time, as was expected in a matter-dominated universe. Perlmutter,
Riess, Schmidt, and their colleagues found the reverse: the expansion for the last
5 Gyr has been speeding up thanks to the repulsive gravitational force of dark
energy. Figure 17.6 shows the distance moduli versus redshift from a sample of 1550
SNe Ia measured by the Pantheon+ Collaboration (Brout et al. 2022). Corrections
have been applied to compensate for the effect of the difference in luminosities,
depending on the type of host galaxies, and for other small biases. The solid line is a
fit to the data made with the ΛCDM model. There is a point of inflection at around
redshift 0.5 (5 Gyrs ago) when dark energy began to dominate and the expansion of
the universe accelerated. The broken gray line shows the expected bias due to very
low redshift effects arising from peculiar velocities. In the lower section the
² My thanks to Professor Brout for clarifying several features of SN Ia observations.
Figure 17.6. The distance modulus versus redshift survey from 1701 SNe Ia light curves compiled by the
Pantheon+ Collaboration with redshifts from 0.001 to 2.26: Figure 4 from Brout et al. (2022). Courtesy
Professor Brout for the Pantheon+ Collaboration: ArXiv:2202.04077. The solid lines show the predictions
using the ΛCDM model. The dot-dash lines show the predictions for no dark energy (courtesy Professor
Brout).
differences between the data points and the fit are shown: the black dots with error
bars show the data averaged over intervals in redshift. A prediction made with all
matter and no dark energy is indicated by the dashed–dotted lines: these diverge
from data with increasing redshift. They lie over 0.6 units in magnitude below the
data points at redshift 1.0; meaning that if there were no dark energy the supernovae
at redshift 1.0 would be a factor of 2 brighter. This difference is many times the
statistical variation indicated by the black error bars, which is convincing evidence
for dark energy. Currently the universe is in a transitional state: until recently matter
dominated, but from here on dark energy will do so. Riess, Schmidt, and
Perlmutter shared the 2011 Nobel Prize in Physics for this work.
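The link between a shift in magnitude and a brightness ratio follows from the definition of the magnitude scale, in which 5 magnitudes correspond to a factor of 100 in flux. The sketch below is only an illustration of that conversion; the function name is hypothetical.

```python
def flux_ratio(delta_mag):
    """Brightness ratio corresponding to a magnitude difference delta_mag."""
    return 10.0 ** (0.4 * delta_mag)


ratio_06 = flux_ratio(0.6)
ratio_075 = flux_ratio(0.75)
print(f"0.6 mag -> factor {ratio_06:.2f}; 0.75 mag -> factor {ratio_075:.2f}")
```

A difference of 0.6 magnitudes corresponds to a factor of about 1.7 in brightness, and a factor of 2 is reached at about 0.75 magnitudes, so the factor of 2 quoted for an offset of over 0.6 magnitudes is a round figure.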
The surprising result that the expansion of the universe is now accelerating can be
checked by using the ruler provided by the baryon acoustic oscillations introduced in
Section 15.7. The ruler’s length is the scale factor DV (z ) given by Equation (15.52),
which is obtained from isotropic acoustic scale fits to galaxy surveys. Figure 17.7
shows the variation of DV (z ) with redshift during the period of acceleration at low
Figure 17.7. The scale factor DV (z ) measured from the baryon acoustic oscillations plotted against redshift.
The curve is a prediction made using the ΛCDM model of the universe. Figure 21 from Anderson et al. (2014),
Courtesy Royal Astronomical Society, London and Oxford Univ. Press, Oxford.
redshift.³ The superimposed curve is the prediction of the ΛCDM model, which
matches the observed change in the ruler’s length with redshift. This confirms the
evidence from the observations of SNe Ia that currently the dark energy has taken
over from matter as the dominant energy component in the universe.
Referring back, Figure 12.4 shows the constraints imposed in $\Omega_{\Lambda 0}$–$\Omega_{m0}$ space by
the separate CMB, BAO, and SNe Ia data sets in the ΛCDM model. The three
acceptable regions overlap in the neighborhood of the parameter choice used
throughout this book: that is where ΩΛ0 is 0.70 and Ω m0 is 0.30. This shows that a
simultaneous fit to the data from all eras exists, giving confidence that the model
describes the evolution of the universe adequately. We can look at each of the three
constraints in turn.
First the observed angular spacing of the thermal perturbations of the CMB
requires the universe to be close to having perfect flatness; the constraint is that
ΩΛ0 + Ω m0 is close to unity. Next the spacing of the BAO grew after decoupling
during a matter dominated era and so depends primarily on the fraction of dark plus
baryonic matter in the universe: this gives a constraint at roughly constant Ω m0.
Finally the acceleration of the expansion of the universe evidenced by SNe Ia data is
determined by the difference between the gravitational repulsion due to dark energy
(ΩΛ0 ) and the gravitational attraction due to matter (Ω m0): this gives a band in
Figure 12.4 at right angles to the first, CMB defined band. It evidently helps in
showing the consistency of the data sets, and in determining the values of ΩΛ0 and
³ The factor multiplying $D_V(z)$ compensates for the small shift between the parameters of the ΛCDM model used as
input into the analysis and their values emerging from the fit to the data.
Figure 17.8. Outline of the evolution of the universe as interpreted in the text using the ΛCDM model.
Ω m0, that the three data sets chosen give bands in the figure that intersect at large
angles. For reference, Figure 10.3 shows how the energy content of the universe
changed with time according to the ΛCDM model.
Figure 17.8 summarizes the life history of the universe.
17.10 Exercises
1. What is the angular deflection of a light ray just grazing a neutron star of mass
1.4 M⊙ and radius 10 km?
2. Show that the expansion rate of a universe changes from decelerating to
accelerating when the matter density is twice the dark energy density.
3. Calculate the Einstein ring radius for a lensing mass of $10^{3}\,M_\odot$ halfway
between the observer and a source at 2 Gpc distance. In the same
configuration, what is the critical surface mass density?
4. Starting from Equation (17.23) make an estimate of the ratio between the
apparent luminosity of a source at redshift 2 in a ΛCDM universe and one in
a universe with Ω m0 = 1.0.
5. Suppose that during the thermonuclear burning phase of an SN Ia a mass 0.5
M⊙ is converted to $^{56}$Ni. Estimate the total energy released. Make use of the
binding energies of nuclear species. The nucleus $^{56}$Ni decays first by electron
Further Reading
The 2011 Nobel lectures given by Saul Perlmutter and Adam Riess concerning
the discovery of the acceleration of the expansion of the universe are well
worth reading/viewing. They can be found at Nobel.org.
References
Anderson, L., Aubourg, É., Bailey, S., et al. 2014, MNRAS, 441, 24
Brout, D., Scolnic, D., Popovic, B., et al. 2022, ApJ, 938, 110
Clowe, D., Bradač, M., Gonzalez, A. H., et al. 2006, ApJ, 648, L109
DES Collaboration, Abbott, T. M. C., Aguena, M., Alarcon, A., et al. 2022, PhRvD, 105, 023520
Perlmutter, S., Gabi, S., Goldhaber, G., et al. 1997, ApJ, 483, 565
Riess, A. G., Filippenko, A. V., Challis, P., et al. 1998, AJ, 116, 1009
Troxel, M. A., MacCrann, N., Zuntz, J., et al. 2018, PhRvD, 98, 043528
Wyrzykowski, L., Skowron, J., Kozłowski, S., et al. 2011, MNRAS, 416, 2949
Appendix A
The Particles and Forces
Around 5% of the energy in the universe is carried by baryonic matter; the rest is
carried by dark matter and dark energy, neither of which is directly detectable.
Baryonic matter includes free protons and neutrons as well as nuclei into which they
are bound by their mutual strong interaction. This is nature’s strongest interaction,
about $10^{2}$ times stronger than the next strongest, the electromagnetic interaction.
Nuclei, including protons and neutrons, that all feel the strong force are called
hadrons. Electrons do not feel the strong force, but are nonetheless classed as
baryonic matter by convention in the context of cosmology. Atoms consisting of an
electron cloud around a nucleus are equally baryonic matter. Most of what has been
learnt about the universe has come from observing electromagnetic radiation from
baryonic matter. Protons and neutrons are massive compared to electrons: 938.3
MeV c−2 and 939.6 MeV c−2, respectively, compared to 0.511 MeV c−2. The masses
are such that neutrons can decay into protons, or given enough energy, a proton can
convert to a neutron: thus
$$\begin{aligned} n &\to p + e^- + \bar{\nu}_e,\\ \bar{\nu}_e + p &\to e^+ + n. \end{aligned} \tag{A.1}$$
The two processes introduced here occur through the third force, the weak
interaction, orders of magnitude weaker than the electromagnetic interaction. The
processes indicated involve the electron neutrino νe and its antiparticle ν̄e . Neutrinos
have minuscule masses, less than 1.0 eV c−2, and feel neither the strong nor the
electromagnetic force. All of these particles feel the weak force: protons, neutrons, electrons,
and neutrinos. The e+, appearing in the second equation, signifies the positron, the
antiparticle of the electron. Like other antiparticles it has the same mass but the
opposite charge to its particle partner. An antiparticle interacts through the same
forces as its particle partner. Although weak, the weak force is what powers the Sun
and other astrophysical processes. It requires the high pressure and temperature
that develop in the interior of stars when they collapse under their gravitational
which is relativistic, and, of course, holds at all temperatures met for photons and,
similarly in practice, for neutrinos. At the other extreme for baryons at low enough
temperatures Equation (1.25) reduces to
$$E = mc^2 + p^2/(2m), \tag{A.4}$$
the non-relativistic behavior. Roughly speaking the changeover happens for
nucleons around GeV energies ($10^{13}$ K) and at MeV energies ($10^{10}$ K) for electrons.
Within instants after the Big Bang energies dropped well below 1 GeV so that
nucleons have behaved non-relativistically for the greater part of the life of the
universe. We shall see that whether particles are non-relativistic or relativistic has
far-reaching consequences. The history of the universe depends strongly on whether
the particles that make up the mysterious dark matter would be relativistic (light and
hot) or non-relativistic (heavy and cold). The latter case is strongly favored by all
current observations.
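The temperatures quoted for the changeover can be checked by converting a characteristic energy into a temperature through E = k_B T. This is an order-of-magnitude sketch only; the function name is hypothetical and factors of order unity (such as 3/2 or 3) are deliberately ignored.

```python
K_B_EV_PER_K = 8.617e-5  # Boltzmann constant, eV K^-1


def temperature_for_energy(energy_ev):
    """Temperature in kelvin at which k_B * T equals the given energy in eV."""
    return energy_ev / K_B_EV_PER_K


t_nucleon = temperature_for_energy(1.0e9)   # GeV scale, nucleons
t_electron = temperature_for_energy(1.0e6)  # MeV scale, electrons
print(f"1 GeV ~ {t_nucleon:.2e} K, 1 MeV ~ {t_electron:.2e} K")
```

Both results agree with the orders of magnitude given above: about $10^{13}$ K for nucleons and about $10^{10}$ K for electrons.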
The protons and neutrons are themselves bound states of yet more fundamental
particles, known as quarks. There are three quarks in each proton and in each
neutron, bound together by the strong force. The force acts through the exchange of
gluons, which are the equivalent of the photons that bind electrons to nuclei to form
atoms. The parallel goes further. There is a residual attenuated electric dipole force
between atoms, the Van der Waals force: correspondingly the residual attenuated
strong force between protons and neutrons is what binds them into nuclei.
Appendix B
Variational Methods
The general problem is to calculate for what path between A and B in spacetime the
path integral
$$I = \int_A^B L\,d\tau$$
Suppose that the stationary path is known and that x μ and q μ are the coordinates
and derivatives along the path. Next consider the small excursion from this path
shown in Figure B.1 over which these quantities become x μ + δx μ and q μ + δq μ,
respectively. Then the change in the path integral is
$$\delta I = \int_A^B\left(\frac{\partial L}{\partial x}\,\delta x + \frac{\partial L}{\partial q}\,\delta q\right)d\tau,$$
where for clarity the superscript μ has been suppressed. Integrating the second term
by parts gives
$$\left[\left(\frac{\partial L}{\partial q}\right)\delta x\right]_A^B - \int_A^B \delta x\left[\frac{d(\partial L/\partial q)}{d\tau}\right]d\tau.$$
Of these terms, the first vanishes because the end points of the path are fixed so that
δx = 0 at A and B. Thus
$$\delta I = \int_A^B\left[\frac{\partial L}{\partial x} - \frac{d(\partial L/\partial q)}{d\tau}\right]\delta x\,d\tau.$$
This variation must be zero for any choice of δx , which means that the term in
square brackets must be identically zero along the whole path. Thus the stationary
path has the differential equation (with superscript μ restored)
$$\frac{\partial L}{\partial x^\mu} - \frac{d(\partial L/\partial q^\mu)}{d\tau} = 0. \tag{B.1}$$
If L is replaced by L , the result is unchanged. In order to obtain the geodesic
equation from Equation (B.1) we make the substitution
$$L = g_{\alpha\beta}\,q^\alpha q^\beta.$$
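Differentiating, and remembering that the coordinates and velocities are independent
quantities, we obtain
$$\frac{\partial L}{\partial x^\mu} = g_{\alpha\beta,\mu}\,q^\alpha q^\beta$$
$$\frac{\partial L}{\partial q^\mu} = g_{\alpha\mu}\,q^\alpha + g_{\mu\beta}\,q^\beta$$
$$\frac{d(\partial L/\partial q^\mu)}{d\tau} = g_{\alpha\mu,\sigma}\,q^\alpha q^\sigma + g_{\mu\beta,\sigma}\,q^\beta q^\sigma + 2g_{\mu\alpha}\frac{dq^\alpha}{d\tau}.$$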
Substituting these values into Equation (B.1) gives
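$$g_{\alpha\beta,\mu}\,q^\alpha q^\beta - g_{\alpha\mu,\sigma}\,q^\alpha q^\sigma - g_{\mu\beta,\sigma}\,q^\beta q^\sigma - 2g_{\mu\alpha}\frac{dq^\alpha}{d\tau} = 0.$$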
The first three terms simplify using Equation (5.17)
$$-2\Gamma_{\mu\beta\alpha}\,q^\alpha q^\beta - 2g_{\alpha\mu}\frac{dq^\alpha}{d\tau} = 0,$$
that is
$$g_{\mu\nu}\Gamma^\nu_{\beta\alpha}\,q^\alpha q^\beta + g_{\mu\nu}\frac{dq^\nu}{d\tau} = 0,$$
where in the second term the equality of gμν and gνμ has been used, and the repeated
index has been changed from α to ν. This reduces to
$$\Gamma^\nu_{\beta\alpha}\,q^\alpha q^\beta + \frac{dq^\nu}{d\tau} = 0.$$
Finally, expressing q μ in full, we have
$$\Gamma^\nu_{\beta\alpha}\frac{dx^\alpha}{d\tau}\frac{dx^\beta}{d\tau} + \frac{d^2x^\nu}{d\tau^2} = 0, \tag{B.2}$$
which is the geodesic Equation (5.15).
If L is independent of one of the coordinates x ν , then the corresponding
component of Equation (B.1) becomes much simpler
$$\frac{d(\partial L/\partial q^\nu)}{d\tau} = 0, \tag{B.3}$$
while the remaining three component equations with μ ≠ ν retain the more general
form of Equation (B.1). In this case ∂L /∂q ν is a constant of the motion, a result
which is frequently used in Chapter 7. The reader may recognize that the variational
methods used here are closely related to the Lagrangian analysis of classical
mechanics. In that case L is the Lagrangian T − V, where T is the kinetic energy
and V is the potential energy. In the cases of interest T is purely a quadratic
function of the momentum so that, if V is independent of some coordinate x μ, the
result Equation (B.3) follows. This is the basis of the conservation laws of classical
mechanics. If V, and hence L, is independent of the position coordinates, as is
usually the case, then
$$\frac{\partial L}{\partial q^\mu} = \frac{\partial T}{\partial q^\mu} = \frac{\partial(q^2/2m)}{\partial q^\mu} = \frac{q_\mu}{m}$$
is a conserved quantity. In other words, the linear momentum is conserved. Similar
calculations lead to the conservation of energy and angular momentum. The
corresponding conserved quantities for Schwarzschild spacetime are discussed in
Chapter 7 during the study of planetary orbits round the Sun.
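As a short worked illustration of Equations (B.1) and (B.3), consider the Euclidean plane in polar coordinates. This example is not taken from the text: the coordinates (r, φ) and the use of path length s as the parameter are assumptions made purely for illustration.

```latex
% Euclidean plane in polar coordinates, path parameter s (assumed example)
L = g_{\alpha\beta}\,q^\alpha q^\beta = \dot r^{2} + r^{2}\dot\phi^{2},
\qquad \dot r = \frac{dr}{ds},\quad \dot\phi = \frac{d\phi}{ds}.

% The mu = r component of (B.1)
\frac{\partial L}{\partial r} - \frac{d}{ds}\frac{\partial L}{\partial \dot r}
  = 2r\dot\phi^{2} - 2\ddot r = 0
\;\Longrightarrow\; \ddot r = r\dot\phi^{2}.

% L is independent of phi, so (B.3) applies
\frac{d}{ds}\frac{\partial L}{\partial \dot\phi}
  = \frac{d}{ds}\left(2r^{2}\dot\phi\right) = 0
\;\Longrightarrow\; r^{2}\dot\phi = \text{constant}.
```

Both equations are satisfied by straight lines, the geodesics of the plane, and the conserved quantity $r^2\dot\phi$ is the perpendicular distance of the line from the origin. The same pattern, with φ an ignorable coordinate, gives the conserved angular momentum in the Schwarzschild case.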
Appendix C
The Schwarzschild Metric
The spherically symmetric metric satisfying Einstein’s equation for empty space is
the Schwarzschild metric. This is straightforward to demonstrate using the additional
assumption that the metric is static. The most general spherically symmetric metric is
$$ds^2 = a\,c^2\,dt^2 - b\,dr^2 - 2e\,dt\,dr - r^2\,d\Omega^2,$$
where a, b and e are functions of r and t only. A new time coordinate t′ can be chosen
for which
$$c\,dt' = \left(a^{1/2}c\,dt - \frac{e}{a^{1/2}c}\,dr\right)f(t,r),$$
where f is the integrating factor needed to make the right-hand side a perfect
differential. Then
$$ds^2 = A\,c^2\,dt'^2 - B\,dr^2 - r^2\,d\Omega^2.$$
If the metric is static then A and B depend on r only. Dropping the prime gives
$$ds^2 = A\,c^2\,dt^2 - B\,dr^2 - r^2\,d\Omega^2 \tag{C.1}$$
with metric components
$$g_{00} = A,\quad g_{11} = -B,\quad g_{22} = -r^2,\quad g_{33} = -r^2\sin^2\theta.$$
The metric is diagonal so that $g^{00} = 1/g_{00}$, etc. By using the definition of Equation
(5.7) the non-zero components of the metric connection can be obtained:
$$\Gamma^0_{01} = \Gamma^0_{10} = A'/2A$$
$$\Gamma^2_{12} = \Gamma^2_{21} = \Gamma^3_{13} = \Gamma^3_{31} = 1/r$$
$$\Gamma^2_{33} = -\sin\theta\cos\theta,\quad \Gamma^3_{32} = \Gamma^3_{23} = \cot\theta.$$
Here the prime denotes differentiation with respect to r. Next using the definition of
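the Ricci tensor,
$$R_{\beta\delta} = R^\alpha{}_{\beta\alpha\delta}$$
$$R_{11} = -\frac{A''}{2A} + \frac{A'}{4A}\left(\frac{A'}{A} + \frac{B'}{B}\right) + \frac{B'}{rB} \tag{C.3}$$
$$R_{22} = 1 - \frac{r}{2B}\left(\frac{A'}{A} - \frac{B'}{B}\right) - \frac{1}{B} \tag{C.4}$$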
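A quick numerical check shows how Equations (C.3) and (C.4) single out the Schwarzschild solution. The sketch assumes the familiar form A = 1 − m/r and B = 1/A, with m = 2GM/c², which is not written out in the lines above; the function name and the test radius are likewise illustrative.

```python
def ricci_components(r, m):
    """Evaluate R_11 (C.3) and R_22 (C.4) for the assumed A = 1 - m/r, B = 1/A."""
    a = 1.0 - m / r            # A(r)
    a1 = m / r**2              # A'(r)
    a2 = -2.0 * m / r**3       # A''(r)
    b = 1.0 / a                # B(r)
    b1 = -a1 / a**2            # B'(r)
    r11 = -a2 / (2.0 * a) + (a1 / (4.0 * a)) * (a1 / a + b1 / b) + b1 / (r * b)
    r22 = 1.0 - (r / (2.0 * b)) * (a1 / a - b1 / b) - 1.0 / b
    return r11, r22


# Assumed test point: r = 10 in units where m = 1.
r11, r22 = ricci_components(10.0, 1.0)
print(f"R_11 = {r11:.2e}, R_22 = {r22:.2e}")
```

Both components vanish to rounding error, as required for a solution of Einstein's equation in empty space. With B = 1/A the bracket in (C.3) is zero, so the remaining two terms must cancel each other.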
$$\Gamma^3_{31} = \Gamma^3_{13} = 1/r,\quad \Gamma^2_{33} = -\sin\theta\cos\theta$$
$$\Gamma^3_{32} = \Gamma^3_{23} = \frac{\cos\theta}{\sin\theta}.$$
The non-zero elements of the Riemann curvature tensor are
$$R^3{}_{131} = -m/2r^3Z,\quad R^1{}_{010} = -mZ/r^3$$
$$R^1{}_{212} = -m/2r,\quad R^2{}_{323} = m\sin^2\theta/r$$
$$R^0{}_{303} = -m\sin^2\theta/2r,\quad R^0{}_{202} = -m/2r$$
plus components that are related by symmetry operations, for example:
$$-R_{3113} = -R_{1331} = R_{1313} = R_{3131},$$
where
$$R_{3131} = g_{33}R^3{}_{131},\ \text{etc.}$$
The components of the Ricci tensor
$$R_{\mu\nu} = R^\alpha{}_{\mu\alpha\nu}$$
all vanish, as they must if the metric satisfies Einstein’s equation in empty spacetime.
Finally we obtain the curvature of two-dimensional surfaces in spacetime. Take the
surface defined by geodesics, which at the origin lie in directions such that the polar
angle θ = π /2 and have t constant. The Gaussian curvature of this surface is
$$K_{r\phi} = -\frac{R_{3131}}{g_{11}g_{33}} = -\frac{R^3{}_{131}}{g_{11}} = -\frac{m}{2r^3},$$
that is
$$K_{r\phi} = -\frac{GM}{r^3c^2}.$$
r 3c 2
This checks the consistency of Equation (3.43).
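To get a sense of scale, the curvature $K_{r\phi} = -GM/(r^3c^2)$ can be evaluated at the surface of the Sun. The constants are standard rounded values; the choice of the Sun as an example and the function name are assumptions made for illustration.

```python
G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8        # speed of light, m s^-1
M_SUN = 1.989e30   # solar mass, kg
R_SUN = 6.96e8     # solar radius, m


def gaussian_curvature(mass, r):
    """Gaussian curvature K = -G M / (r^3 c^2) in m^-2."""
    return -G * mass / (r**3 * C**2)


k_sun = gaussian_curvature(M_SUN, R_SUN)
radius_of_curvature = abs(k_sun) ** -0.5
print(f"K = {k_sun:.2e} m^-2, radius of curvature = {radius_of_curvature:.2e} m")
```

The corresponding radius of curvature is several hundred times the solar radius, which shows how weakly space is curved even at the Sun's surface.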
Appendix D
Energy Flow in Gravitational Waves
The stress–energy tensor for gravitational waves was shown in Section 9.2 (Equation
(9.14)) to be
$$t_{\alpha\beta} = -\frac{c^4}{8\pi G}\, G^{(2)}_{\alpha\beta}, \tag{D.1}$$
where only terms quadratic in $h_{\alpha\beta}$ are retained on the right-hand side. This stress–
energy tensor will now be calculated for a plane transverse-traceless wave with (+)
polarization using Cartesian coordinates. Thus
$$h_+ = h_{11} = -h_{22} = h\cos[k(x^0 - x^3)]. \tag{D.2}$$
The only components of the metric that do not vanish are
$$g_{11} = -1 + h_+, \qquad g^{11} = -1 - h_+,$$
$$g_{22} = -1 - h_+, \qquad g^{22} = -1 + h_+,$$
One later step in calculating energy flow requires us to take the average of the stress–
energy tensor over a region of spacetime covering several complete waves. With this
averaging procedure any contribution from the above product vanishes:
$$\langle h_+ h_{+,0} \rangle = 0. \tag{D.3}$$
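The averaging in Equation (D.3) can be illustrated numerically. The sketch below averages over a whole number of cycles of the phase k(x⁰ − x³) for the wave of Equation (D.2); the amplitude, wave number and function names are assumed values chosen only for illustration. The product $h_+ h_{+,0}$ averages to zero, while the square of $h_{+,0}$ averages to h²k²/2.

```python
import math


def cycle_average(f, n_cycles=3, samples_per_cycle=2000):
    """Average f(phase) over a whole number of cycles using the midpoint rule."""
    n = n_cycles * samples_per_cycle
    total = 0.0
    for i in range(n):
        phase = 2 * math.pi * n_cycles * (i + 0.5) / n
        total += f(phase)
    return total / n


h, k = 1.0e-3, 2.0  # illustrative amplitude and wave number (assumed values)


def h_plus(phase):
    """h_+ = h cos(phase), with phase = k (x0 - x3)."""
    return h * math.cos(phase)


def h_plus_0(phase):
    """Derivative of h_+ with respect to x0."""
    return -h * k * math.sin(phase)


avg_product = cycle_average(lambda p: h_plus(p) * h_plus_0(p))
avg_square = cycle_average(lambda p: h_plus_0(p) ** 2)
```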
Therefore we can ignore such terms from here on. Then
$$\Gamma^1_{10} = \Gamma^1_{01} = \Gamma^0_{11} = -h_{+,0}/2,$$
and
$$\Gamma^1_{13} = \Gamma^1_{31} = -\Gamma^3_{11} = -h_{+,3}/2 = +h_{+,0}/2.$$
Similar formulae hold with the suffix 1 replaced everywhere by 2. Referring to the
expression for the Riemann tensor (Equation (6.3)) shows that if only the terms
quadratic in h are retained we have
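$$R^{(2)\alpha}{}_{\beta\gamma\delta} = \Gamma^{\alpha}_{\sigma\gamma}\Gamma^{\sigma}_{\beta\delta} - \Gamma^{\alpha}_{\sigma\delta}\Gamma^{\sigma}_{\beta\gamma}.$$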
Relevant non-zero components of this Riemann tensor are
$$-R^{(2)1}{}_{010} = -R^{(2)2}{}_{020} = h_{+,0}^2/4,$$
$$-R^{(2)0}{}_{101} = R^{(2)3}{}_{131} = h_{+,0}^2/4,$$
$$R^{(2)1}{}_{013} = R^{(2)1}{}_{310} = h_{+,0}^2/4,$$
$$-R^{(2)1}{}_{313} = -R^{(2)2}{}_{323} = h_{+,0}^2/4, \tag{D.4}$$
whence it follows that the second-order components of the Ricci tensor are
$$-R^{(2)}_{00} = R^{(2)}_{30} = R^{(2)}_{03} = -R^{(2)}_{33} = h_{+,0}^2/2. \tag{D.5}$$
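Equation (D.5) can be checked mechanically by contracting products of the connection components. The sketch below stores each $\Gamma$ as an exact rational multiple of $h_{+,0}$ and forms the quadratic part of the Riemann tensor and its contraction. The dictionary layout and function names are assumptions for illustration, and the suffix-2 components are taken with the opposite sign to the suffix-1 ones because $h_{22} = -h_+$ (a step worked out for this sketch, which does not affect the quadratic results).

```python
from fractions import Fraction

HALF = Fraction(1, 2)

# gamma[(a, b, c)] holds the coefficient of h_{+,0} in Gamma^a_{bc}.
gamma = {}


def set_sym(a, b, c, value):
    """Store Gamma^a_{bc}, which is symmetric in its lower indices."""
    gamma[(a, b, c)] = value
    gamma[(a, c, b)] = value


# Suffix 1 uses the values quoted in the text; suffix 2 has the opposite sign.
for i, sign in ((1, 1), (2, -1)):
    set_sym(i, i, 0, -sign * HALF)   # Gamma^i_{i0} = Gamma^i_{0i}
    set_sym(0, i, i, -sign * HALF)   # Gamma^0_{ii}
    set_sym(i, i, 3, sign * HALF)    # Gamma^i_{i3} = Gamma^i_{3i}
    set_sym(3, i, i, -sign * HALF)   # Gamma^3_{ii}


def conn(a, b, c):
    """Look up Gamma^a_{bc}; components not stored are zero."""
    return gamma.get((a, b, c), Fraction(0))


def riemann2(a, b, c, d):
    """Quadratic part of R^a_{bcd}, in units of (h_{+,0})^2."""
    return sum(conn(a, s, c) * conn(s, b, d) - conn(a, s, d) * conn(s, b, c)
               for s in range(4))


def ricci2(b, d):
    """Second-order Ricci component R_{bd} = R^a_{bad}, in units of (h_{+,0})^2."""
    return sum(riemann2(a, b, a, d) for a in range(4))
```

For instance `ricci2(0, 0)` gives −1/2 and `ricci2(0, 3)` gives +1/2, in units of $h_{+,0}^2$, in agreement with Equation (D.5).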
Finally, substituting these results in Equation (D.1) we have the components of the
stress–energy tensor of the gravitational wave, for example
$$t_{00} = \frac{c^4}{16\pi G}\, h_{+,0}^2.$$
Now the energy in a wave cannot be localized because only relative displacements
have physical significance. It is also not clear whether the energy is located in a peak
or trough. Hence it is necessary to average over several complete cycles giving
$$\langle t_{00} \rangle = \frac{c^4}{16\pi G} \langle h_{+,0}^2 \rangle,$$
where the angular brackets refer to the expectation values. Converting to a time
derivative, we obtain
$$\langle t_{00} \rangle = \frac{c^2}{16\pi G} \langle \dot{h}_+^2 \rangle.$$
If both polarizations are present this result generalizes to
$$\langle t_{00} \rangle = \frac{c^2}{16\pi G} \langle \dot{h}_+^2 + \dot{h}_\times^2 \rangle. \tag{D.7}$$
It follows that the energy flux is
$$F = c\,\langle t_{00} \rangle = \frac{c^3}{16\pi G} \langle \dot{h}_+^2 + \dot{h}_\times^2 \rangle, \tag{D.8}$$
meaning the energy crossing unit area in unit time (kg s⁻³). Equation (D.8) can be
rewritten as
$$F = \frac{c^3}{32\pi G} \langle \dot{h}_{11}^2 + \dot{h}_{12}^2 + \dot{h}_{21}^2 + \dot{h}_{22}^2 \rangle,$$
that is
$$F = \frac{c^3}{32\pi G} \langle \dot{h}_{ij}^{TT} \dot{h}_{ij}^{TT} \rangle, \tag{D.9}$$
where we make it explicit, through the superscript TT, that we are working in the
transverse-traceless gauge.
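As a numerical illustration of Equation (D.8), the sketch below evaluates the flux for sinusoidal polarizations, for which the mean square of the time derivative of an amplitude h at angular frequency ω is h²ω²/2. The strain amplitude of 10⁻²¹, the frequency of 100 Hz, the rounded constants and the function names are assumptions chosen for illustration only.

```python
import math

G = 6.674e-11   # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8     # speed of light, m s^-1


def mean_square_rate(amplitude, omega):
    """<hdot^2> for h(t) = amplitude * cos(omega t), i.e. amplitude^2 omega^2 / 2."""
    return 0.5 * (amplitude * omega) ** 2


def energy_flux(h_plus_amp, h_cross_amp, frequency_hz):
    """Energy flux F of Equation (D.8), in W m^-2, for sinusoidal polarizations."""
    omega = 2 * math.pi * frequency_hz
    rate_sq = mean_square_rate(h_plus_amp, omega) + mean_square_rate(h_cross_amp, omega)
    return C**3 / (16 * math.pi * G) * rate_sq


flux = energy_flux(1e-21, 0.0, 100.0)  # assumed strain amplitude and frequency
```

With these assumed inputs the flux is of order 10⁻³ W m⁻², and doubling the amplitude quadruples it.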
Appendix E
Radiation from a Nearly Newtonian Source
A Newtonian source is one in which the curvature and strain of spacetime are small,
and in which the motion of matter is non-relativistic. Then the metric tensor can be
written approximately as:
$$g_{\alpha\beta} = \eta_{\alpha\beta} + h_{\alpha\beta}, \qquad |h_{\alpha\beta}| \ll 1.$$
We drop the time arguments in order to simplify the presentation. The integrand in
brackets is a quadrupole mass moment of the source:
$$I_{ij} = \int \rho_0\, y_i y_j \, d^3y.$$
The choice we make for $h_{ij}$ is that it is transverse and traceless, and so the
quadrupole moment must likewise be rendered transverse and traceless.
A conventional starting point is to take the traceless or reduced quadrupole moment
for n (2 or 3) dimensional motion. The Kronecker delta $\delta_{ij}$ has the value +1 when i = j
and zero otherwise. Now the component of a vector d transverse to the unit vector n
is given by
$$d_j^{T} = P_{jl} d_l,$$
where $P_{jl} = \delta_{jl} - n_j n_l$. In the present case n is a unit vector r/r pointing away from
the source. Similarly a transverse version of $I_{ij}$ is
Thus
$$h_{ij} = \frac{2G}{rc^4}\, \ddot{I}_{ij}^{TT}. \tag{E.4}$$
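The projection onto the transverse-traceless part can be sketched in a few lines. The code below uses the projector $P_{jl} = \delta_{jl} - n_j n_l$ defined above together with the standard form $I^{TT}_{ij} = P_{ik} I_{kl} P_{lj} - \tfrac{1}{2} P_{ij} P_{kl} I_{kl}$. That formula is an assumption here (it is the usual textbook expression, not one quoted from this appendix), and the example tensor and function names are purely illustrative.

```python
def projector(n):
    """P_jl = delta_jl - n_j n_l for a unit vector n."""
    return [[(1.0 if j == l else 0.0) - n[j] * n[l] for l in range(3)]
            for j in range(3)]


def tt_part(I, n):
    """Transverse-traceless part of a symmetric 3x3 tensor I for direction n.

    Assumes the standard projection
    I^TT_ij = P_ik I_kl P_lj - (1/2) P_ij P_kl I_kl.
    """
    P = projector(n)
    PIP = [[sum(P[i][k] * I[k][l] * P[l][j] for k in range(3) for l in range(3))
            for j in range(3)] for i in range(3)]
    trace = sum(P[k][l] * I[k][l] for k in range(3) for l in range(3))
    return [[PIP[i][j] - 0.5 * P[i][j] * trace for j in range(3)]
            for i in range(3)]


# Example: an arbitrary symmetric tensor viewed along the z axis.
I_example = [[2.0, 1.0, 0.5],
             [1.0, -1.0, 0.3],
             [0.5, 0.3, 4.0]]
n_z = [0.0, 0.0, 1.0]
I_tt = tt_part(I_example, n_z)
```

For propagation along z the result keeps only the x-y block, with equal and opposite diagonal entries, which is the pattern of the (+) and (×) polarizations.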
Using earlier results from Appendix D we can calculate the energy flux at points
distant from the source and also the luminosity of the source. Equation (D.9) gives
the flux
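$$F = \frac{c^3}{32\pi G} \langle \dot{h}_{ij} \dot{h}_{ij} \rangle \tag{E.5}$$
$$= \frac{G}{8\pi r^2 c^5} \langle \dddot{I}_{ij}^{TT} \dddot{I}_{ij}^{TT} \rangle. \tag{E.6}$$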
Integrating this over a sphere at a distance r from the source gives the total flux, and
hence the luminosity of the source is
$$L = r^2 \int F \, d\Omega.$$
The integral
$$L_2(j, k) = \int n_j n_k \, d\Omega$$
vanishes if j ≠ k. Figure E.1 shows a slice through the sphere of integration at
constant $x_l$, where l is the label of the remaining space coordinate (l ≠ k, l ≠ j). The
value of $n_j$ at P′ is opposite to its value at P but $n_k$ is the same; consequently the
contributions near P and P′ cancel. Similarly all contributions to the integral cancel.
When j = k, this does not happen. For example, if j = k = 3 we have
$$L_2(3, 3) = \int_0^{2\pi} \int_0^{\pi} \cos^2\theta \sin\theta \, d\theta \, d\varphi = 4\pi/3.$$
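These angular integrals are easy to confirm numerically. The sketch below estimates $L_2(j, k)$ with a midpoint rule over the unit sphere; the grid sizes and function names are assumptions for illustration, and the indices 0, 1, 2 in the code stand for the space components 1, 2, 3.

```python
import math


def unit_vector(theta, phi):
    """Components of the unit vector n in spherical polar angles."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))


def L2(j, k, n_theta=200, n_phi=200):
    """Midpoint-rule estimate of the integral of n_j n_k over the unit sphere."""
    d_theta = math.pi / n_theta
    d_phi = 2 * math.pi / n_phi
    total = 0.0
    for a in range(n_theta):
        theta = (a + 0.5) * d_theta
        for b in range(n_phi):
            phi = (b + 0.5) * d_phi
            n = unit_vector(theta, phi)
            total += n[j] * n[k] * math.sin(theta) * d_theta * d_phi
    return total
```

The diagonal values come out close to 4π/3 and the off-diagonal values close to zero, as the symmetry argument requires.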
Figure E.1. Schematic diagram of slice through the sphere of integration at constant $x_l$, showing the points P and P′ with unit vectors n and n′.
$$L = \frac{G}{5c^5} \langle \dddot{I}_{ij} \dddot{I}_{ij} \rangle. \tag{E.7}$$
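As an illustrative application of Equation (E.7), the sketch below evaluates the luminosity of two point masses on a circular orbit, described as a reduced mass μ at separation a and angular frequency ω. The third time derivative is estimated by central finite differences and averaged over one period, then compared with the closed form 32Gμ²a⁴ω⁶/5c⁵ that follows for this orbit. The orbit, the units G = c = 1, the step size and all names are assumptions. The trace of $I_{ij}$ is constant for a circular orbit, so the full and reduced moments have the same third derivative.

```python
import math


def quadrupole(t, mu, a, omega):
    """I_ij = mu x_i x_j for a reduced mass mu on a circular orbit in the x-y plane."""
    x = (a * math.cos(omega * t), a * math.sin(omega * t), 0.0)
    return [[mu * x[i] * x[j] for j in range(3)] for i in range(3)]


def third_derivative(f, t, step):
    """Central finite-difference estimate of the third derivative of f at t."""
    return (f(t + 2 * step) - 2 * f(t + step)
            + 2 * f(t - step) - f(t - 2 * step)) / (2 * step**3)


def luminosity(mu, a, omega, G=1.0, c=1.0, samples=200, step=1e-3):
    """Time-averaged (G / 5 c^5) <I'''_ij I'''_ij> over one orbital period.

    The step should be small compared with 1 / omega.
    """
    period = 2 * math.pi / omega
    total = 0.0
    for s in range(samples):
        t = (s + 0.5) * period / samples
        for i in range(3):
            for j in range(3):
                d3 = third_derivative(
                    lambda u: quadrupole(u, mu, a, omega)[i][j], t, step)
                total += d3 * d3
    return G / (5 * c**5) * total / samples


def luminosity_closed_form(mu, a, omega, G=1.0, c=1.0):
    """Closed form for the circular orbit: 32 G mu^2 a^4 omega^6 / (5 c^5)."""
    return 32 * G * mu**2 * a**4 * omega**6 / (5 * c**5)
```

The steep dependence on ω⁶ is what makes compact, fast binaries the strongest sources.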
Appendix F
The Friedmann Equations
The parameters of greatest interest to the physicist are the sign of the curvature and the
magnitude of the scale parameter. It emerges that the universe is, if not exactly flat,
very close to being so. The scale parameter is then arbitrary and is chosen to be unity
at present. Interest shifts to how it has changed throughout the life of
the universe; equivalently, we ask how the Hubble constant has changed with time.
The universe is taken, for simplicity, to be a perfect fluid with stress–energy tensor
$$T_{\mu\nu} = (p/c^2 + \rho)\, v_\mu v_\nu - p\, g_{\mu\nu}, \tag{F.1}$$
where p is the pressure, ρ is the rest density, and v is the fluid velocity. We now set down
the components of Einstein’s equation. Using the FRW metric of Equation (10.2),
$$g_{00} = g^{00} = 1, \qquad g_{11} = \frac{1}{g^{11}} = -\frac{R^2}{1 - k\sigma^2},$$
$$g_{22} = \frac{1}{g^{22}} = -R^2\sigma^2, \qquad g_{33} = \frac{1}{g^{33}} = -R^2\sigma^2\sin^2\theta.$$
In a comoving frame v = (c, 0, 0, 0) and
$$T_{00} = \rho c^2, \qquad T_{11} = \frac{pR^2}{1 - k\sigma^2},$$
$$T_{22} = pR^2\sigma^2, \qquad T_{33} = pR^2\sigma^2\sin^2\theta.$$
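The metric connections are evaluated with the help of Equation (5.7): for example,
$$\Gamma^1_{01} = \Gamma^1_{10} = g^{11}\Gamma_{110} = \frac{1}{2}\, g^{11} g_{11,0}
= \frac{1}{2}\left(-\frac{1 - k\sigma^2}{R^2}\right)\left\{-\frac{2R\dot{R}}{c(1 - k\sigma^2)}\right\} = \frac{\dot{R}}{Rc}.$$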
Equation (6.3) gives the components of the Riemann tensor: for example,
$$R^1{}_{010} = -\Gamma^1_{01,0} - \Gamma^1_{10}\Gamma^1_{01}
= \left(\frac{\dot{R}^2}{R^2c^2} - \frac{\ddot{R}}{c^2R}\right) - \frac{\dot{R}^2}{c^2R^2} = \frac{-\ddot{R}}{c^2R}.$$
Similarly $R^2{}_{020} = R^3{}_{030} = -\ddot{R}/c^2R$, while $R^0{}_{000}$ is identically zero. Using Equation (6.16)
the 00 component of the Ricci tensor is
$$R_{00} = R^0{}_{000} + R^1{}_{010} + R^2{}_{020} + R^3{}_{030} = -3\ddot{R}/c^2R.$$
Its only other non-zero components are
$$R_{11} = \frac{T}{1 - k\sigma^2}, \qquad R_{22} = T\sigma^2, \qquad R_{33} = R_{22}\sin^2\theta,$$
where
$$T = 2k + R\ddot{R}/c^2 + 2\dot{R}^2/c^2.$$
Thus the Ricci scalar is given by
$$R(\text{Ricci scalar}) = g^{\mu\nu}R_{\mu\nu} = -6S/R^2,$$
where
$$S = k + R\ddot{R}/c^2 + \dot{R}^2/c^2.$$
Finally the Einstein tensor (Equation (6.17)) is given by
$$G_{\mu\nu} = R_{\mu\nu} - \frac{1}{2}\, g_{\mu\nu} R(\text{Ricci scalar}),$$
for which the two independent non-vanishing components are
$$G_{00} = 3\dot{R}^2/R^2c^2 + 3k/R^2,$$
$$G_{11} = -\frac{k + 2R\ddot{R}/c^2 + \dot{R}^2/c^2}{1 - k\sigma^2}.$$
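The algebra leading from the Ricci components to $G_{00}$ and $G_{11}$ can be checked numerically. The sketch below assembles the Einstein tensor from $R_{00}$, $R_{11}$, the Ricci scalar and the metric components for arbitrary trial values, and compares the result with the two quoted closed forms. The trial values and function names are assumptions for illustration only.

```python
def einstein_from_ricci(R, Rdot, Rddot, k, sigma, c=1.0):
    """Build G_00 and G_11 from R_00, R_11, the Ricci scalar and the metric."""
    T = 2 * k + R * Rddot / c**2 + 2 * Rdot**2 / c**2
    S = k + R * Rddot / c**2 + Rdot**2 / c**2
    g00 = 1.0
    g11 = -R**2 / (1 - k * sigma**2)
    R00 = -3 * Rddot / (c**2 * R)
    R11 = T / (1 - k * sigma**2)
    ricci_scalar = -6 * S / R**2
    G00 = R00 - 0.5 * g00 * ricci_scalar
    G11 = R11 - 0.5 * g11 * ricci_scalar
    return G00, G11


def einstein_closed_forms(R, Rdot, Rddot, k, sigma, c=1.0):
    """The two quoted components of the Einstein tensor."""
    G00 = 3 * Rdot**2 / (R**2 * c**2) + 3 * k / R**2
    G11 = -(k + 2 * R * Rddot / c**2 + Rdot**2 / c**2) / (1 - k * sigma**2)
    return G00, G11
```

Any choice of scale factor, its derivatives, k and σ (with kσ² ≠ 1) should give agreement to rounding error.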
The corresponding 00 and 11 components of the Einstein Equation (6.20) are thus
$$3\dot{R}^2/R^2 + 3kc^2/R^2 - c^2\Lambda = 8\pi G\rho, \tag{F.2}$$
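Equation (F.2) can be solved for the density once the expansion rate is given. The sketch below does this and evaluates the flat case (k = 0, Λ = 0) for an assumed present-day expansion rate of 70 km s⁻¹ Mpc⁻¹. That value, the rounded constants and the function name are assumptions for illustration.

```python
import math

G = 6.674e-11            # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8              # speed of light, m s^-1
MPC_IN_M = 3.0857e22     # metres per megaparsec


def density_from_friedmann(hubble_rate, k_over_R2=0.0, lam=0.0):
    """Solve Equation (F.2) for rho, with hubble_rate = Rdot / R in s^-1.

    k_over_R2 is k / R^2 in m^-2 and lam is Lambda in m^-2.
    """
    return (3 * hubble_rate**2 + 3 * k_over_R2 * C**2 - C**2 * lam) / (8 * math.pi * G)


H0 = 70.0 * 1.0e3 / MPC_IN_M   # assumed 70 km s^-1 Mpc^-1, converted to s^-1
rho_flat = density_from_friedmann(H0)
```

For the assumed expansion rate the flat-universe density is of order 10⁻²⁶ kg m⁻³, a few hydrogen atoms per cubic metre.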
The minus sign is surprising, because it reveals that the cosmological constant gives
rise to a repulsive gravitational force. Then all three equations (Friedmann’s,
acceleration, and fluid equations) can be expressed more simply in terms of the
total energy density and pressure including the cosmological constant:
$$3\dot{a}^2/a^2 + 3\kappa c^2/a^2 = 8\pi G\rho; \tag{F.14}$$
and the acceleration equation becomes
$$\ddot{a}/a = -\frac{4\pi G}{3c^2}\,[\rho c^2 + 3p]; \tag{F.15}$$
while the form of the fluid equation is unchanged because the contributions from the
cosmological constant to both left- and right-hand sides vanish
$$\dot{\rho} c^2 + (3\dot{a}/a)[p + \rho c^2] = 0. \tag{F.16}$$
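As a consistency check, Equation (F.15) can be recovered from Equations (F.14) and (F.16). The short derivation below is a sketch that assumes κ is constant, that ρ and p are the total density and pressure, and that $\dot{a} \neq 0$:

$$\begin{aligned}
\dot{a}^2 + \kappa c^2 &= \tfrac{8\pi G}{3}\,\rho a^2 && \text{(F.14) multiplied by } a^2/3,\\
2\dot{a}\ddot{a} &= \tfrac{8\pi G}{3}\left(\dot{\rho} a^2 + 2\rho a\dot{a}\right) && \text{(time derivative)},\\
\dot{\rho} a^2 &= -3a\dot{a}\left(\rho + p/c^2\right) && \text{(from (F.16))},\\
2\dot{a}\ddot{a} &= -\tfrac{8\pi G}{3}\, a\dot{a}\left(\rho + 3p/c^2\right) && \text{(substituting)},\\
\ddot{a}/a &= -\tfrac{4\pi G}{3c^2}\left[\rho c^2 + 3p\right] && \text{(F.15)}.
\end{aligned}$$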
References
Friedman, A. 1922, ZPhy, 10, 377
Le Maître, G. 1927, ASSB, A47, 49
Appendix G
The Virial Theorem
where Fk is the force on the kth mass and v k its velocity. If the system is in
equilibrium and bound then I ̈ vanishes, so we have
and similarly
$$\mathbf{F}_{kj} = -G m_j m_k\, \mathbf{r}_{jk} / r_{jk}^3$$
which is simply minus their mutual potential energy. Hence Equation (G.4) can be
re-written
2T (Kinetic energy) = −U (Potential energy) − W (Work done by external pressure). (G.5)
This result will hold for matter, dark plus baryonic, provided it has reached
equilibrium. If the system is isolated, which is approximately true of galaxies and
galaxy clusters, then
2T (Kinetic energy) = −U (Potential energy). (G.6)
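Equation (G.6) can be illustrated with the simplest bound system in equilibrium, two point masses on a circular orbit about their centre of mass. The sketch below computes the kinetic and potential energies from Kepler's third law; the Sun-Earth-like input values and the function name are assumptions for illustration.

```python
G = 6.674e-11  # gravitational constant, m^3 kg^-1 s^-2


def circular_binary_energies(m1, m2, separation):
    """Kinetic and potential energy of two point masses on a circular orbit."""
    omega_sq = G * (m1 + m2) / separation**3       # Kepler's third law
    r1 = separation * m2 / (m1 + m2)               # distances from the centre of mass
    r2 = separation * m1 / (m1 + m2)
    kinetic = 0.5 * omega_sq * (m1 * r1**2 + m2 * r2**2)
    potential = -G * m1 * m2 / separation
    return kinetic, potential


# Sun-Earth-like masses (kg) and separation (m), assumed for illustration.
T, U = circular_binary_energies(1.989e30, 5.972e24, 1.496e11)
```

The returned values satisfy 2T = −U to rounding error, whatever masses and separation are chosen.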
Appendix H
Scale Invariance
The near scale invariance of the observed perturbations of matter has been discussed
in Chapters 12, 13, and 15. This appendix is used to work through the prediction for
the power spectrum of the perturbations resulting from slow roll inflation. The scalar
inflaton field, expanded around the background value, is
$$\phi = \phi_0 + \delta\phi,$$
where $\delta\phi/\phi_0$ must be of the same small size as the CMB fluctuations. Dimensional
analysis shows that the correlation between the field perturbations at r and r + Δr is
$$\langle \delta\phi(\mathbf{r}, t)\, \delta\phi(\mathbf{r} + \Delta\mathbf{r}, t) \rangle = \text{Length}^{-2}. \tag{H.1}$$
The field is scalar and neither it nor its interactions offer a length. Thus the only
length relevant is Δr, and we have
$$\langle \delta\phi(\mathbf{r}, t)\, \delta\phi(\mathbf{r} + \Delta\mathbf{r}, t) \rangle \sim \Delta r^{-2}. \tag{H.2}$$
During inflation the separation between points grows and eventually they lose causal
contact when Δr = c /H . At that time the correlation freezes and remains unchanged
until the points again make causal contact at some time after inflation. Thus the
correlation that exists on reentering the horizon is
$$\langle \delta\phi(\mathbf{r}, t)\, \delta\phi(\mathbf{r} + \Delta\mathbf{r}, t) \rangle \sim (c/H)^{-2}, \tag{H.3}$$
which is clearly scale invariant. The wave equation for a scalar field given in Chapter
13 can be refined to take account of the field kinetic energy, as well as the expansion
of the universe and the potential $V(\phi)$:
$$\ddot{\phi} - \frac{c^2}{a(t)^2}\nabla^2\phi + 3H\dot{\phi} + V'(\phi) = 0, \tag{H.4}$$
where the prime symbol indicates taking the derivative with respect to $\phi$. The time
dependence of the scale factor is made explicit. In slow roll inflation the last term
$V'(\phi)$ is smaller the slower the roll. This term will be ignored, so that we consider the
limiting case of infinitely prolonged inflation. Then Equation (H.4) becomes
the equation for damped harmonic motion in which the angular frequency ω and
the wave number k are related by
$$\omega = ck/a(t). \tag{H.5}$$
Note that the angular frequency of the Fourier components of the perturbations is
time dependent. In the case, as here, that the perturbations are tiny the Fourier
modes effectively do not interact and evolve independently. The mean square
fluctuations of a harmonic oscillator are proportional to 1/ω, which translates for a
Fourier component of the scalar field to
$$\langle |\delta\phi_k|^2 \rangle \propto \frac{1}{a^3\omega}, \tag{H.6}$$
where the normalization of the field is taken into account by the $a^{-3}$ factor. On
leaving the horizon we have
$$\omega_{\text{horizon}} \sim H \quad \text{and} \quad a_{\text{horizon}} \sim k/H. \tag{H.7}$$
Thus the correlation freezes with
$$\langle |\delta\phi_k|^2 \rangle \propto \frac{1}{a_{\text{horizon}}^3\, \omega_{\text{horizon}}} \propto \frac{H^2}{k^3}. \tag{H.8}$$
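From this field fluctuation we can infer the corresponding curvature perturbations
defined by
$$\xi = \frac{\delta a}{a} = \frac{H}{\dot{\phi}}\, \delta\phi. \tag{H.9}$$
The two point correlation of the curvature perturbations is then
$$\langle |\xi_k|^2 \rangle = \frac{H^2}{\dot{\phi}^2} \langle |\delta\phi_k|^2 \rangle \propto \frac{H^4}{k^3\dot{\phi}^2}, \tag{H.10}$$
which is equally the two-point power spectrum of the curvature fluctuations
$$P_\xi(k) \propto \frac{H^4}{k^3\dot{\phi}^2}. \tag{H.11}$$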
A useful quantity can be defined here,
$$\Delta^2(k) = \frac{1}{2\pi^2}\, k^3 P_\xi(k), \tag{H.12}$$
giving the power in the fluctuations per unit log interval of k. Then
$$\Delta^2(k) \propto \frac{H^4}{\dot{\phi}^2}, \tag{H.13}$$
so that the power in each decade of k is the same. This is another expression of scale
invariance. Finally the resulting fluctuations in matter can be calculated. Poisson’s
equation provides the link between the fractional curvature/gravitational field
fluctuations (ξ) and the fractional matter fluctuations (δ):
$$\nabla^2\xi = 4\pi G\,\delta, \tag{H.14}$$
giving for Fourier components
$$k^2\xi_k = 4\pi G\,\delta_k. \tag{H.15}$$
Then using Equations (H.11) and (H.15) the power spectrum of the matter
fluctuations is
$$P(k) \propto \left[\frac{k^2}{4\pi G}\right]^2 P_\xi(k) \propto \frac{H^4}{[4\pi G\dot{\phi}]^2}\, k. \tag{H.16}$$
This justifies the connection, made in Chapter 15: namely that limitingly slow roll
inflation gives a power spectrum of matter fluctuations that depends linearly on k.
This is just the scale invariant Harrison–Zeldovich spectrum. Some modification of the
analysis is needed to take account of the finite speed and duration of inflation: the
predicted power spectrum is then found to be proportional to $k^n$, with n slightly less
than unity.
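The scalings in Equations (H.11), (H.12) and (H.16) can be summarized by their logarithmic slopes in k. The sketch below evaluates the three spectra up to constant factors and estimates d ln f / d ln k between two wave numbers. The unit values of H, the field velocity and G, and all function names, are assumptions for illustration.

```python
import math


def curvature_power(k, H=1.0, phi_dot=1.0):
    """P_xi(k) of Equation (H.11), up to a constant factor."""
    return H**4 / (k**3 * phi_dot**2)


def dimensionless_power(k, H=1.0, phi_dot=1.0):
    """Delta^2(k) of Equation (H.12)."""
    return k**3 * curvature_power(k, H, phi_dot) / (2 * math.pi**2)


def matter_power(k, H=1.0, phi_dot=1.0, G=1.0):
    """P(k) of Equation (H.16), up to a constant factor."""
    return (k**2 / (4 * math.pi * G)) ** 2 * curvature_power(k, H, phi_dot)


def log_slope(f, k1, k2):
    """Estimate d ln f / d ln k between two wave numbers."""
    return (math.log(f(k2)) - math.log(f(k1))) / (math.log(k2) - math.log(k1))
```

The slopes come out as −3 for the curvature spectrum, 0 for the power per log interval, and +1 for the matter spectrum, the last being the linear dependence on k described above.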