Quantum Theory, Groups and Representations: An Introduction
(under construction)

Peter Woit
Department of Mathematics, Columbia University
[email protected]
Contents

Preface

4.4 Inner products
4.5 Adjoint operators
4.6 Orthogonal and unitary transformations
4.6.1 Orthogonal groups
4.6.2 Unitary groups
4.7 Eigenvalues and eigenvectors
4.8 For further reading

9 Tensor Products, Entanglement, and Addition of Spin
9.1 Tensor products
9.2 Composite quantum systems and tensor products
9.3 Indecomposable vectors and entanglement
9.4 Tensor products of representations
9.4.1 Tensor products of SU (2) representations
9.4.2 Characters of representations
9.4.3 Some examples
9.5 Bilinear forms and tensor products
9.6 For further reading

15 Quantization
15.1 Canonical quantization
15.2 The Groenewold-van Hove no-go theorem
15.3 Canonical quantization in d dimensions
15.4 Quantization and symmetries
15.5 More general notions of quantization
15.6 For further reading

21 The Harmonic Oscillator as a Representation of the Heisenberg Group
21.1 Complex structures and phase space
21.2 Complex structures and quantization
21.3 The positivity condition on J
21.4 Complex structures for d = 1 and squeezed states
21.5 Coherent states and the Heisenberg group action
21.6 For further reading

27 Anticommuting Variables and Pseudo-classical Mechanics
27.1 The Grassmann algebra of polynomials on anticommuting generators
27.2 Pseudo-classical mechanics and the fermionic Poisson bracket
27.3 Examples of pseudo-classical mechanics
27.3.1 The pseudo-classical spin degree of freedom
27.3.2 The pseudo-classical fermionic oscillator
27.4 For further reading

34 Multi-particle Systems and Non-relativistic Quantum Fields
34.1 Multi-particle quantum systems as quanta of a harmonic oscillator
34.1.1 Bosons and the quantum harmonic oscillator
34.1.2 Fermions and the fermionic oscillator
34.2 Solutions to the free particle Schrödinger equation
34.2.1 Box normalization
34.2.2 Continuum normalization
34.3 Quantum field operators
34.4 For further reading

40 The Klein-Gordon Equation and Scalar Quantum Fields
40.1 The Klein-Gordon equation and its solutions
40.2 Classical relativistic scalar field theory
40.3 The complex structure on the space of Klein-Gordon solutions
40.4 Quantization of the real scalar field
40.5 The propagator
40.6 Fermionic scalars
40.7 For further reading

A Conventions
Preface
This document began as course notes prepared for a class taught at Columbia
during the 2012-13 academic year. The intent was to cover the basics of quantum
mechanics, up to and including basic material on relativistic quantum field
theory, from a point of view emphasizing the role of unitary representations of
Lie groups in the foundations of the subject. It has been significantly rewritten
and extended during the past year and the intent is to continue this process
based upon experience teaching the same material during 2014-5. The current
state of the document is that of a first draft of a book. As changes are made,
the latest version will be available at
http://www.math.columbia.edu/~woit/QM/qmbook.pdf
Corrections, comments, criticism, and suggestions about improvements are
encouraged, with the best way to contact me being email to [email protected]
The approach to this material is simultaneously rather advanced, using cru-
cially some fundamental mathematical structures normally only discussed in
graduate mathematics courses, while at the same time trying to do this in as
elementary terms as possible. The Lie groups needed are relatively simple ones
that can be described purely in terms of small matrices. Much of the represen-
tation theory will just use standard manipulations of such matrices. The only
prerequisite for the course as taught was linear algebra and multi-variable cal-
culus. My hope is that this level of presentation will simultaneously be useful to
mathematics students trying to learn something about both quantum mechan-
ics and representation theory, as well as to physics students who already have
seen some quantum mechanics, but would like to know more about the mathe-
matics underlying the subject, especially that relevant to exploiting symmetry
principles.
The topics covered often intentionally avoid overlap with the material of
standard physics courses in quantum mechanics and quantum field theory, for
which many excellent textbooks are available. This document is best read in
conjunction with such a text. Some of the main differences with standard physics
presentations include:
• The role of Lie groups, Lie algebras, and their unitary representations is
systematically emphasized, including not just the standard use of these to
derive consequences for the theory of a “symmetry” generated by operators
commuting with the Hamiltonian.
• Symplectic geometry and the role of the Lie algebra of functions on phase
space in Hamiltonian mechanics are emphasized, with quantization just the
passage to a unitary representation of (a subalgebra of) this Lie algebra.
• The role of the metaplectic representation and the subtleties of the pro-
jective factor involved are described in detail.
• The parallel roles of the Clifford algebra and the spinor representation are
extensively investigated.
• Some topics usually first encountered in the context of relativistic quan-
tum field theory are instead first developed in simpler non-relativistic or
finite-dimensional contexts. Non-relativistic quantum field theory based
on the Schrödinger equation is described in detail before moving on to
the relativistic case. The topic of irreducible representations of space-
time symmetry groups is first encountered with the case of the Euclidean
group, where the implications for the non-relativistic theory are explained.
The analogous problem for the relativistic case, that of the irreducible rep-
resentations of the Poincaré group, is then worked out later on.
• The emphasis is on the Hamiltonian formalism and its representation-
theoretical implications, with the Lagrangian formalism de-emphasized.
In particular, the operators generating symmetry transformations are de-
rived using the moment map for the action of such transformations on
phase space, not by invoking Noether’s theorem for transformations that
leave invariant a Lagrangian.
• Care is taken to keep track of the distinction between vector spaces and
their duals, as well as the distinction between real and complex vector
spaces, making clear exactly where complexification and the choice of a
complex structure enters the theory.
• A fully rigorous treatment of the subject is beyond the scope of what is
covered here, but an attempt is made to keep clear the difference between
where a rigorous treatment could be pursued relatively straight-forwardly,
and where there are serious problems of principle making a rigorous treat-
ment very hard to achieve.
Chapter 1
1.1 Introduction
A famous quote from Richard Feynman goes “I think it is safe to say that no one
understands quantum mechanics” [17]. In this book we’ll pursue one possible
route to such an understanding, emphasizing the deep connections of quantum
mechanics to fundamental ideas and powerful techniques of modern mathemat-
ics. The strangeness inherent in quantum theory that Feynman was referring
to has two rather different sources. One of them is the inherent disjunction and
incommensurability between the conceptual framework of the classical physics
which governs our everyday experience of the physical world, and the very dif-
ferent framework which governs physical reality at the atomic scale. Familiarity
with the powerful formalisms of classical mechanics and electromagnetism pro-
vides deep understanding of the world at the distance scales familiar to us.
Supplementing these with the more modern (but still “classical” in the sense
of “not quantum”) subjects of special and general relativity extends our under-
standing into other less accessible regimes, while still leaving atomic physics a
mystery.
Read in context though, Feynman was pointing to a second source of diffi-
culty, contrasting the mathematical formalism of quantum mechanics with that
of the theory of general relativity, a supposedly equally hard to understand
subject. General relativity can be a difficult subject to master, but its math-
ematical and conceptual structure involves a fairly straight-forward extension
of structures that characterize 19th century physics. The fundamental physical
laws (Einstein’s equations for general relativity) are expressed as partial differ-
ential equations, a familiar if difficult mathematical subject. The state of the
system is determined by the set of fields satisfying these equations, and observ-
able quantities are functionals of these fields. The mathematics is just that of
the usual calculus: differential equations and their real-valued solutions.
In quantum mechanics, the state of a system is best thought of as a different
sort of mathematical object: a vector in a complex vector space, the so-called
state space. One can sometimes interpret this vector as a function, the wave-
function, although this comes with the non-classical feature that wavefunctions
are complex-valued. What’s truly completely different is the treatment of ob-
servable quantities, which correspond to self-adjoint linear operators on the state
space. This has no parallel in classical physics, and violates our intuitions about
how physics should work, with observables now often no longer commuting.
During the earliest days of quantum mechanics, the mathematician Hermann
Weyl quickly recognized that the mathematical structures being used were ones
he was quite familiar with from his work in the field of representation theory.
From the point of view that takes representation theory as a fundamental struc-
ture, the framework of quantum mechanics looks perfectly natural. Weyl soon
wrote a book expounding such ideas [72], but this got a mixed reaction from
physicists unhappy with the penetration of unfamiliar mathematical structures
into their subject (with some of them characterizing the situation as the “Grup-
penpest”, the group theory plague). One goal of this course will be to try and
make some of this mathematics as accessible as possible, boiling down Weyl’s
exposition to its essentials while updating it in the light of many decades of
progress and better understanding of the subject.
Weyl’s insight that quantum mechanics crucially involves understanding the
Lie groups that act on the phase space of a physical system and the unitary rep-
resentations of these groups has been vindicated by later developments which
dramatically expanded the scope of these ideas. The use of representation the-
ory to exploit the symmetries of a problem has become a powerful tool that has
found uses in many areas of science, not just quantum mechanics. I hope that
readers whose main interest is physics will learn to appreciate the mathematical
structures that lie behind the calculations of standard textbooks, helping them
understand how to effectively exploit them in other contexts. Those whose main
interest is mathematics will hopefully gain some understanding of fundamen-
tal physics, at the same time as seeing some crucial examples of groups and
representations. These should provide a good grounding for appreciating more
abstract presentations of the subject that are part of the standard mathemat-
ical curriculum. Anyone curious about the relation of fundamental physics to
mathematics, and what Eugene Wigner described as “The Unreasonable Ef-
fectiveness of Mathematics in the Natural Sciences”[73] should benefit from an
exposure to this remarkable story at the intersection of the two subjects.
The following sections give an overview of the fundamental ideas behind
much of the material to follow. In this sketchy and abstract form they will
likely seem rather mystifying to those meeting them for the first time. As we
work through basic examples in coming chapters, a better understanding of the
overall picture described here should start to emerge.
1.2.1 Fundamental axioms of quantum mechanics
In classical physics, the state of a system is given by a point in a “phase space”,
which one can think of equivalently as the space of solutions of an equation
of motion, or as (parametrizing solutions by initial value data) the space of
coordinates and momenta. Observable quantities are just functions on this space
(i.e. functions of the coordinates and momenta). There is one distinguished
observable, the energy or Hamiltonian, and it determines how states evolve in
time through Hamilton’s equations.
The basic structure of quantum mechanics is quite different, with the for-
malism built on the following simple axioms:
Axiom (States). The state of a quantum mechanical system is given by a non-
zero vector in a complex vector space H with Hermitian inner product ⟨·, ·⟩.
We’ll review in chapter 4 some linear algebra, including the properties of in-
ner products on complex vector spaces. H may be finite or infinite dimensional,
with further restrictions required in the infinite-dimensional case (e.g. we may
want to require H to be a Hilbert space). Note two very important differences
with classical mechanical states:
• The state space is always linear: a linear combination of states is also a
state.
• The state space is a complex vector space: these linear combinations can
and do crucially involve complex numbers, in an inescapable way. In the
classical case only real numbers appear, with complex numbers used only
as an inessential calculational tool.
In this course we will sometimes use the notation introduced by Dirac for
vectors in the state space H: such a vector with a label ψ is denoted
|ψ⟩

Time evolution of a state |ψ(t)⟩ is determined by the Hamiltonian observable H,
through the Schrödinger equation

d/dt |ψ(t)⟩ = −(i/ℏ) H |ψ(t)⟩
The Hamiltonian observable H will have a physical interpretation in terms
of energy, and one may also want to specify some sort of positivity property on
H, in order to assure the existence of a stable lowest energy state.
ℏ is a dimensional constant, the value of which depends on what units one
uses for time and for energy. It has the dimensions [energy] · [time] and its
experimental values are

1.054571726(47) × 10^{-34} Joule · seconds = 6.58211928(15) × 10^{-16} eV · seconds

(eV is the unit of “electron-Volt”, the energy acquired by an electron moving
through a one-Volt electric potential). The most natural units to use for quantum
mechanical problems would be energy and time units chosen so that ℏ = 1.
For instance one could use seconds for time and measure energies in the very
small units of 6.6 × 10^{-16} eV, or use eV for energies, and then the very small
units of 6.6 × 10^{-16} seconds for time. Schrödinger’s equation implies that if one
is looking at a system where the typical energy scale is an eV, one’s state-vector
will be changing on the very short time scale of 6.6 × 10^{-16} seconds. When
we do computations, usually we will just set ℏ = 1, implicitly going to a unit
system natural for quantum mechanics. When we get our final result, we can
insert appropriate factors of ℏ to allow one to get answers in more conventional
unit systems.
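As a quick arithmetic check of these numbers (an illustrative sketch, not part of the text; the approximate size of one eV in Joules is taken as an outside input here):

```python
# Values of hbar quoted above, in two unit systems
hbar_J_s = 1.054571726e-34    # Joule * seconds
hbar_eV_s = 6.58211928e-16    # eV * seconds

eV_in_joules = 1.602176565e-19       # approximate size of one electron-Volt in Joules
print(hbar_J_s / eV_in_joules)       # ~6.6e-16, recovering hbar in eV * seconds

# Time scale on which a state of typical energy 1 eV changes:
print(hbar_eV_s / 1.0)               # ~6.6e-16 seconds
```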
It is sometimes convenient however to carry along factors of ℏ, since this
can help make clear which terms correspond to classical physics behavior, and
which ones are purely quantum mechanical in nature. Typically classical physics
comes about in the limit where

(energy scale) · (time scale) / ℏ

is large. This is true for the energy and time scales encountered in everyday
life, but it can also always be achieved by taking ℏ → 0, and this is what will
often be referred to as the “classical limit”.
Principle (Observables). States where the value of an observable can be char-
acterized by a well-defined number are the states that are eigenvectors for the
corresponding self-adjoint operator. The value of the observable in such a state
will be a real number, the eigenvalue of the operator.
This principle identifies the states we have some hope of sensibly associating
a label to (the eigenvalue), a label which in some contexts corresponds to an
observable quantity characterizing states in classical mechanics. The observables
of most use will turn out to correspond to some group action on the physical
system (for instance the energy, momentum, angular momentum, or charge).
Principle (The Born rule). Given an observable O and two unit-norm states
|ψ1⟩ and |ψ2⟩ that are eigenvectors of O with eigenvalues λ1 and λ2 (i.e. O|ψ1⟩ =
λ1|ψ1⟩ and O|ψ2⟩ = λ2|ψ2⟩), the complex linear combination state

c1|ψ1⟩ + c2|ψ2⟩

may not have a well-defined value for the observable O. If one attempts to
measure this observable, one will get either λ1 or λ2, with probabilities

|c1|^2/(|c1|^2 + |c2|^2)   and   |c2|^2/(|c1|^2 + |c2|^2)

respectively.
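As a small numerical illustration of the Born rule (a sketch that is not part of the text, with c1 and c2 chosen arbitrarily):

```python
import numpy as np

c1, c2 = 1.0 + 0.0j, 0.5 - 0.5j           # coefficients of the superposition
weights = np.array([abs(c1)**2, abs(c2)**2])
probabilities = weights / weights.sum()   # Born rule probabilities for lambda_1, lambda_2
print(probabilities)                      # [0.6666..., 0.3333...]
```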
The Born rule is sometimes raised to the level of an axiom of the theory, but
it is plausible to expect that, given a full understanding of how measurements
work, it can be derived from the more fundamental axioms of the previous
section. Such an understanding though of how classical behavior emerges in
experiments is a very challenging topic, with the notion of “decoherence” playing
an important role. See the end of this chapter for some references that discuss
these issues in detail.
Note that the state c|ψi will have the same eigenvalues and probabilities as
the state |ψi, for any complex number c. It is conventional to work with states
of norm fixed to the value 1, which fixes the amplitude of c, leaving a remaining
ambiguity which is a phase eiθ . By the above principles this phase will not
contribute to the calculated probabilities of measurements. We will however
not at all take the point of view that this phase information can be ignored. It
plays an important role in the mathematical structure, and the relative phase
of two different states certainly does affect measurement probabilities.
We will be examining this notion in great detail and working through many examples
in coming chapters, but here’s a quick summary of the relevant definitions, as
well as an indication of the relationship to the quantum theory formalism.
Definition (Group). A group G is a set with an associative multiplication, such
that the set contains an identity element, as well as the multiplicative inverse
of each element.
Many different kinds of groups are of interest in mathematics, an example
of the sort that we will be interested in being the group of all rotations about
a point in 3-dimensional space. Most of the groups we will consider are “matrix
groups”, i.e. subgroups of the group of n by n invertible matrices (with real or
complex coefficients).
Definition (Representation). A (complex) representation (π, V ) of a group G
is a homomorphism
π : g ∈ G → π(g) ∈ GL(V )
where GL(V ) is the group of invertible linear maps V → V , with V a complex
vector space.
Saying the map π is a homomorphism means

π(g1)π(g2) = π(g1 g2)

for all g1, g2 ∈ G.
the standard Hermitian inner product in a complex vector space. In physical
applications, the group representations under consideration typically correspond
to physical symmetries, and will preserve lengths in H, since these correspond
to probabilities of various observations. We have the definition
Definition (Unitary representation). A representation (π, V ) on a complex vec-
tor space V with Hermitian inner product ⟨·, ·⟩ is a unitary representation if it
preserves the inner product, i.e.

⟨π(g)v, π(g)w⟩ = ⟨v, w⟩

for all g ∈ G and v, w ∈ V.

U^{-1} = U^†

π(g) = e^A

A^† = −A
We thus see that, at least in the case of finite-dimensional H, the unitary
representation π of G on H coming from a symmetry G of our physical sys-
tem gives us not just unitary matrices π(g), but also corresponding self-adjoint
operators B on H. Symmetries thus give us quantum mechanical observables,
with the fact that these are self-adjoint linear operators corresponding to the
fact that symmetries are realized as unitary representations on the state space.
In the following chapters we’ll see many examples of this phenomenon. A
fundamental example that we will study in detail is that of time-translation
symmetry. Here the group G = R and we get a unitary representation of
R on the space of states H. The corresponding self-adjoint operator is the
Hamiltonian operator H. This unitary representation gives the dynamics of the
theory, with the Schrödinger equation just the statement that −(i/ℏ)H∆t is the skew-
adjoint operator that gets exponentiated to give the unitary transformation that
moves states ψ(t) ahead in time by an amount ∆t.
a finite dimensional vector space. In general though it will be infinite dimen-
sional and we will need to further specify the space of functions (i.e. continuous
functions, differentiable functions, functions with finite integral, etc.).
Given a group action of G on M , taking complex functions on M provides
a representation (π, F un(M )) of G, with π defined on functions f by
(π(g)f )(x) = f (g −1 · x)
Note the inverse that is needed to get the group homomorphism property to
work, since one has

(π(g1)(π(g2)f))(x) = (π(g2)f)(g1^{-1} · x) = f(g2^{-1} · g1^{-1} · x) = f((g1 g2)^{-1} · x) = (π(g1 g2)f)(x)

This calculation would not work out properly for non-commutative G if one
defined (π(g)f)(x) = f(g · x).
One way to construct quantum mechanical state spaces H is as “wavefunc-
tions”, meaning complex-valued functions on space-time. The above shows that
given any group action on space-time, we get a representation π on the state
space H of such wavefunctions.
Note that only in the case of M a finite set of points will we get a finite-
dimensional representation this way, since only then will Fun(M) be a finite-
dimensional vector space (C^{number of points in M}). A good example to consider to
understand this construction is the following:
• Take M to be a set of 3 elements x_1, x_2, x_3. So Fun(M) = C^3. For
f ∈ Fun(M), f is a vector in C^3, with components (f(x_1), f(x_2), f(x_3)).
• Take G = S3 , the group of permutations of 3 elements. This group has
3! = 6 elements.
• Take G to act on M by permuting the 3 elements (a short numerical sketch of this example follows the list)

(g, x_i) → g · x_i
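Here is a small numerical version of this example (an illustrative sketch, not part of the text), building the matrices of π(g) on Fun(M) = C^3 from the rule (π(g)f)(x) = f(g^{-1} · x) and checking the homomorphism property directly:

```python
import numpy as np
from itertools import permutations

def rep(g):
    """Matrix of pi(g) on Fun(M) = C^3, where (pi(g)f)(x_i) = f(g^{-1} x_i)."""
    g_inv = tuple(g.index(i) for i in range(3))   # the inverse permutation
    P = np.zeros((3, 3), dtype=complex)
    for i in range(3):
        P[i, g_inv[i]] = 1                        # (pi(g)f)_i = f_{g^{-1}(i)}
    return P

def compose(g1, g2):
    return tuple(g1[g2[i]] for i in range(3))     # (g1 g2)(i) = g1(g2(i))

group = list(permutations(range(3)))              # the 6 elements of S_3
print(all(np.allclose(rep(g1) @ rep(g2), rep(compose(g1, g2)))
          for g1 in group for g2 in group))       # True: pi is a homomorphism
```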
The discussion here has been just a quick sketch of some of the ideas behind
the material we will cover in later chapters. These ideas will be examined in
much greater detail, beginning with the next two chapters, where they will
appear very concretely when we discuss the simplest possible quantum systems,
those with one and two-complex dimensional state spaces.
see Landsman [37]. Finally, to get an idea of the wide variety of points of view
available on the topic of the “interpretation” of quantum mechanics, there’s a
volume of interviews [53] with experts on the topic.
Chapter 2
The simplest example of a Lie group is the group of rotations of the plane,
with elements parametrized by a single number, the angle of rotation θ. It is
useful to identify such group elements with unit vectors in the complex plane,
given by eiθ . The group is then denoted U (1), since such complex numbers
can be thought of as 1 by 1 unitary matrices . We will see in this chapter how
the general picture described in chapter 1 works out in this simple case. State
spaces will be unitary representations of the group U (1), and we will see that any
such representation decomposes into a sum of one-dimensional representations.
These one-dimensional representations will be characterized by an integer q, and
such integers are the eigenvalues of a self-adjoint operator we will call Q, which
is an observable of the quantum theory.
One motivation for the notation Q is that this is the conventional physics
notation for electric charge, and this is one of the places where a U (1) group
occurs in physics. Examples of U (1) groups acting on physical systems include:
• Quantum particles can be described by a complex-valued “wavefunction”,
and U (1) acts on such wavefunctions by phase transformations of the
value of the function. This phenomenon can be used to understand how
particles interact with electromagnetic fields, and in this case the physical
interpretation of the eigenvalues of the Q operator will be the electric
charge of the particle. We will discuss this in detail in chapter 42.
• If one chooses a particular direction in three-dimensional space, then the
group of rotations about that axis can be identified with the group U (1).
The eigenvalues of Q will have a physical interpretation as the quantum
version of angular momentum in the chosen direction. The fact that such
eigenvalues are not continuous, but integral, shows that quantum angular
momentum has quite different behavior than classical angular momentum.
• When we study the harmonic oscillator we will find that it has a U (1) sym-
metry (rotations in the position-momentum plane), and that the Hamil-
tonian operator is a multiple of the operator Q for this case. This im-
plies that the eigenvalues of the Hamiltonian (which give the energy of
the system) will be integers times some fixed value. When one describes
multi-particle systems in terms of quantum fields one finds a harmonic
oscillator for each momentum mode, and then the Q for that mode counts
the number of particles with that momentum.
This is just a set of n by n matrices, one for each group element, satisfying
the multiplication rules of the group elements. n is called the dimension of the
representation.
The groups G we are interested in will be examples of what mathematicians
call “Lie groups”. For those familiar with differential geometry, such groups
are examples of smooth manifolds. This means one can define derivatives of
functions on G and more generally the derivative of maps between Lie groups.
We will assume that our representations are given by differentiable maps π.
Some difficult general theory shows that considering the more general case of
continuous maps gives nothing new since the homomorphism property of these
maps is highly constraining. In any case, our goal in this course will be to study
quite explicitly certain specific groups and representations which are central in
quantum mechanics, and these representations will always be easily seen to be
differentiable.
Given two representations one can form their direct sum:
This representation is given by the homomorphism

$$(\pi_1 \oplus \pi_2) : g \in G \rightarrow \begin{pmatrix}\pi_1(g) & 0\\ 0 & \pi_2(g)\end{pmatrix}$$
det(M − λ1) = 0
Vλ = {v ∈ V : M v = λv}
are non-zero vector subspaces of V and can also be described as ker(M − λ1),
the kernel of the operator M −λ1. Since this operator and all the π(g) commute,
we have
v ∈ ker(M − λ1) =⇒ π(g)v ∈ ker(M − λ1)
π(g)π(h) = π(h)π(g)
for all h ∈ G. If π is irreducible, Schur’s lemma implies that, since they commute
with all the π(g), the matrices π(h) are all scalar matrices, i.e. π(h) = λh 1 for
some λh ∈ C. π is then irreducible exactly when it is the one-dimensional
representation given by π(h) = λh .
Definition (The group U (1)). The elements of the group U (1) are points on
the unit circle, which can be labeled by the unit complex number eiθ , for θ ∈ R.
Note that θ and θ +N 2π label the same group element for N ∈ Z. Multiplication
of group elements is just complex multiplication, which by the properties of the
exponential satisfies
eiθ1 eiθ2 = ei(θ1 +θ2 )
By our theorem from the last section, since U (1) is a commutative group,
all irreducible representations will be one-dimensional. Such an irreducible rep-
resentation will be given by a map
π : U (1) → GL(1, C)
Theorem 2.1. All irreducible representations of the group U (1) are unitary,
and given by

π_k(e^{iθ}) = e^{ikθ}

for k ∈ Z.
f : U (1) → C∗
$$f'(\theta) = \lim_{\Delta\theta \to 0}\frac{f(\theta + \Delta\theta) - f(\theta)}{\Delta\theta} = f(\theta)\lim_{\Delta\theta \to 0}\frac{f(\Delta\theta) - 1}{\Delta\theta} \quad\text{(using the homomorphism property)} = f(\theta)f'(0)$$
Denoting the constant f'(0) by C, the only solutions to this differential equation
satisfying f(0) = 1 are

f(θ) = e^{Cθ}

Requiring periodicity we find that e^{2πC} = f(2π) = f(0) = 1, so C = ik for some
k ∈ Z, and f(θ) = e^{ikθ}.
The representations we have found are all unitary, with πk taking values not
just in C∗ , but in U (1) ⊂ C∗ . One can check that the complex numbers eikθ
satisfy the condition to be a unitary 1 by 1 matrix, since

(e^{ikθ})^{-1} = e^{-ikθ} = \overline{e^{ikθ}}
These representations are restrictions to the unit circle U (1) of the irre-
ducible representations of the group C∗ , which are given by
π_k : z ∈ C∗ → π_k(z) = z^k ∈ C∗
Such representations are not unitary, but they have an extremely simple form,
so it sometimes is convenient to work with them, later restricting to the unit
circle, where the representation is unitary.
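A quick numerical check of these properties of π_k (an illustrative sketch, not part of the text):

```python
import numpy as np

def pi_k(theta, k):
    """The irreducible U(1) representation pi_k, as the 1 by 1 unitary matrix e^{ik theta}."""
    return np.exp(1j * k * theta)

k, t1, t2 = 3, 0.7, 1.9
print(np.isclose(pi_k(t1, k) * pi_k(t2, k), pi_k(t1 + t2, k)))  # homomorphism property
print(np.isclose(1 / pi_k(t1, k), np.conj(pi_k(t1, k))))        # unitarity: inverse = conjugate
```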
Digression (Fourier analysis of periodic functions). We’ll discuss Fourier anal-
ysis more seriously in chapter 10 when we come to the case of the translation
groups and of state-spaces that are spaces of “wavefunctions” on space-time. For
now though, it might be worth pointing out an important example of a repre-
sentation of U (1): the space F un(S 1 ) of complex-valued functions on the circle
S 1 . We will evade discussion here of the very non-trivial analysis involved, by
not specifying what class of functions we are talking about (e.g. continuous,
integrable, differentiable, etc.). Periodic functions can be studied by rescaling
the period to 2π, thus looking at complex-valued functions of a real variable φ
satisfying
f (φ + N 2π) = f (φ)
for integer N , which we can think of as functions on a circle, parametrized by
angle φ. We have an action of the group U (1) on the circle by rotation, with
the group element eiθ acting as:
φ→φ+θ
where we have matched the sin of not specifying the class of functions in Fun(S^1)
on the left-hand side with the sin of not explaining how to handle the infinite
direct sum on the right-hand side. What can be specified precisely is how
the irreducible sub-representation (π_k, C) sits inside Fun(S^1): it is the subspace
of functions f on which each group element e^{iθ} acts as multiplication by e^{ikθ}.
Definition. The charge operator Q for the U (1) representation (π, H) is the
self-adjoint linear operator on H that acts by multiplication by qj on the irre-
ducible representation Hqj . Taking basis elements in Hqj it acts on H as the
matrix
$$Q = \begin{pmatrix} q_1 & 0 & \cdots & 0\\ 0 & q_2 & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \cdots & q_n \end{pmatrix}$$
have a well-defined numerical value for this observable, the integer qj . A general
state will be a linear superposition of state vectors from different Hqj and there
will not be a well-defined numerical value for the observable Q on such a state.
From the action of Q on H, one can recover the representation, i.e. the action
of the symmetry group U (1) on H, by multiplying by i and exponentiating, to
get
$$\pi(\theta) = e^{iQ\theta} = \begin{pmatrix} e^{iq_1\theta} & 0 & \cdots & 0\\ 0 & e^{iq_2\theta} & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \cdots & e^{iq_n\theta} \end{pmatrix} \in U(n) \subset GL(n, \mathbf{C})$$
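As a sketch (not part of the text, with the charges q_j chosen arbitrarily), this construction can be checked numerically:

```python
import numpy as np
from scipy.linalg import expm

Q = np.diag([2.0, -1.0, 0.0]).astype(complex)    # a charge operator on C^3

def pi(theta):
    return expm(1j * Q * theta)                  # pi(theta) = e^{iQ theta}

t1, t2 = 0.4, 1.1
U = pi(t1)
print(np.allclose(U.conj().T @ U, np.eye(3)))    # pi(theta) is unitary
print(np.allclose(pi(t1) @ pi(t2), pi(t1 + t2))) # and a homomorphism of U(1)
```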
π′ : iθ ∈ iR → π′(iθ) = iQθ
The right-hand side of the picture is supposed to somehow represent GL(n, C),
which is the 2n^2 dimensional real vector space of n by n complex matrices, mi-
nus the locus of matrices with zero determinant, which are those that can’t be
inverted. It has a distinguished point, the identity. The derivative π′ of the
representation map π is the linear operator iQ.
In this very simple example, this abstract picture is over-kill and likely con-
fusing. We will see the same picture though occurring in many other examples
in later chapters, examples where the abstract technology is increasingly useful.
Keep in mind that, just like in this U (1) case, the maps π will just be exponen-
tial maps in the examples we care about, with very concrete incarnations given
by exponentiating matrices.
|ψ(t)i = U (t)|ψ(0)i
for
U (t) = e−itH
The commutator of two operators O1 , O2 is defined by
[O1 , O2 ] := O1 O2 − O2 O1
[U (t), Q] = 0
This condition
U (t)Q = QU (t) (2.1)
implies that if a state has a well-defined value qj for the observable Q at time
t = 0, it will continue to have the same value at any other time t, since

Q U(t)|ψ⟩ = U(t) Q|ψ⟩ = U(t) q_j|ψ⟩ = q_j U(t)|ψ⟩

so the action of the U (1) group on the state space of the system commutes with
the time evolution law determined by the choice of Hamiltonian. We see that
this notion of symmetry implies a corresponding conservation law.
2.5 Summary
To summarize the situation for G = U (1), we have found
• Irreducible representations π are one-dimensional and characterized by
their derivative π′ at the identity. If G = R, π′ could be any complex
number. If G = U (1), periodicity requires that π′ must be iq, q ∈ Z, so
irreducible representations are labeled by an integer.
• An arbitrary representation π of U (1) is of the form
π(eiθ ) = eiθQ
Chapter 3
The simplest truly non-trivial quantum systems have state spaces that are in-
herently two-complex dimensional. This provides a great deal more structure
than that seen in chapter 2, which could be analyzed by breaking up the space
of states into one-dimensional subspaces of given charge. We’ll study these two-
state systems in this section, encountering for the first time the implications of
working with representations of non-commutative groups. Since they give the
simplest non-trivial realization of many quantum phenomena, such systems are
the fundamental objects of quantum information theory (the “qubit”) and the
focus of attempts to build a quantum computer (which would be built out of
multiple copies of this sort of fundamental object). Many different possible two-
state quantum systems could potentially be used as the physical implementation
of a qubit.
One of the simplest possibilities to take would be the idealized situation
of a single electron, somehow fixed so that its spatial motion could be ignored,
leaving its quantum state described just by its so-called “spin degree of freedom”,
which takes values in H = C2 . The term “spin” is supposed to call to mind
the angular momentum of an object spinning about some axis, but such
classical physics has nothing to do with the qubit, which is a purely quantum
system.
In this chapter we will analyze what happens for general quantum systems
with H = C2 by first finding the possible observables. Exponentiating these
will give the group U (2) of unitary 2 by 2 matrices acting on H = C2 . This
is a specific representation of U (2), the “defining” representation. By restrict-
ing to the subgroup SU (2) ⊂ U (2) of elements of determinant one, we get a
representation of SU (2) on C2 often called the “spin 1/2” representation.
Later on, in chapter 8, we will find all the irreducible representations of
SU (2). These are labeled by a natural number
N = 0, 1, 2, 3, . . .
and have dimension N + 1. The corresponding quantum systems are said to have
“spin N/2”. The case N = 0 is the trivial representation on C and the case
N = 1 is the case of this chapter. In the limit N → ∞ one can make contact
with classical notions of spinning objects and angular momentum, but the spin
1/2 case is at the other limit, where the behavior is purely quantum-mechanical.
The σj are called the “Pauli matrices” and are a pretty universal choice of basis
in this subject. This choice of basis is a convention, with one aspect of this
convention that of taking the basis element in the 3-direction to be diagonal.
In common physical situations and conventions, the third direction is the dis-
tinguished “up-down” direction in space, so often chosen when a distinguished
direction in R3 is needed.
Recall that the basic principle of how measurements are supposed to work
in quantum theory says that the only states that have well-defined values for
these four observables are the eigenvectors for these matrices. The first matrix
gives a trivial observable (the identity on every state), whereas the last one, σ3 ,
has the two eigenvectors
$$\sigma_3\begin{pmatrix}1\\ 0\end{pmatrix} = \begin{pmatrix}1\\ 0\end{pmatrix}$$

and

$$\sigma_3\begin{pmatrix}0\\ 1\end{pmatrix} = -\begin{pmatrix}0\\ 1\end{pmatrix}$$
with eigenvalues +1 and −1. In quantum information theory, where this is
the qubit system, these two eigenstates are labeled |0i and |1i because of the
analogy with a classical bit of information. Later on when we get to the theory of
spin, we will see that (1/2)σ3 is the observable corresponding to the SO(2) = U (1)
symmetry group of rotations about the third spatial axis, and the eigenvalues
−1/2, +1/2 of this operator will be used to label the two eigenstates

$$|+\tfrac{1}{2}\rangle = \begin{pmatrix}1\\ 0\end{pmatrix} \quad\text{and}\quad |-\tfrac{1}{2}\rangle = \begin{pmatrix}0\\ 1\end{pmatrix}$$
we have

$$(\sigma_1 + i\sigma_2)\begin{pmatrix}0\\ 1\end{pmatrix} = 2\begin{pmatrix}1\\ 0\end{pmatrix} \qquad (\sigma_1 + i\sigma_2)\begin{pmatrix}1\\ 0\end{pmatrix} = \begin{pmatrix}0\\ 0\end{pmatrix}$$

and

$$(\sigma_1 - i\sigma_2)\begin{pmatrix}1\\ 0\end{pmatrix} = 2\begin{pmatrix}0\\ 1\end{pmatrix} \qquad (\sigma_1 - i\sigma_2)\begin{pmatrix}0\\ 1\end{pmatrix} = \begin{pmatrix}0\\ 0\end{pmatrix}$$
This is just exactly the case studied in chapter 2, for a U (1) group acting on
H = C2 , with
$$Q = \begin{pmatrix}1 & 0\\ 0 & 1\end{pmatrix}$$
This matrix commutes with any other 2 by 2 matrix, so we can treat its action
on H independently of the action of the σj .
Turning to the other three basis elements of the space of observables, the
Pauli matrices, it turns out that since all the σj satisfy σ_j^2 = 1, their exponentials
also take a simple form.

$$e^{i\theta\sigma_j} = 1 + i\theta\sigma_j + \frac{1}{2}(i\theta)^2\sigma_j^2 + \frac{1}{3!}(i\theta)^3\sigma_j^3 + \cdots$$
$$= 1 + i\theta\sigma_j - \frac{1}{2}\theta^2 1 - i\frac{1}{3!}\theta^3\sigma_j + \cdots$$
$$= (1 - \frac{1}{2!}\theta^2 + \cdots)1 + i(\theta - \frac{1}{3!}\theta^3 + \cdots)\sigma_j$$
$$= (\cos\theta)1 + i\sigma_j(\sin\theta) \qquad (3.1)$$
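Equation 3.1 is easy to confirm numerically; the following sketch (not part of the text) checks it for each Pauli matrix:

```python
import numpy as np
from scipy.linalg import expm

sigma = [np.array([[0, 1], [1, 0]], dtype=complex),      # sigma_1
         np.array([[0, -1j], [1j, 0]], dtype=complex),   # sigma_2
         np.array([[1, 0], [0, -1]], dtype=complex)]     # sigma_3

theta = 0.83
for s in sigma:
    lhs = expm(1j * theta * s)
    rhs = np.cos(theta) * np.eye(2) + 1j * np.sin(theta) * s
    print(np.allclose(lhs, rhs))    # True: equation 3.1 holds
```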
subgroups inside the unitary 2 by 2 matrices, but only one of them (the case
j = 3) will act diagonally on H, with the U (1) representation determined by
$$Q = \begin{pmatrix}1 & 0\\ 0 & -1\end{pmatrix}$$
For the other two cases j = 1 and j = 2, by a change of basis one could put
either one in the same diagonal form, but doing this for one value of j makes
the other two no longer diagonal. All three values of j need to be treated
simultaneously, and one needs to consider not just the U (1)s but the group one
gets by exponentiating general linear combinations of Pauli matrices.
To compute such exponentials, one can check that these matrices satisfy the
following relations, useful in general for doing calculations with them instead of
multiplying out explicitly the 2 by 2 matrices:
[σj , σk ]+ = σj σk + σk σj = 2δjk 1
Here [·, ·]+ is the anticommutator. This relation says that all σj satisfy σ_j^2 = 1
and distinct σj anticommute (e.g. σjσk = −σkσj for j ≠ k).
Notice that the anticommutation relations imply that, if we take a vector
v = (v1 , v2 , v3 ) ∈ R3 and define a 2 by 2 matrix by
$$v\cdot\sigma = v_1\sigma_1 + v_2\sigma_2 + v_3\sigma_3 = \begin{pmatrix}v_3 & v_1 - iv_2\\ v_1 + iv_2 & -v_3\end{pmatrix}$$
since
((cos θ)1 + i(sin θ)v · σ)((cos θ)1 − i(sin θ)v · σ) = (cos^2 θ + sin^2 θ)1 = 1
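These relations, and the fact that (v · σ)^2 = 1 when v is a unit vector, can be checked directly (an illustrative sketch, not part of the text, with v chosen arbitrarily):

```python
import numpy as np

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
sigma = [s1, s2, s3]

# Anticommutation relations: sigma_j sigma_k + sigma_k sigma_j = 2 delta_jk 1
print(all(np.allclose(sigma[j] @ sigma[k] + sigma[k] @ sigma[j],
                      2 * (j == k) * np.eye(2))
          for j in range(3) for k in range(3)))          # True

v = np.array([1.0, 2.0, 2.0]) / 3.0                      # an arbitrary unit vector
v_dot_sigma = sum(vi * si for vi, si in zip(v, sigma))
print(np.allclose(v_dot_sigma @ v_dot_sigma, np.eye(2))) # True: (v . sigma)^2 = 1
```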
We’ll review linear algebra and the notion of a unitary matrix in chapter 4, but
one form of the condition for a matrix M to be unitary is
M † = M −1
so the self-adjointness of the σj implies unitarity of e^{iθ v·σ}, since

(e^{iθ v·σ})^† = e^{-iθ(v·σ)^†} = e^{-iθ v·σ} = (e^{iθ v·σ})^{-1}
The condition that the first row has length one gives

$$\bar{\alpha}\alpha + \bar{\beta}\beta = |\alpha|^2 + |\beta|^2 = 1$$

and the condition that the two rows are orthogonal gives

$$\alpha\bar{\gamma} + \beta\bar{\delta} = 0 \implies \delta = -\frac{\bar{\alpha}\gamma}{\bar{\beta}}$$

Using these two relations and computing the determinant (which has to be 1)
gives

$$\alpha\delta - \beta\gamma = -\frac{\bar{\alpha}\alpha\gamma}{\bar{\beta}} - \beta\gamma = -\frac{\gamma}{\bar{\beta}}(\bar{\alpha}\alpha + \bar{\beta}\beta) = -\frac{\gamma}{\bar{\beta}} = 1$$

so one must have

$$\gamma = -\bar{\beta}, \quad \delta = \bar{\alpha}$$

and an SU (2) matrix will have the form

$$\begin{pmatrix}\alpha & \beta\\ -\bar{\beta} & \bar{\alpha}\end{pmatrix}$$
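A quick numerical sanity check of this parametrization (a sketch, not part of the text, with α and β chosen arbitrarily):

```python
import numpy as np

alpha, beta = 0.6 + 0.3j, 0.1 - 0.7j
norm = np.sqrt(abs(alpha)**2 + abs(beta)**2)
alpha, beta = alpha / norm, beta / norm         # enforce |alpha|^2 + |beta|^2 = 1

U = np.array([[alpha, beta],
              [-np.conj(beta), np.conj(alpha)]])
print(np.allclose(U.conj().T @ U, np.eye(2)))   # unitary
print(np.isclose(np.linalg.det(U), 1.0))        # determinant one, so U is in SU(2)
```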
While physicists prefer to work with self-adjoint Pauli matrices and their
real eigenvalues, one can work instead with the following skew-adjoint matrices
$$X_j = -i\frac{\sigma_j}{2}$$
which satisfy the slightly simpler commutation relations
$$[X_j, X_k] = \sum_{l=1}^{3}\epsilon_{jkl}X_l$$
or more explicitly

[X_1, X_2] = X_3,  [X_2, X_3] = X_1,  [X_3, X_1] = X_2
If these commutators were zero, the SU (2) elements one gets by exponentiat-
ing linear combinations of the Xj would be commuting group elements. The
non-triviality of the commutators reflects the non-commutativity of the group.
Group elements U ∈ SU (2) near the identity satisfy

U ≃ 1 + ε_1 X_1 + ε_2 X_2 + ε_3 X_3

for ε_j small and real, just as group elements z ∈ U (1) near the identity satisfy

z ≃ 1 + iε
3.3 Dynamics of a two-state system
Recall that the time dependence of states in quantum mechanics is given by the
Schrödinger equation
$$\frac{d}{dt}|\psi(t)\rangle = -iH|\psi(t)\rangle$$
where H is a particular self-adjoint linear operator on H, the Hamiltonian op-
erator. The most general such operator on C2 will be given by
H = h0 1 + h1 σ1 + h2 σ2 + h3 σ3
|ψ(t)i = U (t)|ψ(0)i
where
U (t) = e−itH
The h0 1 term in H just contributes an overall phase factor e−ih0 t , with the
remaining factor of U (t) an element of the group SU (2) rather than the larger
group U (2) of all 2 by 2 unitaries.
Using our earlier equation 3.1, we find (writing |h| = \sqrt{h_1^2 + h_2^2 + h_3^2})

$$U(t) = e^{-ih_0 t}\left(\cos(-t|h|)1 + i\sin(-t|h|)\frac{h_1\sigma_1 + h_2\sigma_2 + h_3\sigma_3}{|h|}\right)$$
$$= e^{-ih_0 t}\left(\cos(t|h|)1 - i\sin(t|h|)\frac{h_1\sigma_1 + h_2\sigma_2 + h_3\sigma_3}{|h|}\right)$$
$$= e^{-ih_0 t}\begin{pmatrix}\cos(t|h|) - i\frac{h_3}{|h|}\sin(t|h|) & -i\frac{h_1 - ih_2}{|h|}\sin(t|h|)\\ -i\frac{h_1 + ih_2}{|h|}\sin(t|h|) & \cos(t|h|) + i\frac{h_3}{|h|}\sin(t|h|)\end{pmatrix}$$
In the special case h_1 = h_2 = 0, U(t) is diagonal, and one can see that the
eigenvalues of the Hamiltonian are h_0 ± h_3.
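The closed form for U(t) can be compared against direct matrix exponentiation (an illustrative sketch, not part of the text, with the h_j chosen arbitrarily):

```python
import numpy as np
from scipy.linalg import expm

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)

h0, h = 0.5, np.array([0.3, -0.2, 0.7])
H = h0 * np.eye(2) + h[0] * s1 + h[1] * s2 + h[2] * s3
t, h_norm = 1.3, np.linalg.norm(h)

U_numeric = expm(-1j * t * H)
U_closed = np.exp(-1j * h0 * t) * (np.cos(t * h_norm) * np.eye(2)
            - 1j * np.sin(t * h_norm) * (h[0] * s1 + h[1] * s2 + h[2] * s3) / h_norm)
print(np.allclose(U_numeric, U_closed))     # True
```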
In the physical realization of this system by a spin 1/2 particle (ignoring its
spatial motion), the Hamiltonian is given by
$$H = \frac{ge}{4mc}(B_1\sigma_1 + B_2\sigma_2 + B_3\sigma_3)$$
where the Bj are the components of the magnetic field, and the physical con-
stants are the gyromagnetic ratio (g), the electric charge (e), the mass (m) and
the speed of light (c), so we have solved the problem of the time evolution of
such a system, setting h_j = \frac{ge}{4mc}B_j. For magnetic fields of size |B| in the 3-
direction, we see that the two different states with well-defined energy (|+\tfrac{1}{2}\rangle
and |-\tfrac{1}{2}\rangle) will have an energy difference between them of

$$\frac{ge}{2mc}|B|$$
This is known as the Zeeman effect and is readily visible in the spectra of atoms
subjected to a magnetic field. We will consider this example in more detail in
chapter 7, seeing how the group of rotations of R3 appears. Much later, in
chapter 42, we will derive this Hamiltonian term from general principles of how
electromagnetic fields couple to such spin 1/2 particles.
Chapter 4
v = v1 e1 + v2 e2 + · · · + vn en
The choice of a basis {ej } also allows us to express the action of a linear
operator Ω on V
Ω : v ∈ V → Ωv ∈ V
as multiplication by an n by n matrix:
$$\begin{pmatrix}v_1\\ v_2\\ \vdots\\ v_n\end{pmatrix} \rightarrow \begin{pmatrix}\Omega_{11} & \Omega_{12} & \dots & \Omega_{1n}\\ \Omega_{21} & \Omega_{22} & \dots & \Omega_{2n}\\ \vdots & \vdots & \ddots & \vdots\\ \Omega_{n1} & \Omega_{n2} & \dots & \Omega_{nn}\end{pmatrix}\begin{pmatrix}v_1\\ v_2\\ \vdots\\ v_n\end{pmatrix}$$
The invertible linear operators on V form a group under composition, a
group we will sometimes denote GL(V ). Choosing a basis identifies this group
with the group of invertible matrices, with group law matrix multiplication.
For V n-dimensional, we will denote this group by GL(n, R) in the real case,
GL(n, C) in the complex case.
Note that when working with vectors as linear combinations of basis vectors,
we can use matrix notation to write a linear transformation as
$$v \rightarrow \Omega v = \begin{pmatrix}e_1 & \cdots & e_n\end{pmatrix}\begin{pmatrix}\Omega_{11} & \Omega_{12} & \dots & \Omega_{1n}\\ \Omega_{21} & \Omega_{22} & \dots & \Omega_{2n}\\ \vdots & \vdots & \ddots & \vdots\\ \Omega_{n1} & \Omega_{n2} & \dots & \Omega_{nn}\end{pmatrix}\begin{pmatrix}v_1\\ v_2\\ \vdots\\ v_n\end{pmatrix}$$
One sees from this that we can think of the transformed vector as we did
above in terms of transformed coefficients vj with respect to fixed basis vectors,
but also could leave the vj unchanged and transform the basis vectors. At times
we will want to use matrix notation to write formulas for how the basis vectors
transform in this way, and then will write
$$\begin{pmatrix}e_1\\ e_2\\ \vdots\\ e_n\end{pmatrix} \rightarrow \begin{pmatrix}\Omega_{11} & \Omega_{21} & \dots & \Omega_{n1}\\ \Omega_{12} & \Omega_{22} & \dots & \Omega_{n2}\\ \vdots & \vdots & \ddots & \vdots\\ \Omega_{1n} & \Omega_{2n} & \dots & \Omega_{nn}\end{pmatrix}\begin{pmatrix}e_1\\ e_2\\ \vdots\\ e_n\end{pmatrix}$$
Note that putting the basis vectors ej in a column vector like this causes the
matrix for Ω to act on them by the transposed matrix. This is not a group action
since in general the product of two transposed matrices is not the transpose of
the product.
for α, β ∈ k, v, w ∈ V .
Given a linear transformation Ω acting on V , one can define:
Definition (Transpose transformation). The transpose of Ω is the linear trans-
formation
ΩT : V ∗ → V ∗
that satisfies
(ΩT l)(v) = l(Ωv)
for l ∈ V ∗ , v ∈ V .
For any representation (π, V ) of a group G on V , one can define a corre-
sponding representation on V ∗
Definition (Dual or contragredient representation). The dual or contragredient
representation on V ∗ is given by taking as linear operators
(π T )−1 (g) : V ∗ → V ∗
These satisfy the homomorphism property since
(π T (g1 ))−1 (π T (g2 ))−1 = (π T (g2 )π T (g1 ))−1 = ((π(g1 )π(g2 ))T )−1
For any choice of basis {ej } of V , one has a dual basis {e∗j } of V ∗ that
satisfies
e∗j (ek ) = δjk
Coordinates on V with respect to a basis are linear functions, and thus elements
of V ∗ . One can identify the coordinate function vj with the dual basis vector
e∗j since
e∗j (v1 e1 + v2 e2 + · · · + vn en ) = vj
One can easily show that the elements of the matrix for Ω in the basis ej
are given by
Ωjk = e∗j (Ωek )
and that the matrix for the transpose map (with respect to the dual basis) is
just the matrix transpose
(ΩT )jk = Ωkj
One can use matrix notation to write elements
l = l1 e∗1 + l2 e∗2 + · · · + ln e∗n ∈ V ∗
of V ∗ as row vectors
$$\begin{pmatrix}l_1 & l_2 & \cdots & l_n\end{pmatrix}$$

of coordinates on V^*. Then evaluation of l on a vector v is given by matrix
multiplication

$$l(v) = \begin{pmatrix}l_1 & l_2 & \cdots & l_n\end{pmatrix}\begin{pmatrix}v_1\\ v_2\\ \vdots\\ v_n\end{pmatrix} = l_1v_1 + l_2v_2 + \cdots + l_nv_n$$
4.3 Change of basis
Any invertible transformation A on V can be used to change the basis e_j of V
to a new basis e'_j by taking

e_j → e'_j = Ae_j
The matrix for a linear transformation Ω transforms under this change of basis
as

$$\Omega_{jk} = e_j^*(\Omega e_k) \rightarrow (e'_j)^*(\Omega e'_k) = ((A^T)^{-1}e_j^*)(\Omega A e_k) = e_j^*(A^{-1}\Omega A\, e_k) = (A^{-1}\Omega A)_{jk}$$

In the second step we are using the fact that elements of the dual basis transform
as the dual representation. One can check that this is what is needed to ensure
the relation

(e'_j)^*(e'_k) = δ_{jk}
The change of basis formula shows that if two matrices Ω1 and Ω2 are related
by conjugation by a third matrix A
Ω2 = A−1 Ω1 A
then one can think of them as both representing the same linear transforma-
tion, just with respect to two different choices of basis. Recall that a finite-
dimensional representation is given by a set of matrices π(g), one for each group
element. If two representations are related by

π_2(g) = A^{-1}π_1(g)A

(for all g, A does not depend on g), then we can think of them as being the
same representation, with different choices of basis. In such a case the represen-
tations π1 and π2 are called “equivalent”, and we will often implicitly identify
representations that are equivalent.
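A small numerical illustration (not part of the text) of the statement that Ω and A^{-1}ΩA describe the same linear map in two different bases:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3
Omega = rng.standard_normal((n, n))       # matrix of a linear map in the basis e_j
A = rng.standard_normal((n, n))           # change of basis e_j -> e'_j = A e_j
Omega_new = np.linalg.inv(A) @ Omega @ A  # matrix of the same map in the new basis

v = rng.standard_normal(n)
w = Omega @ v                             # w = Omega v, computed in the old basis

v_new = np.linalg.inv(A) @ v              # coordinates of v in the new basis
w_new = np.linalg.inv(A) @ w              # coordinates of w in the new basis
print(np.allclose(Omega_new @ v_new, w_new))   # True: same map, different coordinates
```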
Definition (Inner Product, real case). An inner product on a real vector space
V is a map
⟨·, ·⟩ : V × V → R

that is linear in both variables and symmetric (⟨v, w⟩ = ⟨w, v⟩).
Our inner products will usually be positive-definite (⟨v, v⟩ ≥ 0 and ⟨v, v⟩ =
0 =⇒ v = 0), with indefinite inner products only appearing in the con-
text of special or general relativity, where an indefinite inner product on four-
dimensional space-time is used.
In the complex case, one has

⟨·, ·⟩ : V × V → C

which is conjugate-symmetric

⟨v, w⟩ = \overline{⟨w, v⟩}

as well as linear in the second variable, and antilinear in the first variable: for
α ∈ C and u, v, w ∈ V
||v||^2 = ⟨v, v⟩

v ∈ V → l_v ∈ V^*

where l_v is defined by

l_v(w) = ⟨v, w⟩
Physicists have a useful notation for elements of vector space and their duals,
for the case when V is a complex vector space with an Hermitian inner product
(such as the state space for a quantum theory). An element of such a vector
space V is written as a “ket vector”

|v⟩

where v is a label for a vector. An element of the dual vector space V ∗ is written
as a “bra vector”

⟨l|

Evaluating l ∈ V ∗ on v ∈ V gives an element of C, written

⟨l|v⟩

If Ω : V → V is a linear map

⟨l|Ω|v⟩ = ⟨l|Ωv⟩ = l(Ωv)

In the bra-ket notation, one denotes the dual vector l_v by ⟨v|. Note that in
the inner product the angle bracket notation means something different than in
the bra-ket notation. The similarity is intentional though, since in the bra-ket
notation one has

⟨v|w⟩ = ⟨v, w⟩

Note that our convention of linearity in the second variable of the inner product,
antilinearity in the first, implies

|αv⟩ = α|v⟩,  ⟨αv| = \overline{α}⟨v|
for α ∈ C.
For a choice of orthonormal basis {e_j}, i.e. satisfying

⟨e_j, e_k⟩ = δ_{jk}

a useful notation is

|j⟩ = e_j

Because of orthonormality, coefficients of vectors can be calculated as

v_j = ⟨e_j, v⟩

In bra-ket notation we have

v_j = ⟨j|v⟩

and

$$|v\rangle = \sum_{j=1}^{n}|j\rangle\langle j|v\rangle$$

With respect to the chosen orthonormal basis {e_j}, one can represent vectors
v as column vectors and the operation of taking a vector |v⟩ to a dual vector ⟨v|
corresponds to taking a column vector to the row vector that is its conjugate-
transpose:

$$\langle v| = \begin{pmatrix}\overline{v_1} & \overline{v_2} & \cdots & \overline{v_n}\end{pmatrix}$$

Then one has

$$\langle v|w\rangle = \begin{pmatrix}\overline{v_1} & \overline{v_2} & \cdots & \overline{v_n}\end{pmatrix}\begin{pmatrix}w_1\\ w_2\\ \vdots\\ w_n\end{pmatrix} = \overline{v_1}w_1 + \overline{v_2}w_2 + \cdots + \overline{v_n}w_n$$
If Ω is a linear operator Ω : V → V , then with respect to the chosen basis it
becomes a matrix with matrix elements
Ω_{kj} = ⟨k|Ωj⟩

⟨Ωv, w⟩ = ⟨v, Ω†w⟩
Generalizing the fact that

⟨αv| = \overline{α}⟨v|

for α ∈ C, one can write

⟨Ωv| = ⟨v|Ω†

Note that mathematicians tend to favor Ω∗ as notation for the adjoint of Ω,
as opposed to the physicist’s notation Ω† that we are using.
In terms of explicit matrices, since ⟨Ωv| is the conjugate-transpose of |Ωv⟩,
the matrix for Ω† will be given by the conjugate-transpose \overline{Ω^T} of the matrix for
Ω:

Ω†_{jk} = \overline{Ω_{kj}}

In the real case, the matrix for the adjoint is just the transpose matrix. We
will say that a linear transformation is self-adjoint if Ω† = Ω, skew-adjoint if
Ω† = −Ω.
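The defining property of the adjoint can be checked numerically against the conjugate-transpose (an illustrative sketch, not part of the text):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
Omega = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
Omega_dag = Omega.conj().T                # the adjoint as a conjugate-transpose

v = rng.standard_normal(n) + 1j * rng.standard_normal(n)
w = rng.standard_normal(n) + 1j * rng.standard_normal(n)

inner = lambda a, b: np.vdot(a, b)        # <a, b>, antilinear in the first argument
print(np.isclose(inner(Omega @ v, w), inner(v, Omega_dag @ w)))   # True
```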
which says that the column vectors of the matrix for Ω are orthonormal vectors.
Using instead the equivalent condition
ΩΩ† = 1
one finds that the row vectors of the matrix for Ω are also orthonormal.
Since such linear transformations preserving the inner product can be com-
posed and are invertible, they form a group, and some of the basic examples of
Lie groups are given by these groups for the cases of real and complex vector
spaces.
4.6.1 Orthogonal groups
We’ll begin with the real case, where these groups are called orthogonal groups:
Definition (Orthogonal group). The orthogonal group O(n) in n-dimensions
is the group of invertible transformations preserving an inner product on a real
n-dimensional vector space V . This is isomorphic to the group of n by n real
invertible matrices Ω satisfying
Ω−1 = ΩT
so
det(Ω) = ±1
O(n) is a continuous Lie group, with two components: SO(n), the subgroup of
orientation-preserving transformations, which include the identity, and a com-
ponent of orientation-changing transformations.
The simplest non-trivial example is for n = 2, where all elements of SO(2)
are given by matrices of the form
$$\begin{pmatrix}\cos\theta & -\sin\theta\\ \sin\theta & \cos\theta\end{pmatrix}$$
so the representation theory of SO(2) is just as for U (1), with irreducible com-
plex representations one-dimensional and classified by an integer.
In chapter 6 we will consider in detail the case of SO(3), which is crucial
for physical applications because it is the group of rotations in the physical
three-dimensional space.
4.6.2 Unitary groups
In the complex case, groups of invertible transformations preserving the Hermi-
tian inner product are called unitary groups:
Definition (Unitary group). The unitary group U (n) in n-dimensions is the
group of invertible transformations preserving an Hermitian inner product on a
complex n-dimensional vector space V . This is isomorphic to the group of n by
n complex invertible matrices satisfying
$$\Omega^{-1} = \overline{\Omega}^T = \Omega^\dagger$$

is a group homomorphism.
We have already seen the examples U (1), U (2) and SU (2). For general
values of n, the case of U (n) can be split into the study of its determinant,
which lies in U (1) so is easy to deal with, and the subgroup SU (n), which is a
much more complicated story.
Digression. Note that it is not quite true that the group U (n) is the product
group SU (n) × U (1). If one tries to identify the U (1) as the subgroup of U (n)
of elements of the form e^{iθ}1, then matrices of the form

e^{i\frac{m}{n}2\pi}1

for m an integer will lie in both SU (n) and U (1), so U (n) is not a product of
those two groups.
We saw at the end of section 3.1.2 that SU (2) can be identified with the three-
sphere S 3 , since an arbitrary group element can be constructed by specifying one
row (or one column), which must be a vector of length one in C2 . For the case
n = 3, the same sort of construction starts by picking a row of length one in C3 ,
which will be a point in S 5 . The second row must be orthonormal, and one can
show that the possibilities lie in a three-sphere S 3 . Once the first two rows are
specified, the third row is uniquely determined. So as a manifold, SU (3) is eight-
dimensional, and one might think it could be identified with S 5 ×S 3 . It turns out
that this is not the case, since the S 3 varies in a topologically non-trivial way
as one varies the point in S 5 . As spaces, the SU (n) are topologically “twisted”
products of odd-dimensional spheres, providing some of the basic examples of
quite non-trivial topological manifolds.
det(λ1 − Ω) = 0
can always be factored into linear factors, and solved for the eigenvalues λ. For
an arbitrary n by n complex matrix there will be n solutions (counting repeated
eigenvalues with multiplicity). One can always find a basis for which the matrix
will be in upper triangular form.
The case of self-adjoint matrices Ω is much more constrained, since transpo-
sition relates matrix elements. One has:
Theorem (Spectral theorem for self-adjoint matrices). Given a self-adjoint
complex n by n matrix Ω, one can always find a unitary matrix U such that
UΩU^{-1} = D

where D is a diagonal matrix with real entries (the eigenvalues of Ω).
which is the exponential of a corresponding diagonalized skew-adjoint matrix

$$\begin{pmatrix}i\lambda_1 & 0\\ 0 & i\lambda_2\end{pmatrix}$$

For matrices in the subgroup SU (2), one has λ_2 = −λ_1 = λ so in diagonal form
an SU (2) matrix will be

$$\begin{pmatrix}e^{i\lambda} & 0\\ 0 & e^{-i\lambda}\end{pmatrix}$$

which is the exponential of a corresponding diagonalized skew-adjoint matrix
that has trace zero

$$\begin{pmatrix}i\lambda & 0\\ 0 & -i\lambda\end{pmatrix}$$
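The spectral theorem is implemented by standard numerical linear algebra routines; a small sketch (not part of the text):

```python
import numpy as np

rng = np.random.default_rng(2)
B = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
Omega = B + B.conj().T                     # a self-adjoint matrix

eigenvalues, U = np.linalg.eigh(Omega)     # real eigenvalues, unitary U of eigenvectors
D = U.conj().T @ Omega @ U                 # conjugating by U diagonalizes Omega
print(np.allclose(D, np.diag(eigenvalues)))     # True
print(np.allclose(U.conj().T @ U, np.eye(3)))   # U is unitary
```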
Chapter 5
π : G → U (n)
Recall that in the case of G = U (1), we could use the homomorphism property
of π to determine π in terms of its derivative at the identity. This turns out to
be a general phenomenon for Lie groups G: we can study their representations
by considering the derivative of π at the identity, which we will call π′. Because
of the homomorphism property, knowing π′ is often sufficient to characterize
the representation π it comes from. π′ is a linear map from the tangent space
to G at the identity to the tangent space of U (n) at the identity. The tangent
space to G at the identity will carry some extra structure coming from the group
multiplication, and this vector space with this structure will be called the Lie
algebra of G.
The subject of differential geometry gives many equivalent ways of defining
the tangent space at a point of manifolds like G, but we do not want to enter
here into the subject of differential geometry in general. One of the standard
definitions of the tangent space is as the space of tangent vectors, with tangent
vectors defined as the possible velocity vectors of parametrized curves g(t) in
the group G.
More advanced treatments of Lie group theory develop this point of view
(see for example [70]) which applies to arbitrary Lie groups, whether or not
they are groups of matrices. In our case though, since we are interested in
specific groups that are explicitly given as groups of matrices, we can give a
more concrete definition, just using the exponential map on matrices. For a
more detailed exposition of this subject, using the same concrete definition of
the Lie algebra in terms of matrices, see Brian Hall’s book [27] or the abbreviated
on-line version [28].
Note that the material of this chapter is quite general, and may be hard
to make sense of until one has some experience with basic examples. The next
chapter will discuss in detail the groups SU (2) and SO(3) and their Lie algebras,
as well as giving some examples of their representations, and this may be helpful
in making sense of the general theory of this chapter.
Definition (Adjoint representation). The adjoint representation (Ad, g) is given
by the homomorphism
Ad : g ∈ G → Ad(g) ∈ GL(g)
where Ad(g) acts on X ∈ g by
(Ad(g))(X) = gXg −1
To show that this is well-defined, one needs to check that gXg^{-1} ∈ g when
X ∈ g, but this can be shown using the identity

e^{tgXg^{-1}} = ge^{tX}g^{-1}

which implies that e^{tgXg^{-1}} ∈ G if e^{tX} ∈ G. To check this, just expand the
exponential and use

(gXg^{-1})^k = (gXg^{-1})(gXg^{-1}) \cdots (gXg^{-1}) = gX^k g^{-1}
It is also easy to check that this is a homomorphism, with
Ad(g1 )Ad(g2 ) = Ad(g1 g2 )
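As a quick numerical illustration (my own, not part of the text, using NumPy and SciPy), one can check on random SU(2) elements that Ad(g)X = gXg^{-1} stays skew-Hermitian and traceless, and that Ad is a homomorphism:

import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)

def random_su2_element():
    # exponentiate a random skew-Hermitian, trace-zero matrix (an element of su(2))
    a = rng.normal(size=3)
    X = np.array([[1j*a[0], a[1] + 1j*a[2]],
                  [-a[1] + 1j*a[2], -1j*a[0]]])
    return expm(X)

def Ad(g):
    return lambda X: g @ X @ np.linalg.inv(g)

g1, g2 = random_su2_element(), random_su2_element()
X = np.array([[1j, 2 + 3j], [-2 + 3j, -1j]])        # an element of su(2)

Y = Ad(g1)(X)
print(np.allclose(Y, -Y.conj().T), np.isclose(np.trace(Y), 0))   # True True: Ad(g)X is again in su(2)
print(np.allclose(Ad(g1)(Ad(g2)(X)), Ad(g1 @ g2)(X)))            # True: homomorphism property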
A Lie algebra g is not just a real vector space, but comes with an extra
structure on the vector space
Definition (Lie bracket). The Lie bracket operation on g is the bilinear anti-
symmetric map given by the commutator of matrices
[·, ·] : (X, Y ) ∈ g × g → [X, Y ] = XY − Y X ∈ g
We need to check that this is well-defined, i.e. that it takes values in g.
Theorem. If X, Y ∈ g, [X, Y ] = XY − Y X ∈ g.
Proof. Since X ∈ g, we have etX ∈ G and we can act on Y ∈ g by the adjoint
representation
Ad(etX )Y = etX Y e−tX ∈ g
As t varies this gives us a parametrized curve in g. Its velocity vector will also
be in g, so
\frac{d}{dt}\left(e^{tX} Y e^{-tX}\right) ∈ g
One has (by the product rule, which can easily be shown to apply in this case)
\frac{d}{dt}\left(e^{tX} Y e^{-tX}\right) = \left(\frac{d}{dt}(e^{tX} Y)\right) e^{-tX} + e^{tX} Y \left(\frac{d}{dt} e^{-tX}\right) = X e^{tX} Y e^{-tX} - e^{tX} Y X e^{-tX}
Evaluating this at t = 0 gives
XY − Y X
which is thus shown to be in g.
The relation
\frac{d}{dt}\left(e^{tX} Y e^{-tX}\right)\Big|_{t=0} = [X, Y] \qquad (5.1)
used in this proof will be continually useful in relating Lie groups and Lie alge-
bras.
To do calculations with a Lie algebra, one can just choose a basis X1 , X2 , . . . , Xn
for the vector space g, and use the fact that the Lie bracket can be written in
terms of this basis as
[X_j, X_k] = \sum_{l=1}^{n} c_{jkl} X_l
where the c_{jkl} are a set of constants known as the "structure constants" of the Lie algebra. For example, in the case of su(2), the Lie algebra of SU(2), one has a basis X_1, X_2, X_3 satisfying

[X_j, X_k] = \sum_{l=1}^{3} \epsilon_{jkl} X_l

so the structure constants of su(2) are just the totally antisymmetric \epsilon_{jkl}.
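These relations are easy to test numerically. The following short check (my own, using the standard realization X_j = -i σ_j/2 by Pauli matrices, which appears later in the text) is one way to do it:

import numpy as np

sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]]),
         np.array([[1, 0], [0, -1]], dtype=complex)]
X = [-1j * s / 2 for s in sigma]      # a basis of su(2)

def comm(A, B):
    return A @ B - B @ A

print(np.allclose(comm(X[0], X[1]), X[2]))   # [X1, X2] = X3
print(np.allclose(comm(X[1], X[2]), X[0]))   # [X2, X3] = X1
print(np.allclose(comm(X[2], X[0]), X[1]))   # [X3, X1] = X2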
The condition
ΩΩ† = 1
thus becomes
e^{tX}\left(e^{tX}\right)^\dagger = e^{tX} e^{tX^\dagger} = 1
Taking the derivative of this equation gives

e^{tX} X^\dagger e^{tX^\dagger} + X e^{tX} e^{tX^\dagger} = 0

and evaluating at t = 0 gives

X + X^\dagger = 0

so

X^\dagger = -X
Note that physicists often choose to define the Lie algebra in these cases
as self-adjoint matrices, then multiplying by i before exponentiating to get a
group element. We will not use this definition, with one reason that we want to
think of the Lie algebra as a real vector space, so want to avoid an unnecessary
introduction of complex numbers at this point.
For the orthogonal group SO(n), a group element connected to the identity can be written as

Ω = e^{tX}

with e^{sX}, s ∈ [0, t], giving a path connecting Ω to the identity. We
saw above that the condition ΩT = Ω−1 corresponds to skew-symmetry of the
matrix X
X T = −X
So in the case of G = SO(n), we see that the Lie algebra so(n) is the space of
skew-symmetric (X T = −X) n by n real matrices, together with the bilinear,
antisymmetric product given by the commutator:

(X, Y) ∈ so(n) × so(n) → [X, Y] ∈ so(n)
In chapter 6 we will examine in detail the n = 3 case, where the Lie algebra
so(3) is R3 , realized as the space of antisymmetric real 3 by 3 matrices.
5.2.2 Lie algebra of the unitary group
For the case of the group U (n), the group is connected and one can write all
group elements as etX , where now X is a complex n by n matrix. The unitarity
condition implies that X is skew-adjoint (also called skew-Hermitian), satisfying
X † = −X
So the Lie algebra u(n) is the space of skew-adjoint n by n complex matrices,
together with the bilinear, antisymmetric product given by the commutator:
(X, Y ) ∈ u(n) × u(n) → [X, Y ] ∈ u(n)
Note that these matrices form a subspace of C^{n^2} of half the real dimension, so of real dimension n^2. u(n) is a real vector space of dimension n^2, but it
is NOT a space of real n by n matrices. It is the space of skew-Hermitian
matrices, which in general are complex. While the matrices are complex, only
real linear combinations of skew-Hermitian matrices are skew-Hermitian (recall
that multiplication by i changes a skew-Hermitian matrix into a Hermitian
matrix). Within this space of complex matrices, if one looks at the subspace of
real matrices one gets the sub-Lie algebra so(n) of antisymmetric matrices (the
Lie algebra of SO(n) ⊂ U (n)).
Given any complex matrix Z ∈ M(n, C), one can write it as a sum

Z = \frac{1}{2}(Z + Z^\dagger) + \frac{1}{2}(Z - Z^\dagger)

where the first term is self-adjoint, the second skew-Hermitian. The second term can also be written as i times a self-adjoint matrix

\frac{1}{2}(Z - Z^\dagger) = i\left(\frac{1}{2i}(Z - Z^\dagger)\right)

so we see that we can get all of M(n, C) by taking complex linear combinations of self-adjoint matrices.
There is an identity relating the determinant and the trace of a matrix
det(e^X) = e^{trace(X)}
which can be proved by conjugating the matrix to upper-triangular form and
using the fact that the trace and the determinant of a matrix are conjugation-
invariant. Since the determinant of an SU (n) matrix is 1, this shows that the
Lie algebra su(n) of SU (n) will consist of matrices that are not only skew-
Hermitian, but also of trace zero. So in this case su(n) is again a real vector
space, of dimension n2 − 1.
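Both facts are easy to test numerically. Here is a small sketch of mine (using scipy.linalg.expm) checking the determinant-trace identity for a random complex matrix, and that exponentiating a traceless skew-Hermitian matrix lands in SU(n):

import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)
n = 4
Z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
print(np.isclose(np.linalg.det(expm(Z)), np.exp(np.trace(Z))))   # True

X = (Z - Z.conj().T) / 2                  # skew-Hermitian part of Z
X -= (np.trace(X) / n) * np.eye(n)        # subtract the trace: X is now in su(n)
U = expm(X)
print(np.allclose(U @ U.conj().T, np.eye(n)),    # unitary
      np.isclose(np.linalg.det(U), 1))           # determinant 1, so U is in SU(n)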
The fact that U(n) and u(n) matrices can be diagonalized by conjugation by a unitary matrix can be used to show that any U(n) matrix can be written as an exponential of something in the Lie algebra. The corresponding theorem is also
true for SO(n) but requires looking at diagonalization into 2 by 2 blocks. It is
not true for O(n) (you can’t reach the disconnected component of the identity
by exponentiation). It also turns out to not be true for the groups GL(n, R)
and GL(n, C) for n ≥ 2.
5.3 Lie algebra representations
We have defined a group representation as a homomorphism (a map of groups preserving group multiplication)

π : G → GL(n, C)

A Lie algebra representation is defined analogously, as a linear map

φ : g → gl(n, C) = M(n, C)

preserving the Lie bracket, i.e., satisfying

φ([X, Y]) = [φ(X), φ(Y)]

For a unitary group representation, with π taking values in U(n), φ takes values in u(n), satisfying

φ(X)^\dagger = -φ(X)

In terms of a basis X_1, \dots, X_d of g with structure constants c_{jkl}, preserving the Lie bracket means

[φ(X_j), φ(X_k)] = \sum_{l=1}^{d} c_{jkl}\, φ(X_l)

Given a group representation π, one gets a Lie algebra representation π' by differentiating at the identity:

π'(X) = \frac{d}{dt}\left(π(e^{tX})\right)\Big|_{t=0}
In the case of U (1) we classified all irreducible representations (homomor-
phisms U (1) → GL(1, C) = C∗ ) by looking at the derivative of the map at
the identity. For general Lie groups G, one can do something similar, show-
ing that a representation π of G gives a representation of the Lie algebra (by
taking the derivative at the identity), and then trying to classify Lie algebra
representations.
Theorem. The map

π' : X ∈ g → π'(X) = \frac{d}{dt}\left(π(e^{tX})\right)\Big|_{t=0} ∈ gl(n, C) = M(n, C)

satisfies
1.

π(e^{tX}) = e^{tπ'(X)}

2. For g ∈ G

π'(gXg^{-1}) = π(g)\, π'(X)\, (π(g))^{-1}

3. π' is a Lie algebra homomorphism:

π'([X, Y]) = [π'(X), π'(Y)]
Proof. 1. We have

\frac{d}{dt} π(e^{tX}) = \frac{d}{ds} π(e^{(t+s)X})\Big|_{s=0}
= \frac{d}{ds} π(e^{tX} e^{sX})\Big|_{s=0}
= π(e^{tX}) \frac{d}{ds} π(e^{sX})\Big|_{s=0}
= π(e^{tX})\, π'(X)

So f(t) = π(e^{tX}) satisfies the differential equation \frac{d}{dt} f = f π'(X) with initial condition f(0) = 1. This has the unique solution f(t) = e^{tπ'(X)}.
2. We have

e^{tπ'(gXg^{-1})} = π(e^{tgXg^{-1}})
= π(g e^{tX} g^{-1})
= π(g)\, π(e^{tX})\, π(g)^{-1}
= π(g)\, e^{tπ'(X)}\, π(g)^{-1}

Differentiating with respect to t at t = 0 gives the result.
This theorem shows that we can study Lie group representations (π, V )
by studying the corresponding Lie algebra representation (π 0 , V ). This will
generally be much easier since the π 0 (X) are just linear maps. We will proceed
in this manner in chapter 8 when we construct and classify all SU (2) and SO(3)
representations, finding that the corresponding Lie algebra representations are
much simpler to analyze.
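As an illustration of part 1 of the theorem (a sketch of mine, not from the text), one can take π to be the adjoint representation of SU(2) and check numerically that Ad(e^{tX}) = e^{t ad(X)} as 3 by 3 matrices in the basis X_1, X_2, X_3:

import numpy as np
from scipy.linalg import expm

sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]]),
         np.array([[1, 0], [0, -1]], dtype=complex)]
B = [-1j * s / 2 for s in sigma]          # basis X1, X2, X3 of su(2)

def coords(Y):
    # coordinates of Y in the basis B, using the inner product trace(A^dagger B)
    return np.array([2 * np.real(np.trace(Xj.conj().T @ Y)) for Xj in B])

def matrix_of(linear_map):
    # 3x3 matrix of a linear map on su(2) in the basis B
    return np.column_stack([coords(linear_map(Xj)) for Xj in B])

X = 0.3 * B[0] + 1.1 * B[1] - 0.7 * B[2]
t = 0.9
g = expm(t * X)
Ad_g = matrix_of(lambda Y: g @ Y @ np.linalg.inv(g))
ad_X = matrix_of(lambda Y: X @ Y - Y @ X)
print(np.allclose(Ad_g, expm(t * ad_X)))   # True: Ad(e^{tX}) = e^{t ad(X)}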
For any Lie group G, we have seen that there is a distinguished representa-
tion, the adjoint representation (Ad, g). The corresponding Lie algebra represen-
tation is also called the adjoint representation, but written as (Ad0 , g) = (ad, g).
From the fact that
Ad(etX )(Y ) = etX Y e−tX
we can differentiate with respect to t to get the Lie algebra representation
ad(X)(Y) = \frac{d}{dt}\left(e^{tX} Y e^{-tX}\right)\Big|_{t=0} = [X, Y] \qquad (5.3)
From this we see that one can define the linear map

ad(X) : Y ∈ g → [X, Y] ∈ g

Note that this linear map ad(X), which one can write as [X, ·], can be thought of as the infinitesimal version of the conjugation action

Y → e^{tX} Y e^{-tX}

The fact that ad is a Lie algebra homomorphism can be written as

ad([X, Y]) = ad(X) ◦ ad(Y) − ad(Y) ◦ ad(X)

where these are linear maps on g, with ◦ composition of linear maps, so operating on Z ∈ g we have

[[X, Y], Z] = [X, [Y, Z]] − [Y, [X, Z]]
This is called the Jacobi identity. It could have been more simply derived as
an identity about matrix multiplication, but here we see that it is true for a
more abstract reason, reflecting the existence of the adjoint representation. It
can be written in other forms, rearranging terms using antisymmetry of the
commutator, with one example being the sum of cyclic permutations

[X, [Y, Z]] + [Y, [Z, X]] + [Z, [X, Y]] = 0
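A quick numerical sanity check of the Jacobi identity for matrix commutators (an illustration of mine, not in the text):

import numpy as np

rng = np.random.default_rng(2)
X, Y, Z = (rng.normal(size=(3, 3)) for _ in range(3))

def comm(A, B):
    return A @ B - B @ A

lhs = comm(X, comm(Y, Z))
rhs = comm(comm(X, Y), Z) + comm(Y, comm(X, Z))
print(np.allclose(lhs, rhs))   # True for any three matrices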
Definition (Abstract Lie algebra). An abstract Lie algebra over a field k is a
vector space A over k, with a bilinear operation

[·, ·] : (X, Y) ∈ A × A → [X, Y] ∈ A

satisfying

1. Antisymmetry:

[X, Y] = −[Y, X]

2. Jacobi identity:

[X, [Y, Z]] + [Y, [Z, X]] + [Z, [X, Y]] = 0
Such Lie algebras do not need to be defined as matrices, and their Lie bracket
operation does not need to be defined in terms of a matrix commutator (al-
though the same notation continues to be used). Later on in this course we
will encounter important examples of Lie algebras that are defined in this more
abstract way.
5.4 Complexification
The way we have defined a Lie algebra g, it is a real vector space, not a complex
vector space. Even if G is a group of complex matrices, when it is not GL(n, C)
itself but some subgroup, its tangent space at the identity will not necessarily
be a complex vector space. Consider for example the cases G = U (1) and
G = SU (2), where u(1) = R and su(2) = R3 . While the tangent space to the
group of all invertible complex matrices is a complex vector space, imposing
some condition such as unitarity picks out a subspace which generally is just a
real vector space, not a complex one. So the adjoint representation (Ad, g) is in
general not a complex representation, but a real representation, with
ad : X ∈ g → ad(X) ∈ gl(dim g, R)
and once we pick a basis of g, we can identify gl(dim g, R) = M (dim g, R). So,
for each X ∈ g we get a real linear operator on a real vector space.
We would however often like to work with not real representations, but
complex representations, since it is for these that Schur’s lemma applies, and
representation operators can be diagonalized. To get from a real Lie algebra
representation to a complex one, we can “complexify”, extending the action of
real scalars to complex scalars. If we are working with real matrices, complex-
ification is nothing but allowing complex entries and using the same rules for
multiplying scalars as before.
More generally, for any real vector space we can define:
Definition. The complexification V_C of a real vector space V is the space of pairs (v_1, v_2) of elements of V, with multiplication by a + bi ∈ C given by

(a + bi)(v_1, v_2) = (av_1 − bv_2, bv_1 + av_2)

One can think of the pair (v_1, v_2) as v_1 + iv_2 and write

V_C = V + iV

with v_1 in the first copy of V, v_2 in the second copy. Then the rule for multiplication by a complex number comes from the standard rules for complex multiplication. In the cases we will be interested in, this level of abstraction is not really needed, since V will be given as a subspace of a complex vector space, and V_C will just be the larger subspace one gets by taking complex linear combinations of elements of V.
Given a real Lie algebra g, the complexification gC is pairs of elements
(X1 , X2 ) of g, with the above rule for multiplication by complex scalars. The
Lie bracket on g extends to a Lie bracket on g_C by the rule

[X_1 + iX_2, Y_1 + iY_2] = ([X_1, Y_1] − [X_2, Y_2]) + i([X_1, Y_2] + [X_2, Y_1])

and g_C is a Lie algebra over the complex numbers. In many cases this definition
is isomorphic to something just defined in terms of complex matrices, with the
simplest case
gl(n, R)C = gl(n, C)
Recalling our discussion from section 5.2.2 of u(n), a real Lie algebra, with
elements certain (skew-Hermitian) complex matrices, one can see that complex-
ifying will just give all complex matrices so
u(n)C = gl(n, C)
This example shows that two different real Lie algebras may have the same com-
plexification. For yet another example, since so(n) is the Lie algebra of all real
antisymmetric matrices, so(n)C is the Lie algebra of all complex antisymmetric
matrices.
We can extend the operators ad(X) on g by complex linearity to turn ad
into a complex representation of gC on the vector space gC itself
ad : Z ∈ gC → ad(Z)
A good reference at the level of this course is the book Naive Lie Theory [60]. It covers basics of
Lie groups and Lie algebras, but without representations. The notes [28] and
book [27] of Brian Hall are a good source to study from. Some parts of the
proofs given here are drawn from those notes.
Chapter 6
Among the basic symmetry groups of the physical world is the orthogonal group
SO(3) of rotations about a point in three-dimensional space. The observables
one gets from this group are the components of angular momentum, and under-
standing how the state space of a quantum system behaves as a representation
of this group is a crucial part of the analysis of atomic physics examples and
many others. This is a topic one will find in some version or other in every
quantum mechanics textbook.
Remarkably, it turns out that the quantum systems in nature are often
representations not of SO(3), but of a larger group called Spin(3), one that has
two elements corresponding to every element of SO(3). Such a group exists in
any dimension n, always as a “doubled” version of the orthogonal group SO(n),
one that is needed to understand some of the more subtle aspects of geometry
in n dimensions. In the n = 3 case it turns out that Spin(3) ' SU (2) and we
will study in detail the relationship of SO(3) and SU (2). This appearance of
the unitary group SU (2) is special to geometry in 3 and 4 dimensions, and we
will see that quaternions provide an explanation for this.
This can be written as an exponential, R(θ) = e^{θL} = \cos θ \, 1 + L \sin θ, for

L = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}
Here SO(2) is a commutative Lie group with Lie algebra so(2) = R (it is one-
dimensional, with trivial Lie bracket, all elements of the Lie algebra commute).
Note that we have a representation on V = R2 here, but it is a real representa-
tion, not one of the complex ones we have when we have a representation on a
quantum mechanical state space.
In three dimensions the group SO(3) is 3-dimensional and non-commutative.
Choosing a unit vector w and angle θ, one gets an element R(θ, w) of SO(3),
rotation by θ about the w axis. Using standard basis vectors ej , rotations about
the coordinate axes are given by
R(θ, e_1) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos θ & -\sin θ \\ 0 & \sin θ & \cos θ \end{pmatrix}, \quad R(θ, e_2) = \begin{pmatrix} \cos θ & 0 & \sin θ \\ 0 & 1 & 0 \\ -\sin θ & 0 & \cos θ \end{pmatrix}

R(θ, e_3) = \begin{pmatrix} \cos θ & -\sin θ & 0 \\ \sin θ & \cos θ & 0 \\ 0 & 0 & 1 \end{pmatrix}
A standard parametrization for elements of SO(3) is in terms of 3 “Euler angles”
φ, θ, ψ with a general rotation given by
i.e. first a rotation about the z-axis by an angle φ, then a rotation by an angle
θ about the new x-axis, followed by a rotation by ψ about the new z-axis.
Multiplying out the matrices gives a rather complicated expression for a rotation
in terms of the three angles, and one needs to figure out what range to choose
for the angles to avoid multiple counting.
The infinitesimal picture near the identity of the group, given by the Lie
algebra structure on so(3), is much easier to understand. Recall that for orthog-
onal groups the Lie algebra can be identified with the space of antisymmetric
matrices, so one in this case has a basis

l_1 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix}, \quad l_2 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ -1 & 0 & 0 \end{pmatrix}, \quad l_3 = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}

which satisfy the commutation relations

[l_1, l_2] = l_3, \quad [l_2, l_3] = l_1, \quad [l_3, l_1] = l_2
Note that these are exactly the same commutation relations satisfied by
the basis vectors X1 , X2 , X3 of the Lie algebra su(2), so so(3) and su(2) are
isomorphic Lie algebras. They both are the vector space R3 with the same Lie
bracket operation on pairs of vectors. This operation is familiar in yet another
context, that of the cross-product of standard basis vectors ej in R3 :
e1 × e2 = e3 , e2 × e3 = e1 , e3 × e1 = e2
The Lie bracket can thus be realized as the bilinear map

(X, Y) ∈ R^3 × R^3 → [X, Y] ∈ R^3

given on vectors by the cross-product

[v, w] = v × w
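This correspondence between the matrix commutator and the cross-product is easy to confirm numerically; the following is a small check of mine using the basis l_1, l_2, l_3 given above:

import numpy as np

l = [np.array([[0, 0, 0], [0, 0, -1], [0, 1, 0]]),
     np.array([[0, 0, 1], [0, 0, 0], [-1, 0, 0]]),
     np.array([[0, -1, 0], [1, 0, 0], [0, 0, 0]])]

def hat(v):
    # v in R^3  ->  the corresponding antisymmetric matrix in so(3)
    return v[0] * l[0] + v[1] * l[1] + v[2] * l[2]

rng = np.random.default_rng(3)
v, w = rng.normal(size=3), rng.normal(size=3)
print(np.allclose(hat(v) @ hat(w) - hat(w) @ hat(v), hat(np.cross(v, w))))   # True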
Something very special that happens for orthogonal groups only in three di-
mensions is that the vector representation (the defining representation of SO(n)
matrices on Rn ) is isomorphic to the adjoint representation. Recall that any Lie
group G has a representation (Ad, g) on its Lie algebra g. so(n) can be identified
with the antisymmetric n by n matrices, so is of (real) dimension (n^2 − n)/2. Only
for n = 3 is this equal to n, the dimension of the representation on vectors in
Rn . This corresponds to the geometrical fact that only in 3 dimensions is a
plane (in all dimensions rotations are built out of rotations in various planes)
determined uniquely by a vector (the vector perpendicular to the plane). Equiv-
alently, only in 3 dimensions is there a cross-product v × w which takes two
vectors determining a plane to a unique vector perpendicular to the plane.
The isomorphism between the vector representation (πvector , R3 ) on column
vectors and the adjoint representation (Ad, so(3)) on antisymmetric matrices is
given by
\begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix} \leftrightarrow v_1 l_1 + v_2 l_2 + v_3 l_3 = \begin{pmatrix} 0 & -v_3 & v_2 \\ v_3 & 0 & -v_1 \\ -v_2 & v_1 & 0 \end{pmatrix}
or in terms of bases by
ej ↔ lj
For the vector representation on column vectors, π_{vector}(g) = g and π'_{vector}(X) = X, where X is an antisymmetric 3 by 3 matrix, and g = e^X is an orthogonal 3 by 3 matrix. Both act on column vectors by the usual matrix multiplication.
For the adjoint representation on antisymmetric matrices, one has
Ad(g) \begin{pmatrix} 0 & -v_3 & v_2 \\ v_3 & 0 & -v_1 \\ -v_2 & v_1 & 0 \end{pmatrix} = g \begin{pmatrix} 0 & -v_3 & v_2 \\ v_3 & 0 & -v_1 \\ -v_2 & v_1 & 0 \end{pmatrix} g^{-1}

The corresponding Lie algebra representation is given by

ad(X) \begin{pmatrix} 0 & -v_3 & v_2 \\ v_3 & 0 & -v_1 \\ -v_2 & v_1 & 0 \end{pmatrix} = \left[ X, \begin{pmatrix} 0 & -v_3 & v_2 \\ v_3 & 0 & -v_1 \\ -v_2 & v_1 & 0 \end{pmatrix} \right]
where X is a 3 by 3 antisymmetric matrix.
One can explicitly check that these representations are isomorphic, for in-
stance by calculating how basis elements lj ∈ so(3) act. On vectors, these lj act
by matrix multiplication, giving for instance, for j = 1
l1 e1 = 0, l1 e2 = e3 , l1 e3 = −e2
On antisymmetric matrices one has instead the isomorphic relations
(ad(l1 ))(l1 ) = 0, (ad(l1 ))(l2 ) = l3 , (ad(l1 ))(l3 ) = −l2
6.2.1 Quaternions
The quaternions are a number system (denoted by H) generalizing the complex
number system, with elements q ∈ H that can be written as
q = q0 + q1 i + q2 j + q3 k, qi ∈ R
with i, j, k ∈ H satisfying

i^2 = j^2 = k^2 = -1, \quad ij = -ji = k, \quad jk = -kj = i, \quad ki = -ik = j

Quaternionic conjugation is the map

q → \bar{q} = q_0 − q_1 i − q_2 j − q_3 k

which satisfies, for u, v ∈ H,

\overline{uv} = \bar{v}\bar{u}
Using

\frac{q\bar{q}}{|q|^2} = 1

where |q|^2 = q\bar{q} = q_0^2 + q_1^2 + q_2^2 + q_3^2 is the squared length, one has a formula for the inverse of a quaternion

q^{-1} = \frac{\bar{q}}{|q|^2}
The length one quaternions thus form a group under multiplication, called
Sp(1). There are also Lie groups called Sp(n) for larger values of n, consisting
of invertible matrices with quaternionic entries that act on quaternionic vectors
preserving the quaternionic length-squared, but these play no significant role in
quantum mechanics so we won’t study them further. Sp(1) can be identified
with the three-dimensional sphere S^3, since the length one condition on q is

q_0^2 + q_1^2 + q_2^2 + q_3^2 = 1

the equation of the unit sphere in R^4 = H.
6.2.2 Rotations and spin groups in four dimensions
Pairs (u, v) of unit quaternions give the product group Sp(1) × Sp(1). An
element of this group acts on H = R4 by
q → uqv
Later on in the course we’ll encounter Spin(4) and SO(4) again, but for now
we’re interested in the subgroup Spin(3) that only acts non-trivially on 3 of the
dimensions, and double-covers not SO(4) but SO(3). To find this, consider the
subgroup of Spin(4) consisting of pairs (u, v) of the form (u, u−1 ) (a subgroup
isomorphic to Sp(1), since elements correspond to a single unit length quaternion
u). This subgroup acts on quaternions by conjugation
q → uqu−1
an action which is trivial on the real quaternions, but nontrivial on the “pure
imaginary” quaternions of the form
q = ~v = v1 i + v2 j + v3 k
~v → u~v u−1
Both u and −u act in the same way on ~v , so we have two elements in
Sp(1) corresponding to the same element in SO(3). One can show that Φ is a
surjective map (one can get any element of SO(3) this way), so it is what is called
a “covering” map, specifically a two-fold cover. It makes Sp(1) a double-cover of
SO(3), and we give this the name “Spin(3)”. This also allows us to characterize
more simply SO(3) as a geometrical space. It is S 3 = Sp(1) = Spin(3) with
opposite points on the three-sphere identified. This space is known as RP(3),
real projective 3-space, which can also be thought of as the space of lines through
the origin in R4 (each such line intersects S 3 in two opposite points).
For those who have seen some topology, note that the covering map Φ is
an example of a topologically non-trivial cover. It is just not true that topologically S^3 ≅ RP^3 × {+1, −1}: S^3 is a connected space, not two disconnected
pieces. This topological non-triviality implies that globally there is no possible
homomorphism going in the opposite direction from Φ (i.e. SO(3) → Spin(3)).
One can do this locally, picking a local patch in SO(3) and taking the inverse
of Φ to a local patch in Spin(3), but this won’t work if we try and extend it
globally to all of SO(3).
The identification R2 = C allowed us to represent elements of the unit circle
group U (1) as exponentials eiθ , where iθ was in the Lie algebra u(1) = R of
U (1). For Sp(1) one can do much the same thing, with the Lie algebra sp(1)
now the space of all pure imaginary quaternions, which one can identify with
R3 by
w = \begin{pmatrix} w_1 \\ w_2 \\ w_3 \end{pmatrix} ∈ R^3 \leftrightarrow \vec{w} = w_1 i + w_2 j + w_3 k ∈ H
Unlike the U (1) case, there’s a non-trivial Lie bracket, just the commutator of
quaternions.
Elements of the group Sp(1) are given by exponentiating such Lie algebra
elements, which we will write in the form

u(θ, w) = e^{θ\vec{w}} = \cos θ + \vec{w} \sin θ
where θ ∈ R and \vec{w} is a purely imaginary quaternion of unit length. Taking θ as a parameter, these give paths in Sp(1) going through the identity at θ = 0, with velocity vector \vec{w}, since

\frac{d}{dθ} u(θ, w)\Big|_{θ=0} = (−\sin θ + \vec{w} \cos θ)\Big|_{θ=0} = \vec{w}
Theorem 6.1.
Φ(u(θ, w)) = R(2θ, w)
Proof. First consider the special case w = e_3 of rotations about the 3-axis. Then

u(θ, e_3) = e^{θk} = \cos θ + k \sin θ

and

u(θ, e_3)^{-1} = e^{-θk} = \cos θ − k \sin θ
so Φ(u(θ, e_3)) is the rotation that takes v (identified with the quaternion \vec{v} = v_1 i + v_2 j + v_3 k) to

u(θ, e_3)\vec{v}\,u(θ, e_3)^{-1} = (v_1 \cos 2θ − v_2 \sin 2θ)i + (v_1 \sin 2θ + v_2 \cos 2θ)j + v_3 k \qquad (6.2)

which is exactly the rotation R(2θ, e_3) acting on v.
One can readily do the same calculation for the case of e1 , then use the
Euler angle parametrization of equation 6.1 to show that a general u(θ, w) can
be written as a product of the cases already worked out.
Notice that as θ goes from 0 to 2π, u(θ, w) traces out a circle in Sp(1). The
homomorphism Φ takes this to a circle in SO(3), one that gets traced out twice
as θ goes from 0 to 2π, explicitly showing the nature of the double covering
above that particular circle in SO(3).
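Theorem 6.1 is straightforward to check numerically. The following sketch (mine, with an ad hoc quaternion multiplication routine written for this purpose) conjugates a pure imaginary quaternion by u(θ, e_3) and confirms that u and −u give the same rotation:

import numpy as np

def qmult(a, b):
    # quaternion product, components ordered as (q0, q1, q2, q3)
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return np.array([a0*b0 - a1*b1 - a2*b2 - a3*b3,
                     a0*b1 + a1*b0 + a2*b3 - a3*b2,
                     a0*b2 - a1*b3 + a2*b0 + a3*b1,
                     a0*b3 + a1*b2 - a2*b1 + a3*b0])

def qconj(a):
    return np.array([a[0], -a[1], -a[2], -a[3]])

def rotate(u, v):
    # v in R^3, u a unit quaternion; returns the vector part of u v u^{-1}
    q = qmult(qmult(u, np.array([0.0, *v])), qconj(u))
    return q[1:]

theta = 0.7
u = np.array([np.cos(theta), 0.0, 0.0, np.sin(theta)])   # u(theta, e3) = cos(theta) + k sin(theta)
v = np.array([1.0, 2.0, 3.0])
print(rotate(u, v))                                # v rotated by angle 2*theta about the 3-axis
print(np.allclose(rotate(u, v), rotate(-u, v)))    # True: u and -u act identically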
The derivative of the map Φ will be a Lie algebra homomorphism, a linear
map
Φ0 : sp(1) → so(3)
It takes the Lie algebra sp(1) of pure imaginary quaternions to the Lie algebra
so(3) of 3 by 3 antisymmetric real matrices. One can compute it easily on basis
vectors, using for instance equation 6.2 above to find, for the case \vec{w} = k,

Φ'(k) = \frac{d}{dθ} Φ(\cos θ + k \sin θ)\Big|_{θ=0}
= \begin{pmatrix} -2\sin 2θ & -2\cos 2θ & 0 \\ 2\cos 2θ & -2\sin 2θ & 0 \\ 0 & 0 & 0 \end{pmatrix}\Bigg|_{θ=0}
= \begin{pmatrix} 0 & -2 & 0 \\ 2 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} = 2l_3

Similar calculations give Φ'(i) = 2l_1 and Φ'(j) = 2l_2, so Φ' identifies the bases

\frac{i}{2}, \frac{j}{2}, \frac{k}{2} \quad \text{and} \quad l_1, l_2, l_3

of sp(1) and so(3). These bases satisfy the same commutation relations, since

\left[\frac{i}{2}, \frac{j}{2}\right] = \frac{k}{2}, \quad \left[\frac{j}{2}, \frac{k}{2}\right] = \frac{i}{2}, \quad \left[\frac{k}{2}, \frac{i}{2}\right] = \frac{j}{2}
we see that the length-squared function on quaternions corresponds to the de-
terminant function on 2 by 2 complex matrices. Taking q ∈ Sp(1), so of length
one, the corresponding complex matrix is in SU (2).
Under this identification of H with 2 by 2 complex matrices, we have an
identification of Lie algebras sp(1) = su(2) between pure imaginary quaternions
and skew-Hermitian trace-zero 2 by 2 complex matrices
\vec{w} = w_1 i + w_2 j + w_3 k \leftrightarrow \begin{pmatrix} -iw_3 & -w_2 - iw_1 \\ w_2 - iw_1 & iw_3 \end{pmatrix} = -i\, w \cdot σ
The basis \frac{i}{2}, \frac{j}{2}, \frac{k}{2} gets identified with a basis for the Lie algebra su(2) which, written in terms of the Pauli matrices, is

X_j = -i\frac{σ_j}{2}

with the X_j satisfying the commutation relations

[X_1, X_2] = X_3, \quad [X_2, X_3] = X_1, \quad [X_3, X_1] = X_2
We now have no less than three isomorphic Lie algebras sp(1) = su(2) =
so(3), with elements that get identified as follows:

w_1\frac{i}{2} + w_2\frac{j}{2} + w_3\frac{k}{2} \;\leftrightarrow\; -\frac{i}{2}\begin{pmatrix} w_3 & w_1 - iw_2 \\ w_1 + iw_2 & -w_3 \end{pmatrix} \;\leftrightarrow\; \begin{pmatrix} 0 & -w_3 & w_2 \\ w_3 & 0 & -w_1 \\ -w_2 & w_1 & 0 \end{pmatrix}
Taking complex linear combinations of skew-Hermitian trace-zero 2 by 2 complex matrices just
gives all trace-zero 2 by 2 matrices (the Lie algebra sl(2, C)).
In addition, recall that there is a fourth isomorphic version of this repre-
sentation, the representation of SO(3) on column vectors. This is also a real
representation, but can straightforwardly be complexified. Since so(3) and su(2)
are isomorphic Lie algebras, their complexifications so(3)C and sl(2, C) will also
be isomorphic.
In terms of 2 by 2 complex matrices, one can exponentiate Lie algebra ele-
ments to get group elements in SU (2) and define
Ω(θ, w) = e^{θ(w_1 X_1 + w_2 X_2 + w_3 X_3)} = e^{-i\frac{θ}{2} w\cdot σ} \qquad (6.3)
= \cos\left(\frac{θ}{2}\right)1 − i(w\cdot σ)\sin\left(\frac{θ}{2}\right) \qquad (6.4)
Transposing the argument of theorem 6.1 from H to complex matrices, one finds
that, identifying
v \leftrightarrow v\cdot σ = \begin{pmatrix} v_3 & v_1 - iv_2 \\ v_1 + iv_2 & -v_3 \end{pmatrix}
one has
Φ(Ω(θ, w)) = R(θ, w)
Note that in changing from the quaternionic to complex case, we are treating
the factor of 2 differently, since in the future we will want to use Ω(θ, w) to
perform rotations by an angle θ. In terms of the identification SU (2) = Sp(1),
we have Ω(θ, w) = u(θ/2, w).
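The relation Φ(Ω(θ, w)) = R(θ, w) can be checked numerically; here is a small sketch of mine, for rotation about the 3-axis:

import numpy as np
from scipy.linalg import expm

sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]]),
         np.array([[1, 0], [0, -1]], dtype=complex)]

def dot_sigma(v):
    return v[0] * sigma[0] + v[1] * sigma[1] + v[2] * sigma[2]

theta = 1.2
w = np.array([0.0, 0.0, 1.0])                   # rotate about the 3-axis
Omega = expm(-0.5j * theta * dot_sigma(w))      # equations 6.3, 6.4
R = np.array([[np.cos(theta), -np.sin(theta), 0],
              [np.sin(theta),  np.cos(theta), 0],
              [0, 0, 1]])

v = np.array([1.0, -2.0, 0.5])
lhs = Omega @ dot_sigma(v) @ Omega.conj().T     # Omega is unitary, so its inverse is its adjoint
print(np.allclose(lhs, dot_sigma(R @ v)))       # True: conjugation by Omega rotates v by R(theta, w)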
Recall that any SU(2) matrix can be written in the form

\begin{pmatrix} α & β \\ -\bar{β} & \bar{α} \end{pmatrix}
with α, β ∈ C arbitrary complex numbers satisfying |α|2 +|β|2 = 1. One can also
write down a somewhat unenlightening formula for the map Φ : SU (2) → SO(3)
in terms of such explicit SU (2) matrices, getting
6.3 A summary
To summarize, we have shown that for three dimensions we have two distinct
Lie groups:
• Spin(3), which geometrically is the space S 3 . Its Lie algebra is R3 with
Lie bracket the cross-product. We have seen two different explicit con-
structions of Spin(3), in terms of unit quaternions (Sp(1)), and in terms
of 2 by 2 unitary matrices of determinant 1 (SU (2)).
• SO(3), with the same Lie algebra R3 with the same Lie bracket.
There is a group homomorphism Φ that takes the first group to the second,
which is a two-fold covering map. Its derivative Φ0 is an isomorphism of the Lie
algebras of the two groups.
We can see from these constructions two interesting irreducible representa-
tions of these groups:
• A representation on R3 which can be constructed in two different ways: as
the adjoint representation of either of the two groups, or as the defining
representation of SO(3). This is known to physicists as the “spin 1”
representation.
• A representation of the first group on C2 , which is most easily seen as
the defining representation of SU (2). It is not a representation of SO(3),
since going once around a non-contractible loop starting at the identity
takes one to minus the identity, not back to the identity as required. This
is called the "spin 1/2" or "spinor" representation and will be studied in
more detail in chapter 7.
Chapter 7
representation of Spin(3) = SU (2). The homomorphism πspinor defining the
representation is just the identity map from SU (2) to itself.
The spin representation of SU (2) is not a representation of SO(3). The
double cover map Φ : SU (2) → SO(3) is a homomorphism, so given a rep-
resentation (π, V ) of SO(3) one gets a representation (π ◦ Φ, V ) of SU (2) by
composition. One cannot go in the other direction: there is no homomorphism
SO(3) → SU (2) that would allow us to make the standard representation of
SU (2) on C2 into an SO(3) representation.
One could try and define a representation of SO(3) by
π : g ∈ SO(3) → π(g) = πspinor (g̃) ∈ SU (2)
where g̃ is some choice of one of the elements g̃ ∈ SU (2) satisfying Φ(g̃) = g.
The problem with this is that we won’t quite get a homomorphism. Changing
our choice of g̃ will introduce a minus sign, so π will only be a homomorphism
up to sign
π(g1 )π(g2 ) = ±π(g1 g2 )
The nontrivial nature of the double-covering ensures that there is no way to
completely eliminate all minus signs, no matter how we choose g̃. Examples
like this, which satisfy the representation property only up to a sign ambi-
guity, are known as “projective representations”. So, the spinor representation
of SU (2) = Spin(3) is only a projective representation of SO(3), not a true
representation of SO(3).
Quantum mechanics texts sometimes deal with this phenomenon by noting
that physically there is an ambiguity in how one specifies the space of states H,
with multiplication by an overall scalar not changing the eigenvalues of operators
or the relative probabilities of observing these eigenvalues. As a result, the sign
ambiguity has no physical effect. It seems more straightforward though to not
try and work with projective representations, but just use the larger group
Spin(3), accepting that this is the correct symmetry group reflecting the action
of rotations on three-dimensional quantum systems.
The spin representation is more fundamental than the vector representation,
in the sense that the spin representation cannot be found just knowing the
vector representation, but the vector representation of SO(3) can be constructed
knowing the spin representation of SU (2). We have seen this in the identification
of R3 with 2 by 2 complex matrices, where rotations become conjugation by spin
representation matrices. Another way of seeing this uses the tensor product, and
is explained in section 9.4.3. Note that taking spinors as fundamental entails
abandoning the descriptions of three-dimensional geometry purely in terms of
real numbers. While the vector representation is a real representation of SO(3)
or Spin(3), the spinor representation is a complex representation.
correspond (up to a factor of i) to the corresponding Lie algebra representation.
The U (1) ⊂ U (2) subgroup commutes with everything else and can be analyzed
separately, so we will just consider the SU (2) subgroup. For an arbitrary such
system, the group SU (2) has no particular geometric significance. When it
occurs in its role as double-cover of the rotational group, the quantum system
is said to carry “spin”, in particular “spin one-half” (in chapter 8 will discuss
state spaces of higher spin values).
As before, we take as a standard basis for the Lie algebra su(2) the operators
X_j, j = 1, 2, 3, where

X_j = -i\frac{σ_j}{2}

which satisfy the commutation relations

[X_1, X_2] = X_3, \quad [X_2, X_3] = X_1, \quad [X_3, X_1] = X_2
To make contact with the physics formalism, we’ll define self-adjoint operators
S_j = iX_j = \frac{σ_j}{2}
We could have chosen the other sign, but this is the standard convention of
the physics literature. In general, to a skew-adjoint operator (which is what
one gets from a unitary Lie algebra representation and what exponentiates to
unitary operators) we will associate a self-adjoint operator by multiplying by
i. These self-adjoint operators have real eigenvalues (in this case ± 21 ), so are
favored by physicists as observables since such eigenvalues will be related to
experimental results. In the other direction, given a physicist’s observable self-
adjoint operator, we will multiply by −i to get a skew-adjoint operator that can
be exponentiated to get a unitary representation.
Note that the conventional definition of these operators in physics texts
includes a factor of ℏ:

S_j^{phys} = iℏX_j = \frac{ℏσ_j}{2}
A compensating factor of 1/ℏ is then introduced when exponentiating to get group elements

Ω(θ, w) = e^{-i\frac{θ}{ℏ} w\cdot S^{phys}} ∈ SU(2)
which do not depend on ℏ. The reason for this convention has to do with the action of rotations on functions on R^3 (see chapter 17) and the appearance of ℏ in the definition of the momentum operator. Our definitions of S_j and of rotations using equation 6.3 will not include these factors of ℏ, but in any case they will be equivalent to the physics text definitions when we make our standard choice of working with units such that ℏ = 1.
States in H = C^2 that have a well-defined value of the observable S_j will be the eigenvectors of S_j, with value for the observable the corresponding eigenvalue, which will be ±1/2. Measurement theory postulates that if we perform the measurement corresponding to S_j on an arbitrary state |ψ⟩, then we will

• with probability c_+ get a value of +1/2 and leave the state in an eigenvector |j, +1/2⟩ of S_j with eigenvalue +1/2

• with probability c_− get a value of −1/2 and leave the state in an eigenvector |j, −1/2⟩ of S_j with eigenvalue −1/2
where, if

|ψ⟩ = α|j, +1/2⟩ + β|j, −1/2⟩

we have

c_+ = \frac{|α|^2}{|α|^2 + |β|^2}, \quad c_- = \frac{|β|^2}{|α|^2 + |β|^2}
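As a small illustration (mine, not from the text), these probabilities can be computed by projecting an arbitrary state onto the eigenvectors of S_3:

import numpy as np

alpha, beta = 1 + 2j, 3 - 1j           # arbitrary, unnormalized coefficients
psi = np.array([alpha, beta])

S3 = np.array([[0.5, 0.0], [0.0, -0.5]])
evals, evecs = np.linalg.eigh(S3)       # eigenvalues -1/2, +1/2 with their eigenvectors
norm2 = np.vdot(psi, psi).real
probs = {lam: abs(np.vdot(evecs[:, i], psi))**2 / norm2
         for i, lam in enumerate(evals)}
print(probs)    # {-0.5: |beta|^2/(|alpha|^2+|beta|^2), 0.5: |alpha|^2/(|alpha|^2+|beta|^2)}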
(see equation 6.5). Under this identification the Sj correspond (up to a factor
of 2) to the basis vectors e_j. One can write their transformation rule as

S_j' = Ω(θ, w)\, S_j\, Ω(θ, w)^{-1}

where

\begin{pmatrix} S_1' \\ S_2' \\ S_3' \end{pmatrix} = R(θ, w)^T \begin{pmatrix} S_1 \\ S_2 \\ S_3 \end{pmatrix}
Note that, recalling the discussion in section 4.1, rotations on sets of basis
vectors like this involve the transpose R(θ, w)T of the matrix R(θ, w) that acts
on coordinates.
In chapter 42 we will get to the physics of electromagnetic fields and how
particles interact with them in quantum mechanics, but for now all we need to
know is that for a spin one-half particle, the spin degree of freedom that we are
describing by H = C2 has a dynamics described by the Hamiltonian
H = −µ · B (7.1)
The Schrödinger equation is then

\frac{d}{dt}|ψ(t)⟩ = −i(−µ\cdot B)|ψ(t)⟩
with solution
|ψ(t)i = U (t)|ψ(0)i
where
U(t) = e^{itµ\cdot B} = e^{it\frac{-ge}{2mc}S\cdot B} = e^{t\frac{ge}{2mc}X\cdot B} = e^{t\frac{ge|B|}{2mc}X\cdot\frac{B}{|B|}}

The time evolution of a state is thus given at time t by the same SU(2) element that, acting on vectors, gives a rotation about the axis w = \frac{B}{|B|} by an angle

\frac{ge|B|t}{2mc}
• The Zeeman effect: this is the splitting of atomic energy levels that occurs
when an atom is put in a constant magnetic field. With respect to the
energy levels for no magnetic field, where both states in H = C2 have the
same energy, the term in the Hamiltonian given above adds
±\frac{ge|B|}{4mc}
to the two energy levels, giving a splitting between them proportional to
the size of the magnetic field.
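A short numerical sketch of this spin precession (my own, with the overall constant ge/2mc set to 1 for simplicity, so that μ = −S and H = S · B):

import numpy as np
from scipy.linalg import expm

sigma = np.array([[[0, 1], [1, 0]],
                  [[0, -1j], [1j, 0]],
                  [[1, 0], [0, -1]]], dtype=complex)
S = sigma / 2
mu = -S                                        # magnetic moment operator, constant set to 1

B = np.array([0.0, 0.0, 2.0])                  # magnetic field along the 3-axis
H = -np.einsum('i,ijk->jk', B, mu)             # H = -mu . B  =  S . B here

psi0 = np.array([1.0, 1.0]) / np.sqrt(2)       # an S_1 eigenvector, so <S> starts along the 1-axis

def expectations(t):
    psi = expm(-1j * t * H) @ psi0
    return np.real([np.vdot(psi, S[j] @ psi) for j in range(3)])

for t in [0.0, np.pi / 4, np.pi / 2]:          # <S> precesses about B with angular velocity |B| = 2
    print(t, expectations(t).round(3))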
One can instead use U (t) to make a unitary transformation that puts the
time-dependence in the observables, removing it from the states, as follows:
where the “H” subscripts for “Heisenberg” indicate that we are dealing with
“Heisenberg picture” observables and states. One can easily see that the physi-
cally observable quantities given by eigenvalues and expectations values remain
the same:
\frac{d}{dt}S_H(t) = i[H, S_H(t)] = i\frac{eg}{2mc}[S_H(t)\cdot B, S_H(t)] \qquad (7.2)

We know from the discussion above that the solution will be given by conjugation by

U(t) = e^{-it\frac{ge|B|}{2mc} S\cdot\frac{B}{|B|}}

and thus the spin vector observable evolves in the Heisenberg picture by rotating about the magnetic field vector with angular velocity \frac{ge|B|}{2mc}.
states, one still has the freedom to multiply states by a phase eiθ without chang-
ing eigenvectors, eigenvalues or expectation values. In terms of group theory,
the overall U (1) in the unitary group U (2) acts on H by a representation of
U (1), which can be characterized by an integer, the corresponding “charge”,
but this decouples from the rest of the observables and is not of much interest.
One is mainly interested in the SU (2) part of the U (2), and the observables
that correspond to its Lie algebra.
Working with normalized states in this case corresponds to working with
unit-length vectors in C2 , which are given by points on the unit sphere S 3 . If
we don’t care about the overall U (1) action, we can imagine identifying all states
that are related by a phase transformation. Using this equivalence relation we
can define a new set, whose elements are the “cosets”, elements of S 3 ⊂ C2 ,
with elements that differ just by multiplication by eiθ identified. The set of these
elements forms a new geometrical space, called the “coset space”, often written
S 3 /U (1). This structure is called a “fibering” of S 3 by circles, and is known
as the “Hopf fibration”. Try an internet search for various visualizations of the
geometrical structure involved, a surprising decomposition of three-dimensional
space into non-intersecting curves.
The same space can be represented in a different way, as C2 /C∗ , by taking
all elements of C2 and identifying those related by multiplication by a non-zero
complex number. If we were just using real numbers, R2 /R∗ can be thought of
as the space of all lines in the plane going through the origin.
One sees that each such line hits the unit circle in two opposite points, so
this set could be parametrized by a semi-circle, identifying the points at the
two ends. This space is given the name RP 1 , the “real projective line”, and
the analog space of lines through the origin in Rn is called RP n−1 . What we
are interested in is the complex analog CP 1 , which is often called the “complex
projective line”.
To better understand CP 1 , one would like to put coordinates on it. A
standard way to choose such a coordinate is to associate to the vector

\begin{pmatrix} z_1 \\ z_2 \end{pmatrix} ∈ C^2

the complex number z_1/z_2 (for z_2 ≠ 0). One can identify CP^1 with the unit sphere S^2 ⊂ R^3 by stereographic projection,
and the equations relating coordinates (X1 , X2 , X3 ) on the sphere and the com-
plex coordinate z1 /z2 = z = x + iy on the plane are given by
x = \frac{X_1}{1 - X_3}, \quad y = \frac{X_2}{1 - X_3}

and

X_1 = \frac{2x}{x^2 + y^2 + 1}, \quad X_2 = \frac{2y}{x^2 + y^2 + 1}, \quad X_3 = \frac{x^2 + y^2 - 1}{x^2 + y^2 + 1}
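These two coordinate changes are inverse to each other, as the following quick check (mine, with ad hoc helper names) confirms:

import numpy as np

def plane_to_sphere(z):
    x, y = z.real, z.imag
    d = x**2 + y**2 + 1
    return np.array([2*x/d, 2*y/d, (x**2 + y**2 - 1)/d])

def sphere_to_plane(X):
    return (X[0] + 1j*X[1]) / (1 - X[2])

z1, z2 = 1 + 2j, 0.5 - 1j            # a point of C^2 with z2 != 0
z = z1 / z2
X = plane_to_sphere(z)
print(np.isclose(np.dot(X, X), 1))          # the point lies on the unit sphere
print(np.isclose(sphere_to_plane(X), z))    # the round trip recovers z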
The SU(2) transformation

\begin{pmatrix} z_1 \\ z_2 \end{pmatrix} → \begin{pmatrix} α & β \\ -\bar{β} & \bar{α} \end{pmatrix}\begin{pmatrix} z_1 \\ z_2 \end{pmatrix}

takes

z = \frac{z_1}{z_2} → \frac{αz + β}{-\bar{β}z + \bar{α}}
Such transformations of the complex plane are conformal (angle-preserving)
transformations known as “Möbius transformations”. One can check that the
corresponding transformation on the sphere is the rotation of the sphere in R3
corresponding to this SU (2) = Spin(3) transformation.
To mathematicians, this sphere identified with CP 1 is known as the “Rie-
mann sphere”, whereas physicists often instead use the terminology of “Bloch
sphere”. It provides a useful parametrization of the states of the qubit system,
up to scalar multiplication, which is supposed to be physically irrelevant. The
North pole is the “spin-up” state, the South pole is the “spin-down” state, and
along the equator one finds the two states that have definite values for S1 , as
well as the two that have definite values for S2 .
Notice that the inner product on vectors in H does not correspond at all
to the inner product of unit vectors in R3 . The North and South poles of the
Bloch sphere correspond to orthogonal vectors in H, but they are not at all
orthogonal thinking of the corresponding points on the Bloch sphere as vectors
in R3 . Similarly, eigenvectors for S1 and S2 are orthogonal on the Bloch sphere,
but not at all orthogonal in H.
7.5 For further reading
Just about every quantum mechanics textbook works out this example of a spin
1/2 particle in a magnetic field. For one example, see chapter 14 of [57]. For
an inspirational discussion of spin and quantum mechanics, together with more
about the Bloch sphere, see chapter 22 of [45].
Chapter 8
Representations of SU (2)
and SO(3)
For the case of G = U (1), in chapter 2 we were able to classify all complex
irreducible representations by an element of Z and explicitly construct each
irreducible representation. We would like to do the same thing here for repre-
sentations of SU (2) and SO(3). The end result will be that irreducible repre-
sentations of SU (2) are classified by a non-negative integer n = 0, 1, 2, 3, · · · ,
and have dimension n + 1, so we’ll (hoping for no confusion with the irreducible
representations (πn , C) of U (1)) denote them (πn , Cn+1 ). For even n these will
also be irreducible representations of SO(3), but this will not be true for odd
n. It is common in physics to label these representations by s = n/2 = 0, 1/2, 1, · · ·
and call the representation labeled by s the “spin s representation”. We already
know the first three examples:
• Spin 0: (π_0, C) is the trivial representation on V = C, with
π0 (g) = 1 ∀g ∈ SU (2)
This is also a representation of SO(3). In physics, this is sometimes called
the “scalar representation”. Saying that something transforms under ro-
tations as the “scalar representation” just means that it is invariant under
rotations.
• Spin 1/2: Taking
π1 (g) = g ∈ SU (2) ⊂ U (2)
gives the defining representation on C2 . This is the spinor representation
discussed in chapter 7. It is not a representation of SO(3).
• Spin 1: Since SO(3) is a group of 3 by 3 matrices, it acts on vectors in R3 .
This is just the standard action on vectors by rotation. In other words,
the representation is (ρ, R3 ), with ρ the identity homomorphism
g ∈ SO(3) → ρ(g) = g ∈ SO(3)
One can complexify to get a representation on C3 , which in this case just
means acting with SO(3) matrices on column vectors, replacing the real
coordinates of vectors by complex coordinates. This is sometimes called
the “vector representation”, and we saw in chapter 6 that it is isomorphic
to the adjoint representation.
One gets a representation (π2 , C3 ) of SU (2) by just composing the homo-
morphisms Φ and ρ:
π2 = ρ ◦ Φ : SU (2) → SO(3)
Proof. Recall that if we diagonalize a unitary matrix, the diagonal entries are
the eigenvalues, but their order is undetermined: acting by permutations on
these eigenvalues we get different diagonalizations of the same matrix. In the
case of SU(2) the matrix

P = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}

has the property that conjugation by it permutes the diagonal elements, in particular

P\begin{pmatrix} e^{iθ} & 0 \\ 0 & e^{-iθ} \end{pmatrix}P^{-1} = \begin{pmatrix} e^{-iθ} & 0 \\ 0 & e^{iθ} \end{pmatrix}

So

π(P)\,π\!\left(\begin{pmatrix} e^{iθ} & 0 \\ 0 & e^{-iθ} \end{pmatrix}\right)π(P)^{-1} = π\!\left(\begin{pmatrix} e^{-iθ} & 0 \\ 0 & e^{iθ} \end{pmatrix}\right)
and we see that π(P ) gives a change of basis of V such that the representation
matrices on the U (1) subgroup are as before, with θ → −θ. Changing θ → −θ
in the representation matrices is equivalent to changing the sign of the weights
qj . The elements of the set {qj } are independent of the basis, so the additional
symmetry under sign change implies that for each non-zero element in the set
there is another one with the opposite sign.
Looking at our three examples so far, we see that the scalar or spin 0 repre-
sentation of course is one-dimensional of weight 0
(π0 , C) = C0
and the spinor or spin 1/2 representation decomposes into U(1) irreducibles of weights −1, +1:
(π1 , C2 ) = C−1 ⊕ C+1
For the spin 1 representation, recall that our double-cover homomorphism
Φ takes

\begin{pmatrix} e^{iθ} & 0 \\ 0 & e^{-iθ} \end{pmatrix} ∈ SU(2) → \begin{pmatrix} \cos 2θ & -\sin 2θ & 0 \\ \sin 2θ & \cos 2θ & 0 \\ 0 & 0 & 1 \end{pmatrix} ∈ SO(3)
Acting with the SO(3) matrix on the right on C3 will give a unitary transforma-
tion of C3 , so in the group U (3). One can show that the upper left diagonal 2 by
2 block acts on C2 with weights −2, +2, whereas the bottom right element acts
trivially on the remaining part of C3 , which is a one-dimensional representation
of weight 0. So, the spin 1 representation decomposes as

(π_2, C^3) = C_{-2} ⊕ C_0 ⊕ C_{+2}
Recall that the spin 1 representation of SU (2) is often called the “vector” rep-
resentation, since it factors in this way through the representation of SO(3) by
rotations on three-dimensional vectors.
8.1.2 Lie algebra representations: raising and lowering op-
erators
To proceed further in characterizing a representation (π, V ) of SU (2) we need
to use not just the action of the chosen U (1) subgroup, but the action of
group elements in the other two directions away from the identity. The non-
commutativity of the group keeps us from simultaneously diagonalizing those
actions and assigning weights to them. We can however work instead with the
corresponding Lie algebra representation (π 0 , V ) of su(2). As in the U (1) case,
the group representation is determined by the Lie algebra representation. We
will see that for the Lie algebra representation, we can exploit the complexifica-
tion (recall section 5.4) sl(2, C) of su(2) to further analyze the possible patterns
of weights.
Recall that the Lie algebra su(2) can be thought of as the tangent space R3
to SU (2) at the identity element, with a basis given by the three skew-adjoint
2 by 2 matrices

X_j = -i\frac{1}{2}σ_j

which satisfy the commutation relations

[X_1, X_2] = X_3, \quad [X_2, X_3] = X_1, \quad [X_3, X_1] = X_2
with eiθ going around U (1) once as θ goes from 0 to 2π, this means we can
choose a basis of V so that
π(e^{i2θS_3}) = \begin{pmatrix} e^{iθq_1} & 0 & \cdots & 0 \\ 0 & e^{iθq_2} & \cdots & 0 \\ & & \ddots & \\ 0 & 0 & \cdots & e^{iθq_m} \end{pmatrix}
Taking the derivative of this representation to get a Lie algebra representation,
using
π'(X) = \frac{d}{dθ}π(e^{θX})\Big|_{θ=0}

we find for X = i2S_3

π'(i2S_3) = \frac{d}{dθ}\begin{pmatrix} e^{iθq_1} & 0 & \cdots & 0 \\ 0 & e^{iθq_2} & \cdots & 0 \\ & & \ddots & \\ 0 & 0 & \cdots & e^{iθq_m} \end{pmatrix}\Bigg|_{θ=0} = \begin{pmatrix} iq_1 & 0 & \cdots & 0 \\ 0 & iq_2 & \cdots & 0 \\ & & \ddots & \\ 0 & 0 & \cdots & iq_m \end{pmatrix}
S1 and S2 don’t commute with S3 , so they may not preserve the subspaces
Vk and we can’t diagonalize them simultaneously with S3 . We can however
exploit the fact that we are in the complexification sl(2, C) to construct two
complex linear combinations of S_1 and S_2 that do something interesting:

S_± = S_1 ± iS_2

We have S_+, S_- ∈ sl(2, C). These are neither self-adjoint nor skew-adjoint, but
satisfy
(S± )† = S∓
and similarly we have
π 0 (S± )† = π 0 (S∓ )
We call π 0 (S+ ) a “raising operator” for the representation (π, V ), and π 0 (S− )
a “lowering operator”.
To see why, note that for v ∈ V_k (i.e., π'(S_3)v = \frac{k}{2}v), the commutation relation [S_3, S_+] = S_+ gives

π'(S_3)π'(S_+)v = π'(S_+)π'(S_3)v + π'(S_+)v = \left(\frac{k}{2} + 1\right)π'(S_+)v
so
v ∈ Vk =⇒ π 0 (S+ )v ∈ Vk+2
The linear operator π 0 (S+ ) takes vectors with a well-defined weight to vectors
with the same weight, plus 2 (thus the terminology “raising operator”). A
similar calculation shows that π 0 (S− ) takes Vk to Vk−2 , lowering the weight by
2.
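In the defining (spin 1/2) representation, where π' is just the identity map on su(2), these statements are easy to verify numerically (an illustration of mine, using S_± = S_1 ± iS_2):

import numpy as np

S1 = np.array([[0, 1], [1, 0]]) / 2
S2 = np.array([[0, -1j], [1j, 0]]) / 2
S3 = np.array([[1, 0], [0, -1]], dtype=complex) / 2
Sp, Sm = S1 + 1j * S2, S1 - 1j * S2

def comm(A, B):
    return A @ B - B @ A

print(np.allclose(comm(S3, Sp), Sp))        # [S3, S+] = +S+
print(np.allclose(comm(S3, Sm), -Sm))       # [S3, S-] = -S-
print(np.allclose(comm(Sp, Sm), 2 * S3))    # [S+, S-] = 2 S3

v = np.array([0.0, 1.0])                    # weight -1 vector (eigenvalue -1 for 2*S3)
w = Sp @ v
print(np.allclose(2 * S3 @ w, w))           # True: S+ has raised the weight from -1 to +1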
We’re now ready to classify all finite dimensional irreducible unitary repre-
sentations (π, V) of SU(2). We define a highest weight vector of the representation to be a non-zero vector v_n of weight n, where n is the largest weight occurring; since there is no larger weight, such a vector satisfies π'(S_+)v_n = 0.
Theorem (Highest weight theorem). Finite dimensional irreducible represen-
tations of SU (2) have weights of the form
−n, −n + 2, · · · , n − 2, n
for n a non-negative integer. The proof makes use of the commutation relation

[S_+, S_-] = 2S_3

applied to vectors v_{n-2j} obtained by repeatedly applying π'(S_-) to a highest weight vector v_n.
The span of the vn−2j is not just a representation, but an irreducible one,
since all the non-zero vn−2j arise by repeated application of π 0 (S− ) to vn and
equation 8.1 shows that (up to a constant) π 0 (S+ ) is an inverse to π 0 (S− ) for
all j up to the value j = n + 1. In the sequence of vn−2j for increasing j, finite-
dimensionality of V^n implies that at some point one must hit a "lowest
weight vector”, one annihilated by π 0 (S− ). From that point on, the vn−2j for
higher j will be zero. Taking into account the fact that the pattern of weights
is invariant under change of sign, one finds that the only possible pattern of
weights is
−n, −n + 2, · · · , n − 2, n
This is consistent with equation 8.1, which shows that it is at j = n that π 0 (S− )
will act on vn−2j without having an inverse proportional to π 0 (S+ ) (which would
act on vn−2(j+1) ).
Since we saw in section 8.1.1 that representations can be studied by looking
at the set of their weights under the action of our chosen U(1) ⊂ SU(2), we
can label irreducible representations of SU (2) by a non-negative integer n, the
highest weight. Such a representation will be of dimension n + 1, with weights
−n, −n + 2, · · · , n − 2, n
where all the vector spaces are copies of C, and all the maps are isomorphisms
(multiplications by various numbers).
In summary, we see that all irreducible finite dimensional unitary SU (2)
representations can be labeled by a non-negative integer, the highest weight n.
These representations have dimension n + 1 and we will denote them (πn , V n =
C^{n+1}). Note that V_n is the n'th weight space, while V^n is the representation with highest weight n. The physicist's terminology for this uses not n, but n/2, and calls this number the "spin" of the representation. We have so far seen the lowest three examples n = 0, 1, 2, or spin s = n/2 = 0, 1/2, 1, but there is an infinite class of larger irreducibles, with dim V^n = n + 1 = 2s + 1.
Recall from our early discussion of representations that if one has an action
of a group on a space M , one can get a representation on functions f on M by
taking
(π(g)f )(x) = f (g −1 · x)
For SU (2), we have an obvious action of the group on M = C2 (by matrices
acting on column vectors), and we look at a specific class of functions on this
space, the polynomials. We can break up the infinite-dimensional space of
polynomials on C2 into finite-dimensional subspaces as follows:
Definition (Homogeneous polynomials). The complex vector space of homoge-
neous polynomials of degree m in two complex variables z1 , z2 is the space of
functions on C^2 of the form

f(z_1, z_2) = a_0 z_1^m + a_1 z_1^{m-1}z_2 + \cdots + a_{m-1} z_1 z_2^{m-1} + a_m z_2^m
so

π_n'(S_3) = \frac{1}{2}\left(-z_1\frac{∂}{∂z_1} + z_2\frac{∂}{∂z_2}\right)

and similarly

π_n'(S_+) = -z_2\frac{∂}{∂z_1}, \quad π_n'(S_-) = -z_1\frac{∂}{∂z_2}

The z_1^k z_2^{n-k} are eigenvectors for S_3 with eigenvalue \frac{1}{2}(n - 2k), since

π_n'(S_3)z_1^k z_2^{n-k} = \frac{1}{2}\left(-k z_1^k z_2^{n-k} + (n-k)z_1^k z_2^{n-k}\right) = \frac{1}{2}(n - 2k)z_1^k z_2^{n-k}
z_2^n will be an explicit highest weight vector for the representation (π_n, V^n).
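These formulas can be checked symbolically. The following sketch of mine (using SymPy) verifies the S_3 eigenvalue of a monomial, that π_n'(S_+) raises the eigenvalue by one, and that z_2^n is annihilated by π_n'(S_+):

import sympy as sp

z1, z2 = sp.symbols('z1 z2')

def S3(f):
    return sp.Rational(1, 2) * (-z1 * sp.diff(f, z1) + z2 * sp.diff(f, z2))

def Splus(f):
    return -z2 * sp.diff(f, z1)

n, k = 5, 2
f = z1**k * z2**(n - k)
print(sp.simplify(S3(f) / f))                  # (n - 2k)/2 = 1/2
print(sp.simplify(S3(Splus(f)) / Splus(f)))    # eigenvalue raised by 1: 3/2
print(Splus(z2**n))                            # 0: z2^n is a highest weight vector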
An important thing to note here is that the formulas we have found for π 0
are not in terms of matrices. Instead we have seen that when we construct our
representations using functions on C2 , for any X ∈ su(2) (or its complexification
sl(2, C)), πn0 (X) is given by a differential operator. Note that these differential
operators are independent of n: one gets the same operator π 0 (X) on all the
V n . This is because the original definition of the representation
(π(g)f )(x) = f (g −1 · x)
To make these unitary representations, what one can do in this case is define an inner product on polynomial functions on C^2 by
\langle f, g\rangle = \frac{1}{π^2}\int_{C^2} \overline{f(z_1, z_2)}\, g(z_1, z_2)\, e^{-(|z_1|^2 + |z_2|^2)}\, dx_1 dy_1 dx_2 dy_2 \qquad (8.2)
Here z1 = x1 + iy1 , z2 = x2 + iy2 . One can do integrals of this kind fairly easily
since they factorize into separate integrals over z1 and z2 , each of which can be
treated using polar coordinates and standard calculus methods. One can check
by explicit computation that the polynomials
\frac{z_1^j z_2^k}{\sqrt{j!\,k!}}
will be an orthonormal basis of the space of polynomial functions with respect
to this inner product, and the operators π 0 (X), X ∈ su(2) will be skew-adjoint.
Working out what happens for the first few examples of irreducible SU (2)
representations, one finds orthonormal bases for the representation spaces V n
of homogeneous polynomials as follows
• For n = s = 0:
1

• For n = 1, s = 1/2:
z_1, \ z_2

• For n = 2, s = 1:
\frac{1}{\sqrt{2}}z_1^2, \ z_1 z_2, \ \frac{1}{\sqrt{2}}z_2^2

• For n = 3, s = 3/2:
\frac{1}{\sqrt{6}}z_1^3, \ \frac{1}{\sqrt{2}}z_1^2 z_2, \ \frac{1}{\sqrt{2}}z_1 z_2^2, \ \frac{1}{\sqrt{6}}z_2^3
If an SU(2) representation π_n comes from a representation ρ of SO(3), i.e.,

π_n = ρ ◦ Φ

then since Φ(−1) = 1 one must have

π_n(−1) = ρ(Φ(−1)) = 1
From knowing that the weights of πn are −n, −n + 2, · · · , n − 2, n, we know that
π_n(-1) = π_n\!\left(\begin{pmatrix} e^{iπ} & 0 \\ 0 & e^{-iπ} \end{pmatrix}\right) = \begin{pmatrix} e^{inπ} & 0 & \cdots & 0 \\ 0 & e^{i(n-2)π} & \cdots & 0 \\ & & \ddots & \\ 0 & 0 & \cdots & e^{-inπ} \end{pmatrix} = 1
which will only be true for n even, not for n odd. Since the Lie algebra of SO(3)
is isomorphic to the Lie algebra of SU (2), the same Lie algebra argument using
raising and lowering operators as in the last section also applies. The irreducible
representations of SO(3) will be (ρl , V = C2l+1 ) for l = 0, 1, 2, · · · , of dimension
2l + 1 and satisfying
ρl ◦ Φ = π2l
Just like in the case of SU (2), we can explicitly construct these representa-
tions using functions on a space with an SO(3) action. The obvious space to
choose is R3 , with SO(3) matrices acting on x ∈ R3 as column vectors, by the
formula we have repeatedly used
x1
(ρ(g)f )(x) = f (g −1 · x) = f (g −1 x2 )
x3
We’ll also use elements l± = l1 ± il2 of the complexified Lie algebra to create
raising and lowering operators L± = iρ0 (l± ).
As with the SU(2) case, we won't include a factor of ℏ as is usual in physics (e.g. the usual convention is L_j = iℏρ'(l_j)), since for considerations of the action of the rotation group it would just cancel out (physicists define rotations using e^{\frac{i}{ℏ}θL_j}). The factor of ℏ is only of significance when L_j is expressed in terms of the momentum operator, a topic discussed in chapter 17.
In the SU (2) case, the π 0 (Sj ) had half-integral eigenvalues, with the eigen-
values of π 0 (2S3 ) the integral weights of the representation. Here the Lj will
have integer eigenvalues, the weights will be the eigenvalues of 2L3 , which will
be even integers.
Computing explicitly, for l_1:

ρ'(l_1)f = \frac{d}{dt}f\left(e^{-t\begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix}}\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}\right)\Bigg|_{t=0} \qquad (8.3)

= \frac{d}{dt}f\left(\begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos t & \sin t \\ 0 & -\sin t & \cos t \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}\right)\Bigg|_{t=0} \qquad (8.4)

= \frac{d}{dt}f\left(\begin{pmatrix} x_1 \\ x_2\cos t + x_3\sin t \\ -x_2\sin t + x_3\cos t \end{pmatrix}\right)\Bigg|_{t=0} \qquad (8.5)

= \left(\frac{∂f}{∂x_1}, \frac{∂f}{∂x_2}, \frac{∂f}{∂x_3}\right)\cdot\begin{pmatrix} 0 \\ x_3 \\ -x_2 \end{pmatrix} \qquad (8.6)

= x_3\frac{∂f}{∂x_2} - x_2\frac{∂f}{∂x_3} \qquad (8.7)
so

ρ'(l_1) = x_3\frac{∂}{∂x_2} - x_2\frac{∂}{∂x_3}

and similarly

ρ'(l_2) = x_1\frac{∂}{∂x_3} - x_3\frac{∂}{∂x_1}, \quad ρ'(l_3) = x_2\frac{∂}{∂x_1} - x_1\frac{∂}{∂x_2}
The space of all functions on R3 is much too big: it will give us an infinity of
copies of each finite dimensional representation that we want. Notice that when
SO(3) acts on R3 , it leaves the distance to the origin invariant. If we work in
spherical coordinates (r, θ, φ) (see picture)
we will have
x_1 = r\sin θ\cos φ, \quad x_2 = r\sin θ\sin φ, \quad x_3 = r\cos θ
Acting on f (r, φ, θ), SO(3) will leave r invariant, only acting non-trivially on
θ, φ. It turns out that we can cut down the space of functions to something
that will only contain one copy of the representation we want in various ways.
One way to do this is to restrict our functions to the unit sphere, i.e. just look
at functions f (θ, φ). We will see that the representations we are looking for can
be found in simple trigonometric functions of these two angular variables.
We can construct our irreducible representations ρ0l by explicitly constructing
a function we will call Yll (θ, φ) that will be a highest weight vector of weight
l. The weight l condition and the highest weight condition give two differential
equations for Yll (θ, φ):
L3 Yll = lYll , L+ Yll = 0
These will turn out to have a unique solution (up to scalars).
We first need to change coordinates from rectangular to spherical in our
expressions for L3 , L± . Using the chain rule to compute expressions like
∂
f (x1 (r, θ, φ), x2 (r, θ, φ), x3 (r, θ, φ))
∂r
we find
\begin{pmatrix} \frac{∂}{∂r} \\ \frac{∂}{∂θ} \\ \frac{∂}{∂φ} \end{pmatrix} = \begin{pmatrix} \sin θ\cos φ & \sin θ\sin φ & \cos θ \\ r\cos θ\cos φ & r\cos θ\sin φ & -r\sin θ \\ -r\sin θ\sin φ & r\sin θ\cos φ & 0 \end{pmatrix}\begin{pmatrix} \frac{∂}{∂x_1} \\ \frac{∂}{∂x_2} \\ \frac{∂}{∂x_3} \end{pmatrix}
so

\begin{pmatrix} \frac{∂}{∂r} \\ \frac{1}{r}\frac{∂}{∂θ} \\ \frac{1}{r\sin θ}\frac{∂}{∂φ} \end{pmatrix} = \begin{pmatrix} \sin θ\cos φ & \sin θ\sin φ & \cos θ \\ \cos θ\cos φ & \cos θ\sin φ & -\sin θ \\ -\sin φ & \cos φ & 0 \end{pmatrix}\begin{pmatrix} \frac{∂}{∂x_1} \\ \frac{∂}{∂x_2} \\ \frac{∂}{∂x_3} \end{pmatrix}

This is an orthogonal matrix, so one can invert it by taking its transpose, to get

\begin{pmatrix} \frac{∂}{∂x_1} \\ \frac{∂}{∂x_2} \\ \frac{∂}{∂x_3} \end{pmatrix} = \begin{pmatrix} \sin θ\cos φ & \cos θ\cos φ & -\sin φ \\ \sin θ\sin φ & \cos θ\sin φ & \cos φ \\ \cos θ & -\sin θ & 0 \end{pmatrix}\begin{pmatrix} \frac{∂}{∂r} \\ \frac{1}{r}\frac{∂}{∂θ} \\ \frac{1}{r\sin θ}\frac{∂}{∂φ} \end{pmatrix}
So we finally have

L_1 = iρ'(l_1) = i\left(x_3\frac{∂}{∂x_2} - x_2\frac{∂}{∂x_3}\right) = i\left(\sin φ\frac{∂}{∂θ} + \cot θ\cos φ\frac{∂}{∂φ}\right)

L_2 = iρ'(l_2) = i\left(x_1\frac{∂}{∂x_3} - x_3\frac{∂}{∂x_1}\right) = i\left(-\cos φ\frac{∂}{∂θ} + \cot θ\sin φ\frac{∂}{∂φ}\right)

L_3 = iρ'(l_3) = i\left(x_2\frac{∂}{∂x_1} - x_1\frac{∂}{∂x_2}\right) = -i\frac{∂}{∂φ}

and

L_+ = iρ'(l_+) = e^{iφ}\left(\frac{∂}{∂θ} + i\cot θ\frac{∂}{∂φ}\right), \quad L_- = iρ'(l_-) = e^{-iφ}\left(-\frac{∂}{∂θ} + i\cot θ\frac{∂}{∂φ}\right)
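One can verify symbolically that e^{ilφ} sin^l θ (which will appear below as Y_l^l) satisfies the two highest weight conditions; a short SymPy check of mine:

import sympy as sp

theta, phi = sp.symbols('theta phi')
ell = sp.Integer(3)                        # any positive integer works here
f = sp.exp(sp.I * ell * phi) * sp.sin(theta)**ell

L3f = -sp.I * sp.diff(f, phi)
Lplusf = sp.exp(sp.I * phi) * (sp.diff(f, theta) + sp.I * sp.cot(theta) * sp.diff(f, phi))

print(sp.simplify(L3f - ell * f))   # 0:  L3 f = l f
print(sp.simplify(Lplusf))          # 0:  L+ f = 0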
Now that we have expressions for the action of the Lie algebra on functions in
spherical coordinates, our two differential equations saying our function Yll (θ, φ)
is of weight l and in the highest-weight space are
L_3 Y_l^l(θ, φ) = -i\frac{∂}{∂φ}Y_l^l(θ, φ) = lY_l^l(θ, φ)

and

L_+ Y_l^l(θ, φ) = e^{iφ}\left(\frac{∂}{∂θ} + i\cot θ\frac{∂}{∂φ}\right)Y_l^l(θ, φ) = 0
The first of these tells us that

Y_l^l(θ, φ) = e^{ilφ}f(θ)

for some function f(θ), and the second that f satisfies \frac{df}{dθ} = l\cot θ\, f, with solution (up to normalization) f(θ) = \sin^l θ, so

Y_l^l(θ, φ) = e^{ilφ}\sin^l θ
This is a function on the sphere, which is also a highest weight vector in a
2l + 1 dimensional irreducible representation of SO(3). To get functions which
give vectors spanning the rest of the weight spaces, one just repeatedly applies
the lowering operator L− , getting functions
Y_l^m(θ, φ) = C_{lm}(L_-)^{l-m}Y_l^l(θ, φ) = C_{lm}\left(e^{-iφ}\left(-\frac{∂}{∂θ} + i\cot θ\frac{∂}{∂φ}\right)\right)^{l-m}e^{ilφ}\sin^l θ

for m = l, l-1, l-2, \cdots, -l+1, -l
The functions Ylm (θ, φ) are called “spherical harmonics”, and they span the
space of complex functions on the sphere in much the same way that the einθ
span the space of complex valued functions on the circle. Unlike the case of
polynomials on C2 , for functions on the sphere, one gets finite numbers by
integrating such functions over the sphere. So one can define an inner product
on these representations for which they are unitary by simply setting
\langle f, g\rangle = \int_{S^2}\bar{f}g\sin θ\,dθ\,dφ = \int_{φ=0}^{2π}\int_{θ=0}^{π}\overline{f(θ, φ)}\,g(θ, φ)\sin θ\,dθ\,dφ
We will not try and show this here, but for the allowable values of l, m the
Ylm (θ, φ) are mutually orthogonal with respect to this inner product.
One can derive various general formulas for the Ylm (θ, φ) in terms of Leg-
endre polynomials, but here we’ll just compute the first few examples, with
the proper constants that give them norm 1 with respect to the chosen inner
product.
• For the l = 0 representation
Y_0^0(θ, φ) = \sqrt{\frac{1}{4π}}
We will see later that these functions of the angular variables in spheri-
cal coordinates are exactly the functions that give the angular dependence of
wavefunctions for the physical system of a particle in a spherically symmetric
potential. In such a case the SO(3) symmetry of the system implies that the
state space (the wavefunctions) will provide a unitary representation π of SO(3),
and the action of the Hamiltonian operator H will commute with the action of
the operators L3 , L± . As a result all of the states in an irreducible representa-
tion component of π will have the same energy. States are thus organized into
“orbitals”, with singlet states called “s” orbitals (l = 0), triplet states called
“p” orbitals (l = 1), multiplicity 5 states called “d” orbitals (l = 2), etc.
Definition (Casimir operator for SO(3)). The Casimir operator for the repre-
sentation of SO(3) on functions on S 2 is the second-order differential operator
(the symbol L2 is not intended to mean that this is the square of an operator L)
so
L^2 = L_1^2 + L_2^2 + L_3^2 = L_-L_+ + L_3 + L_3^2
For the representation ρ of SO(3) on functions on S 2 constructed above,
we know that on a highest weight vector of the irreducible representation ρl
(restriction of ρ to the 2l + 1 dimensional irreducible subspace of functions that
are linear combinations of the Ylm (θ, φ)), we have the two eigenvalue equations
L+ f = 0, L3 f = lf
with solution the functions proportional to Yll (θ, φ). Just from these conditions
and our expression for L2 we can immediately find the scalar eigenvalue of L2
since
L^2 f = L_-L_+f + (L_3 + L_3^2)f = (0 + l + l^2)f = l(l + 1)f
We have thus shown that our irreducible representation ρl can be characterized
as the representation on which L2 acts by the scalar l(l + 1).
In summary, we have two different sets of partial differential equations whose
solutions provide a highest weight vector for and thus determine the irreducible
representation ρl :
•
L+ f = 0, L3 f = lf
which are first order equations, with the first using complexification and
something like a Cauchy-Riemann equation, and
•
L2 f = l(l + 1)f, L3 f = lf
where the first equation is a second order equation, something like a
Laplace equation.
That a solution of the first set of equations gives a solution of the second set
is obvious. Much harder to show is that a solution of the second set gives a
solution of the first set. The space of solutions to
$$L^2 f = l(l+1)f$$
alone (without the $L_3$ condition) is the full $2l+1$ dimensional space of linear combinations of the $Y_l^m(\theta, \phi)$, while the first-order conditions pick out just the highest weight line.
For the group SU (2) we can also find irreducible representations as solution
spaces of differential equations on functions on C2 . In that case, the differential
equation point of view is much less useful, since the solutions we are looking for
are just the homogeneous polynomials, which are more easily studied by purely
algebraic methods.
Chapter 9
Tensor Products,
Entanglement, and
Addition of Spin
If one has two independent quantum systems, with state spaces H1 and H2 ,
the combined quantum system has a description that exploits the mathematical
notion of a “tensor product”, with the combined state space the tensor product
H1 ⊗ H2 . Because of the ability to take linear combinations of states, this
combined state space will contain much more than just products of independent
states, including states that are described as “entangled”, which are responsible for
some of the most counter-intuitive behavior of quantum physical systems.
This same tensor product construction is a basic one in representation the-
ory, allowing one to construct a new representation (πW1 ⊗W2 , W1 ⊗ W2 ) out of
representations (πW1 , W1 ) and (πW2 , W2 ). When we take the tensor product of
states corresponding to two irreducible representations of SU (2) of spins s1 , s2 ,
we will get a new representation $(\pi_{V^{2s_1}\otimes V^{2s_2}}, V^{2s_1}\otimes V^{2s_2})$. It will be reducible,
a direct sum of representations of various spins, a situation we will analyze in
detail.
Starting with a quantum system with state space H that describes a single
particle, one can describe a system of N particles by taking an N -fold tensor
product H⊗N = H ⊗ H ⊗ · · · ⊗ H. A deep fact about the physical world
is that for identical particles, we don’t get the full tensor product space, but
only the subspaces either symmetric or antisymmetric under the action of the
permutation group by permutations of the factors, depending on whether our
particles are “bosons” or “fermions”. An even deeper fact is that elementary
particles of half-integral spin s must behave as fermions, those of integral spin,
bosons.
Digression. When physicists refer to “tensors”, they generally mean the “ten-
sor fields” used in general relativity or other geometry-based parts of physics,
not tensor products of state spaces. A tensor field is a function on a manifold,
taking values in some tensor product of copies of the tangent space and its dual
space. The simplest tensor fields are just vector fields, functions taking values
in the tangent space. A more non-trivial example is the metric tensor, which
takes values in the dual of the tensor product of two copies of the tangent space.
9.1 Tensor products
Given a basis $\{e_j\}$ of $V$ and a basis $\{f_k\}$ of $W$, the union of the two sets of basis vectors will be a basis of $V\oplus W$.
A less trivial construction is the tensor product of the vector spaces $V$ and $W$. This will be a new vector space called $V\otimes W$, of dimension $(\dim V)(\dim W)$. One way to motivate the tensor product is to think of vector spaces as vector spaces of functions, with elements
$$v = v_1e_1 + v_2e_2 + \cdots + v_{\dim V}e_{\dim V} \in V$$
thought of as functions on the set of basis vectors, taking the value $v_j$ on $e_j$. An element $v\otimes w$ of $V\otimes W$ is bilinear in its two factors, for instance
$$v\otimes(c_1w_1 + c_2w_2) = c_1(v\otimes w_1) + c_2(v\otimes w_2)$$
• There are natural isomorphisms
C ⊗ V ' V, V ⊗ W ' W ⊗ V
and
U ⊗ (V ⊗ W ) ' (U ⊗ V ) ⊗ W
for vector spaces U, V, W
• Given a linear operator A on V and another linear operator B on W , we
can define a linear operator A ⊗ B on V ⊗ W by
(A ⊗ B)(v ⊗ w) = Av ⊗ Bw
for v ∈ V, w ∈ W .
With respect to the bases ei , fj of V and W , A will be a (dim V ) by
(dim V ) matrix, B will be a (dim W ) by (dim W ) matrix and A ⊗ B will
be a (dim V )(dim W ) by (dim V )(dim W ) matrix (which one can think of as a (dim V ) by (dim V ) matrix of blocks of size (dim W ); see the numerical sketch after this list).
• One often wants to consider tensor products of vector spaces and dual vec-
tor spaces. An important fact is that there is an isomorphism between the
tensor product V ∗ ⊗ W and linear maps from V to W given by identifying
l ⊗ w (l ∈ V ∗ ) with the linear map
v ∈ V → l(v)w ∈ W
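The matrix description of $A\otimes B$ in the second item above is easy to check numerically. The following is a minimal sketch (not from the text), using NumPy's Kronecker product as the matrix of $A\otimes B$ in the product basis; the dimensions and random matrices are arbitrary illustrative choices.

import numpy as np

dimV, dimW = 2, 3
rng = np.random.default_rng(0)
A = rng.standard_normal((dimV, dimV))
B = rng.standard_normal((dimW, dimW))
v = rng.standard_normal(dimV)
w = rng.standard_normal(dimW)

AB = np.kron(A, B)               # (dim V)(dim W) by (dim V)(dim W) matrix of blocks A_jk B
lhs = AB @ np.kron(v, w)         # (A tensor B)(v tensor w) in coordinates
rhs = np.kron(A @ v, B @ w)      # Av tensor Bw in coordinates
print(np.allclose(lhs, rhs))     # True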
For V a real vector space, its complexification VC (the vector space one
gets by allowing multiplication by both real and imaginary numbers) can be
identified with the tensor product
$$V_{\mathbf{C}} = V \otimes_{\mathbf{R}} \mathbf{C}$$
Here the notation ⊗R indicates a tensor product of two real vector spaces: V
of dimension dim V with basis {e1 , e2 , . . . , edim V } and C = R2 of dimension 2
with basis {1, i}.
9.2 Composite quantum systems and tensor products
Given two independent (non-interacting) quantum systems, with state spaces $\mathcal{H}_1$ and $\mathcal{H}_2$ and algebras of observables $\mathcal{O}_1$ and $\mathcal{O}_2$, the composite system is described by the tensor product state space
$$\mathcal{H}_T = \mathcal{H}_1 \otimes \mathcal{H}_2$$
with operators of the form
$$A\otimes\mathrm{Id} + \mathrm{Id}\otimes B$$
with A ∈ O1 , B ∈ O2 . To describe an interacting quantum system, one can use
the state space HT , but with a more general class of operators.
If H is the state space of a quantum system, one can think of this as de-
scribing a single particle, and then to describe a system of N such particles, one
uses the multiple tensor product
$$\mathcal{H}^{\otimes N} = \underbrace{\mathcal{H}\otimes\mathcal{H}\otimes\cdots\otimes\mathcal{H}\otimes\mathcal{H}}_{N\ \text{times}}$$
The symmetric group SN acts on this state space, and one has a repre-
sentation (π, H⊗N ) of SN as follows. For σ ∈ SN a permutation of the set
{1, 2, . . . , N } of N elements, on a tensor product of vectors one has
π(σ)(v1 ⊗ v2 ⊗ · · · ⊗ vN ) = vσ(1) ⊗ vσ(2) ⊗ · · · ⊗ vσ(N )
The representation of SN that this gives is in general reducible, containing
various components with different irreducible representations of the group SN .
A fundamental axiom of quantum mechanics is that if H⊗N describes N iden-
tical particles, then all physical states occur as one-dimensional representations
of SN , which are either symmetric (“bosons”) or antisymmetric (“fermions”)
where
Definition. A state v ∈ H⊗N is called
• symmetric, or bosonic if ∀σ ∈ SN
π(σ)v = v
The space of such states is denoted S N (H).
• antisymmetric, or fermionic if ∀σ ∈ SN
π(σ)v = (−1)|σ| v
The space of such states is denoted ΛN (H). Here |σ| is the minimal num-
ber of transpositions that by composition give σ.
Note that in the fermionic case, for σ a transposition interchanging two
particles, the antisymmetric representation π acts on the factor H ⊗ H by in-
terchanging vectors, taking
w⊗w ∈H⊗H
to itself. Antisymmetry requires that this state go to its negative, so the state
cannot be non-zero. So one cannot have non-zero states in H⊗N describing two
identical particles in the same state w ∈ H, a fact that is known as the “Pauli
Principle”.
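Here is a small numerical illustration (not from the text) of the $S_2$ action on $\mathcal{H}\otimes\mathcal{H}$ for $\mathcal{H} = \mathbf{C}^d$: the swap operator exchanges the two tensor factors, and antisymmetrizing $w\otimes w$ gives zero, which is the Pauli principle for two identical fermions in the same state. The dimension and test vector are arbitrary choices.

import numpy as np

d = 3
# Swap operator on C^d tensor C^d in the product basis: sends e_j tensor e_k to e_k tensor e_j
P = np.zeros((d*d, d*d))
for j in range(d):
    for k in range(d):
        P[k*d + j, j*d + k] = 1.0

w = np.random.default_rng(1).standard_normal(d)
ww = np.kron(w, w)                        # the state w tensor w

antisym = 0.5 * (np.eye(d*d) - P) @ ww    # projection onto the antisymmetric subspace
sym = 0.5 * (np.eye(d*d) + P) @ ww        # projection onto the symmetric subspace
print(np.allclose(antisym, 0))            # True: no antisymmetric state with both particles in w
print(np.allclose(sym, ww))               # True: w tensor w is already symmetric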
While the symmetry or antisymmetry of states of multiple identical particles
is a separate axiom when such particles are described in this way as tensor
products, we will see later on (chapter 34) that this phenomenon instead finds
a natural explanation when particles are described in terms of quantum fields.
9.3 Indecomposable vectors and entanglement
If one is given a function f on a space X and a function g on a space Y , one
can form a product function f g on the product space X × Y by taking (for
x ∈ X, y ∈ Y )
(f g)(x, y) = f (x)g(y)
However, most functions on X × Y are not decomposable in this manner. Sim-
ilarly, for a tensor product of vector spaces, one has:
Note that our basis vectors of V ⊗ W are all decomposable since they are
products of basis vectors of V and W . Linear combinations of these basis vectors
however are in general indecomposable. If we think of an element of V ⊗ W
as a dim V by dim W matrix, with entries the coordinates with respect to our
basis vectors for V ⊗ W , then for decomposable vectors we get a special class
of matrices, those of rank one.
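A minimal numerical sketch (not from the text) of this rank-one criterion: reshaping a vector in $V\otimes W$ into a $\dim V$ by $\dim W$ matrix of coordinates, decomposable vectors give rank-one matrices, while a generic linear combination (here the spin singlet vector that appears below) does not.

import numpy as np

dimV, dimW = 2, 2
rng = np.random.default_rng(2)
v, w = rng.standard_normal(dimV), rng.standard_normal(dimW)

decomposable = np.kron(v, w).reshape(dimV, dimW)
print(np.linalg.matrix_rank(decomposable))            # 1

# (e1 tensor e2 - e2 tensor e1)/sqrt(2), an indecomposable (entangled) vector
singlet = (np.kron([1, 0], [0, 1]) - np.kron([0, 1], [1, 0])) / np.sqrt(2)
print(np.linalg.matrix_rank(singlet.reshape(2, 2)))   # 2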
In the physics context, the language used is:
Definition (Entangled state). A state in the tensor product $\mathcal{H}_1\otimes\mathcal{H}_2$ of two quantum systems is called an entangled state if it is not decomposable, i.e., not of the form $v\otimes w$.
9.4 Tensor products of representations
Definition (Tensor product representation of a group). For (πV , V ) and (πW , W )
representations of a group G, one has a tensor product representation (πV ⊗W , V ⊗
W ) defined by
(πV ⊗W (g))(v ⊗ w) = πV (g)v ⊗ πW (g)w
One can easily check that πV ⊗W is a homomorphism.
To see what happens for the corresponding Lie algebra representation, one
computes (for X in the Lie algebra)
$$\pi'_{V\otimes W}(X)(v\otimes w) = \frac{d}{dt}\pi_{V\otimes W}(e^{tX})(v\otimes w)\Big|_{t=0} = \frac{d}{dt}\left(\pi_V(e^{tX})v\otimes\pi_W(e^{tX})w\right)\Big|_{t=0}$$
$$= \left(\frac{d}{dt}\pi_V(e^{tX})v\right)\otimes\pi_W(e^{tX})w\Big|_{t=0} + \pi_V(e^{tX})v\otimes\left(\frac{d}{dt}\pi_W(e^{tX})w\right)\Big|_{t=0}$$
$$= (\pi'_V(X)v)\otimes w + v\otimes(\pi'_W(X)w)$$
which could also be written
$$\pi'_{V\otimes W}(X) = (\pi'_V(X)\otimes\mathbf{1}_W) + (\mathbf{1}_V\otimes\pi'_W(X))$$
9.4.2 Characters of representations
A standard tool for dealing with representations that we have ignored so far is
that of associating to a representation an invariant called its character. This
will be a conjugation-invariant function on the group that only depends on the
equivalence class of the representation. Given two representations constructed
in very different ways, one can often check whether they are isomorphic just by
seeing if their character functions match. The problem of identifying the possible
irreducible representations of a group can be attacked by analyzing the possible
character functions of irreducible representations. We will not try and enter
into the general theory of characters here, but will just see what the characters
of irreducible representations are for the case of G = SU (2). These can be used
to give a simple argument for the Clebsch-Gordan decomposition of the tensor
product of SU (2) representations. For this we don’t need general theorems
about the relations of characters and representations, but can directly check
that the irreducible representations of SU (2) correspond to distinct character
functions which are easily evaluated.
Definition (Character). The character of a representation $(\pi, V)$ of a group $G$ is the function on $G$ given by
$$\chi_V(g) = \mathrm{tr}(\pi(g))$$
Characters behave simply under direct sums and tensor products:
$$\chi_{V\oplus W} = \chi_V + \chi_W,\qquad \chi_{V\otimes W} = \chi_V\,\chi_W$$
Since the character is conjugation-invariant, for $G = SU(2)$ it is determined by its values on the diagonal matrices, where for the irreducible representation $V^n$ one finds
$$\chi_{V^n}\begin{pmatrix}e^{i\theta} & 0\\ 0 & e^{-i\theta}\end{pmatrix} = \frac{e^{i(n+1)\theta} - e^{-i(n+1)\theta}}{e^{i\theta} - e^{-i\theta}} = \frac{\sin((n+1)\theta)}{\sin\theta}$$
To get a proof of 9.1, one can compute the character of the tensor product
on the diagonal matrices using the Weyl character formula for the second factor
(ordering things so that n2 > n1 )
$$\chi_{V^{n_1}\otimes V^{n_2}} = \chi_{V^{n_1}}\chi_{V^{n_2}} = \left(e^{in_1\theta} + e^{i(n_1-2)\theta} + \cdots + e^{-i(n_1-2)\theta} + e^{-in_1\theta}\right)\frac{e^{i(n_2+1)\theta} - e^{-i(n_2+1)\theta}}{e^{i\theta} - e^{-i\theta}}$$
$$= \frac{(e^{i(n_1+n_2+1)\theta} - e^{-i(n_1+n_2+1)\theta}) + \cdots + (e^{i(n_2-n_1+1)\theta} - e^{-i(n_2-n_1+1)\theta})}{e^{i\theta} - e^{-i\theta}}$$
$$= \chi_{V^{n_1+n_2}} + \chi_{V^{n_1+n_2-2}} + \cdots + \chi_{V^{n_2-n_1}}$$
So, when we decompose the tensor product of irreducibles into a direct sum of
irreducibles, the ones that must occur are exactly those of theorem 9.1.
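The character identity above is easy to spot-check numerically. The following sketch (not from the text) evaluates $\chi_{V^n}(\theta) = \sin((n+1)\theta)/\sin\theta$ on a grid and verifies the decomposition for one illustrative choice of $n_1, n_2$.

import numpy as np

def chi(n, theta):
    # character of the irreducible V^n evaluated on the diagonal SU(2) element
    return np.sin((n + 1) * theta) / np.sin(theta)

theta = np.linspace(0.1, 3.0, 50)     # avoid theta = 0, where sin(theta) vanishes
n1, n2 = 2, 3                         # spin 1 tensored with spin 3/2 (n2 > n1)
lhs = chi(n1, theta) * chi(n2, theta)
rhs = sum(chi(n, theta) for n in range(n2 - n1, n1 + n2 + 1, 2))
print(np.allclose(lhs, rhs))          # True: V^2 tensor V^3 = V^5 + V^3 + V^1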
$$V^1\otimes V^1 = V^2\oplus V^0$$
This says that the four complex dimensional tensor product of two spinor
representations (which are each two complex dimensional) decomposes
into irreducibles as the sum of a three dimensional vector representation
and a one dimensional trivial (scalar) representation.
Using the basis $\begin{pmatrix}1\\0\end{pmatrix}, \begin{pmatrix}0\\1\end{pmatrix}$ for $V^1$, the tensor product $V^1\otimes V^1$ has a basis
$$\begin{pmatrix}1\\0\end{pmatrix}\otimes\begin{pmatrix}1\\0\end{pmatrix},\ \begin{pmatrix}1\\0\end{pmatrix}\otimes\begin{pmatrix}0\\1\end{pmatrix},\ \begin{pmatrix}0\\1\end{pmatrix}\otimes\begin{pmatrix}1\\0\end{pmatrix},\ \begin{pmatrix}0\\1\end{pmatrix}\otimes\begin{pmatrix}0\\1\end{pmatrix}$$
The vector
$$\frac{1}{\sqrt{2}}\left(\begin{pmatrix}1\\0\end{pmatrix}\otimes\begin{pmatrix}0\\1\end{pmatrix} - \begin{pmatrix}0\\1\end{pmatrix}\otimes\begin{pmatrix}1\\0\end{pmatrix}\right)\in V^1\otimes V^1$$
spans the one dimensional component $V^0$ (it is the $SU(2)$-invariant "singlet"), while the vectors
$$\begin{pmatrix}1\\0\end{pmatrix}\otimes\begin{pmatrix}1\\0\end{pmatrix},\quad \frac{1}{\sqrt{2}}\left(\begin{pmatrix}1\\0\end{pmatrix}\otimes\begin{pmatrix}0\\1\end{pmatrix} + \begin{pmatrix}0\\1\end{pmatrix}\otimes\begin{pmatrix}1\\0\end{pmatrix}\right),\quad \begin{pmatrix}0\\1\end{pmatrix}\otimes\begin{pmatrix}0\\1\end{pmatrix}$$
span the three dimensional component $V^2$.
These three vectors span one-dimensional complex subspaces of weights
q = 2, 0, −2 under the U (1) ⊂ SU (2) subgroup
$$\begin{pmatrix}e^{i\theta} & 0\\ 0 & e^{-i\theta}\end{pmatrix}$$
$$V^1\otimes V^1\otimes V^1 = (V^2\oplus V^0)\otimes V^1 = (V^2\otimes V^1)\oplus(V^0\otimes V^1) = V^3\oplus V^1\oplus V^1$$
This says that the tensor product of three spinor representations decom-
poses as a four dimensional (“spin 3/2”) representation plus two copies of
the spinor representation.
One can clearly generalize this and consider N -fold tensor products (V 1 )⊗N
of the spinor representation. Taking N high enough one can get any ir-
reducible representation of SU (2) that one wants this way, giving an al-
ternative to our construction using homogeneous polynomials. Doing this
however gives the irreducible as just one component of something larger,
and one needs a method to project out the component one wants. One
can do this using the action of the symmetric group SN on (V 1 )⊗N and
an understanding of the irreducible representations of SN . This relation-
ship between irreducible representations of SU (2) and those of SN coming
from looking at how both groups act on (V 1 )⊗N is known as “Schur-Weyl
duality”, and generalizes to the case of SU (n), where one looks at N -fold
tensor products of the defining representation of SU (n) matrices on Cn .
For SU (n) this provides perhaps the most straight-forward construction
of all irreducible representations of the group.
A bilinear form $B$ on a vector space $V$ over a field $k$ is a map
$$B : (u, u')\in V\times V \to B(u, u')\in k$$
that is linear in each entry separately. The space of bilinear forms can be identified with $V^*\otimes V^*$, with a decomposable element $\alpha\otimes\beta\in V^*\otimes V^*$ corresponding to the bilinear form
$$B(u, u') = \alpha(u)\beta(u')$$
Choosing a basis $e_j$ of $V$, a bilinear form can be written in terms of tensor products of the dual basis elements.
This expresses the bilinear form B in terms of a matrix B with entries Bjk ,
which can be computed as
Bjk = B(ej , ek )
In terms of the matrix $B$, the bilinear form is computed as
$$B(u, u') = \begin{pmatrix}u_1 & \cdots & u_d\end{pmatrix}\begin{pmatrix}B_{11} & \cdots & B_{1d}\\ \vdots & \ddots & \vdots\\ B_{d1} & \cdots & B_{dd}\end{pmatrix}\begin{pmatrix}u'_1\\ \vdots\\ u'_d\end{pmatrix} = u\cdot Bu'$$
Setting $u' = u$, a bilinear form gives a quadratic function
$$u\in V \to B(u, u) = u\cdot Bu$$
That one gets quadratic functions by multiplying two linear functions corre-
sponds in terms of tensor products to
$$(\alpha, \beta)\in V^*\times V^* \to \frac{1}{2}(\alpha\otimes\beta + \beta\otimes\alpha)\in S^2(V^*)$$
We will not give the details here, but one can generalize the above from
bilinear forms (isomorphic to V ∗ ⊗ V ∗ ) to multi-linear forms with N arguments
(isomorphic to (V ∗ )⊗N ). Evaluating such a multi-linear form with all argu-
ments set to u ∈ V gives a homogeneous polynomial of degree N , and one
has an isomorphism between symmetric multi-linear forms in S N (V ∗ ) and such
polynomials.
Antisymmetric bilinear forms lie in Λ2 (V ∗ ) ⊂ V ∗ ⊗ V ∗ and correspond to
antisymmetric matrices. One can define a multiplication (called the “wedge
product”) on V ∗ that takes values in Λ2 (V ∗ ) by
$$(\alpha, \beta)\in V^*\times V^* \to \alpha\wedge\beta = \frac{1}{2}(\alpha\otimes\beta - \beta\otimes\alpha)\in\Lambda^2(V^*)$$
One can use this to get a product on the space of antisymmetric multilinear
forms of different degrees, giving something in many ways analogous to the
algebra of polynomials. This plays a role in the description of fermions and will
be considered in more detail in chapter 27.
Chapter 10
We’ll now turn to the problem that conventional quantum mechanics courses
generally begin with: that of the quantum system describing a free particle
moving in physical space R3. This is something quite different from the classical mechanical description of a free particle, which will be reviewed in chapter 12.
A common way of motivating this is to begin with the 1924 suggestion by de
Broglie that, just as photons may behave like particles or waves, the same should
be true for matter particles. Photons carry an energy given by E = ~ω, where
ω is the angular frequency, and de Broglie’s proposal was that matter particles
behave like a wave with spatial dependence
eik·x
where x is the spatial position, and the momentum of the particle is p = ~k.
This proposal was realized in Schrödinger’s early 1926 discovery of a version
of quantum mechanics, in which the state space H is a space of complex-valued
functions on R3 , called “wavefunctions”. The operator
$$\mathbf{P} = -i\hbar\boldsymbol{\nabla}$$
then has the de Broglie waves $e^{i\mathbf{k}\cdot\mathbf{x}}$ as eigenfunctions, with eigenvalues the momenta $\hbar\mathbf{k}$. We will see that in the Hamiltonian form of classical mechanics, the components of
the momentum vector give a basis of the Lie algebra of the spatial translation
group R3 , the energy a basis of the Lie algebra of the time translation group
R. Invoking the classical relationship between energy and momentum
$$E = \frac{|\mathbf{p}|^2}{2m}$$
used in non-relativistic mechanics relates the Hamiltonian and momentum oper-
ators, giving the conventional Schrödinger differential equation for the wavefunc-
tion of a free particle. We will examine the solutions to this equation, beginning
with the case of periodic boundary conditions, where spatial translations in each
direction are given by the compact group U (1) (whose representations we have
already studied in detail).
Unitarity of the time evolution operator $U(t) = e^{-\frac{i}{\hbar}tH}$ for $H$ a self-adjoint matrix follows from the same sort of argument as in theorem 2.1. Such a $U(t)$ provides solutions of the Schrödinger equation by
$$|\psi(t)\rangle = U(t)|\psi(0)\rangle$$
The minus sign is a convention, for reasons that will be explained in the discussion of momentum to come later.
Note that if one wants to treat the additive group R as a matrix group,
related to its Lie algebra R by exponentiation of matrices, one can describe the
group as the group of matrices of the form
$$\begin{pmatrix}1 & a\\ 0 & 1\end{pmatrix}$$
since
$$\begin{pmatrix}1 & a\\ 0 & 1\end{pmatrix}\begin{pmatrix}1 & b\\ 0 & 1\end{pmatrix} = \begin{pmatrix}1 & a+b\\ 0 & 1\end{pmatrix}$$
Since
$$e^{\begin{pmatrix}0 & a\\ 0 & 0\end{pmatrix}} = \begin{pmatrix}1 & a\\ 0 & 1\end{pmatrix}$$
the Lie algebra is just matrices of the form
$$\begin{pmatrix}0 & a\\ 0 & 0\end{pmatrix}$$
We will mostly though write the group law in additive form. We are inter-
ested in the group R as a group of translations acting on a linear space, and the
corresponding infinite dimensional representation induced on functions on the
space. The simplest case is when R acts on itself by translation. Here a ∈ R
acts on q ∈ R (where q is a coordinate on R) by
$$q \to a\cdot q = q + a$$
Using the general prescription for the induced representation on functions
$$\pi(g)f(q) = f(g^{-1}\cdot q)$$
we get
$$\pi(a)f(q) = f(q - a)$$
In the Lie algebra version of this representation, we will have
$$\pi'(a) = -a\frac{d}{dq}$$
since
$$\pi(a)f = e^{\pi'(a)}f = e^{-a\frac{d}{dq}}f(q) = f(q) - a\frac{df}{dq} + \frac{a^2}{2!}\frac{d^2f}{dq^2} + \cdots = f(q-a)$$
which for functions with appropriate properties is just Taylor’s formula. Note
that here the same a labels points of the Lie algebra and of the group. We are
not treating the group R as a matrix group, since we want an additive group
law. So Lie algebra elements are not defined as for matrix groups (things one
exponentiates to get group elements). Instead, we think of the Lie algebra as
the tangent space to the group at the identity, and then simply identify R as the
tangent space at 0 (the Lie algebra) and R as the additive group. Note however
that the representation obeys a multiplicative law, with the homomorphism
property
π(a + b) = π(a)π(b)
so there is an exponential in the relation between π and π 0 .
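A quick numerical illustration (not from the text) of the relation $\pi(a)f = e^{-a\frac{d}{dq}}f = f(q-a)$: approximating the exponential of the derivative operator by a truncated Taylor series, for the arbitrary test function $\sin$ and shift $a = 0.3$.

import math
import numpy as np

a, q = 0.3, 1.7
# successive derivatives of sin evaluated at q: sin, cos, -sin, -cos, repeating
derivs = [np.sin(q), np.cos(q), -np.sin(q), -np.cos(q)]
series = sum((-a)**n / math.factorial(n) * derivs[n % 4] for n in range(30))
print(series, np.sin(q - a))    # both values agree: f(q - a)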
Since we now want to describe quantum systems that depend not just on
time, but on space variables q = (q1 , q2 , q3 ), we will have an action by unitary
transformations of not just the group R of time translations, but also the group
R3 of spatial translations. We will define the corresponding Lie algebra rep-
resentations using self-adjoint operators P1 , P2 , P3 that play the same role for
spatial translations that the Hamiltonian plays for time translations:
$$P_1 = -i\hbar\frac{\partial}{\partial q_1},\quad P_2 = -i\hbar\frac{\partial}{\partial q_2},\quad P_3 = -i\hbar\frac{\partial}{\partial q_3}$$
These are given the name “momentum operators” since we will see that their
eigenvalues have an interpretation as the components of the momentum vector
for the system, just as the eigenvalues of the Hamiltonian have an interpretation
as the energy. Note that while in the case of the Hamiltonian the factor of ~ kept
track of the relative normalization of energy and time units, here it plays the
same role for momentum and length units. It can be set to one if appropriate
choices of units of momentum and length are made.
The differentiation operator is skew-adjoint since, using integration by parts
one has for ψ ∈ H
$$\int_{-\infty}^{+\infty}\overline{\psi}\left(\frac{d}{dq}\psi\right)dq = \int_{-\infty}^{+\infty}\left(\frac{d}{dq}(\overline{\psi}\psi) - \left(\frac{d}{dq}\overline{\psi}\right)\psi\right)dq = -\int_{-\infty}^{+\infty}\left(\frac{d}{dq}\overline{\psi}\right)\psi\,dq$$
The Pj are thus self-adjoint operators, with real eigenvalues as expected for an
observable operator. Multiplying by −i to get the corresponding skew-adjoint
operator of a unitary Lie algebra representation we find
$$-iP_j = -\hbar\frac{\partial}{\partial q_j}$$
Up to the ~ factor that depends on units, these are exactly the Lie algebra
representation operators on basis elements for the action of R3 on functions on
R3 induced from translation:
$$\pi'(a_1, a_2, a_3) = a_1(-iP_1) + a_2(-iP_2) + a_3(-iP_3) = -\hbar\left(a_1\frac{\partial}{\partial q_1} + a_2\frac{\partial}{\partial q_2} + a_3\frac{\partial}{\partial q_3}\right)$$
Note that the convention for the sign choice here is the opposite from the
case of the Hamiltonian ($-iP = -\hbar\frac{d}{dq}$ vs. $-iH = \hbar\frac{d}{dt}$). This means that the
conventional sign choice we have been using for the Hamiltonian makes it minus
the generator of translations in the time direction. The reason for this comes
from considerations of special relativity, where the inner product on space-time
has opposite signs for the space and time dimensions. We will review this
subject in chapter 37 but for now we just need the relationship special relativity
gives between energy and momentum. Space and time are put together in
“Minkowski space”, which is $\mathbf{R}^4$ with an indefinite inner product of signature $(-+++)$. Energy and momentum are the components of a Minkowski space vector $(p_0 = E, p_1, p_2, p_3)$ with norm-squared given by minus the mass-squared:
$$-E^2 + |\mathbf{p}|^2 = -m^2$$
This is the formula for a choice of space and time units such that the speed of
light is 1. Putting in factors of the speed of light c to get the units right one
has
E 2 − |p|2 c2 = m2 c4
Two special cases of this are:
• For photons, m = 0, and one has the energy momentum relation E = |p|c
• For velocities v small compared to c (and thus momenta |p| small com-
pared to mc), one has
$$E = \sqrt{|\mathbf{p}|^2c^2 + m^2c^4} = c\sqrt{|\mathbf{p}|^2 + m^2c^2} \approx \frac{c|\mathbf{p}|^2}{2mc} + mc^2 = \frac{|\mathbf{p}|^2}{2m} + mc^2$$
In the non-relativistic limit, we use this energy-momentum relation to
describe particles with velocities small compared to c, typically dropping
the momentum-independent constant term mc2 .
In the non-relativistic theory, the Hamiltonian operator for a free particle is thus taken to be
$$H = \frac{1}{2m}(P_1^2 + P_2^2 + P_3^2) = \frac{1}{2m}|\mathbf{P}|^2 = \frac{-\hbar^2}{2m}\left(\frac{\partial^2}{\partial q_1^2} + \frac{\partial^2}{\partial q_2^2} + \frac{\partial^2}{\partial q_3^2}\right)$$
The Schrödinger equation then becomes:
$$i\hbar\frac{\partial}{\partial t}\psi(\mathbf{q}, t) = \frac{-\hbar^2}{2m}\left(\frac{\partial^2}{\partial q_1^2} + \frac{\partial^2}{\partial q_2^2} + \frac{\partial^2}{\partial q_3^2}\right)\psi(\mathbf{q}, t) = \frac{-\hbar^2}{2m}\nabla^2\psi(\mathbf{q}, t)$$
Solutions can be found by first solving the time-independent eigenvalue equation
$$H\psi_E(\mathbf{q}) = \frac{-\hbar^2}{2m}\nabla^2\psi_E(\mathbf{q}) = E\psi_E(\mathbf{q})$$
with eigenvalue $E$ for the Hamiltonian operator, and then using the fact that
$$\psi(\mathbf{q}, t) = \psi_E(\mathbf{q})e^{-\frac{i}{\hbar}tE}$$
will be a solution of the time-dependent equation
$$i\hbar\frac{\partial}{\partial t}\psi(\mathbf{q}, t) = H\psi(\mathbf{q}, t)$$
The solutions $\psi_E(\mathbf{q})$ to the time-independent equation are just complex exponentials proportional to
$$e^{i\mathbf{k}\cdot\mathbf{q}}$$
satisfying
$$\frac{-\hbar^2}{2m}(-i)^2|\mathbf{k}|^2 = \frac{\hbar^2|\mathbf{k}|^2}{2m} = E$$
We have found that solutions to the Schrödinger equation are given by linear
combinations of states |ki labeled by a vector k, which are eigenstates of the
momentum and Hamiltonian operators with
$$P_j|\mathbf{k}\rangle = \hbar k_j|\mathbf{k}\rangle,\qquad H|\mathbf{k}\rangle = \frac{\hbar^2}{2m}|\mathbf{k}|^2|\mathbf{k}\rangle$$
These are states with well-defined momentum and energy
$$p_j = \hbar k_j,\qquad E = \frac{|\mathbf{p}|^2}{2m}$$
so they satisfy exactly the same energy-momentum relations as those for a clas-
sical non-relativistic particle.
While the quantum mechanical state space H contains states with the clas-
sical energy-momentum relation, it also contains much, much more since it
includes linear combinations of such states. At t = 0 one has
$$|\psi\rangle = \sum_{\mathbf{k}}c_{\mathbf{k}}e^{i\mathbf{k}\cdot\mathbf{q}}$$
where ck are complex numbers, and the general time-dependent state will be
$$|\psi(t)\rangle = \sum_{\mathbf{k}}c_{\mathbf{k}}e^{i\mathbf{k}\cdot\mathbf{q}}e^{-it\hbar\frac{|\mathbf{k}|^2}{2m}}$$
One would like to define an inner product on such wavefunctions by
$$\langle\psi_1|\psi_2\rangle = C\int_{\mathbf{R}^3}\overline{\psi_1(\mathbf{q})}\psi_2(\mathbf{q})\,d^3\mathbf{q}$$
for some constant $C$. But if we try and compute the norm-squared of one of our basis states $|\mathbf{k}\rangle$ we find
$$\langle\mathbf{k}|\mathbf{k}\rangle = C\int_{\mathbf{R}^3}(\overline{e^{i\mathbf{k}\cdot\mathbf{q}}})(e^{i\mathbf{k}\cdot\mathbf{q}})\,d^3\mathbf{q} = C\int_{\mathbf{R}^3}1\,d^3\mathbf{q} = \infty$$
As a result there is no value of C which will give these states a unit norm.
In the finite dimensional case, a linear algebra theorem assures us that given a
self-adjoint operator, we can find an orthonormal basis of its eigenvectors. In this
infinite dimensional case this is no longer true, and a much more sophisticated
formalism (the “spectral theorem for self-adjoint operators”) is needed to replace
the linear algebra theorem. This is a standard topic in treatments of quantum
mechanics aimed at mathematicians emphasizing analysis, but we will not try
and enter into this here. One place to find such a discussion is section 2.1 of
[64].
One way to deal with the normalization problem is to replace the non-
compact space by one of finite volume. We’ll consider first the simplified case of
a single spatial dimension, since once one sees how this works for one dimension,
treating the others the same way is straight-forward. In this one dimensional
case, one replaces R by the circle S 1 . This is equivalent to the physicist’s method
of imposing “periodic boundary conditions”, meaning to define the theory on
an interval, and then identify the ends of the interval. One can then think of
the position variable q as an angle φ and define the inner product as
$$\langle\psi_1, \psi_2\rangle = \frac{1}{2\pi}\int_0^{2\pi}\overline{\psi_1(\phi)}\psi_2(\phi)\,d\phi$$
The state space is then
H = L2 (S 1 )
the space of complex-valued square-integrable functions on the circle.
Instead of the translation group R, we have the standard action of the
group SO(2) on the circle. Elements g(θ) of the group are rotations of the circle
counterclockwise by an angle θ, or if we parametrize the circle by an angle φ,
just shifts
φ→φ+θ
Recall that in general we can construct a representation on functions from a group action on a space by
$$\pi(g)f(x) = f(g^{-1}\cdot x)$$
so here
$$\pi(g(\theta))\psi(\phi) = \psi(\phi - \theta)$$
The corresponding Lie algebra representation is given by the operator $\pi'(X) = -\frac{d}{d\phi}$, where $X$ is a basis element of the Lie algebra of $SO(2)$. The eigenfunctions of $\pi'(X)$ are just the $e^{in\phi}$, for $n\in\mathbf{Z}$, which we will also
write as state vectors |ni. These are orthonormal
hn|mi = δnm
and provide a basis for the space L2 (S 1 ), a basis that corresponds to the de-
composition into irreducibles of
L2 (S 1 )
as a representation of SO(2) described above. One has
πn (g(θ)) = einθ
The theory of Fourier series for functions on S 1 says that one can expand any
function ψ ∈ L2 (S 1 ) in terms of this basis, i.e.
$$|\psi\rangle = \psi(\phi) = \sum_{n=-\infty}^{+\infty}c_ne^{in\phi} = \sum_{n=-\infty}^{+\infty}c_n|n\rangle$$
The Lie algebra of the group S 1 is the same as that of the group (R, +),
and the π 0 (X) we have found for the S 1 action on functions is related to the
momentum operator in the same way as in the R case. So, we can use the same
momentum operator
$$P = -i\hbar\frac{d}{d\phi}$$
which satisfies
P |ni = ~n|ni
By changing space to the compact S 1 we now have momenta that instead of
taking on any real value, can only be integral numbers times ~. Solving the
Schrödinger equation
$$i\hbar\frac{\partial}{\partial t}\psi(\phi, t) = \frac{P^2}{2m}\psi(\phi, t) = \frac{-\hbar^2}{2m}\frac{\partial^2}{\partial\phi^2}\psi(\phi, t)$$
as before, we find
$$E\psi_E(\phi) = \frac{-\hbar^2}{2m}\frac{d^2}{d\phi^2}\psi_E(\phi)$$
an eigenvector equation, which has solutions |ni, with
$$E = \frac{\hbar^2n^2}{2m}$$
Writing a solution to the Schrödinger equation as
$$\psi(\phi, t) = \sum_{n=-\infty}^{+\infty}c_ne^{in\phi}e^{-i\frac{\hbar n^2}{2m}t}$$
the cn will be determined from the initial condition of knowing the wavefunction
at time t = 0, according to the Fourier coefficient formula
$$c_n = \frac{1}{2\pi}\int_0^{2\pi}e^{-in\phi}\psi(\phi, 0)\,d\phi$$
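This Fourier-series solution is easy to implement numerically. The following sketch (not from the text) evolves an arbitrary initial wavefunction on the circle, using the FFT to approximate the coefficients $c_n$ and multiplying each mode by its phase; units with $\hbar = m = 1$ are assumed for simplicity.

import numpy as np

N = 256
phi = np.linspace(0, 2*np.pi, N, endpoint=False)
psi0 = np.exp(-10 * (phi - np.pi)**2)      # an arbitrary initial wavefunction at t = 0

c = np.fft.fft(psi0) / N                   # Fourier coefficients c_n
n = np.fft.fftfreq(N, d=1.0/N)             # the corresponding integers n

def psi(t):
    # each mode e^{i n phi} evolves by the phase e^{-i n^2 t / 2}
    return (c * np.exp(-1j * n**2 * t / 2)) @ np.exp(1j * np.outer(n, phi))

print(np.allclose(psi(0.0), psi0))         # True: t = 0 recovers the initial wavefunction
print(np.allclose(np.sum(np.abs(psi(0.7))**2), np.sum(np.abs(psi0)**2)))   # True: norm is conserved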
To get something more realistic, we need to take our circle to have an ar-
bitrary circumference L, and we can study our original problem by considering
the limit L → ∞. To do this, we just need to change variables from φ to φL ,
where
$$\phi_L = \frac{L}{2\pi}\phi$$
The momentum operator will now be
$$P = -i\hbar\frac{d}{d\phi_L}$$
and its eigenvalues will be quantized in units of $\frac{2\pi\hbar}{L}$. The energy eigenvalues
will be
$$E = \frac{2\pi^2\hbar^2n^2}{mL^2}$$
Definition (Fourier transform). The Fourier transform of a function $\psi$ is given by
$$\mathcal{F}\psi = \widetilde{\psi}(k) \equiv \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}e^{-ikq}\psi(q)\,dq$$
The definition makes sense for ψ ∈ L1 (R), Lebesgue integrable functions on
R. For the following, it is convenient to instead restrict to the Schwartz space
S(R) of functions ψ such that the function and its derivatives fall off faster than
any power at infinity (which is a dense subspace of L2 (R)). For more details
about the analysis and proofs of the theorems quoted here, one can refer to a
standard textbook such as [63].
Given the Fourier transform of ψ, one can recover ψ itself:
Theorem (Fourier Inversion). For $\widetilde{\psi}\in\mathcal{S}(\mathbf{R})$ the Fourier transform of a function $\psi\in\mathcal{S}(\mathbf{R})$, one has
$$\psi(q) = \widetilde{\mathcal{F}}\widetilde{\psi} = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{+\infty}e^{ikq}\widetilde{\psi}(k)\,dk$$
Note that Fe is the same linear operator as F, with a change in sign of the
argument of the function it is applied to. Note also that we are choosing one
of various popular ways of normalizing the definition of the Fourier transform.
In others, the factor of 2π may appear instead in the exponent of the complex
exponential, or just in one of F or Fe and not the other.
The operators F and Fe are thus inverses of each other on S(R). One has
Theorem (Plancherel). F and Fe extend to unitary isomorphisms of L2 (R)
with itself. In other words
$$\int_{-\infty}^{\infty}|\psi(q)|^2\,dq = \int_{-\infty}^{\infty}|\widetilde{\psi}(k)|^2\,dk$$
A crucial property of the unitary operator F on H is that it diagonalizes
the differentiation operator and thus the momentum operator P . Under Fourier
transform, differential operators become just multiplication by a polynomial,
giving a powerful technique for solving differential equations. Computing the
Fourier transform of the differentiation operator using integration by parts, we
find
$$\widetilde{\frac{d\psi}{dq}} = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{+\infty}e^{-ikq}\frac{d\psi}{dq}\,dq = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{+\infty}\left(\frac{d}{dq}(e^{-ikq}\psi) - \left(\frac{d}{dq}e^{-ikq}\right)\psi\right)dq = ik\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{+\infty}e^{-ikq}\psi\,dq = ik\,\widetilde{\psi}(k)$$
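A short numerical check (not from the text) that the Fourier transform turns differentiation into multiplication by $ik$, using the discrete Fourier transform on a sampled Gaussian; the grid size and width are arbitrary choices.

import numpy as np

L, N = 40.0, 1024
q = np.linspace(-L/2, L/2, N, endpoint=False)
k = 2 * np.pi * np.fft.fftfreq(N, d=L/N)      # discrete wave numbers

psi = np.exp(-q**2)                           # a rapidly decaying Gaussian
dpsi = -2 * q * np.exp(-q**2)                 # its derivative, computed by hand

lhs = np.fft.fft(dpsi)                        # transform of d psi / dq
rhs = 1j * k * np.fft.fft(psi)                # ik times the transform of psi
print(np.allclose(lhs, rhs, atol=1e-8))       # True, up to discretization error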
Since p = ~k, we can easily change variables and work with p instead of k,
and often will do this from now on. As with the factors of 2π, there’s a choice
of where to put the factors of ~ in the normalization of the Fourier transform.
We’ll make the following choices, to preserve symmetry between the formulas
for Fourier transform and inverse Fourier transform:
$$\widetilde{\psi}(p) = \frac{1}{\sqrt{2\pi\hbar}}\int_{-\infty}^{+\infty}e^{-i\frac{pq}{\hbar}}\psi(q)\,dq$$
$$\psi(q) = \frac{1}{\sqrt{2\pi\hbar}}\int_{-\infty}^{+\infty}e^{i\frac{pq}{\hbar}}\widetilde{\psi}(p)\,dp$$
Note that in this case we have lost an important property that we had for
finite dimensional H and had managed to preserve by using S 1 rather than
R as our space. If we take H = L2 (R), the eigenvectors for the operator P
(the functions eikq ) are not square-integrable, so not in H. The operator P
is an unbounded operator and we no longer have a theorem saying that its
eigenvectors give an orthonormal basis of H. As mentioned earlier, one way
to deal with this uses a general spectral theorem for self-adjoint operators on a
Hilbert space, for more details see Chapter 2 of [64].
10.3.1 Delta functions
One would like to think of the eigenvectors of the operator P as in some sense
continuing to provide an orthonormal basis for H. One problem is that these
eigenvectors are not square-integrable, so one needs to expand one’s notion of
state space H beyond a space like L2 (R). Another problem is that Fourier
transforms of such eigenvectors (which will be eigenvectors of the position op-
erator) gives something that is not a function but a distribution. The proper
general formalism for handling state spaces H which include eigenvectors of both
position and momentum operators seems to be that of “rigged Hilbert spaces”
which this author confesses to never have mastered (the standard reference is
[21]). As a result we won’t here give a rigorous discussion, but will use non-
normalizable functions and distributions in the non-rigorous form in which they
are used in physics. The physics formalism is set up to work as if H was finite
dimensional and allows easy manipulations which don’t obviously make sense.
Our interest though is not in the general theory, but in very specific quantum
systems, where everything is determined by their properties as unitary group
representations. For such systems, the general theory of rigged Hilbert spaces
is not needed, since for the statements we are interested in various ways can
be found to make them precise (although we will generally not enter into the
complexities needed to do so).
Given any function g(q) on R, one can try and define an element of the dual
space of the space of functions on R by integration, i.e. by the linear operator
$$f \to \int_{-\infty}^{+\infty}g(q)f(q)\,dq$$
(we won’t try and specify which condition on functions f or g is chosen to make
sense of this). There are however some other very obvious linear functionals on
such a function space, for instance the one given by evaluating the function at
q = c:
f → f (c)
Such linear functionals correspond to generalized functions, objects which when
fed into the formula for integration over R give the desired linear functional.
The most well-known of these is the one that gives this evaluation at q = c, it
is known as the “delta function” and written as δ(q − c). It is the object which,
if it were a function, would satisfy
$$\int_{-\infty}^{+\infty}\delta(q - c)f(q)\,dq = f(c)$$
To make sense of such an object, one can take it to be a limit of actual functions. For the $\delta$-function, consider the limit as $\epsilon \to 0$ of the Gaussians
$$g_\epsilon = \frac{1}{\sqrt{2\pi}\,\epsilon}e^{-\frac{(q-c)^2}{2\epsilon^2}}$$
which satisfy
$$\int_{-\infty}^{+\infty}g_\epsilon(q)\,dq = 1$$
for all $\epsilon > 0$ (one way to see this is to use the formula given earlier for the Fourier transform of a Gaussian).
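The defining property can also be seen numerically. This sketch (not from the text) integrates $g_\epsilon f$ against an arbitrary smooth test function on a grid and watches the result approach $f(c)$ as $\epsilon \to 0$.

import numpy as np

c = 0.7
f = np.cos                                   # an arbitrary smooth test function
q = np.linspace(-10, 10, 200001)
dq = q[1] - q[0]

for eps in [1.0, 0.1, 0.01]:
    g = np.exp(-(q - c)**2 / (2*eps**2)) / (np.sqrt(2*np.pi) * eps)
    print(eps, np.sum(g * f(q)) * dq)        # tends to f(c) = cos(0.7), about 0.7648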
Heuristically (ignoring obvious problems of interchange of integrals that
don’t make sense), one can write the Fourier inversion formula as follows
$$\psi(q) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{+\infty}e^{ikq}\widetilde{\psi}(k)\,dk = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{+\infty}e^{ikq}\left(\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{+\infty}e^{-ikq'}\psi(q')\,dq'\right)dk$$
$$= \frac{1}{2\pi}\int_{-\infty}^{+\infty}\left(\int_{-\infty}^{+\infty}e^{ik(q-q')}\psi(q')\,dk\right)dq' = \int_{-\infty}^{+\infty}\delta(q'-q)\psi(q')\,dq'$$
Taking the delta function to be an even function (so $\delta(q'-q) = \delta(q-q')$), one can interpret the above calculation as justifying the formula
$$\delta(q - q') = \frac{1}{2\pi}\int_{-\infty}^{+\infty}e^{ik(q-q')}\,dk$$
With this normalization, the momentum eigenstates are written
$$|k\rangle = \frac{1}{\sqrt{2\pi}}e^{ikq}$$
As mentioned before, we will usually work with the variable p = ~k, in which
case we have
$$|p\rangle = \frac{1}{\sqrt{2\pi\hbar}}e^{i\frac{pq}{\hbar}}$$
and
$$\delta(p - p') = \frac{1}{2\pi\hbar}\int_{-\infty}^{+\infty}e^{i\frac{(p-p')q}{\hbar}}\,dq$$
10.4 For further reading
Every book about quantum mechanics covers this example of the free quantum
particle somewhere very early on, in detail. Our discussion here is unusual
just in emphasizing the role of the spatial translation groups and its unitary
representations. Discussions of quantum mechanics for mathematicians (such
as [64]) typically emphasize the development of the functional analysis needed
for a proper description of the Hilbert space H and of the properties of general
self-adjoint operators on this state space. In this class we’re restricting attention
to a quite limited set of such operators coming from Lie algebra representations,
so will avoid the general theory.
Chapter 11
In our discussion of the free particle, we used just the actions of the groups R3 of
spatial translations and the group R of time translations, finding corresponding
observables, the self-adjoint momentum (P ) and Hamiltonian (H) operators.
We’ve seen though that the Fourier transform involves a perfectly symmetrical
treatment of position and momentum variables. This allows us to introduce a
position operator Q acting on our state space H. We will analyze in detail in
this chapter the implications of extending the algebra of observable operators in
this way, most of the time restricting to the case of a single spatial dimension,
since the physical case of three dimensions is an easy generalization.
The P and Q operators generate an algebra usually called the Heisenberg
algebra, since Werner Heisenberg and collaborators used it in the earliest work
on a full quantum-mechanical formalism in 1925. It was quickly recognized by
Hermann Weyl that this algebra comes from a Lie algebra representation, with
a corresponding group (called the Heisenberg group by mathematicians, the
Weyl group by physicists). The state space of a quantum particle, either free or
moving in a potential, will be a unitary representation of this group, with the
group of spatial translations a subgroup. Note that this particular use of a group
and its representation theory in quantum mechanics is both at the core of the
standard axioms and much more general than the usual characterization of the
significance of groups as “symmetry groups”. The Heisenberg group does not in
any sense correspond to a group of invariances of the physical situation (there
are no states invariant under the group), and its action does not commute with
any non-zero Hamiltonian operator. Instead it plays a much deeper role, with
its unique unitary representation determining much of the structure of quantum
mechanics.
Note: beginning with this chapter, we will always assume units for position
and momentum chosen so that ~ = 1 and no longer keep track of how this
dimensional constant appears in equations.
and in particular
$$\langle q|p\rangle = \frac{1}{\sqrt{2\pi}}e^{ipq}$$
In the momentum space representation the position operator is $Q = i\frac{d}{dp'}$, and acting on a position eigenstate $|q\rangle$ (the function $\frac{1}{\sqrt{2\pi}}e^{-ip'q}$ of $p'$) it satisfies
$$Q|q\rangle = i\frac{d}{dp'}\left(\frac{1}{\sqrt{2\pi}}e^{-ip'q}\right) = q\left(\frac{1}{\sqrt{2\pi}}e^{-ip'q}\right) = q|q\rangle$$
Another way to see that this is the correct operator is to use the unitary trans-
formation F and its inverse Fe that relate the position and momentum space
representations. Going from position space to momentum space one has
$$Q \to \mathcal{F}Q\widetilde{\mathcal{F}}$$
and one can check that this transformed $Q$ operator will act as $i\frac{d}{dp'}$ on functions of $p'$.
One can express momentum space wavefunctions as coefficients of the ex-
pansion of a state ψ in terms of momentum eigenvectors
$$\langle p|\psi\rangle = \int_{-\infty}^{+\infty}\left(\frac{1}{\sqrt{2\pi}}e^{-ipq'}\right)\psi(q')\,dq' = \mathcal{F}(\psi(q)) = \widetilde{\psi}(p)$$
11.1.3 Physical interpretation
With now both momentum and position operators on H, we have the standard
set-up for describing a non-relativistic quantum particle that is discussed exten-
sively early on in any quantum mechanics textbook, and one of these should be
consulted for more details and for explanations of the physical interpretation
of this quantum system. The classically observable quantity corresponding to
the operator P is the momentum, and eigenvectors of P are the states that
have well-defined values for this (the eigenvalue). The momentum eigenvalues
and energy eigenvalues will have the correct non-relativistic energy momentum
relationship. Note that for the free particle P commutes with the Hamiltonian
$H = \frac{P^2}{2m}$, so there is a conservation law: states with a well-defined momentum
at one time always have the same momentum. This corresponds to an obvious
physical symmetry, the symmetry under spatial translations.
The operator Q on the other hand does not correspond to a physical sym-
metry, since it does not commute with the Hamiltonian. We will see that it
does generate a group action, and from the momentum space picture we can
see that this is a shift in the momentum, but such shifts are not symmetries of
the physics and there is no conservation law for Q. The states in which Q has
a well-defined numerical value are the ones such that the position wavefunction
is a delta-function. If one prepares such a state at a given time, it will not
remain a delta-function, but quickly evolve into a wavefunction that spreads
out in space.
Since the eigenfunctions of P and Q are non-normalizable, one needs a
slightly different formulation of the measurement theory principle used for finite
dimensional H. In this case, the probability of observing a position of a particle
with wavefunction ψ(q) in the interval [q1 , q2 ] will be
$$\frac{\int_{q_1}^{q_2}\overline{\psi(q)}\psi(q)\,dq}{\int_{-\infty}^{+\infty}\overline{\psi(q)}\psi(q)\,dq}$$
This will make sense for states |ψi ∈ L2 (R), which we will normalize to have
norm-squared one when discussing their physical interpretation. Then the sta-
tistical expectation value for the measured position variable will be
hψ|Q|ψi
which can be computed in either the position or momentum space representa-
tion.
Similarly, the probability of observing a momentum of a particle with momentum space wavefunction $\widetilde{\psi}(p)$ in the interval $[p_1, p_2]$ will be
$$\frac{\int_{p_1}^{p_2}\overline{\widetilde{\psi}(p)}\widetilde{\psi}(p)\,dp}{\int_{-\infty}^{+\infty}\overline{\widetilde{\psi}(p)}\widetilde{\psi}(p)\,dp}$$
and for normalized states the statistical expectation value of the measured mo-
mentum is
hψ|P |ψi
Note that states with a well-defined position (the delta-function states in
the position-space representation) are equally likely to have any momentum
whatsoever. Physically this is why such states quickly spread out. States with
a well-defined momentum are equally likely to have any possible position. The
properties of the Fourier transform imply the so-called “Heisenberg uncertainty
principle” that gives a lower bound on the product of a measure of uncertainty
in position times the same measure of uncertainty in momentum. Examples
of this that take on the lower bound are the Gaussian shaped functions whose
Fourier transforms were computed earlier.
For much more about these questions, again most quantum mechanics text-
books will contain an extensive discussion.
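As a concrete illustration (not from the text) of these expectation values and of the Gaussian case of the uncertainty principle, the following sketch computes $\langle\psi|Q|\psi\rangle$, $\langle\psi|P|\psi\rangle$ and the uncertainties on a grid for a normalized Gaussian wavepacket, in units with $\hbar = 1$; the width $\sigma$ is an arbitrary choice.

import numpy as np

q = np.linspace(-20, 20, 4001)
dq = q[1] - q[0]
sigma = 1.3
psi = (1/(2*np.pi*sigma**2))**0.25 * np.exp(-q**2/(4*sigma**2))   # normalized wavepacket

def expval(op_psi):
    # <psi| O |psi> on the grid, for O psi supplied as an array
    return float(np.real(np.sum(np.conj(psi) * op_psi) * dq))

Q_psi = q * psi
P_psi = -1j * np.gradient(psi, dq)          # P = -i d/dq acting on psi

dQ = np.sqrt(expval(q**2 * psi) - expval(Q_psi)**2)
dP = np.sqrt(expval(-np.gradient(np.gradient(psi, dq), dq)) - expval(P_psi)**2)
print(expval(Q_psi), expval(P_psi))          # both close to 0
print(dQ * dP)                               # close to 0.5: Gaussians saturate the lower bound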
Note that this is a non-trivial Lie algebra, but only minimally so. All Lie
brackets of Z with anything else are zero. All Lie brackets of Lie brackets are
also zero (as a result, this is an example of what is known as a “nilpotent” Lie
algebra).
The Heisenberg Lie algebra is isomorphic to the Lie algebra of 3 by 3 strictly
upper triangular real matrices, with Lie bracket the matrix commutator, by the
following isomorphism:
$$X \leftrightarrow \begin{pmatrix}0&1&0\\0&0&0\\0&0&0\end{pmatrix},\quad Y \leftrightarrow \begin{pmatrix}0&0&0\\0&0&1\\0&0&0\end{pmatrix},\quad Z \leftrightarrow \begin{pmatrix}0&0&1\\0&0&0\\0&0&0\end{pmatrix}$$
$$xX + yY + zZ \leftrightarrow \begin{pmatrix}0&x&z\\0&0&y\\0&0&0\end{pmatrix}$$
and one has
$$\left[\begin{pmatrix}0&x&z\\0&0&y\\0&0&0\end{pmatrix}, \begin{pmatrix}0&x'&z'\\0&0&y'\\0&0&0\end{pmatrix}\right] = \begin{pmatrix}0&0&xy'-x'y\\0&0&0\\0&0&0\end{pmatrix}$$
One can write this Lie algebra as a Lie algebra of matrices for any d. For
instance, in the physical case of d = 3, elements of the Heisenberg Lie algebra
can be written
$$\begin{pmatrix}0&x_1&x_2&x_3&z\\0&0&0&0&y_3\\0&0&0&0&y_2\\0&0&0&0&y_1\\0&0&0&0&0\end{pmatrix}$$
In the $d = 1$ case, exponentiating these matrices is easy:
$$\exp\begin{pmatrix}0&x&z\\0&0&y\\0&0&0\end{pmatrix} = \begin{pmatrix}1&x&z+\frac{1}{2}xy\\0&1&y\\0&0&1\end{pmatrix}$$
so the group with Lie algebra h3 will be the group of upper triangular 3 by 3 real
matrices with ones on the diagonal, and this group will be the Heisenberg group
H3 . For our purposes though, it is better to work in exponential coordinates
(i.e. labeling a group element with the Lie algebra element that exponentiates
to it).
Matrix exponentials in general satisfy the Baker-Campbell-Hausdorff for-
mula, which says
$$e^Ae^B = e^{A+B+\frac{1}{2}[A,B]+\frac{1}{12}[A,[A,B]]-\frac{1}{12}[B,[A,B]]+\cdots}$$
where the higher terms can all be expressed as repeated commutators. This
provides one way of showing that the Lie group structure is determined (for
group elements expressible as exponentials) by knowing the Lie bracket. For
the full formula and a detailed proof, see chapter 3 of [27]. One can easily
check the first few terms in this formula by expanding the exponentials, but the
difficulty of the proof is that it is not at all obvious why all the terms can be
organized in terms of commutators.
For the case of the Heisenberg Lie algebra, since all multiple commutators
vanish, the Baker-Campbell-Hausdorff formula implies for exponentials of ele-
ments of h3
$$e^Ae^B = e^{A+B+\frac{1}{2}[A,B]}$$
(a proof of this special case of Baker-Campbell-Hausdorff is in section 3.1 of [27]).
We can use this to explicitly write the group law in exponential coordinates:
$$e^{xX+yY+zZ}e^{x'X+y'Y+z'Z} = e^{(x+x')X+(y+y')Y+(z+z'+\frac{1}{2}(xy'-yx'))Z} \qquad (11.1)$$
Note that the Lie algebra basis elements X, Y, Z each generate subgroups
of H3 isomorphic to R. Elements of the first two of these subgroups generate
the full group, and elements of the third subgroup are “central”, meaning they
commute with all group elements. Also notice that the non-commutative nature
of the Lie algebra or group depends purely on the factor xy 0 − yx0 .
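Since the Heisenberg group is realized here by matrices, this group law is easy to verify numerically. The following sketch (not from the text) checks with SciPy's matrix exponential that the $z$-coordinates compose as $z + z' + \frac{1}{2}(xy' - yx')$, as the special case of Baker-Campbell-Hausdorff predicts; the numerical values are arbitrary.

import numpy as np
from scipy.linalg import expm

def h(x, y, z):
    # the Lie algebra element xX + yY + zZ as a strictly upper triangular matrix
    return np.array([[0, x, z],
                     [0, 0, y],
                     [0, 0, 0]], dtype=float)

x, y, z = 1.0, 2.0, 0.3
xp, yp, zp = -0.5, 1.5, 0.7

lhs = expm(h(x, y, z)) @ expm(h(xp, yp, zp))
rhs = expm(h(x + xp, y + yp, z + zp + 0.5*(x*yp - y*xp)))
print(np.allclose(lhs, rhs))    # True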
The generalization of this to higher dimensions is:
Definition (Heisenberg group). The Heisenberg group H2d+1 is the space R2d+1
with the group law
$$\left(\begin{pmatrix}\mathbf{x}\\\mathbf{y}\end{pmatrix}, z\right)\left(\begin{pmatrix}\mathbf{x}'\\\mathbf{y}'\end{pmatrix}, z'\right) = \left(\begin{pmatrix}\mathbf{x}+\mathbf{x}'\\\mathbf{y}+\mathbf{y}'\end{pmatrix}, z+z'+\frac{1}{2}(\mathbf{x}\cdot\mathbf{y}'-\mathbf{y}\cdot\mathbf{x}')\right)$$
Note that in these exponential coordinates the exponential map relating the
Heisenberg Lie algebra h2d+1 and the Heisenberg Lie group H2d+1 is just the
identity map.
Definition (Schrödinger representation, Lie algebra version). The Schrödinger
representation of the Heisenberg Lie algebra h3 is the representation (Γ0S , L2 (R))
satisfying
$$\Gamma'_S(X)\psi(q) = -iQ\psi(q) = -iq\psi(q),\qquad \Gamma'_S(Y)\psi(q) = -iP\psi(q) = -\frac{d}{dq}\psi(q)$$
$$\Gamma'_S(Z)\psi(q) = -i\psi(q)$$
Factors of i have been chosen to make these operators skew-adjoint and the
representation thus unitary. They can be exponentiated, giving in the exponen-
tial coordinates on H3 of equation 11.1
$$\Gamma_S\left(\begin{pmatrix}x\\0\end{pmatrix}, 0\right)\psi(q) = e^{-ixQ}\psi(q) = e^{-ixq}\psi(q)$$
$$\Gamma_S\left(\begin{pmatrix}0\\y\end{pmatrix}, 0\right)\psi(q) = e^{-iyP}\psi(q) = e^{-y\frac{d}{dq}}\psi(q) = \psi(q-y)$$
$$\Gamma_S\left(\begin{pmatrix}0\\0\end{pmatrix}, z\right)\psi(q) = e^{-iz}\psi(q)$$
For general group elements of H3 one has
Definition (Schrödinger representation, Lie group version). The Schrödinger
representation of the Heisenberg Lie group H3 is the representation (ΓS , L2 (R))
satisfying
$$\Gamma_S\left(\begin{pmatrix}x\\y\end{pmatrix}, z\right)\psi(q) = e^{-iz}e^{i\frac{xy}{2}}e^{-ixq}\psi(q-y)$$
To check that this defines a representation, one computes
$$\Gamma_S\left(\begin{pmatrix}x\\y\end{pmatrix}, z\right)\Gamma_S\left(\begin{pmatrix}x'\\y'\end{pmatrix}, z'\right)\psi(q) = \Gamma_S\left(\begin{pmatrix}x\\y\end{pmatrix}, z\right)e^{-iz'}e^{i\frac{x'y'}{2}}e^{-ix'q}\psi(q-y')$$
$$= e^{-i(z+z')}e^{i\frac{xy+x'y'}{2}}e^{-ixq}e^{-ix'(q-y)}\psi(q-y-y')$$
$$= e^{-i(z+z'+\frac{1}{2}(xy'-yx'))}e^{i\frac{(x+x')(y+y')}{2}}e^{-i(x+x')q}\psi(q-(y+y'))$$
$$= \Gamma_S\left(\begin{pmatrix}x+x'\\y+y'\end{pmatrix}, z+z'+\frac{1}{2}(xy'-yx')\right)\psi(q)$$
The group analog of the Heisenberg commutation relations (often called the
“Weyl form” of the commutation relations) is the relation
$$e^{-ixQ}e^{-iyP} = e^{-ixy}e^{-iyP}e^{-ixQ}$$
This can be checked by computing the product $\Gamma_S\left(\begin{pmatrix}x\\0\end{pmatrix}, 0\right)\Gamma_S\left(\begin{pmatrix}0\\y\end{pmatrix}, 0\right)$ as well as the same product in the opposite order, and then comparing the results.
Note that, for the Schrödinger representation, we have
$$\Gamma_S\left(\begin{pmatrix}0\\0\end{pmatrix}, z+2\pi\right) = \Gamma_S\left(\begin{pmatrix}0\\0\end{pmatrix}, z\right)$$
so the representation operators are periodic with period 2π in the z-coordinate.
Some authors choose to define the Heisenberg group H3 as not R2 ⊕ R, but
R2 ⊕ S 1 , building this periodicity automatically into the definition of the group,
rather than the representation.
We have seen that the Fourier transform F takes the Schrödinger represen-
tation to a unitarily equivalent representation of H3 , in terms of functions of p
(the momentum space representation). The operators change as
$$\Gamma_S(g) \to \mathcal{F}\Gamma_S(g)\widetilde{\mathcal{F}}$$
when one makes the unitary transformation to the momentum space represen-
tation.
In typical physics quantum mechanics textbooks, one often sees calculations
made just using the Heisenberg commutation relations, without picking a spe-
cific representation of the operators that satisfy these relations. This turns out
to be justified by the remarkable fact that, for the Heisenberg group, once one
picks the constant with which Z acts, all irreducible representations are uni-
tarily equivalent. By unitarity this constant is −ic, c ∈ R. We have chosen
c = 1, but other values of c would correspond to different choices of units. In a
sense, the representation theory of the Heisenberg group is very simple: there’s
just one irreducible representation. This is very different than the theory for
even the simplest compact Lie groups (U (1) and SU (2)) which have an infinity
of inequivalent irreducibles labeled by weight or by spin. Representations of a
Heisenberg group will appear in different guises (we’ve seen two, will see an-
other in the discussion of the harmonic oscillator, and there are yet others that
appear in the theory of theta-functions), but they are all unitarily equivalent.
This statement is known as the Stone-von Neumann theorem.
So far we’ve been modestly cavalier about the rigorous analysis needed to
make precise statements about the Schrödinger representation. In order to prove
a theorem like the Stone-von Neumann theorem, which tries to say something
about all possible representations of a group, one needs to invoke a great deal
of analysis. Much of this part of analysis was developed precisely to be able to
deal with general quantum mechanical systems and prove theorems about them.
The Heisenberg group, Lie algebra and its representations are treated in detail
in many expositions of quantum mechanics for mathematicians. Some good
references for this material are [64], and [29]. In depth discussions devoted to
the mathematics of the Heisenberg group and its representations can be found
in [35], [19] and [66].
In these references can be found a proof of the (not difficult)
Theorem. The Schrödinger representation ΓS described above is irreducible.
and the much more difficult
Theorem (Stone-von Neumann). Any irreducible representation $\pi$ of the group $H_3$ on a Hilbert space, satisfying
$$\pi'(Z) = -i\mathbf{1}$$
is unitarily equivalent to the Schrödinger representation $(\Gamma_S, L^2(\mathbf{R}))$.
Chapter 12
We have seen that the quantum theory of a free particle corresponds to the con-
struction of a representation of the Heisenberg Lie algebra in terms of operators
Q and P . One would like to use this to produce quantum systems with a similar
relation to more non-trivial classical mechanical systems than the free particle.
During the earliest days of quantum mechanics it was recognized by Dirac that
the commutation relations of the Q and P operators somehow corresponded
to the Poisson bracket relations between the position and momentum coordi-
nates on phase space in the Hamiltonian formalism for classical mechanics. In
this chapter we’ll give an outline of the topic of Hamiltonian mechanics and
the Poisson bracket, including an introduction to the symplectic geometry that
characterizes phase space.
The Heisenberg Lie algebra h2d+1 is usually thought of as quintessentially
quantum in nature, but it is already present in classical mechanics, as the Lie
algebra of degree zero and one polynomials on phase space, with Lie bracket
the Poisson bracket. The full Lie algebra of all functions on phase space (with
Lie bracket the Poisson bracket) is infinite dimensional, so not the sort of finite
dimensional Lie algebra given by matrices that we have studied so far (although,
historically, it is this kind of infinite dimensional Lie algebra that motivated the
discovery of the theory of Lie groups and Lie algebras by Sophus Lie during the
1870s). In chapter 14 we will see that degree two polynomials on phase space
also provide an important finite-dimensional Lie algebra.
One interpretation of phase space is that it is the space that uniquely parametrizes
solutions of the equations of motion of a given classical mechanical system. The
basic axioms of Hamiltonian mechanics can be stated in a way that parallels
the ones for quantum mechanics.
In particular, for a choice of Hamiltonian function $h$ on phase space, trajectories satisfy Hamilton's equations
$$\dot{q}_j = \frac{\partial h}{\partial p_j},\qquad \dot{p}_j = -\frac{\partial h}{\partial q_j}$$
Specializing to the case d = 1, for any observable function f , Hamilton’s
equations imply
$$\frac{df}{dt} = \frac{\partial f}{\partial q}\frac{dq}{dt} + \frac{\partial f}{\partial p}\frac{dp}{dt} = \frac{\partial f}{\partial q}\frac{\partial h}{\partial p} - \frac{\partial f}{\partial p}\frac{\partial h}{\partial q}$$
Defining the Poisson bracket of two functions $f_1, f_2$ on phase space by
$$\{f_1, f_2\} = \frac{\partial f_1}{\partial q}\frac{\partial f_2}{\partial p} - \frac{\partial f_1}{\partial p}\frac{\partial f_2}{\partial q}$$
this calculation says that
$$\frac{df}{dt} = \{f, h\}$$
This relation is equivalent to Hamilton’s equations since it implies them by
taking f = q and f = p
$$\dot{q} = \{q, h\} = \frac{\partial h}{\partial p},\qquad \dot{p} = \{p, h\} = -\frac{\partial h}{\partial q}$$
For a non-relativistic free particle, $h = \frac{p^2}{2m}$ and these equations become
$$\dot{q} = \frac{p}{m},\qquad \dot{p} = 0$$
which just says that the momentum is the mass times the velocity, and is con-
served. For a particle subject to a potential V (q) one has
$$h = \frac{p^2}{2m} + V(q)$$
and the trajectories are the solutions to
$$\dot{q} = \frac{p}{m},\qquad \dot{p} = -\frac{\partial V}{\partial q}$$
which adds Newton’s second law
$$F = -\frac{\partial V}{\partial q} = ma = m\ddot{q}$$
to the definition of momentum in terms of velocity.
One can easily check that the Poisson bracket has the properties
• Antisymmetry
{f1 , f2 } = −{f2 , f1 }
• Jacobi identity
{{f1 , f2 }, f3 } + {{f3 , f1 }, f2 } + {{f2 , f3 }, f1 } = 0
These two properties, together with the bilinearity, show that the Poisson
bracket fits the definition of a Lie bracket, making the space of functions on
phase space into an infinite dimensional Lie algebra. This Lie algebra is respon-
sible for much of the structure of the subject of Hamiltonian mechanics, and it
was historically the first sort of Lie algebra to be studied.
The conservation laws of classical mechanics are best understood using this
Lie algebra. From the fundamental dynamical equation
df
= {f, h}
dt
we see that
df
{f, h} = 0 =⇒ =0
dt
and in this case the function f is called a “conserved quantity”, since it does
not change under time evolution. Note that if we have two functions f1 and f2
on phase space such that
{f1 , h} = 0, {f2 , h} = 0
then using the Jacobi identity we have
{{f1 , f2 }, h} = −{{h, f1 }, f2 } − {{f2 , h}, f1 } = 0
This shows that if f1 and f2 are conserved quantities, so is {f1 , f2 }, so functions
f such that {f, h} = 0 make up a Lie subalgebra. It is this Lie subalgebra
that corresponds to “symmetries” of the physics, commuting with the time
translation determined by the dynamical law given by h.
12.2 The Poisson bracket and the Heisenberg Lie algebra
A third fundamental property of the Poisson bracket that can easily be checked
is the
• Leibniz rule
$$\{f, f_1f_2\} = \{f, f_1\}f_2 + f_1\{f, f_2\}$$
This property says that taking Poisson bracket with a function f acts on a
product of functions in a way that satisfies the Leibniz rule for what happens
when you take the derivative of a product. Unlike antisymmetry and the Ja-
cobi identity, which reflect the Lie algebra structure on functions, the Leibniz
property describes the relation of the Lie algebra structure to multiplication of
functions. At least for polynomial functions, it allows one to inductively reduce
the calculation of Poisson brackets to the special case of Poisson brackets of the
coordinate functions $q$ and $p$, for instance
$$\{q^2, p\} = q\{q, p\} + \{q, p\}q = 2q$$
This isomorphism preserves the Lie bracket relations since
[X, Y ] = Z ↔ {q, p} = 1
It is convenient to choose its own notation for the dual phase space, so we
will often write M ∗ = M. The three dimensional space we have identified with
the Heisenberg Lie algebra is then
M⊕R
We will denote elements of this space either by functions cq q + cp p + c, or as
$$\left(\begin{pmatrix}c_q\\c_p\end{pmatrix}, c\right)$$
In this second notation, the Lie bracket is
$$\left[\left(\begin{pmatrix}c_q\\c_p\end{pmatrix}, c\right), \left(\begin{pmatrix}c'_q\\c'_p\end{pmatrix}, c'\right)\right] = \left(\begin{pmatrix}0\\0\end{pmatrix}, \Omega\left(\begin{pmatrix}c_q\\c_p\end{pmatrix}, \begin{pmatrix}c'_q\\c'_p\end{pmatrix}\right)\right)$$
where $\Omega$ is the antisymmetric bilinear form given by the Poisson bracket on linear functions,
$$\Omega\left(\begin{pmatrix}c_q\\c_p\end{pmatrix}, \begin{pmatrix}c'_q\\c'_p\end{pmatrix}\right) = c_qc'_p - c_pc'_q$$
Notice that the non-trivial part of the Lie bracket structure is determined by Ω.
In higher dimensions, coordinate functions q1 , · · · , qd , p1 , · · · , pd on M pro-
vide a basis for the dual space M consisting of the linear coefficient functions
of vectors in M . Taking as an additional basis element the constant function
1, we have a 2d + 1 dimensional space with basis q1 , · · · , qd , p1 , · · · , pd , 1. The
Poisson bracket relations
{qj , qk } = {pj , pk } = 0, {qj , pk } = δjk
turn this space into a Lie algebra, isomorphic to the Heisenberg Lie algebra
h2d+1 . On general functions, the Poisson bracket will be given by the obvious
generalization of the d = 1 case
$$\{f_1, f_2\} = \sum_{j=1}^{d}\left(\frac{\partial f_1}{\partial q_j}\frac{\partial f_2}{\partial p_j} - \frac{\partial f_1}{\partial p_j}\frac{\partial f_2}{\partial q_j}\right) \qquad (12.2)$$
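For readers who like to experiment, the following small sketch (not from the text) implements equation 12.2 for $d = 1$ with SymPy and checks the antisymmetry, Leibniz and Jacobi properties on a few arbitrary polynomial functions.

import sympy as sp

q, p = sp.symbols('q p')

def pb(f, g):
    # Poisson bracket, the d = 1 case of equation 12.2
    return sp.diff(f, q)*sp.diff(g, p) - sp.diff(f, p)*sp.diff(g, q)

f1, f2, f3 = q**2*p, p**3 + q, q*p
print(sp.simplify(pb(q, p)))                                          # 1
print(sp.simplify(pb(f1, f2) + pb(f2, f1)))                           # 0  (antisymmetry)
print(sp.simplify(pb(f1, f2*f3) - pb(f1, f2)*f3 - f2*pb(f1, f3)))     # 0  (Leibniz rule)
print(sp.simplify(pb(pb(f1, f2), f3) + pb(pb(f3, f1), f2) + pb(pb(f2, f3), f1)))  # 0  (Jacobi)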
Digression. We have been careful here to keep track of the difference between
phase space M = R2d and its dual M = M ∗ , since it is M ⊕ R that is given
the structure of a Lie algebra (in this case h2d+1 ) by the Poisson bracket, and it
is this Lie algebra we want to use in chapter 15 when we define a quantization
of the classical system. Taking duals, we find an isomorphism
M ⊕ R ↔ h∗2d+1
It is a general phenomenon that one can define a version of the Poisson bracket
on functions on g∗ , the dual of any Lie algebra g. This is because the Leibniz
property ensures that the Poisson bracket only depends on Ω, its restriction to
linear functions, and linear functions on g∗ are just elements of g. So one can
define a Poisson bracket on functions on $\mathfrak{g}^*$ by first defining it on linear functions (elements of $\mathfrak{g}$) using the Lie bracket, and then extending it to all functions using the Leibniz rule.
Definition (Symplectic form). A symplectic form $\omega$ on a vector space $V$ is a bilinear map
$$\omega : V\times V \to \mathbf{R}$$
such that
• $\omega$ is antisymmetric: $\omega(v, v') = -\omega(v', v)$
• $\omega$ is nondegenerate: if $v\neq 0$, then $\omega(v, \cdot)$ is not identically zero.
The Poisson bracket on linear functions provides such a form $\Omega$ on $\mathcal{M}$, which can be used to identify $\mathcal{M}$ and $M$:
$$q_j \in \mathcal{M} \leftrightarrow \Omega(\cdot, q_j) = -\Omega(q_j, \cdot) = -\frac{\partial}{\partial p_j} \in M$$
$$p_j \in \mathcal{M} \leftrightarrow \Omega(\cdot, p_j) = -\Omega(p_j, \cdot) = \frac{\partial}{\partial q_j} \in M$$
and in general
u ∈ M ↔ Ω(·, u) (12.6)
Note that unlike the inner product case, a choice of convention of minus sign
must be made and is done here.
Recalling the discussion of bilinear forms from section 9.5, a bilinear form on
a vector space V can be identified with an element of V ∗ ⊗ V ∗ . Taking V = M ∗
we have V ∗ = (M ∗ )∗ = M , and the bilinear form Ω on M ∗ is an element of
M ⊗ M given by
$$\Omega = \sum_{j=1}^{d}\left(\frac{\partial}{\partial q_j}\otimes\frac{\partial}{\partial p_j} - \frac{\partial}{\partial p_j}\otimes\frac{\partial}{\partial q_j}\right)$$
More generally, non-degeneracy of a symplectic form means that the map
$$v \in M \to \omega(v, \cdot) \in \mathcal{M}$$
is an isomorphism of the vector space with its dual.
In the case of Euclidean geometry, one can show by Gram-Schmidt orthog-
onalization that a basis ej can always be found that puts the inner product
(which is a symmetric element of V ∗ ⊗ V ∗ ) in the standard form
$$\sum_{j=1}^{n}v_j\otimes v_j$$
Digression. For those familiar with differential manifolds, vector fields and
differential forms, the notion of a symplectic vector space can be extended to:
Definition (Symplectic manifold). A symplectic manifold $M$ is a manifold with a two-form $\omega$ (the symplectic two-form) that is non-degenerate on each tangent space and closed ($d\omega = 0$). By Darboux's theorem, coordinates can always be found locally in which $\omega$ takes the standard form $\sum_j dq_j\wedge dp_j$. Unlike the linear case though, there will in general be no global choice of coordinates for which this is true. Later on, our discussion of quantization will rely
crucially on having a linear structure on phase space, so will not apply to general
symplectic manifolds.
Note that there is no assumption here that M has a metric (i.e. it may
not be a Riemannian manifold). A symplectic two-form ω is a structure on a
manifold analogous to a metric but with opposite symmetry properties. Whereas
a metric is a symmetric non-degenerate bilinear form on the tangent space at
each point, a symplectic form is an antisymmetric non-degenerate bilinear form
on the tangent space.
Returning to vector spaces V , one can generalize the notion of a symplectic
structure by dropping the non-degeneracy condition on ω. The Leibniz property
can still be used to extend this to a Poisson bracket on functions on V , which
is then called a Poisson structure on V . In particular, one can do this for
V = g∗ for any Lie algebra g, using the choice of ω discussed at the end of
section 12.2. So g∗ always has a Poisson structure, and on subspaces where ω
is non-degenerate, a symplectic structure.
For instance, for
V = h∗2d+1 = M ⊕ R
one has a Poisson bracket on functions on V , but only on the subspace M is
this Poisson bracket non-degenerate on linear functions, making M a symplectic
vector space. For another example one can take g = so(3), which is just R3 ,
with antisymmetric bilinear form ω given by the vector cross-product. In this
case it turns out that if one considers spheres of fixed radius in R3 , ω provides
a non-degenerate symplectic form on their tangent spaces and such spheres are
symplectic manifolds with symplectic two-form their area two-form.
Chapter 13
A vector field $\mathbf{F} = (F_q, F_p)$ on phase space $\mathbf{R}^2$ determines a system of differential equations
$$\frac{dq}{dt} = F_q(q(t), p(t)),\qquad \frac{dp}{dt} = F_p(q(t), p(t))$$
With initial conditions
$$q(0) = q_0,\qquad p(0) = p_0$$
if $F_q$ and $F_p$ are differentiable functions these differential equations have a unique
solution q(t), p(t), at least for some neighborhood of t = 0 (from the existence
and uniqueness theorem that can be found for instance in [34]). These solutions
q(t), p(t) describe trajectories in R2 with velocity vector F(q(t), p(t)) and such
trajectories can be used to define the “flow” of the vector field: for each t this is
the map that takes the initial point (q(0), p(0)) ∈ R2 to the point (q(t), p(t)) ∈
R2 .
Another equivalent way to define vector fields on R2 is to use instead the
directional derivative along the vector field, identifying
$$\mathbf{F}(q, p) \leftrightarrow \mathbf{F}(q, p)\cdot\boldsymbol{\nabla} = F_q(q, p)\frac{\partial}{\partial q} + F_p(q, p)\frac{\partial}{\partial p}$$
The case of F a constant vector is just our previous identification of the vector
space $M$ with linear combinations of $\frac{\partial}{\partial q}$ and $\frac{\partial}{\partial p}$.
An advantage of defining vector fields in this way as first order linear differ-
ential operators is that it shows that vector fields form a Lie algebra, where one
takes as Lie bracket the commutator of the differential operators. The commu-
tator of two first-order differential operators is another first-order differential
operator since higher order derivatives will cancel, using equality of mixed par-
tial derivatives. In addition, such a commutator will satisfy the Jacobi identity.
Not only do we get a Lie algebra, but also a representation of the Lie algebra,
on functions on R2 .
Given this Lie algebra of vector fields, one can ask what the corresponding
group might be. This is not a finite dimensional matrix Lie algebra, so expo-
nentiation of matrices will not give the group. One can however use the flow of
the vector field X to define an analog of the exponential of a parameter t times
X:
Definition (Exponential map of a vector field). An exponential map for the
vector field X on M is a map
exp(tX) : M → M
154
Digression. For any manifold M , one has a Lie algebra of differentiable vector
fields with an associated Lie bracket. One also has an infinite dimensional Lie
group, the group of invertible maps from M to itself, such that the maps and
their inverses are both differentiable. This group is called the diffeomorphism
group of M and written Dif f (M ). Its Lie algebra is the Lie algebra of vector
fields.
The representation of the Lie algebra of vector fields on functions is the
differential of the representation of Dif f (M ) on functions induced in the usual
way from the action of Dif f (M ) on the space M .
∂f ∂f
Fq = , Fp = −
∂p ∂q
∂f ∂ ∂f ∂
− = −{f, ·}
∂p ∂q ∂q ∂p
The simplest non-zero Hamiltonian vector fields are those for f a linear
function. For cq , cp constants, if
f = cq q + cp p
then
∂ ∂
Xf = cp − cq
∂q ∂p
and the map
f → Xf
is just the isomorphism of M and M of equation 12.6.
155
∂
For example, taking f = p, we have Xp = ∂q . The exponential map for this
vector field is
∂
Similarly, for f = q one has Xq = − ∂p and
For quadratic functions f one gets vector fields Xf with components linear in
the coordinates. An important example, which describes a harmonic oscillator
and will be treated in much more detail in chapter 20, is the case
1 2
f= (q + p2 )
2
for which
∂ ∂
Xf = p −q
∂q ∂p
dq dp
= p, = −q
dt dt
q(t) = q(0) cos t + p(0) sin t, p(t) = p(0) cos t − q(0) sin t
The vector field Xf and the trajectories in the q − p plane look like this
156
The relation of vector fields to the Poisson bracket is given by
so
X{f1 ,f2 } = Xf2 Xf1 − Xf1 Xf2 = −[Xf1 , Xf2 ] (13.4)
This shows that the map f → Xf of equation 13.1 that we defined between
these Lie algebras is not quite a Lie algebra homomorphism because of the -
157
sign in equation 13.4 (it is called a Lie algebra “antihomomorphism”). The map
that is a Lie algebra homomorphism is
f → −Xf
To keep track of the minus sign here, one needs to keep straight the difference
between
• The functions on phase space M are a Lie algebra, with adjoint action
and
• The functions f provide vector fields Xf acting on functions on M , where
The first of these is what will be most relevant to us later when we quantize
functions on M to get operators, preserving the Lie algebra structure. The
second is what one naturally gets from the geometrical action of a Lie group G
on the phase space M (see section 13.3). As a simple example, the function p
satisfies
∂F
{p, F (q)} = −
∂q
so
∂(·)
{p, ·} = −
∂q
∂
is the infinitesimal action of translation in q on functions, whereas ∂q is the
vector field on M corresponding to infinitesimal translation in the position co-
ordinate.
It is important to note that the Lie algebra homomorphism
f → Xf
158
Digression. For a general symplectic manifold M , the symplectic two-form ω
gives one an analog of Hamilton’s equations. This is the following equality of
one-forms, relating a Hamiltonian function h and a vector field Xh determining
time evolution of trajectories in M
iXh ω = ω(Xh , ·) = dh
(here iX is interior product with the vector field X). The Poisson bracket in
this context can be defined as
f → −∇Xf + if (13.6)
where ∇X is the covariant derivative with respect to the connection. For details
of this, see [36] or [29].
In our treatment of functions on phase space M , we have always been tak-
ing such functions to be time-independent. One can abstractly interpret M
as the space of trajectories of a classical mechanical system, with coordinates
q, p having the interpretation of initial conditions q(0), p(0) of the trajectories.
The exponential maps exp(tXh ) give an action on the space of trajectories for
Hamiltonian function h, taking the trajectory with initial conditions given by
m ∈ M to the time-translated one with initial conditions given by exp(tXh )(m).
One should really interpret the formula for Hamilton’s equations
df
= {f, h}
dt
as meaning
d
f (exp(tXf )(m))|t=0 = {f (m), h(m)}
dt
for each m ∈ M .
Given a Hamiltonian vector field Xf , the maps
exp(tXf ) : M → M
159
are known to physicists as “canonical transformations”, and to mathematicians
as “symplectomorphisms”. We will not try and work out in any more detail
how the exponential map behaves in general. In chapter 14 we will see what
happens for f an order-two homogeneous polynomial in the qj , pk . In that case
the vector field Xf will take linear functions on M to linear functions, thus
acting on M, in which case its behavior can be studied using the matrix for the
linear transformation with respect to the basis elements qj , pk .
Digression. The exponential map exp(tX) can be defined as above on a general
manifold. For a symplectic manifold M , Hamiltonian vector fields Xf will have
the property that
exp(tXf )∗ ω = ω
This is because
L ∈ g → XL
from g to vector fields on M that takes L to the vector field XL which acts on
functions on M by
d
XL F (m) = F (etL · m)|t=0
dt
This map however is not a homomorphism (for the Lie bracket on vector fields
the commutator of derivatives), but an anti-homomorphism. To see why this
is, recall that when a group G acts on a space, we get a representation π on
functions F on the space by
π(g)F (m) = F (g −1 · m)
L → π 0 (L) = −XL
160
on M and the Poisson bracket. Quantization will turn functions on M into
operators, turning the function f into an operator which will be the observable
corresponding to the infinitesimal group action by a Lie algebra element L.
Only for certain actions of G on M will the XL be Hamiltonian vector fields.
A necessary condition is that XL satisfy equation 13.5
L → µL
XL = XµL
XµL F = −{µL , F } = XL F
µL = L = cq q + cp p + c ∈ h3
Here
∂ ∂
XµL = −cq + cp
∂p ∂q
161
The action of H3 on M for which this is the moment map will have XL = XµL
and one can check that this action is
x
( , z) · (q0 , p0 ) = (q0 + y, p0 − x) (13.7)
y
See equations 13.2 and 13.3, which show that this action corresponds to the
exponential map for vector fields associated to Lie algebra basis elements q and
p. For central elements of the Lie algebra (the constant functions), the vector
field is zero, and the exponential map takes these to the identity map on M .
For d = 3 the translation group G = R3 is a subgroup of H7 of elements of
the form
0
( , 0)
a
since it acts on the phase space by translation in the position coordinates
(q0 , p0 ) → (q0 + a, p0 )
µa (q0 , p0 ) = a · p0
µ(q0 , p0 )(a) = a · p0
(an element of the dual of the Lie algebra for each point m = (q0 , p0 ) in phase
space).
For another example, consider the action of the group G = SO(3) of rota-
tions on phase space M = R6 , which gives a map from so(3) to vector fields on
R6 , taking for example
∂ ∂ ∂ ∂
l1 ∈ so(3) → Xl1 = −q3 + q2 − p3 + p2
∂q2 ∂q3 ∂p2 ∂p3
(this is the vector field for an infinitesimal clockwise rotation in the q2 − q3 and
p2 − p3 planes, in the opposite direction to the case of the vector field X 12 (q2 +p2 )
in the q − p plane of section 13.2). The moment map here gives the usual
expression for the 1-component of the angular momentum
µl1 = q2 p3 − q3 p2
since one can check from equation 13.1 that Xl1 = Xµl1 . On basis elements of
so(3) one has
µlj (q0 , p0 ) = (q0 × p0 )j
162
Formulated as a map from M to so(3)∗ , the moment map is
where l ∈ so(3).
Digression. For the case of M a general symplectic manifold, one can still de-
fine the moment map, whenever one has a Lie group G acting on M , preserving
the symplectic form ω. The infinitesimal condition for such a G action is that
LX ω = 0
where LX is the Lie derivative along the vector field X. Using the formula
LX = (d + iX )2 = diX + iX d
for the Lie derivative acting on differential forms (iX is contraction with the
vector field X), one has
(diX + iX d)ω = 0
and since dω = 0 we have
diX ω = 0
When M is simply-connected, one-forms iX ω whose differential is 0 (called
“closed”) will be the differentials of a function (and called “exact”). So there
will be a function µ such that
µ : M → g∗
by defining
(µ(m))(L) = µL (m)
An important class of symplectic manifolds M with an action of a Lie group
G, preserving the symplectic form, are the co-adjoint orbits Ol . These are the
163
manifolds one gets by acting on a chosen l ∈ g∗ by the co-adjoint action Ad∗ ,
meaning the action of g on g∗ satisfying
where X ∈ g, and Ad(g) is the usual adjoint action on g. For these cases, the
moment map
µ : Ol → g∗
is just the inclusion map. Two simple examples are
• For g = h3 , fixing a choice of c elements of g are linear functions on
M = R2 , so l ∈ g∗ is a point in M (evaluation of the function at that
point). The co-adjoint action is the action of H3 on M of equation 13.7.
• For g = so(3) the non-zero co-adjoint orbits are spheres, with radius the
length of l.
164
Chapter 14
165
the inner product by the antisymmetric bilinear form Ω that determines the
symplectic geometry of phase space. We will define
Definition (Symplectic Group). The symplectic group Sp(2d, R) is the sub-
group of linear transformations g of M ∗ = R2d that satisfy
for v1 , v2 ∈ M ∗
While this definition uses the dual phase space M ∗ and Ω, it would have been
equivalent to have made the definition using M and ω, since these transforma-
tions preserve the isomorphism between M and M ∗ given by Ω (see equation
12.6). For an action on M ∗
u ∈ M ∗ → gu ∈ M ∗
Here the first equality uses the definition of the dual representation (see 4.2)
and the second uses the invariance of Ω.
Note that for the analogous case of the inner product, the same formula holds
with the elementary antisymmetric matrix replaced by the identity matrix.
A linear transformation g of M ∗ will be given by
cq α β cq
→
cp γ δ cp
166
This says that we can have any linear transformation with unit determinant.
In other words, we find that Sp(2, R) = SL(2, R). We will see later that this
isomorphism with a special linear group only happens for d = 1.
For the analog of equation 14.2 in the inner product case, replace the el-
ementary antisymmetric matrices by unit matrices, giving the condition that
defines orthogonal matrices, g T g = 1.
Now turning to the Lie algebra, for group elements g ∈ GL(2, R) near the
identity, one can write g in the form g = etL where L is in the Lie algebra
gl(2, R). The condition that g acts on M preserving Ω implies that (differenti-
ating 14.2)
d tL T 0 1 tL tL T T 0 1 0 1
((e ) e ) = (e ) (L + L)etL = 0
dt −1 0 −1 0 −1 0
which is what one expects: L is in the Lie algebra sl(2, R) of 2 by 2 real matrices
with zero trace. The analog in the inner product case is just the condition
defining elements of the Lie algebra of the orthogonal group, LT + L = 0.
The homogeneous degree two polynomials in p and q form a three-dimensional
sub-Lie algebra of the Lie algebra of functions on phase space, since the non-zero
2 2
Poisson bracket relations between them on a basis q2 , p2 , qp are
q 2 p2
{ , } = qp {qp, p2 } = 2p2 {qp, q 2 } = −2q 2
2 2
This Lie algebra is isomorphic to sl(2, R), with an explicit isomorphism given
by identifying basis elements as follows:
q2 p2
0 1 0 0 1 0
↔E= − ↔F = − qp ↔ G = (14.4)
2 0 0 2 1 0 0 −1
which are the same as the Poisson bracket relations between the corresponding
quadratic polynomials.
We thus see that we have an isomorphism between the Lie algebra of degree
two homogeneous polynomials with the Poisson bracket and the Lie algebra
167
of 2 by 2 trace-zero real matrices with the commutator as Lie bracket. The
isomorphism on general elements of these Lie algebras is
bq 2 cp2
1 b −a q a b
µL = −aqp + − = q p ↔L= (14.5)
2 2 2 −a −c p c −a
We have seen that this is a Lie algebra isomorphism on basis elements, but one
can explicitly check that.
{µL , µL0 } = µ[L,L0 ]
The use of the notation µL for these quadratic functions reflects the fact
that
L ∈ sl(2, R) → µL
is a moment map. This is for the SL(2, R) action on phase space M = R2
corresponding to the above SL(2, R) action on M (under the identification
between M and M given by Ω). One can check the condition XL = XµL on
vector fields on M , but we will not do this here, since for our purposes it is the
action on M that is important.
Two important subgroups of SL(2, R) are
Here one can explicitly see that this group has elements going off to infinity.
1 2
E−F ↔ (p + q 2 )
2
which we will later re-encounter as the Hamiltonian function for the har-
monic oscillator.
168
14.1.2 The symplectic group for arbitary d
For general d, the symplectic group Sp(2d, R) is the group of linear transfor-
mations g of M that leave Ω (see 12.3) invariant, i.e. satisfy
0 0
cq cq c c
Ω(g , g 0 ) = Ω( q , 0q )
cp cp cp cp
B = BT , C = C T
Theorem 14.1. The Lie algebra sp(2d, R) is isomorphic to the Lie algebra of
order two homogeneous polynomials on M = R2d by the isomorphism (using a
vector notation for the coefficient functions q1 , · · · , qd , p1 , · · · , pd )
L ↔ µL
where
1 0 −1 q
µL = q p L
2 1 0 p
1 B −A q
= q p
2 −AT −C p
1
= (q · Bq − 2q · Ap − p · Cp) (14.7)
2
169
We will postpone the proof of this theorem until section 14.2, since it is easier
to first study Poisson brackets between order two and order one polynomials,
then use this to prove the theorem about Poisson brackets between order two
polynomials.
The Lie algebra sp(2d, R) has a subalgebra gl(d, R) consisting of matrices
of the form
A 0
L=
0 −AT
or, in terms of polynomials, polynomials
−q · Ap = −(AT p) · q
Here A is any real d by d matrix. This shows that one way to get symplectic
transformations is to take any linear transformation of the position coordinates,
together with the dual linear transformation on momentum coordinates. In this
way, any linear group of symmetries of the position space becomes a group of
symmetries of phase-space. An example of this is the group SO(d) of spatial
rotations, with Lie algebra so(d) ⊂ gl(d, R), the antisymmetric d by d matrices.
In the case d = 3, µL gives the standard expression for the angular momentum
as a function of the qj , pk coordinates on phase space. For example, taking
L = l1 , one has
µl1 = q2 p3 − q3 p2
the standard expression for angular momentum about the 1-axis.
Another important subgroup comes from taking A = 0, B = 1, C = −1,
which gives
1
µL = (|q|2 + |p|2 )
2
which will be the Hamiltonian function for a d-dimensional harmonic oscillator.
Exponentiating, one gets an SO(2) subgroup, one that acts on phase space in a
way that mixes position and momentum coordinates, so cannot be understood
just in terms of configuration space.
170
Taking all quadratic polynomials, we get a six-dimensional Lie algebra with
basis elements 1, q, p, qp, q 2 , p2 . This is not the direct product of h3 and sl(2, R)
since there are nonzero Poisson brackets
{qp, q} = − q, {qp, p} = p
p 2
q 2 (14.8)
{ , q} = − p, { , p} = q
2 2
These relations show that operating on a basis of linear functions on M by
taking the Poisson bracket with something in sl(2, R) (a quadratic function)
provides a linear transformation on M ∗ . In this section we will see that this
is a reflection of the fact that SL(2, R) acts on the Heisenberg group H3 by
automorphisms.
Recall the definition 11.1 of the Heisenberg group H3 as elements
x
( , z) ∈ M ⊕ R
y
with the group law
0
x + x0
0
x x 1 x x
( , z)( 0 , z 0 ) = ( , z + z 0
+ Ω( , 0 ))
y y y + y0 2 y y
Elements g ∈ SL(2, R) act on H3 by
x x x
( , z) → φg (( , z)) = (g , z) (14.9)
y y y
This is an example of
Definition (Group automorphisms). If an action of elements g of a group G
on a group H
h ∈ H → φg (h) ∈ H
satisfies
φg (h1 )φg (h2 ) = φg (h1 h2 )
for all g ∈ G and h1 , h2 ∈ H, the group G is said to act on H by automorphisms.
Each map φg is an automorphism of H. Note that since φg is an action of G,
we have φg1 g2 = φg1 φg2 .
Here G = SL(2, R), H = H3 and φg given above is an action by automorphisms
since
0 0
x x x x
φg ( , z)φg ( 0 , z 0 ) =(g , z)(g 0 , z 0 )
y y y y
0
0
x+x 0 1 x x
=(g , z + z + Ω(g , g 0 ))
y + y0 2 y y
0
0
x+x 1 x x
=(g , z + z 0 + Ω( , 0 ))
y + y0 2 y y
0
x x
=φg (( , z)( 0 , z 0 )) (14.10)
y y
171
One can consider the Lie algebra h3 instead of the group H3 , and there
will again be an action of SL(2, R) by automorphisms. Denoting elements
cq q + cp p + c of the Lie algebra h3 = M ⊕ R (see section 12.2) by
c
( q , c)
cp
These φg are just the same maps φg that give automorphisms of the group
structure of H3 since the exponential map relating h3 and H3 in these coordi-
nates is just the identification
cq x
( , c) ↔ ( , z)
cp y
h3 = M ⊕ R
172
of H on h by automorphisms φh = Ad(h). For the case of the Heisenberg
group, one can check that the adjoint representation of H3 on h3 leaves
invariant the M component, only acting on the R component. Note that
this is opposite behavior to the co-adjoint action of H3 on h∗3 , which acts
on M by translations (as in equation 13.7).
The SL(2, R) action on h3 by Lie algebra automorphisms has an infinitesimal
version (i.e. for group elements infinitesimally close to the identity), an action of
the Lie algebra of SL(2, R) on h3 . This is defined for L ∈ sl(2, R) and X ∈ h3
by
d
L · X = (etL · X)|t=0 (14.11)
dt
Computing this, one finds
c c
L · ( q , c) = (L q , 0) (14.12)
cp cp
so L acts on h3 = M ⊕ R just by matrix multiplication on vectors in M.
More generally, one has
Definition 14.1 (Lie algebra derivations). If an action of a Lie algebra g on a
Lie algebra h
X ∈h→Z ·X ∈h
satisfies
[Z · X, Y ] + [X, Z · Y ] = Z · [X, Y ]
for all Z ∈ g and X, Y ∈ h, the Lie algebra g is said to act on h by derivations.
The action of an element Z on h is a derivation of h.
Given an action of a Lie group G on a Lie algebra h by automorphisms,
taking the derivative as in 14.11 gives an action of g on h by derivations since
d d
Z·[X, Y ] = (φetL ([X, Y ]))|t=0 = ([φetL X, φetL Y ])|t=0 = [Z·X, Y ]+[X, Z·Y ]
dt dt
We will often refer to this action of g on h as the infinitesimal version of the
action of G on h.
Two examples are
• The case above, where sl(2, R) acts on h3 by derivations.
• The adjoint representation of a Lie algebra h on itself gives an action of h
on itself by derivations, with
X ∈ h → Z · X = ad(Z)(X) = [Z, X]
The Poisson brackets between degree two and degree one polynomials dis-
cussed at the beginning of this section give explicitly the action of sl(2, R) on
h3 by derivations. For a general L ∈ sl(2, R) and cq q + cp p + c ∈ h3 we have
0
cq acq + bcp cq
{µL , cq q + cp p + c} = c0q q + c0p p, = = L (14.13)
c0p ccq − acp cp
173
(here µL is given by 14.5). We see that this is just the action of sl(2, R) by
derivations on h3 of equation 14.12, the infinitesimal version of the action of
SL(2, R) on h3 by automorphisms.
Note that in the larger Lie algebra of all polynomials on M of order two or
less, the action of sl(2, R) on h3 by derivations is part of the adjoint action of
the Lie algebra on itself, since it is given by the Poisson bracket (which is the
Lie bracket), between order two and order one polynomials.
The generalization to the case of arbitrary d is
Theorem. The sp(2d, R) action on h2d+1 = M ⊕ R by derivations is
where 0
cq cq
= L
c0p cp
or, equivalently (see section 4.1), on coordinate function basis vectors of M one
has
q q
{µL , } = LT
p p
Proof. One can first prove 14.14 for the cases when only one of A, B, C is non-
zero, then the general case follows by linearity. For instance, taking the special
case
0 B 1
L= , µL = q · Bq
0 0 2
one can show that the action on coordinate functions (the basis vectors of M)
is
1 q q 0
{ q · Bq, } = LT =
2 p p Bq
by computing
1X 1X
{ qj Bjk qk , pl } = (qj {Bjk qk , pl } + {qj Bjk , pl }qk )
2 2
j,k j,k
1 X X
= ( qj Bjl + Blk qk )
2 j
k
X
= Blj qj (since B = B T )
j
Repeating for A and C one finds that for general L one has
q q
{µL , } = LT
p p
Since an element in M can be written as
q c q
cq cp LT = (L q )T
p cp p
174
we have 0
cq c
=L q
c0p cp
175
176
Chapter 15
Quantization
Given any Hamiltonian classical mechanical system with phase space R2d , phys-
ics textbooks have a standard recipe for producing a quantum system, by a
method known as “canonical quantization”. We will see that for linear functions
on phase space, this is just the construction we have already seen of a unitary
representation Γ0S of the Heisenberg Lie algebra, the Schrödinger representation,
and the Stone-von Neumann theorem assures us that this is the unique such
construction, up to unitary equivalence. We will also see that this recipe can
only ever be partially successful, with the Schrödinger representation extending
to give us a representation of a sub-algebra of the algebra of all functions on
phase space (the polynomials of degree two and below), and a no-go theorem
showing that this cannot be extended to a representation of the full infinite
dimensional Lie algebra. Recipes for quantizing higher-order polynomials will
always suffer from a lack of uniqueness, a phenomenon known to physicists as
the existence of “operator ordering ambiguities”.
In later chapters we will see that this quantization prescription does give
unique quantum systems corresponding to some Hamiltonian systems (in par-
ticular the harmonic oscillator and the hydrogen atom), and does so in a manner
that allows a description of the quantum system purely in terms of representa-
tion theory.
177
where the last of these equations is the equation for the time dependence of a
Heisenberg picture observable O(t) in quantum mechanics. Dirac’s suggestion
was that given any classical Hamiltonian system, one could “quantize” it by
finding a rule that associates to a function f on phase space a self-adjoint
operator Of (in particular Oh = H) acting on a state space H such that
i
O{f,g} = − [Of , Og ]
~
This is completely equivalent to asking for a unitary representation (π 0 , H)
of the infinite dimensional Lie algebra of functions on phase space (with the
Poisson bracket as Lie bracket). To see this, note that one can choose units
for momentum p and position q such that ~ = 1. Then, as usual getting a
skew-adjoint Lie algebra representation operator by multiplying a self-adjoint
operator by −i, setting
π 0 (f ) = −iOf
the Lie algebra homomorphism property
corresponds to
−iO{f,g} = [−iOf , −iOg ] = −[Of , Og ]
so one has Dirac’s suggested relation.
Recall that the Heisenberg Lie algebra is isomorphic to the three-dimensional
sub-algebra of functions on phase space given by linear combinations of the con-
stant function, the function q and the function p. The Schrödinger representa-
tion ΓS provides a unitary representation not of the Lie algebra of all functions
on phase space, but of these polynomials of degree at most one, as follows
O1 = 1, Oq = Q, Op = P
so
d
Γ0S (1) = −i1, Γ0S (q) = −iQ = −iq, Γ0S (p) = −iP = −
dq
Moving on to quadratic polynomials, these can also be quantized, as follows
P2 Q2
O p2 = , O q2 =
2 2 2 2
For the function pq one can no longer just replace p by P and q by Q since the
operators P and Q don’t commute, and P Q or QP is not self-adjoint. What
does work, satisfying all the conditions to give a Lie algebra homomorphism is
1
Opq = (P Q + QP )
2
This shows that the Schrödinger representation Γ0S that was defined as a
representation of the Heisenberg Lie algebra h3 extends to a unitary Lie algebra
178
representation of a larger Lie algebra, that of all quadratic polynomials on phase
space, a representation that we will continue to denote by Γ0S and refer to as the
Schrödinger representation. On a basis of homogeneous order two polynomials
we have
p2 P2 i d2
Γ0S ( ) = −i =
2 2 2 dq 2
q2 Q2 i
Γ0S ( ) = −i = − q2
2 2 2
−i
Γ0S (pq) = (P Q + QP )
2
Restricting Γ0S to just linear combinations of these homogeneous order two poly-
nomials (which give the Lie algebra sl(2, R), recall equation 14.4) we get a Lie
algebra representation of sl(2, R) called the metaplectic representation.
Restricted to the Heisenberg Lie algebra, the Schrödinger representation Γ0S
exponentiates to give a representation ΓS of the corresponding Heisenberg Lie
group (see 11.4). As an sl(2, R) representation however, one can show that Γ0S
has the same sort of problem as the spinor representation of su(2) = so(3), which
was not a representation of SO(3), but only of its double cover SU (2) = Spin(3).
To get a group representation, one must go to a double cover of the group
SL(2, R), which will be called the metaplectic group and denoted M p(2, R).
The source of the problem is the subgroup of SL(2, R) generated by expo-
nentiating the Lie algebra element
1 2 2 0 1
(p + q ) ↔ E − F =
2 −1 0
When we study the Schrödinger representation using its action on the quantum
harmonic oscillator state space H in chapter 20 we will see that the Hamiltonian
is the operator
1 2
(P + Q2 )
2
and this has half-integer eigenvalues. As a result, trying to exponentiate Γ0S
gives a representation of SL(2, R) only up to a sign, and one needs to go to the
double cover M p(2, R) to get a true representation.
One should keep in mind though that, since SL(2, R) acts non-trivially by
automorphisms on H3 , elements of these two groups do not commute. The
Schrödinger representation is a representation not of the product group, but of
something called a “semi-direct product” which will be discussed in more detail
in chapter 16.
179
of ordering the P and Q operators lead to different Of for the same function
f , with physically different observables (although the differences involve the
commutator of P and Q, so higher-order terms in ~).
When physicists first tried to find a consistent prescription for producing an
operator Of corresponding to a polynomial function on phase space of degree
greater than two, they found that there was no possible way to do this consistent
with the relation
i
O{f,g} = − [Of , Og ]
~
for polynomials of degree greater than two. Whatever method one devises for
quantizing higher degree polynomials, it can only satisfy that relation to lowest
order in ~, and there will be higher order corrections, which depend upon one’s
choice of quantization scheme. Equivalently, it is only for the six-dimensional Lie
algebra of polynomials of degree up to two that the Schrödinger representation
gives one a Lie algebra representation, and this cannot be consistently extended
to a representation of a larger subalgebra of the functions on phase space. This
problem is made precise by the following no-go theorem
Theorem (Groenewold-van Hove). There is no map f → Of from polynomials
on R2 to self-adjoint operators on L2 (R) satisfying
i
O{f,g} = − [Of , Og ]
~
and
Op = P, Oq = Q
for any Lie subalgebra of the functions on R2 larger than the subalgebra of
polynomials of degree less than or equal to two.
Proof. For a detailed proof, see section 5.4 of [7], section 4.4 of [19], or chapter
16 of [25]. In outline, the proof begins by showing that taking Poisson brack-
ets of polynomials of degree three leads to higher order polynomials, and that
furthermore for degree three and above there will be no finite-dimensional sub-
algebras of polynomials of bounded degree. The assumptions of the theorem
force certain specific operator ordering choices in degree three. These are then
used to get a contradiction in degree four, using the fact that the same degree
four polynomial has two different expressions as a Poisson bracket:
1 1
q 2 p2 = {q 2 p, p2 q} = {q 3 , p3 }
3 9
180
which satisfy the Heisenberg relations
[Qj , Pk ] = iδjk
are a basis for a Lie algebra representation of so(3). These are the same
operators that were studied in chapter 8 under the name ρ0 (lj ). They will
181
be symmetries of rotationally invariant Hamiltonians, for instance the free
particle as above, or the particle in a potential
1
H= (P 2 + P22 + P32 ) + V (Q1 , Q2 , Q3 )
2m 1
when the potential only depends on the combination Q21 + Q22 + Q23 .
182
adjoint representation (which are symplectic manifolds) and irreducible repre-
sentations. Geometric quantization provides one possible method for trying to
associate representations to orbits. For more details, see [35].
None of the general methods of quantization is fully satisfactory, with each
running into problems in certain cases, or not providing a construction with all
the properties that one would want.
183
184
Chapter 16
Semi-direct Products
v → (a2 , R2 ) · v = a2 + R2 v
185
If we then act on the result with (a1 , R1 ) we get
Note that this is not what we would get if we took the product group law on
R3 × SO(3), since then the action of (a1 , R1 )(a2 , R2 ) on R3 would be
v → a1 + a2 + R1 R2 v
To get the correct group action on R3 , we need to take R3 × SO(3) not with
the product group law, but instead with the group law
This group law differs from the standard product law, by a term R1 a2 , which
is the result of R1 ∈ SO(3) acting non-trivially on a2 ∈ R3 . We will denote the
set R3 × SO(3) with this group law by
R3 o SO(3)
Rd o SO(d)
186
Definition (Semi-direct product group). Given a group K, a group N , and an
action φ of K on N by automorphisms
φk : n ∈ N → φk (n) ∈ N
the semi-direct product N o K is the set of pairs (n, k) ∈ N × K with group law
One can easily check that this satisfies the group axioms. The inverse is
The notation N o K for this construction has the weakness of not explicitly
indicating the automorphism φ which it depends on. There may be multiple
possible choices for φ, and these will always include the trivial choice φk = 1 for
all k ∈ K, which will give the standard product of groups.
Digression. For those familiar with the notion of a normal subgroup, N is a
normal subgroup of N o K. A standard notation for “N is a normal subgroup
of G” is N G. The symbol N o K is supposed to be a mixture of the × and
symbols (note that some authors define it to point in the other direction).
The Euclidean group E(d) is an example with N = Rd , K = SO(d). For
a ∈ Rd , R ∈ SO(d) one has
φR (a) = Ra
In chapter 39 we will see another important example, the Poincaré group which
generalizes E(3) to include a time dimension, treating space and time according
to the principles of special relativity.
The most important example for quantum theory is
Definition (Jacobi group). The Jacobi group in d dimensions is the semi-direct
product group
GJ (d) = H2d+1 o Sp(2d, R)
If we write elements of the group as
c
(( q , c), k)
cp
187
where k ∈ Sp(2d, R), then the automorphism φk that defines the Jacobi group
is given by
c c
φk (( q , c)) = (k q , c) (16.1)
cp cp
Note that the Euclidean group E(d) is a subgroup of the Jacobi group GJ (d),
the subgroup of elements of the form
0 R 0
(( , 0), )
cp 0 R
188
where X is an antisymmetric d by d matrix and a ∈ Rd . Exponentiating such
matrices will give elements of E(d).
The Lie bracket is then given by the matrix commutator, so
X1 a1 X2 a2 [X1 , X2 ] X1 a2 − X2 a1
[ , ]= (16.2)
0 0 0 0 0 0
So the Lie algebra of E(d) will be given by taking the sum of Rd (the Lie
algebra of Rd ) and so(d), with elements pairs (a, X) with a ∈ Rd and X an
antisymmetric d by d matrix. The infinitesimal version of the rotation action
of SO(d) on Rd by automorphisms
φR (a) = Ra
is
d d
φetX (a)|t=0 = (etX a)|t=0 = Xa
dt dt
Just in terms of such pairs, the Lie bracket can be written
Definition (Semi-direct product Lie algebra). Given Lie algebras k and n, and
an action of elements Y ∈ k on n by derivations
X ∈n→Y ·X ∈n
the semi-direct product nok is the set of pairs (X, Y ) ∈ n⊕k with the Lie bracket
One can easily see that in the special case of the Lie algebra of E(d) this
agrees with the construction above.
In section 14.1.2 we studied the Lie algebra of all polynomials of degree
at most two in d-dimensional phase space coordinates qj , pj , with the Poisson
bracket as Lie bracket. There we found two Lie subalgebras, the degree zero
and one polynomials (isomorphic to h2d+1 ), and the homogeneous degree two
polynomials (isomorphic to sp(2d, R)) with the second subalgebra acting on the
first by derivations as in equation 14.14.
Recall from chapter 14 that elements of this Lie algebra can also be written
as pairs
c
(( q , c), L)
cp
of elements in h2d+1 and sp(2d, R), with this pair corresponding to the polyno-
mial
µL + cq · q + cp · p + c
189
In terms of such pairs, the Lie bracket is given by
0 0 0
cq cq cq 0 cq c c
[(( 0
, c), L), (( 0 , c), L )] = ((L 0 −L , Ω( q , 0q )), [L, L0 ])
cp cp cp cp cp cp
h2d+1 o sp(2d, R)
and from the discussion in chapter 14.2 one can see that this is the Lie algebra
of the semi-direct product group
.
The Lie algebra of E(d) will be a sub-Lie algebra of this, consisting of ele-
ments of the form
0 X 0
(( , 0), )
cp 0 X
where X is an antisymmetric d by d matrix.
Digression. Just as E(d) can be identified with a group of d+1 by d+1 matrices,
the Jacobi group GJ (d) is also a matrix group and one can in principle work with
it and its Lie algebra using usual matrix methods. The construction is slightly
complicated and represents elements of GJ (d) as matrices in Sp(2d+1, R) . See
section 8.5 of [8] for details of the d = 1 case.
190
Chapter 17
191
available in this case, and irreducible representations are eigenfunctions of this
operator in the space of wavefunctions of fixed energy. The eigenvalue of the
second Casimir operator turns out to be an integer, known to physicists as the
“helicity”.
l = q1 p2 − q2 p1 , p1 , p2
on the d = 2 phase space M = R4 . The non-zero Lie bracket relations are given
by the Poisson brackets
and there is an isomorphism of this Lie algebra with a matrix Lie algebra of 3
by 3 matrices given by
0 −1 0 0 0 1 0 0 0
l ↔ 1 0 0 , p1 ↔ 0 0 0 , p2 ↔ 0 0 1
0 0 0 0 0 0 0 0 0
Writing this Lie algebra in terms of linear and quadratic functions on the
phase space shows that it can be realized as a sub-Lie algebra of the Jacobi
Lie algebra gJ (2). Quantization via the Schrödinger representation Γ0S then
provides a unitary representation of the Lie algebra of E(2) on the state space
H of functions of the position variables q1 , q2 , in terms of operators
∂ ∂
Γ0S (p1 ) = −iP1 = − , Γ0S (p2 ) = −iP1 = − (17.1)
∂q1 ∂q2
and
∂ ∂
Γ0S (l) = −iL = −i(Q1 P2 − Q2 P1 ) = −(q1 − q2 ) (17.2)
∂q2 ∂q1
The Hamiltonian operator for the free particle is
1 1 ∂2 ∂2
H= (P12 + P22 ) = − ( 2 + 2)
2m 2m ∂q1 ∂q2
and solutions to the Schrödinger equation can be found by solving the eigenvalue
equation
1 ∂2 ∂2
Hψ(q1 , q2 ) = − ( 2 + 2 )ψ(q1 , q2 ) = Eψ(q1 , q2 )
2m ∂q1 ∂q2
192
The operators L, P1 , P2 commute with H and so provide a representation of the
Lie algebra of E(2) on the space of wavefunctions of energy E.
This construction of irreducible representations of E(2) is similar in spirit
to the construction of irreducible representations of SO(3) in section 8.4. There
the Casimir operator L2 commuted with the SO(3) action, and gave a differ-
ential operator on functions on the sphere whose eigenfunctions were spaces
of dimension 2l + 1 with eigenvalue l(l + 1), for l non-negative and integral.
For E(2) the quadratic function p21 + p22 Poisson commutes with l, p1 , p2 . After
quantization,
|P|2 = P12 + P22
|p|2
E= >0
2m
and such |pi give a sort of continuous basis of H, even though these are not
square-integrable functions. The formalism for working with them uses distri-
butions and the orthonormality relation
hp|p0 i = δ(p − p0 )
|p|2
( − E)ψ(p)
e =0
2m
193
Going to polar coordinates p = (p cos θ, p sin θ), the space of solutions to
the time-independent Schrödinger equation at energy E is given by ψ(p)
e of the
form
ψ(p)
e = ψeE (θ)δ(p2 − 2mE)
√
To put this delta-function in a more useful form, note that for p ≈ 2mE one
has the linear approximation
1 √
p2 − 2mE ≈ √ (p − 2mE)
2 2mE
1 √
δ(p2 − 2mE) = √ δ(p − 2mE)
2 2mE
√
It is this space of functions ψeE (θ) of functions on the circle of radius 2mE
that will provide an infinite-dimensional representation of the group E(2), one
that turns out to be irreducible, although we will not show that here. The
194
position space wavefunction corresponding to ψeE (θ) will be
ZZ
1
ψ(q) = eip·q ψeE (θ)δ(p2 − 2mE)pdpdθ
2π
1
ZZ
1 √
= eip·q ψeE (θ) √ δ(p − 2mE)pdpdθ
2π 2 2mE
Z 2π √
1
= ei 2mE(q1 cos θ+q2 sin θ) ψeE (θ)dθ
4π 0
e 0 (l) = − (p1 ∂ − p2 ∂ )
Γ S
∂p2 ∂p1
∂
=−
∂θ
(use integration by parts to show qj = i ∂p∂ j and thus the first equality, then the
chain rule for functions f (p1 (θ), p2 (θ)) for the second).
This construction of a representation of E(2) starting with the Schrödinger
representation gives the same result as starting with the action of E(2) on
configuration space, and taking the induced action on functions on R2 (the
wavefunctions). To see this, note that E(2) has elements (a, R(φ)) which can
be written as a product (a, R(φ)) = (a, 1)(0, R(φ)) or, in terms of matrices
cos φ − sin φ a1 1 0 a1 cos φ − sin φ 0
sin φ cos φ a2 = 0 1 a2 sin φ cos φ 0
0 0 1 0 0 1 0 0 1
195
The group has a unitary representation
on the position space wavefunctions ψ(q), given by the induced action on func-
tions from the action of E(2) on position space R2
This is just the Schrödinger representation ΓS of the Jacobi group GJ (2), re-
stricted to the subgroup E(2) of transformations of phase space that are transla-
tions in q and rotations in both q and p vectors, preserving their inner product
(and thus the symplectic form). One can see this by considering the action
of translations as the exponential of the Lie algebra representation operators
Γ0S (pj ) = −iPj
and the action of rotations as the exponential of the Γ0S (l) = −iL
196
relationship to the Schrödinger representation as in two spatial dimensions. The
main difference is that the rotation group is now three dimensional and non-
commutative, so instead of the single Lie algebra basis element l we have three
of them, satisfying Poisson bracket relations that are the Lie algebra relations
of so(3)
l=q×p
or, in components
l1 = q2 p3 − q3 p2 , l2 = q3 p1 − q1 p3 , l3 = q1 p2 − q2 p1
The Euclidean group E(3) is a subgroup of the Jacobi group GJ (3) in the
same way as in two dimensions, and the Schrödinger representation ΓS provides
a representation of E(3) with Lie algebra version
∂ ∂
Γ0S (l1 ) = −iL1 = −i(Q2 P3 − Q3 P2 ) = −(q2 − q3 )
∂q3 ∂q2
∂ ∂
Γ0S (l2 ) = −iL2 = −i(Q3 P1 − Q1 P3 ) = −(q3 − q1 )
∂q1 ∂q3
197
∂ ∂
Γ0S (l3 ) = −iL3 = −i(Q1 P2 − Q2 P1 ) = −(q1 − q2 )
∂q2 ∂q1
∂
Γ0S (pj ) = −iPj = −
∂qj
These are just the infinitesimal versions of the action of E(3) on functions
induced from its action on position space R3 . Given an element g = (a, R) ∈
E(3) ⊂ GJ (3) we have a unitary transformation on wavefunctions
ψ(p)
e = ψeE (p)δ(|p|2 − 2mE)
198
characterized by functions ψeE (p) defined on the sphere |p|2 = 2mE.
√
Such complex-valued functions on the sphere of radius 2mE provide a
Fourier-transformed version ue of the irreducible representation of E(3). Here
the action of the group E(3) is by
for translations, by
e(0, R)ψeE (p) = ψeE (R−1 p)
u
−1
u
e(a, R)ψeE (p) = u u(0, R)ψeE (p) = e−ia·R p ψeE (R−1 p)
e(a, 1)e
• There is a second Casimir operator which one can show commutes with
the E(3) action, given by
L · P = L1 P1 + L2 P2 + L3 P3
199
For single-component wavefunctions, a straightforward computation shows
that the second Casimir operator L · P acts as zero. By introducing wavefunc-
tions with several components, together with an action of SO(3) that mixes the
components, it turns out that one can get new irreducible representations, with
a non-zero value of the second Casimir corresponding to a non-trivial weight of
the action of the SO(2) of rotations about the momentum vector.
One can construct such multiple-component wavefunctions as representa-
tions of E(3) by taking the tensor product of our irreducible representation on
wavefunctions of energy E (call this HE ) and the finite dimensional irreducible
representation C2s+1
HE ⊗ C2s+1
The Lie algebra representation operators for the translation part of E(3) act as
momentum operators on HE and as 0 on C2s+1 . For the SO(3) part of E(3),
we get operators we can write as
Jj = Lj + Sj
200
erator, which will now be
J·P
We will not work out the details of this here (although details can be found
in chapter 31 for the case s = 21 , where SO(3) is replaced by Spin(3)). What
happens is that the tensor product breaks up into irreducibles as
n=s
M
HE ⊗ C2s+1 = HE,n
n=−s
201
202
Chapter 18
Representations of
Semi-direct Products
203
The reader should be warned that much of the material included in this
chapter is not well-motivated by its applications to non-relativistic quantum
mechanics, where it is not obviously needed. The motivation is rather provided
by the more complicated case of relativistic quantum field theory, but it seems
worthwhile to first see how things work in a simpler context. In particular, the
discussion of representations of N o K for M commutative is motivated by the
case of the Poincaré group (see chapter 39), and that of intertwining operators
by the case of symmetry groups acting on quantum fields (see chapter 36).
The Stone-von Neumann theorem assures us that these must all be unitarily
equivalent, so there must exist unitary operators Uk satisfying
Operators like this that relate two representations are called “intertwining
operators”.
204
the Uk should satisfy the group homomorphism property
µL ∈ sp(2d, R) → UL0
205
The Lie algebra relation (see 14.14)
q T q
{µL , }=L (18.3)
p p
Exponentiating this UL0 will give us our Uk , and thus the operators we want.
Note that if we only need to satisfy equation 18.4 the UL0 can be changed by
a constant times the identity operator, but such a change would be inconsistent
with equation 18.2 (and thus called an “anomaly”). Equation 18.4 could be
written
[UL0 , Γ0S (X)] = Γ0S (L · X) (18.5)
where
d
L·X = φ tL (X)|t=0
dt e
It is the infinitesimal expression of the intertwining property 18.1.
206
By equation 14.5 this is
1 1 0 q 1
µL = q p = (q 2 + p2 )
2 0 1 p 2
i
UL0 = − (Q2 + P 2 )
2
satisfying
Q −P
[UL0 , ]= (18.6)
P Q
and intertwining operators
0 2
θ
+P 2 )
Ug = eθUL = e−i 2 (Q
One has
d2
(Q2 + P 2 )ψ(q) = (q 2 − )ψ(q) = ψ(q)
dq 2
so ψ(q) is an eigenvector of Q2 + P 2 with eigenvalue 1. As one goes around the
group SO(2) once (taking θ from 0 to 2π), the representation acts by a phase
that only goes from 0 to π, demonstrating the same problem that occurs in the
case of the spinor representation.
Conjugating the Heisenberg Lie algebra representation operators by the uni-
tary operators Ug intertwines the representations corresponding to rotations of
the phase space plane by an angle θ.
θ 2 2 Q i θ2 (Q2 +P 2 ) cos θ − sin θ Q
e−i 2 (Q +P ) e = (18.7)
P sin θ cos θ P
Note that this is a quite different calculation than in the spin case where we
also constructed a double cover of SO(2). Despite the quite different context
(rotations acting on an infinite dimensional state space), again one sees the
double cover here, as either Ug or −Ug will give the same rotation.
This example will be studied in much greater detail when we get to the
theory of the quantum harmonic oscillator in chapter 20. Note that the SO(2)
group action here inherently requires using both coordinate and momentum
variables, it is not a symmetry that can be seen just by looking at the problem
in configuration space.
207
18.2.2 The SO(2) action by rotations of the plane for d = 2
In the case d = 2 there is a another example of an SO(2) group which is a
subgroup of the symplectic group, here Sp(4, R). This is the group of rotations
of the configuration space R2 , with a simultaneous rotation of the momentum
space, leaving invariant the Poisson bracket. The group SO(2) acts on cq1 q1 +
cq2 q2 + cp1 p1 + cp2 p2 ∈ M by
cq1 cq1 cos θ − sin θ 0 0 cq1
cq2 cq2 sin θ cos θ 0 0 cq2
→ g =
cp1 cp1 0 0 cos θ − sin θ cp1
c p2 cp2 0 0 sin θ cos θ cp2
is
0 −1
µL = −q · p = q1 p2 − q2 p1
1 0
This is just the formula for the angular momentum corresponding to rotation
about an axis perpendicular to the q1 − q2 plane
l = q1 p2 − q2 p1
UL0 = −i(Q1 P2 − Q2 P1 )
208
satisfying
Q1 Q2 P1 P2
[UL0 , ]= , [UL0 , ]=
Q2 −Q1 P2 −P1
Exponentiating gives a representation of SO(2)
where
c0q1
cos θ − sin θ cq1
=
c0q2 sin θ cos θ cq2
Note that for this SO(2) the double cover is trivial. As far as this subgroup
of Sp(4, R) is concerned, there is no need to consider the double cover M p(4, R)
to get a well-defined representation.
Replacing the matrix L by
A 0
0 A
for A any real 2 by 2 matrix
a11 a12
A=
a21 a22
Note that the action of A on the momentum operators is the dual of the action
on the position operators. Only in the case of an orthogonal action (the SO(2)
earlier) are these the same, with AT = −A.
209
special cases of the Euclidean groups in 2 and 3 dimensions were covered in
chapter 17 and the Poincaré group case will be discussed in chapter 39.
For a general commutative group N , one does not have the simplifying fea-
ture of the Heisenberg group, the uniqueness of its irreducible representation.
For commuative groups on the other hand, while there are many irreducible
representations, they are all one-dimensional. As a result, the set of represen-
tations of N acquires its own group structure, also commutative, and one can
define
Definition (Character group). For N a commutative group, let N
b be the set of
characters of N , i.e. functions
α:N →C
The elements of N
b form a group, with multiplication
We only will actually need the case N = Rd , where we have already seen
that the differentiable irreducible representations are one-dimensional and given
by
αp (a) = eip·a
where a ∈ N . So the character group in this case is Nb = Rd , with elements
labeled by the vector p.
For a semidirect product N o K, we will have an automorphism φk of N for
each k ∈ K. From this action on N , we get an induced action on functions on
N , in particular on elements of N
b , by
φbk : α ∈ N
b → φbk (α) ∈ N
b
so
φbk (αp ) = α(φ−1 )T (p)
k
210
To analyze representations (π, V ) of N o K, one can begin by restricting
attention to the N action, decomposing V into subspaces Vα where N acts
according to α. v ∈ V is in the subspace Vα when
Theorem.
v ∈ Vα =⇒ π(0, k)v ∈ Vφbk (α)
Proof. Using the definition of the semi-direct product in chapter 16 one can
show that the group multiplication satisfies
For each α ∈ N b one can look at its orbit under the action of K by φbk , which
will give a subset Oα ⊂ N b . From the above theorem, we see that if Vα 6= 0,
then we will also have Vβ 6= 0 for β ∈ Oα , so one piece of information that
characterizes a representation V is the set of orbits one gets in this way.
α also defines a subgroup Kα ⊂ K consisting of group elements whose action
on Nb leaves α invariant:
211
gave new representations corresponding to a choice of orbit Oα and a choice
of irreducible representation of Kα = SO(2). We did not show this, but this
construction gives an irreducible representation when a single orbit Oα occurs
(with a transitive K action), with an irreducible representation of Kα on Vα .
We will not further pursue the general theory here, but one can show that
distinct irreducible representations of N o K will occur for each choice of an
orbit Oα and an irreducible representation of Kα . One way to construct these
representations is as the solution space of an appropriate wave-equation, with
the wave-equation corresponding to the eigenvalue equation for a Casimir oper-
ator. In general, other “subsidiary conditions” then need be imposed to pick out
a subspace of solutions that give an representation of N o K, this corresponds
to the existence of other Casimir operators.
212
Chapter 19
p2
h= + V (q)
2m
for some function V (q). In the physical case of three dimensions, this will be
1 2
h= (p + p22 + p23 ) + V (q1 , q2 , q3 )
2m 1
213
erator for a particle moving in a potential V (q1 , q2 , q3 ) will be
1
H= (P 2 + P22 + P33 ) + V (Q1 , Q2 , Q3 )
2m 1
−~2 ∂ 2 ∂2 ∂2
= ( 2 + 2 + 2 ) + V (q1 , q2 , q3 )
2m ∂q1 ∂q2 ∂q3
−~2
= ∆ + V (q1 , q2 , q3 )
2m
We will be interested in so-called “central potentials”, potential functions that
are functions only of q12 + q22 + q32 , and thus only depend upon r, the radial
distance to the origin. For such V , both terms in the Hamiltonian will be
SO(3) invariant, and eigenspaces of H will be representations of SO(3).
Using the expressions for the angular momentum operators in spherical co-
ordinates derived in chapter 8 (including equation 8.8 for the Casimir operator
L2 ), one can show that the Laplacian has the following expression in spherical
coordinates
∂2 2 ∂ 1
∆= 2 + − L2
∂r r ∂r r2
The Casmir operator L2 has eigenvalues l(l + 1) on irreducible representations
of dimension 2l + 1 (integral spin l). So, restricted to such an irreducible repre-
sentation, we have
∂2 2 ∂ l(l + 1)
∆= 2 + −
∂r r ∂r r2
To solve the Schrödinger equation, we want to find the eigenfunctions of
H. The space of eigenfunctions of energy E will be a sum of of irreducible
representations of SO(3), with the SO(3) acting on the angular coordinates
of the wavefunctions, leaving the radial coordinate invariant. We have seen in
chapter 8 that such representations on functions of angular coordinates can be
explicitly expressed in terms of the spherical harmonic functions Ylm (θ, φ). So,
to find eigenfunctions of the Hamiltonian
~2
H=− ∆ + V (r)
2m
we want to find functions glE (r) depending on l = 0, 1, 2, . . . and the energy
eigenvalue E satisfying
−~2 d2 2 d l(l + 1)
( ( 2+ − ) + V (r))glE (r) = EglE (r)
2m dr r dr r2
Given such a glE (r) we will have
and the
ψ(r, θ, φ) = glE (r)Ylm (θ, φ)
214
will span a 2l + 1 dimensional (since m = −l, −l + 1, . . . , l − 1, l) space of energy
eigenfunctions for H of eigenvalue E.
For a general potential function V (r), exact solutions for the eigenvalues E
and corresponding functions glE (r) cannot be found in closed form. One special
case where we can find such solutions is for the three-dimensional harmonic os-
cillator, where V (r) = 21 mω 2 r2 . These are much more easily found though using
the creation and annihilation operator techniques to be discussed in chapter 20.
The other well-known and physically very important case is the case of a 1r
potential, called the Coulomb potential. This describes a light charged particle
moving in the potential due to the electric field of a much heavier charged
particle, a situation that corresponds closely to that of a hydrogen atom. In
this case we have
e2
V =−
r
−~2 d2 2 d l(l + 1) e2
( ( 2+ − 2
) − )glE (r) = EglE (r)
2m dr r dr r r
Since having
d2
(rg) = Erg
dr2
is equivalent to
d2 2
( + )g = Eg
dr2 r
−~2 d2 l(l + 1) e2
( ( 2− 2
) − )rglE (r) = ErglE (r)
2m dr r r
The solutions to this equation can be found through a rather elaborate pro-
cess described in most quantum mechanics textbooks, which involves looking
for a power series solution. For E ≥ 0 there are non-normalizable solutions that
describe scattering phenomena that we won’t study here. For E < 0 solutions
correspond to an integer n = 1, 2, 3, . . ., with n ≥ l + 1. So, for each n we get n
solutions, with l = 0, 1, 2, . . . , n − 1, all with the same energy
me4
En = −
2~2 n2
215
The degeneracy in the energy values leads one to suspect that there is some
extra group action in the problem commuting with the Hamiltonian. If so, the
eigenspaces of energy eigenfunctions will come in irreducible representations
of some larger group than SO(3). If the representation of the larger group
is reducible when one restricts to the SO(3) subgroup, giving n copies of our
SO(3) representation of spin l, that would explain the pattern observed here.
In the next section we will see that this is the case, and there use representation
theory to derive the above formula for En .
We won’t go through the process of showing how to explicitly find the func-
tions glEn (r) but just quote the result. Setting
~2
a0 =
me2
(this has dimensions of length and is known as the “Bohr radius”), and defining
216
gnl (r) = glEn (r) the solutions are of the form
r 2r l 2l+1 2r
gnl (r) ∝ e− na0 ( ) Ln+l ( )
na0 na0
for
n = 1, 2, . . .
l = 0, 1, . . . , n − 1
m = −l, −l + 1, . . . , l − 1, l
The first few of these, properly normalized, are
1 r
ψ100 = p 3 e− a0
πa0
1 r r
ψ200 = p (1 − )e− 2a0
3
8πa0 2a 0
1 r r
ψ211 = − p 3 e− 2a0 sin θeiφ
8 πa0 a0
1 r − 2ar
ψ211 = − p e 0 cos θ
4 2πa0 a0
3
1 r r
ψ21−1 = p 3 e− 2a0 sin θe−iφ
8 πa0 a0
217
motion comes from conservation of angular momentum, which corresponds to
the Poisson bracket relation
{lj , h} = 0
Here we’ll take the Coulomb version of the Hamiltonian that we need for the
hydrogen atom problem
1 e2
h= |p|2 −
2m r
One can read the relation {lj , h} = 0 in two ways:
• The components of the angular momentum (lj ) are invariant under the
action of the group (R of time translations) whose infinitesimal generator
is h, so the angular momentum is a conserved quantity.
Kepler’s first and third laws have a different origin, coming from the existence
of a new conserved quantity for this special choice of Hamiltonian. This quantity
is, like the angular momentum, a vector, often called the Lenz (or sometimes
Runge-Lenz, or even Laplace-Runge-Lenz) vector.
Definition (Lenz vector). The Lenz vector is the vector-valued function on the
phase space R6 given by
1 q
w= (l × p) + e2
m |q|
l·w =0
We won’t here explicitly calculate the various Poisson brackets involving the
components wj of w, since this is a long and unilluminating calculation, but
will just quote the results, which are
•
{wj , h} = 0
This says that, like the angular momentum, the vector with components
wj is a conserved quantity under time evolution of the system, and its
components generate symmetries of the classical system.
•
{lj , wk } = jkl wl
These relations say that the generators of the SO(3) symmetry act on wj
the way one would expect, since wj is a vector.
218
•
−2h
{wj , wk } = jkl ll ( )
m
This is the most surprising relation, and it has no simple geometrical
explanation (although one can change variables in the problem to try and
give it one). It expresses a highly non-trivial relationship between the two
sets of symmetries generated by the vectors l, w and the Hamiltonian h.
The wj are cubic in the q and p variables, so one would expect that the
Groenewold-van Hove no-go theorem would tell one that there is no consistent
way to quantize this system by finding operators Wj corresponding to the wj
that would satisfy the commutation relations corresponding to these Poisson
brackets. It turns out though that this can be done, although not for functions
defined over the entire phase-space. One gets around the no-go theorem by
doing something that only works when the Hamiltonian h is negative (we’ll be
taking a square root of −h).
The choice of operators Wj that works is
1 Q
W= (L × P − P × L) + e2
2m |Q|2
[Wj , H] = 0
[Lj , Wk ] = i~jkl Wl
2
[Wj , Wk ] = i~jkl Ll (− H)
m
as well as
L·W =W·L=0
The first of these shows that energy eigenstates will be preserved not just by
the angular momentum operators Lj , but by a new set of non-trivial operators,
the Wj , so will be representations of a larger Lie algebra thatn so(3).
In addition, one has the following relation between W 2 , H and the Casimir
operator L2
2
W 2 = e4 1 + H(L2 + ~2 1)
m
and it is this which will allow us to find the eigenvalues of H, since we know
those for L2 , and can find those of W 2 by changing variables to identify a second
so(3) Lie algebra.
To do this, first change normalization by defining
r
−m
K= W
2E
219
where E is the eigenvalue of the Hamiltonian that we are trying to solve for.
Note that it is at this point that we violate the conditions of the no-go theorem,
since we must have E < 0 to get a K with the right properties, and this restricts
the validity of our calculations to a subset of the energy spectrum. For E > 0
one can proceed in a similar way, but the Lie algebra one gets is different (so(3, 1)
instead of so(4)).
One then has the following relation between operators
2H(K 2 + L2 + ~2 1) = −me4 1
and the following commutation relations
[Lj , Lk ] = i~jkl Ll
[Lj , Kk ] = i~jkl Kl
[Kj , Kk ] = i~jkl Ll
Defining
1 1
M= (L + K), N = (L − K)
2 2
one has
[Mj , Mk ] = i~jkl Ml
[Nj , Nk ] = i~jkl Nl
[Mj , Nk ] = 0
This shows that we have two commuting copies of so(3) acting on states, spanned
respectively by the Mj and Nj , with two corresponding Casimir operators M 2
and N 2 .
Using the fact that
L·K=K·L=0
one finds that
M2 = N2
Recall from our discussion of rotations in three dimensions that representa-
tions of so(3) = su(2) correspond to representations of Spin(3) = SU (2), the
double cover of SO(3) and the irreducible ones have dimension 2l + 1, with l
half-integral. Only for l integral does one get representations of SO(3), and it
is these that occur in the SO(3) representation on functions on R3 . For four di-
mensions, we found that Spin(4), the double cover of SO(4), is SU (2) × SU (2),
and one thus has spin(4) = so(4) = su(2)×su(2) = so(3)×so(3). This is exactly
the Lie algebra we have found here, so one can think of the Coulomb problem as
having an so(4) symmetry. The representations that will occur can include the
half-integral ones, since neither so(3) is the so(3) of physical rotations in 3-space
(those are generated by L = M + N, which will have integral eigenvalues of l).
The relation between the Hamiltonian and the Casimir operators M 2 and
2
N is
2H(K 2 + L2 + ~2 1) = 2H(2M 2 + 2N 2 + ~2 1) = 2H(4M 2 + ~2 1) = −me4 1
220
On irreducible representations of so(3) of spin µ, we will have
M 2 = µ(µ + 1)1
for some half-integral µ, so we get the following equation for the energy eigen-
values
−me4 −me4
E=− 2 =− 2
2~ (4µ(µ + 1) + 1) 2~ (2µ + 1)2
Letting n = 2µ + 1, for µ = 0, 21 , 1, . . . we get n = 1, 2, 3, . . . and precisely the
same equation for the eigenvalues described earlier
me4
En = −
2~2 n2
It is not hard to show that the irreducible representations of a product like
so(3) × so(3) are just tensor products of irreducibles, and in this case the two
factors of the product are identical due to the equality of the Casimirs M 2 = N 2 .
The dimension of the so(3)×so(3) irreducibles is thus (2µ+1)2 = n2 , explaining
the multiplicity of states one finds at energy eigenvalue En .
(or, equivalently, replace our state space H of wavefunctions by the tensor prod-
uct H ⊗ C2 ) in a way that we will examine in detail in chapter 31.
The Hamiltonian operator for the hydrogen atom acts trivially on the C2
factor, so the only effect of the additional wavefunction component is to double
the number of energy eigenstates at each energy. Electrons are fermions, so
antisymmetry of multi-particle wavefunctions implies the Pauli principle that
states can only be occupied by a single particle. As a result, one finds that when
adding electrons to an atom described by the Coulomb potential problem, the
first two fill up the lowest Coulomb energy eigenstate (the ψ100 or 1S state at n =
1), the next eight fill up the n = 2 states ( two each for ψ200 , ψ211 , ψ210 , ψ21−1 ),
etc. This goes a long ways towards explaining the structure of the periodic table
of elements.
When one puts a hydrogen atom in a constant magnetic field B, the Hamil-
tonian acquires a term that acts only on the C2 factor, of the form
2e
B·σ
mc
221
This is exactly the sort of Hamiltonian we began our study of quantum mechan-
ics with for a simple two-state system. It causes a shift in energy eigenvalues
proportional to ±|B| for the two different components of the wavefunction, and
the observation of this energy splitting makes clear the necessity of treating the
electron using the two-component formalism.
222
Chapter 20
In this chapter we’ll begin the study of the most important exactly solvable
physical system, the harmonic oscillator. Later chapters will discuss extensions
of the methods developed here to the case of fermionic oscillators, as well as free
quantum field theories, which are harmonic oscillator systems with an infinite
number of degrees of freedom.
223
20.1 The harmonic oscillator with one degree of
freedom
An even simpler case of a particle in a potential than the Coulomb potential of
the last chapter is the case of V (q) quadratic in q. This is also the lowest-order
approximation when one studies motion near a local minimum of an arbitrary
V (q), expanding V (q) in a power series around this point. We’ll write this as
p2 1
h= + mω 2 q 2
2m 2
with coefficients chosen so as to make ω the angular frequency of periodic motion
of the classical trajectories. These satisfy Hamilton’s equations
∂V p
ṗ = − = mω 2 q, q̇ =
∂q m
so
q̈ = −ω 2 q
which will have solutions with periodic motion of angular frequency ω. These
solutions can be written as
we have
so
1 1
c+ = q(0) + i p(0)
2 2mω
The classical phase space trajectories are
1 1 1 1
q(t) = ( q(0) + i p(0))eiωt + ( q(0) − i p(0))e−iωt
2 2mω 2 2mω
imω 1 −imω 1
p(t) = ( q(0) − p(0))eiωt + ( q(0) + p(0))e−iωt
2 2 2 2
224
Instead of using two real coordinates to describe points in the phase space
(and having to introduce a reality condition when using complex exponentials),
one can instead use a single complex coordinate
1 i
z(t) = √ (q(t) − p(t))
2 mω
Then the equation of motion is a first-order rather than second-order differential
equation
ż = iωz
with solutions
z(t) = z(0)eiωt (20.1)
The classical trajectories are then realized as complex functions of t, and paramet-
rized by the complex number
1 i
z(0) = √ (q(0) − p(0))
2 mω
Since the Hamiltonian is just quadratic in the p and q, we have seen that we
can construct the corresponding quantum operator uniquely using the Schröding-
er representation. For H = L2 (R) we have a Hamiltonian operator
P2 1 ~2 d2 1
H= + mω 2 Q2 = − + mω 2 q 2
2m 2 2m dq 2 2
To find solutions of the Schrödinger equation, as with the free particle, one
proceeds by first solving for eigenvectors of H with eigenvalue E, which means
finding solutions to
~2 d2 1
HψE = (− 2
+ mω 2 q 2 )ψE = EψE
2m dq 2
Solutions to the Schrödinger equation will then be linear combinations of the
functions
i
ψE (q)e− ~ Et
Standard but somewhat intricate methods for solving differential equations
like this show that one gets solutions for E = En = (n + 21 )~ω, n a non-negative
integer, and the normalized solution for a given n (which we’ll denote ψn ) will
be r
mω 1 mω − mω q2
ψn (q) = ( 2n 2
) 4 Hn ( q)e 2~ (20.2)
π~2 (n!) ~
where Hn is a family of polynomials called the Hermite polynomials. The
ψn provide an orthonormal basis for H (one does not need to consider non-
normalizable wavefunctions as in the free particle case), so any initial wavefunc-
tion ψ(q, 0) can be written in the form
∞
X
ψ(q, 0) = cn ψn (q)
n=0
225
with Z +∞
cn = ψn (q)ψ(q, 0)dq
−∞
(note that the ψn are real-valued). At later times, the wavefunction will be
∞ ∞
− ~i En t 1
X X
ψ(q, t) = cn ψn (q)e = cn ψn (q)e−i(n+ 2 )ωt
n=0 n=0
[Q, P ] = i~1
we define
r r r r
mω 1 † mω 1
a= Q+i P, a = Q−i P
2~ 2mω~ 2~ 2mω~
which satisfy the commutation relation
[a, a† ] = 1
226
Up to the constant 21 , H is given by the operator
N = a† a
and
[N, a† ] = a†
If |ci is a normalized eigenvector of N with eigenvalue c, one has
and
N a† |ci = ([N, a† ] + a† N )|ci = a† (N + 1)|ci = (c + 1)a† |ci
This shows that a|ci will have eigenvalue c − 1 for N , and a normalized eigen-
function for N will be
1
|c − 1i = √ a|ci
c
Similarly, since
we have
1
|c + 1i = √ a† |ci
c+1
We can find eigenfunctions for H by first solving
a|0i = 0
for |0i (the lowest energy or “vacuum” state) which will have energy eigenvalue
1 † 1
2 , then acting by a n-times on |0i to get states with energy eigenvalue n + 2 .
The equation for |0i is thus
1 1 d
a|0i = √ (Q + iP )ψ0 (q) = √ (q + )ψ0 (q) = 0
2 2 dq
One can check that all solutions to this are all of the form
q2
ψ0 (q) = Ce− 2
227
so there is a unique normalized lowest-energy eigenfunction
1 q2
ψ0 (q) = 1 e− 2
π 4
a† a† a† 1 d q2
|ni = √ · · · √ √ |0i = 1 n √ (q − )n e− 2
n 2 1 π 4 2 2 n! dq
which (after putting back in constants and consulting the definition of a Hermite
polynomial) can be shown to give the eigenfunctions claimed earlier in equation
20.2.
In the physical interpretation of this quantum system, the state |ni, with
energy ~ω(n + 21 ) is thought of as a state describing n “quanta”. The state
|0i is the “vacuum state” with zero quanta, but still carrying a “zero-point”
energy of 12 ~ω. The operators a† and a have somewhat similar properties to
the raising and lowering operators we used for SU (2) but their commutator
is different (just the identity operator), leading to simpler behavior. In this
case they are called “creation” and “annihilation” operators respectively, due
to the way they change the number of quanta. The relation of such quanta to
physical particles like the photon is that quantization of the electromagnetic field
involves quantization of an infinite collection of oscillators, with the quantum
of an oscillator corresponding physically to a photon with a specific momentum
and polarization. This leads to a well-known problem of how to handle the
infinite vacuum energy corresponding to adding up 21 ~ω for each oscillator.
The first few eigenfunctions are plotted below. The lowest energy eigenstate
is a Gaussian centered at q = 0, with a Fourier transform that is also a Gaussian
centered at p = 0. Classically the lowest energy solution is an oscillator at rest at
its equilibrium point (q = p = 0), but for a quantum oscillator one cannot have
such a state with a well-defined position and momentum. Note that the plot
gives the wavefunctions, which in this case are real and can be negative. The
square of this function is what has an intepretation as the probability density
for measuring a given position.
228
20.3 The Bargmann-Fock representation
Working with the operators a and a† and their commutation relation
[a, a† ] = 1
makes it clear that there is a simpler way to represent these operators than
the Schrödinger representation as operators on position space functions that we
have been using, while the Stone-von Neumann theorem assures us that this will
be unitarily equivalent to the Schrödinger representation. This representation
appears in the literature under a large number of different names, depending on
the context, all of which refer to the same representation:
229
where w = u + iv. We define the following two operators acting on this space:
d
a= , a† = w
dw
One has
d d n
[a, a† ]wn = (wwn ) − w w = (n + 1 − n)wn = wn
dw dw
so this commutator is the identity operator on polynomials
[a, a† ] = 1
and
Theorem. The Bargmann-Fock representation has the following properties
• The elements
wn
√
n!
of F for n = 0, 1, 2, . . . are orthornormal.
• The operators a and a† are adjoints with respect to the given inner product
on F.
• The basis
wn
√
n!
of F for n = 0, 1, 2, . . . is complete.
Proof. The proofs of the above statements are not difficult, in outline they are
• For orthonormality one can just compute the integrals
Z
2
wm wn e−|w| dudv
C
in polar coordinates.
d
• To show that w and dw are adjoint operators, use integration by parts.
• For completeness, assume hn|ψi = 0 for all n. The expression for the |ni
as Hermite polynomials times a Gaussian implies that
Z
q2
F (q)e− 2 ψ(q)dq = 0
q2
for all polynomials F (q). Computing the Fourier transform of ψ(q)e− 2
gives
∞
(−ikq)j − q2
Z 2
Z X
−ikq − q2
e e ψ(q)dq = e 2 ψ(q)dq = 0
j=0
j!
q2
So ψ(q)e− 2 has Fourier transform 0 and must be 0 itself. Alternatively,
one can invoke the spectral theorem for the self-adjoint operator H, which
guarantees that its eigenvectors form a complete and orthonormal set.
230
Since in this representation the number operator N = a† a satisfies
d n
N wn = w w = nwn
dw
the monomials in w diagonalize the number and energy operators, so one has
wn
|ni = √
n!
B : HBF → HS
to express operators either purely in terms of aj and a†j , which have a simple
expression
∂
aj = , a†j = wj
∂wj
in the Bargmann-Fock representation, or purely in terms of Qj and Pj which
have a simple expression
∂
Qj = qj , Pj = −i
∂qj
231
in the Schrödinger representation.
To give an idea of what the Bargmann transform looks like explicitly, we’ll
just give the formula for the d = 1 case here, without proof. If ψ(q) is a state
in HS = L2 (R), then
Z +∞
1 2 q2
(Bψ)(z) = 1 e−z − 2 +2qz ψ(q)dq
π 4 −∞
One can check this equation for the case of the lowest energy state in the
Schrödinger representation, where |0i has coordinate space representation
1 q2
ψ(q) = 1 e− 2
π 4
and
Z +∞
1 2 2
− q2 +2qz 1 q2
(Bψ)(z) = 1 e−z 1 e− 2 dq
π4 −∞ π 4
Z +∞
1 2
−q 2 +2qz
= 1 e−z dq
π2 −∞
Z +∞
1 2
= 1 e−(q−z) dq
π2 −∞
Z +∞
1 2
= 1 e−q dq
π 2 −∞
=1
which is the expression for the state |0i in the Bargmann-Fock representation.
H = Fd = F ⊗ · · · ⊗ F
| {z }
d times
Qj , Pj j = 1, . . . d
satisfying
[Qj , Pk ] = iδjk 1, [Qj , Qk ] = [Pj , Pk ] = 0
232
where Qj and Pj just act on the j’th term of the tensor product in the usual
way.
We can now define annihilation and creation operators in the general case:
Definition (Annihilation and creation operators). The 2d operators
1 1
aj = √ (Qj + iPj ), a†j = √ (Qj − iPj ), j = 1, . . . , d
2 2
Using the fact that tensor products of function spaces correspond to func-
tions on the product space, in the Schrödinger representation we have
H = L2 (Rd )
where one should keep in mind that one can rescale each degree of freedom
separately, allowing different parameters ωj for the different degrees of freedom.
The energy and number operator eigenstates will be written
|n1 , . . . , nd i
where
a†j aj |n1 , . . . , nd i = Nj |n1 , . . . , nd i = nj |n1 , . . . , nd i
Note that for d = 3 the harmonic oscillator problem is an example of the cen-
tral potential problems described in chapter 19. It has an SO(3) symmetry, with
angular momentum operators that commute with the Hamiltonian, and space
of energy eigenstates that can be organized into irreducible SO(3) representa-
tions. In the Schrödinger representation states are in H = L2 (R3 ), decribed
by wavefunctions that can be written in rectangular or spherical coordinates,
and the Hamiltonian is a second order differential operator. In the Bargmann-
Fock representation, states in F3 are described by holomorphic functions of 3
complex variables, with operators given in terms of products of annihilation
233
and creation operators. The Hamiltonian is, up to a constant, just the number
operator, with energy eigenstates homogeneous polynomials (with eigenvalue of
the number operator their degree).
Either the Pj , Qk or the aj , a†k together with the identity operator will give
a representation of the Heisenberg Lie algebra h2d+1 on H, and by exponentia-
tion a representation of the Heisenberg group H2d+1 . Quadratic combinations
of these operators will give a representation of sp(2d, R), the Lie algebra of
Sp(2d, R). In the next chapters we will study these and other aspects of the
quantum harmonic oscillator as a unitary representation.
234
Chapter 21
235
mute with the Hamiltonian, it does have physically important aspects. In par-
ticular it takes the state |0i to a distinguished set of states known as “coherent
states”. These states are labeled by points of the phase space R2d and provide
the closest analog possible in the quantum system of classical states (i.e. those
with a well-defined value of position and momentum variables).
The zj were then quantized using creation operators a†j , the z j using annihilation
operators aj . In the Bargmann-Fock representation, where the state space is a
space of functions of complex variables wj , we have
∂
aj = , a†j = wj
∂wj
236
such that
J 2 = −1
Given such a pair (V = R2d , J), one can break up complex linear combina-
tions of vectors in V into those on which J acts as i and those on which it acts
as −i (since J 2 = −1, its eigenvalues must be ±i). Note that we have extended
the action of J on V to an action on V ⊗ C using complex linearity. One has
V ⊗ C = VJ+ ⊕ VJ−
J0 qj = pj , J0 pj = −qj
237
since one has
1
J0 zj = √ (pj + iqj ) = izj
2
Basis elements of M−
J0 are the complex conjugates
1
z j = √ (qj + ipj )
2
With respect to the chosen basis qj , pj , one can write a complex structure
as a matrix. For the case of J0 and for d = 1, on an arbitrary element of M
one has
J0 (cq q + cp p) = cq p − cp q
so J0 in matrix form with respect to the basis (q, p) is
cq 0 −1 cq −cp
J0 = = (21.2)
cp 1 0 cp cq
h2d+1 = M ⊕ R
where the R component is the constant functions. The Lie bracket is just the
Poisson bracket. Complexifying this, one has
−
h2d+1 ⊗ C = (M ⊕ R) ⊗ C = (M ⊗ C) ⊕ C = M+
J ⊕ MJ ⊕ C
This complexified Lie algebra is still a Lie algebra, with the Lie bracket relations
extended from the real Lie algebra by complex linearity. One has
[Γ0J (u1 , c1 ), Γ0J (u2 , c2 )] = Γ0J ([(u1 , c1 ), (u2 , c2 )]) = Γ0J (0, Ω(u1 , u2 )) (21.3)
238
Note that Γ0J will only be a unitary representation (with Γ0J (u, c) skew-adjoint
operators) for (u, c) in the real Lie subalgebra h2d+1 (meaning u ∈ M, c ∈ R).
Since we can write
From the definition of the symplectic group in chapter 14, this condition just
says that J ∈ Sp(2d, R). Since we are extending the action of J to M ⊗ C by
complex linearity, this condition will remain true for u1 , u2 ∈ M ⊗ C. Given
this condition, the Γ0J (u+ , 0) will commute, since if u+ + +
1 , u2 ∈ MJ , by 21.3 we
have
[Γ0J (u+ 0 + 0 + +
1 , 0), ΓJ (u2 , 0)] = ΓJ (0, Ω(u1 , u2 ))
and
Ω(u+ + + + + + + +
1 , u2 ) = Ω(Ju1 , Ju2 ) = Ω(iu1 , iu2 ) = −Ω(u1 , u2 ) = 0
The Γ0J (u− , 0) will commute with each other by essentially the same argument.
For the case of the standard complex structure J = J0 , one can check this
compatibility condition by computing (treating the d = 1 case, which generalizes
easily, and using equations 14.1 and 21.2)
0
0 0 0 −1 cq T 0 1 0 −1 cq
Ω(J0 (cq q + cp p), J0 (cq q + cp p)) =( ) ( )
1 0 cp −1 0 1 0 c0p
0
0 1 0 1 0 −1 cq
= cq cp
−1 0 −1 0 1 0 c0p
0
0 1 cq
= cq cp
−1 0 c0p
=Ω(cq q + cp p, c0q q + c0p p)
More simply of course, one could just note that J0 ∈ SL(2, R) = Sp(2, R).
For the case of J = J0 and arbitrary values of d, one can write out explicitly
Ω on basis elements
−
zj ∈ M+ J0 , z j ∈ MJ0
239
as
1 1
Ω(zj , zk ) = {zj , zk } = { √ (qj − ipj ), √ (qj − ipj )} = 0
2 2
1 1
Ω(z j , z k ) = ({z j , z k } = { √ (qj + ipj ), √ (qj + ipj )} = 0
2 2
1 1
Ω(zj , z k ) = {zj , z k } = { √ (qj − ipj ), √ (qj + ipj )} = iδjk
2 2
or
{zj , −iz k } = δjk
Note that here the conjugate variable to zj with respect to the Poisson bracket
is −iz k .
The Lie algebra representation is given in this case by
∂
Γ0J0 (zj , 0) = −ia†j = −iwj , Γ0J0 (z j , 0) = −iaj = −i , Γ0J0 (0, 1) = −i1
∂wj
Note that the operators aj and a†j are not skew-adjoint, so Γ0J0 is not unitary
on the full Lie algebra h2d+1 ⊗ C, but only on the real subspace h2d+1 of real
linear combinations of qj , pj , 1.
[a, a† ] = 1
[Γ0J0 (z, 0), Γ0J0 (z, 0)] = Γ0J0 (0, Ω(z, z)) = −iΩ(z, z)1
240
so positivity here corresponds to the fact that
−iΩ(z, z) = 1 > 0
[a, a† ] = −1
would correspond to interchanging the role of a and a† , with the lowest energy
state now satisfying a† |0i = 0 and no state in the state space satisfying a|0i = 0.
−
This is equivalent to a change of sign of J, interchanging M+ J and MJ . For the
d = 1 case with the wrong sign, one can just make this interchange to construct
the state space, but in higher dimensions one needs to have the same sign for
all [aj , a†j ]. In order to have a state |0i that is annihilated by all annihilation
operators, we need all the commutators [aj , a†j ] to have the positive sign.
For the case of general J and arbitrary d, to get operators Γ0J (u, 0) with the
right positivity properties, we will need the condition
−iΩ(u, u) > 0
For the standard complex structure J0 , we can check this using the matrix
expressions for Ω and J0
0 1 0 −1 cq
Ω(cq q + cp p, J0 (cq q + cp p)) = cq cp
−1 0 1 0 cp
1 0 cq
= cq cp
0 1 cp
=c2q + c2p
Note that Ω(v, Jv) thus gives a positive-definite quadratic function on M. Using
the isomorphism M = M provided by Ω, this corresponds to a positive-definite
241
quadratic function on the phase space itself. For J0 this is just (twice) the
standard harmonic oscillator Hamiltonian function. One application of more
general J is to the case of more general quadratic (but still positive-definite)
Hamiltonian functions.
We can give a name to the class of J for which we will have a formalism of
annihilation and creation operators:
•
−iΩ(u, u) > 0
for non-zero u ∈ M+
J.
•
Ω(v, Jv) > 0
for non-zero v ∈ M.
where aj , a†k satisfy the conventional commutation relations, and z Jj is the com-
plex conjugate of zjJ .
1. Define a norm on M by |v|2 = Ω(v, Jv). One then has a positive inner
product h·, ·iJ on M and by Gram-Schmidt orthornormalization can find
a basis of span{qj } ⊂ M consisting of d vectors qjJ satisfying
242
3. Define
1
zjJ = √ (qjJ − iJqjJ )
2
The zjJ give a complex basis of M+ J
J , their complex conjugates z j a complex
−
basis of MJ .
4. One can check that
∂
Γ0J (zjJ , 0) = −ia†j = −iwj , Γ0J (z Jj , 0) = −iaj = −i
∂wj
1
z = √ (q − ip)
2
by replacing the i by an arbitrary complex number τ . Then the condition that
−
q − τ p be in M+
J and its conjugate in MJ is
1 Re(τ )
J(p) = − q+ p
Im(τ ) Im(τ )
243
One can easily check that det J = 1, so J ∈ SL(2, R) and is compatible with Ω.
The positivity condition here is that the matrix
|τ |2
0 1 1 − Re(τ )
J=
−1 0 Im(τ ) Re(τ ) 1
give a positive quadratic form. This will be the case when Im(τ ) > 0. We
have thus constructed a set of J that are positive, compatible with Ω, and
parametrized by an element τ of the upper half-plane, with J0 corresponding to
τ = i.
To construct annihilation and creation operators satisfying the standard
commutation relations
[aτ , aτ ] = [a†τ , a†τ ] = 0, [aτ , a†τ ] = 1
set
1 1
aτ = p (Q − τ P ), a†τ = p (Q − τ P )
2 Im(τ ) 2 Im(τ )
1
The Hamiltonian with eigenvalues n + 2 for n = 0, 1, 2, · · · will be
1 1
Hτ = (aτ a†τ + a†τ aτ ) = (Q2 + |τ |2 P 2 − Re(τ )(QP + P Q)) (21.6)
2 2 Im(τ )
The lowest energy state will satisfy
aτ |0iτ = 0
which in the Schrödinger representation is the differential equation
d
(Q − τ P )ψ(q) = (q + iτ )ψ(q) = 0
dq
which has as solutions
i 2|ττ |2 q 2
ψ(q) ∝ e (21.7)
This will be a normalizable state for Im(τ ) > 0, again showing the necessity of
the positivity condition.
Eigenstates of Hτ for general τ are known as “squeezed states” in physics.
Note that for τ pure imaginary, they correspond just to a rescaling of variables
1 p
q→p q, p → Im(τ )p
Im(τ )
Such states are “squeezed” in the sense that for Im(τ ) large the position un-
certainty in the state |0iτ will become small (while the momentum uncertainty
becomes large).
The construction for d = 1 can be generalized to arbitrary d, with complex
structures J now parametrized by a d-dimensional complex matrix τ which must
be symmetric (to be compatible with Ω), and such that Im(τ ) is a positive-
definite matrix. The space of such complex structures is known as the Siegel
upper half-space.
244
21.5 Coherent states and the Heisenberg group
action
Since the Hamiltonian for the harmonic oscillator does not commute with the
operators aj or a†j which give the representation of the Lie algebra h2d+1 on
the state space HBF , the Heisenberg Lie group and its Lie algebra are not
symmetries of the system. Energy eigenstates do not break up into irreducible
representations of the group but rather the entire state space makes up such
an irreducible representation. The state space for the harmonic oscillator does
however have a distinguished state, the lowest energy state |0i, and one can ask
what happens to this state under the Heisenberg group action. We’ll study this
question for the simplest case of d = 1 and the standard complex structure J0 .
Considering the basis of operators for the Lie algebra representation 1, a, a† ,
we see that the first acts as a constant on |0i, generating a phase tranformation
of the state, while the second annihilates |0i, so generates group transformations
that leave the state invariant. It is only the third operator a† , that takes |0i to
other non-zero states, and one could consider the family of states
†
eαa |0i
†
for α ∈ C. The transformations eαa are not unitary since αa† is not skew ad-
joint. It is better to fix this by replacing αa† with the skew-adjoint combination
αa† − αa, defining
Definition (Coherent states). The coherent states in H are the states
†
−αa
|αi = eαa |0i
where α ∈ C.
†
Since eαa −αa is unitary, the |αi will be a family of distinct normalized states
in H, with α = 0 corresponding to the lowest energy state |0i. These are, up
to phase transformation, precisely the states one gets by acting on |0i with
arbitrary elements of the Heisenberg group H3 .
Using the Baker-Campbell-Hausdorff formula gives
† † |α|2
−αa
|αi = eαa |0i = eαa e−αa e− 2 |0i
245
and this property could be used as an equivalent definition of coherent states.
Note that coherent states are superpositions of different states |ni, so are
not eigenvectors of the number operator N . They are eigenvectors of
1
a = √ (Q + iP )
2
with eigenvalue α so one can try and think of α as a complex number whose
real part gives the position and imaginary part the momentum. This does not
lead to a violation of the Heisenberg uncertainty principle since this is not a
self-adjoint operator, and thus not an observable. Such states are however very
useful for describing certain sorts of physical phenomena, for instance the state
of a laser beam, where (for each momentum component of the electromagnetic
field) one does not have a definite number of photons, but does have a definite
amplitude and phase.
One thing coherent states do provide is an alternate complete set of norm
one vectors in H, so any state can be written in terms of them. However, these
states are not orthogonal (they are eigenvectors of a non-self-adjoint operator so
the spectral theorem for self-adjoint operators does not apply). One can easily
compute that
2
|hβ|αi|2 = e−|α−β|
One possible reason these states are given the name “coherent” is that they
remain coherent states as they evolve in time (for the harmonic oscillator Hamil-
tonian), with α evolving in time along a classical phase space trajectory. If the
state at t = 0 is a coherent state labeled by α0 (|ψ(0)i = |α0 i), by 21.8, at later
times one has (here ~ = ω = 1)
246
for a highest weight vector, we have
n n n
πn0 (S3 )| i = , πn0 (S+ )| i = 0
2 2 2
and we can create a family of spin coherent states by acting on | n2 i by elements
of SU (2). If we identify states in this family that differ just by a phase, the
states are parametrized by a sphere.
By analogy with the Heisenberg group coherent states, with πn0 (S+ ) playing
the role of the annihilation operator a and πn0 (S− ) playing the role of the creation
operator a† , we can define a skew-adjoint transformation
1 iφ 0 1
θe πn (S− ) − θe−iφ πn0 (S+ )
2 2
and exponentiate to get a family of unitary transformations parametrized by
(θ, φ). Acting on the highest weight state we get a definition of the family of
spin coherent states as
1 iφ 0
(S− )− 12 θe−iφ πn
0 n
|θ, φi = e 2 θe πn (S+ )
| i
2
One can show that the SU (2) group element used here corresponds, in terms of
its action on vectors, to a rotation by an angle θ about the axis (sin φ, − cos φ, 0),
so one can associate the state |θ, φi to the unit vector along the z-axis, rotated
by this transformation.
247
248
Chapter 22
In the last chapter we examined those aspects of the harmonic oscillator quan-
tum system and the Bargmann-Fock representation that correspond to quan-
tization of phase space functions of order less than or equal to one, finding a
unitary representation ΓJ of the Heisenberg group H2d+1 for each positive com-
patible complex structure J on the dual phase space M. We’ll now turn to what
happens for order two functions, which will give a representation of M p(2d, R)
on the harmonic oscillator state space, extending ΓJ to a representation of the
full Jacobi group. In this chapter we will see what happens in some detail for
the d = 1 case, where the symplectic group is just SL(2, R).
The choice of complex structure J corresponds not only to a choice of |0iJ
(since J determines which operators are annihilation operators), but also to
that of a specific subgroup U (1) ⊂ SL(2, R). The nature of the double-cover
needed for ΓJ to be a true representation (not just a representation up to sign)
is best seen by considering the action of this U (1) on the harmonic oscillator
state space. The Lie algebra of this U (1) acts on energy eigenstates with an
extra 12 term, well-known to physicists as the non-zero energy of the vacuum
state, and this shows the need for the double-cover.
249
the standard description in terms of annihilation and creation operators of the
quantum harmonic oscillator system with d degrees of freedom.
Recall from our discussion of the Schrödinger representation Γ0S in section
18 that we can extend that representation from h2d+1 to include quadratic
combinations of the qj , pj , getting a unitary representation of the semi-direct
product h2d+1 o sp(2d, R). Restricting attention to the sp(2d, R) factor, we get
the metaplectic representation, and it is this that we will construct explicitly
using Γ0BF instead of Γ0S . In this chapter, we’ll start with the case d = 1, where
sp(2, R) = sl(2, R).
One can readily compute the Poisson brackets of order two of z and z using
the basic relation {z, z} = i and the Leibniz rule, finding the following for the
non-zero cases
1
Z = E − F, X± = (G ± i(E + F ))
2
which satisfy
z2 z2
↔ X− , ↔ X+ , zz ↔ Z
2 2
The element
1 0 1
zz = (q 2 + p2 ) ↔ Z =
2 −1 0
exponentiates to give a SO(2) = U (1) subgroup of SL(2, R) with elements of
the form
cos θ sin θ
eθZ =
− sin θ cos θ
Note that h = 21 (p2 + q 2 ) = zz is the classical Hamiltonian function for the
harmonic oscillator.
250
We can now quantize quadratics in z and z using annihilation and creation
operators acting on the Fock space F. There is no operator ordering ambiguity
for
d2
z 2 → (a† )2 = w2 , z 2 → a2 =
dw2
For the case of zz (which is real), in order to get the sl(2, R) commutation
relations to come out right (in particular, the Poisson bracket {z 2 , z 2 } = 4izz),
we must take the symmetric combination
1 1 d 1
zz → (aa† + a† a) = a† a + = w +
2 2 dw 2
(which of course is just the standard Hamiltonian for the quantum harmonic
oscillator).
Multiplying as usual by −i one can now define an extension of the Bargmann-
Fock representation to an sl(2, C) representation by taking
i i 1
Γ0BF (X+ ) = − a2 , Γ0BF (X− ) = − (a† )2 , Γ0BF (Z) = −i (a† a + aa† )
2 2 2
One can check that we have made the right choice of Γ0BF (Z) to get an sl(2, C)
representation by computing
i i 1
[Γ0BF (X+ ), Γ0BF (X− )] =[− a2 , − (a† )2 ] = − (aa† + a† a)
2 2 2
= − iΓ0BF (Z) = Γ0BF ([X+ , X− ])
As a representation of the real sub-Lie algebra sl(2, R) of sl(2, C), one has
(using the fact that G, E + F, E − F is a real basis of sl(2, R)):
Note that one can explicitly see from these expressions that this is a unitary
representation, since all the operators are skew-adjoint (using the fact that a
and a† are each other’s adjoints).
This representation Γ0BF will be unitarily equivalent to the Schrödinger ver-
sion Γ0S found earlier when quantizing q 2 , p2 , pq as operators on H = L2 (R).
It is however much easier to work with since it can be studied as the state
251
space of the quantum harmonic oscillator, with the Lie algebra acting simply
by quadratic expressions in the annihilation and creation operators.
One thing that can now easily be seen is that this representation Γ0BF does
not integrate to give a representation of the group SL(2, R). If the Lie algebra
representation Γ0BF comes from a Lie group representation ΓBF of SL(2, R), we
have
0
ΓBF (eθZ ) = eθΓBF (Z)
where
1 1
Γ0BF (Z) = −i(a† a + ) = −i(N + )
2 2
so
1
ΓBF (eθZ )|ni = e−iθ(n+ 2 ) |ni
which has its origin in the physical phenomenon that the energy of the lowest
energy eigenstate |0i is 21 rather than 0, so not an integer.
This is precisely the same sort of problem we found when studying the
spinor representation of the Lie algebra so(3). Just as in that case, the problem
indicates that we need to consider not the group SL(2, R), but a double cover,
the metaplectic group M p(2, R). The behavior here is quite a bit more subtle
than in the Spin(3) double cover case, where Spin(3) was just the group SU (2),
and topologically the only non-trivial cover of SO(3) was the Spin(3) one since
π1 (SO(3)) = Z2 . Here one has π1 (SL(2, Z)) = Z, and each extra time one goes
around the U (1) subgroup we are looking at one gets a topologically different
non-contractible loop in the group. As a result, SL(2, R) has lots of non-trivial
covering groups, of which only one interests us, the double cover M p(2, R). In
^R), but that plays
particular, there is an infinite-sheeted universal cover SL(2,
no role here.
252
Another aspect of the metaplectic representation that is relatively easy to
see in the Bargmann-Fock construction is that the state space F is not an
irreducible representation, but is the sum of two irreducible representations
F = Feven ⊕ Fodd
where Feven consists of the even functions, Fodd of odd functions. On the sub-
space F f in ⊂ F of finite sums of the number eigenstates, these are just the
even and odd degree polynomials. Since the generators of the Lie algebra rep-
resentation are degree two combinations of annihilation and creation operators,
they will take even functions to even functions and odd to odd. The separate
irreducibility of these two pieces is due to the fact that (when n = m(2)), one
can get from state |ni to any another |mi by repeated application of the Lie
algebra representation operators.
253
which implies β = −γ and α = δ. The elements of SL(2, R) that we want will
be of the form
α β
−β α
with unit determinant, so α2 + β 2 = 1. This is the U (1) = SO(2) subgroup of
SL(2, R) of matrices of the form
cos θ sin θ
= eθZ
− sin θ cos θ
i
Γ0BF (zz) = Γ0BF (Z) = − (aa† + a† a)
2
and the quantized analogs of 22.2 are
i i
[− (aa† + a† a), a† ] = −ia† , [− (aa† + a† a), a] = ia (22.3)
2 2
which satisfy
254
Note that, using equation 5.1 one has
d i
(Ug a† Ug−1 )|θ=0 = [− (aa† + a† a), a† ]
dθ 2
so equation 22.3 is just the derivative at the identity of equation 22.4. Also note
that
We see that, on operators, conjugation by the action of the U (1) subgroup
of SL(2, R) does not mix creation and annihilation operators. On the distin-
guished state |0i, Ug acts as the phase transformation
1
Ug |0i = e−i 2 θ |0i
Besides 22.2, there are also Poisson bracket relations corresponding to in-
finitesimal sl(2, R) transformations that do not preserve the complex structure.
They are
For α > 0 these are “squeezing” transformations which expand vectors in the q
direction in M, and contract them in the p direction. One can think of these
as a change of basis in M, with the complex structure in the new basis
α −α
e2α
e 0 0 −1 e 0 0
Jα = gα J0 gα−1 = =
0 e−α 1 0 0 eα e−2α 0
gα = eαG
and we find
0 † 2
i
) +a2 )
ΓBF (gα ) = ΓBF (eαG ) = eαΓBF (G) = e−α 2 ((a
by 22.1.
255
The ΓBF (gα ) are unitary operators (since exponentials of skew-adjoint oper-
ators) on the Fock space F that take |0i to a different state |0iα , not proportional
to |0i. This state |0iα can be used to characterize the complex structure Jα ,
as the one corresponding to the change of variables that takes the annihilation
operator a to the linear combination of annihilation and creation operators that
annihilates |0iα . One can generalize this for arbitrary positive compatible com-
plex structures, and get a map that take τ in the upper half-plane to vectors in
F (modulo scalar multiplication of the vectors). This map turns out to be quite
useful in algebraic geometry, providing an embedding of the upper half-plane
in complex projective space (the same holds true for d > 1 for the Siegel upper
half-space).
256
−
• The choice of the decomposition M ⊗ C = M+ J ⊕ MJ . After quantization
this determines which operators are linear combinations of annihilation
operators and which are linear combinations of creation operators.
• The choice of normal-ordering prescription. SL(2, R) transformations
that mix annihilation and creation operators change the definition of the
normal ordering symbol : :.
• The choice (up to a scalar factor) of a distinguished state |0i ∈ H, the
state annihilated by annihilation operators. Equation 21.7 gives this state
explicitly in the Schrödinger representation, showing how it depends on
the complex structure τ .
The group of such matrices is called SU (1, 1) and is isomorphic to SL(2, R).
257
258
Chapter 23
259
23.1 Complex structures and the Sp(2d, R) ac-
tion on M
We saw in chapter 21 that a generalization of the Bargmann-Fock representation
can be defined for any choice of a positive compatible complex structure J on
M = R2d . J provides a decomposition
−
M ⊗ C = M+
J ⊕ MJ
• A Lie subalgebra with basis elements zj zk (as usual, the Lie bracket is the
Poisson bracket). There are 12 (d2 + d) distinct such basis elements. This is
a commutative Lie subalgebra, since the Poisson bracket of any two basis
elements is zero.
These are the analogs for arbitrary d of the Bogoliubov transformations studied
in the case d = 1, but we will not further discuss their properties in the general
case. The last subalgebra is the one we will mostly be interested in since it turns
out that quantization of elements of this subalgebra produces the operators of
most physical interest.
260
Taking all complex linear combinations, this subalgebra can be identified
with the Lie algebra gl(d, C) of all d by d complex matrices, since if Ejk is the
matrix with 1 at the j-th row and k-th column, zeros elsewhere, one has
[Ejk , Elm ] = Ejm δkl − Elk δjm
and these provide a basis of gl(d, C). Identifying bases by
izj z k ↔ Ejk
gives the isomorphism of Lie algebras. This gl(d, C) is the complexification of
u(d), the Lie algebra of the unitary group U (d). Elements of u(d) will corre-
spond to skew-adjoint matrices so real linear combinations of the real quadratic
functions
zj z k + z j zk , i(zj z k − z j zk )
on M.
The moment map here is
X
A ∈ gl(d, C) → µA = i zj Ajk z k
j,k
and we have
Theorem 23.1. One has the Poisson bracket relation
{µA , µA0 } = µ[A,A0 ]
so the moment map is a Lie algebra homomorphism.
One also has (for column vectors z with components z1 , . . . , zd )
{µA , z} = AT z, {µA , z} = −Az (23.2)
Proof. Using 23.1 one has
X
{µA , µA0 } = − {zj Ajk z k , zl A0lm z m }
j,k,l,m
X
=− Ajk A0lm {zj z k , zl z m }
j,k,l,m
X
=i Ajk A0lm (zj z m δkl − zl z k δjm )
j,k,l,m
X
=i zj [A, A0 ]jk z k = µ[A,A0 ]
j,k
261
and
X X
{µA , z l } ={i zj Ajk z k , z l } = i Ajk {zj , z l }z k
j,k j,k
X
=− Alk z k
k
Note that here we have written formulas for A ∈ gl(d, C), an arbitrary com-
plex d by d matrix. It is only for A ∈ u(d), the skew-adjoint (AT = −A)
matrices, that µA will be a real-valued moment map, lying in the real lie al-
gebra sp(2d, R), and giving a unitary representation on the state space after
quantization. For such A we can write the relations 23.2 as a (complexified)
example of 14.14
T
z A 0 z
{µA , }=
z 0 AT z
lies in this sub-algebra (it is the case A = −i1), and one can show that its
Poisson brackets with the rest of the sub-algebra are zero. It gives a basis
element of the one-dimensional u(1) subalgebra that commutes with the rest of
the u(d) subalgebra.
In section 22.2, for the case d = 1 and J = J0 , we found that there was
a U (1) ⊂ SL(2, R) group acting on M preserving Ω, and commuting with J0 .
−
Complexifying M this U (1) acted separately on M+ J0 and MJ0 , and there was
a moment map taking Z = −J0 to the function µZ = zz on M . Here we have
a U (d) ⊂ Sp(2d, R), again acting on M preserving Ω, and commuting with J,
−
so also acting separately on M+ J and MJ after complexification.
262
23.2 The metaplectic representation and U (d) ⊂
Sp(2d, R)
Turning to the quantization problem, we would like to extend the quantization of
linear functions of zj , z j of chapter 21 to quadratic functions, using annihilation
and creation operators. For any j, k one can take
zj z j → −ia†j aj
The definition of normal ordering generalizes simply, since the order of anni-
hilation and creation operators with different values of j is immaterial. If one
uses this normal-ordered choice, one has shifted the usual quantized operators
of the Bargmann-Fock representation by a scalar 12 for each j, and after expo-
nentiation the state space H = Fd provides a representation of U (d), with no
need for a double cover. As a u(d) representation however, this does not extend
to a representation of sp(2d, R), since commutation of a2j with (a†j )2 can land
one on the unshifted operators.
We saw above that the infinitesimal action of u(d) ⊂ sp(2d, R) preserves
the decomposition of M ⊗ C = M+ ⊕ M− , and this will be true after ex-
ponentiating for U (d) ⊂ Sp(2d, R). We won’t show this here, but U (d) is
the maximal subgroup that preserves this decomposition. The analog of the
d = 1 parametrization of possible distinguished states |0i by SL(2, R)/U (1)
here would be a parametrization of such states (or, equivalently, of possible
choices of J) by the space Sp(2d, R)/U (d), the Siegel upper half-space.
Since the normal-ordering doesn’t change the commutation relations obeyed
by products of the form a†j ak , one can quantize the quadratic expression for
263
µA and get quadratic combinations of the aj , a†k with the same commutation
relations as in theorem 23.1. Letting
X †
UA0 = aj Ajk ak
j,k
we have
Theorem 23.2. For A ∈ gl(d, C) a d by d complex matrix one has
So
A ∈ gl(d, C) → UA0
is a Lie algebra representation of gl(d, C) on H = C[w1 , . . . , wd ], the harmonic
oscillator state space in d degrees of freedom.
One also has (for column vectors a with components a1 , . . . , ad )
These satisfy
T T
UeA a† (UeA )−1 = eA a† , UeA a(UeA )−1 = eA a (23.4)
(the relations 24.1 are the derivative of these). This shows that the UeA are
intertwining operators for a U (d) action on annihilation and creation operators
that preserves the canonical commutation relations (the relations that say the
aj , a†j give a representation of the complexified Heisenberg Lie algebra). Here
the use of normal-ordered operators means that UA0 is a representation of u(d)
that differs by a constant from the metaplectic representation, and UeA differs
by a phase-factor. This does not affect the commutation relations with UA0 or
the conjugation action of UeA . The representation one gets this way differs
in two ways from the metaplectic representation. It acts on the same space
H = Fd , but it is a true representation of U (d), no double-cover is needed. It
also does not extend to a representation of the larger group Sp(2d, R).
The operators UA0 and UeA commute with the Hamiltonian operator. From
the physics point of view, this is useful, as it provides a decomposition of en-
ergy eigenstates into irreducible representations of U (d). From the mathematics
point of view, the quantum harmonic oscillator state space provides a construc-
tion of a large class of irreducible representations of U (d) (the energy eigenstates
of a given energy).
264
23.3 Examples in d = 2 and 3
23.3.1 Two degrees of freedom and SU (2)
In the case d = 2, the group U (2) ⊂ Sp(4, R) preserving the complex structure
J0 commutes with the standard harmonic oscillator Hamiltonian and so acts
as symmetries on the quantum state space, preserving energy eigenspaces. Re-
stricting to the subgroup SU (2) ⊂ U (2), we can recover our earlier construction
of SU (2) representations in terms of homogeneous polynomials, in a new con-
text. This use of the energy eigenstates of a two-dimensional harmonic oscillator
appears in the physics literature as the “Schwinger boson method” for studying
representations of SU (2).
The state space for the d = 2 Bargmann-Fock representation, restricting to
finite linear combinations of energy eigenstates, is
H = F2f in = C[w1 , w2 ]
the polynomials in two complex variables w1 , w2 . Recall from our SU (2) dis-
cussion that it was useful to organize these polynomials into finite dimensional
sets of homogeneous polynomials of degree n for n = 0, 1, 2, . . .
H = H0 ⊕ H1 ⊕ H2 ⊕ · · ·
Our original dual phase space was M = R4 , with a group Sp(4, R) acting on
it, preserving the antisymmetric bilinear form Ω. When picking the coordinates
z1 , z2 , we made a standard choice of complex structure J0 on M. Complexifying,
we have
−
M ⊗ C = M+ 2
J0 ⊕ MJ0 = C ⊕ C
2
−
where z1 , z2 are coordinates on M+
J0 , z 1 , z 2 are coordinates on MJ0 . This choice
of J = J0 picks out a distinguished subgroup U (2) ⊂ Sp(4, R).
The quadratic combinations of the creation and annihilation operators give
representations on H of three subalgebras of the complexification sp(4, C) of
sp(4, R):
• A three dimensional commutative Lie sub-algebra spanned by z1 z2 , z12 , z22 ,
with quantization
Γ0BF (z1 z2 ) = −ia†1 a†2 , Γ0BF (z12 ) = −i(a†1 )2 , Γ0BF (z22 ) = −i(a†2 )2
265
• A three dimensional commutative Lie sub-algebra spanned by z 1 z 2 , z 21 , z 22 ,
with quantization
z1 z 1 , z2 z 2 , z1 z 2 , z 1 z2
and quantization
i i
Γ0BF (z1 z 1 ) = − (a†1 a1 + a1 a†1 ), Γ0BF (z2 z 2 ) = − (a†2 a2 + a2 a†2 )
2 2
z1 z 1 , z2 z 2 , z1 z 2 + z2 z 1 , i(z1 z 2 − z2 z 1 )
span the Lie algebra u(2) ⊂ sp(4, R), and Γ0BF applied to these gives a
unitary Lie algebra representation by skew-adjoint operators.
Inside this last subalgebra, there is a distinguished element h = z1 z 1 + z2 z 2
that Poisson-commutes with the rest of the subalgebra (but not with elements
in the first two subalgebras). Quantization of h gives the Hamiltonian operator
1 1 1 ∂ ∂
H= (a1 a†1 + a†1 a1 + a2 a†2 + a†2 a2 ) = N1 + + N2 + = w1 + w2 +1
2 2 2 ∂w1 ∂w2
This operator will just multiply a homogeneous polynomial by its degree plus
one, so it acts just by multiplication by n + 1 on Hn . Exponentiating this
operator (multiplied by −i) one gets a representation of a U (1) subgroup of the
metaplectic cover M p(4, R). Taking instead the normal-ordered version
∂ ∂
:H: = a†1 a1 + a†2 a2 = N1 + N2 = w1 + w2
∂w1 ∂w2
one gets a representation of a U (1) subroup of Sp(4, R). Neither H nor :H:
commutes with operators coming from quantization of the first two subalgebras.
These change the eigenvalue of H or :H: by ±2 so take
Hn → Hn±2
1 i 1
X1 ↔ (z1 z 2 + z2 z 1 ), X2 ↔ (z2 z 1 − z1 z 2 ), X3 ↔ (z1 z 1 − z2 z 2 )
2 2 2
266
This relates two different but isomorphic ways of describing su(2): as 2 by 2
matrices with Lie bracket the commutator, or as quadratic polynomials, with
Lie bracket the Poisson bracket.
Quantizing using normal-ordering of operators give a representation of su(2)
on H
i 1
Γ0 (X1 ) = − (a†1 a2 + a†2 a1 ), Γ0 (X2 ) = (a†2 a1 − a†1 a2 )
2 2
i
Γ0 (X3 ) = − (a†1 a1 − a†2 a2 )
2
Comparing this to the representation π 0 of su(2) on homogeneous polynomials
discussed in chapter 8, one finds that they are isomorphic, although they act on
dual spaces, so Γ0 (X) = π 0 (−X T ) for all X ∈ su(2).
We see that, up to this change from a vector space to its dual, and the
normal-ordering (which only affects the u(1) factor, shifting the Hamiltonian by
a constant), the Bargmann-Fock representation on polynomials and the SU (2)
representation on homogeneous polynomials are identical. The inner product
that makes the representation unitary is the one of equation 8.2. The Bargmann-
Fock representation extends this SU (2) representation as a unitary representa-
tion to a much larger group (H5 o M p(4, R)), with all polynomials in w1 , w2
now making up a single irreducible representation.
The fact that we have an SU (2) group acting on the state space of the d = 2
harmonic oscillator and commuting with the action of the Hamiltonian H means
that energy eigenstates can be organized as irreducible representations of SU (2).
In particular, one sees that the space Hn of energy eigenstates of energy n + 1
will be a single irreducible SU (2) representation, the spin n2 representation of
dimension n + 1 (so n + 1 will be the multiplicity of energy eigenstates of that
energy).
Another physically interesting subgroup here is the SO(2) ⊂ SU (2) ⊂
Sp(4, R) consisting of simultaneous rotations in the position and momentum
planes, which was studied in detail using the coordinates q1 , q2 , p1 , p2 in section
18.2.2. There we found that the moment map was given by
µL = l = q1 p2 − q2 p1
Note that this is a different SO(2) action than the one with moment map the
Hamiltonian, it acts separately on positions and momenta rather than mixing
them.
To see what happens if one instead uses the Bargmann-Fock representation,
note that
1 1
qj = √ (zj + z j ), pj = i √ (zj − z j )
2 2
267
so the moment map is
i
µL = ((z1 + z 1 )(z2 − z 2 ) − (z2 + z 2 )(z1 − z 1 ))
2
=i(z2 z 1 − z1 z 2 )
Quantizing, Bargmann-Fock gives a unitary representation of so(2)
UL0 = a†2 a1 − a†1 a2
which is Γ0 (2X2 ). The factor of two here reflects the fact that exponentiation
gives a representation of SO(2) ⊂ Sp(4, R), with no need for a double cover.
268
to calculate
3
X
−iLj = Ul0j = a†m (lj )mn an
m,n=1
This gives
269
270
Chapter 24
In this chapter we’ll introduce a new quantum system by using a simple varia-
tion on techniques we used to study the harmonic oscillator, that of replacing
commutators by anticommutators. This variant of the harmonic oscillator will
be called a “fermionic oscillator”, with the original sometimes called a “bosonic
oscillator”. The terminology of “boson” and “fermion” refers to the principle
enunciated in chapter 9 that multiple identical particles are described by tensor
product states that are either symmetric (bosons) or antisymmetric (fermions).
The bosonic and fermionic oscillator systems are single-particle systems, de-
scribing the energy states of a single particle, so the usage of the bosonic/fermion-
ic terminology is not obviously relevant. In later chapters we will study quantum
field theories, which can be treated as infinite-dimensional oscillator systems.
In that context, multiple particle states will automatically be symmetric or an-
tisymmetric, depending on whether the field theory is treated as a bosonic or
fermionic oscillator system, thus justifying the terminology.
d
X 1
H= (Q2j + Pj2 )
j=1
2
1 1
aj = √ (Qj + iPj ), a†j = √ (Qj − iPj )
2 2
271
that satisfy the so-called canonical commutation relations (CCR)
The simple change in the harmonic oscillator problem that takes one from
bosons to fermions is the replacement of the bosonic annihilation and creation
operators (which we’ll now denote aB and aB † ) by fermionic annihilation and
creation operators called aF and aF † , and replacement of the commutator
[A, B] ≡ AB − BA
[A, B]+ ≡ AB + BA
with the last two relations implying that a2F = 0 and (a†F )2 = 0
The fermionic number operator
NF = a†F aF
now satisfies
2
NF2 = a†F aF a†F aF = a†F (1 − a†F aF )aF = NF − a†F a2F = NF
2
(using the fact that a2F = a†F = 0). So one has
NF2 − NF = NF (NF − 1) = 0
which implies that the eigenvalues of NF are just 0 and 1. We’ll denote eigen-
vectors with such eigenvalues by |0i and |1i. The simplest representation of the
operators aF and a†F on a complex vector space HF will be on C2 , and choosing
the basis
0 1
|0i = , |1i =
1 0
the operators are represented as
0 0 † 0 1 1 0
aF = , aF = , NF =
1 0 0 0 0 0
Since
1 †
H= (a aF + aF a†F )
2 F
is just 21 the identity operator, to get a non-trivial quantum system, instead we
make a sign change and set
1
1 1 0
H = (a†F aF − aF a†F ) = NF − 1 = 2
2 2 0 − 21
272
The energies of the energy eigenstates |0i and |1i will then be ± 12 since
1 1
H|0i = − |0i, H|1i = |1i
2 2
Note that the quantum system we have constructed here is nothing but our
old friend the two-state system of chapter 3. Taking complex linear combinations
of the operators
aF , a†F , NF , 1
we get all linear transformations of HF = C2 (so this is an irreducible represen-
tation of the algebra of these operators). The relation to the Pauli matrices is
just
1 1 1
a†F = (σ1 + iσ2 ), aF = (σ1 − iσ2 ), H = σ3
2 2 2
aF j , aF †j , j = 1, . . . , d
is said to satisfy the canonical anticommutation relations (CAR) when one has
In this case one may choose as the state space the tensor product of N copies
of the single fermionic oscillator state space
HF = (C2 )⊗d = C2 ⊗ C2 ⊗ · · · ⊗ C2
| {z }
d times
273
are satisfied for j 6= k since in these cases one will get in the tensor product
factors
0 0 0 1
[σ3 , ] = 0 or [σ3 , ] =0
1 0 + 0 0 +
While this sort of tensor product construction is useful for discussing the
physics of multiple qubits, in general it is easier to not work with large tensor
products, and the Clifford algebra formalism we will describe in chapter 25
avoids this.
The number operators will be
NF j = aF †j aF j
These will commute with each other, so can be simultaneously diagonalized,
with eigenvalues nj = 0, 1. One can take as an orthonormal basis of HF the 2d
states
|n1 , n2 , · · · , nd i
As an example, for the case d = 3 the pattern of states and their energy
levels for the bosonic and fermionic cases looks like this
274
In the bosonic case the lowest energy state is at positive energy and there are
an infinite number of states of ever increasing energy. In the fermionic case the
lowest energy state is at negative energy, with the pattern of energy eigenvalues
of the finite number of states symmetric about the zero energy level.
Just as in the bosonic case, we can consider quadratic combinations of cre-
ation and annihilation operators of the form
X †
UA0 = aF j Ajk aF k
j,k
and we have
Theorem 24.1. For A ∈ gl(d, C) a d by d complex matrix one has
[UA0 , UA0 0 ] = U[A,A0 ]
So
A ∈ gl(d, C) → UA0
is a Lie algebra representation of gl(d, C) on HF
One also has (for column vectors aF with components aF 1 , . . . , aF d )
[UA0 , a†F ] = AT a†F , [UA0 , aF ] = −AaF (24.1)
Proof. The proof is similar to that of 23.1, except besides the relation
[AB, C] = A[B, C] + [A, B]C
we also use the relation
[AB, C] = A[B, C]+ − [A, B]+ C
For example
[UA0 , aF †l ] = [aF †j Ajk aF k , aF †l ]
X
j,k
aF †j Ajk [aF k , aF †l ]+
X
=
j,k
aF †j Ajl
X
=
j
The Hamiltonian is X 1
H= (NF j − 1)
j
2
which (up to the constant 12 that doesn’t contribute to commutation relations) is
just UB0 for the case B = 1. Since this commutes with all other d by d matrices,
we have
[H, UA0 ] = 0
for all A ∈ gl(d, C), so these are symmetries and we have a representation of
the Lie algebra gl(d, C) on each energy eigenspace. Only for A ∈ u(d) (A a
skew-adjoint matrix) will the representation turn out to be unitary.
275
24.3 For further reading
Most quantum field theory books and a few quantum mechanics books contain
some sort of discussion of the fermionic oscillator, see for example Chapter
21.3 of [57] or Chapter 5 of [10]. The standard discussion often starts with
considering a form of classical analog using anticommuting “fermionic” variables
and then quantization to get the fermionic oscillator. Here we are doing things
in the opposite order, starting in this chapter with the quantized oscillator, then
considering the classical analog in a later chapter.
276
Chapter 25
277
any element of this algebra can be written as a sum of elements in normal order,
of the form
cl,m (a†B )l am
B
with all annihilation operators aB on the right, for some complex constants cl,m .
As a vector space over C, Weyl(2, C) is infinite-dimensional, with a basis
∂
a†B = w, aB =
∂w
one sees that Weyl(2, C) can be identified with the algebra of polynomial coeffi-
cient differential operators on functions of a complex variable w. As a complex
vector space, the algebra is infinite dimensional, with a basis of elements
∂m
wl
∂wm
In our study of quantization and the harmonic oscillator we saw that the
subset of such operators consisting of complex linear combinations of
∂ ∂2 ∂
1, w, , w2 , , w
∂w ∂w2 ∂w
is closed under commutators, so it forms a Lie algebra of complex dimension 6.
This Lie algebra includes as subalgebras the Heisenberg Lie algebra h3 ⊗ C (first
three elements) and the Lie algebra sl(2, C) = sl(2, R)⊗C (last three elements).
Note that here we are allowing complex linear combinations, so we are getting
the complexification of the real six-dimensional Lie algebra that appeared in
our study of quantization.
Since the aB and a†B are defined in terms of P and Q, one could of course
also define the Weyl algebra as the one generated by 1, P, Q, with the Heisenberg
commutation relations, taking complex linear combinations of all products of
these operators.
278
This algebra is a four dimensional algebra over C, with basis
1, aF , a†F , a†F aF
since higher powers of the operators vanish, and one can use the anticommu-
tation relation betwee aF and a†F to normal order and put factors of aF on
the right. We saw in the last chapter that this algebra is isomorphic with the
algebra M (2, C) of 2 by 2 complex matrices, using
1 0 0 0 † 0 1 † 1 0
1↔ , aF ↔ , aF ↔ , a F aF ↔
0 1 1 0 0 0 0 0
We will see later on that there is also a way of identifying this algebra with
“differential operators in fermionic variables”, analogous to what happens in
the bosonic (Weyl algebra) case.
Recall that the bosonic annihilation and creation operators were originally
defined in term of the P and Q operators by
1 1
aB = √ (Q + iP ), a†B = √ (Q − iP )
2 2
Looking for the fermionic analogs of the operators Q and P , we use a slightly
different normalization, and set
1 1
aF = (γ1 + iγ2 ), a†F = (γ1 − iγ2 )
2 2
so
1
γ1 = aF + a†F , γ2 =(aF − a†F )
i
and the CAR imply that the operators γj satisfy the anticommutation relations
[γj , γk ]+ = 2δjk
• Using just the generators $1$ and $\gamma_1$, one gets an algebra Cliff(1, C), generated by $1, \gamma_1$, with the relation
$$\gamma_1^2=1$$
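A quick numerical sketch (an illustration assuming Python with numpy, not part of the text) of the claim that the CAR imply the Clifford relations: build $\gamma_1,\gamma_2$ from the 2 by 2 matrices for $a_F, a_F^\dagger$ given above and check $[\gamma_j,\gamma_k]_+=2\delta_{jk}$:

```python
import numpy as np

# 2x2 matrix realization of a_F, a_F^dagger from the identification above
aF = np.array([[0, 0], [1, 0]], dtype=complex)
aFd = np.array([[0, 1], [0, 0]], dtype=complex)

gamma1 = aF + aFd
gamma2 = (aF - aFd) / 1j
anti = lambda x, y: x @ y + y @ x

for j, gj in enumerate([gamma1, gamma2]):
    for k, gk in enumerate([gamma1, gamma2]):
        # the CAR for a_F, a_F^dagger imply [gamma_j, gamma_k]_+ = 2 delta_{jk}
        assert np.allclose(anti(gj, gk), 2 * (j == k) * np.eye(2))
print("Clifford relations verified")
```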
25.1.3 Multiple degrees of freedom
For a larger number of degrees of freedom, one can generalize the above and
define Weyl and Clifford algebras as follows:
Definition (Complex Weyl algebras). The complex Weyl algebra for d degrees
of freedom is the algebra Weyl(2d, C) generated by the elements 1, aB j , aB †j ,
j = 1, . . . , d satisfying the CCR
$$1,\ w_j,\ \frac{\partial}{\partial w_j},\ w_jw_k,\ \frac{\partial^2}{\partial w_j\partial w_k},\ w_j\frac{\partial}{\partial w_k}$$
or, alternatively, one has the following more general definition that also works
in the odd-dimensional case
We won’t try and prove this here, but one can show that, abstractly as
algebras, the complex Clifford algebras are something well-known. Generalizing
the case d = 1 where we saw that Cliff(2, C) was isomorphic to the algebra of 2
by 2 complex matrices, one has isomorphisms
$$\text{Cliff}(2d,\mathbf{C})\leftrightarrow M(2^d,\mathbf{C})$$
in the even-dimensional case, and in the odd-dimensional case
$$1,\ \gamma_j,\ \gamma_j\gamma_k,\ \gamma_j\gamma_k\gamma_l,\ \dots,\ \gamma_1\gamma_2\gamma_3\cdots\gamma_{n-1}\gamma_n$$
$$(1+\gamma_1)(1+\gamma_2)\cdots(1+\gamma_n)$$
which will have $2^n$ terms that are exactly those of the basis listed above.
For reasons that will be explained in the next chapter, it turns out that a
more general definition is useful. We write the number of variables as n = r + s,
for r, s non-negative integers, and now vary not just r + s, but also r − s, the
so-called “signature”.
Definition (Real Clifford algebras, arbitrary signature). The real Clifford al-
gebra in n = r + s variables is the algebra Cliff(r, s, R) over the real numbers
generated by 1, γj for j = 1, 2, . . . , n satisfying the relations
[γj , γk ]+ = ±2δjk 1
In other words, as in the complex case different γj anticommute, but only
the first r of them satisfy γj2 = 1, with the other s of them satisfying γj2 = −1.
Working out some of the low-dimensional examples, one finds:
• Cliff(0, 1, R). This has generators 1 and γ1 , satisfying
γ12 = −1
Taking real linear combinations of these two generators, the algebra one
gets is just the algebra $\mathbf{C}$ of complex numbers, with $\gamma_1$ playing the role of
$i=\sqrt{-1}$.
• Cliff(0, 2, R). This has generators 1, γ1 , γ2 and a basis
1, γ1 , γ2 , γ1 γ2
with
This four-dimensional algebra over the real numbers can be identified with
the algebra H of quaternions by taking
γ1 ↔ i, γ2 ↔ j, γ1 γ2 ↔ k
Note that one can construct this using the aF , a†F for the complex case
Cliff(2, C)
γ1 = aF + a†F , γ2 = aF − a†F
since these are represented as real matrices.
• Cliff(3, 0, R). This is the algebra M (2, C) of complex 2 by 2 matrices,
with one possible identification using Pauli matrices given by
$$1\leftrightarrow\begin{pmatrix}1&0\\0&1\end{pmatrix}$$
$$\gamma_1\leftrightarrow\sigma_1=\begin{pmatrix}0&1\\1&0\end{pmatrix},\quad \gamma_2\leftrightarrow\sigma_2=\begin{pmatrix}0&-i\\i&0\end{pmatrix},\quad \gamma_3\leftrightarrow\sigma_3=\begin{pmatrix}1&0\\0&-1\end{pmatrix}$$
$$\gamma_1\gamma_2\leftrightarrow i\sigma_3=\begin{pmatrix}i&0\\0&-i\end{pmatrix},\quad \gamma_2\gamma_3\leftrightarrow i\sigma_1=\begin{pmatrix}0&i\\i&0\end{pmatrix},\quad \gamma_1\gamma_3\leftrightarrow -i\sigma_2=\begin{pmatrix}0&-1\\1&0\end{pmatrix}$$
$$\gamma_1\gamma_2\gamma_3\leftrightarrow\begin{pmatrix}i&0\\0&i\end{pmatrix}$$
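The Pauli matrix identification can be checked directly; the following minimal sketch (assuming Python with numpy, an illustration only) verifies the Cliff(3, 0, R) relations and the value of the top element $\gamma_1\gamma_2\gamma_3$:

```python
import numpy as np

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
gammas = [s1, s2, s3]
anti = lambda x, y: x @ y + y @ x

# generators of Cliff(3,0,R): they anticommute and square to +1
for j in range(3):
    for k in range(3):
        assert np.allclose(anti(gammas[j], gammas[k]), 2 * (j == k) * np.eye(2))

# the top element gamma1 gamma2 gamma3 is i times the identity
print(np.allclose(s1 @ s2 @ s3, 1j * np.eye(2)))   # True
```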
It turns out that Cliff(r, s, R) is always one or two copies of matrices of real,
complex or quaternionic elements, of dimension a power of 2, but this requires
a rather intricate algebraic argument that we will not enter into here. For the
details of this and the resulting pattern of algebras one gets, see for instance
[38]. One special case where the pattern is relatively simple is when one has
r = s. Then n = 2r is even-dimensional and one finds
$$\text{Cliff}(r,r,\mathbf{R})=M(2^r,\mathbf{R})$$
We will see in the next chapter that just as quadratic elements of the Weyl
algebra give a basis of the Lie algebra of the symplectic group, quadratic ele-
ments of the Clifford algebra give a basis of the Lie algebra of the orthogonal
group.
Chapter 26
The definitions given in last chapter of Weyl and Clifford algebras were purely
algebraic, based on a choice of generators. These definitions do though have a
more geometrical formulation, with the definition in terms of generators corre-
sponding to a specific choice of coordinates. For the Weyl algebra, the geometry
involved is known as symplectic geometry, and we have already seen that in the
bosonic case quantization of a phase space R2d depends on the choice of a non-
degenerate antisymmetric bilinear form Ω which determines the Poisson brack-
ets and thus the Heisenberg commutation relations. Such a Ω also determines a
group Sp(2d, R), which is the group of linear transformations of R2d preserving
Ω. The Clifford algebra also has a coordinate invariant definition, based on a
more well-known structure on a vector space Rn , that of a non-degenerate sym-
metric bilinear form, i.e. an inner product. In this case the group that preserves
the inner product is an orthogonal group. In the symplectic case antisymmetric
forms require an even number of dimensions, but this is not true for symmetric
forms, which also exist in odd dimensions.
$$\Omega(u,u')=c_{q_1}c'_{p_1}-c_{p_1}c'_{q_1}+\cdots+c_{q_d}c'_{p_d}-c_{p_d}c'_{q_d}$$
$$=\begin{pmatrix}c_{q_1}&c_{p_1}&\cdots&c_{q_d}&c_{p_d}\end{pmatrix}\begin{pmatrix}0&1&\dots&0&0\\-1&0&\dots&0&0\\\vdots&\vdots&\ddots&\vdots&\vdots\\0&0&\dots&0&1\\0&0&\dots&-1&0\end{pmatrix}\begin{pmatrix}c'_{q_1}\\c'_{p_1}\\\vdots\\c'_{q_d}\\c'_{p_d}\end{pmatrix}$$
Matrices $g\in M(2d,\mathbf{R})$ such that
$$g^T\begin{pmatrix}0&1&\dots&0&0\\-1&0&\dots&0&0\\\vdots&\vdots&\ddots&\vdots&\vdots\\0&0&\dots&0&1\\0&0&\dots&-1&0\end{pmatrix}g=\begin{pmatrix}0&1&\dots&0&0\\-1&0&\dots&0&0\\\vdots&\vdots&\ddots&\vdots&\vdots\\0&0&\dots&0&1\\0&0&\dots&-1&0\end{pmatrix}$$
make up the group $Sp(2d,\mathbf{R})$ and preserve $\Omega$, satisfying
$$\Omega(gu,gu')=\Omega(u,u')$$
This choice of Ω is much less arbitrary than it looks. One can show that
given any non-degenerate antisymmetric bilinear form on R2d a basis can be
found with respect to which it will be the Ω given here (for a proof, see [7]).
This is also true if one complexifies, taking (q, p) ∈ C2d and using the same
formula for Ω, which is now a bilinear form on C2d . In the real case the group
that preserves Ω is called Sp(2d, R), in the complex case Sp(2d, C).
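As an illustration of the defining condition $g^T\Omega g=\Omega$ (a sketch assuming Python with numpy and scipy, not part of the text), one can generate an element of $Sp(2d,\mathbf{R})$ by exponentiating a matrix $X$ with $\Omega X$ symmetric and check that it preserves $\Omega$:

```python
import numpy as np
from scipy.linalg import expm

d = 2
# the standard antisymmetric form in the (q1, p1, ..., qd, pd) ordering used above
Omega = np.kron(np.eye(d), np.array([[0, 1], [-1, 0]]))

# exp(X) is symplectic whenever X^T Omega + Omega X = 0, i.e. Omega X is symmetric
M = np.random.randn(2 * d, 2 * d)
S = (M + M.T) / 2
X = np.linalg.solve(Omega, S)                 # Omega X = S symmetric => X in sp(2d, R)
g = expm(X)
print(np.allclose(g.T @ Omega @ g, Omega))    # True: g lies in Sp(2d, R)
```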
To get a fermionic analog of this, it turns out that all we need to do is replace
“non-degenerate antisymmetric bilinear form Ω(·, ·)” with “non-degenerate sym-
metric bilinear form h·, ·i”. Such a symmetric bilinear form is actually something
much more familiar from geometry than the antisymmetric case analog: it is
just a notion of inner product. Two things are different in the symmetric case:
• The underlying vector space does not have to be even dimensional, one
can take M = Rn for any n, including n odd. To get a detailed analog of
the bosonic case though, we will mostly consider the even case n = 2d.
• For a given dimension n, there is not just one possible choice of h·, ·i up to
change of basis, but one possible choice for each pair of integers r, s such
that r + s = n. Given r, s, any choice of h·, ·i can be put in the form
$$\langle u,u'\rangle=u_1u'_1+u_2u'_2+\cdots+u_ru'_r-u_{r+1}u'_{r+1}-\cdots-u_nu'_n$$
$$=\begin{pmatrix}u_1&\dots&u_n\end{pmatrix}\underbrace{\begin{pmatrix}1&0&\dots&0&0\\0&1&\dots&0&0\\\vdots&\vdots&\ddots&\vdots&\vdots\\0&0&\dots&-1&0\\0&0&\dots&0&-1\end{pmatrix}}_{r\ +\text{ signs},\ s\ -\text{ signs}}\begin{pmatrix}u'_1\\u'_2\\\vdots\\u'_{n-1}\\u'_n\end{pmatrix}$$
For a proof by Gram-Schmidt orthogonalization, see [7].
We can thus extend our definition of the orthogonal group as the group of
transformations g preserving an inner product
hgu, gu0 i = hu, u0 i
to the case r, s arbitrary by
Definition (Orthogonal group O(r, s, R)). The group O(r, s, R) is the group of
real r + s by r + s matrices g that satisfy
$$g^T\underbrace{\begin{pmatrix}1&0&\dots&0&0\\0&1&\dots&0&0\\\vdots&\vdots&\ddots&\vdots&\vdots\\0&0&\dots&-1&0\\0&0&\dots&0&-1\end{pmatrix}}_{r\ +\text{ signs},\ s\ -\text{ signs}}g=\underbrace{\begin{pmatrix}1&0&\dots&0&0\\0&1&\dots&0&0\\\vdots&\vdots&\ddots&\vdots&\vdots\\0&0&\dots&-1&0\\0&0&\dots&0&-1\end{pmatrix}}_{r\ +\text{ signs},\ s\ -\text{ signs}}$$
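For example (a small numerical sketch assuming Python with numpy, purely illustrative), a hyperbolic "boost" preserves the signature $(1,1)$ form and so lies in $O(1,1,\mathbf{R})$:

```python
import numpy as np

# signature (1,1): eta = diag(+1, -1); a boost preserves this indefinite form
eta = np.diag([1.0, -1.0])
t = 0.7
g = np.array([[np.cosh(t), np.sinh(t)],
              [np.sinh(t), np.cosh(t)]])
print(np.allclose(g.T @ eta @ g, eta))   # True: g is in O(1, 1, R)
```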
where $\langle\cdot,\cdot\rangle$ is the symmetric bilinear form on $\mathbf{R}^n$ corresponding to the standard
inner product of vectors. Note that taking $v = w$ one has
$$\slashed{v}^2=\langle v,v\rangle=||v||^2$$
and, more generally,
$$\slashed{v}\slashed{w}+\slashed{w}\slashed{v}=2\langle v,w\rangle$$
The opposite sign convention
$$\slashed{v}\slashed{w}+\slashed{w}\slashed{v}=-2\langle v,w\rangle$$
is another common choice. One also sees variants without the factor of 2.
For n dimensional vector spaces over C, we have seen that for any non-
degenerate symmetric bilinear form a basis can be found such that h·, ·i has the
standard form
hz, wi = z1 w1 + z2 w2 + · · · + zn wn
As a result, there is just one complex Clifford algebra in dimension n, the one
we defined as Cliff(n, C).
For n dimensional vector spaces over R with a non-degenerate symmetric
bilinear forms of type r, s such that r+s = n, the corresponding Clifford algebras
Cliff(r, s, R) are the ones defined in terms of generators in the last chapter.
In special relativity, space-time is a real 4-dimensional vector space with an
indefinite inner product corresponding to (depending on one’s choice of conven-
tion) either the case r = 1, s = 3 or the case s = 1, r = 3. The group of linear
transformations preserving this inner product is called the Lorentz group, and
its orientation preserving component is written as SO(3, 1) or SO(1, 3) depend-
ing on the choice of convention. In later chapters we will consider what happens
to quantum mechanics in the relativistic case, and there encounter the corre-
sponding Clifford algebras Cliff(3, 1, R) or Cliff(1, 3, R). The generators γj of
such a Clifford algebra are well-known in the subject as the “Dirac γ- matrices”.
For now though, we will restrict attention to the positive definite case, so
just will be considering Cliff(n, R) and seeing how it is used to study the group
O(n) of n-dimensional rotations in Rn .
26.2.1 Rotations as iterated orthogonal reflections
We’ll consider two different ways of seeing the relationship between the Clifford
algebra Cliff(n, R) and the group O(n) of rotations in Rn . The first is based
upon the geometrical fact (known as the Cartan-Dieudonné theorem) that one
can get any rotation by doing multiple orthogonal reflections in different hy-
perplanes. Orthogonal reflection in the hyperplane perpendicular to a vector w
takes a vector v to the vector
$$v'=v-2\frac{\langle v,w\rangle}{\langle w,w\rangle}w$$
something that can easily be seen from the following picture
Since $\slashed{w}^2=\langle w,w\rangle$, the inverse of $\slashed{w}$ is
$$\slashed{w}^{-1}=\frac{\slashed{w}}{\langle w,w\rangle}$$
and the reflection transformation is just conjugation by $\slashed{w}$ times a minus sign
$$\slashed{v}\rightarrow\slashed{v}'=\slashed{v}-(\slashed{v}\slashed{w}+\slashed{w}\slashed{v})\slashed{w}^{-1}=\slashed{v}-\slashed{v}\slashed{w}\slashed{w}^{-1}-\slashed{w}\slashed{v}\slashed{w}^{-1}=-\slashed{w}\slashed{v}\slashed{w}^{-1}$$
So, thinking of vectors as lying in the Clifford algebra, the orthogonal trans-
formation that is the result of one reflection is just a conjugation (with a minus
sign). These lie in the group O(n), but not in the subgroup SO(n), since they
change orientation. The result of two reflections in hyperplanes orthogonal to
$w_1, w_2$ will be a conjugation by $\slashed{w}_2\slashed{w}_1$
$$\slashed{v}\rightarrow\slashed{v}'=-\slashed{w}_2(-\slashed{w}_1\slashed{v}\slashed{w}_1^{-1})\slashed{w}_2^{-1}=(\slashed{w}_2\slashed{w}_1)\slashed{v}(\slashed{w}_2\slashed{w}_1)^{-1}$$
Iterating, the composition of such reflections in the hyperplanes orthogonal to
$w_1, w_2, \dots, w_k$ is a conjugation by the product $\slashed{w}_1\slashed{w}_2\cdots\slashed{w}_k$
$$\slashed{v}\rightarrow(\slashed{w}_1\slashed{w}_2\cdots\slashed{w}_k)\slashed{v}(\slashed{w}_1\slashed{w}_2\cdots\slashed{w}_k)^{-1}$$
and (for $k$ even) this will correspond to a rotation of the vector $v$. This construction gen-
eralizes to arbitrary $n$ the one we gave in chapter 6 of Spin(3) in terms of unit
length elements of the quaternion algebra $\mathbf{H}$. One can see here the characteristic
fact that there are two elements of the Spin(n) group giving the same rotation
in SO(n) by noticing that changing the sign of the Clifford algebra element
$\slashed{w}_1\slashed{w}_2\cdots\slashed{w}_k$ does not change the conjugation action, where signs cancel.
In the SO(3) case we saw that there were three of these matrices
providing a basis of the Lie algebra so(3). In $n$ dimensions there will be $\frac{1}{2}(n^2-n)$
of them, providing a basis of the Lie algebra so(n).
Just as in the case of SO(3) where unit length quaternions were used, we can
use elements of the Clifford algebra to get these same rotation transformations,
but as conjugations in the Clifford algebra. To see how this works, consider the
quadratic Clifford algebra element $\gamma_j\gamma_k$ for $j\neq k$ and notice that
$$(\gamma_j\gamma_k)^2=\gamma_j\gamma_k\gamma_j\gamma_k=-\gamma_j\gamma_j\gamma_k\gamma_k=-1$$
so one has
$$e^{\frac{\theta}{2}\gamma_j\gamma_k}=\left(1-\frac{(\theta/2)^2}{2!}+\cdots\right)+\gamma_j\gamma_k\left(\frac{\theta}{2}-\frac{(\theta/2)^3}{3!}+\cdots\right)=\cos\left(\frac{\theta}{2}\right)+\gamma_j\gamma_k\sin\left(\frac{\theta}{2}\right)$$
Conjugating a vector $v_j\gamma_j+v_k\gamma_k$ in the $j-k$ plane by this, one can show
that
$$e^{-\frac{\theta}{2}\gamma_j\gamma_k}(v_j\gamma_j+v_k\gamma_k)e^{\frac{\theta}{2}\gamma_j\gamma_k}=(v_j\cos\theta-v_k\sin\theta)\gamma_j+(v_j\sin\theta+v_k\cos\theta)\gamma_k$$
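This conjugation formula can be verified numerically; the following sketch (assuming Python with numpy and scipy, an illustration only) takes $\gamma_j=\sigma_1$, $\gamma_k=\sigma_2$ and uses a matrix exponential:

```python
import numpy as np
from scipy.linalg import expm

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)

theta, vj, vk = 0.8, 1.3, -0.4
g = expm(theta / 2 * s1 @ s2)               # exp((theta/2) gamma_j gamma_k)
v_slash = vj * s1 + vk * s2

rotated = np.linalg.inv(g) @ v_slash @ g    # conjugation as in the formula above
expected = (vj * np.cos(theta) - vk * np.sin(theta)) * s1 + \
           (vj * np.sin(theta) + vk * np.cos(theta)) * s2
print(np.allclose(rotated, expected))       # True
```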
of the Lie algebra sp(2n, R). This is the Lie algebra of the group Sp(2n, R),
the group preserving the non-degenerate antisymmetric bilinear form Ω(·, ·) on
the phase space R2n . The fermionic case is precisely analogous, with the role of
the antisymmetric bilinear form Ω(·, ·) replaced by the symmetric bilinear form
h·, ·i and the Lie algebra sp(2n, R) replaced by so(n) = spin(n).
In the bosonic case the linear functions of the Qj , Pj satisfied the commuta-
tion relations of another Lie algebra, the Heisenberg algebra, but in the fermionic
case this is not true for the γj . In chapter 27 we will see that one can define a
notion of a “Lie superalgebra” that restores the parallelism.
Chapter 27
Anticommuting Variables
and Pseudo-classical
Mechanics
The analogy between the algebras of operators in the bosonic (Weyl algebra) and
fermionic (Clifford algebra) cases can be extended by introducing a fermionic
analog of phase space and the Poisson bracket. This gives a fermionic ana-
log of classical mechanics, sometimes called “pseudo-classical mechanics”, the
quantization of which gives the Clifford algebra as operators, and spinors as
state spaces. In this chapter we'll introduce "anticommuting variables" $\xi_j$ that
will be the fermionic analogs of the variables qj , pj . These objects will become
generators of the Clifford algebra under quantization, and will later be used in
the construction of fermionic state spaces, by analogy with the Schrödinger and
Bargmann-Fock constructions in the bosonic case.
Definition (Grassmann algebra). The algebra over the real numbers generated
by ξj , j = 1, . . . , n, satisfying the relations
ξj ξk + ξk ξj = 0
Note that these relations imply that generators satisfy ξj2 = 0. Also note that
sometimes the product in the Grassmann algebra is called the “wedge product”
and the product of ξj and ξk is denoted ξj ∧ ξk . We will not use a different
symbol for the product in the Grassmann algebra, relying on the notation for
generators to keep straight what is a generator of a conventional polynomial
algebra (e.g. $q_j$ or $p_j$) and what is a generator of a Grassmann algebra (e.g. $\xi_j$).
Recall from section 9.5 that the algebra of polynomial functions on V could
be thought of as S ∗ (V ∗ ), the symmetric part of the tensor algebra on V ∗ , with
multiplication of two linear functions $l_1, l_2$ in $V^*$ given by
$$l_1l_2=\frac{1}{2}(l_1\otimes l_2+l_2\otimes l_1)\in S^2(V^*)$$
This corresponds to a polynomial on $V$ by taking its value at the point $v\in V$
to be
$$l_1l_2(v)=\frac{1}{2}(l_1\otimes l_2+l_2\otimes l_1)(v\otimes v)$$
Similarly, the Grassmann algebra on $V$ is the antisymmetric part of the tensor
algebra on $V^*$, with the wedge product of two linear functions $l_1, l_2$ in $V^*$ given
by
$$l_1\wedge l_2=\frac{1}{2}(l_1\otimes l_2-l_2\otimes l_1)\in\Lambda^2(V^*)$$
If one tries to think of these as functions on $V$, one cannot ask for their value
at a point $v$, since
$$\frac{1}{2}(l_1\otimes l_2-l_2\otimes l_1)(v\otimes v)=0$$
The Grassmann algebra behaves in many ways like the polynomial algebra
on $\mathbf{R}^n$, but it is finite dimensional, with basis
$$1,\ \xi_j,\ \xi_j\xi_k,\ \xi_j\xi_k\xi_l,\ \dots,\ \xi_1\xi_2\cdots\xi_n$$
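For readers who like to experiment, here is a minimal computational model of the Grassmann algebra (a Python sketch; the representation of monomials as sorted index tuples is an implementation choice, not notation from the text), which reproduces the anticommutation of generators and the $2^n$-dimensional basis count:

```python
from itertools import combinations

# An element is a dict mapping a tuple of generator indices (a basis monomial)
# to its coefficient.
def gmul(f, g):
    """Product in the Grassmann algebra, with the sign from anticommuting generators."""
    out = {}
    for mf, cf in f.items():
        for mg, cg in g.items():
            if set(mf) & set(mg):          # repeated generator => the square is zero
                continue
            merged = list(mf) + list(mg)
            sign = 1                       # parity of the permutation sorting the indices
            for i in range(len(merged)):
                for j in range(i + 1, len(merged)):
                    if merged[i] > merged[j]:
                        sign = -sign
            key = tuple(sorted(merged))
            out[key] = out.get(key, 0) + sign * cf * cg
    return {k: v for k, v in out.items() if v != 0}

xi = lambda j: {(j,): 1}                   # the generator xi_j
print(gmul(xi(1), xi(2)))                  # {(1, 2): 1}
print(gmul(xi(2), xi(1)))                  # {(1, 2): -1}   xi_2 xi_1 = -xi_1 xi_2
print(gmul(xi(1), xi(1)))                  # {}             xi_1^2 = 0
# dimension 2^n: count the basis monomials on n generators
n = 4
print(sum(len(list(combinations(range(n), k))) for k in range(n + 1)))   # 16
```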
space Tx M and its dual space (Tx M )∗ . A set of local coordinates xj on M gives
basis elements of (Tx M )∗ denoted by dxj and differential forms locally can be
written as sums of terms of the form
F (ξ) = c0 + c1 ξ
F (ξ1 , ξ2 , . . . , ξn ) = FA + ξj FB
where FA , FB are functions that do not depend on the chosen ξj (one gets FB
by using the anticommutation relations to move ξj all the way to the left). Then
one can define
$$\frac{\partial}{\partial\xi_j}F=F_B$$
This derivative operator has many of the same properties as the conventional
derivative, although there are unconventional signs one must keep track of. An
unusual property of this derivative that is easy to see is that one has
$$\frac{\partial}{\partial\xi_j}\frac{\partial}{\partial\xi_j}=0$$
Taking the derivative of a product one finds this version of the Leibniz rule
for monomials F and G
$$\frac{\partial}{\partial\xi_j}(FG)=\left(\frac{\partial}{\partial\xi_j}F\right)G+(-1)^{|F|}F\left(\frac{\partial}{\partial\xi_j}G\right)$$
A notion of integration (often called the “Berezin integral”) with many of the
usual properties of an integral can also be defined. It has the peculiar feature
of being the same operation as differentiation, defined in the n = 1 case by
$$\int(c_0+c_1\xi)d\xi=c_1$$
In particular, if
$$\frac{\partial}{\partial\xi_j}F=G$$
then
$$\int\frac{\partial}{\partial\xi_j}F\,d\xi_j=\frac{\partial}{\partial\xi_j}\frac{\partial}{\partial\xi_j}F=\frac{\partial}{\partial\xi_j}G=0$$
using the fact that repeated derivatives give zero.
$$\frac{d}{dt}f=\{f,h\}$$
for some Hamiltonian function h. This says that taking the derivative of any
function in the direction of the velocity vector of a classical trajectory is the
linear map
f → {f, h}
on functions. As we saw in chapter 12, since this linear map is a derivative, the
Poisson bracket will have the derivation property, satisfying the Leibniz rule
for arbitrary functions f1 , f2 , f3 on phase space. Using the Leibniz rule and an-
tisymmetry, one can calculate Poisson brackets for any polynomials, just from
knowing the Poisson bracket on generators qj , pj (or, equivalently, the antisym-
metric bilinear form Ω(·, ·)), which we chose to be
where |F2 | and |F3 | are the degrees of F2 and F3 . It will also have the symmetry
property
{F1 , F2 }+ = −(−1)|F1 ||F2 | {F2 , F1 }+
and one can use these properties to compute the fermionic Poisson bracket for
arbitrary functions in terms of the relations for generators.
One can think of the ξj as the “anti-commuting coordinate functions” with
respect to a basis ei of V = Rn . We have seen that the symmetric bilinear
forms on Rn are classified by a choice of positive signs for some basis vectors,
negative signs for the others. So, on generators ξj one can choose
{ξj , ξk }+ = ±δjk
and
The second of these equations shows that the quadratic combinations of the
generators ξj satisfy the relations of the Lie algebra of the group of rotations in
n dimensions (so(n) = spin(n)). The first shows that the ξk ξl acts on the ξj as
infinitesimal rotations in the k − l plane.
In the case of the conventional Poisson bracket, the antisymmetry of the
bracket and the fact that it satisfies the Jacobi identity implies that it is a
Lie bracket determining a Lie algebra (the infinite dimensional Lie algebra of
functions on a phase space R2d ). The fermionic Poisson bracket provides an
example of something called a Lie superalgebra. These can be defined for vector
spaces with some usual and some fermionic coordinates:
Definition (Lie superalgebra). A Lie superalgebra structure on a real or com-
plex vector space V is given by a Lie superbracket [·, ·]± . This is a bilinear map
on V which on generators X, Y, Z (which may be usual coordinates or fermionic
ones) satisfies
[X, Y ]± = −(−1)|X||Y | [Y, X]±
and a super-Jacobi identity
where |X| takes value 0 for a usual generator, 1 for a fermionic generator.
Analogously to the bosonic case, on polynomials in generators with order of
the polynomial less than or equal to two, the fermionic Poisson bracket $\{\cdot,\cdot\}_+$ is a
Lie superbracket, giving a Lie superalgebra of dimension $1+n+\frac{1}{2}(n^2-n)$ (since
there is one constant, $n$ linear terms $\xi_j$ and $\frac{1}{2}(n^2-n)$ quadratic terms $\xi_j\xi_k$).
On functions of order two this Lie superalgebra is a Lie algebra, so(n). We will
see in chapter 28 that one can generalize the definition of a representation to
Lie superalgebras, and quantization will give a distinguished representation of
this Lie superalgebra, in a manner quite parallel to that of the Schrödinger or
Bargmann-Fock constructions of a representation in the bosonic case.
The relation between the quadratic and linear polynomials in the
generators is parallel to what happens in the bosonic case. Here we have the
fermionic analog of the bosonic theorem 14.1:
Theorem 27.1. The Lie algebra $so(n,\mathbf{R})$ is isomorphic to the Lie algebra
$\Lambda^2(V^*)$ (with Lie bracket $\{\cdot,\cdot\}_+$) of order two anticommuting polynomials on
$V=\mathbf{R}^n$, by the isomorphism
$$L\leftrightarrow\mu_L$$
where $L\in so(n,\mathbf{R})$ is an antisymmetric $n$ by $n$ real matrix, and
$$\mu_L=\frac{1}{2}\xi\cdot L\xi=\frac{1}{2}\sum_{j,k}L_{jk}\xi_j\xi_k$$
or
$$\{\mu_L,\xi\}_+=L^T\xi$$
Proof. The theorem follows from equations 27.1 and 27.2, or one can proceed
by analogy with the proof of theorem 14.1 as follows. First prove the second
part of the theorem by computing
$$\left\{\frac{1}{2}\sum_{j,k}\xi_jL_{jk}\xi_k,\xi_l\right\}_+=\frac{1}{2}\sum_{j,k}L_{jk}\left(\xi_j\{\xi_k,\xi_l\}_+-\{\xi_j,\xi_l\}_+\xi_k\right)$$
$$=\frac{1}{2}\left(\sum_j L_{jl}\xi_j-\sum_k L_{lk}\xi_k\right)$$
$$=\sum_j L_{jl}\xi_j\qquad(\text{since }L=-L^T)$$
$$L\rightarrow\mu_L$$
$$\xi\rightarrow\{\mu_L,\xi\}_+$$
of $\mu_L\in so(n,\mathbf{R})$ on an arbitrary
$$\xi=\sum_j c_j\xi_j$$
and uses the super-Jacobi identity relating the fermionic Poisson brackets of
$\mu_L,\mu_{L'},\xi$.
$$\frac{d}{dt}F=\{F,h\}_+$$
27.3.1 The pseudo-classical spin degree of freedom
Using pseudo-classical mechanics, one can find a “classical” analog of something
that is quintessentially quantum: the degree of freedom that appears in the
qubit or spin 1/2 system that we have seen repeatedly in this course. Taking
V = R3 with the standard inner product as fermionic phase space, we have
three generators ξ1 , ξ2 , ξ3 ∈ V ∗ satisfying the relations
{ξj , ξk }+ = δjk
1, ξ1 , ξ2 , ξ3 , ξ1 ξ2 , ξ1 ξ3 , ξ2 ξ3 , ξ1 ξ2 ξ3
$$\frac{d}{dt}\xi_j(t)=\{\xi_j,h\}_+=-\{h,\xi_j\}_+$$
$$\frac{d}{dt}\xi_j(t)=L\xi_j(t)$$
with solution
$$\xi_j(t)=e^{tL}\xi_j(0)$$
This will be a time-dependent rotation of the ξj in the plane perpendicular to
27.3.2 The pseudo-classical fermionic oscillator
We have already studied the fermionic oscillator as a quantum system, and one
can ask whether there is a corresponding pseudo-classical system. Such a system
is given by taking an even dimensional fermionic phase space V = R2d , with a
basis of coordinate functions ξ1 , · · · , ξ2d that generate Λ∗ (R2d ). On generators
the fermionic Poisson bracket relations come from the standard choice of positive
definite symmetric bilinear form
{ξj , ξk }+ = δjk
As in the bosonic case, we can make the standard choice of complex structure
J = J0 on R2d and get a decomposition
V ∗ ⊗ C = R2d ⊗ C = Cd ⊕ Cd
and, similarly,
$$\{h,\overline{\theta}_j\}_+=i\overline{\theta}_j$$
so one sees that $h$ is just the generator of $U(1)\subset U(d)$ phase rotations on the
variables $\theta_j$. The equations of motion are
$$\frac{d}{dt}\theta_j=\{\theta_j,h\}_+=i\theta_j,\qquad \frac{d}{dt}\overline{\theta}_j=\{\overline{\theta}_j,h\}_+=-i\overline{\theta}_j$$
with solutions
$$\theta_j(t)=e^{it}\theta_j(0),\qquad \overline{\theta}_j(t)=e^{-it}\overline{\theta}_j(0)$$
Chapter 28
In this chapter we’ll begin by investigating the fermionic analog of the notion
of quantization, which takes functions of anticommuting variables on a phase
space with symmetric bilinear form h·, ·i and gives an algebra of operators with
generators satisfying the relations of the corresponding Clifford algebra. We
will then consider analogs of the constructions used in the bosonic case which
there gave us the Schrödinger and Bargmann-Fock representations of the Weyl
algebra on a space of states.
We know that for a fermionic oscillator with d degrees of freedom, the alge-
bra of operators will be Cliff(2d, C), the algebra generated by annihilation and
creation operators aF j , aF †j . These operators will act on HF , a complex vector
space of dimension $2^d$, and this will be our fermionic analog of the bosonic $\Gamma_0$.
Since the spin group consists of invertible elements of the Clifford algebra, it
has a representation on HF . This is known as the “spinor representation”, and
it can be constructed by analogy with the construction of the metaplectic in
the bosonic case. We’ll also consider the analog in the fermionic case of the
Schrödinger representation, which turns out to have a problem with unitarity,
but finds a use in physics as “ghost” degrees of freedom.
Poisson bracket, which on basis elements is
and taking values in a Lie superalgebra of linear operators, with |Φ(X)| = |X|
and
$$[\Phi(X),\Phi(Y)]_\pm=\Phi(X)\Phi(Y)-(-1)^{|X||Y|}\Phi(Y)\Phi(X)$$
A representation of the pseudo-classical Lie superalgebra (and thus a quan-
tization of the pseudo-classical system) will be given by finding a linear map Γ+
that takes basis elements ξj to operators Γ+ (ξj ) satisfying the anticommutation
relations
[Γ+ (ξj ), Γ+ (ξk )]+ = ±δjk Γ+ (1), [Γ+ (ξj ), Γ+ (1)] = [Γ+ (1), Γ+ (1)] = 0
by using the Clifford algebra relations, or by noting that this is the special case
of equation 26.2 for $v=e_l$. That equation shows that commutation by $-\frac{1}{2}\gamma_j\gamma_k$
acts by the infinitesimal rotation $\epsilon_{jk}$ in the $j-k$ coordinate plane.
For 27.2, one can again just use the Clifford algebra relations to show
$$\left[\frac{1}{2}\gamma_j\gamma_k,\frac{1}{2}\gamma_l\gamma_m\right]=\frac{1}{2}\delta_{kl}\gamma_j\gamma_m-\frac{1}{2}\delta_{jl}\gamma_k\gamma_m+\frac{1}{2}\delta_{km}\gamma_l\gamma_j-\frac{1}{2}\delta_{jm}\gamma_l\gamma_k$$
One could also instead use the commutation relations for the $so(n)$ Lie algebra
satisfied by the basis elements $\epsilon_{jk}$ corresponding to infinitesimal rotations. One
must get identical commutation relations for the $-\frac{1}{2}\gamma_j\gamma_k$ and can show that
these are the relations needed for commutators of $\Gamma^+(\xi_j\xi_k)$ and $\Gamma^+(\xi_l\xi_m)$.
Note that here we are not introducing the factors of i into the definition of
quantization that in the bosonic case were necessary to get a unitary represen-
tation of the Lie group corresponding to the real Heisenberg Lie algebra h2d+1 .
In the bosonic case we worked with all complex linear combinations of powers
of the Qj , Pj (the complex Weyl algebra Weyl(2d, C)), and thus had to identify
the specific complex linear combinations of these that gave unitary represen-
tations of the Lie algebra $\mathfrak{h}_{2d+1}\rtimes sp(2d,\mathbf{R})$. Here we are not complexifying
for now, but working with the real Clifford algebra Cliff(r, s, R), and it is the
irreducible representations of this algebra that provide an analog of the unique
interesting irreducible representation of h2d+1 . In the Clifford algebra case, the
representations of interest are not Lie algebra representations and may be on
real vector spaces. There is no analog of the unitarity property of the h2d+1
representation.
In the bosonic case we found that Sp(2d, R) acted on the bosonic dual phase
space, preserving the antisymmetric bilinear form Ω that determined the Lie al-
gebra h2d+1 , so it acted on this Lie algebra by automorphisms. We saw (see
chapter 18) that intertwining operators there gave us a representation of the
double cover of Sp(2d, R) (the metaplectic representation), with the Lie alge-
bra representation given by the quantization of quadratic functions of the qj , pj
phase space coordinates. There is a closely analogous story in the fermionic
case, where SO(r, s, R) acts on the fermionic phase space V , preserving the
symmetric bilinear form h·, ·i that determines the Clifford algebra relations.
Here one constructs a representation of the spin group Spin(r, s, R) double
covering SO(r, s, R) using intertwining operators, with the Lie algebra repre-
sentation given by quadratic combinations of the quantizations of the fermionic
coordinates ξj .
The fermionic analog of 18.1 is
$$[U'_L,\Gamma^+(\xi)]=\Gamma^+(L\cdot\xi)$$
where L ∈ so(r, s, R) and L acts on V ∗ as an infinitesimal orthogonal transfor-
mation. In terms of basis vectors of $V^*$
$$\xi=\begin{pmatrix}\xi_1\\\vdots\\\xi_n\end{pmatrix}$$
this says
$$[U'_L,\Gamma^+(\xi)]=\Gamma^+(L^T\xi)$$
Just as in the bosonic case, the $U'_L$ can be found by looking first at the
pseudo-classical case, where one has theorem 27.1 which says
$$\{\mu_L,\xi\}_+=L^T\xi$$
where
$$\mu_L=\frac{1}{2}\xi\cdot L\xi=\frac{1}{2}\sum_{j,k}L_{jk}\xi_j\xi_k$$
For the case $s = 0$ and a rotation in the $j-k$ plane, with $L=\epsilon_{jk}$ one
recovers formulas 26.1 and 26.2 from chapter 26, with
$$\left[-\frac{1}{2}\gamma_j\gamma_k,\gamma(v)\right]=\gamma(\epsilon_{jk}v)$$
the infinitesimal action of a rotation on the $\gamma$ matrices, and
$$\gamma(v)\rightarrow e^{-\frac{\theta}{2}\gamma_j\gamma_k}\gamma(v)e^{\frac{\theta}{2}\gamma_j\gamma_k}=\gamma(e^{\theta\epsilon_{jk}}v)$$
the group version.
the group version. Just as in the symplectic case, exponentiating the UL0 only
gives a representation up to sign, and one needs to go to the double cover of
SO(n) to get a true representation. As in that case, the necessity of the double
cover is best seen by use of a complex structure and an analog of the Bargmann-
Fock construction, a topic we will address in a later section.
In order to have a full construction of a quantization of a pseudo-classical
system, we need not just the abstract Clifford algebra elements given by the
map Γ+ , but also a realization of the Clifford algebra as linear operators on a
state space. As mentioned in chapter 25, it can be shown that the real Clifford
algebras Cliff(r, s, R) are isomorphic to either one or two copies of the matrix
algebras $M(2^l,\mathbf{R})$, $M(2^l,\mathbf{C})$, or $M(2^l,\mathbf{H})$, with the power $l$ depending on $r, s$.
The irreducible representations of such a matrix algebra are just the column
vectors of dimension $2^l$, and there will be either one or two such irreducible
representations for Cliff(r, s, R) depending on the number of copies of the matrix
algebra. This is the fermionic analog of the Stone-von Neumann uniqueness
result in the bosonic case.
28.1.1 Quantization of the pseudo-classical spin
As an example, one can consider the quantization of the pseudo-classical spin
degree of freedom of section 27.3.1. In that case Γ+ takes values in Cliff(3, 0, R),
for which an explicit identification with the algebra M (2, C) of two by two
complex matrices was given in section 25.2. One has
$$\Gamma^+(\xi_j)=\frac{1}{\sqrt{2}}\gamma_j=\frac{1}{\sqrt{2}}\sigma_j$$
and the Hamiltonian operator is
µ=S
space H to be a space of functions on a subspace of the classical phase space
which had the property that the basis coordinate functions Poisson-commuted.
Two examples of this are the position coordinates qj , since {qj , qk } = 0, or the
momentum coordinates pj , since {pj , pk } = 0. Unfortunately, for symmetric
bilinear forms h·, ·i of definite sign, such as the positive definite case Cliff(n, R),
the only subspace the bilinear form is zero on is the zero subspace.
To get an analog of the bosonic situation, one needs to take the case of
signature (d, d). The fermionic phase space will then be 2d dimensional, with
d-dimensional subspaces on which h·, ·i and thus the fermionic Poisson bracket
is zero. Quantization will give the Clifford algebra
$$\text{Cliff}(d,d,\mathbf{R})=M(2^d,\mathbf{R})$$
which has just one irreducible representation, $\mathbf{R}^{2^d}$. One can complexify this to
get a complex state space
$$\mathcal{H}_F=\mathbf{C}^{2^d}$$
This state space will come with a representation of Spin(d, d, R) from expo-
nentiating quadratic combinations of the generators of Cliff(d, d, R). However,
this is a non-compact group, and one can show that on general grounds it can-
not have unitary finite-dimensional representations, so there must be a problem
with unitarity.
To see what happens explicitly, consider the simplest case d = 1 of one degree
of freedom. In the bosonic case the classical phase space is R2 , and quantization
gives operators $Q, P$ which in the Schrödinger representation act on functions
of $q$, with $Q=q$ and $P=-i\frac{\partial}{\partial q}$. In the fermionic case with signature (1, 1),
basis coordinate functions on phase space are ξ1 , ξ2 , with
{ξ1 , ξ1 }+ = 1, {ξ2 , ξ2 }+ = −1, {ξ1 , ξ2 }+ = 0
Defining
$$\eta=\frac{1}{\sqrt{2}}(\xi_1+\xi_2),\qquad \pi=\frac{1}{\sqrt{2}}(\xi_1-\xi_2)$$
we get objects with fermionic Poisson bracket analogous to those of $q$ and $p$
$$\{\eta,\eta\}_+=\{\pi,\pi\}_+=0,\qquad \{\eta,\pi\}_+=1$$
Quantizing, we get analogs of the $Q, P$ operators
$$\hat{\eta}=\Gamma^+(\eta)=\frac{1}{\sqrt{2}}(\Gamma^+(\xi_1)+\Gamma^+(\xi_2)),\qquad \hat{\pi}=\Gamma^+(\pi)=\frac{1}{\sqrt{2}}(\Gamma^+(\xi_1)-\Gamma^+(\xi_2))$$
which satisfy anticommutation relations
$$\hat{\eta}^2=\hat{\pi}^2=0,\qquad \hat{\eta}\hat{\pi}+\hat{\pi}\hat{\eta}=1$$
and can be realized as operators on the space of functions of one fermionic
variable $\eta$ as
$$\hat{\eta}=\text{multiplication by }\eta,\qquad \hat{\pi}=\frac{\partial}{\partial\eta}$$
This state space is two complex dimensional, with an arbitrary state
$$f(\eta)=c_1 1+c_2\eta$$
with $c_j$ complex numbers. The inner product on this space is given by the
fermionic integral
$$(f_1(\eta),f_2(\eta))=\int f_1^*(\eta)f_2(\eta)d\eta$$
with
$$f^*(\eta)=\overline{c}_1 1+\overline{c}_2\eta$$
With respect to this inner product, one has, for example, $(1+\eta,1+\eta)=2$.
This inner product is indefinite and can take on negative values, since
$$(1-\eta,1-\eta)=-2$$
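A small sketch of this two-dimensional state space (assuming Python with numpy, purely illustrative): $\hat{\eta}$ and $\hat{\pi}$ become 2 by 2 matrices in the basis $\{1,\eta\}$, and the Berezin-integral inner product, which works out to $(f,g)=\overline{c}_1d_2+\overline{c}_2d_1$ for $f=c_1+c_2\eta$, $g=d_1+d_2\eta$, is visibly indefinite:

```python
import numpy as np

# states f(eta) = c1 + c2*eta as coefficient pairs (c1, c2)
def inner(f, g):
    return np.conj(f[0]) * g[1] + np.conj(f[1]) * g[0]

# eta_hat = multiplication by eta, pi_hat = d/d eta, in the basis {1, eta}
eta_hat = np.array([[0, 0], [1, 0]], dtype=complex)
pi_hat = np.array([[0, 1], [0, 0]], dtype=complex)

assert np.allclose(eta_hat @ eta_hat, 0) and np.allclose(pi_hat @ pi_hat, 0)
assert np.allclose(eta_hat @ pi_hat + pi_hat @ eta_hat, np.eye(2))

print(inner((1, 1), (1, 1)))     # (1+eta, 1+eta) =  2
print(inner((1, -1), (1, -1)))   # (1-eta, 1-eta) = -2  -> indefinite inner product
```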
To quantize this system we need to find operators $\Gamma^+(\theta_j)$ and $\Gamma^+(\overline{\theta}_j)$ that
satisfy
$$[\Gamma^+(\theta_j),\Gamma^+(\theta_k)]_+=[\Gamma^+(\overline{\theta}_j),\Gamma^+(\overline{\theta}_k)]_+=0$$
$$[\Gamma^+(\theta_j),\Gamma^+(\overline{\theta}_k)]_+=\delta_{jk}\mathbf{1}$$
but these are just the CAR satisfied by fermionic annihilation and creation
operators. We can choose
$$\Gamma^+(\overline{\theta}_j)=a_{Fj}^\dagger,\qquad \Gamma^+(\theta_j)=a_{Fj}$$
$$a_{Fj}=\frac{\partial}{\partial\chi_j},\qquad a_{Fj}^\dagger=\text{multiplication by }\chi_j$$
$$1\leftrightarrow|0\rangle_F$$
$$\chi_j\leftrightarrow a_{Fj}^\dagger|0\rangle_F$$
$$\chi_j\chi_k\leftrightarrow a_{Fj}^\dagger a_{Fk}^\dagger|0\rangle_F$$
$$\cdots$$
$$\chi_1\cdots\chi_d\leftrightarrow a_{F1}^\dagger a_{F2}^\dagger\cdots a_{Fd}^\dagger|0\rangle_F$$
where f1 and f2 are complex linear combinations of the powers of the anticom-
muting variables χj . For the details of the construction of this inner product,
see chapter 7.2 of [64] or chapters 7.5 and 7.6 of [79].
The quantization using fermionic annihilation and creation operators given
here provides an explicit realization of a representation of the Clifford algebra
Cliff(2d, R) on the complex vector space HF . The generators of the Clifford
algebra are identified as operators on HF by
$$\gamma_{2j-1}=\sqrt{2}\Gamma^+(\xi_{2j-1})=\sqrt{2}\Gamma^+\left(\frac{1}{\sqrt{2}}(\theta_j+\overline{\theta}_j)\right)=a_{Fj}+a_{Fj}^\dagger$$
$$\gamma_{2j}=\sqrt{2}\Gamma^+(\xi_{2j})=\sqrt{2}\Gamma^+\left(\frac{i}{\sqrt{2}}(\theta_j-\overline{\theta}_j)\right)=i(a_{Fj}-a_{Fj}^\dagger)$$
Quantization of the pseudo-classical fermionic oscillator Hamiltonian h of
section 27.3.2 gives
$$\Gamma^+(h)=\Gamma^+\left(-\frac{i}{2}\sum_{j=1}^d(\overline{\theta}_j\theta_j-\theta_j\overline{\theta}_j)\right)=-\frac{i}{2}\sum_{j=1}^d(a_{Fj}^\dagger a_{Fj}-a_{Fj}a_{Fj}^\dagger)=-iH\tag{28.2}$$
where H is the Hamiltonian operator for the fermionic oscillator used in chapter
24.
Taking quadratic combinations of the operators γj , γk provides a represen-
tation of the Lie algebra so(2d) = spin(2d). This representation exponentiates
to a representation up to sign of the group SO(2d), and a true representation
of its double-cover Spin(2d). The representation that we have constructed here
on the fermionic oscillator state space HF is called the spinor representation of
Spin(2d), and we will sometimes denote HF with this group action as S.
In the bosonic case, H = Fd is an irreducible representation of the Heisenberg
group, but as a representation of M p(2d, R), it has two irreducible components,
corresponding to even and odd polynomials. The fermionic analog is that HF
is irreducible under the action of the Clifford algebra Cliff(2d, C). One way
to show this is to show that Cliff(2d, C) is isomorphic to the matrix algebra
$M(2^d,\mathbf{C})$ and its action on $\mathcal{H}_F=\mathbf{C}^{2^d}$ is isomorphic to the action of matrices
on column vectors.
While HF is irreducible as a representation of the Clifford algebra, it is the
sum of two irreducible representations of Spin(2d), the so-called “half-spinor”
representations. Spin(2d) is generated by quadratic combinations of the Clifford
algebra generators, so these will preserve the subspaces
S+ = span{|0iF , aF †j aF †k |0iF , · · · } ⊂ S = HF
and
S− = span{aF †j |0iF , aF †j aF †k aF †l |0iF , · · · } ⊂ S = HF
corresponding to the action of an even or odd number of creation operators on
|0iF . This is because quadratic combinations of the aF j , aF †j preserve the parity
of the number of creation operators used to get an element of S by action on
|0iF .
the Bargmann-Fock case of section 21.1. The difference here is that for the
analogous construction of spinors a complex structure J must be chosen to
preserve not an antisymmetric bilinear form Ω, but the inner product, so one
has
hJ(·), J(·)i = h·, ·i
We will here restrict to the case of h·, ·i positive definite, and unlike in the
bosonic case, no additional positivity condition on J will be required.
J splits the complexification of the real dual phase space V ∗ = V = R2d with
its coordinates ξj into a d-dimensional complex vector space and a conjugate
complex vector space. As in the bosonic case one has
V ⊗ C = VJ+ ⊕ VJ−
$$g_0=e^A\in U(d)$$
acting on the fermionic dual phase space preserving J and the inner product, we
can use exactly the same method as in theorems 23.1 and 23.2 to construct its
action on the fermionic state space by the second of the above representations.
For $A$ a skew-adjoint matrix we have a fermionic moment map
$$A\in u(d)\rightarrow \mu_A=\sum_{j,k}\overline{\theta}_jA_{jk}\theta_k$$
satisfying
$$\{\mu_A,\mu_{A'}\}_+=\mu_{[A,A']}$$
and
$$\{\mu_A,\theta\}_+=\overline{A}^T\theta,\qquad \{\mu_A,\overline{\theta}\}_+=A^T\overline{\theta}$$
The Lie algebra representation operators are the
$$U'_A=\sum_{j,k}a_{Fj}^\dagger A_{jk}a_{Fk}$$
which satisfy
$$[U'_A,a_F^\dagger]=A^Ta_F^\dagger,\qquad [U'_A,a_F]=\overline{A}^Ta_F$$
Exponentiating these gives the intertwining operators, which act on the an-
nihilation and creation operators as
$$U_{e^A}a_F^\dagger(U_{e^A})^{-1}=e^{A^T}a_F^\dagger,\qquad U_{e^A}a_F(U_{e^A})^{-1}=e^{\overline{A}^T}a_F$$
For the simplest example, consider the $U(1)\subset U(d)\subset SO(2d)$ that acts by
$$\theta_j\rightarrow e^{-i\phi}\theta_j,\qquad \overline{\theta}_j\rightarrow e^{i\phi}\overline{\theta}_j$$
$$\mu_A=\phi h$$
where
$$h=-i\sum_{j=1}^d\overline{\theta}_j\theta_j$$
is the Hamiltonian for the classical fermionic oscillator.
is the Hamiltonian for the classical fermionic oscillator. Quantizing h (see equa-
tion 28.2) will give the Hamiltonian operator
$$H=\frac{1}{2}\sum_{j=1}^d(a_{Fj}^\dagger a_{Fj}-a_{Fj}a_{Fj}^\dagger)=\sum_{j=1}^d\left(a_{Fj}^\dagger a_{Fj}-\frac{1}{2}\right)$$
will give a true representation of $U(1)\subset U(d)$, with
$$U'_A=-i\phi\sum_{j=1}^d a_{Fj}^\dagger a_{Fj}$$
satisfying
$$[U'_A,a_F^\dagger]=-i\phi a_F^\dagger,\qquad [U'_A,a_F]=i\phi a_F$$
Exponentiating, the action on annihilation and creation operators is
$$e^{-i\phi\sum_{j=1}^d a_{Fj}^\dagger a_{Fj}}\,a_F^\dagger\,e^{i\phi\sum_{j=1}^d a_{Fj}^\dagger a_{Fj}}=e^{-i\phi}a_F^\dagger$$
$$e^{-i\phi\sum_{j=1}^d a_{Fj}^\dagger a_{Fj}}\,a_F\,e^{i\phi\sum_{j=1}^d a_{Fj}^\dagger a_{Fj}}=e^{i\phi}a_F$$
where we have written the matrices in 2 by 2 block form, and are indexing the
four dimensions from 0 to 3. One can easily check that these satisfy the Clifford
algebra relations: they anticommute with each other and
The quadratic Clifford algebra elements $-\frac{1}{2}\gamma_j\gamma_k$ for $j < k$ satisfy the com-
mutation relations of $so(4)=spin(4)$. These are explicitly
$$-\frac{1}{2}\gamma_0\gamma_1=-\frac{i}{2}\begin{pmatrix}\sigma_1&0\\0&-\sigma_1\end{pmatrix},\qquad -\frac{1}{2}\gamma_2\gamma_3=-\frac{i}{2}\begin{pmatrix}\sigma_1&0\\0&\sigma_1\end{pmatrix}$$
$$-\frac{1}{2}\gamma_0\gamma_2=-\frac{i}{2}\begin{pmatrix}\sigma_2&0\\0&-\sigma_2\end{pmatrix},\qquad -\frac{1}{2}\gamma_1\gamma_3=-\frac{i}{2}\begin{pmatrix}\sigma_2&0\\0&\sigma_2\end{pmatrix}$$
$$-\frac{1}{2}\gamma_0\gamma_3=-\frac{i}{2}\begin{pmatrix}\sigma_3&0\\0&-\sigma_3\end{pmatrix},\qquad -\frac{1}{2}\gamma_1\gamma_2=-\frac{i}{2}\begin{pmatrix}\sigma_3&0\\0&\sigma_3\end{pmatrix}$$
The Lie algebra spin representation is just matrix multiplication on $S=\mathbf{C}^4$,
and it is obviously a reducible representation on two copies of $\mathbf{C}^2$ (the upper
and lower two components). One can also see that the Lie algebra $spin(4)=su(2)+su(2)$, with the two $su(2)$ Lie algebras having bases
$$-\frac{1}{4}(\gamma_0\gamma_1+\gamma_2\gamma_3),\quad -\frac{1}{4}(\gamma_0\gamma_2+\gamma_1\gamma_3),\quad -\frac{1}{4}(\gamma_0\gamma_3+\gamma_1\gamma_2)$$
and
$$-\frac{1}{4}(\gamma_0\gamma_1-\gamma_2\gamma_3),\quad -\frac{1}{4}(\gamma_0\gamma_2-\gamma_1\gamma_3),\quad -\frac{1}{4}(\gamma_0\gamma_3-\gamma_1\gamma_2)$$
The irreducible spin representations of Spin(4) are just the spin one-half repre-
sentations of the two copies of SU (2).
In the fermionic oscillator construction, we have
$$S=S^++S^-,\qquad S^+=\text{span}\{1,\eta_1\eta_2\},\qquad S^-=\text{span}\{\eta_1,\eta_2\}$$
and the Clifford algebra action on $S$ is given for the generators as (indexing
dimensions from 1 to 4)
$$\gamma_1=\frac{\partial}{\partial\eta_1}+\eta_1,\qquad \gamma_2=i\left(\frac{\partial}{\partial\eta_1}-\eta_1\right)$$
$$\gamma_3=\frac{\partial}{\partial\eta_2}+\eta_2,\qquad \gamma_4=i\left(\frac{\partial}{\partial\eta_2}-\eta_2\right)$$
Note that in this construction there is a choice of complex structure J = J0 .
This gives a distinguished vector |0i = 1 ∈ S + , as well as a distinguished sub-Lie
algebra u(2) ⊂ so(4) of transformations that act trivially on |0i, given by linear
combinations of
$$\eta_1\frac{\partial}{\partial\eta_1},\qquad \eta_2\frac{\partial}{\partial\eta_2},\qquad \eta_1\frac{\partial}{\partial\eta_2},\qquad \eta_2\frac{\partial}{\partial\eta_1}$$
There is also a distinguished sub-Lie algebra $u(1)\subset u(2)$ given by
$$\eta_1\frac{\partial}{\partial\eta_1}+\eta_2\frac{\partial}{\partial\eta_2}$$
Bogoliubov transformations that are unitary transformations on the spinor
state space, but change |0i and correspond to a change in complex structure,
are given by exponentiating the Lie algebra representation operators
$$i(a_{F1}^\dagger a_{F2}^\dagger+a_{F2}a_{F1}),\qquad a_{F1}^\dagger a_{F2}^\dagger-a_{F2}a_{F1}$$
In chapter 38 we will consider explicit matrix representations of the Clifford
algebra for the case of Spin(3, 1). One could also use the fermionic oscillator
construction, complexifying to get a representation of
Chapter 29
A Summary: Parallels
Between Bosonic and
Fermionic Quantization
Bosonic | Fermionic
Stone-von Neumann: uniqueness of the $\mathfrak{h}_{2d+1}$ representation | Uniqueness of the Cliff(2d, C) representation
$Mp(2d,\mathbf{R})$ double-cover of $Sp(2d,\mathbf{R})$ | $Spin(n)$ double-cover of $SO(n)$
$J: J^2=-1$, $\Omega(Ju,Jv)=\Omega(u,v)$ | $J: J^2=-1$, $\langle Ju,Jv\rangle=\langle u,v\rangle$
$\mathcal{M}\otimes\mathbf{C}=\mathcal{M}_J^+\oplus\mathcal{M}_J^-$ | $V\otimes\mathbf{C}=V_J^+\oplus V_J^-$
$U(d)\subset Sp(2d,\mathbf{R})$ commutes with $J$ | $U(d)\subset SO(2d,\mathbf{R})$ commutes with $J$
Compatible $J\in Sp(2d,\mathbf{R})/U(d)$ | Compatible $J\in O(2d)/U(d)$
$a_j, a_j^\dagger$ satisfying CCR | $a_{Fj}, a_{Fj}^\dagger$ satisfying CAR
$a_j|0\rangle=0$, $|0\rangle$ depends on $J$ | $a_{Fj}|0\rangle=0$, $|0\rangle$ depends on $J$
Bogoliubov transformations generated by symmetric quadratics in $a_j^\dagger, a_k^\dagger$ | Bogoliubov transformations generated by antisymmetric quadratics in $a_{Fj}^\dagger, a_{Fk}^\dagger$
Chapter 30
Supersymmetry, Some
Simple Examples
If one considers fermionic and bosonic quantum system that each separately
have operators coming from Lie algebra or superalgebra representations on their
state spaces, when one combines the systems by taking the tensor product, these
operators will continue to act on the combined system. In certain special cases
new operators with remarkable properties will appear that mix the fermionic
and bosonic systems and commute with the Hamiltonian (often by giving some
sort of “square root” of the Hamiltonian). These are generically known as
“supersymmetries” and provide new information about energy eigenspaces. In
this chapter we’ll examine in detail some of the simplest such quantum systems,
examples of "supersymmetric quantum mechanics".
$$H=\frac{1}{2}\sum_{j=1}^d\hbar\omega(a_{Bj}^\dagger a_{Bj}+a_{Bj}a_{Bj}^\dagger)=\sum_{j=1}^d\left(N_{Bj}+\frac{1}{2}\right)\hbar\omega$$
where NB j is the number operator for the j’th degree of freedom, with
eigenvalues nB j = 0, 1, 2, · · · .
$|0\rangle_F$. The Hamiltonian is
$$H=\frac{1}{2}\sum_{j=1}^d\hbar\omega(a_{Fj}^\dagger a_{Fj}-a_{Fj}a_{Fj}^\dagger)=\sum_{j=1}^d\left(N_{Fj}-\frac{1}{2}\right)\hbar\omega$$
where NF j is the number operator for the j’th degree of freedom, with
eigenvalues nF j = 0, 1.
Putting these two systems together we get a new quantum system with state
space
H = HB ⊗ HF
and Hamiltonian
d
X
H= (NB j + NF j )~ω
j=1
Notice that the lowest energy state |0i for the combined system has energy 0,
due to cancellation between the bosonic and fermionic degrees of freedom.
For now, taking for simplicity the case d = 1 of one degree of freedom, the
Hamiltonian is
H = (NB + NF )~ω
Notice that while there is a unique lowest energy state |0, 0i of zero energy, all
non-zero energy states come in pairs, with two states
|n, 0i and |n − 1, 1i
$$Q_+=a_Ba_F^\dagger,\qquad Q_-=a_B^\dagger a_F$$
which are not self adjoint, but are each other’s adjoints ((Q− )† = Q+ ).
The pattern of energy eigenstates looks like this
Computing anticommutators using the CCR and CAR for the bosonic and
fermionic operators (and the fact that the bosonic operators commute with the
fermionic ones since they act on different factors of the tensor product), one
finds that
$$Q_+^2=Q_-^2=0$$
and
$$(Q_++Q_-)^2=[Q_+,Q_-]_+=H$$
One can also define the self-adjoint operators
$$Q_1=Q_++Q_-,\qquad Q_2=\frac{1}{i}(Q_+-Q_-)$$
which satisfy
$$[Q_1,Q_2]_+=0,\qquad Q_1^2=Q_2^2=H$$
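These relations can be checked numerically on a truncated model (the truncation of the bosonic oscillator to finitely many levels is an artifact of the sketch, not of the theory); assuming Python with numpy:

```python
import numpy as np

nb = 8   # truncate the bosonic oscillator at nb levels (truncation artifact)
aB = np.diag(np.sqrt(np.arange(1, nb)), 1)        # bosonic annihilation operator
aF = np.array([[0.0, 0.0], [1.0, 0.0]])           # fermionic annihilation operator
I_B, I_F = np.eye(nb), np.eye(2)

# operators on H = H_B tensor H_F
AB = np.kron(aB, I_F)
AF = np.kron(I_B, aF)
Qp = AB @ AF.T            # Q_+ = a_B a_F^dagger
Qm = AB.T @ AF            # Q_- = a_B^dagger a_F
H = np.kron(aB.T @ aB, I_F) + np.kron(I_B, aF.T @ aF)   # N_B + N_F

assert np.allclose(Qp @ Qp, 0) and np.allclose(Qm @ Qm, 0)
# [Q_+, Q_-]_+ = H, up to errors at the top bosonic level caused by truncation
diff = (Qp @ Qm + Qm @ Qp) - H
print(np.allclose(diff[:2*(nb-1), :2*(nb-1)], 0))   # True away from the cutoff
```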
with the same energy. To find states of zero energy, instead of trying to solve
the equation H|0i = 0 for |0i, one can look for solutions to
Q1 |0i = 0 or Q2 |0i = 0
These operators don’t correspond to a Lie algebra representation as H does,
but do come from a Lie superalgebra representation, so are described as gen-
erators of a “supersymmetry” transformation. In more general theories with
operators like this with the same relation to the Hamiltonian, one may or may
not have solutions to
Q1 |0i = 0 or Q2 |0i = 0
If such solutions exist, the lowest energy state has zero energy and is described
as invariant under the supersymmetry. If no such solutions exist, the lowest
energy state will have a non-zero, positive energy, and satisfy
$$Q_1|0\rangle\neq 0\quad\text{or}\quad Q_2|0\rangle\neq 0$$
In this case one says that the supersymmetry is “spontaneously broken”, since
the lowest energy state is not invariant under supersymmetry.
There is an example of a physical quantum mechanical system that has ex-
actly the behavior of this supersymmetric oscillator. A charged particle confined
to a plane, coupled to a magnetic field perpendicular to the plane, can be de-
scribed by a Hamiltonian that can be put in the bosonic oscillator form (to
show this, we need to know how to couple quantum systems to electromagnetic
fields, which we will come to later in the course). The equally spaced energy
levels are known as “Landau levels”. If the particle has spin one-half, there will
be an additional term in the Hamiltonian coupling the spin and the magnetic
field, exactly the one we have seen in our study of the two-state system. This
additional term is precisely the Hamiltonian of a fermionic oscillator. For the
case of gyromagnetic ratio g = 2, the coefficients match up so that we have
exactly the supersymmetric oscillator described above, with exactly the pattern
of energy levels seen there.
Introducing an arbitrary superpotential $W(q)$ with derivative $W'(q)$ we can
define new annihilation and creation operators:
$$a_B=\frac{1}{\sqrt{2}}(W'(Q)+iP),\qquad a_B^\dagger=\frac{1}{\sqrt{2}}(W'(Q)-iP)$$
and corresponding operators
$$Q_+=a_Ba_F^\dagger,\qquad Q_-=a_B^\dagger a_F$$
These satisfy
Q2+ = Q2− = 0
for the same reason as in the oscillator case: repeated factors of aF or a†F vanish.
Taking as the Hamiltonian the same square as before, we find
$$H=(Q_++Q_-)^2$$
$$=\frac{1}{2}(W'(Q)+iP)(W'(Q)-iP)a_F^\dagger a_F+\frac{1}{2}(W'(Q)-iP)(W'(Q)+iP)a_Fa_F^\dagger$$
$$=\frac{1}{2}(W'(Q)^2+P^2)(a_F^\dagger a_F+a_Fa_F^\dagger)+\frac{1}{2}(i[P,W'(Q)])(a_F^\dagger a_F-a_Fa_F^\dagger)$$
$$=\frac{1}{2}(W'(Q)^2+P^2)+\frac{1}{2}(i[P,W'(Q)])\sigma_3$$
But $iP$ is the operator corresponding to infinitesimal translations in $Q$, so we
have
$$i[P,W'(Q)]=W''(Q)$$
and
$$H=\frac{1}{2}(W'(Q)^2+P^2)+\frac{1}{2}W''(Q)\sigma_3$$
which gives a large class of quantum systems, all with state space
$$\mathcal{H}=\mathcal{H}_B\otimes\mathcal{H}_F=L^2(\mathbf{R})\otimes\mathbf{C}^2$$
There may or may not be a state with zero energy, depending on whether
or not one can find a solution to the equation
If such a solution does exist, thinking in terms of super Lie algebras, one calls
Q1 the generator of the action of a supersymmetry on the state space, and
describes the ground state |0i as invariant under supersymmetry. If no such
solution exists, one has a theory with a Hamiltonian that is invariant under su-
persymmetry, but with a ground state that isn’t. In this situation one describes
the supersymmetry as “spontaneously broken”. The question of whether a given
supersymmetric theory has its supersymmetry spontaneously broken or not is
one that has become of great interest in the case of much more sophisticated su-
persymmetric quantum field theories. There, hopes (so far unrealized) of making
contact with the real world rely on finding theories where the supersymmetry
is spontaneously broken.
In this simple quantum mechanical system, one can try and explicitly solve
the equation Q1 |ψi = 0. States can be written as two-component complex
functions
$$|\psi\rangle=\begin{pmatrix}\psi_+(q)\\\psi_-(q)\end{pmatrix}$$
and solutions to $Q_1|\psi\rangle=0$ take the form
$$\begin{pmatrix}\psi_+(q)\\\psi_-(q)\end{pmatrix}=e^{W(q)\sigma_3}\begin{pmatrix}c_+\\c_-\end{pmatrix}=\begin{pmatrix}c_+e^{W(q)}\\c_-e^{-W(q)}\end{pmatrix}$$
Normalizable solutions require either
$$c_+=0,\qquad \lim_{q\rightarrow\pm\infty}W(q)=+\infty$$
or
$$c_-=0,\qquad \lim_{q\rightarrow\pm\infty}W(q)=-\infty$$
If, for example, W (q) is an odd polynomial, one will not be able to satisfy either
of these conditions, so there will be no solution, and the supersymmetry will be
spontaneously broken.
30.3 Supersymmetric quantum mechanics and
differential forms
If one considers supersymmetric quantum mechanics in the case of d degrees of
freedom and in the Schrödinger representation, one has
H = L2 (Rd ) ⊗ Λ∗ (Rd )
= (d + δ)2
Chapter 31
31.1 The Pauli operator and free spin $\frac{1}{2}$ particles in $d = 3$
We have so far seen two quite different quantum systems based on three dimen-
sional space:
• The free particle of chapter 17. This had classical phase space $\mathbf{R}^6$ with
coordinates $q_1, q_2, q_3, p_1, p_2, p_3$ and Hamiltonian $\frac{1}{2m}|\mathbf{p}|^2$. Quantization us-
ing the Schrödinger representation gave operators $Q_1, Q_2, Q_3, P_1, P_2, P_3$
on the space $\mathcal{H}_B=L^2(\mathbf{R}^3)$ of square-integrable functions of the position
coordinates. The Hamiltonian operator is
$$H=\frac{1}{2m}|\mathbf{P}|^2=-\frac{1}{2m}\left(\frac{\partial^2}{\partial q_1^2}+\frac{\partial^2}{\partial q_2^2}+\frac{\partial^2}{\partial q_3^2}\right)$$
H = HB ⊗ HF = L2 (R3 ) ⊗ C2
in terms of it). In this pseudo-classical theory p1 ξ1 + p2 ξ2 + p3 ξ3 is a “super-
symmetry”, Poisson commuting with the Hamiltonian, while at the same time
playing the role of a sort of “square root” of the Hamiltonian, providing a new
sort of symmetry that can be thought of as a “square root” of an infinitesimal
time translation.
Quantization takes
1
p1 ξ1 + p2 ξ2 + p3 ξ3 → √ σ · P
2
and the Hamiltonian operator can now be written as an anticommutator and a
square
1 1 1 1 1
H= [ √ σ · P, √ σ · P]+ = (σ · P)2 = (P12 + P22 + P32 )
2m 2 2 2m 2m
(using the fact that the σj satisfy the Clifford algebra relations for Cliff(3, 0, R)).
We will define the three-dimensional Dirac operator as
$$\slashed{\partial}=\sigma_1\frac{\partial}{\partial q_1}+\sigma_2\frac{\partial}{\partial q_2}+\sigma_3\frac{\partial}{\partial q_3}=\sigma\cdot\nabla$$
It operates on two-component wavefunctions
$$\begin{pmatrix}\psi_1(\mathbf{q})\\\psi_2(\mathbf{q})\end{pmatrix}$$
Using this Dirac operator (often called in this context the "Pauli operator") we
can write a two-component version of the Schrödinger equation (often called the
"Pauli equation" or "Schrödinger-Pauli equation")
$$i\frac{\partial}{\partial t}\begin{pmatrix}\psi_1(\mathbf{q})\\\psi_2(\mathbf{q})\end{pmatrix}=-\frac{1}{2m}\left(\sigma_1\frac{\partial}{\partial q_1}+\sigma_2\frac{\partial}{\partial q_2}+\sigma_3\frac{\partial}{\partial q_3}\right)^2\begin{pmatrix}\psi_1(\mathbf{q})\\\psi_2(\mathbf{q})\end{pmatrix}\tag{31.1}$$
$$=-\frac{1}{2m}\left(\frac{\partial^2}{\partial q_1^2}+\frac{\partial^2}{\partial q_2^2}+\frac{\partial^2}{\partial q_3^2}\right)\begin{pmatrix}\psi_1(\mathbf{q})\\\psi_2(\mathbf{q})\end{pmatrix}$$
This free-particle version of the equation is just two copies of the standard free-
particle Schrödinger equation, so physically just corresponds to two independent
quantum free particles. It becomes much more non-trivial when a coupling to
an electromagnetic field is introduced, as will be seen in chapter 42.
The introduction of a two-component wavefunction does allow us to find
more interesting irreducible representations of the group E(3), beyond the ones
studied in chapter 17. These have eigenvalue $\pm\frac{1}{2}p$ (where $p^2$ is the eigenvalue
of the first Casimir operator $|\mathbf{P}|^2$) for the second Casimir operator $\mathbf{J}\cdot\mathbf{P}$, as
opposed to the zero eigenvalue case of single-component wavefunctions.
These representations will as before be on the space of solutions of the time-
independent equation, and irreducible for fixed choice of the energy E. The
equation for the energy eigenfunctions of energy eigenvalue E will be
$$\frac{1}{2m}(\sigma\cdot\mathbf{P})^2\begin{pmatrix}\psi_1(\mathbf{q})\\\psi_2(\mathbf{q})\end{pmatrix}=E\begin{pmatrix}\psi_1(\mathbf{q})\\\psi_2(\mathbf{q})\end{pmatrix}$$
In terms of the inverse Fourier transform
$$\psi_{1,2}(\mathbf{q})=\frac{1}{(2\pi)^{3/2}}\iiint e^{i\mathbf{p}\cdot\mathbf{q}}\widetilde{\psi}_{1,2}(\mathbf{p})d^3p$$
and as in chapter 17 our solution space is given by functions $\widetilde{\psi}_{E,1,2}(\mathbf{p})$ on the
sphere of radius $\sqrt{2mE}=|\mathbf{p}|$ in momentum space (although now, two such
functions).
Another way to find solutions to this equation is to look for solutions to
a pair of first-order equations involving the three-dimensional Dirac operator.
Solutions to
$$\sigma\cdot\mathbf{p}\begin{pmatrix}\widetilde{\psi}_1(\mathbf{p})\\\widetilde{\psi}_2(\mathbf{p})\end{pmatrix}=\pm\sqrt{2mE}\begin{pmatrix}\widetilde{\psi}_1(\mathbf{p})\\\widetilde{\psi}_2(\mathbf{p})\end{pmatrix}$$
will give solutions to 31.2, for either sign. One can rewrite this as
$$\frac{\sigma\cdot\mathbf{p}}{|\mathbf{p}|}\begin{pmatrix}\widetilde{\psi}_1(\mathbf{p})\\\widetilde{\psi}_2(\mathbf{p})\end{pmatrix}=\pm\begin{pmatrix}\widetilde{\psi}_1(\mathbf{p})\\\widetilde{\psi}_2(\mathbf{p})\end{pmatrix}$$
and we will write solutions to this equation with the $+$ sign as $\widetilde{\psi}_{E,+}(\mathbf{p})$, those for
the $-$ sign as $\widetilde{\psi}_{E,-}(\mathbf{p})$. Note that $\widetilde{\psi}_{E,+}(\mathbf{p})$ and $\widetilde{\psi}_{E,-}(\mathbf{p})$ are each two-component
complex functions of the momentum, supported on the sphere $\sqrt{2mE}=|\mathbf{p}|$.
In chapter 16 we saw that R ∈ SO(3) acts on single-component momentum
space solutions of the Schrödinger equation by
(or, equivalently, thinking of two component wavefunctions as the tensor product
of the space of single component wavefunction with C2 , just acting on the
first factor). If we do this, the operator σ · P does not commute with the
representation $\widetilde{u}(0,R)$ because
$$\widetilde{u}(0,R)(\sigma\cdot\mathbf{P})\widetilde{u}(0,R)^{-1}=(\sigma\cdot R^{-1}\mathbf{P})\neq\sigma\cdot\mathbf{P}$$
Then rotations do not act separately on the spaces ψeE,+ (p) and ψeE,− (p).
If we want rotations to act separately on these spaces, we need to change
the action of rotations to
$$\widetilde{\psi}_{E,\pm}(\mathbf{p})\rightarrow\widetilde{u}_S(0,R)\widetilde{\psi}_{E,\pm}(\mathbf{p})=\Omega\widetilde{\psi}_{E,\pm}(R^{-1}\mathbf{p})$$
where Ω is one of the two elements of SU (2) corresponding to R ∈ SO(3) (or,
in terms of tensor products, action by SU (2) on the C2 factor). Such an Ω can
be constructed using equation 6.3
$$\Omega=\Omega(\phi,\mathbf{w})=e^{-i\frac{\phi}{2}\mathbf{w}\cdot\sigma}$$
Equation 6.5 shows that Ω is the SU (2) matrix corresponding to a rotation R
by an angle φ about the axis given by a unit vector w.
With this action on solutions we have
$$\widetilde{u}_S(0,R)(\sigma\cdot\mathbf{P})\widetilde{u}_S(0,R)^{-1}\widetilde{\psi}_{E,\pm}(\mathbf{p})=\widetilde{u}_S(0,R)(\sigma\cdot\mathbf{P})\Omega^{-1}\widetilde{\psi}_{E,\pm}(R\mathbf{p})$$
$$=\Omega(\sigma\cdot R^{-1}\mathbf{P})\Omega^{-1}\widetilde{\psi}_{E,\pm}(R^{-1}R\mathbf{p})$$
$$=(\sigma\cdot\mathbf{P})\widetilde{\psi}_{E,\pm}(\mathbf{p})$$
where we have used equation 6.5 to show
$$\Omega(\sigma\cdot R^{-1}\mathbf{P})\Omega^{-1}=(\sigma\cdot RR^{-1}\mathbf{P})=(\sigma\cdot\mathbf{P})$$
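The covariance property used here is easy to verify numerically for a particular rotation; a small sketch, assuming Python with numpy and scipy (an illustration, not part of the text's argument):

```python
import numpy as np
from scipy.linalg import expm

s = [np.array([[0, 1], [1, 0]], dtype=complex),
     np.array([[0, -1j], [1j, 0]], dtype=complex),
     np.array([[1, 0], [0, -1]], dtype=complex)]
sigma_dot = lambda v: sum(vj * sj for vj, sj in zip(v, s))

phi, w = 0.9, np.array([1.0, 0.0, 0.0])        # rotation by phi about the x-axis
Omega = expm(-1j * phi / 2 * sigma_dot(w))     # Omega in SU(2), as in equation 6.3
R = np.array([[1, 0, 0],
              [0, np.cos(phi), -np.sin(phi)],
              [0, np.sin(phi), np.cos(phi)]])  # corresponding R in SO(3)

p = np.array([0.4, -1.2, 2.0])
# Omega (sigma . p) Omega^{-1} = sigma . (R p)
print(np.allclose(Omega @ sigma_dot(p) @ Omega.conj().T, sigma_dot(R @ p)))  # True
```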
Note that the two representations we get this way are representations not
of the rotation group SO(3) but of its double cover Spin(3) = SU (2) (because
otherwise there is a sign ambiguity since we don’t know whether to choose Ω
or −Ω). The translation part of the spatial symmetry group is easily seen to
commute with σ · P, so we have constructed representations of E(3), or rather,
of its double cover
$$\widetilde{E(3)}=\mathbf{R}^3\rtimes SU(2)$$
on the two spaces of solutions $\widetilde{\psi}_{E,\pm}(\mathbf{p})$. We will see that these two representa-
tions are the E(3) representations described in section 17.3, the ones labeled by
the helicity $\pm\frac{1}{2}$ representations of the stabilizer group SO(2).
The translation part of the group acts as in the one-component case, by the
multiplication operator
$$\widetilde{u}_S(\mathbf{a},\mathbf{1})\widetilde{\psi}_{E,\pm}(\mathbf{p})=e^{-i(\mathbf{a}\cdot\mathbf{p})}\widetilde{\psi}_{E,\pm}(\mathbf{p})$$
and
$$\widetilde{u}_S(\mathbf{a},\mathbf{1})=e^{-i\mathbf{a}\cdot\mathbf{P}}$$
so the Lie algebra representation is given by the usual P operator. The SU (2)
part of the group acts by a product of two commuting different actions
1. The same action on the momentum coordinates as in the one-component
case, just using R = Φ(Ω), the SO(3) rotation corresponding to the SU (2)
group element Ω. For example, for a rotation about the x-axis by angle φ
we have
ψeE,± (p) → ψeE,± (R(φ, e1 )−1 p)
Recall that the operator that does this is $e^{-i\phi L_1}$ where
$$-iL_1=-i(Q_2P_3-Q_3P_2)=-\left(q_2\frac{\partial}{\partial q_3}-q_3\frac{\partial}{\partial q_2}\right)$$
and in general we have operators
−iL = −iQ × P
that provide the Lie algebra version of the representation (recall that at
the Lie algebra level, SO(3) and Spin(3) are isomorphic).
2. The action of the matrix Ω ∈ SU (2) on the two-component wavefunction
by
ψeE,± (p) → ΩψeE,± (p)
For a rotation by angle $\phi$ about the x-axis we have
$$\Omega=e^{-i\phi\frac{\sigma_1}{2}}$$
and the operators that provide the Lie algebra version of the representation
are the
$$-i\mathbf{S}=-i\frac{1}{2}\sigma$$
The Lie algebra representation corresponding to the action of both of these
transformations is given by the operator
$$-i\mathbf{J}=-i(\mathbf{L}+\mathbf{S})$$
The second Casimir operator is $\mathbf{J}\cdot\mathbf{P}$,
and as in the one-component case the $\mathbf{L}\cdot\mathbf{P}$ part of this acts trivially on our
solutions $\widetilde{\psi}_{E,\pm}(\mathbf{p})$. The spin component acts non-trivially and we have
$$(\mathbf{J}\cdot\mathbf{P})\widetilde{\psi}_{E,\pm}(\mathbf{p})=\left(\frac{1}{2}\sigma\cdot\mathbf{p}\right)\widetilde{\psi}_{E,\pm}(\mathbf{p})=\pm\frac{1}{2}|\mathbf{p}|\widetilde{\psi}_{E,\pm}(\mathbf{p})$$
so we see that our solutions have helicity (eigenvalue of $\mathbf{J}\cdot\mathbf{P}$ divided by the
square root of the eigenvalue of $|\mathbf{P}|^2$) values $\pm\frac{1}{2}$, as opposed to the integral
helicity values discussed in chapter 17, where E(3) appeared and not its double
cover.
31.2 The Dirac operator
One can generalize the above construction to the case of any dimension d as
follows. Recall from chapter 26 that associated to Rd with a standard inner
product, but of a general signature (r, s) (where r + s = d, r is the number
of + signs, s the number of − signs) we have a Clifford algebra Cliff(r, s) with
generators γj satisfying
$$\gamma_j\gamma_k=-\gamma_k\gamma_j,\quad j\neq k$$
$$\gamma_j^2=+1\ \text{ for }j=1,\dots,r,\qquad \gamma_j^2=-1\ \text{ for }j=r+1,\dots,d$$
To any vector $v\in\mathbf{R}^d$ with components $v_j$ recall that we can associate a corre-
sponding element $\slashed{v}$ in the Clifford algebra by
$$v\in\mathbf{R}^d\rightarrow\slashed{v}=\sum_{j=1}^d\gamma_jv_j\in\text{Cliff}(r,s)$$
Multiplying this Clifford algebra element by itself and using the relations above,
we get a scalar, the length-squared of the vector
$$\slashed{v}^2=v_1^2+v_2^2+\cdots+v_r^2-v_{r+1}^2-\cdots-v_d^2=|v|^2$$
This will be a first-order differential operator with the property that its
square is the Laplacian
$$\slashed{\partial}^2=\frac{\partial^2}{\partial q_1^2}+\cdots+\frac{\partial^2}{\partial q_r^2}-\frac{\partial^2}{\partial q_{r+1}^2}-\cdots-\frac{\partial^2}{\partial q_d^2}$$
The Dirac operator ∂/ acts not on functions but on functions taking values
in the spinor vector space S that the Clifford algebra acts on. Picking a matrix
representation of the γj , the Dirac operator will be a constant coefficient first
order differential operator acting on wavefunctions with dim S components. In
chapter 44 we will study in detail what happens for the case of r = 3, s = 1 and
see how the Dirac operator there provides an appropriate wave-equation with
the symmetries of special relativistic space-time.
in any quantum mechanics book, see for example chapter 14 of [57]. For more
details about supersymmetric quantum mechanics and the appearance of the
Dirac operator as the generator of a supersymmetry in the quantization of a
pseudo-classical system, see [64] and [1].
Chapter 32
In this chapter we’ll give a rapid survey of a different starting point for devel-
oping quantum mechanics, based on the Lagrangian rather than Hamiltonian
classical formalism. Lagrangian methods have quite different strengths and
weaknesses than those of the Hamiltonian formalism, and we’ll try and point
these out, while referring to standard physics texts for more detail about these
methods.
The Lagrangian formalism leads naturally to an apparently very different
notion of quantization, one based upon formulating quantum theory in terms of
infinite-dimensional integrals known as path integrals. A serious investigation
of these would require another and very different volume, so again we’ll have to
restrict ourselves to outlining how path integrals work, describing their strengths
and weaknesses, and giving references to standard texts for the details.
one can define a functional on the space of such paths:
Definition (Action). The action $S$ for a path $\gamma$ is
$$S[\gamma]=\int_{t_1}^{t_2}L(q(t),\dot{q}(t))dt$$
But
$$\delta\dot{q}(t)=\frac{d}{dt}\delta q(t)$$
and, using integration by parts
$$\frac{\partial L}{\partial\dot{q}}\delta\dot{q}(t)=\frac{d}{dt}\left(\frac{\partial L}{\partial\dot{q}}\delta q\right)-\frac{d}{dt}\left(\frac{\partial L}{\partial\dot{q}}\right)\delta q$$
so
$$\delta S[\gamma]=\int_{t_1}^{t_2}\left(\left(\frac{\partial L}{\partial q}-\frac{d}{dt}\frac{\partial L}{\partial\dot{q}}\right)\delta q+\frac{d}{dt}\left(\frac{\partial L}{\partial\dot{q}}\delta q\right)\right)dt$$
$$=\int_{t_1}^{t_2}\left(\frac{\partial L}{\partial q}-\frac{d}{dt}\frac{\partial L}{\partial\dot{q}}\right)\delta q\,dt+\left(\frac{\partial L}{\partial\dot{q}}\delta q\right)(t_2)-\left(\frac{\partial L}{\partial\dot{q}}\delta q\right)(t_1)$$
If we keep the endpoints fixed so $\delta q(t_1)=\delta q(t_2)=0$, then for solutions to
$$\frac{\partial L}{\partial q}(q(t),\dot{q}(t))-\frac{d}{dt}\left(\frac{\partial L}{\partial\dot{q}}(q(t),\dot{q}(t))\right)=0$$
the integral will be zero for arbitrary variations δq.
As an example, a particle moving in a potential $V(q)$ will be described by a
Lagrangian
$$L(q,\dot{q})=\frac{1}{2}m\sum_{j=1}^d\dot{q}_j^2-V(q)$$
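For this Lagrangian the Euler-Lagrange equation is Newton's equation $m\ddot{q}_j=-\partial V/\partial q_j$. A short symbolic check (a sketch assuming Python with sympy; the quartic potential is an arbitrary concrete choice made for illustration):

```python
import sympy as sp
from sympy.calculus.euler import euler_equations

t, m = sp.symbols('t m', positive=True)
q = sp.Function('q')

# L = (1/2) m qdot^2 - V(q), here with the concrete choice V(q) = q^4 / 4
L = sp.Rational(1, 2) * m * sp.diff(q(t), t)**2 - q(t)**4 / 4

# the Euler-Lagrange equation, equivalent to m*q'' = -q**3, i.e. m qddot = -V'(q)
print(euler_equations(L, [q(t)], [t]))
```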
This is an example of a more general result known as “Noether’s theorem”.
It says that given a Lie group action on a Lagrangian system that leaves the
action invariant, for each element X of the Lie algebra we will have a conserved
quantity
$$\frac{\partial L}{\partial\dot{q}}\delta q(X)$$
which is independent of time along the trajectory. A basic example is when
the Lagrangian is independent of the position variables qj , depending only on
the velocities q̇j , for example in the case of a free particle, when V (q) = 0. In
such a case one has invariance under the Lie group Rd of space-translations.
An infinitesimal transformation in the $j$-direction shifts $q_j(t)$ by a constant, and
the corresponding conserved quantity is
$$\frac{\partial L}{\partial\dot{q}_j}=m\dot{q}_j$$
$$p_j=\frac{\partial L}{\partial\dot{q}_j}$$
dynamics” and are not unusual: one example we will see later is that of the
equations of motion of a free electromagnetic field (Maxwell’s equations).
Besides a phase space, for a Hamiltonian system one needs a Hamiltonian function. Choosing
$$h = \sum_{j=1}^{d}p_j\dot q_j - L(q,\dot q)$$
will work, provided one can use the relation $p_j = \partial L/\partial\dot q_j$ to solve for the velocities $\dot q_j$ and express them in terms of the momentum variables. In that case, computing the differential of $h$ one finds (for $d=1$; the generalization to higher $d$ is straightforward)
$$dh = p\,d\dot q + \dot q\,dp - \frac{\partial L}{\partial q}dq - \frac{\partial L}{\partial\dot q}d\dot q = \dot q\,dp - \frac{\partial L}{\partial q}dq$$
(the $d\dot q$ terms cancel since $p = \partial L/\partial\dot q$). So one has
$$\frac{\partial h}{\partial p} = \dot q,\qquad \frac{\partial h}{\partial q} = -\frac{\partial L}{\partial q}$$
but these are precisely Hamilton's equations, since the Euler-Lagrange equations imply
$$\frac{\partial L}{\partial q} = \frac{d}{dt}\frac{\partial L}{\partial\dot q} = \dot p$$
The Lagrangian formalism has the advantage of depending concisely just on
the choice of action functional, which does not distinguish time in the same
way that the Hamiltonian formalism does by its dependence on a choice of
Hamiltonian function h. This makes the Lagrangian formalism quite useful in
the case of relativistic quantum field theories, where one would like to exploit the
full set of space-time symmetries, which can mix space and time directions. On
the other hand, one loses the infinite dimensional group of symmetries of phase
space for which the Poisson bracket is the Lie bracket (the so-called “canonical
transformations”). In the Hamiltonian formalism we saw that the harmonic
oscillator could be best understood using such symmetries, in particular the
U (1) symmetry generated by the Hamiltonian function. The harmonic oscillator
is a more difficult problem in the Lagrangian formalism, where this symmetry
is not manifest.
we have discussed extensively earlier (known to physicists as “canonical quan-
tization”). There is however a very different approach to quantization, which
completely bypasses the Hamiltonian formalism. This is the path integral for-
malism, which is based upon a method for calculating matrix elements of the
time-evolution operator
$$\langle q_T|e^{-\frac{i}{\hbar}HT}|q_0\rangle$$
in the position eigenstate basis, in terms of an integral over the space of paths that go from $q_0$ to $q_T$ in time $T$. Here $|q_0\rangle$ is an eigenstate of $Q$ with eigenvalue $q_0$ (a delta-function at $q_0$ in the position space representation), and $|q_T\rangle$ has $Q$ eigenvalue $q_T$ (as in many cases, we'll stick to $d=1$ for this discussion). This
matrix element has a physical interpretation as the amplitude for a particle
starting at q0 at t = 0 to have position qT at time T , with its norm-squared
giving the probability density for observing the particle at position qT .
To try and derive a path-integral expression for this, one breaks up the
interval [0, T ] into N equal-sized sub-intervals and calculates
$$\langle q_T|(e^{-\frac{i}{N\hbar}HT})^N|q_0\rangle$$
If the Hamiltonian is a sum $H = K(P) + V(Q)$, with $K(P)$ depending only on the momentum operator $P$ and $V(Q)$ only on the operator $Q$, then one can insert alternate copies of the identity operator in the forms
$$\int_{-\infty}^{\infty}|q\rangle\langle q|\,dq = \mathbf{1},\qquad \int_{-\infty}^{\infty}|p\rangle\langle p|\,dp = \mathbf{1}$$
where the index $j$ goes from 1 to $N$, and the $p_j$, $q_j$ variables will be integrated over. Such a term can be evaluated as
$$\langle q_j|p_j\rangle\langle p_j|q_{j-1}\rangle e^{-\frac{i}{N\hbar}K(p_j)T}e^{-\frac{i}{N\hbar}V(q_{j-1})T}$$
$$=\frac{1}{\sqrt{2\pi\hbar}}e^{\frac{i}{\hbar}q_jp_j}\frac{1}{\sqrt{2\pi\hbar}}e^{-\frac{i}{\hbar}q_{j-1}p_j}e^{-\frac{i}{N\hbar}K(p_j)T}e^{-\frac{i}{N\hbar}V(q_{j-1})T}$$
$$=\frac{1}{2\pi\hbar}e^{\frac{i}{\hbar}p_j(q_j-q_{j-1})}e^{-\frac{i}{N\hbar}(K(p_j)+V(q_{j-1}))T}$$
The $N$ factors of this kind give an overall factor of $(\frac{1}{2\pi\hbar})^N$ times something which is a discretized approximation to
$$e^{\frac{i}{\hbar}\int_0^T(p\dot q - h(q(t),p(t)))\,dt}$$
where the phase in the exponential is just the action. Taking into account the
integrations over qj and pj one should have something like
$$\langle q_T|e^{-\frac{i}{\hbar}HT}|q_0\rangle = \lim_{N\to\infty}\Big(\frac{1}{2\pi\hbar}\Big)^{N}\prod_{j=1}^{N}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}dp_j\,dq_j\ e^{\frac{i}{\hbar}\int_0^T(p\dot q - h(q(t),p(t)))\,dt}$$
although one should not do the first and last integrals over q but fix the first
value of q to q0 and the last one to qT . One can try and interpret this sort of
integration in the limit as an integral over the space of paths in phase space,
thus a “phase space path integral”.
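The time-slicing manipulation above can also be illustrated numerically: discretizing position space on a finite grid turns $e^{-\frac{i}{N\hbar}KT}$ and $e^{-\frac{i}{N\hbar}VT}$ into finite matrices, and the product of $N$ alternating factors converges to the exact (discretized) time-evolution operator as $N$ grows. The following sketch is only an illustration of this point under arbitrary choices (NumPy/SciPy, $\hbar = m = 1$, a harmonic potential, a smooth Gaussian initial state standing in for a position eigenstate); it is not a computation from the text:

```python
import numpy as np
from scipy.linalg import expm

# Grid discretization of a particle on an interval (hbar = m = 1).
n, L = 200, 20.0
x = np.linspace(-L / 2, L / 2, n, endpoint=False)
dx = L / n
p = 2 * np.pi * np.fft.fftfreq(n, d=dx)      # momentum grid used by the FFT

V = 0.5 * x**2                               # example potential (harmonic oscillator)
T, N = 1.0, 400                              # total time, number of time slices
dt = T / N

def trotter_step(psi):
    """One slice: exp(-i K dt) applied in momentum space, exp(-i V dt) in position space."""
    psi = np.fft.ifft(np.exp(-1j * dt * p**2 / 2) * np.fft.fft(psi))
    return np.exp(-1j * dt * V) * psi

# Exact evolution operator for the same discretized Hamiltonian H = K + V.
F = np.fft.fft(np.eye(n), axis=0)            # DFT matrix
K = np.fft.ifft(np.diag(p**2 / 2) @ F, axis=0)
U_exact = expm(-1j * (K + np.diag(V)) * T)

psi0 = np.exp(-(x - 1.0)**2)                 # smooth initial state
psi0 = psi0 / np.linalg.norm(psi0)

psi_sliced = psi0.astype(complex)
for _ in range(N):
    psi_sliced = trotter_step(psi_sliced)

print(np.linalg.norm(psi_sliced - U_exact @ psi0))   # small, decreasing as N is increased
```

This only checks the operator identity behind the construction; the path integral expression corresponds to, in addition, inserting position and momentum eigenstates between the factors as above.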
This is an extremely simple and seductive expression, apparently saying that,
once the action S is specified, a quantum system is defined just by considering
integrals
Z
i
Dγ e ~ S[γ]
over paths γ in phase space, where Dγ is some sort of measure on this space of
paths. Since the integration just involves factors of dpdq and the exponential
just pdq and h, this formalism seems to share the same sort of invariance under
the infinite-dimensional group of canonical transformations (transformations of
the phase space preserving the Poisson bracket) as the classical Hamiltonian
formalism. It also appears to solve our problem with operator ordering am-
biguities, since introducing products of $P$s and $Q$s at various times will just
give a phase space path integral with the corresponding p and q factors in the
integrand, but these commute.
Unfortunately, we know from the Groenewold-van Hove theorem that this
is too good to be true. This expression cannot give a unitary representation of
the full group of canonical transformations, at least not one that is irreducible
and restricts to what we want on transformations generated by linear functions
q and p. Another way to see the problem is that a simple argument shows
that by canonical transformations one can transform any Hamiltonian into a
free-particle Hamiltonian, so all quantum systems would just be free particles
in some choice of variables. For the details of these arguments and a careful
examination of what goes wrong, see chapter 31 of [54]. One aspect of the
problem is that, as a measure on the discrete sets of points qj , pj , points in
phase space for successive values of j are not likely to be close together, so
thinking of the integral as an integral over paths is not justified.
When the Hamiltonian h is quadratic in the momentum p, the pj integrals
will be Gaussian integrals that can be performed exactly. Equivalently, the
kinetic energy part K of the Hamiltonian operator will have a kernel in position
space that can be computed exactly. Using one of these, the pj integrals can be
eliminated, leaving just integrals over the qj that one might hope to interpret as
a path integral over paths not in phase space, but in position space. One finds,
if $K = \frac{P^2}{2m}$,
$$\langle q_T|e^{-\frac{i}{\hbar}HT}|q_0\rangle = \lim_{N\to\infty}\left(\frac{Nm}{i2\pi\hbar T}\right)^{\frac{N}{2}}\prod_{j=1}^{N}\int_{-\infty}^{\infty}dq_j\ \exp\left(\frac{i}{\hbar}\sum_{j=1}^{N}\left(\frac{m(q_j-q_{j-1})^2}{2T/N}-V(q_j)\frac{T}{N}\right)\right)$$
where now the paths γ(t) are paths in the position space.
An especially attractive aspect of this expression is that it provides a simple
understanding of how classical behavior emerges in the classical limit as ~ → 0.
The stationary phase approximation method for oscillatory integrals says that, for a function $f$ with a single critical point at $x = x_c$ (i.e., $f'(x_c) = 0$) and for a small parameter $\epsilon$, one has
$$\frac{1}{\sqrt{i2\pi\epsilon}}\int_{-\infty}^{+\infty}dx\ e^{if/\epsilon} = \frac{1}{\sqrt{f''(x_c)}}\,e^{if(x_c)/\epsilon}\,(1 + O(\epsilon))$$
Using the same principle for the infinite-dimensional path integral, with $f = S$ the action functional on paths, and $\epsilon = \hbar$, one finds that for $\hbar \to 0$ the path
integral will simplify to something that just depends on the classical trajectory,
since by the principle of least action, this is the critical point of S.
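As a concrete example of this (standard, filled in here): for the free particle the critical point of $S$ is the straight line path from $q_0$ to $q_T$, with classical action
$$S_{cl} = \frac{m(q_T - q_0)^2}{2T}$$
and the exact propagator $\sqrt{\frac{m}{i2\pi\hbar T}}\,e^{\frac{i}{\hbar}S_{cl}}$ is precisely of the stationary phase form, with the phase given by the classical action.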
Such position-space path integrals do not have the problems of principle
of phase space path integrals coming from the Groenewold-van Hove theorem,
but they still have serious analytical problems since they involve an attempt to
integrate a wildly oscillating phase over an infinite-dimensional space. One does
not naturally get a unitary result for the time evolution operator, and it is not
clear that whatever results one gets will be independent of the details of how
one takes the limit to define the infinite-dimensional integral.
Such path integrals though are closely related to integrals that are known
to make sense, ones that occur in the theory of random walks. There, a well-
defined measure on paths does exist, Wiener measure. In some sense Wiener
measure is what one gets in the case of the path integral for a free particle, but
taking the time variable t to be complex and analytically continuing
t → it
So, one can use Wiener measure techniques to define the path integral, getting
results that need to be analytically continued back to the physical time variable.
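To get some feeling for the random walk side of this (a small numerical aside, not from the text, using NumPy and arbitrary parameter choices): with $\hbar = m = 1$ the Euclidean version of the free-particle kernel is the heat kernel $\frac{1}{\sqrt{2\pi\tau}}e^{-\frac{(x-x')^2}{2\tau}}$, which is exactly the distribution of endpoints of random walks built from independent Gaussian steps:

```python
import numpy as np

rng = np.random.default_rng(0)
tau, n_steps, n_walks = 1.0, 100, 50_000
dt = tau / n_steps

# Each walk is a sum of independent Gaussian increments of variance dt.
endpoints = rng.normal(0.0, np.sqrt(dt), size=(n_walks, n_steps)).sum(axis=1)

# Compare the empirical endpoint distribution with the heat kernel at Euclidean time tau.
edges = np.linspace(-3, 3, 13)
hist, _ = np.histogram(endpoints, bins=edges, density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
kernel = np.exp(-centers**2 / (2 * tau)) / np.sqrt(2 * np.pi * tau)
print(np.max(np.abs(hist - kernel)))   # close to zero: the walk reproduces the heat kernel
```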
In summary, the path integral method has the following advantages:
• Study of the classical limit and “semi-classical” effects (quantum effects
at small $\hbar$) is straightforward.
• Calculations for free particles and for series expansions about the free
particle limit can be done just using Gaussian integrals, and these are rela-
tively easy to evaluate and make sense of, despite the infinite-dimensionality
of the space of paths.
• After analytical continuation, path integrals can be rigorously defined us-
ing Wiener measure techniques, and often evaluated numerically even in
cases where no exact solution is known.
On the other hand, there are disadvantages:
• Some path integrals such as phase space path integrals do not at all have
the properties one might expect, so great care is required in any use of
them.
• How to get unitary results can be quite unclear. The analytic continua-
tion necessary to make path integrals well-defined can make their physical
interpretation obscure.
Chapter 33
Quantization of Infinite-dimensional Phase Spaces
Up until this point we have been dealing with finite-dimensional phase spaces
and their quantization in terms of Weyl and Clifford algebras. We will now turn
to the study of quantum systems (both bosonic and fermionic) corresponding
to infinite-dimensional phase spaces. The phase spaces of interest are spaces
of solutions of some partial differential equation, so these solutions are classi-
cal fields. The corresponding quantum theory is thus called a “quantum field
theory”. In this chapter we’ll just make some general comments about the new
phenomena that appear when one deals with such infinite-dimensional exam-
ples, without going into any detail at all. Formulating quantum field theories in
a mathematically rigorous way is a major and ongoing project in mathematical
physics research, one far beyond the scope of this text. We will treat this subject
at a physicist’s level of rigor, while trying to give some hint of how one might
proceed with precise mathematical constructions when they exist. We will also
try and indicate where there are issues that require much deeper or even still
unknown ideas, as opposed to those where the needed mathematical techniques
are of a conventional nature.
While finite-dimensional Lie groups and their representations are rather well-
understood mathematical objects, this is not at all true for infinite-dimensional
Lie groups, where mathematical results are rather fragmentary. For the case
of infinite-dimensional phase spaces, bosonic or fermionic, the symplectic or or-
thogonal groups acting on these spaces will be infinite-dimensional. One would
like to find infinite-dimensional analogs of the role these groups and their rep-
resentations play in quantum theory in the finite-dimensional case.
33.1 Inequivalent irreducible representations
In our discussion of the Weyl and Clifford algebras in finite dimensions, an im-
portant part of this story was the Stone-von Neumann theorem and its fermionic
analog, which say that these algebras each have only one interesting irreducible
representation (the Schrödinger representation in the bosonic case, the spinor
representation in the fermionic case). Once we go to infinite dimensions, this is
no longer true: there will be an infinite number of inequivalent irreducible rep-
resentations, with no known complete classification of the possibilities. Before
one can even begin to compute things like expectation values of observables,
one needs to find an appropriate choice of representation, adding a new layer of
difficulty to the problem that goes beyond that of just increasing the number of
degrees of freedom.
To get some idea of how the Stone-von Neumann theorem can fail, one
can consider the Bargmann-Fock quantization of the harmonic oscillator with d
degrees of freedom, and note that it necessarily depends upon making a choice
of an appropriate complex structure J (see chapter 22), with the conventional
choice denoted J0 . Changing from J0 to a different J corresponds to changing
the definition of annihilation and creation operators (but in a manner that
preserves their commutation relations). Physically, this entails a change in the
Hamiltonian and a change in the lowest-energy or vacuum state:
$$|0\rangle_{J_0}\rightarrow |0\rangle_J$$
But $|0\rangle_J$ is still an element of the same state space $\mathcal{H}$ as $|0\rangle_{J_0}$, and one gets the same $\mathcal{H}$ by acting with annihilation and creation operators on $|0\rangle_{J_0}$ or on $|0\rangle_J$. The two constructions of the same $\mathcal{H}$ correspond to unitarily equivalent representations of the Heisenberg group $H_{2d+1}$.
For a phase space with $d = \infty$, what can happen is that there can be choices of $J$ such that acting with annihilation and creation operators on $|0\rangle_{J_0}$ and $|0\rangle_J$ gives two different state spaces $\mathcal{H}_{J_0}$ and $\mathcal{H}_J$, providing two inequivalent representations of the Heisenberg group. For quantum systems with an infinite number of degrees of freedom, one can thus have the same algebra of operators, but a different choice of Hamiltonian can give both a different vacuum state and a different state space $\mathcal{H}$ on which the operators act. This same phenomenon occurs both in the bosonic and fermionic cases, as one goes to infinite dimensional Weyl or Clifford algebras.
It turns out though that if one restricts the class of complex structures J to
ones not that different from J0 , then one can recover a version of the Stone-von
Neumann theorem and have much the same behavior as in the finite-dimensional
case. Note that an invertible linear map $g$ on phase space acts on the complex structure, taking
$$J_0 \rightarrow J_g = g\cdot J_0 = gJ_0g^{-1}$$
Definition (Restricted symplectic and orthogonal groups). The group of linear
transformations g of an infinite-dimensional symplectic vector space preserving
the symplectic structure and also satisfying the condition
$$\mathrm{tr}(A^\dagger A) < \infty$$
on the operator
$$A = [J_g, J_0]$$
is called the restricted symplectic group and denoted Spres . The group of linear
transformations g of an infinite-dimensional inner-product space preserving the
inner-product and satisfying the same condition as above on [Jg , J0 ] is called the
restricted orthogonal group and denoted Ores .
An operator A satisfying tr(A† A) < ∞ is said to be a Hilbert-Schmidt
operator.
One then has the following replacement for the Stone-von Neumann theorem:
Theorem. Given two complex structures $J_1$, $J_2$ on a Hilbert space such that $[J_1, J_2]$ is Hilbert-Schmidt, acting on the states $|0\rangle_{J_1}$, $|0\rangle_{J_2}$ with annihilation and creation operators gives unitarily equivalent representations of the Heisenberg group.
section 23.2. The group $U(\infty)$ will act trivially on the vacuum state $|0\rangle_{J_0}$, and
the finite-dimensional groups of symmetries of quantum field theories (coming
from, for example, the action of the rotation group on physical space) will be
subgroups of this group. For these, the problem to be discussed in this section
will not occur.
Recall though that knowing the action of the symplectic group as automorphisms of the Weyl algebra only determines its representation on the state space
up to a phase factor. In the finite-dimensional case it turned out that this phase
factor could be chosen to be just a sign, giving a representation (the metaplectic
representation) that was a representation up to sign (and a true representation
of a double cover of the symplectic group). In the infinite dimensional case of
Spres , it turns out that the phase factors cannot be reduced to signs, and the
analog of the metaplectic representation is a representation of Spres only up to
a phase. To get a true representation, one needs to extend $Sp_{res}$ to a larger group $\widetilde{Sp}_{res}$ that is not just a cover, but has an extra dimension.
In terms of Lie algebras one has
$$\widetilde{\mathfrak{sp}}_{res} = \mathfrak{sp}_{res}\oplus\mathbf{R}$$
with elements non-zero only in the R direction commuting with everything else.
For all commutation relations, there are now possible scalar terms to keep track
of. In the finite dimensional case we saw that such terms would occur when we
used normal-ordered operators (an example is the shift by the scalar $\frac{1}{2}$ in the
Hamiltonian operator for a harmonic oscillator with one degree of freedom), but
without normal-ordering no such terms were needed. In the infinite-dimensional
case normal-ordering is needed to avoid having a representation that acts on
states like |0iJ0 by an infinite phase change, and one can not eliminate the
effect of normal-ordering by making a well-defined finite phase change on the
way the operators act.
Commuting two elements of the Lie sub-algebra u(∞) will not give a scalar
factor, but such factors can occur when one commutes the action of elements of
spres not in u(∞), in which case they are known as “Schwinger terms”. Already
in finite dimensions, we saw that commuting the action of $a^2$ with that of $(a^\dagger)^2$ gave a scalar term relative to the normal-ordered $a^\dagger a$ operator (see section 22.3), and in infinite dimensions, it is this scalar term that cannot be redefined away.
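Concretely (a standard computation in the conventions of the finite-dimensional harmonic oscillator, filled in here for illustration), using $[a,a^\dagger] = 1$ one finds
$$[a^2, (a^\dagger)^2] = 4a^\dagger a + 2 = 4\Big(a^\dagger a + \frac{1}{2}\Big)$$
so the commutator of these two quadratic operators contains, besides the normal-ordered operator $4a^\dagger a$, a scalar term. It is the infinite-dimensional analog of such scalar terms that gives rise to the Schwinger terms and the anomaly discussed here.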
This phenomenon of new scalar terms in the commutation relations of the
operators in a quantum theory coming from a Lie algebra representation is
known as an “anomaly”, and while we have described it for the bosonic case,
much the same thing happens in the fermionic case for the Lie algebra ores .
This is normally considered to be something that happens due to quantiza-
tion, with the “anomaly” the extra scalar terms in commutation relations not
there in the corresponding classical Poisson bracket relations. From another
point of view this is a phenomenon coming not from quantization, but from
infinite-dimensionality, already visible in Poisson brackets when one makes a
choice of complex structure on the phase space. It is the occurrence for infinite-
dimensional phase spaces of certain inherently different ways of choosing the
complex structure that is relevant. We will see in later chapters that in quan-
tum field theories one often wants to choose a complex structure on an infinite
dimensional space of solutions of a classical field equation by taking positive
energy complexified solutions to be eigenvectors of the complex structure with
eigenvalue +i, those with negative energy to have eigenvalue −i, and it is this
choice of complex structure that introduces the anomaly phenomenon.
is here.
Chapter 34
Multi-particle Systems and Non-relativistic Quantum Fields
The quantum mechanical systems we have studied so far describe a finite num-
ber of degrees of freedom, which may be of a bosonic or fermionic nature. In
particular we have seen how to describe a quantized free particle moving in
three-dimensional space. By use of the notion of tensor product, we can then
describe any particular fixed number of such particles. We would, however, like
a formalism capable of conveniently describing an arbitrary number of parti-
cles. From very early on in the history of quantum mechanics, it was clear
that at least certain kinds of particles, photons, were most naturally described
not one by one, but by thinking of them as quantized excitations of a classical
system with an infinite number of degrees of freedom: the electromagnetic field.
In our modern understanding of fundamental physics not just photons, but all
elementary particles are best described in this way.
Conventional textbooks on quantum field theory often begin with relativis-
tic systems, but we’ll start instead with the non-relativistic case. We’ll study a
simple quantum field theory that extends the conventional single-particle quan-
tum systems we have dealt with so far to deal with multi-particle systems. This
version of quantum field theory is what gets used in condensed matter physics,
and is in many ways simpler than the relativistic case, which we’ll take up in a
later chapter.
Quantum field theory is a large and complicated subject, suitable for a full-
year course at an advanced level. We’ll be giving only a very basic introduc-
tion, mostly just considering free fields, which correspond to systems of non-
interacting particles. Most of the complexity of the subject only appears when
one tries to construct quantum field theories of interacting particles. A remark-
able aspect of the theory of free quantum fields is that in many ways it is little
more than something we have already discussed in great detail, the quantum
harmonic oscillator problem. However, the classical harmonic oscillator phase
space that is getting quantized in this case is an infinite dimensional one, the
space of solutions to the free particle Schrödinger equation. To describe multiple
non-interacting fermions, we just need to use fermionic oscillators.
For simplicity we'll set $\hbar = 1$ and start with the case of a single spatial dimension. We'll also begin using $x$ to denote a spatial variable, instead of the $q$ that is conventional when this is the coordinate variable in a finite-dimensional phase space.
where the superscript S means we take elements of the tensor product invariant
under the action of the group SN by permutation of the factors. We want to
consider state spaces containing an arbitrary number of particles, so we define
Definition (Bosonic Fock space, the symmetric algebra). Given a complex vec-
tor space V , the symmetric Fock space is defined as
$$F^S(V) = \mathbf{C}\oplus V\oplus (V\otimes V)^S\oplus(V\otimes V\otimes V)^S\oplus\cdots$$
where the degree $N$ term is the symmetric power
$$S^N(V) = (\underbrace{V\otimes\cdots\otimes V}_{N\ \mathrm{times}})^S$$
(recall chapter 9).
A quantum harmonic oscillator with d degrees of freedom has a state space
consisting of linear combinations of states with N “quanta” (i.e. states one gets
by applying N creation operators to the lowest energy state), for N = 0, 1, 2, . . ..
We have seen in our discussion of the quantization of the harmonic oscillator that
in the Bargmann-Fock representation, the state space is just C[z1 , z2 , . . . , zd ],
the space of polynomials in d complex variables.
The part of the state space with $N$ quanta has dimension
$$\binom{N+d-1}{N}$$
which grows with N . This is just the binomial coefficient giving the number
of d-variable monomials of degree N . The quanta of a harmonic oscillator are
indistinguishable, which corresponds to the fact that the space of states with
N quanta can be identified with the symmetric part of the tensor product of N
copies of Cd .
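As a quick check of this count (an example filled in here, not in the text): for $d = 2$ and $N = 2$ the symmetric part of $\mathbf{C}^2\otimes\mathbf{C}^2$ is spanned by $e_1\otimes e_1$, $e_2\otimes e_2$ and $e_1\otimes e_2 + e_2\otimes e_1$, matching both the three degree-two monomials $z_1^2, z_1z_2, z_2^2$ in two variables and the binomial coefficient $\binom{2+2-1}{2} = 3$.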
More precisely, what one has is
We won’t try and give a detailed proof of this here, but one can exhibit the
isomorphism explicitly on generators. If zj ∈ V ∗ are the coordinate functions
with respect to a basis ej of V (i.e. zj (ek ) = δjk ), they give a basis of V ∗ which
is also a basis of the linear polynomial functions on V . For higher degrees, one
makes the identification
$$\underbrace{z_j\otimes\cdots\otimes z_j}_{N\ \mathrm{times}} \in S^N(V^*)\ \leftrightarrow\ z_j^N$$
• As polynomial functions on H1 .
34.1.2 Fermions and the fermionic oscillator
For the case of fermionic particles, there’s an analogous Fock space construction
using tensor products:
$$F^A(V) = \mathbf{C}\oplus V\oplus(V\otimes V)^A\oplus(V\otimes V\otimes V)^A\oplus\cdots$$
where the superscript A means the subspace of the tensor product that just
changes sign under interchange of two factors. This is known to mathemati-
cians as the “exterior algebra” Λ∗ (V ), with
$$\Lambda^N(V) = (\underbrace{V\otimes\cdots\otimes V}_{N\ \mathrm{times}})^A$$
For each of these descriptions, one can define d annihilation or creation operators
aj , a†j satisfying the canonical anticommutation relations, and these will generate
an algebra of operators on the Fock space.
34.2.1 Box normalization
Recall that for a free particle in one dimension the state space H consists of
complex-valued functions on R, with observables the self-adjoint operators for
momentum
$$P = -i\frac{d}{dx}$$
and energy (the Hamiltonian)
$$H = \frac{P^2}{2m} = -\frac{1}{2m}\frac{d^2}{dx^2}$$
Eigenfunctions for both $P$ and $H$ are the functions of the form
$$\psi_p(x)\propto e^{ipx}$$
for $p\in\mathbf{R}$, with eigenvalues $p$ for $P$ and $\frac{p^2}{2m}$ for $H$.
Note that these eigenfunctions are not normalizable, and thus not in the
conventional choice of state space as L2 (R). One way to deal with this issue is
to do what physicists sometimes refer to as “putting the system in a box”, by
imposing periodic boundary conditions
ψ(x + L) = ψ(x)
This “box” normalization is one form of what physicists call an “infrared cutoff”,
a way of removing degrees of freedom that correspond to arbitrarily large sizes,
in order to make a problem well-defined. To get a well-defined problem one
starts with a fixed value of L, then one studies the limit L → ∞.
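Spelling out what the periodic boundary condition does (a standard computation, included here since the basis elements $\psi_l(x)$ are used below): the allowed momenta become the discrete set $p = \frac{2\pi l}{L}$ for $l\in\mathbf{Z}$, with normalized eigenfunctions
$$\psi_l(x) = \frac{1}{\sqrt{L}}e^{i\frac{2\pi l}{L}x},\qquad \int_0^L\overline{\psi_{l'}(x)}\psi_l(x)\,dx = \delta_{ll'}$$
(the normalization integral taken over one period).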
The number of degrees of freedom is now countable, but still infinite. In order
to get a completely well-defined problem, one typically needs to first make the
number of degrees of freedom finite. This can be done with an additional cutoff,
an “ultraviolet cutoff”, which means restricting attention to |p| ≤ Λ for some
finite $\Lambda$, or equivalently $|l| < \frac{\Lambda L}{2\pi}$. This makes the state space finite dimensional
and one then studies the Λ → ∞ limit.
For finite L and Λ our single-particle state space H1 is finite dimensional,
with orthonormal basis elements ψl (x). An arbitrary solution to the Schrödinger
equation is then given by
$$\psi(x,t) = \sum_{l=-\frac{\Lambda L}{2\pi}}^{+\frac{\Lambda L}{2\pi}}\alpha_l\, e^{i\frac{2\pi l}{L}x}e^{-i\frac{4\pi^2l^2}{2mL^2}t}$$
where the subscript $j$ indexes the possible values of the momentum $p$ (which are discretized in units of $\frac{2\pi}{L}$, and lie in the interval $[-\Lambda,\Lambda]$). The occupation number $n_{p_j}$ is the number of particles in the state with momentum $p_j$. In the bosonic case it takes values $0, 1, 2, \dots$, in the fermionic case it takes values $0$ or $1$.
The state with all occupation numbers equal to zero is denoted
$$|\cdots,0,0,0,\cdots\rangle = |0\rangle$$
• The number operator
$$\widehat{N} = \sum_k a^\dagger_{p_k}a_{p_k}$$
• The Hamiltonian
$$\widehat{H} = \sum_k \frac{p_k^2}{2m}a^\dagger_{p_k}a_{p_k}$$
which acts on occupation number states as
$$\widehat{H}|\cdots,n_{p_{j-1}},n_{p_j},n_{p_{j+1}},\cdots\rangle = \Big(\sum_k n_{p_k}\frac{p_k^2}{2m}\Big)|\cdots,n_{p_{j-1}},n_{p_j},n_{p_{j+1}},\cdots\rangle$$
With ultraviolet and infrared cutoffs in place, the possible values of pj are
finite in number, H1 is finite dimensional and this is nothing but the standard
quantized harmonic oscillator (with a Hamiltonian that has different frequencies
$$\omega(p_j) = \frac{p_j^2}{2m}$$
for different values of j). In the limit as one or both cutoffs are removed,
H1 becomes infinite dimensional, the Stone-von Neumann theorem no longer
applies, and we are in the situation discussed in chapter 33. State spaces with
different choices of vacuum state |0i can be unitarily inequivalent, with not just
the dynamics of states in the state space dependent on the Hamiltonian, but the
state space itself depending on the Hamiltonian (through the characterization
of |0i as lowest energy state). Even for the free particle, we have here defined
the Hamiltonian as the normal-ordered version, which for finite dimensional H1
differs from the non-normal-ordered one just by a constant, but as cut-offs are
removed this constant becomes infinite, requiring careful treatment of the limit.
34.2.2 Continuum normalization
A significant problem introduced by using cutoffs such as the box normalization
is that these ruin some of the space-time symmetries of the system. The one-
particle space with an infrared cutoff is a space of functions on a discrete set of
points, and this set of points will not have the same symmetries as the usual
continuous momentum space (for instance in three dimensions it will not carry
an action of the rotation group SO(3)). In our study of quantum field theory
we would like to exploit the action of space-time symmetry groups on the state
space of the theory, so need a formalism that preserves such symmetries.
In our earlier discussion of the free particle, we saw that physicists often
work with a “continuum normalization” such that
$$|p\rangle = \psi_p(x) = \frac{1}{\sqrt{2\pi}}e^{ipx},\qquad \langle p'|p\rangle = \delta(p-p')$$
where formulas such as the second one need to be interpreted in terms of dis-
tributions. In quantum field theory we want to be able to think of each value
of p as corresponding to a classical degree of freedom that gets quantized, and
this “continuum” normalization will then correspond to an uncountable number
of degrees of freedom, requiring great care when working in such a formalism.
This will however allow us to readily see the action of space-time symmetries
on the states of the quantum field theory and to exploit the duality between
position and momentum space embodied in the Fourier transform.
In the continuum normalization, an arbitrary solution to the free-particle
Schrödinger equation is given by
$$\psi(x,t) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}\alpha(p)e^{ipx}e^{-i\frac{p^2}{2m}t}\,dp$$
for some complex-valued function $\alpha(p)$ on momentum space. Such
solutions are in one-to-one correspondence with initial data
$$\psi(x,0) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}\alpha(p)e^{ipx}\,dp$$
This is exactly the Fourier inversion formula, expressing a function $\psi(x,0)$ in terms of its Fourier transform $\widetilde{\psi(x,0)}(p) = \alpha(p)$. Note that we want to con-
sider not just square integrable functions α(p), but non-integrable functions
like α(p) = 1 (which corresponds to ψ(x, 0) = δ(x)), and distributions such as
α(p) = δ(p), which corresponds to ψ(x, 0) = 1.
We will generally work with this continuum normalization, taking as our
single-particle space H1 the space of complex valued functions ψ(x, 0) on R. One
can think of the $|p\rangle$ as an orthonormal basis of $\mathcal{H}_1$, with $\alpha(p)$ the coordinate function for the $|p\rangle$ basis vector. $\alpha(p)$ is then an element of $\mathcal{H}_1^*$, the linear function on $\mathcal{H}_1$ given by taking the coefficient of the $|p\rangle$ basis vector.
Quantization should take α(p) ∈ H1∗ to corresponding annihilation and cre-
ation operators a(p), a† (p). Such operators though need to be thought of as
operator-valued distributions: what is really a well-defined operator is not $a(p)$, but
$$\int_{-\infty}^{+\infty}f(p)a(p)\,dp$$
for sufficiently well-behaved functions $f(p)$. From the point of view of quantization of $\mathcal{H}_1^*$, it is vectors in $\mathcal{H}_1^*$ of the form
$$\alpha(f) = \int_{-\infty}^{+\infty}f(p)\alpha(p)\,dp$$
that get quantized, and $\alpha(f)$ is only defined for some particularly well-behaved $f$ (and in particular is not defined for the non-integrable choice of $f = 1$).
Definition (Quantum field operator). The quantum field operators for the free
particle system are
$$\widehat{\psi}(x) = \sum_p\psi_p(x)a_p = \sum_p\frac{1}{\sqrt{L}}e^{ipx}a_p$$
Note that these are not self-adjoint operators, and thus not themselves ob-
servables. To get some idea of their behavior, one can calculate what they do
to the vacuum state. One has
$$\widehat{\psi}(x)|0\rangle = 0$$
and
$$\widehat{\psi}^\dagger(x)|0\rangle = \frac{1}{\sqrt{L}}\sum_p e^{-ipx}|\cdots,0,n_p = 1,0,\cdots\rangle$$
While this sum makes sense as long as it is finite, when cutoffs are removed it is
clear that ψ̂ † (x) will have a rather singular limit as an infinite sum of operators.
It can be in some vague sense thought of as the (ill-defined) operator that creates
a particle localized precisely at x.
The field operators allow one to recover conventional wavefunctions, for sin-
gle and multiple-particle states. One sees by orthonormality of the occupation
number basis states that
$$\langle\cdots,0,n_p=1,0,\cdots|\widehat{\psi}^\dagger(x)|0\rangle = \frac{1}{\sqrt{L}}e^{-ipx} = \overline{\psi_p(x)}$$
the complex conjugate wavefunction of the single-particle state of momentum
p. An arbitrary one particle state |Ψ1 i with wavefunction ψ(x) is a linear
combination of such states, and taking complex conjugates one finds
$$\langle 0|\widehat{\psi}(x)|\Psi_1\rangle = \psi(x)$$
Similarly, for a two-particle state of identical particles with momenta pj1 and
pj2 one finds
$$\langle 0|\widehat{\psi}(x_1)\widehat{\psi}(x_2)|\cdots,0,n_{p_{j_1}}=1,0,\cdots,0,n_{p_{j_2}}=1,0,\cdots\rangle = \psi_{p_{j_1},p_{j_2}}(x_1,x_2)$$
where $\psi_{p_{j_1},p_{j_2}}(x_1,x_2)$
is the wavefunction (symmetric under interchange of x1 and x2 for bosons) for
this two particle state. For a general two-particle state |Ψ2 i with wavefunction
$\psi(x_1,x_2)$ one has
$$\langle 0|\widehat{\psi}(x_1)\widehat{\psi}(x_2)|\Psi_2\rangle = \psi(x_1,x_2)$$
and one can easily generalize this to see how field operators are related to
wavefunctions for an arbitrary number of particles.
Cutoffs ruin translational invariance and calculations with them quickly be-
come difficult. We’ll now adopt the physicist’s convention of working directly
in the continuous case with no cutoff, at the price of having formulas that only
make sense as distributions. One needs to be aware that the correct interpreta-
tion of such formulas may require going back to the cutoff version.
In the continuum normalization we take as normalized eigenfunctions for the
free particle
$$|p\rangle = \psi_p(x) = \frac{1}{\sqrt{2\pi}}e^{ipx}$$
with
$$\langle p'|p\rangle = \frac{1}{2\pi}\int_{-\infty}^{\infty}e^{i(p-p')x}\,dx = \delta(p-p')$$
$$[\widehat{\psi}(x),\widehat{\psi}(x')] = [\widehat{\psi}^\dagger(x),\widehat{\psi}^\dagger(x')] = 0$$
and
$$[\widehat{\psi}(x),\widehat{\psi}^\dagger(x')] = \frac{1}{2\pi}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}e^{ipx}e^{-ip'x'}[a(p),a^\dagger(p')]\,dp\,dp'$$
$$= \frac{1}{2\pi}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}e^{ipx}e^{-ip'x'}\delta(p-p')\,dp\,dp'$$
$$= \frac{1}{2\pi}\int_{-\infty}^{\infty}e^{ip(x-x')}\,dp$$
$$= \delta(x-x')$$
and the commutator relation above means
$$[\widehat{\psi}(f),\widehat{\psi}^\dagger(g)] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}f(x)g(x')\delta(x-x')\,dx\,dx' = \int_{-\infty}^{\infty}f(x)g(x)\,dx$$
There are observables that one can define simply using field operators. These
include:
• The number operator N̂ . One can define a number density operator
• The total momentum operator $\widehat{P}$. This can be defined in terms of field operators as
$$\widehat{P} = \int_{-\infty}^{\infty}\widehat{\psi}^\dagger(x)\Big(-i\frac{d}{dx}\Big)\widehat{\psi}(x)\,dx$$
$$= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\frac{1}{\sqrt{2\pi}}e^{-ip'x}a^\dagger(p')(-i)(ip)\frac{1}{\sqrt{2\pi}}e^{ipx}a(p)\,dp\,dp'\,dx$$
$$= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\delta(p-p')\,p\,a^\dagger(p')a(p)\,dp\,dp'$$
$$= \int_{-\infty}^{\infty}p\,a^\dagger(p)a(p)\,dp$$
• The Hamiltonian $\widehat{H}$. This can be defined much like the momentum, just changing
$$-i\frac{d}{dx}\ \rightarrow\ -\frac{1}{2m}\frac{d^2}{dx^2}$$
to find
$$\widehat{H} = \int_{-\infty}^{\infty}\widehat{\psi}^\dagger(x)\Big(-\frac{1}{2m}\frac{d^2}{dx^2}\Big)\widehat{\psi}(x)\,dx = \int_{-\infty}^{\infty}\frac{p^2}{2m}a^\dagger(p)a(p)\,dp$$
We will see that one can more generally use quadratic expressions in field operators to define an observable $\widehat{O}$ corresponding to a one-particle quantum mechanical observable $O$ by
$$\widehat{O} = \int_{-\infty}^{\infty}\widehat{\psi}^\dagger(x)\,O\,\widehat{\psi}(x)\,dx$$
For Hamiltonians just quadratic in the quantum fields, quantum field the-
ories are quite tractable objects. They are in some sense just free quantum
oscillator systems, with all of their symmetry structure intact, but taking the
number of degrees of freedom to infinity. Higher order terms though make
quantum field theory a difficult and complicated subject, one that requires a
year-long graduate level course to master basic computational techniques, and
one that to this day resists mathematicians' attempts to prove that many ex-
amples of such theories have even the basic expected properties. In the theory
of charged particles interacting with an electromagnetic field, when the electro-
magnetic field is treated classically one still has a Hamiltonian quadratic in the
field operators for the particles. But if the electromagnetic field is treated as a
quantum system, it acquires its own field operators, and the Hamiltonian is no
longer quadratic in the fields, a vastly more complicated situation described as
an "interacting quantum field theory".
Even if one restricts attention to the quantum fields describing one kind of
particles, there may be interactions between particles that add terms to the
Hamiltonian, and these will be higher order than quadratic. For instance, if
there is an interaction between such particles described by an interaction energy
v(y − x), this can be described by adding the following quartic term to the
Hamiltonian
$$\frac{1}{2}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\widehat{\psi}^\dagger(x)\widehat{\psi}^\dagger(y)\,v(y-x)\,\widehat{\psi}(y)\widehat{\psi}(x)\,dx\,dy$$
More remarkably, one can also very easily write down theories of quantum
systems with an arbitrary number of fermionic particles, just by changing com-
mutators to anticommutators for the creation-annihilation operators and using
fermionic instead of bosonic oscillators. One gets fermionic fields that satisfy
anticommutation relations
$$[\widehat{\psi}(x),\widehat{\psi}^\dagger(x')]_+ = \delta(x-x')$$
Chapter 35
In finite dimensions, we saw that we could think of phase space M as the space
parametrizing solutions to the equations of motion of a classical system, that
linear functions on this space carried the structure of a Lie algebra (the Heisen-
berg Lie algebra), and that quantization was given by finding an irreducible
unitary representation of this Lie algebra.
Instead of motivating the definition of quantum fields by starting with an-
nihilation and creation operators for a free particle of fixed momentum, one
can more simply just define them as what one gets by taking the space H1 of
solutions of the free single particle Schrödinger equation as a classical phase
space, and quantizing to get a unitary representation (of a Heisenberg algebra
that is now infinite-dimensional). This procedure is sometimes called “second
quantization”, with “first quantization” what was done when one started with
the classical phase space for a single particle and quantized to get the space H1
of wavefunctions.
In this chapter we’ll consider the properties of quantum fields from this point
of view, including seeing how quantization of classical Hamiltonian dynamics
gives the dynamics of quantum fields.
wavefunction at t = 0. Note that this is unlike typical finite-dimensional classical
mechanical systems, where the equation of motion is second-order in time, with
solutions determined by two pieces of initial-value data, the coordinates and
momenta (since one needs initial velocities as well as positions). Taking M = H1
as a classical phase space, it has the property that there is no natural splitting of
coordinates into position-like variables and momentum-like variables, and thus
no natural way of setting up an infinite-dimensional Schrödinger representation
where states would be functionals of position-like variables.
On the other hand, since wavefunctions are complex valued, H1 is already
a complex vector space, and we can quantize by the Bargmann-Fock method
using this complex structure. This is quite unlike our previous examples of
quantization, where we started with a real phase space and needed to choose a
complex structure (to get annihilation and creation operators).
Using Fourier transforms we can think of $\mathcal{H}_1$ either as a space of functions $\psi$ of position $x$, or as a space of functions $\widetilde{\psi}$ of momentum $p$. This corresponds to two possible choices of orthonormal bases of the function space $\mathcal{H}_1$: the $|p\rangle$ (plane waves of momentum $p$) or the $|x\rangle$ (delta-functions at position $x$). In
the finite dimensional case it is the coordinate functions $q_j, p_j$ on phase space, which lie in the dual phase space $\mathcal{M} = M^*$, that get mapped to operators $Q_j, P_j$ under quantization. Here what corresponds to the $q_j, p_j$ is either the $\alpha(p)$ (coordinates with respect to the $|p\rangle$ basis), which quantize to annihilation operators $a(p)$, or the $\psi(x)$ (field value at $x$, coordinates with respect to the $|x\rangle$ basis), which quantize to field operators $\widehat{\psi}(x)$.
As described in chapter 34 though, what is really well-defined is not the
quantization of ψ(x), but of ψ(f ) for some class of functions f . ψ(x) is the
linear function on the space of solutions H1 given by
$$\psi(x)\ :\ \psi\in\mathcal{H}_1\rightarrow\psi(x,0)$$
but to get a well-defined operator, one wants the quantization not of this, but
of elements of H1∗ of the form
$$\psi(f)\ :\ \psi\in\mathcal{H}_1\rightarrow\int_{-\infty}^{\infty}f(x)\psi(x,0)\,dx$$
To get all elements of $\mathcal{H}_1^*$ we will also need the complex conjugates $\overline{\psi}(x)$ and
$$\overline{\psi}(g)\ :\ \psi\in\mathcal{H}_1\rightarrow\int_{-\infty}^{\infty}g(x)\overline{\psi(x,0)}\,dx$$
Despite the potential for confusion, we will write ψ(x) for the distribution given
by evaluation at x, which corresponds to taking f to be the delta-function
δ(x − x0 ). This convenient notational choice means that one needs to be aware
that ψ(x) may be a complex number, or may be the “evaluation at x” linear
function on H1 .
So quantum fields should be “operator-valued distributions” and the proper
mathematical treatment of this situation becomes quite challenging, with one
class of problems coming from the theory of distributions. What class of func-
tions should appear in the space H1 ? What class of linear functionals on this
space should be used? What properties should the operators $\widehat{\psi}(f)$ satisfy?
These issues are far beyond what we can discuss here, and they are not purely
mathematical, with the fact that the product of two distributions does not have
an unambiguous sense one indication of the difficulties of quantum field theory.
To understand the Poisson bracket structure on functions on H1 , one should
recall that in the Bargmann-Fock quantization we found that, choosing a com-
plex structure and complex coordinates zj on phase space, the non-zero Poisson
bracket relations were
$$\{z_j, i\overline{z}_l\} = \delta_{jl}$$
If we use the $|p\rangle$ basis for $\mathcal{H}_1$, our complex coordinates will be $\alpha(p), \overline{\alpha}(p)$, with Poisson bracket
$$\{\alpha(p), i\overline{\alpha}(p')\} = \delta(p-p')$$
Using instead the $|x\rangle$ basis for $\mathcal{H}_1$, our complex coordinates will be $\psi(x), \overline{\psi}(x)$, with Poisson brackets
$$\{\psi(x), i\overline{\psi}(x')\} = \delta(x-x')$$
Quantization then takes
$$\psi(x)\rightarrow -i\widehat{\psi}(x),\qquad \overline{\psi}(x)\rightarrow -i\widehat{\psi}^\dagger(x),\qquad 1\rightarrow -i\mathbf{1}$$
This gives a Heisenberg Lie algebra representation $\Gamma'$ (unitary on the real and imaginary parts of $\psi(x)$) with commutator relations
$$[\widehat{\psi}(x),\widehat{\psi}(x')] = [\widehat{\psi}^\dagger(x),\widehat{\psi}^\dagger(x')] = 0,\qquad [\widehat{\psi}(x),\widehat{\psi}^\dagger(x')] = \delta(x-x')\mathbf{1}$$
of $\mathcal{H}_1^*$, with Poisson bracket relations
$$\{\psi(f_1)+\overline{\psi}(g_1),\ \psi(f_2)+\overline{\psi}(g_2)\} = -i\int_{-\infty}^{\infty}(f_1(x)g_2(x) - f_2(x)g_1(x))\,dx$$
Here the right-hand side of the equation is the symplectic form Ω for the dual
phase space M = H1∗ , and this should be thought of as an infinite-dimensional
version of formula ??. This is the Lie bracket relation for an infinite-dimensional
Heisenberg Lie algebra, with quantization giving a representation of the Lie
algebra, with commutation relations for field operators
$$[\widehat{\psi}(f_1)+\widehat{\psi}^\dagger(g_1),\ \widehat{\psi}(f_2)+\widehat{\psi}^\dagger(g_2)] = \int_{-\infty}^{\infty}(f_1(x)g_2(x)-f_2(x)g_1(x))\,dx\cdot\mathbf{1}$$
Pretty much exactly the same formalism works to describe fermions, with the
same H1 and the same choice of bases. The only difference is that the coordinate
functions are now taken to be anticommuting, satisfying the fermionic Poisson
bracket relations of a super Lie algebra rather than a Lie algebra. After quantiza-
tion, the fields $\widehat{\psi}(x)$, $\widehat{\psi}^\dagger(x)$ or the annihilation and creation operators $a(p)$, $a^\dagger(p)$
satisfy anticommutation relations and generate an infinite-dimensional Clifford
algebra, rather than the Weyl algebra of the bosonic case.
U (t) = e−iHt
that determines how states (in the Schrödinger picture) evolve under time trans-
lation. In the Heisenberg picture states stay the same and operators evolve, with
their time evolution given by
is in the quantum field operators) than the Schrödinger picture (in which the
time dependence is in the states). This is especially true in relativistic systems
where one wants to as much as possible treat space and time on the same foot-
ing. It is however also true in non-relativistic cases due to the complexity of
the description of the states (inherent since one is trying to describe arbitrary
numbers of particles) versus the description of the operators, which are built
simply out of the quantum fields.
The classical phase space to be quantized is the space H1 of solutions of
the free particle Schrödinger equation, parametrized by the initial data of a
complex-valued wavefunction $\psi(x,0)\equiv\psi(x)$, with Poisson bracket
$$\{\psi(x), i\overline{\psi}(x')\} = \delta(x-x')$$
Time translation on this space is given by the Schrödinger equation, which says
that wavefunctions will evolve with time dependence given by
$$\frac{\partial}{\partial t}\psi(x,t) = \frac{i}{2m}\frac{\partial^2}{\partial x^2}\psi(x,t)$$
If we take our Hamiltonian function on H1 to be
$$h = \int_{-\infty}^{+\infty}\overline{\psi}(x)\,\frac{-1}{2m}\frac{\partial^2}{\partial x^2}\psi(x)\,dx$$
then we will get the single-particle Schrödinger equation from the Hamiltonian
dynamics, since
$$\frac{\partial}{\partial t}\psi(x,t) = \{\psi(x,t), h\}$$
$$= \Big\{\psi(x,t),\ \int_{-\infty}^{+\infty}\overline{\psi}(x',t)\frac{-1}{2m}\frac{\partial^2}{\partial x'^2}\psi(x',t)\,dx'\Big\}$$
$$= \frac{-1}{2m}\int_{-\infty}^{+\infty}\Big(\{\psi(x,t),\overline{\psi}(x',t)\}\frac{\partial^2}{\partial x'^2}\psi(x',t) + \overline{\psi}(x',t)\frac{\partial^2}{\partial x'^2}\{\psi(x,t),\psi(x',t)\}\Big)dx'$$
$$= \frac{-1}{2m}\int_{-\infty}^{+\infty}\Big(-i\delta(x-x')\frac{\partial^2}{\partial x'^2}\psi(x',t) + \overline{\psi}(x',t)\frac{\partial^2}{\partial x'^2}\{\psi(x,t),\psi(x',t)\}\Big)dx'$$
$$= \frac{i}{2m}\frac{\partial^2}{\partial x^2}\psi(x,t)$$
Here we have used the derivation property of the Poisson bracket and the linearity of the operator $\frac{\partial^2}{\partial x'^2}$.
Note that there are other forms of the same Hamiltonian function, related
to the one we chose by integration by parts. One has
$$\overline{\psi}(x)\frac{d^2}{dx^2}\psi(x) = \frac{d}{dx}\Big(\overline{\psi}(x)\frac{d}{dx}\psi(x)\Big) - \Big|\frac{d}{dx}\psi(x)\Big|^2$$
$$= \frac{d}{dx}\Big(\overline{\psi}(x)\frac{d}{dx}\psi(x) - \Big(\frac{d}{dx}\overline{\psi}(x)\Big)\psi(x)\Big) + \Big(\frac{d^2}{dx^2}\overline{\psi}(x)\Big)\psi(x)$$
so neglecting integrals of derivatives (assuming boundary terms go to zero at
infinity), one could have used
$$h = \frac{1}{2m}\int_{-\infty}^{+\infty}\Big|\frac{d}{dx}\psi(x)\Big|^2dx\qquad\text{or}\qquad h = \frac{-1}{2m}\int_{-\infty}^{+\infty}\Big(\frac{d^2}{dx^2}\overline{\psi}(x)\Big)\psi(x)\,dx$$
Instead of working with position space fields ψ(x, t) we could work with
their momentum space components. Recall that we can write solutions to the
Schrödinger equation as
$$\psi(x,t) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}\alpha(p,t)e^{ipx}\,dp$$
where
$$\alpha(p,t) = \alpha(p)e^{-i\frac{p^2}{2m}t}$$
Using these as our coordinates on $\mathcal{H}_1$, dynamics is given by
$$\frac{\partial}{\partial t}\alpha(p,t) = \{\alpha(p,t), h\} = -i\frac{p^2}{2m}\alpha(p,t)$$
and one can easily see that one can choose
$$h = \int_{-\infty}^{\infty}\frac{p^2}{2m}|\alpha(p)|^2\,dp$$
The field operator $\widehat{\psi}(x,t)$ satisfies the Schrödinger equation, which now appears as a differential equation for operators rather than for wavefunctions. One can explicitly solve such a differential equation just as for wavefunctions, by Fourier transforming and turning differentiation into multiplication. If the operator $\widehat{\psi}(x,t)$ is related to the operator $a(p,t)$ by
$$\widehat{\psi}(x,t) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}e^{ipx}a(p,t)\,dp$$
then the Schrödinger equation for the $a(p,t)$ will be
$$\frac{\partial}{\partial t}a(p,t) = \frac{-ip^2}{2m}a(p,t)$$
with solution
$$a(p,t) = e^{-i\frac{p^2}{2m}t}a(p,0)$$
The solution for the field will then be
$$\widehat{\psi}(x,t) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}e^{ipx}e^{-i\frac{p^2}{2m}t}a(p)\,dp$$
where the operators a(p) ≡ a(p, 0) are the initial values.
We will not enter into the important topic of how to compute observables
in quantum field theory that can be connected to experimentally important
quantities such as scattering cross-sections. A crucial role in such calculations
is played by the following observables:
Definition (Green's function or propagator). The Green's function or propagator for a quantum field theory is the amplitude, for $t > t'$,
$$G(x,t,x',t') = \langle 0|\widehat{\psi}(x,t)\widehat{\psi}^\dagger(x',t')|0\rangle$$
The physical interpretation of these functions is that they describe the amplitude for a process in which a one-particle state localized at $x'$ is created at time $t'$, propagates for a time $t - t'$, and then its wavefunction is compared to that of a one-particle state localized at $x$. Using the solution for the time-dependent
field operator given earlier we find
$$G(x,t,x',t') = \frac{1}{2\pi}\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty}\langle 0|e^{ipx}e^{-i\frac{p^2}{2m}t}a(p)\,e^{-ip'x'}e^{i\frac{p'^2}{2m}t'}a^\dagger(p')|0\rangle\,dp\,dp'$$
$$= \frac{1}{2\pi}\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty}e^{ipx}e^{-i\frac{p^2}{2m}t}e^{-ip'x'}e^{i\frac{p'^2}{2m}t'}\delta(p-p')\,dp\,dp'$$
$$= \frac{1}{2\pi}\int_{-\infty}^{+\infty}e^{ip(x-x')}e^{-i\frac{p^2}{2m}(t-t')}\,dp$$
and that
$$\lim_{t\to t'}G(x,t,x',t') = \delta(x-x')$$
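For reference (a standard Gaussian/Fresnel integral, not evaluated in the text above), the remaining $p$ integral can be done in closed form, giving
$$G(x,t,x',t') = \sqrt{\frac{m}{2\pi i(t-t')}}\;e^{i\frac{m(x-x')^2}{2(t-t')}}$$
which is the familiar free-particle propagator (recall that we have set $\hbar = 1$).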
While we have worked purely in the Hamiltonian formalism, one could in-
stead start with a Lagrangian for this system. A Lagrangian that will give the
Schrödinger equation as an Euler-Lagrange equation is
$$L = i\overline{\psi}\frac{\partial}{\partial t}\psi - h = i\overline{\psi}\frac{\partial}{\partial t}\psi + \frac{1}{2m}\overline{\psi}\frac{\partial^2}{\partial x^2}\psi$$
or, using integration by parts to get an alternate form of $h$ mentioned earlier,
$$L = i\overline{\psi}\frac{\partial}{\partial t}\psi - \frac{1}{2m}\Big|\frac{\partial}{\partial x}\psi\Big|^2$$
If one tries to define a canonical momentum for $\psi$ as $\frac{\partial L}{\partial\dot\psi}$, one just gets $i\overline{\psi}$. This justifies the Poisson bracket relation
$$\{\psi(x), i\overline{\psi}(x')\} = \delta(x-x')$$
but, as expected for a case where the equation of motion is first-order in time,
such a canonical momentum is not independent of ψ and the space of the wave-
functions ψ is already a phase space. One could try and quantize this system
by path integral methods, for instance computing the propagator by doing the
integral
$$\int D\gamma\ e^{\frac{i}{\hbar}S[\gamma]}$$
over paths $\gamma$ from $(x,t)$ to $(x',t')$. However one needs to keep in mind the
warnings given earlier about path integrals over phase space, since that is what
one has here.
Chapter 36
Symmetries and Non-relativistic Quantum Fields
In our study of the harmonic oscillator (chapter 22) we found that the sym-
metries of the system could be studied using quadratic functions on the phase
space. Classically these gave a Lie algebra under the Poisson bracket, and quan-
tization provided a unitary representation Γ′ of the Lie algebra, with quadratic
functions becoming quadratic operators. In the case of fields, the same pattern
holds, with the phase space now an infinite dimensional space, the single particle
Hilbert space H1 . Certain specific quadratic functions of the fields will provide
a Lie algebra under the Poisson bracket, with quantization then providing a
unitary representation of the Lie algebra in terms of quadratic field operators.
In chapter 35 we saw how this works for time translation symmetry, which
determines the dynamics of the theory. For the case of a free particle, the
field theory Hamiltonian is a quadratic function of the fields, providing a basic
example of how such functions generate a unitary representation on the states
of the quantum theory by use of a quadratic combination of the quantum field
operators. In this chapter we will see how other group actions on the space
H1 also lead to quadratic operators and unitary transformations on the full
quantum field theory. We would like to find a formula for these, something
that will be simplest to do in the case that the group acts on phase space as
unitary transformations, preserving the complex structure used in Bargmann-
Fock quantization.
complex values of the function. Such a group action that acts trivially on the
spatial coordinates but non-trivially on the values of ψ(x) is called an “internal
symmetry”. If the fields ψ have multiple components, taking values in Cm ,
there will be a unitary action of the larger group U (m).
$$z\rightarrow e^{i\theta}z,\qquad \overline{z}\rightarrow e^{-i\theta}\overline{z}$$
•
$$\overline{z}z\rightarrow -\frac{i}{2}(a^\dagger a + aa^\dagger)$$
This will have eigenvalues $-i(n+\frac{1}{2})$, $n = 0, 1, 2, \dots$
•
$$\overline{z}z\rightarrow -ia^\dagger a$$
This is the normal-ordered form, with eigenvalues $-in$.
In both cases we have
$$[N, a] = -a,\qquad [N, a^\dagger] = a^\dagger$$
so
$$e^{-i\theta N}ae^{i\theta N} = e^{i\theta}a,\qquad e^{-i\theta N}a^\dagger e^{i\theta N} = e^{-i\theta}a^\dagger$$
Either choice of $N$ will give the same action on operators. However, on states only the normal-ordered one will have the desirable feature that
Since we now want to treat fields, adding together an infinite number of such oscillator degrees of freedom, we will need the normal-ordered version in order to not get $\infty\cdot\frac{1}{2}$ as the number eigenvalue for the vacuum state.
In momentum space, we simply do the above for each value of p and sum,
getting
$$\widehat{N} = \int_{-\infty}^{+\infty}a^\dagger(p)a(p)\,dp$$
where one needs to keep in mind that this is really an operator valued distribu-
tion, which must be integrated against some weighting function on momentum
space to get a well-defined operator. What really makes sense is
$$\widehat{N}(f) = \int_{-\infty}^{+\infty}a^\dagger(p)a(p)f(p)\,dp$$
with the Fourier transform relating the two formulas for $\widehat{N}$. $\widehat{\psi}^\dagger(x)\widehat{\psi}(x)$ is also an operator valued distribution, with the interpretation of measuring the number density at $x$.
On field operators, $\widehat{N}$ satisfies
$$[\widehat{N},\widehat{\psi}] = -\widehat{\psi},\qquad [\widehat{N},\widehat{\psi}^\dagger] = \widehat{\psi}^\dagger$$
which are the quantized versions of the U(1) action on the phase space coordinates
$$\psi(x)\rightarrow e^{i\theta}\psi(x),\qquad \overline{\psi}(x)\rightarrow e^{-i\theta}\overline{\psi}(x)$$
that we began our discussion with.
An important property of $\widehat{N}$ that can be straightforwardly checked is that
$$[\widehat{N},\widehat{H}] = \Big[\widehat{N},\ \int_{-\infty}^{+\infty}\widehat{\psi}^\dagger(x)\frac{-1}{2m}\frac{\partial^2}{\partial x^2}\widehat{\psi}(x)\,dx\Big] = 0$$
This implies that particle number is a conserved quantity: if we start out with
a state with a definite particle number, this will remain constant. Note that the
origin of this conservation law comes from the fact that N b is the quantized gen-
erator of the U (1) symmetry of phase transformations on complex-valued fields
ψ. If we start with any Hamiltonian function h on H1 that is invariant under the
U(1) (i.e., built out of terms with an equal number of $\psi$s and $\overline{\psi}$s), then for such a theory $\widehat{N}$ will commute with $\widehat{H}$ and particle number will be conserved. Note
though that one needs to take some care with arguments like this, which assume
that symmetries of the classical phase space give rise to unitary representations
in the quantum theory. The need to normal-order operator products, working
with operators that differ from the most straightforward quantization by an in-
finite constant, can cause a failure of symmetries to be realized as expected in
the quantum theory, a phenomenon known as an “anomaly” in the symmetry.
In quantum field theories, due to the infinite number of degrees of freedom,
the Stone-von Neumann theorem does not apply, and one can have unitarily
inequivalent representations of the algebra generated by the field operators,
leading to new kinds of behavior not seen in finite dimensional quantum systems.
In particular, one can have a space of states where the lowest energy state |0i
does not have the property
In this case, the vacuum state is not an eigenstate of $\widehat{N}$ so does not have a well-defined particle number. If $[\widehat{N},\widehat{H}] = 0$, the states $|\theta\rangle$ will all have the same energy as $|0\rangle$ and there will be a multiplicity of different vacuum states, labeled by $\theta$. In such a case the U(1) symmetry is said to be "spontaneously broken".
This phenomenon occurs when non-relativistic quantum field theory is used to
describe a superconductor. There the lowest energy state will be a state without
a definite particle number, with electrons pairing up in a way that allows them
to lower their energy, “condensing” in the lowest energy state.
theories with larger internal symmetry groups than U (1). Taking as Hamilto-
nian function
$$h = \int_{-\infty}^{+\infty}\sum_{j=1}^{m}\overline{\psi}_j(x)\,\frac{-1}{2m}\frac{\partial^2}{\partial x^2}\psi_j(x)\,dx$$
one can see that this will be invariant not just under U (1) phase transformations,
but also under transformations
$$\begin{pmatrix}\psi_1\\ \psi_2\\ \vdots\\ \psi_m\end{pmatrix}\rightarrow U\begin{pmatrix}\psi_1\\ \psi_2\\ \vdots\\ \psi_m\end{pmatrix}$$
where U is an m by m unitary matrix. The Poisson brackets will be
$$\{\psi_j(x), \overline{\psi}_k(x')\} = -i\delta(x-x')\delta_{jk}$$
and are also invariant under such transformations by U ∈ U (m).
As in the U(1) case, one can begin by considering the case of one particular value of $p$ or of $x$, for which the phase space is $\mathbf{C}^m$, with coordinates $z_j, \overline{z}_j$. As we saw in section 23.1, the $m^2$ quadratic combinations $z_j\overline{z}_k$ for $j = 1,\dots,m$, $k = 1,\dots,m$ will generalize the role played by $\overline{z}z$ in the $m = 1$ case, with their Poisson bracket relations exactly the Lie bracket relations of the Lie algebra $\mathfrak{u}(m)$ (or, considering all complex linear combinations, $\mathfrak{gl}(m,\mathbf{C})$).
After quantization, these quadratic combinations become quadratic combi-
nations in annihilation and creation operators aj , a†j satisfying
So, for each $X$ in the Lie algebra $\mathfrak{gl}(m,\mathbf{C})$, quantization will give us a representation of $\mathfrak{gl}(m,\mathbf{C})$ where $X$ acts as the operator
$$\sum_{j,k=1}^{m}a^\dagger_j X_{jk}a_k$$
When the matrices $X$ are chosen to be skew-adjoint ($X_{jk} = -\overline{X_{kj}}$) this construction will give us a unitary representation of $\mathfrak{u}(m)$.
As in the U (1) case, one gets an operator in the quantum field theory just by
summing over either the a(p) in momentum space, or the fields in configuration
space, finding for each $X\in\mathfrak{u}(m)$ an operator
$$\widehat{X} = \int_{-\infty}^{+\infty}\sum_{j,k=1}^{m}\widehat{\psi}^\dagger_j(x)X_{jk}\widehat{\psi}_k(x)\,dx$$
that provides a representation of u(m) and U (m) on the quantum field theory
state space. This representation takes
$$e^X\in U(m)\ \rightarrow\ U(e^X) = e^{\widehat{X}} = e^{\int_{-\infty}^{+\infty}\sum_{j,k=1}^{m}\widehat{\psi}^\dagger_j(x)X_{jk}\widehat{\psi}_k(x)\,dx}$$
$$[\widehat{X},\widehat{H}] = 0$$
In this case, if |0i is invariant under the U (m) symmetry, then energy eigenstates
of the quantum field theory will break up into irreducible representations of
U (m) and can be labeled accordingly. As in the U (1) case, the U (m) symmetry
may be spontaneously broken, with
$$\widehat{X}|0\rangle \neq 0$$
for some directions X in u(m). When this happens, just as in the U (1) case
states did not have well-defined particle number, now they will not carry well-
defined irreducible U (m) representation labels.
x → Rx + a
Recall that this is not an irreducible representation of E(3), but one can get an irreducible representation by taking distributional wavefunctions $\widetilde{\psi}_E$ with support on the sphere $|p|^2 = 2mE$.
For the case of two-component wavefunctions $\psi = \begin{pmatrix}\psi_1\\ \psi_2\end{pmatrix}$ satisfying the Pauli equation (see chapter 31), one has to use the double cover of E(3), with elements $(a,\Omega)$, $\Omega\in SU(2)$, and on these the action is
and
$$\widetilde{\psi}(p)\rightarrow \widetilde{u}(a,\Omega)\widetilde{\psi}(p) = e^{-ia\cdot R^{-1}p}\,\Omega\,\widetilde{\psi}(R^{-1}p)$$
It is the last of these that we want to understand here, and as usual for
quantum field theory, we don’t want to try and explicitly construct the state
space H and see the E(3) action on that construction, but instead want to use
the analog of the Heisenberg picture in the time-translation case, taking the
group to act on operators. For each (a, R) ∈ E(3) we want to find operators
U (a, R) that will be built out of the field operators, and act on the field operators
as
$$\widehat{\psi}(x)\rightarrow U(a,R)\,\widehat{\psi}(x)\,U(a,R)^{-1} = \widehat{\psi}(Rx+a)\qquad (36.1)$$
Note that here the way the group acts on the argument of the operator-valued distribution is opposite to the way that it acts on the argument of a solution in $\mathcal{H}_1$. This is because $\widehat{\psi}(x)$ is an operator associated not to an element of $\mathcal{H}_1$, but to a distribution on this space, in particular the distribution $\psi(x)$, here meaning "evaluation of the solution $\psi$ at $x$". The group will act oppositely on
such linear functions on H1 to its action on elements of H1 . For a more general
distribution of the form
$$\psi(f) = \int_{\mathbf{R}^3}f(x)\psi(x)\,d^3x$$
and on $\psi(f)$ by
$$\psi(f)\rightarrow (a,R)\cdot\psi(f) = \int_{\mathbf{R}^3}f(R^{-1}(x-a))\psi(x)\,d^3x$$
$$U(a,\mathbf{1}) = e^{-ia\cdot\widehat{P}}$$
after exponentiation. Note that these are not the momentum operators P that
act on H1 , but are operators in the quantum field theory that will be built out
of quadratic combinations of the field operators. By equation 36.1 we want
Note that these constructions are infinite-dimensional examples of theorem
23.2 which showed how to take an action of the unitary group on phase space
(preserving Ω) and produce a representation of this group on the state space
of the quantum theory. In our study of quantum field theory, we will be con-
tinually exploiting this construction, for groups acting unitarily on the infinite-
dimensional phase space H1 of solutions of some linear field equations.
36.3 Fermions
Everything that was done in this chapter carries over straightforwardly to the
case of a fermionic non-relativistic quantum field theory of free particles. Field
operators will in this case generate an infinite-dimensional Clifford algebra and
the quantum state space will be an infinite-dimensional version of the spinor
representation. All the symmetries considered in this chapter also appear in the
fermionic case, and have Lie algebra representations constructed using quadratic
combinations of the field operators in just the same way as in the bosonic case.
In section 28.3 we saw in finite dimensions how unitary group actions on the
fermionic phase space gave a unitary representation on the fermionic oscillator
space, by the same method of annihilation and creation operators as in the
bosonic case. The construction of the Lie algebra representation operators in
the fermionic case is an infinite-dimensional example of that method.
Chapter 37
For the case of non-relativistic quantum mechanics, we saw that systems with
an arbitrary number of particles, bosons or fermions, could be described by
taking as the Hamiltonian phase space the state space H1 of the single-particle
quantum theory (e.g. the space of complex-valued wavefunctions on R3 in the
bosonic case). This phase space is infinite-dimensional, but it is linear and it
can be quantized using the same techniques that work for the finite-dimensional
harmonic oscillator. This is an example of a quantum field theory since it is a
space of functions (fields, to physicists) that is being quantized.
We would like to find some similar way to proceed for the case of rela-
tivistic systems, finding relativistic quantum field theories capable of describ-
ing arbitrary numbers of particles, with the energy-momentum relationship
$E^2 = |\mathbf{p}|^2c^2 + m^2c^4$ characteristic of special relativity, not the non-relativistic limit $|\mathbf{p}| \ll mc$ where $E = \frac{|\mathbf{p}|^2}{2m}$. In general, a phase space can be thought of as
the space of initial conditions for an equation of motion, or equivalently, as the
space of solutions of the equation of motion. In the non-relativistic field theory,
the equation of motion is the first-order in time Schrödinger equation, and the
phase space is the space of fields (wavefunctions) at a specified initial time, say
t = 0. This space carries a representation of the time-translation group R, the
space-translation group R3 and the rotation group SO(3). To construct a rela-
tivistic quantum field theory, we want to find an analog of this space. It will be
some sort of linear space of functions satisfying an equation of motion, and we
will then quantize by applying harmonic oscillator methods.
Just as in the non-relativistic case, the space of solutions to the equation
of motion provides a representation of the group of space-time symmetries of
the theory. This group will now be the Poincaré group, a ten-dimensional
group which includes a four-dimensional subgroup of translations in space-time,
and a six-dimensional subgroup (the Lorentz group), which combines spatial
rotations and “boosts” (transformations mixing spatial and time coordinates).
The representation of the Poincaré group on the solutions to the relativistic
wave equation will in general be reducible. Irreducible such representations will
be the objects corresponding to elementary particles. Our first goal will be to understand the Lorentz group; in later sections we will find representations of this group, then move on to the Poincaré group and its representations.
$$(x,y) \equiv x\cdot y = -x_0y_0 + x_1y_1 + x_2y_2 + x_3y_3$$
• Only for this choice will we have a purely real spinor representation (since Cliff(3, 1) = M(4, R) ≠ Cliff(1, 3)).
• Weinberg’s quantum field theory textbook [71] uses this convention (al-
though, unlike him, we’ll put the 0 component first).
This inner product will also sometimes be written using the matrix
$$\eta_{\mu\nu} = \begin{pmatrix} -1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1\end{pmatrix}$$
as
$$x\cdot y = \sum_{\mu,\nu=0}^{3}\eta_{\mu\nu}\,x_\mu y_\nu$$
Digression (Upper and lower indices). In many physics texts it is conventional
in discussions of special relativity to write formulas using both upper and lower
indices, related by
$$x_\mu = \sum_{\nu=0}^{3}\eta_{\mu\nu}x^\nu = \eta_{\mu\nu}x^\nu$$
with the last form of this using the Einstein summation convention.
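For example (a check of the convention, not in the text): with this $\eta_{\mu\nu}$, lowering an index just flips the sign of the time component,
$$x_0 = -x^0, \qquad x_j = x^j \quad (j = 1,2,3)$$
so that $x\cdot y = x_\mu y^\mu = -x^0y^0 + x^1y^1 + x^2y^2 + x^3y^3$, reproducing the inner product above.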
One motivation for introducing both upper and lower indices is that special
relativity is a limiting case of general relativity, which is a fully geometrical
theory based on taking space-time to be a manifold M with a metric g that
varies from point to point. In such a theory it is important to distinguish between
elements of the tangent space Tx (M ) at a point x ∈ M and elements of its dual,
the co-tangent space Tx∗ (M ), while using the fact that the metric g provides an
inner product on Tx (M ) and thus an isomorphism Tx (M ) ' Tx∗ (M ). In the
special relativity case, this distinction between Tx (M ) and Tx∗ (M ) just comes
down to an issue of signs, but the upper and lower index notation is useful for
keeping track of those.
Vectors $v \in M^4$ such that $|v|^2 = v\cdot v > 0$ are called "space-like", those with $|v|^2 < 0$ "time-like", and those with $|v|^2 = 0$ are said to lie on the "light-cone".
Suppressing one space dimension, the picture to keep in mind of Minkowski
space looks like this:
We can take Fourier transforms with respect to the four space-time variables,
which will take functions of x0 , x1 , x2 , x3 to functions of the Fourier transform
variables p0 , p1 , p2 , p3 . The definition we will use for this Fourier transform will
be
$$\widetilde{f}(p) = \frac{1}{(2\pi)^2}\int_{M^4} e^{-ip\cdot x}f(x)\,d^4x = \frac{1}{(2\pi)^2}\int_{M^4} e^{-i(-p_0x_0 + p_1x_1 + p_2x_2 + p_3x_3)}f(x)\,dx_0\,d^3\mathbf{x}$$
Note that our definition puts one factor of $\frac{1}{\sqrt{2\pi}}$ with each Fourier (or inverse Fourier) transform with respect to a single variable. A common alternate convention among physicists is to put all factors of $2\pi$ with the p integrals (and thus in the inverse Fourier transform), none in the definition of $\widetilde{f}(p)$, the Fourier transform itself.
The reason why one conventionally defines the Hamiltonian operator as $i\frac{\partial}{\partial t}$ but the momentum operator with components $-i\frac{\partial}{\partial x_j}$ is due to the sign change between the time and space variables that occurs in this Fourier transform, in the exponent of the exponential.
Discuss the sign conventions and the Fourier transform in more detail here.
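As a small illustration of these sign conventions (a sketch not in the text; the use of sympy and the particular plane wave are illustrative assumptions), one can check on a plane wave $e^{i(-p_0x_0 + \mathbf{p}\cdot\mathbf{x})}$ that $i\frac{\partial}{\partial x_0}$ returns $p_0$ while $-i\frac{\partial}{\partial x_j}$ returns $p_j$:

    # Illustrative check (not from the text): with the sign convention above,
    # i d/dx0 gives +p0 and -i d/dx1 gives +p1 on a plane wave.
    from sympy import symbols, I, exp, diff, simplify

    x0, x1, p0, p1 = symbols('x0 x1 p0 p1', real=True)
    plane_wave = exp(I*(-p0*x0 + p1*x1))

    energy = simplify(I*diff(plane_wave, x0) / plane_wave)     # expect p0
    momentum = simplify(-I*diff(plane_wave, x1) / plane_wave)  # expect p1
    print(energy, momentum)  # p0 p1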
37.2 The Lorentz group and its Lie algebra
Recall that in 3 dimensions the group of linear transformations of R3 pre-
serving the standard inner product was the group O(3) of 3 by 3 orthogonal
matrices. This group has two disconnected components: SO(3), the subgroup
of orientation preserving (determinant +1) transformations, and a component
of orientation reversing (determinant −1) transformations. In Minkowski space,
one has
Definition (Lorentz group). The Lorentz group O(3, 1) is the group of linear
transformations preserving the Minkowski space inner product on R4 .
The Lorentz group has four components, with the component of the iden-
tity a subgroup called SO(3, 1) (which some call SO+ (3, 1)). The other three
components arise by multiplication of elements in SO(3, 1) by P, T, P T where
$$P = \begin{pmatrix} 1 & 0 & 0 & 0\\ 0 & -1 & 0 & 0\\ 0 & 0 & -1 & 0\\ 0 & 0 & 0 & -1\end{pmatrix}$$
where R is in SO(3). For each pair j, k of spatial directions one has the usual
SO(2) subgroup of rotations in the j −k plane, but now in addition for each pair
0, j of the time direction with a spatial direction, one has SO(1, 1) subgroups of transformations called "boosts" in the j direction. For example,
for j = 1, one has the subgroup of SO(3, 1) of matrices of the form
$$\Lambda = \begin{pmatrix}\cosh\phi & \sinh\phi & 0 & 0\\ \sinh\phi & \cosh\phi & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1\end{pmatrix}$$
for φ ∈ R.
The Lorentz group is six-dimensional. For a basis of its Lie algebra one can take six matrices $M_{\mu\nu}$ for $\mu,\nu \in \{0,1,2,3\}$ and $\mu < \nu$. For the spatial indices, these are
$$M_{12} = \begin{pmatrix}0&0&0&0\\ 0&0&-1&0\\ 0&1&0&0\\ 0&0&0&0\end{pmatrix},\quad M_{13} = \begin{pmatrix}0&0&0&0\\ 0&0&0&1\\ 0&0&0&0\\ 0&-1&0&0\end{pmatrix},\quad M_{23} = \begin{pmatrix}0&0&0&0\\ 0&0&0&0\\ 0&0&0&-1\\ 0&0&1&0\end{pmatrix}$$
which correspond to the basis elements of the Lie algebra of SO(3) that we saw in an earlier chapter. One can rename these using the same names as earlier
One can easily calculate the commutation relations between the kj and lj , which
show that the kj transform as a vector under infinitesimal rotations. For in-
stance, for infinitesimal rotations about the x1 axis, one finds
Digression. A more conventional notation in physics is to use Jj = ilj for
infinitesimal rotations, and Kj = ikj for infinitesimal boosts. The intention of
the different notation used here is to start with basis elements of the real Lie
algebra so(3, 1), (the lj and kj ) which are purely real objects, before complexifying
and considering representations of the Lie algebra.
Taking the following complex linear combinations of the lj and kj
$$A_j = \frac{1}{2}(l_j + ik_j), \qquad B_j = \frac{1}{2}(l_j - ik_j)$$
one finds
$$[A_1,A_2] = A_3,\quad [A_3,A_1] = A_2,\quad [A_2,A_3] = A_1$$
and
$$[B_1,B_2] = B_3,\quad [B_3,B_1] = B_2,\quad [B_2,B_3] = B_1$$
This construction of the Aj , Bj requires that we complexify (allow complex
linear combinations of basis elements) the Lie algebra so(3, 1) of SO(3, 1) and
work with the complex Lie algebra so(3, 1) ⊗ C. It shows that this Lie alge-
bra splits into a product of two sub-Lie algebras, which are each copies of the
(complexified) Lie algebra of SO(3), so(3) ⊗ C. Since
so(3) ⊗ C = su(2) ⊗ C = sl(2, C)
we have
so(3, 1) ⊗ C = sl(2, C) × sl(2, C)
In the next section we’ll see the origin of this phenomenon at the group level.
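These relations are easy to check numerically (a sketch, not from the text; the explicit $l_j$ below are the $M_{23}, M_{13}, M_{12}$ above, and the boost generators $k_j$ are the standard choice with $(k_j)_{0j} = (k_j)_{j0} = 1$, which is an assumption since they were not written out here):

    # Numerical sketch: build candidate so(3,1) basis matrices, form A_j and B_j,
    # and verify the two commuting copies of su(2)-type commutation relations.
    import numpy as np

    def E(a, b):
        m = np.zeros((4, 4)); m[a, b] = 1.0; return m

    l1 = -E(2, 3) + E(3, 2)   # M_23
    l2 =  E(1, 3) - E(3, 1)   # M_13
    l3 = -E(1, 2) + E(2, 1)   # M_12
    k1 = E(0, 1) + E(1, 0)
    k2 = E(0, 2) + E(2, 0)
    k3 = E(0, 3) + E(3, 0)

    def comm(x, y): return x @ y - y @ x

    A = [0.5*(l + 1j*k) for l, k in zip((l1, l2, l3), (k1, k2, k3))]
    B = [0.5*(l - 1j*k) for l, k in zip((l1, l2, l3), (k1, k2, k3))]

    print(np.allclose(comm(A[0], A[1]), A[2]))  # [A1, A2] = A3
    print(np.allclose(comm(B[0], B[1]), B[2]))  # [B1, B2] = B3
    print(np.allclose(comm(A[0], B[1]), 0))     # the A's commute with the B's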
and thus a rotation in SO(3).
The same sort of thing works for the Lorentz group case. Now we identify
R4 with the space of 2 by 2 complex self-adjoint matrices by
$$(x_0,x_1,x_2,x_3) \leftrightarrow \begin{pmatrix} x_0+x_3 & x_1 - ix_2\\ x_1+ix_2 & x_0 - x_3\end{pmatrix}$$
This provides a very useful way to think of Minkowski space: as complex self-
adjoint 2 by 2 matrices, with norm-squared minus the determinant of the matrix.
The linear transformation
$$\begin{pmatrix} x_0+x_3 & x_1-ix_2\\ x_1+ix_2 & x_0-x_3\end{pmatrix} \rightarrow \widetilde{\Lambda}\begin{pmatrix} x_0+x_3 & x_1-ix_2\\ x_1+ix_2 & x_0-x_3\end{pmatrix}\widetilde{\Lambda}^\dagger$$
for $\widetilde{\Lambda} \in SL(2,\mathbf{C})$ preserves the determinant and thus the inner product, since
$$\det\left(\widetilde{\Lambda}\begin{pmatrix} x_0+x_3 & x_1-ix_2\\ x_1+ix_2 & x_0-x_3\end{pmatrix}\widetilde{\Lambda}^\dagger\right) = (\det\widetilde{\Lambda})\,\det\begin{pmatrix} x_0+x_3 & x_1-ix_2\\ x_1+ix_2 & x_0-x_3\end{pmatrix}\,(\det\widetilde{\Lambda}^\dagger) = x_0^2 - x_1^2 - x_2^2 - x_3^2$$
Note that both Λ̃ and −Λ̃ give the same linear transformation when they act by
conjugation like this. One can show that all elements of SO(3, 1) arise as such
conjugation maps, by finding appropriate Λ̃ that give rotations or boosts in the
µ − ν planes, since these generate the group.
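A numerical illustration of the determinant argument (a sketch, not from the text; the random matrix construction is just for the check):

    # Illustrative check: conjugating the self-adjoint matrix built from x by a
    # random SL(2,C) element preserves its determinant, i.e. the Minkowski norm.
    import numpy as np

    rng = np.random.default_rng(0)

    # random 2x2 complex matrix, rescaled to determinant 1 (an SL(2,C) element)
    L = rng.normal(size=(2, 2)) + 1j*rng.normal(size=(2, 2))
    L = L / np.sqrt(np.linalg.det(L))

    x0, x1, x2, x3 = rng.normal(size=4)
    M = np.array([[x0 + x3, x1 - 1j*x2],
                  [x1 + 1j*x2, x0 - x3]])

    M_new = L @ M @ L.conj().T
    print(np.isclose(np.linalg.det(M), np.linalg.det(M_new)))                 # True
    print(np.isclose(np.linalg.det(M).real, x0**2 - x1**2 - x2**2 - x3**2))   # True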
Recall that the double covering map
Φ : SU (2) → SO(3)
Digression (The complex group Spin(4, C) and its real forms). Recall that
we found that Spin(4) = Sp(1) × Sp(1), with the corresponding SO(4) trans-
formation given by identifying R4 with the quaternions H and taking not just
conjugations by unit quaternions, but both left and right multiplication by dis-
tinct unit quaternions. Rewriting this in terms of complex matrices instead of
quaternions, we have Spin(4) = SU (2) × SU (2), and a pair Ω1 , Ω2 of SU (2)
matrices acts as an SO(4) rotation by
$$\begin{pmatrix} x_0-ix_3 & -x_2-ix_1\\ x_2-ix_1 & x_0+ix_3\end{pmatrix} \rightarrow \Omega_1\begin{pmatrix} x_0-ix_3 & -x_2-ix_1\\ x_2-ix_1 & x_0+ix_3\end{pmatrix}\Omega_2$$
and
Spin(2, 2) = SL(2, R) × SL(2, R)
that we have seen are all so-called “real forms” of a fact about complex groups
that one can get by complexifying any of the examples, i.e. considering elements
(x0 , x1 , x2 , x3 ) ∈ C4 , not just in R4 . For instance, in the Spin(4) case, taking
the $x_0, x_1, x_2, x_3$ in the matrix
$$\begin{pmatrix} x_0-ix_3 & -x_2-ix_1\\ x_2-ix_1 & x_0+ix_3\end{pmatrix}$$
to be complex numbers $z_0, z_1, z_2, z_3$, the action above preserves this space as well as the determinant $(z_0^2 + z_1^2 + z_2^2 + z_3^2)$ for $\Omega_1$ and $\Omega_2$ not just in SU(2), but in the larger group SL(2, C). So we find that the group SO(4, C) of complex orthogonal transformations of C4 has spin double cover
$$Spin(4,\mathbf{C}) = SL(2,\mathbf{C}) \times SL(2,\mathbf{C})$$
Since spin(4, C) = so(3, 1) ⊗ C, this relation between complex Lie groups corresponds to the Lie algebra relation
$$so(3,1)\otimes\mathbf{C} = sl(2,\mathbf{C}) \times sl(2,\mathbf{C})$$
Chapter 38
Representations of the Lorentz Group
have irreducible representations of dimension $2s+1$ for $s = 0, \frac{1}{2}, 1, \ldots$. These will now be representations $(\pi_s, V^s)$ of SL(2, C). There are several things that are different though about these representations:
• They are not unitary (except in the case of the trivial representation). For example, for the defining representation $V^{\frac{1}{2}}$ on C2 and the Hermitian inner product $\langle\cdot,\cdot\rangle$
$$\left\langle \begin{pmatrix}\psi_1\\ \psi_2\end{pmatrix}, \begin{pmatrix}\psi_1'\\ \psi_2'\end{pmatrix}\right\rangle = \begin{pmatrix}\overline{\psi_1} & \overline{\psi_2}\end{pmatrix}\cdot\begin{pmatrix}\psi_1'\\ \psi_2'\end{pmatrix} = \overline{\psi_1}\psi_1' + \overline{\psi_2}\psi_2'$$
that we find here giving the symplectic form on a representation space C2
of SL(2, C). In the phase space case, everything was real, and the invari-
ance group of this form was the real symplectic group Sp(2, R) = SL(2, R). What
occurs here is just the complexification of this story, with the symplectic
form now on C2 , and the invariance group now SL(2, C).
• In the case of SU (2) representations, the complex conjugate representa-
tion one gets by taking as representation matrices π(g) instead of π(g) is
equivalent to the original representation (the same representation, with a
different basis choice, so matrices changed by a conjugation). To see this
for the spin-$\frac{1}{2}$ representation, note that SU(2) matrices are of the form
$$\Omega = \begin{pmatrix}\alpha & \beta\\ -\overline{\beta} & \overline{\alpha}\end{pmatrix}$$
so the matrix
$$\begin{pmatrix}0 & 1\\ -1 & 0\end{pmatrix}$$
is the change of basis matrix relating the representation and its complex
conjugate.
This is no longer true for SL(2, C). One cannot in general relate a 2 by 2 complex matrix of unit determinant to its complex conjugate by a change of basis, and the representations $\pi_s$ will not be equivalent to their complex conjugates $\overline{\pi}_s$.
To add: show that this is true
The classification of irreducible finite dimensional SU (2) representations was
done earlier in this course by considering its Lie algebra su(2), complexified to
give us raising and lowering operators, and this complexification is sl(2, C). If
you take a look at that argument, you see that it mostly also applies to irre-
ducible finite-dimensional sl(2, C) representations. There is a difference though:
now flipping positive to negative weights (which corresponds to change of sign
of the Lie algebra representation matrices, or conjugation of the Lie group rep-
resentation matrices) no longer takes one to an equivalent representation. It
turns out that to get all irreducibles, one must take both the representations
we already know about and their complex conjugates. Using the fact that the
tensor product of one of each type of irreducible is still an irreducible, one can
show (we won’t do this here) that the complete list of irreducible representations
of sl(2, C) is given by
Theorem (Classification of finite dimensional sl(2, C) representations). The irreducible representations of sl(2, C) are labeled by $(s_1, s_2)$ for $s_j = 0, \frac{1}{2}, 1, \ldots$. These representations are built out of the representations $(\pi_s, V^s)$ with the irreducible $(s_1, s_2)$ given by
$$(\pi_{s_1}\otimes\overline{\pi}_{s_2},\; V^{s_1}\otimes\overline{V}^{s_2})$$
and having dimension $(2s_1+1)(2s_2+1)$.
All these representations are also representations of the group SL(2, C) and
one has the same classification theorem for the group, although we will not try
and prove this. We will also not try and study these representations in general,
but will restrict attention to the four cases of most physical interest.
• (0, 0): The trivial representation on C, also called the “spin 0” or scalar
representation.
• $(\frac{1}{2}, 0)$: These are called left-handed (for reasons we will see later on) "Weyl spinors". We will often denote the representation space C2 in this case as $S_L$, and write an element of it as $\psi_L$.
• $(0, \frac{1}{2})$: These are called right-handed Weyl spinors. We will often denote the representation space C2 in this case as $S_R$, and write an element of it as $\psi_R$.
Note that the representations of SL(2, C) on SL and SR are described
explicitly below.
• $(\frac{1}{2}, \frac{1}{2})$: This is called the "vector" representation since it is the complexification of the action of SL(2, C) as SO(3, 1) transformations of space-time vectors that we saw earlier. Recall that for $\Omega \in SL(2,\mathbf{C})$ this action was
$$\begin{pmatrix} x_0+x_3 & x_1-ix_2\\ x_1+ix_2 & x_0-x_3\end{pmatrix} \rightarrow \Omega\begin{pmatrix} x_0+x_3 & x_1-ix_2\\ x_1+ix_2 & x_0-x_3\end{pmatrix}\Omega^\dagger$$
Since Ω† is the conjugate transpose this is the action of SL(2, C) on the
representation SL ⊗ SR .
add an explicit identification of matrices and the tensor product
This representation is on a vector space C4 = M (2, C), but preserves
the subspace of self-adjoint matrices that we have identified with the
Minkowski space R4 .
The reducible 4 complex dimensional representation $(\frac{1}{2},0)\oplus(0,\frac{1}{2})$ is known as
the representation on “Dirac spinors”. As explained earlier, of these representa-
tions, only the trivial one is unitary. Only the trivial and vector representations
are representations of SO(3, 1) as well as SL(2, C).
One can manipulate these Weyl spinor representations $(\frac{1}{2},0)$ and $(0,\frac{1}{2})$ in
a similar way to the treatment of tangent vectors and their duals in tensor
analysis. Just like in that formalism, one can distinguish between a represen-
tation space and its dual by upper and lower indices, in this case using not
the metric but the SL(2, C) invariant bilinear form to raise and lower indices.
With complex conjugates and duals, there are four kinds of irreducible SL(2, C)
representations on C2 to keep track of
• $S_L$: This is the standard defining representation of SL(2, C) on C2, with $\Omega \in SL(2,\mathbf{C})$ acting on $\psi_L \in S_L$ by
$$\psi_L \rightarrow \Omega\psi_L$$
A standard index notation for such things is called the "van der Waerden notation". It uses a lower index $\alpha$ taking values 1, 2 to label the components
$$\psi_L = \begin{pmatrix}\psi_1\\ \psi_2\end{pmatrix} = \psi_\alpha$$
and in this notation $\Omega$ acts by
$$\psi_\alpha \rightarrow \Omega_\alpha^{\ \beta}\psi_\beta$$
Writing elements of the dual as row vectors, our example above of a particular $\Omega$ acts by
$$\begin{pmatrix}\psi_1 & \psi_2\end{pmatrix} \rightarrow \begin{pmatrix}\psi_1 & \psi_2\end{pmatrix}e^{i\frac{\theta}{2}\sigma_3}$$
$$\psi^\alpha = \epsilon^{\alpha\beta}\psi_\beta$$
where
$$\epsilon^{\alpha\beta} = \begin{pmatrix}0 & 1\\ -1 & 0\end{pmatrix}$$
• $S_R$: This is the complex conjugate representation to $S_L$, with $\Omega \in SL(2,\mathbf{C})$ acting on $\psi_R \in S_R$ by
$$\psi_R \rightarrow \overline{\Omega}\psi_R$$
The van der Waerden notation uses a separate set of dotted indices for these, writing this as
$$\psi_{\dot\alpha} \rightarrow \overline{\Omega}_{\dot\alpha}^{\ \dot\beta}\psi_{\dot\beta}$$
Another common notation among physicists puts a bar over the ψ to denote that the vector is in this representation, but we'll reserve that notation for complex conjugation. The Ω corresponding to a rotation about the z-axis acts as
$$\begin{pmatrix}\psi_{\dot 1}\\ \psi_{\dot 2}\end{pmatrix} \rightarrow e^{i\frac{\theta}{2}\sigma_3}\begin{pmatrix}\psi_{\dot 1}\\ \psi_{\dot 2}\end{pmatrix}$$
• $S_R^*$: This is the dual representation to $S_R$, with $\Omega \in SL(2,\mathbf{C})$ acting on $\psi_R^* \in S_R^*$ by
$$\psi_R^* \rightarrow (\overline{\Omega}^{-1})^T\psi_R^*$$
and the index notation uses raised dotted indices
$$\psi^{\dot\alpha} \rightarrow ((\overline{\Omega}^{-1})^T)^{\dot\alpha}_{\ \dot\beta}\,\psi^{\dot\beta}$$
Another copy of
$$\epsilon^{\dot\alpha\dot\beta} = \begin{pmatrix}0 & 1\\ -1 & 0\end{pmatrix}$$
gives the isomorphism of $S_R$ and $S_R^*$ as representations, by
$$\psi^{\dot\alpha} = \epsilon^{\dot\alpha\dot\beta}\psi_{\dot\beta}$$
theory to the case of Cliff(3, 1) and this will give us explicitly the representations $(\frac{1}{2},0)$ and $(0,\frac{1}{2})$.
If we complexify our R4 , then its Clifford algebra becomes just the algebra
of 4 by 4 complex matrices
Digression. Note that the Aj and Bj we constructed using the lj and kj were
also complex 4 by 4 matrices, but they were acting on complex vectors (the com-
plexification of the vector representation $(\frac{1}{2},\frac{1}{2})$). Now we want 4 by 4 matrices for something different, putting together the spinor representations $(\frac{1}{2},0)$ and $(0,\frac{1}{2})$.
One can easily check that these satisfy the Clifford algebra relations for gener-
ators of Cliff(1, 3): they anticommute with each other and
The quadratic Clifford algebra elements $-\frac{1}{2}\gamma_j\gamma_k$ for $j < k$ satisfy the commutation relations of so(3, 1). These are explicitly
$$-\frac{1}{2}\gamma_1\gamma_2 = -\frac{i}{2}\begin{pmatrix}\sigma_3 & 0\\ 0 & \sigma_3\end{pmatrix},\quad -\frac{1}{2}\gamma_1\gamma_3 = -\frac{i}{2}\begin{pmatrix}\sigma_2 & 0\\ 0 & \sigma_2\end{pmatrix},\quad -\frac{1}{2}\gamma_2\gamma_3 = -\frac{i}{2}\begin{pmatrix}\sigma_1 & 0\\ 0 & \sigma_1\end{pmatrix}$$
and
$$-\frac{1}{2}\gamma_0\gamma_1 = \frac{1}{2}\begin{pmatrix}-\sigma_1 & 0\\ 0 & \sigma_1\end{pmatrix},\quad -\frac{1}{2}\gamma_0\gamma_2 = \frac{1}{2}\begin{pmatrix}-\sigma_2 & 0\\ 0 & \sigma_2\end{pmatrix},\quad -\frac{1}{2}\gamma_0\gamma_3 = \frac{1}{2}\begin{pmatrix}-\sigma_3 & 0\\ 0 & \sigma_3\end{pmatrix}$$
This gives a representation $\pi'$ of so(3, 1) with
$$\pi'(l_1) = -\frac{1}{2}\gamma_2\gamma_3,\quad \pi'(l_2) = -\frac{1}{2}\gamma_1\gamma_3,\quad \pi'(l_3) = -\frac{1}{2}\gamma_1\gamma_2$$
and
$$\pi'(k_1) = -\frac{1}{2}\gamma_0\gamma_1,\quad \pi'(k_2) = -\frac{1}{2}\gamma_0\gamma_2,\quad \pi'(k_3) = -\frac{1}{2}\gamma_0\gamma_3$$
Note that the $\pi'(l_j)$ are skew-adjoint, since this representation of the so(3) ⊂ so(3, 1) sub-algebra is unitary. The $\pi'(k_j)$ are self-adjoint and this representation $\pi'$ of so(3, 1) is not unitary.
On the two commuting sl(2, C) subalgebras of so(3, 1) ⊗ C with bases
$$A_j = \frac{1}{2}(l_j + ik_j), \qquad B_j = \frac{1}{2}(l_j - ik_j)$$
this representation is
$$\pi'(A_1) = -\frac{i}{2}\begin{pmatrix}\sigma_1 & 0\\ 0 & 0\end{pmatrix},\quad \pi'(A_2) = -\frac{i}{2}\begin{pmatrix}\sigma_2 & 0\\ 0 & 0\end{pmatrix},\quad \pi'(A_3) = -\frac{i}{2}\begin{pmatrix}\sigma_3 & 0\\ 0 & 0\end{pmatrix}$$
and
$$\pi'(B_1) = -\frac{i}{2}\begin{pmatrix}0 & 0\\ 0 & \sigma_1\end{pmatrix},\quad \pi'(B_2) = -\frac{i}{2}\begin{pmatrix}0 & 0\\ 0 & \sigma_2\end{pmatrix},\quad \pi'(B_3) = -\frac{i}{2}\begin{pmatrix}0 & 0\\ 0 & \sigma_3\end{pmatrix}$$
We see explicitly that the action of the quadratic elements of the Clifford
algebra on the spinor representation C4 is reducible, decomposing as the direct sum $S_L \oplus S_R^*$ of two inequivalent representations on C2
$$\Psi = \begin{pmatrix}\psi_L\\ \psi_R^*\end{pmatrix}$$
where the index α on the left takes values 1, 2, 3, 4 and the indices α, α̇ on the
right each take values 1, 2.
An important element of the Clifford algebra is constructed by multiplying
all of the basis elements together. Physicists traditionally multiply this by i to
make it self-adjoint and define
$$\gamma_5 = i\gamma_0\gamma_1\gamma_2\gamma_3 = \begin{pmatrix}-\mathbf{1} & 0\\ 0 & \mathbf{1}\end{pmatrix}$$
This can be used to produce projection operators from the Dirac spinors onto
the left and right-handed Weyl spinors
$$\frac{1}{2}(1 - \gamma_5)\Psi = \psi_L, \qquad \frac{1}{2}(1 + \gamma_5)\Psi = \psi_R^*$$
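A numerical check of these statements (a sketch, not from the text; the explicit chiral-basis gamma matrices below are an assumption, chosen to be consistent with the block formulas and the Dirac-basis conventions given in this chapter):

    # Sketch: one choice of Weyl (chiral) gamma matrices consistent with the
    # formulas above; check the Clifford relations, gamma_5, and a pi'(A_j) block.
    import numpy as np

    s1 = np.array([[0, 1], [1, 0]], dtype=complex)
    s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
    s3 = np.array([[1, 0], [0, -1]], dtype=complex)
    I2 = np.eye(2, dtype=complex)
    Z2 = np.zeros((2, 2), dtype=complex)

    def block(a, b, c, d):
        return np.block([[a, b], [c, d]])

    g0 = -1j*block(Z2, I2, I2, Z2)
    g = [g0] + [-1j*block(Z2, s, -s, Z2) for s in (s1, s2, s3)]

    eta = np.diag([-1.0, 1.0, 1.0, 1.0])
    ok = all(np.allclose(g[m] @ g[n] + g[n] @ g[m], 2*eta[m, n]*np.eye(4))
             for m in range(4) for n in range(4))
    print(ok)  # Clifford algebra relations for Cliff(3,1): True

    g5 = 1j*g[0] @ g[1] @ g[2] @ g[3]
    print(np.allclose(g5, block(-I2, Z2, Z2, I2)))  # True

    # pi'(A_3) = (pi'(l_3) + i pi'(k_3))/2 acts only on the upper (S_L) block
    pl3 = -0.5*g[1] @ g[2]
    pk3 = -0.5*g[0] @ g[3]
    print(np.allclose(0.5*(pl3 + 1j*pk3), block(-0.5j*s3, Z2, Z2, Z2)))  # True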
There are two other commonly used representations of the Clifford algebra
relations, related to the one above by a change of basis. The Dirac representation
is useful to describe massive charged particles, especially in the non-relativistic
limit. Generators are given by
$$\gamma_0^D = -i\begin{pmatrix}\mathbf{1} & 0\\ 0 & -\mathbf{1}\end{pmatrix},\quad \gamma_1^D = -i\begin{pmatrix}0 & \sigma_1\\ -\sigma_1 & 0\end{pmatrix},\quad \gamma_2^D = -i\begin{pmatrix}0 & \sigma_2\\ -\sigma_2 & 0\end{pmatrix},\quad \gamma_3^D = -i\begin{pmatrix}0 & \sigma_3\\ -\sigma_3 & 0\end{pmatrix}$$
and the projection operators for Weyl spinors are no longer diagonal, since
$$\gamma_5^D = \begin{pmatrix}0 & \mathbf{1}\\ \mathbf{1} & 0\end{pmatrix}$$
the complexification of the real vector space into a sum of two complex vector
spaces, related by complex conjugation. In this case this corresponds to
$$S_M \otimes \mathbf{C} = S_L \oplus S_R^*$$
the fact that complexifying Majorana spinors gives the two kinds of Weyl
spinors.
Chapter 39
In the previous chapter we saw that one can take the semi-direct product of
spatial translations and rotations and that the resulting group has infinite-
dimensional unitary representations on the state space of a quantum free parti-
cle. The free particle Hamiltonian plays the role of a Casimir operator: to get
irreducible representations one fixes the eigenvalue of the Hamiltonian (the en-
ergy), and then the representation is on the space of solutions to the Schrödinger
equation with this energy. This is a non-relativistic procedure, treating time and
space (and correspondingly the Hamiltonian and the momenta) differently. For
a relativistic analog, we will use instead the semi-direct product of space-time
translations and Lorentz transformations. Irreducible representations of this
group will be labeled by a continuous parameter (the mass) and a discrete pa-
rameter (the spin or helicity), and these will correspond to possible relativistic
elementary particles.
In the non-relativistic case, the representation occurred as a space of solu-
tions to a differential equation, the Schrödinger equation. There is an analogous
description of the irreducible Poincaré group representations as spaces of solu-
tions of relativistic wave equations, but we will put off that story until succeeding
chapters.
We will refer to both of these groups as the “Poincaré group”, meaning by
this the double-cover only when we need it because spinor representations of the
Lorentz group are involved. The two groups have the same Lie algebra, so the
distinction is not needed in discussions that only need the Lie algebra. Elements
of the group P will be written as pairs (a, Λ), with a ∈ R4 and Λ ∈ SO(3, 1).
The group law is
The Lie algebra LieP = LieP̃ has dimension 10, with basis
t0 , t1 , t2 , t3 , l1 , l2 , l3 , k1 , k2 , k3
where the first four elements are a basis of the Lie algebra of the translation
group, and the next six are a basis of so(3, 1), with the lj giving the subgroup of
spatial rotations, the kj the boosts. We already know the commutation relations
for the translation subgroup, which is commutative so
[tj , tk ] = 0
$$t_0 \leftrightarrow \begin{pmatrix}0&0&0&0&1\\ 0&0&0&0&0\\ 0&0&0&0&0\\ 0&0&0&0&0\\ 0&0&0&0&0\end{pmatrix}, \qquad t_1 \leftrightarrow \begin{pmatrix}0&0&0&0&0\\ 0&0&0&0&1\\ 0&0&0&0&0\\ 0&0&0&0&0\\ 0&0&0&0&0\end{pmatrix}$$
$$t_2 \leftrightarrow \begin{pmatrix}0&0&0&0&0\\ 0&0&0&0&0\\ 0&0&0&0&1\\ 0&0&0&0&0\\ 0&0&0&0&0\end{pmatrix}, \qquad t_3 \leftrightarrow \begin{pmatrix}0&0&0&0&0\\ 0&0&0&0&0\\ 0&0&0&0&0\\ 0&0&0&0&1\\ 0&0&0&0&0\end{pmatrix}$$
We can use this explicit matrix representation to compute the commutators
of the infinitesimal translations tj with the infinitesimal rotations and boosts
(lj , kj ). t0 commutes with the lj and t1 , t2 , t3 transform as a vector under
rotations. For instance, for infinitesimal rotations about the 1-axis
Note that infinitesimal boosts do not commute with infinitesimal time transla-
tion, so after quantization boosts will not commute with the Hamiltonian and
thus are not the sort of symmetries which act on spaces of energy eigenstates,
preserving the energy.
Equivalently, the p0 , p1 , p2 , p3 are the eigenvalues of the energy and momentum
operators
that give the representation of the translation part of the Poincaré group Lie
algebra on the states.
The Lorentz group acts on this R4 by
p → Λp
and, restricting attention to the p0 − p3 plane, the picture of the orbits looks
like this
Unlike the Euclidean group case, here there are several different kinds of
orbits Oα . We’ll examine them and the corresponding stabilizer groups Kα
each in turn, and see what can be said about the associated representations.
One way to understand the equations describing these orbits is to note that
the different orbits correspond to different eigenvalues of the Poincaré group
Casimir operator
P 2 = −P02 + P12 + P22 + P32
This operator commutes with all the generators of the Lie algebra of the Poincaré
group, so by Schur’s lemma it must act as a scalar times the identity on an
irreducible representation (recall that the same phenomenon occurs for SU (2)
representations, which can be characterized by the eigenvalue $j(j+1)$ of the Casimir operator $\mathbf{J}^2$ for SU(2)). At a point $p = (p_0, p_1, p_2, p_3)$ in energy-momentum
space, the Pj operators are diagonalized and P 2 will act by the scalar
Note that in this chapter we are just classifying Poincaré group representa-
tions, not actually constructing them. It is possible to construct these represen-
tations using the data we will find that classifies them, but this would require
introducing some techniques (for so-called “induced representations”) that go
beyond the scope of this course. In later chapters we will explicitly construct
these representations in certain specific cases as solutions to certain relativistic
wave equations.
so
$$p_0 = \sqrt{p_1^2 + p_2^2 + p_3^2 + m^2}$$
The stabilizer group $K_{m,0,0,0}$ is the subgroup of SO(3, 1) of elements of the form
$$\begin{pmatrix}1 & 0\\ 0 & \Omega\end{pmatrix}$$
where Ω ∈ SO(3), so Km,0,0,0 = SO(3). Irreducible representations of this
group are classified by the spin. For spin 0, points on the hyperboloid can
be identified with positive energy solutions to a wave equation called the Klein-
Gordon equation and functions on the hyperboloid both correspond to the space
of all solutions of this equation and carry an irreducible representation of the
Poincaré group. In the next chapter we will study the Klein-Gordon equation,
as well as the quantization of the space of its solutions by quantum field theory
methods.
We will later study the case of spin $\frac{1}{2}$, where one must use the double cover
SU (2) of SO(3). The Poincaré group representation will be on functions on
the orbit that take values in two copies of the spinor representation of SU (2).
These will correspond to solutions of a wave equation called the massive Dirac
equation.
For choices of higher spin representations of the stabilizer group, one can
again find appropriate wave equations and construct Poincaré group represen-
tations on their space of solutions, but we will not enter into this topic.
Again, one has the same stabilizer group $K_{-m,0,0,0} = SO(3)$ and the same constructions of wave equations of various spins and Poincaré group representations
on their solution spaces as in the positive energy case. Since negative energies
lead to unstable, unphysical theories, we will see that these representations are
treated differently under quantization, corresponding physically not to particles,
but to antiparticles.
It is not too difficult to see that the stabilizer group of the orbit is K0,0,0,m =
SO(2, 1). This is isomorphic to the group SL(2, R), and it has no finite-
dimensional unitary representations. These orbits correspond physically to
“tachyons”, particles that move faster than the speed of light, and there is
no known way to consistently incorporate them in a conventional theory.
group SO(3, 1). For each finite-dimensional representation of SO(3, 1), one gets
a corresponding finite dimensional representation of the Poincaré group, with
translations acting trivially. These representations are not unitary, so not usable
for our purposes.
39.3 For further reading
The Poincaré group and its Lie algebra are discussed in pretty much any quantum
field theory textbook. Weinberg [71] (Chapter 2) has some discussion of the
representations of the Poincaré group on single particle state spaces that we have
classified here. Folland [20] (Chapter 4.4) and Berndt [7] (Chapter 7.5) discuss
the actual construction of these representations using the induced representation
methods that we have chosen not to try and explain here.
Chapter 40
40.1 The Klein-Gordon equation and its solutions
Recall that a condition characterizing the orbit in momentum space that we
want to study was that the Casimir operator P 2 of the Poincaré group acts on
the representation corresponding to the orbit as the scalar m2 . So, we have the
operator equation
$$\left(-\frac{\partial^2}{\partial t^2} + \frac{\partial^2}{\partial x_1^2} + \frac{\partial^2}{\partial x_2^2} + \frac{\partial^2}{\partial x_3^2}\right)\phi = m^2\phi$$
or
$$\left(-\frac{\partial^2}{\partial t^2} + \Delta - m^2\right)\phi = 0$$
for functions φ(x) on Minkowski space (which may be real or complex valued).
This equation is the simplest Lorentz-invariant wave equation to try, and
historically was the one Schrödinger first tried (he then realized it could not
account for atomic spectra and instead used the non-relativistic equation that
bears his name). Taking Fourier transforms
$$\widetilde{\phi}(p) = \frac{1}{(2\pi)^2}\int d^4x\, e^{-i(-p_0x_0 + \mathbf{p}\cdot\mathbf{x})}\phi(x)$$
the Klein-Gordon equation becomes
In the non-relativistic case, a continuous basis of solutions of the Schrödinger
equation labeled by p ∈ R3 was given by the functions
$$e^{i\mathbf{p}\cdot\mathbf{x}}e^{-i\frac{|\mathbf{p}|^2}{2m}t}$$
with the integral over the 3d hyperboloid expressed as a 4d integral over R4
with a delta-function on the hyperboloid in the argument.
The delta function distribution with argument a function f (x) depends only
on the zeros of $f$, and if $f' \neq 0$ at such zeros, one has
$$\delta(f(x)) = \sum_{x_j : f(x_j)=0}\delta(f'(x_j)(x - x_j)) = \sum_{x_j : f(x_j)=0}\frac{1}{|f'(x_j)|}\delta(x - x_j)$$
For each p, one can apply this to the case of the function of p0 given by
$$f = p_0^2 - \omega_p^2$$
on R4, and using
$$\frac{d}{dp_0}(p_0^2 - \omega_p^2) = 2p_0 = \pm 2\omega_p$$
one finds
$$\phi(\mathbf{x},t) = \frac{1}{(2\pi)^{3/2}}\int_{M^4}\frac{1}{2\omega_p}\left(\delta(p_0 - \omega_p) + \delta(p_0 + \omega_p)\right)\widetilde{\phi}(p)\,e^{i(\mathbf{p}\cdot\mathbf{x} - p_0t)}\,dp_0\,d^3\mathbf{p}$$
$$= \frac{1}{(2\pi)^{3/2}}\int_{\mathbf{R}^3}\left(\widetilde{\phi}_+(\mathbf{p})e^{-i\omega_pt} + \widetilde{\phi}_-(\mathbf{p})e^{i\omega_pt}\right)e^{i\mathbf{p}\cdot\mathbf{x}}\,\frac{d^3\mathbf{p}}{2\omega_p}$$
Here
$$\widetilde{\phi}_+(\mathbf{p}) = \widetilde{\phi}(\omega_p, \mathbf{p}), \qquad \widetilde{\phi}_-(\mathbf{p}) = \widetilde{\phi}(-\omega_p, \mathbf{p})$$
are the values of φe on the positive and negative energy hyperboloids. We see
that instead of thinking of the Fourier transforms of solutions as taking values
on energy-momentum hyperboloids, we can think of them as taking values just
on the space R3 of momenta (just as in the non-relativistic case), but we do
have to use both positive and negative energy Fourier components, and to get
a Lorentz invariant measure need to use
$$\frac{d^3\mathbf{p}}{2\omega_p}$$
instead of $d^3\mathbf{p}$.
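The delta function identity used above can also be checked numerically (a sketch, not from the text: the delta function is approximated by a narrow Gaussian, and the test function and parameter values are arbitrary choices):

    # Sketch: approximate delta(p0^2 - w^2) by a narrow Gaussian and compare
    # integrating it against a test function g with (g(w) + g(-w)) / (2 w).
    import numpy as np

    w = 1.7                                       # omega_p, arbitrary positive value
    g = lambda p0: np.exp(-0.3*(p0 - 0.4)**2)     # arbitrary smooth test function

    eps = 1e-3                                    # width of the delta approximation
    delta = lambda u: np.exp(-u**2/(2*eps**2)) / (np.sqrt(2*np.pi)*eps)

    p0 = np.linspace(-10.0, 10.0, 2_000_001)
    dp = p0[1] - p0[0]
    lhs = np.sum(delta(p0**2 - w**2) * g(p0)) * dp
    rhs = (g(w) + g(-w)) / (2*w)
    print(lhs, rhs)                               # the two values agree closely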
A general complex-valued solution to the Klein-Gordon equation will be
given by the two complex-valued functions φe+ , φe− , but we can impose the
condition that the solution be real-valued, in which case one can check that the
pair of functions must satisfy the condition
40.2 Classical relativistic scalar field theory
We would like to set up the Hamiltonian formalism, finding a phase space H1
and a Hamiltonian function h on it such that Hamilton’s equations will give us
the Klein-Gordon equation as equation of motion. Such a phase space will be
an infinite-dimensional function space and the Hamiltonian will be a functional.
We will here blithely ignore the analytic difficulties of working with such spaces,
and use physicist’s methods, with formulas that can be given a legitimate inter-
pretation by being more careful and using distributions. Note that now we will
take the fields φ to be real-valued, this is the so-called real scalar field.
Since the Klein-Gordon equation is second order in time, solutions will be
parametrized by initial data which, unlike the non-relativistic case now requires
the specification at t = 0 of not one, but two functions,
$$\phi(\mathbf{x}) = \phi(\mathbf{x},0), \qquad \dot{\phi}(\mathbf{x}) = \frac{\partial}{\partial t}\phi(\mathbf{x},t)\Big|_{t=0}$$
the values of the field and its first time derivative.
We will take as our phase space H1 the space of pairs of functions (φ, π),
with coordinates φ(x), π(x) and Poisson brackets
$$\{\phi(\mathbf{x}), \pi(\mathbf{x}')\} = \delta(\mathbf{x}-\mathbf{x}'), \qquad \{\phi(\mathbf{x}), \phi(\mathbf{x}')\} = \{\pi(\mathbf{x}), \pi(\mathbf{x}')\} = 0$$
We want to get the Klein-Gordon equation for φ(x, t) as the following pair of
first order equations
$$\frac{\partial}{\partial t}\phi = \pi, \qquad \frac{\partial}{\partial t}\pi = (\Delta - m^2)\phi$$
which together imply
$$\frac{\partial^2}{\partial t^2}\phi = (\Delta - m^2)\phi$$
To get these as equations of motion, we just need to find a Hamiltonian
function h on the phase space H1 such that
$$\frac{\partial}{\partial t}\phi = \{\phi, h\} = \pi, \qquad \frac{\partial}{\partial t}\pi = \{\pi, h\} = (\Delta - m^2)\phi$$
One can check that two choices of Hamiltonian function that will have this
property are
$$h = \int_{\mathbf{R}^3}H(\mathbf{x})\,d^3x$$
where
$$H = \frac{1}{2}\left(\pi^2 - \phi\Delta\phi + m^2\phi^2\right) \quad\text{or}\quad H = \frac{1}{2}\left(\pi^2 + (\nabla\phi)^2 + m^2\phi^2\right)$$
where the two different integrands H(x) are related (as in the non-relativistic
case) by integration by parts so these just differ by boundary terms that are
assumed to vanish.
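As a quick illustration that the two integrands differ only by a total derivative (a sketch, not from the text; the one-dimensional Gaussian field configuration below is an arbitrary choice with vanishing boundary terms):

    # Sketch: in one dimension, integrate -phi * phi'' and (phi')^2 over the real
    # line for a rapidly decaying phi; integration by parts says they are equal.
    from sympy import symbols, exp, integrate, diff, oo, simplify

    x = symbols('x', real=True)
    phi = exp(-x**2)                # arbitrary decaying field configuration

    lhs = integrate(-phi*diff(phi, x, 2), (x, -oo, oo))
    rhs = integrate(diff(phi, x)**2, (x, -oo, oo))
    print(simplify(lhs - rhs))      # 0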
To be added: work out one of the two above Poisson brackets
One could instead have taken as starting point the Lagrangian formalism,
with an action
$$S = \int_{M^4}L\,d^4x$$
where
$$L = \frac{1}{2}\left(\left(\frac{\partial}{\partial t}\phi\right)^2 - (\nabla\phi)^2 - m^2\phi^2\right)$$
This action is a functional now of fields on Minkowski space M 4 and is Lorentz
invariant. The Euler-Lagrange equations give as equation of motion the Klein-
Gordon equation
$$(\Box - m^2)\phi = 0$$
One recovers the Hamiltonian formalism by seeing that the canonical momentum
for φ is
$$\pi = \frac{\partial L}{\partial\dot\phi} = \dot\phi$$
and the Hamiltonian density is
$$H = \pi\dot\phi - L = \frac{1}{2}\left(\pi^2 + (\nabla\phi)^2 + m^2\phi^2\right)$$
Besides the position-space Hamiltonian formalism, we would like to have one
for the momentum space components of the field, since for a free field it is these
that will decouple into an infinite collection of harmonic oscillators. For a real
solution to the Klein-Gordon equation we have
$$\phi(\mathbf{x},t) = \frac{1}{(2\pi)^{3/2}}\int_{\mathbf{R}^3}\left(\widetilde{\phi}_+(\mathbf{p})e^{-i\omega_pt} + \overline{\widetilde{\phi}_+(-\mathbf{p})}e^{i\omega_pt}\right)e^{i\mathbf{p}\cdot\mathbf{x}}\,\frac{d^3\mathbf{p}}{2\omega_p}$$
$$= \frac{1}{(2\pi)^{3/2}}\int_{\mathbf{R}^3}\left(\widetilde{\phi}_+(\mathbf{p})e^{-i\omega_pt}e^{i\mathbf{p}\cdot\mathbf{x}} + \overline{\widetilde{\phi}_+(\mathbf{p})}e^{i\omega_pt}e^{-i\mathbf{p}\cdot\mathbf{x}}\right)\frac{d^3\mathbf{p}}{2\omega_p}$$
where we have used the symmetry of the integration over p to integrate over
−p instead of p.
We can choose a new way of normalizing Fourier coefficients, one that reflects
the fact that the Lorentz-invariant notion is that of integrating over the energy-
momentum hyperboloid rather than momentum space
$$\alpha(\mathbf{p}) = \frac{\widetilde{\phi}_+(\mathbf{p})}{\sqrt{2\omega_p}}$$
$$\phi(\mathbf{x},t) = \frac{1}{(2\pi)^{3/2}}\int_{\mathbf{R}^3}\left(\alpha(\mathbf{p})e^{-i\omega_pt}e^{i\mathbf{p}\cdot\mathbf{x}} + \overline{\alpha(\mathbf{p})}e^{i\omega_pt}e^{-i\mathbf{p}\cdot\mathbf{x}}\right)\frac{d^3\mathbf{p}}{\sqrt{2\omega_p}}$$
The $\alpha(\mathbf{p}), \overline{\alpha(\mathbf{p})}$ will have the same sort of Poisson bracket relations as the $z, \overline{z}$ for a single harmonic oscillator, or the $\alpha(\mathbf{p}), \overline{\alpha(\mathbf{p})}$ Fourier coefficients in the case of the non-relativistic field:
$$\{\alpha(\mathbf{p}), \overline{\alpha(\mathbf{p}')}\} = -i\delta^3(\mathbf{p} - \mathbf{p}'), \qquad \{\alpha(\mathbf{p}), \alpha(\mathbf{p}')\} = \{\overline{\alpha(\mathbf{p})}, \overline{\alpha(\mathbf{p}')}\} = 0$$
To see this, one can compute the Poisson brackets for the fields as follows. We
have
$$\pi(\mathbf{x}) = \frac{\partial}{\partial t}\phi(\mathbf{x},t)\Big|_{t=0} = \frac{1}{(2\pi)^{3/2}}\int_{\mathbf{R}^3}(-i\omega_p)\left(\alpha(\mathbf{p})e^{i\mathbf{p}\cdot\mathbf{x}} - \overline{\alpha(\mathbf{p})}e^{-i\mathbf{p}\cdot\mathbf{x}}\right)\frac{d^3\mathbf{p}}{\sqrt{2\omega_p}}$$
and
$$\phi(\mathbf{x}) = \frac{1}{(2\pi)^{3/2}}\int_{\mathbf{R}^3}\left(\alpha(\mathbf{p})e^{i\mathbf{p}\cdot\mathbf{x}} + \overline{\alpha(\mathbf{p})}e^{-i\mathbf{p}\cdot\mathbf{x}}\right)\frac{d^3\mathbf{p}}{\sqrt{2\omega_p}}$$
so
$$\{\phi(\mathbf{x}), \pi(\mathbf{x}')\} = \frac{1}{2(2\pi)^3}\int_{\mathbf{R}^3\times\mathbf{R}^3}\left(\{\alpha(\mathbf{p}), i\overline{\alpha(\mathbf{p}')}\}e^{i(\mathbf{p}\cdot\mathbf{x} - \mathbf{p}'\cdot\mathbf{x}')} - \{i\overline{\alpha(\mathbf{p})}, \alpha(\mathbf{p}')\}e^{i(-\mathbf{p}\cdot\mathbf{x} + \mathbf{p}'\cdot\mathbf{x}')}\right)d^3\mathbf{p}\,d^3\mathbf{p}'$$
$$= \frac{1}{2(2\pi)^3}\int_{\mathbf{R}^3\times\mathbf{R}^3}\delta^3(\mathbf{p}-\mathbf{p}')\left(e^{i(\mathbf{p}\cdot\mathbf{x} - \mathbf{p}'\cdot\mathbf{x}')} + e^{i(-\mathbf{p}\cdot\mathbf{x} + \mathbf{p}'\cdot\mathbf{x}')}\right)d^3\mathbf{p}\,d^3\mathbf{p}'$$
$$= \frac{1}{2(2\pi)^3}\int_{\mathbf{R}^3}\left(e^{i\mathbf{p}\cdot(\mathbf{x}-\mathbf{x}')} + e^{-i\mathbf{p}\cdot(\mathbf{x}-\mathbf{x}')}\right)d^3\mathbf{p} = \delta^3(\mathbf{x}-\mathbf{x}')$$
As in the non-relativistic case, one really should work with elements of $H_1^*$ of the form (for an appropriately chosen class of functions $f, g$)
$$\phi(f) + \pi(g) = \int_{\mathbf{R}^3}\left(f(\mathbf{x})\phi(\mathbf{x}) + g(\mathbf{x})\pi(\mathbf{x})\right)d^3x$$
This is just the infinite-dimensional analog of the Poisson bracket of two linear
combinations of the qj , pj , with the right-hand side the symplectic form Ω on
H1∗ .
where $M^+_J$ is the $+i$ eigenspace of $J$, $M^-_J$ the $-i$ eigenspace. The quantum state space will be the space of polynomials on the dual of $M^+_J$. The choice of
J corresponds to a choice of distinguished state |0iJ ∈ H, the Bargmann-Fock
state given by the constant polynomial function 1.
In the non-relativistic quantum field theory case we saw that basis elements of M could be taken to be either the linear functionals $\psi(\mathbf{x})$ and their conjugates $\overline{\psi(\mathbf{x})}$ or, Fourier transforming, the linear functionals $\alpha(\mathbf{p})$ and their conjugates $\overline{\alpha(\mathbf{p})}$. These coordinates are not real-valued, but complex-valued, and as a result M came with a distinguished natural complex structure J, which is $+i$ on the $\psi(\mathbf{x})$ or the $\alpha(\mathbf{p})$, and $-i$ on their conjugates.
In the relativistic scalar field theory, we must do something very different.
The solutions to the Klein-Gordon equation we are considering are real-valued,
not complex-valued functions, and give a real phase space M to be quantized
(what happens when we consider a theory with configuration space complex val-
ued fields will be discussed in chapter 41). When we complexify and look at the
space M ⊗ C, it naturally decomposes as a representation of the Poincaré group
into two pieces: M+ , the complex functions on the positive energy hyperboloid
and M− , the complex functions on the negative energy hyperboloid. More ex-
plicitly, we can decompose a complexified solution φ(x, t) of the Klein-Gordon
equation as φ = φ+ + φ− , where
$$\phi_+(\mathbf{x},t) = \frac{1}{(2\pi)^{3/2}}\int_{\mathbf{R}^3}\alpha(\mathbf{p})e^{-i\omega_pt}e^{i\mathbf{p}\cdot\mathbf{x}}\,\frac{d^3\mathbf{p}}{\sqrt{2\omega_p}}$$
and
$$\phi_-(\mathbf{x},t) = \frac{1}{(2\pi)^{3/2}}\int_{\mathbf{R}^3}\overline{\alpha(\mathbf{p})}e^{i\omega_pt}e^{-i\mathbf{p}\cdot\mathbf{x}}\,\frac{d^3\mathbf{p}}{\sqrt{2\omega_p}}$$
We will take as complex structure the operator J that is +i on positive
energy wavefunctions and −i on negative energy wavefunctions. Complexified
classical fields in M+ get quantized as annihilation operators, those in M−
as creation operators. Since conjugation interchanges M+ and M− , non-zero
real-valued classical fields have components in both M+ and M− since they
are their own conjugates.
One motivation for this particular choice of J is that it leads to a state space
with states of non-negative energy. Theories with states of arbitrarily negative
energy are considered undesirable since they will tend to have no stable vacuum
state (since any supposed vacuum state could potentially decay into states of
large positive and large negative energy, while preserving total energy). To see
the mechanism for non-negative energy, first consider again the non-relativistic
case, where the Hamiltonian is (for d = 1)
$$h = \frac{1}{2m}\int_{-\infty}^{+\infty}\left|\frac{d}{dx}\psi(x)\right|^2dx = \int_{-\infty}^{+\infty}\frac{p^2}{2m}|\alpha(p)|^2dp$$
which is an operator with positive eigenvalues (this is in normal-ordered form,
but the non-normal-ordered version is still positive, although adding an infinite
positive constant).
WARNING: HAVEN’T FINISHED REWRITING REST OF THIS SEC-
TION.
The Hamiltonian function h is the quadratic polynomial function of the
coordinates φ(x), π(x)
$$h = \int_{\mathbf{R}^3}\frac{1}{2}\left(\pi^2 + (\nabla\phi)^2 + m^2\phi^2\right)d^3x$$
and by laborious calculation one can substitute the above expressions for φ, π
in terms of α(p), α(p) to find h as a quadratic polynomial in these coordinates
on the momentum space fields. A quicker way to find the correct expression is
to use the fact that different momentum components of the field decouple, and
we know the time-dependence of such components, so just need to find the right
h that generates this.
If, as in the non-relativistic case, we interpret φ as a single-particle wave-
function, Hamilton’s equation of motion says
$$\{\phi, h\} = \frac{\partial}{\partial t}\phi$$
and applying this to the component of φ+ with momentum p, we just get
multiplication by $-i\omega_p$. The energy of such a wavefunction would be $\omega_p$, the eigenvalue of $i\frac{\partial}{\partial t}$. These are called "positive frequency" or "positive energy"
wavefunctions. In the case of momentum components of φ− , the eigenvalue is
−ωp , and one has “negative frequency” or “negative energy” wavefunctions.
An expression for h in terms of momentum space field coordinates that will
have the right Poisson brackets on φ+ , φ− is
$$h = \int_{\mathbf{R}^3}\omega_p\,\overline{\alpha(\mathbf{p})}\alpha(\mathbf{p})\,d^3\mathbf{p}$$
and this is the same expression one could have gotten by a long direct calcula-
tion.
In the non-relativistic case, the eigenvalues of the action of $i\frac{\partial}{\partial t}$ on the wavefunctions ψ were non-negative ($\frac{|\mathbf{p}|^2}{2m}$), so the single-particle states had non-
negative energy. Here we find instead eigenvalues ±ωp of both signs, so single-
particle states can have arbitrarily negative energies. This makes a physically
sensible interpretation of H1 as a space of wavefunctions describing a single
relativistic particle difficult if not impossible. We will however see in the next
section that there is a way to quantize this H1 as a phase space, getting a
sensible multi-particle theory with a stable ground state.
tize the theory in exactly the same way as was done with the non-relativistic
Schrödinger equation, taking momentum components of fields to operators by
replacing
$$\alpha(\mathbf{p}) \rightarrow a(\mathbf{p}), \qquad \overline{\alpha(\mathbf{p})} \rightarrow a^\dagger(\mathbf{p})$$
where a(p), a† (p) are operator valued distributions satisfying the commutation
relations
$$[a(\mathbf{p}), a^\dagger(\mathbf{p}')] = \delta^3(\mathbf{p}-\mathbf{p}')$$
For the Hamiltonian we take the normal-ordered form
$$\widehat{H} = \int_{\mathbf{R}^3}\omega_p\,a^\dagger(\mathbf{p})a(\mathbf{p})\,d^3\mathbf{p}$$
Starting with a vacuum state |0i, by applying creation operators one can create
arbitrary positive energy multiparticle states of free relativistic particles with single-particle states having the energy momentum relation
$$E(\mathbf{p}) = \omega_p = \sqrt{|\mathbf{p}|^2 + m^2}$$
$$\widehat{\phi}(\mathbf{x}) = \frac{1}{(2\pi)^{3/2}}\int_{\mathbf{R}^3}\left(a(\mathbf{p})e^{i\mathbf{p}\cdot\mathbf{x}} + a^\dagger(\mathbf{p})e^{-i\mathbf{p}\cdot\mathbf{x}}\right)\frac{d^3\mathbf{p}}{\sqrt{2\omega_p}} \qquad (40.1)$$
$$\widehat{\pi}(\mathbf{x}) = \frac{1}{(2\pi)^{3/2}}\int_{\mathbf{R}^3}(-i\omega_p)\left(a(\mathbf{p})e^{i\mathbf{p}\cdot\mathbf{x}} - a^\dagger(\mathbf{p})e^{-i\mathbf{p}\cdot\mathbf{x}}\right)\frac{d^3\mathbf{p}}{\sqrt{2\omega_p}} \qquad (40.2)$$
By essentially the same computation as for Poisson brackets, one can com-
pute commutators, finding
$$[\widehat{\phi}(\mathbf{x}), \widehat{\pi}(\mathbf{x}')] = i\delta^3(\mathbf{x}-\mathbf{x}'), \qquad [\widehat{\phi}(\mathbf{x}), \widehat{\phi}(\mathbf{x}')] = [\widehat{\pi}(\mathbf{x}), \widehat{\pi}(\mathbf{x}')] = 0$$
The Hamiltonian operator will be quadratic in the field operators and can
be chosen to be
$$\widehat{H} = \int_{\mathbf{R}^3}\frac{1}{2}:\left(\widehat{\pi}(\mathbf{x})^2 + (\nabla\widehat{\phi}(\mathbf{x}))^2 + m^2\widehat{\phi}(\mathbf{x})^2\right):\,d^3x$$
This operator is normal ordered, and a computation (see for instance [10]) shows
that in terms of momentum space operators this is just
$$\widehat{H} = \int_{\mathbf{R}^3}\omega_p\,a^\dagger(\mathbf{p})a(\mathbf{p})\,d^3\mathbf{p}$$
state |0i and time-dependent operators. For the free scalar field theory, we have
explicitly solved for the time-dependence of the field operators. A basic quantity
needed for describing the propagation of quanta of a quantum field theory is
the propagator:
Definition (Green’s function or propagator, scalar field theory). The Green’s
function or propagator for a scalar field theory is the amplitude, for t > t0
For t > 0, this gives the amplitude for propagation of a particle in time t
from the origin to the point x.
Plan to expand this section. Compare to non-relativistic propagator. Com-
pute commutator of fields at arbitrary space-time separations, show that com-
mutator of fields at space-like separations vanishes.
Chapter 41
Just as for non-relativistic quantum fields, the theory of free relativistic scalar
quantum fields starts by taking as phase space an infinite dimensional space of
solutions of an equation of motion. Quantization of this phase space involves
constructing field operators which provide a representation of the corresponding
Heisenberg Lie algebra, by an infinite dimensional version of the Bargmann-Fock
construction. The equation of motion has its own representation-theoretical
significance: it is an eigenvalue equation for a Casimir operator of a group of
space-time symmetries, picking out an irreducible representation of that group.
In this case the Casimir operator is the Klein-Gordon operator, and the space-
time symmetry group is the Poincaré group. The Poincaré group acts on the
phase space of solutions to the Klein-Gordon equation, preserving the Poisson
bracket. One can thus use the same methods as in the finite-dimensional case
to get a representation of the Poincaré group by intertwining operators for
the Heisenberg Lie algebra representation (that representation is given by the
field operators). These methods give a representation of the Lie algebra of the
Poincaré group in terms of quadratic combinations of the field operators.
We’ll begin with the case of an even simpler group action on the phase space,
that coming from an “internal symmetry” one gets if one takes multi-component
scalar fields, with an orthogonal group or unitary group acting on the real or
complex vector space in which the classical fields take their values.
To get a theory with such a distinction we need to introduce fields with more
components. Two possibilities are to consider real fields with m components, in
which case we will have a theory with SO(m) symmetry, and U (1) = SO(2) the
m = 2 special case, or to consider complex fields with m components, in which
case we have theories with U (m) symmetry, and m = 1 the U (1) special case.
µL = q1 p2 − q2 p1
To get a quadratic functional on the fields that will have the desired Poisson
bracket with the fields for each value of x, we need to just integrate the analog
of µL over R3 . We will denote the result by Q, since it is an observable that
will have a physical interpretation as electric charge when this theory is coupled
to the electromagnetic field (see chapter 42):
$$Q = \int_{\mathbf{R}^3}\left(\pi_2(\mathbf{x})\phi_1(\mathbf{x}) - \pi_1(\mathbf{x})\phi_2(\mathbf{x})\right)d^3x$$
to check that
$$\left\{Q, \begin{pmatrix}\phi_1(\mathbf{x})\\ \phi_2(\mathbf{x})\end{pmatrix}\right\} = \begin{pmatrix}-\phi_2(\mathbf{x})\\ \phi_1(\mathbf{x})\end{pmatrix}, \qquad \left\{Q, \begin{pmatrix}\pi_1(\mathbf{x})\\ \pi_2(\mathbf{x})\end{pmatrix}\right\} = \begin{pmatrix}-\pi_2(\mathbf{x})\\ \pi_1(\mathbf{x})\end{pmatrix}$$
Quantization of the classical field theory gives us a unitary representation U
of SO(2), with
$$U'(L) = -i\widehat{Q} = -i\int_{\mathbf{R}^3}\left(\widehat{\pi}_2(\mathbf{x})\widehat{\phi}_1(\mathbf{x}) - \widehat{\pi}_1(\mathbf{x})\widehat{\phi}_2(\mathbf{x})\right)d^3x$$
The operator
$$U(\theta) = e^{-i\theta\widehat{Q}}$$
One expects that since the time evolution action on the classical field space
commutes with the SO(2) action, the operator $\widehat{Q}$ should commute with the Hamiltonian operator $\widehat{H}$. This can readily be checked by computing $[\widehat{H}, \widehat{Q}]$ using
$$\widehat{H} = \int_{\mathbf{R}^3}\omega_p\left(a_1^\dagger(\mathbf{p})a_1(\mathbf{p}) + a_2^\dagger(\mathbf{p})a_2(\mathbf{p})\right)d^3\mathbf{p}$$
Note that the vacuum state $|0\rangle$ is an eigenvector for $\widehat{Q}$ and $\widehat{H}$ with eigenvalue 0: it has zero energy and zero charge. States $a_1^\dagger(\mathbf{p})|0\rangle$ and $a_2^\dagger(\mathbf{p})|0\rangle$ are eigenvectors of $\widehat{H}$ with eigenvalue, and thus energy, $\omega_p$, but these are not eigenvectors of $\widehat{Q}$, so do not have a well-defined charge.
All of this can be generalized to the case of m > 2 real scalar fields, with a
larger group SO(m) now acting instead of the group SO(2). The Lie algebra is now multi-dimensional, with a basis the elementary antisymmetric matrices $\epsilon_{jk}$, with $j, k = 1, 2, \cdots, m$ and $j < k$, which correspond to infinitesimal rotations in the $j-k$ planes. Group elements can be constructed by multiplying rotations $e^{\theta\epsilon_{jk}}$ in different planes. Instead of a single operator $\widehat{Q}$, we get multiple operators
$$-i\widehat{Q}_{jk} = -i\int_{\mathbf{R}^3}\left(\widehat{\pi}_k(\mathbf{x})\widehat{\phi}_j(\mathbf{x}) - \widehat{\pi}_j(\mathbf{x})\widehat{\phi}_k(\mathbf{x})\right)d^3x$$
and conjugation by
$$U_{jk}(\theta) = e^{-i\theta\widehat{Q}_{jk}}$$
rotates the field operators in the j − k plane. These also provide unitary oper-
ators on the state space, and, taking appropriate products of them, a unitary
representation of the full group SO(m) on the state space. The $\widehat{Q}_{jk}$ commute
with the Hamiltonian, so the energy eigenstates of the theory break up into ir-
reducible representations of SO(m) (a subject we haven’t discussed for m > 3).
Note that introducing complex fields in a theory like this with field equations
that are second-order in time means that for each x we have a phase space
with two complex dimensions (φ(x) and π(x)). Using Bargmann-Fock methods
requires complexifying one’s phase space, which is a bit confusing here since
the phase space is already given in terms of complex fields. We can however
proceed to find the operator that generates the U (1) symmetry as follows.
In terms of complex fields, the SO(2) transformations on the pair φ1 , φ2 of
real fields become U (1) phase transformations, with Q now given by
$$Q = -i\int_{\mathbf{R}^3}\left(\pi(\mathbf{x})\phi(\mathbf{x}) - \overline{\pi}(\mathbf{x})\overline{\phi}(\mathbf{x})\right)d^3x$$
satisfying
$$\{Q, \phi(\mathbf{x})\} = i\phi(\mathbf{x}), \qquad \{Q, \overline{\phi}(\mathbf{x})\} = -i\overline{\phi}(\mathbf{x})$$
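As a check (a short verification not spelled out in the text), using $\{\phi(\mathbf{x}), \pi(\mathbf{x}')\} = \{\overline{\phi}(\mathbf{x}), \overline{\pi}(\mathbf{x}')\} = \delta^3(\mathbf{x}-\mathbf{x}')$ and the vanishing of the other brackets:
$$\{Q, \phi(\mathbf{x})\} = -i\int_{\mathbf{R}^3}\{\pi(\mathbf{x}'), \phi(\mathbf{x})\}\phi(\mathbf{x}')\,d^3x' = -i\int_{\mathbf{R}^3}\left(-\delta^3(\mathbf{x}'-\mathbf{x})\right)\phi(\mathbf{x}')\,d^3x' = i\phi(\mathbf{x})$$
with the conjugate relation following in the same way from the $\overline{\pi}\,\overline{\phi}$ term in Q.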
Quantization of the classical field theory gives a representation of the infinite
dimensional Heisenberg algebra with commutation relations
$$[\widehat{\phi}(\mathbf{x}), \widehat{\phi}(\mathbf{x}')] = [\widehat{\pi}(\mathbf{x}), \widehat{\pi}(\mathbf{x}')] = [\widehat{\phi}^\dagger(\mathbf{x}), \widehat{\phi}^\dagger(\mathbf{x}')] = [\widehat{\pi}^\dagger(\mathbf{x}), \widehat{\pi}^\dagger(\mathbf{x}')] = 0$$
$$[\widehat{\phi}(\mathbf{x}), \widehat{\pi}(\mathbf{x}')] = [\widehat{\phi}^\dagger(\mathbf{x}), \widehat{\pi}^\dagger(\mathbf{x}')] = i\delta^3(\mathbf{x}-\mathbf{x}')$$
Quantization of the quadratic functional Q of the fields is done with the normal-
ordering prescription, to get
$$\widehat{Q} = -i\int_{\mathbf{R}^3}:\left(\widehat{\pi}(\mathbf{x})\widehat{\phi}(\mathbf{x}) - \widehat{\pi}^\dagger(\mathbf{x})\widehat{\phi}^\dagger(\mathbf{x})\right):\,d^3x$$
R3
and
U (θ) = e−iθQ
b
π U (θ)−1 = eiθ π
U (θ)b π † U (θ)−1 = e−iθ π
b, U (θ)b b†
It will also give a representation of U (1) on states, with the state space de-
composing into sectors each labeled by the integer eigenvalue of the operator
Q.
b
In the Bargmann-Fock quantization of this theory, we can express quantum
fields now in terms of a different set of two annihilation and creation operators
$$a(\mathbf{p}) = \frac{1}{\sqrt{2}}(a_1(\mathbf{p}) + ia_2(\mathbf{p})), \qquad a^\dagger(\mathbf{p}) = \frac{1}{\sqrt{2}}(a_1^\dagger(\mathbf{p}) - ia_2^\dagger(\mathbf{p}))$$
$$b(\mathbf{p}) = \frac{1}{\sqrt{2}}(a_1(\mathbf{p}) - ia_2(\mathbf{p})), \qquad b^\dagger(\mathbf{p}) = \frac{1}{\sqrt{2}}(a_1^\dagger(\mathbf{p}) + ia_2^\dagger(\mathbf{p}))$$
The only non-zero commutation relations between these operators will be
$$[a(\mathbf{p}), a^\dagger(\mathbf{p}')] = \delta^3(\mathbf{p}-\mathbf{p}'), \qquad [b(\mathbf{p}), b^\dagger(\mathbf{p}')] = \delta^3(\mathbf{p}-\mathbf{p}')$$
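This follows from a short computation (not spelled out in the text), using $[a_j(\mathbf{p}), a_k^\dagger(\mathbf{p}')] = \delta_{jk}\delta^3(\mathbf{p}-\mathbf{p}')$; for example
$$[a(\mathbf{p}), a^\dagger(\mathbf{p}')] = \frac{1}{2}\left([a_1(\mathbf{p}), a_1^\dagger(\mathbf{p}')] + [a_2(\mathbf{p}), a_2^\dagger(\mathbf{p}')]\right) = \delta^3(\mathbf{p}-\mathbf{p}')$$
while cross terms such as $[a(\mathbf{p}), b^\dagger(\mathbf{p}')] = \frac{1}{2}\left([a_1, a_1^\dagger] - [a_2, a_2^\dagger]\right) = 0$ cancel.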
so we see that we have, for each p, two independent sets of standard annihilation
and creation operators, which will act on a tensor product of two standard
harmonic oscillator state spaces. The states created and annihilated by the
a† (p) and a(p) operators will have an interpretation as particles of momentum
p, whereas those created and annihilated by the b† (p) and b(p) operators will
be antiparticles of momentum p. The vacuum state will satisfy
a(p)|0i = b(p)|0i = 0
Using these creation and annihilation operators, the definition of the complex
field operators is
Definition (Complex scalar quantum field). The complex scalar quantum field
operators are the operator-valued distributions defined by
$$\widehat{\phi}(\mathbf{x}) = \frac{1}{(2\pi)^{3/2}}\int_{\mathbf{R}^3}\left(a(\mathbf{p})e^{i\mathbf{p}\cdot\mathbf{x}} + b^\dagger(\mathbf{p})e^{-i\mathbf{p}\cdot\mathbf{x}}\right)\frac{d^3\mathbf{p}}{\sqrt{2\omega_p}}$$
$$\widehat{\phi}^\dagger(\mathbf{x}) = \frac{1}{(2\pi)^{3/2}}\int_{\mathbf{R}^3}\left(b(\mathbf{p})e^{i\mathbf{p}\cdot\mathbf{x}} + a^\dagger(\mathbf{p})e^{-i\mathbf{p}\cdot\mathbf{x}}\right)\frac{d^3\mathbf{p}}{\sqrt{2\omega_p}}$$
$$\widehat{\pi}(\mathbf{x}) = \frac{1}{(2\pi)^{3/2}}\int_{\mathbf{R}^3}(-i\omega_p)\left(a(\mathbf{p})e^{i\mathbf{p}\cdot\mathbf{x}} - b^\dagger(\mathbf{p})e^{-i\mathbf{p}\cdot\mathbf{x}}\right)\frac{d^3\mathbf{p}}{\sqrt{2\omega_p}}$$
$$\widehat{\pi}^\dagger(\mathbf{x}) = \frac{1}{(2\pi)^{3/2}}\int_{\mathbf{R}^3}(-i\omega_p)\left(b(\mathbf{p})e^{i\mathbf{p}\cdot\mathbf{x}} - a^\dagger(\mathbf{p})e^{-i\mathbf{p}\cdot\mathbf{x}}\right)\frac{d^3\mathbf{p}}{\sqrt{2\omega_p}}$$
$$[\widehat{\phi}(\mathbf{x}), \widehat{\phi}(\mathbf{x}')] = [\widehat{\pi}(\mathbf{x}), \widehat{\pi}(\mathbf{x}')] = [\widehat{\phi}^\dagger(\mathbf{x}), \widehat{\phi}^\dagger(\mathbf{x}')] = [\widehat{\pi}^\dagger(\mathbf{x}), \widehat{\pi}^\dagger(\mathbf{x}')] = 0$$
$$[\widehat{\phi}^\dagger(\mathbf{x}), \widehat{\pi}(\mathbf{x}')] = [\widehat{\phi}(\mathbf{x}), \widehat{\pi}^\dagger(\mathbf{x}')] = i\delta^3(\mathbf{x}-\mathbf{x}')$$
The Hamiltonian operator will be
$$\widehat{H} = \int_{\mathbf{R}^3}:\left(\widehat{\pi}^\dagger(\mathbf{x})\widehat{\pi}(\mathbf{x}) + (\nabla\widehat{\phi}^\dagger(\mathbf{x}))\cdot(\nabla\widehat{\phi}(\mathbf{x})) + m^2\widehat{\phi}^\dagger(\mathbf{x})\widehat{\phi}(\mathbf{x})\right):\,d^3x$$
$$= \int_{\mathbf{R}^3}\omega_p\left(a^\dagger(\mathbf{p})a(\mathbf{p}) + b^\dagger(\mathbf{p})b(\mathbf{p})\right)d^3\mathbf{p}$$
Note that the classical solutions to the Klein-Gordon equation have both
positive and negative energy, but the quantization is chosen so that negative
energy solutions correspond to antiparticle annihilation and creation operators,
and all states of the quantum theory have non-negative energy.
41.2 Poincaré symmetry and scalar fields
Momentum and energy operators, angular momentum operators. Discuss action
of Lorentz boosts.
The Poincaré group action on the coordinates $\widetilde{\phi}(p)$ on H1 will be given by
$$u(a,\Lambda)\widetilde{\phi}(p) = e^{-ip\cdot a}\widetilde{\phi}(\Lambda^{-1}p)$$
Chapter 42
operator with certain commutation relations with the field operators, acting
with integral eigenvalues on the space of states. Instead of just multiplying
fields by a constant phase eiϕ , one can imagine multiplying by a phase that
varies with the coordinates x, so
(so ϕ will be a function, taking values in R/2π). By doing this, we are making
a huge group of transformations act on the theory. Elements of this group are
called gauge transformations:
Definition (Gauge group). The group G of functions on R4 with values in the
unit circle U (1), with group law given by point-wise multiplication
|∇ψ|2
Aµ (x)
and such that the gauge group G acts on the space of U (1) connections by
With this new object one can define a new sort of derivative which will have
homogeneous transformation properties
Definition (Covariant derivative). Given a connection A, the associated co-
variant derivative in the µ direction is the operator
Note that under a gauge transformation, one has
$$(D_A)_\mu\psi \rightarrow e^{i\varphi(x)}(D_A)_\mu\psi$$
and terms in a Hamiltonian such as
$$\sum_{j=1}^{3}\overline{((D_A)_j\psi)}\,((D_A)_j\psi)$$
42.3 The Pauli-Schrödinger equation in an electromagnetic field
The Pauli-Schrödinger equation (31.1) describes a free spin-half non-relativistic
quantum particle. One can couple it to a vector potential by the “minimal
coupling” prescription of replacing derivatives by covariant derivatives, with
the result
$$i\left(\frac{\partial}{\partial t} - iA_0\right)\begin{pmatrix}\psi_1(\mathbf{q})\\ \psi_2(\mathbf{q})\end{pmatrix} = -\frac{1}{2m}\left(\sigma\cdot(\nabla - i\mathbf{A})\right)^2\begin{pmatrix}\psi_1(\mathbf{q})\\ \psi_2(\mathbf{q})\end{pmatrix}$$
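Expanding the square (a standard computation, included here as a check rather than taken from the text) exhibits the coupling of the spin to the magnetic field $\mathbf{B} = \nabla\times\mathbf{A}$:
$$\left(\sigma\cdot(\nabla - i\mathbf{A})\right)^2 = (\nabla - i\mathbf{A})^2 + \sigma\cdot\mathbf{B}$$
so the right-hand side of the equation above contains the magnetic moment term $-\frac{1}{2m}\sigma\cdot\mathbf{B}$ acting on the two-component wavefunction.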
Chapter 43
Quantization of the
Electromagnetic Field: the
Photon
Understanding the classical field theory of coupled scalar fields and vector po-
tentials is rather difficult, with the quantized theory even more so, due to the
fact that the Hamiltonian is no longer quadratic in the field variables. If one
simplifies the problem by ignoring the scalar fields and just considering the
vector potentials, one does get a theory with quadratic Hamiltonian that can
be readily understood and quantized. The classical equations of motion are
the Maxwell equation in a vacuum, with solutions electromagnetic waves. The
quantization will be a relativistic theory of free, massless particles of helicity
±1, the photons.
To get a sensible, unitary theory of photons, one must take into account
the infinite dimensional gauge group G that acts on the classical phase space of
solutions to the Maxwell equations. We will see that there are various ways of
doing this, each with its own subtleties.
dF = 0, d ∗ F = 0
Then in components.
Show that gauge transform of a solution is a solution, gauge group acts on
the solution space.
43.2 Hamiltonian formalism for electromagnetic fields
Equations in Hamiltonian form. Hamiltonian is E 2 + B 2 .
First problem: data at fixed t does not give a unique solution. Deal with
this by going to temporal gauge A0 = 0.
Second problem: no Gauss’s law. Have remaining symmetry under time-
independent gauge transformations. Compute moment map for time-independent
gauge transformation.
43.3 Quantization
Two general philosophies: impose constraints on states, or on the space one
quantizes.
Chapter 44
where the γj are generators of the Clifford algebra Cliff(n). The same thing is
true for Minkowski space, where one takes the Clifford algebra Cliff(3, 1) which
is generated by elements γ0 , γ1 , γ2 , γ3 satisfying
Using the Clifford algebra generators, we find that this Casimir operator has a
square root
±(−γ0 P0 + γ1 P1 + γ2 P2 + γ3 P3 )
so one could instead look for solutions to
±(−γ0 P0 + γ1 P1 + γ2 P2 + γ3 P3 ) = im
Definition (Dirac operator and the Dirac equation). The Dirac operator is the
operator
$$\slashed{D} = -\gamma_0\frac{\partial}{\partial x_0} + \gamma_1\frac{\partial}{\partial x_1} + \gamma_2\frac{\partial}{\partial x_2} + \gamma_3\frac{\partial}{\partial x_3}$$
and the Dirac equation is the equation
$$\slashed{D}\Psi = m\Psi$$
which act on
$$\Psi = \begin{pmatrix}\psi_L\\ \psi_R\end{pmatrix}$$
(recall that using γ-matrices this way, ψL transforms under the Lorentz group
as the SL representation, ψR as the dual of the SR representation).
The Dirac equation is then
$$\begin{pmatrix}0 & \left(\frac{\partial}{\partial x_0} - \sigma\cdot\nabla\right)\\ \left(\frac{\partial}{\partial x_0} + \sigma\cdot\nabla\right) & 0\end{pmatrix}\begin{pmatrix}\psi_L\\ \psi_R\end{pmatrix} = m\begin{pmatrix}\psi_L\\ \psi_R\end{pmatrix}$$
or, in components
$$\left(\frac{\partial}{\partial x_0} - \sigma\cdot\nabla\right)\psi_R = m\psi_L$$
$$\left(\frac{\partial}{\partial x_0} + \sigma\cdot\nabla\right)\psi_L = m\psi_R$$
In the case that m = 0, these equations decouple and we get
(p0 + σ · p)ψ̃R = 0
(p0 − σ · p)ψ̃L = 0
Since
(p0 + σ · p)(p0 − σ · p) = p20 − (σ · p)2 = p20 − |p|2
both ψ̃R and ψ̃L satisfy
(p20 − |p|2 )ψ̃ = 0
so are functions with support on the positive (p0 = |p|) and negative (p0 = −|p|)
energy null-cone. These are Fourier transforms of solutions to the massless
Klein-Gordon equation
$$\left(-\frac{\partial^2}{\partial x_0^2} + \frac{\partial^2}{\partial x_1^2} + \frac{\partial^2}{\partial x_2^2} + \frac{\partial^2}{\partial x_3^2}\right)\psi = 0$$
Definition (Helicity). The operator
$$h = \frac{1}{2}\frac{\sigma\cdot\mathbf{p}}{|\mathbf{p}|}$$
is called the helicity operator. It has eigenvalues $\pm\frac{1}{2}$, and its eigenstates are said to have helicity $\pm\frac{1}{2}$. States with helicity $+\frac{1}{2}$ are called "left-handed", those with helicity $-\frac{1}{2}$ are called "right-handed".
$$u_L(\mathbf{p})e^{i(-p_0x_0 + \mathbf{p}\cdot\mathbf{x})}$$
where $u_L \in \mathbf{C}^2$ satisfies
$$h\,u_L = +\frac{1}{2}u_L$$
so the helicity is $+\frac{1}{2}$, and negative energy ($p_0 = -|\mathbf{p}|$) solutions
$$u_L(\mathbf{p})e^{i(-p_0x_0 + \mathbf{p}\cdot\mathbf{x})}$$
with helicity $-\frac{1}{2}$. After quantization, this wave equation gives a field theory describing massless left-handed particles and right-handed antiparticles. The
Weyl equation for ψR will give a description of massless right-handed particles
and left-handed antiparticles.
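A small numerical check of this helicity statement (a sketch, not from the text; the particular momentum value is an arbitrary choice):

    # Sketch: for the Weyl equation (p0 - sigma.p) psi_L = 0 with p0 = |p|, the
    # solution is the sigma.p eigenvector with eigenvalue +|p|, i.e. helicity +1/2.
    import numpy as np

    s1 = np.array([[0, 1], [1, 0]], dtype=complex)
    s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
    s3 = np.array([[1, 0], [0, -1]], dtype=complex)

    p = np.array([0.3, -1.2, 0.7])              # arbitrary momentum
    sigma_p = p[0]*s1 + p[1]*s2 + p[2]*s3
    p_norm = np.linalg.norm(p)

    # positive energy solution: kernel of (p0 - sigma.p) with p0 = |p|
    vals, vecs = np.linalg.eigh(sigma_p)
    u_L = vecs[:, np.argmax(vals)]              # eigenvector with eigenvalue +|p|

    helicity = u_L.conj() @ (0.5*sigma_p/p_norm) @ u_L
    print(np.isclose(vals.max(), p_norm), np.isclose(helicity.real, 0.5))  # True True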
Show Lorentz covariance.
Canonical formalism Hamiltonian, Lagrangian
Separate sub-sections for the Dirac and Majorana cases
44.3 Symmetries
Write down formulas for Poincaré and internal symmetry actions.
Chapter 45
An Introduction to the
Standard Model
Chapter 46
Further Topics
Appendix A
Conventions
I’ve attempted to stay close to the conventions used in the physics literature,
leading to the choices listed here. Units have been chosen so that $\hbar = 1$.
To get from the self-adjoint operators used by physicists as generators of
symmetries, multiply by −i to get a skew-adjoint operator in a unitary repre-
sentation of the Lie algebra, for example
• The Lie bracket on the space of functions on phase space M is given by
the Poisson bracket, determined by
{q, p} = 1
L1 = Q2 P3 − Q3 P2
For the spin $\frac{1}{2}$ representation, the self-adjoint operators are $S_j = \frac{\sigma_j}{2}$, and the $X_j = -i\frac{\sigma_j}{2}$ give the Lie algebra representation. Unlike the integer spin representations, this representation does not come from the bosonic quantization map Γ.
Given a unitary Lie algebra representation $\pi'(X)$, the unitary group action on states is given by
$$|\psi\rangle \rightarrow \pi(e^{\theta X})|\psi\rangle = e^{\theta\pi'(X)}|\psi\rangle$$
Instead of considering the action on states, one can consider the action on operators by conjugation
$$O \rightarrow O(\theta) = e^{-\theta\pi'(X)}\,O\,e^{\theta\pi'(X)}$$
Operators in the Heisenberg picture satisfy
or infinitesimally
$$\frac{d}{dt}O(t) = [O, -iH]$$
which is the quantization of the Poisson bracket relation in Hamiltonian mechanics
$$\frac{d}{dt}f = \{f, h\}$$
Conventions for special relativity.
Conventions for representations on field operators.
Conventions for anticommuting variables. for unitary and odd super Lie
algebra actions.
Bibliography
[1] Alvarez, O., Lectures on quantum mechanics and the index theorem, in
Geometry and Quantum Field Theory, Freed, D, and Uhlenbeck, K., eds.,
American Mathematical Society, 1995.
[5] Berezin, F., The Method of Second Quantization, Academic Press, 1966.
[6] Berezin, F., and Marinov, M., Particle Spin Dynamics as the Grassmann
Variant of Classical Mechanics, Annals of Physics 104 (1972) 336-362.
[10] Das, A., Field Theory, a Path Integral Approach, World Scientific, 1993.
[11] Dimock, J., Quantum Mechanics and Quantum Field Theory, Cambridge
University Press, 2011.
[13] Fadeev, L.D. and Yakubovskii, O.A., Lectures on Quantum Mechanics for
Mathematics Students, AMS, 2009.
[15] Feynman, R. and Hibbs, A., Quantum Mechanics and Path Integrals,
McGraw-Hill, 1965.
[16] Feynman, R., Feynman Lectures on Physics, Volume 3, Addison-Wesley,
1965. Online at http://feynmanlectures.caltech.edu
[17] Feynman, R., The Character of Physical Law, page 129, MIT Press, 1967.
[20] Folland, G., Quantum Field Theory: A tourist guide for mathematicians,
AMS, 2008.
[21] Gelfand, I., and Vilenkin, N. Ya., Generalized Functions, Volume 4, Aca-
demic Press, 1964.
[27] Hall, B., Lie Groups, Lie Algebras, and Representations: An Elementary
Introduction, Springer-Verlag, 2003.
[31] Haroche, S. and Ramond, J-M., Exploring the Quantum: Atoms, Cavities
and Photons, Oxford University Press, 2006.
[32] Hatfield, B., Quantum Field Theory of Point Particles and Strings,
Addison-Wesley, 1992.
[34] Hirsch, M., and Smale, S., Differential Equations, Dynamical Systems, and
Linear Algebra, Academic Press, 1974.
[36] Kostant, B., Quantization and unitary representations: Part I, Prequanti-
zation, in Lecture Notes in Mathematics, 170 (1970) 87-208.
[39] Lion, G., and Vergne, M., The Weil Representation, Maslov index and
Theta series, Birkhäuser, 1980.
[42] Meinrenken, E., Clifford Algebras and Lie Theory, Springer-Verlag, 2013.
[44] Ottesen, J., Infinite Dimensional Groups and Algebra in Quantum Physics,
Springer, 1995.
[46] Peskin, M., and Schroeder, D., An Introduction to Quantum Field Theory,
Westview Press, 1995.
[47] Pressley, A., and Segal, G., Loop Groups, Oxford University Press, 1986.
[49] Porteous, I., Clifford Algebras and the Classical Groups, Cambridge Univer-
sity Press, 1995.
[54] Schulman, L., Techniques and Applications of Path Integration, John Wiley
and Sons, 1981.
[55] Shale, D., Linear symmetries of free boson fields, Trans. Amer. Math. Soc.
103 (1962) 149-167.
[56] Shale, D. and Stinespring, W., States of the Clifford algebra, Ann. of Math.
(2) 80 (1964) 365-381.
[57] Shankar, R., Principles of Quantum Mechanics, 2nd Ed., Springer, 1994.
[58] Singer, S., Linearity, Symmetry, and Prediction in the Hydrogen Atom,
Springer-Verlag, 2005.
[59] Sternberg, S., Group Theory and Physics, Cambridge University Press,
1994.
[60] Stillwell, J., Naive Lie Theory, Springer-Verlag, 2010. http://www.
springerlink.com/content/978-0-387-78214-0
[73] Wigner, E., The Unreasonable Effectiveness of Mathematics in the Natural
Sciences, Comm. Pure Appl. Math. 13 (1960) 1-14.
[74] Witten, E., Supersymmetry and Morse Theory, J. Differential Geometry
17 (1982) 661-692.
[80] Zurek, W., Decoherence and the Transition from Quantum to Classical –
Revisited, http://arxiv.org/abs/quant-ph/0306072
[81] Zurek, W., Quantum Darwinism, Nature Physics 5 (2009) 181.