Special Topics in Particle Physics

Robert Geroch

April 18, 2005
Contents

1  The Klein-Gordon Equation
2  Hilbert Space and Operators
3  Positive-Frequency Solutions of the Klein-Gordon Equation
4  Constructing Hilbert Spaces and Operators
5  Hilbert Space and Operators for the Klein-Gordon Equation
6  The Direct Sum of Hilbert Spaces
7  The Completion of an Inner-Product Space
8  The Complex-Conjugate Space of a Hilbert Space
9  The Tensor Product of Hilbert Spaces
10 Fock Space: The Symmetric Case
11 Fock Space: The Anti-Symmetric Case
12 Klein-Gordon Fields as Operators
13 The Hilbert Space of Solutions of Maxwell's Equations
14 Maxwell Fields as Operators
15 The Poincaré Group
16 Representations of the Poincaré Group
17 Casimir Operators: Spin and Mass
18 Spinors
19 The Dirac Equation
20 The Neutrino Equation
21 Complex Klein-Gordon Fields
22 Positive Energy
23 Fields as Operators: Propagators
24 Spin and Statistics
25 *-Algebras
26 Scattering: The S-Matrix
27 The Hilbert Space of Interacting States
28 Calculating the S-Matrix: An Example
29 The Formula for the S-Matrix
30 Dimensions
31 Charge Reversal
32 Parity and Time Reversal
33 Extending Operators to Tensor Products and Direct Sums
34 Electromagnetic Interactions
35 Transition Amplitudes
1 The Klein-Gordon Equation
We want to write down some kind of a quantum theory for a free relativistic
particle. We are familiar with the old Schrödinger prescription, which more or
less instructs us as to how to write down a quantum theory for a simple, nonrel-
ativistic classical system. The idea is to mimic as much of that prescription as
we can. In doing this, a number of difficulties will be encountered which, how-
ever, we shall be able to resolve. There is a reasonable and consistent quantum
theory for a free relativistic (spin zero) particle.
Recall the Schrödinger prescription. We have a classical system (e.g., a
pendulum, or a ball rolling on a table). The manifold of possible instantaneous
configurations of this system is called configuration space, and points of this
manifold are labeled by letters such as x. However, in order to specify completely
the state of the system (i.e., in order to give enough information to uniquely
determine its future evolution), we must specify at some initial time both its
configuration x and its momentum p. The collection of such pairs (x, p) is
called phase space. (More precisely, phase space is the cotangent bundle of
configuration space.) Finally, the dynamics of the system is described by a
certain real-valued function on phase space, H(x, p), the Hamiltonian. The
time-evolution of the system (i.e., of its point in phase space) is given by Hamilton's
equations:
    dx/dt = ∂H/∂p,    dp/dt = -∂H/∂x    (1)
Thus, the complete dynamical history of the classical system is represented by
curves (solutions of Eqn. (1)), (x, p)(t), in phase space. (More precisely, by
integral curves of the Hamiltonian vector field in phase space.)
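These integral curves can be exhibited concretely. The sketch below (not part of the notes; the oscillator system H(x, p) = p^2/2m + kx^2/2, the symplectic Euler scheme, and all numerical values are illustrative choices) integrates Eqn. (1) and checks that the orbit stays on a level set of H.

```python
import math

# Integral curves of Hamilton's equations (1) for the harmonic
# oscillator H(x, p) = p^2/(2m) + (1/2) k x^2, via symplectic Euler.
m, k = 1.0, 1.0
dt, steps = 1e-3, 10_000

x, p = 1.0, 0.0                      # initial point in phase space
for _ in range(steps):
    p -= dt * (k * x)                # dp/dt = -dH/dx
    x += dt * (p / m)                # dx/dt = +dH/dp

# The orbit stays (approximately) on a level set of the Hamiltonian:
E0 = 0.5 * k * 1.0**2
E = p**2 / (2*m) + 0.5 * k * x**2
assert abs(E - E0) < 1e-3
```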
The state of the corresponding quantum system is characterized not by a
point in phase space as in the classical case, but rather by a complex-valued
function ψ(x) on configuration space. The time-evolution of the state of the
system is then given, not by Eqn. (1) as in the classical case, but rather by the
Schrödinger equation

    iℏ ∂ψ/∂t = H(x, -iℏ ∂/∂x) ψ    (2)

where the operator on the right means "at each appearance of p in H, substitute
-iℏ ∂/∂x". (Clearly, this prescription may become ambiguous for a sufficiently
complicated classical system.) Thus, the complete dynamical history of the
system is represented by a certain complex-valued function ψ(x, t) of location
in configuration space and time.
We now attempt to apply this prescription to a free relativistic particle of mass
m ≥ 0. The (4-)momentum of such a classical particle, p^a, satisfies p^a p_a = m^2.
(Latin indices represent (4-)vectors or tensors in Minkowski space. We
use signature (+, -, -, -).) Choose a particular unit (future-directed) timelike
vector t^a (a rest frame), and consider the component of p^a parallel to t^a,
E = p^a t_a, and its components perpendicular to t^a, p. Then, from p^a p_a = m^2,
we obtain the standard relation between this energy and 3-momentum:

    E = (p·p + m^2)^{1/2}    (3)
(Here and hereafter, we set the speed of light, c, equal to one.) The plus sign on
the right in Eqn. (3) results from the fact that p^a is a future-directed timelike
vector. It seems natural to consider Eqn. (3) as representing the Hamiltonian
for a free relativistic particle. We are thus led to consider the dynamical history
of the quantum particle as being characterized by a complex-valued function
ψ(x^a) on Minkowski space (x^a represents position in Minkowski space; it
replaces both the x and t in the Schrödinger theory), satisfying the equation:

    iℏ ∂ψ/∂t = (-ℏ^2 ∇^2 + m^2)^{1/2} ψ    (4)
The first set of difficulties now appears. In the first place, it is not obvious
that Eqn. (4) is in any sense Lorentz invariant, i.e., that it is independent of our
original choice of t^a. Furthermore, it is not clear what meaning is to be given
to the operator on the right side of Eqn. (4): what does the square root of a
differential operator mean? Both of these difficulties can be made to disappear,
after a fashion, by multiplying both sides of Eqn. (4) by another, equally obscure,
operator, iℏ ∂/∂t + (-ℏ^2 ∇^2 + m^2)^{1/2}, and expanding using associativity. The
result is the Klein-Gordon equation:

    (□ + m^2/ℏ^2) ψ = 0,    or    (□ + μ^2) ψ = 0    (5)

which is both meaningful and relativistically invariant. (We set μ = m/ℏ.)
We might expect intuitively that the consequence of multiplying Eqn. (4) by
something to get Eqn. (5) will be that the number of solutions of Eqn. (5) will
be rather larger than the number of solutions of Eqn. (4) (whatever that means).
As we shall see later, this intuitive feeling is indeed borne out.
To summarize, we have decided to describe our quantized free relativistic
particle by a complex-valued function ψ on Minkowski space, which satisfies
Eqn. (5).
Just for the fun of it, let's look for a solution of Eqn. (5). We try

    ψ = e^{i k_a x^a}    (6)

where k^a is a constant vector field in Minkowski space. Substituting Eqn. (6)
into Eqn. (5), we discover that (6) is indeed a solution provided

    k^a k_a = μ^2    (7)

i.e., provided k^a is timelike with norm μ.
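This can be spot-checked numerically in 1+1 dimensions: with k^0 = (k_1^2 + μ^2)^{1/2}, a central-difference approximation to (□ + μ^2)ψ vanishes for the plane wave (6). (A sketch only; the values of μ, k_1, the sample point, and the step size are illustrative choices.)

```python
import cmath, math

# Finite-difference check (1+1 dimensions) that the plane wave (6)
# solves the Klein-Gordon equation (5) when k_a k^a = mu^2.
mu = 1.3
k1 = 0.7
k0 = math.sqrt(k1**2 + mu**2)          # normalization condition (7)

def psi(t, x):
    return cmath.exp(1j * (k0*t - k1*x))   # k_a x^a = k0 t - k1 x

h = 1e-4
t0, x0 = 0.4, -0.9
d2t = (psi(t0+h, x0) - 2*psi(t0, x0) + psi(t0-h, x0)) / h**2
d2x = (psi(t0, x0+h) - 2*psi(t0, x0) + psi(t0, x0-h)) / h**2
box_psi = d2t - d2x                     # wave operator, signature (+,-)
residual = box_psi + mu**2 * psi(t0, x0)
assert abs(residual) < 1e-5
```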
In the Schrödinger prescription, the wave function has a definite and simple
physical interpretation: ψ ψ̄ is the probability density for finding the particle.
This interpretation rests on the conservation equation

    ∂/∂t (ψ ψ̄) = ∇ · [ (ℏ/2mi)(ψ̄ ∇ψ - ψ ∇ψ̄) ]    (8)

(Proof: evaluate the time-derivative on the left using (2), and verify that the
result is the same as the expression on the right.) This looks very much like
the nonrelativistic form of the statement that the 4-divergence of some 4-vector
vanishes. Hence, we want to construct some divergence-free 4-vector from solu-
tions of the Klein-Gordon equation. One soon discovers such an object which,
in fact, looks suggestively like the object appearing in Eqn. (8):

    J^a = (1/2i)(ψ̄ ∇^a ψ - ψ ∇^a ψ̄)    (9)

Note that, because of (5), J^a is divergence-free.
We cannot interpret the time-component of Eqn. (9) as a probability den-
sity for the particle unless this quantity is always nonnegative, that is to say,
unless J^a t_a ≥ 0 for every future-directed timelike vector t^a, that is to say, unless
J^a itself is future-directed and timelike. To see whether this is indeed the case,
we evaluate J^a for the plane-wave solution (Eqn. (6)), and find:

    J^a = k^a    (10)

This expression is indeed timelike, but is not necessarily future-directed:
Eqn. (6) is a solution of the Klein-Gordon equation whether k^a is future- or past-
directed. Thus, we have not succeeded in interpreting a solution of the Klein-
Gordon equation in terms of a probability density for finding the particle.
We next consider the situation with regard to the initial value problem.
Since the Schrödinger equation is first order in time derivatives, a solution of that
equation is uniquely specified by giving ψ on an initial t = const. surface. The
norm of such a solution is the integral

    ∫ ψ ψ̄ dV    (11)

That the real number (11) is independent of the t = const. surface over
which the integral is performed is a consequence of Eqn. (8) (assuming, as one
always does, that ψ falls off sufficiently quickly at infinity). One might therefore
be tempted to try to define the norm of a solution of the Klein-Gordon equation
as an integral of J^a,

    ∫_S J^a dS_a    (12)

over a spacelike 3-plane S. But it is clear from (10) that the expression (12) will
not in general be nonnegative. Thus, the most obvious way to make a Hilbert
space out of solutions of the Klein-Gordon equation fails. This, of course, is
rather embarrassing, for we are used to doing quantum theory in a Hilbert
space, with Hermitian operators representing observables, etc.
To summarize, a simple "relativization" of the Schrödinger equation leads
to a number of maladies.
2 Hilbert Space and Operators
The collection of states of a quantum system, together with certain of the struc-
ture naturally induced on this collection, is described by a mathematical object
known as a Hilbert space. We recall the basic definitions.
A Hilbert space consists, first of all, of a set H. Secondly, H has the structure
of an Abelian group. That is to say, given any two elements, ψ and φ, of H,
there is associated a third element of H, written ψ + φ, this operation subject
to the following conditions:

H1. For ψ, φ ∈ H, ψ + φ = φ + ψ.
H2. For ψ, φ, σ ∈ H, (ψ + φ) + σ = ψ + (φ + σ).
H3. There is an element of H, written 0, with the following property: for
each ψ ∈ H, ψ + 0 = ψ.
H4. If ψ ∈ H, there exists an element of H, written -ψ, with the following
property: ψ + (-ψ) = 0.

Furthermore, H has the structure of a complex vector space. That is to say,
with each complex number α and each element ψ of H there is associated an
element of H, written αψ, this operation subject to the following conditions:

H5. For ψ, φ ∈ H, α ∈ C, α(ψ + φ) = αψ + αφ.
H6. For ψ ∈ H, α, β ∈ C, (α + β)ψ = αψ + βψ and α(βψ) = (αβ)ψ.
H7. For ψ ∈ H, 1ψ = ψ.

There is, in addition, a positive-definite inner product defined on H. That is to
say, with any two elements, ψ and φ, of H there is associated a complex number,
written (ψ, φ), this operation subject to the following conditions:

H8. For ψ, φ, σ ∈ H, α ∈ C, (ψ + αφ, σ) = (ψ, σ) + α(φ, σ).
H9. For ψ, φ ∈ H, (ψ, φ) is the complex conjugate of (φ, ψ).
H10. For ψ ∈ H, with ψ ≠ 0, (ψ, ψ) > 0. (That (ψ, ψ) is real follows from H9.)
We sometimes write ‖ψ‖ for √(ψ, ψ). Finally, we require that this structure have
a property called completeness. A sequence, ψ_i (i = 1, 2, ...), of elements of H is
called a Cauchy sequence if, for every number ε > 0, there is a number N such
that ‖ψ_i - ψ_j‖ < ε whenever i and j are greater than N. A sequence is said to
converge to ψ ∈ H if ‖ψ - ψ_i‖ → 0 as i → ∞. H is said to be complete if every
Cauchy sequence converges to an element of H.

H11. H is complete.
There are, of course, hundreds of elementary properties of Hilbert spaces which
follow directly from these eleven axioms.
A (linear) operator on a Hilbert space H is a rule A which assigns to each
element ψ of H another element of H, written Aψ, this operation subject to the
following condition:

O1. For ψ, φ ∈ H, α ∈ C, A(ψ + αφ) = Aψ + αAφ.
We shall discuss the various properties and types of operators when they arise.
There is a fundamental difficulty which arises when one attempts to use
this mathematical apparatus in physics. The collection of quantum states
which arises naturally in a physical problem normally satisfies H1-H10. (This
is usually easy to show in each case.) The problem is with H11. The most
obvious collection of states often fails to satisfy the completeness condition. As
one wants a Hilbert space, he normally corrects this deficiency by completing the
space, that is, by including additional elements so that all Cauchy sequences will
have something to converge to. (There is a well-defined mathematical procedure
for constructing, from a space which satisfies H1-H10, a Hilbert space.) The
unpleasant consequence of being forced to introduce these additional states is
that the natural operators of the problem, which were defined on the original
collection of states, cannot be defined in any reasonable way on the entire Hilbert
space. Thus, they are not operators at all as we have defined them, for they
only operate on a subset of the Hilbert space. Fortunately, this subset is dense.
(A subset D of a Hilbert space H is said to be dense if, for every element ψ
of H, there is a sequence consisting of elements of D which converges to ψ.)
Some very unaesthetic mathematical techniques have been devised for dealing
with such situations. (See von Neumann's book, Mathematical Foundations
of Quantum Mechanics.)
This problem is not confined to quantum field theory. It occurs already
in Schrödinger theory. For example, the collection of smooth solutions of the
Schrödinger equation for which the integral (11) converges satisfies H1-H10, but
not H11. To complete this space, we have to introduce "solutions" which are,
for example, discontinuous. How does one apply the Schrödinger momentum
operator to such a wave function?
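A toy computation makes the phenomenon concrete: the smooth functions tanh(nx) form a Cauchy sequence in the L^2 norm on [-1, 1], yet their limit is the discontinuous step function, to which a naive -iℏ d/dx cannot be applied. (A sketch only; the particular sequence and the quadrature grid are illustrative choices.)

```python
import math

# The smooth functions f_n(x) = tanh(n x) are Cauchy in the L^2 norm
# of (11)-type on [-1, 1], but converge to the discontinuous sign(x).
N = 20_000
xs = [-1.0 + 2.0*j/N for j in range(N + 1)]
dx = 2.0 / N

def l2_dist(f, g):
    return math.sqrt(sum((f(x) - g(x))**2 for x in xs) * dx)

sign = lambda x: -1.0 if x < 0 else (1.0 if x > 0 else 0.0)
fn   = lambda n: (lambda x: math.tanh(n * x))

# distances to the discontinuous limit shrink (like 1/sqrt(n)):
d = [l2_dist(fn(n), sign) for n in (1, 10, 100, 1000)]
assert d[0] > d[1] > d[2] > d[3]
assert d[3] < 0.05
```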
Figure 1: The mass shell in momentum space.
3 Positive-Frequency Solutions of the Klein-
Gordon Equation
We represent solutions of the Klein-Gordon equation as linear combinations of
plane-wave solutions (Eqn. (6)):

    ψ(x) = ∫_{M_μ} f(k^a) e^{i k_a x^a} dV_μ    (13)

Of course, we wish to include in the integral (13) only plane waves which satisfy
the Klein-Gordon equation, i.e., only plane waves whose k^a satisfy the nor-
malization condition (7). The four-dimensional (real) vector space of constant
vector fields in Minkowski space-time is called momentum space. The collection
of all vectors k^a in momentum space which satisfy Eqn. (7) consists of two hy-
perbolas (except in the case μ = 0, in which case the hyperbolas degenerate to
the two null cones through the origin). This collection is called the mass shell
(associated with μ), M_μ. It has two connected components: the future mass
shell M_μ^+ (consisting of future-directed vectors which satisfy (7)) and the past
mass shell M_μ^-, so that M_μ = M_μ^+ ∪ M_μ^-.
Eqn. (13) immediately suggests two questions: i) What are the necessary
and sufficient conditions on the complex-valued function f on M_μ in order that
the integral (13) exist for every x^a, and in order that the resulting ψ(x) be
smooth and satisfy the Klein-Gordon equation? ii) What are the necessary
and sufficient conditions on a solution ψ(x) of the Klein-Gordon equation in
order that it can be expressed in the form (13) for some f?

Figure 2: The volume element on the mass shell.

These, of course,
are questions in the theory of Fourier analysis. It suffices for our purposes,
however, to remark that the required conditions are of a very general character
(that functions not be too discontinuous, and that, asymptotically, they go to
zero sufficiently quickly). The point is that all the serious things we shall do with
the Klein-Gordon equation will be in momentum space. We shall use Minkowski
space and ψ(x) essentially only to motivate definitions and constructions on the
f's in momentum space.
One question regarding (13) which must be answered is: what is the volume
element dV_μ on M_μ? Let dσ denote the volume element induced on M_μ by
the metric of momentum space. We set

    dV_μ = (1/μ) dσ    (14)

which is easily verified to be nonzero also on the null cone. In more conventional
terms, our volume element can be described as follows. Choose a unit timelike
vector t^a in momentum space, and let S be the spacelike 3-plane perpendicular
to t^a. Then any small patch A on M_μ can be projected along t^a into a patch
A' in S. Letting dV_S be the volume of A', the volume of A is

    dV_μ = dV_S |t^a k_a|^{-1}    (15)

The existence of a limit as the patch shrinks is clear from (15), but Lorentz-invariance
(independence of the choice of t^a) is not.
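Lorentz invariance can at least be checked numerically in 1+1 dimensions, where (15) reads dV_μ = dk/E with E = (k^2 + μ^2)^{1/2}: integrating a test function against dk/E before and after a boost gives the same answer. (A sketch only; the test function, rapidity, and grid are illustrative choices.)

```python
import math

# Numerical check (1+1 dimensions) that the mass-shell measure dk/E,
# E = sqrt(k^2 + mu^2), is invariant under a boost of rapidity b.
mu, b = 1.0, 0.5
ch, sh = math.cosh(b), math.sinh(b)
E = lambda k: math.sqrt(k*k + mu*mu)
g = lambda k: math.exp(-k*k)               # test function on the shell

ks = [-15.0 + 0.01*j for j in range(3001)]
I_rest    = sum(g(k) / E(k) for k in ks) * 0.01
I_boosted = sum(g(k*ch + E(k)*sh) / E(k) for k in ks) * 0.01
assert abs(I_rest - I_boosted) < 1e-6 * I_rest
```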
Is there any "gauge" in f? Given a solution ψ(x) of the Klein-Gordon equa-
tion, is f uniquely determined by (13)? The only arbitrary choice which was
made in writing (13) was the choice of an origin: x^a refers to the position
vector of a point in Minkowski space with respect to a fixed origin. We are thus
led to consider the behavior of the f's under origin changes. Let O and O' be
two origins, and let v^a be the position vector of O' with respect to O. Let the
position vectors of a given point with respect to O and O' be x^a and x'^a,
respectively, whence

    x'^a = x^a - v^a    (16)

Then, if f and f' are the functions on M_μ representing ψ with respect to O
and O', respectively, we have

    ψ = ∫_{M_μ} f e^{i k_a x^a} dV_μ = ∫_{M_μ} f' e^{i k_a x'^a} dV_μ    (17)

Clearly, we must have

    f'(k) = f(k) e^{i k_a v^a}    (18)

Thus, when we consider states as represented by functions on the mass shell, it
is necessary to check that conclusions are unchanged if (18) is applied simulta-
neously to all such functions.
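A small 1+1-dimensional discretization of (13) makes the check concrete: applying the phase (18) to f and referring positions to the shifted origin reproduces the same ψ. (A sketch only; the profile f, the shift v^a, the grids, and the sample point are illustrative choices.)

```python
import cmath, math

# 1+1-dimensional Riemann-sum version of (13), checking the
# origin-change law (18): f -> f e^{i k_a v^a}, x' = x - v.
mu, dk = 1.0, 0.02
ks = [-8.0 + dk*j for j in range(801)]
E  = {k: math.sqrt(k*k + mu*mu) for k in ks}
f  = {k: math.exp(-(k - 0.3)**2) for k in ks}

def psi(g, t, x):                          # k_a x^a = E t - k x
    return sum(g[k] * cmath.exp(1j*(E[k]*t - k*x)) / E[k] * dk for k in ks)

vt, vx = 0.7, -0.4                         # the displacement v^a
f_new = {k: f[k] * cmath.exp(1j*(E[k]*vt - k*vx)) for k in ks}

t, x = 0.5, 1.2
assert abs(psi(f, t, x) - psi(f_new, t - vt, x - vx)) < 1e-9
```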
Now look again at Eqn. (3). It says, in particular, that the energy-momen-
tum vector is future-directed. This same feature shows up in the right side of
Eqn. (4) by the plus sign. If this sign were replaced by a minus, we would be deal-
ing with a past-directed energy-momentum vector. The trick we used to obtain
Eqn. (5) from (4) amounted to admitting also past-directed energy-momenta.
It is clear now how Eqn. (4) itself can be carried over into a well-defined and
fully relativistic condition on ψ. We merely require that the f of Eqn. (13) van-
ish on M_μ^-. Such solutions of the Klein-Gordon equation will be called
positive-frequency. As an example, consider a superposition of two plane waves,

    ψ = e^{i k_a x^a} + α e^{i k'_a x^a}    (19)

That is, α is a complex constant, and k^a and k'^a are future-directed constant
vectors satisfying (7). (Strictly speaking, this example is not applicable, for (19)
cannot be Fourier analyzed. It is not difficult, however, to appropriately smear
(19) over the future mass shell to obtain an example without this deficiency.)
Substituting (19) into (9), we obtain:

    J^a = (1/2) k^a [2 + α e^{i(k'_b - k_b) x^b} + ᾱ e^{-i(k'_b - k_b) x^b}]
        + (1/2) k'^a [2|α|^2 + α e^{i(k'_b - k_b) x^b} + ᾱ e^{-i(k'_b - k_b) x^b}]    (20)
Clearly, one can choose α, k^a, and k'^a so that this J^a is not timelike in cer-
tain regions. Thus, even the assumption of positive-frequency solutions does
not resolve the difficulty associated with not having a simple probabilistic in-
terpretation for our wavefunction ψ: we still cannot think of J^a t_a (with t^a unit,
future-directed, timelike) as representing a probability density for finding the
particle. The resolution of this problem must await our introduction of a posi-
tion operator.
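Evaluating (20) numerically in 1+1 dimensions exhibits the failure explicitly: for suitable α and relative boost, the cross terms make J^a spacelike at some points even though both waves are future-directed. (A sketch only; the values of α, μ, and the rapidity are illustrative choices, with α taken real.)

```python
import math

# The current (20) for psi = e^{ik.x} + alpha e^{ik'.x} in 1+1
# dimensions; scan for a point where J.J = (J^0)^2 - (J^1)^2 < 0.
mu, alpha, b = 1.0, 0.5, 2.5
k  = (mu, 0.0)                                 # (E, k_x): wave at rest
kp = (mu*math.cosh(b), mu*math.sinh(b))        # boosted second wave

def J(t, x):
    theta = (kp[0] - k[0])*t - (kp[1] - k[1])*x   # (k'_a - k_a) x^a
    c = alpha * math.cos(theta)                   # Re(alpha e^{i theta})
    return tuple(0.5*k[i]*(2 + 2*c) + 0.5*kp[i]*(2*alpha**2 + 2*c)
                 for i in (0, 1))

worst = min(Jt*Jt - Jx*Jx
            for t in (0.01*i for i in range(100))
            for (Jt, Jx) in (J(t, 0.0),))
assert worst < 0.0          # J^a is spacelike somewhere
```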
Note from Eqn. (20) that J^a is trying very hard to be timelike and future-
directed in the positive-frequency case: it is only the cross terms between the
two plane waves which destroy this property. This observation suggests that, in
the positive-frequency case, the integral of J^a over a spacelike 3-plane might be
positive. In order to check on this possibility, we want to rewrite the integral of
J^a in terms of the corresponding function f on M_μ. Substituting (13) into (9),
we have:

    J^a = ∫_{M_μ} dV_μ ∫_{M_μ} dV'_μ (1/2) k^a [f̄(k) f(k') e^{i(k'_b - k_b) x^b} + f(k) f̄(k') e^{-i(k'_b - k_b) x^b}]    (21)
We now let S be a spacelike 3-plane through the origin, and let t^a be the unit,
future-directed normal to S. Then

    ∫_S J^a t_a dS = ∫_{M_μ} dV_μ ∫_{M_μ} dV'_μ (1/2) k^a t_a [f̄(k) f(k') ∫_S e^{i(k'_b - k_b) x^b} dS
                     + f(k) f̄(k') ∫_S e^{-i(k'_b - k_b) x^b} dS]    (22)
But, from the theory of Fourier analysis,

    ∫_S e^{i(k'_b - k_b) x^b} dS = (2π)^3 |t^a k_a|^{-1} δ(k, k')    (23)

where δ(k, k') is the δ-function on M_μ with respect to the volume element dV_μ,
and so (22) becomes

    ∫_S J^a t_a dS = (2π)^3 [ ∫_{M_μ^+} f f̄ dV_μ - ∫_{M_μ^-} f f̄ dV_μ ]    (24)

In particular, if f vanishes on M_μ^-, so that ψ is positive-frequency, the right
side of (24) is nonnegative. (Had f instead vanished on M_μ^+, the right side
would have been negative.) This calculation was not done merely for
idle curiosity; the right side of (24) will be important later.
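Formula (24) can be verified numerically in 1+1 dimensions, where the factor (2π)^3 becomes 2π: the x-integral of J^0 for a positive-frequency wave packet agrees with 2π ∫ f f̄ dV_μ over M^+. (A sketch only; the profile f and all grids are illustrative choices.)

```python
import cmath, math

# Check of (24), 1+1 dimensions: Int J^0 dx over t = 0 equals
# 2 pi * Int f fbar dV_mu over the future mass shell.
mu, dk = 1.0, 0.05
ks = [-6.0 + dk*j for j in range(241)]
E  = {k: math.sqrt(k*k + mu*mu) for k in ks}
f  = {k: math.exp(-k*k) for k in ks}

def j0(x):
    # psi(0, x) and d/dt psi(0, x); then J^0 = Im(psibar d_t psi)
    p = pt = 0j
    for k in ks:
        ph = cmath.exp(-1j*k*x)          # k_a x^a = -k x at t = 0
        p  += f[k] * ph / E[k] * dk
        pt += 1j * f[k] * ph * dk        # each mode carries factor i E
    return (p.conjugate() * pt).imag

dx = 0.05
lhs = sum(j0(-30.0 + dx*j) for j in range(1201)) * dx
rhs = 2*math.pi * sum(f[k]**2 / E[k] * dk for k in ks)
assert abs(lhs - rhs) < 1e-3 * rhs
```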
We saw before that the initial-value problem for the Klein-Gordon equation
is as follows: one must specify ψ and t^a ∇_a ψ on an initial spacelike 3-plane. How
does the initial-value problem go for positive-frequency solutions of the Klein-
Gordon equation? In fact, we only have to specify ψ as initial data in this case.
To see this, suppose we know the value of the integral

    ψ(x) = ∫_{M_μ^+} f(k) e^{i k_a x^a} dV_μ    (25)

for every x^a which is perpendicular to a unit timelike vector t^a at the origin
(i.e., on the spacelike 3-plane perpendicular to t^a, through the origin). The
integral (25) can certainly be expressed as a Fourier integral over S (t^a sets
up a one-to-one correspondence between M_μ^+ and S). Hence, by the uniqueness
of Fourier transforms, knowledge of ψ on S determines f, and so determines ψ
everywhere.

To summarize:

1. A solution of the Klein-Gordon equation is determined by the values of ψ
and t^a ∇_a ψ on a spacelike 3-plane.

2. There is a one-to-one correspondence between: i) positive-frequency solu-
tions of the Klein-Gordon equation, ii) complex-valued functions on M_μ^+,
and iii) values of ψ on a spacelike 3-plane.
4 Constructing Hilbert Spaces and Operators
There is a general and extremely useful technique for obtaining a Hilbert space
along with a collection of operators on it. It is essentially this technique which
is used, for example, in treating the Schrödinger and Klein-Gordon equations.
It is convenient, therefore, to describe this construction, once and for all, in a
general case. Special cases can then be treated as they arise.
The fundamental object we need is some n-dimensional manifold M on
which there is specified a smooth, nowhere-vanishing volume element dV. In
differential-geometric terms, this means that we have a smooth, nowhere-vani-
shing, totally skew tensor field ε_{a_1...a_n} on M. Our Hilbert space, and operators,
are now defined in terms of certain fields on M.
We first define the Hilbert space. Consider the collection H of all complex-
valued, measurable, square-integrable functions f on M. This H is certainly a
complex vector space. We introduce a norm on H:

    ‖f‖^2 = ∫_M f f̄ dV    (26)

It is known that this H thus becomes a Hilbert space. (Actually, we have
been a little sloppy here. One should, more properly, define an equivalence
relation on H: two functions are equivalent if they differ only on a subset (of
M) of measure zero. It is the equivalence classes which actually form a Hilbert
space. For example, the function f which vanishes everywhere on M except one
point, where it is one, is measurable and square-integrable. Its norm, (26), is
zero, although this f is not the zero element of H. It is, however, in the zero
equivalence class, for it differs from the zero function only on a set (namely, one
point) of measure zero.) This is a special case of a more general theorem: the
collection of all complex-valued, measurable, square-integrable functions (more
precisely, the collection of equivalence classes as above) on a complete measure
space forms a Hilbert space.
We now introduce some operators. Let v^a be any smooth (complex) con-
travariant vector field, and v any smooth (complex) scalar field on M. Then
with each smooth, complex-valued function f on M we may associate the func-
tion

    V f = v^a ∇_a f + v f    (27)

where ∇_a denotes the gradient on M. To what extent does (27) define an
operator on H? Unfortunately, (27) is not applicable to every element of H,
for two reasons: i) a function f could be measurable and square-integrable (i.e.,
an element of H), but not differentiable. Then the gradient operation in (27)
would not be defined. ii) an element f of H could even be smooth, but could
have the property that, although f itself is square-integrable, the function (27)
is not. However, there is a large class of elements of H on which (27) is defined
and results in an element of H. Such a class, for example, is the collection of
all functions f which are smooth and have compact support. (Such a function
is automatically square-integrable and measurable.) This class is, in fact, dense
in H. Clearly, (27) is linear whenever it is defined. Thus, we can call (27) an
operator on H, in the sense that we have agreed to abuse that term.
We agree to call an operator Hermitian if, whenever V f and V g are defined,
(V f, g) = (f, V g). What are the necessary and sufficient conditions that (27)
be Hermitian? Let f and g be smooth functions on M, of compact support.
Then:

    (V f, g) = ∫_M (v^a ∇_a f + v f) ḡ dV
             = ∫_M [-f v^a ∇_a ḡ + f ḡ (-∇_a v^a + v)] dV    (28)

where we have done an integration by parts (throwing away a surface term by
the compact supports). Eqn. (28) is clearly equal to

    (f, V g) = ∫_M [f v̄^a ∇_a ḡ + f ḡ v̄] dV    (29)

for every f and g when and only when:

    v̄^a = -v^a,    v - v̄ = ∇_a v^a    (30)

These, then, are the necessary and sufficient conditions that V be Hermitian.
One further remark is required with regard to what the divergence in
Eqn. (30) is supposed to mean. (We don't have a metric, or a covariant deriva-
tive, defined on M.) It is well known that the divergence of a contravariant
vector field can be defined on a manifold with a volume element ε_{a_1...a_n}. This
can be done, for example, using either exterior derivatives or Lie derivatives.
For instance, using Lie derivatives we define ∇_a v^a by:

    L_v ε_{a_1...a_n} = (∇_a v^a) ε_{a_1...a_n}    (31)

(Note that, since the left side is totally skew, it must be some multiple of
ε_{a_1...a_n}.)
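The conditions (30) can be spot-checked in the simplest case M = R, dV = dx: take v^a purely imaginary and the imaginary part of the scalar equal to half the divergence, and verify (V f, g) = (f, V g) by quadrature. (A sketch only; all profiles below are illustrative choices.)

```python
import math

# One-dimensional check of the Hermiticity conditions (30): on M = R
# with dV = dx, V f = v f' + w f with vbar = -v and w - wbar = dv/dx.
a, da = math.sin, math.cos             # v(x) = i a(x), dv/dx = i a'(x)

def v(x): return 1j * a(x)
def w(x): return x*x + 0.5j * da(x)    # w - wbar = i a' = dv/dx

f,  df = (lambda x: math.exp(-x*x)), (lambda x: -2*x*math.exp(-x*x))
g,  dg = (lambda x: x*math.exp(-x*x)), (lambda x: (1-2*x*x)*math.exp(-x*x))

def V(h, dh, x):                       # (V h)(x) = v h' + w h
    return v(x)*dh(x) + w(x)*h(x)

xs = [-8.0 + 0.001*j for j in range(16001)]
dx = 0.001
ip_Vf_g = sum(V(f, df, x) * g(x) for x in xs) * dx              # (Vf, g)
ip_f_Vg = sum(f(x) * V(g, dg, x).conjugate() for x in xs) * dx  # (f, Vg)
assert abs(ip_Vf_g - ip_f_Vg) < 1e-9
```

(The functions f and g are not compactly supported, but decay fast enough that the discarded surface term is negligible.)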
Finally, we work out the commutator of two of our operators, V = (v^a, v)
and W = (w^a, w). If f is a smooth function of compact support, we have:

    [V, W] f = (v^a ∇_a + v)(w^b ∇_b + w) f - (w^b ∇_b + w)(v^a ∇_a + v) f
             = (v^b ∇_b w^a - w^b ∇_b v^a) ∇_a f + (v^a ∇_a w - w^a ∇_a v) f    (32)

Note that the commutator is again an operator of the form we have been dis-
cussing, (27). Note furthermore that the vector part of the commutator is the
Lie bracket of the vector fields appearing in V and W.
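Formula (32) can be verified pointwise in one dimension, where the Lie bracket of v d/dx and u d/dx is (v u' - u v') d/dx. (A sketch only; the coefficient functions below are illustrative choices.)

```python
import math

# One-dimensional spot-check of (32): for V = v d/dx + p and
# W = u d/dx + q, [V, W] = (v u' - u v') d/dx + (v q' - u p').
v,  dv = math.sin, math.cos
u,  du = math.cos, (lambda x: -math.sin(x))
p,  dp = (lambda x: x*x), (lambda x: 2*x)
q,  dq = (lambda x: x),   (lambda x: 1.0)

F   = lambda x: math.exp(-x*x)
dF  = lambda x: -2*x*math.exp(-x*x)
d2F = lambda x: (4*x*x - 2)*math.exp(-x*x)

def VW_minus_WV(x):        # [V, W] F, computed from second derivatives
    VWF = v(x)*(du(x)*dF(x) + u(x)*d2F(x) + dq(x)*F(x) + q(x)*dF(x)) \
          + p(x)*(u(x)*dF(x) + q(x)*F(x))
    WVF = u(x)*(dv(x)*dF(x) + v(x)*d2F(x) + dp(x)*F(x) + p(x)*dF(x)) \
          + q(x)*(v(x)*dF(x) + p(x)*F(x))
    return VWF - WVF

def lie_form(x):           # (v u' - u v') F' + (v q' - u p') F
    return (v(x)*du(x) - u(x)*dv(x))*dF(x) + (v(x)*dq(x) - u(x)*dp(x))*F(x)

for x in (-1.3, 0.0, 0.4, 2.0):
    assert abs(VW_minus_WV(x) - lie_form(x)) < 1e-12
```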
To summarize, with any n-manifold M on which there is given a smooth,
nowhere-vanishing volume element we associate a Hilbert space H along with a
collection of operators on H. The commutator of two operators in this collection
is again an operator in the collection.
5 Hilbert Space and Operators for the Klein-
Gordon Equation
We now complete our description of the quantum theory of a free, relativistic,
spin-zero particle.
For our Hilbert space we take, as suggested by Sec. 3, the collection of
all complex-valued, measurable, square-integrable functions on the future mass
shell, M_μ^+, with the norm (26) given by the volume element dV_μ.
We first consider momentum operators. Let p^a be any constant vector field
in Minkowski space, and ψ any positive-frequency solution of the Klein-Gordon
equation. Then, clearly,

    -iℏ p^a ∇_a ψ    (33)

is also a positive-frequency solution of the Klein-Gordon equation. In terms of
the corresponding functions on M_μ^+, this operation takes the form

    f → ℏ (p^a k_a) f    (34)

Thus, for each constant vector field p^a, we have an operator, P(p^a), on our
Hilbert space H. Since the multiplying function in (34) is real, the operators
P(p^a) are all Hermitian. (See (30).) We now interpret these operators. Choose
a constant, unit, future-directed timelike vector field t^a in Minkowski space (a
preferred state of rest). Then P(t^a) is the energy operator, and P(p^a), with p^a
unit and perpendicular to t^a, is the component of momentum in the p^a-direction.
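The correspondence between (33) and (34) can be checked on a discretized wave packet: applying -i d/dt (units with ℏ = 1) to a 1+1-dimensional version of (13) matches multiplying f by E = t^a k_a. (A sketch only; the profile f, the grids, and the sample point are illustrative choices.)

```python
import cmath, math

# Check (33) <-> (34) in 1+1 dimensions, units with hbar = 1:
# -i d/dt acting on psi = sum f e^{i(Et - kx)} dV_mu corresponds to
# multiplication of f by E = t^a k_a.
mu, dk = 1.0, 0.05
ks = [-6.0 + dk*j for j in range(241)]
E  = {k: math.sqrt(k*k + mu*mu) for k in ks}
f  = {k: math.exp(-(k - 0.4)**2) for k in ks}

def psi(g, t, x):
    return sum(g[k] * cmath.exp(1j*(E[k]*t - k*x)) / E[k] * dk for k in ks)

t0, x0, h = 0.3, -0.7, 1e-5
fd = -1j * (psi(f, t0 + h, x0) - psi(f, t0 - h, x0)) / (2*h)

Ef = {k: E[k] * f[k] for k in ks}      # the multiplied f of (34)
assert abs(fd - psi(Ef, t0, x0)) < 1e-7
```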
The position operators are more complicated. Not only do they depend on
more objects in Minkowski space (rather than just a single p^a as in the momen-
tum case), but also they require us to take derivatives in the mass shell. To
obtain a position operator, we need the following information: a choice of origin
O in Minkowski space, a constant, unit, future-directed timelike vector field t^a
in Minkowski space, and a constant unit vector field q^a which is perpendicular
to t^a. (Roughly speaking, O and t^a define a spacelike 3-plane, the instant
at which the operator is to be applied; q^a defines which position coordinate
we're operating with; and O tells us what the origin of this position coordinate
is.) Now, q^a is a vector in momentum space, and therefore defines a constant
vector field in momentum space, which we also write as q^a. One is tempted to
take the derivative of f along this vector field. But this will not work, for q^a is
not tangent to the mass shell, whereas f is only defined on the mass shell. To
correct this deficiency, we project q^a into the mass shell; that is, we add to
q^a that multiple of t^a which results in a vector field tangent to M_μ^+. We are
thus led to consider the operation

    f → (1/i) [q^a - t^a (t^b k_b)^{-1} (q^c k_c)] ∇_a f    (35)
We now have a vector field on M_μ^+. Its divergence (in the sense of (31), with
the momentum-space derivative projected into the mass shell) is

    (1/i) (g^{ab} - μ^{-2} k^a k^b) ∇_a [q_b - t_b (t^c k_c)^{-1} (q^d k_d)] = (1/i) (q^a k_a)(t^b k_b)^{-2}    (36)

where we have denoted the derivative in momentum space by ∇_a. To obtain a
Hermitian operator, we take the Hermitian part of the operator represented by
(35):

    f → (1/i) [q^a - t^a (t^b k_b)^{-1} (q^c k_c)] ∇_a f + (1/2i) (q^a k_a)(t^b k_b)^{-2} f    (37)
In (37), f is to be the function on M_μ^+ representing the state with respect to
the origin O. We denote the resulting position operator by X(O, t^a, q^a). These
operators, together with the momentum operators, satisfy the commutation
relations:

    [P(p^a), P(p'^a)] = 0

    [X(O, t^a, q^a), X(O, t^a, q'^a)] = 0

    [P(p^a), X(O, t^a, q^a)] = iℏ (p_a q^a)    provided (p_a t^a) = 0    (38)
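The last relation of (38) can be verified explicitly in 1+1 dimensions, where with t^a = (1, 0), q^a = p^a = (0, 1) one has p_a t^a = 0 and p_a q^a = -1, and the shell derivative is just d/dk_x. (A sketch in units with ℏ = 1; the profile f is an illustrative choice, and its derivative is supplied analytically.)

```python
import math

# Spot-check of [P, X] f = i (p_a q^a) f from (38), 1+1 dimensions.
mu = 1.0
E  = lambda k: math.sqrt(k*k + mu*mu)
pk = lambda k: -k                       # p_a k^a, p = (0,1), sig (+,-)
qk = lambda k: -k                       # q_a k^a, q = (0,1)

f  = lambda k: math.exp(-k*k)
df = lambda k: -2*k*math.exp(-k*k)

def X(g, dg, k):                        # position operator (37) on shell
    return (1/1j)*dg(k) + (1/(2*1j)) * qk(k) / E(k)**2 * g(k)

def PX_minus_XP(k):
    Pf  = lambda k: pk(k) * f(k)
    dPf = lambda k: -f(k) + pk(k)*df(k)   # d/dk of (-k f)
    return pk(k)*X(f, df, k) - X(Pf, dPf, k)

pq = -1.0                               # p_a q^a
for k in (-2.0, -0.3, 0.0, 1.1):
    assert abs(PX_minus_XP(k) - 1j*pq*f(k)) < 1e-12
```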
The next thing one normally does with operators (in the Heisenberg repre-
sentation, which is the one we're using) is to work out their time derivatives.
For the momentum operators, this is easy, for no notion of a time was used to
define P(p^a). Thus, whatever reasonable thing one wants to mean by a dot,
we have:

    Ṗ(p^a) = 0    (39)
This, of course, is what we expect for the momentum operator on a free particle.
For the position operators, on the other hand, we have an interesting notion
of time-derivative. We want to compare X(O, t^a, q^a) with the same position
operator at a slightly later time. This "at a slightly later time" is expressed
by slightly displacing O in the t^a-direction. Thus, we are led to define:

    Ẋ(O, t^a, q^a) = lim_{ε→0} (1/ε) [X(O', t^a, q^a) - X(O, t^a, q^a)]    (40)

where O' is the point displaced from O by ε t^a. The limit is easily evaluated:

    Ẋ(O, t^a, q^a) f = (q^a k_a)(t^b k_b)^{-1} f    (41)

which, of course, is what we expected. Note that a number of statements about
how X(O, t^a, q^a) depends on its arguments follow directly from Eqn. (41).
Finally, one would like to ask about the eigenvectors and eigenvalues of our
operators. It is clear from Eqn. (34) that the only candidate for an eigenfunc-
tion of P(p^a) would be a δ-function on M_μ^+, which is not square-integrable
and so not an element of H. As for the position operators, consider a state
localized at the origin O, i.e., a function f satisfying

    (1/i) [q^a - t^a (t^b k_b)^{-1} (q^c k_c)] ∇_a f + (1/2i) (q^a k_a)(t^b k_b)^{-2} f = 0    (42)

for every such q^a. The solution to (42) is:

    f = const. (t^a k_a)^{1/2}    (43)
The first remark concerning (43) is that it is not square-integrable, and hence
does not represent an element of H. This does not stop us, however, from
substituting (43) into (13) to obtain a function ψ on Minkowski space. The
resulting ψ (the explicit formula is not very enlightening; it involves Hankel
functions) is not a δ-function at O. In fact, this ψ is spread out around O to
distances of the order of μ^{-1}, the Compton wavelength of our particle. Thus,
our picture is that a relativistic particle cannot be confined to distances much
smaller than its Compton wavelength.
6 The Direct Sum of Hilbert Spaces
Associated with any countable sequence, H', H'', H''', ..., of Hilbert spaces
there is a new Hilbert space, called their direct sum. We now describe this
construction. Consider the collection of all sequences

    (ψ', ψ'', ψ''', ...)    (44)

consisting of one element (ψ') of H', one element (ψ'') of H'', etc., such that
the sum

    ‖ψ'‖^2 + ‖ψ''‖^2 + ‖ψ'''‖^2 + ...    (45)

converges. This collection is the underlying point set of the direct sum. To
obtain a Hilbert space, we must define addition, scalar multiplication, and an
inner product, and verify H1-H11.
The sum of two sequences (44) is defined by adding them component-wise:

    (ψ', ψ'', ...) + (φ', φ'', ...) = (ψ' + φ', ψ'' + φ'', ...)    (46)

We must verify that, if the addends satisfy (45), then so does the sum. This
follows immediately from the inequality:

    ‖ψ' + φ'‖^2 = ‖ψ'‖^2 + (ψ', φ') + (φ', ψ') + ‖φ'‖^2
                ≤ ‖ψ'‖^2 + 2‖ψ'‖ ‖φ'‖ + ‖φ'‖^2
                ≤ 2‖ψ'‖^2 + 2‖φ'‖^2    (47)
The product of a sequence (44) and a complex number α is defined by:

    α(ψ', ψ'', ...) = (αψ', αψ'', ...)    (48)

That the right side of (48) satisfies (45) follows from the fact that

    ‖αψ'‖ = |α| ‖ψ'‖    (49)

We have now defined addition and scalar multiplication. That these two oper-
ations satisfy H1-H7, i.e., that we have a complex vector space, is trivial.
We define the inner product between two sequences (44) to be the complex
number

    ((ψ', ψ'', ...), (φ', φ'', ...)) = (ψ', φ') + (ψ'', φ'') + (ψ''', φ''') + ...    (50)
The indicated sum of complex numbers on the right of (50) converges if (in fact,
converges absolutely if and only if) the sum of the absolute values converges.
Thus, the absolute convergence of the right side of (50) follows from the fact
that

    |(ψ', φ')| ≤ ‖ψ'‖ ‖φ'‖ ≤ (1/2)‖ψ'‖^2 + (1/2)‖φ'‖^2    (51)

We now have a complex vector space in which there is defined an inner product.
(Note, incidentally, that the norm is given by (45).) The verification of H8, H9,
and H10 is easy.
Thus, as usual, the only difficult part is to check H11. Consider a Cauchy
sequence of sequences (44). That is to say, we have a countable collection of
such sequences,

    Ψ_1 = (ψ'_1, ψ''_1, ψ'''_1, ...)
    Ψ_2 = (ψ'_2, ψ''_2, ψ'''_2, ...)
    Ψ_3 = (ψ'_3, ψ''_3, ψ'''_3, ...)
    ...    (52)

with the following property: for each real ε > 0 there is a number N such that

    ‖Ψ_i - Ψ_j‖^2 = ‖ψ'_i - ψ'_j‖^2 + ‖ψ''_i - ψ''_j‖^2 + ... ≤ ε    (53)
whenever i, j ≥ N. We must show that the sequence of elements (52) of the
direct sum converges to some element of the direct sum. First note that (53)
implies

    ‖ψ'_i - ψ'_j‖^2 ≤ ε,    ‖ψ''_i - ψ''_j‖^2 ≤ ε,    ...    (54)

That is to say, the first column of (52) is a Cauchy sequence in H', the second
column a Cauchy sequence in H'', etc. Since H', H'', etc. are complete, these
sequences converge, say to ψ', ψ'', etc. Form the sequence

    Ψ = (ψ', ψ'', ψ''', ...)    (55)
We must show that the Ψ_i converge to Ψ, and that Ψ is an element of the
direct sum (i.e., that (45) converges for Ψ). Fix ε > 0 and choose i such that
‖Ψ_i - Ψ_j‖^2 ≤ ε whenever j > i. Then, for each positive integer n,

    ‖ψ'_i - ψ'_j‖^2 + ‖ψ''_i - ψ''_j‖^2 + ... + ‖ψ^(n)_i - ψ^(n)_j‖^2 ≤ ε    (56)

Taking the limit of (56) as j → ∞, we obtain

    ‖ψ'_i - ψ'‖^2 + ‖ψ''_i - ψ''‖^2 + ... + ‖ψ^(n)_i - ψ^(n)‖^2 ≤ ε    (57)

But n is arbitrary, and so, taking the limit of (57) as n → ∞,

    ‖Ψ_i - Ψ‖^2 = ‖ψ'_i - ψ'‖^2 + ‖ψ''_i - ψ''‖^2 + ... ≤ ε    (58)
That is to say, the σᵢ converge to σ. Finally, the fact that σ is actually an element of the direct sum, i.e., the fact that

|ξ'|² + |ξ''|² + |ξ'''|² + . . .  (59)

converges, follows immediately by substituting

|ξ'|² ≤ 2|ξ'ᵢ|² + 2|ξ' − ξ'ᵢ|²  (60)

(and the corresponding expressions with more primes) into (59), and using (58) and the fact that σᵢ is an element of the direct sum. Thus, the direct sum is complete.
To summarize, we have shown how to construct a Hilbert space from a
countable sequence of Hilbert spaces. Note, incidentally, that the direct sum
is essentially independent of the order in which the Hilbert spaces are taken.
More precisely, the direct sum obtained by taking H
, H
, . . . in one order is
naturally isomorphic to the direct sum obtained by taking these spaces in any
other order.
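The norm (45), the inner product (50), and the convergence estimates (47) and (51) are easy to check numerically. The following sketch is my own illustration, with small complex numpy vectors of different dimensions standing in for elements of H', H'', H''':

```python
import numpy as np

def direct_sum_norm_sq(seq):
    # seq is a (truncated) sequence (xi', xi'', ...); its squared norm is
    # |xi'|^2 + |xi''|^2 + ...  as in Eqn. (45)
    return sum(np.vdot(x, x).real for x in seq)

def direct_sum_inner(seq1, seq2):
    # ((xi', ...), (eta', ...)) = (xi', eta') + (xi'', eta'') + ...  (Eqn. (50))
    return sum(np.vdot(x, y) for x, y in zip(seq1, seq2))

rng = np.random.default_rng(0)
# Components of different dimensions stand in for H', H'', H'''.
xi  = [rng.standard_normal(d) + 1j * rng.standard_normal(d) for d in (2, 3, 5)]
eta = [rng.standard_normal(d) + 1j * rng.standard_normal(d) for d in (2, 3, 5)]

# The bound (47): |xi + eta|^2 <= 2|xi|^2 + 2|eta|^2, which guarantees that
# the sum of two elements of the direct sum is again an element.
lhs = direct_sum_norm_sq([x + y for x, y in zip(xi, eta)])
rhs = 2 * direct_sum_norm_sq(xi) + 2 * direct_sum_norm_sq(eta)
assert lhs <= rhs

# The term-by-term estimate (51) behind absolute convergence of (50):
for x, y in zip(xi, eta):
    assert abs(np.vdot(x, y)) <= 0.5 * np.vdot(x, x).real + 0.5 * np.vdot(y, y).real
```

The same inequalities hold term by term for infinite sequences, which is all the convergence arguments above use.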
Finally, we discuss certain operators on the direct sum. Consider a sequence of operators: A' acting on H', A'' acting on H'', etc. With the element σ = (ξ', ξ'', . . .) of the direct sum we may associate the sequence

(A'ξ', A''ξ'', A'''ξ''', . . .)  (61)

Unfortunately, (61) may not be an element of the direct sum, for

|A'ξ'|² + |A''ξ''|² + |A'''ξ'''|² + . . .  (62)

may fail to converge. However, (61) will produce an element of the direct sum when acting on a certain dense subset of the direct sum, namely, the set of sequences (44) which consist of zeros after a certain point. Is there any condition on the A's which will ensure that (61) will always be an element of the direct sum? An operator A on a Hilbert space H is said to be bounded if A is defined everywhere and, for some number a, |Aξ| ≤ a|ξ| for every ξ ∈ H. The smallest such a is called the bound of A, written |A|. (The norm on a Hilbert space induces on it a metric topology. Boundedness is equivalent to continuity in this topology.) It is clear that (62) converges for every element of the direct sum provided i) all the A's are bounded, and ii) the sequence of real numbers |A'|, |A''|, . . . is bounded.
7 The Completion of an Inner-Product Space
It is sometimes the case, when one wishes to construct a Hilbert space, that one finds a set on which addition, scalar multiplication, and an inner product are defined, subject to H1–H10 (what we shall call an inner product space). One wants, however, to obtain a Hilbert space, i.e., something which also satisfies H11. There is a construction for obtaining a Hilbert space from an inner product space. Since this construction is in most textbooks, we merely indicate the general idea.
Let G be an inner product space. Denote by G̃ the collection of all Cauchy sequences in G. If ξᵢ ∈ G, ηᵢ ∈ G (i = 1, 2, . . .) are Cauchy sequences, we write ξᵢ ≈ ηᵢ provided

lim_{i→∞} |ξᵢ − ηᵢ| = 0  (63)

One verifies that ≈ is an equivalence relation. The collection of equivalence classes, denoted by Ḡ, is to be made into a Hilbert space.
Consider two elements of Ḡ, i.e., two equivalence classes, and let ξᵢ and ηᵢ be representatives. We define a new sequence, whose ith element is ξᵢ + ηᵢ. One verifies, using the fact that ξᵢ and ηᵢ are Cauchy sequences, that this new sequence is also Cauchy. Furthermore, if ξᵢ and ηᵢ are replaced by equivalent Cauchy sequences, the sum becomes a Cauchy sequence which is equivalent to ξᵢ + ηᵢ. Thus, we have defined an operation of addition in Ḡ. In addition, if ξᵢ is a Cauchy sequence and κ a complex number, κξᵢ is a Cauchy sequence whose equivalence class depends only on the equivalence class of ξᵢ. We have thus defined an operation of scalar multiplication in Ḡ. These two operations satisfy H1–H7.
If ξᵢ and ηᵢ are Cauchy sequences in G, then

lim_{i→∞} (ξᵢ, ηᵢ)  (64)

exists. Furthermore, this complex number is unchanged if ξᵢ and ηᵢ are replaced by equivalent Cauchy sequences. Thus, (64) defines an inner product on Ḡ. One must now verify H8, H9, and H10, so that Ḡ becomes an inner product space. Finally (and this is the only hard part), one proves that Ḡ is complete, and so constitutes a Hilbert space. The Hilbert space Ḡ is called the completion of the inner product space G.
Note that G can be considered as a subspace of its completion; with each ξ ∈ G associate the element of Ḡ (the equivalence class) containing the constant Cauchy sequence ξᵢ = ξ of G. It is easily checked from the definition, in fact, that G is dense in Ḡ. Suppose that G itself were already complete. Then every Cauchy sequence in G would converge, and, from (63), two Cauchy sequences would be equivalent if and only if they converged to the same thing. Thus, in this case Ḡ would be just another copy of G; the completion of a complete space is just that space again.
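A familiar toy model of this construction (my own illustration, not the Hilbert-space case): the rationals sit inside their completion, the reals, and each real number is an equivalence class, in the sense of (63), of Cauchy sequences of rationals. Two quite different Cauchy sequences converging to √2 are equivalent:

```python
from fractions import Fraction as F

# Cauchy sequence 1: Newton's iteration for sqrt(2), entirely in Q.
newton = [F(3, 2)]
for _ in range(6):
    x = newton[-1]
    newton.append((x + 2 / x) / 2)

# Cauchy sequence 2: iterating x -> 1 + 1/(1 + x), whose fixed point
# satisfies x^2 = 2 (the continued-fraction convergents of sqrt(2)).
cf = []
x = F(1)
for _ in range(12):
    x = 1 + 1 / (1 + x)
    cf.append(x)

# The difference sequence tends to zero (Eqn. (63)), so both sequences
# define the same element of the completion.
diff = abs(float(newton[-1] - cf[-1]))
assert diff < 1e-3
assert abs(float(newton[-1]) ** 2 - 2) < 1e-9
```

Neither sequence converges within the rationals; the limit exists only in the completion, exactly as a Cauchy sequence in an incomplete inner product space G converges only in Ḡ.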
Finally, suppose that A is a bounded operator on the inner product space G. Then A can be extended to a bounded operator on Ḡ. That is, there is a bounded operator Ā on Ḡ which reduces to A on G considered as a subspace of Ḡ. To prove this, let ξᵢ be a Cauchy sequence in G. Then, since A is bounded,

|Aξᵢ − Aξⱼ| = |A(ξᵢ − ξⱼ)| ≤ |A| |ξᵢ − ξⱼ|  (65)

whence Aξᵢ is a Cauchy sequence in G. Furthermore, if two Cauchy sequences satisfy (63), then

lim_{i→∞} |Aξᵢ − Aηᵢ| ≤ |A| lim_{i→∞} |ξᵢ − ηᵢ| = 0  (66)
That is to say, Aξᵢ is replaced by an equivalent Cauchy sequence when ξᵢ is. Therefore, we can consider A as acting on elements of Ḡ to produce elements of Ḡ. This action is clearly linear, and so we have an operator Ā defined on Ḡ. Furthermore, this Ā is bounded, and in fact |Ā| = |A|, for

lim_{i→∞} |Aξᵢ| ≤ |A| lim_{i→∞} |ξᵢ|  (67)

for any Cauchy sequence ξᵢ in G.
8 The Complex-Conjugate Space of a Hilbert
Space
Let H be a Hilbert space. We introduce the notion of the complex-conjugate space of H, written H̄. As point-sets, H = H̄. That is to say, with each element ξ ∈ H there is associated an element of H̄; this element will be written ξ̄. Addition in H̄ is defined by

ξ̄ + η̄ = (ξ + η)‾  (68)

In other words, the sum of two elements of H̄ is defined by taking the sum (in H) of the corresponding elements of H, and taking the result back to H̄. Scalar multiplication in H̄, on the other hand, is defined by the formula (κ ∈ C, ξ ∈ H):

κξ̄ = (κ̄ξ)‾  (69)

That is, to multiply an element of H̄ by a complex number, one multiplies the corresponding element of H by the complex-conjugate of that number, and takes the result back to H̄. (Note that a bar appears in two different senses in (69). A bar over a complex number denotes its complex-conjugate; a bar over an element of H denotes the corresponding element of H̄.) Finally, the inner product on H̄ is fixed by requiring that the transition from H to H̄ preserve norms:

|ξ̄| = |ξ|  (70)

It is obvious that this H̄ thus becomes a Hilbert space.
Note that the complex-conjugate space of H̄ is naturally isomorphic with H. We write (H̄)‾ = H, and, for ξ ∈ H, (ξ̄)‾ = ξ.
The reason for introducing H̄ is that one frequently encounters mappings on H which are anti-linear (T(ξ + κη) = T(ξ) + κ̄T(η)) rather than linear (T(ξ + κη) = T(ξ) + κT(η)). Anti-linear mappings on H become linear mappings on H̄, and it is easier to think about linear mappings than anti-linear ones. Consider, for example, the inner product on H, (ξ, η). This can be considered as a mapping H × H → C, which is linear in the first H and anti-linear in the second. If, however, we consider the inner product as a mapping H × H̄ → C, it becomes linear in both factors.
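A concrete sketch of this bookkeeping (an illustration of mine, modeling H as C³ and realizing the map ξ → ξ̄ as componentwise conjugation, so that "taking the result back" is conjugating again):

```python
import numpy as np

def bar(xi):
    # the map H -> Hbar in this model: componentwise conjugation
    return np.conj(xi)

rng = np.random.default_rng(1)
xi = rng.standard_normal(3) + 1j * rng.standard_normal(3)
eta = rng.standard_normal(3) + 1j * rng.standard_normal(3)
kappa = 2.0 - 1.5j

# Eqn. (68): xi_bar + eta_bar = (xi + eta)_bar
assert np.allclose(bar(xi) + bar(eta), bar(xi + eta))
# Eqn. (69): kappa . xi_bar = (kappa_bar . xi)_bar -- scalar multiplication
# in Hbar uses the CONJUGATE scalar back in H.
assert np.allclose(kappa * bar(xi), bar(np.conj(kappa) * xi))
# Eqn. (70): the transition preserves norms.
assert np.isclose(np.linalg.norm(bar(xi)), np.linalg.norm(xi))

# The inner product, viewed as a map H x Hbar -> C, is linear in BOTH slots:
ip = lambda x, ybar: np.sum(x * ybar)
assert np.isclose(ip(kappa * xi, bar(eta)), kappa * ip(xi, bar(eta)))
assert np.isclose(ip(xi, kappa * bar(eta)), kappa * ip(xi, bar(eta)))
```

The last two assertions are the point of the construction: the anti-linearity of the usual inner product has been absorbed into the passage from H to H̄.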
9 The Tensor Product of Hilbert Spaces
With any finite collection of Hilbert spaces, H^a, H^b, . . . , H^c, there is associated another Hilbert space, called the tensor product of H^a, H^b, . . . , H^c, and written H^{ab···c}. We first fix some notation. The Hilbert spaces out of which the tensor product is to be constructed are labeled by superscript Latin indices, H^a, H^b, . . . , H^c, and an element of, say, H^a carries the same index, ξ^a. The complex-conjugate spaces are written with subscripts, H_a, H_b, . . . , H_c, and the element of H_a which corresponds to ξ^a would be written ξ̄_a = (ξ^a)‾. Finally, the operation of taking the inner product (which associates a complex number, linearly, with an element of H and an element of H̄) is indicated by placing the two elements next to each other, e.g., ξ̄_a η^a. Hence, |ξ^a|² = ξ̄_a ξ^a. The inner product operation looks (and is) similar to contraction.
We now wish to define the tensor product. In order to avoid cumbersome strings of dots, we shall discuss the tensor product of just two Hilbert spaces, H^a and H^b. Consider the collection of all formal sums

ξ^a σ^b + . . . + η^a ρ^b  (71)

"Formal" here means that the pluses and juxtapositions of elements in (71) are not to be considered, for the moment, as well-defined operations. They are merely marks on the paper. We introduce, on the collection of all such formal sums, an equivalence relation: two formal sums will be considered equivalent if they can be obtained from each other by any combination of the following operations on such sums:
1. Permute, in any way, the terms of a formal sum.
2. Add to a formal sum, or delete, the following combination of terms: (κξ^a)σ^b + (−1)ξ^a(κσ^b).
3. Add to a formal sum, or delete, the following combination of terms: (ξ^a + η^a)σ^b + (−1)(ξ^a σ^b) + (−1)(η^a σ^b).
4. Add to a formal sum, or delete, the following combination of terms: ξ^a(σ^b + ρ^b) + (−1)(ξ^a σ^b) + (−1)(ξ^a ρ^b).
We denote the collection of equivalence classes by F^{ab}. The rule

(ξ^a σ^b + . . . + η^a ρ^b) + (μ^a ν^b + . . . + τ^a ω^b) = ξ^a σ^b + . . . + η^a ρ^b + μ^a ν^b + . . . + τ^a ω^b  (72)

The equivalence class of the formal sum on the right in (72) depends, of course, only on the equivalence classes of the two formal sums on the left, and so we have defined an operation of addition on F^{ab}. Similarly, the rule

κ(ξ^a σ^b + . . . + η^a ρ^b) = (κξ^a)σ^b + . . . + (κη^a)ρ^b  (73)

induces an operation of scalar multiplication on F^{ab}. These two operations give F^{ab} the structure of a complex vector space. So far, we have merely repeated the standard construction of the tensor product of two vector spaces.
We next wish to define an inner product, or, equivalently, a norm, on F^{ab}. The norm of a formal sum, Eqn. (71), is defined by writing

(ξ̄_a σ̄_b + . . . + η̄_a ρ̄_b)(ξ^a σ^b + . . . + η^a ρ^b)  (74)

and expanding. For example, the norm of a formal sum with just two terms would be given by the sum of the complex numbers on the right of:

(ξ̄_a σ̄_b + η̄_a ρ̄_b)(ξ^a σ^b + η^a ρ^b)
= (ξ̄_a ξ^a)(σ̄_b σ^b) + (ξ̄_a η^a)(σ̄_b ρ^b) + (η̄_a ξ^a)(ρ̄_b σ^b) + (η̄_a η^a)(ρ̄_b ρ^b)
  (75)
This norm clearly depends only on the equivalence class of the formal sum, and so defines a norm, and hence an inner product, on F^{ab}. Is this norm positive-definite? A formal sum will be said to be in normal form if the H^a elements appearing in that sum are pairwise either parallel or perpendicular, if the H^b elements appearing in that sum are also either parallel or perpendicular, and if, finally, no two terms in that sum have the property that both their H^a elements and their H^b elements are parallel. (This last condition can always be achieved by combining terms.) The norm of a formal sum in normal form is clearly positive. For example, if Eqn. (71) were in normal form, its norm would be

|ξ^a σ^b + . . . + η^a ρ^b|² = (ξ̄_a ξ^a)(σ̄_b σ^b) + . . . + (η̄_a η^a)(ρ̄_b ρ^b)  (76)
Thus, the proof that the norm we have defined on F^{ab} is positive-definite will be complete if we can show that every formal sum is equivalent to a formal sum in normal form. The essential step in this demonstration is the Gram-Schmidt orthogonalization procedure. Let η, μ₁, . . . , μₙ be elements of a Hilbert space H. Then

η = κ₁μ₁ + · · · + κₙμₙ + τ  (κᵢ ∈ C, (μᵢ, τ) = 0)  (77)

That is to say, any vector in H can be written as a linear combination of μ₁, . . . , μₙ, plus a vector perpendicular to the μ's. Consider now a formal sum, say

ξ^a σ^b + η^a ρ^b  (78)
We obtain an equivalent formal sum by replacing η^a by a vector parallel to ξ^a plus a vector perpendicular to ξ^a:

ξ^a σ^b + (κξ^a + τ^a)ρ^b  (79)

in which τ^a is perpendicular to ξ^a. We next replace ρ^b by a vector parallel to σ^b plus a vector perpendicular to σ^b, obtaining (after expanding, and combining any terms whose H^a elements and H^b elements are both parallel) a formal sum such as

(1 + κλ)ξ^a σ^b + κξ^a ν^b + λτ^a σ^b + τ^a ν^b  (80)

with ν^b perpendicular to σ^b. The sum (80) is in normal form. Continuing in this way, every formal sum is seen to be equivalent to one in normal form. Thus, the norm on F^{ab} is positive-definite, whence F^{ab} is an inner-product space.
We now define the tensor product of H^a and H^b to be the completion of F^{ab}:

H^{ab} = F̄^{ab}  (81)

(the bar here denoting completion, as in Sect. 7). This is really quite complicated. An element of the tensor product is an equivalence class of Cauchy sequences in an inner-product space whose elements are themselves equivalence classes of formal sums. Note that F^{ab} is dense in H^{ab}; elements of F^{ab} will be called finite elements. In fact, we shall go one step further and consider formal sums (71) to be elements of the tensor product. Equivalent formal sums are then equal as elements of the tensor product. With these conventions, we shall be able to avoid, for the most part, having to speak always in terms of equivalence classes and Cauchy sequences.
The tensor product of more than two Hilbert spaces, H^{ab···c}, is defined in a completely analogous way. We use Greek indices to indicate membership in the various tensor products, e.g., we write ξ^α for a typical element of H^α, where

H^α = H^{ab···c}  (82)

Note, incidentally, that our original formal sums are now considerably less formal. For example, Eqn. (71) can be considered as the sum, in H^{ab}, of the following elements of H^{ab}: ξ^a σ^b, . . . , η^a ρ^b.
We next observe that there is a natural one-to-one correspondence between the formal sums which are used to obtain H^{ab},

ξ^a σ^b + . . . + η^a ρ^b  (83)

and those which are used to obtain the tensor product of H_a and H_b,

ξ̄_a σ̄_b + . . . + η̄_a ρ̄_b  (84)

That is to say, the inner product space consisting of finite elements of the tensor product of H_a and H_b is the same as (words we shall frequently use instead of "is naturally isomorphic with") the complex-conjugate space of H^{ab}. This space will therefore be written H_{ab}, and the element of H_{ab} which corresponds to ξ^{ab}, of H^{ab}, will be written

(ξ^{ab})‾ = ξ̄_{ab}  (85)

Thus, the index notation extends very nicely from the original collection of Hilbert spaces to the various tensor products which can be constructed.
We now introduce some operations between elements of our tensor product spaces. Let ξ^{a···b} and η_{a···b} be finite elements of H^{a···b} and H_{a···b}, respectively. Expanding each as a formal sum and taking inner products term by term, we obtain a complex number, written ξ^{a···b}η_{a···b}. For example,

(ξ^a σ^b + η^a ρ^b)(μ̄_a ν̄_b + τ̄_a ω̄_b)
= (ξ^a μ̄_a)(σ^b ν̄_b) + (ξ^a τ̄_a)(σ^b ω̄_b) + (η^a μ̄_a)(ρ^b ν̄_b) + (η^a τ̄_a)(ρ^b ω̄_b)
  (86)

Note, furthermore, that

|ξ^{a···b} η_{a···b}| ≤ |ξ^{a···b}| |η_{a···b}|,  (87)

a result which is easily checked by placing ξ^{a···b} and η_{a···b} in normal form.
Can this operation be extended from finite elements to the entire tensor product? Let ξᵢ and ηᵢ be Cauchy sequences of finite elements of H^{a···b} and H_{a···b}, respectively. Then

|ξᵢηᵢ − ξⱼηⱼ| = |ξᵢηᵢ − ξⱼηᵢ + ξⱼηᵢ − ξⱼηⱼ| ≤ |ξᵢ − ξⱼ||ηᵢ| + |ξⱼ||ηᵢ − ηⱼ|  (88)

where we have used (87). Thus, ξᵢηᵢ is a Cauchy sequence of complex numbers, and so converges; its limit depends only on the limits of the sequences ξᵢ and ηᵢ. In this way, the product is defined for an arbitrary element of H^{a···b} and an arbitrary element of H_{a···b}.

Next, let ξ^{a···b} and η^{c···d} be finite elements of H^{a···b} and H^{c···d},
respectively. We can certainly associate with these two an element, ξ^{a···b}η^{c···d}, of H^{a···b c···d}. For example,

(ξ^a σ^b + η^a ρ^b)(μ^c) = ξ^a σ^b μ^c + η^a ρ^b μ^c  (89)

Note, furthermore, that

|ξ^{a···b} η^{c···d}| = |ξ^{a···b}| |η^{c···d}|  (90)

This operation, too, can be extended from finite elements to the entire tensor product. Let ξᵢ and ηᵢ be Cauchy sequences. Then ξᵢηᵢ is also a Cauchy sequence, for

|ξᵢηᵢ − ξⱼηⱼ| = |ξᵢηᵢ − ξⱼηᵢ + ξⱼηᵢ − ξⱼηⱼ| ≤ |ξᵢ − ξⱼ||ηᵢ| + |ξⱼ||ηᵢ − ηⱼ|  (91)

This Cauchy sequence must converge to some element of H^{a···b c···d}, and it is this element which we take as the product. In short, with an element of H^{a···b} and an element of H^{c···d} we may associate an element of their tensor product:

(H^{a···b})(H^{c···d}) = H^{a···b c···d}  (92)

That is, we have shown that finite elements of the left side can be considered as elements of the right side of (92). The rest of the proof is analogous to the proofs above.
We next consider the extension of operators from our Hilbert spaces to their tensor products. Let A be a linear operator from H^a to H^b. In the index notation, such an operator is written A_a^b: the result of applying it to ξ^a is the element ξ^a A_a^b of H^b. (The notation may be misleading here. While every element of H_a^b defines such an operator, not every operator can be considered as belonging to H_a^b.) With each finite element ξ^{a···c} of H^{a···c} we may associate, applying A term by term to the first factor, a finite element of H^{b···c}. For example,

A_a^b (ξ^a σ^c + η^a ρ^c) = (A_a^b ξ^a)σ^c + (A_a^b η^a)ρ^c  (93)

Unfortunately, there is in general no inequality which will permit us to extend this operation to the entire Hilbert space H^{a···c}. Thus, in general, an operator on H^a does not extend to an operator from H^{a···c} to H^{b···c}. If A is bounded, then

|A_a^b ξ^{a···c}| ≤ |A| |ξ^{a···c}|.  (94)

(Proof: Use normal form.) In this case, (94) implies that A_a^b ξᵢ^{a···c} is a Cauchy sequence if ξᵢ^{a···c} is, and so we may define A_a^b ξ^{a···c} for any ξ^{a···c} in H^{a···c}.
In fact, in many applications of the tensor product, the H^a, H^b, etc. are all merely copies of one fixed Hilbert space H. We then have one-to-one correspondences between H^a, H^b, etc.; the element of H^b corresponding with ξ^a would be written ξ^b. In this case, when all our underlying Hilbert spaces are copies of a single Hilbert space H, we may introduce symmetrization over tensor indices (round brackets), e.g.,

ξ^{(abc)} = (1/6)(ξ^{abc} + ξ^{bca} + ξ^{cab} + ξ^{acb} + ξ^{cba} + ξ^{bac})  (95)

and anti-symmetrization over tensor indices (square brackets), e.g.,

ξ^{[abc]} = (1/6)(ξ^{abc} + ξ^{bca} + ξ^{cab} − ξ^{acb} − ξ^{cba} − ξ^{bac})  (96)
Note that any Cauchy sequence of symmetric (resp., skew) tensors of a given
rank converges to a tensor which is necessarily symmetric (resp., skew). Hence,
the symmetric (resp., skew) tensors of a given rank themselves form a Hilbert
space. Similar remarks apply, of course, to any other symmetry on tensor in-
dices.
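The operations (95) and (96) are easy to realize concretely. The following sketch (mine, with H modeled as C⁴ and a rank-3 tensor as a numpy array) builds the two projections and checks that they behave as claimed:

```python
import itertools
import numpy as np

def parity(p):
    # sign of a permutation, by counting inversions
    inv = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inv % 2 else 1

def sym(t):
    # round brackets, Eqn. (95): average over all index permutations
    perms = list(itertools.permutations(range(t.ndim)))
    return sum(np.transpose(t, p) for p in perms) / len(perms)

def asym(t):
    # square brackets, Eqn. (96): signed average over all index permutations
    perms = list(itertools.permutations(range(t.ndim)))
    return sum(parity(p) * np.transpose(t, p) for p in perms) / len(perms)

rng = np.random.default_rng(3)
t = rng.standard_normal((4, 4, 4))
s, a = sym(t), asym(t)

# A symmetric tensor is unchanged by swapping indices; a skew one flips sign.
assert np.allclose(s, np.swapaxes(s, 0, 1))
assert np.allclose(a, -np.swapaxes(a, 0, 1))
# Symmetrizing (resp. anti-symmetrizing) twice gives nothing new: projections.
assert np.allclose(sym(s), s)
assert np.allclose(asym(a), a)
```

Since these are continuous projections, limits of symmetric (resp. skew) tensors are symmetric (resp. skew), which is the closure property used in the paragraph above.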
There are an enormous number of facts about tensor products of Hilbert
spaces. We have stated a few of them and proven still fewer here. It
is the sheer bulk of the information, however, which makes the index notation
valuable. Elementary facts are made to look elementary, and the mind is freed
for important questions.
10 Fock Space: The Symmetric Case
The arena in which one discusses systems of many noninteracting identical par-
ticles is a Hilbert space called Fock space. This Hilbert space is constructed in
terms of the Hilbert space H of one-particle states. Although the construction of
H itself depends on the type of particle being considered (neutrinos, electrons,
mesons, photons, etc.), the steps leading from H to its Fock space are independent of such details. In fact, there are two Fock spaces which can be associated with a given Hilbert space H: what we shall call the symmetric Fock space and the anti-symmetric Fock space. If H represents the one-particle states of a Boson field, the appropriate space of many-particle states is the symmetric Fock space based on H. Similarly, fermions are described by the anti-symmetric Fock space. We shall define the Fock spaces associated with a Hilbert space H and a few of the operators on these spaces.
Let H be a Hilbert space. The (symmetric) Fock space based on H is the Hilbert space

C ⊕ H^a ⊕ H^{(ab)} ⊕ H^{(abc)} ⊕ · · ·  (97)

where H^a, H^b, etc. are all copies of H (Sect. 9), and where the round brackets surrounding the indices of the tensor products mean that the Hilbert space of symmetric tensors is to be used. More explicitly, an element of the symmetric Fock space consists of a string

Ψ = (κ, ξ^a, ξ^{ab}, ξ^{abc}, . . .)  (98)

where κ is a complex number, ξ^a is an element of H, ξ^{ab} is a symmetric (ξ^{ab} = ξ^{(ab)}) second-rank tensor over H, etc., and for which the sum

|Ψ|² = |κ|² + ξ̄_a ξ^a + ξ̄_{ab} ξ^{ab} + · · · ,  (99)

which defines the norm of Ψ, converges. Physically, ξ^{a₁···aₙ} represents the n-particle contribution to Ψ. That the tensors are required to be symmetric is a reflection of the idea that Ψ is invariant under interchange of identical particles.
We next introduce the creation and annihilation operators. Let σ ∈ H. We associate with this σ an operator C(σ) on Fock space, this operator defined by its action on a typical element (98):

C(σ)Ψ = (0, κσ^a, √2 σ^{(a}ξ^{b)}, √3 σ^{(a}ξ^{bc)}, . . .)  (100)

Similarly, with each σ̄ ∈ H̄ we associate an operator A(σ̄), defined by

A(σ̄)Ψ = (σ̄_a ξ^a, √2 σ̄_a ξ^{ab}, √3 σ̄_a ξ^{abc}, . . .)  (101)

This C(σ) is called the creation operator (associated with σ); A(σ̄) the annihilation operator (associated with σ̄). Note that the creation and annihilation operators are only defined on a dense subset of Fock space, for, in general, the sum on the right in (99) will not converge for the right sides of (100) and (101).
It is an easy exercise in tensor calculus to work out the commutators of these operators:

[C(σ), C(τ)] = 0
[A(σ̄), A(τ̄)] = 0
[A(σ̄), C(τ)] = (σ̄_a τ^a) I
  (102)
For example, the last equation in (102) would be derived as follows:

A(σ̄)C(σ)Ψ = A(σ̄)(0, κσ^a, √2 σ^{(a}ξ^{b)}, √3 σ^{(a}ξ^{bc)}, . . .)
= ((σ̄_a σ^a)κ, 2σ̄_b σ^{(b}ξ^{a)}, 3σ̄_b σ^{(b}ξ^{ac)}, . . .)

C(σ)A(σ̄)Ψ = C(σ)(σ̄_a ξ^a, √2 σ̄_b ξ^{ba}, √3 σ̄_b ξ^{bac}, . . .)
= (0, (σ̄_b ξ^b)σ^a, 2σ̄_b σ^{(a}ξ^{c)b}, . . .)
  (103)

Subtracting, slot by slot, gives [A(σ̄), C(σ)]Ψ = (σ̄_a σ^a)Ψ.
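For a single one-particle state the tensor structure collapses, and the commutation relations (102) become matrix identities on the truncated Fock space. The following sketch is my own illustration (the truncation at n_max particles is an artifact, which breaks the commutator only in the top level):

```python
import numpy as np

# One-mode symmetric Fock space: one n-particle state for each n <= n_max.
n_max = 8
A = np.diag(np.sqrt(np.arange(1, n_max + 1)), k=1)  # A|n> = sqrt(n)|n-1>
C = A.conj().T                                      # creation, the adjoint
N = C @ A                                           # number operator

comm = A @ C - C @ A
# Eqn. (102): [A, C] = I -- exact except in the top truncated level.
assert np.allclose(comm[:n_max, :n_max], np.eye(n_max))
# Occupation numbers 0, 1, 2, ... appear on the diagonal of N.
assert np.allclose(np.diag(N), np.arange(n_max + 1))

# The sqrt(n) factors are what make C and A adjoints (cf. Eqn. (104)):
rng1, rng2 = np.random.default_rng(4), np.random.default_rng(5)
psi, phi = rng1.standard_normal(n_max + 1), rng2.standard_normal(n_max + 1)
assert np.isclose(np.vdot(C @ psi, phi), np.vdot(psi, A @ phi))
```

The full Fock space is an infinite assembly of such ladders, one per element of an orthonormal basis of H.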
Furthermore, if Φ represents the element (λ, η^a, η^{ab}, . . .), then

(C(σ)Ψ, Φ) = (Ψ, A(σ̄)Φ)  (104)

(A(σ̄)Ψ, Φ) = (Ψ, C(σ)Φ)  (105)

Eqn. (104) is often summarized in words by saying that C(σ) and A(σ̄) are adjoints of each other. (An operator is thus its own adjoint if and only if it is Hermitian. Technical distinctions are sometimes made between the adjoint and the Hermitian conjugate, and between self-adjoint and Hermitian. We shall not make these distinctions until they arise.) We can now understand why the strange factors √2, √3, . . . were inserted in (100) and (101): they are precisely what is needed to make the adjoint relations (104) and (105) hold. The number operator in the state σ (σ ≠ 0) is defined by

N(σ) = |σ|⁻² C(σ)A(σ̄)  (106)

so that, from (103),

N(σ)Ψ = |σ|⁻² (0, (σ̄_b ξ^b)σ^a, 2σ̄_b σ^{(a}ξ^{c)b}, . . .)  (107)

The total number operator, N, is defined by

NΨ = (0, ξ^a, 2ξ^{ab}, 3ξ^{abc}, 4ξ^{abcd}, . . .)  (108)
Note that these operators, too, are only defined on a dense subset of Fock space. We can think, intuitively, of N as resulting from summing the N(σ)'s over an orthonormal basis, using

Σᵢ σᵢ^a σ̄ᵢ_b = δ^a_b  (σᵢ an orthonormal basis)  (109)

The eigenvalues of N, and also of N(σ), are the nonnegative integers. To find the simultaneous eigenvectors, first note that

σ^{(a₁} σ̄_b ξ^{a₂···aₙ)b} = μ ξ^{a₁···aₙ}  (112)

for some complex number μ if and only if

ξ^{a₁···aₙ} = σ^{(a₁} · · · σ^{aₘ} τ^{aₘ₊₁···aₙ)}  (113)

for some τ^{aₘ₊₁···aₙ} satisfying τ^{aₘ₊₁···aₙ} σ̄_{aₙ} = 0. Proof: If μ = 0, we're through. If μ ≠ 0, (112) implies

ξ^{a₁···aₙ} = σ^{(a₁} κ^{a₂···aₙ)}  (114)

for some κ^{a₂···aₙ}. If κ^{a₂···aₙ} σ̄_{aₙ} = 0, we're through. If not, substitute (114) into (112) to obtain

ξ^{a₁···aₙ} = σ^{(a₁} σ^{a₂} λ^{a₃···aₙ)}  (115)

Continue in this way. It is now clear, from (107), that the most general simultaneous eigenvector of N and N(σ), with eigenvalues n and m, respectively, is

(0, 0, . . . , 0, σ^{(a₁} · · · σ^{aₘ} τ^{aₘ₊₁···aₙ)}, 0, . . .)  (116)

where τ^{aₘ₊₁···aₙ} σ̄_{aₙ} = 0.
11 Fock Space: The Anti-Symmetric Case
The definition and properties of Fock space in the antisymmetric case are closely analogous to those in the symmetric case.
Let H be a Hilbert space. The (anti-symmetric) Fock space based on H is the Hilbert space

C ⊕ H^a ⊕ H^{[ab]} ⊕ H^{[abc]} ⊕ · · ·  (117)

where H^a, H^b, etc. are all copies of H, and where the square brackets surrounding the indices of the tensor products mean that the Hilbert space of anti-symmetric tensors is to be used. That is, an element of the antisymmetric Fock space consists of a string

Ψ = (κ, ξ^a, ξ^{ab}, ξ^{abc}, . . .)  (118)
of anti-symmetric tensors over H for which the sum

|Ψ|² = |κ|² + ξ̄_a ξ^a + ξ̄_{ab} ξ^{ab} + · · · ,  (119)

which defines the norm of Ψ, converges. That the tensors are required to be anti-symmetric is a reflection of the idea that Ψ reverses sign under the interchange of identical particles. Physically, ξ^{a₁···aₙ} represents the n-particle contribution to Ψ.
We associate with each σ ∈ H a creation operator, C(σ), and with each σ̄ ∈ H̄ an annihilation operator, A(σ̄), on Fock space as follows:

C(σ)Ψ = (0, κσ^a, √2 σ^{[a}ξ^{b]}, √3 σ^{[a}ξ^{bc]}, . . .)  (120)

A(σ̄)Ψ = (σ̄_a ξ^a, √2 σ̄_a ξ^{ab}, √3 σ̄_a ξ^{abc}, . . .)  (121)

As in the symmetric case, these operators are only defined on a dense subset of Fock space. The commutators of these creation and annihilation operators certainly exist, but they don't reduce to anything simple. We define the anti-commutator of two operators:

{A, B} = AB + BA  (122)

It is the anti-commutators of the creation and annihilation operators which are simple in the antisymmetric case:

{C(σ), C(τ)} = 0
{A(σ̄), A(τ̄)} = 0
{A(σ̄), C(τ)} = (σ̄_a τ^a) I
  (123)
The creation and annihilation operators are still adjoints of each other:

(C(σ)Ψ, Φ) = (Ψ, A(σ̄)Φ)  (124)

There is one further property of the creation and annihilation operators which is special to the antisymmetric case. Setting τ = σ in (123), we obtain:

C(σ)² = 0,  A(σ̄)² = 0  (125)

These equations have a simple physical interpretation. If we try to create two particles in the same state, or annihilate two particles from the same state, we obtain zero. That is to say, one can't have more than one particle in a given state. This, of course, is the essential feature of Fermi statistics.
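With a single one-particle state the anti-symmetric Fock space is two-dimensional (occupation 0 or 1), and the relations (122), (123), and (125) reduce to 2×2 matrix identities. A sketch of mine:

```python
import numpy as np

# Basis: |0> = (1,0), |1> = (0,1).
A = np.array([[0.0, 1.0],   # annihilation: A|1> = |0>, A|0> = 0
              [0.0, 0.0]])
C = A.T                     # creation, the adjoint: C|0> = |1>
I = np.eye(2)

anti = lambda X, Y: X @ Y + Y @ X      # the anti-commutator, Eqn. (122)

# Eqn. (123): {A, C} = I;  Eqn. (125): C^2 = A^2 = 0.
assert np.allclose(anti(A, C), I)
assert np.allclose(C @ C, np.zeros((2, 2)))
assert np.allclose(A @ A, np.zeros((2, 2)))

# Number operator: N = C A; it is a projection (cf. Eqn. (129) below),
# with eigenvalues 0 and 1 only.
N = C @ A
assert np.allclose(N @ N, N)
assert np.allclose(sorted(np.linalg.eigvals(N).real), [0.0, 1.0])
```

The full fermionic Fock space over an n-dimensional H is, in the same spirit, 2ⁿ-dimensional: each basis state is either empty or singly occupied.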
The number operator in the state σ (σ ≠ 0) and the total number operator are defined by:

N(σ)Ψ = |σ|⁻² C(σ)A(σ̄)Ψ = |σ|⁻² (0, (σ̄_b ξ^b)σ^a, 2σ̄_b σ^{[a}ξ^{c]b}, . . .)  (126)

NΨ = (0, ξ^a, 2ξ^{ab}, 3ξ^{abc}, . . .)  (127)
These operators are Hermitian. We can think of N as obtained by summing the N(σ)'s over an orthonormal basis. Some commutators are:

[N(σ), C(σ)] = [N, C(σ)] = C(σ)
[N(σ), A(σ̄)] = [N, A(σ̄)] = −A(σ̄)
[N(σ), N] = 0
  (128)
(It is interesting that one must use commutators, and not anticommutators, to make (128) simple.) The number operator in the state σ has one further property, this one special to the antisymmetric case. From (126), (123), and (125), we have

N(σ)² = N(σ)  (129)

Clearly, (129) is again saying that occupation numbers in the antisymmetric case are either 0 or 1. A Hermitian operator which is equal to its square is called a projection operator. Eqn. (129) (or, alternatively, Eqn. (126)) implies that N(σ) is bounded (and, in fact, |N(σ)| = 1). Hence, from Section 7, N(σ) is defined on all of Fock space. On the other hand, N is only defined on a dense subset.
Finally, we write down the eigenvectors and eigenvalues of our number operators. Once again, the eigenvalues of N are the nonnegative integers, and the general eigenvector with eigenvalue n is:

(0, 0, . . . , 0, ξ^{a₁···aₙ}, 0, . . .)  (130)

The eigenvalue-eigenvector structure of N(σ), however, is quite different from that of the symmetric case. In fact, (129) implies that the only eigenvalues of N(σ) are 0 and 1. First note that

σ^{[a₁} σ̄_{|b|} ξ^{a₂···aₙ]b} = μ ξ^{a₁···aₙ}  (131)

if and only if either ξ^{a₁···aₙ} σ̄_{aₙ} = 0 or

ξ^{a₁···aₙ} = σ^{[a₁} κ^{a₂···aₙ]}  (132)

for some tensor κ^{a₂···aₙ} which is antisymmetric. (We needn't require, in addition, that κ^{a₂···aₙ} σ̄_{aₙ} = 0. Any multiples of σ^a appearing in κ drop out of (132) on anti-symmetrization.) Thus, the most general eigenvector of N(σ) with eigenvalue 0 is characterized by

σ̄_a ξ^a = 0,  σ̄_a ξ^{ab} = 0,  . . .  (133)

The most general eigenvector with eigenvalue 1 is

(0, κσ^a, σ^{[a}ξ^{b]}, σ^{[a}ξ^{bc]}, σ^{[a}ξ^{bcd]}, . . .).  (134)
12 Klein-Gordon Fields as Operators
Everybody knows that one essential idea of quantum field theory is that classical fields (e.g., real or complex-valued functions of position in Minkowski space) are to become operators (operator-valued functions of position in Minkowski space) on some Hilbert space. We have now assembled enough machinery to discuss this transition from fields to operators in the Klein-Gordon case. Of course, the same program will have to be carried out later, in essentially the same way, for other fields. The resulting field operators will play an important role when we discuss interactions.
Let φ⁺ be a positive-frequency solution of the Klein-Gordon equation. That is, φ⁺ is a complex-valued function of position in Minkowski space. The complex-conjugate function of φ⁺, i.e., the function defined by the property that its value at a point in Minkowski space is to be the complex-conjugate of the value of φ⁺, will be written φ⁻. Clearly, φ⁻ is a negative-frequency solution, and the sum

φ(x) = φ⁺(x) + φ⁻(x)  (135)

is a real solution of the Klein-Gordon equation. The functions φ⁺ and φ⁻ can certainly be recovered from φ: they are the positive- and negative-frequency parts, respectively, of φ. Alternatively, these relations can be discussed in terms of functions in momentum space. Let h⁺ (resp., h⁻) be the function, on the mass shell in momentum space, representing the positive-frequency (resp., negative-frequency) part of our solution of the Klein-Gordon equation. Then h⁺ and h⁻ are related by

h⁻(k) = h̄⁺(−k)  (136)

That is to say, the value of h⁻ at k is the complex-conjugate of the value of h⁺ at −k. The real solution φ may thus be described by a single function h on the mass shell:

φ(x) = ∫_M h(k) e^{ik_a x^a} dV  (137)

This h has the property that it is invariant under simultaneous complex-conjugation and reflection through the origin, a property equivalent to the reality of φ.
Our φ⁺, φ⁻, and φ each assign a number (for the first two, a complex number; for the last, a real number) to each point of Minkowski space. Roughly speaking, what we want to do is find objects φ⁺(x), φ⁻(x), and φ(x) which assign, to each point of Minkowski space, an operator on Fock space. Since the functions φ⁺(x) and φ⁻(x) are complex-conjugates of each other, we demand that the operators φ⁺(x) and φ⁻(x) be adjoints of each other; since the function φ(x) is real and the sum of φ⁺(x) and φ⁻(x), we demand that the operator φ(x) be Hermitian, the sum of the operators φ⁺(x) and φ⁻(x). To this end, choose an orthonormal basis ξᵢ of H; with each basis element there is associated a positive-frequency solution φ⁺ᵢ(x), a negative-frequency solution φ⁻ᵢ(x) (the complex-conjugate function of φ⁺ᵢ(x)), and a real solution φᵢ(x) (the sum of φ⁺ᵢ(x) and φ⁻ᵢ(x)). Then any triple of solutions, φ⁺(x), φ⁻(x), φ(x) (where φ⁺ is a positive-frequency solution, φ⁻ is its complex-conjugate, and φ is their sum) could be expanded in terms of our basis:
φ⁺(x) = Σᵢ aᵢ φ⁺ᵢ(x)  (138)

φ⁻(x) = Σᵢ āᵢ φ⁻ᵢ(x)  (139)

φ(x) = φ⁺(x) + φ⁻(x) = Σᵢ (aᵢ φ⁺ᵢ(x) + āᵢ φ⁻ᵢ(x))  (140)
Here, a₁, a₂, . . . are simply complex numbers. Thus, triples of solutions related as above would be characterized by sequences of complex numbers. To pass from fields to operators, we could now simply replace the coefficients in the expansions (138), (139), and (140) by the corresponding creation and annihilation operators:
φ⁺(x) = Σᵢ φ⁺ᵢ(x) A(ξ̄ᵢ)  (141)

φ⁻(x) = Σᵢ φ⁻ᵢ(x) C(ξᵢ)  (142)

φ(x) = φ⁺(x) + φ⁻(x) = Σᵢ (φ⁺ᵢ(x) A(ξ̄ᵢ) + φ⁻ᵢ(x) C(ξᵢ))  (143)
Fix the basis ξᵢ. Then, for each position x in Minkowski space, φ⁺ᵢ(x), φ⁻ᵢ(x), and φᵢ(x) are just numbers. The right sides of (141), (142), and (143) are simply (infinite!) sums of operators on Fock space. In this way, we might expect to be able to associate operators, φ⁺(x), φ⁻(x), and φ(x), with each point x of Minkowski space. Unfortunately, the sums (141), (142), and (143) do not converge. The remedy is to smear: instead of assigning an operator to each point of Minkowski space, we assign an operator to each test function f(x) on Minkowski space, formally

φ⁺(f) = ∫ φ⁺(x) f(x) dV  (145)

Substituting the expansions above, we are led to set

φ⁺(f) = Σᵢ (∫ φ⁺ᵢ(x) f(x) dV) A(ξ̄ᵢ)  (146)

φ⁻(f) = Σᵢ (∫ φ⁻ᵢ(x) f(x) dV) C(ξᵢ)  (147)

φ(f) = φ⁺(f) + φ⁻(f)  (148)

Thus, the operators which result from smearing are merely the creation and annihilation operators, weighted by the components of f with respect to the basis vectors.
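For a single basis mode, the structure of (146), (147), and (148) can be checked with finite matrices. This is my own sketch; the complex coefficient c is an arbitrary stand-in for the smearing integral ∫φᵢ⁺f dV with f a real test function:

```python
import numpy as np

# Truncated single-mode Fock space, as before.
n_max = 6
A = np.diag(np.sqrt(np.arange(1, n_max + 1)), k=1)   # annihilation
C = A.conj().T                                       # creation

c = 0.7 - 0.3j                    # stand-in for  int phi_i^+ f dV
phi_plus = c * A                  # one mode of phi^+(f), Eqn. (146)
phi_minus = np.conj(c) * C        # one mode of phi^-(f), Eqn. (147)
phi = phi_plus + phi_minus        # phi(f), Eqn. (148)

# phi^+(f) and phi^-(f) are adjoints of each other; phi(f) is Hermitian,
# exactly the properties demanded of the field operators.
assert np.allclose(phi_plus.conj().T, phi_minus)
assert np.allclose(phi.conj().T, phi)
```

That the conjugate coefficient appears on the creation part is forced by the reality of φ, just as ā accompanies φ⁻ᵢ in (140).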
The preceding two paragraphs, particularly (146), (147), and (148), are intended to motivate the simple and precise definitions which follow. Let f(x) be a test function. Then the Fourier inverse of f, written f̃(k), may be restricted to the mass shell; this restriction is square-integrable there, and hence defines an element, ξ(f), of our Hilbert space H. (The proof that this f̃(k) is actually square-integrable over the mass shell is given in books on Fourier analysis.) That is, with each test function f there is associated an element ξ(f) of H. We now define:

φ⁺(f) = A(ξ̄(f))  (149)

φ⁻(f) = C(ξ(f))  (150)

φ(f) = φ⁺(f) + φ⁻(f)  (151)

Note that φ⁺(f) and φ⁻(f) are adjoints of each other, and that φ(f) is Hermitian, with φ(f) = φ⁺(f) + φ⁻(f). It is now clear why we were not able to define operators such as φ⁺(x) earlier. We can think of φ⁺(x) as being the limit of φ⁺(f) as f approaches a δ-function at x (see (145)). But as f approaches a δ-function, f̃ does not approach anything square-integrable on the mass shell, and so ξ(f) has no limit in H.

One might conjecture, since the classical fields satisfy the Klein-Gordon equation, that our smeared operators satisfy it too, in the sense that

φ⁺((□ + μ²)f) = 0
φ⁻((□ + μ²)f) = 0
φ((□ + μ²)f) = 0
  (155)
(Note that if f is a test function, so is (□ + μ²)f.) In fact, our conjecture is true, for if f̃(k) is the Fourier inverse of f, then (μ² − k_a k^a)f̃(k) is the Fourier inverse of (□ + μ²)f. But this function vanishes on the mass shell, whence ξ((□ + μ²)f) = 0, whence (155).
We next work out the commutators of our operators. Since any two creation operators commute, and any two annihilation operators commute (see (102)), we have

[φ⁺(f), φ⁺(g)] = 0,  [φ⁻(f), φ⁻(g)] = 0  (156)

Furthermore, since the commutator of any creation operator with any annihilation operator is a multiple of the identity operator (see (102)), we have

[φ⁺(f), φ⁻(g)] = −i D⁺(f, g) I,  [φ⁻(f), φ⁺(g)] = −i D⁻(f, g) I  (157)

where, from (102), (149), and (150), the complex numbers D⁺(f, g) and D⁻(f, g) are given by

D⁺(f, g) = i ξ̄_a(f) ξ^a(g),  D⁻(f, g) = −i ξ̄_a(g) ξ^a(f)  (158)
Therefore,

D⁺(f, g) = −D⁻(g, f)  (159)

D̄⁺(f, g) = D⁻(f, g)  (160)

The commutators of the operators follow from (151) and (157),

[φ(f), φ(g)] = −i (D⁺(f, g) + D⁻(f, g)) I = −i D(f, g) I  (161)

where the second equality is the definition of D(f, g). Eqns. (159) and (160) now imply

D(f, g) = −D(g, f)  (162)

D̄(f, g) = D(f, g)  (163)

That is to say, D(f, g) is real and antisymmetric in f and g.
The D-functions have one further property, which can be called Poincare invariance. Let x → Px be a Poincare transformation on Minkowski space which does not reverse the future and past time directions. (This last stipulation is necessary because the distinction between positive frequency and negative frequency requires a particular choice of a future time direction on Minkowski space.) Then, defining the test functions f′(x) = f(Px), g′(x) = g(Px), we have

D^±(f′, g′) = D^±(f, g),  D(f′, g′) = D(f, g)  (164)
The functions D^±(f, g) and D(f, g) can be expressed in terms of distributions D^±(x, y) and D(x, y) on Minkowski space:

D^±(f, g) = ∫dV_x ∫dV_y D^±(x, y) f(x) g(y)
D(f, g) = ∫dV_x ∫dV_y D(x, y) f(x) g(y)
  (165)

It is not surprising that there should exist such distributions. A distribution, after all, is just a continuous linear mapping from the topological vector space of test functions to the reals, and our D-functions are linear and continuous in each of their arguments. Poincare invariance, (164), implies that these distributions depend only on the relative position of their two arguments:

D^±(x, y) = D^±(x − y)
D(x, y) = D(x − y)
  (166)

where we have written x − y for the position vector of x relative to y. It is not difficult to evaluate the functions (166) explicitly using (158) and a table of integrals. They involve Bessel functions.
There is, however, one particularly interesting property of D(f, g). Test functions f and g will be said to have relatively spacelike supports if, for any point x of the support of f and any point y of the support of g, x − y is spacelike. The property is the following: If f and g have relatively spacelike supports, then D(f, g) = 0. The easiest proof is by means of the distribution D(x, y). Eqn. (162) implies

D(x − y) = −D(y − x)  (167)

But if x − y is spacelike, there is a Poincare transformation which does not reverse future and past and which takes x to y and y to x (i.e., x = Py, y = Px). Poincare invariance, (164), now implies

D(x − y) = D(y − x)  (168)

whence D(x, y) = 0 for x − y spacelike. That D(f, g) = 0 when f and g have relatively spacelike supports now follows from (165).
13 The Hilbert Space of Solutions of Maxwell's Equations
We now wish to write down the quantum theory for a system of many free (non-interacting) photons. Our starting point is the classical field equations: Maxwell's equations. The method is entirely analogous to that of the Klein-Gordon equation: the electromagnetic field plays the role of the Klein-Gordon field φ, the Maxwell equations the role of the Klein-Gordon equation. There are, of course, important differences between the two cases: a tensor field rather than a scalar field, two first-order tensor equations rather than one second-order scalar equation, etc. One further difference should be emphasized. Whereas the Klein-Gordon equation is, in a sense, the Schrödinger equation for a free particle, the Maxwell equations are classical (non-quantum). The electromagnetic analogue of the classical free particle, on the other hand, would be geometrical optics. Thus, we have the following table:

Electrodynamics | Free Relativistic Particle
Geometrical Optics | Classical Dynamics
Maxwell's Equations | Klein-Gordon Equation
Quantum Electrodynamics | Many-Particle Theory
  (169)

Theories appearing in the same row are described, mathematically, in roughly the same terms: for the first row, curves in Minkowski space; for the second row, fields in Minkowski space; for the third row, creation and annihilation operators on Fock space.
The first step is to impose the structure of a Hilbert space on a certain collection of solutions of Maxwell's equations, just as we began the second-quantization of the Klein-Gordon equation by making a Hilbert space of solutions of that equation.
The electromagnetic field is a skew, second-rank tensor field F_ab (= F_[ab]) on Minkowski space. In the absence of sources, this field must satisfy Maxwell's equations:

∇_[a F_bc] = 0  (170)

∇^a F_ab = 0  (171)

Eqn. (170) implies that there exists a vector field A_a on Minkowski space for which

F_ab = 2∇_[a A_b]  (172)

Conversely, given any vector field A_a, the F_ab given by (172) satisfies (170). This A_a is called a vector potential. Substituting (172) into (171), we obtain

□A_a − ∇_a(∇^b A_b) = 0  (173)
Thus, any vector field satisfying (173) defines, via (172), a solution of Maxwell's equations, and, conversely, every solution of Maxwell's equations can be obtained from some vector potential satisfying (173). Two vector potentials, A_a and Ã_a, define (via (172)) the same F_ab if and only if

Ã_a = A_a + ∇_a λ  (174)

for some scalar field λ on Minkowski space. Changes in the vector potential of the form (174) are called gauge transformations. By means of a gauge transformation one can find, for any solution of Maxwell's equations, a vector potential which satisfies

∇^a A_a = 0  (175)
40
Vector potentials which satisfy (175) are said to be in the Lorentz gauge. If a vector potential for a solution of Maxwell's equations is in the Lorentz gauge, then, from (173), it satisfies

$\Box A_a = 0$    (176)

If two vector potentials are both in the Lorentz gauge, and differ by a gauge transformation (174), then necessarily

$\Box \lambda = 0$    (177)

We can summarize the situation with the following awkward remark: the vector space of solutions of Maxwell's equations is equal to the quotient space of the vector space of vector fields which satisfy (175) and (176) by the vector subspace consisting of gradients of scalar fields which satisfy (177). All fields above are, of course, real.
We now do with Maxwell's equations what was done earlier with the Klein-Gordon equation: we go to momentum space. Let $A_a(x)$ be a vector potential, in the Lorentz gauge, for a solution to Maxwell's equations. Set

$A_a(x) = \int_M A_a(k)\, e^{i k_b x^b}\, dV$    (178)

In (178), $k$ represents position in momentum space, and $A_a(k)$ associates a complex vector in momentum space with each such $k$. The integral on the right in (178) associates, with each point $x$ in Minkowski space, a vector in momentum space, and hence a vector in Minkowski space at the point $x$. Thus, the right side of (178) defines a vector field in Minkowski space. We now demand that $A_a(x)$ given by (178) satisfy (176) and (175):

$\Box A_a(x) = \int_M (-k_c k^c)\, A_a(k)\, e^{i k_b x^b}\, dV = 0$    (179)

$\nabla^a A_a(x) = \int_M i k^a A_a(k)\, e^{i k_b x^b}\, dV = 0$    (180)

Eqn. (179) states that $A_a(k)$ vanishes unless $k_a k^a = 0$. That is to say, $A_a(k)$ need only be specified on the null cone in momentum space, or, what is the same thing, on the mass-zero shell, $M_0$. Thus, we can replace (178) by

$A_a(x) = \int_{M_0} A_a(k)\, e^{i k_b x^b}\, dV_0$    (181)
Eqn. (180) states that

$k^a A_a(k) = 0$    (182)

for every $k \in M_0$. An $A_a(k)$ which satisfies (182) will be said to be transverse. Finally, the condition that $A_a(x)$, given by (181), be real, is

$A_a(-k) = \overline{A_a(k)}$    (183)
Eqn. (183) implies, in particular, that the knowledge of $A_a(k)$ on $M_0^+$ determines uniquely the values of $A_a(k)$ on $M_0^-$. We thus need only concern ourselves with $A_a(k)$ on $M_0^+$.

To summarize, there is a one-to-one correspondence (modulo questions of convergence of Fourier integrals) between real vector fields $A_a(x)$ on Minkowski space which satisfy (175) and (176) and transverse complex vector functions $A_a(k)$ on $M_0^+$.
Unfortunately, real vector fields $A_a(x)$ on Minkowski space which satisfy (175) and (176) are not the same as solutions of Maxwell's equations: we have to deal with the problem of gauge transformations. Let $\lambda$ satisfy (177), and let $\tilde A_a(x)$ be given by (174). Then the corresponding Fourier inverses, $\lambda(k)$ and $\tilde A_a(k)$, are clearly related by:

$\tilde A_a(k) = A_a(k) + i k_a \lambda(k)$    (184)

In other words, a gauge transformation on $A_a(x)$ which preserves the Lorentz gauge corresponds simply to adding to $A_a(k)$ a complex multiple of $k_a$. Note that, since $k_a$ is null, the gauge transformations (184) do not destroy the transversality condition, (182).

To summarize, there is a one-to-one correspondence (modulo convergence of Fourier integrals) between solutions of Maxwell's equations and equivalence classes of transverse complex vector functions $A_a(k)$ on $M_0^+$, where two such functions $A_a(k)$ are regarded as equivalent if they differ by a multiple of $k_a$.
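This bookkeeping is easy to check numerically. The sketch below is an illustration only (the particular vectors, and the metric convention $(+,-,-,-)$, are assumptions, not taken from the text): it verifies that adding a multiple of $k_a$ to a transverse $A_a(k)$, as in (184), leaves the transversality condition (182) intact, precisely because $k_a$ is null.

```python
# Illustrative sketch (not from the text): the gauge freedom (184),
# A_a(k) -> A_a(k) + i k_a lambda(k), preserves transversality (182)
# because k_a is null. Metric signature (+,-,-,-).

def mdot(u, v):
    """Minkowski contraction u_a v^a, signature (+,-,-,-)."""
    return u[0]*v[0] - u[1]*v[1] - u[2]*v[2] - u[3]*v[3]

# A future-directed null vector k on the mass-zero shell M_0^+.
k = (5.0, 3.0, 4.0, 0.0)                 # 25 - 9 - 16 - 0 = 0
assert abs(mdot(k, k)) < 1e-12

# Build a transverse A_a(k) by removing the k-component of an arbitrary
# complex vector, using an auxiliary null vector l with k.l != 0.
l = (5.0, -3.0, -4.0, 0.0)
A_raw = (3.0 + 1.0j, 5.0 + 0.0j, 0.0 + 2.0j, 1.0 - 1.0j)
coef = mdot(k, A_raw) / mdot(k, l)
A = tuple(Ai - coef*li for Ai, li in zip(A_raw, l))
assert abs(mdot(k, A)) < 1e-9            # transverse: k^a A_a = 0

# Gauge transformation (184): add i*lambda(k)*k_a for a complex lambda(k).
lam = 0.7 - 0.2j
A_gauge = tuple(Ai + 1j*lam*ki for Ai, ki in zip(A, k))
assert abs(mdot(k, A_gauge)) < 1e-9      # still transverse, since k_a k^a = 0
```

The same cancellation ($k_a k^a = 0$ killing the gauge term) is what makes quantities built by contracting $A_a(k)$ against $\overline{A_a(k)}$ well-defined on the equivalence classes.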
The reason for expressing the content of Maxwell's equations in terms of momentum space is that certain properties of the space of solutions of Maxwell's equations become more transparent there. We first impose on the (real!) solutions of Maxwell's equations the structure of a complex vector space. To add two solutions of Maxwell's equations, one simply adds the tensor fields on Minkowski space. Expressed in terms of momentum space, this means that one adds the corresponding $A_a(k)$. To multiply a solution $F_{ab}$ of Maxwell's equations by a complex number $\alpha$, one multiplies the corresponding complex vector function $A_a(k)$ by $\alpha$ in the usual way, and interprets the result, $\alpha A_a(k)$, as a solution of Maxwell's equations (necessarily, a real solution). These operations clearly extend to operations on the equivalence classes of $A_a(k)$, and hence are well-defined operations on solutions of Maxwell's equations. It is only when $\alpha$ is real that multiplying a solution $F_{ab}$ by $\alpha$, in the sense described above, is equivalent to simply multiplying the tensor field $F_{ab}$ by $\alpha$ in the usual way. This cannot be the case, of course, when $\alpha$ is complex, for the usual product, $\alpha F_{ab}$, would be a complex field on Minkowski space rather than a real one, and solutions of Maxwell's equations must be real. We can, however, give a picture for what the product of $i$ and $F_{ab}$ ("product" and "multiply" will always refer to that operation defined above) means. Let $A_a(x)$ be a vector potential in the Lorentz gauge, and let $A_a(k)$ be as in (181). Then $i A_a(k)$ corresponds to the vector potential

$\int_{M_0^+} i A_a(k)\, e^{i k_b x^b}\, dV_0 + \int_{M_0^-} \overline{i A_a(-k)}\, e^{i k_b x^b}\, dV_0 = \int_{M_0^+} A_a(k)\, e^{i(k_b x^b + \pi/2)}\, dV_0 + \int_{M_0^-} \overline{A_a(-k)}\, e^{i(k_b x^b - \pi/2)}\, dV_0$    (185)

In other words, multiplication of a solution of Maxwell's equations by $i$ corresponds to resolving $F_{ab}$ into complex plane-waves, and shifting the phase of the positive-frequency parts by $\pi/2$ while shifting the phase of the negative-frequency parts by $-\pi/2$. (In exactly the same way, the real solutions of the Klein-Gordon equation form a complex vector space.)
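The phase-shift picture can be seen concretely in a discrete toy model. The sketch below is illustrative only (the sampled sequence and the DFT conventions are assumptions, not from the text): it multiplies the positive-frequency Fourier components of a real sequence by $i$ and the negative-frequency components by $-i$. The result is again real, and applying the operation twice multiplies the sequence by $i^2 = -1$, exactly as multiplication by $i$ must.

```python
import cmath

def dft(x):
    N = len(x)
    return [sum(x[n]*cmath.exp(-2j*cmath.pi*k*n/N) for n in range(N))
            for k in range(N)]

def idft(X):
    N = len(X)
    return [sum(X[k]*cmath.exp(2j*cmath.pi*k*n/N) for k in range(N))/N
            for n in range(N)]

def times_i(x):
    """'Multiplication by i' in the sense of the text: phase +pi/2 on the
    positive-frequency components, -pi/2 on the negative-frequency ones."""
    N = len(x)
    X = dft(x)
    Y = [0.0]*N
    for k in range(1, N):
        Y[k] = X[k] * (1j if k < N//2 else -1j)
    if N % 2 == 0:
        Y[N//2] = 0.0            # drop the ambiguous Nyquist bin in this toy model
    out = idft(Y)
    assert max(abs(v.imag) for v in out) < 1e-9   # the result is again real
    return [v.real for v in out]

# A real "solution", with zero-frequency and Nyquist parts projected out.
x = [0.3, 1.0, -0.7, 0.2, 0.9, -1.1, 0.4, -0.5]
X = dft(x); X[0] = 0.0; X[4] = 0.0
x0 = [v.real for v in idft(X)]

y = times_i(x0)
z = times_i(y)
assert max(abs(zi + xi) for zi, xi in zip(z, x0)) < 1e-9   # twice i gives -1
```

Reality of the result follows from the Hermitian symmetry of the transform of a real sequence, which the opposite phase shifts on positive and negative frequencies preserve; that is the discrete analogue of the conjugation structure in (185).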
We next introduce an inner product on our complex vector space. We define the norm of a transverse complex vector function $A_a(k)$ on $M_0^+$ by

$-2 \int_{M_0^+} \left(\overline{A_a(k)}\, A^a(k)\right) dV_0$    (186)

Since $A_a(k)$ is transverse, and since $k_a k^a = 0$ on $M_0$, the real number (186) is clearly invariant under gauge transformations, (184), on $A_a(k)$. Thus, the norm (186) is well-defined on solutions of Maxwell's equations. Furthermore, the norm (186) is non-negative and vanishes when and only when $A_a(k) = 0$ (more properly, when and only when $A_a(k)$ is in the zero equivalence class, i.e., when and only when $A_a(k)$ is a multiple of $k_a$). To prove this, we show that the integrand is non-negative. Fix $k$, and let

$A_a(k) = m_a + i n_a$    (187)

where $m_a$ and $n_a$ are real. By transversality,

$m_a k^a = n_a k^a = 0$    (188)

The integrand of (186) is

$-\left(\overline{A_a(k)}\, A^a(k)\right) = -m_a m^a - n_a n^a$    (189)

But (188) implies that $m_a$ and $n_a$ are either spacelike or multiples of $k_a$, whence (189) is nonnegative and vanishes when and only when $m_a + i n_a$ is a multiple of $k_a$.
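This positivity argument can be spot-checked numerically. In the sketch below (the particular vectors are assumptions chosen for illustration; signature $(+,-,-,-)$), a transverse $A_a = m_a + i n_a$ at a null $k$ gives a nonnegative value of $-\overline{A_a}A^a$, while a pure-gauge $A_a$ (a multiple of $k_a$) gives zero.

```python
def mdot(u, v):
    """Minkowski contraction, signature (+,-,-,-)."""
    return u[0]*v[0] - u[1]*v[1] - u[2]*v[2] - u[3]*v[3]

k = (1.0, 0.0, 0.0, 1.0)                  # null, future-directed
m = (0.0, 1.0, 0.0, 0.0)                  # m_a k^a = 0; spacelike
n = (2.0, 0.0, 1.0, 2.0)                  # n_a k^a = 0; spacelike (n = e_y + 2k)
assert mdot(m, k) == 0 and mdot(n, k) == 0

A = [mi + 1j*ni for mi, ni in zip(m, n)]
Abar = [a.conjugate() for a in A]
integrand = -mdot(Abar, A).real           # the integrand of (186), as in (189)
assert abs(integrand - (-mdot(m, m) - mdot(n, n))) < 1e-12
assert integrand > 0

# A multiple of k_a lies in the zero equivalence class: the integrand vanishes.
Ag = [(0.5 - 2.0j)*ki for ki in k]
assert abs(mdot([a.conjugate() for a in Ag], Ag)) < 1e-12
```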
Thus, the collection of all equivalence classes of (say, continuous) transverse $A_a(k)$ on $M_0^+$ for which (186) converges has the structure of an inner-product space. Its completion is our Hilbert space, $H_M$, of solutions of Maxwell's equations.

Just as in the Klein-Gordon case, one can describe $H_M$ directly in momentum space. It is the collection of all equivalence classes of measurable, square-integrable (in the sense that (186) converges), transverse $A_a(k)$ on $M_0^+$.
This $H_M$ represents the one-photon states. (Intuitively, a solution of Maxwell's equations represents a wave function for a single photon.) The space of many-photon states is the (symmetric, since photons are bosons) Fock space based on $H_M$. Thus, from our earlier discussion, we have creation, annihilation, and number operators for (free) photons. The commutation relations and other properties of these operators have already been worked out.
Finally, we introduce momentum operators on $H_M$. Let $p^a$ be a constant vector field in Minkowski space. Then, with each solution $F_{ab}(x)$ of Maxwell's equations, we associate another solution: multiply the solution $p^c \nabla_c F_{ab}$ of Maxwell's equations by the number $\hbar/i$ ("multiply", of course, in the sense of $H_M$). We thus define a linear operator, $P(p^a)$, on $H_M$. In momentum space, this operator clearly takes the form

$P(p^b)\, A_a(k) = \hbar (p^b k_b)\, A_a(k)$    (190)

Note that the momentum operators are only defined on a dense subset of $H_M$, are Hermitian, and commute with each other. Another interesting property of these operators (which also holds in the Klein-Gordon case) is that energies are positive. Let $p^a$ be timelike and future-directed, so $P(p^a)$ represents an energy operator. Then $p^b k_b > 0$ for any $k_a \in M_0^+$. Hence, from (190) and (186), the inner product of $\xi$ and $P(p^a)\xi$ is positive for any element $\xi$ ($\neq 0$) of $H_M$.
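A quick random-sampling sanity check of that positivity claim (illustrative only; the sampling scheme is an assumption): for future-directed timelike $p^a$ and future-directed null $k_a$, the eigenvalue $p^b k_b$ appearing in (190) is never negative.

```python
import random

def mdot(u, v):
    """Minkowski contraction, signature (+,-,-,-)."""
    return u[0]*v[0] - u[1]*v[1] - u[2]*v[2] - u[3]*v[3]

random.seed(0)
min_energy = float("inf")
for _ in range(1000):
    kv = [random.uniform(-1.0, 1.0) for _ in range(3)]
    k = [(kv[0]**2 + kv[1]**2 + kv[2]**2)**0.5] + kv      # null, future-directed
    pv = [random.uniform(-1.0, 1.0) for _ in range(3)]
    p = [(pv[0]**2 + pv[1]**2 + pv[2]**2)**0.5
         + random.uniform(0.1, 2.0)] + pv                 # timelike, future-directed
    assert mdot(p, p) > 0 and abs(mdot(k, k)) < 1e-12
    min_energy = min(min_energy, mdot(p, k))
assert min_energy >= 0.0
```

The inequality $p^b k_b \geq p^0 k^0 - |\vec p||\vec k| = (p^0 - |\vec p|)\,k^0 > 0$ is what the sampling confirms.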
Although they are not commonly discussed, one can also introduce position operators on $H_M$. As in the Klein-Gordon case, one projects to obtain a vector field on the mass shell. Instead of taking the directional derivative of a function on the mass shell as in the Klein-Gordon case, one takes the Lie derivative of $A^a(k)$, considered as a contravariant vector field on $M_0^+$. (It is important, in order to preserve transversality, that one take $A^a(k)$ to be a contravariant rather than a covariant field.) Finally, one includes an appropriate divergence-type term in the operators in order to make them be Hermitian.
14 Maxwell Fields as Operators
We shall now introduce, on the Fock space for the Maxwell equation, operators associated with the classical fields, $A_a$ and $F_{ab}$, of the Maxwell theory. The definitions are closely analogous to those of the Klein-Gordon theory.

Since the classical Klein-Gordon field is a scalar field, the test functions used to smear out the corresponding field operators are scalar fields. In the Maxwell case, on the other hand, the classical fields are vector or tensor fields on Minkowski space. One must therefore introduce "test functions" which themselves have vectorial or tensorial character. The support of a tensor field $f^{a_1 \cdots a_n}$ on Minkowski space is defined as the closure of the collection of all points of Minkowski space at which $f^{a_1 \cdots a_n} \neq 0$. A smooth, real tensor field on Minkowski space, with compact support, will be called a test field. In order to facilitate calculations with such test fields, it is convenient to establish the following remark:
Lemma 1. Let $T^{a_1 \cdots a_n}$ be a smooth, totally antisymmetric tensor field on Minkowski space. Then

$\int T^{a_1 \cdots a_n}\, \nabla_{[a_1} f_{a_2 \cdots a_n]}\, dV = 0$    (191)

for every totally antisymmetric test field $f_{a_2 \cdots a_n}$ if and only if

$\nabla_{a_1} T^{a_1 a_2 \cdots a_n} = 0$    (192)

Furthermore,

$\int T_{a_1 \cdots a_n}\, \nabla_m f^{m a_1 \cdots a_n}\, dV = 0$    (193)

for every totally antisymmetric test field $f^{m a_1 \cdots a_n}$ if and only if

$\nabla_{[m} T_{a_1 \cdots a_n]} = 0$    (194)

Proof. Integrating by parts once, and discarding the surface term by compact support, we have the identity

$\int T^{a_1 \cdots a_n}\, \nabla_{[a_1} f_{a_2 \cdots a_n]}\, dV = -\int \left(\nabla_{a_1} T^{a_1 \cdots a_n}\right) f_{a_2 \cdots a_n}\, dV$    (195)

for every totally antisymmetric test field $f_{a_2 \cdots a_n}$. But clearly the right side of (195) vanishes for every test field if and only if (192) holds. The second part of the Lemma is proved in the same way, using the identity

$\int T_{a_1 \cdots a_n}\, \nabla_m f^{m a_1 \cdots a_n}\, dV = -\int \left(\nabla_{[m} T_{a_1 \cdots a_n]}\right) f^{m a_1 \cdots a_n}\, dV$    (196)
Note that Lemma 1 is easily generalized to higher order equations, to other symmetries of the tensors, etc. The essential idea is that linear differential equations on a tensor field $T^{a_1 \cdots a_n}$ on Minkowski space can be expressed by the condition that the smeared-out version of this field vanish for an appropriate collection of test fields.
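The mechanism behind Lemma 1 is integration by parts with no boundary term. A one-dimensional analogue (illustrative; the particular field $T$ and bump function are assumptions) checks the identity $\int T f'\,dx = -\int T' f\,dx$ for a compactly supported test function $f$:

```python
import math

N = 4000
a, b = -1.0, 1.0
h = (b - a) / N
xs = [a + i*h for i in range(N + 1)]

def f(x):
    """Smooth bump supported in (-1/2, 1/2): the 1D 'test field'."""
    return math.exp(-1.0/(0.25 - x*x)) if abs(x) < 0.5 else 0.0

def fp(x):
    """f'(x), from the explicit formula for the bump."""
    return f(x) * (-2.0*x) / (0.25 - x*x)**2 if abs(x) < 0.5 else 0.0

def T(x):       # a smooth (not compactly supported) field, as in the Lemma
    return math.sin(3.0*x) + x*x

def Tp(x):      # T'(x)
    return 3.0*math.cos(3.0*x) + 2.0*x

def trapz(g):
    vals = [g(x) for x in xs]
    return h*(sum(vals) - 0.5*(vals[0] + vals[-1]))

lhs = trapz(lambda x: T(x)*fp(x))
rhs = -trapz(lambda x: Tp(x)*f(x))
assert abs(lhs - rhs) < 1e-6    # no surface term: f has compact support
```

The compact support of $f$ is what kills the boundary term; in the Lemma the same role is played by the compact support of the test fields.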
We begin with the field operators for the vector potential. Unfortunately, the classical vector potential, $A_a(x)$, is not determined uniquely by a solution of Maxwell's equations; there is the freedom of gauge transformations (174), where $\lambda$ is a solution of the wave equation. We would expect this gauge freedom to appear in some way in the corresponding operators. The essential observation is that, by Lemma 1, the quantity

$\int A_a f^a\, dV$    (197)

is invariant under gauge transformations provided the test field $f^a$ is the sum of a gradient and a vector field whose divergence vanishes. Conversely, the value of the real number (197) for every test field which is the sum of a gradient and a divergence-free field determines $A_a(x)$ uniquely up to gauge transformations. We are thus led to view the gauge freedom in the vector potential as representing a restriction on the class of test fields which are appropriate for smearing out the vector potential.
The remarks above motivate the definitions below. Let $f^a$ be a test field, and let $f^a(k)$ be its Fourier inverse, a vector function on momentum space. Evidently, if $f^a$ is divergence-free then

$f^a(k)\, k_a = 0$    (198)

while if $f^a$ is a gradient then

$f^a(k) = h(k)\, k^a$    (199)

It is clear, therefore, that if $f^a$ is the sum of a gradient and a divergence-free field, then $f^a(k)$, restricted to $M_0^+$, is transverse. In other words, we may associate, with each test field $f^a$ on Minkowski space which is the sum of a gradient and a divergence-free field, an element $\sigma(f^a)$ of $H_M$. We define the vector potential operators

$A(f^a) = C(\sigma(f^a)) + A(\sigma(f^a))$    (200)

Note that the operator (200) is Hermitian, a result one expects because the corresponding classical field is real. The definition of the electromagnetic field operators is suggested by Lemma 1 and Eqn. (172). If $f^{ab}$ is a skew test field, we define

$F(f^{ab}) = A(2 \nabla_b f^{ab})$    (201)

Thus, the electromagnetic field operators (which are also Hermitian) must be smeared out with skew, second-rank test fields. (Note that the right side of (201) is well-defined, for the argument is necessarily divergence-free.)
We next verify that our field operators satisfy the same equations as the classical fields. Using Lemma 1, Eqns. (175) and (176) are translated into

$A(\nabla^a f) = 0$    (202)

$A(\Box f^a) = 0$    (203)

where $f$ is any test function and $f^a$ is any test field which is the sum of a gradient and a divergence-free field. Eqn. (202) follows immediately from (199). To prove (203), note that, if $f^a(k)$ is the Fourier inverse of $f^a$, then $(-k_b k^b)\, f^a(k)$ is the Fourier inverse of $\Box f^a$. But $(-k_b k^b)\, f^a(k)$ vanishes on $M_0^+$, whence (203) follows. We conclude that, in a suitable sense, our vector potential operators satisfy (175) and (176). Similarly, using Lemma 1, Maxwell's equations (170) and (171) on $F_{ab}$ are to be translated into the following conditions on the electromagnetic field operators:

$F(\nabla_c f^{abc}) = 0$    (204)

$F(\nabla^{[a} f^{b]}) = 0$    (205)
where $f^{abc}$ is a totally antisymmetric test field and $f^a$ is any test field. To prove (204) and (205), we substitute the definition (201):

$F(\nabla_c f^{abc}) = A(2 \nabla_b \nabla_c f^{abc}) = A(0)$    (206)

$F(\nabla^{[a} f^{b]}) = A(2 \nabla_b \nabla^{[a} f^{b]}) = A(\nabla^a (\nabla_b f^b) - \Box f^a)$    (207)

Thus, (204) is clearly true, while (205) follows immediately from (202) and (203). We conclude that, in a suitable sense, our Maxwell field operators satisfy Maxwell's equations.
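The two derivative identities used in (206) and (207) can be verified symbolically. The sketch below (pure Python, with arbitrary polynomial components; the particular fields are assumptions for illustration) checks that $\nabla_b \nabla_c f^{bc} = 0$ for antisymmetric $f^{bc}$ (the mechanism of (206)), and that $2\nabla_b\nabla^{[a}f^{b]} = \nabla^a(\nabla_b f^b) - \Box f^a$ (the identity behind (207)), with signature $(+,-,-,-)$.

```python
# Polynomials in (t,x,y,z) stored as {exponent-tuple: coefficient}.
ETA = (1, -1, -1, -1)     # diagonal Minkowski metric; del^a = ETA[a] * del_a

def diff(p, i):
    out = {}
    for e, c in p.items():
        if e[i] > 0:
            e2 = list(e); e2[i] -= 1
            e2 = tuple(e2)
            out[e2] = out.get(e2, 0) + c*e[i]
    return out

def add(p, q, s=1):
    out = dict(p)
    for e, c in q.items():
        out[e] = out.get(e, 0) + s*c
    return {e: c for e, c in out.items() if c != 0}

def up(p, a):             # raise the derivative index: del^a = ETA[a] del_a
    return {e: ETA[a]*c for e, c in p.items()}

def is_zero(p):
    return all(c == 0 for c in p.values())

# An arbitrary polynomial vector field f^a.
f = [{(1,1,0,0): 1}, {(0,0,2,0): 3}, {(0,1,0,1): -2}, {(2,0,0,0): 1}]

# Identity behind (207): sum_b [del_b del^a f^b - del_b del^b f^a]
#                        = del^a (del_b f^b) - Box f^a.
div = {}
for b in range(4):
    div = add(div, diff(f[b], b))
for a in range(4):
    lhs = {}
    for b in range(4):
        lhs = add(lhs, diff(up(diff(f[b], a), a), b))
        lhs = add(lhs, diff(up(diff(f[a], b), b), b), -1)
    box = {}
    for b in range(4):
        box = add(box, up(diff(diff(f[a], b), b), b))
    rhs = add(up(diff(div, a), a), box, -1)
    assert is_zero(add(lhs, rhs, -1))

# Mechanism of (206): del_b del_c g^{bc} = 0 for antisymmetric g^{bc}.
raw = [[{(i+1, j, 1, 0): 1} for j in range(4)] for i in range(4)]
anti = [[add(raw[i][j], raw[j][i], -1) for j in range(4)] for i in range(4)]
s = {}
for b in range(4):
    for c in range(4):
        s = add(s, diff(diff(anti[b][c], b), c))
assert is_zero(s)
```

Both checks reduce to the commutativity of partial derivatives, which is exactly what the text's argument uses.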
Finally, we remark briefly on the commutators of the vector potential operators. Let $f^a$ and $\tilde f^a$ be test fields, each of which is the sum of a gradient and a divergence-free field. Then, from (200) and (102),

$[A(f^a), A(\tilde f^a)] = \tfrac{1}{2}\left( \left(\sigma(f^a), \sigma(\tilde f^a)\right) - \left(\sigma(\tilde f^a), \sigma(f^a)\right) \right) I = \frac{\hbar}{i}\, D(f^a, \tilde f^a)\, I$    (208)

where the second equality defines $D(f^a, \tilde f^a)$. Thus, $D(f^a, \tilde f^a)$ is real and satisfies

$D(f^a, \tilde f^a) = -D(\tilde f^a, f^a)$    (209)

These properties imply that whenever $f^a$ and $\tilde f^a$ have relatively spacelike supports, $D(f^a, \tilde f^a) = 0$. As in the Klein-Gordon case, $D(f^a, \tilde f^a)$ can be written out explicitly in terms of a distribution on Minkowski space.
15 The Poincare Group
A smooth mapping from Minkowski space to itself which preserves the norms of vectors is called a Poincare transformation. If, in addition, this mapping i) does not interchange past and future time directions, and ii) does not reverse spatial parities (i) and ii) together are equivalent to i) and the condition that $\epsilon_{abcd}$ be invariant), then the Poincare transformation is called a restricted Poincare transformation. The result of applying two Poincare transformations (resp., restricted Poincare transformations) in succession is clearly again a Poincare (resp., restricted Poincare) transformation. These transformations thus form a group, called the Poincare group (resp., restricted Poincare group); we denote the restricted group by $T$. One sometimes expresses this relation between the Poincare group and Minkowski space by saying that the Poincare group acts on Minkowski space. That is, we have a mapping $\varphi: T \times M \to M$ ($M$ = Minkowski space) with the following properties:

$\varphi(P, \varphi(P', x)) = \varphi(P P', x)$    (210)

$\varphi(e, x) = x$    (211)

for $P, P' \in T$, $x \in M$.

In fact, the Poincare group has more structure than merely that of a group. It is also a (10-dimensional, real, differentiable) manifold. This additional manifold structure on $T$ leads naturally to the notion of an "infinitesimal Poincare transformation".
47
A group G which is also a smooth manifold, and for which the group oper-
ations (composition within the group, considered as a mapping from GG to
G, and the operation of taking the inverse, considered as a mapping from G to
G) are smooth mappings, is called a Lie group. Let G be a Lie group, and let
LG denote the collection of all contravariant vectors at the identity element e
of G. This LG is thus a real vector space whose dimension is the same as the
dimension of the manifold G. (Vectors at the identity of G represent elements
of G which dier innitesimally from the identity.)
So far, our $LG$ involves only the manifold structure of $G$ (and, of course, the location of the identity element.) Is there some way in which the group structure of $G$ can also be incorporated into $LG$? Let $v \in LG$, so $v$ is a contravariant vector at $e \in G$. Let $g(\lambda)$ be a smooth curve, parameterized by the parameter $\lambda$, in $G$ such that $g(0) = e$ and such that the tangent vector, with respect to $\lambda$, of $g(\lambda)$ at $e$ is just $v$. ("Tangent vector with respect to $\lambda$" means that one takes the derivative of $g(\lambda)$ with respect to $\lambda$ and evaluates at $\lambda = 0$.) Similarly, let $g'(\lambda)$ be a smooth curve whose tangent vector at $e$ is $v' \in LG$. Consider now the curve

$g(\lambda)\, g'(\lambda)\, g^{-1}(\lambda)\, g'^{-1}(\lambda)$    (212)

in $G$. Unfortunately, the tangent vector (with respect to $\lambda$) of the curve (212) vanishes at $e$. It turns out, however, that (212) is still a smooth curve if we take as its parameter not $\lambda$ but rather $\lambda^2$. The tangent vector of (212), with respect to the parameter $\lambda^2$, is not in general zero at $e$. Furthermore, this tangent vector depends only on $v$ and $v'$ (and not on the particular curves $g(\lambda)$ and $g'(\lambda)$ which actually appear in (212)), and so we may write it as follows: $[v, v']$. Thus, with any two elements, $v$ and $v'$, of $LG$, we associate a third element, $[v, v']$, of $LG$.
It is by means of this bracket operation that the group structure of $G$ appears in $LG$. It can be proven that the bracket is necessarily linear, antisymmetric, and subject to the Jacobi identity:

$[a v + b v', v''] = a [v, v''] + b [v', v'']$
$[v, a v' + b v''] = a [v, v'] + b [v, v'']$    (213)

$[v, v'] = -[v', v]$    (214)

$[v, [v', v'']] + [v', [v'', v]] + [v'', [v, v']] = 0$    (215)

($a, b \in \mathbb{R}$; $v, v', v'' \in LG$.) The vector space $LG$, with this bracket operation, is called the Lie algebra of $G$.

We now return to the Poincare group $T$. An element of its Lie algebra $LT$ (an "infinitesimal Poincare transformation") is described by a vector field $v^a$ on Minkowski space satisfying

$\nabla^{(a} v^{b)} = 0$    (216)
(Eqn. (216) states that the Lie derivative of the Minkowski metric by $v^a$ vanishes.) Choosing a particular origin $O$, the most general solution of (216) can be expressed in the form

$v^a = v_O{}^{ab}\, x_b + v_O{}^a$    (217)

where $v_O{}^a$ is a constant vector field on Minkowski space, $v_O{}^{ab}$ is a constant skew tensor field on Minkowski space, and $x^a$ is the position vector of $x$ relative to our origin $O$. Note that the particular constant fields $v_O{}^{ab}$ and $v_O{}^a$ which describe a given $v^a(x)$ will depend on the choice of origin $O$. Note also that the dimensions are correct: six dimensions for $v_O{}^{ab}$ plus four dimensions for $v_O{}^a$ make ten dimensions for $LT$. The bracket operation in $LT$ becomes Lie derivatives of solutions of (216). That is to say, if $v, v' \in LT$ correspond to solutions $v^a, v'^a$, respectively, of (216), then the solution of (216) which corresponds to $[v, v']$ is just

$\mathcal{L}_v v'^a = v^b \nabla_b v'^a - v'^b \nabla_b v^a$    (218)
As a check, one can verify (213), (214), and (215) for (218).
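That check can also be carried out by machine. The sketch below is illustrative (the matrix encoding and random values are assumptions): a solution of (216) is stored as a pair $(M, c)$ with $v^a(x) = M^a{}_b x^b + c^a$, where $\eta M$ is skew (the mixed-index form of the skew $v_O{}^{ab}$); the bracket (218) is implemented on such pairs, and antisymmetry, the Jacobi identity, and closure are verified.

```python
import random

ETA = [[1,0,0,0],[0,-1,0,0],[0,0,-1,0],[0,0,0,-1]]

def mmul(A, B):
    return [[sum(A[i][k]*B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def mvec(A, v):
    return [sum(A[i][k]*v[k] for k in range(4)) for i in range(4)]

def msub(A, B):
    return [[A[i][j] - B[i][j] for j in range(4)] for i in range(4)]

def killing(rng):
    """Random solution of (216): v(x) = M x + c with ETA*M skew."""
    S = [[0.0]*4 for _ in range(4)]
    for i in range(4):
        for j in range(i+1, 4):
            S[i][j] = rng.uniform(-1, 1); S[j][i] = -S[i][j]
    return (mmul(ETA, S), [rng.uniform(-1, 1) for _ in range(4)])

def bracket(v, w):
    """Eq. (218): for v = Mx + c, w = Nx + d, L_v w = (NM - MN)x + (Nc - Md)."""
    (M, c), (N, d) = v, w
    return (msub(mmul(N, M), mmul(M, N)),
            [a - b for a, b in zip(mvec(N, c), mvec(M, d))])

def close(p, q, tol=1e-9):
    return all(abs(p[0][i][j] - q[0][i][j]) < tol
               for i in range(4) for j in range(4)) \
        and all(abs(p[1][i] - q[1][i]) < tol for i in range(4))

def plus(p, q):
    return ([[p[0][i][j] + q[0][i][j] for j in range(4)] for i in range(4)],
            [p[1][i] + q[1][i] for i in range(4)])

rng = random.Random(7)
u, v, w = killing(rng), killing(rng), killing(rng)

# (214): antisymmetry
neg = lambda p: ([[-x for x in row] for row in p[0]], [-x for x in p[1]])
assert close(bracket(u, v), neg(bracket(v, u)))

# (215): Jacobi identity
jac = plus(plus(bracket(u, bracket(v, w)), bracket(v, bracket(w, u))),
           bracket(w, bracket(u, v)))
zero = ([[0.0]*4 for _ in range(4)], [0.0]*4)
assert close(jac, zero)

# closure: ETA times the matrix part of the bracket is again skew
B = mmul(ETA, bracket(u, v)[0])
assert all(abs(B[i][j] + B[j][i]) < 1e-9 for i in range(4) for j in range(4))
```

The closure check confirms that the bracket of two solutions of (216) is again a solution, i.e., that $LT$ really is a Lie algebra of such vector fields.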
To summarize, whereas the Lie algebra $LT$ of the Poincare group arises from very general considerations involving the structure of Lie groups, $LT$ can in fact be expressed very simply in terms of certain vector fields in Minkowski space.
16 Representations of the Poincare Group
Let $P$ be a member of the restricted Poincare group. Then, with each positive-frequency solution $\phi(x)$ of the Klein-Gordon equation, we may certainly associate another positive-frequency solution, $\phi(Px)$. This mapping from solutions to solutions is clearly linear, and so represents an operator, $U_P$, on the Hilbert space $H_{KG}$ of positive-frequency solutions of the Klein-Gordon equation. That is, for each $P \in T$, we have an operator $U_P$ on $H_{KG}$. Since the operators arise from the action of $T$ on Minkowski space, we have

$U_P U_{P'} = U_{P P'}$    (219)

$U_e = I$    (220)

where $e$ denotes the identity of $T$. A mapping from a group into a collection of operators on a Hilbert space, subject to (219) and (220), is called a representation of the group. (More generally, the term "representation" is used when the operators act on any vector space.) Thus, we have defined a representation of $T$.

The inner product we have defined on $H_{KG}$ is clearly invariant under the action of the restricted Poincare group. That is to say, if $P \in T$ and $\phi, \psi \in H_{KG}$, we have

$(U_P \phi, U_P \psi) = (\phi, \psi)$    (221)

An operator on a Hilbert space which is defined everywhere and which satisfies (221) for any two elements of that Hilbert space is said to be unitary. A representation of a group with the property that the operator associated with each group element is unitary is called a unitary representation. We thus have a unitary representation of $T$ on $H_{KG}$.

A similar situation obtains in the Maxwell case (and for the other relativistic field equations we shall introduce later.) We have a unitary representation of $T$ on $H_M$.
Associated with the restricted Poincare group $T$ is its Lie algebra $LT$. What does a unitary representation of $T$ look like in terms of $LT$? Let $U_P$ be a unitary representation of the restricted Poincare group on a Hilbert space $H$. Let $v \in LT$, and let $P(\lambda)$ be a corresponding curve in $T$. Consider, for each $\psi \in H$, the right side of

$H_v \psi = \frac{\hbar}{i} \lim_{\lambda \to 0} \frac{1}{\lambda} \left( U_{P(\lambda)} \psi - U_{P(0)} \psi \right)$    (222)

("lim", of course, refers to the topology on $H$.) It may happen, of course, that the limit in (222) does not exist for certain $\psi$. It is normally the case in practice, however, that the limit does exist for a dense subset of $H$, and, furthermore, that the limit depends only on $v$ and not on the particular curve $P(\lambda)$. In this case, the right side of (222) is certainly linear in $\psi$ (since the $U_P$ are), and so defines an operator $H_v$ on $H$. (The factor $\hbar/i$ in (222) is for later convenience.)
Thus, we associate with each $v \in LT$ an operator $H_v$ on $H$. The operator $H_v$ is linear in $v$, i.e.,

$H_{a v + b v'} = a H_v + b H_{v'}$    (223)

How is $H_{[v, v']}$ related to $H_v$ and $H_{v'}$? To answer this question, we consider the operators associated with the curve (212):

$U_{P(\lambda)}\, U_{P'(\lambda)}\, U^{-1}_{P(\lambda)}\, U^{-1}_{P'(\lambda)}$    (224)

Taking the derivative (i.e., as in (222)) of (224), and evaluating at $\lambda = 0$, we obtain the desired relation

$[H_v, H_{v'}] = \frac{\hbar}{i}\, H_{[v, v']}$    (225)

where we have used (222). In other words, the bracket operation on the $v$'s becomes commutators of the $H_v$'s. (Note that (225) is consistent with (213), (214), and (215).) One further property of the $H_v$'s follows from the unitary character, (221), of our representation. Taking the derivative of

$(U_{P(\lambda)} \phi, U_{P(\lambda)} \psi) = (\phi, \psi)$    (226)

with respect to $\lambda$ and evaluating at $\lambda = 0$, we obtain, using (222),

$(H_v \phi, \psi) = (\phi, H_v \psi)$    (227)

That is, each operator $H_v$ is Hermitian.

To summarize, a unitary representation of the restricted Poincare group on a Hilbert space $H$ normally leads to a linear mapping from $LT$ to the collection of Hermitian operators on $H$. The Lie bracket operation in $LT$ translates to the commutator of the corresponding operators.
The general remarks above are merely intended to provide a framework for what follows. In practice, it is not necessary to go through a limiting process to obtain the Hermitian operators associated with a representation of $T$. Let $v \in LT$ be the vector field $v^a$ on Minkowski space, so $v^a$ satisfies (216). Then, if $\phi(x)$ is a positive-frequency solution of the Klein-Gordon equation, so is the right side of

$H_v \phi = \frac{\hbar}{i}\, v^a \nabla_a \phi$    (228)

We thus define an operator $H_v$ on (a dense subset of) $H_{KG}$. The $H_v$'s clearly satisfy (223) and (225). In terms of momentum space, (228) may be described as follows. Let $\phi(k)$ be the Fourier inverse of $\phi(x)$ with respect to an origin $O$, and let $v^a$ be given by (217) with respect to the same origin $O$. It then follows immediately, taking the Fourier inverse of (228), that

$H_v \phi(k) = \hbar (v_O{}^a k_a)\, \phi(k) + \frac{\hbar}{i}\, v_O{}^{ab} k_b\, \partial_a \phi(k)$    (229)
(Note that (229) is well-defined, for $v_O{}^{ab} k_b$ is tangent to the mass shell.) In the Maxwell case, similarly, we associate with each $v \in LT$ the operator

$H_v F_{ab} = \frac{\hbar}{i}\left( v^c \nabla_c F_{ab} + F_{cb} \nabla_a v^c + F_{ac} \nabla_b v^c \right)$    (230)

where the multiplication by $i$ in (230) refers to multiplication within the Hilbert space $H_M$. In momentum space, our Hermitian operators take the form

$H_v A_a(k) = \hbar (v_O{}^b k_b)\, A_a(k) + \frac{\hbar}{i}\, \mathcal{L}_{v_O{}^{cb} k_b}\, A_a(k)$    (231)

To summarize, we can take $LT$ to be simply the Lie algebra of solutions of (216), and the operators $H_v$ to be defined by (228) and (230) (or by (229) and (231)). Then Hermiticity, (223), and (225) follow directly.
To facilitate calculations with the $H_v$'s, it is convenient to introduce a special notation. Let $T^{a_1 \cdots a_n}$ be a tensor field on Minkowski space. Then $T^{a_1 \cdots a_n}$ associates, with each point $x$ and tensor $f_{a_1 \cdots a_n}$ at $x$, a real number, $T^{a_1 \cdots a_n} f_{a_1 \cdots a_n}$. For fixed $x$, this mapping is linear in $f_{a_1 \cdots a_n}$. Furthermore, the value of this number for every $x$ and $f_{a_1 \cdots a_n}$ determines $T^{a_1 \cdots a_n}$ uniquely. (Think of $f_{a_1 \cdots a_n}$ as a "test function".) An operator field is what results if we replace "real number" in the remarks above by "operator on a Hilbert space $H$". Thus, an operator field, $T^{a_1 \cdots a_n}$, associates, with each point $x$ of Minkowski space and tensor $f_{a_1 \cdots a_n}$ at $x$, an operator on $H$, written $T^{a_1 \cdots a_n} f_{a_1 \cdots a_n}$, such that, for $x$ fixed, this operator is linear in $f_{a_1 \cdots a_n}$. (For example, an operator field is what $A_a(x)$ and $F_{ab}(x)$ would be, if they existed.) Note that a tensor field is a special case of an operator field when all the operators are multiples of the identity operator on $H$.
The easiest way to discuss the $H_v$'s is as operator fields. Let $x$ be a point of Minkowski space, and $f^a$ a vector at $x$. Then the constant vector field

$v^a = f^a$    (232)

on Minkowski space certainly satisfies (216), and so defines an operator $H_v$ (on either $H_{KG}$ or $H_M$). We have defined an operator field, which we write as $P_a$. (These, of course, are our old momentum operators, expressed in a different way.) Let $x$ be a point of Minkowski space, and let $f^{ab}$ be a skew tensor at $x$. Then the vector field

$v^a(y) = f^{ab}\, x_b$    (233)

on Minkowski space, where $x^a$ denotes the position vector of $y$ relative to $x$, satisfies (216), and so defines an operator $H_v$. We have thus defined a skew operator field, which we write $P_{ab}$.
We introduce three operations on operator fields. The first is outer product. Let $f^{abc}$ be a tensor at the point $x$ of Minkowski space. Write $f^{abc}$ in the form

$f^{abc} = m^a m^{bc} + \cdots + n^a n^{bc}$    (234)

Then, for example, the outer product of $P_a$ and $P_{bc}$ is the operator field $P_a P_{bc}$, defined by

$P_a P_{bc}\, f^{abc} = (m^a P_a)(m^{bc} P_{bc}) + \cdots + (n^a P_a)(n^{bc} P_{bc})$    (235)

where the products on the right are to be interpreted as merely products of operators. Note that (235) is independent of the particular expansion (234). The outer product of two operator fields in general depends on the order in which they are written. For example, $P_a P_{bc} \neq P_{bc} P_a$. The second operation is contraction. Let $f^b$ be a vector at the point $x$ of Minkowski space. Then, for example, $P^a P_{ab}\, f^b$ is the operator defined by

$P^a P_{ab}\, f^b = (P_c t^c)(P_{db}\, t^d f^b) - (P_c x^c)(P_{db}\, x^d f^b) - (P_c y^c)(P_{db}\, y^d f^b) - (P_c z^c)(P_{db}\, z^d f^b)$    (236)
where $t^a$, $x^a$, $y^a$, $z^a$ are vectors at $x$ which define an orthonormal basis:

$t^a t_a = 1, \qquad x^a x_a = y^a y_a = z^a z_a = -1$
$t^a x_a = t^a y_a = t^a z_a = x^a y_a = x^a z_a = y^a z_a = 0$    (237)
Note that (236) is independent of the choice of basis. The final operation on operator fields is differentiation. Let $r^a$ and $f^b$ be vectors at the point $x$ of Minkowski space, and let $x' = x + \epsilon r$. Let $f'^b$ be $f^b$ translated to the point $x'$. Then, for example, $\nabla_a P_b$ is the operator field defined by

$(\nabla_a P_b)\, r^a f^b = \lim_{\epsilon \to 0} \frac{1}{\epsilon} \left( P_b\big|_{x'} f'^b - P_b\big|_x f^b \right)$    (238)

(provided this limit exists). In short, operator fields are handled exactly as tensor fields, except that one must keep track of the order in products. The terms Hermitian operator field, unitary operator field, etc. are self-explanatory.
First note that $P_a$ and $P_{ab}$ are Hermitian operator fields. We next consider the derivatives of our two operator fields. It is clear from (232) and (238) that $P_a$ is constant:

$\nabla_a P_b = 0$    (239)

To compute the derivative of $P_{ab}$, we first note the following fact. If $v^a(x)$, a solution of (216), is expressed in the form (217) with respect to two different origins, $O$ and $O'$, then

$v_{O'}{}^{ab} = v_O{}^{ab}, \qquad v_{O'}{}^a = v_O{}^a + v_O{}^{ab}\, r_b$    (240)

where $r^a$ is the position vector of $O'$ relative to $O$. Hence, if $s^{cd}$ is a skew tensor at $O$, and $s'^{cd}$ is $s^{cd}$ translated to $O'$, then

$P_{cd}\big|_{O'}\, s'^{cd} - P_{cd}\big|_O\, s^{cd} = -s^{cd}\, r_d\, P_c$    (241)

Hence,

$\nabla_a P_{bc} = \eta_{a[b} P_{c]}$    (242)

Eqns. (239) and (242) imply, in particular, that the second derivative of $P_{ab}$ vanishes. Finally, we evaluate the commutators of our operator fields. We have already seen that the momentum operators commute:

$[P_a, P_b] = 0$    (243)
The other commutators are computed using the following fact: if $v^a(x)$ and $w^a(x)$ are elements of $LT$, expressed in the form (217) with respect to the same origin $O$, then $[v, w]$ takes the form

$2\, v_O{}^{[a}{}_c\, w_O{}^{b]c}\, x_b + \left( v_{O c}\, w_O{}^{ac} - w_{O c}\, v_O{}^{ac} \right)$    (244)

with respect to $O$. Hence, from (225), (233), and (244), we have

$[r_{ab} P^{ab}, s_{cd} P^{cd}] = \frac{2\hbar}{i}\, r_a{}^c\, s_{bc}\, P^{ab}$    (245)

where $r^{ab}$ and $s^{ab}$ are skew tensors at $x$. Therefore,

$[P_{ab}, P_{cd}] = \frac{\hbar}{i} \left( \eta_{b[c} P_{d]a} - \eta_{a[c} P_{d]b} \right)$    (246)

By an identical argument, we obtain, finally,

$[P_a, P_{bc}] = \frac{\hbar}{i}\, \eta_{a[b} P_{c]}$    (247)

The interaction between the restricted Poincare group and our Hilbert spaces is expressed completely and neatly by the operator fields $P_a$ and $P_{ab}$. The important equations on these fields are (239), (242), (243), (246), and (247).
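The commutator pattern of (243), (246), and (247) can be checked in a finite-dimensional model. The sketch below is illustrative only: it represents "rotation" generators $J_{ab}$ and "translation" generators $T_a$ as $5\times5$ affine matrices acting on $(x^\mu, 1)$ (a standard device, but an assumption here, and with the factors of $\hbar/i$ of the text dropped), and verifies the same structure of commutation relations.

```python
eta = [[1,0,0,0],[0,-1,0,0],[0,0,-1,0],[0,0,0,-1]]

def zeros():
    return [[0.0]*5 for _ in range(5)]

def J(a, b):
    """Lorentz generator: (J_ab)^mu_nu = delta^mu_a eta_{b nu} - delta^mu_b eta_{a nu}."""
    m = zeros()
    for nu in range(4):
        m[a][nu] += eta[b][nu]
        m[b][nu] -= eta[a][nu]
    return m

def T(a):
    """Translation generator acting on (x^mu, 1)."""
    m = zeros()
    m[a][4] = 1.0
    return m

def mmul(A, B):
    return [[sum(A[i][k]*B[k][j] for k in range(5)) for j in range(5)]
            for i in range(5)]

def comm(A, B):
    AB, BA = mmul(A, B), mmul(B, A)
    return [[AB[i][j] - BA[i][j] for j in range(5)] for i in range(5)]

def lin(terms):
    m = zeros()
    for coef, X in terms:
        for i in range(5):
            for j in range(5):
                m[i][j] += coef*X[i][j]
    return m

def eq(A, B):
    return all(abs(A[i][j] - B[i][j]) < 1e-12
               for i in range(5) for j in range(5))

for a in range(4):
    for b in range(4):
        assert eq(comm(T(a), T(b)), zeros())          # translations commute, (243)
        for c in range(4):
            # [T_a, J_bc] = eta_ab T_c - eta_ac T_b : the pattern of (247)
            assert eq(comm(T(a), J(b, c)),
                      lin([(eta[a][b], T(c)), (-eta[a][c], T(b))]))
            for d in range(4):
                # [J_ab, J_cd] = eta_bc J_ad - eta_ac J_bd - eta_bd J_ac + eta_ad J_bc
                # : the pattern of (246)
                assert eq(comm(J(a, b), J(c, d)),
                          lin([(eta[b][c], J(a, d)), (-eta[a][c], J(b, d)),
                               (-eta[b][d], J(a, c)), (eta[a][d], J(b, c))]))
```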
17 Casimir Operators: Spin and Mass
Our plan is to introduce a number of relativistic field equations, and, for each one, to make a Hilbert space of an appropriate collection of solutions, to introduce the corresponding Fock space, and to replace the classical fields by operators on Fock space. This program has now been carried out for the Klein-Gordon and Maxwell equations. With each set of equations there are associated two real numbers, called the mass and the spin. We could, of course, merely state what mass and what spin are to be associated with the equations in each case. It is useful, however, to see how these quantities arise in a natural way from very general considerations involving the structure of the Poincare group. In fact, what we need of the Poincare group is the action of its Lie algebra, $LT$, on our Hilbert spaces (Sect. 16), and certain objects, called Casimir operators, associated with $LT$. More generally, there are Casimir operators associated with any Lie algebra. We begin with this more general situation.
Let $L$ be a Lie algebra. Then, in particular, $L$ is a vector space. It is convenient to introduce an index notation. An element of $L$ will be written with a raised Greek index (not to be confused with the Greek indices used in the discussion of Fock space.) Elements of the dual space of $L$ (elements of the vector space of linear maps from $L$ to the reals (or the complexes, if $L$ were a complex vector space)) are written with lowered Greek indices. Objects with more than one index represent tensors over $L$ and its dual. Finally, the action of the dual induces the operation of contraction between one raised and one lowered Greek index: this is indicated by using a repeated index. (When one wants to do anything except the most trivial calculations with multilinear algebra, it is usually simpler in the long run to introduce an index notation.) For example, the bracket operation in $L$ is a bilinear mapping from $L \times L$ to $L$, and so can be represented by a tensor $C^\alpha{}_{\beta\gamma}$:

$[v, v']^\alpha = C^\alpha{}_{\beta\gamma}\, v^\beta v'^\gamma$    (248)

(This tensor is sometimes called the structure constant tensor.) Eqns. (214) and (215), expressed in terms of $C^\alpha{}_{\beta\gamma}$, become

$C^\alpha{}_{\beta\gamma} = C^\alpha{}_{[\beta\gamma]}$    (249)

$C^\alpha{}_{\beta[\gamma}\, C^\beta{}_{\delta\epsilon]} = 0$    (250)

(In other words, a Lie algebra is simply a vector space over which there is given a tensor $C^\alpha{}_{\beta\gamma}$ satisfying (249) and (250).)
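As a concrete instance, the conditions (249) and (250) can be verified for the Lie algebra so(3), whose structure constants are the components of the alternating symbol (an illustrative choice; this example is not from the text):

```python
# so(3): [e_a, e_b] = eps_{abc} e_c, so C^alpha_{beta gamma} = eps(alpha, beta, gamma).

def eps(i, j, k):
    return {(0,1,2): 1, (1,2,0): 1, (2,0,1): 1,
            (0,2,1): -1, (2,1,0): -1, (1,0,2): -1}.get((i, j, k), 0)

C = [[[eps(a, b, c) for c in range(3)] for b in range(3)] for a in range(3)]

# (249): antisymmetry in the two lower indices
assert all(C[a][b][c] == -C[a][c][b]
           for a in range(3) for b in range(3) for c in range(3))

# (250): C^alpha_{beta [gamma} C^beta_{delta epsilon]} = 0.
# Permutations of three slots, with their signs, for the antisymmetrization.
PERMS = [((0,1,2), 1), ((0,2,1), -1), ((1,0,2), -1),
         ((1,2,0), 1), ((2,0,1), 1), ((2,1,0), -1)]

def jacobi_lhs(al, idx):
    total = 0
    for perm, s in PERMS:
        g, d, e = idx[perm[0]], idx[perm[1]], idx[perm[2]]
        total += s * sum(C[al][b][g] * C[b][d][e] for b in range(3))
    return total   # a multiple of the antisymmetrized contraction

assert all(jacobi_lhs(al, (g, d, e)) == 0
           for al in range(3) for g in range(3)
           for d in range(3) for e in range(3))
```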
We now introduce the set / of all nite strings of tensors over L:
(v
, v
, . . . , v
1n
, 0, 0, . . .) (251)
What structure do we have on 𝒜? We can certainly add two finite strings by
adding them component-wise (i.e., adding the vector of the first string to
the vector of the second string, the second-rank tensor of the first string to
the second-rank tensor of the second string, etc.) to obtain a new element of
𝒜. Furthermore, we can multiply a finite string by a number by multiplying
each element of that string by the number. Thus, 𝒜 has the structure of an
(infinite-dimensional) vector space. We can also introduce a product operation
on 𝒜. (This, in fact, is the reason for considering 𝒜 at all.) To take the product
of two finite strings, take all possible outer products consisting of one tensor
from the first string and one from the second, always placing the tensor from
the first string first, and add together the resulting tensors when they have the
same rank to obtain the product string. For example,
    (v^α, v^{αβ}, 0, ...)(w^α, w^{αβ}, w^{αβγ}, 0, ...)
        = (0, v^α w^β, v^α w^{βγ} + v^{αβ} w^γ, v^α w^{βγδ} + v^{αβ} w^{γδ}, v^{αβ} w^{γδε}, 0, ...)    (252)

Note that the product, AB, of elements A and B of 𝒜 is linear in A and B:

    (aA + A')B = aAB + A'B        A(aB + B') = aAB + AB'    (253)

(a ∈ R; A, A', B, B' ∈ 𝒜). This product is, furthermore, associative:

    (AB)C = A(BC)    (254)

Thus, 𝒜 is an associative algebra. Now consider the elements of 𝒜 of the form

    (−[v, w]^α, 2v^{[α} w^{β]}, 0, ...)    (255)

for v, w ∈ L. Let ℐ denote the set of all elements of 𝒜 which can be written
as a sum of products of elements of 𝒜 in such a way that at least one factor in
each product is of the form (255). Clearly, we have (i) ℐ is a vector subspace
of 𝒜, and (ii) the product of any element of ℐ with any element of 𝒜 is again
an element of ℐ. (A subset of an associative algebra, satisfying (i) and (ii),
is called an ideal.) We now want to take the quotient algebra, 𝒜/ℐ, of 𝒜 by
the ideal ℐ. We define an equivalence relation on 𝒜: two elements of 𝒜 are
to be regarded as equivalent if their difference is in ℐ. That the equivalence
class of any linear combination of elements A and B of 𝒜 depends only on the
equivalence classes of A and B follows from (i). That the equivalence class of
the product of any two elements A and B of 𝒜 depends only on the equivalence
classes of A and B follows from (ii). Thus, the collection of equivalence classes is
itself an associative algebra. It is written 𝒰L and called the universal enveloping
algebra of the Lie algebra. To summarize, with every Lie algebra L there is
associated an associative algebra 𝒰L.
There is an important relation between L and 𝒰L. Let v^α ∈ L, and let
σ(v) denote the element of 𝒰L whose equivalence class contains the element
(v^α, 0, 0, ...). Then σ is linear, and

    σ([v, v']) = σ(v)σ(v') − σ(v')σ(v)    (256)

for any two elements of L. (In fact, it was to make (256) hold that we defined
𝒰L as we did.) In other words, the bracket operation in the Lie algebra L corre-
sponds to the commutator of elements of the associative algebra 𝒰L. Note that,
applying σ to both sides of (214) and (215), and using (256) and associativity,
we obtain identities.
Why this interest in the universal enveloping algebra? Let L be a Lie algebra,
and suppose, for each element v of L, we are given an operator H_v on some fixed
Hilbert space H. Suppose, furthermore, that H_v is linear in v, and that

    H_{[v,v']} = H_v H_{v'} − H_{v'} H_v    (257)

for any v, v' ∈ L. Now consider an expression of the form

    (v^α, u^α w^β + p^α q^β, r^α s^β t^γ, 0, 0, ...)    (258)

We associate with each expression of the form (258) an operator on H, e.g.,

    H_v + H_u H_w + H_p H_q + H_r H_s H_t    (259)
It follows from the fact that H_v is linear in v that the operator (259) depends
only on the element of 𝒜 represented by (258) (and not on the particular ex-
pansion used.) Furthermore, (255) and (257) imply that if (258) is an element
of ℐ (⊂ 𝒜), then the operator (259) is zero. Thus, (259) depends only on the
equivalence class of (258). In other words, we have, for each element ξ of 𝒰L,
an operator, H_ξ, on H. The operators H_ξ are linear in ξ:

    H_{aξ+η} = aH_ξ + H_η    (260)

and, clearly, satisfy

    H_{σ(v)} = H_v    (261)

Furthermore, it follows immediately from (259) that

    H_{ξη} = H_ξ H_η    (262)
Let us summarize the situation. We have a Lie algebra L acting on a Hilbert
space H by means of the operators H_v (v ∈ L) on H. The collection of all
operators (at least, the collection of all those which are defined everywhere) on
a Hilbert space has the structure of an associative algebra. We thus have a
mapping from a Lie algebra to an associative algebra, with these two algebraic
structures related via (257). Things could be better. It would be nice if we
could express the bracket operation in L in the form

    [v, v'] = vv' − v'v    (263)

and have

    H_{vv'} = H_v H_{v'}    (264)

Then (257) would follow already from (263) and (264). This program, unfortu-
nately, cannot be accomplished directly, for the only product which is defined
in L is the entire bracket, and not the individual terms on the right of (263).
But it can be accomplished indirectly. We enlarge L to 𝒰L. We still cannot
write (263), but instead we have (256). (Eqn. (256) also states that the al-
gebraic structure of L has been incorporated into that of 𝒰L.) We still cannot
write (264), but instead we have (262). In short, since L is being mapped to
an associative algebra (the operators on H), and since the natural thing to map
to an associative algebra is another associative algebra, we force associativity
on L by enlarging it to 𝒰L.
We can now introduce the Casimir operators. A Casimir operator of the Lie
algebra L is an element ξ of the center of 𝒰L, i.e., an element ξ of 𝒰L such that

    ξη − ηξ = 0    (265)

for every η ∈ 𝒰L. It should be emphasized that the Casimir operators of a
Lie algebra are not themselves elements of that Lie algebra, but rather of its
universal enveloping algebra. That is, they must be represented as strings of
tensors over L. Note that the collection of all Casimir operators of a Lie algebra
forms an associative algebra. Finally, we remark that the universal enveloping
algebra 𝒰L (and hence the Casimir operators, which are not operators, as we
have defined them, but merely elements of an algebra) are fixed once and for all,
given the Lie algebra L. They do not depend on the presence of a Hilbert space
H or on the H_v's. For example, the Casimir operators of LT (the Lie algebra
of the Poincare group) simply exist. (In fact, there are just two algebraically
independent ones.) They needn't be found individually for H_KG, H_M, etc.
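As a small illustration of these ideas (my own, using so(3) in place of the Poincaré Lie algebra): the element Σ_a σ(e_a)σ(e_a) of the enveloping algebra of so(3) is a Casimir operator, and in any representation its image commutes with all the generators. In the three-dimensional (adjoint) representation it is a multiple of the identity:

```python
import numpy as np

# Adjoint (spin-1) representation of so(3): (J_a)_{bc} = -eps_{abc},
# real antisymmetric matrices with [J_a, J_b] = eps_{abc} J_c.
eps = np.zeros((3, 3, 3))
for (a, b, c), s in [((0, 1, 2), 1), ((1, 2, 0), 1), ((2, 0, 1), 1),
                     ((0, 2, 1), -1), ((2, 1, 0), -1), ((1, 0, 2), -1)]:
    eps[a, b, c] = s
J = [-eps[a] for a in range(3)]

# Image of the Casimir element: the single operator sum_a J_a J_a on H.
casimir = sum(Ja @ Ja for Ja in J)

# It commutes with every generator, and (the representation being irreducible)
# it is a multiple of the identity -- here -2 I in this real convention.
commutes = all(np.allclose(casimir @ Ja - Ja @ casimir, 0) for Ja in J)
is_multiple = np.allclose(casimir, -2 * np.eye(3))
print(commutes, is_multiple)
```

The same Casimir element evaluated in a different representation would yield a different multiple of the identity; the element itself depends only on the Lie algebra.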
Now suppose again that we have a Hilbert space H and, for each v ∈ L, an
operator H_v on H, where the H_v's are linear in v and satisfy (257). Then we have,
for each Casimir operator ξ, an operator H_ξ on H which, by (262) and (265),
commutes with all the H_v's. Nothing said so far guarantees that the H_ξ's are
multiples of the identity in cases of interest. However, the following result
suggests this conclusion:
Lemma 2 (Schur's Lemma). Let H be a finite-dimensional complex vector
space, and let L be a set. Suppose, for each v ∈ L, we are given an operator
(defined everywhere) on H, H_v. Suppose, furthermore, that the only vector
subspaces S of H having the property that H_v σ ∈ S for every v ∈ L and σ ∈ S
are S = 0 and S = H. Let K be an operator (defined everywhere) on H which
commutes with all the H_v's. Then K is some complex multiple of the identity.

Proof. Since H is a complex vector space, K has at least one eigenvector, i.e.,
there exists a complex number λ and a nonzero element σ of H such that

    Kσ = λσ    (266)

Fix λ, and let S be the collection of all σ's which satisfy (266). Then, for σ ∈ S,
v ∈ L,

    K(H_v σ) = H_v Kσ = λ(H_v σ)    (267)

Hence, H_v σ ∈ S. By hypothesis, therefore, S = 0 or S = H. But by
construction S contains at least one nonzero element of H, so we must have
S = H. In other words, every element of H satisfies (266), whence K = λI.
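The lemma can be tested numerically. The sketch below (an illustration, not part of the text) takes the Pauli matrices σ_x, σ_z, which act irreducibly on C², and computes the space of all matrices K commuting with both; by Schur's lemma this space should be exactly the one-dimensional space of multiples of the identity:

```python
import numpy as np

# Pauli matrices sigma_x, sigma_z act irreducibly on C^2.
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

def comm_op(s):
    # Linear map K -> [K, s] on vec(K): with row-major vec(),
    # vec(K @ s) = kron(I, s.T) vec(K) and vec(s @ K) = kron(s, I) vec(K).
    return np.kron(I2, s.T) - np.kron(s, I2)

# Null space of the stacked system = all K with [K, sx] = [K, sz] = 0.
M = np.vstack([comm_op(sx), comm_op(sz)])
svals = np.linalg.svd(M, compute_uv=False)
null_dim = int(np.sum(svals < 1e-10))

# The identity certainly commutes; Schur's lemma says that is all there is.
identity_commutes = np.allclose(M @ I2.flatten(), 0)
print(null_dim, identity_commutes)
```

Replacing the irreducible pair {σ_x, σ_z} by a reducible set (e.g., {σ_z} alone) would enlarge the null space, in accord with the hypothesis of the lemma.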
We now want to apply all this mathematics to our relativistic fields. As
usual, one can regard the formal developments as merely providing motivation
and insight into what turn out to be very simple notions in practice. The
operators on our Hilbert spaces associated with the Casimir operators of LT
can be expressed quite easily in terms of the operator fields P_a and P_{ab} discussed
in Section 16. The first Casimir operator is the left side of

    P^a P_a = m²    (268)

We see from (243) and (247) that m² commutes with P_a and P_{ab}. Furthermore,
(239) implies that m² is a constant operator field. Hence, m² is just an ordinary
operator on our Hilbert spaces. It turns out to be a multiple of the identity (as
suggested above), and that multiple is called the (squared) mass of the field.
To define the second Casimir operator, we first introduce the operator field

    W_a = ε_{abcd} P^b P^{cd}    (269)

Then (239) and (242) imply that W_a is constant. The second Casimir operator
is the left side of

    W^a W_a = −ℏ² m² s(s + 1)    (270)

Note, from (243), (246), and (247), that W^a W_a commutes with P_a and P_{ab}. It
turns out to be a multiple of the identity, and the non-negative number s which
makes that multiple be the right side of (270) is called the spin of the field. We
remark that the mass and spin are associated not with each individual solution
of a relativistic field equation, but rather with the equation itself.
Unfortunately, (270) will not give the spin s when m = 0. In the massless
case, it is found that there is a number s for which

    W_a = ℏ s P_a    (271)

and so this equation is used to define the spin. This definition has an interesting
consequence. Note that the definition of W_a involves one ε_{abcd}, while there
are none in P_a. That is, W_a is a pseudovector, while P_a is an ordinary vector
(operator field). Hence, the spin s is a pseudoscalar in the massless case, and
a scalar when m ≠ 0. We shall see shortly that this feature is related to the
notion of helicity.
Finally, we evaluate the mass and spin in the Klein-Gordon and Maxwell
cases. Let r^a be a vector at the point x of Minkowski space, and let φ be a positive-
frequency solution of the Klein-Gordon equation. Then

    r^a P_a (r^b P_b φ) = r^a P_a ((ℏ/i) r^b ∇_b φ) = −ℏ² r^a r^b ∇_a ∇_b φ    (272)

To evaluate P^a P_a we must sum (272), with the appropriate signs, as r^a runs
over an orthonormal tetrad (see (236).) Clearly, the result of taking this sum is
simply to replace r^a r^b by the Minkowski metric, η^{ab}. So

    P^a P_a φ = −ℏ² □φ = ℏ² μ² φ    (273)

But ℏ²μ² for the Klein-Gordon equation is what we earlier (c.f. (5)) called m².
Hence, the m in Sect. 1 is indeed the mass for the Klein-Gordon equation. To
evaluate the spin, let r^a be a vector and s^{ab} a skew tensor at the point x. Then,
writing x^a for the position vector relative to x,

    r^a P_a (s^{bc} P_{bc} φ) = r^a P_a ((ℏ/i) s^b_c x^c ∇_b φ) = (ℏ/i) r^a ∇_a ((ℏ/i) s^b_c x^c ∇_b φ)
        = −ℏ² (r^c s^b_c ∇_b φ + r^a s^b_c x^c ∇_a ∇_b φ)    (274)
Let u^a be another vector at x. Then, to evaluate u^a W_a, we must sum (274)
over r's and s's so that Σ r^b s^{cd} = u^a ε_a^{bcd}. The result, clearly, is just to replace
the combination r^b s^{cd} in (274) by u^a ε_a^{bcd}. So,

    u^a W_a φ = −ℏ² u^a ε_a^{bcd} (η_{bd} ∇_c φ + x_d ∇_c ∇_b φ) = 0    (275)

Thus, W_a is zero on H_KG. Now (270) implies s = 0 in the massive case, while
(271) gives s = 0 in the massless case. The Klein-Gordon equation describes a
particle of mass m and spin zero.
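The computation (272)-(273) can be reproduced symbolically. The sketch below (assumed conventions, not from the text: metric (+,−,−,−), P_a = (ℏ/i)∇_a, plane wave e^{−ik·x} with k on the mass shell) checks that P^aP_a applied to a plane-wave Klein-Gordon solution yields ℏ²μ² times the solution:

```python
import sympy as sp

t, x, y, z = sp.symbols('t x y z', real=True)
hbar, mu = sp.symbols('hbar mu', positive=True)
k0, k1, k2, k3 = sp.symbols('k0 k1 k2 k3', real=True)

# Plane-wave solution exp(-i k.x); the Klein-Gordon equation puts k on the
# mass shell k.k = k0^2 - k1^2 - k2^2 - k3^2 = mu^2.
phi = sp.exp(-sp.I * (k0*t - k1*x - k2*y - k3*z))

coords = [t, x, y, z]
signs = [1, -1, -1, -1]   # Minkowski metric (+,-,-,-)

# P^a P_a phi = -hbar^2 box phi, as in (272)-(273)
PaPa = -hbar**2 * sum(s * sp.diff(phi, c, 2) for s, c in zip(signs, coords))

# On the mass shell this is hbar^2 mu^2 phi, eq. (273)
result = sp.simplify(PaPa / phi).subs(k0**2, mu**2 + k1**2 + k2**2 + k3**2)
residual = sp.simplify(result - hbar**2 * mu**2)
print(residual)
```

Off the mass shell the residual would be ℏ²(k·k − μ²), so the check also illustrates why the momentum-space fields live only on the shell.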
It is enlightening, instead of treating just the Maxwell case, to discuss the
more general equation

    (□ + μ²) A_a = 0        ∇^a A_a = 0    (276)

Maxwell's equations are obtained for μ = 0. If r^a is a vector at x,

    r^a P_a (r^b P_b A_c) = −ℏ² r^a r^b ∇_a ∇_b A_c    (277)

Hence,

    P^a P_a A_b = −ℏ² □A_b = ℏ² μ² A_b    (278)
Hence, the mass of the fields described by (276) is just as in the Klein-Gordon
case. (In particular, photons have mass zero.) If r^a is a vector and s^{ab} a skew
tensor at x, then

    r^a P_a (s^{bc} P_{bc} A_d) = r^a P_a ((ℏ/i)(s^b_c x^c ∇_b A_d + A_b s^b_d))
        = (ℏ/i) r^a ∇_a ((ℏ/i)(s^b_c x^c ∇_b A_d + A_b s^b_d))
        = −ℏ² (r^a s^b_c x^c ∇_a ∇_b A_d + r^c s^b_c ∇_b A_d + r^a s^b_d ∇_a A_b)    (279)
Therefore, by the same argument as before,

    u^a W_a A_e = −ℏ² u^a ε_a^{bcd} [x_d ∇_c ∇_b A_e + η_{bd} ∇_c A_e + η_{ed} ∇_b A_c]
        = −ℏ² ε_{aebc} u^a ∇^b A^c    (280)

where u^a is a vector at x. Hence,

    W^a W_a A_e = ℏ⁴ ε_{aebc} ∇^b (ε^{ac}_{pq} ∇^p A^q) = −4ℏ⁴ ∇^b (∇_{[e} A_{b]}) = −2ℏ⁴ μ² A_e    (281)

Thus, the spin of the fields (276) is s = 1, provided μ ≠ 0.
But something appears to be wrong in the Maxwell case, μ = 0. Eqn. (280)
is not proportional to

    u^a P_a A_e = (ℏ/i) u^a ∇_a A_e    (282)

First note that, by a gauge transformation on the right in (282), we can write

    u^a P_a A_e = (2ℏ/i) u^a ∇_{[a} A_{e]}    (283)
We still don't have proportionality with (280). The reason is that the repre-
sentation of LT on H_M is not irreducible. A solution of Maxwell's equations is
said to have positive (resp. negative) helicity if

    (i/2) ε_{ab}^{cd} F_{cd} = ±F_{ab}    (284)

with the plus (resp. minus) sign on the right. (In (284), i means multiplication
by i in H_M. The factor i/2 is necessary because ε_{ab}^{cd} ε_{cd}^{ef} F_{ef} = −4F_{ab} for any skew
F_{ab}.) In momentum space, a positive-helicity or negative-helicity solution takes
the form

    A_a(k) = m_a + i n_a    (285)

with m^a m_a = n^a n_a, m^a n_a = 0. The two helicities arise because there are two
directions through which m_a can be rotated through 90° to obtain n_a. Every
solution of Maxwell's equations can be written uniquely as the sum of a positive
and a negative helicity solution. Furthermore, the inner product of a positive
helicity solution with a negative helicity solution is zero. (These facts follow
immediately from (285).) Thus, H_M is the direct sum of the Hilbert space of
positive-helicity solutions with the Hilbert space of negative-helicity solutions.
On the Hilbert space of positive-helicity solutions, s = 1; on the Hilbert space
of negative-helicity solutions, s = −1.
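The helicity condition (284) can be checked numerically for a momentum-space field of the form (285). In the sketch below (my construction, not from the text; ε_{0123} = +1 and the field F_{ab} = k_aA_b − k_bA_a for a null k are assumptions), the duality operation applied to such a field returns ±1 times the field:

```python
import itertools
import numpy as np

eta = np.diag([1.0, -1.0, -1.0, -1.0])

# Alternating tensor eps_{abcd}, eps_{0123} = +1
eps = np.zeros((4, 4, 4, 4))
for p in itertools.permutations(range(4)):
    eps[p] = np.sign(np.linalg.det(np.eye(4)[list(p)]))

def dual(F):
    # (1/2) eps_{ab}^{cd} F_{cd}, indices raised with eta
    return 0.5 * np.einsum('abef,ec,fd,cd->ab', eps, eta, eta, F)

# A field of the form (285): null k, with m and n orthogonal, of equal norm,
# and orthogonal to k; A = m + i n.
k = np.array([1.0, 0.0, 0.0, 1.0])
m = np.array([0.0, 1.0, 0.0, 0.0])
n = np.array([0.0, 0.0, 1.0, 0.0])
A = m + 1j * n

F = np.outer(k, A) - np.outer(A, k)   # F_{ab} = k_a A_b - k_b A_a

# Eq. (284): (i/2) eps_{ab}^{cd} F_{cd} = lam * F_{ab}, with lam = +1 or -1
lam = (1j * dual(F))[0, 1] / F[0, 1]
definite_helicity = np.allclose(1j * dual(F), lam * F)
print(definite_helicity, lam)
```

Replacing A by m − i n flips the sign of lam, which is the momentum-space version of the statement that the two helicities correspond to the two senses of rotating m into n.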
18 Spinors
Particles with half-integer spin (electrons, neutrinos, etc.) are described by
mathematical objects called spinor fields. We shall base our treatment of such
particles on what are called two-component spinors (rather than the more
common four-component spinors.) Essentially the only difference between the
two is one of notation. Whereas the two-component spinors lend themselves
more naturally to an index notation, the four-component spinors are slightly
more convenient when discussing discrete symmetries. We shall first define
(two-component) spinors, and then indicate how formulae can be translated to
the four-component language.
Let C be a two-dimensional, complex vector space. Membership in C will be
indicated with a raised, upper-case Latin index, e.g., ξ^A, η^A, etc. We introduce
three additional two-dimensional complex vector spaces:

i) The complex-conjugate space, C̄, of C (see Sect. 8); membership in C̄ is
indicated with a raised primed index, e.g., ξ̄^{A'};

ii) The dual space, C*; membership in C* is indicated with a lowered unprimed
index, e.g., η_A;

iii) The complex-conjugate of the dual space, C̄*; membership in C̄* is indicated
with a lowered primed index, e.g., η̄_{A'}.

What operations are available among C, C̄, C*, and C̄*? Of
course, we can multiply elements by complex numbers, and add elements which
belong to the same vector space (i.e., which have the same index structure). Fur-
thermore, our four vector spaces can be grouped into pairs which are complex-
conjugates of each other: C and C̄ are complex-conjugates of each other, and
C* and C̄* are complex-conjugates of each other. Thus, complex-conjugation,
applied to an element of C (resp. C̄, C*, C̄*), yields an element of C̄ (resp. C,
C̄*, C*). For example, under complex-conjugation,

    ξ^A ↦ ξ̄^{A'}        ξ̄^{A'} ↦ ξ^A        η_A ↦ η̄_{A'}        η̄_{A'} ↦ η_A    (286)
Note the effect of the operation of complex-conjugation on the index structure:
it adds a prime if there was none before, and deletes a prime if there was
one before. Finally, we can group our four vector spaces into pairs which are
duals of each other: C and C* are duals of each other, and C̄ and C̄* are duals
of each other. We thus have the operation of contraction: an element ξ^A of C
together with an element η_A of C* yields a complex number, ξ^A η_A; an element
ξ̄^{A'} of C̄ together with an element η̄_{A'} of C̄* yields a complex number, ξ̄^{A'} η̄_{A'}.
One indicates contraction, as above, by using a repeated index. Note that one
can only contract between a raised and a lowered index when these are of the
same type (both primed or both unprimed). We have, for example,

    (ξ^A η_A)¯ = ξ̄^{A'} η̄_{A'}    (287)
(Note that the index notation we used for Hilbert spaces is essentially a special
case of that described above. The inner product on a Hilbert space induces
a natural isomorphism between C̄ and C*, and hence also between C and C̄*.)
One can now take tensor products of these four vector spaces. The
particular tensor product to which an object belongs is indicated by its index
structure, e.g., T^{AB}_{C'}. Complex-conjugation extends in an ob-
vious way to the tensor products, e.g.,

    (T^{AB}_{C'})¯ = T̄^{A'B'}_{C}    (288)
We define a spinor space as a two-dimensional, complex vector space C on which
is specified a nonzero object ε_{AB} which is skew:

    ε_{AB} = ε_{[AB]}    (289)

(Note that, since C is two-dimensional, any two skew ε_{AB}'s differ at most by a
complex factor. Hence, there is just one spinor space.) Elements of the tensor
products will be called spinors. Thus, we can multiply spinors by complex
numbers, add spinors when they have the same index structure, take outer
products of spinors, and contract over spinor indices (one raised and one lowered,
both primed or both unprimed.) Since ε_{AB} ≠ 0, there is a unique spinor ε^{AB}
which is skew and satisfies

    ε_{AM} ε^{BM} = δ_A^B    (290)

where δ_A^B is the unit spinor (defined by δ_A^B ξ^A = ξ^B for all ξ^A.) We can now
raise and lower spinor indices (i.e., define isomorphisms between C and C*,
and between C̄ and C̄*):

    ξ_A = ξ^B ε_{BA}        ξ^A = ε^{AB} ξ_B    (291)

and similarly for primed indices. (Note the placement of indices in (291).)
Similarly, one can raise and lower an index of a spinor with more than one
index. Note that, since ε_{AB} is skew, we have

    ξ^A η_A = −ξ_A η^A    (292)
Let V denote the collection of all spinors ξ^{AA'} which are real:

    ξ̄^{AA'} = ξ^{AA'}    (293)

We can certainly add two elements of V to obtain another element of V, and
multiply an element of V by a real number to obtain another element of V. (Note
that multiplication by a complex number does not preserve (293): (λ ξ^{AA'})¯ =
λ̄ ξ̄^{AA'}.) Thus, V has the structure of a four-dimensional real vector space.
There is, furthermore, a natural metric on V:

    ξ^{AA'} ξ_{AA'} = ξ^{AA'} ξ^{BB'} ε_{AB} ε_{A'B'}    (294)

By introducing a basis, or otherwise, it is easily checked that the signature of
this metric is (+, −, −, −).
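The correspondence between V and vectors can be made concrete with Pauli matrices. In the sketch below (my own illustration; normalization factors of √2 that some conventions insert are omitted), a real vector v^a corresponds to the Hermitian matrix X = v⁰I + v¹σ_x + v²σ_y + v³σ_z, the reality condition (293) becomes Hermiticity, and the ε-contraction (294) becomes 2 det X, exhibiting the signature (+, −, −, −):

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)
eps = np.array([[0.0, 1.0], [-1.0, 0.0]])   # eps_{AB}

def to_spinor(v):
    # v^a  ->  xi^{AA'} as the matrix v0 I + v1 sx + v2 sy + v3 sz
    return v[0]*I2 + v[1]*sx + v[2]*sy + v[3]*sz

v = np.array([1.3, -0.2, 0.7, 0.4])
X = to_spinor(v)

# Reality (293) <-> Hermiticity of X
hermitian = np.allclose(X, X.conj().T)

# (294): eps_{AB} eps_{A'B'} xi^{AA'} xi^{BB'} = 2 det X
#      = 2 (v0^2 - v1^2 - v2^2 - v3^2)
norm_eps = np.einsum('AB,ab,Aa,Bb->', eps, eps, X, X)
minkowski = 2 * (v[0]**2 - v[1]**2 - v[2]**2 - v[3]**2)
print(hermitian, np.isclose(norm_eps, minkowski))
```

The overall factor of 2 is an artifact of the unnormalized correspondence chosen here; what matters is the pattern of signs, which is the claimed signature.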
So far, spinor space is a purely mathematical construct. In order to actually
use the spinor space in physics, we must somehow tie it down to our space-
time, Minkowski space. This is accomplished, of course, through the vector
space V of solutions of (293). The vectors at a point x of Minkowski space
form a four-dimensional vector space on which there is a metric of signature
(+, −, −, −). We tie down spinor space, therefore, by specifying some metric-
preserving isomorphism between V and this vector space of vectors at the point
x. We assume that such an isomorphism has been fixed once and for all. Thus,
we can regard a tensor in Minkowski space at x, e.g., T^{ab}_c, as defining a spinor,
T^{AA'BB'}_{CC'}, which is real:

    (T^{AA'BB'}_{CC'})¯ = T^{AA'BB'}_{CC'}    (295)
We shall allow ourselves to write such equivalent quantities as equal:

    T^{ab}_c = T^{AA'BB'}_{CC'}    (296)

In other words, we are free to replace a lower-case Latin index (tensor index in
Minkowski space) by the corresponding upper-case Latin index written twice,
once unprimed and once primed. For example, the metric in Minkowski space
takes the form (see (294))

    η_{ab} = ε_{AB} ε_{A'B'}    (297)
We may thus regard tensors at x as merely a special case of spinors (those having
an equal number of primed and unprimed indices, and which, for a real tensor,
are real). Finally, note that translation in Minkowski space defines a metric-
preserving isomorphism between the vectors at x and the vectors at any other
point y. Hence, we have automatically spinors at the point y. More generally,
we have the notion of a spinor field, a spinor function of position in Minkowski
space. Tensor fields in Minkowski space are, of course, a special case. We can
multiply spinor fields by real or complex scalar fields, add spinor fields (when
they have the same index structure), take outer products of spinor fields, and
contract appropriate spinor indices.
It is possible, in addition, to take derivatives of spinor fields. Let, for exam-
ple, T^{A'}_{BC} be a spinor field. Let r^m be a vector at x, and let x' be the point
whose position vector relative to x is ε r^m. We define ∇_m T^{A'}_{BC} by

    r^m ∇_m T^{A'}_{BC} = lim_{ε→0} (1/ε) [T^{A'}_{BC}(x') − T^{A'}_{BC}(x)]    (298)

The replacement of Minkowski tensor indices by spinor indices can, of course,
be extended to the index of the derivative operator. That is,

    ∇_m T^{A'}_{BC} = ∇_{MM'} T^{A'}_{BC}    (299)

In short, the mechanics of calculating with spinor fields is in no essential way
different from that of tensor fields. The one point one has to be careful about
is the index locations in contractions (see (292).)
One further question must be discussed. To what extent does the imposition
of the notion of spinor fields on Minkowski space enrich the structure of Min-
kowski space? That is, are there essentially inequivalent spinor structures on
Minkowski space? To obtain evidence on this question, consider the collection
of vector fields on Minkowski space of the form

    ξ^A ξ̄^{A'}    (300)

where ξ^A(x) is a spinor field. This vector field is certainly real, and, from (294),
is null. (From (292), ξ^A ξ_A = 0.) Furthermore, the inner product of two such
fields,

    (ξ^A ξ̄^{A'})(η_A η̄_{A'}) = (ξ^A η_A)(ξ̄^{A'} η̄_{A'}) = (ξ^A η_A)(ξ^B η_B)¯    (301)

is necessarily non-negative. Thus, the spinor structure on Minkowski space
determines a particular time orientation, which we may specify as being the
future. (Past-directed null vectors then have the form −ξ^A ξ̄^{A'}.) Furthermore,
the tensor field on Minkowski space defined by the right side of

    ε_{abcd} = i (ε_{AB} ε_{CD} ε_{A'C'} ε_{B'D'} − ε_{A'B'} ε_{C'D'} ε_{AC} ε_{BD})    (302)

is real, totally antisymmetric, and satisfies ε_{abcd} ε^{abcd} = −24. Hence, this must
be an alternating tensor on Minkowski space. Thus, a spinor structure on Min-
kowski space induces both temporal and spatial parities on Minkowski space.
In fact, this is all the structure induced on Minkowski space by a spinor struc-
ture. More precisely, given two metric-preserving isomorphisms between V and
vectors in Minkowski space, such that these induce the same spatial and tem-
poral parities on Minkowski space, these are related by a linear mapping from C
onto C which preserves ε_{AB} (i.e., by an element of SL(2, C).) Finally, note that
there are precisely two ε_{AB}-preserving linear mappings on C which leave V (and
hence tensors in Minkowski space) invariant, namely the identity and minus the
identity. This is the statement of the two-valuedness associated with spinors.
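The two-valuedness can be seen concretely. In the sketch below (my own, using the Pauli-matrix correspondence between vectors and Hermitian 2×2 matrices), an element L of SL(2, C) acts on vectors by X ↦ LXL†; the matrices L and −L induce exactly the same transformation, and det L = 1 guarantees that the Minkowski norm (here det X) is preserved:

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)

# An element of SL(2,C): here a boost along z (det L = 1).
theta = 0.3
L = np.array([[np.exp(theta / 2), 0], [0, np.exp(-theta / 2)]], dtype=complex)

# A vector, as a Hermitian matrix X; SL(2,C) acts by X -> L X L^dagger.
v = np.array([1.1, 0.3, -0.5, 0.2])
X = v[0]*np.eye(2) + v[1]*sx + v[2]*sy + v[3]*sz

same_action = np.allclose(L @ X @ L.conj().T, (-L) @ X @ (-L).conj().T)
det_L_one = np.isclose(np.linalg.det(L), 1)
norm_kept = np.isclose(np.linalg.det(L @ X @ L.conj().T), np.linalg.det(X))
print(same_action, det_L_one, norm_kept)
```

This is the double cover SL(2, C) → restricted Lorentz group in miniature: the kernel of the action on vectors is exactly {I, −I}.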
Finally, we briefly indicate how one translates formulae from the two-compo-
nent to the four-component spinor notation. A four-component spinor is a pair
of two-component spinors, (ξ^A, η^{A'}), consisting of one spinor with a primed and
one with an unprimed index. One then normally chooses a basis for C and writes
this pair out as a 4×1 column matrix. The γ-matrices in the four-component
notation serve the function of combining these components in the appropriate
way to obtain the various scalar, vector, and tensor fields on Minkowski space
associated with the pair (ξ^A, η^{A'}). For example, a pair (ξ^A, η^{A'}) defines the
following fields on Minkowski space:

    ξ_A η̄^A,  ξ̄_{A'} η^{A'},  ξ^A ξ̄^{A'},  η̄^A η^{A'},  ξ^A η^{A'},  ξ^{(A} η̄^{B)},  ξ̄^{(A'} η^{B')},  ...    (303)

The spinor notation discussed here (which is due to Penrose) essentially avoids
the γ-matrices by choosing a basis for neither spinor space nor Minkowski space.
Questions of (restricted) Lorentz invariance simply do not arise: one cannot,
with this notation, write anything which is not invariant.
19 The Dirac Equation
The field which describes a free, massive, spin-½ particle consists of a pair,
(ξ^A, η^{A'}), of spinor fields on Minkowski space. These fields must satisfy the
Dirac equation:

    ∇_{AA'} ξ^A = μ η_{A'}    (304)

    ∇_{AA'} η^{A'} = μ ξ_A    (305)

where μ is a positive real number (which, as we shall see shortly, is essentially
the mass of the particle.) The Dirac equations are to be considered as analogous
to the Klein-Gordon equation, or to the Maxwell equations.

There is another, for some purposes more illuminating, form for Dirac's
equations. Taking the derivative of (304), we have

    ∇^{BA'} ∇_{AA'} ξ^A = μ ∇^{BA'} η_{A'}    (306)

Substituting (305) on the right in (306), and using the fact that

    ∇^{BA'} ∇_{AA'} = ½ δ_A^B □    (307)

we obtain

    (□ + 2μ²) ξ^B = 0    (308)
Thus, the Dirac equations imply that the spinor field ξ^A (and, by a similar
argument, η^{A'}) satisfies a Klein-Gordon equation. Conversely, if ξ^A is any
solution of (308), then, defining η_{A'} by (304) (note μ ≠ 0), (ξ^A, η^{A'}) is a solu-
tion of Dirac's equations. In other words, there is a one-to-one correspondence
between solutions of Dirac's equations and solutions of (308). Why, then, do
we choose to deal with a pair of spinor fields and the relatively complicated
equations (304), (305) rather than simply a single spinor field and (308)? The
reason is that there is a certain symmetry between ξ^A and η^{A'} which, while
merely a curiosity at present, will later be found to be related to the discrete
symmetries of Minkowski space.

One further consequence of (308) is that it makes clear the fact that the
problem of finding solutions of Dirac's equations is no more and no less difficult
than that of finding solutions of the Klein-Gordon equation. Fix two constant
spinor fields, α_A and β_A, on Minkowski space. Then, by the remarks above,
each solution, (ξ^A, η^{A'}), of Dirac's equations defines two solutions, α_A ξ^A and
β_A ξ^A, of the Klein-Gordon equation, and, conversely, if φ and ψ are solutions
of the Klein-Gordon equation, then φα^A + ψβ^A is a solution of (308), and hence
defines a solution of Dirac's equations.
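The passage from (304)-(305) to (308) rests on the identity (307). Its momentum-space counterpart can be checked with σ-matrices: writing k_{AA'} schematically as the matrix k⁰I − k·σ⃗ and its index-rearranged partner as k⁰I + k·σ⃗ (these matrix conventions are mine, not the text's), their product is k_ak^a times the identity:

```python
import sympy as sp

k0, k1, k2, k3 = sp.symbols('k0 k1 k2 k3')

I2 = sp.eye(2)
sx = sp.Matrix([[0, 1], [1, 0]])
sy = sp.Matrix([[0, -sp.I], [sp.I, 0]])
sz = sp.Matrix([[1, 0], [0, -1]])

# Matrix stand-ins for k_{AA'} and its index-rearranged partner k^{BA'}:
ks = k0*I2 - (k1*sx + k2*sy + k3*sz)
kbar = k0*I2 + (k1*sx + k2*sy + k3*sz)

# Their product is (k.k) times the identity -- the momentum-space shadow of
# the derivative identity (307) that turns the Dirac equations into (308).
kk = k0**2 - k1**2 - k2**2 - k3**2
residual = sp.simplify(ks * kbar - kk * I2)
print(residual)
```

Because the product is scalar, iterating the two first-order equations forces any nonzero solution onto a mass shell, which is the momentum-space content of (308).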
Note that if (ξ^A, η^{A'}) is a solution of Dirac's equations, so is (η̄^A, ξ̄^{A'}). We
call (η̄^A, ξ̄^{A'}) the complex-conjugate of the solution (ξ^A, η^{A'}) (analogous to the
complex-conjugate of a solution of the Klein-Gordon equation). Of course,
complex-conjugation, applied twice to a solution of Dirac's equations, yields
the original solution.
We now go to momentum space. Set

    ξ^A(x) = ∫_M ξ^A(k) e^{i k_b x^b} dV    (309)

    η^{A'}(x) = ∫_M η^{A'}(k) e^{i k_b x^b} dV    (310)

where ξ^A(k) and η^{A'}(k) are spinor-valued functions. These functions are only
defined on M, the mass shell, and the integrals (309), (310) are only carried out over M,
because of (308). Inserting (309) and (310) into (304) and (305), we obtain

    i k_{AA'} ξ^A(k) = μ η_{A'}(k)    (311)

    i k_{AA'} η^{A'}(k) = μ ξ_A(k)    (312)
Note that each of (311) and (312) implies the other. Thus, a solution of Dirac's
equations is characterized by a pair of spinor-valued functions, ξ^A(k) and η^{A'}(k),
on M. (Alternatively, a solution is characterized by ξ^A(k) alone. Then η^{A'}(k) is defined by
(311), and (312) follows identically.) A solution of Dirac's equations is said to
be positive-frequency (resp. negative-frequency) if ξ^A(k) and η^{A'}(k) vanish on
M^− (resp. M^+). In momentum space, the complex-conjugate solution is given by

    (ξ^A(k), η^{A'}(k)) ↦ (η̄^A(−k), ξ̄^{A'}(−k))    (313)

Thus, just as in the Klein-Gordon case, complex-conjugation takes positive-
frequency solutions to negative-frequency solutions, and vice-versa. (Roughly
speaking, positive-frequency solutions represent electrons, and negative-frequen-
cy solutions positrons.)
Let (ξ^A, η^{A'}) be a solution of Dirac's equations, and consider the real vector
field

    j^a = ξ^A ξ̄^{A'} + η̄^A η^{A'}    (314)

in Minkowski space. First note that, since each term on the right in (314) is a
future-directed null vector, j^a is future-directed and either timelike or null. We
have, for the divergence of j^a,

    ∇_a j^a = ξ̄^{A'} ∇_{AA'} ξ^A + ξ^A ∇_{AA'} ξ̄^{A'} + η^{A'} ∇_{AA'} η̄^A + η̄^A ∇_{AA'} η^{A'}    (315)

Substituting (304) and (305), and using (292), we find

    ∇_a j^a = 0    (316)

Thus, j^a is a real, future-directed timelike or null, divergence-free vector field.
Therefore, the integral of j^a over a spacelike 3-plane yields a nonnegative number
which, assuming that the Dirac field goes to zero sufficiently quickly at infinity,
is independent of the choice of the 3-plane. This integral can be used to define
a norm on solutions of Dirac's equations. The situation is much simpler when
translated into momentum space (see (23)). We define the norm by

    i [∫_{M^+} ξ^A(k) η̄_A(k) dV − ∫_{M^−} ξ^A(k) η̄_A(k) dV]    (317)

Note that, because of (311) and (312), the expression (317) is equal to both of

    [∫_{M^+} − ∫_{M^−}] (1/2μ) (ξ^A ξ̄^{A'} + η̄^A η^{A'}) k_{AA'} dV    (318)

    [∫_{M^+} − ∫_{M^−}] (1/μ) ξ^A ξ̄^{A'} k_{AA'} dV    (319)

The forms (318) or (319) show, in particular, that our norm is positive. (The
vector ξ^A ξ̄^{A'} is null and future-directed, while k_{AA'} is timelike, future-directed
on M^+ and past-directed on M^−.)
We have now obtained a norm on our collection of solutions of Dirac's equa-
tions. In order to obtain a Hilbert space, therefore, we have only to impose
the structure of a complex vector space on our collection of solutions. In other
words, we must define addition of solutions and multiplication of solutions by
complex numbers. There is only one reasonable way to define addition: one
simply adds the corresponding spinor fields in Minkowski space (or, in momen-
tum space, adds the corresponding spinor functions on the mass shell.) One
might think, at first glance, that there is also only one reasonable definition of
the product of a complex number and a solution of Dirac's equations: if λ is a
complex number, and (ξ^A(k), η^{A'}(k)) is a solution of Dirac's equations, one de-
fines the product to be the solution (λξ^A(k), λη^{A'}(k)). In other words, since the
Dirac equation is linear on (complex) spinor fields, the solutions of this equa-
tion naturally have the structure of a complex vector space. There is, however,
an alternative way to define the product of a solution of Dirac's equations and
a complex number. Let ξ^A(k) and η^{A'}(k) be a pair of spinor functions on M
which satisfy (311) and (312), i.e., a solution (in momentum space) of Dirac's
equations. Let λ be a complex number. Then we might also define the product
of λ and (ξ^A(k), η^{A'}(k)) to be the solution

    (λξ^A(k), λη^{A'}(k)) for k ∈ M^+
    (λ̄ξ^A(k), λ̄η^{A'}(k)) for k ∈ M^−    (320)

of Dirac's equations. That is to say, we multiply the positive-frequency part of
the fields by λ and the negative-frequency part by λ̄. We obtain, in this way,
an essentially different complex vector space of solutions of Dirac's equations.
In fact, we adopt this second, rather less aesthetic, alternative. As we
shall see later, this choice is essential to obtain agreement between theory and
experiment.
We now have a complex vector space with a norm, (317), and hence a Hilbert
space. More precisely, the Hilbert space of the Dirac equation, H_D, is the
collection of all pairs, (ξ^A(k), η^{A'}(k)), of spinor functions on M which satisfy
(311), which are measurable, and for which the integral (317) converges. The
inner product on our Hilbert space can now be obtained from the norm via the
identity

    (φ, ψ) = ¼ (‖φ + ψ‖² − ‖φ − ψ‖²) + (i/4) (‖φ + iψ‖² − ‖φ − iψ‖²)    (321)
Using (317) and (321), the inner product on H_D takes the form

    ((ξ, η), (ξ̃, η̃)) = (1/2μ) ∫_{M^+} (ξ̄^{A'} ξ̃^A + η̄^A η̃^{A'}) k_{AA'} dV
        − (1/2μ) ∫_{M^−} (ξ^A (ξ̃^{A'})¯ + η^{A'} (η̃^A)¯) k_{AA'} dV    (322)

where (ξ^A(k), η^{A'}(k)) and (ξ̃^A(k), η̃^{A'}(k)) are two solutions of Dirac's equa-
tions. Note the appearance of the complex-conjugations in the integral over
M^−. These arise because of our choice of the complex vector space structure
for H_D.

To summarize, whereas the solutions of Dirac's equations have only one
reasonable real vector space structure and only one reasonable norm, there are
two possible complex vector space structures, of which we choose one. This
choice then leads to the particular form for the inner product on our Hilbert
space.
We now introduce the antisymmetric Fock space based on H_D. We thus
have creation and annihilation operators, number operators, etc.
In the real Klein-Gordon and Maxwell cases, we were dealing with real fields
on Minkowski space. This feature was reflected in momentum space by our
requirement that the fields on the mass shell be invariant under simultaneous
complex-conjugation and reflection through the origin. Physically, we were deal-
ing with particles which are identical with their antiparticles. While we could,
of course, restrict ourselves to real (η^{A'} = ξ̄^{A'}) solutions of Dirac's equations, it
is convenient not to do so. Thus, the functions on the future mass shell need
bear no special relation to those on the past mass shell. This state of affairs
leads to a pair of projection operators on H_D. Let (ξ^A(k), η^{A'}(k)) ∈ H_D. Then
the action of P^+ (projection onto the positive-frequency part) is defined by

    P^+ (ξ^A(k), η^{A'}(k)) = (ξ^A, η^{A'}) for k ∈ M^+,  (0, 0) for k ∈ M^−    (323)

and similarly for P^−. Note that

    P^+ + P^− = I    (324)

These operators are both projection operators, i.e., they are defined everywhere
and satisfy

    (P^+)² = P^+        (P^−)² = P^−    (325)

Eigenstates of P^+ with eigenvalue one (i.e., positive-frequency solutions) are
called particle states, and eigenstates of P^− with eigenvalue one antiparticle
states. The charge operator is eP^− − eP^+ (this is the form when
the particles have negative charge, e.g., electrons), where e is the fundamental
charge.
The Dirac equation describes particles of mass $\hbar\mu$ and spin $\frac12$. This statement must, of course, be proven using the techniques of Sect. 17. We now give the proof. The only piece of additional machinery we require is the notion of the Lie derivative of a spinor field. Quite generally, any smooth mapping, with smooth inverse, from Minkowski space to itself takes any tensor field on Minkowski space to another tensor field on Minkowski space. Smooth mappings which differ infinitesimally from the identity mapping are described by smooth vector fields. The corresponding infinitesimal change in a tensor field defines the Lie derivative of that tensor field. Does a smooth mapping, with smooth inverse, on Minkowski space take spinor fields to spinor fields? In other words, can we formulate a natural notion of the Lie derivative of a spinor field (by a vector field) so that the Lie derivative of a tensor field will arise as a special case (i.e., considering a tensor field as merely a special case of a spinor field when the numbers of primed and unprimed spinor indices are equal)? Unfortunately, the answer to these questions is no. To see this, suppose for a moment that it were possible to generalize the Lie derivative from tensor to spinor fields. Let $v^a$ be an arbitrary smooth vector field on Minkowski space. Then we would have
$$\mathcal{L}_v\,\eta_{ab} = \mathcal{L}_v\big(\epsilon_{AB}\,\bar\epsilon_{A'B'}\big) = \epsilon_{AB}\,\mathcal{L}_v\,\bar\epsilon_{A'B'} + \bar\epsilon_{A'B'}\,\mathcal{L}_v\,\epsilon_{AB} \qquad(326)$$
But, since $\epsilon_{AB}$ is skew, so must be $\mathcal{L}_v\,\epsilon_{AB}$, and similarly for $\mathcal{L}_v\,\bar\epsilon_{A'B'}$. Thus, the right side of (326) must be some multiple of the Minkowski metric $\eta_{ab}$. But it is simply false that, for an arbitrary smooth vector field $v^a$ on Minkowski space, $\mathcal{L}_v\,\eta_{ab}$ is a multiple of $\eta_{ab}$. Thus, we cannot in general define the Lie derivative of a spinor field. Intuitively, the problem is that the light-cone structure of Minkowski space is an essential ingredient in the very definition of a spinor field. A smooth (finite or infinitesimal) mapping on Minkowski space which alters the light-cone structure simply does not know what to do with a general spinor field.

The remarks above are also the key to resolving the problem. In order to define spin and mass, it is only necessary to take Lie derivatives of spinor fields by vector fields $v^a$ which satisfy (216), i.e., by vector fields which do preserve the light-cone structure of Minkowski space. We might expect to be able to define Lie derivatives by such vector fields, and this is indeed the case. The formula is, for example,
$$\begin{aligned}\mathcal{L}_v\,T^{ABC'}{}_{DE'} = {}& v^m\nabla_m T^{ABC'}{}_{DE'} - \tfrac12\,T^{MBC'}{}_{DE'}\,\nabla_{MM'}v^{AM'} - \tfrac12\,T^{AMC'}{}_{DE'}\,\nabla_{MM'}v^{BM'}\\ & - \tfrac12\,T^{ABM'}{}_{DE'}\,\nabla_{MM'}v^{MC'} + \tfrac12\,T^{ABC'}{}_{ME'}\,\nabla_{DM'}v^{MM'} + \tfrac12\,T^{ABC'}{}_{DM'}\,\nabla_{ME'}v^{MM'}\end{aligned}\qquad(327)$$
Note that Lie differentiation commutes with complex-conjugation ($v^a$ is real), raising and lowering of spinor indices, and contraction of spinor indices. Note, furthermore, that (327) reduces to the usual Lie derivative for tensor fields. It follows from the remarks on p. 65 that (327) is the only formula which satisfies these properties.

We first determine the mass associated with the Dirac equation. Let $(\xi_A(x), \eta_{A'}(x))$ be a solution of the Dirac equation, and $r^a$ a vector at some point $x$ of Minkowski space. Then
$$(r^aP_a)\,\xi_M = \frac{\hbar}{i}\,r^a\nabla_a\,\xi_M \qquad (r^aP_a)(r^bP_b)\,\xi_M = -\hbar^2\,r^ar^b\,\nabla_a\nabla_b\,\xi_M \qquad(328)$$
Substituting $\eta^{ab}$ for $r^ar^b$, we obtain
$$P^aP_a\,\xi_M = -\hbar^2\,\Box\,\xi_M = \hbar^2\mu^2\,\xi_M \qquad(329)$$
where we have used (308). Hence, from (268), the mass associated with the Dirac equation is $\hbar\mu$. The spin calculation is slightly more complicated. Let $s^{cd}$ be a skew tensor at $x$. Then, from (327),
$$s^{cd}P_{cd}\,\xi_M = \frac{\hbar}{i}\Big(s^c{}_d\,x^d\,\nabla_c\,\xi_M - \tfrac12\,\xi^N s_{MN'N}{}^{N'}\Big) \qquad(330)$$
If $r^b$ is a vector at $x$, we have, therefore,
$$\begin{aligned}r^bP_b\,s^{cd}P_{cd}\,\xi_M &= -\hbar^2\,r^b\nabla_b\Big(s^c{}_d\,x^d\,\nabla_c\,\xi_M - \tfrac12\,\xi^N s_{MN'N}{}^{N'}\Big)\\ &= -\hbar^2\,r^{BB'}s^{CC'DD'}\Big(\epsilon_{BD}\,\bar\epsilon_{B'D'}\,\nabla_{CC'}\,\xi_M + \tfrac12\,\bar\epsilon_{D'C'}\,\epsilon_{MC}\,\nabla_{BB'}\,\xi_D\Big)\end{aligned}\qquad(331)$$
Substituting $u_a\,\epsilon^a{}_{bcd}$ for $r_b\,s_{cd}$, and using (269) and (302),
$$\begin{aligned}u^aW_a\,\xi_M &= i\Big(\epsilon^{AB}\epsilon^{CD}\bar\epsilon^{A'C'}\bar\epsilon^{B'D'} - \bar\epsilon^{A'B'}\bar\epsilon^{C'D'}\epsilon^{AC}\epsilon^{BD}\Big)\,u_{AA'}\Big(\epsilon_{BD}\,\bar\epsilon_{B'D'}\,\nabla_{CC'}\,\xi_M + \tfrac12\,\bar\epsilon_{D'C'}\,\epsilon_{MC}\,\nabla_{BB'}\,\xi_D\Big)\\ &= \frac{i\hbar^2}{2}\,u^a\nabla_a\,\xi_M + \frac{i\hbar^2}{2}\,u_{MA'}\,\nabla^{AA'}\xi_A\end{aligned}\qquad(332)$$
Therefore,
$$u^aW_a\,u^bW_b\,\xi_M = (i\hbar^2)^2\Big(\tfrac14\,u^au^b\,\nabla_a\nabla_b\,\xi_M + \tfrac12\,u_{MA'}u^b\,\nabla^{AA'}\nabla_b\,\xi_A + \tfrac12\,u_{MB'}u^a\,\nabla_a\nabla^{BB'}\,\xi_B + u_{MA'}u_{AB'}\,\nabla^{AA'}\nabla^{BB'}\,\xi_B\Big)\qquad(333)$$
Finally, substituting $\eta^{ab}$ for $u^au^b$, we have
$$W^aW_a\,\xi_M = \tfrac34\,\hbar^4\mu^2\,\xi_M = \tfrac34\,\hbar^2m^2\,\xi_M\qquad(334)$$
We conclude from (270) that the Dirac equation describes a particle with spin $\frac12$.
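The plane-wave content of the mass calculation (328)–(329) can be checked directly with a computer algebra system. The 1+1-dimensional reduction, the symbol names, and the metric convention below are illustrative assumptions, not part of the text:

```python
import sympy as sp

t, x, hbar, mu = sp.symbols('t x hbar mu', positive=True)
k1 = sp.symbols('k1', real=True)
k0 = sp.sqrt(mu**2 + k1**2)          # on the mass shell, k_a k^a = mu^2

phi = sp.exp(sp.I*(k0*t - k1*x))     # a single plane-wave component

# P_a = (hbar/i) nabla_a ; with metric signature (+,-) in 1+1 dimensions,
# P^a P_a phi = -hbar^2 (d_t^2 - d_x^2) phi, which should equal hbar^2 mu^2 phi
PaPa = -hbar**2 * (sp.diff(phi, t, 2) - sp.diff(phi, x, 2))

assert sp.simplify(PaPa - hbar**2*mu**2*phi) == 0
```

This is just (329) evaluated on one Fourier mode: the wave operator produces $-(k_0^2 - k_1^2) = -\mu^2$ on the shell.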
20 The Neutrino Equation
A neutrino is essentially a massless Dirac particle. There are, however, a few features which are particular to the case $\mu = 0$.
The (four-component) neutrino field consists of a pair, $(\xi_A, \eta_{A'})$, of spinor fields on Minkowski space, subject to the neutrino equation (see (304), (305)):
$$\nabla^{AA'}\xi_A = 0\qquad(335)$$
$$\nabla^{AA'}\eta_{A'} = 0\qquad(336)$$
Note that, whereas in the massive case either of the two spinor fields can be obtained from the other (via (304), (305)), the fields become uncoupled in the massless case. That is to say, each spinor field satisfies its own equation. Taking a derivative of (335),
$$\nabla_{BA'}\nabla^{AA'}\xi_A = 0\qquad(337)$$
and using (307), we obtain
$$\Box\,\xi_A = 0\qquad(338)$$
and similarly for $\eta_{A'}$. Thus, each of our neutrino fields satisfies the wave equation. Note, however, that (338) does not imply (335). (Solutions of the neutrino equation can, however, be obtained from solutions of the wave equation: if $\kappa^{A'}$ satisfies the wave equation, then $\xi_A = \nabla_{AA'}\kappa^{A'}$ satisfies (335).)
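The algebra behind the step from (337) to (338) can be made concrete in a matrix representation: if the translation from vectors to spinor index pairs is realized with Pauli matrices, the composition of the two first-order operators is a multiple of the wave operator because of the identity checked below. The particular representation ($\sigma^a$, $\bar\sigma^a$) is a standard choice assumed here for illustration, not the notation of the text:

```python
import numpy as np

# Check: sigma-bar^a sigma^b + sigma-bar^b sigma^a = 2 eta^{ab} I, the identity
# that turns the square of the first-order (Weyl/neutrino) operator into the
# wave operator.
I2 = np.eye(2)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

sigma = [I2, sx, sy, sz]            # sigma^a
sigma_bar = [I2, -sx, -sy, -sz]     # sigma-bar^a
eta = np.diag([1.0, -1.0, -1.0, -1.0])

for a in range(4):
    for b in range(4):
        lhs = sigma_bar[a] @ sigma[b] + sigma_bar[b] @ sigma[a]
        assert np.allclose(lhs, 2*eta[a, b]*I2)
```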
The complex-conjugate of the solution $(\xi_A, \eta_{A'})$ of the neutrino equation is the solution $(\bar\eta_A, \bar\xi_{A'})$.
Passing to momentum space, we set
$$\xi_A(x) = \int_{M_0}\xi_A(k)\,e^{ik_bx^b}\,dV_0\qquad(339)$$
$$\eta_{A'}(x) = \int_{M_0}\eta_{A'}(k)\,e^{ik_bx^b}\,dV_0\qquad(340)$$
where $\xi_A(k)$ and $\eta_{A'}(k)$ are spinor-valued functions on the zero-mass shell, $M_0$. In momentum space, (335) and (336) become
$$\xi_A(k)\,k^{AA'} = 0\qquad(341)$$
$$\eta_{A'}(k)\,k^{AA'} = 0\qquad(342)$$
Positive-frequency and negative-frequency solutions of the neutrino equations are well-defined. Complex-conjugation again reverses frequency, and is again expressed in momentum space by the equations (313).
The current (314) is still divergence-free in the massless case. (In fact, the proof is rather simpler with $\mu = 0$.) This fact leads to a norm on solutions of the neutrino equation. The simplest way to obtain the norm, however, is as a $\mu \to 0$ limit of the Dirac norm. Consider (318). It is not difficult to check from (311) and (312) that
$$\xi_A(k)\,\bar\xi_{A'}(k) + \bar\eta_A(k)\,\eta_{A'}(k) = \sigma(k)\,k_{AA'}\qquad(343)$$
where $\sigma(k)$ is a real function on $M$ which is positive on $M^+$ and negative on $M^-$. In terms of $\sigma$, the Dirac norm (318) takes the form
$$\int_M |\sigma(k)|\,dV\qquad(344)$$
We now return to the massless case. Eqn. (341) implies that $\xi_A(k)\,\bar\xi_{A'}(k)$ is proportional to $k_{AA'}$; similarly, (342) implies that $\bar\eta_A(k)\,\eta_{A'}(k)$ is also proportional to $k_{AA'}$. Therefore,
$$\xi_A(k)\,\bar\xi_{A'}(k) + \bar\eta_A(k)\,\eta_{A'}(k) = \sigma(k)\,k_{AA'}\qquad(345)$$
on $M_0$, where $\sigma(k)$ is real on $M_0$, positive on $M_0^+$, and negative on $M_0^-$. We therefore define the norm in the neutrino case, in analogy with (344), by
$$\int_{M_0}|\sigma(k)|\,dV_0\qquad(346)$$
For the complex vector space structure in the massless case, we use the same convention as in the massive case (see (320)).
In fact, the theory we have been discussing is not very interesting physically. The reason is that our Hilbert space of solutions of the neutrino equation contains four irreducible subspaces: positive-frequency solutions with $\eta_{A'} = 0$, negative-frequency solutions with $\eta_{A'} = 0$, positive-frequency solutions with $\xi_A = 0$, and negative-frequency solutions with $\xi_A = 0$. Every solution can be written uniquely as the sum of four solutions, one from each class above. Thus, our neutrino field describes four similar particles. But neutrinos in the real world appear in pairs (particle-antiparticle). Thus, we would like to introduce a field whose Hilbert space has only two irreducible (under the restricted Poincare group) subspaces. The result is what is called the two-component neutrino theory, which we now describe. (The only purpose in treating the four-component theory at all was to make explicit the analogy with the Dirac equation.)
The (two-component) neutrino field is a single spinor field $\xi_A$ on Minkowski space which satisfies (335), and, therefore, (338). In momentum space, we have a spinor-valued function $\xi_A(k)$ on $M_0$ which satisfies (341). This equation implies
$$\xi_A(k)\,\bar\xi_{A'}(k) = \sigma(k)\,k_{AA'}\qquad(347)$$
where $\sigma(k)$ is real, and positive on $M_0^+$ and negative on $M_0^-$. We define the norm on our solutions by
$$\int_{M_0}|\sigma(k)|\,dV_0\qquad(348)$$
The complex vector space structure is defined as before: the product of a complex number $\alpha$ and a solution $\xi_A(k)$ is defined to be the solution
$$\begin{cases}\alpha\,\xi_A(k) & \text{for } k\in M_0^+\\ \bar\alpha\,\xi_A(k) & \text{for } k\in M_0^-\end{cases}\qquad(349)$$
The collection of all measurable spinor functions $\xi_A(k)$ on $M_0$ for which (341) is satisfied and (348) converges, with the above complex vector space structure, is a Hilbert space which we write as $H_N$.
We introduce the antisymmetric Fock space based on $H_N$. We thus have creation and annihilation operators, number operators, etc.
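A minimal sketch of what "antisymmetric Fock space" means for a single mode (one one-particle state): occupation numbers are restricted to 0 and 1, and the creation and annihilation operators obey anticommutation rather than commutation relations. The explicit $2\times2$ matrices are illustrative, not part of the text:

```python
import numpy as np

# One-mode antisymmetric ("fermionic") Fock space with basis |0>, |1>.
C = np.array([[0, 0],
              [1, 0]], dtype=float)   # creation: C|0> = |1>, C|1> = 0
A = C.T                                # annihilation: A|1> = |0>, A|0> = 0
I = np.eye(2)
N = C @ A                              # number operator

assert np.allclose(C @ C, 0)           # Pauli exclusion: creating twice gives 0
assert np.allclose(A @ C + C @ A, I)   # anticommutator {A, C} = I
assert np.allclose(N @ N, N)           # occupation numbers are 0 or 1 only
```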
We introduce on $H_N$ the two projection operators $P^+$ and $P^-$, projecting onto positive and negative frequency, respectively. These operators, of course, satisfy (324) and (325).
Finally, we remark on the spin and mass to be associated with $H_N$. We have done the mass calculation several times: (338) clearly leads to $m = 0$ for $H_N$. Furthermore, most of the work involved in calculating the spin has already been done. Nowhere in the argument leading to (332) did we use the fact that $\xi_A$ satisfies the Dirac equation, and so (332) holds also in the neutrino case. But now (335) implies that the second term on the right in (332) vanishes, so we have
$$u^aW_a\,\xi_B = \frac{i\hbar^2}{2}\,u^a\nabla_a\,\xi_B\qquad(350)$$
Since, furthermore,
$$u^aP_a\,\xi_B = \frac{\hbar}{i}\,u^a\nabla_a\,\xi_B\qquad(351)$$
one is tempted to conclude from (271) that $s = \frac12$ for $H_N$. This conclusion is essentially correct, but one technical point must be clarified. (Unfortunately, our notation is rather badly suited to the remarks below, and so they will sound rather mystical.) The problem involves what the $i$'s mean in (350) and (351). (This problem never arose in the Dirac case because the $i$'s were always squared away, so their meaning was irrelevant.) Where did the $i$'s come from? The $i$ in (351) came from the $\hbar/i$ factors which are introduced in the operator fields associated with the Poincare group. This $i$ means multiplication by $i$ within the Hilbert space $H_N$, because only in this way does one obtain Hermitian operators from infinitesimal unitary operators. In other words, the $i$ in (351) arises from very general considerations involving the action of a Lie group on a Hilbert space, and, in this general framework within which the formalism was set up, there is only one notion of multiplication by $i$, namely, multiplication within the Hilbert space. Thus, the $i$ in (351) multiplies the positive-frequency part of what follows by $i$, and the negative-frequency part by $-i$. (See (349).) The $i$ in (350), on the other hand, is a quite different animal. It arose from the $i$ in (302). (The $i$'s in $P_a$ and $P_{ab}$ (see (269)) combine to give $-1$.) But the $i$ in (302) appears because of the way that the real tensor field $\epsilon_{abcd}$ must be expressed in terms of spinors. Hence, the $i$ in (350), because of its origin, represents simply multiplication of a tensor field by $i$. That is to say, the $i$-operators in (350) and (351) are equal for positive-frequency solutions, and minus each other for negative-frequency solutions. Thus, $s = \frac12$ for positive-frequency solutions (neutrinos), and $s = -\frac12$ for negative-frequency solutions (antineutrinos). That is, in the neutrino case the particle and its antiparticle have opposite helicity. This prediction is in fact confirmed by experiment.
21 Complex Klein-Gordon Fields
In Sect. 5, we dealt with real solutions of the Klein-Gordon equation (although, for reasons of motivation, we chose to characterize such fields as complex positive-frequency solutions). Such fields describe particles with spin zero which are identical with their antiparticles (e.g., the $\pi^0$). On the other hand, there are spin-zero particles which are not identical with their antiparticles (the $\pi^+$ and $\pi^-$). Such particles are described by complex solutions of the Klein-Gordon equation. In momentum space, we write
$$\phi(x) = \int_M \phi(k)\,e^{ik_bx^b}\,dV\qquad(352)$$
Thus, our solution is characterized by a complex-valued function $\phi(k)$ on $M$. (In the real case, one requires in addition $\phi(-k) = \bar\phi(k)$.) The norm of such a function is defined by
$$\frac{1}{\hbar}\int_M \phi(k)\,\bar\phi(k)\,dV\qquad(353)$$
We adopt, for the complex vector space structure on these functions, essentially the same structure used in the Dirac and neutrino case. To multiply $\phi(k)$ by a complex number $\alpha$, one takes
$$\begin{cases}\alpha\,\phi(k) & \text{for } k\in M^+\\ \bar\alpha\,\phi(k) & \text{for } k\in M^-\end{cases}\qquad(354)$$
The collection of all measurable, square-integrable (in the sense of (353)), complex-valued functions on $M$, with this complex vector space structure, is a Hilbert space, which we write as $H_{CKG}$. On $H_{CKG}$ we again have the projection operators $P^+$ and $P^-$, which take the positive-frequency part and negative-frequency part, respectively.
We introduce the symmetric Fock space based on $H_{CKG}$, creation and annihilation operators, number operators, etc.
22 Positive Energy
Many of the quantities associated with an elementary particle (e.g., charge) are reversed in the passage from a particle to its antiparticle. It is observed experimentally, however, that energy is not one of these quantities. For example, if an electron and a positron annihilate (say, with negligible kinetic energy), then the total energy released is $2m$, and not zero. We are thus forced to assign a (rest) energy $+m$ to both a positron and an electron. Where does this fact appear in our formalism?
Of course, energy refers to the state of motion of an observer. This state of motion is represented by some constant, unit, future-directed timelike vector field $r^a$ in Minkowski space. The energy operator is then $E = r^aP_a$. It should be emphasized that we are not free to assign energies arbitrarily to obtain agreement with experiment. The very concept of energy is based in an essential way on the action of the Poincare group (more explicitly, on the time translations). If we wish to avoid a radical change in what energy means in the passage from classical to quantum theory, we must choose for the energy in quantum field theory that quantity which arises naturally from time translations in Minkowski space, i.e., we must choose the $E$ above. We take as our precise statement that energies are nonnegative the statement that the expectation value of $E$ in any state $\psi$ (on which $E$ is defined) be nonnegative:
$$(\psi, E\psi) \ge 0\qquad(355)$$
Is it true or false that (355) holds for the five Hilbert spaces we have constructed, $H_{RKG}$, $H_{CKG}$, $H_M$, $H_D$, $H_N$?
We begin with the real Klein-Gordon case. The Hilbert space consists of measurable, square-integrable, complex-valued functions $\phi(k)$ on $M$ which satisfy
$$\phi(-k) = \bar\phi(k)\qquad(356)$$
Such functions do not have an obvious complex vector space structure. If $\phi(k)$ satisfies (356), and $\alpha$ is a complex number, then $\alpha\,\phi(k)$ will not in general satisfy (356). This fact, of course, is not surprising: there is no obvious way to take the product of a complex number and a real solution of a differential equation to obtain another real solution. This problem is resolved, in $H_{RKG}$, by choosing one of the two mass shells to be preferred, and calling it the future mass shell, $M^+$: $M^+$ gets $\alpha$, while $M^-$ must be content with $\bar\alpha$. In other words, we define multiplication of $\phi(k)$ by $\alpha$ by
$$\begin{cases}\alpha\,\phi(k) & k\in M^+\\ \bar\alpha\,\phi(k) & k\in M^-\end{cases}\qquad(357)$$
It should be emphasized that, in the real case, we are forced (by the requirement that we obtain a Hilbert space) to select one preferred mass shell and define multiplication by (357).
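A finite sketch of the multiplication rule (357), with the two shells sampled at paired points $\pm k_j$ (all array names and sample values are hypothetical): the rule preserves the reality condition (356), and multiplying twice by $i$ is the same as multiplying by $-1$, so one really does obtain a complex vector space.

```python
import numpy as np

# phi_plus[j] = phi(k_j) with k_j in M+; phi_minus[j] = phi(-k_j).
# The reality condition (356) reads phi(-k) = conj(phi(k)).
rng = np.random.default_rng(0)
phi_plus = rng.normal(size=4) + 1j*rng.normal(size=4)
phi_minus = np.conj(phi_plus)                 # enforce (356)

def mult(lam, fp, fm):
    # (357): multiply by lam on M+, by conj(lam) on M-.
    return lam*fp, np.conj(lam)*fm

lam = 0.3 - 1.7j
fp, fm = mult(lam, phi_plus, phi_minus)
assert np.allclose(fm, np.conj(fp))           # (356) is preserved

gp, gm = mult(1j, *mult(1j, phi_plus, phi_minus))
assert np.allclose(gp, -phi_plus) and np.allclose(gm, -phi_minus)
```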
Now consider the energy. If $\phi(x)$ is a real solution of the Klein-Gordon equation, then
$$E\,\phi(x) = \frac{\hbar}{i}\,r^a\nabla_a\,\phi(x)\qquad(358)$$
Because of the $i$ in (358), one might naively think that (358) does not represent a real solution of the Klein-Gordon equation, and so that (358) is not a definition for $E$. This, of course, is not the case. The $i$ in (358) arose because of general considerations involving representations of the Poincare group (Sect. 16), and means multiplication within the Hilbert space $H_{RKG}$. In momentum space, the operator $r^a\nabla_a$ has the effect
$$\phi(k) \to i\,(r^ak_a)\,\phi(k)\qquad(359)$$
Note that (359) does not destroy (356), a statement which reflects the fact that
$$\phi(x) \to r^a\nabla_a\,\phi(x)\qquad(360)$$
is an unambiguous operation on real solutions of the Klein-Gordon equation.
Now using (357), the energy operator in momentum space takes the form
$$\phi(k) \to \begin{cases}\hbar\,(r^ak_a)\,\phi(k) & k\in M^+\\ -\hbar\,(r^ak_a)\,\phi(k) & k\in M^-\end{cases}\qquad(361)$$
The expectation value of $E$ in the state $\phi(k)$ is
$$\int_{M^+}(r^ak_a)\,\phi(k)\,\bar\phi(k)\,dV - \int_{M^-}(r^ak_a)\,\phi(k)\,\bar\phi(k)\,dV\qquad(362)$$
which, of course, is positive. (Why don't we just define the energy operator by (359), (360), leaving out the $i$? Because the expectation value of this operator is not real. That is, the $i$ is needed for Hermiticity.)
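A crude discretized check of (362), in 1+1 dimensions with $\mu = 1$ and observer $r^a = (1, 0)$, sums standing in for the integrals (the grid and sample values are illustrative assumptions, not the text's construction):

```python
import numpy as np

rng = np.random.default_rng(1)
k1 = np.linspace(-3, 3, 25)
k0 = np.sqrt(1 + k1**2)                 # future shell M+; the past shell has -k0

phi_plus = rng.normal(size=k1.size) + 1j*rng.normal(size=k1.size)
phi_minus = np.conj(phi_plus)           # reality condition (356)

r_dot_k_plus = k0                       # r^a k_a on M+  (positive)
r_dot_k_minus = -k0                     # r^a k_a on M-  (negative)

# (362): sum over M+ of (r.k)|phi|^2  minus  sum over M- of (r.k)|phi|^2
E = (np.sum(r_dot_k_plus*np.abs(phi_plus)**2)
     - np.sum(r_dot_k_minus*np.abs(phi_minus)**2))
assert E > 0
```

Both contributions are manifestly positive: the sign flip introduced by (357) on $M^-$ exactly compensates the sign of $r^ak_a$ there.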
We summarize the situation. In order to make a Hilbert space of real solutions of the Klein-Gordon equation, we are forced to select a preferred mass shell to be called future. Then, provided $r^a$ is future-directed according to this convention, $E$ will have positive expectation values.
Now consider the complex Klein-Gordon case. The energy operator still has the form
$$E\,\phi(x) = \frac{\hbar}{i}\,r^a\nabla_a\,\phi(x)\qquad(363)$$
and $i$ still means multiplication within our Hilbert space. In momentum space, $\phi(k)$ is an arbitrary measurable, square-integrable, complex-valued function. The operator $r^a\nabla_a$ has the effect
$$\phi(k) \to i\,(r^ak_a)\,\phi(k)\qquad(364)$$
We must still multiply (364) by $1/i$. But we now have the freedom to select one of two possible complex vector space structures on the complex solutions of the Klein-Gordon equation. For the product of a complex number $\alpha$ and $\phi(k)$, we could choose
$$\alpha\,\phi(k)\qquad(365)$$
or, alternatively,
$$\begin{cases}\alpha\,\phi(k) & k\in M^+\\ \bar\alpha\,\phi(k) & k\in M^-\end{cases}\qquad(366)$$
The resulting energy operators are
$$E\,\phi(k) = \hbar\,(r^ak_a)\,\phi(k)\qquad(367)$$
$$E\,\phi(k) = \begin{cases}\hbar\,(r^ak_a)\,\phi(k) & k\in M^+\\ -\hbar\,(r^ak_a)\,\phi(k) & k\in M^-\end{cases}\qquad(368)$$
respectively. Finally, the resulting expectation values of $E$ are
$$\int_M (r^ak_a)\,\phi(k)\,\bar\phi(k)\,dV\qquad(369)$$
$$\int_{M^+}(r^ak_a)\,\phi(k)\,\bar\phi(k)\,dV - \int_{M^-}(r^ak_a)\,\phi(k)\,\bar\phi(k)\,dV\qquad(370)$$
respectively. But note that (369) can take both positive and negative values, while (370) is always nonnegative. But this is exactly what one might expect. The complex vector space structure (365) does not prefer one time direction over the other: it makes no reference to past and future. Therefore, it could not possibly lead to a positive energy, for the energy associated with $r^a$ is certainly minus the energy associated with $-r^a$. The complex vector space structure (366), on the other hand, picks out a particular future time direction. Then the expectation value of $E$ is positive provided $r^a$ is future-directed in this sense. It is for this reason that we are led to select (366) as our complex vector space structure.
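The contrast between (369) and (370) is easy to see numerically: take a solution concentrated entirely on the past mass shell (an "antiparticle" state). The discretization below is an illustrative sketch, with sums standing in for the integrals:

```python
import numpy as np

k1 = np.linspace(-3, 3, 25)
k0 = np.sqrt(1 + k1**2)                      # mu = 1, r^a = (1, 0)

phi_plus = np.zeros_like(k1, dtype=complex)  # nothing on M+
phi_minus = np.exp(-k1**2) + 0j              # a lump on M-

dens_plus = np.abs(phi_plus)**2
dens_minus = np.abs(phi_minus)**2

# (369): one integral over all of M; on M-, r^a k_a = -k0 < 0
E_naive = np.sum(k0*dens_plus) + np.sum(-k0*dens_minus)
# (370): integral over M+ minus integral over M-
E_good = np.sum(k0*dens_plus) - np.sum(-k0*dens_minus)

assert E_naive < 0 < E_good
```

The structure (365) yields the indefinite value `E_naive`; the structure (366) yields the nonnegative value `E_good`.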
We summarize. If energy is to arise from time translations, there is no freedom to alter the energy operator itself. In the real Klein-Gordon case, we are forced, in order to obtain a Hilbert space, to select a preferred future mass shell. Then energy is positive provided $r^a$ is future-directed. In the complex Klein-Gordon case, there are two distinct ways to obtain a Hilbert space, one which selects a preferred future mass shell, and one which does not. It is only the former choice which leads to positive energies. We make this choice.
There is an additional sense in which (366) is a more natural choice for the complex vector space structure for $H_{CKG}$. Every real solution of the Klein-Gordon equation is certainly also a complex solution. We thus have a natural mapping $H_{RKG} \to H_{CKG}$. This mapping is certainly norm-preserving. Is it also linear? The answer is no if we choose the structure (365), and yes if we choose the structure (366).
A completely analogous situation holds for the other Hilbert spaces. $H_M$ is based on real solutions of Maxwell's equations, its complex vector space structure depends on choosing a particular future mass shell, and energies are naturally positive. On the other hand, $H_D$ and $H_N$ are based on complex fields. We have two choices for the complex vector space structure, one of which leads to positive energies and one of which does not. We choose the complex vector space structure to be the one which, by preferring a future mass shell, makes the energy positive. (See (320), (349).)
23 Fields as Operators: Propagators
Ordinary relativistic fields are to be replaced, eventually, by an appropriate class of operators on Fock space. This transition from fields to operators is to be carried out according to the following general rules:
i) A real field becomes a Hermitian operator; a pair of complex-conjugate fields becomes a pair of adjoint operators;
ii) The operators have the same index structure, and satisfy the same equations, as the corresponding fields; and
iii) The positive-frequency part of the operator is annihilation of a particle, the negative-frequency part creation of an antiparticle.
We have already discussed these operators in the real Klein-Gordon and Maxwell cases (Sects. 12 and 14, respectively). The purposes of this section are, firstly, to treat the complex Klein-Gordon and Dirac cases, and, secondly, to establish certain properties of the functions which appear in the commutators or anticommutators. For completeness, we briefly review Sects. 12 and 14.
Real Klein-Gordon. $H_{RKG}$ consists of (measurable, square-integrable) complex-valued functions $\phi(k)$ on $M$ which satisfy $\phi(-k) = \bar\phi(k)$. The inner product is
$$\big(\phi(k), \psi(k)\big) = \frac{1}{\hbar}\int_{M^+}\bar\phi(k)\,\psi(k)\,dV + \frac{1}{\hbar}\int_{M^-}\phi(k)\,\bar\psi(k)\,dV\qquad(371)$$
Let $f(x)$ be a real test function on Minkowski space, and let $\tilde f(k)$ be its Fourier inverse, so $\tilde f(-k) = \overline{\tilde f(k)}$. Then $\tilde f(k)$, restricted to $M$, defines an element, $\gamma(f)$, of $H_{RKG}$. The corresponding field operator on symmetric Fock space is
$$\phi(f) = \hbar\,\big(C(\gamma(f)) + A(\gamma(f))\big)\qquad(372)$$
Note that (372) is Hermitian and satisfies the Klein-Gordon equation:
$$\phi\big((\Box + \mu^2)f\big) = 0\qquad(373)$$
The commutator is
$$\big[\phi(f), \phi(g)\big] = \hbar^2\big(\big[C(\gamma(f)), A(\gamma(g))\big] + \big[A(\gamma(f)), C(\gamma(g))\big]\big) = \hbar^2\big(\langle\gamma(f),\gamma(g)\rangle - \langle\gamma(g),\gamma(f)\rangle\big)\,I = \frac{\hbar}{i}\,D(f, g)\,I\qquad(374)$$
where we have defined
$$D(f, g) = i\,\Bigg(\int_{M^+} - \int_{M^-}\Bigg)\Big(\overline{\tilde f(k)}\,\tilde g(k) - \tilde f(k)\,\overline{\tilde g(k)}\Big)\,dV\qquad(375)$$
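A finite-sample sketch of (375): since $f$ and $g$ are real, $\tilde f(-k) = \overline{\tilde f(k)}$, so the values on $M^-$ are the conjugates of those on $M^+$. With sums standing in for the integrals (sample values hypothetical), the antisymmetry (396) and the reality of $D$ can be checked directly:

```python
import numpy as np

rng = np.random.default_rng(2)
ft = rng.normal(size=6) + 1j*rng.normal(size=6)   # tilde-f sampled on M+
gt = rng.normal(size=6) + 1j*rng.normal(size=6)   # tilde-g sampled on M+

def D(ft, gt):
    plus = np.conj(ft)*gt - ft*np.conj(gt)        # integrand on M+
    minus = ft*np.conj(gt) - np.conj(ft)*gt       # integrand on M- (conjugated values)
    return 1j*np.sum(plus - minus)

assert np.allclose(D(ft, gt), -D(gt, ft))         # antisymmetry, cf. (396)
assert abs(D(ft, gt).imag) < 1e-12                # D is real
```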
Complex Klein-Gordon. $H_{CKG}$ consists of (measurable, square-integrable) complex-valued functions $\phi(k)$ on $M$. The inner product is
$$\big(\phi(k), \psi(k)\big) = \frac{1}{\hbar}\int_{M^+}\bar\phi(k)\,\psi(k)\,dV + \frac{1}{\hbar}\int_{M^-}\phi(k)\,\bar\psi(k)\,dV\qquad(376)$$
Let $f(x)$ be a real test function on Minkowski space, and let $\tilde f(k)$ be its Fourier inverse, so $\tilde f(-k) = \overline{\tilde f(k)}$. Let $\Lambda^+(f)$ be the element of $H_{CKG}$ given by $\tilde f(k)$ on $M^+$ and zero on $M^-$, and let $\Lambda^-(f)$ be the element given by $\tilde f(k)$ on $M^-$ and zero on $M^+$. The corresponding field operators on symmetric Fock space are
$$\phi(f) = \hbar\,\big(C(\Lambda^-(f)) + A(\Lambda^+(f))\big)\qquad(377)$$
$$\phi^\dagger(f) = \hbar\,\big(A(\Lambda^-(f)) + C(\Lambda^+(f))\big)\qquad(378)$$
Note that these are adjoints of each other, and that they satisfy the Klein-Gordon equation:
$$\phi\big((\Box+\mu^2)f\big) = \phi^\dagger\big((\Box+\mu^2)f\big) = 0\qquad(379)$$
We clearly have
$$\big[\phi(f), \phi(g)\big] = \big[\phi^\dagger(f), \phi^\dagger(g)\big] = 0\qquad(380)$$
For the other commutator, however,
$$\big[\phi(f), \phi^\dagger(g)\big] = \hbar^2\big(\big[C(\Lambda^-(f)), A(\Lambda^-(g))\big] + \big[A(\Lambda^+(f)), C(\Lambda^+(g))\big]\big) = \hbar^2\big(\langle\Lambda^+(f),\Lambda^+(g)\rangle - \langle\Lambda^-(g),\Lambda^-(f)\rangle\big)\,I = \frac{\hbar}{2i}\,D(f, g)\,I\qquad(381)$$
where $D(f, g)$ is given by (375).
Maxwell. $H_M$ consists of (measurable, square-integrable) complex vector functions $A_a(k)$ on $M_0$ which satisfy $A_a(-k) = \bar A_a(k)$ and $k^aA_a(k) = 0$, where two such functions which differ by a multiple of $k_a$ are to be regarded as defining the same element of $H_M$. The inner product is
$$\big(A_a(k), B_a(k)\big) = -\frac{1}{\hbar}\int_{M_0^+}\bar A_a(k)\,B^a(k)\,dV - \frac{1}{\hbar}\int_{M_0^-}A_a(k)\,\bar B^a(k)\,dV\qquad(382)$$
Let $f_a(x)$ be a real test vector field on Minkowski space which is the sum of a divergence-free field and a gradient, and let $\tilde f_a(k)$ be its Fourier inverse. Then $\tilde f_a(k)$ satisfies $\tilde f_a(-k) = \overline{\tilde f_a(k)}$ and $k^a\tilde f_a(k) = 0$, and so defines an element, $\gamma(f_a)$, of $H_M$. The corresponding operator on symmetric Fock space is
$$A(f_a) = \hbar\,\big(C(\gamma(f_a)) + A(\gamma(f_a))\big)\qquad(383)$$
This operator is Hermitian, and satisfies Maxwell's equations (for the vector potential):
$$A(\nabla_af) = 0 \qquad A(\Box f_a) = 0\qquad(384)$$
The commutator is
$$\big[A(f_a), A(g_a)\big] = \hbar^2\big(\langle\gamma(f_a),\gamma(g_a)\rangle - \langle\gamma(g_a),\gamma(f_a)\rangle\big)\,I = \frac{\hbar}{i}\,D(f_a, g_a)\,I\qquad(385)$$
where we have defined
$$D(f_a, g_a) = i\,\Bigg(\int_{M_0^+} - \int_{M_0^-}\Bigg)\Big(\tilde f_a(k)\,\overline{\tilde g^a(k)} - \tilde g_a(k)\,\overline{\tilde f^a(k)}\Big)\,dV\qquad(386)$$
Dirac. $H_D$ consists of (measurable, square-integrable) pairs, $(\xi_A(k), \eta_{A'}(k))$, of spinor functions on $M$ which satisfy
$$ik^{AA'}\xi_A(k) = \mu\,\eta^{A'}(k) \qquad ik_{AA'}\eta^{A'}(k) = \mu\,\xi_A(k)\qquad(387)$$
The inner product is
$$\big((\xi_A, \eta_{A'}), (\lambda_A, \omega_{A'})\big) = \frac{2}{\mu^2}\int_{M^+}\bar\xi_{A'}(k)\,\lambda_A(k)\,k^{AA'}\,dV - \frac{2}{\mu^2}\int_{M^-}\xi_A(k)\,\bar\lambda_{A'}(k)\,k^{AA'}\,dV\qquad(388)$$
Let $f_A(x)$, $\bar f_{A'}(x)$ be a pair of test spinor fields on Minkowski space which is real in the sense that the second field is the complex-conjugate of the first. (Of course, test means having compact support.) Let $\tilde f_A(k)$ be the Fourier inverse of $f_A(x)$, and consider the pair
$$\Big(\tilde f_A(k) - \frac{i}{\mu}\,k_{AA'}\,\overline{\tilde f^{A'}(k)},\ \ \overline{\tilde f_{A'}(k)} + \frac{i}{\mu}\,k_{AA'}\,\tilde f^A(k)\Big)\qquad(389)$$
of spinor functions on $M$. Let $\Lambda^+(f_A, \bar f_{A'})$ be the element of $H_D$ which is given by (389) on $M^+$ and zero on $M^-$, and let $\Lambda^-(f_A, \bar f_{A'})$ be the element given by (389) on $M^-$ and zero on $M^+$. The corresponding field operators on antisymmetric Fock space are
$$\psi(f_A, \bar f_{A'}) = C\big(\Lambda^-(f_A, \bar f_{A'})\big) + A\big(\Lambda^+(f_A, \bar f_{A'})\big)\qquad(390)$$
$$\psi^\dagger(f_A, \bar f_{A'}) = A\big(\Lambda^-(f_A, \bar f_{A'})\big) + C\big(\Lambda^+(f_A, \bar f_{A'})\big)\qquad(391)$$
These operators are adjoints of each other, and satisfy the Dirac equation:
$$\psi\big(\mu f_A - \nabla_{AA'}\bar f^{A'},\ \mu\bar f_{A'} + \nabla_{AA'}f^A\big) = \psi^\dagger\big(\mu f_A - \nabla_{AA'}\bar f^{A'},\ \mu\bar f_{A'} + \nabla_{AA'}f^A\big) = 0\qquad(392)$$
We clearly have
$$\big\{\psi(f_A, \bar f_{A'}),\ \psi(g_A, \bar g_{A'})\big\} = \big\{\psi^\dagger(f_A, \bar f_{A'}),\ \psi^\dagger(g_A, \bar g_{A'})\big\} = 0\qquad(393)$$
For the other anticommutator,
$$\big\{\psi(f_A, \bar f_{A'}),\ \psi^\dagger(g_A, \bar g_{A'})\big\} = \big(\langle\Lambda^-(g_A, \bar g_{A'}),\Lambda^-(f_A, \bar f_{A'})\rangle + \langle\Lambda^+(f_A, \bar f_{A'}),\Lambda^+(g_A, \bar g_{A'})\rangle\big)\,I = D\big((f_A, \bar f_{A'}), (g_A, \bar g_{A'})\big)\,I\qquad(394)$$
where, using (388) and (389),
$$D\big((f_A, \bar f_{A'}), (g_A, \bar g_{A'})\big) = \frac{2}{\mu^2}\,\Bigg(\int_{M^+} - \int_{M^-}\Bigg)\bigg[\Big(\overline{\tilde f_{A'}(k)}\,\tilde g_A(k) + \tilde f_A(k)\,\overline{\tilde g_{A'}(k)}\Big)k^{AA'} + \frac{i\mu}{2}\Big(\tilde f_A(k)\,\tilde g^A(k) - \overline{\tilde f_{A'}(k)}\,\overline{\tilde g^{A'}(k)}\Big)\bigg]\,dV\qquad(395)$$
The functions (375), (386), and (395) play a very special role in relativistic quantum field theory. They are called Feynman propagators. Several properties of the propagators follow immediately from the definitions. In the first place, they are all real. Secondly, we have the symmetries
$$D(f, g) = -D(g, f)\qquad(396)$$
$$D(f_a, g_a) = -D(g_a, f_a)\qquad(397)$$
$$D\big((f_A, \bar f_{A'}), (g_A, \bar g_{A'})\big) = D\big((g_A, \bar g_{A'}), (f_A, \bar f_{A'})\big)\qquad(398)$$
Furthermore, since the propagators arise from commutators or anticommutators of the field operators, they satisfy the appropriate field equations:
$$D\big((\Box+\mu^2)f,\ g\big) = 0\qquad(399)$$
$$D(\nabla_af,\ g_a) = 0 \qquad D(\Box f_a,\ g_a) = 0\qquad(400)$$
$$D\big((\mu f_A - \nabla_{AA'}\bar f^{A'},\ \mu\bar f_{A'} + \nabla_{AA'}f^A),\ (g_A, \bar g_{A'})\big) = 0\qquad(401)$$
Note also that the propagators are linear in the real test fields on which they depend.
A more remarkable property of the propagators is that they can all be expressed directly in terms of the Klein-Gordon propagator, $D(f, g)$. Let $v^a$ and $w^a$ be constant vector fields on Minkowski space, $f$ and $g$ real test functions, and consider the expression
$$v^aw_a\,D(f, g)\qquad(402)$$
Inserting (375) in (402), we see that (402) is precisely the right side of (386), provided we set $f_a = fv_a$, $g_a = gw_a$. Thus, we may define a function $D(f_a, g_a)$ for test fields of the form $f_a = fv_a$, $g_a = gw_a$ by (402). Then, assuming linearity in $f_a$ and $g_a$, we extend the range of $D(f_a, g_a)$ to arbitrary real test vector fields. Finally, restricting $f_a$ and $g_a$ to fields which can be expressed as the sum of a divergence-free field and a gradient, we obtain precisely the Maxwell propagator.
The situation is similar in the Dirac case. Let $\alpha_A$ and $\beta_A$ be constant spinor fields, $f$ and $g$ real test functions. Then, from (395) and (375), we have
$$D\big((f\alpha_A, f\bar\alpha_{A'}),\ (g\beta_A, g\bar\beta_{A'})\big) = \frac{2}{\mu^2}\,D\big((\alpha^A\bar\beta^{A'} + \beta^A\bar\alpha^{A'})\nabla_{AA'}f,\ g\big) + \frac{2}{\mu}\big(\alpha_A\beta^A + \bar\alpha_{A'}\bar\beta^{A'}\big)\,D(f, g)\qquad(403)$$
But, again by linearity, if we know the Dirac propagator for test fields of the form $f(\alpha_A, \bar\alpha_{A'})$ with constant $\alpha_A$, we know the Dirac propagator for all test fields. In this sense, then, the Dirac propagator follows already from the Klein-Gordon propagator.
We may now derive a particularly important property of the Feynman propagators. A function of a pair of test fields will be called causal if it vanishes whenever the test fields have relatively spacelike supports (see p. 39). We have seen in Sect. 12 that the Klein-Gordon propagator, $D(f, g)$, is causal. The remarks above imply, therefore, that all the Feynman propagators are causal.
24 Spin and Statistics
What is it that determines whether the appropriate Fock space for an elementary particle is the symmetric or the antisymmetric one? (This distinction is said to be one of statistics. Particles described by the symmetric Fock space are called bosons, and are said to satisfy Bose statistics. Particles described by the antisymmetric Fock space are called fermions, and are said to satisfy Fermi statistics.) It is found in Nature that the statistics a particle obeys is invariably correlated with another feature of the particle, its spin. It is found, in fact, that all particles with half-integer (i.e., half-odd-integer) spin obey Fermi statistics, while particles with integer spin obey Bose statistics. How should this fact be incorporated into quantum field theory? One could, of course, merely regard the correlation between spin and statistics as an empirical fact, a fact which can be used to choose the appropriate statistics in each case. It is natural to ask, however, whether there is some deeper theoretical reason why Nature operates as She does. Certainly, no obvious internal inconsistencies arise if we insist that Klein-Gordon and Maxwell particles be fermions, while Dirac particles be bosons. It would be desirable, however, to find some very general requirement on quantum field theories which would force the experimentally observed relation between spin and statistics. There is, in fact, such a requirement: the demand that the propagators be causal. We have seen in Sect. 23 that, with the correct statistics, the propagators are indeed causal. In this section, we shall indicate why the propagators for fermion Klein-Gordon, fermion Maxwell, and boson Dirac particles are not causal. These results are a special case of a more general theorem. If we require that energy be positive (to fix the complex vector space structure), and that the propagator be causal, then particles with half-integer spin must be fermions and those with integer spin bosons. We shall not discuss this general theorem further.
We begin with $H_{RKG}$. The (one-particle) Hilbert space is the same as before, the inner product given by (371). The operator $\phi(f)$ is still defined by (372). Now, however, we suppose that the creation and annihilation operators act on antisymmetric Fock space. Then (373) still holds, but (374) must be modified as follows:
$$\big\{\phi(f), \phi(g)\big\} = \hbar^2\big\{C(\gamma(f)), A(\gamma(g))\big\} + \hbar^2\big\{A(\gamma(f)), C(\gamma(g))\big\} = \hbar^2\big(\langle\gamma(f),\gamma(g)\rangle + \langle\gamma(g),\gamma(f)\rangle\big)\,I = \hbar\,\Bigg(\int_{M^+} + \int_{M^-}\Bigg)\Big(\overline{\tilde f(k)}\,\tilde g(k) + \tilde f(k)\,\overline{\tilde g(k)}\Big)\,dV\ I\qquad(404)$$
The propagator for antisymmetric statistics, the last line in (404), is simply not causal. (Proof: Choose almost any test functions $f$ and $g$ with relatively spacelike supports, and evaluate the integral.) That is to say, we obtain a causal propagator in the real Klein-Gordon case if and only if we use Bose statistics. Thus, if we take causality of the propagator as a fundamental assumption, we are led to assign Bose statistics to real Klein-Gordon particles.
Now consider the complex Klein-Gordon case. If we choose Fermi statistics, (376), (377), (378), and (379) still hold. Furthermore, (380) holds if we replace the commutators by anticommutators. For (381), however, we have
$$\big\{\phi(f), \phi^\dagger(g)\big\} = \hbar^2\big\{C(\Lambda^-(f)), A(\Lambda^-(g))\big\} + \hbar^2\big\{A(\Lambda^+(f)), C(\Lambda^+(g))\big\} = \hbar^2\big(\langle\Lambda^-(g),\Lambda^-(f)\rangle + \langle\Lambda^+(f),\Lambda^+(g)\rangle\big)\,I = \frac{\hbar}{2}\,\Bigg(\int_{M^+} + \int_{M^-}\Bigg)\Big(\overline{\tilde f(k)}\,\tilde g(k) + \tilde f(k)\,\overline{\tilde g(k)}\Big)\,dV\ I\qquad(405)$$
But the last line of (405) is not causal. Hence, in order to obtain a causal propagator, complex Klein-Gordon particles must be bosons.
If we assign Fermi statistics to $H_M$, (382), (383), and (384) remain valid. But (385) becomes
$$\big\{A(f_a), A(g_a)\big\} = \hbar^2\big(\langle\gamma(f_a),\gamma(g_a)\rangle + \langle\gamma(g_a),\gamma(f_a)\rangle\big)\,I = -\hbar\,\Bigg(\int_{M_0^+} + \int_{M_0^-}\Bigg)\Big(\overline{\tilde f_a(k)}\,\tilde g^a(k) + \tilde f_a(k)\,\overline{\tilde g^a(k)}\Big)\,dV\ I\qquad(406)$$
The last line is not causal. So causality of the propagator implies Bose statistics for photons.
Finally, we attempt to impose Bose statistics on Dirac particles. Eqns. (387), (388), (389), (390), (391), and (392) remain valid. Eqn. (393) remains valid if the anticommutators are replaced by commutators. But (394) becomes
$$\big[\psi(f_A, \bar f_{A'}),\ \psi^\dagger(g_A, \bar g_{A'})\big] = \big(-\langle\Lambda^-(g_A, \bar g_{A'}),\Lambda^-(f_A, \bar f_{A'})\rangle + \langle\Lambda^+(f_A, \bar f_{A'}),\Lambda^+(g_A, \bar g_{A'})\rangle\big)\,I = \frac{2}{\mu^2}\,\Bigg(\int_{M^+} + \int_{M^-}\Bigg)\bigg[\Big(\overline{\tilde f_{A'}(k)}\,\tilde g_A(k) - \tilde f_A(k)\,\overline{\tilde g_{A'}(k)}\Big)k^{AA'} + \frac{i\mu}{2}\Big(\tilde f_A(k)\,\tilde g^A(k) + \overline{\tilde f_{A'}(k)}\,\overline{\tilde g^{A'}(k)}\Big)\bigg]\,dV\ I\qquad(407)$$
which, again, is not causal. Causality of the propagator implies Fermi statistics for Dirac particles.
We summarize. The requirement that energies be positive fixes the complex vector space structure of the one-particle Hilbert spaces. The additional requirement that the propagators (the commutators or anticommutators of the field operators) be causal then requires that particles with integer spin be bosons and particles with half-integer spin be fermions, at least for the four cases $H_{RKG}$, $H_{CKG}$, $H_M$, and $H_D$.
25 -Algebras
We have now obtained a number of quantum eld theories of relativistic, non-
interacting particles. Our approach consists, basically, of the following steps:
i) form a Hilbert space of an appropriate collection of solutions of the eld
equations (Klein-Gordon, Maxwell, Dirac, neutrino),
ii) introduce the corresponding symmetric or antisymmetric Fock space, and
iii) replace the original elds by operators on Fock space.
However, there exists an alternative approach, in which one begins with the
eld operators and their commutators (or anticommutators) as the basic ob-
jects, deriving from these the Fock space and nally the one-particle Hilbert
space. While the two approaches are completely equivalent logically, they dier
considerably in attitude. In particular, the alternative approach emphasizes the
analogy between second quantization (the ultimate passage from elds to eld
operators) and rst quantization (e.g., the passage from Newtonian mechanics
to the Schr odinger equation). One thinks of the elds (Klein-Gordon, Max-
well, Dirac, etc.) as classical quantities (analogous to x and p in Newtonian
85
mechanics) which, in the quantized version of the theory, are to become opera-
tors on some Hilbert space. This alternative approach is the one conventionally
followed in textbooks. We discuss it in this section.
It is convenient to first introduce a mathematical object. A *-algebra con-
sists, first of all, of an associative algebra 𝒜 (over the complexes) with unit I.
That is to say, 𝒜 is a complex vector space on which there is defined a product,
AB, between elements A and B of 𝒜, where this product is linear in the factors
and satisfies (254). Furthermore, there is an element I of 𝒜 such that

IA = AI = A    (408)

for every A ∈ 𝒜. (Clearly, this I is unique.) Furthermore, we require that, as
part of the structure of a *-algebra, there be given a mapping from 𝒜 to 𝒜 (the
image of A ∈ 𝒜 under this mapping written A*) subject to:

A1. For each A ∈ 𝒜, (A*)* = A.
A2. For each A, B ∈ 𝒜, (AB)* = B*A*.
A3. For each A, B ∈ 𝒜 and α ∈ ℂ, (αA + B)* = ᾱA* + B*.
The standard example of a *-algebra is the collection of all bounded oper-
ators which are defined everywhere on a fixed Hilbert space H. Addition is
defined by adding the operators, and scalar multiplication by multiplying the
operator by the complex number in the usual way. The product of two oper-
ators is defined by applying them in succession. The unit I is, of course, the
identity operator. Finally, * represents the operation of taking the adjoint of
an operator. (In fact, every bounded operator defined everywhere on a Hilbert
space H has a unique adjoint. We omit the (moderately difficult) proof.) Note
that it is well-defined to speak of projection, Hermitian, and unitary elements of
a *-algebra, for these notions involve only the structure incorporated into the
*-algebra. Intuitively, one can think of a *-algebra as representing operators
on a Hilbert space, but without the Hilbert space itself.
The essential idea of the approach to be described below is identical for all
the relativistic field equations. It will suffice, therefore, to treat one case in
detail. We select the complex Klein-Gordon fields.
The idea is to first introduce a certain *-algebra 𝒜. We suppose that, with
each real test function f on Minkowski space, there is associated a pair of
elements of 𝒜, φ(f) and φ*(f), which are related by the *-operation. We
suppose, furthermore, that 𝒜 is generated by I, φ(f), and φ*(f) (as f runs
over all test functions). That is to say, the most general element of 𝒜 consists
of a finite linear combination, with complex coefficients, of I and products of
the φ(f)'s and φ*(f)'s, e.g.,

αI + φ*(f) + φ(g)φ(k) + φ(m)φ*(n)φ(p)    (409)

Clearly, we can take the sum or product of objects of the form (409), multiply
such an object by a complex number, and take the * of such an object. Un-
fortunately, we still do not have quite the *-algebra we require. We wish to
require, in addition, that certain expressions of the form (409), while formally
distinct, are to be regarded as equal as elements of 𝒜. That is to say, we wish to
impose certain relations among the elements (409) of 𝒜. (This construction is
analogous to that in which one obtains a group by postulating the existence of
certain elements subject to relations. If we wished to be more formal, we would
introduce an equivalence relation.) We impose the following relations:
φ(af + g) = aφ(f) + φ(g)        φ*(af + g) = aφ*(f) + φ*(g)    (410)

φ((□ + μ²)f) = φ*((□ + μ²)f) = 0    (411)

[φ(f), φ(g)] = [φ*(f), φ*(g)] = 0    (412)

[φ(f), φ*(g)] = 2iℏ D(f, g) I    (413)

where f and g are any test functions, a is any real number, and D(f, g) is the
Feynman propagator, (375). This completes the specification of the *-algebra 𝒜.
(Although, of course, this *-algebra 𝒜 looks familiar, it is to be regarded, for the
present, as merely the mathematical object which results from the construction
above.)
It is useful conceptually to restate the construction above from a more phys-
ical point of view. We have taken the classical field φ(x) and its complex-
conjugate field φ̄(x), and replaced them by operators, φ(f) and φ*(f). (Oper-
ators which, as yet, act on no particular Hilbert space: therefore, elements of
a *-algebra.) We impose on these operators a number of more or less natural
conditions. We require that the operators be linear in the test functions, (410).
We require that the operators, in their position dependence, satisfy the same
equations as the fields they replaced, (411). We require that the φ's commute
with each other, and that the commutator of a φ with a φ* be the propagator,
(412) and (413).

It is convenient to introduce a second *-algebra, ℬ, generated by I together
with elements A⁺(f), A⁻(f), C⁺(f), and C⁻(f) (as f runs over all test functions),
subject to the relations

(A⁺(f))* = C⁺(f)        (A⁻(f))* = C⁻(f)    (414)
A^±(af + g) = aA^±(f) + A^±(g)        C^±(af + g) = aC^±(f) + C^±(g)    (415)

A^±((□ + μ²)f) = C^±((□ + μ²)f) = 0    (416)
[C⁺(f), C⁺(g)] = [C⁺(f), C⁻(g)] = [C⁻(f), C⁻(g)] = 0
[A⁺(f), A⁺(g)] = [A⁺(f), A⁻(g)] = [A⁻(f), A⁻(g)] = 0
[A⁺(f), C⁻(g)] = [A⁻(f), C⁺(g)] = 0    (417)
[A⁺(f), C⁺(g)] = 2iℏ D⁺(f, g) I        [A⁻(f), C⁻(g)] = 2iℏ D⁻(f, g) I    (418)

where D⁺(f, g) and D⁻(f, g) are the positive-frequency and negative-frequency
propagators. We now set

φ(f) = C⁻(f) + A⁺(f)        φ*(f) = C⁺(f) + A⁻(f)    (419)

whence each element of 𝒜 defines an element of ℬ. Note that the identifications
(419) indeed establish 𝒜 as a *-subalgebra of ℬ, for (414)–(418) imply (410)–
(413). In fact, although ℬ is larger than 𝒜, there is a sense in which ℬ does
not add anything new to 𝒜. Specifically, each element of ℬ can be considered as
a limiting case of elements of 𝒜: A⁺(f), for example, is the positive-frequency
part of φ(f). (Just as a complex-valued solution φ(x) of the Klein-Gordon
equation can be decomposed into its positive-frequency and negative-frequency
parts, so can an operator-valued solution, φ(f), of the Klein-Gordon equation.
It is perhaps not surprising that if one introduces enough machinery, it becomes
possible to describe such a decomposition directly in terms of the *-algebra.)
Why do we introduce two distinct *-algebras when they carry essentially
the same information? Because 𝒜 is easier to motivate while ℬ is easier to
manipulate.
We now construct our inner-product space, K_ℬ. We postulate, first of all,
the existence of an element Ψ₀ (the vacuum, more commonly written |0⟩).
The most general element of K_ℬ is to consist of the juxtaposition of an element
of ℬ and Ψ₀. We add such elements of K_ℬ, and multiply by complex numbers,
by performing the indicated operation on ℬ, e.g.,

(AΨ₀) + α(BΨ₀) = (A + αB)Ψ₀    (420)

where A, B ∈ ℬ, α ∈ ℂ. We wish, however, to impose on these elements a
further relation, namely

A⁺(f)Ψ₀ = 0        A⁻(f)Ψ₀ = 0    (421)

(annihilation on the vacuum gives zero). We now have a complex vector space.
To obtain an inner-product space, we must introduce a norm. To evaluate the
norm of an element of K_ℬ, one first formally takes the inner product of the
element with itself. For A⁺(f)C⁻(g)C⁺(h)Ψ₀, for example, one would write

(A⁺(f)C⁻(g)C⁺(h)Ψ₀, A⁺(f)C⁻(g)C⁺(h)Ψ₀)    (422)
We now set down certain rules for manipulating such expressions. Firstly, an
element of ℬ which appears first on either side of the formal inner product
can be transferred to the other side (where it must also appear first) provided
that, simultaneously, it is replaced by its starred version. (That is, we mimic
the usual rule for transferring an operator to the other side of an inner product.)
For example, (422) can be rewritten

(C⁻(g)C⁺(h)Ψ₀, C⁺(f)A⁺(f)C⁻(g)C⁺(h)Ψ₀)    (423)

or

(A⁻(g)C⁺(f)A⁺(f)C⁻(g)C⁺(h)Ψ₀, C⁺(h)Ψ₀)    (424)
Secondly, one can use the commutation relations (417) and (418). For example,
(423) can be rewritten

(C⁻(g)C⁺(h)Ψ₀, A⁺(f)C⁺(f)C⁻(g)C⁺(h)Ψ₀)
+ (C⁻(g)C⁺(h)Ψ₀, 2iℏ D⁺(f, f)C⁻(g)C⁺(h)Ψ₀)    (425)

Thirdly, one can use (421). Finally, we postulate

(Ψ₀, Ψ₀) = 1    (426)
(the vacuum is normalized to unity). By using these rules, every norm can
be reduced to some number. This is done, roughly speaking, as follows. First
use the commutators to push the annihilation operators to the right until
they stand next to Ψ₀ and hence give zero. There then remain only creation
operators. Each of these, in turn, is transferred to the other side of the inner
product, thus becoming an annihilation operator. Each annihilation operator
obtained in this way is then pushed to the right again, where it eventually
meets Ψ₀ and gives zero. In this way, all the operators are eventually eliminated,
leaving only the functions which appear in the commutators. Now use (426).
As a simple example, we evaluate the norm of C⁺(f)Ψ₀:

(C⁺(f)Ψ₀, C⁺(f)Ψ₀) = (A⁺(f)C⁺(f)Ψ₀, Ψ₀)
= (C⁺(f)A⁺(f)Ψ₀, Ψ₀) + (2iℏ D⁺(f, f)Ψ₀, Ψ₀)
= 0 + 2iℏ D⁺(f, f)(Ψ₀, Ψ₀)
= 2iℏ D⁺(f, f)    (427)
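The reduction procedure just illustrated can be mechanized in a toy model with a single annihilation operator a and creation operator c satisfying [a, c] = κI, aΨ₀ = 0, and (Ψ₀, Ψ₀) = 1; here κ is a numerical stand-in for the commutator coefficient 2iℏD⁺(f, f), and the whole single-operator setup is a deliberate simplification of the algebra in the text:

```python
import math

# Toy model (not the full test-function algebra): a word like "aacc" is the
# operator a a c c applied to the vacuum, rightmost factor acting first.
kappa = 2  # made-up stand-in for the commutator coefficient

def vacuum_expectation(word):
    """Reduce (vacuum, word * vacuum) by pushing a's to the right."""
    i = word.find("ac")
    if i >= 0:
        # a c = c a + kappa*I: split into the commutator term and the
        # reordered term, exactly as in the reduction described above.
        rest = word[:i] + word[i + 2:]
        swapped = word[:i] + "ca" + word[i + 2:]
        return kappa * vacuum_expectation(rest) + vacuum_expectation(swapped)
    # Normal-ordered now (all c's to the left): any surviving operator kills
    # the expectation value, by (421) and its transferred counterpart.
    return 1 if word == "" else 0

# The norm of c^n (vacuum), i.e. the expectation of a^n c^n, comes out as
# n! * kappa^n, generalizing the n = 1 computation in (427).
for n in range(6):
    assert vacuum_expectation("a" * n + "c" * n) == math.factorial(n) * kappa ** n
```

The recursion terminates because each rewrite either shortens the word or moves an a one step to the right, mirroring the "push annihilation operators to the right" prescription.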
Thus, K_ℬ has the structure of an inner-product space. (We shall establish
shortly that the norm, defined above, is indeed positive.) In particular, if we
consider only the elements of K_ℬ which can be obtained by applying elements
of 𝒜 to Ψ₀, we obtain an inner-product subspace, K_𝒜, of K_ℬ. Finally, we take
the completion, K̄_𝒜, of K_𝒜 to obtain a Hilbert space. (In fact, K_𝒜 is dense in
K_ℬ, so K̄_𝒜 = K̄_ℬ.)
All these formal rules and relations sound rather mysterious. It is easy,
however, to see what the resulting Hilbert space is. Consider the symmetric
Fock space based on H_CKG. As we have mentioned, the *-algebras 𝒜 and
ℬ can be represented as operators on this Hilbert space. Consider now the
element (1, 0, 0, . . .) (see (98)) of Fock space. It satisfies (421) and (426). Clearly,
the inner-product space K_𝒜 (resp. K_ℬ) is identical to the inner-product space
consisting of all elements of Fock space which can be obtained by applying
elements of 𝒜 (resp. ℬ) to (1, 0, 0, . . .). Thus, K_𝒜 and K_ℬ can be considered
as subspaces of our Fock space. But in fact, both these subspaces are dense in
Fock space. Hence, K̄_𝒜 and K̄_ℬ are identical with the symmetric Fock space based
on H_CKG. In other words, we have simply re-obtained Fock space by a different
route.
We summarize the situation. In the conventional approach, one begins with
the classical fields, which are replaced by operators (elements of a *-algebra),
subject to certain commutation relations. One then assumes the existence of
a vacuum, and builds an inner-product space by applying the elements of our
*-algebra to the vacuum. The norm is defined by formal manipulative rules,
using the postulated commutators. Finally, one completes this inner-product
space to obtain a Hilbert space. One has the feeling that one is quantizing a
classical theory. We have proceeded in a rather different direction. We looked
ahead to see what the resulting Hilbert space would be, and simply wrote it
out. It turned out to be what we called the symmetric Fock space based on the
Hilbert space H_CKG, which, in turn, was based on the solutions of the original
equation. We then simply defined the action of creation operators, annihilation
operators, and field operators on this explicit Fock space. The resulting
mathematical structures are identical; the methods of deriving this structure are
quite different. We have sacrificed much of the motivation to gain a certain
explicitness.
26 Scattering: The S-Matrix
Our discussion so far has been restricted to free particles. That is to say, we have
been dealing with systems consisting of any number of identical particles which
interact neither with themselves nor with particles of any other type. While such
systems provide a convenient starting point for quantum field theory, they are
by themselves of very little physical interest. Particles in the real world do in-
teract with other particles: electrons have charge, and so interact with photons;
nucleons interact, at least, with π-mesons to form nuclei, etc. Furthermore,
even for systems in which interactions play a minor role, the experiments which
provide information about such systems must come from the interaction with
other systems (i.e., the experimental apparatus). One of the most important
situations, from both the theoretical and experimental points of view, in
which interactions play a significant role is that of scattering. In this section, we
shall set up the general framework for the description of scattering experiments.
We first recall a general principle of quantum theory. Let S_1 and S_2 represent
two quantum systems which, we assume, in no way interact with each other.
Let H_1 be the Hilbert space which encompasses the possible states of S_1, and
H_2 the Hilbert space for S_2. It is because the systems do not interact that each
is characterized by its own Hilbert space. Now suppose we introduce a new
quantum system, S, which consists of S_1 and S_2 together. Note that we are
not here turning on any interactions: we have merely decided to consider two
systems as a single system. What is the Hilbert space of states of the system
S? It is H_1 ⊗ H_2. (Note: the tensor product, not the direct sum.) That is to
say, a state of S can be obtained by taking a formal sum of formal products of
states of S_1 and S_2. (Simple example: if H_1 and H_2 were both one-dimensional,
so S_1 and S_2 each had essentially one state, then S should also have essentially
one state. But in this example H_1 ⊕ H_2 is two-dimensional, whereas H_1 ⊗ H_2 is
one-dimensional.) Note, incidentally, that any operator on H_1 (i.e., which acts
on S_1) extends naturally to an operator on H_1 ⊗ H_2 (i.e., extends to an operator
which acts on S).
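The dimension count in the parenthetical example can be verified mechanically; a small sketch (the vectors here are invented for illustration):

```python
# Illustrative check: for the combined system the right construction is the
# tensor product H1 (x) H2, not the direct sum.  Vectors are lists of amplitudes.

def tensor(u, v):
    """Kronecker (tensor) product of two state vectors."""
    return [x * y for x in u for y in v]

def direct_sum(u, v):
    """Direct-sum vector: the amplitudes simply juxtaposed."""
    return list(u) + list(v)

# One-dimensional H1 and H2: each system has essentially one state.
u, v = [1.0], [1.0]
assert len(tensor(u, v)) == 1       # H1 (x) H2 is one-dimensional, as it should be
assert len(direct_sum(u, v)) == 2   # H1 (+) H2 is two-dimensional: the wrong count

# More generally: dim(H1 (x) H2) = dim H1 * dim H2, dim(H1 (+) H2) = dim H1 + dim H2.
assert len(tensor([1, 0, 0], [1, 0])) == 6
```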
Now suppose we wish to consider a situation in which only certain types
of particles will be permitted to interact: say, electrons, positrons, photons,
and neutral π-mesons. We begin by writing down the Hilbert space H which
encompasses the states of such a system when the interactions are turned off.
That is to say, we imagine a system in which our various particles co-exist but
do not interact, and describe its states by H. In our example, H would be the
tensor product of the antisymmetric Fock space based on H_D, the symmetric
Fock space based on H_M, and the symmetric Fock space based on H_RKG. Note
that this is a purely mathematical construct. In the real world such particles
would interact: we do not have the option of turning off interactions to suit our
convenience.
It is in terms of this H that scattering processes are described. We consider
the following situation. In the distant past, our particles are represented by
broad and widely separated wave packets. These particles then enter a region
in which the amplitudes are large and the wave packets overlap significantly.
They interact. Finally, we suppose that, in the distant future, all the particles
are again widely separated. We wish to describe such a process as follows. It is
perhaps not unreasonable to suppose that, as one goes further and further into
the past, the interactions play a smaller and smaller role. Thus, in the limit
as one goes into the past, it might be possible to describe the system by an
element of our non-interacting Hilbert space H. That is, our incoming state is
to be some element of H. Similarly, in the limit as one goes into the future, i.e.,
for the outgoing state, one obtains some other element of H. It is only in these
past and future limits that H provides an accurate description of the state of
the system. In any actual finite region of Minkowski space, and particularly
where the interactions are strong, a description in terms of H is impossible.
We therefore simply abandon, for the time being, any attempt to describe in
detail what is happening while the interaction takes place. We agree that all we
care to know about the interaction is simply the relation between the incoming
state and the outgoing state, both elements of H. This relation is given by
some mapping S from H to H.
We illustrate this idea with a classical analogy. (Caution: This analogy can
be misleading if pushed too far.) Suppose we are interested in solutions of

(□ + μ²)φ = φ³    (428)

Let L denote the collection of solutions of this equation which are, in some
suitable sense, well-behaved asymptotically. (Note that L is not a vector space.
It is analogous to the states of the interacting quantum system, which do form
a vector space.) Let H denote the collection of asymptotically well-behaved
solutions of the Klein-Gordon equation. Fix a time-coordinate t in Minkowski
space (i.e., ∇_a t is constant, unit, and timelike). For each value of t, we define
a mapping, Ω(t), from L to H. Fix t_0. Then, if φ(x) is a solution of (428),
the values of φ and (∇_a t)∇^a φ on the 3-surface t = t_0 are initial data for
some solution of the Klein-Gordon equation, which we write Ω(t_0)φ. Clearly,
this mapping Ω(t) is invertible. We now ask whether the right side of

S = lim_{t_2 → ∞, t_1 → −∞} Ω(t_2) Ω(t_1)⁻¹    (429)

exists and is independent of our original choice of time-coordinate. If so, we
obtain a mapping S from H to H. This S clearly provides a great deal of
information about the structure of Eqn. (428).
We now return to quantum field theory. All the information we want about
the interactions is to be contained in the mapping S, called the S-matrix, from
the non-interacting Hilbert space to itself. One could, of course, merely deter-
mine S, as best as one can, from experiments. But this would hardly represent a
physical theory. Ultimately, we shall be concerned with the problem of calculat-
ing S from specific assumptions concerning the nature of the interaction. It is of
interest, however, to first ask whether there are any very general properties of S
which one might expect to hold merely from its physical interpretation. In fact,
there are two such properties. The first is that S is an operator on H, i.e., S
is linear. I do not know of a water-tight physical argument for this assumption.
It is, however, suggested by the principle of superposability in quantum theory.
Let ψ_1 and ψ_2 be unit, orthogonal elements of H. Then ψ = (ψ_1 + ψ_2)/√2 is
also a unit vector in H. A system whose incoming state is ψ has probability
1/2 that its incoming state is ψ_1, and probability 1/2 that its incoming state is
ψ_2. Hence, we might expect the corresponding outgoing state, S(ψ), to be the
same linear combination of S(ψ_1) and S(ψ_2), i.e., we might expect to have

S(ψ) = (1/√2)(S(ψ_1) + S(ψ_2))    (430)

These considerations strongly suggest the assumption we now make: that S is
a linear operator on H. The second property of S follows from the probabilistic
interpretation of states in quantum theory. Let ψ be a unit vector in H. Then,
if we write H as a direct sum of certain of its orthogonal subspaces, the sum
of the norms of the projections of ψ into these subspaces is one. This fact is
interpreted as meaning that the total probability of the system's being found
in one of these subspaces is one. But if this is our incoming state, the total
probability for all possible outgoing states must also be one. Hence, we might
expect to have

|S(ψ)| = 1    (431)

provided |ψ| = 1. In other words, we expect S to be a unitary operator on H.
To summarize, the probabilities for all possible outcomes of all possible scat-
tering experiments (involving a certain, given list of particles) are specified com-
pletely by a unitary operator S on the non-interacting Hilbert space H. We want
to find this S.
The S-matrix approach to scattering problems involves a number of physical
assumptions. Among these are the following:
1. In the limit to the distant past (and distant future), the interactions have
negligible influence, so the state of the actual physical system can be
associated, in these limits, with elements of the non-interacting Hilbert
space, H.
2. The interaction is completely described by the S-matrix (e.g., there are
no bound states.)
3. One can find a short list of particles such that, if only particles which appear
on this list are involved in the incoming state, then all outgoing particles
will also appear on the list.
In fact, all of these assumptions are believed to be false:
1. Even in the distant past and future, particles carry a cloud of virtual
particles which affect, for example, the observed mass. Thus, the in-
teractions are important even in the limits. It appears, however, that
these effects can be accounted for by suitably modifying the parameters
(e.g., mass) which appear in the non-interacting Hilbert space H. This
procedure is associated with what is called renormalization.
2. There exist bound states, e.g., the hydrogen atom.
3. Suppose we decide that we shall allow incoming states containing only
photons and electron-positrons. Then, if the energies are sufficiently large,
the outgoing states could certainly include π-mesons, proton-antiproton
pairs, etc. The problem is that we do not have an exhaustive list of all
elementary particles, and so we cannot write down the final H. We
are forced to proceed by a series of approximations. In certain situations,
interactions which produce elementary particles not included in our H
will not play an important role. Thus, we can closely describe the phys-
ical situation by using only one or two of the many interactions between
elementary particles. Whenever we write down an H and an S, we have
a physical theory with only a limited range of validity. The larger our H,
and the more interactions included, the larger is the domain of validity of
our theory.
Despite these objections, we shall, as our starting point for the discussion of
interactions, use the S-matrix approach.
27 The Hilbert Space of Interacting States
We have seen in Sect. 26 that scattering phenomena are completely described by
a certain unitary operator S on a Hilbert space H. We also remarked that, since
H represents noninteracting states, and since the states of the actual physical
system are influenced by interactions, we cannot interpret H as encompassing
the states of our system. Is it possible, then, to construct a Hilbert space L
which does represent the states of the interacting system? The answer is yes (at
least, for scattering states), provided we accept a sufficiently loose interpretation
for the word construct.
What features would we expect for a Hilbert space which is to represent
the interacting states of the system? Firstly, comparing L and H in the
distant past, we might expect to have an isomorphism Ω_in from L to H.
(See the example on p. 92.) (An isomorphism between two Hilbert
spaces is a mapping from one to the other which is one-to-one, onto, linear,
and norm-preserving. Clearly, any isomorphism has an inverse, which is itself
an isomorphism.) Similarly, we would expect to have a second isomorphism
Ω_out : L → H. Finally, from the definition of the S-matrix, we would expect to
have

S = Ω_out ∘ Ω_in⁻¹    (432)
Fix H and S. A triple, (L, Ω_in, Ω_out), consisting of a Hilbert space L and two
isomorphisms, Ω_in and Ω_out, from L to H, subject to (432), will be called an
interaction space. How many essentially different interaction spaces are there for
a given H, S? In fact, there is just one, in the following sense: Let (L′, Ω′_in, Ω′_out)
and (L, Ω_in, Ω_out) be two interaction spaces for H, S. Then there exists a unique
isomorphism α from L to L′ such that

Ω′_in ∘ α = Ω_in        Ω′_out ∘ α = Ω_out    (433)

(Proof: Evidently, we must choose α = (Ω′_in)⁻¹ ∘ Ω_in, which, by (432), is the same
as α = (Ω′_out)⁻¹ ∘ Ω_out.) That is to say, the interaction space is unique up to
isomorphism (which, of course, is as unique as we could expect it to be.)
All this looks rather pedagogical. After all, L is just another copy of H, so
why don't we just say so instead of speaking of triples, etc.? The point is that
the interaction space is more than just another copy of H: it also contains,
as part of its structure, certain isomorphisms from L to H. As a consequence,
only certain portions of the (extensive) structure on H can be carried over, in
a natural way, to L. Examples of structure on H which arise from the way
in which H was constructed are:
1. The total charge operator on H.
2. The total number of photons operator on H (if, say, H happens to
include photons).
3. The projection operator onto photon states (eliminating all other types of
particles).
4. The operators which arise from the action of the restricted Poincare group
on H.
5. The number of baryons minus number of anti-baryons operator.
6. The creation and annihilation operators of various particles in various
states.
7. The field operators on H.
The important point is that, in every case, the additional structure on H can
be described by giving an operator on H. The question of transferring structure
from H to L reduces, therefore, to the following: under what conditions does
an operator A on H define, in a natural way, an operator on L? In fact, given
an operator A on H, there are two natural operators on L, namely,
Ω_in⁻¹ A Ω_in        Ω_out⁻¹ A Ω_out    (434)

In other words, we can carry A from H to L via either of the isomorphisms
Ω_in or Ω_out. Which of (434) should we choose? There would be no choice if these
two operators were equal. Thus, an operator A on H leads to a unique operator
on the interaction space L provided A is such that the two operators (434) are
equal. Using (432), this is equivalent to the condition

[S, A] = 0    (435)
This, of course, is the result we expect. It is only properties of H which are in-
variant under the interaction (i.e., operators which commute with the S-matrix)
which lead unambiguously to properties of the interaction states.
To summarize, structure on the interaction space is obtained from operators
on H which commute with the S-matrix.
The operators which commute with S characterize what are called conserved
quantities. They include charge, baryon number, lepton number, momentum,
angular momentum, etc. Operators such as 2, 3, 6, and 7 above will not com-
mute with S. They describe quantities which are not conserved in interactions.
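The criterion (435) can be seen in a finite-dimensional sketch (a hypothetical 2×2 model, not from the notes): take L = H = ℂ² with Ω_in = I and Ω_out = S, so that (432) holds trivially, and compare the two transfers (434):

```python
# Hypothetical 2x2 example: an operator A on H transfers unambiguously
# to the interaction space exactly when [S, A] = 0.

def mul(a, b):
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

S    = ((1, 0), (0, 1j))          # an invented unitary S-matrix (diagonal phases)
Sinv = ((1, 0), (0, -1j))         # its inverse

def transfer_via_in(a):
    return a                       # Omega_in = I, so Omega_in^{-1} A Omega_in = A

def transfer_via_out(a):
    return mul(Sinv, mul(a, S))    # Omega_out = S, so this is S^{-1} A S

A_conserved = ((2, 0), (0, 3))     # commutes with S: a "conserved quantity"
A_other     = ((0, 1), (1, 0))     # does not commute with S

assert mul(S, A_conserved) == mul(A_conserved, S)
assert transfer_via_in(A_conserved) == transfer_via_out(A_conserved)

assert mul(S, A_other) != mul(A_other, S)
assert transfer_via_in(A_other) != transfer_via_out(A_other)
```

For commuting A the two candidate operators in (434) coincide, so the transfer is well defined; for the non-commuting A they disagree, reflecting a quantity not conserved by the interaction.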
28 Calculating the S-Matrix: An Example
We shall soon begin writing down formulae for the S-matrix. Unfortunately,
these formulae are rather complicated. They contain large numbers of terms,
sums and integrals whose convergence is doubtful, and symbols whose precise
meaning is rather obscure. We wish to avoid encountering all of these problems
simultaneously. It is convenient, therefore, to first study a simpler example:
a problem in which some of the features of the S-matrix formulae are exhibited,
and in which some, but only some, of the difficulties are seen. We discuss such
an example in the present section.
Let H be a fixed Hilbert space. Let K(t) be a one-parameter family of
bounded operators defined everywhere on H. That is, for each real number t,
K(t) is an operator on H. Suppose furthermore that K(t) = 0 unless t is in
some finite interval. That is, suppose that there are numbers t_i < t_f such that
K(t) = 0 for t ≥ t_f and t ≤ t_i. We are interested in studying curves in H, i.e.,
one-parameter families ψ(t) of elements of H, which satisfy the equation

iℏ (d/dt) ψ(t) = K(t) ψ(t)    (436)
(Note: Derivatives and integrals of one-parameter families of elements of a
Hilbert space, and operators on a Hilbert space, are defined by the usual limit-
ing procedure.) Let ψ(t) satisfy (436). Then, since K(t) = 0 for t ≤ t_i, ψ(t) is
a constant element of H, ψ_i, for t ≤ t_i. Similarly, ψ(t) = ψ_f for t ≥ t_f. Clearly,
a solution of (436) is completely and uniquely determined by ψ_i, and ψ_f is a
linear function of ψ_i. We write

ψ_f = S ψ_i    (437)

where S is some operator on H. The problem is to find an expression for S in
terms of K(t).
We first consider a special case in which the solution is easy. Suppose that
all the K(t)'s commute with each other, i.e., [K(t), K(t′)] = 0 for all t and t′.
Then

ψ(t) = exp[(1/iℏ) ∫_{t_i}^{t} dτ K(τ)] ψ_i    (438)
is clearly a solution of (436). (Note: If A(t) is a one-parameter (differentiable)
family of operators on H, then (d/dt) exp A(t) = (exp A(t)) (d/dt)A(t) only when A(t)
and (d/dt)A(t) commute.) Therefore,

S = exp[(1/iℏ) ∫_{t_i}^{t_f} dτ K(τ)]
  = I + (1/iℏ) ∫_{t_i}^{t_f} dτ K(τ) + (1/2!) [(1/iℏ) ∫_{t_i}^{t_f} dτ K(τ)]²
    + (1/3!) [(1/iℏ) ∫_{t_i}^{t_f} dτ K(τ)]³ + ⋯    (439)

(The exponential of an operator is, of course, defined by the second equality
in (439). We ignore, for the time being, questions of the convergence of such
series.) Thus, when the K(t)'s commute, S is given by the relatively simple
expression (439).
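Since scalar operators automatically commute, a one-dimensional model can be used to check (438) and (439) numerically; in this sketch (ℏ = 1, with an invented interaction profile k(t) vanishing outside [0, 1]) the integrated solution agrees with the exponential formula and is unitary:

```python
import cmath, math

# Scalar check of the commuting case, hbar = 1: solve i dpsi/dt = k(t) psi
# and compare with psi_f = exp((1/i) * integral of k) psi_i.

def k(t):
    """Invented interaction profile: nonzero only on [0, 1]."""
    return math.sin(math.pi * t) if 0.0 <= t <= 1.0 else 0.0

def evolve(psi_i, t_i=-0.5, t_f=1.5, n=20000):
    """Fourth-order Runge-Kutta integration of i dpsi/dt = k(t) psi."""
    dt, psi, t = (t_f - t_i) / n, psi_i, t_i
    f = lambda t, p: -1j * k(t) * p
    for _ in range(n):
        k1 = f(t, psi)
        k2 = f(t + dt / 2, psi + dt / 2 * k1)
        k3 = f(t + dt / 2, psi + dt / 2 * k2)
        k4 = f(t + dt, psi + dt * k3)
        psi += dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += dt
    return psi

psi_i = 1.0 + 0.0j
predicted = cmath.exp(-1j * (2.0 / math.pi)) * psi_i  # integral of sin(pi*t) on [0,1] is 2/pi
assert abs(evolve(psi_i) - predicted) < 1e-8
assert abs(abs(evolve(psi_i)) - 1.0) < 1e-8           # unitarity: |psi_f| = |psi_i|
```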
Now suppose that the K(t)'s do not commute. Integrating (436), we rewrite
it as an integral equation:

ψ(t) = ψ_i + (1/iℏ) ∫_{t_i}^{t} dτ K(τ) ψ(τ)    (440)
We shall solve (440), at least formally, using a sequence of approximations. We
begin with a trial solution, ψ_0(t). We substitute this ψ_0(t) into the right side
of (440), and denote the result by ψ_1(t). We now take ψ_1(t) as our next trial
solution, substituting it into the right side of (440) to obtain ψ_2(t), etc. Thus,
the general formula for passing from one trial solution to the next is

ψ_{n+1}(t) = ψ_i + (1/iℏ) ∫_{t_i}^{t} dτ K(τ) ψ_n(τ)    (441)
As our initial trial solution, we take ψ_0(t) = ψ_i, a constant. The hope is that
the resulting ψ_n(t) will, as n → ∞, converge, in a suitable sense, to a solution
ψ(t) of (436). Using (441) successively, we have

ψ_1(t) = ψ_i + (1/iℏ) ∫_{t_i}^{t} dτ_1 K(τ_1) ψ_i

ψ_2(t) = ψ_i + (1/iℏ) ∫_{t_i}^{t} dτ_1 K(τ_1) ψ_i
  + (1/iℏ)² ∫_{t_i}^{t} dτ_1 ∫_{t_i}^{τ_1} dτ_2 K(τ_1) K(τ_2) ψ_i    (442)
or, more generally,

ψ_n(t) = [ Σ_{m=0}^{n} (1/iℏ)^m ∫_{t_i}^{t} dτ_1 ∫_{t_i}^{τ_1} dτ_2 ⋯ ∫_{t_i}^{τ_{m−1}} dτ_m K(τ_1) ⋯ K(τ_m) ] ψ_i    (443)
Thus, our formal limiting solution is

ψ(t) = [ Σ_{m=0}^{∞} (1/iℏ)^m ∫_{t_i}^{t} dτ_1 ∫_{t_i}^{τ_1} dτ_2 ⋯ ∫_{t_i}^{τ_{m−1}} dτ_m K(τ_1) ⋯ K(τ_m) ] ψ_i    (444)

Indeed, if we substitute (444) into (436), and ignore questions of convergence of
sums and the validity of interchanging the order of differentiation and summa-
tion, we obtain an identity. Thus, our formal expression for S is

S = Σ_{m=0}^{∞} (1/iℏ)^m ∫_{t_i}^{t_f} dτ_1 ∫_{t_i}^{τ_1} dτ_2 ⋯ ∫_{t_i}^{τ_{m−1}} dτ_m K(τ_1) ⋯ K(τ_m)    (445)
It is convenient to recast (445) into a form which more closely resembles
(439). The idea is to eliminate integrals whose limits of integration lie between
t_i and t_f, i.e., to have all integrals be over the full range from t_i to t_f. Explicitly,
the first few terms of (445) are

S = I + (1/iℏ) ∫_{t_i}^{t_f} dτ_1 K(τ_1) + (1/iℏ)² ∫_{t_i}^{t_f} dτ_1 ∫_{t_i}^{τ_1} dτ_2 K(τ_1) K(τ_2)
  + (1/iℏ)³ ∫_{t_i}^{t_f} dτ_1 ∫_{t_i}^{τ_1} dτ_2 ∫_{t_i}^{τ_2} dτ_3 K(τ_1) K(τ_2) K(τ_3) + ⋯    (446)
The first two terms on the right in (446) are already in the desired form. How-
ever, the third term on the right,

(1/iℏ)² ∫_{t_i}^{t_f} dτ_1 ∫_{t_i}^{τ_1} dτ_2 K(τ_1) K(τ_2)    (447)

is not. The region of integration in (447) is shown in the figure. The idea is
to reverse the orders of the two integrations, while keeping the actual region
over which the integration is performed (the shaded region in Figure 4)
unchanged. Thus, the expression (447) is equal to
(1/iℏ)² ∫_{t_i}^{t_f} dτ_2 ∫_{τ_2}^{t_f} dτ_1 K(τ_1) K(τ_2)    (448)
Figure 4: Region of integration in (447).
We next reverse the roles of the integration variables, τ_1 and τ_2, in (448) to
obtain

(1/iℏ)² ∫_{t_i}^{t_f} dτ_1 ∫_{τ_1}^{t_f} dτ_2 K(τ_2) K(τ_1)    (449)
Finally, adding (447) and (449), we find that (447) is equal to

(1/2) (1/iℏ)² ∫_{t_i}^{t_f} dτ_1 ∫_{t_i}^{t_f} dτ_2 T[K(τ_1), K(τ_2)]    (450)
where we have defined

T[K(τ_1), K(τ_2)] = { K(τ_1) K(τ_2)  if τ_1 ≥ τ_2
                      K(τ_2) K(τ_1)  if τ_2 ≥ τ_1 }    (451)
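The step from (447) to (450) can be spot-checked on a grid; the following sketch (the grid size and the matrix family K(τ) are invented for illustration) compares the nested integral with half the time-ordered double integral for a non-commuting family of 2×2 matrices:

```python
# Discretized check: integral of K(tau_1)K(tau_2) over tau_1 >= tau_2
# approaches half the time-ordered double integral over the full square.

def K(t):
    return ((0.0, t), (1.0, 0.0))   # a family of matrices that do not commute

def mul(a, b):
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def add(a, b):
    return tuple(tuple(a[i][j] + b[i][j] for j in range(2)) for i in range(2))

def scale(c, a):
    return tuple(tuple(c * a[i][j] for j in range(2)) for i in range(2))

n, t_i, t_f = 200, 0.0, 1.0
dt = (t_f - t_i) / n
taus = [t_i + (p + 0.5) * dt for p in range(n)]   # midpoint sample points

nested = ((0.0, 0.0), (0.0, 0.0))   # Riemann sum for the nested integral (447)
full_T = ((0.0, 0.0), (0.0, 0.0))   # half the time-ordered integral, as in (450)
for p, t1 in enumerate(taus):
    for q, t2 in enumerate(taus):
        ordered = mul(K(t1), K(t2)) if t1 >= t2 else mul(K(t2), K(t1))
        full_T = add(full_T, scale(0.5 * dt * dt, ordered))
        if q <= p:
            nested = add(nested, scale(dt * dt, mul(K(t1), K(t2))))

diff = max(abs(nested[i][j] - full_T[i][j]) for i in range(2) for j in range(2))
assert diff < 5e-3   # the two sums agree up to the O(dt) diagonal contribution
```

The small residual comes entirely from the diagonal τ_1 = τ_2, which has measure zero in the continuum limit.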
A similar procedure can be applied to each successive term in (446). The n-th
term is equal to

(1/n!) (1/iℏ)ⁿ ∫_{t_i}^{t_f} dτ_1 ⋯ ∫_{t_i}^{t_f} dτ_n T[K(τ_1), . . . , K(τ_n)]    (452)

where T[K(τ_1), K(τ_2), . . . , K(τ_n)] is defined to be the product of these operators,
but arranged in the order in which the operator associated with the smallest τ_m
is placed on the right, the operator associated with the next-smallest τ_m
is placed next, etc. This T[K(τ_1), K(τ_2), . . . , K(τ_n)] is called the time-ordered
product. Thus, our final formal expression for S is

S = Σ_{n=0}^{∞} (1/n!) (1/iℏ)ⁿ ∫_{t_i}^{t_f} dτ_1 ∫_{t_i}^{t_f} dτ_2 ⋯ ∫_{t_i}^{t_f} dτ_n T[K(τ_1), . . . , K(τ_n)]    (453)

Note that, if all the K(τ)'s commute, then the time-ordering is irrelevant, and
(453) reduces to (439).
To summarize, the only modification of (439) required when the operators
do not commute is that the products of operators in the multiple integrals must
be time-ordered.
Finally, note that if all the K(τ)'s are Hermitian, then, from (436),

(d/dt)(ψ(t), ψ(t)) = ((d/dt)ψ, ψ) + (ψ, (d/dt)ψ)
= ((1/iℏ) Kψ, ψ) + (ψ, (1/iℏ) Kψ)
= (i/ℏ) [(Kψ, ψ) − (ψ, Kψ)] = 0    (454)

Hence, S must be a unitary operator. Unitarity is obvious in (439), rather
less so in (453).
Our formulae for the S-matrix in field theory will also involve infinite series
of multiple integrals of time-ordered products of Hermitian operators.
29 The Formula for the S-Matrix
In this section we shall write down the formula for the S-matrix in terms of a certain (as yet unspecified) operator field on Minkowski space. While we shall in no sense derive our expression for S, it will be possible, at least, to show that the formula is a reasonable guess. We rely heavily on the discussion in Sect. 28.
Consider Eqn. (453). We wish to write down an analogous formula for S in quantum field theory. Since, first of all, S is to be an operator on the non-interacting Hilbert space H, we must take the K's to be operators on H. Secondly, S should be unitary: we therefore take the K's to be Hermitian. By what should we replace the interaction variables, the τ's in Eqn. (453)? If we think of τ in Sect. 28 as representing a time, then a natural replacement would be position x in Minkowski space-time. The integrals in (453) would then extend over all of Minkowski space. (Note: This is the reason why it was convenient, in Sect. 28, to obtain an expression in which the integrals extended over the entire t-range from t_i to t_f.) Thus, we are led to suppose that the interaction is described by a certain Hermitian operator field, K(x), which depends on position x in
Minkowski space, and acts on H. The S-matrix will then be given by the formula

S = Σ_{n=0}^∞ (1/n!) (−i/ℏ)^n ∫ dV_1 ⋯ ∫ dV_n T[K(x_1), . . . , K(x_n)] (455)
where x_1, x_2, . . . represent points in Minkowski space, dV_1, dV_2, . . . the corresponding volume elements in Minkowski space, and all integrals are over all of Minkowski space. Note that Eqn. (453) already suffers from one difficulty: the question of the convergence of the infinite series. In writing (455) we have retained that difficulty, and, in fact, introduced a second one: the question of the convergence of the integrals. (The integrals in (453) are all over compact regions.)
In fact, there is a second problem with (455) which was not encountered in (453), namely, the question of what the time-ordering operator T is to mean in (455). In (453), T means that the K(τ) operators are to be placed in the order of decreasing τ-values. Unfortunately, in the passage from a time τ to position x in Minkowski space-time, the natural ordering is destroyed. There is, however, one case in which points in Minkowski space can be ordered: we agree that x_2 exceeds x_1, x_2 > x_1, if x_2 − x_1 (i.e., the position vector of x_2 relative to x_1) is future-directed and timelike or null. Hence, for the region of integration in (455) for which all the x_1, . . . , x_n can be ordered in this way, T has a well-defined meaning, and hence the integral makes sense. (More explicitly, the τ's are totally ordered, while points in (time-oriented) Minkowski space are only partially ordered.) Clearly, there is no Poincare-invariant way to time-order K(x_1) and K(x_2) when x_2 − x_1 is spacelike. How, then, are we to give a meaning to (455)? One way of doing this would be through an additional condition on the K(x). We simply assume that the ordering of K(x_1) and K(x_2) is irrelevant when x_2 − x_1 is spacelike. That is to say, a natural way of forcing a meaning for (455) is to assume that the K(x) have the property

[K(x), K(x′)] = 0 for x − x′ spacelike (456)
We include (456) as a requirement on our K(x). Thus, if x_1, x_2, . . . , x_n are points in Minkowski space, we define T[K(x_1), K(x_2), . . . , K(x_n)] to be the product of these operators, placed in an order such that, if x_i − x_j is timelike or null and future-directed, then K(x_i) appears before K(x_j) in the product. Clearly, there always exists at least one ordering having this property. Furthermore, (456) implies that all such orderings yield the same operator. Thus, T[K(x_1), . . . , K(x_n)] is a well-defined operator.
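The ordering prescription just described is easy to make concrete. The sketch below (the helper names are mine; 1+1-dimensional points are used only for brevity) implements the causal partial order and produces one admissible total ordering by sorting on the time coordinate. By (456), any two admissible orderings would give the same operator product.

```python
import numpy as np

def causally_precedes(x, y):
    # True if y - x is future-directed and timelike or null
    # (signature (+, -): x[0] is the time component)
    d = np.asarray(y, float) - np.asarray(x, float)
    return d[0] > 0 and d[0] ** 2 - np.dot(d[1:], d[1:]) >= 0

def time_order(points):
    # Any total order extending the causal partial order is admissible;
    # sorting by the time coordinate is one such extension, since
    # causally_precedes(x, y) implies x[0] < y[0].
    return sorted(range(len(points)), key=lambda i: points[i][0])

# indices from earliest to latest; the operators K(x_i) would then be
# multiplied with the latest index on the left
pts = [(0.0, 0.0), (2.0, 1.0), (1.0, 5.0)]  # third is spacelike to both
order = time_order(pts)
for a in range(len(order)):
    for b in range(a + 1, len(order)):
        # no later element of the ordering may causally precede an earlier one
        assert not causally_precedes(pts[order[b]], pts[order[a]])
```

Here the third point is spacelike-separated from the other two, so it may legally be placed in several positions; the sort simply picks one of them.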
To summarize, the interaction is to be described by giving a Hermitian operator field K(x) on H which satisfies (456). The S-matrix is to be expressed in terms of K(x) by (455). The formal expression (455) is unsatisfactory insofar as we have investigated the convergence of neither the infinite series nor the integrals themselves.
The standard textbooks give a more detailed, but, I feel, no more satisfactory, argument for (455). One chooses a time-coordinate t in Minkowski space, and writes

H(t) = ∫_{t=const.} K(x) (457)

for the Hamiltonian. One then imagines a state vector in the Hilbert space H, ψ(t), which depends on the time-coordinate. One writes the Schrödinger equation,

iℏ (d/dt) ψ(t) = H(t) ψ(t) (458)

The argument of Sect. 28 then yields (455), where T refers to the ordering induced by the time-coordinate t. Finally, Poincare invariance (i.e., the condition that S be independent of our original choice of t) requires (456). This argument assumes, implicitly, that the states during the interaction are described by H. We have deviated only slightly from this conventional argument. We isolated the simplest and clearest part in Sect. 28, and guessed the rest.
30 Dimensions
With each of the various fields and operators we have introduced, there is associated a corresponding physical dimension. We shall determine these dimensions in the present section.
Recall that we have set the speed of light equal to one, so length and time have the same units. (E.g., we measure distance in light-seconds.) We may therefore take as our fundamental units a mass (m) and a time (t). Then Planck's constant ℏ has dimensions mt. We assign to position vectors in Minkowski space dimensions t, so the derivative in Minkowski space has dimensions t^(−1), and the wave operator dimensions t^(−2). (Raising, lowering, and contracting indices does not affect dimensions.) The quantity μ which appears in the Klein-Gordon and Dirac equations therefore has dimensions t^(−1). Position vectors in momentum space have dimensions t^(−1). Finally, the volume element on the mass shell has dimensions t^(−2) (see (14)).
The rule for determining the dimensions to be associated with a classical field is the following: consider an element of the Hilbert space which has norm unity, and work back to determine the dimensions of the corresponding field. Consider first the (real or complex) Klein-Gordon case. A unit vector in the Hilbert space is represented by a function φ(k) on M_μ which satisfies

(1/ℏ) ∫_{M_μ} φ̄(k) φ(k) dV_μ = 1 (459)

Therefore, φ(k) has dimensions m^(1/2) t^(3/2). But

φ(x) = ∫ φ(k) e^(i k_b x^b) dV (460)

and so the Klein-Gordon field has dimensions m^(1/2) t^(−1/2)
. For the Dirac case, Eqn. (388) implies that (ξ_A(k), η^(A′)(k)) has dimensions t^(1/2). Then (309) and (310) imply that (ξ_A(x), η^(A′)(x)) has dimensions t^(−3/2). The neutrino fields have the same dimensions. Finally, for the Maxwell case, (382) implies that A_a(k) has dimensions m^(1/2) t^(3/2), whence, from (178), A_a(x) has dimensions m^(1/2) t^(−1/2).
(In the Maxwell case, one has a simple independent check on the dimensions. The dimensions above for the vector potential imply that electric and magnetic fields have dimensions m^(1/2) t^(−3/2), which agrees, of course, with the dimensions from classical electrodynamics.)
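Dimension-counting of this sort is mechanical, and can be encoded by representing each dimension as a pair of exponents (of m and of t); multiplying quantities then adds exponents. A small sketch (the variable names are mine, and the t^(−2) dimension of the mass-shell volume element is taken as input) reproduces the Klein-Gordon bookkeeping above.

```python
from fractions import Fraction as F

def mul(*dims):
    # multiplying physical quantities adds their (mass, time) exponents
    return (sum(d[0] for d in dims), sum(d[1] for d in dims))

hbar  = (F(1), F(1))          # m t
dV_mu = (F(0), F(-2))         # volume element on the mass shell
phi_k = (F(1, 2), F(3, 2))    # from (1/hbar) integral |phi|^2 dV = 1

# |phi(k)|^2 dV must have the dimensions of hbar:
assert mul(phi_k, phi_k, dV_mu) == hbar

# phi(x) = integral of phi(k) e^{ik.x} dV  =>  m^(1/2) t^(-1/2)
phi_x = mul(phi_k, dV_mu)
assert phi_x == (F(1, 2), F(-1, 2))
```

The same two-exponent scheme handles the Dirac and Maxwell entries, since with c = 1 a mass and a time suffice as fundamental units.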
The other quantities whose dimensions are of particular interest are the field operators. To assign dimensions to these operators, we must decide what dimensions are to be associated with the test fields. Recall that the role of the test fields is to smear out the (undefined) operators associated with points of Minkowski space, e.g.,

φ(f) = ∫ f(x) φ(x) dV (461)

Thus, we can think of a test field as a smearing density. We therefore take all test fields to have dimensions t^(−4). (That is, we require that φ(f) and the (undefined) φ(x) have the same dimensions.) With this convention, the determination of the dimensions of the field operators is straightforward. Consider first the Klein-Gordon case. If f(x) is a test field (dimensions t^(−4)), then, from

f(x) = ∫ f(k) e^(i k_b x^b) dV (462)

f(k) is dimensionless. But a dimensionless element of our Hilbert space, (459), defines a φ(k) with dimensions m^(1/2) t^(3/2). Therefore, the element of our Hilbert space associated with this f(k), σ(f), has dimensions m^(−1/2) t^(−3/2). Thus, the creation and annihilation operators have dimensions m^(−1/2) t^(−3/2). But

φ(f) = C(σ(f)) + A(σ(f)) (463)

(say, for real Klein-Gordon fields), and so the field operators, φ(f), have dimensions m^(1/2) t^(−1/2)
. In the Dirac case, the test fields (f_A(x), f_(A′)(x)) have dimensions t^(−4), whence f_A(k) is dimensionless. The corresponding pair of functions on the mass shell, (389), therefore has dimensions t^(−1). Hence, the corresponding elements of our Hilbert space, σ(f_A, f_(A′)), have dimensions t^(−3/2). This, then, is the dimension of the creation and annihilation operators. Finally, from (390), the field operators in the Dirac case have dimensions t^(−3/2). Similarly, in the Maxwell case, f_a(x) has dimensions t^(−4), f_a(k) is dimensionless, and σ(f_a) has dimensions m^(−1/2) t^(−3/2). The creation and annihilation operators therefore have dimensions m^(−1/2) t^(−3/2), and so, by (383), the Maxwell field operators have dimensions m^(1/2) t^(−1/2).
Note that, in every case, the classical fields and the corresponding field operators have the same dimensions.
31 Charge Reversal
An important tool, both for constructing and analyzing interactions, is the discrete symmetries: charge, parity, and time reversal. We begin with charge reversal.
Let H be a non-interacting Hilbert space, and suppose we are given an operator S, the S-matrix, on H. What does it mean to say that the interaction described by S is invariant under charge reversal? Roughly speaking, this means that if we replace any in-state, ψ ∈ H, by the same state, but with all particles replaced by their antiparticles, then the corresponding out-states are again the same, but with particles replaced by antiparticles. Thus, we are led to try to give a meaning to the notion "the same state (in H), but with particles replaced by antiparticles." Let us suppose that this mapping from non-interacting states to non-interacting states is accomplished by some operator C on H, so that Cψ represents the same state as ψ, but with particles replaced by antiparticles. Then the statement that the interaction (described by S) is invariant under charge reversal reduces to the condition

S C ψ = C S ψ (464)

for any ψ ∈ H. In other words, invariance under charge reversal is expressed mathematically by the condition that S and C (both operators on H) commute.
In general, there will be a number of different operators C which could be interpreted as effecting the replacement of particles by antiparticles. There is no obvious, unambiguous way of translating this physical notion into a mathematical operator. We shall therefore proceed as follows. We first write down a list of properties which reflect the intuitive idea of replacing particles by antiparticles, but not otherwise changing the state. In general, there will be a moderately large class of C's which satisfy these criteria. Then, for each interaction, we look for an operator C which satisfies our criteria and which, in addition, commutes with the S-matrix. If such a C exists, we say that our interaction is invariant under charge-reversal. The point is that any operator which commutes with the S-matrix is valuable. We regard the words "charge reversal" as merely suggesting a particularly fertile area in which such operators might be found. This philosophy is important:

i) there is no natural, a priori charge-reversal operator;
ii) one sets up a class of possible charge-reversal operators, and then selects from this class depending on what the interaction is;
iii) if no operator in this class commutes with the S-matrix, there is little point in considering charge-reversal for that interaction.

(The third point is somewhat over-stated, and will be modified slightly later.)
The first condition on C is that it should not mix up particles of different types. That is to say, C should commute with the total number operators on each of the Fock spaces which make up H. Therefore, C can be considered
as an operator on each of the Fock spaces separately. The assumption that C commutes with the total number operators implies, furthermore, that C can be decomposed into an operator on the one-particle Hilbert space, an operator on the two-particle Hilbert space, etc. We next assume that the action of C on the many-particle states can be obtained from the action on the one-particle states as follows. Let H be one of our one-particle Hilbert spaces (e.g., H_RKG, H_CKG, H_M, H_D). Then an element of the corresponding Fock space consists of a string

(ξ, ξ^a, ξ^(ab), . . .) (465)

of tensors over H. The operator C on the one-particle Hilbert space H can be written, in the index notation, as C^a_b. Then the result of applying C to an element of the Fock space is written

(ξ, C^a_b ξ^b, C^a_c C^b_d ξ^(cd), . . .) (466)

This is a quite reasonable assumption: if we know what charge-reversal means on a one-particle state, we assume that, for a two-particle state, the effect of charge reversal is to apply charge-reversal to each of the particles individually.
Thus, we are led to distinguish a class of charge-reversal operators on each of our one-particle Hilbert spaces, H_RKG, H_M, etc.
It is convenient to introduce some definitions. A mapping T from a Hilbert space H to itself is said to be antilinear if

T(ψ + κφ) = T(ψ) + κ̄ T(φ) (467)

for any ψ, φ ∈ H, κ ∈ C. (Alternatively, T could be considered as a linear mapping from H to H̄.) We shall sometimes refer to an antilinear mapping as an antilinear operator. The word "operator" alone means "linear operator." A linear or antilinear operator T is said to be norm-preserving if

|Tψ| = |ψ| (468)

for every ψ ∈ H. Eqn. (468) immediately implies that, for any ψ, φ ∈ H,

(Tψ, Tφ) = (ψ, φ) (469)
or (Tψ, Tφ) = (φ, ψ) (470)

according as T is linear or antilinear, respectively. As we have remarked, a linear, norm-preserving operator is called unitary. An antilinear, norm-preserving operator is said to be antiunitary.
Let H be one of the Hilbert spaces H_RKG, H_CKG, H_M, or H_D. A linear or antilinear operator C on H will be called a charge-reversal operator if:

1. C is norm-preserving.
2. C commutes with all the unitary operators on H which arise from the action of the restricted Poincare group on H.
3. C, applied to a positive- (resp., negative-) frequency element of H, yields a negative- (resp., positive-) frequency element.

These conditions are to reflect the intuitive idea that the state is changed only in that particles are replaced by antiparticles. Conditions 1 and 3 are clearly reasonable. (The passage from positive-frequency to negative-frequency is the passage from particles to anti-particles.) Condition 2 ensures that quantities such as the locations and momenta of particles are unchanged under C. Note that, if C is a charge-reversal operator, and κ is a complex number with |κ| = 1, then κC is also a charge-reversal operator. (One could, conceivably, impose the further condition C² = 1. We shall not do so.) Note that Condition 1 implies that C is also norm-preserving on the non-interacting Hilbert space H.
Before discussing examples of charge-reversal operators, we establish the following result: C must be linear rather than antilinear. Let r^a be a constant, unit, future-directed, timelike vector field in Minkowski space. Then the energy operator associated with r^a is

E = iℏ r^a L_a (471)

where r^a L_a is the operator which comes from the unitary transformation associated with the Poincare transformation (a translation) generated by r^a. It is essential, in (471), that the iℏ appear explicitly, so that r^a L_a is simply the first-order difference between a unitary operator and the identity operator. We assume that C is antiunitary, and obtain a contradiction. For each of our Hilbert spaces, the expectation value of the energy E in any state (and, in particular, in the state C^(−1)ψ, for ψ ∈ H) is non-negative:

(C^(−1)ψ, E C^(−1)ψ) ≥ 0 (472)

But, from (470), this implies

(C E C^(−1)ψ, ψ) ≥ 0 (473)

Write

C E C^(−1) = (C iℏ C^(−1)) (C r^a L_a C^(−1)) (474)

Condition 2 above implies that C(r^a L_a)C^(−1) = r^a L_a. The assumption that C is antiunitary implies C(iℏ)C^(−1) = −iℏ. Thus, we have

(C^(−1)ψ, E C^(−1)ψ) = −(Eψ, ψ) (475)

But this is a contradiction, for the left side is non-negative and the right side non-positive. Therefore, C cannot be anti-linear.
The question of the uniqueness of charge-reversal operators is settled by the following fact: let O be a bounded operator defined everywhere on H (one of our four one-particle Hilbert spaces). Suppose that O commutes with the unitary operators which arise from the action of the restricted Poincare group and, furthermore, that O takes positive- (resp., negative-) frequency states to positive- (resp., negative-) frequency states. Then O is a multiple of the identity. I know of no simple proof. (The statement is essentially an infinite-dimensional generalization of Schur's Lemma (p. 58).) Now suppose that C and C′ are charge-reversal operators. Then C′C^(−1) satisfies the conditions above, and hence must be a multiple of the identity. It follows that C′ = κC, where κ is a complex number with |κ| = 1. Thus, having found one charge-reversal operator on our Hilbert space H, we have found them all.
We begin with H_RKG. In this case, the only positive-frequency or negative-frequency solution is the zero solution. (That is to say, H_RKG describes particles which are identical with their antiparticles.) Hence, Condition 3 is empty. Therefore, the identity is a possible charge-reversal operator. We conclude that the most general charge-reversal operator on H_RKG is κI, with |κ| = 1.
For H_CKG, one charge-reversal operator is given by

φ(x) → φ̄(x) (476)

or, in momentum space, by

φ(k) → φ̄(−k) (477)

That this operator is unitary rather than antiunitary follows from our complex vector-space structure on H_CKG (see Sect. 12). Thus, the most general charge-reversal operator on H_CKG is given by

φ(x) → κ φ̄(x) (478)

with |κ| = 1.
The most general charge-reversal operator on H_M is κI, with |κ| = 1.
The most general charge-reversal operator on H_D is

(ξ_A, η^(A′)) → κ (η̄_A, ξ̄^(A′)) (479)

with |κ| = 1.
32 Parity and Time Reversal
The basic idea of the remaining two discrete symmetries, parity and time reversal, is essentially the same as that for charge reversal. One is concerned primarily with finding operators which commute with the S-matrix, and operators which can be interpreted as representing parity and time reversal are particularly good candidates.
We begin with some remarks concerning the Poincare group. Recall that the restricted Poincare group, 𝒫₀, is a connected, 10-dimensional Lie group. This 𝒫₀ is a normal subgroup of another 10-dimensional Lie group, the full Poincare group 𝒫. However, 𝒫 is not connected; it has four connected components. These components consist of Poincare transformations which reverse neither time nor space orientation (𝒫₀), time but not space orientation, space but not time orientation, and both time and space orientation. The quotient group, 𝒫/𝒫₀, is isomorphic with the group Z₂ × Z₂. (Z₂ is the additive group of integers mod 2.)
The situation is slightly different for the boson and fermion cases. We begin with the boson case. Let H be one of the Hilbert spaces H_RKG, H_CKG, or H_M. Then, as we have seen (Sect. 16), H defines a representation of 𝒫₀. That is to say, with each P ∈ 𝒫₀ there is associated a unitary operator U_P on H, where these U_P's satisfy:

U_P U_P′ = U_PP′    U_e = I (480)

The problem of obtaining parity-reversal and time-reversal operators can be stated as follows: we wish to extend this representation from 𝒫₀ to 𝒫. That is to say, we wish to find, for each P ∈ 𝒫, a (unitary or antiunitary) operator U_P, subject to (480) and to the condition that, for P ∈ 𝒫₀, this representation reduces to the given representation (Sect. 16) of 𝒫₀. It is necessary to admit both unitary and antiunitary operators for, as we shall see shortly, it is otherwise impossible to find any extension of our original representation of 𝒫₀.
There is an important difference between charge reversal on the one hand and parity and time reversal on the other. In the case of charge reversal, one settles eventually on a single unitary charge-reversal operator C. There is, however, no one natural parity-reversal operator "P" or time-reversal operator "T". There is, instead, a 10-dimensional manifold's worth of such operators, namely, the operators associated with the appropriate component of the Poincare group.
Suppose now that we have a representation of 𝒫 as described above. Let P and Q be in the same component of 𝒫, so PQ ∈ 𝒫₀. Now U_P U_Q = U_PQ. But, since PQ ∈ 𝒫₀, U_PQ is unitary rather than antiunitary. We conclude that either both U_P and U_Q are unitary, or else both are antiunitary. (The product of two unitary operators, or of two antiunitary operators, is unitary; the product of a unitary and an antiunitary operator is antiunitary.) We conclude that all the operators associated with the Poincare transformations in a given component of 𝒫 are of the same type (all unitary or all antiunitary).
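The composition rule just used (unitary·unitary and antiunitary·antiunitary are unitary; mixed products are antiunitary) amounts to addition mod 2 of an "antilinearity flag". A finite-dimensional sketch (the representation of operators as (matrix, flag) pairs is mine):

```python
import numpy as np

def apply(op, psi):
    # op = (U, a): psi -> U psi if a == 0 (unitary),
    #              psi -> U conj(psi) if a == 1 (antiunitary)
    U, a = op
    return U @ (np.conj(psi) if a else psi)

def compose(op1, op2):
    # op1 after op2: U1 C^a1 U2 C^a2 = (U1 conj(U2)^a1) C^(a1+a2 mod 2),
    # where C denotes componentwise complex conjugation
    U1, a1 = op1
    U2, a2 = op2
    return (U1 @ (np.conj(U2) if a1 else U2), (a1 + a2) % 2)

rng = np.random.default_rng(1)
U = np.linalg.qr(rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2)))[0]
V = np.linalg.qr(rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2)))[0]

antiA, antiB = (U, 1), (V, 1)        # two antiunitary operators
prod = compose(antiA, antiB)
psi = rng.normal(size=2) + 1j * rng.normal(size=2)

assert prod[1] == 0                  # antiunitary . antiunitary is unitary
assert np.allclose(apply(prod, psi), apply(antiA, apply(antiB, psi)))
```

Since the flag lives in Z₂, assigning a flag to each component of 𝒫 is exactly a homomorphism to Z₂, which is why all operators in one component are of the same type.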
In fact, it follows from an argument similar to that used for charge-reversal that U_P is antiunitary if and only if the Poincare transformation P reverses time. From the remarks above, it suffices to show that, for some P which reverses parity but not time, U_P is unitary, and that, for some P which reverses time but not parity, U_P is antiunitary. Let r^a be a constant, unit, future-directed timelike vector field, and consider the energy operator E given by (471). All expectation values of E are non-negative. Fix an origin O, and let t^a be a unit, future-directed timelike vector at O. Let P denote the Poincare transformation which sends

x^a → x^a + 2 t^a (x_b t^b) (481)

where x^a is the position vector relative to O. Evidently, this P reverses time orientation but not space orientation. From the commutativity properties of
the Poincare group,

U_P^(−1) E U_P = (U_P^(−1) iℏ U_P) r′^a L_a (482)

where

r′^a = r^a + 2 t^a (r_b t^b) (483)

Since r′^a is a past-directed timelike vector, the positivity of E implies, evidently, that U_P must be antiunitary. Similarly, let Q be the Poincare transformation

x^a → −x^a − 2 t^a (x_b t^b) (484)

So Q reverses spatial orientation but not time orientation. Then

U_Q^(−1) E U_Q = (U_Q^(−1) iℏ U_Q) r″^a L_a (485)

where

r″^a = −r^a − 2 t^a (r_b t^b) (486)

Clearly, in this case r″^a is a future-directed timelike vector, hence U_Q must be unitary.
We next consider uniqueness. Let U_P and U′_P be two extensions of our representation of 𝒫₀, so U_P = U′_P for P ∈ 𝒫₀. Let Q be a Poincare transformation which reverses, say, temporal orientation but not spatial orientation. Consider the unitary operator

A = U′_Q U_Q^(−1) (487)

We first show that this A depends only on the component of 𝒫 in which Q lies. Let W be another Poincare transformation which lies in the same component of 𝒫 as Q. Then W = QR for some R ∈ 𝒫₀. Hence,

U′_W U_W^(−1) = U′_QR U_QR^(−1) = (U′_Q U′_R)(U_R^(−1) U_Q^(−1)) = U′_Q U_Q^(−1) (488)

where we have used the fact that U_R = U′_R. We next show that, for P ∈ 𝒫₀, U_P commutes with A. Indeed, since PQ = QV for some V ∈ 𝒫₀, we have

U_P A U_P^(−1) = U_P U′_Q U_Q^(−1) U_P^(−1) = U′_PQ U_PQ^(−1) = U′_QV U_QV^(−1) = (U′_Q U′_V)(U_V^(−1) U_Q^(−1)) = U′_Q U_Q^(−1) = A (489)

These properties do not yet suffice, however, to show that A is a multiple of the identity (see p. 106). We must impose an additional condition which ensures that A does not mix up particles and antiparticles. However, it is reasonable, on physical grounds, to make the following additional assumption: U_P reverses the roles of particles and antiparticles if and only if P reverses time directions. Under this assumption, A must be a multiple of the identity, whence U′_Q = κ U_Q, where κ is some complex number (the same for every Poincare transformation in the same component as Q). However, since QQ ∈ 𝒫₀, we have

U′_Q U′_Q = U′_QQ = U_QQ = U_Q U_Q (490)

whence κ = ±1.
To summarize, we are interested in extending a given representation from 𝒫₀ to 𝒫 in such a way that U_P reverses the roles of particles and antiparticles when and only when P reverses temporal orientation. Every such extension has the property that U_P is antiunitary when and only when P reverses temporal orientation. The extension of the representation is unique except for the following possibilities: affix a minus sign to U_P whenever P reverses spatial parity, affix a minus sign to U_P whenever P reverses temporal orientation, or affix a minus sign to U_P whenever P reverses spatial or temporal orientation, but not both. Thus, from one extension of the representation, it is easy to write them all down.
Finally, we write down a representation of 𝒫 for each of our Hilbert spaces H_RKG, H_CKG, and H_M. Let P be a Poincare transformation, and write Px for the point of Minkowski space to which P sends the point x. For the real and complex Klein-Gordon cases, the action of U_P on an element φ(x) of our Hilbert space is as follows:

φ(x) → φ(Px) (491)

This action clearly defines a representation of 𝒫. For the Maxwell case, note that P is a smooth mapping from Minkowski space to Minkowski space, and hence P sends any vector field on Minkowski space to another vector field. This action defines U_P on H_M. For example, the two Poincare transformations (481) and (484) have the following actions on the vector potential A_a(x):

A_a(x) → A_a(Px) + 2 t_a (t^b A_b(Px))
A_a(x) → −A_a(Qx) − 2 t_a (t^b A_b(Qx))
(492)
The situation with regard to H_D differs in some important respects from that above. The fundamental difference is that H_D does not define a representation of 𝒫₀. Instead, the operative group is what is usually called inhomogeneous SL(2, C): the (double) covering group of 𝒫₀. (SL(2, C) is the (double) covering group of the restricted Lorentz group.) We shall denote this connected, 10-dimensional Lie group by 𝒮. If 𝒮 is to replace 𝒫₀, what group should replace 𝒫? One could, in fact, introduce such a group, and attempt to extend to it our representation of 𝒮. It is simpler, however, to proceed in a slightly different way.
Let P ∈ 𝒫₀. Associated with P there are precisely two elements of 𝒮. The corresponding pair of unitary operators on H_D differ only in sign. Thus, we can regard H_D as a double-valued representation of 𝒫₀: with each P ∈ 𝒫₀, there are associated two unitary operators on H_D, these operators differing only in sign. These operators satisfy (480), modulo sign. The question in the fermion case is therefore the following: Can this double-valued representation of 𝒫₀ be extended to a double-valued representation of 𝒫? The argument given earlier shows that the operators associated with P ∈ 𝒫 are antiunitary if and only if P reverses time-orientation. The uniqueness situation is essentially the same, by the same argument.
There remains, therefore, only the task of specifying what the operators are for P ∉ 𝒫₀. Since the (double-valued) action of 𝒫₀ on H_D is known, we need only specify U_P for one time-reversing P and one parity-reversing P. The action is as follows. For the Poincare transformation (481),

(ξ_A(x), η^(A′)(x)) → (i/√2) (t_(AA′) η^(A′)(Px), t^(AA′) ξ_A(Px)) (493)

and for the Poincare transformation (484),

(ξ_A(x), η^(A′)(x)) → (1/√2) (t_(AA′) η^(A′)(Qx), t^(AA′) ξ_A(Qx)) (494)
33 Extending Operators to Tensor Products
and Direct Sums
We have now introduced a large number of operators: some on the one-particle Hilbert spaces and some on the Fock spaces. In order that such operators can be discussed relative to the S-matrix, however, their action must be defined on the non-interacting Hilbert space H. Since H arises from two constructions, the tensor product and direct sum of Hilbert spaces, we are led to the problem of extending the action of operators through these two constructions.
We begin with the direct sum. Let H_1, H_2, . . . be a sequence of Hilbert spaces. Then the direct sum of this sequence, H = H_1 ⊕ H_2 ⊕ H_3 ⊕ ⋯, is the Hilbert space consisting of sequences

Ψ = (Ψ_1, Ψ_2, Ψ_3, . . .) (495)

with Ψ_i ∈ H_i, for which the sum

|Ψ|² = |Ψ_1|² + |Ψ_2|² + ⋯ (496)

which defines the norm, converges. Now let O_1, O_2, . . . be a sequence of operators (O_i on H_i) which are either all linear or all antilinear. We then define an operator O on H as follows:

OΨ = (O_1 Ψ_1, O_2 Ψ_2, O_3 Ψ_3, . . .) (497)

Clearly, O is linear (resp., antilinear) provided the O_i are linear (resp., antilinear). Note, furthermore, that if all the O_i are norm-preserving, so is O; if all the O_i are Hermitian, so is O; if all the O_i are projection operators, so is O. Of course, not every operator on H can be expressed in the form (497).
The tensor product is next. Let H′, H″, . . . , H^(n) be a finite sequence of Hilbert spaces. Then the tensor product of this sequence, H = H′ ⊗ H″ ⊗ ⋯ ⊗ H^(n), is the Hilbert space whose elements are sums of products,

α′ β″ ⋯ γ^(n) + ⋯ + ρ′ σ″ ⋯ τ^(n) (498)

(The index indicates the Hilbert space to which a vector belongs.) Now let O′, O″, . . . , O^(n) be linear operators on H′, H″, . . . , H^(n), respectively. We then can define an operator O on H by:

O(α′ β″ ⋯ γ^(n) + ⋯ + ρ′ σ″ ⋯ τ^(n)) = (O′α′)(O″β″) ⋯ (O^(n)γ^(n)) + ⋯ + (O′ρ′)(O″σ″) ⋯ (O^(n)τ^(n)) (499)

If all the O's are unitary, so is O; if all the O's are Hermitian, so is O; if all the O's are projection operators, so is O. Now suppose that the given O-operators are antilinear rather than linear. We thus have linear mappings from H̄′ to H′, from H̄″ to H″, etc. We again obtain an operator O on H, which is now antilinear:

O(α′ β″ ⋯ γ^(n) + ⋯ + ρ′ σ″ ⋯ τ^(n)) = (O′α′)(O″β″) ⋯ (O^(n)γ^(n)) + ⋯ + (O′ρ′)(O″σ″) ⋯ (O^(n)τ^(n)) (500)

If the O's are anti-unitary, so is O.
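On the components of a finite-dimensional tensor product, the operator (499) is the Kronecker product of the given operators. A sketch using numpy's kron (the unitarity and factorwise-action assertions correspond to the statements above):

```python
import numpy as np

def tensor_extend(ops):
    # O (alpha x beta x ...) = (O' alpha)(O'' beta)..., extended linearly;
    # on components this is the Kronecker product of the O's
    out = np.eye(1)
    for O in ops:
        out = np.kron(out, O)
    return out

rng = np.random.default_rng(3)
A = np.linalg.qr(rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2)))[0]
B = np.linalg.qr(rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3)))[0]

O = tensor_extend([A, B])
# unitary (x) unitary is unitary on the tensor-product space
assert np.allclose(O.conj().T @ O, np.eye(6))

# on a product vector, O acts factorwise, as in (499)
alpha = rng.normal(size=2) + 1j * rng.normal(size=2)
beta  = rng.normal(size=3) + 1j * rng.normal(size=3)
assert np.allclose(O @ np.kron(alpha, beta), np.kron(A @ alpha, B @ beta))
```

The second assertion is the mixed-product property of the Kronecker product, which is exactly the factorwise action demanded by (499).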
We next consider the application of these constructions to obtaining operators on H. Recall that the non-interacting Hilbert space H is the tensor product of certain Fock spaces based on one-particle Hilbert spaces, e.g.,

H = F(H_D) ⊗ F(H_M) ⊗ F(H_CKG) (501)

where F denotes the operation of taking the (symmetric or antisymmetric, as appropriate) Fock space.
Consider first the unitary or antiunitary operators U_P (P ∈ 𝒫) which arise from the Poincare group. These operators are defined originally on the one-particle Hilbert spaces. Their action is first extended to the many-particle Hilbert spaces via (499) or (500), and then to the Fock spaces via (497). Finally, these operators are defined on H via (499) or (500). Thus, we obtain a representation of the Poincare group 𝒫 on H. For P ∈ 𝒫, we write the corresponding operator on H as U_P. (No confusion will result from this duplicity of notation.) Note that all the operators U_P on H are norm-preserving, and that U_P is antilinear if and only if P reverses time orientation, linear otherwise. The energy, momentum, and angular momentum operators on H are obtained by considering the U_P's which differ infinitesimally from the identity (see Eqn. (222)). Similarly, the charge-reversal operator C is defined, first, on the one-particle Hilbert spaces, and then extended successively to the many-particle spaces, to the Fock spaces, and finally to H. The resulting operator on H is again denoted by C.
Another operator of interest is the total charge operator, Q. On our real Hilbert spaces (which represent neutral particles), H_RKG and H_M, Q = 0. On the complex Hilbert spaces, H_CKG and H_D, Q takes one of the two forms

Q = e P₋ − e P₊ (502)
Q = e P₊ − e P₋ (503)

where P₊ and P₋ are the projection operators onto the states of positive and negative frequency, respectively. The commutation relations between the field operators are unchanged in the passage to H; e.g.,

[φ(f), φ(g)] = (2i/ℏ) D(f, g) I (507)

on H. Similarly, the adjoint relations between these operators are unchanged in the passage to H. Field operators (as well as creation and annihilation operators) which act on different Fock spaces commute. For example,

[φ(f), A(f_a)] = 0 (508)
Finally, we consider the relationship between the field operators and C, Q, and U_P. Once again, everything is straightforward, so a single example will suffice. Consider H_CKG, so

φ(f) = C(σ₋(f)) + A(σ₊(f))
φ̄(f) = A(σ₋(f)) + C(σ₊(f))
(509)

Let P be a restricted Poincare transformation. Then, from (497), (499), (500), and (501),

U_P φ(f) U_P^(−1) = C(U_P σ₋(f)) + A(U_P σ₊(f)) (510)

Similarly, if C is a charge-reversal operator, then

C φ(f) C^(−1) = C(C σ₋(f)) + A(C σ₊(f)) = C(σ₊(f)) + A(σ₋(f)) = φ̄(f) (511)
For the charge operator Q, we note that φ(f) creates an antiparticle (say, with positive charge) and annihilates a particle. Hence, the total change in the charge effected by φ(f) is just e times the norm of σ₊(f). Thus,

[Q, φ(f)] = e (|σ₊(f)| + |σ₋(f)|) I (512)

Clearly, the list of operators in this subject is almost infinite. Roughly speaking, any two operators in this list have a relationship which is simple, straightforward to derive, and easy to interpret physically.
34 Electromagnetic Interactions
In Sect. 29 (see Eqn. (455)) we wrote down an expression for the S-matrix in terms of an (unknown) operator field K(x) on Minkowski space. Of course, this formula gives practically no information about the scattering unless one knows K(x). One imagines that the actual K(x) which describes physical processes in the real world can be written as the sum of a certain number of terms (e.g., the electron-photon interaction, the nucleon-photon interaction, the π-meson-photon interaction (electromagnetic interactions), the π-meson-nucleon interaction (strong interactions), the electron-neutrino interaction, the μ-meson-neutrino interaction (weak interactions), etc.) There are at least some experimental situations in which one single term dominates all the others. One attempts to obtain an expression for this term using physical arguments and trial and error. That is to say, one makes a reasonable guess for the term in K(x), and compares the theoretical consequences of that guess (via (455)) with experiment. The hope is that one can, in this way, isolate and study each term, and then, by adding the well-established terms, obtain a reasonable approximation to the actual K(x) which is operative in Nature. We shall here merely illustrate the general idea by writing down and discussing a few of the K(x)'s associated with the interaction of charged particles with the electromagnetic field.
We begin with the simplest case: the interaction of a complex Klein-Gordon field with the Maxwell field, e.g., the interaction

    K(x) = (ie/2) ( φ̄(x) ∇^a φ(x) − φ(x) ∇^a φ̄(x) ) A_a(x)    (514)

(Note that it is only an integral of (514) which has meaning, for we have the freedom to add a gradient to the vector potential. Appropriate integrals are gauge-invariant, however, because the first factor is, as we have seen, divergence-free.) In (514), e is a constant. Using the discussion of Sect. 30, and the fact that K(x) has dimensions of energy density (m t⁻³), we see that the coupling constant e has dimensions m^{1/2} t^{1/2}, whence e²/ℏ is dimensionless. In order to obtain eventual agreement with experiment, it will, of course, be necessary to set this constant to 1/137.
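This dimensional bookkeeping can be checked mechanically. In units with c = 1, energy has dimensions of mass, so ℏ (energy times time) carries dimensions m t. The sketch below (plain Python; the tuple encoding of exponents is just an illustrative device) confirms that e²/ℏ is dimensionless when e carries m^{1/2} t^{1/2}:

```python
from fractions import Fraction as F

# A dimension is a pair of exponents (mass, time), in units with c = 1
# (so length and time share a dimension, and energy ~ mass).
def mul(d1, d2):
    return (d1[0] + d2[0], d1[1] + d2[1])

def div(d1, d2):
    return (d1[0] - d2[0], d1[1] - d2[1])

hbar = (F(1), F(1))        # [hbar] = energy * time = m t
e = (F(1, 2), F(1, 2))     # [e] = m^(1/2) t^(1/2), from [K] = m t^-3

print(div(mul(e, e), hbar) == (0, 0))   # True: e^2/hbar is dimensionless
```

Since e²/ℏ is dimensionless, the instruction "set this constant to 1/137" makes sense independently of any choice of units.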
While (514) is a perfectly nice scalar field (on Minkowski space) constructed out of a complex Klein-Gordon field and a Maxwell field, it is, unfortunately, just that: a scalar field rather than an operator field. The expression (514) is just not the right sort of mathematical object to be K(x). Now comes the transition from classical to quantum theory. Roughly speaking, what we propose to do is to replace the classical fields in K(x) (Eqn. (514)) by the corresponding field operators to obtain K(x). Unfortunately, this replacement is not so simple and unambiguous as it may appear at first sight.

By what operator should we replace φ(x)? Our Klein-Gordon field operator, φ(f), depends on test fields in Minkowski space, and not on points of Minkowski space. What one would like to do is define an operator field φ(x) by

    φ(x) = lim_{f→δ_x} φ(f)    (515)

where δ_x denotes a δ-function located at the point x. But will the limit in (515) exist? The answer, as we have seen earlier, is no. We could still regard φ(x) as an operator-valued distribution (i.e., a linear mapping from test functions to operators on H; that, after all, is what φ(f) is), but such an attitude again leads to difficulties. Eqn. (514) will require us to take products of such operator-valued distributions, but the ability to take products is precisely what is lost in the transition from functions to distributions. That is to say, products of distributions are not in general well-defined. (This is a genuine and serious problem, not to be confused, for example, with the standard complaints about use of the Dirac δ-function.) In short, we are stuck. There is no meaning which can be given to (515) which would be appropriate for replacement in (514).
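The difficulty with products of distributions can be seen numerically. Below, normalized Gaussians stand in for test functions approaching δ₀ (an illustrative choice, not anything from the text): smearing a smooth function against g_ε converges as ε shrinks, while smearing against the pointwise product g_ε · g_ε, the would-be "δ²", grows without bound:

```python
import math

def gaussian(x, eps):
    # Normalized Gaussian of width eps: integrates to 1, approaches
    # a delta-function located at 0 as eps -> 0.
    return math.exp(-x * x / (2 * eps * eps)) / (eps * math.sqrt(2 * math.pi))

def smear(h, eps, a=-1.0, b=1.0, n=20001):
    # Trapezoid-rule approximation to the integral of h(x) * gaussian(x, eps).
    dx = (b - a) / (n - 1)
    total = 0.0
    for i in range(n):
        x = a + i * dx
        w = 0.5 if i in (0, n - 1) else 1.0
        total += w * h(x) * gaussian(x, eps) * dx
    return total

for eps in (0.1, 0.05, 0.025):
    good = smear(math.cos, eps)                                  # -> cos(0) = 1
    bad = smear(lambda x: math.cos(x) * gaussian(x, eps), eps)   # "delta squared"
    print(eps, round(good, 4), round(bad, 2))
# The second column converges to 1; the third roughly doubles each time
# eps is halved, so the "delta squared" smearing has no limit.
```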
We adopt the following attitude. We leave the problem of the nonexistence of limits such as (515) unresolved for the time being. We permit ourselves to manipulate such quantities formally, as though the question of the limits had never arisen. This marks the third (and, mercifully, the last) of the mathematical problems associated with this formalism. For emphasis, we list these problems again:

1. The question of the convergence of the infinite sum of operators in (455).
2. The question of the convergence of the integrals (over all of Minkowski space) of operators in (455).
3. The nonexistence of the δ-function limits of field operators used in obtaining K(x).

The situation will look less gloomy shortly. (I find it hard to believe that the ultimate, mathematically acceptable, quantum field theory will result from a brute-force attack on these problems.)
We now have the (formal) operator φ(x) to replace the classical field φ(x). Naturally, we replace the classical complex-conjugate field, φ̄(x), by the adjoint operator,

    φ†(x) = lim_{f→δ_x} φ†(f)    (516)

(See Eqn. (145).) Similar remarks concerning non-existence of limits apply.
We must next select an operator to replace the vector potential, A_a(x), in (514). Ideally, one would like to define A_a(x) by

    p^a A_a(x) = lim_{f→δ_x} A(p^a f)    (517)

where p^a is a constant vector field, and A(·) is the field operator (Sect. 14) for the vector potential. Unfortunately, this won't work, for A(f^a) is only defined for test fields f^a which can be written as the sum of a divergence-free field and a gradient; p^a f cannot be written in this form in general. The simplest way of overcoming this difficulty is as follows. First note that the commutator of the vector potential operators ((385) and (386)) is well-defined whether or not the test fields, f^a and g^a, can be written as the sum of a divergence-free field and a gradient. In fact, it is only these commutators which will enter the S-matrix. Hence, we can work with vector potential operators, A_a(x), and use for the commutators (385).
We now have an operator equivalent for each term in (514). We must now face the next problem: in what order should the operators be placed? This difficulty does not arise, of course, in the classical theory, because the classical fields may be placed in any order. We consider the most general linear combination:

    K(x) = (ie/2) [ a φ†∇_aφ + b (∇_aφ)φ† + c φ∇_aφ† + d (∇_aφ†)φ ] A^a    (518)

where a, b, c, and d are real numbers. Taking the Hermitian conjugate of (518),

    K†(x) = −(ie/2) [ a (∇_aφ†)φ + b φ∇_aφ† + c (∇_aφ)φ† + d φ†∇_aφ ] A^a    (519)

We see that the Hermiticity of (518) requires

    a = −d        b = −c    (520)
Further information about the coefficients is obtained from the experimental fact that electromagnetic interactions are invariant under charge reversal. From (518):

    C K(x) C⁻¹ = (ie/2) [ a φ∇_aφ† + b (∇_aφ†)φ + c φ†∇_aφ + d (∇_aφ)φ† ] C A^a C⁻¹    (521)

Thus, invariance under charge reversal requires one of the following two alternatives:

    a = c        b = d        C A_a C⁻¹ = A_a    (522)

    a = −c        b = −d        C A_a C⁻¹ = −A_a    (523)

We choose (523) for two reasons: (i) it is more reasonable on physical grounds to have the vector potential reverse sign under charge reversal (for classical electromagnetic fields reverse sign when the signs of all charges are reversed), and (ii) with this choice, K(x) reduces, in the classical limit, to the classical expression (514). Thus, we arrive at the interaction:
    K(x) = (ie/4) [ φ†∇_aφ + (∇_aφ)φ† − φ∇_aφ† − (∇_aφ†)φ ] A^a    (524)
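The role of the ordering conditions can be illustrated in a finite-dimensional toy model: random matrices stand in for the non-commuting operators φ and ∇_aφ, with the vector potential taken to be a multiple of the identity (it commutes with the charged fields, so nothing is lost for this check). The labels P and D and the matrix size are of course arbitrary; the point is that Hermiticity holds for the coefficients (1/2, 1/2, −1/2, −1/2) appearing in (524), i.e. a = −d, b = −c as in (520), and fails for a generic ordering:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for the non-commuting field operators (e and A^a set to 1).
P = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))  # phi
D = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))  # grad_a phi

def K(a, b, c, d):
    # The general ordering (518), writing P+ for the adjoint of P:
    # (i/2) [ a P+ D + b D P+ + c P D+ + d D+ P ]
    Pd, Dd = P.conj().T, D.conj().T
    return 0.5j * (a * Pd @ D + b * D @ Pd + c * P @ Dd + d * Dd @ P)

def is_hermitian(M):
    return np.allclose(M, M.conj().T)

print(is_hermitian(K(0.5, 0.5, -0.5, -0.5)))  # True: the ordering of (524)
print(is_hermitian(K(1.0, 0.0, 0.0, 0.0)))    # False: ordering matters
```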
Eqn. (524) describes an interaction which is invariant under charge reversal. Note, furthermore, that if P is any Poincare transformation, then

    U_P K(x) U_P⁻¹ = K(Px)    (525)

Since K(x) is integrated over all of Minkowski space (Eqn. (455)), the final S-matrix will commute with each U_P. Thus, our interaction conserves the quantities associated with the infinitesimal generators of energy, momentum, and angular momentum. The interaction is also invariant under parity and time reversal. Note, furthermore, that we have
    [K(x), K(y)] = 0    (526)

for x − y spacelike, for when x − y is spacelike, any two operators in (524) commute with each other. Finally, the total charge operator, Q, commutes with K(x), for A_a(x) commutes with Q, while each term in (524) contains one φ and one φ†: the change in the total charge effected by the one is cancelled by that effected by the other. (A term such as φ∇_aφ, by contrast, would change the total charge by 2e.)

We consider next the interaction of a Dirac field with the Maxwell field, e.g., the interaction

    K(x) = e ( ξ^A ξ̄^{A′} + η̄^A η^{A′} ) A_{AA′}    (527)
The classical Dirac fields are to be replaced by the following operators:

    p_A ξ^A(x) = lim_{f→δ_x} (1/2) [ Ψ(f p^A, f̄ p̄^{A′}) − i Ψ(i f p^A, −i f̄ p̄^{A′}) ]

    p̄_{A′} η^{A′}(x) = lim_{f→δ_x} (1/2) [ Ψ(f p^A, f̄ p̄^{A′}) + i Ψ(i f p^A, −i f̄ p̄^{A′}) ]    (528)

where p^A is a constant spinor field. Note that e in (527) again has dimensions m^{1/2} t^{1/2}
. The classical complex-conjugate fields, ξ̄^{A′} and η̄^A, are to be replaced by the Hermitian conjugates, (ξ^A)† and (η^{A′})†, respectively. In this case, the problem of factor ordering is not resolved by the requirement that K(x) be Hermitian: this condition is satisfied for any factor ordering. However, this electromagnetic interaction should be invariant under charge reversal. We have

    C ξ^A C⁻¹ = η̄^A        C η^{A′} C⁻¹ = ξ̄^{A′}    (529)
What should we adopt for the behavior of the vector potential operator, A_a(x), under charge reversal? We have already decided, for the meson-photon interaction, to use

    C A_a C⁻¹ = −A_a    (530)

It is an important point that we must choose the same behavior for the present interaction. The reason is that, for the actual interaction which Nature obeys, K(x) will be the sum of the various interactions. If we use a different charge-reversal operator for each term which appears in this sum, then we will have no operator which commutes with the total K(x). In other words, the behavior of each type of particle under the various reversals must be fixed once and for all. One has, of course, freedom to choose that behavior, and this choice is based on obtaining operators which commute with as many terms in the final K(x) as possible. Thus, using (529) and (530), we are led to adopt the expression

    K(x) = (e/2) [ ξ^A ξ̄^{A′} − ξ̄^{A′} ξ^A + η^{A′} η̄^A − η̄^A η^{A′} ] A_{AA′}    (531)

for the interaction.
Note that (531) is Hermitian, and that the resulting S-matrix commutes with the unitary operators which arise from the Poincare group. Thus, (531) conserves energy, momentum, and angular momentum. By the same argument as before, K(x) commutes with the total charge operator Q. Finally, we note that, if x − y is spacelike,

    [K(x), K(y)] = 0    (532)

This arises from the following facts: when x − y is spacelike, any two boson operators commute, while any two fermion operators anticommute. But K(x) contains an even number of fermion operators. Since reversing the order of two boson operators gives a plus sign, and reversing the order of two fermion operators gives a minus sign, the total number of minus signs will be even, and so we have (532).
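The sign counting in this argument can be made explicit. Moving every operator at x past every operator at y contributes one minus sign per fermion-fermion exchange and none otherwise, for a net factor (−1)^(n_f(x) · n_f(y)). A minimal sketch of just this counting rule (the function name is, of course, only illustrative):

```python
def reordering_sign(n_fermions_x, n_fermions_y):
    # Net factor from moving all operators at x past all operators at y:
    # one minus sign per fermion-fermion exchange, plus signs otherwise.
    return (-1) ** (n_fermions_x * n_fermions_y)

# Each factor of K in (531) contains two fermion operators, so two such
# factors commute at spacelike separation:
print(reordering_sign(2, 2))   # 1
# An interaction with an odd number of fermion factors would not:
print(reordering_sign(1, 1))   # -1
```

Since an even number times anything is even, any interaction built from an even number of fermion operators satisfies (532), whatever it is multiplied against.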
Clearly a vast number of conceivable interactions could be written down using the pattern illustrated above. One first writes down a real scalar field constructed from the classical fields. One then replaces the classical fields by the corresponding operators. The factors must be ordered so that the resulting operator is Hermitian, and satisfies (532). Beyond that, the choice of factor ordering must be based on physical or aesthetic considerations, experiment, etc. We have merely discussed two possible interactions here in order to illustrate the method. (In fact, these are the two simplest, for one can rely heavily on classical theory as a guide.)
35 Transition Amplitudes
Suppose now that we have selected a particular K(x), and wish to work out its experimental consequences, using (455). The straightforward procedure (substituting K(x) into (455), and attempting to carry out the integrals and sum) turns out to be too difficult to carry out in practice. Instead, one adopts a more indirect approach which leads ultimately to the Feynman rules. We shall not attempt to derive the Feynman rules, or even discuss the large volume of technical apparatus which has been developed to deal with (455). Instead, we merely indicate the general idea of the method.

Suppose first that we were able, in some way, to obtain the value of the complex number

    (α, Sβ)    (533)

for any two states α, β ∈ H. (The expression (533) is called the transition amplitude between the state β and the state α.) This information is, of course, completely equivalent to a knowledge of the S-matrix. In fact, it suffices to know (533) only for α's and β's drawn from a certain subspace of H, provided this subspace is dense in H. Let Φ₀ denote the vacuum state in H and C₁, C₂, . . . , Cₙ any finite sequence of creation operators on H. (One Cᵢ might create a photon, another an electron, others mesons, etc.) We consider the state (element of H)

    C₁ C₂ ··· Cₙ Φ₀    (534)
Clearly, the collection of all finite linear combinations of states of the form (534) is dense in H. Hence, it suffices to evaluate

    (C₁ ··· Cₙ Φ₀, S C′₁ ··· C′ₘ Φ₀) = (C₁ ··· Cₙ Φ₀, C′₁ ··· C′ₘ Φ₀)
        + (1/iℏ) ∫ dV₁ (C₁ ··· Cₙ Φ₀, K(x₁) C′₁ ··· C′ₘ Φ₀)
        + (1/2!)(1/iℏ)² ∫ dV₁ ∫ dV₂ (C₁ ··· Cₙ Φ₀, T[K(x₁), K(x₂)] C′₁ ··· C′ₘ Φ₀)
        + ···    (535)
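The shape of the series (535), with the n-th term carrying 1/n! and n factors of the interaction, is that of an exponential. In a finite-dimensional toy model in which the integrated interaction is replaced by a single fixed Hermitian matrix H (an illustrative stand-in, not the normalization of (455)), the series collapses to the sum over n of (1/n!)(−iH)ⁿ = exp(−iH), and one can watch the partial sums converge to a unitary limit:

```python
import numpy as np

# Toy stand-in for the integrated interaction: a fixed 2x2 Hermitian matrix.
H = np.array([[1.0, 0.5], [0.5, 2.0]])

def s_matrix_partial(order):
    # Partial sums of sum_n (1/n!) (-iH)^n, mirroring the shape of (535):
    # the n-th term carries 1/n! and n factors of the interaction.
    S = np.zeros((2, 2), dtype=complex)
    term = np.eye(2, dtype=complex)
    for n in range(order + 1):
        S += term
        term = term @ (-1j * H) / (n + 1)
    return S

# The exact limit, exp(-iH), computed by diagonalizing H; it is unitary,
# as an S-matrix must be.
w, V = np.linalg.eigh(H)
exact = V @ np.diag(np.exp(-1j * w)) @ V.conj().T

for order in (2, 5, 10):
    print(order, np.abs(s_matrix_partial(order) - exact).max())
# The error shrinks factorially with the order.
```

In the full theory the convergence question (problem 1 above) is, of course, far subtler; the toy model only displays the combinatorial structure of the series.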
for any C₁, . . . , Cₙ, C′₁, . . . , C′ₘ. One now attempts to evaluate the various terms in the sum (535) individually. The first term is called the 0th-order interaction. It clearly vanishes unless n = m, and the C's and C′