Simple Advanced

This document is a lecture on advanced topics in quantum mechanics. It introduces quantum systems with a finite number of discrete states that can be described using linear algebra and vector spaces. The simplest example is a qubit, which exists in a two-dimensional state space analogous to the surface of a sphere. Measurements of quantum systems yield probabilistic outcomes determined by the system's state vector. Unitary dynamics evolve states coherently according to the Schrödinger equation.

Uploaded by

Philip Ruijten

Advanced Topics

in Quantum Mechanics
M. Fannes
Instituut voor Theoretische Fysica
K.U.Leuven

March 2013

Contents

1 Introduction
1.1 General principles
1.2 The qubit
1.3 Two qubits
1.4 Teleportation

2 Observables and states
2.1 Positive matrices
2.2 States
2.2.1 Density matrices
2.2.2 Convex sets
2.2.3 Pure and mixed states
2.2.4 The pure state space of $M_d$
2.3 State space of a qubit
2.4 A probability theory approach to states
2.5 States of composite systems

3 Quantum dynamics
3.1 Dynamics in discrete time
3.2 General quantum operations
3.3 Examples of quantum operations
3.3.1 Unitary gates
3.3.2 Dissipative operations
3.4 Dynamics in continuous time
3.4.1 Generators
3.4.2 Lindblad's theorem
3.4.3 A semi-group of decoherent qubit maps
3.4.4 Radiation loss to the vacuum
3.5 Thermalising maps
3.5.1 Reversible dynamics of the system
3.5.2 Equilibrium states of the bath
3.5.3 The weak-coupling limit
3.5.4 Properties of the weak-coupling limit
3.5.5 Quantum detailed balance maps on $M_d$

4 Entropy
4.1 Classical entropy
4.2 Construction of the Shannon entropy
4.3 Quantum entropy

1 Introduction

Classical systems can be described at different levels of complexity. The
simplest systems are registers that can be in a discrete number of states,
like an Ising spin that can be up or down. More complex systems appear
in classical mechanics: point masses whose motion is governed by Newton's
second law. Still more complex are classical fields such as the
electromagnetic field.

A very similar situation holds in the quantum world. The simplest systems
are those that can only access a finite number of states, typically the ground
state and a few excited states. Here we just need finite dimensional vector
spaces and linear algebra to describe them. More complex are systems like the
hydrogen atom where we need an infinite dimensional space of wave functions.
The appropriate mathematical tools for such systems are Hilbert spaces and
linear operators on such spaces. Still more complex are quantum field theories
or systems with an infinite number of degrees of freedom. There the Hilbert
space picture breaks down and one has to turn to operator algebras.

We will here mainly restrict our attention to the simplest, finite dimensional
description of quantum mechanics. This context is already wide enough to
discover a number of typical quantum phenomena. It is, moreover, also a
realistic approach to small systems at low temperatures such as a few atoms,
a molecule or a few photons.

To describe a quantum system with $d$ accessible levels we will just need the
complex $d$-dimensional inner product space. Fixing an orthonormal basis we can
identify the space with $\mathbb{C}^d$ equipped with its standard scalar product.
Linear transformations of the space can then simply be identified with matrices
with complex entries.

1.1 General principles

The main postulates for describing an autonomous isolated quantum system
are:

Postulate 1: The maximal knowledge that can be obtained about an
ensemble of identically prepared isolated systems is encoded into the
state vector of the system. This is a normalized vector in $\mathbb{C}^d$.

Postulate 2: Every normalized vector in $\mathbb{C}^d$ is a possible state vector
of the system. This is equivalent to the superposition principle:
if $\varphi_1$ and $\varphi_2$ are possible state vectors, then any normalized linear
combination of $\varphi_1$ and $\varphi_2$ also occurs.

Postulate 3: The observables of the system are the Hermitian linear
transformations of $\mathbb{C}^d$. Let $A$ be an observable and $\{e_j\}$ be the
orthonormal basis of eigenvectors of $A$ with corresponding eigenvalues
$\{\lambda_j\}$. The $\lambda_j$ are the possible outcomes of a measurement of $A$.
Measuring the observable $A$ will return a random outcome and the
probability of the outcome $\lambda_j$ for a system in the state $\varphi$ is
$|\langle e_j, \varphi\rangle|^2$. This probability can be obtained by repeatedly
observing the system and establishing the relative frequencies of the
different outcomes.

Postulate 4: The evolution of an autonomous isolated system is
governed by a Hermitian Hamiltonian matrix $H$. If $\varphi$ is the
state vector at time $t_1$ then $\exp(-i(t_2 - t_1)H)\,\varphi$ is the state vector at
time $t_2$.
Postulate 1 supposes that we can produce an arbitrary number of systems,
all identically and perfectly prepared. The state vector or wave function
associated to such a preparation is not attached to any particular instance of
the system but rather to the ensemble. We need such an ensemble to build
up statistics of measurement outcomes using relative frequencies. This is the
meaning of Postulate 3.

Postulate 2 also leads to the tensor product construction for dealing
with composite systems. If systems A and B are independent with state
vectors $\varphi_A$ and $\varphi_B$ then a product vector $\varphi_A \otimes \varphi_B$ should
describe the composite system. Superpositions of such vectors are also
allowable states; we therefore have to consider all linear combinations of
such product vectors, which leads to the tensor product construction.

For systems composed of indistinguishable particles the Pauli principle
applies: the admissible state vectors for Fermions have to change sign under
odd permutations of the particles while they remain invariant for Bosons.
In such a situation we must consider the totally anti-symmetric or symmetric
subspaces of a tensor product of copies of the single particle space.
Sending an instance of a system prepared in the state $\varphi$ through a
measurement apparatus of an observable $A$ will yield one of the outcomes
$\lambda_j$. It is not possible to predict which $\lambda_j$ will be observed but the
relative frequency for obtaining $\lambda_j$ will tend to $|\langle e_j, \varphi\rangle|^2$.
Postulate 3 allows us to compute the expectation value of any function of an
observable, which is another way to obtain the measurement outcome statistics:

$$\langle f(A)\rangle_\varphi = \sum_j f(\lambda_j)\,|\langle e_j, \varphi\rangle|^2 = \langle\varphi, f(A)\,\varphi\rangle. \tag{1}$$

It also follows from Postulate 3 that multiplying $\varphi$ by a phase doesn't alter
any expectation value. Therefore normalized vectors in $\mathbb{C}^d$ that differ only
by a global phase yield the same outcome statistics for every observable and
should be identified. We arrive in this way at a more refined notion of state
vector: a one-dimensional subspace of $\mathbb{C}^d$, also called a ray.

Postulate 4 tells us that unitary quantum dynamics is linear at the level of
state vectors: superpositions evolve into the same superpositions of evolved
state vectors. Unitary quantum dynamics is a coherent evolution.
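The outcome statistics of Postulate 3 and the phase invariance noted above can be checked numerically. The following is a minimal sketch in plain Python, assuming an illustrative observable with eigenvalues $+1$ and $-1$ that is diagonal in the standard basis of $\mathbb{C}^2$; the function names and the numerical values are hypothetical and not part of the notes.

```python
import cmath
import math

def inner(u, v):
    """Scalar product <u, v>, conjugate-linear in the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

def outcome_probabilities(eigenvectors, state):
    """Probability |<e_j, phi>|^2 of each outcome (Postulate 3)."""
    return [abs(inner(e, state)) ** 2 for e in eigenvectors]

def expectation(eigenvalues, eigenvectors, state, f=lambda x: x):
    """<f(A)> = sum_j f(lambda_j) |<e_j, phi>|^2, as in equation (1)."""
    probs = outcome_probabilities(eigenvectors, state)
    return sum(f(lam) * p for lam, p in zip(eigenvalues, probs))

# Illustrative observable (an assumption): eigenvalues +1 and -1,
# eigenvectors equal to the standard basis vectors of C^2.
eigenvalues = [1.0, -1.0]
eigenvectors = [[1, 0], [0, 1]]

phi = [math.sqrt(0.8), 1j * math.sqrt(0.2)]
phi_rotated = [cmath.exp(0.7j) * c for c in phi]  # same ray, other phase

probs = outcome_probabilities(eigenvectors, phi)
mean = expectation(eigenvalues, eigenvectors, phi)
mean_rotated = expectation(eigenvalues, eigenvectors, phi_rotated)
```

Here `probs` sums to one and `mean` agrees with `mean_rotated`, which illustrates why vectors differing by a global phase are identified.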

1.2 The qubit

This is the simplest non-trivial quantum system with state vectors in
$\mathbb{C}^2$. It is the quantum mechanical counterpart of an Ising spin, which is a
classical system with only two configurations: spin up or down. Let us fix the
notation for the standard basis

$$|0\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \quad\text{and}\quad |1\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix}. \tag{2}$$

In the context of quantum information theory this basis is often called the
computational basis.

An arbitrary state vector in $\mathbb{C}^2$ is of the form

$$\varphi = \alpha|0\rangle + \beta|1\rangle \quad\text{with } \alpha, \beta \in \mathbb{C} \text{ and } |\alpha|^2 + |\beta|^2 = 1. \tag{3}$$

We are, moreover, free to multiply $\varphi$ by a complex number of modulus one.
This freedom can be used to make the coefficient of $|0\rangle$ real and
non-negative. In this way we arrive at a unique parametrisation of the rays of
a qubit

$$\varphi = \cos(\theta/2)\,|0\rangle + \sin(\theta/2)\,e^{i\phi}\,|1\rangle \quad\text{with } 0 \le \theta \le \pi \text{ and } 0 \le \phi < 2\pi. \tag{4}$$

So we see, viewing $(\theta, \phi)$ as spherical angular coordinates, that the
space of rays of a qubit is isomorphic to the unit sphere in $\mathbb{R}^3$. The state
$|0\rangle$ is the north pole and $|1\rangle$ is the south pole. The unit sphere in
$\mathbb{R}^3$, seen as the state space of a qubit, is called the Bloch sphere.
Problem 1. Show that orthonormal bases of $\mathbb{C}^2$ correspond to antipodal
points on the Bloch sphere.

Problem 2. Express the transition probability between two qubit states
in terms of a geometric property of the corresponding points on the Bloch
sphere. (The transition probability between $\varphi_1$ and $\varphi_2$ is
$|\langle\varphi_1, \varphi_2\rangle|^2$.)
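Let us observe the z-component of the spin, which corresponds to the Pauli
$\sigma_z$ matrix. As $\sigma_z^2 = \mathbb{1}$, we only need to consider

$$\langle\sigma_z\rangle_\varphi = \langle\varphi, \sigma_z\,\varphi\rangle = \cos^2(\theta/2) - \sin^2(\theta/2) = \cos\theta. \tag{5}$$

Clearly, observing only $\sigma_z$ does not provide enough information to
reconstruct the state vector: there is no way to determine $\phi$. The variance
of measurement outcomes is given by

$$\Delta^2(\sigma_z) = \langle\sigma_z^2\rangle_\varphi - \langle\sigma_z\rangle_\varphi^2 = 1 - \cos^2\theta = \sin^2\theta. \tag{6}$$

Hence we see that a perfectly accurate determination of $\sigma_z$ is only possible
for $\theta = 0$ and $\theta = \pi$ or, in other words, when $\varphi$ is an eigenstate of the
observable.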
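As a quick numerical check of (5) and (6), the sketch below builds the state (4) for arbitrarily chosen angles and evaluates the mean and variance of $\sigma_z$. The helper names and the sample angles are illustrative assumptions.

```python
import cmath
import math

def qubit_state(theta, phi):
    """State cos(theta/2)|0> + sin(theta/2) e^{i phi}|1>, as in (4)."""
    return [complex(math.cos(theta / 2)),
            math.sin(theta / 2) * cmath.exp(1j * phi)]

def sigma_z_mean(state):
    """<sigma_z> = |alpha|^2 - |beta|^2 for the state alpha|0> + beta|1>."""
    alpha, beta = state
    return abs(alpha) ** 2 - abs(beta) ** 2

def sigma_z_variance(state):
    """<sigma_z^2> - <sigma_z>^2, using sigma_z^2 = identity."""
    return 1.0 - sigma_z_mean(state) ** 2

theta, phi = 1.1, 2.3            # arbitrary illustrative angles
state = qubit_state(theta, phi)
mean = sigma_z_mean(state)
variance = sigma_z_variance(state)
```

The value of `mean` does not depend on `phi`, in line with the remark that $\sigma_z$ alone cannot determine the azimuthal angle.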
Problem 3. Show that, up to a shift by a constant and a rescaling, any qubit
observable is unitarily equivalent to the Pauli $\sigma_z$ matrix. (Two observables
$A$ and $B$ are unitarily equivalent if there exists a unitary matrix $U$ such that
$B = U A U^*$.)

Problem 4 (Uncertainty relation for a qubit). Show that the uncertainties
in the x and z components of a qubit satisfy
$\Delta^2(\sigma_x) + \Delta^2(\sigma_z) \ge 1$.
The dynamics of an isolated autonomous qubit is quite simple. It is generated
by a Hamiltonian

$$H = \epsilon_1\,|f_1\rangle\langle f_1| + \epsilon_2\,|f_2\rangle\langle f_2| \tag{7}$$

where $\epsilon_1$ and $\epsilon_2$ are the energies of the eigenstates $f_1$ and $f_2$ of $H$. By
Schrödinger's equation a state $\varphi$ evolves according to

$$\varphi = \alpha|f_1\rangle + \beta|f_2\rangle \;\mapsto\; \varphi_t = \alpha\,e^{-it\epsilon_1}|f_1\rangle + \beta\,e^{-it\epsilon_2}|f_2\rangle. \tag{8}$$

It is clear that the eigenstates $|f_1\rangle$ and $|f_2\rangle$ remain, up to a
multiplication by a global phase, unchanged. Therefore the dynamics leaves the
points on the Bloch sphere corresponding to $|f_1\rangle$ and $|f_2\rangle$ stationary.

Problem 5. Describe geometrically the dynamics of an isolated autonomous
qubit on the Bloch sphere. Show in particular that it is a rotation of the
Bloch sphere around the axis through the eigenstates of the Hamiltonian at
a constant angular velocity determined by $\epsilon_2 - \epsilon_1$, the Bohr frequency of
the system.

1.3 Two qubits

We now turn to a composite system of two qubits. The components are
sometimes called parties and the composite system bipartite. The states of
such a system are the normalized vectors in $\mathbb{C}^2 \otimes \mathbb{C}^2 = \mathbb{C}^4$ up to a
factor of modulus one. The standard tensor basis is usually lexicographically
ordered:

$$|00\rangle = |0\rangle \otimes |0\rangle, \quad |01\rangle = |0\rangle \otimes |1\rangle, \quad |10\rangle = |1\rangle \otimes |0\rangle, \quad\text{and}\quad |11\rangle = |1\rangle \otimes |1\rangle. \tag{9}$$

A general two-qubit state can then be written as

$$\varphi = a_{00}|00\rangle + a_{01}|01\rangle + a_{10}|10\rangle + a_{11}|11\rangle \quad\text{with}\quad |a_{00}|^2 + |a_{01}|^2 + |a_{10}|^2 + |a_{11}|^2 = 1.$$

As we can also fix a global phase there remain 6 real parameters.

Some two-qubit state vectors have a particular structure: they are elementary
tensors

$$\varphi = \varphi_1 \otimes \varphi_2, \quad \varphi_1, \varphi_2 \in \mathbb{C}^2. \tag{10}$$

This type of state vector describes a composite system where both parties
are independent: the expectation of a product observable $A \otimes B$ is simply the
product of the expectations of these observables in the subsystems. States
of this type are called non-entangled or separable and they are not very
useful for many purposes.

Problem 6. Show that a two-qubit state vector is non-entangled iff
$a_{00}a_{11} = a_{01}a_{10}$. As this imposes two real constraints, we need 4 real
parameters to describe a generic non-entangled two-qubit state. This means
that a generic two-qubit state vector will be entangled.

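Let us consider a very entangled state vector
$\psi_{00} = \frac{1}{\sqrt{2}}\bigl(|00\rangle + |11\rangle\bigr)$ and let us compute the
expectation of an observable of the first subsystem:

$$\langle A \otimes \mathbb{1}\rangle_{\psi_{00}} = \langle\psi_{00}, (A \otimes \mathbb{1})\,\psi_{00}\rangle = \tfrac{1}{2}\bigl(\langle 0|A|0\rangle + \langle 1|A|1\rangle\bigr). \tag{11}$$

It is easy to see that there is no qubit state vector $\varphi$ such that, for
every observable $A$,

$$\langle\varphi, A\,\varphi\rangle = \tfrac{1}{2}\bigl(\langle 0|A|0\rangle + \langle 1|A|1\rangle\bigr). \tag{12}$$

In fact we need an equal weight mixture of the expectations defined by the
qubit state vectors $|0\rangle$ and $|1\rangle$ to reproduce such expectations. It means
that we have to enlarge our description of expectation values to include also
mixtures of expectations defined by state vectors. This is the subject of
the next section.

Problem 7. Show that there is no qubit state vector reproducing the
expectation

$$A \mapsto \tfrac{1}{2}\bigl(\langle 0|A|0\rangle + \langle 1|A|1\rangle\bigr). \tag{13}$$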
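The contrast between a product vector and the entangled vector $\frac{1}{\sqrt{2}}(|00\rangle + |11\rangle)$ can be made concrete with a small computation. The sketch below is only an illustration under stated assumptions: amplitudes are stored as a 2 by 2 nested list `a[j][k]`, and the expectations of the three Pauli matrices on the first qubit are evaluated. The function names are hypothetical. For a vector on the Bloch sphere these three expectations form a vector of length one, while the entangled vector gives the zero vector.

```python
import math

# Pauli matrices as nested lists of rows.
SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Y = [[0, -1j], [1j, 0]]
SIGMA_Z = [[1, 0], [0, -1]]

def first_qubit_expectation(a, op):
    """<psi, (A (x) 1) psi> for psi = sum_jk a[j][k] |jk>."""
    total = 0j
    for j in range(2):
        for jp in range(2):
            for k in range(2):
                total += complex(a[j][k]).conjugate() * op[j][jp] * a[jp][k]
    return total.real

def is_product_state(a, tol=1e-12):
    """Criterion of Problem 6: non-entangled iff a00 a11 = a01 a10."""
    return abs(a[0][0] * a[1][1] - a[0][1] * a[1][0]) < tol

s = 1 / math.sqrt(2)
bell = [[s, 0], [0, s]]               # (|00> + |11>)/sqrt(2)
product = [[0.5, 0.5], [0.5, 0.5]]    # (|0>+|1>)/sqrt(2) tensor itself

paulis = (SIGMA_X, SIGMA_Y, SIGMA_Z)
bloch_bell = [first_qubit_expectation(bell, op) for op in paulis]
bloch_product = [first_qubit_expectation(product, op) for op in paulis]
```

With these inputs `bloch_bell` vanishes in all three components, whereas `bloch_product` has unit length, consistent with the need for mixtures of state vectors.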

1.4 Teleportation

Suppose that we want to transmit a qubit state $\varphi$ from A(lice)'s lab to
B(ob)'s lab using standard means of communication, that is to say, by
transmitting numerical data from A to B. The state $\varphi$ is just any possible
unknown state of a qubit and we are allowed a single use of the system.
Measuring an observable in A will not be very helpful. Indeed, the outcome is
random and after the measurement the state has turned into one of the
eigenstates of the observable. Suppose, however, that besides transmitting
numerical information A and B share a (maximally) entangled two-qubit state
like $\psi_{00}$ above. Sharing a two-qubit state means that A can act on the
composite system consisting of the unknown qubit state and the first $\mathbb{C}^2$
factor of the space $\mathbb{C}^2 \otimes \mathbb{C}^2$ to which $\psi_{00}$ belongs. B on the other hand
can act on the second factor in $\mathbb{C}^2 \otimes \mathbb{C}^2$.

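We consider a general $(\alpha, \beta)$ superposition of the states $|0\rangle$ and $|1\rangle$
at the A lab

$$\varphi = \alpha|0\rangle + \beta|1\rangle \tag{14}$$

and set up a procedure to generate the same $(\alpha, \beta)$ superposition of a set of
basis states at lab B. We first introduce the four Bell states, which form an
orthonormal basis of $\mathbb{C}^2 \otimes \mathbb{C}^2$:

$$\begin{aligned}
\psi_{00} &= \tfrac{1}{\sqrt{2}}\bigl(|00\rangle + |11\rangle\bigr), & \psi_{01} &= \tfrac{1}{\sqrt{2}}\bigl(|01\rangle + |10\rangle\bigr), \\
\psi_{10} &= \tfrac{1}{\sqrt{2}}\bigl(|00\rangle - |11\rangle\bigr), & \psi_{11} &= \tfrac{1}{\sqrt{2}}\bigl(|01\rangle - |10\rangle\bigr).
\end{aligned} \tag{15}$$

The three-qubit state consisting of the generic $\varphi$ and $\psi_{00}$ is then

$$\begin{aligned}
\varphi \otimes \psi_{00} &= \tfrac{1}{\sqrt{2}}\bigl(\alpha|0\rangle + \beta|1\rangle\bigr) \otimes \bigl(|00\rangle + |11\rangle\bigr) \\
&= \tfrac{1}{\sqrt{2}}\bigl(\alpha|000\rangle + \alpha|011\rangle + \beta|100\rangle + \beta|111\rangle\bigr).
\end{aligned} \tag{16}$$

In the A lab a measurement of an observable that is diagonal in the Bell
basis is made and depending on the (random) outcome an instruction is
transmitted to the B lab about what action should be taken. To do this we
expand the three-qubit state with respect to the Bell basis of the first two
components:

$$\begin{aligned}
\varphi \otimes \psi_{00} ={}& \tfrac{1}{2}\,\psi_{00} \otimes \bigl(\alpha|0\rangle + \beta|1\rangle\bigr) + \tfrac{1}{2}\,\psi_{01} \otimes \bigl(\alpha|1\rangle + \beta|0\rangle\bigr) \\
&+ \tfrac{1}{2}\,\psi_{10} \otimes \bigl(\alpha|0\rangle - \beta|1\rangle\bigr) + \tfrac{1}{2}\,\psi_{11} \otimes \bigl(\alpha|1\rangle - \beta|0\rangle\bigr).
\end{aligned} \tag{17}$$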
The idea is now to perform a measurement in lab A in the Bell basis and to
transmit the random outcome of this measurement to lab B where a suitable
action is undertaken so as to restore the original state in lab B. Suppose,
e.g., that the measurement selected the Bell state $\psi_{11}$; then we know that
the part of the state that is accessible in lab B is $\alpha|1\rangle - \beta|0\rangle$ and we
can reconstruct the original state by applying the unitary

$$\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}.$$
This procedure is called teleportation. It is actually not transporting physical
objects from A to B but rather the structure of an arbitrary quantum state
at A to that of a state at B. Remark also that neither A nor B actually
measures the teleported state. The only effect of the whole procedure is that
an unknown $(\alpha, \beta)$ superposition of two states at lab A has been exactly
reconstructed at lab B.

Problem 8. Work out the actions that have to be taken in Bob's lab to
restore the original qubit for the other measurement outcomes of Alice.
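The single branch discussed above, where Alice finds the Bell state built from $|01\rangle - |10\rangle$, can be checked numerically. The sketch below is an illustration under stated assumptions: three-qubit amplitudes are stored in a list indexed by $4a + 2b + c$ for $|abc\rangle$, the labels "00" to "11" follow the ordering of (15), and all function names are hypothetical. The other branches are left to Problem 8.

```python
import math

S = 1 / math.sqrt(2)

# Bell vectors as amplitudes on |00>, |01>, |10>, |11>, following (15).
BELL = {
    "00": [S, 0, 0, S],
    "01": [0, S, S, 0],
    "10": [S, 0, 0, -S],
    "11": [0, S, -S, 0],
}

def three_qubit_state(alpha, beta):
    """Amplitudes of the product state in (16), index 4a + 2b + c for |abc>."""
    state = [0j] * 8
    for a, amp in enumerate((alpha, beta)):
        state[4 * a + 0] = amp * S    # |a00>
        state[4 * a + 3] = amp * S    # |a11>
    return state

def bob_branch(state, label):
    """Unnormalised vector of Bob's qubit when Alice finds the Bell vector."""
    bell = BELL[label]
    return [sum(complex(bell[ab]).conjugate() * state[2 * ab + c]
                for ab in range(4))
            for c in range(2)]

def apply(matrix, vec):
    """Matrix times vector for 2 by 2 nested lists."""
    return [sum(matrix[r][c] * vec[c] for c in range(2)) for r in range(2)]

CORRECTION_11 = [[0, 1], [-1, 0]]     # unitary applied by Bob in this branch

alpha, beta = 0.6, 0.8j               # arbitrary normalised amplitudes
branch = bob_branch(three_qubit_state(alpha, beta), "11")
restored = apply(CORRECTION_11, branch)
probability = sum(abs(x) ** 2 for x in branch)
```

Here `restored` equals `(alpha, beta)` up to the factor 1/2 coming from (17), and `probability` is the weight 1/4 of this outcome.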

2 Observables and states

To describe the simplest possible classical system, such as a register, we
need a configuration space that labels the contents of the register. For an
Ising spin this is just a set with two elements. More complex systems can
have continuous configuration spaces or, whenever motion is possible, a
general phase space. There is no notion of configuration space or phase space
for quantum systems. It is however possible to describe jointly classical and
quantum systems by passing to the level of observables: real or complex
functions on configuration or phase space for classical systems or Hermitian
linear maps on the space of wave functions for the quantum case. In both cases
the observables form a complex algebra, commutative in the classical case and
non-commutative in the quantum case. Algebra means that elements can be
multiplied by complex scalars, added and multiplied among themselves. These
three operations satisfy the usual axioms on associativity and distributivity.
An important ingredient in the quantum case is Hermitian conjugation
$A \mapsto A^*$, which extends the notion of complex conjugation for the classical
case where one can actually do with real functions. Finally, and this is only
relevant in continuous or infinite dimensional situations, an adapted notion
of continuity is needed.

2.1 Positive matrices

We shall mostly restrict ourselves to fully quantum systems with $d$ accessible
levels. For such systems the algebra of observables is $M_d$, the algebra of
$d \times d$ complex matrices. Hermitian conjugation is transposition followed by
complex conjugation:

$$(A^*)_{ij} = \overline{A_{ji}}.$$

For classical systems with $d$ states the algebra of observables $C_d$ consists of
the complex functions on the configuration space $\Omega_d = \{1, 2, \ldots, d\}$. It is
natural to identify a classical observable with a diagonal matrix

$$f \mapsto \begin{pmatrix} f(1) & 0 & \cdots & 0 \\ 0 & f(2) & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & f(d) \end{pmatrix}.$$

So classical can always be naturally embedded in quantum: $C_d \subset M_d$. The
Hermitian conjugate of a diagonal matrix corresponding to a function $f$ is
then the diagonal matrix corresponding to the conjugate function.

Hermitian and unitary matrices play an important role in quantum theory.
A matrix $A$ is Hermitian if $A^* = A$. Such matrices are general observables
of the system, their eigenvalues are real and they can be diagonalized by
representing them in a suitably chosen orthonormal basis. A matrix $U$ is
unitary if it satisfies $U^* U = \mathbb{1}$; this automatically implies that also
$U U^* = \mathbb{1}$, hence $U^*$ is the inverse of $U$. Unitary matrices preserve the
inner product:

$$\langle U\varphi, U\psi\rangle = \langle\varphi, \psi\rangle, \quad \varphi, \psi \in \mathbb{C}^d.$$

This is equivalent to stating that unitary matrices map orthonormal bases into
orthonormal bases. Unitary matrices are important in describing symmetries
of a quantum system, think of unitary group representations, and reversible
evolution.
Orthogonal projection matrices, also called projectors, are particular
Hermitian matrices:

$$P = P^* = P^2.$$

It is easily seen that also $\mathbb{1} - P$ is a projector and that
$P(\mathbb{1} - P) = (\mathbb{1} - P)P = 0$. Any vector $\varphi \in \mathbb{C}^d$ can therefore be written as
$\varphi = P\varphi + (\mathbb{1} - P)\varphi = \varphi_1 + \varphi_2$ with $\langle\varphi_1, \varphi_2\rangle = 0$ and therefore
$\|\varphi\|^2 = \|\varphi_1\|^2 + \|\varphi_2\|^2$. The vector $\varphi_1$ is the vector in $P\mathbb{C}^d$ that is
as close as possible to $\varphi$; it is the orthogonal projection of $\varphi$ on $P\mathbb{C}^d$.
In this way there is a one to one correspondence between projectors and
subspaces of $\mathbb{C}^d$. Let $P_1$ and $P_2$ be projectors such that
$P_1 P_2 = 0\ (= P_2 P_1)$; then $P_1 + P_2$ is again a projector. Conversely, $P_1 + P_2$
with $P_1$ and $P_2$ projectors is again a projector only if $P_1 P_2 = 0$. In terms of
the associated spaces: $P_1 P_2 = 0$ means that the subspaces $P_1\mathbb{C}^d$ and $P_2\mathbb{C}^d$
are orthogonal and $P_1 + P_2$ is then the projector on the space spanned by
$P_1\mathbb{C}^d$ and $P_2\mathbb{C}^d$. Also, $P_1 P_2$ is a projector if and only if $P_1$ and $P_2$
commute, in which case $P_1 P_2$ projects on the intersection of $P_1\mathbb{C}^d$ and
$P_2\mathbb{C}^d$. For general projectors there are two interesting constructions:
$P_1 \vee P_2$ is the projector on the space spanned by $P_1\mathbb{C}^d$ and $P_2\mathbb{C}^d$, and
$P_1 \wedge P_2$ is the projector on the intersection of $P_1\mathbb{C}^d$ and $P_2\mathbb{C}^d$.
$P_1 \vee P_2$ is called the join of $P_1$ and $P_2$ and $P_1 \wedge P_2$ their meet.
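The orthogonal decomposition of a vector by a projector can be illustrated with a rank-one projector. The sketch below assumes a hypothetical unit vector `e` in $\mathbb{C}^3$ and a hypothetical test vector `phi`; it checks idempotence, orthogonality of the two components and the Pythagoras identity for the norms.

```python
def rank_one_projector(e):
    """Projector |e><e| onto the line spanned by the unit vector e."""
    return [[a * complex(b).conjugate() for b in e] for a in e]

def apply(matrix, vec):
    """Matrix times vector for nested lists."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]

def norm_squared(vec):
    return sum(abs(v) ** 2 for v in vec)

e = [0.6, 0.8j, 0.0]                  # unit vector in C^3 (illustrative)
P = rank_one_projector(e)
phi = [1.0, 2.0 - 1.0j, 0.5j]         # arbitrary test vector

phi_1 = apply(P, phi)                             # P phi
phi_2 = [x - y for x, y in zip(phi, phi_1)]       # (1 - P) phi
overlap = sum(complex(a).conjugate() * b for a, b in zip(phi_1, phi_2))
pythagoras_gap = norm_squared(phi) - norm_squared(phi_1) - norm_squared(phi_2)
```

Both `overlap` and `pythagoras_gap` vanish up to rounding, and applying `P` to `phi_1` returns `phi_1` again.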
For Hermitian and for unitary matrices, in fact for normal matrices
($A^* A = A A^*$), one has an orthogonal decomposition of $\mathbb{C}^d$ in eigenspaces: to
each eigenvalue $\lambda$ of $A$ there corresponds an eigenspace $V_\lambda$ and a
corresponding projector $P_\lambda$. These projectors satisfy

$$P_\lambda P_\mu = \delta_{\lambda\mu}\,P_\lambda \quad\text{and}\quad \sum_\lambda P_\lambda = \mathbb{1}. \tag{18}$$

The matrix $A$ can then be decomposed as

$$A = \sum_\lambda \lambda\,P_\lambda. \tag{19}$$

An important notion is that of positive observable: $A = A^*$ is non-negative
if all the eigenvalues of $A$ are non-negative; sometimes one uses the term
positive semi-definite. So the measurement outcomes of a non-negative
observable are always non-negative. The terms positive and positive definite
are reserved for strict positivity. We will be sloppy and use positive for both
cases.
Proposition 1. The following conditions on $A \in M_d$ are equivalent:
1. $A$ is positive
2. $A = B^* B$ for some $B \in M_d$
3. $\langle\varphi, A\,\varphi\rangle \ge 0$ for every $\varphi \in \mathbb{C}^d$

Problem 9. Fill in the proof of the preceding proposition.

Problem 10. Find the necessary and sufficient conditions on $a, b, c$ to make

$$\begin{pmatrix} a & b & c \\ c & a & b \\ b & c & a \end{pmatrix}$$

positive.

We can now order pairs of Hermitian matrices: $A \le B$ if and only if $A = A^*$,
$B = B^*$ and $B - A$ is positive. For projectors this ordering can be expressed
as follows: $P_1 \le P_2$ if and only if $P_1 = P_2 P_1$. The join and meet operations
on projectors can be restated as follows: $P_1 \vee P_2$ is the smallest projector
that is larger than or equal to $P_1$ and $P_2$ while $P_1 \wedge P_2$ is the largest
projector that is dominated by both $P_1$ and $P_2$.

2.2 States

We now come to the notion of state. This will encode all the statistical
information that we can obtain about measurements of observables for a
system prepared with infinite care following a given procedure. In this sense
a state is a preparation procedure. So a state is an expectation functional
on the observables:

$$A \mapsto \langle A\rangle. \tag{20}$$

Up to now the term state vector was used for a quantum system. A state
vector $\varphi$ defines a state through

$$\langle A\rangle_\varphi := \langle\varphi, A\,\varphi\rangle, \quad A \text{ an observable}.$$

It turns out that the following requirements on states fit the experimental
observations very well and allow, moreover, for a probabilistic interpretation
of the theory:

Definition 1. A state $\langle\,\cdot\,\rangle$ on an algebra of observables $\mathcal{A}$ is a
functional that satisfies the following requirements:
1. $A \mapsto \langle A\rangle$ is a complex linear functional on the algebra of observables
2. $\langle\mathbb{1}\rangle = 1$, this is a normalisation condition
3. $\langle A\rangle \ge 0$ whenever $A$ is a positive observable.

The complex linearity is important and is asking strictly more than real
linearity on the (Hermitian) observables. Real linear theories have been
developed but they simply don't work. An important property of the state
space of a system is its convexity: if $\langle\,\cdot\,\rangle_1$ and $\langle\,\cdot\,\rangle_2$ are states
and $0 \le p \le 1$ then $p\,\langle\,\cdot\,\rangle_1 + (1 - p)\,\langle\,\cdot\,\rangle_2$ is also a state. This is an
immediate consequence of the previous definition. States can thus be mixed in
any proportion; we can in fact mix an arbitrary number of states.
2.2.1 Density matrices

We first characterise states on $C_d$:

Proposition 2. There is a one to one correspondence between states on $C_d$
and probability vectors $p$ of length $d$:

$$\langle f\rangle = \sum_{j=1}^{d} p_j\,f(j). \tag{21}$$

Here $p = \{p_j\}$ with $p_j \ge 0$ and $\sum_j p_j = 1$.

Problem 11. Prove the preceding proposition.

Next we characterise states on $M_d$. To do so, we introduce the notion of
density matrix, which is the quantum mechanical counterpart of a probability
vector. A density matrix $\rho$ of dimension $d$ is a matrix in $M_d$ that
satisfies

$$\rho \text{ is positive and } \operatorname{Tr}\rho = 1.$$

It is easily seen that the set of $d$-dimensional density matrices is convex.
We shall denote by $S_d$ the space of $d$-dimensional density matrices. This
notation refers to the density matrices as the states on $M_d$ and is justified
by the following proposition.

Proposition 3. There is a one to one affine correspondence between states
on $M_d$ and $d$-dimensional density matrices given by

$$\langle A\rangle = \operatorname{Tr}\rho A. \tag{22}$$

Problem 12. Fill in the proof of the preceding proposition.

Let $\varphi \in \mathbb{C}^d$ be a normalised vector, then

$$\langle A\rangle_\varphi = \langle\varphi, A\,\varphi\rangle = \operatorname{Tr}|\varphi\rangle\langle\varphi|\,A, \quad A \in M_d. \tag{23}$$

Therefore the one-dimensional projector $|\varphi\rangle\langle\varphi|$ is the density matrix
corresponding to the state $\langle\,\cdot\,\rangle_\varphi$.
Canonical Gibbs matrices are another important example: given an inverse
temperature $\beta > 0$ and a Hermitian Hamiltonian $H$, the canonical equilibrium
state has density matrix

$$\rho_\beta = \frac{e^{-\beta H}}{Z} \quad\text{with}\quad Z = \operatorname{Tr} e^{-\beta H}. \tag{24}$$

The normalization factor $Z$ is called the partition function and it yields the
free energy $-\log(Z)/\beta$ of the system.
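In the eigenbasis of $H$ a canonical Gibbs matrix is diagonal, so its entries can be computed from the energies alone. The sketch below assumes an illustrative two-level system with energies 0 and 1 in arbitrary units; the function names are hypothetical.

```python
import math

def gibbs_populations(energies, beta):
    """Diagonal of e^{-beta H}/Z in the eigenbasis of H, and Z itself."""
    weights = [math.exp(-beta * e) for e in energies]
    Z = sum(weights)
    return [w / Z for w in weights], Z

def free_energy(energies, beta):
    """Free energy -log(Z)/beta."""
    _, Z = gibbs_populations(energies, beta)
    return -math.log(Z) / beta

energies = [0.0, 1.0]                         # illustrative qubit levels
cold, _ = gibbs_populations(energies, beta=50.0)
hot, _ = gibbs_populations(energies, beta=1e-6)
```

At large inverse temperature `cold` concentrates on the ground level and the free energy approaches the ground state energy, while at small inverse temperature `hot` is close to the uniform distribution.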

2.2.2 Convex sets

There is a general theory for convex subsets of a real vector space that shows
how a compact convex set can be reconstructed in terms of its extreme points.
These results provide us with a number of useful notions about state spaces.
We consider for simplicity only subsets of finite dimensional real vector
spaces. Most results generalise to infinite dimensions modulo some additional
technical assumptions. A subset $C \subset \mathbb{R}^d$ is convex if $px + (1 - p)y \in C$
whenever $x, y \in C$ and $0 \le p \le 1$. If $C \subset \mathbb{R}^d$ is convex,
$x_1, x_2, \ldots, x_k \in C$, $p_j \ge 0$ and $\sum_j p_j = 1$ then also

$$\sum_{j=1}^{k} p_j x_j \in C. \tag{25}$$

This follows by induction from the definition of convexity.

Given $X \subset \mathbb{R}^d$ the convex hull of $X$ is the smallest convex set in $\mathbb{R}^d$ that
contains $X$. It is easily seen that

$$\operatorname{Conv}(X) = \Bigl\{ \sum_j p_j x_j \;\Bigm|\; x_j \in X,\ p_j \ge 0,\ \sum_j p_j = 1 \Bigr\}. \tag{26}$$

The closed convex hull is the closure of $\operatorname{Conv}(X)$; it is the set obtained by
adding all limit points.

Let $C$ be a closed convex subset of $\mathbb{R}^d$. A point $x \in C$ is called an extreme
point of $C$ if it cannot be written as a non-trivial mixture of points of $C$. In
formulas: $x$ is extreme if $x = px_1 + (1 - p)x_2$ with $0 < p < 1$ and $x_1, x_2 \in C$
implies that $x_1 = x_2$. The set of extreme points of a closed convex set is
called the extreme boundary of $C$ and denoted by $\operatorname{ext}(C)$.

Problem 13. Find the necessary and sufficient conditions on $a, b, c$ to turn
the matrix of Problem 10 into a density matrix. Is the set of density matrices
that you obtain convex? If so, find its extreme points.
Theorem 1 (Minkowski). Let $C$ be a closed, bounded, convex subset of $\mathbb{R}^d$,
then

$$C = \operatorname{Conv}\bigl(\operatorname{ext}(C)\bigr). \tag{27}$$

In infinite dimensions closed and bounded has to be replaced by compact,
which is strictly stronger, and one has to take the closed convex hull. The
general theorem is known as the Krein-Milman theorem.

Theorem 2 (Carathéodory). Let $C$ be a closed and bounded convex subset
of $\mathbb{R}^d$, then every element in $C$ is a convex combination of at most $d + 1$
extreme points of $C$.

Closed, bounded, convex subsets of $\mathbb{R}^d$ with exactly $d + 1$ extreme points are
called simplices. Any point in a simplex can be decomposed in a unique
way in a mixture of extreme points. The weights of this decomposition
are sometimes called the convex or barycentric coordinates of the point.
Simplices are basic building blocks for constructing more general sets, not
necessarily convex (triangulation).
A basic tool in proving Minkowski's theorem is the following separation
property for convex sets: given a closed, convex subset $C$ of $\mathbb{R}^d$ and a point
$x \in \mathbb{R}^d$ that does not belong to $C$ then there exists a separating hyperplane
in $\mathbb{R}^d$ in-between $x$ and $C$. A hyperplane $H$ in $\mathbb{R}^d$ is a $(d - 1)$-dimensional
plane in $\mathbb{R}^d$; it is determined by a unit vector $n$ and a real number $h$:

$$H = \{x \in \mathbb{R}^d \mid n \cdot x = h\}. \tag{28}$$

A hyperplane cuts $\mathbb{R}^d$ in a positive and a negative half-space

$$H^+ = \{x \in \mathbb{R}^d \mid n \cdot x \ge h\} \quad\text{and}\quad H^- = \{x \in \mathbb{R}^d \mid n \cdot x \le h\}. \tag{29}$$

The separation property claims that we can always find a hyperplane such
that

$$C \subset H^+ \quad\text{and}\quad x \in H^-. \tag{30}$$

Using this notion one can show that a closed convex set $C \subset \mathbb{R}^d$ different from
$\mathbb{R}^d$ is equal to the intersection of all the positive half-spaces that contain $C$.
I.e. such a set can be characterised by a set of linear inequalities. In general
there will be a lot of redundant inequalities and one can look for a minimal
set; this is called linear programming.

Problem 14. Describe the unit disk in $\mathbb{R}^2$ by a set of linear inequalities.

2.2.3 Pure and mixed states

We now characterise the boundary and extreme boundary of $S_d$, the state
space of $M_d$. To give a precise meaning to the topological boundary we
consider the state space as a subset of the plane of Hermitian $d$-dimensional
matrices with trace 1. A basis of neighbourhoods of an element $X$ of this
plane is given by open disks: Hermitian matrices of the form $X + Y$ where $Y$ is
Hermitian with trace 0 and norm less than a given $\epsilon$.

Proposition 4. A density matrix $\rho$ belongs to the (topological) boundary of
$S_d$ if and only if $\rho$ is not invertible, equivalently if and only if at least
one of the eigenvalues of $\rho$ is equal to 0.

Problem 15. Work out the details of the proof of the proposition.

Proposition 5. A density matrix $\rho$ belongs to the extreme boundary of $S_d$ if
and only if $\rho$ is a one-dimensional projector, equivalently if and only if
there is a normalized vector $\varphi \in \mathbb{C}^d$ such that $\rho = |\varphi\rangle\langle\varphi|$.

Problem 16. Work out the details of the proof of the proposition.

The vector states are also called pure states; they are the extreme points
of the state space and correspond to density matrices that are projectors of
dimension one. A state that is not pure can be written as a non-trivial
convex combination of pure states. Such states are called mixed and they
correspond to density matrices that have at least two eigenvalues different
from zero.
Mixed states arise e.g. when describing quantum sources. Suppose that a
source emits a number of pure states given by vectors $\varphi_j$ which appear with
probabilities $p_j$. Repeated measurements will eventually yield the density
matrix

$$\rho = \sum_j p_j\,|\varphi_j\rangle\langle\varphi_j|. \tag{31}$$

This contains all the information that can be gained by measurements about
the particles emitted by the source. The detailed description of the source,
$\{(p_j, \varphi_j)\}$, is called a quantum ensemble. Because the state space of a
quantum system is very different from a simplex, different quantum ensembles
can return the same density matrix and no experiment can discern between such
sources. This is very different from the classical case where the state space
is a simplex and where therefore a mixed state automatically defines a unique
ensemble.
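A standard illustration of this non-uniqueness uses two qubit ensembles: equal weights on $|0\rangle$ and $|1\rangle$, and equal weights on $(|0\rangle \pm |1\rangle)/\sqrt{2}$. The sketch below builds both density matrices following (31); the function name and the choice of ensembles are illustrative assumptions.

```python
import math

def density_matrix(ensemble):
    """rho = sum_j p_j |phi_j><phi_j| for a list of (p_j, phi_j) pairs."""
    d = len(ensemble[0][1])
    rho = [[0j] * d for _ in range(d)]
    for p, phi in ensemble:
        for r in range(d):
            for c in range(d):
                rho[r][c] += p * phi[r] * complex(phi[c]).conjugate()
    return rho

s = 1 / math.sqrt(2)
ensemble_z = [(0.5, [1, 0]), (0.5, [0, 1])]       # |0> and |1>
ensemble_x = [(0.5, [s, s]), (0.5, [s, -s])]      # (|0> +/- |1>)/sqrt(2)

rho_z = density_matrix(ensemble_z)
rho_x = density_matrix(ensemble_x)
```

Both ensembles give half the identity matrix, so no measurement statistics can tell the two sources apart.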
Let us count the number of real parameters needed to describe a
$d$-dimensional density matrix. For a general Hermitian matrix we need
$d + d(d - 1) = d^2$. The positivity condition doesn't change that number but
normalisation removes 1 degree of freedom, hence we need $d^2 - 1$. The
topological boundary imposes one additional real condition and so we still
need $d^2 - 2$. The points of the extreme boundary correspond to one-dimensional
subspaces of $\mathbb{C}^d$. Normalising a vector in such a subspace and fixing its
phase we remain with $2(d - 1)$ real parameters. So we see that the extreme
boundary is in general a much smaller set than the boundary except for a qubit
($d = 2$). It turns out that in any dimension the pure state space is a very
nice Riemannian manifold. The boundary of the state space is, however, very
complicated and contains many flat parts.

Suppose that $x$ is a point on the topological boundary of a compact convex
subset $C$ of $\mathbb{R}^d$. Such a point need not be extreme but it defines a face of $C$

$$F(x) = \{y \in C \mid \exists\, z \in C \text{ and } 0 < \lambda < 1 \text{ such that } x = \lambda y + (1 - \lambda)z\}. \tag{32}$$

Faces are compact and convex.

Let $\rho$ be a density matrix on the boundary of the space of $d$-dimensional
density matrices, then $\rho$ has at least one zero eigenvalue. Suppose that $\rho$ has
$k$ strictly positive eigenvalues, taking possible multiplicities into account.
It is not hard to see that the face of $\rho$ is actually the state space of a
quantum system of dimension $k$.

Problem 17. Check the assertion above about the faces of the state space
of $M_d$.
2.2.4 The pure state space of $M_d$

Let $\varphi$ be a unit vector in $\mathbb{C}^d$. The set
\[ \{ z\varphi \mid z \in \mathbb{C},\ |z| = 1 \} \tag{33} \]
is called a ray in $\mathbb{C}^d$. As multiplying a state vector by a complex number
of modulus one doesn't affect the expectation value of any observable, all
vectors in a ray yield the same pure state on $M_d$. It is easily seen that the
correspondence between rays and pure states is actually one to one. The
space of rays in $\mathbb{C}^d$ is sometimes called the complex projective Hilbert
space of dimension d: CP(d). We actually just said that
\[ \mathrm{CP}(d) = \{ \varphi \in \mathbb{C}^d \mid \|\varphi\| = 1 \} / \mathrm{U}(1). \tag{34} \]
Here U(1) are the one-dimensional unitary matrices: the complex numbers
of modulus one.
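The usual norm distance in $\mathbb{C}^d$ induces a natural distance on CP(d). If we
denote by $[\varphi]$ the ray of $\varphi \in \mathbb{C}^d$, $\|\varphi\| = 1$, then
\[ d_{SF}\bigl([\varphi_1], [\varphi_2]\bigr) := \min_{z \in \mathrm{U}(1)} \|\varphi_1 - z\varphi_2\|. \tag{35} \]
The distance $d_{SF}$ is called the Study-Fubini distance and it is easily computed
\[ d_{SF}\bigl([\varphi_1], [\varphi_2]\bigr) = \sqrt{2 - 2\,|\langle\varphi_1, \varphi_2\rangle|}. \tag{36} \]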
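The closed form (36) can be checked against the definition (35) by brute force. The sketch below minimises over a fine grid of phases; the function names, the grid size and the random test vectors are illustrative choices, not part of the notes.

```python
import numpy as np

def d_sf_closed_form(phi1, phi2):
    """Study-Fubini distance from formula (36)."""
    overlap = abs(np.vdot(phi1, phi2))          # |<phi1, phi2>|
    return np.sqrt(max(0.0, 2.0 - 2.0 * overlap))

def d_sf_brute_force(phi1, phi2, n_phases=20000):
    """Definition (35): minimise ||phi1 - z phi2|| over a grid of phases z."""
    thetas = np.linspace(0.0, 2.0 * np.pi, n_phases, endpoint=False)
    z = np.exp(1j * thetas)
    diffs = phi1[None, :] - z[:, None] * phi2[None, :]
    return np.linalg.norm(diffs, axis=1).min()

rng = np.random.default_rng(0)

def random_unit_vector(d):
    v = rng.normal(size=d) + 1j * rng.normal(size=d)
    return v / np.linalg.norm(v)

phi1 = random_unit_vector(3)
phi2 = random_unit_vector(3)

closed = d_sf_closed_form(phi1, phi2)
brute = d_sf_brute_force(phi1, phi2)
# Arc length between the two rays, always at least the chord length above
geodesic = np.arccos(min(1.0, abs(np.vdot(phi1, phi2))))
print(closed, brute, geodesic)
```

The distance only depends on the rays: multiplying either vector by a phase leaves all three numbers unchanged.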
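Let us consider a small perturbation $d\varphi$ of the vectors $z\varphi$ belonging to a ray
that does not change the norm to first order
\[ \|z\varphi + d\varphi\|^2 = \|\varphi\|^2 + \bar z\,\langle\varphi, d\varphi\rangle + z\,\langle d\varphi, \varphi\rangle + o(\|d\varphi\|) = 1 + \bar z\,\langle\varphi, d\varphi\rangle + z\,\langle d\varphi, \varphi\rangle + o(\|d\varphi\|). \tag{37} \]
Hence $\langle\varphi, d\varphi\rangle = 0$, i.e., the tangent space at $[\varphi]$ to CP(d) is the $(d-1)$-dimensional subspace $\varphi^\perp$ of $\mathbb{C}^d$.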


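For $d\varphi \perp \varphi$ we have
\[ \|\varphi + d\varphi\|^2 = 1 + \|d\varphi\|^2 \tag{38} \]
and hence
\[ \frac{\varphi + d\varphi}{\|\varphi + d\varphi\|} = \varphi + d\varphi - \tfrac{1}{2}\|d\varphi\|^2\,\varphi + o(\|d\varphi\|^2). \tag{39} \]
We now compute the square $ds^2$ of the Study-Fubini distance between $[\varphi]$
and $\bigl[\frac{\varphi + d\varphi}{\|\varphi + d\varphi\|}\bigr]$:
\[ ds^2 = 2 - 2\,\Bigl|\Bigl\langle \varphi, \frac{\varphi + d\varphi}{\|\varphi + d\varphi\|} \Bigr\rangle\Bigr| = 2 - 2\bigl(1 - \tfrac{1}{2}\|d\varphi\|^2\bigr) + o(\|d\varphi\|^2) = \|d\varphi\|^2 + o(\|d\varphi\|^2). \tag{40} \]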

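So we see that the Study-Fubini distance yields a Riemannian metric on
CP(d) defined at $[\varphi]$ by the identity matrix on $\varphi^\perp$.
The length of a parametric curve $t \in [0, 1] \mapsto [\varphi(t)]$ on CP(d) starting at
$[\varphi_1]$ and ending at $[\varphi_2]$ is then
\[ \int_0^1 dt\, \Bigl\| \frac{d\varphi(t)}{dt} \Bigr\|. \tag{41} \]
For two points $[\varphi_1]$ and $[\varphi_2]$ we can always choose the phases so that $\langle\varphi_1, \varphi_2\rangle \ge 0$. The geodesic connecting $[\varphi_1]$ and $[\varphi_2]$ is then the circular arc in the
$(\varphi_1, \varphi_2)$-plane connecting $\varphi_1$ and $\varphi_2$. Its length is
\[ \cos^{-1} |\langle\varphi_1, \varphi_2\rangle|. \tag{42} \]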

Problem 18. Verify the statements made in this section.

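2.3 State space of a qubit

The state space of a qubit is particularly simple. We first parametrise a
general qubit density matrix in terms of the Pauli matrices $\sigma = (\sigma_1, \sigma_2, \sigma_3)$
\[ \sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad\text{and}\quad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{43} \]
The Pauli matrices together with $\mathbb{1}$ form a basis of the Hermitian matrices of
$M_2$. Therefore $\rho = \frac{1}{2}(a\mathbb{1} + x\cdot\sigma)$ with $a \in \mathbb{R}$ and $x \in \mathbb{R}^3$. As $\mathrm{Tr}\,\rho = 1$
we have $a = 1$ and in order to have $\rho$ non-negative we still must impose
$\det(\rho) = \frac{1}{4}(1 - \|x\|^2) \ge 0$ or $\|x\| \le 1$. So we see that the state space of a
qubit is affinely isomorphic to the unit ball in $\mathbb{R}^3$, also called the Bloch ball
\[ \rho = \tfrac{1}{2}(\mathbb{1} + x\cdot\sigma), \quad x \in \mathbb{R}^3,\ \|x\| \le 1. \tag{44} \]
The boundary and extreme boundary coincide in this case. The centre of
the ball is the uniform state $A \mapsto \frac{1}{2}\mathrm{Tr}\,A$ which is for instance obtained as
an equilibrium state at infinite temperature.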
Problem 19. Check that the parametrisation of qubit vector states used
in (4) corresponds to the usual parametrisation of the unit sphere in terms
of spherical angular coordinates.

To make a full tomography of a qubit state we can perform series of measurements to obtain reliable figures for the expectations of $\sigma_1$, $\sigma_2$, and $\sigma_3$.
Using the Bloch parametrisation (44) we have
\[ x_j = \mathrm{Tr}\,\rho\,\sigma_j = \langle\sigma_j\rangle. \tag{45} \]
From the positivity condition for a qubit density matrix we see that we must
have
\[ \langle\sigma_1\rangle^2 + \langle\sigma_2\rangle^2 + \langle\sigma_3\rangle^2 \le 1. \tag{46} \]
The closer this expression comes to 1 the purer the state is. A full tomography of more complicated systems, like two qubits, requires many more
measurements and is therefore a costly and lengthy operation.
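A minimal sketch of this reconstruction, assuming exact expectation values rather than noisy measurement data: the Bloch vector is read off with (45) and the density matrix is rebuilt with (44). The function names and the sample state are ours.

```python
import numpy as np

# Pauli matrices as in (43)
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

def bloch_vector(rho):
    """x_j = Tr(rho sigma_j), formula (45)."""
    return np.array([np.trace(rho @ s).real for s in sigma])

def from_bloch(x):
    """rho = (1 + x . sigma) / 2, formula (44)."""
    return 0.5 * (np.eye(2) + sum(xj * sj for xj, sj in zip(x, sigma)))

# An arbitrary (hypothetical) mixed qubit state
rho = np.array([[0.7, 0.2 - 0.1j],
                [0.2 + 0.1j, 0.3]])

x = bloch_vector(rho)
purity_indicator = np.dot(x, x)   # left-hand side of (46)
print(x, purity_indicator)
```

In an actual experiment the three expectations carry statistical errors, so the estimated vector may fall slightly outside the unit ball and has to be corrected.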

2.4 A probability theory approach to states

In classical probability one usually starts with the notion of a universe $\Omega$ of
elementary events. For a die this would be the faces: $\Omega = \{1, 2, \ldots, 6\}$. The
events to which a probability will be assigned are then the Borel subsets of $\Omega$.
For a finite set these are just the elements of the power set of $\Omega$. In general, it
is a Borel algebra $\mathcal{B}$ meaning that the set is closed under countable unions
and intersections and under taking complements. The empty set $\emptyset$ also
belongs to $\mathcal{B}$. A probability measure is now a function $\mu$ on $\mathcal{B}$ with values in
$[0, 1]$ such that $\mu(\emptyset) = 0$, $\mu(\Omega) = 1$, and
\[ \mu\Bigl(\bigcup_j B_j\Bigr) = \sum_j \mu(B_j) \]
for all countable collections $\{B_j\}$ of disjoint Borel sets. This last property is
called $\sigma$-additivity. The Borel sets are the subsets of $\Omega$ that can be given a
probability, they represent the events that can occur.
For a quantum system the projectors play the role of events: the corresponding measurement can only have two outcomes 0 or 1, true or false. The
projectors (on closed subspaces of a Hilbert space) form a lattice. Recall
that a projector $P_1$ is smaller than $P_2$ if $P_2 - P_1$ is positive semi-definite and this
is equivalent to $P_1 = P_2 P_1$. Being a lattice means that for any two projectors
$P_1$ and $P_2$ there is a smallest projector that dominates both, the join $P_1 \vee P_2$, and
a largest projector that is dominated by both, the meet $P_1 \wedge P_2$. Actually, the
lattice is bounded because there is a smallest element, the zero operator, and a
largest one, the identity. The lattice is also orthocomplemented: the join of
$P$ and $P^\perp$ is $\mathbb{1}$ and their meet is 0. One can also show that the lattice is
closed under countable joins and meets. A quantum probability measure can
now be characterized as in the classical case: a function $\mu$ from the lattice
of projectors to $[0, 1]$ such that $\mu(0) = 0$, $\mu(\mathbb{1}) = 1$, and
\[ \mu\Bigl(\bigvee_j P_j\Bigr) = \sum_j \mu(P_j) \]
for every countable collection $\{P_j\}$ of pairwise orthogonal projectors.
Gleason showed that for Hilbert spaces of dimension larger than two every
quantum probability measure corresponds to a density matrix $\rho$:
\[ \mu(P) = \mathrm{Tr}\,\rho P, \quad P \text{ a projector}. \tag{47} \]

A basic difference between classical and quantum probability is that conditioning doesn't work in the quantum case. Let $X$ and $Y$ be two Borel subsets
of $\Omega$ and let $\mathrm{Prob}(Y) > 0$. The conditional probability of $X$ given $Y$ is
defined to be
\[ \mathrm{Prob}(X|Y) := \frac{\mathrm{Prob}(X \cap Y)}{\mathrm{Prob}(Y)}. \tag{48} \]
Suppose that $\{Y_j\}$ is a partition of $\Omega$ in at most a countably infinite number
of Borel subsets such that $\mathrm{Prob}(Y_j) > 0$ for every $j$. We can then write for
an arbitrary Borel subset $X$ of $\Omega$
\[ \mathrm{Prob}(X) = \sum_j \mathrm{Prob}(Y_j)\, \mathrm{Prob}(X|Y_j). \tag{49} \]
This is the law of total probability, often loosely called Bayes's law. In the quantum context this doesn't work,
the reason being that for a countable partition $\{P_j\}$ of the identity, i.e. for
a countable family of projectors $P_j$ such that $\sum_j P_j = \mathbb{1}$, and for a
general projector $Q$ one usually has the strict inequality
\[ Q > \bigvee_j Q \wedge P_j, \tag{50} \]
while in the classical case, for a partition $\{Y_j\}$ of $\Omega$
\[ X = \bigcup_j X \cap Y_j. \tag{51} \]
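The strict inequality (50) already shows up for a single qubit. In the sketch below the meet of two projectors is computed as the projector onto the intersection of their ranges; the helper names `projector` and `meet` and the numerical tolerance are assumptions made for this example.

```python
import numpy as np

def projector(vectors):
    """Orthogonal projector onto the span of linearly independent vectors."""
    a = np.column_stack(vectors).astype(complex)
    q, _ = np.linalg.qr(a)
    return q @ q.conj().T

def meet(p, q, tol=1e-10):
    """Projector onto range(p) intersected with range(q)."""
    dim = p.shape[0]
    # A vector lies in both ranges iff it is annihilated by the positive
    # operator (1 - p) + (1 - q).
    defect = (np.eye(dim) - p) + (np.eye(dim) - q)
    w, v = np.linalg.eigh(defect)
    kernel = v[:, w < tol]
    return kernel @ kernel.conj().T

# Partition of the identity: projectors on |0> and |1>
p0 = projector([np.array([1.0, 0.0])])
p1 = projector([np.array([0.0, 1.0])])
# A projector that is not compatible with the partition: projector on |+>
q = projector([np.array([1.0, 1.0]) / np.sqrt(2)])

# Both meets vanish, so their join is 0, which is strictly smaller than q
m0, m1 = meet(q, p0), meet(q, p1)
print(np.allclose(m0, 0), np.allclose(m1, 0), np.trace(q).real)
```

For commuting projectors the two sides of (50) coincide, which is the classical situation (51).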

2.5 States of composite systems

Often one has only access to local information contained in a state of a
composite system. Assume for simplicity a system composed of two parties
A and B that are sufficiently distant so that one cannot reasonably observe
quantities that correlate both parties. In such a case one will not so much be
interested in the full joint density matrix $\rho_{AB}$ but rather in the marginals
of this state
\[ X \mapsto \mathrm{Tr}\,\rho_{AB}\,(X \otimes \mathbb{1}_B) \quad\text{and}\quad Y \mapsto \mathrm{Tr}\,\rho_{AB}\,(\mathbb{1}_A \otimes Y). \tag{52} \]
It is clear that these define states on A and B. So there is a density matrix
$\rho_A$ for the first party such that
\[ \mathrm{Tr}\,\rho_A X = \mathrm{Tr}\,\rho_{AB}\,(X \otimes \mathbb{1}_B), \quad X \text{ observable of first party}. \tag{53} \]
$\rho_A$ is called a reduced density matrix of $\rho_{AB}$ and it is obtained by computing a partial trace
\[ \rho_A = \mathrm{Tr}_B\,\rho_{AB} \quad\text{or}\quad \langle j|\rho_A|k\rangle = \sum_\ell \langle j\ell|\rho_{AB}|k\ell\rangle. \tag{54} \]

Problem 20. Show that the partial trace does not depend on the choice of
basis in the space over which the partial trace is taken.
Problem 21. Verify (54).
Given local density matrices $\rho_A$ and $\rho_B$ then there is always an extension to
the composite system: $\rho_A \otimes \rho_B$. This corresponds to independence between
both parties. In most cases there will be more possibilities compatible with
the local restrictions. Indeed, for a general bipartite state we have $d_A^2 d_B^2 - 1$
real freedoms while specifying the reduced density matrices consumes only
$d_A^2 + d_B^2 - 2$ parameters.
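The partial trace (54) is easy to implement by viewing the joint density matrix as a four-index array. The following sketch, with function names of our own choosing and a randomly generated state, checks the defining property (53) and that the product of the marginals has the same marginals as the original state.

```python
import numpy as np

def partial_trace_B(rho_ab, d_a, d_b):
    """<j|rho_A|k> = sum_l <j l|rho_AB|k l>, formula (54)."""
    r = rho_ab.reshape(d_a, d_b, d_a, d_b)
    return np.einsum('jlkl->jk', r)

def partial_trace_A(rho_ab, d_a, d_b):
    """Same construction, tracing out the first party."""
    r = rho_ab.reshape(d_a, d_b, d_a, d_b)
    return np.einsum('jkjl->kl', r)

rng = np.random.default_rng(1)
d_a, d_b = 2, 3

# Random joint density matrix: G G* normalised to unit trace
g = rng.normal(size=(d_a * d_b, d_a * d_b)) + 1j * rng.normal(size=(d_a * d_b, d_a * d_b))
rho_ab = g @ g.conj().T
rho_ab /= np.trace(rho_ab).real

rho_a = partial_trace_B(rho_ab, d_a, d_b)
rho_b = partial_trace_A(rho_ab, d_a, d_b)

# Random Hermitian observable of the first party
x = rng.normal(size=(d_a, d_a))
x = x + x.T

lhs = np.trace(rho_a @ x)
rhs = np.trace(rho_ab @ np.kron(x, np.eye(d_b)))
product = np.kron(rho_a, rho_b)   # independent extension with the same marginals
print(np.isclose(lhs, rhs))
```

The product state generally differs from the original joint state although both are compatible with the same local data.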

How strongly quantum systems are correlated is partly captured in the notion
of entanglement. A state $\rho_{AB}$ of a bipartite system is called separable if
it is a convex combination of product states
\[ \rho_{AB} = \sum_\alpha p_\alpha\, \rho_{A,\alpha} \otimes \rho_{B,\alpha}. \tag{55} \]
A state that is not separable is called entangled. By construction, the separable states form a convex set. Deciding whether a given bipartite state is
separable or entangled turns out to be a very difficult problem. Moreover,
a more refined distinction between classes of states seems to be necessary
in order to understand their usefulness for various tasks. These questions are
very difficult and go far beyond the scope of this course. Restricting to pure
bipartite states is, however, quite simple: a pure state is separable if and
only if it is a product of pure states.
Problem 22. Show that the extreme points of the set of separable states
are the pure product states.
There is a simple quantitative characterisation of the degree of entanglement
of a pure state based on the following proposition:
Proposition 6. For any $\psi_{AB} \in \mathbb{C}^{d_A} \otimes \mathbb{C}^{d_B}$ of a composite system there exist
orthonormal families $\{e_j\}$ in $\mathbb{C}^{d_A}$ and $\{f_j\}$ in $\mathbb{C}^{d_B}$ and positive numbers $c_j$
such that
\[ \psi_{AB} = \sum_j c_j\, e_j \otimes f_j. \tag{56} \]
This decomposition is called the Schmidt decomposition and the number
of necessary terms is the Schmidt number or Schmidt rank of $\psi_{AB}$.
Problem 23. Fill out the details of the proposition.
Problem 24. Find the Schmidt decomposition of the two qubit state
\[ \frac{1}{\sqrt{3}}\bigl(|00\rangle + |01\rangle + |10\rangle\bigr). \]
Clearly a pure state is separable if and only if its Schmidt number is equal
to 1. It is also clear that the Schmidt number of a pure state is quite discontinuous as it takes values in $\mathbb{N}_0$. A very useful consequence of the Schmidt
decomposition of bipartite pure states is:
Proposition 7. The eigenvalues, with their multiplicities, of both marginals
of a bipartite pure state are equal up to zeros.
Problem 25. Let $\psi$ be a state vector in $\mathbb{C}^{d_1} \otimes \mathbb{C}^{d_2}$. What is the maximal
number of non-zero eigenvalues of the reduced density matrices?
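Numerically, a Schmidt decomposition can be obtained from the singular value decomposition of the coefficient matrix of the state vector. The sketch below is one way to do this with NumPy; the function name `schmidt`, the tolerance and the random test state are assumptions for illustration only.

```python
import numpy as np

def schmidt(psi, d_a, d_b, tol=1e-12):
    """Return (c, e, f): coefficients c_j and columns e_j, f_j with
    psi = sum_j c_j e_j (x) f_j, as in (56)."""
    m = psi.reshape(d_a, d_b)            # psi = sum_{a,b} m[a, b] |a>|b>
    u, s, vh = np.linalg.svd(m, full_matrices=False)
    keep = s > tol
    return s[keep], u[:, keep], vh[keep, :].T

rng = np.random.default_rng(2)
d_a, d_b = 2, 3
psi = rng.normal(size=d_a * d_b) + 1j * rng.normal(size=d_a * d_b)
psi /= np.linalg.norm(psi)

c, e, f = schmidt(psi, d_a, d_b)
rebuilt = sum(c[j] * np.kron(e[:, j], f[:, j]) for j in range(len(c)))

# Marginals of the pure state |psi><psi|
m = psi.reshape(d_a, d_b)
rho_a = m @ m.conj().T
rho_b = m.T @ m.conj()
eig_a = np.sort(np.linalg.eigvalsh(rho_a))
eig_b = np.sort(np.linalg.eigvalsh(rho_b))

# A product vector has Schmidt number 1
product = np.kron(np.array([1.0, 1.0j]) / np.sqrt(2), np.array([1.0, 0.0, 0.0]))
rank_product = len(schmidt(product, d_a, d_b)[0])
print(len(c), rank_product)
```

The non-zero eigenvalues of both marginals equal the squares $c_j^2$, in line with Proposition 7.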

Problem 26. Could one extend the Schmidt decomposition to more than
two parties?
Closely related to the Schmidt decomposition is the purification of a general
state. Let $\rho$ be a density matrix on $\mathbb{C}^d$, then we can find an orthonormal
basis $\{e_j\}$ and a probability vector $\{r_j\}$ such that
\[ \rho = \sum_j r_j\, |e_j\rangle\langle e_j|. \tag{57} \]
In fact we can limit the sum to the $j$ with $r_j > 0$. Suppose there are $d'$ such
$j$, the number $d'$ is the rank of $\rho$, it is the dimension of the range space of
$\rho$. Pick now an orthonormal basis $\{f_j\}$ in $\mathbb{C}^{d'}$ and construct the normalised
vector
\[ \psi = \sum_{j,\ r_j > 0} \sqrt{r_j}\; e_j \otimes f_j \tag{58} \]
in $\mathbb{C}^d \otimes \mathbb{C}^{d'}$. One then checks that the reduction of $|\psi\rangle\langle\psi|$ to the first system
is precisely equal to $\rho$. The pure state on the composite system defined by
$\psi$ is called a purification of $\rho$ and $\mathbb{C}^{d'}$ is called an ancillary space.
Problem 27. Verify the statements above. Can you parametrise all possible
purifications of $\rho$?
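The construction (58) translates directly into a few lines of NumPy. This is only a sketch: the function name `purify`, the tolerance used to decide which eigenvalues count as zero, and the rank-2 example state are our own choices.

```python
import numpy as np

def purify(rho, tol=1e-12):
    """Return (psi, d_anc) with psi = sum_j sqrt(r_j) e_j (x) f_j as in (58)."""
    r, e = np.linalg.eigh(rho)
    keep = r > tol                       # restrict to strictly positive eigenvalues
    r, e = r[keep], e[:, keep]
    d_anc = len(r)                       # rank of rho = dimension of the ancilla
    f = np.eye(d_anc)                    # standard basis of the ancillary space
    psi = sum(np.sqrt(r[j]) * np.kron(e[:, j], f[:, j]) for j in range(d_anc))
    return psi, d_anc

# A rank-2 density matrix on C^3
v = np.array([1.0, 0.0, 0.0])
w = np.array([0.0, 1.0, 1.0]) / np.sqrt(2)
rho = 0.5 * np.outer(v, v) + 0.5 * np.outer(w, w)

psi, d_anc = purify(rho)

# Reduction of |psi><psi| to the first system
m = psi.reshape(3, d_anc)
reduced = m @ m.conj().T
print(d_anc, np.allclose(reduced, rho))
```

Replacing the basis $\{f_j\}$ by any other orthonormal family in a possibly larger ancillary space gives further purifications of the same state.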


3 Quantum dynamics

We suppose that we can start the dynamics of a system at some initial time
$t_0$ with an arbitrary initial state $\rho_0$ of the system. At some later time $t$ the
state of the system is $\rho(t; t_0, \rho_0)$. The following general requirements on an
evolution appear to agree well with observations
1. The map $\rho_0 \mapsto \rho(t; t_0, \rho_0)$ is affine: it preserves convex mixtures. This
allows us to introduce a linear evolution map $\Gamma(t, t_0)$ on $M_k$ such that
\[ \rho(t; t_0, \rho_0) = \Gamma(t, t_0)\,\rho_0, \quad t \ge t_0. \tag{59} \]
Clearly the map $\Gamma$ is positive because it maps any density matrix into
another density matrix and it preserves normalisation, in other words
$\Gamma(t, t_0)$ is trace-preserving.
2. The maps $\Gamma(t, t_0)$ depend continuously on $t$ and $\Gamma(t_0, t_0) = \mathrm{id}$.
3. It should be possible to trivially extend the dynamics to an enlarged
setting system + environment: for any dimension $d$ of the environment
\[ \sigma \otimes \rho \mapsto \sigma \otimes \Gamma(t, t_0)\,\rho \tag{60} \]
should extend to an affine transformation of the global state space of
environment + system. Here $\sigma$ is an arbitrary d-dimensional density
matrix and $\rho$ an arbitrary density matrix of the system. This condition
is called complete positivity of $\Gamma$ and it is a strong requirement as
we shall see later on.

3.1 Dynamics in discrete time

In this section we concentrate our attention mostly on the complete positivity
of dynamical maps. More precisely, we consider a single map $\Gamma$ from the
state space of $M_k$ to that of $M_n$, i.e. a linear map that sends k-dimensional
density matrices to n-dimensional density matrices. Clearly, this is equivalent
to imposing that $\Gamma$ is positive and trace-preserving. Here positive means
mapping positive matrices to positive matrices. As we think of $\Gamma$ as acting
on states we say that $\Gamma$ is a dynamical map in Schrödinger picture.
There is of course also a Heisenberg version $\Gamma^*$ of $\Gamma$ obtained through
duality
\[ \mathrm{Tr}\,\Gamma(\rho)\,X = \mathrm{Tr}\,\rho\,\Gamma^*(X), \quad \rho \text{ k-dimensional density matrix},\ X \text{ n-dimensional observable}. \tag{61} \]
Note that $\Gamma^* : M_n \to M_k$. It is easily seen that $\Gamma^*$ is unity-preserving,
$\Gamma^*(\mathbb{1}_n) = \mathbb{1}_k$, whenever $\Gamma$ is trace-preserving.
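Example 1. Transposition with respect to some given basis is a positive
transformation of $M_2$ because it preserves hermiticity and the spectrum.
However, if we trivially extend transposition to a composite system of two
qubits we lose positivity. Consider the projector $P$ onto $\frac{1}{\sqrt{2}}\bigl(|00\rangle + |11\rangle\bigr)$:
\[ P = \frac{1}{2} \begin{pmatrix} 1 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 1 \end{pmatrix} = \frac{1}{2} \begin{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} & \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \\[2mm] \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} & \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} \end{pmatrix}. \]
We now transform the density matrix $P$ by acting with the transposition $T$
only on the second qubit, i.e. by transposing each of the $2 \times 2$ blocks:
\[ (\mathrm{id}_2 \otimes T)(P) = \frac{1}{2} \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \]
which is not positive: on the subspace spanned by $|01\rangle$ and $|10\rangle$ it acts as $\frac{1}{2}\sigma_1$, which has the eigenvalue $-\frac{1}{2}$. Hence, we already lose
positivity if we extend the transposition to a single additional qubit.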
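The loss of positivity can be verified numerically. The sketch below applies the transposition to the second tensor factor by permuting array indices; the function name `partial_transpose_second` is ours.

```python
import numpy as np

def partial_transpose_second(x, d1, d2):
    """Apply id (x) T: transpose only the indices of the second factor."""
    r = x.reshape(d1, d2, d1, d2)            # r[i, a, j, b] = <i a| x |j b>
    return r.transpose(0, 3, 2, 1).reshape(d1 * d2, d1 * d2)

# Projector onto (|00> + |11>) / sqrt(2)
bell = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)
p = np.outer(bell, bell)

p_pt = partial_transpose_second(p, 2, 2)
eigenvalues = np.linalg.eigvalsh(p_pt)
print(eigenvalues)
```

The spectrum contains the eigenvalue $-\frac{1}{2}$, whereas the full transposition of $P$ would of course remain positive.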
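Let $d \in \mathbb{N}_0$. A $\mathbb{C}$-linear map $\Gamma : M_k \to M_n$ can be extended to a $\mathbb{C}$-linear
map $\mathrm{id}_d \otimes \Gamma$ from $M_d \otimes M_k$ to $M_d \otimes M_n$. Any element $X \in M_d \otimes M_k$ can
be written as a $d \times d$ matrix with entries in $M_k$:
\[ X = \begin{pmatrix} X_{11} & \cdots & X_{1d} \\ \vdots & \ddots & \vdots \\ X_{d1} & \cdots & X_{dd} \end{pmatrix}, \quad X_{ij} \in M_k. \tag{62} \]
We then put
\[ \mathrm{id}_d \otimes \Gamma\,(X) = \begin{pmatrix} \Gamma(X_{11}) & \cdots & \Gamma(X_{1d}) \\ \vdots & \ddots & \vdots \\ \Gamma(X_{d1}) & \cdots & \Gamma(X_{dd}) \end{pmatrix}. \tag{63} \]
Problem 28. Show that (63) is equivalent to putting for $A \in M_d$ and
$B \in M_k$
\[ \mathrm{id}_d \otimes \Gamma\,(A \otimes B) := A \otimes \Gamma(B). \]
A $\mathbb{C}$-linear map $\Gamma : M_k \to M_n$ is d-positive if $\mathrm{id}_d \otimes \Gamma$ is positive. A $\mathbb{C}$-linear
map $\Gamma : M_k \to M_n$ is completely positive if it is d-positive for $d = 1, 2, \ldots$
It can be shown that there exist for any $d$ maps that are d-positive but not
$(d + 1)$-positive. This is, however, not easy.
Problem 29. Show that $\Gamma$ is completely positive iff $\Gamma^*$ is completely positive.
The d-positive maps from $M_k$ to $M_n$ form a convex cone: if $\Gamma_1$ and $\Gamma_2$ are
d-positive and if $a$ and $b$ are non-negative numbers then $a\Gamma_1 + b\Gamma_2$ too is
d-positive. A composition of two d-positive maps with matching dimensions
is again d-positive.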

3.2 General quantum operations

Sometimes the term super-operator is used to denote linear transformations of $M_k$, or linear maps from $M_k$ to $M_n$. This is just to warn you that
we are not working on the level of the space of wave functions but rather
on that of states or observables considered as elements of the linear space of
transformations of vector states.
There is a standard way of encoding such a super-operator called the Choi
encoding. We start by introducing the standard matrix units $e_{ij} := |i\rangle\langle j|$,
these are the matrices with all entries equal to 0 except for a 1 on row $i$ and
column $j$. Obviously these matrix units form a basis of $M_k$ considered as
vector space and we have
\[ M_k \ni A = \sum_{ij} A_{ij}\, e_{ij}. \tag{64} \]
It is clear that a super-operator $\Gamma : M_k \to M_n$ is completely specified if we
know its action on each of the $e_{ij}$. The Choi encoding does this in a global
way:
\[ C(\Gamma) := \sum_{ij} e_{ij} \otimes \Gamma(e_{ij}). \tag{65} \]
In this way one associates to the super-operator $\Gamma$ a kn-dimensional matrix
and vice versa. It is easily seen that $C(\Gamma_1 + \Gamma_2) = C(\Gamma_1) + C(\Gamma_2)$ but it is
NOT TRUE that $C(\Gamma_1 \circ \Gamma_2) = C(\Gamma_1)\,C(\Gamma_2)$.
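The Choi encoding (65) is straightforward to compute for a map given as a function on matrices. The following sketch, in which the function name `choi` and the two test maps are our own choices, also illustrates the additivity and the failure of multiplicativity just mentioned.

```python
import numpy as np

def choi(gamma, k):
    """C(gamma) = sum_ij e_ij (x) gamma(e_ij) for a linear map gamma on M_k."""
    n = gamma(np.zeros((k, k), dtype=complex)).shape[0]
    c = np.zeros((k * n, k * n), dtype=complex)
    for i in range(k):
        for j in range(k):
            e_ij = np.zeros((k, k), dtype=complex)
            e_ij[i, j] = 1.0
            c += np.kron(e_ij, gamma(e_ij))
    return c

identity_map = lambda x: x
transpose_map = lambda x: x.T

c_id = choi(identity_map, 2)
c_t = choi(transpose_map, 2)

# Additivity of the encoding
c_sum = choi(lambda x: identity_map(x) + transpose_map(x), 2)
additive = np.allclose(c_sum, c_id + c_t)

# Composition is not encoded by the matrix product:
# transposing twice is the identity map, but C(T) C(T) is not C(id)
c_tt = choi(lambda x: transpose_map(transpose_map(x)), 2)
multiplicative = np.allclose(c_tt, c_t @ c_t)
print(additive, multiplicative)
```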
A super-operator is called positive if it maps positive semi-definite matrices into
positive semi-definite matrices. Note that this is a completely different notion than
positive definiteness for which you need an inner product space. A super-operator $\Gamma$ is trace-preserving if $\mathrm{Tr}\,\Gamma(A) = \mathrm{Tr}\,A$ for all $A$, it is called
unity-preserving if $\Gamma(\mathbb{1}) = \mathbb{1}$. Suppose that $\Gamma$ is both positive and trace-preserving, then it maps by definition density matrices to density matrices.
We provide now a basic example of a completely positive super-operator that
is generally neither trace nor unity-preserving but that will prove very useful
in describing quantum operations.
Example 2. Pick a linear map $V : \mathbb{C}^k \to \mathbb{C}^n$, then
\[ \Gamma : X \mapsto V X V^* \text{ is completely positive}. \tag{66} \]
Indeed:
\[ \mathrm{id}_d \otimes \Gamma\,(A \otimes B) = A \otimes \Gamma(B) = A \otimes V B V^* = (\mathbb{1}_d \otimes V)(A \otimes B)(\mathbb{1}_d \otimes V)^*. \]
Now, if $C \in M_d \otimes M_k$ is positive then it is of the form $D^* D$ and
\[ \mathrm{id}_d \otimes \Gamma\,(C) = (\mathbb{1}_d \otimes V)\, D^* D\, (\mathbb{1}_d \otimes V)^* = \bigl(D (\mathbb{1}_d \otimes V)^*\bigr)^* \bigl(D (\mathbb{1}_d \otimes V)^*\bigr) \ge 0. \]
The following theorem characterises the completely positive maps from $M_k$
to $M_n$:
Theorem 3 (Choi - Kraus - Jamiolkowski). $\Gamma : M_k \to M_n$ is completely
positive if and only if $C(\Gamma)$ is a positive semi-definite matrix.
A quantum operation in Schrödinger picture is a completely positive trace-preserving map from $M_k$ to $M_n$. Quantum operations transform density
matrices into density matrices and can, moreover, be trivially extended to
composite systems without losing this property. In the dual Heisenberg
picture trace-preserving should be replaced by unity-preserving.


Corollary 1. Every quantum operation $\Gamma$ from a k-level system to an n-level
system is of the form
\[ \Gamma(\rho) = \sum_j V_j\, \rho\, V_j^* \quad\text{with}\quad \sum_j V_j^* V_j = \mathbb{1}. \tag{67} \]
There are at most $kn$ terms needed in the summation.
The $\{V_j\}$ appearing in (67) are called Kraus operators and this way of
writing $\Gamma$ is a Kraus decomposition. There are in general many ways
of writing such a decomposition, i.e. the Kraus operators are not uniquely
determined by the map $\Gamma$. Actually one can show that two different sets of
Kraus operators $\{V_j\}$ and $\{W_\ell\}$ of a same map are connected by an isometric
transformation $u$
\[ W_\ell = \sum_j u_{\ell j}\, V_j. \]

Problem 30. Fill out the details of the proof of the theorem and the corollary.
Problem 31. Express the trace-preserving and unity-preserving conditions
on a quantum operation in terms of its Choi matrix.
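One way to obtain a Kraus decomposition numerically is to diagonalise the Choi matrix and reshape its eigenvectors into $n \times k$ matrices. The sketch below does this for a qubit phase-flip map; the map, the names `choi` and `kraus_from_choi` and the tolerance are assumptions made for the example, and the index convention follows $C(\Gamma) = \sum_{ij} e_{ij} \otimes \Gamma(e_{ij})$.

```python
import numpy as np

def choi(gamma, k):
    """C(gamma) = sum_ij e_ij (x) gamma(e_ij)."""
    n = gamma(np.zeros((k, k), dtype=complex)).shape[0]
    c = np.zeros((k * n, k * n), dtype=complex)
    for i in range(k):
        for j in range(k):
            e_ij = np.zeros((k, k), dtype=complex)
            e_ij[i, j] = 1.0
            c += np.kron(e_ij, gamma(e_ij))
    return c

def kraus_from_choi(c, k, n, tol=1e-12):
    """Kraus operators from the spectral decomposition of a positive Choi matrix."""
    w, v = np.linalg.eigh(c)
    ops = []
    for lam, vec in zip(w, v.T):
        if lam > tol:
            # vec has components indexed by (i, m); V[m, i] = sqrt(lam) * vec[(i, m)]
            ops.append(np.sqrt(lam) * vec.reshape(k, n).T)
    return ops

sigma3 = np.diag([1.0, -1.0]).astype(complex)
p = 0.75
phase_flip = lambda rho: p * rho + (1 - p) * sigma3 @ rho @ sigma3

kraus = kraus_from_choi(choi(phase_flip, 2), 2, 2)

rho = np.array([[0.6, 0.3 - 0.2j], [0.3 + 0.2j, 0.4]])
via_kraus = sum(v @ rho @ v.conj().T for v in kraus)
normalisation = sum(v.conj().T @ v for v in kraus)
print(len(kraus), np.allclose(via_kraus, phase_flip(rho)))
```

The operators found in this way need not coincide with $\sqrt{p}\,\mathbb{1}$ and $\sqrt{1-p}\,\sigma_3$ but they are related to them by an isometric transformation.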

3.3 Examples of quantum operations

We now consider a number of examples of quantum operations.

3.3.1 Unitary gates

Example 3 (Unitary gate).
\[ \Gamma(\rho) = U \rho\, U^* \quad\text{with}\quad U^* U = \mathbb{1}\ (= U U^*). \tag{68} \]
Unitary gates are single-shot unitaries that are applied to qubit systems
extending operations on bits to qubits or introducing operations that have
no classical counterpart. These gates are intended to represent basic logical
operations performed on systems of qubits.
A unitary gate maps pure states into pure states
\[ U\,|\varphi\rangle\langle\varphi|\,U^* = |U\varphi\rangle\langle U\varphi|. \tag{69} \]
Because the action of the gate can be expressed at the level of state vectors,
we also have
\[ U(\varphi_1 + \varphi_2) = U\varphi_1 + U\varphi_2. \tag{70} \]
Hence, unitary gates not only preserve mixtures but also coherences.
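Denoting by $\epsilon \in \{0, 1\}$ a classical bit we denote by $\epsilon_1 \oplus \epsilon_2$ addition modulo
2. A few common classical gates are
\[ \text{NOT}: \quad \epsilon \mapsto 1 \oplus \epsilon \tag{71} \]
\[ \text{AND}: \quad (\epsilon_1, \epsilon_2) \mapsto \epsilon_1 \epsilon_2 \tag{72} \]
\[ \text{CNOT}: \quad (\epsilon_1, \epsilon_2) \mapsto (\epsilon_1, \epsilon_1 \oplus \epsilon_2) \tag{73} \]
\[ \text{FAN OUT}: \quad \epsilon \mapsto (\epsilon, \epsilon) \tag{74} \]
\[ \text{NAND}: \quad (\epsilon_1, \epsilon_2) \mapsto 1 \oplus \epsilon_1 \epsilon_2 \tag{75} \]
Problem 32 (Universality of NAND). Write all the mentioned gates in terms
of NANDs.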
One can get quantum analogues of such gates by associating $|\epsilon\rangle$ to $\epsilon$. As
quantum gates should be unitary and therefore reversible one can only apply
this recipe to reversible classical gates, i.e. to classical gates where the input
can be uniquely reconstructed in terms of the output. The gates NOT and
CNOT are examples of reversible gates, this is not the case for AND and
also not for FAN OUT as not every output is reached. There exist however
classical reversible gates that allow to realise AND and FAN OUT modulo
introducing additional bits and setting some of these to a particular value.
A nice example is the Toffoli gate
\[ (\epsilon_1, \epsilon_2, \epsilon_3) \mapsto (\epsilon_1, \epsilon_2, \epsilon_3 \oplus \epsilon_1 \epsilon_2). \tag{76} \]
The AND gate can be realised by putting $\epsilon_3 = 0$, feeding $\epsilon_1$ and $\epsilon_2$ to the
first two leads and only retaining the last lead as output. We can get FAN
OUT by feeding $\epsilon$ to the first lead, setting the second lead equal to 1 and
the third to 0. We can then get two copies of $\epsilon$ on the first and third output
lead.
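The two constructions can be checked exhaustively on classical bits. In the sketch below the function names are ours and bits are represented by the Python integers 0 and 1.

```python
from itertools import product

def toffoli(e1, e2, e3):
    """(e1, e2, e3) -> (e1, e2, e3 XOR (e1 AND e2))."""
    return e1, e2, e3 ^ (e1 & e2)

def and_gate(e1, e2):
    """AND: set the third lead to 0 and keep only the last output lead."""
    return toffoli(e1, e2, 0)[2]

def fan_out(e):
    """FAN OUT: second lead set to 1, third lead set to 0, keep leads 1 and 3."""
    out = toffoli(e, 1, 0)
    return out[0], out[2]

inputs = list(product((0, 1), repeat=3))
outputs = [toffoli(*bits) for bits in inputs]

# Reversibility: the gate permutes the 8 inputs and is its own inverse
is_permutation = sorted(outputs) == sorted(inputs)
is_involution = all(toffoli(*toffoli(*bits)) == bits for bits in inputs)
print(is_permutation, is_involution)
```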
The single qubit gate that implements the NOT on the computational basis
is the unitary $U$ determined by
\[ U : |0\rangle \mapsto |1\rangle \quad\text{and}\quad U : |1\rangle \mapsto |0\rangle. \]
It is easily seen that this unitary is the Pauli $\sigma_1$ matrix $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$. A general
superposition of the computational basis states is then mapped into the same
superposition of the negated basis states. Such extensions of classical gates
to the quantum setting don't really fulfil the expectations one could have.
For example the quantum NOT defined above does not map every qubit state
into its orthogonal complement. FAN OUT would map, modulo neglecting
some outputs, $|0\rangle$ to $|00\rangle$ and $|1\rangle$ to $|11\rangle$ and is therefore a classical copier.
A general qubit state would then be mapped as follows
\[ \alpha|0\rangle + \beta|1\rangle \mapsto \alpha|00\rangle + \beta|11\rangle \ne (\alpha|0\rangle + \beta|1\rangle) \otimes (\alpha|0\rangle + \beta|1\rangle) \]
for general $\alpha$ and $\beta$. Hence this quantum extension of the classical copier
is not a quantum copier. In fact, one can prove that no unitary gate can
ever make two copies of a general (unknown) pure quantum state, this is the
no-cloning theorem.
It should also be remarked that many quantum gates have no classical counterpart. A very useful single qubit gate in this class is the Hadamard gate
\[ H = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}. \]
There is a well-understood theory on how to approximately and efficiently
generate an arbitrary gate by concatenating gates from a finite basic set.
This goes, however, beyond the scope of these lectures.
3.3.2 Dissipative operations

In contrast with unitary gates, a general quantum operation might send a
pure state into a mixed one or inversely. They should be compared to classical
stochastic maps that cannot be seen as flows or maps on the configuration
space but that rather send a point in configuration space with a certain
probability to another one.
Example 4 (von Neumann measurement without selection). Let $\{e_j\}$
be an orthonormal basis corresponding to an observable. The corresponding
projectors $p_j := |e_j\rangle\langle e_j|$ are mutually orthogonal and form a resolution of the
identity
\[ p_j\, p_k = \delta_{jk}\, p_j \quad\text{and}\quad \sum_j p_j = \mathbb{1}. \tag{77} \]
If a system described by a state vector $\varphi$ is sent through the measuring
device corresponding to $\{p_j\}$ then we obtain the read-out $j$ with probability
$|\langle e_j, \varphi\rangle|^2$ and the incoming state collapses to the corresponding eigenstate
$|e_j\rangle\langle e_j|$. If we don't filter out any particular set of outcomes we obtain the
outgoing state
\[ \sum_j |\langle e_j, \varphi\rangle|^2\, |e_j\rangle\langle e_j| = \sum_j p_j\, |\varphi\rangle\langle\varphi|\, p_j. \tag{78} \]
Sending in an arbitrary state $\rho$ will produce the outcome
\[ \rho \mapsto \sum_j p_j\, \rho\, p_j. \tag{79} \]
It is obvious that a pure state will in general be transformed into a mixed state by this
procedure and that this is also an irreversible process: there is no possibility
for reconstructing the incoming state.
Example 5 (Coupling to an external system). This is an example of a
map between states on $M_k$, the system, and on $M_k \otimes M_\ell$, environment plus
system. Let $\sigma$ be a fixed density matrix of the environment then
\[ \Gamma(\rho) = \rho \otimes \sigma, \quad \rho \text{ density matrix in } M_k. \tag{80} \]
The extension to an auxiliary d-level system is then
\[ \mathrm{id}_d \otimes \Gamma\,(X) = X \otimes \sigma, \quad X \in M_d \otimes M_k \tag{81} \]
and this is positive.

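Example 6 (Reduction to a subsystem). We are now considering the
partial trace
\[ \rho \mapsto \mathrm{Tr}_{\mathbb{C}^\ell}\, \rho \in M_k \tag{82} \]
where $\rho$ is a density matrix on $M_k \otimes M_\ell$. We write
\[ \mathrm{Tr}_{\mathbb{C}^\ell}\, \rho = \sum_{i,j=1}^{k} \mathrm{Tr}\bigl(\rho\, (|j\rangle\langle i| \otimes \mathbb{1})\bigr)\, |i\rangle\langle j| = \sum_{i,j=1}^{k} \sum_{a=1}^{\ell} \langle ia|\rho|ja\rangle\, |i\rangle\langle j| = \sum_{a=1}^{\ell} \Bigl(\sum_{i=1}^{k} |i\rangle\langle ia|\Bigr)\, \rho\, \Bigl(\sum_{j=1}^{k} |ja\rangle\langle j|\Bigr) = \sum_{a=1}^{\ell} V_a\, \rho\, V_a^* \tag{83} \]
where
\[ V_a : \mathbb{C}^k \otimes \mathbb{C}^\ell \to \mathbb{C}^k : |ib\rangle \mapsto \delta_{ab}\, |i\rangle. \]
In this way we obtain a Kraus form for the partial trace which proves that it
is completely positive. Combining this example with the three previous ones
we can realise a generalised measurement set-up using a pointer system.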
Example 7 (A generalised measurement). A more complicated and
more realistic set-up uses an auxiliary pointer system $\mathbb{C}^k$: the incoming
state $\rho$ is composed with an initial state of the pointer system. This composite system then unitarily evolves by passing through a unitary gate. Then
a von Neumann measurement, without filtering any particular outcome, is
applied to the pointer part of the system and we are finally left with the
reduction of the resulting state to the system we observe. We compute the
transition from initial to final state using some simplifying features but it
turns out that the overall result is still a completely general quantum operation.
Let $\{f_j\}$ denote the measurement basis in our pointer system $\mathbb{C}^k$ and suppose
that the initial state of the pointer system is $f_1$. It is useful to write the
unitary gate $U$ in the given basis of the pointer system
\[ U = \sum_{ij} |f_i\rangle\langle f_j| \otimes U_{ij} \tag{84} \]
where the $U_{ij}$ are $d \times d$ matrices that satisfy the unitarity relations
\[ \sum_j U_{ij}\, U_{\ell j}^* = \sum_j U_{ji}^*\, U_{j\ell} = \delta_{i\ell}\, \mathbb{1}_d. \tag{85} \]
We now compute what happens to an incoming pure state $\varphi \in \mathbb{C}^d$. It is first
coupled to the initial state of the pointer system and so we have $f_1 \otimes \varphi$ at
our disposal. This evolves through the unitary gate $U$ into $\sum_j f_j \otimes U_{j1}\varphi$.
Sending this through the measuring apparatus that observes the pointer part
of the system we obtain the state $\sum_j |f_j\rangle\langle f_j| \otimes |U_{j1}\varphi\rangle\langle U_{j1}\varphi|$. Finally tracing
over the pointer system we obtain $\sum_j |U_{j1}\varphi\rangle\langle U_{j1}\varphi| = \sum_j U_{j1}\, |\varphi\rangle\langle\varphi|\, U_{j1}^*$. The
global quantum operation on a general mixed state $\rho$ can therefore be written
as
\[ \rho \mapsto \sum_j U_{j1}\, \rho\, U_{j1}^* = \sum_j V_j\, \rho\, V_j^*. \tag{86} \]
Because of the unitarity condition the matrices $V_j := U_{j1}$ satisfy
\[ \sum_j V_j^* V_j = \mathbb{1}. \tag{87} \]

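Example 8 (Qubit decoherence).
\[ \Gamma\begin{pmatrix} \rho_{11} & \rho_{12} \\ \rho_{21} & \rho_{22} \end{pmatrix} = \begin{pmatrix} \rho_{11} & \lambda\rho_{12} \\ \bar\lambda\rho_{21} & \rho_{22} \end{pmatrix} \quad\text{with}\quad \lambda \in \mathbb{C}. \tag{88} \]
In order to find out under which condition on $\lambda$ this is a quantum operation
we write down the Choi matrix
\[ C(\Gamma) = \begin{pmatrix} 1 & 0 & 0 & \lambda \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ \bar\lambda & 0 & 0 & 1 \end{pmatrix}. \tag{89} \]
This matrix is positive iff $|\lambda| \le 1$. Repeated applications of this map, with
$|\lambda| < 1$, will map any initially pure state with state vector $\alpha|0\rangle + \beta|1\rangle$ to the
mixed state $|\alpha|^2\,|0\rangle\langle 0| + |\beta|^2\,|1\rangle\langle 1|$. This means that the off-diagonal terms in
the original pure state, $\bar\alpha\beta\,|1\rangle\langle 0| + \alpha\bar\beta\,|0\rangle\langle 1|$, are eventually sent to zero. This
is loss of coherence.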
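Both claims can be checked numerically. The sketch below assumes that the map multiplies the two off-diagonal entries by a complex number and its conjugate; the function names and parameter values are illustrative.

```python
import numpy as np

def decohere(rho, lam):
    """Multiply the off-diagonal entries by lam and its conjugate."""
    out = np.array(rho, dtype=complex)
    out[0, 1] *= lam
    out[1, 0] *= np.conj(lam)
    return out

def choi_of(gamma, k=2):
    """C(gamma) = sum_ij e_ij (x) gamma(e_ij)."""
    c = np.zeros((k * k, k * k), dtype=complex)
    for i in range(k):
        for j in range(k):
            e_ij = np.zeros((k, k), dtype=complex)
            e_ij[i, j] = 1.0
            c += np.kron(e_ij, gamma(e_ij))
    return c

def min_choi_eigenvalue(lam):
    return np.linalg.eigvalsh(choi_of(lambda x: decohere(x, lam))).min()

inside = min_choi_eigenvalue(0.6j)     # |lam| <= 1: completely positive
outside = min_choi_eigenvalue(1.2)     # |lam| > 1: Choi matrix not positive

# Repeated application to the pure state alpha|0> + beta|1>
alpha, beta = 0.6, 0.8j
phi = np.array([alpha, beta])
rho = np.outer(phi, phi.conj())
for _ in range(200):
    rho = decohere(rho, 0.6j)
print(inside, outside)
```

After many iterations only the diagonal of the density matrix survives.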
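Example 9 (Qubit depolarisation).
\[ \Gamma(\rho) = \lambda\rho + (1 - \lambda)\,\frac{\mathbb{1}}{2}, \quad \lambda \in \mathbb{R}. \tag{90} \]
In order to find out under which condition on $\lambda$ this is a quantum operation
we write down the corresponding Choi matrix. We first rewrite the map in
such a way that its linearity in $\rho$ is manifest:
\[ \Gamma(\rho) = \lambda\rho + (1 - \lambda)\,\frac{\mathbb{1}}{2}\,\mathrm{Tr}\,\rho. \tag{91} \]
This allows us to compute the Choi matrix
\[ C(\Gamma) = \begin{pmatrix} \frac{1+\lambda}{2} & 0 & 0 & \lambda \\ 0 & \frac{1-\lambda}{2} & 0 & 0 \\ 0 & 0 & \frac{1-\lambda}{2} & 0 \\ \lambda & 0 & 0 & \frac{1+\lambda}{2} \end{pmatrix}. \tag{92} \]
This matrix is positive iff $-\frac{1}{3} \le \lambda \le 1$: its eigenvalues are $\frac{1-\lambda}{2}$, with multiplicity three, and $\frac{1+3\lambda}{2}$. The extreme value $\lambda = -\frac{1}{3}$ corresponds to the map $x \mapsto -x/3$ on the Bloch ball. This is the most we can do reversing the sign of all
components of angular momentum without destroying complete positivity.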

3.4 Dynamics in continuous time

We now return to evolution maps $\{\Gamma(t, t_0) \mid t \ge t_0\}$ as introduced at the
beginning of this chapter (59).

3.4.1 Generators

We assume the additional simplifying property of divisibility also called
Markovianity in time: if we know the state of a system at a given time
then we know it for all later times, put differently, the evolution of a system
is independent of its history. This can be stated as
\[ \Gamma(t_2, t_0) = \Gamma(t_2, t_1) \circ \Gamma(t_1, t_0), \quad t_0 \le t_1 \le t_2. \tag{93} \]
Even simpler are autonomous evolutions. Here $\Gamma(t, t_0)$ only depends on $t - t_0$.
Slightly abusing notation we write
\[ \Gamma(t, t_0) = \Gamma(t - t_0). \tag{94} \]
In this case we obtain a one-parameter semi-group of trace-preserving completely positive maps
\[ \{\Gamma(t) \mid t \in \mathbb{R}^+\}, \quad \Gamma(t_1 + t_2) = \Gamma(t_1) \circ \Gamma(t_2). \tag{95} \]
Even more restrictive are reversible autonomous dynamics where one has a
one-parameter group of trace-preserving completely positive maps
\[ \{\Gamma(t) \mid t \in \mathbb{R}\}, \quad \Gamma(t_1 + t_2) = \Gamma(t_1) \circ \Gamma(t_2). \tag{96} \]
In this case it turns out that the dynamics is unitary:
\[ \Gamma(t)(\rho) = U(t)\, \rho\, U(t)^* \quad\text{with } \{U(t) \mid t \in \mathbb{R}\} \text{ a unitary group}. \tag{97} \]

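Markovian dynamics are characterised by their generator, which is in general time-dependent. Let us define this time-dependent generator as
\[ \mathcal{L}(t) := \lim_{\epsilon \downarrow 0} \frac{\Gamma(t, t - \epsilon) - \mathrm{id}}{\epsilon} = \lim_{\epsilon \downarrow 0} \frac{\Gamma(t + \epsilon, t) - \mathrm{id}}{\epsilon}. \tag{98} \]
Using the Markovian property, we can write
\[ \Gamma(t + \epsilon, t_0) - \Gamma(t, t_0) = \bigl(\Gamma(t + \epsilon, t) - \mathrm{id}\bigr) \circ \Gamma(t, t_0). \tag{99} \]
Dividing by $\epsilon$ and taking the limit $\epsilon \downarrow 0$ we obtain a differential equation
with initial condition for $\Gamma$
\[ \frac{d}{dt}\Gamma(t, t_0) = \mathcal{L}(t) \circ \Gamma(t, t_0), \quad t > t_0 \quad\text{and}\quad \Gamma(t_0, t_0) = \mathrm{id}. \tag{100} \]
This equation rightly deserves the name Schrödinger equation.
Conversely, we can solve the linear differential equation (100) by applying
Picard iteration in a time-interval $[t_0, t_{\max}]$ for which $\mathcal{L}$ depends continuously
on time
\[ \Gamma(t, t_0) = \mathrm{Texp}\Bigl(\int_{t_0}^{t} ds\, \mathcal{L}(s)\Bigr) = \mathrm{id} + \int_{t_0 \le s_1 \le t} ds_1\, \mathcal{L}(s_1) + \int_{t_0 \le s_2 \le s_1 \le t} ds_1\, ds_2\, \mathcal{L}(s_1)\mathcal{L}(s_2) + \cdots \tag{101} \]
The series that defines the solution is called a time-ordered exponential.
By the unicity of the solution of (100) we obtain Markovianity in time. Lindblad's theorem gives the general form of a generator $\{t \mapsto \mathcal{L}(t)\}$ of a Markovian dynamics that leads to completely positive dynamical maps.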

3.4.2 Lindblad's theorem

Theorem 4. A family of super-operators $\{t \mapsto \mathcal{L}(t)\}$ on the $d \times d$ matrices is
a generator of a divisible family of quantum operations if and only if it is of
the form
\[ \mathcal{L}(t)(\rho) = -i[H(t), \rho] + \sum_k \Bigl( V_k(t)\, \rho\, V_k(t)^* - \tfrac{1}{2}\{V_k(t)^* V_k(t), \rho\} \Bigr). \tag{102} \]
Here $t \mapsto H(t)$ is a continuous family of Hermitian d-dimensional matrices
and the $t \mapsto V_k(t)$ are continuous families of elements of $M_d$.
The differential equation
\[ \frac{d}{dt}\Gamma(t, t_0) = \mathcal{L}(t) \circ \Gamma(t, t_0), \quad t > t_0 \quad\text{and}\quad \Gamma(t_0, t_0) = \mathrm{id} \tag{103} \]
with the explicit form (102) for the generator is called Lindblad's equation.
It is the generalisation of Schrödinger's equation to general non-reversible
Markovian dynamics. The differential equation for a state $\rho$ with initial
condition $\rho_0$ reads
\[ \frac{d\rho}{dt} = -i[H(t), \rho] + \sum_k \Bigl( V_k(t)\, \rho\, V_k(t)^* - \tfrac{1}{2}\{V_k(t)^* V_k(t), \rho\} \Bigr) \quad\text{and}\quad \rho(t_0) = \rho_0. \tag{104} \]
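To get a feeling for the Lindblad equation one can integrate it numerically for a small system. The sketch below uses a classical fourth-order Runge-Kutta scheme for a time-independent generator; the model (a qubit decaying from $|1\rangle$ to $|0\rangle$ with a single jump operator), the parameter values and all function names are assumptions made for this illustration.

```python
import numpy as np

def lindblad_rhs(rho, h, jumps):
    """Right-hand side of the Lindblad equation for constant H and V_k."""
    out = -1j * (h @ rho - rho @ h)
    for v in jumps:
        vdv = v.conj().T @ v
        out += v @ rho @ v.conj().T - 0.5 * (vdv @ rho + rho @ vdv)
    return out

def evolve(rho0, h, jumps, t, steps=2000):
    """Integrate d rho / dt = L(rho) with a fixed-step Runge-Kutta scheme."""
    rho = np.array(rho0, dtype=complex)
    dt = t / steps
    for _ in range(steps):
        k1 = lindblad_rhs(rho, h, jumps)
        k2 = lindblad_rhs(rho + 0.5 * dt * k1, h, jumps)
        k3 = lindblad_rhs(rho + 0.5 * dt * k2, h, jumps)
        k4 = lindblad_rhs(rho + dt * k3, h, jumps)
        rho = rho + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return rho

gamma_rate, omega, t_final = 0.5, 1.0, 2.0
h = 0.5 * omega * np.diag([1.0, -1.0])                                # Hamiltonian
v = np.sqrt(gamma_rate) * np.array([[0, 1], [0, 0]], dtype=complex)   # |0><1|

rho0 = 0.5 * np.ones((2, 2))               # pure state (|0> + |1>) / sqrt(2)
rho_t = evolve(rho0, h, [v], t_final)

print(np.trace(rho_t).real, rho_t[1, 1].real, abs(rho_t[0, 1]))
```

For this model the population of $|1\rangle$ decays as $e^{-\gamma t}$ and the modulus of the off-diagonal element as $e^{-\gamma t/2}$, while trace and positivity are preserved up to the integration error.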

To derive this result it suffices to consider the autonomous case and we need
the general characterisation of positive semi-definiteness for a $2 \times 2$ block
matrix.
Proposition 8. The block matrix $\begin{pmatrix} A & C \\ C^* & B \end{pmatrix}$ acting on $\mathbb{C}^{d_1} \oplus \mathbb{C}^{d_2}$ is positive
semi-definite if and only if
1. $A$ and $B$ are positive semi-definite and
2. there exists a $U : \mathbb{C}^{d_2} \to \mathbb{C}^{d_1}$ with $\|U\| \le 1$ such that $C = \sqrt{A}\, U \sqrt{B}$.
Problem 33. Prove Proposition 8.
Problem 34 (Schur complement). Suppose that $B$ in Proposition 8 is
invertible. Show then that the positivity conditions for the block matrix can
be written as
\[ A, B \ge 0 \quad\text{and}\quad C B^{-1} C^* \le A. \tag{105} \]

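Proof of Theorem 4. Because we are dealing with a semi-group of quantum
operations we have
\[ \exp(t\mathcal{L}) = \bigl(\exp(t\mathcal{L}/n)\bigr)^n. \tag{106} \]
For sufficiently small $\epsilon$ we may write
\[ \exp(\epsilon\mathcal{L}) = \mathrm{id} + \epsilon\mathcal{L} + o(\epsilon). \tag{107} \]
Because of Choi's theorem we then have that
\[ C(\mathrm{id} + \epsilon\mathcal{L}) + o(\epsilon) \text{ is positive semi-definite}. \tag{108} \]
An easy computation shows that
\[ C(\mathrm{id}) = \sum_{i,j} |ii\rangle\langle jj|. \tag{109} \]
We now split the space $\mathbb{C}^d \otimes \mathbb{C}^d$ into
\[ \mathbb{C}^d \otimes \mathbb{C}^d = \mathbb{C}\,\Bigl(\sum_i |ii\rangle\Bigr) \oplus \Bigl(\sum_i |ii\rangle\Bigr)^{\perp} \tag{110} \]
which leads to the block matrix decomposition
\[ C(\mathrm{id} + \epsilon\mathcal{L}) + o(\epsilon) = \begin{pmatrix} d + \epsilon\, C(\mathcal{L})_{11} & \epsilon\, C(\mathcal{L})_{12} \\ \epsilon\, C(\mathcal{L})_{21} & \epsilon\, C(\mathcal{L})_{22} \end{pmatrix}. \tag{111} \]
By Proposition 8 and taking the limit $\epsilon \downarrow 0$ we remain with the condition
\[ C(\mathcal{L})_{22} \ge 0 \quad\text{and}\quad C(\mathcal{L})_{12} \in \mathrm{Ran}\, C(\mathcal{L})_{22}. \tag{112} \]
Now we can essentially repeat the proof of the Choi-Jamiolkowski-Kraus theorem. The main difference is that $C(\mathcal{L})_{22}$ acts on the orthogonal complement
of the maximally entangled vector $\sum_i |ii\rangle$. Combining this with preservation
of the trace leads to (102).
All these arguments can be reversed to show that a super-operator of the
form (102) is the generator of a semi-group of quantum operations.
Problem 35. Fill out the details of the proof of Theorem 4.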
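Before turning to applications we mention the quite useful two-positivity
inequality:
Proposition 9. Let $\Gamma$ be a quantum operation in Heisenberg picture, then
\[ \Gamma(X^*)\,\Gamma(X) \le \Gamma(X^* X) \quad\text{for any } X. \tag{113} \]
Let $\mathcal{L}$ be a generator in Heisenberg picture of a continuous one-parameter
semi-group of 2-positive unity-preserving maps, then
\[ X^* \mathcal{L}(X) + \mathcal{L}(X^*)\, X \le \mathcal{L}(X^* X) \quad\text{for any } X. \tag{114} \]
To prove this inequality we consider the positive $2 \times 2$ block matrix
\[ \begin{pmatrix} X^* \\ \mathbb{1} \end{pmatrix} \begin{pmatrix} X & \mathbb{1} \end{pmatrix} = \begin{pmatrix} X^* X & X^* \\ X & \mathbb{1} \end{pmatrix}. \tag{115} \]
As $\Gamma$ is 2-positive and unity-preserving we have also
\[ 0 \le (\mathrm{id} \otimes \Gamma) \begin{pmatrix} X^* X & X^* \\ X & \mathbb{1} \end{pmatrix} = \begin{pmatrix} \Gamma(X^* X) & \Gamma(X^*) \\ \Gamma(X) & \mathbb{1} \end{pmatrix}. \tag{116} \]
This is the case iff
\[ \Gamma(X^*) = \Gamma(X)^* \quad\text{and}\quad \Gamma(X^*)\,\Gamma(X) \le \Gamma(X^* X). \tag{117} \]
The inequality (114) follows from (113) by expanding $\exp(t\mathcal{L})$ for small
positive $t$.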
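3.4.3 A semi-group of decoherent qubit maps

We propose a simple form for a time-independent generator of a completely
positive trace-preserving semi-group of quantum operations acting on a single
qubit
\[ \mathcal{L}\begin{pmatrix} \rho_{11} & \rho_{12} \\ \rho_{21} & \rho_{22} \end{pmatrix} = \begin{pmatrix} -a\rho_{11} + b\rho_{22} & -c\,\rho_{12} \\ -\bar c\,\rho_{21} & a\rho_{11} - b\rho_{22} \end{pmatrix} \tag{118} \]
and determine the necessary and sufficient conditions on the parameters $a$,
$b$, and $c$ that ensure that $\mathcal{L}$ is a Lindblad generator. We certainly need that
$\mathcal{L}$ annihilates the trace, $\mathrm{Tr}\,\mathcal{L}(\rho) = 0$, because the exponential of $t\mathcal{L}$ has to be
trace-preserving. This is already incorporated in the form of $\mathcal{L}$.
From the proof of Lindblad's theorem we need to pick $a$, $b$, and $c$ in such a
way that
\[ C(\mathcal{L}) = \begin{pmatrix} -a & 0 & 0 & -c \\ 0 & a & 0 & 0 \\ 0 & 0 & b & 0 \\ -\bar c & 0 & 0 & -b \end{pmatrix} \ge 0 \quad\text{on } \Omega^\perp \tag{119} \]
where $\Omega = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 0 & 0 & 1 \end{pmatrix}^{\mathsf{T}}$. This leads to
\[ a \ge 0, \quad b \ge 0, \quad\text{and}\quad a + b \le c + \bar c. \tag{120} \]
From (118) we see that the off-diagonal matrix elements are eigenvectors of
$\mathcal{L}$ and that $\mathcal{L}$ mixes the diagonal entries of a density matrix. This makes the
exponentiation of $\mathcal{L}$ rather straightforward: for a density matrix $\rho$, i.e. with $\rho_{11} + \rho_{22} = 1$,
\[ e^{t\mathcal{L}}\begin{pmatrix} \rho_{11} & \rho_{12} \\ \rho_{21} & \rho_{22} \end{pmatrix} = \begin{pmatrix} \frac{1}{a+b}\bigl(b + (a\rho_{11} - b\rho_{22})\, e^{-t(a+b)}\bigr) & e^{-tc}\,\rho_{12} \\ e^{-t\bar c}\,\rho_{21} & \frac{1}{a+b}\bigl(a - (a\rho_{11} - b\rho_{22})\, e^{-t(a+b)}\bigr) \end{pmatrix}. \tag{121} \]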
The conditions (120) on the parameters of the generator are equivalent with
etc

et(a+b) .

(122)

Inspecting (121) we see that every initial density matrix evolves toward the
equilibrium state when t
=

1
a+b

b 0
.
0 a

(123)

The convergence rate of the diagonal elements to this limit is determined by


a+b. The o-diagonal elements decay even faster to zero, this is the meaning
of (122).
Problem 36. Check that the details of the example above.
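As a numerical sanity check of this example, the following minimal sketch compares the closed form (121) with a crude Euler integration of the generator (118) and looks at the long-time limit (123). The function names, the parameter values and the initial state are illustrative assumptions, not part of the lectures.

```python
import cmath

def apply_generator(rho, a, b, c):
    """Generator (118) acting on a 2x2 matrix stored as nested lists."""
    (r11, r12), (r21, r22) = rho
    return [[-a * r11 + b * r22, -c * r12],
            [-c.conjugate() * r21, a * r11 - b * r22]]

def is_lindblad(a, b, c):
    """Conditions (120): a >= 0, b >= 0 and a + b <= c + conj(c)."""
    return a >= 0 and b >= 0 and a + b <= 2 * c.real

def closed_form(rho, a, b, c, t):
    """Formula (121); assumes Tr rho = 1 and a + b > 0."""
    (r11, r12), (r21, r22) = rho
    d = (a * r11 - b * r22) * cmath.exp(-t * (a + b))
    return [[(b + d) / (a + b), cmath.exp(-t * c) * r12],
            [cmath.exp(-t * c.conjugate()) * r21, (a - d) / (a + b)]]

def euler(rho, a, b, c, t, steps=20000):
    """Explicit Euler integration of d(rho)/dt = L(rho); only a rough check."""
    dt = t / steps
    cur = [[complex(x) for x in row] for row in rho]
    for _ in range(steps):
        g = apply_generator(cur, a, b, c)
        cur = [[cur[i][j] + dt * g[i][j] for j in range(2)] for i in range(2)]
    return cur

# Illustrative parameters (assumed): they satisfy (120) since 1.5 <= 2.
a, b, c = 1.0, 0.5, 1.0 + 2.0j
rho0 = [[0.7, 0.2 + 0.1j], [0.2 - 0.1j, 0.3]]

exact = closed_form(rho0, a, b, c, 1.5)
approx = euler(rho0, a, b, c, 1.5)
max_diff = max(abs(exact[i][j] - approx[i][j]) for i in range(2) for j in range(2))
late = closed_form(rho0, a, b, c, 60.0)  # should be close to diag(b, a)/(a + b)
print(is_lindblad(a, b, c), max_diff, late[0][0].real, late[1][1].real)
```

The Euler scheme is chosen only for transparency; any ODE integrator would do for this comparison.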
3.4.4 Radiation loss to the vacuum

Suppose that we consider a single mode of a radiation field, say to describe a laser. We are also not interested in polarisation but just in the number of photons. The photon field is described in terms of a creation and an annihilation operator that satisfy the canonical commutation relations

$$[a, a^*] = \mathbb{1}. \tag{124}$$

The vector state with zero photons is called the vacuum, we represent it by $\Omega$ or $|0\rangle$. The defining feature of $\Omega$ is

$$a\,\Omega = 0. \tag{125}$$

Using this and (124) we can compute that

$$\langle \Omega, a^m (a^*)^n \Omega \rangle = 0 \ \text{ for } n > m \quad\text{and}\quad \langle \Omega, a^n (a^*)^n \Omega \rangle = n!. \tag{126}$$

In this way we introduce an orthonormal family $\{|n\rangle \mid n \in \mathbb{N}\}$ of vectors

$$|n\rangle := \frac{1}{\sqrt{n!}}\,(a^*)^n\,\Omega. \tag{127}$$

$|n\rangle$ is the n photon state and $\{|n\rangle\}$ is called the particle number basis. It is easy to express a and $a^*$ in this basis

$$a^*|n\rangle = \sqrt{n+1}\,|n+1\rangle \quad\text{and}\quad a|n\rangle = \sqrt{n}\,|n-1\rangle. \tag{128}$$

The number operator $N := a^*a$ counts the number of photons in a state

$$N|n\rangle = n\,|n\rangle. \tag{129}$$

We will need the following:


Lemma 1. Let A and B be linear transformations such that [A, [A, B]] = [B, [A, B]] = 0, then

$$e^{A+B} = e^{-\frac{1}{2}[A,B]}\, e^{A}\, e^{B}. \tag{130}$$

Problem 37. Prove the lemma by showing that

$$t \mapsto F(t) := e^{-\frac{t^2}{2}[A,B]}\, e^{tA}\, e^{tB} \tag{131}$$

satisfies the differential equation

$$\frac{dF}{dt} = (A + B)\,F(t) \quad\text{and}\quad F(0) = \mathbb{1}. \tag{132}$$

You need to use [A, [A, B]] = [B, [A, B]] = 0.
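Lemma 1 can be checked numerically on a small example before attempting the proof. The sketch below uses two strictly upper triangular 3 × 3 matrices, an illustrative choice for which the commutator [A, B] commutes with both A and B; the helper functions are assumptions of this sketch and the power series is exact here because all matrices involved are nilpotent.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(A, B, sign=1.0):
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(s, A):
    return [[s * x for x in row] for row in A]

def expm(A, terms=25):
    """Matrix exponential via its power series (terminates for nilpotent A)."""
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = scale(1.0 / k, matmul(term, A))
        result = add(result, term)
    return result

def comm(A, B):
    return add(matmul(A, B), matmul(B, A), sign=-1.0)

# Illustrative matrices: A = 2 E_12 and B = 3 E_23, so [A, B] = 6 E_13.
A = [[0.0, 2.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
B = [[0.0, 0.0, 0.0], [0.0, 0.0, 3.0], [0.0, 0.0, 0.0]]
K = comm(A, B)

lhs = expm(add(A, B))                                          # e^{A+B}
rhs = matmul(expm(scale(-0.5, K)), matmul(expm(A), expm(B)))   # right hand side of (130)
bch_error = max(abs(lhs[i][j] - rhs[i][j]) for i in range(3) for j in range(3))
print(bch_error)
```

The same matrices also show that the correction factor is needed: without it the (1, 3) entries of the two sides differ.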

We can apply this to exponentials of linear combinations of the creation and annihilation operator. E.g.:

$$e^{z a^* - \bar{z} a} = e^{-\frac{1}{2}|z|^2}\, e^{z a^*}\, e^{-\bar{z} a}, \quad z \in \mathbb{C}. \tag{133}$$

Remark that the left hand side is the exponential of a skew-Hermitian transformation and that it is therefore unitary.
It turns out that the radiation of a laser is well-described by coherent states

$$e^{z a^* - \bar{z} a}\,\Omega \tag{134}$$

depending on a complex parameter z. Using (133) and $a\,\Omega = 0$ we expand such a coherent state in the particle number basis

$$e^{z a^* - \bar{z} a}\,\Omega = e^{-\frac{1}{2}|z|^2}\, e^{z a^*}\,\Omega = e^{-\frac{1}{2}|z|^2} \sum_{n=0}^{\infty} \frac{z^n}{\sqrt{n!}}\,|n\rangle. \tag{135}$$

This means that the statistics of the number operator is Poisson:

$$\mathrm{Prob}\{N = n\} = e^{-|z|^2}\,\frac{|z|^{2n}}{n!}. \tag{136}$$

$|z|^2$ fixes the average number of photons, i.e. the intensity of the beam.
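The Poisson statistics (136) are easy to tabulate. The following short sketch, with an arbitrarily chosen value of z and an assumed cut-off of the photon number, checks that the probabilities sum to one and that both the mean and the variance equal $|z|^2$.

```python
import math

def photon_number_distribution(z, n_max):
    """Prob{N = n} for n = 0..n_max in the coherent state with parameter z, eq. (136)."""
    mean = abs(z) ** 2
    return [math.exp(-mean) * mean ** n / math.factorial(n) for n in range(n_max + 1)]

z = 1.5 + 0.5j                                # illustrative value, |z|^2 = 2.5
probs = photon_number_distribution(z, 60)     # the cut-off at 60 photons is an assumption
total = sum(probs)
mean = sum(n * p for n, p in enumerate(probs))
variance = sum(n * n * p for n, p in enumerate(probs)) - mean ** 2
print(total, mean, variance)
```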
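A phenomenological description of radiation loss to the vacuum can be obtained through a weak-coupling type limit. It leads to a Lindblad generator

$$\mathcal{L}(\rho) = a\,\rho\,a^* - \tfrac{1}{2}\{a^*a, \rho\}. \tag{137}$$

Computing the action on the matrix units in the number basis yields

$$\mathcal{L}(|n\rangle\langle m|) = a\,|n\rangle\langle m|\,a^* - \tfrac{1}{2}\,a^*a\,|n\rangle\langle m| - \tfrac{1}{2}\,|n\rangle\langle m|\,a^*a \tag{138}$$

$$= \sqrt{mn}\;|n-1\rangle\langle m-1| - \tfrac{1}{2}(n+m)\,|n\rangle\langle m|. \tag{139}$$

This means that

$$e^{t\mathcal{L}}(|n\rangle\langle m|) = e^{-(n+m)t/2}\,|n\rangle\langle m| + \sum_{k=1}^{\min(m,n)} c_k(t)\,|n-k\rangle\langle m-k|. \tag{140}$$

Hence, for any initial density matrix $\rho$ we have

$$\lim_{t \to \infty} e^{t\mathcal{L}}(\rho) = |0\rangle\langle 0|. \tag{141}$$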

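It is possible to compute explicitly the time evolution of functions of the number operator and the evolution of coherent states. Deriving the general results is a bit too lengthy in this context and we limit ourselves to an example. In Heisenberg picture, with $\mathcal{L}^*$ the dual of the generator (137),

$$\mathcal{L}^*(N) = a^* N a - \tfrac{1}{2}\{a^*a, N\} = -N. \tag{142}$$

Hence

$$e^{t\mathcal{L}^*}(N) = e^{-t} N. \tag{143}$$

Similarly

$$\mathcal{L}^*(N^2) = a^* N^2 a - \tfrac{1}{2}\{a^*a, N^2\} = -2N^2 + N. \tag{144}$$

Hence

$$e^{t\mathcal{L}^*}(N^2) = e^{-2t} N^2 + e^{-t}(1 - e^{-t})\,N. \tag{145}$$

Computing the intensity of the beam at time t leads to exponential decay

$$\langle N \rangle_t = e^{-t}\,\langle N \rangle_0. \tag{146}$$

Computing the long-time behaviour of the standard deviation we get

$$\Delta N_t \approx e^{-t/2}\,\sqrt{\langle N \rangle_0}. \tag{147}$$

Problem 38. Verify the computations above.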
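The moment formulas can also be checked numerically. Restricting (139) to n = m shows that the photon number probabilities $p_n$ of a state obey $dp_n/dt = (n+1)\,p_{n+1} - n\,p_n$. The sketch below integrates this rate equation with a simple Euler scheme, starting from a truncated Poisson distribution, and compares the first two moments with the predictions of (143) and (145). The truncation, the step size and the initial mean are illustrative assumptions.

```python
import math

def loss_step(p, dt):
    """One Euler step of dp_n/dt = (n + 1) p_{n+1} - n p_n."""
    n_max = len(p) - 1
    new = []
    for n in range(n_max + 1):
        gain = (n + 1) * p[n + 1] if n < n_max else 0.0
        new.append(p[n] + dt * (gain - n * p[n]))
    return new

def moments(p):
    """First and second moment of the photon number distribution p."""
    m1 = sum(n * x for n, x in enumerate(p))
    m2 = sum(n * n * x for n, x in enumerate(p))
    return m1, m2

# Initial Poisson statistics with mean 4, truncated at 40 photons (assumed values).
mean0 = 4.0
p = [math.exp(-mean0) * mean0 ** n / math.factorial(n) for n in range(41)]
norm = sum(p)
p = [x / norm for x in p]
n1_0, n2_0 = moments(p)

t, steps = 1.0, 20000
for _ in range(steps):
    p = loss_step(p, t / steps)
n1_t, n2_t = moments(p)

pred1 = math.exp(-t) * n1_0                                                       # from (143)
pred2 = math.exp(-2 * t) * n2_0 + math.exp(-t) * (1 - math.exp(-t)) * n1_0        # from (145)
print(n1_t, pred1, n2_t, pred2)
```

Because the loss process only moves probability downwards in n, the truncation of the initial distribution introduces no additional error during the evolution.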


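Problem 39. For those who like more challenging computations, prove the following

$$\mathcal{L}^*\bigl(N(N - \mathbb{1}) \cdots (N - k\mathbb{1})\bigr) = -(k+1)\,N(N - \mathbb{1}) \cdots (N - k\mathbb{1}) \tag{148}$$

$$e^{t\mathcal{L}^*}\bigl(e^{-sN}\bigr) = \exp\Bigl(N \log\bigl(1 + e^{-s-t} - e^{-t}\bigr)\Bigr). \tag{149}$$

Here k is a natural number and s and t can be taken non-negative in order to avoid convergence problems.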

3.5 Thermalising maps

We present here very schematically a simple black box dynamics that describes how an environment in thermal equilibrium that is weakly coupled to a small system drives the system towards equilibrium at the temperature of the environment. This is an approximate description that is only valid in a situation where the system is weakly interacting with the thermal bath and where it consequently takes a long time to equilibrate the small system. This explains the name weak-coupling limit and the need to consider long times.
Let us denote by S the system and by B the thermal bath or environment. We assume that S has d accessible levels and that B is an infinite system. In particular, the energy spectrum of S is discrete while B has a continuous spectrum. Although we will not enter into the technical details, this is an important feature. We will briefly describe the outcome of the weak-coupling limit but in order to prove its existence one needs some technical properties that cannot hold for systems with a discrete energy spectrum.
3.5.1 Reversible dynamics of the system

Let $H^S$ denote the Hamiltonian of the system, i.e. a d-dimensional Hermitian matrix. We then know that there is an orthonormal basis $\{|j\rangle\}$ of eigenvectors of $H^S$: $H^S|j\rangle = E_j|j\rangle$. For a generic Hamiltonian we can always assume that the eigenvalues are non-degenerate. The dynamics, in Heisenberg picture, is given by

$$A \mapsto e^{itH^S} A\, e^{-itH^S} = \exp\bigl(it[H^S, \cdot\,]\bigr)(A), \quad A \in M_d. \tag{150}$$

In this formula $[H^S, \cdot\,]$ is the short-hand notation for the super-operator

$$[H^S, \cdot\,](A) := [H^S, A], \quad A \in M_d. \tag{151}$$

It is not difficult to compute the eigenvalues and eigenvectors of the super-operator $[H^S, \cdot\,]$:

$$[H^S, |k\rangle\langle\ell|] = (E_k - E_\ell)\,|k\rangle\langle\ell| = \omega_{k\ell}\,|k\rangle\langle\ell|, \quad k, \ell = 1, 2, \ldots, d. \tag{152}$$

The energy gaps $\{\omega_{k\ell}\}$ between the levels of $H^S$ are called the Bohr frequencies of the evolution. Generically, each $\omega_{k\ell}$ is non-degenerate except for $0 = \omega_{kk}$ which has a degeneracy d. An arbitrary observable A can be expanded in the matrix units $\{|k\rangle\langle\ell|\}$ and this yields the following expression for the dynamics

$$e^{itH^S} A\, e^{-itH^S} = \sum_{k,\ell=1}^{d} A_{k\ell}\, e^{it\omega_{k\ell}}\, |k\rangle\langle\ell|. \tag{153}$$

It is clear from this formula that, for a generic Hamiltonian, the constants of the motion are precisely the linear combinations of the spectral projectors $\{|k\rangle\langle k|\}$ of $H^S$. In other words, the constants of the motion are just the algebra of diagonal matrices. In the sequel we use the notation $\mathrm{dia}(\lambda)$ with $\lambda \in \mathbb{C}^d$ to denote the diagonal matrix whose entries on the diagonal are the components of $\lambda$. It is also clear from (153) that the evolution of any system observable is essentially periodic in time. It is not truly periodic because the Bohr frequencies are in general not integer multiples of a fundamental frequency, nevertheless after a sufficiently long time the system repeats itself up to an arbitrarily small error. Such a behaviour is called quasi or almost periodic. It is in particular impossible for systems with this type of energy spectrum to converge in the long run to some equilibrium situation. Convergence or return to equilibrium can only happen in infinite systems and such a behaviour is needed for the weak-coupling limit.
3.5.2 Equilibrium states of the bath

The equilibrium state at inverse temperature $\beta = 1/kT$ for an N-particle system is given by the canonical Gibbs matrix

$$\rho_\beta = \frac{e^{-\beta H^N}}{Z} = \frac{e^{-\beta H^N}}{\mathrm{Tr}\, e^{-\beta H^N}} \tag{154}$$

where $H^N$ is the N-particle Hamiltonian. This expression makes sense for finite systems but does not survive the thermodynamic limit $N \to \infty$. We can, however, use a purification of the state to get
• a Hilbert space $\mathcal{H}_N$
• a normalized vector $\Omega_N$ in $\mathcal{H}_N$ that generates the expectations of observables
• an algebra $\mathcal{A}_N$ of observables of the N-particle system that acts on $\mathcal{H}_N$. For finite systems, $\mathcal{H}_N$ has a tensor product structure and the observables act only on the first factor. Expectations of bath observables are given in the usual way $\langle X \rangle = \langle \Omega_N, X\,\Omega_N \rangle$, $X \in \mathcal{A}_N$.
• a unitary dynamics on the observables $\mathcal{A}_N$
There is, moreover, a deep connection between the dynamics and the thermal expectations. For any two observables X and Y of the system

$$\langle XY \rangle = \frac{\mathrm{Tr}\, e^{-\beta H^N} XY}{Z} = \frac{\mathrm{Tr}\, e^{-\beta H^N}\, Y\, e^{-\beta H^N} X\, e^{\beta H^N}}{Z} = \langle Y\, \alpha_{i\beta}(X) \rangle. \tag{155}$$

Here

$$\alpha_t(X) := e^{itH^N} X\, e^{-itH^N} \tag{156}$$

is the reversible dynamics of the N-particle system.
This set-up survives the thermodynamic limit at a price: the infinite system Hilbert space $\mathcal{H}$ has no simple tensor decomposition such that the bath observables act only on the first factor. The infinite system can be described as follows:
• there is an infinite dimensional separable Hilbert space $\mathcal{H}$,
• a normalized vector $\Omega$ in $\mathcal{H}$ that generates the expectations of observables, and
• an algebra $\mathcal{A}$ of bath observables that acts on $\mathcal{H}$.
The dynamics is generated by a bath Hamiltonian $H^B$ that acts on $\mathcal{H}$ and that enjoys the following properties:
• $H^B\,\Omega = 0$ which expresses the time-independence of the expectations.
• There is a global dynamics of the observables: if $A \in \mathcal{A}$ then also $e^{itH^B} A\, e^{-itH^B}$ is an observable.
• The dynamics and thermal expectations are still connected as in (155)

$$\langle \Omega, XY\,\Omega \rangle = \langle \Omega, Y\, e^{-\beta H^B} X\, e^{\beta H^B}\,\Omega \rangle, \quad X, Y \in \mathcal{A}. \tag{157}$$

Because of the infinite dimensions, we can now have a very different behaviour of the dynamics, such as return to equilibrium

$$\lim_{t \to \infty} \langle \Omega, X\, e^{itH^B} Y\, e^{-itH^B} Z\,\Omega \rangle = \langle \Omega, XZ\,\Omega \rangle\, \langle \Omega, Y\,\Omega \rangle. \tag{158}$$

Such a behaviour is excluded for finite systems as in that situation all time-dependent expectations are quasi-periodic.

3.5.3 The weak-coupling limit

The weak-coupling limit describes the simplified situation that arises when we couple a d-level system weakly with a thermal bath at inverse temperature $\beta$. Because of the weak interaction we need to wait a long time before the coupling affects the expectation values of system observables. It turns out, however, that after a proper rescaling of coupling strength and time, we end up with a simple dissipative dynamics of the small system: all memory effects disappear and we are left with a semi-group of completely positive maps of the system that drives any initial state toward the equilibrium state of the system at the inverse temperature imposed by the thermal bath.
So, we consider a total Hamiltonian of system + bath

$$H^{\mathrm{tot}} = H^S + H^B + \lambda H^{\mathrm{int}} = H^{\mathrm{free}} + \lambda H^{\mathrm{int}}. \tag{159}$$

The interaction part $H^{\mathrm{int}}$ is assumed to be generic. The proper scaling for the limit is to let $\lambda \to 0$ and $t \to \infty$ such that $\lambda^2 t \to \tau$. This causes, however, very fast oscillations that have to be corrected for using the free evolution ($\lambda = 0$). Modulo technical assumptions on the bath one can show for $\tau \ge 0$, for any $X \in M_d$, and for any d-dimensional density matrix $\rho$ the existence of the following limit

$$\mathrm{Tr}\,\rho\,\Gamma_\tau(X) = \lim_{\substack{\lambda \to 0,\ t \to \infty \\ \lambda^2 t \to \tau}} \mathrm{Tr}\,\bigl(\rho \otimes |\Omega\rangle\langle\Omega|\bigr)\, e^{-itH^{\mathrm{free}}}\, e^{itH^{\mathrm{tot}}}\, X\, e^{-itH^{\mathrm{tot}}}\, e^{itH^{\mathrm{free}}}. \tag{160}$$

3.5.4 Properties of the weak-coupling limit

The weak-coupling limit provides us with a set of super-operators $\{\Gamma_t \mid t \ge 0\}$ in Heisenberg picture or with their dual Schrödinger version $\{\Gamma_t^* \mid t \ge 0\}$. These maps are completely positive and unity- or trace-preserving according to the chosen picture.
A first remarkable property is that they are actually semi-groups:

$$\Gamma_{t_1} \Gamma_{t_2} = \Gamma_{t_1 + t_2}, \quad t_1, t_2 \ge 0. \tag{161}$$

Physically, it means that all memory in the dynamics at finite coupling disappears in the limit. As a consequence there exists a time-independent generator $\tilde{\mathcal{L}}$ of Lindblad type such that

$$\Gamma_t = \exp(t\tilde{\mathcal{L}}), \quad t \ge 0. \tag{162}$$

(The notation $\tilde{\mathcal{L}}$ is only for very local use and intended to make the letter $\mathcal{L}$ available.)
The canonical equilibrium state of the system has density matrix

$$\rho_\beta = \frac{e^{-\beta H^S}}{Z}. \tag{163}$$

This state is not only invariant under each $\Gamma_t^*$ but any initial state of the system will eventually be driven to $\rho_\beta$

$$\lim_{t \to \infty} \Gamma_t^*(\rho) = \rho_\beta, \quad \rho \text{ arbitrary density matrix in } M_d. \tag{164}$$

The Lindblad generator has a very special form:

$$\tilde{\mathcal{L}}(A) = i[H^S, A] + \mathcal{L}(A), \quad A \in M_d. \tag{165}$$

Here $\mathcal{L}$ is the dissipative part of $\tilde{\mathcal{L}}$ and
• The super-operators $\mathcal{L}$ and $[H^S, \cdot\,]$ commute, i.e. the reversible dynamics of the system commutes with the dissipative part.
• $\mathcal{L}$ satisfies quantum detailed balance which is expressed as $\mathcal{L}$ being Hermitian with respect to the scalar product induced by $\rho_\beta$ on the observables of the system

$$\mathrm{Tr}\,\rho_\beta\, X\, \mathcal{L}(Y) = \mathrm{Tr}\,\rho_\beta\, \mathcal{L}(X)\, Y, \quad X, Y \in M_d. \tag{166}$$

3.5.5 Quantum detailed balance maps on $M_d$

In this last section we try to find out the structure of a quantum detailed balance semi-group on $M_d$ in a generic case. We start out with the system Hamiltonian $H^S$ such as in Section 3.5.1 and assume that the Bohr frequencies of $H^S$ are non-degenerate, except for 0 that has a degeneracy d.
We first express that the dissipative and reversible parts of the dynamics commute. On diagonal matrices this leads to

$$[H^S, \mathcal{L}(\mathrm{dia}(\lambda))] = \mathcal{L}([H^S, \mathrm{dia}(\lambda)]) = 0. \tag{167}$$

This means that $\mathcal{L}$ sends constants of the motion into constants of the motion. Because of linearity it implies that there is a d-dimensional matrix L such that

$$\mathcal{L}(\mathrm{dia}(\lambda)) = \mathrm{dia}(L\lambda), \quad \lambda \in \mathbb{C}^d. \tag{168}$$

For off-diagonal matrix units $|k\rangle\langle\ell|$, $k \ne \ell$, we have

$$[H^S, \mathcal{L}(|k\rangle\langle\ell|)] = \mathcal{L}([H^S, |k\rangle\langle\ell|]) = \omega_{k\ell}\, \mathcal{L}(|k\rangle\langle\ell|). \tag{169}$$

This means that either $\mathcal{L}(|k\rangle\langle\ell|)$ is zero or that it is an eigenvector of $[H^S, \cdot\,]$ with eigenvalue $\omega_{k\ell}$. As the non-zero Bohr frequencies are assumed to be non-degenerate we conclude that there exists $c_{k\ell} \in \mathbb{C}$ such that

$$\mathcal{L}(|k\rangle\langle\ell|) = c_{k\ell}\, |k\rangle\langle\ell|, \quad k \ne \ell. \tag{170}$$

We can put the c's in a square table and fill it up with zeroes on the diagonal to get a second d-dimensional matrix C. So, commutation of the dissipative and reversible parts of the dynamics leads us to specify $\mathcal{L}$ in terms of only two d-dimensional matrices L and C.
$\mathcal{L}$ is the generator of a semi-group of unity-preserving completely positive maps if and only if $\mathcal{L}(\mathbb{1}) = 0$ and the Choi encoding $C(\mathcal{L})$ is positive on the orthogonal complement of the maximally entangled state $\frac{1}{\sqrt{d}} \sum_k |kk\rangle$. These conditions are equivalent with
1. L is a generator of a semi-group of stochastic matrices preserving the constant function 1 and
2. the matrix

$$\begin{pmatrix} \ell_{11} & c_{12} & \cdots & c_{1d} \\ c_{21} & \ell_{22} & \cdots & c_{2d} \\ \vdots & \vdots & \ddots & \vdots \\ c_{d1} & c_{d2} & \cdots & \ell_{dd} \end{pmatrix}$$

is positive semi-definite on the orthogonal complement of the constant vector $(1, 1, \ldots, 1)$, where $L = [\ell_{k\ell}]$.
It remains to find out the consequences of quantum detailed balance. A straightforward application of (166) yields that L has to be the generator of a classical detailed balance process with invariant measure given by the diagonal elements of $\rho_\beta$ and moreover the matrix C has to be real symmetric.
Problem 40. Fill out the details in Section 3.5.5.
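To make the classical part of this structure concrete, here is a minimal sketch of a generator L of a classical detailed balance process whose invariant measure is the Gibbs distribution of a few energy levels. The Metropolis choice of jump rates, the energies and the inverse temperature are assumptions made for illustration only; the weak-coupling limit itself produces rates that depend on the bath.

```python
import math

def gibbs_weights(energies, beta):
    """Diagonal of the canonical state: weights proportional to exp(-beta E_k)."""
    w = [math.exp(-beta * e) for e in energies]
    z = sum(w)
    return [x / z for x in w]

def metropolis_generator(energies, beta):
    """Rate matrix L with zero row sums (so L preserves the constant function 1).

    Off-diagonal entries follow the Metropolis rule, an assumed illustrative choice.
    """
    d = len(energies)
    L = [[0.0] * d for _ in range(d)]
    for k in range(d):
        for l in range(d):
            if k != l:
                L[k][l] = min(1.0, math.exp(-beta * (energies[l] - energies[k])))
        L[k][k] = -sum(L[k])   # diagonal entry is still zero when the row is summed
    return L

energies, beta = [0.0, 1.0, 2.5], 0.8   # hypothetical level scheme and temperature
pi = gibbs_weights(energies, beta)
L = metropolis_generator(energies, beta)

row_sums = [sum(row) for row in L]
balance_defect = max(abs(pi[k] * L[k][l] - pi[l] * L[l][k])
                     for k in range(3) for l in range(3))
invariance_defect = max(abs(sum(pi[k] * L[k][l] for k in range(3))) for l in range(3))
print(row_sums, balance_defect, invariance_defect)
```

Detailed balance, $\pi_k L_{k\ell} = \pi_\ell L_{\ell k}$, together with zero row sums implies that $\pi$ is invariant, which the last quantity confirms numerically.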

4 Entropy

4.1 Classical entropy

A stationary classical source emits letters belonging to an alphabet $X = \{x_1, x_2, \ldots, x_d\}$ at discrete times. After N emissions we obtain a message $x_{i_0} x_{i_1} \cdots x_{i_{N-1}}$. We now assume that we can empirically establish the statistics of the source by counting relative frequencies. This is actually the meaning of stationarity. So the probability distribution of single letters is given by

$$\mu(x_j) = \lim_{N \to \infty} \frac{1}{N}\, N(x_j) \tag{171}$$

where $N(x_j)$ is the number of times that the letter $x_j$ appears up to time N − 1. We can also look for relative frequencies of two letter words such as $x_{j_0} x_{j_1}$

$$\mu(x_{j_0} x_{j_1}) = \lim_{N \to \infty} \frac{1}{N}\, N(x_{j_0} x_{j_1}). \tag{172}$$

Here $N(x_{j_0} x_{j_1})$ is the number of times that the letters $x_{j_0}$ and $x_{j_1}$ consecutively appear in the message. It is easily seen from the construction that

$$\mu(x_{j_0}) = \sum_{j_1=1}^{d} \mu(x_{j_0} x_{j_1}). \tag{173}$$

This reflects the compatibility between the probability distributions on one and two letter words. Another relation that we obtain is

$$\mu(x_{j_1}) = \sum_{j_0=1}^{d} \mu(x_{j_0} x_{j_1}). \tag{174}$$

This expresses the stationarity of the source or equivalently shift-invariance of our probability measure on two letter words: the probability of all two letter words with a given first letter is the same as that of two letter words with the same letter on the second place. This procedure can be extended to words of arbitrary length and computing relative frequencies for all possible finite words we obtain a shift-invariant probability measure $\mu$ on $X^{\mathbb{Z}}$. It is in these terms that a stationary source is described. Depending on additional properties of $\mu$ one distinguishes particular classes of sources.

A source is memoryless if $\mu$ is a product measure:

$$\mu(x_{j_0} x_{j_1} \cdots x_{j_k}) = \mu(x_{j_0})\, \mu(x_{j_1}) \cdots \mu(x_{j_k}). \tag{175}$$

This means that the letters sent out by the source are completely uncorrelated. Correlations of time length 1 yield Markov measures:

$$\mu(x_{j_0} x_{j_1} \cdots x_{j_k}) = \frac{\mu(x_{j_0} x_{j_1})\, \mu(x_{j_1} x_{j_2}) \cdots \mu(x_{j_{k-1}} x_{j_k})}{\mu(x_{j_1})\, \mu(x_{j_2}) \cdots \mu(x_{j_{k-1}})} = \mu(x_{j_0})\, T_{j_0 j_1} T_{j_1 j_2} \cdots T_{j_{k-1} j_k} \tag{176}$$

with

$$T_{jk} := \frac{\mu(x_j x_k)}{\mu(x_j)}. \tag{177}$$

The matrix $T = [T_{jk}]$ is called the transition matrix. It has non-negative entries and its row sums are equal to 1. This matrix describes the probability of jumping from j to k. Such matrices are also called stochastic. In order to obtain a shift-invariant Markov measure one has to impose

$$\sum_{k=1}^{d} \mu(x_j x_k) = \sum_{k=1}^{d} \mu(x_k x_j). \tag{178}$$

Obviously there exist much more complicated measures.
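A Markov measure is easy to build explicitly. The sketch below starts from an illustrative two-letter transition matrix, finds its stationary distribution by power iteration (an assumed, simple method), and checks the compatibility relation (173) and the shift-invariance relation (174) on two letter words. All names and numbers are hypothetical.

```python
def stationary(T, iterations=2000):
    """Left fixed point of T by power iteration: mu_k = sum_j mu_j T_jk."""
    d = len(T)
    mu = [1.0 / d] * d
    for _ in range(iterations):
        mu = [sum(mu[j] * T[j][k] for j in range(d)) for k in range(d)]
    return mu

def word_probability(word, mu, T):
    """Markov measure (176): mu(x_j0) T_j0j1 ... T_j(k-1)jk for a word of letter indices."""
    p = mu[word[0]]
    for j, k in zip(word, word[1:]):
        p *= T[j][k]
    return p

T = [[0.9, 0.1], [0.4, 0.6]]   # illustrative transition matrix, rows sum to 1
mu = stationary(T)             # converges to (0.8, 0.2) for this T

compat = [sum(word_probability((j, k), mu, T) for k in range(2)) for j in range(2)]
shifted = [sum(word_probability((k, j), mu, T) for k in range(2)) for j in range(2)]
print(mu, compat, shifted)
```

Compatibility holds for any initial distribution because T is stochastic, while shift-invariance is exactly the requirement that the single letter distribution is stationary for T.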


An important informational task is to compress long messages before sending them through a transmission channel and then to restore the original message by a suitable decompression. Such a procedure is very useful whenever transmission or storing of information is costly or space limited. Instead of requiring a perfect reconstruction of the original unknown message one may require that most messages emitted by the source are restored in the end. This can allow for big savings whenever some words appear more frequently than others. The extreme example is that of a source emitting only a single letter. The main idea of compression is to assign short code words to parts of message words that often appear.
Let us consider a memoryless source and determine the probability of occurrence of a word $x_{j_0} x_{j_1} \cdots x_{j_{N-1}}$ of length N. Suppose that the letter $x_j$ occurs $N_j$ times. We find therefore the probability

$$\prod_{j=1}^{d} \mu(x_j)^{N_j}. \tag{179}$$

Now there are $N!/(N_1!\, N_2! \cdots N_d!)$ such words. Therefore the probability distribution of the letter counts $(N_1, \ldots, N_d)$ is multinomial. For large N such a distribution is sharply peaked around its maximum that is attained at $N_j = \mu(x_j) N$. This follows from Stirling's formula. It also follows that almost the full weight of the probability distribution is concentrated on a subset of words of size $\exp(N(h + \epsilon))$ where

$$h := -\sum_{j=1}^{d} \mu(x_j) \log \mu(x_j). \tag{180}$$

The quantity h is called the Shannon entropy of the source. Shannon's noiseless (memoryless) compression theorem states that for such a source the space of relevant messages of length N cannot be compressed below a size of order $\exp(Nh)$.
Problem 41. Fill in the details of the computation above.
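The counting argument can be illustrated numerically. For an assumed three-letter source and word length N = 400, the sketch below computes the logarithm of the number of words with exactly the typical letter counts $N_j = \mu(x_j) N$ and compares it with Nh; it also confirms that each such word has probability exactly $\exp(-Nh)$. The distribution and N are illustrative choices.

```python
import math

def shannon_entropy(mu):
    """h = -sum_j mu_j log mu_j, eq. (180), in natural units."""
    return -sum(p * math.log(p) for p in mu if p > 0)

def log_multinomial(counts):
    """log of N! / (N_1! ... N_d!) computed with log-gamma to avoid huge integers."""
    n = sum(counts)
    return math.lgamma(n + 1) - sum(math.lgamma(c + 1) for c in counts)

mu = [0.5, 0.25, 0.25]                      # illustrative letter probabilities
N = 400
counts = [round(p * N) for p in mu]         # typical counts N_j = mu_j N
h = shannon_entropy(mu)

log_count = log_multinomial(counts)         # log of the number of typical-count words
log_word_prob = sum(c * math.log(p) for c, p in zip(counts, mu))   # log of (179)
print(log_count / N, h, log_word_prob / N)
```

The gap between `log_count / N` and h is of order (log N)/N, which is the Stirling correction that disappears in the limit.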

4.2 Construction of the Shannon entropy

Define the function

$$\eta(x) := -x \log x \ \text{ for } 0 < x \le 1 \quad\text{and}\quad \eta(0) := 0. \tag{181}$$

It is not hard to check that $\eta$ is non-negative, strictly concave, and smooth on ]0, 1]. Moreover it vanishes only at x = 0 and x = 1.
Let X be a finite set equipped with a probability measure $\mu$. The Shannon entropy of $\mu$ is defined as

$$H(X) := \sum_{x \in X} \eta(\mu(x)). \tag{182}$$

The notation H(X) is traditional and not quite optimal, $H(\mu)$ would fit better. H takes values in [0, log |X|]. The extreme value 0 is attained for degenerate measures (giving weight 1 to one of the points in X) while log |X| is attained for the uniform measure $\mu(x) = 1/|X|$. Furthermore H is strictly concave: the entropy of a non-trivial mixture of two measures is strictly larger than the corresponding mixture of entropies.
Problem 42. Verify the statements just made.

Consider a composite system $X \times Y$ equipped with a probability measure $\mu$. The marginals $\mu_X$ and $\mu_Y$ of this measure define probabilities on X and Y:

$$\mu_X(x) := \sum_{y \in Y} \mu(xy). \tag{183}$$

The term joint entropy is used for

$$H(XY) := H(X \times Y). \tag{184}$$

We have the following relations

$$H(X) \le H(XY) \quad\text{monotonicity and} \tag{185}$$

$$H(XY) \le H(X) + H(Y) \quad\text{subadditivity.} \tag{186}$$

Moreover, the subadditivity inequality is saturated iff $\mu$ is a product measure.
The mutual information is the degree of non-saturation of subadditivity

$$H(X : Y) := H(X) + H(Y) - H(XY). \tag{187}$$

H(X : Y) = 0 iff the joint measure is the product of its marginals, this means that there is independence between the two subsystems and so certainly no strong correlations that would allow to identify with high probability elements of X with elements of Y. The maximal value that H(X : Y) can attain is H(X). This happens when there is a bijective map f connecting the elements with positive probability in X with those of positive probability in Y such that

$$\mu(x f(x)) > 0 \quad\text{and}\quad \mu(xy) = 0 \ \text{ for } y \ne f(x). \tag{188}$$

Another important property is strong subadditivity

$$H(XYZ) + H(Y) \le H(XY) + H(YZ). \tag{189}$$

Markov measures saturate this inequality, where Markov means

$$\mu_{XYZ}(x, y, z) = \frac{\mu_{XY}(x, y)\, \mu_{YZ}(y, z)}{\mu_Y(y)}. \tag{190}$$

The subscripts indicate the marginals of the joint measure.
Problem 43. Verify the statements just made.
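These relations can be explored on small examples. The following sketch computes marginals, joint entropy and mutual information for three hypothetical joint distributions: a correlated one, a product measure, and a perfectly correlated one of the form (188). The dictionaries and helper names are illustrative assumptions.

```python
import math

def eta(x):
    """eta(x) = -x log x with eta(0) = 0, eq. (181)."""
    return -x * math.log(x) if x > 0 else 0.0

def entropy(dist):
    """Shannon entropy of a distribution given as {outcome: probability}."""
    return sum(eta(p) for p in dist.values())

def marginal(joint, axis):
    """Marginal of a joint distribution on pairs; axis 0 keeps x, axis 1 keeps y."""
    out = {}
    for key, p in joint.items():
        out[key[axis]] = out.get(key[axis], 0.0) + p
    return out

def mutual_information(joint):
    """H(X : Y) = H(X) + H(Y) - H(XY), eq. (187)."""
    return entropy(marginal(joint, 0)) + entropy(marginal(joint, 1)) - entropy(joint)

correlated = {("a", "u"): 0.4, ("a", "v"): 0.1, ("b", "u"): 0.1, ("b", "v"): 0.4}
product = {(x, y): px * py
           for x, px in (("a", 0.3), ("b", 0.7))
           for y, py in (("u", 0.6), ("v", 0.4))}
perfect = {("a", "u"): 0.3, ("b", "v"): 0.7}   # y = f(x) with f(a) = u, f(b) = v

for name, joint in (("correlated", correlated), ("product", product), ("perfect", perfect)):
    print(name, entropy(marginal(joint, 0)), entropy(joint), mutual_information(joint))
```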


Theorem 5. Let $\mu$ be a shift-invariant probability measure on $X^{\mathbb{Z}}$ then

$$h(\mu) := \lim_{n \to \infty} \frac{1}{n}\, H(\times^n X) \tag{191}$$

exists. It is called the entropy of the source $\mu$.

Proof. The proof is based on subadditivity. In fact one shows that

$$h(\mu) = \inf_n \frac{1}{n}\, H(\times^n X). \tag{192}$$

Problem 44. Compute the entropy of a Markov measure.

Actually, this result can be refined. Using the short notation $H_n = H(\times^n X)$
Theorem 6. For n = 1, 2, . . .

$$0 \le H_{n+2} - H_{n+1} \le H_{n+1} - H_n \tag{193}$$

and

$$h = \lim_{n \to \infty} \bigl(H_{n+1} - H_n\bigr). \tag{194}$$

Theorem 6 is much stronger than Theorem 5: it tells that h is the entropy produced each time unit while the former just states that h is the average entropy produced over a large time.
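Both theorems can be watched at work on a small stationary Markov source. The sketch below enumerates all words of length n up to n = 8 for an illustrative two-letter chain (the transition matrix and its stationary distribution are assumed values), computes the block entropies $H_n$, and prints the averages $H_n/n$ next to the increments $H_{n+1} - H_n$.

```python
import math
from itertools import product

T = [[0.9, 0.1], [0.4, 0.6]]   # illustrative transition matrix
mu = [0.8, 0.2]                # its stationary distribution: mu T = mu

def block_entropy(n):
    """H_n: Shannon entropy of the Markov measure restricted to words of length n."""
    total = 0.0
    for word in product(range(2), repeat=n):
        p = mu[word[0]]
        for j, k in zip(word, word[1:]):
            p *= T[j][k]
        if p > 0:
            total -= p * math.log(p)
    return total

H = [block_entropy(n) for n in range(1, 9)]               # H_1 .. H_8
averages = [H[n - 1] / n for n in range(1, 9)]            # H_n / n
increments = [H[i + 1] - H[i] for i in range(len(H) - 1)] # H_{n+1} - H_n
for n in range(1, 8):
    print(n, averages[n - 1], increments[n - 1])
```

For this source the increments are already constant from n = 1 on, while the averages approach the same value only slowly, which illustrates why the statement of Theorem 6 is the sharper one.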
In order to compute the probability of all the words with given first letter $x_k$ and third letter $x_\ell$ we have to compute

$$\sum_{i=1}^{d} \mu(x_k x_i x_\ell). \tag{195}$$

In a similar way we can obtain probabilities of general configurations fixed at a number of times. A measure is mixing if probabilities of finite words pulled far apart factorise in probabilities of the words. For two single letter words

$$\lim_{n \to \infty} \sum_{i_1, i_2, \ldots, i_n} \mu(x_k x_{i_1} x_{i_2} \cdots x_{i_n} x_\ell) = \mu(x_k)\, \mu(x_\ell). \tag{196}$$

We can now state the general Shannon compression theorem:

Theorem 7. Let $\mu$ be a shift-invariant, mixing probability measure on $X^{\mathbb{Z}}$ with entropy h. For any $\epsilon, \delta > 0$ and n sufficiently large there exist coding and decoding maps $C_n$ and $D_n$ to and from a set $R_n$ with at most $\exp(n(h + \epsilon))$ elements such that

$$\mu\bigl(\{x \in \times^n X \mid D_n \circ C_n(x) = x\}\bigr) \ge 1 - \delta. \tag{197}$$

If $|R_n| \le \exp(n(h - \epsilon))$ then no such maps exist.

4.3 Quantum Entropy

We now deal with a memoryless source emitting pure states $\{\varphi_i\}$ with probabilities $\{p_i\}$. This defines a quantum ensemble $\{(p_i, \varphi_i)\}$ of pure states. We can in the same way consider a quantum ensemble $\{(p_i, \rho_i)\}$ of mixed states, this can be thought to be a source emitting pure composite states of which only one party is accessible. Observing the source by repeated measurements, we see an average state

$$\rho = \sum_i p_i\, |\varphi_i\rangle\langle\varphi_i| \quad\text{or}\quad \rho = \sum_i p_i\, \rho_i. \tag{198}$$

Moreover, observations between consecutive times will be uncorrelated

$$\langle A_1 \otimes A_2 \rangle = \mathrm{Tr}\,(\rho \otimes \rho)(A_1 \otimes A_2). \tag{199}$$

Note that, in contrast to classical systems, we cannot interpret $\rho$ as an ensemble. Differently built sources may generate the same $\rho$ because a quantum state space is very far from a simplex.
As before, we have an entropy S that measures the uncertainty in the state spit out by the source, this is the von Neumann entropy

$$S(\rho) := \mathrm{Tr}\,\eta(\rho) = -\mathrm{Tr}\,\rho \log \rho. \tag{200}$$

It is not hard to see that $S(\rho) = 0$ iff all $\varphi_i$ coincide (or all $\rho_i$ are pure and coincide).
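The remark that differently built sources may generate the same average state can be seen on a single qubit. In the sketch below, two hypothetical ensembles (equal mixtures of the two vectors of the standard basis, and of the two vectors $(1, \pm 1)/\sqrt{2}$) lead to the same density matrix and hence to the same von Neumann entropy. The closed-form eigenvalues are specific to 2 × 2 matrices and all names are assumptions of this sketch.

```python
import math

def density_matrix(ensemble):
    """rho = sum_i p_i |phi_i><phi_i| for pairs (p_i, phi_i) with phi_i in C^2, eq. (198)."""
    rho = [[0j, 0j], [0j, 0j]]
    for p, phi in ensemble:
        for r in range(2):
            for c in range(2):
                rho[r][c] += p * phi[r] * phi[c].conjugate()
    return rho

def eigenvalues_2x2(rho):
    """Eigenvalues of a Hermitian 2x2 matrix from its trace and determinant."""
    tr = (rho[0][0] + rho[1][1]).real
    det = (rho[0][0] * rho[1][1] - rho[0][1] * rho[1][0]).real
    disc = math.sqrt(max(tr * tr - 4 * det, 0.0))
    return [(tr - disc) / 2, (tr + disc) / 2]

def von_neumann_entropy(rho):
    """S(rho) = -Tr rho log rho, eq. (200), evaluated on the eigenvalues."""
    return -sum(x * math.log(x) for x in eigenvalues_2x2(rho) if x > 1e-15)

s = 1 / math.sqrt(2)
ensemble_z = [(0.5, [1.0, 0.0]), (0.5, [0.0, 1.0])]
ensemble_x = [(0.5, [s, s]), (0.5, [s, -s])]
rho_z, rho_x = density_matrix(ensemble_z), density_matrix(ensemble_x)
rho_pure = density_matrix([(0.5, [1.0, 0.0]), (0.5, [1.0, 0.0])])  # all states coincide

print(von_neumann_entropy(rho_z), von_neumann_entropy(rho_x), von_neumann_entropy(rho_pure))
```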
For more complicated sources (not memoryless) one has a description as in the classical case. The multi-time correlated expectations are given by reduced density matrices $\rho^{(n)}$ on $M_d \otimes \cdots \otimes M_d$ (n times) and we have the relations

$$\mathrm{Tr}\,\rho^{(n+1)} (A_n \otimes \mathbb{1}) = \mathrm{Tr}\,\rho^{(n)} A_n, \quad\text{compatibility} \tag{201}$$

$$\mathrm{Tr}\,\rho^{(n+1)} (\mathbb{1} \otimes A_n) = \mathrm{Tr}\,\rho^{(n)} A_n, \quad\text{shift-invariance.} \tag{202}$$

Here $A_n$ is an arbitrary observable in $M_d \otimes \cdots \otimes M_d$ (n times). The collection of (reduced) density matrices $\{\rho^{(n)}\}$ now defines a shift-invariant state on $\otimes^{\mathbb{Z}} M_d$.
The von Neumann entropy of a density matrix $\rho$ in $M_d$ has properties similar to those of the Shannon entropy. It is easily seen that its range is [0, log d], $S(\rho) = 0$ iff $\rho$ is pure and the maximal value log d is attained iff $\rho$ is the uniform state $\mathbb{1}/d$. It is also a concave function which follows from Lemma 2.
Lemma 2. Let $f : [a, b] \to \mathbb{R}$ be concave, then $A \mapsto \mathrm{Tr}\, f(A)$ is concave on the set of Hermitian matrices in $M_d$ whose eigenvalues are a subset of [a, b].
Problem 45. Fill out the details of the proof.
Let $\rho_{12}$ be a density matrix of a bipartite system $M_{d_1} \otimes M_{d_2}$ with reduced density matrices $\rho_1$ and $\rho_2$, then

$$S(\rho_{12}) \le S(\rho_1) + S(\rho_2), \quad\text{subadditivity and} \tag{203}$$

$$S(\rho_2) \le S(\rho_{12}) + S(\rho_1). \tag{204}$$

The term $S(\rho_1)$ in the right hand side of (204) is needed because monotonicity does not hold in the quantum case, e.g., $\rho_{12}$ can be pure in which case $\rho_1$ and $\rho_2$ have the same entropy, generally strictly positive. In this case (204) is saturated. A more symmetric version of (204) reads

$$|S(\rho_1) - S(\rho_2)| \le S(\rho_{12}). \tag{205}$$

The proof of subadditivity relies on Klein's inequality while that of (204) uses strong subadditivity, see later.
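The failure of monotonicity is already visible for two qubits. The sketch below takes the vector $(|00\rangle + |11\rangle)/\sqrt{2}$ as an illustrative pure state, computes both reduced density matrices by partial traces, and compares entropies. The basis ordering and helper names are assumptions, and the entropy of the joint state is set to zero on the grounds that a state with $\mathrm{Tr}\,\rho^2 = 1$ is pure.

```python
import math

s = 1 / math.sqrt(2)
psi = [s, 0.0, 0.0, s]                           # amplitudes in the basis 00, 01, 10, 11
rho12 = [[x * y for y in psi] for x in psi]      # |psi><psi|; amplitudes are real

def partial_trace_second(rho):
    """Reduced matrix of the first qubit: trace out the second index."""
    return [[sum(rho[2 * i + k][2 * j + k] for k in range(2)) for j in range(2)]
            for i in range(2)]

def partial_trace_first(rho):
    """Reduced matrix of the second qubit: trace out the first index."""
    return [[sum(rho[2 * k + i][2 * k + j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def entropy_2x2(rho):
    """von Neumann entropy of a real symmetric 2x2 density matrix."""
    tr = rho[0][0] + rho[1][1]
    det = rho[0][0] * rho[1][1] - rho[0][1] * rho[1][0]
    disc = math.sqrt(max(tr * tr - 4 * det, 0.0))
    return -sum(x * math.log(x) for x in ((tr - disc) / 2, (tr + disc) / 2) if x > 1e-15)

purity = sum(rho12[i][j] * rho12[j][i] for i in range(4) for j in range(4))  # Tr rho12^2
S1 = entropy_2x2(partial_trace_second(rho12))
S2 = entropy_2x2(partial_trace_first(rho12))
S12 = 0.0   # purity equals 1, so rho12 is pure and its entropy vanishes
print(purity, S1, S2, S12)
```

Here $S(\rho_1) = S(\rho_2) = \log 2$ while $S(\rho_{12}) = 0$, so the classical inequality (185) has no quantum analogue, whereas (204) and (205) hold with equality.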
Lemma 3. Let $x \in \mathbb{R} \mapsto A(x)$ be a continuously differentiable function taking values in the Hermitian matrices and let $f : \mathbb{R} \to \mathbb{R}$ be continuously differentiable, then

$$\frac{d}{dx}\, \mathrm{Tr}\, f(A(x)) = \mathrm{Tr}\Bigl( f'(A(x))\, \frac{dA}{dx}(x) \Bigr). \tag{206}$$

Problem 46. Fill out the details of the proof.

Lemma 4. A continuously differentiable function $f : \,]a, b[\, \to \mathbb{R}$ is concave iff

$$f(y) \le f(x) + (y - x)\, f'(x) \quad\text{for all } x, y \in\, ]a, b[. \tag{207}$$

Problem 47. Fill out the details of the proof.

Lemma 5 (Klein's inequality). Let $f : \,]a, b[\, \to \mathbb{R}$ be continuously differentiable and concave and let A, B be Hermitian matrices in $M_d$ whose eigenvalues belong to ]a, b[, then

$$\mathrm{Tr}\, f(B) \le \mathrm{Tr}\bigl( f(A) + (B - A)\, f'(A) \bigr). \tag{208}$$

Problem 48. Fill out the details of the proof.

Lemma 6 (Positivity of relative entropy). Let $\rho$ and $\sigma$ be density matrices and assume that $\sigma$ is strictly positive, then

$$S(\rho\,|\,\sigma) := \mathrm{Tr}\,\rho\,(\log \rho - \log \sigma) \ge 0. \tag{209}$$

Relative entropy is an important quantity. In statistics it is related to hypothesis testing. In statistical mechanics it appears in the context of the variational principle for the free energy. Suppose that $\sigma$ is a canonical equilibrium state: $\sigma = \exp(-\beta H)$ where the appropriate normalisation constant, the log of the partition function, has been absorbed in the Hamiltonian, then

$$S(\rho\,|\,\sigma) = \beta\,\langle H \rangle_\rho - S(\rho) \tag{210}$$

is, up to the factor $\beta$, the expected internal energy of the system in the state $\rho$ minus the entropy of $\rho$. Thermodynamic equilibrium is attained when this quantity reaches its minimal value.
Problem 49. Fill out the details of the proof.
Subadditivity now follows from

$$0 \le S(\rho_{12}\,|\,\rho_1 \otimes \rho_2) = S(\rho_1) + S(\rho_2) - S(\rho_{12}). \tag{211}$$

A really profound result, the proof of which goes beyond the scope of these lectures, is quantum strong subadditivity

$$S(\rho_{123}) + S(\rho_2) \le S(\rho_{12}) + S(\rho_{23}). \tag{212}$$

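Let $\rho$ be a density matrix on $\mathbb{C}^d$. Consider then an orthonormal basis $\{e_j\}$ of eigenvectors of $\rho$

$$\rho\, e_j = r_j e_j, \quad\text{and so}\quad \rho = \sum_j r_j\, |e_j\rangle\langle e_j|. \tag{213}$$

Suppose that there are k eigenvalues of $\rho$ strictly positive. We then introduce a vector $\Omega \in \mathbb{C}^d \otimes \mathbb{C}^k$

$$\Omega := \sum_j \sqrt{r_j}\; e_j \otimes f_j \tag{214}$$

where the sum runs over the k indices with $r_j > 0$. Here $\{f_j\}$ is an orthonormal basis of $\mathbb{C}^k$. It is then easily checked that $\rho$ is the partial trace of $|\Omega\rangle\langle\Omega|$ over $\mathbb{C}^k$. The state vector $\Omega$ is called a purification of $\rho$. There are obviously many purifications because we can freely pick the orthonormal basis $\{f_j\}$.

Suppose now that we are given a two party state $\rho_{12}$. We can always purify this to a pure state $\rho_{123}$. Because marginals of a bipartite pure state have the same non-zero eigenvalues, and hence the same entropy, we have $S(\rho_{23}) = S(\rho_1)$, while $S(\rho_{123}) = 0$ because $\rho_{123}$ is pure. Strong subadditivity then yields

$$S(\rho_2) \le S(\rho_{12}) + S(\rho_1) \tag{215}$$

which is exactly (204).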

Existence of entropy of a shift-invariant state on $\otimes^{\mathbb{Z}} M_d$ can now be proved, more or less as in the classical case on the basis of subadditivity

$$s = \lim_{n \to \infty} \frac{1}{n}\, S(\rho^{(n)}). \tag{216}$$

Using strong subadditivity one can show that for shift-invariant states the entropy is monotonically increasing in the volume and that

$$0 \le S(\rho^{(n+2)}) - S(\rho^{(n+1)}) \le S(\rho^{(n+1)}) - S(\rho^{(n)}). \tag{217}$$

This suffices to prove that also in the quantum case

$$s = \lim_{n \to \infty} \bigl( S(\rho^{(n+1)}) - S(\rho^{(n)}) \bigr). \tag{218}$$

We conclude this short series of lectures with Schumacher's noiseless compression theorem for sources that emit pure quantum states. After encoding and decoding a sequence $\varphi_{i_1} \otimes \varphi_{i_2} \otimes \cdots \otimes \varphi_{i_n}$ of states emitted by the source we should end up with a state that is close to the original state for a subset of states that appears with high probability. In order to compare states, fidelity can be used. For two vector states $\varphi$ and $\psi$ fidelity is the ordinary transition probability:

$$\mathrm{Fid}(\varphi, \psi) := |\langle \varphi, \psi \rangle|^2. \tag{219}$$

A fidelity close to 1 means that the vectors are also close, up to a phase. This notion can be extended to arbitrary mixed states. The idea is to jointly purify both density matrices and then to use the transition probability for these purifications. More explicitly: let $\Omega_1$ and $\Omega_2$ be joint purifications of $\rho_1$ and $\rho_2$:

$$\rho_j = \mathrm{Tr}_2\, |\Omega_j\rangle\langle\Omega_j|, \quad j = 1, 2 \tag{220}$$

then

$$\mathrm{Fid}(\rho_1, \rho_2) := \max\, |\langle \Omega_1, \Omega_2 \rangle|^2 \tag{221}$$

where the maximum is taken over all joint purifications of $\rho_1$ and $\rho_2$. Uhlmann obtained an explicit expression for the mixed state fidelity

$$\mathrm{Fid}(\rho_1, \rho_2) = \Bigl( \mathrm{Tr} \sqrt{\sqrt{\rho_1}\, \rho_2\, \sqrt{\rho_1}} \Bigr)^2. \tag{222}$$
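For a single qubit the expression (222) can be evaluated without matrix square roots: a known closed form for 2 × 2 density matrices, which is not taken from these lectures, reads $\mathrm{Fid}(\rho_1, \rho_2) = \mathrm{Tr}\,\rho_1\rho_2 + 2\sqrt{\det\rho_1 \det\rho_2}$. The sketch below uses it to check consistency with the vector state definition (219); the function names and test states are illustrative assumptions.

```python
import math

def projector(phi):
    """|phi><phi| for a normalized vector phi in C^2."""
    return [[phi[r] * phi[c].conjugate() for c in range(2)] for r in range(2)]

def det2(m):
    return (m[0][0] * m[1][1] - m[0][1] * m[1][0]).real

def trace_product(a, b):
    return sum(a[i][j] * b[j][i] for i in range(2) for j in range(2)).real

def qubit_fidelity(rho, sigma):
    """Qubit closed form of (222): Tr(rho sigma) + 2 sqrt(det rho det sigma)."""
    dets = max(det2(rho), 0.0) * max(det2(sigma), 0.0)
    return trace_product(rho, sigma) + 2 * math.sqrt(dets)

theta = 0.3
phi = [1.0 + 0j, 0j]
psi = [complex(math.cos(theta)), complex(math.sin(theta))]
overlap_squared = abs(sum(x.conjugate() * y for x, y in zip(phi, psi))) ** 2   # (219)

mixed = [[0.7, 0.1], [0.1, 0.3]]   # hypothetical mixed state
print(qubit_fidelity(projector(phi), projector(psi)), overlap_squared,
      qubit_fidelity(mixed, mixed))
```

The closed form is restricted to dimension 2; in higher dimensions one has to compute the matrix square roots in (222) explicitly.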

Problem 50. Estimate the distance $\|\rho_1 - \rho_2\|$ in terms of the fidelity.


Problem 51. Compute the right hand side of (222) for the case where $\rho_1$ is pure.
Problem 52. Show for the case that $\rho_1$ is pure that the maximal transition probability between all joint purifications of $\rho_1$ and $\rho_2$ indeed agrees with the expression obtained in the previous problem.
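We now formulate a simple case of Schumacher's theorem
Theorem 8. Suppose that a memoryless source emits pure states $\varphi_j \in \mathbb{C}^d$ with probability $p_j$, let $\rho = \sum_j p_j\, |\varphi_j\rangle\langle\varphi_j|$ and $s = S(\rho)$. For any given $\epsilon, \delta > 0$ and n sufficiently large there exist quantum operations (coding maps) $C_n : L(\otimes^n \mathbb{C}^d) \to L(K_n)$ and decoding maps $D_n : L(K_n) \to L(\otimes^n \mathbb{C}^d)$ such that

$$\sum_{j_1, j_2, \ldots, j_n} p_{j_1} p_{j_2} \cdots p_{j_n}\, \mathrm{Fid}\Bigl( |\varphi_{j_1}\rangle\langle\varphi_{j_1}| \otimes \cdots \otimes |\varphi_{j_n}\rangle\langle\varphi_{j_n}|,\ D_n \circ C_n\bigl( |\varphi_{j_1}\rangle\langle\varphi_{j_1}| \otimes \cdots \otimes |\varphi_{j_n}\rangle\langle\varphi_{j_n}| \bigr) \Bigr) \ge 1 - \delta. \tag{223}$$

Here $K_n$ is a Hilbert space of dimension not exceeding $\exp(n(s + \epsilon))$ and $L(V)$ denotes the linear transformations of the complex vector space V. Moreover, there are no reliable coding and decoding maps in the sense of (223) if for all n the dimension of the code space $K_n$ is not exceeding $\exp(n(s - \epsilon))$.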
