Quantum cryptography – lecture notes

Lluís Masanes

l.masanes@ucl.ac.uk Office 3.13 at 66-72 Gower Street, London

October 13, 2022

Contents
1. Motivation 1

2. Classical Information Theory 1


2.1. Information-theoretic security . . . . . . . . . . . . . . . . . . . . . . . . . . 1
2.2. Shannon information theory: one variable . . . . . . . . . . . . . . . . . . . 2
2.3. Shannon information theory: two variables . . . . . . . . . . . . . . . . . . 4
2.4. Shannon information in quantum key distribution . . . . . . . . . . . . . . . 6
2.5. Problem 1: generating a secret key from partially secret correlations . . . . 7

3. State discrimination 7
3.1. General case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
3.2. Classical case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
3.3. Pure-state case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
3.4. Problem 2: distinguishability of states . . . . . . . . . . . . . . . . . . . . . 10

4. Quantum key distribution: the BB84 protocol 10


4.1. Description of the protocol . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
4.2. Informal security analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
4.3. Individual attacks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
4.4. Information gain vs disturbance in the conjugate basis . . . . . . . . . . . . 13
4.5. Secret key rate in the BB84 . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
4.6. Improved BB84 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

5. Quantum cloning machines 18


5.1. The universal cloning machine . . . . . . . . . . . . . . . . . . . . . . . . . 18
5.2. The phase-covariant cloning machine . . . . . . . . . . . . . . . . . . . . . . 18
5.3. Problem 3: fidelity of phase-covariant CM . . . . . . . . . . . . . . . . . . . 19
5.4. Asymmetric phase-covariant CM . . . . . . . . . . . . . . . . . . . . . . . . 19
5.5. Problem 4: fidelity trade-off in asymmetric CM . . . . . . . . . . . . . . . . 19
5.6. Individual attacks and cloning machines . . . . . . . . . . . . . . . . . . . . 20
5.7. The 6-state protocol and the universal cloning machine . . . . . . . . . . . . 21

6. Entanglement-based QKD 21
6.1. Equivalence of prepare+measure and entanglement-based QKD . . . . . . . 21
6.2. Problem 5: the singlet state . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
6.3. Description of the protocol . . . . . . . . . . . . . . . . . . . . . . . . . . . 22

6.4. Purification of a mixed state . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
6.5. Problem 6: purification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
6.6. Individual attacks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26

7. General attacks 29
7.1. Secret key rate of the noisy singlet . . . . . . . . . . . . . . . . . . . . . . . 30
7.2. Problem 7: General attacks . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
7.3. De-Finetti Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32

8. Non-local correlations 32
8.1. Classical, quantum and beyond . . . . . . . . . . . . . . . . . . . . . . . . . 33
8.2. Monogamy of non-local correlations . . . . . . . . . . . . . . . . . . . . . . 37
8.3. Monogamy and no-cloning . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
8.4. Problem 8: non-locality with only one observable . . . . . . . . . . . . . . . 39

9. Device-independent QKD 40
9.1. Simple example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
9.2. No-signaling QKD . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
9.3. Problem 9: alternative protocol for no-signaling QKD . . . . . . . . . . . . 42
9.4. Characterizing the set of quantum correlations . . . . . . . . . . . . . . . . 42
9.5. Problem 10: quantum correlations . . . . . . . . . . . . . . . . . . . . . . . 44

A. Optimal individual attack for the BB84 44


A.1. Information-vs-disturbance Theorem . . . . . . . . . . . . . . . . . . . . . . 44
A.2. Information gain in terms of mutual information . . . . . . . . . . . . . . . 47

1. Motivation
• The applications of cryptography are pervasive in our society, and their consequences
and regulations are constantly discussed in the media. In 2010, more than 85% of all
US companies experienced data breaches, costing the US economy around $100 billion.

• Cryptography encompasses many different tasks. Some examples are: non-repudiation,
authentication, confidentiality and secret communication. In this course we only
study the last one.

• Quantum key distribution (QKD) is a good example of how research on the foun-
dations of quantum theory led to practical applications that are now in the mar-
ketplace. In particular, the most modern and secure protocols for QKD, known
as device-independent QKD, are based on Bell inequalities, which were originally
designed to rule out the possibility of a local-classical description of quantum phe-
nomena.

• QKD is a perfect playground for practicing quantum mechanics and getting a deep
understanding of its counterintuitive and non-classical features.

2. Classical Information Theory
2.1. Information-theoretic security
A general model for secret communication using a shared secret key is the following.

Steps             Alice              Bob
0. Initial        x, k               x = ?, k
1. Encoding       y = Ek(x), x, k    k
2. Broadcasting   y = Ek(x), x, k    y, k
3. Decoding       y = Ek(x), x, k    x = Dk(y), y, k

where x ∈ X is the message (plaintext), k ∈ K the shared secret key and y ∈ Y the cypher-
text. The encoder E : K × X → Y and decoder D : K × Y → X satisfy Dk(Ek(x)) = x.
A protocol enjoys information-theoretic security if P(x|y) = P(x) for all y, so that the
cyphertext reveals nothing about the message. Shannon's Theorem states that information-theoretic
security implies |K| ≥ |Y| ≥ |X|. That is, the key has to be at least as long as the message,
which is quite expensive in practice.
An information-theoretically secure encryption method with minimal key length |K| =
|X| is the one-time pad. Let x ∈ {0,1}^N and k be uniformly distributed in {0,1}^N. The
encoder and decoder are the bit-wise xor (or sum modulo 2): y = Ek(x) = x ⊕ k and
Dk(y) = y ⊕ k. Using the fact that k ⊕ k = 0 for any k ∈ {0,1}^N, we see that
Dk(Ek(x)) = (x ⊕ k) ⊕ k = x. For example, if x = (0, 0, 0, 0, 1) and k = (1, 0, 0, 1, 1) then
Alice encodes y = x ⊕ k = (1, 0, 0, 1, 0), and Bob decodes y ⊕ k = (0, 0, 0, 0, 1), which is
equal to x.
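The xor arithmetic above can be sketched in a few lines of Python (an illustration; the function name is my own):

```python
# One-time pad: encryption and decryption are the same bit-wise xor,
# since (x ⊕ k) ⊕ k = x.
def xor_bits(a, b):
    return [ai ^ bi for ai, bi in zip(a, b)]

x = [0, 0, 0, 0, 1]          # plaintext, as in the example above
k = [1, 0, 0, 1, 1]          # uniformly random shared key
y = xor_bits(x, k)           # ciphertext: [1, 0, 0, 1, 0]
assert xor_bits(y, k) == x   # Bob recovers the plaintext with the same key
```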
An alternative (but not information-theoretically secure) method which uses a much
shorter key is public-key cryptography. This is currently widely used. Here, a pair of
keys is involved: one called the public key, which can be widely disseminated, and one
called the private key, which is known only to the intended recipient (Bob). Any person
can encrypt a message for Bob using the public key. However, this message can be
decrypted only with Bob's private key. For example, in RSA, the private key is a pair of
large prime numbers (a, b), and the public key is their product c = ab. Given c it
is believed to be computationally hard to obtain a and b. (This scheme is not explained
in detail in this course.) Hence, RSA (and other public-key cryptography methods) relies on: (i) the
limitations on the computational power of the adversary, and (ii) unproven mathematical
conjectures. This is not information-theoretic security, but computational security.
If we do not want to rely on these assumptions we need to go back to the one-time pad, and
distribute long secret keys between Alice and Bob efficiently. QKD solves this problem.

2.2. Shannon information theory: one variable

Shannon entropy. Let X be a random variable taking values x in the alphabet X =
{0, 1, . . . , d − 1} and following the probability distribution P(x). The Shannon entropy of
X is defined by

H(X) = Σ_{x=0}^{d−1} P(x) log2 [1/P(x)] .   (1)

In the case P(x) = 0, the function p log2(1/p) takes the value lim_{p→0} p log2(1/p) = 0.

I.i.d. A sequence of random variables X1, X2, . . . , XN is independent and identically
distributed (i.i.d.) if the joint probability is of the form

P(x1, x2, . . . , xN) = P(x1) P(x2) · · · P(xN) ,   (2)

where all distributions P(xi) are identical.

Typical sequences. The number of zeroes in a given sequence x = (x1, x2, . . . , xN)
is denoted by N0(x). More generally, the number of times that the value x ∈ X
appears in the sequence x = (x1, x2, . . . , xN) is denoted by Nx(x). We say that a sequence
x = (x1, x2, . . . , xN) is typical if it approximately contains N P(0) zeroes, N P(1) ones,
and so on:

Nx(x) ≈ N P(x) ,   (3)

for all x ∈ X. More precisely, the sequence x is typical if

|Nx(x) − N P(x)| ≤ 4√N ,   (4)

for all x ∈ X. If we sample a sequence, the probability of it being non-typical is smaller
than e^{−√N}. The total number of typical sequences ΩX(N) is approximately given by the
multinomial coefficient

ΩX(N) ≈ N! / ( [N P(0)]! [N P(1)]! · · · [N P(d−1)]! )   (5)–(6)

≈ exp{ N ln N − N − Σx [N P(x) ln(N P(x)) − N P(x)] }   (7)

= exp{ N ln N − Σx N P(x) [ln N + ln P(x)] }   (8)

= exp{ −N Σx P(x) ln P(x) } ,   (9)

where we have used Stirling's approximation ln N! ≈ N ln N − N. In summary, we obtain
the fundamental (approximate) equality

ΩX(N) ≈ 2^{N H(X)} .   (10)

An interesting property of the typical sequences is that they all have approximately the
same probability. Specifically, if x = (x1, x2, . . . , xN) is typical then

P(x) = P(x1, x2, . . . , xN) = P(x1) P(x2) · · · P(xN)
     = P(0)^{N0(x)} P(1)^{N1(x)} · · · P(d−1)^{Nd−1(x)}
     ≈ P(0)^{N P(0)} P(1)^{N P(1)} · · · P(d−1)^{N P(d−1)}
     = 2^{log2 [P(0)^{N P(0)} P(1)^{N P(1)} · · · P(d−1)^{N P(d−1)}]}
     = 2^{−N H(X)} .   (11)

The approximate identities (10) and (11) imply that the total probability of obtaining a
typical sequence is approximately one:

Σ_{x typical} P(x) ≈ ΩX(N) 2^{−N H(X)} ≈ 1 .   (12)

As mentioned above, the error made in this approximation is e^{−√N}.

Information compression. For this task, Alice is given a sequence of i.i.d. random
variables X1, X2, . . . , XN, and the goal is to encode the value of this sequence x =
(x1, x2, . . . , xN) into a shorter sequence of M bits r = (r1, r2, . . . , rM) ∈ {0,1}^M, and send
r to Bob, so that he can reconstruct (decode) the value of x from r with high success
probability. We want to know: what is the smallest value of M?

We know that with high probability the sequence x is typical, and that there are ΩX(N)
typical sequences. On the other hand, the size of the encoding alphabet is |{0,1}^M| = 2^M.
Correctly encoding x in r imposes the constraint ΩX(N) ≤ |{0,1}^M|; so the value of M
has to be the smallest integer compatible with this constraint. This is approximately

M ≈ N H(X) .   (13)

That is, the entropy H(X) is the number of bits required to encode each copy of X.

Randomness distillation. We can distill M perfect random bits from the i.i.d. sequence
X1, X2, . . . , XN if there is a map (X1, . . . , XN) −→ (K1, . . . , KM) ∈ {0,1}^M such that the
probability distribution of the M bits (K1, . . . , KM) is close to uniform. We want to
know: what is the largest value of M?

We know that with high probability the outcome of X1, . . . , XN is typical, and that
all typical sequences have approximately the same probability (11). Therefore, we can
construct a map that encodes each typical sequence x into a different M-bit string k =
(k1, . . . , kM). The requirement that k is uniform implies that ΩX(N) ≥ |{0,1}^M|. Then
M is the largest integer compatible with this constraint, which approximately is

M ≈ N H(X) .   (14)

That is, the entropy H(X) is the number of uniform random bits that can be distilled
from each copy of X.
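The counting argument above is easy to check numerically. The sketch below (the distribution P = (1/2, 1/4, 1/4) and the helper names are my own) compares the exact log2 of the multinomial coefficient (5)–(6) with N H(X), as predicted by (10):

```python
from math import lgamma, log, log2

def shannon_entropy(P):
    # H(X) = Σ_x P(x) log2(1/P(x)), with the convention 0·log2(1/0) = 0
    return sum(p * log2(1 / p) for p in P if p > 0)

def log2_multinomial(counts):
    # log2 of N!/(n_0!·n_1!···), computed via log-gamma to avoid huge factorials
    N = sum(counts)
    return (lgamma(N + 1) - sum(lgamma(n + 1) for n in counts)) / log(2)

P = (1/2, 1/4, 1/4)                  # assumed example distribution, H(X) = 1.5
N = 4000
counts = [round(N * p) for p in P]   # a sequence with exactly N·P(x) occurrences
# Eq. (10): ΩX(N) ≈ 2^{N H(X)}, i.e. (1/N) log2 ΩX(N) ≈ H(X)
print(log2_multinomial(counts) / N, shannon_entropy(P))   # both ≈ 1.5
```

The small residual gap comes from the subleading terms of Stirling's approximation, which vanish per copy as N grows.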

2.3. Shannon information theory: two variables


Consider an information source that, in each round, sends the random variable X to
Alice and Y to Bob, following the distribution P(x, y). This is repeated for N rounds, and each
round is independent and identically distributed, so that the joint probability distribution
is

P(x, y) = P(x1, . . . , xN, y1, . . . , yN) = P(x1, y1) P(x2, y2) · · · P(xN, yN) .   (15)

We can consider the pair (X, Y) to be a single random variable with alphabet X × Y.
Hence, the number of typical sequences (x, y) is given by

Ω(X,Y)(N) ≈ 2^{N H(X,Y)} ,   (16)

where naturally

H(X, Y) = Σ_{x,y} P(x, y) log2 [1/P(x, y)] .   (17)

Also, we know that each typical sequence (x, y) has probability

P(x, y) ≈ 2^{−N H(X,Y)} .   (18)

Sub-additivity of entropy. One can prove that

H(X) ≤ H(X, Y) ≤ H(X) + H(Y) .   (19)

Conditional probability. As mentioned above, a sequence (x, y) is typical if N(x,y)(x, y) ≈
N P(x, y). Now, let us show that, when (x, y) is typical, then the sequence x is typical
(with respect to the marginal distribution P(x) = Σ_y P(x, y)) too. This is proven as

Nx(x) = Σ_y N(x,y)(x, y) ≈ Σ_y N P(x, y) = N P(x) .   (20)

Analogously, y is typical (with respect to the marginal P(y) = Σ_x P(x, y)).
Using (11) and (18) we show that all typical sequences (x, y) have approximately the
same conditional probability

P(y|x) = P(x, y)/P(x) ≈ 2^{−N H(X,Y)} / 2^{−N H(X)} = 2^{−N H(Y|X)} ,   (21)

where we define the conditional entropy as

H(Y|X) = H(X, Y) − H(X) .   (22)

Analogously we have

P(x|y) ≈ 2^{−N H(X|Y)} .   (23)

Note that (19) implies

0 ≤ H(X|Y) ≤ H(X) .   (24)

Joint typicality. Let us consider a fixed typical sequence y, and define ΩX|y(N) as the
number of sequences x such that (x, y) is typical. Using (23) we obtain

1 = Σ_x P(x|y) ≥ Σ_{x : (x,y) typical} P(x|y) ≈ ΩX|y(N) 2^{−N H(X|Y)} ,   (25)

which implies

ΩX|y(N) ≤ 2^{N H(X|Y)} .   (26)

Combining this inequality with (16) gives

2^{N H(X,Y)} ≈ Ω(X,Y)(N) = Σ_{y typical} ΩX|y(N)
            ≤ Σ_{y typical} 2^{N H(X|Y)} ≈ 2^{N H(Y)} 2^{N H(X|Y)} = 2^{N H(X,Y)} ,   (27)

which implies that the inequality (26) is actually an approximate equality:

ΩX|y(N) ≈ 2^{N H(X|Y)} .   (28)

Note that ΩX|y(N) is (approximately) independent of y, and that it is the inverse of the
conditional probability (23).

Two-variable summary. If (x, y) is typical then x and y are typical too, and

P(x|y) ≈ 1/ΩX|y(N) ≈ 2^{−N H(X|Y)} ,   (29)
P(y|x) ≈ 1/ΩY|x(N) ≈ 2^{−N H(Y|X)} .   (30)
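The approximation (23) can also be checked by sampling. In the sketch below (the joint distribution and the names are my own inventions), the empirical value of −(1/N) log2 P(x|y) along a sampled, with high probability typical, sequence concentrates around H(X|Y):

```python
import random
from math import log2

random.seed(0)
# assumed joint distribution P(x, y) on {0,1} × {0,1}
P = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
Py = {y: sum(P[x, y] for x in (0, 1)) for y in (0, 1)}

def cond_entropy():
    # H(X|Y) = H(X,Y) − H(Y), Eq. (22) with the roles of X and Y exchanged
    H_joint = sum(p * log2(1 / p) for p in P.values())
    H_y = sum(p * log2(1 / p) for p in Py.values())
    return H_joint - H_y

N = 100_000
pairs = random.choices(list(P), weights=list(P.values()), k=N)
# −(1/N) log2 P(x|y) for the sampled sequence, cf. Eq. (23)
emp = -sum(log2(P[x, y] / Py[y]) for x, y in pairs) / N
print(emp, cond_entropy())   # both ≈ 0.722
```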

Error correction is the generalisation of “information compression” to the case where
the receiver has partial information about the data. Suppose that Alice has (X1 , . . . , XN )
and Bob has (Y1 , . . . , YN ), following the i.i.d. distribution P (xi , yi ). That is, in round
i, the variables xi , yi are correlated according to P (xi , yi ), but the variables in different
rounds are independent. For this task we understand Bob's data y = (y1, . . . , yN) as a
degraded version of Alice's data x = (x1, . . . , xN). Then, he needs to correct the errors in
y to obtain x. For this, Alice sends M bits r = (r1, . . . , rM) ∈ {0,1}^M to Bob, so that
he can reconstruct x (with high probability) by using the message r and his data y. We
want to know what is the smallest value of M.
The knowledge of y tells Bob that (with high probability) Alice's data x is such that
(x, y) is typical. Therefore, Alice only needs to help Bob discriminate among the
sequences x that are jointly typical with Bob's sequence y, and there are only ΩX|y(N) ≈
2^{N H(X|Y)} of those (28). This implies that Alice needs to send at least
M ≈ N H(X|Y)   (31)
bits. This number of bits is also sufficient for the task to succeed with high probability,
but we do not prove this here. In summary, the conditional entropy H(X|Y) is the
number of bits that we need to encode X once we know Y.

Privacy amplification is the generalisation of randomness distillation to the case where an adversary
has partial information. Suppose Alice holds (X1, . . . , XN), which is correlated with Eve's
data (Z1, . . . , ZN), following an i.i.d. distribution given by P(xi, zi). That is, in round i, the
variables xi, zi are correlated according to P(xi, zi), but the variables in different rounds are
independent. Alice wants to generate a secret key (K1, . . . , KM) ∈ {0,1}^M by applying
a map (X1, . . . , XN) −→ (K1, . . . , KM) such that the distribution of (K1, . . . , KM) is
uniform and uncorrelated with (Z1, . . . , ZN). That is,
P(k|z) ≈ 2^{−M} ,   (32)
for all k = (k1, . . . , kM) and all typical z = (z1, . . . , zN). This condition guarantees that
Eve's information z is useless for guessing the key k.
The data generated by the source (x, z) = (x1, . . . , xN, z1, . . . , zN) is typical with high
probability. In this case, for each value of z there are ΩX|z(N) ≈ 2^{N H(X|Z)} sequences
x, all having approximately the same probability (28). By suitably encoding each of these sequences x
into an M-bit string k, Alice can generate a secret key of which Eve has no knowledge.
The length of this key is
M ≈ N H(X|Z) .   (33)
In summary, the conditional entropy H(X|Z) is the ignorance about X of an agent
holding Z.

2.4. Shannon information in quantum key distribution


After the measurement stage of a QKD protocol, Alice, Bob and Eve end up with the
sequences x = (x1, . . . , xN), y = (y1, . . . , yN) and z = (z1, . . . , zN), respectively. We
assume that these are i.i.d. according to P(xi, yi, zi). This data would constitute a perfect
secret key if xi and yi were identical, uniformly distributed and uncorrelated from zi, as in

P_key(xi, yi, zi) = (1/d) δ_{xi,yi} P(zi) .   (34)

But in general this is not the case: P(xi, yi, zi) ≠ P_key(xi, yi, zi). In what follows we
describe a classical post-processing of the above data which allows Alice and Bob to
generate a secret key k = (k1, . . . , kM) with high probability. We will also calculate the
length M of the secret key. This task can be accomplished by the following two steps:

1. Error correction. Alice broadcasts Mec ≈ N H(X|Y) bits of information r about
x, such that Bob can reconstruct x by using (y, r). However, after this process, Eve's
information has increased from z to (z, r), and her ignorance about x has decreased
from N H(X|Z) to

H(X|Z, R) = H(X|Z) − H(R|Z)   (35)
          ≥ N H(X|Z) − Mec = N [H(X|Z) − H(X|Y)] ,   (36)

because H(R|Z) ≤ H(R) ≤ Mec.

2. Privacy amplification. Alice and Bob, who now share the same raw key x, apply
the same privacy-amplification map to x, obtaining a shorter string k of length

Mpa ≈ N [H(X|Z) − H(X|Y)] .   (37)

This Mpa-bit string k is uniformly distributed and uncorrelated with (z, r); hence, it
is a perfect secret key.

The asymptotic efficiency rate R = Mpa/N in the limit N → ∞ is then

R = H(X|Z) − H(X|Y) = I(X : Y) − I(X : Z) ,   (38)

where I(X : Y) = H(X) + H(Y) − H(X, Y) is the mutual information.
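The rate formula (38) can be evaluated mechanically for any distribution P(x, y, z). A sketch (the helper names are my own), checked on the perfect-key distribution (34) with d = 2, where the rate should be one secret bit per round:

```python
from math import log2
from collections import defaultdict

def H(dist):
    # Shannon entropy of a distribution given as {outcome: probability}
    return sum(p * log2(1 / p) for p in dist.values() if p > 0)

def marginal(P, idx):
    # marginal distribution over the coordinates listed in idx
    out = defaultdict(float)
    for outcome, p in P.items():
        out[tuple(outcome[i] for i in idx)] += p
    return out

def key_rate(P):
    # R = H(X|Z) − H(X|Y), Eq. (38), with H(X|Y) = H(X,Y) − H(Y), Eq. (22)
    Hxy = H(marginal(P, (0, 1)))
    Hy = H(marginal(P, (1,)))
    Hxz = H(marginal(P, (0, 2)))
    Hz = H(marginal(P, (2,)))
    return (Hxz - Hz) - (Hxy - Hy)

# perfect-key example, Eq. (34) with d = 2: x = y uniform, z independent
P_key = {(0, 0, 0): 1/4, (0, 0, 1): 1/4, (1, 1, 0): 1/4, (1, 1, 1): 1/4}
print(key_rate(P_key))   # → 1.0, one secret bit per round
```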

2.5. Problem 1: generating a secret key from partially secret correlations


Let us consider the situation where Alice, Bob and Eve share N copies of the distribution
P(x, y, z) given by the following table (each cell lists the values of z together with their
probabilities):

                 y = 0            y = 1
    x = 0    z = 1 : 1/8      z = 2 : 2/8
             z = 2 : 2/8
    x = 1                     z = 3 : 2/8
                              z = 1 : 1/8
                                                (39)

where we assume that N is large.

A. Calculate the secret key rate

R→ = H(X|Z) − H(X|Y ) . (40)

B. Consider the possibility that Alice and Bob exchange their roles, that is, consider
the case where y is the raw key and it is Alice who corrects the errors. This amounts
to using the rate formula

R← = H(Y|Z) − H(Y|X) ,   (41)

instead of R→. What is the length of the message in the error-correction phase?
What is the length of the final key?

3. State discrimination
An essential ingredient for the analysis of QKD protocols is state discrimination. In this
section we analyse the optimal strategy to discriminate two quantum states.

3.1. General case
Suppose that we are given one of the two states ρ0 and ρ1, with prior probabilities p0 and
p1 respectively, and we want to guess which one it is with minimal error probability. The most general
protocol consists of: (i) performing a measurement {B1, . . . , Bn} satisfying Bi ≥ 0 and
Σi Bi = 𝟙, and (ii) computing the guess ρj (j = 0, 1) from the outcome i = 1, . . . , n.
Any such computation is characterised by a conditional probability distribution q(j|i),
which gives the probability of output j when the input is i.

The probability of a wrong guess is

ε = p0 Σi P(Bi|ρ0) q(1|i) + p1 Σi P(Bi|ρ1) q(0|i)
  = p0 Σi tr(Bi ρ0) q(1|i) + p1 Σi tr(Bi ρ1) q(0|i)
  = p0 tr(A1 ρ0) + p1 tr(A0 ρ1) ,

where we have defined Aj = Σi q(j|i) Bi. Note that {A0, A1} is a measurement, since
A0, A1 ≥ 0 and

A0 + A1 = Σi [q(0|i) + q(1|i)] Bi = Σi Bi = 𝟙 .   (42)

Also note that the protocol measuring Aj is as good as the one measuring Bi. Hence, a
two-outcome measurement suffices, with outcome A0 for "guess ρ0" and A1 for
"guess ρ1". The probability of a wrong guess is

ε = p0 tr(A1 ρ0) + p1 tr(A0 ρ1)
  = p0 tr(A1 ρ0) + p1 tr([𝟙 − A1] ρ1)
  = p1 + tr(A1 M) ,   (43)

where we have used A0 + A1 = 𝟙 and defined the Hermitian matrix M = p0 ρ0 − p1 ρ1.
Let M = Σk λk |φk⟩⟨φk| be the spectral decomposition of M; then

ε = p1 + Σk λk ⟨φk|A1|φk⟩ .   (44)

The matrix A1, restricted to 0 ≤ A1 ≤ 𝟙, which minimises the error ε is the projector
onto the subspace of negative eigenvalues of M. That is,

A1 = Σ_{k : λk<0} |φk⟩⟨φk| ,   (45)

which gives the error

ε = p1 + Σ_{k : λk<0} λk .   (46)


The 1-norm (or trace-norm) of a matrix M is defined as ‖M‖₁ = tr √(M†M), and for
Hermitian matrices it can be written as

‖M‖₁ = Σk |λk| = Σ_{k : λk≥0} λk − Σ_{k : λk<0} λk .   (47)

Using the normalization of the states we get

p0 − p1 = tr M = Σ_{k : λk≥0} λk + Σ_{k : λk<0} λk .   (48)

Subtracting the above two equations we obtain

p0 − p1 − ‖M‖₁ = 2 Σ_{k : λk<0} λk .   (49)

Hence, we can write the error (46) as

ε = p1 + (1/2)(p0 − p1 − ‖M‖₁) = 1/2 − (1/2)‖M‖₁ .   (50)

In summary, the error made when discriminating ρ0 and ρ1 is

ε = 1/2 − (1/2)‖p0 ρ0 − p1 ρ1‖₁ .   (51)

This formula illustrates that the trace-norm (or the 1-norm) has a physical meaning in
terms of distinguishability. Most (mathematical) norms do not have a concrete physical
meaning.
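Formula (51) is straightforward to evaluate numerically; for a Hermitian matrix, the 1-norm is the sum of the absolute values of its eigenvalues, by (47). A sketch using NumPy (the example states below are my own):

```python
import numpy as np

def discrimination_error(rho0, rho1, p0=0.5, p1=0.5):
    # ε = 1/2 − (1/2)‖p0 ρ0 − p1 ρ1‖₁, Eq. (51); for Hermitian M the
    # trace-norm is the sum of the absolute values of the eigenvalues, Eq. (47)
    M = p0 * rho0 - p1 * rho1
    trace_norm = np.sum(np.abs(np.linalg.eigvalsh(M)))
    return 0.5 - 0.5 * trace_norm

# orthogonal states are perfectly distinguishable: ε = 0
rho0 = np.diag([1.0, 0.0])
rho1 = np.diag([0.0, 1.0])
print(discrimination_error(rho0, rho1))   # → 0.0

# identical states are indistinguishable: ε = 1/2 (random guessing)
print(discrimination_error(rho0, rho0))   # → 0.5
```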

3.2. Classical case


Obviously, the above result includes the classical case, where the two states are diagonal
in a fixed basis: ρ0 = Σx P(x)|x⟩⟨x| and ρ1 = Σx Q(x)|x⟩⟨x|. Here, the 1-norm is called
the statistical distance of the two probability distributions P(x) and Q(x):

‖ρ0 − ρ1‖₁ = Σx |P(x) − Q(x)| .   (52)

Note that the optimal measurement does something very intuitive: when the measurement
outcome is x it guesses P if P(x) > Q(x) and it guesses Q otherwise.

3.3. Pure-state case


Now, let us consider the case where the two states are pure, ρ0 = |ψ0⟩⟨ψ0| and ρ1 = |ψ1⟩⟨ψ1|,
and the prior probabilities are p0 = p1 = 1/2. Let |ψ0⊥⟩ be a vector in the subspace spanned
by {|ψ0⟩, |ψ1⟩} that is orthogonal to |ψ0⟩. Hence, without loss of generality, we can write
|ψ1⟩ = α|ψ0⟩ + β|ψ0⊥⟩ with |α|² + |β|² = 1. By re-defining |ψ0⊥⟩ we can absorb a complex
phase such that β > 0. In the basis {|ψ0⟩, |ψ0⊥⟩} we can write

ρ0 − ρ1 = [ 1   0 ]  −  [ |α|²  αβ ]  =  [  β²   −αβ ] ,   (53)
          [ 0   0 ]     [ α*β   β² ]     [ −α*β  −β² ]

which has determinant

det [  β²   −αβ ]  =  −β⁴ − |α|²β²  =  −β² .   (54)
    [ −α*β  −β² ]

Let the eigenvalues of the Hermitian matrix ρ0 − ρ1 be {λ1, λ2}. Using the values of the
trace,

λ1 + λ2 = tr[ρ0 − ρ1] = tr ρ0 − tr ρ1 = 1 − 1 = 0 ,   (55)

and the determinant,

λ1 λ2 = det[ρ0 − ρ1] = −β² ,   (56)

we conclude that λ1 = −λ2 = β. This allows us to obtain the optimal error probability for
distinguishing two pure states:

ε = 1/2 − (1/4)(|λ1| + |λ2|) = 1/2 − β/2 = 1/2 − (1/2)√(1 − |⟨ψ0|ψ1⟩|²) ,   (57)

where we have used that α = ⟨ψ0|ψ1⟩.
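As a consistency check, formula (57) can be compared with the general result (51) on a pair of pure qubit states (the angle and the names below are my own assumptions):

```python
import numpy as np

def helstrom_error(rho0, rho1):
    # ε = 1/2 − (1/2)‖(ρ0 − ρ1)/2‖₁ for equal priors, Eq. (51)
    lam = np.linalg.eigvalsh(0.5 * rho0 - 0.5 * rho1)
    return 0.5 - 0.5 * np.sum(np.abs(lam))

theta = 0.3                                      # assumed angle between the states
psi0 = np.array([1.0, 0.0])
psi1 = np.array([np.cos(theta), np.sin(theta)])  # ⟨ψ0|ψ1⟩ = cos θ
rho0 = np.outer(psi0, psi0)
rho1 = np.outer(psi1, psi1)

overlap = abs(psi0 @ psi1) ** 2                  # |⟨ψ0|ψ1⟩|²
eps_formula = 0.5 - 0.5 * np.sqrt(1 - overlap)   # Eq. (57)
assert abs(helstrom_error(rho0, rho1) - eps_formula) < 1e-12
```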

3.4. Problem 2: distinguishability of states


Suppose that you are given one of the two states

    ρ1 = [  1/4    0    i/4 ]          ρ2 = [ 1/2  1/2   0 ]
         [   0    1/2    0  ] ,             [ 1/2  1/2   0 ]        (58)
         [ −i/4    0    1/4 ]               [  0    0    0 ]

with prior probabilities p1 = p2 = 1/2.

A. What is the optimal probability of successfully distinguishing them?

B. Which measurement has to be performed in order to achieve this probability? Are


A0 , A1 projectors?

C. How does the success probability change if the prior probabilities are p1 = 1/4 and
p2 = 3/4 instead?

4. Quantum key distribution: the BB84 protocol


4.1. Description of the protocol
In what follows we use the bases Z = {|0⟩, |1⟩} and X = {|+⟩, |−⟩}, where we write
|±⟩ = (|0⟩ ± |1⟩)/√2. We also use the notation

|Z(0)⟩ = |0⟩,  |Z(1)⟩ = |1⟩,  |X(0)⟩ = |+⟩,  |X(1)⟩ = |−⟩,   (59)

and define the binary entropy as

h(p) = −p log2 p − (1 − p) log2(1 − p) .   (60)

All random choices are done with uniform distribution.

Distribution phase. In each round i ∈ {1, 2, . . . , N} the honest parties perform the fol-
lowing two steps:
1. Alice randomly chooses a basis ui ∈ {Z, X}, prepares a random state of that
basis |ui(xi)⟩, xi ∈ {0, 1}, and sends it to Bob.
2. Bob randomly chooses a basis vi ∈ {Z, X}, and measures the received state
|ui(xi)⟩ in the chosen basis, obtaining the outcome |vi(yi)⟩ with yi ∈ {0, 1}.

Basis-reconciliation phase. Alice and Bob publish (ui, vi) for all i ∈ {1, . . . , N}, and
construct a new sequence (xj, yj) with j ∈ {1, . . . , Nrec}, containing only the pairs
with compatible bases ui = vi. We have that Nrec ≈ N/2. (Note that here, Eve
learns the preparation basis of all qubits.)

Estimation phase. Alice and Bob select a random subset S ⊂ {1, . . . , Nrec} of size |S| =
⌈√N⌉, publish the pairs (xj, yj) in the subset j ∈ S, and compute the relative
frequency of errors

D = |{j ∈ S : xj ≠ yj}| / |S| ,   (61)

also known as the disturbance. The raw key (xk, yk) with k ∈ {1, . . . , Nraw} is
obtained after discarding the items in S. We have that Nraw ≈ N/2 − √N.

Error-correction phase. Alice calculates the number

M = ⌈Nraw h(D)⌉ ,   (62)

generates a random hash function f : {0,1}^Nraw → {0,1}^M, and publishes f and
f(x1, . . . , xNraw). Bob uses the information (y1, . . . , yNraw), f and f(x1, . . . , xNraw)
to reconstruct (x1, . . . , xNraw). (Note that here, Eve learns substantial information
about the raw key. In subsection 4.4 we show that H(X|Y) = h(D).)
Privacy-amplification phase. Alice calculates the number

Nkey = ⌊ Nraw h(1/2 − √(D(1 − D))) ⌋ − M ,   (63)

generates a random hash function g : {0,1}^Nraw → {0,1}^Nkey and publishes it.
Both Alice and Bob generate the joint secret key by computing (k1, . . . , kNkey) =
g(x1, . . . , xNraw). (In subsection 4.4 we obtain H(X|Z) = h[1/2 − √(D(1 − D))], where
Z is the best guess Eve can make on X. Shannon theory tells us that Eve knows
nothing about this secret key.)

The efficiency rate is defined as the number of generated secret bits Nkey divided by the
number N of uses of the quantum channel. The efficiency rate of the BB84 is

R = Nkey/N = (1/2) [ h(1/2 − √(D(1 − D))) − h(D) ] + O(N^{−1/2}) .   (64)

The BB84 protocol is summarised in the following figure. (Recall the difference between
polarisation and Bloch-sphere directions: Z for vertical and horizontal, and X for both
diagonals.)
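The distribution and basis-reconciliation phases, together with the rate (64), can be sketched in a toy simulation (the helper names are my own; a noiseless channel with no eavesdropper is assumed, so D = 0):

```python
import random
from math import log2, sqrt

random.seed(1)

def h(p):
    # binary entropy, Eq. (60), with h(0) = h(1) = 0
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

def bb84_rate(D):
    # asymptotic efficiency rate of the BB84, Eq. (64)
    return 0.5 * (h(0.5 - sqrt(D * (1 - D))) - h(D))

N = 10_000
alice = [(random.choice("ZX"), random.randint(0, 1)) for _ in range(N)]
bob_bases = [random.choice("ZX") for _ in range(N)]
# Bob's outcome is deterministic when bases match, uniformly random otherwise
bob = [x if u == v else random.randint(0, 1)
       for (u, x), v in zip(alice, bob_bases)]
# basis reconciliation: keep only the rounds with compatible bases
sifted = [(x, y) for (u, x), v, y in zip(alice, bob_bases, bob) if u == v]
D = sum(x != y for x, y in sifted) / len(sifted)
print(len(sifted) / N, D, bb84_rate(D))       # ≈ 0.5, 0.0, 0.5
# From (64), the rate vanishes at D = (2 − √2)/4 ≈ 0.1464,
# where 1/2 − √(D(1−D)) = D
print(bb84_rate((2 - sqrt(2)) / 4))           # ≈ 0
```

Solving h(1/2 − √(D(1 − D))) = h(D) shows that the rate (64) vanishes at D = (2 − √2)/4 ≈ 14.6%; at larger disturbances the formula yields no positive key rate.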

4.2. Informal security analysis


If Eve intercepts and measures one photon, there is probability 1/2 that she does it in a
different basis than the one prepared by Alice. Hence, when Bob measures this intercepted
photon in the same basis as Alice, half of the time they will obtain different results.
This would not be the case if there was no eavesdropper, so they will become suspicious.
Next, let us add some remarks.

• In practice, even if there is no eavesdropper, we expect errors (D > 0), because
channels and measurement apparatuses are usually imperfect. However, in QKD
we must assume the worst case: the channel is perfect and all errors come from the
action of an eavesdropper. Measuring D allows Alice and Bob to assess the amount of
information that Eve has about the generated key (in a worst-case scenario). Then,
by using privacy amplification, they can be sure that the final key is secure, regard-
less of whether the errors come from the apparatuses or from an actual adversary.
• In the BB84, a key barrier for Eve is that she does not know in which basis Alice
prepares the photon. But perhaps, instead of measuring this intercepted photon,
she could make it interact with an additional system (ancilla) and send the original
photon to Bob. Then, she could measure the ancilla once Alice publishes the prepa-
ration bases. This is actually a better attack than measuring without knowing the
basis, and it is analysed in the next subsection.
• Proving that a protocol is secure against all possible attacks that we can think of
does not constitute a security proof. For this, we will have to formalise the most
general strategy that an adversary could perform, and prove security against it.

4.3. Individual attacks


Individual attacks are those in which Eve addresses each qubit independently, and in the
same way. In this subsection we identify the most general individual attack (also known as
incoherent attack); in the next subsection we construct the optimal individual attack for
the BB84 protocol; and the proof of optimality is fully detailed in Appendix A. Coherent
attacks are more general and more powerful than individual attacks, and are studied in
Section 7 (General attacks).
Let us formalise the most general individual attack. In a given round of the protocol,
Alice sends qubit B to Bob, in one of the states |a〉B with a ∈ {0, 1, +, −}. Eve’s attack
is the following:
1. Intercept qubit B.
2. Engineer an interaction between systems B and E.
3. Send system B to Bob and keep E (which might be entangled to B).
4. Once she has all the information that the honest parties publish, including the basis
in which B is prepared, choose the optimal measurement for E and perform it.
Without loss of generality, system E is initially in a fixed pure state |φ⟩E and the interac-
tion is a unitary U, since any mixed state and non-unitary interaction can be simulated from
a unitarily evolving pure state by discarding part of the system E after the interaction. We
write this general interaction as

U(|φ⟩E ⊗ |a⟩B) = |Φa⟩EB .   (65)
Before the measurement, all of Eve’s quantum information is her reduced state
ρa = trB |Φa 〉EB 〈Φa | . (66)
This is summarised in the following picture:

    |a⟩B −→ ┐         ┌−→ system B, for Bob
            U = |Φa⟩BE                          (67)
    |φ⟩E −→ ┘         └−→ ρa, for Eve

4.4. Information gain vs disturbance in the conjugate basis
In this subsection we address the following question: what is the maximal amount of
information that an eavesdropper can have when disturbing the channel by an amount
D? We construct an explicit attack, which is proven to be optimal in Appendix A. To
simplify our analysis, we assume that Eve's interaction has the same symmetries as the protocol, that is, it is invariant under the exchanges 0 ↔ 1 and Z ↔ X.
When Alice prepares a ∈ {0, 1}, equation (65) can be written as

U(|φ〉 ⊗ |0〉) = γ_00 |E_00〉 ⊗ |0〉 + γ_01 |E_01〉 ⊗ |1〉 ,   (68)
U(|φ〉 ⊗ |1〉) = γ_10 |E_10〉 ⊗ |0〉 + γ_11 |E_11〉 ⊗ |1〉 ,   (69)

where all kets are normalised and the unknown coefficients γ_ab are real. Imposing that unitary transformations preserve the norm we have γ_a0² + γ_a1² = 1. Imposing the 0 ↔ 1 symmetry we get γ_00 = γ_11 and γ_01 = γ_10. The disturbance D is the probability that Bob obtains an error, which in this case is D = γ_01². The fidelity of the channel is F = 1 − D, so we can write

U(|φ〉 ⊗ |0〉) = √F |E_00〉 ⊗ |0〉 + √D |E_01〉 ⊗ |1〉 ,   (70)
U(|φ〉 ⊗ |1〉) = √D |E_10〉 ⊗ |0〉 + √F |E_11〉 ⊗ |1〉 .   (71)

Using the 0 ↔ 1 symmetry again, we define the two parameters

α = 〈E_00|E_11〉 = 〈E_11|E_00〉 ∈ ℝ ,   (72)
β = 〈E_01|E_10〉 = 〈E_10|E_01〉 ∈ ℝ .   (73)

Eve's system is a ℂ⁴, hence the unitary U acts on ℂ² ⊗ ℂ⁴ and has (2 × 4)² = 64 independent parameters! By imposing symmetries, optimality and some tricks below, U becomes characterised by the single parameter D.
By linearity we have

U(|φ〉 ⊗ |±〉) = (1/√2) U(|φ〉 ⊗ (|0〉 ± |1〉))
  = (1/√2)[√F|E_00〉 ± √D|E_10〉] ⊗ |0〉 + (1/√2)[√D|E_01〉 ± √F|E_11〉] ⊗ |1〉
  = [√F|E_00〉 ± √D|E_10〉] ⊗ (|+〉 + |−〉)/2 + [√D|E_01〉 ± √F|E_11〉] ⊗ (|+〉 − |−〉)/2
  = (1/2)[√F|E_00〉 ± √D|E_10〉 + √D|E_01〉 ± √F|E_11〉] ⊗ |+〉
  + (1/2)[√F|E_00〉 ± √D|E_10〉 − √D|E_01〉 ∓ √F|E_11〉] ⊗ |−〉 .   (74)
2
Using the Z ↔ X symmetry we know that there must be an analogue of (70) for the X basis

U(|φ〉 ⊗ |+〉) = √F |E_++〉 ⊗ |+〉 + √D |E_+−〉 ⊗ |−〉 ,   (75)
U(|φ〉 ⊗ |−〉) = √D |E_−+〉 ⊗ |+〉 + √F |E_−−〉 ⊗ |−〉 ,   (76)

where the four vectors |E_±±〉 are normalised and satisfy identities analogous to (72) and (73). Comparing (74) with (75) and (76) we obtain

√F |E_++〉 = (1/2)[√F|E_00〉 + √D|E_10〉 + √D|E_01〉 + √F|E_11〉] ,   (77)
√F |E_−−〉 = (1/2)[√F|E_00〉 − √D|E_10〉 − √D|E_01〉 + √F|E_11〉] .   (78)
The normalisation of these two vectors implies

√F = (1/2)‖ √F|E_00〉 + √D|E_10〉 + √D|E_01〉 + √F|E_11〉 ‖
   = (1/2)‖ √F|E_00〉 − √D|E_10〉 − √D|E_01〉 + √F|E_11〉 ‖ .   (79)

Using (72) and (73) the above can be written as

4F = [√F〈E_00| + √D〈E_10| + √D〈E_01| + √F〈E_11|][√F|E_00〉 + √D|E_10〉 + √D|E_01〉 + √F|E_11〉]
   = 2F + 2D + 2Fα + 2Dβ + 2√(DF) Re[〈E_00|E_10〉 + 〈E_01|E_11〉 + 〈E_00|E_01〉 + 〈E_10|E_11〉]
   = 2 + 2Fα + 2Dβ + Ω ,
4F = [√F〈E_00| − √D〈E_10| − √D〈E_01| + √F〈E_11|][√F|E_00〉 − √D|E_10〉 − √D|E_01〉 + √F|E_11〉]
   = 2 + 2Fα + 2Dβ − Ω ,

which implies Ω = 0 and

4F = 2 + 2Fα + 2Dβ .   (80)
This relation between D, F, α, β is a consequence of unitarity (i.e. quantum mechanics)
and it plays an important role below.
Once Eve knows that the qubit prepared by Alice is in the Z basis a ∈ {0, 1}, she has
to optimally distinguish the two states:
ρ0 = F |E00 〉〈E00 | + D|E01 〉〈E01 | , (81)
ρ1 = F |E11 〉〈E11 | + D|E10 〉〈E10 | , (82)
corresponding to Alice sending 0 or 1. Since we are assuming that Eve's attack has Z ↔ X symmetry, it is not necessary to analyse the case where Alice prepares the qubit in the X basis: the information gained by Eve is the same.
The error ε_E made when optimally discriminating ρ_0 and ρ_1 is given by formula (51). But we cannot use this formula, since we don't know the exact form of the vectors |E_ab〉; the only thing we know is the relation (80). However, a lower bound on ε_E is enough for Alice and Bob. Hence, we can instead solve the following simpler problem. Suppose that Eve has to guess x in the imaginary situation where she is told whether x = y or x ≠ y. That is, with probability F Eve knows that she has either |E_00〉〈E_00| or |E_11〉〈E_11|, and with probability D she knows that she has either |E_01〉〈E_01| or |E_10〉〈E_10|. Clearly, this situation is not worse than the original one, since Eve can always ignore the extra information. Then, using the error formula for pure states (57) we obtain

ε_E ≥ (F/2)[1 − √(1 − |〈E_00|E_11〉|²)] + (D/2)[1 − √(1 − |〈E_01|E_10〉|²)]
    = 1/2 − (F/2)√(1 − α²) − (D/2)√(1 − β²) .   (83)
Next, we minimise (83) with respect to α, β given the constraint (80). We can do this with the Lagrange multipliers method: differentiating

1/2 − (F/2)√(1 − α²) − (D/2)√(1 − β²) − µ(2 + 2Fα + 2Dβ − 4F)   (84)

with respect to α and β and equating to zero gives

αF/(2√(1 − α²)) − 2µF = 0 ,   (85)
βD/(2√(1 − β²)) − 2µD = 0 ,   (86)

which implies α = β. Updating constraint (80) with this fact gives

2F = 1 + α ,   α = 1 − 2D .   (87)

Substituting this in (83) we obtain the famous information/disturbance trade-off

ε_E ≥ 1/2 − (1/2)√(1 − α²) = 1/2 − √(D(1 − D)) ,   (88)
which can be plotted as:

In Mathematica: Plot[1/2 - Sqrt[e (1 - e)], {e, 0, 0.5}]

[Plot of the bound (88): ε_E decreases from 1/2 at D = 0 to 0 at D = 1/2.]
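The constrained minimisation above can be verified by brute force. This sketch (helper name `min_error` is ours) scans α on a grid, fixes β from the unitarity constraint (80), and compares the resulting minimum of (83) with the closed form 1/2 − √(D(1−D)):

```python
import math

# Numerical check of (88): minimise the RHS of (83) over alpha, beta subject
# to the unitarity constraint (80), and compare with 1/2 - sqrt(D(1-D)).

def min_error(D, steps=200000):
    F = 1 - D
    best = 0.5
    for i in range(steps + 1):
        a = -1 + 2 * i / steps                 # alpha in [-1, 1]
        b = (2 * F - 1 - F * a) / D            # beta fixed by (80)
        if abs(b) > 1:
            continue                           # outside the physical range
        eps = 0.5 - F / 2 * math.sqrt(1 - a*a) - D / 2 * math.sqrt(1 - b*b)
        best = min(best, eps)
    return best

for D in (0.05, 0.1, 0.2):
    closed = 0.5 - math.sqrt(D * (1 - D))
    assert abs(min_error(D) - closed) < 1e-4, (D, min_error(D), closed)
print("grid minimum matches 1/2 - sqrt(D(1-D))")
```

The grid minimum always sits at α = β = 1 − 2D, in agreement with (87).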

In Appendix A it is proven that inequality (88) is actually an equality. This implies that in the optimal attack Eve knows whether x = y or x ≠ y with probability one (although she still doesn't know x or y with certainty). This is equivalent to saying that, in the optimal attack, we have

〈E_00|E_01〉 = 〈E_00|E_10〉 = 0 ,   (89)
〈E_11|E_01〉 = 〈E_11|E_10〉 = 0 .   (90)
4.5. Secret key rate in the BB84
In this subsection we calculate the efficiency of the BB84, by using the results of Section
2.4 (Shannon theory). This requires us to calculate the conditional entropies H(X|Y ) and
H(X|Z).
Since Alice prepares all states |u(x)〉 with the same probability we have P (x) = 1/2.
Also, Eve’s attack generates an effective channel between Alice and Bob which leaves the
transmitted state invariant (y = x) with probability F and introduces an error (y ∕= x)
with probability D, giving the conditional distribution

P(y|x) = { 1 − D if y = x ;  D if y ≠ x } ,   (91)

which implies

P(y) = Σ_x P(y|x) P(x) = 1/2 .   (92)
Eve's information is her guess z on x, which according to (88) follows the distribution

P(z|x) = { 1/2 + √(D(1−D)) if z = x ;  1/2 − √(D(1−D)) if z ≠ x } .   (93)

Arguing as in (92) we can show that P(z) = 1/2.
Next, with this statistical information we calculate the conditional entropies H(X|Y )
and H(X|Z). Using P (x) = P (y) = P (z) = 1/2 we obtain H(X) = H(Y ) = H(Z) =
h(1/2) = 1. Using this and (91) we obtain

H(Y|X) = P(X = 0) H(Y|X = 0) + P(X = 1) H(Y|X = 1) = (1/2) h(D) + (1/2) h(D) = h(D) ,   (94)

which gives

H(X|Y) = H(Y|X) + H(X) − H(Y) = h(D) .   (95)

Using (93) we obtain

H(Z|X) = P(X = 0) H(Z|X = 0) + P(X = 1) H(Z|X = 1)
       = (1/2) h(1/2 − √(D(1−D))) + (1/2) h(1/2 − √(D(1−D)))
       = h(1/2 − √(D(1−D))) ,   (96)

which gives

H(X|Z) = H(Z|X) + H(X) − H(Z) = h(1/2 − √(D(1−D))) .

Putting everything together we can calculate the secret key rate (secret bits/photon)

R = (1/2)[H(X|Z) − H(X|Y)] = (1/2)[ h(1/2 − √(D(1−D))) − h(D) ] .   (97)

The factor 1/2 follows from the fact that the fraction of compatible bases between preparations by Alice and measurements by Bob is asymptotically 50%.
Next, we plot Eve’s ignorance H(X|Z) in blue and Bob’s ignorance H(X|Y ) in red, as
a function of the disturbance D. The secret key rate is 1/2 of the difference between these
two functions.
In Mathematica: h[x_] := -x Log[2, x] - (1 - x) Log[2, 1 - x];
Plot[{h[1/2 - Sqrt[e (1 - e)]], h[e]}, {e, 0, 0.5}]

[Plot: Eve's ignorance H(X|Z) (blue) decreases from 1 at D = 0, while Bob's ignorance H(X|Y) (red) increases from 0; the curves cross near D ≈ 0.146.]

At the point where the two entropies cross, the secret key rate becomes zero. This happens at D = (2 − √2)/4 ≈ 14.6%. We will see that other protocols tolerate higher disturbance.

4.6. Improved BB84
In this subsection we describe a modification of the BB84 protocol, so that we eliminate
the factor 1/2 in the efficiency rate (97).
In the distribution phase of this new protocol, Alice generates each random basis ui
with the probability distribution

P(u) = { 1 − N^{−1/2} if u = Z ;  N^{−1/2} if u = X } ,   (98)

and independently, Bob generates each random measurement vi with the same distribution.
Recall that in the original protocol the distributions are uniform. The basis-reconciliation
phase is the same as in the original protocol, but due to the change of statistics we have

N_rec ≈ [P(u = Z)P(v = Z) + P(u = X)P(v = X)] N
      = [(1 − N^{−1/2})² + (N^{−1/2})²] N
      = N − O(N^{1/2}) .   (99)

In the estimation phase, Alice and Bob use all rounds with u_j = v_j = X and a randomly selected subset of size N^{1/2} of the rounds with u_j = v_j = Z. This produces a raw key of size N_raw ≈ N − O(N^{1/2}), instead of N_raw ≈ N/2 − O(N^{1/2}). Continuing as in the original protocol, we obtain the desired rate without the factor 1/2.
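The effect of the biased basis choice (98) can be seen in a small Monte Carlo sketch (the sampling scheme below is an assumed illustration; the distributions are those of the text):

```python
import random

# Monte Carlo sketch of the biased basis choice (98): with P(Z) = 1 - 1/sqrt(N),
# the fraction of basis-reconciled rounds approaches 1 instead of 1/2.

def reconciled_fraction(N, seed=0):
    rng = random.Random(seed)
    p_x = N ** -0.5                       # probability of choosing basis X
    matches = 0
    for _ in range(N):
        u = 'X' if rng.random() < p_x else 'Z'
        v = 'X' if rng.random() < p_x else 'Z'
        matches += (u == v)
    return matches / N

for N in (10**3, 10**4, 10**5):
    # expected fraction: (1 - 1/sqrt(N))^2 + 1/N  =  1 - O(1/sqrt(N))
    print(N, round(reconciled_fraction(N), 3))
```

As N grows, the reconciled fraction tends to 1, which is why the 1/2 sifting factor of (97) disappears.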

5. Quantum cloning machines


5.1. The universal cloning machine
Quantum theory forbids the existence of a universal cloning machine (CM). This is an
imaginary device that takes an unknown input state |ψ〉 and produces two identical copies
of it. However, quantum theory allows for approximate CMs, which produce two approx-
imate copies σ1 (ψ), σ2 (ψ) of the input state |ψ〉. The fidelity is the overlap between the
clones and the input state

Funiv = 〈ψ|σ1 (ψ)|ψ〉 = 〈ψ|σ2 (ψ)|ψ〉 . (100)

For qubits |ψ〉 ∈ ℂ², the optimal universal CM has fidelity

F_univ = 5/6 ≈ 0.833 .   (101)

5.2. The phase-covariant cloning machine


The phase-covariant CM works with the promise that the input state belongs to the
equator of the Bloch sphere, that is

|ψ_θ〉 = (1/√2)(|0〉 + e^{iθ}|1〉) ,   (102)
2
for all θ. The optimal phase-covariant CM is a two-qubit unitary W acting on systems B and E:

|ψ_θ〉_B ─→ ┌───┐ ─→ σ_B
            │ W │                       |Φ〉_EB = W(|0〉_E |ψ_θ〉_B)   (103)
|0〉_E  ─→ └───┘ ─→ σ_E

The initial state of B is the input state |ψθ 〉, and the initial state of E is fixed to |0〉,
without loss of generality. The joint output state of B and E is |Φ〉BE , and its reduced
states
σB = trE |Φ〉EB 〈Φ| , (104)
σE = trB |Φ〉EB 〈Φ| , (105)
constitute the two clones of |ψθ 〉. The unitary W acts as
W (|0〉E |0〉B ) = |0〉E |0〉B , (106)
1
W (|0〉E |1〉B ) = √ (|1〉E |0〉B + |0〉E |1〉B ) . (107)
2

5.3. Problem 3: fidelity of phase-covariant CM


A. Write a 4 × 4 unitary matrix W satisfying (106) and (107).
B. Calculate the joint state |Φ〉EB = W (|0〉E |ψθ 〉B ) where |ψθ 〉 = 2−1/2 (|0〉 + eiθ |1〉).
Calculate the reduced density matrix σB = trE |Φ〉EB 〈Φ| and the fidelity Fphase =
〈ψθ |σB |ψθ 〉.
C. Which fidelity is larger: F_phase or F_univ = 5/6? Argue why.
D. Suppose that we break the promise and feed in W the input state |1〉. What is the
fidelity? Compare this fidelity with Fphase and Funiv , and argue the reason for the
difference.

5.4. Asymmetric phase-covariant CM


Motivated by BB84 individual attacks, we want to analyse phase-covariant CMs in which clone σ_B has fidelity F^B_phase > F_phase, while clone σ_E necessarily has fidelity F^E_phase < F_phase. The CM that provides the optimal trade-off of fidelities is the following family of unitaries

W_η(|0〉_E |0〉_B) = |0〉_E |0〉_B ,   (108)
W_η(|0〉_E |1〉_B) = cos η |1〉_E |0〉_B + sin η |0〉_E |1〉_B ,   (109)

where the parameter η specifies the degree of asymmetry. When η = π/4 we recover the symmetric CM (106-107). When η < π/4 we have F^E_phase > F_phase, and when η > π/4 we have F^B_phase > F_phase. The optimal fidelity trade-off is given by

(2F^B_phase − 1)² + (2F^E_phase − 1)² = 1 ,   (110)

which quantifies how one fidelity decreases when the other increases.
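The trade-off (110) can be checked numerically by applying W_η to |0〉_E |ψ_θ〉_B and tracing out each subsystem in turn (a minimal sketch; the amplitude-index convention and helper name are ours):

```python
import cmath, math

# Numerical check of the trade-off (110).  Apply W_eta to |0>_E |psi_theta>_B,
# compute both clone fidelities, and verify (2F_B-1)^2 + (2F_E-1)^2 = 1.
# Amplitude ordering: index i = 2*e + b for the basis state |e>_E (x) |b>_B.

def fidelities(eta, theta):
    s = 1 / math.sqrt(2)
    ph = cmath.exp(1j * theta)
    # W_eta(|0>|0>) = |00>;  W_eta(|0>|1>) = cos(eta)|10> + sin(eta)|01>
    phi = [s, s * ph * math.sin(eta), s * ph * math.cos(eta), 0j]
    psi = [s, s * ph]                   # the input equator state |psi_theta>
    # F_B = <psi| tr_E |Phi><Phi| |psi>,  F_E = <psi| tr_B |Phi><Phi| |psi>
    FB = sum(psi[b].conjugate() * phi[2*e + b] * phi[2*e + c].conjugate() * psi[c]
             for e in (0, 1) for b in (0, 1) for c in (0, 1)).real
    FE = sum(psi[e].conjugate() * phi[2*e + b] * phi[2*f + b].conjugate() * psi[f]
             for b in (0, 1) for e in (0, 1) for f in (0, 1)).real
    return FB, FE

for eta in (0.3, math.pi / 4, 1.1):
    for theta in (0.0, 2.0):
        FB, FE = fidelities(eta, theta)
        assert abs((2*FB - 1)**2 + (2*FE - 1)**2 - 1) < 1e-12
print("trade-off (2F_B - 1)^2 + (2F_E - 1)^2 = 1 verified")
```

Running this also shows that both fidelities are independent of θ, as required by phase covariance.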

5.5. Problem 4: fidelity trade-off in asymmetric CM
A. Calculate the joint state |Φ〉EB = Wη (|0〉E |ψθ 〉B ) where |ψθ 〉 = 2−1/2 (|0〉 + eiθ |1〉).
Calculate the reduced density matrices

σB = trE |Φ〉EB 〈Φ| , (111)


σE = trB |Φ〉EB 〈Φ| , (112)

and the corresponding fidelities

F^B_phase = 〈ψ_θ|σ_B|ψ_θ〉 ,   (113)
F^E_phase = 〈ψ_θ|σ_E|ψ_θ〉 .   (114)

B. Prove the trade-off

(2F^B_phase − 1)² + (2F^E_phase − 1)² = 1 .   (115)

C. Find the value of F^B_phase when you impose F^B_phase = F^E_phase. Comment on it.

5.6. Individual attacks and cloning machines


Let us analyse the performance of the individual attack U for the BB84 constructed in
subsection 4.4 as a phase-covariant cloning machine. Note that in the context of BB84 we
use the equator |0〉, |1〉, |+〉, |−〉 instead of |ψθ 〉, but this makes no difference.
Re-interpreting the diagram from subsection 4.4 we get

|a〉_B ─→ ┌───┐ ─→ σ_B^a (Bob)
          │ U │                       |Φ_a〉_EB   (116)
|φ〉_E ─→ └───┘ ─→ σ_E^a (Eve)


where a ∈ {0, 1, +, −}. Recall that the unitary U produces the state
√ √
|Φa 〉 = U (|φ〉|a〉) = F |Eaa 〉|a〉 + D|Eaā 〉|ā〉 , (117)

where ā denotes the opposite state in the same basis. Using the fact that |E_aa〉 and |E_aā〉 are orthogonal (89-90), we obtain Bob's reduced state

σ_B^a = tr_E [√F|E_aa〉|a〉 + √D|E_aā〉|ā〉][√F〈E_aa|〈a| + √D〈E_aā|〈ā|]   (118)
      = F |a〉〈a| + D |ā〉〈ā| .   (119)

Hence, Bob's fidelity is

F_B = 〈a|σ_B^a|a〉 = F .   (120)

The fact that the fidelity is independent of a follows from the symmetries X ↔ Z and
0 ↔ 1 present in U .
The orthogonality constraints (89-90) tell us that, even though Eve doesn't know a, she knows whether her reduced state is σ_E^a = |E_aa〉〈E_aa| or σ_E^a = |E_aā〉〈E_aā|. Let us analyse the first case; the second follows by symmetry. Next, write the vectors |E_00〉 and |E_11〉 in the basis {|a〉 : a = 0, 1},

|E_00〉 = cos ξ |0〉 + sin ξ e^{iγ} |1〉 ,   (121)
|E_11〉 = cos ξ |1〉 + sin ξ e^{iδ} |0〉 ,   (122)

where we have imposed the symmetries again, which gives Eve's fidelity

F_E = |〈a|E_aa〉|² = cos² ξ .   (123)

From subsection 4.4 we know that

2F_B − 1 = α = 〈E_00|E_11〉 = cos ξ sin ξ (e^{iδ} + e^{−iγ}) ∈ ℝ .   (124)

For this to be a real number we need γ = δ, hence

2F_B − 1 = 2 cos ξ sin ξ cos γ .   (125)

Also, for a fixed value of F_E (and hence of ξ), we want to maximise F_B, which is achieved by cos γ = 1, giving

2F_B − 1 = 2 cos ξ sin ξ .   (126)

Combining the above equalities with (123) gives

(2F_B − 1)² = 4F_E(1 − F_E) = 1 − (2F_E − 1)² ,   (127)

which is the same fidelity trade-off as in (110). Hence, we conclude that the optimal individual attack for the BB84 is the optimal asymmetric phase-covariant cloning machine.

5.7. The 6-state protocol and the universal cloning machine


The 6-state protocol is like the BB84 but with the additional basis Y = {(1/√2)(|0〉 ± i|1〉)}, corresponding to the Y axis of the Bloch sphere. In this protocol, Alice's preparation and Bob's measurement are chosen at random from the three bases X, Y, Z. This protocol has better efficiency and noise tolerance than the BB84.
The optimal attack for the 6-state protocol can be constructed following the steps of subsection 4.4, but with the three bases X, Y, Z and the corresponding symmetries. As one could expect, the optimal interaction U coincides with the optimal asymmetric universal CM; this time it is the universal CM instead of the phase-covariant one.

6. Entanglement-based QKD
Until now we have seen "prepare and measure" protocols for QKD, in which: (i) Alice prepares a quantum state, (ii) sends it to Bob through a quantum channel, and (iii) he measures it. In another type of QKD protocol there is a source of entangled pairs of quantum states between Alice and Bob, which: (i) sends one half of each pair to each of them, and (ii) they perform measurements. Due to the phenomenon of quantum steering, there is an equivalence between prepare-and-measure and entanglement-based protocols.

6.1. Equivalence of prepare+measure and entanglement-based QKD


Suppose Alice and Bob share a singlet state √12 (|01〉 − |10〉). If Alice measures in the
orthogonal basis {|φ0 〉, |φ1 〉} and obtains outcome |φx 〉 then she is effectively preparing
Bob’s system in the same basis but opposite state |φx̄ 〉. That is, the choice of basis Z, X
of Bob's state can be made by Alice measuring, instead of preparing. This can be used to implement the prepare-and-measure version of the BB84 in a new way: suppose Alice has a two-photon source in her lab, which she uses to prepare Bob's state and later sends it to him.
The following three figures illustrate that, from a theoretical point of view, there is no
difference between entanglement-based and prepare-and-measure protocols. This insight
will allow us to easily derive optimal protocols and the corresponding optimal attacks.

6.2. Problem 5: the singlet state
Prove that each pair of orthogonal states |φ0 〉, |φ1 〉 ∈ C2 satisfies

|φ0 〉 ⊗ |φ1 〉 − |φ1 〉 ⊗ |φ0 〉 ∝ |0〉 ⊗ |1〉 − |1〉 ⊗ |0〉 , (128)

where the symbol ∝ stands for "proportional to".
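A numerical sanity check of (128) (this is not a proof, which the problem still asks for; the parametrisation of the orthogonal pair is our assumption):

```python
import cmath, math

# Sanity check of (128): for an orthogonal qubit pair |phi_0>, |phi_1>, the
# antisymmetric combination is proportional to |01> - |10>.

def check(a, b, phase):
    # |phi_0> = cos(a)|0> + e^{ib} sin(a)|1>, and |phi_1> orthogonal to it,
    # with an arbitrary global phase on |phi_1>
    p0 = [math.cos(a), cmath.exp(1j * b) * math.sin(a)]
    p1 = [-cmath.exp(-1j * b) * math.sin(a), math.cos(a)]
    p1 = [cmath.exp(1j * phase) * x for x in p1]
    v = [p0[i] * p1[j] - p1[i] * p0[j] for i in (0, 1) for j in (0, 1)]
    # components (00, 01, 10, 11) must be (0, c, -c, 0) with c != 0
    assert abs(v[0]) < 1e-12 and abs(v[3]) < 1e-12
    assert abs(v[1] + v[2]) < 1e-12 and abs(v[1]) > 1e-6

for a, b, phase in [(0.3, 0.7, 0.0), (1.2, 2.5, 1.0), (0.9, 0.1, 2.2)]:
    check(a, b, phase)
print("antisymmetric combination is proportional to |01> - |10>")
```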

6.3. Description of the protocol
In each round i of the distribution phase, Alice randomly selects whether the ith system is used for generating raw key or for estimation. That is, she generates the random variable r_i ∈ {raw, est} with probability P(est) = N^{−1/2} and P(raw) = 1 − N^{−1/2}. If r_i = raw
then Alice measures in a fixed basis {|µ0 〉, |µ1 〉}, generating outcome xi . This basis is
optimised to yield the maximal correlation with Bob. If ri = est then Alice randomly
selects a basis ui ∈ {X, Y, Z} with uniform distribution and measures it. At the same
time, Bob generates the independent random variables si ∈ {raw, est} and vi ∈ {X, Y, Z}
with the same statistics as those of Alice, and performs the same procedure as Alice.

After the distribution phase, Alice and Bob do the following to each round i:

• If ri ∕= si then they throw away round i

• If ri = si = est then they keep the data (ui , vi , xi , yi ) to generate the joint statistics
of the process P (x, y|u, v) and reconstruct the joint state ρAB .

• If ri = si = raw then they keep outcomes (xi , yi ) for the raw key.

The following is an example of one such distribution phase:

The fraction of rounds that are thrown away is approximately 2P(est)P(raw) = O(N^{−1/2}), which tends to zero as N grows. On the other hand, the fraction of rounds used for the raw key is P(raw)P(raw) = O(1).
The difference between this estimation procedure and that of the BB84 is that here we
consider the statistics of the 9 measurement combinations (u, v), while in the BB84 only
the statistics of u = v = Z and u = v = X is considered.

A standard situation is when the source distributes many copies of a 2-qubit singlet state |Φ⁻〉 = (1/√2)(|01〉 − |10〉) with an amount δ of isotropic noise, that is

ρ_AB = (1 − δ)|Φ⁻〉〈Φ⁻| + δ 𝟙/4 .   (129)

6.4. Purification of a mixed state


We define an extension of a given state ρA to be a bipartite state ρAE such that

ρA = trE ρAE . (130)

If the state ρ_A has spectral decomposition

ρ_A = Σ_j λ_j |α_j〉_A〈α_j| ,   (131)

we define the purification as a bipartite state of the form

|ψ〉_AE = Σ_j √λ_j |α_j〉_A ⊗ |β_j〉_E ,   (132)

where {|β_j〉} is any orthonormal basis on E. Then, all purifications are equivalent up to this choice of basis. Or, in other words, all purifications are equivalent up to a unitary transformation on the ancillary system E. Due to this equivalence, we talk about the purification as being essentially unique. Note that the purification is an extension:
tr_E |ψ〉_AE〈ψ| = Σ_{jj′} √(λ_j λ_j′) |α_j〉_A〈α_j′| tr(|β_j〉_E〈β_j′|)
               = Σ_{jj′} √(λ_j λ_j′) |α_j〉_A〈α_j′| δ_{jj′}
               = Σ_j λ_j |α_j〉_A〈α_j| .   (133)

This extension, in addition to being pure, turns out to have a very remarkable property: any other extension can be obtained from the purification by a local (not necessarily unitary) operation on Eve's side.

Theorem 1 (Purification). Let ρ_A be a given state and |ψ〉_AE its purification. For any extension ρ_AE of ρ_A there is a completely positive trace-preserving map Λ such that

(𝟙_A ⊗ Λ_E)(|ψ〉_AE〈ψ|) = ρ_AE .   (134)

Proof. If ρ_A is not full rank, we redefine the Hilbert space so that this is the case. Let the purification of ρ_A be

|ψ〉_AE = Σ_j √λ_j |α_j〉_A ⊗ |β_j〉_E .   (135)

The spectral decomposition of the given extension is

ρ_AE = Σ_{i=1}^r |φ_i〉_AE〈φ_i| ,   (136)
then, we can write its (unnormalised) eigenvectors in terms of the bases of the purification

|φ_i〉_AE = Σ_{j,k} C^i_{j,k} |α_j〉_A ⊗ |β_k〉_E .   (137)

Now, observe that the matrices

M^i = Σ_{k,j} C^i_{j,k} (1/√λ_j) |β_k〉〈β_j|   (138)

satisfy

(𝟙_A ⊗ M^i_E)|ψ〉_AE = |φ_i〉_AE .   (139)

This implies that the completely positive map Λ(σ) = Σ_{i=1}^r M^i σ M^{i†} satisfies (134). Now, the only thing that remains to be shown is that Λ is trace-preserving: Σ_i M^{i†} M^i = 𝟙.
i
Using the facts that ρ_AE is an extension of ρ_A, identity (134), and the cyclicity of the trace, we obtain

ρ_A = tr_E ρ_AE = Σ_i tr_E (M^i_E |ψ〉_AE〈ψ| M^{i†}_E) = tr_E [(Σ_i M^{i†}_E M^i_E) |ψ〉_AE〈ψ|] .   (140)

Now, note that

ρ_A^{−1/2} |ψ〉_AE = |Φ〉_AE ,   (141)

where

|Φ〉_AE = Σ_j |α_j〉_A ⊗ |β_j〉_E   (142)

is an unnormalised maximally entangled state. This vector has the following property: for any matrix Q we have

tr_E [Q_E |Φ〉_AE〈Φ|] = Q_A^T ,   (143)

where Q_A^T is the transpose of Q_A, which is the same matrix as Q_E but acting on Alice's space.
Putting it all together we have

𝟙_A = ρ_A^{−1/2} ρ_A ρ_A^{−1/2} = tr_E [(Σ_i M^{i†}_E M^i_E) ρ_A^{−1/2} |ψ〉_AE〈ψ| ρ_A^{−1/2}]
    = tr_E [(Σ_i M^{i†}_E M^i_E) |Φ〉_AE〈Φ|] = (Σ_i M^{i†}_A M^i_A)^T .   (144)

Transposing the above and using 𝟙^T = 𝟙 gives trace preservation of Λ.
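The defining property (133) can be verified on a small example (working in the eigenbasis of ρ_A, where it is diagonal, loses no generality; the 2×2 spectrum below is an assumed example):

```python
import math

# Check of (132)-(133) on a 2x2 example: the purification built from the
# spectral decomposition traces back to rho_A.

lams = [0.7, 0.3]                  # eigenvalues of rho_A (in its eigenbasis)

# |psi>_{AE} = sum_j sqrt(lam_j) |j>_A |j>_E, amplitudes indexed by i = 2*a + e
psi = [0.0] * 4
for j, lam in enumerate(lams):
    psi[2 * j + j] = math.sqrt(lam)

# partial trace over E: rho_A[a][a'] = sum_e psi[2a+e] * psi[2a'+e]
rho_A = [[sum(psi[2*a + e] * psi[2*ap + e] for e in (0, 1))
          for ap in (0, 1)] for a in (0, 1)]

assert abs(rho_A[0][0] - 0.7) < 1e-12 and abs(rho_A[1][1] - 0.3) < 1e-12
assert abs(rho_A[0][1]) < 1e-12 and abs(rho_A[1][0]) < 1e-12
print("tr_E |psi><psi| reproduces rho_A")
```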

6.5. Problem 6: purification


What is the purification of a pure state |ψ〉_A? What is the purification of a maximally mixed state 𝟙_A/d_A? (Where 𝟙_A is the identity matrix and d_A is the dimension of the Hilbert space of system A.)

6.6. Individual attacks


Due to the purification theorem we know that the optimal attack is for Eve to hold the system E in the state |ψ〉_ABE which purifies the state ρ_AB shared by Alice and Bob. If we describe the global state as ⊗_i |ψ_{A_iB_iE_i}〉, then the individual attack consists of Eve measuring each system E_i separately. The state of Eve conditioned on Alice's information is

ρ_E|x = (1/P(x)) _A〈x|ρ_AE|x〉_A ,   (145)
where P (x) is the marginal of

P (x, y) = A〈x|B〈y|ρAB |x〉A |y〉B , (146)

Eve's measurement transforms the quantum information ρ_E|x into the classical information z, producing the joint distribution P(x, z), which together with P(x, y) allows us to calculate the efficiency rate
R = H(X|Z) − H(X|Y ) . (147)
So let us calculate this rate when Alice and Bob share N copies of the noisy singlet

ρ_AB = (1 − δ)|Φ⁻〉〈Φ⁻| + δ 𝟙/4 ,   (148)
as a function of the noise parameter δ. Recall that the singlet is invariant under changes of basis, hence it does not matter in which basis Alice and Bob measure, as long as they both use the same one. After both measuring in the basis |0〉, |1〉 they obtain the distribution

P(x, y) = { (1−δ)/2 + δ/4 if x ≠ y ;  δ/4 if x = y } .   (149)

Using the fact that P(y) = 1/2 for all y we obtain

P(x|y) = P(x, y)/P(y) = { 1 − δ/2 if x ≠ y ;  δ/2 if x = y } ,   (150)

which gives

H(X|Y) = P(y = 0)H(X|y = 0) + P(y = 1)H(X|y = 1) = h(δ/2) .   (151)

Note that the conditional entropy does not change if Bob flips his symbols. The relation with the disturbance is D = δ/2.
Next, we calculate (145). To obtain Eve's purification, it is convenient to decompose the identity with the following orthonormal basis

𝟙 = |Φ⁻〉〈Φ⁻| + |Φ⁺〉〈Φ⁺| + |00〉〈00| + |11〉〈11| ,   (152)

where |Φ⁺〉 = (1/√2)(|01〉 + |10〉). This allows us to write the AB-state in the form

ρ_AB = [(1 − δ) + δ/4]|Φ⁻〉〈Φ⁻| + (δ/4)|Φ⁺〉〈Φ⁺| + (δ/4)|00〉〈00| + (δ/4)|11〉〈11| ,   (153)
which has purification

|ψ〉_ABE = √(1 − 3δ/4) |Φ⁻〉⊗|E_1〉 + √(δ/4) |Φ⁺〉⊗|E_2〉 + √(δ/4) |00〉⊗|E_3〉 + √(δ/4) |11〉⊗|E_4〉 ,   (154)

where the |E_i〉 are orthonormal and we have used (1 − δ) + δ/4 = 1 − 3δ/4. Conditioning on Alice and Bob's outcomes we get

_AB〈00|ψ〉_ABE = √(δ/4) |E_3〉 ,   (155)
_AB〈11|ψ〉_ABE = √(δ/4) |E_4〉 ,   (156)
_AB〈01|ψ〉_ABE = √((1 − 3δ/4)/2) |E_1〉 + √(δ/8) |E_2〉 ,   (157)
_AB〈10|ψ〉_ABE = −√((1 − 3δ/4)/2) |E_1〉 + √(δ/8) |E_2〉 ,   (158)
and using P(x = 0) = 1/2 we obtain

ρ_E|x=0 = (1/P(x = 0)) (_AB〈00|ψ〉_ABE〈ψ|00〉_AB + _AB〈01|ψ〉_ABE〈ψ|01〉_AB)   (159)
        = (δ/2)|E_3〉〈E_3| + [√(1 − 3δ/4)|E_1〉 + √(δ/4)|E_2〉][√(1 − 3δ/4)〈E_1| + √(δ/4)〈E_2|] ,

ρ_E|x=1 = (1/P(x = 1)) (_AB〈10|ψ〉_ABE〈ψ|10〉_AB + _AB〈11|ψ〉_ABE〈ψ|11〉_AB)   (160)
        = (δ/2)|E_4〉〈E_4| + [√(1 − 3δ/4)|E_1〉 − √(δ/4)|E_2〉][√(1 − 3δ/4)〈E_1| − √(δ/4)〈E_2|] .
2
To calculate the optimal guessing probability we need to find the four eigenvalues of the matrix

ρ_E|x=0 − ρ_E|x=1
  = 2√((δ/4)(1 − 3δ/4)) (|E_1〉〈E_2| + |E_2〉〈E_1|) + (δ/2)|E_3〉〈E_3| − (δ/2)|E_4〉〈E_4| .   (161)

To do so we note that, in the {|E_1〉, |E_2〉} subspace,

|E_1〉〈E_2| + |E_2〉〈E_1| = ( 0 1 ; 1 0 ) ,   (162)

which has eigenvalues ±1. Then, the four eigenvalues of [ρ_E|x=0 − ρ_E|x=1] are ±δ/2 and ±2√((δ/4)(1 − 3δ/4)), which give trace distance

‖ρ_E|x=0 − ρ_E|x=1‖_1 = 2√(δ(1 − 3δ/4)) + δ .

Hence the minimal error is

ε = 1/2 − (1/4)[2√(δ(1 − 3δ/4)) + δ] ,   (163)
2 4
which can be written as

P(x|z) = { 1/2 + (1/4)[2√(δ(1 − 3δ/4)) + δ] if z = x ;  1/2 − (1/4)[2√(δ(1 − 3δ/4)) + δ] if z ≠ x } .   (164)

Using P(x) = 1/2 we obtain the conditional entropy

H(X|Z) = h( 1/2 − (1/2)√(δ(1 − 3δ/4)) − δ/4 ) .   (165)

The secret key rate is

R = h( 1/2 − (1/2)√(δ(1 − 3δ/4)) − δ/4 ) − h(δ/2) .   (166)
2 2 4
In the following picture you can see that this rate (amber) is larger than that of the BB84 (blue). The reason is that here Alice and Bob know the 15 parameters that specify their state ρ_AB, while in the BB84 they only know the single parameter

F − D = (1/2) tr(σ_x ⊗ σ_x ρ_AB) + (1/2) tr(σ_z ⊗ σ_z ρ_AB) .   (167)

To obtain the BB84 rate within the framework of entanglement-based QKD we have to take ρ_AB as a free parameter, and minimise the rate as a function of ρ_AB under the constraint (167).
In Mathematica: h[x_] := -x Log[2, x] - (1 - x) Log[2, 1 - x];
Plot[{h[1/2 - Sqrt[d (1 - d)]] - h[d], h[(1 - d)/2 - (1/2) Sqrt[d (2 - 3 d)]] - h[d]}, {d, 0, 0.5}]

[Plot, as a function of the disturbance d = D = δ/2: both rates start at 1 for d = 0; the entanglement-based rate (amber) stays above the BB84 rate (blue), so it vanishes at a larger disturbance.]
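The comparison in the plot can be reproduced numerically (both rates written per reconciled bit, in terms of the disturbance D = δ/2, with the 1/2 sifting factor of (97) dropped; helper names are ours):

```python
import math

# Compare the entanglement-based individual-attack rate (166) with the BB84
# rate (97), both as functions of the disturbance d = D = delta/2.

def h(p):
    return 0.0 if p <= 0 or p >= 1 else -p*math.log2(p) - (1-p)*math.log2(1-p)

def rate_bb84(d):
    return h(0.5 - math.sqrt(d * (1 - d))) - h(d)

def rate_ent(d):
    # (166) with delta = 2d: h((1-d)/2 - (1/2) sqrt(d (2 - 3d))) - h(d)
    return h((1 - d) / 2 - 0.5 * math.sqrt(d * (2 - 3 * d))) - h(d)

for i in range(1, 30):
    d = 0.01 * i
    assert rate_ent(d) >= rate_bb84(d) - 1e-12, d
print("full-statistics rate dominates the BB84 rate for all D")
```

This confirms the point made above: knowing all 15 parameters of ρ_AB can only help the honest parties.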

7. General attacks
Suppose that Alice and Bob share N copies of the state ρ_AB with purification |ψ_ABE〉, so that the global state is

⊗_{i=1}^N |ψ_{A_iB_iE_i}〉 ,   (168)

where Alice holds the systems A_1, A_2, ..., A_N, Bob holds B_1, ..., B_N and Eve holds E_1, ..., E_N.


Hence, Eve can make a joint (coherent) measurement of all her systems, which has the following advantage. Note that, actually, Eve is not interested in guessing the raw key (x_1, ..., x_N), but the secret key

(k_1, ..., k_{N_key}) = f(x_1, ..., x_N) ,   (169)

where f : {0,1}^N → {0,1}^{N_key} is a complicated function. For example, in the simple case

k = x_1 ⊕ x_2 ,   (170)

the relevant information is whether x_1 = x_2 or x_1 ≠ x_2, not the individual values of x_1 or x_2. And there might be a joint measurement of E_1, E_2 which provides better information about k. In any case, a joint measurement will never do worse, since it includes local measurements as a particular case. Of course, this extra information that Eve gets by doing a joint measurement can be "hashed out" by Alice and Bob by doing more privacy amplification, that is, by making the final key shorter.

Theorem. The optimal secret key rate of one-way public communication protocols (Alice → Bob) is

R_→ = I(X : Y) − I(X : E) ,   (171)

where

I(X : E) = S(E) − S(E|X) = S(ρ_E) − Σ_x P(x) S(ρ_E|x) ,   (172)

and S(ρ) = −tr(ρ log₂ ρ) denotes the von Neumann entropy.

Above we write the symbol E instead of Z to denote that Eve's system is quantum, because it has not been measured yet. That is, the joint state of Alice and Eve is classical-quantum (a cq-state)

ρ_XE = Σ_x P(x)|x〉_X〈x| ⊗ ρ_E|x .   (173)

We use the notation where X, Y, Z are classical systems and A, B, E are quantum. The fact that, in the rate formula, Eve's system is quantum simplifies our task, because we don't need to find her best measurement. In other contexts I(X : E) is called the Holevo bound, and is denoted by χ(X : E).
There are protocols with a higher secret key rate than (171) in the high-noise regime, obtained by exploiting two-way communication between Alice and Bob. However, we do not study these protocols here.

7.1. Secret key rate of the noisy singlet


After the estimation phase the honest parties know that they share N copies of

ρ_AB = (1 − δ)|Φ⁻〉〈Φ⁻| + δ 𝟙/4 .   (174)

If both measure their systems in the same basis then, as computed in Subsection 6.6, they have

H(X|Y) = h(δ/2) ,   (175)

or equivalently

I(X : Y) = 1 − h(δ/2) .   (176)
Let us calculate I(X : E). Alice and Bob know that, in the worst-case scenario, an adversary holds a system E which purifies the above state. Hence, the tripartite state is

|ψ〉_ABE = √(1 − 3δ/4) |Φ⁻〉⊗|E_1〉 + √(δ/4) |Φ⁺〉⊗|E_2〉 + √(δ/4) |00〉⊗|E_3〉 + √(δ/4) |11〉⊗|E_4〉 ,   (177)

where the |E_i〉 are orthonormal. As we have calculated in subsection 6.6, Eve's states conditioned on Alice's outcome are

ρ_E|x=0 = (δ/2)|E_3〉〈E_3| + [√(1 − 3δ/4)|E_1〉 + √(δ/4)|E_2〉][√(1 − 3δ/4)〈E_1| + √(δ/4)〈E_2|] ,
ρ_E|x=1 = (δ/2)|E_4〉〈E_4| + [√(1 − 3δ/4)|E_1〉 − √(δ/4)|E_2〉][√(1 − 3δ/4)〈E_1| − √(δ/4)〈E_2|] ,
which give von Neumann entropy

S(ρ_E|x=0) = S(ρ_E|x=1) = h(δ/2) ,   (178)

and using (177) we get

ρ_E = (1 − 3δ/4)|E_1〉〈E_1| + (δ/4)|E_2〉〈E_2| + (δ/4)|E_3〉〈E_3| + (δ/4)|E_4〉〈E_4| ,   (179)

and

S(ρ_E) = −(1 − 3δ/4) log₂(1 − 3δ/4) − (3δ/4) log₂(δ/4) .   (180)
This, together with I(X : Y) = 1 − h(δ/2), gives a secret key rate of

R = I(X : Y) − S(ρ_E) + Σ_x P(x) S(ρ_E|x)   (181)
  = 1 − h(δ/2) − S(ρ_E) + h(δ/2) = 1 + (1 − 3δ/4) log₂(1 − 3δ/4) + (3δ/4) log₂(δ/4) .   (182)

In the next plot we compare the secret key rates against individual (blue) and general (amber) attacks as functions of the disturbance. Obviously, the general attack provides more information to the adversary, and hence less key rate for the honest parties.
In Mathematica: h[x_] := -x Log[2, x] - (1 - x) Log[2, 1 - x];
Plot[{h[(1 - d)/2 - (1/2) Sqrt[d (2 - 3 d)]] - h[d], 1 + (1 - 3 d/2) Log[2, 1 - 3 d/2] + (3 d/2) Log[2, d/2]}, {d, 0, 0.5}]

[Plot, as a function of the disturbance d = δ/2: the general-attack rate (amber) lies below the individual-attack rate (blue) and vanishes at a smaller disturbance.]
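The rate (182) only requires the spectra derived above, which makes it easy to check numerically, together with the claim that general attacks are stronger than individual ones (helper names are ours; spectra taken from (178)-(179)):

```python
import math

# Check of (182): the coherent-attack rate R = I(X:Y) - I(X:E) computed from
# the spectra of rho_E and rho_{E|x}, compared with the closed form, and with
# the individual-attack rate (166).  Here d = delta.

def h(p):
    return 0.0 if p <= 0 or p >= 1 else -p*math.log2(p) - (1-p)*math.log2(1-p)

def S(eigs):                      # von Neumann entropy from a spectrum
    return sum(-e * math.log2(e) for e in eigs if e > 0)

def rate_general(d):
    S_E  = S([1 - 3*d/4, d/4, d/4, d/4])     # spectrum of rho_E, eq. (179)
    S_Ex = S([1 - d/2, d/2])                 # spectrum of rho_{E|x}, eq. (178)
    return (1 - h(d/2)) - (S_E - S_Ex)       # I(X:Y) - I(X:E)

def rate_individual(d):
    return h(0.5 - 0.5*math.sqrt(d*(1 - 3*d/4)) - d/4) - h(d/2)

for i in range(1, 40):
    d = 0.01 * i
    closed = 1 + (1 - 3*d/4)*math.log2(1 - 3*d/4) + (3*d/4)*math.log2(d/4)
    assert abs(rate_general(d) - closed) < 1e-12
    assert rate_general(d) <= rate_individual(d) + 1e-12
print("general-attack rate matches (182) and lower-bounds the individual rate")
```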

7.2. Problem 7: General attacks


Suppose that Alice and Bob share N copies of the two-qubit state

ρ_AB = (3/4)|Ψ〉〈Ψ| + (1/4) 𝟙/4 ,   (183)

where

|Ψ〉 = √(3/4) |0〉 ⊗ |0〉 + √(1/4) |1〉 ⊗ |1〉 .   (184)
Suppose that this is the partial trace of a tripartite state where the extra system is pos-
sessed by a cryptographic adversary.
A. Can you write the global tripartite state that gives maximal advantage to this ad-
versary?
B. If Alice and Bob both measure in the {|0〉, |1〉} basis, what is the tripartite classical-classical-quantum state of the following form?

ρ_XYE = Σ_{x,y=0}^1 P(x, y) |x〉_X〈x| ⊗ |y〉_Y〈y| ⊗ ρ_E|x,y .   (185)

C. Is it possible to distill a secret key from this state, which is secure against general
attacks? That is, does formula
R ≥ I(X : Y ) − I(X : E) , (186)
give a positive number?

7.3. De Finetti Theorem
In the previous section we considered general attacks with the i.i.d. assumption on Alice and Bob's reduced state ρ_AB^⊗N. However, in order to prove full security, no assumption should be made. To drop this assumption we use the quantum exponential de Finetti theorem.
The relevant point is that the protocol applied by the honest parties treats each of the N pairs of systems (or rounds) on an equal footing. Hence, nothing changes if, before the protocol, Alice and Bob apply a random permutation to their pairs of systems, and later forget the value of this permutation. This ensures that the reduced state of the honest parties ρ_{A^N B^N} is symmetric with respect to the exchange of pairs. Note that, since Eve knows which permutation has been applied, the global state ρ_{A^N B^N E} is not necessarily symmetric. But when dealing with the reduced state of Alice and Bob we can apply the following.

Theorem 2 (Informal version of de Finetti). Let ρ^(N) be a state of N systems that is invariant under any permutation of the N systems. If ρ^(M) is the reduced state of M < N of the N systems then

‖ ∫ dσ P(σ) σ^⊗M − ρ^(M) ‖_1 ≤ O(2^{−(N−M)}) ,   (187)

where P(σ) is a probability distribution over single-system density matrices.

This tells us that, after discarding a small fraction of the N states, the remaining state is
approximately i.i.d. So the analysis of Section 7 holds in general. Hence, the most general
attack allows Alice and Bob to obtain the secret key rate given in formula (171).

8. Non-local correlations
In this section we develop a formalism that allows to construct QKD protocols by only
looking at the statistics P (a, b|x, y) and ignoring the quantum model that produces that
statistics 3 4
P (a, b|x, y) = tr Aax ⊗ Byb ρAB . (188)

(Here, x, y label the different observables that Alice and Bob can perform, so that, for
each x, the operators {A^1_x, A^2_x, . . .} constitute a POVM (Σ_a A^a_x = 𝟙), and analogously for
{B^1_y, B^2_y, . . .}.) In this framework, Alice and Bob look at their devices as black boxes.

The causal structure of these correlations is that of a common cause Λ, which here is the
shared state ρAB.

The advantage of this framework is that security does not rely on having a correct
description/model of the devices, as long as they generate useful statistics P(a,b|x,y). As
we will see below, the security of these protocols does not even rely on the validity of
quantum mechanics.

8.1. Classical, quantum and beyond


A bipartite conditional distribution P(a,b|x,y) is non-signaling if

Σ_a P(a,b|x,y) = Σ_a P(a,b|x′,y)  ∀ x, x′, y, b , (189)
Σ_b P(a,b|x,y) = Σ_b P(a,b|x,y′)  ∀ y, y′, x, a . (190)

Non-signaling conditional distributions have well-defined marginals, because

Σ_a P(a,b|x,y) = P(b|y) (191)

is independent of x (and the other way around). The fact that these are linear
constraints implies that if P1(a,b|x,y) and P2(a,b|x,y) are non-signaling then so is
qP1(a,b|x,y) + (1 − q)P2(a,b|x,y) for any 0 ≤ q ≤ 1. In other words, the set of non-signaling
correlations is convex. This convex set turns out to have a finite number of
extreme points (it is a polytope), but describing them is in general hard (see below). A
polytope can always be defined in two dual ways: in terms of its generators {e1, . . . , en},
or in terms of linear inequalities {c1, . . . , cm}:

P = conv{e1, . . . , en} = { Σ_i pi ei : pi ≥ 0, Σ_i pi = 1 } (192)
  = {x : c1 · x ≤ 1, . . . , cm · x ≤ 1} . (193)

Note that any inequality c · x ≤ c0 can be written as (1/c0) c · x ≤ 1, as long as c0 ≠ 0. This
last condition can be guaranteed by translating the origin of coordinates to the interior of
the polytope.
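The no-signaling conditions (189)-(190) and the convexity claim above are easy to verify numerically. The following sketch (illustrative, not part of the notes; the particular distributions chosen are hypothetical examples) encodes P(a,b|x,y) as an array and checks both properties.

```python
# Numerical check of the no-signaling conditions (189)-(190) and of the
# convexity of the non-signaling set. The array P[a, b, x, y] holds P(a,b|x,y).
import numpy as np

def is_nonsignaling(P, tol=1e-9):
    """P has shape (A, B, X, Y); returns True if (189) and (190) hold."""
    bob = P.sum(axis=0)      # sum_a P(a,b|x,y), shape (B, X, Y)
    alice = P.sum(axis=1)    # sum_b P(a,b|x,y), shape (A, X, Y)
    # Bob's marginal must not depend on x; Alice's must not depend on y.
    return (np.allclose(bob, bob[:, :1, :], atol=tol) and
            np.allclose(alice, alice[:, :, :1], atol=tol))

# Two non-signaling examples: uniform noise and perfectly correlated bits.
uniform = np.full((2, 2, 2, 2), 1/4)
corr = np.zeros((2, 2, 2, 2))
corr[0, 0], corr[1, 1] = 1/2, 1/2        # a = b, independent of x, y

for q in (0.0, 0.3, 1.0):                # convex mixtures stay non-signaling
    assert is_nonsignaling(q*uniform + (1-q)*corr)

# A signaling counterexample: Bob's outcome copies Alice's input x.
sig = np.zeros((2, 2, 2, 2))
for x in range(2):
    for y in range(2):
        sig[0, x, x, y] = 1.0            # a = 0 and b = x with certainty
assert not is_nonsignaling(sig)
```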
Local Correlations, also called classical or locally causal, are the ones that can be written
as

P(a,b|x,y) = Σ_λ P(λ) P(a|x,λ) P(b|y,λ) . (194)

If P(a,b|x,y) cannot be written in this way we say that it is non-local (and violates Bell
inequalities). The set of local distributions forms a convex set generated by the extreme
points

Pfg(a,b|x,y) = δ^a_{f(x)} δ^b_{g(y)} , (195)

for all functions f : X → A and g : Y → B. To see this, note that every single-site
distribution P(a|x) can be written as a mixture of distributions of the form δ^a_{f(x)}, that is,

P(a|x,λ) = Σ_f P(f|λ) δ^a_{f(x)} , (196)

and analogously for Bob. For example, writing P(a|x) as a matrix with rows labelled by a
and columns by x,

P(a|x) = ( .8 .4 ; .2 .6 ) = .6 ( 1 0 ; 0 1 ) + .2 ( 1 1 ; 0 0 ) + .2 ( 0 1 ; 1 0 ) . (197)

(Note that this decomposition is not unique.) This decomposition implies that any local
distribution (194) can be written as

P(a,b|x,y) = Σ_λ P(λ) P(a|x,λ) P(b|y,λ)
           = Σ_{λ,f,g} P(λ) P(f|λ) δ^a_{f(x)} P(g|λ) δ^b_{g(y)}
           = Σ_{f,g} P(f,g) δ^a_{f(x)} δ^b_{g(y)} , (198)
f,g

where the new hidden variable λ̃ = (f, g) has distribution

P(f,g) = Σ_λ P(λ) P(f|λ) P(g|λ) . (199)

The above shows that any local distribution can be written as a mixture of elements of
the form (195), showing that they generate the local polytope. The fact that the distributions
(195) have, for each pair of inputs, one entry equal to one and the rest zero implies that
they cannot be written as mixtures; hence they are extreme points. The local polytope that
we are studying has generators (195), and its inequalities are the Bell inequalities. The Bell
inequalities are in general not known, but below we construct them in a simple scenario.
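The decomposition (196)-(197) can be checked numerically. The sketch below (illustrative; the product-weight decomposition at the end is one valid, non-unique choice, an assumption of this example) verifies the mixture displayed in (197) and then builds a generic decomposition of the same P(a|x) into the four deterministic strategies.

```python
# Check that P(a|x) from Eq. (197) is a mixture of deterministic points
# delta^a_{f(x)}, as claimed in Eq. (196).
import itertools
import numpy as np

# P[a, x]: the example of Eq. (197); rows a, columns x.
P = np.array([[.8, .4],
              [.2, .6]])

# The four deterministic strategies f: {0,1} -> {0,1}, as 0/1 matrices.
fs = list(itertools.product(range(2), repeat=2))      # f = (f(0), f(1))
det = []
for f in fs:
    D = np.zeros((2, 2))
    for x in range(2):
        D[f[x], x] = 1.0
    det.append(D)

# The specific decomposition displayed in Eq. (197).
mix = (.6*np.array([[1., 0.], [0., 1.]])
       + .2*np.array([[1., 1.], [0., 0.]])
       + .2*np.array([[0., 1.], [1., 0.]]))
assert np.allclose(P, mix)

# A generic decomposition with product weights P(f) = P(f(0)|x=0) P(f(1)|x=1).
weights = [P[f[0], 0] * P[f[1], 1] for f in fs]
assert abs(sum(weights) - 1.0) < 1e-12
assert np.allclose(P, sum(w*D for w, D in zip(weights, det)))
```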
Quantum Correlations P(a,b|x,y) are the ones for which there exists a state ρAB and
measurements A^a_x, B^b_y such that

P(a,b|x,y) = tr[(A^a_x ⊗ B^b_y) ρAB] , (200)

where obviously A^a_x, B^b_y ≥ 0, Σ_a A^a_x = 𝟙A and Σ_b B^b_y = 𝟙B for all x, y. Note that the
dimension of the Hilbert space is not fixed, and it could be infinite. The set of quantum
correlations also forms a convex set, but this has infinitely many extreme points, hence it
is not a polytope. The extreme points of the quantum convex set are in general not known.
Next we study a particular case where the quantum set is completely characterized.

Binary Correlators. The simplest case where we can analyze these sets of correlations
is: two parties with two dichotomic observables (a, b, x, y ∈ {0, 1}). To simplify the
mathematical structure even more, instead of considering the 2⁴ = 16 numbers P(a,b|x,y),
we consider the 4 numbers

Cxy = Σ_{a,b} (−1)^a (−1)^b P(a,b|x,y) (201)
    = prob{a = b|x, y} − prob{a ≠ b|x, y} ∈ [−1, 1] , (202)
for all x, y. All possible vectors (C00, C01, C10, C11) ∈ [−1, 1]⁴ are non-signaling, since
they can be achieved with the following distribution:

P(a,b|x,y) = (1 + Cxy)/4 if a = b, and (1 − Cxy)/4 if a ≠ b , (203)
which has uniform marginals for Alice and Bob P (a|x) = P (b|y) = 1/2 for all a, b, x, y.
Since these marginals do not depend on x, y the non-signaling constraints are satisfied.
It is known that the correlations (C00 , C01 , C10 , C11 ) are local if and only if they satisfy
the 8 CHSH inequalities:
−2 ≤ C00 + C01 + C10 − C11 ≤ 2 (204)
−2 ≤ C00 + C01 − C10 + C11 ≤ 2 (205)
−2 ≤ C00 − C01 + C10 + C11 ≤ 2 (206)
−2 ≤ −C00 + C01 + C10 + C11 ≤ 2 (207)
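As a sanity check of this characterization, the following sketch (illustrative only) enumerates the 16 deterministic local points (195) and evaluates all 8 CHSH expressions (204)-(207) on each of them; no local point exceeds 2, while some saturate it.

```python
# Enumerate the 16 deterministic local points (195) of the binary scenario
# and evaluate all 8 CHSH expressions (204)-(207) on each of them.
import itertools

def correlators(f, g):
    """C_xy = (-1)^(f(x)+g(y)) for a deterministic strategy pair."""
    return [[(-1)**(f[x] + g[y]) for y in range(2)] for x in range(2)]

# The 8 CHSH sign patterns: one minus sign in each, plus their negations.
chsh_signs = [(1, 1, 1, -1), (1, 1, -1, 1), (1, -1, 1, 1), (-1, 1, 1, 1),
              (-1, -1, -1, 1), (-1, -1, 1, -1), (-1, 1, -1, -1), (1, -1, -1, -1)]

best = 0
for f in itertools.product(range(2), repeat=2):
    for g in itertools.product(range(2), repeat=2):
        C = correlators(f, g)
        vals = [s00*C[0][0] + s01*C[0][1] + s10*C[1][0] + s11*C[1][1]
                for (s00, s01, s10, s11) in chsh_signs]
        assert all(abs(v) <= 2 for v in vals)   # no local point violates CHSH
        best = max(best, max(vals))

assert best == 2    # ... but some local points saturate it
```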
The correlations (C00 , C01 , C10 , C11 ) are quantum if and only if they satisfy the 8 non-linear
inequalities:
−π ≤ arcsin C00 + arcsin C01 + arcsin C10 − arcsin C11 ≤ π (208)
−π ≤ arcsin C00 + arcsin C01 − arcsin C10 + arcsin C11 ≤ π (209)
−π ≤ arcsin C00 − arcsin C01 + arcsin C10 + arcsin C11 ≤ π (210)
−π ≤ − arcsin C00 + arcsin C01 + arcsin C10 + arcsin C11 ≤ π (211)
This is proven in L. Masanes, Necessary and sufficient condition for quantum-generated cor-
relations, quant-ph/0309137.
A linearization of the above is the famous Cirelson’s Bound: the correlations (C00 , C01 , C10 , C11 )
are quantum only if they satisfy the 8 linear inequalities:
−2√2 ≤ C00 + C01 + C10 − C11 ≤ 2√2 (212)
−2√2 ≤ C00 + C01 − C10 + C11 ≤ 2√2 (213)
−2√2 ≤ C00 − C01 + C10 + C11 ≤ 2√2 (214)
−2√2 ≤ −C00 + C01 + C10 + C11 ≤ 2√2 (215)
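A minimal numerical illustration, assuming the standard optimal CHSH strategy (singlet state with Pauli measurements at 45 degrees; this specific choice is an assumption of the sketch, not spelled out in the notes): the resulting quantum correlators saturate Cirelson's Bound (212) and satisfy the arcsin condition (208) with equality.

```python
# Singlet-based correlators saturating Cirelson's Bound.
import math
import numpy as np

Z = np.array([[1, 0], [0, -1]], dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
singlet = np.array([0, 1, -1, 0], dtype=complex) / math.sqrt(2)
rho = np.outer(singlet, singlet.conj())

A = [Z, X]                                   # Alice's two observables
B = [-(Z + X)/math.sqrt(2), (X - Z)/math.sqrt(2)]   # Bob's two observables

# Correlation functions C_xy = tr(rho A_x (tensor) B_y).
C = np.real([[np.trace(rho @ np.kron(A[x], B[y])) for y in range(2)]
             for x in range(2)])

S = C[0, 0] + C[0, 1] + C[1, 0] - C[1, 1]
assert abs(S - 2*math.sqrt(2)) < 1e-9        # Cirelson's Bound saturated

arcsins = (math.asin(C[0, 0]) + math.asin(C[0, 1])
           + math.asin(C[1, 0]) - math.asin(C[1, 1]))
assert abs(arcsins - math.pi) < 1e-9         # arcsin condition (208), equality
```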
Note that the above is necessary but not sufficient. In order to visualize the non-signaling,
classical and quantum sets we plot the subsets satisfying the constraint C11 = 1:

The boundaries of these sets are given by the above inequalities (when C11 = 1); in
particular, the facets of the tetrahedron are the CHSH inequalities, and the corners of
the cube outside the tetrahedron are the maximally non-local distributions, which are
called PR-boxes:

PPR(a,b|x,y) = 1/2 if a ⊕ b = xy, and 0 otherwise . (216)
In the general case (C00, C01, C10, C11) there are eight PR-boxes, connected by relabelings
a → 1 − a, and the same for x, y. For example, another PR-box is

PPR(a,b|x,y) = 1/2 if (1 − a) ⊕ b = x(1 − y), and 0 otherwise . (217)

Note that relabeling b is the same as relabeling a. In the above picture only 4 of the
8 PR-boxes appear; the other 4 have C11 = −1. Apart from the eight PR-boxes there
are 16 local extreme points:

Plocal(a,b|x,y) = δ^a_{f(x)} δ^b_{g(y)} (218)

for all pairs of functions f, g : {0, 1} → {0, 1}.
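The defining properties of the PR-box (216) can be confirmed directly: it is non-signaling with uniform marginals, yet its CHSH value is 4, beyond the quantum bound 2√2. Illustrative sketch only:

```python
# The PR-box (216) as a numerical table.
import numpy as np

PR = np.zeros((2, 2, 2, 2))              # PR[a, b, x, y]
for a in range(2):
    for b in range(2):
        for x in range(2):
            for y in range(2):
                if a ^ b == x & y:       # a XOR b equals x AND y
                    PR[a, b, x, y] = 0.5

# Uniform marginals, independent of the far input: no-signaling holds.
assert np.allclose(PR.sum(axis=0), 0.5)  # sum_a P(a,b|x,y) = 1/2
assert np.allclose(PR.sum(axis=1), 0.5)

# Correlators (201) and the CHSH value.
C = np.zeros((2, 2))
for x in range(2):
    for y in range(2):
        C[x, y] = sum((-1)**(a + b) * PR[a, b, x, y]
                      for a in range(2) for b in range(2))
assert C[0, 0] + C[0, 1] + C[1, 0] - C[1, 1] == 4.0
```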

8.2. Monogamy of non-local correlations


Suppose that there is a 3-party non-signaling distribution P(a,b,e|x,y,z) such that

Σ_e P(a,b,e|x,y,z) = PPR(a,b|x,y) ; (219)

then, using Bayes' rule and no-signaling, we can write

PPR(a,b|x,y) = Σ_e P(a,b,e|x,y,z)
            = Σ_e P(e|x,y,z) P(a,b|x,y,e,z) (220)
            = Σ_e P(e|z) P(a,b|x,y,e,z) .

We know that PPR is an extreme point, hence it cannot be a non-trivial mixture of
distributions; therefore the distributions P(a,b|x,y,e,z) for all values of e, z are equal:
P(a,b|x,y,e,z) = P(a,b|x,y). This implies that the original distribution is uncorrelated:

P(a,b,e|x,y,z) = P(a,b|x,y) P(e|z) . (221)

Although the above only applies when Alice and Bob share a PR-box, we see that in this
case Eve cannot even be classically correlated with Alice and Bob. This is analogous to
what happens with a singlet. In the context of key distribution, the above implies that
if Alice and Bob share a PR-box then their correlations are necessarily secret (just as
with a singlet). This is one manifestation of the monogamy of non-local correlations.
Let us see a different one.
Definition 3. We say that the distribution P(a,b|x,y) is 2-shareable with respect to Bob
if there is a 3-party non-signaling distribution P(a,b1,b2|x,y1,y2) such that

Σ_{b1} P(a,b1,b|x,y1,y) = P(a,b|x,y) , (222)
Σ_{b2} P(a,b,b2|x,y,y2) = P(a,b|x,y) . (223)

Theorem 4. Any 2-shareable distribution satisfies all Bell inequalities with two measure-
ments on Bob’s side.

Proof. Given P(a,b|x,y), assuming the existence of P(a,b1,b2|x,y1,y2), and using (222)
and (223), we can write

P(a,b|x,y) = Σ_{b1,b2} P(b1,b2|0,1) P(a|x,b1,b2,0,1) δ^b_{b1 δ_{y,0} + b2 δ_{y,1}} . (224)

Now, if we identify λ = (b1, b2), the above looks like

P(a,b|x,y) = Σ_λ P(λ) P(a|x,λ) P(b|y,λ) . (225)

Hence P(a,b|x,y) is local whenever y ∈ {0, 1}.

The above can be easily generalized to the following statement: any k-shareable distribution
satisfies all Bell inequalities with k measurements on Bob's side. What happens
then with ∞-shareable distributions? They satisfy all Bell inequalities. Conversely, local
distributions are ∞-shareable: if

P(a,b|x,y) = Σ_λ P(λ) P(a|x,λ) P(b|y,λ) , (226)

then you can write

P(a,b1,...,bk|x,y1,...,yk) = Σ_λ P(λ) P(a|x,λ) P(b1|y1,λ) ··· P(bk|yk,λ) . (227)
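Construction (227) can be tested on a toy local model (the model below, a = λ and b = λ ⊕ y with uniform λ, is a hypothetical example chosen for this sketch, not taken from the notes): the k-shareable extension reproduces the original distribution in every Alice-Bob_j marginal.

```python
# Build the k-shareable extension (227) of a local distribution and verify
# that marginalizing over all Bobs but one recovers the original (194).
import itertools
import math

k = 3
P_lam = [0.5, 0.5]                    # uniform hidden variable

def P_a(a, x, lam):
    return 1.0 if a == lam else 0.0   # a = lambda, regardless of x

def P_b(b, y, lam):
    return 1.0 if b == (lam ^ y) else 0.0   # b = lambda XOR y

def original(a, b, x, y):             # Eq. (194)
    return sum(P_lam[l] * P_a(a, x, l) * P_b(b, y, l) for l in range(2))

def extension(a, bs, x, ys):          # Eq. (227): P(a, b1..bk | x, y1..yk)
    return sum(P_lam[l] * P_a(a, x, l) *
               math.prod(P_b(bs[j], ys[j], l) for j in range(k))
               for l in range(2))

for j in range(k):                    # each Bob_j marginal matches
    for a, b, x, y in itertools.product(range(2), repeat=4):
        total = sum(
            extension(a, list(other[:j]) + [b] + list(other[j:]),
                      x, [0]*j + [y] + [0]*(k-1-j))
            for other in itertools.product(range(2), repeat=k-1))
        assert abs(total - original(a, b, x, y)) < 1e-12
```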

We end this subsection by proving our general monogamy result for the binary case.
But before this, we need to introduce a symmetric family of correlations and a protocol
which allows Alice and Bob to symmetrize their correlations without losing non-locality.
The family of symmetric correlations is

Pν(a,b|x,y) = (1 − ν)/2 if a ⊕ b = xy, and ν/2 otherwise , (228)

where ν ∈ [0, 1] is a parameter. We can write them as noisy PR-boxes with noise parameter
ν ∈ [0, 1],

Pν(a,b|x,y) = (1 − 2ν) PPR(a,b|x,y) + 2ν · (1/4) . (229)
These correlations are analogous to the noisy singlet (153) that we studied in QKD. The
symmetry of these correlations is manifested in the fact that

C00 = C01 = C10 = −C11 = C = 1 − 2ν . (230)

For this one-parameter family of correlations we have:

• Classical: 1/4 ≤ ν

• Quantum: (1 − 1/√2)/2 ≤ ν < 1/4, with (1 − 1/√2)/2 ≈ .146

• Beyond: 0 ≤ ν < (1 − 1/√2)/2

Next, we describe a 3-step protocol for transforming any P (a, b|x, y) to the symmetric
distribution Pν (a, b|x, y) having the same CHSH violation.

First step: with probability 1/2 Alice and Bob either
• do nothing, or
• flip a and b.
Second step: with probability 1/2 Alice and Bob either
• do nothing, or
• flip y, and flip a when x = 1.
Third step: with probability 1/2 Alice and Bob either
• do nothing, or
• flip x, and flip b when y = 1.
Then they forget what they did. Importantly, all of the above operations are local, and
they only require (3 bits of) shared classical randomness. It is easy to check that the
CHSH violation is left invariant by this protocol (can you prove it?). Now we are
ready to show our general result on the monogamy of non-local correlations for the binary
case.
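At the level of the correlators Cxy, the three steps act as sign flips and input relabelings, so the CHSH invariance can be checked mechanically. The sketch below works with correlators only (step 1 leaves all Cxy unchanged, so it is omitted); it is an illustration of the argument, not a full distribution-level proof.

```python
# Correlator-level check of the symmetrization protocol. Step 2 maps
# C_xy -> (-1)^x C_{x,1-y}; step 3 maps C_xy -> (-1)^y C_{1-x,y}.
# Mixing each step with "do nothing" (probability 1/2) symmetrizes the
# correlators while preserving the CHSH value.
import random

def step2(C):   # flip y, and flip a when x = 1
    return [[(-1)**x * C[x][1-y] for y in range(2)] for x in range(2)]

def step3(C):   # flip x, and flip b when y = 1
    return [[(-1)**y * C[1-x][y] for y in range(2)] for x in range(2)]

def mix(C, D):  # equal-weight mixture of two correlator tables
    return [[(C[x][y] + D[x][y]) / 2 for y in range(2)] for x in range(2)]

def chsh(C):
    return C[0][0] + C[0][1] + C[1][0] - C[1][1]

random.seed(0)
C = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(2)]

C2 = mix(C, step2(C))
C3 = mix(C2, step3(C2))

assert abs(chsh(C3) - chsh(C)) < 1e-12       # CHSH value preserved
assert abs(C3[0][0] - C3[0][1]) < 1e-12      # symmetric form (230):
assert abs(C3[0][0] - C3[1][0]) < 1e-12      # C00 = C01 = C10 = -C11
assert abs(C3[0][0] + C3[1][1]) < 1e-12
```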
Theorem 5. Consider P (a, b, e|x, y, z) where all the variables are binary a, b, e, x, y, z ∈
{0, 1}. Then, at most one of the marginals P (a, b|x, y) or P (a, e|x, z) is non-local.

Proof. Let us prove this by contradiction. Suppose that the two marginals violate CHSH.
Then with local operations we can symmetrize the two marginals, transforming them into
noisy PR-boxes Pν(a,b|x,y) and Pν′(a,e|x,z) which still violate CHSH. Now, without
loss of generality, suppose that Pν(a,b|x,y) violates CHSH more than Pν′(a,e|x,z), that
is, ν ≤ ν′. Then Bob can always add a small amount of noise to decrease the Alice-Bob
violation until it becomes equal to that of Alice-Eve, ν ↦ ν′. This is done by Bob
flipping b with a suitable probability. This transformation generates a 2-shareable
distribution. This, together with the fact that all inputs are binary, implies (by
Theorem 4) that both marginals are local, a contradiction. Obviously, the same conclusion
can be obtained for the other two pairs of marginals.

8.3. Monogamy and no-cloning


Using steering we can relate monogamy with no-cloning. If Alice measures her half of
the pair with input x0 and obtains outcome a0, then she effectively prepares on Bob's
side the distribution P(b|y,a0,x0). If perfect cloning were possible, then Bob could
obtain P′(b1,b2|y1,y2,a0,x0) such that each of the marginals is equal to P(b|y,a0,x0). In
addition, notice that the performance of the cloning operation is independent of whether
Alice measures before or after the cloning process. Invoking special relativity, there is
always an observer who sees the cloning first, and describes the global state with the
distribution

P′(b1,b2,a0|y1,y2,x0) = P′(b1,b2|y1,y2,a0,x0) P(a0|x0) , (231)

which has both marginals P′(b1,a|y1,x) and P′(b2,a|y2,x) equal to the original distribution
P(a,b|x,y). If the original distribution is non-local then the two marginals are
non-local, in conflict with the previous theorem. Hence, cloning is impossible.

8.4. Problem 8: non-locality with only one observable


Consider a no-signalling distribution with more than one experimental setting on Alice's
side, x ∈ {0, 1, . . .}, and only one on Bob's side, y = 0, so that we can omit it: P(a,b|x).
Can such a distribution be non-local (i.e. violate a Bell inequality)? Please argue.

9. Device-independent QKD
There are two formalisms for device-independent QKD. In the first one (which we consider
next), only the no-signaling principle is assumed. Hence, the adversary is not restricted
by the laws of quantum mechanics. In the second one (which we do not consider in
this course), the geometry of quantum correlations (e.g. (208)) is assumed, but not the
particular states and measurements involved in the protocol.

9.1. Simple example


Suppose that Alice, Bob and Eve share a non-signaling distribution P(a,b,e|x,y) such
that e = a with probability one. This property implies that the Bob-Eve marginal

P(b,e|y) = Σ_a P(a,b,e|x,y) (232)

satisfies

P(a,b,e|x,y) = δ^a_e P(b,e|y) . (233)

Hence, if we trace over Eve we obtain

P(a,b|x,y) = Σ_e P(a,b,e|x,y) (234)
           = Σ_e δ^a_e P(b,e|y) (235)
           = Σ_e P(e) δ^a_e P(b|y,e) , (236)

which is the definition of local correlations.

Conclusion: If Alice and Bob share a non-signaling distribution P (a, b|x, y) and an
adversary can predict the outcome of one of them, then P (a, b|x, y) cannot violate any
Bell inequality.

9.2. No-signaling QKD


Alice and Bob can always perform tomography to estimate their marginal P(a,b|x,y).
This constrains the optimal (individual) attack P(a,b,e|x,y,z) to satisfy

P(a,b|x,y) = Σ_e P(e|z) P(a,b|x,y,e,z) . (237)

Now, let us prove that in the optimal attack the conditionals {P(a,b|x,y,e,z)}e are extreme
points of the Alice-Bob polytope. To show this, assume the opposite:

P(a,b|x,y,e,z) = Σ_i P(i|e,z) Pi(a,b|x,y,e,z) , (238)

where the Pi(a,b|x,y,e,z) are extremal for all i. Note that the new distribution

P(a,b,e,i|x,y,z) := Pi(a,b|x,y,e,z) P(i|e,z) P(e|z) (239)

is non-signaling (can you check it?), becomes the original one P(a,b,e|x,y,z) when
tracing out i, and has extremal conditional distributions P(a,b|x,y,z,e,i) for all (e,i).

Hence, if the adversary has i in addition to e, she can always ignore i and recover her
original correlations with Alice and Bob.
Also, if the Alice-Bob marginal (237) violates CHSH, then one of the extreme points
{P(a,b|x,y,e,z)}e must be a PR-box. This can be interpreted in the following way: during
the performance of the protocol, some of the pairs of Alice and Bob will be in the PR-box
state (216), although Alice and Bob do not know which of the pairs. In such pairs, Eve
knows nothing about the measurement outcomes of Alice and Bob. Hence, she wants to
minimise the probability of the PR-box. In order to do so, the only non-local extreme point
has to be the PR-box (216), and the only local extreme points must be the ones which
saturate CHSH.
Now, we recall that Alice and Bob can always transform their distribution into a symmetric
one,

Pν(a,b|x,y) = (1 − ν)/2 if a ⊕ b = xy, and ν/2 otherwise , (240)

without losing CHSH violation. Note that ν = D is the disturbance. For what follows it is
convenient to write Pν (a, b|x, y) as a mixture of a PR-box and the symmetric distribution
saturating the CHSH inequality. Using C = 1 − 2ν we get

Pν(a,b|x,y) = (2C − 1) Pν=0(a,b|x,y) + (2 − 2C) Pν=1/4(a,b|x,y) (241)
           = (1 − 4ν) Pν=0(a,b|x,y) + 4ν Pν=1/4(a,b|x,y) (242)
           = (1 − 4ν) PPR(a,b|x,y) + 4ν (1/8) Σ_{t=1}^{8} δ^a_{ft(x)} δ^b_{gt(y)} , (243)

where the 8 local points a = ft (x) and b = gt (y) are the ones which saturate the CHSH
inequality (you found them in Problem 5).
Therefore, redefining (e, i) → e, we can write the global distribution corresponding to
the optimal attack as

P(a,b,e|x,y) = (1 − 4ν) PPR(a,b|x,y) δ^0_e + 4ν (1/8) Σ_{t=1}^{8} δ^a_{ft(x)} δ^b_{gt(y)} δ^t_e . (244)

The above tripartite correlations correspond to one particular value of z. But since these
correlations are optimal from Eve's point of view, she is not going to use other possible
values of z; hence, we have omitted them in the above expression. In other words, in
order to implement the optimal attack, Eve does not need a machine with multiple inputs.
From (244) we see that, if Eve knows x, then she also knows a with probability 4ν, and
knows nothing with probability (1 − 4ν). This gives

H(A|B) = h(ν) , (245)
H(A|E, X) = 1 − 4ν , (246)

and a secret key rate of

R = 1 − 4D − h(D) . (247)
A variant of this protocol is one where Alice does not announce x and Bob does announce y.
Then, in order to maximize the correlations C, Alice flips a in the cases x = y = 1, that
is, a ↦ a′ = a ⊕ xy. Can you derive the following rate formulas?

H(A′|B) = h(ν) , (248)
H(A′|E, Y) = (1 − 4ν) + 4ν · (1/2) . (249)

9.3. Problem 9: alternative protocol for no-signaling QKD
We know that the optimal attack is

P(a,b,e|x,y,z) = (1 − 4ν) PPR(a,b|x,y) δ^0_e + 4ν (1/8) Σ_{t=1}^{8} δ^a_{ft(x)} δ^b_{gt(y)} δ^t_e , (250)

where ν = D is the disturbance. Now, suppose that Bob publishes y and Alice does not
publish x. Instead, she transforms the raw key as

a ↦ a′ = a ⊕ xy , (251)

in order to maximize correlations C.

A. Can you calculate H(A′ |B) and H(A′ |E, Y )? For this you need the solution of
Problem 5.

B. Can you write the secret key rate R′ and compare it with that of the simple
protocol shown above, Rsimple = 1 − 4ν − h(ν)? Which protocol is better?

9.4. Characterizing the set of quantum correlations


As mentioned above, in this course we do not study the brand of device-independent QKD
which relies on quantum theory. But next we describe some of its tools and formalism.
The following is a method for bounding the set of quantum correlations.
Suppose Alice and Bob share the state ρAB and measure it with observables represented
by the operators Ax and By , with eigenvalues ±1. The correlation functions are

Cxy = tr (ρAB Ax ⊗ By ) . (252)

Note that we are not imposing any restriction on the dimensionality of the Hilbert spaces.
Define the indexed Hermitian matrix

(M1 , M2 , M3 , M4 ) = (A0 ⊗ , A1 ⊗ , ⊗ B0 , ⊗ B1 ) (253)

and the (also Hermitian) matrix

Qij = tr (ρAB Mi Mj ) = Q̄ji , (254)

for i, j = 1, 2, 3, 4. Using the fact that Mi² = 𝟙 ⊗ 𝟙 we can write

Q = ( 1    α    C00  C01
      ᾱ    1    C10  C11
      C00  C10  1    β
      C01  C11  β̄    1 )   (255)

where

α = tr(ρAB [A0A1 ⊗ 𝟙]) ,   β = tr(ρAB [𝟙 ⊗ B0B1]) (256)

and ᾱ, β̄ are their complex conjugates. Next, we show that the matrix Q is always positive
semi-definite. For any vector v ∈ C⁴ we can define the matrix V = Σ_i vi Mi and note that

Σ_{ij} v̄i Qij vj = tr(ρAB V†V) ≥ 0 . (257)

Recall that any matrix of the form V†V is positive semi-definite. We also know that if Q
is positive then so are its complex conjugate Q̄ and their average

Q̃ = (1/2)(Q + Q̄) = ( 1     re α  C00   C01
                      re α  1     C10   C11
                      C00   C10   1     re β
                      C01   C11   re β  1 )   (258)

where re α is the real part of the complex number α.


In experimental or cryptographic situations the correlation functions Cxy are known,
but the real parameters re α, re β are not, because the products A0A1 and B0B1 are not
Hermitian and hence do not correspond to measurable observables. However, we can claim
the following.

Theorem 6. If the correlations Cxy are quantum, then there are two real numbers α, β
such that the matrix

Q = ( 1    α    C00  C01
      α    1    C10  C11
      C00  C10  1    β
      C01  C11  β    1 )   (259)

is positive semi-definite.

Let us solve this positivity condition for the simple case of a symmetric distribution
C00 = C01 = C10 = −C11 = C. The four eigenvalues of

Q = ( 1   α   C   C
      α   1   C  −C
      C   C   1   β
      C  −C   β   1 )   (260)

are

1 ± (1/√2) √( α² + β² + 4C² ± √((α² − β²)² + 8(α² + β²)C²) ) . (261)

The positivity of all of them requires

α² + β² + 4C² ± √((α² − β²)² + 8(α² + β²)C²) ≤ 2 . (262)

Multiplying the two inequalities we obtain

[α² + β² + 4C²]² − [(α² − β²)² + 8(α² + β²)C²] ≤ 4 , (263)

or equivalently

4(α²β² + 4C⁴) ≤ 4 . (264)

Minimizing the left-hand side with respect to α, β we obtain

4(4C⁴) ≤ 4 , (265)

which is true when

C ≤ 1/√2 . (266)

Substituting in

C00 + C01 + C10 − C11 ≤ 2√2 , (267)

we obtain Cirelson's Bound. However, Theorem 6 is much more general, since it provides
stronger constraints than Cirelson's Bound for the case of non-symmetric correlations.
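Theorem 6 can be probed numerically in the symmetric case: scanning a grid of real (α, β) and testing the eigenvalues of Q from (260) reproduces the threshold C ≤ 1/√2. The grid search below is a crude illustrative device, not part of the proof.

```python
# Numerical check: Q from Eq. (260) admits real alpha, beta making it
# positive semi-definite exactly when C <= 1/sqrt(2).
import math
import numpy as np

def Q(C, a, b):
    return np.array([[1,  a,  C,  C],
                     [a,  1,  C, -C],
                     [C,  C,  1,  b],
                     [C, -C,  b,  1]], dtype=float)

def psd_possible(C, grid=np.linspace(-1, 1, 41)):
    """True if some (alpha, beta) on the grid makes Q(C, a, b) PSD."""
    return any(np.linalg.eigvalsh(Q(C, a, b)).min() >= -1e-9
               for a in grid for b in grid)

assert psd_possible(1/math.sqrt(2))    # alpha = beta = 0 works at C = 1/sqrt(2)
assert not psd_possible(0.75)          # 0.75 > 1/sqrt(2): no choice works
```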

9.5. Problem 10: quantum correlations
Is there any distribution P (a, b|x, y) with a, b, x, y ∈ {0, 1} which does not violate Cirelson’s
Bound (212-215) but lies outside the quantum set? Hint: notice that the distribution
C00 = 1/√2 + ε ,   C01 = 1/√2 − ε , (268)
C10 = 1/√2 ,       C11 = −1/√2 , (269)

saturates Cirelson's Bound (267). Substitute this in (259) and check the non-positivity of
Q with perturbation theory to leading order in the small parameter 0 < ε ≪ 1.

Appendices

A. Optimal individual attack for the BB84


This appendix contains the proof that the individual attack constructed in Section 4.4 is
optimal. It is based on the reference http://arxiv.org/abs/quant-ph/9701039v1.

A.1. Information-vs-disturbance Theorem

Information gain on the X basis. Suppose that Eve makes the measurement {Qi}
(Qi ≥ 0 and Σ_i Qi = 𝟙). Then, the probability distribution for i generally depends on
which state is prepared by Alice. In the X basis |±〉 we have

p(i|+) = tr ρ+ Qi = EB〈Φ+| Qi ⊗ 𝟙B |Φ+〉EB , (270)
p(i|−) = tr ρ− Qi = EB〈Φ−| Qi ⊗ 𝟙B |Φ−〉EB . (271)

The information gain on the X basis is quantified by the statistical distance between these
two probability distributions:

δX = (1/2) Σ_i | p(i|+) − p(i|−) | . (272)

We can analogously define the information gain on the Z basis δZ .

Disturbance on the Z basis is the average probability of Bob obtaining the wrong
outcome:

DZ = (1/2) 〈Φ0| (𝟙E ⊗ |1〉〈1|) |Φ0〉 + (1/2) 〈Φ1| (𝟙E ⊗ |0〉〈0|) |Φ1〉 . (273)

Here and in what follows, we omit the system labels E and B when the expressions are
clear enough. Note that DZ ∈ [0, 1/2]; otherwise Bob could re-define |a〉 and decrease
DZ. Also note that DZ only depends on the interaction U, or equivalently on the vectors
|Φ0〉, |Φ1〉, but does not depend on the measurement {Qi}. We can analogously define the
disturbance on the X basis, DX.
Theorem 7 (Information vs disturbance, statistical distance). The information that Eve
gains in one basis constrains the disturbance that she produces in the conjugate basis:

δX ≤ 2√(DZ(1 − DZ)) , (274)
δZ ≤ 2√(DX(1 − DX)) . (275)

Note that the more information gain the more disturbance.

Proof. By the linearity of the transformation (65), |±〉 = (1/√2)(|0〉 ± |1〉) implies

|Φ±〉 = (1/√2)(|Φ0〉 ± |Φ1〉) . (276)

Using this and |re x| ≤ |x| we obtain

δX = (1/2) Σ_i | 〈Φ+|Qi ⊗ 𝟙|Φ+〉 − 〈Φ−|Qi ⊗ 𝟙|Φ−〉 |
   = (1/4) Σ_i | (〈Φ0| + 〈Φ1|) Qi ⊗ 𝟙 (|Φ0〉 + |Φ1〉) − (〈Φ0| − 〈Φ1|) Qi ⊗ 𝟙 (|Φ0〉 − |Φ1〉) |
   = (1/2) Σ_i | 〈Φ0|Qi ⊗ 𝟙|Φ1〉 + 〈Φ1|Qi ⊗ 𝟙|Φ0〉 |
   = Σ_i | re 〈Φ0|Qi ⊗ 𝟙|Φ1〉 |
   ≤ Σ_i | 〈Φ0|Qi ⊗ 𝟙|Φ1〉 |
   = Σ_i | 〈Φ0| (Qi ⊗ |0〉〈0|) |Φ1〉 + 〈Φ0| (Qi ⊗ |1〉〈1|) |Φ1〉 | , (277)

where we have inserted |0〉〈0| + |1〉〈1| = 𝟙 on system B. The positivity of Qi implies the
positivity of √Qi. Hence, we can define the family of unnormalised vectors

|ψ^i_{aa′}〉 = (√Qi ⊗ |a′〉〈a′|) |Φa〉 , (278)

which allows us to write

δX ≤ Σ_i | 〈ψ^i_00|ψ^i_10〉 + 〈ψ^i_01|ψ^i_11〉 |
   ≤ Σ_i ( |〈ψ^i_00|ψ^i_10〉| + |〈ψ^i_01|ψ^i_11〉| ) , (279)

where we have used the triangle inequality |x + y| ≤ |x| + |y|. Using the Cauchy-Schwarz
inequality |〈α|β〉|² ≤ 〈α|α〉〈β|β〉 we obtain

δX ≤ Σ_i ( √(〈ψ^i_00|ψ^i_00〉〈ψ^i_10|ψ^i_10〉) + √(〈ψ^i_01|ψ^i_01〉〈ψ^i_11|ψ^i_11〉) ) . (280)

Now, we note that

p(a′, i|a) = 〈ψ^i_{aa′}|ψ^i_{aa′}〉 = 〈Φa| (Qi ⊗ |a′〉〈a′|) |Φa〉 (281)

is the probability of Bob obtaining outcome a′ and Eve obtaining outcome i when Alice
prepares a.
Now, note that (x − y)² ≥ 0 implies xy ≤ (1/2)(x² + y²), therefore

√(p(i|0)p(i|1)) ≤ (1/2) p(i|0) + (1/2) p(i|1) = p(i) . (282)

Using this and Bayes' rule we obtain

δX ≤ Σ_i ( √(p(0,i|0)p(0,i|1)) + √(p(1,i|0)p(1,i|1)) )
   = Σ_i √(p(i|0)p(i|1)) ( √(p(0|i,0)p(0|i,1)) + √(p(1|i,0)p(1|i,1)) )
   ≤ Σ_i p(i) ( √(p(0|i,0)p(0|i,1)) + √(p(1|i,0)p(1|i,1)) )
   = Σ_i p(i) ( √((1 − D^i_0) D^i_1) + √(D^i_0 (1 − D^i_1)) ) , (283)

where we have defined D^i_a as the probability of Bob getting the wrong result conditioned
on Alice preparing a and Eve obtaining outcome i:

D^i_0 = p(1|i, 0) , (284)
D^i_1 = p(0|i, 1) . (285)

The corresponding average is

D^i_Z = (1/2) D^i_0 + (1/2) D^i_1 . (286)

If we keep the average D^i_Z fixed then the square bracket in (283) is a function of the
single variable D^i_0. By differentiating and equating to zero one can obtain the absolute
maximum of this function, which happens at D^i_0 = D^i_Z. Therefore

δX ≤ 2 Σ_i p(i) √((1 − D^i_Z) D^i_Z) . (287)

Now we use the fact that the function f(x) = √(x(1 − x)) is concave, that is, Σ_x p(x)f(x) ≤
f(Σ_x p(x)x). Recalling that definition (273) fulfils DZ = Σ_i p(i) D^i_Z, we obtain our final
result (274). The second inequality is derived in a similar fashion.

In BB84 Alice prepares states in the bases Z and X with the same probability. Then,
the average statistical distance and disturbance are

δ = (1/2)(δX + δZ) , (288)
D = (1/2)(DX + DZ) . (289)

Using again the concavity of the function √(x(1 − x)) we obtain the bound for the averaged
quantities:

δ ≤ 2√((1 − D) D) . (290)
Equation (51) gives a relation between the statistical distance δ and the error probability
ε, which translates to

ε ≥ 1/2 − √((1 − D) D) . (291)

This provides the (famous) information/disturbance tradeoff.

[Figure: plot of the bound 1/2 − √(D(1 − D)) as a function of D ∈ [0, 1/2]; it decreases from 0.5 at D = 0 to 0 at D = 1/2.]

In analogy to our previous results (Theorem 1), we can obtain information vs disturbance
tradeoffs where information gain is quantified by the mutual information.

A.2. Information gain in terms of mutual information


Theorem 8 (Information vs disturbance, mutual information). The information that Eve
gains in one basis constrains the disturbance that she produces in the conjugate basis:

I(aX : i) ≤ 1 − h( 1/2 − √(DZ(1 − DZ)) ) , (292)
I(aZ : i) ≤ 1 − h( 1/2 − √(DX(1 − DX)) ) , (293)

where

h(p) = −p log₂ p − (1 − p) log₂(1 − p) (294)

is the binary entropy. We omit the proof of these inequalities.
Analogously, we can also get a bound relating the averaged quantities
1 1
I(a : i) = I(aZ : i) + I(aX : i) , (295)
2 2
1 1
D = DZ + DX , (296)
2 2
which is " #
1 '
I(a : i) ≤ 1 − h − D(1 − D) , (297)
2
and it looks as
[Figure: plot of the bound 1 − h(1/2 − √(D(1 − D))) as a function of D ∈ [0, 1/2]; it increases from 0 at D = 0 to 1 at D = 1/2.]
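A quick numerical check of the endpoints and monotonicity of bound (297), using the binary entropy (294). Illustrative sketch only.

```python
# Endpoint check of the averaged bound (297): Eve's mutual information
# vanishes at zero disturbance and can reach one bit at D = 1/2.
import math

def h(p):
    """Binary entropy (294), with the conventional limits h(0) = h(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p*math.log2(p) - (1-p)*math.log2(1-p)

def bound(D):
    return 1 - h(0.5 - math.sqrt(D*(1 - D)))

assert bound(0.0) == 0.0      # no disturbance: Eve learns nothing
assert bound(0.5) == 1.0      # maximal disturbance: a full bit can leak

# The bound grows monotonically with the disturbance.
vals = [bound(d/100) for d in range(0, 51)]
assert all(x <= y + 1e-12 for x, y in zip(vals, vals[1:]))
```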

Bound (297) tells us that the attack constructed in Subsection 4.4 is optimal.

