Q Crypto UCL
Lluís Masanes
Contents

1. Motivation
3. State discrimination
   3.1. General case
   3.2. Classical case
   3.3. Pure-state case
   3.4. Problem 2: distinguishability of states
6. Entanglement-based QKD
   6.1. Equivalence of prepare+measure and entanglement-based QKD
   6.2. Problem 5: the singlet state
   6.3. Description of the protocol
   6.4. Purification of a mixed state
   6.5. Problem 6: purification
   6.6. Individual attacks
7. General attacks
   7.1. Secret key rate of the noisy singlet
   7.2. Problem 7: General attacks
   7.3. De-Finetti Theorem
8. Non-local correlations
   8.1. Classical, quantum and beyond
   8.2. Monogamy of non-local correlations
   8.3. Monogamy and no-cloning
   8.4. Problem 8: non-locality with only one observable
9. Device-independent QKD
   9.1. Simple example
   9.2. No-signaling QKD
   9.3. Problem 9: alternative protocol for no-signaling QKD
   9.4. Characterizing the set of quantum correlations
   9.5. Problem 10: quantum correlations
1. Motivation
• Cryptography is widely used in our society, and its consequences and regulations are constantly discussed in the media. In 2010 more than 85% of all US companies experienced data breaches, which cost the US economy around $100 billion.
• Quantum key distribution (QKD) is a good example of how research on the foundations of quantum theory leads to practical applications that are now in the marketplace. In particular, the most modern and secure protocols for QKD, known as device-independent QKD, are based on Bell inequalities, which were originally designed to rule out the possibility of a local-classical description of quantum phenomena.
• QKD is a perfect playground for practicing quantum mechanics and getting a deep
understanding of its counterintuitive and non-classical features.
2. Classical Information Theory
2.1. Information-theoretic security
A general model for secret communication using a shared secret key is the following: x ∈ X is the message (plaintext), k ∈ K the shared secret key, and y ∈ Y the ciphertext. The encoder E : K × X → Y and decoder D : K × Y → X satisfy D_k(E_k(x)) = x.
A protocol enjoys information-theoretic security if P(x|y) = P(x) for all y (equivalently, for a uniformly distributed message, P(x|y) is uniform for all y). Shannon's Theorem states that information-theoretic security implies |K| ≥ |Y| ≥ |X|. That is, the key has to be at least as long as the message, which is quite expensive in practice.
An information-theoretically secure encryption method with minimal key length |K| = |X| is the one-time pad. Let x ∈ {0,1}^N and let k be uniformly distributed in {0,1}^N. Then the encoder and decoder are the bit-wise xor (or sum modulo 2): y = E_k(x) = x ⊕ k and D_k(y) = y ⊕ k = x. Using the fact that k ⊕ k = 0 for any k ∈ {0,1}^N, we see that D_k(E_k(x)) = (x ⊕ k) ⊕ k = x. For example, if x = (0, 0, 0, 0, 1) and k = (1, 0, 0, 1, 1) then Alice encodes y = x ⊕ k = (1, 0, 0, 1, 0), and Bob decodes y ⊕ k = (0, 0, 0, 0, 1), which is equal to x.
An alternative (but not information-theoretically secure) method which uses a much shorter key is public-key cryptography, which is currently widely used. Here, a pair of keys is involved: a public key, which can be widely disseminated, and a private key, which is known only to the intended recipient (Bob). Any person can encrypt a message for Bob using the public key, but this message can only be decrypted with Bob's private key. For example, in RSA the private key consists of two large prime numbers (a, b), and the public key is their product c = ab. Given c, it is believed to be computationally hard to obtain a and b. (The details of this scheme are not explained in this course.) Hence, RSA (and other public-key cryptography methods) relies on: (i) limitations on the computational power of the adversary, and (ii) unproven mathematical conjectures. This is not information-theoretic security, but computational security.
If we do not want to rely on these assumptions we need to go back to the one-time pad, and distribute long secret keys between Alice and Bob efficiently. QKD solves this problem.
In the case p = 0, the function p log₂(1/p) takes the value lim_{p→0} p log₂(1/p) = 0.
where all distributions P(x_i) are identical. The number of typical sequences, Ω_X(N), scales as

  Ω_X(N) ≈ 2^{N H(X)} .    (10)

An interesting property of the typical sequences is that they all have approximately the same probability. Specifically, if x = (x₁, x₂, ..., x_N) is typical then

  P(x) ≈ 2^{-N H(X)} .    (11)

The approximate identities (10) and (11) imply that the total probability of obtaining a typical sequence is approximately one,

  Σ_{x typical} P(x) ≈ Ω_X(N) P(x) ≈ 1 .    (12)

As mentioned above, the error made in this approximation is e^{-√N}.
Information compression. For this task, Alice is given a sequence of i.i.d. random variables X₁, X₂, ..., X_N, and the goal is to encode the value of this sequence x = (x₁, x₂, ..., x_N) into a shorter sequence of M bits r = (r₁, r₂, ..., r_M) ∈ {0,1}^M, and send r to Bob, so that he can reconstruct (decode) the value of x from r with high success probability. We want to know: what is the smallest value of M?
We know that with high probability the sequence x is typical, and that there are ΩX (N )
typical sequences. On the other hand, the size of the encoding alphabet is |{0, 1}M | = 2M .
Correctly encoding x in r imposes the constraint ΩX (N ) ≤ |{0, 1}M |; so the value of M
has to be the smallest integer compatible with this constraint. This is approximately
M ≈ N H(X) . (13)
That is, the entropy H(X) is the number of bits required to encode each copy of X.
Randomness distillation. We can distill M perfect random bits from the i.i.d. sequence
X1 , X2 , . . . , XN if there is a map (X1 , . . . , XN ) −→ (K1 , . . . , KM ) ∈ {0, 1}M such that the
probability distribution for the M bits (K1 , . . . , KM ) is close to uniform. We want to
know: what is the largest value for M ?
We know that with high probability the outcome of X₁, ..., X_N is typical, and that all typical sequences have approximately the same probability (11). Therefore, we can construct a map that encodes each typical sequence x into a different M-bit string k = (k₁, ..., k_M). The requirement that k is uniform implies that Ω_X(N) ≥ |{0,1}^M|. Then
M is the largest integer compatible with this constraint, which approximately is
M ≈ N H(X) . (14)
That is, the entropy H(X) is the number of uniform random bits that can be distilled
from each copy of X.
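As a small numerical illustration (a Python sketch with a made-up distribution, not part of the original notes): for a biased bit with P(0) = 0.9, the entropy H(X) fixes both the compression length (13) and the distillable randomness (14).

from math import log2

def H(p):
    # Shannon entropy of a distribution given as a list of probabilities,
    # with the convention 0 * log2(1/0) = 0
    return sum(pi * log2(1 / pi) for pi in p if pi > 0)

P = [0.9, 0.1]           # hypothetical biased bit
N = 1000                 # number of i.i.d. copies

print(H(P))              # ~0.47 bits per copy
print(N * H(P))          # ~469: bits needed to encode the sequence, eq. (13),
                         # and uniform bits distillable from it, eq. (14)
print(2 ** (N * H(P)))   # ~ number of typical sequences, Omega_X(N) in (10)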
We can consider the pair (X, Y) to be a single random variable with alphabet X × Y. Hence, the number of typical sequences (x, y) is Ω_{X,Y}(N) ≈ 2^{N H(X,Y)}, where naturally

  H(X, Y) = Σ_{x,y} P(x, y) log₂ (1 / P(x, y)) .    (17)

Also, we know that each typical sequence (x, y) has probability

  P(x, y) ≈ 2^{-N H(X,Y)} .    (18)
Conditional probability. As mentioned above, a sequence (x, y) is typical if N_{(x,y)}(x, y) ≈ N P(x, y). Now, let us show that, when (x, y) is typical, then the sequence x is typical (with respect to the marginal distribution P(x) = Σ_y P(x, y)) too. This is proven as

  N_x(x) = Σ_y N_{(x,y)}(x, y) ≈ Σ_y N P(x, y) = N P(x) .    (20)

Analogously, y is typical (with respect to the marginal P(y) = Σ_x P(x, y)). Using (11) and (18) we can show that all typical sequences (x, y) have approximately the same conditional probability,

  P(x|y) = P(x, y) / P(y) ≈ 2^{-N H(X,Y)} / 2^{-N H(Y)} = 2^{-N H(X|Y)} ,    (23)

where H(X|Y) ≡ H(X, Y) − H(Y). Analogously we have P(y|x) ≈ 2^{-N H(Y|X)}.
Joint typicality. Let us consider a fixed typical sequence y, and define Ω_{X|y}(N) as the number of sequences x such that (x, y) is typical. Using (23) we obtain

  1 = Σ_x P(x|y) ≥ Σ_{x : (x,y) typical} P(x|y) ≈ Ω_{X|y}(N) 2^{-N H(X|Y)} ,    (25)

which implies Ω_{X|y}(N) ≤ 2^{N H(X|Y)} (up to the usual approximations).
Note that ΩX|y (N ) is (approximately) independent of y, and that it is the inverse of the
conditional probability (23).
Two-variable summary. If (x, y) is typical then x and y are typical too, and

  P(x|y) ≈ 1/Ω_{X|y}(N) ≈ 2^{-N H(X|Y)} ,    (29)
  P(y|x) ≈ 1/Ω_{Y|x}(N) ≈ 2^{-N H(Y|X)} .    (30)
Error correction is the generalisation of “information compression” to the case where
the receiver has partial information about the data. Suppose that Alice has (X1 , . . . , XN )
and Bob has (Y1 , . . . , YN ), following the i.i.d. distribution P (xi , yi ). That is, in round
i, the variables xi , yi are correlated according to P (xi , yi ), but the variables in different
rounds are independent. For this task we understand Bob’s data y = (y1 , . . . , yN ) as a
degraded version of Alice’s data x = (x1 , . . . , xN ). Then, he needs to correct the errors in
y to obtain x. For this, Alice sends M bits r = (r1 , . . . , rM ) ∈ {0, 1}M to Bob, so that
he can reconstruct x (with high probability) by using the message r and his data y. We
want to know what is the smallest value of M .
The knowledge of y tells Bob that (with high probability) Alice’s data x is such that
(x, y) is typical. Therefore, Alice only needs to help Bob to discriminate among the
sequences x that are jointly typical with Bob's sequence y, and there are only Ω_{X|y}(N) ≈ 2^{N H(X|Y)} of those (28). This implies that Alice needs to send, at least,
M ≈ N H(X|Y ) (31)
bits. This number of bits is also sufficient for the task to be successful with high probability,
but we are not proving this here. In summary, the conditional entropy H(X|Y ) is the
number of bits that we need to encode X once we know Y .
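The following Python sketch (with a hypothetical joint distribution, not part of the original notes) computes H(X|Y) = H(X, Y) − H(Y), the number of error-correction bits per round appearing in (31).

from math import log2

def H(p):
    # Shannon entropy of a list of probabilities (zero entries are skipped)
    return sum(q * log2(1 / q) for q in p if q > 0)

# hypothetical joint distribution P(x, y): y is a copy of the bit x,
# flipped with probability 0.1
Pxy = {(0, 0): 0.45, (0, 1): 0.05, (1, 0): 0.05, (1, 1): 0.45}

Py = [sum(p for (x, y), p in Pxy.items() if y == b) for b in (0, 1)]
H_X_given_Y = H(list(Pxy.values())) - H(Py)   # = h(0.1) ~ 0.47 bits per round

N = 10 ** 4
print(H_X_given_Y, N * H_X_given_Y)           # bits Alice must send, eq. (31)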
1. Error correction. Alice broadcasts M_ec ≈ N H(X|Y) bits of information r about x, such that Bob can reconstruct x by using (y, r). However, after this process, Eve's information has increased from z to (z, r), and her ignorance about x has decreased from N H(X|Z) to approximately N H(X|Z) − M_ec ≈ N [H(X|Z) − H(X|Y)].

2. Privacy amplification. Alice and Bob, who now share the same raw key x, apply the same privacy-amplification map to x, obtaining a shorter string k of length M_pa ≈ N [H(X|Z) − H(X|Y)]. This M_pa-bit string k is uniformly distributed and uncorrelated to (z, r); hence, it is a perfect secret key.
y=0 y=1
x = 0 z = 1 1/8
P (x, y, z) = z = 2 2/8 z = 2 2/8 , (39)
x=1 z = 3 2/8
z = 1 1/8
B. Consider the possibility that Alice and Bob exchange their roles, that is, consider the case where y is the raw key and it is Alice who corrects the errors. This amounts to using the rate formula R = H(Y|Z) − H(Y|X).
3. State discrimination
An essential ingredient for the analysis of QKD protocols is state discrimination. In this
section we analyse the optimal strategy to discriminate two quantum states.
3.1. General case
Suppose that we are given one of the two states ρ₀ and ρ₁, with prior probabilities p₀ and p₁ respectively, and we want to guess it with minimal error probability. The most general protocol consists of: (i) performing a measurement {B₁, ..., B_n}, with B_i ≥ 0 and Σ_i B_i = 𝟙, and (ii) computing the guess ρ_j (j = 0, 1) from the outcome i = 1, ..., n. Any such computation is characterised by a conditional probability distribution q(j|i), which gives the probability of output j when the input is i.
Define A_j = Σ_i q(j|i) B_i for j = 0, 1; these operators satisfy A_j ≥ 0 and A_0 + A_1 = 𝟙. Note that the protocol measuring {A_0, A_1} is as good as the one measuring {B_i} and then post-processing the outcome with q. Hence, it is enough to consider a two-outcome measurement, with outcome A_0 for “guess ρ₀” and A_1 for “guess ρ₁”. The probability of a wrong guess is
  ε = p₀ tr(A₁ρ₀) + p₁ tr(A₀ρ₁)
    = p₀ tr(A₁ρ₀) + p₁ tr([𝟙 − A₁]ρ₁)
    = p₁ + tr(A₁ M) ,    (43)

where M ≡ p₀ρ₀ − p₁ρ₁.
The 1-norm (or trace-norm) of a matrix M is defined as ‖M‖₁ = tr √(M†M), and for Hermitian matrices it can be written as

  ‖M‖₁ = Σ_k |λ_k| = Σ_{k : λ_k ≥ 0} λ_k − Σ_{k : λ_k < 0} λ_k .    (47)
Note that the optimal measurement does something very intuitive: when the measurement
outcome is x it guesses P if P (x) > Q(x) and it guesses Q otherwise.
Let the eigenvalues of the Hermitian matrix [ρ0 − ρ1 ] be {λ1 , λ2 }. Using the values of the
trace
λ1 + λ2 = tr[ρ0 − ρ1 ] = tr[ρ0 ] − tr[ρ1 ] = 1 − 1 = 0 , (55)
and the determinant
λ1 λ2 = det[ρ0 − ρ1 ] = −β 2 , (56)
we conclude that λ₁ = −λ₂ = β. This allows us to obtain the optimal error probability for distinguishing two pure states:

  ε = 1/2 − (1/4)(|λ₁| + |λ₂|) = 1/2 − β/2 = 1/2 − (1/2)√(1 − |⟨ψ₀|ψ₁⟩|²) ,    (57)

where we have used that α = ⟨ψ₀|ψ₁⟩.
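A quick numerical check of (57) (a NumPy sketch, not part of the original notes; the two pure states are arbitrary examples): the optimal error 1/2 − (1/4)‖ρ₀ − ρ₁‖₁ coincides with 1/2 − (1/2)√(1 − |⟨ψ₀|ψ₁⟩|²).

import numpy as np

theta = 0.7                                    # arbitrary example angle
psi0 = np.array([1.0, 0.0])
psi1 = np.array([np.cos(theta), np.sin(theta)])

rho0 = np.outer(psi0, psi0.conj())
rho1 = np.outer(psi1, psi1.conj())

# optimal error for equal priors: 1/2 - (1/4) * trace norm of (rho0 - rho1)
eigs = np.linalg.eigvalsh(rho0 - rho1)
eps_opt = 0.5 - 0.25 * np.sum(np.abs(eigs))

# formula (57)
eps_formula = 0.5 - 0.5 * np.sqrt(1 - abs(np.vdot(psi0, psi1)) ** 2)

print(eps_opt, eps_formula)    # both ~0.178 for theta = 0.7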
C. How does the success probability change if the prior probabilities are p₀ = 1/4 and p₁ = 3/4 instead?
Distribution phase. In each round i ∈ {1, 2, . . . , N } the honest parties perform the fol-
lowing two steps:
1. Alice randomly chooses a basis u_i ∈ {Z, X}, prepares a random state of that basis |u_i(x_i)⟩, x_i ∈ {0, 1}, and sends it to Bob.

2. Bob randomly chooses a basis v_i ∈ {Z, X}, and measures the received state |u_i(x_i)⟩ in the chosen basis, obtaining the outcome |v_i(y_i)⟩ with y_i ∈ {0, 1}.
Basis-reconciliation phase. Alice and Bob publish (ui , vi ) for all i ∈ {1, . . . , N }, and
construct a new sequence (xj , yj ) with j ∈ {1, . . . , Nrec }, containing only the pairs
with compatible bases ui = vi . We have that Nrec ≈ N/2. (Note that here, Eve
learns the preparation basis of all qubits.)
Estimation phase. Alice and Bob select a random subset S ⊂ {1, ..., N_rec} of size |S| = ⌈√N⌉, publish the pairs (x_j, y_j) in the subset j ∈ S, and compute the relative frequency of errors

  D = |{j ∈ S : x_j ≠ y_j}| / |S| ,    (61)

also known as the disturbance. The raw key (x_k, y_k) with k ∈ {1, ..., N_raw} is obtained after discarding the items in S. We have that N_raw ≈ N/2 − √N.
Error-correction phase. Alice calculates the number M ≈ N_raw h(D), generates a random hash function f : {0,1}^{N_raw} → {0,1}^M, and publishes f and f(x₁, ..., x_{N_raw}). Bob uses the information (y₁, ..., y_{N_raw}), f and f(x₁, ..., x_{N_raw}) to reconstruct (x₁, ..., x_{N_raw}). (Note that here, Eve learns substantial information about the raw key. In subsection 4.4 we show that H(X|Y) = h(D).)
Privacy-amplification phase. Alice calculates the number

  N_key = N_raw h(1/2 − √(D(1 − D))) − M ,    (63)

generates a random hash function g : {0,1}^{N_raw} → {0,1}^{N_key}, and publishes it. Both Alice and Bob generate the joint secret key by computing (k₁, ..., k_{N_key}) = g(x₁, ..., x_{N_raw}). (In subsection 4.4 we obtain H(X|Z) = h[1/2 − √(D(1 − D))], where Z is the best guess Eve can make on X. Shannon theory tells us that Eve knows nothing about this secret key.)
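The notes only require g (and f) to be a random hash function; one standard choice, used here purely as an illustration and not prescribed by the notes, is a 2-universal family of random binary Toeplitz matrices. A minimal Python sketch:

import numpy as np

def toeplitz_hash(raw_key, n_key, rng):
    # raw_key: array of n_raw bits; returns n_key output bits.
    # A random binary Toeplitz matrix is fixed by n_raw + n_key - 1 random
    # bits (its first row and column), which Alice can publish.
    n_raw = len(raw_key)
    diags = rng.integers(0, 2, size=n_raw + n_key - 1)
    T = np.array([[diags[i - j + n_raw - 1] for j in range(n_raw)]
                  for i in range(n_key)])
    return T.dot(raw_key) % 2

rng = np.random.default_rng(0)
x = rng.integers(0, 2, size=100)      # Alice's (= Bob's) corrected raw key
key = toeplitz_hash(x, 30, rng)       # shorter, nearly uniform secret key
print(key)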
The efficiency rate is defined as the number of generated secret bits Nkey divided by the
number of uses of the quantum channel N . The efficiency rate of the BB84 is
  R = N_key / N = (1/2)[ h(1/2 − √(D(1 − D))) − h(D) ] + O(N^{-1/2}) .    (64)
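A toy simulation of the distribution and basis-reconciliation phases (a Python sketch, not part of the original notes; it assumes an ideal noiseless channel and no eavesdropper, so D = 0 by construction):

import numpy as np

rng = np.random.default_rng(1)
N = 10000

u = rng.integers(0, 2, size=N)     # Alice's bases: 0 = Z, 1 = X
x = rng.integers(0, 2, size=N)     # Alice's bits
v = rng.integers(0, 2, size=N)     # Bob's bases

# ideal channel: if the bases agree Bob obtains x, otherwise a random bit
y = np.where(u == v, x, rng.integers(0, 2, size=N))

keep = (u == v)                    # basis reconciliation
x_sift, y_sift = x[keep], y[keep]
print(keep.sum() / N)              # ~0.5, as N_rec ~ N/2
print(np.mean(x_sift != y_sift))   # disturbance D = 0 in this noiseless sketch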
The BB84 protocol is summarised in the following figure. (Recall the difference between
polarisation and Bloch-sphere directions: Z for vertical and horizontal, and X for both
diagonals.)
• In practice, even if there is no eavesdropper, we expect errors (D > 0), because channels and measurement apparatuses are usually imperfect. However, in QKD we must assume the worst case: the channel is perfect and all errors come from the action of an eavesdropper. Measuring D allows Alice and Bob to assess the amount of information that Eve has about the generated key (in a worst-case scenario). Then, by using privacy amplification, they can be sure that the final key is secure, regardless of whether the errors come from the apparatuses or from an actual adversary.
• In the BB84, a key barrier for Eve is that she does not know in which basis Alice prepares the photon. But perhaps, instead of measuring this intercepted photon, she could make it interact with an additional system (ancilla) and send the original photon to Bob. Then, she could measure the ancilla once Alice publishes the preparation bases. This is actually a better attack than measuring without knowing the basis, and it is analysed in the next subsection.
• Proving that a protocol is secure against all possible attacks that we can think of
does not constitute a security proof. For this, we will have to formalise what is the
most general strategy that an adversary could perform, and prove security against
it.
4.4. Information gain vs disturbance in the conjugate basis
In this subsection we address the following question: what is the maximal amount of
information that an eavesdropper can have when disturbing the channel by an amount
D? We construct an explicit attack, which is proven to be optimal in Appendix A. To
simplify our analysis, we assume that Eve's interaction has the same symmetries as the protocol, that is, it is invariant under the exchanges 0 ↔ 1 and Z ↔ X.
When Alice prepares a ∈ {0, 1}, equation (65) can be written as

  U(|φ⟩ ⊗ |a⟩) = γ_{a0}|E_{a0}⟩ ⊗ |0⟩ + γ_{a1}|E_{a1}⟩ ⊗ |1⟩ ,

where all kets are normalised and the unknown coefficients γ_{ab} are real. Imposing that unitary transformations preserve the norm we have γ_{a0}² + γ_{a1}² = 1. Imposing the 0 ↔ 1 symmetry we get γ₀₀ = γ₁₁ and γ₀₁ = γ₁₀. The disturbance D is the probability that Bob obtains an error, which in this case is D = γ₀₁². The fidelity of the channel is F = 1 − D, so we can write

  U(|φ⟩ ⊗ |0⟩) = √F |E₀₀⟩ ⊗ |0⟩ + √D |E₀₁⟩ ⊗ |1⟩ ,    (70)
  U(|φ⟩ ⊗ |1⟩) = √D |E₁₀⟩ ⊗ |0⟩ + √F |E₁₁⟩ ⊗ |1⟩ .    (71)
where the four vectors |E_{±±}⟩ are normalized and satisfy identities analogous to (72) and (73). Comparing (74) with (75) and (76) we obtain

  √F |E₊₊⟩ = (1/2)[ √F |E₀₀⟩ + √D |E₁₀⟩ + √D |E₀₁⟩ + √F |E₁₁⟩ ]    (77)
  √F |E₋₋⟩ = (1/2)[ √F |E₀₀⟩ − √D |E₁₀⟩ − √D |E₀₁⟩ + √F |E₁₁⟩ ]    (78)
The normalization of these two vectors implies

  √F = (1/2) ‖ √F |E₀₀⟩ + √D |E₁₀⟩ + √D |E₀₁⟩ + √F |E₁₁⟩ ‖
     = (1/2) ‖ √F |E₀₀⟩ − √D |E₁₀⟩ − √D |E₀₁⟩ + √F |E₁₁⟩ ‖ .    (79)

Using (72) and (73) the above can be written as

  4F = ( √F ⟨E₀₀| + √D ⟨E₁₀| + √D ⟨E₀₁| + √F ⟨E₁₁| )( √F |E₀₀⟩ + √D |E₁₀⟩ + √D |E₀₁⟩ + √F |E₁₁⟩ )
     = 2F + 2D + 2Fα + 2Dβ + 2√(DF) re[ ⟨E₀₀|E₁₀⟩ + ⟨E₀₁|E₁₁⟩ + ⟨E₀₀|E₀₁⟩ + ⟨E₁₀|E₁₁⟩ ]
     = 2 + 2Fα + 2Dβ + Ω ,

  4F = ( √F ⟨E₀₀| − √D ⟨E₁₀| − √D ⟨E₀₁| + √F ⟨E₁₁| )( √F |E₀₀⟩ − √D |E₁₀⟩ − √D |E₀₁⟩ + √F |E₁₁⟩ )
     = 2F + 2D + 2Fα + 2Dβ − 2√(DF) re[ ⟨E₀₀|E₁₀⟩ + ⟨E₀₁|E₁₁⟩ + ⟨E₀₀|E₀₁⟩ + ⟨E₁₀|E₁₁⟩ ]
     = 2 + 2Fα + 2Dβ − Ω ,

which implies Ω = 0 and

  4F = 2 + 2Fα + 2Dβ .    (80)
This relation between D, F, α, β is a consequence of unitarity (i.e. quantum mechanics)
and it plays an important role below.
Once Eve knows that the qubit prepared by Alice is in the Z basis a ∈ {0, 1}, she has
to optimally distinguish the two states:
ρ0 = F |E00 〉〈E00 | + D|E01 〉〈E01 | , (81)
ρ1 = F |E11 〉〈E11 | + D|E10 〉〈E10 | , (82)
corresponding to Alice sending 0 or 1. Since we are assuming that Eve's attack has the Z ↔ X symmetry, it is not necessary to analyze the case where Alice prepares the qubit in the X basis: the information gained by Eve is the same.
The error ε_E made when optimally discriminating ρ₀ and ρ₁ is given by formula (51). But we cannot use this formula, since we don't know the exact form of the vectors |E_{ab}⟩; the only thing we know is the relation (80). However, a lower bound on ε_E is enough for Alice and Bob. Hence, we can instead solve the following simpler problem. Suppose
that Eve has to guess x in the imaginary situation where she is told whether x = y or
x ∕= y. That is, with probability F Eve knows that she has either |E00 〉〈E00 | or |E11 〉〈E11 |,
and with probability D she knows that she has either |E01 〉〈E01 | or |E10 〉〈E10 |. Clearly,
this situation is not worse than the original one, since Eve can always ignore the extra
information. Then, using the error formula for pure states (57) we obtain
  ε_E ≥ (F/2)[ 1 − √(1 − |⟨E₀₀|E₁₁⟩|²) ] + (D/2)[ 1 − √(1 − |⟨E₀₁|E₁₀⟩|²) ]
      = 1/2 − (F/2)√(1 − α²) − (D/2)√(1 − β²) .    (83)
Next, we minimise (83) with respect to α, β given the constraint (80). We can do this
with the Lagrange multipliers method: differentiating
  1/2 − (F/2)√(1 − α²) − (D/2)√(1 − β²) − µ(2 + 2Fα + 2Dβ − 4F)    (84)

with respect to α and β and equating to zero gives

  αF / (2√(1 − α²)) − 2µF = 0 ,    (85)
  βD / (2√(1 − β²)) − 2µD = 0 ,    (86)
which implies α = β. Updating constraint (80) with this fact gives
2F = 1 + α , α = 1 − 2D . (87)
Substituting α = β = 1 − 2D into (83) gives

  ε_E ≥ 1/2 − √(D(1 − D)) .    (88)

[Figure: plot of the lower bound (88) on Eve's error probability as a function of the disturbance D ∈ [0, 0.5].]
In Appendix A it is proven that inequality (88) is actually an equality. This implies that in the optimal attack Eve knows whether x = y or x ≠ y with probability one (although she still doesn't know x or y with probability one). This is equivalent to saying that, in the optimal attack, we have
which implies
  P(y) = Σ_x P(y|x) P(x) = 1/2 .    (92)
Eve’s information is her guess z on x, which according to (88) follows the distribution
8 1 '
P (z|x) = 2 + 'D(1 − D) if z = x
. (93)
1
2 − D(1 − D) if z ∕= x
Next, with this statistical information we calculate the conditional entropies H(X|Y )
and H(X|Z). Using P (x) = P (y) = P (z) = 1/2 we obtain H(X) = H(Y ) = H(Z) =
h(1/2) = 1. Using this and (91) we obtain
which gives

  H(X|Z) = H(Z|X) + H(X) − H(Z) = h(1/2 − √(D(1 − D))) .
Putting everything together we can calculate the secret key rate (secret bits/photon)

  R = (1/2)[ H(X|Z) − H(X|Y) ] = (1/2)[ h(1/2 − √(D(1 − D))) − h(D) ] .    (97)

The factor 1/2 follows from the fact that the fraction of compatible bases between preparations by Alice and measurements by Bob is asymptotically 50%.
Next, we plot Eve’s ignorance H(X|Z) in blue and Bob’s ignorance H(X|Y ) in red, as
a function of the disturbance D. The secret key rate is 1/2 of the difference between these
two functions.
In[9]:= h[x_] := -x Log[2, x] - (1 - x) Log[2, 1 - x]
        Plot[{h[.5 - Sqrt[e (1 - e)]], h[e]}, {e, 0, .5}]
At the point where the two entropies cross, the secret key rate becomes zero. This point
is around D = 15%. We will see that other protocols tolerate higher disturbance.
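The crossing point can be found numerically; a Python sketch (bisection on the difference of the two entropies, not part of the original notes):

from math import log2, sqrt

def h(p):
    # binary entropy
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

def gap(D):
    # Eve's ignorance minus Bob's ignorance; the key rate (97) is gap(D)/2
    return h(0.5 - sqrt(D * (1 - D))) - h(D)

lo, hi = 0.01, 0.49        # gap > 0 at lo and gap < 0 at hi
for _ in range(60):        # bisection
    mid = (lo + hi) / 2
    lo, hi = (mid, hi) if gap(mid) > 0 else (lo, mid)
print(lo)                  # ~0.146, i.e. around 15% as stated above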
4.6. Improved BB84
In this subsection we describe a modification of the BB84 protocol, so that we eliminate
the factor 1/2 in the efficiency rate (97).
In the distribution phase of this new protocol, Alice generates each random basis ui
with the probability distribution
  P(u) = 1 − N^{-1/2}   if u = Z ,
  P(u) = N^{-1/2}        if u = X ,    (98)
and independently, Bob generates each random measurement vi with the same distribution.
Recall that in the original protocol the distributions are uniform. The basis-reconciliation
phase is the same as in the original protocol, but due to the change of statistics we have
  N_rec ≈ [ P(u = Z)P(v = Z) + P(u = X)P(v = X) ] N
        = [ (1 − N^{-1/2})² + (N^{-1/2})² ] N ≈ N − 2√N .
In the estimation phase, Alice and Bob use all rounds with uj = vj = X and a selected
random subset of the rounds with uj = vj = Z of size N 1/2 . This produces a raw key of
size Nraw ≈ N − O(N 1/2 ), instead of Nraw ≈ N/2 − O(N 1/2 ). Now, continuing as in the
original protocol, we obtain the desired rate without the factor 1/2.
The initial state of B is the input state |ψθ 〉, and the initial state of E is fixed to |0〉,
without loss of generality. The joint output state of B and E is |Φ〉BE , and its reduced
states
σB = trE |Φ〉EB 〈Φ| , (104)
σE = trB |Φ〉EB 〈Φ| , (105)
constitute the two clones of |ψθ 〉. The unitary W acts as
W (|0〉E |0〉B ) = |0〉E |0〉B , (106)
1
W (|0〉E |1〉B ) = √ (|1〉E |0〉B + |0〉E |1〉B ) . (107)
2
5.5. Problem 4: fidelity trade-off in asymmetric CM
A. Calculate the joint state |Φ⟩_{EB} = W_η(|0⟩_E |ψ_θ⟩_B), where |ψ_θ⟩ = 2^{-1/2}(|0⟩ + e^{iθ}|1⟩), and calculate the reduced density matrices.

C. Find the value of F^B_phase when you impose F^B_phase = F^E_phase. Comment on it.
where a ∈ {0, 1, +, −}. Recall that the unitary U produces the state
  |Φ_a⟩ = U(|φ⟩|a⟩) = √F |E_{aa}⟩|a⟩ + √D |E_{aā}⟩|ā⟩ ,    (117)

where ā denotes the opposite state in the same basis. Using the fact that |E_{aa}⟩ and |E_{aā}⟩ are orthogonal (89-90), we obtain Bob's reduced state

  σ_B^a = tr_E [ ( √F |E_{aa}⟩|a⟩ + √D |E_{aā}⟩|ā⟩ )( √F ⟨E_{aa}|⟨a| + √D ⟨E_{aā}|⟨ā| ) ]    (118)
        = F |a⟩⟨a| + D |ā⟩⟨ā| .    (119)
The fact that the fidelity is independent of a follows from the symmetries X ↔ Z and
0 ↔ 1 present in U .
The orthogonality constraints (89-90) tell us that, even though Eve doesn't know a, she knows whether her reduced state is σ_E^a = |E_{aa}⟩⟨E_{aa}| or σ_E^a = |E_{aā}⟩⟨E_{aā}|. Let us analyse the first case, and the second will follow by symmetry. Next, write the vectors |E₀₀⟩ and |E₁₁⟩ in the basis {|a⟩ : a = 0, 1},
where we have imposed the symmetries again, which gives Eve’s fidelity
Also, for a fixed value of FE , and hence cos ξ = 1, we want to maximise FB , which gives
which is the same fidelity trade-off as in (110). Hence, we conclude that the optimal
individual attack for the BB84 is the optimal asymmetric phase-covariant cloning machine.
6. Entanglement-based QKD
Until now we have seen “prepare and measure” protocols for QKD, in which: (i) Alice prepares a quantum state, (ii) sends it to Bob through a quantum channel, and (iii) he measures it. In another type of QKD protocol there is a source of entangled pairs of quantum states between Alice and Bob, which (i) sends one half of each pair to each of them, and (ii) they both perform measurements. Due to the phenomenon of quantum steering, there is an equivalence between prepare-and-measure and entanglement-based protocols.
6.2. Problem 5: the singlet state
Prove that each pair of orthogonal states |φ0 〉, |φ1 〉 ∈ C2 satisfies
6.3. Description of the protocol
In each round i of the distribution phase, Alice randomly selects whether the ith system is used for generating raw key or for estimation. That is, she generates the random variable r_i ∈ {raw, est} with probability P(est) = N^{-1/2} and P(raw) = 1 − N^{-1/2}. If r_i = raw then Alice measures in a fixed basis {|µ₀⟩, |µ₁⟩}, generating outcome x_i. This basis is optimised to yield the maximal correlation with Bob. If r_i = est then Alice randomly selects a basis u_i ∈ {X, Y, Z} with uniform distribution and measures in it. At the same time, Bob generates the independent random variables s_i ∈ {raw, est} and v_i ∈ {X, Y, Z} with the same statistics as those of Alice, and performs the same procedure as Alice.
After the distribution phase, Alice and Bob do the following to each round i:
• If ri = si = est then they keep the data (ui , vi , xi , yi ) to generate the joint statistics
of the process P (x, y|u, v) and reconstruct the joint state ρAB .
• If ri = si = raw then they keep outcomes (xi , yi ) for the raw key.
The fraction of rounds that are thrown away is approximately 2P (est)P (raw) = O(N −1/2 ),
which tends to zero as N grows. On the other hand, the fraction of rounds used for the
raw key is P (raw)P (raw) = O(1).
The difference between this estimation procedure and that of the BB84 is that here we
consider the statistics of the 9 measurement combinations (u, v), while in the BB84 only
the statistics of u = v = Z and u = v = X is considered.
A standard situation is when the source distributes many copies of a 2-qubit singlet state |Φ⁻⟩ = (1/√2)(|01⟩ − |10⟩) with a δ amount of isotropic noise, that is

  ρ_AB = (1 − δ)|Φ⁻⟩⟨Φ⁻| + δ 𝟙/4 .    (153)
where {|β_j⟩} is any orthonormal basis on E. Then, all purifications are equivalent up to this choice of basis. Or, in other words, all purifications are equivalent up to a unitary transformation on the ancillary system E. Due to this equivalence, we talk about the purification as being essentially unique. Note that the purification is an extension.
  tr_E |ψ⟩_AE⟨ψ| = Σ_{jj′} √(λ_j λ_{j′}) |α_j⟩_A⟨α_{j′}| tr[ |β_j⟩_E⟨β_{j′}| ]
                 = Σ_{jj′} √(λ_j λ_{j′}) |α_j⟩_A⟨α_{j′}| δ_{jj′}
                 = Σ_j λ_j |α_j⟩_A⟨α_j| .    (133)
This extension, in addition of being pure, turns out to have a very remarkable property:
any other extension can be obtained from the purification by a local (not necessarily
unitary) operation on Eve’s side.
Theorem 1 (Purification). Let ρ_A be a given state and |ψ⟩_AE its purification. For any extension ρ_AE of ρ_A there is a completely positive trace-preserving map Λ such that

  ρ_AE = (id_A ⊗ Λ)( |ψ⟩_AE⟨ψ| ) .    (134)
Proof. If ρA is not full rank, we redefine the Hilbert space such that this is the case. Let
the purification of ρA be
  |ψ⟩_AE = Σ_j √λ_j |α_j⟩_A ⊗ |β_j⟩_E .    (135)
then, we can write its (unnormalized) eigenvectors in terms of the bases of the purification,

  |φ_i⟩_AE = Σ_{j,k} C^i_{j,k} |α_j⟩_A ⊗ |β_k⟩_E .    (137)

The operators M^i_E = Σ_{j,k} λ_j^{-1/2} C^i_{j,k} |β_k⟩_E⟨β_j| then satisfy

  ( 𝟙_A ⊗ M^i_E ) |ψ⟩_AE = |φ_i⟩_AE .    (139)

This implies that the completely positive map Λ(σ) = Σ_{i=1}^r M^i σ M^{i†} satisfies (134). Now, the only thing that remains to be shown is that Λ is trace-preserving: Σ_i M^{i†} M^i = 𝟙.
i
Using the facts that ρ_AE is an extension of ρ_A, identity (134), and the cyclicity of the trace, we obtain

  ρ_A = tr_E ρ_AE = tr_E [ Σ_i M^i_E |ψ⟩_AE⟨ψ| M^{i†}_E ] = tr_E [ ( Σ_i M^{i†}_E M^i_E ) |ψ⟩_AE⟨ψ| ] .    (140)
is an unnormalized maximally entangled state. This vector has the following property: for any matrix Q we have

  tr_E [ Q_E |Φ⟩_AE⟨Φ| ] = Q_A^T ,    (143)

where Q_A^T is the transpose of Q_A, and this is the same matrix as Q_E but in Alice's space. Putting all together we have

  𝟙_A = ρ_A^{-1/2} ρ_A ρ_A^{-1/2} = tr_E [ ( Σ_i M^{i†}_E M^i_E ) ρ_A^{-1/2} |ψ⟩_AE⟨ψ| ρ_A^{-1/2} ]
      = tr_E [ ( Σ_i M^{i†}_E M^i_E ) |Φ⟩_AE⟨Φ| ] = ( Σ_i M^{i†}_A M^i_A )^T .    (144)
measuring each system Ei separately. The state of Eve conditioned on Alice’s information
is
  ρ_{E|x} = (1/P(x)) _A⟨x| ρ_AE |x⟩_A ,    (145)
where P (x) is the marginal of
Eve’s measurement transforms the quantum information ρE|x into the classical information
z, producing the joint distribution P (x, z), which together with P (x, y) allows to calculate
the efficiency rate
R = H(X|Z) − H(X|Y ) . (147)
So let us calculate this rate when Alice and Bob share N copies of the noisy singlet
Note that the conditional entropy does not change if Bob flips his symbols. The relation with the disturbance is D = δ/2.
Next, we calculate (145). To obtain Eve’s purification, it is convenient to decompose
the identity with the following orthonormal basis
and using P(x = 0) = 1/2 we obtain

  ρ_{E|x=0} = (1/P(x=0)) [ _AB⟨00|ψ⟩_ABE⟨ψ|00⟩_AB + _AB⟨01|ψ⟩_ABE⟨ψ|01⟩_AB ]    (159)
            = (δ/2)|E₃⟩⟨E₃| + ( √(1 − 3δ/4) |E₁⟩ + √(δ/4) |E₂⟩ )( √(1 − 3δ/4) ⟨E₁| + √(δ/4) ⟨E₂| ) ,

  ρ_{E|x=1} = (1/P(x=1)) [ _AB⟨10|ψ⟩_ABE⟨ψ|10⟩_AB + _AB⟨11|ψ⟩_ABE⟨ψ|11⟩_AB ]    (160)
            = (δ/2)|E₄⟩⟨E₄| + ( √(1 − 3δ/4) |E₁⟩ − √(δ/4) |E₂⟩ )( √(1 − 3δ/4) ⟨E₁| − √(δ/4) ⟨E₂| ) .
2
To calculate the optimal guessing probability we need to find the four eigenvalues of the
matrix
which has eigenvalues ±1. Then, the four eigenvalues of [ρ_{E|x=0} − ρ_{E|x=1}] are ±δ/2 and ±2√((δ/4)(1 − 3δ/4)), which give trace distance

  ‖ ρ_{E|x=0} − ρ_{E|x=1} ‖₁ = 2√(δ(1 − 3δ/4)) + δ .
In[ ]:= h[x_] := -x Log[2, x] - (1 - x) Log[2, 1 - x]
        Plot[{h[1/2 - Sqrt[d (1 - d)]] - h[d],
              h[(1 - d)/2 - (1/2) Sqrt[d (2 - 3 d)]] - h[d]},
          {d, 0, .5}]
7. General attacks
Suppose that Alice and Bob share N copies of the state ρAB with purification |ψABE 〉, so
the global state is
  ⊗_{i=1}^{N} |ψ_{A_i B_i E_i}⟩ ,    (168)
where f : {0, 1}N → {0, 1}Nkey is a complicated function. For example, in the simple case
k = x1 ⊕ x2 , (170)
the relevant information is whether x1 = x2 or x1 ∕= x2 , not the individual values of x1
or x₂. And there might be a joint measurement of E₁, E₂ which provides better information about k. In any case, a joint measurement will never do worse, since it includes local
measurements as a particular case. Of course, this extra information that Eve gets by
doing a joint measurement can be “hashed out” by Alice and Bob, by doing more privacy
amplification, that is, by making the final key shorter.
Theorem. The optimal secret key rate of one-way public communication protocols (Alice
→ Bob) is
R→ = I(X : Y ) − I(X : E) , (171)
where

  I(X : E) = S(E) − S(E|X) = S(ρ_E) − Σ_x P(x) S(ρ_{E|x}) ,    (172)
and S(ρ) = −tr(ρ log₂ ρ) denotes the von Neumann entropy.
Above we write the symbol E instead of Z to denote that Eve’s system is quantum,
because it is not measured yet. That is, the joint state of Alice and Eve is classic-quantum
(cq-state)
  ρ_XE = Σ_x P(x) |x⟩_X⟨x| ⊗ ρ_{E|x} .    (173)
We use the notation where X, Y, Z are classical systems and A, B, E are quantum. The fact that, in the rate formula, Eve's system is quantum simplifies our task, because we do not need to find her best measurement. In other contexts I(X : E) is called the Holevo bound, and is denoted by χ(X : E).
There are other protocols with a higher secret key rate than (171) in the high-noise regime; they exploit two-way communication between Alice and Bob. However, we do not study these protocols in this course.
  ρ_E = (1 − 3δ/4)|E₁⟩⟨E₁| + (δ/4)|E₂⟩⟨E₂| + (δ/4)|E₃⟩⟨E₃| + (δ/4)|E₄⟩⟨E₄| ,    (179)

and

  S(ρ_E) = −(1 − 3δ/4) log(1 − 3δ/4) − (3δ/4) log(δ/4) .    (180)
This, together with I(X : Y) = 1 − h(δ/2), gives a secret key rate of

  R = I(X : Y) − S(ρ_E) + Σ_x P(x) S(ρ_{E|x})    (181)
    = 1 − h(δ/2) − S(ρ_E) + h(δ/2) = 1 + (1 − 3δ/4) log(1 − 3δ/4) + (3δ/4) log(δ/4) .    (182)
In the next plot we compare the secret key rate against individual (blue) and general
(amber) attacks as a function of the disturbance δ. Obviously, the general attack provides
more information to the adversary, and hence, less key rate for the honest parties.
In[ ]:= h[x_] := -x Log[2, x] - (1 - x) Log[2, 1 - x]
        Plot[{h[(1 - d)/2 - (1/2) Sqrt[d (2 - 3 d)]] - h[d],
              1 + (1 - 3 d/2) Log[2, 1 - 3 d/2] + (3 d/2) Log[2, d/2]},
          {d, 0, .5}]
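The disturbances at which each rate vanishes can be found numerically; a Python sketch (not part of the original notes; d = D is the disturbance and δ = 2d):

from math import log2, sqrt

def h(p):
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

def r_individual(d):
    # key rate against individual attacks (blue curve above)
    return h((1 - d) / 2 - 0.5 * sqrt(d * (2 - 3 * d))) - h(d)

def r_general(d):
    # key rate against general attacks, eq. (182) with delta = 2 d
    delta = 2 * d
    return 1 + (1 - 3 * delta / 4) * log2(1 - 3 * delta / 4) \
             + (3 * delta / 4) * log2(delta / 4)

def root(f, lo=1e-6, hi=0.33):
    # bisection, assuming f(lo) > 0 > f(hi) and f decreasing
    for _ in range(80):
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if f(mid) > 0 else (lo, mid)
    return lo

print(root(r_individual), root(r_general))   # general attacks tolerate less noise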
C. Is it possible to distill a secret key from this state, which is secure against general
attacks? That is, does formula
R ≥ I(X : Y ) − I(X : E) , (186)
give a positive number?
7.3. De-Finetti Theorem
In the previous section we considered general attacks with the i.i.d. assumption on Alice
and Bob’s reduced state ρ⊗N AB . However, in order to prove full security, no assumption
should be made. To drop this assumption we use the quantum exponential de-Finetti
theorem.
Something relevant is that the protocol applied by the honest parties treats each of the
N pairs of systems (or rounds) on equal footing. Hence, nothing changes if, before the
protocol, Alice and Bob apply on their pairs of systems a random permutation, and later
forget the value of this permutation. This would ensure that the reduced state of the
honest parties ρ_{A^N B^N} is symmetric with respect to the exchange of pairs. Note that, since Eve knows which permutation has been applied, the global state ρ_{A^N B^N E} is not necessarily
symmetric. But when dealing with the reduced state for Alice and Bob we can apply the
following.
This tells us that, after discarding a small fraction of the N states, the remaining state is
approximately i.i.d. So the analysis of Section 7 holds in general. Hence, the most general
attack allows Alice and Bob to obtain the secret key rate given in formula (171).
8. Non-local correlations
In this section we develop a formalism that allows us to construct QKD protocols by only looking at the statistics P(a, b|x, y), ignoring the quantum model that produces those statistics,

  P(a, b|x, y) = tr[ A^a_x ⊗ B^b_y ρ_AB ] .    (188)

(Here, x, y label the different observables that Alice and Bob can perform, so that, for each x, the operators {A^1_x, A^2_x, ...} constitute a POVM, Σ_a A^a_x = 𝟙, and analogously for {B^1_y, B^2_y, ...}.) In this framework, Alice and Bob look at their devices as black boxes.
The causal structure of these correlations is represented in the following diagram, where
the common cause Λ is the shared state ρAB .
The advantage of this framework is that security does not rely on having a correct de-
scription/model of the devices, as long as they generate a useful statistics P (a, b|x, y). As
we will see below, the security of these protocols does not even rely on the validity of
quantum mechanics.
is independent of x (and the other way around). The fact that this is a set of linear constraints implies that: if P₁(a, b|x, y) and P₂(a, b|x, y) are non-signaling then so is qP₁(a, b|x, y) + (1 − q)P₂(a, b|x, y) for any 0 ≤ q ≤ 1. In other words, the set of non-
signaling correlations is convex. This convex set turns out to have a finite number of
extreme points (it is a polytope), but describing them is in general hard (see below). A
polytope can always be defined in two dual ways: in terms of its generators {e1 , . . . , en },
or in terms of linear inequalities {c1 , . . . , cm }:
  P = conv{e₁, ..., e_n} = { Σ_i p_i e_i : p_i ≥ 0, Σ_i p_i = 1 }    (192)
    = { x : c₁·x ≤ 1, ..., c_m·x ≤ 1 } .    (193)
Note that any inequality c·x ≤ c₀ can be written as (1/c₀) c·x ≤ 1, as long as c₀ ≠ 0. This last condition can be guaranteed by translating the origin of coordinates to the interior of the polytope.
Local Correlations, also called classical or locally causal, are the ones that can be written
as

  P(a, b|x, y) = Σ_λ P(λ) P(a|x, λ) P(b|y, λ) .    (194)

If P(a, b|x, y) cannot be written in this way we say that it is non-local (and violates Bell inequalities). The set of local distributions forms a convex set generated by the extreme points

  P_{fg}(a, b|x, y) = δ^a_{f(x)} δ^b_{g(y)} ,    (195)

for all functions f : X → A and g : Y → B. To see this, note that every single-site distribution P(a|x) can be written as a mixture of distributions of the form δ^a_{f(x)}, that is

  P(a|x, λ) = Σ_f P(f|λ) δ^a_{f(x)} ,    (196)
and analogously for Bob. For example,

  P(a|x) = [ .8  .4 ]      [ 1  0 ]      [ 1  1 ]      [ 0  1 ]
           [ .2  .6 ] = .6 [ 0  1 ] + .2 [ 0  0 ] + .2 [ 1  0 ] .    (197)
(Note that this decomposition is not unique.) This decomposition implies that any local
distribution (194) can be written as
  P(a, b|x, y) = Σ_λ P(λ) P(a|x, λ) P(b|y, λ)
               = Σ_{λ,f,g} P(λ) P(f|λ) δ^a_{f(x)} P(g|λ) δ^b_{g(y)}
               = Σ_{f,g} P(f, g) δ^a_{f(x)} δ^b_{g(y)} ,    (198)
The above shows that any local distribution can be written as a mixture of elements of the form (195), showing that they generate the local polytope. The fact that the distributions (195) have one entry equal to one and the rest equal to zero implies that they cannot be written as mixtures; hence they are extreme points. The local polytope that we are studying has the distributions (195) as generators, and its inequalities are the Bell inequalities. Bell inequalities are in general not known, but below we construct them in a simple scenario.
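A quick check of the decomposition (197) (a NumPy sketch, not part of the original notes; rows are the outcome a = 0, 1 and columns the input x = 0, 1):

import numpy as np

P = np.array([[0.8, 0.4],      # P(a|x) from (197)
              [0.2, 0.6]])

# deterministic single-site strategies a = f(x), the extreme points in (196)
f_id   = np.array([[1, 0], [0, 1]])    # f(x) = x
f_zero = np.array([[1, 1], [0, 0]])    # f(x) = 0
f_flip = np.array([[0, 1], [1, 0]])    # f(x) = 1 - x

mix = 0.6 * f_id + 0.2 * f_zero + 0.2 * f_flip
print(np.allclose(mix, P))             # True: P(a|x) is a mixture of
                                       # deterministic response functions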
Quantum Correlations P (a, b|x, y) are the ones for which there exists a state ρAB and
measurements Aax , Byb such that
  P(a, b|x, y) = tr[ A^a_x ⊗ B^b_y ρ_AB ] ,    (200)

where obviously A^a_x, B^b_y ≥ 0, Σ_a A^a_x = 𝟙_A and Σ_b B^b_y = 𝟙_B for all x, y. Note that the
dimension of the Hilbert space is not fixed, and it could be infinite. The set of quantum
correlations also forms a convex set, but this has infinitely many extreme points, hence it
is not a polytope. The extreme points of the quantum convex set are in general not known.
Next we study a particular case where the quantum set is completely characterized.
Binary Correlators. The simplest case where we can analyze these sets of correlations is: two parties with two dichotomic observables (a, b, x, y ∈ {0, 1}). To simplify the mathematical structure even more, instead of considering the 2⁴ = 16 numbers P(a, b|x, y), we consider the 4 numbers
  C_{x,y} = Σ_{a,b} (−1)^a (−1)^b P(a, b|x, y)    (201)
          = prob{a = b|x, y} − prob{a ≠ b|x, y} ∈ [−1, 1] ,    (202)
for all x, y. All possible vectors (C₀₀, C₀₁, C₁₀, C₁₁) ∈ [−1, 1]^{×4} are non-signaling, since they can be achieved with the following distribution

  P(a, b|x, y) = (1 + C_{xy})/4   if a = b ,
  P(a, b|x, y) = (1 − C_{xy})/4   if a ≠ b ,    (203)
which has uniform marginals for Alice and Bob P (a|x) = P (b|y) = 1/2 for all a, b, x, y.
Since these marginals do not depend on x, y the non-signaling constraints are satisfied.
It is known that the correlations (C00 , C01 , C10 , C11 ) are local if and only if they satisfy
the 8 CHSH inequalities:
−2 ≤ C00 + C01 + C10 − C11 ≤ 2 (204)
−2 ≤ C00 + C01 − C10 + C11 ≤ 2 (205)
−2 ≤ C00 − C01 + C10 + C11 ≤ 2 (206)
−2 ≤ −C00 + C01 + C10 + C11 ≤ 2 (207)
The correlations (C00 , C01 , C10 , C11 ) are quantum if and only if they satisfy the 8 non-linear
inequalities:
−π ≤ arcsin C00 + arcsin C01 + arcsin C10 − arcsin C11 ≤ π (208)
−π ≤ arcsin C00 + arcsin C01 − arcsin C10 + arcsin C11 ≤ π (209)
−π ≤ arcsin C00 − arcsin C01 + arcsin C10 + arcsin C11 ≤ π (210)
−π ≤ − arcsin C00 + arcsin C01 + arcsin C10 + arcsin C11 ≤ π (211)
This is proven in L. Masanes, Necessary and sufficient condition for quantum-generated cor-
relations, quant-ph/0309137.
A linearization of the above is the famous Cirelson’s Bound: the correlations (C00 , C01 , C10 , C11 )
are quantum only if they satisfy the 8 linear inequalities:
  −2√2 ≤ C₀₀ + C₀₁ + C₁₀ − C₁₁ ≤ 2√2    (212)
  −2√2 ≤ C₀₀ + C₀₁ − C₁₀ + C₁₁ ≤ 2√2    (213)
  −2√2 ≤ C₀₀ − C₀₁ + C₁₀ + C₁₁ ≤ 2√2    (214)
  −2√2 ≤ −C₀₀ + C₀₁ + C₁₀ + C₁₁ ≤ 2√2    (215)
Note that the above is necessary but not sufficient. In order to visualize the non-signaling,
classical and quantum sets we plot the subsets satisfying the constraint C11 = 1:
The boundaries of these sets are given by the above inequalities (when C11 = 1), in
particular, the facets of the tetrahedron are the CHSH inequalities. And the corners of
the cube outside the tetrahedron are the maximally non-local distributions, which are
called PR-boxes:

  P_PR(a, b|x, y) = 1/2   if a ⊕ b = xy ,
  P_PR(a, b|x, y) = 0     otherwise .    (216)
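As a numerical illustration (a sketch, not part of the original notes), the CHSH combination C₀₀ + C₀₁ + C₁₀ − C₁₁ equals 4 for the PR-box (216) and 2√2 for singlet correlators C = −cos(θ_A − θ_B) at suitably chosen angles:

import numpy as np

def C_PR(x, y):
    # correlator of the PR-box (216): perfect (anti)correlation
    return -1.0 if (x, y) == (1, 1) else 1.0

def C_singlet(ax, by):
    # singlet correlator for measurement directions at angles ax, by
    return -np.cos(ax - by)

def chsh(C):
    return C(0, 0) + C(0, 1) + C(1, 0) - C(1, 1)

print(chsh(C_PR))                                   # 4, the no-signaling maximum
A = {0: 0.0, 1: np.pi / 2}                          # example angles for Alice
B = {0: -3 * np.pi / 4, 1: 3 * np.pi / 4}           # example angles for Bob
print(chsh(lambda x, y: C_singlet(A[x], B[y])))     # 2*sqrt(2) ~ 2.83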
In the general case (C00 , C01 , C10 , C11 ) there are eight PR-boxes, connected by relabelings
a → 1 − a, and the same for x, y. For example, another PR-box is
  P_PR(a, b|x, y) = 1/2   if (1 − a) ⊕ b = x(1 − y) ,
  P_PR(a, b|x, y) = 0     otherwise .    (217)
Note that relabeling b is the same as relabeling a. In the above picture there are only 4 of the 8 PR-boxes; the other 4 have C₁₁ = −1. Apart from the eight PR-boxes there are 16 local extreme points:
Although the above only applies when Alice and Bob share a PR-box, we see that in this case Eve cannot even be classically correlated with Alice and Bob. This is analogous to what happens with a singlet. In the context of key distribution, the above implies that if Alice and Bob share a PR-box then their correlations are necessarily secret (the same as with a singlet). This is one manifestation of the monogamy of non-local correlations. Let us see a different one.
Definition 3. We say that the distribution P (a, b|x, y) is 2-shareable with respect to Bob
if there is a 3-party non-signaling distribution P (a, b1 , b2 |x, y1 , y2 ) such that
  Σ_{b₁} P(a, b₁, b|x, y₁, y) = P(a, b|x, y) ,    (222)
  Σ_{b₂} P(a, b, b₂|x, y, y₂) = P(a, b|x, y) .    (223)
Theorem 4. Any 2-shareable distribution satisfies all Bell inequalities with two measure-
ments on Bob’s side.
Proof. Given P (a, b|x, y), assuming the existence of P (a, b1 , b2 |x, y1 , y2 ), and using (222)
and (223) we can write
  P(a, b|x, y) = Σ_{b₁,b₂} P(b₁, b₂|0, 1) P(a|x, b₁, b₂, 0, 1) δ^b_{(b₁ δ⁰_y + b₂ δ¹_y)} .    (224)
The above can be easily generalized to the following statement: Any k-shareable distri-
bution satisfies all Bell inequalities with k measurements on Bob’s side. What happens
then with ∞-shareable distributions? They satisfy all Bell inequalities. Conversely, local
distributions are ∞-shareable: If
  P(a, b|x, y) = Σ_λ P(λ) P(a|x, λ) P(b|y, λ) ,    (226)
We end this subsection by proving our general monogamy result for the binary case. But before this, we need to introduce a symmetric family of correlations and a protocol which allows Alice and Bob to symmetrize their correlations without losing non-locality.
The family of symmetric correlations is
  P_ν(a, b|x, y) = (1 − ν)/2   if a ⊕ b = xy ,
  P_ν(a, b|x, y) = ν/2          otherwise ,    (228)
where ν ∈ [0, 1] is a parameter. We can write them as noisy PR-boxes with noise parameter
ν ∈ [0, 1],
  P_ν(a, b|x, y) = (1 − 2ν) P_PR(a, b|x, y) + 2ν (1/4) .    (229)
These correlations are analogous to the noisy singlet (153) that we studied in QKD. The
symmetry of these correlations is manifested in the fact that
• Classical: 1/4 ≤ ν.
• Quantum: (1 − 1/√2)/2 ≈ .146 ≤ ν < 1/4 = .25.
• Beyond: 0 ≤ ν < (1 − 1/√2)/2.
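These thresholds follow from the CHSH value of P_ν, which is 4(1 − 2ν); a small sketch (not part of the original notes):

from math import sqrt

def chsh_of_nu(nu):
    # each correlator of P_nu equals 1 - 2*nu up to the CHSH sign pattern,
    # so the CHSH combination is 4 * (1 - 2*nu)
    return 4 * (1 - 2 * nu)

nu_classical = 1 / 4                  # CHSH = 2, the local bound
nu_quantum = (1 - 1 / sqrt(2)) / 2    # CHSH = 2*sqrt(2), Cirelson's bound
print(chsh_of_nu(nu_classical), chsh_of_nu(nu_quantum), nu_quantum)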
Next, we describe a 3-step protocol for transforming any P (a, b|x, y) to the symmetric
distribution Pν (a, b|x, y) having the same CHSH violation.
First step: with probability 1/2 Alice and Bob do
• nothing
• flip a and b.
Second step: with probability 1/2 Alice and Bob do
• nothing
• flip y and a_{x=1}
Third step: with probability 1/2 Alice and Bob do
• nothing
• flip x and b_{y=1}
and they forget what they did. Importantly, all of the above operations are local, and they only require (3 bits of) shared classical randomness. It is easy to check that the violation of the CHSH is left invariant by this protocol (can you prove it?). Now we are ready to show our general result on the monogamy of non-local correlations for the binary case.
Theorem 5. Consider P (a, b, e|x, y, z) where all the variables are binary a, b, e, x, y, z ∈
{0, 1}. Then, at most one of the marginals P (a, b|x, y) or P (a, e|x, z) is non-local.
Proof. Let us prove this by contradiction. Suppose that the two marginals violate CHSH.
Then with local operations we can symmetrize the two marginals, transforming them into
noisy PR-boxes P_ν(a, b|x, y) and P_{ν′}(a, e|x, z) which still violate CHSH. Now, without loss of generality, suppose that P_ν(a, b|x, y) violates CHSH more than P_{ν′}(a, e|x, z), that is ν ≤ ν′. Then Bob can always add a small amount of noise to decrease the Alice-Bob violation, until it becomes equal to that of Alice-Eve, ν ↦ ν′. This is done by Bob
flipping b with a suitable probability. This transformation would generate a 2-shareable
distribution. This, together with the fact that all inputs are binary, implies that both
marginals are local. Obviously, the same conclusion can be obtained for the other two
pairs of marginals.
9. Device-independent QKD
There are two formalisms for device-independent QKD. In the first one (which we consider
next), only the no-signaling principle is assumed. Hence, the adversary is not restricted
by the laws of quantum mechanics. In the second one (which we do not consider in
this course), the geometry of quantum correlations (e.g. (208)) is assumed, but not the
particular states and measurements involved in the protocol.
satisfies
Conclusion: If Alice and Bob share a non-signaling distribution P (a, b|x, y) and an
adversary can predict the outcome of one of them, then P (a, b|x, y) cannot violate any
Bell inequality.
Now, let us prove that in the optimal attack the conditionals {P (a, b|x, y, e, z)}e are ex-
treme points in the Alice-Bob polytope. To show this, assume the opposite
  P(a, b|x, y, e, z) = Σ_i P(i|e, z) P_i(a, b|x, y, e, z) ,    (238)
where Pi (a, b|x, y, e, z) are extremal for all i. Note that the new distribution
is non-signaling (can you check it?), and it becomes the original one P (a, b, e|x, y, z) when
tracing out i, and has extremal conditional distributions P (a, b|x, y, z, e, i) for all (e, i).
Hence, if the adversary, in addition to having e, also has i, she can always ignore i and recover her original correlations with Alice and Bob.
Also, if the Alice-Bob marginal (237) violates CHSH, then one of the extreme points {P(a, b|x, y, e, z)}_e must be a PR-box. This can be interpreted in the following way. During
the performance of the protocol, some of the pairs of Alice and Bob will be in the PR-box
state (216); although Alice and Bob do not know which of the pairs. In such pairs, Eve
knows nothing about the measurement outcomes of Alice and Bob. Hence, she wants to
minimise the probability of PR-box. In order to do so, the only non-local extreme point
has to be the PR-box (216), and the only local extreme points must be the ones which
saturate CHSH.
Now, we recall that Alice and Bob can always transform their distribution into a sym-
metric one,

  P_ν(a, b|x, y) = (1 − ν)/2   if a ⊕ b = xy ,
  P_ν(a, b|x, y) = ν/2          otherwise ,    (240)
without losing CHSH violation. Note that ν = D is the disturbance. For what follows it is
convenient to write Pν (a, b|x, y) as a mixture of a PR-box and the symmetric distribution
saturating the CHSH inequality. Using C = 1 − 2ν we get
  P_ν(a, b|x, y) = (2C − 1) P_{ν=0}(a, b|x, y) + (2 − 2C) P_{ν=1/4}(a, b|x, y)    (241)
                 = (1 − 4ν) P_{ν=0}(a, b|x, y) + 4ν P_{ν=1/4}(a, b|x, y)    (242)
                 = (1 − 4ν) P_PR(a, b|x, y) + 4ν (1/8) Σ_{t=1}^{8} δ^a_{f_t(x)} δ^b_{g_t(y)} ,    (243)
where the 8 local points a = ft (x) and b = gt (y) are the ones which saturate the CHSH
inequality (you found them in Problem 5).
Therefore, redefining (e, i) → e we can write the global distribution corresponding to
the optimal attack
  P(a, b, e|x, y) = (1 − 4ν) P_PR(a, b|x, y) δ^0_e + 4ν (1/8) Σ_{t=1}^{8} δ^a_{f_t(x)} δ^b_{g_t(y)} δ^t_e .    (244)
The above tripartite correlations occur for one particular value of z. But since these correlations are optimal from Eve's point of view, she is not going to use other possible values of z; hence, we have omitted them in the above expression. In other words: in order to implement the optimal attack, Eve does not need a machine with multiple inputs.
From (244) we see that, if Eve knows x then she also knows a with probability 4ν, and knows nothing with probability (1 − 4ν). This gives H(A|E, X) = 1 − 4ν, while H(A|B) = h(ν), and hence the secret key rate R_simple = 1 − 4ν − h(ν) of the simple protocol quoted in Problem 9 below.
9.3. Problem 9: alternative protocol for no-signaling QKD
We know that the optimal attack is
  P(a, b, e|x, y, z) = (1 − 4ν) P_PR(a, b|x, y) δ^0_e + 4ν (1/8) Σ_{t=1}^{8} δ^a_{f_t(x)} δ^b_{g_t(y)} δ^t_e ,    (250)
where ν = D is the disturbance. Now, suppose that Bob publishes y and Alice does not
publish x. Instead, she transforms the raw key as
a 7→ a′ = a ⊕ xy , (251)
A. Can you calculate H(A′ |B) and H(A′ |E, Y )? For this you need the solution of
Problem 5.
B. Can you write the secret key rate r and compare it with the one of the simple
protocol shown above Rsimple = 1 − 4ν − h(ν)? Which protocol is better?
Note that we are not imposing any restriction on the dimensionality of the Hilbert spaces.
Define the indexed Hermitian matrix
where

  α = tr( ρ_AB [A₀A₁ ⊗ 𝟙] ) ,   β = tr( ρ_AB [𝟙 ⊗ B₀B₁] ) ,    (256)
and ᾱ, β̄ are their complex conjugates. Next, we show that the matrix Q is always positive semi-definite. For any vector v ∈ C⁴ we can define the matrix V = Σ_i v_i M_i and note that

  Σ_{ij} v̄_i Q_{ij} v_j = tr( ρ_AB V†V ) ≥ 0 .    (257)
Recall that any matrix of the form V † V is positive semi-definite. We also know that if Q
is positive then so is its complex conjugate Q̄ and their average
  Q̃ = (1/2)(Q + Q̄) = [ 1      re α   C₀₀    C₀₁  ]
                      [ re α   1      C₁₀    C₁₁  ]
                      [ C₀₀    C₁₀    1      re β ]
                      [ C₀₁    C₁₁    re β   1    ]    (258)
Theorem 6. If the correlations Cxy are quantum, then there are two real numbers α, β
such that the matrix

  Q = [ 1     α     C₀₀   C₀₁ ]
      [ α     1     C₁₀   C₁₁ ]
      [ C₀₀   C₁₀   1     β   ]
      [ C₀₁   C₁₁   β     1   ]    (259)
is positive semi-definite.
Let us solve this positivity condition for the simple case of a symmetric distribution C00 =
C₀₁ = C₁₀ = −C₁₁ = C. The four eigenvalues of

  Q = [ 1    α    C    C  ]
      [ α    1    C   −C  ]
      [ C    C    1    β  ]
      [ C   −C    β    1  ]    (260)

are

  1 ± (1/√2) √( α² + β² + 4C² ± √( (α² − β²)² + 8(α² + β²)C² ) ) .    (261)
The positivity of all of them requires

  α² + β² + 4C² ± √( (α² − β²)² + 8(α² + β²)C² ) ≤ 2 ,    (262)

or equivalently

  4(α²β² + 4C⁴) ≤ 4 .    (264)

Minimizing the left-hand side with respect to α, β we obtain

  4(4C⁴) ≤ 4 ,    (265)

that is, |C| ≤ 2^{-1/2}, which for these symmetric correlations is Cirelson's Bound C₀₀ + C₀₁ + C₁₀ − C₁₁ = 4C ≤ 2√2 .    (267)
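This boundary can be checked numerically; a NumPy sketch (not part of the original notes) builds Q from (260) with α = β = 0 and tests positive semi-definiteness:

import numpy as np

def Q(C, alpha=0.0, beta=0.0):
    # the matrix (260) for symmetric correlations C00 = C01 = C10 = -C11 = C
    return np.array([[1,     alpha,  C,     C],
                     [alpha, 1,      C,    -C],
                     [C,     C,      1,     beta],
                     [C,    -C,      beta,  1]])

for C in (0.5, 1 / np.sqrt(2), 0.72):
    eigs = np.linalg.eigvalsh(Q(C))
    print(C, eigs.min() >= -1e-9)    # True, True (the boundary), False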
9.5. Problem 10: quantum correlations
Is there any distribution P (a, b|x, y) with a, b, x, y ∈ {0, 1} which does not violate Cirelson’s
Bound (212-215) but lies outside the quantum set? Hint: notice that the distribution
  C₀₀ = 1/√2 + ε ,   C₀₁ = 1/√2 − ε ,    (268)
  C₁₀ = 1/√2 ,        C₁₁ = −1/√2 ,    (269)

saturates Cirelson's Bound (267). Substitute this in (259) and check the non-positivity of Q with perturbation theory to leading order in the small parameter 0 < ε ≪ 1.
Appendices
The information gain on the X basis is quantified by the statistical distance between these
two probability distributions
  δ_X = (1/2) Σ_i | p(i|+) − p(i|−) | .    (272)
Disturbance on the Z basis is the average probability of Bob obtaining the wrong
outcome
  D_Z = (1/2) ⟨Φ₀| ( 𝟙_E ⊗ |1⟩⟨1| ) |Φ₀⟩ + (1/2) ⟨Φ₁| ( 𝟙_E ⊗ |0⟩⟨0| ) |Φ₁⟩ .    (273)
Here and in what follows, we omit the system labels E and B when the expressions are
clear enough. Note that DZ ∈ [0, 1/2], otherwise Bob could re-define |a〉 and decrease
DZ . Also note that DZ only depends on the interaction U , or equivalently the vectors
|Φ0 〉, |Φ1 〉, but does not depend on the measurement Qi . We can analogously define the
disturbance on the X basis DX .
Theorem 7 (Information vs disturbance, statistical distance). The information that Eve gains in one basis constrains the disturbance that she produces in the conjugate basis:

  δ_X ≤ 2√( D_Z(1 − D_Z) ) ,    (274)
  δ_Z ≤ 2√( D_X(1 − D_X) ) .    (275)
Proof. By the linearity of the transformation (65), |±⟩ = (1/√2)(|0⟩ ± |1⟩) implies

  |Φ_±⟩ = (1/√2)( |Φ₀⟩ ± |Φ₁⟩ ) .    (276)
Using this and |re x| ≤ |x| we obtain

  δ_X = (1/2) Σ_i | ⟨Φ₊| Q_i ⊗ 𝟙 |Φ₊⟩ − ⟨Φ₋| Q_i ⊗ 𝟙 |Φ₋⟩ |
      = (1/4) Σ_i | (⟨Φ₀| + ⟨Φ₁|) Q_i ⊗ 𝟙 (|Φ₀⟩ + |Φ₁⟩) − (⟨Φ₀| − ⟨Φ₁|) Q_i ⊗ 𝟙 (|Φ₀⟩ − |Φ₁⟩) |
      = (1/2) Σ_i | ⟨Φ₀| Q_i ⊗ 𝟙 |Φ₁⟩ + ⟨Φ₁| Q_i ⊗ 𝟙 |Φ₀⟩ |
      = Σ_i | re ⟨Φ₀| Q_i ⊗ 𝟙 |Φ₁⟩ |
      ≤ Σ_i | ⟨Φ₀| Q_i ⊗ 𝟙 |Φ₁⟩ |
      = Σ_i | ⟨Φ₀| ( Q_i ⊗ |0⟩⟨0| ) |Φ₁⟩ + ⟨Φ₀| ( Q_i ⊗ |1⟩⟨1| ) |Φ₁⟩ | ,    (277)
where we have inserted |0⟩⟨0| + |1⟩⟨1| = 𝟙 on system B. The positivity of Q_i implies the positivity of √Q_i. Hence, we can define the family of unnormalised vectors

  |ψ^i_{aa′}⟩ = √Q_i ⊗ |a′⟩⟨a′| |Φ_a⟩ ,    (278)
where we have used the triangle inequality |x + y| ≤ |x| + |y|. Using the Cauchy-Schwarz inequality |⟨α|β⟩|² ≤ ⟨α|α⟩⟨β|β⟩ we obtain

  δ_X ≤ Σ_i [ √( ⟨ψ^i₀₀|ψ^i₀₀⟩ ⟨ψ^i₁₀|ψ^i₁₀⟩ ) + √( ⟨ψ^i₀₁|ψ^i₀₁⟩ ⟨ψ^i₁₁|ψ^i₁₁⟩ ) ] .    (280)
i
is the probability of Bob obtaining outcome a′ and Eve obtaining outcome i when Alice
prepares a.
Now, note that (x − y)² ≥ 0 implies xy ≤ (1/2)(x² + y²), therefore

  √( p(i|0) p(i|1) ) ≤ (1/2) p(i|0) + (1/2) p(i|1) = p(i) .    (282)
Using this and Bayes' rule we obtain

  δ_X ≤ Σ_i [ √( p(0, i|0) p(0, i|1) ) + √( p(1, i|0) p(1, i|1) ) ]
      = Σ_i √( p(i|0) p(i|1) ) [ √( p(0|i, 0) p(0|i, 1) ) + √( p(1|i, 0) p(1|i, 1) ) ]
      ≤ Σ_i p(i) [ √( p(0|i, 0) p(0|i, 1) ) + √( p(1|i, 0) p(1|i, 1) ) ]
      = Σ_i p(i) [ √( (1 − D₀ⁱ) D₁ⁱ ) + √( D₀ⁱ (1 − D₁ⁱ) ) ] ,    (283)

where we have defined D_aⁱ as the probability of Bob getting the wrong result conditioned on Alice preparing a and Eve obtaining outcome i:
In BB84 Alice prepares states in the bases Z and X with the same probability. Then, the average statistical distance and disturbance are

  δ = (1/2)(δ_X + δ_Z) ,    (288)
  D = (1/2)(D_X + D_Z) .    (289)
Using again the concavity of the function √(x(1 − x)) we obtain the bound for the averaged quantities

  δ ≤ 2√( (1 − D) D ) .    (290)
Equation (51) gives a relation between the statistical distance δ and the error probability ε, which translates to

  ε ≥ 1/2 − √( (1 − D) D ) .    (291)
This provides the (famous) information/disturbance tradeoff.
In[2]:= Plot[.5 - Sqrt[e (1 - e)], {e, 0, .5}]
In analogy to our previous results (Theorem 1), we can obtain information vs disturbance
tradeoffs where information gain is quantified by the mutual information.
Bound (297) tells us that the attack constructed in subsection 4.4 is optimal.