Logical Time
in DISTRIBUTED SYSTEMS
Michel RAYNAL
Institut Universitaire de France
Academia Europaea
© M. Raynal
Logical time in distributed systems 1
Companion Book (1)
Companion Book (2): content (six parts)
Contents
Part I
SCALAR/LINEAR TIME
- Lamport, L., Time, Clocks and the Ordering of Events in a Distributed System.
Communications of the ACM, 21(7):558-565, 1978
Aim
The fundamental constraint
Lamport clocks (1978)
Sending rule:
when sending a message m to pj :
hi ← hi + 1; % date of the send event %
send (m, hi) to pj
Receiving Rule:
when receiving a message (m, h) from pj :
hi ← max(hi, h);
hi ← hi + 1 % date of the receive event %
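The two rules above can be sketched in Python as follows. This is a minimal illustration, not the author's code: the class name LamportClock and the method names are assumptions, and only the send and receive rules shown above are modeled.

```python
class LamportClock:
    """Scalar clock h_i of one process (hypothetical helper class)."""

    def __init__(self):
        self.h = 0

    def send(self, m):
        # Sending rule: increment first, the new value dates the send event.
        self.h += 1
        return (m, self.h)

    def receive(self, stamped):
        # Receiving rule: catch up with the sender, then date the receive event.
        m, h = stamped
        self.h = max(self.h, h)
        self.h += 1
        return m


p1, p2 = LamportClock(), LamportClock()
p1.send("a")                 # p1 dates this send 1
msg = p1.send("b")           # p1 dates this send 2
p2.receive(msg)              # p2 jumps from 0 to max(0, 2) + 1
reply = p2.send("c")
p1.receive(reply)
```

Because a receive event is always dated after the corresponding send event, the date of an event never decreases along a causal path.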
Illustration
[Figure: space-time diagram of three processes p1, p2, p3 with their clocks h1, h2, h3; each event is labeled with its date]
Observation: (date(e) = x) ⇔ there are x events on the longest causal path ending at e
Build a total order on all the events
Total order definition
(e −TO→ f) ≝ (h < k) ∨ ((h = k) ∧ (i < j))
where (h, i) and (k, j) are the timestamps of e and f
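As a small sketch of this definition, assuming that an event is represented only by its timestamp (date, process identity), the relation is the lexicographic order on pairs. The function name precedes and the sample timestamps are illustrative choices.

```python
def precedes(e, f):
    """True if e comes before f in the total order; e = (h, i), f = (k, j)."""
    (h, i), (k, j) = e, f
    return h < k or (h == k and i < j)


# Five sample timestamps (date, process identity).
events = [(2, 1), (3, 1), (1, 2), (2, 2), (4, 2)]

# Python compares tuples lexicographically, which coincides with precedes().
ordered = sorted(events)
```

Ties on the date are broken by the process identity, so any two distinct events are ordered.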
Illustration
[Figure: two processes p1 and p2 whose events carry the timestamps (2, 1), (3, 1) on p1 and (1, 2), (2, 2), (4, 2) on p2, with the global states from Σinit = [0, 0] to Σfinal = [2, 3]]
A theorem on the space of scalar clocks
• Let C be the set of all the scalar clock systems that are
consistent (with respect to the causality relation)
• Let e and f be any two events of a distributed execution
• ∀C ∈ C: e −ev→ f ⇒ dateC(e) < dateC(f) (Consistency)
• e || f ⇔ ∃C ∈ C : dateC(e) = dateC(f)
• Or equivalently
e || f ⇔ ∃C1, C2 ∈ C :
dateC1(e) ≤ dateC1(f) ∧ dateC2(e) ≥ dateC2(f)
Part II
The Mutex problem
[Figure: the three states of a process: out, asking, in]
The Mutex problem: definition
• Definition
• Algorithms
- Raynal M., Algorithms for mutual exclusion. The MIT Press, 1986
- Anderson J., Kim Y.-J. and Herman T., Shared-memory mutual exclusion: major
research trends since 1986. Distributed Computing, 16(2-3): 75-110, 2003
- Taubenfeld G., Synchronization Algorithms and Concurrent Programming.
Pearson/Prentice Hall, 2006.
Individual permissions: principles
Granting a permission
[Figure: pi moves from statei = out to statei = asking; pj, with statej = out, sends perm to pi; pk has statek ≠ out]
From mechanisms to properties
• Safety: ∀i ≠ j : j ∈ Ri ∧ i ∈ Rj
• Liveness: based on a timestamping mechanism
Ricart-Agrawala mutex algorithm: local variables
Structure
[Figure: structure of the algorithm at a process: the operations acquire() and release(), the local variables, and the processing of the messages req(k, j) and perm(j)]
Ricart-Agrawala mutex algorithm (1)
Ricart-Agrawala mutex algorithm (2)
operation release()
for each j ∈ postponedi do
send perm(i) to pj end for;
statei ← out
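A minimal single-threaded simulation may help to see release() in context. Only release() is transcribed from the text above; acquire() and the message handling follow the usual presentation of the Ricart-Agrawala algorithm and are assumptions here, as are the names Process, lrd, waiting, run and the list net that stands for the network. Process identities are 0-based.

```python
class Process:
    """One participant p_i; net is a shared FIFO list of (destination, message)."""

    def __init__(self, pid, n):
        self.pid = pid
        self.others = [j for j in range(n) if j != pid]
        self.state = "out"
        self.clock = 0          # highest request date seen so far
        self.lrd = 0            # date of the last local request (assumed name)
        self.waiting = set()    # permissions still missing
        self.postponed = set()  # requests answered only at release time

    def acquire(self, net):
        self.state = "asking"
        self.lrd = self.clock + 1
        self.waiting = set(self.others)
        for j in self.others:
            net.append((j, ("req", self.lrd, self.pid)))

    def release(self, net):
        for j in sorted(self.postponed):
            net.append((j, ("perm", self.pid)))
        self.postponed.clear()
        self.state = "out"

    def on_message(self, msg, net):
        if msg[0] == "req":
            _, k, j = msg
            self.clock = max(self.clock, k)
            # Timestamps (date, identity) are totally ordered: smaller wins.
            has_priority = self.state != "out" and (self.lrd, self.pid) < (k, j)
            if has_priority:
                self.postponed.add(j)
            else:
                net.append((j, ("perm", self.pid)))
        else:  # ("perm", j)
            self.waiting.discard(msg[1])
            if self.state == "asking" and not self.waiting:
                self.state = "in"


def run(procs, net):
    """Deliver messages in FIFO order until the network is empty."""
    while net:
        dest, msg = net.pop(0)
        procs[dest].on_message(msg, net)


procs = [Process(i, 3) for i in range(3)]
net = []
procs[0].acquire(net)   # both requests carry date 1,
procs[1].acquire(net)   # so the identity breaks the tie
run(procs, net)
states_before = [p.state for p in procs]
procs[0].release(net)
run(procs, net)
states_after = [p.state for p in procs]
```

In this run the two concurrent requests have the same date, so the process with the smaller identity enters first and the other one enters only after the release.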
Clock values
Ricart-Agrawala mutex algorithm
Proof: on the safety side (1)
[Figure: space-time diagram of pi and pj with the messages request(h, i) and request(k, j)]
Proof: on the safety side (2)
[Figure: space-time diagram of pi and pj with the messages request(h, i) and request(k, j); cs_statei ≠ out at pi, while pj moves from out to asking]
Proof: on the liveness side (1)
• Two-step proof
• Deadlock-freedom:
Proof: on the liveness side (2)
[Figure: request(h, i) from pi to pj, answered by permission(j) from pj to pi; pj then issues request(k, j) where k > h (clockj = k − 1 > h − 1), and a later request(h′, i) of pi has h′ > k]
Cost
Variants
• Ring structure
On mutual exclusion
• Permission-based
• Token-based
• A continuous view
- Raynal M., Algorithms for mutual exclusion, The MIT Press, 1986
- Anderson J., Kim Y.-J. and Herman T., Shared-memory mutual exclusion: major
research trends since 1986. Distributed Computing, 16(2-3): 75-110, 2003
Part III
VECTOR TIME
- Fidge C., Timestamp in Message Passing Systems that Preserves Partial Ordering,
Proc. 11th Australian Computing Conference, pp. 56-66, 1988
- Mattern F., Virtual time and global states of distributed systems. Proc. Int’l
workshop on Parallel and Distributed Systems, North-Holland, pp. 215-226,
(Cosnard, Quinton, Raynal and Robert Eds), 1988
- Baldoni R. and Raynal M. Fundamentals of Distributed Computing: A Practical
Tour of Vector-Clock Systems. IEEE Distributed Systems Online, 3(2):1-18, 2002
Aim: capture the causality relation
⋆ Respects causality
⋆ But does not capture it
(e −ev→ f) ⇔ date(e) < date(f)
Vector clock: intuition
Vector clock: definition
Vector clock: algorithm
Receiving Rule:
when receiving a message (m, VC) from pj :
VCi[i] ← VCi[i] + 1;
VCi ← max(VCi, VC) % component-wise maximum %
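The receiving rule can be sketched in Python as follows. The sending rule is not shown above; the sketch assumes the usual one (increment the local entry, then attach a copy of the vector to the message). The class and method names are illustrative, and process identities are 0-based.

```python
class VectorClock:
    """Vector clock VC_i of process p_i among n processes (hypothetical helper)."""

    def __init__(self, pid, n):
        self.pid = pid
        self.vc = [0] * n

    def send(self, m):
        # Assumed sending rule: date the send event, then piggyback a copy.
        self.vc[self.pid] += 1
        return (m, list(self.vc))

    def receive(self, stamped):
        # Receiving rule from the text: increment own entry, then take the
        # component-wise maximum with the vector carried by the message.
        m, vc = stamped
        self.vc[self.pid] += 1
        self.vc = [max(a, b) for a, b in zip(self.vc, vc)]
        return m


p0, p1 = VectorClock(0, 2), VectorClock(1, 2)
m1 = p0.send("m1")
p1.receive(m1)
m2 = p1.send("m2")
p0.receive(m2)
```

Entry VCi[i] counts the events produced by pi, and the other entries record what pi has learned about the other processes through messages.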
Illustration
[Figure: space-time diagram of four processes p1, p2, p3, p4 starting from [0, 0, 0, 0], each event labeled with its vector date, for example [1, 2, 0, 0], [0, 3, 2, 2] and [0, 3, 0, 2]]
A few simple definitions
• V1 ≤ V2 ≝ ∀k : V1[k] ≤ V2[k]
• V1 < V2 ≝ (V1 ≤ V2) ∧ (V1 ≠ V2)
• V1 || V2 ≝ ¬(V1 ≤ V2) ∧ ¬(V2 ≤ V1)
The vector clock properties
(e −ev→ f) ⇔ (Ve < Vf)
(e || f) ⇔ (Ve || Vf)
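The comparison of two vector dates can be sketched as below, assuming vectors are plain Python lists of equal length. The function names leq, lt and concurrent are illustrative, and the sample vectors are arbitrary.

```python
def leq(v1, v2):
    """V1 <= V2: every entry of V1 is at most the matching entry of V2."""
    return all(a <= b for a, b in zip(v1, v2))


def lt(v1, v2):
    """V1 < V2: V1 <= V2 and the two vectors differ."""
    return leq(v1, v2) and v1 != v2


def concurrent(v1, v2):
    """V1 || V2: neither vector is <= the other."""
    return not leq(v1, v2) and not leq(v2, v1)


ve, vf, vg = [0, 1, 0, 0], [0, 3, 0, 1], [0, 0, 1, 0]
```

With these three vectors, lt(ve, vf) holds, so the event dated ve causally precedes the event dated vf, while vg is concurrent with both.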
Proof (1)
⋆ (e −ev→ f) ⇒ (Ve < Vf): follows from Theorem 1.
⋆ (Ve < Vf) ⇒ (e −ev→ f):
Let pi be the process that issued the event e. We
have (Ve < Vf) ⇒ (Ve[i] ≤ Vf[i]). As only pi can
entail an increase of V[i] (for any vector V), it follows
that there is a causal path from e to f.
Proof and cost
• Theorem 3: (e || f) ⇔ (Ve || Vf).
(e || f) ≝ ¬(e −ev→ f) ∧ ¬(f −ev→ e) (definition).
⋆ ¬(e −ev→ f) ⇒ ¬(Ve < Vf).
⋆ ¬(f −ev→ e) ⇒ ¬(Vf < Ve).
Refining the causality test
A process is a “local” observer
[Figure: local states and events of p1 (σ10, e11, σ11, e21, σ12) and the global states Σ it goes through, from Σinit = [0, 0] to Σfinal = [2, 3]]
A process is an observer of the computation
A vector clock denotes a global state
[Figure: global states of the computation, from Σinit = [0, 0] to Σfinal = [2, 3]]
Σa = [2, 1], Σb = [1, 2], Σc = [2, 2]
Σc = max(Σa, Σb): [2, 2] = max([2, 1], [1, 2])
The development of logical time (1)
[Figure: processes pi, pj, pk; event e on pi with Vi[i] = s, message m, Vj[i] ≥ s and Vj[j] = r on pj, event f on pk; causal path: e → f]
The development of logical time (2)
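[Figure: two processes exchanging the messages m1 to m4, with the vector dates (1, 0), (2, 0), (3, 0), (4, 2), (5, 2), (6, 5) and (0, 1), (0, 2), (0, 3), (3, 4), (3, 5), (3, 6); the same execution is drawn in a plane whose axes are the two clock components]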
Part IV
Causal order abstraction
Causal delivery: definition
• Causal Order:
co_broadcast(m1) → co_broadcast(m2)
⇒ co_del(m1) → co_del(m2)
Causal delivery: Why it is useful
• Capture causality
• Cooperative work
Causal order: Example 1
Causal order: Example 2
Causal broadcast
[Figure: causal broadcast of the messages m1 to m5]
Illustration
[Figure: three processes p1, p2, p3 with the successive values of their vectors]
RST algorithm
operation co_broadcast(m)
for each j ≠ i do send (m, VCi) to pj end for;
VCi[i] ← VCi[i] + 1
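The broadcast operation above can be sketched together with a delivery rule. The delivery rule is not shown in the text; the sketch assumes the usual one for this kind of algorithm: a message is delivered once the receiver has delivered at least as many messages from every process as the vector carried by the message indicates, after which the entry of the sender is incremented. The names CausalProcess, pending and delivered are illustrative, and process identities are 0-based.

```python
class CausalProcess:
    """p_i in a group of n processes; vc[k] counts messages of p_k delivered here."""

    def __init__(self, pid, n):
        self.pid = pid
        self.vc = [0] * n
        self.pending = []     # received, not yet deliverable
        self.delivered = []   # delivery order observed by this process

    def co_broadcast(self, m):
        # As in the text: send the current vector, then increment the own entry.
        stamped = (m, list(self.vc), self.pid)
        self.vc[self.pid] += 1
        return stamped

    def receive(self, stamped):
        self.pending.append(stamped)
        progress = True
        while progress:                       # a delivery may unblock others
            progress = False
            for item in list(self.pending):
                m, vc, sender = item
                if all(self.vc[k] >= vc[k] for k in range(len(vc))):
                    self.pending.remove(item)
                    self.delivered.append(m)
                    self.vc[sender] += 1      # assumed delivery rule
                    progress = True


p0, p1, p2 = (CausalProcess(i, 3) for i in range(3))
m1 = p0.co_broadcast("m1")
p1.receive(m1)
m2 = p1.co_broadcast("m2")    # m2 causally depends on m1
p2.receive(m2)                # arrives first at p2 and has to wait
early = list(p2.delivered)
p2.receive(m1)
```

At p2 the message m2 stays in the buffer until m1 has been delivered, which is exactly what the causal order property requires.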
Part V
PREDICATE DETECTION
Stable Local Predicate Detection (1)
Stable Local Predicate Detection (2)
Stable Local Predicate Detection (3)
[Figure: processes P2 and P3 with the messages m1 to m5 and the local states σ2^0, σ2^x2, σ2^y2 and σ3^0, σ3^x3, σ3^y3]
Stable Local Predicate Detection (4)
[Figure: processes P1, P2, P3 with the local states σ1^y1, σ2^y2, σ3^y3 and the messages m1, m2, m3]
Detection algorithm: local context of pi
Detection algorithm (1)
procedure detected? is
if SATi = {1, 2, . . . , n} then
FIRSTi defines the first consistent
global state Σ that satisfies ⋀j LPj
fi
Detection algorithm (2)
Detection algorithm (3)
Detection algorithm (4)
Part VI
DETECTION OF A SIMPLE
EVENT PATTERN
- Raynal M., Illustrating the Use of Vector Clocks in Property Detection: an Example
and a Counter-Example. Proc. 5th Int’l European Parallel Computing Conference
(EUROPAR’99), Springer LNCS 1685, pp. 806-814, 1999
Pattern Recognition (1)
⋆ (black(s) ∧ black(t))
Pattern Recognition
[Figure: two executions (white and black), each with the events s, u and t]
White and black: s.VC = (0, 0, 2) and t.VC = (3, 4, 2) in both cases
Non-Triviality of the Problem
[Figure: events a, t1, t2 on P1; b, u, c on P2; s, d on P3]
Decomposing the Predicate
⋆ P2(s, u, t) ≡ (s → u ∧ u → t)
Using Vector of Vector Clocks
Example (1)
[Figure: events a, t1, t2 on P1; b, u, c on P2; s, d on P3]
Example (2)
[Figure: events a, t1, t2 on P1; b, u, c on P2; s, d on P3]
Operational Predicate
The Protocol (1)
The Protocol (2)
The Protocol (3)
What has been learnt
Part VII
DETERMINING IMMEDIATE
PREDECESSORS
Relevant Events
A Distributed Computation
[Figure: space-time diagram of a distributed computation on P1, P2, P3]
Vector Clocks (2)
Vector Clocks (3)
• More precisely:
Vector Clocks: Example
Immediate Predecessor Tracking: the Problem
⋆ e → f , and
Immediate Predecessor Tracking: Why?
Distributed Computation and its Reduction
Transitive Reduction (Hasse Diagram)
[Figure: Hasse diagram over the relevant events (1, 1), (1, 2), (1, 3), (2, 1), (2, 2), (2, 3), (3, 1), (3, 2)]
Basic IPT Protocol (1)
Each pi manages:
• A vector clock VCi
• A boolean array IPi whose meaning is:
Basic IPT Protocol (2)
How to Manage the IPi Vectors? (1)
[Figure: space-time diagram on P1, P2, P3]
How to Manage the IPi Vectors? (2)
[Figure: space-time diagram on P1, P2, P3]
Basic IPT Protocol (3)
∀k : case
VCi[k] < m.VC[k] then VCi[k] := m.VC[k];
IPi[k] := m.IP[k]
VCi[k] = m.VC[k] then IPi[k] := min(IPi[k], m.IP[k])
VCi[k] > m.VC[k] then skip
end case
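The case statement can be sketched as a Python function that merges the pair (m.VC, m.IP) carried by a message into the local pair (VCi, IPi). The function name merge_on_receive and the sample values are illustrative assumptions, and the boolean array IP is represented by a list of 0/1 integers.

```python
def merge_on_receive(vc, ip, m_vc, m_ip):
    """Return the updated (VC_i, IP_i) after applying the case statement per entry."""
    vc, ip = list(vc), list(ip)
    for k in range(len(vc)):
        if vc[k] < m_vc[k]:
            # The message knows more about p_k: adopt both of its values.
            vc[k] = m_vc[k]
            ip[k] = m_ip[k]
        elif vc[k] == m_vc[k]:
            # Same knowledge: the entry stays 1 only if both sides agree on 1.
            ip[k] = min(ip[k], m_ip[k])
        # vc[k] > m_vc[k]: the local information is more recent, nothing to do.
    return vc, ip


new_vc, new_ip = merge_on_receive([2, 1, 0], [1, 1, 0], [1, 1, 3], [1, 0, 1])
```

In this call the three entries exercise the three branches in turn: the first is left untouched, the second takes the minimum, and the third is overwritten by the values of the message.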
Efficient IPT? (1)
Efficient IPT (2): Towards a General Condition
Underlying intuition:
[Figure: Pi executes send(m) with VCi[k] = x and IPi[k] = 1; Pj executes receive(m) with VCj[k] ≥ x]
Efficient IPT (3): a General Condition
Efficient IPT (3): a General Condition
• Theorem 1:
The condition K(m, k) is both necessary and sufficient
to omit the transmission of VCi[k] and IPi[k]
when m is sent by pi to pj
Efficient IPT (4): Towards a Concrete Condition
Efficient IPT (5): Towards a Concrete Condition
An Implementation of the Matrices Mi
A Concrete Condition
∨ (send(m).VCi[k] = 0)
An Efficient IPT Protocol (1)
RM0 Both VCi[1..n] and IPi[1..n] are set to [0, . . . , 0], and
∀ (j, k) : Mi[j, k] is set to 1
RM1 Each time pi produces a relevant event e:
An Efficient IPT Protocol (2)
Properties of the IPT Protocol
Part VIII
MATRIX CLOCKS
- Wuu G.T. and Bernstein A.J., Efficient solutions to the replicated log and
dictionary problems. Proc. 3rd Int’l ACM Symposium on Principles of Distributed
Computing (PODC’84), ACM Press, pp. 233-242, 1984
Matrix clock
MCi[j, k] = x means
Matrix clock: algorithm
Receiving Rule:
when receiving a message (m, MC) from pj :
MCi[i, i] ← MCi[i, i] + 1;
MCi[i, ∗] ← max(MCi[i, ∗], MC[j, ∗]);
for each k do
MCi[k, ∗] ← max(MCi[k, ∗], MC[k, ∗]) end for
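The receiving rule can be sketched in Python as follows. The sending rule is not shown above; the sketch assumes that the sender increments MCi[i, i] and attaches a copy of its whole matrix. The class name MatrixClock and the method names are illustrative, and process identities are 0-based.

```python
import copy


class MatrixClock:
    """Matrix clock MC_i of process p_i among n processes (hypothetical helper)."""

    def __init__(self, pid, n):
        self.pid = pid
        self.n = n
        self.mc = [[0] * n for _ in range(n)]

    def send(self, m):
        # Assumed sending rule: date the send event, then piggyback a copy.
        self.mc[self.pid][self.pid] += 1
        return (m, copy.deepcopy(self.mc), self.pid)

    def receive(self, stamped):
        m, mc, j = stamped
        i = self.pid
        self.mc[i][i] += 1
        # Row i is the vector clock of p_i: merge it with the sender's row j.
        self.mc[i] = [max(a, b) for a, b in zip(self.mc[i], mc[j])]
        # Every row k: merge what p_i and the sender know about p_k.
        for k in range(self.n):
            self.mc[k] = [max(a, b) for a, b in zip(self.mc[k], mc[k])]
        return m


p0, p1 = MatrixClock(0, 2), MatrixClock(1, 2)
msg = p0.send("m")
p1.receive(msg)
```

After the receive, row 1 of the receiver is its own vector clock and row 0 is its view of the vector clock of the sender.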
Illustration
Properties
Matrix clocks in action: Message stability tracking
Message stability tracking: structure
[Figure: layered structure: broadcast(m) and deliver(m) at the upper interface, send(m) and receive(m) at the network interface, and a local buffer with deposit(m) and discard(m)]
Message stability tracking: control variables
Message stability tracking: algorithm (1)
Message stability tracking: algorithm (2)
Message stability tracking: algorithm (3)
when ∃ m ∈ buffer :
k = m.sender ∧
m.VC[k] ≤ min(MCi[1, k], . . . , MCi[n, k]): discard(m)
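The discard condition can be sketched as a predicate on the matrix clock. The sketch assumes that a buffered message is stored as a triple (payload, VC, sender) and that the matrix is a list of rows; the function names is_stable and discard_stable are illustrative, and process identities are 0-based.

```python
def is_stable(m_vc, sender, mc):
    """True when every row of mc has reached m.VC[sender] in column sender."""
    return m_vc[sender] <= min(row[sender] for row in mc)


def discard_stable(buffer, mc):
    """Keep only the buffered messages that are not yet stable."""
    return [(m, vc, s) for (m, vc, s) in buffer if not is_stable(vc, s, mc)]


mc = [[2, 1],
      [1, 1]]                    # column 0 is [2, 1], so its minimum is 1
buffer = [("a", [1, 0], 0),      # first message broadcast by process 0
          ("b", [2, 0], 0)]      # second message broadcast by process 0
remaining = discard_stable(buffer, mc)
```

The minimum over a column is the number of events of the sender that every process is known to have seen, so a message at or below that bound can no longer be requested by anyone.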
THAT’s ALL, FOLKS!