
DISTRIBUTED COMPUTING

MODULE-2

# Introduction
Usually causality is tracked using physical time. However, in distributed systems, it is not possible to
have global physical time; it is possible to realize only an approximation of it. The knowledge of the causal
precedence relation among the events of processes helps solve a variety of problems in distributed systems.
Examples of some of these problems are as follows:
• Distributed algorithms design: the knowledge of the causal precedence relation among events helps ensure liveness and fairness in mutual exclusion algorithms, helps maintain consistency in replicated databases, and helps design correct deadlock detection algorithms to avoid phantom and undetected deadlocks.
• Tracking of dependent events: in distributed debugging, the knowledge of the causal dependency among events helps construct a consistent state for resuming reexecution; in failure recovery, it helps build a checkpoint; in replicated databases, it aids in the detection of file inconsistencies in case of a network partitioning.
• Knowledge about the progress: the knowledge of the causal dependency among events helps measure the progress of processes in the distributed computation. This is useful in discarding obsolete information, garbage collection, and termination detection.
• Concurrency measure: the knowledge of how many events are causally dependent is useful in measuring the amount of concurrency in a computation. All events that are not causally related can be executed concurrently. Thus, an analysis of the causality in a computation gives an idea of the concurrency in the program.
In a system of logical clocks, every process has a logical clock that is advanced using a set of rules. Every
event is assigned a timestamp and the causality relation between events can be generally inferred from their
timestamps. The timestamps assigned to events obey the fundamental monotonicity property; that is, if an event
a causally affects an event b, then the timestamp of a is smaller than the timestamp of b.

# A framework for a system of logical clocks


A system of logical clocks consists of a time domain T and a logical clock C. Elements of T form a partially
ordered set over a relation <. This relation is usually called the happened before or causal precedence.
Intuitively, this relation is analogous to the earlier than relation provided by the physical time. The logical
clock C is a function that maps an event e in a distributed system to an element in the time domain T, denoted
as C(e) and called the timestamp of e, and is defined as follows:
C : H → T,
such that the following property is satisfied:
for two events ei and ej, ei → ej =⇒ C(ei) < C(ej).
This monotonicity property is called the clock consistency condition. When T and C satisfy the following condition,
for two events ei and ej, ei → ej ⇔ C(ei) < C(ej),
the system of clocks is said to be strongly consistent.

# Implementation of logical clocks


Implementation of logical clocks requires addressing two issues: data structures local to every process to represent logical time and a protocol (set of rules) to update the data structures to ensure the consistency condition. Each process pi maintains data structures that allow it the following two capabilities:
• A local logical clock, denoted by lci, that helps process pi measure its own progress.
• A logical global clock, denoted by gci, that is a representation of process pi's local view of the logical global time. It allows this process to assign consistent timestamps to its local events. Typically, lci is a part of gci.
The protocol ensures that a process’s logical clock, and thus its view of the global time, is managed
consistently. The protocol consists of the following two rules:
• R1 This rule governs how the local logical clock is updated by a process when it executes an event
(send, receive, or internal).
• R2 This rule governs how a process updates its global logical clock to update its view of the global
time and global progress. It dictates what information about the logical time is piggybacked in a
message and how this information is used by the receiving process to update its view of the global
time.
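A minimal sketch of this framework in Python is shown below (the class and method names LogicalClock, r1_local_event, and r2_on_receive are illustrative, not taken from the text). The scalar and vector clocks described in the following sections can be viewed as concrete instances that choose a representation for lci/gci and fill in rules R1 and R2.

```python
from abc import ABC, abstractmethod

class LogicalClock(ABC):
    """Per-process data structure holding the local clock (lci) and the
    process's view of the global logical time (gci)."""

    @abstractmethod
    def r1_local_event(self):
        """R1: advance the local logical clock before executing any event
        (send, receive, or internal)."""

    @abstractmethod
    def r2_on_receive(self, piggybacked_time):
        """R2: merge the logical time piggybacked on an incoming message
        into this process's view of the global time."""
```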

# Scalar Time
• Proposed by Lamport in 1978 as an attempt to totally order events in a distributed system.
• Time domain is the set of non-negative integers.
• The logical local clock of a process pi and its local view of the global time are squashed into one
integer variable Ci .
• Rules R1 and R2 to update the clocks are as follows:
• R1: Before executing an event (send, receive, or internal), process pi executes the following: Ci := Ci + d (d > 0).
• In general, every time R1 is executed, d can have a different value; however, typically d is kept at 1.
• R2: Each message piggybacks the clock value of its sender at sending time. When a process pi
receives a message with timestamp Cmsg , it executes the following actions:
◮ Ci := max(Ci, Cmsg )
◮ Execute R1.
◮ Deliver the message.
The figure shows the evolution of scalar time.
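Below is a minimal Python sketch of scalar clocks following rules R1 and R2, assuming d = 1; the class and method names are illustrative.

```python
class ScalarClock:
    """Lamport scalar clock: local and global view squashed into one integer Ci."""

    def __init__(self):
        self.c = 0

    def tick(self):
        # R1: Ci := Ci + d with d = 1, executed before every event
        self.c += 1
        return self.c

    def send(self):
        # The send event applies R1; the resulting value is piggybacked on the message
        return self.tick()

    def receive(self, c_msg):
        # R2: Ci := max(Ci, Cmsg), then execute R1, then deliver the message
        self.c = max(self.c, c_msg)
        return self.tick()

# Example: P1 sends a message to P2
p1, p2 = ScalarClock(), ScalarClock()
t_send = p1.send()            # send event at P1 gets timestamp 1
t_recv = p2.receive(t_send)   # receive event at P2 gets max(0, 1) + 1 = 2
```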

# Basic properties of scalar time


Consistency Property
• Scalar clocks satisfy the monotonicity and hence the consistency property:
for two events ei and ej, ei → ej =⇒ C(ei) < C(ej).
Total Ordering
• Scalar clocks can be used to totally order events in a distributed system.
• The main problem in totally ordering events is that two or more events at different processes may have identical timestamps.
• For example, in the figure, the third event of process P1 and the second event of process P2 have identical scalar timestamps.
• A tie-breaking mechanism is needed to order such events. A tie is broken as follows:
• Process identifiers are linearly ordered, and a tie among events with identical scalar timestamps is broken on the basis of their process identifiers.
• The lower the process identifier in the ranking, the higher the priority.
• The timestamp of an event is denoted by a tuple (t, i) where t is its time of occurrence and i is the
identity of the process where it occurred.
• The total order relation ≺ on two events x and y with timestamps (h,i) and (k,j), respectively, is
defined as follows: x ≺ y ⇔ (h < k or (h = k and i < j))
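A small Python sketch of this tie-breaking rule is shown below (the function name is illustrative); Python's built-in tuple comparison already implements the lexicographic order (h < k) or (h = k and i < j).

```python
def precedes(x, y):
    """x and y are (t, i) timestamps: scalar time and process identifier."""
    return x < y  # lexicographic tuple comparison implements the total order

assert precedes((2, 1), (2, 3))   # equal timestamps: lower process id comes first
assert precedes((1, 5), (2, 1))   # otherwise the smaller timestamp comes first
```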
Event counting
• If the increment value d is always 1, the scalar time has the following interesting property: if event e
has a timestamp h, then h-1 represents the minimum logical duration, counted in units of events,
required before producing the event e;
• We call it the height of the event e.
• In other words, h-1 events have been produced sequentially before the event e regardless of the
processes that produced these events.
• For example, in the figure, five events precede event b on the longest causal path ending at b.
No Strong Consistency
• The system of scalar clocks is not strongly consistent; that is, for two events ei and ej, C(ei) < C(ej) does not imply ei → ej.
• For example, in the figure, the third event of process P1 has a smaller scalar timestamp than the third event of process P2. However, the former did not happen before the latter.
• The reason that scalar clocks are not strongly consistent is that the logical local clock and logical global clock of a process are squashed into one, resulting in the loss of causal dependency information among events at different processes.
• For example, in the figure, when process P2 receives the first message from process P1, it updates its clock to 3, forgetting that the timestamp of the latest event at P1 on which it depends is 2.

# Vector Time
• The system of vector clocks was developed independently by Fidge, Mattern and Schmuck.
• In the system of vector clocks, the time domain is represented by a set of n-dimensional non-negative
integer vectors.
• Each process pi maintains a vector vti[1..n], where vti[i] is the local logical clock of pi and describes the logical time progress at process pi. vti[j] represents process pi's latest knowledge of process pj's local time.
• If vti[j] = x, then process pi knows that the local time at process pj has progressed to x.
• The entire vector vti constitutes pi ’s view of the global logical time and is used to timestamp events.
• Process pi uses the following two rules R1 and R2 to update its clock:
• R1: Before executing an event, process pi updates its local logical time as follows: vti[i] := vti[i] + d (d > 0).
• R2: Each message m is piggybacked with the vector clock vt of the sender process at sending time. On
the receipt of such a message (m,vt), process pi executes the following sequence of actions:
1. Update its global logical time as follows: 1 ≤ k ≤ n : vti[k] := max(vti[k], vt[k])
2. Execute R1.
3. Deliver the message m
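Below is a minimal Python sketch of vector clocks following rules R1 and R2, assuming d = 1 and zero-based process indices; the class and method names are illustrative.

```python
class VectorClock:
    def __init__(self, pid, n):
        self.pid = pid        # this process's index i
        self.vt = [0] * n     # vti[1..n], stored zero-based

    def tick(self):
        # R1: vti[i] := vti[i] + d with d = 1, executed before every event
        self.vt[self.pid] += 1

    def send(self):
        # The send event applies R1; a copy of vt is piggybacked on the message
        self.tick()
        return list(self.vt)

    def receive(self, vt_msg):
        # R2 step 1: component-wise maximum with the piggybacked vector
        self.vt = [max(a, b) for a, b in zip(self.vt, vt_msg)]
        # R2 step 2: execute R1; the message can then be delivered
        self.tick()

# Example: process 0 sends to process 1 in a three-process system
p0, p1 = VectorClock(0, 3), VectorClock(1, 3)
m = p0.send()      # p0.vt == [1, 0, 0]
p1.receive(m)      # p1.vt == [1, 1, 0]
```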

# Basic properties of vector time


Isomorphism
• If events in a distributed system are timestamped using a system of vector clocks, we have the
following property.
• If two events x and y have timestamps vh and vk, respectively, then
x → y ⇔ vh < vk
x || y ⇔ vh || vk.
• Thus, there is an isomorphism between the set of partially ordered events produced by a distributed computation and their vector timestamps.
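Here vh < vk denotes the usual component-wise order on vector timestamps: vh ≤ vk iff vh[x] ≤ vk[x] for every x, and vh < vk iff in addition vh ≠ vk; two timestamps are concurrent when neither is smaller than the other. A small Python sketch of these checks (function names are illustrative):

```python
def happened_before(vh, vk):
    # vh < vk: every component of vh is <= the matching component of vk,
    # and the two vectors are not identical
    return all(a <= b for a, b in zip(vh, vk)) and vh != vk

def concurrent(vh, vk):
    # x || y: neither vector timestamp is smaller than the other
    return not happened_before(vh, vk) and not happened_before(vk, vh)

assert happened_before([1, 0, 0], [1, 1, 0])   # x -> y
assert concurrent([2, 0, 0], [0, 1, 0])        # x || y
```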
Strong Consistency
• The system of vector clocks is strongly consistent; thus, by examining the vector timestamp of two
events, we can determine if the events are causally related.
• However, Charron-Bost showed that the dimension of vector clocks cannot be less than n, the total
number of processes in the distributed computation, for this property to hold.
Event Counting
• If d = 1 (in rule R1), then the ith component of the vector clock at process pi, vti[i], denotes the number of events that have occurred at pi until that instant.
• So, if an event e has timestamp vh, vh[j] denotes the number of events executed by process pj that causally precede e. Clearly, ∑j vh[j] − 1 represents the total number of events that causally precede e in the distributed computation.
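• For instance (an illustrative timestamp, not taken from the figure), if an event e has timestamp vh = (2, 3, 1), then 2 + 3 + 1 − 1 = 5 events causally precede e in the distributed computation.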
