UNIT I
1. Characteristics of a Distributed System
Resource Sharing: Multiple users can share hardware and software resources across
different nodes.
Concurrency: Different computations can be executed simultaneously on different
nodes.
Scalability: The system can expand by adding more nodes without significant
performance degradation.
Fault Tolerance: Even if some nodes fail, the system continues functioning properly.
Transparency: Users perceive the system as a single entity rather than multiple
interconnected machines.
o Access Transparency: Users access resources uniformly.
o Location Transparency: Users don’t need to know where resources are
physically located.
o Replication Transparency: Users do not notice if data is replicated for
performance.
o Failure Transparency: The system recovers automatically from failures.
o Concurrency Transparency: Multiple users can access shared resources
simultaneously.
2. Nodes of a Distributed System
A node in a distributed system is an independent computing entity that participates in the
execution of processes. Nodes communicate via a network to coordinate tasks and share
resources. The different types of nodes in a distributed system are:
Types of Nodes:
Client Nodes: These nodes request services from servers. Examples include web
browsers accessing websites.
Server Nodes: Provide services such as file storage, database management, or
computing resources.
Middleware Nodes: Act as intermediaries to facilitate communication between
clients and servers, ensuring interoperability and security.
Storage Nodes: Responsible for managing and storing data, commonly used in cloud
storage systems.
3. Architectural Models of Distributed Systems
1. Client-Server Model
Example:
A user accesses a website via a browser (client), which sends an HTTP request to a
web server.
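As a rough illustration of how the client-server model works, the sketch below runs a tiny
HTTP server and a client in one Python script; the port number, path, and response text are
arbitrary values chosen only for this example.

```python
# A minimal sketch of the client-server model using only Python's
# standard library; the port and response text are illustrative.
import threading
import http.client
from http.server import BaseHTTPRequestHandler, HTTPServer

class HelloHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Server node: handles the client's HTTP request.
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        self.wfile.write(b"Hello from the server node")

    def log_message(self, fmt, *args):
        pass  # keep the example output quiet

server = HTTPServer(("localhost", 8080), HelloHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# Client node: sends an HTTP GET request, just as a browser would.
conn = http.client.HTTPConnection("localhost", 8080)
conn.request("GET", "/")
print(conn.getresponse().read().decode())  # Hello from the server node
server.shutdown()
```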
2. Peer-to-Peer (P2P) Model
Example:
A user downloads a file using a P2P protocol where multiple peers contribute parts of
the file.
3. Three-Tier Architecture
Example:
A banking application where the UI handles user input, the application layer
processes transactions, and the database stores user account details.
4. Multi-Tier Architecture
Extends the three-tier model by adding extra layers such as caching, security, or load
balancing.
Used in large-scale cloud applications.
Example:
An online travel booking system where separate services handle flight, hotel, and
payment processing.
Design Goals of a Distributed System
1. Transparency
The system should hide the complexity of distributed processes from users.
Different forms of transparency:
o Access, Location, Replication, Failure, Concurrency.
2. Scalability
The system should continue to perform well as the number of users, nodes, and
resources grows.
3. Openness
The system should be able to integrate different hardware and software components.
Uses standardized communication protocols.
4. Fault Tolerance
The system should continue to operate correctly even when some nodes or
communication links fail.
5. Resource Sharing
Enable multiple users to share resources like files, databases, and processing power.
6. Concurrency
Multiple processes should be able to execute and access shared resources at the same
time without interfering with one another.
7. Security
Resources and communication should be protected against unauthorized access, for
example through authentication, authorization, and encryption.
1. Time and State in a Distributed System
In distributed systems, time and state play a crucial role in coordinating and synchronizing
events across multiple nodes. Since there is no global clock, each node maintains its own
time, leading to challenges in maintaining a consistent view of the system's state.
Types of Events:
1. Internal Events: Events occurring within a single node without communication with
others.
2. Message Sending Events: When one node sends a message to another.
3. Message Receiving Events: When a node receives and processes a message.
State Transitions:
Each event changes the local state of the process on which it occurs; the global state of
the system is the collection of all local states together with the messages in transit on
the communication channels.
Since different nodes have their own local clocks, defining the correct order of events
requires logical and physical clock synchronization.
Logical Clocks:
A logical (Lamport) clock assigns an increasing counter value to every event so that if
event A happened before event B, then L(A) < L(B). Each process increments its counter
on every local event, attaches the counter to outgoing messages, and on receiving a
message sets its counter to the maximum of its own value and the received value, plus one.
Example:
Process P1          Process P2
  A (1)  ---------->  B (2)
Since A (the send) happened before B (the receive), L(A) < L(B).
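The following minimal Python sketch mirrors this example with Lamport clocks; the process
objects and the single message are illustrative assumptions rather than part of any particular
system.

```python
# A minimal sketch of Lamport logical clocks for the example above:
# P1's send event A and P2's receive event B.
class LamportClock:
    def __init__(self):
        self.time = 0

    def internal_event(self):
        self.time += 1                               # tick on every local event
        return self.time

    def send_event(self):
        self.time += 1                               # tick, then attach to the message
        return self.time

    def receive_event(self, msg_time):
        self.time = max(self.time, msg_time) + 1     # max rule on receipt
        return self.time

p1, p2 = LamportClock(), LamportClock()
a = p1.send_event()        # event A on P1 -> L(A) = 1
b = p2.receive_event(a)    # event B on P2 -> L(B) = max(0, 1) + 1 = 2
print(a < b)               # True: A happened before B, so L(A) < L(B)
```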
Vector Clocks:
Vector clocks extend Lamport clocks by keeping one counter per process, which makes it
possible to tell whether two events are causally related or concurrent; they are described
in detail later in this unit.
Recording the Global State:
Recording the global state is essential for debugging, fault tolerance, and recovery in
distributed systems.
Methods:
2. Election Algorithms
Election algorithms are used in distributed systems to select a coordinator among distributed
nodes.
Initiation: When a node detects that the current leader has failed (usually through a
timeout mechanism), it initiates an election.
Election Process:
The initiating node sends an “election” message to all nodes with higher
priority (higher identifiers).
Any higher-priority node that is alive responds with an acknowledgement and then starts
its own election as a candidate; if no higher-priority node responds, the initiator becomes
the new leader.
Notification: The newly elected leader informs all nodes of its leadership status,
ensuring consistency across the distributed system.
There can be three types of messages that processes exchange with each other in the
bully algorithm:
Election message: Sent by a process to processes with higher identifiers to announce
that an election has started.
Answer (Alive) message: Sent in response to an election message, indicating that a
higher-priority process is alive and will take over the election.
Coordinator (Victory) message: Sent by the winner of the election to announce the new
coordinator.
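A highly simplified sketch of a bully election is given below; it assumes node identifiers
double as priorities and replaces real message passing with direct function calls, purely
to illustrate the decision logic.

```python
# A simplified, single-process sketch of the bully election: the node
# with the highest id among the alive nodes wins. Real implementations
# exchange Election, Answer, and Coordinator messages over the network.
def bully_election(initiator, alive_ids):
    higher = [i for i in alive_ids if i > initiator]
    if not higher:
        return initiator          # no higher node answered: initiator wins
    # Each higher node would reply with an Answer message and start its
    # own election; eventually the highest alive id announces victory.
    return max(higher)

alive = {1, 2, 4, 5}                 # node 6 (the old coordinator) has failed
print(bully_election(2, alive))      # 5 becomes the new coordinator
```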
The Ring Election Algorithm is a method used in distributed systems to elect a leader
among a group of interconnected nodes arranged in a ring-like structure. It ensures
that only one node in the network becomes the leader, facilitating coordination and
decision-making within the system.
How Does Ring Election Algorithm Work?
Step 1: Initialization: Each node in the ring is assigned a unique identifier and knows
its neighboring node(s) in the ring.
Step 2: Message Passing: The algorithm begins when a node initiates an election
process. It sends a special message, often called an "election message" or "token,"
containing its identifier, to its neighboring node(s) in the ring.
Step 3: Comparison and Forwarding: Upon receiving the election message, each node
compares the identifier in the message with its own. If the received identifier is
greater than its own, it forwards the message unchanged to the next node in the ring.
If the received identifier is smaller than its own, it replaces the identifier in the
message with its own and then forwards the message.
Step 4: Propagation: This process continues until the message returns to the initiating
node. As the message travels around the ring, each node updates its state to reflect the
highest identifier it has encountered.
Step 5: Leader Election: Eventually the message carrying the highest identifier returns
to the node that owns that identifier. When a node receives a message containing its
own identifier, it knows it has the highest identifier in the network and declares itself
the leader.
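The sketch below simulates one pass of the ring election described above, using the
replace-and-forward rule; the ring ordering and node identifiers are made-up values.

```python
# A minimal sketch of the ring election: the message carries the largest
# identifier seen so far, and the node that receives its own identifier
# back declares itself the leader.
def ring_election(ring, start_index):
    """ring: node ids listed in ring order; the election starts at start_index."""
    n = len(ring)
    msg_id = ring[start_index]          # Step 2: initiator sends its own id
    i = (start_index + 1) % n
    while True:
        node_id = ring[i]
        if node_id == msg_id:
            return node_id              # Step 5: own id returned -> leader
        if node_id > msg_id:
            msg_id = node_id            # Step 3: replace with the larger id
        i = (i + 1) % n                 # Step 4: forward to the next node

print(ring_election([3, 7, 2, 9, 5], start_index=0))   # 9 is elected leader
```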
Cristian's Algorithm for Clock Synchronization:
1) The process on the client machine sends a request for the clock time (the time at the
server) to the Clock Server at time T0.
2) The Clock Server listens to the request made by the client process and returns the
response in the form of its clock time TSERVER.
3) The client process receives the response from the Clock Server at time T1 and
calculates the synchronized client clock time using the formula given below:
TCLIENT = TSERVER + (T1 - T0) / 2
where (T1 - T0) / 2 approximates the one-way network delay of the response.
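A small Python sketch of this synchronization step is shown below; the simulated server
offset and network delays are invented values used only to demonstrate the calculation.

```python
# A minimal sketch of Cristian's synchronization step. The 5-second
# server offset and the 50 ms delays are simulated, illustrative values.
import time

def server_clock():
    return time.time() + 5.0           # pretend the server runs 5 s ahead

t0 = time.time()                       # client sends the request at T0
time.sleep(0.05)                       # simulated request network delay
t_server = server_clock()              # server replies with TSERVER
time.sleep(0.05)                       # simulated response network delay
t1 = time.time()                       # client receives the reply at T1

# TCLIENT = TSERVER + (T1 - T0) / 2
synchronized = t_server + (t1 - t0) / 2
print(round(synchronized - t1, 2))     # estimated offset, roughly 5.0 s
```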
Vector clocks are a mechanism used in distributed systems to track the causality and
ordering of events across multiple nodes or processes. Each process in the system
maintains a vector of logical clocks, with each element in the vector representing the
state of that process’s clock. When events occur, these clocks are incremented, and
the vectors are exchanged and updated during communication between processes.
By comparing vector clocks, the system can identify if an event on one node causally
happened before, after, or concurrently with an event on another node, enabling
effective conflict resolution and ensuring consistency.
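As a concrete illustration, the sketch below compares two vector clocks to decide whether
one event happened before the other or the two are concurrent; the sample vectors are
arbitrary.

```python
# A minimal sketch of vector clock comparison. Event A happened before
# event B exactly when every component of A's vector is <= B's and the
# vectors differ; if neither vector dominates, the events are concurrent.
def compare(va, vb):
    if va == vb:
        return "equal"
    if all(a <= b for a, b in zip(va, vb)):
        return "happened-before"
    if all(b <= a for a, b in zip(va, vb)):
        return "happened-after"
    return "concurrent"

print(compare([2, 1, 0], [3, 1, 0]))   # happened-before
print(compare([2, 1, 0], [1, 2, 0]))   # concurrent
```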
Vector clocks have several important use cases in distributed systems, particularly in
scenarios where tracking the order of events and understanding causality is critical.
Here are some key use cases:
Vector clocks are used in distributed databases such as Cassandra or Amazon
DynamoDB to detect and resolve conflicts that arise when several replicas of the same
data are updated independently.
In collaborative editing programs such as Google Docs, several people can edit the
same document at once, so the causal order of edits must be tracked to merge them
consistently.
When debugging or monitoring distributed systems, knowing the order in which events
occurred on different nodes is crucial.
Several clients may read and edit files simultaneously in distributed file systems such
as the Hadoop Distributed File System (HDFS) or Google File System (GFS).
Causality Tracking: Vector clocks allow distributed systems to accurately track the
causal relationships between events. This helps in understanding the sequence of
operations across different nodes, which is critical for maintaining consistency and
preventing conflicts.
Conflict Resolution: Vector clocks provide a systematic way to detect and resolve
conflicts that arise due to concurrent updates or operations in a distributed system.
Fault Tolerance: Vector clocks enhance fault tolerance by enabling the system to
handle network partitions or node failures gracefully. Since each node maintains its
own version of the clock, the system can continue to operate and later reconcile
differences when nodes are reconnected.
Scalability: Vector clocks scale well in large distributed systems because they do not
require global synchronization or coordination. Each process only needs to keep track
of its own events and those of other relevant processes.
One issue is scalability: in an n-node system, the size of each vector clock grows linearly
with the number of nodes, so both the memory overhead and the size of every message
that carries a vector increase, raising communication cost.
Partial Ordering: Vector clocks define only a partial ordering of events: they reveal the
causal relationship between some pairs of events, while other pairs remain concurrent.
This can make it impossible to determine a single total order of events.
A process updates its vector clock according to the following rules:
Whenever an internal event occurs in a process, the process increments its own entry
in its vector clock by 1.
Whenever a process sends a message, it increments its own entry in its vector clock by
1 and attaches a copy of the vector to the message.
Whenever a process receives a message, it increments its own entry in its vector clock
by 1, and then updates every element of its vector by taking the maximum of the value
in its own vector and the corresponding value in the vector carried by the received
message.
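These rules can be captured in a short sketch such as the one below, assuming a fixed
number of processes; the two-process event sequence is illustrative.

```python
# A minimal sketch of the vector clock update rules for n processes.
class VectorClock:
    def __init__(self, n, pid):
        self.v = [0] * n
        self.pid = pid

    def internal(self):
        self.v[self.pid] += 1                  # rule 1: internal event

    def send(self):
        self.v[self.pid] += 1                  # rule 2: tick own entry...
        return list(self.v)                    # ...and attach a copy to the message

    def receive(self, msg_vector):
        self.v[self.pid] += 1                  # rule 3: tick own entry...
        self.v = [max(a, b) for a, b in zip(self.v, msg_vector)]  # ...then element-wise max

p0, p1 = VectorClock(2, 0), VectorClock(2, 1)
m = p0.send()         # P0's clock becomes [1, 0]; [1, 0] travels with the message
p1.receive(m)         # P1 ticks to [0, 1], then takes the max -> [1, 1]
print(p0.v, p1.v)     # [1, 0] [1, 1]
```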