Distributed System Detailed Notes

Distributed systems consist of interconnected computers that work together, providing features such as communication, transparency, and resource sharing. They face challenges like heterogeneity, security, and fault tolerance, and utilize architectural models like client-server and peer-to-peer for structure. Concepts like logical clocks and message passing are essential for managing event ordering and communication in these complex systems.

UNIT 1

Characterization of Distributed Systems


Introduction to Distributed Systems
A distributed system consists of multiple interconnected computers that work together to
achieve a common goal. These systems appear to users as a single cohesive system but are
actually composed of independent machines.

Key features:

• Multiple Computers: Systems are composed of independent computers or devices.


• Communication: Machines communicate via a network (e.g., LAN, WAN, or the Internet).
• Transparency: Users should not know whether they're interacting with one computer or
many.
• Shared Resources: These systems allow resource sharing, such as files, printers, or
processing power.

Examples of distributed systems include online banking, cloud computing platforms, and e-
commerce systems like Amazon.

Examples of Distributed Systems


1. The Internet and the Web:
– The World Wide Web is a prime example of a distributed system.
– Millions of servers worldwide host websites, and users access these servers via
browsers.
– The communication happens seamlessly through protocols like HTTP and HTTPS.
2. Cloud Computing:
– Platforms like Google Drive or AWS provide distributed storage and
computational resources.
– Users upload files or run applications without worrying about the underlying
infrastructure.
3. Online Gaming Platforms:
– Multiplayer games like PUBG or Fortnite depend on distributed systems to
connect players in real time.
– Game servers distributed globally ensure low latency and real-time interaction.
4. Social Media:
– Platforms like Facebook, Instagram, or Twitter use distributed systems to store
and manage data across global servers while providing quick access to users.
5. E-commerce:
– Websites like Amazon distribute resources across regions to handle traffic
efficiently and ensure faster response times for users.

Resource Sharing and the Web


Resource sharing is a fundamental goal of distributed systems. In a distributed system:

• Resources such as data, hardware (printers, servers), and software can be shared among
users across the system.
• Users can access resources regardless of their location, provided they have the necessary
permissions.

Examples of Resource Sharing:

• Shared files and databases.


• Shared computational tasks (e.g., distributed computing projects like SETI@home).
• Shared hardware like printers in a network.

Resource Sharing on the Web: The Web itself is a distributed system designed for resource
sharing. For instance:

• Web Servers: Serve documents and applications to multiple users simultaneously.


• Cloud Platforms: Provide users with remote access to powerful hardware and software
resources.
• Streaming Services: Platforms like YouTube and Netflix distribute video content from
servers spread globally.

Challenges in Distributed Systems


1. Heterogeneity:
– Distributed systems involve different hardware, software, and network types.
– Making these diverse components work together is a challenge.
2. Scalability:
– Systems must handle increasing numbers of users and resources.
– For example, during sales events, e-commerce systems must handle a surge in
traffic.
3. Security:
– Protecting data and resources across a distributed network is critical.
– Challenges include unauthorized access, data breaches, and cyberattacks.
4. Fault Tolerance:
– Components in distributed systems can fail (e.g., servers, network links).
– Ensuring the system continues to operate despite failures is a significant
challenge.
5. Concurrency:
– Multiple users or processes might try to access or modify the same resource
simultaneously.
– Managing such situations without conflicts or errors is vital.
6. Latency:
– Distributed systems depend on networks for communication, which introduces
delays.
– Reducing response times is essential, especially for real-time applications like
gaming.

In summary, distributed systems are critical for modern computing, enabling resource sharing
and scalability. However, designing and maintaining such systems requires overcoming
significant challenges related to heterogeneity, security, and fault tolerance.

Architectural Models and Fundamental Models


1. Architectural Models
Architectural models define the structure of distributed systems, focusing on how components
interact and communicate. They help visualize the system's design and guide implementation.

Types of Architectural Models


1. Client-Server Model:
– In this model, clients request services, and servers provide them.
– Example: Web browsers (clients) request pages from web servers.
2. Peer-to-Peer (P2P) Model:
– All nodes in the system act as both clients and servers, sharing resources equally.
– Example: Torrent networks, blockchain systems.
3. Multitier Model:
– Extends the client-server model by adding layers like application servers or
database servers.
– Example: E-commerce platforms with user interface (UI), application logic, and
database layers.
4. Publish-Subscribe Model:
– Publishers generate events or messages, and subscribers express interest in
certain events.
– Example: Notification systems in social media platforms.

2. Fundamental Models
Fundamental models describe the properties and assumptions of distributed systems.

Three Fundamental Models:


1. Interaction Model:
– Focuses on communication between processes in distributed systems.
– Challenges: Delays due to network latency and failures in communication.
2. Failure Model:
– Defines the types of faults and their behavior:
• Crash Faults: A component stops functioning but doesn’t misbehave.
• Byzantine Faults: A component behaves unpredictably or maliciously.
3. Security Model:
– Addresses threats like unauthorized access and data corruption.
– Uses encryption, authentication, and firewalls to mitigate risks.

Limitations of Distributed Systems


1. Absence of a Global Clock:
– Distributed systems lack a shared global clock to synchronize processes.
– Each process has its local clock, leading to difficulties in determining event order
across systems.
2. No Shared Memory:
– Processes in distributed systems do not share memory directly.
– Communication relies on message passing or shared storage systems, which can
introduce delays or inconsistencies.

Logical Clocks
Logical clocks are used to establish a consistent order of events in a distributed system where no
global clock exists.

1. Lamport’s Logical Clocks


• Proposed by Leslie Lamport in 1978, Lamport’s logical clocks assign a numerical
timestamp to each event to order them.

Key Principles:
1. Increment Rule:
– Each process increments its clock before executing an event.
2. Message Rule:
– When a process sends a message, it includes its current timestamp.
– The receiving process updates its clock to the maximum of its current clock and
the timestamp in the message, then increments it.
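
A minimal Python sketch of these two rules (the class and method names are illustrative, not from any particular library):

class LamportClock:
    def __init__(self):
        self.time = 0

    def local_event(self):
        # Increment rule: advance the clock before executing a local event.
        self.time += 1
        return self.time

    def send_event(self):
        # Sending is an event: increment, then attach the timestamp to the message.
        self.time += 1
        return self.time

    def receive_event(self, msg_timestamp):
        # Message rule: take the maximum of the local clock and the message
        # timestamp, then increment.
        self.time = max(self.time, msg_timestamp) + 1
        return self.time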

Advantages:
• Simple and effective for ordering events.
• Ensures causality: If event A causes event B, then A will always have a smaller timestamp
than B.

Limitation:
• Does not capture concurrency: two independent events can carry the same or ordered timestamps, so timestamps alone cannot reveal whether two events are causally related or concurrent.

2. Vector Clocks
Vector clocks extend Lamport’s logical clocks to capture concurrency and causality more
effectively.

How Vector Clocks Work:


1. Each process maintains a vector of counters, one for each process in the system.
2. When a process performs an event:
– It increments its own counter in the vector.
3. When a process sends a message:
– It includes its current vector clock.
4. When a process receives a message:
– It updates its vector clock by taking the element-wise maximum of its current
vector and the received vector, then increments its own counter.

Advantages:
• Tracks causality and concurrency effectively.
• Helps determine if two events are related or independent.

Example:

Let’s consider three processes ( P_1, P_2, P_3 ):

• Initial state: ( [0, 0, 0] ) for all processes.


• ( P_1 ) performs an event: ( [1, 0, 0] ).
• ( P_1 ) sends a message to ( P_2 ): ( [1, 0, 0] ).
• ( P_2 ) receives the message and updates: ( [1, 1, 0] ).

Comparison of Events:

Using vector clocks, we can determine:

• A → B (A happens before B): if V_A < V_B, meaning every entry of A's vector is ≤ the corresponding entry of B's vector and at least one entry is strictly smaller.


• A ∥ B (A and B are concurrent): if neither V_A < V_B nor V_B < V_A.
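
A short Python sketch of the update and comparison rules above (three processes indexed 0 to 2; the helper names are illustrative):

def merge(local, received):
    # Element-wise maximum of two vector clocks.
    return [max(a, b) for a, b in zip(local, received)]

def happens_before(va, vb):
    # A -> B iff every entry of va is <= the matching entry of vb and at least one is smaller.
    return all(a <= b for a, b in zip(va, vb)) and va != vb

def concurrent(va, vb):
    return not happens_before(va, vb) and not happens_before(vb, va)

# P_1 performs an event and sends its clock [1, 0, 0] to P_2.
v_send = [1, 0, 0]
# P_2 (previously [0, 0, 0]) merges the received clock and increments its own entry.
v_recv = merge([0, 0, 0], v_send)
v_recv[1] += 1                           # -> [1, 1, 0]
print(happens_before(v_send, v_recv))    # True: the send happens before the receive
print(concurrent([1, 0, 0], [0, 0, 1]))  # True: independent events at P_1 and P_3
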
Conclusion
Distributed systems are inherently complex due to limitations like the absence of a global clock
and shared memory. Logical clocks, especially Lamport’s and vector clocks, provide mechanisms
to manage event ordering and maintain causality. Understanding architectural and fundamental
models helps design robust and scalable distributed systems while addressing their inherent
challenges.

Concepts in Message Passing Systems


Message passing systems are at the core of distributed systems, enabling communication
between processes that do not share memory. Since distributed systems lack a global clock,
ensuring correct message order and state consistency is a major challenge.

Let’s delve into the key concepts:

1. Causal Order
Causal order in message passing refers to ensuring that messages are delivered in a way that
respects the cause-and-effect relationship between events. This is based on the happens-
before (→) relationship introduced by Leslie Lamport.

Happens-before Relationship:
• If an event ( A ) happens before ( B ) in the same process, then ( A → B ).
• If ( A ) is the sending of a message and ( B ) is the receipt of that message, then ( A → B ).
• Transitivity: If ( A → B ) and ( B → C ), then ( A → C ).

Causal Order in Messaging:


• A message ( m_1 ) that influences another message ( m_2 ) (directly or indirectly) must be
delivered before ( m_2 ).

Example:
• Process ( P_1 ) sends message ( m_1 ) to both ( P_2 ) and ( P_3 ).
• After receiving ( m_1 ), ( P_2 ) sends message ( m_2 ) to ( P_3 ).
• Causal order requires ( P_3 ) to deliver ( m_1 ) before ( m_2 ), even though the two messages arrive from different senders.

Implementation:
• Vector clocks are typically used to enforce causal ordering.

2. Total Order
Total order ensures that all processes in the system agree on the order of messages, even if
they are unrelated. This means every process delivers messages in the same sequence.
Total Ordering Rules:
• If two messages ( m_1 ) and ( m_2 ) are sent concurrently, they are assigned an order
based on a global criterion (e.g., message timestamps or process IDs).
• All processes deliver ( m_1 ) and ( m_2 ) in the same order, regardless of when they are
received.

Example:
• If ( P_1 ) sends ( m_1 ), ( P_2 ) sends ( m_2 ), and ( P_3 ) receives both, ( P_3 ) must deliver
( m_1 ) and ( m_2 ) in a consistent order across all processes.

3. Total Causal Order


Total causal order combines causal order and total order:

• Messages must respect the happens-before relationship.


• All processes must deliver messages in the same total order.

Why It’s Important:


• In applications like distributed databases, ensuring both causality and a consistent order
of updates is critical to avoid inconsistencies.

4. Message Ordering
Message ordering ensures that messages are delivered and processed in a sequence that
respects certain criteria. Types include:

1. FIFO Order:
– Messages sent by a process are delivered in the order they were sent.
– Example: If ( P_1 ) sends ( m_1 ) followed by ( m_2 ) to ( P_2 ), ( P_2 ) must deliver
( m_1 ) before ( m_2 ).
2. Causal Order:
– Respects the cause-and-effect relationship between messages.
3. Total Order:
– Ensures all processes agree on the order of messages.
4. Unordered:
– No guarantees on the sequence of message delivery.

Comparison:
Order Type     Guarantees
FIFO           Per-process order
Causal         Cause-and-effect relationships are respected
Total          Uniform order across all processes
Total Causal   Both causal and total order are preserved
5. Causal Ordering of Messages
Causal ordering ensures that messages are delivered in a way that respects the cause-effect
relationship. It does not require a strict total order but focuses on preserving the logic of
causality.

Algorithm for Causal Ordering:


• Use vector clocks:
a. Each process maintains a vector of logical clocks.
b. A message carries the vector clock of the sender.
c. A process P_j delivers a message m from P_i (carrying vector timestamp V_m) only when the causality condition holds:
• V_m[i] = V_j[i] + 1, i.e., m is the next message P_j expects from P_i, and
• V_m[k] ≤ V_j[k] for every k ≠ i, i.e., every message that causally precedes m has already been delivered at P_j.
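
A small Python sketch of this delivery test (the vector timestamp is a list indexed by process id; the names are illustrative, not a specific published implementation):

def can_deliver(msg_vc, local_vc, sender):
    """Return True if a message stamped msg_vc from `sender` may be delivered
    at a process whose current vector clock is local_vc."""
    for k, (m, l) in enumerate(zip(msg_vc, local_vc)):
        if k == sender:
            if m != l + 1:      # must be the next message expected from the sender
                return False
        elif m > l:             # some causally earlier message is still missing
            return False
    return True

# Local clock [1, 0, 0]; a message stamped [1, 1, 0] from process 1 is deliverable.
print(can_deliver([1, 1, 0], [1, 0, 0], sender=1))   # True
# A message stamped [2, 1, 0] from process 1 is not: it depends on an
# undelivered message from process 0.
print(can_deliver([2, 1, 0], [1, 0, 0], sender=1))   # False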

6. Global State
The global state of a distributed system is the combined state of all processes and channels in
the system at a given point in time.

Challenges in Determining Global State:


• Processes do not share memory and have no common clock.
• Communication delays may cause inconsistent views of the system.

Chandy-Lamport Algorithm for Global State:


1. The initiator records its own local state and sends a marker message on all outgoing channels.
2. On first receiving a marker, a process records its local state and begins recording the messages arriving on its other incoming channels.
3. The process then propagates the marker to all its neighbors.
4. The local snapshots (process states plus recorded channel states) are combined to form the global state.
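
A condensed Python sketch of the marker rules above (one object per process; send_marker stands in for sending a marker on every outgoing channel and is left abstract, so these names are illustrative):

class SnapshotNode:
    def __init__(self, pid, incoming_channels):
        self.pid = pid
        self.incoming = incoming_channels    # ids of this node's incoming channels
        self.local_state = None              # recorded process state
        self.channel_state = {}              # channel id -> messages recorded in transit
        self.recording = set()               # channels still being recorded

    def start_snapshot(self, current_state, send_marker):
        # The initiator records its state and emits markers on all outgoing channels.
        self.local_state = current_state
        self.channel_state = {c: [] for c in self.incoming}
        self.recording = set(self.incoming)
        send_marker(self.pid)

    def on_marker(self, channel, current_state, send_marker):
        if self.local_state is None:
            # First marker seen: record local state, start recording the other
            # incoming channels, and propagate the marker.
            self.local_state = current_state
            self.channel_state = {c: [] for c in self.incoming}
            self.recording = set(self.incoming) - {channel}
            send_marker(self.pid)
        else:
            # A later marker closes the recording of that channel.
            self.recording.discard(channel)

    def on_message(self, channel, message):
        # Application messages that arrive while a channel is being recorded
        # belong to that channel's part of the global snapshot.
        if channel in self.recording:
            self.channel_state[channel].append(message)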

Uses:
• Debugging distributed systems.
• Deadlock detection.
• Analyzing distributed computations.

7. Termination Detection
Termination detection involves determining when all processes in a distributed system have
completed their tasks and the system can safely shut down.

Challenges:
• Processes may not know the state of others.
• Some processes may be idle, waiting for messages.
Algorithms for Termination Detection:
1. Dijkstra-Scholten Algorithm:
– Tracks dependencies among processes.
– When the initiating process detects no more dependencies, termination is
confirmed.
2. Token-Passing Algorithm:
– A token circulates in the system.
– When the token returns to the initiator with no active processes, termination is
detected.

Applications:
• Distributed computations.
• Coordinating system shutdowns.

Summary
Concept                 Description
Causal Order            Ensures messages are delivered respecting the cause-and-effect relationship.
Total Order             Guarantees all processes deliver messages in the same sequence.
Total Causal Order      Combines causal and total order.
Message Ordering        Various types like FIFO, causal, total, and unordered messaging.
Global State            Combined state of all processes and channels at a specific point in time.
Termination Detection   Identifies when all processes in a distributed system have completed their tasks.

Understanding these concepts is crucial for designing and implementing robust distributed
systems that function efficiently and maintain consistency across processes.

UNIT 2
Distributed Mutual Exclusion (DME)
Distributed Mutual Exclusion ensures that multiple processes in a distributed system can access
a shared resource mutually exclusively—that is, one process at a time, without interference
from others.

1. Introduction to Mutual Exclusion


Mutual exclusion is a fundamental problem in distributed systems where multiple processes
compete for a shared resource or critical section (CS). In a distributed system:

• Processes are located on different nodes without shared memory.


• Communication is done via message passing.

The goal of Distributed Mutual Exclusion (DME) is to manage concurrent access to the shared
resource efficiently and ensure correctness.

2. Requirements of Mutual Exclusion Algorithms


A distributed mutual exclusion algorithm must satisfy the following requirements:

1. Mutual Exclusion:
– Only one process can access the critical section (CS) at any given time.
2. Fairness:
– No process should experience starvation.
– Requests should be granted in the order they were made (FIFO).
3. Deadlock Freedom:
– The system must avoid deadlock situations where no progress can be made.
4. Fault Tolerance:
– The system should handle failures gracefully without compromising correctness.
5. Scalability:
– The algorithm must perform efficiently as the number of processes increases.
6. Message Overhead:
– The number of messages exchanged should be minimized to reduce network
traffic.

3. Classification of Distributed Mutual Exclusion Algorithms


Distributed mutual exclusion algorithms are classified based on how processes communicate
and coordinate access to the critical section:

A. Permission-Based Algorithms
• Processes request permission from others before entering the CS.
• Based on granting or denying access.

Examples:
1. Centralized Algorithm:
– A single coordinator grants or denies access to the CS.
– Steps:
• A process sends a request to the coordinator.
• The coordinator grants access if the CS is free or queues the request.
– Advantages:
• Simple to implement.
• Low message overhead (3 messages per CS entry: request, grant, and
release).
– Disadvantages:
• Single point of failure (coordinator crash).
• Bottleneck at the coordinator.
2. Distributed Algorithm (Ricart-Agrawala Algorithm):
– All processes communicate directly to determine access.
– Steps:
• A process broadcasts a request message to all other processes.
• Each process replies immediately if it is neither in the CS nor requesting it, or if the requester's timestamp is earlier than its own; otherwise it defers the reply.
• Access is granted when the requesting process receives all replies.
– Advantages:
• No single point of failure.
– Disadvantages:
• High message overhead (2(n-1) messages for n processes).
3. Token-Based Algorithm:
– A unique token circulates in the system, granting access to the CS.
– Steps:
• A process must hold the token to enter the CS.
• After exiting, the token is passed to the next requesting process.
– Advantages:
• Minimal message overhead.
– Disadvantages:
• Token loss requires additional recovery mechanisms.

B. Non-Token-Based Algorithms
• These algorithms rely on message passing but do not use a token for granting access.

Example:
• Lamport’s Algorithm:
– Processes use logical timestamps to order requests.
– Steps:
i. A process timestamps its request, sends it to all other processes, and places it in its own request queue.
ii. Every process that receives the request adds it to its local queue and sends back a timestamped reply.
iii. A process enters the CS when its request is at the head of its queue and it has received a message with a larger timestamp from every other process.
– Advantages:
• Ensures causality using Lamport clocks.
– Disadvantages:
• High message overhead (3(n-1) messages for n processes).

C. Hybrid Algorithms
• Combine features of token-based and non-token-based approaches.
• Reduce message overhead while maintaining fault tolerance.

Example:
• Suzuki-Kasami Algorithm:
– Uses a token; a process broadcasts a request to all other processes only when it needs the critical section and does not currently hold the token, and the token carries the queue of outstanding requests.

4. Key Properties of Distributed Mutual Exclusion Algorithms


Property            Centralized            Ricart-Agrawala   Token-Based
Message Overhead    Low (3 messages)       High (2(n-1))     Low (1 per request)
Failure Tolerance   Poor (single failure)  Moderate          High (with recovery)
Fairness            High (FIFO)            High              Moderate
Scalability         Limited                Limited           High

5. Limitations of Distributed Mutual Exclusion


1. Communication Delays:
– Network latency can delay messages, impacting performance.
2. Single Point of Failure:
– Centralized approaches are vulnerable to coordinator crashes.
3. High Message Complexity:
– Non-token-based algorithms require a large number of messages, leading to
overhead in large systems.
4. Token Loss:
– In token-based approaches, losing the token requires complex recovery
mechanisms.
5. Clock Synchronization Issues:
– Algorithms relying on timestamps require logical or vector clocks, adding
complexity.
6. Comparison of Distributed Mutual Exclusion Algorithms
Algorithm            Type              Strengths                                  Weaknesses
Centralized          Permission-based  Simplicity, low message overhead           Single point of failure, bottleneck
Ricart-Agrawala      Non-token-based   Fault-tolerant, respects causality         High message overhead
Token-Based          Token-based       Low overhead, efficient in large systems   Token loss adds complexity
Lamport's Algorithm  Non-token-based   Ensures causality with logical clocks      High overhead, scalability issues
Suzuki-Kasami        Hybrid            Efficient token management                 Token loss recovery

Conclusion
Distributed Mutual Exclusion ensures fairness, prevents deadlocks, and maintains consistency in
distributed systems. The choice of algorithm depends on the system’s requirements, such as
scalability, fault tolerance, and message overhead.

Token-Based and Non-Token-Based Algorithms in Distributed Mutual Exclusion
Distributed Mutual Exclusion (DME) ensures that only one process at a time can access the
critical section (CS) in a distributed system. There are two main categories of DME algorithms:
Token-Based Algorithms and Non-Token-Based Algorithms.

Below, we explain these algorithms, their mechanisms, examples, and performance metrics in
detail.

1. Token-Based Algorithms
Concept
• A unique token circulates in the system. The token acts as a "permission slip" to access
the critical section (CS).
• A process must hold the token to enter the CS.
• After exiting the CS, the process passes the token to the next requesting process (if any).

Key Features
• Low message overhead: Only one message is required to transfer the token.
• Fault tolerance mechanisms are needed for token recovery in case of loss.

How it Works
1. A process requests the token if it doesn't have it.
2. If the token is held by another process, it waits until the token is available.
3. The process that holds the token uses it to enter the CS.
4. Once done, it either keeps the token (if no other requests exist) or passes it to the next
requester.
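
A simplified single-token sketch of this flow (the pending-request queue travels with the token; the process ids and helper names are illustrative):

from collections import deque

class TokenSite:
    def __init__(self, pid):
        self.pid = pid
        self.has_token = False
        self.pending = deque()    # requests known to the current token holder

    def request_cs(self, holder):
        # Steps 1-2: ask the current token holder and wait until the token arrives.
        if not self.has_token:
            holder.pending.append(self)

    def enter_cs(self):
        # Step 3: only the token holder may enter the critical section.
        assert self.has_token, "must hold the token to enter the CS"
        print(f"P{self.pid} is in the critical section")

    def exit_cs(self):
        # Step 4: pass the token (and the remaining queue) to the next requester.
        if self.pending:
            nxt = self.pending.popleft()
            nxt.pending, self.pending = self.pending, deque()
            self.has_token, nxt.has_token = False, True

p1, p2 = TokenSite(1), TokenSite(2)
p1.has_token = True
p2.request_cs(p1)            # P2 asks the holder (P1) for the token
p1.enter_cs(); p1.exit_cs()  # P1 finishes and hands the token to P2
p2.enter_cs()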

Examples
1. Suzuki-Kasami Algorithm:
– A process broadcasts its request to all other processes only if it doesn’t know
where the token is.
– Processes maintain a request queue to manage token requests.
2. Raymond's Tree-Based Algorithm:
– Organizes processes in a logical tree structure.
– Token requests propagate up the tree, and the token is passed down when
granted.

Advantages
• Low message overhead: Only one message is needed per CS entry.
• Fairness: Requests are served in FIFO order.
• Efficiency: Scalable in large systems.

Disadvantages
• Token loss: If the token is lost, recovery is complex and time-consuming.
• Fault tolerance: Token-based systems need additional mechanisms to handle node or
process failures.

2. Non-Token-Based Algorithms
Concept
• No token is used. Instead, processes communicate via messages to coordinate access to
the CS.
• Requests are granted based on logical clocks or a distributed agreement protocol.

Key Features
• High message overhead, as all processes must participate in the coordination.
• No risk of token loss, but algorithms may face higher delays.

How it Works
1. A process broadcasts a request message to all other processes when it wants to enter
the CS.
2. Other processes reply with a grant message if their conditions are met.
3. The requesting process can enter the CS only after receiving all necessary replies.
4. Once done, the process sends a release message to inform others.

Examples
1. Lamport’s Algorithm:
– Requests are timestamped using Lamport logical clocks.
– Processes compare timestamps to decide the order of granting access.
2. Ricart-Agrawala Algorithm:
– Similar to Lamport’s algorithm but reduces messages by using direct replies
instead of broadcasting releases.
– A process replies to a request immediately if it is neither in the CS nor requesting it, or if the requester's timestamp is earlier than its own; otherwise it defers the reply until it leaves the CS (see the sketch below).
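
A minimal sketch of the reply decision that drives the Ricart-Agrawala algorithm (timestamps are (Lamport clock, process id) pairs compared lexicographically; the names are illustrative):

def should_reply_now(my_state, my_request_ts, incoming_ts):
    """Decide whether to reply immediately to an incoming CS request.

    my_state      -- "RELEASED", "WANTED", or "HELD"
    my_request_ts -- (clock, pid) of our own pending request, or None
    incoming_ts   -- (clock, pid) of the incoming request
    """
    if my_state == "HELD":
        return False                          # defer the reply until we leave the CS
    if my_state == "WANTED":
        return incoming_ts < my_request_ts    # defer if our own request is earlier
    return True                               # RELEASED: reply immediately

# We requested at (5, 1); an incoming request stamped (3, 2) is earlier, so reply now.
print(should_reply_now("WANTED", (5, 1), (3, 2)))   # True
# Our request (2, 1) is earlier than the incoming (3, 2), so the reply is deferred.
print(should_reply_now("WANTED", (2, 1), (3, 2)))   # False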

Advantages
• No token loss: Eliminates the need for token recovery mechanisms.
• Fault tolerance: Handles process crashes more gracefully compared to token-based
approaches.

Disadvantages
• High message complexity: Requires multiple messages for coordination.
• Scalability issues: Performance degrades as the number of processes increases.

3. Performance Metrics for DME Algorithms


The performance of distributed mutual exclusion algorithms is evaluated using several key
metrics:

1. Message Complexity
• Definition: The number of messages exchanged per critical section entry.
• Comparison:
– Token-Based: Requires only 1 message (token transfer) in the best case.
– Non-Token-Based: Typically requires 2(n-1) messages for ( n ) processes (request
and reply).

2. Synchronization Delay
• Definition: The time taken between when a process exits the CS and when the next
process enters.
• Comparison:
– Token-Based: Very low (1 message delay).
– Non-Token-Based: Higher due to the time required for message exchanges.

3. Fault Tolerance
• Definition: The ability of the algorithm to handle failures (e.g., node crashes or token
loss).
• Comparison:
– Token-Based: Token loss requires recovery mechanisms, which increase
complexity.
– Non-Token-Based: Better fault tolerance as there is no dependency on a token.
4. Fairness
• Definition: Ensuring requests are granted in the order they are made.
• Comparison:
– Token-Based: Maintains fairness using FIFO queues.
– Non-Token-Based: Achieves fairness through logical timestamps.

5. Scalability
• Definition: The ability of the algorithm to handle an increasing number of processes.
• Comparison:
– Token-Based: Scales better due to low message overhead.
– Non-Token-Based: Message complexity increases quadratically, making it less
scalable.

6. Deadlock Freedom
• Definition: The system must avoid situations where no progress can be made.
• Comparison:
– Token-Based: Deadlock can occur if the token is lost.
– Non-Token-Based: Deadlock is avoided by careful message ordering.

4. Comparison of Token-Based and Non-Token-Based Algorithms


Metric                 Token-Based                         Non-Token-Based
Message Complexity     Low (1 message per CS entry)        High (2(n-1) messages for n processes)
Synchronization Delay  Low (1 message delay)               High (multiple messages)
Fault Tolerance        Requires token recovery mechanisms  Better fault tolerance
Fairness               Maintained using token queues       Maintained using timestamps
Scalability            High (efficient in large systems)   Limited due to high message complexity
Deadlock Freedom       Requires token management           Avoided by design

5. Use Cases of Each Approach


Token-Based Algorithms
• Suitable for systems with low communication delays and where scalability is critical.
• Used in large distributed systems or sensor networks.

Non-Token-Based Algorithms
• Preferred when reliability is essential, and the system can tolerate higher message
complexity.
• Suitable for smaller distributed systems with frequent failures.

6. Conclusion
Both token-based and non-token-based algorithms have their strengths and weaknesses. The
choice of algorithm depends on system requirements such as scalability, fault tolerance,
message overhead, and fairness. Token-based algorithms excel in low message complexity and
scalability but face challenges in token loss. Non-token-based algorithms offer robustness and
fairness but at the cost of higher message overhead. Understanding the trade-offs is crucial for
designing an efficient distributed mutual exclusion solution.

Distributed Deadlock Detection


Deadlocks occur in distributed systems when processes or resources are waiting indefinitely for
events that will never occur, such as the release of a resource or a communication
acknowledgment. Detecting and resolving deadlocks in distributed systems is more challenging
than in centralized systems because the system state is distributed across multiple nodes.

This explanation covers the system model, types of deadlocks, and strategies for resource and
communication deadlock detection in detail.

1. System Model for Distributed Deadlock Detection


A distributed system consists of multiple processes running on different nodes that
communicate and share resources. Deadlocks can arise due to resource allocation or
communication dependencies. The following concepts are crucial:

Key Components
1. Processes:
– Act as entities requesting and releasing resources.
– Communicate with one another to coordinate actions or data sharing.
2. Resources:
– Can be physical (e.g., printers, files) or logical (e.g., locks, database records).
– Resources may be allocated to processes or requested by them.
3. Wait-for Graph (WFG):
– A directed graph where nodes represent processes, and edges represent "wait-
for" relationships.
– If Process ( P1 ) is waiting for a resource held by ( P2 ), there is an edge from ( P1 )
to ( P2 ).
– A cycle in this graph indicates a potential deadlock.
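
A short sketch of detecting such a cycle when the wait-for graph is held as a dictionary (process -> set of processes it waits for); this is a plain depth-first search, not any particular published algorithm:

def has_deadlock(wfg):
    """Return True if the wait-for graph contains a cycle."""
    visited, on_stack = set(), set()

    def dfs(p):
        visited.add(p)
        on_stack.add(p)
        for q in wfg.get(p, ()):
            if q in on_stack:                     # back edge -> cycle -> deadlock
                return True
            if q not in visited and dfs(q):
                return True
        on_stack.discard(p)
        return False

    return any(dfs(p) for p in wfg if p not in visited)

# P1 waits for P2 and P2 waits for P1: a cycle, hence a potential deadlock.
print(has_deadlock({"P1": {"P2"}, "P2": {"P1"}}))   # True
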
2. Resource Deadlocks vs. Communication Deadlocks
Deadlocks in distributed systems can broadly be classified into two categories based on their
cause:

A. Resource Deadlocks
• Definition: Occur when processes compete for finite resources in a circular wait
condition.
• Example:
– Process ( P1 ) holds Resource ( R1 ) and requests ( R2 ).
– Process ( P2 ) holds Resource ( R2 ) and requests ( R1 ).
– Both processes are stuck in a circular wait.
• Characteristics:
– Involves physical or logical resources.
– The deadlock can be visualized in a WFG.

B. Communication Deadlocks
• Definition: Occur when processes are waiting indefinitely for messages or
acknowledgments from one another in a cyclic dependency.
• Example:
– Process ( P1 ) is waiting for a message from ( P2 ), and ( P2 ) is waiting for a
message from ( P1 ).
• Characteristics:
– Involves message passing and acknowledgments.
– Typically occurs in distributed systems with synchronous communication.

3. Deadlock Detection Techniques


Detecting deadlocks in distributed systems is challenging due to the lack of a global view of the
system state. The techniques for detecting resource and communication deadlocks differ
slightly.

A. Resource Deadlock Detection


To detect resource deadlocks:

1. System Representation:
– Represent the system using a WFG, where nodes are processes and a directed edge from one process to another means the first is waiting for a resource held by the second.
– Detect cycles in the graph.
2. Algorithms for Detection:
– Centralized Algorithm:
• One process acts as a coordinator and maintains the WFG.
• Periodically checks for cycles in the graph.
– Distributed Algorithm:
• Each process maintains partial information about the WFG.
• Use distributed cycle-detection algorithms to find cycles across multiple
nodes.
– Hierarchical Algorithm:
• Divide the system into regions, each with its own coordinator.
• Coordinators communicate to detect inter-region deadlocks.
3. Example:
– Chandy-Misra-Haas Algorithm:
• A distributed algorithm for detecting resource deadlocks.
• Works by propagating "deadlock probe messages" through the system. If
a probe returns to the initiating process, a deadlock is detected.

B. Communication Deadlock Detection


To detect communication deadlocks:

1. System Representation:
– Represent the system using a dependency graph, where nodes are processes and
edges represent communication dependencies.
– Detect cycles or unreachable nodes.
2. Algorithms for Detection:
– Timeout-Based Detection:
• Use timeouts to detect delays in message acknowledgments. If a process
waits beyond the timeout, it is suspected to be in a deadlock.
– Dependency Graph Analysis:
• Construct and analyze a graph of communication dependencies.
• Use cycle detection to identify deadlocks.
3. Example:
– Obermarck's Algorithm:
• Detects communication deadlocks by analyzing dependency graphs.
• Tracks dependencies among processes and checks for cycles.

4. Challenges in Distributed Deadlock Detection


Distributed deadlock detection is complicated due to the following challenges:

1. Lack of Global State:


– The system state is distributed across multiple nodes, making it hard to detect
cycles in the WFG.
2. Concurrency:
– Multiple processes may concurrently detect and resolve deadlocks, leading to
inconsistencies.
3. Message Overhead:
– Distributed algorithms require significant communication to share state
information, increasing overhead.
4. False Positives and Negatives:
– Inaccurate or incomplete state information can lead to incorrect detection
results.
5. Scalability:
– Detection algorithms must scale efficiently as the number of processes and
resources increases.

5. Handling Deadlocks
After detecting a deadlock, the system must resolve it using one of the following techniques:

1. Resource Preemption:
– Forcefully take a resource away from a process to break the cycle.
2. Process Termination:
– Terminate one or more processes involved in the deadlock to release resources.
3. Rollback:
– Roll back one or more processes to a previous state to break the dependency
chain.
4. Avoidance Mechanisms:
– Use strategies like the Banker's Algorithm to prevent deadlocks by ensuring the
system never enters an unsafe state.

6. Comparison: Resource Deadlocks vs. Communication Deadlocks


Aspect                 Resource Deadlocks                          Communication Deadlocks
Cause                  Competing for finite resources              Waiting for communication acknowledgments
System Representation  Wait-for Graph (WFG)                        Dependency Graph
Resolution             Preemption, termination, rollback           Timeout-based actions, dependency graph analysis
Detection Complexity   Moderate; depends on graph size             Higher due to message dependencies
Common in              Resource-sharing systems (e.g., databases)  Message-passing systems (e.g., distributed apps)

Conclusion
Deadlock detection in distributed systems is essential for maintaining system efficiency and
availability. Understanding the differences between resource and communication deadlocks,
along with the appropriate detection algorithms, is critical for designing reliable distributed
systems. By leveraging centralized, distributed, or hierarchical techniques, systems can
efficiently detect and resolve deadlocks, ensuring smooth operation.

Deadlock in Distributed Systems


Deadlocks occur when processes in a system are stuck in a cycle, each waiting for resources held
by others. Addressing deadlocks involves prevention, avoidance, detection, and resolution
mechanisms. Let's discuss these approaches in detail.

1. Deadlock Prevention
Deadlock prevention ensures that at least one of the necessary conditions for deadlock cannot
occur. The four necessary conditions for deadlock are:

1. Mutual Exclusion: Only one process can hold a resource at a time.


2. Hold and Wait: Processes holding resources can request additional resources.
3. No Preemption: Resources cannot be forcibly taken from processes.
4. Circular Wait: A closed chain of processes exists, where each process waits for a
resource held by the next process.

Strategies for Prevention


• Breaking Mutual Exclusion:
– Use resources in a shared or read-only mode when possible.
• Breaking Hold and Wait:
– Require processes to request all needed resources at once.
– Example: When booking tickets, request both payment and seat selection at the
same time.
• Breaking No Preemption:
– Allow resources to be forcibly taken if a higher-priority process needs them.
– Example: Preempting a printer for urgent jobs.
• Breaking Circular Wait:
– Impose an ordering on resources and require processes to request resources in
order.
– Example: Process A requests ( R1 ), then ( R2 ), but not vice versa.

Trade-offs:
• Prevention mechanisms can reduce resource utilization and system efficiency.
• Over-conservative measures may lead to resource starvation.

2. Deadlock Avoidance
Deadlock avoidance dynamically ensures that the system never enters an unsafe state where
deadlocks might occur. This approach requires knowledge of future resource requests.
Banker’s Algorithm:
• Widely used for deadlock avoidance in centralized systems.
• Assumes:
– Each process declares its maximum resource needs in advance.
– The system grants resource requests only if the resulting state is safe.
• Safe State:
– A state is safe if a sequence exists where all processes can complete without
deadlock.
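
A compact sketch of the safety check at the core of the Banker's Algorithm (lists indexed by process and resource type; the variable names are illustrative):

def is_safe(available, allocation, maximum):
    """Return True if some ordering lets every process finish (a safe state)."""
    work = list(available)
    need = [[maximum[i][r] - allocation[i][r] for r in range(len(available))]
            for i in range(len(allocation))]
    finished = [False] * len(allocation)

    progress = True
    while progress:
        progress = False
        for i, done in enumerate(finished):
            if not done and all(need[i][r] <= work[r] for r in range(len(work))):
                # Process i can run to completion and then release what it holds.
                for r in range(len(work)):
                    work[r] += allocation[i][r]
                finished[i] = True
                progress = True
    return all(finished)

# One resource type, 1 unit free; P0 holds 1 of max 2, P1 holds 2 of max 3: safe.
print(is_safe([1], [[1], [2]], [[2], [3]]))   # True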

Challenges in Distributed Systems:


• The lack of a global view makes it hard to apply avoidance algorithms.
• Communication delays can complicate state evaluation.

3. Deadlock Detection
In contrast to prevention or avoidance, detection allows the system to enter a deadlocked state
but detects and resolves it later. Deadlock detection algorithms rely on identifying cycles or
knots in resource allocation graphs.

A. Centralized Deadlock Detection


A centralized algorithm uses a single coordinator or server to manage and monitor resource
allocation.

1. Steps:
– The coordinator gathers the resource allocation graph (WFG) from all processes.
– Periodically checks for cycles in the WFG.
– A cycle indicates a deadlock.
2. Example:
– Wait-for Graph (WFG):
• Nodes represent processes.
• Directed edges represent dependencies (e.g., Process ( P1 ) waits for
Process ( P2 )).
3. Advantages:
– Simpler implementation due to a single monitoring point.
– Easier cycle detection algorithms.
4. Disadvantages:
– Single point of failure.
– Not scalable for large systems due to high communication overhead.
B. Distributed Deadlock Detection
In distributed systems, no single coordinator manages resources. Deadlock detection requires
cooperation among nodes.

1. Approach:
– Each node maintains partial knowledge of the WFG.
– Nodes communicate to detect cycles in the global graph.
2. Algorithms:
– Path-Pushing Algorithm:
• Nodes exchange dependency paths.
• If a node receives its own dependency path, a deadlock is detected.
– Edge-Chasing Algorithm:
• Nodes send "probe messages" along dependency edges.
• If a probe returns to the initiator, a cycle (deadlock) exists.
3. Advantages:
– No single point of failure.
– Better scalability.
4. Disadvantages:
– High communication overhead.
– Susceptible to delays and inconsistencies due to distributed nature.

4. Deadlock Resolution
Once a deadlock is detected, the system must resolve it to restore functionality. Common
strategies include:

A. Resource Preemption
• Forcibly take resources from a process and allocate them to others.
• Requires rollback mechanisms to handle preempted processes.

B. Process Termination
• Terminate one or more processes involved in the deadlock.
• Criteria for selection:
– Priority: Terminate the lowest-priority process.
– Resource Usage: Terminate the process holding the most resources.
– Rollback Cost: Terminate the process with the lowest rollback cost.

C. Rollback
• Rollback one or more processes to a previous state where they were not part of the
deadlock.

D. Breaking Circular Wait


• Force one process in the cycle to release its resources voluntarily or wait for resources in
a different order.
Comparison of Strategies
Aspect                Prevention                   Avoidance                     Detection                  Resolution
When Applied          Before deadlock occurs       During resource allocation    After deadlock occurs      After deadlock is detected
Resource Utilization  Low                          Moderate                      High                       Moderate
System Knowledge      No global knowledge needed   Requires maximum needs info   Requires dependency info   Operates after detection
Complexity            Low                          High                          Moderate                   Moderate

Conclusion
Deadlock management in distributed systems requires careful balancing of prevention,
avoidance, detection, and resolution techniques. While prevention and avoidance reduce the
likelihood of deadlocks, detection and resolution focus on addressing them when they occur.
The choice of strategy depends on the system's scale, resource availability, and the cost of
communication and computation.

Deadlock Detection Algorithms in Distributed Systems


Distributed deadlock detection algorithms rely on detecting cycles or knots in the Wait-For
Graph (WFG), which represents dependencies between processes. Two common techniques
used for this are Path Pushing Algorithms and Edge Chasing Algorithms. Let’s explore both in
detail.

1. Path Pushing Algorithm


Path Pushing is a deadlock detection approach where dependency paths (or portions of the
Wait-For Graph) are exchanged between processes. The goal is to gather enough information to
detect a cycle in the global WFG.

How It Works
• Each process maintains information about its own dependencies.
• When a process cannot proceed because it's waiting for a resource, it sends dependency
paths to other processes involved.
• Dependency paths are propagated along the WFG.
• If a process receives a dependency path that contains itself, a cycle is detected, indicating
a deadlock.

Steps in the Algorithm


1. Initialization:
– Each process generates a dependency path for itself whenever it waits for a
resource.
– A path is represented as a sequence of process IDs.
2. Propagation:
– The dependency path is forwarded to the process holding the required resource.
– The receiving process appends its own dependencies and forwards the updated
path further.
3. Cycle Detection:
– If a process receives a path containing its own ID, a deadlock is detected.
– The process declares a deadlock and initiates resolution.

Example
1. Process ( P1 ) waits for ( P2 ), so ( P1 ) sends the path ( [P1] ) to ( P2 ).
2. ( P2 ) is waiting for ( P3 ), so it appends itself and sends ( [P1, P2] ) to ( P3 ).
3. ( P3 ) is waiting for ( P1 ), so it appends itself and sends ( [P1, P2, P3] ) back to ( P1 ).
4. ( P1 ) detects its own ID in the path ( [P1, P2, P3] ), confirming a deadlock.
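
A toy Python sketch of this path propagation (the wait-for relation is a dictionary; a real implementation would exchange these paths as messages between nodes):

def path_pushing_detect(wfg, start):
    """Push dependency paths along the wait-for graph starting at `start`.
    Returns a cycle as a list of processes, or None if no deadlock is found."""
    frontier = [[start]]
    while frontier:
        path = frontier.pop()
        for nxt in wfg.get(path[-1], ()):
            if nxt == start:
                return path + [start]        # the path came back to its originator
            if nxt not in path:              # do not keep extending through repeats
                frontier.append(path + [nxt])
    return None

# P1 -> P2 -> P3 -> P1, as in the example above.
wfg = {"P1": ["P2"], "P2": ["P3"], "P3": ["P1"]}
print(path_pushing_detect(wfg, "P1"))        # ['P1', 'P2', 'P3', 'P1']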

Advantages
• Simple conceptually.
• Detects deadlocks accurately when paths propagate correctly.

Disadvantages
• High communication overhead due to the transmission of dependency paths.
• Paths may grow large in size as the number of dependencies increases.
• Performance may degrade in large systems.

2. Edge Chasing Algorithm


The Edge Chasing algorithm detects deadlocks by sending special "probe messages" along the
edges of the WFG. These probes help trace dependencies between processes to identify cycles.

How It Works
• A probe message is initiated by a process that is waiting for a resource.
• The probe travels through the WFG following dependency edges.
• If the probe returns to the initiator, a cycle is detected, indicating a deadlock.

Steps in the Algorithm


1. Probe Initialization:
– When a process P requests a resource held by another process, it sends a probe message ⟨Initiator, Sender, Receiver⟩ to the holder of the resource.
2. Probe Forwarding:
– Each process receiving the probe checks if it is waiting for another resource.
– If yes, it forwards the probe to the process holding the resource it is waiting for.
– The probe message is updated with the current sender and receiver.
3. Cycle Detection:
– If a process receives a probe message where the Initiator matches its own ID, it
detects a cycle and declares a deadlock.

Example
1. P1 waits for P2, so P1 sends ⟨P1, P1, P2⟩ to P2.
2. P2 waits for P3, so P2 sends ⟨P1, P2, P3⟩ to P3.
3. P3 waits for P1, so P3 sends ⟨P1, P3, P1⟩ back to P1.
4. P1 detects its own ID as the initiator in the probe ⟨P1, P3, P1⟩, confirming a deadlock.
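
A small sketch of probe forwarding for the same three-process scenario (probes are (initiator, sender, receiver) tuples; here the wait-for relation is a local dictionary, whereas a real system forwards the probes over the network):

def edge_chasing_detect(waits_for, initiator):
    """Forward probes along wait-for edges; report a deadlock if a probe
    comes back to the process that initiated it."""
    probes = [(initiator, initiator, held_by)
              for held_by in waits_for.get(initiator, ())]
    seen = set()
    while probes:
        init, sender, receiver = probes.pop()
        if receiver == init:
            return True                       # the probe returned: a cycle exists
        if (sender, receiver) in seen:
            continue
        seen.add((sender, receiver))
        for held_by in waits_for.get(receiver, ()):
            probes.append((init, receiver, held_by))
    return False

waits_for = {"P1": ["P2"], "P2": ["P3"], "P3": ["P1"]}
print(edge_chasing_detect(waits_for, "P1"))   # True: <P1, P3, P1> returns to P1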

Advantages
• Efficient in terms of message size (probe messages are small).
• Only requires processes to handle and forward messages, making it lightweight.

Disadvantages
• Requires timely delivery of messages; delays may hinder detection.
• May generate many probe messages, increasing network load.

Comparison: Path Pushing vs. Edge Chasing


Aspect                  Path Pushing                                  Edge Chasing
Message Type            Dependency paths                              Probe messages
Message Size            Large (grows with dependency chain length)    Small (fixed size)
Communication Overhead  High (propagates entire paths)                Moderate (only probes are sent)
Detection Complexity    Requires cycle detection in paths             Simple return of probe to initiator
Scalability             Less scalable due to growing path size       More scalable in distributed environments
Speed of Detection      May be slower due to large path propagation   Faster due to lightweight probes

Conclusion
Both Path Pushing and Edge Chasing are effective in distributed deadlock detection, but they
are suited to different scenarios. Path Pushing works better in systems with fewer dependencies
but suffers from high message overhead as dependencies grow. Edge Chasing, on the other
hand, is more scalable and efficient in terms of communication but requires careful handling of
message delays and potential network congestion. The choice of algorithm depends on the
system's scale, complexity, and communication characteristics.
UNIT 3

Agreement Protocols in Distributed Systems


Agreement protocols are fundamental in distributed systems to ensure consistency and
coordination among multiple processes or nodes. They are particularly important in scenarios
where processes need to agree on a common value, decision, or action, despite potential failures
or malicious activities.

1. Introduction to Agreement Protocols


In a distributed system, processes must coordinate and agree on specific decisions to maintain
system reliability and consistency. Agreement protocols define mechanisms for achieving
consensus in the presence of various challenges such as:

• Communication delays
• Faulty processes
• Network partitions
• Malicious actors (Byzantine failures)
Key Goals of Agreement Protocols
• Consistency: All non-faulty processes must agree on the same value.
• Validity: The agreed value must be a valid and permissible one (e.g., one proposed by a
process).
• Termination: Every process must eventually decide, ensuring that the protocol
concludes.

2. System Models in Agreement Protocols


Agreement protocols operate under certain assumptions about the system's behavior and
failures. These models are critical in defining the applicability and effectiveness of an agreement
protocol.

Types of Failures
1. Crash Failures: Processes fail by halting and never recover.
2. Omission Failures: Messages are lost due to process or network failures.
3. Arbitrary/Byzantine Failures: Processes behave maliciously or unpredictably, sending
conflicting or invalid messages.

Communication Models
1. Synchronous Systems:
– Fixed upper bound on message delivery time.
– Processes execute steps within a known time.
– Easier to achieve agreement due to predictable behavior.
2. Asynchronous Systems:
– No bound on message delivery time or process execution speed.
– More challenging for agreement protocols, especially with failures.

3. Classification of Agreement Problems


Agreement problems are broadly classified into three categories:

a) Consensus Problem
Processes must agree on a single value, which could be a value proposed by any of the
processes. Common in distributed databases and leader election.

b) Byzantine Agreement Problem


Processes must agree on a single value despite the presence of Byzantine failures (malicious or
arbitrary behavior). Crucial for systems like blockchain and fault-tolerant systems.

c) Atomic Commit Problem


Processes must agree to commit or abort a transaction. Used in distributed databases to
maintain atomicity of transactions.
4. Byzantine Agreement Problem
The Byzantine Agreement Problem is a special case of the consensus problem, dealing with
processes that may behave maliciously. It was introduced by Lamport et al. as the "Byzantine
Generals Problem."

Scenario
Imagine multiple generals (processes) planning an attack. They need to coordinate and agree on
whether to attack or retreat. Some generals might act maliciously, sending contradictory or false
messages. The goal is for all loyal generals to agree on the same action.

Requirements
1. Agreement: All non-faulty processes agree on the same value.
2. Validity: If all non-faulty processes propose the same value, they must agree on it.
3. Fault Tolerance: The protocol should handle up to ( f ) faulty processes.

Assumptions
• At least ( 3f + 1 ) processes are required to tolerate ( f ) Byzantine failures.
• The system is synchronous, ensuring predictable communication and computation.

5. Solutions to the Byzantine Agreement Problem


a) Oral Message Algorithm (OM)
• A message-passing approach to achieve agreement.
• Relies on a hierarchy of communication rounds.
• Ensures that all loyal processes propagate the same value to others.
• Requires ( f + 1 ) rounds to tolerate ( f ) Byzantine faults.

b) Signed Message Algorithm (SM)


• Uses cryptographic techniques (e.g., digital signatures) to verify message authenticity.
• Processes can detect and discard messages from faulty nodes.
• Reduces the need for redundant message propagation but adds computational
overhead.

Example
1. General ( G1 ) proposes a plan (e.g., attack).
2. Each general forwards the message to others, appending their decision.
3. Faulty generals may send conflicting decisions.
4. Using majority voting or signature verification, non-faulty generals agree on the final
decision.
6. Practical Applications of Byzantine Agreement
• Blockchain and Cryptocurrencies: Ensures consistency and fault tolerance in
decentralized systems.
• Aviation Systems: Achieves reliability in flight control systems.
• Replicated Databases: Maintains consistency across distributed database replicas.

7. Performance Metrics of Agreement Protocols


a) Message Complexity
• The number of messages exchanged to achieve agreement.
• High message complexity increases network overhead.

b) Time Complexity
• The number of communication rounds required.
• Faster convergence is preferred for real-time systems.

c) Fault Tolerance
• The number of failures (crash or Byzantine) that the protocol can handle.
• A critical factor for high-reliability systems.

Conclusion
Agreement protocols, especially those addressing the Byzantine Agreement Problem, are vital
for ensuring consistency and reliability in distributed systems. They provide mechanisms to
tolerate faults, including malicious behaviors, and are the foundation for fault-tolerant systems
like blockchain and distributed databases. While achieving consensus in distributed
environments is challenging, advancements in cryptographic techniques and efficient
algorithms continue to make these protocols robust and practical.

Consensus Problem
Definition
The Consensus Problem in distributed systems is ensuring that a group of processes agree on a
single value, even in the presence of failures. It is essential for maintaining consistency across
systems like distributed databases, blockchain, and fault-tolerant applications.

Requirements of Consensus
1. Agreement: All non-faulty processes must agree on the same value.
2. Validity: If a process proposes a value, the agreed value must be one of the proposed
values.
3. Termination: All processes must eventually decide on a value.
4. Fault Tolerance: The system should handle up to a certain number of failures (e.g., crash
or Byzantine).
Interactive Consistency Problem
Definition
The Interactive Consistency Problem is a specific variant of the consensus problem, requiring
every process in a distributed system to agree on:

1. A common set of values, one for each process.


2. The values agreed upon must reflect the input values of the non-faulty processes.

Use Case
• Maintaining a consistent state across distributed replicas of a system.

Requirements
1. Agreement: All non-faulty processes agree on the same value for each process.
2. Validity: For non-faulty processes, their agreed value must match their initial value.
3. Fault Tolerance: The system can tolerate failures up to a specific limit.

Solution to the Byzantine Agreement Problem


The Byzantine Agreement Problem deals with achieving consensus despite the presence of
Byzantine faults, where processes can behave maliciously or arbitrarily.

Assumptions
1. At least ( 3f + 1 ) processes are required to tolerate ( f ) Byzantine faults.
2. The system operates in a synchronous environment where message delivery time is
bounded.

Algorithms for Byzantine Agreement


1. Oral Message (OM) Algorithm:
– A hierarchical algorithm that uses multiple rounds of message exchanges.
– At each round, processes forward received values and filter inconsistent ones.
– Consensus is achieved through majority voting after ( f+1 ) rounds.
2. Signed Message (SM) Algorithm:
– Uses cryptographic signatures to authenticate messages.
– Faulty processes cannot forge valid signatures.
– Reduces the number of rounds and message complexity compared to OM.

Applications of Agreement Problem


1. Blockchain:
– Ensures consensus among distributed nodes regarding transaction validity.
– Byzantine Fault Tolerance (BFT) protocols like PBFT (Practical Byzantine Fault
Tolerance) are used.
2. Replicated Databases:
– Maintains consistency across replicas by ensuring all replicas agree on updates.
3. Aviation Systems:
– Critical for achieving fault-tolerant flight control systems.
4. Distributed Coordination:
– Leader election in distributed systems relies on consensus.
5. Sensor Networks:
– Ensures consistency in data aggregation and decision-making among sensor
nodes.

Atomic Commit in Distributed Database Systems


Definition
The Atomic Commit Problem is ensuring that all participating processes (or database nodes) in
a transaction either commit the transaction or abort it, maintaining consistency across the
system.

Key Properties
1. Atomicity: All participants commit or abort together.
2. Consistency: The database remains in a valid state before and after the transaction.
3. Isolation: Intermediate states of the transaction are not visible to other transactions.
4. Durability: Once a transaction commits, its changes persist.

Atomic Commit Protocols


1. Two-Phase Commit (2PC):
– Phase 1 (Voting):
• The coordinator sends a prepare message to all participants.
• Participants respond with vote-commit or vote-abort based on their local
state.
– Phase 2 (Commit/Abort):
• If all participants vote commit, the coordinator sends a commit message.
• If any participant votes abort, the coordinator sends an abort message (a coordinator-side sketch of this flow appears after this list).
Advantages:
– Ensures atomicity in most cases.
Disadvantages:
– Blocking: Participants can remain in uncertain states if the coordinator fails.
2. Three-Phase Commit (3PC):
– Adds a third phase to 2PC to avoid blocking.
– Phase 1 (Prepare): Coordinator sends prepare messages.
– Phase 2 (Pre-Commit): If all participants vote commit, the coordinator sends a
pre-commit message.
– Phase 3 (Commit): Upon receiving acknowledgments, the coordinator sends a
commit message.
Advantages:
– Non-blocking in case of failures.
Disadvantages:
– More communication overhead compared to 2PC.
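
A minimal coordinator-side sketch of the 2PC message flow described above (participants are stand-in objects with prepare/commit/abort methods; none of these names come from a real database API):

def two_phase_commit(coordinator_log, participants):
    # Phase 1 (Voting): ask every participant to prepare and collect the votes.
    votes = [p.prepare() for p in participants]   # True = vote-commit, False = vote-abort

    if all(votes):
        # Phase 2 (Commit): the decision is logged, then broadcast.
        coordinator_log.append("commit")
        for p in participants:
            p.commit()
        return "committed"
    # Phase 2 (Abort): any single abort vote aborts the whole transaction.
    coordinator_log.append("abort")
    for p in participants:
        p.abort()
    return "aborted"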

Comparison of Consensus and Atomic Commit


Feature          Consensus                                    Atomic Commit
Focus            Agreement on a single value                  Consistent commit or abort of a transaction
Fault Tolerance  Handles crash and Byzantine failures         Primarily handles crash failures
Applications     Leader election, replicated state machines   Distributed transactions in databases
Protocols        Paxos, Raft, PBFT                            2PC, 3PC

Conclusion
Agreement protocols like Consensus and Atomic Commit are essential for maintaining reliability
and consistency in distributed systems. Byzantine Agreement specifically handles malicious
failures, ensuring robust systems like blockchain. In distributed databases, Atomic Commit
ensures transactions remain consistent and atomic, crucial for maintaining data integrity. Each
protocol has trade-offs in terms of performance, complexity, and fault tolerance, tailored to
specific system requirements.

Distributed Resource Management: Issues in Distributed File Systems
Introduction
A Distributed File System (DFS) is a system that allows files to be stored across multiple
machines or servers, while providing the illusion of a single, unified file system to the user. In
such systems, files are distributed across different machines connected through a network, and
users can access files from any location as though they were stored locally.

Distributed resource management in DFS focuses on efficiently allocating, accessing, and
managing resources (such as storage and network bandwidth) in a way that is transparent to the
user, ensuring reliability, consistency, and performance in the system.

Key Issues in Distributed File Systems


1. Transparency
– Definition: Transparency refers to hiding the complexity and distribution of the
system from the users and applications.
– Challenges:
• Access Transparency: Users should not need to know whether files are
stored locally or remotely.
• Location Transparency: The location of files should be abstracted so
users don’t need to know where files are physically stored.
• Replication Transparency: The system should ensure that the replication
of files is transparent, meaning users and applications should not have to
manually manage or be aware of replicated files.
• Concurrency Transparency: Ensuring that multiple users can access the
same file simultaneously without interfering with each other.
2. Fault Tolerance
– Definition: Fault tolerance ensures that the system continues to function even
when some components (e.g., servers, network links) fail.
– Challenges:
• Data Redundancy: To avoid data loss, DFS usually replicates files across
different servers. However, this introduces issues of how to maintain
consistency between replicas.
• Recovery: When a node fails, the system must be able to recover the lost
data and continue normal operations.
• Consistency during Failures: Ensuring that all users or processes
accessing files receive consistent data even if part of the system fails (e.g.,
after a crash or network partition).
3. Consistency
– Definition: Consistency refers to ensuring that all copies of a file or data are in
sync, especially when multiple processes are reading from and writing to the
same file.
– Challenges:
• File Updates: When a file is being modified by different users or
processes at the same time, ensuring that updates do not conflict and that
all users see the latest version is crucial.
• Write Propagation: In a distributed file system, changes to a file need to
be propagated to all replicas of that file. Managing this propagation while
maintaining performance is a challenge.
• Consistency Models: There are different consistency models such as
strong consistency, eventual consistency, and causal consistency.
Choosing the right model impacts the system's performance and user
experience.
4. Concurrency Control
– Definition: Concurrency control ensures that multiple processes or users can
access files concurrently without corrupting the data.
– Challenges:
• Locking Mechanisms: Distributed file systems must implement locks or
other synchronization mechanisms to prevent simultaneous conflicting
operations (e.g., two users modifying the same file).
• Deadlock: Deadlock can occur if processes are waiting for each other’s
locks indefinitely. DFS must handle deadlock detection and resolution
efficiently.
• Race Conditions: Multiple processes accessing or modifying a file at the
same time can lead to race conditions, which must be managed to ensure
correctness.
5. Performance
– Definition: Performance in a distributed file system refers to how efficiently files
can be accessed and transferred across the network.
– Challenges:
• Latency: In a distributed environment, accessing remote files can
introduce latency. Minimizing this latency, especially in large systems, is
critical.
• Bandwidth: The system must efficiently use available network bandwidth
to transfer files between clients and servers, without overloading the
network.
• Caching: Implementing caching strategies can improve performance by
storing frequently accessed files locally, reducing the need for repeated
remote access. However, cache consistency and invalidation are
challenging problems to solve.
6. Scalability
– Definition: Scalability refers to the ability of the distributed file system to grow in
terms of the number of clients, servers, and the total volume of data it manages.
– Challenges:
• Data Distribution: As the system grows, the way files are distributed
across servers must be efficient to prevent bottlenecks and ensure
balanced resource usage.
• Metadata Management: Managing metadata (e.g., file names, directories,
permissions) becomes increasingly complex as the number of files and
clients increases. It requires efficient indexing and lookup mechanisms.
• Replication Management: The number of file replicas and their locations
must be dynamically adjusted to accommodate system growth.
7. Security
– Definition: Security in a distributed file system ensures that only authorized users
can access and modify files, and that the data is protected from unauthorized
access or corruption.
– Challenges:
• Authentication and Authorization: Ensuring that only authorized users
have access to certain files or directories is crucial.
• Data Integrity: The system must ensure that data is not tampered with
during transfer or storage, requiring encryption mechanisms.
• Privacy: Users must be assured that their data is not exposed to
unauthorized entities.
8. Resource Allocation
– Definition: Resource allocation in DFS refers to how storage, computation, and
network resources are managed across the distributed system.
– Challenges:
• Storage Allocation: As data is distributed across multiple servers, the
system must decide how much space to allocate for each file on each
server, ensuring efficient use of storage resources.
• Load Balancing: Ensuring that no server is overloaded while others are
underutilized. This requires dynamic allocation of resources based on
current demand.
• Fault Tolerant Allocation: Ensuring that resources are allocated in such a
way that the failure of one server does not affect the availability of data.
9. Synchronization and Coordination
– Definition: Synchronization and coordination refer to how processes in a
distributed file system coordinate their actions to ensure correct and consistent
access to files.
– Challenges:
• Clock Synchronization: Since there is no global clock, different nodes in a
DFS may have different notions of time. Ensuring coordination across
nodes without a common time reference is a challenge.
• Version Control: Managing multiple versions of files and ensuring that all
changes are properly versioned and tracked across the system is essential
for both consistency and recovery.

Solutions to Distributed File System Issues


1. Replication Techniques: To improve fault tolerance and performance, DFS often
replicates files across multiple nodes. However, challenges such as consistency and
synchronization need to be addressed, typically with protocols like quorum-based
replication or Paxos.

2. Caching and Prefetching: Caching frequently accessed files at the client side helps
reduce latency and improves performance. However, caching must be coordinated
with other nodes to avoid stale data issues.

3. Distributed Locking: Locking mechanisms like distributed locks or leases can be used to
ensure proper synchronization in the system, but they require careful handling to avoid
deadlocks.

4. Distributed Metadata Management: Using distributed hash tables (DHTs) or similar
structures can help manage metadata across multiple nodes efficiently. Techniques like
consistent hashing are commonly used to maintain load balance as the system scales.
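
To illustrate the last point, here is a minimal Python sketch of consistent hashing for placing
metadata entries on metadata servers; the server names, the use of MD5, and the number of
virtual nodes are assumptions made for the example, not requirements of any particular DFS.

    import hashlib
    from bisect import bisect_right

    class ConsistentHashRing:
        """Toy consistent-hash ring mapping keys (e.g., file paths) to metadata servers."""

        def __init__(self, nodes, virtual_nodes=100):
            # Each physical node gets several virtual points on the ring to balance load.
            self.ring = sorted(
                (self._hash(f"{node}#{i}"), node)
                for node in nodes
                for i in range(virtual_nodes)
            )
            self.keys = [h for h, _ in self.ring]

        @staticmethod
        def _hash(value):
            return int(hashlib.md5(value.encode()).hexdigest(), 16)

        def lookup(self, key):
            # Walk clockwise to the first virtual node at or after the key's hash.
            idx = bisect_right(self.keys, self._hash(key)) % len(self.ring)
            return self.ring[idx][1]

    ring = ConsistentHashRing(["mds1", "mds2", "mds3"])
    print(ring.lookup("/home/alice/report.txt"))   # one of the metadata servers

Because keys move only between neighbouring positions on the ring when a server joins or
leaves, only a small fraction of the metadata has to be reassigned as the system grows.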

Conclusion
The challenges of Distributed Resource Management in Distributed File Systems (DFS)
primarily revolve around ensuring transparency, fault tolerance, consistency, and performance
while maintaining security and scalability. The design of a distributed file system must address
the complexities of network communication, data replication, and concurrent access, all while
providing a seamless experience to users and ensuring data integrity in the face of failures.
Effective solutions to these issues are crucial for the smooth operation of large-scale distributed
systems like cloud storage, content delivery networks, and enterprise-scale file systems.

Mechanism for Building Distributed File Systems


Building a Distributed File System (DFS) involves multiple key mechanisms that ensure the
system can manage, access, and store files across a distributed network of machines. These
mechanisms must address various challenges such as data transparency, fault tolerance,
consistency, scalability, and security. Here's a detailed breakdown of the mechanisms involved
in building a DFS:

1. Architecture and Design of DFS


A distributed file system architecture is designed to achieve the following goals:

• Transparency: Hide the complexity of file distribution, so users interact with the system
as though they are working with a local file system.
• Scalability: The system should handle increasing numbers of files, clients, and servers
efficiently.
• Fault tolerance: Ensure the system remains functional even if some components fail.

Key Components:
• Client: The user-facing interface that interacts with the DFS.
• Server: The machine that stores the file data and handles file operations.
• Metadata Server (MDS): Manages the metadata (file names, directories, access rights,
etc.), separate from the data servers where actual file content resides.
• Data Server (DAS): Stores the actual data of the files.

The architecture typically involves:

1. Distributed Metadata: In large systems, metadata is distributed across multiple servers to
ensure scalability and prevent bottlenecks.
2. Data Replication: Data is replicated across multiple servers to ensure fault tolerance and
availability.
3. Caching: Frequently accessed data can be cached locally to reduce latency.

2. File Location and Naming Mechanisms


In a DFS, files are typically stored across multiple machines, and it is necessary to know where a
particular file is located. Two primary mechanisms for managing file locations are:

File Naming and Directory Structure


• Flat Naming: Every file has a unique identifier or name that maps to its location in the
system.
• Hierarchical Naming: Files are organized into directories, forming a tree-like structure,
where each directory can contain multiple files or subdirectories. This approach is similar
to local file systems.

File Location Information:


• Centralized File Location: In some DFS designs, a central server or database tracks
where each file is stored.
• Distributed File Location: In a distributed system, file locations can be tracked using
distributed hash tables (DHTs) or other distributed indexing techniques.

3. File Replication Mechanism


Replication is a crucial mechanism in DFS to ensure data availability and fault tolerance. Files can
be replicated on different servers or machines across the network.

Replication Types:
• Synchronous Replication: Each modification to a file is immediately reflected in all
replicas, ensuring consistency. However, this can reduce performance due to
synchronization overhead.
• Asynchronous Replication: Modifications to a file are first applied to the primary copy,
and later propagated to replicas. This is more efficient but introduces the risk of
temporary inconsistencies.

Replication Strategies:
• Quorum-Based Replication: Ensures that a file modification is only considered
successful if a majority (quorum) of replicas acknowledge the change. This prevents
conflicts between versions of the same file.
• Primary-Backup Replication: One server holds the primary copy, and others store
backup copies. In case of a failure, the backup can be promoted to the primary copy.

4. File Consistency and Synchronization


In a distributed environment, file consistency must be maintained across replicas. This can be
difficult because of network delays, partial failures, and concurrent access by multiple users or
processes.

Consistency Models:
• Strong Consistency: Every user sees the same version of a file, and all changes are
immediately visible to all users.
• Eventual Consistency: Updates to a file may not be immediately visible to all users, but
the system guarantees that eventually, all replicas will converge to the same state.
• Causal Consistency: If operation A happens before operation B, then all replicas will
reflect this ordering.
Locking and Concurrency Control:
• Distributed Locking: Locks can be placed on files to prevent concurrent modifications. A
process must acquire a lock before modifying a file, ensuring that only one process can
modify a file at a time.
• Optimistic Concurrency: Allows multiple processes to modify files concurrently, but
checks for conflicts when the changes are committed. If conflicts are detected, the
changes are rolled back.

5. Fault Tolerance and Recovery Mechanism


To build a resilient DFS, it is important to handle faults and recover gracefully. This includes
managing data loss, server failures, and network partitions.

Fault Tolerance Mechanisms:


• Data Replication: As mentioned earlier, files are replicated across multiple servers. If one
server fails, the data can still be accessed from other replicas.
• Failure Detection: The system must be able to detect when a server or component has
failed. Techniques such as heartbeats and timeouts are used for failure detection.
• Logging and Journaling: Changes to files are logged to ensure that in the event of a
failure, the system can recover to a consistent state.

Recovery Mechanisms:
• Checkpointing: The system periodically takes a snapshot of the system’s state (metadata
and data), allowing it to roll back to a previous consistent state in case of failure.
• Data Reconstruction: When data is lost due to a failure, the system can reconstruct the
data from the remaining replicas.

6. Caching and Prefetching Mechanism


To reduce latency and improve performance, DFS uses caching and prefetching mechanisms.

Caching:
• Client-Side Caching: Frequently accessed files or parts of files are cached on the client-
side to avoid repetitive network requests.
• Server-Side Caching: Servers can cache data that is frequently requested, reducing disk
I/O and improving response times.

Cache Consistency:
• Write-Invalidate Cache Protocol: When a file is modified, all cached copies are
invalidated to maintain consistency.
• Write-Update Cache Protocol: When a file is modified, all cached copies are updated
with the new version to maintain consistency.
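
The two policies can be summarised with a small, hypothetical coordinator that tracks which
client caches hold a block; the class and method names below are illustrative assumptions, not
an API of any real DFS.

    # Sketch of the two cache-consistency policies. The coordinator tracks, per block,
    # which client caches (plain dicts here) hold a copy.

    class CacheCoordinator:
        def __init__(self, policy="invalidate"):
            self.policy = policy             # "invalidate" or "update"
            self.holders = {}                # block_id -> list of client caches

        def register(self, block_id, client_cache):
            self.holders.setdefault(block_id, []).append(client_cache)

        def on_write(self, block_id, new_data, writer_cache):
            for cache in self.holders.get(block_id, []):
                if cache is writer_cache:
                    continue
                if self.policy == "invalidate":
                    cache.pop(block_id, None)       # write-invalidate: drop the stale copy
                else:
                    cache[block_id] = new_data      # write-update: push the new version

Write-invalidate keeps network traffic low at the cost of a later re-fetch, while write-update
keeps all caches warm at the cost of pushing every new version to every holder.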
Prefetching:
• Prefetching involves predicting which data or files will be requested next and loading
them into cache ahead of time. This helps reduce wait times and improves system
responsiveness.

7. Security Mechanisms
Security is essential in a DFS to prevent unauthorized access and ensure the integrity of data.

Authentication and Authorization:


• Authentication: Users must prove their identity before they can access the DFS. This can
be done using passwords, digital certificates, or biometric data.
• Authorization: After authentication, users must have the correct permissions to access
files or directories. Access control lists (ACLs) or role-based access control (RBAC) are
used to manage permissions.

Encryption:
• Data Encryption: Files can be encrypted both at rest (on disk) and in transit (over the
network) to prevent unauthorized access.
• Secure Channels: Transport Layer Security (TLS) or similar protocols are used to ensure
secure communication between clients and servers.

8. Metadata Management Mechanism


Metadata in a distributed file system includes information such as file names, file types,
permissions, and file locations. Managing metadata is crucial for efficiency and scalability.

Metadata Servers:
• In large DFS, metadata is often stored on a separate server (or set of servers) to ensure
that the data servers can focus on storing actual file data.
• Distributed Metadata Management: Metadata servers themselves can be distributed to
ensure that no single server becomes a bottleneck.

Directory Services:
• A DFS must efficiently manage file directories and support operations like searching,
creating, and deleting directories. Distributed hash tables (DHT) or similar techniques are
used to manage and locate files.

Conclusion
Building a Distributed File System involves combining several complex mechanisms to ensure
that files are stored, accessed, and managed efficiently across a distributed environment. By
focusing on transparency, fault tolerance, consistency, scalability, and security, DFS can
provide users with a reliable and efficient file storage solution. Key mechanisms include file
replication, metadata management, concurrency control, caching, and security protocols. These
mechanisms work together to offer a unified, fault-tolerant, and high-performance system for
handling files in distributed environments like cloud storage, enterprise systems, and large-
scale applications.

Design Issues in Distributed Shared Memory (DSM)


Distributed Shared Memory (DSM) allows processes on different nodes of a distributed system
to share data as if they were accessing a common physical memory. Below are the design issues
to consider when implementing DSM:

1. Granularity
This refers to the size of the memory block shared among processes.

• Fine-grained DSM: Shares small units of data, like bytes or words.

• Coarse-grained DSM: Shares larger units, like pages or segments.

Trade-off: Fine-grained DSM provides better concurrency but increases communication
overhead, while coarse-grained DSM minimizes communication but can lead to false sharing.

2. Memory Consistency Models


Defines how changes made by one process are visible to others. Common models include:

• Strict Consistency: All memory operations are instantaneously visible globally.

• Sequential Consistency: Operations appear to occur in the same sequential order on all
processes.

• Causal Consistency: Only causally related updates are visible in the same order.

• Release Consistency: Synchronization points dictate visibility.


Challenge: Striking a balance between strict consistency (which is difficult to achieve in
distributed systems) and relaxed consistency (which requires developer intervention for
synchronization).

3. Data Replication
DSM systems often replicate data to improve performance and availability.

• Advantages: Reduces latency and increases fault tolerance.

• Challenges: Ensuring consistency among replicas and resolving conflicts during concurrent
updates.

4. Data Coherence
This ensures that all processes have a consistent view of the shared memory.

• Coherence Protocols: Such as write-invalidate or write-update mechanisms.

• Challenge: Managing frequent updates while avoiding high communication costs.

5. Synchronization
Processes need to coordinate access to shared memory to avoid race conditions.
Synchronization methods include:

• Locks (e.g., mutex, spinlocks).

• Barriers (processes must wait at a barrier until all reach it).

Challenge: Minimizing the performance overhead caused by synchronization mechanisms.

6. Fault Tolerance
Distributed systems are prone to failures. DSM must handle:

• Node crashes.

• Network partitions.

• Data recovery mechanisms.


7. Scalability
As the number of nodes increases, communication overhead, consistency maintenance, and
synchronization complexity grow. DSM systems must address scalability through hierarchical
designs or relaxed consistency models.

8. Communication Overhead
Efficiently transferring shared memory updates over the network is crucial. Techniques like
batching updates or using multicast can reduce overhead.

Algorithm for Implementation of Distributed Shared Memory


Below is a generic algorithm for implementing DSM using a page-based approach:

Assumptions:
• Memory is divided into fixed-size pages.

• Each page has an owner node responsible for maintaining its consistency.

• Updates to shared memory are propagated using a consistency protocol.

Steps:
1. Initialization:
– Divide the shared memory into pages.

– Assign ownership of each page to a specific node.

2. Accessing Memory:
– Read Operation:

a. Check if the required page is available locally.

b. If not, send a request to the owner node for the page.

c. Owner node sends the latest version of the page.

d. Cache the page locally.

– Write Operation:
a. Check if the node is the owner of the page.

b. If not, request ownership from the current owner.

c. Owner invalidates other copies of the page and grants ownership.

d. Update the page locally.

3. Consistency Maintenance:
– Implement a coherence protocol (e.g., write-invalidate or write-update).

– Synchronize pages using the chosen memory consistency model.

4. Synchronization:
– Use locks or barriers to coordinate access to shared memory.

– Ensure mutual exclusion for critical sections.

5. Fault Tolerance:
– Periodically checkpoint memory state.

– Use replication to store redundant copies of critical pages.

Example Protocols:
• Write-Invalidate Protocol: When a process writes to a page, it invalidates all other
copies, forcing others to fetch the latest version when needed.

• Write-Update Protocol: Updates are propagated to all copies immediately.
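
The following is a compact Python sketch of the page-ownership steps above combined with the
write-invalidate protocol. Nodes reference each other directly here purely for illustration; a real
DSM would exchange these requests as network messages.

    # Sketch of page-based DSM with single-writer ownership and write-invalidate.

    class DsmSystem:
        def __init__(self):
            self.nodes = []
            self.owner = {}                  # page_id -> owning DsmNode

    class DsmNode:
        def __init__(self, name, system):
            self.name = name
            self.system = system
            self.pages = {}                  # locally cached page copies
            system.nodes.append(self)

        def read(self, page_id):
            if page_id not in self.pages:                      # read fault
                owner = self.system.owner[page_id]
                self.pages[page_id] = owner.pages[page_id]     # fetch the latest copy and cache it
            return self.pages[page_id]

        def write(self, page_id, value):
            if self.system.owner.get(page_id) is not self:     # write fault
                for node in self.system.nodes:                 # write-invalidate:
                    if node is not self:
                        node.pages.pop(page_id, None)          # drop every other copy
                self.system.owner[page_id] = self              # take ownership of the page
            self.pages[page_id] = value

    dsm = DsmSystem()
    a, b = DsmNode("A", dsm), DsmNode("B", dsm)
    dsm.owner["p0"] = a
    a.pages["p0"] = 0
    b.write("p0", 42)        # B becomes the owner; A's cached copy is invalidated
    print(a.read("p0"))      # A re-fetches the page from the new owner -> 42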

Conclusion
Implementing DSM requires careful consideration of trade-offs among granularity, consistency,
and communication overhead. While DSM simplifies programming by abstracting memory
sharing, achieving optimal performance and reliability in a distributed environment is
challenging.

Failure Recovery in Distributed Systems


Failure recovery in distributed systems ensures that a system can continue functioning correctly
or restore itself to a consistent state after a failure. The two main approaches are backward
recovery and forward recovery.
Concepts in Backward and Forward Recovery
1. Backward Recovery
Backward recovery involves rolling the system back to a previous consistent state before the
failure occurred.

Techniques:

• Checkpointing: Periodically save the state of the system.

• Logging: Record events or changes to allow replaying operations from a saved state.

• Rollback: When a failure occurs, revert the system to the last checkpoint.

Advantages:

• Simple to implement.

• Ensures consistency by restoring known states.

Disadvantages:

• May lose recent updates or progress made since the last checkpoint.

• Inefficient for long-running processes due to frequent rollback needs.

2. Forward Recovery
Forward recovery involves transitioning the system to a new, valid state after a failure without
rolling back.

Techniques:

• Error Correction: Use redundancy or algorithms to fix corrupted data.

• Alternate Execution Pathways: Shift operations to a different workflow to bypass the failure.

• Graceful Degradation: Operate with reduced functionality instead of failing completely.

Advantages:

• No loss of progress.

• Efficient for systems requiring real-time or continuous operation.

Disadvantages:

• Complex implementation.
• Requires precise error detection and correction mechanisms.

Recovery in Concurrent Systems


Recovery in concurrent systems involves additional complexities due to multiple processes
running simultaneously, often interacting with shared resources.

1. Challenges in Concurrent Recovery


• Interdependencies: Failures in one process can propagate to others.

• Deadlocks: Recovery mechanisms may introduce deadlocks.

• Inconsistencies: Shared resource access can lead to partial or corrupt updates.

2. Techniques for Recovery in Concurrent Systems


Checkpointing and Rollback
• Processes periodically save their state.

• If a failure occurs, rollback restores all processes to the latest consistent checkpoint.

• Coordinated Checkpointing: All processes synchronize their checkpoints to maintain
consistency.

• Uncoordinated Checkpointing: Each process checkpoints independently, but this can lead to a
"domino effect," where a single rollback cascades through the system.

Message Logging
• Log inter-process messages to help restore the system to a consistent state.

• Types of message logging:


– Pessimistic Logging: Logs are written synchronously with message exchanges to
ensure recoverability.
– Optimistic Logging: Logs are written asynchronously for better performance but
risk lost messages during failure.
– Causal Logging: Combines aspects of both pessimistic and optimistic logging,
ensuring consistency while reducing overhead.

Dependency Tracking
• Track dependencies among processes or transactions.
• During recovery, ensure that dependent operations are restored in the correct order to
maintain consistency.

Recovery Lines
• Identify a consistent state across all processes (called a recovery line) where no
dependencies are violated.

• Processes are rolled back only to this recovery line to avoid unnecessary rollbacks.

Isolation Mechanisms
• Use locks or transactions to ensure that processes recover in isolation without
interfering with others.

• Example: In database systems, ACID properties ensure recovery without compromising
consistency.

Comparison of Backward and Forward Recovery in Concurrent Systems

Feature        | Backward Recovery                            | Forward Recovery
State Handling | Rolls back to a previous state.              | Moves forward to a new valid state.
Complexity     | Easier to implement.                         | More complex.
Progress Loss  | May lose recent progress.                    | Retains progress.
Use Case       | Suitable for static or less dynamic systems. | Ideal for real-time, critical systems.

Conclusion
Failure recovery in distributed systems is critical for maintaining availability, consistency, and
reliability. While backward recovery is simpler and widely used, forward recovery is essential for
real-time systems that cannot afford rollback delays. In concurrent systems, techniques like
coordinated checkpointing, message logging, and dependency tracking are employed to address
interdependencies and ensure a consistent recovery.

Obtaining Consistent Checkpoints in Distributed Systems


A checkpoint is a snapshot of the system state saved periodically for recovery purposes. In
distributed systems, ensuring consistency across all nodes is challenging due to concurrent
operations and inter-process communication.
1. Challenges in Consistent Checkpointing
• Concurrency: Multiple processes run simultaneously and may interact.

• Message Passing: In-flight messages (messages sent but not yet delivered) can cause
inconsistencies.

• Domino Effect: Rolling back one process might require rolling back others, leading to
cascading rollbacks.

2. Approaches for Obtaining Consistent Checkpoints


Two main approaches are used:

a. Coordinated Checkpointing
All processes in the distributed system synchronize their checkpointing efforts to ensure a
consistent global state.

Steps:

1. A coordinator process initiates the checkpointing process.


2. All processes pause their activities and record their state.
3. Any in-transit messages are saved as part of the checkpoint.
4. Once all processes have checkpointed, normal operations resume.

Advantages:

• Guarantees a consistent global checkpoint.

• Avoids the domino effect.

Disadvantages:

• Requires synchronization, which can introduce delays.

• Not suitable for systems requiring high availability.

b. Uncoordinated Checkpointing
Each process checkpoints independently, without coordination with others.

Steps:

1. Each process periodically saves its state.

2. During recovery, processes use their most recent checkpoints.

3. To handle inconsistencies, rollback mechanisms (e.g., replay logs) are employed.


Advantages:

• Low overhead, no synchronization required.

• Checkpointing happens more frequently.

Disadvantages:

• Risk of the domino effect.

• May require complex rollback strategies.

c. Communication-Induced Checkpointing
Combines the benefits of coordinated and uncoordinated checkpointing. Processes take local
checkpoints and occasionally coordinate based on communication patterns.

Advantages:

• Reduces the frequency of global coordination.

• Prevents the domino effect.

Disadvantages:

• Complexity in detecting when coordination is needed.

d. Algorithm Example: Chandy-Lamport Algorithm


The Chandy-Lamport algorithm is a popular technique for obtaining consistent checkpoints.

Steps:

1. A checkpoint marker is sent to all processes.

2. Upon receiving a marker, each process:


– Records its state (local checkpoint).

– Records the state of in-transit messages.

– Sends the marker to its neighbors.

3. The process is complete when all processes have recorded their states and message
channels.

Key Feature: Ensures a consistent global state without requiring processes to pause completely.
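
A simplified Python sketch of the marker rule at a single process is given below, assuming FIFO
channels and a send(destination, message) function supplied by the messaging layer. The class
and message shapes are assumptions made for this example, not part of the algorithm's original
specification.

    # Sketch of the Chandy-Lamport marker rule at one process.

    class SnapshotProcess:
        def __init__(self, pid, in_channels, out_channels, send):
            self.pid = pid
            self.state = {}                                     # local application state
            self.out_channels = out_channels
            self.send = send                                    # send(dest, message)
            self.recorded_state = None
            self.recording = {c: False for c in in_channels}    # per-channel recording flags
            self.channel_logs = {c: [] for c in in_channels}    # captured in-transit messages

        def _record_and_flood(self):
            self.recorded_state = dict(self.state)              # record own state
            for dest in self.out_channels:
                self.send(dest, ("marker", self.pid))           # marker on every outgoing channel

        def start_snapshot(self):
            self._record_and_flood()
            for c in self.recording:
                self.recording[c] = True                        # record on all incoming channels

        def on_marker(self, channel):
            if self.recorded_state is None:                     # first marker seen
                self._record_and_flood()
                for c in self.recording:
                    self.recording[c] = (c != channel)          # the marker's channel stays empty
            else:
                self.recording[channel] = False                 # stop recording this channel

        def on_app_message(self, channel, payload):
            if self.recording.get(channel):
                self.channel_logs[channel].append(payload)      # part of the channel state
            # ... then apply payload to self.state as the application normally would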
Recovery in Distributed Database Systems
Distributed databases must recover from failures while maintaining consistency and availability.
Recovery mechanisms address failures in:

1. Transactions: Partial execution due to failures.

2. Processes: Crashes or unavailability of nodes.

3. Communication: Lost or delayed messages.

1. Types of Failures
• System Failures: Affect individual nodes (e.g., crashes).

• Media Failures: Affect the storage medium (e.g., disk corruption).

• Communication Failures: Interruptions in the network.

2. Key Concepts in Distributed Database Recovery


a. Atomicity in Transactions
Ensures that a transaction is either fully completed or not performed at all.

• Commit Protocols: Ensure consistency during recovery.


– Two-Phase Commit (2PC):
• Phase 1 (Prepare): The coordinator asks all participating nodes to prepare
for commit.

• Phase 2 (Commit/Rollback): If all nodes agree, the coordinator commits; otherwise, it aborts.
Challenges:
• Blocking problem: If the coordinator fails, participants may wait
indefinitely.
– Three-Phase Commit (3PC): Adds a timeout mechanism to address the blocking
problem in 2PC.

b. Logging
Logs record all transaction operations for recovery purposes.

• Undo Logging: For rollback operations.

• Redo Logging: For reapplying operations during recovery.

• Undo/Redo Logging: Combines both for complete recovery.


c. Checkpointing in Databases
Similar to checkpointing in distributed systems, databases periodically save their state.
Checkpoints reduce the amount of log replay needed during recovery.

3. Recovery Algorithms
a. Transaction-Oriented Recovery

• Rollback: Undo incomplete transactions.

• Redo: Reapply operations of committed transactions.

b. Log-Based Recovery

• Analysis Phase: Identify which transactions need undo or redo.

• Redo Phase: Reapply all operations for committed transactions.

• Undo Phase: Roll back incomplete transactions.

c. Shadow Paging
Uses a shadow copy of the database:

• Updates are applied to a shadow version.

• Once a transaction commits, the shadow becomes the main version.

Advantages:

• No need for undo logs.

Disadvantages:

• High storage overhead.

4. Recovery Protocols
• Synchronous Recovery: All processes and nodes recover together, ensuring consistency.

• Asynchronous Recovery: Nodes recover independently, potentially leading to inconsistencies
that must be resolved later.

Conclusion
• Obtaining consistent checkpoints and recovery in distributed database systems is
essential to handle failures while maintaining system consistency and availability.
• Checkpointing techniques like the Chandy-Lamport algorithm ensure global consistency,
while recovery mechanisms such as 2PC and logging provide robust transaction-level
recovery.
Fault Tolerance in Distributed Systems
Fault tolerance is the ability of a system to continue operating correctly in the event of failures. It
is crucial in distributed systems due to the inherent complexity and higher probability of
component failures.

Issues in Fault Tolerance


1. Types of Faults
• Transient Faults: Temporary faults that disappear after some time (e.g., network
congestion).

• Intermittent Faults: Faults that occur sporadically and may reappear (e.g., hardware
glitches).

• Permanent Faults: Faults that persist until repaired (e.g., node crashes).

2. Key Challenges
1. Fault Detection
– Identifying faulty components in a distributed environment.

– Techniques: Heartbeats, timeout mechanisms.


2. Fault Masking
– Hiding the effects of faults from users or other system components.

– Techniques: Redundancy (e.g., replication).


3. Recovery Mechanisms
– Ensuring the system can return to a consistent state.

– Techniques: Checkpointing, rollback recovery.


4. Consistency
– Ensuring all nodes agree on the system state after a fault.

– Example: Consensus protocols like Paxos or Raft.


5. Overheads
– Fault tolerance mechanisms often introduce communication and computation
overhead. Balancing fault tolerance with performance is critical.

Commit Protocols
Commit protocols are used to ensure atomicity in distributed transactions, meaning a
transaction either completes successfully across all nodes or aborts entirely.
1. Two-Phase Commit Protocol (2PC)
A popular protocol for distributed transaction management.

Phases
1. Prepare Phase
– The coordinator sends a "prepare to commit" message to all participants.

– Participants perform local checks and respond with "Yes" (ready to commit) or
"No" (cannot commit).
2. Commit/Abort Phase
– If all participants vote "Yes," the coordinator sends a "commit" message.

– Otherwise, it sends an "abort" message.

– Participants execute the decision and send acknowledgments.

Advantages
• Ensures atomicity of transactions.

• Simple to implement.

Disadvantages
• Blocking Problem: If the coordinator fails during the commit phase, participants might
remain in an uncertain state.

• High latency due to multiple rounds of communication.
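
To make the message flow concrete, here is a minimal Python sketch of the coordinator side of
2PC. The prepare(), commit(), and abort() calls on each participant are assumed stand-ins for
messages sent over the network, not a real database API.

    # Minimal sketch of a Two-Phase Commit coordinator.

    def two_phase_commit(participants):
        # Phase 1 (Prepare/Voting): collect votes from all participants.
        votes = []
        for p in participants:
            try:
                votes.append(p.prepare())      # returns True ("Yes") or False ("No")
            except Exception:                  # an unreachable participant counts as "No"
                votes.append(False)

        # Phase 2 (Commit/Abort): decide based on the votes.
        if all(votes):
            decision = "commit"
            for p in participants:
                p.commit()                     # participants make their changes durable
        else:
            decision = "abort"
            for p in participants:
                p.abort()                      # participants roll back their local changes
        return decision

If the coordinator crashes between the two phases, participants that already voted "Yes" cannot
decide on their own, which is exactly the blocking problem noted above.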

2. Three-Phase Commit Protocol (3PC)


An enhancement of 2PC to address the blocking problem.

Phases
1. Prepare Phase
– Similar to 2PC.
2. Pre-Commit Phase
– If all participants vote "Yes," the coordinator sends a "pre-commit" message.

– Participants acknowledge the pre-commit.


3. Commit/Abort Phase
– If the coordinator fails, participants can independently decide based on the pre-
commit acknowledgment.

– This avoids indefinite blocking.


Advantages
• Non-blocking under certain failure scenarios.

Disadvantages
• More complex and requires additional communication.

• Still prone to network partition issues.

Voting Protocols
Voting protocols are used to ensure consistency in replicated systems. Each replica has a vote,
and operations are committed based on the majority consensus.

1. Basic Voting Protocol


• Each replica in the system has a vote.

• An operation (e.g., read or write) is executed only if it receives votes from a quorum
(majority) of replicas.

Advantages
• Ensures strong consistency.

• Fault-tolerant as long as a quorum can be formed.

Disadvantages
• High overhead due to frequent voting.

• Susceptible to performance degradation in large systems.

2. Read/Write Quorum
• Read Quorum (R): Minimum number of replicas that must respond for a read operation.

• Write Quorum (W): Minimum number of replicas that must acknowledge a write operation.

• To ensure consistency: R + W > N, where N is the total number of replicas.
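
A tiny sketch of the quorum arithmetic is shown below. The extra condition 2W > N, which keeps
two concurrent writes from both obtaining a quorum, is a commonly used addition rather than
something required by the rule above.

    # Quorum sanity check: R + W > N guarantees every read quorum overlaps every
    # write quorum, so a read sees at least one replica holding the latest write.
    # 2*W > N (often added) prevents two concurrent writes from both reaching a quorum.

    def valid_quorum(n_replicas, read_quorum, write_quorum):
        overlap = read_quorum + write_quorum > n_replicas
        no_write_conflict = 2 * write_quorum > n_replicas
        return overlap and no_write_conflict

    print(valid_quorum(5, 3, 3))   # True:  R=3, W=3, N=5
    print(valid_quorum(5, 2, 2))   # False: two writes of size 2 need not overlap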

Dynamic Voting Protocols


Dynamic voting protocols adapt to changes in the system, such as failures or network partitions.
1. Key Concepts
• Voting rights are dynamically adjusted based on system conditions.

• Faulty or unavailable nodes lose their voting rights.

2. Types of Dynamic Voting Protocols


a. Majority Voting
• Votes are redistributed among operational replicas.

• A new majority is formed dynamically to maintain quorum.

b. Weighted Voting
• Each replica is assigned a weight based on its reliability or other criteria.

• Operations require a weighted quorum instead of a simple majority.

c. Tree-Based Voting
• Replicas are organized hierarchically (e.g., as a tree).

• Voting is performed at different levels to reduce communication overhead.

Advantages of Dynamic Voting


• Increases fault tolerance by adapting to failures.

• Reduces overhead by excluding failed nodes from voting.

Disadvantages of Dynamic Voting


• Complex to implement and manage.

• Requires additional mechanisms for detecting and handling failures.

Comparison of Commit Protocols and Voting Protocols


Feature         | Commit Protocols                            | Voting Protocols
Focus           | Ensures atomicity of transactions.          | Maintains consistency in replication.
Mechanism       | Coordinator-based communication.            | Majority-based consensus.
Fault Tolerance | Prone to blocking in 2PC.                   | Adapts better with dynamic voting.
Overhead        | High due to multiple communication rounds.  | High for frequent voting.

Conclusion
Fault tolerance is critical for distributed systems, ensuring they remain reliable and consistent
despite failures. Commit protocols like 2PC and 3PC focus on transaction atomicity, while voting
protocols ensure consistency in replicated systems. Dynamic voting extends traditional voting
by adapting to failures, making it more suitable for highly dynamic environments. The choice of
protocol depends on system requirements, failure models, and performance constraints.

Transactions and Concurrency Control


In distributed systems, transactions and concurrency control are fundamental concepts to
ensure data consistency, integrity, and system performance in multi-user environments.

Transactions
A transaction is a sequence of operations performed as a single logical unit of work. It must
satisfy the ACID properties:

1. Atomicity: All operations within a transaction are completed, or none are applied.
2. Consistency: The database remains consistent before and after a transaction.
3. Isolation: Transactions do not interfere with each other.
4. Durability: Once committed, changes are permanent.

Nested Transactions
Nested transactions allow transactions to have a hierarchical structure with sub-transactions.

Structure
• The main transaction is called the parent transaction.

• Sub-transactions are called child transactions.


Key Features
• A parent transaction can commit only if all its child transactions commit.

• Child transactions can fail without causing the entire parent transaction to fail, allowing
partial rollback and error recovery.

Advantages
• Modularity: Complex operations are broken into smaller, manageable units.

• Failure Isolation: Faults in sub-transactions are contained, improving reliability.

Disadvantages
• Complexity in implementation.

• Requires advanced concurrency control to manage dependencies.

Concurrency Control
Concurrency control ensures that multiple transactions can execute concurrently without
violating the consistency and isolation properties.

Challenges in Concurrency Control


• Lost Update: Two transactions overwrite each other's updates.

• Dirty Read: A transaction reads uncommitted changes from another transaction.

• Non-Repeatable Read: A transaction reads different values for the same data due to
other transactions modifying it.

• Phantom Reads: A transaction retrieves different sets of rows in repeated queries.

Concurrency Control Methods


1. Locks
2. Optimistic Concurrency Control
3. Timestamp Ordering

1. Locks
Locks are used to control access to shared data.

Types of Locks
1. Exclusive Lock (Write Lock): Allows only one transaction to access the data for writing.
2. Shared Lock (Read Lock): Multiple transactions can read the data concurrently but
cannot write.

Locking Protocols
• Two-Phase Locking (2PL):
– Growing Phase: A transaction acquires all locks it needs.

– Shrinking Phase: A transaction releases locks but cannot acquire new ones.
• Strict 2PL: All locks are held until the transaction commits or aborts, ensuring
serializability.

Advantages
• Ensures consistency and prevents lost updates.

• Compatible with various database systems.

Disadvantages
• Can lead to deadlocks (circular waiting on resources).

• Performance overhead due to contention for locks.
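
Below is a minimal Python sketch of a lock table for strict 2PL with shared and exclusive modes.
Deadlock handling is deliberately omitted here (it is discussed later), and all names are
illustrative assumptions.

    # Sketch of a strict 2PL lock table: shared (S) and exclusive (X) locks,
    # all held until the owning transaction commits or aborts.

    class LockTable:
        def __init__(self):
            self.locks = {}        # item -> {"mode": "S" or "X", "owners": set of txn ids}

        def acquire(self, txn, item, mode):
            entry = self.locks.get(item)
            if entry is None:
                self.locks[item] = {"mode": mode, "owners": {txn}}
                return True
            if mode == "S" and entry["mode"] == "S":
                entry["owners"].add(txn)                    # shared locks are compatible
                return True
            if entry["owners"] == {txn}:                    # sole owner: allow (and upgrade)
                entry["mode"] = "X" if mode == "X" else entry["mode"]
                return True
            return False                                    # conflict: caller must wait or abort

        def release_all(self, txn):
            # Strict 2PL: called only at commit/abort, releasing every lock the txn holds.
            for item in list(self.locks):
                entry = self.locks[item]
                entry["owners"].discard(txn)
                if not entry["owners"]:
                    del self.locks[item]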

2. Optimistic Concurrency Control (OCC)


Optimistic Concurrency Control assumes that conflicts are rare and allows transactions to
execute without restrictions until they commit.

Phases
1. Read Phase: Transactions read data and perform operations locally without locking.

2. Validation Phase: Before committing, the system checks for conflicts with other
transactions.

3. Write Phase: If validation succeeds, the changes are applied; otherwise, the transaction
is aborted.

Advantages
• No locking overhead, improving performance in systems with low contention.

• Suitable for read-heavy workloads.

Disadvantages
• High abort rates in write-heavy or high-contention environments.

• Validation requires additional overhead.


3. Timestamp Ordering
Transactions are assigned unique timestamps based on their start time. The system ensures
that transactions are executed in timestamp order to maintain consistency.

Rules
1. Read Rule: A transaction can read a data item only if its timestamp is greater than (or equal
to) the item's last write timestamp; otherwise the read is rejected and the transaction is aborted.

2. Write Rule: A transaction can write a data item only if its timestamp is greater than (or equal
to) both the last read and write timestamps of the item; otherwise the write is rejected and the
transaction is aborted.

Advantages
• Ensures serializability without locks.

• Avoids deadlocks.

Disadvantages
• High overhead in maintaining and checking timestamps.

• Inefficient in scenarios with frequent conflicts.
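
The read and write rules can be captured in a short Python sketch where every item carries its
last read and write timestamps; a violating operation raises an error, signalling that the
transaction should abort and restart with a new timestamp.

    # Sketch of basic timestamp ordering.

    class TimestampOrderingError(Exception):
        pass

    class TOItem:
        def __init__(self, value):
            self.value = value
            self.read_ts = 0
            self.write_ts = 0

    def read(item, txn_ts):
        if txn_ts < item.write_ts:              # a newer transaction already wrote: abort
            raise TimestampOrderingError("read rejected; abort transaction")
        item.read_ts = max(item.read_ts, txn_ts)
        return item.value

    def write(item, txn_ts, value):
        if txn_ts < item.read_ts or txn_ts < item.write_ts:   # would invalidate a newer operation
            raise TimestampOrderingError("write rejected; abort transaction")
        item.write_ts = txn_ts
        item.value = value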

Comparison of Concurrency Control Methods


Feature              | Locks                         | Optimistic Concurrency Control                                           | Timestamp Ordering
Mechanism            | Restricts access via locks.   | Executes transactions optimistically and validates at commit time.      | Orders transactions by timestamps.
Overhead             | High (lock management).       | Low in low-contention systems; high validation cost in high contention. | High (timestamp maintenance).
Deadlock Possibility | Yes                           | No                                                                       | No
Abort Rates          | Low                           | High in write-heavy systems.                                             | High in frequent conflict scenarios.
Use Case             | Suitable for mixed workloads. | Best for read-heavy, low-contention systems.                             | Effective for low-conflict environments.

Conclusion
Concurrency control methods address the challenges of maintaining consistency and isolation in
distributed systems. Locks are reliable but can cause deadlocks, while OCC is ideal for read-
heavy systems with low contention. Timestamp ordering provides serializability without locks
but is less efficient in high-conflict scenarios. The choice of method depends on the workload
characteristics and system requirements.

Distributed Transactions
A distributed transaction involves multiple nodes or databases participating in a single logical
transaction. These systems ensure consistency and atomicity across distributed resources
despite failures.

Flat vs. Nested Distributed Transactions


1. Flat Distributed Transactions
• A flat transaction spans multiple distributed resources but does not have a hierarchical
structure.
• All operations are at the same level and treated as a single unit of work.

Features:

• Simpler structure.
• Commit or rollback affects the entire transaction.
• Lack of modularity for complex operations.

Example: Transferring funds between two accounts in separate databases.

2. Nested Distributed Transactions


• A nested transaction has a hierarchical structure with parent and child transactions.

• The parent transaction oversees the entire transaction, while child transactions handle
specific parts.

Features:

• Modularity: Each child transaction can be independently committed or rolled back.

• Failure Isolation: A failure in a child transaction may not necessarily cause the parent
transaction to fail.

Example: A travel booking system where booking a flight, hotel, and car rental are managed by
child transactions under a single parent transaction.

Key Considerations:

• Parent commits only if all child transactions commit.

• Rollback can propagate upward, affecting parent and other child transactions.
Atomic Commit Protocols
Atomic commit protocols ensure that a distributed transaction either commits across all nodes
or rolls back entirely, maintaining the atomicity property.

1. Two-Phase Commit Protocol (2PC)


A widely used atomic commit protocol in distributed transactions.

Phases:

1. Prepare Phase:
– The coordinator sends a "Prepare to Commit" request to all participants.

– Participants respond with "Yes" (ready to commit) or "No" (cannot commit).


2. Commit Phase:
– If all participants respond "Yes," the coordinator sends a "Commit" message.

– If any participant responds "No," it sends an "Abort" message.

Advantages:

• Guarantees atomicity and consistency.

Disadvantages:

• Blocking Problem: Participants can become stuck in a waiting state if the coordinator
fails.

• High communication overhead.

2. Three-Phase Commit Protocol (3PC)


An enhancement of 2PC to address the blocking problem.

Phases:

1. Prepare Phase:
– Similar to 2PC. Participants indicate their readiness to commit.
2. Pre-Commit Phase:
– If all participants are ready, the coordinator sends a "Pre-Commit" message,
signaling the transition towards committing.

– Participants acknowledge this message.


3. Commit Phase:
– The coordinator sends a final "Commit" message if all participants acknowledge
the pre-commit phase.
Advantages:

• Avoids blocking by ensuring that participants can decide on their own if the coordinator
fails.

Disadvantages:

• More complex and requires additional communication.

• Still vulnerable to network partitions.

3. Paxos Commit Protocol


Paxos is a consensus-based protocol designed to handle failures in distributed systems.

Key Idea:

• Consensus is reached among participants before committing.

• Ensures consistency even with partial failures.

Advantages:

• High fault tolerance.

• Non-blocking.

Disadvantages:

• Significant overhead in achieving consensus.

Comparison of Commit Protocols


Feature         | Two-Phase Commit | Three-Phase Commit | Paxos Commit
Phases          | 2                | 3                  | Multiple
Blocking        | Yes              | No                 | No
Complexity      | Low              | Medium             | High
Fault Tolerance | Limited          | Improved           | High
Overhead        | Moderate         | High               | Very High

Conclusion
Distributed transactions, whether flat or nested, require robust commit protocols to ensure
atomicity and consistency across distributed resources. 2PC is the simplest and most widely
used protocol, but it suffers from blocking issues. 3PC addresses some of these limitations,
while Paxos Commit provides the highest fault tolerance but at a significant cost in complexity
and overhead. The choice of protocol depends on the system’s requirements, including fault
tolerance, performance, and scalability.

Concurrency Control in Distributed Transactions


Concurrency control in distributed systems ensures that multiple transactions executing across
different nodes do not violate consistency or isolation. This is crucial in distributed systems, as
multiple nodes may access or modify shared resources simultaneously.

Concurrency Control Methods in Distributed Transactions


1. Distributed Locks
• Similar to centralized systems but distributed across nodes.
• Nodes use lock managers to grant shared or exclusive locks.

Key Challenges:

• Deadlocks: Circular waiting across nodes.


• Scalability: Managing locks in large systems.

2. Timestamp Ordering
• Each transaction is assigned a unique timestamp upon initiation.
• Transactions execute based on timestamp order to ensure serializability.

Rules:

1. A transaction Ti can read data item x only if Ti's timestamp is greater than (or equal to) x's
last write timestamp.
2. A transaction Ti can write to x only if Ti's timestamp is greater than (or equal to) the last
read and write timestamps of x.

Advantages:

• Avoids locks and deadlocks.

Disadvantages:

• High overhead to maintain timestamps.


• Prone to transaction aborts in high-contention systems.

3. Distributed Two-Phase Locking (D2PL)


• Extends the two-phase locking (2PL) protocol to distributed systems.
• Nodes coordinate locking to ensure all required locks are acquired before execution
begins.

Phases:
1. Growing Phase: Locks are acquired but not released.
2. Shrinking Phase: Locks are released but not acquired.

Advantages:

• Ensures serializability.

Disadvantages:

• Deadlock-prone in distributed environments.


• High communication overhead.

4. Optimistic Concurrency Control (OCC)


• Transactions execute without restrictions, and conflicts are resolved during the commit
phase.

Phases:

1. Read Phase: Transactions read and perform local computations.


2. Validation Phase: Conflicts are checked with other transactions.
3. Write Phase: If validation succeeds, changes are applied.

Advantages:

• Suitable for read-heavy, low-conflict systems.

Disadvantages:

• High abort rates in write-heavy systems.

Distributed Deadlocks
Deadlocks occur when transactions in a distributed system wait for each other indefinitely to
release resources, creating a circular wait.

Deadlock Conditions (Coffman Conditions)


1. Mutual Exclusion: A resource is held by one transaction at a time.
2. Hold and Wait: Transactions hold resources while waiting for others.
3. No Preemption: Resources cannot be forcibly taken from transactions.
4. Circular Wait: A cycle of transactions exists, each waiting for a resource held by the next.

Deadlock Detection in Distributed Systems


• Distributed deadlocks are challenging to detect due to the absence of a global view.

Techniques:
1. Wait-for Graph (WFG):
– Nodes maintain a local WFG to track dependencies.
– These local graphs are periodically merged to detect global cycles.
2. Edge Chasing:
– Nodes propagate special messages called probes along dependency edges.
– A deadlock is detected if a probe returns to its origin.
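
A centralized simulation of the probe idea is sketched below: probes are forwarded along
wait-for edges, and a deadlock is reported when a probe reaches the transaction that initiated it.
The waits_for dictionary stands in for dependency information that, in a real system, would be
spread across nodes.

    # Sketch of edge-chasing deadlock detection (centralized simulation).
    # waits_for maps a transaction to the transactions it is waiting on.

    def detect_deadlock(waits_for, initiator):
        visited = set()
        frontier = list(waits_for.get(initiator, []))    # send the initial probes
        while frontier:
            holder = frontier.pop()
            if holder == initiator:
                return True                               # probe came back: a cycle exists
            if holder in visited:
                continue
            visited.add(holder)
            frontier.extend(waits_for.get(holder, []))    # forward the probe
        return False

    waits_for = {"T1": ["T2"], "T2": ["T3"], "T3": ["T1"]}
    print(detect_deadlock(waits_for, "T1"))   # True: T1 -> T2 -> T3 -> T1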

Deadlock Prevention
• Timeouts: Transactions are aborted if they exceed a predefined timeout period.
• Wait-Die Scheme:
– Older transactions can wait for younger transactions, but younger ones abort if
they conflict with older transactions.
• Wound-Wait Scheme:
– Younger transactions can wait for older ones, but older transactions abort
younger ones.
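
The two schemes differ only in what happens when the requesting transaction is older (has a
smaller timestamp) than the lock holder, as the sketch below shows.

    # Sketch of wait-die and wound-wait. Smaller timestamp = older transaction.
    # Each function returns what the *requesting* transaction does on a lock conflict.

    def wait_die(requester_ts, holder_ts):
        # Older requesters may wait; younger requesters die (abort and restart later).
        return "wait" if requester_ts < holder_ts else "abort"

    def wound_wait(requester_ts, holder_ts):
        # Older requesters wound (abort) the younger holder; younger requesters wait.
        return "preempt holder" if requester_ts < holder_ts else "wait"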

Transaction Recovery
In distributed systems, transaction recovery ensures the system can return to a consistent state
after failures.

Types of Failures
1. Transaction Failures:
– Logic errors, deadlocks, or manual aborts.
2. System Failures:
– Crashes that disrupt transaction processing.
3. Media Failures:
– Disk crashes causing data loss.

Recovery Mechanisms
1. Checkpointing
• Periodically save the system state to stable storage.
• In case of failure, transactions restart from the last checkpoint.

Types:

1. Coordinated Checkpointing:
– All nodes take a consistent snapshot simultaneously.
2. Uncoordinated Checkpointing:
– Nodes take independent snapshots, requiring additional mechanisms to resolve
inconsistencies.
2. Log-Based Recovery
• Transactions maintain logs of operations in stable storage.

Logs Types:

1. Undo Log: Records old values to undo changes during a rollback.


2. Redo Log: Records new values to redo changes during recovery.

Recovery Process
1. Undo Recovery: Reverse the effects of incomplete transactions using undo logs.
2. Redo Recovery: Reapply changes of committed transactions using redo logs.
3. Undo-Redo Recovery: Combines both approaches for flexibility.
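
A compact Python sketch of undo/redo recovery over an in-memory log is shown below; the log
record shapes (begin/write/commit tuples) are assumptions made for the example.

    # Sketch of undo/redo log recovery. Record shapes (illustrative):
    #   ("begin", txn) | ("write", txn, item, old_value, new_value) | ("commit", txn)

    def recover(log, database):
        committed = {rec[1] for rec in log if rec[0] == "commit"}

        # Redo phase: reapply writes of committed transactions in log order.
        for record in log:
            if record[0] == "write" and record[1] in committed:
                _, _, item, _, new_value = record
                database[item] = new_value

        # Undo phase: roll back writes of uncommitted transactions in reverse order.
        for record in reversed(log):
            if record[0] == "write" and record[1] not in committed:
                _, _, item, old_value, _ = record
                database[item] = old_value
        return database

    log = [("begin", "T1"), ("write", "T1", "x", 0, 5), ("commit", "T1"),
           ("begin", "T2"), ("write", "T2", "y", 1, 9)]          # T2 never committed
    print(recover(log, {"x": 0, "y": 1}))                        # {'x': 5, 'y': 1}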

3. Distributed Commit Protocols


• Protocols like Two-Phase Commit (2PC) ensure all participants in a distributed
transaction commit or abort together.

Comparison of Techniques
Feature    | Concurrency Control                    | Deadlocks                        | Recovery
Goal       | Maintain consistency and isolation.    | Avoid or resolve circular waits. | Return to a consistent state post-failure.
Techniques | Locks, OCC, Timestamping.              | WFG, Edge Chasing.               | Checkpointing, Logging.
Overhead   | High in high-contention environments.  | High in deadlock-prone systems.  | Moderate (depends on frequency).

Conclusion
Effective management of concurrency, deadlocks, and recovery in distributed transactions
ensures reliability and performance. Concurrency control ensures isolation and consistency,
deadlock management prevents or resolves circular waits, and transaction recovery ensures
the system remains consistent even after failures. The choice of techniques depends on system
requirements, workload characteristics, and failure models.

Replication in Distributed Systems


Replication involves creating and maintaining copies of data or services across multiple nodes to
improve availability, fault tolerance, and performance in distributed systems. It ensures data
consistency and reliability even in the face of system failures.
System Model for Replication
A replication system comprises:

1. Nodes: Servers or databases holding replicated data or services.


2. Clients: Applications or users accessing the data or services.
3. Replication Manager: Ensures consistency and synchronization among replicas.

Key Concepts:

• Replica: A copy of the data or service.


• Primary-Backup Model: A primary replica handles updates, and backups synchronize
with it.
• Active Replication: All replicas handle requests simultaneously.
• Passive Replication: A single replica (primary) handles requests, and updates are
propagated to backups.

Consistency Models in Replication


1. Strong Consistency: Updates are visible to all replicas before any client reads the data.
2. Eventual Consistency: Updates propagate asynchronously, and replicas become
consistent over time.
3. Causal Consistency: Ensures that related operations are executed in the same order at
all replicas.

Group Communication in Replication


Group communication provides a framework for replicas to exchange messages and maintain
consistency.

Key Properties of Group Communication


1. Atomicity:
– Messages are delivered to all group members or none.
2. Reliability:
– Messages are guaranteed to be delivered, even in case of node failures.
3. Order:
– FIFO Ordering: Messages from a sender are delivered in the order they were
sent.
– Causal Ordering: Messages reflecting causal dependencies are delivered in the
same order.
– Total Ordering: All replicas receive messages in the same order.

Mechanisms in Group Communication


1. Multicast Communication:
– Messages are sent to a group of replicas simultaneously.
2. Leader Election:
– A leader is chosen among replicas to coordinate updates and resolve conflicts.

Fault Tolerance in Replication


Fault tolerance ensures a system continues functioning correctly even when some of its
components fail.

Fault Tolerance Techniques


1. Redundancy:
– Multiple replicas ensure that the system can withstand node or service failures.
2. Failure Detection:
– Mechanisms like heartbeat messages or timeout detection identify failed nodes (a small sketch follows this list).
3. Recovery Mechanisms:
– Failed replicas are brought back in sync with the latest state using checkpointing
or logs.
4. Quorums:
– A subset of replicas (quorum) is required for updates or reads, balancing
availability and consistency.
– Write Quorum: Ensures updates propagate to a sufficient number of replicas.
– Read Quorum: Ensures reads return the latest version of the data.
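
As a small illustration of the failure-detection point above, the sketch below marks a replica as
suspected once its last heartbeat is older than a timeout; the timeout value and data layout are
assumptions made for the example.

    # Sketch of timeout-based failure detection from heartbeats.
    # last_heartbeat maps a replica name to the time its latest heartbeat arrived.

    import time

    def suspected_failures(last_heartbeat, timeout_seconds=5.0, now=None):
        now = time.time() if now is None else now
        return [node for node, last in last_heartbeat.items()
                if now - last > timeout_seconds]

    # Usage sketch: replica "r2" has not sent a heartbeat for 12 seconds.
    now = time.time()
    print(suspected_failures({"r1": now - 1.0, "r2": now - 12.0},
                             timeout_seconds=5.0, now=now))      # ['r2']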

Fault Tolerance in Replicated Systems


1. Replication Strategies:
– Active Replication: All replicas process updates simultaneously. Faults are
masked by redundant replicas.
– Passive Replication: A primary replica processes updates, and backups are
synchronized.
2. Agreement Protocols:
– Consensus Protocols: Ensure replicas agree on the state of the system despite
failures. Examples include Paxos and Raft.
– Commit Protocols: Ensure atomic updates across replicas. Examples include
Two-Phase Commit (2PC).

Advantages of Replication
1. High Availability:
– The system remains operational even if some replicas fail.
2. Improved Performance:
– Load is distributed among replicas, reducing response times.
3. Fault Tolerance:
– Redundant replicas ensure data reliability and recovery.

Challenges in Replication
1. Consistency:
– Maintaining synchronization across replicas is complex.
2. Latency:
– Propagation of updates to all replicas may increase response times.
3. Partitioning:
– Network failures can isolate replicas, causing inconsistencies.
4. Overheads:
– Synchronization and communication between replicas add resource overheads.

Summary Table
Aspect              | Description
System Model        | Defines replicas, clients, and replication manager.
Group Communication | Ensures reliable, ordered, and atomic message delivery among replicas.
Fault Tolerance     | Provides mechanisms for detection, recovery, and consensus.
Advantages          | High availability, improved performance, and fault tolerance.
Challenges          | Consistency, latency, partitioning, and overheads.

Conclusion
Replication in distributed systems is vital for fault tolerance and high availability. By leveraging
group communication and fault-tolerance mechanisms, systems can ensure consistent and
reliable operations even in the presence of failures. However, achieving this requires careful
trade-offs between consistency, availability, and performance, tailored to specific application
needs.
