Distributed Systems: Detailed Notes
UNIT 1
Key features:
• Resources such as data, hardware (printers, servers), and software can be shared among
users across the system.
• Users can access resources regardless of their location, provided they have the necessary
permissions.
Examples of distributed systems include online banking, cloud computing platforms, and e-
commerce systems like Amazon.
Resource Sharing on the Web: The Web itself is a distributed system designed for resource
sharing: web servers host pages, images, and services that any authorized browser, anywhere,
can retrieve by URL.
In summary, distributed systems are critical for modern computing, enabling resource sharing
and scalability. However, designing and maintaining such systems requires overcoming
significant challenges related to heterogeneity, security, and fault tolerance.
2. Fundamental Models
Fundamental models describe the properties and assumptions of distributed systems.
1. Lamport's Logical Clocks
Logical clocks are used to establish a consistent order of events in a distributed system where no
global clock exists.
Key Principles:
1. Increment Rule:
– Each process increments its clock before executing an event.
2. Message Rule:
– When a process sends a message, it includes its current timestamp.
– The receiving process updates its clock to the maximum of its current clock and
the timestamp in the message, then increments it.
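A minimal sketch of these two rules in Python (the class and method names are illustrative, not from the notes):
```python
class LamportClock:
    """Minimal Lamport logical clock (illustrative sketch)."""

    def __init__(self):
        self.time = 0

    def tick(self):
        # Increment Rule: advance the clock before each local event.
        self.time += 1
        return self.time

    def send(self):
        # Message Rule (sender): stamp the outgoing message.
        return self.tick()

    def receive(self, msg_timestamp):
        # Message Rule (receiver): take the maximum, then increment.
        self.time = max(self.time, msg_timestamp) + 1
        return self.time
```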
Advantages:
• Simple and effective for ordering events.
• Ensures causality: If event A causes event B, then A will always have a smaller timestamp
than B.
Limitation:
• Does not capture concurrency: Two independent events can have the same timestamp.
2. Vector Clocks
Vector clocks extend Lamport’s logical clocks to capture concurrency and causality more
effectively.
Advantages:
• Tracks causality and concurrency effectively.
• Helps determine if two events are related or independent.
Comparison of Events:
• ( A → B ) if every component of ( V(A) ) is ≤ the corresponding component of ( V(B) ), and
at least one component is strictly smaller.
• If neither vector dominates the other, the two events are concurrent.
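A hedged sketch of a per-process vector clock and the comparison rules above (names are illustrative):
```python
class VectorClock:
    """Vector clock for one process in a group of n (illustrative sketch)."""

    def __init__(self, n, pid):
        self.v = [0] * n   # one counter per process
        self.pid = pid

    def tick(self):
        self.v[self.pid] += 1

    def send(self):
        self.tick()
        return list(self.v)          # timestamp carried by the message

    def receive(self, msg_v):
        # Component-wise maximum, then count the receive event itself.
        self.v = [max(a, b) for a, b in zip(self.v, msg_v)]
        self.tick()

def happened_before(va, vb):
    """True if the event stamped va causally precedes the event stamped vb."""
    return all(a <= b for a, b in zip(va, vb)) and va != vb

def concurrent(va, vb):
    return not happened_before(va, vb) and not happened_before(vb, va)
```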
1. Causal Order
Causal order in message passing refers to ensuring that messages are delivered in a way that
respects the cause-and-effect relationship between events. This is based on the happens-
before (→) relationship introduced by Leslie Lamport.
Happens-before Relationship:
• If an event ( A ) happens before ( B ) in the same process, then ( A → B ).
• If ( A ) is the sending of a message and ( B ) is the receipt of that message, then ( A → B ).
• Transitivity: If ( A → B ) and ( B → C ), then ( A → C ).
Example:
• Process ( P_1 ) broadcasts message ( m_1 ) to ( P_2 ) and ( P_3 ).
• ( P_2 ) processes ( m_1 ) and sends message ( m_2 ) to ( P_3 ).
• ( P_3 ) must deliver ( m_1 ) before ( m_2 ), even though ( m_2 ) may arrive first and ( P_3 )
never directly interacts with ( P_1 ).
Implementation:
• Vector clocks are typically used to enforce causal ordering.
2. Total Order
Total order ensures that all processes in the system agree on the order of messages, even if
they are unrelated. This means every process delivers messages in the same sequence.
Total Ordering Rules:
• If two messages ( m_1 ) and ( m_2 ) are sent concurrently, they are assigned an order
based on a global criterion (e.g., message timestamps or process IDs).
• All processes deliver ( m_1 ) and ( m_2 ) in the same order, regardless of when they are
received.
Example:
• If ( P_1 ) sends ( m_1 ) and ( P_2 ) sends ( m_2 ), every process that receives both must
deliver them in the same agreed order.
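One common way to realize such a global criterion is to order messages by the pair (Lamport timestamp, process ID). The snippet below illustrates only the tie-breaking idea, not a full total-order multicast protocol:
```python
# Order messages by (Lamport timestamp, sender ID): ties on the timestamp
# are broken deterministically by the unique process ID, so every process
# sorts concurrent messages the same way.
messages = [(5, "P2", "m2"), (5, "P1", "m1"), (3, "P3", "m3")]
delivery_order = sorted(messages)   # tuple comparison: timestamp, then ID
# -> m3 (ts 3), then m1 (P1 before P2 at ts 5), then m2
```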
3. Total Causal Order
Total causal order combines the two guarantees: all processes deliver messages in one agreed
sequence that also respects causality.
4. Message Ordering
Message ordering ensures that messages are delivered and processed in a sequence that
respects certain criteria. Types include:
1. FIFO Order:
– Messages sent by a process are delivered in the order they were sent.
– Example: If ( P_1 ) sends ( m_1 ) followed by ( m_2 ) to ( P_2 ), ( P_2 ) must deliver
( m_1 ) before ( m_2 ).
2. Causal Order:
– Respects the cause-and-effect relationship between messages.
3. Total Order:
– Ensures all processes agree on the order of messages.
4. Unordered:
– No guarantees on the sequence of message delivery.
Comparison:
Order Type     Guarantees
FIFO           Per-process order
Causal         Cause-effect relationships are respected
Total          Uniform order across all processes
Total Causal   Both causal and total order are preserved
5. Causal Ordering of Messages
Causal ordering ensures that messages are delivered in a way that respects the cause-effect
relationship. It does not require a strict total order but focuses on preserving the logic of
causality.
6. Global State
The global state of a distributed system is the combined state of all processes and channels in
the system at a given point in time.
Uses:
• Debugging distributed systems.
• Deadlock detection.
• Analyzing distributed computations.
7. Termination Detection
Termination detection involves determining when all processes in a distributed system have
completed their tasks and the system can safely shut down.
Challenges:
• Processes may not know the state of others.
• Some processes may be idle, waiting for messages.
Algorithms for Termination Detection:
1. Dijkstra-Scholten Algorithm:
– Tracks dependencies among processes.
– When the initiating process detects no more dependencies, termination is
confirmed.
2. Token-Passing Algorithm:
– A token circulates in the system.
– When the token returns to the initiator with no active processes, termination is
detected.
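A simplified sketch of the token-passing idea in Python; real algorithms (e.g., Dijkstra's ring algorithm) additionally color the token and processes to catch reactivation by in-flight messages, which this sketch omits:
```python
from collections import namedtuple

Proc = namedtuple("Proc", "name active")

def ring_termination_check(processes):
    """One circulation of a termination token around a logical ring.

    Returns True if no process was active when the token visited it,
    in which case the initiator may declare termination.
    """
    token_clean = True
    for p in processes:          # the token visits each process in ring order
        if p.active:
            token_clean = False  # someone is still working this round
    return token_clean

print(ring_termination_check([Proc("P1", False), Proc("P2", False)]))  # True
```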
Applications:
• Distributed computations.
• Coordinating system shutdowns.
Summary
Concept                 Description
Causal Order            Ensures messages are delivered respecting the cause-and-effect relationship.
Total Order             Guarantees all processes deliver messages in the same sequence.
Total Causal Order      Combines causal and total order.
Message Ordering        Various types like FIFO, causal, total, and unordered messaging.
Global State            Combined state of all processes and channels at a specific point in time.
Termination Detection   Identifies when all processes in a distributed system have completed their tasks.
Understanding these concepts is crucial for designing and implementing robust distributed
systems that function efficiently and maintain consistency across processes.
UNIT 2
Distributed Mutual Exclusion (DME)
Distributed Mutual Exclusion ensures that multiple processes in a distributed system can access
a shared resource mutually exclusively—that is, one process at a time, without interference
from others.
The goal of Distributed Mutual Exclusion (DME) is to manage concurrent access to the shared
resource efficiently and ensure correctness. Key requirements include:
1. Mutual Exclusion:
– Only one process can access the critical section (CS) at any given time.
2. Fairness:
– No process should experience starvation.
– Requests should be granted in the order they were made (FIFO).
3. Deadlock Freedom:
– The system must avoid deadlock situations where no progress can be made.
4. Fault Tolerance:
– The system should handle failures gracefully without compromising correctness.
5. Scalability:
– The algorithm must perform efficiently as the number of processes increases.
6. Message Overhead:
– The number of messages exchanged should be minimized to reduce network
traffic.
A. Permission-Based Algorithms
• Processes request permission from others before entering the CS.
• Based on granting or denying access.
Examples:
1. Centralized Algorithm:
– A single coordinator grants or denies access to the CS.
– Steps:
• A process sends a request to the coordinator.
• The coordinator grants access if the CS is free or queues the request.
– Advantages:
• Simple to implement.
• Low message overhead (3 messages per CS entry: request, grant, and
release).
– Disadvantages:
• Single point of failure (coordinator crash).
• Bottleneck at the coordinator.
2. Distributed Algorithm (Ricart-Agrawala Algorithm):
– All processes communicate directly to determine access.
– Steps:
• A process broadcasts a timestamped request message to all other processes.
• Each process replies immediately if it is neither in the CS nor requesting it
with higher priority; otherwise it defers the reply until it exits the CS.
• Access is granted when the requesting process has received replies from all
others.
– Advantages:
• No single point of failure.
– Disadvantages:
• High message overhead (( 2(n-1) ) messages for ( n ) processes).
3. Token-Based Algorithm:
– A unique token circulates in the system, granting access to the CS.
– Steps:
• A process must hold the token to enter the CS.
• After exiting, the token is passed to the next requesting process.
– Advantages:
• Minimal message overhead.
– Disadvantages:
• Token loss requires additional recovery mechanisms.
B. Non-Token-Based Algorithms
• These algorithms rely on message passing but do not use a token for granting access.
Example:
• Lamport’s Algorithm:
– Processes use logical timestamps to order requests.
– Steps:
i. A process sends a request with a timestamp to all others.
ii. A process replies with a grant only if it has not requested the CS or if the
sender’s timestamp is earlier.
iii. Access is granted when the requesting process receives replies from all
others.
– Advantages:
• Ensures causality using Lamport clocks.
– Disadvantages:
• High message overhead (( 3(n-1) ) messages for ( n ) processes).
C. Hybrid Algorithms
• Combine features of token-based and non-token-based approaches.
• Reduce message overhead while maintaining fault tolerance.
Example:
• Suzuki-Kasami Algorithm:
– Uses a token, but a process broadcasts a request to all other processes only
when it needs the CS and does not currently hold the token.
Conclusion
Distributed Mutual Exclusion ensures fairness, prevents deadlocks, and maintains consistency in
distributed systems. The choice of algorithm depends on the system’s requirements, such as
scalability, fault tolerance, and message overhead.
Below, we explain these algorithms, their mechanisms, examples, and performance metrics in
detail.
1. Token-Based Algorithms
Concept
• A unique token circulates in the system. The token acts as a "permission slip" to access
the critical section (CS).
• A process must hold the token to enter the CS.
• After exiting the CS, the process passes the token to the next requesting process (if any).
Key Features
• Low message overhead: Only one message is required to transfer the token.
• Fault tolerance mechanisms are needed for token recovery in case of loss.
How it Works
1. A process requests the token if it doesn't have it.
2. If the token is held by another process, it waits until the token is available.
3. The process that holds the token uses it to enter the CS.
4. Once done, it either keeps the token (if no other requests exist) or passes it to the next
requester.
Examples
1. Suzuki-Kasami Algorithm:
– A process broadcasts its request to all other processes when it needs the CS
but does not hold the token.
– Processes maintain a request queue to manage token requests.
2. Raymond's Tree-Based Algorithm:
– Organizes processes in a logical tree structure.
– Token requests propagate up the tree, and the token is passed down when
granted.
Advantages
• Low message overhead: Only one message is needed per CS entry.
• Fairness: Requests are served in FIFO order.
• Efficiency: Scalable in large systems.
Disadvantages
• Token loss: If the token is lost, recovery is complex and time-consuming.
• Fault tolerance: Token-based systems need additional mechanisms to handle node or
process failures.
2. Non-Token-Based Algorithms
Concept
• No token is used. Instead, processes communicate via messages to coordinate access to
the CS.
• Requests are granted based on logical clocks or a distributed agreement protocol.
Key Features
• High message overhead, as all processes must participate in the coordination.
• No risk of token loss, but algorithms may face higher delays.
How it Works
1. A process broadcasts a request message to all other processes when it wants to enter
the CS.
2. Other processes reply with a grant message if their conditions are met.
3. The requesting process can enter the CS only after receiving all necessary replies.
4. Once done, the process sends a release message to inform others.
Examples
1. Lamport’s Algorithm:
– Requests are timestamped using Lamport logical clocks.
– Processes compare timestamps to decide the order of granting access.
2. Ricart-Agrawala Algorithm:
– Similar to Lamport’s algorithm but reduces messages by using direct replies
instead of broadcasting releases.
– Processes grant requests if they are not in the CS and the requester has the
earliest timestamp.
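The core of Ricart-Agrawala is the decision a process makes when a request arrives: reply immediately, or defer until after leaving the CS. A minimal Python sketch of that rule (the class layout and names are assumptions, and message sending is stubbed out with a print):
```python
class RAProcess:
    """Ricart-Agrawala reply rule (illustrative sketch)."""

    def __init__(self, pid):
        self.pid = pid
        self.state = "RELEASED"   # RELEASED, WANTED, or HELD
        self.my_ts = None         # timestamp of our own pending request
        self.deferred = []        # replies to send after leaving the CS

    def on_request(self, req_ts, req_pid):
        # Defer the reply if we are in the CS, or we are requesting it and
        # our (timestamp, pid) pair is smaller (= higher priority).
        mine = (self.my_ts, self.pid)
        theirs = (req_ts, req_pid)
        if self.state == "HELD" or (self.state == "WANTED" and mine < theirs):
            self.deferred.append(req_pid)
        else:
            self.send_reply(req_pid)

    def send_reply(self, pid):
        print(f"{self.pid} -> REPLY -> {pid}")   # stand-in for a real message
```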
Advantages
• No token loss: Eliminates the need for token recovery mechanisms.
• Fault tolerance: Handles process crashes more gracefully compared to token-based
approaches.
Disadvantages
• High message complexity: Requires multiple messages for coordination.
• Scalability issues: Performance degrades as the number of processes increases.
Performance Comparison
1. Message Complexity
• Definition: The number of messages exchanged per critical section entry.
• Comparison:
– Token-Based: Requires only 1 message (token transfer) in the best case.
– Non-Token-Based: Typically requires 2(n-1) messages for ( n ) processes (request
and reply).
2. Synchronization Delay
• Definition: The time taken between when a process exits the CS and when the next
process enters.
• Comparison:
– Token-Based: Very low (1 message delay).
– Non-Token-Based: Higher due to the time required for message exchanges.
3. Fault Tolerance
• Definition: The ability of the algorithm to handle failures (e.g., node crashes or token
loss).
• Comparison:
– Token-Based: Token loss requires recovery mechanisms, which increase
complexity.
– Non-Token-Based: Better fault tolerance as there is no dependency on a token.
4. Fairness
• Definition: Ensuring requests are granted in the order they are made.
• Comparison:
– Token-Based: Maintains fairness using FIFO queues.
– Non-Token-Based: Achieves fairness through logical timestamps.
5. Scalability
• Definition: The ability of the algorithm to handle an increasing number of processes.
• Comparison:
– Token-Based: Scales better due to low message overhead.
– Non-Token-Based: Message complexity increases quadratically, making it less
scalable.
6. Deadlock Freedom
• Definition: The system must avoid situations where no progress can be made.
• Comparison:
– Token-Based: Deadlock can occur if the token is lost.
– Non-Token-Based: Deadlock is avoided by careful message ordering.
Choosing an Approach
Token-Based Algorithms
• Preferred when low message overhead and scalability are the priority, and token loss is
rare or recoverable.
Non-Token-Based Algorithms
• Preferred when reliability is essential, and the system can tolerate higher message
complexity.
• Suitable for smaller distributed systems with frequent failures.
6. Conclusion
Both token-based and non-token-based algorithms have their strengths and weaknesses. The
choice of algorithm depends on system requirements such as scalability, fault tolerance,
message overhead, and fairness. Token-based algorithms excel in low message complexity and
scalability but face challenges in token loss. Non-token-based algorithms offer robustness and
fairness but at the cost of higher message overhead. Understanding the trade-offs is crucial for
designing an efficient distributed mutual exclusion solution.
Deadlock Detection in Distributed Systems
This section covers the system model, types of deadlocks, and strategies for resource and
communication deadlock detection in detail.
1. System Model
Key Components
1. Processes:
– Act as entities requesting and releasing resources.
– Communicate with one another to coordinate actions or data sharing.
2. Resources:
– Can be physical (e.g., printers, files) or logical (e.g., locks, database records).
– Resources may be allocated to processes or requested by them.
3. Wait-for Graph (WFG):
– A directed graph where nodes represent processes, and edges represent "wait-
for" relationships.
– If Process ( P1 ) is waiting for a resource held by ( P2 ), there is an edge from ( P1 )
to ( P2 ).
– A cycle in this graph indicates a potential deadlock.
2. Resource Deadlocks vs. Communication Deadlocks
Deadlocks in distributed systems can broadly be classified into two categories based on their
cause:
A. Resource Deadlocks
• Definition: Occur when processes compete for finite resources in a circular wait
condition.
• Example:
– Process ( P1 ) holds Resource ( R1 ) and requests ( R2 ).
– Process ( P2 ) holds Resource ( R2 ) and requests ( R1 ).
– Both processes are stuck in a circular wait.
• Characteristics:
– Involves physical or logical resources.
– The deadlock can be visualized in a WFG.
B. Communication Deadlocks
• Definition: Occur when processes are waiting indefinitely for messages or
acknowledgments from one another in a cyclic dependency.
• Example:
– Process ( P1 ) is waiting for a message from ( P2 ), and ( P2 ) is waiting for a
message from ( P1 ).
• Characteristics:
– Involves message passing and acknowledgments.
– Typically occurs in distributed systems with synchronous communication.
3. Detecting Resource Deadlocks
1. System Representation:
– Represent the system using a WFG, where nodes are processes and resources,
and edges represent resource dependencies.
– Detect cycles in the graph.
2. Algorithms for Detection:
– Centralized Algorithm:
• One process acts as a coordinator and maintains the WFG.
• Periodically checks for cycles in the graph.
– Distributed Algorithm:
• Each process maintains partial information about the WFG.
• Use distributed cycle-detection algorithms to find cycles across multiple
nodes.
– Hierarchical Algorithm:
• Divide the system into regions, each with its own coordinator.
• Coordinators communicate to detect inter-region deadlocks.
3. Example:
– Chandy-Misra-Haas Algorithm:
• A distributed algorithm for detecting resource deadlocks.
• Works by propagating "deadlock probe messages" through the system. If
a probe returns to the initiating process, a deadlock is detected.
4. Detecting Communication Deadlocks
1. System Representation:
– Represent the system using a dependency graph, where nodes are processes and
edges represent communication dependencies.
– Detect cycles or unreachable nodes.
2. Algorithms for Detection:
– Timeout-Based Detection:
• Use timeouts to detect delays in message acknowledgments. If a process
waits beyond the timeout, it is suspected to be in a deadlock.
– Dependency Graph Analysis:
• Construct and analyze a graph of communication dependencies.
• Use cycle detection to identify deadlocks.
3. Example:
– Obermarck's Algorithm:
• Detects communication deadlocks by analyzing dependency graphs.
• Tracks dependencies among processes and checks for cycles.
5. Handling Deadlocks
After detecting a deadlock, the system must resolve it using one of the following techniques:
1. Resource Preemption:
– Forcefully take a resource away from a process to break the cycle.
2. Process Termination:
– Terminate one or more processes involved in the deadlock to release resources.
3. Rollback:
– Roll back one or more processes to a previous state to break the dependency
chain.
4. Avoidance Mechanisms:
– Use strategies like the Banker's Algorithm to prevent deadlocks by ensuring the
system never enters an unsafe state.
Conclusion
Deadlock detection in distributed systems is essential for maintaining system efficiency and
availability. Understanding the differences between resource and communication deadlocks,
along with the appropriate detection algorithms, is critical for designing reliable distributed
systems. By leveraging centralized, distributed, or hierarchical techniques, systems can
efficiently detect and resolve deadlocks, ensuring smooth operation.
1. Deadlock Prevention
Deadlock prevention ensures that at least one of the necessary conditions for deadlock cannot
occur. The four necessary conditions for deadlock are:
1. Mutual Exclusion: At least one resource is held in a non-shareable mode.
2. Hold and Wait: A process holds resources while waiting for others.
3. No Preemption: Resources cannot be forcibly taken away from a process.
4. Circular Wait: A cycle of processes exists, each waiting for a resource held by the next.
Trade-offs:
• Prevention mechanisms can reduce resource utilization and system efficiency.
• Over-conservative measures may lead to resource starvation.
2. Deadlock Avoidance
Deadlock avoidance dynamically ensures that the system never enters an unsafe state where
deadlocks might occur. This approach requires knowledge of future resource requests.
Banker’s Algorithm:
• Widely used for deadlock avoidance in centralized systems.
• Assumes:
– Each process declares its maximum resource needs in advance.
– The system grants resource requests only if the resulting state is safe.
• Safe State:
– A state is safe if a sequence exists where all processes can complete without
deadlock.
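A compact sketch of the Banker's safety check described above (the data layout is assumed: one list per process for current allocation and remaining need):
```python
def is_safe(available, allocation, need):
    """Banker's safety check: True if some completion order exists.

    available: free units per resource type.
    allocation[i], need[i]: holdings / remaining needs of process i.
    """
    work = list(available)
    finish = [False] * len(allocation)
    progress = True
    while progress:
        progress = False
        for i, done in enumerate(finish):
            if not done and all(n <= w for n, w in zip(need[i], work)):
                # Process i can run to completion and return its resources.
                work = [w + a for w, a in zip(work, allocation[i])]
                finish[i] = True
                progress = True
    return all(finish)

# One process holding [0, 1] with need [1, 0], and 1 unit of R1 free: safe.
print(is_safe([1, 0], [[0, 1]], [[1, 0]]))  # True
```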
3. Deadlock Detection
In contrast to prevention or avoidance, detection allows the system to enter a deadlocked state
but detects and resolves it later. Deadlock detection algorithms rely on identifying cycles or
knots in resource allocation graphs.
A. Centralized Deadlock Detection
1. Steps:
– The coordinator gathers the wait-for graph (WFG) from all processes.
– Periodically checks for cycles in the WFG.
– A cycle indicates a deadlock.
2. Example:
– Wait-for Graph (WFG):
• Nodes represent processes.
• Directed edges represent dependencies (e.g., Process ( P1 ) waits for
Process ( P2 )).
3. Advantages:
– Simpler implementation due to a single monitoring point.
– Easier cycle detection algorithms.
4. Disadvantages:
– Single point of failure.
– Not scalable for large systems due to high communication overhead.
B. Distributed Deadlock Detection
In distributed systems, no single coordinator manages resources. Deadlock detection requires
cooperation among nodes.
1. Approach:
– Each node maintains partial knowledge of the WFG.
– Nodes communicate to detect cycles in the global graph.
2. Algorithms:
– Path-Pushing Algorithm:
• Nodes exchange dependency paths.
• If a node receives its own dependency path, a deadlock is detected.
– Edge-Chasing Algorithm:
• Nodes send "probe messages" along dependency edges.
• If a probe returns to the initiator, a cycle (deadlock) exists.
3. Advantages:
– No single point of failure.
– Better scalability.
4. Disadvantages:
– High communication overhead.
– Susceptible to delays and inconsistencies due to distributed nature.
4. Deadlock Resolution
Once a deadlock is detected, the system must resolve it to restore functionality. Common
strategies include:
A. Resource Preemption
• Forcibly take resources from a process and allocate them to others.
• Requires rollback mechanisms to handle preempted processes.
B. Process Termination
• Terminate one or more processes involved in the deadlock.
• Criteria for selection:
– Priority: Terminate the lowest-priority process.
– Resource Usage: Terminate the process holding the most resources.
– Rollback Cost: Terminate the process with the lowest rollback cost.
C. Rollback
• Rollback one or more processes to a previous state where they were not part of the
deadlock.
Conclusion
Deadlock management in distributed systems requires careful balancing of prevention,
avoidance, detection, and resolution techniques. While prevention and avoidance reduce the
likelihood of deadlocks, detection and resolution focus on addressing them when they occur.
The choice of strategy depends on the system's scale, resource availability, and the cost of
communication and computation.
Path-Pushing Algorithm
How It Works
• Each process maintains information about its own dependencies.
• When a process cannot proceed because it's waiting for a resource, it sends dependency
paths to other processes involved.
• Dependency paths are propagated along the WFG.
• If a process receives a dependency path that contains itself, a cycle is detected, indicating
a deadlock.
Example
1. Process ( P1 ) waits for ( P2 ), so ( P1 ) sends the path ( [P1] ) to ( P2 ).
2. ( P2 ) is waiting for ( P3 ), so it appends itself and sends ( [P1, P2] ) to ( P3 ).
3. ( P3 ) is waiting for ( P1 ), so it appends itself and sends ( [P1, P2, P3] ) back to ( P1 ).
4. ( P1 ) detects its own ID in the path ( [P1, P2, P3] ), confirming a deadlock.
Advantages
• Simple conceptually.
• Detects deadlocks accurately when paths propagate correctly.
Disadvantages
• High communication overhead due to the transmission of dependency paths.
• Paths may grow large in size as the number of dependencies increases.
• Performance may degrade in large systems.
Edge-Chasing Algorithm
How It Works
• A probe message is initiated by a process that is waiting for a resource.
• The probe travels through the WFG following dependency edges.
• If the probe returns to the initiator, a cycle is detected, indicating a deadlock.
Example
1. ( P1 ) waits for ( P2 ), so ( P1 ) sends the probe ⟨P1, P1, P2⟩ to ( P2 ).
2. ( P2 ) waits for ( P3 ), so ( P2 ) sends ⟨P1, P2, P3⟩ to ( P3 ).
3. ( P3 ) waits for ( P1 ), so ( P3 ) sends ⟨P1, P3, P1⟩ back to ( P1 ).
4. ( P1 ) detects its ID in the probe ⟨P1, P3, P1⟩, confirming a deadlock.
Advantages
• Efficient in terms of message size (probe messages are small).
• Only requires processes to handle and forward messages, making it lightweight.
Disadvantages
• Requires timely delivery of messages; delays may hinder detection.
• May generate many probe messages, increasing network load.
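A toy sketch of edge chasing for the single-resource model, in which each process waits on at most one other; the wait-for map and process names mirror the example above:
```python
def detect_deadlock(initiator, waits_for):
    """Follow wait-for edges from `initiator`, like a forwarded probe.

    waits_for: dict mapping a process to the process it waits on (or None).
    If the probe reaches the initiator again, a cycle (deadlock) exists.
    """
    probe = initiator
    visited = set()
    while probe in waits_for and waits_for[probe] is not None:
        probe = waits_for[probe]          # forward the probe along the edge
        if probe == initiator:
            return True                   # probe returned: deadlock
        if probe in visited:              # cycle that excludes the initiator
            return False
        visited.add(probe)
    return False

# Example from the notes: P1 -> P2 -> P3 -> P1
print(detect_deadlock("P1", {"P1": "P2", "P2": "P3", "P3": "P1"}))  # True
```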
Conclusion
Both Path Pushing and Edge Chasing are effective in distributed deadlock detection, but they
are suited to different scenarios. Path Pushing works better in systems with fewer dependencies
but suffers from high message overhead as dependencies grow. Edge Chasing, on the other
hand, is more scalable and efficient in terms of communication but requires careful handling of
message delays and potential network congestion. The choice of algorithm depends on the
system's scale, complexity, and communication characteristics.
UNIT 3
Agreement Protocols
Processes in a distributed system must often reach a common decision despite:
• Communication delays
• Faulty processes
• Network partitions
• Malicious actors (Byzantine failures)
Key Goals of Agreement Protocols
• Consistency: All non-faulty processes must agree on the same value.
• Validity: The agreed value must be a valid and permissible one (e.g., one proposed by a
process).
• Termination: Every process must eventually decide, ensuring that the protocol
concludes.
Types of Failures
1. Crash Failures: Processes fail by halting and never recover.
2. Omission Failures: Messages are lost due to process or network failures.
3. Arbitrary/Byzantine Failures: Processes behave maliciously or unpredictably, sending
conflicting or invalid messages.
Communication Models
1. Synchronous Systems:
– Fixed upper bound on message delivery time.
– Processes execute steps within a known time.
– Easier to achieve agreement due to predictable behavior.
2. Asynchronous Systems:
– No bound on message delivery time or process execution speed.
– More challenging for agreement protocols, especially with failures.
a) Consensus Problem
Processes must agree on a single value, which could be a value proposed by any of the
processes. Common in distributed databases and leader election.
b) Byzantine Agreement Problem
Scenario
Imagine multiple generals (processes) planning an attack. They need to coordinate and agree on
whether to attack or retreat. Some generals might act maliciously, sending contradictory or false
messages. The goal is for all loyal generals to agree on the same action.
Requirements
1. Agreement: All non-faulty processes agree on the same value.
2. Validity: If all non-faulty processes propose the same value, they must agree on it.
3. Fault Tolerance: The protocol should handle up to ( f ) faulty processes.
Assumptions
• At least ( 3f + 1 ) processes are required to tolerate ( f ) Byzantine failures.
• The system is synchronous, ensuring predictable communication and computation.
Example
1. General ( G1 ) proposes a plan (e.g., attack).
2. Each general forwards the message to others, appending their decision.
3. Faulty generals may send conflicting decisions.
4. Using majority voting or signature verification, non-faulty generals agree on the final
decision.
6. Practical Applications of Byzantine Agreement
• Blockchain and Cryptocurrencies: Ensures consistency and fault tolerance in
decentralized systems.
• Aviation Systems: Achieves reliability in flight control systems.
• Replicated Databases: Maintains consistency across distributed database replicas.
Performance Metrics of Agreement Protocols
a) Message Complexity
• The number of messages exchanged to reach agreement.
b) Time Complexity
• The number of communication rounds required.
• Faster convergence is preferred for real-time systems.
c) Fault Tolerance
• The number of failures (crash or Byzantine) that the protocol can handle.
• A critical factor for high-reliability systems.
Conclusion
Agreement protocols, especially those addressing the Byzantine Agreement Problem, are vital
for ensuring consistency and reliability in distributed systems. They provide mechanisms to
tolerate faults, including malicious behaviors, and are the foundation for fault-tolerant systems
like blockchain and distributed databases. While achieving consensus in distributed
environments is challenging, advancements in cryptographic techniques and efficient
algorithms continue to make these protocols robust and practical.
Consensus Problem
Definition
The Consensus Problem in distributed systems is ensuring that a group of processes agree on a
single value, even in the presence of failures. It is essential for maintaining consistency across
systems like distributed databases, blockchain, and fault-tolerant applications.
Requirements of Consensus
1. Agreement: All non-faulty processes must agree on the same value.
2. Validity: If a process proposes a value, the agreed value must be one of the proposed
values.
3. Termination: All processes must eventually decide on a value.
4. Fault Tolerance: The system should handle up to a certain number of failures (e.g., crash
or Byzantine).
Interactive Consistency Problem
Definition
The Interactive Consistency Problem is a specific variant of the consensus problem, requiring
every process in a distributed system to agree on a common vector of values, with one entry for
each process (each non-faulty process's entry being its own initial value).
Use Case
• Maintaining a consistent state across distributed replicas of a system.
Requirements
1. Agreement: All non-faulty processes agree on the same value for each process.
2. Validity: For non-faulty processes, their agreed value must match their initial value.
3. Fault Tolerance: The system can tolerate failures up to a specific limit.
Assumptions
1. At least ( 3f + 1 ) processes are required to tolerate ( f ) Byzantine faults.
2. The system operates in a synchronous environment where message delivery time is
bounded.
c) Atomic Commit Problem
In distributed database transactions, all participating nodes must agree to either commit or abort
the transaction as a single unit.
Key Properties
1. Atomicity: All participants commit or abort together.
2. Consistency: The database remains in a valid state before and after the transaction.
3. Isolation: Intermediate states of the transaction are not visible to other transactions.
4. Durability: Once a transaction commits, its changes persist.
Conclusion
Agreement protocols like Consensus and Atomic Commit are essential for maintaining reliability
and consistency in distributed systems. Byzantine Agreement specifically handles malicious
failures, ensuring robust systems like blockchain. In distributed databases, Atomic Commit
ensures transactions remain consistent and atomic, crucial for maintaining data integrity. Each
protocol has trade-offs in terms of performance, complexity, and fault tolerance, tailored to
specific system requirements.
Distributed Resource Management in Distributed File Systems (DFS)
2. Caching and Prefetching: Caching frequently accessed files at the client side helps
reduce latency and improves performance. However, caching must be coordinated
with other nodes to avoid stale data issues.
Conclusion
The challenges of Distributed Resource Management in Distributed File Systems (DFS)
primarily revolve around ensuring transparency, fault tolerance, consistency, and performance
while maintaining security and scalability. The design of a distributed file system must address
the complexities of network communication, data replication, and concurrent access, all while
providing a seamless experience to users and ensuring data integrity in the face of failures.
Effective solutions to these issues are crucial for the smooth operation of large-scale distributed
systems like cloud storage, content delivery networks, and enterprise-scale file systems.
Design Goals of a DFS:
• Transparency: Hide the complexity of file distribution, so users interact with the system
as though they are working with a local file system.
• Scalability: The system should handle increasing numbers of files, clients, and servers
efficiently.
• Fault tolerance: Ensure the system remains functional even if some components fail.
Key Components:
• Client: The user-facing interface that interacts with the DFS.
• Server: The machine that stores the file data and handles file operations.
• Metadata Server (MDS): Manages the metadata (file names, directories, access rights,
etc.), separate from the data servers where actual file content resides.
• Data Server (DAS): Stores the actual data of the files.
Replication Types:
• Synchronous Replication: Each modification to a file is immediately reflected in all
replicas, ensuring consistency. However, this can reduce performance due to
synchronization overhead.
• Asynchronous Replication: Modifications to a file are first applied to the primary copy,
and later propagated to replicas. This is more efficient but introduces the risk of
temporary inconsistencies.
Replication Strategies:
• Quorum-Based Replication: Ensures that a file modification is only considered
successful if a majority (quorum) of replicas acknowledge the change. This prevents
conflicts between versions of the same file.
• Primary-Backup Replication: One server holds the primary copy, and others store
backup copies. In case of a failure, the backup can be promoted to the primary copy.
Consistency Models:
• Strong Consistency: Every user sees the same version of a file, and all changes are
immediately visible to all users.
• Eventual Consistency: Updates to a file may not be immediately visible to all users, but
the system guarantees that eventually, all replicas will converge to the same state.
• Causal Consistency: If operation A happens before operation B, then all replicas will
reflect this ordering.
Locking and Concurrency Control:
• Distributed Locking: Locks can be placed on files to prevent concurrent modifications. A
process must acquire a lock before modifying a file, ensuring that only one process can
modify a file at a time.
• Optimistic Concurrency: Allows multiple processes to modify files concurrently, but
checks for conflicts when the changes are committed. If conflicts are detected, the
changes are rolled back.
Recovery Mechanisms:
• Checkpointing: The system periodically takes a snapshot of the system’s state (metadata
and data), allowing it to roll back to a previous consistent state in case of failure.
• Data Reconstruction: When data is lost due to a failure, the system can reconstruct the
data from the remaining replicas.
Caching:
• Client-Side Caching: Frequently accessed files or parts of files are cached on the client-
side to avoid repetitive network requests.
• Server-Side Caching: Servers can cache data that is frequently requested, reducing disk
I/O and improving response times.
Cache Consistency:
• Write-Invalidate Cache Protocol: When a file is modified, all cached copies are
invalidated to maintain consistency.
• Write-Update Cache Protocol: When a file is modified, all cached copies are updated
with the new version to maintain consistency.
Prefetching:
• Prefetching involves predicting which data or files will be requested next and loading
them into cache ahead of time. This helps reduce wait times and improves system
responsiveness.
7. Security Mechanisms
Security is essential in a DFS to prevent unauthorized access and ensure the integrity of data.
Encryption:
• Data Encryption: Files can be encrypted both at rest (on disk) and in transit (over the
network) to prevent unauthorized access.
• Secure Channels: Transport Layer Security (TLS) or similar protocols are used to ensure
secure communication between clients and servers.
Metadata Servers:
• In large DFS, metadata is often stored on a separate server (or set of servers) to ensure
that the data servers can focus on storing actual file data.
• Distributed Metadata Management: Metadata servers themselves can be distributed to
ensure that no single server becomes a bottleneck.
Directory Services:
• A DFS must efficiently manage file directories and support operations like searching,
creating, and deleting directories. Distributed hash tables (DHT) or similar techniques are
used to manage and locate files.
Conclusion
Building a Distributed File System involves combining several complex mechanisms to ensure
that files are stored, accessed, and managed efficiently across a distributed environment. By
focusing on transparency, fault tolerance, consistency, scalability, and security, DFS can
provide users with a reliable and efficient file storage solution. Key mechanisms include file
replication, metadata management, concurrency control, caching, and security protocols. These
mechanisms work together to offer a unified, fault-tolerant, and high-performance system for
handling files in distributed environments like cloud storage, enterprise systems, and large-
scale applications.
Distributed Shared Memory (DSM)
Key design issues in DSM include:
1. Granularity
This refers to the size of the memory block shared among processes (e.g., a page, an object, or
a single variable).
2. Consistency Models
These define which view of shared memory each process is guaranteed to see:
• Sequential Consistency: Operations appear to occur in the same sequential order on all
processes.
• Causal Consistency: Only causally related updates are visible in the same order.
3. Data Replication
DSM systems often replicate data to improve performance and availability.
4. Data Coherence
This ensures that all processes have a consistent view of the shared memory.
5. Synchronization
Processes need to coordinate access to shared memory to avoid race conditions.
Synchronization methods include locks, semaphores, and barriers.
6. Fault Tolerance
Distributed systems are prone to failures. DSM must handle:
• Node crashes.
• Network partitions.
8. Communication Overhead
Efficiently transferring shared memory updates over the network is crucial. Techniques like
batching updates or using multicast can reduce overhead.
Assumptions:
• Memory is divided into fixed-size pages.
• Each page has an owner node responsible for maintaining its consistency.
Steps:
1. Initialization:
– Divide the shared memory into pages and record each page's owner.
2. Accessing Memory:
– Read Operation: If the page is not cached locally, fetch a read-only copy from its owner.
– Write Operation:
a. Check if the node is the owner of the page.
b. If not, request ownership (and invalidate or update other copies) before writing.
3. Consistency Maintenance:
– Implement a coherence protocol (e.g., write-invalidate or write-update).
4. Synchronization:
– Use locks or barriers to coordinate access to shared memory.
5. Fault Tolerance:
– Periodically checkpoint memory state.
Example Protocols:
• Write-Invalidate Protocol: When a process writes to a page, it invalidates all other
copies, forcing others to fetch the latest version when needed.
• Write-Update Protocol: When a process writes to a page, the update is propagated to all
other copies so they stay current.
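A minimal sketch of write-invalidate bookkeeping for one page (class name and fields are illustrative; real DSM systems also handle page faults and ownership-transfer messages):
```python
class Page:
    """Write-invalidate coherence state for one DSM page (sketch)."""

    def __init__(self, owner):
        self.owner = owner
        self.copies = {owner}     # nodes currently holding a valid copy

    def read(self, node):
        self.copies.add(node)     # fetch a read-only copy from the owner

    def write(self, node):
        # Invalidate every other copy before the write proceeds.
        self.copies = {node}
        self.owner = node         # the writer becomes the new owner
```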
Conclusion
Implementing DSM requires careful consideration of trade-offs among granularity, consistency,
and communication overhead. While DSM simplifies programming by abstracting memory
sharing, achieving optimal performance and reliability in a distributed environment is
challenging.
Failure Recovery
1. Backward Recovery
Backward recovery restores the system to a previously saved consistent state (a checkpoint)
after a failure.
Techniques:
• Checkpointing: Periodically save a consistent system state to stable storage.
• Logging: Record events or changes to allow replaying operations from a saved state.
• Rollback: When a failure occurs, revert the system to the last checkpoint.
Advantages:
• Simple to implement.
Disadvantages:
• May lose recent updates or progress made since the last checkpoint.
2. Forward Recovery
Forward recovery involves transitioning the system to a new, valid state after a failure without
rolling back.
Techniques:
• Error compensation: Detect the error and apply corrective actions to move to a new valid
state.
• Redundancy: Use replicated components or error-correcting codes to mask the failure.
Advantages:
• No loss of progress.
Disadvantages:
• Complex implementation.
• Requires precise error detection and correction mechanisms.
Recovery in Concurrent Systems
Coordinated Checkpointing
• Processes synchronize so that their checkpoints together form a consistent global state.
• If a failure occurs, rollback restores all processes to the latest consistent checkpoint.
Message Logging
• Log inter-process messages to help restore the system to a consistent state.
Dependency Tracking
• Track dependencies among processes or transactions.
• During recovery, ensure that dependent operations are restored in the correct order to
maintain consistency.
Recovery Lines
• Identify a consistent state across all processes (called a recovery line) where no
dependencies are violated.
• Processes are rolled back only to this recovery line to avoid unnecessary rollbacks.
Isolation Mechanisms
• Use locks or transactions to ensure that processes recover in isolation without
interfering with others.
Conclusion
Failure recovery in distributed systems is critical for maintaining availability, consistency, and
reliability. While backward recovery is simpler and widely used, forward recovery is essential for
real-time systems that cannot afford rollback delays. In concurrent systems, techniques like
coordinated checkpointing, message logging, and dependency tracking are employed to address
interdependencies and ensure a consistent recovery.
Challenges in Distributed Checkpointing:
• Message Passing: In-flight messages (messages sent but not yet delivered) can cause
inconsistencies.
• Domino Effect: Rolling back one process might require rolling back others, leading to
cascading rollbacks.
a. Coordinated Checkpointing
All processes in the distributed system synchronize their checkpointing efforts to ensure a
consistent global state.
Steps:
1. A coordinator asks every process to take a checkpoint.
2. Processes flush in-transit messages, record their states, and acknowledge.
3. The checkpoint becomes permanent once all acknowledgments arrive.
Advantages:
• The saved global state is always consistent, so recovery is simple and the domino effect is
avoided.
Disadvantages:
• Synchronization overhead; processes may be briefly blocked while checkpointing.
b. Uncoordinated Checkpointing
Each process checkpoints independently, without coordination with others.
Steps:
1. Each process takes checkpoints independently whenever convenient.
2. On failure, the system must search for a consistent set of checkpoints (a recovery line).
Disadvantages:
• Susceptible to the domino effect (cascading rollbacks) and to useless checkpoints that
never belong to any consistent global state.
c. Communication-Induced Checkpointing
Combines the benefits of coordinated and uncoordinated checkpointing. Processes take local
checkpoints and occasionally coordinate based on communication patterns.
Advantages:
• Avoids the domino effect without requiring global synchronization.
Disadvantages:
• Extra forced checkpoints and control information piggybacked on messages.
Chandy-Lamport Snapshot Algorithm
Steps:
1. The initiating process records its own state and sends a marker message on every
outgoing channel.
2. On receiving a marker for the first time, a process records its state, marks that channel as
empty, and sends markers on all its outgoing channels; messages arriving afterwards on
other channels are recorded as channel state until markers arrive on them.
3. The process is complete when all processes have recorded their states and message
channels.
Key Feature: Ensures a consistent global state without requiring processes to pause completely.
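A hedged sketch of the marker-handling rules in Python (the class layout, the stubs, and the channel bookkeeping are assumptions for illustration):
```python
class SnapshotProcess:
    """Chandy-Lamport marker handling (illustrative sketch)."""

    def __init__(self, incoming_channels):
        self.recorded = False
        self.state = None
        self.channel_state = {}                 # channel -> recorded messages
        self.open_channels = set(incoming_channels)

    def on_marker(self, channel):
        if not self.recorded:
            self.state = self.record_local_state()   # first marker: snapshot
            self.recorded = True
            self.send_marker_on_all_outgoing()
        # The channel the marker arrived on contributes nothing further.
        self.channel_state.setdefault(channel, [])
        self.open_channels.discard(channel)

    def on_message(self, channel, msg):
        # Messages arriving after our snapshot but before this channel's
        # marker belong to the channel's recorded state.
        if self.recorded and channel in self.open_channels:
            self.channel_state.setdefault(channel, []).append(msg)

    def record_local_state(self):
        return "local-state"                    # stub

    def send_marker_on_all_outgoing(self):
        pass                                    # stub: send markers to peers
```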
Recovery in Distributed Database Systems
Distributed databases must recover from failures while maintaining consistency and availability.
Recovery mechanisms must address several classes of failures:
1. Types of Failures
• Transaction Failures: Logic errors or aborts within a single transaction.
• System Failures: Affect individual nodes (e.g., crashes).
• Media Failures: Disk crashes causing loss of stored data.
2. Recovery Techniques
a. Checkpointing
Periodically records a consistent database state to bound the work needed at recovery time.
b. Logging
Logs record all transaction operations for recovery purposes.
3. Recovery Algorithms
a. Transaction-Oriented Recovery
• Recovery works at transaction granularity: incomplete transactions are undone and
committed ones are redone.
b. Log-Based Recovery
• Uses write-ahead logs (undo/redo records) on stable storage to replay or roll back
operations after a crash.
c. Shadow Paging
Uses a shadow copy of the database:
• Updates go to new pages while the shadow (old) pages stay untouched; at commit, the
system atomically switches the page table to the new pages.
Advantages:
• Fast recovery: after a crash the shadow copy is still consistent, so no undo is needed.
Disadvantages:
• Page-copying and page-table overhead, data fragmentation, and difficulty supporting
concurrent transactions.
4. Recovery Protocols
• Synchronous Recovery: All processes and nodes recover together, ensuring consistency.
• Asynchronous Recovery: Nodes recover independently, which is faster but may require
later reconciliation.
Conclusion
• Obtaining consistent checkpoints and recovery in distributed database systems is
essential to handle failures while maintaining system consistency and availability.
• Checkpointing techniques like the Chandy-Lamport algorithm ensure global consistency,
while recovery mechanisms such as 2PC and logging provide robust transaction-level
recovery.
Fault Tolerance in Distributed Systems
Fault tolerance is the ability of a system to continue operating correctly in the event of failures. It
is crucial in distributed systems due to the inherent complexity and higher probability of
component failures.
1. Types of Faults
• Transient Faults: Occur once and then disappear (e.g., a momentary network glitch).
• Intermittent Faults: Faults that occur sporadically and may reappear (e.g., hardware
glitches).
• Permanent Faults: Faults that persist until repaired (e.g., node crashes).
2. Key Challenges
1. Fault Detection
– Identifying faulty components in a distributed environment.
Commit Protocols
Commit protocols are used to ensure atomicity in distributed transactions, meaning a
transaction either completes successfully across all nodes or aborts entirely.
1. Two-Phase Commit Protocol (2PC)
A popular protocol for distributed transaction management.
Phases
1. Prepare Phase
– The coordinator sends a "prepare to commit" message to all participants.
– Participants perform local checks and respond with "Yes" (ready to commit) or
"No" (cannot commit).
2. Commit/Abort Phase
– If all participants vote "Yes," the coordinator sends a "commit" message.
– Otherwise, it sends an "abort" message and all participants roll back.
Advantages
• Ensures atomicity of transactions.
• Simple to implement.
Disadvantages
• Blocking Problem: If the coordinator fails during the commit phase, participants might
remain in an uncertain state.
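A minimal coordinator-side sketch of 2PC (participant objects exposing prepare/commit/abort are an assumption; real implementations add logging and timeouts at every step):
```python
def two_phase_commit(participants):
    """Coordinator side of the Two-Phase Commit protocol (sketch).

    Each participant is assumed to expose prepare() -> bool, commit(),
    and abort(); message passing is abstracted away.
    """
    votes = [p.prepare() for p in participants]   # Phase 1: collect votes
    if all(votes):                                # Phase 2: decide, broadcast
        for p in participants:
            p.commit()
        return "COMMITTED"
    for p in participants:
        p.abort()
    return "ABORTED"
```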
2. Three-Phase Commit Protocol (3PC)
Phases
1. Prepare Phase
– Similar to 2PC.
2. Pre-Commit Phase
– If all participants vote "Yes," the coordinator sends a "pre-commit" message.
3. Commit Phase
– After participants acknowledge the pre-commit, the coordinator sends the final
"commit" message.
Advantages
• Reduces blocking: after a pre-commit, participants know all votes were "Yes" and can
decide even if the coordinator fails.
Disadvantages
• More complex and requires additional communication.
Voting Protocols
Voting protocols are used to ensure consistency in replicated systems. Each replica has a vote,
and operations are committed based on the majority consensus.
1. Majority Voting
• An operation (e.g., read or write) is executed only if it receives votes from a quorum
(majority) of replicas.
Advantages
• Ensures strong consistency.
Disadvantages
• High overhead due to frequent voting.
2. Read/Write Quorum
• Read Quorum ((R)): Minimum number of replicas that must respond for a read operation.
• Write Quorum ((W)): Minimum number of replicas that must acknowledge a write
operation.
• To ensure consistency: R + W > N, where ( N ) is the total number of replicas, so every
read quorum overlaps every write quorum (a second standard condition, 2W > N,
prevents two writes from succeeding concurrently).
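A small helper that checks the quorum-intersection conditions (the 2W > N check is the standard companion condition, not stated in the notes):
```python
def quorum_ok(n, r, w):
    """Check the standard quorum-intersection conditions.

    R + W > N  ensures every read quorum overlaps every write quorum;
    2W > N     ensures two writes cannot both succeed concurrently.
    """
    return (r + w > n) and (2 * w > n)

print(quorum_ok(n=5, r=2, w=4))  # True
print(quorum_ok(n=5, r=2, w=3))  # False: R + W = 5 is not > N
```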
b. Weighted Voting
• Each replica is assigned a weight based on its reliability or other criteria.
c. Tree-Based Voting
• Replicas are organized hierarchically (e.g., as a tree).
Transactions
A transaction is a sequence of operations performed as a single logical unit of work. It must
satisfy the ACID properties:
1. Atomicity: All operations within a transaction are completed, or none are applied.
2. Consistency: The database remains consistent before and after a transaction.
3. Isolation: Transactions do not interfere with each other.
4. Durability: Once committed, changes are permanent.
Nested Transactions
Nested transactions allow transactions to have a hierarchical structure with sub-transactions.
Structure
• The main transaction is called the parent transaction.
• Child transactions can fail without causing the entire parent transaction to fail, allowing
partial rollback and error recovery.
Advantages
• Modularity: Complex operations are broken into smaller, manageable units.
Disadvantages
• Complexity in implementation.
Concurrency Control
Concurrency control ensures that multiple transactions can execute concurrently without
violating the consistency and isolation properties.
Common concurrency anomalies include:
• Dirty Read: A transaction reads uncommitted changes made by another transaction.
• Non-Repeatable Read: A transaction reads different values for the same data due to
other transactions modifying it.
• Lost Update: Two transactions overwrite each other's changes.
1. Locks
Locks are used to control access to shared data.
Types of Locks
1. Exclusive Lock (Write Lock): Allows only one transaction to access the data for writing.
2. Shared Lock (Read Lock): Multiple transactions can read the data concurrently but
cannot write.
Locking Protocols
• Two-Phase Locking (2PL):
– Growing Phase: A transaction acquires all locks it needs.
– Shrinking Phase: A transaction releases locks but cannot acquire new ones.
• Strict 2PL: All locks are held until the transaction commits or aborts, ensuring
serializability.
Advantages
• Ensures consistency and prevents lost updates.
Disadvantages
• Can lead to deadlocks (circular waiting on resources).
2. Optimistic Concurrency Control (OCC)
Phases
1. Read Phase: Transactions read data and perform operations locally without locking.
2. Validation Phase: Before committing, the system checks for conflicts with other
transactions.
3. Write Phase: If validation succeeds, the changes are applied; otherwise, the transaction
is aborted.
Advantages
• No locking overhead, improving performance in systems with low contention.
Disadvantages
• High abort rates in write-heavy or high-contention environments.
3. Timestamp Ordering
Rules
1. Read Rule: A transaction can read a data item only if its timestamp is greater than (or
equal to) the item's last write timestamp; otherwise the read is rejected and the
transaction restarts.
2. Write Rule: A transaction can write a data item only if its timestamp is greater than the
last read and write timestamps of the item.
Advantages
• Ensures serializability without locks.
• Avoids deadlocks.
Disadvantages
• High overhead in maintaining and checking timestamps.
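A hedged sketch of the two rules, keeping per-item read/write timestamps (restart handling is reduced to returning "ABORT"):
```python
class Item:
    def __init__(self):
        self.rts = 0   # largest timestamp that has read this item
        self.wts = 0   # largest timestamp that has written this item

def ts_read(item, ts):
    """Timestamp-ordering read rule (sketch): reject stale readers."""
    if ts < item.wts:
        return "ABORT"          # item was overwritten by a younger txn
    item.rts = max(item.rts, ts)
    return "OK"

def ts_write(item, ts):
    """Timestamp-ordering write rule (sketch)."""
    if ts < item.rts or ts < item.wts:
        return "ABORT"          # would invalidate a younger read or write
    item.wts = ts
    return "OK"
```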
Conclusion
Concurrency control methods address the challenges of maintaining consistency and isolation in
distributed systems. Locks are reliable but can cause deadlocks, while OCC is ideal for read-
heavy systems with low contention. Timestamp ordering provides serializability without locks
but is less efficient in high-conflict scenarios. The choice of method depends on the workload
characteristics and system requirements.
Distributed Transactions
A distributed transaction involves multiple nodes or databases participating in a single logical
transaction. These systems ensure consistency and atomicity across distributed resources
despite failures.
1. Flat Transactions
A flat transaction is a single top-level unit of work that spans multiple nodes.
Features:
• Simpler structure.
• Commit or rollback affects the entire transaction.
• Lack of modularity for complex operations.
2. Nested Transactions
• The parent transaction oversees the entire transaction, while child transactions handle
specific parts.
Features:
• Failure Isolation: A failure in a child transaction may not necessarily cause the parent
transaction to fail.
Example: A travel booking system where booking a flight, hotel, and car rental are managed by
child transactions under a single parent transaction.
Key Considerations:
• Rollback can propagate upward, affecting parent and other child transactions.
Atomic Commit Protocols
Atomic commit protocols ensure that a distributed transaction either commits across all nodes
or rolls back entirely, maintaining the atomicity property.
1. Two-Phase Commit Protocol (2PC)
Phases:
1. Prepare Phase:
– The coordinator sends a "Prepare to Commit" request to all participants, who
vote "Yes" or "No."
2. Commit/Abort Phase:
– If every participant votes "Yes," the coordinator broadcasts "Commit";
otherwise it broadcasts "Abort."
Advantages:
• Simple to implement and guarantees atomicity.
Disadvantages:
• Blocking Problem: Participants can become stuck in a waiting state if the coordinator
fails.
2. Three-Phase Commit Protocol (3PC)
Phases:
1. Prepare Phase:
– Similar to 2PC. Participants indicate their readiness to commit.
2. Pre-Commit Phase:
– If all participants are ready, the coordinator sends a "Pre-Commit" message,
signaling the transition towards committing.
3. Commit Phase:
– After participants acknowledge the pre-commit, the coordinator sends the final
"Commit" message.
Advantages:
• Avoids blocking by ensuring that participants can decide on their own if the coordinator
fails.
Disadvantages:
• Requires an extra round of messages and can still misbehave under network partitions.
3. Paxos Commit
Key Idea:
• Replace the single coordinator with a consensus protocol (Paxos) run among multiple
acceptors, so the commit decision survives coordinator failure.
Advantages:
• Non-blocking.
Disadvantages:
• Higher message complexity and implementation effort.
Conclusion
Distributed transactions, whether flat or nested, require robust commit protocols to ensure
atomicity and consistency across distributed resources. 2PC is the simplest and most widely
used protocol, but it suffers from blocking issues. 3PC addresses some of these limitations,
while Paxos Commit provides the highest fault tolerance but at a significant cost in complexity
and overhead. The choice of protocol depends on the system’s requirements, including fault
tolerance, performance, and scalability.
Concurrency Control in Distributed Transactions
Key Challenges:
• Maintaining serializability when a transaction spans multiple nodes.
• Detecting and resolving deadlocks that cross node boundaries.
• Recovering to a consistent state after failures.
2. Timestamp Ordering
• Each transaction is assigned a unique timestamp upon initiation.
• Transactions execute based on timestamp order to ensure serializability.
Rules:
1. A transaction (T_i) can read data item (x) only if (T_i)'s timestamp is greater than (x)'s last
write timestamp.
2. A transaction (T_i) can write to (x) only if (T_i)'s timestamp is greater than the last read
and write timestamps of (x).
Advantages:
• Deadlock-free and ensures serializability without locks.
Disadvantages:
• Conflicting transactions are aborted and restarted; maintaining timestamps adds
overhead.
Two-Phase Locking (2PL)
Phases:
1. Growing Phase: Locks are acquired but not released.
2. Shrinking Phase: Locks are released but not acquired.
Advantages:
• Ensures serializability.
Disadvantages:
• Prone to deadlocks; holding locks reduces concurrency.
Optimistic Concurrency Control (OCC)
Phases:
1. Read, 2. Validation, and 3. Write, as described earlier.
Advantages:
• No locking overhead in low-contention workloads.
Disadvantages:
• Frequent aborts under high contention.
Distributed Deadlocks
Deadlocks occur when transactions in a distributed system wait for each other indefinitely to
release resources, creating a circular wait.
Deadlock Detection
Techniques:
1. Wait-for Graph (WFG):
– Nodes maintain a local WFG to track dependencies.
– These local graphs are periodically merged to detect global cycles.
2. Edge Chasing:
– Nodes propagate special messages called probes along dependency edges.
– A deadlock is detected if a probe returns to its origin.
Deadlock Prevention
• Timeouts: Transactions are aborted if they exceed a predefined timeout period.
• Wait-Die Scheme:
– Older transactions can wait for younger transactions, but younger ones abort if
they conflict with older transactions.
• Wound-Wait Scheme:
– Younger transactions can wait for older ones, but older transactions abort
younger ones.
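Both schemes reduce to a comparison of transaction timestamps (older = smaller timestamp). A minimal sketch (function names are illustrative):
```python
def wait_die(requester_ts, holder_ts):
    """Wait-Die: an older requester waits; a younger requester dies."""
    return "WAIT" if requester_ts < holder_ts else "ABORT_REQUESTER"

def wound_wait(requester_ts, holder_ts):
    """Wound-Wait: an older requester wounds (aborts) the younger holder;
    a younger requester waits."""
    return "ABORT_HOLDER" if requester_ts < holder_ts else "WAIT"
```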
Transaction Recovery
In distributed systems, transaction recovery ensures the system can return to a consistent state
after failures.
Types of Failures
1. Transaction Failures:
– Logic errors, deadlocks, or manual aborts.
2. System Failures:
– Crashes that disrupt transaction processing.
3. Media Failures:
– Disk crashes causing data loss.
Recovery Mechanisms
1. Checkpointing
• Periodically save the system state to stable storage.
• In case of failure, transactions restart from the last checkpoint.
Types:
1. Coordinated Checkpointing:
– All nodes take a consistent snapshot simultaneously.
2. Uncoordinated Checkpointing:
– Nodes take independent snapshots, requiring additional mechanisms to resolve
inconsistencies.
2. Log-Based Recovery
• Transactions maintain logs of operations in stable storage.
Log Types:
• Undo log: Records the old value of each updated item so incomplete transactions can be
rolled back.
• Redo log: Records the new value so committed changes can be reapplied.
Recovery Process
1. Undo Recovery: Reverse the effects of incomplete transactions using undo logs.
2. Redo Recovery: Reapply changes of committed transactions using redo logs.
3. Undo-Redo Recovery: Combines both approaches for flexibility.
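A toy undo-redo pass over a simple operation log (the log format is an assumption; the sketch presumes strict execution, i.e., no transaction overwrites another's uncommitted write):
```python
def recover(log, committed):
    """Undo-redo recovery over a simple operation log (sketch).

    log: list of (txn, item, old_value, new_value) records, oldest first.
    committed: set of transaction IDs that reached commit.
    Returns the reconstructed database state as a dict.
    """
    db = {}
    for txn, item, old, new in log:
        db[item] = new              # replay every write in log order (redo)
    for txn, item, old, new in reversed(log):
        if txn not in committed:
            db[item] = old          # undo uncommitted writes, newest first
    return db

log = [("T1", "x", 0, 5), ("T2", "y", 0, 7)]
print(recover(log, committed={"T1"}))   # {'x': 5, 'y': 0}: T2 is undone
```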
Comparison of Techniques
Feature      Concurrency Control                     Deadlocks                         Recovery
Goal         Maintain consistency and isolation.     Avoid or resolve circular waits.  Return to a consistent state post-failure.
Techniques   Locks, OCC, Timestamping.               WFG, Edge Chasing.                Checkpointing, Logging.
Overhead     High in high-contention environments.   High in deadlock-prone systems.   Moderate (depends on frequency).
Conclusion
Effective management of concurrency, deadlocks, and recovery in distributed transactions
ensures reliability and performance. Concurrency control ensures isolation and consistency,
deadlock management prevents or resolves circular waits, and transaction recovery ensures
the system remains consistent even after failures. The choice of techniques depends on system
requirements, workload characteristics, and failure models.
Replication in Distributed Systems
Key Concepts:
• Replicas: Copies of data or services hosted on multiple nodes.
• Clients: Issue requests that are directed to one or more replicas.
• Replication Manager: Coordinates updates and consistency across replicas.
Advantages of Replication
1. High Availability:
– The system remains operational even if some replicas fail.
2. Improved Performance:
– Load is distributed among replicas, reducing response times.
3. Fault Tolerance:
– Redundant replicas ensure data reliability and recovery.
Challenges in Replication
1. Consistency:
– Maintaining synchronization across replicas is complex.
2. Latency:
– Propagation of updates to all replicas may increase response times.
3. Partitioning:
– Network failures can isolate replicas, causing inconsistencies.
4. Overheads:
– Synchronization and communication between replicas add resource overheads.
Summary Table
Aspect                Description
System Model          Defines replicas, clients, and the replication manager.
Group Communication   Ensures reliable, ordered, and atomic message delivery among replicas.
Fault Tolerance       Provides mechanisms for detection, recovery, and consensus.
Advantages            High availability, improved performance, and fault tolerance.
Challenges            Consistency, latency, partitioning, and overheads.
Conclusion
Replication in distributed systems is vital for fault tolerance and high availability. By leveraging
group communication and fault-tolerance mechanisms, systems can ensure consistent and
reliable operations even in the presence of failures. However, achieving this requires careful
trade-offs between consistency, availability, and performance, tailored to specific application
needs.