UNIT 4 DOS
Types of Threads
1. User-Level Threads:
o Managed by user-level libraries, not the operating system (OS).
o Lightweight and fast but lack true parallelism on multiprocessor systems because
the OS is not aware of them.
2. Kernel-Level Threads:
o Managed by the OS, which can schedule them on different processors.
o More overhead than user-level threads but support true parallelism.
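The kernel-level model above can be illustrated with Python's `threading` module, whose threads map to kernel threads that the OS can schedule across cores (a minimal sketch; note that in CPython the GIL still serializes pure-Python bytecode, so this shows scheduling, not full parallelism):

```python
import threading

results = []
lock = threading.Lock()

def worker(n):
    # Protect the shared list so concurrent appends do not conflict.
    with lock:
        results.append(n * n)

# Each threading.Thread corresponds to one kernel-level thread,
# so the OS scheduler is aware of it and can place it on any core.
threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(results))  # squares of 0..3
```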
Distributed File Systems (DFS) allow users to access and store files across multiple machines
as if they were on a single local system. Here’s a breakdown of the key concepts related to DFS:
A DFS typically provides three services:
1. Storage service
2. True file service
3. Name service
• A system that manages files across multiple networked computers, enabling seamless
access to data as if it were stored on a local machine.
Advantages of a DFS:
1. Data Sharing: Allows multiple users and applications to access files concurrently from
different locations.
2. Data Availability: Ensures that files are accessible even if some nodes (machines) are
down, enhancing system reliability.
3. Scalability: Provides the ability to expand storage capacity by adding more nodes
without disrupting existing services.
4. Resource Utilization: Efficiently uses storage resources by distributing files across
multiple machines.
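One common way to distribute files across machines, as point 4 describes, is hash-based placement. The node names and function below are illustrative assumptions, not part of any particular DFS:

```python
import hashlib

# Hypothetical storage nodes; real systems would discover these.
NODES = ["node0", "node1", "node2"]

def place(filename):
    # A stable hash of the name means the same file always maps
    # to the same node, spreading files evenly across the cluster.
    h = int(hashlib.md5(filename.encode()).hexdigest(), 16)
    return NODES[h % len(NODES)]
```

Production systems typically refine this with consistent hashing so that adding a node moves only a fraction of the files.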
Desirable Features of a DFS:
1. Transparency: Users should not be aware that the files are distributed across multiple
machines.
2. Scalability: Easily add more storage or nodes to accommodate growing data
requirements without performance degradation.
3. Fault Tolerance: Ability to continue functioning even when some components fail, often
achieved by replicating data across multiple nodes.
4. Concurrency Control: Manage access to files by multiple users or applications
simultaneously without data conflicts.
5. User Mobility
6. Performance
7. Simplicity and ease of use
8. Scalability
9. High availability
10. High Reliability
11. Data integrity
12. Security
13. Heterogeneity
Challenges of a DFS:
1. Data Consistency: Ensuring that all copies of a file are synchronized, especially when
files are updated by multiple users.
2. Network Latency: Accessing files over a network can lead to delays, affecting
performance.
3. Fault Handling: Managing failures, such as node crashes, without data loss or service
disruption.
4. Data Security and Privacy: Ensuring data security across different nodes and networks,
especially in untrusted environments.
File Services in Distributed Systems provide a way to access and manage files across a
network of machines. They enable users and applications to work with files as if they were stored
locally, even though the files may be distributed across different servers. Below are the key
points related to file services:
1. Definition
• File Service: A subsystem in a distributed system that provides mechanisms for users
and applications to create, access, and manage files stored across multiple machines on a
network.
2. Types of File Services
1. Remote File Access: Users can access files stored on a remote server as if they were local,
often through protocols such as NFS (Network File System) or SMB (Server Message
Block).
2. Distributed File System (DFS): Files are distributed across multiple servers, and the
system handles the distribution and access, providing a single, unified view of the file
system.
3. Cloud-Based File Services: Files are stored and managed on cloud servers, providing
seamless access from any location (e.g., Google Drive, Dropbox).
3. Stateful vs. Stateless Servers
• Stateful Server: Maintains information (state) about each client session across multiple
requests. It remembers client data between interactions.
• Stateless Server: Does not retain any information (state) about clients between requests.
Each request is processed independently.
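The contrast can be sketched with two toy file servers (a hypothetical in-memory API, not a real protocol): the stateful one tracks each client's read position, while the stateless one requires the client to send the offset with every request:

```python
class StatefulFileServer:
    def __init__(self, data):
        self.data = data
        self.offsets = {}  # per-client session state kept on the server

    def read(self, client_id, nbytes):
        pos = self.offsets.get(client_id, 0)
        chunk = self.data[pos:pos + nbytes]
        self.offsets[client_id] = pos + nbytes  # remember where we are
        return chunk

class StatelessFileServer:
    def __init__(self, data):
        self.data = data  # no per-client state at all

    def read(self, offset, nbytes):
        # Every request is self-contained, so a server restart
        # loses nothing and requests can go to any replica.
        return self.data[offset:offset + nbytes]
```

The stateless design is why NFSv3 servers recover from crashes so easily: clients simply retry their self-contained requests.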
Replication and Caching are key techniques in distributed systems to enhance performance,
reliability, and availability. They both involve storing multiple copies of data, but they serve
different purposes and are implemented in distinct ways.
• Replication involves creating and maintaining multiple copies (replicas) of data across
different nodes in a distributed system. These replicas are synchronized to ensure
consistency.
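As a minimal illustration of synchronous (write-all) replication, with in-memory dictionaries standing in for nodes (the class and parameters are made up for the sketch):

```python
class ReplicatedStore:
    def __init__(self, n_replicas=3):
        # Each dict stands in for one node's copy of the data.
        self.replicas = [{} for _ in range(n_replicas)]

    def write(self, key, value):
        for r in self.replicas:  # apply to every replica: stay synchronized
            r[key] = value

    def read(self, key, replica=0):
        # Any replica can serve the read, so the data survives
        # the failure of the other nodes.
        return self.replicas[replica][key]
```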
• Definition: Caching involves storing copies of frequently accessed data closer to the user
or application to reduce access time. Unlike replication, caching does not necessarily aim
to maintain multiple copies of the entire dataset, just the most frequently requested parts.
• Purpose: Improves performance by reducing latency and minimizing the need for
repeated access to the original data source.
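A common form of this is an LRU (least-recently-used) cache in front of a slower data source. The sketch below uses a plain dict as the "remote" backend; names and capacity are illustrative:

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, backend, capacity=2):
        self.backend = backend        # stand-in for a slow remote store
        self.capacity = capacity
        self.cache = OrderedDict()    # insertion order tracks recency
        self.misses = 0

    def get(self, key):
        if key in self.cache:
            self.cache.move_to_end(key)       # mark as recently used
            return self.cache[key]
        self.misses += 1                      # must fetch from the source
        value = self.backend[key]
        self.cache[key] = value
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)    # evict least recently used
        return value
```

Unlike the replication sketch, only the hot subset of the data is ever copied, and evicted entries are simply re-fetched on the next miss.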
• Comparison Between Replication and Caching:
o Scope: Replication maintains full copies of the data set; caching keeps only the most frequently accessed parts.
o Primary goal: Replication targets reliability and availability; caching targets performance (lower latency).
o Consistency: Replicas are actively kept synchronized; cached copies may become stale and are refreshed on demand.
Deadlock is a situation in a distributed system where a set of processes are blocked, waiting for
resources held by each other. This circular dependency leads to a standstill where none of the
involved processes can proceed. Distributed Deadlock Detection deals with identifying and
resolving these situations across a network of interconnected systems.
There are three general approaches to handling deadlocks:
• Deadlock Prevention
• Deadlock Avoidance
• Deadlock Detection and Resolution
A. Deadlock Prevention
• Preventing deadlocks involves designing the system in such a way that deadlocks cannot
occur.
• Strategies include:
1. Resource Ordering: Enforce a strict order in which resources must be requested,
preventing circular wait conditions.
2. Request All Resources at Once: Ensure that a process requests all necessary
resources in a single request. If it cannot get all of them, it does not proceed.
3. Limit Resource Holding Time: Processes should hold resources only for a
limited period, reducing the chances of circular dependencies.
• Disadvantages: Can lead to poor resource utilization and can be impractical for complex
systems where resource requirements are not known in advance.
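Strategy 1 (resource ordering) can be sketched in a few lines: every process sorts its lock requests into one agreed-upon global order before acquiring, so no circular wait can form. The lock names and helper functions are made up for illustration:

```python
import threading

lock_a = threading.Lock()
lock_b = threading.Lock()
LOCK_ORDER = [lock_a, lock_b]  # the single global acquisition order

def acquire_in_order(needed):
    # Sort the requested locks by the global order before locking;
    # two processes can then never hold each other's next lock.
    ordered = sorted(needed, key=LOCK_ORDER.index)
    for lk in ordered:
        lk.acquire()
    return ordered

def release(held):
    for lk in reversed(held):
        lk.release()

held = acquire_in_order([lock_b, lock_a])  # reordered to a-then-b
release(held)
```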
B. Deadlock Avoidance
• Deadlock avoidance involves careful allocation of resources to ensure that circular wait
conditions do not arise.
• Common strategies include:
1. Banker’s Algorithm: Dynamically examine requests and decide whether
granting them will lead to a potential deadlock. If so, the request is denied.
2. Wait-Die and Wound-Wait Schemes: Use timestamps to rank processes (older =
higher priority). In the wait-die scheme, an older process waits for a younger
holder, while a younger requester is aborted ("dies") and restarted. In the
wound-wait scheme, an older process preempts ("wounds") a younger holder,
while a younger requester simply waits.
• Disadvantages: Requires advanced knowledge of resource requirements, which may not
be feasible in distributed environments.
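The core of the Banker's Algorithm is a safety check: a request is granted only if, afterwards, some order still exists in which every process can obtain its remaining need and finish. A sketch of that check (matrix layout is the standard textbook formulation; variable names are my own):

```python
def is_safe(available, allocation, need):
    # available: free units of each resource type
    # allocation[i], need[i]: what process i holds / still needs
    work = list(available)
    finished = [False] * len(allocation)
    while True:
        progressed = False
        for i, (alloc, nd) in enumerate(zip(allocation, need)):
            # Process i can run to completion if its remaining
            # need fits within the currently free resources.
            if not finished[i] and all(n <= w for n, w in zip(nd, work)):
                work = [w + a for w, a in zip(work, alloc)]  # it releases all
                finished[i] = True
                progressed = True
        if not progressed:
            return all(finished)  # safe iff everyone could finish
```

With the classic five-process example (available [3,3,2]), the check reports a safe state; with nothing available and outstanding needs, it reports unsafe.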
C. Deadlock Detection and Resolution
• Unlike prevention and avoidance, this approach allows deadlocks to occur but provides
mechanisms to detect and resolve them.
• Key Strategies for Deadlock Detection:
1. Centralized Deadlock Detection:
▪ A central node collects information about resources and processes and
checks for deadlocks.
▪ Advantage: Simpler to implement.
▪ Disadvantage: Creates a single point of failure and can be a bottleneck in
large systems.
2. Distributed Deadlock Detection Algorithms:
▪ Algorithms are distributed across nodes, with each node responsible for
detecting deadlocks involving resources it manages.
▪ Wait-for Graph (WFG) Technique: Nodes exchange information to
build a global view of process dependencies.
▪ Edge Chasing Algorithm: Nodes pass "probes" along resource wait
chains. If a probe returns to its origin, a deadlock is detected.
▪ Global Snapshot Algorithm: Collects a global state snapshot to check for
circular dependencies. The system takes snapshots of process states and
resource allocations, then examines them to detect deadlocks.
▪ Advantages: Removes the single point of failure, better suited for
distributed environments.
▪ Disadvantages: More complex, communication overhead, and may lead
to false positives or negatives due to inconsistent information.
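The edge-chasing idea above can be sketched with a wait-for graph held as a dictionary: a probe is forwarded along "waits-for" edges, and if it ever arrives back at the initiator, the cycle (deadlock) is reported. This is a single-process simulation of what would be message passing between nodes:

```python
def probe_detects_deadlock(wfg, initiator):
    # wfg maps each process to the processes it is waiting for.
    visited = set()
    frontier = list(wfg.get(initiator, []))  # probes sent by initiator
    while frontier:
        p = frontier.pop()
        if p == initiator:
            return True  # the probe returned to its origin: deadlock
        if p not in visited:
            visited.add(p)
            frontier.extend(wfg.get(p, []))  # forward the probe
    return False

wfg = {"P1": ["P2"], "P2": ["P3"], "P3": ["P1"]}  # a 3-cycle
print(probe_detects_deadlock(wfg, "P1"))  # True
```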
• Resolution Techniques:
1. Abort Deadlocked Processes:
▪ Identify one or more processes involved in the deadlock and terminate
them to release their resources.
▪ Criteria for Selection: Process priorities, runtime, or minimal impact on
the system.
2. Resource Preemption:
▪ Temporarily take away resources from one or more processes and allocate
them to others.
▪ Rollback Mechanism: Preempted processes may be rolled back to a safe
state before they started waiting for resources.
3. Process Migration:
▪ Move a process to another node where the required resources are
available, thereby breaking the deadlock.
▪ This can be complex and resource-intensive but may be an effective
solution in some scenarios.
Common distributed deadlock detection algorithms include:
1. Chandy-Misra-Haas Algorithm:
o A probe-based algorithm that detects deadlocks by sending probes between
processes. If a probe returns to its origin, a deadlock is detected.
2. Suzuki-Kasami Algorithm:
o A token-based algorithm that tracks resource requests and identifies potential
deadlocks by analyzing token information.
3. Obermarck’s Algorithm:
o A distributed algorithm that uses a form of global wait-for graph to check for
cycles, which indicate deadlocks.
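Schemes like Obermarck's ultimately reduce to finding a cycle in the merged wait-for graph. A standard depth-first cycle check over such a graph (a sketch; in the real algorithm the graph fragments are exchanged between sites before this test runs):

```python
def has_cycle(wfg):
    # Collect every process mentioned anywhere in the graph.
    nodes = set(wfg) | {q for qs in wfg.values() for q in qs}
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {p: WHITE for p in nodes}

    def dfs(p):
        color[p] = GRAY  # p is on the current DFS path
        for q in wfg.get(p, []):
            if color[q] == GRAY:
                return True  # back edge closes a cycle: deadlock
            if color[q] == WHITE and dfs(q):
                return True
        color[p] = BLACK  # fully explored, no cycle through p
        return False

    return any(color[p] == WHITE and dfs(p) for p in nodes)
```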