Synchronization in Multiprocessor Systems
Background
Peterson’s Solution
Synchronization Hardware
Semaphores
Monitors
Background
#define BUFFER_SIZE 10

typedef struct {
...
} item;

item buffer[BUFFER_SIZE];
int in = 0;     /* next free position                  */
int out = 0;    /* first full position                 */
int count = 0;  /* number of items currently in buffer */

/* Producer (nextProduced is the producer's local item) */
while (true) {
    /* produce an item in nextProduced */
    while (count == BUFFER_SIZE)
        ; // do nothing
    buffer[in] = nextProduced;
    in = (in + 1) % BUFFER_SIZE;
    count++;
}

/* Consumer (nextConsumed is the consumer's local item) */
while (true) {
    while (count == 0)
        ; // do nothing
    nextConsumed = buffer[out];
    out = (out + 1) % BUFFER_SIZE;
    count--;
    /* consume the item in nextConsumed */
}
If count is 5 and the producer and consumer execute count++ and count-- concurrently, the resulting value of count may be 4, 5, or 6, even though the only correct value is 5.
In machine language, count++ might be implemented as:
register1 = count
register1 = register1 + 1
count = register1
and count-- might be implemented as:
register2 = count
register2 = register2 - 1
count = register2
Consider the following execution interleaving with count = 5 initially (the statements may be interleaved in many different ways):
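One such interleaving (a standard illustration of the problem) is:

T0: producer executes register1 = count          {register1 = 5}
T1: producer executes register1 = register1 + 1  {register1 = 6}
T2: consumer executes register2 = count          {register2 = 5}
T3: consumer executes register2 = register2 - 1  {register2 = 4}
T4: producer executes count = register1          {count = 6}
T5: consumer executes count = register2          {count = 4}

The final value is 4 (it would be 6 if T4 and T5 were swapped), although the correct value is 5.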
An incorrect state occurs when both processes are allowed to manipulate the variable count concurrently.
A situation where several processes access and manipulate the same data concurrently, and the outcome of the execution depends on the particular order in which the accesses take place, is called a race condition.
Thus only one process should be allowed to manipulate count at a time; hence synchronization is required.
Each process has a code segment, called the critical section, in which shared data is accessed, e.g., changing common variables, updating a table, writing a file, etc.
It is necessary to ensure that when one process is executing in its critical section, no other
process is allowed to execute in its critical section.
A process must acquire a lock before entering a critical section and release the lock when it exits the critical section.
Solution to Critical-Section Problem
do {
entry section
critical section
exit section
remainder section
} while (true);
Each process must request permission to enter its critical section; this request is implemented as the entry section.
1. Mutual Exclusion:
If process Pi is executing in its critical section, then no other processes can be executing in their
critical sections.
2. Progress:
If no process is executing in its critical section and some processes wish to enter their critical sections, then only those processes that are not executing in their remainder sections can participate in deciding which will enter its critical section next, and this selection cannot be postponed indefinitely.
3. Bounded Waiting:
A bound must exist on the number of times that other processes are allowed to enter their critical sections after a process has made a request to enter its critical section and before that request is granted.
Critical-Section Problem
Preemptive kernels
Nonpreemptive kernels
Nonpreemptive kernels are free from race conditions on kernel data structures, since only one process is active in the kernel at a time.
Preemptive kernels must be carefully designed to ensure that shared kernel data are free from
race conditions.
A preemptive kernel is more suitable for real-time programming, since it allows the short response times that real-time processes require.
Peterson’s Solution
A two-process software solution. Shared variables:
int turn;
boolean flag[2];
The variable turn indicates whose turn it is to enter the critical section.
The flag array is used to indicate if a process is ready to enter the critical section. flag[i] = true
implies that process Pi is ready.
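The structure of process Pi in Peterson's solution (i is 0 or 1, and j = 1 - i denotes the other process):

do {
    flag[i] = true;              /* Pi announces it is ready         */
    turn = j;                    /* offer the turn to the other side */
    while (flag[j] && turn == j)
        ;                        /* busy wait                        */

    /* critical section */

    flag[i] = false;             /* Pi leaves its critical section   */

    /* remainder section */
} while (true);

Mutual exclusion holds because turn cannot favor both processes at once; the last process to write turn is the one that waits.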
Synchronization Hardware
Many systems provide hardware support for protecting critical-section code. On a uniprocessor, interrupts can be disabled while shared data is being modified, so the running code executes without preemption; this is the approach taken by nonpreemptive kernels. Modern machines instead provide special atomic hardware instructions:
TestAndSet()
Swap()
TestAndSet Instruction
boolean TestAndSet(boolean *target) {  /* executed atomically */
    boolean temp = *target;
    *target = TRUE;
    return temp;
}
Swap Instruction
Swap() is executed atomically
void Swap(boolean *a, boolean *b) {
    boolean temp = *a;
    *a = *b;
    *b = temp;
}
The simple mutual-exclusion algorithms built from TestAndSet() and Swap() satisfy mutual exclusion, but they do not satisfy the bounded-waiting requirement.
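A minimal mutual-exclusion sketch using TestAndSet(), assuming a shared boolean lock initialized to FALSE:

do {
    while (TestAndSet(&lock))
        ;                 /* spin until lock was FALSE */

    /* critical section */

    lock = FALSE;         /* release the lock */

    /* remainder section */
} while (true);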
A bounded-waiting algorithm using the TestAndSet() instruction, sketched below, satisfies all three requirements of the critical-section problem.
Shared data: boolean waiting[n]; boolean lock; (all initialized to FALSE)
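A sketch of the classic bounded-waiting algorithm for process Pi, as found in standard texts (key and j are local variables, n is the number of processes):

do {
    waiting[i] = TRUE;
    key = TRUE;
    while (waiting[i] && key)
        key = TestAndSet(&lock);   /* spin until the lock is obtained   */
    waiting[i] = FALSE;

    /* critical section */

    j = (i + 1) % n;               /* scan for the next waiting process */
    while ((j != i) && !waiting[j])
        j = (j + 1) % n;
    if (j == i)
        lock = FALSE;              /* nobody waiting: release the lock  */
    else
        waiting[j] = FALSE;        /* pass the critical section to Pj   */

    /* remainder section */
} while (true);

Because the scan proceeds cyclically, every waiting process enters within n - 1 turns, which bounds the waiting.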
Semaphore
All modifications to the semaphore value in the wait() and signal() operations must be executed atomically. The two operations (definitions sketched below):
wait()
signal()
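The classical busy-waiting definitions for a semaphore S:

wait(S) {
    while (S <= 0)
        ;      /* busy wait */
    S--;
}

signal(S) {
    S++;
}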
Semaphore: usage
Counting semaphore: the integer value can range over an unrestricted domain.
Binary semaphore: the integer value can range only between 0 and 1; it is simpler to implement.
Semaphore: usage-1
A binary semaphore can be used to solve the critical-section problem for multiple processes, as in the sketch below.
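A minimal sketch for n processes sharing a semaphore mutex initialized to 1:

do {
    wait(mutex);
    /* critical section */
    signal(mutex);
    /* remainder section */
} while (true);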
Semaphore: usage-2
Counting semaphore can be used to control access to a given resource consisting of a finite
number of instances.
Each process that wishes to use the resource performs a wait() operation, decrementing the semaphore by one.
When a process releases the resource, it performs a signal() operation, incrementing the semaphore by one (a short sketch follows).
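A minimal sketch, assuming a resource with N interchangeable instances:

semaphore resource = N;   /* N instances initially available      */

wait(resource);           /* acquire one instance; blocks at zero */
/* ... use one instance of the resource ... */
signal(resource);         /* return the instance                  */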
Semaphore: usage-3
While a process is in its critical section, any other process that tries to enter its own critical section must loop continuously, testing the condition S ≤ 0. This busy waiting wastes CPU cycles; such a semaphore is also called a spinlock.
When a process executes wait() and finds that the semaphore value is not positive, it can block itself with a block() operation rather than busy-wait.
The block operation places a process into a waiting queue associated with semaphore, and the
state of the process is switched to the waiting state.
A blocked process on a semaphore will be restarted when some other process executes a
signal() operation.
The blocked process is restarted by a wakeup() operation, which changes the process from the waiting state to the ready state.
block() and wakeup() are provided as basic system calls by the operating system.
A signal() removes one process from the list of waiting processes and awakens that process.
Semaphore Implementation with no Busy waiting
With each semaphore there is an associated waiting queue. A semaphore thus has two data items: an integer value and a pointer to the list of waiting processes.
Two operations:
block() – place the process invoking the operation on the appropriate waiting queue.
wakeup() – remove one of the processes in the waiting queue and place it in the ready queue.
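A C-style sketch of this textbook implementation (queue manipulation is elided into comments):

typedef struct {
    int value;
    struct process *list;    /* queue of processes blocked on this semaphore */
} semaphore;

wait(semaphore *S) {
    S->value--;
    if (S->value < 0) {
        /* add this process to S->list */
        block();
    }
}

signal(semaphore *S) {
    S->value++;
    if (S->value <= 0) {
        /* remove a process P from S->list */
        wakeup(P);
    }
}

With this definition, a negative value's magnitude is the number of processes waiting on the semaphore.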
Semaphore: Implementation
Implementation must guarantee that no two processes can execute wait () and signal () on the
same semaphore at the same time
Thus, it becomes the critical section problem where the wait and signal code are placed in the
critical section.
In uniprocessor system, interrupts can be disabled during wait() and signal() operations.
In SMP systems, wait() and signal() must be made atomic across all processors, since disabling interrupts on one processor does not stop the others.
If applications spend a long time in critical sections, the block() and wakeup() solution is preferable despite its context-switching overhead; busy waiting pays off only when critical sections are short.
Deadlock
Two or more processes are waiting indefinitely for an event that can be caused by only
one of the waiting processes
Starvation
Indefinite blocking: a process may never be removed from the semaphore queue in which it is suspended (for example, if the queue is handled in LIFO order).
Semaphore: Deadlocks
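For illustration, let S and Q be two semaphores initialized to 1, and suppose processes P0 and P1 acquire them in opposite orders:

        P0                  P1
    wait(S);            wait(Q);
    wait(Q);            wait(S);
      ...                 ...
    signal(S);          signal(Q);
    signal(Q);          signal(S);

If P0 executes wait(S) and P1 then executes wait(Q), each process blocks at its second wait(), waiting for a signal() that the other can never execute: a deadlock.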
Bounded-Buffer Problem
Readers-Writers Problem
Dining-Philosophers Problem
Bounded-Buffer Problem
Two processes share a common, fixed-size buffer. One of them, the producer, puts information into the buffer; the other, the consumer, takes it out.
Problem: the producer must not insert an item when the buffer is full, and the consumer must not remove an item when the buffer is empty.
Solution:
When the buffer is full, the producer is put to sleep and is awakened when the consumer takes an item from the buffer.
When the buffer is empty, the consumer is put to sleep and is awakened when the producer puts an item into the buffer.
Shared data
#define N 10          /* buffer size                            */
semaphore full = 0;   /* counts filled slots                    */
semaphore empty = N;  /* counts empty slots                     */
semaphore mutex = 1;  /* mutual exclusion for buffer operations */
The producer waits on empty and mutex before inserting an item; the consumer waits on full and mutex before removing one.
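A sketch of the full routines (nextp and nextc denote local item variables, an assumption of this sketch):

/* Producer */
do {
    /* produce an item in nextp */
    wait(empty);          /* wait for a free slot   */
    wait(mutex);
    /* add nextp to buffer */
    signal(mutex);
    signal(full);         /* one more filled slot   */
} while (true);

/* Consumer */
do {
    wait(full);           /* wait for a filled slot */
    wait(mutex);
    /* remove an item from buffer to nextc */
    signal(mutex);
    signal(empty);        /* one more free slot     */
    /* consume the item in nextc */
} while (true);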
Readers-Writers Problem
A data object can be shared among several concurrent processes. Some processes (readers) only
read the content of the object but some processes (writers) write the content in object.
Problems:
When two readers access the object simultaneously, no adverse effects result.
When a writer and a reader, or two writers, access the object simultaneously, inconsistency may occur.
Solutions (a sketch of the reader and writer code follows the shared data below):
Shared data
semaphore mutex = 1;  /* protects readcount                    */
semaphore wrt = 1;    /* mutual exclusion for writers          */
int readcount = 0;    /* number of processes currently reading */
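A sketch of the classic first readers-writers solution built on these variables (writers serialize on wrt; the first and last reader bracket the reading group):

/* Writer */
do {
    wait(wrt);
    /* writing is performed */
    signal(wrt);
} while (true);

/* Reader */
do {
    wait(mutex);
    readcount++;
    if (readcount == 1)
        wait(wrt);        /* first reader locks out writers */
    signal(mutex);

    /* reading is performed */

    wait(mutex);
    readcount--;
    if (readcount == 0)
        signal(wrt);      /* last reader readmits writers   */
    signal(mutex);
} while (true);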
System Model
Deadlock Characterization
Deadlock Prevention
Deadlock Avoidance
Deadlock Detection
Basic Concepts
A set of blocked processes, each holding a resource and waiting to acquire another resource held by another process in the set, creates a deadlock.
Example: two processes each hold one tape drive, and each needs the tape drive held by the other; neither can proceed.
The resources are partitioned into several types, each consisting of several instances.
Processes may compete for same type of resources or different type of resources.
System Model
e.g. CPU cycles, memory space, I/O devices (printers, tape drives), logical resources like semaphore,
monitor, files
request
use
release
request: a process must request a resource before using it. If the request cannot be granted immediately, the requesting process must wait until it can acquire the resource.
use: the process operates on the resource (e.g., prints on the printer).
release: the process releases the resource when it is done.
Deadlock Characterization
Deadlock can arise only if four conditions hold simultaneously:
1. Mutual exclusion
2. Hold and wait
3. No preemption
4. Circular wait
Mutual exclusion
At least one resource must be held in a non-sharable mode, i.e., only one process can use the resource at a time.
The requesting process must be delayed until the resource has been released.
Hold and wait
A process must be holding at least one resource and waiting to acquire additional resources held by other processes.
No preemption
A resource can be released only voluntarily by the process holding it after that process has
completed its task i.e. no resource can be forcibly removed from a process holding it.
Circular wait
There exists a set {P0, P1, ..., Pn} of waiting processes such that P0 is waiting for a resource held by P1, P1 is waiting for a resource held by P2, ..., Pn-1 is waiting for a resource held by Pn, and Pn is waiting for a resource held by P0.
Resource-Allocation Graph
Deadlocks can be described more precisely in terms of a directed graph called a system
resource allocation graph.
P = {P1, P2, …, Pn}, the set consisting of all the active processes in the system.
R = {R1, R2, …, Rm}, the set consisting of all resource types in the system.
Process: drawn as a circle labelled Pi.
Resource type with 4 instances: drawn as a rectangle Rj containing 4 dots.
Pi requests an instance of Rj: request edge Pi → Rj.
Pi is holding an instance of Rj: assignment edge Rj → Pi.
Example 1:
Example 2:
Example 3:
For example, process P2 is holding an instance of resource types R1 and R2 and is waiting for an instance of resource type R3.
Basic Facts
If the graph contains no cycle, then no process is deadlocked. If the graph contains a cycle, a deadlock may exist: with only one instance per resource type, a cycle implies deadlock; with several instances per type, a cycle indicates only the possibility of deadlock.
[Figure: a sequence of resource-allocation graph snapshots over resources R, S, and T; the processes acquire R, S, and T in conflicting orders until a cycle closes and deadlock results.]
Methods for Handling Deadlocks
Three general approaches exist: ensure that the system will never enter a deadlock state (prevention or avoidance); allow the system to enter a deadlock state, then detect it and recover; or ignore the problem and pretend that deadlocks never occur in the system, the approach used by most operating systems, including UNIX.
Deadlock Prevention
Deadlock prevention ensures that at least one of the four necessary conditions for deadlock cannot hold.
Preventing Mutual Exclusion: sharable resources (e.g., read-only files) do not require mutually exclusive access. But mutual exclusion must hold for intrinsically non-sharable resources, e.g., a file for which many processes want write permission, so this condition cannot in general be denied.
Preventing Hold and Wait: it must be guaranteed that whenever a process requests a resource, it does not hold any other resources.
One protocol allows a process to request resources only when it has none: if it needs additional resources, it must first release all the resources it currently holds and then request them.
Preventing No Preemption
If a process that is holding some resources requests another resource that cannot be
immediately allocated to it, then all resources currently being held should be released.
Preempted resources are added to the list of resources for which the process is waiting.
Process will be restarted only when it can regain its old resources, as well as the new ones
that it is requesting.
Possible only if state of the process can be saved e.g. CPU registers, memory space.
Preventing Circular Wait: impose a total ordering on all resource types (via an enumeration function F) and require that each process requests resources in increasing order of enumeration, e.g.:
F(tape drive) = 1
F(disk drive) = 5
F(printer) = 12
Deadlock Prevention: summary
Mutual exclusion
Hold and wait
No preemption
Circular wait
Deadlock Avoidance
This method requires that the system have some additional a priori information available.
Each process declares the maximum number of resources of each type that it may need.
The system uses this information to decide, for each request, whether it can be satisfied immediately or the process must wait, so as to avoid a possible future deadlock.
Safe State
A state is safe if the system can allocate resources to each process (up to its maximum) in
some order and still avoid a deadlock.
Example: consider a system with 12 tape drives and three processes P0, P1, and P2.

At time t0:

Process   Maximum Needs   Allocated   Still Needs
P0        10              5           5
P1        4               2           2
P2        9               2           7

Since 9 drives are allocated, 3 remain available. At time t0 the system is in a safe state: the sequence <P1, P0, P2> satisfies the safety condition.

Explanation: P1 can immediately be granted its 2 remaining drives (leaving 1 free); when P1 finishes and returns its 4 drives, 5 are available, enough for P0's remaining need; when P0 finishes, 10 drives are available, enough for P2's remaining need of 7.

Suppose that at time t1 process P2 requests and is allocated one more tape drive:

Process   Maximum Needs   Allocated   Still Needs
P0        10              5           5
P1        4               2           2
P2        9               3           6

Only 2 drives are now free, and the system is no longer safe: only P1 can be granted all its drives, and when it returns them only 4 drives are available, which is enough neither for P0 (still needs 5) nor for P2 (still needs 6).

A system is in a safe state if there exists a safe sequence of all processes, i.e., some ordering in which every currently running process can finish its execution in finite time.
A safe state is not a deadlocked state; a deadlocked state is always unsafe, but not every unsafe state leads to deadlock.
A sequence <P1, P2, ..., Pn> is safe if, for each Pi, the resources that Pi may still request can be satisfied by the currently available resources plus the resources held by all Pj with j < i.
If Pi's resource needs are not immediately available, then Pi can wait until all such Pj have finished.
When Pj is finished, Pi can obtain needed resources, execute, return allocated resources, and
terminate.
When Pi terminates, Pi+1 can obtain its needed resources, and so on.
Whenever a process requests a resource that is currently available, the system must decide
whether the resource can be allocated immediately or whether the process must wait.
The request is granted only if the allocation leaves the system in a safe state.
If we have a resource-allocation system with only one instance of each resource type, a
variant of the resource-allocation graph can be used for deadlock avoidance.
request edge
assignment edge
claim edge.
A claim edge Pi ⇢ Rj, represented by a dashed line, indicates that process Pi may request resource Rj at some time in the future.
A claim edge converts to a request edge Pi → Rj when the process actually requests the resource.
The request can be granted only if converting the request edge Pi → Rj to an assignment edge Rj → Pi does not result in the formation of a cycle in the resource-allocation graph.
We check for safety by using a cycle-detection algorithm.
An algorithm for detecting a cycle in this graph requires on the order of n² operations, where n is the number of processes in the system.
In the example graph, allocating R2 to P1 is OK because no cycle results; an allocation that would close a cycle must be refused even if the resource is free.
The banker's algorithm can handle multiple instances of each resource type, but it is less efficient than the resource-allocation-graph scheme.
When a new process enters the system, it must declare the maximum number of instances of
each resource type that it may need.
This number should not exceed the total number of resources in the system.
When a process gets all its resources it must return them in a finite amount of time.
Looks at each request for resources and tests if the request moves the system into an unsafe
state
Let n = the number of processes and m = the number of resource types.
Data Structures:
Available: A vector of length m. If Available[j] = k, then k instances of resource type Rj are available.
Max:
If Max[i, j] = k, then process Pi may request at most k instances of resource type Rj.
R0 R1 R2
P0 3 1 4
P1 2 4 3
Allocation:
R0 R1 R2
P0 2 0 2
P1 1 3 2
Need:
If Need[i, j] = k, then Pi may need k more instances of Rj to complete its task; Need[i, j] = Max[i, j] - Allocation[i, j].
R0 R1 R2
P0 1 1 2
P1 1 1 1
Safety Algorithm
1. Let Work and Finish be vectors of length m and n, respectively. Initialize Work := Available and Finish[i] := false for i = 1, 2, ..., n.
2. Find an index i such that both (a) Finish[i] = false and (b) Needi ≤ Work. If no such i exists, go to step 4.
3. Work := Work + Allocationi; Finish[i] := true; go to step 2.
4. If Finish[i] = true for all i, the system is in a safe state.
This algorithm may require on the order of m × n² operations to decide whether a state is safe.
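A compact C sketch of this safety check (the sizes N and M and all identifiers are illustrative, not part of any standard API):

#include <stdbool.h>

#define N 5   /* number of processes      */
#define M 3   /* number of resource types */

/* Returns true if the state given by available, need, and allocation
   is safe; safe_seq receives one safe ordering of process indices. */
bool is_safe(int available[M], int need[N][M],
             int allocation[N][M], int safe_seq[N])
{
    int work[M];
    bool finish[N] = { false };
    for (int j = 0; j < M; j++)
        work[j] = available[j];            /* step 1: Work := Available */

    int count = 0;
    while (count < N) {
        bool progress = false;
        for (int i = 0; i < N; i++) {
            if (finish[i])
                continue;
            int j;
            for (j = 0; j < M; j++)        /* step 2: Need_i <= Work?   */
                if (need[i][j] > work[j])
                    break;
            if (j == M) {                  /* step 3: P_i can finish    */
                for (j = 0; j < M; j++)
                    work[j] += allocation[i][j];
                finish[i] = true;
                safe_seq[count++] = i;
                progress = true;
            }
        }
        if (!progress)
            return false;                  /* step 4: some P_i is stuck */
    }
    return true;
}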
Resource-Request Algorithm (for a request vector Requesti made by process Pi):
1. If Requesti ≤ Needi, go to step 2. Otherwise, raise an error condition, since the process has exceeded its maximum claim.
2. If Requesti ≤ Available, go to step 3. Otherwise, Pi must wait, since the resources are not yet available.
3. Pretend to allocate the requested resources by updating the state:
Available := Available - Requesti
Allocationi := Allocationi + Requesti
Needi := Needi - Requesti
Then run the safety algorithm on the updated data structures. If a safe sequence exists, the resources are allocated to Pi; otherwise the tentative allocation is rolled back and Pi must wait.
5 processes P0 through P4
3 resource types
A (9 instances)
B (5 instances)
C (7 instances)
Running the safety algorithm with Work := Available = (2 2 3):

For P0: Need0 (5 2 3) is not ≤ Work (2 2 3), so P0 cannot finish yet; Finish0 remains false.
For P1: Need1 (2 2 2) ≤ Work (2 2 3); Finish1 := true; Work := Work + Allocation1 (2 1 0) = (4 3 3).
For P2: Need2 (6 1 0) is not ≤ Work (4 3 3); Finish2 remains false.
For P3: Need3 (0 0 2) ≤ Work (4 3 3); Finish3 := true; Work := Work + Allocation3 (2 1 0) = (6 4 3).
For P4: Need4 (3 0 1) ≤ Work (6 4 3); Finish4 := true; Work := Work + Allocation4 (0 0 2) = (6 4 5).
For P0 (second pass): Need0 (5 2 3) ≤ Work (6 4 5); Finish0 := true; Work := Work + Allocation0 (1 1 0) = (7 5 5).
For P2 (second pass): Need2 (6 1 0) ≤ Work (7 5 5); Finish2 := true.

The system is in a safe state, since the sequence <P1, P3, P4, P0, P2> satisfies the safety criteria.
Deadlock Detection
Detection algorithm
Recovery scheme
Process Termination
Resource Preemption
Single instance of each resource type: maintain a wait-for graph, obtained from the resource-allocation graph by removing the resource-type nodes and collapsing the appropriate edges.
Periodically, the system invokes an algorithm that searches for a cycle in this graph.
An edge Pi → Pj implies that process Pi is waiting for process Pj to release a resource that Pi needs.
An edge Pi → Pj exists in the wait-for graph if and only if the corresponding resource-allocation graph contains the two edges Pi → Rq and Rq → Pj for some resource type Rq.
Several instances of a resource type: the detection algorithm uses the following data structures.
Available: A vector of length m indicating the number of available resources of each type.
Allocation: An n × m matrix defining the number of resources of each type currently allocated to each process.
Request: An n × m matrix indicating the current request of each process. If Request[i, j] = k, then process Pi is requesting k more instances of resource type Rj.
Detection Algorithm
1. Let Work and Finish be vectors of length m and n, respectively. Initialize Work := Available; for i = 1, 2, ..., n, if Allocationi ≠ 0 then Finish[i] := false, otherwise Finish[i] := true.
2. Find an index i such that both (a) Finish[i] = false and (b) Requesti ≤ Work. If no such i exists, go to step 4.
3. Work := Work + Allocationi; Finish[i] := true; go to step 2.
4. If Finish[i] = false for some i, the system is in a deadlocked state; moreover, process Pi is deadlocked.
The algorithm requires on the order of m × n² operations.
5 processes P0 through P4
3 resource types
A (6 instances)
B (2 instances)
C (5 instances)
        Allocation   Request   Available
        A  B  C      A  B  C   A  B  C
P0      0  1  1      0  0  0   0  0  0
P1      3  0  1      1  0  2
P2      2  0  1      0  0  0
P3      0  1  0      1  1  0
P4      1  0  2      1  0  2
The sequence <P0, P2, P3, P4, P1> results in Finish[i] = true for all i, so the system is not deadlocked.
Suppose now that P2 makes one additional request for an instance of type C:

        Allocation   Request   Available
        A  B  C      A  B  C   A  B  C
P0      0  1  1      0  0  0   0  0  0
P1      3  0  1      1  0  2
P2      2  0  1      0  0  2
P3      0  1  0      1  1  0
P4      1  0  2      1  0  2

State of system? P0's request can still be satisfied, but after P0 finishes and returns its resources (Work = 0 1 1) no other process's request can be met. A deadlock exists, consisting of processes P1, P2, P3, and P4.
Process Termination
Resource Preemption
Abort all deadlocked processes: breaks the cycle immediately but is very expensive.
Abort one process at a time until the deadlock cycle is eliminated: also expensive, because after each process is aborted, the deadlock-detection algorithm must be invoked again to determine whether processes are still deadlocked.
Factors in choosing which process to abort include:
How long the process has computed, and how much longer it needs to complete.
How many and what types of resources the process has used.
Selecting a victim: determine which resources are to be preempted from which processes so as to minimize cost.
Rollback: a process whose resources are preempted must be rolled back to some safe state and restarted from that state.
Starvation: the same process may be picked as victim repeatedly, so the number of rollbacks must be bounded (e.g., by including the number of rollbacks in the cost factor).
Unit 4
Chapter 1 - Memory Management Strategies
Background
Basic Hardware
Address Binding
Dynamic Loading
Swapping
Contiguous Allocation
Paging
Segmentation
Background
Memory consists of a large array of words or bytes, each with its own address.
The CPU fetches instructions from memory according to the value of the program counter.
These instructions may cause additional loading from and storing to specific memory addresses.
Basic Hardware
The CPU can directly access only main memory and its own registers.
Thus any instruction being executed, and any data it uses, must reside in one of these two storages.
Registers that are built into CPU are accessible within one cycle of the CPU clock.
CPU can decode instructions and perform operations on registers contents at the rate of one or
more operations per CPU clock tick.
Main memory, which is accessed via a transaction on the memory bus, takes many CPU cycles to reach, so the CPU may have to stall until the data or instructions become available.
The remedy is to add a faster memory, i.e., a cache, between the CPU and main memory.
Operating system should be protected from user programs and user programs from each other.
This protection can be provided by two registers: a base register, which holds the smallest legal physical memory address, and a limit register, which specifies the size of the legal range.
For example, with a base register of 300040 and a limit register of 120900, the program can access all addresses from 300040 through 420939 (inclusive).
Protection of memory space is accomplished by having the CPU hardware compare every address generated in user mode with these registers.
Any attempt by a program executing in user mode to access operating system memory or other
users’ memory results in a trap to the operating system, which treats the attempt as a fatal
error.
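Conceptually, the hardware performs a check like the following on every user-mode access (the function legal_access is purely illustrative):

#include <stdbool.h>

/* An access to addr in user mode is legal only within
   [base, base + limit); otherwise the hardware traps to the OS. */
bool legal_access(unsigned long addr, unsigned long base, unsigned long limit)
{
    return addr >= base && addr < base + limit;
}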
The base and limit registers can be loaded only by the operating system, using a special privileged instruction. Since privileged instructions can be executed only in kernel mode, and only the operating system runs in kernel mode, only the operating system can load these registers.
This scheme allows OS to change the value of the registers but prevents user programs from
changing registers’ contents.
Address Binding
Program must be brought into memory and placed within a process for it to be executed.
The process may be moved between disk and memory during its execution depending on the
memory management in use.
Input queue – collection of processes on the disk that are waiting to be brought into memory to
run the program.
The binding of instructions and data to memory addresses can be done at any step along this way; classically, it happens at one of three different stages:
Compile time
Load time
Execution time
Compile time
If memory location is known at compile time, then absolute code can be generated
Compiled code will start at that location and extend up from there.
If, at some later time, the starting location changes, then it will be necessary to
recompile this code.
Load time
The compiler must generate relocatable code if the memory location is not known at compile time; final binding is then delayed until load time.
Execution time
Binding is delayed until run time if the process can be moved during its execution from
one memory segment to another.
Special hardware must be available for address mapping (e.g., base and limit registers).
The concept of a logical address space that is bound to a separate physical address space is central to proper memory management.
The set of all logical addresses generated by a program is a logical-address space; the set of all physical addresses corresponding to these logical addresses is a physical-address space.
Logical and physical addresses are the same in the compile-time and load-time address-binding schemes; they differ in the execution-time scheme, where the logical address is also called a virtual address.
The user program deals with logical addresses; it never sees the real physical addresses.
The memory-management unit (MMU) is the hardware device that maps logical (virtual) addresses to physical addresses at run time.
In the simplest MMU scheme, the value in the relocation register (a base register) is added to every logical address generated by a user process at the time the address is sent to memory.
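For example (a standard illustration): with the relocation register set to 14000, a logical address of 346 is mapped to physical address 14000 + 346 = 14346.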