Distributed Processing
A Distributed system is a collection of independent computers that appear to the users of the system as a single computer.
Design Issues
Transparency
Flexibility
Reliability
Performance
Scalability
Transparency
A transparent system is one where the system designers fool everyone into thinking that the collection of machines is simply an old-fashioned timesharing system. Transparency can be achieved in two ways:
Easier method: Hide the distribution from the users.
Harder method: Make the system look transparent to programs.
Types of Transparency
Flexibility
The distributed system should be flexible to any changes in the hardware organization or software applications.
Reliability
A highly reliable system must be highly available, where availability refers to the fraction of time the system is usable.
Performance
Good performance means that the time taken to run a program on a distributed system should not be worse than running it on a single system. Various performance metrics can be used here, for example:
Response time
Throughput
Scalability
If the number of processors is increased, the performance should improve accordingly.
The MIMD architecture can further be divided into two categories:
1. Tightly coupled (Multiprocessors): the inter-machine delay is short and the data rate is high.
2. Loosely coupled (Multicomputers): the inter-machine delay is large and the data rate is low.
Bus-based Multiprocessors
Switched Multiprocessors
Bus-based Multicomputers
Switched Multicomputers
Software Concepts
Network Operating System
True Distributed Systems
The previous setup is a very primitive form of communication. A better way is to provide a shared, global form of information sharing such as a file system.
Communications in DS
Distributed systems do not have a shared memory.
Processes run on separate machines.
Each has its own resources, including processing and memory.
They cannot communicate through shared memory.
Instead they pass messages over a network.
Message Passing
Distributed processes communicate by passing messages:
send a message to a destination
receive a message from a source
Mutual exclusion is not a problem, but new issues in condition synchronisation arise.
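As an illustration, a minimal sketch of these two primitives over TCP sockets; the function names, port, and the 4096-byte read limit are assumptions for the example, not part of any standard API.

import socket

def send_message(dest_host, dest_port, payload):
    # Open a connection to the destination and send the message bytes.
    with socket.create_connection((dest_host, dest_port)) as s:
        s.sendall(payload.encode())

def receive_message(listen_port):
    # Wait for one incoming connection and return the message it carries.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.bind(("", listen_port))
        srv.listen(1)
        conn, _addr = srv.accept()
        with conn:
            return conn.recv(4096).decode()

Only the bare send/receive pair is shown; real systems layer naming, buffering, and failure handling on top of these primitives.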
OSI Model
Clock Synchronization
In general, the Distributed systems have the following properties:
The relevant information is scattered among multiple machines.
Processes make decisions based only on local information.
A single point of failure in the system should be avoided.
No common clock or other precise global time source exists.
Logical Clocks
A computer timer is usually built around a precise quartz crystal. Two registers are associated with each crystal: a counter and a holding register. The value of the holding register is copied into the counter, and with each oscillation of the crystal the counter is decremented by 1. As soon as the value in the counter reaches 0, an interrupt is generated; this interrupt is called a clock tick. The actual time is set at boot time, and with every tick the time is incremented. Due to slight variations in the oscillation frequency of the quartz crystals, two clocks may gradually drift apart by a certain amount. This difference is called clock skew.
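A minimal sketch simulating this drift; the rates here are made-up values purely for illustration.

def simulate_skew(seconds, rate_a=1.0000, rate_b=0.9999):
    # Two software clocks whose crystals tick at slightly different rates.
    clock_a = clock_b = 0.0
    for _ in range(seconds):
        clock_a += rate_a    # clock A advances ~1 s per real second
        clock_b += rate_b    # clock B runs slightly slow
    return clock_a - clock_b # accumulated skew

print(simulate_skew(3600))   # ~0.36 s of skew after one hour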
Lamport's Algorithm
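As a reference point, a minimal sketch of the usual statement of Lamport's logical-clock rules: increment the clock before each local event or send, and on receipt advance it to one past the maximum of the local clock and the received timestamp.

class LamportClock:
    def __init__(self):
        self.time = 0

    def tick(self):
        # Rule 1: increment before each local event or message send.
        self.time += 1
        return self.time

    def update(self, received_time):
        # Rule 2: on receiving a timestamped message, advance to just
        # past the larger of the two clocks.
        self.time = max(self.time, received_time) + 1
        return self.time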
Physical Clocks
These clocks work the same as logical clocks, with the added constraint that they must never deviate from the actual time. Physical clock synchronization algorithms:
Cristian's algorithm
Berkeley's algorithm
Cristian's Algorithm
One of the machines, which is capable of managing the actual physical time, is called the time server. Every δ/2ρ seconds, each machine sends a message to the time server. Here ρ is a constant specified by the manufacturer, known as the maximum drift rate, and δ is the maximum amount by which any two clocks are allowed to differ: if we require that no two clocks ever vary by more than δ, then synchronization at least every δ/2ρ seconds is necessary.
The time server replies with a message containing the current time T. When the machine that sent the request gets the reply, it can simply set its time to T. The problems associated with this are:
Time cannot run backward: if the receiver's clock is ahead of T, it must not be set back to a lesser value. Solution: the clock is slowed down gradually, so that over the next few minutes the time converges to the required value.
Message passing takes a non-zero amount of time. Solution: the message propagation time is estimated and added to T.
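A minimal sketch of the client-side correction, assuming a hypothetical request_time_from_server() that performs the actual network exchange and returns the server's time T. Half the measured round trip is the standard estimate for the propagation time added to T.

import time

def cristian_sync(request_time_from_server):
    # T0 and T1 bracket the request so we can estimate propagation delay.
    t0 = time.monotonic()
    server_time = request_time_from_server()  # server's clock value T
    t1 = time.monotonic()
    # Assume the reply took roughly half the round trip to arrive.
    propagation = (t1 - t0) / 2
    return server_time + propagation          # corrected local time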
Berkeley's Algorithm
Mutual Exclusion
When a process has to read or update a certain shared data structure, it first enters a critical region to achieve mutual exclusion and ensure that no other machine can access that data structure at that time.
A Centralized algorithm
One processor is elected as the coordinator. Whenever a process wants to enter a critical region, it sends a request message to the coordinator, stating which critical region it wants to enter and asking for permission. If no other process is currently in the critical region, the coordinator sends back a reply granting permission.
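A minimal sketch of the coordinator's bookkeeping under these rules; message transport is left out, and the queue of deferred requests is implied by the algorithm rather than spelled out above.

from collections import deque

class Coordinator:
    def __init__(self):
        self.holder = None        # process currently in the critical region
        self.waiting = deque()    # requests deferred until the region is free

    def on_request(self, pid):
        if self.holder is None:
            self.holder = pid
            return "GRANT"        # region free: grant immediately
        self.waiting.append(pid)  # otherwise queue the requester
        return None               # no reply yet; requester blocks

    def on_release(self, pid):
        assert pid == self.holder
        self.holder = self.waiting.popleft() if self.waiting else None
        return self.holder        # next process to receive GRANT, if any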
A Distributed Algorithm
When a process wants to enter a critical region, it builds a message containing the name of the critical region it wants to enter, its processor number, and the current time. It then sends the message to all processors, including itself, and every message is acknowledged.
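This is the request-building step of what is commonly known as the Ricart-Agrawala algorithm. A minimal sketch follows, with the standard grant/defer rule added for context; the field names and state labels are illustrative assumptions.

def build_request(region_name, processor_id, lamport_time):
    # The three fields named in the text: region, processor number, time.
    return {"region": region_name, "from": processor_id, "time": lamport_time}

def should_grant(my_state, my_request, incoming):
    # Standard rule: grant immediately unless we hold, or are waiting for,
    # the same region with an earlier (timestamp, id) pair.
    if my_state == "RELEASED" or incoming["region"] != my_request["region"]:
        return True
    if my_state == "HELD":
        return False                         # defer until we leave the region
    mine = (my_request["time"], my_request["from"])
    theirs = (incoming["time"], incoming["from"])
    return theirs < mine                     # WANTED: earlier request wins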
Deadlocks
Deadlocks in Distributed systems are harder to avoid, prevent or even detect compared to single processor systems. Deadlocks can be of two types:
Communication deadlock
Resource deadlock
One machine is the coordinator. Each machine maintains the resource graph for its own processes and resources, while the coordinator maintains the complete graph for the entire system. When the coordinator detects a cycle, it kills one of the processes to break the deadlock. The resource graph at the coordinator can be kept up to date in several ways:
Whenever a processor adds or deletes an arc, a message is sent to the coordinator.
Alternatively, the resource graphs can be updated periodically.
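A minimal sketch of the cycle check the coordinator might run over its global graph, assuming the graph is kept as a simple waits-for adjacency map; a cycle in this graph is a deadlock.

def find_cycle(wait_for):
    # wait_for maps each process to the processes/resources it waits on.
    # Standard depth-first search: reaching a node already on the current
    # path means the graph contains a cycle.
    visited, on_path = set(), set()

    def dfs(node):
        visited.add(node)
        on_path.add(node)
        for nxt in wait_for.get(node, ()):
            if nxt in on_path:
                return True
            if nxt not in visited and dfs(nxt):
                return True
        on_path.discard(node)
        return False

    return any(dfs(n) for n in wait_for if n not in visited)

# Example: A waits on B, B waits on A -> deadlock.
print(find_cycle({"A": ["B"], "B": ["A"]}))  # True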
A probe message consisting of three numbers is sent: the process that just blocked, the process sending the message, and the process to whom it is being sent. When a message arrives, the recipient checks to see whether it itself is waiting on any process. If so, the message is updated, replacing the second field with the recipient's own id, and forwarded to each process being waited on. If the message goes all the way around and comes back to the original sender, a deadlock is detected. The deadlock can then be broken by killing a process.
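A minimal sketch of this edge-chasing scheme, essentially the Chandy-Misra-Haas algorithm, simulated centrally over an assumed waits-for map rather than with real network messages.

def detect_deadlock(blocked_process, waits_on):
    # waits_on maps each process to the processes it is waiting for.
    # Probes are (initiator, sender, receiver) triples; if a probe ever
    # returns to the initiator, the waits-for cycle is a deadlock.
    probes = [(blocked_process, blocked_process, w)
              for w in waits_on.get(blocked_process, ())]
    seen = set()
    while probes:
        initiator, sender, receiver = probes.pop()
        if receiver == initiator:
            return True                  # probe came back: deadlock
        if (sender, receiver) in seen:
            continue                     # don't re-send the same probe
        seen.add((sender, receiver))
        for nxt in waits_on.get(receiver, ()):
            # The recipient is itself waiting: update the second field
            # with its own id and forward the probe.
            probes.append((initiator, receiver, nxt))
    return False

print(detect_deadlock("P1", {"P1": ["P2"], "P2": ["P3"], "P3": ["P1"]}))  # True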
DP (Distributed Programming)
Distributed computing is a field of computer science that studies distributed systems. A distributed system consists of multiple autonomous computers that communicate through a computer network. The computers interact with each other in order to achieve a common goal. A computer program that runs in a distributed system is called a distributed program, and distributed programming is the process of writing such programs. Distributed computing also refers to the use of distributed systems to solve computational problems. In distributed computing, a problem is divided into many tasks, each of which is solved by one or more computers.
Petri Nets
A Petri net (also known as a place/transition net or P/T net) is one of several mathematical modeling languages for the description of distributed systems. A Petri net is a directed bipartite graph in which the nodes represent transitions (i.e. events that may occur, signified by bars) and places (i.e. conditions, signified by circles). The directed arcs (signified by arrows) describe which places are pre- and/or postconditions for which transitions. Some sources state that Petri nets were invented in August 1939 by Carl Adam Petri at the age of 13 for the purpose of describing chemical processes. Applications: workflow management, process modeling. Please refer to http://en.wikipedia.org/wiki/Petri_net#Application_areas for an example.
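To make the place/transition structure concrete, a minimal sketch of a net with token firing; the two-place "ready"/"done" example is made up for illustration.

class PetriNet:
    def __init__(self, marking):
        self.marking = dict(marking)   # tokens currently in each place
        self.transitions = {}          # name -> (input places, output places)

    def add_transition(self, name, inputs, outputs):
        self.transitions[name] = (inputs, outputs)

    def enabled(self, name):
        inputs, _ = self.transitions[name]
        # A transition is enabled when every input place holds a token.
        return all(self.marking.get(p, 0) > 0 for p in inputs)

    def fire(self, name):
        inputs, outputs = self.transitions[name]
        assert self.enabled(name), f"{name} is not enabled"
        for p in inputs:
            self.marking[p] -= 1                          # consume tokens
        for p in outputs:
            self.marking[p] = self.marking.get(p, 0) + 1  # produce tokens

# Made-up example: one token moves from place 'ready' to place 'done'.
net = PetriNet({"ready": 1, "done": 0})
net.add_transition("run", inputs=["ready"], outputs=["done"])
net.fire("run")
print(net.marking)   # {'ready': 0, 'done': 1}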