Module 1
By
Prof. Ankita Mandore,
Assistant Professor,
CSE (Data Science), DSCE
What is Distributed Computing?
The term resource is a rather abstract one, but it best characterizes the
range of things that can usefully be shared in a networked computer system.
Equipment is shared to reduce cost. Data shared in databases or web pages
are high-level resources that are more significant to users, without regard
for the server or servers that provide them.
Types of resources
1. Hardware resource: Hard disk, printer, camera
2. Data: File, database, web page.
3. Service: Search Engine
Patterns of resource sharing vary widely in their scope and in how closely users
work together.
Search Engine:
Users need no contact with one another.
Computer supported co-operative working (CSCW):
Users cooperate directly and share resources. The mechanisms that coordinate users' actions are
determined by the pattern of sharing and the geographic distribution of the users.
For effective sharing, each resource must be managed by a program that offers a
communication interface enabling the resource to be accessed and updated reliably and
consistently.
Service:
Manages a collection of related resources and presents their functionalities to users and
applications.
Server:
A server stores resources and provides services to authenticated clients.
It is a running program on a networked computer. The server accepts requests from clients,
performs the service, and responds to each request.
Example: Apache server
The complete interaction between server machine and client machine, from the point when
the client sends its request to when it receives the server's response is called a remote
invocation.
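The request/response cycle described above can be sketched with a loopback TCP socket. This is only an illustration under stated assumptions: the function names `serve_once`, `remote_invocation` and `demo`, the "echo" service, and the use of 127.0.0.1 are hypothetical choices, not part of the original slides.

```python
import socket
import threading


def serve_once(server_sock):
    """Server side: accept one client, read its request, send a response."""
    conn, _addr = server_sock.accept()
    with conn:
        request = conn.recv(1024).decode("utf-8")
        conn.sendall(f"echo: {request}".encode("utf-8"))


def remote_invocation(port, request):
    """Client side: send a request and wait for the server's response."""
    with socket.create_connection(("127.0.0.1", port)) as sock:
        sock.sendall(request.encode("utf-8"))
        return sock.recv(1024).decode("utf-8")


def demo():
    # Bind to port 0 so the operating system picks a free port.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server_sock:
        server_sock.bind(("127.0.0.1", 0))
        server_sock.listen(1)
        port = server_sock.getsockname()[1]
        server = threading.Thread(target=serve_once, args=(server_sock,))
        server.start()
        reply = remote_invocation(port, "hello")
        server.join()
    return reply
```

Calling `demo()` runs one complete remote invocation: the time between `sendall` in the client and the return of `recv` in the client spans the whole interaction.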
Hardware resources
CPU
a) Computing server: It executes processor-intensive applications for clients
b) Remote object server: It executes methods on behalf of clients
c) Worm program: It shares the CPU capacity of a desktop machine with the local user
Memory
Cache server holds recently accessed web pages in its RAM, for faster access by other
local computers
Disk
File server, virtual disk server, video-on-demand server
Screen
Network window systems
Printer
Networked printers accept print jobs from many computers and manage them with a
queuing system
Software Resources
Web page
Web servers enable multiple clients to share read-only page content.
File
File servers enable multiple clients to share read-write files
Object
Possibilities for software objects are limitless. Shared white board, Shared diary and
room booking system are examples of this type.
Database
Databases are intended to record the definitive state of some related sets of data.
They have been shared ever since multi-user computers appeared. They include
techniques to manage concurrent updates.
News group Content
The net news system makes read-only copies of the recently posted news items
available to clients throughout the internet
Video/Audio Stream
Servers can store entire videos on disk and deliver them at playback speed to
multiple clients simultaneously
Advantages of Distributed Computing
Disadvantages of Distributed Computing
Architectures of Distributed Systems
Inter Process Communication - Shared Data
Inter Process Communication - Message Passing
Message Passing in Distributed System
A process is a program in execution.
Each resource manager process monitors the current status of usage of its local
resources. All resource managers communicate with each other from time to time to
dynamically balance the system load.
Therefore a distributed operating system (DOS) needs to provide an inter-process communication (IPC) mechanism for
these communication activities.
IPC basically requires information sharing among two or more processes.
Two basic methods for information sharing
Original sharing, or shared-data approach
Copy sharing, or message-passing approach
The shared-data paradigm gives the conceptual communication pattern.
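The difference between the two methods can be illustrated inside a single Python program. This is a simplified sketch, not a real distributed setup: the dictionary stands in for a common memory area, the `queue.Queue` stands in for a message channel, and all variable names are hypothetical.

```python
import copy
import queue

# Shared-data (original sharing): both parties refer to the same object.
shared_area = {"load": 10}
sender_view = shared_area
receiver_view = shared_area
sender_view["load"] = 42            # the receiver sees this change directly

# Message passing (copy sharing): the receiver gets its own copy.
channel = queue.Queue()
original = {"load": 10}
channel.put(copy.deepcopy(original))    # "send" copies the data
received = channel.get()                # "receive" takes the copy
original["load"] = 99                   # a later change by the sender is not seen
```

After this runs, `receiver_view["load"]` is 42 because the data is shared, while `received["load"]` is still 10 because only a copy was passed.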
3. Efficiency
• If the message-passing system (MPS) is not efficient, IPC may become so expensive
that application users try to avoid its use in their applications
• An IPC protocol of an MPS can be made efficient by reducing the number of message
exchanges during communication
• Some optimizations are
• Avoiding the cost of establishing and terminating connections between the same pair of processes
for every message exchange
• Minimizing the cost of maintaining connections
• Piggybacking of acknowledgement
4. Reliability
• A reliable IPC protocol can cope with failure problems and guarantees the
delivery of a message.
• Failures may be due to a node crash or a communication link failure
• Handling of lost messages usually involves acknowledgements and retransmissions
on the basis of timeouts
• Another issue related to reliability is duplicate messages
• Duplicate messages may be sent in the event of failures or timeouts
• A reliable IPC protocol is also capable of detecting and handling duplicates
• Use sequence number to avoid duplicate messages
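The reliability points above (acknowledgements, retransmission on timeout, and duplicate detection by sequence number) can be sketched as follows. This is a minimal simulation, assuming an in-process "network" in which only acknowledgements are lost; the names `Receiver`, `send_reliably` and `lose_ack_on_attempts` are hypothetical.

```python
class Receiver:
    """Delivers each sequence number at most once and always acknowledges."""

    def __init__(self):
        self.seen = set()
        self.delivered = []

    def on_message(self, seq, payload):
        if seq not in self.seen:          # first copy: deliver it
            self.seen.add(seq)
            self.delivered.append(payload)
        return seq                        # acknowledge, also for duplicates


def send_reliably(receiver, seq, payload, lose_ack_on_attempts, max_attempts=5):
    """Retransmit until an acknowledgement gets through.

    lose_ack_on_attempts is a set of attempt numbers (starting at 1) on which
    the simulated network drops the acknowledgement, standing in for a timeout.
    Returns the number of attempts that were needed.
    """
    for attempt in range(1, max_attempts + 1):
        ack = receiver.on_message(seq, payload)
        if attempt in lose_ack_on_attempts:
            continue                      # "timeout": no ack seen, so resend
        if ack == seq:
            return attempt
    raise TimeoutError("no acknowledgement received")
```

If the first two acknowledgements are lost, the sender transmits the same message three times, but the receiver delivers it only once because the sequence number has already been seen.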
5. Correctness
• An IPC system may support group communication
• One sender to multiple receivers, or multiple senders to one receiver
• Correctness is related to IPC protocols for group communication
Issues related to correctness are
Atomicity
Ensures that every message sent to a group of receivers will be delivered to either all of
them or none of them
Ordered delivery
Ensures that messages arrive at all receivers in an order acceptable to the application
Survivability
Guarantees that messages will be delivered despite partial failure of processes,
machines, or communication links
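Ordered delivery can be sketched with a hold-back buffer on the receiver. The example assumes one particular ordering requirement, namely that messages from a single sender are delivered in sequence-number order; other applications may accept different orders. The class name `OrderedReceiver` is hypothetical.

```python
class OrderedReceiver:
    """Holds back out-of-order messages until the gap before them is filled."""

    def __init__(self):
        self.next_seq = 0        # sequence number expected next
        self.held = {}           # messages that arrived too early
        self.delivered = []      # messages handed to the application

    def on_message(self, seq, payload):
        self.held[seq] = payload
        # Deliver as many consecutive messages as are now available.
        while self.next_seq in self.held:
            self.delivered.append(self.held.pop(self.next_seq))
            self.next_seq += 1
```

A message with sequence number 1 that arrives before number 0 stays in `held` and reaches the application only after number 0 has arrived.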
6. Flexibility
• Not all applications require the same degree of reliability and correctness of the
IPC protocols
• Many applications do not require atomicity or ordered delivery of messages
• The IPC primitives should be such that users have the flexibility to choose and
specify the types and levels of reliability and correctness requirements of
applications
• Flexibility also permits control flow such as synchronous and asynchronous send/receive
7. Security
• An MPS should be capable of providing secure end-to-end communication
• A message in transit on the network should not be accessible to any user other
than those to whom it is addressed and the sender
• Steps necessary for secure communication are
• Authentication of the receiver(s) of a message by the sender
• Authentication of the sender of a message by its receiver(s)
• Encryption of a message before sending it over the network
8. Portability
• There are two different aspects of portability
• The MPS itself should be portable: it should be possible to easily construct a new IPC facility on another system by reusing the basic design
of the existing MPS
• Applications written using the MPS should also be portable, so heterogeneity must be considered while designing an MPS
Message Structure
Interrupt
When the message is filled in the buffer, a software interrupt is used to notify the receiving
process
This method permits the receiving process to continue without issuing unsuccessful test
requests
It is highly efficient and allows maximum parallelism
The drawback is that user-level interrupts make programming difficult
A variant of the nonblocking receive primitive is the conditional receive primitive
It returns control immediately, either with a message or with an indicator that no
message is available
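A conditional receive can be sketched with Python's standard `queue.Queue`, whose `get_nowait()` raises `queue.Empty` when nothing is waiting. The function name `conditional_receive` and the `(flag, message)` return convention are assumptions made for this example.

```python
import queue


def conditional_receive(mailbox):
    """Return (True, message) if one is waiting, else (False, None) at once."""
    try:
        return True, mailbox.get_nowait()
    except queue.Empty:
        return False, None
```

The caller is never blocked: it inspects the flag and either processes the message or carries on with other work.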
A blocking send primitive often uses a timeout value
The value is either set by the user or is a default value
A timeout value can also be used with a blocking receive primitive to prevent the receiving
process from being blocked indefinitely
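A blocking receive with a timeout can be sketched with the `timeout` argument of `queue.Queue.get`. The function name `receive_with_timeout` and the default of 0.1 seconds are hypothetical; returning `None` on timeout is one possible convention.

```python
import queue


def receive_with_timeout(mailbox, timeout_seconds=0.1):
    """Block until a message arrives, but give up after timeout_seconds."""
    try:
        return mailbox.get(timeout=timeout_seconds)
    except queue.Empty:
        return None   # timed out: the caller decides whether to retry
```

Without the timeout, a receiver waiting on a crashed sender would stay blocked forever.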
When both the send and receive primitives of a communication between two
processes use blocking semantics, the communication is said to be synchronous
If it uses nonblocking primitives, the communication is asynchronous
Synchronous communication is simple and easy to implement
It provides high reliability
Drawbacks are
It limits concurrency and is subject to communication deadlocks
It is less flexible because the sending process always has to wait for an acknowledgement,
even when it is not required
Buffering
Message transmission involves copying the message from the address space of the sending process to the
address space of the receiving process
If the receiving process is not ready to receive a message, the message should be
saved for later use
Message buffering is closely related to the synchronization strategy
The following are the buffering strategies
1. Null buffer or no buffer
2. Buffer with unbounded capacity
3. Single-message buffer
4. Finite-bound or multiple-message buffer
Null buffer (or no buffering)
There is no place to temporarily store the message
One of the following implementation strategies is used
The message remains in the sender's address space and execution of send is delayed
until the receiver executes receive
The message is simply discarded and a timeout mechanism is used to resend the
message after a timeout period
Single-Message buffer
A buffer with the capacity to store a single message is used on the receiver's node
An application module may have at most one message outstanding at a time
The idea of the single-message buffer strategy is to keep the message ready for use at the
location of the receiver
The request message is buffered on the receiver's node if the receiver is not
ready to receive the message
The message buffer may either be located in the kernel's address space or in
the receiver process's address space
Unbounded-capacity buffer
Finite-bound (or multiple-message) buffer
Flow-controlled communication
The sender is blocked until the receiver accepts some messages
This method introduces a synchronization between sender and receiver
It may result in unexpected deadlocks
The amount of buffer space to be allocated depends on implementation
A create-buffer system call is provided to the users
The receiver's mailbox is located either in the kernel's address space or in the receiver
process's address space
This buffering provides better concurrency and flexibility
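Flow-controlled communication over a finite-bound buffer can be sketched with a bounded `queue.Queue`: `put` blocks while the buffer is full and resumes once the receiver has taken a message. The function name `run` and the capacity of 2 are assumptions for illustration, and the two threads stand in for sender and receiver processes.

```python
import queue
import threading


def run(capacity=2, total=5):
    mailbox = queue.Queue(maxsize=capacity)   # finite-bound buffer
    received = []

    def sender():
        for i in range(total):
            mailbox.put(i)                    # blocks while the buffer is full

    def receiver():
        for _ in range(total):
            received.append(mailbox.get())    # frees one slot per message

    threads = [threading.Thread(target=sender),
               threading.Thread(target=receiver)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received
```

The sender can run at most `capacity` messages ahead of the receiver, which is the synchronization between sender and receiver that this strategy introduces.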
Multidatagram Messages
Every network has an upper bound on the size of data that can be transmitted at a time
This size is known as the maximum transfer unit (MTU) of the network
A message larger than the MTU has to be fragmented into pieces no larger than the MTU
Each fragment is sent separately
Each fragment is sent in a packet with control information and data
Each packet is known as datagram
Messages smaller than the MTU of the network can be sent in a single packet
known as single-datagram messages
Messages larger than the MTU of the network have to be fragmented and sent
in multiple packets known as multidatagram messages
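Fragmentation and reassembly can be sketched as below. As a simplification, the MTU here limits only the data part of each datagram, and the control information is reduced to a fragment index and a fragment count; the names `fragment` and `reassemble` and this datagram layout are hypothetical.

```python
def fragment(message: bytes, mtu: int):
    """Split message into datagrams (index, total, data) with len(data) <= mtu."""
    if mtu <= 0:
        raise ValueError("mtu must be positive")
    chunks = [message[i:i + mtu] for i in range(0, len(message), mtu)] or [b""]
    total = len(chunks)
    return [(index, total, data) for index, data in enumerate(chunks)]


def reassemble(datagrams):
    """Rebuild the message even if the datagrams arrived out of order."""
    ordered = sorted(datagrams, key=lambda d: d[0])
    total = ordered[0][1]
    if len(ordered) != total:
        raise ValueError("missing fragments")
    return b"".join(data for _, _, data in ordered)
```

With an MTU of 4, the 10-byte message `b"abcdefghij"` becomes a multidatagram message of three fragments, while `b"abc"` fits in a single datagram.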
Encoding and Decoding of Message Data
The structure of program objects should be preserved while they are transmitted
from the address space of the sending process to that of the receiving process
This is difficult, especially when the two processes are on computers of different architectures,
for two reasons
An absolute pointer value loses its meaning when transferred from one address
space to another
Different program objects occupy varying amounts of storage space, e.g. long int,
short int, variable-size character strings
Because of these problems, the program objects are first converted to a stream form suitable for
transmission and placed into a message buffer
This conversion process on the sender side is known as encoding of the message
data
When the message is received, the stream form is converted back to the original program objects
This is known as decoding
Two representations are used for
encoding and decoding
Tagged representation
The type of each program object along with its value is encoded in the message
The receiving process can check the type of each program object in the message
This is possible because of the self-describing nature of the coded data format
Untagged representation
The message data only contains program objects
No information is included in the message data to specify the type of each program
object
The receiving process must have prior knowledge of how to decode the received data
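The two representations can be contrasted using Python's standard `struct` module. This is a sketch under stated assumptions: only 32-bit integers and 64-bit floats are supported, the one-byte tags `b"i"` and `b"d"` are an invented convention, and the function names are hypothetical.

```python
import struct


def encode_untagged(count: int, ratio: float) -> bytes:
    # The receiver must already know the layout: 32-bit int, then 64-bit float.
    return struct.pack("!id", count, ratio)


def decode_untagged(data: bytes):
    return struct.unpack("!id", data)


def encode_tagged(values) -> bytes:
    out = b""
    for value in values:
        if isinstance(value, int):
            out += b"i" + struct.pack("!i", value)   # tag, then value
        elif isinstance(value, float):
            out += b"d" + struct.pack("!d", value)
        else:
            raise TypeError("unsupported type")
    return out


def decode_tagged(data: bytes):
    formats = {b"i": "!i", b"d": "!d"}
    values, pos = [], 0
    while pos < len(data):
        fmt = formats[data[pos:pos + 1]]             # read the type tag
        pos += 1
        size = struct.calcsize(fmt)
        values.append(struct.unpack(fmt, data[pos:pos + size])[0])
        pos += size
    return values
```

The tagged form costs one extra byte per object but lets the receiver decode any sequence of supported types, whereas the untagged form is more compact and works only when both sides agree on the layout in advance.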
Algorithmic challenges in distributed computing
Multicast messaging enables one-to-many communication, where a message is sent from one
sender to a specific group of receivers. The key characteristics include:
Group-Based Communication: Messages are delivered to a subset of nodes that have joined
the multicast group.
Efficient for Groups: Saves bandwidth by sending the message once to all nodes in the group
instead of individually.
Advantages:
Reduces network traffic by sending a single message to multiple recipients, making it ideal for
content distribution or group updates.
Scales efficiently for applications where data needs to reach specific groups, like video
conferencing or online gaming.
Disadvantages:
Complex to implement as nodes need mechanisms to manage group memberships and handle
node join/leave requests.
Not all network infrastructures support multicast natively, which can limit its applicability.
5. Broadcast Messaging
Broadcast messaging involves sending a message from one sender to all nodes
within the network. The key characteristics include:
Wide Coverage: The message is sent to every node, ensuring that all nodes in the
network receive it.
Network-Wide Reach: Suitable for announcements, alerts, or updates intended
for all nodes without targeting specific ones.
Advantages:
Guarantees that every node in the network receives the message, which is useful for
critical notifications or status updates.
Simplifies dissemination of information when all nodes need to be aware of an event or
data change.
Disadvantages:
Consumes significant network resources since every node, regardless of relevance,
receives the message.
Can lead to unnecessary processing at nodes that don’t need the message, potentially
causing inefficiency.
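The contrast between multicast and broadcast delivery can be sketched with an in-memory model. Real multicast relies on network support and group-management protocols; here a hypothetical `Network` class simply keeps one inbox list per node and a set of members per group.

```python
class Network:
    """Toy model: one inbox per node, plus named multicast groups."""

    def __init__(self, nodes):
        self.inbox = {node: [] for node in nodes}
        self.groups = {}

    def join(self, group, node):
        self.groups.setdefault(group, set()).add(node)

    def leave(self, group, node):
        self.groups.get(group, set()).discard(node)

    def multicast(self, group, message):
        # Delivered only to nodes that have joined the group.
        for node in self.groups.get(group, set()):
            self.inbox[node].append(message)

    def broadcast(self, message):
        # Delivered to every node, whether or not it needs the message.
        for node in self.inbox:
            self.inbox[node].append(message)
```

The `join` and `leave` methods show the membership bookkeeping that makes multicast more complex to implement, while `broadcast` needs no bookkeeping but reaches nodes that may simply discard the message.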