
Distributed Systems

Chapter 1: Introduction to Distributed Systems
Presentation Outline

 Introduction and Definition of Distributed Systems
 Characteristics of Distributed Systems
 Organization and Goals of DSs
 The Client-Server Model
 Types of Distributed Systems
 Advantages and Challenges of DSs
 Hardware and Software Concepts

2
1.1. Introduction
From a Single Computer to DS
 Before the mid-80s, computers were
 very expensive (hundreds of thousands or even millions of dollars)
 very slow (a few thousand instructions per second)
 not connected among themselves
 After the mid-80s: two major developments
 cheap and powerful microprocessor-based computers appeared
 computer networks
 LANs at speeds ranging from 10 to 1000 Mbps
 WANs at speeds ranging from 64 Kbps to gigabits/sec
 Consequence
 feasibility of using a large network of computers to work on the same application; this is in contrast to the old centralized systems where there was a single computer with its peripherals
 Distributed Systems 3
…Introduction
 Networks of computers are everywhere!
 Mobile phone networks
 Corporate networks
 Factory networks
 Campus networks
 Home networks
 In-car networks
 On-board networks in planes and trains
 This subject aims:
 to cover characteristics of networked computers that impact system designers and implementers, and
 to present the main concepts and techniques that have been developed to help in the tasks of designing and implementing systems and applications that are based on such networks.
4
What Is a Distributed System?

Definition:
 Operational perspective:
 "A system in which hardware or software components located at networked computers communicate and coordinate their actions only by message passing." [Coulouris]
 User perspective:
 A distributed system is "a collection of independent computers that appears to its users as a single coherent system." [Tanenbaum & Van Steen]
5
 This definition has two aspects:
1. Hardware: autonomous machines
2. Software: a single system view for the users
(Middleware)
 Examples:
 Cluster:
 "A type of parallel or distributed processing system, which consists of a collection of interconnected stand-alone computers cooperatively working together as a single, integrated computing resource" [Buyya].
 Cloud:
 "A type of parallel and distributed system consisting of a collection of interconnected and virtualised computers that are dynamically provisioned and presented as one or more unified computing resources based on service-level agreements established through negotiation between the service provider and consumers" [Buyya].

6
 Why Distributed?
 Resource and Data Sharing
 printers, databases, multimedia servers, ...
 Availability, Reliability
 the loss of some instances can be hidden
 Scalability, Extensibility
 the system grows with demand (e.g., extra servers)
 Performance
 huge power (CPU, memory, ...) available
 Inherent distribution, communication
 organizational distribution, e-mail, video

7
1.2. Characteristics of Distributed Systems

 Differences between the computers and the ways they communicate are hidden from users
 Users and applications can interact with a distributed system in a consistent and uniform way, regardless of location
 Distributed systems should be easy to expand and scale
 A distributed system is normally continuously available, even if there may be partial failures
 - Users and applications should not notice that parts are being replaced or fixed, or that new parts are added to serve more users or applications

8
1.3. Organization and Goals of a Distributed System
 To support heterogeneous computers and networks and to provide a single-system view, a distributed system is often organized by means of a layer of software called middleware that extends over multiple machines
 Same interface everywhere

a distributed system organized as middleware; note that the middleware layer extends over multiple machines 9
Goals of a distributed system:
a distributed system should
 make resources accessible (printers, computers, storage facilities, data, files, Web pages, ...)
 reasons: economics, to collaborate and exchange information
 be transparent: hide the fact that the resources and processes are distributed across multiple computers
 be open
 be scalable

Transparency in a Distributed System
 a distributed system that is able to present itself to users and applications as if it were only a single computer system is said to be transparent
10
 different forms of transparency in a distributed system
 Access: hide differences in data representation and how a resource is accessed
 Location: hide where a resource is physically located; where is http://www.prenhall.com/index.html? (naming)
 Migration: hide that a resource may move to another location
 Relocation: hide that a resource may be moved to another location while in use; e.g., mobile users using their wireless laptops
 Replication: hide that a resource is replicated
 Concurrency: hide that a resource may be shared by several competitive users; a resource must be left in a consistent state
 Failure: hide the failure and recovery of a resource
 Persistence: hide whether a (software) resource is in memory or on disk
11
 Openness in a Distributed System
 an Open Distributed System is a system that offers services
according to standard rules that describe the syntax and
semantics of those services; e.g., protocols in networks
 a distributed system should be open
 we need well-defined interfaces
 interoperability
 components of different origin can communicate
 portability
 components work on different platforms
 another goal of an open distributed system is that it should
be flexible and extensible; easy to configure the system out
of different components; easy to add new components,
replace existing ones

12
 in distributed systems, such services are often specified through interfaces, described in an Interface Definition Language (IDL)
 an IDL specifies only syntax: the names of the functions, types of parameters, return values, possible exceptions, ...

Scalability in Distributed Systems


 Scalability in three dimensions
 a distributed system should be scalable:
 in size: adding more users and resources to the system
 geographically: users and resources may be far apart
 administratively: it should be easy to manage even if it spans many administrative organizations

13
Scalability Problems
 Problems with size scalability: performance problems caused by
limited capacity of servers and networks
 Often caused by centralized solutions

 Centralized services: a single server for all users; mostly for security reasons
 Centralized data: a single on-line telephone book
 Centralized algorithms: doing routing based on complete information
examples of scalability limitations
 Problems with geographical scalability:
 traditional synchronous communication designed for LANs performs poorly in WANs
 unreliable communication in WANs
 Problems with administrative scalability:
 conflicting policies, complex management, security problems
14
Scaling Techniques

 how to solve scaling problems
 the problem is mainly performance, and arises as a result of limitations in the capacity of servers and networks (for geographical scalability)
 three possible solutions: hiding communication latencies, distribution, and replication

15
a. Hiding Communication Latencies - Scaling Techniques
 try to avoid waiting for responses to remote service requests
 let the requester do other useful work in the meantime
 i.e., construct requesting applications that use only asynchronous communication instead of synchronous communication; when a reply arrives, the application is interrupted (see the sketch below)
 good for batch processing and parallel applications, but not for interactive applications
 for interactive applications, move part of the job to the client to reduce communication; e.g., filling in a form and checking the entries

16
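The asynchronous style described above can be sketched in a few lines of Java. This is only an illustration, not material from the slides: remoteLookup() is a hypothetical stand-in for any slow remote request, and CompletableFuture plays the role of the asynchronous messaging layer whose reply "interrupts" the application via a callback.

import java.util.concurrent.CompletableFuture;

public class AsyncRequestDemo {

    // Hypothetical stand-in for a slow remote service call (any RPC/HTTP request fits here).
    static String remoteLookup(String key) {
        try {
            Thread.sleep(2000); // simulate network and server latency
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "value-of-" + key;
    }

    public static void main(String[] args) {
        // Issue the request asynchronously instead of blocking the caller.
        CompletableFuture<String> reply =
                CompletableFuture.supplyAsync(() -> remoteLookup("customer1"));

        // Register a callback: this is the "interrupt" when the reply finally arrives.
        reply.thenAccept(r -> System.out.println("Reply received: " + r));

        // Meanwhile, the requester keeps doing other useful work.
        System.out.println("Doing local work while the request is in transit...");

        reply.join(); // only so this demo does not exit before the callback runs
    }
}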
(a) a server checking the correctness of field entries
(b) a client doing the job
 e.g., shipping code is now supported in Web applications
using Java Applets

17
b. Distribution - Scaling Techniques
Splitting a resource (such as data) into smaller parts, and spreading
the parts across the system.
 e.g., DNS - Domain Name System
 divide the name space into zones
 for details, see later in Chapter 4 - Naming

an example of dividing the DNS name space into zones


18
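A minimal sketch of this partitioning idea in Java, assuming a hypothetical table of zone servers (the server names below are made up): each lookup is routed to the server responsible for the zone that owns the name.

import java.util.Map;
import java.util.TreeMap;

public class ZonePartitioning {

    // Hypothetical mapping from zones to the name servers responsible for them.
    private static final Map<String, String> zoneServers = new TreeMap<>();
    static {
        zoneServers.put("com", "ns1.com-zone.example");
        zoneServers.put("edu", "ns1.edu-zone.example");
        zoneServers.put("nl",  "ns1.nl-zone.example");
    }

    // Route a lookup to the server owning the zone of the given name.
    static String serverFor(String hostName) {
        String zone = hostName.substring(hostName.lastIndexOf('.') + 1);
        return zoneServers.getOrDefault(zone, "root-server.example");
    }

    public static void main(String[] args) {
        System.out.println(serverFor("www.prenhall.com")); // handled by the "com" zone server
        System.out.println(serverFor("cs.vu.nl"));         // handled by the "nl" zone server
    }
}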
c. Replication - Scaling Techniques
 replicate components across a distributed system to
increase availability and for load balancing, leading to
better performance
 decided by the owner of a resource
 caching (a special form of replication) also reduces
communication latency; decided by the user
 but, caching and replication may lead to consistency
problems (see Chapter 6 - Consistency and Replication)

19
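The caching special case can be sketched as a small client-side cache; fetchFromServer() is a hypothetical placeholder for the real remote access, and the possibility of a stale entry is exactly the consistency problem noted above.

import java.util.HashMap;
import java.util.Map;

public class ClientCache {

    private final Map<String, String> cache = new HashMap<>();

    // Hypothetical remote fetch; in a real system this would be an RPC or HTTP request.
    private String fetchFromServer(String key) {
        return "server-copy-of-" + key;
    }

    // Return the locally cached copy if present, otherwise fetch it and remember it.
    // If the server-side copy changes later, this entry becomes stale (consistency problem).
    public String get(String key) {
        return cache.computeIfAbsent(key, this::fetchFromServer);
    }

    public static void main(String[] args) {
        ClientCache c = new ClientCache();
        System.out.println(c.get("index.html")); // first access goes to the server
        System.out.println(c.get("index.html")); // second access is served locally
    }
}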
1.4. The Client-Server Model
 how are processes organized in a system
 thinking in terms of clients requesting services from
servers

general interaction between a client and a server

20
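The request/reply interaction above can be sketched with plain TCP sockets; this is a hypothetical minimal example (the port number and the one-line-per-request protocol are assumptions for illustration only).

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;

public class EchoServer {
    public static void main(String[] args) throws IOException {
        try (ServerSocket listener = new ServerSocket(9000)) {
            while (true) {
                // Wait for a request, serve it, send the reply, then wait for the next client.
                try (Socket client = listener.accept();
                     BufferedReader in = new BufferedReader(
                             new InputStreamReader(client.getInputStream()));
                     PrintWriter out = new PrintWriter(client.getOutputStream(), true)) {
                    String request = in.readLine();
                    out.println("reply-to: " + request);
                }
            }
        }
    }
}

// A matching client: send one request and block until the reply arrives.
class EchoClient {
    public static void main(String[] args) throws IOException {
        try (Socket s = new Socket("localhost", 9000);
             PrintWriter out = new PrintWriter(s.getOutputStream(), true);
             BufferedReader in = new BufferedReader(new InputStreamReader(s.getInputStream()))) {
            out.println("hello server");
            System.out.println(in.readLine());
        }
    }
}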
1.4.1. Application Layering
 no clear distinction between a client and a server; for
instance a server for a distributed database may act as a
client when it forwards requests to different file servers
 three levels exist
 the user-interface level: implemented by clients and
contains all that is required by a client; usually
through GUIs, but not necessarily
 the processing level: contains the applications
 the data level: contains the programs that maintain
the actual data dealt with

21
 the general organization of an Internet search engine into
three different layers

1.4.2. Client-Server Architectures
 how to physically distribute a client-server application across several machines
 Multitiered Architectures 22
Two-tiered architecture: alternative client-server organizations
a) put only the terminal-dependent part of the user interface on the client machine and let the application remotely control the presentation
b) put the entire user-interface software on the client side
c) move part of the application to the client, e.g., checking correctness when filling in forms
d) and e) are for powerful client machines 23
three tiered architecture: an example of a server acting as a client

24
 Modern Architectures
 Vertical distribution: when the different tiers correspond
directly with the logical organization of applications
 Horizontal distribution: physically split up the client or the
server into logically equivalent parts. e.g. Web server

an example of horizontal distribution of a Web service 25
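Horizontal distribution can be sketched with a tiny front end that forwards each incoming request to the next of several logically equivalent replicas in turn; the replica addresses below are hypothetical.

import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class RoundRobinFrontEnd {

    // Hypothetical addresses of logically equivalent Web server replicas.
    private final List<String> replicas = List.of("web1:80", "web2:80", "web3:80");
    private final AtomicInteger next = new AtomicInteger(0);

    // Forward each request to the next replica in turn (round-robin).
    public String pickReplica() {
        int i = Math.floorMod(next.getAndIncrement(), replicas.size());
        return replicas.get(i);
    }

    public static void main(String[] args) {
        RoundRobinFrontEnd frontEnd = new RoundRobinFrontEnd();
        for (int request = 0; request < 5; request++) {
            System.out.println("request " + request + " -> " + frontEnd.pickReplica());
        }
    }
}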


1.5 TYPES OF DISTRIBUTED SYSTEMS
1. Distributed computing systems
 Used for high-performance computing tasks
 Cluster and cloud computing systems
 Grid computing systems
2. Distributed information systems
 Systems mainly for the management and integration of business functions
 Transaction processing systems
 Enterprise application integration
– Goal: distribute information across several servers
3. Distributed pervasive (ubiquitous) systems
– Focus on mobile, embedded, communicating systems
– Goal: equip a real-life environment with a large variety of smart devices.
26
1. Distributed Computing Systems
a) Cluster Computing Systems
Essentially a group of systems connected through a LAN.
 Homogeneous
o Same OS, near-identical hardware
 A collection of computing nodes + master node
 Master runs middleware: parallel execution and management
 Centralized job management & scheduling system

27
b) Grid Computing Systems
Lots of nodes (including clusters across multiple subnets) from
everywhere.
 Federation of autonomous and heterogeneous computer
systems (HW,OS,...), several admin domains
 Heterogeneous
 Dispersed across several organizations
 To allow for collaborations, grids generally use virtual
organizations.
 Distributed job management & scheduling

Fig: A layered architecture for grid computing systems 28


c) Cloud Computing Systems
Over 20 definitions:
 http://cloudcomputing.sys-con.com/read/612375_p.htm
 Renting “remote storage”  backup
 Renting a “remote server”  hosting a Web server
 Renting “more remote servers”  to manage large workloads
 Scientific definition of Cloud Computing 
 “Cloud is a market-oriented distributed computing system
consisting of a collection of inter-connected and virtualized
computers that are dynamically provisioned and presented as
one or more unified computing resources based on service-
level agreements (SLAs) established through negotiation
between the service provider and consumers.”
 SLA = {negotiated and agreed QoS parameters + rewards
+ penalties for violation of agreement....}
( taken from- www.cloudbus.org + www.buyya.com)
29
Subscription-Oriented Cloud Services: X{compute, apps, data, ...} as a Service (XaaS)

Fig: clients access applications, development and runtime platforms, compute, and storage through a cloud manager spanning public, private, hybrid, government, and other clouds

30
Cloud Services

 Infrastructure as a Service (IaaS)
 CPU, storage, compute, ...: e.g., Amazon.com, Google
 Platform as a Service (PaaS)
 e.g., Google App Engine, Microsoft Azure, ...
 Software as a Service (SaaS)
 e.g., Gmail.com, Facebook.com, Youtube.com, SalesForce.com, ...
31
…..Cloud Services

Fig: Cloud Service architecture

32
Cloud Deployment Models

 Public/Internet Clouds: 3rd-party, multi-tenant cloud infrastructure & services; available on a subscription basis
 Private/Enterprise Clouds: a cloud model run within a company’s own data center / infrastructure, for internal and/or partner use
 Hybrid/Inter Clouds: mixed usage of private and public clouds; leasing public cloud services when private cloud capacity is insufficient

33
Cloud Applications
• Scientific/Tech Applications
• Business Applications
• Consumer/Social Applications

Fig: examples of scientific and technical, business, and consumer/social cloud applications 34
2. Distributed Information Systems
 The vast majority of distributed systems in use today are traditional information systems.
Example: Transaction processing systems

A transaction over files, expressed with transaction primitives:

BEGIN TRANSACTION(server, transaction);
  READ(transaction, file-1, data);
  WRITE(transaction, file-2, data);
  newData := MODIFIED(data);
  IF WRONG(newData) THEN
    ABORT TRANSACTION(transaction);
  ELSE
    WRITE(transaction, file-2, newData);
    END TRANSACTION(transaction);
  END IF;

The same idea in SQL, debiting an account only if the balance is sufficient:

BEGIN TRANSACTION;
UPDATE accounts
  SET balance = balance - 100
  WHERE account_id = 'customer1'
  AND balance >= 100;
IF @@ROWCOUNT = 0
  ROLLBACK TRANSACTION;
ELSE
  COMMIT TRANSACTION;

35
Note:
 All READ and WRITE operations are executed tentatively; their effects are made permanent only at the execution of END TRANSACTION.
 Transactions form an atomic operation.

36
Transaction Processing Systems
A transaction is a collection of operations on the state of an object
(database, object composition, etc.) that satisfies the following properties
(ACID):
 Atomicity: All operations either succeed, or all of them fail.
- When the transaction fails, the state of the object will remain
unaffected by the transaction.
 Consistency: A transaction establishes a valid state transition.
- This does not exclude the possibility of invalid,
intermediate states during the transaction’s execution.
 Isolation: Concurrent transactions do not interfere with each other.
- It appears to each transaction T that other transactions occur either
before T, or after T, but never both.
 Durability: After the execution of a transaction, its effects are
made permanent:
- Changes to the state survive failures.
37
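In a relational setting, atomicity and durability map directly onto commit and rollback. The sketch below uses plain JDBC; the in-memory H2 URL and the accounts table layout are assumptions for illustration, not part of the slides.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class TransferExample {

    // Move 100 from one account to another as a single atomic transaction.
    static void transfer(String from, String to) throws SQLException {
        // Assumed JDBC URL (in-memory H2 database) and accounts(account_id, balance) table.
        try (Connection con = DriverManager.getConnection("jdbc:h2:mem:bank")) {
            con.setAutoCommit(false); // start the transaction explicitly
            try (PreparedStatement debit = con.prepareStatement(
                         "UPDATE accounts SET balance = balance - 100 WHERE account_id = ?");
                 PreparedStatement credit = con.prepareStatement(
                         "UPDATE accounts SET balance = balance + 100 WHERE account_id = ?")) {
                debit.setString(1, from);
                debit.executeUpdate();
                credit.setString(1, to);
                credit.executeUpdate();
                con.commit();   // durability: both updates become permanent together
            } catch (SQLException e) {
                con.rollback(); // atomicity: neither update takes effect
                throw e;
            }
        }
    }
}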
Transaction Processing Monitor
 In many cases, the data involved in a transaction is distributed across several servers. A TP Monitor is responsible for coordinating the execution of a transaction.
 A Transaction Processing Monitor (TPM) is a system that oversees transactions from start to finish, ensuring they complete successfully.
 If an error occurs, the TPM takes appropriate actions to handle it.
 TPMs are crucial for maintaining data integrity, and for providing fault tolerance, load balancing, and scalability in transaction-processing systems.
38
Transaction Processing Monitor
 A Transaction Processing Monitor (TPM) selects a server based on several factors to ensure efficient and reliable transaction processing. Here are some key considerations:
 Load Balancing: TPMs distribute the transaction load evenly across multiple servers to prevent any single server from becoming a bottleneck.
 Server Availability: The TPM checks the availability of servers to ensure that the selected server is up and running.
 Transaction Type: Different types of transactions may be routed to different servers based on their specific requirements and the server's capabilities.
 Server Performance: The TPM may consider the performance metrics of servers, such as CPU usage, memory usage, and network latency, to select the most suitable server.
 Failover Mechanisms: In case of server failure, the TPM has failover mechanisms to automatically switch to a backup server to ensure continuous transaction processing.

39
Transaction Processing Monitor
import java.util.HashMap;
import java.util.Map;

// Minimal Server helper assumed by this example (not shown on the original slide):
// it records a name, whether the server is up, and its current load.
class Server {
    private final String name;
    private final boolean available;
    private final int load;

    Server(String name, boolean available, int load) {
        this.name = name;
        this.available = available;
        this.load = load;
    }

    String getName() { return name; }
    boolean isAvailable() { return available; }
    int getLoad() { return load; }
}

public class TransactionProcessingMonitor {

    private final Map<String, Server> servers = new HashMap<>();

    public TransactionProcessingMonitor() {
        // Initialize servers with their availability and current load
        servers.put("Server1", new Server("Server1", true, 10));
        servers.put("Server2", new Server("Server2", true, 5));
        servers.put("Server3", new Server("Server3", false, 0)); // Server3 is down
    }

40
Transaction Processing Monitor
    public Server selectServer() {
        Server selectedServer = null;
        int minLoad = Integer.MAX_VALUE;

        // Pick the available server with the lowest current load
        // (load balancing plus an availability check)
        for (Server server : servers.values()) {
            if (server.isAvailable() && server.getLoad() < minLoad) {
                minLoad = server.getLoad();
                selectedServer = server;
            }
        }
        return selectedServer;
    }

    public static void main(String[] args) {
        TransactionProcessingMonitor tpm = new TransactionProcessingMonitor();
        Server chosen = tpm.selectServer();
        System.out.println("Transaction routed to: "
                + (chosen != null ? chosen.getName() : "no server available"));
    }
}
41
3. Distributed Pervasive Systems
Pervasive systems: exploiting the increasing integration of services and (small/tiny) computing devices into our everyday physical world.
A next generation of distributed systems is emerging in which the nodes are small, wireless, battery-powered, mobile (e.g., PDAs, smart phones, wireless surveillance cameras, portable ECG monitors, etc.), and often embedded as part of a larger system.
Three subtypes:
 Ubiquitous computing systems: pervasive and continuously present,
i.e., there is a continuous interaction between system and user.
 Mobile computing systems: pervasive, but emphasis is on the fact that
devices are inherently mobile.
 Sensor (and actuator) networks: pervasive, with emphasis on the actual
(collaborative) sensing and actuation of the environment.
42
Ubiquitous Computing Systems

Basic characteristics
 Distribution: Devices are networked, distributed, and
accessible in a transparent manner
 Interaction: Interaction between users and devices is highly
unobtrusive
 Context awareness: The system is aware of a user’s context
in order to optimize interaction
 Autonomy: Devices operate autonomously without human
intervention, and are thus highly self-managed
 Intelligence: The system as a whole can handle a wide
range of dynamic actions and interactions

43
Mobile Computing Systems

 Mobile computing systems are generally a subclass of ubiquitous


computing systems and meet all of the five requirements.
Typical characteristics
 Many different types of mobile devices: smart phones, remote controls,
car equipment, and so on
 Wireless communication
 Devices may continuously change their location =>
o setting up a route may be problematic, as routes can change
frequently
o devices may easily be temporarily disconnected=>
disruption-tolerant networks

44
Sensor Networks

 Consists of spatially distributed autonomous sensors to


cooperatively monitor physical or environmental conditions, such
as temperature, sound, vibration, pressure, motion or pollutants,
etc.
Characteristics
 The nodes to which sensors are attached are:
• Many (10s-1000s)
• Simple (small memory/compute/communication capacity)
• Often battery-powered (or even battery-less)

45
EXAMPLE

46
Distributed Pervasive Systems: Examples

 Electronic Health Systems


 Devices are physically close to a person
 Where and how should monitored data be stored?
 How can we prevent loss of crucial data?
 What infrastructure is needed to generate and
propagate alerts?
 How can security be enforced?
 How can physicians provide online feedback?

47
EXAMPLE

48
Pros and Cons of Distributed Systems

Pros of Distributed Systems
 Performance: Very often a collection of processors can provide higher performance (and a better price/performance ratio) than a centralized computer.
 Distribution: many applications involve, by their nature, spatially separated machines (banking, commercial, automotive systems).
 Reliability (fault tolerance): if some of the machines crash, the system
can survive.
 Incremental growth: as requirements on processing power grow, new
machines can be added incrementally.
 Sharing of data/resources: shared data is essential to many
applications (banking, computer supported cooperative work,
reservation systems); other resources can be also shared (e.g. expensive
printers).
 Communication: facilitates human-to-human communication.
49
Cons of Distributed Systems

 Difficulties of developing distributed software: what should operating systems, programming languages, and applications look like?
 Networking problems: the network infrastructure creates several problems that have to be dealt with: loss of messages, overloading, ...
 Security problems: sharing generates the problem of data security.

50
1.6 Hardware and Software Concepts
o Hardware Concepts
 different classification schemes exist
 Multiprocessors - with shared memory
 Multicomputers - that do not share memory
 can be homogeneous or heterogeneous

51
different basic organizations of processors and memories in distributed systems

52
 Multiprocessors - Shared Memory
 the shared memory has to be coherent: a value written by one processor must be readable by the other processors
 performance problem for the bus-based organization, since the bus becomes overloaded as the number of processors increases
 the solution is to add a high-speed cache memory between the processors and the bus to hold the most recently accessed words; this may result in incoherent memory

a bus-based multiprocessor
 bus-based multiprocessors are difficult to scale even with caches
 two possible solutions: crossbar switch and omega network 53
 Crossbar switch
 divide memory into modules and connect them to the
processors with a crossbar switch
 at every intersection, a crosspoint switch is opened and
closed to establish connection
 problem: expensive; with n CPUs and n memories, n² switches are required

54
 Omega network
 use switches with multiple input and output lines
 drawback: high latency because of several switching
stages between the CPU and memory

55
 Multicomputer Systems
o Homogeneous Multicomputer Systems
 also referred to as System Area Networks (SANs)
 the nodes are mounted on a big rack and connected
through a high-performance network
 could be bus-based or switch-based
 bus-based
 a shared multiaccess network such as Fast Ethernet can be used and messages are broadcast
 performance drops sharply with more than 25-100 nodes

56
 Switch-based
 messages are routed through an interconnection network
 two popular topologies: meshes (or grids) and
hypercubes

Hypercube
Grid

57
 Heterogeneous Multicomputer Systems
 most distributed systems are built on heterogeneous
multicomputer systems
 the computers could be different in processor type,
memory size, architecture, power, operating system, etc.
and the interconnection network may be highly
heterogeneous as well
 the distributed system provides a software layer to hide the
heterogeneity at the hardware level; i.e., provides
transparency

58
o Software Concepts
 OSs in relation to distributed systems
 tightly-coupled systems, referred to as distributed OSs
(DOS)
 the OS tries to maintain a single, global view of the
resources it manages
 used for multiprocessors and homogeneous
multicomputers
 loosely-coupled systems, referred to as network OSs
(NOS)
 a collection of computers each running its own OS; they
work together to make their services and resources
available to others
 used for heterogeneous multicomputers
 Middleware: to enhance the services of NOSs so that a
better support for distribution transparency is provided

59
 Summary of main issues

 DOS: tightly-coupled operating system for multiprocessors and homogeneous multicomputers; main goal: hide and manage hardware resources
 NOS: loosely-coupled operating system for heterogeneous multicomputers (LAN and WAN); main goal: offer local services to remote clients
 Middleware: additional layer atop of a NOS implementing general-purpose services; main goal: provide distribution transparency
An overview of DOSs, NOSs, and middleware

60
 Distributed Operating Systems
 two types
 multiprocessor operating system: to manage the
resources of a multiprocessor
 multicomputer operating system: for homogeneous
multicomputers
 Uniprocessor Operating Systems
 separating applications from operating system code
through a microkernel

61
Multiprocessor Operating Systems
 extended uniprocessor operating systems to support
multiple processors having access to a shared memory
 a protection mechanism is required for concurrent access
to guarantee consistency
 two synchronization mechanisms: semaphores and
monitors
 semaphore: an integer with two atomic operations: down (if s = 0 then sleep until s > 0; s := s - 1) and up (s := s + 1; wake up a sleeping process, if any); see the sketch below
 monitor: a programming language construct consisting of procedures and variables that can be accessed only by the procedures of the monitor; only a single process at a time is allowed to execute a procedure
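Both mechanisms exist directly in Java; a minimal sketch, assuming a shared counter as the protected resource: java.util.concurrent.Semaphore provides down/up as acquire/release, and a synchronized method behaves like a monitor procedure.

import java.util.concurrent.Semaphore;

public class SharedCounter {

    // Semaphore protecting the shared variable: acquire() is "down", release() is "up".
    private final Semaphore mutex = new Semaphore(1);
    private int value = 0;

    public void incrementWithSemaphore() throws InterruptedException {
        mutex.acquire();       // down: sleep while the semaphore is taken
        try {
            value++;           // critical section on the shared data
        } finally {
            mutex.release();   // up: wake up a waiting process, if any
        }
    }

    // Monitor style: only one thread at a time may execute a synchronized method.
    public synchronized void incrementWithMonitor() {
        value++;
    }
}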

62
 Multicomputer Operating Systems
 processors can not share memory; instead communication
is through message passing
 each node has its own
 kernel for managing local resources
 separate module for handling interprocessor
communication

general structure of a multicomputer operating system 63


 Distributed Shared Memory Systems
 how to emulate shared memories on distributed systems to
provide a virtual shared memory
 page-based distributed shared memory (DSM) - use the
virtual memory capabilities of each individual node

pages of address space distributed among four machines


64
situation after CPU 1 references page 10

 read-only pages can be easily replicated

situation if page 10 is read only and replication is used


65
 Network Operating Systems
 possibly heterogeneous underlying hardware
 constructed from a collection of uniprocessor systems, each
with its own operating system and connected to each other
in a computer network

general structure of a network operating system


66
 Services offered by network operating systems
 remote login (rlogin)
 remote file copy (rcp)
 shared file systems through file servers

two clients and a server in a network operating system

67
 Middleware
 a distributed operating system is not intended to handle a
collection of independent computers but provides
transparency and ease of use
 a network operating system does not provide a view of a
single coherent system but is scalable and open
 combine the scalability and openness of network operating
systems and the transparency and ease of use of distributed
operating systems
 this is achieved through middleware, another layer of software

68
general structure of a distributed system as middleware

69
 Different middleware models exist
 treat every resource as a file; just as in UNIX
 through Remote Procedure Calls (RPCs) - calling a
procedure on a remote machine
 distributed object invocation
 (details later in Chapter 2 - Communication)
 Middleware services
 access transparency: by hiding the low-level message
passing
 naming: such as a URL in the WWW
 distributed transactions: by allowing multiple read and
write operations to occur atomically
 security

70
 Middleware and Openness
 in an open middleware-based distributed system, the
protocols used by each middleware layer should be the
same, as well as the interfaces they offer to applications

71
 A comparison between multiprocessor operating systems, multicomputer operating systems, network operating systems, and middleware-based distributed systems

Item                      DOS (multiproc.)   DOS (multicomp.)      Network OS   Middleware-based OS
Degree of transparency    Very high          High                  Low          High
Same OS on all nodes      Yes                Yes                   No           No
Number of copies of OS    1                  N                     N            N
Basis for communication   Shared memory      Messages              Files        Model specific
Resource management       Global, central    Global, distributed   Per node     Per node
Scalability               No                 Moderately            Yes          Varies
Openness                  Closed             Closed                Open         Open
72
73
