Distributed Systems
Chapter 2:
Distributed Systems Architectures
Outline
Definition
Architectural Styles
System Architecture in distributed systems
Centralized, decentralized and hybrid architectures
System architecture (2-tier, 3-tier)
Peer-to-peer architecture, client-server
Overlay networks
2
Definitions
Software Architectures – describe the organization
and interaction of software components; focuses on
logical organization of software (component interaction,
etc.)
System Architectures - describe the placement of
software components on physical machines
The realization of an architecture may be
Centralized (most components located on a single
machine),
Decentralized (most machines have approximately
the same functionality),
Hybrid (some combination).
3
Architectural Styles
An architectural style describes a particular way to configure a collection of components and connectors.
Component - a module with well-defined interfaces; reusable, replaceable
Connector - communication link between modules
Architectures suitable for distributed systems:
Layered architectures*
Object-based architectures*
Data-centered architectures
Event-based architectures
4
Architectural Styles
Figure: the layered architectural style and the object-based architectural style.
Object-based is less structured:
component = object
connector = RPC or RMI
5
Data-Centered Architectures
Main purpose: data access and update
Processes interact by reading and modifying data in some shared repository (active or passive)
Traditional database (passive): responds to requests
Blackboard system (active): clients solve problems collaboratively; system updates clients when information changes.
6
Event-based architectural style
Event-based architectures support several communication styles:
• Publish-subscribe
• Broadcast
• Point-to-point
E.g., register interest in market info; get email updates
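The publish-subscribe style can be illustrated with a minimal in-process sketch. The `EventBus` class, the topic names, and the inbox lists are all hypothetical; a real event-based system would deliver events across the network rather than through direct function calls.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List


class EventBus:
    """A toy in-process event bus; components never call each other directly."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[str], None]) -> None:
        """Register interest in a topic."""
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: str) -> int:
        """Deliver the event to every subscriber of the topic; return the count."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(event)
        return len(handlers)


bus = EventBus()
inbox_alice: List[str] = []
inbox_bob: List[str] = []
bus.subscribe("market/ACME", inbox_alice.append)
bus.subscribe("market/ACME", inbox_bob.append)
bus.subscribe("market/OTHER", inbox_bob.append)

delivered = bus.publish("market/ACME", "ACME price changed")
```

The publisher only names a topic; it does not know who the subscribers are, which is what decouples the components in this style.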
7
Distribution Transparency
An important characteristic of software architectures in
distributed systems is that they are designed to
support distribution transparency.
Transparency involves trade-offs (balance achieved between two desirable but incompatible features)
Different distributed applications require different solutions/architectures
8
System Architectures for Distributed Systems
Centralized: traditional client-server structure
Vertical (or hierarchical) organization of communication and control paths (as in layered software architectures)
Logical separation of functions into client (requesting process) and server (responder)
Decentralized: peer-to-peer
Horizontal rather than hierarchical comm. and control
Communication paths are less structured; symmetric functionality
Hybrid: combine elements of Client Server and P2P
Edge-server systems
Collaborative distributed systems.
9
Traditional Client-Server (Centralized)
Processes are divided into two groups (clients and servers).
Synchronous communication: request-reply protocol
In LANs, often implemented with a connectionless protocol (unreliable)
In WANs, communication is typically connection-oriented TCP/IP (reliable)
10
Client Server Architectures
Figure 2-3. General interaction between a client and a server.
11
Transmission Failures
With connectionless transmissions, failure of any sort means no reply
Possibilities:
Request message was lost
Reply message was lost
Server failed either before, during or after performing the service
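The difficulty with these cases is that the client observes the same thing in each of them. The sketch below is a simulation under simplifying assumptions, not a network implementation: `Fault`, `CounterServer`, and `send_request` are hypothetical names, a timeout is modeled as `None`, and the "during" case is left out because its outcome depends on how far the service got.

```python
from enum import Enum
from typing import Dict, Optional, Tuple


class Fault(Enum):
    NONE = "none"
    REQUEST_LOST = "request lost"
    REPLY_LOST = "reply lost"
    CRASH_BEFORE = "server crashed before performing the service"
    CRASH_AFTER = "server crashed after performing the service"


class CounterServer:
    def __init__(self) -> None:
        self.executed = 0  # how many times the service was actually performed

    def handle(self, request: str) -> str:
        self.executed += 1
        return f"done: {request}"


def send_request(server: CounterServer, request: str, fault: Fault) -> Optional[str]:
    """Return the reply, or None to model a client-side timeout."""
    if fault in (Fault.REQUEST_LOST, Fault.CRASH_BEFORE):
        return None                  # the service was never performed
    reply = server.handle(request)   # the service was performed
    if fault in (Fault.REPLY_LOST, Fault.CRASH_AFTER):
        return None                  # performed, but the client never hears back
    return reply


outcomes: Dict[Fault, Tuple[Optional[str], int]] = {}
for fault in Fault:
    server = CounterServer()
    reply = send_request(server, "debit account", fault)
    outcomes[fault] = (reply, server.executed)
```

In every faulty case the client sees `None`, yet `executed` is 0 in some cases and 1 in others, so blindly resending the request is only safe when the operation can be repeated without harm.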
12
Layered (software) Architecture for Client-Server Systems
User-interface level: GUIs (usually) for interacting with end users
Processing level: data processing applications - the core functionality
Data level: interacts with database or file system
File or database system
Example: Web search engine
Interface: type in a keyword string
Processing level: processes to generate DB queries
Data level: database of web pages
Example: Desktop “office suites”
Interface: access to various documents, data
Processing: word processing, database queries, spreadsheets, …
Data: file systems and/or databases
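The search-engine example can be sketched as three functions, one per level, where each level only calls the one below it. This is a toy illustration: the `PAGES` contents, the function names, and the "match every term" rule are assumptions, not how a real search engine ranks results.

```python
from typing import Dict, List

# Data level: a stand-in for the database of web pages (contents are made up).
PAGES: Dict[str, str] = {
    "page1": "distributed systems architectures",
    "page2": "peer to peer overlay networks",
    "page3": "client server systems",
}


def data_level_query(term: str) -> List[str]:
    """Data level: return the ids of pages containing the term."""
    return sorted(pid for pid, text in PAGES.items() if term in text.split())


def processing_level_search(keyword_string: str) -> List[str]:
    """Processing level: turn the keyword string into per-term queries and
    keep only pages that match every term."""
    terms = keyword_string.lower().split()
    if not terms:
        return []
    results = set(data_level_query(terms[0]))
    for term in terms[1:]:
        results &= set(data_level_query(term))
    return sorted(results)


def user_interface_level(keyword_string: str) -> str:
    """User-interface level: accept the keyword string and format the results."""
    hits = processing_level_search(keyword_string)
    return ", ".join(hits) if hits else "no results"
```

Because the levels communicate only through these narrow calls, each one could later be placed on a different machine without changing the others.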
13
Application Layering
Figure 2-4. Organization of an Internet search engine into three different layers.
14
System Architecture
Mapping the software architecture to system
hardware
Correspondence between logical software modules
and actual computers
Multi-tiered architectures
Layer and tier are roughly equivalent terms, but layer typically implies
software and tier is more likely to refer to hardware.
Two-tier and three-tier are the most common
15
Two-tiered Client Server Architectures
Server provides processing and data management; client provides simple graphical display (thin-client)
Perceived performance loss at client
Easier to manage, more reliable, client machines don’t need to be so large and powerful
At the other extreme, all application processing and some data resides at the client (fat-client)
Pro: reduces work load at server; more scalable
Con: harder to manage by system admin, less secure
16
Multi-tiered Architectures
Figure 5. Alternative client-server organizations (a)–(e), labeled from thin client to fat client.
17
Three-tiered Architectures
In some applications servers may also need to be clients, leading to a three-level architecture
Distributed transaction processing
Web servers that interact with database servers
Distribute functionality across three levels of machines instead of two.
18
Multi-tiered Architectures (3-Tier Architecture)
In some applications servers may also need to be clients,
leading to a three level architecture
Distributed transaction processing
Web servers that interact with database servers
Distribute functionality across three levels of machines instead of
two.
Figure 6. An example of a server acting as client.
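A server acting as a client can be sketched with two small classes. The class names, the stored row, and the HTML wrapping are hypothetical, and the direct method call stands in for what would be a network request between the second and third tiers.

```python
from typing import Dict, List


class DatabaseServer:
    """Third tier: only stores and returns data."""

    def __init__(self, rows: Dict[str, str]) -> None:
        self._rows = rows

    def query(self, key: str) -> str:
        return self._rows.get(key, "")


class ApplicationServer:
    """Middle tier: a server to the user's client, a client of the database."""

    def __init__(self, database: DatabaseServer) -> None:
        self._database = database
        self.log: List[str] = []

    def handle_request(self, key: str) -> str:
        self.log.append(f"request from client: {key}")
        value = self._database.query(key)      # here the server acts as a client
        self.log.append(f"reply from database: {value!r}")
        return f"<p>{value}</p>" if value else "<p>not found</p>"


db = DatabaseServer({"title": "Distributed Systems Architectures"})
app = ApplicationServer(db)
page = app.handle_request("title")
```

The user's client waits on the application server, which in turn waits on the database server, so the middle tier plays both roles within a single request.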
19
Modern Architecture
Vertical distribution: traditional client-server architectures
Each level serves a different purpose in the system.
Logically different components reside on different nodes
Horizontal distribution (P2P): each node has roughly the same processing capabilities and stores or manages part of the total system data.
Better load balancing, more resistant to denial-of-service attacks,
harder to manage than C/S
Communication & control is not hierarchical; all about equal
20
Peer-to-Peer
Nodes act as both client and server
Each node acts as a server for part of the total system data
Overlay networks connect nodes in the P2P system
Nodes in the overlay use their own addressing system for storing
and retrieving data in the system
Nodes can route requests to locations that may not be known by the
requester.
21
Overlay Networks
Are logical or virtual networks, built on top of a physical network
A link between two nodes in the overlay may consist of several physical links.
Messages in the overlay are sent to logical addresses, not physical (IP) addresses
Overlay Network Example
Circles represent nodes in the network. Blue nodes are also part of the overlay network. Dotted lines represent virtual links. Actual routing is based on TCP/IP protocols
22
Overlay Networks
The overlay network may be
Structured (nodes and content are connected according to some design that simplifies later lookups) or
Unstructured (content is assigned to nodes without regard to the network topology.)
23
Structured P2P Architectures
A common approach is to use a distributed hash table (DHT)
to organize the nodes
Traditional hash functions convert a key to a hash value,
which can be used as an index into a hash table.
Keys are unique – each represents an object to store in the table
The hash function value is used to insert an object in the hash table
and to retrieve it.
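As a reminder of the traditional (non-distributed) case, here is a minimal hash table sketch. The `HashTable` class, the slot count, and the toy hash function (sum of character codes) are assumptions chosen for readability; real hash functions distribute keys far more evenly.

```python
from typing import Any, List, Optional, Tuple


class HashTable:
    """A small fixed-size hash table with chaining (one list per slot)."""

    def __init__(self, size: int = 8) -> None:
        self._slots: List[List[Tuple[str, Any]]] = [[] for _ in range(size)]

    def _index(self, key: str) -> int:
        # Toy hash function: sum of character codes, reduced to a slot index.
        return sum(ord(ch) for ch in key) % len(self._slots)

    def insert(self, key: str, obj: Any) -> None:
        bucket = self._slots[self._index(key)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, obj)   # keys are unique: overwrite
                return
        bucket.append((key, obj))

    def retrieve(self, key: str) -> Optional[Any]:
        for existing_key, obj in self._slots[self._index(key)]:
            if existing_key == key:
                return obj
        return None
```

The same hash value is used for both insertion and retrieval; a DHT keeps this idea but spreads the slots over many nodes.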
24
Structured P2P Architectures
In a DHT, data objects and nodes are each assigned a key which hashes to a random number from a very large identifier space (to ensure uniqueness)
A lookup, also based on hash function value, returns the network address of the node that stores the requested object.
Characteristics of DHT
Scalable - to thousands, even millions of network nodes
Fault tolerant - able to re-organize itself when nodes fail
Decentralized - no central coordinator
25
Chord Routing Algorithm Structured P2P
Nodes are logically arranged in a circle
Nodes and data items have m-bit identifiers (keys) from a 2^m namespace.
e.g., a node’s key is a hash of its IP address and a file’s key might be the hash of its name or of its content or other unique key.
26
Structured Peer-to-Peer Architectures
Inserting Items in the DHT
A data item with key value k is mapped to the node with the smallest identifier id such that id ≥ k (mod 2^m)
This node is the successor of k, or succ(k)
Modular arithmetic is used
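The succ(k) mapping can be sketched as follows. This computes the successor from a complete list of node identifiers, which is a simplification: it shows where an item belongs, not how Chord routes a lookup between nodes. The `successor` function and the node ids in `NODES` are hypothetical.

```python
from bisect import bisect_left
from typing import List


def successor(node_ids: List[int], k: int, m: int) -> int:
    """Return succ(k): the smallest node id with id >= k, wrapping around
    the ring of size 2**m when no such id exists."""
    ring = sorted(node_ids)
    k %= 2 ** m
    i = bisect_left(ring, k)          # first position with ring[i] >= k
    return ring[i] if i < len(ring) else ring[0]


# Hypothetical node ids on a ring with m = 4 (identifiers 0..15).
NODES = [1, 4, 7, 12, 15]
```

With these ids, key 5 is stored at node 7, and a key larger than every node id wraps around to the smallest node id, which is where the modular arithmetic comes in.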
Figure 2. The mapping of data items onto nodes in Chord for m = 4, giving 2^4 = 16 identifiers (hash values for keys and nodes).
27
Unstructured P2P
Unstructured P2P organizes the overlay network as a random graph.
Each node knows about a subset of nodes, its “neighbors”.
Data items are randomly mapped to some node in the system & lookup
is random, unlike the structured lookup in Chord.
Locating a Data Object by Flooding
Send a request to all known neighbors
Works well in small to medium sized networks, doesn’t scale well
“Time-to-live” counter can be used to control number of hops
Example system: Freenet
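Flooding with a time-to-live counter can be sketched as a breadth-first search that stops after a fixed number of hops. The graph, the item name, and the `flood_search` function are hypothetical, and the shared `visited` set is a simplification: real nodes have no global view and would instead remember message identifiers they have already forwarded.

```python
from typing import Dict, List, Optional, Set


def flood_search(
    neighbors: Dict[str, List[str]],
    stored_at: Dict[str, Set[str]],
    start: str,
    item: str,
    ttl: int,
) -> Optional[str]:
    """Breadth-first flooding: forward the request to all known neighbors,
    at most `ttl` hops from `start`. Return a node holding `item`, or None."""
    visited = {start}
    frontier = [start]
    hops = 0
    while frontier:
        for node in frontier:
            if item in stored_at.get(node, set()):
                return node
        if hops == ttl:
            return None                # time-to-live exhausted
        next_frontier = []
        for node in frontier:
            for peer in neighbors.get(node, []):
                if peer not in visited:   # avoid re-flooding around loops
                    visited.add(peer)
                    next_frontier.append(peer)
        frontier = next_frontier
        hops += 1
    return None


# Hypothetical overlay: A-B, A-C, B-D, C-D, D-E; the item lives on node E.
GRAPH = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}
STORED = {"E": {"song"}}
```

Node E is three hops from A, so a search from A with a TTL of 2 fails even though the item exists, which illustrates why flooding offers no lookup guarantee.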
28
Comparison
Structured networks typically guarantee that if an object
is in the network it will be located in a bounded amount of
time – usually O(log(N))
Unstructured networks offer no guarantees.
For example, some will only forward search requests a specific
number of hops.
Random graph approach means there may be loops
Graph may become disconnected
29
Hybrid Architectures
Combine client-server and P2P architectures
Edge-server systems; e.g. ISPs, which act as servers to their
clients, but cooperate with other edge servers to host shared content
Collaborative distributed systems; e.g., BitTorrent, which
supports parallel downloading and uploading of chunks of a file.
First, interact with Client Server system, then operate in
decentralized manner.
30
Edge-Server Systems
Figure 13. Viewing the Internet as consisting of a collection of edge servers.
31
Collaborative Distributed Systems: BitTorrent
Clients contact a global directory (Web server) to locate a .torrent file with the information needed to locate a tracker; a server that can supply a list of active nodes that have chunks of the desired file.
Using information from the tracker, clients can download the file in chunks from multiple sites in the network. Clients must also provide file chunks to other users.
Trackers know which nodes are active (downloading chunks of a file)
Figure 14. The principal working of BitTorrent
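The idea of downloading chunks from multiple sites can be sketched as a simple assignment of chunks to peers. This is an illustration only: the `plan_downloads` function, the peer names, and the "least-loaded peer" rule are assumptions, not BitTorrent's actual piece-selection policy.

```python
from typing import Dict, List


def plan_downloads(tracker_list: Dict[str, List[int]], num_chunks: int) -> Dict[int, str]:
    """Assign each chunk to a peer that has it, preferring the peer with the
    fewest chunks assigned so far so that downloads spread across peers."""
    load = {peer: 0 for peer in tracker_list}
    plan: Dict[int, str] = {}
    for chunk in range(num_chunks):
        holders = [peer for peer, chunks in tracker_list.items() if chunk in chunks]
        if not holders:
            continue                      # no active peer has this chunk yet
        chosen = min(holders, key=lambda peer: (load[peer], peer))
        plan[chunk] = chosen
        load[chosen] += 1
    return plan


# Hypothetical tracker response: active peers and the chunk numbers they hold.
TRACKER_LIST = {
    "peer-a": [0, 1, 2],
    "peer-b": [1, 2, 3],
    "peer-c": [3],
}
plan = plan_downloads(TRACKER_LIST, num_chunks=5)
```

Chunk 4 is absent from the plan because no listed peer holds it; the client would have to wait until some active peer offers that chunk.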
32
Freenet
“Freenet is free software which lets you publish and obtain
information on the Internet without fear of restriction.
To achieve this freedom, the network is entirely decentralized and
publishers and consumers of information are unidentified.
Without anonymity there can never be true freedom of
speech, and without decentralization the network will be vulnerable
to attack.”
33
P2P v Client/Server
P2P computing allows end users to communicate without a dedicated
server.
There is less likelihood of performance bottlenecks since communication is more distributed.
Data distribution leads to workload distribution.
Resource discovery is more difficult than in centralized client-server
computing & look-up/retrieval is slower
P2P can be more fault tolerant and more resistant to denial-of-service attacks because network content is distributed.
Individual hosts may be unreliable, but overall, the system should maintain a
consistent level of service
34
Architecture versus Middleware
Where does middleware fit into an architecture?
Middleware: the software layer between user applications and
distributed platforms.
Purpose: to provide distribution transparency
Applications can access programs running on remote nodes without
understanding the remote environment.
Middleware may also have an architecture
e.g., CORBA has an object-oriented style.
Use of a specific architectural style can make it easier to develop
applications, but it may also lead to a less flexible system.
Possible solution: develop middleware that can be customized as needed for different applications.
35