Lecture - 28

This document discusses communication costs in parallel machines and the message passing and routing techniques used in parallel computers. It covers the following key points:
• Communication cost has startup time, per-hop time, and per-word transfer time components.
• Store-and-forward routing receives the full message at each hop before forwarding. Cut-through routing pipelines message units called flits through the network.
• Important metrics for evaluating mappings between networks include congestion, dilation, and expansion. Optimal mappings aim to minimize these values.
• Common structures like linear arrays and meshes can be optimally mapped to hypercubes using Gray codes and concatenation. Mapping graphs onto different topologies involves tradeoffs in congestion and dilation.


Communication Costs in Parallel Machines
• Along with idling and contention, communication is a
major overhead in parallel programs.
• The cost of communication is dependent on a variety of
features including the programming model semantics,
the network topology, data handling and routing, and
associated software protocols.
Message Passing Costs in
Parallel Computers
• The total time to transfer a message over a network comprises the following:
– Startup time (ts): time spent at the sending and receiving nodes (executing the routing algorithm, programming routers, etc.).
– Per-hop time (th): a function of the number of hops; includes factors such as switch latencies and network delays.
– Per-word transfer time (tw): covers all overheads determined by the length of the message, including link bandwidth, error checking and correction, etc.
Store-and-Forward Routing

• A message traversing multiple hops is completely received at an intermediate hop before being forwarded to the next hop.
• The total communication cost for a message of size m words to traverse l communication links is

tcomm = ts + (m·tw + th)·l

• In most platforms, th is small and the above expression can be approximated by

tcomm = ts + m·l·tw
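
To make the formulas concrete, here is a minimal Python sketch of the store-and-forward cost and its approximation. The parameter values are purely illustrative, not from the lecture.

```python
def store_and_forward_time(ts, th, tw, m, l):
    """Full store-and-forward cost: t_s + (m*t_w + t_h) * l."""
    return ts + (m * tw + th) * l

def store_and_forward_approx(ts, tw, m, l):
    """Approximation for negligible t_h: t_s + m * l * t_w."""
    return ts + m * l * tw

# Illustrative (made-up) values, in nanoseconds: 2 us startup,
# 10 ns per hop, 40 ns per word; a 1000-word message over 8 links.
ts, th, tw, m, l = 2000, 10, 40, 1000, 8
print(store_and_forward_time(ts, th, tw, m, l))  # 322080 ns
print(store_and_forward_approx(ts, tw, m, l))    # 322000 ns
```

Note how the m·l product dominates: every one of the l hops pays the full per-word cost of the message.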
Routing Techniques

Passing a message from node P0 to P3 (a) through a store-and-forward communication network; (b) and (c) extending the concept to cut-through routing. The shaded regions represent the time that the message is in transit. The startup time associated with this message transfer is assumed to be zero.
Packet Routing

• Store-and-forward makes poor use of communication resources.
• Packet routing breaks messages into packets and pipelines them through the network.
• Since packets may take different paths, each packet must carry routing information, error checking, sequencing, and other related header information.
• The total communication time for packet routing is approximated by

tcomm = ts + l·th + tw·m

• The factor tw here accounts for overheads in packet headers.
Cut-Through Routing

• Takes the concept of packet routing to an extreme by further dividing messages into basic units called flits (flow-control digits).
• Since flits are typically small, the header information
must be minimized.
• This is done by forcing all flits to take the same path, in
sequence.
• A tracer message first programs all intermediate routers.
All flits then take the same route.
• Error checks are performed on the entire message, as
opposed to flits.
• No sequence numbers are needed.
Cut-Through Routing

• The total communication time for cut-through routing is approximated by

tcomm = ts + l·th + tw·m

• This is identical to packet routing; however, tw is typically much smaller.
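
As a rough comparison under the same illustrative (made-up) parameter values as the earlier sketch, cut-through routing removes the m·l product term:

```python
def cut_through_time(ts, th, tw, m, l):
    """Cut-through (and packet-routing) cost: t_s + l*t_h + t_w*m."""
    return ts + l * th + tw * m

# Same illustrative values as before.
ts, th, tw, m, l = 2000, 10, 40, 1000, 8
print(cut_through_time(ts, th, tw, m, l))  # 2000 + 80 + 40000 = 42080 ns
# Store-and-forward on the same message took 322080 ns: pipelining
# flits means the per-word cost is paid once, not once per link.
```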
Simplified Cost Model for Communicating
Messages
• The cost of communicating a message between two nodes l hops away using cut-through routing is given by

tcomm = ts + l·th + tw·m

• In this expression, th is typically much smaller than ts and tw. For this reason, the second term on the RHS does not dominate the total cost, particularly when m is large.
• Furthermore, it is often not possible to control routing and placement of tasks.
• For these reasons, we can approximate the cost of message transfer by

tcomm = ts + tw·m
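
A small sketch, again with made-up numbers, shows why dropping the l·th term is usually safe:

```python
ts, th, tw, m, l = 2000, 10, 40, 1000, 8   # illustrative values only

full = ts + l * th + tw * m    # full cut-through model: 42080 ns
simple = ts + tw * m           # simplified model:       42000 ns
print((full - simple) / full)  # ~0.002, i.e. about 0.2% error
```

For large messages the startup and per-word terms dwarf the per-hop term, which is exactly the approximation made above.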
Simplified Cost Model for Communicating
Messages
• It is important to note that the original expression for communication time is valid only for uncongested networks.
• If a link takes multiple messages, the corresponding tw
term must be scaled up by the number of messages.
• Different communication patterns congest different
networks to varying extents.
• It is important to understand and account for this in the
communication time accordingly.
Cost Models for
Shared Address Space Machines
• While the basic messaging cost applies to these
machines as well, a number of other factors make
accurate cost modeling more difficult.
• Memory layout is typically determined by the system.
• Finite cache sizes can result in cache thrashing.
• Overheads associated with invalidate and update
operations are difficult to quantify.
• Spatial locality is difficult to model.
• Prefetching can play a role in reducing the overhead
associated with data access.
• False sharing and contention are difficult to model.
Routing Mechanisms
for Interconnection Networks
• How does one compute the route that a message takes
from source to destination?
– Routing must prevent deadlocks; for this reason, we use dimension-ordered or e-cube routing (sketched in code after the figure below).
– Routing must avoid hot-spots - for this reason, two-step routing
is often used. In this case, a message from source s to
destination d is first sent to a randomly chosen intermediate
processor i and then forwarded to destination d.
Routing Mechanisms
for Interconnection Networks

Routing a message from node Ps (010) to node Pd (111) in a three-dimensional hypercube using E-cube routing.
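
A minimal Python sketch of E-cube routing, reproducing the figure's example. The function name and the convention of correcting the lowest-order differing bit first are the usual ones for dimension-ordered routing:

```python
def ecube_route(src, dst):
    """Dimension-ordered (E-cube) route in a hypercube: repeatedly
    flip the lowest-order bit in which the current node still
    differs from the destination."""
    path, node = [src], src
    while node != dst:
        diff = node ^ dst
        node ^= diff & -diff   # flip the lowest-order differing bit
        path.append(node)
    return path

# Route from Ps = 010 to Pd = 111 in a three-dimensional hypercube:
print([format(n, "03b") for n in ecube_route(0b010, 0b111)])
# ['010', '011', '111']
```

Because every message corrects dimensions in the same fixed order, the cyclic channel dependencies that cause deadlock cannot arise.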
Mapping Techniques for Graphs

• Often, we need to embed a known communication


pattern into a given interconnection topology.
• We may have an algorithm designed for one network,
which we are porting to another topology.

For these reasons, it is useful to understand mapping


between graphs.
Mapping Techniques for Graphs: Metrics

• When mapping a graph G(V,E) into G’(V’,E’), the following metrics are important:
• The maximum number of edges mapped onto any edge
in E’ is called the congestion of the mapping.
• The maximum number of links in E’ that any edge in E is
mapped onto is called the dilation of the mapping.
• The ratio of the number of nodes in the set V’ to that in
set V is called the expansion of the mapping.
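
These three metrics can be computed mechanically. Below is a sketch in which `mapping_metrics` is a hypothetical helper; the example embeds a 4-node ring into a 4-node linear array, where the wraparound edge forces congestion 2 and dilation 3:

```python
from collections import Counter

def mapping_metrics(edges, node_map, route, n_v, n_vprime):
    """Congestion, dilation, and expansion of a mapping of G(V,E)
    into G'(V',E'). `route(a, b)` must return the path in G'
    (as a list of G' nodes) that an edge of G is mapped onto."""
    load = Counter()                      # traffic per edge of E'
    dilation = 0
    for u, v in edges:
        path = route(node_map[u], node_map[v])
        dilation = max(dilation, len(path) - 1)
        for a, b in zip(path, path[1:]):
            load[frozenset((a, b))] += 1  # undirected edge of G'
    return max(load.values()), dilation, n_vprime / n_v

# 4-node ring into a 4-node linear array (identity node mapping):
ring = [(0, 1), (1, 2), (2, 3), (3, 0)]
line_path = lambda a, b: list(range(a, b + 1)) if a <= b \
                         else list(range(a, b - 1, -1))
print(mapping_metrics(ring, {i: i for i in range(4)}, line_path, 4, 4))
# (2, 3, 1.0): congestion 2, dilation 3, expansion 1
```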
Embedding a Linear Array
into a Hypercube
• A linear array (or a ring) composed of 2^d nodes (labeled 0 through 2^d − 1) can be embedded into a d-dimensional hypercube by mapping node i of the linear array onto node G(i, d) of the hypercube. The function G(i, x) is defined as follows:

G(0, 1) = 0
G(1, 1) = 1
G(i, x + 1) = G(i, x), if 0 ≤ i < 2^x
G(i, x + 1) = 2^x + G(2^(x+1) − 1 − i, x), if 2^x ≤ i < 2^(x+1)
Embedding a Linear Array
into a Hypercube
The function G is called the binary reflected Gray code (RGC).

Since adjoining entries (G(i, d) and G(i + 1, d)) differ from each other at only one bit position, corresponding processors are mapped to neighbors in a hypercube. Therefore, the congestion, dilation, and expansion of the mapping are all 1.
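
A direct transcription of the recursive definition into Python; the assertion checks the neighbor property claimed above (the closed form `i ^ (i >> 1)` yields the same codes):

```python
def gray(i, d):
    """Binary reflected Gray code G(i, d) for 0 <= i < 2**d,
    following the recursive definition given earlier."""
    if d == 1:
        return i                          # G(0,1) = 0, G(1,1) = 1
    half = 1 << (d - 1)                   # 2**(d-1)
    if i < half:
        return gray(i, d - 1)
    return half + gray(2 * half - 1 - i, d - 1)

codes = [gray(i, 3) for i in range(8)]
print([format(c, "03b") for c in codes])
# ['000', '001', '011', '010', '110', '111', '101', '100']

# Adjacent entries (wrapping around the ring) differ in one bit,
# so congestion, dilation, and expansion are all 1:
assert all(bin(codes[i] ^ codes[(i + 1) % 8]).count("1") == 1
           for i in range(8))
```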
Embedding a Linear Array
into a Hypercube: Example

(a) A three-bit reflected Gray code ring; and (b) its embedding into
a three-dimensional hypercube.
Embedding a Mesh
into a Hypercube
• A 2^r × 2^s wraparound mesh can be mapped to a 2^(r+s)-node hypercube by mapping node (i, j) of the mesh onto node G(i, r − 1) || G(j, s − 1) of the hypercube (where || denotes concatenation of the two Gray codes).
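
A sketch of the concatenated mapping, using the closed-form Gray code `i ^ (i >> 1)` so that the row code occupies the high r bits and the column code the low s bits; the loop checks that all mesh neighbors (wraparound included) land on hypercube neighbors:

```python
def mesh_to_hypercube(i, j, r, s):
    """Map node (i, j) of a 2**r x 2**s wraparound mesh onto a
    2**(r+s)-node hypercube by concatenating the Gray codes of
    the row and column indices."""
    g = lambda n: n ^ (n >> 1)            # closed-form reflected Gray code
    return (g(i) << s) | g(j)             # G(i, .) || G(j, .)

r = s = 2                                 # a 4 x 4 mesh, 16-node hypercube
for i in range(1 << r):
    for j in range(1 << s):
        node  = mesh_to_hypercube(i, j, r, s)
        right = mesh_to_hypercube(i, (j + 1) % (1 << s), r, s)
        down  = mesh_to_hypercube((i + 1) % (1 << r), j, r, s)
        assert bin(node ^ right).count("1") == 1   # hypercube neighbors
        assert bin(node ^ down).count("1") == 1
print("all mesh links map to single hypercube links")
```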
Embedding a Mesh into a Hypercube

(a) A 4 × 4 mesh illustrating the mapping of mesh nodes to the nodes in a four-dimensional hypercube; and (b) a 2 × 4 mesh embedded into a three-dimensional hypercube.

Once again, the congestion, dilation, and expansion of the mapping are all 1.
Embedding a Mesh into a Linear Array

• Since a mesh has more edges than a linear array, we will not have an optimal congestion/dilation mapping.
• We first examine the mapping of a linear array into a mesh and then invert this mapping.
• This gives us an optimal mapping (in terms of congestion).
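
The standard construction lays the linear array into the mesh row by row in "snake" (boustrophedon) order, so consecutive array nodes are always mesh neighbors; a small sketch, with a hypothetical helper name:

```python
def snake_position(i, j, cols):
    """Index in the linear array of mesh node (i, j) when the array
    snakes through the mesh: left-to-right on even rows,
    right-to-left on odd rows."""
    return i * cols + (j if i % 2 == 0 else cols - 1 - j)

# Visiting a 4 x 4 mesh in array order walks along mesh links only,
# so the array-into-mesh direction has congestion and dilation 1.
order = sorted((snake_position(i, j, 4), (i, j))
               for i in range(4) for j in range(4))
print([node for _, node in order])
```

Inverting this mapping (mesh into array) is what stretches some mesh links across many array links, giving the congestion discussed above.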
Embedding a Mesh into a Linear Array:
Example

(a) Embedding a 16-node linear array into a 2-D mesh; and (b) the inverse of the mapping. Solid lines correspond to links in the linear array and normal lines to links in the mesh.
Embedding a Hypercube into a 2-D Mesh

• Each √p-node subcube of the hypercube is mapped to a √p-node row of the mesh (for a p-node hypercube and a √p × √p mesh).
• This is done by inverting the linear-array to hypercube mapping.
• This can be shown to be an optimal mapping.
Embedding a Hypercube into a 2-D Mesh:
Example

Embedding a hypercube into a 2-D mesh.
