Chapter 3

Part 2: Parallel Models (II)

1. Classification of Computers According to Memory
2. PRAM
3. Interconnection Networks
Parallel Computer Memory Architectures

There are three classes: Shared Memory, Distributed Memory, and Hybrid Distributed-Shared Memory.


Shared Memory
Advantages:
(1) Global address space provides a user-friendly programming perspective to memory.
(2) Data sharing between tasks is fast.

Disadvantages:
(1) Scalability between memory and CPUs: adding more CPUs can geometrically increase traffic on the shared memory-CPU path and, for cache-coherent systems, geometrically increase the traffic associated with cache/memory management.
(2) The programmer is responsible for synchronization constructs that ensure "correct" access to global memory (see the sketch below).
(3) Expense: it becomes increasingly difficult and expensive to design and produce shared memory machines with ever-increasing numbers of processors.
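
A minimal sketch of such a synchronization construct, using POSIX threads (the choice of pthreads is an assumption; the slides do not name an API): a mutex serializes increments of a shared counter so that no update is lost.

#include <pthread.h>
#include <stdio.h>

long counter = 0;                          /* lives in the shared global address space */
pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

void *worker(void *arg) {
    (void)arg;
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&lock);         /* the synchronization construct */
        counter++;                         /* "correct" access to global memory */
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

int main(void) {
    pthread_t t[4];
    for (int i = 0; i < 4; i++) pthread_create(&t[i], NULL, worker, NULL);
    for (int i = 0; i < 4; i++) pthread_join(t[i], NULL);
    printf("counter = %ld\n", counter);    /* 400000; without the lock, usually less */
    return 0;
}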
Non-Shared (Distributed) Memory
Advantages:
(1) Memory is scalable with the number of processors: increase the number of processors and the size of memory increases proportionately.
(2) Each processor can rapidly access its own memory without interference and without the overhead incurred in trying to maintain cache coherency.
(3) Cost effectiveness: can use commodity, off-the-shelf processors and networking.

Disadvantages:
(1) The programmer is responsible for many of the details associated with data communication between processors (see the message-passing sketch below).
(2) It may be difficult to map existing data structures, based on global memory, onto this memory organization.
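
A minimal message-passing sketch in C with MPI (the use of MPI here is an assumption; the slides say only that the programmer handles communication explicitly): process 0 routes one datum to process 1 through the network.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, datum = 42;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0) {
        /* Pi sends a datum from its own memory ... */
        MPI_Send(&datum, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        /* ... and the network routes it into Pj's memory */
        MPI_Recv(&datum, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("P1 received %d from P0\n", datum);
    }
    MPI_Finalize();
    return 0;
}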
Parallel Random Access Machine (PRAM)

Processors: there are n identical processors (PEs), each of which is identical to the RAM processor. Assume that n is (large but) finite.
Memory: common/global memory with M locations.
Memory Access Unit (MAU): similar to the MAU of the RAM, but allows any PE to reach any memory location.
[Figure: processors p0, p1, ..., pn-1 connected to the shared memory through the memory access unit.]

Shared memory operations: each step of a PRAM algorithm consists of three phases.

Read phase: up to n PEs may simultaneously perform one read each from a shared memory cell x into local memory (i.e., a register z): receive(x,z).

Compute phase: every processor is entitled to perform a (small) fixed number of logical or arithmetic operations (e.g., +, *, /) on the contents of its local memory (registers).

Write phase: up to n PEs may simultaneously write a value from local memory (i.e., a register z) to a cell y of the global/common memory: send(z,y).
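
To make the three phases concrete, a sketch of a PRAM-style parallel sum, simulated here with OpenMP (the simulation is an assumption; the slides define the model abstractly): in each step, every active PE reads two cells, adds them, and writes the result back, halving the number of partial sums.

#include <stdio.h>

int main(void) {
    int a[8] = {1, 2, 3, 4, 5, 6, 7, 8};    /* the shared memory */
    int n = 8;
    for (int stride = 1; stride < n; stride *= 2) {
        /* one PRAM step: all active PEs act simultaneously */
        #pragma omp parallel for
        for (int i = 0; i < n; i += 2 * stride) {
            /* read phase: fetch a[i] and a[i+stride] into registers;
               compute phase: add them; write phase: store into a[i] */
            a[i] = a[i] + a[i + stride];
        }
    }
    printf("sum = %d\n", a[0]);              /* 36, after log2(8) = 3 steps */
    return 0;
}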
Memory Access
There are a number of different ways for the processors to gain access to memory:
(1) Exclusive Read (ER): a memory cell cannot be read simultaneously by several processors.
(2) Exclusive Write (EW): a memory cell cannot be written simultaneously by several processors.
(3) Concurrent Read (CR): a memory cell can be read simultaneously by several processors.
(4) Concurrent Write (CW): a memory cell can be written simultaneously by several processors.
Concurrent Write
a) Priority CW: only the PE with the highest priority succeeds.
b) Common CW: all PEs writing to the same location must write the same value (see the sketch below).
c) Arbitrary CW: one PE, chosen arbitrarily, succeeds.
d) Combining CW:
   i) arithmetic functions (SUM, PRODUCT)
   ii) logical functions (AND, XOR)
   iii) selection/semigroup functions (MAX, MIN)
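
A sequential C simulation (the simulation and the n x n virtual PEs are assumptions, not from the slides) of the classic constant-time CRCW maximum: PE(i,j) writes 1 into loses[i] whenever a[j] > a[i]. All concurrent writers to loses[i] write the same value, so this is legal under Common CW, and the element that never loses is the maximum.

#include <stdio.h>

int main(void) {
    int a[5] = {3, 7, 2, 9, 4};
    int n = 5, loses[5] = {0};
    /* one concurrent-write step: conceptually, all n*n PEs at once */
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            if (a[j] > a[i])
                loses[i] = 1;      /* every writer writes the same value 1 */
    for (int i = 0; i < n; i++)
        if (!loses[i])
            printf("max = %d\n", a[i]);   /* prints 9 */
    return 0;
}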
PRAM models
a) EREW
b) CREW
c) ERCW
d) CRCW
Interconnection Networks
# In the PRAM, all exchange of data among processors takes place through the shared memory.
# Another way for processors to communicate is via direct links connecting them.
# The M locations of memory are distributed among the N processors.
# When processor Pi wishes to send a datum to processor Pj, it uses the network to route the datum from its own memory to that of Pj.
# Two processors directly connected by a link are said to be neighbors.

[Figure: a sample network of six processors p1 through p6 connected by links.]

# The link between Pi and Pj represents two links, namely, one from Pi to Pj and one from Pj to Pi.
There are a number of questions that need to be answered when designing a model of computation of this kind:

(1) What shape should the network have?
How many neighbors should each processor have, how are these neighbors selected, and should all processors have the same number of neighbors?

(2) Can a processor communicate with all of its neighbors at once?
Can a processor send data to all of its neighbors and receive data from all of its neighbors in one time unit?
(3) What is the size of a message that a processor can transmit at a time?
If a datum is considered to have constant size, how many data can be sent in one transmission?

(4) How long does it take for a processor to initiate a transmission?
Is the time required by a processor to start up a communication significant?

(5) How long does it take for a datum to travel between two neighboring processors?
Is the time required by a datum to go from Pi to its neighbor Pj a function of the length of the link connecting Pi and Pj?
(6) How long does it take a processor to receive a datum?
Is the time required by a datum sent by Pi to gain access to processor Pj significant?

(7) Are the paths static or dynamic?
Does the algorithm allow for flexibility in choosing the paths?

(8) Do the processors operate synchronously or asynchronously?

(9) What kind of processor is used by an interconnection network?
Three measures characterize an interconnection network:

Degree of the network: the maximum degree of any PE in the network.

Communication diameter: the maximum, over all pairs of PEs, of the minimum distance between them (a sketch for computing it follows this list).

Bisection width: the minimum number of wires that have to be removed in order to disconnect the network into two "equal"-size subnetworks.
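
As a sketch of how the communication diameter can be computed for a concrete network (the BFS approach and the 6-node ring are illustrative assumptions): run a breadth-first search from every PE over the adjacency matrix and take the largest distance found.

#include <stdio.h>

#define N 6   /* a 6-processor ring */

/* eccentricity of src: the farthest BFS distance from it */
int bfs_ecc(int adj[N][N], int src) {
    int dist[N], queue[N], head = 0, tail = 0, ecc = 0;
    for (int i = 0; i < N; i++) dist[i] = -1;
    dist[src] = 0;
    queue[tail++] = src;
    while (head < tail) {
        int u = queue[head++];
        for (int v = 0; v < N; v++)
            if (adj[u][v] && dist[v] < 0) {
                dist[v] = dist[u] + 1;
                if (dist[v] > ecc) ecc = dist[v];
                queue[tail++] = v;
            }
    }
    return ecc;
}

int main(void) {
    int adj[N][N] = {0};
    for (int i = 0; i < N; i++)                  /* ring links: i <-> (i+1) mod N */
        adj[i][(i + 1) % N] = adj[(i + 1) % N][i] = 1;
    int diam = 0;
    for (int s = 0; s < N; s++) {
        int e = bfs_ecc(adj, s);
        if (e > diam) diam = e;
    }
    printf("communication diameter = %d\n", diam);   /* 3 = floor(6/2) */
    return 0;
}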
Scalability
A network model must be scalable, so that more processors can easily be added when new resources become available. The model should also be regular.
Linear array

[Figure: p0 - p1 - p2 - ... - pn-1.]

n processors are connected in the form of a one-dimensional array: pi is connected to pi-1 and pi+1 for all i = 1, ..., n-2; p0 is connected only to p1, and pn-1 only to pn-2.

Degree: 2
Diameter: O(n)
Bisection width: 1

HW: Study the scalability.
Ring

[Figure: p0 - p1 - p2 - ... - pn-1, with a link closing the cycle from pn-1 back to p0.]

n processors are connected in the form of a ring: pi is connected to p(i-1) mod n and p(i+1) mod n.

Degree: 2
Diameter: O(n)
Bisection width: 2

HW: Study the scalability.


Tree

[Figure: a complete binary tree with root p0, internal nodes p1 through p6, and leaves p7 through p14.]

Consists of n = 2^d - 1 processors arranged as a complete binary tree. Each processor at level i is connected by a two-way communication line to its parent at level i+1 and to its two children at level i-1.

Degree: 3
Diameter: O(log n) (about 2 log n: from one leaf up to the root and down to another leaf)
Bisection width: 1

HW: Study the scalability.


Mesh

A two-dimensional network is obtained by arranging the n processors into an m1 x m2 grid. The processor in row i and column j is denoted pi,j and is connected to pi-1,j, pi+1,j, pi,j-1, and pi,j+1, when they exist (see the indexing sketch below).

[Figure: a 4 x 3 mesh, with processors labeled both p0, ..., p11 and P0,0, ..., P3,2.]

Degree: 4
Diameter: O(m1 + m2)
Bisection width: O(m1)

[Figure: two 4 x 3 meshes, p0, ..., p11 and p'0, ..., p'11, shown side by side.]
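
A small sketch of the mesh indexing (the row-major mapping and helper names are illustrative assumptions): map pi,j in an m1 x m2 mesh to a linear index and list its existing neighbors.

#include <stdio.h>

/* row-major linear index of processor p(i,j) in an m1 x m2 mesh */
int idx(int i, int j, int m2) { return i * m2 + j; }

void print_neighbors(int i, int j, int m1, int m2) {
    int di[4] = {-1, 1, 0, 0};   /* up, down, left, right */
    int dj[4] = {0, 0, -1, 1};
    for (int k = 0; k < 4; k++) {
        int ni = i + di[k], nj = j + dj[k];
        if (ni >= 0 && ni < m1 && nj >= 0 && nj < m2)   /* skip missing border links */
            printf("p%d,%d -> p%d,%d (index %d)\n",
                   i, j, ni, nj, idx(ni, nj, m2));
    }
}

int main(void) {
    print_neighbors(1, 1, 4, 3);   /* interior node of a 4 x 3 mesh: 4 neighbors */
    return 0;
}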


Hypercube

[Figure: a 3-dimensional hypercube with processors p0, ..., p7 labeled 000 through 111.]

Consists of n = 2^d processors connected as a d-dimensional cube: processor pi is connected to pj if and only if i and j differ in exactly one bit of their binary representations (see the sketch below).

Degree: log n (= d)
Diameter: O(log n)
Bisection width: n/2

[Figure: a 4-dimensional hypercube with 16 processors labeled 0000 through 1111.]
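
A sketch of the hypercube neighbor rule (the helper function is hypothetical): flipping each of the d bits of a node label with XOR yields exactly the labels that differ from it in one bit.

#include <stdio.h>

/* print the d neighbors of node i in a d-dimensional hypercube */
void hypercube_neighbors(unsigned i, int d) {
    for (int b = 0; b < d; b++)
        printf("neighbor of %u across dimension %d: %u\n",
               i, b, i ^ (1u << b));   /* flip bit b: differs in exactly one bit */
}

int main(void) {
    hypercube_neighbors(5, 3);   /* node 101 -> 100, 111, 001 */
    return 0;
}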


Star

[Figure: a star network with p0 at the center connected to p1 through p8.]

A central processor p0 is connected to each of the other n-1 processors; all communication goes through the center.

Degree: n-1 (at the center p0)
Diameter: 2
Bisection width: floor(n/2) (each leaf separated from the center costs one wire)

HW: Study the scalability.
