Parallel Computing and Programming.
Lecture 1: Von Neumann vs Dataflow Models
Dr. Rony Kassam
IEF Tishreen Uni
S1 - 2021
Index
- Von Neumann vs Dataflow Models.
- ISA vs Microarchitecture.
- Single-cycle vs Multi-cycle Microarchitectures.
- Instruction Level Parallelism: Pipelining Intro.
- Instruction Level Parallelism: Issues in Pipeline Design.
- Thread Level Parallelism: Data Dependence Solutions.
- Thread Level Parallelism: Shared Memory and OpenMP.
Recall: The Von Neumann Model
[Figure: block diagram of the von Neumann model. MEMORY (Mem Addr Reg, Mem Data Reg); PROCESSING UNIT (ALU, TEMP); CONTROL UNIT (PC or IP, Inst Register); INPUT (keyboard, mouse, disk, …); OUTPUT (monitor, printer, disk, …)]
Recall: The Instruction Cycle
- FETCH
- DECODE
- EVALUATE ADDRESS
- FETCH OPERANDS
- EXECUTE
- STORE RESULT
Recall: The Instruction Set Architecture
- The ISA is the interface between what the software commands and what the hardware carries out
- The ISA specifies
  - The memory organization
    - Address space (LC-3: 2^16, MIPS: 2^32)
    - Addressability (LC-3: 16 bits, MIPS: 8 bits)
    - Word- or byte-addressable
  - The register set
    - R0 to R7 in LC-3
    - 32 registers in MIPS
  - The instruction set
    - Opcodes
    - Data types
    - Addressing modes
    - Semantics of instructions
[Figure: levels of transformation, from top to bottom: Problem, Algorithm, Program, ISA, Microarchitecture, Circuits, Electrons]
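As a rough illustration of how address space and addressability combine, the sketch below computes total addressable memory as (number of locations) × (bits per location). It assumes MIPS memory is byte-addressable (8 bits per location); the function name `memory_capacity_bytes` is made up for this example.

```python
def memory_capacity_bytes(address_bits, addressability_bits):
    """Total addressable memory in bytes: 2^address_bits locations of addressability_bits each."""
    locations = 2 ** address_bits
    return locations * addressability_bits // 8

# LC-3: 2^16 locations, 16 bits each
lc3_bytes = memory_capacity_bytes(16, 16)
# MIPS: 2^32 locations, 8 bits each (assumed byte-addressable)
mips_bytes = memory_capacity_bytes(32, 8)

print(lc3_bytes // 1024, "KiB")        # 128 KiB
print(mips_bytes // 1024 ** 3, "GiB")  # 4 GiB
```

The same number of address bits can therefore describe very different memory sizes, which is why the ISA has to fix both quantities.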
Microarchitecture
- An implementation of the ISA
- How do we implement the ISA?
- There can be many implementations of the same ISA
  - MIPS R2000, R10000, …
  - Intel 80486, Pentium, Pentium Pro, Pentium 4, Kaby Lake, Coffee Lake, … AMD K5, K7, K9, Bulldozer, BobCat, …
The Von Neumann Model/Architecture
- The von Neumann model is also called the stored-program computer (instructions in memory). It has two key properties:
- Stored program
  - Instructions are stored in a linear memory array
  - Memory is unified between instructions and data
    - The interpretation of a stored value depends on the control signals
    - When is a value interpreted as an instruction?
- Sequential instruction processing
  - One instruction is processed (fetched, executed, completed) at a time
  - The program counter (instruction pointer) identifies the current instruction
  - The program counter is advanced sequentially except for control transfer instructions
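The two properties above can be sketched as a toy stored-program machine. Everything about the encoding here is hypothetical and chosen only for illustration (it is neither LC-3 nor MIPS): a word is `opcode * 100 + address`, there is a single accumulator, and instructions and data share one Python list as memory. A word is treated as an instruction only when the program counter fetches it; the same word read by LOAD or ADD is treated as data.

```python
# Hypothetical opcodes for this sketch only.
HALT, LOAD, ADD, STORE, JUMP = 0, 1, 2, 3, 4

def run(memory):
    """Fetch, decode, and execute one instruction at a time until HALT."""
    pc = 0    # program counter: identifies the current instruction
    acc = 0   # single accumulator register
    while True:
        word = memory[pc]                 # FETCH: this word is now an instruction
        pc += 1                           # advance sequentially by default
        opcode, addr = divmod(word, 100)  # DECODE
        if opcode == HALT:
            return memory
        elif opcode == LOAD:
            acc = memory[addr]            # here a memory word is read as data
        elif opcode == ADD:
            acc += memory[addr]
        elif opcode == STORE:
            memory[addr] = acc
        elif opcode == JUMP:
            pc = addr                     # control transfer overrides the sequential PC
        else:
            raise ValueError(f"unknown opcode {opcode}")

# Addresses 0-3 hold the program, 5-7 hold data, all in the same memory.
memory = [105, 206, 307, 0, 0, 40, 2, 0]
run(memory)
print(memory[7])   # 42
```

Nothing in the memory itself marks address 5 as data and address 0 as code; only the way the machine reaches them differs.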
The Von Neumann Model (of a Computer)
[Figure: block diagram of the von Neumann model. MEMORY (Mem Addr Reg, Mem Data Reg); PROCESSING UNIT (ALU, TEMP); CONTROL UNIT (IP, Inst Register); INPUT; OUTPUT]
The Dataflow Model (of a Computer)
- Von Neumann model: An instruction is fetched and executed in control flow order
  - As specified by the instruction pointer
  - Sequential unless explicit control flow instruction
- Dataflow model: An instruction is fetched and executed in data flow order
  - i.e., when its operands are ready
  - i.e., there is no instruction pointer
  - Instruction ordering specified by data flow dependence
    - Each instruction specifies “who” should receive the result
    - An instruction can “fire” whenever all operands are received
  - Potentially many instructions can execute at the same time
    - Inherently more parallel
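A minimal sketch of the firing rule above, written as a sequential simulation of token passing. The `Node` class, the `run` function, and the example graph for (a + b) * (a - b) are all assumptions made for illustration and do not describe any real dataflow machine. Each node lists who receives its result, and it fires as soon as a token has arrived on every input slot. There is no program counter anywhere in the code.

```python
import operator
from collections import deque

class Node:
    def __init__(self, name, op, arity, dests):
        self.name = name
        self.op = op
        self.arity = arity
        self.dests = dests    # list of (destination name, input slot)
        self.inputs = {}      # input slot -> token value received so far

    def receive(self, slot, value):
        """Store a token; return True once all operands have arrived."""
        self.inputs[slot] = value
        return len(self.inputs) == self.arity

    def fire(self):
        """Consume the operand tokens and produce the result."""
        args = [self.inputs[i] for i in range(self.arity)]
        self.inputs = {}
        return self.op(*args)

def run(nodes, initial_tokens):
    """Deliver tokens until none remain; return graph outputs and firing order."""
    outputs, fired = {}, []
    queue = deque(initial_tokens)          # entries are (destination, slot, value)
    while queue:
        dest, slot, value = queue.popleft()
        if dest not in nodes:              # token leaves the graph as an output
            outputs[dest] = value
            continue
        node = nodes[dest]
        if node.receive(slot, value):      # all operands present: the node fires
            result = node.fire()
            fired.append(node.name)
            for next_dest, next_slot in node.dests:
                queue.append((next_dest, next_slot, result))
    return outputs, fired

nodes = {
    "add": Node("add", operator.add, 2, [("mul", 0)]),
    "sub": Node("sub", operator.sub, 2, [("mul", 1)]),
    "mul": Node("mul", operator.mul, 2, [("out", 0)]),
}
tokens = [("add", 0, 5), ("add", 1, 3), ("sub", 0, 5), ("sub", 1, 3)]  # a = 5, b = 3
outputs, fired = run(nodes, tokens)
print(outputs["out"])   # 16
```

In this simulation the nodes fire one after another, but `add` and `sub` never wait on each other, so hardware following the same rule could fire them at the same time.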
Von Neumann vs Dataflow
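- Consider a Von Neumann program
  - What is the significance of the program order?
  - What is the significance of the storage locations?

Sequential:
    v <= a + b;
    w <= b * 2;
    x <= v - w;
    y <= v + w;
    z <= x * y;

Dataflow:
[Figure: dataflow graph of the same computation. Inputs a and b feed a + node; b also feeds a *2 node; the results of both feed a - node and a + node; the results of those feed a * node that produces z]

- Which model is more natural to you as a programmer?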
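One way to compare the two views of this program is to group its five statements into "waves" of instructions whose operands become ready together. The sketch below does this with `graphlib.TopologicalSorter` from the Python standard library (Python 3.9+). The `deps` table and the `firing_waves` helper are illustrative names, and the inputs a and b are assumed to be available from the start.

```python
from graphlib import TopologicalSorter

# Each instruction maps to the instructions whose results it consumes.
deps = {
    "v": set(),        # v <= a + b
    "w": set(),        # w <= b * 2
    "x": {"v", "w"},   # x <= v - w
    "y": {"v", "w"},   # y <= v + w
    "z": {"x", "y"},   # z <= x * y
}

def firing_waves(deps):
    """Group instructions into waves; every instruction in a wave can fire together."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())   # all instructions whose operands are ready
        waves.append(ready)
        sorter.done(*ready)                  # their results are now available
    return waves

waves = firing_waves(deps)
print(waves)   # [['v', 'w'], ['x', 'y'], ['z']]
print(len(deps), "sequential steps vs", len(waves), "dataflow waves")
```

The sequential listing takes five steps because the program order is part of its meaning; the dataflow view needs only three waves because only the data dependences constrain the order.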
More on Data Flow
- In a data flow machine, a program consists of data flow nodes
  - A data flow node fires (is fetched and executed) when all its inputs are ready
    - i.e., when all inputs have tokens
- Data flow node and its ISA representation
[Figure: Data Flow Nodes]
[Figure: An Example Data Flow Program]
ISA-level Tradeoff: Instruction Pointer
- Do we need an instruction pointer in the ISA?
  - Yes: Control-driven, sequential execution
    - An instruction is executed when the IP points to it
    - IP automatically changes sequentially (except for control flow instructions)
  - No: Data-driven, parallel execution
    - An instruction is executed when all its operand values are available (data flow)
- Tradeoffs: MANY high-level ones
  - Ease of programming (for average programmers)?
  - Ease of compilation?
  - Performance: Extraction of parallelism?
  - Hardware complexity?
ISA vs. Microarchitecture Level Tradeoff
- A similar tradeoff (control vs. data-driven execution) can be made at the microarchitecture level
- ISA: Specifies how the programmer sees the instructions to be executed
  - Programmer sees a sequential, control-flow execution order vs.
  - Programmer sees a data-flow execution order
- Microarchitecture: How the underlying implementation actually executes instructions
  - Microarchitecture can execute instructions in any order as long as it obeys the semantics specified by the ISA when making the instruction results visible to software
    - Programmer should see the order specified by the ISA
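The idea that a microarchitecture may execute in a different order yet expose ISA-order results can be sketched as follows. This is a simplified illustration, not a model of any real processor: `PROGRAM`, `run_in_order`, and `run_out_of_order` are made-up names, each register is assumed to be written only once (so no renaming is needed), and results are committed only after everything has executed, whereas real designs commit incrementally.

```python
import operator

OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}

# (destination, operation, source 1, source 2), listed in program order.
PROGRAM = [
    ("v", "add", "a", "b"),
    ("w", "mul", "b", "two"),
    ("x", "sub", "v", "w"),
    ("y", "add", "v", "w"),
    ("z", "mul", "x", "y"),
]

def run_in_order(program, regs):
    """What the ISA promises: one instruction at a time, in program order."""
    regs = dict(regs)
    for dest, op, s1, s2 in program:
        regs[dest] = OPS[op](regs[s1], regs[s2])
    return regs

def run_out_of_order(program, regs):
    """Execute whenever operands are ready, then commit results in program order."""
    arch = dict(regs)      # architectural state, the part software can see
    values = dict(regs)    # values available inside the core
    results = {}           # instruction index -> result awaiting commit
    exec_order = []
    pending = set(range(len(program)))
    while pending:
        ready = [i for i in pending
                 if program[i][2] in values and program[i][3] in values]
        i = max(ready)     # deliberately pick the youngest ready instruction
        dest, op, s1, s2 = program[i]
        results[i] = OPS[op](values[s1], values[s2])
        values[dest] = results[i]
        exec_order.append(i)
        pending.remove(i)
    for i, (dest, _, _, _) in enumerate(program):   # commit in program order
        arch[dest] = results[i]
    return arch, exec_order

inputs = {"a": 3, "b": 4, "two": 2}
arch, exec_order = run_out_of_order(PROGRAM, inputs)
print(exec_order)                             # [1, 0, 3, 2, 4]
print(arch == run_in_order(PROGRAM, inputs))  # True
```

The execution order differs from the program order, but the state that software can observe is identical, which is all the ISA requires.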
The Von Neumann Model
- All major instruction set architectures today use this model
  - x86, ARM, MIPS, SPARC, Alpha, POWER, RISC-V, …
- Underneath (at the microarchitecture level), the execution model of almost all implementations (or, microarchitectures) is very different
  - Pipelined instruction execution: Intel 80486 microarchitecture
  - Multiple instructions at a time: Intel Pentium microarchitecture
  - Out-of-order execution: Intel Pentium Pro microarchitecture
  - Separate instruction and data caches
- But what happens underneath that is not consistent with the von Neumann model is not exposed to software
  - This is the difference between ISA and microarchitecture