Lec 12

This document discusses techniques for improving instruction-level parallelism (ILP) by reducing stalls, including loop unrolling and the handling of data, name, and control dependences. It explains that control dependence need not always be preserved, and that speculation and conditional instructions are ways around it. Loop unrolling can expose more parallelism, but loop-carried dependences force iterations to execute in order unless they can be removed. The document then contrasts static and dynamic scheduling and describes how the CDC 6600 used a scoreboard, a hardware data structure plus control logic, to allow out-of-order execution and completion while detecting hazards.

Uploaded by

jyothibellary4233

LECTURE - 12

ILP: Recall

Improving ILP == reducing stalls

Loop unrolling enlarges the basic block
More parallelism
More opportunity for better scheduling

Dependences:
Data dependence
Name dependence
Control dependence
Handling Control Dependence

Control dependence need not be maintained

We need to maintain:
Exception behaviour: do not cause new exceptions
Data flow: ensure the right data item is used

Speculation and conditional instructions are
techniques to get around control dependence
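As a rough illustration of how a conditional instruction sidesteps a control dependence, the sketch below contrasts a branching assignment with a branch-free select. The function names are hypothetical, and whether a compiler actually emits a conditional move for the second form depends on the target and the optimisation level.

```c
/* Branching form: the assignment is control dependent on the test. */
int select_with_branch(int a, int s, int t)
{
    if (a == 0) {
        s = t;
    }
    return s;
}

/* If-converted form: both values are already available and the condition
 * only picks one of them, which is what a conditional move does.
 * The control dependence has become a data dependence on `a`. */
int select_with_cmov(int a, int s, int t)
{
    return (a == 0) ? t : s;
}
```

Both forms preserve the data flow, and neither can raise a new exception, so the two requirements listed above still hold.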
Loop Unrolling: a Relook

Our example:
for(int i = 1000; i >= 1; i = i-1) {
x[i] = x[i] + C; // FP
}
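A minimal sketch of what unrolling this loop by four might look like, assuming x is indexed 1..n as in the example. The function names and the unroll factor are illustrative choices, not taken from the lecture.

```c
/* Rolled loop from the example: one FP add per trip plus loop overhead.
 * x must hold n + 1 elements because indices 1..n are used. */
void add_scalar_rolled(double *x, int n, double c)
{
    for (int i = n; i >= 1; i = i - 1) {
        x[i] = x[i] + c;
    }
}

/* Unrolled by 4: the loop body is a larger basic block, and no statement
 * in it reads a value that another statement in it writes, so the four
 * adds can be scheduled freely. */
void add_scalar_unrolled4(double *x, int n, double c)
{
    int i = n;
    for (; i >= 4; i = i - 4) {
        x[i]     = x[i]     + c;
        x[i - 1] = x[i - 1] + c;
        x[i - 2] = x[i - 2] + c;
        x[i - 3] = x[i - 3] + c;
    }
    /* Clean-up loop for the remaining 0 to 3 elements. */
    for (; i >= 1; i = i - 1) {
        x[i] = x[i] + c;
    }
}
```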

Consider:
for(int i = 1000; i >= 1; i = i-1) {
A[i-1] = A[i] + C[i]; // S1
B[i-1] = B[i] + A[i-1]; // S2
}
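S2 depends on S1 within the same iteration (through A[i-1])
S1 depends on S1 of the previous iteration (it reads A[i], which that
iteration wrote as A[i-1]); likewise S2 reads B[i] written by the previous S2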

Loop-carried dependence ==> loop iterations have to
be in-order
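One way to see why the order matters is to run the same loop body with the iterations reversed and compare results. The sketch below does that; the function names and the reversed variant are assumptions made purely for illustration.

```c
/* Loop from the slide, iterations in program order (i = n down to 1).
 * A and B hold n + 1 elements (indices 0..n); C is read at 1..n. */
void recurrence_in_order(double *A, double *B, const double *C, int n)
{
    for (int i = n; i >= 1; i = i - 1) {
        A[i - 1] = A[i] + C[i];      /* S1: A[i] comes from the previous iteration */
        B[i - 1] = B[i] + A[i - 1];  /* S2: uses S1's result and the previous B    */
    }
}

/* Same body with the iteration order reversed. Each iteration now reads
 * A[i] and B[i] before the iteration that should have produced them has
 * run, so the final values generally differ. */
void recurrence_reversed(double *A, double *B, const double *C, int n)
{
    for (int i = 1; i <= n; i = i + 1) {
        A[i - 1] = A[i] + C[i];
        B[i - 1] = B[i] + A[i - 1];
    }
}
```

With n = 3, A = B = {0, 0, 0, 1} and C[1..3] = 1, the in-order loop leaves A[0] = 4 while the reversed loop leaves A[0] = 1.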
Removing Loop-Carried Dependence

Another example:
for(int i = 1000; i >= 1; i = i-1) {
A[i] = A[i] + B[i]; // S1
B[i-1] = C[i] + D[i]; // S2
}

S1 depends on S2 of the prior iteration (it reads B[i], which that iteration wrote as B[i-1])
This dependence can be removed because it is not cyclic (S2 does not depend on S1)
A[1000] = A[1000] + B[1000];
for(int i = 1000; i >= 2; i = i-1) {
B[i-1] = C[i] + D[i]; // S2
A[i-1] = A[i-1] + B[i-1]; // S1
}
B[0] = C[1] + D[1]; // S2 of the final iteration
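To check that the restructured loop computes the same values, the two versions can be written as functions and compared. This is a sketch with hypothetical function names and a generic bound n in place of 1000; it includes an epilogue statement for S2 of the final iteration so that B[0] is written in both versions.

```c
/* Original loop: S1 in iteration i reads B[i], which S2 wrote one
 * iteration earlier. All arrays hold n + 1 elements; requires n >= 1. */
void original_loop(double *A, double *B, const double *C, const double *D, int n)
{
    for (int i = n; i >= 1; i = i - 1) {
        A[i] = A[i] + B[i];          /* S1 */
        B[i - 1] = C[i] + D[i];      /* S2 */
    }
}

/* Restructured loop: S2 and the S1 that consumes its result now sit in the
 * same trip, so no value crosses an iteration boundary. */
void transformed_loop(double *A, double *B, const double *C, const double *D, int n)
{
    A[n] = A[n] + B[n];                  /* prologue: S1 of the first iteration */
    for (int i = n; i >= 2; i = i - 1) {
        B[i - 1] = C[i] + D[i];          /* S2 */
        A[i - 1] = A[i - 1] + B[i - 1];  /* S1 that used to run one trip later */
    }
    B[0] = C[1] + D[1];                  /* epilogue: S2 of the last iteration */
}
```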
Static vs. Dynamic Scheduling

Static scheduling: limitations
Dependences may not be known at compile time
Even if known, compiler becomes complex
Compiler has to have knowledge of pipeline

Dynamic scheduling
Handle dynamic dependences
Simpler compiler
Efficient even if the code was compiled for a different
pipeline
Dynamic Scheduling

For now, we will focus on overcoming data
hazards

The idea:
DIVD F0, F2, F4
ADDD F10, F0, F8
SUBD F12, F8, F14

SUBD can proceed without waiting for DIVD
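The reasoning behind this can be sketched as a read-after-write check on register numbers. The `Instr` record and `raw_depends` helper below are hypothetical names used only for illustration, and the check assumes the destination is the first operand of each instruction.

```c
#include <stdbool.h>

/* Hypothetical three-register instruction; fields are F-register numbers. */
typedef struct {
    const char *op;
    int dest;
    int src1;
    int src2;
} Instr;

/* True if `later` reads the register that `earlier` writes (RAW). */
bool raw_depends(const Instr *earlier, const Instr *later)
{
    return later->src1 == earlier->dest || later->src2 == earlier->dest;
}
```

Applied to the three instructions, ADDD reads F0 and so must wait for DIVD, while SUBD reads only F8 and F14 and depends on neither earlier instruction.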
CDC 6600: A Case Study

IF stage: fetch instructions onto a queue

ID stage is split into two stages:
Issue: decode and check for structural hazards
Read operands: check for data hazards

Execution may begin, and may complete, out-of-order
Complications in exception handling
Ignore for now

What is the logic for data hazard checks?
The CDC Scoreboard

Out-of-order completion ==> WAR and WAW
hazards possible

Scoreboard: a data-structure for all hazard
detection in the presence of out-of-order
execution/completion

All instructions "consult" the scoreboard to
detect hazards
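As a sketch of the three data-hazard classes the scoreboard has to detect, the helper below classifies the hazards between an earlier and a later instruction from their register numbers alone. The type and constant names are assumptions for illustration and do not reflect the CDC 6600 hardware encoding.

```c
/* Bit flags so that one pair of instructions can report several hazards. */
enum { HAZ_NONE = 0, HAZ_RAW = 1, HAZ_WAR = 2, HAZ_WAW = 4 };

/* Hypothetical instruction record holding register numbers only. */
typedef struct {
    int dest;
    int src1;
    int src2;
} Instr;

unsigned hazards_between(const Instr *earlier, const Instr *later)
{
    unsigned h = HAZ_NONE;
    /* RAW: later reads what earlier writes. */
    if (later->src1 == earlier->dest || later->src2 == earlier->dest) {
        h |= HAZ_RAW;
    }
    /* WAR: later writes what earlier still has to read. */
    if (later->dest == earlier->src1 || later->dest == earlier->src2) {
        h |= HAZ_WAR;
    }
    /* WAW: both write the same register. */
    if (later->dest == earlier->dest) {
        h |= HAZ_WAW;
    }
    return h;
}
```

For example, a waiting ADDD F10, F0, F8 followed by a hypothetical SUBD F8, F8, F14 yields a WAR hazard on F8: if SUBD completes first, ADDD would read the new F8 instead of the old one.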
The Scoreboard Solution

Three components:
Stages of the pipeline:

Issue (ID1), Read-operands (ID2), EX, WB
Data structure (in hardware)
Logic for hazard detection, stalling
Scoreboard Control & the Pipeline Stages

Issue (ID1): decode, check that the functional unit is
free, and check that no active earlier instruction has the same
destination register
No such hazard ==> scoreboard issues to the
appropriate functional unit

Note: structural/WAW hazards prevented by stalling here

Note: stall here ==> IF queue will grow

Read operands (ID2):
Operand is available if no earlier instruction is going
to write it, or if the register is being written currently
RAW hazards are resolved here
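The issue and read-operands checks can be sketched in software as follows. This is a simplified model under stated assumptions: the field names follow the common textbook presentation of the scoreboard (Fi, Fj, Fk, Qj, Qk, Rj, Rk), while the unit count, register count, and function names are hypothetical.

```c
#include <stdbool.h>

#define NUM_UNITS 4     /* assumed size, for illustration only */
#define NUM_REGS  32
#define NO_UNIT   (-1)

typedef struct {
    bool busy;     /* unit currently holds an instruction          */
    int  fi;       /* destination register                         */
    int  fj, fk;   /* source registers                             */
    int  qj, qk;   /* units that will produce fj / fk, or NO_UNIT  */
    bool rj, rk;   /* source operand is ready and not yet read     */
} UnitStatus;

typedef struct {
    UnitStatus unit[NUM_UNITS];
    int writer[NUM_REGS];   /* pending writer of each register, or NO_UNIT */
} Scoreboard;

void scoreboard_init(Scoreboard *sb)
{
    for (int u = 0; u < NUM_UNITS; u++) {
        UnitStatus idle = { false, 0, 0, 0, NO_UNIT, NO_UNIT, false, false };
        sb->unit[u] = idle;
    }
    for (int r = 0; r < NUM_REGS; r++) {
        sb->writer[r] = NO_UNIT;
    }
}

/* Issue (ID1): stall on a structural hazard (unit busy) or a WAW hazard
 * (an active instruction already targets dest). */
bool can_issue(const Scoreboard *sb, int u, int dest)
{
    return !sb->unit[u].busy && sb->writer[dest] == NO_UNIT;
}

/* Bookkeeping on issue: remember which units, if any, will produce the
 * source operands. Sources are captured before the destination is claimed. */
void issue(Scoreboard *sb, int u, int dest, int src1, int src2)
{
    UnitStatus *s = &sb->unit[u];
    s->busy = true;
    s->fi = dest;
    s->fj = src1;
    s->fk = src2;
    s->qj = sb->writer[src1];
    s->qk = sb->writer[src2];
    s->rj = (s->qj == NO_UNIT);
    s->rk = (s->qk == NO_UNIT);
    sb->writer[dest] = u;
}

/* Read operands (ID2): proceed only when both sources are ready. */
bool can_read_operands(const Scoreboard *sb, int u)
{
    return sb->unit[u].rj && sb->unit[u].rk;
}
```

Issuing DIVD F0, F2, F4, then ADDD F10, F0, F8, then SUBD F12, F8, F14 to three different units leaves ADDD waiting in ID2 for F0, while SUBD can read its operands immediately.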
Scoreboard Control & the Pipeline Stages (continued)

Execute (EX):
Functional units perform execution
Scoreboard is notified on completion

Write-Back (WB):
Check for WAR hazards

Stall on detection

Write-back otherwise
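The WAR check at write-back can be sketched as a scan over the other functional units. The structure below is a cut-down, hypothetical model that keeps only the fields this check needs; `rj` and `rk` mean "operand is ready but has not been read yet".

```c
#include <stdbool.h>

typedef struct {
    bool busy;     /* unit currently holds an instruction     */
    int  fi;       /* destination register                    */
    int  fj, fk;   /* source registers                        */
    bool rj, rk;   /* source operand ready and not yet read   */
} UnitStatus;

/* Unit u may write its result only if no other active unit still has to
 * read the old value of the destination register. */
bool can_write_back(const UnitStatus *units, int num_units, int u)
{
    int dest = units[u].fi;
    for (int f = 0; f < num_units; f++) {
        if (f == u || !units[f].busy) {
            continue;
        }
        if (units[f].fj == dest && units[f].rj) {
            return false;   /* WAR on the first source of unit f  */
        }
        if (units[f].fk == dest && units[f].rk) {
            return false;   /* WAR on the second source of unit f */
        }
    }
    return true;
}
```

If one unit holds ADDD F10, F0, F8 (still waiting for F0, with F8 ready but unread) and another has finished a hypothetical SUBD F8, F8, F14, the SUBD result stalls in WB until ADDD has read F8.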
Some Remarks

WAW causes stall in ID1, WAR causes stall
in WB

No forwarding logic
Output written as soon as it is available (and no
WAR hazard)

Structural hazard possible in register
read/write
CDC has 16 functional units, and 4 buses
The Scoreboard Data-Structures

Instruction status

Functional unit status

Register result status

Randy Katz's CS252 slides... (Lecture 10,
Spring 1996)
Scoreboard pipeline control
A detailed example
Limitations of the Scoreboard

Speedup of 1.7 for (compiled) FORTRAN,
speedup of 2.5 for hand-coded assembly

Scoreboard exploits parallelism only within a basic block!

Some hazards still cause stalls:
Structural
WAR, WAW
