Copy of Lecture 9 - Pipelined Processor Design

Uploaded by

hinam74

Lecture 9: Pipelined Processor Design

Drawbacks of Single Cycle Processor

❑ Long cycle time
➢ All instructions take as much time as the slowest instruction

Worst Case Timing

❑ Slowest instruction: load

➢ Cycle time is longer than needed for other instructions

Multicycle Implementation
❑ Break instruction execution into five steps
➢ Instruction fetch
➢ Instruction decode, register read, target address for jump/branch
➢ Execution, memory address calculation, or branch outcome
➢ Memory access or ALU instruction completion
➢ Load instruction completion
❑ One clock cycle per step (clock cycle is reduced)
➢ First 2 steps are the same for all instructions

Single cycle vs. multicycle example

❑ Single cycle

❑ Multicycle
➢ Shorter clock cycle time: constrained by longest step, not longest instruction
➢ Higher overall performance: simpler instructions take fewer cycles, less waste
❑ Assume the following operation times for components:
➢ Instruction and data memories: 200 ps
➢ ALU and adders: 180 ps
➢ Decode and Register file access (read or write): 150 ps
➢ Ignore the delays in PC, mux, extender, and wires
❑ Assume the following instruction mix:
➢ 40% ALU, 20% Loads, 10% stores, 20% branches, & 10% jumps
❑ Which of the following would be faster and by how much?
➢ Single-cycle implementation for all instructions
➢ Multicycle implementation optimized for every class of instructions

=> Example solution

❑ For fixed single-cycle implementation:
➢ Clock cycle = 880 ps determined by longest delay (load instruction)
❑ For multi-cycle implementation:
➢ Clock cycle = max (200, 150, 180) = 200 ps (maximum delay at any step)
➢ Average CPI = 0.4×4 + 0.2×5 + 0.1×4+ 0.2×3 + 0.1×2 = 3.8
❑ Speedup = 880 ps / (3.8 × 200 ps) = 880 / 760 = 1.16
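
The arithmetic in this solution can be checked with a short script. This is only a sketch: the variable names are illustrative, and the per-class cycle counts (4, 5, 4, 3, 2) are read directly off the CPI line above.

```python
# Component delays in picoseconds, taken from the example above.
MEM_PS = 200   # instruction or data memory access
ALU_PS = 180   # ALU and adders
REG_PS = 150   # decode / register file read or write

# Single-cycle: the clock must fit the slowest instruction (load), which
# uses fetch, register read, ALU, data memory, and register write.
single_cycle_ps = MEM_PS + REG_PS + ALU_PS + MEM_PS + REG_PS

# Multicycle: the clock only has to fit the slowest single step.
multi_cycle_ps = max(MEM_PS, ALU_PS, REG_PS)

# (fraction of the instruction mix, cycles needed) per instruction class.
mix = {
    "alu":    (0.40, 4),
    "load":   (0.20, 5),
    "store":  (0.10, 4),
    "branch": (0.20, 3),
    "jump":   (0.10, 2),
}
average_cpi = sum(fraction * cycles for fraction, cycles in mix.values())
speedup = single_cycle_ps / (average_cpi * multi_cycle_ps)

print(f"single-cycle clock: {single_cycle_ps} ps")
print(f"multicycle clock:   {multi_cycle_ps} ps, average CPI = {average_cpi:.2f}")
print(f"speedup:            {speedup:.2f}")
```

Changing the fractions in `mix` shows how sensitive the result is to the share of loads, the only class that needs all five steps.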

The idea of pipelining

❑ Multicycle improves performance over single cycle, but can you see limitations of the
multi-cycle design?
➢ Some HW resources are idle during different phases of the instruction cycle, e.g.
“Fetch” logic is idle when an instruction is being “decoded” or “executed”
➢ Most of the datapath is idle when a memory access is happening

❑ Can we do better?
➢ Yes: More concurrency → Higher instruction throughput (i.e., more “work” completed
in one cycle)

❑ Idea: when an instruction is using some resources in its processing phase, process
other instructions on idle resources
➢ E.g., when an instruction is being decoded, fetch the next instruction
➢ E.g., when an instruction is being executed, decode another instruction
➢ E.g., when an instruction is accessing data memory (lw/sw), execute the next
instruction
➢ E.g., when an instruction is writing its result into the register file, access data
memory for the next instruction

Single-cycle vs multi-cycle vs pipeline
❑ Five stages, one step per stage
➢ Each stage takes 1 clock cycle → instructions enter/leave the pipeline at the rate of one
instruction per clock cycle

Pipeline performance
❑ Ideal pipeline assumptions
➢ Identical operations, e.g. four laundry steps are repeated for all loads
➢ Independent operations, e.g. no dependency between laundry steps
➢ Uniformly partitionable sub operations (that do not share resources), e.g. laundry
steps have uniform latency.
❑ Ideal pipeline speedup

➢ Speedup is due to increased throughput (*); latency (*) does not decrease
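
The speedup formula from the original slide is not present in this text. As a hedged reconstruction, the usual ideal-case relation is given here, assuming k perfectly balanced stages of duration t each and a stream of n independent instructions (the symbols k, t, and n are introduced for illustration and do not come from the lecture):

```latex
T_{\text{pipelined}}(n) = (k + n - 1)\,t,
\qquad
T_{\text{non-pipelined}}(n) = n\,k\,t,
\qquad
\text{Speedup}(n) = \frac{n\,k\,t}{(k + n - 1)\,t} = \frac{n\,k}{k + n - 1}
\;\xrightarrow{\;n \to \infty\;}\; k
```

Each instruction still takes k·t from start to finish, which is why latency does not improve even though throughput approaches one instruction per t.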

❑ Speedup for non-ideal pipelines is less

➢ External/internal fragmentation, pipeline stalls.

✓ (*) Latency = execution time (delay or response time) = the total time from start to
finish of ONE instruction
✓ (*) Throughput (or execution bandwidth) = the total amount of work done in a given
amount of time

=> Example: A MIPS pipelined processor performance

❑ Assume time for stages is
✓ 100ps for register read or write
✓ 200ps for other stages

❑ Compare pipelined datapath with single-cycle datapath

=> solution
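❑ Time between the 1st and 5th instructions: single cycle = 3200 ps (4 × 800 ps) vs.
pipelined = 800 ps (4 × 200 ps)
→ speedup = 4.
➢ Execution time for 5 instructions: 4000 ps vs. 1800 ps ≈ 2.22 times speedup
→ Why isn't the speedup 5 (the number of stages)? What's wrong?
➢ The stages are not balanced: register read/write needs only 100 ps but still occupies a
200 ps cycle, so the limit is 800 / 200 = 4. With only 5 instructions, the 4 cycles spent
filling the pipeline are not amortized.
➢ Think of real programs, which execute billions of instructions.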
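
To see how the speedup behaves as the instruction count grows, the comparison above can be sketched as follows. The function names are illustrative, and the model assumes no stalls: the pipeline needs 5 cycles for the first instruction and one more cycle for each additional one.

```python
SINGLE_CYCLE_PS = 800     # 200 + 100 + 200 + 200 + 100 (the slowest instruction, a load)
PIPELINE_CYCLE_PS = 200   # the slowest stage sets the pipelined clock
STAGES = 5


def single_cycle_time(n):
    """Total time in ps for n instructions on the single-cycle datapath."""
    return n * SINGLE_CYCLE_PS


def pipelined_time(n):
    """Total time in ps for n instructions on a stall-free 5-stage pipeline."""
    return (STAGES + n - 1) * PIPELINE_CYCLE_PS


for n in (1, 5, 1_000, 1_000_000):
    speedup = single_cycle_time(n) / pipelined_time(n)
    print(f"{n:>9} instructions: speedup = {speedup:.2f}")
```

For a single instruction the pipeline is slower (1000 ps vs. 800 ps), and for long programs the ratio approaches 4 rather than 5 because the stages are not equally long.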

MIPS ISA support for pipelining

❑ What makes it easy
➢ All instructions are 32 bits long
• Easier to fetch and decode in one cycle: fetch in the 1st stage and decode in the 2nd stage
cf. x86: 1- to 17-byte instructions
➢ Few and regular instruction formats
• Can decode and read registers in one step
➢ Memory operations occur only in loads and stores
• Can calculate address in 3rd stage, access memory in 4th stage
➢ Operands must be aligned in memory
• Memory access takes only one cycle
➢ Each instruction writes at most one result (i.e., changes the machine state) and does
it in the last few pipeline stages (MEM or WB)
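
The benefit of fixed-length, regular formats can be illustrated with a small decoder. This is a sketch that assumes the standard MIPS32 field positions (opcode in bits 31-26, rs 25-21, rt 20-16, rd 15-11, shamt 10-6, funct 5-0, immediate 15-0); the lecture does not list them, and the function name is hypothetical.

```python
def decode_fields(word):
    """Split a 32-bit MIPS instruction word into its fixed-position fields.

    All fields are extracted unconditionally, before the format is known.
    This is what lets hardware read rs and rt from the register file
    while the opcode is still being decoded.
    """
    return {
        "opcode": (word >> 26) & 0x3F,
        "rs":     (word >> 21) & 0x1F,
        "rt":     (word >> 16) & 0x1F,
        "rd":     (word >> 11) & 0x1F,
        "shamt":  (word >> 6) & 0x1F,
        "funct":  word & 0x3F,
        "imm":    word & 0xFFFF,
    }


# Encoding of: add $t0, $t1, $t2  ($t0 = register 8, $t1 = 9, $t2 = 10)
fields = decode_fields(0x012A4020)
print(fields)
```

Because the register fields sit in the same bit positions in every format, no decision is needed before the register file is accessed; a variable-length ISA would first have to find where the instruction ends.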

Ideas from the Single-Cycle Datapath

❑ How to pipeline a single-cycle datapath? Think of the simple datapath as a linear
sequence of stages.

Pipelined Datapath
❑ Add state registers between each pipeline stage
➢ To isolate information between cycles

Pipeline operation
❑ Cycle-by-cycle flow of instructions through the pipelined datapath
➢ Same clock edge updates all pipeline registers, register file, and data memory (for
store instruction)
➢ “Single-clock-cycle” pipeline diagram
✓ Shows pipeline usage in a single cycle
✓ Highlight resources used
➢ cf. “multi-clock-cycle” diagram (later)
✓ Graph of operation over time
❑ We’ll look at “single-clock-cycle” diagrams for load to verify the proposed datapath

IF for Load, Store, …

EX for Load

MEM for Load

WB for Load

Corrected Datapath for Load

Multi-Cycle Pipeline Diagram
❑ Shows the complete execution of instructions in a single figure
➢ Instructions are listed in instruction execution order from top to bottom
➢ Clock cycles move from left to right
➢ Figure shows the use of resources at each stage and each cycle

❑ Can help with answering questions like:

➢ How many cycles does it take to execute this code?
➢ What is the ALU doing during cycle 4?
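
A multi-cycle pipeline diagram can be generated as text to answer both questions. The sketch below assumes an ideal five-stage pipeline with no stalls; the four-instruction program is a hypothetical example with no dependencies between instructions.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def pipeline_diagram(instructions):
    """Return (total_cycles, rows); rows[i][c] is the stage that
    instruction i occupies in cycle c + 1, or "" if it is not in flight."""
    total_cycles = len(instructions) + len(STAGES) - 1
    rows = []
    for i in range(len(instructions)):
        row = [""] * total_cycles
        for s, stage in enumerate(STAGES):
            row[i + s] = stage   # instruction i enters IF in cycle i + 1
        rows.append(row)
    return total_cycles, rows


# Hypothetical program: independent instructions, so no stalls are needed.
program = [
    "lw  $t0, 0($s0)",
    "add $t1, $t2, $t3",
    "sub $t4, $t5, $t6",
    "sw  $t7, 4($s0)",
]
total, rows = pipeline_diagram(program)

# Instructions from top to bottom, clock cycles from left to right.
print(" " * 20 + "".join(f"{c:>5}" for c in range(1, total + 1)))
for text, row in zip(program, rows):
    print(f"{text:<20}" + "".join(f"{stage:>5}" for stage in row))

cycle = 4
in_ex = [text for text, row in zip(program, rows) if row[cycle - 1] == "EX"]
print(f"cycles needed: {total}; in EX during cycle {cycle}: {in_ex}")
```

Reading a column of the printed table gives the single-clock-cycle view: which instruction occupies which stage in that cycle.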

Pipelined control: control points

❑ Same control points as in the single-cycle datapath

Pipelined control: settings

❑ Control signals derived from instruction & determined during ID
➢ As instruction moves → pipeline the control signals → extend pipeline registers to
include control signals
➢ Each stage uses some of the control signals
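
The way control signals travel with an instruction can be sketched as follows. The signal names, their grouping by stage, and the values for `lw` and R-type instructions follow the common textbook five-stage MIPS design; they are assumptions here, since the lecture's control figures are not part of this text.

```python
# Assumed grouping of control signals by the stage that consumes them.
EX_SIGNALS = ("RegDst", "ALUSrc", "ALUOp")
MEM_SIGNALS = ("Branch", "MemRead", "MemWrite")
WB_SIGNALS = ("RegWrite", "MemtoReg")

# Control word generated during ID (textbook-style values, two classes only).
CONTROL_TABLE = {
    "lw": {
        "RegDst": 0, "ALUSrc": 1, "ALUOp": 0b00,
        "Branch": 0, "MemRead": 1, "MemWrite": 0,
        "RegWrite": 1, "MemtoReg": 1,
    },
    "r-type": {
        "RegDst": 1, "ALUSrc": 0, "ALUOp": 0b10,
        "Branch": 0, "MemRead": 0, "MemWrite": 0,
        "RegWrite": 1, "MemtoReg": 0,
    },
}


def pick(signals, names):
    """Keep only the named signals."""
    return {name: signals[name] for name in names}


def control_through_pipeline(kind):
    """Show which control signals each pipeline register still carries."""
    control = CONTROL_TABLE[kind]                                  # decided in ID
    id_ex = pick(control, EX_SIGNALS + MEM_SIGNALS + WB_SIGNALS)
    ex_mem = pick(id_ex, MEM_SIGNALS + WB_SIGNALS)                 # EX signals used up
    mem_wb = pick(ex_mem, WB_SIGNALS)                              # MEM signals used up
    return {"ID/EX": id_ex, "EX/MEM": ex_mem, "MEM/WB": mem_wb}


for register, signals in control_through_pipeline("lw").items():
    print(register, signals)
```

Each pipeline register carries only the signals that later stages still need, so the control word shrinks as the instruction advances.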

Pipelined control: complete

Can Pipelining Get Us Into Trouble?
❑ Yes - instruction pipeline is not an ideal pipeline
➢ different instructions → not all need the same stages: some pipe stages idle for some
instructions → external fragmentation
➢ different pipeline stages → not the same latency: some pipe stages are too fast but
all take the same clock cycle time → internal fragmentation
➢ instructions are not independent of each other → pipeline stalls: pipeline is not
always moving

❑ Issues in pipeline design: pipeline hazards

➢ structural hazards: attempt to use the same resource by two different instructions at
the same time
➢ data hazards: attempt to use data before it is ready, e.g. an instruction’s source
operand(s) are produced by a prior instruction still in the pipeline
➢ control hazards: attempt to make a decision about program control flow before the
condition has been evaluated and the new PC target address calculated (e.g. branch and
jump instructions, exceptions)
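
The data hazard case can be made concrete with a small checker that looks for a source register written by a nearby earlier instruction. This is a sketch only: the `Instr` record and the `window` parameter are hypothetical, and the default of 2 assumes a five-stage pipeline without forwarding in which the register file is written in the first half of a cycle and read in the second half.

```python
from collections import namedtuple

# Hypothetical minimal instruction record: text, destination, source registers.
Instr = namedtuple("Instr", ["text", "dest", "sources"])


def raw_hazards(program, window=2):
    """Return (producer, consumer, register) triples where the consumer
    reads a register written by an instruction at most `window` slots earlier,
    i.e. while the producer is still in the pipeline."""
    hazards = []
    for i, consumer in enumerate(program):
        for producer in program[max(0, i - window):i]:
            if producer.dest is not None and producer.dest in consumer.sources:
                hazards.append((producer.text, consumer.text, producer.dest))
    return hazards


program = [
    Instr("add $t0, $t1, $t2", "$t0", ("$t1", "$t2")),
    Instr("sub $t3, $t0, $t4", "$t3", ("$t0", "$t4")),   # reads $t0 too early
    Instr("or  $t5, $t6, $t7", "$t5", ("$t6", "$t7")),
]
hazards = raw_hazards(program)
for producer, consumer, register in hazards:
    print(f"{consumer!r} needs {register} from {producer!r}")
```

Only read-after-write dependencies are checked; structural and control hazards depend on the datapath and on branch handling, which this sketch does not model.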

Example: structural hazards

Summary
❑ Multi-cycle processor
➢ Use one clock cycle per step → shorter clock cycle time = longest step, not longest
instruction.
➢ Higher performance over single-cycle processor: simpler instructions take fewer
cycles → less waste

❑ Pipeline processor design

➢ Employs instruction parallelism: process the next instruction on the resources
available when current instructions move to subsequent phases.
➢ Speedup is due to increased throughput: once the pipeline is full, CPI=1.
➢ Datapath can be derived from that of single-cycle processor, with additional buffer
registers
➢ Control signals remain the same as in the single-cycle case but some of them are
moved along the pipeline via inter-stage buffers.

❑ As the instruction pipeline is not ideal, various issues may occur including structural,
data, and control hazards.
