Arithmetic Pipeline in Computer Architecture

The document discusses arithmetic pipelines, which can be used to speed up fixed-point and floating-point arithmetic operations. Fixed-point arithmetic pipelines work by breaking down multiplication into a series of addition and shift operations that can be pipelined. Floating-point addition and subtraction can also be pipelined into four stages: exponent comparison, mantissa alignment, mantissa addition or subtraction, and normalization of the result. Vector and array processors are also discussed as ways to parallelize arithmetic tasks like matrix multiplication using pipelined multiply-add units.

Uploaded by

s1910576101

Arithmetic Pipeline

• The main topics in pipeline processing are:

• Arithmetic pipeline:
    • Fixed-point arithmetic pipeline
    • Floating-point arithmetic pipeline
• Vector processing: adder/multiplier pipeline
• Array processing: array processor
    • Attached array processor
    • SIMD array processor
Parallel Processing

• Simultaneous data processing tasks for the purpose of increasing the computational speed
• Perform concurrent data processing to achieve faster execution time
• Multiple functional units: separate the execution unit into eight functional units operating in parallel.

[Figure: processor registers, connected to memory, feed eight parallel functional units: adder-subtractor, integer multiply, logic unit, shift unit, incrementer, floating-point add-subtract, floating-point multiply, floating-point divide]
Pipelining: Laundry Example

• A small laundry has one washer, one dryer and one operator, and four loads (A, B, C, D) to do. It takes 90 minutes to finish one load:
    • Washer takes 30 minutes
    • Dryer takes 40 minutes
    • "Operator folding" takes 20 minutes
Sequential Laundry

• This operator scheduled his loads to be delivered to the laundry every 90 minutes, which is the time required to finish one load.
• In other words, he will not start a new task unless he is already done with the previous task.
• The process is sequential. Sequential laundry takes 6 hours for 4 loads.

[Figure: task order (loads A to D) against time from 6 PM to midnight; each load takes 30 + 40 + 20 = 90 min, one after another]
Efficiently scheduled laundry: Pipelined Laundry

• Another operator asks for the delivery of loads to the laundry every 40 minutes.
• Pipelined laundry takes 3.5 hours for 4 loads.

[Figure: task order (loads A to D) against time from 6 PM; the timeline is split into 30, 40, 40, 40, 40 and 20 min, with successive loads overlapping]
Pipelining Facts

• Multiple tasks operate simultaneously.
• Pipelining doesn't help the latency (response time) of a single task; it helps the throughput of the entire workload.
• Pipeline rate is limited by the slowest pipeline stage.
• Potential speedup = number of pipe stages.
• Unbalanced lengths of pipe stages reduce speedup.

[Figure: pipelined laundry timeline from 6 PM (30, 40, 40, 40, 40, 20 min); the washer waits for the dryer for 10 minutes]
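The laundry timings above can be checked with a short simulation. This is a minimal sketch, not part of the original slides: the function names are hypothetical, and it assumes a finished load may wait between stages until the next stage is free.

```python
def sequential_time(stage_minutes, loads):
    """Each load passes through every stage before the next load starts."""
    return sum(stage_minutes) * loads


def pipelined_time(stage_minutes, loads):
    """Loads overlap; a stage starts a load once both are available."""
    stage_free_at = [0] * len(stage_minutes)  # when each stage next becomes idle
    for _ in range(loads):
        ready = 0  # when this load is ready for the next stage
        for s, duration in enumerate(stage_minutes):
            start = max(ready, stage_free_at[s])
            stage_free_at[s] = start + duration
            ready = stage_free_at[s]
    return stage_free_at[-1]


stages = [30, 40, 20]  # washer, dryer, folding (minutes)
print(sequential_time(stages, 4))  # 360 minutes = 6 hours
print(pipelined_time(stages, 4))   # 210 minutes = 3.5 hours
```

The speedup here is 360 / 210, roughly 1.7, well below the three-stage ideal of 3, because the 40-minute dryer sets the pace and the run is only four loads long.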
Pipelining

Pipelining decomposes a sequential process into suboperations.
Each suboperation is executed concurrently in its own dedicated segment.

• Instruction execution is divided into k segments or stages

• An instruction exits pipe stage k-1 and proceeds into pipe stage k
• All pipe stages take the same amount of time, called one processor cycle
• The length of the processor cycle is determined by the slowest pipe stage

[Figure: pipeline of k segments]
Pipelining

• Suppose we want to perform the combined multiply and add operations with a stream of numbers:
    • Ai * Bi + Ci for i = 1, 2, 3, …, 7

• The suboperations performed in each segment of the pipeline are as follows:
    • R1 ← Ai, R2 ← Bi
    • R3 ← R1 * R2, R4 ← Ci
    • R5 ← R3 + R4
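The three segments can be traced clock by clock. The sketch below is illustrative only: the function name is hypothetical, the registers are modelled as plain variables, and an empty register is represented by None.

```python
def multiply_add_pipeline(A, B, C):
    """Model the three-segment pipeline that computes Ai * Bi + Ci."""
    n = len(A)
    R1 = R2 = R3 = R4 = None
    results = []  # successive values loaded into R5
    clocks = n + 2  # k + n - 1 clock pulses with k = 3 segments
    for clock in range(clocks):
        # Segments are updated back to front so that each one reads
        # the values its predecessor held during the previous clock.
        if R3 is not None:                  # segment 3: R5 <- R3 + R4
            results.append(R3 + R4)
        if R1 is not None:                  # segment 2: R3 <- R1 * R2, R4 <- Ci
            R3, R4 = R1 * R2, C[clock - 1]
        else:
            R3 = R4 = None
        if clock < n:                       # segment 1: R1 <- Ai, R2 <- Bi
            R1, R2 = A[clock], B[clock]
        else:
            R1 = R2 = None
    return results, clocks


values, clocks = multiply_add_pipeline([1, 2, 3, 4, 5, 6, 7], [2] * 7, [1] * 7)
print(values)  # [3, 5, 7, 9, 11, 13, 15]
print(clocks)  # 9
```

For the seven-element stream, the pipeline needs 9 clock pulses instead of the 21 that three unpipelined steps per element would take.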
Arithmetic Pipeline

• Pipelined arithmetic units are usually found in very high-speed computers.
• Arithmetic pipelines are constructed for:
    • simple fixed-point arithmetic operations
    • floating-point arithmetic operations

• Arithmetic pipelines are generally implemented with the following two types of adder:

• i) Carry propagation adder (CPA): It adds two numbers such that the carries generated in successive digits are propagated.

• ii) Carry save adder (CSA): It adds three numbers such that the carries generated are not propagated; instead they are saved in a carry vector, which is output alongside the sum vector.
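The difference between the two adders can be illustrated on non-negative integers with bitwise operations. This is a sketch under that assumption, with hypothetical function names; real adders are fixed-width hardware circuits.

```python
def carry_save_add(x, y, z):
    """Reduce three numbers to a sum vector and a carry vector.

    Every bit position is handled independently, so no carry ripples
    from one digit to the next.
    """
    sum_vector = x ^ y ^ z
    carry_vector = ((x & y) | (x & z) | (y & z)) << 1
    return sum_vector, carry_vector


def carry_propagate_add(x, y):
    """Add two numbers, passing carries along until none remain."""
    while y:
        x, y = x ^ y, (x & y) << 1
    return x


s, c = carry_save_add(5, 6, 7)
print(s, c)                       # 4 14
print(carry_propagate_add(s, c))  # 18
```

The sum vector and carry vector together still represent 5 + 6 + 7; a carry propagation adder is only needed once, at the end, to collapse them into a single number.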
Fixed Arithmetic pipeline

• We take the example of multiplication of fixed-point numbers.
• Two fixed-point numbers are multiplied by the ALU using add and shift operations.
• This sequential execution makes multiplication a slow process.
• Observe that multiplication is the process of adding multiple copies of the shifted multiplicand.

[Figure: a multiplication written out as rows of shifted multiplicands that are summed]
Fixed Arithmetic pipeline

Now we can identify the following stages for the pipeline:

• The first stage generates the partial products of the numbers, which form the six rows of shifted multiplicands.
• In the second stage, the six numbers are given to two CSAs, which merge them into four numbers.
• In the third stage, a single CSA merges the numbers into three numbers.
• In the fourth stage, a single CSA merges the three numbers into two numbers.
• In the fifth stage, the last two numbers are added through a CPA to get the final product.
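The five stages can be followed in software for a 6-bit multiplier, which is what produces six rows of shifted multiplicands. This is a sketch with hypothetical names: it runs the stages one after another for a single pair of operands rather than overlapping several multiplications, and Python's built-in addition stands in for the CPA.

```python
def csa(x, y, z):
    """Carry save adder: three numbers in, sum vector and carry vector out."""
    return x ^ y ^ z, ((x & y) | (x & z) | (y & z)) << 1


def staged_multiply(multiplicand, multiplier):
    """Multiply using the five stages listed above (6-bit multiplier)."""
    if multiplicand < 0 or not 0 <= multiplier < 64:
        raise ValueError("expects a non-negative multiplicand and a 6-bit multiplier")
    # Stage 1: six rows, each the multiplicand shifted by i or zero.
    rows = [(multiplicand << i) if (multiplier >> i) & 1 else 0 for i in range(6)]
    # Stage 2: two CSAs merge six numbers into four.
    s1, c1 = csa(rows[0], rows[1], rows[2])
    s2, c2 = csa(rows[3], rows[4], rows[5])
    # Stage 3: one CSA merges three of the four; c2 passes through (three remain).
    s3, c3 = csa(s1, c1, s2)
    # Stage 4: one CSA merges the three numbers into two.
    s4, c4 = csa(s3, c3, c2)
    # Stage 5: a CPA adds the last two numbers.
    return s4 + c4


print(staged_multiply(45, 27))  # 1215
```

Each CSA preserves the total of its inputs, so the value leaving stage 5 is the sum of the six rows, which is the product.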
Floating point operations

• The inputs to the floating-point adder pipeline are two normalized floating-point numbers, each made up of a mantissa and an exponent:
    • X = A × 2^a
    • Y = B × 2^b

• A and B are the mantissas and a and b are the exponents.

• Floating-point addition and subtraction can be performed in four segments.

[Figure: Floating-point add/subtract pipeline, with mantissa and exponent inputs]
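The slide text does not list the four segments, so the sketch below assumes the usual textbook breakdown: compare the exponents, align the mantissas, add or subtract the mantissas, and normalize the result. The function name is hypothetical, mantissas are modelled as Python floats with magnitude in [0.5, 1), and rounding to a fixed mantissa width is not modelled.

```python
def fp_add(A, a, B, b):
    """Add X = A * 2**a and Y = B * 2**b; returns (mantissa, exponent)."""
    # Segment 1: compare the exponents.
    diff = a - b
    exponent = max(a, b)
    # Segment 2: align the mantissa that belongs to the smaller exponent.
    if diff >= 0:
        B = B / 2 ** diff
    else:
        A = A / 2 ** (-diff)
    # Segment 3: add the mantissas (pass a negative B to subtract).
    mantissa = A + B
    # Segment 4: normalize so that 0.5 <= |mantissa| < 1.
    if mantissa == 0:
        return 0.0, 0
    while abs(mantissa) >= 1:
        mantissa /= 2
        exponent += 1
    while abs(mantissa) < 0.5:
        mantissa *= 2
        exponent -= 1
    return mantissa, exponent


print(fp_add(0.75, 3, 0.5, 2))  # (0.5, 4), i.e. 6 + 2 = 8
```

In a hardware pipeline each segment would work on a different pair of operands during the same clock; here the segments simply run in order for one pair.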
Vector Processing

• Science and engineering applications:
    • Long-range weather forecasting
    • Petroleum exploration
    • Seismic data analysis
    • Medical diagnosis
    • Aerodynamics and space flight simulations
    • Artificial intelligence and expert systems
    • Mapping the human genome
    • Image processing
Vector Processing

Vector Instruction Format:

Operation code | Base address source 1 | Base address source 2 | Base address destination | Vector length
ADD            | A                     | B                     | C                        | 100
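One way to read the ADD instruction above is as a single instruction that replaces a 100-iteration loop. The sketch below is an assumption-laden illustration: memory is a flat Python list, and the base addresses 0, 100 and 200 chosen for A, B and C are hypothetical.

```python
def vector_add(memory, src1, src2, dest, length):
    """Carry out ADD src1, src2, dest, length on a flat memory list."""
    for i in range(length):
        memory[dest + i] = memory[src1 + i] + memory[src2 + i]


memory = list(range(300))          # hypothetical memory contents
A, B, C = 0, 100, 200              # hypothetical base addresses
vector_add(memory, A, B, C, 100)   # C(i) = A(i) + B(i) for 100 elements
print(memory[C], memory[C + 99])   # 100 298
```

A vector processor would stream the operand pairs through a pipelined adder rather than loop in software; the loop only shows which elements the instruction touches.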
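Matrix Multiplication

• 3 x 3 matrix multiplication: n^2 = 9 inner products

\[
\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}
\begin{bmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \\ b_{31} & b_{32} & b_{33} \end{bmatrix}
=
\begin{bmatrix} c_{11} & c_{12} & c_{13} \\ c_{21} & c_{22} & c_{23} \\ c_{31} & c_{32} & c_{33} \end{bmatrix}
\]

• Inner product: there are 9 of these, for example

\[ c_{11} = a_{11} b_{11} + a_{12} b_{21} + a_{13} b_{31} \]

• Cumulative multiply-add operations: n^3 = 27 multiply-adds
    • Each multiply-add has the form c ← c + a × b. Each inner product needs three such multiply-adds, therefore 9 × 3 = 27 multiply-adds.
    • For example, c_{11} ← c_{11} + a_{11} b_{11} + a_{12} b_{21} + a_{13} b_{31}, with the initial value of c_{11} = 0.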
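The operation count can be confirmed by writing the product as nested multiply-add steps. This is a minimal sketch with a hypothetical function name, using plain lists of lists for the matrices.

```python
def matmul_count(A, B):
    """Multiply two n x n matrices and count the multiply-add steps."""
    n = len(A)
    C = [[0] * n for _ in range(n)]  # every c_ij starts at 0
    multiply_adds = 0
    for i in range(n):
        for j in range(n):           # n * n inner products
            for k in range(n):       # n multiply-adds per inner product
                C[i][j] += A[i][k] * B[k][j]  # c <- c + a * b
                multiply_adds += 1
    return C, multiply_adds


A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
I = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
C, count = matmul_count(A, I)
print(count)  # 27
```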
• Pipeline for calculating an inner product:
    • Floating-point multiplier pipeline: 4 segments
    • Floating-point adder pipeline: 4 segments
• Example:

\[ C = A_1 B_1 + A_2 B_2 + A_3 B_3 + \cdots + A_k B_k \]

[Figure: sources A and B feed the multiplier pipeline, whose output feeds the adder pipeline. Four snapshots are shown. After the 1st clock input: A1B1 is in the multiplier pipeline. After the 4th clock input: A4B4, A3B3, A2B2, A1B1 fill the multiplier pipeline. After the 8th clock input: A8B8 to A5B5 are in the multiplier pipeline and A4B4 to A1B1 are in the adder pipeline. After the 9th, 10th, 11th, ... clock inputs: the adder pipeline holds running sums such as A1B1 + A5B5 and A2B2 + A6B6.]

• The four partial sums are added to form the final sum:

\[
\begin{aligned}
C ={}& A_1 B_1 + A_5 B_5 + A_9 B_9 + A_{13} B_{13} + \cdots \\
 &+ A_2 B_2 + A_6 B_6 + A_{10} B_{10} + A_{14} B_{14} + \cdots \\
 &+ A_3 B_3 + A_7 B_7 + A_{11} B_{11} + A_{15} B_{15} + \cdots \\
 &+ A_4 B_4 + A_8 B_8 + A_{12} B_{12} + A_{16} B_{16} + \cdots
\end{aligned}
\]
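The grouping into four partial sums can be reproduced with four accumulators, one per adder segment. This sketch (hypothetical function name) models only which products end up in which partial sum; it does not model the clock-by-clock movement through the multiplier and adder pipelines.

```python
def interleaved_inner_product(A, B, adder_segments=4):
    """Form the inner product as interleaved partial sums."""
    partial = [0] * adder_segments
    for i, (a, b) in enumerate(zip(A, B)):
        # Product i joins the running sum that started adder_segments
        # products earlier, so every fourth product shares a partial sum.
        partial[i % adder_segments] += a * b
    # The partial sums are added at the end to form the final sum.
    return sum(partial), partial


A = list(range(1, 17))  # A1 .. A16
B = [1] * 16
total, partial = interleaved_inner_product(A, B)
print(partial)  # [28, 32, 36, 40]
print(total)    # 136
```

With B set to all ones, the first partial sum is 1 + 5 + 9 + 13 = 28, which matches the A1B1 + A5B5 + A9B9 + A13B13 grouping above.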
Memory Interleaving

• Memory interleaving:
    • Simultaneous access to memory from two or more sources using one memory bus system
    • Select one of 4 memory modules using the lower 2 bits of the address register (AR)
    • Example: even/odd address memory access

[Figure: four memory modules, each with its own address register (AR), memory array and data register (DR), all connected to a common address bus and a common data bus]
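Module selection from the lower two address bits can be sketched as follows. The function name is hypothetical, and using the remaining upper bits as the location inside the module is an assumption; the slides only state how the module is chosen.

```python
def select_module(address):
    """Split an address into (module number, location within the module)."""
    module = address & 0b11   # lower 2 bits pick one of the 4 modules
    location = address >> 2   # remaining bits, assumed to index the module
    return module, location


print([select_module(a)[0] for a in range(8)])  # [0, 1, 2, 3, 0, 1, 2, 3]
print(select_module(13))                        # (1, 3)
```

Consecutive addresses fall in different modules, so a stream of sequential accesses can keep all four modules busy at once.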
Array Processor

• A processor that performs computations on large arrays of data.
    • Vector processing: uses an adder/multiplier pipeline
    • Array processing: uses a separate array processor

• There are two different types of array processor:
    • Attached array processor
    • SIMD array processor
Attached Array Processor

• It is designed as a peripheral for a conventional host computer.
• Its purpose is to enhance the performance of the computer by providing vector processing.
• It achieves high performance by means of parallel processing with multiple functional units.

[Figure: a general-purpose computer with its main memory connects through an input-output interface to the attached array processor with its local memory; a high-speed memory-to-memory bus links main memory and local memory]
SIMD Array Processor

• It is a processor that consists of multiple processing units operating in parallel.
• The processing units are synchronized to perform the same task under the control of a common control unit.
• Each processing element (PE) includes an ALU, a floating-point arithmetic unit and working registers.

[Figure: a master control unit, connected to main memory, drives processing elements PE 1 to PE n, each with its own local memory M1 to Mn]
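The single-instruction, multiple-data idea can be sketched as a control unit broadcasting one operation to every PE, each of which applies it to its own local memory. All names here are hypothetical, the local memories are modelled as dictionaries, and the loop stands in for PEs that would run the step at the same time in hardware.

```python
import operator


def broadcast(op, local_memories, src1, src2, dest):
    """Apply the same operation in every PE, each on its own local memory."""
    for memory in local_memories:  # conceptually, all PEs execute this step together
        memory[dest] = op(memory[src1], memory[src2])


# Four PEs, each holding one element of vectors a and b in local memory.
local_memories = [{"a": i, "b": 10 * i} for i in range(1, 5)]
broadcast(operator.add, local_memories, "a", "b", "c")
print([m["c"] for m in local_memories])  # [11, 22, 33, 44]
```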
