Slides Chapter 5 Basic Processing Unit
datapath
2. Doing some computation (in the ALU)
3. Accessing the memory
4. Writing a register (in the register file)
Processor’s building blocks
• PC provides instruction address
• Instruction is fetched into IR
• Instruction address generator updates PC
• ALU performs some computation during execution
• Control circuitry interprets instruction and generates
control signals to perform the actions needed.
A digital processing system
• datapath
A multi-stage digital processing system
• datapath
Why multi-stage?
• Processing moves from one stage to the next in
each clock cycle
• Such a multi-stage system is the basis for
pipelined operation
– High-performance processors have a pipelined
organization
– Pipelining enables the execution of successive
instructions to be overlapped
• We will return to pipelining later. Let’s now
focus on the basics of the multi-stage
architecture of a RISC-style processor
Instruction execution
• Pipelined organization is most effective if all
instructions can be executed in the same number of
steps.
• Each step is carried out in a separate hardware
stage.
• Processor design will be illustrated using five
hardware stages.
• How can instruction execution be divided into five
steps?
– Let’s start from some representative RISC instructions
A memory access instruction:
Load R5, X(R7)
1. Fetch the instruction and increment the
program counter.
2. Decode the instruction and read the contents
of register R7 in the register file.
3. Compute the effective address = X + [R7].
4. Read the memory source operand.
5. Load the operand into the destination
register, R5.
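The five steps of the Load instruction can be traced with a small Python sketch. The register file is modelled as a list and the memory as a dictionary; these representations, the function name execute_load, and the sample values are illustrative assumptions, not part of the slides. The local names ra, rz and ry are only loosely modelled on inter-stage registers.

```python
def execute_load(regs, memory, pc, dst, offset, base):
    """Trace the five steps of Load Rdst, offset(Rbase) (hypothetical helper)."""
    # Step 1: fetch the instruction (modelled as already fetched) and increment the PC.
    pc = pc + 4
    # Step 2: decode the instruction and read the base register.
    ra = regs[base]
    # Step 3: compute the effective address = X + [Rbase].
    rz = offset + ra
    # Step 4: read the memory source operand.
    ry = memory[rz]
    # Step 5: load the operand into the destination register.
    regs[dst] = ry
    return pc


# Usage: Load R5, 20(R7) with assumed contents [R7] = 1000 and memory[1020] = 99.
regs = [0] * 32
regs[7] = 1000
memory = {1020: 99}
new_pc = execute_load(regs, memory, pc=0, dst=5, offset=20, base=7)
```

After the call, regs[5] holds the word read from address 1020 and the PC has advanced by 4.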
A computational instruction:
Add R3, R4, R5
1. Fetch the instruction and increment the program
counter.
2. Decode the instruction and read registers
R4 and R5.
3. Compute the sum [R4] + [R5].
4. No action.
5. Load the result into the destination register, R3.
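As a minimal sketch, the same five-step pattern can be written out for the Add instruction. The list-based register file and the function name execute_add are assumptions made for illustration. Step 4 is modelled here as simply carrying the ALU result forward, which is one possible reading of "No action".

```python
def execute_add(regs, pc, dst, src1, src2):
    """Trace the five steps of Add Rdst, Rsrc1, Rsrc2 (hypothetical helper)."""
    # Step 1: fetch the instruction (modelled as already fetched) and increment the PC.
    pc = pc + 4
    # Step 2: decode the instruction and read the two source registers.
    ra, rb = regs[src1], regs[src2]
    # Step 3: compute the sum in the ALU.
    rz = ra + rb
    # Step 4: no memory access; the result is only carried forward (assumption).
    ry = rz
    # Step 5: load the result into the destination register.
    regs[dst] = ry
    return pc
```

The instruction still occupies five steps even though step 4 does no useful work, which keeps the step count uniform across instructions.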
Hardware components: Register file
• It may be implemented using a 2-port memory.
Hardware components: ALU (1)
• Both source operands
and the destination
location are in the
register file.
[RA] and [RB] denote values of registers that are identified by addresses A and B
new [RC] denotes the result that is stored to the register identified by address C
Hardware components: ALU (2)
• In this case, one of
the source
operands is the
immediate value
in the IR.
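Choosing between a register operand and the immediate value can be pictured as a two-input multiplexer in front of the ALU. The sketch below assumes an addition and a select flag named use_immediate; both names are hypothetical and do not come from the slides.

```python
def alu_add(ra, rb, immediate, use_immediate):
    """Add [RA] to either [RB] or the immediate value taken from the IR."""
    # Multiplexer: select the second ALU input.
    in_b = immediate if use_immediate else rb
    return ra + in_b
```

With use_immediate false the function behaves like the register-register case, and with it true the second register value is ignored.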
A 5-stage implementation of
a RISC processor
• Instruction processing
moves from stage to stage
in every clock cycle,
starting with fetch.
• …
• If a memory operation is
involved, it takes place in
stage 4.
• Register file, used in stages 2 and 5
– (Inter-stage registers RA, RB, RZ, RM, RY needed to carry data
from one stage to the next)
• ALU stage
• Memory stage
Generated by decoding the OPCODE field of the
instruction held in the IR register
Instruction
Format
R
I
ALU control signals
Generated by decoding the OPCODE field of the
instruction held in the IR register
Analyzed by the CONTROL CIRCUITRY during the
execution of a branch instruction
Result selection
Generated by decoding the OPCODE field of the
instruction held in the IR register
Memory access
• When data are found in the cache, access to
memory can be completed in one clock cycle.
• Otherwise, read and write operations may require
several clock cycles to load data from main memory
into the cache.
• A control signal is needed to indicate that memory
function has been completed (MFC). E.g., for step 1:
1. Memory address ← [PC], Read memory,
Wait for MFC,
IR ← Memory data, PC ← [PC] + 4
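The effect of waiting for MFC on step 1 can be sketched as follows. The cache and main memory are modelled as dictionaries, and the miss penalty of 10 cycles is an arbitrary assumed value chosen only to make the difference visible; the function name fetch is likewise hypothetical.

```python
def fetch(memory, cache, pc, miss_penalty=10):
    """Model step 1 and return (ir, new_pc, cycles_spent)."""
    cycles = 1  # a cache hit completes in one clock cycle
    if pc not in cache:
        # Cache miss: the processor waits for MFC while the word
        # is brought from main memory into the cache.
        cycles += miss_penalty
        cache[pc] = memory[pc]
    ir = cache[pc]       # IR <- Memory data
    return ir, pc + 4, cycles  # PC <- [PC] + 4
```

Fetching the same address twice shows both cases: the first call pays the assumed miss penalty, and the second completes in a single cycle.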
Memory and IR control signals
MuxY
Memory and IR control signals
E.g.
RF_write = T5&(ALU | Load | Call);
PC_enable = T1&MFC | T3&(BR | Ret | Call);
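The two equations above can be checked with a direct translation into Python boolean functions. The sketch assumes that & binds more tightly than |, and treats every signal as a plain boolean; the function and parameter names are chosen here for illustration.

```python
def rf_write(t5, alu, load, call):
    """Register file write: asserted in step 5 for ALU, Load and Call instructions."""
    return t5 and (alu or load or call)


def pc_enable(t1, t3, mfc, br, ret, call):
    """PC update: in step 1 once MFC arrives, or in step 3 for Br, Ret and Call."""
    return (t1 and mfc) or (t3 and (br or ret or call))
```

For example, pc_enable stays false during step 1 until mfc becomes true, which is how a slow memory access holds the PC unchanged.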
CISC processors
• CISC-style processors have more complex
instructions.
• The full set of instructions cannot all be
implemented in a fixed number of steps.
• Execution steps for different instructions do not
all follow a prescribed sequence of actions.
• Hardware organization should therefore enable
a flexible flow of data and actions to
accommodate CISC.
Hardware organization for a CISC computer
Main difference between 5-stage RISC organization
and CISC organization, where a datapath cannot
be identified easily
Hold temporary results during instruction execution
Bus
• An example of an interconnection network.
• When functional units are connected to a
common bus, tri-state drivers are needed.
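The role of tri-state drivers can be illustrated with a small model of a shared bus. Each unit is represented by an assumed (enable, value) pair, and a disabled driver is treated as high impedance, so it contributes nothing to the bus. The function name bus_value and the use of None for a floating bus are illustrative choices.

```python
def bus_value(drivers):
    """Return the value on a shared bus given a list of (enable, value) drivers."""
    # Only drivers whose enable signal is asserted place a value on the bus.
    active = [value for enable, value in drivers if enable]
    if len(active) > 1:
        # Two enabled drivers would fight over the bus lines.
        raise ValueError("bus contention: more than one driver enabled")
    # No enabled driver: the bus floats, modelled here as None.
    return active[0] if active else None
```

Calling bus_value([(False, 1), (True, 7), (False, 3)]) yields 7, because only the second driver is enabled.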
Register Enable
A 3-bus interconnection network
Example 1: Add R5, R6
1. Memory address ← [PC],
Read memory, Wait for
MFC, IR ← Memory data,
PC ← [PC] + 4
2. Decode instruction
3. R5 ← [R5] + [R6]
A 3-bus interconnection network
Example 2: And X(R7), R9
1. Memory address ← [PC], Read memory,
Wait for MFC,
IR ← Memory data,
PC ← [PC] + 4
2. Decode instruction
3. Memory address ← [PC], Read memory,
Wait for MFC,
Temp1 ← Memory data,
PC ← [PC] + 4
4. Temp2 ← [Temp1] + [R7]
5. Memory address ← [Temp2], Read
memory, Wait for MFC, Temp1 ← Memory
data
6. Temp1 ←[Temp1] AND [R9]
7. Memory address ← [Temp2], Memory data
← [Temp1], Write memory, Wait for MFC
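The seven steps of Example 2 can be followed in a short Python sketch. It assumes a word-addressed dictionary as memory, an instruction word stored as a tuple (opcode, base register, source register), and the offset X stored in the word that follows the instruction. These encodings and the function name execute_and_mem are hypothetical; waiting for MFC is not modelled.

```python
def execute_and_mem(regs, memory, pc):
    """Trace And X(Rbase), Rsrc: memory operand <- memory operand AND [Rsrc]."""
    # Step 1: fetch the instruction word and advance the PC.
    ir = memory[pc]
    pc += 4
    # Step 2: decode the instruction (assumed tuple encoding).
    _opcode, base, src = ir
    # Step 3: fetch the second word, which holds X, and advance the PC again.
    temp1 = memory[pc]
    pc += 4
    # Step 4: compute the effective address X + [Rbase].
    temp2 = temp1 + regs[base]
    # Step 5: read the memory operand.
    temp1 = memory[temp2]
    # Step 6: perform the AND operation.
    temp1 = temp1 & regs[src]
    # Step 7: write the result back to the same memory location.
    memory[temp2] = temp1
    return pc


# Usage with assumed contents: X = 20, [R7] = 100, [R9] = 0b1010.
regs = [0] * 16
regs[7] = 100
regs[9] = 0b1010
memory = {0: ("And", 7, 9), 4: 20, 120: 0b1100}
new_pc = execute_and_mem(regs, memory, 0)
```

The instruction needs two memory reads for its own words, one operand read and one operand write, which is why it does not fit a fixed five-step sequence.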