
UNIT 2

Control Unit Design


Computer Memories
o There are several memories in computer: internal (cache and main
memory); external (secondary memory).



Memory Hierarchy
o Classification: cache, main (primary) memory, secondary/auxiliary
memory.



Memory Hierarchy
o Memory speed increases from bottom to top of the hierarchy, while memory size
decreases in the same order. This is a trade-off between memory size and speed.

o In a computer, memories are interfaced to each other in order of their speed, i.e. the
fastest memory (cache) stays closest to the processor and the slowest (secondary
memory) stays farthest from it.



Memory Hierarchy
o Levels of Memory:

o Level 1 or Registers: This is the memory that holds the data the CPU is
currently operating on. The most commonly used registers are the
Accumulator, Program Counter, Address Register, etc.

o Level 2 or Cache memory: It is the fastest memory, with a faster
access time, where data is temporarily stored for quicker access.

o Level 3 or Main Memory: It is the memory on which the computer
works currently. It is small in size, and once power is off the data no
longer stays in this memory.

o Level 4 or Secondary Memory: It is external memory that is not as
fast as the main memory, but data stays permanently in this memory.



Memory Hierarchy and Cache Memory

Hierarchical memory is a hardware optimization that takes advantage of
spatial and temporal locality and can be applied at several levels of the
memory hierarchy.

Paging: Paging also benefits from temporal and spatial locality. A cache
is a simple example of exploiting temporal locality, because it is a specially
designed, faster but smaller memory area, generally used to keep recently
referenced data and data near recently referenced data, which can lead to
potential performance increases.
Memory Hierarchy and Cache Levels/Types
Typical memory hierarchy (access times and cache
sizes are approximate for the purpose of discussion):
1. CPU registers (8-256 registers) – immediate access,
with the speed of the innermost core of the processor.

2. L1 CPU caches (32 KB to 512 KB) – fast access, with
the speed of the innermost memory bus owned
exclusively by each core.

3. L2 CPU caches (128 KB to 24 MB) – slightly slower
access, with the speed of the memory bus shared
between pairs of cores.

4. L3 CPU caches (2 MB to 32 MB) – even slower
access, with the speed of the memory bus shared
between even more cores of the same processor.
Memory Hierarchy and Cache Levels/Types
5. Main physical memory (RAM) (256 MB to
64 GB) – slow access, the speed of which is
limited by the spatial distances and general
hardware interfaces between the processor and
the memory modules on the motherboard.

6. Disk (virtual memory, file system) (1 GB to
256 TB) – very slow, due to the narrower (in bit
width), physically much longer data channel
between the main board of the computer and the
disk devices, and due to the extraneous software
protocol needed on top of the slow hardware
interface.

7. Remote memory (other computers or the cloud)
(practically unlimited) – speed varies from very
slow to extremely slow.
Cache Memory (SRAM) and Mapping
v If the active portions of the program and data are placed in a fast small
memory such as the cache, the average memory access time will approach
the access time of the cache.

v Although the cache is only a small fraction of the size of main memory, a
large fraction of memory requests will be found in the fast cache memory
because of the locality of reference of programs.

v Locality of reference refers to a phenomenon in which a computer program
tends to access the same set of memory locations for a particular time period.

v In particular, spatial locality of reference refers to the tendency of the
computer program to access instructions whose addresses are near one
another.
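
As a small illustration (not from the slides), the Python loop below shows both kinds of locality: the running total is reused on every iteration (temporal locality), while the list elements are accessed at consecutive addresses (spatial locality).

# Illustrative sketch only: a loop exhibiting temporal and spatial locality.
data = list(range(1024))

total = 0                      # 'total' is reused every iteration -> temporal locality
for i in range(len(data)):     # data[0], data[1], ... are adjacent -> spatial locality
    total += data[i]
print(total)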
Cache Memory Working
v When the CPU needs to access memory, the cache is examined first. If
the word is found in the cache, it is read from the cache memory.

v If the word addressed by the CPU is not found in the cache, the main
memory is accessed to read the word.

v A block of words containing the one just accessed is then transferred


from main memory to cache memory.

v The block size may vary from one word (the one just accessed) to about 16
words adjacent to the one just accessed.

v In this manner, some data are transferred to cache so that future references
to memory find the required words in the fast cache memory.
Cache Memory: Hit Ratio
v The performance of cache memory is frequently measured in
terms of a quantity called the hit ratio.

v When the CPU refers to memory and finds the word in


cache, it is said to produce a hit (or cache hit).

v If the word is not found in cache, it is in main memory and it


counts as a miss (or cache miss).

v Hit ratio = number of hits / (number of hits + number of misses)
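
As a hedged sketch, the hit ratio and the resulting average memory access time can be computed as below; the hit/miss counts and access times are assumed figures for illustration and are not taken from the slides.

# Hedged sketch: hit ratio and average memory access time for assumed figures.
hits, misses = 970, 30                      # assumed counts, for illustration only
hit_ratio = hits / (hits + misses)          # = 0.97

t_cache, t_main = 10, 100                   # assumed access times in ns
avg_access_time = hit_ratio * t_cache + (1 - hit_ratio) * t_main
print(f"hit ratio = {hit_ratio:.2f}, average access time = {avg_access_time:.1f} ns")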


Memory Mapping and Types
v Cache mapping refers to a process/scheme by which content that is
present in the main memory is brought into the cache memory.

v So, basically, the transfer of data from main memory to cache


memory is referred to as a mapping process.

v Three types of mapping procedures are of practical interest


when considering the organization of cache memory:
§ Direct mapping
§ Associative mapping
§ Set-associative mapping
1. Associative Mapping:

•The fastest and most flexible cache
organization uses an associative memory.
•It stores both the address and the content
(data) of the memory word.
•Any location in the cache can store any
word from main memory.
•A CPU address of 15 bits is placed in the
argument register and the associative
memory is searched for a matching
address.
•If a match is found, the data is read from the cache;
otherwise the word is read from main memory and the
address-data pair is stored in the cache.
•If the cache is full, a pair must be displaced, e.g. using
the FIFO replacement algorithm.
•Because an associative memory is used, this cache
organization is expensive.
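
A minimal Python sketch of this organization is given below, assuming a small cache capacity and using a dictionary to stand in for the associative search; the class name and interface are illustrative, not a reference design.

from collections import OrderedDict

# Hedged sketch of a fully associative cache with FIFO replacement.
class AssociativeCache:
    def __init__(self, size=8):
        self.size = size
        self.lines = OrderedDict()           # address -> data; insertion order = FIFO order

    def read(self, address, main_memory):
        if address in self.lines:            # associative search over all stored addresses
            return self.lines[address], "hit"
        data = main_memory[address]          # miss: read the word from main memory
        if len(self.lines) >= self.size:
            self.lines.popitem(last=False)   # cache full: evict the oldest address-data pair
        self.lines[address] = data
        return data, "miss"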
2. Direct Mapping:
A random-access memory is used to implement the cache.
The CPU address of 15 bits is divided into two fields:
the nine least significant bits constitute the index field and the
remaining six bits form the tag field.
To access main memory, we need an address that includes both the tag and the
index bits.
The number of bits in the index field is equal to the number of address bits
required to access the cache memory.
•Each word in cache consists of the data word and its associated tag.

•When a new word is first brought into the cache, the tag bits are stored
alongside the data bits.

•When the CPU generates a memory request, the index field is used as
the address to access the cache, and the tag field of the CPU address is
compared with the tag in the word read from the cache.

•If the two tags match, there is a hit and the desired data word is in
cache.
•If there is no match, there is a miss and the required word is read from
main memory.

•It is then stored in the cache together with the new tag, replacing the
previous value.
•In this example, main memory is divided into 2^6 blocks of 2^9 words each.
•For a given index, one word from each block maps to the same cache location,
and only one of them can be kept in the cache at a time.
In general:
if there are n bits in the main memory address and k
bits in the cache memory address, then main memory is divided into
2^(n-k) blocks and the size of each block is 2^k words.
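
The address split described above can be sketched in Python as follows, using the slide's figures (15-bit address, 9-bit index, 6-bit tag); the function and variable names are illustrative assumptions.

# Hedged sketch of direct mapping: 9-bit index selects one of 512 cache words,
# the remaining 6 bits are stored as the tag.
INDEX_BITS = 9
CACHE_WORDS = 1 << INDEX_BITS               # 512 words of cache

cache = [None] * CACHE_WORDS                # each entry holds a (tag, data) pair

def read(address, main_memory):
    index = address & (CACHE_WORDS - 1)     # low 9 bits select the cache word
    tag = address >> INDEX_BITS             # remaining 6 bits form the tag
    entry = cache[index]
    if entry is not None and entry[0] == tag:
        return entry[1], "hit"              # tags match: the desired word is in cache
    data = main_memory[address]             # miss: read the word from main memory ...
    cache[index] = (tag, data)              # ... and replace the previous (tag, data) pair
    return data, "miss"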
Disadvantage of Direct Mapping:

•The disadvantage of direct mapping is that the hit ratio can drop
considerably if two or more words whose addresses have the
same index but different tags are accessed repeatedly.
•However, the possibility is low, as such words are far away from
each other in main memory.
•Two words with the same index in their address but with different tag
values cannot reside in cache memory at the same time.
Example: a read from address 02000 (illustrated in the accompanying figure).
3. Set Associative Mapping

•In set-associative mapping, each word of cache can store two or more words of
memory under the same index address.
•Each data word is stored together with its tag, and the number of tag-data
items in one word of cache is said to form a set.

Two-way set associative mapping cache:

•Each index address refers to two data words and their associated tags.
•Each tag requires six bits and each data word has 12 bits,
so the word length is 2(6 + 12) = 36 bits.
•An index address of nine bits can accommodate 512 words. Thus the size of
cache memory is 512 x 36.
•It can accommodate 1024 words of main memory, since each word of cache
contains two data words.
•In general, a set-associative cache of set size k will accommodate k words of
main memory in each word of cache.
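
A hedged two-way set-associative sketch is shown below, matching the slide's figures (9-bit index, two tag-data items per set); the replacement choice inside a full set is FIFO here purely for brevity, and all names are illustrative.

# Hedged sketch of a two-way set-associative cache.
INDEX_BITS, WAYS = 9, 2
SETS = 1 << INDEX_BITS                      # 512 sets

cache = [[] for _ in range(SETS)]           # each set holds up to WAYS (tag, data) items

def read(address, main_memory):
    index = address & (SETS - 1)
    tag = address >> INDEX_BITS
    cache_set = cache[index]
    for stored_tag, data in cache_set:      # associative search within the set
        if stored_tag == tag:
            return data, "hit"
    data = main_memory[address]             # miss: read the word from main memory
    if len(cache_set) >= WAYS:
        cache_set.pop(0)                    # set full: replace one tag-data item (FIFO here)
    cache_set.append((tag, data))
    return data, "miss"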
Other Considerations

•The comparison logic is done by an associative search of the tags in the set
similar to an associative memory search: thus the name "set-associative."
•The hit ratio will improve as the set size increases because more words with
the same index but different tags can reside in cache.
•However, an increase in the set size increases the number of bits in words of
cache and requires more complex comparison logic.
•When a miss occurs in a set-associative cache and the set is full, it is
necessary to replace one of the tag-data items with a new value.
Replacement Algorithms

•The most common replacement algorithms used are:


• Random replacement
• First-in, first out (FIFO)
• Least recently used (LRU).
•With the random replacement policy the control chooses one tag-data
item for replacement at random.
•The FIFO procedure selects for replacement the item that has been in the
set the longest.
•The LRU algorithm selects for replacement the item that has been least
recently used by the CPU, i.e. the item that has remained unused for a
longer period of time than the others.
•Both FIFO and LRU can be implemented by adding a few extra bits in each
word of cache.

•The purpose of a replacement policy is to reduce cache misses and increase cache hits.


Numerical example: tracing a reference string under FIFO replacement and
under LRU replacement (LRU treats the entry that is oldest from the access
point of view as the victim); a small simulation sketch follows.
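
A hedged simulation of the two policies on an assumed reference string of block numbers (cache capacity of 3 blocks) is given below; the reference string and capacity are illustrative, not the numbers used in the lecture.

# Hedged sketch: count hits under FIFO and LRU for an assumed reference string.
def simulate(references, capacity, policy):
    cache, hits = [], 0
    for block in references:
        if block in cache:
            hits += 1
            if policy == "LRU":
                cache.remove(block)         # move the block to the most-recent position
                cache.append(block)
        else:
            if len(cache) >= capacity:
                cache.pop(0)                # evict: oldest arrival (FIFO) or least recently used (LRU)
            cache.append(block)
    return hits

refs = [1, 2, 3, 1, 4, 1, 2, 5, 1, 2]       # assumed reference string
print("FIFO hits:", simulate(refs, 3, "FIFO"))   # 3 hits
print("LRU hits:", simulate(refs, 3, "LRU"))     # 4 hits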
Pipelining
in Processor
Why use the Array Processor
• Array processors increase the overall instruction processing speed.
• As most array processors operate asynchronously from the host CPU,
they improve the overall capacity of the system.
• Array processors have their own local memory, hence providing extra memory
for systems with low memory.
SIMD Array Processors
SIMD array processors operate on a single instruction stream and multiple data
streams.
Pipelining
in Processor
Flynn's Classification of Computers

v Single Instruction stream and Single Data stream (SISD)
v Single Instruction stream and Multiple Data stream (SIMD)
v Multiple Instruction stream and Single Data stream (MISD)
v Multiple Instruction stream and Multiple Data stream (MIMD)
Flynn's Classification of Computers
Arithmetic Pipeline:
An arithmetic pipeline divides an arithmetic operation into sub-operations for
execution in various pipeline segments. It is used for floating point operations,
multiplication and various other computations.

4-Segment/sub-operations
Pipeline:
1. Compare the exponents
2. Align mantissas and choose
exponent.
3. Add or Subtract the mantissas
4. Normalise the result.
Arithmetic Pipeline:
§ First of all, the two exponents are compared and the larger of the two
is chosen as the result exponent.

§ The difference between the exponents then decides how many times the
mantissa associated with the smaller exponent must be shifted to the right.

§ After this shifting, both mantissas are aligned.

§ Finally the addition of the two mantissas takes place, followed by
normalisation of the result in the last segment.
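
The four sub-operations can be sketched in Python as below, using decimal (mantissa, exponent) pairs purely for illustration; the function name and the example values are assumptions, and only the carry-out case of normalisation is handled.

# Hedged sketch of the four floating-point addition sub-operations.
def fp_add(x, y):
    (m1, e1), (m2, e2) = x, y
    # 1. Compare the exponents and choose the larger one as the result exponent.
    exponent = max(e1, e2)
    # 2. Align the mantissas: shift the mantissa of the smaller-exponent operand right.
    m1 /= 10 ** (exponent - e1)
    m2 /= 10 ** (exponent - e2)
    # 3. Add (or subtract) the mantissas.
    mantissa = m1 + m2
    # 4. Normalise the result (only the carry-out case is handled in this sketch).
    while abs(mantissa) >= 1:
        mantissa /= 10
        exponent += 1
    return mantissa, exponent

print(fp_add((0.9504, 3), (0.8200, 2)))     # 950.4 + 82.0 -> (0.10324, 4), i.e. 1032.4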
Instruction Pipeline :
§ In an instruction pipeline, a stream of instructions can be executed by overlapping
the fetch, decode and execute phases of an instruction cycle.
§ This type of technique is used to increase the throughput of the computer
system.
§ An instruction pipeline reads instruction from the memory while previous
instructions are being executed in other segments of the pipeline.
§ Thus we can execute multiple instructions simultaneously.
§ The pipeline will be more efficient if the instruction cycle is divided into
segments of equal duration.

In the most general case, the computer needs to process each instruction in the following
sequence of steps:

• Fetch the instruction from memory (FI)


• Decode the instruction (DA)
• Calculate the effective address
• Fetch the operands from memory (FO)
• Execute the instruction (EX)
• Store the result in the memory.
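
As a hedged illustration of the overlap, the short script below prints a space-time table for a 4-segment pipeline (FI, DA, FO, EX) processing five instructions with no stalls; the segment names follow the list above, while the instruction count is an assumption.

# Hedged sketch: space-time diagram of a 4-segment instruction pipeline.
SEGMENTS = ["FI", "DA", "FO", "EX"]
NUM_INSTRUCTIONS = 5

for i in range(NUM_INSTRUCTIONS):
    row = ["  "] * (NUM_INSTRUCTIONS + len(SEGMENTS) - 1)
    for s, name in enumerate(SEGMENTS):
        row[i + s] = name                   # instruction i enters segment s at clock cycle i + s
    print(f"I{i + 1}: " + " ".join(row))
# With k = 4 segments and n = 5 instructions, the last instruction
# completes after k + n - 1 = 8 clock cycles instead of 4 * 5 = 20.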
Pipeline Hazards/Issues:
1. Resource conflicts caused by access to memory by
two segments at the same time. Most of these conflicts
can be resolved by using separate instruction and data
memories.
2. Data dependency conflicts arise when an instruction
depends on the result of a previous instruction, but this
result is not yet available.
3. Branch difficulties arise from branch and other
instructions that change the value of PC.
Branching Instruction in Pipeline:
Pipeline conflict:
Data Dependency:

A collision occurs when an instruction cannot proceed because


previous instructions did not complete certain operations. A
data dependency occurs when an instruction needs data that
are not yet available.
e.g.
an instruction in the FO segment may need to fetch an
operand that is being generated at the same time by the
previous instruction in segment EX. Therefore, the second
instruction must wait for the data to become available from the first
instruction, which delays the operation.
Dealing with data dependency:

1. Hardware Interlock: The most straightforward method is to insert


hardware interlocks. An interlock is a circuit that detects instructions
whose source operands are destinations of instructions farther up in the
pipeline. Detection of this situation causes the instruction whose source
is not available to be delayed by enough clock cycles to resolve the
conflict. This approach maintains the program sequence by using
hardware to insert the required delays.
2. Operand Forwarding:
This uses special hardware to detect a conflict and then avoid it by
routing the data through special paths between pipeline segments. For
example, instead of transferring an ALU result into a destination
register, the hardware checks the destination operand, and if it is
needed as a source in the next instruction, it passes the result directly
into the ALU input, bypassing the register file. This method requires
additional hardware paths through multiplexers as well as the circuit
that detects the conflict.
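
A hedged sketch of the forwarding check is shown below: if the destination register of the instruction currently in EX is needed as a source by the next instruction, the ALU result is routed straight to the ALU input instead of being read from the register file. The instruction format and register names are illustrative assumptions.

# Hedged sketch of operand forwarding (register bypass) between pipeline segments.
def alu_input(source_reg, registers, ex_dest_reg, ex_alu_result):
    if source_reg == ex_dest_reg:           # conflict detected by the forwarding logic
        return ex_alu_result                # bypass path: take the result directly from EX
    return registers[source_reg]            # no conflict: read the register file normally

registers = {"R1": 0, "R2": 7, "R3": 5}
# Previous instruction in EX: ADD R1, R2, R3 (result 12, not yet written back to R1).
value = alu_input("R1", registers, ex_dest_reg="R1", ex_alu_result=12)
print(value)                                # 12, taken from the bypass path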
3. Delayed Load:
A procedure employed in some computers is to give the responsibility
for solving data conflict problems to the compiler that translates the
high-level programming language into a machine language program. The
compiler for such computers is designed to detect a data conflict and
reorder the instructions as necessary to delay the loading of the
conflicting data by inserting no-operation instructions. This
method is referred to as delayed load.
Handling of Branch Instructions

1. Prefetch Target Instruction:


One way of handling a conditional branch is to prefetch the target
instruction in addition to the instruction following the branch. Both are
saved until the branch is executed. If the branch condition is successful,
the pipeline continues from the branch target instruction. An extension
of this procedure is to continue fetching instructions from both places
until the branch decision is made. At that time control chooses the
instruction stream of the correct program flow.
2. Branch target Buffer:
Another possibility is the use of a branch target buffer or BTB. The BTB is
an associative memory included in the fetch segment of the pipeline. Each
entry in the BTB consists of the address of a previously executed branch
instruction and the target instruction for that branch. It also stores the
next few instructions after the branch target instruction. When the
pipeline decodes a branch instruction, it searches the associative memory
BTB for the address of the instruction. If it is in the BTB, the instruction is
available directly and prefetch continues from the new path. If the
instruction is not in the BTB, the pipeline shifts to a new instruction
stream and stores the target instruction in the BTB. The advantage of this
scheme is that branch instructions that have occurred previously are
readily available in the pipeline without interruption.
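
A hedged sketch of the BTB as an associative lookup is given below: it maps the address of a previously executed branch to its target so that prefetching can continue from the new path without interruption. The dictionary-based structure and the names are illustrative assumptions.

# Hedged sketch of a branch target buffer (BTB) lookup in the fetch segment.
btb = {}                                    # branch instruction address -> branch target address

def next_fetch_address(branch_address, resolve_target):
    if branch_address in btb:               # branch seen before: target is available directly
        return btb[branch_address]
    target = resolve_target(branch_address) # not in the BTB: determine the target instruction
    btb[branch_address] = target            # store it so later occurrences hit in the BTB
    return target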
3. Loop Buffer: A variation of the BTB is the loop buffer. This
is a small very high speed register file maintained by the
instruction fetch segment of the pipeline. When a program
loop is detected in the program, it is stored in the loop
buffer in its entirety, including all branches. The program
loop can be executed directly without having to access
memory until the loop mode is removed by the final
branching out.
4.Branch Prediction:
Another procedure that some computers use is branch
prediction . A pipeline with branch prediction uses some
additional logic to guess the outcome of a conditional
branch instruction before it is executed. The pipeline then
begins prefetching the instruction stream from the
predicted path. A correct prediction eliminates the
wasted time caused by branch penalties.
5. Delayed Branching:
A procedure employed in most RISC processors is the delayed branch. In
this procedure, the compiler detects the branch instructions and
rearranges the machine language code sequence by inserting useful
instructions that keep the pipeline operating without interruptions. An
example of delayed branch is the insertion of a no-operation instruction
after a branch instruction. This causes the computer to fetch the target
instruction during the execution of the no operation instruction, allowing
a continuous flow of the pipeline.
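
A hedged sketch of the simplest form of this transformation is shown below: a no-operation instruction is inserted into the slot after every branch, so the target instruction can be fetched while the NOP executes. A real compiler would instead try to move a useful instruction into that slot; the instruction strings here are illustrative, not real machine code.

# Hedged sketch: fill each branch delay slot with a no-operation instruction.
def fill_delay_slots(program):
    out = []
    for instruction in program:
        out.append(instruction)
        if instruction.startswith(("BRANCH", "JUMP")):
            out.append("NOP")               # target instruction is fetched while NOP executes
    return out

print(fill_delay_slots(["LOAD R1", "ADD R2, R1", "BRANCH L1", "STORE R2"]))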
