
Advanced Concepts in Cache Memory - 1: Lecture 4F

The document discusses advanced concepts in cache memory, focusing on multilevel caches and optimization techniques for improving average memory access time (AMAT). It includes calculations for AMAT based on different cache configurations and explores the impact of cache block size on performance. Additionally, it details cache mapping by the operating system and analyzes cache misses and evictions during program execution.


Multi-Core Computer Architecture

Lecture 4F
Advanced Concepts in Cache Memory -1

John Jose
Associate Professor
Department of Computer Science & Engineering
Indian Institute of Technology Guwahati
Multilevel Caches
Assume a 2-level cache system with the following
specifications: L1 hit time = 1 cycle, L1 miss rate = 2.5%, L2
hit time = 6 cycles, L2 miss rate = 17% (local miss rate, i.e. the
percentage of L1 misses that also miss in L2), L2 miss penalty =
120 cycles. Compute the average memory access time (AMAT).

AMAT = HT1 + MR1 x MP1
     = HT1 + MR1 x (HT2 + MR2 x MP2)
     = 1 + 0.025 x (6 + 0.17 x 120) = 1.66 cycles
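
The two-level formula above can be checked with a few lines of Python. This is a minimal sketch; the function name `amat_two_level` and its parameter names are illustrative choices, not part of the lecture.

```python
def amat_two_level(ht1, mr1, ht2, mr2_local, mp2):
    """AMAT in cycles for a two-level cache.

    mr2_local is the local L2 miss rate: the fraction of L1 misses
    that also miss in L2.
    """
    # The L1 miss penalty is the average time to service a miss from L2.
    l1_miss_penalty = ht2 + mr2_local * mp2
    return ht1 + mr1 * l1_miss_penalty


amat = amat_two_level(ht1=1, mr1=0.025, ht2=6, mr2_local=0.17, mp2=120)
print(f"AMAT = {amat:.2f} cycles")
```

Writing the L1 miss penalty as its own variable mirrors the substitution step in the derivation: the L1 miss penalty is itself the AMAT of the L2 cache.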
Optimization
A cache has an access time (hit latency) of 10 ns and a miss rate of 5%.
An optimization reduces the miss rate to 3% but increases the hit
latency to 15 ns. Under what condition will this change result in
better performance (lower AMAT)?
AMAT1 = HT1 + MR1 x MP, with HT1 = 10 ns, MR1 = 0.05
AMAT2 = HT2 + MR2 x MP, with HT2 = 15 ns, MR2 = 0.03

The change helps when AMAT2 < AMAT1:
15 + 0.03 x MP < 10 + 0.05 x MP
5 < 0.02 x MP, so MP > 250 ns
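
The break-even miss penalty can also be computed directly and checked on either side of the threshold. The sketch below assumes the single-level AMAT model used in this problem; the helper names `amat` and `break_even_penalty` are hypothetical.

```python
def amat(hit_time_ns, miss_rate, miss_penalty_ns):
    """Single-level AMAT in ns."""
    return hit_time_ns + miss_rate * miss_penalty_ns


def break_even_penalty(ht_old, mr_old, ht_new, mr_new):
    """Miss penalty at which both designs have the same AMAT.

    Solves ht_new + mr_new * MP = ht_old + mr_old * MP for MP.
    """
    return (ht_new - ht_old) / (mr_old - mr_new)


mp_star = break_even_penalty(ht_old=10, mr_old=0.05, ht_new=15, mr_new=0.03)
print(f"Break-even miss penalty = {mp_star:.0f} ns")

# Compare the two designs below and above the break-even point.
for mp in (200, 300):
    print(mp, amat(10, 0.05, mp), amat(15, 0.03, mp))
```

Below the break-even penalty the extra 5 ns of hit latency dominates; above it the lower miss rate pays off.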


Optimization
A cache has a hit rate of 95%, a block size of 128 B and a hit latency of
5 ns. Main memory takes 50 ns to return the first word (32 bits) of a block
and 10 ns for each subsequent word.
(a) What is the miss latency of the cache?
(b) If doubling the cache block size reduces the miss rate to 3%, does
it reduce AMAT?
Hit rate = 0.95; block size = 128 B; hit time = 5 ns; 1 word = 4 B (32 bits)
(a) Words per block = 128 B / 4 B = 32
    MP = 50 + (31 x 10) = 360 ns
    AMAT1 = 5 + 0.05 x 360 = 23 ns
(b) Words per block = 256 B / 4 B = 64
    MP = 50 + (63 x 10) = 680 ns
    AMAT2 = 5 + 0.03 x 680 = 25.4 ns
Doubling the block size does not reduce AMAT.
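
The same calculation can be parameterized by block size. This is a sketch under the problem's memory timing model (50 ns for the first word, 10 ns for each later word); the function names and default arguments are illustrative.

```python
WORD_BYTES = 4  # one word is 32 bits


def miss_penalty_ns(block_bytes, first_word_ns=50, next_word_ns=10):
    """Time to fetch a whole block from main memory, word by word."""
    words = block_bytes // WORD_BYTES
    return first_word_ns + (words - 1) * next_word_ns


def amat_ns(hit_ns, miss_rate, block_bytes):
    """Single-level AMAT where the miss penalty depends on block size."""
    return hit_ns + miss_rate * miss_penalty_ns(block_bytes)


for block_bytes, miss_rate in [(128, 0.05), (256, 0.03)]:
    mp = miss_penalty_ns(block_bytes)
    print(f"{block_bytes} B blocks: MP = {mp} ns, "
          f"AMAT = {amat_ns(5, miss_rate, block_bytes):.1f} ns")
```

The larger block lowers the miss rate, but each miss transfers twice as many words, so the miss penalty grows faster than the miss rate shrinks.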
Cache Mapping by OS
A 16 KB direct-mapped unified cache with 256 B blocks is attached to a
16 MB main memory system. The word length as well as the instruction
length of the processor is 16 bits. Consider a program that consists of
a main routine M, which in turn calls a subroutine S. M consists of 12
instruction words, which are loaded in main memory from address
0x4230FA onwards. The last five instructions of M form a loop
that is iterated 10 times. The second instruction in the loop is a call to
subroutine S. S consists of 4 instruction words loaded in main
memory from address 0x70F168. The last instruction of S is a
subroutine return back to M. The only two data words used
by M and S are at addresses 0x748074 and 0x846064. Assume the
cache is initially empty. Ignore OS-level interrupts and the
cache impact of context switching.
Cache Mapping by OS
Address fields (24-bit address): Tag = 10 bits, Index = 6 bits, Offset = 8 bits

Main routine M (instructions 8 to 12 form the loop, iterated 10 times; instruction 9 calls S)
Instr.  Address    Cache block
1       0x4230FA   48
2       0x4230FC   48
3       0x4230FE   48
4       0x423100   49
5       0x423102   49
6       0x423104   49
7       0x423106   49
8       0x423108   49
9       0x42310A   49
10      0x42310C   49
11      0x42310E   49
12      0x423110   49

Subroutine S
Instr.  Address    Cache block
1       0x70F168   49
2       0x70F16A   49
3       0x70F16C   49
4       0x70F16E   49

Data words
Word    Address    Cache block
1       0x748074   0
2       0x846064   32
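
The tag/index/offset split follows from the cache geometry: 16 KB / 256 B = 64 blocks gives 6 index bits, a 256 B block gives 8 offset bits, and a 16 MB memory gives 24 address bits, leaving 10 tag bits. The sketch below recomputes the fields for the problem's addresses; it assumes byte addressing (consecutive 16-bit instructions are 2 apart), and the constant and function names are illustrative.

```python
CACHE_BYTES = 16 * 1024
BLOCK_BYTES = 256
ADDRESS_BITS = 24  # 16 MB of byte-addressed main memory

NUM_BLOCKS = CACHE_BYTES // BLOCK_BYTES             # 64 cache blocks
OFFSET_BITS = BLOCK_BYTES.bit_length() - 1          # log2(256) = 8
INDEX_BITS = NUM_BLOCKS.bit_length() - 1            # log2(64) = 6
TAG_BITS = ADDRESS_BITS - INDEX_BITS - OFFSET_BITS  # 10


def split_address(addr):
    """Return (tag, index, offset) for a direct-mapped cache."""
    offset = addr & (BLOCK_BYTES - 1)
    index = (addr >> OFFSET_BITS) & (NUM_BLOCKS - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset


for addr in (0x4230FA, 0x423100, 0x70F168, 0x748074, 0x846064):
    tag, index, offset = split_address(addr)
    print(f"0x{addr:06X}: tag=0x{tag:03X} index={index} offset=0x{offset:02X}")
```

The first three instructions of M fall in block 48 and the rest of M in block 49, which is also where S maps; that shared block is the source of the conflict misses in the loop.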
Cache Mapping by OS
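Find the number of cache misses that occur during the execution of
the program.
M1 misses (block 48)
M4 misses (block 49)
S1, M10, S1, M10, ... (10 times): M and S conflict in block 49
Total = 2 + 2 x 10 = 22 misses
How many cache block evictions happen during the execution of
the program?
20 evictions (each S1 and M10 miss in the loop replaces the other routine's contents in block 49)
List the block numbers (in decimal) in the cache that are
non-empty after the execution of the program.
Blocks 0 (data word 0x748074), 32 (data word 0x846064), 48 and 49 (instructions); all other blocks are empty.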
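
The miss and eviction counts can be reproduced with a small direct-mapped cache simulation. This is a sketch under stated assumptions: it replays instruction fetches only (the slides do not say which instructions access the two data words), it takes the loop to be M8 to M12 with the call at M9, and the names `fetch_trace` and `simulate` are hypothetical.

```python
def fetch_trace(iterations=10):
    """Instruction fetch addresses in execution order."""
    m = [0x4230FA + 2 * i for i in range(12)]  # M1..M12, 16-bit instructions
    s = [0x70F168 + 2 * i for i in range(4)]   # S1..S4
    trace = m[:7]                # M1..M7 execute once, before the loop
    for _ in range(iterations):
        trace += m[7:9]          # M8, M9 (M9 is the call to S)
        trace += s               # S1..S4 (S4 returns to M)
        trace += m[9:12]         # M10..M12
    return trace


def simulate(trace, offset_bits=8, index_bits=6):
    """Count misses and evictions in an initially empty direct-mapped cache."""
    lines = {}  # cache block index -> tag currently stored there
    misses = evictions = 0
    for addr in trace:
        index = (addr >> offset_bits) & ((1 << index_bits) - 1)
        tag = addr >> (offset_bits + index_bits)
        if lines.get(index) != tag:
            misses += 1
            if index in lines:   # a valid block is being replaced
                evictions += 1
            lines[index] = tag
    return misses, evictions, sorted(lines)


misses, evictions, used = simulate(fetch_trace())
print(misses, evictions, used)
```

The two data words map to blocks that no instruction uses, so including them in the trace would add compulsory misses and occupied blocks but no further evictions.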
