Cache Memory
Cache memory is a small, fast memory that reduces the average time taken by the CPU to access data from the main
memory.
Location-
Cache memory lies on the path between the CPU and the main memory.
It facilitates the transfer of data between the processor and the main memory at a speed
that matches the speed of the processor.
Data is transferred in the form of words between the cache memory and the CPU.
Data is transferred in the form of blocks or pages between the cache memory and the main
memory.
Purpose-
Execution Of Program-
Whenever any program has to be executed, it is first loaded in the main memory.
The portion of the program that is most probably going to be executed in the near future is
kept in the cache memory.
This allows the CPU to access that portion at a faster speed.
Step-01:
Whenever the CPU requires any word of memory, it is first searched for in the CPU registers.
Now, there are two cases possible-
Case-01:
If the required word is found in the CPU registers, it is read from there.
Case-02:
If the required word is not found in the CPU registers, Step-02 is followed.
Step-02:
When the required word is not found in the CPU registers, it is searched for in the cache
memory.
The tag directory of the cache memory is used to check whether the required word is present in
the cache memory or not.
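The two-step search above can be sketched in Python. This is a simplified software model, not a hardware description; the register file, tag directory, and address split used here are illustrative assumptions:

```python
def read_word(address, registers, cache_tag_directory, cache_data, main_memory,
              line_bits, offset_bits):
    """Model the CPU's search order: registers first, then cache, then main memory."""
    # Step-01: check the CPU registers.
    if address in registers:
        return registers[address]          # found in a register: fastest case

    # Step-02: search the cache using its tag directory.
    line = (address >> offset_bits) & ((1 << line_bits) - 1)
    tag = address >> (offset_bits + line_bits)
    if cache_tag_directory.get(line) == tag:
        return cache_data[line]            # cache hit
    return main_memory[address]            # cache miss: go to main memory
```

In this model, a cache hit means the tag stored for the addressed line equals the tag bits of the generated address.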
Example-
A three-level cache organization consists of three cache memories of different sizes organized at three
different levels as shown below-
Size (L1 Cache) < Size (L2 Cache) < Size (L3 Cache) < Size (Main Memory)
Cache Mapping-
Cache mapping defines how a block from the main memory is mapped to the cache memory
in case of a cache miss.
OR
Cache mapping is a technique by which the contents of main memory are brought into the
cache memory.
NOTES
Main memory is divided into equal-size partitions called blocks or frames.
Cache memory is divided into partitions of the same size as the blocks, called lines.
During cache mapping, a block of main memory is simply copied to the cache; the block is not
actually removed from the main memory.
1. Direct Mapping
2. Fully Associative Mapping
3. K-way Set Associative Mapping
1. Direct Mapping-
In direct mapping,
A particular block of main memory can map only to a particular line of the cache.
The line number of the cache to which a particular block can map is given by-
Cache line number = (Main memory block address) Modulo (Number of lines in cache)
Step-01:
Each multiplexer reads the line number from the generated physical address using its select
lines in parallel.
To read the line number of L bits, number of select lines each multiplexer must have = L.
Step-02:
After reading the line number, each multiplexer goes to the corresponding line in the cache
memory using its input lines in parallel.
Number of input lines each multiplexer must have = Number of lines in the cache memory
Step-03:
Each multiplexer outputs the tag bit it has selected from that line to the comparator using its
output line.
Number of output lines in each multiplexer = 1.
UNDERSTAND
It is important to understand-
A multiplexer can output only a single bit on its output line.
So, to output the complete tag to the comparator,
Number of multiplexers required = Number of bits in the tag
Each multiplexer is configured to read the tag bit at a specific location.
Step-04:
Comparator compares the tag coming from the multiplexers with the tag of the generated
address.
Only one comparator is required for the comparison where-
Size of comparator = Number of bits in the tag
If the two tags match, a cache hit occurs otherwise a cache miss occurs.
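The multiplexer-and-comparator datapath of Steps 01 to 04 can be modelled in software. This is a behavioural sketch only; the tag store contents are made-up example values:

```python
def tag_matches(tag_store, line_number, address_tag, tag_bits):
    """Model one multiplexer per tag bit plus a single comparator."""
    stored_tag = tag_store[line_number]
    # One multiplexer per tag bit: each selects its bit of the tag stored
    # at the addressed line (all muxes share the same select lines).
    selected_bits = [(stored_tag >> i) & 1 for i in range(tag_bits)]
    # The single comparator checks every tag bit against the address tag.
    address_bits = [(address_tag >> i) & 1 for i in range(tag_bits)]
    return selected_bits == address_bits   # True = cache hit, False = cache miss
```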
Hit latency-
The time taken to determine whether the required word is present in the cache memory or
not is called the hit latency.
Following are a few important results for direct mapped cache-
Block j of main memory can map to line number (j mod number of lines in cache) only of the
cache.
Number of multiplexers required = Number of bits in the tag
Size of each multiplexer = Number of lines in cache x 1
Number of comparators required = 1
Size of comparator = Number of bits in the tag
Hit latency = Multiplexer latency + Comparator latency
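The results above can be checked with a small script. The block and line counts are example values, not taken from the text:

```python
import math

MAIN_MEMORY_BLOCKS = 64   # example: main memory of 64 blocks
NUM_LINES = 8             # example: cache of 8 lines

line_bits = int(math.log2(NUM_LINES))                      # bits that select a line
tag_bits = int(math.log2(MAIN_MEMORY_BLOCKS)) - line_bits  # remaining bits form the tag

def cache_line(block):
    """Block j of main memory can map only to line (j mod number of lines)."""
    return block % NUM_LINES

num_multiplexers = tag_bits   # one (NUM_LINES x 1) multiplexer per tag bit
num_comparators = 1           # a single comparator of tag_bits size
```

For instance, blocks 3, 11 and 19 all compete for the same cache line in this configuration, which is why a miss on one of them may evict another.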
2. Fully Associative Mapping-
Example-
Here,
All the lines of cache are freely available.
Thus, any block of main memory can map to any line of the cache.
If all the cache lines are occupied, then one of the existing blocks has to be
replaced.
Need of Replacement Algorithm-
In fully associative mapping, a replacement algorithm is required to decide which of the
existing blocks will be replaced when the cache is full.
3. K-way Set Associative Mapping-
Example-
Here,
k = 2 suggests that each set contains two cache lines.
Since cache contains 6 lines, so number of sets in the cache = 6 / 2 = 3 sets.
Block ‘j’ of main memory can map to set number (j mod 3) only of the cache.
Within that set, block ‘j’ can map to any cache line that is freely available at that moment.
If all the cache lines are occupied, then one of the existing blocks will have to be replaced.
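The set-number calculation in this example (6 lines, k = 2, so 3 sets) can be sketched as:

```python
K = 2                      # 2-way set associative: two lines per set
NUM_LINES = 6
NUM_SETS = NUM_LINES // K  # 6 / 2 = 3 sets

def cache_set(block):
    """Block j maps to set (j mod number of sets); within that set it may
    occupy any of the set's k lines, whichever is free."""
    return block % NUM_SETS
```

So blocks 0, 3, 6, ... all map to set 0, blocks 1, 4, 7, ... to set 1, and so on.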
Need of Replacement Algorithm-
Set associative mapping is a combination of direct mapping and fully associative mapping.
It uses fully associative mapping within each set.
Thus, set associative mapping requires a replacement algorithm.
Special Cases-
If k = 1, then k-way set associative mapping reduces to direct mapping.
If k = Total number of lines in the cache, then k-way set associative mapping becomes fully
associative mapping.
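These special cases can be verified numerically. A quick sketch, using the 6-line cache from the example:

```python
NUM_LINES = 6              # cache with 6 lines, as in the example

def num_sets(k):
    """Number of sets in a k-way set associative cache."""
    return NUM_LINES // k

# k = 1: one line per set, so set number == line number -> direct mapping.
# k = NUM_LINES: a single set holding every line -> fully associative mapping.
```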
Most systems incorporate a special-purpose cache, called a translation
look-aside buffer (TLB), that maps virtual page numbers to physical page
numbers. The TLB is usually small and quite fast. It’s usually fully-
associative to ensure the best possible hit ratio by avoiding collisions. If
the PPN is found by using the TLB, the access to main memory for the
page table entry can be avoided, and we’re back to a single physical
access for each virtual access.
The hit ratio of a TLB is quite high, usually better than 99%. This isn’t too
surprising since locality and the notion of a working set suggest that only
a small number of pages are in active use over short periods of time.
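The TLB lookup described above can be sketched as a software model. The page-table contents here are made-up illustrative data, and a real TLB would also bound its size and handle page faults:

```python
tlb = {}                           # small fully associative cache: VPN -> PPN
page_table = {0: 5, 1: 9, 2: 3}    # complete mapping kept in main memory (example data)

def translate(vpn):
    """Translate a virtual page number, consulting the TLB before the page table."""
    if vpn in tlb:
        return tlb[vpn]            # TLB hit: no memory access needed for the PTE
    ppn = page_table[vpn]          # TLB miss: fetch the page table entry from memory
    tlb[vpn] = ppn                 # remember the translation for next time
    return ppn
```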
A page replacement algorithm picks a page to be paged out
in order to free up a frame.
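A minimal page-replacement sketch, using a FIFO policy chosen here for illustration (real systems more often approximate LRU):

```python
from collections import deque

class FIFOReplacer:
    """Pick the oldest resident page to page out when no frame is free."""
    def __init__(self, num_frames):
        self.num_frames = num_frames
        self.resident = deque()        # resident pages, oldest first

    def access(self, page):
        """Return the evicted page when a fault needs a frame, else None."""
        if page in self.resident:
            return None                # page already resident: no fault
        victim = None
        if len(self.resident) == self.num_frames:
            victim = self.resident.popleft()   # page the oldest page out
        self.resident.append(page)
        return victim
```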