Dld&Co Cse-Ds Unit 5-1
• Memory hierarchy
• Cache Memory
• Main Memory
Memory Hierarchy
• The memory unit is needed for storing programs
and data.
• At the bottom of the hierarchy are the relatively slow magnetic tapes used to
store removable files.
• Next are the magnetic disks used as backup storage.
• The main memory occupies a central position by being able to communicate
directly with the CPU and with auxiliary memory devices through an I/O
processor.
• When programs not residing in main memory are needed by the CPU, they are
brought in from auxiliary memory. Programs not currently needed in main memory
are transferred into auxiliary memory to provide space for currently used
programs and data.
Memory Hierarchy in Computer
• The goal of the memory hierarchy is to obtain the highest possible access speed while
minimizing the total cost of the memory system
Figure: Memory hierarchy — magnetic tapes and magnetic disks (auxiliary memory) connect to main memory through an I/O processor; main memory communicates with the CPU through the cache memory. Capacity increases moving down the hierarchy.
• Main memory:
– communicates DIRECTLY with the CPU
– communicates with auxiliary memory devices (disks, tapes) through an I/O
processor
– is the PRINCIPAL INTERNAL memory system of the computer
– Each location in main memory has a unique address
– Main memory is usually extended with a higher-speed, smaller cache
• Cache:
– is NOT visible to the programmer (it is managed entirely by hardware)
– very-high speed memory
– Used to compensate for the speed differential between main memory
access time and processor logic
– Its access time is close to processor logic clock cycle time
• Auxiliary memory or Secondary memory:
– Devices that provide backup storage are called
auxiliary memory
– Ex:- magnetic disks, magnetic tapes
– used for storing system programs, large data files,
and other backup information
– Only programs and data currently needed by the
processor reside in MAIN MEMORY
– All other information is stored in auxiliary memory
and transferred to main memory when needed
Main Memory
• The main memory is the central storage unit in a
computer system.
• The read and write inputs specify the memory operation, and the two chip
select (CS) control inputs enable the chip only when it is selected by the
microprocessor.
• (Fig.(b)) The unit is in operation only when CS1 = 1 and CS2' = 0.
• Since a ROM can only be read, the data bus can only be in an output mode.
• The nine address lines in the ROM chip specify any one of the 512 bytes stored
in it.
• The two chip select inputs must be CS1 = 1 and CS2' = 0 for the unit to
operate. Otherwise, the data bus is in a high-impedance state.
• There is no need for a read or write control because the unit can only read.
• Thus when the chip is enabled by the two select inputs, the byte selected by the
address lines appears on the data bus.
Memory Address Map
• A memory address map is a pictorial representation of the assigned address
space for each chip in the system.
• Assume that a computer system needs 512 bytes of RAM and 512 bytes of
ROM.
• The component column specifies whether a RAM or a ROM chip is used.
• The hexadecimal address column assigns a range of hexadecimal equivalent
addresses for each chip.
• Although there are 16 lines in the address bus (third column), the table shows
only 10 lines because the other 6 are not used (assumed zero).
• The small x's designate those lines that must be connected to the address
inputs in each chip.
TABLE: Memory Address Map for Microcomputer
• The RAM chips have 128 bytes and need seven address lines.
• The ROM chip has 512 bytes and needs nine address lines.
• The x's are always assigned to the low-order bus lines: lines 1
through 7 for the RAM and lines 1 through 9 for the ROM.
• When line 10 is 0, the CPU selects a RAM, and when this line is
equal to 1, it selects the ROM.
• The low-order lines in the address bus select the byte within
the chips and
• other lines in the address bus select a particular chip through
its chip select inputs.
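The decoding described above can be sketched in software. This is a hypothetical model, assuming the table's layout of four 128-byte RAM chips (addresses 0000-01FF) and one 512-byte ROM (0200-03FF); the function name is illustrative:

```python
def decode(address):
    """Decode a 10-bit address per the memory address map.

    Lines are numbered 1..10 from the low-order end; line 10 distinguishes
    RAM (0) from ROM (1). Assumed layout: four 128-byte RAM chips at
    0000-01FF and one 512-byte ROM at 0200-03FF.
    """
    line10 = (address >> 9) & 1          # line 10: RAM/ROM select
    if line10 == 0:
        chip = (address >> 7) & 0b11     # lines 8-9 pick one of 4 RAM chips
        offset = address & 0x7F          # lines 1-7: byte within the chip
        return (f"RAM{chip + 1}", offset)
    offset = address & 0x1FF             # lines 1-9: byte within the ROM
    return ("ROM", offset)

print(decode(0x0000))   # ('RAM1', 0)
print(decode(0x0085))   # ('RAM2', 5)
print(decode(0x0200))   # ('ROM', 0)
```

Note how the low-order lines select the byte within a chip while the high-order lines act as the chip select, exactly as the bullet points state.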
Memory Connection to CPU
• This configuration gives a memory
capacity of 512 bytes of RAM and 512
bytes of ROM.
Cache Memory
• When the CPU refers to memory and finds the word in cache, it
is said to produce a HIT .
• If the word is not found in cache, it is in main memory and it
counts as a MISS .
• The transformation of data from main memory to cache memory is referred to
as a mapping process. Three types of mapping procedures are used:
• 1. Associative mapping
• 2. Direct mapping
• 3. Set-associative mapping
• The main memory can store 32k words of 12 bits each.
• The cache is capable of storing 512 of these words at any given time.
• The CPU communicates with both the cache and the main memory.
• For every word stored in cache, there is a duplicate copy in main
memory.
Figure: Associative mapping cache — the 15-bit address is shown as a 5-digit octal number and the 12-bit word as a 4-digit octal number.
Associative Mapping
• The fastest and most flexible cache organization uses an
associative memory.
• This permits any location in cache to store any word from main
memory.
• If the data is not available in the cache, it is fetched from main memory
and the address-data pair is transferred to the associative cache memory.
• The above diagram shows three words presently stored in the cache.
• The address value of 15 bits is shown as a five-digit octal number and its
corresponding 12-bit word is shown as a four-digit octal number.
• A CPU address of 15 bits is placed in the argument register and the associative
memory is searched for a matching address.
• If the address is found, the corresponding 12-bit data is read and sent to the CPU.
• If no match occurs, the main memory is accessed for the word.
• The address-data pair is then transferred to the associative cache memory.
• If the cache is full, an address-data pair must be displaced to make room for a pair
that is needed and not presently in the cache.
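The associative lookup described above can be modelled with a short sketch. This is a software stand-in for the parallel hardware search (a dict lookup plays the role of the associative match); the class and names are illustrative:

```python
# Sketch of an associative-mapped cache: any address may occupy any slot,
# so lookup compares the CPU address against every stored address.
class AssociativeCache:
    def __init__(self, capacity=512):
        self.capacity = capacity
        self.store = {}              # address -> data; any word anywhere

    def read(self, address, main_memory):
        if address in self.store:    # matching address found: a hit
            return self.store[address], "hit"
        data = main_memory[address]  # miss: access main memory instead
        if len(self.store) >= self.capacity:
            # cache full: displace the oldest address-data pair (FIFO here)
            self.store.pop(next(iter(self.store)))
        self.store[address] = data   # transfer the pair into the cache
        return data, "miss"

main_memory = {0o02777: 0o6710}
cache = AssociativeCache()
print(cache.read(0o02777, main_memory))  # first access: a miss
print(cache.read(0o02777, main_memory))  # second access: a hit
```

Real associative hardware performs the address comparison for all entries simultaneously; the dict only models the behaviour, not the speed.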
Direct Mapping
• The n-bit CPU address is divided into two fields: k bits for the index field
and (n-k) bits for the tag field.
• The DIRECT MAPPING cache organization uses the n-bit address to access the
main memory and the k-bit index to access the cache.
• (Fig.(b)) Each word in cache consists of the data word and its associated tag.
• When the CPU generates a memory request, the index field is used for the
address to access the cache.
• The tag field of the CPU address is compared with the tag in the word read from
the cache.
• If the two tags match, there is a hit and the desired data word is in cache.
• If there is no match, there is a miss and the required word is read from main
memory.
• It is then stored in the cache together with the new tag, replacing the previous
value.
• Disadvantage: the hit ratio can drop considerably if two or more words whose
addresses have the same index but different tags are accessed repeatedly.
• However, this possibility is minimized by the fact that such words are relatively
far apart in the address range (multiples of 512 locations in this example).
• Consider example shown in Fig. below.
• The word at address zero is presently stored in the cache (index = 000, tag = 00, data = 1220).
• Suppose that the CPU now wants to access the word at address 02000.
• The index address is 000, so it is used to access the cache. The two tags are then compared. The cache
tag is 00 but the address tag is 02, which does not produce a match.
• Therefore, the main memory is accessed and the data word 5670 is transferred to the CPU.
• The cache word at index address 000 is then replaced with a tag of 02 and data of 5670.
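The worked example can be traced with a minimal sketch, assuming the 15-bit address, 9-bit index, and 6-bit tag of this organization (function names are illustrative):

```python
# Direct mapping: the 9-bit index alone selects the cache slot, and the
# stored 6-bit tag decides whether the slot holds the requested word.
INDEX_BITS = 9

def split(address):
    """Split a 15-bit address into (tag, index)."""
    return address >> INDEX_BITS, address & ((1 << INDEX_BITS) - 1)

cache = [None] * (1 << INDEX_BITS)       # one (tag, data) pair per index

def read(address, main_memory):
    tag, index = split(address)
    slot = cache[index]
    if slot is not None and slot[0] == tag:
        return slot[1], "hit"            # tags match: word is in cache
    data = main_memory[address]          # miss: read from main memory
    cache[index] = (tag, data)           # store with the new tag
    return data, "miss"

main_memory = {0o00000: 0o1220, 0o02000: 0o5670}
cache[0] = (0o00, 0o1220)                # word at address 00000 pre-loaded
print(read(0o02000, main_memory))        # tag 02 vs stored 00: a miss
print(read(0o02000, main_memory))        # slot replaced, so now a hit
```

Note that after the miss, a subsequent access to address 00000 would itself miss, which is exactly the repeated-displacement disadvantage described above.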
• The direct-mapping organization using a block size of B words is shown in Fig below.
• The index field is now divided into two parts: the block field and the word field.
• In a 512-word cache there are 64 blocks of 8 words each, since 64 x 8 = 512.
• The block number is specified with a 6-bit field and the word within the block is specified with a 3-bit field.
• The tag field stored within the cache is common to all eight words of the same block.
• Every time a miss occurs, an entire block of eight words must be transferred from main memory to
cache memory.
• Although this takes extra time, the hit ratio will most likely improve with a larger block size because of
the sequential nature of computer programs.
Figure:
Direct mapping cache with
block size of 8 words
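The field split for the block organization can be checked with a small sketch, assuming the 6-bit tag, 6-bit block, and 3-bit word fields described above:

```python
# For a 512-word cache with 8-word blocks, the 15-bit address splits into
# tag (6 bits), block number (6 bits), and word within block (3 bits).
def fields(address):
    tag   = address >> 9                 # high 6 bits
    block = (address >> 3) & 0x3F        # next 6 bits: one of 64 blocks
    word  = address & 0x7                # low 3 bits: one of 8 words
    return tag, block, word

print(fields(0o02000))   # (2, 0, 0)
print(fields(0o00017))   # (0, 1, 7)
```

On a miss, all eight words sharing the same tag and block fields are transferred together, which is why the single stored tag can serve the whole block.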
Set-Associative Mapping
• Set-associative mapping is an improvement over the direct mapping organization in that
each word of cache can store two or more words of memory under the same index
address.
• Each data word is stored together with its tag and the number of tag-data items in one
word of cache is said to form a set.
• An example of a set-associative cache organization for a set size of two is shown in Fig
below.
Figure: Two-way set-associative mapping cache — each cache word holds two tag-data items; the CPU address is 15 bits and each data word is 12 bits.
• Each tag requires six bits and each data word has 12 bits, so the
word length is 2(6+12)=36 bits.
• When the CPU generates a memory request, the index value of the
address is used to access the cache.
• The tag field of the CPU address is then compared with both tags in
the cache to determine if a match occurs.
• The hit ratio will improve as the set size increases because more
words with the same index but different tags can reside in cache.
Replacement Algorithm
• When a miss occurs in a set-associative cache and the set is full, it is
necessary to replace one of the tag-data items with a new value.
• With the random replacement policy the control chooses one tag-data item
for replacement at random.
• The FIFO procedure selects for replacement the item that has been in the set
the longest.
• The LRU algorithm selects for replacement the item that has been least
recently used by the CPU.
• Both FIFO and LRU can be implemented by adding a few extra bits in each
word of cache.
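The LRU policy for a two-way set-associative cache can be sketched as follows. This is a software model: an OrderedDict per set stands in for the few extra bits the hardware would keep, and the names are illustrative:

```python
from collections import OrderedDict

# Two-way set-associative cache with LRU replacement: each 9-bit index
# addresses a set of up to two (tag, data) items.
INDEX_BITS, SET_SIZE = 9, 2

def read(cache, address, main_memory):
    tag, index = address >> INDEX_BITS, address & ((1 << INDEX_BITS) - 1)
    cache_set = cache.setdefault(index, OrderedDict())
    if tag in cache_set:
        cache_set.move_to_end(tag)       # mark as most recently used
        return cache_set[tag], "hit"
    data = main_memory[address]          # miss: fetch from main memory
    if len(cache_set) >= SET_SIZE:
        cache_set.popitem(last=False)    # set full: evict the LRU item
    cache_set[tag] = data
    return data, "miss"

main_memory = {0o00000: 0o3450, 0o02000: 0o5670, 0o04000: 0o2300}
cache = {}
read(cache, 0o00000, main_memory)        # miss: tag 00 fills index 000
read(cache, 0o02000, main_memory)        # miss: tag 02 shares the set
print(read(cache, 0o00000, main_memory)) # hit: both fit in a set of two
print(read(cache, 0o04000, main_memory)) # miss: displaces LRU tag 02
```

With direct mapping the same three addresses would thrash a single slot; the set of two absorbs the conflict, illustrating why a larger set size improves the hit ratio.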
Writing into Cache
• When the CPU finds a word in cache during a read operation, the main memory
is NOT involved in the transfer.
• If the operation is a write, there are two ways that the system can proceed.
• The simplest and most commonly used procedure is to update main memory
with every memory write operation, with cache memory being updated in
parallel if it contains the word at the specified address.
• This is called the WRITE-THROUGH method.
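The write-through procedure can be sketched in a few lines (the function name is illustrative, not from the text):

```python
# Write-through: every write updates main memory, and the cache copy is
# updated in parallel only if it already holds the word at that address.
def write_through(cache, main_memory, address, data):
    main_memory[address] = data          # main memory is always updated
    if address in cache:                 # cache updated only if word present
        cache[address] = data

main_memory = {0o100: 0o1111}
cache = {0o100: 0o1111}
write_through(cache, main_memory, 0o100, 0o7777)
print(main_memory[0o100] == cache[0o100])  # True: both copies agree
```

Because main memory always holds the latest value, a displaced cache word can simply be discarded; that is the simplicity the text attributes to this method.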
• Cache initialization: a valid bit is included with each word of cache. The valid
bit of a particular cache word is set to 1 the first time this word is loaded
from main memory and stays set unless the cache has to be initialized again.
• The introduction of the valid bit means that a word in cache is not replaced by
another word unless the valid bit is set to 1 and a mismatch of tags occurs.
• If the valid bit happens to be 0, the new word automatically replaces the
invalid data.
• Thus the initialization condition has the effect of forcing misses from the cache
until it fills with valid data.