Memory Management Notes
Background
• Memory consists of a large array of words or bytes, each with its own address. The CPU fetches
instructions from memory according to the value of the program counter.
• A typical instruction-execution cycle, for example, first fetches an instruction from memory.
• The instruction is then decoded and may cause operands to be fetched from memory. After the
instruction has been executed on the operands, results may be stored back in memory.
• The memory unit sees only a stream of memory addresses. We can ignore how a program
generates a memory address.
Basic Hardware
• Main memory and the registers built into the processor are the only storage that the CPU can
access directly. There are machine instructions that take memory addresses as arguments, but
none that take disk addresses.
• Registers that are built into the CPU are generally accessible within one cycle of the CPU clock. Completing a memory access may take many cycles of the CPU clock. The remedy is to add fast memory, called a cache, between the CPU and main memory.
• The operating system must be protected from user processes and, in addition, user processes
must be protected from one another. This protection must be provided by the hardware.
• We can provide this protection by using two registers, usually a base and a limit, as illustrated
below. A pair of base and limit registers define the logical address space.
Figure 8.1 A base and a limit register define a logical address space.
Figure 8.2 Hardware address protection with base and limit registers.
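The hardware check implied by these figures is a simple range comparison: every CPU-generated address must satisfy base <= address < base + limit, or the hardware traps to the operating system. A minimal sketch in C of that comparison (the register values here are arbitrary; real base and limit registers are privileged and loadable only by the operating system):

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Illustrative base/limit values; on real hardware these live in
 * privileged CPU registers that only the OS kernel may load. */
static const uint32_t base_reg  = 300040;
static const uint32_t limit_reg = 120900;

/* The hardware protection check: an address is legal only if
 * base <= addr < base + limit; otherwise trap to the OS. */
static bool legal_access(uint32_t addr) {
    return addr >= base_reg && addr < base_reg + limit_reg;
}

int main(void) {
    uint32_t addr = base_reg + 500;          /* inside the process */
    if (!legal_access(addr)) {
        fprintf(stderr, "trap: addressing error at %u\n", addr);
        exit(EXIT_FAILURE);
    }
    printf("access to %u permitted\n", addr);
    return 0;
}
```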
Address Binding
• Usually, a program resides on a disk as a binary executable file. To be executed, the program
must be brought into memory and placed within a process.
• Depending on the memory management in use, the process may be moved between disk and
memory during its execution. The processes on the disk that are waiting to be brought into
memory for execution form the input queue.
• Most systems allow a user process to reside in any part of the physical memory. Thus, although
the address space of the computer starts at 00000, the first address of the user process need not
be 00000.
• A user program will go through several steps (some of which may be optional) before being executed. Addresses may be represented in different ways during these steps.
• Addresses in the source program are generally symbolic. A compiler will typically bind these
symbolic addresses to relocatable addresses (such as "14 bytes from the beginning of this
module"). The linkage editor or loader will in turn bind the relocatable addresses to absolute
addresses (such as 74014). Each binding is a mapping from one address space to another.
• Compile time. If you know at compile time where the process will reside in memory, then absolute code can be generated. For example, if you know that a user process will reside starting at location R, then the compiler-generated code will start at that location and extend up from there.
• Load time. If it is not known at compile time where the process will reside in memory, then the compiler must generate relocatable code. In this case, final binding is delayed until load time.
• Execution time. If the process can be moved during its execution from one memory segment to another, then binding must be delayed until run time.
Dynamic Loading
• With dynamic loading, a routine is not loaded until it is called. All routines are kept on disk in a
relocatable load format.
• The main program is loaded into memory and is executed. When a routine needs to call another
routine, the calling routine first checks to see whether the other routine has been loaded.
• If it has not, the relocatable linking loader is called to load the desired routine into memory.
Then control is passed to the newly loaded routine.
• The advantage of dynamic loading is that an unused routine is never loaded. This method is
particularly useful when large amounts of code are needed to handle infrequently occurring
cases, such as error routines.
• Dynamic loading results in better memory-space utilization.
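On POSIX systems, a program can implement this load-on-first-call pattern itself through the dlopen/dlsym interface. A minimal sketch, assuming a hypothetical library ./liberrors.so exporting a routine report_error (compile with -ldl on most systems):

```c
#include <dlfcn.h>
#include <stdio.h>

int main(void) {
    /* Load the library only when its routine is first needed; until
     * this call, none of its code occupies memory. */
    void *handle = dlopen("./liberrors.so", RTLD_LAZY);
    if (!handle) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return 1;
    }

    /* Locate the desired routine by name and call it indirectly. */
    void (*report_error)(int) = (void (*)(int))dlsym(handle, "report_error");
    if (report_error)
        report_error(42);

    dlclose(handle);
    return 0;
}
```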
Dynamic Linking
• Static linking: system language libraries and program code are combined by the loader into the binary program image.
• Dynamic linking, in contrast, is similar to dynamic loading. Here linking, rather than loading,
is postponed until execution time.
• This feature is usually used with system libraries, such as language subroutine libraries.
Without this facility, each program on a system must include a copy of its language library in
the executable image.
• With dynamic linking, a stub is included in the image for each library routine reference. The
stub is a small piece of code that indicates how to locate the appropriate memory-resident
library routine or how to load the library if the routine is not already present.
• When the stub is executed, it checks to see whether the needed routine is already in memory. If it is not, the program loads the routine into memory. Either way, the stub replaces itself with the address of the routine and executes the routine (a minimal sketch of this self-replacement appears after this list).
• This feature can be extended to library updates. A library may be replaced by a new version,
and all programs that reference the library will automatically use the new version.
• One copy of the library is shared by multiple programs; libraries used this way are known as shared libraries.
• Unlike dynamic loading, dynamic linking generally requires help from the operating system.
• If the processes in memory are protected from one another, then the operating system is the only entity that can check whether the needed routine is in another process's memory space or that can allow multiple processes to access the same memory addresses.
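The stub's self-replacement can be sketched in C with a function pointer that initially points at a resolver and is overwritten with the real routine's address on the first call. This is only an analogy for what a dynamic linker does in its procedure linkage table; the names my_sqrt and resolve_sqrt are illustrative (compile with -lm):

```c
#include <math.h>
#include <stdio.h>

static double resolve_sqrt(double x);   /* the stub */

/* All calls go through this pointer; it starts at the stub. */
static double (*my_sqrt)(double) = resolve_sqrt;

/* The stub locates the real routine, replaces the pointer with its
 * address ("the stub replaces itself"), and completes the call. */
static double resolve_sqrt(double x) {
    my_sqrt = sqrt;
    return my_sqrt(x);
}

int main(void) {
    printf("%f\n", my_sqrt(2.0));   /* first call: goes via the stub */
    printf("%f\n", my_sqrt(9.0));   /* later calls: direct */
    return 0;
}
```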
Swapping
• A process must be in memory to be executed. A process, however, can be swapped temporarily
out of memory to a backing store and then brought back into memory for continued execution.
• For example, assume a multiprogramming environment with a round-robin CPU-scheduling
algorithm. When a quantum expires, the memory manager will start to swap out the process that
just finished and to swap another process into the memory space that has been freed. In the
meantime, the CPU scheduler will allocate a time slice to some other process in memory.
• Backing store: a fast disk large enough to accommodate copies of all memory images for all users; it must provide direct access to these memory images.
• Roll out, roll in: a swapping variant used for priority-based scheduling algorithms; a lower-priority process is swapped out so that a higher-priority process can be loaded and executed.
• Assume that I/O operation of process P1 is queued because the device is busy. If we swap out
process P1 and swap in process P2, the I/O operation might attempt to use memory that now
belongs to process P2.
• Two solutions to the problem:
◦ Never swap a process with pending I/O.
◦ Execute I/O operations only into operating-system buffers. Transfers between those buffers and process memory then occur only when the process is swapped in.
• Standard swapping requires too much swapping time and provides too little execution time. Modified versions of swapping are found on many systems (e.g., UNIX, Linux, and Windows): swapping is normally disabled, is started if the amount of allocated memory exceeds a threshold, and is disabled again once memory demand falls below that threshold.
Memory Allocation
• One of the simplest methods for allocating memory is to divide memory into several fixed-sized partitions. Each partition may contain exactly one process.
• In this multiple partition method, when a partition is free, a process is selected from the input
queue and is loaded into the free partition. When the process terminates, the partition becomes
available for another process.
• In the variable-sized partition scheme, the operating system keeps a table indicating which
parts of memory are available and which are occupied. Initially, all memory is available for user
processes and is considered one large block of available memory, a hole.
• As processes enter the system, they are put into an input queue. The operating system takes into
account the memory requirements of each process and the amount of available memory space in
determining which processes are allocated memory.
• The memory blocks available comprise a set of holes of various sizes scattered throughout
memory. When a process arrives and needs memory, the system searches the set for a hole that
is large enough for this process.
• If the hole is too large, it is split into two parts. One part is allocated to the arriving process; the
other is returned to the set of holes. When a process terminates, it releases its block of memory,
which is then placed back in the set of holes.
• At any given time, then, we have a list of available block sizes (holes) and an input queue. Memory is allocated to processes until, finally, the memory requirements of the next process cannot be satisfied.
• The operating system can then wait until a large enough block is available, or it can skip down
the input queue to see whether the smaller memory requirements of some other process can be
met.
Summary
• First-fit: allocate the first hole that is big enough.
• Best-fit: allocate the smallest hole that is big enough.
◦ Must search the entire list, unless the list is ordered by size.
◦ Produces the smallest leftover hole.
• Worst-fit: allocate the largest hole.
◦ Must also search the entire list.
◦ Produces the largest leftover hole.
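A minimal first-fit allocator over a linked list of holes, as a sketch (the hole addresses and sizes are arbitrary; best-fit and worst-fit differ only in scanning the whole list and picking the smallest or largest sufficient hole):

```c
#include <stddef.h>
#include <stdio.h>

struct hole {
    size_t start;          /* starting address of the hole */
    size_t size;           /* size of the hole in bytes    */
    struct hole *next;
};

/* First-fit: take the first hole that is big enough.  If the hole
 * is larger than the request, split it and leave the remainder in
 * the set of holes.  Returns -1 if no hole can satisfy the request. */
static long first_fit(struct hole **list, size_t request) {
    for (struct hole **pp = list; *pp; pp = &(*pp)->next) {
        struct hole *h = *pp;
        if (h->size >= request) {
            size_t addr = h->start;
            if (h->size == request)
                *pp = h->next;        /* hole consumed exactly  */
            else {
                h->start += request;  /* split: shrink the hole */
                h->size  -= request;
            }
            return (long)addr;
        }
    }
    return -1;
}

int main(void) {
    struct hole h2 = { 900, 300, NULL };
    struct hole h1 = { 100, 200, &h2 };
    struct hole *free_list = &h1;

    printf("alloc 150 -> %ld\n", first_fit(&free_list, 150)); /* 100 */
    printf("alloc 120 -> %ld\n", first_fit(&free_list, 120)); /* 900 */
    return 0;
}
```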
Fragmentation
• Both the first-fit and best-fit strategies for memory allocation suffer from external
fragmentation. As processes are loaded and removed from memory, the free memory space is
broken into little pieces.
• External fragmentation exists when there is enough total memory space to satisfy a request but the available spaces are not contiguous; storage is fragmented into a large number of small holes.
• Consider a multiple-partition allocation scheme with a hole of 18,464 bytes. Suppose that the
next process requests 18,462 bytes. If we allocate exactly the requested block, we are left with a
hole of 2 bytes. The overhead to keep track of this hole will be substantially larger than the hole
itself.
• The general approach to avoiding this problem is to break physical memory into fixed-sized blocks. With this approach, the memory allocated to a process may be slightly larger than the requested memory. The difference between these two numbers is internal fragmentation: unused memory that is internal to a partition.
• One solution to the problem of external fragmentation is compaction. The goal is to shuffle the
memory contents so as to place all free memory together in one large block.
• Compaction is not always possible, however. If relocation is static and is done at assembly or
load time, compaction cannot be done; compaction is possible only if relocation is dynamic and
is done at execution time.
• Another possible solution to the external-fragmentation problem is to permit the logical address
space of the processes to be noncontiguous, thus allowing a process to be allocated physical
memory wherever such memory is available.
Paging
Paging is a memory-management scheme that permits the physical address space of a process to be
noncontiguous. Paging avoids external fragmentation and the need for compaction.
Basic Method
• The basic method for implementing paging involves breaking physical memory into fixed-sized
blocks called frames and breaking logical memory into blocks of the same size called pages.
• When a process is to be executed, its pages are loaded into any available memory frames from their source (a file system or the backing store).
• The backing store is divided into fixed-sized blocks that are the same size as the memory frames.
• Every address generated by the CPU is divided into two parts: a page number (p) and a page offset (d). The page number is used as an index into a page table.
• The page table contains the base address of each page in physical memory. This base address is
combined with the page offset to define the physical memory address that is sent to the memory
unit.
• The size of a page is typically a power of 2, varying between 512 bytes and 16 MB per page,
depending on the computer architecture.
• The selection of a power of 2 as a page size makes the translation of a logical address into a
page number and page offset particularly easy.
Paging Hardware
• If the size of the logical address space is 2^m and the page size is 2^n addressing units (bytes or words), then the high-order m - n bits of a logical address designate the page number, and the n low-order bits designate the page offset. Thus, the logical address is divided as follows:
| page number p (m - n bits) | page offset d (n bits) |
where p is an index into the page table and d is the displacement within the page.
Paging Example
• Here, in the logical address, n = 2 and m = 4, giving a page size of 4 bytes and a physical memory of 32 bytes (8 frames).
• Logical address 0 is page 0, offset 0. Indexing into the page table, we find that page 0 is in
frame 5. Thus, logical address 0 maps to physical address 20 [= (5 x 4) + 0].
• Logical address 3 (page 0, offset 3) maps to physical address 23 [= (5 x 4) + 3].
• Logical address 4 is page 1, offset 0; according to the page table, page 1 is mapped to frame 6.
Thus, logical address 4 maps to physical address 24 [= (6 x 4) + 0].
• Logical address 13 maps to physical address 9.
Figure 8.9 Paging example for n = 2 and m = 4: a 32-byte memory with 4-byte pages.
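The translations in this example can be reproduced with a few lines of C. This is a sketch of the mechanism, not of hardware; pages 0, 1, and 3 map to frames 5, 6, and 2 as in the example, while the frame chosen for page 2 is an assumption:

```c
#include <stdint.h>
#include <stdio.h>

#define PAGE_BITS 2                    /* n = 2: page size 4 bytes */
#define PAGE_SIZE (1u << PAGE_BITS)

/* Page table for the example (page 2 -> frame 1 is assumed). */
static const uint32_t page_table[4] = { 5, 6, 1, 2 };

static uint32_t translate(uint32_t logical) {
    uint32_t p = logical >> PAGE_BITS;       /* page number      */
    uint32_t d = logical & (PAGE_SIZE - 1);  /* page offset      */
    return page_table[p] * PAGE_SIZE + d;    /* frame * size + d */
}

int main(void) {
    printf("%u\n", translate(0));   /* page 0 -> frame 5: 20 */
    printf("%u\n", translate(3));   /* 23                    */
    printf("%u\n", translate(4));   /* page 1 -> frame 6: 24 */
    printf("%u\n", translate(13));  /* page 3 -> frame 2: 9  */
    return 0;
}
```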
• When we use a paging scheme, we have no external fragmentation. However, we may have some internal fragmentation.
• If the memory requirements of a process do not happen to coincide with page boundaries, the
last frame allocated may not be completely full.
• For example, if page size is 2,048 bytes, a process of 72,766 bytes will need 35 pages plus 1,086
bytes. It will be allocated 36 frames, resulting in internal fragmentation of 2,048 – 1,086 = 962
bytes.
• In the worst case, a process would need n pages plus 1 byte. It would be allocated n + 1 frames.
• Usually, each page-table entry is 4 bytes long, but that size can vary. A 32-bit entry can point to one of 2^32 physical page frames. If the frame size is 4 KB (2^12 bytes), then a system with 4-byte entries can address 2^32 x 2^12 = 2^44 bytes (or 16 TB) of physical memory.
• Calculating internal fragmentation (recap):
◦ Page size = 2,048 bytes; process size = 72,766 bytes.
◦ The process needs 35 pages + 1,086 bytes, so it is allocated 36 frames.
◦ Internal fragmentation = 2,048 - 1,086 = 962 bytes.
◦ Worst-case fragmentation = 1 frame - 1 byte; average fragmentation = 1/2 frame size.
◦ This suggests that small frame sizes are desirable, but each page-table entry takes memory to track, and page sizes have grown over time; Solaris, for example, supports two page sizes: 8 KB and 4 MB.
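A small helper that reproduces this arithmetic, as a sketch:

```c
#include <stdio.h>

/* Number of frames a process needs: pages, rounded up. */
static unsigned frames_needed(unsigned proc_bytes, unsigned page_bytes) {
    return (proc_bytes + page_bytes - 1) / page_bytes;
}

/* Internal fragmentation: the unused tail of the last frame. */
static unsigned internal_frag(unsigned proc_bytes, unsigned page_bytes) {
    unsigned rem = proc_bytes % page_bytes;
    return rem ? page_bytes - rem : 0;
}

int main(void) {
    unsigned page = 2048, proc = 72766;
    printf("frames: %u\n", frames_needed(proc, page));   /* 36  */
    printf("internal fragmentation: %u bytes\n",
           internal_frag(proc, page));                   /* 962 */
    return 0;
}
```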
• Each page of the process needs one frame. Thus, if the process requires n pages, at least n
frames must be available in memory. If n frames are available, they are allocated to this arriving
process. Frame numbers are put in the page table for this process.
Free Frames
Figure 8.10 Free frames (a) before allocation and (b) after allocation.
• An important aspect of paging is the clear separation between the user's view of memory and
the actual physical memory.
• The user program views memory as one single contiguous space, containing only this program. In fact, the user program is scattered throughout physical memory, which also holds other programs.
• The mapping of logical addresses to physical is hidden from the user and is controlled by the
operating system.
• The operating system maintains a copy of the page table for each process. This copy is used to
translate logical addresses to physical addresses.
• The operating system maintains a data structure called a frame table, which has one entry for each physical frame, indicating whether the frame is free or allocated and, if allocated, to which page of which process.
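A plausible shape for such a frame table in C (the field names and table size are illustrative, not any real kernel's layout):

```c
#include <stdbool.h>

/* One entry per physical frame. */
struct frame_entry {
    bool free;   /* is the frame unallocated?           */
    int  pid;    /* owning process, if allocated        */
    int  page;   /* which page of that process it holds */
};

#define NFRAMES 1024
static struct frame_entry frame_table[NFRAMES];

int main(void) {
    /* Record that frame 5 now holds page 0 of process 42. */
    frame_table[5] = (struct frame_entry){ .free = false, .pid = 42, .page = 0 };
    return 0;
}
```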
Hardware Support
• The hardware implementation of the page table can be done in several ways. In the simplest
case, the page table is implemented as a set of dedicated registers. The CPU dispatcher reloads
these registers, just as it reloads the other registers.
• The use of registers for the page table is satisfactory if the page table is reasonably small. Most
contemporary computers, however, allow the page table to be very large.
• So, the page table is kept in main memory, and a page-table base register (PTBR) points to the page table. Changing page tables then requires changing only this one register, substantially reducing context-switch time.
• The problem with this approach is the time required to access a user memory location. In this
scheme every data/instruction access requires two memory accesses, one for the page table and
one for the data/instruction.
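For example, under an assumed memory-access time of 100 ns, each access effectively costs 200 ns: 100 ns to read the page-table entry and another 100 ns to read the data or instruction itself.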
Shared Pages
• An advantage of paging is the possibility of sharing common code. Consider a system that
supports 40 users, each of whom executes a text editor. If the text editor consists of 150 KB of
code and 50 KB of data space, we need 8,000 KB to support the 40 users.
• If the code is pure code (or reentrant code), however, it can be shared, as shown below. Here we see a three-page editor (each page 50 KB in size) being shared among three processes. Each process has its own data page.
• Pure code never changes during execution. Thus, two or more processes can execute the same code at the same time. The data for two different processes will, of course, be different.
• Only one copy of the editor need be kept in physical memory. Each user's page table maps onto
the same physical copy of the editor, but data pages are mapped onto different frames.
• Thus, to support 40 users, we need only one copy of the editor (150 KB), plus 40 copies of the
50 KB of data space per user. The total space required is now 2,150 KB instead of 8,000 KB.
• Other heavily used programs can also be shared: compilers, window systems, run-time libraries, database systems, and so on.
Hierarchical Paging
• Most modern systems support a large logical address space, so the page table itself becomes excessively large. One solution is two-level paging, in which the page table itself is paged: the page number is divided into p1 and p2, where p1 is an index into the outer page table and p2 is the displacement within the page of the inner page table. The address-translation method for this architecture is shown below. Because address translation works from the outer page table inward, this scheme is also known as a forward-mapped page table.
Address-Translation Scheme
• For a 64-bit architecture, even two levels are not enough: the outer page table alone would consist of 2^42 entries, or 2^44 bytes. The obvious way to avoid such a large table is to divide the outer page table into smaller pieces. We can page the outer page table, giving us a three-level paging scheme.
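A sketch of the two-level split for a 32-bit logical address, assuming 10 bits for p1, 10 bits for p2, and a 12-bit offset (a common textbook layout, not tied to any particular machine):

```c
#include <stdint.h>
#include <stdio.h>

#define P2_BITS     10
#define OFFSET_BITS 12

/* Split a 32-bit logical address into the outer index p1, the
 * inner index p2, and the page offset d. */
static void split(uint32_t addr, uint32_t *p1, uint32_t *p2, uint32_t *d) {
    *d  = addr & ((1u << OFFSET_BITS) - 1);
    *p2 = (addr >> OFFSET_BITS) & ((1u << P2_BITS) - 1);
    *p1 = addr >> (OFFSET_BITS + P2_BITS);
}

int main(void) {
    uint32_t p1, p2, d;
    split(0x12345678u, &p1, &p2, &d);
    printf("p1=%u p2=%u d=%u\n", p1, p2, d);  /* p1=72 p2=837 d=1656 */
    return 0;
}
```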
Hashed Page Tables
• A common approach for handling address spaces larger than 32 bits is a hashed page table, with the hash value being the virtual page number. Each entry in the hash table contains a linked list of elements that hash to the same location (to handle collisions).
• Each element consists of three fields: (1) the virtual page number, (2) the value of the mapped
page frame, and (3) a pointer to the next element in the linked list.
• The algorithm works as follows: The virtual page number in the virtual address is hashed into
the hash table. The virtual page number is compared with field 1 in the first element in the
linked list. If there is a match, the corresponding page frame (field 2) is used to form the desired
physical address. If there is no match, subsequent entries in the linked list are searched for a
matching virtual page number.
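A minimal sketch of that lookup in C (the table size and hash function are arbitrary choices for illustration):

```c
#include <stdint.h>
#include <stdio.h>

#define TABLE_SIZE 1024u

/* One chain element: (1) virtual page number, (2) mapped frame,
 * (3) pointer to the next element in the linked list. */
struct hpt_entry {
    uint64_t vpn;
    uint64_t frame;
    struct hpt_entry *next;
};

static struct hpt_entry *hash_table[TABLE_SIZE];

/* Hash the VPN, then walk the chain comparing VPNs; the matching
 * entry's frame (field 2) forms the physical address.  -1 = miss. */
static long hpt_lookup(uint64_t vpn) {
    for (struct hpt_entry *e = hash_table[vpn % TABLE_SIZE];
         e != NULL; e = e->next)
        if (e->vpn == vpn)
            return (long)e->frame;
    return -1;
}

int main(void) {
    static struct hpt_entry e = { 0x1234, 7, NULL };
    hash_table[e.vpn % TABLE_SIZE] = &e;
    printf("frame for vpn 0x1234: %ld\n", hpt_lookup(0x1234));  /* 7 */
    return 0;
}
```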
Inverted Page Tables
• An inverted page table has one entry for each real frame of memory; each entry records the virtual address of the page stored in that frame, together with information about the process that owns the page.
• Although this scheme decreases the amount of memory needed to store each page table, it increases the amount of time needed to search the table when a page reference occurs, because the inverted page table is sorted by physical address while lookups occur on virtual addresses, so the whole table may need to be searched for a match.
Segmentation
• As we have already seen, the user's view of memory is not the same as the actual physical
memory. The user's view is mapped onto physical memory.
• Users prefer to view memory as a collection of variable-sized segments.
Basic Method
• Segmentation is a memory-management scheme that supports this user view of memory. A
logical address space is a collection of segments.
• Each segment has a name and a length. The addresses specify both the segment name and the
offset within the segment. The user therefore specifies each address by two quantities: a
segment name and an offset.
• For simplicity of implementation, segments are numbered and are referred to by a segment number rather than by a segment name. Thus, a logical address consists of a two-tuple:
<segment-number, offset>
• A C compiler might create separate segments for the following:
◦ The code
◦ Global variables
◦ The heap, from which memory is allocated
◦ The stacks used by each thread
◦ The standard C library
Segmentation Architecture
• The use of a segment table is illustrated below. A logical address consists of two parts: a segment number, s, and an offset into that segment, d. The segment number is used as an index into the segment table. The offset d of the logical address must be between 0 and the segment limit; if it is not, we trap to the operating system. When an offset is legal, it is added to the segment base to produce the address in physical memory of the desired byte.
• As an example, consider the situation shown above. We have five segments, numbered from 0 through 4.
• The segments are stored in physical memory as shown. The segment table has a separate entry
for each segment, giving the beginning address of the segment in physical memory (or base)
and the length of that segment (or limit).
• For example, segment 2 is 400 bytes long and begins at location 4300. Thus, a reference to byte
53 of segment 2 is mapped onto location 4300 +53= 4353. A reference to segment 3, byte 852,
is mapped to 3200 (the base of segment 3) + 852 = 4052.
• A reference to byte 1222 of segment 0 would result in a trap to the operating system, as this
segment is only 1000 bytes long.
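A sketch of this check in C. The values quoted in the text are used where given (segment 2: base 4300, limit 400; segment 3: base 3200; segment 0: limit 1000); the remaining bases and limits are assumptions for illustration:

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

struct segment { uint32_t base, limit; };

static const struct segment seg_table[5] = {
    { 1400, 1000 },  /* segment 0: limit from the text, base assumed */
    { 6300,  400 },  /* segment 1: assumed                           */
    { 4300,  400 },  /* segment 2: from the text                     */
    { 3200, 1100 },  /* segment 3: base from the text, limit assumed */
    { 4700, 1000 },  /* segment 4: assumed                           */
};

/* Map <s, d> to a physical address, trapping if d exceeds the limit. */
static uint32_t map(uint32_t s, uint32_t d) {
    if (d >= seg_table[s].limit) {
        fprintf(stderr, "trap: offset %u beyond segment %u\n", d, s);
        exit(EXIT_FAILURE);
    }
    return seg_table[s].base + d;
}

int main(void) {
    printf("%u\n", map(2, 53));    /* 4300 + 53  = 4353 */
    printf("%u\n", map(3, 852));   /* 3200 + 852 = 4052 */
    printf("%u\n", map(0, 1222));  /* traps: segment 0 is 1000 bytes */
    return 0;
}
```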
Segmentation Hardware
• We must define an implementation to map two-dimensional user-defined addresses into one-
dimensional physical addresses. This mapping is effected by a segment table. Each entry in the
segment table has a segment base and a segment limit. The segment base contains the starting
physical address where the segment resides in memory, and the segment limit specifies the
length of the segment.
• Segmentation hardware is shown in the figure below.
10. What are the methods used to handle deadlocks? Explain how the circular-wait condition can be prevented from occurring.
11. Define deadlock. State and explain the Banker's algorithm for deadlock avoidance.
12. What is a resource-allocation graph (RAG)? Explain how a RAG is useful in describing deadly embrace, using your own example.
13. Explain the multistep processing of a user program with a neat block diagram.