Module-5
Virtual memory
When the operating system uses contiguous allocation, paging, or segmentation to load programs into main
memory, it has to load the entire program into main memory before execution of the program can start.
When the size of the program is greater than the size of main memory, then it is not possible to load the
entire program into main memory and start the execution of the program.
With virtual memory, the operating system can load only part of the program into main memory and
still start the execution of the program.
Virtual memory supports execution of programs whose size is greater than the size of main memory.
Demand Paging Technique
The space in RAM is divided into a number of equal-sized frames before any program is loaded into the RAM.
To load a program into the RAM, the operating system divides the program into a number of equal-sized pages.
Depending on the number of free frames available in the RAM, the operating system loads either all or only
some of the pages of the program into the RAM.
The number of entries in the page table is equal to the number of pages in the program.
If the page is currently loaded into RAM, then the corresponding entry in the page table is filled with the frame
number and the valid bit is set to ‘v’ (valid).
Otherwise, the frame number field is empty and the valid bit is set to ‘i’ (invalid).
In the following figure, program P1 is divided into 4 pages. Two pages are loaded into RAM and the remaining
two pages are on the hard disk.
Two entries of the page table contain the corresponding frame numbers and the remaining two entries are
empty.
To execute a statement of the program, the CPU generates the logical address of the statement.
The page number of logical address is used as an index into the page table of the program.
If the page-table entry contains a frame number, then that frame number is combined with the offset of the
logical address to generate the physical address.
If the entry of page table does not contain any frame number, then that situation is called a page fault.
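The lookup described above can be sketched in Python. This is a minimal illustration, not a real memory-management unit: the page size, the frame numbers, the page-table layout (a list of `(frame, valid)` pairs), and the names `translate` and `PageFault` are all assumptions.

```python
PAGE_SIZE = 1024  # assumed page size in bytes (illustrative only)


class PageFault(Exception):
    """Raised when the referenced page is not currently in RAM."""


def translate(logical_address, page_table):
    """Map a logical address to a physical address using the page table.

    page_table[p] is a pair (frame_number, valid_bit), valid_bit is 'v' or 'i'.
    """
    # Split the logical address into page number and offset.
    page, offset = divmod(logical_address, PAGE_SIZE)
    frame, valid = page_table[page]
    if valid != 'v':
        raise PageFault(page)
    # Frame number combined with the offset gives the physical address.
    return frame * PAGE_SIZE + offset


# Program P1 has 4 pages; pages 0 and 2 are in RAM (frames 5 and 1 are assumed).
page_table = [(5, 'v'), (None, 'i'), (1, 'v'), (None, 'i')]
```

With this table, `translate(10, page_table)` resolves inside frame 5, while any address in page 1 or page 3 raises `PageFault`.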
The operating system performs the following activities when a page fault occurs.
1. Checks whether the RAM contains a free frame or not.
2. If the RAM contains a free frame, then the operating system loads the required page into the free frame and
updates the page table of program.
3. Otherwise, the operating system replaces one of the pages in the RAM with the required page and updates
the page table of program. The operating system uses a page replacement algorithm to choose the page to be
replaced.
4. Instructs the CPU to generate the same logical address again.
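The four steps above can be sketched as a small function. The representation of the page table as a dictionary from page number to frame number, and the names `handle_page_fault` and `choose_victim`, are assumptions made for illustration; `choose_victim` stands in for whichever page replacement algorithm is used.

```python
def handle_page_fault(page, page_table, free_frames, choose_victim):
    """Bring `page` into RAM and update the page table (dict: page -> frame)."""
    if free_frames:                       # steps 1-2: a free frame exists
        frame = free_frames.pop()
    else:                                 # step 3: replace a resident page
        victim = choose_victim(page_table)
        frame = page_table.pop(victim)    # the victim page leaves RAM
    page_table[page] = frame              # page table now maps the new page
    return frame                          # step 4: the CPU retries the access
```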
Page Replacement
If the RAM is currently full and the operating system wants to load a page of a process into RAM, then the
operating system moves one of the pages in the RAM to the hard disk and loads the new page into the freed frame.
To select a page for replacement, the operating system uses one of the following page replacement algorithms.
1. First in first out (FIFO) page replacement algorithm
2. Optimal page replacement algorithm
3. Least Recently Used (LRU) page replacement algorithm
Reference string
A reference string gives the order in which the pages of a process are referenced during execution.
First in first out page replacement algorithm
The page that has been in RAM the longest (the page that was loaded earliest) is replaced by the requested page.
Ex:-
Number of pages in the process=8 (0 to 7)
Number of frames in the RAM=3
Reference string: 7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1
The replacement of pages in the RAM is shown in the figure below.
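The example can also be traced with a short simulation. The following is a minimal sketch of FIFO replacement; the function name `fifo_faults` is an assumption, and the code only counts page faults rather than modelling real frames.

```python
from collections import deque


def fifo_faults(reference_string, num_frames):
    """Count page faults under FIFO replacement."""
    frames = deque()              # oldest loaded page sits at the left
    faults = 0
    for page in reference_string:
        if page in frames:
            continue              # hit: FIFO order is not changed
        faults += 1
        if len(frames) == num_frames:
            frames.popleft()      # evict the page that was loaded earliest
        frames.append(page)
    return faults


REFS = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1]
print(fifo_faults(REFS, 3))  # prints 15
```

For this reference string and 3 frames, the simulation reports 15 page faults.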
Generally, the number of page faults decreases as the number of frames in the RAM increases. With the
FIFO algorithm, however, in some cases the number of page faults increases when the number of frames in the
RAM increases.
This situation is called “Belady’s Anomaly”.
This is a major drawback of FIFO.
The optimal page replacement algorithm does not suffer from Belady’s Anomaly.
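The anomaly can be reproduced with a small FIFO simulation. The sketch below uses the reference string 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5 (the same string as in Ex3 and Ex4 of the optimal algorithm); the function name `fifo_faults` is an assumption.

```python
def fifo_faults(reference_string, num_frames):
    """Count page faults under FIFO replacement."""
    frames = []                   # frames[0] is the oldest loaded page
    faults = 0
    for page in reference_string:
        if page not in frames:
            faults += 1
            if len(frames) == num_frames:
                frames.pop(0)     # evict the page that was loaded earliest
            frames.append(page)
    return faults


refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
faults_3 = fifo_faults(refs, 3)   # 9 page faults with 3 frames
faults_4 = fifo_faults(refs, 4)   # 10 page faults with 4 frames
```

Adding a fourth frame raises the number of page faults from 9 to 10 for this string.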
Optimal page replacement algorithm
The page that will not be used for the longest period of time will be replaced by the requested page.
Ex1:
Number of pages in the process=8 (0 to 7)
Number of frames in the RAM=3
Reference string: 7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1
Ex2:
Number of pages in the process=8
Number of frames in the RAM=4
Reference string: 7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1
Ex3:
Number of pages in the process=5(1 to 5)
Number of frames in the RAM=3
Reference String: 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5
Ex4:
Number of pages in the process=5(1 to 5)
Number of frames in the RAM=4
Reference String: 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5
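The four examples can be checked with a short simulation. This is a minimal sketch that assumes the whole reference string is known in advance; the function name `optimal_faults` is an assumption.

```python
def optimal_faults(reference_string, num_frames):
    """Count page faults under optimal replacement."""
    frames = []
    faults = 0
    for i, page in enumerate(reference_string):
        if page in frames:
            continue                           # hit
        faults += 1
        if len(frames) < num_frames:
            frames.append(page)                # a free frame is available
            continue
        future = reference_string[i + 1:]

        def next_use(p):
            # Distance to the next reference; infinite if never used again.
            return future.index(p) if p in future else float("inf")

        # Replace the page whose next use lies farthest in the future.
        victim = max(frames, key=next_use)
        frames[frames.index(victim)] = page
    return faults


REFS_A = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1]
REFS_B = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
results = [optimal_faults(REFS_A, 3), optimal_faults(REFS_A, 4),
           optimal_faults(REFS_B, 3), optimal_faults(REFS_B, 4)]
print(results)  # prints [9, 8, 7, 6]
```

In both pairs of examples, the number of page faults drops when a frame is added.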
Advantage:
Fewer page faults than FIFO; it gives the lowest possible number of page faults for a given number of frames.
Disadvantage:
This algorithm works based on the future references of pages.
If the operating system doesn’t know the order in which the pages will be referenced, then it cannot
determine the future references of the pages, and so it cannot use this algorithm.
Least Recently Used Page Replacement Algorithm
The page that has not been used for the longest period of time is replaced by the requested page.
Ex1:
Number of pages in the process=8
Number of frames in the RAM=3
Reference string: 7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1
Ex2:
Number of pages in the process=8
Number of frames in the RAM=4
Reference string: 7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1
Advantages:
1) Fewer page faults than FIFO.
2) It can be implemented in practice, as it depends only on past references of pages.
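The two LRU examples can be checked with a short simulation. This is a minimal sketch that keeps the resident pages in a list ordered by recency; the function name `lru_faults` is an assumption, and real systems usually approximate LRU with hardware support rather than maintain such a list.

```python
def lru_faults(reference_string, num_frames):
    """Count page faults under LRU replacement."""
    frames = []                      # least recently used page at index 0
    faults = 0
    for page in reference_string:
        if page in frames:
            frames.remove(page)      # hit: recency is refreshed below
        else:
            faults += 1
            if len(frames) == num_frames:
                frames.pop(0)        # evict the least recently used page
        frames.append(page)          # most recently used page at the end
    return faults


REFS = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1]
lru_3 = lru_faults(REFS, 3)   # 12 page faults with 3 frames
lru_4 = lru_faults(REFS, 4)   # 8 page faults with 4 frames
```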
FILE MANAGEMENT
File
A file is a container of information that is stored in the secondary storage device (Hard disk).
Files are broadly categorised into program files and data files.
2) Identifier: it is a unique number assigned by operating system. The operating system identifies a file by
its identifier.
5) Size: indicates the current size of the file in bytes or words or blocks.
6) Protection: indicates the access permissions for different users.
7) Time, date and user identification: indicates the owner of file and the date and time on which the file is
created and last modified.
File Operations
1) Creating a file: to create a new file, the operating system has to allocate space for the file and create an
entry for the file in the directory.
2) Writing a file: to write to a file, the name of the file and the information to be written should be specified.
The operating system identifies the file, writes the information into the file at the position specified by the
file pointer, and moves the file pointer.
3) Reading a file: to read from a file, the name of the file should be specified. The operating system identifies the
file, reads the data from the position pointed to by the file pointer, and moves the file pointer.
6) Truncating a file: the contents of the file are deleted but the space allocated to the file is not released.
Other operations that can be performed on a file are
The file name is split into two parts: name and extension, separated by a period character.
Based on the file type, the operating system decides the operations that can be performed on the file.
Access Methods
Information in the file is accessed in sequential order i.e. one record after the other.
With direct access, any record or block of the file can be accessed directly.
For example, we may read block 14, then read block 53, and then write block 8.
Some systems start the relative block number from ‘0’; others start at ‘1’.
Not all operating systems support both sequential and direct access for files.
Some systems allow only sequential file access; others allow only direct access.
Indexed Access
An index is created for the file. The index contains pointers to blocks of the file.
To access a block of the file, the index is searched first and then the pointer in the index is used to access the block.
The following figure shows how the data in the file is accessed using the indexed access method.
The index is searched using binary search.
When the file is large, the index also becomes large.
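A lookup through an index can be sketched as follows. The record keys, the block numbers, and the name `lookup_block` are hypothetical; the sketch assumes the index is kept sorted by key so that binary search applies.

```python
from bisect import bisect_left

# Hypothetical index: sorted record keys and the block that holds each record.
index_keys = ["adams", "baker", "clark", "davis"]
index_blocks = [12, 7, 31, 4]


def lookup_block(key):
    """Binary-search the index; return the block number, or None if absent."""
    pos = bisect_left(index_keys, key)
    if pos < len(index_keys) and index_keys[pos] == key:
        return index_blocks[pos]     # pointer to the block to be read
    return None
```

Here `lookup_block("clark")` returns block 31, and a key that is not in the index returns `None`.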
A directory is a collection of nodes or entries containing information about all files in that directory.
[Figure: a directory whose entries refer to the files F1, F2, F3, F4, ..., Fn]
There are different structures for directories. When considering a particular directory structure, we need to keep
in mind the operations that are to be performed on a directory:
1) Search for a file: We need to be able to search a directory structure to find the entry for a particular file.
2) Create a file: New files need to be created and added to the directory.
3) Delete a file: When a file is no longer needed, we want to be able to remove it from the directory.
4) List a directory: We need to be able to list the files in a directory and the contents of the directory entry for
each file in the list.
5) Rename a file: Because the name of a file represents its contents to its users, we must be able to change the
name when the contents or use of the file changes.
6) Traverse the file system: We may wish to access every directory and every file within a directory structure.
Different Structures of Directory
1) Single-Level Directory
2) Two-Level Directory
3) Tree-Structured Directory
4) Acyclic-Graph Directory
5) General-Graph Directory
Single-Level Directory
All files are contained in the same directory, as shown in the following figure.
Limitations with Single-Level directory:
1) Since all files are in the same directory, they must have unique names. If two users call their data file test, then
the unique-name rule is violated.
2) Even a single user on a single-level directory may find it difficult to remember the names of all the files as the
number of files increases.
Two-Level Directory
In the two-level directory structure, each user has their own User File Directory (UFD).
The UFDs have similar structure, but each lists only the files of a single user.
The Master File Directory (MFD) is indexed by user name or account number, and each entry points to the UFD
for that user, as shown in the figure below.
Different users may have files with the same name, as long as all the file names within each UFD are unique.
To create a file for a user, the operating system searches only that user's UFD to ascertain whether another file of
that name exists.
To delete a file, the operating system confines its search to the local UFD; thus, it cannot accidentally delete
another user's file that has the same name.
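The create and delete rules above can be sketched with nested dictionaries. The user names, the file name, and the function names `create_file` and `delete_file` are hypothetical; the outer dictionary plays the role of the MFD and each inner dictionary plays the role of one UFD.

```python
# MFD: user name -> that user's UFD (file name -> file contents).
mfd = {"userA": {}, "userB": {}}


def create_file(user, name, contents=""):
    ufd = mfd[user]
    if name in ufd:                # only this user's UFD is searched
        raise FileExistsError(name)
    ufd[name] = contents


def delete_file(user, name):
    del mfd[user][name]            # search is confined to the user's own UFD


# Two users may each own a file called "test".
create_file("userA", "test", "data of A")
create_file("userB", "test", "data of B")
```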
Advantage:
Solves the name-collision problem.
Disadvantages:
This structure isolates one user from another. Isolation is an advantage when the users are completely
independent but is a disadvantage when the users want to cooperate on some task and to access one another's
files.
A two-level directory can be thought of as a tree of height 2.
To name a particular file uniquely in a two-level directory, we must give both the user name and the file name.
A tree structure allows users to create their own subdirectories and to organize their files accordingly.
The tree has a root directory, and every file in the system has a unique path name.
A directory (or subdirectory) contains a set of files or subdirectories.
One bit in each directory entry defines the entry as a file (0) or as a subdirectory (1).
An absolute path name begins at the root and follows a path down to the specified file, giving the directory names
on the path.
For example, in the tree-structured file system of the above figure, if the current directory is root/spell/mail,
then the relative path name prt/first refers to the same file as does the absolute path name root/spell/mail/prt/first.
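The relationship between the two kinds of path name can be illustrated with Python's standard `posixpath` module, which only manipulates strings and does not touch any real file system. The variable names are assumptions.

```python
import posixpath

current_directory = "root/spell/mail"
relative_path = "prt/first"

# A relative path name is resolved against the current directory.
absolute_path = posixpath.join(current_directory, relative_path)
print(absolute_path)  # prints root/spell/mail/prt/first
```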
An interesting policy decision in a tree-structured directory concerns how to handle the deletion of a directory.
If a directory is empty, its entry in the directory that contains it can simply be deleted.
However, suppose the directory to be deleted is not empty but contains several files or subdirectories. One of two
approaches can be taken.
Some systems, such as MS-DOS, will not delete a directory unless it is empty.
Thus, to delete a directory, the user must first delete all the files in that directory.
If any subdirectories exist this procedure must be applied recursively to them, so that they can be deleted also.
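The second approach is to delete the directory together with all the files and subdirectories it contains.
This policy is more convenient, but it is also more dangerous, because an entire directory structure can be
removed with one command.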
If that command is issued in error, a large number of files and directories will need to be restored (assuming a
backup exists).
With a tree-structured directory system, users can be allowed to access, in addition to their files, the files of other
users.
For example, user B can access a file of user A by specifying its path name.
When two or more programmers are working on a joint project then the files associated with that project can be
stored in a subdirectory.
An acyclic graph allows directories to share subdirectories and files, as shown in the figure below.
The same file or subdirectory may be in two different directories.
A shared file (or directory) is not the same as two copies of the file.
With a shared file, only one actual file exists, so any changes made by one person are immediately visible to the
other.
When a new file is created by one person then that file will automatically appear in all the shared subdirectories.
Disadvantages
When we traverse the file system, the same shared file may be traversed more than once.
Another problem involves deletion. When can the space allocated to a shared file be deallocated and reused?
One possibility is to remove the file whenever anyone deletes it, but this action may leave dangling pointers to the
now-nonexistent file.
Another approach to deletion is to preserve the file until all references to it are deleted.
If we start with a two-level directory and create subdirectories, a tree-structured directory is created.
When we add new files and subdirectories to an existing tree-structured directory, the tree structure is preserved.
But when we add links, the tree structure is destroyed, resulting in a simple graph structure.
One problem with a graph structure is that a program traversing it may enter an infinite loop.
With acyclic-graph directory structures, a value of 0 in the reference count means that there are no more
references to the file or directory, and the file can be deleted.
However, when cycles exist, the reference count may not be 0 even when it is no longer possible to refer to a
directory or file.
In this case, we generally need to use a garbage-collection scheme to determine when the last reference has been
deleted and the disk space can be reallocated.
Garbage collection involves traversing the entire file system, marking everything that can be accessed.
Then, a second pass collects everything that is not marked onto a list of free space.
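The two passes can be sketched on a small in-memory graph. The node names, the dictionary `links`, and the function name `collect_garbage` are hypothetical; a real file system would walk on-disk directory entries instead.

```python
def collect_garbage(root, links, all_nodes):
    """Return the set of files/directories that cannot be reached from root.

    links maps each directory to the entries it refers to.
    """
    marked = set()
    stack = [root]
    while stack:                        # pass 1: mark everything reachable
        node = stack.pop()
        if node in marked:
            continue                    # already visited; also stops on cycles
        marked.add(node)
        stack.extend(links.get(node, []))
    return set(all_nodes) - marked      # pass 2: unmarked space can be freed


# Directories x and y refer to each other, so their reference counts are not 0,
# yet neither can be reached from the root.
links = {"root": ["a"], "a": ["f1"], "x": ["y"], "y": ["x"]}
garbage = collect_garbage("root", links, ["root", "a", "f1", "x", "y"])
```

In this example `garbage` is the set containing `x` and `y`, which a reference count alone would never release.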
Allocation Methods
Three major methods of allocating disk space to files are in wide use: contiguous, linked, and indexed.
An operating system uses one method for all files within a file-system type.
Contiguous Allocation
The directory entry for each file indicates the address of the starting block and the number of blocks allocated for
that file.
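Because the blocks of a file are adjacent, the address of any block follows directly from the directory entry. A minimal sketch, in which the file name `report`, its start block, its length, and the function name `block_address` are hypothetical:

```python
# Directory: file name -> (starting block, number of blocks allocated).
directory = {"report": (14, 3)}


def block_address(name, i):
    """Return the disk block that holds the i-th block (0-based) of a file."""
    start, length = directory[name]
    if not 0 <= i < length:
        raise IndexError("block index outside the file")
    return start + i                 # blocks are contiguous on disk
```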
Advantage
Disadvantages
1) External fragmentation
As files are allocated and deleted, the free disk space is broken into pieces.
External fragmentation exists whenever free space is broken into chunks and when the largest contiguous chunk is
insufficient for a request.
Compaction can be used to solve the external fragmentation problem.
Compaction moves the allocated blocks together so that all free space forms one contiguous region.
2) Another problem with contiguous allocation is determining how much space is needed for a file.
If too little space is allocated to a file then the file cannot be extended.
If more space is allocated then some space may be wasted (internal fragmentation).
To minimize these drawbacks, some operating systems use a modified contiguous-allocation scheme.
In this scheme, a contiguous chunk of space is allocated initially; then, if that space is not enough, another chunk
of contiguous space, known as an extent, is added.
The directory entry of the file now contains the address of the starting block, the block count, and the address of
the first block of the next extent.
Linked Allocation
With linked allocation, the blocks at any position of the disk can be allocated to a file.
For example, a file of five blocks may start at block 9 and continue at block 16, then block 1, then block 10, and
finally block 25.
Each block allocated to the file contains a pointer to the next block allocated to the file.
The directory entry of a file contains pointers to the first and last blocks of the file.
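The five-block file from the example can be modelled as a chain of pointers. In this sketch the dictionary `next_block` stands in for the pointer stored inside each disk block, and the names `directory_entry` and `ith_block` are assumptions.

```python
# Pointer stored in each block of the example file: 9 -> 16 -> 1 -> 10 -> 25.
next_block = {9: 16, 16: 1, 1: 10, 10: 25, 25: None}
directory_entry = {"first": 9, "last": 25}


def ith_block(i):
    """Follow the chain from the first block to reach the i-th block (0-based)."""
    block = directory_entry["first"]
    for _ in range(i):
        block = next_block[block]    # each hop costs one disk read
        if block is None:
            raise IndexError("block index outside the file")
    return block
```

Reaching the last block with `ith_block(4)` needs four hops along the chain, one per preceding block.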
Advantages
1) There is no external fragmentation with linked allocation, and any free block on the free-space list can be used
to satisfy a request.
2) The size of a file need not be declared when that file is created. A file can continue to grow as long as free blocks
are available.
Disadvantages
1) Does not support direct access. To find the ith block of a file, we must start at the beginning of that file and
follow the pointers until we get to the ith block.
For example, a cluster is defined as four blocks. Pointers then use a much smaller percentage of the file's disk space.
The cluster mechanism improves disk throughput and decreases the space needed for free-list management.
However, this approach increases internal fragmentation, because more space is wasted when a cluster is
partially full than when a block is partially full.
3) Another problem with linked allocation is reliability. If a block allocated to a file is corrupted then it is not possible
to access the remaining blocks of the file.
Indexed Allocation
Blocks at any position of the disk can be allocated to a file as in linked allocation method.
The addresses of these blocks are stored into another block called index block.
Each file has its own index block, which is an array of disk-block addresses.
The ith entry in the index block points to the ith block of the file.
The directory entry of the file contains the address of the index block.
To find and read the ith block, we use the pointer in the ith entry of index block.
When the file is created, all pointers in the index block are set to nil.
When the ith block is first written, a block is obtained from the free-space manager and its address is put in
the ith entry of index block.
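The behaviour of an index block can be sketched with a Python list. The index size, the contents of the free-space list, and the function names `write_block` and `read_block` are assumptions; `None` plays the role of nil.

```python
INDEX_SIZE = 8                          # assumed number of entries per index block
index_block = [None] * INDEX_SIZE       # all pointers are nil at file creation
free_blocks = [25, 10, 1, 16, 9]        # hypothetical free-space list


def write_block(i):
    """Return the disk block for the i-th file block, allocating on first write."""
    if index_block[i] is None:
        index_block[i] = free_blocks.pop()   # obtain a block from the free list
    return index_block[i]


def read_block(i):
    """Direct access: one lookup in the index block, no chain to follow."""
    return index_block[i]


write_block(0)   # first write to block 0 allocates disk block 9
write_block(3)   # first write to block 3 allocates disk block 16
```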
Advantages
1) No external fragmentation
2) Files can grow
3) Supports direct access
4) More reliable than linked allocation
Disadvantages
1) Wastage of space in index block
Consider a file which occupies only one or two blocks. With linked allocation, we lose the space of only one
pointer per block.
With indexed allocation, an entire index block must be allocated, even if only one or two pointers will be non-nil.
If one index block is not enough to store the addresses of the blocks allocated to a file, then several index
blocks are allocated to the file and linked together.
The last entry in first index block contains the address of second index block.
The last entry in second index block contains the address of third index block and so on.
Multilevel index
This scheme uses a first-level index block to point to a set of second-level index blocks, which in turn point to the
file blocks.
To access a block, the operating system uses the first-level index to find a second-level index block and then uses
that block to find the desired data block.
This approach could be continued to a third or fourth level, depending on the file size.
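For a two-level index, the position of a file block in each level is a matter of integer division. The sketch below assumes 4 KB blocks and 4-byte pointers, so each index block holds 1024 pointers; these sizes and the name `locate` are assumptions.

```python
BLOCK_SIZE = 4096                 # assumed block size in bytes
POINTERS_PER_BLOCK = 1024         # assumed: 4096-byte block / 4-byte pointers


def locate(file_block):
    """Return (first-level entry, second-level entry) for a file block number."""
    return divmod(file_block, POINTERS_PER_BLOCK)


# Largest file that two levels can address under these assumptions.
max_file_bytes = POINTERS_PER_BLOCK ** 2 * BLOCK_SIZE
```

Under these assumptions, file block 5000 is found through entry 4 of the first-level index and entry 904 of the second-level index block, and two levels can address a file of up to 4 GB.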
Free Space Management
The operating system maintains a free-space list to keep track of free blocks in the disk.
The free-space list contains the addresses or numbers of free blocks in the disk.
To allocate blocks to a file, the operating system searches the free-space list, identifies the required number
of free blocks, and allocates those blocks to the file.
The allocated blocks are then removed from the free-space list.
When a file is deleted, the blocks allocated to that file are added to the free-space list.
Methods for maintaining the Free Space List
1) Bit Vector
2) Linked List
3) Grouping
4) Counting
Bit Vector
Each block is represented by one bit. If the block is free then the bit is 0; if the block is allocated then the bit is 1.
For example, consider a disk with 32 blocks (0 to 31) where blocks 2, 3, 4, 5, 8, 9, 10, 11, 12, 13, 17, 18, 25, 26,
and 27 are free and the rest of the blocks are allocated.
11000011000000111001111110001111
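The bit vector of this example can be rebuilt in a few lines, using the convention stated above (0 for free, 1 for allocated). The variable names are assumptions.

```python
free = {2, 3, 4, 5, 8, 9, 10, 11, 12, 13, 17, 18, 25, 26, 27}

# One bit per block, block 0 first: '0' means free, '1' means allocated.
bit_vector = "".join("0" if block in free else "1" for block in range(32))
print(bit_vector)  # prints 11000011000000111001111110001111

# Finding a free block is a scan for the first 0 bit.
first_free = bit_vector.index("0")   # block 2
```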
Advantages
1) Simple
Disadvantage
When the disk is large, a large amount of space is needed to store the Bit Vector.
For example, if the size of the disk is 1 GB (2^30 bytes) and the size of each block is 1 KB (2^10 bytes), then
the disk has 2^30 / 2^10 = 2^20 blocks, so the size of the Bit Vector is 2^20 bits = 2^17 bytes = 128 KB.
A 1-TB disk with 4-KB blocks requires 32 MB to store the Bit Vector.
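The arithmetic behind such figures is one bit per block, divided by 8 to get bytes. A small helper, whose name `bit_vector_bytes` is an assumption, makes the calculation explicit:

```python
def bit_vector_bytes(disk_bytes, block_bytes):
    """Bytes needed for a bit vector with one bit per disk block."""
    blocks = disk_bytes // block_bytes
    return -(-blocks // 8)            # bits to bytes, rounded up


MB = 2 ** 20
size_1tb = bit_vector_bytes(2 ** 40, 4 * 2 ** 10)   # 1 TB disk, 4 KB blocks
print(size_1tb // MB)  # prints 32
```

The same helper gives 128 KB for a 1 GB disk with 1 KB blocks.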
Linked List
The address of first free block is stored in a special location in the disk.
The first free block contains a pointer to the next free block, and so on.
For example, consider a disk where blocks 2, 3, 4, 5, 8, 9, 10, 11, 12, 13, 17, 18, 25, 26, and 27 are free and the
rest of the blocks are allocated.
In this situation, the address of block 2 is stored in the special location in the disk.
Block 2 will contain a pointer to block 3, which will point to block 4, which will point to block 5, which will point
to block 8, and so on.
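The chain of free blocks in this example can be sketched as follows. The dictionary `next_free` stands in for the pointer stored inside each free block, `state["head"]` stands in for the special location on the disk, and the name `allocate_block` is an assumption.

```python
free_order = [2, 3, 4, 5, 8, 9, 10, 11, 12, 13, 17, 18, 25, 26, 27]

# Pointer stored in each free block: 2 -> 3 -> 4 -> 5 -> 8 -> ... -> 27 -> None.
next_free = dict(zip(free_order, free_order[1:] + [None]))
state = {"head": 2}                  # special location: address of first free block


def allocate_block():
    """Take the first free block off the list and advance the head pointer."""
    block = state["head"]
    if block is None:
        raise RuntimeError("no free blocks")
    state["head"] = next_free.pop(block)
    return block
```

Allocating one block only touches the head of the list, but finding the k-th free block means following k pointers.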
Advantage
No space is wasted, i.e., there is no need to store the addresses of free blocks separately.
Disadvantage
The addresses of n free blocks are stored in the first free block.
The last block contains the addresses of another n free blocks, and so on.
Advantage
Block number 10 contains the addresses of blocks 11, 12, 13, 17, 18, 25.
When disk space is allocated with the contiguous allocation algorithm, several contiguous blocks
may be allocated or freed simultaneously.
Instead of keeping a list of n free block addresses, we can keep the address of the first free block and the
number (n) of contiguous free blocks that follow the first free block.
Each entry in the free-space list then consists of a disk address and a count.
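The free blocks used in the earlier examples can be converted into such entries. This sketch assumes that the count includes the first free block of each run; the function name `to_runs` is an assumption.

```python
def to_runs(free_blocks):
    """Collapse free block numbers into (first block, count) entries."""
    runs = []
    for block in sorted(free_blocks):
        if runs and block == runs[-1][0] + runs[-1][1]:
            # The block directly follows the current run: extend its count.
            runs[-1] = (runs[-1][0], runs[-1][1] + 1)
        else:
            runs.append((block, 1))  # start a new run
    return runs


free = [2, 3, 4, 5, 8, 9, 10, 11, 12, 13, 17, 18, 25, 26, 27]
print(to_runs(free))  # prints [(2, 4), (8, 6), (17, 2), (25, 3)]
```

Fifteen free block addresses shrink to four entries, which is why counting works well when free space comes in long runs.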
Free Space List