
CS111, Lecture 2

Introduction to Filesystems

Optional reading:
Operating Systems: Principles and Practice (2nd Edition): Chapter 11,
Section 12.1, 12.2 and Section 13.3 (up through page 567)

While you’re waiting – get set up with PollEverywhere!


Visit pollev.stanford.edu to set up your account.
This document is copyright (C) Stanford Computer Science and Nick Troccoli, licensed under
Creative Commons Attribution 2.5 License. All rights reserved.
Based on slides and notes created by John Ousterhout, Jerry Cain, Chris Gregg, and others.
NOTICE RE UPLOADING TO WEBSITES: This content is protected and may not be shared,
uploaded, or distributed. (without expressed written permission)
PollEverywhere
• Today we’re doing a “trial run” of using PollEverywhere for poll questions
• Not counted for attendance (that starts next lecture), just a chance to try it out – there’s
also a sample not-counted “lecture 2 quiz” on Canvas.
• Confirm responses went through in Canvas Gradebook after lecture
• Responses not anonymized, but we look only at aggregated results and totals
• Polls are for live in-person response in lecture
• Option 2 for lecture credit is to complete Canvas quiz (also starts with next lecture)
• Visit pollev.stanford.edu to log in (or use the PollEverywhere app) and sign in
with your @stanford.edu email – NOT your personal email!
• Compatible with any device with a web browser, mobile app also available, or
you can respond via text – however, to respond via text you must first log in
via a web browser and add your phone number to your profile.
• Poll questions in slides will activate the poll - respond at pollev.com/cs111.
Announcements
• Assign0 released - see course website for more information
• No late submissions accepted, except for OAE/Head TA accommodations. During the
quarter, extension requests must be received before the assignment's on-time
deadline, or as soon as possible if extenuating circumstances arise later or
prevent reaching out before the deadline.
• Remember to input your section preferences by 11:59PM Thurs! Link is on the
course website (under “Sections”).
• Helper Hours now scheduled, starting this week!
• Please let the Head TA know about OAE accommodations and midterm
conflicts as soon as you can – [email protected]

Topic 1: Filesystems - How can
we design filesystems to manage files
on disk, and what are the tradeoffs
inherent in designing them? How
can we interact with the filesystem in
our programs?

Why is answering this question important?


• Helps us understand what filesystems do (today and next time)
• Provides insight into the challenges and tradeoffs in designing large systems
(next few lectures)
• Shows us how we can directly manipulate files in our programs (next week)
assign1: implement layers of the Unix v6 filesystem to read a file from disk given its
path.
CS111 Topic 1: Filesystems
• Filesystems introduction and design (Today)
• Case study: Unix V6 Filesystem (Lectures 3-4)
• Filesystem system calls and file descriptors (Lecture 5)
• Crash recovery (Lectures 6-7)

assign1: implement portions of the Unix v6 filesystem!

Learning Goals
• Understand the key responsibilities and requirements of a filesystem
• Get practice identifying tradeoffs in different filesystem designs
• Explore the design of the Unix V6 filesystem

Plan For Today
• Filesystems Introduction
• Methods for Storing Files
• Contiguous Allocation
• Linked Files
• Windows FAT
• Multi-level indexes
• The Unix V6 Filesystem
• Inodes

Filesystems
A filesystem is the portion of the OS that manages the disk.
• A hard drive (or, more commonly these days, flash storage) is persistent
storage – it can store data between power-offs.

Memory (RAM)
• Fast, less space, more expensive
• Byte-addressable: can quickly access any byte of data by address, but not
individual bits by address
• Not persistent: cannot store data between power-offs

Disk
• Slower, more space, cheaper
• Sector-addressable: cannot read/write just one byte of data – can only
read/write “sectors” of data at a time
• Persistent: stores data between power-offs
Hard Drives
Magnetic disks (hard drives) have been the
standard storage mechanism for files.
• Spinning, magnetically-coated platters
• Actuator arm positions heads, which can read
and write data on the magnetic surfaces
• Moving parts means risk of damage from
sudden movement, dust, etc.

Hard drives have peculiar performance
characteristics that have a big impact on how
we build filesystems.
• Reading and writing requires seeking (moving
arm to position heads over desired track) and
waiting for desired location to pass
underneath. Want to minimize this time.
• We can only read data in chunks of sectors.
Example of virtualization; making one thing
look like another.


Hard Disks are Sector-Addressable


[Figure: the disk drawn as a row of consecutive sectors, sector 0 through sector 6]

If we are the OS, the hard disk creators might provide this API (“application
programming interface”) – a set of public functions - to interface with the disk:

void readSector(size_t sectorNumber, void *data);


void writeSector(size_t sectorNumber, const void *data);

This is all we get! We (the OS) must build a filesystem by layering functions on
top of these to ultimately allow us to read, write, look up, and modify entire files.
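
To make the idea of layering concrete, here is a minimal sketch in which the disk is simulated by an in-memory array. The sector size, the disk size, and the helpers readByte and writeByte are assumptions made for illustration; only the readSector and writeSector signatures come from the slide.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SECTOR_SIZE 512   /* assumed sector size in bytes */
#define NUM_SECTORS 64    /* assumed size of the simulated disk */

/* Stand-in for the physical disk: just an array in memory. */
static unsigned char fakeDisk[NUM_SECTORS][SECTOR_SIZE];

/* Copies one whole sector from the disk into data (SECTOR_SIZE bytes). */
void readSector(size_t sectorNumber, void *data) {
    assert(sectorNumber < NUM_SECTORS);
    memcpy(data, fakeDisk[sectorNumber], SECTOR_SIZE);
}

/* Copies SECTOR_SIZE bytes from data onto the disk. */
void writeSector(size_t sectorNumber, const void *data) {
    assert(sectorNumber < NUM_SECTORS);
    memcpy(fakeDisk[sectorNumber], data, SECTOR_SIZE);
}

/* Hypothetical helper layered on top: read one byte at an absolute disk
 * offset. Even for a single byte we must fetch the whole enclosing sector. */
unsigned char readByte(size_t byteOffset) {
    unsigned char sector[SECTOR_SIZE];
    readSector(byteOffset / SECTOR_SIZE, sector);
    return sector[byteOffset % SECTOR_SIZE];
}

/* Hypothetical helper: change one byte with a read-modify-write of its sector. */
void writeByte(size_t byteOffset, unsigned char value) {
    unsigned char sector[SECTOR_SIZE];
    readSector(byteOffset / SECTOR_SIZE, sector);
    sector[byteOffset % SECTOR_SIZE] = value;
    writeSector(byteOffset / SECTOR_SIZE, sector);
}
```

Note that writing a single byte costs one full sector read plus one full sector write, which is one reason filesystems work in larger units.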
Filesystem Functionality
We want to read/write files on disk and have them persist even when the device
is off. This may include operations like:

• creating a new file on disk


• looking up the location of a file on disk
• reading/editing all or part of an existing file on disk – e.g.,
sequential/random access
• creating folders on disk
• getting the contents of folders on disk
• ...

Filesystems
[Figure: layers from top to bottom: functions for user programs to read/write files; the filesystem; readSector and writeSector]
Filesystem Challenges
Problems addressed by modern file systems:
• Disk space management:
• Fast access to files (minimize seeks)
• Sharing space between users
• Efficient use of disk space
• Naming: how do users select files?
• Reliability: information must survive OS crashes and hardware failures.
• Protection: isolation between users, controlled sharing.

Flash Storage
Recently, flash storage (“SSD”) has become
more popular and commonplace, especially
with the growth in mobile devices.
• Much faster (100x faster access), but more
expensive
• No moving parts, so more reliable
• Issues with wear-out; once a chunk of the drive has been erased many times
(~100k), it no longer stores info reliably.
• Typically, still only supports reading/writing in units of sectors.
(Image: https://www.samsung.com/us/computing/memory-storage/solid-state-drives/980-pro-pcie-4-0-nvme-ssd-1tb-mz-v8p1t0b-am/)

Sectors and Blocks
A filesystem generally defines its own unit of data, a "block," that it reads/writes
at a time.
• "Sector" = hard disk storage unit
• "Block" = filesystem storage unit (1 or more sectors) - software abstraction

Pros of larger block size? Smaller block size?
• E.g. fewer operations if larger, but smaller files may read in more data than
necessary

Example: the block size could be defined as two sectors.
[Figure: blocks 0, 1, 2, … drawn above sectors 0 through 6, two sectors per block]
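
The two-sectors-per-block example can be sketched as follows. The disk is again simulated in memory, and the helper names firstSectorOfBlock, readBlock, and writeBlock are hypothetical, as are the sector size and disk size.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SECTOR_SIZE 512          /* assumed sector size in bytes */
#define NUM_SECTORS 64           /* assumed size of the simulated disk */
#define SECTORS_PER_BLOCK 2      /* block size chosen by the filesystem */
#define BLOCK_SIZE (SECTOR_SIZE * SECTORS_PER_BLOCK)

static unsigned char fakeDisk[NUM_SECTORS][SECTOR_SIZE];

void readSector(size_t sectorNumber, void *data) {
    assert(sectorNumber < NUM_SECTORS);
    memcpy(data, fakeDisk[sectorNumber], SECTOR_SIZE);
}

void writeSector(size_t sectorNumber, const void *data) {
    assert(sectorNumber < NUM_SECTORS);
    memcpy(fakeDisk[sectorNumber], data, SECTOR_SIZE);
}

/* Block b covers sectors b*SECTORS_PER_BLOCK .. b*SECTORS_PER_BLOCK + 1. */
size_t firstSectorOfBlock(size_t blockNumber) {
    return blockNumber * SECTORS_PER_BLOCK;
}

/* Reads one whole block (BLOCK_SIZE bytes) by reading each of its sectors. */
void readBlock(size_t blockNumber, void *data) {
    unsigned char *out = data;
    size_t first = firstSectorOfBlock(blockNumber);
    for (size_t i = 0; i < SECTORS_PER_BLOCK; i++) {
        readSector(first + i, out + i * SECTOR_SIZE);
    }
}

/* Writes one whole block (BLOCK_SIZE bytes) by writing each of its sectors. */
void writeBlock(size_t blockNumber, const void *data) {
    const unsigned char *in = data;
    size_t first = firstSectorOfBlock(blockNumber);
    for (size_t i = 0; i < SECTORS_PER_BLOCK; i++) {
        writeSector(first + i, in + i * SECTOR_SIZE);
    }
}
```

With this mapping, block 2 lives in sectors 4 and 5. A larger SECTORS_PER_BLOCK means fewer block operations per file, at the cost of reading more unused bytes for small files.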
Storing Files on Disk
Two types of data we will be working with:
1. file payload data - contents of files (e.g. text in documents, pixels in images)
2. file metadata - information about files (e.g. name, size)

Key insight: both must be stored on the hard disk. Otherwise, we will not have
them across power-offs! (E.g. without storing metadata we would lose all filenames
after shutdown.) This means some blocks must store data other than payload
data.

Contiguous Allocation
First key question: should we store files contiguously on disk? What would it
look like if we did?
• Called contiguous allocation – allocate a file in one contiguous group of blocks
• For each file, keep track of the number of its first sector and its length
• Keep a free list of unused areas of the disk
• Example: IBM OS/360
Advantages:
• simple
• can read sequentially or easily jump to any location in file (“random access”)
• all data in one place (few seeks)
Disadvantages:
• hard to grow files
• hard to lay out files on disk – we may not be able to squeeze a new file in a
block of free space (external fragmentation – occurs when we have space on
disk, but can’t use it to store files)
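
Both sides of the tradeoff can be sketched in a few lines. The ContiguousFile struct, the 7-block disk, the 512-byte block size, and the first-fit search below are illustrative assumptions, not how any particular system implemented contiguous allocation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_BLOCKS 7              /* tiny disk, matching blocks 0..6 */
#define BLOCK_SIZE 512            /* assumed block size in bytes */
#define NO_SPACE ((size_t)-1)     /* sentinel: no suitable free run exists */

/* All the bookkeeping a contiguous file needs: where it starts, how long it is. */
typedef struct {
    size_t firstBlock;
    size_t numBlocks;
} ContiguousFile;

/* Random access is a single addition: no links or tables to follow. */
size_t blockForOffset(const ContiguousFile *file, size_t byteOffset) {
    assert(byteOffset < file->numBlocks * BLOCK_SIZE);
    return file->firstBlock + byteOffset / BLOCK_SIZE;
}

/* Counts free blocks anywhere on the disk. */
size_t countFree(const bool used[NUM_BLOCKS]) {
    size_t freeBlocks = 0;
    for (size_t i = 0; i < NUM_BLOCKS; i++) {
        if (!used[i]) freeBlocks++;
    }
    return freeBlocks;
}

/* First-fit search for `wanted` adjacent free blocks.
 * Returns the first block of the run, or NO_SPACE if no run is long enough. */
size_t findContiguousRun(const bool used[NUM_BLOCKS], size_t wanted) {
    assert(wanted > 0);
    size_t runStart = 0, runLength = 0;
    for (size_t i = 0; i < NUM_BLOCKS; i++) {
        if (used[i]) {
            runLength = 0;
            continue;
        }
        if (runLength == 0) runStart = i;
        runLength++;
        if (runLength == wanted) return runStart;
    }
    return NO_SPACE;
}
```

If blocks 0, 2, and 4 are in use, four blocks are free in total, yet findContiguousRun cannot place a 3-block file: that gap between countFree and what can actually be allocated is external fragmentation.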


Linked Files
First key question: should we store files contiguously on disk? What would it
look like if we didn’t?
• Problem: we need to know what blocks are associated with what files
One idea: linked files – like a linked list
• Each block contains file data as well as the location of the next block
• For each file, keep track of the number of its first block in separate location
• Approximate examples: TOPS-10, Xerox Alto
File 0 Start: 10
File 1 Start: 12
File 2 Start: 13

block   file     next
10      File 0   14
11      File 2   END
12      File 1   END
13      File 2   15
14      File 0   END
15      File 2   11
Advantages:
• Easy to grow files
• Easier to fit files in available space – less fragmentation
• Still supports simple sequential access
Disadvantages:
• Can’t easily jump to any arbitrary location in the file
• Data scattered throughout disk (more seeks)
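
The cost of jumping to an arbitrary location can be sketched with a small simulation of the example layout (File 0: 10, 14; File 1: 12; File 2: 13, 15, 11). The LinkedBlock struct, the blocksRead counter, and the function names are hypothetical and exist only to count how many blocks must be read.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_BLOCKS 16
#define END_OF_FILE (-1)

/* Each block stores a link to the next block alongside its payload. */
typedef struct {
    int next;          /* block number of the next block, or END_OF_FILE */
    char payload[8];   /* tiny payload, unused in this sketch */
} LinkedBlock;

static LinkedBlock disk[NUM_BLOCKS];   /* simulated disk */
static int blocksRead = 0;             /* counts simulated disk reads */

/* Simulated block read; every call stands for one disk access. */
const LinkedBlock *readLinkedBlock(int blockNumber) {
    assert(blockNumber >= 0 && blockNumber < NUM_BLOCKS);
    blocksRead++;
    return &disk[blockNumber];
}

/* Lays out the links from the example: 10->14, 13->15->11, 12 alone. */
void setUpExampleDisk(void) {
    disk[10].next = 14;
    disk[11].next = END_OF_FILE;
    disk[12].next = END_OF_FILE;
    disk[13].next = 15;
    disk[14].next = END_OF_FILE;
    disk[15].next = 11;
}

/* Returns the disk block holding the index-th block of a file (0-based),
 * or END_OF_FILE if the file is shorter than that.
 * Reaching block n requires reading the n blocks before it. */
int nthBlockOfFile(int firstBlock, size_t index) {
    int current = firstBlock;
    for (size_t i = 0; i < index && current != END_OF_FILE; i++) {
        current = readLinkedBlock(current)->next;
    }
    return current;
}
```

Finding the third block of File 2 means reading blocks 13 and 15 just to learn their links, so the cost of random access grows linearly with the position in the file.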

Windows FAT
First key question: should we store files contiguously on disk? What would it
look like if we didn’t?
Interesting idea: what if we instead stored the links in one big table in memory?
• Windows (DOS) FAT: like linked allocation, except links aren’t in blocks, they
are in a “file allocation table” in memory and on disk (originally 16 bits per entry)
• Still keep track of each file’s first block
• (Still used today for flash sticks, digital cameras, many embedded devices)

In-Memory File Allocation Table
block   next
…
10      14
11      END
12      END
13      15
14      END
15      11
…

Disk
File 0 Start: 10
File 1 Start: 12
File 2 Start: 13

block   file
10      File 0
11      File 2
12      File 1
13      File 2
14      File 0
15      File 2

Another example:
File Allocation Table
block   next
0       free
1       2
2       end
3       end
4       3
5       end
6       4
7       free
File A: 6 4 3
File B: 1 2

Advantages:
• Can more quickly jump to various locations in a file
• Still supports easy sequential access
Disadvantages:
• Data scattered throughout disk (more seeks)
• Still need to jump through table to get to an arbitrary location in the file
• Must store table in memory
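
A table-based lookup can be sketched with the 8-entry example table (File A: 6, 4, 3; File B: 1, 2). The sentinel values FAT_FREE and FAT_END and the function names are assumptions for illustration; real FAT variants encode these markers differently.

```c
#include <assert.h>
#include <stddef.h>

#define FAT_ENTRIES 8
#define FAT_FREE (-2)   /* assumed marker for an unused block */
#define FAT_END  (-1)   /* assumed marker for the last block of a file */

/* fat[b] is the block that follows block b in its file. */
static const int fat[FAT_ENTRIES] = {
    FAT_FREE,  /* 0 */
    2,         /* 1 -> 2 */
    FAT_END,   /* 2 */
    FAT_END,   /* 3 */
    3,         /* 4 -> 3 */
    FAT_END,   /* 5 */
    4,         /* 6 -> 4 */
    FAT_FREE   /* 7 */
};

/* Returns the disk block holding the index-th block of a file (0-based),
 * or FAT_END if the file is shorter. The walk touches only the in-memory
 * table, never the data blocks themselves. */
int fatNthBlock(int firstBlock, size_t index) {
    int current = firstBlock;
    for (size_t i = 0; i < index && current != FAT_END; i++) {
        assert(current >= 0 && current < FAT_ENTRIES);
        current = fat[current];
    }
    return current;
}

/* Counts the blocks in a file by following its chain to the end. */
size_t fatFileLength(int firstBlock) {
    size_t count = 0;
    for (int current = firstBlock; current != FAT_END; current = fat[current]) {
        assert(current >= 0 && current < FAT_ENTRIES);
        count++;
    }
    return count;
}
```

The walk is still linear in the block index, but each step is a memory access rather than a disk read, which is why jumping around a file is faster than with links stored in the blocks.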
File Payload Data
First key question: should we store files contiguously on disk? What would it
look like if we didn’t?
Interesting idea: what if we did not have a file allocation table or links, and instead
stored all the block numbers for a file in order? That way we could quickly
jump to any point in the file.

File 0: 10, 14
File 1: 12
File 2: 13, 15, 11

[Figure: disk blocks 10 through 15 holding, in order, File 0, File 2, File 1, File 2, File 0, File 2]
• Multi-level indexes: store all block numbers for a given file (but how?)
• Example: 4.3 BSD Unix, Unix V6 Filesystem (~1975)
• More modern ext2 and ext3 Linux file systems based on this idea; Windows NTFS also
uses a tree-based structure, though slightly different
Unix V6 Filesystem
Key Idea: files don’t need to be stored contiguously on disk, but we want to
store, in order, all the block numbers that make up a file’s data.
Where could we store this information for each file for easy lookup?

Let’s reserve some space on disk to store this information for each file,
separately from its payload data. This per-file space is called an inode.

Inodes
An inode ("index node") is a grouping of data about a single file, stored on disk.
• For Unix v6, an inode contains an ordered list of block numbers that store the
file’s payload data, and also stores other metadata like file size.
• Unix v6 stores inodes on disk together in a reserved portion of blocks starting
at block 2, called the inode table, for quick access.
• Inodes can be read into memory when used for quicker access
• Some other filesystems (e.g., contiguous allocation/linked files, but not FAT)
store file metadata in inodes, too

Unix V6 Inodes
The Unix v6 filesystem stores inodes on disk together in the inode table for
quick access.
• Inodes are 32 bytes each, and 1 block = 1 sector = 512 bytes, so 16 inodes/block.
• Inodes are stored in a reserved region starting at block 2 (block 0 is the "boot
block" containing hard drive info, block 1 is the "superblock" containing filesystem
info). Typically, at most 10% of the drive stores metadata.
• Filesystem goes from filename to inode number ("inumber") to file data.
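
The arithmetic for locating an inode from its inumber can be sketched as follows. The sketch assumes inumbers start at 1, which the slide does not state, and the InodeLocation struct and locateInode function are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define BLOCK_SIZE 512
#define INODE_SIZE 32
#define INODES_PER_BLOCK (BLOCK_SIZE / INODE_SIZE)   /* 512 / 32 = 16 */
#define INODE_START_BLOCK 2   /* block 0 = boot block, block 1 = superblock */

/* Where an inode lives on disk: which block, and where inside that block. */
typedef struct {
    size_t blockNumber;
    size_t byteOffset;
} InodeLocation;

/* Assumption: inumbers start at 1, so inumber 1 is the first inode in block 2. */
InodeLocation locateInode(size_t inumber) {
    assert(inumber >= 1);
    size_t index = inumber - 1;   /* convert to a 0-based index */
    InodeLocation loc;
    loc.blockNumber = INODE_START_BLOCK + index / INODES_PER_BLOCK;
    loc.byteOffset = (index % INODES_PER_BLOCK) * INODE_SIZE;
    return loc;
}
```

Under these assumptions, inumbers 1 through 16 all fall in block 2, and inumber 17 is the first inode in block 3. Because inodes have a fixed size, the lookup needs no search at all.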

We need inodes to be a fixed size, and not too large. So how should we store
the block numbers? How many should there be?
1. if variable number, there's no fixed inode size
2. if fixed number, this limits maximum file size

The inode design here has space for 8 block numbers, which are stored in
order (i.e. the first block number identifies the block holding the first chunk
of the file, etc.). We will see later how we can build on this to support very
large files.
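
A simplified sketch of this design follows. SimpleInode is a made-up struct for illustration and is not the actual 32-byte Unix V6 inode layout, which holds additional metadata; only the 8 block numbers and the 512-byte block size come from the slides.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE 512
#define BLOCKS_PER_INODE 8
/* With only these 8 block numbers, a file can be at most 8 * 512 = 4096 bytes. */
#define MAX_DIRECT_FILE_SIZE (BLOCKS_PER_INODE * BLOCK_SIZE)

/* Simplified, hypothetical inode: file size plus an ordered list of blocks. */
typedef struct {
    uint32_t fileSize;                          /* size in bytes */
    uint16_t blockNumbers[BLOCKS_PER_INODE];    /* in file order */
} SimpleInode;

/* Returns the disk block holding the given byte of the file,
 * or -1 if the offset is past the end of the file.
 * One array index replaces the link-following of the earlier designs. */
int blockForOffset(const SimpleInode *inode, size_t byteOffset) {
    assert(inode->fileSize <= MAX_DIRECT_FILE_SIZE);
    if (byteOffset >= inode->fileSize) return -1;
    return inode->blockNumbers[byteOffset / BLOCK_SIZE];
}
```

For a 1200-byte file stored in blocks 13, 15, 11, byte 600 maps to index 1 and therefore block 15, found in constant time. The fixed array is also what caps the file size, which motivates the extensions covered later.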

Recap
• Filesystems Introduction
• Methods for Storing Files
• Contiguous Allocation
• Linked Files
• Windows FAT
• Multi-level indexes
• The Unix V6 Filesystem
• Inodes

Lecture 2 takeaway: Filesystems need to store both file metadata and payload
data. There are various ways to store payload data, each with different
pros/cons. The Unix V6 filesystem uses inodes to store file data, including
block numbers.

Next time: more about the Unix v6 Filesystem
