Comp 3000 - Exam Notes Before Midterm 1
Fork
Child gets a new, unique PID
Library call in C (a libc wrapper around the underlying system call)
Makes a copy of the parent
Exec
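A minimal sketch of the fork-then-exec pattern, written in Python because its os module exposes the same calls. The helper name run_child and the exit code 7 are made up for illustration; 127 is just a conventional "exec failed" code.

```python
import os
import sys


def run_child(argv):
    """Fork, exec argv in the child, and return (child_pid, exit_code)."""
    pid = os.fork()                      # child is a copy of the parent with a new PID
    if pid == 0:
        try:
            os.execvp(argv[0], argv)     # replace the child's memory image with argv[0]
        finally:
            os._exit(127)                # reached only if exec failed
    _, status = os.waitpid(pid, 0)       # parent waits for (reaps) the child
    return pid, os.WEXITSTATUS(status)


if __name__ == "__main__":
    child_pid, code = run_child([sys.executable, "-c", "raise SystemExit(7)"])
    print(child_pid != os.getpid(), code)
```

fork gives the copy, exec swaps in a new program, and wait collects the exit status.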
External Fragmentation
When free memory is not contiguous
EXAMPLE
Paging
Removes the need for contiguous allocation in physical memory
Used to address EXTERNAL FRAGMENTATION
Pages need to be of the same size so that there is no external fragmentation
Memory is managed and allocated in pages of the same size, tracked in page tables
System Calls
Invoked when an OS service is needed.
I/O is done by the OS
Expensive
Library Calls
Implemented in library files (e.g. shared objects, .so).
Used to increase portability and abstraction
You can dynamically link them so that they don't live in the program binary
Will often make sys calls.
Terminals
Virtual Terminals
Pseudo Terminals
Terminal Emulator
Getty
Password Files
/etc/passwd
Account info
- username
- userid
- group id
- full name
- home directory
- shell
Password not stored in here
/etc/shadow
Stores password hashes and salts so that a user can log in
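To make the /etc/passwd field list concrete, here is a small parsing sketch. The entry for "alice" is hypothetical, and PasswdEntry and parse_passwd_line are names invented for this example. It assumes the usual seven colon-separated fields.

```python
from typing import NamedTuple


class PasswdEntry(NamedTuple):
    username: str
    password: str    # usually "x": the real hash lives in /etc/shadow
    uid: int
    gid: int
    full_name: str
    home: str
    shell: str


def parse_passwd_line(line: str) -> PasswdEntry:
    """Split one /etc/passwd line into its seven colon-separated fields."""
    name, pw, uid, gid, full_name, home, shell = line.rstrip("\n").split(":")
    return PasswdEntry(name, pw, int(uid), int(gid), full_name, home, shell)


if __name__ == "__main__":
    # Hypothetical entry, made up for illustration.
    print(parse_passwd_line("alice:x:1000:1000:Alice Example:/home/alice:/bin/bash"))
```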
Login Process
Init
getty
login
bash
programs
User and Group IDs
UID GID
The UID and GID identify the user and the user's group
For example
Root UID: 0
EUID EGID
By default it is the same as the user id.
When an unprivileged process calls the setuid() syscall it doesn't actually change the real uid
It changes the EUID
- Effective user id.
(A privileged process calling setuid() sets the real, effective and saved UIDs.)
EUID and EGID are used to check permission bits.
Only changed for the calling process
And child processes created afterwards
Parent and sibling processes don't inherit the new EUID and EGID
sudo changes the EUID and EGID
Permission Bits
Split up into 3 triples of bits (one octal digit each):
The first is OWNER permissions
The second is Group permissions (users in the file's group)
The third is Other permissions (other users that are not the owner or in the file's
group)
Can be represented by rwx or by an octal digit (a triple of bits: 000)
There is also an s or S that can be used in place of the x in the rwx.
These are setuid and setgid bits. Anytime the program runs, it will be run with
the permissions of the owner, or group respectively.
A lowercase s means that the execute bit is also set for the user/group, and
whoever runs the program gets the euid/egid of the file's owner/group.
An uppercase S means the setuid/setgid bit is set but the execute bit is not.
Bits:
mapped to rwx with a 1 or 0.
For the set uid and gid parts, there is a 4th triple added onto the front that
represents whether setuid and setgid are set
1st bit is setuid
2nd bit is setgid
3rd is sticky (we don't need to know)
Octal
Each triple of bits is written as one octal digit (r = 4, w = 2, x = 1)
Examples
Setgid set but group execute cleared (an S in the group triple): all users running it
have their EGID set to the file's group, but the people in that group cannot execute
it (seems useless, but can be useful in very specific scenarios)
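A quick way to check how octal modes map to rwx strings, including s and S, is Python's stat.filemode. The helper name describe_mode and the three modes are picked only for illustration.

```python
import stat


def describe_mode(mode: int) -> str:
    """Render permission bits (including setuid/setgid/sticky) as an rwx string."""
    # stat.filemode expects a full st_mode, so mark it as a regular file.
    return stat.filemode(stat.S_IFREG | mode)


if __name__ == "__main__":
    for mode in (0o755, 0o4755, 0o2745):
        print(oct(mode), describe_mode(mode))
```

The leading octal digit carries setuid (4), setgid (2) and sticky (1). In 0o2745 the setgid bit is set while group execute is cleared, which is rendered as an uppercase S.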
Zombie process
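Parent must use wait() on the child
If the child exits before the parent has waited on it, it becomes a zombie.
A zombie is "reaped", or removed from the process table, when the parent calls wait().
If the parent never does, the only way to reap it is to kill the parent, so that
init adopts and reaps it.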
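A rough sketch of watching a zombie appear and then reaping it. It assumes Linux with /proc mounted; process_state and zombie_then_reap are names invented for this example.

```python
import os
import time


def process_state(pid):
    """Return the one-letter state from /proc/<pid>/stat (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        # The state is the first field after the "(command name)" part.
        return f.read().rsplit(")", 1)[1].split()[0]


def zombie_then_reap():
    """Return (state seen before wait, whether /proc/<pid> exists after wait)."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)                  # child exits immediately
    state = ""
    for _ in range(200):             # parent has not called wait() yet
        state = process_state(pid)
        if state == "Z":
            break
        time.sleep(0.01)
    os.waitpid(pid, 0)               # reap: removes the process-table entry
    return state, os.path.exists(f"/proc/{pid}")


if __name__ == "__main__":
    print(zombie_then_reap())
```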
Special Files
Files that do special things
Device Drivers
etc.
Drivers
Exposed to user space through special files that live in the /dev file system
Driver Hierarchy
Device/Special File
Device Files have:
A major number
Represents the type of device
The kernel uses it to pick the driver
A minor number
Distinguishes between devices of the same type
The driver interprets it
Character/Block Devices
Special files that represent a physical device
Link driver in kernel space to user space
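As an illustration of major and minor numbers, the sketch below reads them from a device file's stat data. It assumes a Linux system where /dev/null exists; device_numbers is a hypothetical helper name.

```python
import os
import stat


def device_numbers(path):
    """Return (kind, major, minor) for a device special file."""
    st = os.stat(path)
    if stat.S_ISCHR(st.st_mode):
        kind = "char"
    elif stat.S_ISBLK(st.st_mode):
        kind = "block"
    else:
        raise ValueError(f"{path} is not a device file")
    # st_rdev encodes both numbers; os.major/os.minor split them apart.
    return kind, os.major(st.st_rdev), os.minor(st.st_rdev)


if __name__ == "__main__":
    print(device_numbers("/dev/null"))
```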
What ifs?
Interruption while updating on-disk structures
Crash inconsistency
Metadata is gone, super bad
Inodes are good but with missing/inconsistent blocks
Random data on the block
Good blocks but missing inodes
Can't find the data because
the file becomes orphaned
We can find them again
Optimal outcome if it has to happen
Superblock is corrupted
Do a repair
Fetch backup copies
First-Half LECTURES
Intro
What is an OS?
OS = kernel + UI + language runtime
Role of an OS?
Resource management
Hardware
Software
Abstraction
What resources does it manage?
CPU
Time shares
Shares CPU between tasks
Illusion:
Each task owns the whole CPU
Memory
Memory allocation and placements
Illusion:
Each task owns the whole address space
Disk
- Simplified view of data, i.e. file systems
- Unified storage management despite heterogeneous devices
This all helps with hardware support
Abstraction
Means to achieve
Simplicity
Security
Containing errors
Seg Fault
BSOD
Kernel Panic
Portability
Create an Illusion
A view or interface
Present new semantics
Comparison to virtualization
Types of Kernels
Monolithic
Used by:
Desktops
Mobile
Server
Microkernel
Used by
- Real-Time operating systems
- IoT devices
Everything above the top red line is in the user space
Supervisor vs User mode
Kernel Space
Privileged
Runs on Ring 0
Can do anything
OS kernel
Device Driver
User Space
Runs at Ring 3
No I/O Access
needs to request from kernel
File reads
Network
Peripherals
Contained by the privileged (in the process)
Processes
Definition
PID
Memory Image (content)
CPU Context
I/O Resources
Lifecycle of a program
Label (UID)
Purposes
Accountability
Security
Root is a user, a special user
UID = 0
Not ring 0
Files
What is a file?
Structure
File Systems
Function calls
Lib Calls
Dynamic lib calls are made with the help of the OS (dynamically loading new code)
Sys Calls
Note
POSIX
Abstraction
Key concepts
Made possible by
Omnipotence - (Higher Privilege)
Omniscience - (Sees everything)
OS must have both
Abstraction
New semantics
Virtualization
making a copy
Direct Exec
Indirect Exec:
Purposes
Security
Simplicity
Maximizing utilization
Expected Illusion
Execution Context
IP - 16 bits
EIP - 32 bits
RIP - 64 bits
GDB
Process abstraction
Running a program
Loaded into mem as binary
Hidden
New Semantics
Schedules
Mechanisms
How it is done
Policies
What is done (which decision is made)
Mechanism
Execution Context
Policy
Parameters of exec context
Maxing CPU utilization
How to get IO?
Ask outside resources
Pending IO it can finish
Containment
Be fair
Round Robin
Time Slice
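Round robin with a fixed time slice can be sketched as a simple queue simulation. The task names and burst times are hypothetical, and the sketch ignores context-switch cost and I/O.

```python
from collections import deque


def round_robin(burst_times, time_slice):
    """Return the (task, ran_for) slices in the order they would execute."""
    queue = deque(burst_times.items())
    schedule = []
    while queue:
        name, remaining = queue.popleft()
        ran = min(time_slice, remaining)
        schedule.append((name, ran))
        if remaining > ran:
            queue.append((name, remaining - ran))   # unfinished: back of the line
    return schedule


if __name__ == "__main__":
    print(round_robin({"A": 5, "B": 2, "C": 3}, time_slice=2))
```

Here the queue and the slice-by-slice switching are the mechanism, and the choice of round robin and the slice length are the policy.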
Context Switch
Mechanism to
Save context of one process
Restore another context
Data structure where it is stored is called
Process Control Block
Can cause:
Thrashing !!
Parallelism
Fork !
Child gets a new, unique PID
Library call in C (a libc wrapper around the underlying system call)
Makes a copy of the parent
Exec !
Processes Vs Threads !
Parallelization of a program
Thread
Process
pthreads (POSIX threads)
pthread_create() libcall
At OS level, a syscall is used to create the thread
clone() on Linux
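A small sketch of the practical difference: a thread shares its process's memory, while a forked child works on its own copy. Python's threading module stands in for pthread_create here; counter, bump and thread_then_fork are illustrative names.

```python
import os
import threading

counter = 0


def bump():
    global counter
    counter += 1


def thread_then_fork():
    """Return the parent's counter after a thread bump and after a child bump."""
    global counter
    counter = 0
    t = threading.Thread(target=bump)   # thread shares this process's memory
    t.start()
    t.join()
    after_thread = counter

    pid = os.fork()                     # child gets its own copy of memory
    if pid == 0:
        bump()                          # changes only the child's copy
        os._exit(0)
    os.waitpid(pid, 0)
    after_fork = counter
    return after_thread, after_fork


if __name__ == "__main__":
    print(thread_then_fork())
```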
Virtualizing Memory
Goals
Security
Reliability
Simplicity
Max efficiency
How is it organized?
Addressability
Granularity of bytes
Accessed
Processor architecture determines the access width (32 bits = 4 bytes per access)
In a struct, the data could be
Packed
There is no gap in the data
Unpacked
Gaps (padding) in the data
Aligned to the size of the architecture
Consequence of packed
Data can go over a boundary and cause bad performance
An element can go across a boundary
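The size difference between a packed and an aligned layout can be checked with Python's struct module. The one-char-plus-one-int layout is just an example; the aligned size depends on the platform, so only the packed size is stated as a fixed number.

```python
import struct

# One char followed by one int, laid out two ways.
PACKED = struct.calcsize("=ci")    # standard sizes, no padding: 1 + 4 bytes
ALIGNED = struct.calcsize("@ci")   # native sizes and alignment, padding allowed

if __name__ == "__main__":
    print("packed:", PACKED, "aligned:", ALIGNED)
```

In the packed layout the int starts at offset 1, so it is not aligned to a 4-byte boundary. The padded layout spends extra bytes to avoid that.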
Problem to Solve: Space allocation
Metadata
Stack
ASM, push pop
Heap
Lives elsewhere, requested at run time
Segmentation
Assigns different segments to different sections
Pure segmentation is largely no longer used, but the term seg fault is still used
An address is formed by adding an offset to the start (base) of a segment
Paging !
Removes the need for contiguous allocation in physical memory
Used to address EXTERNAL FRAGMENTATION
Pages need to be of the same size so that there is no external fragmentation
Memory is managed and allocated in pages of the same size, tracked in page tables
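A toy sketch of address translation with fixed-size pages. The 4096-byte page size and the single-level dictionary page table are assumptions for illustration; real hardware uses multi-level tables.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


def translate(virtual_addr, page_table, page_size=PAGE_SIZE):
    """Map a virtual address to a physical one via a page -> frame table."""
    page, offset = divmod(virtual_addr, page_size)
    if page not in page_table:
        raise LookupError(f"page fault: page {page} is not mapped")
    return page_table[page] * page_size + offset


if __name__ == "__main__":
    table = {0: 5, 1: 2}            # hypothetical mapping: page number -> frame number
    print(translate(4100, table))   # page 1, offset 4
```

Because every page and frame has the same size, any free frame can hold any page, which is why no contiguous physical range is needed.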
Swapping
When there isn't enough RAM on the system for everything to run
Put less-used data from memory in a "pagefile" or swap partition on the disk so
that it can be accessed later if need be.
When those pages are needed, the access causes a page fault, then they are fetched back
Abstractions Provided By OS
Most of the abstractions are not seen in the kernel space
Similar mechanisms apply.
System Calls
What's a terminal
A device used to enter data
Virtual Terminals
Pseudo Terminals
Terminal Emulator
Getty
Login Process
Establishes a session
Password Files
/etc/passwd
Account info
- username
- userid
- group id
- full name
- home directory
- shell
Password not stored in here
/etc/shadow
Permissions
File based access control
Each object has an owner
there are permission bits
rwx, rws Permission Bits
The Shell
Command interpreter, like python
Steps of a shell
If a process is a zombie and the parent never calls wait(), the only way to "reap" it,
or remove it from the process table, is to kill the parent.
Signals
Asynchronous
Predefined
Use SIGUSR1 and SIGUSR2 for user-defined purposes (can't add any data)
An OS artifact
From POSIX
Signal Handling
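A minimal sketch of installing a handler for SIGUSR1 and sending that signal to the current process. It assumes it runs in the main thread, since Python only allows handlers to be installed there; received and send_self_sigusr1 are illustrative names.

```python
import os
import signal
import time

received = []


def on_sigusr1(signum, frame):
    received.append(signum)          # the handler only learns which signal arrived


def send_self_sigusr1():
    """Install a SIGUSR1 handler, signal ourselves, and return what was received."""
    received.clear()
    previous = signal.signal(signal.SIGUSR1, on_sigusr1)
    try:
        os.kill(os.getpid(), signal.SIGUSR1)
        for _ in range(100):         # delivery is asynchronous, so wait briefly
            if received:
                break
            time.sleep(0.01)
    finally:
        signal.signal(signal.SIGUSR1, previous)   # restore the old disposition
    return list(received)


if __name__ == "__main__":
    print(send_self_sigusr1())
```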
Pipe
Unidirectional
Output from one | Input to another
| = stdout -> stdin
|& = stdout+stderr -> stdin
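The one-way nature of a pipe can be sketched with os.pipe: one end is write-only, the other read-only. The helper name pipe_from_child is made up; the child plays the left side of a | and the parent the right side.

```python
import os


def pipe_from_child(message: bytes) -> bytes:
    """Have a forked child write into a pipe and read it back in the parent."""
    read_end, write_end = os.pipe()
    pid = os.fork()
    if pid == 0:
        os.close(read_end)            # child only writes
        os.write(write_end, message)
        os._exit(0)
    os.close(write_end)               # parent only reads
    chunks = []
    while chunk := os.read(read_end, 4096):   # an empty read means end of file
        chunks.append(chunk)
    os.close(read_end)
    os.waitpid(pid, 0)
    return b"".join(chunks)


if __name__ == "__main__":
    print(pipe_from_child(b"hello"))
```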
Redirection
User perspective
Identifier: filename
Path+filename
Read and written
Meaning of a file name is just a name
Pathnames
Hierarchical
can be relative
CWD per process
Operations on Files
POSIX
creat()
path,
Permission bits
Open
A file descriptor needed for operations (open is used to get it)
Read Write Close
Seek
Something out of band
ioctl
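The open/write/seek/read/close sequence, sketched with the thin wrappers in Python's os module. The file name demo.txt and the helper write_seek_read are hypothetical.

```python
import os
import tempfile


def write_seek_read(path):
    """Create a file, write to it, seek back, and read part of it via a descriptor."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o644)  # path + permission bits
    try:
        os.write(fd, b"hello world")
        os.lseek(fd, 6, os.SEEK_SET)    # move the file offset to byte 6
        return os.read(fd, 5)
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(write_seek_read(os.path.join(d, "demo.txt")))
```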
File Systems
Memory vs Storage
Storage is persistent
Ram is volatile, higher speed, low capacity
Drivers
Physical Devices
CD, USB, HDD,
Exposing
Not readable semantics
Unifies everything
Key to persistence
leads to file systems and raw
Always accessed in the size of access unit (blocks)
File System
Raw
Application
Device Driver
Actual Media
Has its own drivers
Maybe even many layers
Hidden by the operating system
The device's block size can differ from the file system's
File systems can reside on any block device.
In some cases they reside in RAM
Performance is important too
A write sys call doesn't automatically cause a write to the block device
It writes to RAM until a threshold is reached, then writes to the device.
File System Layer
Types of Files
File Descriptors
Non-negative integer that refers to a data structure in the kernel
stdin, stdout and stderr are special file descriptors (0, 1 and 2)
You can check the file descriptors of a process in:
/proc/<pid>/fd, e.g. /proc/self/fd
self refers to the calling process
Inode table
Inode Types
Directory
Regular file
Char device
Block Device
(named) pipe
symbolic link
socket
A bunch of metadata
Block is a set of pointers
Dentry
Directory Entry
Syscall: getdents()
A dentry
Filename -> inode
File -> Dentry -> Inode -> Data
Root Directory
Symbolic Link
Link to a file/pathname
Stores the target's pathname in its data blocks
If the target is deleted, then the link can't be used
If the symlink is deleted, that's okay.
Weak binding between the symlink and the file it points to
Hard Link
Copy/move/remove
Copy
Remove
Decreases the link count if greater than one, and removes that directory entry
Removes the inode as well if the link count = 1
In terms of directories, the . entry is deleted first, so the count will be one
Then the directory will be deleted.
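A sketch contrasting hard links and symbolic links when the original name is removed. The file names are hypothetical, and it assumes a file system that supports both link types.

```python
import os
import tempfile


def link_demo(directory):
    """Return (links before remove, links after, hard link readable, symlink dangling)."""
    target = os.path.join(directory, "target.txt")
    hard = os.path.join(directory, "hard.txt")
    soft = os.path.join(directory, "soft.txt")
    with open(target, "w") as f:
        f.write("data")
    os.link(target, hard)             # second directory entry for the same inode
    os.symlink(target, soft)          # separate inode that stores the pathname
    links_before = os.stat(target).st_nlink
    os.remove(target)                 # drops one entry; the inode survives
    links_after = os.stat(hard).st_nlink
    with open(hard) as f:
        hard_ok = f.read() == "data"
    soft_dangling = os.path.islink(soft) and not os.path.exists(soft)
    return links_before, links_after, hard_ok, soft_dangling


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(link_demo(d))
```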
Device access shows the difference between virtual and emulated terminal
Need a pseudo terminal to talk to the OS
Device Files
Represent physical or virtual hardware devices
File system interface between devices drivers and user space applications
Identification
Major Number
Type
Minor Number
Distinguisher
/proc/devices
Character Devices
Block Devices
Accessed in blocks
Addressable
Storage device
Super Blocks
Filesystem metadata
Sort of like an inode for the filesystem
We need same metadata for file systems as we do for files
Block devices get further identification
There is a primary, and a backup super block.
What happens if it's corrupted
Mounting failure
Can't mount a filesystem
Data inconsistency
If using a backup, it may not be accessible because backups don't store
everything
Physical Size
Holes in a file
Sparse files
Where we have a large file with maybe a huge amount of 0s, these 0s are part of the
logical size
But the physical size is less because the file system does not allocate blocks
for the runs of 0s (the holes).
Logical size > physical size
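A sketch that creates a file with a hole and compares its logical and physical sizes. It assumes a file system that supports sparse files (most Linux ones do); sparse_sizes and the 10 MiB hole are illustrative choices.

```python
import os
import tempfile


def sparse_sizes(path, hole_bytes=10 * 1024 * 1024):
    """Create a file with a hole and return (logical size, physical size) in bytes."""
    with open(path, "wb") as f:
        f.seek(hole_bytes)            # skip ahead without writing: leaves a hole
        f.write(b"x")
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512   # st_blocks counts 512-byte units


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(sparse_sizes(os.path.join(d, "sparse.bin")))
```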
External Fragmentation
Internal Fragmentation
dd command
Experimenting with real devices is bad
Use a virtual version
Command to copy and convert data
uses an input and output file
Could be other special files
dd vs cp
They both copy, and cp works fine at the granularity of files
dd has more control over data, you can seek, skip, use block size
like a file based pipe
On Disk
RAM
What ifs
Interruption while updating on-disk structures
Crash inconsistency
Metadata is gone, super bad
Inodes are good but with missing/inconsistent blocks
Random data on the block
Good blocks but missing inodes
Can't find the data because
the file becomes orphaned
We can find them again
Optimal outcome if it has to happen
Superblock is corrupted
Do a repair
Fetch backup copies
Lazy approach
fsck tool:
Check super blocks
Link count
Allocation
Bad blocks
Better approach
Journalling
We can journal our updates and go back to the previous working one
Process information
Sysfs
FUSE
FUSE is a framework
exposes /dev/fuse
SSHFS
A FUSE file system
Why?
NFS
Where are environment variables stored when they're not for the whole environment?
Difference?
What is FUSE?
Setuid allows a program to run on behalf of a user? How? What does it do?
What is a group?
| vs >>
Why do I need to link math lib if using sqrt with lazy dynamic linking?
What is a login?
Init->getty->login->bash?
What is /proc?
Sparse files?