UNIX SYSTEM PROGRAMMING
Compilation in UNIX
• Stages of Compilation:
o Preprocessing: Handles directives such as #include and #define by pasting in header
contents and expanding macros. Use gcc -E to view the preprocessed code.
o Compilation: Converts preprocessed code into assembly code; gcc -S outputs
assembly.
o Assembly: Translates assembly code to machine code, resulting in object files
(.o).
o Linking: Combines object files and resolves references to symbols defined in other
object files or libraries, producing the final executable.
Tooling:
• gcc: Common flags include:
o -c: Compile to object file without linking.
o -o <filename>: Specify the output filename.
o -Wall, -Werror: -Wall enables a broad set of common warnings; -Werror treats warnings as errors.
• Makefiles: Use make to streamline builds with dependencies, avoiding redundant
compilation.
• Debug Symbols: -g embeds debugging information in the output; needed to get
source-level detail in debuggers like gdb.
C Data Types and Pointers
Type Modifiers and Compatibility:
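o signed and unsigned modifiers change the range of values an integer type can hold.
o short, long, long long select narrower or wider integer types; exact sizes are
implementation-defined, so they can differ between platforms.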
o Example of sizeof() usage to ensure cross-platform compatibility:
Pointers:
• Basics: Store memory addresses; declared as int *ptr, char *ptr, etc.
• Pointer Operations:
o * (dereference): Access data at the address.
o & (address-of): Get the memory address.
• Pointer Arithmetic: Adding or subtracting an integer moves the pointer by whole elements,
not bytes; e.g., for int *p, p + 1 is the address sizeof(int) bytes past p.
• Use Cases:
o Array manipulation: Pointers can traverse arrays without indexing.
o Function arguments: C passes arguments by value, so passing a pointer avoids copying
large data and lets the function modify the caller's object.
Standard I/O
• File Handling Functions:
o fopen(), fclose(), fgets(), fputs(): High-level functions for file access.
o fscanf(), fprintf(): Formatted I/O functions for reading/writing data.
• File Pointer (FILE *):
o Provides a reference for file operations.
o Standard Streams: stdin (keyboard input), stdout (screen output), stderr (error
messages).
• Buffering:
o Types:
▪ Line-buffered: Flushed on newline (usual for stdout when it is a terminal).
▪ Block-buffered (fully buffered): Flushed when the buffer fills or the stream is closed
(usual for files).
▪ Unbuffered: Output is written immediately; the usual mode for stderr.
o Implications: Buffering reduces the number of system calls, but output may not be
visible until the buffer is flushed.
Dynamic Memory Management
• Memory Allocation:
o malloc(size): Allocates size bytes; returns a pointer to the start.
o calloc(num, size): Allocates num * size bytes; initializes to zero.
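o realloc(ptr, new_size): Resizes an allocated block, possibly moving it; on failure it
returns NULL and leaves the original block valid.
• Memory Deallocation:
o free(ptr): Releases the block. Forgetting to free causes memory leaks; using a block
after freeing it, or freeing it twice, is undefined behavior.
• Error Handling: Always check the return value of malloc(), calloc(), and realloc() for
NULL to handle allocation failure.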
System Calls
• Definition: Interface for requesting kernel-level operations from user-space programs.
• Key System Calls:
o Process Management:
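▪ fork(): Creates a new process by duplicating the caller; returns 0 in the child, the
child's PID in the parent, and -1 on failure.
▪ exec family (execl(), execv(), execvp(), execve(), ...): Replaces the current process
image with a new program; returns only on failure. Typically called in the child after fork().
▪ wait(): Blocks the parent until one of its children terminates and collects its status.
▪ exit(): Terminates the process, making its status available to the parent through wait().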
o File Operations:
▪ open(), close(): Basic file access. Flags include O_RDONLY,
O_WRONLY, O_CREAT, O_TRUNC.
▪ read(), write(): Transfer bytes between a buffer and a file descriptor, without
stdio buffering.
Error Handling
• Error Codes (errno):
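o errno is set to an error code when a system call or library function fails; use
#include <errno.h>. Check it only right after a call reports failure, since successful
calls are not required to reset it.
o perror(msg): Prints msg followed by the error string for the current errno to stderr.
o strerror(errno): Returns the human-readable string for an error code.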
• Exit Status:
o EXIT_SUCCESS, EXIT_FAILURE: Standard codes for program termination.
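o assert(condition): Aborts the program if condition is false; declared in <assert.h> and
compiled out when NDEBUG is defined, so it is a debugging aid rather than error handling.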
Building and Debugging in UNIX
• Makefiles:
o Automate compile steps for large projects.
o make uses dependencies and rules to streamline the build process.
• Debugger (gdb):
o Essential commands:
▪ break <line|function>: Set breakpoints.
▪ run: Start program.
▪ step, next: Execute one source line; step enters function calls, next steps over them.
▪ print <var>: Inspect variable values.
• Valgrind:
o Detects memory leaks and invalid memory accesses at run time.
o Usage: valgrind ./program for a memory report; add --leak-check=full for per-leak detail.
Using gdb to Debug
Run in gdb to check for errors
Directories in UNIX
• Hierarchical Structure: Root (/) as the top-level directory, with subdirectories like:
o /bin: Essential binaries.
o /usr: Most system-wide programs, libraries, and documentation; locally installed
software typically goes under /usr/local.
o /etc: Configuration files.
o /home: User home directories.
o /var: Variable files like logs, spools.
• Directory Commands:
o ls: List directory contents.
o mkdir, rmdir: Create/remove directories.
o cd: Change directory; pwd: Print working directory.
Listing and Creating Directories
UNIX File System Structure
• File System Layer in the Linux Kernel:
o Manages the hierarchical directory structure, inodes, and data blocks.
• Key Data Structures:
o Inode: Holds a file's metadata (permissions, owner, size, timestamps) and the locations
of its data blocks; it does not hold the file name.
o Superblock: Information about the file system, such as size, number of inodes.
o Dentry: Directory entry, linking a file name to an inode number.
• File System Types: ext2, ext3, ext4:
o Journaling (ext3, ext4): Reduces risk of data corruption during unexpected
shutdowns.
o Performance: ext4 adds extents, which speed up access to large files, and supports
larger files and volumes than ext3.
Inspecting inode Information
Low-Level I/O
• File Descriptors:
o Integer identifiers for open files.
o Standard Descriptors:
▪ 0: stdin (input).
▪ 1: stdout (output).
▪ 2: stderr (error).
• File Operations:
o Opening and Creating:
▪ open(): Open or create files with flags like O_RDONLY, O_CREAT, and
file mode for permissions.
o Reading and Writing:
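▪ read(fd, buffer, count): Reads up to count bytes from file descriptor fd into buffer;
returns the number of bytes read, 0 at end of file, or -1 on error.
▪ write(fd, buffer, count): Writes up to count bytes from buffer to fd; returns the
number of bytes written, which may be fewer than count, or -1 on error.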
o Closing and Deleting:
▪ close(fd): Releases the file descriptor.
▪ Deleting Files: unlink(path) removes the directory entry; the file's data is freed
once its link count reaches zero and no process still has it open.
Process Management in Low-Level I/O:
• fork(): Creates a new process. Returns twice: 0 in the child and the child's PID in the
parent (or -1 in the parent if no child could be created).
• Pipes and Redirection:
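o pipe(): Creates a unidirectional channel and returns two file descriptors, a read end
and a write end; after fork(), one process writes and the other reads.
o Redirection (> for stdout, < for stdin): Used in shells to change where a command's
input comes from or where its output goes.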
Process Creation and Pipe Communication