Unix M1 to M5
Module - I
INTRODUCTION
WHAT IS AN OPERATING SYSTEM (OS)?
An operating system (OS) is an interface between hardware and user. It manages hardware and
software resources. It takes the form of a set of software routines that allow users and application
programs to access system resources (e.g. the CPU, memory, disks, modems, printers, network
cards etc.) in a safe, efficient and abstract way.
For example, an OS ensures safe access to a printer by allowing only one application program
to send data directly to the printer at any one time. An OS encourages efficient use of the CPU
by suspending programs that are waiting for I/O operations to complete to make way for
programs that can use the CPU more productively. An OS also provides convenient
abstractions (such as files rather than disk locations) which isolate application programmers
and users from the details of the underlying hardware.
BRIEF HISTORY
A limitation of UNICS was that it was not portable. To overcome this limitation, Ken Thompson
started to work on developing the system in a higher-level language called B.
As the B language did not yield the expected results, Dennis Ritchie developed a higher-level language
called C. Ken Thompson then teamed up with Dennis Ritchie, the author of the first C
compiler, in 1973. They rewrote the UNIX kernel in C, which was a big step forward in terms
of the system's portability, and released the Fifth Edition of UNIX to universities in 1974.
UNIX PROGRAMMING (18CS56)
UNIX ARCHITECTURE
Kernel: the core of the operating system. It is loaded into memory when the system is booted and communicates directly with the
hardware. The kernel manages system memory and processes, and decides priorities.
Shell: the interface between the kernel and the user. It functions as a command interpreter, i.e. it receives
and interprets the command from the user and passes it to the kernel for execution. There is only one kernel
running on the system, but there could be several shells in action, one for each user who is logged in.
Files and Processes: a file is an array of bytes and can contain virtually anything. UNIX considers
even directories and devices to be members of the file system. The dominant file type is text, and
the behavior of the system is mainly controlled by text files.
The second entity is the process, which is the name given to a file when it is executed as a
program. A process is simply the time image of an executable file.
1.1 System Calls: Though there are thousands of commands in the UNIX system, they all
use a handful of functions called system calls. User programs that need to access the hardware
use the services of the kernel, which performs the job on the user's behalf. These programs access
the kernel through a set of functions called system calls.
Ex: open() is the system call used to access both files and devices; write() is the system call used to write to a file.
FEATURES OF UNIX
Several features of UNIX have made it popular. Some of them are:
Portable: UNIX can be installed on many hardware platforms. Its widespread use
can be traced to the decision to develop it using the C language. Because C programs
are easily moved from one hardware environment to another, it is relatively simple to
port it to different environments.
Multiuser: The UNIX design allows multiple users to concurrently share hardware
and software.
Multitasking: UNIX allows a user to run more than one program at a time. In fact,
more than one program can be running in the background while a user is working in the
foreground.
Networking: While UNIX was developed to be an interactive, multiuser,
multitasking system, networking is also incorporated into the heart of the operating
system. Access to another system uses a standard communications protocol known as
Transmission Control Protocol/Internet Protocol (TCP/IP).
Device Independence: UNIX treats input/output devices like ordinary files. Input or
output to a program can be from any device or file. The source or destination for file
input and output is easily controlled through a UNIX design feature called redirection.
Utilities: UNIX provides a rich library of utilities that can be used to increase user
productivity.
Services: UNIX also includes the support utilities for system administration and
control.
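As a rough illustration of device independence and redirection, the sketch below sends the output of one command to a file and then feeds that file back as the input of another command. The file name demo_out.txt and the temporary directory are assumptions made only for this example.

```shell
# Work in a throwaway directory so no real files are touched (assumption for this sketch).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Output normally goes to the terminal; > sends it to a file instead.
echo "hello unix" > demo_out.txt

# Input normally comes from the keyboard; < takes it from a file instead.
# tr removes the padding spaces that some versions of wc print.
word_count=$(wc -w < demo_out.txt | tr -d ' ')
echo "words: $word_count"
```

The commands themselves do not change; only the source and destination of their data do.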
Personal environment: UNIX installed on a user's own computer
Timesharing environment: Many users connected to one computer
Client/server environment: Computing split between a central computer (server) and users' computers (clients)
Personal environment
UNIX was originally designed as a multiuser environment, but many users now install UNIX on their
personal computers. This gives rise to the personal UNIX environment.
Timesharing environment
All of this work tends to keep the central computer busy, so users have to wait longer
to get their work done. The slow response makes this environment less productive.
UNIX STRUCTURE
Kernel: the heart of the UNIX system. It contains two basic parts of the OS: process control and
resource management. All other components of the system call on the kernel to perform these
services for them.
Shell: the interface between the kernel and the user. It functions as a command interpreter, i.e. it receives
and interprets the command from the user and passes it to the kernel for execution. There is only one kernel
running on the system, but there could be several shells in action, one for each user who is logged
in. The shell has two major parts.
a. Interpreter: reads your commands and works with the kernel to execute them.
b. Shell Programming: a programming capability that allows you to write shell scripts.
A shell script is a file that contains shell commands that perform a useful function. It is also
known as a shell program.
C shell: developed at Berkeley by Bill Joy. Its commands look like C statements.
Korn shell: developed by David Korn, also of AT&T Labs. It is the newest and most powerful.
Utilities: A utility is a standard UNIX program that provides support for users. Three common
utilities are text editors, search programs and sort programs.
Applications: programs that are not a standard part of UNIX. They are written to provide extended capability to the
system.
Fragmentation and the absence of a single conforming standard adversely affected the development
of portable applications. First, AT&T created the System V Interface Definition (SVID). Later,
X/Open created the X/Open Portability Guide (XPG). Products conforming to this specification were branded UNIX95, UNIX98 or
UNIX03 depending on the version of the specification.
Yet another group of standards, the Portable Operating System Interface for Computer
Environments (POSIX), was developed at the behest of the Institute of Electrical and
Electronics Engineers (IEEE). POSIX refers to operating systems in general, but was based on
UNIX. Two of the most cited standards from the POSIX family are known as POSIX.1 and
POSIX.2. POSIX.1 specifies the C application program interface, i.e. the system calls. POSIX.2
deals with the shell and utilities.
In 2001, a joint initiative of X/Open and IEEE resulted in the unification of the two standards
as the Single UNIX Specification, Version 3 (SUSV3). This approach to development means that once
software has been developed on any POSIX-compliant UNIX system, it can be easily ported to another
POSIX-compliant UNIX machine
with minimum modifications. We make reference to POSIX throughout this text, but these
references should be interpreted to mean the SUSV3 as well.
Commands are entered at the shell prompt. The components of the command line are:
the verb;
any options required by the command;
the command's arguments (if required).
The general form of a UNIX command is:
verb [options] [arguments]
Verb: is the command name. The command indicates what action is to be taken. This
concept of action gives us the name verb.
option: modifies how the action is applied.
argument: provides additional information to the command.
Note: Options MUST come after the command and before any command arguments. Options
SHOULD NOT appear after the main argument(s). However, some options can have their
own arguments.
If options are enclosed within [ ], they are optional; otherwise they are compulsory.
If arguments are enclosed within [ ], they are optional; otherwise they are compulsory.
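The verb, option and argument parts can be seen in a short sketch like the one below. The sample file and its three lines are assumptions created only for this illustration.

```shell
# Create a small sample file to work on (hypothetical content).
tmpfile=$(mktemp)
printf 'one\ntwo\nthree\n' > "$tmpfile"

# verb = head, option = -n (which takes its own argument, 2), argument = the file name
first_two=$(head -n 2 "$tmpfile")
echo "$first_two"

# verb = wc, option = -l, argument = the file name
wc -l "$tmpfile"

rm -f "$tmpfile"
```

Here -n 2 shows an option that carries its own argument, while the file name is the main argument of the command.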
UNDERSTANDING OF SOME BASIC COMMANDS SUCH AS echo, printf, ls, who,
date, passwd, cal.
THE DATE COMMAND:
date: displays the system date and time. If the system is local, that is, one in your own area, it is
the current time. If the system is remote, such as across the country, the reply will contain the
time where the system is physically located.
The input for date is the system itself. The date is actually maintained in the computer as a
part of the OS. The date command sends its response to the monitor.
$date
Format specifiers, used as date +%letter, include:
M    Two-digit minute
m    Two-digit month
p    Display AM or PM
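A minimal sketch of how date can be asked for individual fields is given below. It assumes the standard + format syntax of date; the variable names are chosen only for this example, and the actual values printed depend on when it is run.

```shell
# Default output: the system date and time.
date

# A format string starts with + and uses % specifiers.
month=$(date +%m)     # two-digit month, 01 to 12
minute=$(date +%M)    # two-digit minute, 00 to 59
echo "month=$month minute=$minute"

# Several specifiers can be combined in one quoted format string.
date +"%d/%m/%Y %H:%M:%S"
```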
$ cal
Output:
     April 2016
Su Mo Tu We Th Fr Sa
                1  2
 3  4  5  6  7  8  9
10 11 12 13 14 15 16
17 18 19 20 21 22 23
24 25 26 27 28 29 30
   February 2015
Su Mo Tu We Th Fr Sa
 1  2  3  4  5  6  7
 8  9 10 11 12 13 14
15 16 17 18 19 20 21
22 23 24 25 26 27 28
$who am i
Displays the same information, but only for the terminal session where the command was
issued, for example:
$who -u
Indicates how long it has been since there was any activity on the line. This is known as
idle time. It also returns the process ID for the user.
$who -uH
Displays "all" information, and headers above each column of data, for example:
passwd syntax
passwd [OPTION] [USER]
$passwd
Running passwd with no options will change the password of the account running the
command. You will first be prompted to enter the account's current password:
The passwd command changes passwords for user accounts. A normal user can only change
the password for their own account, but the superuser can change the password for any account.
passwd can also change or reset the account's validity period, i.e. how much time can pass before
the password expires and must be changed.
Before a normal user can change their own password, they must first enter their current
password for verification. (The superuser can bypass this step when changing another user's
password.)
After the current password has been verified, passwd checks to see if the user is allowed to
change their password at this time. If not, passwd refuses to continue, and exits.
Otherwise, the user is then prompted twice for a replacement password. Both entries must
match for passwd to continue.
If you specify the -e option, the following escape sequences are recognized:
\r A carriage return.
\t A horizontal tab.
\v A vertical tab.
Ex 3:
$echo -e "Here\bthe\bspaces\bare\bbackspaced"
Outputs the following text:
Output: Herthspacearbackspaced //the character before each \b is overwritten on the terminal.
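Since echo -e is not handled the same way by every shell, the sketch below uses printf, which interprets these escape sequences in its format string. The strings used are assumptions for illustration only.

```shell
# \t inside the format string becomes a real tab character.
tabbed=$(printf 'name\tvalue')
printf '%s\n' "$tabbed"

# \b only moves the cursor back one position. The byte is still part of the output;
# it is the terminal that overwrites the previous character when displaying it.
printf 'Here\bthe\n' | od -c
```

Piping the second command through od -c makes the \b byte visible instead of letting the terminal act on it.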
Not all man pages have all sections, but the first three (NAME, SYNOPSIS and DESCRIPTION) are seen in all
man pages.
NAME presents a one-line introduction to the command.
SYNOPSIS shows the syntax used by the command.
DESCRIPTION provides detailed information.
man syntax
$man [option] command name
Options
-K, --global-apropos Search for text in all manual pages. This is a brute-force search, and is likely to take some
time; if you can, you should specify a section to reduce the number of pages that need to be
searched. Search terms may be simple strings (the default), or regular expressions if the
--regex option is used.
Section Numbers
The section numbers of the manual are listed below. While reading documentation, if you see
a command name followed by a number in parentheses, the number refers to one of these
sections. For example, man(1) is the documentation of man found in section number 1. Some
commands may have documentation in more than one section, so the numbers after the
command name may direct you to the correct section to find a specific type of information.
The section numbers, and the topics they cover, are as follows:
Section number Description
1 Executable programs or shell commands
2 System calls (functions provided by the kernel)
3 Library calls (functions within program libraries)
4 Special files (usually found in /dev)
5 File formats and conventions, e.g. /etc/passwd
6 Games
7 Miscellaneous (including macro packages and conventions), e.g. man(7), groff(7)
8 System administration commands (usually only for root)
9 Kernel routines [Non standard]
apropos searches the manual pages for a keyword or regular expression. Each manual page has a short
description included with it. apropos searches these descriptions for instances of the keyword.
$apropos find
aa_find_mountpoint (2) - find where the apparmor interface filesystem is mounted
chkdupexe (1) - find duplicate executables
ffs (3) - find first bit set in a word
ls COMMAND
The ls command lists all files in the directory that match the name. If name is left blank, it will
list all of the files in the directory.
Syntax
The syntax for the ls command is:
ls [options] [names]
Option Description
-a Displays all files.
-b Displays nonprinting characters in octal.
-c Displays files by file timestamp.
-C Displays files in a columnar format (default)
-d Displays only directories.
-f Interprets each name as a directory, not a file.
-F Flags filenames.
-g Displays the long format listing, but exclude the owner name.
-i Displays the inode for each file.
-l Displays the long format listing.
-L Displays the file or directory referenced by a symbolic link.
-m Displays the names as a comma-separated list.
-n Displays the long format listing, with GID and UID numbers.
-o Displays the long format listing, but excludes group name.
-p Displays directories with /
-q Displays all nonprinting characters as ?
-r Displays files in reverse order.
-R Displays subdirectories as well.
-t Displays newest files first. (based on timestamp)
-u Displays files by the file access time.
-x Displays files as rows across the screen.
-1 Displays each entry on a line.
a. Field 1:
1st Character - File Type: The first character specifies the type of the file. In the example
below, the hyphen (-) in the 1st character indicates that this is a normal file. Following are
the possible file type options in the 1st character of the ls -l output.
Field Explanation
- normal file
d directory
s socket file
l link file
2nd to 10th character - File Permissions: The next 9 characters specify the file's permissions.
Each group of 3 characters refers to the read, write and execute permissions for owner, group and other.
g. Field 7 - File name: The last field is the name of the file.
$ ls -l /etc
total 3344
-rw-r--r-- 1 root root 15276 Oct 5 2004 a2ps.cfg
-rw-r--r-- 1 root root 2562 Oct 5 2004 a2ps-site.cfg
drwxr-xr-x 4 root root 4096 Feb 2 2007 acpi
-rw-r--r-- 1 root root 48 Feb 8 2008 adjtime
drwxr-xr-x 4 root root 4096 Feb 2 2007 alchemist
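The meaning of the first character can be checked with a small sketch such as the following. The names plain.txt and subdir are hypothetical and are created in a temporary directory just for this example.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch plain.txt          # an ordinary file
mkdir subdir             # a directory

# ls -ld prints one long-format line per name; -d lists the directory itself
# rather than its contents. cut -c1 keeps only the first character of that line.
file_type=$(ls -ld plain.txt | cut -c1)
dir_type=$(ls -ld subdir | cut -c1)
echo "plain.txt -> $file_type, subdir -> $dir_type"
```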
The above command first displays the line count, word count and byte or character
count of the file note, and then displays the details of that file.
When you learn to redirect the output of these commands you may even like to group
them together within parentheses.
Example:
$ ( wc note ; ls -l note ) > newlist
The combined output of the two commands is now sent to the file newlist. Whitespace
is provided here only for better readability. You might save a few keystrokes like
this:
$ (wc note;ls -l note)>newlist
When a command line contains a semicolon, the shell understands that the commands
on each side of it need to be processed separately. The ; here is known as a
metacharacter, and you'll come across several metacharacters that have special
meaning to the shell.
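The effect of grouping can be tried out with a sketch like the one below. The file note and its two lines are assumptions created in a temporary directory for this example.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'line one\nline two\n' > note

# Without the parentheses only the output of the last command would be redirected.
# With them, the combined output of both commands goes to newlist.
( wc -l < note ; ls note ) > newlist
cat newlist
```

After this runs, newlist holds the line count followed by the filename, one per line.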
A command line can overflow or be split into multiple lines
A command is usually keyed in on a single line. Though the terminal width is restricted to 80 characters, that
doesn't prevent you from entering a command, or a sequence of them, in one line even though
the total width may exceed 80 characters. The command simply overflows to the next line
though it is still a single logical line.
Sometimes, you'll find it necessary or desirable to split a long command line into multiple
lines. In that case, the shell issues a secondary prompt, usually >, to indicate to you that the
command line isn't complete. This is easily shown with the echo command:
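A small sketch of a command split across lines is given below; the message text is an assumption. When typed at an interactive prompt, the second and third lines would each be preceded by the secondary prompt >.

```shell
# The quoted string is not closed on the first line, so the shell keeps reading
# until it finds the closing quote.
message=$(echo "This is
a three-line
text message")
echo "$message"
```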
For example, when using the "cd" command, no process is created. The current
directory simply gets changed on executing it.
External Command:
External commands are not built into the shell. These are executables present in separate files.
When an external command has to be executed, a new process has to be spawned and the
command gets executed.
For example, when you execute the "cat" command, which usually is at /usr/bin, the
executable /usr/bin/cat gets executed.
type command:
For internal commands, the type command will clearly say that it is a shell built-in; for
external commands, it gives the path of the command from where it is executed.
THE TYPE COMMAND: knowing the type of a command and locating it.
type - Display information about command type.
The type command is a shell built-in that displays the kind of command the shell will
execute, given a particular command name. It works like this: type command, where
command is the name of the command you want to examine.
$type type
Output: type is a shell built-in
$type ls
$type cp
Output: cp is /bin/cp
Here we see the results for three different commands. Notice the one for ls: on the system this was
taken from, ls is an alias that adds a --color option, which is why the
output from ls is displayed in color!
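The difference between internal and external commands can be observed with a sketch like this one. The exact wording and paths that type prints vary between shells and systems, so the comments describe only what is typically seen.

```shell
# cd is built into the shell, so no separate executable is involved.
type cd

# cat is an external command, so type reports the path of its executable.
type cat

# Capture both reports for later inspection.
builtin_report=$(type cd)
external_report=$(type cat)
```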
The root account does not need to be separately created; it comes with every system. Its password is generally set at the time of installation
of the system and has to be used on logging in:
Login: root
Password: ******
The prompt of root is #; other (non-privileged) users get either $ or %.
Once you log in as root, you are placed in root's home directory, which
could be / or /root.
Any user can acquire superuser status with the su command if she knows the root password.
For example, the user GMIT becomes a superuser in this way:
$su
Password: ******
#pwd
/home/GMIT
Though the current directory does not change, the # prompt indicates that GMIT now has
the powers of a superuser.
$su GMIT
su runs a separate sub-shell, so this mode is terminated by pressing [Ctrl-d] or using exit.
UNIX FILES
A UNIX system has thousands of files. If you write a program, you add one more file to the system. When
you compile it, you add some more. Files grow rapidly, and if they are not organized properly, you will
find it difficult to locate them. So UNIX has a file system (UFS) to manage and organize its files in
directories.
b. Directory files
i. Contains no data, but keeps some details of the files and subdirectories that it contains.
ii. A directory file contains an entry for every file and sub directory that it houses. Each entry has
two components
The filename
A unique identification number for the file or directory (called the inode number)
iii. A directory contains the filename but not the contents of the file.
iv. When you create or remove a file, the kernel automatically updates its corresponding
directory by adding or removing the entry (filename and inode number) associated with that file.
c. Device files
i. Used to represent a real physical device such as a printer, tape drive or terminal, used for
Input/Output (I/O) operations
ii. UNIX considers any device attached to the system to be a file, including your terminal.
iii. By default, a command treats your terminal as the standard input file (stdin) from which to
read its input
iv. Your terminal is also treated as the standard output file (stdout) to which a command's output
is sent.
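The filename and inode number pair kept in a directory entry can be displayed with ls -i, as in the sketch below. The file report.txt is hypothetical, and the inode number printed will differ from system to system.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
echo "some data" > report.txt

# ls -i prints the inode number recorded in the directory entry next to each filename.
ls -i report.txt

# Keep just the number (the first field of the output).
inode=$(ls -i report.txt | awk '{print $1}')
echo "report.txt has inode number $inode"
```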
All files in UNIX are related to one another. The file system in UNIX is a collection of all ordinary,
directory and device files, organized in a hierarchical structure as shown in the figure below.
The implicit feature of every UNIX file system is that there is a top which serves as the reference point for
all files. This top is called root and is represented by a / (slash). Root is actually a directory. The root
directory has a number of subdirectories under it. These subdirectories in turn have more subdirectories
and other files under them.
For instance, bin and usr are two directories directly under root, while a second bin and kumar are
subdirectories under usr and home, respectively.
Every file apart from root must have a parent. Thus the home directory is the parent of kumar, while /
is the parent of home and the grandparent of kumar. If you create a file login.sql under the kumar directory,
then kumar will be the parent of this file.
The first group contains the files that are made available during system installation:
/bin and /usr/bin: these are the directories where all the commonly used UNIX commands are
found.
/sbin and /usr/sbin: If there is a command that you cannot execute but the system administrator
can execute, it would probably be in one of these directories.
/etc: this directory contains the configuration files of the system. You can change a very important
aspect of system functioning by editing a text file in this directory. Your login name and password
are stored in the files /etc/passwd and /etc/shadow.
/dev: This directory contains all device files. There could
be more subdirectories like pts, dsk and rdsk in this directory.
/lib and /usr/lib: Contain shared library files and sometimes other kernel-related files.
/usr/include: contains the standard header files used by C programs. The statement
#include <stdio.h> used in most C programs refers to the file stdio.h in this directory.
/usr/share/man: this is where the man pages are stored. There are separate subdirectories
here (like man1, man2 etc.) that contain the pages for each section. For instance, the man page of
ls can be found in /usr/share/man/man1.
Users also work with their own files: they write programs, send and receive mail and also create temporary
files. These files are available in the second group shown below:
/tmp: the directory where users are allowed to create temporary files. These files are wiped
away regularly by the system.
/var: The variable part of the file system. Contains all your print jobs and your outgoing and
incoming mail.
/home: On many systems users are housed here. kumar would have his home directory in
/home/kumar.
THE HOME DIRECTORY AND THE HOME VARIABLE
HOME DIRECTORY: When you log on to the system, UNIX automatically places you in a directory
called the home directory.
It is created by the system when the user account is opened.
If you log in using the login name sharma, you will land up in a directory that could have the
pathname
/home/sharma
The shell variable HOME holds the pathname of your home directory:
$echo $HOME
/home/sharma
You will be doing much of your work in your home directory and its subdirectories.
HOME variable: HOME is an environment variable. Environment variables are a set of
dynamic named values that can affect the way running processes will behave on a computer.
Here $HOME indicates the home directory of the current user; it is the
default argument for the cd built-in command.
PATH VARIABLE:
The PATH environment variable is a colon-delimited list of directories that your shell searches
through when you enter a command.
Program files (executables) are kept in many different places on the UNIX system. Your path
tells the UNIX shell where to look on the system when you request a particular program.
To find out what your path is, at the UNIX shell prompt enter echo $PATH
Your path will look something like the following.
/usr2/username/bin:/usr/local/bin:/usr/bin:.
You will see your username in place of username. Using the above example path, if you enter the ls
command, your shell will look for the appropriate executable file in the following order: first, it would
look through the directory /usr2/username/bin, then /usr/local/bin, then /usr/bin, and finally the current
directory, indicated by the . (a period).
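The search order can be made visible by printing one directory per line, as in this sketch. It uses the example value from the text stored in a variable named example_path (a name chosen for this illustration); on a real system "$PATH" would be used instead.

```shell
# Example value from the text; replace with "$PATH" to inspect a real search path.
example_path="/usr2/username/bin:/usr/local/bin:/usr/bin:."

# Turn each colon into a newline so the directories appear in the order
# in which the shell would search them.
search_order=$(echo "$example_path" | tr ':' '\n')
echo "$search_order"
```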
Command             Function
cd                  Returns you to your login directory
cd ~                Also returns you to your login directory
cd /                Takes you to the root directory
cd /root            Takes you to the home directory of the root (superuser) account
                    created at installation; you must be the root user to access this directory
cd /home            Takes you to the directory where user login directories are
                    usually stored
cd ..               Moves you up one directory
cd ~otheruser       Takes you to otheruser's login directory
cd /dir/subdirfoo   Regardless of which directory you are in, the absolute path takes you
                    directly to subdirfoo, a subdirectory of dir
Ex 1: Assume the current directory is /home/kumar/progs/data/text. Using cd .. will move one level up.
$pwd
/home/kumar/progs/data/text
$ cd ..
$pwd
/home/kumar/progs/data
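The same navigation can be tried safely in a scratch area, as sketched below. The directory tree is created under a temporary directory (an assumption for this example) using mkdir -p, which also creates the intermediate directories.

```shell
base=$(mktemp -d)
mkdir -p "$base/progs/data/text"       # -p creates progs and data along the way

cd "$base/progs/data/text" || exit 1   # absolute pathname: begins with /
cd ..                                  # relative: one level up, now in .../progs/data
one_up=$(basename "$PWD")
echo "after cd ..    : $one_up"

cd ../..                               # two more levels up, back at the base directory
two_more_up="$PWD"
echo "after cd ../.. : $two_more_up"
```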
ABSOLUTE PATHNAMES:
If the first character of a pathname is /, the file's location must be determined with respect to root (/).
Such a pathname is called an absolute pathname.
cat /home/kumar/login.sql
When you have more than one / in a pathname, for each such / you have to descend one level in the file
system. Thus kumar is one level below home and two levels below root.
When you specify a file by using slashes to demarcate the various levels, you have a
mechanism of identifying a file uniquely. No two files in a UNIX system can have the same absolute
pathname.
When you specify the date command, the system has to locate the file date from a list of
directories specified in the PATH variable and then execute it.
However, if you know the location of a command in advance (for example, date is usually located in
/bin or /usr/bin), use its absolute pathname, i.e. precede its name with the complete path:
$/bin/date
For example if you need to execute program less residing in /usr/local/bin you need to enter
the absolute pathname
$/usr/local/bin/less
Basic syntax
$pwd [option]
Options     Description
-L          (logical) Use PWD from environment, even if it contains symbolic links
-P          (physical) Avoid all symbolic links
--help      Display this help and exit
--version   Output version information and exit
Ex 2: cd used without arguments: cd, when used without arguments, reverts to the home directory.
$pwd
/home/kumar/gmit
$cd
cd without an argument changes the directory from gmit to the home directory kumar.
$pwd
/home/kumar
Ex 3: If your present working directory is /home/kumar and you need to switch to the /bin directory
directly, use the absolute pathname, i.e. /bin, with the cd command.
$pwd
/home/kumar
$cd /bin
$pwd
/bin
Ex 2: To create three directories at a time, named patch, dbs and doc, pass the directory names as arguments.
$mkdir patch dbs doc
Ex 4: Error while creating a directory tree
$mkdir gmit/cse gmit/ise
The error is due to the fact that the parent directory named gmit is not created before creating the
subdirectories cse and ise.
Ex 5: $mkdir test
mkdir: Failed to make directory
This can happen due to:
a. The directory named test may already exist
b. There may be an ordinary file by the same name in the current directory.
c. The permissions set for the current directory do not permit the creation of files and directories by the
user.
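Two ways of avoiding the error of Ex 4 are sketched below in a temporary directory. The first lists the parent before its subdirectories; the second relies on the -p option of mkdir, which is not discussed above and is shown here as an alternative. The directory name vtu is hypothetical.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Parent first, then the subdirectories: arguments are processed left to right.
mkdir gmit gmit/cse gmit/ise

# Alternatively, -p creates any missing parent directories automatically.
mkdir -p vtu/cse/notes

ls -R
```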
rmdir: REMOVING DIRECTORIES
The rmdir utility removes the directory entry specified by each directory argument, provided the
directory is empty.
Ex: $rmdir progs
removes the directory named progs.
Arguments are processed in the order given. To remove both a parent directory and a subdirectory of that
parent, the subdirectory must be specified first, so the parent directory is empty when rmdir tries to
remove it.
The reverse logic of mkdir is applied.
$rmdir subdirectories parent directory
$rmdir gmit/cse gmit/ise gmit
You can't delete a directory with rmdir unless it is empty. In this example the gmit directory cannot be
removed until the subdirectories cse and ise are removed.
You can't remove a subdirectory unless you are placed in a directory which is hierarchically above
the one you have chosen to remove.
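The ordering rule can be checked with a sketch like the following, run in a temporary directory so that nothing of value is removed.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir gmit gmit/cse gmit/ise

# Removing the parent first fails because it still contains cse and ise.
rmdir gmit 2>/dev/null || echo "gmit is not empty yet"

# Subdirectories first, parent last: by the time rmdir reaches gmit it is empty.
rmdir gmit/cse gmit/ise gmit
```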
$ls
Output:08_packets.html
calendar
dept.lst
emp.lst
helpdir
uskdsk06
ls options:
Output in multiple columns(-x):
$ls -x
08_packets.html calendar dept.lst emp.lst
helpdir progs usdsk07 usdsk07
If we specify two directories named helpdir and progs as arguments, the contents of each directory, i.e. the filenames, are
listed out.
Recursive listing(-R)
The recursive option lists all sub-directories and files in a directory tree structure.
$ls -xR
08_packets.html calendar cptodos.sh dept.lst
emp.lst helpdir progs usdsk07
./helpdir
forms.hlp graphics.hlp
./progs
arrays.pl n2words.pl
cp chap01 unit1
If the destination file, i.e. unit1, does not exist, it will first be created before copying. If it does exist, it will
simply be overwritten without any warning.
Copying a file to another directory
Ex: assume there is a file named chap01 and it has to be copied to the progs directory
cp chap01 progs
Output: chap01 is now copied to the directory named progs with the same name chap01.
Copying a file to another directory with a different name
Ex: assume there is a file named chap01 and it has to be copied to the progs directory with the chap01
file renamed as unit1
cp chap01 progs/unit1
Output: chap01 is now copied to the directory named progs with the new name unit1.
cp options:
Interactive copying (-i): the -i option warns the user before overwriting the destination file.
Ex: $ cp -i chap01 unit1
cp: overwrite unit1(yes/no)? y
A y at this prompt will overwrite the file.
Copying directory structure (-R): the -R option makes cp behave recursively to copy an entire directory
structure, say progs to newprogs.
Ex: say the progs directory contains three files: kernel, bash and korn. To copy all three files under progs to
the newprogs directory:
$ cp -R progs newprogs
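The example can be reproduced in a scratch directory as sketched below; the three files are created empty purely for illustration.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir progs
touch progs/kernel progs/bash progs/korn

# newprogs does not exist yet, so it is created as a copy of progs.
cp -R progs newprogs
ls newprogs
```

If newprogs already existed as a directory, cp -R would instead place the copy inside it as newprogs/progs.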
rm : deleting files
The rm command deletes one or more files.
Ex 1: The following command deletes three files chap01, chap02, chap03.
$ rm chap01 chap02 chap03
Ex 2: to delete files named chap01 and chap02 under progs directory
$ rm progs/chap01 progs/chap02
Ex 3: to remove all files in the current directory
$ rm *
rm options:
Interactive deletion (-i): the -i option makes the command ask the user for confirmation before
removing each file.
$rm -i chap01 chap02 chap03
rm: remove chap01(yes/no)?y
rm: remove chap02(yes/no)?y
rm: remove chap03(yes/no)?y
Recursive deletion (-r or -R): deletes all subdirectories and files recursively. rm won't normally
remove directories, but when used with the -r or -R option it will.
$ rm -r *
Forcing removal: rm prompts for removal if a file is write-protected. The -f option overrides this minor
protection and forces removal.
$ rm -rf *     # deletes everything in the current directory and below
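Because rm cannot be undone, it is safest to experiment inside a throwaway directory, as in this sketch. All file and directory names here are hypothetical.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir -p progs/data
touch chap01 progs/chap02 progs/data/chap03

rm chap01            # removes a single ordinary file
rm -r progs          # removes progs and everything below it

# Count what is left in the directory (-A also counts hidden names).
remaining=$(ls -A | wc -l | tr -d ' ')
echo "entries left: $remaining"
```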
To rename a directory:
$ mv pts perdir
pts directory is renamed as perdir
$ od -b odfile
The -b option displays the octal value of each character.
0000000 127 150 151 164 145 040 163 160 141 143 145 040 151 156 143 154
0000020 165 144 145 163 040 141 040 011 012 124 150 145 040 007 040 143
Each line displays 16 bytes of data in octal, preceded by the offset in the file of the first byte in the
line.
$od -bc odfile
The -b and -c options combined.
Each line is now replaced with two.
The octal values are shown in the first line, and the printable characters and escape sequences are shown in
the second line.
0000000 127 150 151 164 145 040 163 160 141 143 145 040 151 156 143 154
          W   h   i   t   e       s   p   a   c   e       i   n   c   l
0000020 165 144 145 163 040 141 040 011 012 124 150 145 040 007 040 143
          u   d   e   s       a      \t  \n   T   h   e      \a       c
The octal equivalents of the characters are displayed, e.g. W - 127, i - 151, \t (tab) - 011, \n (newline) - 012,
^G (bell character) - 007
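These octal values can be verified without a prepared file by piping known bytes into od, as in the sketch below. The four bytes used (W, h, a tab and a newline) are chosen only for this illustration.

```shell
# Show both the octal values and the characters for four known bytes.
printf 'Wh\t\n' | od -bc

# Keep only the first line of the octal dump for inspection.
octal_line=$(printf 'Wh\t\n' | od -b | head -n 1)
echo "$octal_line"
```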
MODULE 2
FILE ATTRIBUTES AND PERMISSIONS
The ls command with options
ls -l: LISTING FILE ATTRIBUTES
The ls command is used to obtain a list of all filenames in the current directory. The output in UNIX lingo is
often referred to as the listing. With the -l option, ls displays file attributes; sometimes we combine this option with other options for displaying other
attributes, or ordering the list in a different sequence. ls looks up the file's inode to fetch its attributes. It
lists seven attributes of all files in the current directory and they are:
File type and Permissions
The file type and its permissions: The first column shows the type and permissions associated with each
file. The first character in this column is mostly a - (hyphen), which indicates that the file is an ordinary one. In
UNIX, the file system has three types of permissions: read, write and execute.
Links: The second column indicates the number of links associated with the file. This is actually the
number of filenames maintained by the system of that file.
Ownership: The third column shows the owner of files. The owner has full authority to tamper with files
content and permissions. Similarly, you can create, modify or remove files in a directory if you are the
owner of the directory.
Group ownership: The fourth column represents the group owner of the file. When opening a user
account, the system admin also assigns the user to some group. The concept of a group of users also
owning a file has acquired importance today as group members often need to work on the same file.
File size: The fifth column shows the size of the file in bytes. The important thing to remember here is
that it is only a character count of the file and not a measure of the disk space that it occupies.
Last modification time: The sixth, seventh and eighth columns indicate the last modification time of the
file, which is stored to the nearest second. A file is said to be modified only if its contents have changed
in any way.
Filename: The last column displays the filename arranged in ASCII collating sequence.
For example, $ ls -l
total 72
-rw-r--r-- 1 kumar metal 19514 may 10 13:45 chap01
-rw-r--r-- 1 kumar metal 4174 may 10 15:01 chap02
-rw-rw-rw- 1 kumar metal 84 feb 12 12:30 dept.lst
-rw-r--r-- 1 kumar metal 9156 mar 12 1999 genie.sh
drwxr-xr-x 2 kumar metal 512 may 09 10:31 helpdir
drwxr-xr-x 2 kumar metal 512 may 09 09:57 progs
FILE PERMISSIONS
UNIX has a simple and well defined system of assigning permissions to files.
Let's issue the ls -l command once again to view the permissions of a few files.
$ ls -l chap02 dept.lst dateval.sh
-rwxr-xr-- 1 kumar metal 25000 May 10 19:21 chap02
-rwxr-xr-x 1 kumar metal 890 Jan 10 23:17 dept.lst
-rw-rw-rw- 1 kumar metal 84 Feb 18 12:20 dateval.sh
Third group(r--):
has the write and execute bits absent.
This set is applicable to others, i.e. those who are neither the owner nor members of the group.
This category is referred to as the world.
RELATIVE PERMISSIONS
chmod only changes the permissions specified in the command line and leaves the other permissions
unchanged.
Its syntax is:
chmod category operation permission filename(s)
chmod takes an expression as its argument which contains:
user category (user, group, others)
operation to be performed (assign or remove a permission)
type of permission (read, write, execute)
Ex 1:
$ ls -l xstart
-rw-r--r-- 1 kumar metal 1906 sep 23:38 xstart
Here the user has only read and write permissions.
Using relative permissions, we add the execute permission for the user:
chmod category operation(+,-) permission filename
$ chmod u+x xstart
$ ls -l xstart
-rwxr--r-- 1 kumar metal 1906 sep 23:38 xstart
The chmod command assigns (+) execute (x) permission to the user (u);
the other permissions remain unchanged.
Ex 2: To remove execute permission from all and assign read permission to group and others
$ chmod a-x,go+r xstart /* a-x removes execute permission from all (a), i.e. user, group and others; */
 /* go+r assigns read permission to group and others. No space may follow the comma. */
ABSOLUTE PERMISSIONS
A string of three octal digits is used as an expression. The permission can be represented by one octal
digit for each category. For each category, we add octal digits. If we represent the permissions of each
category by one octal digit, this is how the permission can be represented:
Read permission 4 (octal 100)
Write permission 2 (octal 010)
Execute permission 1 (octal 001)
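The addition of octal digits described above can be sketched as a small shell function. This is only an illustration: perm_to_octal is a hypothetical helper written for these notes, not a standard command, and it assumes a plain nine-character permission string without the leading file-type character.

```shell
#!/bin/sh
# perm_to_octal: convert a 9-character permission string such as rwxr-xr--
# into its three-digit octal form (hypothetical helper, not a UNIX command).
perm_to_octal() {
    perms=$1
    result=""
    while [ -n "$perms" ]; do
        triplet=$(printf '%s' "$perms" | cut -c1-3)   # next category (user, group, others)
        perms=$(printf '%s' "$perms" | cut -c4-)      # categories still to process
        digit=0
        case $triplet in r??) digit=$((digit + 4)) ;; esac   # read    = 4
        case $triplet in ?w?) digit=$((digit + 2)) ;; esac   # write   = 2
        case $triplet in ??x) digit=$((digit + 1)) ;; esac   # execute = 1
        result="$result$digit"
    done
    printf '%s\n' "$result"
}

perm_to_octal rwxr-xr--    # prints 754
perm_to_octal rw-r--r--    # prints 644
```

The resulting three digits are what an absolute chmod expects as its expression.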
We have three categories and three permissions for each category, so three octal digits can describe a
file's permissions completely.
Ex 2:
To assign read and write permissions to the user and remove write and execute permissions from group and others.
Here rw- for the user corresponds to digit 6.
Removing write and execute permissions is nothing but assigning only the read permission to group and others.
Read-only permission (r--) corresponds to 4.
$chmod 644 xstart
Ex 3:
To assign all permissions to the owner, read and write to group and only execute for others.
$chmod 761 xstart
Ex 4
To assign all permissions to all categories.
$chmod 777 xstart
5.2 The Security Implications
Let the default permissions for the file xstart be
-rw-r--r--
$ chmod u-rw,go-r xstart or
$ chmod 000 xstart
Either command removes all permissions from all
categories, as shown below
----------
Such a file is practically useless, but the user can still delete it.
On the other hand,
$ chmod a+rwx xstart or
$ chmod 777 xstart
Either command grants all permissions to all categories, as shown
below
-rwxrwxrwx
The UNIX system by default, never allows this situation as you can never have a secure system. Hence,
directory permissions also play a very vital role here.
This makes all the files and subdirectories found in the shell_scripts directory, executable by all users.
DIRECTORY PERMISSIONS
Directories also have their own permissions and the significance of these permissions differ
from those of ordinary files.
The default permissions of a directory are,
rwxr-xr-x (755)
A directory must never be writable by group and others
Ex1:
$ mkdir c_progs ; ls -ld c_progs
drwxr-xr-x 2 kumar metal 512 may 9 09:57 c_progs
Here the c_progs directory is created (mkdir c_progs) and then the attributes of the directory are listed (ls
-ld c_progs).
If a directory has write permission for group and others also, be assured that every user can remove every
file in the directory. As a rule, you must not make directories universally writable unless you have definite
reasons to do so.
The group owner of the file dept.lst is changed from metal to dba by issuing the command
$chgrp dba dept.lst
2. After a command is entered, the shell scans the command line for metacharacters and expands
4. The shell waits for the command to complete and normally can't do any work while the command is
running.
5. After command execution is complete, the prompt reappears and the shell returns to its waiting role to
WILD CARDS
The metacharacters that are used to construct the generalized pattern for matching filenames belong to
the category called wild cards.
The * and ?
The metacharacter * is one of the characters of the shell wild card set. It matches any
number of characters including none.
For example : to match filenames chap chap01 chap02 chap03 chap04
$ ls chap*
Output : // * matches any string, including the empty one, so chap itself is listed
chap
chap01
chap02
chap03
chap04
chap02
chap03
chap04
To match filenames with three characters that do not begin with an uppercase letter
$ ls [!A-Z]??
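The wild-card patterns discussed above can be tried safely in a scratch directory. The sketch below is illustrative only: the file names are made up, and LC_ALL=C is set as an assumption so that the range A-Z means the ASCII capital letters in every shell.

```shell
#!/bin/sh
# Sketch: wild-card expansion in a scratch directory (all file names are made up).
LC_ALL=C; export LC_ALL          # make the [A-Z] range mean ASCII capitals only
dir=$(mktemp -d) || exit 1
cd "$dir" || exit 1
touch chap chap01 chap02 chap03 abc Abc

star=$(echo chap*)               # * matches any number of characters, including none
quest=$(echo chap0?)             # ? matches exactly one character
class=$(echo chap0[1-2])         # [1-2] matches one character from the class
neg=$(echo [!A-Z]??)             # three characters, the first not an uppercase letter

echo "chap*      -> $star"
echo "chap0?     -> $quest"
echo "chap0[1-2] -> $class"
echo "[!A-Z]??   -> $neg"

cd / && rm -rf "$dir"            # clean up the scratch directory
```

echo is used instead of ls here because the shell, not the command, performs the expansion; any command would receive the same list of names.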
ESCAPING
Placing a \ immediately before a metacharacter turns off its special meaning.
For instance, \* matches * itself. Its special meaning of matching zero or more
characters is turned off.
Ex 1:
$rm chap*
removes all files whose names start with chap: chap, chap01, chap02 and chap03 are removed.
Ex 2:
If there are files with names chap01, chap02,chap03.
To list the filenames starting with chap0
$ ls chap0[1-3]
Output:
chap01
chap02
chap03
Ex 3:
To match the file named as chap0[1-3]
$ ls chap0\[1-3\]
Output:
chap0[1-3]
Escaping the space: To remove the file My document.doc, which has space embedded,
$rm My\ document.doc
QUOTING:
This is another way of turning off the special meaning of a metacharacter.
When a command argument is enclosed within quotes, the meaning of all enclosed special
characters is turned off.
$ rm 'chap*' /* the * metacharacter meaning is turned off */
removes only the file whose name is literally chap*.
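A minimal sketch of escaping and quoting follows. It works in a scratch directory with made-up file names, so nothing of value is removed.

```shell
#!/bin/sh
# Sketch: quoting and escaping protect metacharacters from the shell.
dir=$(mktemp -d) || exit 1
cd "$dir" || exit 1
touch 'chap*' chap01 chap02 'My document.doc'

rm 'chap*'                        # quoted: only the file literally named chap* goes
after_quoted=$(echo chap*)        # the other chap files are still there

rm My\ document.doc               # escaped space: one argument, not two
if [ -e 'My document.doc' ]; then space_file=present; else space_file=removed; fi

echo "left after rm 'chap*': $after_quoted"
echo "My document.doc: $space_file"

cd / && rm -rf "$dir"             # clean up the scratch directory
```

Had the first rm been written without quotes, the shell would have expanded chap* to all three names before rm ever ran.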
STANDARD INPUT
This file is indeed special: it can represent three input sources:
The keyboard, the default source
A file using redirection with the < symbol
Another program using the pipeline
The input redirection operator is the less-than character (<).
When you use wc without an argument, it reads its input from standard input, i.e. the
keyboard
$ wc
Unix is a multiuser multitasking OS
[ctrl-d]
When wc is used with input redirection, the shell opens the file and wc takes its input from
the file we have specified
$ wc < sample.txt /* wc takes its input from the file sample.txt */
output
1 6 36 /* count of lines, words and characters of file sample.txt */
STANDARD OUTPUT
All commands displaying the output on the terminal actually write to the standard output file as
a stream of characters and not directly to the terminal as such.
There are three possible destinations of this stream
The terminal, the default destination
A file using the redirection symbol > and >>
As input to another program using a pipeline
There are two basic redirection operators for standard output.
1 6 37
STANDARD ERROR
When you enter an incorrect command or try to open a non existent file, certain diagnostic messages
show up on the screen. This is the standard error stream whose default destination is the terminal.
$cat filelist
$cat stdout
-rwxr--r-- 1 gilberg staff 1234 oct file1
$cat stderr
Cannot access file2: no such file or directory
$who | wc -l
Output: 5 /* the number of lines produced by the who command */
Here the output of the who command has been passed directly as the input to the wc command, and
who is said to be piped to wc.
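The three standard streams and the pipeline can be exercised together in a short sketch. The file names are made up for illustration; the 2> operator, which redirects standard error, is standard shell syntax.

```shell
#!/bin/sh
# Sketch: redirecting standard input, output and error, and using a pipeline.
dir=$(mktemp -d) || exit 1
cd "$dir" || exit 1

printf 'Unix is a multiuser multitasking OS\n' > sample.txt   # > creates or overwrites
printf 'second line\n' >> sample.txt                          # >> appends

lines=$(wc -l < sample.txt | tr -d ' ')     # < connects the file to standard input

# file2 does not exist: the listing goes to stdout.txt, the diagnostic to stderr.txt
ls sample.txt file2 > stdout.txt 2> stderr.txt
listed=$(cat stdout.txt)
if [ -s stderr.txt ]; then diagnostics=captured; else diagnostics=none; fi

words=$(cat sample.txt | wc -w | tr -d ' ') # pipeline: cat's output is wc's input

echo "lines=$lines words=$words listed=$listed diagnostics=$diagnostics"
cd / && rm -rf "$dir"
```

tr -d ' ' is used only to strip the padding that some versions of wc print around their counts.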
The grep,egrep
$cat emp1.lst
$cat emp2.lst
Ex 2: To search for the pattern director in two files, i.e. emp.lst and emp2.lst
Ignoring case (-i): when you look for a name but are not sure of the case, use the -i option to
ignore case for pattern matching.
Deleting lines (-v): The -v option selects all lines except those containing the pattern.
The lines containing the pattern director are deleted in the output.
Counting lines containing pattern (-c)
The -c option counts the number of lines containing the pattern.
Displaying filenames (-l)
The -l option displays only the names of the files containing the pattern.
Here the pattern manager is searched in all files ending with .lst (*.lst)
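The grep options above can be compared side by side on a small data set. The records below are invented for this sketch and only imitate the layout of emp.lst.

```shell
#!/bin/sh
# Sketch: the -c, -i, -v and -l options of grep on made-up employee records.
dir=$(mktemp -d) || exit 1
cd "$dir" || exit 1
cat > emp.lst << 'EOF'
101|sharma|director|production|12/03/70|7000
102|barun|director|marketing|11/06/67|7800
103|gupta|Manager|sales|05/01/72|6000
EOF
cat > emp2.lst << 'EOF'
201|anil|manager|accounts|01/02/75|5500
EOF

count=$(grep -c director emp.lst)                        # -c: number of matching lines
nocase=$(grep -i manager emp.lst | wc -l | tr -d ' ')    # -i: Manager matches manager
others=$(grep -v director emp.lst | cut -d'|' -f2)       # -v: lines without the pattern
files=$(grep -l manager *.lst)                           # -l: names of matching files

echo "directors=$count nocase=$nocase others=$others files=$files"
cd / && rm -rf "$dir"
```

Without -i the pattern manager does not match the capitalised Manager in emp.lst, which is why -l reports only emp2.lst.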
Negating a class(^)
Regular expressions use the caret(^) to negate the character class, while the shell uses bang(!)
Ex:
[^a-zA-Z] matches a non-alphabetic character
The *(asterisk)
The * refers to the immediately preceding character. Here it indicates that the previous character can
occur many times or not at all.
The pattern g*
matches the null string, g, gg, ggg, .....
The dot ( .)
A . matches a single character, whereas the shell uses ? for that purpose.
Here the . matches a single character. The pattern matches all lines beginning with 10 followed by a single character.
It displays lines with id 101, 102, 103, 104, 105.
7.1.1 Specifying pattern locations(^ and $)
These two regular expression characters match a pattern at the beginning or end of a line.
^(caret) Matching at the beginning of the line
$(dollar) Matching at the end of the line
5...$ matches all the lines ending with four digit number beginning with 5.
The + and ?
+ Matches one or more occurrences of the previous character
? Matches zero or one occurrences of the previous character.
The + symbol matches one or more occurrences of the character c, i.e. c, cc, ccc.
A string that contains no c is not matched by c+, since + requires at least one occurrence.
The characters ( and ) let you group patterns; by using | inside the parentheses, you can frame
a more compact pattern.
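The regular expression characters covered in this section can be checked with a small counting helper. This is a sketch under stated assumptions: sample and count_matches are hypothetical functions, the sample lines are invented, and grep -E is used as the portable equivalent of egrep.

```shell
#!/bin/sh
# Sketch: counting how many sample lines each regular expression matches.
sample() {
    printf '%s\n' 101 1025 5000 5321 color colour sengupta dasgupta
}
# count_matches PATTERN: number of sample lines matched by an extended regex
count_matches() {
    sample | grep -E -c -- "$1"
}

count_matches '^10.$'            # 1: only 101 is 10 followed by one character
count_matches '5...$'            # 2: 5000 and 5321 end with 5 plus three characters
count_matches 'colou?r'          # 2: ? makes the u optional
count_matches 'g+'               # 2: lines with one or more g
count_matches '(sen|das)gupta'   # 2: grouping with alternation
count_matches '[^a-zA-Z]'        # 4: lines containing a non-alphabetic character
```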
SHELL PROGRAMMING
The following are examples of variable names:
VAR_1
VAR_2
TOKEN_A
DEFINING VARIABLES:
Variables are defined as follows
variable_name=variable_value
For example:
NAME="Sumitabha Das"
ACCESSING VARIABLES:
To access the value stored in a variable, prefix its name with the dollar sign ( $)
For example, the following script defines the variable NAME, accesses its value and
prints it on STDOUT
#!/bin/sh
NAME="Sumitabha Das"
echo $NAME
ENVIRONMENT VARIABLES
An environment variable is a variable that is available to any child process of the shell. Some programs
need environment variables in order to function correctly. Usually a shell script defines only those
environment variables that are needed by the programs that it runs.
SHELL: points to the shell defined as default.
DISPLAY: Contains the identifier for the display that X11 programs should use by default.
HOME: Indicates the home directory of the current user; the default argument for the cd built in
command
IFS: Indicates the Internal Field Separator that is used by the parser for word splitting after expansion.
PATH : Indicates search path for commands.It is a colon separated list of directories in which the shell
looks for commands.
PWD: Indicates the current working directory as set by the cd command.
RANDOM: Generates a random integer between 0 and 32767 each time it is referenced.
SHLVL: Increments by one each time an instance of bash is created.
UID: Expands to the numeric user ID of the current user initialized at shell prompt.
The following example shows a few environment variables
$ echo $HOME
/root
$ echo $DISPLAY
$ echo $TERM
xterm
$ echo $PATH
/usr/local/bin:/bin:/usr/bin:/home/amrood/bin:/usr/local/bin
$
$ PS1='=>'
=>
=>
Your prompt would become =>.
$ echo "this is a
> test"
this is a
test
$
$ PS2='-->'
$ echo "this is a
--> test"
The .profile File
The file /etc/profile is maintained by the system administrator of your UNIX machine and
contains shell initialization information required by all users on a system.
The file .profile is under your control. You can add as much shell customization information as
you want to this file. The minimum set of information that you need to configure includes
The type of terminal you are using
A list of directories in which to locate commands
A list of variables affecting the look and feel of your terminal.
You can check the .profile available in your home directory. Open it using the vi editor and check
all the variables set for your environment.
SHELL SCRIPTS
When a group of commands have to be executed regularly they should be stored in a file and the file
itself executed as a shell script or shell program.
output:
$sh script.sh
My shell: /bin/sh
A single read statement can be used with one or more variables to let you enter multiple
arguments.
read pname flname
The script asks for a pattern to be entered. Input the string director, which is assigned to the
variable pname. Next the script asks for the filename enter the string emp.lst which is assigned
to the variable flname.
grep runs with these two variables as arguments
#!/bin/sh
#emp1.sh
#
echo "Enter the pattern to be searched : \c"
read pname
echo " Enter the file to be used : \c"
read flname
echo " Searching for $pname from file $flname"
grep "$pname" $flname
echo "Selected rows shown above"
Output:
$sh emp1.sh
Enter the pattern to be searched:director
Enter the file to be used: emp.lst
Searching for director from file emp.lst
101|sharma|director|production|12/03/70|7000
102|barun|director|marketing|11/06/67|7800
Selected rows shown above
grep "$1" $2
echo "\n job over"
Output:
$ sh emp2.sh director emp.lst
Program: emp2.sh
The number of arguments specified is:2
The arguments are director emp.lst
101| sharma|director|production|12/03/70|7000
102|barun|director|marketing|11/06/67|7800
job over
Shell Parameter    Significance
$1, $2... Positional parameters representing command line arguments
$# Number of arguments specified in command line
$0 Name of executed command
$* Complete set of positional parameters as a single string
"$@" Each quoted string treated as separate argument
$? Exit status of last command
$$ PID of the current shell
$! PID of the last background job
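A short sketch of the special parameters in the table follows. The function names are hypothetical; shell functions are used here because they receive positional parameters in the same way a script does.

```shell
#!/bin/sh
# Sketch: positional parameters, $#, and the difference between $* and "$@".
show_params() {
    echo "count: $#"
    echo "first: $1"
    for arg in "$@"; do          # "$@" keeps each quoted argument intact
        echo "arg: $arg"
    done
}

count_star() { set -- $*;   echo $#; }   # unquoted $* is split again on spaces
count_at()   { set -- "$@"; echo $#; }   # "$@" preserves the original arguments

show_params director "emp list.lst"
count_star director "emp list.lst"       # prints 3
count_at   director "emp list.lst"       # prints 2

true;  echo "exit status of true:  $?"   # 0
false; echo "exit status of false: $?"   # 1
```

The difference matters whenever an argument contains a space, such as a quoted multiword pattern passed on to grep.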
It is through the exit command or function that every command returns an exit status to the caller.
Further, a command is said to return a true exit status if it executes successfully and false if it fails.
THE PARAMETER $?: It stores the exit status of the last command. It has the value 0 if the
command succeeds and a non-zero value if it fails. This parameter is set by the argument of exit. If no
exit status is specified, then $? is set to the exit status of the last command executed.
Consider two files file1 which exist in current directory and file2 which does not exist
$ ls -l file1; echo $? /* file1 attributes are listed */
Output: 0 /* exit status $?=0, since the command executed successfully */
Output:
1066| sharma | director |sales |03/09/66 | 7000
1098| Kumar |director| production|0/08/67 | 8200
Pattern found in file
The || operator plays inverse role. The second command is executed only when the first fails.
Output:
Pattern not found /* cmd1: the pattern deputy manager is not found in emp.lst. */
/* Hence cmd1 fails and cmd2, which prints "Pattern not found", */
/* executes. */
Output
1082|sumith| manager|marketing|09/09/73| 709 /* Here cmd1 executes successfully, i.e. */
/* manager is found; therefore cmd2 will not */
/* be executed. */
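The behaviour of && and || can be reproduced with a few invented records. find_pattern is a hypothetical helper; it chains both operators, which behaves like an if-else as long as the command after && itself succeeds.

```shell
#!/bin/sh
# Sketch: conditional execution with && and ||, driven by grep's exit status.
dir=$(mktemp -d) || exit 1
printf '%s\n' '1066|sharma|director|sales' '1082|sumith|manager|marketing' > "$dir/emp.lst"

# find_pattern PATTERN FILE: report whether PATTERN occurs in FILE
find_pattern() {
    grep "$1" "$2" > /dev/null && echo "Pattern found in file" || echo "Pattern not found"
}

found=$(find_pattern director "$dir/emp.lst")
missing=$(find_pattern 'deputy manager' "$dir/emp.lst")

grep director "$dir/emp.lst" > /dev/null;         status_found=$?     # 0: success
grep 'deputy manager' "$dir/emp.lst" > /dev/null; status_missing=$?   # 1: no match

echo "$found (status $status_found)"
echo "$missing (status $status_missing)"
rm -rf "$dir"
```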
CONDITIONAL STATEMENTS:
The if CONDITIONAL
The if statement takes three forms:
Form 1:
if command is successful
then
execute commands
else
execute commands
fi
Form 2:
if command is successful
then
execute commands
fi
Form 3:
if command is successful
then
execute commands
elif command is successful
then ..
else ..
fi
The if statement makes two-way decisions depending on the fulfillment of a certain condition.
#!/bin/sh
a=10
b=20
if [ $a -eq $b ]
then
echo "a is equal to b"
elif [ $a -gt $b ]
then
echo "a is greater than b"
elif [ $a -lt $b ]
then
echo "a is lesser than b"
else
echo "None of the conditions met"
fi
output:
a is lesser than b
esac
case first matches expression with pattern1. If the match succeeds, then it executes commands1, which
may
list is terminated with a pair of semicolons and the entire construct is closed with esac .
#!/bin/sh
#menu.sh
echo "\n
1. List of files\n 2.Processes of user\n 3.Todays date\n
4.Users of system\n 5.Quit\n
Enter your option: \c"
read choice
case "$choice" in
1 ) ls -l ;;
2 ) ps -f ;;
3 ) date ;;
4 ) who ;;
5 ) exit ;;
esac
3. Todays date
4.Users of system
5. Quit
Enter your option : 3
Sun Nov 6 18:03:06 IST 2016
Matching multiple patterns:
case can also specify same action for more than one pattern.
For example, the expression y|Y can be used to match y in both upper and lower case.
read answer
case "$answer" in
y|Y ) ;;
n|N ) exit ;;
esac
Wild cards: case uses them
case has a string matching feature that uses wild cards.
It uses the filename matching metacharacters *, ? and the character class, but only to match
strings and not the files in the current directory.
case "$answer" in
[yY][eE]* ) ;;
[nN][oO] ) exit ;;
esac
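A self-contained sketch of pattern matching in case follows. classify_answer is a hypothetical function; it echoes a result instead of exiting so that every branch can be tried from one script.

```shell
#!/bin/sh
# Sketch: case with alternation (|), character classes and the * catch-all.
classify_answer() {
    case $1 in
        [yY]|[yY][eE][sS]) echo "yes" ;;      # y, Y, yes, YES, Yes, ...
        [nN]|[nN][oO])     echo "no" ;;       # n, N, no, NO, ...
        *)                 echo "unknown" ;;  # anything else
    esac
}

classify_answer Yes      # prints yes
classify_answer n        # prints no
classify_answer maybe    # prints unknown
```

The patterns are matched against the string only; no file in the current directory influences the result.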
NUMERIC COMPARISON:
The numeric comparison operators used by test are
Operator Meaning
-eq Equal to
-ne Not equal to
-gt Greater than
-ge Greater than or equal to
-lt Less than
-le Less than or equal to
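Numeric comparison in the shell is confined to integer values only. Older shells simply truncate decimal values,
while many current shells (bash and dash, for example) reject them with an error and a non-zero exit status.
$ x=5; y=7; z=7.2
$ test $x -eq $y ; echo $?
Output : 1
$ test $x -lt $y ; echo $?
Output: 0
$ test $z -gt $y ; echo $?
Output: 1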
STRING COMPARISON
test can be used to compare strings with yet another set of operators.
Test       True if
s1 = s2    String s1 is equal to s2
s1 != s2   String s1 is not equal to s2
-n stg     String stg is not a null string
-z stg     String stg is a null string
stg        String stg is assigned and not null
s1 == s2   String s1 is equal to s2 (Korn and bash only)
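Example:
#!/bin/sh
a="abc" # illustrative values; the original assignments are not shown
b="efg"
if [ "$a" = "$b" ]
then
echo "a is equal to b"
else
echo "a is not equal to b"
fi
output:
a is not equal to b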
FILE TESTS
test can be used to test the various file attributes like its type(file, directory or symbolic link) or its
permissions(read,write,execute)
$ ls -l emp.lst
-rw-rw-rw- 1 kumar group 870 Sep 8 15:52 emp.lst
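A few of the file tests can be sketched as follows. describe is a hypothetical helper and the files are created in a scratch directory; permission tests such as -r, -w and -x are left out on purpose because their result depends on who runs the script.

```shell
#!/bin/sh
# Sketch: testing file type and size with test (written here as [ ]).
describe() {
    if [ ! -e "$1" ]; then
        echo "missing"
    elif [ -d "$1" ]; then
        echo "directory"
    elif [ -f "$1" ] && [ -s "$1" ]; then    # -s: exists with size greater than zero
        echo "non-empty file"
    elif [ -f "$1" ]; then
        echo "empty file"
    else
        echo "other"
    fi
}

dir=$(mktemp -d) || exit 1
echo "101|sharma|director" > "$dir/emp.lst"
: > "$dir/empty.lst"                         # create a zero-length file

r_dir=$(describe "$dir")
r_full=$(describe "$dir/emp.lst")
r_empty=$(describe "$dir/empty.lst")
r_none=$(describe "$dir/nofile")
echo "$r_dir / $r_full / $r_empty / $r_none"
rm -rf "$dir"
```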
Ex:
$ for file in chap20 chap21 chap22
do
cp $file ${file}.bak
echo $file copied to ${file}.bak
done
Output:
chap20 copied to chap20.bak
chap21 copied to chap21.bak
chap22 copied to chap22.bak
Ex:
\$1 is $1, \$2 is $2, \
Output: $1 is 989, $2 is 878, $3 is 779
$shift 2
$echo $1 $2 $3
Output: 09:04:30 IST 2016
If the message is short you can have both the command and message in the same script.
mail sharma << MARK
Your program for printing the invoices has been executed
on `date`. The updated file is $flname
MARK
The here document symbol (<<) is followed by two lines of data and a delimiter (the string
MARK).
The shell treats every line following the command and delimited by MARK as input to the
command.
Sharma at the other end will see the message text with the date inserted by
command substitution and the evaluated filename.
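The here document can be tried without sending mail by feeding it to cat instead. In this sketch the function names and the value of flname are made up; quoting the delimiter, as in the second function, is standard shell behaviour that switches off variable and command substitution.

```shell
#!/bin/sh
# Sketch: a here document with and without substitution (cat stands in for mail).
flname=invoice.lst

notice() {
cat << MARK
Your program for printing the invoices has been executed.
The updated file is $flname
MARK
}

notice_literal() {
cat << 'MARK'
The updated file is $flname
MARK
}

message=$(notice)             # $flname is evaluated inside the here document
literal=$(notice_literal)     # quoted delimiter: the text is passed through as is
echo "$message"
echo "$literal"
```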
When a script is sent any of the signals in signal_list, trap executes the commands in
command_list
The signal_list can contain the integer values or names of one or more signals.
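A minimal sketch of trap follows. To keep the demonstration self-contained, a child shell is started with sh -c and sends the INT signal to itself with kill, which simulates pressing the interrupt key; the messages are made up.

```shell
#!/bin/sh
# Sketch: trap runs its command_list when a listed signal arrives.
script='
trap "echo interrupted; exit 1" INT TERM
echo started
kill -INT $$
echo "not reached"
'
# $$ inside the single-quoted text is expanded by the child shell, so the
# child signals itself; the trap then prints a message and exits with status 1.
output=$(sh -c "$script")
status=$?

echo "$output"
echo "exit status: $status"
```

The line after kill is never printed, because the command list given to trap ends with exit.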
MODULE 3
UNIX FILE APIs
Files in a UNIX and POSIX system may be any one of the following types:
Regular file
Directory File
FIFO file
Block device file
character device file
Symbolic link file.
There are
open
This is used to establish a connection between a process and a file i.e. it is used to open an existing
file for data transfer function or else it may be also be used to create a new file.
The returned value of the open system call is the file descriptor (row number of the file table), which
contains the inode information.
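The prototype of open function is
#include <sys/types.h>
#include <fcntl.h>
int open(const char *path_name, int access_mode, mode_t permission);
If successful, open returns a nonnegative integer representing the open file descriptor.
If unsuccessful, open returns -1.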
The first argument is the name of the file to be created or opened. This may be an absolute pathname
or relative pathname.
If the given pathname is symbolic link, the open function will resolve the symbolic link reference to
a non-symbolic link file to which it refers.
The second argument is access modes, which is an integer value that specifies how actually the file
should be accessed by the calling process.
Generally, the access modes are specified in <fcntl.h>. Various access modes are:
O_RDONLY open for reading only
O_WRONLY open for writing only
O_RDWR open for reading and writing
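There are other access modes, which are termed as access modifier flags, and one or more of the following
can be specified by bitwise-ORing them with one of the above access mode flags to alter the access
mechanism of the file.
O_APPEND append data to the end of the file
O_CREAT create the file if it does not exist
O_EXCL used with O_CREAT; open fails if the file already exists
O_TRUNC discard the existing contents of the file
O_NONBLOCK subsequent read or write on the file should be non-blocking
O_NOCTTY do not use the named terminal device as the controlling terminal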
To illustrate the use of the above flags, the following example statement opens a file called
/usr/divya/usp for read and write in append mode:
int fd=open("/usr/divya/usp",O_RDWR | O_APPEND,0);
If the file is opened in read only, then no other modifier flags can be used.
The third argument is used only when a new file is being created. The symbolic names for file
permission are given in the table in the previous page.
creat
This system call is used to create new regular files.
The prototype of creat is
#include <sys/types.h>
#include <fcntl.h>
int creat(const char *path_name, mode_t mode);
read
The read function fetches a fixed-size block of data from a file referenced by a given file descriptor.
The prototype of read function is:
#include <sys/types.h>
#include <unistd.h>
ssize_t read(int fdesc, void *buf, size_t size);
write
The write system call is used to write data into a file.
The write function puts a fixed-size block of data to a file referenced by a given file descriptor.
The prototype of write is
#include <sys/types.h>
#include <unistd.h>
ssize_t write(int fdesc, const void *buf, size_t size);
close
The close system call is used to terminate the connection to a file from a process.
The prototype of close is
#include <unistd.h>
int close(int fdesc);
fcntl
The fcntl function helps a user to query or set flags and the close-on-exec flag of any file descriptor.
The prototype of fcntl is
#include <fcntl.h>
int fcntl(int fdesc, int cmd, ...);
int cur_flags=fcntl(fdesc,F_GETFL);
int rc=fcntl(fdesc,F_SETFL,cur_flags | O_APPEND | O_NONBLOCK);
The following example reports the close-on-exec flag of fdesc, sets it to on afterwards:
cout<<fdesc<<" close-on-exec: "<<fcntl(fdesc,F_GETFD)<<endl;
(void)fcntl(fdesc,F_SETFD,FD_CLOEXEC);
The following statements change the standard input of a process to a file called FOO:
int fdesc=open("FOO",O_RDONLY); //open FOO for read
close(0); //close standard input
if(fcntl(fdesc,F_DUPFD,0)==-1)
perror("fcntl"); //stdin from FOO now
char buf[256];
int rc=read(0,buf,256); //read data from FOO
The dup and dup2 functions in UNIX perform the same file duplication
function as fcntl. They can be implemented using fcntl as:
lseek
The lseek function is used to change the file offset to a different value.
Thus lseek allows a process to perform random access of data on any opened file.
The prototype of lseek is
#include <sys/types.h>
#include <unistd.h>
off_t lseek(int fdesc, off_t pos, int whence);
link
The link function creates a new link for the existing file.
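The prototype of the link function is
#include <unistd.h>
int link(const char *cur_link, const char *new_link);
/*test_ln.c*/
#include<stdio.h>
#include<unistd.h>
int main(int argc, char *argv[])
{
if(argc!=3) {
fprintf(stderr,"usage: %s <src_file> <dest_file>\n",argv[0]);
return 0;
}
if(link(argv[1],argv[2])==-1) {
perror("link");
return 1;
}
return 0;
}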
unlink
The unlink function deletes a link of an existing file.
This function decreases the hard link count attributes of the named file, and removes the file name
entry of the link from directory file.
A file is removed from the file system when its hard link count is zero and no process has any file
descriptor referencing that file.
The prototype of unlink is
#include <unistd.h>
int unlink(const char *cur_link);
The UNIX mv command can be implemented using the link and unlink APIs as shown:
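#include <iostream>
#include <unistd.h>
#include <string.h>
int main ( int argc, char *argv[ ])
{
/* reject a wrong argument count or identical source and destination names */
if (argc != 3 || !strcmp(argv[1],argv[2]))
std::cerr << "usage: " << argv[0] << " <old_link> <new_link>\n";
else if(link(argv[1],argv[2]) == 0)
return unlink(argv[1]);
return 1;
}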
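The effect of the link and unlink APIs can also be observed from the shell, where ln and rm create and remove hard links. The sketch below uses made-up file names in a scratch directory and reads the link count from the second column of ls -l.

```shell
#!/bin/sh
# Sketch: hard links from the shell; ln adds a link and rm removes one.
dir=$(mktemp -d) || exit 1
cd "$dir" || exit 1
echo data > original

ln original alias                                     # second name for the same inode
links_after_ln=$(ls -l original | awk '{print $2}')   # link count is now 2

rm original                                           # removes one name only
content=$(cat alias)                                  # the data is still reachable
links_after_rm=$(ls -l alias | awk '{print $2}')      # link count is back to 1

ln alias renamed && rm alias                          # mv as link followed by unlink
if [ -e renamed ] && [ ! -e alias ]; then moved=yes; else moved=no; fi

echo "links: $links_after_ln then $links_after_rm, content: $content, moved: $moved"
cd / && rm -rf "$dir"
```

The file's data disappears only when the last name is removed and no process still has the file open.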
stat, fstat
The stat and fstat function retrieves the file attributes of a given file.
The only difference between stat and fstat is that the first argument of a stat is a file pathname, where
as the first argument of fstat is file descriptor.
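The prototypes of these functions are
#include <sys/stat.h>
#include <unistd.h>
int stat(const char *path_name, struct stat *statv);
int fstat(int fdesc, struct stat *statv);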
The second argument to stat and fstat is the address of a struct stat-typed variable which is defined in
the
<sys/stat.h> header.
Its declaration is as follows:
struct stat
{
dev_t st_dev; /* file system ID */
ino_t st_ino; /* file inode number */
mode_t st_mode; /* contains file type and permission */
nlink_t st_nlink; /* hard link count */
uid_t st_uid; /* file user ID */
gid_t st_gid; /* file group ID */
dev_t st_rdev; /*contains major and minor device#*/
off_t st_size; /* file size in bytes */
time_t st_atime; /* last access time */
time_t st_mtime; /* last modification time */
time_t st_ctime; /* last status change time */
};
access
The access system call checks the existence and access permission of user to a named file.
The prototype of access function is:
#include <unistd.h>
int access(const char *path_name, int flag);
The flag argument value to an access call is composed by bitwise-ORing one or more of the
above bit flags as shown:
else
chmod, fchmod
The chmod and fchmod functions change file access permissions for owner, group & others
as well as the set_UID, set_GID and sticky flags.
A process must have the effective UID of either the super-user or the owner of the file.
The pathname argument of chmod is the path name of a file whereas the fdesc argument of
if (UID == (uid_t)-1)
else
perror
utime Function
The utime function modifies the access time and the modification time stamps of a file.
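The prototype of utime function is
#include <sys/types.h>
#include <utime.h>
int utime(const char *path_name, const struct utimbuf *times);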
So, in order to overcome this drawback UNIX and POSIX standard support file locking mechanism.
File locking is applicable for regular files.
Only a process can impose a write lock or read lock on either a portion of a file or on the entire file.
The differences between the read lock and the write lock is that when write lock is set, it prevents the
other process from setting any over-lapping read or write lock on the locked file.
Similarly when a read lock is set, it prevents other processes from setting any overlapping write locks
on the locked region.
The intention of the write lock is to prevent other processes from both reading and writing the locked
region while the process that sets the lock is modifying the region, so a write lock is termed an
exclusive lock.
The use of a read lock is to prevent other processes from writing to the locked region while the process
that sets the lock is reading data from the region.
Other processes are allowed to lock and read data from the locked regions. Hence a read lock is also
called a shared lock.
File locks are mandatory if they are enforced by the operating system kernel.
If a mandatory exclusive lock is set on a file, no process can use the read or write system calls to
access the data on the locked region.
These mechanisms can be used to synchronize reading and writing of shared files by multiple
processes.
If a process locks up a file, other processes that attempt to write to the locked regions are blocked until
the former process releases its lock.
The problem with mandatory locks is that if a runaway process sets a mandatory exclusive lock on a file and
never unlocks it, then no other process can access the locked region of the file until the runaway process
is killed or the system is rebooted.
If locks are not mandatory, then they are advisory locks.
A kernel at the system call level does not enforce advisory locks.
This means that even though a lock may be set on a file, other processes can still use the read and
write functions to access the file.
To make use of advisory locks, process that manipulate the same file must co-operate such that they
follow the given below procedure for every read or write operation to the file.
Try to set a lock at the region to be accessed. If this fails, a process can either wait for the
lock request to become successful or do something else and try to lock the region again later.
After a lock is acquired successfully, read or write the locked region.
Release the lock.
If a process sets a read lock on a file, for example from address 0 to 256, then sets a write lock on the
file from address 0 to 512, the process will own only one write lock on the file from 0 to 512. The
previous read lock from 0 to 256 is now covered by the write lock, and the process does not own two
locks on the region from 0 to 256. This process is called lock promotion.
Furthermore, if the process now unlocks the file from 128 to 480, it will own two write locks on the
file: one from 0 to 127 and the other from 481 to 512. This process is called lock splitting.
UNIX systems provide fcntl function to support file locking. By using fcntl it is possible to impose
read or write locks on either a region or an entire file.
25th byte.
lock.l_type=F_RDLCK;
lock.l_whence=0;
lock.l_start=10; lock.l_len=15;
fcntl(fd,F_SETLK,&lock);
}
A Directory file is a record-oriented file, where each record stores a file name and the inode number
of a file that resides in that directory.
Directories are created with the mkdir API and deleted with the rmdir API.
The prototype of mkdir is
#include <sys/stat.h>
#include <unistd.h>
int mkdir(const char *path_name, mode_t mode);
Function    Use
opendir     Opens a directory file for read-only. Returns a file handle DIR* for future reference of the file.
readdir     Reads a record from a directory file referenced by dir_fdesc and returns that record information.
rewinddir   Resets the file pointer to the beginning of the directory file referenced by dir_fdesc. The next call to readdir will read the first record from the file.
closedir    Closes a directory file referenced by dir_fdesc.
An empty directory is deleted with the rmdir API.
The prototype of rmdir is
#include <unistd.h>
int rmdir(const char *path_name);
If the link count of the directory becomes 0, with the call and no other process has the directory open
then
Function Use
telldir Returns the file pointer of a given dir_fdesc
seekdir Changes the file pointer of a given dir_fdesc to a specified address
The following list_dir.C program illustrates the uses of the mkdir, opendir, readdir, closedir and rmdir APIs:
#include<iostream>
#include<stdio.h>
#include<sys/types.h>
#include<unistd.h>
#include<string.h>
#include<sys/stat.h>
#if defined(BSD) && !_POSIX_SOURCE
#include<sys/dir.h>
typedef struct direct Dirent;
#else
#include<dirent.h>
typedef struct dirent Dirent;
#endif
using namespace std;
int main(int argc, char* argv[])
{
Dirent* dp;
DIR* dir_fdesc;
while(--argc>0)
{
if(!(dir_fdesc=opendir(*++argv)))
{
/* the directory cannot be opened: try to create it */
if(mkdir(*argv,S_IRWXU | S_IRWXG | S_IRWXO)==-1)
perror("opendir");
continue;
}
for(int i=0;i<2;i++)
{
int cnt=0; /* number of entries other than . and .. */
while((dp=readdir(dir_fdesc))!=NULL)
{
if(i)
cout<<dp->d_name<<endl; /* second pass: list the entries */
if(strcmp(dp->d_name,".") && strcmp(dp->d_name,".."))
cnt++;
}
if(!cnt)
{
rmdir(*argv); /* remove the directory if it is empty */
break;
}
rewinddir(dir_fdesc);
}
closedir(dir_fdesc);
}
return 0;
}
execute permission is granted for user, group and others with major number as 8 and minor number
3.
On success mknod API returns 0 , else it returns -1
The following test_mknod.C program illustrates the use of the mknod, open, read, write and close APIs on
a block device file.
#include<iostream.h>
#include<stdio.h>
#include<stdlib.h>
#include<sys/types.h>
#include<unistd.h>
#include<fcntl.h>
#include<sys/stat.h>
FIFO
FIFO files are sometimes called named pipes.
Pipes can be used only between related processes when a common ancestor has created the pipe.
Creating a FIFO is similar to creating a file.
Indeed the pathname for a FIFO exists in the file system.
The prototype of mkfifo is
#include <sys/types.h>
#include <sys/stat.h>
int mkfifo(const char *path_name, mode_t mode);
Returns 0 on success and -1 on failure.
If the pipe call executes successfully, the process can read from fd[0] and write to fd[1]. A single
process with a pipe is not very useful. Usually a parent process uses pipes to communicate with its
children.
The following test_fifo.C example illustrates the use of mkfifo, open, read, write and close APIs for a FIFO file:
#include <iostream>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <string.h>
#include <errno.h>
using namespace std;

int main(int argc, char* argv[])
{
    if (argc != 2 && argc != 3)
    {
        cout << "usage: " << argv[0] << " <file> [<arg>]" << endl;
        return 0;
    }
    int fd;
    char buf[256];
    ssize_t n;
    /* Create the FIFO; the call fails harmlessly if it already exists. */
    (void) mkfifo(argv[1], S_IFIFO | S_IRWXU | S_IRWXG | S_IRWXO);
    if (argc == 2)      /* reader process */
    {
        fd = open(argv[1], O_RDONLY | O_NONBLOCK);
        /* Poll until data arrives from a writer. */
        while ((n = read(fd, buf, sizeof(buf) - 1)) == -1 && errno == EAGAIN)
            sleep(1);
        while (n > 0)
        {
            buf[n] = '\0';  /* read does not null-terminate the data */
            cout << buf << endl;
            n = read(fd, buf, sizeof(buf) - 1);
        }
    }
    else                /* writer process */
    {
        fd = open(argv[1], O_WRONLY);
        write(fd, argv[2], strlen(argv[2]));
    }
    close(fd);
    return 0;
}
A symbolic link is an indirect pointer to a file, unlike a hard link, which points directly to the inode of the file.
Symbolic links were developed to get around the limitations of hard links:
Symbolic links can link files across file systems.
Symbolic links can link directory files
Symbolic links always reference the latest version of the files to which they link
There are no file system limitations on a symbolic link and what it points to and anyone can
create a symbolic link to a directory.
Symbolic links are typically used to move a file or an entire directory hierarchy to some other
location on a system.
A symbolic link is created with the symlink function.
The prototype is
#include <unistd.h>
int symlink(const char *org_link, const char *sym_link);
The org_link and sym_link arguments to a symlink call specify the original file path name and the symbolic link path name to be created.
UNIX PROCESSES
INTRODUCTION
A Process is a program under execution in a UNIX or POSIX system.
main FUNCTION
A C program starts execution with a function called main. The prototype for the main function is
int main(int argc, char *argv[]);
where argc is the number of command-line arguments, and argv is an array of pointers to the arguments.
When a C program is executed by the kernel by one of the exec functions, a special start-up routine is called
before the main function is called. The executable program file specifies this routine as the starting address
for the program; this is set up by the link editor when it is invoked by the C compiler. This start-up routine
takes values from the kernel, the command-line arguments and the environment and sets things up so that the
main function is called.
PROCESS TERMINATION
There are eight ways for a process to terminate. Normal termination occurs in five ways:
Return from main
Calling exit
Calling _exit or _Exit
Return of the last thread from its start routine
Calling pthread_exit from the last thread
Abnormal termination occurs in three ways:
Calling abort
Receipt of a signal
Response of the last thread to a cancellation request
Exit Functions
Three functions terminate a program normally: _exit and _Exit, which return to the kernel immediately, and
exit, which performs certain cleanup processing and then returns to the kernel.
All three exit functions expect a single integer argument, called the exit status. Returning an integer value from the main function is equivalent to calling exit with the same value. Thus
exit(0);
is the same as
return(0);
from the main function.
undefined. Example:
atexit Function
With ISO C, a process can register at least 32 functions that are automatically called by exit. These are called exit handlers and are registered by calling the atexit function.
#include "apue.h"
static void my_exit1(void);
static void my_exit2(void);
int main(void)
{
    if (atexit(my_exit2) != 0)
        err_sys("can't register my_exit2");
    if (atexit(my_exit1) != 0)
        err_sys("can't register my_exit1");
    if (atexit(my_exit1) != 0)
        err_sys("can't register my_exit1");
    printf("main is done\n");
    return(0);
}
static void
my_exit1(void)
{
    printf("first exit handler\n");
}
static void
my_exit2(void)
{
    printf("second exit handler\n");
}
Output:
$ ./a.out
main is done
first exit handler
first exit handler
second exit handler
The handlers are called in the reverse order of their registration, and a handler registered twice is called twice.
The below figure summarizes how a C program is started and the various ways it can terminate.
COMMAND-LINE ARGUMENTS
When a program is executed, the process that does the exec can pass command-line arguments to the new
program.
Example: Echo all command-line arguments to standard output
#include "apue.h"
ENVIRONMENT LIST
Each program is also passed an environment list. Like the argument list, the environment list is an array of
character pointers, with each pointer containing the address of a null-terminated C string. The address of the
array of pointers is contained in the global variable environ:
extern char **environ;
Figure: Environment consisting of five C character strings
appearing outside any function causes this variable to be stored in the uninitialized data segment.
Stack, where automatic variables are stored, along with information that is saved each time a
function is called. Each time a function is called, the address of where to return to and certain
information about the caller's environment, such as some of the machine registers, are saved on the
stack. The newly called function then allocates room on the stack for its automatic and temporary
variables. This is how recursive functions in C can work. Each time a recursive function calls itself,
a new stack frame is used, so one set of variables doesn't interfere with the variables from another
instance of the function.
Heap, where dynamic memory allocation usually takes place. Historically, the heap has been
located between the uninitialized data and the stack.
SHARED LIBRARIES
Nowadays most UNIX systems support shared libraries. Shared libraries remove the common library routines
from the executable file, instead maintaining a single copy of the library routine somewhere in memory that
all processes reference. This reduces the size of each executable file but may add some runtime overhead,
either when the program is first executed or the first time each shared library function is called. Another
advantage of shared libraries is that library functions can be replaced with new versions without having to re-link every program that uses the library. With the cc compiler, we can use the option g to indicate that we are using a shared library.
MEMORY ALLOCATION
ISO C specifies three functions for memory allocation:
malloc, which allocates a specified number of bytes of memory. The initial value of the memory is
indeterminate.
calloc, which allocates space for a specified number of objects of a specified size. The space is
initialized to all 0 bits.
realloc, which increases or decreases the size of a previously allocated area. When the size increases, it may involve moving the previously allocated area somewhere else, to provide the additional room at the end. Also, when the size increases, the initial value of the space between the old contents and the end of the new area is indeterminate.
All three return: non-null pointer if OK, NULL on error
void free(void *ptr);
The pointer returned by the three allocation functions is guaranteed to be suitably aligned so that it can be used
for any data object. Because the three alloc functions return a generic void * pointer, if we #include
<stdlib.h> (to obtain the function prototypes), we do not explicitly have to cast the pointer returned by these
functions when we assign it to a pointer of a different type.
The function free causes the space pointed to by ptr to be deallocated. This freed space is usually put into a pool of available memory and can be allocated in a later call to one of the three alloc functions.
The realloc function lets us increase or decrease the size of a previously allocated area. For example, if we allocate room for 512 elements in an array that we fill in at runtime but find that we need room for more than 512 elements, we can call realloc. If there is room beyond the end of the existing region for the requested space, then realloc doesn't have to move anything; it simply allocates the additional area at the end and returns the same pointer that we passed it. But if there isn't room at the end of the existing region, realloc allocates another area that is large enough, copies the existing 512-element array to the new area, frees the old area, and returns the pointer to the new area.
The allocation routines are usually implemented with the sbrk(2) system call. Although sbrk can expand or contract the memory of a process, most versions of malloc and free never decrease their memory size. The space that we free is available for a later allocation, but the freed space is not usually returned to the kernel; that space is kept in the malloc pool.
It is important to realize that most implementations allocate a little more space than is requested and use the additional space for record keeping: the size of the allocated block, a pointer to the next allocated block, and the like. This means that writing past the end of an allocated area could overwrite this record-keeping information in a later block. These types of errors are often catastrophic, but difficult to find, because the error may not show up until much later. Also, it is possible to overwrite this record keeping by writing before the start of the allocated area.
Because memory allocation errors are difficult to track down, some systems provide versions of these functions that do additional error checking every time one of the three alloc functions or free is called. These versions of the functions are often specified by including a special library for the link editor. There are also publicly available sources that you can compile with special flags to enable additional runtime checking.
libmalloc
SVR4-based systems, such as Solaris, include the libmalloc library, which provides a set of interfaces matching the ISO C memory allocation functions. The libmalloc library includes mallopt, a function that allows a process to set certain variables that control the operation of the storage allocator. A function called mallinfo is also available to provide statistics on the memory allocator.
quick-fit
Historically, the standard malloc algorithm used either a best-fit or a first-fit memory allocation strategy. Quick-fit is faster than either, but tends to use more memory. Free implementations of malloc and free based on quick-fit are readily available from several FTP sites.
alloca Function
The function alloca has the same calling sequence as malloc; however, instead of allocating memory from the
heap, the memory is allocated from the stack frame of the current function. The advantage is that we don't
have to free the space; it goes away automatically when the function returns. The alloca function increases the
size of the stack frame. The disadvantage is that some systems can't support alloca, if it's impossible to increase
the size of the stack frame after the function has been called.
ENVIRONMENT VARIABLES
The environment strings are usually of the form: name=value. The UNIX kernel never looks at these strings;
their interpretation is up to the various applications. The shells, for example, use numerous environment
variables. Some, such as HOME and USER, are set automatically at login, and others are for us to set. We
normally set environment variables in a shell start-up file. The functions that we can use to set and fetch values from the variables are setenv, putenv, and getenv. The prototype of getenv is
#include <stdlib.h>
char *getenv(const char *name);
Returns: pointer to value associated with name, NULL if not found.
Note that this function returns a pointer to the value of a name=value string. We should always use getenv to
fetch a specific value from the environment, instead of accessing environ directly. In addition to fetching the
value of an environment variable, sometimes we may want to set an environment variable. We may want to
change the value of an existing variable or add a new variable to the environment. The prototypes of these
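functions are
#include <stdlib.h>
int putenv(char *str);
int setenv(const char *name, const char *value, int rewrite);
int unsetenv(const char *name);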
All return: 0 if OK, nonzero on error.
The putenv function takes a string of the form name=value and places it in the environment list. If
name already exists, its old definition is first removed.
The setenv function sets name to value. If name already exists in the environment, then
if rewrite is nonzero, the existing definition for name is first removed;
if rewrite is 0, an existing definition for name is not removed, name is not set to the new value,
and no error occurs.
The unsetenv function removes any definition of name. It is not an error if such a definition does
not exist. Note the difference between putenv and setenv. Whereas setenv must allocate memory to
create the name=value string from its arguments, putenv is free to place the string passed to it directly into the
environment.
NOTE:
If we're modifying an existing name:
If the size of the new value is less than or equal to the size of the existing value, we can just
copy the new string over the old string.
If the size of the new value is larger than the old one, however, we must malloc to obtain
room for the new string, copy the new string to this area, and then replace the old pointer in
the environment list for name with the pointer to this allocated area.
If we're adding a new name, it's more complicated. First, we have to call malloc to allocate room
for the name=value string and copy the string to this area.
Then, if it's the first time we've added a new name, we have to call malloc to obtain room for
a new list of pointers. We copy the old environment list to this new area and store a pointer
to the name=value string at the end of this list of pointers. We also store a null pointer at the
end of this list, of course. Finally, we set environ to point to this new list of pointers.
If this isn't the first time we've added new strings to the environment list, then we know that
we've already allocated room for the list on the heap, so we just call realloc to allocate room
for one more pointer. The pointer to the new name=value string is stored at the end of the
list (on top of the previous null pointer), followed by a null pointer.
setjmp AND longjmp FUNCTIONS
In C, we can't goto a label that's in another function. Instead, we must use the setjmp and longjmp functions to perform this type of branching. As we'll see, these two functions are useful for handling error conditions that occur in a deeply nested function call.
#include <setjmp.h>
int setjmp(jmp_buf env);
Returns: 0 if called directly, nonzero if returning from a call to longjmp
void longjmp(jmp_buf env, int val);
The setjmp function records or marks a location in the program code so that later, when the longjmp function is called from some other function, execution continues from that location onwards. The env variable (the first argument) records the information needed to continue execution. env is of type jmp_buf, defined in the <setjmp.h> file; it holds the saved execution context.
Example of setjmp and longjmp
#include "apue.h"
#include <setjmp.h>
#define TOK_ADD 5
jmp_buf jmpbuffer;
int main(void)
{
    char line[MAXLINE];
    if (setjmp(jmpbuffer) != 0)
        printf("error");
    while (fgets(line, MAXLINE, stdin) != NULL)
        do_line(line);
    exit(0);
}
...
void cmd_add(void)
{
    int token;
    token = get_token();
    if (token < 0)      /* an error has occurred */
        longjmp(jmpbuffer, 1);
    /* rest of processing for this command */
}
The setjmp function always returns 0 on success when it is called directly in a process (for the first time).
The longjmp function is called to transfer program flow to a location that was stored in the env argument.
The program code marked by env must be in a function that is among the callers of the current function.
When the process jumps to the target function, all the stack space used by the current function and its callers, up to the target function, is discarded by the longjmp function.
The process resumes execution by re-executing the setjmp statement in the target function that is marked by env. The return value of the setjmp function is the value (val) specified in the longjmp function call.
getrlimit AND setrlimit FUNCTIONS
Every process has a set of resource limits, some of which can be queried and changed by the getrlimit and setrlimit functions.
RLIMIT_CPU      The maximum amount of CPU time in seconds. When the soft limit is exceeded, the SIGXCPU signal is sent to the process.
RLIMIT_DATA     The maximum size in bytes of the data segment: the sum of the initialized data, uninitialized data, and heap.
RLIMIT_FSIZE    The maximum size in bytes of a file that may be created. When the soft limit is exceeded, the process is sent the SIGXFSZ signal.
RLIMIT_LOCKS    The maximum number of file locks a process can hold.
RLIMIT_MEMLOCK  The maximum amount of memory in bytes that a process can lock into memory using mlock(2).
RLIMIT_NOFILE   The maximum number of open files per process. Changing this limit affects the value returned by the sysconf function for its _SC_OPEN_MAX argument.
RLIMIT_NPROC    The maximum number of child processes per real user ID. Changing this limit affects the value returned for _SC_CHILD_MAX by the sysconf function.
RLIMIT_RSS      Maximum resident set size (RSS) in bytes. If available physical memory is low, the kernel takes memory from processes that exceed their RSS.
RLIMIT_SBSIZE   The maximum size in bytes of socket buffers that a user can consume at any given time.
RLIMIT_STACK    The maximum size in bytes of the stack.
RLIMIT_VMEM     This is a synonym for RLIMIT_AS.
The resource limits affect the calling process and are inherited by any of its children. This means that the
setting of resource limits needs to be built into the shells to affect all our future processes.
Example: Print the current resource limits
#include "apue.h"
#if defined(BSD) || defined(MACOS)
#include <sys/time.h>
#define FMT "%10lld "
#else
#define FMT "%10ld "
#endif
#include <sys/resource.h>
#define doit(name) pr_limits(#name, name)
static void pr_limits(char *, int);
int main(void)
{
#ifdef RLIMIT_AS
    doit(RLIMIT_AS);
#endif
    doit(RLIMIT_CORE);
    doit(RLIMIT_CPU);
    doit(RLIMIT_DATA);
    doit(RLIMIT_FSIZE);
#ifdef RLIMIT_LOCKS
    doit(RLIMIT_LOCKS);
#endif
#ifdef RLIMIT_MEMLOCK
    doit(RLIMIT_MEMLOCK);
#endif
    doit(RLIMIT_NOFILE);
#ifdef RLIMIT_NPROC
    doit(RLIMIT_NPROC);
#endif
#ifdef RLIMIT_RSS
    doit(RLIMIT_RSS);
#endif
#ifdef RLIMIT_SBSIZE
    doit(RLIMIT_SBSIZE);
#endif
    doit(RLIMIT_STACK);
#ifdef RLIMIT_VMEM
    doit(RLIMIT_VMEM);
#endif
    exit(0);
}
static void
pr_limits(char *name, int resource)
{
    struct rlimit limit;
    if (getrlimit(resource, &limit) < 0)
        err_sys("getrlimit error for %s", name);
    printf("%-14s ", name);     /* resource name, then soft and hard limits */
    if (limit.rlim_cur == RLIM_INFINITY)
        printf("(infinite) ");
    else
        printf(FMT, limit.rlim_cur);
    if (limit.rlim_max == RLIM_INFINITY)
        printf("(infinite)");
    else
        printf(FMT, limit.rlim_max);
    putchar((int)'\n');
}
All processes in a UNIX system, except the process that is created by the system boot code, are created by the fork system call. After the fork system call, once the child process is created, both the parent and child processes resume execution. When a process is created by fork, it contains duplicated copies of the text, data and stack segments of its parent as shown in the Figure below. It also has a file descriptor table, which contains references to the same opened files as the parent, such that they both share the same file pointer to each opened file.
Current directory: this is the reference (inode number) to a working directory file.
Root directory: this is the reference to a root directory.
Signal handling: the signal handling settings.
Signal mask: a signal mask that specifies which signals are to be blocked.
Umask: a file mode mask that is used in creation of files to specify which access rights should be taken out.
Nice value: the process scheduling priority value.
Controlling terminal: the controlling terminal of the process.
In addition to the above attributes, the following attributes are different between the parent and child processes:
Process identification number (PID): an integer identification number that is unique per process in
an entire operating system.
Parent process identification number (PPID): the parent process PID.
Pending signals: the set of signals that are pending delivery to the parent process.
Alarm clock time: the process alarm clock time is reset to zero in the child process.
File locks: the set of file locks owned by the parent process is not inherited by the child process.
fork and exec are commonly used together to spawn a sub-process to execute a different program. The
advantages of this method are:
A process can create multiple processes to execute multiple programs concurrently.
Because each child process executes in its own virtual address space, the parent process is not
affected by the execution status of its child process.
PROCESS CONTROL
INTRODUCTION
Process control is concerned with the creation of new processes, program execution, and process termination.
PROCESS IDENTIFIERS
#include <unistd.h>
pid_t getpid(void);
Returns: process ID of calling process
pid_t getppid(void);
Returns: parent process ID of calling process
uid_t getuid(void);
Returns: real user ID of calling process
uid_t geteuid(void);
Returns: effective user ID of calling process
gid_t getgid(void);
Returns: real group ID of calling process
gid_t getegid(void);
Returns: effective group ID of calling process
fork FUNCTION
An existing process can create a new one by calling the fork function.
Example programs:
Program 1
/* Program to demonstrate fork function. Program name: fork1.c */
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>
int main(void)
{
    fork();
    printf("hello USP\n");   /* runs in both parent and child */
    return 0;
}
Output :
$ cc fork1.c
$ ./a.out
hello USP
hello USP
Note : The statement hello USP is executed twice as both the child and parent have executed that instruction.
Program 2
/* Program name: fork2.c */
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>
int main(void)
{
    printf("6 sem\n");
    fflush(stdout);          /* empty the stdio buffer so the child does not inherit a copy of this line */
    fork();
    printf("hello USP\n");
    return 0;
}
Output :
$ cc fork2.c
$ ./a.out
6 sem
hello USP
hello USP
Note: The statement 6 sem is executed only once by the parent because it is called before fork and
statement hello USP is executed twice by child and parent. [Also refer lab program 3.sh]
File Sharing
Consider a process that has three different files opened for standard input, standard output, and standard error.
On return from fork, we have the arrangement shown in Figure 8.2.
Figure 8.2 Sharing of open files between parent and child after fork
It is important that the parent and the child share the same file offset.
Consider a process that forks a child, then waits for the child to complete.
Assume that both processes write to standard output as part of their normal processing.
If the parent has its standard output redirected (by a shell, perhaps) it is essential that the parent's file
offset be updated by the child when the child writes to standard output.
In this case, the child can write to standard output while the parent is waiting for it; on completion of
the child, the parent can continue writing to standard output, knowing that its output will be appended
to whatever the child wrote.
If the parent and the child did not share the same file offset, this type of interaction would be more
difficult to accomplish and would require explicit actions by the parent.
There are two normal cases for handling the descriptors after a fork.
The parent waits for the child to complete. In this case, the parent does not need to do anything with
its descriptors. When the child terminates, any of the shared descriptors that the child read from or
wrote to will have their file offsets updated accordingly.
Both the parent and the child go their own ways. Here, after the fork, the parent closes the descriptors
that it doesn't need, and the child does the same thing. This way, neither interferes with the other's open
descriptors. This scenario is often the case with network servers.
There are numerous other properties of the parent that are inherited by the child:
Real user ID, real group ID, effective user ID, effective group ID
Supplementary group IDs
Process group ID
Session ID
Controlling terminal
The set-user-ID and set-group-ID flags
Current working directory
Root directory
File mode creation mask
Signal mask and dispositions
The close-on-exec flag for any open file descriptors
Environment
Attached shared memory segments
Memory mappings
Resource limits
parent goes back to waiting for the next service request to arrive.
When a process wants to execute a different program. This is common for shells. In this case, the child
does an exec right after it returns from the fork.
vfork FUNCTION
The function vfork has the same calling sequence and same return values as fork.
The vfork function is intended to create a new process when the purpose of the new process is to exec
a new program.
The vfork function creates the new process, just like fork, without copying the address space of the
parent into the child, as the child won't reference that address space; the child simply calls exec (or
exit) right after the vfork.
Instead, while the child is running and until it calls either exec or exit, the child runs in the address
space of the parent. This optimization provides an efficiency gain on some paged virtual-memory
implementations of the UNIX System.
Another difference between the two functions is that vfork guarantees that the child runs first, until
the child calls exec or exit. When the child calls either of these functions, the parent resumes.
Example of vfork function
#include "apue.h"
int glob = 6;       /* external variable in initialized data */
int main(void)
{
    int var;        /* automatic variable on the stack */
    pid_t pid;
    var = 88;
    printf("before vfork\n");   /* we don't flush stdio */
    if ((pid = vfork()) < 0) {
        err_sys("vfork error");
    } else if (pid == 0) {      /* child */
        glob++;                 /* modify parent's variables */
        var++;
        _exit(0);               /* child terminates */
    }
    /*
     * Parent continues here.
     */
    printf("pid = %d, glob = %d, var = %d\n", getpid(), glob, var);
    exit(0);
}
Output:
$ ./a.out
before vfork
pid = 29039, glob = 7, var = 89
exit FUNCTIONS
A process can terminate normally in five ways:
Executing a return from the main function.
Calling the exit function.
Calling the _exit or _Exit function.
In most UNIX system implementations, exit(3) is a function in the standard C library, whereas _exit(2)
is a system call.
Executing a return from the start routine of the last thread in the process. When the last thread returns
from its start routine, the process exits with a termination status of 0.
Calling the pthread_exit function from the last thread in the process.
The three forms of abnormal termination are as follows:
Calling abort. This is a special case of the next item, as it generates the SIGABRT signal.
When the process receives certain signals. Examples of signals generated by the kernel include the
process referencing a memory location not within its address space or trying to divide by 0.
The last thread responds to a cancellation request. By default, cancellation occurs in a deferred
manner: one thread requests that another be canceled, and sometime later, the target thread
terminates.
it blocks the caller until a child terminates. If the caller blocks and has multiple children, wait returns when
one terminates.
For both functions, the argument statloc is a pointer to an integer. If this argument is not a null pointer, the
termination status of the terminated process is stored in the location pointed to by the argument.
Print a description of the exit status
#include "apue.h"
#include <sys/wait.h>
int main(void)
{
pid_t pid;
int status;
pid == -1   Waits for any child process. In this respect, waitpid is equivalent to wait.
pid > 0     Waits for the child whose process ID equals pid.
pid == 0    Waits for any child whose process group ID equals that of the calling process.
pid < -1    Waits for any child whose process group ID equals the absolute value of pid.
Macro                   Description
WIFEXITED(status)       True if status was returned for a child that terminated normally. In this case, we can execute WEXITSTATUS(status) to fetch the low-order 8 bits of the argument that the child passed to exit, _exit, or _Exit.
WIFSIGNALED(status)     True if status was returned for a child that terminated abnormally, by receipt of a signal that it didn't catch. In this case, we can execute WTERMSIG(status) to fetch the signal number that caused the termination. Additionally, some implementations (but not the Single UNIX Specification) define the macro WCOREDUMP(status) that returns true if a core file of the terminated process was generated.
WIFSTOPPED(status)      True if status was returned for a child that is currently stopped. In this case, we can execute WSTOPSIG(status) to fetch the signal number that caused the child to stop.
WIFCONTINUED(status)    True if status was returned for a child that has been continued after a job control stop.
The waitpid function provides three features that aren't provided by the wait function.
The waitpid function lets us wait for one particular process, whereas the wait function returns the status of any terminated child. We'll return to this feature when we discuss the popen function.
The waitpid function provides a nonblocking version of wait. There are times when we want to fetch a child's status, but we don't want to block.
The waitpid function provides support for job control with the WUNTRACED and WCONTINUED options.
Program to avoid zombie processes by calling fork twice
#include "apue.h"
#include <sys/wait.h>
int main(void)
{
    pid_t pid;
    if ((pid = fork()) < 0) {
        err_sys("fork error");
    } else if (pid == 0) {      /* first child */
        if ((pid = fork()) < 0)
            err_sys("fork error");
        else if (pid > 0)
            exit(0);    /* parent from second fork == first child */
        /*
         * We're the second child; our parent becomes init as soon
         * as our real parent calls exit() in the statement above.
         * Here's where we'd continue executing, knowing that when
         * we're done, init will reap our status.
         */
        sleep(2);
        printf("second child, parent pid = %d\n", getppid());
        exit(0);
    }
    if (waitpid(pid, NULL, 0) != pid)   /* wait for first child */
        err_sys("waitpid error");
    /*
     * We're the parent (the original process); we continue executing,
     * knowing that we're not the parent of the second child.
     */
    exit(0);
}
Output:
$ ./a.out
$ second child, parent pid = 1
waitid FUNCTION
The waitid function is similar to waitpid, but provides extra flexibility.
Constant    Description
P_PID       Wait for a particular process: id contains the process ID of the child to wait for.
P_PGID      Wait for any child process in a particular process group: id contains the process group ID of the children to wait for.
P_ALL       Wait for any child process: id is ignored.
The options argument is a bitwise OR of the flags as shown below: these flags indicate which state changes the caller is interested in.
Constant    Description
WCONTINUED  Wait for a process that has previously stopped and has been continued, and whose status has not yet been reported.
WEXITED     Wait for processes that have exited.
WNOHANG     Return immediately instead of blocking if there is no child exit status available.
WNOWAIT     Don't destroy the child exit status. The child's exit status can be retrieved by a subsequent call to wait, waitid, or waitpid.
WSTOPPED    Wait for a process that has stopped and whose status has not yet been reported.
wait3 AND wait4 FUNCTIONS
is an additional argument that allows the kernel to return a summary of the resources used by the terminated
process and all its child processes.
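The prototypes of these functions are:
#include <sys/types.h>
#include <sys/wait.h>
#include <sys/time.h>
#include <sys/resource.h>
pid_t wait3(int *statloc, int options, struct rusage *rusage);
pid_t wait4(pid_t pid, int *statloc, int options, struct rusage *rusage);
Both return: process ID if OK, 0, or -1 on error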
RACE CONDITIONS
A race condition occurs when multiple processes are trying to do something with shared data and the final
outcome depends on the order in which the processes run.
Example: The program below outputs two strings: one from the child and one from the parent. The program
contains a race condition because the output depends on the order in which the processes are run by the kernel
and for how long each process runs.
#include "apue.h"
static void charatatime(char *);
int main(void)
{
    pid_t pid;
    if ((pid = fork()) < 0) {
        err_sys("fork error");
    } else if (pid == 0) {
        charatatime("output from child\n");
    } else {
        charatatime("output from parent\n");
    }
    exit(0);
}
static void
charatatime(char *str)
{
    char *ptr;
    int c;
    setbuf(stdout, NULL);   /* set unbuffered */
    for (ptr = str; (c = *ptr++) != 0; )
        putc(c, stdout);    /* one write per character invites interleaving */
}
Output:
$ ./a.out
ooutput from child
utput from parent
$ ./a.out
ooutput from child
utput from parent
$ ./a.out
output from child
output from parent
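To avoid the race condition, the program is modified to use the TELL and WAIT functions: the parent calls TELL_WAIT before the fork, the child calls WAIT_PARENT before producing its output, and the parent calls TELL_CHILD(pid) after producing its own output, so the parent goes first. Lines preceded by a plus sign are new; the listing begins:
#include "apue.h"
static void charatatime(char *);
int main(void)
{
    pid_t pid;
+   TELL_WAIT();
When we run this program, the output is as we expect; there is no intermixing of output from the two processes.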
exec FUNCTIONS
When a process calls one of the exec functions, that process is completely replaced by the new program, and
the new program starts executing at its main function. The process ID does not change across an exec, because
a new process is not created; exec merely replaces the current process - its text, data, heap, and stack segments
- with a brand new program from disk.
There are six exec functions: execl, execv, execle, execve, execlp, and execvp.
We've mentioned that the process ID does not change after an exec, but the new program inherits additional
properties from the calling process:
Process ID and parent process ID
Real user ID and real group ID
Supplementary group IDs
Process group ID
Session ID
Controlling terminal
Time left until alarm clock
Current working directory
Root directory
File mode creation mask
File locks
Process signal mask
Pending signals
Resource limits
Values for tms_utime, tms_stime, tms_cutime, and tms_cstime.
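Example: the program below forks twice. The first child calls execle, passing a pathname and an explicit environment; the second child calls execlp, passing a filename and inheriting the caller's environment.
int main(void)
{
    pid_t pid;
    /* first fork: child calls execle with a specific environment; parent waits */
    /* second fork: child calls execlp; parent does not wait */
    exit(0);
}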
Output:
$ ./a.out
argv[0]: echoall
argv[1]: myarg1
argv[2]: MY ARG2
USER=unknown
PATH=/tmp
$ argv[0]: echoall
argv[1]: only 1 arg
USER=sar
LOGNAME=sar
SHELL=/bin/bash
47 more lines that aren't shown
HOME=/home/sar
Note that the shell prompt appeared before the printing of argv[0] from the second exec. This is because the
parent did not wait for this child process to finish.
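Program: echo all command-line arguments and all environment strings
#include "apue.h"
extern char **environ;
int main(int argc, char *argv[])
{
    for (int i = 0; i < argc; i++) printf("argv[%d]: %s\n", i, argv[i]); /* command-line args */
    for (char **ptr = environ; *ptr != 0; ptr++) printf("%s\n", *ptr); /* env strings */
    exit(0);
}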
MODULE 4
CHANGING USER IDs AND GROUP IDs
When our programs need additional privileges or need to gain access to resources that they currently aren't
allowed to access, they need to change their user or group ID to an ID that has the appropriate privilege or
access. Similarly, when our programs need to lower their privileges or prevent access to certain resources,
they do so by changing either their user ID or group ID to an ID without the privilege or ability to access the
resource.
We can make a few statements about the three user IDs that the kernel maintains.
Only a superuser process can change the real user ID. Normally, the real user ID is set by the login(1)
program when we log in and never changes. Because login is a superuser process, it sets all three user
IDs when it calls setuid.
The effective user ID is set by the exec functions only if the set-user-ID bit is set for the program file.
If the set-user-ID bit is not set, the exec functions leave the effective user ID as its current value. We
can call setuid at any time to set the effective user ID to either the real user ID or the saved set-user-
ID. Naturally, we can't set the effective user ID to any random value.
The saved set-user-ID is copied from the effective user ID by exec. If the file's set-user-ID bit is set,
this copy is saved after exec stores the effective user ID from the file's user ID.
ID                  exec                                                               setuid(uid)
                    set-user-ID bit off             set-user-ID bit on                 superuser    unprivileged user
real user ID        unchanged                       unchanged                          set to uid   unchanged
effective user ID   unchanged                       set from user ID of program file   set to uid   set to uid
saved set-user ID   copied from effective user ID   copied from effective user ID      set to uid   unchanged
The above table summarises the various ways these three user IDs can be changed.
setreuid and setregid Functions
Historically, BSD supported swapping the real user ID and the effective user ID with the setreuid function; setregid does the same for the real and effective group IDs.
Figure: Summary of all the functions that set the various user IDs
INTERPRETER FILES
These files are text files that begin with a line of the form
#! pathname [ optional-argument ]
The space between the exclamation point and the pathname is optional. The most common of these interpreter
files begin with the line
#!/bin/sh
The pathname is normally an absolute pathname, since no special operations are performed on it (i.e., PATH
is not used). The recognition of these files is done within the kernel as part of processing the exec system call.
The actual file that gets executed by the kernel is not the interpreter file, but the file specified by the pathname
on the first line of the interpreter file. Be sure to differentiate between the interpreter file (a text file that begins
with #!) and the interpreter, which is specified by the pathname on the first line of the interpreter file.
Be aware that systems place a size limit on the first line of an interpreter file. This limit includes the #!, the
pathname, the optional argument, the terminating newline, and any spaces.
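#include "apue.h"
#include <sys/wait.h>
int main(void)
{
    pid_t pid;
    if ((pid = fork()) < 0) {
        err_sys("fork error");
    } else if (pid == 0) { /* child */
        if (execl("/home/sar/bin/testinterp",
                "testinterp", "myarg1", "MY ARG2", (char *)0) < 0)
            err_sys("execl error");
    }
    if (waitpid(pid, NULL, 0) < 0) /* parent */
        err_sys("waitpid error");
    exit(0);
}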
Output:
$ cat /home/sar/bin/testinterp
#!/home/sar/bin/echoarg foo
$ ./a.out
argv[0]: /home/sar/bin/echoarg
argv[1]: foo
argv[2]: /home/sar/bin/testinterp
argv[3]: myarg1
argv[4]: MY ARG2
system FUNCTION
If cmdstring is a null pointer, system returns nonzero only if a command processor is available. This feature determines whether the system function is supported on a given operating system. Under the UNIX System, system is always available.
Because system is implemented by calling fork, exec, and waitpid, there are three types of return values.
If either the fork fails or waitpid returns an error other than EINTR, system returns -1 with errno set to indicate the error.
If the exec fails, implying that the shell can't be executed, the return value is as if the shell had executed exit(127).
Otherwise, all three functions (fork, exec, and waitpid) succeed, and the return value from system is the termination status of the shell, in the format specified for waitpid.
#include <sys/wait.h>
#include <errno.h>
#include <unistd.h>
return(status);
}
int main(void)
{
int status;
exit(0);
}
#include "apue.h"
int
main(void)
{
    printf("real uid = %d, effective uid = %d\n", getuid(), geteuid());
    exit(0);
}
PROCESS ACCOUNTING
Most UNIX systems provide an option to do process accounting. When enabled, the kernel writes an
accounting record each time a process terminates.
These accounting records are typically a small amount of binary data with the name of the command,
the amount of CPU time used, the user ID and group ID, the starting time, and so on.
A superuser executes accton with a pathname argument to enable accounting.
The accounting records are written to the specified file, which is usually /var/account/acct. Accounting
is turned off by executing accton without any arguments.
The data required for the accounting record, such as CPU times and number of characters transferred,
is kept by the kernel in the process table and initialized whenever a new process is created, as in the
child after a fork.
Each accounting record is written when the process terminates.
This means that the order of the records in the accounting file corresponds to the termination order of
the processes, not the order in which they were started.
The accounting records correspond to processes, not programs.
A new record is initialized by the kernel for the child after a fork, not when a new program is
executed. The structure of the accounting records is defined in the header <sys/acct.h> and looks
something like
typedef u_short comp_t; /* 3-bit base 8 exponent; 13-bit fraction */
struct acct
{
char ac_flag; /* flag */
char ac_stat; /* termination status (signal & core flag only) */
/* (Solaris only) */
uid_t ac_uid; /* real user ID */
gid_t ac_gid; /* real group ID */
dev_t ac_tty; /* controlling terminal */
time_t ac_btime; /* starting calendar time */
comp_t ac_utime; /* user CPU time (clock ticks) */
comp_t ac_stime; /* system CPU time (clock ticks) */
comp_t ac_etime; /* elapsed time (clock ticks) */
comp_t ac_mem; /* average memory usage */
comp_t ac_io; /* bytes transferred (by read and write) */
/* "blocks" on BSD systems */
comp_t ac_rw; /* blocks read or written */
/* (not present on BSD systems) */
char ac_comm[8]; /* command name: [8] for Solaris, */
/* [10] for Mac OS X, [16] for FreeBSD, and */
/* [17] for Linux */
};
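#include "apue.h"
int main(void)
{
    pid_t pid;
    if ((pid = fork()) < 0)
        err_sys("fork error");
    else if (pid != 0) { /* parent */
        sleep(2);
        exit(2); /* terminate with exit status 2 */
    }
    /* first child */
    if ((pid = fork()) < 0)
        err_sys("fork error");
    else if (pid != 0) {
        sleep(4);
        abort(); /* terminate with core dump */
    }
    /* second child */
    if ((pid = fork()) < 0)
        err_sys("fork error");
    else if (pid != 0) {
        execl("/bin/dd", "dd", "if=/etc/termcap", "of=/dev/null", (char *)0);
        exit(7); /* shouldn't get here */
    }
    /* third child */
    if ((pid = fork()) < 0)
        err_sys("fork error");
    else if (pid != 0) {
        sleep(8);
        exit(0); /* normal exit */
    }
    /* fourth child */
    sleep(6);
    kill(getpid(), SIGKILL); /* terminate w/signal, no core dump */
    exit(6); /* shouldn't get here */
}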
USER IDENTIFICATION
Any process can find out its real and effective user ID and group ID. Sometimes, however, we want to find
out the login name of the user who's running the program. We could call getpwuid(getuid()), but what if a
single user has multiple login names, each with the same user ID? (A person might have multiple entries in
the password file with the same user ID to have a different login shell for each entry.) The system normally
keeps track of the name we log in under, and the getlogin function provides a way to fetch that login name.
PROCESS TIMES
We describe three times that we can measure: wall clock time, user CPU time, and system CPU time. Any
process can call the times function to obtain these values for itself and any terminated children.
Note that the tms structure filled in by times does not contain any measurement for the wall clock time. Instead, the function returns
the wall clock time as the value of the function, each time it's called. This value is measured from some
arbitrary point in the past, so we can't use its absolute value; instead, we use its relative value.
PROCESS RELATIONSHIPS
INTRODUCTION
In this chapter, we'll look at process groups in more detail and the concept of sessions that was introduced by
POSIX.1. We'll also look at the relationship between the login shell that is invoked for us when we log in and
all the processes that we start from our login shell.
TERMINAL LOGINS
The terminals were either local (directly connected) or remote (connected through a modem). In either case,
these logins came through a terminal device driver in the kernel.
The system administrator creates a file, usually /etc/ttys, that has one line per terminal device. Each line
specifies the name of the device and other parameters that are passed to the getty program. One parameter is
the baud rate of the terminal, for example. When the system is bootstrapped, the kernel creates process ID 1,
the init process, and it is init that brings the system up multiuser. The init process reads the file /etc/ttys and,
for every terminal device that allows a login, does a fork followed by an exec of the program getty. This gives
us the processes shown in Figure 9.1.
All the processes shown in Figure 9.1 have a real user ID of 0 and an effective user ID of 0 (i.e., they all have
superuser privileges). The init process also execs the getty program with an empty environment.
It is getty that calls open for the terminal device. The terminal is opened for reading and writing. If the device
is a modem, the open may delay inside the device driver until the modem is dialed and the call is answered.
Once the device is open, file descriptors 0, 1, and 2 are set to the device. Then getty outputs something like
login: and waits for us to enter our user name.
When we enter our user name, getty's job is complete, and it then invokes the login program, similar to
execle("/bin/login", "login", "-p", username, (char *)0, envp);
If we log in correctly, login finishes by changing to our user ID (setuid) and invoking our login shell, as in
execl("/bin/sh", "-sh", (char *)0);
The minus sign as the first character of argv[0] is a flag to all the shells that they are being invoked as a
login shell. The shells can look at this character and modify their start-up accordingly.
Figure 9.3. Arrangement of processes after everything is set for a terminal login
NETWORK LOGINS
The main (physical) difference between logging in to a system through a serial terminal and logging in to a
system through a network is that the connection between the terminal and the computer isn't point-to-point.
With the terminal logins that we described in the previous section, init knows which terminal devices are
enabled for logins and spawns a getty process for each device. In the case of network logins, however, all the
logins come through the kernel's network interface drivers (e.g., the Ethernet driver).
Let's assume that a TCP connection request arrives for the TELNET server. TELNET is a remote login
application that uses the TCP protocol. A user on another host (that is connected to the server's host through
a network of some form) or on the same host initiates the login by starting the TELNET client:
telnet hostname
The client opens a TCP connection to hostname, and the program that's started on hostname is called the
TELNET server. The client and the server then exchange data across the TCP connection using the TELNET
application protocol. What has happened is that the user who started the client program is now logged in to
the server's host. (This assumes, of course, that the user has a valid account on the server's host.) Figure 9.4
shows the sequence of processes involved in executing the TELNET server, called telnetd.
The telnetd process then opens a pseudo-terminal device and splits into two processes using fork. The parent
handles the communication across the network connection, and the child does an exec of the login program.
The parent and the child are connected through the pseudo terminal. Before doing the exec, the child sets up
file descriptors 0, 1, and 2 to the pseudo terminal. If we log in correctly, login performs the same steps we
described in Section 9.2: it changes to our home directory and sets our group IDs, user ID, and our initial
environment. Then login replaces itself with our login shell by calling exec. Figure 9.5 shows the arrangement
of the processes at this point.
Figure 9.5. Arrangement of processes after everything is set for a network login
PROCESS GROUPS
A process group is a collection of one or more processes, usually associated with the same job, that can receive
signals from the same terminal. Each process group has a unique process group ID. Process group IDs are
similar to process IDs: they are positive integers and can be stored in a pid_t data type. The function getpgrp
returns the process group ID of the calling process.
SESSIONS
A session is a collection of one or more process groups. For example, we could have the arrangement
shown in Figure 9.6. Here we have three process groups in a single session.
Figure 9.6. Arrangement of processes into process groups and sessions
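A process establishes a new session by calling the setsid function. If the calling process is not a process group leader, the following happens: the process becomes the session leader of the new session, it becomes the process group leader of a new process group, and it has no controlling terminal. If the process had a controlling terminal before calling
setsid, that association is broken.
This function returns an error if the caller is already a process group leader. The getsid function returns the
process group ID of a process's session leader. The getsid function is included as an XSI extension in the
Single UNIX Specification.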
CONTROLLING TERMINAL
Sessions and process groups have a few other characteristics.
A session can have a single controlling terminal. This is usually the terminal device (in the case of a
terminal login) or pseudo-terminal device (in the case of a network login) on which we log in.
The session leader that establishes the connection to the controlling terminal is called the controlling
process.
The process groups within a session can be divided into a single foreground process group and one or
more background process groups.
If a session has a controlling terminal, it has a single foreground process group, and all other process
groups in the session are background process groups.
Whenever we type the terminal's interrupt key (often DELETE or Control-C), this causes the
interrupt signal to be sent to all processes in the foreground process group.
Whenever we type the terminal's quit key (often Control-backslash), this causes the quit signal to be
sent to all processes in the foreground process group.
If a modem (or network) disconnect is detected by the terminal interface, the hang-up signal is sent
to the controlling process (the session leader).
These characteristics are shown in Figure 9.7.
Figure 9.7. Process groups and sessions showing controlling terminal
The function tcgetpgrp returns the process group ID of the foreground process group associated with the
terminal open on filedes. If the process has a controlling terminal, the process can call tcsetpgrp to set the
foreground process group ID to pgrpid. The value of pgrpid must be the process group ID of a process group
in the same session, and filedes must refer to the controlling terminal of the session.
The Single UNIX Specification defines an XSI extension called tcgetsid to allow an application to obtain the
process group ID for the session leader, given a file descriptor for the controlling terminal. It returns the
session leader's process group ID if OK, -1 on error.
JOB CONTROL
This feature allows us to start multiple jobs (groups of processes) from a single terminal and to control which
jobs can access the terminal and which jobs are to run in the background. Job control requires three forms of
support:
A shell that supports job control
The terminal driver in the kernel must support job control
The kernel must support certain job-control signals
The interaction with the terminal driver arises because a special terminal character affects the foreground job:
the suspend key (typically Control-Z). Entering this character causes the terminal driver to send the SIGTSTP
signal to all processes in the foreground process group. The jobs in any background process groups aren't
affected. The terminal driver looks for three special characters, which generate signals to the foreground
process group.
The interrupt character (typically DELETE or Control-C) generates SIGINT.
The quit character (typically Control-backslash) generates SIGQUIT.
The suspend character (typically Control-Z) generates SIGTSTP.
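Only the foreground job receives terminal input. If a background job tries to read from the terminal, the terminal driver detects this and sends the SIGTTIN signal to the background job. This signal normally stops the background job; by using the shell, we are notified of this and can bring the
job into the foreground so that it can read from the terminal. The following demonstrates this: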
$ cat > temp.foo & start in background, but it'll read from standard input
[1] 1681
$ we press RETURN
[1] + Stopped (SIGTTIN) cat > temp.foo &
$ fg %1 bring job number 1 into the foreground
cat > temp.foo the shell tells us which job is now in the foreground
Figure 9.8. Summary of job control features with foreground & background jobs & terminal driver
What happens if a background job outputs to the controlling terminal? This is an option that we can allow or
disallow. Normally, we use the stty(1) command to change this option. The following shows how this works:
$ cat temp.foo & execute in background
[1] 1719
$ hello, world the output from the background job appears after the prompt
we press RETURN
When we disallow background jobs from writing to the controlling terminal, cat will block when it tries to
write to its standard output, because the terminal driver identifies the write as coming from a background
process and sends the job the SIGTTOU signal.
Figure 9.8 summarizes some of the features of job control that we've been describing. The solid lines through the
terminal driver box mean that the terminal I/O and the terminal-generated signals are always connected from
the foreground process group to the actual terminal. The dashed line corresponding to the SIGTTOU signal
means that whether the output from a process in the background process group appears on the terminal is
an option.
If we execute the command in the background, the only value that changes is the process ID of the command.
Output 3:
The program cat1 is just a copy of the standard cat program, with a different name. Note that the last process
in the pipeline is the child of the shell and that the first process in the pipeline is a child of the last process.
Figure 9.9. Processes in the pipeline ps | cat1 | cat2 when invoked by Bourne shell
A process whose parent terminates is called an orphan and is inherited by the init process.
#include "apue.h"
#include <errno.h>
static void
sig_hup(int signo)
{
    printf("SIGHUP received, pid = %d\n", getpid());
}
static void
pr_ids(char *name)
{
    printf("%s: pid = %d, ppid = %d, pgrp = %d, tpgrp = %d\n",
        name, getpid(), getppid(), getpgrp(), tcgetpgrp(STDIN_FILENO));
    fflush(stdout);
}
int
main(void)
{
    char c;
    pid_t pid;
    pr_ids("parent");
    if ((pid = fork()) < 0) {
        err_sys("fork error");
    } else if (pid > 0) { /* parent */
        sleep(5); /* sleep to let child stop itself */
        exit(0); /* then parent exits */
    } else { /* child */
        pr_ids("child");
        signal(SIGHUP, sig_hup); /* establish signal handler */
        kill(getpid(), SIGTSTP); /* stop ourself */
        pr_ids("child"); /* prints only if we're continued */
        if (read(STDIN_FILENO, &c, 1) != 1)
            printf("read error from controlling TTY, errno = %d\n",
                errno);
        exit(0);
    }
}
Output:
$ ./a.out
The child inherits the process group of its parent (6099). After the fork,
The parent sleeps for 5 seconds. This is our (imperfect) way of letting the child execute before the
parent terminates.
The child establishes a signal handler for the hang-up signal (SIGHUP). This is so we can see whether
SIGHUP is sent to the child. The child sends itself the stop signal (SIGTSTP) with the kill function.
This stops the child, similar to our stopping a foreground job with our terminal's suspend character
(Control-Z).
When the parent terminates, the child is orphaned, so the child's parent process ID becomes 1, the init
process ID.
At this point, the child is now a member of an orphaned process group. POSIX.1 defines an orphaned
process group as one in which the parent of every member is either itself a member of the group or is
not a member of the group's session. If the process group is not orphaned, there is a chance that one
of those parents in a different process group but in the same session will restart a stopped process in
the process group that is not orphaned. Here, the parent of every process in the group belongs to
another session.
Since the process group is orphaned when the parent terminates, POSIX.1 requires that every process
in the newly orphaned process group that is stopped (as our child is) be sent the hang-up signal
(SIGHUP) followed by the continue signal (SIGCONT).
This causes the child to be continued, after processing the hang-up signal. The default action for the
hang-up signal is to terminate the process, so we have to provide a signal handler to catch the signal.
We therefore expect the printf in the sig_hup function to appear before the printf in the pr_ids function.
INTRODUCTION
IPC enables one application to control another application, and enables several applications to share the same data
without interfering with one another. IPC is required in all multiprocessing systems, but it is not generally
supported by single-process operating systems.
The various forms of IPC that are supported on a UNIX system are as follows :
Half duplex Pipes.
PIPES
Pipes are the oldest form of UNIX System IPC. Pipes have two limitations.
Historically, they have been half duplex (i.e., data flows in only one direction).
Pipes can be used only between processes that have a common ancestor. Normally, a pipe is created by
a process, that process calls fork, and the pipe is used between the parent and the child.
A pipe is created by calling the pipe function.
Two ways to picture a half-duplex pipe are shown in Figure 15.2. The left half of the figure shows the two
ends of the pipe connected in a single process. The right half of the figure emphasizes that the data in the pipe
flows through the kernel.
Figure 15.2. Two ways to view a half-duplex pipe
A pipe in a single process is next to useless. Normally, the process that calls pipe then calls fork, creating an
IPC channel from the parent to the child or vice versa. Figure 15.3 shows this scenario.
Figure 15.3 Half-duplex pipe after a fork
What happens after the fork depends on which direction of data flow we want. For a pipe from the parent to
the child, the parent closes the read end of the pipe (fd[0]), and the child closes the write end (fd[1]). Figure
15.4 shows the resulting arrangement of descriptors.
Figure 15.4. Pipe from parent to child
For a pipe from the child to the parent, the parent closes fd[1], and the child closes
fd[0]. When one end of a pipe is closed, the following two rules apply.
If we read from a pipe whose write end has been closed, read returns 0 to indicate an end of file after
all the data has been read.
If we write to a pipe whose read end has been closed, the signal SIGPIPE is generated. If we either
ignore the signal or catch it and return from the signal handler, write returns -1 with errno set to EPIPE.
PROGRAM: The code below creates a pipe between a parent and its child and sends data down the pipe.
#include "apue.h"
int
main(void)
{
    int n;
    int fd[2];
    pid_t pid;
    char line[MAXLINE];
    if (pipe(fd) < 0)
        err_sys("pipe error");
    if ((pid = fork()) < 0) {
        err_sys("fork error");
    } else if (pid > 0) { /* parent */
        close(fd[0]);
        write(fd[1], "hello world\n", 12);
    } else { /* child */
        close(fd[1]);
        n = read(fd[0], line, MAXLINE);
        write(STDOUT_FILENO, line, n);
    }
    exit(0);
}
popen AND pclose FUNCTIONS
Since a common operation is to create a pipe to another process, to either read its output or send it input, the
standard I/O library has historically provided the popen and pclose functions. These two functions handle all
the dirty work that we've been doing ourselves: creating a pipe, forking a child, closing the unused ends of the
pipe, executing a shell to run the command, and waiting for the command to terminate.
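The function popen does a fork and exec to execute cmdstring and returns a standard I/O file pointer, or NULL on error.
If type is "r", the file pointer is connected to the standard output of cmdstring.
If type is "w", the file pointer is connected to the standard input of cmdstring.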
COPROCESSES
A UNIX system filter is a program that reads from standard input and writes to standard output. Filters are
normally connected linearly in shell pipelines. A filter becomes a coprocess when the same program generates
the filter's input and reads the filter's output. A coprocess normally runs in the background from a shell, and
its standard input and standard output are connected to another program using a pipe.
The process creates two pipes: one is the standard input of the coprocess, and the other is the standard output
of the coprocess. Figure 15.16 shows this arrangement.
Figure 15.16. Driving a coprocess by writing its standard input and reading its standard output
int main(void)
{
    int n, int1, int2;
    char line[MAXLINE];
FIFOs
FIFOs are sometimes called named pipes. Pipes can be used only between related processes when a
common ancestor has created the pipe. With FIFOs, however, unrelated processes can exchange data.
We create the FIFO and then start prog3 in the background, reading from the FIFO. We then start prog1 and
use tee to send its input to both the FIFO and prog2. Figure 15.21 shows the process arrangement.
FIGURE 15.21: Using a FIFO and tee to send a stream to two different processes
In a client-server arrangement using FIFOs, it is impossible for the server to tell whether a client crashes. This causes the client-specific FIFOs to be left in the file system.
The server must also catch SIGPIPE, since it's possible for a client to send a request and terminate before reading the response, leaving the client-specific FIFO with one writer (the server) and no reader.
XSI IPC
Identifiers and Keys
Each IPC structure (message queue, semaphore, or shared memory segment) in the kernel is referred to by a
non-negative integer identifier. The identifier is an internal name for an IPC object. Cooperating processes
need an external naming scheme to be able to rendezvous using the same IPC object. For this purpose, an IPC
object is associated with a key that acts as an external name.
Whenever an IPC structure is being created, a key must be specified. The data type of this key is the primitive
system data type key_t, which is often defined as a long integer in the header <sys/types.h>. This key is
converted into an identifier by the kernel.
There are various ways for a client and a server to rendezvous at the same IPC structure.
The server can create a new IPC structure by specifying a key of IPC_PRIVATE and store the returned
identifier somewhere (such as a file) for the client to obtain. The key IPC_PRIVATE guarantees that
the server creates a new IPC structure. The disadvantage to this technique is that file system operations
are required for the server to write the integer identifier to a file, and then for the clients to retrieve this
identifier later.
The IPC_PRIVATE key is also used in a parent-child relationship. The parent creates a new IPC
structure specifying IPC_PRIVATE, and the resulting identifier is then available to the child after the
fork. The child can pass the identifier to a new program as an argument to one of the exec functions.
The client and the server can agree on a key by defining the key in a common header, for example.
The server then creates a new IPC structure specifying this key. The problem with this approach is that
it's possible for the key to already be associated with an IPC structure, in which case the get function
(msgget, semget, or shmget) returns an error. The server must handle this error, deleting the existing
IPC structure, and try to create it again.
The client and the server can agree on a pathname and project ID (the project ID is a character value
between 0 and 255) and call the function ftok to convert these two values into a key. This key is then
used as in the previous approach. The only service provided by ftok is a way of generating a key from a
pathname and project ID.
Permission Structure
XSI IPC associates an ipc_perm structure with each IPC structure. This structure defines the permissions and
owner and includes at least the following members:
struct ipc_perm
{
    uid_t uid; /* owner's effective user id */
    gid_t gid; /* owner's effective group id */
    uid_t cuid; /* creator's effective user id */
    gid_t cgid; /* creator's effective group id */
    mode_t mode; /* access modes */
    .
    .
    .
};
All the fields are initialized when the IPC structure is created. At a later time, we can modify the uid, gid, and
mode fields by calling msgctl, semctl, or shmctl. To change these values, the calling process must be either
the creator of the IPC structure or the superuser. Changing these fields is similar to calling chown or chmod for
a file.
Permission Bit
user-read 0400
user-write (alter) 0200
group-read 0040
group-write (alter) 0020
other-read 0004
other-write (alter) 0002
Configuration Limits
All three forms of XSI IPC have built-in limits that we may encounter. Most of these limits can be changed
by reconfiguring the kernel. We describe the limits when we describe each of the three forms of IPC.
Message Queues:
A message queue is a linked list of messages stored within the kernel and identified by a message queue
identifier. We'll call the message queue just a queue and its identifier a queue ID.
A new queue is created or an existing queue opened by msgget. New messages are added to the end of a queue
by msgsnd. Every message has a positive long integer type field, a non-negative length, and the actual data
bytes (corresponding to the length), all of which are specified to msgsnd when the message is added to a queue.
Messages are fetched from a queue by msgrcv. We don't have to fetch the messages in a first-in, first-out
order. Instead, we can fetch messages based on their type field.
Each queue has the following msqid_ds structure associated with it:
struct msqid_ds
{
struct ipc_perm msg_perm; /* see Section 15.6.2 */
msgqnum_t msg_qnum; /* # of messages on queue */
msglen_t msg_qbytes; /* max # of bytes on queue */
pid_t msg_lspid; /* pid of last msgsnd() */
On success, msgget returns the non-negative queue ID. This value is then used with the other three message
queue functions.
The msgctl function performs various operations on a queue.
Each message is composed of a positive long integer type field, a non-negative length (nbytes), and the actual
data bytes (corresponding to the length). Messages are always placed at the end of the queue.
The ptr argument points to a long integer that contains the positive integer message type, and it is immediately
followed by the message data. (There is no message data if nbytes is 0.) If the largest message we send is 512
bytes, we can define the following structure:
struct mymesg
{
long mtype; /* positive message type */
char mtext[512]; /* message data, of length nbytes */
};
The ptr argument is then a pointer to a mymesg structure. The message type can be used by the receiver to
fetch messages in an order other than first in, first out.
Messages are retrieved from a queue by msgrcv.
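To make the role of the type field concrete, here is a minimal sketch that sends text messages and fetches them back by type rather than in arrival order. The helper names send_text and recv_text and the MAX_TEXT constant are hypothetical; the structure mirrors mymesg, and both calls use IPC_NOWAIT so that they never block.

```c
#define _XOPEN_SOURCE 700   /* expose XSI interfaces under strict -std= modes */
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

#define MAX_TEXT 512

struct mymesg {
    long mtype;             /* positive message type */
    char mtext[MAX_TEXT];   /* message data */
};

/* Append one text message of the given type to the queue.
   Returns 0 on success, -1 on error. */
int send_text(int qid, long type, const char *text)
{
    struct mymesg m;
    size_t len = strlen(text) + 1;      /* include the terminating NUL */

    if (type <= 0 || len > MAX_TEXT)
        return -1;
    m.mtype = type;
    memcpy(m.mtext, text, len);
    /* nbytes counts only the data, not the mtype field */
    return msgsnd(qid, &m, len, IPC_NOWAIT);
}

/* Fetch the first message of the given type without blocking.
   Returns the number of data bytes copied, or -1 if no such message
   is queued. A message too large for buf is consumed and dropped. */
ssize_t recv_text(int qid, long type, char *buf, size_t buflen)
{
    struct mymesg m;
    ssize_t n = msgrcv(qid, &m, MAX_TEXT, type, IPC_NOWAIT);

    if (n < 0 || (size_t)n > buflen)
        return -1;
    memcpy(buf, m.mtext, (size_t)n);
    return n;
}
```

If a type 1 message is sent before a type 2 message, calling recv_text with type 2 still returns the second message first, which is the non-FIFO retrieval described above.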
Semaphores:
When a process is done with a shared resource that is controlled by a semaphore, the semaphore value is
incremented by 1. If any other processes are asleep, waiting for the semaphore, they are awakened.
A common form of semaphore is called a binary semaphore. It controls a single resource, and its value is
initialized to 1. In general, however, a semaphore can be initialized to any positive value, with the value
indicating how many units of the shared resource are available for sharing.
XSI semaphores are, unfortunately, more complicated than this. Three features contribute to this unnecessary
complication.
A semaphore is not simply a single non-negative value. Instead, we have to define a semaphore as a set of
one or more semaphore values. When we create a semaphore, we specify the number of values in the set.
The creation of a semaphore (semget) is independent of its initialization (semctl). This is a fatal flaw,
since we cannot atomically create a new semaphore set and initialize all the values in the set.
Since all forms of XSI IPC remain in existence even when no process is using them, we have to worry
about a program that terminates without releasing the semaphores it has been allocated. The undo
feature that we describe later is supposed to handle this.
The kernel maintains a semid_ds structure for each semaphore set:
struct semid_ds {
struct ipc_perm sem_perm; /* see Section 15.6.2 */
unsigned short sem_nsems; /* # of semaphores in set */
time_t sem_otime; /* last-semop() time */
time_t sem_ctime; /* last-change time */
.
.
.
};
Each semaphore is represented by an anonymous structure containing at least the following members:
struct {
unsigned short semval; /* semaphore value, always >= 0 */
pid_t sempid; /* pid for last operation */
unsigned short semncnt; /* # processes awaiting semval>curval */
unsigned short semzcnt; /* # processes awaiting semval==0 */
.
.
.
};
When a new set is created, the following members of the semid_ds structure are initialized.
The ipc_perm structure is initialized. The mode member of this structure is set to the corresponding
permission bits of flag.
sem_otime is set to 0.
sem_ctime is set to the current time.
sem_nsems is set to nsems.
The number of semaphores in the set is nsems. If a new set is being created (typically in the server), we must
specify nsems. If we are referencing an existing set (a client), we can specify nsems as 0.
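The fourth argument to semctl is optional, depending on the command requested. If present, it is of type
semun, a union of various command-specific arguments:
union semun
{
    int val;                 /* for SETVAL */
    struct semid_ds *buf;    /* for IPC_STAT and IPC_SET */
    unsigned short *array;   /* for GETALL and SETALL */
};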
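The sketch below shows how the union might be used to create and initialize a one-semaphore set. It assumes Linux with glibc, where the application has to define union semun itself (some other systems already define it in the header). The helper names create_semaphore, semaphore_value, and destroy_semaphore are hypothetical.

```c
#define _XOPEN_SOURCE 700   /* expose XSI interfaces under strict -std= modes */
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* Assumption: the platform does not define union semun for us. */
union semun {
    int              val;    /* for SETVAL */
    struct semid_ds *buf;    /* for IPC_STAT and IPC_SET */
    unsigned short  *array;  /* for GETALL and SETALL */
};

/* Create a private set holding one semaphore and set it to `initial`.
   Returns the semaphore ID, or -1 on error. */
int create_semaphore(int initial)
{
    union semun arg;
    int semid = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);

    if (semid == -1)
        return -1;
    arg.val = initial;
    if (semctl(semid, 0, SETVAL, arg) == -1) {  /* separate, non-atomic step */
        semctl(semid, 0, IPC_RMID);             /* do not leak the set */
        return -1;
    }
    return semid;
}

/* Current value of semaphore 0 in the set, or -1 on error. */
int semaphore_value(int semid)
{
    return semctl(semid, 0, GETVAL);
}

/* Remove the set from the system. Returns 0 on success, -1 on error. */
int destroy_semaphore(int semid)
{
    return semctl(semid, 0, IPC_RMID);
}
```

Creation and initialization are two separate calls here, which is exactly the non-atomic gap that the text criticizes: another process could open the set between semget and semctl.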
The cmd argument specifies one of ten commands (IPC_STAT, IPC_SET, IPC_RMID, GETVAL, SETVAL, GETPID,
GETNCNT, GETZCNT, GETALL, SETALL) to be performed on the set specified by semid.
The function semop atomically performs an array of operations on a semaphore set.
If sem_op is a negative number, semop adds the sem_op value to the corresponding semaphore element
value provided that the result would not be negative. If the operation would make the element value
negative, semop blocks the process on the event that the semaphore element value increases. If the
resulting value is 0, semop wakes the processes waiting for 0.
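A minimal sketch of acquire and release operations built on semop follows. The helper names lock_sem, try_lock_sem, and unlock_sem are hypothetical. SEM_UNDO is used on every operation so that the kernel can reverse the adjustments if the process terminates without releasing the semaphore.

```c
#define _XOPEN_SOURCE 700   /* expose XSI interfaces under strict -std= modes */
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* Apply one operation to semaphore 0 of the set. */
static int change_sem(int semid, short delta, short flags)
{
    struct sembuf op;

    op.sem_num = 0;        /* index of the semaphore within the set */
    op.sem_op  = delta;    /* negative: acquire, positive: release */
    op.sem_flg = flags;
    return semop(semid, &op, 1);
}

/* Take one unit; blocks while the semaphore value is 0. */
int lock_sem(int semid)
{
    return change_sem(semid, -1, SEM_UNDO);
}

/* Take one unit if available; otherwise fail at once (errno == EAGAIN). */
int try_lock_sem(int semid)
{
    return change_sem(semid, -1, SEM_UNDO | IPC_NOWAIT);
}

/* Give one unit back, waking any process blocked in lock_sem. */
int unlock_sem(int semid)
{
    return change_sem(semid, 1, SEM_UNDO);
}
```

With a semaphore initialized to 1, lock_sem and unlock_sem bracket a critical section in the manner of a binary semaphore.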
MODULE 5
SIGNALS AND DAEMON PROCESSES
Signals are software interrupts. Signals provide a way of handling asynchronous events: a user at a
terminal typing the interrupt key to stop a program or the next program in a pipeline terminating
prematurely.
Name        Description                        Default action
SIGABRT     abnormal termination (abort)       terminate+core
SIGALRM     timer expired (alarm)              terminate
SIGBUS      hardware fault                     terminate+core
SIGCANCEL   threads library internal use       ignore
SIGCHLD     change in status of child          ignore
SIGCONT     continue stopped process           continue/ignore
SIGEMT      hardware fault                     terminate+core
SIGFPE      arithmetic exception               terminate+core
SIGFREEZE   checkpoint freeze                  ignore
SIGHUP      hangup                             terminate
SIGILL      illegal instruction                terminate+core
SIGINFO     status request from keyboard       ignore
SIGINT      terminal interrupt character       terminate
SIGIO       asynchronous I/O                   terminate/ignore
SIGIOT      hardware fault                     terminate+core
SIGKILL     termination                        terminate
SIGLWP      threads library internal use       ignore
SIGPIPE     write to pipe with no readers      terminate
SIGPOLL     pollable event (poll)              terminate
SIGPROF     profiling time alarm (setitimer)   terminate
SIGPWR      power fail/restart                 terminate/ignore
SIGQUIT     terminal quit character            terminate+core
SIGSEGV     invalid memory reference           terminate+core
SIGSTKFLT   coprocessor stack fault            terminate
SIGSTOP     stop                               stop process
SIGSYS      invalid system call                terminate+core
SIGTERM     termination                        terminate
SIGNAL
The function prototype of the signal API is:
#include <signal.h>
void (*signal(int sig_no, void (*handler)(int)))(int);
The formal arguments of the API are: sig_no is a signal identifier like SIGINT or SIGTERM. The handler
argument is the function pointer of a user-defined signal handler function.
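The following example attempts to catch the SIGTERM signal, ignores the SIGINT signal, and accepts the
default action of the SIGSEGV signal. The pause API suspends the calling process until it is interrupted by a
signal and the corresponding signal handler returns.
#include <iostream>
#include <signal.h>
#include <unistd.h>
using namespace std;
/*signal handler function*/
void catch_sig(int sig_num)
{
    signal(sig_num,catch_sig);    /*re-install the handler*/
    cout << "catch_sig: " << sig_num << endl;
}
/*main function*/
int main()
{
    signal(SIGTERM,catch_sig);
    signal(SIGINT,SIG_IGN);
    signal(SIGSEGV,SIG_DFL);
    pause( );    /*wait for a signal interruption*/
    return 0;
}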
SIG_IGN specifies that a signal is to be ignored: if the signal is generated for the process, it is discarded
without interrupting the process.
SIG_DFL specifies that the default action of a signal is to be taken.
SIGNAL MASK
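A process initially inherits the parent's signal mask when it is created, but any pending signals for the parent
process are not passed on. A process may query or set its signal mask via the sigprocmask API:
#include <signal.h>
int sigprocmask(int cmd, const sigset_t *new_mask, sigset_t *old_mask);
The BSD UNIX and POSIX.1 define a set of APIs known as the sigsetops functions:
int sigemptyset(sigset_t *sigmask);
int sigaddset(sigset_t *sigmask, int signal_num);
int sigdelset(sigset_t *sigmask, int signal_num);
int sigfillset(sigset_t *sigmask);
int sigismember(const sigset_t *sigmask, int signal_num);
The sigemptyset API clears all signal flags in the sigmask argument.
The sigaddset API sets the flag corresponding to the signal_num signal in the sigmask argument.
The sigdelset API clears the flag corresponding to the signal_num signal in the sigmask argument.
The sigfillset API sets all the signal flags in the sigmask argument.
[ all the above functions return 0 if OK, -1 on error ]
The sigismember API returns 1 if the flag is set, 0 if not set and -1 if the call fails.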
The following example checks whether the SIGINT signal is present in a process signal mask and adds it to
the mask if it is not there.
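#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
int main()
{
    sigset_t sigmask;
    sigemptyset(&sigmask);    /*initialise set*/
    if(sigprocmask(SIG_BLOCK,NULL,&sigmask)==-1)    /*get current signal mask*/
    {
        perror("sigprocmask");
        exit(1);
    }
    else if(!sigismember(&sigmask,SIGINT))
    {
        sigaddset(&sigmask,SIGINT);    /*set SIGINT flag*/
        if(sigprocmask(SIG_SETMASK,&sigmask,NULL)==-1)    /*install new mask*/
            perror("sigprocmask");
    }
    return 0;
}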
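A process can query which signals are pending for it via the sigpending API:
#include <signal.h>
int sigpending(sigset_t *sigmask);
The API returns 0 if it succeeds and -1 if it fails. The returned set can then be tested with sigismember, for
example to report whether the SIGTERM signal is pending.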
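As an illustration of how the signal mask and sigpending interact, the sketch below blocks SIGUSR1, raises it, and checks that it is reported as pending until the mask is restored. The function names pending_while_blocked and handler_ran, and the choice of SIGUSR1, are assumptions made for this example only.

```c
#define _XOPEN_SOURCE 700   /* expose POSIX interfaces under strict -std= modes */
#include <signal.h>
#include <stddef.h>

static volatile sig_atomic_t usr1_seen = 0;

static void on_usr1(int sig_num)
{
    (void)sig_num;
    usr1_seen = 1;          /* only record that the handler ran */
}

/* Block SIGUSR1, raise it, and report whether it shows up as pending.
   Returns 1 if pending, 0 if not, -1 on error. The old mask is restored
   before returning, at which point the held signal is delivered. */
int pending_while_blocked(void)
{
    sigset_t block, old_mask, pending;
    struct sigaction action;
    int result;

    action.sa_handler = on_usr1;
    sigemptyset(&action.sa_mask);
    action.sa_flags = 0;
    if (sigaction(SIGUSR1, &action, NULL) == -1)
        return -1;

    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    if (sigprocmask(SIG_BLOCK, &block, &old_mask) == -1)
        return -1;

    raise(SIGUSR1);                     /* generated, but held by the mask */
    if (sigpending(&pending) == -1)
        result = -1;
    else
        result = sigismember(&pending, SIGUSR1);

    sigprocmask(SIG_SETMASK, &old_mask, NULL);   /* unblock: handler runs */
    return result;
}

/* Returns 1 once the SIGUSR1 handler has run, 0 before that. */
int handler_ran(void)
{
    return usr1_seen;
}
```

The handler does not run while the signal is blocked; it runs only when the original mask is reinstalled, which is why handler_ran changes from 0 to 1 across the call.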
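In addition to the above, UNIX also supports the following APIs for signal mask manipulation:
int sighold(int signal_num);
int sigrelse(int signal_num);
int sigignore(int signal_num);
int sigpause(int signal_num);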
SIGACTION
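The sigaction API blocks the signal it is catching and allows a process to specify additional signals to be
blocked while the handler is running.
The sigaction API prototype is:
#include <signal.h>
int sigaction(int signal_num, const struct sigaction *action, struct sigaction *old_action);
The struct sigaction data type is defined in the <signal.h> header as:
struct sigaction
{
    void (*sa_handler)(int);
    sigset_t sa_mask;
    int sa_flags;
};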
The following program illustrates the uses of sigaction:
#include <iostream>
#include <stdio.h>
#include <unistd.h>
#include <signal.h>
using namespace std;
void callme(int sig_num)
{
    cout << "catch signal: " << sig_num << endl;
}
int main(int argc, char* argv[])
{
    sigset_t sigmask;
    struct sigaction action,old_action;
    sigemptyset(&sigmask);
    if(sigaddset(&sigmask,SIGTERM)==-1 ||
       sigprocmask(SIG_SETMASK,&sigmask,0)==-1)
        perror("set signal mask");
    sigemptyset(&action.sa_mask);
    sigaddset(&action.sa_mask,SIGSEGV);
    action.sa_handler=callme;
    action.sa_flags=0;
    if(sigaction(SIGINT,&action,&old_action)==-1)
        perror("sigaction");
    pause();
    cout << argv[0] << " exits" << endl;
    return 0;
}
THE sigsetjmp AND siglongjmp APIs
The function prototypes of the APIs are:
#include <setjmp.h>
int sigsetjmp(sigjmp_buf env, int savemask);
void siglongjmp(sigjmp_buf env, int val);
The sigsetjmp and siglongjmp APIs were created to support signal mask processing, because it is
implementation-dependent whether a process signal mask is saved and restored when the process invokes the
setjmp and longjmp APIs respectively.
The only difference between these functions and the setjmp and longjmp functions is that sigsetjmp has an
additional argument. If savemask is nonzero, then sigsetjmp also saves the current signal mask of the process
in env. When siglongjmp is called, if the env argument was saved by a call to sigsetjmp with a nonzero
savemask, then siglongjmp restores the saved signal mask. The siglongjmp API is usually called from user-
defined signal handling functions. This is because a process signal mask is modified when a signal handler is
called, and siglongjmp should be called to ensure that the process signal mask is restored properly when
jumping out of a signal handling function.
The following program illustrates the uses of sigsetjmp and siglongjmp APIs.
#include <iostream>
#include <stdio.h>
#include <unistd.h>
#include <signal.h>
#include <setjmp.h>
using namespace std;
sigjmp_buf env;
void callme(int sig_num)
{
    cout << "catch signal: " << sig_num << endl;
    siglongjmp(env,2);
}
int main()
{
    sigset_t sigmask;
    struct sigaction action,old_action;
    sigemptyset(&sigmask);
    if(sigaddset(&sigmask,SIGTERM)==-1 ||
       sigprocmask(SIG_SETMASK,&sigmask,0)==-1)
        perror("set signal mask");
    sigemptyset(&action.sa_mask);
    sigaddset(&action.sa_mask,SIGSEGV);
    action.sa_handler=callme;
    action.sa_flags=0;
    if(sigaction(SIGINT,&action,&old_action)==-1)
        perror("sigaction");
    if(sigsetjmp(env,1)!=0)
    {
        cerr << "return from signal interruption" << endl;
        return 0;
    }
    else
        cerr << "return from first time sigsetjmp is called" << endl;
    pause();
    return 0;
}
KILL
A process can send a signal to a related process via the kill API. This is a simple means of inter-process
communication or control. The function prototype of the API is:
#include <signal.h>
int kill(pid_t pid, int signal_num);
The pid argument selects the receiving process or processes:
pid > 0    The signal is sent to the process whose process ID is pid.
pid == 0   The signal is sent to all processes whose process group ID equals the process group ID of the
           sender and for which the sender has permission to send the signal.
pid < -1   The signal is sent to all processes whose process group ID equals the absolute value of pid and
           for which the sender has permission to send the signal.
pid == -1  The signal is sent to all processes on the system for which the sender has permission to send the
           signal.
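The pid > 0 case can be sketched with a parent that signals its own child and then inspects how the child ended. The helper names signal_child and process_exists are hypothetical. The second helper relies on the POSIX rule that signal number 0 performs only the existence and permission checks without sending anything.

```c
#define _XOPEN_SOURCE 700   /* expose POSIX interfaces under strict -std= modes */
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that waits for signals, send it `sig` with kill, and
   return the number of the signal that terminated it (0 if it exited
   normally, -1 on error). */
int signal_child(int sig)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {             /* child: sleep until a signal arrives */
        for (;;)
            pause();
    }
    if (kill(pid, sig) == -1)   /* pid > 0: exactly this one process */
        return -1;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}

/* Returns 1 if the process exists and may be signalled by the caller,
   0 otherwise (including when permission is denied). */
int process_exists(pid_t pid)
{
    return kill(pid, 0) == 0;
}
```

Calling signal_child(SIGTERM) returns SIGTERM because the default action for that signal terminates the child.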
The following program illustrates the implementation of the UNIX kill command using the kill API:
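#include <iostream>
#include <stdio.h>
#include <unistd.h>
#include <string.h>
#include <signal.h>
using namespace std;
int main(int argc, char** argv)
{
    int pid, sig=SIGTERM;
    if(argc>=3 && argv[1][0]=='-')    /*optional -<signal_num>*/
    {
        if(sscanf(argv[1]+1,"%d",&sig)!=1)
        {
            cerr << "invalid signal number: " << argv[1] << endl;
            return -1;
        }
        argv++,argc--;
    }
    while(--argc>0)
    {
        if(sscanf(*++argv,"%d",&pid)==1)
        {
            if(kill(pid,sig)==-1)
                perror("kill");
        }
        else
            cerr << "invalid pid: " << argv[0] << endl;
    }
    return 0;
}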
The UNIX kill command invocation syntax is:
kill [ -<signal_num> ] <pid> ...
where signal_num can be an integer number or the symbolic name of a signal, and <pid> is a process ID.
ALARM
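The alarm API can be called by a process to request the kernel to send the SIGALRM signal after a certain
number of real clock seconds. The function prototype of the API is:
#include <unistd.h>
unsigned int alarm(unsigned int time_interval);
The alarm API can be used to implement a sleep-like function:
#include <signal.h>
#include <stdio.h>
#include <unistd.h>
void wakeup(int sig_num) { ; }    /*handler only interrupts pause*/
unsigned int my_sleep(unsigned int timer)
{
    struct sigaction action;
    action.sa_handler=wakeup;
    action.sa_flags=0;
    sigemptyset(&action.sa_mask);
    if(sigaction(SIGALRM,&action,0)==-1) { perror("sigaction"); return -1; }
    (void) alarm(timer);
    (void) pause();
    return 0;
}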
INTERVAL TIMERS
The interval timer can be used to schedule a process to do some tasks at a fixed time interval, to time the
execution of some operations, or to limit the time allowed for the execution of some tasks.
The following program illustrates how to set up a real-time clock interval timer using the alarm API:
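#include <stdio.h>
#include <unistd.h>
#include <signal.h>
#define INTERVAL 5
void callme(int sig_no)
{
    alarm(INTERVAL);    /*re-arm the timer*/
    /*do scheduled tasks*/
}
int main()
{
    struct sigaction action;
    sigemptyset(&action.sa_mask);
    action.sa_handler=callme;
    action.sa_flags=SA_RESTART;
    if(sigaction(SIGALRM,&action,0)==-1)
    {
        perror("sigaction");
        return 1;
    }
    alarm(INTERVAL);    /*alarm cannot fail, so no error check*/
    while(1)
    {
        /*do normal operation*/
    }
    return 0;
}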
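In addition to the alarm API, UNIX also provides the setitimer API, which can be used to define up to three
different types of timers in a process:
Real time clock timer
Timer based on the user time spent by a process
Timer based on the total user and system times spent by a process
The getitimer API is also defined for users to query the timer values that are set by the
setitimer API. The setitimer and getitimer function prototypes are:
#include <sys/time.h>
int setitimer(int which, const struct itimerval *val, struct itimerval *old);
int getitimer(int which, struct itimerval *old);
The which argument to the above APIs specifies which timer to process. Its possible values and the
corresponding timer types are:
ITIMER_REAL      Timer based on the real-time clock. Generates a SIGALRM signal when it expires.
ITIMER_VIRTUAL   Timer based on the user time spent by a process. Generates a SIGVTALRM signal when it
                 expires.
ITIMER_PROF      Timer based on the total user and system times spent by a process. Generates a SIGPROF
                 signal when it expires.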
Example program:
#include <stdio.h>
#include <unistd.h>
#include <signal.h>
#include <sys/time.h>
#define INTERVAL 5
void callme(int sig_no)
{
    /*do scheduled tasks*/
}
int main()
{
    struct itimerval val;
    struct sigaction action;
    sigemptyset(&action.sa_mask);
    action.sa_handler=callme;
    action.sa_flags=SA_RESTART;
    if(sigaction(SIGALRM,&action,0)==-1)
    {
        perror("sigaction");
        return 1;
    }
    val.it_interval.tv_sec =INTERVAL;    /*period of later expirations*/
    val.it_interval.tv_usec =0;
    val.it_value.tv_sec =INTERVAL;       /*time until first expiration*/
    val.it_value.tv_usec =0;
    if(setitimer(ITIMER_REAL,&val,0)==-1)
        perror("setitimer");
    else while(1)
    {
        /*do normal operation*/
    }
    return 0;
}
The setitimer and getitimer APIs return a zero value if they succeed or a -1 value if they fail.
POSIX.1b TIMERS
POSIX.1b defines a set of APIs for interval timer manipulations. The POSIX.1b timers are more flexible
and powerful than are the UNIX timers in the following ways:
Users may define multiple independent timers per system clock.
The timer resolution is in nanoseconds.
Users may specify the signal to be raised when a timer expires.
The time interval may be specified as either an absolute or a relative time.
DAEMON PROCESSES
INTRODUCTION
Daemons are processes that live for a long time. They are often started when the system is bootstrapped
and terminate only when the system is shut down.
DAEMON CHARACTERISTICS
The characteristics of daemons are:
Daemons run in background.
Daemons have super-user privilege.
Daemons don't have a controlling terminal.
Daemons are session and group leaders.
CODING RULES
Call umask to set the file mode creation mask to 0. The file mode creation mask that's inherited
could be set to deny certain permissions. If the daemon process is going to create files, it may want to
set specific permissions.
Call fork and have the parent exit. This does several things. First, if the daemon was started as a
simple shell command, having the parent terminate makes the shell think that the command is done.
Second, the child inherits the process group ID of the parent but gets a new process ID, so we're
guaranteed that the child is not a process group leader.
Call setsid to create a new session. The process (a) becomes a session leader of a new session, (b)
becomes the process group leader of a new process group, and (c) has no controlling terminal.
Change the current working directory to the root directory. The current working directory
inherited from the parent could be on a mounted file system. Since daemons normally exist until the
system is rebooted, if the daemon stays on a mounted file system, that file system cannot be unmounted.
Unneeded file descriptors should be closed. This prevents the daemon from holding open any
descriptors that it may have inherited from its parent.
Some daemons open file descriptors 0, 1, and 2 to /dev/null so that any library routines that try
to read from standard input or write to standard output or standard error will have no effect.
Since the daemon is not associated with a terminal device, there is nowhere for output to be displayed;
nor is there anywhere to receive input from an interactive user. Even if the daemon was started from
an interactive session, the daemon runs in the background, and the login session can terminate without
affecting the daemon. If other users log in on the same terminal device, we wouldn't want output from
the daemon showing up on the terminal, and the users wouldn't expect their input to be read by the
daemon.
Example Program:
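#include <unistd.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
int daemon_initialise( )
{
    pid_t pid;
    if (( pid = fork() ) < 0)
        return 1;    /* fork failed */
    else if ( pid != 0)
        exit(0);     /* parent exits */
    /* child continues */
    setsid();        /* become session leader without a controlling terminal */
    chdir("/");      /* do not keep a mounted file system busy */
    umask(0);        /* clear the file mode creation mask */
    return 0;
}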
ERROR LOGGING
One problem a daemon has is how to handle error messages. It can't simply write to standard error, since it
shouldn't have a controlling terminal. We don't want all the daemons writing to the console device, since on
many workstations, the console device runs a windowing system. A central daemon error-logging facility is
required.
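On most UNIX systems the central facility is reached through the syslog interface. The following is a minimal sketch of how a daemon might use it; the helper names start_logging and log_failure are hypothetical, and the choice of the LOG_DAEMON facility and the LOG_INFO threshold is an assumption made for illustration.

```c
#define _XOPEN_SOURCE 700   /* expose XSI interfaces under strict -std= modes */
#include <syslog.h>

/* Open the connection to the system logger for a daemon called `ident`
   and drop messages less severe than LOG_INFO. The ident string must
   stay valid for as long as logging is in use.
   Returns the log mask that was in effect before the call. */
int start_logging(const char *ident)
{
    /* LOG_PID tags each line with the process ID;
       LOG_DAEMON is the facility conventionally used by daemons. */
    openlog(ident, LOG_PID, LOG_DAEMON);
    return setlogmask(LOG_UPTO(LOG_INFO));
}

/* Report a failed operation; the message goes to the logging daemon,
   not to a terminal. */
void log_failure(const char *what, const char *detail)
{
    syslog(LOG_ERR, "%s failed: %s", what, detail);
}
```

A daemon would typically call start_logging once after it has detached from its terminal, then call log_failure wherever it would otherwise have written to standard error.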
SINGLE-INSTANCE DAEMONS
Some daemons are implemented so that only a single copy of the daemon should be running at a time for
proper operation. The file and record-locking mechanism provides the basis for one way to ensure that only
one copy of a daemon is running. If each daemon creates a file and places a write lock on the entire file, only
one such write lock will be allowed to be created. Successive attempts to create write locks will fail, serving
as an indication to successive copies of the daemon that another instance is already running.
File and record locking provides a convenient mutual-exclusion mechanism. If the daemon obtains a write-
lock on an entire file, the lock will be removed automatically if the daemon exits. This simplifies recovery,
removing the need for us to clean up from the previous instance of the daemon.
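The locking scheme described above might be sketched as follows, using fcntl record locking on the whole file. The function name acquire_instance_lock is hypothetical, and writing the process ID into the file is a common convention rather than a requirement of the locking itself.

```c
#define _XOPEN_SOURCE 700   /* expose POSIX interfaces under strict -std= modes */
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Try to become the single running instance.
   Returns an open descriptor holding the lock (keep it open for the life
   of the daemon), or -1 if another process holds the lock or on error. */
int acquire_instance_lock(const char *path)
{
    struct flock fl;
    char buf[32];
    int len;
    int fd = open(path, O_RDWR | O_CREAT, 0644);

    if (fd < 0)
        return -1;

    fl.l_type   = F_WRLCK;    /* exclusive (write) lock */
    fl.l_whence = SEEK_SET;
    fl.l_start  = 0;
    fl.l_len    = 0;          /* 0 means "to end of file": the entire file */
    if (fcntl(fd, F_SETLK, &fl) == -1) {   /* non-blocking attempt */
        close(fd);            /* EACCES or EAGAIN: another copy has it */
        return -1;
    }

    /* Record our process ID in the lock file. */
    len = snprintf(buf, sizeof buf, "%ld\n", (long)getpid());
    if (ftruncate(fd, 0) == -1 || write(fd, buf, (size_t)len) != len) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Because the kernel drops the lock when the process exits, a crashed daemon leaves behind a stale file but no stale lock, so the next instance can start without any cleanup step.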
DAEMON CONVENTIONS
If the daemon uses a lock file, the file is usually stored in /var/run. Note, however, that the daemon
might need superuser permissions to create a file here. The name of the file is usually name.pid, where
name is the name of the daemon or the service. For example, the name of the cron daemon's lock file
is /var/run/crond.pid.
If the daemon supports configuration options, they are usually stored in /etc. The configuration file is
named name.conf, where name is the name of the daemon or the name of the service. For example, the
configuration for the syslogd daemon is /etc/syslog.conf.
Daemons can be started from the command line, but they are usually started from one of the system
initialization scripts (/etc/rc* or /etc/init.d/*). If the daemon should be restarted automatically when it
exits, we can arrange for init to restart it if we include a respawn entry for it in /etc/inittab.
If a daemon has a configuration file, the daemon reads it when it starts, but usually won't look at it
again. If an administrator changes the configuration, the daemon would need to be stopped and
restarted to account for the configuration changes. To avoid this, some daemons will catch SIGHUP
and reread their configuration files when they receive the signal. Since they aren't associated with
terminals and are either session leaders without controlling terminals or members of orphaned process
groups, daemons have no reason to expect to receive SIGHUP. Thus, they can safely reuse it.
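The SIGHUP convention could be sketched as a handler that only sets a flag, with the main loop doing the actual rereading. The names install_reload_handler and take_reload_request are hypothetical; how the configuration file is parsed is left to the daemon.

```c
#define _XOPEN_SOURCE 700   /* expose POSIX interfaces under strict -std= modes */
#include <signal.h>
#include <stddef.h>

/* Set by the handler, cleared by the main loop. */
static volatile sig_atomic_t reload_requested = 0;

static void on_sighup(int sig_num)
{
    (void)sig_num;
    reload_requested = 1;     /* only set a flag: async-signal-safe */
}

/* Install the SIGHUP handler. Returns 0 on success, -1 on error. */
int install_reload_handler(void)
{
    struct sigaction action;

    action.sa_handler = on_sighup;
    sigemptyset(&action.sa_mask);
    action.sa_flags = SA_RESTART;   /* restart interrupted system calls */
    return sigaction(SIGHUP, &action, NULL);
}

/* Called from the daemon's main loop. Returns 1 if SIGHUP arrived since
   the last call, telling the caller to reread its configuration file. */
int take_reload_request(void)
{
    if (!reload_requested)
        return 0;
    reload_requested = 0;
    return 1;
}
```

Keeping the handler this small avoids calling functions that are not safe inside a signal handler, such as those that open and parse files.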
CLIENT-SERVER MODEL
In general, a server is a process that waits for a client to contact it, requesting some type of service. In Figure
13.2 [REFER PAGE 10], the service being provided by the syslogd server is the logging of an error message.
In Figure 13.2, the communication between the client and the server is one-way. The client sends its service
request to the server; the server sends nothing back to the client. In the upcoming chapters, we'll see numerous