David Wang, Computing Science and Information Technology: INFO 1211 - Operating System's Principles and Applications

The document provides an overview of regular expressions, including their history, features, and usage in text processing and programming. It explains the mechanics of regular expressions, including various metacharacters, character classes, and practical examples. Additionally, it covers the differences between basic and extended regular expressions, highlighting their applications in tools like grep and egrep.

Uploaded by

Rajan Thakur

LECTURE 4:

INFO 1211 – OPERATING SYSTEM’S PRINCIPLES AND APPLICATIONS

DAVID WANG
COMPUTING SCIENCE AND INFORMATION TECHNOLOGY
OUTLINE
• Regular Expression
• grep Command
• sed Command
REGULAR EXPRESSION
• A regular expression (regex or regexp for short) is a
sequence of characters that defines a search pattern
• Similar to (but different from) shell wildcards
• More precise
• More complicated
• Terms
• Character
• basic unit of text
• letter, number, punctuation, space, etc.
• String
• a sequence of zero or more characters
HISTORY
• Regular Set
• The original form of the regular expression
• Introduced by mathematician Stephen Cole Kleene in 1956
• described regular languages using his mathematical notation
• Applied to theoretical computer science
• automata theory (models of computation)
• the description and classification of formal languages
• SNOBOL language
• Another early implementation of pattern matching
• did not use regular expressions, but instead its own syntax
• Regular expressions became popular from 1968 in two uses:
• Pattern matching in a text editor
• I study at KPU, I work at KPU, I found my friends at KPU…
• I xxx at KPU (Pattern matching)
• Lexical analysis in a compiler
• C, Java …
HISTORY (CONT.)
• First Appearances of Regular Expression in program form
• Ken Thompson’s pattern matching in QED editor
• For speed, Thompson implemented regular expression
matching by just-in-time compilation (JIT) to IBM 7094 code
on the Compatible Time-Sharing System, an important early
example of JIT compilation.
• Many variations appeared in Unix programs at Bell Labs in the 1970s
• including lex, sed, AWK and expr, and in other programs such as vi and Emacs
• standardized in POSIX.2 in 1992
• Today regular expressions are widely supported in
• programming languages (validate an email address input)
• text processing programs (e.g., lexers for lexical analysis)
• advanced text editors (search, replace)
• etc.
PATTERN
The pattern is an expression written in a language designed
to describe target strings concisely and flexibly, so that the
processing of general text files, specific textual forms, or
arbitrary input strings can be automated.
• an expression to specify a set of strings required for a
particular purpose
• a simple way to specify a finite set of strings is to list its
elements or members
WORKING MECHANISM
• A regular expression processor translates a regular
expression into a nondeterministic finite automaton (NFA),
to recognize substrings that match the regular expression
• The picture shows the NFA scheme that accepts any
binary string that contains at least one 00 or 11 as a
substring
WORKING MECHANISM
• An NFA that accepts all binary strings that end with 101.
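The NFA idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the state numbering and the `TRANSITIONS` table are hypothetical (the slide's picture is not reproduced here), and real regex processors build such automata from the pattern rather than by hand.

```python
# Hypothetical NFA for "binary strings that end with 101".
# States: 0 (start), 1 (seen "1"), 2 (seen "10"), 3 (seen "101", accepting).
TRANSITIONS = {
    (0, "0"): {0},
    (0, "1"): {0, 1},  # nondeterminism: stay in 0, or guess the suffix starts here
    (1, "0"): {2},
    (2, "1"): {3},
}
START = 0
ACCEPTING = {3}


def accepts(text):
    """Simulate the NFA by tracking every state it could be in."""
    current = {START}
    for symbol in text:
        following = set()
        for state in current:
            following |= TRANSITIONS.get((state, symbol), set())
        current = following
    return bool(current & ACCEPTING)


for sample in ["101", "0101", "1010", ""]:
    print(repr(sample), accepts(sample))
```

Tracking a set of states, instead of a single state, is what lets the simulation follow all nondeterministic choices at once.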
REGULAR EXPRESSIONS FEATURES
• Regular expressions are interpreted by the command and
not by the shell
• Quoting ensures that the shell isn’t able to interfere and
interpret the metacharacters in its own way.
• Some of the characters used by regular expressions are
also meaningful to the shell – enough reason why these
expressions should be quoted.
• Category of Regular Expressions
• Basic Regular Expressions (BREs)
• Extended Regular Expressions (EREs)
• Perl Regular Expressions (PREs)
• Python Regular Expressions
BASIC REGULAR EXPRESSIONS
• Oldest regular expression flavor still in use today
• Standardizes a flavor similar to the one used by the
traditional UNIX grep command
• Most metacharacters require a backslash to give them their
special meaning
• Using a backslash to escape a character that is never a
metacharacter is an error
• Supports POSIX bracket expressions
POSIX BRACKET EXPRESSIONS
POSIX       Description                                  ASCII             Java
[:alnum:]   Alphanumeric characters                      [a-zA-Z0-9]       \p{Alnum}
[:alpha:]   Alphabetic characters                        [a-zA-Z]          \p{Alpha}
[:ascii:]   ASCII characters                             [\x00-\x7F]       \p{ASCII}
[:blank:]   Space and tab                                [ \t]             \p{Blank}
[:cntrl:]   Control characters                           [\x00-\x1F\x7F]   \p{Cntrl}
[:digit:]   Digits                                       [0-9]             \p{Digit}
[:graph:]   Visible characters (i.e. anything except     [\x21-\x7E]       \p{Graph}
            spaces, control characters, etc.)
POSIX BRACKET EXPRESSIONS (CONT.)
POSIX       Description                                  ASCII             Java
[:lower:]   Lowercase letters                            [a-z]             \p{Lower}
[:print:]   Visible characters and spaces (i.e.          [\x20-\x7E]       \p{Print}
            anything except control characters, etc.)
[:punct:]   Punctuation and symbols                      [!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~]   \p{Punct}
[:space:]   All whitespace characters, including         [ \t\r\n\v\f]     \p{Space}
            line breaks
[:upper:]   Uppercase letters                            [A-Z]             \p{Upper}
[:word:]    Word characters (letters, numbers            [A-Za-z0-9_]
            and underscores)
[:xdigit:]  Hexadecimal digits                           [A-Fa-f0-9]       \p{XDigit}
THE CHARACTER CLASS
• Single Character Matching
• Specify a group of characters enclosed within a pair of
square brackets [ ]
• Example
• [od] matches either o or d
• [od][de] matches four patterns
• od
• oe
• dd
• de
• To match woodhouse and wodehouse
• wo[od][de]house
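A quick way to check the character-class example is to try it against a few candidate names. The sketch below uses Python's `re` module as a stand-in for grep (an assumption; `re` is a Perl-style flavor, but bracket classes behave the same way here), and the list of names is made up for illustration.

```python
import re

# Each bracket pair matches exactly one character from its set.
pattern = re.compile(r"wo[od][de]house")

# Hypothetical test names; only the first two are the intended targets.
names = ["woodhouse", "wodehouse", "wooehouse", "wordhouse"]

matched = [name for name in names if pattern.fullmatch(name)]
print(matched)  # ['woodhouse', 'wodehouse', 'wooehouse']
```

Note that the class also admits unintended spellings such as wooehouse, because every combination of the two sets is allowed.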
NEGATING A CLASS
• Use a caret ^ to negate the character class
• Plays the same role as the bang ! in shell wildcards
• Example
• [^a-zA-Z] matches a single non-alphabetic character
• [^0-9] matches a single non-numeric character
*
• Use * to match the preceding pattern element zero or
more times
• refers to the immediately preceding pattern element
• Has nothing in common with the * shell wildcard
• Example
• e* matches the empty string, e, ee, eee, eeee, ...
• s*printf matches printf, sprintf, ssprintf, sssprintf, ...
• to match trueman and truman
• true*man
• to match wilcox and wilcocks
• wilco[cx]k*s*
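The star examples can be verified with a small table of pattern and string pairs. This is a sketch using Python's `re` module in place of grep (an assumption; the `*` quantifier behaves the same in both), with a hypothetical helper `whole_match` that requires the entire string to match.

```python
import re


def whole_match(pattern, text):
    """True when the pattern matches the entire string."""
    return re.fullmatch(pattern, text) is not None


cases = [
    ("e*", ""),                    # zero occurrences is allowed
    ("s*printf", "printf"),
    ("s*printf", "ssprintf"),
    ("true*man", "truman"),
    ("true*man", "trueman"),
    ("wilco[cx]k*s*", "wilcox"),
    ("wilco[cx]k*s*", "wilcocks"),
    ("wilco[cx]k*s*", "wilson"),   # expected to fail: no "wilco" prefix
]

for pattern, text in cases:
    print(f"{pattern!r:20} {text!r:12} {whole_match(pattern, text)}")
```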
.
• Use . to match any single character except a newline
• Same as the ? shell wildcard
• Within square brackets the dot is literal
• Can be escaped with \
• Example
• 2...
• matches a four-character pattern beginning with a 2
• chap..
• matches a six-character pattern beginning with chap
• \.[co]
• matches .c or .o
• [.][co]
• also matches .c or .o, since the dot is literal inside brackets
.*
• Use .* to signify any number of characters or none
• similar to the * shell wildcard
• Example
• p.*woodhouse matches all of the following:
• p.j. woodhouse
• p. woodhouse
• p.j.woodhouse
• A regular expression match is made for the longest
possible string
• 03.*05 matches from the leftmost 03 to the rightmost 05 on
the line
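The longest-match rule is easiest to see on a line that contains the end marker more than once. The sketch below uses Python's `re` module as a stand-in for grep (an assumption; its default quantifiers are greedy, which gives the same result here), and the sample line is invented for illustration.

```python
import re

# Hypothetical input line containing two dates that both end in 05.
line = "id 03/12/05 paid 03/14/05 ok"

# .* is greedy: it runs to the last 05 on the line, not the first.
greedy = re.search(r"03.*05", line).group()
print(greedy)  # 03/12/05 paid 03/14/05

# The same .* absorbs the differing initials in the slide's name example.
names = ["p.j. woodhouse", "p. woodhouse", "p.j.woodhouse"]
all_match = all(re.fullmatch(r"p.*woodhouse", n) for n in names)
print(all_match)  # True
```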
^ AND $
• Most of the regular expression characters are used for
matching patterns
• Use ^ and $ to specify pattern locations
• ^ matches pattern at the beginning of a line
• $ matches pattern at the end of a line
• Example
• bash$
• bash at the end of line
• ^bash
• bash at the beginning of the line
• Find lines with bash as the only word in the line
• ^bash$
• Find blank lines
• ^$
EXAMPLE
2365 :john woodcook :director :personnel :05/11/47 :120000
5678 :robert dylan :d.g.m :marketing 04/19/43 :85000
9876 :bill Johnson :director :production :03/12/50 :130000
2233 :charles harris :g.m. :sales :12/12/52 :90000
5423 :barry wood :chairman :admin :08/30/56 :160000

• Find lines beginning with 2

• ^2
• Find lines ending with a value in the range 80000 to 99999
• [89]....$
• Find lines beginning with a character other than 2
• ^[^2]
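The three anchored patterns can be tried on the sample records. This sketch uses Python's `re` module instead of grep (an assumption), stores simplified copies of the five records without the padding spaces, and uses a hypothetical helper `ids` that reports the four-digit id of each matching line.

```python
import re

# Simplified copies of the slide's records (padding spaces removed).
records = [
    "2365:john woodcook:director:personnel:05/11/47:120000",
    "5678:robert dylan:d.g.m:marketing:04/19/43:85000",
    "9876:bill johnson:director:production:03/12/50:130000",
    "2233:charles harris:g.m.:sales:12/12/52:90000",
    "5423:barry wood:chairman:admin:08/30/56:160000",
]


def ids(pattern):
    """Return the employee ids of the records the pattern matches."""
    return [rec[:4] for rec in records if re.search(pattern, rec)]


print(ids(r"^2"))         # ['2365', '2233']
print(ids(r"[89]....$"))  # ['5678', '2233']
print(ids(r"^[^2]"))      # ['5678', '9876', '5423']
```

The second pattern only inspects the last five characters of each line, which is why six-digit salaries such as 120000 are not selected.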
TRIPLE ROLES OF ^
• Beginning of a character class
• Negates every character of the class
• [^a-z]
• Beginning of the expression
• Pattern matched at the beginning of the line
• ^2
• Other locations
• Matches itself
• Discussion
• ^^^ and ^[^^]
ESCAPING
• Some of the special characters may need to appear as literal text
• . and * lose their special meaning when placed inside a character class
• [.] matches .
• s[*] matches s*
• * is also matched literally if it is the first character
• *sta[re] matches *star and *stae
• For the others, use \ to escape the metacharacter
• g\* matches g*
• \[ matches [
• \.\* matches .*
PRACTICE
• Handel, Hundel, and Haendel
• can be specified by the pattern H[ua]e*ndel
• .at
• matches any three-character string ending with "at",
including "hat", "cat", and "bat".
• [hc]at
• matches "hat" and "cat".
• [^b]at
• matches all strings matched by .at except "bat".
• [^hc]at
• matches all strings matched by .at other than "hat" and
"cat".
PRACTICE
• ^[hc]at
• matches "hat" and "cat", but only at the beginning of the
string or line.
• [hc]at$
• matches "hat" and "cat", but only at the end of the string or
line.
• \[.\]
• matches any single character surrounded by "[" and "]"
since the brackets are escaped, for example: "[a]" and "[b]".
• s.*
• matches s followed by zero or more characters, for
example: "s" and "saw" and "seed".
CASE STUDY
• To find hi in a paragraph
• Regex: hi
• match all hi in the paragraph
• Problem
• Also matches
• him
• history
• high
• Solution
• \bhi\b
\b
• Matches a zero-width boundary between a word-class
character and either a non-word-class character or an
edge
• In general, the delimiters between words are space,
punctuation, newline, etc. However, \b does not match
any of them; it only matches a position.
• Example
• er\b matches never, but doesn’t match verb
\B
• Matches any zero-width position that is not a word boundary,
i.e. a position between two word-class characters or between
two non-word-class characters
• Example
• er\B matches verb, but doesn’t match never
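The effect of \b and \B can be checked directly. The sketch below uses Python's `re` module (an assumption; \b and \B are extensions in grep but have the same meaning there), and the sample sentence is made up for illustration.

```python
import re

# Hypothetical paragraph containing "hi" both alone and inside other words.
text = "hi, this is his history; say hi"

plain = re.findall(r"hi", text)        # also hits this, his, history
bounded = re.findall(r"\bhi\b", text)  # only the standalone word
print(len(plain), len(bounded))  # 5 2

# er\b needs a word boundary right after "er"; er\B needs the opposite.
print(bool(re.search(r"er\b", "never")), bool(re.search(r"er\b", "verb")))  # True False
print(bool(re.search(r"er\B", "verb")), bool(re.search(r"er\B", "never")))  # True False
```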
\w
• Matches an alphanumeric character, including "_"
• same as [A-Za-z0-9_] in ASCII

\W
• Matches a non-alphanumeric character, excluding "_"
• same as [^A-Za-z0-9_] in ASCII
BRE SUMMARY
Pattern    Matches
*          zero or more occurrences of the previous character
.          a single character
[pqr]      a single character p, q, or r
[c1-c2]    a single character within the ASCII range between c1 and c2
[^pqr]     a single character which is not p, q, or r
^pat       pattern pat at beginning of line
pat$       pattern pat at end of line
\b         a zero-length word boundary (matches a position)
\B         the negated version of \b
\w         an alphanumeric character, including "_"
\W         a non-alphanumeric character, excluding "_"
PRACTICE
STRING1 Mozilla/4.0 (compatible; MSIE 5.0; Windows NT; DigExt)
STRING2 Mozilla/4.75 [en](X11;U;Linux2.2.16-22 i586)

Pattern     String    Result     Explanation
in[du]      STRING1   match      Finds ind in Windows.
            STRING2   match      Finds inu in Linux.
x[0-9A-Z]   STRING1   no match   Again the tests are case sensitive: to find
                                 the xt in DigExt we would need to use
                                 [0-9a-z] or [0-9A-Zt]. We can also use this
                                 format for testing upper and lower case,
                                 e.g. [Ff] will check for lower and upper
                                 case F.
            STRING2   match      Finds x2 in Linux2.
[^A-M]in    STRING1   match      Finds Win in Windows.
            STRING2   no match   We have excluded the range A to M in our
                                 search, so Linux is not found, but linux
                                 (if it were present) would be found.
PRACTICE
STRING1 Mozilla/4.0 (compatible; MSIE 5.0; Windows NT; DigExt)
STRING2 Mozilla/4.75 [en](X11;U;Linux2.2.16-22 i586)

Pattern   String    Result     Explanation
m         STRING1   match      Finds the m in compatible.
          STRING2   no match   There is no lower case m in this string.
                               Searches are case sensitive unless you
                               take special action.
a/4       STRING1   match      Found in Mozilla/4.0 - any combination of
                               characters can be used for the match.
          STRING2   match      Found in the same place as in STRING1.
5 \[      STRING1   no match   The search is looking for a pattern of '5 ['
                               and this does NOT exist in STRING1. Spaces
                               are valid in searches.
          STRING2   match      Found in Mozilla/4.75 [en]. Note: the \
                               (backslash) is an escape character and must
                               be present since the following [ is a
                               metacharacter that we will meet in the
                               next section.
in        STRING1   match      Found in Windows.
          STRING2   match      Found in Linux.
le        STRING1   match      Found in compatible.
          STRING2   no match   There is an l and an e in this string but
                               they are not adjacent.
PRACTICE
STRING1 Mozilla/4.0 (compatible; MSIE 5.0; Windows NT; DigExt)
STRING2 Mozilla/4.75 [en](X11;U;Linux2.2.16-22 i586)

Pattern    String    Result     Explanation
[a-z]\)$   STRING1   match      Finds t) in DigExt). Note: the \ is an escape
                                character and is required to treat the ) as
                                a literal in ERE; in BRE a bare ) is already
                                literal.
           STRING2   no match   We have a numeric value at the end of this
                                string, so we would need [0-9a-z]\)$ to find it.
.in        STRING1   match      Finds Win in Windows.
           STRING2   match      Finds Lin in Linux.
EXTENDED REGULAR EXPRESSIONS
• "Extended" is relative to the original UNIX grep
• grep only had bracket expressions, dot, caret, dollar and star
• Standardizes a flavor similar to the one used by the UNIX
egrep command
• egrep did not maintain compatibility with grep
• use a backslash to suppress the meaning of
metacharacters
• Adds ?, +, and |, and removes the need to escape the
metacharacters ( ) and { }, an escape that BRE requires
+ AND ?
• Often used in place of the * to restrict the matching scope
• + matches one or more occurrences of the previous character
• Example
• b+ matches b, bb, bbb, bbbb, ...
• doesn't match the empty string
• #include +<stdio.h> matches #include followed by one or
more spaces and <stdio.h>, e.g.
#include <stdio.h>, #include  <stdio.h>, #include   <stdio.h>
• ? matches zero or one occurrence of the previous character
• Example
• b? matches the empty string or b
• doesn't match bb, bbb, ...
• true?man matches trueman and truman
• Typical usage: # ?include +<stdio.h>
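The difference between +, ? and * shows up at the boundaries: zero occurrences and more than one. The sketch below uses Python's `re` module in place of grep -E (an assumption; these quantifiers behave the same in ERE), with a hypothetical helper `whole_match` that requires the entire string to match.

```python
import re


def whole_match(pattern, text):
    """True when the pattern matches the entire string."""
    return re.fullmatch(pattern, text) is not None


# + needs at least one occurrence of the preceding character.
print(whole_match(r"b+", ""), whole_match(r"b+", "bbb"))  # False True

# ? allows zero or one occurrence, never two.
print(whole_match(r"true?man", "truman"), whole_match(r"true?man", "trueeman"))  # True False

# The typical usage: optional space after #, one or more spaces before <.
include = r"# ?include +<stdio.h>"
for line in ["#include <stdio.h>", "# include   <stdio.h>", "#include<stdio.h>"]:
    print(repr(line), whole_match(include, line))
```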
| AND ()
• Use | to serve as the delimiter of multiple patterns
• Example
• woodhouse|woodcock
• matches woodhouse or woodcock
• Use ( ) to group patterns
• works with | as a better alternative
• Example
• wood(house|cock)
• matches woodhouse or woodcock
• gr(a|e)y
• matches gray and grey
• @(samp|code)\{[^}]+\}
• matches @code{foo} and @samp{bar}
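Alternation and grouping can be exercised the same way. This is a sketch with Python's `re` module standing in for grep -E (an assumption; | and ( ) have the same meaning in ERE), and the non-matching inputs are invented to show what the group excludes.

```python
import re


def whole_match(pattern, text):
    """True when the pattern matches the entire string."""
    return re.fullmatch(pattern, text) is not None


# Grouping factors out the common prefix of the two alternatives.
for word in ["woodhouse", "woodcock", "woodland"]:
    print(word, whole_match(r"wood(house|cock)", word))

# The Texinfo-style example: a command name, then a non-empty brace body.
texinfo = r"@(samp|code)\{[^}]+\}"
for item in ["@code{foo}", "@samp{bar}", "@code{}", "@math{x}"]:
    print(item, whole_match(texinfo, item))
```

The `[^}]+` part requires at least one character inside the braces, so `@code{}` is rejected.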
( ) IN BRE
• ( ) can also be used in basic regular expressions
• Requires \( \)
PRACTICE
• wilco[cx]k*s*|wood(house|cock)
• woodcock
• woodhouse
• wilcocwoodcock
• wicowood
• wilcocx
• woodcoxk
• wilcocxks
• Woodcoxks
• wilcocks
• wilcox
PRACTICE
Are the following statements correct?
• a+ = aa*
• a? = (a|ε)
REPETITION TIMES
• Use { } to denote the match count
• {m} Denotes exactly m matches
• {m,n} Denotes at least m and at most n matches
• {m,} Denotes at least m matches
• Example
• ca{1,3}ndy matches candy, caandy, caaandy, not caaaandy
• a{2} matches within caandy and caaandy, but not candy
• a{1,3} matches within candy, caandy, caaandy, caaaandy
• a{2,} matches within caandy, caaandy, caaaandy
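The counted examples mix two situations: a pattern that must cover the whole word, and a pattern that only needs to be found somewhere inside it. The sketch below separates the two using Python's `re` module (an assumption in place of grep -E; interval expressions behave the same way).

```python
import re

words = ["candy", "caandy", "caaandy", "caaaandy"]

# Whole-word test: between one and three a's are allowed.
whole = {w: re.fullmatch(r"ca{1,3}ndy", w) is not None for w in words}
print(whole)

# Substring test: a{2} only needs two adjacent a's somewhere in the word.
found = {w: re.search(r"a{2}", w) is not None for w in words}
print(found)

# Inside caaaandy, a{1,3} stops after three of the four a's.
longest = re.search(r"a{1,3}", "caaaandy").group()
print(longest)  # aaa
```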
{ } IN BRE
• { } can also be used in basic regular expressions
• Requires \{ \}
ERE SUMMARY
Pattern   Matches
+         one or more occurrences of the previous character
?         zero or one occurrence of the previous character
|         the alternative operator (logical OR), as the
          delimiter of multiple patterns
(...)     grouping in regular expressions, as in arithmetic
{m}       exactly m matches
{m,n}     at least m and at most n matches
{m,}      at least m matches
EXAMPLE
• [hc]+at
• matches "hat", "cat", "hhat", "chat", "hcat", "cchchat", and
so on, but not "at".
• [hc]?at
• matches "hat", "cat", and "at".
• [hc]*at
• matches "hat", "cat", "hhat", "chat", "hcat", "cchchat", "at",
and so on.
• cat|dog
• matches "cat" or "dog".
PRACTICE
STRING1 Mozilla/4.0 (compatible; MSIE 5.0; Windows NT; DigExt)
STRING2 Mozilla/4.75 [en](X11;U;Linux2.2.16-22 i586)

Pattern           String    Result     Explanation
\(.*l             STRING1   match      Finds the ( and l in (compatible. The opening \
                                       is an escape character used to indicate that
                                       the ( it precedes is a literal (search
                                       character), not a metacharacter.
                  STRING2   no match   Mozilla contains ll but it is not preceded by
                                       an open parenthesis (no match), and Linux has
                                       an upper case L (no match).
W*in              STRING1   match      Finds the Win in Windows.
                  STRING2   match      Finds in in Linux preceded by W zero times,
                                       so a match.
[xX][0-9a-z]{2}   STRING1   no match   Finds x in DigExt but only one t.
                  STRING2   match      Finds X and 11 in X11.
PRACTICE
STRING1 Mozilla/4.0 (compatible; MSIE 5.0; Windows NT; DigExt)
STRING2 Mozilla/4.75 [en](X11;U;Linux2.2.16-22 i586)

Pattern                   String    Result     Explanation
^([L-Z]in)                STRING1   no match   The '^' is an anchor (because it lies outside
                                               any square brackets) indicating first position.
                                               Win does not start the string so no match.
                          STRING2   no match   The '^' is an anchor (because it lies outside
                                               any square brackets) indicating first position.
                                               Linux does not start the string so no match.
((4\.[0-3])|(2\.[0-3]))   STRING1   match      Finds the 4.0 in Mozilla/4.0. The '\.' sequence
                                               uses the escape metacharacter (\) to ensure
                                               that the '.' (dot) is used as a literal in the
                                               search.
                          STRING2   match      Finds the 2.2 in Linux2.2.16-22.
(W|L)in                   STRING1   match      Finds Win in Windows.
                          STRING2   match      Finds Lin in Linux.
MORE REGULAR EXPRESSIONS
• https://en.wikipedia.org/wiki/Regular_expression
• http://www.regular-expressions.info/
• https://en.wikibooks.org/wiki/Regular_Expressions/POSIX-Extended_Regular_Expressions
GREP: SEARCHING FOR A PATTERN
• Unix has a special family of commands for handling
search requirements
• grep scans its input for a pattern, and displays:
• The lines containing the pattern
• The line numbers
• Or the filenames where the pattern occurs
• Format
• $ grep options pattern filenames
• Example
• $ grep "sales" emp.lst
FILTER PROGRAMS
• grep and sed are filter programs
• They do not change a file "in place"
• Produce a new data stream
• To be piped into another command, or
• captured into a new file with output redirection
• grep patterns and sed scripts are interpreted at run time, not compiled ahead of time
• hence they tend to be slower at run time
• Fast in terms of programming time
COMMAND FEATURES
Commands               Standard Input   Standard Output
mkdir, rmdir, cp, rm   No               No
ls, pwd, who           No               Yes
lp, lpr                Yes              No
cat, wc, gzip          Yes              Yes

• Commands in the fourth category are called filters

• dual stream-handling feature
• makes them powerful text manipulators
• Flexible usage
• $ wc < calc.txt > result.txt
• $ wc > result.txt < calc.txt
• $ wc>result.txt<calc.txt
• $ > result.txt < calc.txt wc
GREP AS A FILTER
• Because grep is also a filter, it is able to
• search standard input for the pattern
• store the output in a file
• Example
• $ who | grep henry > foo
• $ grep henry < namelist.txt > foo
SUPPRESS THE FILENAME
• When grep is used with multiple filenames, it displays the
filenames along with the output.
• $ grep 'director' emp1.lst emp2.lst
emp1.lst:1006:gordon lightfoot:director:sales:09/03/38:140000
emp1.lst:6521:derryk o'brien:director:marketing:09/26/45:125000
emp2.lst:9876:bill johnson:director:production:03/12/05:130000
emp2.lst:2365:john woodcook:director:personnel:05/11/47:120000
• To suppress the filenames:
• make grep ignorant of the source of its input
• $ cat emp[12].lst | grep 'director'
• $ grep 'director' < emp1.lst
• or use cut to select all but the first field, with the grep output as its input
QUOTING IN GREP
• Do we need quoting in grep?
• $ grep "sales" emp.lst
• $ who | grep henry > foo
• $ grep 'director' emp1.lst emp2.lst
• Quoting is essential if the search string contains
• more than one word (a space in the search pattern)
• any of the shell's metacharacters
• Example
• $ grep gordon lightfoot emp1.lst
• error: lightfoot: no such file or directory
• emp1.lst:1006:gordon lightfoot:director:sales:09/03/38:140000
• $ grep 'gordon lightfoot' emp1.lst
• 1006:gordon lightfoot:director:sales:09/03/38:140000
SINGLE OR DOUBLE QUOTE
• Principle
• single quotes protect double quotes
• double quotes protect single quotes
• double quotes allow command substitution and variable
evaluation
• Example
• $ grep 'neil o'bryan' emp1.lst
• > (secondary prompt: the shell is still waiting for a closing quote)
• $ grep "neil o'bryan" emp1.lst
• 4290:neil o'bryan:executive:production:09/07/50:65000
• $ grep 'Ted "TK" Kim' emp1.lst
• 2210:Ted "TK" Kim:executive:research:03/01/60:135000
• $ grep "`echo name`" emp1.lst
• $ grep "$USERNAME" emp1.lst
WHEN GREP FAILS
• grep simply returns the prompt when the pattern can't be
located
• $ grep president emp.lst
• $
• A similar behavior to cmp and sed
• the pattern search failed
• but the command itself executed successfully
• exit status
• 0 if at least one line matched
• 1 if no line matched
GREP OPTIONS
Options   Significance
-i        Ignores case for matching
-v        Doesn't display lines matching expression
-n        Displays line numbers along with lines
-c        Displays count of matching lines
-l        Displays list of file names only
-e exp    Specifies expression exp with this option. Can be used
          multiple times. Also used for matching an expression
          beginning with a hyphen
GREP OPTIONS (CONT.)
Options   Significance
-x        Matches pattern with entire line (doesn't match
          embedded patterns)
-f file   Takes patterns from file, one per line
-E        Treats pattern as an extended regular expression (ERE)
-F        Matches multiple fixed strings (no regular expressions)
-num      Displays line and num lines above and below, e.g. -2 (Linux only)
-A n      Displays line and n lines after matching lines (Linux only)
-B n      Displays line and n lines before matching lines (Linux only)
-e
• Matching Multiple Patterns
• $ grep -e gordon -e derryk -e bill emp.lst
1006:gordon lightfoot:director:sales:09/03/38:140000
6521:derryk o'brien:director:marketing:09/26/45:125000
9876:bill johnson:director:production:03/12/05:130000
• Matching an expression beginning with a hyphen
• $ grep "-mtime" filename.txt
grep: invalid option - m
Usage: grep [OPTION] ... PATTERN [FILE]...
Try `grep --help` for more information.
• $ grep -e "-mtime" filename.txt
romeo:55 17 * * 4 find / -name core -mtime +30 -print
USING REGULAR EXPRESSIONS
• Regular expressions introduce efficient pattern matching
• Regular expressions are interpreted by the command, not by
the shell
• Quoting ensures that the shell isn't able to interfere
• grep
• supports basic regular expressions by default
• $ grep "expression" filenames
• supports extended regular expressions with the -E option
• $ grep -E "expression" filenames
• if grep doesn't support -E, use egrep instead
APPLICATIONS
• Listing Only Directories
• $ ls -l | grep "^d"

• Identifying files with write permissions for group users

• $ ls -l | grep "^.....w"
CUT COMMAND
• Review
• $ head -n 5 file.txt
• $ tail -n 3 file.txt
• head and tail slice a file horizontally
• In contrast, cut is the command that slices a file vertically
• cutting columns
• $ cut -c1-4 file.txt
• get the 1st to 4th characters (columns) of each line in the file
• cutting fields
• $ cut -d":" -f1,3 file.txt
• get the 1st and 3rd fields of each line in the file
PRACTICE
• What do these commands do?
• $ grep a b c
• finds a in files b and c
• $ grep <HTML> foo
• does not work, since < and > must be quoted
• $ grep "**" foo
• looks for zero or more *
• matches all lines
• $ grep *
• If * expands to multiple filenames, grep looks for the first
filename in the remaining files.
• If * expands to a single filename, grep searches the
standard input for it.
SUMMARY
• grep is used to search lines from input
• using regular expression to search
SED: ADDRESSING
• Addressing in sed is done in two ways:
• Line Addressing
• by one or two line numbers
• 3,7
• Content Addressing
• By specifying a /-enclosed pattern which occurs in a line
• /From:/
LINE ADDRESSING
• Addressing by line numbers
• Example
• $ sed '3q' emp.lst
• q is the quit action
• quits after line number 3
• $ sed -n '1,2p' emp.lst
• p is the print action
• prints line numbers 1 through 2
• -n suppresses the default behavior of printing all lines
• when using p
• $ sed -n '$p' emp.lst
• $ selects the last line
• prints the last line
LINE ADDRESSING (CONT.)
• More Examples
• $ sed -n '9,11p' emp.lst
• prints line numbers 9 through 11
• $ sed -n '1,2p; 7,9p; $p' emp.lst
• selects multiple groups of lines
• can be written on multiple lines without semicolons
• $ sed -n '1,2p
> 7,9p
> $p' emp.lst
• $ sed -n '3,$!p' emp.lst
• ! negates the action
• prints line numbers 1 through 2
CONTEXT ADDRESSING
• Addressing by contexts (pattern matching)
• Example
• $ sed -n '/From:/p' $HOME/mbox
• prints lines containing From:
• Using regular expressions to help pattern matching
• $ sed -n '/^From:/p' $HOME/mbox
• $ sed -n '/wilco[cx]k*s*/p' emp.lst
• $ sed -n "/o'br[iy][ae]n/p;/lennon/p" emp.lst
• Using double quotes to protect the single quote
• using a semicolon for multiple actions
• Supports only basic regular expressions by default!
CONTEXT ADDRESSING (CONT.)
• Using a comma to select a group of contiguous lines
• $ sed -n '/johnson/,/lightfoot/p' emp.lst
• prints lines from johnson through lightfoot
• what if we have multiple johnson and lightfoot lines?
• each range starts at a johnson line and ends at the next lightfoot
line; a later johnson starts a new range
• $ sed -n '1,/woodcock/p' emp.lst
• supports a mix of line and context addresses
• prints lines from the 1st line through woodcock
SED: WRITING LINES TO A FILE
• Using the w command to write selected lines to a file
• Example
• $ sed '/<FORM>/,/<\/FORM>/w forms.html' *.html
• extracts all FORMs from all html files
• writes these lines to forms.html
• use -n to suppress printing the content of *.html
SED: TEXT EDITING
• sed can insert text and change existing text in a file.
• i – insert
• a – append
• c – change
• d – delete
• Examples:
• $ sed '1i #include <stdio.h>' foo.c > $$; mv $$ foo.c
• 1i means to insert text at line number 1
• output redirected to file $$
• move $$ to foo.c to overwrite foo.c
• echo "#include <stdio.h>" | cat - foo.c > $$; mv $$ foo.c
• Inserting multiple lines by
• $ sed '1i\
> #include <stdio.h>\
> #include <unistd.h>
> ' foo.c > $$; mv $$ foo.c
SED: TEXT EDITING (CONT.)
• Examples:
• $ sed 'a\
>
> ' emp.lst
• Using a (append) without specifying line numbers
• Appends text after every line of the file
• inserts a blank line after each line
• $ sed '
> /WORD/ c\
> new sentence
> ' emp.lst
• replaces each line containing WORD with new sentence
• $ sed '/^#/d' emp.lst
• deletes lines starting with #
SED: SUBSTITUTION
• Substitution is the most important feature of sed
• Usage
• $ sed '[address]s/expression1/expression2/flags' filename(s)
• Description
• expression1 is replaced with expression2 in all lines
specified by [address]
• If the address is not specified, the substitution is performed
on all matching lines
• if flags is set to g, all occurrences in a line are replaced; otherwise only
the first occurrence in each line is replaced
SED: SUBSTITUTION EXAMPLE
• Example
• $ sed 's/:/|/' emp.lst
• replaces only the first instance of : in each line
• $ sed 's/:/|/g' emp.lst
• uses the g (global) flag to replace every :
• $ sed 's/^/2/;s/$/.00/' emp.lst
• using regular expressions: prefixes each line with 2 and appends .00
• $ sed 's/<I>/<EM>/g
> s/<EM>/<STRONG>/g' form.html
• using multiple lines to create multiple substitutions
• sed processes several instructions in a sequential manner.
• Each instruction operates on the output of the previous
instruction.
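The consequence of sequential processing can be imitated outside sed. The sketch below chains two calls to Python's `re.sub` as a rough analogue of the two s///g instructions (an assumption: this is not sed itself, and the input line is invented for illustration).

```python
import re

# Hypothetical line from form.html with one <I> tag and one <EM> tag.
line = "<I>one</I> <EM>two</EM>"

# First instruction: every <I> becomes <EM>.
step1 = re.sub(r"<I>", "<EM>", line)

# Second instruction runs on the OUTPUT of the first, so the <EM> that
# was just created is rewritten again, together with the original <EM>.
step2 = re.sub(r"<EM>", "<STRONG>", step1)

print(step1)  # <EM>one</I> <EM>two</EM>
print(step2)  # <STRONG>one</I> <STRONG>two</EM>
```

No `<EM>` opening tag survives, which is why the order of instructions matters whenever one replacement text can match a later pattern.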
SED OPTIONS
• -e
• lets you use multiple instructions
• $ sed -e '/<FORM>/,/<\/FORM>/w forms.html' \
-e '/<TABLE>/,/<\/TABLE>/w tables.html' \
-e '/<FRAME>/,/<\/FRAME>/w frames.html' \
*.html
• -f
• takes instructions from a file
• when you have a group of instructions to execute
• place them in a file and use sed with the -f option
PRACTICE
• Use sed to insert <HTML> and </HTML> at the beginning
and end of foo.html, respectively
• $ sed -e '1i\<HTML>' -e '$a\</HTML>' foo.html > $$; mv $$ foo.html
• Explain what will happen
• $ sed -e 's/compute/calculate/g' -e 's/computer/host/g' foo
• Solution
• $ sed -e 's/computer/host/g' -e 's/compute/calculate/g' foo
• How to sort a file that is double-spaced (even-numbered lines are
blank lines) and still preserve the blank lines?
• $ sort foo | sed -e '/^ *$/d' -e 'a\
> [blank line]
> '
SUMMARY
• sed is a stream editor in UNIX
• sed uses line numbers and content to address lines
• sed supports the operations print, insert, append, change,
delete, write and substitute
• sed writes its output to standard output

5 pages
Vial Integrity
No ratings yet
Vial Integrity
87 pages
Internship Report Format
No ratings yet
Internship Report Format
13 pages
KET (A2) : Reading and Writing Part 1 Questions 1-6
No ratings yet
KET (A2) : Reading and Writing Part 1 Questions 1-6
6 pages
AVR Microcontroller Programming Guide
100% (3)
AVR Microcontroller Programming Guide
63 pages
2P36784, Plug Valve
100% (1)
2P36784, Plug Valve
34 pages
Coordination and Response: IGCSE Biology Workbook
No ratings yet
Coordination and Response: IGCSE Biology Workbook
10 pages
Unit 10 Modal Auxiliray Verbs in The Past - Other Uses
No ratings yet
Unit 10 Modal Auxiliray Verbs in The Past - Other Uses
5 pages
Liquefaction of Natural Gas Using Single Stage Mixed Refrigerant PRICO Process
No ratings yet
Liquefaction of Natural Gas Using Single Stage Mixed Refrigerant PRICO Process
8 pages
03 Vip 90 Tuan So 16 Bo de Du Doan Dac Biet Phat Trien de Thi Minh Hoa Nam 2025 de So 14
100% (1)
03 Vip 90 Tuan So 16 Bo de Du Doan Dac Biet Phat Trien de Thi Minh Hoa Nam 2025 de So 14
11 pages
Church Parking Lot Repair RFP
100% (1)
Church Parking Lot Repair RFP
10 pages
1 Check of I Shaped Members and Channels Subject To Combined Axial Compression and Flexure
No ratings yet
1 Check of I Shaped Members and Channels Subject To Combined Axial Compression and Flexure
15 pages
Physics Kinematics Practice Set
No ratings yet
Physics Kinematics Practice Set
1 page
Bad Weather Ship Maneuvers Guide
No ratings yet
Bad Weather Ship Maneuvers Guide
6 pages
Burrell Gary Pam 1973 Brazil PDF
No ratings yet
Burrell Gary Pam 1973 Brazil PDF
20 pages
Second Term Past Paper Question
No ratings yet
Second Term Past Paper Question
14 pages


[:alnum:] Alphanumeric characters [a-zA-Z0-9] \p{Alnum}

[:lower:] Lowercase letters [a-z] \p{Lower}

• Find lines beginning with 2

in[du] STRING1 match finds ind in Windows

m STRING1 match finds the m in compatible

\(.*l STRING1 match finds the ( and l in (compatible. The opening \ is

• Commands in the fourth category are called filters

• Identifying files with write permissions for group users
