2 Regular Expression

Uploaded by

Salam Abdulla


University of Sulaimani

College of Science
Department of Computer Science

Computation
Regular Expression
Lecture four

Mzhda Hiwa Hama

2023-2024
Regular Expression

• Regular expressions (shortened to "regex") are used in many programming languages and tools to find and extract patterns in texts and programs.

• A regular expression searches a string for substrings ("matches") that fit a given pattern.

• Regular expressions are useful tools in the design of compilers for programming languages. Elemental objects in a programming language, called tokens, such as variable names and constants, may be described with regular expressions.

• Using regular expressions, we can also specify and validate forms of data such as passwords, e-mail addresses, and user IDs.
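As a small illustration of the two uses above (extracting matches and validating data), here is a minimal Python sketch using the standard `re` module. The sample text and the user-ID rule are assumptions made up for this example, not rules from the lecture.

```python
import re

# Extract every run of digits from a piece of text.
text = "Order 66 shipped 3 items on day 12"
numbers = re.findall(r"[0-9]+", text)
print(numbers)  # ['66', '3', '12']

# Hypothetical user-ID rule (an assumption for this sketch):
# one lowercase letter followed by 2 to 7 lowercase letters or digits.
USER_ID = re.compile(r"[a-z][a-z0-9]{2,7}")

def is_valid_user_id(candidate: str) -> bool:
    """Return True when the whole candidate fits the hypothetical rule."""
    return USER_ID.fullmatch(candidate) is not None

print(is_valid_user_id("salam23"))  # True
print(is_valid_user_id("2salam"))   # False
```

`re.findall` searches inside a longer text, while `re.fullmatch` requires the entire string to fit the pattern, which is usually what validation needs.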
Regular Expression Metacharacters

[ ]   A bracket expression. Matches a single character that is contained within the brackets. For example, [abc] matches "a", "b", or "c". [a-z] specifies a range that matches any lowercase letter from "a" to "z". These forms can be mixed: [abcx-z] matches "a", "b", "c", "x", "y", or "z".

.     Matches any single character (in many tools, any character except a newline). Within a bracket expression, the dot matches a literal dot. For example, a.c matches "abc", "a-c", and so on, but [a.c] matches only "a", ".", or "c".

[^ ]  Matches a single character that is not contained within the brackets. For example, [^abc] matches any character other than "a", "b", or "c". [^a-z] matches any single character that is not a lowercase letter from "a" to "z".
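The three metacharacters above can be tried out with Python's `re` module. This is only a sketch; the helper `matches` is a name introduced for this example.

```python
import re

def matches(pattern: str, text: str) -> bool:
    """True when the pattern matches the whole of text."""
    return re.fullmatch(pattern, text) is not None

# Bracket expression mixing single characters and a range.
print([c for c in "abcdwxyz" if matches(r"[abcx-z]", c)])
# ['a', 'b', 'c', 'x', 'y', 'z']

# Outside brackets the dot is a wildcard; inside brackets it is a literal dot.
print(matches(r"a.c", "abc"), matches(r"a.c", "a.c"))  # True True
print(matches(r"[a.c]", "."), matches(r"[a.c]", "b"))  # True False

# A leading ^ inside brackets negates the set.
print(matches(r"[^a-z]", "A"), matches(r"[^a-z]", "q"))  # True False
```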
Regular Expression Metacharacters (cont'd)

( )   Defines a marked subexpression. The string matched within the parentheses can be recalled later. A marked subexpression is also called a block or capturing group. For example, (abc).

*     Matches the preceding element zero or more times. For example, ab*c matches "ac", "abc", "abbbc", etc. [xyz]* matches "", "x", "y", "z", "zx", "zyx", "xyzzy", and so on. (ab)* matches "", "ab", "abab", "ababab", and so on.

?     Matches the preceding element zero or one time. For example, ab?c matches only "ac" or "abc".

+     Matches the preceding element one or more times. For example, ab+c matches "abc", "abbc", "abbbc", and so on, but not "ac".

|     The choice (also known as alternation or set union) operator matches either the expression before or the expression after the operator. For example, abc|def matches "abc" or "def".
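The quantifiers, grouping, and alternation described above can be checked with a short Python sketch. The helper `matches` and the test strings are assumptions chosen for illustration; `\1` is Python's syntax for recalling the first capturing group.

```python
import re

def matches(pattern: str, text: str) -> bool:
    """True when the pattern matches the whole of text."""
    return re.fullmatch(pattern, text) is not None

samples = ["ac", "abc", "abbc"]
print([s for s in samples if matches(r"ab*c", s)])  # ['ac', 'abc', 'abbc']
print([s for s in samples if matches(r"ab?c", s)])  # ['ac', 'abc']
print([s for s in samples if matches(r"ab+c", s)])  # ['abc', 'abbc']

# The star applies to the whole group (ab).
print([s for s in ["", "ab", "abab", "aba"] if matches(r"(ab)*", s)])
# ['', 'ab', 'abab']

# Alternation picks one side or the other.
print([s for s in ["abc", "def", "abcdef"] if matches(r"abc|def", s)])
# ['abc', 'def']

# Recalling a capturing group: \1 must repeat what (ab) matched.
recalled = re.fullmatch(r"(ab)c\1", "abcab")
print(recalled.group(1))  # ab
```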
Regular Expression Metacharacters (cont'd)

^      Matches the starting position within the string.

$      Matches the ending position of the string or the position just before a string-ending newline.

{n}    Matches the preceding element exactly n times. For example, a{3} matches only "aaa".

{m,n}  Matches the preceding element at least m and not more than n times. For example, a{3,5} matches only "aaa", "aaaa", and "aaaaa".
Formal Definition of Regular Expression
Say that R is a regular expression if R is

1. a, for some a in the alphabet Σ, which represents the language {a};

2. ε, which represents the language {ε};
3. ∅, which represents the empty language { }.

Note: Don’t confuse the regular expressions ε and ∅. The

expression ε represents the language containing a single string
—namely, the empty string—whereas ∅ represents the
language that doesn’t contain any strings.
Regular Expression Operations

1. Union (OR): if R1 and R2 are regular expressions, then (R1 ∪ R2), also written R1 | R2 or R1 + R2, is also a regular expression. L(R1 | R2) = L(R1) ∪ L(R2).

2. Concatenation: if R1 and R2 are regular expressions, then (R1 ◦ R2), also written R1R2 or R1.R2, is also a regular expression. L(R1R2) = L(R1) concatenated with L(R2), that is, {xy | x ∈ L(R1) and y ∈ L(R2)}.

3. Kleene closure (star): if R1 is a regular expression, then (R1*), the Kleene closure of R1, is also a regular expression. L(R1*) = {ε} ∪ L(R1) ∪ L(R1R1) ∪ L(R1R1R1) ∪ …
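The three operations act on languages, that is, on sets of strings. The following Python sketch models finite languages as sets; the function names are chosen for this example, and `star` is cut off at a maximum length because L(R1*) is usually infinite.

```python
from itertools import product

def union(l1: set, l2: set) -> set:
    """L1 ∪ L2: every string that is in either language."""
    return l1 | l2

def concat(l1: set, l2: set) -> set:
    """L1 ◦ L2: every string xy with x in L1 and y in L2."""
    return {x + y for x, y in product(l1, l2)}

def star(lang: set, max_len: int) -> set:
    """The strings of L* whose length is at most max_len."""
    result = {""}          # ε is always in L*
    frontier = {""}
    while frontier:
        longer = {w for w in concat(frontier, lang) if len(w) <= max_len}
        frontier = longer - result
        result |= frontier
    return result

EPSILON = {""}   # the language {ε}: one string, the empty string
EMPTY = set()    # the language ∅: no strings at all

print(union({"0"}, {"1"}) == {"0", "1"})                       # True
print(concat({"0"} | EPSILON, {"1"} | EPSILON))                # the set {ε, 0, 1, 01}
print(sorted(star({"ab"}, 6), key=len))                        # ['', 'ab', 'abab', 'ababab']
print(concat(star({"1"}, 3), EMPTY) == EMPTY)                  # True
```

The last line shows why ε and ∅ must not be confused: concatenating with {ε} leaves a language unchanged, while concatenating with ∅ leaves nothing.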
Regular Expressions and Languages
• The origins of regular expressions lie in automata theory and formal language theory.

• We can use regular expressions (REs) to describe regular languages.

• The value of a regular expression is a language.

• A regular language is one accepted by some finite automaton (FA) or described by some RE.
Note

• In arithmetic, we can use the operations + and × to build up expressions such as (5 + 3) × 4. Similarly, we can use the regular operations to build up expressions describing languages, which are called regular expressions. An example is (0 ∪ 1)0*. The value of the arithmetic expression is the number 32. The value of a regular expression is a language.

• In arithmetic, we say that × has precedence over + to mean that when there is a choice, we do the × operation first. Thus in 2 + 3 × 4, the 3 × 4 is done before the addition. To have the addition done first, we must add parentheses to obtain (2 + 3) × 4. In regular expressions, the star operation is done first, followed by concatenation, and finally union, unless parentheses change the usual order.
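The precedence order (star, then concatenation, then union) can be observed by listing the short strings each pattern matches. This is a sketch in Python; the helper `language` and the alphabets used are assumptions for the example.

```python
import re
from itertools import product

def language(pattern: str, alphabet: str, max_len: int) -> list:
    """All strings over alphabet, up to max_len long, that the pattern matches in full."""
    words = ("".join(p) for n in range(max_len + 1)
             for p in product(alphabet, repeat=n))
    return [w for w in words if re.fullmatch(pattern, w)]

# Star binds tighter than concatenation: ab* means a(b*), not (ab)*.
print(language(r"ab*", "ab", 3))    # ['a', 'ab', 'abb']
print(language(r"(ab)*", "ab", 4))  # ['', 'ab', 'abab']

# Concatenation binds tighter than union: a|bc means a|(bc), not (a|b)c.
print(language(r"a|bc", "abc", 2))    # ['a', 'bc']
print(language(r"(a|b)c", "abc", 2))  # ['ac', 'bc']
```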
Examples

• In the following instances, we assume that the alphabet Σ is {0,1}.
1. 0*10* = {w | w contains a single 1}.
2. Σ*1Σ* = {w | w has at least one 1}. Here Σ* = (0+1)*.
3. Σ*001Σ* = {w | w contains the string 001 as a substring}.
4. (ΣΣ)* = {w | w is a string of even length}.
5. (ΣΣΣ)* = {w | the length of w is a multiple of 3}.
6. 01 ∪ 10 = {01, 10}.
7. (0 ∪ ε)(1 ∪ ε) = {ε, 0, 1, 01}.
8. 1*∅ = ∅. Concatenating the empty set to any set yields the empty set.
9. (0 ∪ 1)* consists of all possible strings of 0s and 1s.
10. (0Σ*) ∪ (Σ*1) consists of all strings that start with 0 or end with 1.
11. The set of strings over {0,1} that end in three consecutive 1s: (0 | 1)*111
12. The set of strings over {0,1} that have at most one 1: 0* | 0*10*
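One way to gain confidence in examples like these is to compare each expression with its set description on every short string. The sketch below does this in Python for ten of the examples, writing Σ as `[01]` and ε as an empty alternative; the names and the length bound of 8 are assumptions of this sketch, and agreement on short strings is evidence, not a proof.

```python
import re
from itertools import product

def all_strings(max_len: int):
    """Yield every string over {0,1} of length 0..max_len."""
    for n in range(max_len + 1):
        for letters in product("01", repeat=n):
            yield "".join(letters)

# (pattern in Python syntax, condition from the set description)
EXAMPLES = [
    (r"0*10*",           lambda w: w.count("1") == 1),
    (r"[01]*1[01]*",     lambda w: "1" in w),
    (r"[01]*001[01]*",   lambda w: "001" in w),
    (r"([01][01])*",     lambda w: len(w) % 2 == 0),
    (r"([01][01][01])*", lambda w: len(w) % 3 == 0),
    (r"01|10",           lambda w: w in {"01", "10"}),
    (r"(0|)(1|)",        lambda w: w in {"", "0", "1", "01"}),
    (r"0[01]*|[01]*1",   lambda w: w.startswith("0") or w.endswith("1")),
    (r"(0|1)*111",       lambda w: w.endswith("111")),
    (r"0*|0*10*",        lambda w: w.count("1") <= 1),
]

def disagreements(max_len: int = 8) -> list:
    """Return (pattern, string) pairs where the regex and the description differ."""
    bad = []
    for pattern, condition in EXAMPLES:
        for w in all_strings(max_len):
            if (re.fullmatch(pattern, w) is not None) != condition(w):
                bad.append((pattern, w))
    return bad

print(disagreements())  # []
```

The same harness can be reused for other expressions by adding a pattern together with a Python condition that states the intended language.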
Homework

• Write a regular expression for each of the following languages:

1. {w | w starts with a 0 or a 1, followed by any number of 0s}
2. {w | w contains the string 101 as a substring}
3. {w | w starts with the string 11 and ends with 10}
4. {w | w starts and ends with the same symbol}
5. {w | w contains at least three 1s}
Equivalence with Finite Automata
• Every regular language is FA-recognizable: any RE can be converted into a finite automaton that recognizes the language it describes, and vice versa. Recall that a regular language is one that is recognized by some finite automaton.

• Note: A language is regular if and only if some regular expression describes it.
Example 1

• We convert the regular expression (ab ∪ a)* to an NFA in a sequence of stages. We build up from the smallest subexpressions to larger subexpressions until we have an NFA for the original expression, as shown in the following diagram.

[Figure: stages of building an NFA for (ab ∪ a)*]
Example 2

• (a ∪ b)*aba
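To make the link between the two expressions and automata concrete, here is a Python sketch that simulates one small NFA for each. These NFAs are hand-simplified, without ε-transitions, and are assumptions of this sketch; they are not the staged construction from the slides' diagrams. Each NFA is compared with Python's `re` on all short strings.

```python
import re
from itertools import product

def nfa_accepts(transitions: dict, start: str, accepting: set, word: str) -> bool:
    """Simulate an NFA without ε-moves by tracking the set of reachable states."""
    current = {start}
    for symbol in word:
        current = {nxt for state in current
                   for nxt in transitions.get((state, symbol), ())}
    return bool(current & accepting)

# NFA for (ab ∪ a)*: from q0, reading 'a' either stays in q0 (the "a" branch)
# or moves to q1, which needs a 'b' to return to q0 (the "ab" branch).
NFA_1 = {("q0", "a"): {"q0", "q1"}, ("q1", "b"): {"q0"}}

# NFA for (a ∪ b)*aba: loop in q0, then guess where the final "aba" begins.
NFA_2 = {("q0", "a"): {"q0", "q1"}, ("q0", "b"): {"q0"},
         ("q1", "b"): {"q2"}, ("q2", "a"): {"q3"}}

def agrees(transitions: dict, accepting: set, pattern: str, max_len: int = 7) -> bool:
    """Check that the NFA and the regex accept the same strings up to max_len."""
    for n in range(max_len + 1):
        for letters in product("ab", repeat=n):
            w = "".join(letters)
            by_nfa = nfa_accepts(transitions, "q0", accepting, w)
            by_regex = re.fullmatch(pattern, w) is not None
            if by_nfa != by_regex:
                return False
    return True

print(agrees(NFA_1, {"q0"}, r"(ab|a)*"))    # True
print(agrees(NFA_2, {"q3"}, r"(a|b)*aba"))  # True
```

Tracking a set of current states is what lets a deterministic program follow all of the NFA's choices at once.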
Look Ahead and Look Behind
(collectively called "lookaround")

You can place assertions such as lookahead or lookbehind in your pattern to ensure that a substring does or does not occur at a given position. These "lookaround" assertions are written as a parenthesized group that contains the substring to check for and whose leading characters are:

• ?= (positive lookahead),
• ?! (negative lookahead),
• ?<= (positive lookbehind),
• ?<! (negative lookbehind).
Look Ahead and Look Behind…
cont’d
• Use ?! (negative lookahead) when a specific substring must not appear at the beginning of the string.

• Ex: ^(?!101)[01]* // does not have 101 at the beginning of the string.
Look Ahead and Look Behind…
cont’d

• Use ?= (positive lookahead) when a specific substring must appear at the beginning of the string.

Ex: ^(?=101)[01]* // the string must have 101 at its beginning.
Look Ahead and Look Behind…
cont’d
• Use ?<! (negative lookbehind) when a specific substring must not appear at the end of the string.
Ex: ^[01]*(?<!101)$ // does not end with 101

• Use ?<= (positive lookbehind) when a specific substring must appear at the end of the string.

Ex: ^[01]*(?<=101)$ // must end with 101

• Note: when a lookbehind is used to test how a string ends, always anchor the end position with $.
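The four lookaround patterns from these slides can be tried in Python, whose `re` module supports all four forms (lookbehind needs a fixed-width substring, which 101 is). In this sketch the dictionary keys, the helper `check`, and the sample strings are assumptions for illustration; `re.fullmatch` is used so that the whole string must be a binary string.

```python
import re

PATTERNS = {
    "does not start with 101": r"^(?!101)[01]*",
    "starts with 101":         r"^(?=101)[01]*",
    "does not end with 101":   r"^[01]*(?<!101)$",
    "ends with 101":           r"^[01]*(?<=101)$",
}

def check(description: str, text: str) -> bool:
    """True when the whole text satisfies the named lookaround pattern."""
    return re.fullmatch(PATTERNS[description], text) is not None

for text in ["10100", "00101", "101", "0110"]:
    satisfied = [d for d in PATTERNS if check(d, text)]
    print(text, satisfied)
```

A lookaround only tests the text next to the current position and consumes no characters, which is why [01]* still has to match the full string in each pattern.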
