Course 3 Si 4

The document discusses formal languages and regular expressions. It begins by defining a finite automaton as a 5-tuple (Q,Σ,δ,q0,F) where Q is a finite set of states, Σ is a finite alphabet, δ is the transition function, q0 is the initial state, and F is the set of final states. It then defines the language accepted by a finite automaton as the set of strings that can reach a final state from the initial state through the transition function. It proves that for any regular grammar there exists an equivalent finite automaton recognizing the same language, and vice versa. It also defines regular expressions and shows that regular sets and languages recognized by regular expressions are the

Uploaded by

Veko Boy
Course 3&4

Formal Languages
- Basic notions -

S. Motogna - FL&CD
Regular languages

S.Motogna - Formal Languages & Compiler Design

Why?
1. Search engines – the success of Google
2. Unix commands
3. Programming languages – new feature

[Diagram: Reg exp, Finite Automata, Regular grammars]
[Figure: plan of the door machinery; the labels visible in the drawing are the letters u, d, a, n, r, c]

Problem: The door to the tower is kept closed by the Red Dragon, using a
complicated machinery. Prince Charming has managed to steal the
plans and is asking for your help. Can you help him determine all
the person names that can unlock the door?
Finite Automata
• Intuitive model
[Figure: input tape holding the symbols a n a (over Σ), read by a control unit (CU) that is in state q]

Definition: A finite automaton (FA) is a 5-tuple
M = (Q,Σ,δ,q0,F)
where:
• Q - finite set of states (|Q|<∞)
• Σ - finite alphabet (|Σ|<∞)
• δ – transition function : δ:Q×Σ→P(Q)
• q0 – initial state q0 ∊ Q
• F⊆Q – set of final states

Remarks

1. Q∩Σ=∅
2. δ: Q×Σ→P(Q); ε ∈ Σ^0, so ε is not a symbol of Σ and a relation δ(q,ε)=p is NOT allowed
3. If |δ(q,a)|≤1 for all q∈Q, a∈Σ => deterministic finite automaton (DFA)
4. If |δ(q,a)|>1 for some q∈Q, a∈Σ (more than one state obtained as result) =>
nondeterministic finite automaton (NFA)

Property: For any NFA M there exists a DFA M’ equivalent to M
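
The slide states the property without a construction. One standard way to obtain M’ is the subset construction, sketched below under stated assumptions: the dictionary encoding of δ, the function names and the sample NFA (strings ending in "ab") are illustrative choices, not taken from the course.

```python
from itertools import chain

def nfa_to_dfa(alphabet, delta, q0, finals):
    """Subset construction: each DFA state is a frozenset of NFA states."""
    start = frozenset([q0])
    dfa_delta, seen, todo = {}, {start}, [start]
    while todo:
        S = todo.pop()
        for a in alphabet:
            # union of delta(q, a) over all NFA states q in S
            T = frozenset(chain.from_iterable(delta.get((q, a), ()) for q in S))
            if not T:
                continue              # no move: leave the DFA transition undefined
            dfa_delta[(S, a)] = T
            if T not in seen:
                seen.add(T)
                todo.append(T)
    dfa_finals = {S for S in seen if S & frozenset(finals)}
    return seen, dfa_delta, start, dfa_finals

def dfa_accepts(dfa_delta, start, dfa_finals, w):
    state = start
    for a in w:
        state = dfa_delta.get((state, a))
        if state is None:             # stuck with unread input
            return False
    return state in dfa_finals

# Hypothetical NFA over {a, b} accepting the strings that end in "ab".
NFA_DELTA = {("s", "a"): {"s", "t"}, ("s", "b"): {"s"}, ("t", "b"): {"u"}}
STATES, DFA_DELTA, START, DFA_FINALS = nfa_to_dfa("ab", NFA_DELTA, "s", {"u"})
```

Only the reachable subsets are generated, so the resulting DFA has at most one successor per (state, symbol) pair, as required by remark 3.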

Configuration C=(q,x)
where:
- q state
- x unread sequence from input: x ∊ ∑*

Initial configuration : (q0,w) , w - whole sequence


Final configuration: (qf ,ε) , qf ∈ F, ε –empty sequence
(corresponds to accept)

Relations between configurations
• ⊢ move / transition (simple, one step)
(q,ax) ⊢ (p,x) , p ∈ δ(q,a)

• ⊢^k : k move (a sequence of k simple transitions) C0 ⊢ C1 ⊢ ... ⊢ Ck
• ⊢^+ : + move
C ⊢^+ Cʹ : ∃ k>0 such that C ⊢^k Cʹ
• ⊢^* : * move (star move)
C ⊢^* Cʹ : ∃ k≥0 such that C ⊢^k Cʹ

Definition: The language accepted by the FA M = (Q,Σ,δ,q0,F) is:
L(M) = { w ∈ Σ* | (q0,w) ⊢^* (qf,ε) , qf ∈ F }

Remarks
1. Two finite automata M1 and M2 are equivalent if and only if they
accept the same language:
L(M1)=L(M2)
2. ε ∈ L(M) ⇔ q0 ∈ F (initial state is final state)
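
The definition of L(M) can be turned into a membership test by following all possible moves at once. This is a minimal sketch, assuming δ is stored as a dictionary from (state, symbol) to a set of states; the function name and the sample automaton (an even number of 1s) are hypothetical.

```python
def accepts(delta, q0, finals, w):
    """True iff (q0, w) |-* (qf, eps) for some qf in F."""
    current = {q0}                    # states reachable after the prefix read so far
    for a in w:
        current = {p for q in current for p in delta.get((q, a), set())}
    return bool(current & set(finals))

# Hypothetical automaton over {0, 1}: accepts words with an even number of 1s.
EVEN_DELTA = {
    ("even", "0"): {"even"}, ("even", "1"): {"odd"},
    ("odd", "0"): {"odd"},   ("odd", "1"): {"even"},
}
```

With `q0 = "even"` and `F = {"even"}` the empty word is accepted, which illustrates the remark that ε ∈ L(M) exactly when the initial state is final.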

Representing FA
1. List of all elements
2. Table
3. Graphical representation

[Figure: transition graph with states p, q, r and edges labelled a and b]

List of all elements:
M=(Q,Σ,δ,p,F)
Q = {p,q,r}
Σ = {a,b}
F = {r}
δ(p,a) = q
δ(q,a) = q
δ(q,b) = r
δ(p,b) = r

Table:
      a   b
p     q   r
q     q   r
r     -   -

(p,aab) ⊢ (q,ab) ⊢ (q,b) ⊢ (r,ε) => aab accepted
(p,aba) ⊢ (q,ba) ⊢ (r,a) => aba not accepted
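
The two runs on aab and aba can be reproduced mechanically from the table. The sketch below encodes the table of this example as a dictionary; the function name `trace` and the encoding are illustrative assumptions.

```python
# Transition table of the example automaton (missing entries are undefined).
DELTA = {("p", "a"): "q", ("p", "b"): "r", ("q", "a"): "q", ("q", "b"): "r"}
FINALS = {"r"}

def trace(w, state="p"):
    """Return the list of configurations (state, unread input) and the verdict."""
    configs = [(state, w)]
    while w:
        nxt = DELTA.get((state, w[0]))
        if nxt is None:
            return configs, False     # stuck with unread input: not accepted
        state, w = nxt, w[1:]
        configs.append((state, w))
    return configs, state in FINALS

for word in ("aab", "aba"):
    steps, ok = trace(word)
    print(" |- ".join(f"({q},{x or 'eps'})" for q, x in steps),
          "=> accepted" if ok else "=> not accepted")
```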

Remember
• Finite automaton

M = (Q,Σ,δ,q0,F)

L(M) = { w ∈ Σ* | (q0,w) ⊢^* (qf,ε) , qf ∈ F }



Regular grammars
• G = (N, 𝚺, P, S) right linear grammar if
∀p∊P: A→aB or A→b, where A,B ∊ N and a,b ∊ 𝚺
• G = (N, 𝚺, P, S) regular grammar if
  • G is right linear grammar
  and
  • A→𝜀 ∉ P, with the exception that S→𝜀 ∊ P, in which case S does not appear in
  the rhs (right hand side) of any other production

Examples:
S->aA|ε; A->a          reg
S->aS|aA; A->bS|b      reg
S->aA; A->aA|ε         NOT reg
S->aA|ε; A->aS         NOT reg

• L(G) = {w ∊ 𝚺* | S ⇒^* w} - right linear language
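
The two conditions of the definition can be checked mechanically. The following is a small sketch, assuming single-character symbols and an encoding in which productions are a dictionary from a nonterminal to a list of right-hand-side strings, with "" standing for ε; the function name is hypothetical. The four grammars are the slide's examples.

```python
def is_regular(N, sigma, P, S):
    """Check: every production is A->aB or A->b, and eps only as S->eps with S never on a rhs."""
    start_on_rhs = any(S in rhs for rhss in P.values() for rhs in rhss)
    for A, rhss in P.items():
        for rhs in rhss:
            if rhs == "":                                   # A -> eps
                if A != S or start_on_rhs:
                    return False
            elif len(rhs) == 1:                             # A -> b
                if rhs not in sigma:
                    return False
            elif len(rhs) == 2:                             # A -> aB
                if rhs[0] not in sigma or rhs[1] not in N:
                    return False
            else:
                return False
    return True

N, SIGMA = {"S", "A"}, {"a", "b"}
G1 = {"S": ["aA", ""], "A": ["a"]}          # reg
G2 = {"S": ["aS", "aA"], "A": ["bS", "b"]}  # reg
G3 = {"S": ["aA"], "A": ["aA", ""]}         # NOT reg: A -> eps
G4 = {"S": ["aA", ""], "A": ["aS"]}         # NOT reg: S -> eps and S on a rhs
```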



Theorem 1: For any regular grammar G=(N, 𝚺, P, S) there exists
a FA M=(Q, 𝚺, 𝛿, q0,F) such that L(G) = L(M)
Proof: construct M based on G
Q = N ∪ {K}, K ∉ N
q0 = S
F = {K} ∪ {S | if S→𝜺 ∊ P}
𝛿: if A→aB ∊ P then B ∊ 𝛿(A,a)
   if A→a ∊ P then K ∊ 𝛿(A,a)
Prove that L(G) = L(M) (w∈L(G) ⇔ w∈L(M)):
S ⇒^* w ⇔ (S,w) ⊢^* (qf,𝜺) , qf∈F
w = 𝜺: S ⇒^* 𝜺 ⇔ (S,𝜺) ⊢^* (S,𝜺) – true
w = a1a2. . .an: S ⇒^* w ⇔ (S,w) ⊢^* (K,𝜺)
S ⇒ a1A1 ⇒ a1a2A2 ⇒ . . . ⇒ a1a2. . .an−1An−1 ⇒ a1a2. . .an−1an
S ⇒ a1A1 exists if S → a1A1 ∊ P and then A1 ∊ δ(S,a1)
A1 → a2A2 : A2 ∊ δ(A1,a2) . . .
An−1 → an : K ∊ δ(An−1,an)
(S,a1a2. . .an) ⊢ (A1,a2. . .an) ⊢ (A2,a3. . .an) ⊢ . . . ⊢ (An−1,an) ⊢ (K,𝜺) , K∈F
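
The construction in the proof is short enough to run. This is a minimal sketch, assuming single-character symbols, productions given as a dictionary from a nonterminal to a list of right-hand-side strings ("" for ε), and δ stored as sets of states; the helper names and the sample grammar S → aS | aA, A → bS | b are illustrative.

```python
def grammar_to_fa(N, P, S, K="K"):
    """Build M from G: Q = N + {K}, q0 = S, F = {K} (+ {S} if S -> eps)."""
    assert K not in N
    delta = {}
    for A, rhss in P.items():
        for rhs in rhss:
            if len(rhs) == 2:                 # A -> aB  gives  B in delta(A, a)
                delta.setdefault((A, rhs[0]), set()).add(rhs[1])
            elif len(rhs) == 1:               # A -> a   gives  K in delta(A, a)
                delta.setdefault((A, rhs), set()).add(K)
    finals = {K} | ({S} if "" in P.get(S, []) else set())
    return set(N) | {K}, delta, S, finals

def accepts(delta, q0, finals, w):
    current = {q0}
    for a in w:
        current = {p for q in current for p in delta.get((q, a), set())}
    return bool(current & finals)

# Sample regular grammar: S -> aS | aA, A -> bS | b
Q, DELTA, Q0, F = grammar_to_fa({"S", "A"}, {"S": ["aS", "aA"], "A": ["bS", "b"]}, "S")
```

The resulting automaton is nondeterministic in general (here δ(S,a) = {S, A}), which is why δ is kept set-valued.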



Theorem 2: For any FA M=(Q, 𝚺, 𝛿, q0,F) there exists a right
linear grammar G=(N, 𝚺, P, S) such that L(G) = L(M)
Proof: construct G based on M
N = Q
S = q0
P: if 𝛿(q,a) = p then q→ap ∊ P
   if, in addition, p ∊ F then q→a ∊ P
   if q0 ∊ F then S→𝜺 ∊ P
Prove that L(M) = L(G) (w∈L(M) ⇔ w∈L(G)):
P(i): q ⇒^(i+1) x ⇔ (q,x) ⊢^i (qf,𝜺) , qf∈F - prove by induction
Apply P: q0 ⇒^(i+1) w ⇔ (q0,w) ⊢^i (qf,𝜺) , qf∈F
If i=0: q ⇒ x ⇔ (q,x) ⊢^0 (qf,𝜺) (x=𝜺, q=qf); q ⇒ 𝜺 ⇔ q0→𝜺 , q0∈F
Assume ∀ k≤i P is true:
q ⇒^(i+1) x ⇔ (q,x) ⊢^i (qf,𝜺)
For q ∊ N apply ”⇒”: q ⇒ ap ⇒^i ax
If q ⇒ ap then 𝛿(q,a) = p; if p ⇒^i x then (p,x) ⊢^(i-1) (qf,𝜺) , qf∈F
THEN (q,ax) ⊢^i (qf,𝜺) , qf∈F
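
The grammar built in the proof can be generated and tested directly. The sketch below assumes δ is a dictionary from (state, symbol) to a set of states and stores each production as a triple (A, a, B), with B = None for A → a and ("q0", "", None) for S → ε; these encodings and the function names are assumptions made for the illustration.

```python
def fa_to_grammar(delta, q0, finals):
    """N = Q, S = q0; q -> a p for every move, q -> a if p is final, S -> eps if q0 is final."""
    P = set()
    for (q, a), targets in delta.items():
        for p in targets:
            P.add((q, a, p))                  # q -> a p
            if p in finals:
                P.add((q, a, None))           # q -> a
    if q0 in finals:
        P.add((q0, "", None))                 # S -> eps
    return P

def derives(P, S, w):
    """Check S =>* w by tracking which nonterminals can end the sentential form."""
    current = {S}
    for i, a in enumerate(w):
        if i == len(w) - 1 and any((A, a, None) in P for A in current):
            return True
        current = {B for (A, x, B) in P if A in current and x == a and B is not None}
    return any((A, "", None) in P for A in current)

# Small automaton with states p, q, r and final state r (accepts a*b).
DELTA = {("p", "a"): {"q"}, ("q", "a"): {"q"}, ("q", "b"): {"r"}, ("p", "b"): {"r"}}
PRODS = fa_to_grammar(DELTA, "p", {"r"})
```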



Regular sets
Definition: Let 𝚺 be a finite alphabet. We define regular sets over 𝚺
recursively in the following way:
1. 𝞥 is a regular set over 𝚺 (empty set)
2. {𝞮} is a regular set over 𝚺
3. {a} is a regular set over 𝚺, ∀ a∊𝚺
4. If P, Q are regular sets over 𝚺, then P∪Q, PQ, P* are regular sets
over 𝚺
5. Nothing else is a regular set over 𝚺



Regular expressions
Definition: Let 𝚺 be a finite alphabet. We define regular expressions
over 𝚺 recursively in the following way:
1. 𝞥 is a regular expression denoting the regular set 𝞥 (empty set)
2. 𝞮 is a regular expression denoting the regular set {𝞮}
3. a is a regular expression denoting the regular set {a}, ∀ a∊𝚺
4. If p,q are regular expressions denoting the regular sets P, Q then:
• p+q is a regular expression denoting the regular set P∪Q,
• pq is a regular expression denoting the regular set PQ,
• p* is a regular expression denoting the regular set P*
5. Nothing else is a regular expression
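
The recursive definition maps directly onto a small syntax tree. Since P* is usually infinite, the sketch below computes only the strings of length at most n in the denoted regular set; the class names and the function `lang` are hypothetical, chosen for this illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Empty:            # the expression for the empty set
    pass

@dataclass(frozen=True)
class Eps:              # the expression for {eps}
    pass

@dataclass(frozen=True)
class Sym:              # a single symbol a
    a: str

@dataclass(frozen=True)
class Plus:             # p + q
    p: object
    q: object

@dataclass(frozen=True)
class Cat:              # pq
    p: object
    q: object

@dataclass(frozen=True)
class Star:             # p*
    p: object

def lang(r, n):
    """Strings of length <= n in the regular set denoted by r."""
    if isinstance(r, Empty):
        return set()
    if isinstance(r, Eps):
        return {""}
    if isinstance(r, Sym):
        return {r.a} if n >= 1 else set()
    if isinstance(r, Plus):
        return lang(r.p, n) | lang(r.q, n)
    if isinstance(r, Cat):
        return {u + v for u in lang(r.p, n) for v in lang(r.q, n) if len(u + v) <= n}
    if isinstance(r, Star):
        base = lang(r.p, n) - {""}
        result, frontier = {""}, {""}
        while frontier:               # keep concatenating until nothing new fits in n
            frontier = {u + v for u in frontier for v in base if len(u + v) <= n} - result
            result |= frontier
        return result
    raise TypeError(r)
```

For example, `lang(Star(Plus(Sym("a"), Sym("b"))), 2)` yields all words over {a, b} of length at most 2.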
Remarks:
1. p^+ = pp*
2. Use parentheses to avoid ambiguity
3. Priority of operations: *, concat, + (from high to low)
4. For each regular set we can find at least one regular exp to denote
it (there are infinitely many reg exp denoting it)
5. For each regular exp, we can construct the corresponding regular
set
6. Two regular expressions are equivalent iff they denote the same
regular set



Algebraic properties of regular exp
Let 𝛂, 𝛃, 𝛄 be regular expressions.
1. 𝛂 + 𝛃 = 𝛃 + 𝛂
2. 𝞥* = 𝞮
3. 𝛂 + (𝛃 + 𝛄) = (𝛂 + 𝛃) + 𝛄
4. 𝛂(𝛃𝛄) = (𝛂𝛃)𝛄
5. 𝛂(𝛃 + 𝛄) = 𝛂𝛃 + 𝛂𝛄
6. (𝛂 + 𝛃)𝛄 = 𝛂𝛄 + 𝛃𝛄
7. 𝛂𝞮 = 𝞮𝛂 = 𝛂
8. 𝞥𝛂 = 𝛂𝞥 = 𝞥
9. 𝛂* = 𝛂 + 𝛂*
10. (𝛂*)* = 𝛂*
11. 𝛂 + 𝛂 = 𝛂
12. 𝛂 + 𝞥 = 𝛂
Reg exp equations
• Normal form: X = aX + b
where a,b – reg exp
• Solution: X = a*b
Check: a(a*b) + b = (aa* + 𝛆)b = a*b

• System of reg exp equations:
X = a1 X + a2 Y + a3
Y = b1 X + b2 Y + b3
• Solution: Gauss method (replace Xi and solve Xn)



Why?

[Diagram: Reg exp, Maths, Finite Automata, Regular grammars]



Prop: Regular sets are right linear languages
Lemma 1: 𝞥, {𝞮}, {a}, ∀a∊𝚺 are right linear languages

Proof: constructive
i. G = ({S}, 𝚺, 𝞥, S) – regular grammar such that L(G) = 𝞥

ii. G = ({S}, 𝚺,{S→𝞮}, S) – regular grammar such that L(G) ={𝞮}

iii. G = ({S}, 𝚺,{S→a}, S) – regular grammar such that L(G) ={a}



Lemma 2: If L1 and L2 are right linear languages then:
L1 ∪ L2, L1L2 and L1* are right linear languages.

Proof: constructive
L1,L2 right linear languages => ∃G1, G2 such that
G1 = (N1, 𝚺1,P1,S1) and L1 = L(G1)
G2 = (N2, 𝚺2,P2,S2) and L2 = L(G2) assume N1∩N2 = ∅



i. G3 = (N3, 𝚺3, P3, S3)

N3 = N1 ∪ N2 ∪ {S3}; ∑3 = ∑1 ∪ ∑2

P3 = P1 ∪ P2 ∪
     {S3→𝛂1 | S1→𝛂1 ∊ P1} ∪ {S3→𝛂2 | S2→𝛂2 ∊ P2}
(instead of S3→S1 | S2, which is not a right linear production)

G3 – right linear grammar
and
L(G3) = L(G1) ∪ L(G2) PROOF!!! Homework
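
The union construction can be tried on two small grammars. This is a sketch under stated assumptions: a grammar is a tuple (N, Σ, P, S), productions are a dictionary from a nonterminal to a list of right-hand sides written as tuples, `()` stands for ε, and the new start symbol copies the right-hand sides of S1 and S2. The names `union_grammar`, `derives` and the two sample grammars are hypothetical.

```python
def union_grammar(G1, G2, S3="S3"):
    """G3 with a new start symbol S3 that copies every rhs of S1 and of S2."""
    N1, sig1, P1, S1 = G1
    N2, sig2, P2, S2 = G2
    assert not (N1 & N2) and S3 not in (N1 | N2)
    P3 = {A: list(rhss) for A, rhss in {**P1, **P2}.items()}
    P3[S3] = list(P1.get(S1, [])) + list(P2.get(S2, []))
    return N1 | N2 | {S3}, sig1 | sig2, P3, S3

def derives(G, w):
    """Membership test for a regular grammar (eps only via S -> eps)."""
    _, _, P, S = G
    if w == "":
        return () in P.get(S, [])
    current, done = {S}, False
    for a in w:
        done = any(rhs == (a,) for A in current for rhs in P.get(A, []))
        current = {rhs[1] for A in current for rhs in P.get(A, [])
                   if len(rhs) == 2 and rhs[0] == a}
    return done

# G1: S -> aA | eps, A -> a   (language {eps, aa});   G2: X -> bX | b   (language b+)
G1 = ({"S", "A"}, {"a"}, {"S": [("a", "A"), ()], "A": [("a",)]}, "S")
G2 = ({"X"}, {"b"}, {"X": [("b", "X"), ("b",)]}, "X")
G3 = union_grammar(G1, G2)
```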



ii. G4 = (N4, 𝚺4, P4, S4)

N4 = N1 ∪ N2; S4 = S1; ∑4 = ∑1 ∪ ∑2

P4 = {A→aB | if A→aB ∊ P1} ∪
     {A→aS2 | if A→a ∊ P1} ∪
     P2 ∪
     {S1→𝛂2 | if S1→𝝴 ∊ P1 and S2→𝛂2 ∊ P2}

G4 – right linear grammar
and
L(G4) = L(G1) L(G2) PROOF!!! Homework



iii. G5 = (N5, 𝚺1, P5, S5)
//IDEA: concatenate L1 with itself
N5 = N1 ∪ {S5};

P5 = P1 ∪ {S5→𝝴} ∪
     {S5→𝛂1 | S1→𝛂1 ∊ P1} ∪
     {A→aS1 | if A→a ∊ P1}

G5 – right linear grammar
and
L(G5) = L(G1)* PROOF!!! Homework



Theorem: A language is a regular set if and
only if it is a right linear language
Proof:
=> Apply lemma 1 and lemma 2
<= construct a system of regular exp equations where:
- Indeterminates – nonterminals
- Coefficients – terminals
- Equation for A: all the possible rewritings of A
Regular exp = solution corresponding to S
Example: G=({S,A,B},{0,1}, P, S)
P: S → 0A | 1B | 𝝴        S = 0A + 1B + 𝝴
   A → 0B | 1A            A = 0B + 1A
   B → 0S | 1             B = 0S + 1
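
The slide stops at the system of equations. Solving it with the rule X = aX + b => X = a*b gives A = 1*0B, then S = (01*0 + 1)B + ε, and after substituting B = 0S + 1 one obtains S = ((01*0 + 1)0)*((01*0 + 1)1 + ε). This solution is a derivation added here, not taken from the slide, so the sketch below cross-checks it by brute force against the grammar, using Python's `re` syntax (`|` for +, `?` for "+ ε"); all names are illustrative.

```python
import re
from itertools import product

# Productions of the example grammar; "" stands for eps.
P = {"S": ["0A", "1B", ""], "A": ["0B", "1A"], "B": ["0S", "1"]}

# Candidate solution for S obtained by substitution (see the lead-in).
SOLUTION = re.compile(r"((01*0|1)0)*((01*0|1)1)?")

def derives(w):
    """Check S =>* w by tracking the nonterminals that can end the sentential form."""
    current, done = {"S"}, False
    for a in w:
        done = any(rhs == a for A in current for rhs in P[A])
        current = {rhs[1] for A in current for rhs in P[A]
                   if len(rhs) == 2 and rhs[0] == a}
    return done or any("" in P[A] for A in current)

def agree_up_to(n):
    """True iff grammar and regular expression agree on every word of length <= n."""
    for k in range(n + 1):
        for letters in product("01", repeat=k):
            w = "".join(letters)
            if derives(w) != (SOLUTION.fullmatch(w) is not None):
                return False
    return True
```

A bounded comparison such as `agree_up_to(8)` is only evidence, not a proof, that the expression denotes L(G).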



Theorem: A language is a
regular set if and only if it is
accepted by a FA

Proof:
=> Apply lemma 1’ and lemma 2’ (to follow, similar to RG)
<= construct a system of regular exp equations where:
- Indeterminates – states
- Coefficients – terminals
- Equation for A: all the possibilities that put the FA in
state A
- Equation of the form: X=Xa+b => solution X=ba*
Regular exp = union of solutions corresponding
to final states
Example:
q1 = q3 0 + 𝝴
q2 = q1 0 + q1 1 + q2 0 + q3 0
q3 = q2 1



Lemma 1’: 𝞥, {𝞮}, {a}, ∀a∊𝚺 are accepted by FA
Reg exp FA
𝞥 M = (Q, 𝚺, 𝛿, q0, 𝞥)
𝞮 M = (Q, 𝚺, 𝞥, q0, {q0})
a,∀a∊𝚺 M = ({q0,q1}, 𝚺, {𝛿(q0,a) = q1}, q0, {q1})



Lemma 2’:If L1 and L2 are accepted by a FA then:
L1 ∪ L2, L1L2 and L1* are accepted by FA
Proof:
M1 = (Q1, 𝚺1, 𝛿1, q01, F1) such that L1= L(M1)
M2 = (Q2, 𝚺2, 𝛿2, q02, F2) such that L2 = L(M2)

M3 = (Q3, 𝚺3, 𝛿3, q03, F3)

Q3 = Q1 ∪ Q2 ∪ {q03}; ∑3 = ∑1 ∪ ∑2
F3 = F1 ∪ F2 ∪ {q03 | if q01 ∊ F1 or q02 ∊ F2}
𝛿3 = 𝛿1 ∪ 𝛿2 ∪ {𝛿3(q03,a) = p | ∃𝛿1(q01,a) = p} ∪
     {𝛿3(q03,a) = p | ∃𝛿2(q02,a) = p}

L(M3) = L(M1) ∪ L(M2)
PROOF!!! Homework
M4 = (Q4, 𝚺4, 𝛿4, q04, F4)
Q4 = Q1 ∪ Q2; q04 = q01;

F4 = F2 ∪ {q ∊ F1 | if q02 ∊ F2}
𝛿4(q,a) = 𝛿1(q,a), if q ∊ Q1-F1
          𝛿1(q,a) ∪ 𝛿2(q02,a), if q ∊ F1
          𝛿2(q,a), if q ∊ Q2

L(M4) = L(M1)L(M2)
PROOF!!! Homework



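M5 = (Q5, 𝚺1, 𝛿5, q05, F5) //IDEA: concatenate with itself
Q5 = Q1; q05 = q01
F5 = F1 ∪ {q01}
𝛿5(q,a) = 𝛿1(q,a), if q ∊ Q1-F1
          𝛿1(q,a) ∪ 𝛿1(q01,a), if q ∊ F1
(assumes no transition of M1 enters q01; otherwise first add a new initial state that copies the transitions of q01)

L(M5) = L(M1)*

PROOF!!! Homework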
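
When transitions of M1 lead back to q01, making q01 itself final can accept words outside L(M1)*. A common variant, sketched here as an assumption beyond the slide, adds a fresh initial state q05 that is final and copies the outgoing transitions of q01. The function names and the sample automaton (accepting a(ba)*) are hypothetical.

```python
def star(delta, q0, finals, new_start="q05"):
    """FA for L(M1)* using a fresh, final initial state (variant of the slide's M5)."""
    assert all(new_start != q and new_start not in ts for (q, _), ts in delta.items())
    d = {key: set(targets) for key, targets in delta.items()}   # copy delta1
    for a in {a for (_, a) in delta}:
        from_start = delta.get((q0, a), set())
        if not from_start:
            continue
        d[(new_start, a)] = set(from_start)       # q05 behaves like q01
        for q in finals:                          # final states may restart a new word
            d.setdefault((q, a), set()).update(from_start)
    return d, new_start, set(finals) | {new_start}

def accepts(delta, q0, finals, w):
    current = {q0}
    for a in w:
        current = {p for q in current for p in delta.get((q, a), set())}
    return bool(current & finals)

# M1 accepts a(ba)*: q0 -a-> q1, q1 -b-> q0, F = {q1}; note the edge back into q0.
M1_DELTA = {("q0", "a"): {"q1"}, ("q1", "b"): {"q0"}}
STAR_DELTA, STAR_START, STAR_FINALS = star(M1_DELTA, "q0", {"q1"})
```

With this M1, the word ab is not in L(M1)* and is rejected, whereas an automaton in which q0 were simply declared final would accept it.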
