SIMPLIFICATION OF CFG AND
CHOMSKY NORMAL FORM (CNF)
Lecture # 08
Dr. Imran Khalil
(imrankhalil@[Link])
Contents
• Simplification of CFG
• Chomsky Normal Form (CNF)
• Removing ε Productions/Rules
• Handling Unit Productions/Rules
• Handling Useless Productions/Rules
Simplifying CFGs
• To simplify a CFG, we take the following steps:
• Removing ε Productions/Rules
• Handling Unit Productions/Rules
• Handling Useless Productions/Rules
A Substitution Rule
Original grammar:
S → aB
A → aaA
A → abBc
B → aA
B → b
Substitute B → b
Equivalent grammar:
S → aB | ab
A → aaA
A → abBc | abbc
B → aA
S → aB | ab
A → aaA
A → abBc | abbc
B → aA
Substitute B → aA
Equivalent grammar:
S → aB | ab | aaA
A → aaA
A → abBc | abbc | abaAc
In general:
A → xBz
B → y1
Substitute B → y1
Equivalent grammar:
A → xBz | xy1z
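The substitution rule can be sketched in Python. This is a minimal illustration, assuming a grammar is stored as a dictionary from variables to sets of right-hand sides, each right-hand side being a tuple of symbols. The representation and the function name `substitute` are illustrative choices, not part of the lecture.

```python
def substitute(grammar, target, body):
    """Apply the substitution rule for the production target -> body.

    grammar: dict mapping each variable to a set of right-hand sides,
             where a right-hand side is a tuple of symbols.
    Assumes `body` does not itself contain `target`.
    """
    result = {}
    for head, bodies in grammar.items():
        new_bodies = set()
        for rhs in bodies:
            if head == target and rhs == body:
                continue  # the substituted production itself is dropped
            # Build every variant in which each occurrence of `target`
            # is either kept or replaced by `body`.
            variants = [()]
            for sym in rhs:
                kept = [v + (sym,) for v in variants]
                replaced = [v + body for v in variants] if sym == target else []
                variants = kept + replaced
            new_bodies.update(variants)
        result[head] = new_bodies
    return result


# Grammar from the slides: S -> aB, A -> aaA | abBc, B -> aA | b
grammar = {
    "S": {("a", "B")},
    "A": {("a", "a", "A"), ("a", "b", "B", "c")},
    "B": {("a", "A"), ("b",)},
}
after_b = substitute(grammar, "B", ("b",))         # substitute B -> b
after_aA = substitute(after_b, "B", ("a", "A"))    # then substitute B -> aA
```

Applying the function twice reproduces the two substitution steps above: first `B → b`, then `B → aA`.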
Nullable Variables
ε-production: X → ε
Nullable variable: a variable Y that can derive ε (Y ⇒* ε)
Example:
S → aMb
M → aMb
M → ε (ε-production)
Nullable variable: M
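Nullable variables can be found with a fixed-point iteration. The sketch below assumes the same illustrative representation as a dictionary from variables to sets of right-hand-side tuples, with the empty tuple `()` standing for ε. The name `nullable_variables` is hypothetical.

```python
def nullable_variables(grammar):
    """Return the set of variables that can derive the empty string."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            if head in nullable:
                continue
            # A variable is nullable if some right-hand side consists only
            # of nullable variables; the empty tuple (epsilon) qualifies.
            if any(all(sym in nullable for sym in rhs) for rhs in bodies):
                nullable.add(head)
                changed = True
    return nullable


# Example from the slide: S -> aMb, M -> aMb | epsilon
example = {
    "S": {("a", "M", "b")},
    "M": {("a", "M", "b"), ()},
}
found = nullable_variables(example)

# A chained case: X -> aY | bY | epsilon, Y -> X | c
chained = {
    "X": {("a", "Y"), ("b", "Y"), ()},
    "Y": {("X",), ("c",)},
}
found_chained = nullable_variables(chained)
```

In the second grammar `Y` is nullable only indirectly, through `Y → X` and `X → ε`, which is why the loop repeats until nothing changes.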
Removing ε Rules/Productions
Original grammar:
S → aMb
M → aMb
M → ε
Substitute M → ε
Equivalent grammar:
S → aMb | ab
M → aMb | ab
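A minimal Python sketch of ε-removal, under the assumption that the grammar is a dictionary from variables to sets of right-hand-side tuples with `()` for ε. The function name `remove_epsilon` is illustrative. Note that this sketch drops ε everywhere, so if the start variable is nullable, the empty string is lost from the language.

```python
def remove_epsilon(grammar):
    """Return an equivalent grammar (up to the empty string) without epsilon rules."""
    # Fixed-point computation of the nullable variables.
    nullable = set()
    while True:
        new = {head for head, bodies in grammar.items()
               if any(all(s in nullable for s in rhs) for rhs in bodies)}
        if new == nullable:
            break
        nullable = new

    result = {}
    for head, bodies in grammar.items():
        out = set()
        for rhs in bodies:
            # Each nullable symbol may be kept or omitted.
            variants = [()]
            for sym in rhs:
                kept = [v + (sym,) for v in variants]
                variants = kept + variants if sym in nullable else kept
            out.update(v for v in variants if v)  # discard the empty right-hand side
        result[head] = out
    return result


# Example from the slide: S -> aMb, M -> aMb | epsilon
example = {
    "S": {("a", "M", "b")},
    "M": {("a", "M", "b"), ()},
}
cleaned = remove_epsilon(example)
```

For the slide's grammar this yields `S → aMb | ab` and `M → aMb | ab`, matching the result above.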
Unit-Productions
Unit production: X → Y
(a single variable on both sides)
Example:
S → aA
A → a
A → B (unit production)
B → A (unit production)
B → bb
Removal of unit productions:
Original grammar:
S → aA
A → a
A → B
B → A
B → bb
Substitute A → B
Resulting grammar:
S → aA | aB
A → a
B → A | B
B → bb
Unit productions of the form X → X can be removed immediately
Before:
S → aA | aB
A → a
B → A | B
B → bb
Remove B → B
After:
S → aA | aB
A → a
B → A
B → bb
Before:
S → aA | aB
A → a
B → A
B → bb
Substitute B → A
After:
S → aA | aB | aA
A → a
B → bb
Remove repeated productions
Before:
S → aA | aB | aA
A → a
B → bb
Final grammar:
S → aA | aB
A → a
B → bb
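Unit productions can also be removed in one pass. The sketch below is an assumption-laden illustration: it uses the unit-closure formulation rather than the step-by-step substitution shown on the slides, so for the example it produces a grammar that generates the same language as the final grammar above but is not identical to it. The dictionary representation and the name `remove_unit_productions` are illustrative.

```python
def remove_unit_productions(grammar):
    """Replace unit productions using the unit closure of each variable."""
    variables = set(grammar)

    def is_unit(rhs):
        return len(rhs) == 1 and rhs[0] in variables

    result = {}
    for head in grammar:
        # Variables reachable from `head` using unit productions only.
        reachable, stack = {head}, [head]
        while stack:
            var = stack.pop()
            for rhs in grammar[var]:
                if is_unit(rhs) and rhs[0] not in reachable:
                    reachable.add(rhs[0])
                    stack.append(rhs[0])
        # `head` inherits every non-unit production of a reachable variable.
        result[head] = {rhs for var in reachable for rhs in grammar[var]
                        if not is_unit(rhs)}
    return result


# Example from the slides: S -> aA, A -> a | B, B -> A | bb
example = {
    "S": {("a", "A")},
    "A": {("a",), ("B",)},
    "B": {("A",), ("b", "b")},
}
no_units = remove_unit_productions(example)
```

Here the result is `S → aA`, `A → a | bb`, `B → a | bb`. Both this grammar and the slides' final grammar generate exactly the strings `aa` and `abb`.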
Useless Productions/Rules
S → aSb
S → ε
S → A
A → aA (useless production)
Some derivations never terminate...
S ⇒ A ⇒ aA ⇒ aaA ⇒ aaaA ⇒ ...
Another grammar:
S → A
A → aA
A → ε
B → bA (useless production: not reachable from S)
A production A → x is useless if any of its variables is useless
S → aSb
S → ε
S → A (useless production)
A → aA (useless variable A, useless production)
B → C (useless variable B, useless production)
C → D (useless variable C, useless production)
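Both kinds of uselessness shown above (variables that never produce a terminal string, and variables not reachable from the start variable) can be removed in two phases. This is a minimal sketch under stated assumptions: grammars are dictionaries from variables to sets of right-hand-side tuples, and a symbol counts as a variable when it starts with an uppercase letter. The names `remove_useless` and `is_variable` are hypothetical.

```python
def is_variable(sym):
    # Assumption for this sketch: variables start with an uppercase letter.
    return sym[:1].isupper()


def remove_useless(grammar, start):
    """Drop non-generating variables first, then unreachable ones."""
    # Phase 1: variables that can derive some string of terminals.
    generating = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            if head in generating:
                continue
            if any(all(s in generating or not is_variable(s) for s in rhs)
                   for rhs in bodies):
                generating.add(head)
                changed = True
    pruned = {
        head: {rhs for rhs in bodies
               if all(s in generating or not is_variable(s) for s in rhs)}
        for head, bodies in grammar.items() if head in generating
    }

    # Phase 2: variables reachable from the start variable.
    reachable, stack = {start}, [start]
    while stack:
        var = stack.pop()
        for rhs in pruned.get(var, ()):
            for s in rhs:
                if s in pruned and s not in reachable:
                    reachable.add(s)
                    stack.append(s)
    return {head: bodies for head, bodies in pruned.items() if head in reachable}


# S -> aSb | epsilon | A, A -> aA, B -> C, C -> D
grammar_a = {
    "S": {("a", "S", "b"), (), ("A",)},
    "A": {("a", "A")},
    "B": {("C",)},
    "C": {("D",)},
}
# S -> A, A -> aA | epsilon, B -> bA
grammar_b = {
    "S": {("A",)},
    "A": {("a", "A"), ()},
    "B": {("b", "A")},
}
clean_a = remove_useless(grammar_a, "S")
clean_b = remove_useless(grammar_b, "S")
```

The order of the two phases matters: removing non-generating variables can make further variables unreachable, so reachability is checked last.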
Removing All
Step 1: Remove Nullable Variables
Step 2: Remove Unit-Productions
Step 3: Remove Useless Variables
This sequence guarantees that unwanted
variables and productions are removed
CHOMSKY NORMAL
FORM (CNF)
Chomsky Normal Form (CNF)
Each production has the form:
A → BC (B and C are variables)
or
A → a (a is a terminal)
Examples:
Chomsky Normal Form:
S → AS
S → a
A → SA
A → b
Not Chomsky Normal Form:
S → AS
S → AAS
A → SA
A → aa
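The definition can be turned into a small check. The sketch below assumes a grammar is a dictionary from variables to sets of right-hand-side tuples, and that every variable appears as a key. The name `is_cnf` is illustrative.

```python
def is_cnf(grammar):
    """Check that every production is A -> BC or A -> a."""
    variables = set(grammar)
    for bodies in grammar.values():
        for rhs in bodies:
            two_variables = len(rhs) == 2 and all(s in variables for s in rhs)
            one_terminal = len(rhs) == 1 and rhs[0] not in variables
            if not (two_variables or one_terminal):
                return False
    return True


# The two example grammars from the slide.
in_cnf = {
    "S": {("A", "S"), ("a",)},
    "A": {("S", "A"), ("b",)},
}
not_in_cnf = {
    "S": {("A", "S"), ("A", "A", "S")},   # AAS has three variables
    "A": {("S", "A"), ("a", "a")},        # aa has two terminals
}
first_ok = is_cnf(in_cnf)
second_ok = is_cnf(not_in_cnf)
```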
Conversion to CNF
Example (not in CNF):
S → ABa
A → aab
B → Ac
We will convert it to CNF
Introduce new variables for the terminals: Ta, Tb, Tc
Before:
S → ABa
A → aab
B → Ac
After:
S → ABTa
A → TaTaTb
B → ATc
Ta → a
Tb → b
Tc → c
Introduce new intermediate variable V1 to break the first production:
Before:
S → ABTa
A → TaTaTb
B → ATc
Ta → a
Tb → b
Tc → c
After:
S → AV1
V1 → BTa
A → TaTaTb
B → ATc
Ta → a
Tb → b
Tc → c
Introduce intermediate variable V2:
Before:
S → AV1
V1 → BTa
A → TaTaTb
B → ATc
Ta → a
Tb → b
Tc → c
After:
S → AV1
V1 → BTa
A → TaV2
V2 → TaTb
B → ATc
Ta → a
Tb → b
Tc → c
Initial grammar:
S → ABa
A → aab
B → Ac
Final grammar in Chomsky Normal Form:
S → AV1
V1 → BTa
A → TaV2
V2 → TaTb
B → ATc
Ta → a
Tb → b
Tc → c
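The two conversion steps above (new variables for terminals, then intermediate variables for long right-hand sides) can be sketched in Python. This is a minimal illustration that assumes the input has no ε-productions and no unit productions, that every variable is a key of the dictionary, and that the generated names `Ta`, `Tb`, ... and `V1`, `V2`, ... do not clash with existing variables. The function name `to_cnf` is hypothetical.

```python
def to_cnf(grammar):
    """Convert a grammar without epsilon and unit productions to CNF."""
    variables = set(grammar)
    result = {head: set() for head in grammar}
    counter = 0
    for head, bodies in grammar.items():
        for rhs in sorted(bodies):  # sorted for reproducible variable numbering
            if len(rhs) >= 2:
                # Step 1: replace each terminal x by a new variable Tx -> x.
                converted = []
                for sym in rhs:
                    if sym in variables:
                        converted.append(sym)
                    else:
                        t_var = "T" + sym
                        result.setdefault(t_var, set()).add((sym,))
                        converted.append(t_var)
                rhs = tuple(converted)
            # Step 2: split right-hand sides longer than two symbols.
            current = head
            while len(rhs) > 2:
                counter += 1
                fresh = f"V{counter}"
                result.setdefault(current, set()).add((rhs[0], fresh))
                current, rhs = fresh, rhs[1:]
            result.setdefault(current, set()).add(rhs)
    return result


# Example from the slides: S -> ABa, A -> aab, B -> Ac
example = {
    "S": {("A", "B", "a")},
    "A": {("a", "a", "b")},
    "B": {("A", "c")},
}
cnf = to_cnf(example)
```

For this input the sketch happens to pick the same names as the slides, so the result matches the final grammar above production for production.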
Practice Problem # 01
Convert the following CFG to CNF
𝑆 → 𝐴𝑆𝐴|𝑎𝐵
𝐴 → 𝐵|𝑆
𝐵 → 𝑏|𝜖
Step 1: Add a new start variable 𝑆0 (needed only when the start variable appears on a right-hand side)
𝑆0 → 𝑆
𝑆 → 𝐴𝑆𝐴|𝑎𝐵
𝐴 → 𝐵|𝑆
𝐵 → 𝑏|𝜖
Step 2: Remove 𝜖 productions
Remove 𝐵 → 𝜖 (since 𝐴 → 𝐵, this introduces 𝐴 → 𝜖):
𝑆0 → 𝑆
𝑆 → 𝐴𝑆𝐴|𝑎𝐵|𝑎
𝐴 → 𝐵|𝑆|𝜖
𝐵 → 𝑏
Remove 𝐴 → 𝜖:
𝑆0 → 𝑆
𝑆 → 𝐴𝑆𝐴|𝑎𝐵|𝑎|𝑆𝐴|𝐴𝑆|𝑆
𝐴 → 𝐵|𝑆
𝐵 → 𝑏
Step 3: Remove unit productions
Remove 𝑆 → 𝑆 and 𝑆0 → 𝑆:
𝑆0 → 𝐴𝑆𝐴|𝑎𝐵|𝑎|𝑆𝐴|𝐴𝑆
𝑆 → 𝐴𝑆𝐴|𝑎𝐵|𝑎|𝑆𝐴|𝐴𝑆
𝐴 → 𝐵|𝑆
𝐵 → 𝑏
Remove 𝐴 → 𝐵:
𝑆0 → 𝐴𝑆𝐴|𝑎𝐵|𝑎|𝑆𝐴|𝐴𝑆
𝑆 → 𝐴𝑆𝐴|𝑎𝐵|𝑎|𝑆𝐴|𝐴𝑆
𝐴 → 𝑏|𝑆
𝐵 → 𝑏
Remove 𝐴 → 𝑆:
𝑆0 → 𝐴𝑆𝐴|𝑎𝐵|𝑎|𝑆𝐴|𝐴𝑆
𝑆 → 𝐴𝑆𝐴|𝑎𝐵|𝑎|𝑆𝐴|𝐴𝑆
𝐴 → 𝑏|𝐴𝑆𝐴|𝑎𝐵|𝑎|𝑆𝐴|𝐴𝑆
𝐵 → 𝑏
Step 4: Replace right-hand sides that have more than two symbols, and terminals that appear next to a variable
𝑉 → 𝑆𝐴, 𝑈 → 𝑎
𝑆0 → 𝐴𝑉|𝑈𝐵|𝑎|𝑆𝐴|𝐴𝑆
𝑆 → 𝐴𝑉|𝑈𝐵|𝑎|𝑆𝐴|𝐴𝑆
𝐴 → 𝑏|𝐴𝑉|𝑈𝐵|𝑎|𝑆𝐴|𝐴𝑆
𝐵 → 𝑏
𝑉 → 𝑆𝐴
𝑈 → 𝑎
Initial Grammar:
𝑆 → 𝐴𝑆𝐴|𝑎𝐵
𝐴 → 𝐵|𝑆
𝐵 → 𝑏|𝜖
CNF:
𝑆0 → 𝐴𝑉|𝑈𝐵|𝑎|𝑆𝐴|𝐴𝑆
𝑆 → 𝐴𝑉|𝑈𝐵|𝑎|𝑆𝐴|𝐴𝑆
𝐴 → 𝑏|𝐴𝑉|𝑈𝐵|𝑎|𝑆𝐴|𝐴𝑆
𝐵 → 𝑏
𝑉 → 𝑆𝐴
𝑈 → 𝑎
Practice Problem # 02
Convert the following CFG to CNF
𝑆 → 𝑎𝑋𝑏𝑋
𝑋 → 𝑎𝑌|𝑏𝑌|𝜖
𝑌 → 𝑋|𝑐
Step 1: Add a new start variable 𝑆0 (needed only when the start variable appears on a right-hand side)
The start variable does not appear on any right-hand side, so no new start variable is needed.
𝑆 → 𝑎𝑋𝑏𝑋
𝑋 → 𝑎𝑌|𝑏𝑌|𝜖
𝑌 → 𝑋|𝑐
Step 2: Remove 𝜖
𝑋 → 𝜖
Transitive property: 𝑋 → 𝜖 and 𝑌 → 𝑋, so 𝑌 is also nullable (𝑌 ⇒ 𝑋 ⇒ 𝜖). In the same step, the unit production 𝑌 → 𝑋 is replaced by the right-hand sides of 𝑋.
𝑆 → 𝑎𝑋𝑏𝑋|𝑎𝑏𝑋|𝑎𝑋𝑏|𝑎𝑏
𝑋 → 𝑎𝑌|𝑏𝑌|𝑎|𝑏
𝑌 → 𝑎𝑌|𝑏𝑌|𝑎|𝑏|𝑐
Step 3: Remove unit productions
No unit productions remain.
𝑆 → 𝑎𝑋𝑏𝑋|𝑎𝑏𝑋|𝑎𝑋𝑏|𝑎𝑏
𝑋 → 𝑎𝑌|𝑏𝑌|𝑎|𝑏
𝑌 → 𝑎𝑌|𝑏𝑌|𝑎|𝑏|𝑐
Step 4: Replace right-hand sides that have more than two symbols, and terminals that appear next to another symbol
𝑉1 → 𝑎, 𝑉2 → 𝑏, 𝑉3 → 𝑉1 𝑋, 𝑉4 → 𝑉2 𝑋
𝑆 → 𝑉3 𝑉4 | 𝑉1 𝑉4 | 𝑉3 𝑉2 | 𝑉1 𝑉2
𝑋 → 𝑉1 𝑌 | 𝑉2 𝑌 | 𝑎 | 𝑏
𝑌 → 𝑉1 𝑌 | 𝑉2 𝑌 | 𝑎 | 𝑏 | 𝑐
Initial Grammar:
𝑆 → 𝑎𝑋𝑏𝑋
𝑋 → 𝑎𝑌|𝑏𝑌|𝜖
𝑌 → 𝑋|𝑐
CNF:
𝑆 → 𝑉3 𝑉4 | 𝑉1 𝑉4 | 𝑉3 𝑉2 | 𝑉1 𝑉2
𝑋 → 𝑉1 𝑌 | 𝑉2 𝑌 | 𝑎 | 𝑏
𝑌 → 𝑉1 𝑌 | 𝑉2 𝑌 | 𝑎 | 𝑏 | 𝑐
𝑉1 → 𝑎
𝑉2 → 𝑏
𝑉3 → 𝑉1 𝑋
𝑉4 → 𝑉2 𝑋
Applications of CNF
• CNF enables a polynomial-time algorithm to decide whether a string can be generated by a grammar.
• Simplicity of proofs, e.g. 𝜀-free transitions for PDAs.
• Enables parsing: a PDA can be used to parse words with any grammar, but this is often inconvenient. Normal forms give us more structure to work with, resulting in easier parsing algorithms.
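One well-known polynomial-time membership test for CNF grammars is the CYK (Cocke-Younger-Kasami) algorithm. The lecture does not name a specific algorithm, so the sketch below is an illustration of the first bullet rather than the lecture's own method. It assumes the grammar is a dictionary from variables to sets of right-hand-side tuples, and uses the CNF grammar obtained earlier from S → ABa, A → aab, B → Ac.

```python
def cyk(grammar, start, word):
    """Decide whether `word` is generated by a grammar in CNF."""
    n = len(word)
    if n == 0:
        return False  # CNF as defined in these slides has no epsilon production
    # table[i][k] holds the variables that derive word[i : i + k + 1].
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, ch in enumerate(word):
        for head, bodies in grammar.items():
            if (ch,) in bodies:
                table[i][0].add(head)
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            for split in range(1, length):
                left = table[i][split - 1]
                right = table[i + split][length - split - 1]
                for head, bodies in grammar.items():
                    for rhs in bodies:
                        if len(rhs) == 2 and rhs[0] in left and rhs[1] in right:
                            table[i][length - 1].add(head)
    return start in table[0][n - 1]


cnf_grammar = {
    "S": {("A", "V1")},
    "V1": {("B", "Ta")},
    "A": {("Ta", "V2")},
    "V2": {("Ta", "Tb")},
    "B": {("A", "Tc")},
    "Ta": {("a",)},
    "Tb": {("b",)},
    "Tc": {("c",)},
}
accepted = cyk(cnf_grammar, "S", "aabaabca")  # A = aab, B = aabc, then a
rejected = cyk(cnf_grammar, "S", "aab")       # derivable from A, not from S
```

The three nested loops over substring length, start position, and split point give a running time cubic in the length of the word, times the size of the grammar.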
Acknowledgment
• [Michael_Sipser] Introduction to the Theory of Computation.
• [Margaret Fleck and Sariel Har Peled] Introduction to the Theory of
Computation.
• [John_E._Hopcroft] Intro to Automata Theory, Language and Computation.
• .
39