Theory of Automata Notes

The document discusses Kleene closure and regular expressions. It defines the Kleene plus ∑+ as the set of all possible strings of all lengths over a set, excluding the empty string, and the Kleene star (closure) as the same set with the empty string included. Kleene's theorem states that for any regular expression representing a language, there is a corresponding finite automaton that accepts the same language. The document provides examples of how to systematically generate a finite automaton from a regular expression using Kleene's theorem by applying operations like union, concatenation, and Kleene star to smaller automata. It also discusses non-deterministic finite automata (NFA) and deterministic finite automata (DFA).

Kleene Closure / Plus

• Definition − The set ∑+ is the infinite set of all possible strings of all possible
lengths over ∑, excluding λ.
• Representation − ∑+ = ∑^1 ∪ ∑^2 ∪ ∑^3 ∪ …
  ∑+ = ∑* − { λ }
• Example − If ∑ = { a, b }, then ∑+ = { a, b, aa, ab, ba, bb, … }
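As a rough illustration of the definition above, the strings of ∑+ can be listed length by length. This is a minimal Python sketch, not part of the original notes: the names `kleene_plus` and `kleene_star` and the `max_len` cut-off are assumptions made here (the real sets are infinite), and the empty Python string stands in for λ.

```python
from itertools import product

def kleene_plus(alphabet, max_len):
    """Strings of Σ+ with length 1..max_len, shortest first (assumed helper)."""
    result = []
    for n in range(1, max_len + 1):
        # Σ^n: every string of exactly n symbols
        result.extend("".join(p) for p in product(alphabet, repeat=n))
    return result

def kleene_star(alphabet, max_len):
    """Σ* cut off at max_len: Σ+ together with the empty string λ."""
    return [""] + kleene_plus(alphabet, max_len)

print(kleene_plus(["a", "b"], 2))  # ['a', 'b', 'aa', 'ab', 'ba', 'bb']
```

The only difference between the two functions is the empty string, which mirrors ∑+ = ∑* − { λ }.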
In mathematical logic and computer science, the Kleene star (or Kleene operator or Kleene
closure) is a unary operation, either on sets of strings or on sets of symbols or characters. In
mathematics it is more commonly known as the free monoid construction. The application of the
Kleene star to a set V is written as V*. It is widely used for regular expressions, which is the context in
which it was introduced by Stephen Kleene to characterize certain automata, where it means "zero
or more repetitions".
1. If V is a set of strings, then V* is defined as the smallest superset of V that contains
the empty string λ and is closed under the string concatenation operation.
2. If V is a set of symbols or characters, then V* is the set of all strings over symbols in V,
including the empty string λ.
The set V* can also be described as the set containing the empty string and all finite-length strings that
can be generated by concatenating arbitrary elements of V, allowing the use of the same element
multiple times. If V is either the empty set ∅ or the singleton set {λ}, then V* = {λ}; if V is any other finite
set or countably infinite set, then V* is a countably infinite set.[1] As a consequence, each formal
language over a finite or countably infinite alphabet ∑ is countable, since it is a subset of the countably
infinite set ∑*.
The operators are used in rewrite rules for generative grammars.
Given a set V, define
V^0 = {λ} (the language consisting only of the empty string),
and define recursively the set
V^(i+1) = { wv : w ∈ V^i and v ∈ V } for each i ≥ 0.
If V is a formal language, then V^i, the i-th power of the set V, is a shorthand for
the concatenation of set V with itself i times. That is, V^i can be understood to be the set of
all strings that can be represented as the concatenation of i strings in V.
The definition of Kleene star on V is[2]
V* = ⋃ (i ≥ 0) V^i = V^0 ∪ V^1 ∪ V^2 ∪ V^3 ∪ …
This means that the Kleene star operator is an idempotent unary operator: (V*)* = V* for any
set V of strings or characters, as (V*)^i = V* for every i ≥ 1.
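The recursive definition of the powers and the idempotence claim can be checked on small sets. The sketch below is an illustration under assumptions: `concat`, `power`, and `star_up_to` are names chosen here, Python's "" plays the role of the empty string, and the star is cut off at a maximum length because the full set is usually infinite.

```python
def concat(A, B):
    """Concatenation of two languages: every w in A followed by every v in B."""
    return {w + v for w in A for v in B}

def power(V, i):
    """V^i, built from V^0 = {""} by concatenating with V i times."""
    result = {""}
    for _ in range(i):
        result = concat(result, V)
    return result

def star_up_to(V, max_len):
    """The strings of V* whose length is at most max_len."""
    result = {""}
    frontier = {""}
    while frontier:
        # Append one more element of V; keep only new strings that are short enough.
        frontier = {w for w in concat(frontier, V) if len(w) <= max_len} - result
        result |= frontier
    return result

V = {"ab", "c"}
print(sorted(power(V, 2)))
print(sorted(star_up_to(V, 3), key=lambda w: (len(w), w)))
# Idempotence, checked up to the cut-off: starring the result adds nothing new.
print(star_up_to(star_up_to(V, 3), 3) == star_up_to(V, 3))
```

The loop in `star_up_to` stops as soon as a round produces no new string, which is also why the empty set and {""} both give back just {""}.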
Kleene’s Theorem
A language is said to be regular if it can be represented by a Finite Automaton or if
a Regular Expression can be generated for it. This definition leads us to the general
statement that, for every Regular Expression corresponding to a language, a Finite
Automaton can be generated.
For simple expressions such as (a+b), ab, and (a+b)*, it is fairly easy to construct the Finite
Automaton by intuition. The problem arises when we are given
a longer Regular Expression. This calls for a systematic approach
to FA generation, which Kleene put forward in Kleene’s Theorem – I.
Kleene’s Theorem-I:
For any Regular Expression r that represents the language L(r), there is a Finite Automaton
that accepts the same language.

To understand Kleene’s Theorem-I, let’s take into account the basic definition of a Regular
Expression, where we observe that ∅, ε, and a single input symbol “a” are each
Regular Expressions, and that larger expressions are built from them by the
following operations.
Let r1 and r2 be two regular expressions. Then,
1. r1 + r2 is a regular expression too, whose corresponding language is L(r1) ∪ L(r2)
2. r1.r2 is a regular expression too, whose corresponding language is L(r1).L(r2)
3. r1* is a regular expression too, whose corresponding language is L(r1)*
We can use this definition together with null transitions to build an FA
by combining two or more smaller Finite Automata (each corresponding to a
Regular Expression).
Let S accept L = {a} and T accept L = {b}; then R can be represented as a combination of
S and T using the operations above:
R = S + T

We observe that,
1. For the union operation we add a new start state, from which null
transitions lead to the start states of both Finite Automata.
2. The final states of both Finite Automata are converted to intermediate
states. A single new final state is added, reached from them by null
transitions.
R = S.T

We observe that,
1. For the concatenation operation we keep the start state of S as the start state;
the only change is to the final state of S, which is converted to an
intermediate state followed by a null transition.
2. The null transition leads to the start state of T, and the final state of T is
used as the final state of R.
R = S*

We observe that,
1. A new start state is added, and S is placed as an intermediate machine so
that the self-looping condition can be incorporated.
2. The start and final states are defined separately so that the self-looping
condition is not disturbed.
Now that we are aware of the general operations, let’s see how Kleene’s Theorem-I
can be used to generate an FA for a given Regular Expression.
Example:
Make a Finite Automaton for the expression (ab+a)*
We see that Kleene’s Theorem – I gives a systematic approach to
generating a Finite Automaton for the given Regular Expression.
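The three constructions described above (union, concatenation, star, each glued together with null transitions) can be sketched in Python and applied to (ab+a)*. This is one possible encoding, assumed for illustration: the `NFA` class, the helper names, and the use of `None` as the label of a null transition are choices made here, and the resulting machine need not match the diagram in the original notes state for state.

```python
from itertools import count

_ids = count()  # supplies fresh state numbers

class NFA:
    def __init__(self, start, final, edges):
        # edges: list of (source, symbol, target); symbol None marks a null transition
        self.start, self.final, self.edges = start, final, edges

def symbol(a):
    s, f = next(_ids), next(_ids)
    return NFA(s, f, [(s, a, f)])

def union(S, T):
    # New start with null moves into S and T; both old finals feed one new final.
    s, f = next(_ids), next(_ids)
    glue = [(s, None, S.start), (s, None, T.start), (S.final, None, f), (T.final, None, f)]
    return NFA(s, f, S.edges + T.edges + glue)

def concat(S, T):
    # The final state of S becomes intermediate and moves to the start of T.
    return NFA(S.start, T.final, S.edges + T.edges + [(S.final, None, T.start)])

def star(S):
    # Separate new start and final states; S can be skipped or repeated.
    s, f = next(_ids), next(_ids)
    glue = [(s, None, S.start), (s, None, f), (S.final, None, S.start), (S.final, None, f)]
    return NFA(s, f, S.edges + glue)

def closure(nfa, states):
    """All states reachable from `states` using only null transitions."""
    seen, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for src, sym, dst in nfa.edges:
            if src == q and sym is None and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen

def accepts(nfa, word):
    current = closure(nfa, {nfa.start})
    for ch in word:
        moved = {dst for src, sym, dst in nfa.edges if src in current and sym == ch}
        current = closure(nfa, moved)
    return nfa.final in current

# (ab+a)* assembled bottom-up from the single-symbol machines
r = star(union(concat(symbol("a"), symbol("b")), symbol("a")))
for w in ["", "a", "ab", "aab", "aba", "b", "abb"]:
    print(repr(w), accepts(r, w))
```

Building the expression from the inside out is the point of the theorem: each operator only needs the start and final state of its operands, never their internal structure.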

NFA (Non-Deterministic finite automata)


o NFA stands for non-deterministic finite automata. It is easier to construct an NFA than
a DFA for a given regular language.
o A finite automaton is called an NFA when there can be several paths for a specific input from
the current state to the next state.
o Not every NFA is a DFA, but every NFA can be translated into an equivalent DFA.
o An NFA is defined in the same way as a DFA, with two exceptions: it can have
multiple next states for the same input, and it can contain ε transitions.

In the following image, we can see that from state q0 for input a, there are two next
states q1 and q2; similarly, from q0 for input b, the next states are q0 and q1. Thus it is
not determined where to go next on a particular input. Hence this FA is
called a non-deterministic finite automaton.

Formal definition of NFA:


An NFA is a 5-tuple, the same as a DFA, but with a different transition function:

δ: Q × ∑ → 2^Q

where,

1. Q: finite set of states
2. ∑: finite set of input symbols
3. q0: initial state
4. F: set of final states
5. δ: transition function
Graphical Representation of an NFA
An NFA can be represented by a digraph called a state diagram, in which:

1. The states are represented by vertices.
2. The arcs labeled with an input character show the transitions.
3. The initial state is marked with an arrow.
4. The final state is denoted by a double circle.

Example 1:

1. Q = {q0, q1, q2}  
2. ∑ = {0, 1}  
3. q0 = {q0}  
4. F = {q2}  

Solution:

Transition diagram:

Transition Table:

Present State | Next State for Input 0 | Next State for Input 1
→q0           | q0, q1                 | q1
q1            | q2                     | q0
*q2           | q2                     | q1, q2

In the above diagram, we can see that when the current state is q0, on input 0, the next
state will be q0 or q1, and on 1 input the next state will be q1. When the current state is
q1, on input 0 the next state will be q2 and on 1 input, the next state will be q0. When
the current state is q2, on 0 input the next state is q2, and on 1 input the next state will
be q1 or q2.
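The transition table above can be run directly by tracking the set of states the NFA could be in after each symbol. This is a minimal sketch; the dictionary encoding of δ and the name `nfa_accepts` are assumptions made for illustration.

```python
def nfa_accepts(delta, start, finals, word):
    """Simulate an NFA without ε-moves by tracking every state it could be in."""
    current = {start}
    for symbol in word:
        # Union of the next-state sets of all current states; a missing entry means no move.
        current = set().union(*(delta.get((q, symbol), set()) for q in current))
    return bool(current & finals)

# δ from the transition table of Example 1
delta = {
    ("q0", "0"): {"q0", "q1"}, ("q0", "1"): {"q1"},
    ("q1", "0"): {"q2"},       ("q1", "1"): {"q0"},
    ("q2", "0"): {"q2"},       ("q2", "1"): {"q1", "q2"},
}

for w in ["00", "10", "0", "1", "11"]:
    print(w, nfa_accepts(delta, "q0", {"q2"}, w))
```

A string is accepted when at least one of the possible paths ends in the final state q2, which is why the check at the end is a set intersection.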
Example 2:
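NFA with ∑ = {0, 1} for strings with 01.

Solution:

Transition Table:

Present State | Next State for Input 0 | Next State for Input 1
→q0           | q1                     | ∅
q1            | ∅                      | q2
*q2           | q2                     | q2

Here ∅ (sometimes written ε in such tables) means that no transition is defined. As tabulated, the
machine accepts exactly the strings that begin with 01; to accept every string that contains 01,
q0 would also need to loop to itself on 0 and 1.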

Example 3:
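NFA with ∑ = {0, 1} that accepts all strings of length at least 2.

Solution:

Transition Table:

Present State | Next State for Input 0 | Next State for Input 1
→q0           | q1                     | q1
q1            | q2                     | q2
*q2           | ∅                      | ∅

Here ∅ (sometimes written ε in such tables) means that no transition is defined. As tabulated, q2 has
no outgoing transitions, so the machine accepts exactly the strings of length 2; for length at
least 2, q2 would need to loop to itself on 0 and 1.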
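The earlier remark that every NFA can be translated into a DFA is usually made concrete with the subset construction, where each DFA state is a set of NFA states. The sketch below applies it to the NFA of Example 1; it assumes an NFA without ε-moves, and the function and variable names are illustrative only.

```python
from collections import deque

def nfa_to_dfa(delta, start, finals, alphabet):
    """Subset construction for an NFA without ε-moves."""
    start_set = frozenset([start])
    dfa_delta, dfa_finals = {}, set()
    seen, queue = {start_set}, deque([start_set])
    while queue:
        subset = queue.popleft()
        if subset & finals:  # final if it contains at least one NFA final state
            dfa_finals.add(subset)
        for symbol in alphabet:
            target = frozenset(t for q in subset for t in delta.get((q, symbol), ()))
            dfa_delta[(subset, symbol)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return dfa_delta, start_set, dfa_finals

# NFA of Example 1 (start q0, final q2)
nfa_delta = {
    ("q0", "0"): {"q0", "q1"}, ("q0", "1"): {"q1"},
    ("q1", "0"): {"q2"},       ("q1", "1"): {"q0"},
    ("q2", "0"): {"q2"},       ("q2", "1"): {"q1", "q2"},
}
dfa_delta, dfa_start, dfa_finals = nfa_to_dfa(nfa_delta, "q0", {"q2"}, "01")
dfa_states = {subset for subset, _ in dfa_delta}

def dfa_accepts(word):
    state = dfa_start
    for symbol in word:
        state = dfa_delta[(state, symbol)]
    return state in dfa_finals

print(len(dfa_states), "reachable DFA states")
print(dfa_accepts("10"), dfa_accepts("11"))
```

Only subsets reachable from {q0} are generated, so the result can be much smaller than the 2^|Q| states that the construction allows in the worst case.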

DFA (Deterministic finite automata)


o DFA refers to deterministic finite automata. Deterministic refers to the uniqueness of the
computation: the machine reads an input string one symbol at a time, and each state has
exactly one next state for each input symbol.
o In a DFA, there is only one path for a specific input from the current state to the next state.
o A DFA does not allow the null move, i.e., it cannot change state without reading an input
character.
o A DFA can contain multiple final states. DFAs are used in lexical analysis in compilers.

In the following diagram, we can see that from state q0 for input a, there is only one
path which is going to q1. Similarly, from q0, there is only one path for input b going to
q2.
Formal Definition of DFA
A DFA is a 5-tuple, the same as described in the definition of FA:

1. Q: finite set of states
2. ∑: finite set of input symbols
3. q0: initial state
4. F: set of final states
5. δ: transition function

The transition function can be defined as:

1. δ: Q × ∑ → Q
Graphical Representation of DFA
A DFA can be represented by a digraph called a state diagram, in which:

1. The states are represented by vertices.
2. The arcs labeled with an input character show the transitions.
3. The initial state is marked with an arrow.
4. The final state is denoted by a double circle.

Example 1:

1. Q = {q0, q1, q2}  
2. ∑ = {0, 1}  
3. q0 = {q0}  
4. F = {q2}  

Solution:

Transition Diagram:

Transition Table:

Present State | Next State for Input 0 | Next State for Input 1
→q0           | q0                     | q1
q1            | q2                     | q1
*q2           | q2                     | q2
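Because δ returns exactly one state, running a DFA is a single loop over the input. The following sketch encodes the table above as a Python dictionary; the encoding and the name `dfa_accepts` are assumptions for illustration.

```python
def dfa_accepts(delta, start, finals, word):
    """Follow the single defined transition for each input symbol."""
    state = start
    for symbol in word:
        state = delta[(state, symbol)]
    return state in finals

# δ from the transition table of Example 1
delta = {
    ("q0", "0"): "q0", ("q0", "1"): "q1",
    ("q1", "0"): "q2", ("q1", "1"): "q1",
    ("q2", "0"): "q2", ("q2", "1"): "q2",
}

for w in ["10", "0110", "01", "111"]:
    print(w, dfa_accepts(delta, "q0", {"q2"}, w))
```

Tracing the table shows that q2 is reached once a 1 has been followed, possibly after more 1s, by a 0, so this DFA accepts the strings that contain 10.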

Example 2:
DFA with ∑ = {0, 1} that accepts all strings starting with 0.

Solution:

Explanation:
o In the above diagram, we can see that on input 0 in state q0 the DFA
changes state to q1, and it always ends in the final state q1 when the input starts with 0. It accepts 00,
01, 000, 001, etc. It can't accept any string that starts with 1, because it never reaches the
final state on a string starting with 1.

Example 3:
DFA with ∑ = {0, 1} that accepts all strings ending with 0.

Solution:

Explanation:

In the above diagram, we can see that on input 0 in state q0, the DFA
changes state to q1. It accepts any string that ends with 0, like 00, 10, 110, 100, etc.
It can't accept any string that ends with 1, because it never reaches the final state q1
on input 1, so a string ending with 1 is rejected.
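One way to gain confidence in machines like those of Examples 2 and 3 is to compare them with a direct string test on every short input. The transition tables below are assumptions reconstructed from the explanations (the diagrams are not reproduced here), including a hypothetical dead state q2 for Example 2; the check itself uses Python's `startswith` and `endswith`.

```python
from itertools import product

def run_dfa(delta, start, finals, word):
    state = start
    for symbol in word:
        state = delta[(state, symbol)]
    return state in finals

# Example 2 (assumed table): q1 is final, q2 is a dead state for inputs starting with 1
starts_with_0 = {
    ("q0", "0"): "q1", ("q0", "1"): "q2",
    ("q1", "0"): "q1", ("q1", "1"): "q1",
    ("q2", "0"): "q2", ("q2", "1"): "q2",
}
# Example 3 (assumed table): q1 is final and means "the last symbol read was 0"
ends_with_0 = {
    ("q0", "0"): "q1", ("q0", "1"): "q0",
    ("q1", "0"): "q1", ("q1", "1"): "q0",
}

def all_words(max_len):
    """Every binary string of length 0..max_len."""
    for n in range(max_len + 1):
        for p in product("01", repeat=n):
            yield "".join(p)

mismatches = [
    w for w in all_words(6)
    if run_dfa(starts_with_0, "q0", {"q1"}, w) != w.startswith("0")
    or run_dfa(ends_with_0, "q0", {"q1"}, w) != w.endswith("0")
]
print(mismatches)  # an empty list means both machines agree with their specification
```

An exhaustive check up to a fixed length is not a proof, but it catches the common mistakes, such as a missing dead state or a forgotten return transition.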

Mealy Machine
A Mealy machine is a machine in which the output symbol depends on the present input
symbol and the present state of the machine. In a Mealy machine diagram, each transition is
labeled with the input symbol and the output symbol, separated by /. The Mealy machine
can be described by a 6-tuple (Q, q0, ∑, O, δ, λ') where
1. Q: finite set of states
2. q0: initial state of the machine
3. ∑: finite input alphabet
4. O: output alphabet
5. δ: transition function, Q × ∑ → Q
6. λ': output function, Q × ∑ → O

Example 1:
Design a Mealy machine for a binary input sequence such that if the input has the substring 101,
the machine outputs A; if the input has the substring 110, it outputs B; otherwise it outputs C.

Solution: For designing such a machine, we check two conditions,
101 and 110. If we get 101, the output will be A. If we recognize 110, the output will be
B. For other strings the output will be C.

The partial diagram will be:

Now we will insert the possibilities of 0's and 1's for each state. Thus the Mealy machine
becomes:
Example 2:
Design a Mealy machine that scans a sequence of 0s and 1s and generates output
'A' if the input string terminates in 00, output 'B' if the string terminates in 11, and
output 'C' otherwise.

Solution: The Mealy machine will be:
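A possible encoding of such a machine is sketched below. The three states and their meaning (q0 = nothing read yet, q1 = last symbol was 0, q2 = last symbol was 1) are assumptions made here, since the diagram is not reproduced; `delta` and `out` play the roles of δ and λ' from the 6-tuple.

```python
def run_mealy(delta, out, start, word):
    """Return the output string: one output symbol per input symbol."""
    state, outputs = start, []
    for symbol in word:
        outputs.append(out[(state, symbol)])  # λ'(state, input)
        state = delta[(state, symbol)]        # δ(state, input)
    return "".join(outputs)

# Assumed states: q0 = start, q1 = last symbol was 0, q2 = last symbol was 1
delta = {
    ("q0", "0"): "q1", ("q0", "1"): "q2",
    ("q1", "0"): "q1", ("q1", "1"): "q2",
    ("q2", "0"): "q1", ("q2", "1"): "q2",
}
out = {
    ("q0", "0"): "C", ("q0", "1"): "C",
    ("q1", "0"): "A", ("q1", "1"): "C",  # 0 after 0: input so far ends in 00
    ("q2", "0"): "C", ("q2", "1"): "B",  # 1 after 1: input so far ends in 11
}

print(run_mealy(delta, out, "q0", "0011"))  # CACB
print(run_mealy(delta, out, "q0", "100"))   # CCA
```

The last output symbol classifies the whole input read so far; the earlier symbols classify its prefixes, which is the usual behaviour of a Mealy machine.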


Chomsky Hierarchy
The Chomsky Hierarchy represents the classes of languages that are accepted by different
machines. The categories of language in Chomsky's Hierarchy are given below:

1. Type 0, known as Unrestricted Grammar.
2. Type 1, known as Context Sensitive Grammar.
3. Type 2, known as Context Free Grammar.
4. Type 3, known as Regular Grammar.
This is a hierarchy. Therefore every language of type 3 is also of type 2, 1 and 0.
Similarly, every language of type 2 is also of type 1 and type 0, etc.

Type 0 Grammar:
Type 0 grammar is known as Unrestricted grammar. There is no restriction on the
grammar rules of these types of languages. These languages are exactly the ones
recognized by Turing machines.

For example:

1. bAa → aa  
2. S → s  

Type 1 Grammar:
Type 1 grammar is known as Context Sensitive Grammar. A context sensitive grammar
is used to represent a context sensitive language. It follows
the following rules:

o A context sensitive grammar may have more than one symbol on the left-hand side of
its production rules.
o The number of symbols on the left-hand side must not exceed the number of symbols
on the right-hand side.
o A rule of the form A → ε is not allowed unless A is the start symbol, in which case A must
not occur on the right-hand side of any rule.
o A Type 1 grammar is also Type 0. In Type 1, a production is of the form V → T,

where the count of symbols in V is less than or equal to the count in T.

For example:

1. S → AT  
2. T → xy  
3. A → a  

Type 2 Grammar:
Type 2 Grammar is known as Context Free Grammar. Context free languages are the
languages which can be represented by a context free grammar (CFG). A Type 2 grammar
is also Type 1. The production rule is of the form

1. A → α

where A is any single non-terminal and α is any combination of terminals and
non-terminals.

For example:

1. A → aBb  
2. A → b  
3. B → a  

Type 3 Grammar:
Type 3 Grammar is known as Regular Grammar. Regular languages are those languages
which can be described using regular expressions. These languages can be modeled by
an NFA or a DFA.

Type 3 is the most restricted form of grammar. A Type 3 grammar is also Type 2 and
Type 1. Type 3 productions should be of the form below, where V is a single non-terminal
and T* is a string of terminals:
1. V → T*V / T*

For example:

1. A → xy  
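The production forms listed for the four types can be turned into a rough classifier. The sketch below judges each production only by its shape and reports the most restrictive type that all productions satisfy. It rests on assumptions made here: symbols are single characters, upper-case letters are non-terminals, an empty string stands for ε, and the special treatment of ε-rules in Type 1 is ignored.

```python
def production_type(lhs, rhs):
    """Most restrictive Chomsky type (3, 2, 1 or 0) whose rule form lhs -> rhs fits."""
    is_nonterminal = str.isupper
    if not any(is_nonterminal(c) for c in lhs):
        raise ValueError("left-hand side needs at least one non-terminal")
    if len(lhs) == 1:
        # Single non-terminal on the left: context free, and regular if the
        # right side is terminals followed by at most one non-terminal (T*V / T*).
        body = rhs[:-1] if rhs and is_nonterminal(rhs[-1]) else rhs
        return 3 if not any(is_nonterminal(c) for c in body) else 2
    # Several symbols on the left: context sensitive if the rule does not shrink.
    return 1 if len(lhs) <= len(rhs) else 0

def grammar_type(productions):
    """A grammar is only as restricted as its least restricted production."""
    return min(production_type(lhs, rhs) for lhs, rhs in productions)

print(grammar_type([("bAa", "aa"), ("S", "s")]))             # 0
print(grammar_type([("S", "AT"), ("T", "xy"), ("A", "a")]))  # 2
print(grammar_type([("A", "aBb"), ("A", "b"), ("B", "a")]))  # 2
print(grammar_type([("A", "xy")]))                           # 3
```

The second call shows the hierarchy at work: the rules given as the Type 1 example each have a single non-terminal on the left, so they also satisfy the Type 2 form and the classifier reports 2.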

Ambiguity in Grammar
A grammar is said to be ambiguous if there exists more than one leftmost derivation,
more than one rightmost derivation, or more than one parse tree for some input
string. If the grammar is not ambiguous, then it is called unambiguous.

If the grammar has ambiguity, then it is not good for compiler construction. No general
algorithm can detect and remove ambiguity automatically (the problem is undecidable for
context free grammars), but we can often remove ambiguity by rewriting the grammar.

Example 1:
Let us consider a grammar G with the production rule

1. E → I  
2. E → E + E  
3. E → E * E  
4. E → (E)  
5. I → ε | 0 | 1 | 2 | ... | 9  

Solution:

For the string "3 * 2 + 5", the above grammar can generate two parse trees by leftmost
derivation:
Since there are two parse trees for a single string "3 * 2 + 5", the grammar G is
ambiguous.

Example 2:
Check whether the given grammar G is ambiguous or not.

1. E → E + E  
2. E → E - E  
3. E → id  

Solution:

From the above grammar String "id + id - id" can be derived in 2 ways:

First Leftmost derivation

1. E → E + E  
2.    → id + E  
3.    → id + E - E  
4.    → id + id - E  
5.    → id + id - id

Second Leftmost derivation

1. E → E - E  
2.    → E + E - E  
3.    → id + E - E  
4.    → id + id - E  
5.    → id + id - id  

Since there are two leftmost derivations for a single string "id + id - id", the grammar G is
ambiguous.
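For this particular grammar the number of parse trees of a token sequence can be counted by trying every operator as the root of the tree. The sketch below is specific to grammars of the shape E → E op E | id and is an illustration only; the function name and the token representation are assumptions. Each parse tree corresponds to exactly one leftmost derivation, so a count above 1 confirms the ambiguity.

```python
from functools import lru_cache

def count_parse_trees(tokens, operators=("+", "-")):
    """Number of parse trees of `tokens` under E -> E op E | id."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        # Parse trees deriving tokens[i:j] from E.
        if j - i == 1:
            return 1 if tokens[i] == "id" else 0
        total = 0
        for k in range(i + 1, j - 1):
            if tokens[k] in operators:
                # tokens[k] is the root operator; both sides are parsed independently
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))

print(count_parse_trees("id + id - id".split()))  # 2
print(count_parse_trees("id + id".split()))       # 1
```

With three operators the count rises to 5, which shows how quickly the number of readings grows when neither precedence nor associativity is fixed by the grammar.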

Example 3:
Check whether the given grammar G is ambiguous or not.

1. S → aSb | SS  
2. S → ε  

Solution:

For the string "aabb" the above grammar can generate two parse trees

Since there are two parse trees for a single string "aabb", the grammar G is ambiguous.

Example 4:
Check whether the given grammar G is ambiguous or not.

1. A → AA  
2. A → (A)  
3. A → a  
Solution:

For the string "a(a)aa" the above grammar can generate two parse trees:

Since there are two parse trees for a single string "a(a)aa", the grammar G is ambiguous.

Unambiguous Grammar
A grammar is unambiguous if it does not contain ambiguity, that is,
if no input string has more than one leftmost derivation, more than one rightmost
derivation, or more than one parse tree.

To convert an ambiguous grammar to an unambiguous grammar, we apply the following
rules:
1. If left associative operators (+, -, *, /) are used in the production rule, then apply
left recursion in the production rule. Left recursion means that the leftmost symbol on
the right side is the same as the non-terminal on the left side. For example,

1. X → Xa

2. If the right associative operator (^) is used in the production rule, then apply right
recursion in the production rule. Right recursion means that the rightmost symbol on
the right side is the same as the non-terminal on the left side. For example,

1. X → aX

Example 1:
Consider a grammar G is given as follows:

1. S → AB | aaB  
2. A → a | Aa  
3. B → b  

Determine whether the grammar G is ambiguous or not. If G is ambiguous, construct an
unambiguous grammar equivalent to G.

Solution:

Let us derive the string "aab"

As there are two different parse trees for deriving the same string, the given grammar is
ambiguous.

Unambiguous grammar will be:

1. S → AB  
2. A → Aa | a  
3. B → b  

Example 2:
Show that the given grammar is ambiguous. Also, find an equivalent unambiguous
grammar.

1. S → ABA  
2. A → aA | ε  
3. B → bB | ε  

Solution:

The given grammar is ambiguous because we can derive two different parse trees for the
string aa.
The unambiguous grammar is:

1. S → aXY | bYZ | ε  
2. Z → aZ | a  
3. X → aXY | a | ε  
4. Y → bYZ | b | ε  

Example 3:
Show that the given grammar is ambiguous. Also, find an equivalent unambiguous
grammar.

1. E → E + E  
2. E → E * E  
3. E → id  

Solution:

Let us derive the string "id + id * id"


As there are two different parse trees for deriving the same string, the given grammar is
ambiguous.

Unambiguous grammar will be:

1. E → E + T  
2. E → T  
3. T → T * F  
4. T → F  
5. F → id  
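The effect of the layered grammar can be seen by parsing with it. The sketch below is a small recursive-descent parser written for illustration; it is an assumption of this example that the left-recursive rules E → E + T and T → T * F are implemented as loops, a standard substitution that accepts the same strings and builds the same left-leaning trees. The tree is returned as nested tuples.

```python
def parse(tokens):
    """Parse a token list with E -> E + T | T, T -> T * F | F, F -> id."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def eat(expected):
        nonlocal pos
        if peek() != expected:
            raise SyntaxError(f"expected {expected!r} at position {pos}")
        pos += 1

    def factor():  # F -> id
        eat("id")
        return "id"

    def term():    # T -> T * F | F, with the left recursion written as a loop
        node = factor()
        while peek() == "*":
            eat("*")
            node = ("*", node, factor())
        return node

    def expr():    # E -> E + T | T, with the left recursion written as a loop
        node = term()
        while peek() == "+":
            eat("+")
            node = ("+", node, term())
        return node

    tree = expr()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {peek()!r} at position {pos}")
    return tree

print(parse("id + id * id".split()))  # ('+', 'id', ('*', 'id', 'id'))
print(parse("id + id + id".split()))  # ('+', ('+', 'id', 'id'), 'id')
```

There is only one possible tree per input: * sits below + because T is reached only through E, and repeated + groups to the left because the recursion is on the left.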

Example 4:
Check whether the given grammar is ambiguous or not. Also, find an equivalent
unambiguous grammar.

1. S → S + S
2. S → S * S
3. S → S ^ S
4. S → a

Solution:
The given grammar is ambiguous because a string such as a + a * a can be derived
with more than one parse tree.

Unambiguous grammar will be:

1. S → S + A | A
2. A → A * B | B
3. B → C ^ B | C
4. C → a

Strings:
A string is a finite ordered sequence of symbols chosen from some
alphabet ∑. For example, ‘aababbbbaa’ is a valid string over the alphabet ∑
= {a, b}; similarly, ‘001111000101’ is a valid string over the alphabet ∑ = {0,
1}.
Empty string:
Every alphabet ∑ has a special string called the empty string, which is the
string with zero occurrences of symbols. This string is represented by λ, e or ε. It
is a string over any alphabet whatsoever.

Length of a string:
The length of a string is the number of occurrences of symbols from ∑ in it. If s
denotes a string over the alphabet ∑, then its length is represented by |s|.
For instance, ‘001110’ is a string over the alphabet ∑ = {0, 1} and has length 6.
Similarly, if ∑ = {a, b} and s = ‘aabbabbba’, then |s| = 9.

Note: The length of the empty string is |ε| = 0.

Introduction to Grammars

In the literary sense of the term, grammars denote syntactical rules for conversation in
natural languages. Linguists have attempted to define grammars since the inception
of natural languages like English, Sanskrit, Mandarin, etc.
The theory of formal languages is applied extensively in
Computer Science. Noam Chomsky gave a mathematical model of grammar in 1956
which is effective for writing computer languages.

Grammar
A grammar G can be formally written as a 4-tuple (N, T, S, P) where −
• N or V_N is a set of variables or non-terminal symbols.
• T or ∑ is a set of terminal symbols.
• S is a special variable called the start symbol, S ∈ N.
• P is the set of production rules for terminals and non-terminals. A production rule has the
form α → β, where α and β are strings on V_N ∪ ∑ and at least one symbol of α
belongs to V_N.

Example

Grammar G1 −
({S, A, B}, {a, b}, S, {S → AB, A → a, B → b})
Here,
• S, A, and B are non-terminal symbols.
• a and b are terminal symbols.
• S is the start symbol, S ∈ N.
• Productions, P : S → AB, A → a, B → b

Example

Grammar G2 −
({S, A}, {a, b}, S, {S → aAb, aA → aaAb, A → ε})
Here,
• S and A are non-terminal symbols.
• a and b are terminal symbols.
• ε is the empty string.
• S is the start symbol, S ∈ N.
• Productions, P : S → aAb, aA → aaAb, A → ε

Derivations from a Grammar


Strings may be derived from other strings using the productions in a grammar. If a
grammar G has a production α → β, we can say that x α y derives x β y in G. This
derivation is written as −
x α y ⇒_G x β y

Example

Let us consider the grammar −


G2 = ({S, A}, {a, b}, S, {S → aAb, aA → aaAb, A → ε } )
One of the strings that can be derived is aaabbb −
S ⇒ aAb using production S → aAb
⇒ aaAbb using production aA → aaAb
⇒ aaaAbbb using production aA → aaAb
⇒ aaabbb using production A → ε
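The derivation above can be replayed mechanically by treating each step as a string rewrite. In the sketch below, G2 is written as a Python 4-tuple and each step replaces the leftmost occurrence of a production's left-hand side. The function name `derive`, the choice of the leftmost occurrence, and the use of "" for ε are assumptions made for illustration.

```python
def derive(start, productions, steps):
    """Apply the given productions in order and return every sentential form."""
    form, history = start, [start]
    for lhs, rhs in steps:
        if (lhs, rhs) not in productions:
            raise ValueError(f"{lhs} -> {rhs} is not a production of the grammar")
        if lhs not in form:
            raise ValueError(f"{lhs} does not occur in {form}")
        form = form.replace(lhs, rhs, 1)  # x α y  =>  x β y at the leftmost α
        history.append(form)
    return history

# G2 = (N, T, S, P); the empty string "" stands for ε
N, T, S = {"S", "A"}, {"a", "b"}, "S"
P = {("S", "aAb"), ("aA", "aaAb"), ("A", "")}

steps = [("S", "aAb"), ("aA", "aaAb"), ("aA", "aaAb"), ("A", "")]
print(" => ".join(derive(S, P, steps)))
```

Applying aA → aaAb k times before A → ε adds k extra a's and b's, so in this sketch the derivable terminal strings have the form a^n b^n with n ≥ 1.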
