ToC Module 2 Handouts Sept05
Tapas Pandit
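[Figure 1: transition diagram with two states, Even and Odd; each state has a self-loop on 1, and the two states are joined by transitions on 0.]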
For simplicity, let us rename the Even and Odd states as 𝑞0 and 𝑞1, respectively. As Figure 1 shows, at each step the machine takes a pair consisting of a state 𝑞 ∈ {𝑞0, 𝑞1} and an input symbol 𝜎 ∈ Σ, and determines its next state. That is, the machine works according to a function 𝛿 : 𝑄 × Σ → 𝑄, where 𝑄 = {𝑞0, 𝑞1} is the finite set of states. This function is called the transition function, and it is described in the transition table below.
¹ Accept and recognize will be used interchangeably throughout our discussion.
[Figure 1, redrawn with the states renamed 𝑞0 and 𝑞1.]
input state (𝑞)    input symbol (𝜎)    next state (𝛿(𝑞, 𝜎))
𝑞0                 0                   𝑞1
𝑞0                 1                   𝑞0
𝑞1                 0                   𝑞0
𝑞1                 1                   𝑞1
It is easy to check that strings such as 0, 000, 101 and 1110 are all accepted by the machine in Figure 1. In fact, the machine recognizes the language 𝐿 = {𝜔 ∈ Σ∗ : the number of 0's in 𝜔 is odd}; that is, the strings accepted by the machine are exactly the strings of 𝐿.
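The table-driven behaviour described above can be sketched in a few lines of Python. This is only an illustrative sketch: the names `DELTA`, `run_dfa` and `accepts` are hypothetical, and the choice of 𝑞0 as the start state and 𝑞1 as the only accepting state is an assumption read off from the language 𝐿 (odd number of 0's).

```python
# Transition table of the two-state machine (q0 = "even", q1 = "odd").
DELTA = {
    ("q0", "0"): "q1",
    ("q0", "1"): "q0",
    ("q1", "0"): "q0",
    ("q1", "1"): "q1",
}
START = "q0"        # assumption: no 0 has been read at the beginning
ACCEPTING = {"q1"}  # assumption: accept when the number of 0's is odd


def run_dfa(word, delta=DELTA, start=START):
    """Return the state reached after reading every symbol of `word`."""
    state = start
    for symbol in word:
        state = delta[(state, symbol)]
    return state


def accepts(word):
    """True if the machine ends in an accepting state on `word`."""
    return run_dfa(word) in ACCEPTING


if __name__ == "__main__":
    for w in ["0", "000", "101", "1110", "", "00"]:
        print(repr(w), accepts(w))
```

The first four strings are the ones mentioned in the text; the last two (the empty string and 00) contain an even number of 0's and are included only for contrast.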
Let us consider another example. Design a machine that accepts those strings whose entry at each
even position is 0, such as 10, 00, 101, 1010, etc. We keep three states: ‘OP’ (for odd position), ‘EP’ (for
even position), and ‘R’ (for rejection). When the machine is in the OP state, it simply moves to the EP
state for any input symbol, as the machine is not interested in the symbol at the odd position. When the
machine is in the EP state and the input symbol is 0, the machine moves to the OP state. If the input
symbol is 1, the machine moves to the R state and remains there for all subsequent input symbols. The
machine description is given in Figure 2. Note that here both states OP and EP are final/accepting states.
The transition function is described in the table below:
[Figure 2: transition diagram with states OP, EP and R.]
input state (𝑞)    input symbol (𝜎)    next state (𝛿(𝑞, 𝜎))
OP                 0                   EP
OP                 1                   EP
EP                 0                   OP
EP                 1                   R
R                  0                   R
R                  1                   R
Figure 3: Machine for recognizing strings having 1 at the 3rd position from the LHS.
Before formally introducing the finite state machine, let us consider one more example. Design a machine that accepts those strings whose 3rd bit from the LHS is 1. The machine is described in Figure 3; it has five states 𝑞0, 𝑞1, ..., 𝑞4. Regardless of the first two input symbols, the machine moves to the state 𝑞2 through 𝑞1. Then, if the 3rd bit from the LHS of the input string is 1, the machine enters the state 𝑞3 (the accepting state); otherwise, it enters the state 𝑞4. The machine remains in 𝑞3 (resp. 𝑞4) for all subsequent input symbols.
All the machines discussed so far are called finite state machines or finite state automata, because the number of states is finite. You might have noticed that at each transition, the machines have exactly one option for their next state. For this reason, they are called deterministic finite automata (DFAs). We now define them formally.
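Definition 1.1. A deterministic finite automaton (DFA) is a quintuple M = (𝑄, Σ, 𝛿, 𝑠, 𝐹), where
1. 𝑄 is a finite set of states,
2. Σ is an alphabet,
3. 𝑠 ∈ 𝑄 is the initial state,
4. 𝐹 ⊆ 𝑄 is the set of final states, and
5. 𝛿 is the transition function from 𝑄 × Σ to 𝑄.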
[Figure: a finite automaton, consisting of a read-only input tape, a reading head, and a finite control over the states 𝑞0, 𝑞1, ..., 𝑞𝑛.]
Definition 1.3 (Binary relation over configurations). A binary relation ⊢_M holds between two configurations of M if and only if the machine M can pass from one to the other as a result of a single move. Thus, if (𝑞, 𝜔) and (𝑞′, 𝜔′) are two configurations of M, then (𝑞, 𝜔) ⊢_M (𝑞′, 𝜔′) if and only if 𝜔 = 𝑎𝜔′ for some symbol 𝑎 ∈ Σ and 𝛿(𝑞, 𝑎) = 𝑞′. In this case, we say that (𝑞, 𝜔) yields (𝑞′, 𝜔′) in one step. For example, for the DFA defined in Figure 1, we can write (𝑞0, 0100) ⊢_M (𝑞1, 100).
Remark 1.4. As we see, the binary relation ⊢_M is a function from 𝑄 × Σ+ to 𝑄 × Σ∗; that is, for every configuration except those of the form (𝑞, 𝜖) there is a uniquely determined next configuration. A configuration of the form (𝑞, 𝜖) signifies that M has consumed all its input symbols, and hence its operation ceases at this point.
We denote the reflexive, transitive closure of ⊢_M by ⊢*_M, which is nothing but the application of zero or a finite number of steps of the binary relation ⊢_M. For example, for the DFA defined in Figure 1, we can write (𝑞0, 0100) ⊢*_M (𝑞0, 0), as (𝑞0, 0100) ⊢_M (𝑞1, 100) ⊢_M (𝑞1, 00) ⊢_M (𝑞0, 0).
Definition 1.4 (Acceptance). A string 𝜔 ∈ Σ∗ is said to be accepted by M if and only if there exists a state 𝑞 ∈ 𝐹 such that (𝑠, 𝜔) ⊢*_M (𝑞, 𝜖). Finally, the language accepted by M, denoted by L(M), is the set of all strings accepted by M.
Remark 1.5. The transition diagram of a DFA can always be viewed as a special type of directed graph, where the states and transition arrows represent the nodes and directed edges of the underlying graph, respectively. We say the DFA accepts a string 𝜔 = 𝜔1 · · · 𝜔𝑛 if there is a directed walk from the initial node to a node corresponding to a final state, where the directed edges are labeled with 𝜔1, 𝜔2, ..., 𝜔𝑛, respectively.
Example 1.6. Let us first write down the DFA M in Figure 3 as per the definition. M = (𝑄, Σ, 𝛿, 𝑠, 𝐹), where 𝑄 = {𝑞0, ..., 𝑞4}, Σ = {0, 1}, 𝑠 = 𝑞0, 𝐹 = {𝑞3} and the transition function 𝛿 : 𝑄 × Σ → 𝑄 is given by 𝛿(𝑞0, 0) = 𝑞1, 𝛿(𝑞0, 1) = 𝑞1, 𝛿(𝑞1, 0) = 𝑞2, 𝛿(𝑞1, 1) = 𝑞2, 𝛿(𝑞2, 1) = 𝑞3, 𝛿(𝑞2, 0) = 𝑞4, 𝛿(𝑞3, 0) = 𝑞3, 𝛿(𝑞3, 1) = 𝑞3, 𝛿(𝑞4, 0) = 𝑞4 and 𝛿(𝑞4, 1) = 𝑞4. The above DFA M accepts/recognizes the language
L(M) = {𝑥 ∈ {0, 1}∗ : |𝑥| ≥ 3 and the 3rd bit of 𝑥 from the LHS is 1}. (1)
For example, M accepts the string 101011. In fact, using the notation ⊢_M we have
(𝑞0, 101011) ⊢_M (𝑞1, 01011)
             ⊢_M (𝑞2, 1011)
             ⊢_M (𝑞3, 011)
             ⊢_M (𝑞3, 11)
             ⊢_M (𝑞3, 1)
             ⊢_M (𝑞3, 𝜖).
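The chain of configurations above can be reproduced mechanically. The following Python sketch is illustrative only: the states 𝑞0, ..., 𝑞4 are encoded as the integers 0 to 4, and the helper names `delta`, `trace` and `accepts` are hypothetical. Each loop iteration performs exactly one "yields in one step" move on a configuration (state, unread input).

```python
def delta(state, symbol):
    """Transition function of the DFA in Example 1.6 (states 0..4 stand for q0..q4)."""
    if state in (0, 1):
        return state + 1                  # q0 -> q1 -> q2 on either symbol
    if state == 2:
        return 3 if symbol == "1" else 4  # the 3rd bit decides
    return state                          # q3 and q4 loop on every symbol


def trace(state, word):
    """List the configurations (state, unread input) visited, one per move."""
    configs = [(state, word)]
    while word:
        state, word = delta(state, word[0]), word[1:]
        configs.append((state, word))
    return configs


def accepts(word, start=0, final=frozenset({3})):
    """Accept iff the last configuration is (q, empty input) with q final."""
    return trace(start, word)[-1][0] in final


if __name__ == "__main__":
    for state, rest in trace(0, "101011"):
        print(f"(q{state}, {rest or 'epsilon'})")
```

Running the script prints one configuration per line, in the same order as the derivation in Example 1.6.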
1. 𝑞 0 = 𝑠,
3. 𝑞 𝑛 ∈ 𝐹.
Example 1.8. The DFA given in Figure 5 accepts the language {𝜔 ∈ {0, 1}∗ : #1’s is even and #0’s is even}.
Figure 5: Machine for recognizing strings having even number of 1’s and even number of 0’s.
[Figure: transition diagram on states 𝑞0, 𝑞1, 𝑞2, 𝑞3 with transitions 𝑞0 → 𝑞1 on 1, 𝑞1 → 𝑞2 on 0 and 𝑞2 → 𝑞3 on 1.]
Figure 7: Machine accepts those strings whose decimal value is congruent to 2 modulo 3.
Example 1.11. Let us design a deterministic finite automaton M that accepts the language 𝐿 = {𝜔 ∈ {𝑎, 𝑏}∗ : 𝜔 does not contain three consecutive b’s}.
𝑞 𝜎 𝛿(𝑞, 𝜎)
𝑞0 𝑎 𝑞0
𝑞0 𝑏 𝑞1
𝑞1 𝑎 𝑞0
𝑞1 𝑏 𝑞2
𝑞2 𝑎 𝑞0
𝑞2 𝑏 𝑞3
𝑞3 𝑎 𝑞3
𝑞3 𝑏 𝑞3
Figure 8: Machine recognizes those strings that do not contain three consecutive b’s.
Figure 9: Machine accepts those strings whose sum of binary digits is congruent to 2 modulo 3.
2. L(M) = {𝜔 ∈ Σ∗ : if ℓ = |𝜔| > 1, then 𝜔_{2𝑖} = 0 for all 𝑖 ∈ [⌊ℓ/2⌋]}, where the DFA M is defined in Example 1.2.
3. L(M) = {𝜔 ∈ {0, 1}∗ : #1’s is even and #0’s is even}, where the DFA is given in Example 1.8.
4. L(M) = {𝜔 ∈ {0, 1}∗ : 𝜔 contains 101 as substring}, where the DFA is defined in Example 1.9.
5. L(M) = {𝜔 ∈ {0, 1}∗ : decimal value of 𝜔 is congruent to 2 modulo 3}, where the DFA is defined in Example 1.10.
6. L(M) = {𝜔 ∈ {𝑎, 𝑏}∗ : 𝜔 does not contain three consecutive b’s}, where the DFA is defined in Example 1.11.
7. L(M) = {𝜔 ∈ {0, 1}∗ : sum of binary digits in 𝜔 is congruent to 2 modulo 3}, where the DFA is defined in Example 1.12.
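Claims of the form "this DFA accepts exactly this language" can be sanity-checked by brute force on short strings. The sketch below does this for item 5, under stated assumptions: the three states are the remainders 0, 1, 2, the start state is remainder 0, the only final state is remainder 2, and reading a bit 𝑏 in remainder 𝑟 leads to (2𝑟 + 𝑏) mod 3, because appending a bit to a binary numeral doubles its value and adds the bit. The transition rule is not spelled out in the text, so it is an assumption here, and the function names are hypothetical.

```python
from itertools import product


def delta(remainder, bit):
    """Appending bit b to a binary numeral of value v gives 2*v + b."""
    return (2 * remainder + int(bit)) % 3


def accepts(word):
    """Run the three-state machine from remainder 0; accept in remainder 2."""
    remainder = 0
    for bit in word:
        remainder = delta(remainder, bit)
    return remainder == 2


def brute_force_agrees(max_len=10):
    """Compare the machine with the arithmetic definition on all nonempty
    binary strings of length at most `max_len`."""
    for n in range(1, max_len + 1):
        for bits in product("01", repeat=n):
            word = "".join(bits)
            if accepts(word) != (int(word, 2) % 3 == 2):
                return False
    return True


if __name__ == "__main__":
    print(brute_force_agrees())
```

Such a check is not a proof, but a mismatch on a short string would immediately expose a wrong transition.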
While we have constructed several deterministic finite automata, there are still specific languages for
which DFAs need to be designed. Can we design DFAs for the languages Σ∗ , ∅, {𝜖 }, {𝜔} for any 𝜔 ∈ Σ∗ ?
Yes, the DFAs are presented below.
[Figures: transition diagrams of the DFAs for Σ∗, ∅ and {𝜖} over Σ = {𝑎, 𝑏}.]
4. For 𝐿 = {𝜔}, where 𝜔 ∈ {𝑎, 𝑏}∗ is a fixed string. Let 𝜔 = 𝜔1 · · · 𝜔 𝑛 . The machine M is given by
M = (𝑄, Σ, 𝛿, 𝑠, 𝐹), where 𝑄 = {𝑞 0 , 𝑞 1 , . . . , 𝑞 𝑛 , 𝑞 𝑛+1 }, Σ = {𝑎, 𝑏}, 𝑠 = 𝑞 0 , 𝐹 = {𝑞 𝑛 } and 𝛿 : 𝑄 × Σ −→ 𝑄
is given as follows.
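[Figure: transition diagram of the DFA for 𝐿 = {𝑎𝑎𝑏𝑏}, with states 𝑞0, ..., 𝑞5; the path 𝑞0 → 𝑞1 → 𝑞2 → 𝑞3 → 𝑞4 spells 𝑎𝑎𝑏𝑏, and every other transition leads to 𝑞5.]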
Definition 1.5 (Regular). A language 𝐿 ⊆ Σ∗ is said to be regular, if there exists a DFA M such that
L (M) = 𝐿.
So, the languages accepted by all the DFAs discussed so far are regular. In particular, the languages
Σ∗ , ∅, {𝜖 }, {𝜔} for any 𝜔 ∈ Σ∗ are regular.
Theorem 1.1. Regular languages are closed under union, intersection and complement.
Proof. Let 𝐿 1 and 𝐿 2 be two regular languages over the same alphabet Σ. So, there exist two DFAs M1
and M2 such that L (M1 ) = 𝐿 1 and L (M2 ) = 𝐿 2 . Let M1 = (𝑄 1 , Σ, 𝛿1 , 𝑠1 , 𝐹1 ) and M2 = (𝑄 2 , Σ, 𝛿2 , 𝑠2 , 𝐹2 ).
1. Union. We show that there exists a DFA M such that L(M) = 𝐿1 ∪ 𝐿2. In fact, the DFA M is defined using M1 and M2 as follows. M = (𝑄, Σ, 𝛿, 𝑠, 𝐹), where 𝑄 = 𝑄1 × 𝑄2, 𝑠 = (𝑠1, 𝑠2), 𝐹 = (𝐹1 × 𝑄2) ∪ (𝑄1 × 𝐹2) and 𝛿 : 𝑄 × Σ → 𝑄 is given by 𝛿((𝑞1, 𝑞2), 𝜎) = (𝛿1(𝑞1, 𝜎), 𝛿2(𝑞2, 𝜎)) for any (𝑞1, 𝑞2) ∈ 𝑄 and 𝜎 ∈ Σ. We claim that L(M) = L(M1) ∪ L(M2): exercise.
2. Intersection. We have to show that there exists a DFA M such that L(M) = 𝐿1 ∩ 𝐿2. The description of the DFA M is exactly the same as above, except that 𝐹 = 𝐹1 × 𝐹2. We claim that L(M) = L(M1) ∩ L(M2): exercise.
3. Complement. We have to show that the complement Σ∗ \ 𝐿1 is regular. It is easy to see that the DFA M1′ = (𝑄1, Σ, 𝛿1, 𝑠1, 𝐹1′), where 𝐹1′ = 𝑄1 \ 𝐹1, recognizes Σ∗ \ 𝐿1.
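The three constructions in the proof translate almost literally into code. The following Python sketch is a minimal illustration, assuming a DFA is stored as a tuple (states, alphabet, delta, start, finals) with delta a dictionary keyed by (state, symbol); this representation and the helper `parity_dfa` are hypothetical choices made for the example, not part of the handout.

```python
from itertools import product


def product_dfa(m1, m2, mode):
    """Product automaton for the union or intersection of two DFAs over the same alphabet."""
    q1, sigma, d1, s1, f1 = m1
    q2, _, d2, s2, f2 = m2
    states = set(product(q1, q2))
    # delta((a, b), c) = (delta1(a, c), delta2(b, c)): both machines run in parallel.
    delta = {((a, b), c): (d1[(a, c)], d2[(b, c)]) for (a, b) in states for c in sigma}
    if mode == "union":
        finals = {(a, b) for (a, b) in states if a in f1 or b in f2}
    elif mode == "intersection":
        finals = {(a, b) for (a, b) in states if a in f1 and b in f2}
    else:
        raise ValueError(mode)
    return states, sigma, delta, (s1, s2), finals


def complement_dfa(m):
    """Same machine with final and non-final states swapped."""
    states, sigma, delta, start, finals = m
    return states, sigma, delta, start, set(states) - set(finals)


def accepts(m, word):
    _, _, delta, state, finals = m
    for c in word:
        state = delta[(state, c)]
    return state in finals


def parity_dfa(symbol):
    """Hypothetical helper: DFA over {0, 1} accepting words with an even number of `symbol`."""
    flip = {"even": "odd", "odd": "even"}
    delta = {(s, c): (flip[s] if c == symbol else s) for s in flip for c in "01"}
    return {"even", "odd"}, {"0", "1"}, delta, "even", {"even"}


if __name__ == "__main__":
    both = product_dfa(parity_dfa("0"), parity_dfa("1"), "intersection")
    print(accepts(both, "0011"), accepts(both, "001"))
```

With the two parity machines as inputs, the intersection variant behaves like the four-state machine for "even number of 1’s and even number of 0’s", and the union variant accepts as soon as either count is even.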
Remark 1.17. Cannot the case of intersection be handled using complement and union? Yes, in fact,
𝐿 1 ∩ 𝐿 2 = Σ∗ \ ((Σ∗ \ 𝐿 1 ) ∪ (Σ∗ \ 𝐿 2 )). A similar expression can be defined for 𝐿 1 ∪ 𝐿 2 .
Remark 1.18. The approach we used for union and intersection can be viewed as running both machines
in parallel on the same input string and then determining acceptance or rejection based on the state of
each machine.
Corollary 1.2. All finite languages over Σ are regular.
Exercise 1.19. Let 𝐿 1 and 𝐿 2 be two regular languages. Show that 𝐿 1 \ 𝐿 2 (set difference) and 𝐿 1 Δ𝐿 2
(symmetric difference) are regular.
Remark 1.20. Note that the constructions used in the proof of Theorem 1.1 serve as a tool for designing
DFAs for complex regular languages. We illustrate this with some examples. Suppose we are interested in
designing DFAs for the following two languages:
So, if we can design DFAs for 𝐿 1 and 𝐿 2 , then using the construction rules in the proof of Theorem 1.1,
we can design DFAs for 𝐿 1 ∪ 𝐿 2 and 𝐿 1 ∩ 𝐿 2 . The details are given in Figure 10.
Exercise 1.21. Design DFAs for the following languages.
1. {𝑎^𝑖 𝑏^𝑗 𝑐^𝑘 : 𝑖, 𝑗, 𝑘 ≥ 0}.
4. {𝜔 ∈ {𝑎, 𝑏}∗ : 𝜔 has at least three a’s and at least two b’s}.
5. {𝜔 ∈ {0, 1}∗ : 𝜔 contains either an even number of 0’s or an even number of 1’s}.
Figure 10: Illustrations of machines recognizing the union, intersection, and complement of given languages.
Exercise 1.22. Suppose a DFA has 𝑛 many states and 𝑚 many symbols in its alphabet. How many bits
are required to store the transition function of the DFA?
Remark 1.23. In the context of DFA, there is no difference between a decider and a recognizer. So, given
any regular language 𝐿, we can find a DFA M that decides 𝐿. Further, given any string 𝜔, the machine M
decides 𝜔 in linear time in |𝜔|.
[Figure: transition diagram of a DFA on states 𝑞0, 𝑞1, 𝑞2, 𝑞3 whose acceptance is determined by the 2nd input symbol.]
The working of this machine is simple. After scanning the 2nd input symbol, the machine can decide³ whether the input string is accepted or not. But if we want to design a DFA that accepts those strings whose 2nd bit from the RHS is 1, the same mechanism will not work. While running on a valid⁴ input string, the machine should be in a state from which it moves to a final/accepting state when the last symbol is read. To achieve this, we use four states labeled 00, 01, 10 and 11. The state labeled 00 means that the last two scanned symbols are both 0. Similarly, the state labeled 10 means that the last two scanned symbols are 1 and 0, in that order. In other words, the states keep track of the last two symbols scanned by the machine. Now look at the machine given below. At any given time, if the machine is in state 𝑎𝑏, where 𝑎, 𝑏 ∈ {0, 1}, then the last two symbols scanned were 𝑎 and 𝑏, respectively. Therefore, the input string is accepted exactly when the DFA finishes scanning in state 10 or 11.
[Figure: transition diagram of the DFA with states 00, 01, 10 and 11.]
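The "remember the last two symbols" idea has a very compact transition rule: from state 𝑎𝑏, reading 𝑐 leads to state 𝑏𝑐. The Python sketch below illustrates this under one assumption that the text does not state: the start state is taken to be 00, as if two 0's preceded the input, so that strings shorter than two symbols are rejected. The names `delta`, `accepts` and `spec` are hypothetical.

```python
from itertools import product

START = "00"              # assumption: behave as if two 0's preceded the input
ACCEPTING = {"10", "11"}  # the second-to-last scanned symbol is 1


def delta(state, symbol):
    """Shift the two-symbol window: drop the older symbol, append the new one."""
    return state[1] + symbol


def accepts(word):
    state = START
    for symbol in word:
        state = delta(state, symbol)
    return state in ACCEPTING


def spec(word):
    """Direct definition: the 2nd bit from the RHS exists and equals 1."""
    return len(word) >= 2 and word[-2] == "1"


if __name__ == "__main__":
    words = ["".join(p) for n in range(6) for p in product("01", repeat=n)]
    print(all(accepts(w) == spec(w) for w in words))
```

The same window idea with three remembered symbols gives an eight-state machine for the 3rd bit from the RHS.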
Claim. Let 𝐿 = {𝜔 ∈ {0, 1}∗ : the 2nd bit of 𝜔 from the RHS is 1}. Any DFA that accepts the language 𝐿 must have at least 4 states. Exercise.
³ But, the decision will be made after reading all the input symbols.
⁴ It means the 2nd bit is 1 from the RHS.
Exercise 2.1. Design a DFA that accepts those strings whose 3rd bit from the RHS is 1. Show that your
DFA will have at least 8 states.
Similarly, we can argue that the DFAs that accept the language
using only four states instead of 8 states. Of course, we have to define what it means for this special machine to accept a string.
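[Figure: transition diagram of the machine N on states 𝑞0, 𝑞1, 𝑞2, 𝑞3: a self-loop on 0, 1 at 𝑞0, a transition 𝑞0 → 𝑞1 on 1, and transitions 𝑞1 → 𝑞2 and 𝑞2 → 𝑞3 on 0, 1.]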
Notice that this machine differs from DFAs exactly in its state transitions. When the machine is at 𝑞0 and the input symbol is 1, the machine has two options for its next state, either 𝑞0 or 𝑞1. Note that in the case of a DFA, given any string, whether it is valid or invalid, the DFA always follows a unique transition path. But for this special machine, the transition path may not be unique. For this reason, this type of machine is called a nondeterministic finite automaton (NFA). For example, suppose the input string is 00100. Then the machine could follow either of two different paths:
1. (𝑞0, 00100) ⊢_N (𝑞0, 0100) ⊢_N (𝑞0, 100) ⊢_N (𝑞0, 00) ⊢_N (𝑞0, 0) ⊢_N (𝑞0, 𝜖),
2. (𝑞0, 00100) ⊢_N (𝑞0, 0100) ⊢_N (𝑞0, 100) ⊢_N (𝑞1, 00) ⊢_N (𝑞2, 0) ⊢_N (𝑞3, 𝜖).
So, for the same input string, the machine N sometimes reaches a final state (𝑞3) and sometimes a non-final state (𝑞0). Although 00100 is a valid string, some computations of N end in a final state and others do not. But, for a valid string, the machine always has at least one path from the initial state to a final state (called an acceptance path). We say that a machine of the above kind accepts a string if and only if there exists at least one acceptance path. Therefore, the above machine accepts the string 00100. On the other hand, for any invalid input string, the machine never reaches the final state 𝑞3 after scanning all its input symbols. You might wonder what happens if the machine operates on an invalid string, say, 1011. In one case, the machine stays at state 𝑞0 throughout and successfully scans all the symbols. In this case, we say the machine halts. In another case, the machine reaches 𝑞3 after scanning 101 and gets stuck there, because no transition is defined when the machine is at state 𝑞3. So, in this case, the machine does not halt. Hence, in either situation, the machine does not accept 1011, as required.
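The notion of acceptance by "at least one acceptance path" can be made concrete by enumerating all computation paths. The following Python sketch does this for the four-state machine N discussed above. The transition table is transcribed from the description in the text (two choices at 𝑞0 on input 1, no transitions out of 𝑞3), while the function names and the convention that a stuck path is simply cut short are assumptions of this sketch.

```python
# delta maps (state, symbol) to the set of possible next states;
# missing entries mean that no move is possible.
DELTA = {
    ("q0", "0"): {"q0"},
    ("q0", "1"): {"q0", "q1"},
    ("q1", "0"): {"q2"},
    ("q1", "1"): {"q2"},
    ("q2", "0"): {"q3"},
    ("q2", "1"): {"q3"},
}
START, FINAL = "q0", {"q3"}


def paths(word, state=START):
    """Yield every computation path as a tuple of states.

    A path that gets stuck before the input is exhausted is cut short,
    so it has fewer than len(word) + 1 states."""
    if not word:
        yield (state,)
        return
    successors = DELTA.get((state, word[0]), set())
    if not successors:
        yield (state,)  # stuck: input remains but no transition is defined
        return
    for nxt in sorted(successors):
        for rest in paths(word[1:], nxt):
            yield (state,) + rest


def accepts(word):
    """Accept iff some path consumes the whole input and ends in a final state."""
    return any(len(p) == len(word) + 1 and p[-1] in FINAL for p in paths(word))


if __name__ == "__main__":
    for p in paths("00100"):
        print(" -> ".join(p))
    print(accepts("00100"), accepts("1011"))
```

On 00100 the enumeration yields exactly the two paths discussed in the text; on 1011 one of the paths stops early at 𝑞3 and none is an acceptance path.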
Before going to the formal definition of an NFA, let us look at another example. Consider a machine for the language 𝐿 = {𝑎𝑏𝑎, 𝑎𝑏}∗. It can be shown that the following DFA accepts 𝐿.
[Figure: transition diagram of the DFA for 𝐿 on states 𝑞0, 𝑞1, 𝑞2, 𝑞3, 𝑞4.]
The above DFA looks a bit complicated. But its nondeterministic counterpart (see Figure 11) has a simple description.
[Figure 11: transition diagram of the NFA for 𝐿 on states 𝑞0, 𝑞1, 𝑞2.]
Although we have not introduced the formal definition of an NFA, one can easily see that a DFA can be treated as an NFA. Let us now pose a question. How much more powerful are NFAs than their deterministic counterparts? Do some NFAs recognize languages that are not recognized by any DFA? The answer is negative. We will see the answer concretely as we progress.
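Definition 2.1 (NFA). A nondeterministic finite automaton (NFA) is a quintuple N = (𝑄, Σ, 𝛿, 𝑠, 𝐹), where
1. 𝑄 is a finite set of states,
2. Σ is an alphabet,
3. 𝑠 ∈ 𝑄 is the initial state,
4. 𝐹 ⊆ 𝑄 is the set of final states, and
5. 𝛿 is the transition function⁵ from 𝑄 × (Σ ∪ {𝜖}) to P(𝑄), where P(𝑄) denotes the power set of 𝑄.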
Remark 2.2. Unlike DFAs, for a given input symbol, NFAs can have more than one next state, and that is why the name “nondeterminism” is attached to this kind of machine. Essentially, the next states are represented by a subset of 𝑄. It may also happen that the machine cannot move at all, which is captured by ∅. For example, see (below) the transition table of the NFA in Figure 11.
state    𝑎       𝑏
𝑞0       {𝑞1}    ∅
𝑞1       ∅       {𝑞0, 𝑞2}
𝑞2       {𝑞0}    ∅
⁵ In Lewis and Papadimitriou [LP06], a transition relation, i.e., a subset of 𝑄 × (Σ ∪ {𝜖}) × 𝑄, is considered.
Further, under a special type of transition, called 𝜖-transition, a machine can move to some state(s)
without consuming any symbol. When a machine has at least one 𝜖-transition, the machine is referred to
as 𝜖-NFA. The transition table of the 𝜖-NFA in Figure 12 is given below. What language does this 𝜖-NFA
accept?
state    0       1           𝜖
𝑞0       {𝑞0}    {𝑞0, 𝑞1}    ∅
𝑞1       {𝑞2}    ∅           {𝑞2}
𝑞2       ∅       {𝑞3}        ∅
𝑞3       {𝑞3}    {𝑞3}        ∅
[Figure 12: transition diagram of the 𝜖-NFA on states 𝑞0, 𝑞1, 𝑞2, 𝑞3.]
Figure 13: Different computation paths of the NFA defined in Figure 12 on input 010110.
As mentioned earlier, an NFA may follow different paths on the same input, in contrast to a DFA. When the machine reaches a final state after scanning all its input symbols, we say that the NFA accepts the input string. The machine may enter a final state along several different computation paths. For example, Figure 13 shows the computation of the NFA (defined in Figure 12) on the input string 010110. For each leaf node, there is exactly one computation path from the initial state. Some of the paths are incomplete, as the next state is not defined. After scanning the input 010110, the NFA reaches the final state 𝑞3 along two different paths (shown by bold arrows).
Exercise 2.3. Let N be an NFA and 𝜔 be an input string of length 𝑛. How many computation paths by
the NFA N on 𝜔 are possible?
Remark 2.4. We have seen that a DFA always halts after reading all the symbols of the input. Does an NFA always halt on an input? The answer is no. We have already seen some cases where an NFA does not halt, for example, the NFA in Figure 11. Indeed, the head of the machine gets stuck at state 𝑞1 indefinitely if the currently scanned symbol is 𝑎. Can you modify a given NFA to make it halt on any input? Answer: Exercise.
Exercise 2.5. Solve the following problems, where Σ = {0, 1}.
(a) Can you design a DFA M with a single final state such that L (M) = 𝐿?
(b) Can you design a DFA M using only two states such that L (M) = 𝐿?
(c) Design an NFA N such that L (N) = 𝐿.
(d) Can you design an NFA N with a single final state such that L (N) = 𝐿?
(a) Can you design a DFA M with a single final state such that L (M) = 𝐿?
(b) Can you design a DFA M using two or three states such that L (M) = 𝐿?
(c) Can you design an NFA N with a single final state such that L (N) = 𝐿?
(d) Can you design an NFA N using two states such that L (N) = 𝐿?
Definition 2.3 (Equivalent). Two finite automata N1 and N2 (deterministic or nondeterministic) are
equivalent if and only if L (N1 ) = L (N2 ).
Theorem 2.1. For each nondeterministic finite automaton, there is an equivalent deterministic finite
automaton.
Proof Idea. The key idea is to view a nondeterministic finite automaton as occupying, at any moment, not a single state but a set of states: namely, all the states that can be reached from the initial state by means of the input consumed so far. We discuss the construction idea in two cases: NFAs without 𝜖-transitions and 𝜖-NFAs.
1. Without 𝜖-transition. We illustrate this case using the NFA N in Figure 11, whose formal description is as follows. N = (𝑄, Σ, 𝛿, 𝑠, 𝐹), where 𝑄 = {𝑞0, 𝑞1, 𝑞2}, Σ = {𝑎, 𝑏}, 𝑠 = 𝑞0, 𝐹 = {𝑞0} and the transition function 𝛿 : 𝑄 × Σ → P(𝑄) is given by 𝛿(𝑞0, 𝑎) = {𝑞1}, 𝛿(𝑞0, 𝑏) = ∅, 𝛿(𝑞1, 𝑎) = ∅, 𝛿(𝑞1, 𝑏) = {𝑞0, 𝑞2}, 𝛿(𝑞2, 𝑎) = {𝑞0}, and 𝛿(𝑞2, 𝑏) = ∅. Recall that a string 𝜔 is accepted by an NFA if there exists at least one acceptance path from the initial state to a final state. Let us look at the different paths for the string 𝑎𝑏𝑎. Basically, there are two paths.
1. (𝑞0, 𝑎𝑏𝑎) ⊢_N (𝑞1, 𝑏𝑎) ⊢_N (𝑞0, 𝑎) ⊢_N (𝑞1, 𝜖).
2. (𝑞0, 𝑎𝑏𝑎) ⊢_N (𝑞1, 𝑏𝑎) ⊢_N (𝑞2, 𝑎) ⊢_N (𝑞0, 𝜖) (acceptance path).
We have to make sure that the corresponding DFA M has a unique path and that, if a string is valid, then one of the final states is reached from the initial state after consuming all input symbols. Note that, starting from any state, the NFA may have many reachable states after scanning the input string. For example, the above NFA can reach 𝑞0 and 𝑞1 after scanning the string “aba”. If we take each state of the underlying DFA M to be a subset of 𝑄, then the machine M actually reaches the state {𝑞0, 𝑞1} from {𝑞0} (the initial state). If the new machine M is to accept the string “aba”, then {𝑞0, 𝑞1} must be one of its final states. In other words, considering the general case, a state 𝐴 ⊆ 𝑄 is a final state of M if and only if 𝐴 contains at least one final state of the given NFA N. Therefore, the target DFA is given by M = (𝑄′ = P(𝑄), Σ, 𝛿′, 𝑠′ = {𝑞0}, 𝐹′), where 𝐹′ = {𝐴 ⊆ 𝑄 : 𝐴 ∩ 𝐹 ≠ ∅} and the transition function 𝛿′ : 𝑄′ × Σ → 𝑄′ is defined by 𝛿′(𝑃, 𝜎) = ⋃_{𝑝 ∈ 𝑃} 𝛿(𝑝, 𝜎) for all 𝑃 ∈ 𝑄′ and 𝜎 ∈ Σ. The transition diagram and transition table of the DFA M are given below. Note that the states {𝑞2}, {𝑞1, 𝑞2} and {𝑞0, 𝑞1, 𝑞2} are redundant, since they are not reachable from the initial state {𝑞0}.
state              𝑎             𝑏
∅                  ∅             ∅
{𝑞0}               {𝑞1}          ∅
{𝑞1}               ∅             {𝑞0, 𝑞2}
{𝑞2}               {𝑞0}          ∅
{𝑞0, 𝑞1}           {𝑞1}          {𝑞0, 𝑞2}
{𝑞0, 𝑞2}           {𝑞0, 𝑞1}      ∅
{𝑞1, 𝑞2}           {𝑞0}          {𝑞0, 𝑞2}
{𝑞0, 𝑞1, 𝑞2}       {𝑞0, 𝑞1}      {𝑞0, 𝑞2}
[Figure: transition diagram of the DFA M on the eight subsets of {𝑞0, 𝑞1, 𝑞2}.]
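The subset construction for an NFA without 𝜖-transitions can be sketched as follows. This is an illustrative Python sketch, not the handout's own code: subsets are represented as `frozenset` objects, missing entries of `DELTA` stand for ∅, and the helper names (`powerset`, `subset_construction`, `reachable`, `accepts`) are hypothetical. The NFA is the one for {𝑎𝑏𝑎, 𝑎𝑏}∗ with 𝐹 = {𝑞0}.

```python
from itertools import combinations

Q = ["q0", "q1", "q2"]
SIGMA = ["a", "b"]
DELTA = {("q0", "a"): {"q1"}, ("q1", "b"): {"q0", "q2"}, ("q2", "a"): {"q0"}}
START, FINAL = "q0", {"q0"}


def powerset(items):
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]


def subset_construction():
    """Build the DFA whose states are all subsets of Q."""
    states = powerset(Q)
    delta = {}
    for subset in states:
        for sigma in SIGMA:
            target = set()
            for p in subset:  # union of delta(p, sigma) over p in the subset
                target |= DELTA.get((p, sigma), set())
            delta[(subset, sigma)] = frozenset(target)
    finals = {subset for subset in states if subset & FINAL}
    return states, delta, frozenset({START}), finals


def reachable(delta, start):
    """Subsets that can actually be reached from the initial subset."""
    seen, todo = {start}, [start]
    while todo:
        subset = todo.pop()
        for sigma in SIGMA:
            nxt = delta[(subset, sigma)]
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return seen


def accepts(word):
    _, delta, state, finals = subset_construction()
    for sigma in word:
        state = delta[(state, sigma)]
    return state in finals


if __name__ == "__main__":
    states, delta, start, finals = subset_construction()
    for subset in sorted(set(states) - reachable(delta, start), key=sorted):
        print("unreachable:", sorted(subset))
```

The unreachable subsets reported by the script are the redundant states; in practice one often builds only the reachable part of the subset automaton.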
2. With 𝜖-transition. Here, additionally, we have to handle 𝜖-transitions. Recall that, without consuming any symbol, the machine may move from one state to another (or to multiple states). It may also happen that the machine applies an 𝜖-transition at the very beginning. To tackle 𝜖-transitions, we define the 𝜖-closure of a state 𝑞, denoted by 𝐸(𝑞), as follows: 𝐸(𝑞) = {𝑝 ∈ 𝑄 : (𝑞, 𝜖) ⊢*_N (𝑝, 𝜖)}. Note that for any 𝑞 ∈ 𝑄, 𝑞 ∈ 𝐸(𝑞). The initial state of the corresponding DFA M will be 𝐸(𝑞0). The set of states of M is 𝑄′ = P(𝑄) as above. For the NFA in Figure 12, it can be checked that 𝐸(𝑞0) = {𝑞0}, 𝐸(𝑞1) = {𝑞1, 𝑞2}, 𝐸(𝑞2) = {𝑞2} and 𝐸(𝑞3) = {𝑞3}. The machine may apply an 𝜖-transition after scanning any symbol. For example, for the NFA in Figure 12, reading 1 in state 𝑞0 can lead to any state in {𝑞0, 𝑞1, 𝑞2}, where 𝑞2 is reached by applying an 𝜖-transition. Therefore, the state transition of the corresponding DFA has to take care of the 𝜖-transitions. In fact, the transition rule for M is defined as follows. For each 𝑃 ⊆ 𝑄 and each symbol 𝜎 ∈ Σ, we define
𝛿′(𝑃, 𝜎) = ⋃ {𝐸(𝑟) : 𝑟 ∈ 𝛿(𝑝, 𝜎) for some 𝑝 ∈ 𝑃}.
The transition table of the DFA M corresponding to the NFA in Figure 12 is described below.
0 1
∅ ∅ ∅
{𝑞 0 } {𝑞 0 } {𝑞 0 , 𝑞 1 , 𝑞 2 }
{𝑞 1 } {𝑞 2 } ∅
{𝑞 2 } ∅ {𝑞 3 }
{𝑞 3 } {𝑞 3 } {𝑞 3 }
{𝑞 0 , 𝑞 1 } {𝑞 0 , 𝑞 2 } {𝑞 0 , 𝑞 1 , 𝑞 2 }
{𝑞 0 , 𝑞 2 } {𝑞 0 } {𝑞 0 , 𝑞 1 , 𝑞 2 , 𝑞 3 }
{𝑞 0 , 𝑞 3 } {𝑞 0 , 𝑞 3 } {𝑞 0 , 𝑞 1 , 𝑞 2 , 𝑞 3 }
{𝑞 1 , 𝑞 2 } {𝑞 2 } {𝑞 3 }
{𝑞 1 , 𝑞 3 } {𝑞 2 , 𝑞 3 } {𝑞 3 }
{𝑞 2 , 𝑞 3 } {𝑞 3 } {𝑞 3 }
{𝑞 0 , 𝑞 1 , 𝑞 2 } {𝑞 0 , 𝑞 2 } {𝑞 0 , 𝑞 1 , 𝑞 2 , 𝑞 3 }
{𝑞 0 , 𝑞 1 , 𝑞 3 } {𝑞 0 , 𝑞 2 , 𝑞 3 } {𝑞 0 , 𝑞 1 , 𝑞 2 , 𝑞 3 }
{𝑞 0 , 𝑞 2 , 𝑞 3 } {𝑞 0 , 𝑞 3 } {𝑞 0 , 𝑞 1 , 𝑞 2 , 𝑞 3 }
{𝑞 1 , 𝑞 2 , 𝑞 3 } {𝑞 2 , 𝑞 3 } {𝑞 3 }
{𝑞 0 , 𝑞 1 , 𝑞 2 , 𝑞 3 } {𝑞 0 , 𝑞 2 , 𝑞 3 } {𝑞 0 , 𝑞 1 , 𝑞 2 , 𝑞 3 }
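Individual rows of the table above can be recomputed from the 𝜖-NFA's own transition table. The Python sketch below is a minimal illustration under stated assumptions: 𝜖-moves are stored under the empty-string label `EPS`, missing dictionary entries stand for ∅, the initial state is 𝑞0 and the final state is 𝑞3, and the function names `closure`, `step` and `accepts` are hypothetical.

```python
EPS = ""  # label used in this sketch for epsilon-moves

DELTA = {
    ("q0", "0"): {"q0"}, ("q0", "1"): {"q0", "q1"},
    ("q1", "0"): {"q2"}, ("q1", EPS): {"q2"},
    ("q2", "1"): {"q3"},
    ("q3", "0"): {"q3"}, ("q3", "1"): {"q3"},
}


def closure(q):
    """E(q): all states reachable from q using epsilon-moves only."""
    seen, todo = {q}, [q]
    while todo:
        p = todo.pop()
        for r in DELTA.get((p, EPS), set()):
            if r not in seen:
                seen.add(r)
                todo.append(r)
    return frozenset(seen)


def step(subset, sigma):
    """One DFA move: read sigma from some state of the subset, then close under epsilon."""
    out = set()
    for p in subset:
        for r in DELTA.get((p, sigma), set()):
            out |= closure(r)
    return frozenset(out)


def accepts(word, start="q0", final=frozenset({"q3"})):
    subset = closure(start)
    for sigma in word:
        subset = step(subset, sigma)
    return bool(subset & final)


if __name__ == "__main__":
    print(sorted(step(frozenset({"q0"}), "1")))
    print(accepts("010110"))
```

Computing `step` only for the subsets reachable from 𝐸(𝑞0) avoids listing all sixteen rows.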
Proof. We are now ready to give a formal proof of Theorem 2.1. Let N = (𝑄, Σ, 𝛿, 𝑠, 𝐹) be an NFA. We will construct a DFA M = (𝑄′, Σ, 𝛿′, 𝑠′, 𝐹′) that is equivalent to N.
Construction. For any state 𝑞 ∈ 𝑄, let 𝐸(𝑞) be the set of all states of N that are reachable from state 𝑞 without reading any input symbol. That is, 𝐸(𝑞) = {𝑝 ∈ 𝑄 : (𝑞, 𝜖) ⊢*_N (𝑝, 𝜖)}. This set is nonempty, as 𝑞 ∈ 𝐸(𝑞). The corresponding DFA M = (𝑄′, Σ, 𝛿′, 𝑠′, 𝐹′) is given by: 𝑄′ = 2^𝑄, 𝑠′ = 𝐸(𝑠), 𝐹′ = {𝐴 ⊆ 𝑄 : 𝐴 ∩ 𝐹 ≠ ∅}, and for each 𝑃 ⊆ 𝑄 and each symbol 𝜎 ∈ Σ, define
𝛿′(𝑃, 𝜎) = ⋃ {𝐸(𝑟) : 𝑟 ∈ 𝛿(𝑝, 𝜎) for some 𝑝 ∈ 𝑃}.
Claim. For any string 𝜔 ∈ Σ∗ and any states 𝑝, 𝑞 ∈ 𝑄, (𝑞, 𝜔) ⊢*_N (𝑝, 𝜖) if and only if (𝐸(𝑞), 𝜔) ⊢*_M (𝑃, 𝜖) for some set 𝑃 containing 𝑝. We prove the claim by induction on |𝜔|.
Base Step. For |𝜔| = 0, that is, for 𝜔 = 𝜖, we must show that (𝑞, 𝜖) ⊢*_N (𝑝, 𝜖) if and only if (𝐸(𝑞), 𝜖) ⊢*_M (𝑃, 𝜖) for some set 𝑃 containing 𝑝. The first statement is equivalent to saying that 𝑝 ∈ 𝐸(𝑞). Since M is deterministic, the second statement is equivalent to saying that 𝑃 = 𝐸(𝑞) and 𝑃 contains 𝑝. This completes the proof of the base step.
Induction Hypothesis. Suppose that the claim is true for all strings 𝜔 of length 𝑘 or less for some 𝑘 ≥ 0.
Induction Step. We prove the claim for any string 𝜔 of length 𝑘 + 1. Let 𝜔 = 𝑣𝑎, where 𝑎 ∈ Σ and 𝑣 ∈ Σ∗.
(⟹) Suppose that (𝑞, 𝜔) ⊢*_N (𝑝, 𝜖). Then there are states 𝑟1 and 𝑟2 such that
(𝑞, 𝑣𝑎) ⊢*_N (𝑟1, 𝑎) ⊢_N (𝑟2, 𝜖) ⊢*_N (𝑝, 𝜖).
That is, N reaches state 𝑟1 from state 𝑞 by some number of moves during which input 𝑣 is read, followed by one move during which input 𝑎 is read, followed by some number of moves during which no input is read. Now (𝑞, 𝑣𝑎) ⊢*_N (𝑟1, 𝑎) is equivalent to (𝑞, 𝑣) ⊢*_N (𝑟1, 𝜖). Since |𝑣| = 𝑘, by the induction hypothesis (𝐸(𝑞), 𝑣) ⊢*_M (𝑅1, 𝜖) for some set 𝑅1 containing 𝑟1. Since (𝑟1, 𝑎) ⊢_N (𝑟2, 𝜖), we have 𝑟2 ∈ 𝛿(𝑟1, 𝑎), and hence by the construction of M, 𝐸(𝑟2) ⊆ 𝛿′(𝑅1, 𝑎). But since (𝑟2, 𝜖) ⊢*_N (𝑝, 𝜖), it follows that 𝑝 ∈ 𝐸(𝑟2), and therefore 𝑝 ∈ 𝛿′(𝑅1, 𝑎). Therefore, (𝑅1, 𝑎) ⊢_M (𝑃, 𝜖) for some 𝑃 (= 𝛿′(𝑅1, 𝑎)) containing 𝑝, and thus (𝐸(𝑞), 𝑣𝑎) ⊢*_M (𝑅1, 𝑎) ⊢_M (𝑃, 𝜖).
(⟸) To prove the other direction, suppose that (𝐸(𝑞), 𝑣𝑎) ⊢*_M (𝑅1, 𝑎) ⊢_M (𝑃, 𝜖) for some 𝑃 containing 𝑝 and some 𝑅1 such that 𝛿′(𝑅1, 𝑎) = 𝑃. Since 𝑝 ∈ 𝑃 = 𝛿′(𝑅1, 𝑎), there are a state 𝑟1 ∈ 𝑅1 and a state 𝑟2 ∈ 𝛿(𝑟1, 𝑎) of N such that 𝑝 ∈ 𝐸(𝑟2). Then (𝑟2, 𝜖) ⊢*_N (𝑝, 𝜖) by the definition of 𝐸(𝑟2). Also, by the induction hypothesis, (𝑞, 𝑣) ⊢*_N (𝑟1, 𝜖), and therefore (𝑞, 𝑣𝑎) ⊢*_N (𝑟1, 𝑎) ⊢_N (𝑟2, 𝜖) ⊢*_N (𝑝, 𝜖).
Corollary 2.2. A language is regular if and only if some NFA accepts it.
Exercise 2.6. Convert the following NFAs to the corresponding DFAs using the construction involved in
the proof of Theorem 2.1.
[Figures: transition diagrams of the NFAs to be converted.]
Exercise 2.7. Given a finite state automaton M, construct a finite state machine N with only a single
final/accepting state such that L (N) = L (M). Hint: 𝜖-transition.
References
[LP06] Harry R. Lewis and Christos H. Papadimitriou. Elements of the Theory of Computation. Pearson Education, 2006.
[Sip14] Michael Sipser. Introduction to the Theory of Computation. Cengage India Private Limited, 2014.