Department of Computer Science, Australian National University
COMP2600 — Formal Methods for Software Engineering
Semester 2, 2016
— Assignment 1 —
Automata, Languages, and Computability
Sample Solutions
1 Finite State Automata and Regular Languages
Problem 1. Consider the following NFA A over {a, b}:
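[Figure: transition diagram of A. Start state s0; accepting state s3. Its transitions are:]
          a            b
→ s0      {s1, s2}     {s2}
  s1      {s2}         {s3}
  s2      {s1, s2}     {s3}
∗ s3      ∅            ∅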
A is intended to recognise the language
L = {zαb | z ∈ {a, b}, α ∈ {a}∗ }
of strings over the alphabet {a, b} that start with either a or b, end with b and have an arbitrary
number of a’s between the first and the last letter.
(i) The following transition table shows a DFA A′_D that is obtained from A by the method of
“lazy” subset construction:
              a            b
  ∅           ∅            ∅
→ {s0}        {s1, s2}     {s2}
  {s1, s2}    {s1, s2}     {s3}
  {s2}        {s1, s2}     {s3}
∗ {s3}        ∅            ∅
Assignment 1 — Automata, Languages, and Computability 1
For the sake of convenience, let’s define a DFA A_D that is exactly like A′_D except that its
states are renamed. The following table shows A_D:
        a     b
  q4    q4    q4
→ q0    q1    q2
  q1    q1    q3
  q2    q1    q3
∗ q3    q4    q4
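The transition diagram of A_D is as follows:
[Figure: transition diagram of A_D. Start state q0; accepting state q3; q4 is a sink. Its
transitions are exactly those listed in the table above.]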
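The lazy subset construction of part (i) can be replayed mechanically. The following Python sketch is illustrative only: the dictionary NFA transcribes the transitions of A (they agree with the production rules derived in Problem 5), and the name lazy_subset_construction is our own choice. Only subsets reachable from {s0} are explored, which is what makes the construction "lazy".

```python
from collections import deque

# Transitions of the NFA A (assumed transcription of the diagram).
NFA = {
    ("s0", "a"): {"s1", "s2"},
    ("s0", "b"): {"s2"},
    ("s1", "a"): {"s2"},
    ("s1", "b"): {"s3"},
    ("s2", "a"): {"s1", "s2"},
    ("s2", "b"): {"s3"},
}
ALPHABET = ("a", "b")


def lazy_subset_construction(nfa, start, alphabet):
    """Return the DFA transition table restricted to reachable subsets."""
    table = {}
    queue = deque([frozenset([start])])
    while queue:
        current = queue.popleft()
        if current in table:
            continue  # this subset has already been expanded
        row = {}
        for sym in alphabet:
            # Union of the NFA moves of every state in the current subset.
            target = frozenset(t for s in current for t in nfa.get((s, sym), ()))
            row[sym] = target
            if target not in table:
                queue.append(target)
        table[current] = row
    return table


dfa = lazy_subset_construction(NFA, "s0", ALPHABET)
for subset, row in dfa.items():
    print(sorted(subset), {sym: sorted(t) for sym, t in row.items()})
```

The five subsets produced correspond to the five rows of the table for A′_D, including the empty set.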
(ii) The initial split differentiates non-final and final states:
[ [ q0 , q1 , q2 , q4 ], [ q3 ] ]
b splits [ q0 , q1 , q2 , q4 ]:
[ [ q0 , q4 ], [ q1 , q2 ], [ q3 ] ]
a (as well as b) splits [ q0 , q4 ]:
[ [ q0 ], [ q4 ], [ q1 , q2 ], [ q3 ] ]
Neither a nor b splits the only remaining non-singleton group [ q1 , q2 ]. Every other group
is a singleton. Therefore, the algorithm terminates. States q1 and q2 are equivalent.
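The refinement in part (ii) can also be checked by a short program. This is a minimal sketch under our own naming (DELTA, FINAL, refine); unlike the hand computation above, it splits each group on all symbols at once, so the intermediate partitions may differ while the final partition is the same.

```python
# Transition table of A_D, copied from the table in part (i).
DELTA = {
    "q0": {"a": "q1", "b": "q2"},
    "q1": {"a": "q1", "b": "q3"},
    "q2": {"a": "q1", "b": "q3"},
    "q3": {"a": "q4", "b": "q4"},
    "q4": {"a": "q4", "b": "q4"},
}
FINAL = {"q3"}


def refine(delta, final, alphabet=("a", "b")):
    """Partition refinement: split groups until no symbol separates two states."""
    partition = [set(delta) - set(final), set(final)]
    while True:
        def block_of(state):
            # Index of the group that currently contains the state.
            return next(i for i, block in enumerate(partition) if state in block)

        new_partition = []
        for block in partition:
            groups = {}
            for state in block:
                # States stay together only if every symbol leads to the same group.
                signature = tuple(block_of(delta[state][c]) for c in alphabet)
                groups.setdefault(signature, set()).add(state)
            new_partition.extend(groups.values())
        if len(new_partition) == len(partition):
            return new_partition
        partition = new_partition


classes = refine(DELTA, FINAL)
print(sorted(sorted(block) for block in classes))
```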
(iii) A_m is as follows:
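[Figure: transition diagram of the minimal DFA. Start state q0; accepting state q3; q1 stands for
the merged states q1 and q2; q4 is a sink. Its transitions are:]
        a     b
→ q0    q1    q1
  q1    q1    q3
∗ q3    q4    q4
  q4    q4    q4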
(iv) Prove that A_m recognises L.
Proof: We need to prove the following two statements:
(a) for all w, if δ∗(q0, w) = q3 then w ∈ L
(b) for all w, if w ∈ L then δ∗(q0, w) = q3
Each of these statements can be reformulated as follows:
(a) for all w, if δ∗(q0, w) = q3 then either w = aαb or w = bαb, where α ∈ {a}∗
(b) for all α, if α ∈ {a}∗ then δ∗(q0, aαb) = q3 and δ∗(q0, bαb) = q3
Below we prove each of these statements.
(a) • |w| = 0
Hence w = ε and δ∗(q0, ε) = q0. State q0 is distinct from q3, therefore the
antecedent of the conditional statement is false. Therefore, the whole statement is
true.
• |w| = 1
Hence, either w = a or w = b. In both cases δ∗(q0, w) = q1. State q1 is
distinct from q3, therefore the antecedent of the conditional statement is false.
Therefore, the whole statement is true.
• |w| > 1
Hence, w is of the form xαy, where x, y ∈ {a, b} and α ∈ {a, b}∗.
Assume δ∗(q0, xαy) = q3.
By the definition of δ∗ the following holds:
δ∗(q0, xαy) = δ∗(δ(q0, x), αy)
δ(q0, x) is defined for x ∈ {a, b} as follows:
δ(q0, a) = q1
δ(q0, b) = q1
Therefore, regardless of whether x is a or b, δ(q0, x) = q1. Hence,
δ∗(q0, xαy) = δ∗(q1, αy)
By the definition of δ∗ and the assumption,
δ∗(q1, αy) = δ(δ∗(q1, α), y) = q3
There is only one transition that leads to q3, namely δ(q1, b) = q3. Therefore y
must be b and δ∗(q1, α) must be q1.
Consequently, it is sufficient to prove the following statement:
for all α ∈ {a, b}∗, if δ∗(q1, α) = q1 then α ∈ {a}∗
Lemma 1 proves this statement. It follows that w = xαb with x ∈ {a, b} and α ∈ {a}∗, i.e. w ∈ L.
(b) Assume α ∈ {a}∗. We need to prove that δ∗(q0, aαb) = q3 and δ∗(q0, bαb) = q3. We
consider δ∗(q0, aαb) and δ∗(q0, bαb) separately:
• δ∗(q0, aαb) = δ∗(δ(q0, a), αb) = δ∗(q1, αb) = δ(δ∗(q1, α), b).
• δ∗(q0, bαb) = δ∗(δ(q0, b), αb) = δ∗(q1, αb) = δ(δ∗(q1, α), b).
In both cases we reach δ(δ∗(q1, α), b). By Lemma 2, δ∗(q1, α) = q1. Therefore,
δ(δ∗(q1, α), b) = δ(q1, b) = q3.
Lemma 1
For all β ∈ {a, b}∗, if δ∗(q1, β) = q1 then β ∈ {a}∗
Proof:
• |β| = 0. Then β = ε and β ∈ {a}∗.
• |β| > 0. This means that β = xα where x ∈ {a, b} and α ∈ {a, b}∗.
We need to prove the following: if δ∗(q1, xα) = q1 then xα ∈ {a}∗
Assume
δ∗(q1, xα) = q1 (1)
By the definition of δ∗ the following holds:
δ∗(q1, xα) = δ∗(δ(q1, x), α) (2)
For the sake of contradiction assume that x = b. Then
δ(q1, x) = δ(q1, b) = q3, and so
δ∗(q1, xα) = δ∗(q3, α) = q1
This means that from state q3 we reach state q1 after reading α. But all transitions from
q3 lead to the “sink” state q4, which is never left. Therefore δ∗(q3, α) = q1 is impossible.
We have reached a contradiction. Therefore x ≠ b, and x can only be equal to a, since a is the
only remaining symbol of the alphabet.
Hence we can rewrite δ∗(δ(q1, x), α) in (2) as follows:
δ∗(δ(q1, a), α)
Since δ(q1, a) = q1, assumption (1) gives:
δ∗(δ(q1, a), α) = δ∗(q1, α) = q1
The same argument now applies to the shorter string α: the only way to get from q1 back to q1
is a (possibly empty) sequence of transitions δ(q1, a) = q1. Therefore α ∈ {a}∗.
Hence, xα ∈ {a}∗.
Lemma 2
For all α ∈ {a}∗, δ∗(q1, α) = q1
Proof: Induction on the length of α.
• Base case: |α| = 0. δ∗(q1, ε) = q1
• Induction: |α| > 0.
Induction hypothesis: δ∗(q1, a^n) = q1, where a^n is the string consisting of
n a’s. We need to prove δ∗(q1, a^(n+1)) = q1.
δ∗(q1, a^(n+1)) = δ(δ∗(q1, a^n), a)
By the induction hypothesis δ∗(q1, a^n) = q1, hence
δ∗(q1, a^(n+1)) = δ(q1, a) = q1
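As a sanity check of part (iv), and not as a replacement for the proof, the claim can be tested by brute force on all short strings. In the sketch below the helper names run, in_L and agrees_up_to are our own, the transition dictionary is the one used in the proof, and membership in L is expressed with the regular expression [ab]a*b.

```python
import re
from itertools import product

# Transition function of the minimal DFA, as used in the proof above.
DELTA = {
    ("q0", "a"): "q1", ("q0", "b"): "q1",
    ("q1", "a"): "q1", ("q1", "b"): "q3",
    ("q3", "a"): "q4", ("q3", "b"): "q4",
    ("q4", "a"): "q4", ("q4", "b"): "q4",
}


def run(word, state="q0"):
    """Extended transition function: the state reached after reading word."""
    for symbol in word:
        state = DELTA[(state, symbol)]
    return state


def in_L(word):
    """Membership in L = { z alpha b | z in {a, b}, alpha in {a}* }."""
    return re.fullmatch(r"[ab]a*b", word) is not None


def agrees_up_to(max_len):
    """Check acceptance against the definition of L for every word up to max_len."""
    return all(
        (run("".join(w)) == "q3") == in_L("".join(w))
        for k in range(max_len + 1)
        for w in product("ab", repeat=k)
    )


print(agrees_up_to(10))
```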
Problem 2. Let L be the language
L = {w ∈ {0, 1}∗ | w ∈ {0}∗ or w ∈ {1}∗ }
Construct a deterministic finite automaton that accepts all strings that are in L, and rejects all
strings that are not in L.
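[Figure: transition diagram of the DFA. Start state q0; accepting states q0, q1 and q2; q3 is a
rejecting sink. Its transitions are:]
          0     1
→∗ q0     q1    q2
 ∗ q1     q1    q3
 ∗ q2     q3    q2
   q3     q3    q3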
Problem 4. Let L be the language
L = {w ∈ {0, 1}∗ | w has a 1 in the third position from the right}
Construct a non-deterministic finite automaton that accepts all strings that are in L, and rejects
all strings that are not in L.
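[Figure: transition diagram of the NFA. Start state q0; accepting state q3. Its transitions are:]
        0        1
→ q0    {q0}     {q0, q1}
  q1    {q2}     {q2}
  q2    {q3}     {q3}
∗ q3    ∅        ∅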
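The behaviour of this NFA can be explored by tracking the set of states it may be in. The sketch below is an illustration under our own naming (NFA, ACCEPTING, accepts); the transitions are those of the diagram: q0 loops on 0 and 1, guesses on a 1 that it is third from the right, and then reads exactly two more symbols.

```python
from itertools import product

# Transitions of the NFA for "1 in the third position from the right".
NFA = {
    ("q0", "0"): {"q0"},
    ("q0", "1"): {"q0", "q1"},
    ("q1", "0"): {"q2"},
    ("q1", "1"): {"q2"},
    ("q2", "0"): {"q3"},
    ("q2", "1"): {"q3"},
}
ACCEPTING = {"q3"}


def accepts(word):
    """Simulate the NFA by keeping the set of all currently possible states."""
    current = {"q0"}
    for symbol in word:
        current = {t for s in current for t in NFA.get((s, symbol), set())}
    return bool(current & ACCEPTING)


def agrees_up_to(max_len):
    """Compare the NFA with the definition of L on every word up to max_len."""
    for k in range(max_len + 1):
        for letters in product("01", repeat=k):
            word = "".join(letters)
            if accepts(word) != (len(word) >= 3 and word[-3] == "1"):
                return False
    return True


print(accepts("0100"), accepts("011"), agrees_up_to(10))
```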
Problem 3. Let L be the language
L = {w ∈ {0, 1}∗ | each block of 5 consecutive symbols contains at least two 0’s}
• Construct a deterministic finite automaton that accepts all strings that are in L, and rejects
all strings that are not in L.
• Explain in plain English the intuition behind the constructed automaton.
We first define a non-minimal DFA A_n that recognizes L. The name of a state records the last
(up to) four symbols that the automaton has processed. Thus, if the last four processed symbols
were 0010, the automaton is in the state named “0010”. Then δ(0010, 0) = 0100 and
δ(0010, 1) = 0101. The only non-accepting state is sink. Whenever the automaton is in a state
named with four symbols of which exactly three are 1s, it goes to the rejecting state sink on
input 1, because the last five symbols would then contain only one 0. State 1111 is treated
separately, as discussed below.
Since the automaton has to accept all strings of length smaller than 5, there are states for these
strings. The only accepted string containing four consecutive 1s is the string 1111 of length 4.
State 1111 is the accepting state corresponding to this string. If the first four 1’s of a string are
followed by any symbol, the automaton performs a transition to sink.
The initial state is named “”. It is the state for the empty string.
Thus the automaton is as follows:
state 0 1
→*“” 0 1
*0 00 01
*1 10 11
*00 000 001
*01 010 011
*10 100 101
*11 110 111
*000 0000 0001
*001 0010 0011
*010 0100 0101
*011 0110 0111
*100 1000 1001
*101 1010 1011
*110 1100 1101
*111 1110 1111
*0000 0000 0001
*0001 0010 0011
*0010 0100 0101
*0011 0110 0111
*0100 1000 1001
*0101 1010 1011
*0110 1100 1101
*0111 1110 sink
*1000 0000 0001
*1001 0010 0011
*1010 0100 0101
*1011 0110 sink
*1100 1000 1001
*1101 1010 sink
*1110 1100 sink
*1111 sink sink
sink sink sink
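The table above follows a simple rule, which the following sketch makes explicit. It is an illustration only (the names step, accepts, in_L and reachable_states are ours): a state is the string of the last up to four symbols, and a transition moves to sink as soon as the most recent five symbols contain fewer than two 0s.

```python
from itertools import product


def step(state, symbol):
    """Transition function of the non-minimal DFA described above."""
    if state == "sink":
        return "sink"
    block = state + symbol
    if len(block) < 5:
        return block          # fewer than five symbols read so far
    if block.count("0") < 2:
        return "sink"         # this block of five violates the condition
    return block[1:]          # remember only the last four symbols


def accepts(word):
    state = ""
    for symbol in word:
        state = step(state, symbol)
    return state != "sink"    # every state except sink is accepting


def in_L(word):
    """Direct definition: each block of 5 consecutive symbols has at least two 0s."""
    return all(word[i:i + 5].count("0") >= 2 for i in range(len(word) - 4))


def reachable_states():
    seen, todo = {""}, [""]
    while todo:
        state = todo.pop()
        for symbol in "01":
            target = step(state, symbol)
            if target not in seen:
                seen.add(target)
                todo.append(target)
    return seen


words = ["".join(w) for k in range(11) for w in product("01", repeat=k)]
print(len(reachable_states()), all(accepts(w) == in_L(w) for w in words))
```

The number of reachable states agrees with the number of rows of the table.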
A_n can be minimized. The equivalence classes of states are as follows:
[“”] [1111] [sink] [1] [0111] [1011] [111] [1110] [1101] [11] [010, 0010, 1010, 10]
[011, 0011] [110, 0110] [101, 0101] [01, 001, 0001, 1001] [0, 00, 000, 100, 0000, 0100, 1000, 1100]
From each class with more than one state we keep a single representative: 1010, 0011, 0110, 0101,
0001 and 0000, respectively. This gives us the following minimal automaton A_m:
state 0 1
→*“” 0000 1
*1 1010 11
*11 0110 111
*111 1110 1111
*0000 0000 0001
*0001 1010 0011
*0011 0110 0111
*0101 1010 1011
*0110 0000 1101
*0111 1110 sink
*1010 0000 0101
*1011 0110 sink
*1101 1010 sink
*1110 0000 sink
*1111 sink sink
sink sink sink
Here is A_m’s transition diagram:
[Figure: transition diagram of A_m. Start state “”; every state except sink is accepting. Its
transitions are exactly those listed in the table above.]
2 Grammar
Problem 5. From the NFA A given in Problem 1, derive a right-linear grammar using the
algorithm given in the lectures.
The grammar is as follows
G = ({a, b}, {S0 , S1 , S2 , S3 }, S0 , P )
where the set P consists of the following production rules
S0 → aS1
S0 → aS2
S0 → bS2
S1 → aS2
S1 → bS3
S2 → aS1
S2 → aS2
S2 → bS3
S3 → ε
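The rules above follow the usual construction: one nonterminal per state, one rule X → cY per transition from X to Y on c, and one rule X → ε per accepting state X. We assume this is the algorithm meant by the lectures. A minimal sketch with our own names (NFA, FINAL, right_linear_grammar):

```python
# Transitions and accepting states of the NFA A from Problem 1.
NFA = {
    ("s0", "a"): {"s1", "s2"},
    ("s0", "b"): {"s2"},
    ("s1", "a"): {"s2"},
    ("s1", "b"): {"s3"},
    ("s2", "a"): {"s1", "s2"},
    ("s2", "b"): {"s3"},
}
FINAL = {"s3"}


def right_linear_grammar(nfa, final):
    """Return the production rules as (left-hand side, right-hand side) pairs."""
    rules = []
    for state, symbol in sorted(nfa):
        for target in sorted(nfa[(state, symbol)]):
            # Transition state --symbol--> target becomes STATE -> symbol TARGET.
            rules.append((state.upper(), symbol + target.upper()))
    for state in sorted(final):
        # Accepting states may stop the derivation.
        rules.append((state.upper(), "ε"))
    return rules


for lhs, rhs in right_linear_grammar(NFA, FINAL):
    print(lhs, "→", rhs)
```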
Problem 2.6. Consider the grammar G = ({S, T}, {0, 1}, S, P), where P consists of the
following productions
S → 0S | 1T | 0
T → 1T | 1
Show that no string in the language L(G) contains the substring 10.
Proof: Let α be a sentential form derived using G. We show by induction on the length n of the
derivation of α that α does not contain any of the substrings 10, 1S, S0, SS, T0 or TS.
Base case: n = 1. The possible derivations of length 1 of sentential forms for G are S ⇒ 0S,
S ⇒ 1T, and S ⇒ 0. None of 0S, 1T and 0 contains 10, 1S, S0, SS, T0 or TS as a substring.
Inductive case: Let S ⇒∗ α ⇒ β be a derivation of β of length n + 1. By the inductive hypothesis, α
does not contain 10, 1S, S0, SS, T0 or TS as a substring.
• If the last derivation step uses S → 0S, since α does not contain 1S, SS or TS, the
derivation step cannot introduce 10, S0 or T0 in β.
• If the last derivation step uses S → 1T, since α does not contain S0 or SS, the derivation
step cannot introduce T0 or TS in β.
• If the last derivation step uses S → 0, since α does not contain 1S, SS or TS, the derivation
step cannot introduce 10, S0 or T0 in β.
• If the last derivation step uses T → 1T, since α does not contain T0 or TS, the derivation
step cannot introduce T0 or TS in β.
• If the last derivation step uses T → 1, since α does not contain T0 or TS, the derivation
step cannot introduce 10 or 1S in β.
In each case no other forbidden substring can be introduced either, so β contains none of them.
In particular, every string in L(G) is a sentential form without nonterminals, and therefore does
not contain the substring 10.
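The invariant used in this proof can be checked experimentally on all short sentential forms. The sketch below is only a finite sanity check with our own names (RULES, FORBIDDEN, sentential_forms); since no production shortens a sentential form, bounding the length still enumerates every form up to that length.

```python
# Productions of G and the substrings excluded by the invariant.
RULES = {"S": ["0S", "1T", "0"], "T": ["1T", "1"]}
FORBIDDEN = ("10", "1S", "S0", "SS", "T0", "TS")


def sentential_forms(max_len):
    """All sentential forms of length at most max_len derivable from S."""
    seen = {"S"}
    frontier = ["S"]
    while frontier:
        next_frontier = []
        for form in frontier:
            for i, symbol in enumerate(form):
                for rhs in RULES.get(symbol, ()):
                    # Rewrite the nonterminal at position i.
                    new_form = form[:i] + rhs + form[i + 1:]
                    if len(new_form) <= max_len and new_form not in seen:
                        seen.add(new_form)
                        next_frontier.append(new_form)
        frontier = next_frontier
    return seen


forms = sentential_forms(8)
violations = [f for f in forms if any(bad in f for bad in FORBIDDEN)]
words = sorted(f for f in forms if "S" not in f and "T" not in f)
print(len(violations), words[:5])
```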
3 Context Free Languages and Pushdown Automata
Problem 3.6. Show that the language {uawb | u, w ∈ {a, b}∗ , with |u| = |w|} is context free by
exhibiting a context free grammar that generates it.
A CFG that generates this language is G = ({a, b}, {S, T, X}, S, P), where P consists
of the following productions:
S → Tb
T → a | XTX
X → a | b
Problem 7. Consider the context-free grammar G = ({S, A, B}, {a, b, c}, P, S), where P
consists of the following productions
S → aA
A → BA | a
B → bS | cS
Construct a PDA M that accepts L(G) by empty stack. Draw the parse tree of G for the string
abaaa, and show the corresponding execution trace for M.
The parse tree for the string abaaa:
S
├─ a
└─ A
   ├─ B
   │  ├─ b
   │  └─ S
   │     ├─ a
   │     └─ A
   │        └─ a
   └─ A
      └─ a
M = ({q0, q1, q2}, q0, {q2}, {a, b, c}, {a, b, c, S, A, B, Z}, Z, δ) with δ defined as follows:
δ(q0, ε, Z) = {(q1, SZ)}
δ(q1, ε, S) = {(q1, aA)}
δ(q1, ε, A) = {(q1, BA), (q1, a)}
δ(q1, ε, B) = {(q1, bS), (q1, cS)}
δ(q1, a, a) = {(q1, ε)}
δ(q1, b, b) = {(q1, ε)}
δ(q1, c, c) = {(q1, ε)}
δ(q1, ε, Z) = {(q2, ε)}
The execution trace of M for the string abaaa is as follows:
(q0, abaaa, Z) ⊢ (q1, abaaa, SZ)
⊢ (q1, abaaa, aAZ)
⊢ (q1, baaa, AZ)
⊢ (q1, baaa, BAZ)
⊢ (q1, baaa, bSAZ)
⊢ (q1, aaa, SAZ)
⊢ (q1, aaa, aAAZ)
⊢ (q1, aa, AAZ)
⊢ (q1, aa, aAZ)
⊢ (q1, a, AZ)
⊢ (q1, a, aZ)
⊢ (q1, ε, Z)
⊢ (q2, ε, ε)
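The trace above shows one successful run; a simulator has to search over all nondeterministic choices. The following sketch (our own names EPS, DELTA, accepts_by_empty_stack) explores configurations breadth-first, with the stack written as a string whose leftmost character is the top. The pruning rule is our addition and is specific to this grammar: every stack symbol other than Z needs at least one input symbol before it can disappear.

```python
from collections import deque

EPS = ""  # the empty string stands for ε
DELTA = {
    ("q0", EPS, "Z"): [("q1", "SZ")],
    ("q1", EPS, "S"): [("q1", "aA")],
    ("q1", EPS, "A"): [("q1", "BA"), ("q1", "a")],
    ("q1", EPS, "B"): [("q1", "bS"), ("q1", "cS")],
    ("q1", "a", "a"): [("q1", EPS)],
    ("q1", "b", "b"): [("q1", EPS)],
    ("q1", "c", "c"): [("q1", EPS)],
    ("q1", EPS, "Z"): [("q2", EPS)],
}


def accepts_by_empty_stack(word):
    """Breadth-first search over configurations (state, remaining input, stack)."""
    start = ("q0", word, "Z")
    seen, queue = {start}, deque([start])
    while queue:
        state, rest, stack = queue.popleft()
        if not stack:
            if not rest:
                return True   # input consumed and stack empty
            continue
        top, below = stack[0], stack[1:]
        moves = [(0, m) for m in DELTA.get((state, EPS, top), [])]
        if rest:
            moves += [(1, m) for m in DELTA.get((state, rest[0], top), [])]
        for used, (new_state, pushed) in moves:
            config = (new_state, rest[used:], pushed + below)
            # Prune: more non-Z stack symbols than input symbols cannot be emptied.
            if len(config[2].rstrip("Z")) > len(config[1]):
                continue
            if config not in seen:
                seen.add(config)
                queue.append(config)
    return False


print(accepts_by_empty_stack("abaaa"), accepts_by_empty_stack("ab"))
```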
4 OPTIONAL: Turing Machine and Computability
Problem 8. Design a 2-tape Turing machine accepting the language of all strings over {0, 1} that
have an equal number of 0’s and 1’s. The first tape contains the input, and is scanned from left to
right. The second tape is used to keep track of the difference between the number of 0’s and 1’s in
the part of the input seen so far.
M = ({q0 , q1 , q2 }, q0 , {q2 }, {Λ, 0, 1, #}, {0, 1}, Λ, δ)
The machine starts in state q0 . If the input string is empty, it goes directly to accepting state q2 . If
the input string is non-empty, the machine writes # on the second tape and goes to state q1 .
• for each 0 on tape 1, the machine moves left on tape 2
• for each 1 on tape 1, the machine moves right on tape 2
If, after the whole word has been scanned on tape 1, the head of tape 2 is on symbol #, the machine
accepts by going to accepting state q2; otherwise it rejects.
δ(q0, [Λ, Λ]) = (q2, (Λ, R), (Λ, R))
δ(q0, [0, Λ]) = (q1, (0, R), (#, L))
δ(q0, [1, Λ]) = (q1, (1, R), (#, R))
δ(q1, [0, Λ]) = (q1, (0, R), (Λ, L))
δ(q1, [1, Λ]) = (q1, (1, R), (Λ, R))
δ(q1, [0, #]) = (q1, (0, R), (#, L))
δ(q1, [1, #]) = (q1, (1, R), (#, R))
δ(q1, [Λ, #]) = (q2, (Λ, R), (#, R))
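The transition table can be executed directly. The simulator below is a sketch under two assumptions of ours: both tapes are unbounded in both directions (tape 2 must allow moves to the left of its starting cell), and the machine rejects when no transition is defined. The names BLANK, DELTA and accepts are illustrative.

```python
BLANK = "Λ"
L, R = -1, +1

# (state, symbol on tape 1, symbol on tape 2) -> (state, (write1, move1), (write2, move2))
DELTA = {
    ("q0", BLANK, BLANK): ("q2", (BLANK, R), (BLANK, R)),
    ("q0", "0", BLANK): ("q1", ("0", R), ("#", L)),
    ("q0", "1", BLANK): ("q1", ("1", R), ("#", R)),
    ("q1", "0", BLANK): ("q1", ("0", R), (BLANK, L)),
    ("q1", "1", BLANK): ("q1", ("1", R), (BLANK, R)),
    ("q1", "0", "#"): ("q1", ("0", R), ("#", L)),
    ("q1", "1", "#"): ("q1", ("1", R), ("#", R)),
    ("q1", BLANK, "#"): ("q2", (BLANK, R), ("#", R)),
}


def accepts(word):
    """Run the 2-tape machine; tapes are dictionaries from position to symbol."""
    tape1, tape2 = dict(enumerate(word)), {}
    head1 = head2 = 0
    state = "q0"
    while state != "q2":
        key = (state, tape1.get(head1, BLANK), tape2.get(head2, BLANK))
        if key not in DELTA:
            return False      # no applicable transition: reject
        state, (write1, move1), (write2, move2) = DELTA[key]
        tape1[head1] = write1
        tape2[head2] = write2
        head1 += move1
        head2 += move2
    return True


print([w for w in ("", "01", "0110", "011", "0") if accepts(w)])
```

The loop always terminates because the head of tape 1 moves right in every step and the machine halts at the first blank after the input.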
Problem 9. For a TM M , let E(M ) denote the encoding of M . Consider the language
L = {E(M ) | M, when started on a blank tape eventually writes a 1 somewhere on the tape}.
(i) Show that L is recursively enumerable.
(ii) Show that L is not recursive.
Let’s use the notation ⟨M⟩ from the lectures for E(M).
(i) We can construct a TM M_L that recognizes L as follows. M_L first transforms its input ⟨M⟩
to ⟨⟨M⟩, ε⟩. M_L then simulates a universal TM U on this input, except in the following cases:
• as soon as the simulated machine M writes a 1 on its tape, M_L moves to an accepting state
• if U halts in an accepting state without a 1 having been written, M_L halts in a
non-accepting state
(ii) L_blankhalt = {⟨M_blankhalt⟩ | M_blankhalt(ε) halts} is a halting language. It is the language
of encodings of TMs that halt on the blank tape. L_blankhalt is not recursive (as was proven
in the lectures).
We show how to reduce L_blankhalt to L. Assume there is a decider D such that D(⟨M⟩)
halts in an accepting state if and only if ⟨M⟩ ∈ L and otherwise halts in a non-accepting
state. We show that using D we can construct D_blankhalt such that D_blankhalt(⟨M_blankhalt⟩)
halts in an accepting state if and only if ⟨M_blankhalt⟩ ∈ L_blankhalt and otherwise halts in a
non-accepting state.
We define a converter R that, given a machine M_blankhalt, returns a machine M such that
⟨M⟩ ∈ L if and only if ⟨M_blankhalt⟩ ∈ L_blankhalt.
R constructs M from M_blankhalt in the following way:
• All occurrences of symbol 1 in the transitions of M_blankhalt (read and written) are
replaced in M with #, where # is a new symbol not occurring in the tape alphabet of
M_blankhalt.
• The final states of M_blankhalt are not final in M. Instead, for each final state qf of
M_blankhalt, for every tape symbol x, R adds a transition δ(qf, x) = (qf, 1, R) to M.
• For each state q and tape symbol x for which δ(q, x) is undefined in M_blankhalt (i.e.
M_blankhalt halts on x), R adds a transition δ(q, x) = (qf, 1, R) to M.
M_blankhalt eventually halts when started on a blank tape if and only if M = R(M_blankhalt)
eventually writes 1 somewhere on the tape.
Then we can take D_blankhalt(⟨M_blankhalt⟩) = D(⟨R(M_blankhalt)⟩).
We know such a decider does not exist because L_blankhalt is not recursive. Hence D cannot
exist either, and L is not recursive.
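The converter R is a purely syntactic transformation, which the following sketch illustrates. Everything here is an assumption made for illustration: machines are dictionaries mapping (state, symbol) to (state, written symbol, move), a machine halts when it enters a final state or has no applicable transition, at least one final state exists, and writes_one only simulates a bounded number of steps, so it is a test aid and not a decider.

```python
BLANK = "Λ"
RIGHT = +1


def convert(delta, states, alphabet, finals, fresh="#"):
    """Sketch of R: the result writes a 1 exactly when the original machine halts."""
    assert finals and fresh not in alphabet

    def swap(symbol):
        return fresh if symbol == "1" else symbol

    target = min(finals)                       # the final state qf used for halting pairs
    new_alphabet = {swap(x) for x in alphabet} | {"1"}
    # Replace every read or written 1 by the fresh symbol.
    new_delta = {(q, swap(x)): (p, swap(y), move)
                 for (q, x), (p, y, move) in delta.items()}
    for q in states:
        for x in new_alphabet:
            if q in finals:
                new_delta[(q, x)] = (q, "1", RIGHT)       # final states now write 1
            elif (q, x) not in new_delta:
                new_delta[(q, x)] = (target, "1", RIGHT)  # halting pairs now write 1
    return new_delta


def writes_one(delta, start, max_steps):
    """Bounded simulation on a blank tape: True if a 1 is written within max_steps."""
    tape, head, state = {}, 0, start
    for _ in range(max_steps):
        key = (state, tape.get(head, BLANK))
        if key not in delta:
            return False
        state, written, move = delta[key]
        if written == "1":
            return True
        tape[head] = written
        head += move
    return False


# Two hypothetical example machines over the tape alphabet {Λ, 1}.
halting = {("p", BLANK): ("f", "1", RIGHT)}    # writes 1 and enters final state f
looping = {("p", BLANK): ("p", "1", RIGHT)}    # writes 1s forever, never halts
args = ({"p", "f"}, {BLANK, "1"}, {"f"})
print(writes_one(convert(halting, *args), "p", 100),
      writes_one(convert(looping, *args), "p", 100))
```

Note that the looping machine itself writes 1s, yet its image under R never does, because every original 1 has been renamed to #.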