
Automata Theory, Languages and Computation

Lecture 1: Why Study Automata Theory?

Introduction to Automata Theory

Automata theory is the foundational study of computational models that perform calculations, recognize patterns, and
solve problems. Its relevance lies in the following domains:

1. Understanding Computation:

Automata theory defines the fundamental concepts of computation, helping us understand what problems can and
cannot be solved by machines.

It provides formal tools to classify computational problems based on their complexity and solvability.

2. Practical Applications:

Automata are widely used in software design, hardware design, natural language processing, and other fields.

Examples include lexical analyzers in compilers, regular expression engines, and control systems in hardware.

3. Bridge Between Theory and Practice:

Automata theory serves as a bridge between theoretical computer science and practical applications by providing
models for real-world systems.

The study of automata helps in creating efficient algorithms and robust computational models.

Informal Introduction to Finite Automata

Finite automata are one of the simplest yet most powerful models of computation. Informally, they can be thought of as
machines that:

Accept or reject strings based on predefined rules.

Operate on a finite amount of memory.

Are foundational to the design of regular expressions and lexical analyzers.

Key Characteristics:

1. Deterministic Behavior:

In deterministic finite automata (DFA), for each input symbol, the machine transitions to exactly one state.

2. Non-Deterministic Behavior:

In non-deterministic finite automata (NFA), multiple transitions for a single input symbol are allowed, offering
multiple paths for computation.

Real-World Analogy:

A finite automaton can be likened to a vending machine:

It transitions through states based on user inputs (coin insertion, button presses).

It determines the output (dispense item) based on its current state and input sequence.

Limitations:

Finite automata have no memory beyond their current state, making them unsuitable for recognizing patterns that
require counting (e.g., matching nested parentheses).

Structural Representations

Finite automata can be represented in multiple equivalent forms:

1. State Diagrams:

Graphical representation where:

Nodes represent states.

Edges represent transitions labeled with input symbols.

A special node denotes the start state, and one or more nodes are designated as accept states.

2. Transition Tables:

Tabular representation showing transitions for each state and input symbol.

3. Formal Definition:

A finite automaton is a 5-tuple M = (Q, Σ, δ, q0, F), where:

Q: Finite set of states.

Σ: Alphabet of input symbols.

δ: Transition function, δ : Q × Σ → Q for a DFA or δ : Q × Σ → 2^Q for an NFA.

q0: Initial state, q0 ∈ Q.

F: Set of accept states, F ⊆ Q.

Automata and Complexity

1. Relationship with Complexity Theory:

Automata are crucial for understanding computational complexity classes such as P, NP, and PSPACE.

Different classes of automata correspond to different computational powers:

Regular languages are recognized by finite automata.

Context-free languages are recognized by pushdown automata.

Recursively enumerable languages are recognized by Turing machines.

2. Key Concepts:

Decidability: Automata help formalize the concept of decidability—problems that can be solved algorithmically.

Efficiency: Automata provide models to analyze the time and space complexity of algorithms.

3. Applications in Complexity:

Lexical analysis in compilers leverages finite automata to tokenize input efficiently.

Pushdown automata form the basis of parsing algorithms for context-free grammars.

Conclusion:
Automata theory provides a theoretical framework to study computation and its limitations. It is indispensable for
understanding how computational systems are designed and optimized. The introductory concepts of finite automata, their
structural representations, and their relation to complexity establish a foundation for advanced study in computation and
language theory.

Lecture 2: Introduction to Formal Proof


Formal proof is the cornerstone of mathematical reasoning, providing a rigorous framework to establish the truth of
statements. In this lecture, we explore various forms of proofs, focusing on deductive proofs, reduction to definitions,
alternative theorem forms, and theorems that are not explicitly stated as "if-then" propositions. All examples will be drawn
from basic mathematics.

Deductive Proofs

A deductive proof establishes the truth of a statement by logically deriving it from previously known facts, axioms, or
theorems. It ensures that if the premises are true, the conclusion must also be true.

Example:

Prove that the sum of two even numbers is always even.

1. Premises:

An even number can be expressed as 2k , where k is an integer.

2. Proof:

Let a = 2k1 and b = 2k2, where k1, k2 are integers.

The sum is a + b = 2k1 + 2k2 = 2(k1 + k2).

Since k1 + k2 is an integer, a + b is a multiple of 2, and thus even.


3. Conclusion:

The sum of two even numbers is even, as required.

Deductive proofs like this rely on the clear use of definitions and logical steps to establish results.

Reduction to Definitions

Reduction to definitions involves proving a statement by directly using the formal definitions of the terms involved. This
approach is particularly useful for simple properties and fundamental results.

Example:

Prove that the product of two odd numbers is always odd.

1. Definition of an Odd Number:

An odd number can be expressed as 2k + 1, where k is an integer.


2. Proof:

Let a = 2k1 + 1 and b = 2k2 + 1, where k1, k2 are integers.

The product is:

a ⋅ b = (2k1 + 1)(2k2 + 1) = 4k1k2 + 2k1 + 2k2 + 1 = 2(2k1k2 + k1 + k2) + 1

This is of the form 2m + 1, where m = 2k1k2 + k1 + k2 is an integer.

3. Conclusion:

The product of two odd numbers is odd.

Reduction to definitions provides direct insight into why the statement is true.

Other Theorem Forms

While many theorems are presented in "if-then" form, others may involve equivalent statements, constructive proofs, or
claims of uniqueness.

Example:

Prove that there exists a unique integer x such that 2x + 3 = 7.


1. Existence:

Solve the equation 2x + 3 = 7:

2x = 7 − 3 = 4  ⇒  x = 2

Thus, x = 2 satisfies the equation.

2. Uniqueness:

Assume there exist two solutions, x1 and x2, such that 2x1 + 3 = 7 and 2x2 + 3 = 7.

Then 2x1 = 2x2, which implies x1 = x2.

3. Conclusion:

There exists a unique solution x = 2.

Theorem forms like this often combine proof techniques to establish both existence and uniqueness.

Theorems That Appear Not to Be If-Then Statements

Some theorems are not explicitly written as "if-then" statements but can often be reformulated in this way for clarity.

Example:

Theorem: The square of an even number is even.

1. Restating the Theorem:

If n is an even number, then n² is even.

2. Proof:

Let n = 2k, where k is an integer.

Then n² = (2k)² = 4k² = 2(2k²).

Since 2k² is an integer, n² is even.

3. Conclusion:

The original theorem is equivalent to an "if-then" statement and is proven.

Such theorems illustrate the flexibility in how statements can be expressed and proven.

Conclusion:
Formal proofs, through deductive reasoning, reduction to definitions, and alternative theorem forms, provide a structured
method for verifying mathematical claims. Understanding these techniques lays the foundation for rigorous reasoning
across mathematics and other theoretical disciplines.


Lecture 3: Additional Forms of Proof


This lecture focuses on advanced forms of proof used in mathematical reasoning, including methods for proving
equivalence about sets, the contrapositive, proof by contradiction, and counterexamples. These techniques are essential for
tackling a wide variety of mathematical problems and understanding the structure of logical arguments.

Proving Equivalence about Sets

To prove that two sets A and B are equivalent (A = B ), we must show two properties:
1. Subset Relation: A ⊆ B and B ⊆ A.
2. Bijection (in some contexts): If relevant, establish a one-to-one correspondence between the elements of A and B .

Example:

Prove that A = B, where A = {x ∈ Z ∣ x is even} and B = {2k ∣ k ∈ Z}.


1. Show A ⊆ B:
Let x ∈ A. By definition of A, x is even, so x = 2k for some k ∈ Z.
Thus, x ∈ B , proving A ⊆ B .
2. Show B ⊆ A:
Let x ∈ B . By definition of B , x = 2k for some k ∈ Z, so x is even.
Thus, x ∈ A, proving B ⊆ A.
3. Conclusion:

Since A ⊆ B and B ⊆ A, A = B .

Proving equivalence about sets often requires carefully verifying both subset relations.

The Contrapositive

A contrapositive reformulates an "if-then" statement P ⟹ Q as ¬Q ⟹ ¬P . Both forms are logically equivalent, and
the contrapositive is often easier to prove.

Example:

Prove: If n² is odd, then n is odd.

1. Contrapositive Formulation:

Instead of proving P ⟹ Q (n² is odd ⟹ n is odd), prove ¬Q ⟹ ¬P (n is even ⟹ n² is even).

2. Proof:

If n is even, n = 2k for some k ∈ Z.

Then n² = (2k)² = 4k², which is a multiple of 2 and hence even.
3. Conclusion:

Since the contrapositive is true, the original statement is also true.

The contrapositive is particularly useful when proving the original statement directly is challenging.

Proof by Contradiction

Proof by contradiction establishes the truth of a statement P by assuming ¬P (its negation) and deriving a logical
inconsistency.

Example:

Prove: √2 is irrational.

1. Assumption (Negation of the Claim):

Assume √2 is rational. Then √2 = p/q, where p and q are integers with gcd(p, q) = 1.

2. Derive Contradiction:

Squaring both sides gives 2 = p²/q², or p² = 2q².
Thus, p² is even, implying p is even (p = 2k).
Substituting p = 2k into p² = 2q² gives (2k)² = 2q², or 4k² = 2q², which simplifies to q² = 2k².
Hence, q² is even, implying q is even.

3. Contradiction:

Both p and q are even, contradicting the assumption that gcd(p, q) = 1.

4. Conclusion:

The assumption that √2 is rational leads to a contradiction. Thus, √2 is irrational.

Proof by contradiction is a powerful tool for establishing the impossibility of certain claims.

Counterexamples

A counterexample disproves a universal statement by providing a specific instance where the statement fails.

Example:

Disprove: All prime numbers are odd.

1. Counterexample:

The prime number 2 is even.

2. Conclusion:

Since there exists a prime number that is not odd, the statement is false.

Counterexamples provide direct evidence against universal claims and are invaluable for clarifying misconceptions.

Conclusion:
These additional forms of proof—proving equivalence about sets, the contrapositive, proof by contradiction, and
counterexamples—enhance the mathematical toolkit for rigorous reasoning. Each method has its strengths and is suited to
specific types of problems, offering flexibility in constructing logical arguments.

Lecture 4: Inductive Proofs


Inductive proofs are a powerful and systematic approach to demonstrating the truth of propositions over a domain,
especially when the domain is infinite, such as the set of integers or structured objects like trees. This lecture focuses on
induction applied to integers, generalizations of integer induction, structural induction, and mutual induction.

Inductions to Integers

The principle of mathematical induction proves statements about integers by demonstrating two key steps:

1. Base Case:

Show that the statement holds for the initial value, typically n = 0 or n = 1.
2. Inductive Step:

Assume the statement holds for n = k (inductive hypothesis).


Prove it holds for n = k + 1.

Example:
Prove: 1 + 2 + ⋯ + n = n(n+1)/2 for all n ≥ 1.

1. Base Case (n = 1):

Left-hand side: 1.
Right-hand side: 1(1+1)/2 = 1.
Thus, the statement holds for n = 1.

2. Inductive Step:

Assume the statement holds for n = k:

1 + 2 + ⋯ + k = k(k+1)/2.

Show it holds for n = k + 1:

1 + 2 + ⋯ + k + (k + 1) = k(k+1)/2 + (k + 1).

Simplify the right-hand side:

k(k+1)/2 + (k + 1) = (k(k+1) + 2(k+1))/2 = (k+1)(k+2)/2.

Thus, the statement holds for n = k + 1.

3. Conclusion:

By induction, 1 + 2 + ⋯ + n = n(n+1)/2 for all n ≥ 1.

More General Forms of Integer Inductions

While standard induction proves statements for n ≥ 1, more general forms extend this principle:
1. Strong Induction:

Assumes the statement holds for all integers n ≤ k to prove it for n = k + 1.


Useful when the proof relies on multiple earlier cases.

Example:

Prove: Any integer n ≥ 2 can be written as a product of prime numbers.


1. Base Case (n = 2):
2 is a prime number and thus satisfies the property.
2. Inductive Step:

Assume the statement holds for all integers 2 ≤ n ≤ k.


Show it holds for n = k + 1:

If k + 1 is prime, the statement holds.


If k + 1 is composite, it can be written as k + 1 = a ⋅ b, where 2 ≤ a, b ≤ k .
By the inductive hypothesis, a and b can be expressed as products of primes, and hence k + 1 can also be
expressed as a product of primes.

3. Conclusion:

By strong induction, the statement holds for all n ≥ 2.


2. Backward Induction:

Used for proving statements about finite sequences by reversing the direction of induction.

Structural Inductions

Structural induction generalizes mathematical induction to prove properties of recursively defined objects, such as trees or
strings.

Example:

Prove: For a binary tree T , the number of leaves is one more than the number of internal nodes.

1. Base Case:

A tree with a single node (leaf) has 0 internal nodes and 1 leaf. The statement holds.

2. Inductive Step:

Assume the statement holds for smaller trees.

For a tree T with root R and subtrees T1 and T2:

By the inductive hypothesis, T1 and T2 satisfy the property.

Combining T1 and T2 under R adds one internal node (the root R) and no new leaves, preserving the relationship.

3. Conclusion:

By structural induction, the statement holds for all binary trees.

Mutual Inductions

Mutual induction proves interdependent statements simultaneously by leveraging their recursive definitions.

Example:

Prove:

1. All even numbers n ≥ 0 can be expressed as 2k .


2. All odd numbers n ≥ 1 can be expressed as 2k + 1.
1. Base Cases:

For n = 0, n is even (n = 2 · 0).

For n = 1, n is odd (n = 2 · 0 + 1).

2. Inductive Steps:

Assume the statements hold for n = k:

If k = 2m is even, then k + 1 = 2m + 1 is odd.

If k = 2m + 1 is odd, then k + 1 = 2(m + 1) is even.

3. Conclusion:

By mutual induction, the statements hold for all n ≥ 0.

Conclusion:
Inductive proofs, including standard, strong, and structural induction, along with mutual induction, provide a systematic
framework for proving propositions over integers and recursively defined structures. These methods are essential for verifying properties in mathematics and theoretical computer science.

Lecture 5: The Central Concepts of Automata Theory


The foundational concepts of automata theory revolve around alphabets, strings, and languages. These concepts are used to
model and formalize computational problems, enabling their analysis in terms of decidability and complexity. In this lecture,
we delve into the definitions, operations, and their applications in defining computational problems.

1. Alphabets (Σ)

An alphabet is a finite, non-empty set of symbols used as the building blocks for constructing strings.

Examples:

Σ = {a, b, c} (English letters)


Σ = {0, 1} (binary digits)
Σ = {+, -, *, /, =} (arithmetic symbols)

Properties:

Alphabets are finite.

Each symbol in Σ is atomic and indivisible.

Alphabets are essential for defining strings, which in turn form the basis of languages.

2. Strings

A string is a finite sequence of symbols from an alphabet Σ. The set of all possible strings over Σ is denoted Σ∗ .

Key Terms:

Empty String (ϵ): A string with zero symbols.

Length of a String (∣w∣): The number of symbols in a string w .


Example: If w = abb, ∣w∣ = 3.

Operations on Strings:

1. Concatenation: Joining two strings w1 and w2 to form w1 w2 .


​ ​ ​ ​

Example: If w1 ​ = ab and w2 = ba, w1 w2 = abba.


​ ​ ​

2. Reversal: Reversing the order of symbols in a string w .


Example: If w = abc, then wR = cba.
3. Substring: A contiguous sequence of symbols within a string.
Example: In w = abcde, w2 = bcd is a substring.

4. Prefix and Suffix:

Prefix: A substring that starts at the beginning of w . Example: ab is a prefix of abcde.

Suffix: A substring that ends at the end of w . Example: cde is a suffix of abcde.

10/145
5. Lengthening (String Power): Repeating w k -times, denoted w k .
Example: If w = ab, w3 = ababab.
6. Empty String Property:

wϵ = ϵw = w.

Strings form the building blocks for languages, and operations on strings enable transformations essential for language
manipulation.

3. Languages

A language is a set of strings over a given alphabet Σ. Formally, L ⊆ Σ∗ . Languages can be finite or infinite.
Examples of Languages:

1. Finite Language:
L = {a, ab, abc} over Σ = {a, b, c}.
2. Infinite Language:
L = {a^n ∣ n ≥ 1} (strings of one or more a's) over Σ = {a}.

Operations on Languages:

1. Union (L1 ∪ L2): The set of strings that belong to L1, L2, or both.

Example: If L1 = {a, b} and L2 = {b, c}, then L1 ∪ L2 = {a, b, c}.

2. Intersection (L1 ∩ L2): The set of strings that belong to both L1 and L2.

Example: L1 ∩ L2 = {b}.

3. Concatenation (L1L2): The set of strings formed by concatenating a string from L1 with a string from L2.

Example: If L1 = {a, b} and L2 = {c, d}, then L1L2 = {ac, ad, bc, bd}.

4. Kleene Star (L*): The set of all strings formed by concatenating zero or more strings from L.

Example: If L = {a}, then L* = {ϵ, a, aa, aaa, … }.

5. Difference (L1 − L2): The set of strings in L1 but not in L2.

Example: L1 − L2 = {a}.

Languages are the formal objects that automata recognize and manipulate.
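These operations can be tried out directly on small finite languages represented as Python sets. The following is a minimal sketch, assuming our own helper names concat and kleene_star_up_to (the Kleene star is infinite in general, so the sketch only enumerates it up to a length bound):

    # Finite languages represented as Python sets of strings.
    L1 = {"a", "b"}
    L2 = {"b", "c"}

    union = L1 | L2           # {'a', 'b', 'c'}
    intersection = L1 & L2    # {'b'}
    difference = L1 - L2      # {'a'}

    def concat(A, B):
        # Concatenation: every string of A followed by every string of B.
        return {x + y for x in A for y in B}

    def kleene_star_up_to(A, max_len):
        # Finite approximation of A*: all concatenations of strings from A
        # whose total length does not exceed max_len.
        result = {""}                  # epsilon is always in A*
        frontier = {""}
        while frontier:
            frontier = {x + y for x in frontier for y in A
                        if len(x + y) <= max_len} - result
            result |= frontier
        return result

    print(concat(L1, {"c", "d"}))       # {'ac', 'ad', 'bc', 'bd'}
    print(kleene_star_up_to({"a"}, 3))  # {'', 'a', 'aa', 'aaa'}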

4. Problems

In automata theory, a problem is represented as a language, with strings encoding problem instances and decisions about
membership indicating solutions. A problem is considered decidable if there exists an algorithm that can determine, for any
string, whether it belongs to the language.

Example 1: Palindrome Checking

Define the language L = {w ∈ Σ* ∣ w^R = w}.

Σ = {a, b}
L = {ϵ, a, b, aa, bb, aba, bab, … }.

The problem is to decide if a given string w belongs to L (i.e., is w a palindrome?).

Example 2: Even-Length Strings

Define the language L = {w ∈ Σ* ∣ |w| is even}.

Σ = {0, 1}
L = {ϵ, 00, 01, 10, 11, 0000, … }.

The problem is to decide if the length of a given string w is even.

Example 3: Divisibility Problem

Define the language L = {w ∈ {0, 1}* ∣ the binary value of w is divisible by 3}.

For Σ = {0, 1}:

L = {ϵ, 0, 11, 110, 1001, … }.

The problem is to decide if the binary number represented by w is divisible by 3.
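A problem phrased as a language comes with an obvious decision procedure when the language is regular. A minimal sketch for Example 3 (the function name is illustrative), where the remainder of the binary prefix modulo 3 plays the role of the automaton's state:

    def divisible_by_3(w):
        # Decide membership in L = {w in {0,1}* | binary value of w is divisible by 3}.
        remainder = 0                        # start state: the empty prefix has value 0
        for symbol in w:
            bit = int(symbol)                # each symbol extends the binary number
            remainder = (2 * remainder + bit) % 3
        return remainder == 0

    print(divisible_by_3("110"))   # True  (6 is divisible by 3)
    print(divisible_by_3("1001"))  # True  (9)
    print(divisible_by_3("10"))    # False (2)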

These examples illustrate how computational problems can be formalized as languages, enabling their study within the
automata theory framework.

Conclusion:
The central concepts of automata theory—alphabets, strings, languages, and problems—provide the foundation for
understanding computation. By abstracting problems as languages, automata theory enables a precise analysis of their
computational properties, including decidability and complexity.

Lecture 6: An Informal Picture of Finite Automata


Finite automata are abstract machines that process sequences of symbols and determine whether a given sequence meets
certain criteria. In this lecture, we explore the behavior of finite automata through an informal, example-driven approach,
focusing on the interaction between a Customer, a Store, and a Bank. We examine how automata can model this system,
define rules, manage actions, and validate protocols.

1. The Ground Rules

Imagine a scenario where a Customer interacts with a Store to make a purchase, and the Store communicates with a Bank
to process the payment. The interactions between these entities follow specific rules:

The Customer can make requests, such as adding items to a cart or attempting to purchase.

The Store accepts payments only after validating them with the Bank.

The Bank authorizes or denies payments based on account balance and transaction validity.

These rules ensure the smooth functioning of the system. The finite automaton representing each entity enforces these
ground rules by defining states and transitions based on specific actions or events.

2. The Protocol

The protocol governs how interactions occur between the Customer, Store, and Bank. It can be outlined as follows:

1. Customer Actions:

The Customer can initiate a transaction, such as selecting items or requesting payment.

Example: A customer adds an item to the cart and proceeds to checkout.

2. Store Actions:

The Store waits for customer actions and forwards payment requests to the Bank.

The Store transitions to a "waiting for authorization" state upon initiating communication with the Bank.

3. Bank Actions:

The Bank processes the payment request and responds with either an authorization or denial.

Example: If the Customer's account balance is sufficient, the Bank authorizes the transaction, and the Store
completes the purchase.

Each entity is modeled as a finite automaton with states representing the current status (e.g., "Idle," "Awaiting Payment")
and transitions triggered by specific actions.

3. Enabling the Automata to Ignore Actions

Not all actions in a system are relevant to every automaton. For example:

The Bank does not care about the items the Customer adds to the cart; it only processes payment requests.

The Customer need not know whether the Bank is processing other transactions while waiting for their request.

This capability to ignore irrelevant actions is critical. In finite automata, this is achieved by allowing certain inputs to trigger
transitions that effectively "skip" or ignore those actions, ensuring the automaton focuses on relevant events.

Example:

The Store automaton might transition from "Idle" to "Waiting for Payment Authorization" regardless of whether the
Customer selects one or multiple items.

This flexibility simplifies the automaton's design and prevents unnecessary complexity.

4. The Entire System as an Automaton

Each component (Customer, Store, Bank) operates as an individual finite automaton, but the entire system can be viewed as
a single composite automaton.

States of the System:

The state of the system is a combination of the states of the individual automata. For example:

Customer State: Selecting items, Requesting payment.

Store State: Idle, Awaiting payment authorization.

Bank State: Idle, Processing payment.

Transitions of the System:

The system transitions occur when an action from one automaton triggers a response in another. For example:

The Customer transitions to "Requesting Payment," prompting the Store to transition to "Waiting for Bank Response."

The Store's payment request triggers the Bank to transition to "Processing Payment."

The composite automaton models the entire sequence of interactions, ensuring that the system's behavior adheres to the
defined protocol.

5. Using the Product Automaton to Validate the Protocol

The product automaton combines the individual automata of the Customer, Store, and Bank into a single automaton. It
allows us to validate the correctness of the protocol by verifying all possible sequences of actions and states.

Example:

1. Initial State:

Customer: Selecting items.

Store: Idle.

Bank: Idle.

2. Transition 1:

Action: Customer requests payment.

Result:

Customer: Requesting payment.

Store: Awaiting Bank authorization.

Bank: Idle.

3. Transition 2:

Action: Store sends payment request to Bank.

Result:

Customer: Requesting payment.

Store: Awaiting Bank authorization.

Bank: Processing payment.

4. Transition 3:

Action: Bank authorizes payment.

Result:

Customer: Transaction completed.

Store: Completing sale.

Bank: Idle.

Validation:

The product automaton ensures that:

No invalid sequences (e.g., Bank processing payment without a request) are allowed.

The system returns to a valid state (e.g., all automata idle) after each transaction.
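A rough sketch of the composite-automaton idea in Python: system states are tuples of component states, and only protocol-consistent transitions are listed. The state and action names below are illustrative choices made for this sketch, not a formal specification taken from the text.

    # Composite state: (customer, store, bank). Unlisted (state, action) pairs are
    # simply ignored, modelling "irrelevant" actions.
    transitions = {
        (("selecting", "idle", "idle"), "request_payment"):
            ("requesting", "awaiting_auth", "idle"),
        (("requesting", "awaiting_auth", "idle"), "forward_to_bank"):
            ("requesting", "awaiting_auth", "processing"),
        (("requesting", "awaiting_auth", "processing"), "authorize"):
            ("done", "completing_sale", "idle"),
    }

    def run(actions):
        state = ("selecting", "idle", "idle")        # initial composite state
        for action in actions:
            state = transitions.get((state, action), state)
        return state

    # The valid sequence drives the system to completion...
    print(run(["request_payment", "forward_to_bank", "authorize"]))
    # ...while an out-of-order action (the Bank authorizing with no request) has no effect.
    print(run(["authorize"]))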

Conclusion:
Through the example of the Customer, Store, and Bank, we have explored the informal workings of finite automata. The
ground rules, protocol, and interactions define the system's behavior, while the product automaton validates its correctness.
This approach provides an intuitive understanding of finite automata, preparing us for their formal representation in
subsequent discussions.

Lecture 7: Deterministic Finite Automata


A Deterministic Finite Automaton (DFA) is a theoretical model used to recognize regular languages. In this lecture, we
define a DFA, explain how it processes strings, and introduce simpler notations such as transition diagrams and tables.
Additionally, we discuss how the transition function is extended to handle strings, and we define the language of a DFA.

1. Definition of a DFA

A Deterministic Finite Automaton (DFA) is a 5-tuple (Q, Σ, δ, q0, F), where:

Q: A finite set of states.

Σ: A finite set of input symbols, called the alphabet.

δ: A transition function δ : Q × Σ → Q, which specifies the next state for a given state and input symbol.

q0: The initial state, where computation begins, q0 ∈ Q.

F: A set of accepting (final) states, where F ⊆ Q.

Key Points:

The automaton is deterministic, meaning for each state and input symbol, there is exactly one state transition.

The automaton has a finite number of states and processes input strings of finite length.

It either accepts or rejects a string based on whether the final state is an accepting state.

2. How a DFA Processes Strings

The operation of a DFA can be described in the following steps:

1. Initialization:
The DFA starts in the initial state q0 .

2. Processing Input:
The DFA reads the input string one symbol at a time. For each symbol σ from the string, it transitions from its current
state to a new state, as defined by the transition function δ . This continues until all symbols have been processed.

3. Final Decision:
After the entire input string has been processed, the DFA checks if it is in an accepting state (i.e., if the current state is in
F ).
If the DFA is in an accepting state, the string is accepted.

If the DFA is not in an accepting state, the string is rejected.

Example:

Let’s consider a simple DFA that accepts binary strings ending with ‘01’.

Q = {q0, q1, q2}

Σ = {0, 1}

Transition function δ:

δ(q0, 0) = q1
δ(q0, 1) = q0
δ(q1, 0) = q1
δ(q1, 1) = q2
δ(q2, 0) = q1
δ(q2, 1) = q0

q0 is the initial state.

F = {q2}, the accepting state.

Here q0 means "no useful suffix read so far", q1 means "the last symbol was 0", and q2 means "the last two symbols were 01".

For the string ‘1101’, the DFA would process the symbols as follows:

Start in state q0 and read ‘1’, stay in q0.

Read ‘1’, stay in q0.

Read ‘0’, transition to q1.

Read ‘1’, transition to q2. Since the DFA ends in q2 (an accepting state), the string ‘1101’ is accepted.

3. Simpler Notations for DFAs (Transition Diagrams and Tables)

To make the representation of DFAs more intuitive, we often use transition diagrams and transition tables.

Transition Diagrams:
A transition diagram is a graphical representation of a DFA where each state is represented by a circle, and transitions
between states are represented by directed edges labeled with input symbols.

Example:

States q0, q1, q2 are represented as circles.

The initial state q0 has an arrow pointing to it.


Transitions between states are labeled with the appropriate input symbols.

Accepting states are typically indicated by a double circle.

Transition Tables:
A transition table is a tabular representation where each row corresponds to a state, each column corresponds to an input symbol, and each cell shows the next state resulting from applying the input symbol to the state.

Example for the DFA accepting binary strings ending in ‘01’:

State   0    1
q0      q1   q0
q1      q1   q2
q2      q1   q0

Both notations are equivalent, and the choice of notation depends on the specific use case (e.g., clarity in visualization or
simplicity in computation).
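The transition table translates directly into a dictionary, which makes the DFA easy to simulate. A minimal sketch for the automaton above:

    # DFA for binary strings ending in '01', with delta stored as a dictionary.
    delta = {
        ("q0", "0"): "q1", ("q0", "1"): "q0",
        ("q1", "0"): "q1", ("q1", "1"): "q2",
        ("q2", "0"): "q1", ("q2", "1"): "q0",
    }
    start, accepting = "q0", {"q2"}

    def accepts(w):
        # Run the DFA over w and check whether it halts in an accepting state.
        state = start
        for symbol in w:
            state = delta[(state, symbol)]
        return state in accepting

    print(accepts("1101"))   # True: ends with '01'
    print(accepts("100"))    # False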

4. Extending the Transition Function to Strings

The transition function δ can be extended to handle entire strings. Instead of processing one symbol at a time, we apply δ
iteratively for each symbol in the string.

Formally, for a string w = w1w2…wn (where each wi ∈ Σ), the extended transition function δ*(q, w) is defined as:

δ*(q, ϵ) = q (for the empty string ϵ, the state remains unchanged).

δ*(q, w1w2…wn) = δ(δ*(q, w1w2…wn−1), wn)

In other words, the extended transition function processes each symbol in the string, starting from the initial state, and
transitions through the states as defined by δ .
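The recursive definition of δ* translates almost word for word into code. A small sketch, again using the transition dictionary of the DFA above:

    delta = {("q0", "0"): "q1", ("q0", "1"): "q0",
             ("q1", "0"): "q1", ("q1", "1"): "q2",
             ("q2", "0"): "q1", ("q2", "1"): "q0"}

    def delta_star(state, w):
        # delta*(q, epsilon) = q;  delta*(q, xa) = delta(delta*(q, x), a)
        if w == "":
            return state
        return delta[(delta_star(state, w[:-1]), w[-1])]

    print(delta_star("q0", "1101"))   # 'q2', so '1101' is accepted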

Example:

For the string ‘1101’ with the DFA above:

Start in q0, process ‘1’, stay in q0.

Process ‘1’, stay in q0.

Process ‘0’, transition to q1.

Process ‘1’, transition to q2. The DFA ends in q2, which is an accepting state, so the string is accepted.

5. Language of a DFA

The language of a DFA is the set of strings that the DFA accepts. Formally, it is defined as:

L(M) = {w ∈ Σ* ∣ δ*(q0, w) ∈ F}

In other words, the language of a DFA M is the set of all strings w such that, starting from the initial state q0, the DFA reaches an accepting state after processing w.

Example:

For the DFA accepting binary strings ending with ‘01’, the language is the set of all strings that end with ‘01’, including
strings like:

{01, 101, 1101, 1001, 111101}

The language does not include strings like:

{0, 1, 100, 111} (as these do not end with ‘01’).

Conclusion:
In this lecture, we introduced Deterministic Finite Automata (DFA) as a model for recognizing regular languages. We
explored how DFAs process strings through state transitions and defined the language of a DFA. We also discussed simpler
notations for representing DFAs, such as transition diagrams and tables, and extended the transition function to handle
entire strings. Understanding DFAs is fundamental for studying more complex computational models and understanding
regular languages.

Lecture 8: Non-Deterministic Finite Automata


In this lecture, we will introduce Non-Deterministic Finite Automata (NFA), contrast them with DFAs, and discuss their
formal definitions. We will examine the extended transition function for NFAs, the language recognized by an NFA, and the
equivalence between DFAs and NFAs. Finally, we will explore a bad case for the subset construction, a method used to
convert an NFA into a DFA.

1. Informal View of NFA

A Non-Deterministic Finite Automaton (NFA) is similar to a DFA, but with one key difference: an NFA can have multiple
possible next states for a given state and input symbol. This non-deterministic behavior means that at any point during the
computation, the automaton can "choose" between several transitions, rather than being forced into a single transition as in
the case of a DFA. This flexibility allows NFAs to potentially recognize the same languages as DFAs, but the way they process
strings is different.

Key Differences from DFAs:

Multiple Transitions: For a given state and input symbol, there can be more than one next state (or none).

Epsilon Transitions: An NFA can transition to a new state without consuming any input symbol (via epsilon transitions ϵ
).

Non-determinism: At each step, the NFA can be in multiple states simultaneously.

2. Definition of NFA

A Non-Deterministic Finite Automaton (NFA) is formally defined as a 5-tuple (Q, Σ, δ, q0, F), where:

Q: A finite set of states.

Σ: A finite set of input symbols, the alphabet.

δ: A transition function δ : Q × Σ → 2^Q, which maps a state and an input symbol to a set of possible next states.

q0: The initial state, where computation begins, q0 ∈ Q.

F: A set of accepting (final) states, where F ⊆ Q.

Notable Features:

The transition function δ is a set-valued function, meaning for a given state and input symbol, the NFA may transition
to several different states, or even no state at all.

Epsilon transitions: The NFA may transition to a state without consuming any input symbol, denoted as δ(q, ϵ).

3. The Extended Transition Function

The extended transition function δ ∗ for an NFA is used to determine the states that the automaton can reach after
processing an entire string. It extends the original transition function to handle strings rather than individual symbols.

Formally, the extended transition function δ* is defined recursively:

δ*(q, ϵ) = {q} for any state q (the empty string leaves the state unchanged).

δ*(q, w1w2…wn) = ⋃_{p ∈ δ(q, w1)} δ*(p, w2…wn) for any string w1w2…wn.

In other words, given a current state q and input string w , the automaton can non-deterministically choose one of the
possible transitions for the first symbol, then continue processing the rest of the string from the resulting states.

If the NFA can reach any of the accepting states after processing the string, the string is accepted. If no accepting state is
reached, the string is rejected.
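In code, non-determinism is usually handled by tracking the set of states the NFA could be in after each symbol. A minimal sketch (the two-state automaton here is the one examined in Section 6 below):

    # NFA transition function: (state, symbol) -> set of possible next states.
    delta = {
        ("q0", "a"): {"q0", "q1"},
        ("q1", "a"): {"q1"},
    }
    start, accepting = "q0", {"q1"}

    def nfa_accepts(w):
        # Track every state the NFA could be in after reading each symbol.
        current = {start}
        for symbol in w:
            current = set().union(*(delta.get((q, symbol), set()) for q in current))
        return bool(current & accepting)

    print(nfa_accepts("aaa"))   # True: at least one 'a' was read
    print(nfa_accepts(""))      # False: no accepting state reachable on the empty string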

4. The Language of an NFA

The language of an NFA M = (Q, Σ, δ, q0, F) is the set of strings that the automaton accepts, i.e., the set of strings for which there exists a sequence of states q0, q1, …, qn such that:

q0 is the initial state.

qn is a state in the accepting set F.

For each symbol wi in the input string, the automaton can transition from qi to qi+1 according to the transition function δ.

Formally:

L(M) = {w ∈ Σ* ∣ δ*(q0, w) ∩ F ≠ ∅}

This means that a string w is accepted by an NFA if there exists some sequence of transitions starting from q0, possibly involving multiple possible paths, that leads to an accepting state in F.

5. Equivalence of DFA and NFA

Despite the non-deterministic nature of NFAs, every NFA has an equivalent DFA. This equivalence is crucial because it shows
that NFAs and DFAs recognize exactly the same class of languages, i.e., regular languages.

Formal Discussion of Equivalence:

Given an NFA N = (Q, Σ, δ, q0, F), we can construct a deterministic finite automaton (DFA) D = (Q′, Σ, δ′, q0′, F′) that recognizes the same language. The construction involves the subset construction (or powerset construction) algorithm.

States of the DFA: The states of the DFA correspond to subsets of the states of the NFA. If the NFA has |Q| states, the DFA will have at most 2^|Q| states.

Initial state of the DFA: The initial state of the DFA is the set of NFA states that can be reached from the initial state q0 through epsilon-transitions. This is the epsilon-closure of q0.

Transition function of the DFA: For each state in the DFA, we determine the possible states it can transition to for each
input symbol. This is done by considering all possible NFA states the NFA can transition to for the input symbol and then
taking the epsilon closure of the resulting set of states.

Accepting states of the DFA: The accepting states of the DFA correspond to any subset of NFA states that contains at
least one accepting state of the NFA.

This construction shows that for every NFA, there exists an equivalent DFA, and both automata recognize the same
language. However, the number of states in the resulting DFA can grow exponentially compared to the NFA, which is a key
difference between the two models.
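A compact sketch of the subset construction, building only the DFA states reachable from the initial subset (epsilon-transitions are omitted for brevity; the example NFA is the one discussed in the next section):

    from collections import deque

    def subset_construction(delta, start, accepting, alphabet):
        # States of the DFA are frozensets of NFA states; only reachable ones are built.
        dfa_start = frozenset({start})
        dfa_delta, dfa_accepting = {}, set()
        queue, seen = deque([dfa_start]), {dfa_start}
        while queue:
            subset = queue.popleft()
            if subset & accepting:                    # contains an NFA accepting state
                dfa_accepting.add(subset)
            for a in alphabet:
                target = frozenset().union(*(delta.get((q, a), set()) for q in subset))
                dfa_delta[(subset, a)] = target
                if target not in seen:
                    seen.add(target)
                    queue.append(target)
        return dfa_start, dfa_delta, dfa_accepting

    # The two-state NFA from the next section: accepts strings over {a} with at least one 'a'.
    nfa_delta = {("q0", "a"): {"q0", "q1"}, ("q1", "a"): {"q1"}}
    d_start, d_delta, d_acc = subset_construction(nfa_delta, "q0", {"q1"}, {"a"})
    print(len({s for (s, _) in d_delta}))             # 2 reachable DFA states in this case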

6. A Bad Case for the Subset Construction

While the subset construction guarantees an equivalent DFA for any NFA, it can sometimes lead to a large number of states
in the resulting DFA, even if the NFA itself is relatively simple. This exponential growth in the number of states is often
referred to as the "state explosion problem."

Formal Example:

Consider the following NFA N with states Q = {q0, q1}, alphabet Σ = {a}, initial state q0, and accepting state F = {q1}.

The transition function δ is as follows:

δ(q0, a) = {q0, q1}
δ(q1, a) = {q1}

This NFA accepts strings that contain at least one 'a'. Now, let's construct the equivalent DFA using the subset construction method:

1. The DFA will have states corresponding to subsets of Q, i.e., the power set of Q, which is {∅, {q0}, {q1}, {q0, q1}}.

2. The initial state of the DFA is {q0}.

3. For each input symbol, we compute the possible next states:

From {q0}, the transition on 'a' leads to {q0, q1}.

From {q1}, the transition on 'a' stays in {q1}.

From {q0, q1}, the transition on 'a' leads to {q0, q1}.

4. Over the full power set the DFA has four states, although only {q0} and {q0, q1} are reachable from the initial state, so this small example merely illustrates the mechanics of the construction. The genuinely bad cases arise for languages such as "the n-th symbol from the end is 1": an NFA with n + 1 states suffices, but any equivalent DFA must distinguish all 2^n possible suffixes of length n, so the subset construction produces exponentially many states.

Conclusion:
In this lecture, we introduced Non-Deterministic Finite Automata (NFA), defined them formally, and discussed how they
process strings. We extended the transition function to handle entire strings and defined the language of an NFA. The equivalence of DFAs and NFAs was formally established through the subset construction, which converts an NFA into an
equivalent DFA. However, we also highlighted the "state explosion" problem, where the number of states in the resulting
DFA can grow exponentially, leading to inefficiency in certain cases.

Lecture 9: An Application - Text Search


In this lecture, we will explore the application of Automata Theory in text searching. Specifically, we will discuss methods for
finding strings within a larger text using both Non-Deterministic Finite Automata (NFA) and Deterministic Finite
Automata (DFA). We will explore how automata can be used to recognize patterns or keywords in a string, a fundamental
task in many computing applications, such as searching in text files, document indexing, and web searching.

1. Finding Strings in Text

Finding a string (or substring) within a larger text is a common problem in computer science. Given a pattern P and a text
T , the goal is to determine whether P appears in T . This problem is crucial in text processing, search engines, and many
other applications that require string matching.

Basic Approach:

We scan the text character by character.

For each character in the text, we check if it matches the first character of the pattern. If it matches, we continue
checking the subsequent characters of the text to see if the full pattern appears.

If the full pattern matches, we have found the string; otherwise, we continue scanning the text.

However, a more efficient approach to string searching can be achieved using automata.

Using Automata for Text Search:

To efficiently perform string searching, we can build an automaton that recognizes the target string as a pattern. The
automaton will transition through its states as we scan the text, and when we reach an accepting state, we will know that
the pattern has been found.

2. NFA for Text Search

A Non-Deterministic Finite Automaton (NFA) can be constructed to search for a string in a text. This NFA will simulate the
process of matching a given pattern by making transitions through multiple possible states as it reads each symbol in the
text.

Example: NFA for Searching Pattern “ab”

Consider searching for the string "ab" in a text. We can build an NFA for this search task as follows:

States: Q = {q0, q1, q2}

q0: Initial state, no characters matched yet.

q1: The character 'a' is matched.

q2: The pattern "ab" is fully matched, accepting state.

Alphabet: Σ = {a, b}

Transition Function δ:

δ(q0, a) = {q1} (After reading 'a', move to state q1).

δ(q0, b) = ∅ (No transition on 'b' from q0).

δ(q1, a) = ∅ (After reading 'a' again, no further transition from q1).

δ(q1, b) = {q2} (After reading 'b', move to q2).

δ(q2, a) = {q1} (After matching "ab", restart the matching process by reading 'a').

δ(q2, b) = ∅ (No transition on 'b' from q2).

Initial State: q0

Accepting State: q2 (Pattern matched)


How it works:

At each state, the NFA checks the current character of the text and decides whether to transition to another state. If the
automaton reaches the accepting state q2 , we have successfully matched the pattern.

As the NFA is non-deterministic, it can try multiple possible paths in parallel, enabling it to handle situations where the
pattern partially matches at multiple locations in the text.

Example Execution:

For the input text "ababb", the NFA transitions as follows:

Start in q0, read 'a', move to q1.

Read 'b', move to q2, accept the pattern "ab".

Continue reading the text, re-entering q1 on 'a', and transitioning back to q2 on 'b', accepting another occurrence of "ab".

3. DFA to Recognize a Set of Keywords

Now, we will focus on using a Deterministic Finite Automaton (DFA) to recognize a set of keywords. This is a more efficient
approach compared to NFAs when we need to recognize multiple patterns simultaneously in the text.

Constructing a DFA for Multiple Keywords:

Given a set of keywords {P1, P2, …, Pn}, the goal is to construct a DFA that can recognize any of these keywords in a given text. This can be done using a generalized DFA where each state represents the progress made in matching any of the keywords.

Example: DFA for Keywords “ab” and “bc”

Consider the keywords "ab" and "bc". The DFA needs to be designed to transition between states that represent the
progress of matching any of these two patterns. Here’s how we can approach the construction:

States: The DFA has a state for each possible combination of progress through both keywords.

q0: Initial state, no progress on any keyword.

q1: The state after reading 'a' (start of "ab").

q2: The state after reading "ab" (accepting state for "ab").

q3: The state after reading 'b' (start of "bc").

q4: The state after reading "bc" (accepting state for "bc").

Alphabet: Σ = {a, b, c}

Transition Function δ:

δ(q0, a) = q1 (Transition to q1 on 'a').

δ(q0, b) = q3 (Transition to q3 on 'b').

δ(q0, c) = q0 (No progress on 'c' from q0).

δ(q1, b) = q2 (Transition to q2 on 'b' for "ab").

δ(q3, c) = q4 (Transition to q4 on 'c' for "bc").

δ(q2, a) = q1 (Re-start matching "ab" on 'a').

δ(q2, b) = q3 (Re-start matching "bc" on 'b').

δ(q4, a) = q1 (Re-start matching "ab" on 'a').

δ(q4, b) = q3 (Re-start matching "bc" on 'b').

(Transitions not listed above, such as δ(q1, a) or δ(q3, b), must also be specified for δ to be a total function; a careful construction sends each of them to the state representing the longest suffix of the input read so far that is still a prefix of some keyword.)

Initial State: q0

Accepting States: q2, q4 (Accepts "ab" and "bc")

How it works:

The DFA processes the text character by character, transitioning between states as it matches characters from either of
the keywords.

If the DFA reaches state q2 or q4 , the text contains one of the keywords and the string is accepted.
​ ​

Example Execution:

For the input text "abbc", the DFA transitions as follows:

Start in q0, read 'a', move to q1.

Read 'b', move to q2, accept the keyword "ab".

Continue reading the text, re-entering q3 on 'b', and transitioning to q4 on 'c', accepting the keyword "bc".
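A sketch that runs the keyword DFA over a text, using the transition table above and sending any unlisted (state, symbol) pair back to q0 as a simple default. This default is an assumption of the sketch; a production multi-keyword matcher such as Aho-Corasick computes more precise failure transitions.

    delta = {
        ("q0", "a"): "q1", ("q0", "b"): "q3", ("q0", "c"): "q0",
        ("q1", "b"): "q2",
        ("q3", "c"): "q4",
        ("q2", "a"): "q1", ("q2", "b"): "q3",
        ("q4", "a"): "q1", ("q4", "b"): "q3",
    }
    accepting = {"q2": "ab", "q4": "bc"}

    def find_keywords(text):
        # Scan the text once, reporting which keyword is recognised at each position.
        state, hits = "q0", []
        for i, ch in enumerate(text):
            state = delta.get((state, ch), "q0")   # missing entries fall back to q0
            if state in accepting:
                hits.append((i, accepting[state]))
        return hits

    print(find_keywords("abbc"))   # [(1, 'ab'), (3, 'bc')]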

Conclusion:
In this lecture, we explored the application of automata theory in text searching. We discussed how an NFA can be
constructed to search for a single string or pattern in a text, and how a DFA can be used to recognize a set of keywords
efficiently. Both methods are powerful tools in text processing, and automata theory provides a formal foundation for
understanding and implementing text search algorithms, which are crucial in fields such as information retrieval, document
indexing, and web searching.

Lecture 10: Finite Automata with Epsilon-Transitions


In this lecture, we will discuss Finite Automata with Epsilon-Transitions (ε-Transitions), which are a generalization of the
traditional finite automaton. These automata allow transitions between states without consuming any input symbols,
represented by epsilon (ε). We will explore how epsilon-transitions are used, the formal notation for epsilon-NFAs (ε-NFAs),
and how to compute the epsilon-closure of a state. We will also discuss extended transitions and languages recognized by
epsilon-NFAs and techniques for eliminating epsilon-transitions to obtain a simpler NFA or DFA.

1. Use of Epsilon-Transitions

Epsilon-transitions (ε-transitions) are transitions in an automaton that do not require the consumption of any input
symbol. Instead, these transitions allow the automaton to "move" between states without reading any characters from the
input string. This feature gives the automaton more flexibility in recognizing languages and constructing more compact
representations of regular languages.

Key Points:

Non-consumptive: Epsilon-transitions do not consume any input symbol, which means that they allow the automaton
to jump between states without advancing in the input string.

Generalization: Epsilon-transitions generalize the idea of non-determinism, since an automaton can "choose" to take an
epsilon-transition at any point in its execution.

Epsilon-transitions are especially useful in the construction of automata for certain regular languages, and they make the
automaton more efficient by reducing the number of states and transitions required.

2. The Formal Notation for an Epsilon-NFA

An epsilon-NFA (ε-NFA) is a type of Non-Deterministic Finite Automaton (NFA) that allows epsilon-transitions. It is formally
defined as a 5-tuple:

N = (Q, Σ, δ, q0 , F )

Where:

Q is a finite set of states.

Σ is the finite input alphabet.


δ is the transition function, which maps a state and an input symbol (or epsilon) to a set of possible next states. In the
case of epsilon-transitions, δ(q, ϵ) gives the set of states that can be reached from state q via epsilon-transitions alone.

q0 is the initial state.


F is the set of accepting (final) states.

Transition Function:

For an epsilon-NFA, the transition function δ is defined as:

δ(q, a) ⊆ Q for every input symbol a ∈ Σ.


δ(q, ϵ) ⊆ Q for epsilon-transitions, where ϵ is the empty string.

The use of epsilon-transitions adds additional complexity to the transition function, as it introduces the possibility of
reaching a state without consuming an input symbol.

3. Epsilon-Closures

The epsilon-closure of a state q in an epsilon-NFA, denoted as ϵ-closure(q), is the set of states that can be reached from q
by taking epsilon-transitions, including q itself.

Formally, the epsilon-closure of a state q is defined as:

ϵ-closure(q) = {q} ∪ {p ∈ Q ∣ there is a path from q to p using only ϵ-transitions}

In other words, the epsilon-closure of a state is the set of all states that can be reached from q by following epsilon-
transitions alone. This closure is important for computing the next states during the processing of input strings, as we need
to account for all possible states that can be reached without consuming symbols.
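Computing an epsilon-closure is a small graph-reachability problem over the ε-edges only. A minimal sketch, using the empty string "" to stand for ε and the transitions of the example that follows:

    def epsilon_closure(state, delta):
        # All states reachable from `state` via epsilon-transitions only, including itself.
        closure, stack = {state}, [state]
        while stack:
            q = stack.pop()
            for p in delta.get((q, ""), set()):    # "" stands for epsilon
                if p not in closure:
                    closure.add(p)
                    stack.append(p)
        return closure

    # Transitions of the example below: q0 --a--> q1, q1 --eps--> q2, q2 --b--> q3.
    delta = {("q0", "a"): {"q1"}, ("q1", ""): {"q2"}, ("q2", "b"): {"q3"}}
    print(epsilon_closure("q0", delta))   # {'q0'}
    print(epsilon_closure("q1", delta))   # {'q1', 'q2'} (set order may vary)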

Example:

Consider an epsilon-NFA with the following transitions:

δ(q0, a) = {q1}
δ(q1, ϵ) = {q2}
δ(q2, b) = {q3}

Here, ϵ-closure(q0) = {q0}, since no epsilon-transitions leave q0, while ϵ-closure(q1) = {q1, q2}, because from q1 you can reach q2 via the epsilon-transition.

4. Extended Transitions and Languages for ε-NFA

The extended transition function δ ∗ for an epsilon-NFA is an extension of the transition function that handles strings of
arbitrary length, including the epsilon transition.

The extended transition function for an epsilon-NFA is defined as follows:

δ*(q, ϵ) = ϵ-closure(q) (the epsilon-closure of state q is the set of states that can be reached with epsilon-transitions).

For a non-empty string w = a1a2⋯an, the extended transition function is defined recursively:

δ*(q, w) = ⋃ { δ*(p, a2⋯an) ∣ p ∈ δ(r, a1) for some r ∈ ϵ-closure(q) }

That is, we first take the epsilon-closure of q, then follow the transitions on the first symbol a1, and continue recursively with the rest of the string. In other words, δ*(q, w) is the set of states reachable from q after processing the string w, including the epsilon-closures for each state along the way.

Language of an ε-NFA:

The language of an epsilon-NFA N = (Q, Σ, δ, q0, F), denoted L(N), is the set of strings that lead the automaton from the initial state q0 to any accepting state in F, while possibly passing through epsilon-transitions.

Formally:

L(N) = {w ∈ Σ* ∣ δ*(q0, w) ∩ F ≠ ∅}

This means that a string w is accepted by an epsilon-NFA if, after processing w , the automaton can reach at least one of the
accepting states, possibly using epsilon-transitions along the way.

5. Eliminating Epsilon-Transitions

While epsilon-transitions can simplify the construction of automata, they can also complicate the processing of strings. To
convert an epsilon-NFA into an equivalent NFA without epsilon-transitions, we can use an algorithm called epsilon elimination.

The epsilon-elimination process involves the following steps:

1. Compute the Epsilon-Closure: For each state q , compute the epsilon-closure ϵ-closure(q).

2. Update the Transition Function: For each state q and input symbol a, update the transition function δ by adding
transitions from the epsilon-closure of q to the epsilon-closures of the states reached by the symbol a.

3. Handle Epsilon-Closure of Accepting States: If a state in the epsilon-closure of a state is accepting, mark the
corresponding state as accepting in the new automaton.

4. Create the New NFA: After the transition function has been updated for all states, construct the new NFA that no longer
contains epsilon-transitions.

Example:

Consider an epsilon-NFA with the following transitions:

δ(q0, a) = {q1}
δ(q1, ϵ) = {q2}
δ(q2, b) = {q3}

The epsilon-eliminated NFA is obtained as follows:

Compute the epsilon-closures: ϵ-closure(q0) = {q0}, ϵ-closure(q1) = {q1, q2}, and ϵ-closure(q2) = {q2}.

Reading 'a' from q0 leads to q1, whose closure also contains q2, so the new automaton has δ′(q0, a) = {q1, q2}. Reading 'b' then leads to q3, via q2, so δ′(q1, b) = δ′(q2, b) = {q3}.

Thus, the new transition function eliminates the epsilon-transitions, resulting in an equivalent NFA without epsilon-transitions.
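A sketch of the elimination on the same example, building δ′(q, a) as the epsilon-closure of the states reachable on a from the epsilon-closure of q (the helper eclose is a compact version of the closure computation sketched earlier):

    def eclose(state, delta):
        # Epsilon-closure of a single state ("" stands for epsilon).
        closure, stack = {state}, [state]
        while stack:
            for p in delta.get((stack.pop(), ""), set()):
                if p not in closure:
                    closure.add(p)
                    stack.append(p)
        return closure

    def eliminate_epsilon(states, alphabet, delta, accepting):
        # Build an equivalent NFA with no epsilon-transitions.
        new_delta, new_accepting = {}, set()
        for q in states:
            cl = eclose(q, delta)
            if cl & accepting:                  # closure reaches an accepting state
                new_accepting.add(q)
            for a in alphabet:                  # move on a, then close again
                reached = set()
                for r in cl:
                    for p in delta.get((r, a), set()):
                        reached |= eclose(p, delta)
                if reached:
                    new_delta[(q, a)] = reached
        return new_delta, new_accepting

    delta = {("q0", "a"): {"q1"}, ("q1", ""): {"q2"}, ("q2", "b"): {"q3"}}
    new_delta, new_acc = eliminate_epsilon({"q0", "q1", "q2", "q3"}, {"a", "b"}, delta, {"q3"})
    print(new_delta)   # q0 --a--> {q1, q2},  q1 --b--> {q3},  q2 --b--> {q3}
    print(new_acc)     # {'q3'}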

Conclusion:
In this lecture, we explored Finite Automata with Epsilon-Transitions (ε-NFAs). We discussed how epsilon-transitions work,
how to compute epsilon-closures, and how to extend the transition function to handle strings. We also examined the
process of eliminating epsilon-transitions to convert an ε-NFA into an NFA, which is a crucial step in making automata
easier to process and implement. The ability to handle epsilon-transitions extends the power and flexibility of finite
automata, but eliminating them often simplifies the automaton for practical use.

Lecture 11: Regular Expressions


In this lecture, we will discuss Regular Expressions (RegEx), a powerful tool for pattern matching and string manipulation in
computer science. Regular expressions are used to describe patterns in strings and are commonly used in text processing,
searching, and validation. We will cover the basic operators of regular expressions, demonstrate how to construct them, and
explore the precedence of operators.

1. Operators of Regular Expressions

Regular expressions use a set of operators to define patterns in strings. These operators allow for the specification of
complex string patterns, such as repetitions, choices, and groupings.

Key Operators:

1. Concatenation:

This operator specifies that two patterns must appear consecutively in the string.

Example: The regular expression ab matches the string "ab", where 'a' is followed by 'b'.

2. Union (Alternation):

Represented by the vertical bar | , it specifies that either one pattern or another pattern can appear.

Example: The regular expression a|b matches either "a" or "b". The expression abc|def matches either "abc" or
"def".

3. Kleene Star (Zero or More):

Represented by the asterisk * , it matches zero or more occurrences of the preceding pattern.

Example: The regular expression a* matches the empty string "" , "a", "aa", "aaa", and so on.

4. Kleene Plus (One or More):

Represented by the plus sign + , it matches one or more occurrences of the preceding pattern.

Example: The regular expression a+ matches "a", "aa", "aaa", but does not match the empty string "" .

5. Optional (Zero or One):

Represented by the question mark ? , it matches zero or one occurrence of the preceding pattern.

Example: The regular expression a? matches the empty string "" or "a".

6. Character Classes:

Character classes define a set of characters that a single character can be matched against. They are enclosed in
square brackets [] .

Example: The regular expression [a-z] matches any lowercase letter from 'a' to 'z'. The expression [0-9] matches
any digit.

7. Negation of Character Classes:

A negated character class, represented by [^...] , matches any character that is not in the specified set.

Example: The regular expression [^0-9] matches any character that is not a digit.

8. Anchors:

Caret ^ : Matches the beginning of the string.

Dollar $ : Matches the end of the string.

Example: The regular expression ^a matches "a" at the start of the string, and the expression b$ matches "b" at
the end of the string.

9. Grouping and Capturing:

Parentheses () are used to group patterns and create sub-expressions.

Example: The regular expression (ab)+ matches one or more occurrences of "ab".

10. Escape Sequences:

Special characters can be escaped using a backslash \ to match the literal character.

Example: The regular expression \. matches a literal period (dot), whereas . without the backslash matches any
character.

2. Building Regular Expressions

Regular expressions can be built incrementally by combining operators to form complex patterns. Let's go through some
examples to understand how different components are used to construct regular expressions.

Example 1: Matching a Phone Number

We want to create a regular expression that matches a phone number in the format (xxx) xxx-xxxx , where x is a digit.

Step 1: Match the opening parenthesis ( : This is just a literal character, so we use \( .

Step 2: Match three digits: This can be done with the character class [0-9] repeated three times: [0-9]{3} .

Step 3: Match the closing parenthesis ) : Again, this is just a literal character, so we use \) .

Step 4: Match a space: This is just a literal space character in the pattern.

Step 5: Match the next three digits: Again, use [0-9]{3} .

Step 6: Match the hyphen - : Use \- .

Step 7: Match four digits: Use [0-9]{4} .

Combining all the steps, we get the regular expression:

^\(\d{3}\) \d{3}-\d{4}$   (here \d is shorthand for [0-9])

This regular expression matches a phone number like (123) 456-7890 .
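A quick usage sketch with Python's re module (the test strings are illustrative):

    import re

    # \d is equivalent to [0-9]; the backslashes escape the literal parentheses.
    phone = re.compile(r"^\(\d{3}\) \d{3}-\d{4}$")

    print(bool(phone.match("(123) 456-7890")))   # True
    print(bool(phone.match("123-456-7890")))     # False: wrong format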

Example 2: Matching an Email Address

An email address typically consists of a local part, an "@" symbol, and a domain part. The domain part can be a string with
periods separating the parts (e.g., example.com ).

Step 1: Match the local part, which can include alphanumeric characters and some special symbols: [a-zA-Z0-
9._%+-]+ .

Step 2: Match the "@" symbol: @ .

Step 3: Match the domain part, which consists of alphanumeric characters and periods: [a-zA-Z0-9.-]+ .

Step 4: Optionally, match the top-level domain, which consists of two or more letters: [a-zA-Z]{2,} .

The full regular expression becomes:

^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$

This regular expression matches email addresses like user@example.com.

Example 3: Matching a URL

A URL typically has a protocol (e.g., http or https ), followed by :// , and then a domain name and possibly a path.

Step 1: Match the protocol, which can be "http" or "https": https? .

Step 2: Match the :// : :// .

Step 3: Match the domain name, which consists of alphanumeric characters and dots: [a-zA-Z0-9.-]+ .

Step 4: Optionally, match a path starting with / : (/.*)? .

The regular expression for a simple URL is:

^https?://[a-zA-Z0-9.-]+(/.*)?$

This matches URLs like http://example.com or https://www.example.com/path .

3. Precedence of Regular Expression Operators

Regular expression operators have a defined precedence, which determines the order in which they are applied in the
absence of parentheses. The standard precedence is as follows (from highest to lowest):

1. Parentheses () : Used for grouping and capturing sub-expressions. Expressions inside parentheses are evaluated first.

2. Kleene Star * , Plus + , and Question Mark ? : These operators are applied to the preceding sub-expression and have
the second-highest precedence.

3. Concatenation (implicit): The concatenation operator is applied after Kleene star, plus, and question mark, but before
alternation.

4. Alternation | : The alternation operator has the lowest precedence.

Example: Understanding Precedence

Consider the regular expression a|b* . This expression will match:

a : Matches a literal 'a'.

b* : Matches zero or more occurrences of 'b'.

Because the alternation operator ( | ) has lower precedence than the Kleene star ( * ), the expression is interpreted as a |
(b*) . Thus, it matches either "a" or zero or more occurrences of 'b'.

If we wanted to change the meaning of this expression, we could add parentheses, like (a|b)* , which would match zero or
more occurrences of "a" or "b".
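The difference between a|b* and (a|b)* is easy to check experimentally. A minimal Python sketch:

python

import re

loose   = re.compile(r"a|b*")     # parsed as a | (b*)
grouped = re.compile(r"(a|b)*")

for s in ["a", "bbb", "ab", "aab"]:
    print(s, bool(loose.fullmatch(s)), bool(grouped.fullmatch(s)))
# "ab" and "aab" match only the grouped version (a|b)*.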

Conclusion:
In this lecture, we have covered Regular Expressions—a vital tool for pattern matching and string manipulation. We
discussed the basic operators used in regular expressions, how to construct regular expressions through examples, and the
importance of operator precedence. Regular expressions are essential for many text processing tasks, such as searching,
replacing, and validating strings, and understanding how to construct and interpret them is crucial for working with
patterns in strings efficiently.

Lecture 12: Finite Automata and Regular Expressions


In this lecture, we will discuss the relationship between Finite Automata (FA) and Regular Expressions (RE). Specifically, we
will focus on the following three topics:

1. From DFA to RE: We will explore the formal proof for converting a Deterministic Finite Automaton (DFA) to a regular
expression.

2. Converting DFA to RE by State Elimination: This method is a systematic procedure to convert a DFA into a regular
expression.

3. Converting RE to Automata: We will also explore how to convert a regular expression back into a finite automaton.

These conversions will be discussed through formal proofs and examples to ensure a deep understanding of how finite
automata and regular expressions are equivalent in their ability to recognize regular languages.

1. From DFA to RE (Formal Discussion and Proofs)
The process of converting a Deterministic Finite Automaton (DFA) to a Regular Expression (RE) is based on the idea that
every language accepted by a DFA can be described by a regular expression. We will prove this by showing that for any DFA,
there exists an equivalent regular expression that describes the same language.

Proof Outline for DFA to RE Conversion

Let D = (Q, Σ, δ, q0, F) be a DFA, where:

Q is the finite set of states,

Σ is the input alphabet,

δ : Q × Σ → Q is the transition function,

q0 ∈ Q is the start state,

F ⊆ Q is the set of accepting states.

We aim to construct a regular expression R such that L(D) = L(R), where L(D) is the language accepted by the DFA.

Key Ideas for the Proof:

1. Transition Matrix Representation:

We can represent the transitions of the DFA in matrix form, where each entry Tij is a regular expression describing the input symbols that lead from state i to state j. If there is no direct transition, the entry is ∅ (the empty set).

Over time, we can update this matrix to represent all possible paths between states using regular expressions.

2. Constructing the Regular Expression:

We start by initializing the regular expressions for each transition between states. These initial expressions
correspond to the individual symbols in the alphabet that lead from one state to another.

We then iteratively compute the regular expressions for longer paths (using the Kleene star and concatenation)
between states by considering multiple transitions.

3. Final Regular Expression:

The final regular expression is derived by considering all paths from the start state q0 to any accepting state f ∈ F, incorporating any intermediate states and transitions, and combining them using the appropriate operators.

Example:

Let’s consider a DFA D with the following components:

States: Q = {q0, q1}

Alphabet: Σ = {a, b}

Start state: q0

Accepting state: F = {q1}

Transition function δ is defined as:

δ(q0, a) = q0

δ(q0, b) = q1

δ(q1, a) = q1

δ(q1, b) = q1

The regular expression for this DFA can be found as follows:

At q0 we can loop on 'a' any number of times, the path from q0 to q1 is the single transition on 'b', and once at q1 we can loop on both 'a' and 'b'.

Therefore, the regular expression is a∗b(a∣b)∗.

Thus, the language accepted by the DFA is L(D) = L(a∗b(a∣b)∗): the set of all strings over {a, b} that contain at least one 'b'.
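As a sanity check, separate from the formal argument, the following Python sketch simulates this two-state DFA and compares it against the regular expression a∗b(a∣b)∗ on every string of length at most 6 (the helper names are our own):

python

import re
from itertools import product

# Transition function of the DFA above.
delta = {("q0", "a"): "q0", ("q0", "b"): "q1",
         ("q1", "a"): "q1", ("q1", "b"): "q1"}

def dfa_accepts(w):
    q = "q0"                      # start state
    for c in w:
        q = delta[(q, c)]
    return q == "q1"              # q1 is the only accepting state

pattern = re.compile(r"a*b(a|b)*")
for n in range(7):
    for chars in product("ab", repeat=n):
        w = "".join(chars)
        assert dfa_accepts(w) == bool(pattern.fullmatch(w))
print("DFA and regular expression agree on all strings up to length 6")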

2. Converting DFA to RE by State Elimination


The state elimination method is a systematic approach to converting a DFA into a regular expression. The main idea
behind this approach is to progressively eliminate states while adjusting the transitions to preserve the language of the
automaton.

State Elimination Algorithm:

1. Initial Setup: Begin with the transition table of the DFA. For each transition, assign a regular expression corresponding
to the transition between states.

2. Eliminate States:

Choose a state q (except the start and accepting states) and eliminate it.

For each pair of states p and r that are connected via state q , update the transition between p and r to reflect the
new path that goes through q .

This involves combining the regular expressions for paths that pass through q using union and concatenation
operators.

3. Repeat the process until only the start state and accepting states remain.

4. Final Regular Expression: Once all non-start and non-accepting states are eliminated, the regular expression for the
transition between the start state and accepting states is the final regular expression that describes the language of the
DFA.

Example:

Let’s consider a simple DFA with the following components:

States: Q = {q0, q1, q2}

Alphabet: Σ = {a, b}

Start state: q0

Accepting state: q2

Transition function δ:

δ(q0, a) = q1, δ(q0, b) = q0

δ(q1, a) = q1, δ(q1, b) = q2

δ(q2, a) = q2, δ(q2, b) = q2

To convert this DFA to a regular expression, we eliminate states one by one:

1. Eliminate q1:

The transition from q0 to q2 is updated to reflect the path q0 → q1 → q2: an 'a' into q1, a loop on 'a' at q1, and a 'b' out to q2, which gives the regular expression aa∗b.

2. Account for the loops at the remaining states:

The loop on 'b' at the start state q0 contributes b∗ before this path, and the loop on 'a' and 'b' at the accepting state q2 contributes (a∣b)∗ after it.

Thus, the regular expression corresponding to the DFA is b∗aa∗b(a∣b)∗.
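The elimination step can be captured in a few lines of code. The sketch below is illustrative Python with a simplified edge representation of our own; it assumes a fresh start state S and a fresh accepting state A have been attached with ε-edges (written here as empty strings), and it reproduces the expression derived above:

python

def eliminate_states(states, start, accept, edges):
    """State elimination sketch: edges maps (p, r) -> regex string labelling p -> r.
    Assumes `start` has no incoming edges and `accept` has no outgoing edges."""
    def union(a, b):
        if a is None:
            return b
        if b is None:
            return a
        return f"({a}|{b})"

    def star(a):
        return "" if a is None else f"({a})*"

    def concat(*parts):
        return "".join(p for p in parts if p)

    for q in [s for s in states if s not in (start, accept)]:
        loop = edges.pop((q, q), None)
        ins = [p for (p, r) in list(edges) if r == q]
        outs = [r for (p, r) in list(edges) if p == q]
        for p in ins:
            for r in outs:
                path = concat(edges[(p, q)], star(loop), edges[(q, r)])
                edges[(p, r)] = union(edges.get((p, r)), path)
        for key in [k for k in edges if q in k]:   # drop all edges touching q
            del edges[key]
    return edges.get((start, accept), "")

# The DFA from the example, with fresh start S and accept A attached by ε-edges ("").
edges = {("S", "q0"): "", ("q0", "q0"): "b", ("q0", "q1"): "a",
         ("q1", "q1"): "a", ("q1", "q2"): "b",
         ("q2", "q2"): "(a|b)", ("q2", "A"): ""}
print(eliminate_states(["S", "q0", "q1", "q2", "A"], "S", "A", edges))
# -> (b)*a(a)*b((a|b))*, i.e. b*aa*b(a|b)* up to redundant parentheses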

3. Converting RE to Automata
The conversion of a regular expression (RE) to a finite automaton (FA) involves constructing an automaton that recognizes
the same language described by the regular expression. This can be done using methods such as the Thompson's
construction for Non-deterministic Finite Automata (NFA).

Thompson’s Construction Algorithm:

1. Base Case:

If the regular expression is a single symbol, create a simple NFA with two states: one for the start and one for the
accepting state, with a transition labeled by the symbol.

2. Recursive Case:

For each operator in the regular expression (concatenation, alternation, or Kleene star), build an NFA for that
operator by combining smaller NFAs based on the specific construction rules.

Key Construction Rules:

Concatenation: If R1 and R2 are regular expressions, construct an NFA for R1 R2 by linking the accepting state of the NFA for R1 to the start state of the NFA for R2 with an epsilon transition.

Union (Alternation): If R1 and R2 are regular expressions, construct an NFA for R1 ∪ R2 by creating a new start state with epsilon transitions to the start states of the NFAs for R1 and R2, and a new accepting state reached by epsilon transitions from the accepting states of both NFAs.

Kleene Star: For a regular expression R, construct an NFA for R∗ by creating a new start state and a new accepting state; add epsilon transitions from the new start state to both the start state of R and the new accepting state, from the accepting state of R back to the start state of R, and from the accepting state of R to the new accepting state.

Example:

For the regular expression (a∣b)∗ , Thompson's construction produces an NFA that accepts any string consisting of zero or
more 'a's or 'b's.
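The following Python sketch (illustrative only; the dictionary-based NFA encoding and the helper names are our own) follows the three construction rules above and then simulates the resulting ε-NFA for (a∣b)∗:

python

import itertools

_ids = itertools.count()

def new_state():
    return next(_ids)

def symbol(c):
    """Fragment accepting exactly the one-symbol string c."""
    s, f = new_state(), new_state()
    return {"start": s, "accept": f, "delta": {(s, c): {f}}}

def _merged(d1, d2):
    d = {k: set(v) for k, v in d1.items()}
    for k, v in d2.items():
        d.setdefault(k, set()).update(v)
    return d

def union(n1, n2):
    """New start with ε-moves to both fragments; both accepts ε-move to a new accept."""
    s, f = new_state(), new_state()
    d = _merged(n1["delta"], n2["delta"])
    d.setdefault((s, None), set()).update({n1["start"], n2["start"]})
    d.setdefault((n1["accept"], None), set()).add(f)
    d.setdefault((n2["accept"], None), set()).add(f)
    return {"start": s, "accept": f, "delta": d}

def concat(n1, n2):
    """ε-move from the accept of the first fragment to the start of the second."""
    d = _merged(n1["delta"], n2["delta"])
    d.setdefault((n1["accept"], None), set()).add(n2["start"])
    return {"start": n1["start"], "accept": n2["accept"], "delta": d}

def star(n):
    """New start/accept; ε-moves allow skipping or repeating the inner fragment."""
    s, f = new_state(), new_state()
    d = {k: set(v) for k, v in n["delta"].items()}
    d.setdefault((s, None), set()).update({n["start"], f})
    d.setdefault((n["accept"], None), set()).update({n["start"], f})
    return {"start": s, "accept": f, "delta": d}

def accepts(nfa, w):
    def closure(states):
        stack, seen = list(states), set(states)
        while stack:
            for r in nfa["delta"].get((stack.pop(), None), ()):
                if r not in seen:
                    seen.add(r)
                    stack.append(r)
        return seen
    current = closure({nfa["start"]})
    for c in w:
        current = closure({r for q in current for r in nfa["delta"].get((q, c), ())})
    return nfa["accept"] in current

nfa = star(union(symbol("a"), symbol("b")))      # Thompson NFA for (a|b)*
print(accepts(nfa, ""), accepts(nfa, "abba"), accepts(nfa, "abc"))  # True True False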

Conclusion
In this lecture, we explored the connections between Finite Automata and Regular Expressions. We covered the conversion
process from DFA to RE, including formal proofs and examples, as well as the state elimination method for converting a
DFA into a regular expression. Additionally, we discussed how to convert a regular expression to an automaton using
Thompson’s construction. These conversions show the equivalence of regular expressions and finite automata, both of
which describe the class of regular languages. Understanding these conversions is essential for analyzing and constructing
automata and regular expressions.

Lecture 13: Applications of Regular Expressions
In this lecture, we explore practical applications of Regular Expressions (RE), focusing on their usage in Unix systems,
lexical analysis, and pattern matching in text. Regular expressions are powerful tools for searching, manipulating, and
analyzing strings, and they are widely used in both academic and industrial applications.

1. Regular Expressions in UNIX


In Unix and Unix-like operating systems, regular expressions are extensively used in various command-line utilities for
pattern matching and text processing. Tools such as grep, sed, and awk leverage regular expressions to search for patterns
within files, extract data, and manipulate strings.

a) grep (Global Regular Expression Print)

Functionality: grep searches through text or files for lines that match a specified pattern. It supports both basic and
extended regular expressions.

Syntax:

bash

grep 'pattern' filename

Example: To find all lines containing the word "error" in a log file:

bash

grep 'error' logfile.txt

You can also use regular expression operators to make searches more sophisticated, such as matching specific patterns
or character classes.

b) sed (Stream Editor)

Functionality: sed is used for text transformation and stream editing. It allows for search and replace operations, and
it supports regular expressions to match patterns in the input text.

Syntax:

bash

sed 's/pattern/replacement/g' filename

Example: To replace all occurrences of "apple" with "orange" in a file:

bash

sed 's/apple/orange/g' file.txt

Regular expressions in sed allow for advanced text manipulation, such as deleting lines matching a pattern, inserting
text, or modifying specific parts of lines.

c) awk (Pattern Scanning and Processing Language)

Functionality: awk is a powerful text-processing tool that uses regular expressions to pattern match text and perform
actions on it, such as printing selected fields, performing calculations, or reformatting the text.

Syntax:

bash

awk '/pattern/ {action}' filename

Example: To print the second field of each line that contains the word "apple":

bash

awk '/apple/ {print $2}' file.txt

Regular expressions are integral to awk 's ability to perform complex text manipulation tasks.

2. Lexical Analysis
Lexical analysis is the process of converting a sequence of characters (such as source code or text) into a sequence of
tokens, which are meaningful chunks of information. This process is often the first step in the compilation of programming
languages, but it is also applicable in many other areas, such as natural language processing and data extraction.

Role of Regular Expressions in Lexical Analysis

Tokenization: Regular expressions are used to define the patterns of valid tokens in a language. For example, in a
programming language, tokens might include keywords, operators, identifiers, and literals, all of which can be
described using regular expressions.

Finite Automata: The process of lexical analysis can be modeled using finite automata. Each regular expression can be
converted into a finite automaton (either deterministic or non-deterministic), which then performs the tokenization by
matching the input against the regular expressions.

Example:

Consider a simple language with the following tokens:

Keywords: if , else , while

Identifiers: Any sequence of letters and digits starting with a letter (e.g., var1 , x , hello ).

Numbers: A sequence of digits (e.g., 123 , 456 ).

We can define regular expressions for each of these tokens:

Keyword Regular Expressions:

regex

if | else | while

Identifier Regular Expression:

regex

[a-zA-Z][a-zA-Z0-9]*

Number Regular Expression:

regex

[0-9]+

A lexical analyzer would use these regular expressions to scan the input string and extract tokens such as if , var1 , and
123 .
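A lexical analyzer for this toy token set can be sketched directly with Python's re module. The token names and the single master pattern below are illustrative, not a fixed standard:

python

import re

TOKEN_SPEC = [
    ("KEYWORD",    r"\b(?:if|else|while)\b"),
    ("IDENTIFIER", r"[a-zA-Z][a-zA-Z0-9]*"),
    ("NUMBER",     r"[0-9]+"),
    ("SKIP",       r"\s+"),
]
# One alternation of named groups; keywords are listed before identifiers.
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def tokenize(text):
    for m in MASTER.finditer(text):
        if m.lastgroup != "SKIP":
            yield (m.lastgroup, m.group())

print(list(tokenize("if x1 123 while")))
# [('KEYWORD', 'if'), ('IDENTIFIER', 'x1'), ('NUMBER', '123'), ('KEYWORD', 'while')]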

3. Finding Patterns in Text


Regular expressions are widely used for pattern matching and searching within large text datasets. This ability is useful in
a variety of applications, including data mining, text processing, and information retrieval. Regular expressions provide a
concise and efficient way to search for complex patterns and extract relevant information.

Applications in Text Search:

Search and Replace: A common use of regular expressions is to find a pattern in a text and replace it with another
string. This is particularly useful for tasks such as correcting errors, formatting text, or performing batch modifications
on documents.

Pattern Matching: Regular expressions allow for the detection of complex patterns, such as email addresses, phone
numbers, or dates, within large bodies of text. These patterns can be used for data extraction or validation.

Example:

Consider the task of extracting email addresses from a block of text. A regular expression for matching most common email
formats could be:

regex

[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}

This regular expression matches email addresses with the following components:

Username: Alphanumeric characters, including periods, underscores, and special characters.

Domain: Alphanumeric characters and hyphens.

Top-Level Domain: A period followed by two or more alphabetic characters.

By using this regular expression in a text search utility, we can extract all email addresses from the text.
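For instance, a short Python sketch (with made-up sample text) that applies this pattern:

python

import re

# The email pattern from above, used to pull email-like substrings out of free text.
EMAIL = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")

text = "Contact alice@example.com or bob.smith@mail.example.org for details."
print(EMAIL.findall(text))
# ['alice@example.com', 'bob.smith@mail.example.org']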

Advanced Text Pattern Matching:

Regular expressions are also used in more advanced scenarios, such as searching for patterns that span multiple lines or
handling cases with optional or repeated text elements. This flexibility is critical for working with real-world text, where the
structure can vary significantly.

Conclusion

Regular expressions have a broad range of applications, from basic text processing to complex tasks like lexical analysis and
data mining. In Unix systems, they are essential tools for text search, manipulation, and processing, enabling users to
efficiently work with large datasets and files. In lexical analysis, regular expressions help define the structure of tokens,
playing a crucial role in the parsing of programming languages and other formal grammars. Finally, pattern matching in
text is one of the most widely used applications of regular expressions, enabling users to search, extract, and manipulate
data in powerful ways. Understanding these applications equips you with the tools needed for a wide variety of practical
problems in text processing and automation.

Lecture 14: Algebraic Laws of Regular Expressions


In this lecture, we will explore the algebraic laws that govern regular expressions (RE). These laws provide a set of rules for
simplifying, manipulating, and reasoning about regular expressions. Understanding these laws is crucial for optimizing
regular expressions, proving equivalencies between expressions, and designing efficient algorithms in text processing and
lexical analysis.

We will cover the following algebraic laws:

1. Associativity and Commutativity

2. Distributive Laws

3. Idempotent Law

4. Closure Laws

5. Discovering Laws for RE (including the derivation of a new law)

6. Testing Algebraic Laws for Regular Expressions

1. Associativity and Commutativity

a) Associativity

The associative property of regular expressions refers to the grouping of operations in a regular expression. The grouping
does not affect the result of the operation when combined with other operations.

Formal Statement:

Concatenation:

(R1 ⋅ (R2 ⋅ R3)) = ((R1 ⋅ R2) ⋅ R3)

The order in which concatenations are grouped does not affect the result.

Union:

(R1 ∪ (R2 ∪ R3)) = ((R1 ∪ R2) ∪ R3)

The union operation is also associative.

Example:

Concatenation example: Let R1 = a, R2 = b, and R3 = c. Then:

(a ⋅ (b ⋅ c)) = (a ⋅ b ⋅ c) = ((a ⋅ b) ⋅ c)

Union example: Let R1 = a, R2 = b, and R3 = c. Then:

(a ∪ (b ∪ c)) = (a ∪ b ∪ c) = ((a ∪ b) ∪ c)

b) Commutativity

The commutative property states that the order of the operands of an operation does not affect the result. For regular expressions this holds for union, but not for concatenation.

Formal Statement:

Union:

(R1 ∪ R2) = (R2 ∪ R1)

The order of the operands in a union operation does not change the resulting language. Concatenation, by contrast, is not commutative in general: L(ab) = {ab} while L(ba) = {ba}.

Example:

Let R1 = a and R2 = b. Then:

(a ∪ b) = (b ∪ a)

Both expressions represent the language {a, b}.

2. Distributive Laws
The distributive property involves the distribution of one operation over another. This property is essential in simplifying
regular expressions and is frequently used in the manipulation and optimization of patterns.

Formal Statement:

Concatenation distributes over union on the left:

(R1 ⋅ (R2 ∪ R3)) = (R1 ⋅ R2) ∪ (R1 ⋅ R3)

Concatenation distributes over union on the right:

((R1 ∪ R2) ⋅ R3) = (R1 ⋅ R3) ∪ (R2 ⋅ R3)

Note that union does not distribute over concatenation: for example, L(a ∪ (b ⋅ c)) = {a, bc}, whereas L((a ∪ b) ⋅ (a ∪ c)) = {aa, ac, ba, bc}.

Example:

Let R1 = a, R2 = b, and R3 = c.

Applying the left distributive law:

(a ⋅ (b ∪ c)) = (a ⋅ b) ∪ (a ⋅ c)

Applying the right distributive law:

((a ∪ b) ⋅ c) = (a ⋅ c) ∪ (b ⋅ c)
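These identities, and the failure of the converse, can be checked empirically by enumerating all short strings. A small Python sketch, assuming a three-letter alphabet {a, b, c} (the helper name lang is our own):

python

import re
from itertools import product

ALPHABET = "abc"

def lang(pattern, max_len=4):
    """All strings over ALPHABET of length <= max_len matched by the pattern."""
    p = re.compile(pattern)
    return {"".join(t) for n in range(max_len + 1)
            for t in product(ALPHABET, repeat=n) if p.fullmatch("".join(t))}

# Concatenation distributes over union:
assert lang(r"a(b|c)") == lang(r"ab|ac")
assert lang(r"(a|b)c") == lang(r"ac|bc")

# ...but union does not distribute over concatenation:
print(lang(r"a|bc"))        # {'a', 'bc'}
print(lang(r"(a|b)(a|c)"))  # {'aa', 'ac', 'ba', 'bc'}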

3. Idempotent Law
The idempotent law states that applying the same operation multiple times does not change the result. This is useful for
simplifying regular expressions and eliminating redundant parts of a pattern.

Formal Statement:

Union:

(R ∪ R) = R

Taking the union of a regular expression with itself does not change the language it denotes.

Note that concatenation is not idempotent in general: L(R ⋅ R) need not equal L(R).

Example:

Let R = a.

For union:
(a ∪ a) = a

For concatenation, however:
L(a ⋅ a) = {aa} ≠ {a} = L(a)

4. Closure Laws
The closure laws apply specifically to the Kleene star operation, which denotes zero or more repetitions of a pattern.

Formal Statement:

Idempotence of the Kleene star:

R∗ = (R∗)∗

Repeatedly applying the Kleene star to a regular expression does not change its meaning.

Concatenation with Identity:

R⋅ϵ=R=ϵ⋅R

Concatenating a regular expression with the empty string (ϵ) does not affect the regular expression.

Example:

Let R = a.

For closure:
(a∗ ) = ((a∗ )∗ )
For identity:
(a ⋅ ϵ) = a

5. Discovering a New Law for RE (Derived Law)


Let us now derive a new law for regular expressions, specifically a reverse law. This law states that:

Reverse Law: For a regular expression R, the reverse of the language L(R) is the language obtained by reversing each string in L(R). For concatenation, this can be expressed as:

(R1 ⋅ R2)^R = R2^R ⋅ R1^R

This law indicates that the reverse of a concatenated expression is the concatenation of the reverses of the individual
expressions, in reverse order.

Proof of the Reverse Law:

Let R1 = a and R2 = b. Then:

The language of R1 ⋅ R2 is {ab}.

The reverse of this language is {ba}, which is exactly the language of R2^R ⋅ R1^R = ba.

Thus, the reverse law holds for this simple case, and it can be generalized (by induction on the length of the strings involved) to all regular expressions.

6. Test for a Regular Expression Algebraic Law


We will now test the reverse law for regular expressions with a more complex example.

Example: Let R1 = a(b∣c)∗ and R2 = d. We need to test the law:

(R1 ⋅ R2)^R = R2^R ⋅ R1^R

The language of R1 ⋅ R2 is the set of strings that start with 'a', followed by any combination of 'b' and 'c', and end with 'd'.

The reverse of this language is the set of strings that start with 'd', followed by any combination of 'b' and 'c', and end with 'a'.

Now let's reverse each part individually:

The reverse of R1 = a(b∣c)∗ is (b∣c)∗a.

The reverse of R2 = d is simply d.

Thus:

(R1 ⋅ R2)^R = d ⋅ (b∣c)∗a = R2^R ⋅ R1^R

This confirms that the reverse law holds.

Conclusion
In this lecture, we explored several important algebraic laws of regular expressions, including associativity,
commutativity, distributive laws, idempotent law, and closure laws. We also derived a new law for regular expressions,
the reverse law, and tested it through examples. These algebraic properties are foundational for simplifying, optimizing,
and reasoning about regular expressions, and they play a critical role in various applications like text processing, lexical
analysis, and pattern matching. Understanding and applying these laws will improve your ability to work with regular
expressions efficiently.

Lecture 15: Proving Languages Not to be Regular


In this lecture, we focus on techniques for proving that certain languages are not regular. The central tool for this task is
the Pumping Lemma for Regular Languages, which provides a formal way to argue that certain languages cannot be
recognized by finite automata. We will begin by presenting the Pumping Lemma in detail, including its formal statement
and proof. Then, we will explore how to apply the Pumping Lemma to demonstrate that specific languages are not regular.

1. Pumping Lemma for Regular Languages

The Pumping Lemma for Regular Languages provides a necessary condition for a language to be regular. It states that for
any regular language L, there exists a constant p (called the pumping length) such that any string s ∈ L with length
greater than or equal to p can be split into three parts s = xyz with the following properties:
1. ∣xy∣ ≤ p
2. ∣y∣ > 0
3. xy^k z ∈ L for all k ≥ 0

In other words, if L is regular, then any sufficiently long string in L can be decomposed into three parts such that the
middle part (denoted y ) can be repeated any number of times (including zero) while still remaining in L.

Proof of the Pumping Lemma for Regular Languages

The lemma itself is proved directly from the pigeonhole principle; it is then applied in proofs by contradiction to show that specific languages are not regular. We now give the proof:

Assume L is a regular language.

Since L is regular, there exists a deterministic finite automaton (DFA) M that recognizes L. Let the number of states in M be p (this will serve as the pumping length).

Let s be any string in L with ∣s∣ ≥ p. Since M has only p states, by the pigeonhole principle, as M processes the first p symbols of s it visits p + 1 states (counting the start state) and must therefore visit some state more than once.

Let the string s be decomposed as s = xyz, where:

x is the prefix of s processed by M up to the first visit of the repeated state,

y is the non-empty portion of s that brings M back to that same state (the loop),

z is the remaining suffix of s.

Because the repetition occurs within the first p symbols, ∣xy∣ ≤ p and ∣y∣ > 0. Because M returns to the same state after reading y, repeating the loop any number of times (or omitting it) leaves M in the same state before it reads z, so the resulting string is still accepted by M. Therefore, xy^k z ∈ L for all k ≥ 0, which proves the pumping lemma.

This proof shows that any sufficiently long string s in a regular language can be decomposed into three parts, and the middle part can be pumped (repeated) without leaving the language.
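The pigeonhole argument can be made concrete in code. The sketch below is an illustrative Python function (not part of the lemma itself) that runs a DFA over a string and returns the decomposition x, y, z the moment a state repeats:

python

def pump_decomposition(delta, start, s):
    """Find x, y, z with s = xyz, |y| > 0, and the DFA in the same state
    before and after y (the loop guaranteed by the pigeonhole principle)."""
    seen = {start: 0}            # state -> position of its first visit
    q = start
    for i, c in enumerate(s):
        q = delta[(q, c)]
        if q in seen:            # a state repeats: the looped portion is y
            j = seen[q]
            return s[:j], s[j:i + 1], s[i + 1:]
        seen[q] = i + 1
    return None                  # cannot happen when len(s) >= number of states

# Two-state DFA over {a, b} accepting strings with an even number of a's.
delta = {("e", "a"): "o", ("e", "b"): "e",
         ("o", "a"): "e", ("o", "b"): "o"}
x, y, z = pump_decomposition(delta, "e", "abab")
print(x, y, z)   # a b ab -- pumping the 'b' keeps the number of a's even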

2. Applications of the Pumping Lemma


The Pumping Lemma can be used to prove that a language is not regular by showing that, for a suitably chosen string, no matter how it is split into parts xyz satisfying the conditions, there is some value of k for which xy^k z ∉ L. This contradiction proves that the language is not regular.
We will now explore a few examples to demonstrate how the Pumping Lemma can be applied to prove that a language is
not regular.

Example 1: Language L = {a^n b^n ∣ n ≥ 0}

We want to prove that the language L = {a^n b^n ∣ n ≥ 0} is not regular.

Assume L is regular.

By the Pumping Lemma, there exists a pumping length p.

Let s = a^p b^p. Clearly, s ∈ L and ∣s∣ = 2p ≥ p.

Now, according to the Pumping Lemma, we can decompose s = xyz, where:

1. ∣xy∣ ≤ p,
2. ∣y∣ > 0,
3. xy^k z ∈ L for all k ≥ 0.

The string s = a^p b^p consists of p 'a's followed by p 'b's. The part y must consist only of 'a's, since ∣xy∣ ≤ p, meaning that y = a^i for some i > 0.

Now, let us pump y with k = 2:

xy^2 z = a^{p+i} b^p

This string a^{p+i} b^p is not in L because the number of 'a's and 'b's is no longer equal. Thus, the pumping lemma is violated, and we have a contradiction.

Since s cannot be decomposed in any way satisfying the conditions with xy^k z ∈ L for all k ≥ 0, we conclude that L is not regular.

Example 2: Language L = {a^n b^n c^n ∣ n ≥ 0}

We want to prove that the language L = {a^n b^n c^n ∣ n ≥ 0} is not regular.

Assume L is regular.

By the Pumping Lemma, there exists a pumping length p.

Let s = a^p b^p c^p. Clearly, s ∈ L and ∣s∣ = 3p ≥ p.

Now, we decompose s = xyz, where:

1. ∣xy∣ ≤ p,
2. ∣y∣ > 0,
3. xy^k z ∈ L for all k ≥ 0.

The string s = a^p b^p c^p consists of p 'a's, p 'b's, and p 'c's. Since ∣xy∣ ≤ p, the part y must consist only of 'a's, so y = a^i for some i > 0.

Now, let us pump y with k = 2:

xy^2 z = a^{p+i} b^p c^p

This string a^{p+i} b^p c^p is not in L because the number of 'a's no longer matches the number of 'b's and 'c's. Thus, the pumping lemma is violated, and we have a contradiction.

Since s cannot be decomposed in any way satisfying the conditions with xy^k z ∈ L for all k ≥ 0, we conclude that L is not regular.

Example 3: Language L = {ww ∣ w ∈ {a, b}∗}

We want to prove that the language L = {ww ∣ w ∈ {a, b}∗} is not regular.

Assume L is regular.

By the Pumping Lemma, there exists a pumping length p.

Let s = a^p b^p a^p b^p (so w = a^p b^p). Clearly, s ∈ L and ∣s∣ = 4p ≥ p.

Now, we decompose s = xyz, where:

1. ∣xy∣ ≤ p,
2. ∣y∣ > 0,
3. xy^k z ∈ L for all k ≥ 0.

The string s = a^p b^p a^p b^p consists of p 'a's, followed by p 'b's, followed by another p 'a's, and then another p 'b's. Since ∣xy∣ ≤ p, the part y must consist of 'a's from the first block of the string, so y = a^i for some i > 0.

Now, let us pump y with k = 2:

xy^2 z = a^{p+i} b^p a^p b^p

This string is not in L because it no longer has the form ww: only the first block of 'a's has grown, so the two halves of the pumped string cannot be equal. Thus, the pumping lemma is violated, and we have a contradiction.

Since s cannot be decomposed in any way satisfying the conditions with xy^k z ∈ L for all k ≥ 0, we conclude that L is not regular.

Conclusion
In this lecture, we discussed the Pumping Lemma for Regular Languages, which provides a powerful tool for proving that
certain languages are not regular. By assuming a language is regular and showing that no decomposition of a sufficiently
long string satisfies the pumping lemma, we can demonstrate that the language cannot be recognized by a finite
automaton. Through several examples, we applied the Pumping Lemma to prove that languages such as {a^n b^n ∣ n ≥ 0}, {a^n b^n c^n ∣ n ≥ 0}, and {ww ∣ w ∈ {a, b}∗} are not regular.

Lecture 16: Closure Properties of Regular Languages


In this lecture, we explore the closure properties of regular languages (RL), focusing on how regular languages behave
under various operations. Closure properties describe the set of operations under which regular languages remain regular.
Specifically, we will discuss the closure of regular languages under:

1. Boolean operations (union, intersection, and complement),

2. Reversal (taking the reverse of a language),

3. Homomorphisms (applying a substitution rule to the alphabet),

4. Inverse Homomorphisms (reversing the substitution).

Each of these operations will be discussed with detailed proofs demonstrating that the resulting languages remain regular.

1. Closure of Regular Languages Under Boolean Operations

Union:

Regular languages are closed under union, meaning that if L1 and L2 are regular languages, then L1 ∪ L2 is also regular.

We will show this through a construction of a nondeterministic finite automaton (NFA) for the union of two regular
languages.

Proof:

Let L1 and L2 be regular languages, and let M1 = (Q1, Σ, δ1, q10, F1) and M2 = (Q2, Σ, δ2, q20, F2) be the NFAs recognizing L1 and L2, respectively (we may assume the state sets Q1 and Q2 are disjoint).

Construct a new ε-NFA M = (Q, Σ, δ, q0, F) where:

Q = Q1 ∪ Q2 ∪ {q0},

q0 is a new start state,

F = F1 ∪ F2 (the accepting states are the union of the accepting states of M1 and M2),

The transition function δ is defined as:

δ(q0, ε) = {q10, q20} (a nondeterministic ε-move that starts the computation in either M1 or M2),

δ1 and δ2 are used unchanged for transitions within the individual NFAs M1 and M2.

Since an ε-NFA recognizing the union of L1 and L2 can be constructed, and every ε-NFA has an equivalent DFA, the union of regular languages is regular.

Intersection:

Regular languages are closed under intersection. We will prove this closure property by showing that the intersection of
two regular languages can be recognized by a deterministic finite automaton (DFA).

Proof:

Let L1 and L2 be regular languages, and let M1 = (Q1, Σ, δ1, q10, F1) and M2 = (Q2, Σ, δ2, q20, F2) be the DFAs recognizing L1 and L2, respectively.

Construct a new DFA M = (Q, Σ, δ, q0, F) where:

Q = Q1 × Q2 (the Cartesian product of the state sets of M1 and M2),

q0 = (q10, q20) (the initial state is the pair of initial states of M1 and M2),

F = F1 × F2 (the accepting states are the pairs of accepting states from M1 and M2),

The transition function δ is defined as:

δ((q1, q2), a) = (δ1(q1, a), δ2(q2, a)).

Since a DFA recognizing the intersection of L1 and L2 can be constructed, the intersection of regular languages is regular.
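The product construction in this proof translates directly into code. A minimal Python sketch (the dictionary representation of a DFA and the two sample languages are our own choices):

python

def product_dfa(d1, d2):
    """Product DFA recognizing L(d1) ∩ L(d2); both DFAs share the same alphabet."""
    states = {(p, q) for p in d1["states"] for q in d2["states"]}
    return {
        "states": states,
        "alphabet": d1["alphabet"],
        "start": (d1["start"], d2["start"]),
        "accept": {(p, q) for (p, q) in states
                   if p in d1["accept"] and q in d2["accept"]},
        "delta": {((p, q), a): (d1["delta"][(p, a)], d2["delta"][(q, a)])
                  for (p, q) in states for a in d1["alphabet"]},
    }

def run(dfa, w):
    q = dfa["start"]
    for c in w:
        q = dfa["delta"][(q, c)]
    return q in dfa["accept"]

# L1: even number of a's.  L2: strings ending in b.
even_a = {"states": {"e", "o"}, "alphabet": {"a", "b"}, "start": "e", "accept": {"e"},
          "delta": {("e", "a"): "o", ("e", "b"): "e", ("o", "a"): "e", ("o", "b"): "o"}}
ends_b = {"states": {"0", "1"}, "alphabet": {"a", "b"}, "start": "0", "accept": {"1"},
          "delta": {("0", "a"): "0", ("0", "b"): "1", ("1", "a"): "0", ("1", "b"): "1"}}

both = product_dfa(even_a, ends_b)
print(run(both, "aab"), run(both, "ab"))   # True False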

Complement:

Regular languages are closed under complement. We will show that if a language is regular, its complement is also regular
by using the construction of a DFA and applying the complement operation on its accepting states.

Proof:

Let L be a regular language, and let M = (Q, Σ, δ, q0 , F ) be the DFA recognizing L.


To construct a DFA that recognizes the complement of L, we can use the same DFA M but complement the set of accepting states. The new DFA M′ = (Q, Σ, δ, q0, Q ∖ F) recognizes the complement of L, i.e., Σ∗ ∖ L.

Since the new DFA recognizes the complement of L, regular languages are closed under complement.

2. Closure of Regular Languages Under Reversal


Regular languages are closed under reversal, meaning that if L is a regular language, then the reverse of L, denoted L^R, is also regular.

Proof:

Let L be a regular language, and let M = (Q, Σ, δ, q0, F) be the DFA recognizing L.

Construct an ε-NFA M′ that recognizes L^R as follows:

The states of M′ are the states of M together with a new start state s0,

s0 has ε-transitions to every state in F (so a computation of M′ begins where an accepting computation of M ends),

The transitions are reversed: for each state q and each a ∈ Σ, if δ(q, a) = p, then M′ has a transition from p to q on a,

The only accepting state of M′ is q0, the original start state.

Since an NFA recognizing the reverse of L can be constructed, the reversal of a regular language is regular.

3. Closure of Regular Languages Under Homomorphisms


Regular languages are closed under homomorphisms, meaning that if L is a regular language and h is a homomorphism,
then h(L) is regular.

A homomorphism h is a mapping h : Σ∗ → Γ∗ that replaces each symbol of Σ with a string over some alphabet Γ. If w = a1 a2 … an is a string over Σ, then h(w) = h(a1) h(a2) … h(an).

Proof:

Let L be a regular language, and let R be a regular expression with L(R) = L (such an expression exists because L is regular).

Construct a new regular expression R′ by replacing every occurrence of each symbol a ∈ Σ in R with the string h(a).

A straightforward induction on the structure of R shows that L(R′) = h(L(R)) = h(L): applying h to a union, a concatenation, or a star of languages gives the union, concatenation, or star of the images.

Since h(L) is described by the regular expression R′, regular languages are closed under homomorphisms.

4. Closure of Regular Languages Under Inverse Homomorphisms


Regular languages are closed under inverse homomorphisms, meaning that if L is a regular language and h is a homomorphism, then the inverse image of L under h, denoted h−1(L), is also regular.

Proof:

Let h : Σ∗ → Γ∗ be a homomorphism, and let L ⊆ Γ∗ be a regular language recognized by a DFA M = (Q, Γ, δ, q0, F).

We define h−1(L) as the set of strings w ∈ Σ∗ such that h(w) ∈ L.

Construct a DFA M′ = (Q, Σ, δ′, q0, F) over Σ with the same states and accepting states as M, where for each q ∈ Q and a ∈ Σ:

δ′(q, a) = δ̂(q, h(a)),

where δ̂ denotes the extended transition function of M. That is, reading the single symbol a in M′ has the same effect as M reading the entire string h(a). A simple induction on ∣w∣ shows that M′ accepts w exactly when M accepts h(w).

Since a DFA recognizing h−1(L) can be constructed, regular languages are closed under inverse homomorphisms.

Conclusion
In this lecture, we explored several closure properties of regular languages. Specifically, we demonstrated that regular
languages are closed under the following operations:

1. Boolean operations (union, intersection, complement),

2. Reversal (taking the reverse of a language),

3. Homomorphisms (substituting symbols with strings),

4. Inverse homomorphisms (reverse substitution).

For each property, we provided detailed proofs that show how regular languages remain regular after applying these
operations. These closure properties are essential tools for analyzing and manipulating regular languages in automata
theory.

Lecture 17: Decision Properties of Regular Languages


In this lecture, we explore decision properties of regular languages (RL), which refer to the ability to decide certain
properties about a regular language using algorithms. The primary decision problems we will discuss are:

1. Converting between different representations of regular languages (such as from automata to regular expressions,
etc.),

2. Testing the emptiness of a regular language,

3. Testing membership of a string in a regular language.

Each of these properties plays a key role in the analysis and manipulation of regular languages in automata theory.

1. Converting Among Representations


Regular languages can be represented in multiple ways, including:

Finite Automata (DFA or NFA),

Regular Expressions,

Regular Grammars.

One of the fundamental decision problems is determining how to convert between these different representations.
Specifically, we will focus on conversions from finite automata to regular expressions, since this is a commonly
encountered task.

From DFA to Regular Expression

We will now discuss how to convert a deterministic finite automaton (DFA) to a regular expression (RE).

Proof/Procedure:

1. DFA Representation: Let M = (Q, Σ, δ, q0 , F ) be a DFA recognizing a regular language L.


2. Goal: We aim to find a regular expression R such that R represents the language recognized by M , i.e., L(M ) =
L(R).
3. State Elimination Method: One of the most common methods to convert a DFA to a regular expression is state
elimination. In this approach, we eliminate states from the DFA one by one while updating the transition relations to
reflect the removal of each state. The process can be outlined as follows:

Start with the original DFA.

For each state q ∈ Q, for every pair of states p, r ∈ Q, replace the transitions in the automaton with regular
expressions.

Eliminate states one by one by updating the transitions for all pairs of states that could be affected by the removal.

After all states are eliminated (except for the initial and final states), the regular expression corresponding to the
language accepted by the DFA is formed.

4. Example:

Consider a DFA M with the following components:

States: Q = {q0, q1, q2},

Alphabet: Σ = {a, b},

Transitions: δ(q0, a) = q1, δ(q0, b) = q0, δ(q1, a) = q1, δ(q1, b) = q2, δ(q2, a) = q2, δ(q2, b) = q0,

Start state: q0,

Accepting state: F = {q2}.

After applying state elimination, the DFA can be converted into the following regular expression:

b∗aa∗b(a∣bb∗aa∗b)∗

which represents the language recognized by M: from q0 we may loop on 'b', read an 'a' into q1, loop on 'a', and read a 'b' into the accepting state q2; from q2 we may either loop on 'a' or return to q0 on 'b' and repeat the pattern.

From Regular Expression to DFA

The reverse process is also possible: converting a regular expression to a DFA. The process involves:

Converting the regular expression to an NFA using standard construction methods (e.g., Thompson's construction),

Converting the resulting NFA to a DFA using the subset construction algorithm.

2. Testing Emptiness of a Regular Language


The problem of testing emptiness of a regular language involves determining whether the language recognized by a given
automaton is empty, i.e., whether there are any strings accepted by the automaton.

Definition:

The emptiness problem for a regular language asks whether the language L recognized by an automaton M is empty, i.e.,
L(M ) = ∅.

Procedure:

Given a DFA M = (Q, Σ, δ, q0 , F ), we want to determine whether L(M ) = ∅. The basic idea is to check whether there is

any path from the start state q0 to any accepting state in F . If no such path exists, the language is empty.

1. Reachability Check:

Perform a breadth-first search (BFS) or depth-first search (DFS) starting from the initial state q0 . ​

If any of the accepting states in F is reachable from q0 , then L(M ) is non-empty.


If none of the accepting states in F are reachable from q0 , then L(M ) ​ = ∅.


2. Example:

Let M = (Q, Σ, δ, q0, F) be a DFA with the following components:

States: Q = {q0, q1},

Alphabet: Σ = {a},

Transitions: δ(q0, a) = q1, δ(q1, a) = q0,

Start state: q0,

Accepting state: F = {q1}.

A BFS or DFS starting from q0 would show that q1 is reachable, and hence L(M) ≠ ∅.
3. Time Complexity: The time complexity of this check is O(∣Q∣), where ∣Q∣ is the number of states in the DFA, as we are
simply performing a reachability check.
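The reachability check is a plain graph search. A short Python sketch, reusing the dictionary-based DFA encoding from earlier (one possible representation, not a fixed standard):

python

from collections import deque

def is_empty(dfa):
    """L(dfa) is empty iff no accepting state is reachable from the start state."""
    seen = {dfa["start"]}
    queue = deque([dfa["start"]])
    while queue:
        q = queue.popleft()
        if q in dfa["accept"]:
            return False
        for a in dfa["alphabet"]:
            r = dfa["delta"][(q, a)]
            if r not in seen:
                seen.add(r)
                queue.append(r)
    return True

# The two-state example above: q1 is reachable, so the language is non-empty.
dfa = {"states": {"q0", "q1"}, "alphabet": {"a"}, "start": "q0", "accept": {"q1"},
       "delta": {("q0", "a"): "q1", ("q1", "a"): "q0"}}
print(is_empty(dfa))   # False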

3. Testing Membership in a Regular Language


The problem of membership asks whether a given string w ∈ Σ∗ belongs to a regular language L. This can be framed as
checking whether the automaton accepts the string w .

Procedure:

Given a DFA M = (Q, Σ, δ, q0, F) and a string w = a1 a2 … an ∈ Σ∗, the task is to determine whether w ∈ L(M). This can be done by simulating the DFA on w.

1. Simulate the DFA:

Start at the initial state q0.

For each symbol ai in the string w, update the current state: qi = δ(qi−1, ai).

After processing the entire string w, check whether the final state qn is in the set of accepting states F.

If it is, accept the string; otherwise, reject it.

2. Example:

Let M be a DFA with the following components:

States: Q = {q0, q1},

Alphabet: Σ = {a},

Transitions: δ(q0, a) = q1, δ(q1, a) = q0,

Start state: q0,

Accepting state: F = {q1}.

For the input string w = a, the DFA will start at q0, process the first symbol a, transition to q1, and accept the string because q1 is an accepting state. For w = aa, the DFA will transition back to q0 after processing the second a, and since q0 is not an accepting state, the string will be rejected.

3. Time Complexity: The time complexity of testing membership is O(∣w∣), where ∣w∣ is the length of the string. Each
transition is made once for each symbol in the string.
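Simulating the DFA on the input takes only a few lines of code. A minimal Python sketch mirroring the example above:

python

def accepts(delta, start, accepting, w):
    """Run the DFA on w; one transition per input symbol, so O(|w|) time."""
    q = start
    for c in w:
        q = delta[(q, c)]
    return q in accepting

delta = {("q0", "a"): "q1", ("q1", "a"): "q0"}
print(accepts(delta, "q0", {"q1"}, "a"))    # True  (ends in q1)
print(accepts(delta, "q0", {"q1"}, "aa"))   # False (ends in q0)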

Conclusion

In this lecture, we discussed several decision properties of regular languages, including:

1. Converting between representations of regular languages (e.g., DFA to regular expression),

2. Testing emptiness of a regular language, by checking for reachable accepting states in the DFA,

3. Testing membership of a string in a regular language, by simulating the DFA on the input string.

These decision problems are fundamental tools for working with regular languages, as they allow for efficient analysis and
manipulation of languages represented by automata and regular expressions.

Lecture 18: Equivalence and Minimization of Automata


In this lecture, we focus on two crucial topics in the study of automata theory: testing equivalence of automata (and
regular expressions), and the minimization of deterministic finite automata (DFAs). We will delve into each of these
topics thoroughly, providing formal proofs and detailed examples.

1. Testing Equivalence of States


The task of testing equivalence of states in a DFA involves determining whether two states, say q1 and q2 , are equivalent. ​ ​

Two states are considered equivalent if, for every string in the alphabet Σ, the automaton transitions to equivalent states
for that string, and ultimately either both accept or both reject the string.

Formal Definition:

Let M = (Q, Σ, δ, q0, F) be a DFA, and let δ̂ denote its extended transition function. Two states q1, q2 ∈ Q are equivalent if for all strings w ∈ Σ∗, the following holds:

δ̂(q1, w) ∈ F ⟺ δ̂(q2, w) ∈ F

In other words, q1 and q2 are equivalent if, for every string over the alphabet, processing that string from q1 and from q2 leads either to two accepting states or to two non-accepting states.

Procedure for Testing Equivalence of States:

1. Inductive Method: The simplest way to test whether two states are equivalent is by induction:

Initialize a table T where each pair of states (q1 , q2 ) is marked as either equivalent or non-equivalent.
​ ​

Mark the pairs (q1 , q2 ) as equivalent if they lead to the same final state for all strings.
​ ​

For non-equivalent states, one of them must eventually lead to an accepting state, and the other to a rejecting state
for some input string.

By systematically examining all possible strings, it becomes clear which states are equivalent.

Example:

Consider a DFA M = (Q, Σ, δ, q0, F) where:

Q = {q0, q1, q2},

Σ = {a, b},

Transitions: δ(q0, a) = q1, δ(q0, b) = q2, δ(q1, a) = q1, δ(q1, b) = q0, δ(q2, a) = q0, δ(q2, b) = q2,

Start state: q0,

Accepting state: F = {q0}.

We check the equivalence of states q0 and q1:

For w = a, δ(q0, a) = q1 and δ(q1, a) = q1, which are both non-accepting, so this string does not distinguish them.

For w = b, δ(q0, b) = q2 and δ(q1, b) = q0, where q0 is accepting and q2 is non-accepting. Thus, q0 and q1 are not equivalent. (In fact, they are already distinguished by the empty string, since q0 ∈ F while q1 ∉ F.)

We repeat this process for all state pairs, ultimately determining the equivalence of all states in the automaton.

2. Testing Equivalence of Regular Expressions


Regular expressions (REs) can also be tested for equivalence. Two regular expressions R1 and R2 are equivalent if they
​ ​

generate the same language, i.e., L(R1 ) ​ = L(R2 ). ​

Procedure:

One way to test the equivalence of two regular expressions is by:

1. Converting the regular expressions R1 and R2 into their equivalent DFAs M1 and M2 ,
​ ​ ​ ​

2. Testing the equivalence of the resulting DFAs using the state equivalence procedure outlined above.

If the DFAs derived from R1 and R2 are equivalent, then the original regular expressions R1 and R2 are equivalent as well.
​ ​ ​ ​

Example:

Let:

R1 = (a∣b)∗,

R2 = (a∗∣b∗)∗.

Both regular expressions generate the same language: the set of all strings over Σ = {a, b}. If we convert them to DFAs, the structures of the DFAs may differ, yet they recognize the same language; hence R1 and R2 are equivalent.

3. Minimization of DFAs
The goal of minimization of a DFA is to reduce the number of states in the DFA while preserving the language it recognizes.
The process eliminates redundant states that do not affect the language accepted by the automaton.

Procedure for Minimizing a DFA:

1. Identify Equivalence Classes:

Group states that are equivalent, as discussed earlier.

States that are equivalent can be merged into a single state in the minimized DFA.

2. Construct the Minimized DFA:

After merging equivalent states, the resulting DFA will have fewer states but recognize the same language.

3. Distinguish States:

The key idea is to mark distinguishable and indistinguishable states by examining their behavior on all possible
strings.

Start by marking all pairs of states that are obviously distinguishable (i.e., one is accepting and the other is
rejecting).

Continue by refining the partition of states by considering all transitions for each string in the alphabet.

Example:

Consider the following DFA M = (Q, Σ, δ, q0, F):

States: Q = {q0, q1, q2},

Alphabet: Σ = {a, b},

Transitions: δ(q0, a) = q1, δ(q0, b) = q2, δ(q1, a) = q1, δ(q1, b) = q0, δ(q2, a) = q0, δ(q2, b) = q2,

Start state: q0,

Accepting state: F = {q0}.

We first group states based on acceptance:

q0 is accepting, and q1, q2 are not, so the initial partition is {q0} and {q1, q2}.

Now, we refine the partition by considering the transitions:

On input a, δ(q1, a) = q1 (non-accepting) while δ(q2, a) = q0 (accepting), so q1 and q2 are distinguishable and must be separated.

Thus, the refined partition is {q0}, {q1}, {q2}: every pair of states is distinguishable, so no states can be merged and this DFA is already minimal. Had q1 and q2 behaved identically on every input, they would have been merged into a single state, yielding a smaller equivalent DFA.
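The refinement can be automated with the classic table-filling (Moore) algorithm. A Python sketch (the dictionary encoding of the DFA is illustrative), applied to the example above:

python

from itertools import combinations

def distinguishable_pairs(dfa):
    """Table-filling (Moore) algorithm: return the set of distinguishable state pairs."""
    states = sorted(dfa["states"])
    marked = {frozenset((p, q)) for p, q in combinations(states, 2)
              if (p in dfa["accept"]) != (q in dfa["accept"])}
    changed = True
    while changed:
        changed = False
        for p, q in combinations(states, 2):
            pair = frozenset((p, q))
            if pair in marked:
                continue
            for a in dfa["alphabet"]:
                succ = frozenset((dfa["delta"][(p, a)], dfa["delta"][(q, a)]))
                if len(succ) == 2 and succ in marked:
                    marked.add(pair)
                    changed = True
                    break
    return marked

# The DFA from the example above.
dfa = {"states": {"q0", "q1", "q2"}, "alphabet": {"a", "b"},
       "start": "q0", "accept": {"q0"},
       "delta": {("q0", "a"): "q1", ("q0", "b"): "q2",
                 ("q1", "a"): "q1", ("q1", "b"): "q0",
                 ("q2", "a"): "q0", ("q2", "b"): "q2"}}
print(len(distinguishable_pairs(dfa)))
# 3: every pair is distinguishable, so the DFA is already minimal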

4. Why Minimized DFA Can't Be Beaten


A minimized DFA is the most efficient deterministic automaton for recognizing a regular language. It is the smallest DFA
that recognizes a given language, meaning that no DFA with fewer states can recognize the same language. This can be
formally proven through the pigeonhole principle and is related to the idea that each state in a DFA corresponds to a
distinct equivalence class of strings.

Proof (Intuition):

1. For any regular language L, a DFA must distinguish between different equivalence classes of strings.

2. Each equivalence class requires a distinct state in the DFA.

3. The minimized DFA achieves this by merging all states that are equivalent, leading to the smallest number of states.

4. Therefore, a minimized DFA is optimal in terms of state count and cannot be further reduced without losing its ability to
recognize the language.

This minimality is not just theoretical—it provides a concrete method for efficiently implementing DFAs in practice,
particularly in lexical analysis and text search algorithms.

Conclusion
In this lecture, we thoroughly discussed:

1. Testing equivalence of states in DFAs, which involves checking whether two states behave the same for all input
strings,

2. Testing equivalence of regular expressions, where we convert the expressions into DFAs and check if the DFAs are
equivalent,

3. Minimization of DFAs, a process of reducing the number of states in a DFA while preserving its language recognition
capability,

4. Why the minimized DFA is the most efficient and cannot be further reduced.

These techniques are central to the optimization and analysis of finite automata, and they have important applications in
areas like compiler design and text processing.

Lecture 19: Context-Free Grammars


In this lecture, we focus on Context-Free Grammars (CFGs), which are one of the most important formal tools in automata
theory and formal language theory. CFGs are widely used to describe the syntax of programming languages and natural
languages. We will explore an informal example, provide a formal definition, and examine key concepts such as derivations,
sentential forms, and the language generated by a grammar.

1. An Informal Example of a Context-Free Grammar


Consider the following simple example of a grammar for arithmetic expressions involving addition and multiplication:

Expressions can either be a number or an expression followed by an operator and another expression.

The operator can be either addition or multiplication.

We can illustrate this informal grammar as follows:

1. An expression (E ) is either a number or another expression combined with an operator.

2. If an expression is the result of an addition, it consists of two expressions separated by a "+" symbol.

3. If an expression is the result of multiplication, it consists of two expressions separated by a "*" symbol.

In this informal context, we might write an expression like:

3+4∗5
This should be interpreted according to the precedence of operations, where multiplication takes precedence over
addition.

2. Definition of Context-Free Grammars (CFGs)


A Context-Free Grammar (CFG) is formally defined as a 4-tuple G = (V , Σ, R, S), where:
V is a finite set of variables (also called non-terminal symbols),
Σ is a finite set of terminals (the alphabet of the language),
R is a finite set of production rules, where each rule has the form A → α, with A ∈ V (a non-terminal) and α ∈ (V ∪
Σ)∗ (a string of terminals and/or non-terminals),

S ∈ V is the start symbol, the initial variable from which the derivation process begins.

Key Points:

A production rule A → α means that the non-terminal A can be replaced with the string α during derivation.
The language generated by the grammar is the set of strings derived from the start symbol S that consist solely of
terminal symbols.

Example: A CFG for simple arithmetic expressions could be defined as follows:

V = {E} (where E is the non-terminal symbol for expressions),


Σ = {+, ∗, (, ), 0, 1, 2, ..., 9} (digits and operators),
R consists of the following production rules:

1. E → E + E,
2. E → E ∗ E,
3. E → (E),
4. E → digit,
S = E is the start symbol.

This grammar allows us to generate expressions like 3 + 4 ∗ 5 by starting with E and applying the appropriate production
rules.

3. Derivations Using a Grammar


A derivation is a sequence of applications of production rules that transforms the start symbol into a string of terminal
symbols. The process begins with the start symbol S and applies rules to replace non-terminal symbols with other non-
terminal or terminal symbols until only terminal symbols remain.

Example of Derivation:

For the grammar defined above, let's derive the expression 3 + 4 ∗ 5:

1. Start with E.

2. Apply rule E → E + E: E + E.
3. Apply rule E → digit to the first E: 3 + E.
4. Apply rule E → E ∗ E to the remaining E: 3 + E ∗ E.
5. Apply rule E → digit to the first of the new E's: 3 + 4 ∗ E.
6. Apply rule E → digit to the last E: 3 + 4 ∗ 5.

This derivation produces the expression 3 + 4 ∗ 5 from the start symbol E.

4. Leftmost and Rightmost Derivations

A leftmost derivation is a derivation where, at each step, the leftmost non-terminal is replaced by one of its productions.
Similarly, a rightmost derivation involves replacing the rightmost non-terminal at each step.

Leftmost Derivation Example:

Consider the expression 3 + 4 ∗ 5 again. Starting with E, the leftmost derivation replaces the leftmost non-terminal at each step:

1. Start with E,

2. Apply E → E + E (the only E, hence the leftmost, is replaced),

3. Apply E → digit to the leftmost E, yielding 3 + E,
4. Apply E → E ∗ E to the remaining E, yielding 3 + E ∗ E,
5. Apply E → digit to the leftmost remaining E, yielding 3 + 4 ∗ E,
6. Apply E → digit to the last E, yielding 3 + 4 ∗ 5.

Rightmost Derivation Example:

For the same string 3 + 4 ∗ 5, the rightmost derivation replaces the rightmost non-terminal at each step:

1. Start with E,

2. Apply E → E + E,
3. Apply E → E ∗ E to the rightmost E, yielding E + E ∗ E,
4. Apply E → digit to the rightmost E, yielding E + E ∗ 5,
5. Apply E → digit to the rightmost remaining E, yielding E + 4 ∗ 5,
6. Apply E → digit to the remaining E, yielding 3 + 4 ∗ 5.
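To see the difference mechanically, here is a small Python sketch (the ad hoc rule encoding is purely for illustration) that replays a derivation, expanding either the leftmost or the rightmost E at each step:

python

# Each entry names one application of a production; the value is the
# sequence of grammar symbols that replaces a single E.
RULES = {
    "E+E": ["E", "+", "E"],
    "E*E": ["E", "*", "E"],
    "3": ["3"], "4": ["4"], "5": ["5"],   # three instances of E -> digit
}

def derive(rule_names, leftmost=True, start="E"):
    form = [start]
    steps = [" ".join(form)]
    for name in rule_names:
        positions = [i for i, sym in enumerate(form) if sym == "E"]
        i = positions[0] if leftmost else positions[-1]
        form[i:i + 1] = RULES[name]
        steps.append(" ".join(form))
    return "  =>  ".join(steps)

print(derive(["E+E", "3", "E*E", "4", "5"], leftmost=True))
print(derive(["E+E", "E*E", "5", "4", "3"], leftmost=False))
# leftmost:  E => E + E => 3 + E => 3 + E * E => 3 + 4 * E => 3 + 4 * 5
# rightmost: E => E + E => E + E * E => E + E * 5 => E + 4 * 5 => 3 + 4 * 5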

5. Language of a Grammar
The language of a grammar, L(G), is the set of all strings that can be derived from the start symbol S using the production
rules. The language is a subset of the set of terminal symbols Σ∗ .

Example:

For the arithmetic grammar above, the language consists of all valid arithmetic expressions formed by the production rules,
such as:

3 + 4,
5 ∗ 6,
7 + 8 ∗ 9,
(3 + 4) ∗ 5.

6. Sentential Forms
A sentential form is any string of terminals and non-terminals that can be derived from the start symbol S in one or more
derivation steps. A sentential form is not necessarily a string of terminal symbols, but it is a step in the derivation process.

Example:

Consider the derivation of the expression 3 + 4 ∗ 5:

The sentential form E + E appears during the derivation,

The sentential form 3 + E appears partway through the derivation,

The sentential form 3 + 4 ∗ 5 is a string of only terminal symbols; such a sentential form is also a member of the language generated by the grammar.

Conclusion
In this lecture, we have:

Introduced Context-Free Grammars (CFGs), providing both an informal example and a formal definition.

Discussed how to perform derivations using CFGs, including both leftmost and rightmost derivations.

Examined the language of a grammar, which is the set of all strings derivable from the start symbol.

Defined sentential forms, which are intermediate strings during the derivation process.

CFGs are foundational for understanding syntax in formal languages and play a crucial role in parsing algorithms for
programming languages.

Lecture 20: Parse Trees


In this lecture, we explore Parse Trees, a graphical representation of the syntactic structure of a string derived from a
context-free grammar (CFG). Parse trees are essential in understanding how derivations are formed and how strings are
structured according to the rules of a grammar. We will thoroughly discuss the process of constructing parse trees,
understanding their yield, and the relationship between derivations, inferences, and parse trees, using several examples.

1. Constructing Parse Trees


A parse tree represents a derivation of a string from a context-free grammar (CFG). It is a tree where:

Each internal node represents a non-terminal symbol.

Each leaf node represents a terminal symbol (or a string of terminal symbols).

The root of the tree represents the start symbol of the grammar.

To construct a parse tree:

1. Start with the start symbol of the grammar.

2. Apply the production rules step by step, replacing non-terminals with their right-hand side (RHS) rules.

3. Continue until all leaves are terminal symbols.

Example: Constructing a Parse Tree for 3 + 4 ∗ 5

Using the following simple CFG for arithmetic expressions:

E →E+E
E →E∗E
E → digit

Let’s construct the parse tree for the expression 3 + 4 ∗ 5.

1. Start with E.

2. Apply E → E + E: the root E gets three children, E, +, and E.

3. For the left E, apply E → digit (since 3 is a digit): the sentential form E + E becomes 3 + E.

4. For the right E, apply E → E ∗ E (since the second part involves multiplication): 3 + E becomes 3 + E ∗ E, with the new E, ∗, E hanging beneath the right child of the +.

5. Now, for each E beneath the ∗, apply E → digit for 4 and 5, giving the terminal string 3 + 4 ∗ 5.

The parse tree for 3 + 4 ∗ 5 is:

mathematica

                E
             /  |  \
            E   +    E
            |      /  |  \
          digit   E   *   E
            |     |       |
            3   digit   digit
                  |       |
                  4       5

2. Yield of a Parse Tree


The yield of a parse tree is the string of terminal symbols that is derived from the tree by reading the leaves from left to
right. The yield is simply the string that is generated by the grammar, corresponding to a derivation of the string.

Example:

For the above parse tree of 3 + 4 ∗ 5, the yield is the sequence of terminal symbols (the digits and operators) in the leaves
of the tree, read from left to right:

Yield: 3 + 4 ∗ 5.

This is exactly the string we set out to parse, demonstrating that the parse tree represents the correct derivation of the
string from the grammar.
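The yield can be computed directly from a tree data structure. A tiny Python sketch (the nested-tuple encoding of the tree is just one convenient representation):

python

# The parse tree for 3 + 4 * 5 as nested (label, children) tuples.
tree = ("E", [
    ("E", [("digit", [("3", [])])]),
    ("+", []),
    ("E", [
        ("E", [("digit", [("4", [])])]),
        ("*", []),
        ("E", [("digit", [("5", [])])]),
    ]),
])

def tree_yield(node):
    label, children = node
    if not children:               # a leaf contributes its own label
        return label
    return "".join(tree_yield(c) for c in children)

print(tree_yield(tree))  # 3+4*5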

3. Inference, Derivations, and Parse Trees


Inference in the context of grammar and parse trees refers to the process of applying rules to derive strings, leading to the
construction of a parse tree. Derivations are sequences of rule applications, and the parse tree is a visual representation of
that derivation.

Inference refers to applying a production rule to infer a new string.

Derivation is the entire process of applying a series of production rules, starting from the start symbol and producing a
string in the language.

A parse tree represents one possible way of applying rules to derive a string, and it uniquely corresponds to a specific
derivation.

Example:

For the expression 3 + 4 ∗ 5, the derivation steps were:

1. E → E + E,
2. E → digit, yielding 3 + E,
3. E → E ∗ E, yielding 3 + E ∗ E,
4. E → digit for both right-hand E's, yielding 3 + 4 ∗ 5.

The parse tree is the visual representation of these derivation steps.

4. From Inferences to Trees


To move from inferences (applying production rules) to parse trees, each inference corresponds to a node in the tree.
Specifically:

An inference where we apply a production A → α becomes a parent node A with children representing the symbols in
α.

Example:

Consider the inference:

E → E + E: this creates a node labeled E with three children, labeled E, +, and E.
Then apply E → digit for the left E, resulting in a leaf node labeled "3" beneath it.
Apply E → E ∗ E for the right E, creating three children labeled E, *, and E; applying E → digit to each of those E's adds the leaves "4" and "5".

The entire process is reflected in the parse tree that grows with each rule application.

5. From Trees to Derivations


A parse tree can be used to reconstruct a derivation. To do this, we begin with the root of the tree (which is the start
symbol) and trace through the tree, reading off the rules that were applied at each step.

To reconstruct the derivation from a tree:

1. Start at the root and use the production rule that led to the child nodes.

2. Recursively apply the same process to each subtree, tracing back through the tree to find the sequence of production
rules.

Example:

Starting with the parse tree for 3 + 4 ∗ 5, we can reconstruct the derivation as follows:

1. From the root node E , the rule applied was E → E + E.
2. For the left E , the rule applied was E → digit, yielding 3.
3. For the right E , the rule applied was E → E ∗ E , with each E becoming a digit (4 and 5).
4. Thus, the sequence of production rules is:
E → E + E,
E → digit,
E → E ∗ E,
E → digit,
E → digit.

6. From Derivations to Recursive Inferences


Every derivation corresponds to a sequence of recursive inferences. Each application of a production rule can be seen as an
inference that leads to a subproblem (i.e., replacing a non-terminal with its production). These inferences continue
recursively as we break down non-terminals into their corresponding productions.

For example, in the derivation of 3 + 4 ∗ 5, the recursive inferences would look like this:

First, we replace E → E + E,
Then we recursively infer E → digit for the left E (yielding 3),
Next, for the right E , we apply E → E ∗ E , recursively inferring digits for both E 's.

Each recursive inference is a step towards fully replacing non-terminals with terminal symbols.

Conclusion
In this lecture, we have:

Defined and illustrated how to construct parse trees, using a step-by-step process.

Discussed the yield of a parse tree, which corresponds to the string derived by the grammar.

Explored the relationship between inferences, derivations, and parse trees, and how these concepts are connected.

Demonstrated how to move from inferences to parse trees, from trees to derivations, and from derivations to
recursive inferences.

Parse trees are crucial for understanding the structure of strings in a language and are foundational in the design of
parsers for programming languages.

Lecture 21: Applications of Context-Free Grammars


In this lecture, we will explore several applications of Context-Free Grammars (CFGs) in real-world scenarios. These
applications span various areas, including parsing, parser generation, and the definition of structured languages used in
computing. We will discuss how CFGs are leveraged in tools such as YACC (Yet Another Compiler Compiler), as well as their
role in defining markup languages such as XML and DTDs (Document Type Definitions).

1. Parsers
A parser is a component of a compiler or interpreter that analyzes a string of symbols (often program code or data) to
determine its grammatical structure with respect to a given formal grammar. The parser takes an input string and attempts
to build a parse tree that represents the syntactic structure of the string according to the grammar.

Role of CFGs in Parsers: Context-Free Grammars are widely used for defining the syntax of programming languages
and data formats. CFGs provide a formal way of specifying the structure of a language, which is essential for designing
parsers that can check whether a string belongs to the language and how it can be structured.

Types of Parsers:

Top-down Parsers: These start from the start symbol and try to rewrite it into the input string, matching the string
from left to right. Examples include recursive descent parsers.

Bottom-up Parsers: These begin with the input string and attempt to reduce it to the start symbol by applying
productions in reverse. Examples include shift-reduce parsers.

Both types of parsers rely heavily on the grammar of the language being parsed, and CFGs provide the structure
necessary for such parsing algorithms to function.

Example:

Consider the grammar for a simple arithmetic expression language:

E →E+E
E →E∗E
E → (E)
E → digit

A parser for this language would take an expression like 3 + 4 ∗ 5, try to match it against these production rules, and build
a parse tree representing the syntactic structure of the expression. The parser helps ensure that the expression is valid
according to the grammar, and it can also provide a structure for evaluating the expression.
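
As an illustration of a top-down parser, here is a minimal recursive-descent sketch. It uses an unambiguous, precedence-aware variant of the grammar above (E → T ('+' T)*, T → F ('*' F)*, F → digit ∣ '(' E ')') rather than the ambiguous grammar itself; the token-list input and tuple-shaped parse trees are assumptions made for this example.

python

# Minimal recursive-descent parser sketch for arithmetic expressions.
# Grammar variant assumed here:  E -> T ('+' T)*   T -> F ('*' F)*   F -> digit | '(' E ')'

def parse(tokens):
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def eat(tok):
        nonlocal pos
        if peek() != tok:
            raise SyntaxError(f"expected {tok!r}, got {peek()!r}")
        pos += 1

    def parse_E():
        node = parse_T()
        while peek() == "+":          # left-associative additions
            eat("+")
            node = ("+", node, parse_T())
        return node

    def parse_T():
        node = parse_F()
        while peek() == "*":          # multiplication binds tighter than addition
            eat("*")
            node = ("*", node, parse_F())
        return node

    def parse_F():
        tok = peek()
        if tok == "(":
            eat("(")
            node = parse_E()
            eat(")")
            return node
        if tok is not None and tok.isdigit():
            eat(tok)
            return ("digit", tok)
        raise SyntaxError(f"unexpected token {tok!r}")

    tree = parse_E()
    if pos != len(tokens):
        raise SyntaxError("trailing input")
    return tree

print(parse(["3", "+", "4", "*", "5"]))
# ('+', ('digit', '3'), ('*', ('digit', '4'), ('digit', '5')))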

2. YACC Parser-Generator
YACC (Yet Another Compiler Compiler) is a tool used to generate parsers for context-free grammars. It is one of the most
widely used parser generators for C and C++ compilers.

How YACC Works:

1. Grammar Specification: A programmer defines the grammar of the language they want to parse, typically using a
BNF (Backus-Naur Form) or a similar notation.

2. YACC Input: The grammar is input into YACC, which automatically generates C code for a parser.

3. Parser Construction: YACC generates a parser that can take an input string and construct a parse tree based on the
provided grammar.

4. Action Code: YACC also allows the inclusion of action code that is executed when a specific rule is applied, allowing
the parser to not only validate but also perform computations or transformations as it parses the input.

Example Usage: A typical YACC specification might define a grammar for arithmetic expressions like the one shown
earlier, and when YACC processes this input, it generates a C program with parsing functions that can evaluate or
analyze such expressions.

YACC in Practice: YACC is typically used in combination with lex (a lexical analyzer generator) to build complete
compilers. Lex handles tokenizing the input, while YACC takes care of parsing the structure of the tokens according to
the grammar. The combination allows the construction of parsers for complex languages.

3. Markup Languages
Markup languages use a system of tags to annotate text, typically to define the structure and presentation of documents.
Context-Free Grammars are widely used to define the syntax of these languages.

Defining Structure with CFGs: The syntax of markup languages, such as HTML and XML, can be described by context-
free grammars. For instance, the rules for nested tags in XML documents can be formalized using a CFG that defines
how elements and attributes should be structured and nested.

For example, a very simplified XML grammar might look like:

Element → OpenTag Content CloseTag
OpenTag → < Name >
CloseTag → </ Name >
Content → Text ∣ Element ∣ ϵ
Text → any characters except < and >

This grammar captures the essence of how an XML document is structured, with nested elements and content.

4. XML and DTDs (Document Type Definitions)


XML (eXtensible Markup Language) is a widely used markup language for defining documents with a customizable
structure. While XML itself is not strictly a context-free language, its structure can be described using context-free grammars
in various contexts (e.g., defining validation rules for XML documents).

DTD (Document Type Definition): DTDs define the structure and rules for XML documents. While DTDs are often
described in terms of regular expressions or simpler context-free grammars, they specify the valid structure of XML
elements, attributes, and their relationships.

For example, a simple DTD for an XML document might define that a document contains a sequence of <book>
elements, each containing a <title> , <author> , and <price> :

xml

<!ELEMENT catalog (book+)>


<!ELEMENT book (title, author, price)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT author (#PCDATA)>
<!ELEMENT price (#PCDATA)>

Here, the DTD specifies that a catalog contains one or more book elements, and each book contains a title ,
author , and price . This structure can be represented as a context-free grammar, where each rule corresponds to a
non-terminal, and the structure of the tags and content is captured.

XML Schema: An XML Schema is more powerful than a DTD and allows for more precise definition of data types,
constraints, and more complex structures. Though XML Schema is more expressive, it often relies on the underlying

idea of CFGs for defining structural rules.

Example:

Consider the following XML document:

xml

<catalog>
<book>
<title>Introduction to Automata Theory</title>
<author>John Doe</author>
<price>39.99</price>
</book>
</catalog>

A parser (such as one generated by YACC) for this XML document would validate that the document follows the structure
defined by the XML grammar or DTD, ensuring that elements are correctly nested and properly ordered.

Conclusion
In this lecture, we have discussed the applications of Context-Free Grammars in several key areas:

1. Parsers: CFGs are essential for designing parsers that validate and process strings in programming languages and data
formats.

2. YACC Parser-Generator: YACC automates the generation of parsers from CFGs, greatly simplifying the process of
building compilers and interpreters.

3. Markup Languages: CFGs are used to define the structure of markup languages like XML, ensuring that documents are
correctly structured and follow the required syntax.

4. XML and DTDs: Context-free grammars provide a formal basis for defining the valid structure of XML documents
through DTDs, enabling validation and parsing.

These applications illustrate the powerful role that context-free grammars play in formal language design, data
representation, and the processing of complex languages in computing.

Lecture 22: Ambiguity in Grammars and Languages


In this lecture, we will explore the concept of ambiguity in formal grammars and languages, focusing on ambiguous
grammars, how to remove ambiguity from a grammar, the role of leftmost derivations in expressing ambiguity, and the
concept of inherent ambiguity. We will use detailed examples to explain these concepts.

1. Ambiguous Grammars
A grammar is said to be ambiguous if there exists at least one string in the language it generates for which there are
multiple distinct leftmost or rightmost derivations (or equivalently, multiple distinct parse trees).

Definition: A context-free grammar (CFG) is ambiguous if there is a string w that can be derived in more than one way,
i.e., it has more than one leftmost derivation, rightmost derivation, or parse tree.

Example of an Ambiguous Grammar:

Consider the following simple grammar:

S →S+S ∣S×S ∣a

This grammar describes expressions involving addition and multiplication. Now, consider the string w = a + a × a. This
string can be derived in two different ways, leading to different parse trees.

First Derivation (addition first):

S ⇒ S + S ⇒ a + S ⇒ a + S × S ⇒ a + a × S ⇒ a + a × a

The parse tree for this derivation is:

            S
          / | \
         S  +  S
         |    / | \
         a   S  ×  S
             |     |
             a     a

Second Derivation (multiplication first):

S ⇒ S × S ⇒ S + S × S ⇒ a + S × S ⇒ a + a × S ⇒ a + a × a

The parse tree for this derivation is:

            S
          / | \
         S  ×  S
       / | \   |
      S  +  S  a
      |     |
      a     a

As shown, the string a + a × a can be derived in two different ways, with two distinct parse trees. This shows that the
grammar is ambiguous.

2. Removing Ambiguity in Grammars


One of the key tasks when working with formal grammars is to remove ambiguity to ensure that there is a unique parse
tree for every string. There are various strategies for doing this, including:

Restructuring the Grammar: We can change the structure of the grammar to make the order of operations explicit.

Using Operator Precedence and Associativity: In many cases, the ambiguity arises from operations with different
precedence (e.g., multiplication has higher precedence than addition). By introducing rules that specify the precedence
and associativity, we can resolve ambiguity.

Example of Removing Ambiguity:

The ambiguous grammar

S →S+S ∣S×S ∣a

can be rewritten to avoid ambiguity by ensuring that multiplication has higher precedence than addition. This can be done
by introducing a new non-terminal for terms involving multiplication:

S →S+T ∣T

T →T ×a∣a

Now, consider the string w = a + a × a. This string will always be parsed in a single, unambiguous way:
Derivation:

S ⇒ S + T ⇒ T + T ⇒ a + T ⇒ a + T × a ⇒ a + a × a

The parse tree for this derivation is:

            S
          / | \
         S  +  T
         |    / | \
         T   T  ×  a
         |   |
         a   a

With this revised grammar, the string a + a × a is always parsed with multiplication taking precedence over addition,
resolving the ambiguity.

3. Leftmost Derivations as a Way to Express Ambiguity


In a leftmost derivation, the leftmost non-terminal in each step of the derivation is replaced first. The ambiguity of a
grammar can often be illustrated by the different leftmost derivations that lead to the same string.

Example:

Using the same grammar as before:

S →S+S ∣S×S ∣a

Consider the string w = a + a × a. There are two possible leftmost derivations:


First Leftmost Derivation (addition first):

S ⇒ S + S ⇒ a + S ⇒ a + S × S ⇒ a + a × S ⇒ a + a × a

Second Leftmost Derivation (multiplication first):

S ⇒ S × S ⇒ S + S × S ⇒ a + S × S ⇒ a + a × S ⇒ a + a × a

Thus, we have two different leftmost derivations for the same string, demonstrating the ambiguity in the grammar.

4. Inherent Ambiguity
An inherent ambiguity refers to a situation where no matter how the grammar is rewritten, there will always be ambiguity
for certain strings. Some languages, by their very nature, are inherently ambiguous. Context-free languages can
sometimes have inherent ambiguity, where there is no way to define a grammar that avoids multiple parse trees for certain
strings.

Example of Inherent Ambiguity:

The standard example of an inherently ambiguous language is

L = {a^i b^j c^k ∣ i = j or j = k}

A natural grammar for L has one group of productions for the strings with equally many a's and b's (followed by any number of c's) and another group for the strings with equally many b's and c's (preceded by any number of a's):

S → AB ∣ CD
A → aAb ∣ ϵ
B → cB ∣ ϵ
C → aC ∣ ϵ
D → bDc ∣ ϵ

Now consider a string of the form w = a^n b^n c^n. It satisfies both conditions (i = j and j = k), so it has one parse tree through the S → AB branch and a structurally different parse tree through the S → CD branch:

First Derivation: S ⇒ AB ⇒∗ a^n b^n B ⇒∗ a^n b^n c^n

Second Derivation: S ⇒ CD ⇒∗ a^n D ⇒∗ a^n b^n c^n

It can be shown that this defect cannot be repaired: every grammar for L gives the strings a^n b^n c^n more than one parse tree. The ambiguity is therefore a property of the language itself, not of this particular grammar, and no matter how we rewrite the grammar, we will always face it. (By contrast, the palindrome grammar S → aSa ∣ bSb ∣ a ∣ b ∣ ϵ gives every palindrome exactly one parse tree, so palindromes are not an example of this phenomenon.)

Conclusion
In this lecture, we have covered the following important topics:

1. Ambiguous Grammars: A grammar is ambiguous if there is a string that can be derived in more than one way,
resulting in different parse trees.

2. Removing Ambiguity in Grammars: Ambiguity can often be resolved by restructuring the grammar and introducing
precedence rules, ensuring that each string has a unique parse tree.

3. Leftmost Derivations: Ambiguity can be expressed through different leftmost derivations of the same string, showing
the multiple ways the grammar can be applied.

4. Inherent Ambiguity: Some languages, such as {a^i b^j c^k ∣ i = j or j = k}, are inherently ambiguous: no grammar for
them can avoid giving some strings multiple parse trees.

The concept of ambiguity is critical in formal language theory, as it affects the ease with which we can process and interpret
strings. Understanding how to deal with ambiguous grammars is an important skill for designing parsers and compilers.

Lecture 23: Pushdown Automata
In this lecture, we will explore Pushdown Automata (PDA), a type of automaton that extends finite automata with the ability
to use a stack for additional memory. We will start with an informal introduction to PDAs, followed by their formal
definition and graphical notation. Finally, we will discuss instantaneous descriptions of PDAs, which are critical in
understanding their operation.

1. Informal Introduction of Pushdown Automata (PDA)


A Pushdown Automaton (PDA) is a computational model that extends the concept of Finite Automata (FA) by incorporating
a stack as additional memory. This stack allows the PDA to process context-free languages (CFLs), which cannot be
recognized by finite automata alone.

A finite automaton can only remember a limited amount of information (typically about the current state). In contrast, a PDA
can "push" symbols onto a stack and "pop" symbols from the stack, allowing it to recognize patterns that require memory
beyond the current state. This capability makes PDAs suitable for parsing context-free languages, such as programming
languages or arithmetic expressions.

Example:

Consider a language like L = {an bn ∣ n ≥ 0}, which consists of strings with an equal number of a 's followed by b 's. This
language cannot be recognized by a finite automaton because it requires the ability to "remember" how many a 's have
been encountered to match them with b 's later in the string. A PDA can handle this by pushing a 's onto the stack and
popping them when it encounters b 's.

2. Formal Definition of Pushdown Automaton (PDA)


A Pushdown Automaton is formally defined as a 7-tuple (Q, Σ, Γ, δ, q0 , Z0 , F ), where:
​ ​

Q is a finite set of states.

Σ is a finite input alphabet (the set of symbols that can appear in the input string).
Γ is a finite stack alphabet (the set of symbols that can appear on the stack).
δ is the transition function, defined as:

δ : Q × (Σ ∪ {ϵ}) × Γ → P(Q × Γ∗ )
This function describes the behavior of the PDA based on the current state, the current input symbol (or an empty input
symbol for epsilon transitions), and the top of the stack.

q0 ∈ Q is the initial state.


Z0 ∈ Γ is the initial stack symbol, which marks the bottom of the stack.

F ⊆ Q is the set of accepting states.

The PDA operates as follows:

It reads an input symbol from the string, updates its state, and modifies the stack (by pushing or popping symbols).

The machine can either process an input symbol, or transition based on the top of the stack (without reading an input
symbol, using an epsilon transition).

The PDA accepts a string if, after reading all the input symbols, it reaches an accepting state and the stack is in an
appropriate configuration.

Transition Function Details:

The transition function δ allows a PDA to either:

Read an input symbol and modify the stack.

Make an epsilon transition (move without reading an input symbol) and possibly modify the stack.

This additional power, using a stack, allows the PDA to recognize languages that require a form of memory beyond just the
current state.

3. Graphical Notation for PDA


In practice, PDAs can be represented using a state transition diagram, which is similar to the transition diagrams for finite
automata, but with the added complexity of stack operations.

Components of a PDA Diagram:

States: Represented as circles, labeled with state names (e.g., q0 , q1 , …).


​ ​

Transitions: Arrows between states. Each transition is labeled with:

The current input symbol (or ϵ for epsilon transitions).

The symbol to be popped from the stack (if any).

The symbol(s) to be pushed onto the stack (if any).

Initial State: Denoted with an arrow pointing to it from nowhere.

Accepting States: Denoted by double circles.

Stack Operations: Indicated in the transition labels (e.g., pop a, push b).

Example of PDA Diagram:

Consider the PDA for the language L = {an bn ∣ n ≥ 0}.


1. Start in state q0 .

2. On reading a , push a onto the stack.

3. On reading b , pop an a from the stack and move to a state q1 in which only b 's may be read.

4. If only Z0 remains on the stack when the input is exhausted, the PDA takes an ε-transition to an accepting state.

The transitions in the PDA diagram could look like:

(q0, a, Z0) → (q0, aZ0)
(q0, a, a) → (q0, aa)
(q0, b, a) → (q1, ε)
(q1, b, a) → (q1, ε)
(q0, ε, Z0) → (q_accept, Z0)
(q1, ε, Z0) → (q_accept, Z0)

This diagram represents the PDA's process of reading the string and manipulating the stack.
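
The following sketch (an illustration with assumed names, not part of the lecture) simulates these transitions directly and prints the instantaneous descriptions introduced in the next section, written as ⟨state, remaining input, stack⟩ with the stack top on the left.

python

# Minimal deterministic simulation of the two-state PDA above for L = { a^n b^n | n >= 0 }.
# The stack top is kept as the last element of the list.

def run_anbn(w):
    state, stack = "q0", ["Z0"]
    print(f"<{state}, {w or 'ε'}, {''.join(reversed(stack))}>")
    for i, sym in enumerate(w):
        top = stack[-1]
        if state == "q0" and sym == "a" and top in ("Z0", "a"):
            stack.append("a")                 # (q0, a, X) -> (q0, aX): push an a
        elif state in ("q0", "q1") and sym == "b" and top == "a":
            state = "q1"                      # (q0|q1, b, a) -> (q1, ε): pop an a
            stack.pop()
        else:
            return False                      # no applicable move: reject
        rest = w[i + 1:] or "ε"
        print(f"<{state}, {rest}, {''.join(reversed(stack))}>")
    if stack == ["Z0"]:                       # ε-move on Z0 to the accepting state
        print("<q_accept, ε, Z0>")
        return True
    return False

print(run_anbn("aabb"))   # True  (accepted)
print(run_anbn("abab"))   # False (rejected: an a after a b has no applicable move)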

4. Instantaneous Descriptions of Pushdown Automata
An instantaneous description (ID) of a PDA represents the current configuration of the machine at any point in time during
its computation. It consists of three components:

1. Current State: The state in which the PDA is.

2. Remaining Input: The remaining string to be processed, starting from the current position.

3. Current Stack Contents: The symbols currently on the stack, with the topmost symbol at the front.

An instantaneous description can be written in the following format:

⟨q, w, γ⟩

where:

q is the current state.


w is the remaining input string.
γ is the current stack content (with the top of the stack on the left).

Example of Instantaneous Description:

Consider the string w = aabb and the PDA above for {a^n b^n ∣ n ≥ 0}, starting in state q0 with Z0 as the initial stack symbol.

1. Initially, the ID is: ⟨q0 , aabb, Z0 ⟩.

2. After reading the first a and pushing a onto the stack, the ID is: ⟨q0 , abb, aZ0 ⟩.

3. After reading the second a and pushing another a onto the stack, the ID is: ⟨q0 , bb, aaZ0 ⟩.

4. After reading the first b and popping an a from the stack, the ID is: ⟨q1 , b, aZ0 ⟩.

5. After reading the second b and popping the remaining a , the ID is: ⟨q1 , ϵ, Z0 ⟩.

6. Finally, after the epsilon transition to the accepting state, the ID is: ⟨q_accept , ϵ, Z0 ⟩.

The instantaneous descriptions provide a step-by-step view of how the PDA processes the string and manipulates the stack.

Conclusion
In this lecture, we introduced the Pushdown Automaton (PDA), focusing on the following key points:

1. Informal Introduction: PDAs extend finite automata by using a stack, enabling them to recognize context-free
languages.

2. Formal Definition: A PDA is formally defined as a 7-tuple, with a transition function that includes both input symbols
and stack operations.

3. Graphical Notation: PDAs can be represented visually using state transition diagrams that include stack manipulations.

4. Instantaneous Descriptions: An instantaneous description provides a snapshot of the PDA’s current state, input, and
stack contents at any point during the computation.

Understanding PDAs is essential for parsing context-free languages, which are foundational in the design of compilers and
interpreters.

Lecture 24: Languages of a Pushdown Automaton (PDA)
In this lecture, we will explore the various ways in which a Pushdown Automaton (PDA) can accept strings from its input
alphabet. Specifically, we will focus on two primary methods of acceptance:

1. Acceptance by Final State: A PDA accepts a string if it reaches an accepting (final) state after processing the entire input
string.

2. Acceptance by Empty Stack: A PDA accepts a string if, after processing the entire string, the stack is empty.

Additionally, we will explore the relationship between these two methods of acceptance, particularly how one can be used
to simulate the other.

1. Acceptance by Final State


A PDA accepts by final state if, after processing the entire input string, the PDA reaches a state that is included in its set of
accepting (final) states. This method of acceptance focuses on the state the PDA is in after consuming the entire input
string, without necessarily considering the stack contents.

Formal Definition:

A string w is accepted by the PDA if there exists a sequence of transitions that leads the PDA to an accepting state after
consuming all of w .

Example:

Consider the PDA P that recognizes the language L = {an bn ∣ n ≥ 0}, where the PDA accepts by final state. The PDA
operates as follows:

1. Initially, in state q0 , the PDA pushes a onto the stack for every a in the input.

2. When reading b , the PDA pops an a from the stack.

3. If, after reading all b 's, the input is exhausted and the PDA is in an accepting state (say q1 ), then the string is accepted.

For the input string w = aabb:

The PDA starts in q0 with stack Z0 .

On reading the first a , it stays in q0 and pushes a onto the stack, resulting in the stack aZ0 .

On reading the second a , it stays in q0 and pushes another a , resulting in the stack aaZ0 .

On reading the first b , it pops an a from the stack, resulting in the stack aZ0 .

On reading the second b , it pops the remaining a , leaving the stack as Z0 .

After processing all the input, the PDA takes an ε-transition on Z0 to the accepting state q1 , and the string is accepted.

2. Acceptance by Empty Stack


A PDA accepts by empty stack if, after processing the entire input string, the stack is empty. This method of acceptance
focuses on the contents of the stack rather than the state of the PDA, meaning the PDA is considered to have successfully
recognized the input if the stack is emptied by the end of the input string.

Formal Definition:

A string w is accepted by the PDA if there exists a sequence of transitions that causes the PDA’s stack to be empty after
processing all of w .

Example:

Consider the same PDA P recognizing L = {an bn ∣ n ≥ 0}, but this time the PDA accepts by empty stack.
1. Initially, the PDA is in state q0 with the stack initialized to Z0 .
​ ​

2. For each a , the PDA pushes a onto the stack.

3. For each b , the PDA pops a from the stack.

4. After reading all input, an ε-transition pops the bottom symbol Z0 ; if the stack is then empty, the string is accepted.

For the input string w = aabb:

The PDA starts in q0 with stack Z0 .

On reading the first a , it stays in q0 and pushes a , resulting in the stack aZ0 .

On reading the second a , it pushes another a , resulting in the stack aaZ0 .

On reading the first b , it pops one a from the stack, resulting in the stack aZ0 .

On reading the second b , it pops the remaining a , leaving only Z0 on the stack.

Finally, an ε-transition pops Z0 . Since the stack is empty at the end of the input, the string is accepted.
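
A small sketch of the empty-stack criterion, assuming the same machine plus the final ε-move that pops Z0: a string is accepted exactly when the stack is empty once the input has been consumed. Names and encoding are illustrative assumptions.

python

# Sketch: acceptance by empty stack for L = { a^n b^n | n >= 0 }.
# Same moves as before, plus a final ε-move popping Z0; accept iff the stack ends empty.

def accepts_by_empty_stack(w):
    state, stack = "q0", ["Z0"]
    for sym in w:
        top = stack[-1] if stack else None
        if state == "q0" and sym == "a" and top in ("Z0", "a"):
            stack.append("a")            # push one a for each a read
        elif sym == "b" and top == "a":
            state = "q1"                 # pop one a for each b read
            stack.pop()
        else:
            return False                 # no applicable move
    if stack == ["Z0"]:
        stack.pop()                      # ε-move: pop the bottom symbol Z0
    return len(stack) == 0               # accept iff the stack is empty

for s in ["", "ab", "aabb", "aab", "abb", "ba"]:
    print(s or "ε", accepts_by_empty_stack(s))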

3. From Empty Stack to Final State


We can show that a PDA that accepts by empty stack can be simulated by a PDA that accepts by final state. Specifically, we
can transform a PDA that accepts by empty stack into a PDA that accepts by final state.

Construction:

Given a PDA P that accepts by empty stack, we construct a new PDA P ′ that accepts by final state as follows:

1. Add a new bottom-of-stack marker X0 , a new start state that pushes P 's start symbol Z0 on top of X0 , and a new final state qf to P ′ .

2. P ′ then simulates P . Because X0 lies below everything P ever pushes, X0 becomes the top of the stack exactly when P would have emptied its stack; a PDA cannot test for an empty stack directly, so the marker is what makes the condition detectable.

3. Add ε-transitions to qf whenever X0 appears on top of the stack, so P ′ reaches the final state qf precisely when P would have accepted by empty stack.

This construction ensures that the PDA accepts by final state whenever the original PDA would have accepted by empty
stack.

Example:

For the PDA P that accepts by empty stack for L = {an bn ∣ n ≥ 0}:
1. Construct P ′ by adding the bottom marker X0 , the new start state, and the new final state qf .

2. When P would empty its stack (exposing X0 ), P ′ takes an ε-transition to qf .

3. Since P ′ reaches qf exactly when P 's stack would have been emptied, P ′ accepts by final state the same language that P accepts by empty stack.


This transformation ensures equivalence between acceptance by empty stack and acceptance by final state.

4. From Final State to Empty Stack
Conversely, we can also show that a PDA that accepts by final state can be transformed into a PDA that accepts by empty
stack.

Construction:

Given a PDA P that accepts by final state, we construct a new PDA P ′ that accepts by empty stack as follows:

1. Add a new bottom-of-stack marker X0 and a new start state that pushes P 's start symbol Z0 on top of X0 . The marker prevents accidental acceptance: the stack of P ′ can never become empty while it is still simulating P .

2. Add a new "erasing" state qe : from every final state of P , P ′ may take an ε-transition to qe , and in qe it pops every remaining stack symbol, including X0 . Thus P ′ empties its stack exactly when P has reached a final state.

This construction ensures that the PDA P ′ accepts by empty stack whenever the original PDA accepted by final state.

Example:

For the PDA P that accepts by final state for L = {an bn ∣ n ≥ 0}:
1. Construct P ′ by adding the bottom marker X0 and the erasing state qe .

2. Whenever P reaches its final state, P ′ may move to qe and pop the entire stack (including X0 ), so the stack is emptied exactly when P accepts.

This transformation ensures equivalence between acceptance by final state and acceptance by empty stack.

Conclusion
In this lecture, we have discussed the two primary methods by which a Pushdown Automaton (PDA) can accept strings:

1. Acceptance by Final State: The PDA accepts a string if it reaches an accepting state after processing the entire input.

2. Acceptance by Empty Stack: The PDA accepts a string if, after processing the input, the stack is empty.

We also explored the equivalence between these two methods, showing that a PDA that accepts by empty stack can be
transformed to one that accepts by final state and vice versa. Understanding both methods is crucial in studying the
expressive power of PDAs and their ability to recognize context-free languages.

Lecture 25: Equivalence of PDAs and CFGs


In this lecture, we explore the equivalence between Pushdown Automata (PDA) and Context-Free Grammars (CFGs).
Specifically, we will cover the following topics:

1. From Grammars to PDAs: How to construct a PDA that accepts the language generated by a given CFG.

2. From PDAs to Grammars: How to construct a CFG that generates the language recognized by a given PDA.

Both of these directions are fundamental in understanding the relationship between these two models of computation,
both of which recognize context-free languages.

1. From Grammars to PDAs

Given a Context-Free Grammar (CFG), we can construct a Pushdown Automaton (PDA) that recognizes the same language.
The construction works by simulating the derivation process of the grammar with the stack of the PDA.

Construction Steps:

Let G = (V , Σ, R, S) be a CFG, where:


V is the set of non-terminal symbols,
Σ is the set of terminal symbols,
R is the set of production rules, and
S is the start symbol.

We construct a PDA P that recognizes the language L(G) generated by G as follows:

1. States of the PDA: The PDA will have a single state q0 where it stays during the entire computation.

2. Stack Alphabet: The stack of the PDA uses the symbols in V ∪ Σ, i.e., the variables and terminals of the grammar.

3. Start Symbol: The initial stack symbol is S , the start symbol of the grammar G.

4. Transition Function: The PDA will make transitions based on the top of the stack:

If the top of the stack is a terminal symbol (from Σ), the PDA will match it with the input string.

If the top of the stack is a non-terminal symbol (from V ), the PDA will apply the corresponding production rule from
G, replacing the non-terminal with the right-hand side of the production.
5. Acceptance Condition: The PDA accepts by empty stack, meaning that when the input is completely consumed, the
stack must be empty.

Example:

Consider the CFG G = (V , Σ, R, S), where:


V = {S},
Σ = {a, b},
R = {S → aSb, S → ϵ}, and
S is the start symbol.

We can construct a PDA P to recognize the language L(G) = {an bn ∣ n ≥ 0} as follows:


The PDA has a single state q0 and an initial stack symbol S .

The transition function of P is:

From q0 , on ε (without reading input) and having S on top of the stack, replace S with aSb or pop S — one ε-move for each production S → aSb and S → ϵ.

From q0 , on reading a and having the terminal a on top of the stack, pop a (the terminal is matched against the input).

From q0 , on reading b and having the terminal b on top of the stack, pop b .

If the stack is empty after processing the entire input, the PDA accepts the string.

Key Idea: The PDA simulates the leftmost derivation of the grammar G, and by following the production rules, it ensures
that each a read corresponds to a matching b .
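
The construction can be mirrored almost literally in code. The sketch below (dictionary-encoded grammar and names are assumptions) keeps the PDA's single state implicit, holds the stack as a tuple, and resolves the PDA's non-determinism by backtracking over the possible ε-expansions of the variable on top of the stack.

python

# Sketch of the grammar-to-PDA idea: the stack initially holds the start symbol;
# at each step the machine either expands the variable on top (an ε-move choosing
# a production) or matches the terminal on top against the next input symbol.

GRAMMAR = {"S": [["a", "S", "b"], []]}       # S -> aSb | ε
VARIABLES = set(GRAMMAR)

def accepts(w, stack=("S",), i=0, fuel=10_000):
    if fuel <= 0:
        return False                          # crude guard against runaway expansion
    if not stack:
        return i == len(w)                    # accept by empty stack, input consumed
    top, rest = stack[0], stack[1:]
    if top in VARIABLES:                      # ε-move: replace the variable by a body
        return any(accepts(w, tuple(body) + rest, i, fuel - 1)
                   for body in GRAMMAR[top])
    # terminal on top: it must match the next input symbol
    return i < len(w) and w[i] == top and accepts(w, rest, i + 1, fuel - 1)

for s in ["", "aabb", "aab", "ab", "ba"]:
    print(s or "ε", accepts(s))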

2. From PDAs to Grammars

Given a Pushdown Automaton (PDA), we can construct a Context-Free Grammar (CFG) that generates the same language.
The construction works by simulating the computation of the PDA and producing derivations that correspond to its
behavior.

Construction Steps:

Let P = (Q, Σ, Γ, δ, q0 , Z0 , F ) be a PDA, where:


​ ​

Q is the set of states,


Σ is the input alphabet,
Γ is the stack alphabet,
δ is the transition function,
q0 is the initial state,

Z0 is the initial stack symbol, and


F is the set of accepting states.

We construct a CFG G = (V , Σ, R, S), where:


V is a set of variables representing combinations of PDA states and stack symbols,
Σ is the input alphabet, and
R is the set of production rules.

1. Variables of the Grammar: Each variable of G has the form [p, X, q], where p, q ∈ Q and X ∈ Γ. The variable [p, X, q] generates exactly those strings that can take the PDA from state p with X on top of the stack to state q with that X (and nothing beneath it) removed from the stack.

2. Start Symbol: We add a fresh start symbol S with the productions S → [q0 , Z0 , q] for every q ∈ Q: accepting by empty stack means consuming the input while going from q0 with Z0 on the stack to some state q with the stack emptied.

3. Production Rules: The production rules are read off the transitions of the PDA, which we assume accepts by empty stack:

If δ(p, a, X) contains (r, ϵ), where a ∈ Σ ∪ {ϵ}, we add the rule [p, X, r] → a.

If δ(p, a, X) contains (r, Y1 Y2 ⋯ Yk ) with k ≥ 1, then for every choice of states r1 , r2 , … , rk ∈ Q we add the rule
[p, X, rk ] → a [r, Y1 , r1 ][r1 , Y2 , r2 ] ⋯ [r(k−1) , Yk , rk ].
Intuitively, the states r1 , … , rk guess where the PDA will be at the moment each pushed symbol is finally popped. Many of the variables created this way turn out to be useless and can be removed afterwards.

Example:

Consider a PDA P that recognizes the language L = {a^n b^n ∣ n ≥ 1} by empty stack. The PDA has:

States Q = {q0 , q1 },

Input alphabet Σ = {a, b},

Stack alphabet Γ = {a, Z0 },

Start state q0 and start stack symbol Z0 .

The PDA P performs the following transitions:

1. δ(q0 , a, Z0 ) = (q0 , aZ0 ),

2. δ(q0 , a, a) = (q0 , aa),

3. δ(q0 , b, a) = (q1 , ϵ),

4. δ(q1 , b, a) = (q1 , ϵ),

5. δ(q1 , ϵ, Z0 ) = (q1 , ϵ).

Applying the construction and discarding the useless variables, the relevant productions of the corresponding CFG G are:

S → [q0 , Z0 , q1 ],

[q0 , Z0 , q1 ] → a [q0 , a, q1 ][q1 , Z0 , q1 ]   (from transition 1),

[q0 , a, q1 ] → a [q0 , a, q1 ][q1 , a, q1 ]   (from transition 2),

[q0 , a, q1 ] → b   (from transition 3),

[q1 , a, q1 ] → b   (from transition 4),

[q1 , Z0 , q1 ] → ϵ   (from transition 5).

Here [q0 , a, q1 ] generates exactly the strings a^k b^(k+1) for k ≥ 0, so [q0 , Z0 , q1 ] generates a ⋅ a^k b^(k+1) = a^(k+1) b^(k+1). Hence this CFG generates the same language {a^n b^n ∣ n ≥ 1} as the original PDA. (Adding the transition δ(q0 , ϵ, Z0 ) = (q0 , ϵ), and with it the production [q0 , Z0 , q0 ] → ϵ and S → [q0 , Z0 , q0 ], would also admit the empty string.)

Conclusion
In this lecture, we explored the equivalence between Pushdown Automata (PDAs) and Context-Free Grammars (CFGs).
Specifically, we discussed the two constructions:

1. From Grammars to PDAs: We constructed a PDA that simulates the derivation process of a CFG.

2. From PDAs to Grammars: We constructed a CFG that generates the language recognized by a PDA by simulating the
PDA’s computation.

These constructions show that the class of context-free languages can be recognized both by PDAs and generated by CFGs,
demonstrating the deep equivalence between these two models.

Lecture 26: Deterministic Pushdown Automata (DPDA)


In this lecture, we will explore Deterministic Pushdown Automata (DPDA), which are a restricted form of pushdown
automata. We will discuss their definition, properties, their relationship with Regular Languages (RL), Context-Free
Languages (CFL), and their connection to Ambiguous Grammars.

1. Definition of Deterministic Pushdown Automata (DPDA)


A Deterministic Pushdown Automaton (DPDA) is a specific type of Pushdown Automaton (PDA) where the computation is
deterministic. In other words, for any given state and input symbol, the DPDA must have exactly one possible transition to
make. This makes the behavior of a DPDA predictable and deterministic, in contrast to the general PDA where multiple
transitions may be possible for a given state and input symbol.

Formally, a Deterministic Pushdown Automaton (DPDA) is defined as a 7-tuple P = (Q, Σ, Γ, δ, q0 , Z0 , F ), where:


​ ​

Q is the finite set of states.


Σ is the finite input alphabet.
Γ is the finite stack alphabet.

δ is the transition function δ : Q × (Σ ∪ {ϵ}) × Γ → Q × Γ∗ , which is deterministic: for each state q and stack symbol X , there is at most one transition for each input symbol, and if an ϵ-transition is defined on (q, X) then no input transition is defined on (q, X). In other words, at most one move is ever applicable.

q0 is the start state.


Z0 is the initial stack symbol.


F is the set of accepting states.

A DPDA is said to accept a string if it:

Consumes the entire input string, and

The stack is empty at the end of the computation (or it reaches an accepting state depending on the acceptance
condition).

Key Difference from Non-Deterministic PDA:

In a non-deterministic PDA (NPDA), for a given state, input symbol (or ϵ), and stack symbol, there may be several possible
transitions. In a Deterministic PDA (DPDA), at most one move is applicable in any configuration, making the
computation deterministic.

2. Regular Languages (RL) and DPDA


It is important to note that Deterministic Pushdown Automata (DPDA) can only recognize a subset of Context-Free
Languages (CFLs). This subset is actually the class of Deterministic Context-Free Languages (DCFLs), which is strictly
smaller than the set of all context-free languages. Regular Languages (RL), on the other hand, can be recognized by both
Finite Automata (FA) and DPDA.

Properties of DPDA and RL:

Any regular language can be recognized by a DPDA. In fact, regular languages are a subset of deterministic context-
free languages.

Since regular languages can be recognized by finite automata (FAs), and deterministic pushdown automata (DPDAs)
can simulate finite automata (since they have the capability to ignore the stack), any regular language is also recognized
by a DPDA.

Example:

Consider the regular language L = {w ∈ {a, b}∗ ∣ w contains an even number of a's}. This language can be recognized by a
two-state finite automaton, and a DPDA can recognize it as well: the DPDA simply ignores its stack and simulates the finite
automaton. (Note that {a^n b^n ∣ n ≥ 0} is not regular; it is, however, a deterministic context-free language, as the next example shows.)

3. DPDA and Context-Free Languages (CFL)


The class of languages recognized by Deterministic Pushdown Automata (DPDA) is the class of Deterministic Context-
Free Languages (DCFLs). This class is strictly smaller than the class of Context-Free Languages (CFLs) because not all
context-free languages can be recognized by deterministic automata.

Relationship Between DPDA and CFL:

CFLs: The set of all context-free languages is the set of languages that can be recognized by a non-deterministic
pushdown automaton (NPDA). NPDAs allow multiple possible transitions for a given input symbol and stack symbol,
which means they can recognize a broader set of languages.

DCFLs: The set of deterministic context-free languages is a proper subset of CFLs. These are the languages that can
be recognized by a deterministic pushdown automaton (DPDA).

Example of DCFL:

Consider the language L = {an bn ∣ n ≥ 0}. This is a deterministic context-free language, and a DPDA can recognize it by
pushing an a onto the stack and popping it when it reads a b . The DPDA can deterministically match each a with a b .

Example of CFL but not DCFL:

The language L = {w w^R ∣ w ∈ {a, b}∗ } of even-length palindromes is a context-free language (CFL), but it is not
deterministic. A DPDA cannot recognize it, because the automaton would have to guess where the middle of the input lies
before it can start matching the second half against the stack; a non-deterministic PDA (NPDA) can make this guess, but a
DPDA cannot. (By contrast, {a^n b^n c^n ∣ n ≥ 0} is not even context-free, as the pumping lemma will show.)

4. DPDA and Ambiguous Grammars


An ambiguous grammar is a grammar that generates a language where at least one string has more than one distinct
parse tree (or derivation). Ambiguity is a property of the grammar, not of the language itself. A context-free language (CFL)
may or may not be generated by an ambiguous grammar.

DPDA and Ambiguity:

Every language accepted by a DPDA (every DCFL) has an unambiguous grammar. Consequently, an inherently ambiguous
language — one for which every grammar is ambiguous — cannot be recognized by any DPDA.

Ambiguity itself is a property of a grammar, not of a language: a DCFL may well have ambiguous grammars, but it always
has at least one unambiguous one.

Non-deterministic PDAs (NPDAs), on the other hand, can recognize inherently ambiguous languages, because they can
"guess" among the competing structures.

Example of an Ambiguous Grammar for a Deterministic Language:

Consider the context-free grammar G with productions S → aS ∣ Sa ∣ a. It generates the regular (hence deterministic
context-free) language L = {a^n ∣ n ≥ 1}, but it is ambiguous: the string aa has two leftmost derivations:

1. S ⇒ aS ⇒ aa,
2. S ⇒ Sa ⇒ aa.

The ambiguity here is a defect of the grammar, not of the language: L also has the unambiguous grammar S → aS ∣ a, and
a DPDA (indeed a finite automaton) recognizes L easily. By contrast, an inherently ambiguous language such as
{a^i b^j c^k ∣ i = j or j = k} has no unambiguous grammar at all, and no DPDA can recognize it.

Summary
Deterministic Pushdown Automata (DPDA) are a restricted form of pushdown automata where for every combination
of state, input symbol, and stack symbol, there is at most one transition.

Regular Languages (RL) can be recognized by both Finite Automata (FA) and DPDA.

Deterministic Context-Free Languages (DCFLs) are a subset of Context-Free Languages (CFLs) that can be recognized
by DPDAs. Some context-free languages (e.g., the even-length palindromes {w w^R ∣ w ∈ {a, b}∗ }) cannot be recognized by any DPDA.

Inherently ambiguous languages cannot be recognized by DPDAs, because every language accepted by a DPDA has an
unambiguous grammar.

This concludes the lecture on Deterministic Pushdown Automata (DPDA) and their properties.

Lecture 27: Normal Forms of Context-Free Grammars


In this lecture, we will discuss various normal forms of Context-Free Grammars (CFGs). Specifically, we will cover the
processes of eliminating useless symbols, eliminating epsilon-productions, eliminating unit productions, and Chomsky
Normal Form (CNF). Each of these topics will be explained through an elaborate example, building step-by-step as we work
through the transformations.

1. Eliminating Useless Symbols


A useless symbol in a grammar is one that does not contribute to generating any terminal string. We identify useless
symbols in two steps:

1. Generating symbols: A symbol is generating if there exists a derivation from it to a string of terminals.

2. Reachable symbols: A symbol is reachable if it can be reached from the start symbol.

Step-by-step Process:

Identify Generating Symbols: A non-terminal is generating if there exists a derivation from it to a terminal string
(possibly using other non-terminals).

Identify Reachable Symbols: A non-terminal is reachable if there exists a derivation from the start symbol to it.

Example:

Consider the grammar G = (V , Σ, R, S) where:


V = {S, A, B, C}
Σ = {a, b}
R contains the following production rules:

S → AB
A → aA ∣ a
B → bB ∣ b
C→c

Step 1: Find the Generating Symbols

A → aA (generating), A → a (generating).
B → bB (generating), B → b (generating).
C → c (generating).

So A, B , and C are all generating symbols ( C derives the terminal string c ), and S is generating because it produces AB ,
where both A and B are generating.

Step 2: Find Reachable Symbols

The start symbol is S , and from S we can reach A and B , so A and B are reachable.

C is not reachable from S (no production reachable from S uses C ).

Thus C is a useless symbol — it is generating but unreachable — and we eliminate it from the grammar.

Updated Grammar:

S → AB
A → aA ∣ a
B → bB ∣ b

2. Computing the Generating and Reachable Symbols


This process has already been demonstrated in the previous step. In general, the steps to compute generating and
reachable symbols are:

Start by marking all terminal symbols as generating.

For non-terminal symbols, iteratively check whether there is a production rule that can lead to a string of terminals.

For reachable symbols, start from the start symbol and mark all non-terminals that can be reached through the
production rules.

3. Eliminating Epsilon-Productions
An epsilon-production is a production rule of the form A → ϵ, where ϵ represents the empty string. Eliminating epsilon-
productions involves the following steps:

Identify all non-terminal symbols that can derive ϵ.

For each production A → X1 X2 ⋯ Xn , generate new productions by considering the possibility of each Xi being
​ ​ ​ ​

replaced by ϵ.

Example:

Consider the grammar G = (V , Σ, R, S) where:


V = {S, A, B}
Σ = {a, b}
R contains the following production rules:

S→A∣b
A → ϵ ∣ aA
B → bB ∣ ϵ

Step 1: Find Non-terminals Deriving ϵ

A can derive ϵ directly.


B can also derive ϵ.

Step 2: Modify the Productions
For every production whose body contains a nullable non-terminal, add the variants obtained by omitting that non-terminal, then remove the ϵ-productions themselves:

For S → A ∣ b: omitting the nullable A would give S → ϵ, which we do not add (this construction yields a grammar for L ∖ {ϵ}).

For A → ϵ ∣ aA: remove A → ϵ and, since the trailing A is nullable, add A → a alongside A → aA.

For B → bB ∣ ϵ: remove B → ϵ and add B → b alongside B → bB .

Updated Grammar:

S→A∣b
A → aA ∣ a
B → bB ∣ b

(Since A — and hence S — was nullable, the original language contains ϵ; if we wish to keep ϵ, we add the single production S → ϵ back for the start symbol.)
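
A compact sketch of this procedure, assuming productions are encoded as a dictionary from each variable to a list of bodies (with [] standing for ϵ): it first computes the nullable variables, then emits every variant of each body obtained by keeping or dropping the nullable symbols.

python

# Sketch of ε-production elimination (dict-of-bodies grammar encoding assumed).

from itertools import product

def eliminate_epsilon(prods):
    # Step 1: find the nullable variables (those that can derive ε).
    nullable, changed = set(), True
    while changed:
        changed = False
        for head, bodies in prods.items():
            if head not in nullable and any(all(s in nullable for s in b) for b in bodies):
                nullable.add(head)
                changed = True
    # Step 2: for each body, add every variant obtained by keeping or dropping each
    # nullable symbol; discard the resulting ε-bodies.
    new = {head: [] for head in prods}
    for head, bodies in prods.items():
        for body in bodies:
            choices = [([s], []) if s in nullable else ([s],) for s in body]
            for pick in product(*choices):
                candidate = [s for part in pick for s in part]
                if candidate and candidate not in new[head]:
                    new[head].append(candidate)
    return new

prods = {"S": [["A"], ["b"]], "A": [[], ["a", "A"]], "B": [["b", "B"], []]}
print(eliminate_epsilon(prods))
# {'S': [['A'], ['b']], 'A': [['a', 'A'], ['a']], 'B': [['b', 'B'], ['b']]}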

4. Eliminating Unit Productions


A unit production is a production of the form A → B , where both A and B are non-terminals. The goal is to eliminate
such unit productions by substituting the right-hand side of the production.

Example:

Consider the grammar G = (V , Σ, R, S) where:


V = {S, A, B}
Σ = {a, b}
R contains the following production rules:

S→A∣b
A → B ∣ aA
B→b

Step 1: Eliminate Unit Productions

S → A is a unit production, and A has productions A → B ∣ aA.


Replace S → A with S → B ∣ aA (since A → B and A → aA).

Updated Grammar:

S → B ∣ aA ∣ b
A → B ∣ aA
B→b

5. Chomsky Normal Form (CNF)


A Context-Free Grammar is in Chomsky Normal Form (CNF) if all its production rules are of the form:

1. A → BC (where A, B, C are non-terminals),


2. A → a (where a is a terminal),

3. The start symbol can produce ϵ (if the language includes the empty string).

Steps to Convert to CNF:

1. Eliminate epsilon-productions.

2. Eliminate unit productions.

3. Ensure all productions are either of the form A → BC or A → a.


4. If necessary, introduce new variables to break down productions longer than two symbols.

Example:

Consider the grammar G = (V , Σ, R, S) where:


V = {S, A, B, C}
Σ = {a, b}
R contains the following production rules:

S → AB
A → aA ∣ a
B→b

Step 1: Eliminate Unit Productions


There are no unit productions in this grammar, so we proceed to the next step.

Step 2: Ensure Binary Productions


We need to convert the production A → aA into binary form. We introduce a new non-terminal X such that X → a, and
replace A → aA with A → XA.
Step 3: CNF Grammar

S → AB
A → XA ∣ a
B→b
X→a

Now, all the productions are of the form A → BC or A → a, so the grammar is in Chomsky Normal Form (CNF).
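
The last two transformation steps can be sketched mechanically, as below (dictionary encoding and fresh-variable naming are assumptions; ϵ- and unit productions are assumed to be already eliminated): terminals inside long bodies are replaced by new variables, and bodies longer than two symbols are broken into chains of binary productions.

python

# Sketch of the final CNF steps: terminal replacement and binarization.

def to_cnf(prods, terminals):
    new = {h: [] for h in prods}
    term_var, counter = {}, [0]

    def var_for(t):                       # X_a -> a for each terminal a used in a long body
        if t not in term_var:
            term_var[t] = f"X_{t}"
            new[term_var[t]] = [[t]]
        return term_var[t]

    def fresh():                          # helper variables for bodies longer than 2
        counter[0] += 1
        return f"Y{counter[0]}"

    for head, bodies in prods.items():
        for body in bodies:
            if len(body) >= 2:            # replace terminals by their variables
                body = [var_for(s) if s in terminals else s for s in body]
            cur = body
            holder = head
            while len(cur) > 2:           # binarize left to right
                nxt = fresh()
                new[holder].append([cur[0], nxt])
                new[nxt] = []
                holder, cur = nxt, cur[1:]
            new[holder].append(cur)
    return new

grammar = {"S": [["A", "B"]], "A": [["a", "A"], ["a"]], "B": [["b"]]}
print(to_cnf(grammar, {"a", "b"}))
# {'S': [['A', 'B']], 'A': [['X_a', 'A'], ['a']], 'B': [['b']], 'X_a': [['a']]}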

Summary
Eliminating Useless Symbols: We identify and remove symbols that do not contribute to generating terminal strings.

Generating and Reachable Symbols: Symbols that can derive terminal strings and can be reached from the start
symbol are useful for grammar.

Eliminating Epsilon-Productions: We eliminate A → ϵ productions by adjusting other productions that might involve
non-terminals that can derive ϵ.

Eliminating Unit Productions: We remove unit productions (i.e., A → B ) by directly substituting the rules for B into A.
Chomsky Normal Form (CNF): We transform the grammar so that each production is either of the form A → BC or
A → a, ensuring that the grammar is in the desired normal form for further analysis or parsing algorithms.

This concludes the lecture on Normal Forms of Context-Free Grammars.

Lecture 28: The Pumping Lemma for Context-Free Grammars (CFGs)


In this lecture, we will explore the Pumping Lemma for Context-Free Languages (CFLs). The Pumping Lemma is a
fundamental result in formal language theory that provides a necessary condition for a language to be context-free. This
lemma is commonly used to prove that certain languages are not context-free.

We will cover the following topics in detail:

1. Size of Parse Trees

2. Statement and Proof of the Pumping Lemma for CFGs

3. Applications of the Pumping Lemma for CFLs (through examples)

1. Size of Parse Trees


The size of a parse tree is an important concept when discussing the Pumping Lemma. The parse tree represents the
syntactic structure of a string derived from a grammar. The number of levels in the parse tree corresponds to the number of
applications of the production rules needed to derive a string.

For a context-free grammar in Chomsky Normal Form, there is a simple bound relating the height of a parse tree to the
length of the string it derives: if the longest root-to-leaf path of a parse tree has k edges, then its yield has length at most
2^(k−1). Consequently, if the grammar has m variables and a string has length at least 2^m, every parse tree for that string
contains a root-to-leaf path with at least m + 1 variable nodes, and by the pigeonhole principle some variable must occur
twice on that path.

The pumping lemma exploits exactly this forced repetition in sufficiently large parse trees to show that some parts of the
string can be "pumped" (repeated) while the result still belongs to the language.

2. Statement of the Pumping Lemma for CFGs


The Pumping Lemma for Context-Free Languages provides a way to show that certain languages cannot be context-free.
Specifically, it states that:

For any context-free language L, there exists some constant p (called the pumping length) such that any string s ∈L
with ∣s∣ ≥ p can be divided into five substrings s = uvwxy satisfying the following conditions:
1. Length of the substrings: ∣vwx∣ ≤p
2. Non-empty substrings: ∣vx∣ ≥1
3. Pumping property: For all i≥ 0, the string u(v i )w(xi )y is in L, i.e., the string formed by repeating the substrings v
and x any number of times still belongs to L.

This lemma asserts that in any sufficiently long string derived from a context-free grammar, we can "pump" (repeat) some
part of the string, and the resulting string will still be part of the language. This is an important tool for proving that certain
languages are not context-free by showing that no decomposition can satisfy the conditions of the pumping lemma.

3. Detailed Proof of the Pumping Lemma for CFGs
We now sketch a proof of the Pumping Lemma for context-free languages. The argument is a pigeonhole argument on the
variables that appear along a longest path of a parse tree.

Proof Outline:

1. Assumption: Suppose L is a context-free language. Then there is a context-free grammar G = (V , Σ, R, S) in Chomsky
Normal Form that generates L ∖ {ϵ}. Let m = ∣V ∣ and set the pumping length to p = 2^m.

2. Take any string s ∈ L with ∣s∣ ≥ p and fix a parse tree for s. By the size bound from the previous section, the longest
root-to-leaf path of this tree contains at least m + 1 variable nodes, so by the pigeonhole principle some variable A occurs
at least twice on that path.

3. Choose two occurrences of A that lie as close to the bottom of the path as possible, so that the subtree rooted at the
upper occurrence yields a string of length at most p. Let w be the yield of the subtree rooted at the lower occurrence, let
vwx be the yield of the subtree rooted at the upper occurrence, and let u and y be the parts of s to its left and right. Then
s = uvwxy with ∣vwx∣ ≤ p; and because a CNF grammar has no ϵ-productions, the upper subtree yields a strictly longer
string than the lower one, so ∣vx∣ ≥ 1.

4. Finally, since A ⇒∗ vAx and A ⇒∗ w, we can repeat or omit the middle portion: A ⇒∗ v^i A x^i ⇒∗ v^i w x^i for every
i ≥ 0, and therefore u v^i w x^i y ∈ L for all i ≥ 0.

In short, because a context-free grammar has only finitely many variables, every sufficiently long string forces a repeated
variable on some path of its parse tree, and that repetition yields the substrings v and x that can be pumped.

4. Applications of the Pumping Lemma for CFLs


The Pumping Lemma is particularly useful in proving that certain languages are not context-free. To demonstrate this, we
show that no matter how we decompose a string in the language, there is no way to pump certain parts of the string while
still maintaining membership in the language.

Let's consider a few examples of applying the Pumping Lemma to prove that a language is not context-free.

Example 1: The Language L = {an bn cn ∣ n ≥ 0}


This is a well-known example of a language that is not context-free.

1. Assume, for the sake of contradiction, that L is context-free.

2. Let p be the pumping length given by the pumping lemma.

3. Choose the string s = ap bp cp , which clearly belongs to L and has length 3p ≥ p.


4. According to the pumping lemma, we can split s into five parts uvwxy such that:

∣vwx∣ ≤ p,
∣vx∣ ≥ 1,
For all i ≥ 0, uv i wxi y ∈ L.
5. The string s = a^p b^p c^p consists of three blocks: p a's, p b's, and p c's. Given the condition that ∣vwx∣ ≤ p, the
window vwx can touch at most two adjacent blocks; in particular it cannot contain all three of the symbols a, b, and c.

6. Now pump with i = 2. The string u v^2 w x^2 y gains at least one extra symbol (since ∣vx∣ ≥ 1), but only symbols from at
most two of the three blocks are added, so the numbers of a's, b's, and c's can no longer all be equal; and if v or x contains
two different symbols, repeating it also destroys the a…b…c ordering. Either way, the pumped string is not of the form
a^n b^n c^n.

This contradiction shows that L is not context-free.
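
The combinatorial heart of this argument can be checked by brute force for a small pumping length. The script below (an illustration, not a proof; names are assumptions) enumerates every decomposition s = uvwxy of a^p b^p c^p with ∣vwx∣ ≤ p and ∣vx∣ ≥ 1 and verifies that pumping with i = 2 always produces a string outside the language.

python

# Brute-force illustration of the pumping argument for L = { a^n b^n c^n }.

def in_L(s):
    n = len(s) // 3
    return s == "a" * n + "b" * n + "c" * n

def every_decomposition_fails(p):
    s = "a" * p + "b" * p + "c" * p
    for start in range(len(s)):                               # vwx = s[start:end]
        for end in range(start + 1, min(start + p, len(s)) + 1):
            for v_len in range(0, end - start + 1):
                for x_len in range(0, end - start - v_len + 1):
                    if v_len + x_len == 0:
                        continue                              # need |vx| >= 1
                    u, v = s[:start], s[start:start + v_len]
                    w = s[start + v_len:end - x_len]
                    x, y = s[end - x_len:end], s[end:]
                    pumped = u + v * 2 + w + x * 2 + y        # pump with i = 2
                    if in_L(pumped):
                        return False                          # a decomposition survived
    return True

print(every_decomposition_fails(3))   # True: no decomposition survives pumping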

Example 2: The Language L = {ww ∣ w ∈ {a, b}∗ }


Consider the language consisting of strings that are two copies of the same string w . We will use the pumping lemma to
show that this language is not context-free.

1. Assume, for the sake of contradiction, that L is context-free.

2. Let p be the pumping length.

3. Choose the string s = ap bp ap bp , which is clearly in L.


4. According to the pumping lemma, we can split s into five parts uvwxy satisfying the pumping conditions.

5. Because ∣vwx∣ ≤ p, the window vwx falls into one of three cases, and in each case some pumped string fails to have the
form ww (note that a string of odd length is automatically not of this form):

If vwx lies entirely in the first half of s, pump up with i = 2. The suffix of s is unchanged, so the second half of the longer
pumped string now begins inside the first block of b's, i.e., it starts with b, while the first half still starts with a. The two
halves differ.

If vwx lies entirely in the second half of s, the symmetric argument applies: after pumping up, the first half of the pumped
string ends with an a while the second half still ends with a b.

If vwx straddles the midpoint of s, pump down with i = 0. The result is a string a^p b^i a^j b^p with i < p or j < p. If it were
of the form ww, then w would have to begin with a^p and end with b^p, forcing ∣w∣ ≥ 2p; but ∣w∣ = p + (i + j)/2 < 2p, a
contradiction.

In every case some pumped string is not of the form ww, so no decomposition can satisfy the pumping lemma.

This contradiction proves that the language L is not context-free.

Summary
Pumping Lemma for Context-Free Languages provides a necessary condition for a language to be context-free.
Specifically, it says that sufficiently long strings in a CFL can be "pumped" (with repeated substrings) without leaving the
language.

Proof of the Pumping Lemma involves showing that for any sufficiently long string, parts of its derivation tree can be
repeated (pumped) without violating the structure of the language.

Applications of the Pumping Lemma: We applied the lemma to show that certain languages, such as {an bn cn } and
{ww}, are not context-free by demonstrating that no valid pumping can preserve the structure of the language.

The Pumping Lemma is a powerful tool in the theory of formal languages, especially for proving that certain languages are
not context-free.

Lecture 29: Closure Properties of Context-Free Languages


In this lecture, we will explore the closure properties of context-free languages (CFLs). The concept of closure refers to
whether a class of languages remains closed under certain operations. In other words, if you apply an operation to two
languages from the class, the result is also a language from the same class. For context-free languages, we will study the
closure properties under various operations, such as substitution, reversal, intersection with regular languages, and inverse
homomorphisms.

We will also provide formal proofs and examples to demonstrate how these properties hold.

1. Substitutions
A substitution assigns to each terminal symbol a of an alphabet Σ a language s(a). Applying s to a string w = a1 a2 ⋯ an
gives the set of strings s(w) = { x1 x2 ⋯ xn ∣ xi ∈ s(ai) }, and applying it to a language L gives s(L), the union of s(w) over
all w ∈ L.

The Substitution Theorem for context-free languages states that the class of context-free languages is closed under
substitution: if L is a context-free language over Σ and s(a) is a context-free language for every a ∈ Σ, then s(L) is also
context-free.

Proof of Substitution Closure

Let G = (V , Σ, R, S) be a grammar for L, and for each terminal a ∈ Σ let Ga be a grammar for s(a) with start symbol Sa
(renaming variables so that no two grammars share a variable). We construct a new grammar G′ as follows:

1. In every production body of G, replace each occurrence of a terminal a by the start symbol Sa of Ga .

2. Add to G′ all the variables and productions of every Ga ; the start symbol of G′ is S .

A derivation of G′ first builds, using the modified rules of G, a "skeleton" Sa1 Sa2 ⋯ San corresponding to a string
a1 a2 ⋯ an ∈ L, and then each Sai independently derives some string xi ∈ s(ai ). Hence G′ generates exactly s(L), which is
therefore context-free.

Example of Substitution

Let L = {a^n b^n ∣ n ≥ 0}, generated by the grammar

S → aSb ∣ ϵ,

and define the substitution s(a) = {c^m d^m ∣ m ≥ 0} and s(b) = {e}, with grammars A → cAd ∣ ϵ and B → e respectively.

Replacing the terminals a and b in the grammar for L by the start symbols A and B , and adding the two small grammars, gives

S → ASB ∣ ϵ
A → cAd ∣ ϵ
B→e

which generates s(L) = { x1 ⋯ xn e^n ∣ n ≥ 0, each xi ∈ {c^m d^m } }. The resulting language is context-free, exactly as the
Substitution Theorem guarantees.
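
The grammar-level construction from the proof can be sketched directly, assuming the dictionary encoding of grammars used below: each terminal in the bodies of G is replaced by the start symbol of the grammar for its substituting language, and the rules of those grammars are then added.

python

# Sketch of the substitution construction (dict encoding and example grammars assumed).

def substitute(g, start_of):       # start_of maps a terminal to the start variable of its grammar
    return {head: [[start_of.get(sym, sym) for sym in body] for body in bodies]
            for head, bodies in g.items()}

# G:  S -> a S b | ε          (L = { a^n b^n })
g = {"S": [["a", "S", "b"], []]}
# s(a) is generated by A -> c A d | ε,   s(b) is generated by B -> e
g_a = {"A": [["c", "A", "d"], []]}
g_b = {"B": [["e"]]}

combined = substitute(g, {"a": "A", "b": "B"})
combined.update(g_a)
combined.update(g_b)
print(combined)
# {'S': [['A', 'S', 'B'], []], 'A': [['c', 'A', 'd'], []], 'B': [['e']]}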

2. Applications of the Substitution Theorem


The Substitution Theorem can be applied in various contexts, such as:

Closure under other operations: Many closure properties follow directly from substitution. For example, closure of CFLs
under union, concatenation, and Kleene star can all be obtained by substituting context-free languages into the simple
context-free languages {1, 2}, {12}, and {1}∗ respectively.

Grammar transformations: Substitution can be used to transform grammars for complex constructs in programming
languages.

Example Application

Consider the two context-free languages L1 = {a^n b^n ∣ n ≥ 0} and L2 = {c^m d^m ∣ m ≥ 0}. Using the substitution
s(a) = {a} and s(b) = L2 on L1, we obtain the language s(L1) = { a^n x1 x2 ⋯ xn ∣ n ≥ 0, each xi ∈ L2 }, which is still
context-free by closure under substitution. (Note that each occurrence of b may be replaced by a different string of L2.)

3. Reversal
The reversal of a language L, denoted LR , is the language formed by reversing each string in L. Specifically, if L =
{w1 , w2 , w3 , … }, then:
​ ​ ​

LR = {w1R , w2R , w3R , … }


​ ​ ​

The Reversal Theorem for context-free languages states that the class of context-free languages is closed under reversal.
That is, if L is a context-free language, then its reversal LR is also context-free.

Proof of Reversal Closure

Let G = (V, Σ, R, S) be a context-free grammar generating a language L. To construct a grammar G^R for L^R, we do the following:

1. Reverse the body of every production of G. Specifically, for each production A → X1 X2 … Xk in G, replace it with A → Xk Xk−1 … X1.

2. Keep the variables, the terminals, and the start symbol S unchanged.

A simple induction on the length of derivations shows that a variable A derives a string w in G if and only if A derives w^R in the new grammar. Therefore the new grammar generates exactly L^R, and the class of context-free languages is closed under reversal.

Example of Reversal

Consider the language L = {a^n b^n ∣ n ≥ 0}, which is context-free with the grammar S → aSb ∣ ϵ. The reversal of L is L^R = {b^n a^n ∣ n ≥ 0}. Reversing the production bodies gives the grammar S → bSa ∣ ϵ, which generates exactly L^R, so the reversed language is still context-free.

4. Intersection with a Regular Language


The intersection of a context-free language (CFL) with a regular language (RL) is another important operation. The Intersection Theorem for context-free languages states that the class of context-free languages is closed under intersection with regular languages: if L is a CFL and R is an RL, then L ∩ R is guaranteed to be context-free. (By contrast, the class of context-free languages is not closed under intersection with another CFL.)

Proof Sketch (Closure Under Intersection with an RL)

Let P be a PDA accepting L and let A be a DFA accepting R. We run P and A in parallel by constructing a product PDA whose states are pairs (p, q), where p is a state of P and q is a state of A. On reading an input symbol, the product machine updates the DFA component using A's transition function and the PDA component (together with its stack) using P's moves; on an ε-move of P, the DFA component stays unchanged. A pair (p, q) is accepting exactly when p is accepting in P and q is accepting in A. The product machine is again a PDA and accepts precisely L ∩ R, so L ∩ R is context-free.

Example:

Let L = {w ∈ {a, b}* ∣ w has equally many a's and b's}, which is context-free, and let R = L(a*b*), which is regular. Then L ∩ R = {a^n b^n ∣ n ≥ 0}, which is indeed context-free.

Why two CFLs behave differently: take L1 = {a^n b^n c^m ∣ n, m ≥ 0} and L2 = {a^m b^n c^n ∣ n, m ≥ 0}, both context-free. Their intersection is L1 ∩ L2 = {a^n b^n c^n ∣ n ≥ 0}, which is not context-free. Hence closure holds for intersection with a regular language but fails for the intersection of two CFLs.

5. Inverse Homomorphisms
An inverse homomorphism maps a language backwards through a homomorphism. Given a homomorphism h from Σ to strings over Γ, the inverse homomorphism of a language L ⊆ Γ* is defined as:

h^{-1}(L) = {w ∈ Σ* ∣ h(w) ∈ L}

The Inverse Homomorphism Theorem for context-free languages states that the class of context-free languages is closed under inverse homomorphisms. That is, if L is a context-free language and h is a homomorphism, then h^{-1}(L) is also context-free.

Proof of Inverse Homomorphism Closure

Let L be a context-free language accepted by a PDA P, and let h be a homomorphism from Σ to strings over Γ. We construct a PDA P′ for h^{-1}(L) that keeps, in its finite control, a buffer holding the yet-unprocessed suffix of h(a) for the most recently read input symbol a. Whenever the buffer is empty, P′ reads the next input symbol a and loads h(a) into the buffer; it then feeds the buffered symbols one at a time to a simulation of P, manipulating the stack exactly as P would (ε-moves of P are simulated freely). Since each h(a) has bounded length, the buffer fits into the finite state set, so P′ is a legitimate PDA. P′ accepts w exactly when P accepts h(w), so L(P′) = h^{-1}(L), which is therefore context-free.

Example of Inverse Homomorphism

Let L = {a^n b^n ∣ n ≥ 0} and let h : {c, d} → {a, b}* be the homomorphism with h(c) = aa and h(d) = bb. A string w over {c, d} satisfies h(w) ∈ L exactly when w = c^n d^n for some n, since h(c^n d^n) = a^{2n} b^{2n}. Hence h^{-1}(L) = {c^n d^n ∣ n ≥ 0}, which is again a context-free language.

Summary
Substitutions: The class of context-free languages is closed under substitution.

Reversal: The class of context-free languages is closed under reversal.

Intersection with Regular Languages: The class of context-free languages is closed under intersection with regular languages (although it is not closed under intersection with another context-free language).

Inverse Homomorphisms: The class of context-free languages is closed under inverse homomorphisms.

We have explored these closure properties in depth through formal proofs and examples, demonstrating the versatility and
limitations of context-free languages under various operations.

Lecture 30: Decision Properties of Context-Free Languages (CFLs)


In this lecture, we will explore the decision properties of context-free languages (CFLs). Decision properties refer to
questions about the language and grammar that can be answered algorithmically, often with a finite procedure. Specifically,
we will look at:

1. Complexity of Converting among CFGs and PDAs

2. Running Time of Conversion to Chomsky Normal Form (CNF)

3. Testing Emptiness of CFLs

4. Testing Membership in a CFL

5. Preview of Undecidable CFL Problems

Each topic will be discussed with proofs and examples where applicable.

1. Complexity of Converting among CFGs and PDAs

A Context-Free Grammar (CFG) and a Pushdown Automaton (PDA) are two equivalent formal models of computation for
context-free languages. The question of converting between CFGs and PDAs is important because it allows us to move
between two representations of a context-free language.

From a CFG to a PDA

Conversion: Every context-free grammar can be converted into an equivalent pushdown automaton. The PDA simulates
the derivation process of the grammar by pushing symbols onto the stack and popping them as rules are applied.

Complexity: Converting a context-free grammar G to a PDA involves constructing a PDA that simulates the leftmost derivations of G. The standard construction uses a single state and keeps the current sentential form on the stack: for each production A → α there is an ε-move that replaces A on top of the stack by α, and for each terminal a there is a move that pops a while reading a from the input. This can be done in linear time with respect to the size of the grammar.
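
As a concrete illustration (our own spelling-out of the standard single-state construction), consider the grammar S → aSb ∣ ϵ. The corresponding PDA has one state q, starts with S on its stack, accepts by empty stack, and uses the moves:

δ(q, ϵ, S) = {(q, aSb), (q, ϵ)}  (expand S using one of its productions)
δ(q, a, a) = {(q, ϵ)}  (match the terminal a against the input)
δ(q, b, b) = {(q, ϵ)}  (match the terminal b against the input)

This PDA accepts exactly {a^n b^n ∣ n ≥ 0}, and its size is proportional to the size of the grammar.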

From a PDA to a CFG

Conversion: Similarly, every pushdown automaton can be converted into an equivalent context-free grammar. This is
done by simulating the push and pop operations of the PDA and translating them into grammar production rules.

Complexity: The conversion process from a PDA to a CFG can be more complex because it involves handling the non-
deterministic transitions and the stack behavior of the PDA. However, the process can be achieved in polynomial time in
the size of the PDA.

Thus, both conversions from a CFG to a PDA and from a PDA to a CFG can be done in polynomial time, but the specific
complexities depend on the details of the grammar or automaton.

2. Running Time of Conversion to Chomsky Normal Form (CNF)


Chomsky Normal Form (CNF) is a special form for context-free grammars where each production rule is of the form:

A → BC where A, B, C are non-terminals, or


A → a where a is a terminal.

The conversion of a CFG to CNF involves removing useless symbols, eliminating ε-productions, eliminating unit
productions, and ensuring the grammar meets the CNF structure.

Conversion Process and Complexity

1. Eliminating Useless Symbols: First, remove any non-terminal symbols that do not generate any terminal strings or
cannot be reached from the start symbol. This step involves identifying the "reachable" and "generating" non-terminals,
and it can be done in linear time.

2. Eliminating ε-productions: ε-productions (productions of the form A → ϵ) need to be eliminated. This process involves finding all nullable non-terminals and adding the appropriate variants of the other rules with nullable symbols omitted. If long productions are first broken into bodies of at most two symbols, this step takes polynomial time in the size of the grammar.

3. Eliminating Unit Productions: Unit productions are those of the form A → B , where A and B are non-terminals.
These can be eliminated by substituting all productions of B into A's rules. This step takes polynomial time.

4. Breaking Down Long Productions: The final step is to ensure all productions are either of the form A → BC or A →
a. For any production longer than two symbols, new non-terminals are introduced to shorten the right-hand side of the
production. This process takes linear time.

Thus, the total running time for converting a CFG into CNF is polynomial in the size of the grammar.

Example:

Consider the following grammar:

S → AB
A→a
B→b

This grammar is already in CNF. If we had longer productions, we would split them into smaller ones by introducing new
non-terminals.
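
For instance (our own illustration), a longer production such as S → ABC would be split by introducing a fresh non-terminal X, giving S → AX and X → BC; repeating this for every long body leaves all productions with at most two symbols on the right-hand side, as CNF requires.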

3. Testing Emptiness of CFLs


The emptiness problem for context-free languages asks whether a given context-free language is empty, i.e., whether there
are any strings that can be generated by a given CFG.

Algorithm and Complexity

To test whether a context-free language L is empty, we need to check whether the start symbol S of the CFG can derive
any string. This can be done by performing a reachability analysis where we compute the set of symbols that can derive
strings, starting from the start symbol.

The algorithm involves:

1. Finding all the non-terminal symbols that can generate terminal strings.

2. Checking if the start symbol is in this set.

The complexity of this procedure is linear in the size of the grammar because we can iteratively mark non-terminals
that can eventually derive terminal strings.

Example:

For the grammar:

S → aSb ∣ ϵ

We can see that S can derive the empty string (because S → ϵ). Therefore, the language is not empty.
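
The marking procedure described above is easy to implement. Below is a small C sketch of it (our own illustration, using a different hard-coded example grammar): a non-terminal is marked "generating" once some production body for it consists entirely of terminals and already-marked non-terminals, and at the end we check whether the start symbol was marked.

#include <stdio.h>

/* Emptiness test sketch for a hard-coded grammar (an assumption made for
   illustration):  S -> A B | a,   A -> A a,   B -> b
   Here A never becomes generating, but S -> a does, so L(G) is non-empty.
   Uppercase letters are non-terminals, lowercase letters are terminals. */

typedef struct { char head; const char *body; } Rule;

int main(void) {
    Rule rules[] = { {'S', "AB"}, {'S', "a"}, {'A', "Aa"}, {'B', "b"} };
    int nrules = sizeof rules / sizeof rules[0];
    int gen[128] = {0};                      /* gen[X] = 1 once X is known to be generating */

    for (int changed = 1; changed; ) {       /* fixed-point iteration */
        changed = 0;
        for (int r = 0; r < nrules; r++) {
            if (gen[(int)rules[r].head]) continue;
            int all_gen = 1;                 /* is every symbol in the body generating? */
            for (const char *p = rules[r].body; *p; p++)
                if (*p >= 'A' && *p <= 'Z' && !gen[(int)*p]) { all_gen = 0; break; }
            if (all_gen) { gen[(int)rules[r].head] = 1; changed = 1; }
        }
    }
    printf("L(G) is %s\n", gen['S'] ? "non-empty" : "empty");
    return 0;
}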

4. Testing Membership in a CFL


The membership problem for context-free languages asks whether a given string w belongs to a context-free language L,
i.e., whether the string w can be derived from the start symbol S of the CFG.

Algorithm and Complexity

PDA Simulation: To test whether a string w belongs to the language L, we can simulate a pushdown automaton (PDA)
that accepts the language L. The PDA will push and pop symbols from the stack according to the production rules of
the CFG.

Dynamic Programming: Another approach is the CYK algorithm, a dynamic-programming method that runs in cubic time with respect to the length of the string and requires the grammar to be in Chomsky Normal Form. The algorithm constructs a table where each cell T[i, j] holds the set of non-terminals that can derive the substring of w from position i to position j.

Thus, testing membership in a CFL can be done in cubic time in the length of the string using the CYK algorithm.

Example:

Given the grammar:

S → aSb ∣ ϵ

To test membership for the string "ab", we first convert the grammar to Chomsky Normal Form and then run the CYK algorithm, which checks whether there is a derivation of the string. It confirms that "ab" is part of the language.

5. Preview of Undecidable CFL Problems


While many decision properties for CFLs are decidable, there are also undecidable problems. Some important undecidable
problems include:

1. Universality Problem: Given a CFG G, is the language generated by G equal to Σ∗ ? This is undecidable because it
involves determining whether every possible string is generated by the grammar.

2. Ambiguity Problem: Given a CFG G, is G ambiguous, i.e., does some string have two essentially different parse trees? This is undecidable. (By contrast, whether the language generated by G is finite is decidable, so finiteness is not an example of an undecidable problem.)

3. Equivalence Problem: Given two context-free grammars G1 and G2, do they generate the same language? This problem is undecidable for context-free grammars.

4. Emptiness of Intersection: Given two context-free languages L1 and L2, is L1 ∩ L2 empty? This is undecidable. (Note that the intersection of a CFL with a regular language is again a CFL, so emptiness of that intersection is decidable; the difficulty arises only when both languages are context-free.)

These problems have been proven to be undecidable, and there is no algorithm that can solve them for all context-free
languages.

Summary
In this lecture, we have discussed the following decision properties of context-free languages (CFLs):

Converting between CFGs and PDAs is feasible in polynomial time.

Converting to Chomsky Normal Form (CNF) has a polynomial-time complexity.

Testing emptiness of a CFL can be done in linear time with respect to the size of the grammar.

Testing membership in a CFL can be done in cubic time using the CYK algorithm.

There are several undecidable problems related to context-free languages, such as universality, ambiguity, equivalence, and emptiness of the intersection of two CFLs.

These results provide a strong foundation for understanding the computational limitations and capabilities of context-free
languages and grammars.

Lecture 31: Problems That Computers Cannot Solve

In this lecture, we explore unsolvable problems in computation, which involve the limits of computation and the inability
of computers to solve certain classes of problems. We will address this concept through the following points:

1. Programs that Print "Hello World"

2. The Hypothetical "Hello World" Tester

3. Reducing One Problem to Another

This discussion will involve formal proofs, examples, and a C code example for better understanding.

1. Programs that Print "Hello World"


In this section, we will discuss how Fermat's Last Theorem (a famous problem in mathematics) can be framed as a "Hello
World" program in the world of computation. A "Hello World" program is typically the simplest program a beginner writes to
test if their system works, usually something like:

#include <stdio.h>

int main() {
printf("Hello World\n");
return 0;
}

However, imagine we frame Fermat's Last Theorem in a similar context. Fermat’s Last Theorem states that there are no
three positive integers a, b, and c that satisfy the equation:

a^n + b^n = c^n

for any integer value of n greater than 2.

Let's consider the following C program, which searches exhaustively (over a bounded range) for a counterexample to Fermat's Last Theorem:

#include <stdio.h>

/* Integer exponentiation. Using pow() from <math.h> would return a double,
   and testing doubles for exact equality is unreliable, so we use exact
   integer arithmetic instead. */
long long ipow(long long base, int exp) {
    long long result = 1;
    while (exp-- > 0)
        result *= base;
    return result;
}

int main() {
    int a, b, c, n;
    for (n = 3; n <= 10; n++) { // Looping over some possible values of n
        for (a = 1; a <= 50; a++) {      // bounds kept small so the powers
            for (b = 1; b <= 50; b++) {  // fit into a long long
                for (c = 1; c <= 50; c++) {
                    if (ipow(a, n) + ipow(b, n) == ipow(c, n)) {
                        printf("Counterexample found for n=%d: a=%d, b=%d, c=%d\n", n, a, b, c);
                        return 0;
                    }
                }
            }
        }
    }

    printf("No counterexample found in this bounded search.\n");
    return 0;
}

This code searches for a counterexample to Fermat's Last Theorem over a small range. Since the theorem has been proven, the program will never find a valid solution for n > 2. The key point is this: imagine the unbounded version of the program, which keeps enlarging its search range forever and prints a message only when a counterexample turns up. Deciding whether that program ever prints its message amounts to deciding Fermat's Last Theorem itself. In general, determining what an arbitrary program will eventually do can be as hard as resolving deep mathematical questions, and, as we will see, it cannot be done by any algorithm.

Formal Insight:

This illustrates how an innocent-looking question about a program's behavior ("will this program ever print its message?") can encode a deep mathematical problem, and it previews why such questions about program behavior turn out to be unsolvable in general.

2. The Hypothetical "Hello World" Tester


Now, we discuss a hypothetical paradox: a "Hello World" tester that checks whether a given program is a valid "Hello
World" program. The idea here is to create a program that tests if another program outputs "Hello World." The paradox lies
in trying to define such a tester for all programs.

The Hypothesis:

We imagine a program that takes another program P as input and determines if P prints the string "Hello World". Let's call
this program the Hello World tester. It would have the following structure:

#include <stdio.h>
#include <string.h>

/* run_program() is hypothetical: it is assumed to run the given program and
   return whatever that program prints. No such function can actually be
   written so that it works for all programs. */
const char *run_program(const char *program);

int test_hello_world(const char *program) {
    /* Hypothetical code to test if program outputs "Hello World" */
    if (strcmp(run_program(program), "Hello World") == 0) {
        return 1;   /* Program prints "Hello World" */
    }
    return 0;       /* Program does not print "Hello World" */
}

int main() {
    /* Hypothetically test a given program */
    if (test_hello_world("some_program.c") == 1) {
        printf("It's a valid Hello World program.\n");
    } else {
        printf("It's not a Hello World program.\n");
    }
    return 0;
}

Now, why is this paradoxical?

1. Suppose the program test_hello_world works as intended for all possible C programs.

2. Now construct the following program D, which feeds its own source code to the tester and then does the opposite of whatever the tester predicts:

#include <stdio.h>

/* test_hello_world is the hypothetical tester from above; "D.c" is assumed
   to be the source file of this very program. */
int main() {
    if (test_hello_world("D.c") == 1) {
        /* The tester says D prints "Hello World", so D prints nothing. */
    } else {
        printf("Hello World\n");
    }
    return 0;
}

Consider what D does:

If the tester reports that D prints "Hello World", then D prints nothing.

If the tester reports that D does not print "Hello World", then D prints "Hello World".

Either way the tester's answer about D is wrong, which contradicts the assumption that test_hello_world works correctly for all programs. Hence no such tester can exist; this paradox reflects the limitations of computability.

This example demonstrates the Halting Problem in disguise: whether a program halts or produces a specific output (in this
case, "Hello World") is not always decidable.

3. Reducing One Problem to Another


In this section, we will introduce the concept of reductions through a computational example. A reduction is a process by
which one problem can be transformed into another. If we can reduce a known hard problem (like the Halting Problem) to a
problem P , and show that solving P would solve the hard problem, then P must also be hard.

Example: Reducing the Halting Problem to Another Problem

Suppose we want to show that some problem Y is undecidable. The strategy is to reduce a problem that is already known to be undecidable, such as the Halting Problem, to Y.

The Halting Problem is: given a program P and input I, does P halt on input I? There is no general algorithm that decides this for all programs and inputs.

The reduction works as follows: assume, for contradiction, that we have a solver for Y. We then show how to transform any instance (P, I) of the Halting Problem into an instance of Y whose answer tells us whether P halts on I. If such a transformation exists, a solver for Y would give a solver for the Halting Problem, which is impossible; therefore Y must be undecidable as well.

Concretely, we construct a new program P′ from P and input I:

#include <stdio.h>

/* P_prime is built from a specific program P and input I. run_P_on_I() is a
   placeholder meaning "simulate P on input I"; it is part of the constructed
   program, not a magic solver. */
void run_P_on_I(void);

int main() {
    run_P_on_I();                /* this call returns only if P halts on I */
    printf("Hello World\n");     /* reached exactly when P halts on I */
    return 0;
}

Now P′ prints "Hello World" (and halts) if and only if P halts on input I. So an algorithm for Y, say one that decides whether a given program ever prints "Hello World" or whether it halts, would immediately yield an algorithm for the Halting Problem for the original program P. This shows that Y is at least as hard as the Halting Problem, and since the Halting Problem is undecidable, Y must also be undecidable.

Summary
In this lecture, we have discussed several key ideas related to unsolvable problems and computational limits:

1. Programs that Print "Hello World": We framed Fermat's Last Theorem as a Hello World program and demonstrated
that some problems cannot be solved by computers.

2. The Hypothetical "Hello World" Tester: We explored the paradox of testing whether a program is a valid Hello World
program, which leads to a contradiction and relates to the Halting Problem.

3. Reducing One Problem to Another: We introduced the concept of reductions in computation and showed how a solver for a new problem would yield a solver for a known undecidable problem (like the Halting Problem), thereby demonstrating the undecidability of the new problem.

These examples highlight the fundamental limits of computation and underscore the importance of understanding the
boundaries of what computers can and cannot do.

Lecture 32: The Turing Machine


In this lecture, we will introduce the Turing Machine (TM), a central concept in the theory of computation, which serves as a
model for computation and is fundamental to understanding the limits of what can be computed. We will explore the
historical context behind the development of the Turing Machine, formal notation for TMs, instantaneous descriptions,
transition diagrams, and discuss its relationship with halting. We will also define the language of a Turing Machine.

1. The Quest to Decide All Mathematical Questions (Historical Account)


The idea of automating computation and formalizing mathematical reasoning has deep historical roots. In the 19th and
early 20th centuries, mathematicians and logicians such as David Hilbert and Kurt Gödel were concerned with formalizing
all of mathematics. Hilbert’s Program sought to prove that every mathematical question could be resolved using a complete
and consistent set of axioms and procedures (i.e., a formal system).

In the early 1930s, mathematicians were confronted with Gödel’s Incompleteness Theorems, which demonstrated that any
sufficiently powerful formal system could not be both complete and consistent. This result shattered the dream of finding a
mechanical method (or algorithm) that could decide every mathematical question.

Around the same time, Alan Turing and Alonzo Church developed the concept of a computational model to address these
issues. Turing introduced the Turing Machine in 1936 as a theoretical construct to formalize the idea of computation.
Turing’s model would provide a rigorous definition of what it means for a problem to be computable.

Turing’s work, along with the work of Church (via the lambda calculus), showed that the class of problems solvable by a
human using a machine (in Turing's sense) is exactly the class of problems solvable by any algorithmic process. This led to
the Church-Turing Thesis, which asserts that anything computable by a machine is computable by a Turing machine.

Thus, the Turing Machine became the foundation for understanding computation and its limitations, particularly with
respect to decidability.

2. Notation for the Turing Machine


A Turing Machine (TM) is a theoretical device that manipulates symbols on a tape according to a set of rules, simulating a
human-like process of computation. The formal components of a Turing Machine are defined as follows:

Tape: An infinite sequence of cells, each containing a symbol from a finite alphabet. The tape can be thought of as a
sequence of memory locations, where each location can hold a symbol. The tape extends infinitely in both directions.

Alphabets: The input alphabet Σ and the tape alphabet Γ, with Σ ⊆ Γ. The tape alphabet contains all symbols the machine can read and write, including a special blank symbol ⊔ that represents an empty cell.
Head: A read/write head that scans the tape. The head can move left or right, and it can also read or write a symbol in
the current tape cell.

States: The machine operates in one of a finite set of states Q, and there is a designated start state and one or more
halt states. The machine’s operation is determined by its current state and the symbol it is reading on the tape.

Transition Function: The transition function δ is a set of rules that determine the next state, the symbol to write, and the direction to move the head, given the current state and the symbol on the tape. Formally, this is written as:

δ : Q × Γ → Q × Γ × {L, R}

where:

Q is the set of states.

Γ is the tape alphabet.

L and R represent the left and right directions for the head to move.
Accept and Reject States: A Turing Machine halts when it reaches a designated accept state or reject state. The
behavior of the machine can be classified as accepting, rejecting, or non-halting based on the machine’s computation.

3. Instantaneous Descriptions for TMs


An instantaneous description (ID) of a Turing Machine is a snapshot of the machine at any point in its computation. It
consists of:

The current state of the machine.

The contents of the tape (the string of symbols currently written on the tape).

The position of the head on the tape.

For example, consider the following instantaneous description of a Turing Machine:

(ID) q1: ...011010... (head on 1)

This tells us that the machine is in state q1, the tape contents are "...011010...", and the head is on the second symbol, which
is 1.

An instantaneous description provides a complete description of the machine’s state at any given moment during its
computation.

4. Transition Diagrams for TMs


A transition diagram is a visual representation of the Turing Machine’s behavior. In these diagrams:

Each state is represented by a circle.

Directed edges between states represent transitions based on the current symbol being read.

Labels on the edges show the symbol written, the direction of head movement (L for left, R for right), and the state
transition.

Example: Consider a simple Turing Machine designed to recognize the language L = {w ∣ w contains an even number of 1s} over the alphabet {0, 1}. The machine begins in the start state q0 and scans the input from left to right, one symbol per move.

The transition diagram might look like:


q0 --0/0,R--> q0 (on reading 0: leave it unchanged, move right, stay in q0)
q0 --1/1,R--> q1 (on reading 1: leave it unchanged, move right, go to q1)
q1 --0/0,R--> q1 (on reading 0: leave it unchanged, move right, stay in q1)
q1 --1/1,R--> q0 (on reading 1: leave it unchanged, move right, return to q0)

This Turing Machine checks whether the string has an even number of 1s by switching between two states q0 and q1. If the
machine ends in q0, it accepts the string (even number of 1s); otherwise, it rejects the string.
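
Since this machine only moves right and never rewrites a cell, it is easy to simulate directly. The following C sketch (our own illustration) mirrors the diagram: the state toggles between q0 and q1 on every 1, and the input is accepted if the scan ends in q0.

#include <stdio.h>

/* Simulation of the two-state machine in the diagram above. The machine
   never writes, so the simulation is a single left-to-right scan; the '\0'
   at the end of the C string plays the role of the blank. */
int accepts_even_ones(const char *input) {
    int state = 0;                            /* 0 = q0 (even so far), 1 = q1 (odd) */
    for (int head = 0; input[head] != '\0'; head++) {   /* the head only moves right */
        if (input[head] == '1')
            state = 1 - state;                /* q0 <-> q1 on reading a 1 */
        /* on reading a 0 the state is unchanged */
    }
    return state == 0;                        /* halting in q0 means accept */
}

int main(void) {
    printf("101 -> %d\n", accepts_even_ones("101"));   /* 1: two 1s, accepted */
    printf("111 -> %d\n", accepts_even_ones("111"));   /* 0: three 1s, rejected */
    return 0;
}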

5. TMs and Halting


A key concept in Turing Machines is halting: whether a Turing Machine eventually halts or runs forever. The Halting
Problem was famously proven to be undecidable by Alan Turing, meaning there is no general algorithm that can determine
for every Turing Machine whether it will halt on a given input.

To illustrate halting, let's consider the simple Turing Machine above, which checks whether the input string contains an even number of 1s:

1. If the machine reads a 1 in state q0, it moves to state q1 (indicating an odd number of 1s so far).

2. If it reads another 1 in state q1, it returns to state q0, indicating an even number of 1s again.

3. When the machine reaches the blank at the end of the string, it halts: it accepts if it is in q0 (an even number of 1s) and rejects if it is in q1.

For example, on input "101", the machine will:

Read the first 1, move to state q1.

Read 0, stay in q1.

Read 1 again, move to state q0.

The machine halts in q0, meaning the input has an even number of 1s.

However, a Turing Machine is not guaranteed to halt in all cases. For instance, consider a machine that runs in an infinite
loop or one that processes an undecidable problem. In those cases, the machine may never halt.

6. The Language of a Turing Machine


The language of a Turing Machine M is the set of strings L(M ) that M accepts. Formally, this is:

L(M ) = {w ∣ M halts in an accept state on input w}

For example, consider a Turing Machine that recognizes the language L = {w ∣ w has an even number of 1s}. The
language of this machine is the set of all strings over {0, 1} that contain an even number of 1s.

Summary
In this lecture, we introduced the concept of the Turing Machine (TM), a central model of computation that helps define the
limits of what can be computed. We discussed its historical context in the quest to decide all mathematical questions, the
formal notation for Turing Machines, instantaneous descriptions, and transition diagrams. We also explored the concept of
halting and noted that the Halting Problem is undecidable. Finally, we defined the language of a Turing Machine, which
consists of the set of strings that the machine accepts.

Lecture 33: Programming Techniques for Turing Machines


In this lecture, we will delve into several advanced programming techniques for Turing Machines (TMs) that can significantly
enhance their computational power and flexibility. These techniques include using storage in the state, utilizing multiple
tracks on the tape, and employing subroutines for structured computation. We will explore the necessary formalism for
each technique and illustrate their application through examples.

1. Storage in the State


One basic limitation of a simple Turing Machine is that its memory is constrained to the tape. However, by utilizing the state
of the machine itself as an additional form of storage, we can extend the computational capabilities of the machine. This
technique involves encoding information directly in the states of the machine rather than relying solely on the tape for
memory.

Formalism:

The set of states Q in a Turing Machine M can be enlarged so that each state also carries extra information. The new states are pairs [q, d], where q ∈ Q is the original control state and d ranges over a finite set D of data values (for example a remembered symbol or a flag). Formally the enlarged state set is Q′ = Q × D, which is still finite.

This storage technique is essentially an encoding scheme where states can carry encoded data such as binary
counters, flags, or memory values that would otherwise require additional tape cells.

Example:

Consider a Turing Machine that needs to remember whether the number of occurrences of a symbol a read so far is even or odd (a count modulo 2). Rather than writing this information to the tape, we can encode it directly in the machine's states. Let the states be [q, 0] and [q, 1], where the second component records the parity of the a's seen so far.

Here's how it would work:

Start in state [q, 0] (zero a's counted, an even number).

Upon reading an a, switch the second component: from [q, 0] to [q, 1], or from [q, 1] to [q, 0].

Upon reading any other symbol, keep the current state and move on.

Because the stored component ranges over a finite set, the state set remains finite, yet the machine tracks the parity of the a's without modifying the tape. Note that only a bounded amount of information can be stored this way; an unbounded count would still require the tape.

This technique is useful for problems where the machine needs to keep track of a bounded amount of intermediate information, such as flags or a remembered symbol, without consuming tape space.

2. Multiple Tracks
Another powerful technique is to use multiple tracks on the tape. The machine still has a single tape and a single head, but each cell of the tape is divided into several parallel tracks, so the symbol held in a cell is really a tuple with one component per track.

Each track can store a different piece of information, and in one move the head reads, and may rewrite, all components of the tuple in the cell it is scanning. Multiple tracks allow different kinds of data (for example, the input together with markers or partial results) to be kept aligned and manipulated together.

Formalism:

A tape with k tracks uses the tape alphabet Γ = Γ1 × Γ2 × ⋯ × Γk, where Γi is the alphabet of track i; each cell holds one such k-tuple.

The machine's transition function reads and writes whole tuples. Formally, the transition function δ is:

δ : Q × (Γ1 × Γ2 × ⋯ × Γk) → Q × (Γ1 × Γ2 × ⋯ × Γk) × {L, R}

where each tuple (X1, X2, …, Xk) consists of the symbols on the different tracks in the cell under the head. Since Γ is still a finite alphabet, a multi-track machine is simply an ordinary Turing Machine with a structured tape alphabet.

Example:

Let's consider a Turing Machine with two tracks. The first track holds a binary string, say 110, and the second track holds another binary string of the same length, say 011. The machine's task is to check whether the two tracks store the same sequence.

The head starts at the leftmost cell in the initial state q0.

In each cell the machine reads the pair of symbols under the head, one from each track, and compares the two components; if they differ, it rejects.

The transition function reads from both tracks at once, so a single left-to-right pass suffices: if the head reaches the end of the input with every pair matching, the machine accepts.

Using multiple tracks like this simplifies the task, as the machine can work with two separate pieces of information at once.
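
A quick way to picture the two-track tape is as an array of symbol pairs scanned by a single head. The following C sketch (our own illustration) models the two tracks as two parallel strings of equal length and performs the one-pass comparison just described.

#include <stdio.h>
#include <string.h>

/* Two-track comparison sketch: each "cell" holds a pair of symbols, one per
   track, modelled here as two parallel strings; a single head index scans
   both tracks at once. */
typedef struct { const char *track1, *track2; } TwoTrackTape;

int tracks_equal(TwoTrackTape t) {
    if (strlen(t.track1) != strlen(t.track2)) return 0;   /* the tracks must line up */
    for (int head = 0; t.track1[head] != '\0'; head++)    /* one head, two tracks */
        if (t.track1[head] != t.track2[head])
            return 0;                                     /* mismatching pair: reject */
    return 1;                                             /* every pair matched: accept */
}

int main(void) {
    TwoTrackTape same = { "110", "110" };
    TwoTrackTape diff = { "110", "011" };
    printf("%d %d\n", tracks_equal(same), tracks_equal(diff));   /* prints 1 0 */
    return 0;
}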

3. Subroutines
A subroutine in the context of a Turing Machine refers to a modular computation method where the machine can invoke a
set of predefined states (a subroutine) to perform a specific task. This technique helps in organizing the computation into
smaller, reusable parts, making complex tasks easier to manage and reason about.

While Turing Machines do not inherently have the concept of functions or procedures like in high-level programming
languages, we can simulate subroutines by creating states that perform a specific task and then returning to the main
computation after completion.

Formalism:

A subroutine is essentially a set of states and transitions designed to accomplish a specific task. Once the task is
complete, the machine transitions back to the state that invoked the subroutine.

The subroutine can be invoked by setting up an initial state in the main program that transitions to the first state of the
subroutine.

A subroutine can return control to the calling state by using a special return state or by modifying the machine's state
to indicate the subroutine has completed.

Example:

Let’s consider a Turing Machine that needs to compute the sum of two binary numbers. This task can be broken down into
smaller subroutines, such as:

1. Subroutine for Adding Two Digits: This subroutine takes two digits (one from each binary number) and adds them,
handling the carry if necessary.

2. Subroutine for Handling Carry: This subroutine deals with carry-over from one digit to the next during the addition.

The main program would invoke these subroutines at appropriate points to add digits and propagate carries. Once a
subroutine completes its task, the machine would return to the main program to continue the overall computation.

In formal terms, the machine’s transition function would involve:

Transitioning from the main program to the subroutine.

The subroutine performing its task and transitioning to a return state.

Returning to the main program and resuming computation.

Summary
In this lecture, we explored three advanced programming techniques for Turing Machines: storage in the state, multiple
tracks, and subroutines. These techniques significantly enhance the Turing Machine’s ability to perform more complex
computations efficiently.

1. Storage in the state allows the machine to use its states as additional memory, enabling efficient tracking of
intermediate results.

2. Multiple tracks provide a way to store and manipulate multiple types of information simultaneously, enhancing the
machine’s computational power.

3. Subroutines allow the machine to perform modular computations by invoking predefined sets of states, making
complex tasks more manageable.

These techniques reflect how Turing Machines can be adapted and programmed in sophisticated ways, although they
remain abstract computational models, far removed from practical implementation.

Lecture 34: Extensions to the Basic Turing Machines


In this lecture, we will explore several extensions to the basic Turing Machine (TM) model. These extensions include
multitape Turing Machines, non-deterministic Turing Machines, and the concept of equivalence between one-tape and
multitape TMs. We will also discuss the running time of multitape TMs and how to simulate multitape TMs with single-tape
TMs.

1. Multitape Turing Machines


A multitape Turing Machine is an extension of the basic Turing Machine that has more than one tape and multiple heads,
each moving independently on a different tape. These machines are often used to simplify complex algorithms or to prove
theoretical results by making it easier to describe certain computations.

Formal Definition:

A multitape Turing Machine M is formally defined by a 7-tuple:

M = (Q, Σ, Γ, δ, q0, q_accept, q_reject)

where:

Q is the set of states.

Σ is the input alphabet.

Γ is the tape alphabet (which includes the blank symbol).

δ is the transition function, which maps the current state and the symbols under each tape head to a new state, symbols to write, and directions to move.

q0 is the start state.

q_accept and q_reject are the accept and reject states, respectively.

For a k-tape Turing Machine, the transition function δ is modified as:

δ : Q × (Γ1 × Γ2 × ⋯ × Γk) → Q × (Γ1 × Γ2 × ⋯ × Γk) × {L, R}^k

where Γ1, Γ2, …, Γk are the tape alphabets for the individual tapes (often all equal to Γ), and {L, R}^k denotes the movement of each head (left or right) on each tape.

Example:

Consider a 2-tape Turing Machine tasked with checking whether a string is a palindrome. The machine could:

1. Copy the string from tape 1 to tape 2, ensuring the second tape mirrors the first.

2. Then, it can compare symbols on both tapes simultaneously to verify if they match, moving heads in tandem.

While multitape TMs provide significant computational convenience, it is important to prove that multitape TMs are not
more powerful than single-tape TMs in terms of computational power (they can compute the same class of languages).

2. Equivalence of One-Tape and Multitape Turing Machines


Even though multitape Turing Machines are more convenient for designing algorithms, it is important to prove that they do
not increase the computational power of Turing Machines. In fact, a single-tape Turing Machine can simulate a multitape
Turing Machine.

Proof: Equivalence of One-Tape and Multitape Turing Machines:

Let Mk be a k-tape Turing Machine, where k ≥ 2. We will show that there exists an equivalent one-tape Turing Machine M1 that simulates Mk.

1. Representation of Multiple Tapes on One Tape:

The idea is to use the single tape of M1 to represent all the tapes of Mk. We interleave the contents of the k tapes onto a single tape.

Suppose that each of the k tapes holds a string of symbols. We can represent these strings on the single tape by
separating the contents of each tape with a special separator symbol (e.g., #).

2. Encoding of Tape Configuration:

Let the configuration of the k -tape machine at any point in time be represented as:

Tape 1 contents#Tape 2 contents# … #Tape k contents


Each tape is indexed, and each tape's head position can be marked using a special symbol, such as ∣. This allows the
machine to simulate the movement of heads on each tape.

3. Simulating Transitions:

For each step of the simulation, M1 will:


Identify the positions of the heads for each tape by searching for the special head marker ∣.

Copy the appropriate symbol from the simulated tapes on M1 's tape.

Perform the necessary state transitions and move the tape head accordingly.

After each step, the Turing machine moves to the next configuration by updating the tape and head positions,
simulating all the operations of the multitape machine.

Conclusion:

A single-tape Turing Machine can simulate a multitape Turing Machine by encoding the multiple tapes on one tape and simulating the transitions. The slowdown is at most polynomial (in fact quadratic, as discussed below) in the running time of the multitape machine, so the computational power remains the same, though the single-tape machine may take more time.

3. Running Time and the Many-Tapes-to-One Construction

The time complexity of simulating a multitape Turing Machine using a single-tape Turing Machine is an important
consideration. When simulating a k-tape Turing Machine using a single-tape Turing Machine, we must account for the
additional time required for accessing and updating the information on multiple tapes.

Formal Construction:

Suppose Mk is a k-tape Turing Machine that runs in time Tk(n), where n is the size of the input.

A one-tape Turing Machine M1 that simulates Mk may take time T1(n), where:

T1(n) = O(Tk(n)^2)

This quadratic increase in time is due to the need to simulate multiple tape heads on a single tape. After t steps of Mk, the non-blank portion of each tape has length O(t), so to simulate one step M1 must sweep back and forth across a region of length O(Tk(n)) of the interleaved configuration, locating the head markers, reading the symbols beneath them, and updating them. Summing this cost over the Tk(n) simulated steps gives a total of O(Tk(n)^2) steps for any fixed number of tapes k.

Example:

Consider a 2-tape Turing Machine that adds two binary numbers in time T2(n) = O(n), where n is the size of the input. A single-tape Turing Machine simulating the 2-tape machine might require O(n^2) time, because each simulated step involves traversing the interleaved configuration of both tapes.

4. Non-Deterministic Turing Machines


A Non-deterministic Turing Machine (NDTM) is a Turing Machine where, at each step, there may be multiple possible
transitions to different states. This contrasts with deterministic Turing Machines, where the next state is uniquely
determined by the current state and input symbol.

Formal Definition:

A Non-deterministic Turing Machine is defined by a 7-tuple:

M = (Q, Σ, Γ, δ, q0, q_accept, q_reject)

where δ is now a set-valued function:

δ : Q × Γ → P(Q × Γ × {L, R})

where P(S) denotes the power set of S, meaning that at each step the machine may transition to any one of several possible configurations.

Example:

Consider a Non-deterministic Turing Machine that decides whether a binary string has an even number of 1's. The NDTM
can:

1. Non-deterministically guess whether the number of 1's is even or odd.

2. Proceed to scan the string and verify the guess.

This type of machine allows for parallel exploration of possible computational paths. If any path leads to an accept state, the
machine accepts the input.

Significance:

In terms of the class of languages they can recognize, NDTMs are no more powerful than deterministic Turing Machines: a DTM can simulate an NDTM by systematically exploring all computation paths, although this simulation may take exponentially longer. Non-determinism matters chiefly for complexity theory: the class NP consists of the problems solvable by an NDTM in polynomial time, the class P of those solvable by a DTM in polynomial time, and whether P = NP is a famous open question.

Summary
In this lecture, we explored several important extensions to the basic Turing Machine model:

1. Multitape Turing Machines: These machines provide multiple tapes and heads, simplifying computations. We showed
that multitape TMs are not more powerful than single-tape TMs in terms of the class of languages they can recognize.

2. Equivalence of One-Tape and Multitape TMs: We proved that a single-tape Turing Machine can simulate a multitape
Turing Machine with a quadratic increase in time complexity.

3. Non-Deterministic Turing Machines: NDTMs allow for multiple possible transitions at each step, enabling them to
explore multiple computational paths simultaneously. We briefly discussed the power and implications of non-
determinism in computational complexity theory.

Lecture 35: Restricted Turing Machines


In this lecture, we will explore different types of restricted Turing Machines that are modifications of the standard Turing
Machine. These machines are useful in the study of complexity and computability, and they provide insights into the limits
of computation under various constraints. We will discuss the following restricted types:

1. Turing Machines with Semi-Infinite Tapes

2. Multistack Machines

3. Counter Machines

4. Power of Counting Machines

Each section will be detailed with formal definitions, examples, and proofs where applicable.

1. Turing Machines with Semi-Infinite Tapes


A Turing Machine with a Semi-Infinite Tape is a variation of the standard Turing Machine where the tape is infinite in one
direction and has a fixed beginning in the other direction. This modification restricts the tape by limiting one end, but it still
allows for unbounded movement in one direction.

Formal Definition:

A semi-infinite Turing Machine is defined as:

M = (Q, Σ, Γ, δ, q0, q_accept, q_reject)

where:

Q is the finite set of states,

Σ is the input alphabet (excluding the blank symbol),

Γ is the tape alphabet, which includes the blank symbol,

δ is the transition function, defined as for a standard Turing Machine except that the head can never move to the left of the first tape cell,

q0 is the start state,

q_accept and q_reject are the accept and reject states.

In this model, the tape is infinite to the right but has a fixed left endpoint. That is, the tape starts at a certain position
(usually represented as the leftmost cell), but the head can only move to the right infinitely.

Example:

Imagine a semi-infinite Turing Machine that operates on a binary string. The machine can move to the right across the
string, but if it attempts to move left beyond the start of the string, it is prevented by the fixed boundary. The behavior of
this machine is similar to that of a one-way machine, but it allows the machine to process input from the left without losing
the ability to process infinitely in one direction.

Power of Semi-Infinite Tapes:

A semi-infinite Turing Machine is equivalent in computational power to a standard Turing Machine. The restriction on the
tape does not reduce its ability to compute the same class of languages (i.e., recursively enumerable languages). The key
difference is that the machine may not be able to move freely in both directions, which could influence the efficiency of
certain algorithms.

2. Multistack Machines
A Multistack Machine is a restricted form of Turing Machine where the tape is replaced by several stacks, each with a single
head that can move along the stack. These stacks operate as LIFO (Last In, First Out) data structures. In a k-stack machine,
there are k stacks and the machine can perform operations on any of them, such as pushing or popping symbols, in
addition to moving between the stacks.

Formal Definition:

A k-stack machine is defined as:

M = (Q, Σ, Γ, δ, q0, q_accept, q_reject)

where:

Q is the finite set of states,

Σ is the input alphabet (excluding the blank symbol),

Γ is the stack alphabet, used to populate the stacks,

δ is the transition function, which specifies how the machine manipulates the stacks, including pushing, popping, and state transitions,

q0 is the start state,

q_accept and q_reject are the accept and reject states.

The stacks are denoted S1, S2, …, Sk. In each move the machine may read an input symbol (or move without reading), examine the top symbol of every stack, and then, for each stack independently, pop its top symbol or push new symbols onto it.

Example:

A 2-stack machine can recognize the language {a^n b^n c^n ∣ n ≥ 0}, which no single-stack PDA can. While reading the a's it pushes a symbol onto the first stack for each a; while reading the b's it pushes a symbol onto the second stack for each b; while reading the c's it pops one symbol from each stack for each c. The machine accepts if the input has the shape a*b*c* and both stacks are empty when the input ends.

Power of Multistack Machines:

A 2-stack machine is equivalent to a Turing Machine in terms of computational power. This is because the two stacks can together simulate a tape: one stack holds the tape contents to the left of the head and the other holds the contents to the right, so the machine can recognize any recursively enumerable language. A single stack is not as powerful; a machine with one stack is essentially a pushdown automaton and recognizes exactly the context-free languages, a strict subset of the recursively enumerable languages.

3. Counter Machines
A Counter Machine is a restricted form of computation model that uses a finite number of counters instead of a tape. The
counters can hold non-negative integers, and the machine can increment, decrement, or test for zero on the counters.
Counter Machines are simpler than Turing Machines but are still useful for certain types of problems.

Formal Definition:

A counter machine is defined as:

M = (Q, Σ, δ, q0, q_accept, q_reject, C)

where:

Q is the finite set of states,

Σ is the input alphabet,

C is the finite set of counters, each of which holds a non-negative integer,

δ is the transition function, which specifies how the machine operates on the counters (increment, decrement, or test whether a counter is zero) based on the current state and input symbol,

q0 is the start state,

q_accept and q_reject are the accept and reject states.

The machine operates by reading an input, manipulating its counters based on the current state and input symbol, and
transitioning to new states. It may stop if it reaches the accept or reject states.

Example:

A counter machine with a single counter can recognize the language {a^n b^n ∣ n ≥ 0}. The finite control checks that the input has the shape a*b*; the machine increments the counter for each a and decrements it for each b, rejecting if a decrement would take the counter below zero. If the counter is exactly zero when the input ends, the string is accepted; otherwise it is rejected.
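
The following C sketch (our own illustration) mirrors this one-counter machine: a flag in the "finite control" enforces the a*b* shape, and a single integer plays the role of the counter.

#include <stdio.h>

/* One-counter recognizer for { a^n b^n | n >= 0 }: increment on 'a',
   decrement on 'b', accept iff the shape is a*b* and the counter ends at 0. */
int accepts_anbn(const char *w) {
    long counter = 0;
    int seen_b = 0;                        /* finite-control flag: have we left the a-block? */
    for (const char *p = w; *p; p++) {
        if (*p == 'a') {
            if (seen_b) return 0;          /* an 'a' after a 'b': wrong shape */
            counter++;
        } else if (*p == 'b') {
            seen_b = 1;
            if (counter == 0) return 0;    /* would decrement below zero: too many b's */
            counter--;
        } else {
            return 0;                      /* symbol outside the alphabet */
        }
    }
    return counter == 0;                   /* accept iff the count is back to zero */
}

int main(void) {
    printf("aabb -> %d\n", accepts_anbn("aabb"));   /* 1 */
    printf("aab  -> %d\n", accepts_anbn("aab"));    /* 0 */
    return 0;
}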

Power of Counter Machines:

A machine with a single counter is not as powerful as a Turing Machine: it can only recognize certain context-free languages that hinge on a single running count (a one-counter machine is essentially a PDA whose stack records nothing but a count). However, if the counter machine is extended to have two or more counters, it can simulate a Turing Machine; this classical result is due to Minsky, so multi-counter machines are Turing-complete.

4. Power of Counting Machines


A Counting Machine is a variant of the counter machine where the machine is allowed to perform computations on
counters with more powerful operations. It is often used to model certain types of computations involving counting and
checking properties related to numbers.

Formal Definition:

A counting machine is similar to a counter machine but may include additional operations such as:

Arithmetic operations: More complex manipulations of numbers (addition, subtraction, etc.).

Threshold operations: The machine can test whether the count reaches certain thresholds.

Power of Counting Machines:

A Counting Machine can recognize certain types of languages that are beyond the capabilities of basic counter machines, particularly in problems centred on arithmetic properties of counts, such as comparing counts against thresholds or determining whether a count is even or odd. These machines are useful in situations where the computation primarily involves counting occurrences or occurrences modulo certain numbers.

Counting Machines are more powerful than regular Counter Machines but still not as powerful as Turing Machines in terms
of general-purpose computation.

Summary
In this lecture, we explored various restricted Turing Machines:

1. Turing Machines with Semi-Infinite Tapes: These machines have a tape that is infinite in one direction, and we showed
that they are equivalent in computational power to standard Turing Machines.

2. Multistack Machines: Machines that use multiple stacks as the computational medium, and we showed that a 2-stack
machine is equivalent in power to a Turing Machine.

3. Counter Machines: Machines that use counters instead of tapes, with operations like increment, decrement, and zero-
check. We discussed their ability to recognize context-free languages and the limitations of a single-counter machine.

4. Counting Machines: A more powerful class of machines that deal with counting operations and thresholds. These
machines have applications in problems related to counting but are still less powerful than full Turing Machines.

These restricted models offer insights into computation within various constraints, helping us understand the relationship
between different computational models.

Lecture 36: Turing Machines and Computers


In this lecture, we explore the relationship between Turing Machines and real-world computers. We focus on the following
concepts:

1. Simulating a Turing Machine by a Computer

2. Simulating a Computer by a Turing Machine

3. Comparing the Running Times

We will discuss the theoretical and practical aspects of computation, comparing the abstract Turing Machine model with the
physical models of real-world computers. This discussion will involve formal descriptions, examples, and a focus on
computational complexity.

1. Simulating a Turing Machine by a Computer


A real-world computer is essentially a physical implementation of the Turing Machine model. Every task that a Turing
Machine can compute, a real-world computer can also compute, and vice versa. The key difference lies in the actual
implementation and physical limitations.

Formal Simulation:

To simulate a Turing Machine on a computer, we first need to encode the following elements:

Tape: A computer’s memory (e.g., an array or a dynamic list) can serve as the tape. The tape of a Turing Machine is
infinite in one direction, but the computer memory is finite. However, for any computation that a Turing Machine can
perform, a computer can simulate a tape of sufficient size.

Head: A Turing Machine's head can move left or right across the tape, performing read and write operations. On a
computer, this is represented by a pointer or an index that moves across the memory array.

States: The current state of the Turing Machine can be mapped to a variable in the computer’s memory.

Transition Function: The transition function δ of the Turing Machine is implemented as a set of conditional statements
(e.g., if-else or switch-case ).
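
Putting these pieces together, the following C sketch (our own illustration, not part of any particular system) shows the encoding: the tape is a character array, the head is an index, the state is an integer, and the transition function is a small lookup routine. The toy machine being simulated, which scans right, flips every bit, and accepts when it reaches the blank, is an assumption made purely for this demonstration.

#include <stdio.h>

/* Table-driven Turing Machine simulation. State 0 is the working state,
   state 1 means accept, state -1 means reject/undefined. '_' is the blank. */
#define BLANK '_'

typedef struct { int next; char write; int move; } Step;   /* move: +1 right, -1 left */

/* The transition function delta(state, symbol), written as a lookup routine. */
static Step delta(int q, char s) {
    if (q == 0 && s == '0')   return (Step){0, '1', +1};     /* flip 0 -> 1, move right */
    if (q == 0 && s == '1')   return (Step){0, '0', +1};     /* flip 1 -> 0, move right */
    if (q == 0 && s == BLANK) return (Step){1, BLANK, +1};   /* blank reached: accept */
    return (Step){-1, s, +1};                                /* anything else: reject */
}

int main(void) {
    char tape[64] = "0110_";          /* the input followed by a blank */
    int q = 0, head = 0;
    while (q == 0) {                  /* run until the machine accepts or rejects */
        Step st = delta(q, tape[head]);
        tape[head] = st.write;        /* write the new symbol */
        head += st.move;              /* move the head */
        q = st.next;                  /* enter the next state */
    }
    printf("final state %d, tape: %s\n", q, tape);   /* expect: final state 1, tape: 1001_ */
    return 0;
}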

Example:

Consider a Turing Machine that adds two binary numbers. The algorithm can be translated into a sequence of operations on
a computer:

1. Input Representation: The input numbers are stored in an array (acting as the tape).

2. Head Movement: The computer maintains an index or pointer to simulate the head’s movement across the tape.

3. State Transitions: The current state of the Turing Machine (such as whether it is in the "start", "add", or "halt" state) can
be tracked by a simple variable.

While the Turing Machine operates in a theoretical, infinite space, the computer's memory is finite. Nonetheless, with
sufficient memory and proper encoding, a computer can simulate any Turing Machine’s operation.

Simulation Example: Binary Addition:

A Turing Machine that adds two binary numbers a = 110 and b = 101 (both written in binary) might work as follows:

Start by writing the binary numbers on the tape.

Move the head to the rightmost digit of both numbers.

Compare the digits and write the sum, moving left if necessary.

On a computer, we simulate this process by implementing:

1. A memory array representing the two numbers.

2. A loop to simulate the head moving from right to left.

3. Conditional statements to perform bitwise addition (carrying over when necessary).

Thus, a computer can simulate a Turing Machine by mimicking these operations and managing memory accordingly.
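
A C sketch of this simulation (our own illustration) is shown below: the operands live in character arrays, an index moving right to left plays the role of the head, and the carry variable takes the place of the extra information the Turing Machine would keep in its finite control.

#include <stdio.h>
#include <string.h>

/* Binary addition simulated the way the Turing Machine above would do it:
   scan the digits from right to left, write the sum bit, remember the carry. */
void add_binary(const char *a, const char *b, char *out) {
    int la = strlen(a), lb = strlen(b);
    int n = (la > lb ? la : lb) + 1;              /* one extra cell for a final carry */
    int carry = 0;
    out[n] = '\0';
    for (int i = 0; i < n; i++) {                 /* the "head" moves right to left */
        int da = (i < la) ? a[la - 1 - i] - '0' : 0;
        int db = (i < lb) ? b[lb - 1 - i] - '0' : 0;
        int s = da + db + carry;
        out[n - 1 - i] = (char)('0' + (s & 1));   /* write the sum bit */
        carry = s >> 1;                           /* the carry acts like stored state */
    }
}

int main(void) {
    char result[64];
    add_binary("110", "101", result);
    printf("110 + 101 = %s\n", result);           /* prints 1011 (6 + 5 = 11) */
    return 0;
}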

2. Simulating a Computer by a Turing Machine


The reverse process is also possible: a Turing Machine can simulate a computer. The concept here is that a computer is
just a finite-state machine with a finite tape (memory), and it follows certain algorithms, which are essentially computations
that can be encoded into a Turing Machine.

Formal Simulation:

When simulating a real-world computer on a Turing Machine, we need to break down the computer’s operations into the
fundamental operations of the Turing Machine. These operations include:

Memory Access: The Turing Machine uses its tape to simulate the computer’s memory. A memory location can be
represented as a cell on the tape.

Arithmetic Operations: A Turing Machine can perform arithmetic by simulating addition, subtraction, and other
operations using its transition rules. For example, adding two numbers on a Turing Machine can be implemented by a
series of state transitions that simulate carrying over digits.

Control Flow: A Turing Machine can simulate a computer's program counter by transitioning between states in a
predefined sequence, based on input symbols.

The key insight here is that any computation performed by a computer can be reduced to a sequence of Turing Machine
operations.

Example:

Consider a simple computer program that increments a variable x by 1. The computer will perform:

1. Access memory location x.

2. Increment the value of x.

3. Store the result back into memory.

In a Turing Machine, we would:

1. Represent x on the tape, e.g., with binary values.

2. Use a transition function to simulate incrementing by adjusting the binary value.

3. Transition between states to represent the storing of the incremented value back into memory.

Equivalence:

Since a Turing Machine can simulate a computer, the class of problems solvable by a computer is equivalent to the class of
problems solvable by a Turing Machine. Both are capable of solving exactly the same problems (i.e., they can compute the
same set of functions).

3. Comparing the Running Times
In this section, we compare the running times of Turing Machines and real-world computers. While Turing Machines are
theoretically universal, real-world computers are constrained by physical factors like memory, processor speed, and
input/output operations.

Running Time in Turing Machines:

For a given Turing Machine, the running time of an algorithm is typically described in terms of the number of steps the
machine takes to complete the computation. This is usually measured by:

The number of transitions (steps) from one state to another.

The amount of tape used during the computation.

For a Turing Machine, the time complexity is often expressed as a function of the input size n. The time complexity is
denoted as O(f (n)), where f (n) is the number of steps taken relative to the input size.

Running Time in Computers:

In real-world computers, the running time of an algorithm is typically expressed in terms of:

The number of CPU cycles the algorithm takes.

The amount of memory used.

The I/O operations required.

While a real computer is a finite machine with bounded resources, any algorithm that runs on a Turing Machine can be simulated by a computer (given sufficient memory) with at most a polynomial difference in running time, though it may require more resources in practice.

Example: Comparing Times for Binary Addition:

In a Turing Machine simulation of binary addition, the number of steps is proportional to the number of bits in the
input. If the inputs have size n, the Turing Machine might require O(n) steps to add two binary numbers.

On a real computer, the addition operation is executed in constant time, assuming that the binary numbers are
represented in fixed-size memory cells. This is effectively O(1) for the addition, although memory operations could
make the total running time depend on the size of the input.

Turing Machine vs Computer Performance:

Turing Machines operate in a theoretical, infinite model, so the computation time is measured in terms of the number
of steps taken. While this is useful for understanding the inherent computational complexity, real-world computers face
practical limitations like memory size and processor speed.

A real-world computer executes operations based on its hardware architecture and is subject to finite memory and
processing power, meaning its actual performance can differ from that of a Turing Machine.

Summary
1. Simulating a Turing Machine by a Computer: A real-world computer can simulate a Turing Machine by encoding the
tape and transition functions within its memory, moving the head across the tape, and transitioning between states.

2. Simulating a Computer by a Turing Machine: A Turing Machine can simulate the operations of a computer by using its
tape for memory and simulating arithmetic and control flow with state transitions.

3. Comparing Running Times: While both Turing Machines and computers solve the same class of problems, their
running times can differ. Turing Machines are abstract and have no physical limitations, while real-world computers are
subject to memory, speed, and I/O constraints.

In conclusion, Turing Machines provide a theoretical model for computation that is equivalent in power to real-world
computers, but the practical efficiency and performance of these machines can vary. The study of running times in both
models helps us understand the complexities of computational problems and the limits of what can be computed.

Lecture 37: A Language that is not Recursively Enumerable


In this lecture, we explore languages that are not recursively enumerable (RE). Specifically, we focus on the
Diagonalization Language Ld and prove that it is not recursively enumerable by using diagonalization and the properties

of Turing Machines. This lecture covers the following topics:

1. Enumerating the Binary Strings

2. Codes for Turing Machines

3. The Diagonalization Language

4. Proof that Ld is not Recursively Enumerable


1. Enumerating the Binary Strings


We begin by examining the set of binary strings, which is denoted by Σ∗ where Σ = {0, 1}. The set of all binary strings is
countably infinite, and it is possible to enumerate them systematically. For example, we can list the strings in lexicographical
order:

0, 1, 00, 01, 10, 11, 000, 001, 010, 011, 100, 101, 110, 111, …

The process of enumerating binary strings involves assigning each string a unique natural number n in a sequence:

0 → 1st string
1 → 2nd string
00 → 3rd string
01 → 4th string
10 → 5th string
11 → 6th string
and so on.

This numbering provides a method to index the strings in Σ∗ , which is essential when we attempt to reason about the
computability of certain languages. Each binary string can be identified by its index, allowing us to enumerate all strings in a
sequential manner.
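A minimal sketch of such an enumeration is given below. The particular indexing, which starts the count at 1 with the string "0", mirrors the listing above and is only one of several equivalent conventions.

```python
# A minimal sketch: enumerate binary strings in order of increasing length,
# and lexicographically within each length, as in the listing above.
from itertools import count, product

def binary_strings():
    """Yield (index, string) pairs: (1, "0"), (2, "1"), (3, "00"), (4, "01"), ..."""
    index = count(1)
    for length in count(1):
        for bits in product("01", repeat=length):
            yield next(index), "".join(bits)

if __name__ == "__main__":
    gen = binary_strings()
    for _ in range(6):
        print(next(gen))   # (1, '0') (2, '1') (3, '00') (4, '01') (5, '10') (6, '11')
```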

2. Codes for Turing Machines


A Turing machine (TM) is formally defined by a 7-tuple:

M = (Q, Σ, Γ, δ, q0, qaccept, qreject)

where:

Q is the finite set of states,

Σ is the input alphabet,

Γ is the tape alphabet (which contains Σ and the blank symbol),

δ is the transition function,

q0 is the start state,

qaccept is the accept state,

qreject is the reject state.

Turing Machines are abstract machines, but for any given Turing Machine, we can represent it as a binary string. This string
acts as a code for the machine. Essentially, every Turing Machine can be encoded into a unique binary string, and thus, we
can also enumerate the set of all possible Turing Machines in the same way we enumerated binary strings.

A universal Turing machine can simulate any Turing Machine by reading its description encoded as a binary string. Thus,
for any Turing Machine M , we can associate it with a unique binary string wM that encodes the machine. The set of all

Turing Machine codes is also countable because it corresponds to the set of binary strings.

Now, let us consider the language of Turing Machines. We are interested in a specific language L, which may or may not
be recursively enumerable. Our goal is to identify whether certain languages are RE or not, using Turing Machines as the
fundamental model of computation.

3. The Diagonalization Language


We introduce the diagonalization language Ld, which is defined as follows:

Ld = { wi | wi ∉ L(Mi) }

where wi is the i-th binary string in the enumeration of all binary strings, and L(Mi) is the language recognized by the i-th Turing Machine Mi.

To explain this language, let's unpack its components:

wi is the i-th binary string in the enumeration of binary strings.

L(Mi) is the language of the i-th Turing Machine, i.e., the set of strings that the Turing Machine Mi accepts.

The language Ld consists of those strings wi such that wi is not accepted by the Turing Machine Mi. In other words, Ld contains exactly the "diagonal" positions at which a string does not appear in the language of its corresponding Turing Machine.

Motivation for Diagonalization:

Diagonalization is a technique used to construct languages that cannot be recognized by any Turing Machine. The idea is to exploit the enumeration of Turing Machines and their corresponding languages to create a language that differs from every L(Mi) on at least one string, namely the diagonal string wi.

By construction, Ld and L(Mi) disagree on the string wi for every i: wi belongs to Ld exactly when it does not belong to L(Mi). Consequently, Ld cannot equal L(Mi) for any i, and since every recursively enumerable language occurs as some L(Mi), Ld cannot be recursively enumerable. The following proof makes this precise.


4. Proof that Ld is not Recursively Enumerable

We now prove that the language Ld is not recursively enumerable. The proof uses diagonalization, showing that no Turing Machine can accept exactly the strings of Ld.

Assume for contradiction:

Suppose that Ld is recursively enumerable. This means there exists a Turing Machine Md that accepts exactly the strings in Ld. By definition, for each binary string wi, Md accepts wi if and only if wi ∉ L(Mi).

Constructing a contradiction:

Now, we reason as follows:

1. Enumerate all binary strings w1, w2, w3, …, and for each i, associate it with a Turing Machine Mi.

2. Since every Turing Machine appears in this enumeration, Md itself must be Mi for some index i; that is, L(Mi) = L(Md) = Ld.

Consider the diagonal string wi for this particular index i:

If wi ∈ Ld, then by the definition of Ld we have wi ∉ L(Mi) = Ld, a contradiction.

If wi ∉ Ld, then by the definition of Ld we have wi ∈ L(Mi) = Ld, again a contradiction.

In either case we reach a contradiction, so the assumption that some Turing Machine accepts exactly the strings in Ld must be false.

Thus, there is no Turing Machine that can accept exactly the strings in Ld, meaning that Ld is not recursively enumerable.

Conclusion
In summary, we have constructed a language Ld using diagonalization and proved that it is not recursively enumerable.

The key idea is that Ld is defined to be different from every language L(Mi) on at least one string, ensuring that it cannot be accepted by any Turing Machine. This result demonstrates the limitations of the class of recursively enumerable languages, highlighting the existence of languages that are fundamentally beyond the reach of Turing Machines.

Lecture 38: An Undecidable Problem that is Recursively Enumerable (RE)


In this lecture, we will discuss the properties of recursively enumerable (RE) languages, recursive languages, and their
complements, and how they lead to the conclusion that certain problems are undecidable. We will focus on the universal
language and prove that it is undecidable, despite being recursively enumerable. This lecture covers the following topics:

1. Recursive Languages

2. Complements of Recursive and RE Languages

3. The Universal Language

4. Undecidability of the Universal Language

1. Recursive Languages

A recursive language is a language that can be decided by a Turing Machine. In other words, there exists a Turing Machine
M that halts for every input and correctly decides whether any given string belongs to the language or not. More formally:

L is recursive if ∃M such that M halts on every input and M accepts strings in L and rejects strings not in L.

For example, the set of even-length strings over a binary alphabet Σ = {0, 1} is a recursive language because we can
construct a Turing Machine that counts the length of the string and halts, accepting if the length is even and rejecting if the
length is odd.

The class of recursive languages is decidable because a Turing Machine can determine membership in the language by
halting for all inputs. This implies that every recursive language is also a decidable problem.

2. Complements of Recursive and RE Languages


Before delving into the universal language, it’s essential to understand how the complement operation behaves for
recursive languages and recursively enumerable languages (RE).

Complement of Recursive Languages:

If L is a recursive language, its complement L̄ is also recursive. This is because if a Turing Machine can decide membership in L, we can construct another Turing Machine that decides membership in L̄ by reversing the accept and reject states of the original machine.

Complement of RE Languages:

If L is recursively enumerable (RE), L̄ is not necessarily RE. This means that while we may have a Turing Machine that can enumerate the strings in L, there is no guarantee that we can enumerate the strings in L̄, or that there exists a Turing Machine that halts for all inputs when checking membership in L.

3. The Universal Language


The universal language LU is the set of all pairs consisting of a Turing Machine and an input string that the machine accepts. It can be formally defined as:

LU = {⟨M, w⟩ ∣ M is a Turing Machine and M accepts the input w}

where ⟨M, w⟩ represents a pair consisting of a Turing Machine M (given by its binary code) and a string w.

In essence, LU is the language of all Turing Machine computations: it consists of all pairs of Turing Machines and strings

such that the Turing Machine accepts the string. This language is recursively enumerable because we can construct a
Turing Machine that, given a pair ⟨M , w⟩, simulates M on input w and accepts if M accepts w .

However, while LU is RE, it is not recursive. This is because it’s related to the halting problem, which is undecidable. We can

summarize the properties of LU as follows:

LU is RE (a universal Turing Machine accepts ⟨M, w⟩ by simulating M on w).

LU is not recursive because we cannot decide, for every ⟨M, w⟩, whether M accepts w; this is closely tied to the halting problem, which is undecidable.

4. Undecidability of the Universal Language


We now prove that the universal language LU is undecidable.

Proof that LU is Undecidable:


Suppose, for contradiction, that there exists a Turing Machine MU that decides LU . This means that MU takes an input
​ ​ ​

⟨M , w⟩, where M is a Turing Machine and w is a string, and halts with:


Accept if M accepts w .

Reject if M does not accept w (i.e., M rejects w or does not halt).

Now, let’s use MU to decide the halting problem, which is known to be undecidable. Consider the following procedure:

1. Input: A pair ⟨M , w⟩, where M is a Turing Machine and w is a string.

2. Construct a new Turing Machine M′ that behaves as follows:

On input x, M′ ignores x and simulates M on w.

If the simulation ends (i.e., M halts on w), M′ accepts x; otherwise M′ runs forever and never accepts.

3. Now, apply MU to ⟨M ′ , x⟩. There are two possible cases:


If M halts on w , then M ′ will accept all inputs, including x, and MU will accept ⟨M ′ , x⟩.

If M does not halt on w , then M ′ will never halt on any input, and MU will reject ⟨M ′ , x⟩.

Thus, using MU to decide whether M ′ accepts x is equivalent to solving the halting problem for M on w . Since the halting

problem is undecidable, it follows that LU cannot be decided by any Turing Machine.


Hence, the universal language LU is undecidable.


Conclusion
In this lecture, we have explored the concept of recursive languages, recursively enumerable (RE) languages, and their
complements. We examined the universal language, which is RE but not recursive, and proved that it is undecidable. The
key takeaway is that while some problems can be recognized by a Turing Machine (i.e., they are RE), they may still be
undecidable, meaning that no Turing Machine can decide them for all inputs. The universal language LU serves as a prime

example of such a language.

Lecture 39: Undecidable Problems about Turing Machines


In this lecture, we will explore several undecidable problems related to Turing Machines (TMs). We will discuss the technique
of reductions, problems concerning Turing Machines that accept the empty language, the implications of Rice's
Theorem, and various problems related to Turing-Machine specifications. Each topic will be covered with formal
definitions, proofs, and examples to illustrate the undecidability of these problems.

Outline:
1. Reductions

2. Turing Machines that Accept the Empty Language

3. Rice's Theorem and the Properties of the RE Languages

4. Problems about Turing-Machine Specifications

1. Reductions

A reduction is a technique used to prove undecidability by showing that if we could decide one problem, we could use that
solution to decide another problem that is known to be undecidable. This is one of the most powerful methods in
computability theory and complexity theory.

Definition of Reductions:

If we want to show that a problem P1 is undecidable, we reduce an already-known undecidable problem P2 to P1 . The idea
​ ​ ​

is that if we could solve P1 , we could also solve P2 , but since P2 is undecidable, P1 must also be undecidable.
​ ​ ​ ​

A many-one reduction from a problem P2 to a problem P1 is a function f such that for any input x of P2 :
​ ​ ​

x is a "yes" instance of P2 if and only if f (x) is a "yes" instance of P1 .


​ ​

Example: Halting Problem Reduction to Emptiness Problem

Suppose we want to prove that the emptiness problem is undecidable. We will reduce the Halting Problem to the
emptiness problem.

The Halting Problem asks whether a given Turing Machine M halts on an input w .

The Emptiness Problem asks whether a given Turing Machine M ′ accepts no strings, i.e., the language L(M ′ ) is
empty.

To reduce the halting problem to the emptiness problem, we do the following:

Given an instance of the Halting Problem, which is ⟨M , w⟩, construct a new Turing Machine M ′ such that:

M ′ behaves as follows:

On input x, if x = w, then M ′ simulates M on w. If M halts, M ′ accepts x; if M does not halt, M ′ never


halts.

On all other inputs x ≠ w, M′ rejects x.

Now, solve the emptiness problem for M′:

If M halts on w , M ′ will accept w (hence L(M ′ ) is not empty).

If M does not halt on w , M ′ will accept nothing (hence L(M ′ ) is empty).

By checking whether L(M ′ ) is empty, we can determine whether M halts on w . Therefore, since the Halting Problem is
undecidable, the Emptiness Problem is also undecidable.
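The mapping ⟨M, w⟩ ↦ M′ used above can be pictured with the following Python sketch. It is only a conceptual illustration, in which a "machine" is modeled as a Python function that either returns or loops forever; it is not something we could actually use to decide emptiness.

```python
# A conceptual sketch (not the lecture's formal construction): a "machine" is
# modeled as a Python function that either returns (halts) or loops forever.

def make_M_prime(M, w):
    """Return the machine M' built from <M, w> in the reduction."""
    def M_prime(x):
        if x == w:
            M(w)                # runs forever if M does not halt on w
            return "accept"     # reached only if M halts on w
        return "reject"         # every other input is rejected
    return M_prime

if __name__ == "__main__":
    halting_machine = lambda inp: None            # halts on every input
    M_prime = make_M_prime(halting_machine, "w0")
    print(M_prime("w0"), M_prime("other"))        # accept reject -> L(M') is non-empty
```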

2. Turing Machines that Accept the Empty Language


One common undecidable problem is determining whether a given Turing Machine M accepts the empty language. In
other words, we want to decide if L(M ) = ∅, meaning that the machine never accepts any input string.

Formal Statement of the Problem:

Given a Turing Machine M , decide whether L(M ) = ∅, i.e., whether M accepts no strings.

Proof of Undecidability:

To prove that this problem is undecidable, we can reduce the Halting Problem to this problem. We know that the Halting
Problem is undecidable, so we will use it to show that determining whether a Turing Machine accepts the empty language is
also undecidable.

Given an instance of the Halting Problem ⟨M , w⟩, construct a new Turing Machine M ′ such that:

M ′ behaves as follows:

On input x, M ′ first simulates M on w .

If M halts on w , M ′ accepts x.

If M does not halt on w, the simulation never ends, so M′ never accepts any input.

Now, consider L(M′):

If M halts on w, M′ accepts all inputs, so L(M′) ≠ ∅.

If M does not halt on w, M′ accepts no strings, so L(M′) = ∅.

By checking whether L(M′) = ∅, we could therefore determine whether M halts on w, which is undecidable. Thus, the problem of determining whether M accepts the empty language is undecidable.

3. Rice's Theorem and the Properties of the RE Languages


Rice's Theorem is a general result that applies to any non-trivial property of the language accepted by a Turing Machine. It
states that any non-trivial property of the language of a Turing Machine is undecidable.

Formal Statement of Rice's Theorem:

Let P be a property of languages. Then the problem of deciding whether a given Turing Machine M accepts a language
L(M ) that has property P is undecidable if P is a non-trivial property.
Non-trivial property means that there exists some Turing Machine whose language has property P , and some Turing
Machine whose language does not have property P .

Example: The Property of Being Non-Empty

Consider the problem of determining whether a Turing Machine M accepts a non-empty language, i.e., L(M) ≠ ∅. This is a non-trivial property because:

There exists a Turing Machine that accepts the empty language (e.g., one that rejects all inputs).

There exists a Turing Machine that accepts a non-empty language (e.g., one that accepts a specific string).

According to Rice’s Theorem, this problem is undecidable because it is a non-trivial property of the language of the Turing
Machine.

4. Problems about Turing-Machine Specifications


Another class of undecidable problems involves the specifications of Turing Machines. A Turing Machine specification can
be thought of as a description of the machine's behavior and its input-output mapping. Some examples of undecidable
problems in this category are:

1. The Halting Problem: Determining whether a given Turing Machine M halts on a given input w is undecidable.

2. The Equivalence Problem: Given two Turing Machines M1 and M2, determining whether L(M1) = L(M2) is undecidable.

3. The Universality Problem: Given a Turing Machine M , determining whether L(M ) = Σ∗ , i.e., whether M accepts all
possible strings, is undecidable.

Proof of Undecidability (Equivalence Problem):

The equivalence problem asks whether two Turing Machines M1 and M2 recognize the same language, i.e., L(M1) = L(M2). We can prove this is undecidable by reducing the Halting Problem to it:

Given an instance ⟨M, w⟩ of the Halting Problem, construct two Turing Machines M1 and M2 as follows:

M1, on any input x, ignores x, simulates M on w, and accepts x if the simulation halts (so L(M1) = Σ∗ if M halts on w, and L(M1) = ∅ otherwise).

M2 accepts everything (i.e., L(M2) = Σ∗).

If M1 and M2 are equivalent, then M must halt on w. If they are not equivalent, then M does not halt on w.
​ ​

Thus, by checking whether M1 and M2 are equivalent, we can decide the Halting Problem, which is undecidable. Therefore,
​ ​

the equivalence problem is also undecidable.

Conclusion
In this lecture, we explored several important undecidable problems related to Turing Machines. We used reductions to
prove undecidability, explored the problem of determining whether a Turing Machine accepts the empty language, and
discussed Rice's Theorem as a tool for understanding undecidability in terms of properties of languages. We also covered
various undecidable problems related to Turing-Machine specifications. These results underscore the inherent limitations
of computation and the impossibility of deciding certain properties of Turing Machines.

Lecture 40: Post's Correspondence Problem


In this lecture, we will explore Post's Correspondence Problem (PCP), a well-known undecidable problem in computability
theory. We will define the problem, discuss its modified version, and complete a detailed proof of its undecidability.

Outline:
1. Definition of Post's Correspondence Problem (PCP)

2. The Modified PCP

3. Completion of the Proof of PCP Undecidability

1. Definition of Post's Correspondence Problem (PCP)


The Post's Correspondence Problem (PCP) is a decision problem related to finding sequences of strings from a given set
that can be concatenated to form the same string when placed side by side. Specifically, it involves matching two sets of
strings under certain conditions.

Formal Definition of PCP:

Given two finite sets of strings:

{w1, w2, …, wn} (Set 1)

{v1, v2, …, vn} (Set 2)

The task is to determine whether there exists a sequence of indices i1, i2, …, ik such that:

wi1 wi2 … wik = vi1 vi2 … vik

In other words, the strings from the first set {w1, w2, …, wn} can be concatenated to form a string that is exactly the same as the string formed by concatenating the strings from the second set {v1, v2, …, vn}, using the same indices in both sequences.

Example 1:

Consider the following sets:

W = {w1 = "a", w2 = "ab"}

V = {v1 = "aa", v2 = "b"}

We want to find a sequence of indices i1, i2, …, ik such that:

wi1 wi2 … wik = vi1 vi2 … vik

One possible solution is:

i1 = 1, i2 = 2

Thus, concatenating w1, w2 gives:

w1 w2 = "a" "ab" = "aab"

and concatenating v1, v2 gives:

v1 v2 = "aa" "b" = "aab"

Since the two concatenated strings are identical, this is a solution to the problem.
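A minimal brute-force search for solutions of a PCP instance is sketched below. Because PCP is undecidable, no such search can be complete; it only explores index sequences up to a chosen length bound, which is enough to find the solution of this small example.

```python
# A minimal sketch: brute-force search for a PCP solution over index sequences
# of bounded length (a semi-decision aid only; PCP itself is undecidable).
from itertools import product

def pcp_solution(W, V, max_len=6):
    """Return an index sequence (1-based) solving the PCP instance, or None."""
    n = len(W)
    for k in range(1, max_len + 1):
        for seq in product(range(n), repeat=k):
            top = "".join(W[i] for i in seq)
            bottom = "".join(V[i] for i in seq)
            if top == bottom:
                return [i + 1 for i in seq]
    return None

if __name__ == "__main__":
    W = ["a", "ab"]
    V = ["aa", "b"]
    print(pcp_solution(W, V))   # [1, 2] -- "a"+"ab" == "aa"+"b" == "aab"
```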

2. The Modified PCP


The Modified PCP (MPCP) is the same matching problem with one additional constraint on what counts as a valid solution: the chosen sequence of indices must begin with the first pair. This restricted version is the one into which Turing-Machine computations are most naturally encoded, and it can in turn be reduced to ordinary PCP.

Modified Definition:

In the modified version of PCP, we are given two lists of strings of the same length n:

W = {w1, w2, …, wn} (Set 1)

V = {v1, v2, …, vn} (Set 2)

We are asked to find whether there exists a sequence of indices i1, i2, …, ik with i1 = 1 such that:

wi1 wi2 … wik = vi1 vi2 … vik

That is, a solution of the MPCP is an ordinary PCP solution that is forced to start with the designated first pair (w1, v1).

Example 2:

Let:

W = {w1 = "aa", w2 = "bb"}

V = {v1 = "ab", v2 = "ba"}

We are tasked with determining whether there is a sequence of indices, beginning with index 1, such that the concatenation of strings from W matches the concatenation of strings from V. In this case, no such sequence exists: any candidate must begin with w1 = "aa" on top and v1 = "ab" on the bottom, and these two strings already disagree at their second symbol, so no continuation can ever make the two concatenations equal. (In fact, this instance has no ordinary PCP solution either.)

3. Completion of the Proof of PCP Undecidability


The undecidability of PCP is a fundamental result in computability theory. We now sketch the proof that PCP is undecidable. The proof is a reduction from an undecidable problem about Turing Machines (whether M accepts a given input w), and it proceeds in two steps, through the Modified PCP.

Reduction from the Halting/Acceptance Problem (sketch):

Step 1: From Turing Machines to MPCP. Given an instance ⟨M, w⟩, we construct an MPCP instance whose pairs are designed so that any partial solution spells out, on both strings, a sequence of configurations of M on input w:

The designated first pair writes the initial configuration of M on w onto the bottom string, with the top string lagging one configuration behind.

For each transition rule of M there are pairs that rewrite one configuration into the next, and further pairs copy the tape symbols that the head does not touch.

Special pairs involving the accepting state allow the top string to "catch up" with the bottom string, but only after an accepting configuration has appeared.

Key Idea:

If M accepts w, an accepting computation of M on w yields a sequence of indices on which the two concatenations become equal, so the MPCP instance has a solution.

If M does not accept w, no accepting configuration ever appears, the top string can never catch up with the bottom string, and no solution exists.

Step 2: From MPCP to PCP. A standard transformation (inserting separator symbols into the strings of each pair) converts the MPCP instance into an ordinary PCP instance that has a solution if and only if the MPCP instance does.

Thus, a decision procedure for PCP would yield a decision procedure for whether M accepts w. Since that problem is undecidable, this reduction implies that the PCP is also undecidable.

Conclusion
In this lecture, we introduced Post's Correspondence Problem (PCP), defined it formally, and discussed a modified version
of the problem. We then proved the undecidability of PCP through a reduction from the Halting Problem. The
undecidability of PCP highlights the complexity and limitations of computational problems, underscoring the challenges in
decision-making for Turing Machines and their behavior.

Lecture 41: Other Undecidable Problems


In this lecture, we will discuss several undecidable problems related to formal languages and computational theory. These
problems involve questions about programs, context-free grammars (CFGs), and certain languages. We will provide detailed
formulations and proofs for each of the problems discussed.

Outline:
1. Problems about Programs

2. Undecidability of Ambiguity for Context-Free Grammars (CFGs)

3. The Complement of a List Language

1. Problems about Programs


There are many undecidable problems concerning the behavior of programs. One of the most famous undecidable
problems is the Halting Problem, which is concerned with determining whether a given program will halt on a specific
input. However, we will examine a more general class of problems regarding the behavior of programs.

Problem: The Problem of Determining Whether a Program Does Not Halt

Given a program P and an input w , the task is to determine whether P does not halt when run on input w . This problem is
known as the Non-Halting Problem, and it is undecidable.

Proof of Undecidability:

To prove the undecidability of the Non-Halting Problem, we use the Halting Problem. Recall that the Halting Problem is defined as follows: given a program P and an input w, determine if P halts on input w.

The Non-Halting Problem asks if a program P does not halt on input w. It is exactly the complement of the Halting Problem, so a decider for one immediately yields a decider for the other:

Suppose we had an algorithm (a program that always halts) deciding the Non-Halting Problem.

Given an instance ⟨P, w⟩ of the Halting Problem, run this algorithm on ⟨P, w⟩ and flip its answer: if it reports that P does not halt on w, answer "no" to the Halting Problem; if it reports that P does halt on w, answer "yes".

Thus, a decider for the Non-Halting Problem would give a decider for the Halting Problem. Since the Halting Problem is undecidable, the Non-Halting Problem is also undecidable.

2. Undecidability of Ambiguity for Context-Free Grammars (CFGs)


An important question in the theory of context-free grammars (CFGs) is whether a given CFG is ambiguous. A CFG is said to
be ambiguous if there exists a string in the language of the grammar that has more than one leftmost derivation (or
equivalently, more than one parse tree).

Problem: Determining Whether a CFG is Ambiguous

Given a context-free grammar G, the task is to determine whether some string generated by G has more than one leftmost derivation (or equivalently, more than one parse tree) in G. This problem is known as the Ambiguity Problem for CFGs.

Proof of Undecidability:

We will prove the undecidability of the Ambiguity Problem by reducing the Post Correspondence Problem (PCP) to it. Recall that PCP is undecidable, so if we can reduce PCP to the Ambiguity Problem, we will show that the Ambiguity Problem is undecidable.

We will construct a context-free grammar G that is ambiguous if and only if the PCP instance has a solution.

1. PCP Instance:

Let W = {w1, w2, …, wn} and V = {v1, v2, …, vn} be two sets of strings, as defined in the PCP.

The task is to find whether there exists a sequence of indices i1, i2, …, ik such that:

wi1 wi2 … wik = vi1 vi2 … vik

2. Grammar Construction: Introduce n new "index" symbols a1, a2, …, an, one for each pair, and let G have start symbol S and the productions:

S → A ∣ B

A → wi A ai ∣ wi ai (for each i = 1, …, n)

B → vi B ai ∣ vi ai (for each i = 1, …, n)

The variable A derives exactly the strings wi1 … wik aik … ai1, and B derives exactly the strings vi1 … vik aik … ai1; in both cases the trailing index symbols record, in reverse, which pairs were used, so A and B are each unambiguous on their own.

3. Ambiguity: If the PCP instance has a solution i1, …, ik, then the single string wi1 … wik aik … ai1 = vi1 … vik aik … ai1 has two distinct leftmost derivations in G, one beginning with S → A and one beginning with S → B, so G is ambiguous. Conversely, if some string of G has two distinct leftmost derivations, they must begin with S → A and S → B respectively, and the shared trailing index symbols spell out a PCP solution.

By solving the Ambiguity Problem for this grammar G, we can solve the PCP. Since PCP is undecidable, the Ambiguity Problem for CFGs is also undecidable.

3. The Complement of a List Language


In the context of PCP, each list of strings gives rise to a list language. For a list A = (w1, w2, …, wn) over an alphabet Σ, introduce n new index symbols a1, a2, …, an and define the list language

LA = { wi1 wi2 … wik aik … ai2 ai1 ∣ k ≥ 1 }

that is, a concatenation of strings chosen from the list, followed by the chosen indices written in reverse order. The complement of a list language is the set of all strings over the enlarged alphabet that do not belong to it.

Problem: The Complement of a List Language

A key fact is that the complement of a list language is always a context-free language: a pushdown automaton can check, symbol by symbol, whether its input fails to have the required form or fails to match the strings dictated by the trailing index symbols. The undecidable questions concern these complements in combination. For example: given two PCP lists A and B, is the union of the complements of their list languages equal to Σ∗ (over the enlarged alphabet)?

Proof of Undecidability:

We will prove that this problem is undecidable by reducing from PCP.

1. PCP: Given a PCP instance with lists A = (w1, …, wn) and B = (v1, …, vn), form the list languages LA and LB as above, using the same index symbols for both lists.

2. Key observation: A string belongs to both LA and LB exactly when its trailing index symbols name a sequence i1, …, ik with wi1 … wik = vi1 … vik. Hence LA ∩ LB ≠ ∅ if and only if the PCP instance has a solution.

3. Reduction: The union of the complements of LA and LB equals Σ∗ if and only if LA ∩ LB = ∅, i.e., if and only if the PCP instance has no solution. Since both complements are context-free and grammars for them (and for their union) can be constructed effectively from the PCP instance, an algorithm for the question above would decide PCP.

Therefore, determining whether the union of the complements of two list languages is Σ∗ (and, more generally, related questions about complements of list languages, such as whether a given CFG generates all of Σ∗) is undecidable.

Conclusion
In this lecture, we discussed three important undecidable problems:

1. Problems about Programs: We demonstrated the undecidability of the Non-Halting Problem by reducing it from the
Halting Problem.

2. Undecidability of Ambiguity for Context-Free Grammars (CFGs): We showed the undecidability of the Ambiguity
Problem by reducing from the Post Correspondence Problem (PCP).

3. The Complement of a List Language: We introduced the list languages associated with a PCP instance, noted that their complements are context-free, and proved the undecidability of questions about these complements (such as whether the union of the two complements is Σ∗) by reducing from PCP.

These problems highlight the inherent limitations in computational theory and underscore the challenges in determining
properties of programs and formal languages.

Lecture 42: The Classes P and NP


In this lecture, we will explore the classes P and NP , two central complexity classes in computational theory. We will
provide formal definitions, discuss problems that belong to each class, and introduce the concept of NP-completeness.

Outline:
1. Problems Solvable in Polynomial Time (Class P)

2. An Example: Kruskal’s Algorithm

3. Non-Deterministic Polynomial Time (Class NP)

4. An NP Example: The Traveling Salesman Problem

5. Polynomial Time Reductions

6. NP-Complete Problems (Introductory Definition)

1. Problems Solvable in Polynomial Time (Class P)


Class P consists of decision problems (problems with a yes/no answer) that can be solved by a deterministic Turing machine
in polynomial time. A problem is said to belong to P if there exists an algorithm that can solve the problem in O(nk ) time
for some constant k , where n is the size of the input.

Formal Definition of Class P:

A problem A is in P if there exists a deterministic Turing machine M such that for all inputs w of size n, M (w) halts and
gives the correct answer (yes or no) in O(nk ) time for some constant k .

2. An Example: Kruskal's Algorithm


Kruskal's Algorithm is a greedy algorithm used to find the minimum spanning tree (MST) of a connected, undirected graph.
The MST is a tree that connects all the vertices of the graph with the minimum possible total edge weight.

Problem Definition:

Given a graph G = (V , E), where V is the set of vertices and E is the set of edges with associated weights, the task is to
find the minimum spanning tree of the graph.

Steps of Kruskal's Algorithm:

1. Sort all edges in the graph by their weights.

2. Initialize a forest where each vertex is its own disjoint tree.

3. For each edge in the sorted list:

If the edge connects two vertices in different trees, add it to the MST.

Otherwise, discard the edge.

4. Repeat until there are V − 1 edges in the MST.

Correctness and Time Complexity:

The algorithm ensures the inclusion of edges with the smallest weight that do not form a cycle, which guarantees that
the MST is formed.

Sorting the edges takes O(E log E) time, and the union-find operations (for detecting cycles) take O(α(V)) amortized time each, where α is the inverse Ackermann function, for O(E α(V)) in total. Therefore, the overall time complexity of Kruskal's algorithm is dominated by O(E log E), which is polynomial in the size of the input.

Since Kruskal’s algorithm runs in polynomial time, it is an example of a problem solvable in class P .
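The following Python sketch follows the steps above. It is a minimal illustration, assuming an edge-list representation of the graph and using a simple union-find structure; neither detail is fixed by the lecture.

```python
# A minimal sketch of Kruskal's algorithm (edge-list graph representation assumed).

def kruskal_mst(num_vertices, edges):
    """edges: list of (weight, u, v) with 0 <= u, v < num_vertices.
    Returns the list of edges in a minimum spanning tree."""
    parent = list(range(num_vertices))

    def find(x):                              # root of x's tree, with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst = []
    for weight, u, v in sorted(edges):        # Step 1: edges by increasing weight
        ru, rv = find(u), find(v)
        if ru != rv:                          # Step 3: different trees -> no cycle
            parent[ru] = rv                   # union the two trees
            mst.append((weight, u, v))
        if len(mst) == num_vertices - 1:      # Step 4: stop when the tree is complete
            break
    return mst

if __name__ == "__main__":
    edges = [(1, 0, 1), (4, 0, 2), (3, 1, 2), (2, 1, 3), (5, 2, 3)]
    print(kruskal_mst(4, edges))   # [(1, 0, 1), (2, 1, 3), (3, 1, 2)]
```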

3. Non-Deterministic Polynomial Time (Class NP)

Class NP consists of decision problems for which a proposed solution can be verified in polynomial time by a deterministic
Turing machine. More formally, a problem is in NP if, for every "yes" instance of the problem, there exists a certificate (a
proposed solution) that can be verified in polynomial time.

Formal Definition of Class NP:

A problem A is in NP if there exists a polynomial-time verifier M (equivalently, a nondeterministic polynomial-time Turing machine) such that for every input w of size n:

If the answer is "yes," then there exists a certificate c (of size polynomial in n) such that M (w, c) accepts in polynomial
time.

If the answer is "no," no such certificate exists.

In simpler terms, if the solution exists, a verifier can check it in polynomial time.

4. An NP Example: The Traveling Salesman Problem (TSP)


The Traveling Salesman Problem (TSP) is a classic optimization problem. The task is to find the shortest possible route that
visits each city exactly once and returns to the origin city. In decision form, the problem is typically stated as: "Is there a tour
of length less than or equal to a given number k ?"

Problem Definition:

Given a set of cities and the distances between each pair of cities, is there a tour that visits every city exactly once and
returns to the starting city such that the total length of the tour is at most k ?

Non-Deterministic Polynomial Time Verification:

To verify a proposed solution (a sequence of cities representing the tour), we can check:

1. Whether the tour visits each city exactly once.

2. Whether the total length of the tour is less than or equal to k .

Both of these checks can be done in polynomial time, so the TSP decision problem is in NP .
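A minimal sketch of such a verifier is shown below, assuming the instance is given as a distance matrix; this concrete representation is an assumption made for illustration.

```python
# A minimal sketch of the polynomial-time verifier for the TSP decision problem.

def verify_tour(dist, tour, k):
    """dist: n x n matrix of distances; tour: proposed order of the n cities;
    k: the bound. Returns True iff 'tour' certifies a 'yes' answer."""
    n = len(dist)
    if sorted(tour) != list(range(n)):             # check 1: each city exactly once
        return False
    total = sum(dist[tour[i]][tour[(i + 1) % n]]   # wrap around to return to start
                for i in range(n))
    return total <= k                              # check 2: total length at most k

if __name__ == "__main__":
    dist = [[0, 2, 9, 10],
            [2, 0, 6, 4],
            [9, 6, 0, 3],
            [10, 4, 3, 0]]
    print(verify_tour(dist, [0, 1, 3, 2], 21))  # True: 2 + 4 + 3 + 9 = 18 <= 21
```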

Time Complexity of TSP:

No polynomial-time algorithm is known for solving TSP in its optimization form (finding the shortest tour): naive enumeration of tours takes factorial time O(n!), and even the best known exact algorithms require exponential time. TSP is NP-hard, but as a decision problem (whether there exists a tour of length at most k), it is in NP.

5. Polynomial Time Reductions


A polynomial-time reduction is a way to transform one problem into another in polynomial time. If problem A can be
reduced to problem B in polynomial time, and if problem B is solvable in polynomial time, then problem A is also solvable
in polynomial time.

Formal Definition of Polynomial-Time Reduction:

A problem A is polynomial-time reducible to a problem B (denoted A ≤p B ) if there exists a polynomial-time algorithm


that transforms any instance of problem A into an instance of problem B such that the answer to the transformed instance
of B is the same as the answer to the original instance of A.

Polynomial-time reductions are often used to show that one problem is at least as hard as another, and they play a key role
in the theory of NP-completeness.

6. NP-Complete Problems (Introductory Definition)
A problem is said to be NP-complete if it satisfies two conditions:

1. It is in NP , i.e., a solution to the problem can be verified in polynomial time.

2. Every other problem in NP can be reduced to it in polynomial time. In other words, if a polynomial-time algorithm
exists for an NP-complete problem, then every problem in NP can also be solved in polynomial time.

Formal Definition of NP-Completeness:

A decision problem A is NP-complete if:

1. A ∈ NP .
2. For every problem B ∈ NP , B ≤p A (i.e., B can be reduced to A in polynomial time).

The concept of NP-completeness was introduced by Stephen Cook in 1971 with Cook’s Theorem, which showed that the
Boolean satisfiability problem (SAT) is NP-complete. Many other problems, such as the Traveling Salesman Problem, the
Knapsack Problem, and the Hamiltonian Cycle Problem, have been shown to be NP-complete.

Conclusion
In this lecture, we covered the following topics related to the complexity classes P and NP :

1. Class P: Problems solvable in polynomial time. Example: Kruskal's Algorithm for finding a minimum spanning tree.

2. Class NP: Problems where a proposed solution can be verified in polynomial time. Example: The Traveling Salesman
Problem.

3. Polynomial Time Reductions: The concept of transforming one problem into another in polynomial time.

4. NP-Complete Problems: Problems that are both in NP and as hard as any other problem in NP.

The study of P and NP is foundational in understanding the computational complexity of problems and has profound
implications for fields such as cryptography, optimization, and algorithm design. The central question of whether P = NP
remains one of the most important unsolved problems in computer science.


Lecture 43: An NP-Complete Problem – The Satisfiability Problem


In this lecture, we will study the Satisfiability Problem (SAT), which was the first problem to be proven NP-complete by
Stephen Cook in his famous 1971 paper. We will break down the SAT problem, examine how SAT instances are represented,
and conclude with a formal proof of Cook’s Theorem, which shows that SAT is NP-complete.

Outline:
1. The Satisfiability Problem (SAT)

2. Representing SAT Instances

3. NP-Completeness of SAT

4. Proof of Cook's Theorem (Formal Proof)

1. The Satisfiability Problem (SAT)


The Satisfiability Problem (SAT) is the problem of determining whether there exists an assignment of truth values (True or
False) to the variables of a Boolean formula such that the formula evaluates to True.

Definition of SAT:

A Boolean formula is a logical expression involving variables, logical operators (AND, OR, NOT), and constants (True and
False). The variables in the formula can be assigned truth values (True or False), and the goal is to determine if there is an
assignment of these truth values that makes the formula evaluate to True.

A formula is in conjunctive normal form (CNF) if it is a conjunction (AND) of clauses, where each clause is a disjunction (OR)
of literals, and a literal is either a variable or its negation.

SAT Problem: Given a Boolean formula F in CNF, determine whether there exists a truth assignment to the variables of F
that makes F evaluate to True.

Example:
Consider the formula:

F = (x1 ∨ ¬x2) ∧ (x2 ∨ x3) ∧ (¬x3 ∨ x1)

We want to determine if there exists a truth assignment to x1, x2, x3 that makes this formula True. (For instance, setting x1 = True, x2 = True, x3 = True satisfies all three clauses.)

2. Representing SAT Instances


A SAT instance is a Boolean formula expressed in CNF. To represent SAT instances efficiently, we use the following format:

Variables: Denoted as x1 , x2 , … , xn , where each variable can take a truth value (True or False).
​ ​ ​

Clauses: Each clause is a disjunction (OR) of literals. A literal is a variable xi or its negation ¬xi .
​ ​

Formula: A conjunction (AND) of such clauses. For example, the SAT instance F = (x1 ∨ ¬x2) ∧ (x2 ∨ x3) ∧ (¬x3 ∨ x1) is a CNF formula with three clauses, each of which is a disjunction of literals.

To check whether an assignment of truth values exists that satisfies the formula, one needs, in the worst case, to evaluate the formula for every possible assignment of the variables. Since the number of variables is n, there are 2^n possible truth assignments, which can be computationally expensive as n increases.
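A minimal sketch of this representation and of the two procedures just described is given below, assuming clauses are encoded as lists of signed integers (a common convention, not something fixed by the lecture): a polynomial-time check of a proposed assignment, and an exponential brute-force search over all 2^n assignments.

```python
# A minimal sketch (the signed-integer clause encoding is an assumed convention):
# literal  i  means  x_i,  literal -i  means  NOT x_i.
from itertools import product

F = [[1, -2], [2, 3], [-3, 1]]   # (x1 OR NOT x2) AND (x2 OR x3) AND (NOT x3 OR x1)

def satisfies(assignment, cnf):
    """Polynomial-time verification: assignment maps variable index -> bool."""
    return all(any(assignment[abs(lit)] == (lit > 0) for lit in clause)
               for clause in cnf)

def brute_force_sat(cnf, num_vars):
    """Exponential search over all 2^n truth assignments."""
    for values in product([False, True], repeat=num_vars):
        assignment = {i + 1: values[i] for i in range(num_vars)}
        if satisfies(assignment, cnf):
            return assignment
    return None

if __name__ == "__main__":
    print(brute_force_sat(F, 3))   # {1: True, 2: False, 3: True} satisfies F
```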

3. NP-Completeness of SAT
The SAT problem was the first problem proven to be NP-complete. This was done through Cook’s Theorem, which shows
that SAT is both in NP and NP-hard. To show that SAT is NP-complete, we must establish two things:

1. SAT is in NP: A solution to SAT can be verified in polynomial time.

Given a Boolean formula and an assignment of truth values, it is easy to verify whether the assignment satisfies the
formula by evaluating the formula with the given truth values. This verification process takes polynomial time in the size of the formula, so SAT is in NP.

2. SAT is NP-hard: Any problem in NP can be reduced to SAT in polynomial time.

The core of Cook’s Theorem is showing that any decision problem in NP can be transformed into a SAT instance in
polynomial time.

4. Proof of Cook's Theorem (Formal Proof)


Cook’s Theorem:
The Boolean satisfiability problem (SAT) is NP-complete, meaning:

SAT is in NP.

Every problem in NP can be reduced to SAT in polynomial time.

Step 1: SAT is in NP

We have already established that SAT is in NP by showing that a proposed solution (a truth assignment) can be verified in
polynomial time.

Step 2: SAT is NP-hard (Proof by Polynomial-Time Reduction)

To show that SAT is NP-hard, we need to prove that every problem in NP can be reduced to SAT in polynomial time. Cook’s
reduction focuses on arbitrary problems in NP, and he constructs a polynomial-time reduction from any nondeterministic
Turing machine (NDTM) computation to a SAT instance.

Concept: The key idea in Cook’s proof is that we can simulate the computation of a nondeterministic Turing machine
(NDTM) on any input string using a Boolean formula. The formula will encode the possible configurations of the NDTM, and
the satisfiability of the formula will correspond to whether there exists a sequence of valid transitions that leads to an
accepting state.

Step 2.1: Simulation of Nondeterministic Computation

Let us consider a nondeterministic Turing machine M with input string w of length n. We need to simulate the computation
of M on input w . The computation consists of a sequence of configurations, where each configuration specifies:

1. The state of the Turing machine.

2. The current tape contents (for each tape cell).

3. The position of the head.

Each configuration can be encoded as a set of Boolean variables. For instance:

Let qi be a Boolean variable that indicates whether the Turing machine is in state qi at a particular step.
​ ​

Let ti,j represent the symbol in the j -th tape cell at the i-th step.

Step 2.2: Constructing the Boolean Formula

The goal is to create a formula that encodes the transitions of the Turing machine. This formula must be satisfiable if and
only if the NDTM accepts the input w . The formula will consist of:

1. Initial Configuration: A clause ensuring that the initial configuration is consistent with the given input.

2. Transition Constraints: Clauses that encode the valid transitions between configurations of the Turing machine.

3. Acceptance Conditions: A clause that ensures the NDTM enters an accepting state.

Thus, the Boolean formula will represent all possible configurations and transitions of the Turing machine. If there exists a
sequence of configurations leading from the initial state to an accepting state, the formula will be satisfiable.

Step 2.3: Polynomial Time Construction

Constructing the Boolean formula involves encoding the transitions of the NDTM and ensuring the satisfiability corresponds
to a valid computation. The size of the formula grows polynomially in the size of the input string w and the number of steps
the NDTM takes, making the construction of the formula polynomial in time.

Step 3: Conclusion

Since we can transform any NDTM computation into a SAT instance in polynomial time, and since we know that a solution to
SAT can be verified in polynomial time, we conclude that SAT is NP-complete.

Conclusion
In this lecture, we:

1. Discussed the Satisfiability Problem (SAT) and how it is represented in conjunctive normal form (CNF).

2. Showed that SAT is in NP and provided the formal definition of NP-completeness.

3. Presented Cook’s Theorem, which proves that SAT is NP-complete, by constructing a polynomial-time reduction from an
arbitrary NP problem (simulated by a nondeterministic Turing machine) to a SAT instance.

Cook’s Theorem is foundational in complexity theory, as it was the first to establish the concept of NP-completeness, leading
to the identification of many other NP-complete problems. The problem of determining whether P = NP remains one of
the most important open questions in computer science.

Lecture 44: A Restricted Satisfiability Problem


In this lecture, we explore restricted versions of the Satisfiability Problem, particularly focusing on CSAT (Conjunctive
Normal Form Satisfiability) and 3-SAT, both of which are pivotal problems in the study of NP-completeness. We will establish
the normal forms for Boolean expressions, convert general expressions to CNF, and formally prove the NP-completeness of
CSAT and 3-SAT.

Outline:
1. Normal Forms for Boolean Expressions

2. Converting Expressions to CNF

3. NP-Completeness of CSAT

4. NP-Completeness of 3-SAT

1. Normal Forms for Boolean Expressions

To understand the restricted versions of SAT, we first need to discuss the concept of normal forms for Boolean expressions.
These normal forms allow us to represent logical formulas in a standardized and structured way, which is crucial for
analyzing their computational complexity.

Conjunctive Normal Form (CNF)

A Boolean expression is in Conjunctive Normal Form (CNF) if it is a conjunction (AND) of clauses, where each clause is a
disjunction (OR) of literals. A literal is either a variable xi or its negation ¬xi .
​ ​

Example of CNF:

(x1 ∨ ¬x2 ∨ x3) ∧ (¬x1 ∨ x2) ∧ (x3 ∨ ¬x2)

In CNF, each clause is a disjunction of literals, and the entire expression is a conjunction of such clauses.

Disjunctive Normal Form (DNF)

In contrast, a Boolean expression is in Disjunctive Normal Form (DNF) if it is a disjunction (OR) of conjunctions (AND) of
literals. DNF is not used directly in this lecture, but it is important to note the distinction.

Example of DNF:

(x1 ∧ x2) ∨ (¬x1 ∧ x3)

Other Normal Forms:

There are other normal forms such as Negation Normal Form (NNF), which is a Boolean expression where negations only
appear directly in front of literals. These are not the focus of this lecture but serve as important concepts in logic and
complexity theory.

2. Converting Expressions to CNF


For the problems we study, it is often necessary to convert Boolean formulas into CNF, as it standardizes the input for
satisfiability testing. The process of conversion to CNF can be broken down into the following steps:

Step 1: Eliminate Implications and Biconditionals

Implications (p → q ) and biconditionals (p ↔ q ) are not allowed in CNF. We replace them using the following equivalences:
p → q ≡ ¬p ∨ q
p ↔ q ≡ (p → q) ∧ (q → p)

Step 2: Push Negations Inward

Negations should be applied only to literals. To ensure this, we apply De Morgan’s laws:

¬(p ∧ q) ≡ ¬p ∨ ¬q
¬(p ∨ q) ≡ ¬p ∧ ¬q

We repeatedly push negations inward until they appear only in front of literals.

Step 3: Distribute OR over AND (Distributive Law)

The next step is to apply the distributive law to get the formula into CNF:

(p ∨ (q ∧ r)) ≡ (p ∨ q) ∧ (p ∨ r)

This ensures that we have a conjunction of disjunctions.

Step 4: Simplify the Expression

Finally, we simplify the formula to remove any redundant terms, ensuring the formula is in valid CNF form. This step often
involves eliminating tautologies or redundant clauses.

Example: Converting (x1 ∨ (x2 ∧ x3)) into CNF:

1. Apply the distributive law (distribute OR over AND):
(x1 ∨ (x2 ∧ x3)) ≡ (x1 ∨ x2) ∧ (x1 ∨ x3)

2. The result is a conjunction of disjunctions, as required. (A truth-table check of this equivalence is sketched below.)
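A minimal sketch that verifies this particular equivalence by enumerating all assignments, offered only as a brute-force sanity check on the worked example:

```python
# A minimal sketch: verify  x1 OR (x2 AND x3)  ==  (x1 OR x2) AND (x1 OR x3)
# by brute-force enumeration of all truth assignments.
from itertools import product

def original(x1, x2, x3):
    return x1 or (x2 and x3)

def cnf_form(x1, x2, x3):
    return (x1 or x2) and (x1 or x3)

if __name__ == "__main__":
    agree = all(original(*vals) == cnf_form(*vals)
                for vals in product([False, True], repeat=3))
    print(agree)   # True: the two formulas are logically equivalent
```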

3. NP-Completeness of CSAT
We now introduce the restricted version of SAT called CSAT (Conjunctive Satisfiability), where we restrict the formula to be
in conjunctive normal form (CNF). Specifically, CSAT is the problem of determining whether a given CNF formula has a
satisfying assignment.

Definition of CSAT:

Given a Boolean formula F in CNF, determine whether there exists a truth assignment to the variables that satisfies the
formula.

For example, the formula:

(x1 ∨ ¬x2) ∧ (x2 ∨ x3) ∧ (¬x3 ∨ x1)

is a CSAT instance. The task is to find a truth assignment to x1, x2, x3 that makes the formula true.

Proof of NP-Completeness of CSAT:

CSAT is in NP: Given a Boolean formula in CNF and a proposed truth assignment, we can verify if the assignment
satisfies the formula in polynomial time by evaluating each clause.

CSAT is NP-hard: We reduce the general SAT problem (which is NP-complete) to CSAT in polynomial time. Note that naively converting an arbitrary Boolean expression into an equivalent CNF can blow up exponentially, so the reduction instead produces, in polynomial time, a CNF formula that introduces new variables and is satisfiable if and only if the original expression is. Since SAT reduces to CSAT in polynomial time, CSAT is NP-hard.

Thus, CSAT is NP-complete.

4. NP-Completeness of 3-SAT
Next, we consider the problem of 3-SAT, a special case of SAT where each clause contains exactly three literals. The 3-SAT
problem is of particular importance because it was the first problem shown to be NP-complete through a polynomial-time
reduction from CSAT.

Definition of 3-SAT:

Given a Boolean formula in CNF where each clause contains exactly three literals, determine if there exists an assignment of
truth values to the variables that makes the formula true.

For example, the formula:

(x1 ∨ x2 ∨ ¬x3) ∧ (¬x1 ∨ x2 ∨ x3) ∧ (x1 ∨ ¬x2 ∨ x3)

is a 3-SAT instance.

Proof of NP-Completeness of 3-SAT:

3-SAT is in NP: Given a truth assignment to the variables, we can easily check if all the clauses are satisfied, which takes
polynomial time.

3-SAT is NP-hard: To show that 3-SAT is NP-hard, we reduce CSAT to 3-SAT in polynomial time. This can be done by transforming any CNF formula into a satisfiability-equivalent formula where each clause has exactly three literals. The transformation process is as follows (a small sketch of Step 1 appears after this list):

Step 1: If a clause has more than three literals, introduce new "linking" variables to break the clause into a chain of smaller clauses, each with exactly three literals; the chain is satisfiable exactly when the original clause is.

Step 2: If a clause has fewer than three literals, pad it to exactly three literals, either by repeating one of its literals or by adding fresh dummy variables in both polarities across copies of the clause, so that satisfiability is unchanged.

This transformation runs in polynomial time and preserves satisfiability, which establishes that 3-SAT is NP-hard.
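A minimal sketch of the Step 1 clause-splitting, using the signed-integer clause encoding from the SAT lecture (the helper name and the choice of fresh-variable numbering are assumptions for illustration):

```python
# A minimal sketch of splitting a long clause (l1 OR l2 OR ... OR lk) into
# 3-literal clauses linked by fresh variables y1, y2, ...:
#   (l1 OR l2 OR y1) AND (NOT y1 OR l3 OR y2) AND ... AND (NOT y_last OR l_{k-1} OR lk)
# The resulting chain is satisfiable exactly when the original clause is.

def split_clause(clause, next_fresh_var):
    """clause: list of signed ints with len(clause) > 3.
    next_fresh_var: first unused variable index. Returns (new_clauses, next_fresh_var)."""
    new_clauses = []
    current = clause[:]
    while len(current) > 3:
        y = next_fresh_var
        next_fresh_var += 1
        new_clauses.append(current[:2] + [y])   # (first two literals OR y)
        current = [-y] + current[2:]            # carry NOT y into the remainder
    new_clauses.append(current)
    return new_clauses, next_fresh_var

if __name__ == "__main__":
    clauses, _ = split_clause([1, 2, 3, 4, 5], next_fresh_var=6)
    print(clauses)   # [[1, 2, 6], [-6, 3, 7], [-7, 4, 5]]
```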

Thus, 3-SAT is NP-complete, since it is both in NP and NP-hard.

Conclusion
In this lecture, we:

1. Introduced normal forms for Boolean expressions, focusing on Conjunctive Normal Form (CNF).

2. Discussed the process of converting Boolean expressions to CNF.

3. Proved the NP-completeness of CSAT, showing that it is both in NP and NP-hard.

4. Proved the NP-completeness of 3-SAT, demonstrating its NP-hardness through a polynomial-time reduction from CSAT.

The NP-completeness of 3-SAT is a foundational result in computational complexity, and it plays a central role in
understanding the difficulty of many other NP-complete problems.

Lecture 45: Additional NP-Completeness Problems


In this lecture, we explore several important NP-complete problems beyond the famous 3-SAT problem. These problems
arise in various areas of computer science, and each one has been shown to be NP-complete through formal reductions
from other NP-complete problems. We'll cover the following problems in detail:

1. The Problem of Independent Sets

2. The Node-Cover Problem

3. The Directed Hamiltonian Circuit Problem

4. Undirected Hamiltonian Circuits and the Traveling Salesman Problem (TSP)

5. Summary of NP-Complete Problems

Each of these problems will follow the same structure:

Problem Definition

Formalisms

Detailed Proof of NP-Completeness through polynomial-time reductions

1. The Problem of Independent Sets

Problem Definition:

Given a graph G = (V, E), an independent set is a subset of vertices I ⊆ V such that no two vertices in I are adjacent.
The Independent Set Problem is to determine whether there exists an independent set of size at least k in the graph.

Formalisms:

G = (V , E): An undirected graph where V is the set of vertices and E is the set of edges.
Independent Set: A subset I ⊆ V where for all pairs u, v ∈ I, (u, v) ∉ E.
Decision Problem: Given G and an integer k , is there an independent set of size at least k ?

NP-Completeness Proof:

Independent Set is in NP: Given a set I , we can verify in polynomial time if I is an independent set by checking for
each pair of vertices u, v ∈ I that (u, v) ∉ E.
Independent Set is NP-hard: We reduce Vertex Cover (an NP-complete problem) to Independent Set in polynomial
time. This is because:

A set of vertices C is a vertex cover if for every edge (u, v) ∈ E , either u ∈ C or v ∈ C .


The complement of a vertex cover is an independent set. If C is a vertex cover, then V ∖ C is an independent set.
Thus, a polynomial-time reduction exists from Vertex Cover to Independent Set, proving NP-hardness.

Since Independent Set is both in NP and NP-hard, it is NP-complete.
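The complement relationship used in this reduction can be illustrated with the following minimal sketch, a brute-force check over a small graph offered only for illustration (the edge-list representation is an assumption):

```python
# A minimal sketch: in any graph, C is a vertex cover iff V \ C is an independent set.
from itertools import combinations

def is_independent_set(edges, subset):
    """No edge has both endpoints inside 'subset'."""
    return all(not (u in subset and v in subset) for u, v in edges)

def is_vertex_cover(edges, subset):
    """Every edge has at least one endpoint inside 'subset'."""
    return all(u in subset or v in subset for u, v in edges)

if __name__ == "__main__":
    V = {0, 1, 2, 3}
    E = [(0, 1), (1, 2), (2, 3), (3, 0)]          # a 4-cycle
    for size in range(len(V) + 1):
        for subset in map(set, combinations(V, size)):
            assert is_vertex_cover(E, subset) == is_independent_set(E, V - subset)
    print("verified: C is a vertex cover  <=>  V \\ C is an independent set")
```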

2. The Node-Cover Problem

Problem Definition:

Given a graph G = (V , E) and an integer k , the Node-Cover Problem asks whether there exists a set C ⊆ V of size at
most k such that every edge in G is incident to at least one vertex in C .

Formalisms:

G = (V , E): An undirected graph.


Node-Cover: A subset C ⊆ V such that for every edge (u, v) ∈ E , u ∈ C or v ∈ C .
Decision Problem: Given G and an integer k , is there a node cover of size at most k ?

NP-Completeness Proof:

Node-Cover is in NP: Given a set C , we can verify in polynomial time if C is a node cover by checking that each edge is
incident to at least one vertex in C .

Node-Cover is NP-hard: We reduce the Independent Set Problem to Node-Cover in polynomial time. This is because:

As observed above, a set C ⊆ V is a node (vertex) cover if and only if V ∖ C is an independent set.

Thus, G has an independent set of size at least k if and only if G has a node cover of size at most ∣V∣ − k, so an instance (G, k) of Independent Set can be transformed into the instance (G, ∣V∣ − k) of Node-Cover.

This reduction proves NP-hardness.

Since Node-Cover is both in NP and NP-hard, it is NP-complete.

3. The Directed Hamiltonian Circuit Problem

Problem Definition:

Given a directed graph G = (V , E), the Directed Hamiltonian Circuit Problem asks whether there exists a directed cycle
that visits every vertex in V exactly once.

Formalisms:

G = (V , E): A directed graph where each edge is directed from one vertex to another.
Hamiltonian Circuit: A cycle in the graph that visits every vertex exactly once and returns to the starting vertex.

Decision Problem: Given G, does there exist a directed Hamiltonian circuit?

NP-Completeness Proof:

Directed Hamiltonian Circuit is in NP: Given a cycle, we can verify in polynomial time if it visits every vertex exactly
once and returns to the starting vertex.

Directed Hamiltonian Circuit is NP-hard: The standard proof is a (fairly elaborate) polynomial-time reduction from 3-SAT, which builds a directed graph whose Hamiltonian circuits correspond exactly to the satisfying assignments of the given formula. Note also that the undirected version reduces to the directed one: replacing each undirected edge (u, v) with the two directed edges (u, v) and (v, u) turns any undirected instance into an equivalent directed instance.

Since Directed Hamiltonian Circuit is both in NP and NP-hard, it is NP-complete.

4. Undirected Hamiltonian Circuits and the Traveling Salesman Problem (TSP)

Problem Definition:

Undirected Hamiltonian Circuit Problem: Given an undirected graph G = (V , E), determine if there exists a cycle
that visits each vertex exactly once.

Traveling Salesman Problem (TSP): Given a complete graph G = (V , E) with weights on the edges, is there a tour
(Hamiltonian circuit) whose total weight is less than or equal to a specified bound B ?

Formalisms:

Undirected Hamiltonian Circuit: A cycle that visits every vertex in G exactly once and returns to the starting point.

TSP: A Hamiltonian cycle where the sum of the edge weights is at most B .

NP-Completeness Proof (for Undirected Hamiltonian Circuit):

Undirected Hamiltonian Circuit is in NP: Given a cycle, we can verify in polynomial time if it is a Hamiltonian cycle.

Undirected Hamiltonian Circuit is NP-hard: We reduce the Hamiltonian Path Problem (a closely related NP-complete problem) to the Undirected Hamiltonian Circuit Problem in polynomial time. Given a graph G, add one new vertex that is adjacent to every vertex of G; the new graph has a Hamiltonian circuit if and only if G has a Hamiltonian path. This reduction proves NP-hardness.

Since the problem is both in NP and NP-hard, it is NP-complete.

TSP NP-Completeness:

The Traveling Salesman Problem is NP-complete as well. It is clearly in NP (a proposed tour can be checked and its total weight summed in polynomial time), and we can reduce the Hamiltonian Circuit Problem to TSP: given G = (V , E), build the complete graph on V in which every edge of E gets weight 1 and every missing edge gets weight 2, and set the bound B = ∣V ∣. A tour of total weight at most ∣V ∣ exists if and only if G has a Hamiltonian circuit (a construction sketch follows below).
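Below is a small Python sketch of this construction (the helper name and data representation are ours, for illustration). It produces the weight function and the bound B from a Hamiltonian-Circuit instance.

```python
def hamiltonian_circuit_to_tsp(vertices, edges):
    """Reduction sketch: build a complete weighted graph in which edges of G
    cost 1 and non-edges cost 2.  A tour of total weight <= B = |V| exists
    iff G has a Hamiltonian circuit."""
    edge_set = {frozenset(e) for e in edges}
    weights = {}
    vs = list(vertices)
    for i, u in enumerate(vs):
        for v in vs[i + 1:]:
            pair = frozenset((u, v))
            weights[pair] = 1 if pair in edge_set else 2
    return weights, len(vs)          # (weight function, bound B)

# toy instance: a 4-cycle; the returned bound is B = 4
w, B = hamiltonian_circuit_to_tsp({1, 2, 3, 4}, [(1, 2), (2, 3), (3, 4), (4, 1)])
```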

5. Summary of NP-Complete Problems


At this point, we have covered several NP-complete problems:

Independent Set

Node-Cover

Directed Hamiltonian Circuit

Undirected Hamiltonian Circuit

Traveling Salesman Problem (TSP)

Reductions Between Problems:

Vertex Cover reduces to Independent Set (and vice versa, via complementation).

Independent Set reduces to Node-Cover.

The (undirected) Hamiltonian Cycle Problem reduces to the Directed Hamiltonian Circuit Problem by replacing each edge with two directed edges.

Hamiltonian Path reduces to Undirected Hamiltonian Circuit.

Hamiltonian Circuit reduces to TSP by assigning edge weights and a bound.

Finally, as a key observation, all of these problems trace their NP-hardness back to SAT through chains of polynomial-time reductions, and, since each of them is in NP, each also reduces back to SAT by Cook's theorem. They therefore all belong to the same complexity class, NP-complete.

Conclusion:
In this lecture, we:

1. Defined and formally analyzed several additional NP-complete problems, including Independent Set, Node-Cover,
Directed Hamiltonian Circuit, and Undirected Hamiltonian Circuit.

2. Proved the NP-completeness of these problems through polynomial-time reductions.

3. Demonstrated how all these problems connect to each other and can be reduced to SAT, illustrating the
interconnectedness of NP-complete problems.

This comprehensive exploration of NP-complete problems enriches our understanding of the vast landscape of
computational complexity and how many different problems can be reduced to one another.

Lecture 46: Complements of Languages in NP


In this lecture, we will explore the concept of co-NP, its relationship to NP, and the interactions between NP-complete and
co-NP problems. The primary focus will be on understanding the class of languages co-NP, how it contrasts with NP, and
how NP-complete problems relate to co-NP problems.

1. The Class of Languages Co-NP

Definition of Co-NP:

NP (Nondeterministic Polynomial time) is the class of decision problems for which a "yes" answer can be verified in
polynomial time. More formally, a language L belongs to NP if there exists a polynomial-time verifier for it, meaning
that for any string x in L, there exists a certificate (or witness) that can be verified in polynomial time.

Co-NP is defined as the class of languages whose complements are in NP. Specifically, a language L belongs to Co-NP if and only if the complement of L, denoted L̄, belongs to NP. In other words, a language is in Co-NP if, for every "no" instance of the decision problem, there exists a certificate for that "no" answer that can be verified in polynomial time.

Formally:

Co-NP = { L ∣ L̄ ∈ NP }

This means that if there is a polynomial-time verifier for the complement of L, then L is in Co-NP.

Examples of Co-NP Problems:

Tautology Problem: Given a Boolean formula, determine whether the formula is true for all possible truth assignments.
This problem is in Co-NP because checking whether a formula is not a tautology (i.e., there exists an assignment that
makes the formula false) is an NP problem.

The Non-Emptiness Problem for CFLs: Given a context-free grammar G, determining whether L(G) is non-empty can in fact be decided deterministically in polynomial time, so this problem lies in P and hence in both NP and Co-NP; it is only a degenerate example. More characteristic Co-NP problems are complements of NP-complete problems, such as deciding whether a Boolean formula is unsatisfiable.

2. NP-Complete Problems and Co-NP

Relationship Between NP and Co-NP:

NP vs Co-NP: The relationship between NP and Co-NP is one of the most fundamental open questions in computational
complexity. The key question is whether NP = Co-NP. If NP and Co-NP were equal, then every problem for which we can
verify a "yes" answer efficiently (NP problems) would also have a complement problem that can be verified efficiently
(Co-NP problems).

It is widely believed that NP ≠ Co-NP, although this has not been proven. Most complexity theorists conjecture that there are problems in NP that are not in Co-NP, and vice versa. Note that since P is closed under complementation, P = NP would force NP = Co-NP; conversely, a proof that NP ≠ Co-NP would immediately imply P ≠ NP.

Examples of NP-Complete and Co-NP-Complete Problems:

NP-Complete Problems: These are problems that are both in NP and are at least as hard as any other NP problem. This
means that if we can solve an NP-complete problem in polynomial time, we can solve all NP problems in polynomial
time.

Examples: SAT, Independent Set, Traveling Salesman Problem, Hamiltonian Circuit, etc.

Co-NP-Complete Problems: These are problems that are co-NP-hard and also in Co-NP. In other words, these problems
are at least as hard as any other problem in Co-NP. If we could solve a Co-NP-complete problem in polynomial time, we
would prove that Co-NP = NP.

Tautology Problem (checking whether a Boolean formula is true for all possible inputs) is an example of a Co-NP-
complete problem.

NP-Complete Problems and Co-NP:

While there is no direct equivalence between NP-complete problems and Co-NP problems, it is important to recognize that
for every NP problem, there is a corresponding Co-NP problem. For example:

SAT (NP-complete problem): Given a Boolean formula, is there an assignment of variables that makes the formula true?

TAUT (Co-NP-complete problem): Given a Boolean formula, is the formula true for all possible assignments? TAUT is, up to negating the formula, the complement of SAT: a formula φ is unsatisfiable (no assignment makes it true) if and only if ¬φ is a tautology (true for all assignments). A small brute-force sketch of this complement relationship appears below.
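The following brute-force Python sketch (exponential in the number of variables and purely illustrative; the formulas are represented as ordinary Python functions and the names are ours) demonstrates the complement relationship: a formula is unsatisfiable exactly when its negation is a tautology.

```python
from itertools import product

def is_satisfiable(formula, n_vars):
    """True iff some truth assignment makes the formula true (brute force)."""
    return any(formula(*a) for a in product([False, True], repeat=n_vars))

def is_tautology(formula, n_vars):
    """True iff every truth assignment makes the formula true (brute force)."""
    return all(formula(*a) for a in product([False, True], repeat=n_vars))

# phi(x, y) = x AND (NOT x) is unsatisfiable, so NOT phi is a tautology
phi = lambda x, y: x and not x
not_phi = lambda x, y: not phi(x, y)
assert not is_satisfiable(phi, 2)
assert is_tautology(not_phi, 2)
```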

This shows that NP and Co-NP are conceptually related but distinct classes. The main distinction is that NP deals with
verifying "yes" answers, while Co-NP deals with verifying "no" answers.

Complementation in NP and Co-NP:

A language L in NP has a polynomial-time verifier for "yes" instances of the problem.

A language L in Co-NP has a polynomial-time verifier for "no" instances of the problem.

Thus, the existence of a verifier for "no" instances of a problem corresponds to the problem being in Co-NP.

Polynomial-Time Completeness:

NP-complete problems are those problems to which every problem in NP can be reduced in polynomial time. Similarly,
Co-NP-complete problems are the hardest problems in Co-NP, to which every problem in Co-NP can be reduced in
polynomial time.

Key Observation:

If there were a polynomial-time reduction from an NP-complete problem to a Co-NP-complete problem (or vice versa), it would follow that NP = Co-NP; no such reduction is known.

3. Conclusion
In this lecture, we:

Defined the class Co-NP, which consists of languages whose complements are in NP.

Explored the fundamental question of whether NP = Co-NP, noting that most complexity theorists believe they are
distinct but have not been proven so.

Discussed NP-complete problems and Co-NP-complete problems, highlighting how certain problems in NP have
complementary problems in Co-NP.

Examined examples like the Tautology Problem, which is Co-NP-complete, and compared them with NP-complete
problems like SAT.

This topic remains an essential part of computational complexity theory, with significant open questions, especially the one
about the relationship between NP and Co-NP.

Lecture 47: Problems Solvable in Polynomial Space

In this lecture, we will delve into the class of problems solvable in polynomial space and explore the concept of Polynomial
Space Turing Machines (PS-TMs). We will examine the relationship of PS (polynomial space) and NPS (non-deterministic
polynomial space) with previously defined complexity classes, and distinguish between deterministic and non-
deterministic polynomial space.

1. Polynomial Space Turing Machines

Definition of Polynomial Space Turing Machines (PS-TMs):

A Turing machine (TM) operates on an infinite tape and can use an arbitrary amount of space to solve a problem, but we
impose a restriction on the amount of space it can use. Specifically, we are concerned with Turing machines that use at most
polynomial space with respect to the size of the input.

Polynomial Space (PS): A Turing machine is said to operate within polynomial space if the amount of tape it uses is
bounded by a polynomial function of the input size. This means that if the size of the input is n, the machine's tape
usage is bounded by O(n^k), where k is a constant.

Formally:

Let the input size be n. If a Turing machine uses space at most O(n^k) for some constant k , it is said to operate in
polynomial space. In terms of language recognition, the class of languages that can be recognized by such machines is
denoted PSPACE.

PSPACE = {Languages that can be decided by a polynomial-space Turing machine}

Examples of PSPACE Problems:

Quantified Boolean Formula (QBF): The problem of determining the truth value of a quantified Boolean formula,
where the variables in the formula are quantified using existential and universal quantifiers, is PSPACE-complete.

Generalized Reachability Problems: Determining whether there is a path between two nodes of a graph that may be exponentially large but is described succinctly (for example, the configuration graph of a space-bounded machine) can be solved in polynomial space.

2. Relationship of PSPACE and NPSPACE to Previously Defined Classes

Comparison with P and NP:

P (Polynomial Time): The class of problems that can be solved by a deterministic Turing machine in polynomial time.
This is a time-bound class.

NP (Nondeterministic Polynomial Time): The class of problems for which a solution can be verified in polynomial time
by a deterministic Turing machine, or equivalently, can be solved by a nondeterministic Turing machine in polynomial
time.

PSPACE (Polynomial Space): The class of problems that can be solved by a deterministic Turing machine in polynomial
space, irrespective of the time complexity.

NPSPACE (Nondeterministic Polynomial Space): The class of problems that can be solved by a nondeterministic Turing
machine in polynomial space.

Key Relationships:

1. PSPACE and P: It is known that P is a subset of PSPACE, i.e., P ⊆ PSPACE. This is because a machine that runs for only polynomially many steps can visit at most polynomially many tape cells, so any problem solvable in polynomial time is also solvable in polynomial space.

2. PSPACE and NP: NP is also contained in PSPACE, i.e., NP ⊆ PSPACE: a polynomial-space machine can cycle through all possible polynomial-length certificates one after another, reusing the same tape, and run the polynomial-time verifier on each. Whether the containment is strict is an open question, but it is widely believed that PSPACE is strictly larger than NP, since PSPACE appears to contain problems (such as QBF) that need more than polynomial time yet only polynomial space.

3. PSPACE and NPSPACE: A critical result in computational complexity theory is that PSPACE = NPSPACE, which is a
consequence of Savitch's Theorem. Savitch's theorem states that any problem solvable by a nondeterministic Turing
machine in polynomial space can also be solved by a deterministic Turing machine in polynomial space.

Savitch’s Theorem: For any space bound f (n) ≥ log n, every language decidable by a nondeterministic Turing machine in space f (n) is decidable by a deterministic Turing machine in space O(f (n)^2).

Thus, nondeterminism does not add more computational power in terms of space when the space is polynomial,
making PSPACE = NPSPACE.
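The idea behind Savitch's simulation can be illustrated on an explicit graph: reachability within a given number of steps is decided by recursing on a midpoint, so the recursion depth is logarithmic in the path length and only a constant amount of information is stored per level. The following Python sketch (names and the toy graph are ours) mirrors that recursion; applied to the exponentially large configuration graph of a nondeterministic machine, the same recursion uses only polynomial space.

```python
def reachable(vertices, adj, u, v, steps):
    """Savitch-style midpoint recursion: can u reach v in at most `steps` moves?
    Recursion depth is O(log steps), and each stack frame stores only a constant
    number of vertices -- the source of the quadratic space bound."""
    if steps <= 1:
        return u == v or v in adj.get(u, ())
    half = (steps + 1) // 2
    return any(reachable(vertices, adj, u, w, half) and
               reachable(vertices, adj, w, v, steps - half)
               for w in vertices)        # try every possible midpoint

# toy "configuration graph": a -> b -> c -> d
V = {"a", "b", "c", "d"}
A = {"a": ("b",), "b": ("c",), "c": ("d",)}
print(reachable(V, A, "a", "d", len(V)))  # True
```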

3. Deterministic and Non-Deterministic Polynomial Space

Deterministic Polynomial Space (PS):

In deterministic polynomial space, a Turing machine makes a series of decisions based on its current state and input
symbol, and the computation is deterministic. The machine uses a polynomial amount of tape (space) to process the
input and produce an output.

Examples of deterministic polynomial space problems include:

The Quantified Boolean Formula (QBF) problem: Determining whether a given Boolean formula with quantifiers is
true or false is a PSPACE-complete problem.

Game Theory Problems: Determining the winner of certain two-player games whose plays last only polynomially many moves (for example, generalized Hex or generalized Geography) can be done in polynomial space.

Nondeterministic Polynomial Space (NPSPACE):

In nondeterministic polynomial space, the Turing machine has the ability to make "guesses" at each step of the
computation, and then "verify" the correctness of those guesses. This allows the machine to explore multiple
computational paths simultaneously.

NPSPACE allows nondeterministic Turing machines to use polynomial space, but it is still limited by the space constraint.
Importantly, it has been proven that NPSPACE = PSPACE (by Savitch's Theorem).

Examples of nondeterministic polynomial space problems include:

Graph Reachability: Determining whether there is a path between two nodes of an implicitly described, possibly exponential-size graph (such as the configuration graph of a space-bounded machine) can be solved in nondeterministic polynomial space by guessing the path one step at a time.

Generalized Games: Some generalized game problems that involve multiple players and strategies can be
formulated as NPSPACE problems.

Savitch's Theorem (Restated):

Savitch's theorem formalizes the equivalence of PSPACE and NPSPACE, providing a deep insight into how nondeterminism
can be simulated using deterministic algorithms in polynomial space:

Theorem: For any language decidable by a nondeterministic Turing machine in polynomial space, there exists a
deterministic Turing machine that can decide the same language in polynomial space.

This result shows that nondeterminism does not provide additional power when restricted to polynomial space.

4. Conclusion
In this lecture, we:

Introduced the concept of Polynomial Space Turing Machines (PS-TMs) and defined the complexity class PSPACE.

Discussed the relationship between PSPACE, NPSPACE, and previously defined classes like P and NP, emphasizing that PSPACE contains NP (and is believed, though not proven, to be strictly larger), and that PSPACE = NPSPACE by Savitch's Theorem.

Explained the distinction between deterministic and nondeterministic polynomial space, emphasizing that
nondeterminism does not provide extra power in polynomial space.

The exploration of polynomial space helps in understanding the boundaries of tractability when only space usage is
constrained, rather than time. PSPACE contains many significant problems in computational complexity, including those
with exponential time complexity but polynomial space requirements.

Lecture 48: A Problem That Is Complete for PS (PS-Completeness)


In this lecture, we will explore the concept of PS-completeness and examine the Quantified Boolean Formula (QBF)
problem, which is known to be PS-complete. The focus will be on understanding the properties of PS-complete problems
and formally proving the PS-completeness of the QBF problem.

1. PS-Completeness

Definition of PS-Completeness:

A problem is said to be PS-complete if it satisfies two conditions:

1. It is in PSPACE: The problem can be solved by a deterministic Turing machine using polynomial space. In other words,
the space required to solve the problem grows at most polynomially with the input size.

2. Every problem in PSPACE can be reduced to it in polynomial time: For any problem A in PSPACE, there exists a
polynomial-time reduction from A to the PS-complete problem P . This means that any PSPACE problem can be reduced
to P in polynomial time, showing that P is as "hard" as any other problem in PSPACE.

Thus, PS-complete problems serve as the hardest problems in PSPACE, and solving one PS-complete problem efficiently
would imply an efficient solution for all problems in PSPACE.

2. Quantified Boolean Formulas (QBF)

Definition of QBF:

A Quantified Boolean Formula (QBF) is an extension of the standard Boolean formulas. In a QBF, Boolean variables are
quantifiable, meaning they can be universally or existentially quantified. Specifically, a QBF is a Boolean formula that
includes quantifiers over the variables in the formula. These quantifiers can either be:

Existential quantifiers (∃): There exists a variable assignment that makes the formula true.

Universal quantifiers (∀): The formula must be true for all variable assignments.

An example of a QBF is:

∃x1 ∀x2 (x1 ∧ ¬x2)

This QBF asserts that there exists a value for x1 such that, for every value of x2, the formula x1 ∧ ¬x2 holds true. (This particular formula is in fact false: whatever value x1 takes, choosing x2 = true falsifies x1 ∧ ¬x2.)

The formula can be interpreted as a two-player game: the existential player chooses values for the existentially quantified variables and tries to make the formula true, while the universal player chooses values for the universally quantified variables and tries to make it false; the formula is true exactly when the existential player has a winning strategy.

3. Evaluating Quantified Boolean Formulas

Problem Definition:

The QBF Evaluation Problem is the problem of determining the truth value of a quantified Boolean formula. Given a QBF
formula Q consisting of Boolean variables and quantifiers, the task is to evaluate whether the formula is true or false based
on the values assigned to the variables.

The evaluation of a QBF formula involves recursively handling the quantifiers:

For an existential quantifier (∃x), the formula is true if there exists a value of x that makes the formula true.

For a universal quantifier (∀x), the formula is true if, for every possible value of x, the formula remains true.

Example of QBF Evaluation:

Consider the QBF formula:

∃x1 ∀x2 (x1 ∨ x2)

To evaluate this formula, we follow these steps:

1. For ∃x1 , we check if there exists an assignment for x1 such that for all x2 , the formula x1
​ ​ ​ ∨ x2 holds.

2. For each assignment of x1 , we then evaluate whether for all possible values of x2 , the formula holds.
​ ​

For x1 ​ = true, we find that the formula is true regardless of x2 's value, thus making the entire formula true.
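The recursive evaluation just described is easy to express directly. The following Python sketch (the encoding of prefixes and matrices is ours, for illustration) evaluates a QBF by recursing over the quantifier prefix; it uses exponential time but only stores the current partial assignment, so its space usage is polynomial, which is the property exploited in the next section.

```python
def eval_qbf(prefix, matrix, assignment=()):
    """Recursively evaluate a quantified Boolean formula.
    prefix  -- sequence of quantifiers, e.g. ('E', 'A') for "exists x1, for all x2"
    matrix  -- a Python function taking one bool per quantified variable
    Only the current partial assignment is stored, so space use is linear
    in the number of variables."""
    if not prefix:
        return matrix(*assignment)
    q, rest = prefix[0], prefix[1:]
    branches = (eval_qbf(rest, matrix, assignment + (val,)) for val in (False, True))
    return any(branches) if q == 'E' else all(branches)

print(eval_qbf(('E', 'A'), lambda x1, x2: x1 or x2))       # True  (witness x1 = True)
print(eval_qbf(('E', 'A'), lambda x1, x2: x1 and not x2))  # False
```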

4. PS-Completeness of QBF Problem

Proving PS-Completeness of QBF:

To prove that the QBF problem is PS-complete, we need to show two things:

1. QBF is in PSPACE: We need to demonstrate that evaluating a QBF can be done using polynomial space. This is done by
observing that for each quantifier in the formula, the evaluation can be done recursively without requiring more than
polynomial space.

Recursive Space Evaluation: To evaluate a QBF, we traverse the formula from the outermost quantifier to the
innermost one, processing each quantifier sequentially. For each quantifier, we need only store the current partial
evaluation and the variables currently being considered, which requires polynomial space.

Space Complexity: The space required for storing the QBF evaluation at each level is proportional to the number of
variables in the formula. Since the formula has a polynomial size in the number of variables, the space required to
evaluate the QBF is also polynomial.

2. Every problem in PSPACE reduces to QBF: Next, we need to show that any problem in PSPACE can be reduced to the QBF evaluation problem in polynomial time. This is done by encoding the computation of an arbitrary polynomial-space-bounded Turing machine (equivalently, a reachability question on its exponentially large configuration graph) as a QBF formula of polynomial size. Since this reduction can be performed in polynomial time, the QBF problem is PSPACE-hard.

Reduction from PSPACE Problems: The process of reducing a PSPACE problem to QBF involves converting the
space-bounded computations of the problem into a sequence of quantifiers and Boolean operations. These
quantifiers model the nondeterministic choices made by a machine during its computation. The Boolean variables
represent the states of the machine, and the quantifiers specify whether the machine can reach a certain state
under the given constraints.

Example of Reduction: Consider the question of whether a given (possibly exponentially large) configuration graph has a path from configuration A to configuration B . The statement "there is a path of length at most 2^i from u to v" can be written as "there exists a midpoint w such that, for both of the two halves (u to w , and w to v ), there is a path of length at most 2^{i−1}". Expressing the "for both halves" condition with a universal quantifier keeps the formula polynomial in size, and unwinding the recursion yields a QBF whose truth value answers the original reachability question.

Since both conditions hold—QBF is in PSPACE and any PSPACE problem can be reduced to QBF—we conclude that QBF is
PS-complete.

5. Conclusion
In this lecture, we have:

Introduced PS-completeness, which characterizes the hardest problems in the PSPACE complexity class.

Discussed the Quantified Boolean Formula (QBF) problem, which is a natural candidate for PS-completeness.

Evaluated a QBF formula and discussed how the evaluation process works by handling the existential and universal
quantifiers.

Provided a formal proof for the PS-completeness of the QBF problem, demonstrating that it is both in PSPACE and that
every problem in PSPACE can be polynomial-time reduced to QBF.

Understanding PS-completeness is crucial in theoretical computer science, as it provides insight into the structure of PSPACE
and identifies some of the most difficult problems within this class. The QBF problem is one of the key problems in this area
and serves as a benchmark for understanding the complexity of PSPACE.

Lecture 49: Language Classes Based on Randomization


In this lecture, we will delve into randomized algorithms and explore the randomized Turing machine (TM) model. We will
examine the classes RP and ZPP, their definitions, and their relationship to other complexity classes such as P and NP. The
discussion will cover formal notations and proofs, along with examples for better understanding.

1. Quicksort: An Example of a Randomized Algorithm

Quicksort Overview:

Quicksort is a well-known divide-and-conquer algorithm for sorting arrays. The key idea behind Quicksort is to select a
pivot element from the array and partition the other elements into two subarrays, according to whether they are less than
or greater than the pivot. These subarrays are then sorted recursively.

Randomized Quicksort:

In the randomized version of Quicksort, the pivot is selected randomly from the array rather than choosing a fixed pivot
(such as the first element). This random selection introduces an element of randomness in the algorithm, which affects the
runtime performance.

The main advantage of the randomized pivot selection is that the expected running time is O(n log n) on every input, so no fixed adversarial input ordering can force the quadratic worst case; bad behaviour now depends only on unlucky random choices, thus avoiding the worst-case time complexity O(n²) in most situations.

Analysis of Randomized Quicksort:

Expected Time Complexity: The expected time complexity of randomized Quicksort is O(n log n), where n is the
number of elements in the array. This is because each partitioning step divides the array into two parts, and on
average, the pivot splits the array in half.

Worst-Case Time Complexity: In the worst case, the algorithm could still behave like a poorly pivoted deterministic Quicksort, resulting in O(n²) time. However, this happens only with very low probability, namely when the random pivot is consistently chosen badly (e.g., always the smallest or largest element). A short implementation sketch follows below.
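Here is a short Python sketch of randomized Quicksort (written for clarity rather than in-place efficiency; the list-comprehension partitioning is ours):

```python
import random

def randomized_quicksort(arr):
    """Quicksort with a uniformly random pivot; expected O(n log n) comparisons."""
    if len(arr) <= 1:
        return arr
    pivot = random.choice(arr)                     # random pivot selection
    less    = [x for x in arr if x < pivot]
    equal   = [x for x in arr if x == pivot]
    greater = [x for x in arr if x > pivot]
    return randomized_quicksort(less) + equal + randomized_quicksort(greater)

print(randomized_quicksort([9, 3, 7, 1, 8, 2, 5]))  # [1, 2, 3, 5, 7, 8, 9]
```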

2. A TM-Model Using Randomization

Randomized Turing Machine (RTM):

A randomized Turing machine (RTM) is a Turing machine that has access to random bits. At each step of computation, the
machine can make a random choice (such as choosing between multiple possible transitions). The randomness is typically
modeled by a special "random tape" that provides random binary values, which can be accessed during computation.

An RTM operates by reading from its input tape and the random tape, making decisions based on both the input and the
random bits. The main difference between a standard deterministic Turing machine (DTM) and an RTM is the presence of
randomness, which allows the RTM to explore multiple computational paths simultaneously.

Formal Definition of RTM:

Transition Function: The transition function δ of an RTM is similar to a DTM, but it takes an additional random input. At
each step, the RTM moves to a new state based not only on the current state and input symbol but also based on the
value of a random bit.

Computation Path: On any single run, an RTM follows one computation path, determined by the random bits it happens to read; over all possible random strings, its behaviour is described by a probability distribution over computation paths.

Acceptance Criteria: The RTM accepts or rejects an input string probabilistically: what matters is the probability, taken over the random bits, that the machine reaches an accepting state. Different requirements on this probability give rise to different randomized classes (such as RP and ZPP below).

3. The Language of a Randomized TM

Definition of a Language Recognized by an RTM:

The language of a randomized Turing machine is the set of all strings that the machine accepts with a certain probability.
For an input string w , the machine may either accept or reject it based on the sequence of random choices it makes during
computation.

The key point is that the machine does not necessarily accept the input with certainty but with high probability. Specifically,
we define the following:

Success Probability: The machine is said to accept a string w if the probability of accepting w over all possible random
paths is at least a certain threshold (often 1/2).

Error Probability: A machine is allowed to make errors, but the probability of error (i.e., rejecting an input when it is in
the language or accepting it when it is not) must be bounded by a small constant (such as 1/3).

4. The Class RP

Definition of RP:

The class RP (Randomized Polynomial time) consists of decision problems for which a randomized algorithm exists that can
decide the problem in polynomial time with the following properties:

Correct rejection of "no" instances: If the input string does not belong to the language (i.e., the answer is "no"), the machine rejects with probability 1; it never accepts a string outside the language.

Acceptance of "yes" instances: If the input string belongs to the language (i.e., the answer is "yes"), the machine accepts with probability at least 1/2.

One-sided error: The only possible error is rejecting a string that is in the language, and this happens with probability at most 1/2 per run; the error can be made arbitrarily small by repeating the algorithm with fresh random bits.

Formally, we can define RP as:

RP = { L ∣ there is a polynomial-time RTM M such that w ∈ L ⟹ Pr[M accepts w] ≥ 1/2, and w ∉ L ⟹ Pr[M accepts w] = 0 }

Example:

A classic example connected to RP is primality/compositeness testing. The Miller-Rabin test is a polynomial-time randomized algorithm that never declares a prime to be composite and detects a composite with probability at least 1/2 per round; this places Compositeness in RP (equivalently, Primality in Co-RP). Primality itself is also known to be in RP, and is even in P by the deterministic AKS algorithm discussed later.

5. Recognizing Languages in RP

Recognition Process in RP:

Given a language L ∈ RP, the recognition process involves the following steps:
1. Input: The input string w is provided to the RTM.

2. Randomized Computation: The RTM performs its computation by reading the input tape and making random decisions
at each step based on the random tape.

3. Acceptance: If the RTM accepts w on some run, then w is certainly in the language L, because an RP machine never accepts strings outside the language.

4. Rejection: If the RTM rejects w , we tentatively conclude that w is not in L; this conclusion can be wrong with probability at most 1/2, which is why the computation may be repeated.

Since the error probability is bounded, the RTM may need to repeat the process multiple times to reduce the error
probability. However, the machine always decides in polynomial time.
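The repetition idea can be sketched in a few lines of Python. The "machine" below is a hypothetical stand-in for an RP algorithm (it accepts "yes" instances with probability 1/2 and never accepts "no" instances); running it several times drives the one-sided error down exponentially.

```python
import random

def amplified_accept(rp_machine, w, trials=30):
    """Run a one-sided-error RP machine several times.
    If w is in L, each run accepts with probability >= 1/2, so the chance that
    all `trials` runs reject is at most 2**(-trials).
    If w is not in L, no run ever accepts, so we never wrongly accept."""
    return any(rp_machine(w) for _ in range(trials))

# hypothetical stand-in for an RP machine deciding the toy language {"yes"}
def toy_rp_machine(w):
    return w == "yes" and random.random() < 0.5

print(amplified_accept(toy_rp_machine, "yes"))  # True with overwhelming probability
print(amplified_accept(toy_rp_machine, "no"))   # always False
```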

6. The Class ZPP

Definition of ZPP:

The class ZPP (Zero-error Probabilistic Polynomial time) consists of decision problems for which there exists a randomized
algorithm that runs in expected polynomial time and always produces correct answers (i.e., no error probability).

Correctness: The machine always either accepts or rejects the input with certainty, i.e., no errors occur.

Expected Polynomial Time: The expected running time of the algorithm is polynomial in the size of the input.

Formally, we define ZPP as:

ZPP = {L ∣ ∃ a probabilistic algorithm A such that A always returns the correct answer in expected polynomial time}

Example:

An example of a ZPP problem is the Parity Problem, where the task is to determine whether a given sequence has an even or odd number of 1's; since this is trivially solvable in deterministic linear time and P ⊆ ZPP, it lies in ZPP with certainty and polynomial expected time. The more interesting members of ZPP are problems for which randomness yields the only known efficient algorithms; historically, primality testing was shown to be in ZPP before the deterministic AKS algorithm placed it in P.

7. Relationship Between RP and ZPP


RP vs. ZPP: The key difference between RP and ZPP is that RP allows one-sided error (a "yes" instance may be rejected with probability up to 1/2), whereas ZPP guarantees zero error: a ZPP algorithm is always correct, and only its running time is random (polynomial in expectation).

ZPP ⊆ RP: Every language in ZPP is also in RP. A zero-error algorithm with expected polynomial running time can be cut off after twice its expected time; by Markov's inequality it has finished, and therefore answered correctly, with probability at least 1/2, and if it has not finished we simply reject. This yields a machine meeting the RP requirements. The reverse containment is not known: repeating an RP algorithm drives the error probability down but never to zero. In fact, ZPP = RP ∩ Co-RP: if a language and its complement both admit one-sided-error algorithms, alternating the two machines until one gives a definitive answer yields a zero-error algorithm with a constant expected number of rounds, as sketched below.
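The alternation argument is easy to sketch. The machines below are hypothetical stand-ins (ours) for the RP machine of a language L and the co-RP machine of the same language; alternating them yields a Las Vegas (zero-error) decider with a constant expected number of rounds.

```python
import random

def zpp_decider(rp_machine, corp_machine, w):
    """Las Vegas sketch for ZPP = RP ∩ Co-RP: alternate two one-sided-error
    machines until one gives a definitive answer.
    An accept from the RP machine proves w IS in L (it never accepts non-members);
    a reject from the co-RP machine proves w is NOT in L (it never rejects members).
    Each round is definitive with probability >= 1/2, so the expected number of
    rounds is at most 2."""
    while True:
        if rp_machine(w):         # definitive "yes"
            return True
        if not corp_machine(w):   # definitive "no"
            return False

# hypothetical stand-ins for the toy language L = {"yes"}
def toy_rp(w):    # accepts members with prob. 1/2, never accepts non-members
    return w == "yes" and random.random() < 0.5

def toy_corp(w):  # rejects non-members with prob. 1/2, never rejects members
    return w == "yes" or random.random() < 0.5

print(zpp_decider(toy_rp, toy_corp, "yes"))  # True
print(zpp_decider(toy_rp, toy_corp, "no"))   # False
```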

8. Relationships of RP and ZPP to Classes P and NP


P ⊆ ZPP ⊆ RP: Every deterministic polynomial-time algorithm trivially satisfies the ZPP requirements (it is always correct and runs in polynomial time), so P ⊆ ZPP, and as argued above ZPP ⊆ RP. Whether any of these containments is strict is open; it is widely conjectured that randomness adds no power here (i.e., that P = ZPP = RP), but no proof is known.

RP ⊆ NP: Every language in RP is also in NP. The sequence of random bits used on an accepting run serves as a certificate: a deterministic verifier can simulate the RP machine with those bits in polynomial time, and because an RP machine never accepts a string outside the language, only genuine members have accepting certificates. The converse containment NP ⊆ RP is not known, and is widely believed to be false, since it would give efficient randomized algorithms for all NP-complete problems.

9. Conclusion
In this lecture, we:

Examined the concept of randomized algorithms and how they can be modeled using randomized Turing machines
(RTMs).

Introduced the classes RP and ZPP, providing formal definitions and exploring examples of problems in these classes.

Discussed the relationships between RP, ZPP, P, and NP, highlighting their connections and distinctions.

Understanding these randomized complexity classes is crucial for analyzing problems that involve probabilistic decision-
making and for designing efficient algorithms that leverage randomness.

Lecture 50: The Complexity of Primality Testing


In this lecture, we will explore the complexity of primality testing, a fundamental problem in number theory and
cryptography. We will discuss the importance of primality testing, introduce modular arithmetic as a tool for efficient
computations, examine the complexity of modular arithmetic computations, and delve into randomized and non-
deterministic algorithms for primality testing.

1. The Importance of Primality Testing


Primality testing is the process of determining whether a given integer n is prime (i.e., divisible only by 1 and itself) or
composite (i.e., having divisors other than 1 and itself). This problem has several important applications, particularly in:

Cryptography: Many cryptographic algorithms, such as RSA, rely on the difficulty of factoring large numbers, and
efficient primality testing is crucial for selecting large primes for cryptographic keys.

Randomized Algorithms: Primality testing is often used in the generation of random primes, which are necessary in
cryptography for secure key generation.

Mathematical Applications: Primality testing is essential for various algorithms in number theory, such as in the search
for large primes used in prime factorization and other problems.

Efficient primality testing is important because brute-force trial division by every integer up to n − 1 (or even just up to √n) takes time polynomial in n itself, which is exponential in the number of digits of n and therefore impractical for the large numbers used in cryptography.

2. Introduction to Modular Arithmetic


Modular arithmetic is a system of arithmetic for integers, where numbers "wrap around" upon reaching a certain value,
called the modulus. The operation a mod m gives the remainder when a is divided by m.
Definition: a mod m is the remainder when a is divided by m, and it is always a number in the range 0 ≤ a mod m < m.
Properties of Modular Arithmetic:

Addition: (a + b) mod m = [(a mod m) + (b mod m)] mod m

Multiplication: (a ⋅ b) mod m = [(a mod m) ⋅ (b mod m)] mod m
Exponentiation: (ak ) mod m is computed efficiently using modular exponentiation.

Modular Arithmetic and Primality Testing:

Many primality tests make use of modular arithmetic, particularly in the context of modular exponentiation, which allows
us to compute powers of numbers modulo a value efficiently. This is crucial in algorithms such as Fermat’s Little Theorem-
based tests and Miller-Rabin primality tests.

3. The Complexity of Modular-Arithmetic Computations


Modular arithmetic computations, particularly modular exponentiation, are central to the efficiency of primality testing
algorithms. The naive approach to computing a^k mod m performs about k multiplications, which is exponential in the bit-length of k and inefficient for large k . Instead, we can use exponentiation by squaring, which computes a^k mod m with only O(log k) multiplications.

Exponentiation by Squaring:

Exponentiation by squaring is a method to compute powers of numbers modulo some modulus efficiently. It works as
follows:

1. If k is even: a^k mod m = (a^{k/2} mod m)^2 mod m

2. If k is odd: a^k mod m = (a ⋅ (a^{k−1} mod m)) mod m

By recursively applying this approach, we reduce the number of multiplications needed to compute a^k mod m, making it run in O(log k) time.

Example:

To compute a^13 mod m:

a^13 = a ⋅ (a^2)^6 = a ⋅ a^4 ⋅ a^8 (since 13 = 1 + 4 + 8)

Compute a^2 mod m, then square repeatedly to obtain a^4 mod m and a^8 mod m.

Multiply the required powers together, reducing modulo m at each step, to obtain a^13 mod m.

This method drastically reduces the time complexity compared to directly multiplying a by itself 13 times.
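An iterative (binary) version of exponentiation by squaring is sketched below in Python; Python's built-in three-argument pow does the same job and can be used to cross-check the result.

```python
def mod_pow(a, k, m):
    """Exponentiation by squaring: compute a**k mod m with O(log k) multiplications."""
    result = 1
    a %= m
    while k > 0:
        if k & 1:                   # current bit of k is 1: fold this power in
            result = (result * a) % m
        a = (a * a) % m             # square the base
        k >>= 1                     # move to the next bit of the exponent
    return result

print(mod_pow(7, 13, 11))   # 2
print(pow(7, 13, 11))       # Python's built-in modular pow agrees: 2
```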

4. Random-Polynomial Primality Testing


A key result in number theory is the existence of randomized primality tests that run in polynomial time. One of the most
famous such algorithms is the Miller-Rabin primality test, which is based on properties of modular arithmetic and works as
follows:

Miller-Rabin Primality Test:

Given a number n, the algorithm tests if n is prime using the following steps:

1. Write n − 1 = 2^s ⋅ d with d odd:

Factor n − 1 as 2^s ⋅ d, where d is odd. This can be done efficiently.


2. Choose a random base a such that 2 ≤ a ≤ n − 2:

Select a random base a for the test.

3. Check the condition on a^d mod n:

Compute a^d mod n. If the result is 1, n may be prime. If the result is n − 1, then n may also be prime.
4. Perform repeated squaring for higher powers:

Compute higher powers of a mod n using modular exponentiation. Specifically, for each step, compute a^{2^r ⋅ d} mod n for r = 0, 1, … , s − 1. If any of these results equals n − 1, n passes this round of testing.
5. Repeat for multiple values of a:

If n passes the test for several different random values of a, the probability of n being prime increases. The
algorithm repeats for different bases to reduce the probability of error.

The time complexity of the Miller-Rabin test is O(k log³ n), where k is the number of rounds and n is the number being tested. This is polynomial in the number of digits of n, making the test very efficient even for very large numbers.

Example:

Let n = 17. We perform the Miller-Rabin test for random bases a to test primality. After several iterations (testing different
values of a), we conclude that 17 is prime with high probability.
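A compact Python sketch of the test follows (the function name, the number of rounds, and the demo values are ours; Python's three-argument pow performs the modular exponentiation):

```python
import random

def miller_rabin(n, rounds=20):
    """Miller-Rabin test: returns False only when n is proven composite;
    returns True when n is prime with error probability at most 4**(-rounds)."""
    if n < 4:
        return n in (2, 3)
    if n % 2 == 0:
        return False
    d, s = n - 1, 0
    while d % 2 == 0:               # write n - 1 = 2**s * d with d odd
        d //= 2
        s += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)            # a^d mod n by modular exponentiation
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)        # successive squarings: a^(2^r * d) mod n
            if x == n - 1:
                break
        else:
            return False            # a is a witness that n is composite
    return True

print(miller_rabin(17))    # True  -- 17 is prime
print(miller_rabin(561))   # False -- a Carmichael number, still detected
```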

5. Non-Deterministic Primality Tests


A non-deterministic primality test is one in which a machine may guess a short certificate for the primality of a number and then verify that certificate in polynomial time; phrased as a decision problem, this asks whether primality lies in NP. It does: primes have succinct certificates (exhibiting a generator of the multiplicative group modulo n together with the prime factorization of n − 1, each prime factor certified recursively) that can be checked in polynomial time.

A stronger, much later result is the AKS primality test, a deterministic algorithm for primality that runs in polynomial time. Although the AKS test is deterministic rather than non-deterministic or randomized, it is a landmark result: it was the first primality test proven to run in polynomial time without relying on randomization or unproven hypotheses.

AKS Primality Test:

The AKS test, developed in 2002, is a deterministic algorithm for primality testing. It runs in time polynomial in the number of digits of n; later analyses give a bound of roughly Õ((log n)^{7.5}), with further improvements since.

The algorithm is based on ideas from algebraic number theory, in particular the polynomial identity (x + a)^n ≡ x^n + a (mod n), which (for a coprime to n) holds exactly when n is prime.

It verifies a bounded form of this identity, reduced modulo n and x^r − 1 for a suitably small r, in time polynomial in log n.

The AKS test is important because it provides a deterministic approach to primality testing, removing the reliance on
randomization (as in Miller-Rabin) and making primality testing more predictable.

Conclusion
In this lecture, we explored the complexity of primality testing through the following key concepts:

1. Modular arithmetic and its role in efficiently performing primality tests.

2. The Miller-Rabin primality test, a randomized polynomial-time algorithm that tests whether a number is prime with
high probability.

3. The AKS primality test, which is a deterministic polynomial-time algorithm for primality testing.

4. Non-deterministic primality tests, and their relation to problems that can be solved by a non-deterministic machine.

Efficient primality testing is crucial for modern cryptographic systems and number-theoretic algorithms, and understanding
the complexity of these tests is essential for designing efficient algorithms in these domains.
