Lexical Analysis in Compiler Design with
Example
What is Lexical analysis?
Lexical analysis is the very first phase of compiler design. It takes the
(possibly preprocessed) source code, which is written as a stream of
characters, and converts that sequence of characters into a sequence of
tokens. The lexical analyzer also removes any extra whitespace and
comments from the source code.
Programs that perform lexical analysis are called lexical analyzers or
lexers. A lexer contains a tokenizer or scanner. If the lexical analyzer
detects an invalid token, it generates an error. It reads character streams
from the source code, checks for legal tokens, and passes the data to the
syntax analyzer on demand.
Example
How Pleasant Is The Weather?
See this example. Here, we can easily recognize that there are five words:
How, Pleasant, Is, The, Weather. This is natural for us because we can
recognize the separators, blanks, and the punctuation symbol.
HowPl easantIs Th ewe ather?
Now check this example. We can still read it, but it takes some time
because the separators are in odd places. It is not something that comes
to you immediately.
In this tutorial, you will learn
What is Lexical analysis?
Basic Terminologies:
Lexical Analyzer Architecture: How tokens are recognized
Roles of the Lexical analyzer
Lexical Errors
Error Recovery in Lexical Analyzer
Lexical Analyzer vs. Parser
Why separate Lexical and Parser?
Advantages of Lexical analysis
Disadvantage of Lexical analysis
Basic Terminologies
What's a lexeme?
A lexeme is a sequence of characters in the source program that matches
the pattern for a token. It is nothing but an instance of a token.
What's a token?
A token is a sequence of characters that represents a unit of
information in the source program.
What is Pattern?
A pattern is a description of the form that the lexemes of a token may
take. In the case of a keyword used as a token, the pattern is simply the
sequence of characters that forms the keyword.
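To make the three terms concrete, here is a minimal sketch in Python. The regular expression and the keyword set below are illustrative assumptions, not any real compiler's rules: the pattern is written as a regular expression, the character sequence "maximum" is a lexeme, and matching it against the pattern yields an Identifier token.

```python
import re

# Illustrative pattern for identifiers in a C-like language (an assumption,
# not taken from any particular compiler).
ID_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
# Keywords match the same pattern but are reserved, so they get their own class.
KEYWORDS = {"int", "if", "else", "return"}

def classify(lexeme):
    """Map a lexeme to its token class using the pattern above."""
    if lexeme in KEYWORDS:
        return "Keyword"
    if ID_PATTERN.fullmatch(lexeme):
        return "Identifier"
    return "Unknown"

print(classify("maximum"))  # Identifier
print(classify("int"))      # Keyword
```

Note that the lexeme "int" also matches the identifier pattern; checking the keyword set first is what keeps keywords and identifiers apart.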
Lexical Analyzer Architecture: How tokens are
recognized
The main task of lexical analysis is to read the input characters of the
code and produce tokens.
The lexical analyzer scans the entire source code of the program and
identifies each token one by one. Scanners are usually implemented to
produce tokens only when requested by the parser. Here is how this works:
1. "Get next token" is a command sent from the parser to the lexical
analyzer.
2. On receiving this command, the lexical analyzer scans the input until
it finds the next token.
3. It returns the token to the parser.
The lexical analyzer skips whitespace and comments while creating these
tokens. If an error is present, the lexical analyzer correlates that error
with the source file and line number.
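The pull-driven loop above can be sketched with a Python generator, where each call to next() plays the role of the parser's "get next token" request. The token names and patterns here are illustrative assumptions, not a complete C lexer.

```python
import re

# Illustrative token specification (an assumption, not a full C lexer).
TOKEN_SPEC = [
    ("WHITESPACE", r"\s+"),                       # skipped, never returned
    ("KEYWORD",    r"\b(?:int|if|else|return)\b"),
    ("IDENTIFIER", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("NUMBER",     r"\d+"),
    ("PUNCT",      r"[(){},;><=+*-]"),
]
MASTER = re.compile("|".join(f"(?P<{n}>{p})" for n, p in TOKEN_SPEC))

def tokens(source):
    """Generator: each next() call acts as one 'get next token' request."""
    for m in MASTER.finditer(source):
        if m.lastgroup != "WHITESPACE":   # whitespace is a non-token
            yield (m.lastgroup, m.group())

stream = tokens("int x = 42;")
print(next(stream))  # ('KEYWORD', 'int')  -- the parser pulls one token at a time
print(next(stream))  # ('IDENTIFIER', 'x')
```

Because tokens() is a generator, no token is produced until the parser asks for it, which mirrors the on-demand behavior described above.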
Roles of the Lexical analyzer
The lexical analyzer performs the tasks given below:
Helps identify tokens and enter them into the symbol table
Removes whitespace and comments from the source program
Correlates error messages with the source program
Expands macros if they are found in the source program
Reads input characters from the source program
Example of Lexical Analysis, Tokens, Non-Tokens
Consider the following code that is fed to Lexical Analyzer
#include <stdio.h>
int maximum(int x, int y) {
    // This will compare 2 numbers
    if (x > y)
        return x;
    else {
        return y;
    }
}
Examples of Tokens created
Lexeme    Token
int       Keyword
maximum   Identifier
(         Operator
int       Keyword
x         Identifier
,         Operator
int       Keyword
y         Identifier
)         Operator
{         Operator
if        Keyword
Examples of Nontokens
Type                      Examples
Comment                   // This will compare 2 numbers
Pre-processor directive   #include <stdio.h>
Pre-processor directive   #define NUMS 8,9
Macro                     NUMS
Whitespace                \n \b \t
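The token/non-token split in the tables above can be reproduced with a small Python sketch. The categories and patterns are illustrative assumptions chosen to mirror the tables, not a complete C lexer; non-tokens (comments, whitespace, directives) are matched but filtered out of the output.

```python
import re

# Illustrative specification mirroring the tables above (an assumption,
# not a full C lexer). Non-tokens are matched first so they can be dropped.
TOKEN_SPEC = [
    ("Nontoken",   r"//[^\n]*|#[^\n]*|\s+"),   # comments, directives, whitespace
    ("Keyword",    r"\b(?:int|if|else|return)\b"),
    ("Identifier", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("Operator",   r"[(){},;>]"),
]
MASTER = re.compile("|".join(f"(?P<{n}>{p})" for n, p in TOKEN_SPEC))

source = "int maximum(int x, int y) { // This will compare 2 numbers\n if (x > y)"
pairs = [(m.group(), m.lastgroup) for m in MASTER.finditer(source)
         if m.lastgroup != "Nontoken"]
for lexeme, token in pairs:
    print(f"{lexeme:10} {token}")
```

Running this over the first lines of the example function prints the same lexeme/token pairs as the table, with the comment and whitespace discarded as non-tokens.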
Lexical Errors
A character sequence that cannot be scanned into any valid token is a
lexical error. Important facts about lexical errors:
Lexical errors are not very common, but they should be managed by the
scanner
Misspellings of identifiers, operators, and keywords are considered
lexical errors
Generally, a lexical error is caused by the appearance of some illegal
character, mostly at the beginning of a token.
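As a minimal sketch of how a scanner flags such an error, the toy function below reports the first character that no token pattern accepts. The legal character set is an assumption for a tiny toy language, and '@' stands in for an illegal character.

```python
# Assumed legal characters for a toy language (an illustration, not C's rules).
LEGAL = set("abcdefghijklmnopqrstuvwxyz0123456789_ ()=;")

def first_lexical_error(source):
    """Return (position, char) of the first illegal character, or None."""
    for i, ch in enumerate(source):
        if ch not in LEGAL:
            return (i, ch)
    return None

print(first_lexical_error("int x = @42;"))  # (8, '@')
print(first_lexical_error("int x = 42;"))   # None
```

A real lexer would report the position alongside the source file and line number, as described earlier.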
Error Recovery in Lexical Analyzer
Here are a few of the most common error recovery techniques:
Remove one character from the remaining input
In panic mode, successive characters are ignored until we reach a
well-formed token
Insert a missing character into the remaining input
Replace a character with another character
Transpose two adjacent characters
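The panic-mode technique can be sketched as follows. The token pattern is an illustrative assumption; on an illegal character, the lexer discards input until a character that can start a well-formed token appears, records the skipped text as an error, and keeps going.

```python
import re

# Illustrative token pattern: identifiers, numbers, a few punctuators
# (an assumption for this sketch, not a full language definition).
TOKEN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*|\d+|[();=]")

def tokenize_with_recovery(source):
    """Tokenize, using panic mode to skip stretches of illegal characters."""
    tokens, errors, i = [], [], 0
    while i < len(source):
        if source[i].isspace():
            i += 1
            continue
        m = TOKEN.match(source, i)
        if m:
            tokens.append(m.group())
            i = m.end()
        else:
            start = i
            # Panic mode: ignore characters until a token can start again.
            while i < len(source) and not (source[i].isspace() or TOKEN.match(source, i)):
                i += 1
            errors.append(source[start:i])
    return tokens, errors

print(tokenize_with_recovery("int x = @@ 42;"))
# (['int', 'x', '=', '42', ';'], ['@@'])
```

The key property of panic mode is that scanning always resumes at a recognizable token, so one bad stretch of input does not derail the rest of the analysis.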
Lexical Analyzer vs. Parser
Lexical Analyzer                        Parser
Scans the input program                 Performs syntax analysis
Identifies tokens                       Creates an abstract representation of the code
Inserts tokens into the symbol table    Updates symbol table entries
Generates lexical errors                Generates a parse tree of the source code
Why separate Lexical and Parser?
Simplicity of design: Separating the two phases simplifies both lexical
analysis and syntax analysis by eliminating unwanted tokens
Improved compiler efficiency: A separate, optimized scanner helps improve
the overall efficiency of the compiler
Specialization: Specialized techniques can be applied to improve the
lexical analysis process
Portability: Only the scanner needs to communicate with the outside
world, so input-device-specific peculiarities are restricted to the lexer
Advantages of Lexical analysis
The lexical analyzer method is used by programs such as compilers, which
can use the parsed data from a programmer's code to create compiled
binary executable code
It is used by web browsers to format and display a web page with the
help of parsed data from JavaScript, HTML, and CSS
A separate lexical analyzer helps you construct a specialized and
potentially more efficient processor for the task
Disadvantage of Lexical analysis
You need to spend significant time reading the source program and
partitioning it into tokens
Some regular expressions are quite difficult to understand compared
to PEG or EBNF rules
More effort is needed to develop and debug the lexer and its token
descriptions
Additional runtime overhead is required to generate the lexer tables
and construct the tokens
Summary
Lexical analysis is the very first phase of compiler design
A lexeme is a sequence of characters in the source program that matches
the pattern for a token
The lexical analyzer is implemented to scan the entire source code of
the program
The lexical analyzer helps identify tokens and enter them into the
symbol table
A character sequence that cannot be scanned into any valid token is a
lexical error
Removing one character from the remaining input is a useful error
recovery method
The lexical analyzer scans the input program, while the parser performs
syntax analysis
Separating the two phases eases both lexical analysis and syntax
analysis by eliminating unwanted tokens
The lexical analyzer is used by web browsers to format and display a
web page with the help of parsed data from JavaScript, HTML, and CSS
The biggest drawback of using a lexical analyzer is the additional
runtime overhead required to generate the lexer tables and construct
the tokens