Introduction
Why a compiler?
Programming problems are easier to solve in high-level languages
Languages closer to the level of the problem domain, e.g.,
Smalltalk: OO programming
JavaScript: Web pages
Solutions are usually more efficient (faster, smaller) when written in machine language
A language that reflects the cycle-by-cycle working of a processor
Compilers are the bridges
Tools to translate programs written in high-level languages to
efficient executable code
What is a compiler?
"A program that reads a program written in one language and translates it into another
language."
Traditionally, compilers go from high-level languages to low-level
languages.
How to translate?
https://www.slideshare.net/Amansharma1037
https://www.linkedin.com/in/includeaman
Requirement:
In order to translate statements in a language, one needs to understand both
the structure of the language: the way “sentences” are constructed in the
language
the meaning of the language: what each “sentence” stands for.
Terminology:
Structure ≡ Syntax
Meaning ≡ Semantics
Analysis-Synthesis model of compilation:
Two parts
Analysis
Breaks up the source program into constituents
Synthesis
Constructs the target program
Compilation Steps / Phases
Lexical Analysis Phase: Generates the “tokens” in the source program
Syntax Analysis Phase: Recognizes “sentences” in the program using the syntax of the
language
Semantic Analysis Phase: Infers information about the program using the semantics of
the language
Intermediate Code Generation Phase: Generates “abstract” code based on the syntactic
structure of the program and the semantic information from the semantic analysis phase.
Optimization Phase: Refines the generated code using a series of optimizing
transformations
Final Code Generation Phase: Translates the abstract intermediate code into specific
machine instructions.
Lexical Analysis
Convert the stream of characters representing input program into a sequence of tokens
Tokens are the “words” of the programming language
Lexeme
The characters comprising a token
For example,
the sequence of characters “static int” is recognized as two tokens, representing
the two words “static” and “int”
the sequence of characters “*x++” is recognized as three tokens, representing
“*”, “x” and “++”
The lexical analyzer also:
Removes white space
Removes comments
Lexical Analysis
Input: result = a + b * 10
Tokens: “result”, “=”, “a”, “+”, “b”, “*”, “10”
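As a rough illustration of how a lexical analyzer might split the input above into tokens, here is a minimal Python sketch. The token class names (ID, NUMBER, OP, SKIP) and the regular expressions are assumptions made for this example, not something the slides prescribe.

```python
import re

# Hypothetical token classes for this sketch: (name, regular expression).
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("ID", r"[A-Za-z_]\w*"),
    ("OP", r"[=+*]"),
    ("SKIP", r"\s+"),  # white space is recognized and then discarded
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Return a list of (kind, lexeme) pairs, dropping white space."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


print(tokenize("result = a + b * 10"))
```

Each pair holds the token class and its lexeme, so the parser can work with classes such as ID while later phases still see the actual characters.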
Syntax Analysis
Uncover the structure of a sentence in the program from a stream of tokens.
For instance, the phrase “x = +y”, which is recognized as four tokens, representing “x”,
“=”, “+” and “y”, has the
structure =(x,+(y)), i.e., an assignment expression that operates on “x” and the
expression “+(y)”.
Build a tree called a parse tree that reflects the structure of the input sentence.
Syntax Analysis: Grammars
Expression grammar
Exp ::= Exp ‘+’ Exp | Exp ‘*’ Exp | ID | NUMBER
Assign ::= ID ‘=’ Exp
Syntax Tree
Input: result = a + b * 10
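The tree for this input can be sketched with a small recursive-descent parser in Python. The expression grammar as written is ambiguous, so this sketch assumes the usual convention that ‘*’ binds tighter than ‘+’. The nested-tuple tree representation and the function name parse_assign are assumptions for illustration only.

```python
import re

IDENTIFIER = re.compile(r"[A-Za-z_]\w*")
OPERATORS = {"=", "+", "*"}


def parse_assign(source):
    """Parse 'ID = Exp' into a nested-tuple syntax tree."""
    # Tiny built-in lexer; characters outside the pattern are ignored in this sketch.
    tokens = re.findall(r"\d+|[A-Za-z_]\w*|[=+*]", source)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def factor():  # ID | NUMBER
        token = peek()
        if token is None or token in OPERATORS:
            raise SyntaxError(f"expected ID or NUMBER, found {token!r}")
        return advance()

    def term():  # factor ('*' factor)*
        node = factor()
        while peek() == "*":
            advance()
            node = ("*", node, factor())
        return node

    def exp():  # term ('+' term)*
        node = term()
        while peek() == "+":
            advance()
            node = ("+", node, term())
        return node

    target = factor()
    if not IDENTIFIER.fullmatch(target):
        raise SyntaxError(f"cannot assign to {target!r}")
    if peek() != "=":
        raise SyntaxError("expected '='")
    advance()
    tree = ("=", target, exp())
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree


print(parse_assign("result = a + b * 10"))
# ('=', 'result', ('+', 'a', ('*', 'b', '10')))
```

The multiplication ends up deeper in the tree than the addition, which is how the tree records that b * 10 is evaluated first.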
Semantic Analysis
Concerned with the semantics (meaning) of the program
Performs type checking
Operator-operand compatibility
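A type checker for operator-operand compatibility might look roughly like the Python sketch below. The type names ("int", "real", "int[]"), the "index" node for array access, and the rule that both operands of ‘+’ and ‘*’ must have the same numeric type are all assumptions chosen to keep the example short.

```python
NUMERIC = {"int", "real"}  # hypothetical set of numeric type names


def check(node, types):
    """Return the type of an expression tree, or raise TypeError."""
    if isinstance(node, str):  # leaf: a number or an identifier
        if node.isdigit():
            return "int"
        if node not in types:
            raise TypeError(f"undeclared identifier {node!r}")
        return types[node]
    op, left, right = node
    left_type = check(left, types)
    right_type = check(right, types)
    if op == "index":  # array access: left[right]
        if not left_type.endswith("[]"):
            raise TypeError(f"{left!r} is not an array")
        if right_type != "int":
            raise TypeError(f"array index must be int, not {right_type}")
        return left_type[:-2]
    if left_type != right_type or left_type not in NUMERIC:
        raise TypeError(f"operator {op!r} cannot combine {left_type} and {right_type}")
    return left_type


declared = {"a": "int", "b": "int", "myarray": "int[]", "realIndex": "real"}
print(check(("+", "a", ("*", "b", "10")), declared))  # int
try:
    check(("index", "myarray", "realIndex"), declared)
except TypeError as error:
    print("semantic error:", error)
```

The second call is rejected because the index expression has type real, which is the kind of error only this phase can detect: the statement is lexically and syntactically fine.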
Intermediate Code Generation
Translate the decorated (annotated) syntax tree into intermediate code
Intermediate code is a program for an abstract machine
Properties of intermediate codes
Should be easy to generate
Should be easy to translate
Intermediate code hides many machine-level details, but has instruction-level mapping
to many assembly languages
Main motivation: portability
One commonly used form is “Three-address Code”
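To make three-address code concrete, the Python sketch below walks a syntax tree for result = a + b * 10 and emits one instruction per operator. The nested-tuple tree shape and the temporary names t1, t2, ... are assumptions for this example.

```python
from itertools import count


def to_three_address(tree):
    """Translate an assignment tree into a list of three-address instructions."""
    temps = count(1)
    code = []

    def emit(node):
        # Leaves (identifiers and numbers) are already valid operands.
        if isinstance(node, str):
            return node
        op, left, right = node
        left_name = emit(left)
        right_name = emit(right)
        temp = f"t{next(temps)}"  # fresh temporary for this result
        code.append(f"{temp} = {left_name} {op} {right_name}")
        return temp

    _, target, value = tree
    value_name = emit(value)
    code.append(f"{target} = {value_name}")
    return code


tree = ("=", "result", ("+", "a", ("*", "b", "10")))
for line in to_three_address(tree):
    print(line)
# t1 = b * 10
# t2 = a + t1
# result = t2
```

Every instruction has at most one operator and three names, which is why this form is easy to generate from a tree and easy to map onto real instructions.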
Code Optimization
Apply a series of transformations to improve the time and space efficiency of the
generated code.
Peephole optimizations: generate new instructions by combining/expanding on a small
number of consecutive instructions.
Global optimizations: reorder, remove or add instructions to change the structure of
generated code
Consumes a significant fraction of the compilation time
Optimization capability varies widely
Simple optimization techniques can be very valuable
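One simple peephole optimization can be sketched in Python as follows: a temporary that is computed and then immediately copied into a variable is merged into a single instruction. The textual instruction format "dest = expr" and the assumption that temporaries are named t1, t2, ... and are used only once are choices made for this example.

```python
import re

TEMPORARY = re.compile(r"t\d+")  # assumed naming scheme for temporaries


def peephole(code):
    """Merge 't = <expr>' followed by 'x = t' into 'x = <expr>'."""
    out = []
    for line in code:
        dest, expr = line.split(" = ", 1)
        if out:
            prev_dest, prev_expr = out[-1].split(" = ", 1)
            # The window is two consecutive instructions: the previous and the current one.
            if expr == prev_dest and TEMPORARY.fullmatch(prev_dest):
                out[-1] = f"{dest} = {prev_expr}"
                continue
        out.append(line)
    return out


print(peephole(["t1 = b * 10", "t2 = a + t1", "result = t2"]))
# ['t1 = b * 10', 'result = a + t1']
```

The transformation only looks at two adjacent instructions at a time, yet it removes one instruction and one temporary from the code.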
Code Generation
Map instructions in the intermediate code to specific machine instructions.
Memory management, register allocation, instruction selection, instruction scheduling,
...
Generates sufficient information to enable symbolic debugging.
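The mapping from intermediate code to machine instructions can be illustrated with a deliberately naive Python sketch. The target machine here is hypothetical: the mnemonics LOAD, ADD, MUL and STORE, the single register R1, and the ‘#’ prefix for constants are all assumptions, and no register allocation or scheduling is attempted.

```python
OPCODES = {"+": "ADD", "*": "MUL"}  # hypothetical mnemonics


def operand(name):
    """Constants become immediates (prefixed with '#'); names stay as memory operands."""
    return f"#{name}" if name.isdigit() else name


def generate(code):
    """Translate 'dest = a op b' or 'dest = a' lines into assembly for a one-register machine."""
    asm = []
    for line in code:
        dest, expr = line.split(" = ", 1)
        parts = expr.split()
        asm.append(f"LOAD R1, {operand(parts[0])}")
        if len(parts) == 3:
            asm.append(f"{OPCODES[parts[1]]} R1, {operand(parts[2])}")
        asm.append(f"STORE {dest}, R1")
    return asm


for instruction in generate(["t1 = b * 10", "result = a + t1"]):
    print(instruction)
# LOAD R1, b
# MUL R1, #10
# STORE t1, R1
# LOAD R1, a
# ADD R1, t1
# STORE result, R1
```

A real code generator would keep t1 in a register instead of storing and reloading it, which is where register allocation and instruction selection come in.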
Symbol Table
Records the identifiers used in the source program
Collects various associated information as attributes
Variables: type, scope, storage allocation
Procedure: number and types of arguments, method of argument passing
It’s a data structure with a collection of records
Different fields are collected and used at different phases of compilation
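A symbol table along these lines could be sketched in Python as a stack of dictionaries, one per scope, where each record is a dictionary of attributes. The class and method names, and the attribute names "type" and "offset", are assumptions for illustration.

```python
class SymbolTable:
    """A stack of scopes; each scope maps an identifier to its record of attributes."""

    def __init__(self):
        self.scopes = [{}]  # the global scope

    def enter_scope(self):
        self.scopes.append({})

    def exit_scope(self):
        self.scopes.pop()

    def declare(self, name, **attributes):
        if name in self.scopes[-1]:
            raise KeyError(f"{name!r} is already declared in this scope")
        self.scopes[-1][name] = dict(attributes)

    def lookup(self, name):
        # Search from the innermost scope outwards.
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        return None


table = SymbolTable()
table.declare("x", type="int")      # e.g. recorded during semantic analysis
table.enter_scope()
table.declare("x", type="real")     # an inner declaration hides the outer one
print(table.lookup("x")["type"])    # real
table.exit_scope()
table.lookup("x")["offset"] = 0     # a later phase adds storage information
print(table.lookup("x"))            # {'type': 'int', 'offset': 0}
```

The last two lines show the point made above: different phases fill in and read different fields of the same record.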
Error Detection, Recovery and Reporting
Each phase can encounter errors
Specific types of error can be detected by specific phases
Lexical Error: int abc, 1num; (an identifier cannot begin with a digit)
Syntax Error: total = capital + rate year; (an operator is missing between “rate” and “year”)
Semantic Error: value = myarray [realIndex]; (an array is indexed by a real number)
The compiler should be able to proceed and process the rest of the program after an error is detected
The compiler should be able to link the error with the source program
Translation of a statement
Syntax Analyzer versus Lexical Analyzer
Both do similar things:
the lexical analyzer deals with simple non-recursive constructs of the language,
while the syntax analyzer deals with recursive constructs of the language.
The lexical analyzer simplifies the job of the syntax analyzer.
The lexical analyzer recognizes the smallest meaningful units (tokens) in a source
program.
The syntax analyzer works on those tokens to recognize meaningful structures in the
programming language.
Cousins of the Compiler
Preprocessor
Macro preprocessing: Define and use shorthand for longer constructs
File inclusion: Include header files
“Rational” Preprocessors: Augment older languages with modern flow-of-control or
data structures
Language Extension: Add capabilities to a language
Equel: query language embedded in C
Assemblers
Two-Pass Assembly
Simplest form of assembler
First pass
All the identifiers are stored in a symbol table
Storage is allocated
Second pass
Translates each operation code into machine language
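The two passes can be sketched in Python as follows. The instruction set (LOAD, ADD, STORE), its numeric opcodes, the one-operand instruction format, and the one-word-per-identifier storage layout are all assumptions made for this example.

```python
OPCODES = {"LOAD": 1, "ADD": 2, "STORE": 3}  # hypothetical operation codes


def assemble(lines):
    """Assemble 'MNEMONIC identifier' lines into (opcode, address) pairs."""
    # First pass: store every identifier in the symbol table and allocate storage for it.
    symbols = {}
    for line in lines:
        _, identifier = line.split()
        if identifier not in symbols:
            symbols[identifier] = len(symbols)  # next free word
    # Second pass: translate each operation code and replace identifiers by addresses.
    program = []
    for line in lines:
        mnemonic, identifier = line.split()
        program.append((OPCODES[mnemonic], symbols[identifier]))
    return symbols, program


symbols, program = assemble(["LOAD a", "ADD b", "STORE a"])
print(symbols)  # {'a': 0, 'b': 1}
print(program)  # [(1, 0), (2, 1), (3, 0)]
```

Separating the passes means the second pass can always find an address in the symbol table, even for an identifier that first appears later in the program.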
Loaders and Link-Editors
Loader: converts the relocatable machine code into absolute machine code
Maps the relocatable addresses
Places the altered instructions and data in memory
Linker: makes a single program from several files of relocatable machine code
Several files of relocatable code
Library files
Grouping of Phases
Front End and Back End: the front end consists of phases that depend on the source
language and are independent of the target language; the back end consists of phases
that depend on the target language.
Passes: Several phases of a compiler with similar function are grouped into passes.
Passes often generate an explicit output file.
In each pass, the whole input file is processed.
Issues Driving Compiler Design
Correctness
Speed (runtime and compile time)
Degrees of optimization
Multiple passes
Space
Feedback to user
Debugging
Other Applications
Beyond building compilers, the techniques used in compiler design
are applicable to many problems in computer science.
Techniques used in a lexical analyzer can be used in text editors, information retrieval
systems, and pattern recognition programs.
Techniques used in a parser can be used in a query processing system such as SQL.
Software with a complex front end may need techniques used in compiler
design.
For example, a symbolic equation solver that takes an equation as input must parse
the given equation.
Most of the techniques used in compiler design can be used in Natural Language
Processing (NLP) systems.
Compiler design Introduction
