Week 2 Lec 4 CC

The document discusses the lexical analysis, syntax analysis, and semantic analysis phases of a compiler. Lexical analysis scans source code and groups characters into tokens. Syntax analysis uses these tokens to build a parse tree representing the program structure. Semantic analysis checks for semantic errors, gathers type information, and performs type checking. It verifies that variables are defined before use, functions are defined before calls, and operators have the correct operand types. Intermediate code generation then translates the parse tree into an intermediate representation to ease code generation for the target machine.

Uploaded by

Sohaib Kashmiri

Lahore Garrison University


CSC373-Compiler Construction
Week-2 Lecture-4
Semester-#5 Fall 2019
Prepared by:

Eisha Tir Razia


Overview of previous lecture

• Compiler types in general
• Analysis
• Synthesis

Preamble of each lecture

• Compilation process in brief
• How does a source program get compiled?
• Phases of a compiler

Lecture Outcomes

• Understanding the compiler phases in detail

Lexical Analysis

• The first phase of the compiler works as a text scanner.
• This phase scans the source code as a stream of characters and groups the characters into meaningful sequences called lexemes.
• For each lexeme it emits a token, so its output is a stream of tokens in which each token represents a logically cohesive sequence of characters such as an identifier, a keyword, or a punctuation character.

Lexical Analysis

• The character sequence forming a token is called the lexeme for the token.
• The lexical analyzer represents each lexeme as a token of the form
<token_name, attribute_value>
• A more descriptive way to read this pair is
<Token_type, Word>

Example

• total = count + rate * 10
• The characters in this assignment could be grouped into the following lexemes and mapped into the following tokens passed on to the syntax analyzer:
• total is a lexeme that would be mapped into a token (id, 1), where id is an abstract symbol standing for identifier and 1 points to the symbol-table entry for total. The symbol-table entry for an identifier holds information about the identifier, such as its name and type.
• The assignment symbol = is a lexeme that is mapped into the token (=). Since this token needs no attribute value, we have omitted the second component. We could have used any abstract symbol such as assign for the token name, but for notational convenience we have chosen to use the lexeme itself as the name of the abstract symbol.

Cont..

• count is a lexeme that is mapped into the token (id, 2), where 2 points to the symbol-table entry for count.
• + is a lexeme that is mapped into the token (+).
• rate is a lexeme that is mapped into the token (id, 3), where 3 points to the symbol-table entry for rate.
• * is a lexeme that is mapped into the token (*).
• 10 is a lexeme that is mapped into the token (10).
• Blanks separating the lexemes would be discarded by the lexical analyzer.
• After lexical analysis, the assignment is represented by the token sequence:
(id, 1) (=) (id, 2) (+) (id, 3) (*) (10)
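
The mapping from lexemes to tokens can be sketched in a few lines of Python. This is only an illustration, not the scanner of any particular compiler: the function name tokenize, the pattern names NUM, ID, OP and WS, and the list-based symbol table are assumptions made for this sketch.

```python
import re

# One alternative per lexeme class; the group names are labels for this sketch.
TOKEN_RE = re.compile(
    r"(?P<NUM>\d+)|(?P<ID>[A-Za-z_]\w*)|(?P<OP>[=+\-*/])|(?P<WS>\s+)"
)

def tokenize(source):
    """Return (tokens, symbol_table) for a one-line assignment."""
    symbol_table = []   # entry n (1-based) holds the name of identifier n
    tokens = []
    pos = 0
    while pos < len(source):
        m = TOKEN_RE.match(source, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at {pos}")
        pos = m.end()
        kind, lexeme = m.lastgroup, m.group()
        if kind == "WS":
            continue                          # blanks are discarded
        if kind == "ID":
            if lexeme not in symbol_table:
                symbol_table.append(lexeme)   # first sighting: new entry
            tokens.append(("id", symbol_table.index(lexeme) + 1))
        else:
            tokens.append((lexeme,))          # operators and numbers: no attribute
    return tokens, symbol_table

tokens, table = tokenize("total = count + rate * 10")
print(tokens)   # [('id', 1), ('=',), ('id', 2), ('+',), ('id', 3), ('*',), ('10',)]
print(table)    # ['total', 'count', 'rate']
```

A real symbol table would also record the type of each identifier; here it stores only the name, which is enough to produce the index used as the attribute value.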

Example

void main( ) {
    int x;
    x = 3;
}
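
As a rough illustration of the <Token_type, Word> view, the following Python sketch splits this small program into lexemes and labels each one. The category names (keyword, identifier, number, operator, punctuation) and the helper classify are assumptions of this sketch; the slides do not list the expected tokens for this example.

```python
import re

KEYWORDS = {"void", "int"}   # only the keywords this example needs
# findall silently skips anything the pattern does not cover (here: whitespace).
LEXEME_RE = re.compile(r"[A-Za-z_]\w*|\d+|[(){};=]")

def classify(lexeme):
    """Return a token type for one lexeme."""
    if lexeme in KEYWORDS:
        return "keyword"
    if lexeme[0].isdigit():
        return "number"
    if lexeme[0].isalpha() or lexeme[0] == "_":
        return "identifier"
    if lexeme == "=":
        return "operator"
    return "punctuation"

source = """void main( ) {
int x;
x = 3;
}"""

pairs = [(classify(word), word) for word in LEXEME_RE.findall(source)]
for token_type, word in pairs:
    print(f"<{token_type}, {word}>")
```

Note that main is classified as an identifier: to the scanner it is an ordinary name, and only later phases give it a special role.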

Syntax Analysis

• The second phase of the compiler is syntax analysis or parsing. The parser uses the first components of the tokens produced by the lexical analyzer to create a tree-like intermediate representation that depicts the grammatical structure of the token stream. A typical representation is a syntax tree in which each interior node represents an operation and the children of the node represent the arguments of the operation.
• This tree shows the order in which the operations in the assignment are to be performed.

Cont..

• The tree has an interior node labeled * with (id, 3) as its left child and the integer 10 as its right child. The node (id, 3) represents the identifier rate. The node labeled * makes it explicit that we must first multiply the value of rate by 10. The node labeled + indicates that we must add the result of this multiplication to the value of count. The root of the tree, labeled =, indicates that we must store the result of this addition into the location for the identifier total.
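
The tree described above can be written down directly as a data structure. The sketch below is one possible encoding, assuming a hypothetical Node class with label, left and right fields; a post-order walk (children before parent) then yields the order in which the operations must be performed.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    label: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None

# Syntax tree for: total = count + rate * 10
tree = Node("=",
            Node("(id, 1)"),
            Node("+",
                 Node("(id, 2)"),
                 Node("*", Node("(id, 3)"), Node("10"))))

def operation_order(node):
    """Post-order walk over interior nodes: children before their parent."""
    if node is None or node.left is None:
        return []          # leaves carry operands, not operations
    return operation_order(node.left) + operation_order(node.right) + [node.label]

print(operation_order(tree))   # ['*', '+', '=']
```

The printed order matches the description: multiply first, then add, then store.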

Cont..

• To build such a syntax tree, production rules must be designed.
• These rules are usually expressed as a context-free grammar.

Operations of Syntax Analysis

1. Obtain tokens from the lexical analyzer
2. Check whether the expression is syntactically correct
3. Report syntax errors, if any
4. Group tokens into statements
• Determine the statement class: is it an assignment statement, a conditional statement (if statement), etc.
5. Construct a hierarchical structure called a parse tree
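
These operations can be sketched as a small recursive-descent parser. The grammar in the comment is an assumption made for this example (the slides do not give one), and the parser accepts only assignment statements with + and *; it returns the tree as nested tuples and raises SyntaxError for malformed input.

```python
import re

# Assumed grammar (not taken from the slides):
#   stmt   -> ID '=' expr
#   expr   -> term   ('+' term)*
#   term   -> factor ('*' factor)*
#   factor -> ID | NUM

def parse(source):
    tokens = re.findall(r"[A-Za-z_]\w*|\d+|\S", source)   # step 1: obtain tokens
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expect(predicate, what):
        nonlocal pos
        tok = peek()
        if tok is None or not predicate(tok):
            raise SyntaxError(f"expected {what}, found {tok!r}")   # step 3
        pos += 1
        return tok

    def is_id(tok):
        return tok[0].isalpha() or tok[0] == "_"

    def factor():
        return expect(lambda t: is_id(t) or t.isdigit(), "identifier or number")

    def term():
        node = factor()
        while peek() == "*":
            expect(lambda t: t == "*", "'*'")
            node = ("*", node, factor())
        return node

    def expr():
        node = term()
        while peek() == "+":
            expect(lambda t: t == "+", "'+'")
            node = ("+", node, term())
        return node

    target = expect(is_id, "identifier")      # step 4: an assignment statement
    expect(lambda t: t == "=", "'='")
    tree = ("=", target, expr())              # step 5: hierarchical structure
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree

print(parse("total = count + rate * 10"))
# ('=', 'total', ('+', 'count', ('*', 'rate', '10')))
```

Splitting the grammar into expr and term is what gives * a higher precedence than +, so rate * 10 ends up as a subtree of the + node.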

Semantic Analysis

• This phase checks the source program for semantic errors and gathers type information for the subsequent code generation phase.
• It uses the hierarchical structure determined by the syntax-analysis phase to identify the operators and operands of expressions and statements.
• An important component of semantic analysis is type checking, where the compiler checks that each operator has matching operands. For example, many programming language definitions require an array index to be an integer; the compiler must report an error if a floating-point number is used to index an array.
• It checks whether the constructed parse tree follows the rules of the language.
• It only detects and reports errors; it does not correct them.
• It also gathers type information and saves it in either the syntax tree or the symbol table, for subsequent use during intermediate-code generation.

Cont..

• All variables are defined before they are used.
• All functions are defined before they are called, and are called with the correct number (and types) of parameters, as well as the correct return type.
• All arithmetic and boolean operators (+, -, *, /, &&, || and so on) have operands of the correct types.
• Variables (arrays and classes) are accessed correctly.
• Assignment statements have compatible left-hand and right-hand sides.
• Control expressions in statements have the correct types.
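
The first of these checks, that variables are defined before use, can be sketched as follows. The input format (a list of "declare" and "use" tuples in program order) and the function name check_declared are assumptions of this sketch; a real compiler would walk the syntax tree and consult the symbol table instead.

```python
def check_declared(statements):
    """Report names that are used before being declared.

    statements: ("declare", name, type_name) or ("use", name) tuples,
    listed in program order.
    """
    declared = {}          # name -> declared type
    errors = []
    for stmt in statements:
        if stmt[0] == "declare":
            _, name, type_name = stmt
            if name in declared:
                errors.append(f"'{name}' is declared more than once")
            declared[name] = type_name
        else:
            _, name = stmt
            if name not in declared:
                errors.append(f"'{name}' is used before it is defined")
    return errors

program = [("declare", "x", "int"), ("use", "x"), ("use", "y")]
print(check_declared(program))   # ["'y' is used before it is defined"]
```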

Cont..

• The language specification may permit some type conversions called coercions.
• For example, a binary arithmetic operator may be applied to either a pair of integers or to a pair of floating-point numbers. If the operator is applied to a floating-point number and an integer, the compiler may convert the integer into a floating-point number.

Example

• Suppose that total, count, and rate have been declared to be floating-point numbers, and that the lexeme 10 by itself forms an integer. The type checker in the semantic analyzer discovers that the operator * is applied to a floating-point number rate and an integer 10. In this case, the integer may be converted into a floating-point number.
• Notice that the output of the semantic analyzer has an extra node for the operator inttofloat, which explicitly converts its integer argument into a floating-point number.
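
A minimal sketch of this coercion step, assuming the tree is stored as nested tuples (op, left, right) with variable names and integer literals as leaves. The function name type_check and the two type names "int" and "float" are assumptions of this sketch.

```python
def type_check(node, symbol_types):
    """Return (typed_node, type), inserting inttofloat where operand types differ.

    Simplification: the integer side is always widened, which is only
    appropriate when the other side is a float (as in this example).
    """
    if isinstance(node, int):
        return node, "int"                    # integer literal
    if isinstance(node, str):
        return node, symbol_types[node]       # declared variable
    op, left, right = node
    left, left_type = type_check(left, symbol_types)
    right, right_type = type_check(right, symbol_types)
    if left_type != right_type:
        if left_type == "int":
            left = ("inttofloat", left)
        else:
            right = ("inttofloat", right)
        return (op, left, right), "float"
    return (op, left, right), left_type

types = {"total": "float", "count": "float", "rate": "float"}
tree = ("=", "total", ("+", "count", ("*", "rate", 10)))
typed, result_type = type_check(tree, types)
print(typed)
# ('=', 'total', ('+', 'count', ('*', 'rate', ('inttofloat', 10))))
```

The only change to the tree is the extra inttofloat node above the literal 10, which is the node the example refers to.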

Intermediate Code Generation

• The next step in the compilation process is to translate the abstract syntax tree into another intermediate representation: abstract assembly.
• It is a kind of code that can easily be converted to target code.
• This code comes in a variety of forms, such as three-address code, quadruples, and triples.
• Intermediate code can be either language specific (e.g., bytecode for Java) or language independent (three-address code).
• Abstract assembly is a middle ground between the source language and the target language.
Three-Address Code

• Three-address code consists of instructions, each of which has at most three operands.
• Properties of three-address code
1. Each three-address instruction has at most one operator in addition to the assignment.
2. The compiler must generate a temporary name to hold the value computed by each instruction.
3. Some instructions may have fewer than three operands.
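
To make these properties concrete, the sketch below walks a tuple-encoded syntax tree and emits one three-address instruction per interior node, creating a fresh temporary (t1, t2, ...) for each result. The tree encoding and the function name gen_tac are assumptions of this sketch; id1, id2 and id3 stand for total, count and rate.

```python
import itertools

def gen_tac(tree):
    """Translate a tuple tree into a list of three-address instructions."""
    code = []
    temps = itertools.count(1)

    def walk(node):
        if not isinstance(node, tuple):
            return str(node)                       # leaf: a name or a constant
        if node[0] == "inttofloat":
            arg = walk(node[1])
            temp = f"t{next(temps)}"
            code.append(f"{temp} = inttofloat({arg})")
            return temp
        op, left, right = node
        left_name, right_name = walk(left), walk(right)
        temp = f"t{next(temps)}"                   # property 2: a temporary per result
        code.append(f"{temp} = {left_name} {op} {right_name}")
        return temp

    _, target, value = tree                        # root is the assignment
    code.append(f"{target} = {walk(value)}")       # property 3: fewer than three operands
    return code

tree = ("=", "id1", ("+", "id2", ("*", "id3", ("inttofloat", 10))))
for line in gen_tac(tree):
    print(line)
# t1 = inttofloat(10)
# t2 = id3 * t1
# t3 = id2 + t2
# id1 = t3
```

Each emitted line has at most one operator besides the assignment, which is property 1.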

Code Optimization

• This phase attempts to improve the intermediate code.
• This is necessary to obtain code that executes faster or consumes less memory.
• Thus, by optimizing the code, the overall running time of the target program can be improved.
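
As an illustration of what "improving" intermediate code can mean, the sketch below applies two simple transformations to three-address code held as strings: it performs the inttofloat conversion of a constant at compile time, and it removes a temporary that is only copied into the final destination. The function name optimize and the string format are assumptions of this sketch, and the second pass assumes each temporary is used exactly once.

```python
import re

def optimize(code):
    """Apply constant conversion and copy elimination to three-address code."""
    # Pass 1: replace tN = inttofloat(<integer>) by the float constant itself.
    constants = {}
    folded = []
    for line in code:
        dest, expr = line.split(" = ", 1)
        m = re.fullmatch(r"inttofloat\((\d+)\)", expr)
        if m:
            constants[dest] = str(float(m.group(1)))
            continue                                  # instruction disappears
        operands = [constants.get(tok, tok) for tok in expr.split()]
        folded.append((dest, " ".join(operands)))
    # Pass 2: "tN = a op b" followed by "x = tN" becomes "x = a op b".
    result = []
    for dest, expr in folded:
        if result and re.fullmatch(r"t\d+", expr) and expr == result[-1][0]:
            result[-1] = (dest, result[-1][1])
        else:
            result.append((dest, expr))
    return [f"{dest} = {expr}" for dest, expr in result]

tac = ["t1 = inttofloat(10)", "t2 = id3 * t1", "t3 = id2 + t2", "id1 = t3"]
for line in optimize(tac):
    print(line)
# t2 = id3 * 10.0
# id1 = id2 + t2
```

Four instructions shrink to two, and the conversion no longer happens at run time.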

Code Generation

• In this phase the target code is generated.
• The code generator takes as input an intermediate representation of the source program and maps it into the target language. If the target language is machine code, registers or memory locations are selected for each of the variables used by the program. Then, the intermediate instructions are translated into sequences of machine instructions that perform the same task.

Example

LDF R2, id3
MULF R2, R2, #10.0
LDF R1, id2
ADDF R1, R1, R2
STF id1, R1
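
A sequence of this shape can be produced mechanically from three-address code. The sketch below is a deliberately naive generator under several assumptions: it supports only * and +, takes a fresh register for every memory operand, keeps each result in the register of its left operand, and never reuses registers. Its register numbering therefore differs from the listing above, although the instructions perform the same task. The function name gen_target is hypothetical.

```python
import re

OPCODES = {"*": "MULF", "+": "ADDF"}   # only the two operators in the example

def gen_target(tac):
    """Translate 'dest = left op right' lines into LDF/MULF/ADDF/STF instructions."""
    asm = []
    reg_of = {}          # temporary name -> register that holds its value
    next_reg = 1

    def operand(name):
        nonlocal next_reg
        if name in reg_of:                          # value already in a register
            return reg_of[name]
        if re.fullmatch(r"\d+(\.\d+)?", name):      # constant: immediate operand
            return f"#{name}"
        reg = f"R{next_reg}"                        # memory operand: load it first
        next_reg += 1
        asm.append(f"LDF {reg}, {name}")
        return reg

    for line in tac:
        dest, expr = line.split(" = ", 1)
        left, op, right = expr.split()
        left_reg = operand(left)
        right_op = operand(right)
        if not left_reg.startswith("R"):
            raise NotImplementedError("constant left operands are not handled")
        asm.append(f"{OPCODES[op]} {left_reg}, {left_reg}, {right_op}")
        if re.fullmatch(r"t\d+", dest):
            reg_of[dest] = left_reg                 # temporary stays in a register
        else:
            asm.append(f"STF {dest}, {left_reg}")   # program variable: store to memory
    return asm

for line in gen_target(["t1 = id3 * 10.0", "id1 = id2 + t1"]):
    print(line)
# LDF R1, id3
# MULF R1, R1, #10.0
# LDF R2, id2
# ADDF R2, R2, R1
# STF id1, R2
```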


Q&A

References

• These lecture notes were taken from the following source:
• Compilers: Principles, Techniques, and Tools, by Alfred V. Aho, Monica S. Lam, Ravi Sethi, and Jeffrey D. Ullman, Addison-Wesley, 2nd edition, 2006
